In zmalloc.c the following primitives are currently used
to synchronize access to a single global variable:
__sync_add_and_fetch
__sync_sub_and_fetch
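For context, the relevant pattern in zmalloc.c today is roughly the following (a simplified sketch; the pthread_mutex fallback shown in the diff below is omitted):

/* Current pattern (simplified): every allocation/free updates the global
 * used_memory counter through a __sync builtin, which implies a full
 * memory barrier. */
static size_t used_memory = 0;

#define update_zmalloc_stat_add(__n) __sync_add_and_fetch(&used_memory, (__n))
#define update_zmalloc_stat_sub(__n) __sync_sub_and_fetch(&used_memory, (__n))

On powerpc that implied full barrier is comparatively expensive, which is the source of the overhead noted above.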
On some architectures, such as powerpc, these primitives are overhead
intensive. More efficient C11 __atomic builtins are available with
newer GCC versions; see
http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/_005f_005fatomic-Builtins.html#_005f_005fatomic-Builtins

By substituting the following __atomic builtins:
__atomic_add_fetch
__atomic_sub_fetch
the performance improvement on certain architectures such as powerpc can be significant,
around 10% to 15%, over the implementation using the __sync builtins, while on Intel
architectures there is only a slight uptick, since the existing __sync implementation
already matches Intel's strongly ordered memory model.
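A quick way to sanity-check these numbers on a given machine is a micro-benchmark along the following lines. This is hypothetical and not part of the patch: the file name, counter variable, and iteration count are made up, and a single-threaded loop only measures the uncontended cost of each builtin.

/* bench.c - compare __sync vs __atomic increment cost (uncontended).
 * Build: gcc -O2 bench.c -o bench   (add -lrt on older glibc) */
#include <stdio.h>
#include <time.h>

static long counter = 0;

static double elapsed(struct timespec a, struct timespec b) {
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void) {
    long i;
    const long N = 100000000;    /* arbitrary iteration count */
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < N; i++) __sync_add_and_fetch(&counter, 1);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("__sync_add_and_fetch:  %.3f s\n", elapsed(t0, t1));

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < N; i++) __atomic_add_fetch(&counter, 1, __ATOMIC_RELAXED);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("__atomic_add_fetch:    %.3f s\n", elapsed(t0, t1));

    return 0;
}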
The selection of the __atomic builtins can be predicated on the definition of __ATOMIC_RELAXED,
which is available in gcc 4.8.2 and later versions, as the diff below shows.
Files affected: zmalloc.c
Proposed diff:
@@ -68,8 +68,13 @@ void zlibc_free(void *ptr) {
#endif
+#ifdef __ATOMIC_RELAXED
+#define update_zmalloc_stat_add(__n) __atomic_add_fetch (&used_memory, (__n), __ATOMIC_RELAXED)
+#define update_zmalloc_stat_sub(__n) __atomic_sub_fetch (&used_memory, (__n), __ATOMIC_RELAXED)
+#else
#define update_zmalloc_stat_add(__n) __sync_add_and_fetch(&used_memory, (__n))
#define update_zmalloc_stat_sub(__n) __sync_sub_and_fetch(&used_memory, (__n))
+#endif
#else
#define update_zmalloc_stat_add(__n) do { \
pthread_mutex_lock(&used_memory_mutex); \
@@ -220,7 +225,12 @@ size_t zmalloc_used_memory(void) {
if (zmalloc_thread_safe) {
#ifdef HAVE_ATOMIC
+#ifdef __ATOMIC_RELAXED
+ um = __atomic_add_fetch(&used_memory, 0, __ATOMIC_RELAXED);
+#else
+
um = __sync_add_and_fetch(&used_memory, 0);
+#endif
#else
pthread_mutex_lock(&used_memory_mutex);
um = used_memory;
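Since used_memory is a plain statistics counter with no ordering relationship to other data, relaxed ordering should be sufficient here. For completeness, below is a standalone sketch of the same selection logic that compiles with both pre- and post-4.8 GCC; the macro and variable names other than __ATOMIC_RELAXED and the builtins themselves are hypothetical.

/* sketch.c - falls back to the __sync builtins when the compiler does not
 * define __ATOMIC_RELAXED, mirroring the #ifdef in the diff above. */
#include <stdio.h>

static size_t mem_used = 0;   /* stands in for used_memory */

#ifdef __ATOMIC_RELAXED
#define stat_add(__n) __atomic_add_fetch(&mem_used, (__n), __ATOMIC_RELAXED)
#define stat_sub(__n) __atomic_sub_fetch(&mem_used, (__n), __ATOMIC_RELAXED)
#else
#define stat_add(__n) __sync_add_and_fetch(&mem_used, (__n))
#define stat_sub(__n) __sync_sub_and_fetch(&mem_used, (__n))
#endif

int main(void) {
    stat_add(64);
    stat_sub(16);
    printf("mem_used = %zu\n", mem_used);
    return 0;
}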
Thanks for your critique of this proposal,
Hari
2014-06-24 Hari Reddy <hnr...@us.ibm.com>