Atomic ops on Android (2.2)

Louise Cypher

Oct 26, 2010, 2:39:43 PM

to andro...@googlegroups.com

Hi

I'm writing some lockless code on Android (the target is Android 2.1 and up, on modern ARM (v7 and up?)).

I see that __atomic_inc/dec, __atomic_cmpxchg, and __atomic_swap exist, but I need two more atomics: __atomic_and and __atomic_or. Do those exist on the ARM architecture? If not, is there some common way to simulate them?

BTW, how do atomics perform compared to ordinary mutex-based sync? Are they just intrinsics, or do they map to something more complex under the hood?

--
Code it black!

David Turner

Oct 26, 2010, 3:04:26 PM

to andro...@googlegroups.com

On Tue, Oct 26, 2010 at 11:39 AM, Louise Cypher <szatan...@gmail.com> wrote:

> Hi
>
> I'm writing some lockless code on Android (the target is Android 2.1 and up, on modern ARM (v7 and up?)).
> I see that __atomic_inc/dec, __atomic_cmpxchg, and __atomic_swap exist, but I need two more atomics:

These functions are internal to the C library. You should not rely on them being there, or on them working properly in future releases of the platform, or even in current releases running on future hardware (see below).

> __atomic_and and __atomic_or: do those exist on the ARM architecture? If not, is there some common way to simulate them?

I would say just use __atomic_cmpxchg in a loop; that should be straightforward.
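A minimal sketch of such a compare-and-swap loop follows. It is an illustration, not code from the C library: C11 <stdatomic.h> stands in for the compare-and-swap primitive so the example compiles anywhere, and the names fetch_or_cas, fetch_and_cas and bitops_demo are hypothetical. With <sys/atomics.h> the loop would have the same shape around the header's compare-and-swap call; check the header for its exact signature and return convention.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically OR `mask` into *ptr using a compare-and-swap retry loop.
 * Returns the value *ptr held just before the update. */
static int fetch_or_cas(atomic_int *ptr, int mask)
{
    int old = atomic_load(ptr);
    /* On failure the call stores the current value of *ptr into `old`,
     * so every retry recomputes (old | mask) from fresh data. */
    while (!atomic_compare_exchange_weak(ptr, &old, old | mask)) {
        /* another thread modified *ptr in between; retry */
    }
    return old;
}

/* Same pattern for AND. */
static int fetch_and_cas(atomic_int *ptr, int mask)
{
    int old = atomic_load(ptr);
    while (!atomic_compare_exchange_weak(ptr, &old, old & mask)) {
        /* retry with the refreshed `old` */
    }
    return old;
}

/* Usage: set bits 4-7, then keep only bits 2-5. */
static void bitops_demo(void)
{
    atomic_int flags = 0x0F;
    assert(fetch_or_cas(&flags, 0xF0) == 0x0F);   /* flags is now 0xFF */
    assert(fetch_and_cas(&flags, 0x3C) == 0xFF);  /* flags is now 0x3C */
    assert(atomic_load(&flags) == 0x3C);
}
```

The same loop works for any read-modify-write operation: only the expression that computes the new value from `old` changes.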

> BTW, how do atomics perform compared to ordinary mutex-based sync? Are they just intrinsics, or do they map to something more complex under the hood?

It really depends on the CPU. Also with multi-core / SMP boards, you're going to need memory barrier instructions to ensure your program works correctly and these can be very costly.

THE FOLLOWING IS AN IMPORTANT MESSAGE FROM THE "I-TOLD-YOU-SO" DEPARTMENT:

DO NOT USE __atomic_xxx functions DIRECTLY IN YOUR APPLICATION MACHINE CODE.

Because your code will lack any memory barrier implementation, IT WILL NOT RUN CORRECTLY on upcoming future SMP devices (which I won't comment on).

If you absolutely require atomicity, protect the data with a mutex, or a mutex+condvar, using the provided pthread APIs instead.
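As a sketch of the mutex-based alternative, assuming the shared data is a single word of flag bits: the names flags_lock, shared_flags, flags_set, flags_clear and flags_demo are hypothetical and only illustrate the pattern with the standard pthread calls.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared flag word; every access goes through flags_lock. */
static pthread_mutex_t flags_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned shared_flags = 0;

/* Set the bits in `mask`; returns the flags as they were before. */
static unsigned flags_set(unsigned mask)
{
    pthread_mutex_lock(&flags_lock);
    unsigned old = shared_flags;
    shared_flags |= mask;
    pthread_mutex_unlock(&flags_lock);
    return old;
}

/* Clear the bits in `mask`; returns the flags as they were before. */
static unsigned flags_clear(unsigned mask)
{
    pthread_mutex_lock(&flags_lock);
    unsigned old = shared_flags;
    shared_flags &= ~mask;
    pthread_mutex_unlock(&flags_lock);
    return old;
}

/* Usage, starting from shared_flags == 0. */
static void flags_demo(void)
{
    assert(flags_set(0x5) == 0x0);    /* flags become 0x5 */
    assert(flags_clear(0x1) == 0x5);  /* flags become 0x4 */
    assert(flags_set(0x0) == 0x4);    /* no change, just reads the value */
}
```

The lock and unlock calls also order the memory accesses between threads, which is the property a bare atomic operation without barriers would lack on SMP hardware.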


Louise Cypher

Oct 26, 2010, 3:26:24 PM

to andro...@googlegroups.com

So __atomic_inc and __atomic_dec will break on SMP? :( Damn.

But wait a moment, digging deeper... those future multicore ARMs have some syncing opcodes (like lwarx/stwcx/lwsync on the PPC architecture), right? So just updating the implementation should work?

Mutexes are 10 or more times slower than atomic inc/dec/swap on my current test devices (Nexus (Snapdragon) and Galaxy (Hummingbird)).

(I just wrote a lockless linked list performance test (2 inserter threads, 4 consumer threads).)

I do not need to stay compatible for years; I just need to be compatible with CURRENT devices. For future devices I can always adjust the code ;)

BTW, thanks for the clarification about atomics. (I'll definitely keep the 'mutex code' around, just to have a 'Mr. Proper' implementation.)

--
Code it black!

Tim Mensch

Oct 26, 2010, 3:37:55 PM

to andro...@googlegroups.com

On 10/26/2010 1:04 PM, David Turner wrote:
> THE FOLLOWING IS AN IMPORTANT MESSAGE FROM THE "I-TOLD-YOU-SO"
> DEPARTMENT:
>
> DO NOT USE __atomic_xxx functions DIRECTLY IN YOUR APPLICATION MACHINE
> CODE.
>
> Because your code will lack any memory barrier implementation, IT WILL
> NOT RUN CORRECTLY on upcoming future SMP devices (which I won't
> comment on).
>
> If you absolutely require atomicity, protect the data with a mutex, or
> mutex+condvar instead using the provided pthread APIs.

Wait wait wait. You took part in this thread earlier:

https://groups.google.com/group/android-ndk/browse_thread/thread/3760ff092d5228c7/d689161e98dc26d3?hl=en&#d689161e98dc26d3

...and you didn't tell me about any problems using "sys/atomics.h",
which of course I am now using, because I don't WANT something as heavy
as a mutex. I'm currently using __atomic_cmpxchg() in my code, based on
the fact that no one from Google answered my question earlier directly
one way or another. I'd say that falls under the
"WE-FAILED-TO-TELL-YOU-WHEN-YOU-ASKED" department.

To clarify: When you say "DIRECTLY IN YOUR APPLICATION MACHINE CODE" do
you mean not to call these functions? If not, then what CAN I call to
get atomic functionality? And why wouldn't the library be updated on
those platforms? Or are those GCC intrinsics? In that case would it work
if we forced them NOT to be generated as intrinsics?

Tim

David Turner

Oct 26, 2010, 4:25:50 PM

to andro...@googlegroups.com

On Tue, Oct 26, 2010 at 12:37 PM, Tim Mensch <tim.m...@gmail.com> wrote:

> On 10/26/2010 1:04 PM, David Turner wrote:
>> THE FOLLOWING IS AN IMPORTANT MESSAGE FROM THE "I-TOLD-YOU-SO" DEPARTMENT:
>>
>> DO NOT USE __atomic_xxx functions DIRECTLY IN YOUR APPLICATION MACHINE CODE.
>>
>> Because your code will lack any memory barrier implementation, IT WILL NOT RUN CORRECTLY on upcoming future SMP devices (which I won't comment on).
>>
>> If you absolutely require atomicity, protect the data with a mutex, or mutex+condvar instead using the provided pthread APIs.
>
> Wait wait wait. You took part in this thread earlier:
>
> https://groups.google.com/group/android-ndk/browse_thread/thread/3760ff092d5228c7/d689161e98dc26d3?hl=en&#d689161e98dc26d3

I don't see anything about atomics in this thread.

> ...and you didn't tell me about any problems using "sys/atomics.h", which of course I am now using, because I don't WANT something as heavy as a mutex. I'm currently using __atomic_cmpxchg() in my code, based on the fact that no one from Google answered my question earlier directly one way or another. I'd say that falls under the "WE-FAILED-TO-TELL-YOU-WHEN-YOU-ASKED" department.

That is true, and I'm sorry for it. Looking at it, it looks like the __atomic_xxx functions of <sys/atomics.h> have been exposed by the NDK, so I have no other choice than to continue supporting them as best as I can.

I guess this means that you can continue using them. More precisely:

* For SMP builds, their implementations will probably provide explicit memory barriers before and after the operation, which will make them (perhaps significantly) slower. But at least this won't break NDK machine code. And there is no change for non-SMP devices.
* The system itself will use a different implementation that doesn't include the barriers. It will use explicit barrier instructions where needed instead, so it won't suffer from this slowdown.
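To illustrate the difference between the two variants described above, here is a rough sketch. It is not the bionic implementation: C11 fences stand in for the hardware barrier instructions, and inc_with_barriers, inc_no_barriers and inc_demo are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* SMP-safe flavour: a full fence before and after the atomic increment,
 * so memory accesses around the call cannot be reordered across it.
 * Returns the value before the increment. */
static int inc_with_barriers(atomic_int *ptr)
{
    atomic_thread_fence(memory_order_seq_cst);   /* barrier before */
    int old = atomic_fetch_add_explicit(ptr, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* barrier after */
    return old;
}

/* Barrier-free flavour: the increment itself is still atomic, but it gives
 * no ordering guarantee for surrounding loads and stores. A caller that
 * needs ordering must add its own fences where it actually matters. */
static int inc_no_barriers(atomic_int *ptr)
{
    return atomic_fetch_add_explicit(ptr, 1, memory_order_relaxed);
}

/* Usage: both flavours produce the same counter values. */
static void inc_demo(void)
{
    atomic_int counter = 41;
    assert(inc_with_barriers(&counter) == 41);
    assert(inc_no_barriers(&counter) == 42);
    assert(atomic_load(&counter) == 43);
}
```

The results are identical on a single thread; the cost and the ordering guarantees are what differ, which is why the barrier-free form can be faster when the caller places fences only where they are required.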

> To clarify: When you say "DIRECTLY IN YOUR APPLICATION MACHINE CODE" do you mean not to call these functions? If not, then what CAN I call to get atomic functionality? And why wouldn't the library be updated on those platforms? Or are those GCC intrinsics? In that case would it work if we forced them NOT to be generated as intrinsics?

Well, I meant that you should not call these functions, but as I just corrected, you can still use them; we will support them.

The Android team is still discussing exactly which set of SMP-safe operations to implement and expose through the NDK. Apparently, there are several ways to do that, and some of us are reluctant to expose the stuff being used internally because we may change it for the better soon.

So in the meantime, keep going. Sorry for the annoyance.

- David

Tim Mensch

Oct 26, 2010, 5:30:09 PM

to andro...@googlegroups.com

On 10/26/2010 2:25 PM, David Turner wrote:

> On Tue, Oct 26, 2010 at 12:37 PM, Tim Mensch <tim.m...@gmail.com> wrote:
>> Wait wait wait. You took part in this thread earlier:

> I don't see anything about atomics in this thread.

Sorry, I hit the wrong thread. Try this one:

https://groups.google.com/group/android-ndk/browse_thread/thread/67e061ba06108c9d/89457890ed885e9c

It's the one that's on point, and the one where you did tell me that
everything under sys/ should be OK after I asked about sys/atomics.h.

> That is true, and I'm sorry for it. Looking at it, it looks like the
> __atomic_xxx functions of <sys/atomics.h> have been exposed by the
> NDK, so I have no other choice than to continue supporting them as
> best as I can.

Thanks. :)

> * For SMP builds, their implementations will probably provide
> explicit memory barriers before and after the operation, which will
> make them (perhaps significantly) slower. But at least won't break
> NDK machine code. And no change for non-SMP devices.

But I can't imagine that it would end up being slower than a mutex,
right? That's what really matters to me; if it ended up SLOWER than a
mutex, that would suck, but if it performs at worst no slower than a
mutex, then it's still a win on single-processor platforms, and simply
not as much of a win on faster SMP platforms. Obviously the faster it is
the better, though. :)

> So in the meantime, keep going. Sorry for the annoyance.

No worries. Thank you for keeping the NDK stable for us. :)

Tim

David Turner

Oct 26, 2010, 6:31:56 PM

to andro...@googlegroups.com

On Tue, Oct 26, 2010 at 2:30 PM, Tim Mensch <tim.m...@gmail.com> wrote:

> But I can't imagine that it would end up being slower than a mutex, right?

Most probably not. Even though our mutex implementation is really fast, you would need one lock + one unlock to replace a single atomic inc/dec, which can add up pretty quickly.

> That's what really matters to me; if it ended up SLOWER than a mutex, that would suck, but if it performs at worst no slower than a mutex, then it's still a win on single-processor platforms, and simply not as much of a win on faster SMP platforms. Obviously the faster it is the better, though. :)
>
>> So in the meantime, keep going. Sorry for the annoyance.
>
> No worries. Thank you for keeping the NDK stable for us. :)
>
> Tim
