Pros and Concerns for using packed pointers to solve ABA instead of DCAS

306 views
Skip to first unread message

Eugene Zatepyakin

unread,
Jun 19, 2015, 3:47:54 AM6/19/15
to lock...@googlegroups.com
Hello,

there are several ways to solve ABA problem. Most commonly used are Hazard Pointers and Versioned/Tagged pointers.

Hazards are quite tricky implement especially targeting multiple platforms desktop/mobiles.
So i was thinking about trying Tagged pointers approach. 
Here i see 2 ways: use hack with tagged pointer: single 64bit atomic or use full versioned pointer 2x64Bit values.

I wonder what are the concerns when implementing hack-ish version:
struct head_t final {
    node_t *node;
    // 64 bit pointer uses 48 bit addressing
        // the unused 16 bits are going to hold an identification number
#if X64_PLATFORM
using stack_id = uint16_t;

inline head_t() noexcept : node{ nullptr }       { }
inline head_t(node_t* n) noexcept : node{ n }    { }
inline void create_id(const head_t& nid)         { ((stack_id*)this)[3] = ((const stack_id*)&nid)[3] + 1; }
inline node_t* next_pointer()                    { return (node_t*)((uint64_t)node & 0x0000ffffffffffff); }

// 32 bit pointers are "full", so I use another 32bit piece of data for the counter
// on 32bit x86, we can swap the entire 64 bits without a lock
#else
using stack_id = uint32_t;
stack_id t_;

inline head_t() noexcept : node{ nullptr }, t_{ 0 }  { }
inline head_t(node_t* n) noexcept : node{ n }, t_{ 0 } { }
inline void create_id(const head_t& nid)             { t_ = nid.t_ + 1; }
inline node_t* next_pointer()                        { return node; }
#endif
inline void set(node_t* n, const head_t& nid)        { node = n; if (node) create_id(nid); else create_id(*this); }
};

is it safe to assume this will alway work? or its not official and better go with full DCAS version:
struct head_t final {
    uintptr_t aba_conter;
    node_t *node;
};

 but this will most likely be problematic to implement on mobile platforms....

Dmitry Vyukov

unread,
Jun 20, 2015, 5:49:47 AM6/20/15
to lock...@googlegroups.com
Hi Eugene,

In my practice 16-bit ABA counters work reasonably well.
AFAICT, Microsoft lock-free stack uses only 9-bit counters:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms683482(v=vs.85).aspx
(at least in early days of 64-bit platforms, and that was the reason
to restrict address space to 44 bits)

There are several tricks to make counters more reliable:
- 48-th bit is user/kernel marker, so you can use it as well
- usually low 3 or 4 bits are zeros (or you can specifically make them
zero), so you can use them as well
- also you can use per-node counters, rather than single per-head counter

As for mobile platforms, it is dependent on characteristics of a
particular platform. Some mobile platforms has double-word CAS.
Generally you end up with platform-specific code for such components.
For example, see what we do in Go runtime in lfstack* files:
https://github.com/golang/go/tree/master/src/runtime
It does work on a variety of platforms, including mobile.

Eugene Zatepyakin

unread,
Jun 20, 2015, 12:56:35 PM6/20/15
to lock...@googlegroups.com
Hi Dmitry,

thanx for your explanation.
i did quick tests on iOS platform and it seems to support DwCAS (that was quicks runs and i'm not 100% sure) 

i also read in some articles that it is enough to only increment counter during "pop" operation does it sounds valid?

Dmitry Vyukov

unread,
Jun 20, 2015, 1:47:24 PM6/20/15
to lock...@googlegroups.com
On Sat, Jun 20, 2015 at 6:56 PM, Eugene Zatepyakin <zatep...@gmail.com> wrote:
> Hi Dmitry,
>
> thanx for your explanation.
> i did quick tests on iOS platform and it seems to support DwCAS (that was
> quicks runs and i'm not 100% sure)
>
> i also read in some articles that it is enough to only increment counter
> during "pop" operation does it sounds valid?


Yes, it is enough to ensure that a node cannot be reinserted with the
same counter.
You can also use per-node counters, rather than single global counter.

Eugene Zatepyakin

unread,
Jun 20, 2015, 6:40:55 PM6/20/15
to lock...@googlegroups.com
I've quickly ported Go version and run tests comparing to DCAS:

obviously version with pointer hacking is faster. u can see my results on Gist page

Eugene Zatepyakin

unread,
Jun 21, 2015, 8:45:15 AM6/21/15
to lock...@googlegroups.com

BTW: blocking stack implementation using simple spin-lock is as fast as tagged pointer lock-free version...
switching to std::mutex for locking makes it 5x times slower.

Dmitry Vyukov

unread,
Jun 21, 2015, 10:02:43 AM6/21/15
to lock...@googlegroups.com
On Sun, Jun 21, 2015 at 2:45 PM, Eugene Zatepyakin <zatep...@gmail.com> wrote:
>
> BTW: blocking stack implementation using simple spin-lock is as fast as
> tagged pointer lock-free version...

Just a note: spin-lock version is also portable and does not require
type-stable memory for nodes.
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "Scalable Synchronization Algorithms" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to lock-free+...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/lock-free/c9b3b08b-381e-486f-a05a-4666948547bf%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.



--
Dmitry Vyukov

All about lockfree/waitfree algorithms, multicore, scalability,
parallel computing and related topics:
http://www.1024cores.net
Reply all
Reply to author
Forward
0 new messages