Pros and Concerns for using packed pointers to solve ABA instead of DCAS

Eugene Zatepyakin

unread,

Jun 19, 2015, 3:47:54 AM6/19/15

to lock...@googlegroups.com

Hello,

there are several ways to solve ABA problem. Most commonly used are Hazard Pointers and Versioned/Tagged pointers.

Hazards are quite tricky implement especially targeting multiple platforms desktop/mobiles.

So i was thinking about trying Tagged pointers approach.

Here i see 2 ways: use hack with tagged pointer: single 64bit atomic or use full versioned pointer 2x64Bit values.

I wonder what are the concerns when implementing hack-ish version:

struct head_t final {
    node_t *node;
    // 64 bit pointer uses 48 bit addressing        // the unused 16 bits are going to hold an identification number
#if X64_PLATFORM
	using stack_id = uint16_t;

	inline head_t() noexcept : node{ nullptr }       { }
	inline head_t(node_t* n) noexcept : node{ n }    { }
	inline void create_id(const head_t& nid)         { ((stack_id*)this)[3] = ((const stack_id*)&nid)[3] + 1; }
	inline node_t* next_pointer()                    { return (node_t*)((uint64_t)node & 0x0000ffffffffffff); }

	// 32 bit pointers are "full", so I use another 32bit piece of data for the counter
	// on 32bit x86, we can swap the entire 64 bits without a lock
#else
	using stack_id = uint32_t;
	stack_id t_;

	inline head_t() noexcept : node{ nullptr }, t_{ 0 }  { }
	inline head_t(node_t* n) noexcept : node{ n }, t_{ 0 } { }
	inline void create_id(const head_t& nid)             { t_ = nid.t_ + 1; }
	inline node_t* next_pointer()                        { return node; }
#endif
	inline void set(node_t* n, const head_t& nid)        { node = n; if (node) create_id(nid); else create_id(*this); }
};

is it safe to assume this will alway work? or its not official and better go with full DCAS version:

struct head_t final {
    uintptr_t aba_conter;
    node_t *node;
};

but this will most likely be problematic to implement on mobile platforms....

Dmitry Vyukov

unread,

Jun 20, 2015, 5:49:47 AM6/20/15

to lock...@googlegroups.com

Hi Eugene,

In my practice 16-bit ABA counters work reasonably well.
AFAICT, Microsoft lock-free stack uses only 9-bit counters:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms683482(v=vs.85).aspx
(at least in early days of 64-bit platforms, and that was the reason
to restrict address space to 44 bits)

There are several tricks to make counters more reliable:
- 48-th bit is user/kernel marker, so you can use it as well
- usually low 3 or 4 bits are zeros (or you can specifically make them
zero), so you can use them as well
- also you can use per-node counters, rather than single per-head counter

As for mobile platforms, it is dependent on characteristics of a
particular platform. Some mobile platforms has double-word CAS.
Generally you end up with platform-specific code for such components.
For example, see what we do in Go runtime in lfstack* files:
https://github.com/golang/go/tree/master/src/runtime
It does work on a variety of platforms, including mobile.

Eugene Zatepyakin

unread,

Jun 20, 2015, 12:56:35 PM6/20/15

to lock...@googlegroups.com

Hi Dmitry,

thanx for your explanation.

i did quick tests on iOS platform and it seems to support DwCAS (that was quicks runs and i'm not 100% sure)

i also read in some articles that it is enough to only increment counter during "pop" operation does it sounds valid?

Dmitry Vyukov

unread,

Jun 20, 2015, 1:47:24 PM6/20/15

to lock...@googlegroups.com

On Sat, Jun 20, 2015 at 6:56 PM, Eugene Zatepyakin <zatep...@gmail.com> wrote:
> Hi Dmitry,
>
> thanx for your explanation.
> i did quick tests on iOS platform and it seems to support DwCAS (that was
> quicks runs and i'm not 100% sure)
>
> i also read in some articles that it is enough to only increment counter
> during "pop" operation does it sounds valid?

Yes, it is enough to ensure that a node cannot be reinserted with the
same counter.
You can also use per-node counters, rather than single global counter.

Eugene Zatepyakin

unread,

Jun 20, 2015, 6:40:55 PM6/20/15

to lock...@googlegroups.com

I've quickly ported Go version and run tests comparing to DCAS:

Lock-free Stack using Double width CAS and Packed/Tagged pointer (based on Go)

obviously version with pointer hacking is faster. u can see my results on Gist page

Eugene Zatepyakin

unread,

Jun 21, 2015, 8:45:15 AM6/21/15

to lock...@googlegroups.com

BTW: blocking stack implementation using simple spin-lock is as fast as tagged pointer lock-free version...

switching to std::mutex for locking makes it 5x times slower.

Dmitry Vyukov

unread,

Jun 21, 2015, 10:02:43 AM6/21/15

to lock...@googlegroups.com

On Sun, Jun 21, 2015 at 2:45 PM, Eugene Zatepyakin <zatep...@gmail.com> wrote:
>
> BTW: blocking stack implementation using simple spin-lock is as fast as
> tagged pointer lock-free version...

Just a note: spin-lock version is also portable and does not require
type-stable memory for nodes.

> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "Scalable Synchronization Algorithms" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to lock-free+...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/lock-free/c9b3b08b-381e-486f-a05a-4666948547bf%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.

--
Dmitry Vyukov

All about lockfree/waitfree algorithms, multicore, scalability,
parallel computing and related topics:
http://www.1024cores.net

Reply all

Reply to author

Forward