http://home.comcast.net/~vzoom/demos/pc_sample.c
#define PDR_HASH_DEPTH() 128
#define PDR_DEPTH() (PDR_HASH_DEPTH() + 1)
static pc_t pxy_master[PDR_DEPTH()];
#define PXY_HASHPTR(__ptr) \
(((ptrdiff_t)(__ptr)) % PDR_HASH_DEPTH())
#define PXY_MASTER_HASH(__ptr) \
(&pxy_master[PXY_HASHPTR(__ptr)])
static user_collection_t g_col1;
static user_collection_t g_col2;
static user_collection_t g_col3;
void user_collection_iterate(user_collection_t* const _this) {
pc_t *master = PXY_MASTER_HASH(_this);
pc_deferred_t *pcd = pc_inc(master);
/* iterate read-only _this */
pc_dec(pcd);
}
void user_collection_defer(void *state) {
user_collection_node_t* const _this = state;
user_collection_node_free(_this);
}
void user_collection_pop(user_collection_t* const _this) {
user_collection_node_t *node;
pc_t *master = PXY_MASTER_HASH(_this);
global_hash_lock(_this);
/* pop node from _this into node */
global_hash_unlock(_this);
pc_defer(master, user_collection_defer, node);
}
void reader_threads() {
for(;;) {
user_collection_iterate(&g_col1);
user_collection_iterate(&g_col2);
user_collection_iterate(&g_col3);
}
}
void writer_remove_threads() {
for(;;) {
user_collection_pop(&g_col1);
user_collection_pop(&g_col2);
user_collection_pop(&g_col3);
}
}
/* writer add threads removed for brevity. Hint, you can take a proxy
reference for writer algorithms... */
Any thoughts?
;^)
this is a programming news group. and if you don't like to read code you
don't have to.
i'm quite happy about it, because even undocumented, you can gain some
insights.
... and incomplete, and very often buggy.
> this is a programming news group. and if you don't like to read code you
> don't have to.
> i'm quite happy about it, because even undocumented, you can gain some
> insights.
Not really. I agree with Elcaro: Usenet isn't the right forum to post
undocumented code. Most of what Chris Thomasson (and several others) post
here would be better suited to a blog.
--
David Hopwood <david....@industrial-designers.co.uk>
quick comments:
/* 'pc_inc' atomically acquires a strong reference from
the 'master' proxy achor */
extern "C" pc_deferred_t* pc_inc(pc_t *master);
/* 'pc_dec' atomically decrements a basic reference from
'defer' */
extern "C" void pc_dec(pc_deferred_t *defer);
/*
these functions can be logically renamed...
*/
typedef pc_deferred_t read_ticket_t;
#define read_acquire_ticket pc_inc
#define read_release_ticket pc_dec
/* or... */
#define read_lock read_acquire_ticket
#define read_unlock read_acquire_ticket
/* read_lock returns a reference to the rw-lock, this reference is passed to
read_unlock in order to unlock the previous read access */
/* 'pc_defer' atomically schedules a 'callback' to be called with the
'state', ONLY when 'state' is no longer being referenced by ANY read-side
critical-sections wrt any thread(s) participating in parallel 'read_lock' /
'read_unlock' functions... */
extern "C" void pc_defer(pc_t *master, void (*callback) (void*), void
*state);
/* this can be refined */
#define sched_write_only_when_no_readers pc_defer
/* the PATTERN */
static pc_t pg_lock; /* init with pc_alloc; look in pc_sample.c */
/* free with pc_free */
/* readers */
read_ticket_t *myticket = read_lock(&pg_lock);
/* read-only access to dynamic-linked node-based list */
read_unlock(myticket);
/* quiescent */
extern "C" void my_alone_time(void *state) {
/* state is free to be freed! ;^) */
}
/* writer removers*/
your_node_t *node = 0;
lock_your_mutex_for_your_collection(...);
/* node = try to pop a node */
unlock_your_mutex_for_your_collection(...);
if (node) {
sched_write_only_when_no_readers(&pg_lock, my_alone_time, node);
}
/* writer adders*/
your_node_t *node = your_node_alloc(...);
lock_your_mutex_for_your_collection(...);
/* push the node */
unlock_your_mutex_for_your_collection(...);
There, is that better at all?
If not, please tell me where your having problems understanding any of it?
> #define read_unlock read_acquire_ticket
^^^^^^^^^^^^^^^^
#define read_unlock read_release_ticket
of course!
acquiring a master proxy reference is analogous to a read-lock. You can do a
read-only iteration of your dynamic node-based collection. When your done,
you release the deferred reference you received from the master; this is the
read-unlock.
The write-lock can be you own custom mutex, or POSIX Mutex... Whatever.
You use your mutex of choice to pop or add a node to your own dynamic
node-based collections.
If you decide to pop an node, and want to free that node when no readers are
accessing it, you use pc_defer. You defer the destruction of the node when
there are no reader threads!
PDR = PCOW Deferred Destruction
PCOW = Partial COW
COW= Copy-On-Write!
...
> PDR = PCOW Deferred Destruction
Deferred Reclamation?
Dmitriy V'jukov
> Not really. I agree with Elcaro: Usenet isn't the right forum to post
> undocumented code. Most of what Chris Thomasson (and several others) post
> here would be better suited to a blog.
Chris offer to you the most advanced high tech in the world that very
few people around the world can create. And very few of those people
post such code to public. And comments don't help here at all, or it
must be some papers of hundreds of pages with pictures and graphs...
So Chris can join those people who don't post such code to public...
It is yours choice...
You really can't find such code anywhere else... So let it would be
undocumented then it would not be at all. This is my opinion.
Dmitriy V'jukov
Yup. :^)
> /* Hint, you can take a proxy reference for writer algorithms... */
Are you mean that writer threads can call pc_inc/pc_dec like readers
to work concurrently? I.e. several writer threads can modify data
structure concurrently by means of some lock-free algorithm?
>
> Any thoughts?
>
> ;^)
Now I understand the PDR with global hash table!
You don't use proxy for holding some object at all, you use proxy only
for reclamation of unnecessary nodes. I always think about PDR on the
other hand. I use PDR for holding some object such that writers can
replace that object and readers can get this object *from* proxy.
So pc_inc/pc_dec doesn't actually acquire any object, they just
"freeze" and defer node deletion! Cool!
PDR used in that way outperforms SMR in sense that one don't have to
pin every used node by hazard pointer, one can pin the whole structure
by only one atomic RMW. If I get you right.
Dmitriy V'jukov
PDR doesn't refer to a specific algorithm. It's a form of memory management
used in a common lock-free design pattern (the PCOW in PDR). There's a variation
of that called proxy collectors of which Paul McKenney, Maged Michael, Chris, and
myself have done a few versions.
--
Joe Seigh
When you get lemons, you make lemonade.
When you get hardware, you make software.
Little effort?
;^)
>
>> And very few of those people post such code to public. And comments
>> don't help here at all, or it must be some papers of hundreds of
>> pages with pictures and graphs.
>
> Chris posts his buggy undocumented code every few days and this doesn't
> help anyone.
Fuc^ you.
Sorry for that juvenile crap attitude...
Just out of curiosity...
How many of you see thin as absolutely NOO help at ALLL
???
This used to be an interesting group for the general discussion of
threaded programming, with a diverse mix of questions and insightful
answers.
It is now almost a one man lock-free splatter fest. Either a new
lock-free group or a blog would be a more appropriate place.
--
Ian Collins.
Indeed.
> You don't use proxy for holding some object at all, you use proxy only
> for reclamation of unnecessary nodes.
Right.
> I always think about PDR on the other hand.
I noticed. I mention this wrt your algorithm focus on single object. Still,
your algorithm works.
> I use PDR for holding some object such that writers can
> replace that object and readers can get this object *from* proxy.
Yes. I completely understand. I read your code in detail.
> So pc_inc/pc_dec doesn't actually acquire any object, they just
> "freeze" and defer node deletion! Cool!
Bingo! ;^)
Freeze... Well, that moment in time, is tracked in the generation pool
"so-to-speak"... So, your totally correct in a sense.
> PDR used in that way outperforms SMR in sense that one don't have to
> pin every used node by hazard pointer, one can pin the whole structure
> by only one atomic RMW. If I get you right.
Exatcly.
!
I was thinking C++ for some reason.
Humm... Perhaps.
> "most advanced high tech" - LOL.
> There's enough documentation on lock-free Code around and you can
> create your own lock-free algorithms with little effort.
LROTF
> Chris posts his buggy undocumented code every few days and this doesn't
> help anyone.
Speak for oneself
Dmitriy V'jukov
> It is now almost a one man lock-free splatter fest. Either a new
> lock-free group or a blog would be a more appropriate place.
I create dedicated group for lock-free tech:
http://groups.google.com/group/lock-free/topics
So all interested people are welcome to post/comment some thoughts/
ideas/algos related to lock-free tech to this new group.
Dmitriy V'jukov
> I noticed. I mention this wrt your algorithm focus on single object. Still,
> your algorithm works.
My algorithm is very close per se to your pc_sample and Joe Seigh
atomic_ptr. So I am sure that the same back-link mechanism can be
plugged into my algorithm. And vice versa pc_sample and atomic_ptr can
work only on xadd primitive like my algo.
BTW. I don't know what it's all about and I am not well-informed about
this field. But. Joe Seigh said some time ago that there are some
patents related to PDR and atomic_ptr. Can such improvement
((atomic_cas -> atomic_add) and hence (lock-free -> wait-free)) be a
blocking patent?
Dmitriy V'jukov
> PDR doesn't refer to a specific algorithm. It's a form of memory management
> used in a common lock-free design pattern (the PCOW in PDR). There's a variation
> of that called proxy collectors of which Paul McKenney, Maged Michael, Chris, and
> myself have done a few versions.
What things from my post refer only to Chris Thomasson's
implementation and not refer to PDR as a whole?
I didn't see papers on PDR by Paul McKenney and Maged Michael... It
seems that they call this differently. Can you provide some links?
Dmitriy V'jukov
> PDR doesn't refer to a specific algorithm. It's a form of memory management
> used in a common lock-free design pattern (the PCOW in PDR). There's a variation
> of that called proxy collectors of which Paul McKenney, Maged Michael, Chris, and
> myself have done a few versions.
What things from my post refer only to Chris Thomasson's
Most of us don't use google groups to read USENET. One person
complaining about discussion of lock-free techniques is not enough to
move the discussion elsewhere. I find it interesting, even if I don't
use lock-free techniques that often -- but when I do, I currently know
where to look.
--
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.5" / 37N 20' 15.3"
Internet: steve @ Watt.COM Whois: SW32-ARIN
Free time? There's no such thing. It just comes in varying prices...
You can try some of the links in the references section here
http://atomic-ptr-plus.sourceforge.net/
Most papers only use the term for their own algorithms, not
a general term. So
RCU for Paul McKenney's stuff
SMR (safe memory relamation?) hazard pointers for Maged Michael's stuff
ROP (repeat offender problem) for Herlihy's stuff
It would have to have a fairly dramatic advantage over the CAS
version to make it worthwhile from an individual point of view.
For companies the patents don't have to be that good to have
value as part of a patent porfolio.
And I think there a few ways of packing an address and count into
a word out there, so you have to demonstrate novelty and non-obviousness
there. Just being different wouldn't be enough in of itself.
It wasn't the topic or the discussion, but the deluge of incomplete code
and postings that I have a problem with.
I too find lock free an interesting topic.
--
Ian Collins.
Most of the lock-free stuff I've seen here has been promoted as a
solution to the reader/writer problem with the writer part using
conventional locking. This application of lock-free will be important
because in most cases conventional rwlocks will not scale very well.
You can disagree on that latter point and are welcome to try and convice
Linus that Linux would be better off with conventional rwlocks instead
of RCU but I suspect you'd be laughed off of LKML.
I did do a lock-free FIFO queue implementation using STM compared
against Java's adaption of Michael and Scott's lock-free FIFO queue.
A comparison of lock-free AVL or red-black trees would have been
better from a reader/writer point of view but no one's done any
Java implemetations yet. The STM version did a lot worse because
of higher memory usage. If you subtracted out the GC the execution
times were fairly close. Of course using hash maps a lot didn't help
things too much. But that's about as close as I've gotten to
a pure lock-free producer/consumer solution.
It would be nice if in general there was a lot more large project
experience out there besides Linux and a few other things. The
problem is it takes disapline to do locking properly let alone
lock-free. If you can't do the former right, you certainly can't
do the latter right. Let's say there's a lot of faith based
multi-threading out there. :)
> It would have to have a fairly dramatic advantage over the CAS
> version to make it worthwhile from an individual point of view.
> For companies the patents don't have to be that good to have
> value as part of a patent porfolio.
>
> And I think there a few ways of packing an address and count into
> a word out there, so you have to demonstrate novelty and non-obviousness
> there. Just being different wouldn't be enough in of itself.
I better read some articles on this first before asking more stupid
questions :)
Dmitriy V'jukov
> lock-free code is unsuitable for most producer-consumer patterns
What are you mean? Can you explain more detailed?
Dmitriy V'jukov
Yes, good point. Lock-free algorithms can be applied for the short
term scheduling and, obviously, they are not applicable for the long
term scheduling of the processes, which is the case in the producer-
consumer pattern. However,lock-free maniacs do not see the point.
Best Regards,
Szabolcs
> lock-free code is unsuitable for most producer-consumer patterns
What are you mean? Can you explain more detailed?
Dmitriy V'jukov
> > With lock-free algorithms it isn't possible to block whenn a FIFO-queue
> > is empty. But this is necessary in most producer/consumer-patterns.
>
> Yes, good point. Lock-free algorithms can be applied for the short
> term scheduling and, obviously, they are not applicable for the long
> term scheduling of the processes, which is the case in the producer-
> consumer pattern. However,lock-free maniacs do not see the point.
You can build blocking behavior on top of any lock-free queue as you
do with lock-based queue. There is completely no difference. What's
the problem?
For example you can look at this library:
http://appcore.home.comcast.net/
(see ac_eventcount_algo1/ac_queue_spsc components)
Btw with blocking behavior you don't get any additional overhead while
queue is not emppy.
Dmitriy V'jukov