Software Transactional Memory

Joe Seigh

unread,

Feb 18, 2007, 6:28:48 PM2/18/07

to

I've gone over to the dark side. I'm doing an STM
implementation in Java. :)

It's not too tricky. Most of the work is working out
the api.

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Chris Thomasson

unread,

Feb 18, 2007, 8:55:23 PM2/18/07

to

"Joe Seigh" <jsei...@xemaps.com> wrote in message
news:guOdnXWtKZ8UfUXY...@comcast.com...

> I've gone over to the dark side.

;^) I too have several old STM implementations in C/ASM laying around on
some old CD I used for archiving my experiments.

I just hate that STM seems to have been created because the "threads are
impossible to program with" mantra keeps being spewed all over Academia and
"BIG" businesses. Oh well! If you can't beat em'.... JOIN EM'

:^/

> I'm doing an STM implementation in Java. :)

I can't say that I ever tried doing it in "pure" Java... Humm, are you doing
this for as an interesting experiment? Well, I guess it's better than
synchronized blocks... BTW, why did you pick Java? STM seems like something
that should be built directly into the JVM, or in an external library based
on C/ASM and PThreads...

> It's not too tricky.

How are you implementing the commit(...) function and does it use per-object
meta-data?

I think one of my older STM's got away with using no meta-data on a
per-object basis. Humm, I can't totally remember the internal STM
implementation algorithms I came up with. Damn... That archive CD might be
hard to find. Some of those algorithms allowed be to outperform all of the
other existing STM's at the time... Its like my brain does not want to
remember my old stm algorithms!

Joe Seigh

unread,

Feb 18, 2007, 9:36:09 PM2/18/07

to

Chris Thomasson wrote:
> "Joe Seigh" <jsei...@xemaps.com> wrote in message
> news:guOdnXWtKZ8UfUXY...@comcast.com...
>
>>I've gone over to the dark side.
>
>
> ;^) I too have several old STM implementations in C/ASM laying around on
> some old CD I used for archiving my experiments.
>
> I just hate that STM seems to have been created because the "threads are
> impossible to program with" mantra keeps being spewed all over Academia and
> "BIG" businesses. Oh well! If you can't beat em'.... JOIN EM'
>
> :^/
>

Go work in a UML shop where they don't have data structures (e.g.
collections, etc...). Everything is just a bunch of relations.

>
>
>>I'm doing an STM implementation in Java. :)
>
>
> I can't say that I ever tried doing it in "pure" Java... Humm, are you doing
> this for as an interesting experiment? Well, I guess it's better than
> synchronized blocks... BTW, why did you pick Java? STM seems like something
> that should be built directly into the JVM, or in an external library based
> on C/ASM and PThreads...
>

I'm working with Java now. Basically I'm doing a canonical
implmentation so I can support something like it. I'll know
what the issues are.

>
>
>>It's not too tricky.
>
>
> How are you implementing the commit(...) function and does it use per-object
> meta-data?

Just the standard tricks the other implementations use I
would guess. I'm not sure what you mean by per object
meta-data. The objects are versioned of course.

Joe Seigh

unread,

Feb 19, 2007, 5:59:15 AM2/19/07

to

Joe Seigh wrote:
> Chris Thomasson wrote:
[...]

>> I just hate that STM seems to have been created because the "threads
>> are impossible to program with" mantra keeps being spewed all over
>> Academia and "BIG" businesses. Oh well! If you can't beat em'.... JOIN
>> EM'
>>
>> :^/
>>
> Go work in a UML shop where they don't have data structures (e.g.
> collections, etc...). Everything is just a bunch of relations.
>

I should add that STM might be a good solution for doing lock-free
implementations of arbitrary data structures where the trick for
doing it with conventional PDR is not so obvious. It's not so
clear that it will scale system wide since it just replaces the
Giant Global Lock with a Giant Compare and Swap.

I'm sort of investing the latter issue but I don't think there's
any magic to solve that problem. A lot of hype maybe but no
magic.

Bin Chen

unread,

Feb 19, 2007, 8:19:49 PM2/19/07

to

On 2月19日, 下午6时59分, Joe Seigh <jseigh...@xemaps.com> wrote:
> Joe Seigh wrote:
> > Chris Thomasson wrote:
> [...]
> >> I just hate that STM seems to have been created because the "threads
> >> are impossible to program with" mantra keeps being spewed all over
> >> Academia and "BIG" businesses. Oh well! If you can't beat em'.... JOIN
> >> EM'
>
> >> :^/
>
> > Go work in a UML shop where they don't have data structures (e.g.
> > collections, etc...). Everything is just a bunch of relations.
>
> I should add that STM might be a good solution for doing lock-free
> implementations of arbitrary data structures where the trick for
> doing it with conventional PDR is not so obvious. It's not so
> clear that it will scale system wide since it just replaces the
> Giant Global Lock with a Giant Compare and Swap.
>

What is the benefit of replace the GGL with GCS? That only I can see
is it is deadlock-immune.
Can't image a lot of performance promotion.
I read some lock-free papers these days, the direct sense is the lock-
free algorithm is a speculate method. You first do a bunch of things
assuming someone don't touch the source, but when you done you check
others have touched the source, if so you abandon what you have done
and do it again.
It is deadlock-immune seems it has no locks any more.

Please correct me if wrong... thx.

Joe Seigh

unread,

Feb 19, 2007, 10:24:15 PM2/19/07

to

Bin Chen wrote:
> On 2月19日, 下午6时59分, Joe Seigh <jseigh...@xemaps.com> wrote:
>
>>Joe Seigh wrote:
>>
>>I should add that STM might be a good solution for doing lock-free
>>implementations of arbitrary data structures where the trick for
>>doing it with conventional PDR is not so obvious. It's not so
>>clear that it will scale system wide since it just replaces the
>>Giant Global Lock with a Giant Compare and Swap.
>>
>
> What is the benefit of replace the GGL with GCS? That only I can see
> is it is deadlock-immune.
> Can't image a lot of performance promotion.
> I read some lock-free papers these days, the direct sense is the lock-
> free algorithm is a speculate method. You first do a bunch of things
> assuming someone don't touch the source, but when you done you check
> others have touched the source, if so you abandon what you have done
> and do it again.
> It is deadlock-immune seems it has no locks any more.
>
> Please correct me if wrong... thx.

It's also reader lock-free and if you have a lot of read only
accesses, there's a clear performance and scalability benefit.

You can do writes with a conventional lock and still have
lock-free readers. I'm doing all lock-free first since it has
a simpler api. I came up with at least three different ways
of doing lock-based writes. Better to start with the simpler
stuff first, work out the basic principles, and add the complicated
stuff later.

The big challenge will be how to get STM to scale from simple
lock-free collections to large scale systems.

Chris Thomasson

unread,

Feb 20, 2007, 1:00:26 AM2/20/07

to

"Joe Seigh" <jsei...@xemaps.com> wrote in message

news:seidncKgvvat9EfY...@comcast.com...

> Bin Chen wrote:
>> On 2月19日, 下午6时59分, Joe Seigh <jseigh...@xemaps.com> wrote:
>>
>>>Joe Seigh wrote:
>>>

[...]

>>>It's not so
>>>clear that it will scale system wide since it just replaces the
>>>Giant Global Lock with a Giant Compare and Swap.
>>>
>>
>> What is the benefit of replace the GGL with GCS? That only I can see
>> is it is deadlock-immune.
>> Can't image a lot of performance promotion.

[...]

>> Please correct me if wrong... thx.
>
> It's also reader lock-free and if you have a lot of read only
> accesses, there's a clear performance and scalability benefit.

[...]

Indeed... However, how are you actually implementing the reader logic? Can
they become part of a transaction? Each transactional pointer load/store
basically means that a record of some kind has to be added to the
transaction... Get a hundred pointer loads per-transaction, well the 100
checks at commit-time... Basically, do your lock-free reader threads make
use of the so-called (begin_transaction - commit_transaction) API's?

> You can do writes with a conventional lock and still have
> lock-free readers.

That's usually an ideal scenario... Unless you can get a writer algorithm to
outperform it on the non-contended case...

> I'm doing all lock-free first since it has
> a simpler api. I came up with at least three different ways
> of doing lock-based writes. Better to start with the simpler
> stuff first, work out the basic principles, and add the complicated
> stuff later.

The writer stuff is fairly straight forward. For sure I would do a
lock-based STM first...

>>>It's not too tricky.
>>
>>
>> How are you implementing the commit(...) function and does it use
>> per-object
>> meta-data?
>
> Just the standard tricks the other implementations use I
> would guess. I'm not sure what you mean by per object
> meta-data. The objects are versioned of course.

FWIW, I was using a single version counter per-object group. A
"transactional object", including all of the loads/stores, are attached to a
group. Therefore, a single change to the groups version may affect multiple
objects within... The transactional memory is allocated and organized into
"transactional groups" at application startup time...

BTW, how are you handling the race-condition in the commit function? Here is
an example of the race-condition:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/b21b212dd175baab

This is a dangerous condition...

> The big challenge will be how to get STM to scale from simple
> lock-free collections to large scale systems.

Yup... I got it to scale once... However, the end result wrt the internal
implementation, well, it was basically multiplexed message passing with
time-version stamp per-message; nothing like STM... Seems like STM
implemented in terms of message passing can scale a bit... Humm...

Chris Thomasson

unread,

Feb 20, 2007, 1:14:06 AM2/20/07

to

"Chris Thomasson" <cri...@comcast.net> wrote in message
news:RPqdnQMgxPf1E0fY...@comcast.com...

> "Joe Seigh" <jsei...@xemaps.com> wrote in message

[...]

>> You can do writes with a conventional lock and still have
>> lock-free readers.
>
> That's usually an ideal scenario... Unless you can get a writer
> algorithm to

^ lock-free

Joe Seigh

unread,

Feb 20, 2007, 6:39:21 AM2/20/07

to

Chris Thomasson wrote:
> "Joe Seigh" <jsei...@xemaps.com> wrote in message
> news:seidncKgvvat9EfY...@comcast.com...
>
>>Bin Chen wrote:
>>
>>>On 2月19日, 下午6时59分, Joe Seigh <jseigh...@xemaps.com> wrote:
>>>
>>>
>>>>Joe Seigh wrote:
>>>>
>
> [...]
>
>
>>>>It's not so
>>>>clear that it will scale system wide since it just replaces the
>>>>Giant Global Lock with a Giant Compare and Swap.
>>>>
>>>
>>>What is the benefit of replace the GGL with GCS? That only I can see
>>>is it is deadlock-immune.
>>>Can't image a lot of performance promotion.
>
> [...]
>
>>>Please correct me if wrong... thx.
>>
>>It's also reader lock-free and if you have a lot of read only
>>accesses, there's a clear performance and scalability benefit.
>
> [...]
>
> Indeed... However, how are you actually implementing the reader logic? Can
> they become part of a transaction? Each transactional pointer load/store
> basically means that a record of some kind has to be added to the
> transaction... Get a hundred pointer loads per-transaction, well the 100
> checks at commit-time... Basically, do your lock-free reader threads make
> use of the so-called (begin_transaction - commit_transaction) API's?

I believe I mentioned this before. Versioning through pointer indirection
to pick up the right version of the object. There's an api involved
since you have to pick up a current version number at the beginning of the
transaction.

>
>
>
>
>>You can do writes with a conventional lock and still have
>>lock-free readers.
>
>
> That's usually an ideal scenario... Unless you can get a writer algorithm to
> outperform it on the non-contended case...
>
>
>
>
>>I'm doing all lock-free first since it has
>>a simpler api. I came up with at least three different ways
>>of doing lock-based writes. Better to start with the simpler
>>stuff first, work out the basic principles, and add the complicated
>>stuff later.
>
>
> The writer stuff is fairly straight forward. For sure I would do a
> lock-based STM first...

Lock-free can be converted to lock based without too much trouble.
Going the other way could be problematic depending on your lock
based write logic.

>
>
>
>>>>It's not too tricky.
>>>
>>>
>>>How are you implementing the commit(...) function and does it use
>>>per-object
>>>meta-data?
>>
>>Just the standard tricks the other implementations use I
>>would guess. I'm not sure what you mean by per object
>>meta-data. The objects are versioned of course.
>
>
> FWIW, I was using a single version counter per-object group. A
> "transactional object", including all of the loads/stores, are attached to a
> group. Therefore, a single change to the groups version may affect multiple
> objects within... The transactional memory is allocated and organized into
> "transactional groups" at application startup time...
>
> BTW, how are you handling the race-condition in the commit function? Here is
> an example of the race-condition:
>
> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/b21b212dd175baab
>
> This is a dangerous condition...
>
>

No KCSS currently. I'm missing something. How would you use KCSS? I
would think you'd need KCKS or something if you wanted to go that route.

Chris Thomasson

unread,

Feb 20, 2007, 6:17:37 PM2/20/07

to

"Joe Seigh" <jsei...@xemaps.com> wrote in message

news:EoadneZAS_SlQEfY...@comcast.com...

> Chris Thomasson wrote:
[...]
>> Indeed... However, how are you actually implementing the reader logic?
>

> I believe I mentioned this before.

The indirection method, or the version per-group thing mentioned below? I
remember you mentioning something about integrating the transactional memory
into a PDR implementation.

> Versioning through pointer indirection
> to pick up the right version of the object.

Do your readers have to pick up version numbers as well? I am nit-picking
here, however, if they do need to pickup versions and validate them at
commit time, well, that's one of my main concerns with STM...

>> The writer stuff is fairly straight forward. For sure I would do a
>> lock-based STM first...
>
> Lock-free can be converted to lock based without too much trouble.
> Going the other way could be problematic depending on your lock
> based write logic.

True. However, I think I remember having kind of a hard time getting a STM
based on lock-free to outperform one based on locks... I remember reading an
Intel research paper on the issue entitled "STM Should Not Be
Obstruction-Free" a couple of years ago...

>> BTW, how are you handling the race-condition in the commit function? Here
>> is an example of the race-condition:
>>
>> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/b21b212dd175baab
>>
>> This is a dangerous condition...
>>
> No KCSS currently.

Doesn't the same race-condition I described for SUN's KCSS apply to the
commit function for basically any STM?

> I'm missing something. How would you use KCSS?

I guess you could use it in a STM implementation... Something like "compare
X number of version counts with the fresh versions, and mark a threads
transaction descriptor from 'commit_pending' to 'commit_finalizing' if they
are all the same"

> I would think you'd need KCKS or something if you wanted to go that route.

MCAS sucks! :O

Joe Seigh

unread,

Feb 20, 2007, 9:49:49 PM2/20/07

to

Chris Thomasson wrote:
> "Joe Seigh" <jsei...@xemaps.com> wrote in message
> news:EoadneZAS_SlQEfY...@comcast.com...
>
>>Chris Thomasson wrote:
>
> [...]
>
>>>Indeed... However, how are you actually implementing the reader logic?
>>
>>I believe I mentioned this before.
>
>
> The indirection method, or the version per-group thing mentioned below? I
> remember you mentioning something about integrating the transactional memory
> into a PDR implementation.
>

It will be a while before I'm finished coding and experimenting with
api design so I just discuss the general techniques. The obvious
PDR solution is a proxy collector with the version number in the
proxy object. To start a transaction, you just atomically reference
the proxy object. You use object versioning. Newer versions point
to older versions. When you add a new version of an object, you
queue the older version onto the current proxy object. In Java
you use WeakReferences. The proxy object references keep the older
verions around as long as they're needed. You make PDR/GC do
all the work of keeping track of which verions need to be kept.

>
>
>>Versioning through pointer indirection
>>to pick up the right version of the object.
>
>
> Do your readers have to pick up version numbers as well? I am nit-picking
> here, however, if they do need to pickup versions and validate them at
> commit time, well, that's one of my main concerns with STM...
>
>
>

Readers don't commit. They just use the version number from the proxy
object to index into the per object version list.

[...]

>>
>>No KCSS currently.
>
>
> Doesn't the same race-condition I described for SUN's KCSS apply to the
> commit function for basically any STM?
>
>
>
>>I'm missing something. How would you use KCSS?
>
>
> I guess you could use it in a STM implementation... Something like "compare
> X number of version counts with the fresh versions, and mark a threads
> transaction descriptor from 'commit_pending' to 'commit_finalizing' if they
> are all the same"
>
>

No KCSS. Just a bunch single compare and swaps. I'll post the commit
method (it's not tested yet. It's still in the design phase.

There's a commit linked queue which normally contains a single node
pointing to the current proxy object.

You build a list of nodes each pointing a new version of object which
points to the previous version (making it a COW of the object version list).
The node also points to the version list anchor. The last node in the
list points to a new proxy object with the version number incremented.
In the code below the list is build LIFO so the last node in the list is
build first.

You attempt to commit the transaction by swapping your list with the
current node as long as the proxy object it points to is still
equal to your verson number (meaning no other transactions have
committed since you started your transaction).

You then execute the commit list using compare and swaps whether
or not your commit succeeded or not. If it failed, it's another
threads transaction. All compare and swaps in the list will
succeed exactly once for some thread.

So the commit list logic is set all the new object verions in
place and then set the new proxy/version number.

If your commit succeeded, you own the replaced proxy object which
you can use to queue deferred reclamations. You also have to
back link it to the new proxy object. If you don't have GC,
a lot of other stuff will have to be proxy collected also.

After you do the tryCommit, you can exit the transaction by
dropping the proxy reference you got on the start.

public boolean tryCommit() {
Set<Object> oldObjectVersions;
boolean rc = false;
XmemLocal xlocal = local.get();

xlocal.assertActiveTransaction();

// check if transaction is still current
// implies updateMap is still valid
AtomicCommitNode oldNode = commit.get();
if (xlocal.current.version == proxy.get().version) { // acquire

oldObjectVersions = new HashSet<Object>();
XmemCollectorNode newNode = new XmemCollectorNode(xlocal.current.version + 1);
AtomicCommitNode commitList = new AtomicCommitNode(newNode);
// copy new objects to atomic commit list
// add old objects to proxy garbage collection set
for (XmemObject obj : xlocal.updateMap.values()) {
commitList = new AtomicCommitNode(commitList, obj);
oldObjectVersions.add(obj.next.get()); // previous version of object
}

/*
* attempt to compare and swap commitList
*/
rc = commit.compareAndSet(oldNode, commitList);
if (rc == true) {
xlocal.current.deferredCollection = oldObjectVersions;
xlocal.current.next = newNode;
xlocal.updateMap = null;
}
}

/*
* process commitList
* executed by all threads to ensure forward progress
*/

AtomicCommitNode commitNode = commit.get();
while (commitNode.collectorNode == null) {
XmemObject obj = commitNode.obj;
XmemReference ref = obj.currentRef;
ref.currentObj.compareAndSet(obj.next.get(), obj);
commit.compareAndSet(commitNode, commitNode.next); // attempt to atomically advance

commitNode = commit.get(); // get next commit
}

// finally attempt to set current proxy
proxy.compareAndSet(commitNode.collectorNode.next, commitNode.collectorNode);

return rc;
}

There's a little bit of logic for the versioning pointers with per thread
caching of modified objects (the updateMap) The above is most of the logic.

As far as the rest of the api goes, I'm leaning towards doing the transactions
with enclosures with a retry policy thrown in there somewhere.

Chris Thomasson

unread,

Feb 20, 2007, 11:44:19 PM2/20/07

to

"Joe Seigh" <jsei...@xemaps.com> wrote in message

news:P9-dnZ_7DL8EL0bY...@comcast.com...

> Chris Thomasson wrote:
>> "Joe Seigh" <jsei...@xemaps.com> wrote in message

[...]

Need to properly examine your commit function... Anyway, FWIW here is the
older collector source code I used for the transactions:

http://home.comcast.net/~vzoom/demos/pc_sample.c
(fairly slick design, imho at least... ;^))

It's fairly efficent proxy collector implementation. It has some of that
"kung fu" monkey stuff. Just a simple XCHG instruction for proxy collector
swaps on the anchor... It uses some standard stuff like back-linking the
proxy in a FIFO queue. This has everything you need for 100% lock-free proxy
collector service.

I stripped out the transaction logic which was in x86 assembly. I wonder how
many people reading this would like to see a transaction implementation in
80% x86 assembly language? My guess is not too many... Oh well...

:O

Joe Seigh

unread,

Feb 21, 2007, 6:45:00 AM2/21/07

to

Chris Thomasson wrote:
> "Joe Seigh" <jsei...@xemaps.com> wrote in message
> news:P9-dnZ_7DL8EL0bY...@comcast.com...
>
>>Chris Thomasson wrote:
>>
>>>"Joe Seigh" <jsei...@xemaps.com> wrote in message
>
>
> [...]
>
> Need to properly examine your commit function... Anyway, FWIW here is the
> older collector source code I used for the transactions:

The commit logic pushes the new object versions onto the object version
queues. It's a lock-free push. The new verions can't be see until the
version number is incremented after all the individual verion queues
are updated. The logic is probably in use elsewhere already. I was
just too lazy to go and research how the current STM implementations
did it so I just took a guess at it using what I already know about
techniques used in lock-free algorithms.

>
> http://home.comcast.net/~vzoom/demos/pc_sample.c
> (fairly slick design, imho at least... ;^))
>
> It's fairly efficent proxy collector implementation. It has some of that
> "kung fu" monkey stuff. Just a simple XCHG instruction for proxy collector
> swaps on the anchor... It uses some standard stuff like back-linking the
> proxy in a FIFO queue. This has everything you need for 100% lock-free proxy
> collector service.
>
> I stripped out the transaction logic which was in x86 assembly. I wonder how
> many people reading this would like to see a transaction implementation in
> 80% x86 assembly language? My guess is not too many... Oh well...
>

Any proxy collector should work. The versioned pointers in the java version
are gc'd so you'd need to work out how to do that. Most likely the way
you do things now by knowing when an object is removed from a collection.
Plus I should change what I call the versioned pointers since there is only
one per object.

Chris Thomasson

unread,

Feb 23, 2007, 12:18:05 AM2/23/07

to

"Joe Seigh" <jsei...@xemaps.com> wrote in message

news:P9-dnZ_7DL8EL0bY...@comcast.com...

> Chris Thomasson wrote:
>> "Joe Seigh" <jsei...@xemaps.com> wrote in message
>> news:EoadneZAS_SlQEfY...@comcast.com...

[...]

> Readers don't commit. They just use the version number from the proxy
> object to index into the per object version list.

won't certain algorithms require what it reads in a transaction to be
validated? For instance, a transactions with a lot of reads, and only a
single write that is dependant on the reads still being valid at commit
time...

> [...]
>
>>>
>>>No KCSS currently.
>>
>>
>> Doesn't the same race-condition I described for SUN's KCSS apply to the
>> commit function for basically any STM?

[...]

> No KCSS. Just a bunch single compare and swaps. I'll post the commit
> method (it's not tested yet. It's still in the design phase.

I think you can cut down on the number of CAS's by using some clever
locking...

Chris Thomasson

unread,

Feb 23, 2007, 12:23:59 AM2/23/07

to

[...]

> If your commit succeeded, you own the replaced proxy object which
> you can use to queue deferred reclamations. You also have to
> back link it to the new proxy object. If you don't have GC,
> a lot of other stuff will have to be proxy collected also.

I remember using 64-bit CAS and an offset trick so that I could store
transactional object descriptors in a shared cache instead sending them
through the PDR... It really cut down on the number of objects that needed
to have their lifetimes managed... You can use that StampedReference, or
whatever SUN calls it, thing in Java?? BTW... How the heck does the JVM
implement that function on a SPARC in 64-bit mode?

Bin Chen

unread,

Feb 23, 2007, 6:24:19 AM2/23/07

to

How can I find some introduction material about PDR?

Joe Seigh

unread,

Feb 23, 2007, 6:41:48 AM2/23/07

to

Chris Thomasson wrote:
> "Joe Seigh" <jsei...@xemaps.com> wrote in message
> news:P9-dnZ_7DL8EL0bY...@comcast.com...
>
>>Chris Thomasson wrote:
>>
>>>"Joe Seigh" <jsei...@xemaps.com> wrote in message
>>>news:EoadneZAS_SlQEfY...@comcast.com...
>
> [...]
>
>>Readers don't commit. They just use the version number from the proxy
>>object to index into the per object version list.
>
>
> won't certain algorithms require what it reads in a transaction to be
> validated? For instance, a transactions with a lot of reads, and only a
> single write that is dependant on the reads still being valid at commit
> time...
>

Yes. Generally, the transaction manager can't tell which reads the
writes have dependencies on unless the api is modified to require those
reads distinquished from other reads. It's no worse than current
lock-free which require data structures that tolerate stale reads. You
can use one transaction manager to to read only access of a collection
and another transaction manager or other form of synchronization on the
object in the collection. Just like we do now with reader/writer solutions
on collections.

>
>>[...]
>>
>>
>>>>No KCSS currently.
>>>
>>>
>>>Doesn't the same race-condition I described for SUN's KCSS apply to the
>>>commit function for basically any STM?
>
>
> [...]
>
>
>>No KCSS. Just a bunch single compare and swaps. I'll post the commit
>>method (it's not tested yet. It's still in the design phase.
>
>
> I think you can cut down on the number of CAS's by using some clever
> locking...
>

It's a straightforward operation to replace the CAS's with a lock but
not necessarily the other way around.

You can always make the thing adaptive. Distinquish read-only transactions
from write transactions and use a conventional read/write lock on the
write transactions. If contention gets too high, write transaction starts
get the rwlock in exclusive mode rather than shared mode. But it's a little
early for that at this point of development.

Joe Seigh

unread,

Feb 23, 2007, 6:44:22 AM2/23/07

to

Internal implementation of Java references don't have to use a full pointer.
They can be indices into a object table. Or they can use locks. The
Atomics package doesn't preclude conventional synchronization in the
implementation.

David Hopwood

unread,

Feb 23, 2007, 4:57:02 PM2/23/07

to

If you mean

<http://java.sun.com/javase/6/docs/api/java/util/concurrent/atomic/AtomicStampedReference.html>

then it's perfectly valid to implement that with a mutex. I don't know how
Sun's JVM implements it, but you could always download the source.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Joe Seigh

unread,

Feb 23, 2007, 7:51:42 PM2/23/07

to

PDR stands for PCOW (Partial Copy On Write) Deferred Reclamation.
PCOW is a characterization of read lock-free algorithms (readers
don't block writers or other readers). PDR refers to the memory
management needed to make these algorithms work. There's also
a design pattern involved. There's no cook book involved. You
sort of have to "get it". You can try going through the RCU
bibliography here

http://www.rdrop.com/users/paulmck/

Chris Thomasson

unread,

Feb 26, 2007, 9:35:50 PM2/26/07

to

I remember using something like the following:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/e0c011baf08844c4

in a lock-based STM... The locks are taken in the transaction itself. When a
lock is out-of-order a try_lock is performed, and if it fails the
transaction is restarted... something like:

reverse_and_restart(transaction_locks, lock_count) {
for (idx = lock_count; idx > -1; --idx) {
pthread_mutex_unlock(&locks[transaction_locks[idx]]);
}
// restart the transaction
}

reverse_and_commit(transaction_locks, lock_count) {
for (idx = lock_count; idx > -1; --idx) {
pthread_mutex_unlock(&locks[transaction_locks[idx]]);
}
// commit the transaction
}

do stm_transaction {
lock_count = 0;
// take a lock for a write
stm_write &w1 = stm_lock(&mydata1 ptr) {
HASHVAL h = HASH(ptr);
pthread_mutex_lock(&locks[h]);
transaction_locks[lock_count] = h;
lock_count++;
}

// take another lock for a write
stm_write &w2 = stm_lock(&mydata2 ptr) {
HASHVAL h = HASH(ptr);

// need to validate order now
if (h >= last_transaction_lock) {
pthread_mutex_lock(&locks[h]);
} else if (pthread_mutex_trylock(&locks[h])) {
reverse_and_restart(transaction_locks, lock_count);
}
}

/* okay, w1 and w2 are locked, there for this transaction
atomically locked everything it needs to peform the writes... */

stm_do_read(w1, buf1);
// alter buf1 to whatever
stm_do_write(w1, buf1);

stm_do_read(w2, buf2);
// alter buf2 to whatever
stm_do_write(w2, buf2);

reverse_and_commit(transaction_locks, lock_count);
}

IIRC,this locking method worked perfect for STM... I guess it kind of like
two-phase locking thing...

Any thoughts?

Chris Thomasson

unread,

Feb 26, 2007, 9:42:26 PM2/26/07

to

"Chris Thomasson" <cri...@comcast.net> wrote in message

news:gvqdnd-1WfNuBX7Y...@comcast.com...

>I remember using something like the following:
>
> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/e0c011baf08844c4
>
> in a lock-based STM... The locks are taken in the transaction itself. When
> a lock is out-of-order a try_lock is performed, and if it fails the
> transaction is restarted... something like:

[...]
> do stm_transaction {
[...]

>
> reverse_and_commit(transaction_locks, lock_count);

I think I should also point out this commit if final. When you use locks for
STM, well, you can get away with making a commit call always succeed... The
transaction-collision detection is based on a very simple hierarchy... If a
transaction takes locks a, b, c well, that's fine and pthread_mutex_lock can
be used... However, if it takes b, c, a, then the locking of 'a' will have
to use pthread_mutex_trylock, and if it fails, all of the transaction
previously acquired mutexs will be unlocked in reverse order, and the
transaction will restart...

> }
>

Chris Thomasson

unread,

Feb 27, 2007, 1:28:28 AM2/27/07

to

Oops! I forgot to update the transaction_locks, lock_count. The following
code:

[...]

> // take another lock for a write
> stm_write &w2 = stm_lock(&mydata2 ptr) {
> HASHVAL h = HASH(ptr);
>
> // need to validate order now
> if (h >= last_transaction_lock) {
> pthread_mutex_lock(&locks[h]);
> } else if (pthread_mutex_trylock(&locks[h])) {
> reverse_and_restart(transaction_locks, lock_count);
> }
> }

[...]

has to be:

// take another lock for a write
stm_write &w2 = stm_lock(&mydata2 ptr) {
HASHVAL h = HASH(ptr);

// need to validate order now
if (h >= last_transaction_lock) {
pthread_mutex_lock(&locks[h]);
} else if (pthread_mutex_trylock(&locks[h])) {
reverse_and_restart(transaction_locks, lock_count);
}

/* store and inc the index */

Greg Herlihy

unread,

Feb 27, 2007, 10:19:04 AM2/27/07

to

On Feb 18, 3:28 pm, Joe Seigh <jseigh...@xemaps.com> wrote:
> I've gone over to the dark side. I'm doing an STM
> implementation in Java. :)
>
> It's not too tricky. Most of the work is working out
> the api.

It may be interesting to compare your STM implementation against other
Java STM libraries (especially since there can't be that many - or so
I would think). I do know of one other (since my brother did some work
on it): the DSTM 2.0 library ( http://www.sun.com/download/products.xml?id=453fb28e
).

Greg

Chris Thomasson

unread,

Feb 27, 2007, 12:39:02 PM2/27/07

to

"Greg Herlihy" <gre...@pacbell.net> wrote in message
news:1172589544.7...@m58g2000cwm.googlegroups.com...

I have read many of your brothers papers! He's a very smart guy, and I
encourage people to download some of his work on STM and read it; is a nice
programming model... However, as we know, its prone to forward progress and
livelock issues. Its basically a tradeoff between simplicity and
performance:

http://groups.google.com/group/comp.arch/msg/995379a16beb3b69

http://groups.google.com/group/comp.arch/msg/fa395eb9ff0b1956

http://groups.google.com/group/comp.arch/msg/c648b0a25d308207

I have also read some of your bothers well thought out work on so-called
"contention managers" to deal with the livelock issue. It would have been
nice if he could of dedicated some more time dealing with the caveats of STM
(e.g., livelock) in the following lecture:

http://groups.google.com/group/comp.arch/browse_frm/thread/7fc5b59b1489194b

Chris Thomasson

unread,

Feb 27, 2007, 12:45:06 PM2/27/07

to

"Greg Herlihy" <gre...@pacbell.net> wrote in message
news:1172589544.7...@m58g2000cwm.googlegroups.com...

> On Feb 18, 3:28 pm, Joe Seigh <jseigh...@xemaps.com> wrote:
>> I've gone over to the dark side. I'm doing an STM
>> implementation in Java. :)
>>
>> It's not too tricky. Most of the work is working out
>> the api.

[...]

> I do know of one other (since my brother did some work
> on it): the DSTM 2.0 library (
> http://www.sun.com/download/products.xml?id=453fb28e
> ).

That one works well. I agree... Its worth testing against.

Joe Seigh

unread,

Feb 27, 2007, 10:21:04 PM2/27/07

to

There should probably be some prototype code out this weekend. It's more
of an exercise in working out the api possibilities so I'll probably look
at the DSTM api later on. Initally the api is directed towards making
individual collections lock-free.

Joe Seigh

unread,

Mar 3, 2007, 4:11:34 PM3/3/07

to Joe Seigh

Joe Seigh wrote:

> Greg Herlihy wrote:
>>
>>
>> It may be interesting to compare your STM implementation against other
>> Java STM libraries (especially since there can't be that many - or so
>> I would think). I do know of one other (since my brother did some work
>> on it): the DSTM 2.0 library (
>> http://www.sun.com/download/products.xml?id=453fb28e
>> ).
>>
>
> There should probably be some prototype code out this weekend. It's more
> of an exercise in working out the api possibilities so I'll probably look
> at the DSTM api later on. Initally the api is directed towards making
> individual collections lock-free.
>

It's on http://sourceforge.net/projects/atomic-ptr-plus/
in the xmem_0-0-0 release.

Joe Seigh

unread,

Mar 5, 2007, 8:43:01 PM3/5/07

to

>
> It's on http://sourceforge.net/projects/atomic-ptr-plus/
> in the xmem_0-0-0 release.
>
>

Some futher comments. The switching of the version number is
what makes the transaction visible to other readers and writers.
It doesn't have to be in the proxy object but if it is it save
an extra compare and swap. If it's updated separately from the
proxy object it should be done last after a new proxy object is
put into place.

If you're implementing it w/ RCU which doesn't have an explicit
proxy object per se, then you just have the version number to
deal with.

If you're using something other than java WeakReferences for the
object version chain then the dtors for the old object versions
should zero out the pointer from the newer object version to them
selves. The version number of the last ojbect in the chain is
ambigous since the object could have lasted for several wrap arounds
of the version number. The presumption is that the dtor will run
long before wrap around occurs and zero out the pointer to the older
version making testing of the version unnecessary.