http://www-sal.cs.uiuc.edu/~zilles/papers/non_transact.transact2006.pdf
That is a lot of work for a synchronization instruction... Whatever happened 
to plain old CAS!
:)
IMHO, they have to implement transactional memory in the hardware if they 
want it to be even remotely usable. I can make use of a transactional 
instruction in vZOOM, however I would try to avoid its use at all costs... 
The transactional instructions have simply got to be more expensive than a 
"normal" CAS or LL/SC... I would not allow my reader threads to use the 
transaction instructions when they are traversing through large 
data-structures. This is not because I don't like TM; I don't let my reader 
threads use CAS, or any atomic operations for that matter...
Reader threads don't seem to like to call these types of instructions. The 
amount of searches' they can perform per-second gets changes drastically 
when CAS and/or memory barriers are used. I could only imagine the overhead 
involved with many reader threads frequently searching/traversing throughout 
a data-structure that has hundreds-of-thousands or even million+ nodes 
during periods of load. Design a TM can be tedious.
Humm, come to think about it... I could use vZOOM to create a STM... 
Transaction logic integrated into my per-thread memory allocator and proxy 
garbage collector... Humm... I think Joe posted something on integrating 
transactions into RCU-SMR...
I would be neat to give vZOOM or RCU-SMR hardware support. I think alls that 
it would take is periodic/episodic interrupt every time all of the CPUS in 
the system have executed "something" that behaves as a memory barrier... 
vZOOM could use this interrupt to implement its polling logic... 
One other MAJOR *point about TM... STM or HTM both "suffer" from a not so 
well known phenomena:
Inconsistent data can, and will, be accessed in a transactional atomic 
block! A users algorithm for a transactional based system is based solely 
inside of an atomic block, e.g.,
atomic { // start transactional atomic block
 users_algorithm{
   // Access transactional data-structures
 }
} // try to commit transactional atomic block
This means that the users algorithm can actually "operate" on inconsistent 
data... No kidding, not at all... therefore, a users algorithm behavior is 
basically undefined, whether the programmer likes it or not... Think about 
it for a minute... If your algorithm is fed with "no-so coherent" data, then 
your algorithm will be subject to undefined behavior by default...
Wow... The implementations that I have seen have to actually catch any and 
all fatal exceptions that can, and WILL, arise when a user algorithm is 
subjected to inconsistent data... Sometimes your algorithm crashes right off 
the bat and gets intercepted by the STM where the transaction will be 
aborted, and retried... Sometimes your algorithm will be stuck in an 
infinite loop (most common problem, IMHO) , inside the atomic block... STM 
has to watch out for this kind of stuff... IMHO, this is a MAJOR drawback 
and a down right dangerous aspect of transactional memory... This behavior 
is inherent in basically any TM implementation; hardware or software...
Need to think some more on this troubling subject... Now you know just some 
of the reasons why I don't like TM...
Yikes!
:O
* I can't believe I forgot to mention this; totally slipped my mind. Right 
when I remembered this I went over to Wikipedia to see if anybody bring this 
fact up; somebody did... IMO, its vital that anybody who thinks about using 
TM completely understands this particular caveat! 
I depends on what your semantics are, i.e. what operations are atomic
w.r.t other operations.  The operations can be plain reads, plain writes,
read only transactions, write transactions, and read/modify/write transations.
Stronger guarantees are likely to incur more overhead or performance impact.
-- 
Joe Seigh
When you get lemons, you make lemonade.
When you get hardware, you make software. 
>
> One other MAJOR *point about TM... STM or HTM both "suffer" from a not so
> well known phenomena:
>
> Inconsistent data can, and will, be accessed in a transactional atomic
> block! A users algorithm for a transactional based system is based solely
> inside of an atomic block, e.g.,
> * I can't believe I forgot to mention this; totally slipped my mind. Right
> when I remembered this I went over to Wikipedia to see if anybody bring this
> fact up; somebody did... IMO, its vital that anybody who thinks about using
> TM completely understands this particular caveat!
The Stanford effort is attempting to address the Consistency concern:
In this paper, we propose a new shared memory model: Transactional
memory Coherence and Consistency (TCC). TCC provides
a model in which atomic transactions are always the basic
unit of parallel work, communication, memory coherence, and
memory reference consistency. TCC greatly simplifies parallel
software by eliminating the need for synchronization using conventional
locks and semaphores, along with their complexities.
http://ogun.stanford.edu/~kunle/publications/tcc_isca2004.pdf