
Lecture on Transactional Memory


Stephen Fuld
Feb 8, 2007, 1:43:14 PM
One of the free, public interest channels I get on my Dish Network
satellite TV is the University of Washington TV network (UWTV). They
air the computer science department's guest lecture series.

I just saw a lecture by Maurice Herlihy of Brown University that was a
good introduction to transactional memory. He explained the problem and
showed how transactional memory might be a solution. He also spent a
lot of time going over the open issues and research opportunities still
available. I believe that pretty much everyone here would learn
something from watching it (I learned a lot, but I am a novice in this
area). He is a good speaker.

Fortunately, these lectures are often available for watching or
download. This one is at

http://www.uwtv.org/programs/displayevent.aspx?rID=9300


--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Chris Thomasson
Feb 20, 2007, 5:55:47 PM
"Stephen Fuld" <S.F...@PleaseRemove.att.net> wrote in message
news:6bKyh.38044$Xq6....@bgtnsc04-news.ops.worldnet.att.net...

> One of the free, public interest channels I get on my Dish Network
> satellite TV is the University of Washington TV network (UWTV). They air
> the computer science departments guest lecture series.
>
> I just saw a lecture, by Maurice Herlihy of Brown University.
[...]

> Fortunately, these lectures are often available for watching or download.
> This one is at
>
> http://www.uwtv.org/programs/displayevent.aspx?rID=9300

http://groups.google.com/group/comp.arch/browse_frm/thread/91cb3fbfa2eb362a

Well, what do you think? ;^)


Stephen Fuld
Feb 21, 2007, 1:00:45 AM


I'm not sure what you are asking here.

If you are asking me about the issues in last year's thread about the
subject, they are way beyond my level of expertise to comment on. I
really just learned what TM is from the above-mentioned lecture.

If you are asking what I thought of the lecture, I enjoyed it and
learned a lot.

If you weren't asking me at all, but this was directed to others, then
sorry for the reply. :-(

Ken Hagan
Feb 21, 2007, 5:41:20 AM
Comparing...

>> http://www.uwtv.org/programs/displayevent.aspx?rID=9300
> http://groups.google.com/group/comp.arch/browse_frm/thread/91cb3fbfa2eb362a

I came away from the lecture thinking that it was one more argument for
some form of aspect-oriented programming, to make it easier to write
things like transactions. TM didn't seem all that revolutionary, more of a
coding style than a paradigm shift.

I did wonder why he was worrying about where the hardware/software split
would lie, since it seemed highly dependent on the ratio of memory speeds
to CPU speeds, and therefore a prime candidate for "cycles of
reincarnation".

Regarding the comp.arch thread, I don't worry about the effect it will
have on students. No-one would dream of using a database that didn't
support transactions, but that doesn't absolve students from needing to
know the problem that they solve in a database context. I can't see how
the context of concurrency would be any different. In any discipline, the
good students will still want to know what their abstractions are
protecting them from, or "what's under the hood" if you prefer.

David Hopwood
Feb 21, 2007, 9:50:56 PM
Chris Thomasson wrote:

> "Stephen Fuld" <S.F...@PleaseRemove.att.net> wrote:
>
>>One of the free, public interest channels I get on my Dish Network
>>satellite TV is the University of Washington TV network (UWTV). They air
>>the computer science departments guest lecture series.
>>
>>I just saw a lecture, by Maurice Herlihy of Brown University.
>
> [...]
>
>>Fortunately, these lectures are often available for watching or download.
>>This one is at
>>
>>http://www.uwtv.org/programs/displayevent.aspx?rID=9300
>
> http://groups.google.com/group/comp.arch/browse_frm/thread/91cb3fbfa2eb362a
>
> Well, what do you think? ;^)

I think that thread was lacking in technical argument. The above lecture is
actually quite reasonable about unsolved problems of STM, and I recommend it
highly.

I wrote some more specific comments on the lecture and on optimistic scheduling
in STM systems at:

<http://www.eros-os.org/pipermail/e-lang/2007-February/011873.html>


(The "transactional E" proposal mentioned earlier in that thread has not been
posted yet, but it will be soon.)

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Chris Thomasson
Feb 23, 2007, 12:09:24 AM
"David Hopwood" <david.nosp...@blueyonder.co.uk> wrote in message
news:ky7Dh.34$HO5...@fe1.news.blueyonder.co.uk...

> Chris Thomasson wrote:
>> "Stephen Fuld" <S.F...@PleaseRemove.att.net> wrote:
>>
>>>One of the free, public interest channels I get on my Dish Network
>>>satellite TV is the University of Washington TV network (UWTV). They air
>>>the computer science departments guest lecture series.
>>>
>>>I just saw a lecture, by Maurice Herlihy of Brown University.
>>
>> [...]
>>
>>>Fortunately, these lectures are often available for watching or download.
>>>This one is at
>>>
>>>http://www.uwtv.org/programs/displayevent.aspx?rID=9300
>>
>> http://groups.google.com/group/comp.arch/browse_frm/thread/91cb3fbfa2eb362a
>>
>> Well, what do you think? ;^)
>
> I think that thread was lacking in technical argument.

Well, I guess, technically speaking, STM is basically equal to a giant
LL/SC... IMHO, locks have far less overhead and they work better than an STM.
The fact that STM has to rely on a contention manager should say a lot about
it... Well, STM is very prone to livelock once you hit it with any sort of
"worst-case" scenario wrt "putting load on the system". Here are some
comments on STM that are a bit more technical:

http://groups.google.com/group/comp.arch/msg/c648b0a25d308207

http://groups.google.com/group/comp.arch/msg/995379a16beb3b69

http://groups.google.com/group/comp.arch/msg/335aeb22fd6fe526

As you can see... I am definitely not a fan of STM! :O BTW, how come most
of the cheerleaders for STM seem to think that locks are too difficult to
program with without deadlocking?
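
To see where that livelock comes from, here is a minimal C++ sketch (mine,
not any particular STM's code) of the optimistic read-validate-commit loop
that word-based STMs are built on; the commit-time CAS is the "giant LL/SC"
in miniature, and the bare retry loop is exactly what a contention manager
exists to damp:

#include <atomic>

struct VersionedWord {
    std::atomic<unsigned long> version{0}; // even = stable, odd = mid-commit
    std::atomic<long>          value{0};
};

// One optimistic "transaction": read, compute, then try to commit.
bool try_increment(VersionedWord& w) {
    unsigned long v0 = w.version.load(std::memory_order_acquire);
    if (v0 & 1) return false;                       // someone is mid-commit
    long result = w.value.load(std::memory_order_acquire) + 1; // txn body
    unsigned long expected = v0;
    if (!w.version.compare_exchange_strong(expected, v0 + 1))  // commit CAS
        return false;                               // conflict: abort
    w.value.store(result, std::memory_order_relaxed);
    w.version.store(v0 + 2, std::memory_order_release); // unlock, new version
    return true;
}

void increment(VersionedWord& w) {
    while (!try_increment(w)) { /* under load, this spins: livelock */ }
}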


> The above lecture is
> actually quite reasonable about unsolved problems of STM, and I recommend
> it
> highly.

I recommend it as well...


> I wrote some more specific comments on the lecture and on optimistic
> scheduling
> in STM systems at:
>
> <http://www.eros-os.org/pipermail/e-lang/2007-February/011873.html>

> (The "transactional E" proposal mentioned earlier in that thread has not
> been
> posted yet, but it will be soon.)

Well, I would recommend message passing over STM anyway... Heck, at least
message passing can be implemented in a highly efficient manner... Man, try
to create an STM that's not saturated with interlocked RMW instructions and
StoreLoad-style memory barriers... I think you will have a hard time.


Joe Seigh
Feb 23, 2007, 6:50:45 AM
Chris Thomasson wrote:
> "David Hopwood" <david.nosp...@blueyonder.co.uk> wrote in message
> news:ky7Dh.34$HO5...@fe1.news.blueyonder.co.uk...
>
>
> Well, I guess, technically speaking, STM is basically equal to a giant
> LL/SC... IMHO, locks have far less overhead and they work better than a STM.
> The fact that STM has to rely on a contention manager should say a lot about
> it... Well, STM is very prone to livelock once you hit it with any sort of
> "worst-case" scenario wrt "putting load on the system". Here are some
> comments on STM that are bit more technical:
>
> http://groups.google.com/group/comp.arch/msg/c648b0a25d308207
>
> http://groups.google.com/group/comp.arch/msg/995379a16beb3b69
>
> http://groups.google.com/group/comp.arch/msg/335aeb22fd6fe526
>
> As you can see... I am definitely not a fan of STM! :O BTW, how come most
> of the cheerleaders for STM seem to think that locks are too difficult to
> program with without deadlocking?
>
>
There are organizations with large unstructured code bases that do not want
to go through the trouble of structuring them to allow efficient and safe
multi-threading. STM is supposed to be their silver bullet. UML can
take a lot of the blame for this. Everything is a bunch of arbitrary
relations. There's no attempt to organize things into hierarchical
structures to enable conventional approaches to multi-threading.

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Nick Maclaren
Feb 23, 2007, 7:12:21 AM

In article <IeednW_4wpxg60PY...@comcast.com>,

"Chris Thomasson" <cri...@comcast.net> writes:
|>
|> Well, I guess, technically speaking, STM is basically equal to a giant
|> LL/SC... IMHO, locks have far less overhead and they work better than a STM.
|> The fact that STM has to rely on a contention manager should say a lot about
|> it... Well, STM is very prone to livelock once you hit it with any sort of
|> "worst-case" scenario wrt "putting load on the system". ...

The cause of that is the standard one that there are damn few people
left who are both capable of thinking mathematically and working on
practical problems. A hell of a lot of the bullshit in computer
science is due to the fact that the perpetrators can get away with
it :-(

It is bloody obvious that STM has every livelock that a lock-based
design has, and will expose every case where a traditional lock was
being used to prevent livelock. It is equally self-evident that every
sufficiently complex problem will suffer from livelock under load,
even if the form that the livelock takes is merely severe performance
degradation. All of that just drops out of the mathematics :-(

What is less forgivable is that it was discovered to occur in
practice many decades back, repeatedly.

|> As you can see... I am definitely not a fan of STM! :O BTW, how come most
|> of the cheerleaders for STM seem to think that locks are too difficult to
|> program with without deadlocking?

Because THEY can't do it? :-)

More seriously, the sort of complex problem where STM has major
difficulties with livelock even under a decent scheduler ARE hard
to program with traditional locks without causing deadlock for the
problem inputs, for the reasons I mention above. What the STM people
miss is that they are hard to program using locks because they are
inherently hard, and there CAN BE no silver bullet.

|> Well, I would recommend message passing over STM anyway... Heck, at least
|> message passing can be implemented in a highly efficient manner... Man, try
|> to create a STM that's not saturated with interlocked RMW instructions and
|> StoreLoad style memory barriers... I think you will have a hard time.

I agree, but that doesn't eliminate any of the fundamental problems,
either. The difference is that the message passing people don't claim
that it does. What it does is to expose the issues, which makes them
easier to program round.

Arguing that STM is a better programming paradigm is one thing, but
claiming that it actually eliminates problems is another.


Regards,
Nick Maclaren.

Eric P.
Feb 23, 2007, 10:19:00 AM
Nick Maclaren wrote:
>
> In article <IeednW_4wpxg60PY...@comcast.com>,
> "Chris Thomasson" <cri...@comcast.net> writes:
> |>
> |> Well, I guess, technically speaking, STM is basically equal to a giant
> |> LL/SC... IMHO, locks have far less overhead and they work better than a STM.
> |> The fact that STM has to rely on a contention manager should say a lot about
> |> it... Well, STM is very prone to livelock once you hit it with any sort of
> |> "worst-case" scenario wrt "putting load on the system". ...
>
> The cause of that is the standard one that there are damn few people
> left who are both capable of thinking mathematically and working on
> practical problems. A hell of a lot of the bullshit in computer
> science is due to the fact that the perpetrators can get away with
> it :-(
>
> It is bloody obvious that STM has every livelock that a lock-based
> design has, and will expose every case where a traditional lock was
> being used to prevent livelock. It is equally self-evident that every
> sufficiently complex problem will suffer from livelock under load,
> even if the form that the livelock takes is merely severe performance
> degradation. All of that just drops out of the mathematics :-(
>
> What is less forgiveable is that it was discovered to occur in
> practice many decades back, repeatedly.

Yes, but my intuition tells me that STM is worse, and that it is
inherently meta-stable and chaotic, though I have no proof of this.
Because of transaction composability, apparently disconnected
operations could start to interact in unanticipated ways.
My concern is that systems built with it could appear to work
fine until some critical set of disparate events occurs,
and then the whole system will begin to thrash and could remain
in that state indefinitely. Or just an apparently minor change
to a subroutine, even though properly tested, will cause livelock.

Furthermore, because it is all timing-based, it may be inherently
unprovable (just guessing - I'm not a mathy). At least with locks
you can prove that your configuration always works.

As I said, I have no proof but it's got that smell.

Eric

Stefan Monnier
Feb 23, 2007, 11:52:58 AM
> Well, I guess, technically speaking, STM is basically equal to a giant
> LL/SC... IMHO, locks have far less overhead and they work better than a STM.
> The fact that STM has to rely on a contention manager should say a lot about
> it... Well, STM is very prone to livelock once you hit it with any sort of
> "worst-case" scenario wrt "putting load on the system". Here are some
> comments on STM that are bit more technical:

You fail to distinguish the STM programming model from the "optimistic-lock"
implementation strategy often associated with it.

The STM programming model basically allows the programmer to write a spec,
which can then be implemented with traditional locks, optimistic locks, ...

Of course, we currently know much better how to automatically derive an
implementation from the spec using optimistic locking than using
traditional locks. But an alternative is to ask the user to write the
locking code as well and to only *check* that the locks do correctly
implement the spec.
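
Concretely, a toy C++ sketch (names invented): the programmer writes only
the atomic "spec"; a single traditional lock is one perfectly valid, if
crude, way of implementing it, and optimistic retry is another:

#include <mutex>

struct Account { long balance = 0; };

// The "spec": the transfer must appear atomic. Nothing here says how.
//   atomic { from.balance -= amount; to.balance += amount; }

// One valid realization of that spec: a traditional global lock.
std::mutex tm_lock;

void transfer(Account& from, Account& to, long amount) {
    std::lock_guard<std::mutex> guard(tm_lock); // trivially serializable
    from.balance -= amount;
    to.balance   += amount;
}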


Stefan

Stefan Monnier
Feb 23, 2007, 11:53:50 AM
> STM is supposed to be their silver bullet.

Obviously it's not: there is no silver bullet.


Stefan

David Hopwood
Feb 23, 2007, 1:14:25 PM
Eric P. wrote:

> Nick Maclaren wrote:
>>"Chris Thomasson" <cri...@comcast.net> writes:
>>|>
>>|> Well, I guess, technically speaking, STM is basically equal to a giant
>>|> LL/SC... IMHO, locks have far less overhead and they work better than a STM.
>>|> The fact that STM has to rely on a contention manager should say a lot about
>>|> it... Well, STM is very prone to livelock once you hit it with any sort of
>>|> "worst-case" scenario wrt "putting load on the system". ...
>>
>>The cause of that is the standard one that there are damn few people
>>left who are both capable of thinking mathematically and working on
>>practical problems. A hell of a lot of the bullshit in computer
>>science is due to the fact that the perpetrators can get away with
>>it :-(
>>
>>It is bloody obvious that STM has every livelock that a lock-based
>>design has, and will expose every case where a traditional lock was
>>being used to prevent livelock. It is equally self-evident that every
>>sufficiently complex problem will suffer from livelock under load,
>>even if the form that the livelock takes is merely severe performance
>>degradation. All of that just drops out of the mathematics :-(
>>
>>What is less forgiveable is that it was discovered to occur in
>>practice many decades back, repeatedly.
>
> Yes, but my intuition tells me that STM is worse, and that it is
> inherently meta-stable and chaotic, though I have no proof of this.
> Because of transaction composability, apparently disconnected
> operations could start to interact in unanticipated ways.

No, not because of transaction composability. Only because of optimistic
scheduling.

Transactions would be useful even in a fully sequential context.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Nick Maclaren
Feb 23, 2007, 1:21:38 PM

In article <5aGDh.370320$MO2.3...@fe3.news.blueyonder.co.uk>,

David Hopwood <david.nosp...@blueyonder.co.uk> writes:
|>
|> Transactions would be useful even in a fully sequential context.

Would be? The database people would disagree - they ARE useful :-)

More seriously, I entirely agree. They could make a massive improvement
to language exception specifications, to mention just one aspect.


Regards,
Nick Maclaren.

Eric P.
Feb 23, 2007, 8:37:38 PM

Well, the only ones I have read about were various optimistic
flavors, and the honest authors acknowledge that their approach
can potentially livelock.

I was speculating beyond their acknowledgment, and wondering if it
is even possible to prove that an STM application cannot livelock.
In other words, put the onus on the STM developers to prove that it
cannot livelock or deadlock (as one can prove with locks).

Transaction composability and nesting mean operations are additive,
so that what previously might have been small, isolated atomic
operations now become members of larger enclosing units of
work which they do not necessarily know anything about.
These can collide with other threads that just want to do a
small transaction, but now in unexpected ways.

Optimistic scheduling (collision-undo-retry) obviously can thrash.
I have not seen the details on pessimistic scheduling
(blocking until you are sure the whole transaction will succeed)
nor given it much thought, but it seems it would have its problems.

I'm guessing you would either wind up with a single BigLock,
or it might indefinitely stall large transactions because small
operations keep jumping in and preventing forward progress of
larger transactions.

And reintroducing locks reinstates potential for deadlocks.
To avoid that would require a whole program scan to get the
order correct (which kinda works against modular development
or opaque DLL code).

> Transactions would be useful even in a fully sequential context.

That's called BigLock, and I don't need some fancy pants
mechanism for _that_ design.

Anyway, I didn't say the idea isn't useful, just that as currently
described it does not look like it will do what the proponents hope.
The original 1993 idea was hardware-based, using CAMs and swapping
cache lines. It might be worth revisiting the beginning in light
of vastly increased transistor counts over the last 15 years.

Eric

Bill Todd
Feb 23, 2007, 9:40:38 PM
Eric P. wrote:

...

> I'm guessing you would either wind up with a single BigLock,
> or it might indefinitely stall large transactions because small
> operations keep jumping in and preventing forward progress of
> larger transactions.

Not with anything resembling fair lock-queue management: when the large
transaction needs something it can't have immediately it doesn't just
retry (let alone start again from scratch) but *waits* for it, and no
small transaction that asks for it later can get in the way. Thus is
forward progress assured, though it can be fairly slow if the large
transaction has to wait for every resource it requests.
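
For concreteness, a minimal sketch of such fair queue management (a plain
ticket lock; my illustration, not code from any real transaction manager):
the resource is granted strictly in arrival order, so a large transaction
already waiting can never be overtaken by small latecomers.

#include <atomic>

class TicketLock {
    std::atomic<unsigned> next_ticket{0};
    std::atomic<unsigned> now_serving{0};
public:
    void lock() {
        unsigned my = next_ticket.fetch_add(1);  // take a place in line
        while (now_serving.load(std::memory_order_acquire) != my)
            ; // spin until it is our turn; later arrivals cannot jump ahead
    }
    void unlock() {
        now_serving.fetch_add(1, std::memory_order_release);
    }
};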

>
> And reintroducing locks reinstates potential for deadlocks.
> To avoid that would require a whole program scan to get the
> order correct (which kinda works against modular development
> or opaque DLL code).

Most real-world systems sufficiently complex to make full lock-ordering
difficult just accept the potential for deadlocks and use deadlock
detectors to break them (everything from full examination of the
wait-for graph down to simple time-outs). These too need some concept
of fairness, of course - one simple one that, again, guarantees forward
progress is 'oldest transaction wins' (with time-outs, this just means
that when the time-out interval expires the younger of the waiter and
the waitee gets scragged: since - according to Jim Gray, who will be
sorely missed - the vast majority of deadlocks involve only two
participants, this actually works fairly well, especially for the
typically brief and concise transactions that characterize traditional
transaction processing workloads where the *only* thing that's likely to
delay things very much is a deadlock).
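
As a sketch of 'oldest transaction wins' (this is the classic wait-die
policy; the names are invented for illustration): on a conflict, the older
transaction waits and the younger one is aborted, restarting with its
original timestamp so that it eventually becomes oldest and must win.

#include <cstdint>

struct Txn { std::uint64_t start_time; };  // lower timestamp = older

enum class Conflict { Wait, Abort };

// Wait-die rule: waits only ever point from older to younger, so a
// cycle (i.e., a deadlock) is impossible by construction.
Conflict on_conflict(const Txn& requester, const Txn& holder) {
    return requester.start_time < holder.start_time ? Conflict::Wait
                                                    : Conflict::Abort;
}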

>
>> Transactions would be useful even in a fully sequential context.
>
> That's called BigLock, and I don't need some fancy pants
> mechanism for _that_ design.

You do if you want automated atomicity across various potential failures
- something which (as was already noted in the statement to which you
responded) a lot of applications find quite useful.

- bill

Chris Thomasson
Feb 23, 2007, 11:47:16 PM
"Bill Todd" <bill...@metrocast.net> wrote in message
news:n_mdnZnaO5Y6OELY...@metrocastcablevision.com...

> Eric P. wrote:
>
> ...
>
>> I'm guessing you would either wind up with a single BigLock,
>> or it might indefinitely stall large transactions because small
>> operations keep jumping in and preventing forward progress of
>> larger transactions.
>
[...]

>> And reintroducing locks reinstates potential for deadlocks.
>> To avoid that would require a whole program scan to get the
>> order correct (which kinda works against modular development
>> or opaque DLL code).
>
> Most real-world systems sufficiently complex to make full lock-ordering
> difficult just accept the potential for deadlocks and use deadlock
> detectors to break them (everything from full examination of the wait-for
> graph down to simple time-outs).

Deadlock detectors? Humm. Well, the first thing I ask for when I am acting as
a consultant is a list of the people involved with creating the existing
synchronization scheme. I ask them whether they know how to use
locks to get the job done. If they say something like "Well, if I use
locks, won't I risk deadlock?", then the answer is "Okay, go to the book store
and purchase David Butenhof's book on POSIX threads, and please read the
whole book." Okay, now you know that it seems to be possible to outperform
STM with a coherent and "clever" locking scheme on nearly all fronts. It's the
damn explicit contention management schemes that are involved with any sort
of forward-progress guarantees. Think of STM under pressure. Well, okay, the
thing is going to thrash. Ahh, nice, the contention manager is now at fault.
It basically queues things up until the current contention-interval
state (e.g., total pending transactions, total failed
transactions, other stats) drops below some algorithmically set low-watermark.


Well, here is one for the STM guys: an STM implementation that is based on
locks is better than an STM implementation that is based on 100% lock-free
algorithms. That should improve some performance characteristics of certain
existing STM implementations.


Chris Thomasson
Feb 24, 2007, 12:12:41 AM
"Eric P." <eric_p...@sympaticoREMOVE.ca> wrote in message
news:45df981f$0$26697$834e...@reader.greatnowhere.com...
[...]

> It might be worth revisiting the beginning in light
> of vastly increased transistor counts over the last 15 years.

If you're talking about HTM... doesn't that mean extremely strong cache
coherency mechanisms? Too strong, in my humble opinion. I want weak cache
coherency, and explicit per-architecture memory model documentation. HTM
doesn't scale too well...


Nick Maclaren
Feb 24, 2007, 4:50:10 AM

In article <Nb6dndbbhZ_XXkLY...@comcast.com>,

"Chris Thomasson" <cri...@comcast.net> writes:
|>
|> Deadlock detectors? Humm. Well, the first thing I ask for when I am acting as
|> a consultant is a list of the people involved with creating the existing
|> synchronization scheme. I ask them whether they know how to use
|> locks to get the job done. If they say something like "Well, if I use
|> locks, won't I risk deadlock?", then the answer is "Okay, go to the book store
|> and purchase David Butenhof's book on POSIX threads, and please read the
|> whole book." Okay, now you know that it seems to be possible to outperform
|> STM with a coherent and "clever" locking scheme on nearly all fronts. ...

Well, exactly the same thing is true for memory management; you can
always outperform any generic system (ESPECIALLY any automatic garbage
collector) by doing it manually. And the same is true of optimisation,
and low-level scheduling, and ....

That, in itself, is merely an argument that STM is inferior to locks
for the most performance-critical purposes. I don't think that peak
performance has been claimed as an advantage by the serious STM
proponents, though converts are always more evangelical than their
gurus :-(

I can accept the STM claims that it eliminates the more straightforward
and common errors made using locks, but I wish that people pushing
solutions would have the intellectual honesty to describe the defects
and limitations of the techniques as clearly as they describe the
advantages. That is a generic statement, not specific to STM, of
course. For example:

http://research.microsoft.com/~simonpj/papers/stm/beautiful.pdf

does mention the problems with imperative languages but does not even
mention livelock or scalability. Its one reference to performance is
a (true) statement that a single global lock is dreadful.

Despite my remarks, I do agree with the STM people that it is a much
better design for software engineering. The STM approach has, after
all, been the basis for working parallel designs for many decades;
any experienced person knows that, to maintain even minimal sanity,
you decompose a problem into approximately atomic units in a sort of
hierarchical fashion. That is, after all, how you are recommended to
use locks if you want your program ever to work :-)


Regards,
Nick Maclaren.

Eric P.
Feb 24, 2007, 2:29:47 PM

I wasn't suggesting anything in particular. Sometimes it is
good to go back to the beginning and reevaluate, but now with
all the knowledge that has been accumulated along the way.

Since the major CPU houses all appear to see massive SMP
as the future, there may be more acceptance of hardware
modifications to support this than there was in the past.
So ideas that were previously dismissed may now be worthwhile.

For example, there might be some value in exploring better support
for wait-free algorithms, but without regard to the limitations of
current instruction sets.

The original proposal by Herlihy in 1993 was hardware-based
and STM came later in 1995. Since then there has also been a
lot of work done on wait-free and lock-free algorithms.
Perhaps CAMs and some control over the cache-line MOESI state bits,
combined with controlling software, might get around some of the
problems that STM alone cannot.

Eric


Eric P.
Feb 24, 2007, 2:50:41 PM
Bill Todd wrote:
>
> Eric P. wrote:
>
> ...
>
> > I'm guessing you would either wind up with a single BigLock,
> > or it might indefinitely stall large transactions because small
> > operations keep jumping in and preventing forward progress of
> > larger transactions.
>
> Not with anything resembling fair lock-queue management: when the large
> transaction needs something it can't have immediately it doesn't just
> retry (let alone start again from scratch) but *waits* for it, and no
> small transaction that asks for it later can get in the way. Thus is
> forward progress assured, though it can be fairly slow if the large
> transaction has to wait for every resource it requests.

Yes, I was unclear as to my meaning.
What I meant was that, extrapolating out what might happen,
it seems to me that you wind up with all transactions being
slowed due to the largest ones, which is effectively the
same as having a single BigLock. If you try to work around the
above problem by allowing small readers to bypass large writers,
which can be allowable in certain very controlled circumstances,
in general it can indefinitely stall writers, which is not acceptable.

Therefore system performance tends towards BigLock performance.

> > And reintroducing locks reinstates potential for deadlocks.
> > To avoid that would require a whole program scan to get the
> > order correct (which kinda works against modular development
> > or opaque DLL code).
>
> Most real-world systems sufficiently complex to make full lock-ordering
> difficult just accept the potential for deadlocks and use deadlock
> detectors to break them (everything from full examination of the
> wait-for graph down to simple time-outs). These too need some concept
> of fairness, of course - one simple one that, again, guarantees forward
> progress is 'oldest transaction wins' (with time-outs, this just means
> that when the time-out interval expires the younger of the waiter and
> the waitee gets scragged: since - according to Jim Gray, who will be
> sorely missed - the vast majority of deadlocks involve only two
> participants, this actually works fairly well, especially for the
> typically brief and concise transactions that characterize traditional
> transaction processing workloads where the *only* thing that's likely to
> delay things very much is a deadlock).

Yes, thanks, I know it is possible to do deadlock detection.
I was exploring how it might be avoided and thereby
retain the STM's original 'deadlock free' requirement.
I conclude that unless there is some other method than
what I mentioned, it is impractical.

> >> Transactions would be useful even in a fully sequential context.
> >
> > That's called BigLock, and I don't need some fancy pants
> > mechanism for _that_ design.
>
> You do it you want automated atomicity across various potential failures
> - something which (as was already noted in the statement to which you
> responded) a lot of applications find quite useful.

Ok, but they should update the marketing literature to reflect
the new limitations.

Eric

Bill Todd
Feb 24, 2007, 6:14:04 PM
Eric P. wrote:
> Bill Todd wrote:
>> Eric P. wrote:
>>
>> ...
>>
>>> I'm guessing you would either wind up with a single BigLock,
>>> or it might indefinitely stall large transactions because small
>>> operations keep jumping in and preventing forward progress of
>>> larger transactions.
>> Not with anything resembling fair lock-queue management: when the large
>> transaction needs something it can't have immediately it doesn't just
>> retry (let alone start again from scratch) but *waits* for it, and no
>> small transaction that asks for it later can get in the way. Thus is
>> forward progress assured, though it can be fairly slow if the large
>> transaction has to wait for every resource it requests.
>
> Yes, I was unclear as to my meaning.
> What I meant was that, extrapolating out what might happen,
> it seems to me that you wind up with all transactions being
> slowed due to the largest ones, which is effectively the
> same as having a single BigLock.

Not at all.

First, unless a transaction is so large that it locks a significant
percentage of the entire data set, most other transactions aren't
affected by it at all. As Gray observed, in most environments (even
those where transaction sizes can vary fairly considerably) lock
collisions themselves are rare, and deadlocks are (rare)^2.

Second, truly humongous transactions, while sometimes useful, are
sufficiently rare that special techniques can be used to avoid the kinds
of adverse interactions that you describe: operation on (static or
dynamically-materialized) data snapshots (this is almost always a good
solution for large read-only transactions, e.g., that generate database
statistics or decision-support information), subdivision into a sequence
of smaller transactions (many real-world full-table update operations
find this entirely acceptable), or exclusive access to the database
(preferably at some relatively quiet time of day) - equivalent to
BigLock, but very seldom necessary.

> If you try to work around the
> above problem by allowing small readers to bypass large writers,
> which can be allowable in certain very controlled circumstance,
> in general it can indefinitely stall writers which is not acceptable.
>
> Therefore system performance tends towards BigLock performance.

The decades of success of conventionally-locked databases strongly
suggest that you are mistaken in that conclusion.

>
>>> And reintroducing locks reinstates potential for deadlocks.
>>> To avoid that would require a whole program scan to get the
>>> order correct (which kinda works against modular development
>>> or opaque DLL code).
>> Most real-world systems sufficiently complex to make full lock-ordering
>> difficult just accept the potential for deadlocks and use deadlock
>> detectors to break them (everything from full examination of the
>> wait-for graph down to simple time-outs). These too need some concept
>> of fairness, of course - one simple one that, again, guarantees forward
>> progress is 'oldest transaction wins' (with time-outs, this just means
>> that when the time-out interval expires the younger of the waiter and
>> the waitee gets scragged: since - according to Jim Gray, who will be
>> sorely missed - the vast majority of deadlocks involve only two
>> participants, this actually works fairly well, especially for the
>> typically brief and concise transactions that characterize traditional
>> transaction processing workloads where the *only* thing that's likely to
>> delay things very much is a deadlock).
>
> Yes, thanks, I know it is possible to do deadlock detection.
> I was exploring how it might be avoided and thereby
> retain the STM's original 'deadlock free' requirement.

Systems that employ deadlock detection and resolution *are*, from an
external perspective, deadlock-free.

> I conclude that unless there is some other method than
> what I mentioned, it is impractical.
>
>>>> Transactions would be useful even in a fully sequential context.
>>> That's called BigLock, and I don't need some fancy pants
>>> mechanism for _that_ design.
>> You do it you want automated atomicity across various potential failures
>> - something which (as was already noted in the statement to which you
>> responded) a lot of applications find quite useful.
>
> Ok, but they should update the marketing literature to reflect
> the new limitations.

I have no idea what 'new limitations' you may be talking about, nor how
they would apply to your suggestion above that the *only* useful
difference between transactions and BigLock was parallelism (the
particular thrust of what was quoted above).

- bill

Chris Thomasson
Feb 24, 2007, 8:48:21 PM
"Eric P." <eric_p...@sympaticoREMOVE.ca> wrote in message
news:45e0981b$0$26700$834e...@reader.greatnowhere.com...

> Chris Thomasson wrote:
>>
>> "Eric P." <eric_p...@sympaticoREMOVE.ca> wrote in message
>> news:45df981f$0$26697$834e...@reader.greatnowhere.com...
>> [...]
>>
>> > It might be worth revisiting the beginning in light
>> > of vastly increased transistor counts over the last 15 years.
>>
>> If you talking about HTM... Doesn't that mean extremely strong cache
>> coherency mechanisms? Too strong in my humble opinion. I want weak cache
>> coherency, and explicit per-architecture memory model documentation. HTM
>> doesn't scale to good...
>
> I wasn't suggesting anything in particular. Sometimes it is
> good to go back to the beginning and reevaluate, but now with
> all the knowledge that has been accumulated along the way.

[...]

> For example there might be some value in exploring better support
> for wait free algorithms but without regard to limitations of
> current instruction sets.

Adding solid memory management support for all sorts of lock-free algorithms
at the hardware level can be as simple as this:

------------
The chip's cores are partitioned into groups. So, let's say a 1000-core
processor sets up 10 groups of 100 cores, fine. All it has to do now is
track when each core in a group executes something that is analogous to a
memory barrier strong enough to at least render its pending stores visible.
After the chip notices that each core in a group has recently executed a
"memory barrier", it then generates an interrupt with a single piece of
state, a group-id:

http://groups.google.com/group/comp.arch/msg/2a0f4163f8e13f1e
------------

The scheme can be used to create a highly efficient PDR engine for an
operating system, and all the way down to the user level...

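Here is a software analogue (a sketch of mine, not actual vZOOM code) of
the epoch detection that the per-group interrupt would give you for free:

#include <atomic>

constexpr int GROUP_SIZE = 4;  // stand-in for one core group
std::atomic<unsigned long> quiesce_count[GROUP_SIZE]; // zero-initialized

// Each thread calls this at its quiescent points -- the software
// analogue of the "core executed a strong-enough memory barrier"
// event the chip would track.
void quiescent(int tid) {
    quiesce_count[tid].fetch_add(1, std::memory_order_release);
}

// Reclaimer: snapshot all counters, then wait until every one has
// advanced. After that, no reader can still hold a reference from the
// old epoch, so retired nodes may safely be freed.
void wait_for_epoch() {
    unsigned long snap[GROUP_SIZE];
    for (int i = 0; i < GROUP_SIZE; ++i)
        snap[i] = quiesce_count[i].load(std::memory_order_acquire);
    for (int i = 0; i < GROUP_SIZE; ++i)
        while (quiesce_count[i].load(std::memory_order_acquire) == snap[i])
            ; // spin until thread i passes another quiescent point
}
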
Any thoughts?


Chris Thomasson
Feb 24, 2007, 8:54:29 PM
I think we should all take note of the fact that STM hype is being pushed on
so-called normal applications... Not huge database systems, just plain user
applications... I trust a database programmer to use STM in a "sensible"
fashion... I don't trust programmer Jane or Joe "Green" to be able to
understand that STM is very prone to livelock, or to have any ideas on how
they might be able to work around some of it...


Nick Maclaren
Feb 25, 2007, 3:52:02 AM

In article <8LadnfRojsHUcX3Y...@comcast.com>,

They can't use locks or message passing competently, either :-)

Yes. Any technique suitable for helping the "average programmer" to
take advantage of parallelism must be fool-resistant (nothing designed
by man can be fool-proof).


Regards,
Nick Maclaren.

David Hopwood
Feb 25, 2007, 1:05:54 PM
Chris Thomasson wrote:
> Adding solid memory management support for all sorts of lock-free algorithms
> at the hardware level can be as simple as this:
>
> ------------
> The chip's cores are partitioned into groups. So, lets say a 1000 core
> processor sets up 10 groups of 100 cores, fine. Alls it has to do now is
> track when each core in a group executes something that is analogous to a
> memory barrier strong enough to at least render its pending stores visible.
> After the chip notices that each core in a group has recently executed a
> "memory barrier", it then generates an interrupt with a single piece of
> state, a group-id:
>
> http://groups.google.com/group/comp.arch/msg/2a0f4163f8e13f1e
> ------------

I've no idea why that particular semantics would be useful. With current
memory models, it would result in far too many interrupts, only a small
fraction of which would be relevant.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

David Hopwood
Feb 25, 2007, 1:06:32 PM
Chris Thomasson wrote:
> I think we should all take note to the fact that STM hype is being pushed on
> so-called normal applications...

It's quite an overstatement to call the extent of promotion of STM "hype".
Save that for things that really are hyped, like Java, .NET, Vista, etc.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Nick Maclaren
Feb 25, 2007, 1:54:50 PM

In article <IekEh.17264$HO5....@fe1.news.blueyonder.co.uk>,

David Hopwood <david.nosp...@blueyonder.co.uk> writes:
|> Chris Thomasson wrote:
|> > I think we should all take note to the fact that STM hype is being pushed on
|> > so-called normal applications...
|>
|> It's quite an overstatement to call the extent of promotion of STM "hype".
|> Save that for things that really are hyped, like Java, .NET, Vista, etc.

Oh, I don't think so! There are degrees of overstatement, after all,
and the fact that some things indulged in hyperhype doesn't mean that
all overstatement of merit can justifiably be called hype :-)

I agree that STM is a model well worth pursuing, but it DOESN'T have
the advantages over locks that many of its proponents claim it does.


Regards,
Nick Maclaren.

Eric P.
Feb 25, 2007, 2:12:51 PM

The terms large and small are wrt serialized code sections,
and this directly affects potential parallel throughput.
And since transactions are defined as serialized, all you need
is a single common global variable and you are globally serialized.
It is trivial to construct an example of this, including on databases.

And it is no good to say "well you need to redesign your system to
avoid collision on global variables" because one of the major
TM selling points is that you are not supposed to know the details
and this operation might be buried deep in some opaque module.

Remember that the original requirements for TM/STM, and its
much-touted selling points, are that it is non-blocking, free from
deadlocks, priority inversion and convoys, concurrently scalable,
and that transactions appear serialized and atomic.
Later STM seems to have added nestable transactions.

So it is not claimed that it appear deadlock free from a distance,
but that it actually be deadlock free.

I am simply starting from their requirement.

> > I conclude that unless there is some other method than
> > what I mentioned, it is impractical.
> >
> >>>> Transactions would be useful even in a fully sequential context.
> >>> That's called BigLock, and I don't need some fancy pants
> >>> mechanism for _that_ design.
> >> You do it you want automated atomicity across various potential failures
> >> - something which (as was already noted in the statement to which you
> >> responded) a lot of applications find quite useful.
> >
> > Ok, but they should update the marketing literature to reflect
> > the new limitations.
>
> I have no idea what 'new limitations' you may be talking about, nor how
> they would apply to your suggestion above that the *only* useful
> difference between transactions and BigLock was parallelism (the
> particular thrust of what was quoted above).

When you revert to a fully sequential context, you lose non-blocking
and get back deadlocks, priority inversion and convoys, and limit
scalability.

That leaves nestable, composable, consistent transactions,
albeit serialized, for your comfort and safety.

Eric

David Hopwood
Feb 25, 2007, 3:49:04 PM
Eric P. wrote:
> And it is no good to say "well you need to redesign your system to
> avoid collision on global variables" because the one of the major
> TM selling points is that you are not supposed to know the details
> and this operation might be buried deep in some opaque module.

No, it is not a "TM selling point" that you're not supposed to have to
know the details of your program's implementation. It is a selling point
that the *correctness* of the program does not depend (to the same extent
as for lock-based concurrency) on the details. Its performance does.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Chris Thomasson
Feb 25, 2007, 5:49:20 PM
"David Hopwood" <david.nosp...@blueyonder.co.uk> wrote in message
news:6ekEh.17259$HO5....@fe1.news.blueyonder.co.uk...

> Chris Thomasson wrote:
>> Adding solid memory management support for all sorts of lock-free
>> algorithms
>> at the hardware level can be as simple as this:
>>
>> ------------
[...]

>> http://groups.google.com/group/comp.arch/msg/2a0f4163f8e13f1e
>> ------------
>
> I've no idea why that particular semantics would be useful.

You can use it as a highly efficient method for detecting
"synchronization-epochs" that the following PDR algorithms heavily depend
on:

- Read-Copy-Update (RCU)
- Read-Copy-Update with Safe Memory Reclamation (RCU+SMR)
- Virtually Zero-Overhead Object Management (vZOOM)

And any other PDR scheme you can come up with...


> With current
> memory models, it would result in far too many interrupts, only a small
> fraction of which would be relevant.

Well, if the chip could limit the interrupts to not more than one every
second or so, well, that would be perfect. Does an interrupt every second or
so per-core-group sound practical?


Bill Todd
Feb 25, 2007, 7:38:42 PM

Either you don't have a clue how transactions work, or I'm missing
something here. Transactions are *not* normally 'defined as
serialized'; they are defined as *serializable* - i.e., that there is
some serial execution order which, had they been performed that way,
would be absolutely equivalent to the actual, parallel execution.

>
> And it is no good to say "well you need to redesign your system to
> avoid collision on global variables" because the one of the major
> TM selling points is that you are not supposed to know the details
> and this operation might be buried deep in some opaque module.

You serialize on a global variable only when multiple accessors want to
access it at the same time in incompatible manners. With optimistic
concurrency control, both execute to completion and then one is rolled
back (i.e., all the work it performed is wasted); with pessimistic
concurrency control, as soon as the conflict occurs one either waits or
(in the case of deadlock) dies (only in the latter case wasting *any*
work), while the other proceeds.

...

>> Systems that employ deadlock detection and resolution *are*, from an
>> external perspective, deadlock-free.
>
> Remember that the original requirements for TM/STM, and its' much
> touted selling points, are that it is non-blocking, free from
> deadlocks, priority inversion and convoys, concurrent scalable,
> and that transactions appear serialized and atomic.
> Later STM seems to have added nestable transactions.
>
> So it is not claimed that it appear deadlock free from a distance,
> but that it actually be deadlock free.

If the STM mechanism includes deadlock detection and resolution, it *is*
deadlock free. Everything else is also achievable with conventional
lock-based transactions (and well-understood from their use in
databases); you could even make them non-blocking if you were prepared
to abort immediately on any conflict rather than wait to see if you
could ride it out, though whether this would be a good idea is debatable.

To be clear, I have little use for STM at all, optimistic or
pessimistic. But both approaches *are* feasible (though implementing
optimistic mechanisms in the hardware may be easier, and perhaps even
use sufficiently fewer transistors to perform better overall in the
absence of significant contention, than doing full-blown locks).

- bill

Jan Vorbrüggen
Feb 26, 2007, 3:39:33 AM
> Yes, thanks, I know it is possible to do deadlock detection.
> I was exploring how it might be avoided and thereby
> retain the STM's original 'deadlock free' requirement.

Even if that were true - I think it is a fragile property. I just read an
article on a carefully constructed real-time executive (the one that was the
basis of the "AI" component of the Deep Space 1 probe) where an application
programmer didn't play by the rules (inadvertently) and implemented a deadlock
that was only observed in real operation, not in hundreds of hours of testing
on a high-fidelity simulator. So I'd rather have belts and suspenders, please
- construct the system such that deadlocks _shouldn't_ occur, but provide
protection against them anyway.

Jan

Eric P.
Feb 26, 2007, 1:41:40 PM

The statements I made above are all correct, and db row update locks
do block and serialize updates to the same row. I don't know what you
are missing, as it is clear from what you write below that you are
saying the same thing.

It is a simple fact that thread locks (critical sections) guarding
variables that are common between transaction threads are acquired
and quickly released, thus affording potential concurrency.

Conversely those same common variables updated inside transactions
will have update locks acquired and held to the end of the
transaction. This would have the effect of essentially serializing
all code that attempts to update any of the same variables.

Since TP code can have lots of variables in common between the
various code sections, making those accesses STM-atomic will force
the updates, and therefore their transactions, to serialize.
There will be much less concurrency, possibly none.

A similar effect would occur trying to parallelize a transaction
processing system that stores a unique TranId in a single row
and assigns a new one to each change. All the parallel TP servers
would bottleneck on updating the unique TranId row and you get
effective serial throughput.

Thus my previous statements about BigLock, small and large duration
locks, and serialized throughput.
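
To make that concrete, a toy C++ sketch (a mutex standing in for the row's
update lock; not from any real TP system): because the update lock is held
to commit, the whole transaction sits inside the serial section.

#include <mutex>

std::mutex tranid_lock;  // stands in for the TranId row's update lock
long next_tranid = 0;

// Transactional (2PL-style) rule: an update lock, once acquired, is
// held until commit, so everything after the update serializes.
void run_transaction(void (*body)()) {
    std::lock_guard<std::mutex> held_to_commit(tranid_lock);
    ++next_tranid;  // the update every transaction makes
    body();         // all remaining work now runs behind the lock
}                   // lock released only at commit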

> > And it is no good to say "well you need to redesign your system to
> > avoid collision on global variables" because the one of the major
> > TM selling points is that you are not supposed to know the details
> > and this operation might be buried deep in some opaque module.
>
> You serialize on a global variable only when multiple accessors want to
> access it at the same time in incompatible manners. With optimistic
> concurrency control, both execute to completion and then one is rolled
> back (i.e., all the work it performed is wasted); with pessimistic
> concurrency control, as soon as the conflict occurs one either waits or
> (in the case of deadlock) dies (only in the latter case wasting *any*
> work), while the other proceeds.

This was never in doubt. And the net result in both cases,
when there is contention, is serial throughput.

The only question is its frequency of occurrence,
which depends on whether there are any common variables.
And I am estimating there are.

> >> Systems that employ deadlock detection and resolution *are*, from an
> >> external perspective, deadlock-free.
> >
> > Remember that the original requirements for TM/STM, and its' much
> > touted selling points, are that it is non-blocking, free from
> > deadlocks, priority inversion and convoys, concurrent scalable,
> > and that transactions appear serialized and atomic.
> > Later STM seems to have added nestable transactions.
> >
> > So it is not claimed that it appear deadlock free from a distance,
> > but that it actually be deadlock free.
>
> If the STM mechanism includes deadlock detection and resolution, it *is*
> deadlock free. Everything else is also achievable with conventional
> lock-based transactions (and well-understood from their use in
> databases); you could even make them non-blocking if you were prepared
> to abort immediately on any conflict rather than wait to see if you
> could ride it out, though whether this would be a good idea is debatable.

It should be noted that deadlock scans could be expensive if
there are many locks (isn't it something like N factorial?).
And being timer-triggered on some interval (say, 1 sec),
there can be a delay before breaking the deadlock.
(E.g., it would be nice if this did not occur
just as the plane was landing.)

> To be clear, I have little use for STM at all, optimistic or
> pessimistic. But both approaches *are* feasible (though implementing
> optimistic mechanisms in the hardware may be easier, and perhaps even
> use sufficiently fewer transistors to perform better overall in the
> absence of significant contention, than doing full-blown locks).
>
> - bill

Yep.

Eric

Bill Todd
Feb 26, 2007, 2:30:38 PM

No, they are not - though they're at least somewhat closer to being
correct now that you've clarified what you meant by "transactions are
defined as serialized" to be not a general statement about all
transactions but rather "transactions that update the same variable are
serialized".

However, the modified statement is still incorrect, because transactions
that update the same variable can still act in parallel just fine
*until* the point at which they update the variable, at which point only
those updates (and whatever the transactions do subsequent to that
point) execute serially. So the rather trivial way to let such
transactions execute almost completely in parallel is to make those
updates the last action the transaction takes.
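
A toy sketch of that trivial fix (mine, not from any real TP system):
moving the hot-spot update to the end shrinks the serial window from the
whole transaction down to the update itself.

#include <mutex>

std::mutex hot_lock;   // guards the hot shared variable
long hot_counter = 0;

void transaction(void (*body)()) {
    body();                                  // bulk of the work, fully parallel
    std::lock_guard<std::mutex> g(hot_lock); // serial window opens here...
    ++hot_counter;                           // ...covers only this update...
}                                            // ...and closes at commit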

Or to find other ways around the issue. The fact (as I've noted before)
that transaction processing has enjoyed remarkable commercial success
(while demonstrating high degrees of parallelism) over the past several
decades indicates that effective solutions to this alleged performance
bottleneck abound (which is not, of course, to say that the potential
problem does not exist at all, just that when it actually crops up there
are plenty of well-established ways around it).

If compute cycles were completely free (in both execution time and power
consumption), the fact that optimistic concurrency control executes
transactions to completion but then throws away the work of all but one
if any conflict is detected while pessimistic concurrency control
instead stalls incompatible accesses until they can safely proceed and
aborts only when actual *deadlock* occurs might be irrelevant (at least
as long as suitable mechanisms were in place to guarantee against
livelock and deadlock) - as it would also be if conflicts were
sufficiently rare that their overheads could be completely ignored.
That situation often does not obtain in the real world, however, so one
must choose between the drawbacks of the two approaches (both of which
sometimes require moderately competent programming to avoid performance
problems).

In database and related arenas optimistic concurrency control has been
actively studied and experimented with for over a quarter century, and
pessimistic concurrency control for even longer. So the near-complete
dominance of the latter is due to something more than just the lag in
percolating technology from academia to the commercial world. That the
same kinds of considerations would apply to a transactional memory
implementation does not necessarily follow automatically, but at least
some transfers of insight ought to be leverageable.

- bill

Chris Thomasson
Feb 26, 2007, 8:47:28 PM
"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message
news:errini$95u$1...@gemini.csx.cam.ac.uk...

>
> In article <8LadnfRojsHUcX3Y...@comcast.com>,
> "Chris Thomasson" <cri...@comcast.net> writes:
> |>
> |> I think we should all take note of the fact that STM hype is being
> |> pushed on so-called normal applications... Not huge database systems,
> |> just plain user applications... I trust a database programmer to use
> |> STM in a "sensible" fashion... I don't trust programmer Jane or Joe
> |> "Green" to be able to understand that STM is very prone to livelock,
> |> or to have any ideas on how they might be able to work around some of it...
>
> They can't use locks or message passing competently, either :-)

Good point! ;^)


> Yes. Any technique suitable for helping the "average programmer" to
> take advantage of parallelism must be fool-resistant (nothing designed
> by man can be fool-proof).

I agree that STM is fairly fool-resistant. In most implementations I've seen,
the transactions are allowed to simply restart when something seg-faults or
livelocks. They all seem to have signal/exception handlers set up for the
various errors that can and will happen if you don't know exactly what you're
doing during a transaction. Then they have the so-called contention manager
for the livelock. The perfect storm... STM has you covered. Fool-proof
indeed! If you do something stupid, the STM will work real hard to make
sure you don't know about it, even if it has to sacrifice its performance and
scalability on the altar of the contention manager in order to accommodate a
crappy livelock-laced transactional program.

The "Beautiful Concurrency" paper you linked to in an eairler post was
excellent... Your correct, no mention of livelock. I agree with you... I
would have much more respect for the STM cheerleaders if they pointed out
all of the caveats!

:O

Chris Thomasson
Feb 26, 2007, 8:59:07 PM
"Chris Thomasson" <cri...@comcast.net> wrote in message
news:ctGdndKheJL8j3_Y...@comcast.com...

> "David Hopwood" <david.nosp...@blueyonder.co.uk> wrote in message
> news:6ekEh.17259$HO5....@fe1.news.blueyonder.co.uk...
>> Chris Thomasson wrote:
[...]

>> With current
>> memory models, it would result in far too many interrupts, only a small
>> fraction of which would be relevant.
>
> Well, if the chip could limit the interrupts to not more than one every
> second or so, well, that would be perfect. Does an interrupt every second
> or so per-core-group sound practical?

I was wondering how difficult it would be to create a chip that had a
per-core-group synchronization-epoch interrupt table that fired every time
all the cores in a group execute something analogous to a #StoreStore SPARC
memory barrier. The interrupts would fire no more than once every
half-second or so. If the chip notices that a core group has a valid
synchronization-epoch, it then checks whether it should send the interrupt,
or just discard the operation and go back to whatever it was doing. This
would act as a sort of "throttle" on the number of interrupts that get
generated. How hard would it be to do that in the hardware?
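
In software terms, the throttle I have in mind would behave roughly like
this little model (purely illustrative; the class name and the half-second
figure are just my assumptions, not any real chip's behavior):

------------
#include <atomic>
#include <chrono>
#include <cstdio>

using Clock = std::chrono::steady_clock;

// One throttle per core group; request() returns true only if at least
// half a second has passed since the last delivered interrupt; otherwise
// the request is silently discarded.
class EpochThrottle {
    std::atomic<Clock::rep> last_fire{0};
public:
    bool request() {
        const auto min_gap =
            std::chrono::duration_cast<Clock::duration>(
                std::chrono::milliseconds(500)).count();
        auto now = Clock::now().time_since_epoch().count();
        auto prev = last_fire.load(std::memory_order_relaxed);
        if (now - prev < min_gap)
            return false;                  // too soon: discard
        // One requester wins the right to "fire"; the rest are discarded.
        return last_fire.compare_exchange_strong(prev, now,
                                                 std::memory_order_relaxed);
    }
};

int main() {
    EpochThrottle group0;
    int delivered = 0;
    for (int i = 0; i < 1000000; ++i)      // many epochs, few interrupts
        if (group0.request())
            ++delivered;
    std::printf("delivered %d interrupt(s)\n", delivered);
}
------------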


Chris Thomasson

unread,
Feb 26, 2007, 11:54:18 PM2/26/07
to
"David Hopwood" <david.nosp...@blueyonder.co.uk> wrote in message
news:6ekEh.17259$HO5....@fe1.news.blueyonder.co.uk...
> Chris Thomasson wrote:
>> Adding solid memory management support for all sorts of lock-free
>> algorithms
>> at the hardware level can be as simple as this:
>>
>> ------------
[...]

>> http://groups.google.com/group/comp.arch/msg/2a0f4163f8e13f1e
>> ------------
>
> I've no idea why that particular semantics would be useful.

You can implement the following scheme:

http://groups.google.com/group/comp.programming.threads/msg/e59c2dd1afb8e0a3

With a chip that allowed its users to be interrupted on per-core-group
sync-epochs... That scheme basically creates an environment where many
different kinds of lock-free algorithms are possible. The hardware
interrupts may sound trivial at first; however, they can be tracked in such
a way that reader threads can traverse large linked data-structures without
using any explicit memory-barrier calls - except, of course, if you're
running your application on an Alpha. Your reader threads have to use an
explicit #LoadLoad on an Alpha. However, there is a way of amortizing the
Alpha call to your advantage. Of course, when you iterate over a number of
fully mutable shared data-structures without memory barriers, well, that's
when you can get excellent performance and scalability characteristics...

:^)


I am all for using a clever locking pattern for the writer side of a
lock-free, memory-barrier-free reader pattern... That's how you can get very
good performance numbers. This stuff is very friendly to weakly
cache-coherent systems, RMO memory models, and various NUMA-based memory
models. NUMA systems don't like to execute interlocked RMW and/or
memory-barrier instructions!

:O


Eric P.

unread,
Feb 26, 2007, 11:54:46 PM2/26/07
to
Bill Todd wrote:
>
> Eric P. wrote:
> >
> > The statements I made above are all correct,
>
> No, they are not - though they're at least somewhat closer to being
> correct now that you've clarified what you meant by "transactions are
> defined as serialized" to be not a general statement about all
> transactions but rather "transactions that update the same variable are
> serialized".

At the level of detail of those high-level discussions, the kinds
of people I am accustomed to dealing with are presumed to know about
details like 'updates that collide are serialized', so it is skipped.

> However, the modified statement is still incorrect, because transactions
> that update the same variable can still act in parallel just fine
> *until* the point at which they update the variable, at which point only
> those updates (and whatever the transactions do subsequent to that
> point) execute serially.

This is just silly. Most people understand that update locks apply
from the point of update and don't need it spelled out either.

> So the rather trivial way to let such
> transactions execute almost completely in parallel is to make those
> updates the last action the transaction takes.

That was just a trivial example to illustrate a point which was
provided in answer to your claim to have missed the point.
Most people would have understood that too.

> < snip long winded monolog >

Eric

Bill Todd

unread,
Feb 27, 2007, 2:48:02 AM2/27/07
to
Eric P. wrote:
> Bill Todd wrote:
>> Eric P. wrote:
>>> The statements I made above are all correct,
>> No, they are not - though they're at least somewhat closer to being
>> correct now that you're clarified what you meant by "transactions are
>> defined as serialized" to be not a general statement about all
>> transactions but rather "transactions that update the same variable are
>> serialized".
>
> At the level of detail of those high level discussions, the kinds
> of people I am accustomed to dealing with are presumed to know about
> details like 'updates that collide are serialized'

Of course I knew about it: that's why I corrected your statement. The
kinds of people *I'm* accustomed to dealing with are considerably less
sloppy in their wording - at least if they're competent and actually
know what they're talking about.

> so it is skipped.
>
>> However, the modified statement is still incorrect, because transactions
>> that update the same variable can still act in parallel just fine
>> *until* the point at which they update the variable, at which point only
>> those updates (and whatever the transactions do subsequent to that
>> point) execute serially.
>
> This is just silly. Most people understand that update locks apply
> from the point of update and don't need it spelled out either.

And most people who do understand this manage not to make incorrect (or,
again, perhaps just sloppy) statements that strongly suggest that they
don't.

>
>> So the rather trivial way to let such
>> transactions execute almost completely in parallel is to make those
>> updates the last action the transaction takes.
>
> That was just a trivial example to illustrate a point which was
> provided in answer to your claim to have missed the point.

Strike three, I'm afraid: either you hadn't a clue what you were
talking about and are now attempting to back-pedal to cover it up, or
you *really* need to tighten up your discussion skills.

- bill

Eric P.

unread,
Feb 27, 2007, 1:51:59 PM2/27/07
to
Bill Todd wrote:
>
> Eric P. wrote:
> > Bill Todd wrote:
> >> Eric P. wrote:
> >>> The statements I made above are all correct,
> >> No, they are not - though they're at least somewhat closer to being
> >> correct now that you're clarified what you meant by "transactions are
> >> defined as serialized" to be not a general statement about all
> >> transactions but rather "transactions that update the same variable are
> >> serialized".
> >
> > At the level of detail of those high level discussions, the kinds
> > of people I am accustomed to dealing with are presumed to know about
> > details like 'updates that collide are serialized'
>
> Of course I knew about it: that's why I corrected your statement. The
> kinds of people *I'm* accustomed to dealing with are considerably less
> sloppy in their wording - at least if they're competent and actually
> know what they're talking about.

You corrected nothing. You made a speech about database concurrency
theory which, while perhaps impressive to junior programmers,
is only one well-known consideration in real TP systems design.
Anyone who has ever done a fail-safe hand-off of financial transactions
to MQ would know that. And that is not even the tough stuff.

> > so it is skipped.
> >
> >> However, the modified statement is still incorrect, because transactions
> >> that update the same variable can still act in parallel just fine
> >> *until* the point at which they update the variable, at which point only
> >> those updates (and whatever the transactions do subsequent to that
> >> point) execute serially.
> >
> > This is just silly. Most people understand that update locks apply
> > from the point of update and don't need it spelled out either.
>
> And most people who do understand this manage not to make incorrect (or,
> again, perhaps just sloppy) statements that strongly suggest that they
> don't.

Your predilection for obvious, well-known, low-level minutiae does
not constitute correction. It is most usually associated with
limited patterns of thought.

> >> So the rather trivial way to let such
> >> transactions execute almost completely in parallel is to make those
> >> updates the last action the transaction takes.

Business TP systems can consist of hundreds of tables and
many hundreds of thousands of lines of code implementing complex
business rules often spread across multiple processors
with multiple database vendors.

Transactions are often complex, possibly nested, requiring inputs from
many systems, often over proprietary communications links, and the
distribution of results to multiple destinations in a fail-safe manner.

If the data a TP needs to proceed must be pulled from some other
system, and the messaging system requires a unique message id,
those STM locking rules are going to stall you.
And that is just one of many interfaces.

Your assumption that these can be trivially rearranged
is laughably uninformed.

> > That was just a trivial example to illustrate a point which was
> > provided in answer to your claim to have missed the point.
>
> Strike three, I'm afraid: either you hadn't a clue what you were
> talking about and are now attempting to back-pedal to cover it up, or
> you *really* need to tighten up your discussion skills.

In your dreams. I back-pedaled nothing.
I have simply ignored your rude remarks and corrected your attempts
to use petty nit-picking to misrepresent my statements as incorrect.

Eric

Bill Todd

unread,
Feb 27, 2007, 3:37:14 PM2/27/07
to
Since you appear determined to try to bluster your way out of this, I
guess I'll just have to drag you back to your original incompetent
statements that began this interchange:

Eric P. wrote:

...

> I'm guessing you would either wind up with a single BigLock,
> or it might indefinitely stall large transactions because small
> operations keep jumping in and preventing forward progress of
> larger transactions.

Incompetent statement #1: equating pessimistic scheduling with 'a
single BigLock' (modified with some fuzzy thinking about how small and
large transactions might interact which no competently-designed system
would allow).

A single BigLock forces *strictly serial transaction execution*, whereas
pessimistically-scheduled transactions offer varying degrees of
parallelism (in fact sufficiently often a *great deal* of parallelism to
make the significant effort of allowing them to execute concurrently
eminently worthwhile). Later, you attempted to waffle out of this by
saying, "Most people understand that update locks apply from the point
of update" - but in the early discussion here and a bit later, for example,

"you wind up with all transactions being slowed due to the largest ones,

which is effectively the same as having a single BigLock"; no - that's
misunderstanding Amdahl's Law to state that no parallelism is possible
at all, rather than just that *complete* parallelism is usually impossible

and

"transactions are defined as serialized, all you need is a single common

global variable and you are globally serialized"; note the sloppy, or
incompetent - I don't much care which - omission of the need for
*conflict* on that variable here

and

"All the parallel TP servers would bottleneck on updating the unique

TranId row and you get effective serial throughput"; same
misunderstanding of Amdahl's Law applies here

the persistence with which you return to this misunderstanding suggests
rather strongly that either your grasp of the subject or your ability to
reason about it is significantly flawed.

>
> And reintroducing locks reinstates potential for deadlocks.
> To avoid that would require a whole program scan to get the
> order correct (which kinda works against modular development
> or opaque DLL code).

Incompetent statement #2, suggesting complete ignorance of deadlock
management (as distinct from avoidance, such as careful lock-ordering
can achieve). Kind of suggests that you might be an academic: they
sometimes tend to lose track of the fact that the more elegant solutions
aren't the *only* important ones. Your difficulty understanding later
that deadlock detection and resolution could create a deadlock-free
system adds to that impression.

>
>> Transactions would be useful even in a fully sequential context.
>
> That's called BigLock, and I don't need some fancy pants
> mechanism for _that_ design.

Incompetent statement #3: as I observed earlier, the transparent
atomicity which transactions confer has major utility which BigLock
totally lacks.

And that's just in the *first* of your posts to which I responded, but
it's not worth the time to cover the rest in similar detail.

You know, you really should have just stuck with one of your own
statements earlier in that post:

"I have not seen the details on pessimistic scheduling (blocking until
you are sure the whole transaction will succeed) nor given it much thought."

*That* statement was quite correct.

- bill

Chris Thomasson

unread,
Feb 27, 2007, 5:20:44 PM2/27/07
to
"Bill Todd" <bill...@metrocast.net> wrote in message
news:bu2dnaeWP9LmC3nY...@metrocastcablevision.com...

> Since you appear determined to try to bluster your way out of this, I
> guess I'll just have to drag you back to your original incompetent
> statements that began this interchange:
>
> Eric P. wrote:
>
> ...
>
>> I'm guessing you would either wind up with a single BigLock,
>> or it might indefinitely stall large transactions because small
>> operations keep jumping in and preventing forward progress of
>> larger transactions.
>
> Incompetent statement #1: equating pessimistic scheduling with 'a single
> BigLock' (modified with some fuzzy thinking about how small and large
> transactions might interact which no competently-designed system would
> allow).

If you're careful, and have control of the scheduler and the contention
manager, I guess you can orchestrate things so that large transactions are
essentially segregated from small so-called "quick" transactions. You can
get into some of the "livelock" issues here, like Eric P. was describing,
if you're simply not careful when you're implementing the transaction
manager and its contention managers.


> A single BigLock forces *strictly serial transaction execution*,

Yup. Well, you can have an implementation that holds an array of prioritized
transaction "slots", each protected by its own BigLock. So, transaction 1 in
Slot A can usually run concurrently with transaction 2 in Slot X. A writer
transaction can hash the pointer to an integer and then use that as an index
into the transaction slot array. You then define a simple locking order for
the transaction slots and force writer threads to take the locks in order.
> whereas pessimistically-scheduled transactions offer varying degrees of
> parallelism (in fact sufficiently often a *great deal* of parallelism to
> make the significant effort of allowing them to execute concurrently
> eminently worthwhile). Later, you attempted to waffle out of this by
> saying, "Most people understand that update locks apply from the point of
> update" - but in the early discussion here and a bit later, for example,
>
> "you wind up with all transactions being slowed due to the largest ones,
> which is effectively the same as having a single BigLock"; no - that's
> misunderstanding Amdahl's Law to state that no parallelism is possible at
> all, rather than just that *complete* parallelism is usually impossible

Well, if you really think about it, transactions essentially turn the
"BigLock" into a "Big Load-Linked/Store-Conditional Reserved" instruction.


>
> and
>
> "transactions are defined as serialized, all you need is a single common
> global variable and you are globally serialized"; note the sloppy, or
> incompetent - I don't much care which - omission of the need for
> *conflict* on that variable here
>
> and
>
> "All the parallel TP servers would bottleneck on updating the unique
> TranId row and you get effective serial throughput"; same misunderstanding
> of Amdahl's Law applies here
>
> the persistence with which you return to this misunderstanding suggests
> rather strongly that either your grasp of the subject or your ability to
> reason about it is significantly flawed.
>
>>
>> And reintroducing locks reinstates potential for deadlocks.
>> To avoid that would require a whole program scan to get the
>> order correct (which kinda works against modular development
>> or opaque DLL code).
>
> Incompetent statement #2, suggesting complete ignorance of deadlock
> management (as distinct from avoidance, such as careful lock-ordering can
> achieve).

Yup. Here is a crude and very basic pseudo-code sketch of how a locking
order can avoid deadlock in lock-based software transactional memory on the
fly:

http://groups.google.com/group/comp.programming.threads/msg/942b0b2e3e35f665

http://groups.google.com/group/comp.programming.threads/msg/8bb88a54fae9c445

http://groups.google.com/group/comp.programming.threads/msg/b0ad8f9462b19864
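
For anyone who doesn't want to chase the links, here is a minimal sketch of
the locking-order part, continuing the slot scheme above (my illustration,
not necessarily the exact code behind those links): collect the slot
indices a transaction needs, sort and de-duplicate them, and acquire
strictly in ascending order, so that no waits-for cycle - and hence no
deadlock - can ever form:

------------
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <vector>

extern std::mutex slot_lock[];            // from the slot sketch above
std::size_t slot_of(const void* obj);     // ditto

template <typename F>
void transact(const std::vector<const void*>& objects, F body) {
    std::vector<std::size_t> slots;
    for (const void* obj : objects)
        slots.push_back(slot_of(obj));
    std::sort(slots.begin(), slots.end());   // the global locking order
    slots.erase(std::unique(slots.begin(), slots.end()), slots.end());

    std::vector<std::unique_lock<std::mutex>> held;
    for (std::size_t s : slots)
        held.emplace_back(slot_lock[s]);     // acquire strictly in order
    body();                                  // the "transaction" runs here
}   // all slot locks drop when 'held' goes out of scope
------------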

Chris Thomasson

unread,
Feb 27, 2007, 5:24:24 PM2/27/07
to
[...]
Left one part out!

> Yup. Well, you can have an implementation that holds an array of
> prioritized transaction "slots", each protected by its own BigLock. So,
> transaction 1 in Slot A can usually run concurrently with transaction 2
> in Slot X. A writer transaction can hash the pointer to an integer and
> then use that as an index into the transaction slot array.

^^^^^^^^^^^

That last sentence should read:

A writer transaction can hash a "pointer to a shared data-structure it
wishes to modify" to some integer value and then use that as an index into
the transaction slot array.

> You then define a simple locking order for the transaction slots and force
> writer threads to take the locks in order.

[...]


Bill Todd

unread,
Feb 27, 2007, 7:54:55 PM2/27/07
to
Chris Thomasson wrote:

...

>> A single BigLock forces *strictly serial transaction execution*,
>
> Yup. Well, you can have an implementation that holds an array of prioritized
> transaction "slots", each protected by its own BigLock.

What part of the difference between 'a single BigLock' and 'a bunch of
partitioned medium-sized locks' is managing to escape you?

...

> Well, if you really think about it, transactions essentially turn the
> "BigLock" into a "Big Load-Linked/Store-Conditional Reserved" instruction.

I think you're confusing the subject under discussion here (pessimistic
lock-based concurrency control) with the optimistic mechanism.

- bill

Bill Todd

unread,
Feb 27, 2007, 7:58:23 PM2/27/07
to
Chris Thomasson wrote:

...

>> You then define a simple locking order for the transaction slots and force
>> writer threads to take the locks in order.

Sounds suspiciously similar to what timestamp-ordered concurrency
control achieves with somewhat less mechanism and potentially greater
fairness (not to mention interesting extensions in cases where
transactions - e.g., read-only - can be allowed to operate upon stale,
though consistent, data).
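
For concreteness, basic timestamp ordering amounts to something like the
following minimal sketch (single-threaded and purely illustrative, with the
Thomas write rule included):

------------
#include <algorithm>
#include <cstdint>

struct Item {
    std::uint64_t read_ts = 0;    // youngest transaction that read the item
    std::uint64_t write_ts = 0;   // youngest transaction that wrote it
    long value = 0;
};

// Each returns false to mean "abort and retry with a fresh, larger
// timestamp" - the scheme's stand-in for lock-induced waiting.
bool to_read(Item& x, std::uint64_t ts, long& out) {
    if (ts < x.write_ts)
        return false;             // a younger transaction already wrote it
    x.read_ts = std::max(x.read_ts, ts);
    out = x.value;
    return true;
}

bool to_write(Item& x, std::uint64_t ts, long v) {
    if (ts < x.read_ts)
        return false;             // a younger transaction already read it
    if (ts < x.write_ts)
        return true;              // Thomas write rule: drop the stale write
    x.write_ts = ts;
    x.value = v;
    return true;
}
------------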

- bill

Chris Thomasson

unread,
Feb 28, 2007, 1:05:29 AM2/28/07
to
"Bill Todd" <bill...@metrocast.net> wrote in message
news:7tidnV3v-cMyTnnY...@metrocastcablevision.com...

> Chris Thomasson wrote:
>
> ...
>
>>> You then define a simple locking order for the transaction slots and
>>> force writer threads to take the locks in order.
>
> Sounds suspiciously similar to what timestamp-ordered concurrency control
> achieves with somewhat less mechanism and potentially greater fairness
> (not to mention interesting extensions in cases where transactions -

I agree with you that the lock-based transaction slots scheme would probably
be a bit simpler and fairer than timestamp-ordered concurrency control.

> e.g., read-only - can be allowed to operate upon stale, though consistent,
> data).

Indeed!

http://groups.google.com/group/comp.programming.threads/msg/fdc665e616176dce
(stale reads addressed in the last two paragraphs... Is that what you were
getting at?)

Eric Pattison

unread,
Feb 28, 2007, 4:31:50 PM2/28/07
to
Bill Todd wrote:
>
> Since you appear determined to try to bluster your way out of this, I
> guess I'll just have to drag you back to your original incompetent
> statements that began this interchange:
>
> Eric P. wrote:
>
> ...
>
> > I'm guessing you would either wind up with a single BigLock,
> > or it might indefinitely stall large transactions because small
> > operations keep jumping in and preventing forward progress of
> > larger transactions.
>
> Incompetent statement #1: equating pessimistic scheduling with 'a
> single BigLock' (modified with some fuzzy thinking about how small and
> large transactions might interact which no competently-designed system
> would allow).

I clarified the meaning of the terms large and small, and their
effect on concurrent execution, to you before.
You are of the opinion that there is plenty of potential parallelism
available in all other operations; I am of the opposite opinion.
There is no point in revisiting this.

Reordering transactions to achieve greater concurrency is a
perfectly valid technique that can increase average throughput.
It is quite appropriate to explore the possibility
particularly during a high level design discussion.

Your characterization of reordering as something which
"no competently-designed system would allow" is simply uninformed.

> A single BigLock forces *strictly serial transaction execution*, whereas
> pessimistically-scheduled transactions offer varying degrees of
> parallelism (in fact sufficiently often a *great deal* of parallelism to
> make the significant effort of allowing them to execute concurrently
> eminently worthwhile). Later, you attempted to waffle out of this by
> saying, "Most people understand that update locks apply from the point
> of update" - but in the early discussion here and a bit later, for example,

Well, most people do, and they don't need this spelled out for them.
Clarifying this to you is not a waffle.
Apparently that wasn't obvious to you either.

> "you wind up with all transactions being slowed due to the largest ones,
> which is effectively the same as having a single BigLock"; no - that's
> misunderstanding Amdahl's Law to state that no parallelism is possible
> at all, rather than just that *complete* parallelism is usually impossible
>
> and
>
> "transactions are defined as serialized, all you need is a single common
> global variable and you are globally serialized"; note the sloppy, or
> incompetent - I don't much care which - omission of the need for
> *conflict* on that variable here

Oh, so now you acknowledge that this will happen when
there is a *conflict*.
And you needed that spelled out for you??? Amazing!
And you dare to call _anyone_ incompetent.

> and
>
> "All the parallel TP servers would bottleneck on updating the unique
> TranId row and you get effective serial throughput"; same
> misunderstanding of Amdahl's Law applies here
>
> the persistence with which you return to this misunderstanding suggests
> rather strongly that either your grasp of the subject or your ability to
> reason about it is significantly flawed.

In regard to all of your BigLock comments, try reading:

Ali-Reza Adl-Tabatabai (Intel), Christos Kozyrakis (Stanford University),
and Bratin Saha (Intel), "Unlocking Concurrency: Multicore Programming
with Transactional Memory", ACM Queue, December/January 2006-2007, p. 25.

On page 27: "It can allow concurrent read and write
operations to different variables."

On page 29: "Under pessimistic conflict detection, the system checks
for conflicts progressively as transactions read and write data.
Conflicts are detected early and can be handled either by stalling
one of the transactions in place or by aborting one transaction and
retrying it later."

As I pointed out repeatedly, the global variables accessed by
transactions in TP systems will, in my experience, get write
conflicts. Under the above mechanism, the result will be the
aforementioned stall and little concurrency.
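
Spelled out, that eager scheme boils down to something like the following
sketch (names made up for illustration, not the article's code): each
transactional word carries an owner that must be claimed before writing,
so a hot global word admits essentially one writer at a time:

------------
#include <atomic>
#include <cstdint>

constexpr std::uint32_t kUnowned = 0;

struct TWord {
    std::atomic<std::uint32_t> owner{kUnowned};   // owning transaction id
    long value = 0;
};

enum class Access { Ok, Conflict };

// Pessimistic (eager) write-conflict check: claim ownership up front, or
// report a conflict for the contention manager to stall/abort/retry on.
Access open_for_write(TWord& w, std::uint32_t txn_id) {
    std::uint32_t expected = kUnowned;
    if (w.owner.compare_exchange_strong(expected, txn_id))
        return Access::Ok;                        // ours until commit/abort
    return expected == txn_id ? Access::Ok        // we already own it
                              : Access::Conflict; // stall or abort here
}

void release_on_commit(TWord& w, std::uint32_t txn_id) {
    std::uint32_t expected = txn_id;
    w.owner.compare_exchange_strong(expected, kUnowned);
}
------------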

There are some proposals that support multi-version object updates
that would forward update values from transaction to transaction.
Unfortunately those too would stall as soon as a value is exported
to the outside world, and that happens a lot in TP code
for database accesses or inter-process messaging.
So once again it looks as though there is little
opportunity for concurrent execution.

> > And reintroducing locks reinstates potential for deadlocks.
> > To avoid that would require a whole program scan to get the
> > order correct (which kinda works against modular development
> > or opaque DLL code).
>
> Incompetent statement #2, suggesting complete ignorance of deadlock
> management (as distinct from avoidance, such as careful lock-ordering
> can achieve). Kind of suggests that you might be an academic: they
> sometimes tend to lose track of the fact that the more elegant solutions
> aren't the *only* important ones. Your difficulty understanding later
> that deadlock detection and resolution could create a deadlock-free
> system adds to that impression.

As I told you before, the deadlock-free requirement comes from the STM
authors' requirements documents. And no, that is not the same as
deadlock detect-and-break, and I explained the difference before.

You apparently don't understand plain English now.

> >> Transactions would be useful even in a fully sequential context.
> >
> > That's called BigLock, and I don't need some fancy pants
> > mechanism for _that_ design.
>
> Incompetent statement #3: as I observed earlier, the transparent
> atomicity which transactions confer has major utility which BigLock
> totally lacks.

Uhm, the topic in that message was transaction scheduling, and
a fully sequential context has the same concurrency as BigLock.
That statement does not detract from other benefits TM may accrue.
Apparently you didn't understand that either.

This is the last time I am going to post on this topic.
I don't care what you say after this.
Feel free to rant to yourself.

Eric

Bill Todd

unread,
Mar 1, 2007, 8:02:57 AM3/1/07
to

Not at all: your suggestion (still right up there above in plain
English) that any competent designer would allow small transactions to
stall large transactions *indefinitely* is simply - well, incompetent
(yet again).

However, you're neither interesting nor important enough to waste any
more time on: I do feel some personal responsibility to confront
rubbish when I trip over it, but have now done so sufficiently for
anyone paying attention to have formed their own opinion on the matter.

- bill

Bill Todd

unread,
Mar 1, 2007, 8:15:33 AM3/1/07
to
Chris Thomasson wrote:
> "Bill Todd" <bill...@metrocast.net> wrote in message
> news:7tidnV3v-cMyTnnY...@metrocastcablevision.com...
>> Chris Thomasson wrote:
>>
>> ...
>>
>>>> You then define a simple locking order for the transaction slots and
>>>> force writer threads to take the locks in order.
>> Sounds suspiciously similar to what timestamp-ordered concurrency control
>> achieves with somewhat less mechanism and potentially greater fairness
>> (not to mention interesting extensions in cases where transactions -
>
> I agree with you that the lock-based transaction slots scheme would probably
> be a bit simple and fairer than timestamp-ordered concurrency.

No, you don't - since I was suggesting the exact opposite.

>
>> e.g., read-only - can be allowed to operate upon stale, though consistent,
>> data).
>
> Indeed!
>
> http://groups.google.com/group/comp.programming.threads/msg/fdc665e616176dce
> (stale reads addressed in the last two paragraphs... Is that what you were
> getting at?)

No: my reference was specifically to *consistent* stale data, not just
any stale data (though in fact following the link in the material you
refer to above suggests that you were talking about *avoiding* stale
data - via a retry mechanism - rather than using it).

- bill

Chris Thomasson

unread,
Mar 1, 2007, 4:37:43 PM3/1/07
to

"Bill Todd" <bill...@metrocast.net> wrote in message
news:TaGdnfx6mNlrTHvY...@metrocastcablevision.com...

> Chris Thomasson wrote:
>> "Bill Todd" <bill...@metrocast.net> wrote in message
>> news:7tidnV3v-cMyTnnY...@metrocastcablevision.com...
>>> Chris Thomasson wrote:
[...]

>> I agree with you that the lock-based transaction slots scheme would
>> probably be a bit simple and fairer than timestamp-ordered concurrency.
>
> No, you don't - since I was suggesting the exact opposite.

OOPS! Sorry about that. ;^(...

