
When is "volatile" used instead of "lock" ?


Samuel R. Neff

May 21, 2007, 10:35:06 AM

When is it appropriate to use the "volatile" keyword? The docs simply
state:

"
The volatile modifier is usually used for a field that is accessed by
multiple threads without using the lock statement to serialize access.
"

But when is it better to use "volatile" instead of "lock"?

Thanks,

Sam

------------------------------------------------------------
We're hiring! B-Line Medical is seeking .NET
Developers for exciting positions in medical product
development in MD/DC. Work with a variety of technologies
in a relaxed team environment. See ads on Dice.com.


ben.bid...@gmail.com

May 21, 2007, 11:28:20 AM

You can also use the System.Threading.Interlocked class, which maintains
volatile semantics.

See also: http://www.albahari.com/threading/part4.html
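
For example, a shared counter could go through Interlocked for both its
writes and its reads (a minimal sketch; the class and member names are
mine):

using System.Threading;

class HitCounter
{
    private int hits;   // no "volatile" needed if every access uses Interlocked

    public void Record()
    {
        // Atomic read-modify-write; also acts as a full memory barrier.
        Interlocked.Increment(ref hits);
    }

    public int Total()
    {
        // Adding 0 is an atomic read through the same API
        // (Interlocked.Read only exists for 64-bit values).
        return Interlocked.Add(ref hits, 0);
    }
}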

Christof Nordiek

May 21, 2007, 11:58:05 AM
"Samuel R. Neff" <samue...@nomail.com> schrieb im Newsbeitrag
news:ecb3531i9o3ekhp7t...@4ax.com...

>
> When is it appropriate to use the "volatile" keyword? The docs simply
> state:
>
> "
> The volatile modifier is usually used for a field that is accessed by
> multiple threads without using the lock statement to serialize access.
> "
>

For a volatile field, reordering of memory accesses by the optimizer is
restricted.
A write to a volatile field is always done after all other memory accesses
which precede it in the instruction sequence.
A read from a volatile field is always done before all other memory accesses
which occur after it in the instruction sequence.

A volatile field is a simple way to flag that memory manipulations are done.

Following is an example from the specs:

using System;
using System.Threading;

class Test
{
    public static int result;
    public static volatile bool finished;

    static void Thread2() {
        result = 143;
        finished = true;
    }

    static void Main() {
        finished = false;
        // Run Thread2() in a new thread
        new Thread(new ThreadStart(Thread2)).Start();
        // Wait for Thread2 to signal that it has a result by setting
        // finished to true.
        for (;;) {
            if (finished) {
                Console.WriteLine("result = {0}", result);
                return;
            }
        }
    }
}

Since finished is volatile, in method Thread2 the write to result will
always occur before the write to finished, and in method Main the read from
finished will always occur before the read from result, so the read from
result in Main can't occur before the write in Thread2.

HTH

Christof


james....@gmail.com

May 21, 2007, 12:05:47 PM
On May 21, 10:35 am, Samuel R. Neff <samueln...@nomail.com> wrote:
> When is it appropriate to use the "volatile" keyword? The docs simply
> state:

Often, if just one thread is writing to the object (and other
threads are just reading it), you can get away with using just volatile.

Generally, the shared object would need to be an atomic value, so the
reader may see it suddenly change from state A to state B, but would
never see it halfway between A and B.
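
For instance (a minimal sketch with a single writer thread; the names are
mine):

class Publisher
{
    // One thread writes this reference, any number of threads read it.
    // Reference assignment is atomic, so a reader sees either the old
    // or the new string, never a torn value; volatile keeps reads fresh.
    private volatile string latestMessage = "";

    public void Update(string message)   // called only from the writer thread
    {
        latestMessage = message;
    }

    public string Read()                 // safe from any reader thread
    {
        return latestMessage;
    }
}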

Jon Skeet [C# MVP]

May 21, 2007, 1:05:34 PM
ben.bid...@gmail.com <ben.bid...@gmail.com> wrote:
> You can also use the System.Threading.Interlocked class, which maintains
> volatile semantics.
>
> See also: http://www.albahari.com/threading/part4.html

But only if you use it for both the writing *and* the reading, which
isn't terribly obvious from the docs.
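
For example (a minimal sketch; the type and member names are mine):

using System.Threading;

class Flag
{
    private int state;

    // Writer: the Interlocked write has release semantics.
    public void Set()
    {
        Interlocked.Exchange(ref state, 1);
    }

    // Reader: a plain "state != 0" read could be cached by the JIT; a
    // no-op CompareExchange performs the read with acquire semantics.
    public bool IsSet()
    {
        return Interlocked.CompareExchange(ref state, 0, 0) != 0;
    }
}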

--
Jon Skeet - <sk...@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Brian Gideon

May 21, 2007, 4:09:56 PM

One other important behavior demonstrated in your example is
that it guarantees that writes to finished are seen by other
threads. That prevents the infinite loop in Main().

Brian

Chris Mullins [MVP]

May 21, 2007, 5:49:52 PM
"Samuel R. Neff" <samue...@nomail.com> wrote:
> When is it appropriate to use the "volatile" keyword? The docs simply
> state:
> "The volatile modifier is usually used for a field that is accessed by
> multiple threads without using the lock statement to serialize access."
>
> But when is it better to use "volatile" instead of "lock"?

I would recommend using locks and properties, rather than volatile variables
or Interlocked methods.

Locking is easier and more straightforward, and has fewer subtle issues,
than the other two approaches.
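
For example, a property guarded by a private lock might look like this (a
minimal sketch; the names are mine):

class Settings
{
    private readonly object syncRoot = new object();
    private int timeout;

    public int Timeout
    {
        // Both accessors take the same lock, so every read and write of
        // the field happens inside an acquire/release pair.
        get { lock (syncRoot) { return timeout; } }
        set { lock (syncRoot) { timeout = value; } }
    }
}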

--
Chris Mullins, MCSD.NET, MCPD:Enterprise, Microsoft C# MVP
http://www.coversant.com/blogs/cmullins


Ben Voigt

May 22, 2007, 6:48:56 PM

<ben.bid...@gmail.com> wrote in message
news:1179761300....@z28g2000prd.googlegroups.com...

You should use volatile and Interlocked together; neither fully replaces the
other.

>
> See also: http://www.albahari.com/threading/part4.html
>


Willy Denoyette [MVP]

May 23, 2007, 3:43:04 AM
"Ben Voigt" <r...@nospam.nospam> wrote in message
news:uln5JOMn...@TK2MSFTNGP06.phx.gbl...

Not necessarily; there is no need for volatile as long as you use Interlocked
consistently across all threads in the process. This means that once you
access a shared variable using Interlocked, all threads should use Interlocked.

Willy.

Ben Voigt

May 23, 2007, 9:08:40 AM

"Willy Denoyette [MVP]" <willy.d...@telenet.be> wrote in message
news:1B1A6E75-67F7-4B6E...@microsoft.com...

I don't think so, actually. Without volatile semantics, the compiler is
free to cache the value of any parameter, including in/out parameters. Say
you are calling an Interlocked method in a loop. If the variable is not
volatile, the compiler can actually call Interlocked on a local copy, and
then write the value to the real variable once, at the end of the loop (and
worse, it can do so in a non-atomic way). Anything that maintains correct
operation from the perspective of the calling thread is permissible for
non-volatile variable access. Why would a compiler do this? For optimal
use of cache. By using a local copy of a variable passed byref, locality of
reference is improved, and additionally, a thread's stack (almost) never
incurs cache coherency costs.

Note that this is not a problem for pass-by-pointer, which must use the true
address of the referenced variable in order to enable pointer arithmetic.
But pointer arithmetic isn't allowed for tracking handles; a handle is an
opaque value anyway.

For lockless data structures, always use volatile. And then stick that
volatile variable close in memory to what it is protecting, because CPU
cache has to load and flush an entire cache line at once, and volatile write
semantics require flushing all pending writes.

>
> Willy.
>


Brian Gideon

May 23, 2007, 12:22:47 PM
On May 23, 8:08 am, "Ben Voigt" <r...@nospam.nospam> wrote:
> I don't think so, actually. Without volatile semantics, the compiler is
> free to cache the value of any parameter, including in/out parameters. Say
> you are calling an Interlocked method in a loop. If the variable is not
> volatile, the compiler can actually call Interlocked on a local copy, and
> then write the value to the real variable once, at the end of the loop (and
> worse, it can do so in a non-atomic way). Anything that maintains correct
> operation from the perspective of the calling thread is permissible for
> non-volatile variable access. Why would a compiler do this? For optimal
> use of cache. By using a local copy of a variable passed byref, locality of
> reference is improved, and additionally, a thread's stack (almost) never
> incurs cache coherency costs.
>
>
> Note that this is not a problem for pass-by-pointer, which must use the true
> address of the referenced variable in order to enable pointer arithmetic.
> But pointer arithmetic isn't allowed for tracking handles, a handle is an
> opaque value anyway.
>
> For lockless data structures, always use volatile. And then stick that
> volatile variable close in memory to what it is protecting, because CPU
> cache has to load and flush an entire cache line at once, and volatile write
> semantics require flushing all pending writes.
>

The Interlocked methods have volatile semantics. So as long as you
consistently use them for both reading and writing, the end result
should be the same.

Jon Skeet [C# MVP]

May 23, 2007, 2:48:25 PM
Ben Voigt <r...@nospam.nospam> wrote:
> > Not necessarily, there is no need for volatile, as long you Interlock
> > consistently across all threads in the process. This means that once you
> > access a shared variable using Interlock, all threads should use
> > Interlock.
>
> I don't think so, actually. Without volatile semantics, the compiler is
> free to cache the value of any parameter, including in/out parameters. Say
> you are calling an Interlocked method in a loop. If the variable is not
> volatile, the compiler can actually call Interlocked on a local copy, and
> then write the value to the real variable once, at the end of the loop (and
> worse, it can do so in a non-atomic way).

No - the CLI spec *particularly* mentions Interlocked operations, and
that they perform implicit acquire/release operations. In other words,
the JIT can't move stuff around in this particular case. Interlocked
would be pretty pointless without this.

Willy Denoyette [MVP]

May 23, 2007, 3:38:28 PM
"Ben Voigt" <r...@nospam.nospam> wrote in message
news:uXg8juTn...@TK2MSFTNGP03.phx.gbl...

No, not at all. Interlocked operations imply a full fence, that is, reads
have acquire and writes have release semantics. That means that the JIT may
not register these variables nor store them locally, and cannot move stuff
around them.
Think of this: what would be the use of Interlocked operations in
languages that don't support volatile (like VB.NET) or good old C/C++
(except VC7 and up)?
I also don't agree with your statement that you should *always* use volatile
in lock-free or low-lock scenarios. IMO, you should almost never use
volatile, unless you perfectly understand the semantics of the memory model
of the CLR/CLI (ECMA differs from V1.X differs from V2, for instance) and the
memory model of the CPU (IA32 vs. IA64). Over the last year I was involved in
the resolution of a number of nasty bugs, all of them the result of
people trying to out-smart the system by applying lock-free or low-lock
techniques using volatile; since then, whenever I see volatile I get
very suspicious, really.......


Willy.


Barry Kelly

May 23, 2007, 5:05:13 PM
Willy Denoyette [MVP] wrote:

> I also don't agree with your statement that you should *always* use volatile
> in lock-free or low-lock scenarios.

As far as I can see from the rest of your post, I think you've made a
mis-statement here. I think what you mean to say is that you shouldn't
use lock-free or low-locking unless there's no alternative, not that
volatile shouldn't be used - because volatile is usually very necessary
in order to get memory barriers right in those circumstances.

> IMO, you should almost never use
> volatile, unless you perfectly understand the semantics of the memory model
> of the CLR/CLI (ECMA differs from V1.X differs from V2, for instance) and the
> memory model of the CPU (IA32 vs. IA64). Over the last year I was involved in
> the resolution of a number of nasty bugs, all of them the result of
> people trying to out-smart the system by applying lock-free or low-lock
> techniques using volatile; since then, whenever I see volatile I get
> very suspicious, really.......

I agree with you that seeing 'volatile' raises red flags, but the cure is
to use proper locking if possible, and careful reasoning (rather than
shotgun 'volatile' and guesswork), not simply to omit 'volatile'.

-- Barry

--
http://barrkel.blogspot.com/

Willy Denoyette [MVP]

May 23, 2007, 6:17:46 PM
"Barry Kelly" <barry....@gmail.com> wrote in message
news:bea953lm6mqnu24mr...@4ax.com...

Well, I wasn't suggesting omitting 'volatile'; sorry if I gave that
impression. What I meant was that you should be very careful when looking for
lock-free or low-lock alternatives, and if you do, that you should not
"always" use volatile.
Note that there are alternatives to volatile fields: there are
Thread.MemoryBarrier, Thread.VolatileRead, Thread.VolatileWrite and the
Interlocked APIs, and these alternatives have IMO the (slight) advantage
that they "force" developers to reason about their usage, something which
is less the case (from what I've learned when talking with other devs
across several teams) with volatile.
But here also, you need to be very careful (the red flag should be raised
whenever you see any of these too). You need to reason about their usage, and
that's the major problem when writing threaded code: even experienced
developers have a hard time reasoning about multithreading using locks, and
programming models that require one to reason about how and when to use
explicit fences or barriers are IMO too difficult, even for experts, to use
reliably in mainstream computing, and this is what .NET is all about, isn't
it?
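
For illustration, the VolatileRead/VolatileWrite pair makes the intent
explicit at each call site (a minimal sketch; the names are mine):

using System.Threading;

class Worker
{
    private int resultReady;   // deliberately not marked volatile

    public void Publish()
    {
        // Explicit write with release semantics, visible at the call site.
        Thread.VolatileWrite(ref resultReady, 1);
    }

    public bool IsReady()
    {
        // Explicit read with acquire semantics.
        return Thread.VolatileRead(ref resultReady) != 0;
    }
}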

Willy.

Ben Voigt

May 24, 2007, 10:26:10 AM

"Willy Denoyette [MVP]" <willy.d...@telenet.be> wrote in message
news:BFF370C9-379B-4B12...@microsoft.com...

Let's look at the Win32 declaration for an Interlocked function:

LONG InterlockedExchange(
    LONG volatile* Target,
    LONG Value
);

Clearly, Target is intended to be the address of a volatile variable.
Sure, you can pass a non-volatile pointer, and there is an implicit
conversion, but if you do, *the variable will be treated as volatile only
inside InterlockedExchange*. The compiler can still do anything outside
InterlockedExchange, because it is dealing with a non-volatile variable.
And it can't possibly change behavior when InterlockedExchange is called,
because the call could be made from a different library, potentially not yet
loaded.

Consider this:

/* compilation unit one */
void DoIt(LONG *target)
{
    LONG value = /* some long calculation here */;
    if (value != InterlockedExchange(target, value))
    {
        /* some complex operation here */
    }
}

/* compilation unit two */
extern void DoIt(LONG *target);
extern LONG shared;

void outer(void)
{
    for (int i = 0; i < 1000; i++)
    {
        DoIt(&shared);
    }
}

Now, clearly, the compiler has no way of telling that DoIt uses Interlocked
access, since DoIt didn't declare volatile semantics on the pointer passed
in. So the compiler can, if desired, transform outer thusly:

void outer(void)
{
    LONG goodLocalityOfReference = shared;
    for (int i = 0; i < 1000; i++)
    {
        DoIt(&goodLocalityOfReference);
    }
    shared = goodLocalityOfReference;
}

Except for one thing. In native code, pointers have values that can be
compared, subtracted, etc. So the compiler has to honestly pass the address
of shared. In managed code, with tracking handles, the compiler doesn't
have to preserve the address of the variable (that would, after all, defeat
compacting garbage collection). Oh, sure, the JIT has a lot more
information about what is being called than a native compiler does; it
almost gets rid of separate compilation units... but not quite. With
dynamically loaded assemblies and reflection in the mix, it is just as
helpless as a "compile-time" compiler.

I'm fairly sure that the current .NET runtime doesn't actually do any such
optimization as I've described. But I wouldn't bet against such things
being added in the future, when NUMA architectures become so widespread that
the compiler has to optimize for them.

Be safe, use volatile on every variable you want to act volatile, which
includes every variable passed to Interlocked.

> Think of this: what would be the use of Interlocked operations in
> languages that don't support volatile (like VB.NET) or good old C/C++
> (except VC7 and up)?

VC++, all versions, and all other PC compilers that I'm aware of (as in, not
embedded), support volatile to the extent needed to invoke an interlocked
operation. That is, the real variable is always accessed at the time
specified by the compiler. The memory fences are provided by the
implementation of Interlocked*, independent of the compiler version.

> I also don't agree with your statement that you should *always* use
> volatile in lock-free or low-lock scenarios. IMO, you should almost never
> use volatile, unless you perfectly understand the semantics of the memory
> model of the CLR/CLI (ECMA differs from V1.X differs from V2, for instance)
> and the memory model of the CPU (IA32 vs. IA64). Over the last year I was
> involved in the resolution of a number of nasty bugs, all of them
> the result of people trying to out-smart the system by applying lock-free
> or low-lock techniques using volatile; since then, whenever I see volatile
> I get very suspicious, really.......

You are claiming that you should almost never use lock free techniques, and
thus volatile should be rare. This hardly contradicts my statement that
volatile should always be used in lock free programming.


Ben Voigt

May 24, 2007, 10:28:16 AM

"Brian Gideon" <brian...@yahoo.com> wrote in message
news:1179937367.3...@q75g2000hsh.googlegroups.com...


Inside the implementation. But who guarantees that the variable atomically
read and written to is your so-called "end result"? No one, unless you use
volatile, forcing the compiler to generate a reference to the actual
variable every time you mention it.


Jon Skeet [C# MVP]

May 24, 2007, 10:47:40 AM
On May 24, 3:28 pm, "Ben Voigt" <r...@nospam.nospam> wrote:
> > The Interlocked methods have volatile semantics. So as long as you
> > consistently use them for both reading and writing the end result
> > should be the same.
>
> Inside the implementation.

No, inside the JIT which has to notice that you call Interlocked.

> But who guarantees that the variable atomically
> read and written to, is your so-called "end result"?

The CLI spec.

By the way, if you pass volatile parameters by reference, the
volatility is irrelevant as far as the called method is concerned
anyway - it even triggers a warning from the compiler (CS0420).
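
For example (a minimal sketch; the warning is benign here because
Interlocked supplies its own fencing):

using System.Threading;

class Example
{
    private volatile int counter;

    public void Bump()
    {
        // warning CS0420: a reference to a volatile field will not
        // be treated as volatile
        Interlocked.Increment(ref counter);
    }
}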

Jon

Ben Voigt

May 24, 2007, 10:52:04 AM

"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:1180018060.0...@p77g2000hsh.googlegroups.com...

> On May 24, 3:28 pm, "Ben Voigt" <r...@nospam.nospam> wrote:
>> > The Interlocked methods have volatile semantics. So as long as you
>> > consistently use them for both reading and writing the end result
>> > should be the same.
>>
>> Inside the implementation.
>
> No, inside the JIT which has to notice that you call Interlocked.

How can it, with dynamically generated code and reflection?

>
>> But who guarantees that the variable atomically
>> read and written to, is your so-called "end result"?
>
> The CLI spec.
>
> By the way, if you pass volatile parameters by reference, the
> volatility is irrelevant as far as the called method is concerned
> anyway - it even triggers a warning from the compiler (CS0420).

C++ has the concept of pointer (or reference) to volatile, letting a
function declare to its callers that a parameter will be treated as
volatile. Doesn't .NET have the same? It seems like the only way to
correctly resolve CS0420.

Well, .NET Reflector doesn't show any type of annotation on Interlocked.Add,
for example, so I guess it does not.

>
> Jon
>


Jon Skeet [C# MVP]

May 24, 2007, 11:33:25 AM
On May 24, 3:52 pm, "Ben Voigt" <r...@nospam.nospam> wrote:
> >> Inside the implementation.
>
> > No, inside the JIT which has to notice that you call Interlocked.
>
> How can it, with dynamically generated code and reflection?

By refusing to reorder memory reads/writes around method calls - which
is exactly what it does, I believe.

The same problem exists with all the other ways of introducing memory
barriers - unless you can guarantee that a method you call won't
introduce a memory barrier (and in some cases you can do that, of
course), you've got to effectively assume that it will do so,
otherwise you violate the spec.

> > By the way, if you pass volatile parameters by reference, the
> > volatility is irrelevant as far as the called method is concerned
> > anyway - it even triggers a warning from the compiler (CS0420).
>
> C++ has the concept of pointer (or reference) to volatile, letting a
> function declare to its callers that a parameter will be treated as
> volatile. Doesn't .NET have the same? It seems like the only way to
> correctly resolve CS0420.
>
> Well, .NET Reflector doesn't show any type of annotation on Interlocked.Add,
> for example, so I guess it does not.

I don't know, to be honest. I haven't seen any such thing.

Jon

Willy Denoyette [MVP]

May 24, 2007, 6:13:25 PM
"Ben Voigt" <r...@nospam.nospam> wrote in message
news:%23MmyV$gnHHA...@TK2MSFTNGP06.phx.gbl...

Sure, but this was not my point; the point is that Interlocked operations
imply barriers, whether full or not. "volatile" implies full barriers, so they
both imply barriers, but they serve different purposes. One does not exclude
the other, but that doesn't mean they should always be used in tandem; it all
depends on what you want to achieve in your code, what guarantees you want.
Anyway, the docs do not impose it: the C# docs on Interlocked don't even
mention volatile, and the Win32 docs (Interlocked APIs) don't spend a word
on the volatile argument. (Note that the volatile was added to the
signature after NT4 SP1.)

> And, it can't possibly change behavior when InterlockedExchange is called,
> because the call could be made from a different library, potentially not
> yet loaded.
>

Sorry, but you are mixing native code and managed code semantics. What I
mean is that the semantics of the C (native) volatile are not the same as
the semantics of C# 'volatile'. So when I referred to C++ supporting
"volatile" I was referring to the managed dialects (VC7.x and VC8), whose
volatile semantics are obviously the same as all other languages'.
I don't wanna discuss the semantics of volatile in standard C/C++ here; they
are so imprecise that IMO it will lead to an endless discussion, not relevant
to C#.
Also, I don't wanna discuss the semantics of Win32 Interlocked either: "Win32
Interlocked APIs" do accept pointers to volatile items, while .NET does
accept "volatile pointers" (in unsafe context) as arguments of a method
call, but treats the item as non-volatile. Also, C# will issue a warning
when passing a volatile field (passing by ref is required by Interlocked
operations); that means that the item will be treated as volatile, but the
reference itself will not.

Where in the docs (MSDN Platform SDK etc.) do they state that Interlocked
should always be used on volatile items?


>> I also don't agree with your statement that you should *always* use
>> volatile in lock-free or low-lock scenarios. IMO, you should almost
>> never use volatile, unless you perfectly understand the semantics of the
>> memory model of the CLR/CLI (ECMA differs from V1.X differs from V2, for
>> instance) and the memory model of the CPU (IA32 vs. IA64). Over the last
>> year I was involved in the resolution of a number of nasty bugs, all of
>> them the result of people trying to out-smart the system by applying
>> lock-free or low-lock techniques using volatile; since then, whenever I
>> see volatile I get very suspicious, really.......
>
> You are claiming that you should almost never use lock free techniques,
> and thus volatile should be rare. This hardly contradicts my statement
> that volatile should always be used in lock free programming.

Kind of. I'm claiming that you should rarely use lock-free techniques when
using C# in mainstream applications. I've seen too many people trying to
implement lock-free code, and if you ask "why", the answer is mostly
"performance", and if you ask whether they measured their "locked"
implementation, the answer is mostly "well, I have no 'locked'
implementation". This is what I call "premature optimization", without any
guarantees other than probably producing unreliable code; and reliable code
is (IMO) more important than performant code.
IMO the use of volatile should be rare in the sense that you had better use
locks and only use volatile for the most simple cases (which doesn't imply
'rare'), for instance when you need to guarantee that all possible observers
of a field (of a type accepted by volatile) see the same value when that value
has been written to by another observer.
Remember, "volatile" is something taken care of by the JIT; all it does is
eliminate some of the possible optimizations, like (but not restricted to):
- volatile items cannot be registered...
- multiple stores cannot be suppressed...
- re-ordering is restricted.
- ...
But keep in mind that 'volatile' suppresses optimizations for all possible
accesses, even when not subject to multiple observers (threads), and that
volatile field accesses can move; some people think they can't....

Willy.


Peter Ritchie [C# MVP]

Jun 14, 2007, 12:20:02 PM
Sorry, coming in late; but there are some poor implications with respect to
"volatile" and "lock" in this thread (though other statements, like "...there
is no need for volatile [when] you Interlock consistently across all threads
in the process", are valid).

"lock" and "volatile" are two different things. You may not always need
"lock" with a type that can be declared volatile; but you should always use
volatile with a member that is accessed by multiple threads (an optimization
would be that you wouldn't need "volatile" if Interlocked were always used
with the member in question, if applicable--as has been noted). For example,
why would anyone assume the the line commented with "// *" was thread-safe
simply because "i" was declared with "volatile":

volatile int i;
static Random random = new Random();

static int Transmogrify(int value)
{
    return value *= random.Next();
}

void Method()
{
    i = Transmogrify(i); // *
}

"volatile" doesn't make a member thread-safe, the above operation still
requires at least two instructions (likely four), which are entirely likely
to be separated by preemption to another thread that modifies i.
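
To make that read-modify-write atomic you need to hold a lock around the
whole statement (a minimal sketch building on the code above; the lock
object is mine):

private static readonly object sync = new object();

void Method()
{
    lock (sync)
    {
        i = Transmogrify(i); // read, compute, and write now happen as a unit
    }
}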

By the same token, the lock statement surrounding access to a member doesn't
stop the compiler from having optimized use of a member by caching it to a
register, especially if that member is declared in a different assembly that
was compiled before this code was written:

lock(lockObject)
{
    i = i + 1;
}

...yes, the compiler *could* assume that all members within the lock
statement block are likely accessible by multiple threads (implicitly
volatile); but that's not its intention and it's certainly not documented as
doing that (and it would be pointless; other code knows nothing about this
block and could have optimized use of i by changing its order of access or
caching it to a register).

volatile and lock should be used in conjunction; one is not a replacement
for the other.

--
Browse http://connect.microsoft.com/VisualStudio/feedback/ and vote.
http://www.peterRitchie.com/blog/
Microsoft MVP, Visual Developer - Visual C#

Jon Skeet [C# MVP]

Jun 14, 2007, 1:48:02 PM
Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam> wrote:

<snip>

> By the same token, the lock statement surrounding access to a member doesn't
> stop the compiler from having optimized use of a member by caching it to a
> register, especially if that member is declared in a different assembly that
> was compiled before this code was written:
>
> lock(lockObject)
> {
>     i = i + 1;
> }

Acquiring a lock has acquire semantics, and releasing a lock has
release semantics. You don't need any volatility if all access to any
particular item of shared data is always made having acquired a certain
lock.

If different locks are used, you could be in trouble, but if you always
lock on the same reference (when accessing the same shared data) you're
guaranteed to be okay.

> ...yes, the compiler *could* assume that all members within the lock
> statement block are likely accessible by multiple threads (implicitly
> volatile); but that's not its intention and it's certainly not documented as
> doing that (and it would be pointless; other code knows nothing about this
> block and could have optimized use of i by changing its order of access or
> caching it to a register).

It certainly *is* documented. ECMA 335, section 12.6.5:

<quote>
Acquiring a lock (System.Threading.Monitor.Enter or entering a
synchronized method) shall implicitly
perform a volatile read operation, and releasing a lock
(System.Threading.Monitor.Exit or leaving a
synchronized method) shall implicitly perform a volatile write
operation.
</quote>

> volatile and lock should be used in conjunction, one is not a replacement
> for the other.

If you lock appropriately, you never need to use volatile.

Willy Denoyette [MVP]

Jun 14, 2007, 2:45:01 PM
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20db8fa...@msnews.microsoft.com...


True, when using locks, make sure you do it consistently. And that's exactly
why I said that I'm getting suspicious when I see a "volatile" field. Most
of the time this modifier is used because the author doesn't understand the
semantics of "volatile", or he's not sure about his own locking policy, or he
has no locking policy at all. Also, some may think that volatile implies a
fence, which is not the case; it only tells the JIT to turn off some of the
optimizations, like register allocation and load/store reordering, but it
doesn't prevent possible re-ordering and write buffering done by the CPU.
Note that this is a non-issue on X86 and X64 CPUs, given the memory
model enforced by the CLR, but it is an issue on IA64.

Willy.

Peter Ritchie [C# MVP]

Jun 14, 2007, 3:40:01 PM
> Acquiring a lock has acquire semantics, and releasing a lock has
> release semantics. You don't need any volatility if all access to any
> particular item of shared data is always made having acquired a certain
> lock.

...which only applies to reference types. Most of this discussion has been
revolving around value types (by virtue of Interlocked.Increment), to which
"lock" cannot apply. E.g. you can't switch from using lock on a member
to using Interlocked.Increment on that member; one works with references and
the other with value types (specifically Int32 and Int64). This is what
raised my concern.

> It certainly *is* documented. ECMA 335, section 12.6.5:
>
> <quote>
> Acquiring a lock (System.Threading.Monitor.Enter or entering a
> synchronized method) shall implicitly
> perform a volatile read operation, and releasing a lock
> (System.Threading.Monitor.Exit or leaving a
> synchronized method) shall implicitly perform a volatile write
> operation.
> </quote>

...still doesn't document anything about the members/variables within the
locked block (please read my example). That quote applies only to the
reference used as the parameter for the lock.

There can be no lock acquire semantics for value members. Suggesting
"locking appropriately" cannot apply here and can be misconstrued by some
people as creating something like "lock(myLocker){intMember = SomeMethod();}",
which does not do the same thing as making intMember volatile, increases
overhead needlessly, and still leaves a potential bug.

>
> > volatile and lock should be used in conjunction, one is not a replacement
> > for the other.
>
> If you lock appropriately, you never need to use volatile.

Even if the discussion hadn't been about value types, it's a dangerous
statement, because it could only apply to reference types (i.e. if myObject
is wrapped with lock(myObject) in every thread, yes, I don't need to declare
it with volatile; but that's probably not why I'm using lock). In the context
of reference types, volatile only applies to the pointer (reference), not
anything within the object it references. Reference assignment is atomic;
there's no need to use lock to guard that sort of thing. You use lock to
guard a non-atomic invariant; volatile has nothing to do with that. It has to
do with the optimization (ordering, caching) of pointer/value reads and
writes.

Calling Monitor.Enter/Monitor.Exit is a pretty heavy-weight means of
ensuring acquire semantics; at least 5 times slower if volatile is all you
need.

-- Peter

Peter Ritchie [C# MVP]

Jun 14, 2007, 3:45:00 PM
Do you think the following is suspicious?

volatile int intMember;

...assumes you didn't read my last post, I suppose :-)

-- Peter

--
Browse http://connect.microsoft.com/VisualStudio/feedback/ and vote.
http://www.peterRitchie.com/blog/
Microsoft MVP, Visual Developer - Visual C#

Jon Skeet [C# MVP]

Jun 14, 2007, 4:50:49 PM
Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam> wrote:
> > Acquiring a lock has acquire semantics, and releasing a lock has
> > release semantics. You don't need any volatility if all access to any
> > particular item of shared data is always made having acquired a certain
> > lock.
>
> ...which only applies to reference types. Most of this discussion has been
> revolving around value types (by virtue of Interlocked.Increment), to which
> "lock" cannot apply. E.g. you can't switch from using lock on a member
> to using Interlocked.Increment on that member; one works with references and
> the other with value types (specifically Int32 and Int64). This is what
> raised my concern.

It's not a case of using a lock on a particular value - taking the lock
out creates a memory barrier beyond which *no* reads can pass, not just
reads on the locked expression.

> > It certainly *is* documented. ECMA 335, section 12.6.5:
> >
> > <quote>
> > Acquiring a lock (System.Threading.Monitor.Enter or entering a
> > synchronized method) shall implicitly
> > perform a volatile read operation, and releasing a lock
> > (System.Threading.Monitor.Exit or leaving a
> > synchronized method) shall implicitly perform a volatile write
> > operation.
> > </quote>
>
> ...still doesn't document anything about the members/variables within the
> locked block (please read my example). That quote applies only to the
> reference used as the parameter for the lock.
>
> There can be no lock acquire semantics for value members. Suggesting
> "locking appropriately" cannot apply here and can be misconstrued by some
> people by creating something like "lock(myLocker){intMember = SomeMethod();}"
> which does not do the same thing as making intMember volatile, increases
> overhead needlessly, and still leaves a potential bug.

No, it *doesn't* leave a bug - you've misunderstood the effect of lock
having acquire semantics.

> > > volatile and lock should be used in conjunction, one is not a replacement
> > > for the other.
> >
> > If you lock appropriately, you never need to use volatile.
>
> Even if the discussion hasn't been about value types, a dangerous statement;
> because it could only apply to reference types (i.e. if myObject is wrapped
> with lock(myObject) in every thread, yes I don't need to declare it with
> volatile--but that's probably not why I'm using lock). In the context of
> reference types, volatile only applies to the pointer (reference) not
> anything within the object it references. Reference assignment is atomic,
> there's no need to use lock to guard that sort of thing. You use lock to
> guard a non-atomic invariant, volatile has nothing to do with that--it has to
> do with the optimization (ordering, caching) of pointer/value reads and
> writes.

Atomicity and volatility are very different things, and shouldn't be
confused.

Locks do more than just guarding non-atomic invariants though - they
have the acquire/release semantics which make volatility unnecessary.

To be absolutely clear on this, if I have:

int someValue;
object myLock;

...

lock (myLock)
{
    int x = someValue;
    someValue = x + 1;
}

then the read of someValue *cannot* be from a cache - it *must* occur
after the lock has been taken out. Likewise, before the lock is
released, the write back to someValue *must* have been effectively
flushed (it can't occur later than the release in the logical memory
model).

Here's how that's guaranteed by the spec:

"Acquiring a lock (System.Threading.Monitor.Enter or entering a
synchronized method) shall implicitly perform a volatile read
operation"

and

"A volatile read has =3Facquire semantics=3F meaning that the read is
guaranteed to occur prior to any references to memory that occur after
the read instruction in the CIL instruction sequence."

That means that the volatile read due to the lock is guaranteed to
occur prior to the "reference to memory" (reading someValue) which
occurs later in the CIL instruction sequence.

The same thing happens the other way round for releasing the lock.

> Calling Monitor.Enter/Monitor.Exit is a pretty heavy-weight means of
> ensuring acquire semantics; at least 5 times slower if volatile is all you
> need.

But still fast enough for almost everything I've ever needed to do, and
I find it a lot easier to reason about a single way of doing things
than having multiple ways for multiple situations. Just a personal
preference - but it definitely *is* safe, without ever needing to
declare anything volatile.

Willy Denoyette [MVP]

Jun 14, 2007, 5:06:23 PM
"Peter Ritchie [C# MVP]" <PRS...@newsgroups.nospam> wrote in message
news:53C56549-0FD9-4D24...@microsoft.com...

> Do you think the following is suspicious?
>
> volatile int intMember;
>
> ...assumes you didn't read my last post, I suppose :-)
>
> -- Peter

Yes, I do; maybe it's a sign that someone is trying to write lock-free
code....

But , I get even more suspicious is when I see this:

...
volatile int intMember;
...
void Foo()
{
    lock(myLock)
    {
        // use intMember here and protect its shared state by preventing
        // other threads from touching intMember for the duration of the
        // critical section
    }
    ...
}

In the above case, when you apply a consistent locking policy to protect your
invariants, there is no need for a volatile intMember. Otherwise, it can be an
indication that someone is trying to play smart by not taking a lock to
access intMember.


Willy.


Willy Denoyette [MVP]

Jun 14, 2007, 5:30:04 PM
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20dbba8...@msnews.microsoft.com...

Actually, on modern processors (others aren't supported anyway, unless you
are running W98 on an 80386), the reads and writes will come/go from/to the
cache (L1, L2...); the cache coherency protocol will guarantee consistency
across the cache lines holding the variable that has changed. That way, the
"software" has a uniform view of what is called the "memory", irrespective of
the number of HW threads (not talking about NUMA here!).


> Here's how that's guaranteed by the spec:
>
> "Acquiring a lock (System.Threading.Monitor.Enter or entering a
> synchronized method) shall implicitly perform a volatile read
> operation"
>
> and
>
> "A volatile read has =3Facquire semantics=3F meaning that the read is
> guaranteed to occur prior to any references to memory that occur after
> the read instruction in the CIL instruction sequence."
>
> That means that the volatile read due to the lock is guaranteed to
> occur prior to the "reference to memory" (reading someValue) which
> occurs later in the CIL instruction sequence.
>
> The same thing happens the other way round for releasing the lock.
>
>> Calling Monitor.Enter/Monitor.Exit is a pretty heavy-weight means of
>> ensuring acquire semantics; at least 5 times slower if volatile is all
>> you
>> need.
>
> But still fast enough for almost everything I've ever needed to do, and
> I find it a lot easier to reason about a single way of doing things
> than having multiple ways for multiple situations. Just a personal
> preference - but it definitely *is* safe, without ever needing to
> declare anything volatile.
>

Probably one of the reasons why I've never seen a volatile modifier on a
field in the FCL.
And to repeat myself, volatile is not a guarantee against re-ordering and
write buffering by CPUs implementing a weak memory model, like the IA64.
Volatile serves only one thing, that is, to prevent optimizations like
re-registering and re-ordering as would be done by the JIT compiler.

Willy.

Jon Skeet [C# MVP]

Jun 14, 2007, 5:50:01 PM
Willy Denoyette [MVP] <willy.d...@telenet.be> wrote:

<snip>

> > then the read of someValue *cannot* be from a cache - it *must* occur
> > after the lock has been taken out. Likewise before the lock is
> > released, the write back to someValue *must* have been made effectively
> > flushed (it can't occur later than the release in the logical memory
> > model).
>
> Actually, on modern processors (others aren't supported anyway, unless you
> are running W98 on an 80386), the reads and writes will come/go from/to the
> cache (L1, L2...); the cache coherency protocol will guarantee consistency
> across the cache lines holding the variable that has changed. That way, the
> "software" has a uniform view of what is called the "memory", irrespective of
> the number of HW threads (not talking about NUMA here!).

Yes - I've been using "cache" here somewhat naughtily (because it's the
terminology Peter was using). The sensible way to talk about it is in
terms of the .NET memory model, which is

> > But still fast enough for almost everything I've ever needed to do, and
> > I find it a lot easier to reason about a single way of doing things
> > than having multiple ways for multiple situations. Just a personal
> > preference - but it definitely *is* safe, without ever needing to
> > declare anything volatile.
>
> Probably one of the reasons why I've never seen a volatile modifier on a
> field in the FCL.
> And to repeat myself, volatile is not a guarantee against re-ordering and
> write buffering by CPUs implementing a weak memory model, like the IA64.
> Volatile serves only one thing, that is, to prevent optimizations like
> re-registering and re-ordering as would be done by the JIT compiler.

No, I disagree with that. Volatile *does* prevent (some) reordering and
write buffering as far as the visible effect on the code is concerned,
whether the effect comes from the JIT or the CPU. Suppose variables a
and b are volatile, then:

int c = a;
int d = b;

will guarantee that the visible effect is the value of "a" being read
before the value of "b" (which wouldn't be the case if they weren't
volatile). In particular, if the variables both start out at 0, then we
do:

b = 1;
a = 1;

in parallel with the previous code, then you might get c=d=1, or c=d=0,
or c=0, d=1, but you're guaranteed *not* to get c=1, d=0.

Whether that involves the JIT doing extra work to get round a weak CPU
memory model is unimportant - if it doesn't prevent that last
situation, it's failed to meet the spec.

Willy Denoyette [MVP]

Jun 14, 2007, 6:48:25 PM
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20dbc86...@msnews.microsoft.com...

Willy Denoyette [MVP]

Jun 14, 2007, 8:03:24 PM
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20dbc86...@msnews.microsoft.com...


Agreed, reads (volatile or not) cannot move before a volatile read, and
writes cannot move after a volatile write.

But this is not my point; what I'm referring to is the following (assuming a
and b are volatile):

a = 5;
int d = b;

here it's allowed for the write to move after the read; they refer
to different locations and have no (visible) dependencies.


Willy.

Peter Ritchie [C# MVP]

Jun 14, 2007, 8:07:00 PM
For the record, I've been talking about the compiler re-organizing the code
during optimization. And I thought I was pretty clear about the compiler
"caching" values to a register, not the CPU's caches.

> It's not a case of using a lock on a particular value - taking the lock
> out creates a memory barrier beyond which *no* reads can pass, not just
> reads on the locked expression.

I don't see how you get that from:

> "Acquiring a lock (System.Threading.Monitor.Enter or entering a
> synchronized method) shall implicitly perform a volatile read
> operation"
>
> and
>

> "A volatile read has "acquire semantics" meaning that the read is

> guaranteed to occur prior to any references to memory that occur after
> the read instruction in the CIL instruction sequence."

I would agree that a volatile read/write is performed on the parameter for
Monitor.Enter and Monitor.Exit.

> To be absolutely clear on this, if I have:
>
> int someValue;
> object myLock;
>
> ...
>
> lock (myLock)
> {
>     int x = someValue;
>     someValue = x + 1;
> }
>

> then the read of someValue *cannot* be from a cache - it *must* occur
> after the lock has been taken out. Likewise, before the lock is
> released, the write back to someValue *must* have been effectively
> flushed (it can't occur later than the release in the logical memory
> model).

You're talking about CPU re-organizations and CPU caching; I've been
talking about compiler optimizations.

None of the quotes affects code already optimized by the compiler. If the
compiler decides to write code that doesn't write a temporary value directly
back to the member/variable because it's faster, and it doesn't know it's
volatile, nothing you've quoted will have a bearing on that.

Monitor.Enter may create a memory barrier for the current thread; it's
unclear from 335. But it could not have affected code that accesses members
outside of a lock block.

335 says nothing about what the compiler does with code within a locked block.

Jon Skeet [C# MVP]

Jun 15, 2007, 12:06:28 AM
Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam> wrote:
> For the record, I've been talking about the compiler re-organizing the code
> during optimization. And I thought I was pretty clear about the compiler
> "caching" values to a register, not the CPUs caches.

That's all irrelevant - the important thing is the visible effect.

<snip>

> > then the read of someValue *cannot* be from a cache - it *must* occur
> > after the lock has been taken out. Likewise, before the lock is
> > released, the write back to someValue *must* have been effectively
> > flushed (it can't occur later than the release in the logical memory
> > model).
>
> You're talking about CPU re-organizations and CPU cachings, I've been
> talking about compiler optimizations.

As I said to Willy, I shouldn't have used the word "cache". Quite what
could make things appear to be out of order is irrelevant - they're all
forbidden by the spec in this case.

> None of the quotes affect code already optimized by the compiler. If the
> compiler decides writing code that doesn't write a temporary value directly
> back to the member/variable because it's faster and it doesn't know it's
> volatile, nothing you've quoted will have a bearing on that.

So here are you talking about the C# compiler rather than the JIT
compiler?

If so, I agree there appears to be a hole in the C# spec. I don't
believe the C# compiler *will* move any reads/writes around, however.
For the rest of the post, however, I'll assume you were actually still
talking about the JIT.

> Monitor.Enter may create memory barrier for the current thread, it's unclear
> from 335; but it could not have affected code that accesses members outside
> of a lock block.

Agreed, but irrelevant.



> 335 says nothing about what the compiler does with code within a locked block.

Agreed, but irrelevant.

The situation I've been talking about is where a particular variable is
only referenced *inside* lock blocks, and where all the lock blocks
which refer to that variable are all locking against the same
reference.

At that point, there is an absolute ordering in terms of the execution
of those lock blocks - only one can execute at a time, because that's
the main point of locking.

Furthermore, while the ordering *within* the lock can be moved, none of
the reads which are inside the lock can be moved to before the lock is
acquired (in terms of the memory model, however that is achieved) and
none of the writes which are inside the lock can be moved to after the
lock is released.

Therefore any change to the variable is seen by each thread, with no
"stale" values being involved.

Now I totally agree that *if* you start accessing the variable from
outside a lock block, all bets are off - but so long as you keep
everything within locked sections of code, all locked with the same
lock, you're fine.

Jon Skeet [C# MVP]

Jun 15, 2007, 12:09:20 AM
Willy Denoyette [MVP] <willy.d...@telenet.be> wrote:

<snip>

> Agreed, reads (volatile or not) cannot move before a volatile read, and
> writes cannot move after a volatile write.
>
> But this is not my point, what I'm referring to is the following (assuming a
> and b are volatile):
>
> a = 5;
> int d = b;
>
> here it's allowed for the write to move after the read; they refer
> to different locations and have no (visible) dependencies.

Assuming they're not volatile, you're absolutely right - but I thought
you were talking about what could happen with *volatile* variables,
given that you said:

<quote>
And to repeat myself, volatile is not a guarantee against re-ordering
and write buffering by CPUs implementing a weak memory model, like the
IA64.
</quote>

I believe volatile *is* a guarantee against the reordering of volatile
operations. Volatile isn't a guarantee against the reordering of two
non-volatile operations with no volatile operation between them, but
that's the case for the JIT as well as the CPU.

I don't believe it's necessary to talk about the JIT separately from
the CPU when thinking on a purely spec-based level. If we were looking
at generated code we'd need to consider the platform etc, but at a
higher level than that we can just talk about the memory model that the
CLR provides, however it provides it.

Willy Denoyette [MVP]

Jun 15, 2007, 3:45:18 AM
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20dc214...@msnews.microsoft.com...

Willy Denoyette [MVP]

Jun 15, 2007, 6:04:52 AM
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20dc214...@msnews.microsoft.com...

> Willy Denoyette [MVP] <willy.d...@telenet.be> wrote:
>
> <snip>
>
>> Agreed, reads (all or not volatile) cannot move before a volatile read,
>> and
>> writes cannot move after a volatile write.
>>
>> But this is not my point, what I'm referring to is the following
>> (assuming a
>> and b are volatile):
>>
>> a = 5;
>> int d = b;
>>
>> here it's allowed for the write to move after the read, they are
>> referring
>> to different locations and they have no (visible) dependencies).
>
> Assuming they're not volatile, you're absolutely right - but I thought
> you were talking about what could happen with *volatile* variables,
> given that you said:
>

Not really, I'm talking about the volatile field b.
The (ECMA) rules for volatile state that:
- reads and writes cannot move before a *volatile* read;
- reads and writes cannot move after a *volatile* write.
As I see it, this means that ordinary writes can move after a volatile read.
So, in the above, the write to 'a' can move after the volatile read from
'b', agree?
However, the above rules are not clear on the case where 'a' and 'b' are
volatile; do the rules prohibit a volatile write from moving after a volatile
read? IMO they don't.

Jon Skeet [C# MVP]

Jun 15, 2007, 6:34:11 AM
Willy Denoyette [MVP] <willy.d...@telenet.be> wrote:
> >> But this is not my point, what I'm referring to is the following
> >> (assuming a
> >> and b are volatile):
> >>
> >> a = 5;
> >> int d = b;
> >>
> >> here it's allowed for the write to move after the read; they refer
> >> to different locations and have no (visible) dependencies.
> >
> > Assuming they're not volatile, you're absolutely right - but I thought
> > you were talking about what could happen with *volatile* variables,
> > given that you said:
>
> Not really, I'm talking about the volatile field b.

Sorry - I stupidly misread "are volatile" as "are not volatile". Doh!

> The (ECMA) rules for volatile state that:
> - reads and writes cannot move before a *volatile* read;
> - reads and writes cannot move after a *volatile* write.
> As I see it, this means that ordinary writes can move after a volatile read.
> So, in the above, the write to 'a' can move after the volatile read from
> 'b', agree?
> However, the above rules are not clear on the case where 'a' and 'b' are
> volatile; do the rules prohibit a volatile write from moving after a volatile
> read? IMO they don't.

Yup, I think you're right.

I basically think of volatile as pretty much *solely* a way to make
sure you always see the latest value of the variable in any thread.
When it comes to interactions like that, while they're interesting to
reason about, I'd rather use a lock in situations where I really care
:)

Willy Denoyette [MVP]

Jun 15, 2007, 6:47:17 AM
Sorry, but the previous message went out before being finished.

"Willy Denoyette [MVP]" <willy.d...@telenet.be> wrote in message
news:ent%23GTzrH...@TK2MSFTNGP05.phx.gbl...

However, the memory model as implemented by V2 of the CLR also defines an
explicit rule that states:
- All shared writes shall have release semantics.
This could be restated as: "writes cannot be reordered, period". That means
that on the current platforms, emitting every write with release semantics
is sufficient to:
1) perform each processor's stores in order, and
2) make them visible to other processors in that order.
That makes the execution environment Processor Consistent (PC). Great: that
would mean that the above optimization (moving the write after the volatile
read) is excluded. The problem, however, is that notably the JIT64 on IA64
does not enforce that rule consistently; it appears to enable such
optimizations in violation of the "managed memory model". MSFT is aware of
this, but as of today I have no idea whether they are addressing it or
whether they consider it acceptable on the IA64 platform.

Willy.


Jon Skeet [C# MVP]

Jun 15, 2007, 6:54:04 AM
Willy Denoyette [MVP] <willy.d...@telenet.be> wrote:

<snip>

> However, the memory model as implemented by V2 of the CLR also defines an
> explicit rule that states:
> - All shared writes shall have release semantics.
> This could be restated as: "writes cannot be reordered, period". That means
> that on the current platforms, emitting every write with release semantics
> is sufficient to:
> 1) perform each processor's stores in order, and
> 2) make them visible to other processors in that order.
> That makes the execution environment Processor Consistent (PC). Great: that
> would mean that the above optimization (moving the write after the volatile
> read) is excluded. The problem, however, is that notably the JIT64 on IA64
> does not enforce that rule consistently; it appears to enable such
> optimizations in violation of the "managed memory model". MSFT is aware of
> this, but as of today I have no idea whether they are addressing it or
> whether they consider it acceptable on the IA64 platform.

I *thought* (though I could well be wrong) that before release, the
IA64 JIT was indeed very lax, but that it had been tightened up close
to release. I wouldn't like to try to find any evidence of that though
;)

Just another reason to stick to "simple" thread safety via locks, IMO.

Peter Ritchie [C# MVP]

unread,
Jun 17, 2007, 6:12:00 PM6/17/07
to
"Jon Skeet [C# MVP]" wrote:
> So here are you talking about the C# compiler rather than the JIT
> compiler?
>
> If so, I agree there appears to be a hole in the C# spec. I don't
> believe the C# compiler *will* move any reads/writes around, however.
> For the rest of the post, however, I'll assume you were actually still
> talking about the JIT.

Could be either, I suppose. I don't think the spec. is clear at all in this
respect. With regard to compiler-level optimizations, 12.6.4 details: "...are
visible in the order specified in the CIL.". Which suggests to me that the
C#-to-IL compiler doesn't optimize other than potential reorganizations. The
detail before that seems concerning: "guarantees, within a single thread of
execution, that side-effects ... are visible in the order specified by the
CIL". Sounds like memory barriers are set up within Monitor.Enter and
Monitor.Exit to ensure CPU-level re-ordering is limited; but, unless there's
a modreq(volatile) on a member, the JIT can't know not to introduce
cross-thread visible side-effects unless it looks for calls to Monitor.Enter
and Monitor.Exit.

Jon Skeet [C# MVP]

unread,
Jun 18, 2007, 3:04:10 PM6/18/07
to
Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam> wrote:
> "Jon Skeet [C# MVP]" wrote:
> > So here are you talking about the C# compiler rather than the JIT
> > compiler?
> >
> > If so, I agree there appears to be a hole in the C# spec. I don't
> > believe the C# compiler *will* move any reads/writes around, however.
> > For the rest of the post, however, I'll assume you were actually still
> > talking about the JIT.
>
> Could be either, I suppose. I don't think the spec. is clear at all in this
> respect. With regard to compiler-level optimizations, 12.6.4 details: "...are
> visible in the order specified in the CIL.". Which suggests to me that the
> C#-to-IL compiler doesn't optimize other than potential reorganizations.

The CIL spec can't determine what the C# compiler is allowed to do. I
haven't seen anything in the C# spec which says it won't reorder things
- although I hope and believe that it won't.

> The detail before that seems concerning: "guarantees, within a single thread of
> execution, that side-effects ... are visible in the order specified by the
> CIL". Sounds like memory barriers are set up within Monitor.Enter and
> Monitor.Exit to ensure CPU-level re-ordering is limited; but, unless there's
> a modreq(volatile) on a member, the JIT can't know not to introduce
> cross-thread visible side-effects unless it looks for calls to Monitor.Enter
> and Monitor.Exit.

I believe it *actually* avoids any reordering around *any* method
calls. I don't think 12.6.7 leaves much wiggle-room though - acquiring
the lock counts as a volatile read, and releasing the lock counts as a
volatile write, and the reordering prohibitions therefore apply - and
apply to *all* reads and writes, not just reads and writes of that
variable.

Note that 12.6.4 says: "(Note that while only volatile operations
constitute visible side-effects, volatile operations also affect the
visibility of non-volatile references.)" It's that effect on non-volatile
visibility which I'm talking about.

Would it be worth me coming up with a small sample problem which shares
data without using any volatile variables? I claim that (assuming I
write it correctly) there won't be a bug - if you can suggest a failure
mode, we could try to reason about a concrete case rather than talking
in the abstract.
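
For concreteness, a minimal sketch of the kind of sample I mean (assumed,
with illustrative names - not the actual program) - a non-volatile field
that is only ever touched while holding one lock:

class Counter
{
    private readonly object locker = new object();
    private int count;   // deliberately not volatile

    public void Increment()
    {
        lock (locker) { count++; }      // write only under the lock
    }

    public int Read()
    {
        lock (locker) { return count; } // read only under the same lock
    }
}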

Peter Ritchie [C# MVP]

unread,
Jun 18, 2007, 9:58:00 PM6/18/07
to
"Jon Skeet [C# MVP]" wrote:

> I believe it *actually* avoids any reordering around *any* method
> calls. I don't think 12.6.7 leaves much wiggle-room though - acquiring
> the lock counts as a volatile read, and releasing the lock counts as a
> volatile write, and the reordering prohibitions therefore apply - and
> apply to *all* reads and writes, not just reads and writes of that
> variable.

Yes, acquiring the lock is a run-time operation. Just as MemoryBarrier,
volatile read, and volatile write are run-time operations that only ensure
the CPU has flushed any new values to RAM and won't reorder side-effects
between the acquire and release semantics.
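
For reference, those run-time operations in code (a minimal sketch using
.NET 2.0 APIs; the class name is illustrative):

using System.Threading;

class RuntimeFences
{
    private int value;

    public void Demo()
    {
        Thread.MemoryBarrier();                  // full fence, at run time
        int v = Thread.VolatileRead(ref value);  // read with acquire semantics
        Thread.VolatileWrite(ref value, v + 1);  // write with release semantics
    }
}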

I'm talking about compiler optimizations (specifically the JIT--okay, I should
have been calling it the IL compiler--because the C#-to-IL compiler doesn't
really have the concept of registers, as I've made reference to).

>
> Note that 12.6.4 says: "(Note that while only volatile operations
> constitute visible side-effects, volatile operations also affect the
> visibility of non-volatile references.)" It's that affect non-volatile
> visibility which I'm talking about.

Again, run-time operations.

>
> Would it be worth me coming up with a small sample problem which shares
> data without using any volatile variables? I claim that (assuming I
> write it correctly) there won't be a bug - if you can suggest a failure
> mode, we could try to reason about a concrete case rather than talking
> in the abstract.

The issue I'm talking about will only occur if the JIT optimizes in a
certain way. Let's take an academic example:
internal class Tester {
    private Object locker = new Object();
    private int number;
    private Random random = new Random();

    private void UpdateNumber() {
        int count = random.Next();
        for (int i = 0; i < count; ++i) {
            number++;
            Trace.WriteLine(number);
        }
    }
    public void DoSomething() {
        lock (locker) {
            Trace.WriteLine(number);
        }
    }
}

*if* the JIT optimized the incrementation of number as follows (example x86,
it's been a while; I may have screwed up the offsets...):
for (int i = 0; i < count; ++i)
00000020 xor ebx,ebx
00000022 test ebp,ebp
00000024 jle 00000033
{
number++;
00000026 add edi,1
00000029 add ebx,1
0000002C cmp ebx,ebp
0000002E jl 00000026
00000030 mov dword ptr [esi+0Ch],edi
00000033 pop ebp

...where it's optimized the calculations on number to use a register (edi)
during the loop and assigned that result to number at the end of the loop.
Within a single thread of execution, very valid (because we haven't told it
otherwise with "volatile"); and within the native world: been done for
decades.

Clearly another thread accessing Tester.number isn't going to see any of
those incremental changes.

Even if you wrap that with a lock statement, create a MemoryBarrier, etc.
those are all still run-time operations, it does not give any information to
the JIT about anything within the lock block (which is what I was referring
to by my original comment about certainly not documented...). By the time
the code is loaded into memory (let alone when Monitor.Enter is called) the
compiler has already done its optimizations.

The only thing that could tell the compiler anything about volatility with
respect to compile-time optimizations is something declarative, like
volatile. Yes, writes to fields declared as volatile also get volatile
reads/writes and acquire/release semantics just like Monitor.Enter and
Monitor.Exit; but that's the run-time aspect of it (for the too-smart
processors).

Jon Skeet [C# MVP]

unread,
Jun 19, 2007, 2:22:01 AM6/19/07
to
Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam> wrote:
> "Jon Skeet [C# MVP]" wrote:
>
> > I believe it *actually* avoids any reordering around *any* method
> > calls. I don't think 12.6.7 leaves much wiggle-room though - acquiring
> > the lock counts as a volatile read, and releasing the lock counts as a
> > volatile write, and the reordering prohibitions therefore apply - and
> > apply to *all* reads and writes, not just reads and writes of that
> > variable.
> Yes, acquiring the lock is a run-time operation. Just as MemoryBarrier,
> volatile read, and volatile write are run-time operations and only ensures
> the CPU has flushed any new values to RAM and won't reorder side-effects
> between the acquire and release semantics.
>
> I'm talking about compiler optimizations (specifically JIT--okay, I should
> have been calling it the IL compiler--because the C# to IL compiler doesn't
> really have the concept of registers, as I've made reference to).

That's irrelevant though - the spec just says what will happen, not
which bit is responsible for making sure it happens.

> > Note that 12.6.4 says: "(Note that while only volatile operations
> > constitute visible side-effects, volatile operations also affect the
> > visibility of non-volatile references.)" It's that affect non-volatile
> > visibility which I'm talking about.
> Again, run-time operations.

But affected by compilation decisions.

> > Would it be worth me coming up with a small sample problem which shares
> > data without using any volatile variables? I claim that (assuming I
> > write it correctly) there won't be a bug - if you can suggest a failure
> > mode, we could try to reason about a concrete case rather than talking
> > in the abstract.
>
> The issue I'm talking about will only occur if the JIT optimizes in a
> certain way. Let's take an academic example:
> internal class Tester {
> private Object locker = new Object();
> private int number;
> private Random random = new Random();
>
> private void UpdateNumber ( ) {
> int count = random.Next();
> for (int i = 0; i < count; ++i) {
> number++;
> Trace.WriteLine(number);
> }
> }
> public void DoSomething() {
> lock(locker) {
> Trace.WriteLine(number);
> }
> }
> }

That is indeed buggy code - you're accessing number without locking.
That's not the situation I've been describing. If you change your code
to:

int count = random.Next();
for (int i = 0; i < count; ++i) {
    lock (locker)
    {
        number++;
        Trace.WriteLine(number);
    }
}

then the code is okay. That's the situation I've been consistently
describing.

<snip>

> The only thing that could tell the compiler anything about volatility with
> respect to compile-time optimizations is something declarative, like
> volatile. Yes, writes to fields declared as volatile also get volatile
> reads/writes and acquire/release semantics just like Monitor.Enter and
> Monitor.Exit; but that's the run-time aspect of it (for the too-smart
> processors).

Again, you're making assumptions about which bit of the spec applies to
CPU optimisations and which bit applies to JIT compilation
optimisations. The spec doesn't say anything about that - it just makes
guarantees about what will be visible when. With the corrected code
above, there is no bug, because the JIT must know that it *must*
freshly read number after acquiring the lock, and *must* "flush" number
to main memory before releasing the lock.

Peter Ritchie [C# MVP]

unread,
Jun 19, 2007, 10:11:08 AM6/19/07
to
"Jon Skeet [C# MVP]" wrote:
<snip>

I guess we'll just have to disagree on a few things, for the reasons I've
already stated. I don't see much point in going back and forth saying the
same things...

With regard to runtime volatile read/writes and acquire/release semantics of
Monitor.Enter and Monitor.Exit we can agree.

I don't agree that anything specified in either 334 or 335 covers all levels
of potential compile-time class member JIT/IL compiler optimizations.

I don't agree that "int number; void UpdateNumber(){lock(locker){
number++;}}" is equally as safe as "volatile int number; void UpdateNumber(){
number++; }"

With the following Monitor.Enter/Exit IL, for example:
.field private int32 number
.method private hidebysig instance void UpdateNumber() cil managed
{
.maxstack 3
.locals init (
[0] int32 count,
[1] int32 i)
L_0000: ldarg.0
L_0001: ldfld class [mscorlib]System.Random Tester::random
L_0006: callvirt instance int32 [mscorlib]System.Random::Next()
L_000b: stloc.0
L_000c: ldarg.0 // *
L_000d: ldfld object Tester::locker //*
L_0012: call void [mscorlib]System.Threading.Monitor::Enter(object) //*
L_0017: ldc.i4.0
L_0018: stloc.1
L_0019: br.s L_003d
L_001b: ldarg.0
L_001c: dup
L_001d: ldfld int32 Tester::number
L_0022: ldc.i4.1
L_0023: add
L_0024: stfld int32 Tester::number
L_0029: ldarg.0
L_002a: ldfld int32 Tester::number
L_002f: box int32
L_0034: call void [System]System.Diagnostics.Trace::WriteLine(object)
L_0039: ldloc.1
L_003a: ldc.i4.1
L_003b: add
L_003c: stloc.1
L_003d: ldloc.1
L_003e: ldloc.0
L_003f: blt.s L_001b
L_0041: leave.s L_004f
L_0043: ldarg.0 // *
L_0044: ldfld object Tester::locker // *
L_0049: call void [mscorlib]System.Threading.Monitor::Exit(object) //*
L_004e: endfinally
L_004f: ret
.try L_0017 to L_0043 finally handler L_0043 to L_004f
}

...what part of that IL tells the JIT/IL compiler that Tester.number
specifically should be treated differently--where lines commented // * are
the only lines distinct to usage of Monitor.Enter/Exit?

Compared to use of volatile:
.field private int32
modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile) number
.method private hidebysig instance void UpdateNumber() cil managed
{
.maxstack 3
.locals init (
[0] int32 count,
[1] int32 i)
L_0000: ldarg.0
L_0001: ldfld class [mscorlib]System.Random One.Tester::random
L_0006: callvirt instance int32 [mscorlib]System.Random::Next()
L_000b: stloc.0
L_000c: ldc.i4.0
L_000d: stloc.1
L_000e: br.s L_0038
L_0010: ldarg.0
L_0011: dup
L_0012: volatile
L_0014: ldfld int32
modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)
One.Tester::number
L_0019: ldc.i4.1
L_001a: add
L_001b: volatile
L_001d: stfld int32
modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)
One.Tester::number
L_0022: ldarg.0
L_0023: volatile
L_0025: ldfld int32
modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)
One.Tester::number
L_002a: box int32
L_002f: call void [System]System.Diagnostics.Trace::WriteLine(object)
L_0034: ldloc.1
L_0035: ldc.i4.1
L_0036: add
L_0037: stloc.1
L_0038: ldloc.1
L_0039: ldloc.0
L_003a: blt.s L_0010
L_003c: ret
}

...where an IL compiler is given ample amounts of information that
Tester.number should be treated differently.

I don't think it's safe, readable, or future-friendly to utilize syntax
strictly for its secondary consequences (using Monitor.Enter/Exit not for
synchronization but for acquire/release semantics; as in the above line,
where modification of an int is already atomic and "synchronization" is
irrelevant), even if it were effectively identical to another syntax. Yes,
if you've got a non-atomic invariant you still have to synchronize (with
lock, etc.)... but volatility is different and needs to be accounted for
equally as much as thread-safety.

-- Peter

Jon Skeet [C# MVP]

unread,
Jun 19, 2007, 11:12:16 AM6/19/07
to
On Jun 19, 3:11 pm, Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam>
wrote:

> I guess we'll just have to disagree on a few things, for the reasons I've
> already stated. I don't see much point in going back and forth saying the
> same things...

I should say (and I've only just remembered) that a few years ago I
was unsure where the safety came from, and I mailed someone (Vance
Morrison? Chris Brumme?) who gave me the explanation I've been giving
you.

> With regard to runtime volatile read/writes and acquire/release semantics of
> Monitor.Enter and Monitor.Exit we can agree.
>
> I don't agree that anything specified in either 334 or 335 covers all levels
> of potential compile-time class member JIT/IL compiler optimizations.

It specifies how the system as a whole must behave: given a certain
piece of IL, there are

> I don't agree that "int number; void UpdateNumber(){lock(locker){
> number++;}}" is equally as safe as "volatile int number; void UpdateNumber(){
> number++; }"

I agree - the version without the lock is *unsafe*. Two threads could
both read, then both increment, then both store in the latter case.
With the lock, everything is guaranteed to work.
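
Spelled out (a sketch of the interleaving, not from the original post),
with "volatile int number = 0;" and both threads running "number++":

// Thread A: reads number  -> 0
// Thread B: reads number  -> 0
// Thread A: writes number <- 1
// Thread B: writes number <- 1   (A's increment is lost)
// volatile makes each individual read/write visible,
// but ++ is still a non-atomic read-modify-write.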

> With the following Monitor.Enter/Exit IL, for example:

<snip>

> ...what part of that IL tells the JIT/IL compiler that Tester.number
> specifically should be treated differently--where lines commented // * are
> the only lines distinct to usage of Monitor.Enter/Exit?

The fact that it knows Monitor.Enter is called, so the load (in the
logical memory model) cannot occur before Monitor.Enter. Likewise it
knows that Monitor.Exit is called, so the store can't occur after
Monitor.Exit. If it calls another method which *might* call
Monitor.Enter/Exit, it likewise can't move the reads/writes as that
would violate the spec.

> ...where an IL compiler is given ample amounts of information that
> Tester.number should be treated differently.

It's being given ample

> I don't think it's safe, readable, or future friendly to utilize syntax
> strictly for their secondary consequences (using Monitor.Enter/Exit not for
> synchronization but for acquire/release semantics. As in the above line
> where modification of an int is already atomic; "synchronization" is
> irrelevant), even if they were effectively identical to another syntax. Yes,
> if you've got a non-atomic invariant you still have to synchronize (with
> lock, etc.)... but volatility is different and needs to be accounted for
> equally as much as thread-safety.

Again you're treating atomicity as almost interchangeable with
volatility, when they're certainly not. Synchronization is certainly
relevant whether or not writes are atomic. Atomicity just states that
you won't see a "half way" state; volatility states that you will see
the "most recent" value. That's a huge difference.

The volatility is certainly not just a "secondary consequence" - it's
vital to the usefulness of locking.

Consider a type which isn't thread-aware - in other words, nothing is
marked as volatile, but it also has no thread-affinity. That should be
the most common kind of type, IMO. You can't retrospectively mark the
fields as being volatile, but you *do* want to ensure that if you use
objects of the type carefully (i.e. always within a consistent lock)
you won't get any unexpected behaviour. Due to the guarantees of
locking, you're safe. Otherwise, you wouldn't be. Without that
guarantee, you'd be entirely at the mercy of type authors for *all*
types that *might* be used in a multi-threaded environment making all
their fields volatile.

Further evidence that it's not just a secondary effect, but one which
certainly *can* be relied on: there's no other thread-safe way of
using doubles. They *can't* be marked as volatile - do you really
believe that MS would build .NET in such a way that wouldn't let you
write correct code to guarantee that you see the most recent value of
a double, rather than one cached in a register somewhere?
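
In code, the point about doubles (a sketch; the type is illustrative, and
"volatile double celsius;" simply won't compile):

class Temperature
{
    private readonly object locker = new object();
    private double celsius;   // cannot be declared volatile

    public double Celsius
    {
        get { lock (locker) { return celsius; } }
        set { lock (locker) { celsius = value; } }
    }
}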

This *is* guaranteed, it's the normal way of working in the framework
(as Willy said, look for volatile fields in the framework itself) and
it's perfectly fine to rely on it.

Jon

Jon Skeet [C# MVP]

unread,
Jun 19, 2007, 11:23:31 AM6/19/07
to
On Jun 19, 4:12 pm, "Jon Skeet [C# MVP]" <s...@pobox.com> wrote:

<snip - looks like I didn't finish all of this>

> > I don't agree that anything specified in either 334 or 335 covers all levels
> > of potential compile-time class member JIT/IL compiler optimizations.
>
> It specifies how the system as a whole must behave: given a certain
> piece of IL, there are

It specifies how the system as a whole must behave: given a certain
piece of IL, there are valid behaviours and invalid behaviours. If you
can observe that a variable has been read before a lock has been
acquired and that value has then been used (without rereading) after
the lock has been acquired, then the CLR has a bug, pure and simple.
It violates the spec in a pretty clear-cut manner.

Jon

Ben Voigt [C++ MVP]

unread,
Jun 19, 2007, 12:57:28 PM6/19/07
to
> Consider a type which isn't thread-aware - in other words, nothing is
> marked as volatile, but it also has no thread-affinity. That should be
> the most common kind of type, IMO. You can't retrospectively mark the
> fields as being volatile, but you *do* want to ensure that if you use

You don't need to modify the type definition, you would need a volatile
variable of that type.


Peter Ritchie [C# MVP]

unread,
Jun 19, 2007, 1:51:02 PM6/19/07
to
> It specifies how the system as a whole must behave: given a certain
> piece of IL, there are valid behaviours and invalid behaviours. If you
> can observe that a variable has been read before a lock has been
> acquired and that value has then been used (without rereading) after
> the lock has been acquired, then the CLR has a bug, pure and simple.
> It violates the spec in a pretty clear-cut manner.

That's not the same thing as saying that Monitor.Enter and Monitor.Exit
are what maintain that behaviour.

In 335, section 12.6.5 has "[calling Monitor.Enter]...shall implicitly
perform a volatile read operation...", which says to me that one volatile
operation is performed. And "[calling Monitor.Exit]...shall implicitly
perform a volatile write operation." A write to what? As in this snippet:
Monitor.Enter(this.locker);
Trace.WriteLine(this.number);
Monitor.Exit(this.locker);

It only casually mentions "See [section] 12.6.7", which discusses acquire
and release semantics in the context of the volatile prefix (assuming the C#
volatile keyword is what causes generation of this prefix). 12.6.7 only
mentions "the read" or "the write"; it does not mention anything about a set
or block of reads/writes. I think you've made quite a leap getting to: code
between Monitor.Enter and Monitor.Exit has volatility guarantees.

Writing a sample "that works" is meaningless to me. I've dealt with
thousands of snippets of code "that worked" in certain circumstances (usually
resulting in me fixing them to "really work").

You're free to interpret the spec any way you want, and if you've gotten
information from Chris or Vance, you've got their interpretation of the spec.
and, best case, you've got information specific to Microsoft's JIT/IL
Compilers.

Based upon the spec, I *know* that this is safe code:
public volatile int number;
public void DoSomething() {
    this.number = 1;
}

This is equally as safe:
public volatile int number;
public void DoSomething() {
    lock(locker) {
        this.number = 1;
    }
}

I think it's open to interpretation of the spec whether this is safe:
public int number;
public void DoSomething() {
    lock(locker) {
        this.number = 1;
    }
}

...it might be safe in Microsoft's implementations; but that's not open
information and I don't think it's due to Monitor.Enter/Monitor.Exit.

I don't see what the issue with volatile is, if you're not using "volatile"
for synchronization. Worst case with this:
public volatile int number;
public void DoSomething() {
    this.number = 1;
}
you've explicitly stated your volatility usage/expectation: more readable,
makes no assumptions...

Whereas:
public int number;
public void DoSomething() {
    lock(locker) {
        this.number = 1;
    }
}

...best case, this isn't as readable because it uses implicit volatility
side-effects.

What happens with the following code?


internal class Tester {
    private Object locker = new Object();

    private Random random = new Random();

    public int number;

    public Tester()
    {
        DoWork(false);
    }

    public void UpdateNumber() {
        Monitor.Enter(locker);
        DoWork(true);
    }

    private void DoWork(Boolean doOut) {
        this.number = random.Next();
        if (doOut)
        {
            switch (random.Next(2))   // returns 0 or 1
            {
                case 0:
                    Out1();
                    break;
                case 1:
                    Out2();
                    break;
            }
        }
    }

    private void Out1() {
        Monitor.Exit(this.locker);
    }

    private void Out2() {
        Monitor.Exit(this.locker);
    }
}

...clearly there isn't enough information merely from the existence of
Monitor.Enter and Monitor.Exit to maintain those guarantees.


> Again you're treating atomicity as almost interchangeable with
> volatility,

<snip>
No, I'm not. I said you don't need to synchronize an atomic invariant, but
you still need to account for its volatility (by declaring it volatile). I
didn't say volatility was a secondary concern, I said it needs to be
accounted for equally. I was implying that using the "lock" keyword is not
as clear in terms of volatility assumptions/needs as the "volatile"
keyword. If I read some code that uses "lock", I can't assume the author
did that for volatility reasons and not just synchronization reasons; whereas
if she had put "volatile" on a field, I know for sure why she put it there.

> This *is* guaranteed, it's the normal way of working in the framework
> (as Willy said, look for volatile fields in the framework itself)

Which ones? Like Hashtable.version or StringBuilder.m_StringValue?


Jon Skeet [C# MVP]

unread,
Jun 19, 2007, 2:05:19 PM6/19/07
to

Just because the variable itself is volatile doesn't mean every access
would be volatile in the appropriate way. Consider:

public class Foo
{
    public int bar; // No, I'd never use a public field really...
}

public class AnotherClass
{
    volatile Foo x;

    void SomeMethod()
    {
        x.bar = 100;
    }
}

Now, you've got a volatile *read* (of x) but not a volatile *write* (of
bar) - and a volatile write is what you really want, to make sure that the
write is visible to other threads.

Jon Skeet [C# MVP]

unread,
Jun 19, 2007, 3:02:17 PM6/19/07
to
Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam> wrote:
> > It specifies how the system as a whole must behave: given a certain
> > piece of IL, there are valid behaviours and invalid behaviours. If you
> > can observe that a variable has been read before a lock has been
> > acquired and that value has then been used (without rereading) after
> > the lock has been acquired, then the CLR has a bug, pure and simple.
> > It violates the spec in a pretty clear-cut manner.
>
> That's not the same thing as saying use of Monitor.Enter and Monitor.Exit
> are what are used to maintain that behaviour.

Well, without that guarantee for Monitor.Enter/Monitor.Exit I don't
believe it would be possible to write thread-safe code.

> In 335 section 12.6.5 has "[calling Monitor.Enter]...shall implicitly
> perform a volatile read operation..." says to me that one volatile operation
> is performed. And "[calling Monitor.Exit]...shall implicitly perform a
> volatile write operation." A write to what? As in this snippet:
> Monitor.Enter(this.locker)
> Trace.WriteLine(this.number);
> Monitor.Exit(this.locker)

It doesn't matter what the volatile write is to - it's the location in
the CIL that matters. No other writes can be moved (logically) past
that write, no matter what they're writing to.
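
In code terms (a sketch; the names are mine, not from the thread's
examples):

class Publisher
{
    private readonly object locker = new object();
    private int data;   // ordinary, non-volatile field

    public void Publish()
    {
        lock (locker)
        {
            data = 42;  // this ordinary write cannot move (logically)
        }               // past the volatile write in Monitor.Exit
    }
}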

> It only casually mentions "See [section] 12.6.7", which discusses acquire
> and release semantics in the context of the volatile prefix (assuming the C#
> volatile keyword is what causes generation of this prefix).

I don't see what's "casual" about it, nor why you should believe that
12.6.7 should only apply to instructions with the "volatile." prefix.
The section starts off by mentioning the prefix, but then talks in
terms of volatile reads and volatile writes - which is the same terms
as 12.6.5 talks in.

> 12.6.7 only
> mentions "the read" or "the write" it does not mention anything about a set
> or block of read/writes. I think you've made quite a leap getting to: code
> between Monitor.Enter and Monitor.Exit has volatility guarantees.

I really, really haven't. I think the problem is the one I talk about
above - you're assuming that *what* is written to matters, rather than
just the location of a volatile write in the CIL stream. Look at the
guarantee provided by the spec:

<quote>
A volatile read has "acquire semantics" meaning that the read is
guaranteed to occur prior to any references to memory that occur after
the read instruction in the CIL instruction sequence. A volatile write
has "release semantics" meaning that the write is guaranteed to happen
after any memory references prior to the write instruction in the CIL
instruction sequence.
</quote>

Where does that say anything about it being dependent on what is being
written or what is being read? It just talks about reads and writes
being moved in terms of their position in the CIL sequence.

So, no write that occurs before the call to Monitor.Exit in the IL can
be moved beyond the call to Monitor.Exit in the memory model, and no
read that occurs after Monitor.Enter in the IL can be moved to earlier
than Monitor.Enter in the memory model. That's all that's required for
thread safety.

> Writing a sample "that works" is meaningless to me. I've dealt with
> thousands of snippets of code "that worked" in certain circumstances (usually
> resulting in me fixing them to "really work").

I'm not talking about certain circumstances - I'm talking about
*guarantees* provided by the CLI spec.

I'm saying that I can write code which doesn't use volatile but which
is *guaranteed* to work. I believe you won't be able to provide any
example of how it could fail without the CLI spec itself being
violated.

> You're free to interpret the spec any way you want, and if you've gotten
> information from Chris or Vance, you've got their interpretation of the spec.
> and, best case, you've got information specific to Microsoft's JIT/IL
> Compilers.

Well, I've got information specific to the .NET 2.0 memory model (which
is stronger than the CLI specified memory model) elsewhere.

However, I feel pretty comfortable having the interpretation of experts
who possibly contributed to the spec, or at least have direct contact
with those who wrote it.

> Based upon the spec, I *know* that this is safe code:
> public volatile int number;
> public void DoSomething() {
> this.number = 1;
> }
>
> This is equally as safe:
> public volatile int number;
> public void DoSomething() {
> lock(locker) {
> this.number = 1;
> }
> }
>
> I think it's open to interpretation of the spec whether this is safe:
> public int number;
> public void DoSomething() {
> lock(locker) {
> this.number = 1;
> }
> }

Well, this is why I suggested that I post a complete program - then you
could suggest ways in which it could go wrong, and I think I'd be able
to defend it in fairly clear-cut terms.



> ...it might be safe in Microsoft's implementations; but that's not open
> information and I don't think it's due to Monitor.Enter/Monitor.Exit.

I *hope* we won't just have to agree to disagree, but I realise that
may be the outcome :(

> I don't see what the issue with volatile is, if you're not using "volatile"
> for synchronization. Worst case with this:
> public volatile int number;
> public void DoSomething() {
> this.number = 1;
> }
> you've explicitly stated your volatility usage/expectation: more readable,
> makes no assumptions...

It implies that without volatility you've got problems - which you
haven't (provided you use locking correctly). This means you can use a
single way of working for *all* types, regardless of whether you can
use the volatile modifier on them.



> Whereas:
> public int number;
> public void DoSomething() {
> lock(locker) {
> this.number = 1;
> }
> }
>
> ...best case, this isn't as readable because it uses implicit volatility
> side-effects.

If you're not used to that being the idiom, you're right. However, if
I'm writing thread-safe code (most types don't need to be thread-safe)
I document what lock any shared data comes under. I can rarely get away
with a single operation anyway.

Consider the simple change from this:

this.number = 1;

to this:

this.number++;

With volatile, your code is now broken - and it's not obvious, and
probably won't show up in testing. With lock, it's not broken.
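
Side by side (a sketch; the fields are illustrative):

private volatile int a;
private int b;
private readonly object locker = new object();

public void BrokenWithVolatile() { a++; }              // racy read-modify-write
public void SafeWithLock() { lock (locker) { b++; } }  // still correct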

> What happens with the following code?
> internal class Tester {
> private Object locker = new Object();
> private Random random = new Random();
> public int number:
>
> public Tester()
> {
> DoWork(false);
> }
>
> public void UpdateNumber() {
> Monitor.Enter(locker);
> DoWork(true);
> }

What happens here is that I don't let this method go through code
review. There have to be *very* good reasons not to use lock{}, and in
those cases there would almost always still be a try/finally.

I wouldn't consider using volatile just to avoid the possibility of
code like this (which I've never seen in production, btw).


> private void DoWork(Boolean doOut) {
> this.number = random.Next();
> if(doOut)
> {
> switch(random.Next(1))
> {
> case 0:
> Out1();
> break;
> case 1:
> Out2();
> break;
> }
> }
> }
>
> private void Out1() {
> Montior.Exit(this.locker);
> }
>
> private void Out2() {
> Monitor.Exit(this.locker);
> }
> }
>
> ...clearly there isn't enough information merely from the existence of
> Monitor.Enter and Monitor.Exit to maintain those guarantees.

It's the other way round - the JIT compiler doesn't have enough
information to perform certain optimisations, simply because it can't
know whether or not Monitor.Exit will be called.

Assuming the CLR follows the spec, it can't move the write to number to
after the call to random.Next() - because that call to random.Next()
may involve releasing a lock, and it may involve a write.

Now, I agree that it really limits the scope of optimisation for the
JIT - but that's what the CLI spec says.

> > Again you're treating atomicity as almost interchangeable with
> > volatility,
> <snip>
> No, I'm not. I said you don't need to synchronize an atomic invariant but
> you still need to account for its volatility (by declaring it volatile). I
> didn't say volatility was a secondary concern, I said it needs to be
> accounted for equally. I was implying that using the "lock" keyword is not
> as clear in terms of volatility assumptions/needs as is the "volatile"
> keyword. If a I read some code that uses "lock", I can't assume the author
> did that for volatility reasons and not just synchronization reasons; whereas
> if she had put "volatile" on a field, I know for sure why she put that there.

I use lock when I'm going to use shared data. When I use shared data, I
want to make sure I don't ignore previous changes - hence it needs to
be volatile.

Volatility is a natural consequence of wanting exclusive access to a
shared variable - which is why exactly the same strategy works in Java,
by the way (which has a slightly different memory model). Without the
guarantees given by the CLI spec, having a lock would be pretty much
useless.

> > This *is* guaranteed, it's the normal way of working in the framework
> > (as Willy said, look for volatile fields in the framework itself)
>
> Which ones? Like Hashtable.version or StringBuilder.m_StringValue?

Yup, there are a few - but I believe there are far more places which
use the natural (IMO) way of sharing data via exclusive access, and
taking account the volatility that naturally provides.

Willy Denoyette [MVP]

unread,
Jun 19, 2007, 3:28:16 PM6/19/07
to
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:1182265936....@q75g2000hsh.googlegroups.com...


I see that my remark about the FCL was too strongly worded; I didn't mean to
say that "volatile" fields are not used at all in the FCL. Sure, they are
used, but only in a context where the author wanted to guarantee that a
field (most often a bool) access had acquire/release semantics and would not
be reordered, not in the context of a locked region. Also note that a large
part of the FCL was written against v1.0 (targeting X86 only), at a time
when there was no VolatileRead and long before the Interlocked class was
introduced.
The latest bits in the FCL more often use Interlocked and VolatileXXX
operations than the volatile modifier.
Also note that volatile does not imply a memory barrier, while lock,
Interlocked ops. and VolatileXXX do effectively imply a MemoryBarrier. The
way the barrier is implemented is platform specific; on X86 and X64 a full
barrier is raised, while on IA64 it depends on the operation.
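
For example (a minimal sketch of those APIs; the type and names are mine):

using System.Threading;

class Flags
{
    private int counter;
    private long big;

    public void Bump() { Interlocked.Increment(ref counter); }
    public int ReadCount() { return Thread.VolatileRead(ref counter); }
    public long ReadBig() { return Interlocked.Read(ref big); } // atomic 64-bit read
}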


Willy.

Peter Ritchie [C# MVP]

unread,
Jun 19, 2007, 4:49:31 PM6/19/07
to
"Jon Skeet [C# MVP]" wrote:
> I'm saying that I can write code which doesn't use volatile but which
> is *guaranteed* to work. I believe you won't be able to provide any
> exmaple of how it could fail without the CLI spec itself being
> violated.
Actually, I'm having a hard time getting the JIT to optimize *any* member
fields, even with a lack of locking. Local variables seem to be optimized
into registers easily, but not member fields...

If I could get an optimization of a member field I believe I would be able
show an example.

For example:


private Random random = new Random();

public int Method()
{
    int result = 0;
    for (int i = 0; i < this.random.Next(); ++i)
    {
        result += 10;
    }
    return result;
}

ebx is used for result (and edi for i) while in the loop; but with:


private Random random = new Random();

private int number;
public int Method()
{
    for (int i = 0; i < this.random.Next(); ++i)
    {
        this.number += 10;
    }
    return this.number;
}

...number is always accessed directly and never optimized to a register. I
think I'd find the same thing with re-ordering.

Jon Skeet [C# MVP]

unread,
Jun 20, 2007, 2:31:48 PM6/20/07
to
Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam> wrote:
> "Jon Skeet [C# MVP]" wrote:
> > I'm saying that I can write code which doesn't use volatile but which
> > is *guaranteed* to work. I believe you won't be able to provide any
> > exmaple of how it could fail without the CLI spec itself being
> > violated.

> Actually, I'm having a hard time getting the JIT to optimize *any* member
> fields, even with lack of locking. Local variables seem to optimized into
> registers easily, but not member fields...

I can well believe that, just as an easy way of fulfilling the spec.

> If I could get an optimization of a member field I believe I would be able
> show an example.

Well, rather than arguing from a particular implementation (which, as
you've said before, may be rather stricter than the spec requires) I'd
be perfectly happy arguing from the spec itself. Then at least if there
are precise examples where I interpret the spec to say one thing and
you interpret it a different way, we'll know exactly where our
disagreement is.

<snip code>

Willy Denoyette [MVP]

unread,
Jun 20, 2007, 3:07:18 PM6/20/07
to
"Peter Ritchie [C# MVP]" <PRS...@newsgroups.nospam> wrote in message
news:B3CC9E00-7F15-4259...@microsoft.com...


In your sample, the member field has to be read from the object location in
the GC heap, and after the addition it has to be written back to the same
location.
The write "this.number += ..." must be a store with release semantics to
fulfill the rules imposed by the CLR memory model. Note that this model
derives from the ECMA model!

The assembly code of the core part of the loop, looks something like this
(your mileage may vary):

mov eax,dword ptr [ebp-10h]
add dword ptr [eax+8],0Ah

here the object reference of the current instance (this) is loaded from
[ebp-10h] and stored in eax, after which 0Ah is added to the location of the
'number' field [eax+8].

Question is what else do you expect to optimize any further, and what are
you expecting to illustrate?

Willy.

Ben Voigt [C++ MVP]

unread,
Jun 20, 2007, 4:11:55 PM6/20/07
to

"Willy Denoyette [MVP]" <willy.d...@telenet.be> wrote in message
news:etCf632s...@TK2MSFTNGP02.phx.gbl...

I know what just *did* get illustrated -- that the .NET JIT doesn't optimize
nearly as well as the C++ optimizing compiler.

>
> Willy.
>


Peter Ritchie [C# MVP]

unread,
Jun 20, 2007, 10:58:02 PM6/20/07
to
Willy, I'm not following where you are going with your comment. As I've
said, my example should have been number = 10 (or something similar) to
capture the CLI atomicity guarantees; but either operation is optimized to a
single instruction on x86:

x86 for number += 10:
int count = random.Next();
00000000 56 push esi
00000001 8B F1 mov esi,ecx
00000003 8B 4E 04 mov ecx,dword ptr [esi+4]
00000006 8B 01 mov eax,dword ptr [ecx]
00000008 FF 50 3C call dword ptr [eax+3Ch]
0000000b 8B D0 mov edx,eax
for(int i = 0; i > count; ++i)
0000000d 33 C0 xor eax,eax
0000000f 85 D2 test edx,edx
00000011 7D 0B jge 0000001E
{
number += 10;
00000013 83 46 08 0A add dword ptr [esi+8],0Ah
for(int i = 0; i > count; ++i)
00000017 83 C0 01 add eax,1
0000001a 3B C2 cmp eax,edx
0000001c 7F F5 jg 00000013

and x86 for number = 10:
int count = random.Next();
00000000 56 push esi
00000001 8B F1 mov esi,ecx
00000003 8B 4E 04 mov ecx,dword ptr [esi+4]
00000006 8B 01 mov eax,dword ptr [ecx]
00000008 FF 50 3C call dword ptr [eax+3Ch]
0000000b 8B D0 mov edx,eax
for(int i = 0; i > count; ++i)
0000000d 33 C0 xor eax,eax
0000000f 85 D2 test edx,edx
00000011 7D 0E jge 00000021
{
number = 10;
00000013 C7 46 08 0A 00 00 00 mov dword ptr [esi+8],0Ah
for(int i = 0; i > count; ++i)
0000001a 83 C0 01 add eax,1
0000001d 3B C2 cmp eax,edx
0000001f 7F F2 jg 00000013

I don't know if your comment was supposed to show the adjacency of the
object reference load to the increment; but it's clearly not doing that (the
object reference load is hoisted out of the loop, before the Next() call,
which is where it's first needed). If that wasn't your point, pardon my
ramblings; but they do provide the basis for what follows...

But the difference here is irrelevant. It's the difference between a
member field and a local variable. Here is the x86 for the same code with a
local variable instead of a member field:
int count = random.Next();
00000000 8B C1 mov eax,ecx
00000002 8B 48 04 mov ecx,dword ptr [eax+4]
00000005 8B 01 mov eax,dword ptr [ecx]
00000007 FF 50 3C call dword ptr [eax+3Ch]
0000000a 8B D0 mov edx,eax
int result = 0;
0000000c 33 C9 xor ecx,ecx

for (int i = 0; i > count; ++i)

0000000e 33 C0 xor eax,eax
00000010 85 D2 test edx,edx
00000012 7D 0A jge 0000001E
{
result += 10;
00000014 83 C1 0A add ecx,0Ah

for (int i = 0; i > count; ++i)

00000017 83 C0 01 add eax,1
0000001a 3B C2 cmp eax,edx
0000001c 7F F6 jg 00000014

I was expecting the JIT to do much better optimizations (looping x times
assigning the same value to number) than it did. Sure, the difference
between a single add with a register and one with a memory location is
small... In the increment case, I was expecting something similar to the
local variable: using a register for the duration of the loop. If Next()
returned 10, the loop would effectively be:

number += 10; number += 10; number += 10; number += 10;
number += 10; number += 10; number += 10; number += 10;
number += 10; number += 10; // joined for brevity

All adjacent writes on the same thread, where optimizing to a register
would remove writes...

And, in fact, if you do 10 increments instead of a loop, the JIT *still*
won't optimize any writes away. I know it knows how, because it will do it
with local variables.

Which leads me to believe that the JIT implementation is giving people a
false sense of security by not introducing optimizations clearly accounted
for in the specs. Not to mention, it makes the whole discussion somewhat
moot.

Willy Denoyette [MVP]

unread,
Jun 21, 2007, 8:39:17 AM6/21/07
to
"Peter Ritchie [C# MVP]" <PRS...@newsgroups.nospam> wrote in message
news:8C109B5F-F143-4096...@microsoft.com...


This is because you run this code in a "managed debugger"; the JIT produces
different code from what is produced when no managed debugger is attached!
You need to run the code (release version) in a native debugger to see what
the JIT really produces for 'release' mode code.
As unmanaged debugger you can use any of the debuggers from the Debugging
Tools for Windows, like windbg, cdb etc... (which is what I prefer, because
they're more powerful than the VS2005 debugger).
You can also use VS2005 as an unmanaged debugger, but you need to make sure
you break into an unmanaged debugging session. That means you cannot use
System.Diagnostics.Debugger.Break(); you have to call Kernel32.dll's
DebugBreak().

Here is the PInvoke signature:
[DllImport("kernel32"), SuppressUnmanagedCodeSecurity] static extern void
DebugBreak();

Add a call to DebugBreak() in your code, run the program without debugging
(CTRL+F5) from within VS, and wait until the break is hit. Select 'Debug',
then select the current VS instance from the list in the "VS JIT Debugger"
dialog. When the break message appears, press 'Break' and in the
following dialog press 'Show Disassembly'.
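
Put together, it looks something like this (a sketch; the surrounding class
and Main are mine):

using System.Runtime.InteropServices;
using System.Security;

class NativeBreakDemo
{
    [DllImport("kernel32"), SuppressUnmanagedCodeSecurity]
    static extern void DebugBreak();

    static void Main()
    {
        // ... first run the code you want to inspect ...
        DebugBreak();   // breaks into an *unmanaged* debugging session
    }
}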

When you hit this point you'll see this (partly stripped):

X86, member variable number = 10;
...
0032013A mov eax,dword ptr [ebp-10h]
0032013D mov dword ptr [eax+8],0Ah
00320144 add edx,1
00320147 cmp edx,esi
00320149 jl 0032013A
0032014B mov eax,dword ptr [ebp-10h]
0032014E mov eax,dword ptr [eax+8]
......


The first two instructions are the load of the 'this' instance pointer into
eax, and the store of '0Ah' into the member field 'number' of 'this'.
This sequence is repeated until the loop counter reaches the count value.

> But, the difference here is irrelevent. It's the difference between a
> member field and a local variable. x86 for the same code with a local
> variable instead of a member field:
> int count = random.Next();
> 00000000 8B C1 mov eax,ecx
> 00000002 8B 48 04 mov ecx,dword ptr [eax+4]
> 00000005 8B 01 mov eax,dword ptr [ecx]
> 00000007 FF 50 3C call dword ptr [eax+3Ch]
> 0000000a 8B D0 mov edx,eax
> int result = 0;
> 0000000c 33 C9 xor ecx,ecx
> for (int i = 0; i > count; ++i)
> 0000000e 33 C0 xor eax,eax
> 00000010 85 D2 test edx,edx
> 00000012 7D 0A jge 0000001E
> {
> result += 10;
> 00000014 83 C1 0A add ecx,0Ah
> for (int i = 0; i > count; ++i)
> 00000017 83 C0 01 add eax,1
> 0000001a 3B C2 cmp eax,edx
> 0000001c 7F F6 jg 00000014
>
> I was expecting the JIT to do much better optimizations (looping x times
> assigning the same value to number) than it had.

That's right, the JIT optimizer is quite conservative when optimizing loops.
However, I don't know who writes code like this:

for (int i = 0; i < count; ++i)
    result = 10;

> Sure, the difference between a single add with a register and one with a
> memory location is small... In the increment case, I was expecting
> something similar to the local variable: using a register for the duration
> of the loop. If Next() returned 10, the loop would effectively be:
>
> number += 10; number += 10; number += 10; number += 10;
> number += 10; number += 10; number += 10; number += 10;
> number += 10; number += 10; // joined for brevity
>

Same here; this can be optimized by storing the number in a local before
running the loop and, once done, moving the local to the field variable (a
sketch follows). This is something you should do yourself whenever you are
dealing with field variables in long-running algorithms.
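
That manual hoisting would look like this (a sketch; 'number' is the field
from your example):

private int number;

private void UpdateNumberHoisted(int count)
{
    int local = this.number;   // read the field once
    for (int i = 0; i < count; ++i)
    {
        local += 10;           // the JIT can keep 'local' in a register
    }
    this.number = local;       // write the field once, after the loop
}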

Granted, in your original sample, the loop won't be optimized as
aggressively as a native compiler would do (a C compiler will hoist the loop
completely), but again, I don't know if anyone writes code like this.

> All adjacent writes on the same thread, where optimizing to a register
> being
> removing a write...
>
> And, in fact, if you do 10 increments instead of a loop, the JIT *still*
> won't optimize any writes away. I know it knows how; because it will do
> it
> with local variables.
>
> Which leads me to believe that the JIT implementation is giving people a
> false sense of security by not introducing optimizations clearly accounted
> for in the specs. Not to mention, it makes the whole discussion somewhat
> moot.
>

I don't know what this has to do with security and the specs; this is about
loop optimizing, right?

Willy.

Peter Ritchie [C# MVP]

unread,
Jun 21, 2007, 10:07:01 AM6/21/07
to
"Willy Denoyette [MVP]" wrote:
<snip>

> This is because you run this code in a "managed debugger", the JIT produces
> different code from what is produced when no managed bedugger is attached!
> You need to run the code (release version) in a native debugger to see what
> the JIT really produces for 'release' mode code.
> As unmanaged debugger you can use any of the debuggers from the Debugging
> Tools for Windows like windbg, sdb etc... (Which is what I prefer, because
> it's more powerfull as the VS2005 debugger)
> You can also use VS2005 as unmanaged debugger, but you need to make sure you
> break into an unmanaged debugging session. That means you cannot use
> System.Diagnostics.Debugger.Break(), you have to call Kernel32.dll
> DebugBreak().

I've been using Vance Morrison's guide to observing optimized managed code [1].

> That's right, the JIT optimizer is quite conservative when optimizing loops.

As I already pointed out, it optimized identical loops--not using member
fields.

> However, I don't know who writes code like this:
> for(int i = 0; i < count ; ++i)
> result = 10;

That's irrelevant, the optimizer doesn't know who's likely to write what
code. The exercise is to show optimized code.

> Same here, this can be optimized by storing the number in a local before
> running the loop and once done moving the local to the field variable. This
> is something you should do whenever you are dealing with field variable in
> long running algorithms.
>
> Granted, in the sample above, the loop won't be optimized as agressively as
> a native compiler would do (a C compiler will hoist the loop completely),

Huh? In the post you replied to I showed an example where the JIT *did*
hoist the loop completely, just not with member fields.

> Don't know what this has to do with security and the specs, this is about
> loop optimizing, right?.

No, as I pointed out, it's about getting an example of JIT optimization of
member fields. It doesn't have to be a loop, it's just loop optimization is
easy to have generated.

[1] http://blogs.msdn.com/vancem/archive/2006/02/20/535807.aspx

Willy Denoyette [MVP]

unread,
Jun 21, 2007, 12:15:38 PM6/21/07
to
"Peter Ritchie [C# MVP]" <PRS...@newsgroups.nospam> wrote in message
news:0BC83EEC-7D0B-4E28...@microsoft.com...

> "Willy Denoyette [MVP]" wrote:
> <snip>
>> This is because you run this code in a "managed debugger", the JIT
>> produces
>> different code from what is produced when no managed debugger is
>> attached!
>> You need to run the code (release version) in a native debugger to see
>> what
>> the JIT really produces for 'release' mode code.
>> As unmanaged debugger you can use any of the debuggers from the Debugging
>> Tools for Windows like windbg, sdb etc... (Which is what I prefer,
>> because
>> it's more powerful than the VS2005 debugger)
>> You can also use VS2005 as unmanaged debugger, but you need to make sure
>> you
>> break into an unmanaged debugging session. That means you cannot use
>> System.Diagnostics.Debugger.Break(), you have to call Kernel32.dll
>> DebugBreak().
>
> I've been using Vance Morrison's guide observing optimized managed code
> [1]
>

Which is wrong; it doesn't show the machine code you would get when not
running under a managed debugger (the VS debugger or mdbg). The CLR knows
that it is running under a managed debugger via the managed debugger
interfaces (ICorDebug - COM interfaces), and forces the JIT to produce
different code than it would when no managed debugger is attached! What
Vance calls optimized code is not what the JIT produces when run outside of
the debugger. That's why I always use windbg to analyze assembly code.

You don't have to believe me; just do as I said and try to run the code in
windbg (you can download the latest builds for free from
http://www.microsoft.com/whdc/devtools/debugging/default.mspx), or, as I
have explained in my previous post, use VS2005, but take care not to run in
the VS debugger!
If you still don't believe me, you can ngen your code and run "dumpbin
/rawdata program.ni.exe", where "program" is the name of the assembly. The
ngen'd image can be found in:
C:\Windows\assembly\NativeImages_v2.0.50727_32\blable....

The output should contain something like:

30002640: 8B 45 F0 C7 40 08 0A 00 00 00 83 C2 01 3B D1 7C .E­Ã@......┬.;Ð|
30002650: EF 8B 45 F0 8B 40 08 8B 7D CC 89 7E 0C 8D 65 F4 ´.E­.@..}╠.~..e¶
30002660: 5B 5E 5F 5D C3 CC CC CC BF 27 00 30 6A 46 00 30 [^_]├╠╠╠┐'.0jF.0
....

To find the exact addresses you will have to look at the unmanaged debugger
output...
Following is how it looks like when I ran this in windbg:

vols2_ni!Willys.Test.Method2()+0x7c:
30002640 8b45f0 mov eax,dword ptr [ebp-10h]
30002643 c740080a000000 mov dword ptr [eax+8],0Ah
3000264a 83c201 add edx,1
3000264d 3bd1 cmp edx,ecx
3000264f 7cef jl vols2_ni!Willys.Test.Method2()+0x7c
(30002640)

Now you just have to compare the sequences of bytes.
What I see when running in windbg, cdb or VS2005's VSJIT debugger and what
I see produced by ngen are exactly the same; are you telling me that what I
see is not correct?

>> That's right, the JIT optimizer is quite conservative when optimizing
>> loops.
>
> As I already pointed out, it optimized identical loops--not using member
> fields.
>
>> However, I don't know who writes code like this:
>> for(int i = 0; i < count ; ++i)
>> result = 10;
>
> That's irrelevant, the optimizer doesn't know who's likely to write what
> code. The exercise is to show optimized code.
>
>> Same here, this can be optimized by storing the number in a local before
>> running the loop and once done moving the local to the field variable.
>> This
>> is something you should do whenever you are dealing with field variable
>> in
>> long running algorithms.
>>
>> Granted, in the sample above, the loop won't be optimized as aggressively
>> as
>> a native compiler would do (a C compiler will hoist the loop completely),
>
> Huh? In the post you replied to I showed an example where the JIT *did*
> hoist the loop completely, just not with member fields.
>

Yes, but again you were running in the managed debugger!

>> Don't know what this has to do with security and the specs, this is
>> about
>> loop optimizing, right?.
>
> No, as I pointed out, it's about getting an example of JIT optimization of
> member fields. It doesn't have to be a loop, it's just loop optimization
> is
> easy to have generated.
>
> [1] http://blogs.msdn.com/vancem/archive/2006/02/20/535807.aspx
>

Willy.

Peter Ritchie [C# MVP]

unread,
Jun 21, 2007, 1:13:00 PM6/21/07
to
It appears the x86 JIT's (or rather its design team's) interpretation is
somewhat similar to my interpretation, in that compile-time optimization
restrictions and run-time acquire/release semantics are separate
considerations.

The JIT is using the protected region as the guard in which not to optimize,
not Monitor.Enter/Monitor.Exit. Anything within a try block will be
considered a volatile operation by the compile-time optimizer; it has nothing
to do with Enter/Exit or directly with "lock" (other than lock being
implemented with a try block). Secondary to that, and outside of any
documentation I've been able to find, it appears that all member access (I
only tried instance, not class) is considered a volatile operation by the
JIT in terms of optimizations, regardless of being in or out of a protected
region (obviously acquire/release semantics are the responsibility of
Enter/Exit, MemoryBarrier, etc. and are not implicitly obtained without
them). Therefore acquire/release semantics guarantees do not directly affect
what the JIT decides not to optimize.

For example:
int result = 0;
Monitor.Enter(locker);
result += 10;
result += 10;
Monitor.Exit(locker);
result += 10;
result += 10;
result += 10;
return result;

...is compile-time optimized by the JIT to the equivalent of:
Monitor.Enter(locker);
Monitor.Exit(locker);
return 50;

...and you get acquire/release semantics on nothing in the current threads
(other than "locker" in the call to Exit).

And:
int result = 0;
try
{
    result += 10;
    result += 10;
}
finally {
    result += 10;
    result += 10;
    result += 10;
}
return result;

...is compile-time optimized by the JIT to the equivalent of:
int result = 0;
try
{
    result += 10;
    result += 10;
}
finally {
    result += 30;
}
return result;

...but I do not get acquire/release semantics within the try block.

And finally:

int result = 0;
Monitor.Enter(locker);
try {
    result += 10;
    result += 10;
} finally {
    Monitor.Exit(locker);
    result += 10;
    result += 10;
    result += 10;
}
return result;

...is compile-time optimized by the JIT to the equivalent of:

int result = 0;
Monitor.Enter(locker);
try {
    result += 10;
    result += 10;
} finally {
    Monitor.Exit(locker);
    result += 30;
}
return result;

...and this is the only example where I get the JIT optimization AND
acquire/release semantics guarantees you've been talking about.

I don’t believe this compile-time optimization behaviour is covered clearly,
if at all, in 335.

--
Browse http://connect.microsoft.com/VisualStudio/feedback/ and vote.
http://www.peterRitchie.com/blog/
Microsoft MVP, Visual Developer - Visual C#

Ben Voigt [C++ MVP]

unread,
Jun 21, 2007, 1:43:09 PM6/21/07
to

> This is because you run this code in a "managed debugger", the JIT
> produces
> different code from what is produced when no managed bedugger is attached!

The JIT produces different code when *started* in a debugger. When a
debugger is attached later, managed or not, the optimized code is already
generated.


Willy Denoyette [MVP]

unread,
Jun 21, 2007, 2:01:31 PM6/21/07
to
"Ben Voigt [C++ MVP]" <r...@nospam.nospam> wrote in message
news:ecY0PvCt...@TK2MSFTNGP06.phx.gbl...

Yep, but this is not the case when an "unmanaged debugger" is attached.
Using an unmanaged debugger like sdb, you can break into the debugger before
the CLR is even loaded; after JITing you will get 'fidelity' code. Unmanaged
debuggers aren't using the ICorDebug COM interface to interact with the
CLR (using the CLR's debugger thread as present in any managed code
process).
Note that when running in the VS debugger, you can get the same behavior; you
only need to take care not to break using
System.Diagnostics.Debugger.Break(), else you will get ICorDebug as the
interface and the CLR will signal the presence of a managed debugger to the
JIT.

Willy.

Jon Skeet [C# MVP]

unread,
Jun 21, 2007, 5:52:42 PM6/21/07
to
Peter Ritchie [C# MVP] <PRS...@newsgroups.nospam> wrote:
> It appears the x86 JIT's (or rather its design team's) interpretation is
> somewhat similar to mine, in that compile-time optimization restrictions
> and run-time acquire/release semantics are separate considerations.

They can be implemented separately without the team having decided that
our reading of the spec is incorrect.

> The JIT is using the protected region as the guard in which not to optimize,
> not Monitor.Enter/Monitor.Exit. Anything within a try block will be
> considered a volatile operation by the compile-time optimizer; it has nothing
> to do with Enter/Exit or directly with "lock" (other than lock is implemented
> with a try block).

That may be *a* type of optimisation blocking - it doesn't mean it's
the only one.

> Secondary to that, and outside of any documentation I've
> been able to find, it appears that all member access (only tried instance, not
> class) is considered a volatile operation by the JIT in terms of
> optimizations, regardless of being in or out of a protected region (obviously
> acquire/release semantics are the responsibility of Enter/Exit, MemoryBarrier,
> etc. and are not implicitly obtained without them). Therefore acquire/release
> semantics guarantees do not directly affect what the JIT decides not to
> optimize.

Unless the reason the JIT decided not to optimise *anything* for member
variables is because it's simpler than trying to work out exactly where
it can and can't optimise due to Monitor.Enter/Exit.

By not reordering member access, the JIT is automatically complying
with the spec without having to do any extra checking. That doesn't
mean that the guarantees given by the spec don't apply - just that the
JIT is being stricter than it needs to.

> For example:
> int result = 0;
> Monitor.Enter(locker);
> result += 10;
> result += 10;
> Monitor.Exit(locker);
> result += 10;
> result += 10;
> result += 10;
> return result;
>
> ...is compile-time optimized by the JIT to the equivalent of:
> Monitor.Enter(locker);
> Monitor.Exit(locker);
> return 50;

That's certainly interesting - but the difference can't be *observed*
because no other thread has access to the value on that thread's stack.

I'll readily confess that I can't see where that's made clear in the
spec, unless it's section 12.6.4. It certainly makes sense
though - optimising within a stack can be done easily without
introducing bugs.

Another argument in favour of this is that the volatile prefix can't be
applied to the ldloc instruction - it's only applicable for potentially
shared data:

<quote>
The volatile. prefix specifies that addr is a volatile address (i.e.,
it can be referenced externally to the current thread of execution) and
the results of reading that location cannot be cached or that multiple
stores to that location cannot be suppressed.
</quote>



> ...and you get acquire/release semantics on nothing in the current threads
> (other than "locker" in the call to Exit).

Again you're talking about acquire/release semantics *on* something -
which is something the spec doesn't talk about. It talks about
acquire/release semantics at a particular point in time.

> And:

<snip example with try/finally>

> I don?t believe this compile-time optimization behaviour is covered clearly,

> if at all, in 335.

The spec doesn't talk about compile-time vs run-time optimisation
though - it talks about observable behaviour. As a developer trying to
write code which is guaranteed to work against the spec, I don't care
whether the JIT has to do more or less work depending on the CPU it's
on - I just care that my code works in all situations.

I still believe the spec guarantees that for the situation I've
specified.

Peter Ritchie [C#MVP]

unread,
Jun 22, 2007, 1:04:14 AM6/22/07
to
It's somewhat moot, I feel, at this point to discuss it much further, other
than to continue to say each other's interpretation is different. But, if
you're interested in arguing from the spec itself... If you want to take it
offline, just reply to the email address you have for me.

The behaviour I've observed proves nothing about anyone's interpretation of
the spec; it merely speaks to what appears to be the JIT's opinion of what a
volatile operation is, despite what the spec says. What I've observed shows
optimization blocking and acquire/release semantics are at least considered
independently (I'll admit it's not proof that the JIT does or does not take
into account Enter/Exit calls, but it can't use that to decide how it should
generate assembler for another method, regardless of whether a JIT allows
any optimization of observable read/writes). The combination of behaviour
I've observed may suggest the JIT is attempting to guarantee observable
read/writes can't be reordered and therefore must be flushed between Enter
and Exit (which is good); but that neither proves your interpretation of the
spec nor disproves mine.

If the MS x86 JIT does not in fact optimize member fields to fulfil
that/those particular guarantee(s), that really just substantiates my
assertion that the spec. is unclear and that your rebuttal is interpretive.
It's also a bit contradictory to information you've said you received from
Vance and Chris. No offence or implication that you didn't receive that
information; just that it seems contradictory, if it's indeed true that the
JIT doesn't optimize member fields and therefore does not need to look for
Enter/Exit...

You're readily confessing that the reasons for the side-effects you're
relying upon are not clear in the spec, yet you're still advocating reliance
upon the side-effects (arguing from the spec itself)?

To reduce typing:

"Conforming implementations of the CLI are free to execute programs using
any technology that guarantees, within a single thread of execution, that
side-effects and exceptions generated by a thread are visible in the order
specified by the CIL. For this purpose only volatile operations (including
volatile reads) constitute visible side-effects."

I'll call that statement optimization allowance 1 (OA1).

"Acquiring a lock (System.Threading.Monitor.Enter or entering a synchronized
method) shall implicitly perform a volatile read operation, and releasing a
lock (System.Threading.Monitor.Exit or leaving a synchronized method) shall

implicitly perform a volatile write operation."

I'll call that statement locking rule 1 (LR1).
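
For context, the lock statement is specified (in the current C# spec) to
expand to exactly this Enter/Exit pairing; a minimal sketch, with "locker"
as an illustrative lock object:

Monitor.Enter(locker);    // LR1's implicit volatile read happens here
try {
    // ...body of the lock statement...
}
finally {
    Monitor.Exit(locker);    // LR1's implicit volatile write happens here
}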

> Again you're talking about acquire/release semantics *on* something -
> which is something the spec doesn't talk about. It talks about
> acquire/release semantics at a particular point in time.

Semantics. In that context, the acquire/release semantics end up flushing
nothing to memory: no values are read after the Enter and before the Exit
other than "locker" (and reading it merely supports the existence of
Enter/Exit), and there are no observable writes. It's an example that shows
the lack of clarity of LR1 (i.e. an "implicitly perform[ing] a volatile
write" of what?), the only association between Enter/Exit and
acquire/release. Yes, it has the side effect of having flushed values to
memory for subsequent reads (the acquire semantics), but that makes no
guarantees for the instructions immediately following Exit and therefore no
guarantees on any reads.

What LR1 implies for acquire/release semantics hinges on whether you believe
the implication that everything on and after a call to Enter and before a
call to Exit constitutes one volatile read and that the call to Exit
constitutes one volatile write. No matter how you interpret that paragraph
it's unclear. Regardless of interpretation it still leaves the code between
related Enter and Exit calls in a black hole (ignoring the fact there is no
syntax ensuring related Enter/Exit calls occur in the same block, the same
method, or even the same assembly). With your interpretation the release
semantics for the block occur at the call to Exit, which leaves any writes
within the block without release semantics until the end of the block and
therefore makes no guarantees any writes are visible to other threads until
Exit.

Without clarity of LR1, you can neither make the connection between
acquire/release semantics and with Enter/Exit nor, therefore, the connection
to observable side-effect guarantees.

> The spec doesn't talk about compile-time vs run-time optimisation
> though - it talks about observable behaviour.

And that has been my point. Without taking into account what the JIT *is*
doing, your interpretation of the guarantee(s) means the following is safe:
//thread A:
instance.firstIntMember = 1;
instance.firstIntMember = 2;

//thread B:
instance.secondIntMember = 3;
instance.secondIntMember = 4;

//thread C:
Monitor.Enter(locker)
instance.otherMember = instance.firstIntMember;
instance.anotherMember = instance.secondIntMember;
Monitor.Exit(locker);

...including atomicity rules: the assignment to otherMember in C is
"guaranteed" to see any observable side-effects made to firstIntMember, and
the assignment to anotherMember in C is "guaranteed" to see any observable
side-effects made to secondIntMember. And yet, nowhere in that code is
there enough information for a JIT to make any decisions what and what not
to optimize in A and B, especially considering thread A code and thread B
code are likely in different methods than C and that they could be in
different assemblies: JITted independently. OA1 suggests it could optimize
away assignment of 1 in A and 3 in B because that isn't observable "within
[that] single thread of execution."

Is it good code? Of course not. Should it pass code review? Of course not.
What is and isn't sanctioned code is outside the domain of a C# compiler,
the JIT, or the CLI. The point is the spec is unclear in this area and to
use it as a crutch to support using syntax because of its side-effects is,
in my opinion, not a good practice. Using observed behaviour as a crutch is
better; but if the behaviour doesn't match the spec it's subject to change
and, again, not a good practice.


Jon Skeet [C# MVP]

unread,
Jun 22, 2007, 3:03:00 AM6/22/07
to
On Jun 22, 6:04 am, "Peter Ritchie [C#MVP]" <prs...@newsgroups.nospam>
wrote:

<snip>

> If the MS x86 JIT does not in fact optimize member fields to fulfil
> that/those particular guarantee(s), that really just substantiates my
> assertion that the spec. is unclear and that your rebuttal is interpretive.
> It's also a bit contradictory to information you've said you received from
> Vance and Chris. No offence or implication that you didn't receive that
> information; just that it seems contradictory, if it's indeed true that the
> JIT doesn't optimize member fields and therefore does not need to look for
> Enter/Exit...

I don't have time to reply to everything right now (and replying to
large posts with a web browser is annoying anyway) but I've found the
information which confirms what I was saying about Vance:

http://discuss.develop.com/archives/wa.exe?A2=ind0203B&L=DOTNET&P=R375&I=-3

I mailed Vance at the time to clarify how exactly things were
guaranteed, and he came back with a reply which is *either* word for
word *or* my summary (I've just found this from a 4 year old post I
made reporting back) - bear in mind this is the previous version of
the spec, hence the numbering changes:

<quote>
Section 11.6.7 makes guarantees about volatile reads/writes, effectively
making them memory barriers.

Section 11.6.5 (last paragraph) states that acquiring a lock implicitly
performs a volatile read operation, and releasing a lock implicitly
performs a volatile write operation.

The two in conjunction make the appropriate guarantees.
</quote>

Jon

Peter Ritchie [C#MVP]

unread,
Jun 23, 2007, 7:40:54 PM6/23/07
to
That goes to how an MS JIT in .NET 1.x may have been implemented. An
obscure post by Vance does nothing to clarify the spec for everyone else,
including those writing JITs on other platforms or for other processors.
I'd comment more on what Vance posted; but we're talking about the spec, not
what was implemented--it clearly doesn't follow the spec to the letter
already (good or bad).

"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message

news:1182495780.4...@o61g2000hsh.googlegroups.com...

Willy Denoyette [MVP]

unread,
Jun 24, 2007, 11:50:28 AM6/24/07
to
"Peter Ritchie [C#MVP]" <prs...@newsgroups.nospam> wrote in message
news:%23KQVy$etHHA...@TK2MSFTNGP02.phx.gbl...

> That goes to how an MS JIT in .NET 1.x may have been implemented. An
> obscure post by Vance does nothing to clarify the spec for everyone else,
> including those writing JITs on other platforms or for other processors.
> I'd comment more on what Vance posted; but we're talking about the spec,
> not what was implemented--it clearly doesn't follow the spec to the letter
> already (good or bad).
>

True, the CLR V2 and JIT compilers (MS, Mono) do not strictly follow the
ECMA spec's "memory model"; the V2 CLR is currently based on a redesigned
memory model, and it's the first (and sole?) implementation of the CLI that
targets X64 and IA64. The ECMA spec's and V1.X memory model, whose target
was X86, was considered "too weak to be used as a viable platform to write
reliable code" on architectures with memory models weaker than X86.
Check C. Brumme's blog post here:
http://blogs.msdn.com/cbrumme/archive/2003/05/17/51445.aspx, for some more
details on the rationale of this change.

Willy.

Peter Ritchie [C# MVP]

unread,
Jun 24, 2007, 4:33:01 PM6/24/07
to
"Jon Skeet [C# MVP]" wrote:

> On May 24, 3:52 pm, "Ben Voigt" <r...@nospam.nospam> wrote:
> > >> Inside the implementation.
> >
> > > No, inside the JIT which has to notice that you call Interlocked.
> >
> > How can it, with dynamically generated code and reflection?
>
> By refusing to reorder memory reads/writes around method calls - which
> is exactly what it does, I believe.

That's an implementation detail as Ben has pointed out. Where in the spec
does it mention guarantees that reads/writes around *all* method calls won't
be reordered or will be considered volatile that aren't already considered
volatile? Only "operations on the Interlocked class" could be viewed as
covered...

> > C++ has the concept of pointer (or reference) to volatile, letting a
> > function declare to its callers that a parameter will be treated as
> > volatile. Doesn't .NET have the same? It seems like the only way to
> > correctly resolve CS0420.
> >
> > Well, .NET Reflector doesn't show any type of annotation on Interlocked.Add,
> > for example, so I guess it does not.
>
> I don't know, to be honest. I haven't seen any such thing.

The only conceivable thing is the IsVolatileAttribute which the C++/CLI
compiler uses to emit volatile prefixes. It would be dependent on the
language compiler compiling the code that uses the method whose parameter has
the IsVolatileAttribute to write appropriately volatile-guaranteeing IL up
to the point of method call though. I don't know if anything other than
C++/CLI uses the IsVolatileAttribute; but it should be pretty easy to find
out... In the specific case of calls to Interlocked methods, the compiler is
just supposed to know--if it is to be CLI compliant.

Peter Ritchie [C# MVP]

unread,
Jun 24, 2007, 5:12:00 PM6/24/07
to
Actually, the IsVolatileAttribute is wrapped in a modreq which, as with
modopt, isn't used in C#'s best-match algorithm [1], which means it won't
even attempt to call a method declared in C++/CLI with a volatile parameter
(a compile error if it can't find a matching method without the modreq, and a
short bout of insanity if you don't know why).

C++/CLI just doesn't get its just kudos...

[1]
http://codebetter.com/blogs/gregyoung/archive/2007/06/11/more-on-mopopt.aspx

Ben Voigt [C++ MVP]

unread,
Jun 25, 2007, 9:21:20 AM6/25/07
to
> "Conforming implementations of the CLI are free to execute programs using
> any technology that guarantees, within a single thread of execution, that
> side-effects and exceptions generated by a thread are visible in the order
> specified by the CIL. For this purpose only volatile operations (including
> volatile reads) constitute visible side-effects."

That last sentence is pretty clear. Frequent use of volatile is needed for
locking to work properly. Maybe somewhere it is stated that all member
accesses are considered volatile (that's the behavior this thread has
observed, that member field access isn't subject to any optimization)?


Jon Skeet [C# MVP]

unread,
Jun 25, 2007, 9:40:25 AM6/25/07
to

You certainly need volatile reads and writes - but that doesn't mean
you need to specify that fields are volatile, as calls to
Monitor.Enter/Exit act as volatile reads and writes.

Jon

Willy Denoyette [MVP]

unread,
Jun 25, 2007, 10:48:26 AM6/25/07
to
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:1182778825....@p77g2000hsh.googlegroups.com...


Next to Monitor.Enter & Exit, the framework also provides VolatileRead,
VolatileWrite and MemoryBarrier; without all these "volatile operations" it
would be nearly impossible to write "sequentially consistent" code in VB, as
it lacks the "volatile" modifier.
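
For example, a sketch of the kind of thing I mean (the field and method
names are just for illustration):

using System.Threading;

class Worker
{
    private int running = 1;    // ordinary field; VB has no "volatile" modifier

    public void Stop()
    {
        // volatile write: the store gets release semantics
        Thread.VolatileWrite(ref running, 0);
    }

    public void Loop()
    {
        // volatile read: acquire semantics, so the read can't be hoisted
        // out of the loop and cached in a register
        while (Thread.VolatileRead(ref running) != 0)
        {
            // do work
        }
    }
}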

Willy.

Ben Voigt [C++ MVP]

unread,
Jun 25, 2007, 1:24:39 PM6/25/07
to

"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:1182778825....@p77g2000hsh.googlegroups.com...

But acquire and release semantics (which Monitor.Enter/Exit exhibit) only
guarantee the order of "visible side-effects" and according to the sentence
quoted, have *no effect* on the ordering of non-volatile operations.


Jon Skeet [C# MVP]

unread,
Jun 25, 2007, 2:48:16 PM6/25/07
to
Ben Voigt [C++ MVP] <r...@nospam.nospam> wrote:
> >> > "Conforming implementations of the CLI are free to execute programs
> >> > using
> >> > any technology that guarantees, within a single thread of execution,
> >> > that
> >> > side-effects and exceptions generated by a thread are visible in the
> >> > order
> >> > specified by the CIL. For this purpose only volatile operations
> >> > (including
> >> > volatile reads) constitute visible side-effects."
> >>
> >> That last sentence is pretty clear. Frequent use of volatile is needed
> >> for
> >> locking to work properly. Maybe somewhere it is stated that all member
> >> accesses are considered volatile (that's the behavior this thread has
> >> observed, that member field access isn't subject to any optimization)?
> >
> > You certainly need volatile reads and writes - but that doesn't mean
> > you need to specify that fields are volatile, as calls to
> > Monitor.Enter/Exit act as volatile reads and writes.
>
> But acquire and release semantics (which Monitor.Enter/Exit exhibit) only
> guarantee the order of "visible side-effects" and according to the sentence
> quoted, have *no effect* on the ordering of non-volatile operations.

Ah, I see what you're getting at now. Interesting. The very next
sentence in the spec is interesting though:

<quote>
(Note that while only volatile operations constitute visible side-effects,
volatile operations also affect the visibility of non-volatile references.)
</quote>

The rationale part is also interesting:

<quote>
[Rationale: An optimizing compiler is free to reorder side-effects and
synchronous exceptions to the extent that this reordering does not
change any observable program behavior. end rationale]
</quote>

I don't think this is meant to take away from the guarantees made in
12.6.7 about not reordering references to memory around volatile
operations - I *think* it's just meant to give a bit more leeway so
that if you have four operations, the first and last of which are
volatile but the middle two of which aren't, the middle two *can* be
reordered.

My annotated copy of the spec is at work, unfortunately - I should have
looked at it ages ago in this thread to see if it's any use.

As Willy said though, without these guarantees it does make multi-threaded
programming a bit of a joke. Even in C# which *does* have the volatile
modifier, you can't make a "double" variable volatile. I believe any
reading of the spec which makes it impossible to consistently fetch the
most recently written value of a double is going over the top.

Ben Voigt [C++ MVP]

unread,
Jun 26, 2007, 10:31:27 AM6/26/07
to
> As Willy said though, without these guarantees it does make multi-
> threaded programming a bit of a joke. Even in C# which *does* have the

Unless there's a statement somewhere that access to member fields is always
volatile... which would explain the complete lack of optimization by the
compiler in such cases.

Jon Skeet [C# MVP]

unread,
Jun 26, 2007, 12:55:41 PM6/26/07
to
Ben Voigt [C++ MVP] <r...@nospam.nospam> wrote:
> > As Willy said though, without these guarantees it does make multi-
> > threaded programming a bit of a joke. Even in C# which *does* have the
>
> Unless there's a statement somewhere that access to member fields is always
> volatile... which would explain the complete lack of optimization by the
> compiler in such cases.

That would make the ability to make member fields volatile just as
silly though :)

Willy Denoyette [MVP]

unread,
Jun 26, 2007, 1:09:41 PM6/26/07
to
"Ben Voigt [C++ MVP]" <r...@nospam.nospam> wrote in message
news:uejJ17$tHHA...@TK2MSFTNGP05.phx.gbl...

>> As Willy said though, without these guarantees it does make multi-
>> threaded programming a bit of a joke. Even in C# which *does* have the
>
> Unless there's a statement somewhere that access to member fields is
> always volatile... which would explain the complete lack of optimization
> by the compiler in such cases.
>

Monitor.Enter is a downwards fence, so no reads can move before it, which
means that the *first* read of a publicly visible memory location must be
a load acquire. After the load acquire, the JIT is free to apply all
possible optimizations to the value read, respecting the rules as imposed by
the memory model.
Monitor.Exit is an upwards fence, that means no writes can move after it, so
the last write to a publicly visible memory location must be a store
release.
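
As an illustration (the field names are mine; the comments just restate the
fence rules above):

int x;                        // a publicly visible field
object locker = new object();

void Sample()
{
    Monitor.Enter(locker);    // downwards fence: the read of x below can't move above this
    int local = x;            // the first read of x must be a load acquire
    local *= 2;               // work on the local copy can be optimized freely
    x = local;                // the last write to x...
    Monitor.Exit(locker);     // ...can't move below this upwards fence (store release)
}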


Willy.

Ben Voigt [C++ MVP]

unread,
Jun 26, 2007, 1:58:48 PM6/26/07
to
> Monitor.Enter is a downwards fence, so no reads can move before it, which
> means that the *first* read of a publicly visible memory location must
> be a load acquire. After the load acquire, the JIT is free to apply all
> possible optimizations to the value read, respecting the rules as imposed
> by the memory model.
> Monitor.Exit is an upwards fence, that means no writes can move after it,
> so the last write to a publicly visible memory location must be a store
> release.

The only way that can work is if *every* (non-inlined) method call is
treated as a memory barrier, in case it results in a volatile operation or
call to Monitor.Enter/Exit at some point.

Or else you're confusing the issue with this "publicly visible memory
location" terminology. The quote Peter found indicates "only volatile
operations constitute visible side-effects". That would mean that only
volatile operations are prevented from moving past the Monitor.Enter/Exit
fences.


Jon Skeet [C# MVP]

unread,
Jun 26, 2007, 2:34:12 PM6/26/07
to
Ben Voigt [C++ MVP] <r...@nospam.nospam> wrote:
> > Monitor.Enter is a downwards fence, so no reads can move before it, which
> > means that the *first* read of a publicly visible memory location must
> > be a load acquire. After the load acquire, the JIT is free to apply all
> > possible optimizations to the value read, respecting the rules as imposed
> > by the memory model.
> > Monitor.Exit is an upwards fence, that means no writes can move after it,
> > so the last write to a publicly visible memory location must be a store
> > release.
>
> The only way that can work is if *every* (non-inlined) method call is
> treated as a memory barrier, in case it results in a volatile operation or
> call to Monitor.Enter/Exit at some point.

Yup. Note that that argument is valid regardless of the behaviour of
Monitor.Enter/Exit, because of the possibility of a volatile operation
within the method.



> Or else you're confusing the issue with this "publicly visible memory
> location" terminology. The quote Peter found indicates "only volatile
> operations constitute visible side-effects". That would mean that only
> volatile operations are prevented from moving past the Monitor.Enter/Exit
> fences.

It would mean that if it weren't for the other clause which talks about
not moving references past volatile operations. I don't see that clause
12.6.4 can overrule other clauses. For instance, it wouldn't allow you
to use a "technology" which didn't bother to call methods which were
known not to include volatile operations (but might output to the
screen, for instance). Taken entirely in isolation, the start of 12.6.4
sounds like you can do *anything* so long as you don't interfere with
volatile operations and exceptions. So any operations which just change
the values of non-volatile variables can be optimised away completely,
right? No, of course not - that would make a mockery of the whole
framework.

Again, the rationale provides the key to this clause: side-effects and
synchronous exceptions can be reordered so long as that reordering
doesn't change any observable program behaviour.

Ben Voigt [C++ MVP]

unread,
Jun 26, 2007, 6:14:31 PM6/26/07
to

"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20eb6c7...@msnews.microsoft.com...

Yes, they can. Only the resulting value must be preserved, the operations
need not.

for( int i = 0; i < 5; i++ ) x += x;

can be replaced by

x <<= 5;

Notice that i is totally gone, the increment operation is totally gone, the
addition and repeated assignment is totally gone.

And if x isn't read afterwards, the whole thing can go away.

Only for volatile variables are you guaranteed to touch the variable each
time it is mentioned in the program.


Jon Skeet [C# MVP]

unread,
Jun 26, 2007, 7:10:58 PM6/26/07
to
Ben Voigt [C++ MVP] <r...@nospam.nospam> wrote:
> > It would mean that if it weren't for the other clause which talks about
> > not moving references past volatile operations. I don't see that clause
> > 12.6.4 can overrule other clauses. For instance, it wouldn't allow you
> > to use a "technology" which didn't bother to call methods which were
> > known not to include volatile operations (but might output to the
> > screen, for instance). Taken entirely in isolation, the start of 12.6.4
> > sounds like you can do *anything* so long as you don't interfere with
> > volatile operations and exceptions. So any operations which just change
> > the values of non-volatile variables can be optimised away completely,
> > right? No, of course not - that would make a mockery of the whole
> > framework.
>
> Yes, they can. Only the resulting value must be preserved, the operations
> need not.

They do if it affects observable behaviour - which needn't necessarily
involve volatile variables.

To take your example:

> for( int i = 0; i < 5; i++ ) x += x;
>
> can be replaced by
>
> x <<= 5;

True - but:

for (int i=0; i < 5; i++)
{
Console.WriteLine (i);
x += x;
}

can't be replaced by x <<= 5; even if the JIT compiler can prove that
Console.WriteLine doesn't use any volatile operations. That's the kind
of situation which I was meaning would be utterly silly.

Likewise your original code couldn't be reordered to just x++ even
though that wouldn't change any ordering of side-effects or exceptions.
I'm playing devil's advocate here - saying that if we're going to go
for an absolute "if 12.6.4 allows it, it must be okay" reading, then
life becomes ridiculous very quickly.



> Notice that i is totally gone, the increment operation is totally gone, the
> addition and repeated assignment is totally gone.
>
> And if x isn't read afterwards, the whole thing can go away.
>
> Only for volatile variables are you guaranteed to touch the variable each
> time it is mentioned in the program.

Yes - unless there are intervening volatile operations between the
reads. For instance:

using System;

class Test
{
volatile int v;
int x;

void Foo()
{
int a = x;
int b = v;
int c = x;

Console.WriteLine (a+b+c);
}
}

I believe that Foo *can't* be optimised as:

int a = c = x;
int b = v;

That (in my reading) would violate 12.6.7. The same would be true if
the read of v were replaced by a call to Monitor.Enter.

In theory I believe it *could* be replaced by:

int b = v;
int a = c = x;

because that only moves a non-volatile read to *after* a volatile read,
which is okay.

Do we agree so far? (As I said to Peter, it would be good to know
*exactly* which bits of the spec we're reading differently - assuming
you agree with Peter in the first place, that is.)

Ben Voigt [C++ MVP]

unread,
Jun 27, 2007, 9:19:06 AM6/27/07
to
> Yes - unless there are intervening volatile operations between the
> reads. For instance:
>
> using System;
>
> class Test
> {
> volatile int v;
> int x;
>
> void Foo()
> {
> int a = x;
> int b = v;
> int c = x;
>
> Console.WriteLine (a+b+c);
> }
> }
>
> I believe that Foo *can't* be optimised as:
>
> int a = c = x;
> int b = v;
>
> That (in my reading) would violate 12.6.7. The same would be true if
> the read of v were replaced by a call to Monitor.Enter.
>
> In theory I believe it *could* be replaced by:
>
> int b = v;
> int a = c = x;
>
> because that only moves a non-volatile read to *after* a volatile read,
> which is okay.
>
> Do we agree so far? (As I said to Peter, it would be good to know
> *exactly* which bits of the spec we're reading differently - assuming
> you agree with Peter in the first place, that is.)

But what about:

int a = x;
int b = v;

int c = a; // was c = x, but x is non-volatile, so another access to x is
not required, the locally cached a can be used instead?

Jon Skeet [C# MVP]

unread,
Jun 27, 2007, 9:28:49 AM6/27/07
to
On Jun 27, 2:19 pm, "Ben Voigt [C++ MVP]" <r...@nospam.nospam> wrote:

<snip>

> But what about:
>
> int a = x;
> int b = v;
> int c = a; // was c = x, but x is non-volatile, so another access to x is
> not required, the locally cached a can be used instead?

No, I don't believe that's a legal optimisation. The reference (in IL)
is to x, and so that's logically reordered the read of x to earlier
than the read of v, and therefore prohibited by 12.6.7.
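
To make the consequences concrete, consider a polling sketch (contrived,
with x and v as in the example above):

// If the JIT were allowed to satisfy "x" from a copy cached before the
// volatile read of v, an update to x made by another thread between
// iterations might never be observed.
while (true)
{
    int b = v;     // volatile read: later reads can't be moved before it
    if (x != 0)    // so this must be a fresh read of x, per 12.6.7
        break;
}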

I've just checked the annotated spec by the way, and there's nothing
particularly enlightening in there unfortunately.

Jon


Ben Voigt [C++ MVP]

unread,
Jun 27, 2007, 9:33:29 AM6/27/07
to

"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:1182950929.4...@n2g2000hse.googlegroups.com...

> On Jun 27, 2:19 pm, "Ben Voigt [C++ MVP]" <r...@nospam.nospam> wrote:
>
> <snip>
>
>> But what about:
>>
>> int a = x;
>> int b = v;
>> int c = a; // was c = x, but x is non-volatile, so another access to x is
>> not required, the locally cached a can be used instead?
>
> No, I don't believe that's a legal optimisation. The reference (in IL)
> is to x, and so that's logically reordered the read of x to earlier
> than the read of v, and therefore prohibited by 12.6.7.

Maybe not legal for the JIT to perform, but legal for the language compiler,
if it can prove that the expression "x" doesn't depend on the value of b?


Jon Skeet [C# MVP]

unread,
Jun 27, 2007, 10:04:49 AM6/27/07
to
On Jun 27, 2:33 pm, "Ben Voigt [C++ MVP]" <r...@nospam.nospam> wrote:
> >> int a = x;
> >> int b = v;
> >> int c = a; // was c = x, but x is non-volatile, so another access to x is
> >> not required, the locally cached a can be used instead?
>
> > No, I don't believe that's a legal optimisation. The reference (in IL)
> > is to x, and so that's logically reordered the read of x to earlier
> > than the read of v, and therefore prohibited by 12.6.7.
>
> Maybe not legal for the JIT to perform, but legal for the language compiler,
> if it can prove that the expression "x" doesn't depend on the value of b?

The language spec is sadly silent on this front as far as I can see.
However, I think for the purposes of this discussion we should assume
for the moment that the IL represents the C# code in the most obvious
way.

That's not to say it's not a potential issue, just that it's worth
concentrating on one thing at a time :)

If we can all agree it's not a valid JIT (or CPU) optimisation, that
would be a good start.

Jon

Jon Skeet [C# MVP]

unread,
Jun 27, 2007, 2:11:16 PM6/27/07
to
Jon Skeet [C# MVP] <sk...@pobox.com> wrote:
> The language spec is sadly silent on this front as far as I can see.
> However, I think for the purposes of this discussion we should assume
> for the moment that the IL represents the C# code in the most obvious
> way.

It appears I was wrong about the language spec not mentioning the
memory model.

Section 17.4.3 talks about volatile fields and release/acquire
semantics, and mentions the lock statement, but *doesn't* have the same
bit as the CLI spec in terms of specifying that acquiring a lock
performs an implicit volatile read and releasing a lock performs an
implicit volatile write.

I'll mail the C# team to see if this can be fixed.

Peter Ritchie

unread,
Jun 27, 2007, 2:48:31 PM6/27/07
to
Sorry for the repost, this seems to only appear in the Microsoft web
site newsgroup front-end:

"Jon Skeet [C# MVP]" wrote:
> It would mean that if it weren't for the other clause which talks about
> not moving references past volatile operations. I don't see that clause
> 12.6.4 can overrule other clauses. For instance, it wouldn't allow you
> to use a "technology" which didn't bother to call methods which were
> known not to include volatile operations (but might output to the
> screen, for instance). Taken entirely in isolation, the start of 12.6.4
> sounds like you can do *anything* so long as you don't interfere with
> volatile operations and exceptions. So any operations which just change
> the values of non-volatile variables can be optimised away completely,
> right? No, of course not - that would make a mockery of the whole
> framework.

Jon, it took me a while; but here's an example where lock doesn't work and
you *must* use volatile to get the correct behaviour:
// compile: csc -optimize test.cs
using System;
using System.Threading;

internal class Tester {
    public /*volatile*/ bool continueRunning = true;
    public void ThreadEntry() {
        int count = 0;
        Object locker = new Object();
        lock (locker) {
            while (continueRunning) {
                count++;
            }
        }
    }
}

public class Program {
    static void Main() {
        Tester tester = new Tester();
        Thread thread = new Thread(new ThreadStart(tester.ThreadEntry));
        thread.Name = "Job";
        thread.Start();

        Thread.Sleep(2000);

        tester.continueRunning = false;
    }
}

... uncomment the volatile on continueRunning and it runs as expected,
terminating after 2 seconds.

Jon Skeet [C# MVP]

unread,
Jun 27, 2007, 2:54:57 PM6/27/07
to
Jon Skeet [C# MVP] <sk...@pobox.com> wrote:
> Jon Skeet [C# MVP] <sk...@pobox.com> wrote:
> > The language spec is sadly silent on this front as far as I can see.
> > However, I think for the purposes of this discussion we should assume
> > for the moment that the IL represents the C# code in the most obvious
> > way.
>
> It appears I was wrong about the language spec not mentioning the
> memory model.
>
> Section 17.4.3 talks about volatile fields and release/acquire
> semantics, and mentions the lock statement, but *doesn't* have the same
> bit as the CLI spec in terms of specifying that acquiring a lock
> performs an implicit volatile read and releasing a lock performs an
> implicit volatile write.
>
> I'll mail the C# team to see if this can be fixed.

Word back from the C# team:

1) They'll look at making it clearer
2) The spec says that lock is exactly equivalent to calling
Monitor.Enter/Exit, so the CLI rules apply

I'm uncomfortable with the second part, as it's got too much of a tie
between the two specs, but it certainly indicates the intention that
the C# compiler should respect the CLI model.

Ben Voigt [C++ MVP]

unread,
Jun 27, 2007, 3:19:51 PM6/27/07
to

"Peter Ritchie" <goog...@peterritchie.com> wrote in message
news:1182970111....@o11g2000prd.googlegroups.com...

But the lock isn't actually entered or exited in the region of interest.
There would have to be a lock inside the loop.


Jon Skeet [C# MVP]

unread,
Jun 27, 2007, 3:25:12 PM6/27/07
to
Peter Ritchie <goog...@peterritchie.com> wrote:
> Jon, it took me a while; but here's an example where lock doesn't work
> and you *must* use volatile to get the correct behaviour:

Nah - I'll fix it just using an extra lock.

<snip>

> public class Program {
> static void Main() {
> Tester tester = new Tester();
> Thread thread = new Thread(new ThreadStart(tester.ThreadEntry));
> thread.Name = "Job";
> thread.Start();
>
> Thread.Sleep(2000);
>
> tester.continueRunning = false;
> }
> }
>
> ... uncomment the volatile on continueRunning and it runs as
> expected, terminating after 2 seconds.

But you've violated my conditions for correctness:

<quote>
The situation I've been talking about is where a particular variable is
only referenced *inside* lock blocks, and where all the lock blocks
which refer to that variable are all locking against the same
reference.
</quote>

In particular I said:

<quote>
Now I totally agree that *if* you start accessing the variable from
outside a lock block, all bets are off - but so long as you keep
everything within locked sections of code, all locked with the same
lock, you're fine.
</quote>

but by setting tester.continueRunning outside the lock, you've gone
into the "all bets are off" territory.

Now, we don't want to start trying to acquire the lock that you've
already got: for one thing, the object reference is never available
outside ThreadEntry(), and for a second thing we appear to want to hold
that lock for a long time - we'd end up in a deadlock if the looping
thread held the lock and the thread which was trying to stop the loop
had to wait for the loop to finish before it could acquire the lock, if
you see what I mean.

So, we introduce a new lock to go round every access to
continueRunning. For ease of coding, we'll encapsulate continueRunning
in a property access, so the complete code becomes:

using System;
using System.Threading;

internal class Tester {
    bool continueRunning = true;

    object continueRunningLock = new object();

    public bool ContinueRunning {
        get {
            lock (continueRunningLock) {
                return continueRunning;
            }
        }
        set {
            lock (continueRunningLock) {
                continueRunning = value;
            }
        }
    }

    public void ThreadEntry() {
        int count = 0;
        Object locker = new Object();
        lock (locker) {
            while (ContinueRunning) {
                count++;
            }
        }
    }
}

public class Program {
    static void Main() {
        Tester tester = new Tester();
        Thread thread = new Thread(new ThreadStart(tester.ThreadEntry));
        thread.Name = "Job";
        thread.Start();

        Thread.Sleep(2000);

        tester.ContinueRunning = false;
    }
}

It now works without any volatile variables (but plenty of volatile
reads and writes!).

Peter Ritchie [C#MVP]

unread,
Jul 1, 2007, 10:59:43 AM7/1/07
to
Delayed response, I've been away from the newsreader that seems to reliably
post to this thread.

The sample clearly shows that putting access to a member between Enter/Exit
does not guarantee that member won't have visible side-effects optimized
away in other threads. That was one of the points I made a few days ago; and
the impression I got from you was that you believe compile-time reorders on
members between Enter/Exit could occur. If that's not what you meant to
imply, then we don't disagree on that point, the sample is pointless, and we
can ignore that rat hole; "fixing" it is irrelevant.

As I said, I get from the spec. that only the locked object is the volatile
read/write--anything between Enter/Exit is not a volatile read/write. So,
no cross-thread volatility guarantees apply and it falls back to "single
thread of execution" implications and that "there appears to be a hole in
the C# spec", regardless of how correct it *should* be to make anything
possibly cross-thread visible between Enter/Exit volatile, or what a
*particular implementation* is doing. Without clear and unambiguous
documentation, showing a snippet of code does not prove a concept is sound;
only that that snippet of code works in the limited circumstances in which
it is run. (otherwise, I could "prove" cross-thread WinForm data access
"works"). Whereas, it's clear what the "volatile" keyword means.

Besides, I'm not comfortable with compromising an application's vertical
scalability by potentially causing all but one processor to wait while a
member is accessed simply because of volatility concerns. I believe it's
safer to separate dealing with volatility and synchronization; but by your
interpretation the best case is they're equally as safe.

-- Peter

"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message

news:MPG.20ecc9e...@msnews.microsoft.com...

Jon Skeet [C# MVP]

unread,
Jul 1, 2007, 3:11:04 PM7/1/07
to
Peter Ritchie [C#MVP] <prs...@newsgroups.nospam> wrote:
> Delayed response, I've been away from the newsreader that seems to reliably
> post to this thread.
>
> The sample clearly shows that putting a member between Enter/Exit does not
> guarantee those members won't have visible side-effects optimized away in
> other threads.

Other threads that *also* use Enter/Exit (with the same reference)
though? (That *didn't* happen in the broken sample you gave.)

> That was one of the points I made a few days ago; and the
> impression I got from you was that you believe compile-time reorders on
> members between Enter/Exit could occur. If that's not what you meant to
> imply, then we don't disagree on that point and the sample is pointless and
> we can ignore that rat hole and "fixing" it is irrelevant.

Within Enter/Exit, if there are no other volatile operations involved,
they can indeed by reordered. However, if all other uses of the shared
data involves locking against the same reference, then those
reorderings *won't* be visible in other threads. Consider this sequence
of operations:

A
B
C
D

If A acquires a lock and D releases it, then any thread acquiring the
same lock will only be able to see the results of B and C *after* D has
occurred, so it doesn't matter to that thread whether B occurs before
or after C.

> As I said, I get from the spec. that only the locked object is the volatile
> read/write--anything between Enter/Exit is not a volatile read/write.

Indeed - and I'd never claimed anything else.

> So, no cross-thread volatility guarantees apply and it falls back to "single
> thread of execution" implications

No, the same cross-thread volatility guarantees as always apply - a
read can't be reordered to before a volatile read - in this case the
implicit volatile read involved in acquiring a lock.

> and that "there appears to be a hole in
> the C# spec", regardless of how correct it *should* be to make anything
> possibly cross-thread visible between Enter/Exit volatile, or what a
> *particular implementation* is doing.

No, there's no need to prohibit reorders within a lock.

> Without clear and unambiguous
> documentation, showing a snippet of code does not prove a concept is sound;
> only that that snippet of code works in the limited circumstances in which
> it is run. (otherwise, I could "prove" cross-thread WinForm data access
> "works"). Whereas, it's clear what the "volatile" keyword means.

Well, I don't think I've seen anyone else read the spec in the way you
do, and I've seen *lots* of people (including threading experts) use
locks to achieve thread safety in the way that I do.

Is that as good as having a spec which is so unambiguous that there's
only one possible reading? Certainly not. However, with the spec the
way it is, I'll choose to read it in the way that:

1) Is supported by all current runtimes
2) Seems to me to be the way that it's read by experts in the field
3) Allows me to not have volatile performance penalties for code which
doesn't need to share data
4) Allows me to safely share doubles and other types which can't have
the volatile modifier applied to them
5) Allows me to have a simpler mental model for most multi-threaded
development

> Besides, I'm not comfortable with compromising an application's vertical
> scalability by potentially causing all but one processor to wait while a
> member is accessed simply because of volatility concerns.

Whereas I'm not comfortable with putting *potential* performance
considerations above the relative simplicity of having fewer rules when
it comes to implementing thread-safe code.

You already have to know about locking in order to deal with the common
situation where you're operating on more than one piece of data and
don't want to have race conditions. For 99% of the time, I just need to
consider the rule of "don't access shared data outside a lock".

How do you cope with sharing double values in a thread-safe way if you
don't believe locks are enough, by the way? Or do you believe it's
impossible?

Note that from a performance point of view, the "lock when you need
to" case means that I can use a type which *wasn't* implemented using
volatile variables from a type which requires thread-safety. If you
were right, all types should make *everything* volatile just in case
you ever need to use it in a multi-threaded context. At that point
anything *not* needing to share the data has to pay a performance
penalty.

> I believe it's safer to separate dealing with volatility and
> synchronization; but by your interpretation the best case is they're
> equally as safe.

They're equally as safe until you change

number = 1;

to

number++;

at which point volatility isn't good enough, but locking is. In at
least one post in this thread you accidentally gave an example using
"number++" as if it were safe to do so. How sure are you that you
haven't done the same thing in real code?
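
(For completeness: the lock-free way to make the increment safe is the
Interlocked class - a minimal sketch, using it for both the write *and* the
read:)

using System.Threading;

class Counter
{
    private int number;

    public void Increment()
    {
        // atomic read-modify-write, which also acts as a memory barrier
        Interlocked.Increment(ref number);
    }

    public int Current
    {
        // reading through Interlocked keeps volatile semantics on the read side
        get { return Interlocked.CompareExchange(ref number, 0, 0); }
    }
}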

Peter Ritchie [C#MVP]

unread,
Jul 5, 2007, 12:03:50 AM7/5/07
to
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20f20c9...@msnews.microsoft.com...

> Well, I don't think I've seen anyone else read the spec in the way you
> do, and I've seen *lots* of people (including threading experts) use
> locks to achieve thread safety in the way that I do.

It must be me then.

I've been trying to get to your assertion of no JIT optimizations of members
between Enter/Exit because of "volatility guarantees" in the CLI spec., and
the only thing in the spec. that could have done it for me was if everything
between Enter/Exit were considered volatile operations, which is probably why
I erroneously got that you were implying that.

If only visible side-effects can't be reordered, and "...only volatile
operations constitute visible side-effects...", and what's between
Enter/Exit isn't implicitly volatile (then 12.6.4 para 1 wouldn't apply) or
explicitly volatile (and 12.6.7 and 12.6.5 point 3 wouldn't apply), I can't
get to no JIT re-ordering of member fields whose effects would leak out of
an Enter/Exit block.

Ignoring for a moment that 12.6.7 is explicitly talking about the volatile.
prefix and IL between Enter/Exit is not considered volatile and therefore
not generated with this prefix, I can't get to "a read can't be reordered
before a volatile read" from 12.6.7. para 2, with respect to JIT
optimizations. This is mostly due to my reading of "...occur prior to any
references to memory..." applies only to flushing writes to RAM and not JIT
optimizations, because of the consequences. If it doesn't just apply to
flushing writes to RAM and is interpreted as "...any references to *any*
memory..." and not "...any references to *the* memory..." (referring to the
memory used in the volatile operation) then given:

volatile int value;
void int Calculate()
{
int i;
i = 0xA;
value = i;
i = 3;
return i;
}

Calculate() could not be JIT optimized to:

int Calculate()
{
    value = 0xA;
    return 3;
}

...because reading it as "...any references to *any* memory..." means local
assignments before and after "value = i" (being a volatile operation) are
technically references to *any* memory. It would also suggest that the
introduction of an Enter/Exit would cause any reference to any memory to be
unoptimized because in the general case the JIT can't know what will and
won't be called before or after that Enter/Exit. If 12.6.7. para 2 applies
only to flushings of current writes to RAM and volatility of only memory
used in the volatile operation, 12.6.7 para 2 doesn't apply to the
reads/writes within Enter/Exit and there are no reordering guarantees of
anything not volatile by another means between Enter/Exit. This falls back
to single thread of execution guarantees or overriding that with the
"volatile" keyword.

Plus, added details, like the Enter/Exit volatility applying only if a
publicly visible object is used as the lock object, just aren't in the spec.

I'm not disputing what a particular implementation is doing or whether what
it's doing is safer or not. I'm also not disputing the flushing of writes to
RAM (and never was, I was always separating JIT optimizations).

Anyway, you haven't pointed to anything in the spec. to clarify these
ambiguities other than suggest potential problems if it's not interpreted
the same way as you; problems I'm not disputing. I never said writing MT
code was easy. I don't think the framework is doing what it's doing because
of those clauses, so it's somewhat moot (so, I haven't responded to your
other points, but will if you're interested) unless you want to target
different platforms.


Jon Skeet [C# MVP]

unread,
Jul 5, 2007, 2:41:30 AM7/5/07
to
Peter Ritchie [C#MVP] <prs...@newsgroups.nospam> wrote:
> "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
> news:MPG.20f20c9...@msnews.microsoft.com...
> > Well, I don't think I've seen anyone else read the spec in the way you
> > do, and I've seen *lots* of people (including threading experts) use
> > locks to achieve thread safety in the way that I do.
>
> It must be me then.
>
> I've been trying to get to your assertion of no JIT optimizations of members
> between Enter/Exit because of "volatility guarantees" in the CLI spec., and
> the only thing in the spec. that could have done it for me was if everything
> between Enter/Exit were considered volatile operations, which is probably why
> I erroneously got that you were implying that.

It only makes sense to think that I was implying that if I had actually
made an assertion of "no JIT optimizations of members between
Enter/Exit".



> If only visible side-effects can't be reordered, and "...only volatile
> operations constitute visible side-effects...", and what's between
> Enter/Exit isn't implicitly volatile (then 12.6.4 para 1 wouldn't apply) or
> explicitly volatile (and 12.6.7 and 12.6.5 point 3 wouldn't apply), I can't
> get to no JIT re-ordering of member fields whose effects would leak out of
> an Enter/Exit block.

I strongly believe the "visible side-effects" clause (12.6.4 IIRC - I
don't have time to check the spec right now) is a red herring, for
reasons I've pointed out before. It's not meant to contradict 12.6.7.

> Ignoring for a moment that 12.6.7 is explicitly talking about the volatile.

No, 12.6.7 *does* talk about the volatile prefix, but not *only* about
the volatile prefix. It's talking about volatile *operations*.

> prefix and IL between Enter/Exit is not considered volatile and therefore
> not generated with this prefix, I can't get to "a read can't be reordered
> before a volatile read" from 12.6.7. para 2, with respect to JIT
> optimizations. This is mostly due to my reading of "...occur prior to any
> references to memory..." applies only to flushing writes to RAM and not JIT
> optimizations, because of the consequences.

No - the spec doesn't (and shouldn't) talk about the differences
between JIT optimizations and CPU optimizations. It just makes
assertions about what the overall effect of a program can be.

> If it doesn't just apply to
> flushing writes to RAM and is interpreted as "...any references to *any*
> memory..." and not "...any references to *the* memory..." (referring to the
> memory used in the volatile operation) then given:
>
> volatile int value;
> int Calculate()
> {
> int i;
> i = 0xA;
> value = i;
> i = 3;
> return i;
> }
>
> Calculate() could not be JIT optimized to:
>
> int Calculate()
> {
> value = 0xA;
> return 3;
> }

As I've said before, I believe the spec *should* talk about stack
references separately from heap references. There may be some bit of
the spec I've missed, but I'll acknowledge that I can't point to it
right now.



> ...because reading it as "...any references to *any* memory..." means local
> assignments before and after "value = i" (being a volatile operation) are
> technically references to *any* memory. It would also suggest that the
> introduction of an Enter/Exit would cause any reference to any memory to be
> unoptimized because in the general case the JIT can't know what will and
> won't be called before or after that Enter/Exit. If 12.6.7. para 2 applies
> only to flushings of current writes to RAM and volatility of only memory
> used in the volatile operation, 12.6.7 para 2 doesn't apply to the
> reads/writes within Enter/Exit and there are no reordering guarantees of
> anything not volatile by another means between Enter/Exit. This falls back
> to single thread of execution guarantees or overriding that with the
> "volatile" keyword.


> Plus, added details like the Enter/Exit volatility applies only if a
> publicly visible object is used as the lock object, just isn't in the spec.

I've never said that it only applies if it's a publicly visible object.
The problem is that if you use a reference which no other thread can
lock on, then you can't apply the rest of my reasoning (namely that no
other thread will be reading/writing the value while you hold the
lock).



> I'm not disputing what a particular implementation is doing or whether what
> is doing is safer or not. I'm also not disputing the flushing of writes to
> RAM (and never was, I was always separating JIT optimizations).
>
> Anyway, you haven't pointed to anything in the spec. to clarify these
> ambiguities other than suggest potential problems if it's not interpreted
> the same way as you; problems I'm not disputing. I never said writing MT
> code was easy. I don't think the framework is doing what it's doing because
> of those clauses, so it's somewhat moot (so, I haven't responded to your
> other points, but will if you're interested) unless you want to target
> different platforms.

I think it *is* an important academic question though. If my style of
threading is unsafe, then I suspect the vast majority of supposedly
thread-safe code in the world (including stuff in the framework) is
also unsafe. It also means that no program can ever use doubles in a
theoretically thread-safe fashion.

One thing it may be worth considering is what the authors of the spec
*intended*. If everyone agrees on that, then at least the spec can
hopefully be improved in the future to reflect it. In fact, I think
there's already a problem with the spec which I haven't brought up
before because I didn't want to muddy the waters. Code like this:

int memberVariable;
....

int a;
lock (someReference)
{
a = memberVariable;
}
Console.WriteLine(a);

could, by my understanding of the memory model, be reordered to:

int a;
lock (someReference)
{
}
a = memberVariable;
Console.WriteLine(a);

The reason for this is that the read is being moved later than the
volatile write rather than earlier than the volatile read - and that's
not forbidden at all. This would muck things up significantly (in
particular, you couldn't change two variables in a way which was
guaranteed to appear to be atomic).

Peter Ritchie [C#MVP]

unread,
Jul 9, 2007, 7:17:39 PM7/9/07
to
>It only makes sense to think that I was implying that if I had actually
>made an assertion of "no JIT optimizations of members between
>Enter/Exit".
Agreed, you never explicitly said that.

So, you're not asserting there can be no JIT optimizations of members
between
Enter/Exit that aren't explicitly involved in a volatile operation, and
you're agreeing that:

>int memberVariable;
>//....


>
>
>int a;
>lock (someReference) {
> a = memberVariable;
>}
>
>Console.WriteLine(a);
>
>could, by my understanding of the memory model, be reordered to:
>
>int a;
>lock (someReference) {
>
>}
>
>
>a = memberVariable;
>Console.WriteLine(a);

...which is what I was trying to get at originally with "the lock statement
surrounding access to a member doesn't stop the compiler from having
optimized use of a member by caching it to a register"...yes, the compiler
*could* assume that all members within the lock statement block are likely
accessible by multiple threads (implicit volatile); but that's not its
intention and it's certainly not documented as doing that". To which, your
response was quoting 335 12.6.5's "Acquiring a lock
(System.Threading.Monitor.Enter or entering a synchronized method) shall
implicitly perform a volatile read operation, and releasing a lock
(System.Threading.Monitor.Exit or leaving a synchronized method) shall
implicitly perform a volatile write operation." In fairness, by "lock
statement surrounding access to a member" I was intending a member within
the block (i.e. lock(...){member=something;}) not the reference being
locked; so I can see how the conversation side-tracked (I don't dispute the
reference sent to Monitor.Enter/Monitor.Exit is considered volatile, but
you'd think the IL generated for "lock" would have instructions with a
volatile prefix). This made me incorrectly think you were implying all
member access between Monitor.Enter and Monitor.Exit was volatile and not
subject to JIT optimizations. But, I think we agree we were talking about
different things.
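
(For what it's worth, the register-caching scenario I had in mind looks
like this - a sketch of the worry, not something any particular JIT is
documented to do:

private bool stop;   // deliberately not volatile

void Worker()
{
    // With no volatile read and no lock in the loop body, an
    // optimizer is free to read 'stop' once into a register and spin
    // forever, never noticing another thread's write.
    while (!stop)
    {
    }
}

Declaring the field volatile, or taking a lock inside the loop, is what
forbids that hoisting.)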

A variation in your example would be

int a = ...;
lock(someReference) {
memberVariable = a;
}
Console.WriteLine(memberVariable);

being optimized to

int a = ...;
lock(someReference){
}
memberVariable = a;
Console.WriteLine(memberVariable);

...which shouldn't occur if memberVariable were declared with "volatile".
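
That is, with the declaration changed to (my sketch):

private volatile int memberVariable;

...the assignment inside the lock is a volatile write in its own right,
independent of whatever guarantees the lock does or doesn't provide.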

>One thing it may be worth considering is what the authors of the spec
>*intended*. If everyone agrees on that, then at least the spec can
>hopefully be improved in the future to reflect it.

Well, it may be moot at this point what was intended in the spec. I doubt
the .NET JIT can change what it's currently doing should "what's intended"
be different from what was implemented. But, I do agree. I think it's
vital to have a coherent, unambiguous, quantifiable and qualifiable spec so
"compliance" means something and developers *can* develop truly
platform-independent code (*can* because they'll always be able to
incorrectly write code that works on only one platform), where platform is
as granular as an OS/architecture combination.


Jon Skeet [C# MVP]

unread,
Jul 9, 2007, 8:09:09 PM7/9/07
to
Peter Ritchie [C#MVP] <prs...@newsgroups.nospam> wrote:
> >It only makes sense to think that I was implying that if I had actually
> >made an assertion of "no JIT optimizations of members between
> >Enter/Exit".
> Agreed, you never explicitly said that.
>
> So, you're not asserting there can be no JIT optimizations of members
> between
> Enter/Exit that aren't explicitly involved in a volatile operation, and
> you're agreeing that:
>
> >int memberVariable;
> >//....
> >
> >
> >int a;
> >lock (someReference) {
> > a = memberVariable;
> >}
> >
> >Console.WriteLine(a);
> >
> >could, by my understanding of the memory model, be reordered to:
> >
> >int a;
> >lock (someReference) {
> >
> >}
> >
> >
> >a = memberVariable;
> >Console.WriteLine(a);
>
> ...which is what I was trying to get at originally with "the lock statement
> surrounding access to a member doesn't stop the compiler from having
> optimized use of a member by caching it to a register"

Well, that's not caching (which would be an *earlier* read) - it's
delaying. However, I've just reread the spec, and it doesn't just say
that *writes* can't be moved past a volatile write - it says that *no*
memory references can move to later than a volatile write or earlier
than a volatile read.

In other words, I withdraw the concerns I expressed in the previous
post. :)

> ...yes, the compiler
> *could* assume that all members within the lock statement block are likely
> accessible by multiple threads (implicit volatile); but that's not its
> intention and it's certainly not documented as doing that". To which, your
> response was quoting 335 12.6.5's "Acquiring a lock
> (System.Threading.Monitor.Enter or entering a synchronized method) shall
> implicitly perform a volatile read operation, and releasing a lock
> (System.Threading.Monitor.Exit or leaving a synchronized method) shall
> implicitly perform a volatile write operation." In fairness, by "lock
> statement surrounding access to a member" I was intending a member within
> the block (i.e. lock(...){member=something;})

Yes, member access within the lock is what I was talking about too.

> not the reference being
> locked; so I can see how the conversation side-tracked (I don't dispute the
> reference sent to Monitor.Enter/Monitor.Exit is considered volatile, but
> you'd think the IL generated for "lock" would have instructions with a
> volatile prefix). This made me incorrectly think you were implying all
> member access between Monitor.Enter and Monitor.Exit was volatile and not
> subject to JIT optimizations. But, I think we agree we were talking about
> different things.

Hmm... not sure given what you've written later on.

> A variation in your example would be
>
> int a = ...;
> lock(someReference) {
> memberVariable = a;
> }
> Console.WriteLine(memberVariable);
>
> being optimized to
>
> int a = ...;
> lock(someReference){
> }
> memberVariable = a;
> Console.WriteLine(memberVariable);
>
> ...which shouldn't occur if memberVariable were declared with "volatile".

It can't occur within the spec either though. That's moving one write
(the one to memberVariable) to *after* the lock is released. That's
prohibited.

I *thought* that the following optimisation could occur - but having
rechecked the spec, I'm happy that it can't.

int a = ...;
memberVariable = a;
lock (someReference)
{
}

> >One thing it may be worth considering is what the authors of the spec
> >*intended*. If everyone agrees on that, then at least the spec can
> >hopefully be improved in the future to reflect it.
>
> Well, it may be moot at this point what was intended in the spec. I doubt
> the .NET JIT can change what it's currently doing should "what's intended"
> be different from what was implemented. But, I do agree. I think it's
> vital to have a coherent, unambiguous, quantifiable and qualifiable spec so
> "compliance" means something and developers *can* develop truly
> platform-independent code (*can* because they'll always be able to
> incorrectly write code that works on only one platform), where platform is
> as granular as an OS/architecture combination.

I called for such an unambiguous spec (in reference to a *really,
really* odd reading of it) a while ago. The responses from Joe Duffy
and Joel Pobar are enlightening about the .NET 2.0 memory model (as
opposed to the ECMA spec model):

http://msmvps.com/blogs/jon.skeet/archive/2006/11/26/the-cli-memory-model-and-specific-specifications.aspx

In particular, there's a reference to an article by Vance Morrison
(http://msdn.microsoft.com/msdnmag/issues/05/10/MemoryModels/)

which includes the following when describing the ECMA spec:

<quote>
1. Reads and writes cannot move before a volatile read.
2. Reads and writes cannot move after a volatile write.
</quote>

It also has this just before the ECMA description:

<quote>
This is one more place where the locking protocol really adds value.
The protocol ensures that every access to thread-shared, read/write
memory only occurs when holding the associated lock. When a thread
exits the lock, the third rule ensures that any writes made while the
lock was held are visible to all processors. Before the memory is
accessed by another thread, the reading thread will enter a lock and
the second rule ensures that the reads happen logically after the lock
was taken. While the lock is held, no other thread will access the
memory protected by the lock, so the first rule ensures that during
that time the program will behave as a sequential program.

The result is that programs that follow the locking protocol have the
same behavior on any memory model having these three rules. This is an
extremely valuable property. It is hard enough to write correct
concurrent programs without having to think about the ways the compiler
or memory system can rearrange reads and writes. Programmers who follow
the locking protocol don't need to think about any of this. Once you
deviate from the locking protocol, however, you must specify and
consider what transformations hardware or a compiler might do to reads
and writes.
</quote>

Interestingly the three rules he refers to only talk about *reads*
moving before entering a lock and *writes* moving after exiting a lock
- in other words, the slightly looser behaviour I was worried about!
However, it makes it pretty clear that using this "locking protocol" is
and always has been intended to be safe. (Given the earlier discussion
of caching, I'm not sure that reads being delayed and writes being
advanced is even considered as a possibility.)
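
On the "deviating from the locking protocol" point, the managed tools
for that deviation are things like Interlocked. A sketch of the
flag-handoff idiom, with names of my own invention:

// using System.Threading;
private int ready;     // 0 = not yet published, 1 = published
private int payload;

void Producer()
{
    payload = 42;
    // Interlocked operations have full-fence semantics, so the
    // payload write above can't move past the exchange.
    Interlocked.Exchange(ref ready, 1);
}

void Consumer()
{
    // The read goes through Interlocked too; the guarantee only
    // holds if both sides use the same semantics.
    if (Interlocked.CompareExchange(ref ready, 0, 0) == 1)
    {
        Console.WriteLine(payload);   // guaranteed to print 42
    }
}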

Peter Ritchie [C#MVP]

unread,
Jul 9, 2007, 9:08:54 PM7/9/07
to
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.20fcde8...@msnews.microsoft.com...

> Well, that's not caching (which would be an *earlier* read) - it's
> delaying.

Maybe a poor choice of an overloaded term, but I did say "compiler from
having optimized use of a member by caching it to a register".

>However, I've just reread the spec, and it doesn't just say
> that *writes* can't be moved past a volatile write - it says that *no*
> memory references can move to later than a volatile write or earlier
> than a volatile read.

...with regard to "acquire semantics" and "release semantics". So 12.6.7
makes sense with regard only to processor caching. Everything I've read
discusses "acquire semantics" and "release semantics" only in the context of
processor caching, not compiler (JIT or otherwise) optimizations.

Joe Duffy's blog on broken double-checked locking [1] describes "ensuring
writes have 'release' semantics on IA-64, via the st.rel instruction. A
single st.rel x guarantees that any other loads and stores leading up to its
execution (in the physical instruction stream) must have appeared to have
occurred to each logical processor at least by the time x's new value
becomes visible to another logical processor." Clearly a CPU instruction
cannot have an effect on what the JIT does and does not optimize. Joe's
mention of the physical instruction stream only deals with the processor's
caching of writes in relation to that stream.
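
The closest managed analogue I can think of is
Thread.VolatileRead/VolatileWrite - a sketch of mine, and whether it
actually compiles down to st.rel on IA-64 is exactly the kind of
implementation detail in question:

// using System.Threading;
private int data;
private int flag;

void Publish()
{
    data = 99;
    // Release-style write: the store above must be visible to other
    // processors no later than the new flag value is.
    Thread.VolatileWrite(ref flag, 1);
}

void Observe()
{
    // Acquire-style read: later loads may not be satisfied from
    // before this one.
    if (Thread.VolatileRead(ref flag) == 1)
        Console.WriteLine(data);
}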

http://msdn2.microsoft.com/EN-US/library/aa490209.aspx
Discusses the acquire semantics of specific Win32 functions, nothing to do
with compiler optimizations and no existing native compiler I know of
changes its optimization behaviour in the presence of those Interlocked
functions.

http://msdn2.microsoft.com/en-us/library/ms686355.aspx
Details that prior to VC 2003 "volatile" had no acquire/release semantics
and only dealt with compiler optimizations.

If you take 12.6.7 para 2's "after any memory references" to mean anything
other than processor caching, as soon as you introduce Monitor.Enter or
Monitor.Exit (or volatile) you can't optimize anything, even locals. That's
clearly not the intention of that paragraph. That paragraph says almost the
same thing as Joe's blog with respect to st.rel and ld.acq, which only
applies to processor caching. That paragraph is also the only place where
335 doesn't associate the volatile. prefix with its discussion of what
volatile reads and writes do. There's the almost casual mention of Enter and
Exit being an implicit volatile read and write respectively; but that also
makes perfect sense if you're only discussing processor caching.

Relating to origins, the C++ volatile keyword in VC++ never dealt with
acquire or release semantics until VC++ 2005. Prior to that, for the past
30+ years it has been used only to tell the compiler not to optimize that
identifier.

It's not a huge stretch to think of 335:12.6 only in the context of
processor caching. All native Windows synchronization primitives deal with
"acquire semantics" and "release semantics" in the context of processor
caching only; a native compiler simply can't know what it should and should
not optimize based on essentially unrelated function calls. MT code
(including the framework) is written with these issues in mind.

<snip>


> It can't occur within the spec either though. That's moving one write
> (the one to memberVariable) to *after* the lock is released. That's
> prohibited.

Only if you make the leap that "acquire semantics" and "release semantics"
don't refer to anything other than processor caching. The spec makes
complete sense if all it's talking about is processor caching with regard to
acquire/release semantics. The processor does reorder memory accesses, and
similar to Joe's blog, in relation to the instruction stream. The mere
mention of the CIL instruction sequence in 335 doesn't imply that it
constrains compiler optimizations as well.

> I called for such an unambiguous spec (in reference to a *really,
> really* odd reading of it) a while ago. The responses from Joe Duffy
> and Joel Pobar are enlightening about the .NET 2.0 memory model (as
> opposed to the ECMA spec model):

I would agree and would join you on a renewed call...

[1]
http://www.bluebytesoftware.com/blog/PermaLink,guid,543d89ad-8d57-4a51-b7c9-a821e3992bf6.aspx


Jon Skeet [C# MVP]

unread,
Jul 10, 2007, 2:52:39 AM7/10/07
to
Peter Ritchie [C#MVP] <prs...@newsgroups.nospam> wrote:
> "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
> news:MPG.20fcde8...@msnews.microsoft.com...
> > Well, that's not caching (which would be an *earlier* read) - it's
> > delaying.
>
> Maybe a poor choice of an overload term, but I did say "compiler from having
> optimized use a member by caching it to a register".

True.



> >However, I've just reread the spec, and it doesn't just say
> > that *writes* can't be moved past a volatile write - it says that *no*
> > memory references can move to later than a volatile write or earlier
> > than a volatile read.
>
> ...with regard to "acquire semantics" and "release semantics". So 12.6.7
> make sense with regard only to processor cachings. Everything I've read
> discusses "acquire semantics" and "release semantics" only in the context of
> processor caching, not compiler (JIT or otherwise) optimizations.

Once again, there's nothing in the spec which distinguishes the two.
Yes, a lot of the descriptive material goes into detail about
different CPU architectures, but please say where in the spec it says
"You can't rely on any of this as the overall semantics of the program,
because the JIT and the CPU are separate."

The JIT needs to take account of what CPU it's running on in order to
make sure that the overall semantics are correct, that's all.

> http://msdn2.microsoft.com/EN-US/library/aa490209.aspx
> Discusses the acquire semantics of specific Win32 functions, nothing to do
> with compiler optimizations and no existing native compiler I know of
> changes it's optimization behaviour in the presence of those Interlocked
> functions.
>
> http://msdn2.microsoft.com/en-us/library/ms686355.aspx
> Details that prior to VC 2003 "volatile" had no acquire/release semantics
> and only dealt with compiler optimizations.

You shouldn't try to reason about the CLR term "volatile" with
reference to what it means outside the CLI, in my view.



> If you take 12.6.7 para 2's "after any memory references" to mean anything
> other than processor caching, as soon as you introduce Monitor.Enter or
> Monitor.Exit (or volatile) you can't optimize anything, even locals. That's
> clearly not the intention of that paragraph. That paragraph says almost the
> same thing as Joe's blog with respect to st.rel and ld.acq, which only
> applies to processor caching. That paragraph is also the only place where
> 335 doesn't associate the volatile. prefix with its discussion of what
> volatile reads and writes do. There's the almost casual mention of Enter and
> Exit being an implicit volatile read and write respectively; but that also
> makes perfect sense if you're only discussing processor caching.

But if you don't read the spec as overall semantics, it's entirely
useless.

> Relating to origins, the C++ volatile keyword in VC++ never dealt with
> acquire or release semantics until VC++ 2005. Prior to that, for the past
> 30+ years it has been used only to tell the compiler not to optimize that
> identifier.

Yup - but again, that doesn't alter what the spec says.

> It's not a huge stretch to think of 335:12.6 only in the context of
> processor caching.

That's where we disagree. I believe that if the spec says that X will
happen and Y doesn't happen, then if I can show Y happening and X not
happening on a particular implementation, then that implementation is
*broken* - it doesn't conform with the spec.

> <snip>
> > It can't occur within the spec either though. That's moving one write
> > (the one to memberVariable) to *after* the lock is released. That's
> > prohibited.
>
> Only if you make the leap that "acquire semantics" and "release semantics"
> don't refer to anything other than processor caching.

No - I'm saying they refer to the overall effect - that you should view
the spec as an absolute in terms of what's allowed to occur and what's
not, regardless of how that's achieved.

> The spec makes complete sense if all it's talking about is processor
> caching with regard to acquire/release semantics. The processor does
> reorder memory accesses, and similar to Joe's blog, in relation to
> the instruction stream. The mere mention of the CIL instruction
> sequence in 335 doesn't imply that it constrains compiler optimizations
> as well.

If any CLI implementation allows the processor to reorder instructions
in a way which means that 12.6.7 can't be relied upon for visible
effects, that implementation is broken.

Did you read the section of Grant's page about the locking protocol, by
the way?
