
Newbie question: accessing global variable on multiprocessor


amit

Jan 8, 2010, 6:09:40 PM
Hello friends,

If there's a global variable - to be accessed (read/written) by multiple
threads (on a multiprocessor), then any correctly implemented access (of
that variable) will cause a complete cache reload of all CPUs - is that
true or not? Anyway, what would be the cost (as compared to a single read/
write instruction)?

I'm not talking about locking here - I'm talking about all threads seeing
the most recent value of that variable.

Thanks,

amit

Jan 8, 2010, 6:28:35 PM

*bump*

anyone in this chatroom?

Ian Collins

Jan 8, 2010, 6:35:42 PM

Which chat room?

--
Ian Collins

Keith Thompson

Jan 8, 2010, 6:41:50 PM
amit <nos...@nospam.com> writes:

This is not a chatroom, it's a newsgroup. If you got a response
within 19 minutes, you'd be very lucky. If you don't see anything
within a day or two, you can start to wonder. You can think of it,
very loosely, as a kind of distributed e-mail; people will see your
posts when they get around to checking the newsgroup, not as soon
as you send them. There are also some delays imposed by propagation
from one server to another.

Standard C doesn't support threads. (The draft of the new
standard adds threading support, but it won't be relevant to
programmers for quite a few years.) You'll get better answers
in comp.programming.threads. I suggest browsing that newsgroup's
archives and/or checking its FAQ first.
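For readers curious what that draft support eventually became, here is a minimal sketch using the <stdatomic.h> interface that shipped in C11, well after this thread; the threads are started with POSIX pthread_create rather than the standard's own <threads.h>, so none of this was available to the posters at the time:

--------------------------
#include <stdatomic.h>   /* C11; post-dates this thread */
#include <pthread.h>     /* POSIX, used here only to start the threads */
#include <stdio.h>

static atomic_int counter = ATOMIC_VAR_INIT(0);

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        atomic_fetch_add(&counter, 1);   /* one indivisible read-modify-write */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("%d\n", atomic_load(&counter));   /* reliably prints 200000 */
    return 0;
}
--------------------------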

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

BGB / cr88192

Jan 9, 2010, 12:50:04 AM

"amit" <nos...@nospam.com> wrote in message
news:hi8dvk$9qd$1...@speranza.aioe.org...

well, this is not strictly standard C, but a few things apply here:
the CPU is smart enough that it will not flush "all caches" on access, but will
instead only flush those which are relevant, and typically only on a write
(for the other processors);
the functionality for this is built into the CPUs and the bus, so nothing
particularly special is needed.

note that, for shared variables, you would want to mark them 'volatile'
(this is a keyword which serves this purpose, among others). basically, this
just tells the compiler to read from and write changes directly to memory,
rather than have them likely sit around in a register somewhere.

as for the cost, in itself it is usually fairly small.


there are also atomic/bus-locking operations, which are usually used for
implementing mutexes, but are not usually needed for most data structures.
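To make that concrete, here is a rough sketch of one such bus-locking operation used to build a tiny spinlock; the __sync_* builtins are GCC-specific extensions (GCC 4.1 and later), not standard C, so treat this as illustration only:

--------------------------
/* GCC-specific sketch, not standard C.  __sync_lock_test_and_set
   typically compiles to a bus-locked exchange (e.g. LOCK XCHG on x86)
   and acts as an acquire barrier; __sync_lock_release is a release. */
static volatile int lock_word = 0;
static long shared_counter = 0;

static void spin_lock(volatile int *l)
{
    while (__sync_lock_test_and_set(l, 1))
        ;                        /* spin until the previous value was 0 */
}

static void spin_unlock(volatile int *l)
{
    __sync_lock_release(l);      /* store 0 with release semantics */
}

void increment_shared(void)
{
    spin_lock(&lock_word);
    shared_counter++;            /* now a protected read-modify-write */
    spin_unlock(&lock_word);
}
--------------------------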


Rui Maciel

Jan 8, 2010, 8:21:38 PM
amit wrote:

You will get better replies if you post your question on a newsgroup dedicated to
parallel programming, such as comp.programming.threads.


Hope this helps,
Rui Maciel

la...@ludens.elte.hu

Jan 9, 2010, 1:02:38 PM
In article <hi95ef$4qc$1...@news.albasani.net>, "BGB / cr88192" <cr8...@hotmail.com> writes:
> "amit" <nos...@nospam.com> wrote in message
> news:hi8dvk$9qd$1...@speranza.aioe.org...
>> Hello friends,
>>
>> If there's a global variable - to be accessed (read/written) by multiple
>> threads (on a multiprocessor), then any correctly implemented access (of
>> that variable) will cause a complete cache reload of all CPUs - is that
>> true or not? Anyway, what would be the cost (as compared to a single read/
>> write instruction)?
>>
>> I'm not talking about locking here - I'm talking about all threads seeing
>> the most recent value of that variable.
>>
>
> well, this is not strictly standard C, but a few things apply here:
> the CPU is smart enough that it will not flush "all caches" on access, but will
> instead only flush those which are relevant, and typically only on a write
> (for the other processors);
> the functionality for this is built into the CPUs and the bus, so nothing
> particularly special is needed.
>
> note that, for shared variables, you would want to mark them 'volatile'
> (this is a keyword which serves this purpose, among others). basically, this
> just tells the compiler to read from and write changes directly to memory,
> rather than have them likely sit around in a register somewhere.

I sincerely believe that you're wrong. This is a very frequent fallacy
(I hope I'm using the right word). volatile in C has nothing to do with
threads. Volatile is what the standard defines it to be. See

http://www.open-std.org/JTC1/sc22/wg21/docs/papers/2006/n2016.html

The question is interesting and relevant (if perhaps not topical in this
newsgroup). I was waiting for somebody to give "amit" an answer. Off the
top of my head:

- The new C++ standard will have atomic<type> which seems to be exactly
what amit needs.

- The CPU is absolutely not smart enough to find out what you need. The
compiler and the CPU may jointly and aggressively reorder the machine
level loads and stores that one would naively think to be the direct
derivation of his/her C code. Memory barriers are essential and the
POSIX threads implementations do utilize them.

To substantiate (or fix) these claims, a few links:

http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/threadsintro.html
http://bartoszmilewski.wordpress.com/2008/08/04/multicores-and-publication-safety/
http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
http://bartoszmilewski.wordpress.com/2008/11/11/who-ordered-sequential-consistency/
http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/

I'm obviously not in the position to give advice to anybody here.
Nonetheless, my humble suggestion for the interested is to read all of
the writings linked to above. For me personally, the conclusion was to
avoid both unsynchronized and not explicitly synchronized access like
the plague.
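For what it's worth, "explicitly synchronized" access to amit's shared global could look roughly like the following under POSIX threads (off-topic for standard C, and only a sketch; the mutex operations carry the memory-visibility guarantees, so no volatile is involved):

--------------------------
#include <pthread.h>

static int shared_value = 0;
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

void set_shared(int v)
{
    pthread_mutex_lock(&shared_lock);   /* POSIX guarantees memory sync here */
    shared_value = v;
    pthread_mutex_unlock(&shared_lock);
}

int get_shared(void)
{
    int v;
    pthread_mutex_lock(&shared_lock);
    v = shared_value;                   /* guaranteed to see the latest store */
    pthread_mutex_unlock(&shared_lock);
    return v;
}
--------------------------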

Cheers,
lacos

Nobody

Jan 10, 2010, 2:20:23 AM
On Sat, 09 Jan 2010 19:02:38 +0100, lacos wrote:

>> note that, for shared variables, you would want to mark them 'volatile'
>> (this is a keyword which serves this purpose, among others). basically, this
>> just tells the compiler to read from and write changes directly to memory,
>> rather than have them likely sit around in a register somewhere.
>
> I sincerely believe that you're wrong. This is a very frequent fallacy
> (I hope I'm using the right word). volatile in C has nothing to do with
> threads. Volatile is what the standard defines it to be. See

Notably, the standard states that reading from a "volatile" variable is a
sequence point, while reading from non-volatile variables isn't.

The more significant issue is that a sequence point isn't necessarily what
people expect. The specification only describes the *abstract* semantics,
which doesn't have to match what actually occurs at the hardware level.

AFAIK, there are only two situations where you can say "if this variable
is declared "volatile", this code will behave in this way; if you omit the
qualifier, it's undefined or implementation-defined behaviour". One
case relates to setjmp()/longjmp(), the other to signal().

And even if the compiler provides the "assumed" semantics for "volatile"
(i.e. it emits object code in which read/write of volatile variables
occurs in the "expected" order), that doesn't guarantee that the processor
itself won't re-order the accesses.
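As a rough illustration of the hardware side (GCC-specific, not standard C, and ordinarily hidden inside the pthread primitives), an explicit full barrier would be needed on both sides of a flag-based hand-off:

--------------------------
/* Sketch using GCC's __sync_synchronize() full memory barrier.
   volatile keeps the compiler honest; the barriers constrain the CPU. */
volatile int ready = 0;
int payload;

void publish(void)
{
    payload = 42;
    __sync_synchronize();    /* the payload store is visible before ready is set */
    ready = 1;
}

int consume(void)
{
    while (!ready)
        ;                    /* busy-wait, for illustration only */
    __sync_synchronize();    /* the payload load may not be hoisted above the flag load */
    return payload;
}
--------------------------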

Flash Gordon

Jan 10, 2010, 6:06:45 AM
Nobody wrote:
> On Sat, 09 Jan 2010 19:02:38 +0100, lacos wrote:
>
>>> note that, for shared variables, you would want to mark them 'volatile'
>>> (this is a keyword which serves this purpose, among others). basically, this
>>> just tells the compiler to read from and write changes directly to memory,
>>> rather than have them likely sit around in a register somewhere.
>> I sincerely believe that you're wrong. This is a very frequent fallacy
>> (I hope I'm using the right word). volatile in C has nothing to do with
>> threads. Volatile is what the standard defines it to be. See
>
> Notably, the standard states that reading from a "volatile" variable is a
> sequence point, while reading from non-volatile variables isn't.

C&V? I don't think reading from a volatile is a sequence point.

> The more significant issue is that a sequence point isn't necessarily what
> people expect. The specification only describes the *abstract* semantics,
> which doesn't have to match what actually occurs at the hardware level.

At this point, it is worth noting that there is a relationship between
volatile and sequence points. I believe the language for this is being
tidied up in the next version of the C standard, but since
reading/writing a volatile object is a side effect it has to be complete
by the next sequence point.

> AFAIK, there are only two situations where you can say "if this variable
> is declared "volatile", this code will behave in this way; if you omit the
> qualifier, it's undefined or implementation-defined behaviour". One
> case relates to setjmp()/longjmp(), the other to signal().

For signal it needs to be volatile sig_atomic_t.
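For reference, the signal() case the standard does bless looks roughly like this (single-threaded; sketch only):

--------------------------
#include <signal.h>
#include <stdio.h>

/* C99 7.14.1.1: a handler may not touch objects with static storage
   duration except to assign to one declared volatile sig_atomic_t. */
static volatile sig_atomic_t got_signal = 0;

static void handler(int sig)
{
    (void)sig;
    got_signal = 1;
}

int main(void)
{
    signal(SIGINT, handler);
    while (!got_signal)
        ;                    /* volatile forces a fresh read on each iteration */
    puts("caught SIGINT");
    return 0;
}
--------------------------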

> And even if the compiler provides the "assumed" semantics for "volatile"
> (i.e. it emits object code in which read/write of volatile variables
> occurs in the "expected" order), that doesn't guarantee that the processor
> itself won't re-order the accesses.

However, the implementation does have to document what it means by accessing a
volatile object, and it should be possible to identify from that documentation
whether it prevents the processor from reordering further down, whether it
bypasses the cache, etc.

In short, volatile seems like a sensible thing to specify on objects
accessed by multiple threads, but definitely is NOT guaranteed to be
sufficient, and may not be necessary. It's something where you need to
read the documentation for your implementation, and it may depend on
whether you have multiple cores on one processor, multiple separate
processors, and how the HW is designed.
--
Flash Gordon

Nobody

Jan 10, 2010, 10:47:58 AM
On Sun, 10 Jan 2010 11:06:45 +0000, Flash Gordon wrote:

>> Notably, the standard states that reading from a "volatile" variable is a
>> sequence point, while reading from non-volatile variables isn't.
>
> C&V? I don't think reading from a volatile is a sequence point.

Ugh; sorry. Reading from a volatile is a *side-effect*, which must not
occur before the preceding sequence point and must have occurred by the
following sequence point. 5.1.2.3 p2 and p6.

>> And even if the compiler provides the "assumed" semantics for "volatile"
>> (i.e. it emits object code in which read/write of volatile variables
>> occurs in the "expected" order), that doesn't guarantee that the processor
>> itself won't re-order the accesses.
>
> However, it does have to document what it means by accessing a volatile,
> and it should be possible to identify from this whether it prevents the
> processor from reordering further down, whether it bypasses the cache etc.

Easier said than done. The object code produced by a compiler may
subsequently be run on a wide range of CPUs, including those not invented
yet. The latest x86 chips will still run code which was generated for a
386.

gwowen

Jan 11, 2010, 12:02:29 PM
On Jan 9, 6:02 pm, la...@ludens.elte.hu wrote:

> I sincerely believe that you're wrong. This is a very frequent fallacy
> (I hope I'm using the right word). volatile in C has nothing to do with
> threads.

Well, nothing in C has anything to do with threads. However, since a
C compiler may assume a piece of code is single-threaded, it's often
the case that the compiler will optimize away operations on a non-
volatile global variable that another thread relies on. As such it's often
necessary (but NOT sufficient) to declare such variables as volatile.

Suppose the following two bits of code are running concurrently:

------------------------
/* volatile */ unsigned int flag = 0;

void function_wait_for_flag(void)
{
    while (flag == 0) {}
    do_some_parallel_processing();
    return;
}
--------------------------
extern unsigned int flag;

void do_processing(void)
{
    do_non_parallel_processing();
    flag = 1;
    do_some_other_parallel_processing();
}
--------------------------

You can see that that's a very simplistic way to parallelize a bit of
processing. Note that, since flag is not declared volatile, the
compiler may happily decide that flag is always zero and turn your
function into:

void function_wait_for_flag(void)
{
    while (1) {}
}

Yoinks!

Of course, it's a busy-wait, and it's terrible style, and there are
better ways to implement it, and it's nearly always better to use real
threading primitives, like those pthreads supplies, rather than faking them
with volatile variables.

But with volatile it works, and without it, it may not.

Threads change variables behind the compiler's back -- volatile can act
as a warning that that might happen.
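For comparison, the same hand-off written with those "real threading primitives" might look like the sketch below (POSIX, not standard C). Note that flag needs no volatile here; the mutex and condition variable supply both the ordering and the visibility guarantees:

--------------------------
#include <pthread.h>

static int flag = 0;
static pthread_mutex_t flag_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  flag_cond  = PTHREAD_COND_INITIALIZER;

void function_wait_for_flag(void)
{
    pthread_mutex_lock(&flag_mutex);
    while (flag == 0)                   /* loop guards against spurious wakeups */
        pthread_cond_wait(&flag_cond, &flag_mutex);
    pthread_mutex_unlock(&flag_mutex);
    /* do_some_parallel_processing(); */
}

void do_processing(void)
{
    /* do_non_parallel_processing(); */
    pthread_mutex_lock(&flag_mutex);
    flag = 1;
    pthread_cond_signal(&flag_cond);
    pthread_mutex_unlock(&flag_mutex);
    /* do_some_other_parallel_processing(); */
}
--------------------------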

Flash Gordon

Jan 10, 2010, 1:46:01 PM
Nobody wrote:
> On Sun, 10 Jan 2010 11:06:45 +0000, Flash Gordon wrote:

<snip>

>>> And even if the compiler provides the "assumed" semantics for "volatile"
>>> (i.e. it emits object code in which read/write of volatile variables
>>> occurs in the "expected" order), that doesn't guarantee that the processor
>>> itself won't re-order the accesses.
>> However, it does have to document what it means by accessing a volatile,
>> and it should be possible to identify from this whether it prevents the
>> processor from reordering further down, whether it bypasses the cache etc.
>
> Easier said than done. The object code produced by a compiler may
> subsequently be run on a wide range of CPUs, including those not invented
> yet. The latest x86 chips will still run code which was generated for a
> 386.

If the compiler does not claim to support processors not yet invented
then that is not a problem. You can't blame a compiler (or program) if
it fails for processors which are not supported even if the processor is
theoretically backwards compatible.
--
Flash Gordon

BGB / cr88192

Jan 14, 2010, 2:24:25 AM

<la...@ludens.elte.hu> wrote in message news:aZtfy5uqZoQ0@ludens...

what something is defined as and how it is used are not always strictly the
same...

AFAIK, it is a common understanding among compiler implementors that volatile
should also be treated as an operation for thread-safe behavior, even though it
is not stated to serve this purpose.


similarly, as for load/store ordering with different variables:
how often does this actually matter in practice?...

granted, I can't say about non-x86 CPUs, but in general, on x86, everything
tends to work just fine simply using volatile for most variables which may
be involved in multi-thread activity.

a relative rarity, as in my case threads most often act
independently and on different data (and in the cases where they do share data, it
is either fully synchronized, or almost entirely non-synchronized, with one
thread not having any real assurance WRT data being handled in other
threads).


granted, fully-synchronous/fenced operations are generally used in special
conditions, such as for locking and unlocking mutexes, ...


> Cheers,
> lacos


BGB / cr88192

Jan 14, 2010, 2:42:54 AM

"gwowen" <gwo...@gmail.com> wrote in message
news:00a9ae44-5e60-43aa...@q4g2000yqm.googlegroups.com...

On Jan 9, 6:02 pm, la...@ludens.elte.hu wrote:

> I sincerely believe that you're wrong. This is a very frequent fallacy
> (I hope I'm using the right word). volatile in C has nothing to do with
> threads.

snip...


<--


But with volatile it works, and without, it may not.

Threads change variables behind the compilers back -- volatile can act
as a warning that that might happen.

-->

and I think most compiler writers already know this one implicitly...

beyond threading, volatile has little use in user-mode applications, so it
is essentially "re-dubbed" as an implicit "make variable safe for threads"
operation (possibly inserting memory fences, ... if needed).

all this is because, sometimes, we compiler writers don't care exactly what
the standards say, and so may re-interpret things in some subtle
ways to make them useful.


this may mean:
volatile synchronizes memory accesses and may insert fences (although, as
noted, the x86/x86-64 ISA is usually smart enough to make this unneeded);
non-volatile variables are safe for all sorts of thread-unsafe trickery (as,
after all, if thread synchronization mattered for them they would have been
volatile);
...

as well as other subtleties:
pointer arithmetic on 'void *' working without complaint;
free casting between function pointers and data pointers;
...

as well, there may be restrictions for an arch above the level of the
standard:
for example, given structure definitions must be laid out in particular
ways, and apps may depend on the specific size and byte-level layout of
structures;
apps may depend on underlying details of the calling convention, stack
layout, register-allocation behavior, ...
...


a standards head will be like "no, code may not depend on this behavior",
"the compiler may do whatever it wants", ...

in reality, it is usually much more confined than this:
if the compiler varies in any of these little subtle details,
existing legacy code may break, ...


of course, this may lead to code getting "stuck" for a while, and when a major
change finally happens, it breaks a lot of code...

it is notable how much DOS-era C code doesn't work on Windows, or for that
matter, how lots of Win32 code will not work on Win64 even despite some of
the ugliness MS went through to try to make the transition go smoothly...


or such...

BGB / cr88192

Jan 14, 2010, 2:57:28 AM

"Flash Gordon" <sm...@spam.causeway.com> wrote in message
news:esjp17x...@news.flash-gordon.me.uk...

if a processor claims to be "backwards compatible" yet old code often breaks
on it, who takes the blame?...
that is right, it is the manufacturer...

it is worth noting the rather large numbers of hoops Intel, MS, ... have
gone through over the decades to make all this stuff work, and keep
working...

it was a great sudden turn of events when MS dropped Win16 and MS-DOS
support from Win64, even though technically there was little "real" reason
for doing so (the lack of v86 and segments in long mode seems to me more like an
excuse, as MS demonstrably has the technology to just use an
interpreter...).

AMD could partly be blamed for their design decisions, but I guess they
figured "well, probably the OS will include an emulator for this old
stuff...".


the end result is that it is then forced on the user to go get and use an
emulator for their older SW, which works, but from what I have heard, there
are probably at least a few other unhappy users around from the recent turn
of events...

it doesn't help that even lots of 32-bit SW has broken on newer Windows, due
I suspect to MS no longer really caring so much anymore about legacy
support...


> --
> Flash Gordon


Chris M. Thomasson

Jan 14, 2010, 4:32:40 PM
"amit" <nos...@nospam.com> wrote in message
news:hi8dvk$9qd$1...@speranza.aioe.org...

http://groups.google.com/group/comp.arch/browse_frm/thread/df6f520f7af13ea5
(read all...)

Flash Gordon

Jan 14, 2010, 7:25:11 PM

<snip>

Not successfully. I used programs that worked on a 286 PC but failed on
a 386 unless you switched "Turbo mode" off. This was nothing to do with
the OS.

> it doesn't help that even lots of 32-bit SW has broken on newer Windows, due
> I suspect to MS no longer really caring so much anymore about legacy
> support...

It ain't all Microsoft's fault. Also, there are good technical reasons
for dropping support of ancient interfaces.
--
Flash Gordon

BGB / cr88192

Jan 14, 2010, 8:28:19 PM

"Flash Gordon" <sm...@spam.causeway.com> wrote in message
news:88p427x...@news.flash-gordon.me.uk...

on DOS, yes, it is the HW in this case...


>> it doesn't help that even lots of 32-bit SW has broken on newer Windows,
>> due I suspect to MS no longer really caring so much anymore about legacy
>> support...
>
> It ain't all Microsoft's fault. Also, there are good technical reasons for
> dropping support of ancient interfaces.

yeah, AMD prompted it with a few of their changes...


but, MS could have avoided the problem by essentially migrating both NTVDM
and DOS support into an interpreter (which would itself provide segmentation
and v86).

a lot of the rest of what was needed (to glue the interpreter to Win64) was
likely already implemented in getting WoW64 working, ...


this way, we wouldn't have been stuck needing DOSBox for software from
decades-past...

if DOSBox can do it, MS doesn't have "that" much excuse, apart from maybe
that they can no longer "sell" all this old software, so for them there is
not as much market incentive to keep it working...

Nobody

Jan 15, 2010, 5:52:10 AM
On Thu, 14 Jan 2010 00:57:28 -0700, BGB / cr88192 wrote:

>> If the compiler does not claim to support processors not yet invented then
>> that is not a problem. You can't blame a compiler (or program) if it fails
>> for processors which are not supported even if the processor is
>> theoretically backwards compatible.
>
> if a processor claims to be "backwards compatible" yet old code often breaks
> on it, who takes the blame?...
> that is right, it is the manufacturer...
>
> it is worth noting the rather large numbers of hoops Intel, MS, ... have
> gone through over the decades to make all this stuff work, and keep
> working...

It's also worth noting that they know when to give up.

If maintaining compatibility just requires effort (on the part of MS or
Intel), then usually they make the effort. If it would require a
substantial performance sacrifice (i.e. complete software emulation),
then tough luck.

BGB / cr88192

Jan 16, 2010, 2:30:33 AM

"Nobody" <nob...@nowhere.com> wrote in message
news:pan.2010.01.15....@nowhere.com...

for the DOS or Win 3.x apps, few will notice the slowdown, as these apps
still run much faster in the emulator than on the original HW...

an emulator would also not slow down things not running in it:
DOS and Win 3.x apps would be "slowed" (still much faster than original),
whereas 32-bit apps run at full speed directly on the HW.

(I have also written the same sort of interpreter, and it is not exactly a
huge or difficult feat).


so, as I see it, there was little reason for them not to do this, apart from
maybe a lack of economic payoff (they can't make lots of money off of
having people's Win 3.x era stuff keep working natively...).

DOSBox works plenty well for DOS, but I have found DOSBox running Win3.11 to
be kind of lame (the primary reason being that DOSBox+Win3.11 means
poor-FS-sharing, one usually ends up having to exit Win3.11 to sync files,
...).

I had partly considered doing my own Win3.x on Win64 emulator, but I figured
this would be more effort than it is probably worth (I don't need good
integration that badly, but it would be nice...).

unsurprisingly, there does not seem to be a Windows port of Wine (but Wine
itself has the ability to make use of emulation...).


although, FWIW, Win 3.11 on DOSBox does seem a bit like a little toy OS,
almost like one of those little gimmick OS's that people put within some
games... (content simulated, and only a small number of things to look at,
...). except, this was the OS...

Flash Gordon

Jan 16, 2010, 4:19:59 AM

Yes, which is relevant to the original point. If a compiler conforms to
the standard on the platform it was written for, but does not conform to
the standard when run on hardware/software more recent than the
compiler, then that is *not* a problem with the compiler.

If either software or hardware vendor changes things in a way that
breaks other software for good reason (and there are lots of good
reasons) then I'm afraid that's part of life.

>>> it doesn't help that even lots of 32-bit SW has broken on newer Windows,
>>> due I suspect to MS no longer really caring so much anymore about legacy
>>> support...
>> It ain't all Microsoft's fault. Also, there are good technical reasons for
>> dropping support of ancient interfaces.
>
> yeah, AMD prompted it with a few of their changes...
>
> but, MS could have avoided the problem by essentially migrating both NTVDM
> and DOS support into an interpreter (which would itself provide segmentation
> and v86).

That doesn't stop it breaking things due to running too fast. It is also
another piece of software which has to be maintained on an ongoing
basis, so security audits, testing, regression testing whenever Windows
is patched, redoing the security audit if it needs to be patched due to
a patch in core Windows... it isn't cheap to do right.

> a lot of the rest of what was needed (to glue the interpreter to Win64) was
> likely already implemented in getting WoW64 working, ...
>
> this way, we wouldn't have been stuck needing DOSBox for software from
> decades-past...
>
> if DOSBox can do it, MS doesn't have "that" much excuse, apart from maybe
> that they can no longer "sell" all this old software, so for them there is
> not as much market incentive to keep it working...

It's not simply that there is no money in it. It's also that there are
costs which increase over time. There were costs involved in being able
to run DOS under Win3.1 in protected mode. Getting it working in Win95
required more work and so more costs. Making everything work under
Windows NT would have cost even more (so they did not try and make
everything work). At some point it would require a complete processor
emulator, which can be written, but is even more work and more complex
and would need revalidating every time Windows is patched (the DOSBox
people can just wait until someone reports it is broken, rather than
having to revalidate it themselves for every patch).

There is also the simple old rule that the more lines of code the more
bugs there will be and the harder maintenance and future development is,
and keeping backwards compatibility and obsolete features increases the
line count.
--
Flash Gordon

BGB / cr88192

Jan 16, 2010, 3:20:35 PM

"Flash Gordon" <sm...@spam.causeway.com> wrote in message
news:2vc827x...@news.flash-gordon.me.uk...

> BGB / cr88192 wrote:
>> "Flash Gordon" <sm...@spam.causeway.com> wrote in message

<snip>

>>> <snip>
>>>
>>> Not successfully. I used programs that worked on a 286 PC but failed on
>>> a 386 unless you switched "Turbo mode" off. This was nothing to do with
>>> the OS.
>>
>> on DOS, yes, it is the HW in this case...
>
> Yes, which is relevant to the original point. If a compiler conforms to
> the standard on the platform it was written for, but does not conform to
> the standard when run on hardware/software more recent than the compiler,
> then that is *not* a problem with the compiler.
>

yes, this is the fault of the HW vendor(s) for having changed their spec in
a non-backwards-compatible way...

for example, AMD is to blame for a few things:
REX not working outside long mode;
v86 and segments not working in long mode;
...

but, they did an overall decent job considering...
(much better than Itanium, we can see where this went...).


> If either software or hardware vendor changes things in a way that breaks
> other software for good reason (and there are lots of good reasons) then
> I'm afraid that's part of life.
>

but, then one has to determine what is good reason.

in the DOS/Win16 case, I am not convinced it was good reason.


>>>> it doesn't help that even lots of 32-bit SW has broken on newer
>>>> Windows, due I suspect to MS no longer really caring so much anymore
>>>> about legacy support...
>>> It ain't all Microsoft's fault. Also, there are good technical reasons
>>> for dropping support of ancient interfaces.
>>
>> yeah, AMD prompted it with a few of their changes...
>>
>> but, MS could have avoided the problem by essentially migrating both
>> NTVDM and DOS support into an interpreter (which would itself provide
>> segmentation and v86).
>
> That doesn't stop it breaking things due to running too fast. It is also
> another peace of software which has to be maintained on an ongoing basis,
> so security audits, testing, regression testing whenever Window is
> patched, redoing the security audit if it needs to be patched due to a
> patch in core Windows... it isn't cheap to do right.
>

running too fast doesn't break most apps, and for those rare few that do,
emulators like DOSBox include the ability to turn down the virtual
clock-rate (turning it down really low can make Doom lag, ...).

presumably, something like this would be done almost purely in userspace,
and hence it would not be nearly so sensitive to breaking.

basically, in this case the whole 16-bit substructure (including GUI,
...) would likely be moved into the emulator, then maybe the 16-bit apps
draw into the real OS via Direct2D or whatever...


>> a lot of the rest of what was needed (to glue the interpreter to Win64)
>> was likely already implemented in getting WoW64 working, ...
>>
>> this way, we wouldn't have been stuck needing DOSBox for software from
>> decades-past...
>>
>> if DOSBox can do it, MS doesn't have "that" much excuse, apart from maybe
>> that they can no longer "sell" all this old software, so for them there
>> is not as much market incentive to keep it working...
>
> It's not simply that there is no money in it. It's also that there are
> costs which increase over time. There were costs involved in being able to
> run DOS under Win3.1 in protected mode. Getting it working in Win95
> required more work and so more costs. Making everything work under Windows
> NT would have cost even more (so they did not try and make everything
> work). At some point it would require a complete processor emulator, which
> can be written, but is even more work and more complex and would need
> revalidating every time Windows is patched (the DOSBox people can just
> wait until someone reports it is broken, rather than having to revalidate
> it themselves for every patch).
>
> There is also the simple old rule that the more lines of code the more
> bugs there will be and the harder maintenance and future development is,
> and keeping backwards compatibility and obsolete features increases the
> line count.


a CPU emulator is not that complicated, really...
one can write one in maybe around 50 kloc or so.

more so, MS already had these sorts of emulators, as they used them for
things like WinNT on Alpha, ... there is little reason why similar emulators
wouldn't work on x64.

the rest would be to dump some old DLL's on top (hell, maybe the ones from
Win 3.11, FWIW, or a subset of 95 or 98...), and maybe do a little plumbing
to get graphics to the host, get back mouse, allow interfacing the native
filesystem, ...

I could almost do all this myself, apart from not wanting to bother (since
DOSBox+Win3.11 works, or I could install 95, 98, or XP in QEMU, ...). but as
I see it, this one should have been MS's responsibility (rather than burdening
end-users with something which is presumably their responsibility).

(DOSBox gives direct filesystem, but doesn't do very well at keeping it
sync'ed, resulting in extra hassles, 32-bit XP + QEMU though would allow
mounting a network share, OTOH).


hell, MS could have probably even just included DOSBox, FWIW.

well, ok, it is worth noting that Windows 7 Professional & Enterprise do
come with an emulator (not seen personally), which I guess just runs 32-bit
XP (and, so yes, 16-bit SW does maybe return on Win-7, in an emulator...).
(well, with these, one can also get an MS-adapted version of GCC and BASH,
hmm...).

I have Win-7 on my laptop, but it is "Home Ultimate", and hence also
requires the DOSBox or QEMU trick...


anyways, big code is not really that big of a problem IME:
I am working on codebases in the Mloc range, by myself, and in general have
not had too many problems of this sort.

MS has lots more developers, so probably they have easily 10s or 100s of
Mloc to worry about, rather than just the few 100s of kloc needed to make
something like this work, and maybe even be really nice-looking and well
behaved...

Nobody

Jan 17, 2010, 12:11:25 AM
On Sat, 16 Jan 2010 00:30:33 -0700, BGB / cr88192 wrote:

>>> it is worth noting the rather large numbers of hoops Intel, MS, ... have
>>> gone through over the decades to make all this stuff work, and keep
>>> working...
>>
>> It's also worth noting that they know when to give up.
>>
>> If maintaining compatibility just requires effort (on the part of MS or
>> Intel), then usually they make the effort. If it would require a
>> substantial performance sacrifice (i.e. complete software emulation),
>> then tough luck.
>
> for the DOS or Win 3.x apps, few will notice the slowdown, as these apps
> still run much faster in the emulator than on the original HW...

That depends upon how much you emulate.

For code which interacts with hardware, you may need to emulate the
hardware as well. This is less of a problem for the PC, due to the
widespread presence of clones (i.e. non-IBM systems). On platforms with
little or no hardware variation (e.g. the Amiga), programs would often
rely upon a particular section of code completing execution before a
certain hardware event occurred (or vice versa).

You may also need to emulate the timings for other reasons. A game which
doesn't scale to frame rate may need to be slowed down to maintain
playability. OTOH, a game which does scale to frame rate may need to be
slowed down so that the frame timings fit within the expected range.

A concrete example of the latter: the original Ultima Underworld game
basically still runs under Win98 on a P4. However: it scales to frame
rate, i.e. the distance anything moves each frame is proportional to the
time between frames. Normally this would be a good thing, except that
everything uses integer coordinates. On a modern system, the time between
frames is so low that the distance moved per frame often comes out at less
than one "unit", so positive values get rounded to zero and negative
values to minus one, resulting in some entities (including the player)
being unable to move North or East.
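In code, the failure mode is roughly the following (names and numbers invented for illustration; this is not the game's actual source):

--------------------------
#include <math.h>

/* Hypothetical sketch of the per-frame rounding problem described above. */
int units_moved_this_frame(double units_per_second, double frame_seconds)
{
    /* old hardware:   60.0 * 0.070 =  4.2 -> floor() gives  4 units
       fast hardware:  60.0 * 0.005 =  0.3 -> floor() gives  0 units
                      -60.0 * 0.005 = -0.3 -> floor() gives -1 unit
       so movement in the positive directions collapses to zero while the
       opposite directions still creep along at -1 per frame. */
    return (int)floor(units_per_second * frame_seconds);
}
--------------------------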

tl;dr version: some code is so tied to specific hardware that the only way
you can run it on anything else involves VHDL/Verilog simulation.

BGB / cr88192

Jan 17, 2010, 12:53:40 PM

"Nobody" <nob...@nowhere.com> wrote in message
news:pan.2010.01.17....@nowhere.com...

> On Sat, 16 Jan 2010 00:30:33 -0700, BGB / cr88192 wrote:
>
>>>> it is worth noting the rather large numbers of hoops Intel, MS, ...
>>>> have
>>>> gone through over the decades to make all this stuff work, and keep
>>>> working...
>>>
>>> It's also worth noting that they know when to give up.
>>>
>>> If maintaining compatibility just requires effort (on the part of MS or
>>> Intel), then usually they make the effort. If it would require a
>>> substantial performance sacrifice (i.e. complete software emulation),
>>> then tough luck.
>>
>> for the DOS or Win 3.x apps, few will notice the slowdown, as these apps
>> still run much faster in the emulator than on the original HW...
>
> That depends upon how much you emulate.
>
> For code which interacts with hardware, you may need to emulate the
> hardware as well. This is less of a problem for the PC, due to the
> widespread presence of clones (i.e. non-IBM systems). On platforms with
> little or no hardware variation (e.g. the Amiga), programs would often
> rely upon a particular section of code completing execution before a
> certain hardware event occurred (or vice versa).
>

yeah. DOS emulators typically run them on fake HW.
for example, DOSBox uses a fake SoundBlaster and S3 Virge (from what I
remember), ...

for Win16, this should not be necessary, since Win16 still did have
protection, and generally isolated the software from the HW. even if it did
allow direct HW access, these apps would have been unlikely to run on NT-based
systems (unless NTVDM was actually faking a bunch of HW as well, but I doubt
this).


> You may also need to emulate the timings for other reasons. A game which
> doesn't scale to frame rate may need to be slowed down to maintain
> playability. OTOH, a game which does scale to frame rate may need to be
> slowed down so that the frame timings fit within the expected range.
>
> A concrete example of the latter: the original Ultima Underworld game
> basically still runs under Win98 on a P4. However: it scales to frame
> rate, i.e. the distance anything moves each frame is proportional to the
> time between frames. Normally this would be a good thing, except that
> everything uses integer coordinates. On a modern system, the time between
> frames is so low that the distance moved per frame comes often out at less
> than one "unit", so positive values get rounded to zero and negative
> values to minus one, resulting in some entities (including the player)
> being unable to move North or East.
>

granted, poor code is allowed to break, presumably...
I think in general the Ultima games were known for being horridly
unreliable/broken even on the HW they were designed for...


anyways, the point would be to make old software work, not to make buggy
software work.
the vast majority of old SW working is what is asked, not all old software
which may contain obscure bugs.


> tl;dr version: some code is so tied to specific hardware that the only way
> you can run it on anything else involves VHDL/Verilog simulation.
>

errm, I doubt this...

most full-system emulators fake things at the level of the IO ports, ... and
this in general works plenty well (both OS's and apps generally work). other
things, such as the DMA and IRQ controller, ... can similarly be faked in
SW, and don't require full HW simulation.

on many newer systems, the bus controller itself contains a processor and
some code, which emulates some legacy devices in much the same way: watching
IO ports, responding, ...

granted, not everything works exactly, within reasonable bounds:
QEMU or Bochs will probably not give HW Accel graphics, for example, but
most other things work.

bit-twiddling != need for VHDL...


Nobody

Jan 18, 2010, 9:25:28 AM
On Sun, 17 Jan 2010 10:53:40 -0700, BGB / cr88192 wrote:

>> tl;dr version: some code is so tied to specific hardware that the only way
>> you can run it on anything else involves VHDL/Verilog simulation.
>
> errm, I doubt this...
>
> most full-system emulators fake things at the level of the IO ports, ... and
> this in general works plenty well (both OS's and apps generally work). other
> things, such as the DMA and IRQ controller, ... can similarly be faked in
> SW, and don't require full HW simulation.

But they only emulate the hardware to the extent sufficient for "typical"
use.

That's not a problem if the system you're trying to emulate includes a
"real" OS. You only need to emulate to the level at which the OS uses the
hardware, and at which it permits other applications to use it.

On platforms where it was common for applications to just kick the OS
out of the way and access the hardware directly (i.e. most of the 8- and
16-bit micros, and PCs before Win3.1 took over from DOS), anything could
happen (and often did).
