
contract-based programming in C++ and D


Andrei Alexandrescu (See Website For Email)

Dec 24, 2005, 3:42:06 AM
In an older message in the thread "A safer/better C++?" I wrote:

>> I have to add on a related note that D implements contracts "so
>> wrong it hurts."

And Steven E Harris asked:

> Can you elaborate?

I thought it would be interesting to open a discussion on implementing
contracts in D (http://www.digitalmars.com/d/dbc.html) as opposed to the
currently-proposed contracts for C++
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1773.html).

Note that I don't know about the level of acceptance of the proposal for
C++. I also don't know much about how D programmers find D's
implementation of contracts.

My opinion is that contracts as implemented for D are fatally flawed and
must be revised, and that contracts as proposed for C++ are sound
barring a few limitations.

If I understand the documentation correctly, D makes a contract part of
the implementation and not part of the interface. Fundamentally, a
contract must be part of the interface and force implementations of that
interface to conform to the contract. Essentially, contracts offer more
expressive and more precise ways of defining interfaces. Not (only)
implementations!

In contrast, the proposal for C++ contracts clearly and repeatedly shows
with examples how contracts can, and should, be associated with the
interface, even when no implementation is in sight (such as specifying a
contract on a pure virtual function, which is the main intended idiom
for contracts).
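
To make the point concrete, here is a sketch in plain standard C++ (not
the N1773 syntax) of a contract attached to an interface: the public
non-virtual function owns the pre- and postconditions, and every
implementation of the pure virtual function is forced through them. The
names (Sqrt, NewtonSqrt) are invented for illustration.

```cpp
#include <cassert>
#include <cmath>

// Contract on the interface, not the implementation: apply() checks the
// precondition and postcondition; derived classes override only do_apply().
class Sqrt {
public:
    double apply(double x) {
        assert(x >= 0.0);                     // precondition, owned by the interface
        double r = do_apply(x);
        assert(std::fabs(r * r - x) < 1e-6);  // postcondition, also owned here
        return r;
    }
    virtual ~Sqrt() {}
private:
    virtual double do_apply(double x) = 0;    // implementations plug in here
};

class NewtonSqrt : public Sqrt {
private:
    double do_apply(double x) override {
        double r = (x > 1.0) ? x : 1.0;       // Newton's method starting guess
        for (int i = 0; i < 100; ++i)
            r = 0.5 * (r + x / r);
        return r;
    }
};
```

Every concrete subclass is checked against the same contract, because
clients can only call through the non-virtual apply().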

I have one nit and one question about C++ contracts. The nit is about
section 2.2: the description of the translation is misleading, because
__old variables are initialized upon entry to the function but could be
used much later. The translation sketch suggests that the compiler only
has to substitute __old definitions right at the point of use.
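
A tiny illustration of why capture must happen at entry, with an
assertion standing in for the proposed syntax (old_size plays the role
of an __old variable; the name is ours, not the proposal's):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The postcondition mentions the "old" size, but push_back changes the
// size before the postcondition runs; the value must be saved on entry.
void append(std::vector<int>& v, int x) {
    std::size_t old_size = v.size();   // captured at function entry...
    v.push_back(x);
    assert(v.size() == old_size + 1);  // ...or the postcondition is vacuous
}
```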

The question is, why was precondition weakening left out? Section 2.3.1
says:

============================================
"Section 3.5 of
[http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2004/n1613.pdf]
explains how little redefinition of preconditions is used. Even though
subcontracting is theoretically sound, it ends up being fairly useless
in practice [8]."

Footnote 8: A weaker precondition can be taken advantage of if we know
the particular type of the object. If weaker preconditions should be
allowed, then there exists two alternatives: to allow reuse of an
existing contract or to require a complete redefinition. The former
favours expressiveness, the latter favours overview.
============================================

The relevant quoted reference is:

============================================
Has subcontracting actually any use in practice? According to one
professional Eiffel programmer I spoke with, Berend de Boer, he has not
used the possibility to loosen preconditions, but he has used the
ability to provide stronger postconditions regularly. As an example, the
Gobo Eiffel Project consists of approximately 140k lines of code
(including comments) and contains 219 stronger postconditions and 109
weaker preconditions [Bez03].
============================================

One data point, even if from a professional, can't be significant.
Besides, the numbers given for the Gobo Eiffel project do show that
there is significant use of weaker preconditions, which contradicts de
Boer's experience.

My understanding is that a weaker precondition means "or"ing the
overridden and the overriding preconditions, so that's not much trouble
for the implementation. But it looks like my current view is
simplistic... could anyone enlighten me?
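
In plain C++, with an assertion standing in for the contract machinery
and invented class names, the "or"ing can be sketched like this; the
derived override explicitly disjoins the base precondition, which is
what a subcontracting proposal would arrange implicitly:

```cpp
#include <cassert>

// Weakening by "or": the effective precondition of the derived type is
// (base precondition) || (new case), so every call that was valid on
// the base stays valid on the derived type.
struct Account {
    virtual bool can_withdraw(int amount) const { return amount > 0; }
    void withdraw(int amount) {
        assert(can_withdraw(amount));  // effective precondition, dispatched
        balance -= amount;
    }
    int balance = 100;
    virtual ~Account() {}
};

struct OverdraftAccount : Account {
    bool can_withdraw(int amount) const override {
        // weaker precondition = overridden one OR the overriding one
        return Account::can_withdraw(amount) || amount == 0;
    }
};
```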


Andrei

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

dbj...@gmail.com

Dec 25, 2005, 1:05:08 PM
One could also say that "the contract is in the message". This is the
old question: "Which is more important: the interface or the message?"
Or are they equally important, or is it the wrong question to ask?
I personally think that the ideal interface should be as narrow as
possible. This (ideally) leads to a constant and unchangeable interface.


In that world the message becomes more important, and thus so does the
contract, which is conceptually treated and practically implemented as
message encoding/decoding, and thus not part of the interface. Following
this (valid?) concept leads to contract handling being encapsulated
inside objects. This is one of the pillars of what we today know as MOM
(Message-Oriented Middleware). In my experience still the most usable
kind of middleware.

P.S. Using C++, one can take the "I can do whatever I want/need
to do" attitude. This of course gives the C++ user both power and
the danger of abusing it. I am afraid this is what I like about C++,
and what D seems unable to offer...yet?

DBJ

Walter Bright

Dec 26, 2005, 6:50:13 AM

<dbj...@gmail.com> wrote in message
news:1135507534.8...@g49g2000cwa.googlegroups.com...

> P.S. Using C++, one can take the "I can do whatever I want/need
> to do" attitude. This of course gives the C++ user both power and
> the danger of abusing it. I am afraid this is what I like about
> C++, and what D seems unable to offer...yet?

I'm curious what "power" you see in C++ that you can't do in D. For example,
the ultimate power is inline assembler - and D has it, C++ does not. D has
unrestricted pointer access available (though it is rarely necessary to use
it). C++ has proposals for many powerful features that D already has.

-Walter Bright
www.digitalmars.com C, C++, D programming language compilers

Bob Hairgrove

Dec 26, 2005, 10:05:43 AM
On 26 Dec 2005 06:50:13 -0500, "Walter Bright"
<wal...@nospamm-digitalmars.com> wrote:

>
><dbj...@gmail.com> wrote in message
>news:1135507534.8...@g49g2000cwa.googlegroups.com...
>> P.S. Using C++, one can take the "I can do whatever I want/need
>> to do" attitude. This of course gives the C++ user both power and
>> the danger of abusing it. I am afraid this is what I like about
>> C++, and what D seems unable to offer...yet?
>
>I'm curious what "power" you see in C++ that you can't do in D. For example,
>the ultimate power is inline assembler - and D has it, C++ does not. D has
>unrestricted pointer access available (though it is rarely necessary to use
>it). C++ has proposals for many powerful features that D has.
>
>-Walter Bright
>www.digitalmars.com C, C++, D programming language compilers

Why do you say that one cannot use inline assembler in C++? Although
it is implementation-defined as to how one can use it, the syntax is
explicitly mentioned in the C++ standard (see section 7.4).

--
Bob Hairgrove
NoSpam...@Home.com

Branimir Maksimovic

Dec 26, 2005, 8:44:10 PM

Walter Bright wrote:
> <dbj...@gmail.com> wrote in message
> news:1135507534.8...@g49g2000cwa.googlegroups.com...
> > P.S. Using C++, one can take the "I can do whatever I want/need
> > to do" attitude. This of course gives the C++ user both power and
> > the danger of abusing it. I am afraid this is what I like about
> > C++, and what D seems unable to offer...yet?
>
> I'm curious what "power" you see in C++ that you can't do in D.

I started to learn D a while ago, and found a few things that I need.
Initialization of arrays with new char[n] slows things down, and new
T[n] gives an array of pointers instead of objects. I found it very
difficult to implement a vector: each class that is stored has to have
a placement new, and then cannot be used for other purposes. Perhaps I
can get a global placement new operator? I don't know how. But I'm
impressed that destructors are called instead of finalizers, and that
this works despite circular references. Add to that native for_each
support, really fast associative arrays, and lambdas and closures, and
I finally think that D is a good competitor to C++, unlike Java and C#.
It just needs more time.

> For example,
> the ultimate power is inline assembler - and D has it, C++ does not. D has
> unrestricted pointer access available (though it is rarely necessary to use
> it). C++ has proposals for many powerful features that D has.

Standard or not, I never found C++ to be restricted in low-level
features. Native support for inline assembler is not a plus, as
assemblers can vary in syntax and implementation even on the same CPU.
This is always platform/compiler specific, so native support for
assembly actually restricts what one can do. We don't want portability
in any way when doing assembly; rather, the possibility to use any
assembler with the full features available on the system.

Greetings, Bane.

Hasan Aljudy

Dec 26, 2005, 9:07:22 PM
Well, try writing inline assembly in VC++, then try to compile it on
gcc!!

Walter Bright

Dec 26, 2005, 9:05:37 PM

"Bob Hairgrove" <inv...@bigfoot.com> wrote in message
news:72pvq1lugrvkjbgiq...@4ax.com...

> On 26 Dec 2005 06:50:13 -0500, "Walter Bright"
> <wal...@nospamm-digitalmars.com> wrote:
>><dbj...@gmail.com> wrote in message
>>news:1135507534.8...@g49g2000cwa.googlegroups.com...
>>> P.S. Using C++, one can take the "I can do whatever I want/need
>>> to do" attitude. This of course gives the C++ user both power and
>>> the danger of abusing it. I am afraid this is what I like about
>>> C++, and what D seems unable to offer...yet?
>>
>>I'm curious what "power" you see in C++ that you can't do in D. For
>>example,
>>the ultimate power is inline assembler - and D has it, C++ does not. D has
>>unrestricted pointer access available (though it is rarely necessary to
>>use
>>it). C++ has proposals for many powerful features that D has.
>>
>>-Walter Bright
>>www.digitalmars.com C, C++, D programming language compilers
>
> Why do you say that one cannot use inline assembler in C++?

Because it isn't part of standard C++. Some C++ compilers implement one as
an extension, often using wildly incompatible syntax. Other C++ compilers
implement none at all.

> Although
> it is implementation-defined as to how one can use it, the syntax is
> explicitly mentioned in the C++ standard (see section 7.4).

Here's section 7.4 in its entirety:

---------------------------------------------
[dcl.asm] 7.4 The asm declaration
1 An asm declaration has the form
asm-definition:
asm ( string-literal ) ;
The meaning of an asm declaration is implementation defined.
[Note: Typically it is used to pass information through the implementation
to an assembler. ]
----------------------------------------------

A string with an "implementation defined" meaning isn't a specification for
an inline assembler. I could just as well implement that to be a double
entry bookkeeping system, but that doesn't mean that Standard C++ supports
tax accounting.
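
For illustration, here is the same trivial addition written in the two
dominant and mutually incompatible inline-asm dialects; the preprocessor
guards and the portable fallback underline that standard C++ itself
specifies none of this:

```cpp
#include <cassert>

// One addition, three worlds: GCC/Clang extended asm (AT&T operand
// order), MSVC __asm blocks (Intel order), and plain standard C++.
int asm_add(int a, int b) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int result;
    asm("addl %2, %0"            // AT&T order: source, destination
        : "=r"(result)
        : "0"(a), "r"(b));       // "0": start with a in the result register
    return result;
#elif defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov eax, a               // Intel order: destination, source
        add eax, b
    }                            // MSVC idiom: value left in EAX is returned
#else
    return a + b;                // standard C++ offers no inline asm at all
#endif
}
```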

-Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Mirek Fidler

Dec 27, 2005, 6:47:28 AM
> Because it isn't part of standard C++. Some C++ compilers implement one as
> an extension, often using wildly incompatible syntax. Other C++ compilers
> implement none at all.

Do you suggest that D has a compatible syntax for assembler? Meaning
the same assembler code can be compiled both for PowerPC and x86?
(Otherwise it does not matter whether assembler syntax is "wildly
incompatible", as you have to #ifdef the assembly for each platform
anyway.)

Mirek

Walter Bright

Dec 27, 2005, 6:47:03 AM

"Branimir Maksimovic" <bm...@volomp.com> wrote in message
news:1135618324.5...@o13g2000cwo.googlegroups.com...

>
> Walter Bright wrote:
>> <dbj...@gmail.com> wrote in message
>> news:1135507534.8...@g49g2000cwa.googlegroups.com...
>> > P.S. Using C++, one can take the "I can do whatever I want/need
>> > to do" attitude. This of course gives the C++ user both power and
>> > the danger of abusing it. I am afraid this is what I like about
>> > C++, and what D seems unable to offer...yet?
>>
>> I'm curious what "power" you see in C++ that you can't do in D.
>
> I started to learn D while ago, and found few things that
> I need. Initialization of arrays new char[n] slows things down

D has a variety of techniques available to allocate memory; new char[n]
is only one. You can use malloc() for uninitialized storage.

> and also new T[n] gives array of pointers instead of objects.

That's only if T is a class type. If T is a basic type, or a struct type,
etc., it'll be an array of objects.

> I found very difficult to implement vector, each class that is stored
> has to have placement new and cannot be used for other purposes.

D's core arrays are solid enough that I'm curious why you'd implement a vector type.

> Perhaps I can get global placement new operator? Don't know how.

Overriding global operator new is a misfeature of C++, which is why it is
absent in D. Be that as it may, the garbage collector in D can be replaced
with one's own version, after all, it is nothing more than a few library
functions.

> But I'm impressed that destructors are called instead of
> finalizers and that is working despite circular references.
> Add to that for_each native support and really fast associative arrays,
> also lambdas and closures, I finally I think that D is good competitor
> to C++, unlike Java and C#.
> It just needs more time.

D does get constantly better, as more experience with it is gained. D is
intended to be usable for system level applications. One test of that is,
can you write a garbage collector in D? (yes!). Can you write one in Java?
In C#?

> For example,
>> the ultimate power is inline assembler - and D has it, C++ does not. D
>> has
>> unrestricted pointer access available (though it is rarely necessary to
>> use
>> it). C++ has proposals for many powerful features that D has.
> In standard or not I never found C++ to be restricted in low level
> features.

I have, in particular the erratic and generally poor support for inline
assembler. An inline assembler is rarely needed, but when you do need it,
it's extremely useful - for example, if you wish to access the special
registers on the x86.

> Native support for inline assembler is not a plus, as
> assemblers can vary in syntax and implementation even on same CPU.

I see that as the result of not standardizing it. The CPU is the same, why
isn't the inline assembler the same? What purpose is served by gcc and vc
having the register operands reversed?

> This is always platform/compiler specific

I disagree, it need only be CPU specific. There are a lot fewer CPU
instruction set families than platforms.

> so native support for assembly actually restricts what one can do.

I don't see how.

> We don;t want portability in any way
> when doing assembly, rather, possibility to use any assembler
> with full features available on system.

I'm not sure what you're saying there, but I'll take a stab at it:

1) For some things, you just gotta do it in assembler. Having an inline
assembler does the job 95% of the time, and obviates the need for a separate
assembler, extra files, a more complex build process, etc.

2) Portability in the assembler does matter. For example, I've ported apps
between Windows and Linux for the x86. It's a nuisance having to rewrite all
the inline assembler due to gratuitous and pointless incompatibilities.

3) Needing a few lines of assembler here and there doesn't mean one needs a
full blown macro assembler. Usually, the need is less than 10 lines.

I've written a lot of assembler code, including complete applications. Much
of the Digital Mars runtime library is written using a separate assembler,
and I've over time been switching it to using inline assembler. I wouldn't
be doing it if it made things worse.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


E. Mark Ping

Dec 27, 2005, 6:46:36 AM
In article <1135645446....@g43g2000cwa.googlegroups.com>,

Hasan Aljudy <hasan....@gmail.com> wrote:
>Well, try writing inline assembly in VC++, then try to compile it on
>gcc!!

Such as with VC++ for x86 and GCC for PowerPC?
--
Mark Ping
ema...@soda.CSUA.Berkeley.EDU

Branimir Maksimovic

Dec 27, 2005, 10:32:57 AM
Walter Bright wrote:
> "Branimir Maksimovic" <bm...@volomp.com> wrote in message
> news:1135618324.5...@o13g2000cwo.googlegroups.com...
>
>>Walter Bright wrote:
>>
>>><dbj...@gmail.com> wrote in message
>>>news:1135507534.8...@g49g2000cwa.googlegroups.com...
>>>
>>>>P.S. Using C++, one can take the "I can do whatever I want/need
>>>>to do" attitude. This of course gives the C++ user both power and
>>>>the danger of abusing it. I am afraid this is what I like about
>>>>C++, and what D seems unable to offer...yet?
>>>
>>>I'm curious what "power" you see in C++ that you can't do in D.
>>
>>I started to learn D while ago, and found few things that
>>I need. Initialization of arrays new char[n] slows things down
>
>
> D has a variety of techniques available to allocate memory, new char[n] is
> only one. You can use malloc() for uninitialized storage.
>

Then I have to free explicitly? Perhaps there is a gcmalloc?

>
>>and also new T[n] gives array of pointers instead of objects.
>
>
> That's only if T is a class type. If T is a basic type, or a struct type,
> etc., it'll be an array of objects.
>
>
>>I found very difficult to implement vector, each class that is stored
>>has to have placement new and cannot be used for other purposes.
>
>
> D's core arrays are solid enough I'm curious why implement a vector type?

Because arrays of class objects are arrays of pointers in D.
I don't want n+1 memory allocations for an array of class objects.
That's why I need a vector.

>
>
>>Perhaps I can get global placement new operator? Don't know how.
>
>
> Overriding global operator new is a misfeature of C++, which is why it is
> absent in D. Be that as it may, the garbage collector in D can be replaced
> with one's own version, after all, it is nothing more than a few library
> functions.

I don't want to override global operator new. I need placement
overloading in order to use that for vectors.
For now I have to overload placement new in the class, and then
delete is a no-op, so the class can't be allocated with anything
else. Also, I need an explicit destructor call (as in the case of
global placement new).
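
For reference, standard C++ already provides what is being asked for
here: the global placement operator new from <new> plus explicit
destructor calls. A deliberately minimal, fixed-capacity sketch (MiniVec
is an invented illustration, not a real library type):

```cpp
#include <cassert>
#include <cstddef>
#include <new>        // declares the global placement operator new
#include <string>

// Objects are constructed into raw storage with placement new and torn
// down with explicit destructor calls: one buffer, zero per-element
// allocations, which is exactly the n+1-allocation problem avoided.
template <class T>
struct MiniVec {
    alignas(T) unsigned char buf[4 * sizeof(T)];  // raw, uninitialized storage
    std::size_t n = 0;

    void push_back(const T& v) {
        assert(n < 4);
        ::new (buf + n * sizeof(T)) T(v);  // construct in place
        ++n;
    }
    T& operator[](std::size_t i) {
        return *reinterpret_cast<T*>(buf + i * sizeof(T));
    }
    ~MiniVec() {
        while (n > 0)
            (*this)[--n].~T();             // explicit destructor call
    }
};
```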

>
>
>>Native support for inline assembler is not a plus, as
>>assemblers can vary in syntax and implementation even on same CPU.
>
>
> I see that as the result of not standardizing it. The CPU is the same, why
> isn't the inline assembler the same? What purpose is served by gcc and vc
> having the register operands reversed?

This is AT&T-syntax assembler vs. Intel-syntax assembler.
No one can win.

>
>
>>This is always platform/compiler specific
>
>
> I disagree, it need only be CPU specific. There are a lot fewer CPU
> instruction set families than platforms.

The problem is that assemblers are not tied to compilers.
For example, what if I write my own super assembler and want to use it
with a C++/D compiler?

>
>
>>so native support for assembly actually restricts what one can do.
>
>
> I don't see how.

It ties a particular assembler to the language.
And assembler is something different than C++ or D.

>
>
>>We don;t want portability in any way
>>when doing assembly, rather, possibility to use any assembler
>>with full features available on system.
>
>
> I'm not sure what you're saying there, but I'll take a stab at it:
>
> 1) For some things, you just gotta do it in assembler. Having an inline
> assembler does the job 95% of the time, and obviates the need for a separate
> assembler, extra files, a more complex build process, etc.
>
> 2) Portability in the assembler does matter. For example, I've ported apps
> between Windows and Linux for the x86. It's a nuisance having to rewrite all
> the inline assembler due to gratuitous and pointless incompatibilities.

This could be easily solved if the same assembler could be used with
both compilers. Perhaps this is because both compilers are tied to
specific assemblers?

>
> 3) Needing a few lines of assembler here and there doesn't mean one needs a
> full blown macro assembler. Usually, the need is less than 10 lines.

Agreed.

>
> I've written a lot of assembler code, including complete applications. Much
> of the Digital Mars runtime library is written using a separate assembler,
> and I've over time been switching it to using inline assembler. I wouldn't
> be doing it if it made things worse.

That's another good point.

Greetings, Bane.

Rob

Dec 27, 2005, 10:39:53 AM
Walter Bright wrote:

>
> "Branimir Maksimovic" <bm...@volomp.com> wrote in message
> news:1135618324.5...@o13g2000cwo.googlegroups.com...
> >
> > Walter Bright wrote:
> >> <dbj...@gmail.com> wrote in message
> >> news:1135507534.8...@g49g2000cwa.googlegroups.com...
> >> > P.S. Using C++, one can take the "I can do whatever I want/need
> >> > to do" attitude. This of course gives the C++ user both power and
> >> > the danger of abusing it. I am afraid this is what I like about
> >> > C++, and what D seems unable to offer...yet?
> > >
> >> I'm curious what "power" you see in C++ that you can't do in D.
> >
> > I started to learn D while ago, and found few things that
> > I need. Initialization of arrays new char[n] slows things down
>
> D has a variety of techniques available to allocate memory, new
> char[n] is only one. You can use malloc() for uninitialized storage.

I'm not quite sure how that is either superior or inferior to C++. In
C++, most raw memory is uninitialised unless a specific effort is made
to initialise it (eg using calloc() for raw memory, initialising an
array). The only difference I see is that some methods of obtaining
raw memory in D result in it being initialised and others don't.

>
> > and also new T[n] gives array of pointers instead of objects.
>
> That's only if T is a class type. If T is a basic type, or a struct
> type, etc., it'll be an array of objects.
>
> > I found very difficult to implement vector, each class that is
> > stored has to have placement new and cannot be used for other
> > purposes.
>
> D's core arrays are solid enough I'm curious why implement a vector
> type?

I'd guess for learning purposes (eg implementing a vector type is a
good learning exercise, and gives good practice with features of the
language). That's the same reason that people still implement basic
containers in C++, despite reasonably "solid" versions being available
in the STL.

>
> > Perhaps I can get global placement new operator? Don't know how.
>
> Overriding global operator new is a misfeature of C++, which is why
> it is absent in D. Be that as it may, the garbage collector in D can
> be replaced with one's own version, after all, it is nothing more
> than a few library functions.

That depends on your viewpoint I guess. The ability to override global
operator new, or to define placement new operators, is an advanced
feature that gives tremendous power but that power can be misused.

>
> > But I'm impressed that destructors are called instead of
> > finalizers and that is working despite circular references.
> > Add to that for_each native support and really fast associative
> > arrays, also lambdas and closures, I finally I think that D is good
> > competitor to C++, unlike Java and C#.
> > It just needs more time.
>
> D does get constantly better, as more experience with it is gained. D
> is intended to be usable for system level applications. One test of
> that is, can you write a garbage collector in D? (yes!). Can you
> write one in Java? In C#?

With disciplined programming techniques, garbage collection is not
needed, so having it be the default isn't necessarily viewed
universally as a plus of those languages that have it. For someone who
is disciplined in using C++, the ability to write a GC would rarely be
viewed as a drawcard, even if one could turn GC completely off.

>
> > For example,
> >> the ultimate power is inline assembler - and D has it, C++ does
> >> not. D has unrestricted pointer access available (though it is
> >> rarely necessary to use it). C++ has proposals for many powerful
> >> features that D has.
> > In standard or not I never found C++ to be restricted in low level
> > features.
>
> I have, in particular the erratic and generally poor support for
> inline assembler. An inline assembler is rarely needed, but when you
> do need it, it's extremely useful - for example, if you wish to
> access the special registers on the x86.
>
> > Native support for inline assembler is not a plus, as
> > assemblers can vary in syntax and implementation even on same CPU.
>
> I see that as the result of not standardizing it. The CPU is the
> same, why isn't the inline assembler the same? What purpose is served
> by gcc and vc having the register operands reversed?
>
> > This is always platform/compiler specific
>
> I disagree, it need only be CPU specific. There are a lot fewer CPU
> instruction set families than platforms.
>
> > so native support for assembly actually restricts what one can do.
>
> I don't see how.


You might want to look more closely at the way gcc works. The
assembler used is largely portable between operating systems, but that
means it is different from what is viewed as "native" on some of those
operating systems.

That aside, I would have trouble with the notion of standardising an
assembler so that inline assembler can be supported within a higher
level language. It introduces a chicken vs egg problem. If there was
enough knowledge to do that, even if we assumed only one CPU, then the
specification of the higher level language would already be written to
support such features --- and there would be no need for inline
assembler at all. If we go beyond one CPU, any such specification
would automatically limit the ability of hardware vendors to provide
more advanced instructions (and hence capabilities) in future CPUs.

And that doesn't allow for the possibility of compiler vendors
optimising their compiler by laying out structures differently, passing
arguments in different ways to functions, etc etc. While it is
possible to standardise on such things (eg the ABI's with C++) support
for such things would limit the ability of compiler vendors to offer
competing products (eg with different trade-offs suitable for different
types of application programming).


>
> > We don;t want portability in any way
> > when doing assembly, rather, possibility to use any assembler
> > with full features available on system.
>
> I'm not sure what you're saying there, but I'll take a stab at it:
>
> 1) For some things, you just gotta do it in assembler. Having an
> inline assembler does the job 95% of the time, and obviates the need
> for a separate assembler, extra files, a more complex build process,
> etc.
>
> 2) Portability in the assembler does matter. For example, I've ported
> apps between Windows and Linux for the x86. It's a nuisance having to
> rewrite all the inline assembler due to gratuitous and pointless
> incompatibilities.

I would hardly describe the incompatibilities as gratuitous or
pointless. They are caused by basic differences in architecture of the
operating systems themselves, how they manage memory, what mechanisms
they use to prevent code from doing "privileged" instructions, basic
things like format of an executable file, etc etc.

The basic fact is that, if someone makes a conscious decision to use
assembler, it's because they are trying to squeeze the last bit of
performance (or some other characteristic) out of the target machine.
Very few people doing that would have sufficient knowledge to do such
things so they can be ported to another machine. Standardising on an
assembler specification to ensure compatibility would introduce
trade-offs: particularly on a machine that does not easily implement
the "standardised" assembler, what is optimal (however one defines
"optimal") on one machine would not be optimal on another. But the
whole point of using assembler is to optimise (or, at least, enhance in
some way) execution characteristics for a target machine.


>
> 3) Needing a few lines of assembler here and there doesn't mean one
> needs a full blown macro assembler. Usually, the need is less than 10
> lines.
>
> I've written a lot of assembler code, including complete
> applications. Much of the Digital Mars runtime library is written
> using a separate assembler, and I've over time been switching it to
> using inline assembler. I wouldn't be doing it if it made things
> worse.
>

Maybe not. But you provide a single data point, just as I provide
another. You may even be right that it is possible to standardise on
some form of inline assembler. But, based on experience with
programming on multiple systems, and a fairly confident prediction that
hardware vendors will continue to introduce features of their CPUs and
instructions to control them, I have trouble accepting some of your
claims as any more than advocacy for the particular style of
programming that you have designed D for.

Bob Hairgrove

Dec 27, 2005, 11:34:28 AM
On 26 Dec 2005 21:05:37 -0500, "Walter Bright"
<wal...@nospamm-digitalmars.com> wrote:

>> Why do you say that one cannot use inline assembler in C++?
>
>Because it isn't part of standard C++. Some C++ compilers implement one as
>an extension, often using wildly incompatible syntax. Other C++ compilers
>implement none at all.
>
>> Although
>> it is implementation-defined as to how one can use it, the syntax is
>> explicitly mentioned in the C++ standard (see section 7.4).
>
>Here's section 7.4 in its entirety:
>
>---------------------------------------------
>[dcl.asm] 7.4 The asm declaration
>1 An asm declaration has the form
>asm-definition:
>asm ( string-literal ) ;
>The meaning of an asm declaration is implementation defined.
>[Note: Typically it is used to pass information through the implementation
>to an assembler. ]
>----------------------------------------------
>
>A string with an "implementation defined" meaning isn't a specification for
>an inline assembler. I could just as well implement that to be a double
>entry bookkeeping system, but that doesn't mean that Standard C++ supports
>tax accounting.

Of course it isn't -- but saying that something is "implementation
defined", and therefore not necessarily portable, is not the same as
saying "you cannot do this" in C++.

I'm not familiar with the D language, although it sounds like it would
be well worth the time to investigate. But does it offer the same
cross-platform portability that C++ offers?

What about inline assembly? Can I compile the same inline ASM source
on two different OSes which happen to run on the same CPU? If it isn't
possible, then I would be careful when making comparisons between C++
and D as far as inline assembly goes.

--
Bob Hairgrove
NoSpam...@Home.com

Mirek Fidler

Dec 27, 2005, 1:43:42 PM
>>I started to learn D while ago, and found few things that
>>I need. Initialization of arrays new char[n] slows things down
>
>
> D has a variety of techniques available to allocate memory, new char[n] is
> only one. You can use malloc() for uninitialized storage.

Interesting. Does it in turn mean that D uses conservative GC? (Because
otherwise it is unclear to me how GC can track references in raw storage).

Mirek

Walter Bright

Dec 28, 2005, 5:08:35 AM

"Rob" <nos...@nonexistant.com> wrote in message
news:43b14cce$2...@duster.adelaide.on.net...

> Walter Bright wrote:
>> D has a variety of techniques available to allocate memory, new
>> char[n] is only one. You can use malloc() for uninitialized storage.
> I'm not quite sure how that is either superior or inferior to C++. In
> C++, most raw memory is uninitialised unless a specific effort is made
> to initialise it (eg using calloc() for raw memory, initialising an
> array). The only difference I see is that some methods of obtaining
> raw memory in D result in it being initialised and others don't.

The main difference is that D has gc available, and Standard C++ does not
(although third party gc's exist and there's a proposal to add it to C++).

>> Overriding global operator new is a misfeature of C++, which is why
>> it is absent in D. Be that as it may, the garbage collector in D can
>> be replaced with one's own version, after all, it is nothing more
>> than a few library functions.
> That depends on your viewpoint I guess. The ability to override global
> operator new, or to define placement new operators, is an advanced
> feature that gives tremendous power but that power can be misused.

Tremendous power? I think that's an exaggeration. It's tempting to use it,
but it winds up with the same faults as using global state variables. You
wind up tripping up library functions that depend on it, having
incompatibilities with 3rd party code, etc. Use a class specific allocator
for specific needs, or just create a mymalloc() function and use that.


> With disciplined programming techniques, garbage collection is not
> needed,

Everything you can do in C++ can also be done in C, if you use disciplined
programming techniques. The problem with disciplined programming techniques
is that one spends a lot of time following them and enforcing them. If one
can get the same results by building it into the language, then there's that
much programmer time saved.

The reason we program in C++ over C is because it saves programmer time. Why
not continue with that thought?

> so having it be default isn't necessarily viewed universally as
> a plus of those languages that have it.

True, but I'm more interested in practical results, as conventional wisdom
is often wrong.

> For someone who is
> disciplined in using C++, the ability to write a GC would rarely be
> viewed as a drawcard, even if one could turn GC completely off.

It's certainly the conventional wisdom that GC is slower. But try running
some benchmarks; you might be surprised. Here's one:
www.digitalmars.com/d/cppstrings.html


>> > so native support for assembly actually restricts what one can do.
>> I don't see how.
> You might want to look more closely at the way gcc works.

In general, I avoid looking at implementation details of gcc, to avoid
possible 'taint' since I'm in the compiler business.

> That aside, I would have trouble with the notion of standardising an
> assembler so that inline assembler can be supported within a higher
> level language. It introduces a chicken vs egg problem.

I don't see the chicken and egg problem. The Digital Mars DMD compiler works
on both Windows and Linux, with no difference in the inline assembler. It
works; there is no problem with it, and nothing was broken by its existence.

> If there was
> enough knowledge to do that, even if we assumed only one CPU, then the
> specification of the higher level language would already be written to
> support such features --- and there would be no need for inline
> assembler at all.

CPUs often have specialized instructions that have no analogue in
high-level languages.

> If we go beyond one CPU, any such specification
> would automatically limit the ability of hardware vendors to provide
> more advanced instructions (and hence capabilities) in future CPUs.

Such new instructions appear every year or two, and they get folded into the
inline assembler in the next update as a normal part of the support of the
compiler/language.

> And that doesn't allow for the possibility of compiler vendors
> optimising their compiler by laying out structures differently, passing
> arguments in different ways to functions, etc etc.

Of course it does:

mov EAX, classname.member[EBX]

voila! It's immune to changes in the offset of member in classname.

>While it is
> possible to standardise on such things (eg the ABI's with C++) support
> for such things would limit the ability of compiler vendors to offer
> competing products (eg with different trade-offs suitable for different
> types of application programming).

Absolutely not. There's nothing at all limiting about saying:

mov EAX, 5

is the instruction syntax to use, rather than:

movl $5,%eax

>> 2) Portability in the assembler does matter. For example, I've ported
>> apps between Windows and Linux for the x86. It's a nuisance having to
>> rewrite all the inline assembler due to gratuitous and pointless
>> incompatibilities.
> I would hardly describe the incompatibilities as gratuitous or
> pointless. They are caused by basic differences in architecture of the
> operating systems themselves, how they manage memory, what mechanisms
> they use to prevent code from doing "privileged" instructions, basic
> things like format of an executable file, etc etc.

Here's a real life example:

#if __GCC__
asm("pushl %eax");
asm("pushl %ebx");
asm("pushl %ecx");
asm("pushl %edx");
asm("pushl %ebp");
asm("pushl %esi");
asm("pushl %edi");
unsigned dummy;
dummy = fullcollect(&dummy - 7);
asm("addl $28,%esp");
#elif _MSC_VER
__asm push eax;
__asm push ebx;
__asm push ecx;
__asm push edx;
__asm push ebp;
__asm push esi;
__asm push edi;
unsigned dummy;
dummy = fullcollect(&dummy - 7);
__asm add esp,28;
#endif

I see gratuitous, pointless incompatibilities. Here's the equivalent section
in D:

asm
{
pushad ;
mov sp[EBP],ESP ;
}
result = fullcollect(sp);

which works on Windows and Linux, unchanged, no #ifdef's required.


Here's another one:

ulonglong Port::read_timer()
{
#if 0 // This works with GCC 2.91.66, but not GCC 2.95.3
ulonglong r;

__asm__ __volatile__("rdtsc\n"
: "=A" (r)
: /* */
: "ax", "dx");
return r;
#elif __GCC__
// Alternate version if the other one doesn't compile

#define rdtsc(low, high) \
__asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high))

unsigned long low, high;

rdtsc(low, high);

return ((ulonglong)high << 32) | low;
#elif __DMC__
__asm
{
rdtsc
}
#endif
}

> The basic fact is that, if someone makes a conscious decision to use
> assembler, it's because they are trying to squeeze the last bit of
> performance (or some other characteristic) out of the target machine.

That's why I called it a power/performance feature.

> Very few people doing that would have sufficient knowledge to do such
> things so they can be ported to another machine.

The fact is, inline assemblers are often implemented in C++ compilers. That
means that there is a significant population of programmers who find them
very useful. Why not try and make the assemblers MORE useful to them,
instead of making them gratuitously and pointlessly difficult?

> Standardising on an
> assembler specification to ensure compatibilities would introduce
> trade-offs:

No, it wouldn't. What are you losing by standardizing the spelling of the
mnemonics, the order of the instruction operands? You're neither gaining nor
losing one iota of programming capability.

> particularly on a machine that does not easily implement
> the "standardised" assembler, so what is optimal (however one defines
> "optimal") on one machine would not be optimal on another.

What a standalone assembler looks like on a particular platform is
irrelevant. In any case, I don't know what an "optimal" assembler would be,
because its behavior is 100% determined as a simple mapping from source
mnemonic to instruction.

> But the
> whole point of using assembler is to optimise (or, at least, enhance in
> some way) execution characteristics for a target machine.......

Not always. It is also used to access special instructions or do other
things not reasonably supported by the HLL, as the above (real world)
examples I gave do. Here's an example of another use in D's storage
allocator:

asm
{
mov EAX,newlength ;
mul EAX,sizeelem ;
mov newsize,EAX ;
jc Loverflow ;
}

The need here is to detect overflow in the multiplication; this is not so
easy to do efficiently with HLL expressions. This code works on Windows and
Linux without any modification. Try that with C++'s inline assembler <g>.

> But, based on experience with
> programming on multiple systems, and a fairly confident prediction that
> hardware vendors will continue to introduce features of their CPUs and
> instructions to control them,

Of course CPU vendors will add instructions, and it's nothing more than the
usual update problem to upgrade the assembler to support them (in fact,
that's infinitely easier than reengineering the code generator to make
effective use of those new opcodes!).

> I have trouble accepting some of your
> claims as any more than advocacy for the particular style of
> programming that you have designed D for.

An assembler gives a 1:1 mapping of mnemonic to object code. There's no
programming "style" in that.

My only "advocacy" is that an inline assembler is an important tool to have
in the language, and that the syntax it uses should be standardized, rather
than the existing mess seen in C++ compiler inline assemblers.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Walter Bright

Dec 28, 2005, 5:05:43 AM

"Branimir Maksimovic" <bm...@hotmail.com> wrote in message
news:dord04$t75$1...@domitilla.aioe.org...

> Walter Bright wrote:
>> D has a variety of techniques available to allocate memory, new char[n]
>> is
>> only one. You can use malloc() for uninitialized storage.
> Then I have to free explicitly? Perhaps there is a gcmalloc?

There is as a low level library function.

>> D's core arrays are solid enough I'm curious why implement a vector type?
> Because arrays of class objects are arrays of pointers in D.
> I don't want n+1 memory allocations for an array of class objects.
> That's why I need vector.

You could do it as a struct?

>> I see that as the result of not standardizing it. The CPU is the same,
>> why
>> isn't the inline assembler the same? What purpose is served by gcc and vc
>> having the register operands reversed?
> This is AT&T assembler syntax vs. Intel assembler syntax.
> No one can win.

I'd go with intel because intel designed the CPU and all the CPU
documentation is in intel format. There's no need to 'win', just have the
inline assembler follow the intel syntax. What's the big deal?

>> I disagree, it need only be CPU specific. There are a lot fewer CPU
>> instruction set families than platforms.
> Problem is that assemblers are not related to compilers.

Inline assemblers do exist, are integrated in with compilers, and have
proven to be useful.


> For example I write my super assembler and want to use that
> with C++/D compiler?

I think you're coming from the point of view that a compiler's job is to
emit assembler source text which is then fed to a separate assembler. That's
the 1970's and earlier way of doing things. Compilers I've worked on have
always emitted object files directly, and when an inline assembler was
added, the compiler did the actual assembler work. I didn't think anyone
still did it the 1970's way <g>.


>>>so native support for assembly actually restricts what one can do.
>> I don't see how.
> It ties particular assembler to language.

Only if the compiler emits assembler source (does anyone do that anymore?).

> And assembler is something different then C++ or D.

What do you think of the C++ preprocessor, then (and its history)? It's an
entirely distinct language, that started out as a separate program entirely.
Over time, it wound up being integrated in as part of C/C++, but
semantically it remains a distinct and separate language with its own
lexical, syntactic, and semantic rules and oddities.

>> 2) Portability in the assembler does matter. For example, I've ported
>> apps
>> between Windows and Linux for the x86. It's a nuisance having to rewrite
>> all
>> the inline assembler due to gratuitous and pointless incompatibilities.
>
> This can be easily solved if the same assembler can be used for both
> compilers. Perhaps this is because both compilers are tied to
> specific assemblers?

D isn't tied to any external assembler. It spits out object files directly.
Same goes for Digital Mars C and C++ compilers. Frankly, it's more work to
try and generate assembler source code instead, and it certainly would make
for a dreadfully slow compiler.

(Digital Mars compilers do have an "assembler source" option, but it's
actually a separate program that reads the object files and disassembles
them!)

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

Dec 28, 2005, 5:04:13 AM

"Bob Hairgrove" <inv...@bigfoot.com> wrote in message
news:v0o2r11enftoi0kio...@4ax.com...

In practice, you cannot do this in C++. You can have a perfectly standard
compliant C++ compiler, and have no inline assembler. VC++ for 64 bits, for
example, does not have one.


> I'm not familiar with the D language, although it sounds like it would
> be well worth the time to investigate. But does it offer the same
> cross-platform portability that C++ offers?

It's more portable, since it nails down things like source character sets,
wchar sizes, int sizes, floating point behavior, etc.


> What about inline assembly? Can I compile the same inline ASM source
> on two different OS which happen to run on the same CPU?

Yes. That's the whole point! The garbage collector, for example, is written
in D. Portions of it are in inline assembler, which is unmodified from Win32
to Linux.

> If it isn't
> possible, then I would be careful when making comparisons between C++
> and D as far as inline assembly goes.

I have some experience with trying to use inline assembler and going between
Windows and Linux with C++. It isn't a positive experience, and it doesn't
have to be that way.

-Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

Dec 28, 2005, 5:07:22 AM

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41cd45F...@individual.net...

>> Because it isn't part of standard C++. Some C++ compilers implement one
>> as
>> an extension, often using wildly incompatible syntax. Other C++ compilers
>> implement none at all.
>
> Do you suggest that D has compatible syntax for assembler? Means the
> same assembler code can be compiled both for PowerPC and x86? (Otherwise
> it doesn't matter whether assembler syntax is "wildly incompatible", as you
> have to #ifdef assembly for each platform anyway).

Of course the powerpc and x86 will be different. But x86 on Windows and x86
on Linux are identical. For C++, the x86 inline assembler for Windows and
Linux are wildly (and pointlessly) incompatible.

-Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

Dec 28, 2005, 5:09:13 AM

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41d9unF...@individual.net...

>>>I started to learn D a while ago, and found a few things that
>>>I need. Initialization of arrays new char[n] slows things down
>> D has a variety of techniques available to allocate memory, new char[n]
>> is
>> only one. You can use malloc() for uninitialized storage.
> Interesting. Does it in turn mean that D uses conservative GC?

Yes.

> (Because otherwise it is unclear to me how GC can track references in raw
> storage).

I don't see how it could work otherwise, either <g>.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

Dec 28, 2005, 5:06:14 AM

"E. Mark Ping" <ema...@soda.csua.berkeley.edu> wrote in message
news:doqprq$1s1c$1...@agate.berkeley.edu...

> In article <1135645446....@g43g2000cwa.googlegroups.com>,
> Hasan Aljudy <hasan....@gmail.com> wrote:
>>Well, try writing inline assembly in VC++, then try to compile it on
>>gcc!!
>
> Such as with VC++ for x86 and GCC for PowerPC?

Such as VC++ for x86 and GCC for x86.

Or try VC++ for 64 bits, which has no inline assembler at all.

Dave Harris

Dec 28, 2005, 7:29:30 AM
SeeWebsit...@moderncppdesign.com (Andrei Alexandrescu (See Website
For Email)) wrote (abridged):

> If I understand the documentation correctly, D makes a contract part of
> the implementation and not part of the interface.

I'm not sure that is correct. One reason is that when you override a
function you can change its implementation, but in D you are still bound
to its contract. I /think/ you are misled by the D syntax, which does not
separate declarations from definitions for classes. You expect to see the
interface in a header file, and D doesn't have header files.

On the other hand, it does sound like contracts only apply to classes and
functions. D does not support multiple inheritance, and instead has a
Java-like keyword "interface", and as far as I can tell these "interfaces"
don't support contract programming. Which does make me wonder if the
language designer may have missed the point.

Maybe Walter Bright could comment on this specific area? So far the
replies I've seen have been about memory allocation and assembler, and not
the subject of the thread title.

-- Dave Harris, Nottingham, UK.

Mirek Fidler

Dec 28, 2005, 7:32:39 AM
>>>only one. You can use malloc() for uninitialized storage.
>>
>>Interesting. Does it in turn mean that D uses conservative GC?
>
>
> Yes.
>
>
>>(Because otherwise it is unclear to me how GC can track references in raw
>>storage).
>
>
> I don't see how it could work otherwise, either <g>.

Well, that alone would make me quite nervous. Of course, I know that
statistically conservative GC works well, but in the end it is still a
stochastic system with somewhat unpredictable behaviour. Designing a
language that REQUIRES conservative GC is perhaps pragmatic, but IMHO it
limits the applications of the language.

But maybe it is just me, as I consider ANY resource leak unacceptable
(I have the leak checker on by default in debug mode, and any of those
very rare leaks that escape my destructor-based deterministic resource
management I consider first-class bugs to be hunted down immediately).

Mirek

Mirek Fidler

Dec 28, 2005, 7:32:18 AM
Walter Bright wrote:
> "Mirek Fidler" <c...@volny.cz> wrote in message
> news:41cd45F...@individual.net...
>
>>>Because it isn't part of standard C++. Some C++ compilers implement one
>>>as
>>>an extension, often using wildly incompatible syntax. Other C++ compilers
>>>implement none at all.
>>
>>Do you suggest that D has compatible syntax for assembler? Means the
>>same assembler code can be compiled both for PowerPC and x86? (Otherwise
>>it doesn't matter whether assembler syntax is "wildly incompatible", as you
>>have to #ifdef assembly for each platform anyway).
>
>
> Of course the powerpc and x86 will be different. But x86 on Windows and x86
> on Linux are identical. For C++, the x86 inline assembler for Windows and
> Linux are wildly (and pointlessly) incompatible.

AFAIK not if you are using the same compiler (GCC).

It is true that the difference between GCC and MSC sucks. However, it is
more or less caused by things that have little to do with GCC, MSC, or
the C++ standard. The main problem is that Linux development toolchains
originally picked a different assembler syntax for x86 than the one
provided by Intel.

However, I do not see how a reasonable C++/D language standard could
enforce assembler syntax without limiting the language to a single CPU.

Mirek

Mirek Fidler

Dec 28, 2005, 7:33:02 AM
>>Of course it isn't -- but saying that something is "implementation
>>defined", and therefore not necessarily portable, is not the same as
>>saying "you cannot do this" in C++.
>
>
> In practice, you cannot do this in C++. You can have a perfectly standard
> compliant C++ compiler, and have no inline assembler.

Actually, that makes 100% sense to me - what about a C++ interpreter
that has no assembler at all?

Maybe you have not understood what the C++ standard is...

It has to deal with ALL possibilities. The purpose of the C++ standard is
to IMPROVE source code portability, not to resolve ALL problems.

Mirek

Mirek Fidler

Dec 28, 2005, 12:12:20 PM
>>With disciplined programming techniques, garbage collection is not
>>needed,
>
>
> Everything you can do in C++ can also be done in C, if you use disciplined
> programming techniques. The problem with disciplined programming techniques
> is that one spends a lot of time following them and enforcing them.


Not true. I write a single 'delete' statement per 5000 lines of code on
average, without thinking about the issue or spending time on it.
Moreover, I do not remember the last time I had to 'manually' close
a file or a GUI window...


> If one
> can get the same results by building it into the language, then there's that
> much programmer time saved.

No, those are different results.

> The reason we program in C++ over C is because it saves programmer time. Why
> not continue with that thought?

That is OK. And actually, I think that it is always good to keep trying.

Anyway, if one is to follow the GC path (which IMHO is the wrong
decision), why should one prefer D over C#?

>>so having it be default isn't necessarily viewed universally as
>>a plus of those languages that have it.
>
>
> True, but I'm more interested in practical results, as conventional wisdom
> is often wrong.

I can agree with that :) OTOH, GC is "conventional wisdom" now outside
C++ circles :)

> It's certainly the conventional wisdom that GC is slower. But try running
> some benchmarks; you might be surprised. Here's one:
> www.digitalmars.com/d/cppstrings.html

Funny. I have used almost the same example to compare my U++/NTL library
with STL :)

(see here: http://upp.sourceforge.net/examples$idmapBench.html).

Actually, NTL is 2.5 times faster than STL in this example (which is IMO
dominated by the map performance anyway).

So my belief is that you are actually comparing a smart library vs. a
stupid one (STL). Not D vs. C++, nor GC vs. manual memory management.

(And yes, I think that it is the STL that makes C++ fail. But rather than
inventing a completely new language, replacing the library is the simpler
option IMHO :).

> I don't see the chicken and egg problem. The Digital Mars DMD compiler works
> on both Windows and Linux, with no difference in the inline assembler. It
> works; there is no problem with it, and nothing was broken by its existence.

So does GCC. You will have to wait for an independent D compiler
implementation to really find out how your language standard works.

> Such new instructions appear every year or two, and they get folded into the
> inline assembler in the next update as a normal part of the support of the
> compiler/language.

Well, does that mean there will have to be a new ANSI D standard every two years? :)

> Here's a real life example:
>
> #if __GCC__
> asm("pushl %eax");
> asm("pushl %ebx");

> dummy = fullcollect(&dummy - 7);
> asm("addl $28,%esp");
> #elif _MSC_VER
> __asm push eax;
> __asm push ebx;

> dummy = fullcollect(&dummy - 7);
> __asm add esp,28;
> #endif
>
> I see gratuitous, pointless incompatibilities.

So do I. I would be happy to have a really good compiler with compatible
asm for both Linux and Win32. However, it has nothing to do with the C++
standard. Nothing prevents compiler writers from making things compatible
here.

Mirek

kanze

Dec 28, 2005, 1:11:24 PM
Walter Bright wrote:

This is getting off subject for the thread, but...

[...]


> 1) For some things, you just gotta do it in assembler. Having
> an inline assembler does the job 95% of the time, and obviates
> the need for a separate assembler, extra files, a more complex
> build process, etc.

It's curious, but my experience has been just the opposite.
I've never found a use for inline assembler, even when it has
been present. To me, it just seems far more natural to write a
separate module in assembler when I need assembler.

I'm not sure why our experiences here differ. Part of it may be
because I have done a lot of programming in assembler, in the
past, and am more or less used to writing entire modules in
assembler. And possibly, it is because the inline assemblers
I've used haven't been that well integrated -- I've had problems
figuring out which registers I could or could not use, where
specific variables were, etc. All of this is strictly defined
in the API for a separate function.

> 2) Portability in the assembler does matter. For example, I've
> ported apps between Windows and Linux for the x86. It's a
> nuisance having to rewrite all the inline assembler due to
> gratuitous and pointless incompatibilities.

I suspect that this depends. Most of what I use inline assembly
for are things so specific to the machine AND the system that
I'd have to rewrite it anyway. And of course, the only
portability which has really interested me in the past was
between Solaris, and one of HP/UX, AIX or MS-DOS. So you can
forget any idea of portable assembler anyway.

Frankly, too, I think there is a quality of implementation issue
involved. The standard doesn't specify much about inline
assembler because from a quality of implementation point of
view, you want it to be as close as possible to native
assembler. Thus, it is almost imperative (again, from a QoI
point of view) that even something as basic as the order of
operands differ between an Intel machine and a Sparc : if the
destination isn't the left most parameter on an Intel, and the
right most on a Sparc, then there is a serious lack of quality.

> 3) Needing a few lines of assembler here and there doesn't
> mean one needs a full blown macro assembler. Usually, the
> need is less than 10 lines.

Often, it can be as little as two or three lines. It's still
easier, IMHO, to put it in a separate file. As soon as you
support more than one platform (and everything I write today
MUST compile on both a Sparc and a Linux based PC), you need
separate files for it anyway.

> I've written a lot of assembler code, including complete
> applications. Much of the Digital Mars runtime library is
> written using a separate assembler, and I've over time been
> switching it to using inline assembler. I wouldn't be doing
> it if it made things worse.

But how does it make things better? Given something like:

.align 4
.section ".text",#alloc,#execinstr
.global GB_atomicRead
GB_atomicRead:
membar #LoadLoad
retl
ld [%o0],%o0
.type GB_atomicRead,2
.size GB_atomicRead,(.-GB_atomicRead)

.align 4
.section ".text",#alloc,#execinstr
.global GB_atomicFetchAndAdd
GB_atomicFetchAndAdd:
membar #StoreStore | #LoadStore
ld [%o0],%l0
add %l0,%o1,%l1
cas [%o0],%l0,%l1
sub %l1,%l2,%g0
bne,pn %icc,GB_atomicFetchAndAdd
nop
membar #LoadLoad
retl
add %l1,%o1,%o0
.type GB_atomicFetchAndAdd,2
.size GB_atomicFetchAndAdd,(.-GB_atomicFetchAndAdd)


What would having to wrap it in some C++ buy me, except more
lines of code to write?

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Francis Glassborow

Dec 28, 2005, 1:11:02 PM
In article <CPqdnQpqGZE...@comcast.com>, Walter Bright
<wal...@nospamm-digitalmars.com> writes

Which has exactly nothing to do with the C++ Standard. And AFAIK the
first of the above works perfectly well on Linux and any other x86 based
system. It also fails for Linux versions implemented on systems using
non-x86 CPUs.

Now if you want to have a go at Microsoft for using a different syntax
to GCC, you are free to do so but this is not the place for it.

The compatibility of asm support in D for Linux and Windows on x86
architectures is an artefact of there being a single implementor and not
a result of a well defined standard. Please note that it is completely
impossible for WG21 to specify the syntax for asm if C++ is to be
processor independent. Yes it would be nice if those targeting the same
hardware used the same syntax but that is not likely to happen unless
there is some Standard for the assembler for that hardware.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Francis Glassborow

Dec 28, 2005, 1:10:27 PM
In article <GaSdnVWw061...@comcast.com>, Walter Bright
<wal...@nospamm-digitalmars.com> writes

>Of course the powerpc and x86 will be different. But x86 on Windows and x86
>on Linux are identical. For C++, the x86 inline assembler for Windows and
>Linux are wildly (and pointlessly) incompatible.

You repeatedly make this assertion. However, AFAIK D has a single
implementor, and that goes an immense way toward ensuring compatibility.

There is no reason why a C++ compiler should not support identical
'inline assembler instructions', and no reason to expect that two
implementations from different implementors for the same machine will be
compatible (and that isn't just for asm, but for quite a lot of other
things at link time as well).


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

kanze

Dec 28, 2005, 1:12:08 PM
Walter Bright wrote:
> "Rob" <nos...@nonexistant.com> wrote in message
> news:43b14cce$2...@duster.adelaide.on.net...
> > Walter Bright wrote:
> >> D has a variety of techniques available to allocate memory,
> >> new char[n] is only one. You can use malloc() for
> >> uninitialized storage.

> > I'm not quite sure how that is either superior or inferior
> > to C++. In C++, most raw memory is uninitialised unless a
> > specific effort is made to initialise it (eg using calloc()
> > for raw memory, initialising an array). The only difference
> > I see is that some methods of obtaining raw memory in D
> > result in it being initialised and others don't.

> The main difference is that D has gc available, and Standard
> C++ does not (although third party gc's exist and there's a
> proposal to add it to C++).

Which means that it is available:-). I use garbage collection
more often than not in C++.

I'd say that the main difference is that D programmers use
garbage collection, whereas many C++ programmers still don't. And one
of the reasons is doubtless that garbage collection is
"officially" part of D, whereas it is still only a third-party
add-in in C++.

> >> Overriding global operator new is a misfeature of C++,
> >> which is why it is absent in D. Be that as it may, the
> >> garbage collector in D can be replaced with one's own
> >> version, after all, it is nothing more than a few library
> >> functions.

> > That depends on your viewpoint I guess. The ability to
> > override global operator new, or to define placement new
> > operators, is an advanced feature that gives tremendous
> > power but that power can be misused.

> Tremendous power? I think that's an exaggeration. It's
> tempting to use it, but it winds up with the same faults as
> using global state variables. You wind up tripping up library
> functions that depend on it, having incompatibilities with 3rd
> party code, etc. Use a class specific allocator for specific
> needs, or just create a mymalloc() function and use that.

The ability to define global placement new is an important
advantage, because new (xxx) T( a, b ) also calls the
constructor, which mymalloc() won't. The ability to define a
global replacement for the standard new is useful for
debugging... or integrating the garbage collector:-).

> > With disciplined programming techniques, garbage collection
> > is not needed,

> Everything you can do in C++ can also be done in C, if you use
> disciplined programming techniques.

And everything you can do in C can also be done in assembler,
with enough discipline. Been there, done that. But whips and
leather aren't my thing. No point in making writing correct
code more work than necessary.

> The problem with disciplined programming techniques is that
> one spends a lot of time following them and enforcing them.
> If one can get the same results by building it into the
> language, then there's that much programmer time saved.

> The reason we program in C++ over C is because it saves
> programmer time. Why not continue with that thought?

Well, some people like whips and leather and lots of pain:-).

In fact, there is ONE very compelling argument against garbage
collection. Without it, you need more programmers, with a
higher skill level. Which increases the demand, and thus pushes
up the tariff for the experts like myself. Other than that,
however, I can't see any advantage of not having such a useful
tool available.

[...]


> I don't see the chicken and egg problem. The Digital Mars DMD
> compiler works on both Windows and Linux, with no difference
> in the inline assembler. It works; there is no problem with
> it, and nothing was broken by its existence.

How about between Linux (on IA-32 architectures) and Solaris (on
Sparc). Those are the two platforms I'm currently concerned
with.

[...]


> Here's a real life example:

> #if __GCC__
> asm("pushl %eax");
> asm("pushl %ebx");
> asm("pushl %ecx");
> asm("pushl %edx");
> asm("pushl %ebp");
> asm("pushl %esi");
> asm("pushl %edi");
> unsigned dummy;
> dummy = fullcollect(&dummy - 7);
> asm("addl $28,%esp");
> #elif _MSC_VER
> __asm push eax;
> __asm push ebx;
> __asm push ecx;
> __asm push edx;
> __asm push ebp;
> __asm push esi;
> __asm push edi;
> unsigned dummy;
> dummy = fullcollect(&dummy - 7);
> __asm add esp,28;
> #endif

I suppose that the goal is to recover the contents of the
registers; and that fullcollect exploits the (formally illegal)
address it is passed. I have some experience with this sort of
thing trying to write code to walk back the stack. (One of my
rare uses of inline assembler -- I need a special instruction on
the Sparc to ensure that the stack is actually visible at memory
addresses.)

My experience with this is that even on the same platform,
different compilers or systems will use slightly different stack
layouts. Under Sun OS 4, for example, the exact position of the
frame pointer relative to a local variable was slightly
different than under Solaris (Sun OS 5).

When you're playing these sorts of games, you really do want
separate code for each system --- historically, I've maintained
separate code for each triplet: system (including version),
compiler, architecture.

I might add in passing that the Microsoft code above is not
according to the standard.

And like you, I can't imagine what stupid reason led g++ to not
use the standard Intel format... except that the assembler
furnished with Linux also uses this non-standard format, and I
guess that g++ expects it to be the assembler anyone
programming under Linux will know. So g++ could be said to be
making the best of a bad situation; the problem is present, and
there is no easy work-around.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Walter Bright

unread,
Dec 28, 2005, 9:19:23 PM12/28/05
to

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41fbo8F...@individual.net...

>>>Of course it isn't -- but saying that something is "implementation
>>>defined", and therefore not necessarily portable, is not the same as
>>>saying "you cannot do this" in C++.
>> In practice, you cannot do this in C++. You can have a perfectly standard
>> compliant C++ compiler, and have no inline assembler.
> Actually, makes 100% sense to me - what about a C++ interpreter that
> has no assembler at all?

Presumably a C++ interpreter would be compiling to some bytecode, and so the
inline assembler would be for that bytecode.

> Maybe you have not understood what the C++ standard is...
> It has to deal with ALL possibilities.

It only has to deal with possibilities the committee members care to cover.
For example, there's no support for 4 bit computers. There's no support for
the near/far memory models that are required for effective 16 bit x86
programming.

> The purpose of the C++ standard is to IMPROVE source code portability,
> not to resolve ALL problems.

I agree. The D implementations show how nice having portable inline
assembler source, where possible, can be.


Walter Bright
www.digitalmars.com C, C++, D programming language compilers

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Walter Bright

unread,
Dec 28, 2005, 9:19:44 PM12/28/05
to

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41fbhdF...@individual.net...

> Well, that alone would make me a lot nervous. Of course, I know that
> statistically, conservative GC works well, but in the end it is still
> a stochastic system with somewhat unpredictable behaviour. Designing
> a language that REQUIRES conservative GC is perhaps pragmatic, but IMHO
> limits applications of the language.
> But maybe it is just me, as I consider ANY resource leaks unacceptable

It's also possible for an application in C++ to wedge itself into an out of
memory condition even though there is plenty of memory available, and even
if there are no memory leaks. This can happen due to fragmentation of the
free memory pool.

C++ applications where it is CRITICAL that no out of memory wedges can
happen are written to statically allocate all the memory required, and avoid
all dynamic memory allocation. One can do this in D just as easily.

> (I have a leak checker on by default in debug mode, and any of those very
> rare leaks that escape my destructor-based deterministic resource
> management I consider first-class bugs to be hunted immediately).

Yes, that's what I did for many years in C++. It's a requirement if one
wishes to write robust C++. I've now switched to using a gc even with C++,
and found I'm much more productive by concentrating on the algorithm rather
than memory management problems.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Howard Hinnant

unread,
Dec 28, 2005, 9:18:10 PM12/28/05
to
In article <1135781743.3...@g14g2000cwa.googlegroups.com>,
"kanze" <ka...@gabi-soft.fr> wrote:

> It's curious, but my experience has been just the opposite.
> I've never found a use for inline assembler, even when it has
> been present. To me, it just seems far more natural to write a
> separate module in assembler when I need assembler.
>
> I'm not sure why our experiences here differ. Part of it may be
> because I have done a lot of programming in assembler, in the
> past, and am more or less used to writing entire modules in
> assembler. And possibly, it is because the inline assemblers
> I've used haven't been that well integrated -- I've had problems
> figuring out which registers I could or could not use, where
> specific variables were, etc. All of this is strictly defined
> in the API for a separate function.

I think we're talking well-integrated inline assembler here. In the
inline assembler I've used most, you don't deal directly with registers.
That's the domain of the compiler's register allocator. You name
abstract registers and let the compiler deal with it.

....
register volatile unsigned* const p = &activity_;
register unsigned a;
unsigned n_readers;
private_lock lk(mut_, defer_lock);
loop:
    asm {lwarx a, 0, p}
    n_readers = a & max_readers_;
    if ((a & write_entered_) || n_readers == max_readers_)
    {
        if (!lk.locked())
            lk.lock();
        a |= entry_sleeping_;
        asm
        {
            stwcx. a, 0, p
            bne- loop
        }
        entry_.wait(lk);
        goto loop;
    }
    ++n_readers;
....

Most of the code is C++. Just a few inline asm blocks scattered only
where absolutely necessary to reach otherwise unreachable machine
instructions. No machine registers specifically mentioned. Most of the
code stays in the much-more-maintainable C++.

-Howard

Walter Bright

unread,
Dec 28, 2005, 9:23:20 PM12/28/05
to

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41fha3F...@individual.net...

>> Everything you can do in C++ can also be done in C, if you use
>> disciplined
>> programming techniques. The problem with disciplined programming
>> techniques
>> is that one spends a lot of time following them and enforcing them.
> Not true. I write a single 'delete' statement per 5000 lines of code
> on average, without thinking about the issue or spending time on it.

If you carefully look at how the STL is implemented, there's a great deal of
effort in there expended to manage memory, even though the 'delete'
statement is not present. The 'delete' statement is only a small part of how
memory is managed in C++. (Copy constructors and overloaded assignment
operators are a much larger part, and goofing those up is both easy and a
major source of leakage.)

> Anyway, if one is to follow the GC path (which IMHO is the wrong
> decision), why should one prefer D over C#?

The simple answer is D is for native code, and C# is not.

>> It's certainly the conventional wisdom that GC is slower. But try running
>> some benchmarks; you might be surprised. Here's one:
>> www.digitalmars.com/d/cppstrings.html
> Funny. I have used almost the same example to compare my U++/NTL library
> with STL :)
>
> (see here: http://upp.sourceforge.net/examples$idmapBench.html).
>
> Actually, NTL is 2.5 times faster than STL in this example (which is IMO
> dominated by the map performance anyway).
>
> So my belief is that you are actually comparing a smart library vs. a
> stupid one (STL). Not D vs C++, nor GC vs manual memory management.
>
> (And yes, I think that it is the STL that makes C++ fail. But rather than
> inventing a completely new language, replacing the library is the simpler
> option IMHO :).

In my experience, the reason C++ is slower for the benchmark is because C++
needs to make copies of strings. In D, because of gc, it becomes possible to
just point at substrings, and so the D version just manipulates pointers,
where the C++ is constantly copying.

I'm curious how your package deals with substrings.

>> I don't see the chicken and egg problem. The Digital Mars DMD compiler
>> works
>> on both Windows and Linux, with no difference in the inline assembler. It
>> works; there is no problem with it, and nothing was broken by its
>> existence.
> So does GCC. You will have to wait for an independent D compiler
> implementation to really find out how your language standard works.

It already exists, it's GDC (the Gnu D Compiler) created by David Friedman.

>> Such new instructions appear every year or two, and they get folded into
>> the
>> inline assembler in the next update as a normal part of the support of
>> the
>> compiler/language.
> Well, that means there will have to be a new ANSI D standard every two years? :)

Just an appendix upgrade. It happens with the C++ standard, called a
"Technical Report".

> So do I. I would be happy to have really good compiler with compatible
> asm for both Linux and Win32. However, it has nothing to do with C++
> standard. Nothing prevents compiler writers to make things compatible
> here.

Something clearly is preventing it, as it doesn't happen. The lack of a
standard to press the issue is the problem.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

unread,
Dec 28, 2005, 9:28:06 PM12/28/05
to

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41fb3dF...@individual.net...

> Walter Bright wrote:
>> Of course the powerpc and x86 will be different. But x86 on Windows and
>> x86
>> on Linux are identical. For C++, the x86 inline assembler for Windows and
>> Linux are wildly (and pointlessly) incompatible.
> AFAIK not if you are using the same compiler (GCC).

Every compiler vendor does it differently.

> However, it is
> more or less caused by things that have little to do with GCC or MSC or
> the C++ standard.

The whole point of a standard is to prevent gratuitous, pointless
incompatibilities. The standard leaves an inline assembler as totally
"implementation defined", which is why none of the inline assemblers are
compatible from one compiler vendor to another.

> The main problem is that linux development toolchains
> originally picked different assembler syntax for x86 than was the one
> provided by Intel.

There's no technical reason forcing this. But realistically, I know this
will never change for C and C++. But why repeat the problem with a new
language?

> However, I do not see how reasonable C++/D language standard could
> enforce assembler syntax without limiting the language to single CPU.

By providing an appendix, one for each CPU family, describing the assembler
syntax for that family. I don't see how that is so impossible to achieve,
it's just a bit on the tedious side. The Intel CPU manual describes the
format they use in about 3 pages. Most assembler formats are pretty simple
beasts.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

unread,
Dec 28, 2005, 9:20:29 PM12/28/05
to

"Dave Harris" <bran...@cix.co.uk> wrote in message
news:memo.20051228111242.1832A@brangdon.m...

> SeeWebsit...@moderncppdesign.com (Andrei Alexandrescu (See Website
> For Email)) wrote (abridged):
>> If I understand the documentation correctly, D makes a contract part of
>> the implementation and not part of the interface.
>
> I'm not sure that is correct. One reason is that when you override a
> function you can change its implementation, but in D you are still bound
> to its contract. I /think/ you are misled by the D syntax, which does not
> separate declarations from definitions for classes. You expect to see the
> interface in a header file, and D doesn't have header files.

I inferred the same.

> On the other hand, it does sound like contracts only apply to classes and
> functions. D does not support multiple inheritance, and instead has a
> Java-like keyword "interface", and as far as I can tell these "interfaces"
> don't support contract programming. Which does make me wonder if the
> language designer may have missed the point.
>
> Maybe Walter Bright could comment on this specific area? So far the
> replies I've seen have been about memory allocation and assembler, and not
> the subject the thread title.

You're right, it should be possible to add contracts to interface members
and abstract functions. Not allowing this is an oversight on my part, but
it's hardly fatal <g>, as the use of contract programming has caught on in
the D community and has proven to be invaluable. Adding contracts to
interface members and abstract functions won't break existing code, and will
be a significant improvement to the contract programming support.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Howard Hinnant

unread,
Dec 28, 2005, 9:28:54 PM12/28/05
to
In article <41fha3F...@individual.net>,
Mirek Fidler <c...@volny.cz> wrote:

> Funny. I have used almost the same example to compare my U++/NTL library
> with STL :)
>
> (see here: http://upp.sourceforge.net/examples$idmapBench.html).
>
> Actually, NTL is 2.5 times faster than STL in this example (which is IMO
> dominated by the map performance anyway).
>
> So my belief is that you are actually comparing a smart library vs. a
> stupid one (STL). Not D vs C++, nor GC vs manual memory management.
>
> (And yes, I think that it is the STL that makes C++ fail. But rather than
> inventing a completely new language, replacing the library is the simpler
> option IMHO :).

Or perhaps, instead of completely replacing STL, one might upgrade it to
NTL performance levels in a backwards compatible fashion?

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1771.html

;-)

One of my favorite Alexandrescu quotes is:

> As I'm sure you know very well, creating, copying around, and destroying
> temporary objects is the favorite indoor sport of your C++ compiler.

That is being fixed. I know it is a slow process, changing standards
takes time - but no more time than it would take to replace STL with NTL
in the standard.

-Howard

kanze

unread,
Dec 29, 2005, 7:13:44 AM12/29/05
to
Walter Bright wrote:
> "Dave Harris" <bran...@cix.co.uk> wrote in message
> news:memo.20051228111242.1832A@brangdon.m...
> > SeeWebsit...@moderncppdesign.com (Andrei Alexandrescu
> > (See Website For Email)) wrote (abridged):

> >> If I understand the documentation correctly, D makes a
> >> contract part of the implementation and not part of the
> >> interface.

> > I'm not sure that is correct. One reason is that when you
> > override a function you can change its implementation, but
> > in D you are still bound to its contract. I /think/ you are
> > misled by the D syntax, which does not separate
> > declarations from definitions for classes. You expect to see
> > the interface in a header file, and D doesn't have header
> > files.

> I inferred the same.

I'm curious about this. I would have thought that on large
projects, it is relatively important that the contract and the
implementation be in different files, unless you have some sort
of versionning system where the granularity is finer than the
file.

Note that C++ is far from perfect in this regard, and the fact
that a header file must often contain more than just the
contractual interface is a serious weakness, requiring a major
work-around in the form of the compilation firewall idiom. (Of
course, if the contractual interface is defined in an interface
class, with all data and actual behavior in the derived classes,
this isn't too much of a problem.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Walter Bright

unread,
Dec 29, 2005, 7:16:52 AM12/29/05
to

"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
news:Hee+5oLn...@robinton.demon.co.uk...

> In article <GaSdnVWw061...@comcast.com>, Walter Bright
> <wal...@nospamm-digitalmars.com> writes
>>Of course the powerpc and x86 will be different. But x86 on Windows and
>>x86
>>on Linux are identical. For C++, the x86 inline assembler for Windows and
>>Linux are wildly (and pointlessly) incompatible.
>
> You repeatedly make this assertion. However AFAIK D has a single
> implementor and that goes an immense way to ensure compatibility.

Actually, there are two independent D inline asm implementors - myself, and
David Friedman for GDC's (the reason is I couldn't release my inline
assembler under GPL). And for multiple implementors, having a standard is
the way to ensure compatibility. The first C standard, for example, brought
together many divergent implementations of C.

> There is no reason why a C++ compiler should not support identical
> 'inline assembler instructions'

Right, but in practice they don't.

> and no reason that two implementations
> from different implementors for the same machine should be compatible

There's plenty of reason they should be, and they are the same reasons that
we found it worthwhile to have a C++ *standard*.

> (and that isn't just for asm but at link time quite a lot of other
> things as well)

All aspects of C++ source code, which includes any inline assembler, are
appropriate topics for the C++ standard.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

unread,
Dec 29, 2005, 7:17:53 AM12/29/05
to

"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
news:Ius9VcMs...@robinton.demon.co.uk...

>>I see gratuitous, pointless incompatibilities.
> Which has exactly nothing to do with the C++ Standard.

Implementation defined behavior does have something to do with the C++
Standard.

> Now if you want to have a go at Microsoft for using a different syntax


> to GCC, you are free to do so but this is not the place for it.

The topic is whether it should be standardized or left as implementation
defined. Which syntax is preferable is another issue entirely, and it's
pointless to debate that if the first is not resolved. Since some here argue
that standard C++ has an inline assembler, and others argue it does not,
this thread is an appropriate topic for this newsgroup. D's inline assembler
serves as an example of how a standard inline assembler can work.

> The compatibility of asm support in D for Linux and Windows on x86
> architectures is an artefact of their being a single implementor and not
> a result of a well defined standard.

There are two independent implementations of D inline assembler now. The
reason they are compatible is not an accident, it is because I was able to
make a convincing case that they should be the same to the people (i.e.
David Friedman) doing the work.

> Please note that it is completely
> impossible for WG21 to specify the syntax for asm if C++ is to be
> processor independent.

I don't agree. Have an appendix for each processor family outlining the
syntax for it. If a new CPU architecture comes out between updates to the
standard, or before a TR could be done for it, nothing is stopping an
implementation from forging ahead anyway to implement it. After all, that
happens now with proposed new features of C++.

>Yes it would be nice if those targeting the same
> hardware used the same syntax but that is not likely to happen unless
> there is some Standard for the assembler for that hardware.

Isn't that what I've been saying? <g>

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

unread,
Dec 29, 2005, 7:18:15 AM12/29/05
to

"kanze" <ka...@gabi-soft.fr> wrote in message
news:1135784807....@o13g2000cwo.googlegroups.com...

> I'd say that the main difference is that D programmers use
> garbage collection, where as many C++ still don't. And one of
> the reasons is doubtlessly that garbage collection is
> "officially" part of D, where as it is still only a third party
> add in in C++.

Yes, I agree. I might also point out that gc in C++ is (usually) a third
party add-on, and many developers are leery of using packages that are not
officially supported by their compiler vendor (especially something as
intrusive as a gc). Using gc successfully in C++ is not for the novice
programmer.

>> Tremendous power? I think that's an exaggeration. It's
>> tempting to use it, but it winds up with the same faults as
>> using global state variables. You wind up tripping up library
>> functions that depend on it, having incompatibilities with 3rd
>> party code, etc. Use a class specific allocator for specific
>> needs, or just create a mymalloc() function and use that.
>
> The ability to define global placement new is an important
> advantage, because new (xxx) T( a, b ) also calls the
> constructor, which mymalloc() won't.

If you've got a constructor, you can also overload operator new for that
class. No need to do it globally.

> The ability to define a
> global replacement for the standard new is useful for
> debugging...

Not very. And even if it wasn't officially replaceable, one can still
replace it for debugging purposes.

> or integrating the garbage collector:-).

That's where the trouble comes in. Suppose I use a third party library which
is not gc friendly, and it uses global operator new?

> And everything you can do in C can also be done in assembler,
> with enough discipline. Been there, done that. But whips and
> leather aren't my thing. No point in making writing correct
> code more work than necessary.

Exactly.

> In fact, there is ONE very compelling argument against garbage
> collection. Without it, you need more programmers, with a
> higher skill level. Which increases the demand, and thus pushes
> up the tariff for the experts like myself. Other than that,
> however, I can't see any advantage of not having such a useful
> tool available.

Yes. All one's carefully built up expertise with smart pointers becomes, er,
irrelevant <g>.


> How about between Linux (on IA-32 architectures) and Solaris (on
> Sparc).

I believe I answered that in another post.

> I suppose that the goal is to recover the contents of the
> registers ; that fullcollect exploits the (formally illegal)
> address it is passed.

You're right.

> My experience with this is that even on the same platform,
> different compilers or systems will use slightly different stack
> layouts. Under Sun OS 4, for example, the exact position of the
> frame pointer relative to a local variable was slightly
> different than under Solaris (Sun OS 5).

Stack layout is implementation defined, but that doesn't mean the assembler
syntax needs to be implementation defined as well.

> I might add in passing that the Microsoft code above is not
> according to the standard.

I know. Their inline assembler implementation predates the standard (as does
Digital Mars').

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

unread,
Dec 29, 2005, 7:16:30 AM12/29/05
to

"kanze" <ka...@gabi-soft.fr> wrote in message
news:1135781743.3...@g14g2000cwa.googlegroups.com...

>> 1) For some things, you just gotta do it in assembler. Having
>> an inline assembler does the job 95% of the time, and obviates
>> the need for a separate assembler, extra files, a more complex
>> build process, etc.
> It's curious, but my experience has been just the opposite.
> I've never found a use for inline assembler, even when it has
> been present. To me, it just seems far more natural to write a
> separate module in assembler when I need assembler.

Here's why an inline assembler is worthwhile:

1) It simplifies the build process. Don't need to locate an assembler, put
it on the path, figure out the switches to it that are analogous to the C++
compiler's, etc.

2) Separate assemblers are on a different upgrade cycle than compilers, and
are often from a different vendor. For example, on the PC, I depended on
Microsoft's assembler. Unfortunately, they'd break my existing asm code with
nearly every upgrade. Getting the customer to build the library and have it
match the ones I build was a major aggravation because of this, and the
problem remains to this day.

3) To have a separate assembler module, you have to learn and adjust the
function calling conventions. This is a significant problem when you're
trying to support multiple memory models with the same code.

4) I didn't find much joy in writing header files twice, once for C++, and
again for the asm modules.

5) Neither was it much fun to rewrite struct declarations in the assembler
format. Class declarations were worse, because one must reverse engineer how
the C++ compiler laid them out.

6) Separate modules have no access to local variables.

7) It's a lot more writing to do a separate module.

> I'm not sure why our experiences here differ. Part of it may be
> because I have done a lot of programming in assembler, in the
> past, and am more or less used to writing entire modules in
> assembler.

That's not it, because I've written entire programs in assembler, such as a
cartridge for the Mattel Intellivision, and a VT100 terminal emulator for
the IBM PC. There's also the IEEE 754 floating point emulator for the C
compiler.

> And possibly, it is because the inline assemblers
> I've used haven't been that well integrated -- I've had problems
> figuring out which registers I could or could not use, where
> specific variables were, etc. All of this is strictly defined
> in the API for a separate function.

That's true, there are some crummy inline assemblers. Standardization would
help that, too.

> I suspect that this depends. Most of what I use inline assembly
> for are things so specific to the machine AND the system that
> I'd have to rewrite it anyway. And of course, the only
> portability which has really interested me in the past was
> between Solaris, and one of HP/UX, AIX or MS-DOS. So you can
> forget any idea of portable assembler anyway.

In D, having a portable inline assembler has already paid off. There are
crucial bits of the runtime library in assembler. I don't have to rewrite
them moving from Windows and Linux.

> Frankly, too, I think there is a quality of implementation issue
> involved. The standard doesn't specify much about inline
> assembler because from a quality of implementation point of
> view, you want it to be as close as possible to native
> assembler.

How does specifying it reduce quality? Does the C++ standard reduce the
quality of C++ compilers?

> Thus, it is almost imperative (again, from a QoI
> point of view) that even something as basic as the order of
> operands differ between an Intel machine and a Sparc : if the
> destination isn't the left most parameter on an Intel, and the
> right most on a Sparc, then there is a serious lack of quality.

Yes, but nobody said the sparc inline assembler must match the x86 inline
assembler. The only thing I've been advocating is that the x86 inline
assembler match for all compilers generating code for the x86, and ditto for
the sparc.

>> 3) Needing a few lines of assembler here and there doesn't
>> mean one needs a full blown macro assembler. Usually, the
>> need is less than 10 lines.
>
> Often, it can be as little as two or three lines. It's still
> easier, IMHO, to put it in a separate file.

I'm game <g>. Write the following in C++ source code:

int a, b, c;
try
{
a = b * c;
}
catch (IntegerOverflow)
{
...
}

I can do it in 4-5 lines of inline assembler.

> As soon as you
> support more than one platform (and everything I write today
> MUST compile on both a Sparc and a Linux based PC), you need
> separate files for it anyway.

No you don't. Use #ifdef. Honestly, from the responses here, I almost
believe that people are going looking for problems so they can throw in the
towel, rather than looking for solutions.

It's invariably fewer lines to write, as you can use the compiler to handle
boilerplate. See my comments above on reimplementing header files, struct
declarations, etc.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

unread,
Dec 29, 2005, 7:18:59 AM12/29/05
to

"Howard Hinnant" <howard....@gmail.com> wrote in message
news:howard.hinnant-D23...@syrcnyrdrs-01-ge0.nyroc.rr.com...

> I think we're talking well-integrated inline assembler here. In the
> inline assembler I've used most, you don't deal directly with registers.
> That's the domain of the compiler's register allocator. You name
> abstract registers and let the compiler deal with it.

That can be pretty cool, but it more or less requires an instruction set
with an orthogonal register set. That isn't the case with the x86.

However, the inline assembler still can be pretty well integrated in with
the compiler. For example, for DMC++ and DMD, the inline assembler keeps
track of which registers are read/written, so that:

1) register allocation will work for the rest of the function

2) the compiler can automatically save/restore registers on the function
prolog/epilog

This is far easier and less error prone than the g++ method of having the
programmer annotate the code with which registers got changed.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


kanze

Dec 29, 2005, 7:15:03 AM
Howard Hinnant wrote:
> In article <1135781743.3...@g14g2000cwa.googlegroups.com>,
> "kanze" <ka...@gabi-soft.fr> wrote:

> > It's curious, but my experience has been just the opposite.
> > I've never found a use for inline assembler, even when it
> > has been present. To me, it just seems far more natural to
> > write a separate module in assembler when I need assembler.

> > I'm not sure why our experiences here differ. Part of it
> > may be because I have done a lot of programming in
> > assembler, in the past, and am more or less used to writing
> > entire modules in assembler. And possibly, it is because
> > the inline assemblers I've used haven't been that well
> > integrated -- I've had problems figuring out which registers
> > I could or could not use, where specific variables were,
> > etc. All of this is strictly defined in the API for a
> > separate function.

> I think we're talking well-integrated inline assembler here.
> In the inline assembler I've used most, you don't deal
> directly with registers. That's the domain of the compiler's
> register allocator. You name abstract registers and let the
> compiler deal with it.

And what happens if you allocate more abstract registers than
the machine has? Or more to the point, what happens if not all
registers are equal: definitely the case on the old Intel 16 bit
machines (where I did most of my assembler), but even on a
Sparc, you generally need to be able to control in which
subrange (o, i, l or g) you are, and sometimes the specific
register in the o and i sets.

I guess it depends somewhat on what you need to do. I use
inline assembler to generate the ta 3 instruction necessary to
synchronize the register stack and memory on a Sparc, before
doing the stack walkback: it's a single instruction, using no
registers or arguments other than the constant 3, so there's no
problem. I use a separate module for atomicRead and
atomicFetchAndAdd -- I suppose that I could use inline
assembler, but for software engineering reasons, they are in
separate functions anyway, and I don't see any real advantage in
moving them into C++, with inline assembler for the instructions
I can't access in C++ -- I find it easier to read if the entire
algorithm is in a single language, and there isn't that much in
them which can be expressed in C++ anyway.

Of course, there might be some argument if the inline assembler
were in an inline function, which the compiler really did inline
correctly. A function call isn't that expensive, but when the
actual function invoked consists of only two instructions (not
counting the return), it probably could make a difference in
some cases.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Branimir Maksimovic

Dec 29, 2005, 8:03:12 PM
Walter Bright wrote:
> "Mirek Fidler" <c...@volny.cz> wrote in message
> news:41fha3F...@individual.net...
>

>>>It's certainly the conventional wisdom that GC is slower. But try running
>>>some benchmarks; you might be surprised. Here's one:
>>>www.digitalmars.com/d/cppstrings.html
>>
>>Funny. I have used almost the same example to compare my U++/NTL library
>>with STL :)
>>
>>(see here: http://upp.sourceforge.net/examples$idmapBench.html).
>>
>>Actually, NTL is 2.5 times faster than STL in this example (which is IMO
>>dominated by the map performance anyway).
>>
>>So my belief is that you are actually comparing a smart library vs. a
>>stupid one (STL). Not D vs. C++, nor GC vs. manual memory management.
>>
>>(And yes, I think that it is STL what makes C++ fail. But rather than
>>inventing completely new language, replacing the library is simpler
>>option IMHO :).
>
>
> In my experience, the reason C++ is slower for the benchmark is because C++
> needs to make copies of strings. In D, because of gc, it becomes possible to
> just point at substrings, and so the D version just manipulates pointers,
> where the C++ is constantly copying.

In this benchmark D wins because of its very fast assoc arrays.
I made a MyString that just keeps references, and used
google::dense_hash_map in order to reach D's performance.
Here are the results:
[bmaxa@maxa ~] $ time ./wcd alice30.txt > /dev/null

real 0m0.018s
user 0m0.010s
sys 0m0.010s
[bmaxa@maxa ~] $ time ./wc alice30.txt > /dev/null

real 0m0.021s
user 0m0.020s
sys 0m0.000s
[bmaxa@maxa ~] $ time ./wccpp alice30.txt > /dev/null

real 0m0.042s
user 0m0.040s
sys 0m0.000s
[bmaxa@maxa ~] $ time ./wccpp1 alice30.txt > /dev/null

real 0m0.046s
user 0m0.040s
sys 0m0.000s

wccpp and wccpp1 are the C++ versions with maps, wc is my optimised
program, and wcd is the D word-counting program.
I thought I had finally matched it, but wait: I concatenated
alice30.txt about 10 times in order to test hash-table lookup
speed. And surprise:
[bmaxa@maxa ~] $ time ./wccpp huge.txt > /dev/null

real 0m7.109s
user 0m6.170s
sys 0m0.160s
[bmaxa@maxa ~] $ time ./wccpp1 huge.txt > /dev/null

real 0m6.678s
user 0m6.500s
sys 0m0.190s
[bmaxa@maxa ~] $ time ./wc huge.txt > /dev/null

real 0m2.366s
user 0m2.300s
sys 0m0.070s
[bmaxa@maxa ~] $ time ./wcd huge.txt > /dev/null

real 0m1.270s
user 0m1.200s
sys 0m0.070s

D's assoc array lookups are almost twice as fast as
google::dense_hash_map lookups; std::map is out of its league here.
This benchmark is about assoc array lookups, not
about memory management.
Nothing is freed in either my benchmark or the D one, as MyString
substrings just hold a size and a reference into other strings, and
so forth.

Greetings, Bane.

Walter Bright

Dec 29, 2005, 8:14:11 PM

"kanze" <ka...@gabi-soft.fr> wrote in message
news:1135851547.4...@g14g2000cwa.googlegroups.com...

> I'm curious about this. I would have thought that on large
> projects, it is relatively important that the contract and the
> implementation be in different files,

That's the conventional C++ approach, which has evolved from the necessity
to write header files for all one's source files. This necessity was the
result of primitive C compiler technology, and it's been recast as a feature
<g>.

The only time one needs this is when one is specifying interface classes or
abstract classes, and I agree that putting contracts on such is a good idea.

> unless you have some sort
> of versioning system where the granularity is finer than the
> file.
>
> Note that C++ is far from perfect in this regard, and the fact
> that a header file must often contain more than just the
> contractual interface is a serious weakness, requiring a major
> work-around in the form of the compilation firewall idiom. (Of
> course, if the contractual interface is defined in an interface
> class, with all data and actual behavior in the derived classes,
> this isn't too much of a problem.)

I think we're in agreement, then.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Branimir Maksimovic

Dec 31, 2005, 8:19:15 AM
kanze wrote:
> Walter Bright wrote:
>
>>"Dave Harris" <bran...@cix.co.uk> wrote in message
>>news:memo.20051228111242.1832A@brangdon.m...
>>
>>>SeeWebsit...@moderncppdesign.com (Andrei Alexandrescu
>>>(See Website For Email)) wrote (abridged):
>
>
>>>>If I understand the documentation correctly, D makes a
>>>>contract part of the implementation and not part of the
>>>>interface.
>
>
>>>I'm not sure that is correct. One reason is that when you
>>>override a function you can change its implementation, but
>>>in D you are still bound to its contract. I /think/ you are
>>>mislead by the D syntax, which does not separate
>>>declarations from definitions for classes. You expect to see
>>>the interface in a header file, and D doesn't have header
>>>files.
>
>
>>I inferred the same.
>
>
> I'm curious about this. I would have thought that on large
> projects, it is relatively important that the contract and the
> implementation be in different files, unless you have some sort
> of versioning system where the granularity is finer than the
> file.

I've found a workaround for this in D.

It actually works out pretty well if you do this:
// header.d
module Foo;
class Intf {
    void foo();
}
void bar();
// end header.d

// impl.d
module Foo;
class Intf {
    void foo() {}
}

void bar() {}
// end impl.d
and then just compile impl.d into a lib and distribute header.d.
This actually works, but I'm not sure whether it is intended.
Andrei's complaint about contracts still stands in this
case, as they are part of the function implementation rather
than of the function signature.
Perhaps that would improve if the compiler generated binary D
interface files, the way GHC generates Haskell *.hi files, to be
used instead of headers; then one would need to provide
just library documentation instead of text headers + lib docs.

Greetings, Bane.

Mirek Fidler

Dec 31, 2005, 8:16:38 AM
>>Anyway, if one is to follow GC path (which IMHO is wrong decision), why
>>he should prefer D over C#?
>
>
> The simple answer is D is for native code, and C# is not.

There is nothing that would prevent C# from being compiled into native code.

>>(And yes, I think that it is STL what makes C++ fail. But rather than
>>inventing completely new language, replacing the library is simpler
>>option IMHO :).
>
>
> In my experience, the reason C++ is slower for the benchmark is because C++
> needs to make copies of strings. In D, because of gc, it becomes possible to
> just point at substrings, and so the D version just manipulates pointers,
> where the C++ is constantly copying.

Well, I believe that in that particular example, the number of times
the code has to walk through the strings to compare them and/or
generate hashes far outweighs the time spent copying chars.

Moreover, that example is unintentionally cheating, as probably NO
garbage collection is performed (if you are using something like Boehm's
implementation, allocating the array for the whole file will allocate
more memory than is needed in the subsequent code, and the remaining
allocations will likely not trigger another mark & sweep, as Boehm's GC
reserves 50% more after the first step).

>>So does GCC. You will have to wait for independent D compiler
>>implementation to really find out how your language standard works.
>
>
> It already exists, it's GDC (the Gnu D Compiler) created by David Friedman.

Well, ok. (Makes me think that David could contribute Intel asm to gcc :)

Mirek

Mirek Fidler

Dec 31, 2005, 8:17:17 AM
> Or perhaps, instead of completely replacing STL, one might upgrade it to
> NTL performance levels in a backwards compatible fashion?
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1771.html
>
> ;-)
>
> One of my favorite Alexandrescu quotes is:
>
>
>>As I'm sure you know very well, creating, copying around, and destroying
>>temporary objects is the favorite indoor sport of your C++ compiler.
>
>
> That is being fixed. I know it is a slow process, changing standards
> takes time - but no more time than it would take to replace STL with NTL
> in the standard.

Who is speaking about standard? Does this really need to be standard?

Mirek

Walter Bright

Dec 31, 2005, 8:21:16 AM

"Branimir Maksimovic" <bm...@hotmail.com> wrote in message
news:dp13e2$fc3$1...@domitilla.aioe.org...

>> In my experience, the reason C++ is slower for the benchmark is because
>> C++
>> needs to make copies of strings. In D, because of gc, it becomes possible
>> to
>> just point at substrings, and so the D version just manipulates pointers,
>> where the C++ is constantly copying.
>
> In this benchmark D wins because of very fast assoc arrays.

Very interesting results. I didn't think there was anything special about
D's AA performance. Just goes to show, benchmarking results are often
surprising! It's especially surprising since C++ template implementations
are supposed to be inlining things for speed, and D's AA routines are an OOP
virtual function approach.

Nevertheless, copying a pointer and a length is pretty much guaranteed to be
faster than doing malloc/copy/free, although it apparently isn't dominant in
this benchmark.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Mirek Fidler

Dec 31, 2005, 8:47:11 AM
>>(And yes, I think that it is STL what makes C++ fail. But rather than
>>inventing completely new language, replacing the library is simpler
>>option IMHO :).
>
>
> Or perhaps, instead of completely replacing STL, one might upgrade it to
> NTL performance levels in a backwards compatible fashion?
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1771.html

Well, I would like to add that n1771 solves about 25% of the STL's
problems... Will we have to wait another 50 years to solve the rest? :)

Mirek

kanze

Dec 31, 2005, 8:45:44 AM
Walter Bright wrote:
> "kanze" <ka...@gabi-soft.fr> wrote in message
> news:1135784807....@o13g2000cwo.googlegroups.com...

>> I'd say that the main difference is that D programmers use
>> garbage collection, where as many C++ still don't. And one
>> of the reasons is doubtlessly that garbage collection is
>> "officially" part of D, where as it is still only a third
>> party add in in C++.

> Yes, I agree. I might also point out that gc in C++ is
> (usually) a third party add-on, and many developers are leery
> of using packages that are not officially supported by their
> compiler vendor (especially something as intrusive as a gc).

I find that the inhibitions in that regard more often come from
management.

> Using gc successfully in C++ is not for the novice programmer.

I'm not sure. It took me about ten minutes to install the Boehm
collector, so that I used it systematically, but of course, with
over fifteen years experience in C++, I'm hardly in the novice
category. Once installed, it is for all practical purposes
transparent. About the only difference is that if you don't
have any particular actions that are needed at the end of object
lifetime, you can skip the delete. (Can, but don't have to...
Existing code written without garbage collection in mind works
perfectly well.)

>>> Tremendous power? I think that's an exaggeration. It's
>>> tempting to use it, but it winds up with the same faults as
>>> using global state variables. You wind up tripping up
>>> library functions that depend on it, having
>>> incompatibilities with 3rd party code, etc. Use a class
>>> specific allocator for specific needs, or just create a
>>> mymalloc() function and use that.

>> The ability to define global placement new is an important
>> advantage, because new (xxx) T( a, b ) also calls the
>> constructor, which mymalloc() won't.

> If you've got a constructor, you can also overload operator
> new for that class. No need to do it globally.

If you want to do it on a class specific level, fine. If you
have classes that are sometimes in shared memory, and the
decision whether to use shared memory or not is made at
allocation time, it's a different issue.

It's not a feature that I use frequently, but when I do, it's
very convenient.
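A hedged sketch of the case James describes, with the arena chosen at the allocation site while the constructor still runs (SharedArena here is a plain in-process stand-in for a real shared-memory pool; all names are mine):

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for a shared-memory pool; in James's scenario the
// "shared or not" decision is made at the allocation site.
struct SharedArena {
    char buf[1024];
    std::size_t used = 0;
};

// Global placement operator new: `new (arena) T(args...)` picks the
// allocation strategy at the call site AND still runs T's constructor,
// which a bare mymalloc() cannot do.
void* operator new(std::size_t n, SharedArena& a) {
    void* p = a.buf + a.used;
    a.used += (n + 15) & ~std::size_t(15);   // crude 16-byte alignment
    return p;
}
// Matching placement delete, called only if the constructor throws.
void operator delete(void* /*p*/, SharedArena& /*a*/) noexcept {}

struct Point {
    int x, y;
    Point(int px, int py) : x(px), y(py) {}
};
```

Usage is then `SharedArena arena; Point* p = new (arena) Point(1, 2);` — same constructor syntax, different memory.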

>> The ability to define a global replacement for the standard
>> new is useful for debugging...

> Not very. And even if it wasn't officially replaceable, one
> can still replace it for debugging purposes.

One might be able to, yes. One can usually replace malloc,
which isn't officially replaceable. But what's the harm in
requiring it to be possible?
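The debugging use is easy to sketch: replace the replaceable global allocation functions and count live allocations (counting only — a real debug allocator would also record sizes, call sites, guard bytes, and so on):

```cpp
#include <cstdlib>
#include <new>

// Live-allocation counter maintained by the replaced global operators.
static std::size_t g_live_allocs = 0;

// Replaceable global operator new: forward to malloc and count.
void* operator new(std::size_t n) {
    void* p = std::malloc(n ? n : 1);
    if (!p) throw std::bad_alloc();
    ++g_live_allocs;
    return p;
}

// Matching replaceable operator delete.
void operator delete(void* p) noexcept {
    if (p) { --g_live_allocs; std::free(p); }
}
```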

>> or integrating the garbage collector:-).

> That's where the trouble comes in. Suppose I use a third
> party library which is not gc friendly, and it uses global
> operator new?

What do you mean by "not gc friendly"? I have never had a
problem with third party libraries with the Boehm collector.
They don't necessarily benefit from it, because they always do
call delete, but they do work, and I still benefit from it.

[...]


>> How about between Linux (on IA-32 architectures) and Solaris
>> (on Sparc).

> I believe I answered that in another post.

I just keep bringing it up because it is really the only
portability which interests me currently.

>> I suppose that the goal is to recover the contents of the
>> registers ; that fullcollect exploits the (formally illegal)
>> address it is passed.

> You're right.

>> My experience with this is that even on the same platform,
>> different compilers or systems will use slightly different
>> stack layouts. Under Sun OS 4, for example, the exact
>> position of the frame pointer relative to a local variable
>> was slightly different that under Solaris (Sun OS 5).

> Stack layout is implementation defined, but that doesn't mean
> the assembler syntax needs to be implementation defined as
> well.

But what do you gain by standardization? You have to adapt the
code for each platform anyway.

>> I might add in passing that the Microsoft code above is not
>> according to the standard.

> I know. Their inline assembler implementation predates the
> standard (as does Digital Mars').

The history of inline assembler is interesting in itself. The
original K&R C had it. ISO C dropped it, but C++ reinstated it
(presumably because the original implementations of C++ were
based on K&R C, there being no ISO C at the time). I don't have
access to my K&R C documents here, but I would suppose that the
current C++ standard corresponds to what the early AT&T C
compilers did. Which would seem to me to be a sort of defacto
standard.

In pure C++, of course, there has never been a standard which
didn't support inline assembler. Before 1998, the ARM was the
de facto standard, and before the ARM, the reference sections of
"The C++ Programming Language". As far as I know, all of these
defined "asm" exactly as it is defined in the ISO C++ standard.
So there is really no excuse for any compiler not being conformant
here.

Which, of course, brings up another question: what's the point
of standardizing anything, if the implementors are just going to
ignore the standard? (Obviously, since you are arguing for a
more rigorous standard, your C++ compiler is fully compliant
with the current standard. Which is good news, because it means
that there are now at least two compilers which implement
export.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

kanze

Dec 31, 2005, 9:07:08 AM
Walter Bright wrote:
> "kanze" <ka...@gabi-soft.fr> wrote in message
> news:1135851547.4...@g14g2000cwa.googlegroups.com...
>> I'm curious about this. I would have thought that on large
>> projects, it is relatively important that the contract and
>> the implementation be in different files,

> That's the conventional C++ approach, which has evolved from
> the necessity to write header files for all one's source
> files. This necessity was the result of primitive C compiler
> technology, and it's been recast as a feature <g>.

> The only time one needs this is when one is specifying
> interface classes or abstract classes, and I agree that
> putting contracts on such is a good idea.

You are ignoring then the value of a contractual interface for
value type classes, which don't use inheritance?

The issue is, of course, not a simple one, and there are many
solutions. And the use of C++ header files as a solution is in
some ways a hack, although in practice, with judicious use of
the compilation firewall idiom, it tends to work fairly well.
Header files, as such, aren't a feature -- I don't think anyone
could argue that. But they do provide (in a very bad way) a
functionality which is necessary. (Ada does it much better, for
example, with its interface and implementation sections -- which
would also normally be in separate files.)
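The compilation-firewall ("pimpl") idiom mentioned above can be sketched briefly; in a real project the two halves live in widget.h and widget.cpp, and the class name is mine (the thread predates C++11, so std::unique_ptr is an anachronistic convenience here):

```cpp
#include <memory>

// --- widget.h: the contractual interface clients compile against ---
class Widget {
public:
    Widget();
    ~Widget();
    int value() const;
private:
    struct Impl;                    // opaque: defined only in widget.cpp
    std::unique_ptr<Impl> impl_;
};

// --- widget.cpp: private state, free to change without client rebuilds ---
struct Widget::Impl {
    int value = 42;
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;        // Impl is complete here, so this compiles
int Widget::value() const { return impl_->value; }
```

The point is exactly the one James makes: the header then carries only the interface, and implementation changes no longer force client recompilation.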

Another solution, which also works to a point, is to use a
different language completely to define the interfaces. I've
worked on several projects, for example, where the interfaces
were systematically specified in Rational Rose. Rose's support
for programming by contract is very limited, however. Still, if
I were to develop a large project in Java, this is the way I
would go. (There is always the risk that someone modifying the
source accidentally modifies something generated by Rose, and
not only what he is allowed to modify. It should be possible to
write some sort of check-in script which validates the code
against the petal file to prevent this, however.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

kanze

Dec 31, 2005, 9:05:00 AM
Walter Bright wrote:
> "kanze" <ka...@gabi-soft.fr> wrote in message
> news:1135781743.3...@g14g2000cwa.googlegroups.com...

>>> 1) For some things, you just gotta do it in assembler.
>>> Having an inline assembler does the job 95% of the time,
>>> and obviates the need for a separate assembler, extra
>>> files, a more complex build process, etc.

>> It's curious, but my experience has been just the opposite.
>> I've never found a use for inline assembler, even when it
>> has been present. To me, it just seems far more natural to
>> write a separate module in assembler when I need assembler.

> Here's why an inline assembler is worthwhile:

> 1) It simplifies the build process. Don't need to locate an
> assembler, put it on the path, figure out the switches to it
> that are analogous to the C++ compiler's, etc.

All of the compilers I use are in fact compiler drivers, all of
them know about the assembler, and all of them invoke it
automatically when given an assembler source. I use exactly the
same rule in my makefiles for C++ sources and assembler sources.

> 2) Separate assemblers are on a different upgrade cycle than
> compilers, and are often from a different vendor. For
> example, on the PC, I depended on Microsoft's assembler.
> Unfortunately, they'd break my existing asm code with nearly
> every upgrade. Getting the customer to build the library and
> have it match the ones I build was a major aggravation because
> of this, and the problem remains to this day.

I'll admit that I've never seen this. But back when I was doing
assembler for Intel, we always used the Intel assembler, and
since then, I've been exclusively active on Unix platforms,
where the assembler is normally bundled with the OS -- at least,
I've never asked myself where it came from, and it has always
been just there.

In sum, I regularly have more problems with awk and the other
tools I use than with the assembler. And this despite the fact
that awk is standardized, as part of Posix: under Solaris, I
need a special path ("/usr/xpg4/bin/awk", rather than the
expected "/usr/bin/awk"), and under Linux, I have to give it
special flags.

> 3) To have a separate assembler module, you have to learn and
> adjust the function calling conventions. This is a
> significant problem when you're trying to support multiple
> memory models with the same code.

But the instructions you use won't necessarily be the same
depending on the memory model either. In all cases, you have to
learn a lot of very platform specific details.

> 4) I didn't find much joy in writing header files twice, once
> for C++, and again for the asm modules.

Asm modules don't have header files:-). (We're not talking
about writing a significant part of the application in
assembler. Just accessing a few hardware details which you
can't access in the language itself.)

> 5) Neither was it much fun to rewrite struct declarations in
> the assembler format. Class declarations were worse, because
> one must reverse engineer how the C++ compiler laid them out.

It's been years (maybe twenty) since I've written anything in
assembler which would require a struct or a class declaration.

> 6) Separate modules have no access to local variables.

So?

> 7) It's a lot more writing to do a separate module.

??? I don't understand this one. The total number of lines I
have to write doesn't really change much. What's the difference
whether I put them in one file, or in two?

>> I'm not sure why our experiences here differ. Part of it
>> may be because I have done a lot of programming in
>> assembler, in the past, and am more or less used to writing
>> entire modules in assembler.

> That's not it, because I've written entire programs in
> assembler, such as a cartridge for the Mattel Intellivision,
> and a VT100 terminal emulator for the IBM PC. There's also the
> IEEE 748 floating point emulator for the C compiler.

>> And possibly, it is because the inline assemblers I've used
>> haven't been that well integrated -- I've had problems
>> figuring out which registers I could or could not use, where
>> specific variables were, etc. All of this is strictly
>> defined in the API for a separate function.

> That's true, there are some crummy inline assemblers.
> Standardization would help that, too.

Can standardization impose minimum quality? I don't think so.

>> I suspect that this depends. Most of what I use inline
>> assembly for are things so specific to the machine AND the
>> system that I'd have to rewrite it anyway. And of course,
>> the only portability which has really interested me in the
>> past was between Solaris, and one of HP/UX, AIX or MS-DOS.
>> So you can forget any idea of portable assembler anyway.

> In D, having a portable inline assembler has already paid off.
> There are crucial bits of the runtime library in assembler. I
> don't have to rewrite them moving from Windows and Linux.

When I move from Windows to Linux, I have to redo a lot of
things: everything concerning threading, for example, some of
the socket code, much of the more basic file handling (reading
directories, etc.). The couple of lines of assembler don't
really make a significant change.

Where the difference shows up is porting between Solaris and
Linux. Because 99% of my Solaris code works as is under Linux.
The big exception is anything written in inline assembler. (The
other is things like stack walkback. Even though the Linux
version doesn't contain a single line of assembler.)

>> Frankly, too, I think there is a quality of implementation
>> issue involved. The standard doesn't specify much about
>> inline assembler because from a quality of implementation
>> point of view, you want it to be as close as possible to
>> native assembler.

> How does specifying it reduce quality? Does the C++ standard
> reduce the quality of C++ compilers?

As I said, you want it to be as close as possible to the native
assembler. The Intel assembler for IA-32, the Sun assembler for
Sparc, etc. And anything you specify will go against this.

It's about as if you were specifying C++ "as close to C as
possible", but C used {...} on one platform, and BEGIN...END on
another.

>> Thus, it is almost imperative (again, from a QoI point of
>> view) that even something as basic as the order of operands
>> differ between an Intel machine and a Sparc : if the
>> destination isn't the left most parameter on an Intel, and
>> the right most on a Sparc, then there is a serious lack of
>> quality.

> Yes, but nobody said the sparc inline assembler must match the
> x86 inline assembler. The only thing I've been advocating is
> that the x86 inline assembler match for all compilers
> generating code for the x86, and ditto for the sparc.

OK. The standard can't very well say much about that, EXCEPT
what it already says (which is to provide a means of escaping to
the relative syntax). Quality of implementation, of course,
says that the inline assembler will match the native platform
assembler. My only recent experience with inline assembler is
on a Sparc, and I can say that there, this is the case, and the
inline assembler I wrote for Sun CC works with g++. If this is
not the case for Windows, there is a serious quality of
implementation issue -- given the examples you posted, I would
say that neither VC++ nor g++ have acceptable quality here.

But what can you standardize (in the sense of ISO
standardization) which would be valid for all platforms?

I'm also tempted to say: what good would it do if you could?
From your posted example, I gather that Microsoft already
ignores the little standardization there is. There's also the
point that an implementation is required by the standard to
document all implementation defined behavior, and "the meaning
of an asm declaration is implementation-defined". I have been
relatively unsuccessful in all of my attempts to find such
documentation, however, with the exception of g++. So the
problem I currently see concering inline assembler isn't that it
isn't standardized enough, but that implementors are ignoring
the little standardization there is. How does adding to a
standard that everybody ignores help anything?

>>> 3) Needing a few lines of assembler here and there doesn't
>>> mean one needs a full blown macro assembler. Usually, the
>>> need is less than 10 lines.

>> Often, it can be as little as two or three lines. It's
>> still easier, IMHO, to put it in a separate file.

> I'm game <g>. Write the following in C++ source code:

> int a, b, c;
> try
> {
> a = b * c;
> }
> catch (IntegerOverflow)
> {
> ...
> }

> I can do it in 4-5 lines of inline assembler.

Why would you? Is there ever a time when you would want a catch
block in a module where you would want assembler?

On the other hand, you do have a general point I find
interesting: while I can't imagine wanting a catch block here, I
can very much imagine wanting to throw an exception. In IA-32
assembler (from memory -- and it's been a long time), what I
want is something like:

    mov  ax, a
    mul  b
    jno  $1      ; jump if no overflow
    throw exception
$1:

And I don't, off hand, know how to throw a C++ exception in
assembler. Which is some justification for inline assembler,
provided I can somehow get the compiler to treat the jno as an
if. (I've never seen an inline assembler which would support
that, but who knows?)

>> As soon as you support more than one platform (and
>> everything I write today MUST compile on both a Sparc and a
>> Linux based PC), you need separate files for it anyway.

> No you don't. Use #ifdef.

Never. I have to maintain my code.

> Honestly, from the responses here, I almost believe that
> people are going looking for problems so they can throw in the
> towel, rather than looking for solutions.

I don't know. Honestly, I've never found the problem great
enough to need more of a solution.

But the C++ itself requires boilerplate. It's more logical
boilerplate, I admit (and generally more useful to the reader),
but the number of lines doesn't change significantly here.

I can see some argument for the first function above. With
inline assembler, it would be simply:

int
GB_atomicRead( int* addr )
{
asm( "membar #LoadLoad" ) ;
return *addr ;
}

That is, of course, almost as many lines as the assembler. If
it is a separate function, there is a good chance that the
generated code will execute more instructions than my version,
but if inline assembler doesn't inhibit inlining, the results
could be much better than my separate function, gaining a call
and a return. (But it is the membar instruction which takes the
most time on a Sparc.)

The second is more complex, because of the loop, and the
necessity of using several registers. I'd like to see what the
equivalent was using a good inline assembler.

> See my comments above on reimplementing header files, struct
> declarations, etc.

What header files? What struct declarations? In C++, the
functions would be:

int
GB_atomicRead( int* addr )
{
return *addr ;
}

int
GB_atomicFetchAndAdd( int* addr, int delta )
{
int result = *addr ;
*addr += delta ;
return result ;
}

Except that the assembler versions work in a multithreaded
environment without locks. There is a header file for C++,
which declares the two functions as global, with "C" linkage,
and I would probably include it in a C++ version, but it isn't
formally necessary, and I didn't write one for the assembler.

This is the sort of thing I use assembler for. Things that you
cannot do in C++. And those things are typically very, very
small, and use very basic types.

(Which makes me wonder if part of the reason behind our
different experience isn't simply due to different applications
under a different system. I can imagine needing a lot more in
assembler, manipulating more complicated data structures, if I
were writing a debugger under a system which didn't have
something like Unix's ptrace facilities.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Mirek Fidler

unread,
Jan 1, 2006, 6:25:13 AM1/1/06
to
Walter Bright wrote:
> "Mirek Fidler" <c...@volny.cz> wrote in message
> news:41fbhdF...@individual.net...
>
>>Well, that alone would make me a lot nervous. Of course, I know that
>>statistically, conservative GC works well, but in the end it is still a
>>stochastic system with somewhat unpredictable behaviour. Designing a
>>language that REQUIRES conservative GC is perhaps pragmatic, but IMHO
>>limits applications of the language.
>>But maybe it is just me, as I consider ANY resource leaks unacceptable
>
>
> It's also possible for an application in C++ to wedge itself into an out of
> memory condition even though there is plenty of memory available, and even
> if there are no memory leaks. This can happen due to fragmentation of the
> free memory pool.

Well, maybe it is time to ask a question about conservative GC that has
been on my mind for some time now:

Imagine raster image processing software, images are stored using GC,
memory space is 32-bit (4GB) and 1GB of it is used for storing images.

Now you will load white noise into that 1GB of images. What happens to
conservative GC? If I understand things well, white noise will keep all
frames as marked and your app will run out of memory. Am I right?

This is what makes me nervous about conservative GC. I agree with your
above statement, but I think I can to some degree predict allocation
patterns with respect to size. Conservative GC adds a new problem here -
it depends not only on the size, but also on the data. And data is
something you have only a little control over....

Crashing my app by loading specific data sounds really scary...

Mirek

Mirek Fidler

unread,
Jan 1, 2006, 6:26:01 AM1/1/06
to
Walter Bright wrote:
> "Branimir Maksimovic" <bm...@hotmail.com> wrote in message
> news:dp13e2$fc3$1...@domitilla.aioe.org...
>
>>>In my experience, the reason C++ is slower for the benchmark is because
>>>C++
>>>needs to make copies of strings. In D, because of gc, it becomes possible
>>>to
>>>just point at substrings, and so the D version just manipulates pointers,
>>>where the C++ is constantly copying.
>>
>>In this benchmark D wins because of very fast assoc arrays.
>
>
> Very interesting results. I didn't think there was anything special about
> D's AA performance. Just goes to show, benchmarking results are often
> surprising! It's especially surprising since C++ template implementations
> are supposed to be inlining things for speed,

Well, binary trees' lousy performance can in no case be saved by inlining :)

As for superior performance of hash tables - I suppose that sometimes
you can get stellar results from a simplified hash function, only to be
bitten a while later by collisions. May I ask whether you are
doing some hash-code magic?

> Nevertheless, copying a pointer and a length is pretty much guaranteed to be
> faster than doing malloc/copy/free,

Mostly depends on what you are doing. Such an implementation of string
is ideal if a lot of slicing is done. However, it is less ideal for
"building" operations (I guess you have some dedicated class for this
purpose). Also, some operations are really a bit faster with a
zero-terminating character.

Mirek

Howard Hinnant

unread,
Jan 1, 2006, 6:27:59 AM1/1/06
to
In article <41kfq5F...@individual.net>,
Mirek Fidler <c...@volny.cz> wrote:

> >>(And yes, I think that it is STL what makes C++ fail. But rather than
> >>inventing completely new language, replacing the library is simpler
> >>option IMHO :).
> >
> >
> > Or perhaps, instead of completely replacing STL, one might upgrade it to
> > NTL performance levels in a backwards compatible fashion?
> >
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1771.html
>
> Well, I would like to add that n1771 solves about 25% of STL
> problems.... Will we have to wait for another 50 years to solve the rest? :)

auto_ptr "copy" semantics (i.e. moving from lvalues with copy syntax)
has been shown to be error prone:

auto_ptr<T> p1(/*...*/);
auto_ptr<T> p2 = p1; // p1 modified

For a more in-depth description see:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1856.html#20.4.5
%20-%20Class%20template%20auto_ptr

Rather than replacing the STL with an entire library based on auto_ptr's
destructive copy (from lvalues), I instead recommend deprecating
auto_ptr.

-Howard

Walter Bright

unread,
Jan 1, 2006, 6:44:48 AM1/1/06
to

"kanze" <ka...@gabi-soft.fr> wrote in message
news:1135936135....@g43g2000cwa.googlegroups.com...

> Walter Bright wrote:
>> Using gc successfully in C++ is not for the novice programmer.
> I'm not sure. It took me about ten minutes to install the Boehm
> collector, so that I used it systematically, but of course, with
> over fifteen years experience in C++, I'm hardly in the novice
> category.

There you go <g>.

> Once installed, it is for all practical purposes transparent.

No, it isn't. I think you're so used to dealing with it you don't notice the
problems anymore, as you unconsciously avoid them.

> If you want to do it on a class specific level, fine. If you
> have classes that are sometimes in shared memory, and the
> decision whether to use shared memory or not is made at
> allocation time, it's a different issue.

All I do is derive those classes from a common ancestor whose only job
is to declare operators new and delete. Done.

> But what's the harm in requiring it to be possible?

I look at it from a different point of view - what's the good in it? If the
good is marginal, and easily achieved by other (less bug prone) means, then
it isn't justified.

>> That's where the trouble comes in. Suppose I use a third
>> party library which is not gc friendly, and it uses global
>> operator new?
> What do you mean by "not gc friendly"?

1) You've never seen code that stores bit flags in pointers? <g> I have -
it's more common than you might suspect. With closed source third party
libraries, you're sailing into uncharted and untested territory when you
change the allocator that it's designed and tested for.

2) With a gc, you have to be careful where you store your 'roots', to be
sure that the collection pass of the gc knows about them. If the third party
library allocates some memory via an unusual method, a method not hooked by
the gc, and stores roots in it, the gc will mistakenly delete objects that
are still in use.

>> Stack layout is implementation defined, but that doesn't mean
>> the assembler syntax needs to be implementation defined as
>> well.
> But what do you gain by standardization?

If you see nothing to be gained by not having to rewrite inline assembler
even though the CPU is the same, I can't explain it any further.

> Which, of course, brings up another question: what's the point
> of standardizing anything, if the implementors are just going to
> ignore the standard?

The C++ "standard" for inline assembler is so faint as to not enhance source
portability by one iota by adhering to it. There's simply nothing to be
gained by being compliant here, as you have to rewrite 100% of the inline
assembler anyway to move from windows to linux.

What if all the C++ standard said was "C++ source code shall be in ASCII."
Would you fault any C++ compiler vendor for not bothering with that
standard? I wouldn't.

> (Obviously, since you are arguing for a
> more rigorous standard, your C++ compiler is fully compliant
> with the current standard. Which is good news, because it means
> that there are now at least two compilers which implement
> export.)

Sorry to say it, but DMC++ doesn't support exported templates.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers

Walter Bright

unread,
Jan 1, 2006, 6:46:31 AM1/1/06
to

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41j9d5F...@individual.net...

> Moreover, that example is unintentionally cheating, as probably NO
> garbage collecting is performed

It doesn't even do any allocation, so no need for collecting. Is that
cheating? I don't believe so, because a characteristic of gc is that *far
fewer allocations are necessary*. This is often the reason why gc apps can
outperform non-gc apps, so why shouldn't a benchmark illustrate that?

>> It already exists, it's GDC (the Gnu D Compiler) created by David
>> Friedman.
> Well, ok. (Makes me think that David could contribute Intel asm to gcc :)

That would be worthwhile, but it would be the gcc team's responsibility to
do the work.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

James Kanze

unread,
Jan 2, 2006, 5:17:28 AM1/2/06
to
Walter Bright wrote:
> "kanze" <ka...@gabi-soft.fr> wrote in message
> news:1135936135....@g43g2000cwa.googlegroups.com...

>>Walter Bright wrote:

>>>Using gc successfully in C++ is not for the novice programmer.

>>I'm not sure. It took me about ten minutes to install the
>>Boehm collector, so that I used it systematically, but of
>>course, with over fifteen years experience in C++, I'm hardly
>>in the novice category.

> There you go <g>.

>>Once installed, it is for all practical purposes transparent.

> No, it isn't. I think you're so used to dealing with it you
> don't notice the problems anymore, as you unconsciously avoid
> them.

What problems, for example?

>>If you want to do it on a class specific level, fine. If you
>>have classes that are sometimes in shared memory, and the
>>decision whether to use shared memory or not is made at
>>allocation time, it's a different issue.

> All I do is derive those classes from a common ancestor that
> all it does is declare operators new and delete. Done.

Which would mean, in the context of C++, that classes like
std::string and std::vector don't use garbage collection. And
that classes which contain members of these types can't use
garbage collection either, because they don't use it.

All I do for the moment is define a global operator new and
operator delete which call gcmalloc and do nothing,
respectively. Nothing else changes, except that when I know it
is acceptable (which is fairly often), I don't call delete.

As I say, I've not yet enough experience for my results to be
conclusive, but my impression to date is that this is a win-win
solution -- I have the advantages of both worlds. Perhaps when
I run into something more complex, where I want finalization,
I'll change my mind, but for the moment, it looks like a simple,
easy to use and easy to install solution.

>>But what's the harm in requiring it to be possible?

> I look at it from a different point of view - what's the good
> in it? If the good is marginal, and easily achieved by other
> (less bug prone) means, then it isn't justified.

>>>That's where the trouble comes in. Suppose I use a third
>>>party library which is not gc friendly, and it uses global
>>>operator new?

>>What do you mean by "not gc friendly"?

> 1) You've never seen code that stores bit flags in pointers?

If you mean using the low bit as a boolean flag, yes. I've not
only seen it, I've written it myself. In fact, I'm using it in
one of my applications today. (The application is a bit
special. The pointer is in a union with a pointer sized
unsigned -- the low order bit is used as a discriminator. The
results are ugly, as code goes, but in this particular case,
given the number of instances, the difference in memory use is
significant, and important to the usability of the application.)

Where is the problem with regards to garbage collection? This
particular application currently runs using (and counting on)
garbage collection. Without any problems that I've seen to
date.

> <g> I have - it's more common than you might suspect.

Hopefully, it's not that common -- it's ugly and difficult to
maintain. But of course, if it were a problem, once is enough.

> With closed source third party libraries, you're sailing into
> uncharted and untested territory when you change the allocator
> that it's designed and tested for.

My application is mainly designed to test a library; the funny
code is in the library. The library will probably, one day, be
made available on the network.

Of course, this library has been tested with garbage collection
(and in fact, doesn't work without it). Which isn't the case
for other third party software. On the other hand, a lot of
people are also using boost::shared_ptr, and other "simulated"
garbage collection, on code which was never tested with it. Sun
provides three versions of malloc with its compiler, and you can
pick up more off the net, but most of the third party libraries
I've seen have only been tested with one of them. This is the
reason why garbage collection, like different versions of
malloc, must be more or less transparent.

> 2) With a gc, you have to be careful where you store your
> 'roots', to be sure that the collection pass of the gc knows
> about them.

I've never paid any attention, and the Boehm collector doesn't
seem to mind. I know that there are ways of hiding pointers so
that it cannot see them (including certain optimization
techniques which might be used by the compiler), but in
practice, they're pretty obscure, and I've not found them to be
a problem in practice, at least not with g++ under Linux on a
PC.

> If the third party library allocates some memory via an
> unusual method, a method not hooked by the gc, and stores
> roots in it, the gc will mistakenly delete objects that are
> still in use.

I don't think you understand how the Boehm collector works. The
only problems would be if it somehow camouflaged the pointers --
writing them to disk, and not keeping an in-memory copy, or
adding some arbitrary value to them (undefined behavior, but
works on my machines). As I say, in practice, I've not found
this to be a problem, and quite frankly, I don't expect it to be
one.

>>>Stack layout is implementation defined, but that doesn't mean
>>>the assembler syntax needs to be implementation defined as
>>>well.

>>But what do you gain by standardization?

> If you see nothing to be gained by not having to rewrite
> inline assembler even though the CPU is the same, I can't
> explain it any further.

Two points:

-- If the CPU is the same, you shouldn't have to rewrite system
independent assembler today. It's a quality of
implementation issue, of course, but then, to a certain
degree, so is standard conformance; I know of more than one
compiler which doesn't implement export at all, for example,
despite what the standard says.

-- You can't standardize enough to make it useful. You can't
standardize the names of the machine instructions, for
example -- on the 8080, there were two widespread sets, and
I seem to recall seeing some 8086 assemblers which added a b
post-fix for byte accesses. So you're still stuck that one
compiler might require movb al,xxx and another mov al,xxx
for the same instruction.

If you could guarantee that I would never have to modify inline
assembler when moving between two machines of the same
architecture, you might have a small point, although even then,
are sparc pre version 9 and sparc version 9 and above the same
architecture? (I use the same compiler for both, but different
machine instructions in the assembler for atomic fetch and add.)
What about Intel 8086 and 80286?

Of course, to be really useful, you'd have to guarantee
portability within a large range of processors: to repeat, the
only "portability" which would be useful to me today is between
Intel IA-32 under Linux and Sparc v9 under Solaris. But I think
you'll agree that that isn't feasible.

>>Which, of course, brings up another question: what's the point
>>of standardizing anything, if the implementors are just going
>>to ignore the standard?

> The C++ "standard" for inline assembler is so faint as to not
> enhance source portability by one iota by adhering to it.
> There's simply nothing to be gained by being compliant here,
> as you have to rewrite 100% of the inline assembler anyway to
> move from windows to linux.

> What if all the C++ standard said was "C++ source code shall
> be in ASCII." Would you fault any C++ compiler vendor for not
> bothering with that standard? I wouldn't.

With regards to inline assembler, at present, not really. With
regards to other things, like export, very much. My point is
that even when something is well defined, like export, compiler
vendors ignore it if they feel like it. Given the little
importance of inline assembler in most uses of the compiler, I
doubt that adhering to any standardization concerning it would
be high on their level of priorities. Which means that it
really doesn't matter whether it is standardized or not.

This issue is, of course, independent of any other advantages
one might glean from having it standardized. And it rather
pisses me off that it should be an issue. But the handling of
export leaves me with no other conclusion possible. For the
moment, the C++ standard is useful for generating arguments in
forums like this, but not much else.

>>(Obviously, since you are arguing for a more rigorous
>>standard, your C++ compiler is fully compliant with the
>>current standard. Which is good news, because it means that
>>there are now at least two compilers which implement export.)

> Sorry to say it, but DMC++ doesn't support exported templates.

From a purely selfish, personal point of view, that doesn't
matter to me, because (as far as I know) it doesn't support
sparc processors under Solaris either, so I couldn't use it even
if it did:-).

From an abstract point of view, I'd say that you have no right
asking for anything more to be standardized if you ignore the
standard which is already there. If I had my way, implementors
who didn't implement at least everything in C++98 would not have
a legal right to call their product C++.

But I rest my case. Even if the standards committee were to
standardize inline assembler for IA-32, what would that buy you
if Microsoft, G++ and your own compiler continued to only
support what they currently do, and ignored the standard?

--
James Kanze mailto: james...@free.fr


Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

9 pl. Pierre Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

Walter Bright

unread,
Jan 2, 2006, 5:18:13 AM1/2/06
to

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41nk5cF...@individual.net...

> As for superior performance of hash tables - I suppose that sometimes
> you can get stellar results by simplified hash functions just to be
> bitten a while later by colliding situation. May I ask whether you are
> doing some hash-code magic?

I don't know what you mean by magic. But you're free to look at the source
code, it comes with the dmd package at http://ftp.digitalmars.com/dmd.zip.
The AA source file is /dmd/src/phobos/internal/aaA.d.


>> Nevertheless, copying a pointer and a length is pretty much guaranteed to
>> be
>> faster than doing malloc/copy/free,
> Mostly depends on what you are doing. Such an implementation of string is
> ideal if a lot of slicing is done. However, it is less ideal for
> "building" operations (I guess you have some dedicated class for this
> purpose).

If by "building" you mean putting together strings from substrings, I don't
agree. The pointer/length combo is faster than pointer/strlen for
concatenating substrings.


> Also, some operations are really a bit faster with zero terminating
> character.

For those cases, pointer/length doesn't prevent you from appending a 0.
Overall, pointer/length is a big win.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

unread,
Jan 2, 2006, 5:16:41 AM1/2/06
to

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41nkn3F...@individual.net...

> Well, maybe it is time to ask a question about conservative GC that has
> been on my mind for some time now:
>
> Imagine raster image processing software, images are stored using GC,
> memory space is 32-bit (4GB) and 1GB of it is used for storing images.
>
> Now you will load white noise into that 1GB of images. What happens to
> conservative GC? If I understand things well, white noise will keep all
> frames as marked and your app will run out of memory. Am I right?

No. It doesn't much matter what the data in the heap is, it matters what the
data in the 'roots' are. The roots are the initial pointers into the data,
and they'll be on the stack, in the registers, and in static data. In a well
written program, there isn't much static data containing roots, and it's
been my experience that the stack rarely contains any data that isn't a heap
pointer but looks like one.

And, of course, you aren't *forced* to use GC in D. You can very easily
explicitly manage your white noise buffers.

Also, it's always possible to concoct a program that will wedge a GC, just
like it's possible to concoct one that will wedge C++'s operator new. You
shouldn't use either if your program must *guarantee* it cannot wedge - all
data must be statically allocated.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers

Peter Dimov

unread,
Jan 2, 2006, 5:19:23 AM1/2/06
to
Walter Bright wrote:
> "Mirek Fidler" <c...@volny.cz> wrote in message
> news:41j9d5F...@individual.net...
> > Moreover, that example is unintentionally cheating, as probably NO
> > garbage collecting is performed
>
> It doesn't even do any allocation, so no need for collecting. Is that
> cheating? I don't believe so, because a characteristic of gc is that *far
> fewer allocations are necessary*. This is often the reason why gc apps can
> outperform non-gc apps, so why shouldn't a benchmark illustrate that?

Collected languages usually need more allocations, not less, because
they don't support stack variables. In the string case, fewer
allocations are needed because of the immutability of the character
data that allows representation sharing, not because of GC. GC by
itself doesn't eliminate copies. See Item 24 in Effective Java, for
example:

http://java.sun.com/docs/books/effective/chapters.html

(it's in chapter 6.)

Of course proper tracing GC is usually more efficient than reference
counting, but not because of the allocations. :-)

Mirek Fidler

unread,
Jan 2, 2006, 5:22:49 AM1/2/06
to
Walter Bright wrote:
> "Mirek Fidler" <c...@volny.cz> wrote in message
> news:41j9d5F...@individual.net...
>
>>Moreover, that example is unintentionally cheating, as probably NO
>>garbage collecting is performed
>
>
> It doesn't even do any allocation, so no need for collecting. Is that
> cheating? I don't believe so, because a characteristic of gc is that *far
> fewer allocations are necessary*. This is often the reason why gc apps can
> outperform non-gc apps, so why shouldn't a benchmark illustrate that?

Fair enough. But I am not sure about the no-allocations claim - I believe
there has to be some allocation in the hash-map implementation.

In order to make it a little bit more fair, I would run it in a loop
(like 10 times). That way, GC collecting would be accounted for...

Mirek

Mirek Fidler

unread,
Jan 2, 2006, 10:58:09 AM1/2/06
to
Walter Bright wrote:
> "Mirek Fidler" <c...@volny.cz> wrote in message
> news:41nkn3F...@individual.net...
>
>>Well, maybe it is time to ask a question about conservative GC that has
>>been on my mind for some time now:
>>
>>Imagine raster image processing software, images are stored using GC,
>>memory space is 32-bit (4GB) and 1GB of it is used for storing images.
>>
>>Now you will load white noise into that 1GB of images. What happens to
>>conservative GC? If I understand things well, white noise will keep all
>>frames as marked and your app will run out of memory. Am I right?
>
>
> No. It doesn't much matter what the data in the heap is, it matters what the
> data in the 'roots' are. The roots are the initial pointers into the data,
> and they'll be on the stack, in the registers, and in static data. In a well
> written program, there isn't much static data containing roots, and it's
> been my experience that the stack rarely contains any data that isn't a heap
> pointer but looks like one.

OK, but if I understand things well, the mark process goes from the
roots and then checks the heap storage pointed to by them for additional
pointers.

Now it will be enough to have 512MB of storage pointed to by roots (as
that image data will still be active) and another 512MB inactive. The
chances that the active 512MB of white noise will mark another 512MB of
already unreachable blocks are pretty high, aren't they?

> And, of course, you aren't *forced* to use GC in D. You can very easily
> explicitly manage your white noise buffers.

Can I? How?

The only possibility I see is to use some form of pointer encryption. Or
not to use any of D's language facilities that are based on GC (like
those nicely slicing strings).

> Also, it's always possible to concoct a program that will wedge a GC, just
> like it's possible to concoct one that will wedge C++'s operator new.

I believe that my example is not artificial.

Mirek

Mirek Fidler

unread,
Jan 2, 2006, 10:55:41 AM1/2/06
to
>>bitten a while later by collisions. May I ask whether you are
>>doing some hash-code magic?
>
>
> I don't know what you mean by magic.

A simplified hash function that uses just a sample of the characters
rather than all of them.


>>Mostly depends on what are you doing. Such implementation of string is
>>ideal if a lot of slicing is done. However, it is less ideal for
>>"building" operations (I guess you have some dedicated class for this
>>purpose).
>
>
> If by "building" you mean putting together strings from substrings, I don't
> agree. The pointer/length combo is faster than pointer/strlen for
> concatenating substrings.

Nobody is speaking about strlen, but rather about an allocation reserve
- like the one that makes std::vector's push_back run in amortized
constant time.

Sliced strings will often lead to copying all the stuff for any
concatenation.

>>Also, some operations are really a bit faster with zero terminating
>>character.
>
>
> For those cases, pointer/length doesn't prevent you from appending a 0.
> Overall, pointer/length is a big win.

Sure it is. But pointer/length/reserve is even better :)

Mirek

Dave Harris

unread,
Jan 3, 2006, 6:18:07 AM1/3/06
to
pdi...@gmail.com (Peter Dimov) wrote (abridged):

> Collected languages usually need more allocations, not less, because
> they don't support stack variables.

Some collected languages do support stack variables.


> In the string case, fewer allocations are needed because of
> the immutability of the character data that allows representation
> sharing, not because of GC.

GC makes representation sharing easier, because it obviates the need to
track responsibility for deleting. Immutability is important too, of
course, but it's orthogonal to GC.

-- Dave Harris, Nottingham, UK.

Dave Harris

unread,
Jan 3, 2006, 6:20:45 AM1/3/06
to
c...@volny.cz (Mirek Fidler) wrote (abridged):

> Imagine raster image processing software, images are stored using GC,
> memory space is 32-bit (4GB) and 1GB of it is used for storing images.
>
> Now you will load white noise into that 1GB of images. What happens to
> conservative GC? If I understand things well, white noise will keep all
> frames as marked and your app will run out of memory. Am I right?

You mean, the white noise in the image may contain a sequence of bytes
which looks like a pointer to another frame?

This can be a problem. It partly depends on the detailed design of the
language and GC. Sometimes they are only partially conservative; there may
be a way to mark the image memory blocks as not containing pointers.
Another factor is whether an object can be kept alive by a pointer to its
middle. If not, then the chance of random data appearing to be a pointer
to an object is much reduced. The memory architecture matters. Using 1GB
out of 4 is getting bad; switching to a 64-bit address space will reduce
the chances of accidental aliasing because a smaller fraction of the
address space is used.

White noise is a pathological case for GC. Most memory values are not
random. They are (smallish) integers, ASCII strings, IEEE floating point
or whatever. You can sometimes design the architecture so that these are
unlikely to alias heap pointers. Even if a pointer alias does happen, it
just means a memory block won't be freed as early as it could be.

GC isn't a panacea.

-- Dave Harris, Nottingham, UK.


Walter Bright

unread,
Jan 3, 2006, 6:26:16 AM1/3/06
to

"Peter Dimov" <pdi...@gmail.com> wrote in message
news:1136155552....@g44g2000cwa.googlegroups.com...

> Walter Bright wrote:
>> "Mirek Fidler" <c...@volny.cz> wrote in message
>> news:41j9d5F...@individual.net...
>> > Moreover, that example is unintentionally cheating, as probably NO
>> > garbage collecting is performed
>> It doesn't even do any allocation, so no need for collecting. Is that
>> cheating? I don't believe so, because a characteristic of gc is that *far
>> fewer allocations are necessary*. This is often the reason why gc apps
>> can
>> outperform non-gc apps, so why shouldn't a benchmark illustrate that?
> Collected languages usually need more allocations, not less, because
> they don't support stack variables.

That is not a consequence of having GC. D has stack variables, for example,
and it coexists nicely with GC.

> In the string case, fewer
> allocations are needed because of the immutability of the character
> data that allows representation sharing, not because of GC. GC by
> itself doesn't eliminate copies.

Nothing is stopping you from making copies; you just don't *need* to with a
GC. The immutability of the underlying data can just be a convention, and
you're right it is needed to support slicing.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


Walter Bright

unread,
Jan 3, 2006, 6:27:07 AM1/3/06
to

"Mirek Fidler" <c...@volny.cz> wrote in message
news:41q2k2F...@individual.net...

> Walter Bright wrote:
>> "Mirek Fidler" <c...@volny.cz> wrote in message
>> news:41j9d5F...@individual.net...
>>
>>>Moreover, that example is unintentionally cheating, as probably NO
>>>garbage collecting is performed
>>
>>
>> It doesn't even do any allocation, so no need for collecting. Is that
>> cheating? I don't believe so, because a characteristic of gc is that *far
>> fewer allocations are necessary*. This is often the reason why gc apps
>> can
>> outperform non-gc apps, so why shouldn't a benchmark illustrate that?
>
> Fair enough. But I am not sure about no allocations claim - I believe
> there have to be some allocation in the hash-map implementation.

You're right, there is some going on in the AA implementation. But the
strings are not copied.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


kanze

unread,
Jan 3, 2006, 6:35:40 AM
to
Peter Dimov wrote:
> Walter Bright wrote:
> > "Mirek Fidler" <c...@volny.cz> wrote in message
> > news:41j9d5F...@individual.net...
> > > Moreover, that example is unintentionally cheating, as
> > > probably NO garbage collecting is performed

> > It doesn't even do any allocation, so no need for
> > collecting. Is that cheating? I don't believe so, because a
> > characteristic of gc is that *far fewer allocations are
> > necessary*. This is often the reason why gc apps can
> > outperform non-gc apps, so why shouldn't a benchmark
> > illustrate that?

> Collected languages usually need more allocations, not less,
> because they don't support stack variables.

I don't think we're comparing C++ with any particular language
here (except maybe D -- but if I've understood correctly, D
supports full value semantics, just like C++); we're comparing
with GC to without GC. And while I've not particularly noticed
fewer allocations in my use of GC in C++, I haven't actually
looked for it either.

> In the string case, fewer allocations are needed because of
> the immutability of the character data that allows
> representation sharing, not because of GC.

> GC by itself doesn't eliminate copies. See Item 24 in
> Effective Java, for example:

> http://java.sun.com/docs/books/effective/chapters.html

> (it's in chapter 6.)

> Of course proper tracing GC is usually more efficient than
> reference counting, but not because of the allocations. :-)

That's my impression, in general. I can imagine specific cases
where you might make less allocations when counting on garbage
collection, and specific cases where you might make more, but on
the whole, if there is any real issue, it is that garbage
collection will occasionally give you an alternative of using a
dynamically allocated object, and passing a pointer, rather than
using deep copy of stack based objects. If the deep copy does a
lot of allocation itself, this can result in fewer allocations.
(If the deep copy does no allocations, of course, it results in
one more.)

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

kanze

unread,
Jan 3, 2006, 6:37:37 AM
to
Mirek Fidler wrote:
> Walter Bright wrote:
> > "Mirek Fidler" <c...@volny.cz> wrote in message
> > news:41nkn3F...@individual.net...

> >>Well, maybe is a time to ask a question about conservative
> >>GC that flows in my mind for some time now:

> >>Imagine raster image processing software, images are stored
> >>using GC, memory space is 32-bit (4GB) and 1GB of it is used
> >>for storing images.

> >>Now you will load white noise into that 1GB of images. What
> >>happens to conservative GC? If I understand things well,
> >>white noise will keep all frames as marked and your app will
> >>run out of memory. Am I right?

> > No. It doesn't much matter what the data in the heap is, it
> > matters what the data in the 'roots' are. The roots are the
> > initial pointers into the data, and they'll be on the stack,
> > in the registers, and in static data. In a well written
> > program, there isn't much static data containing roots, and
> > it's been my experience that the stack rarely contains any
> > data that isn't a heap pointer but looks like one.

> OK, but if I understand things well, mark process goes from
> roots and then checks heap storage pointed to by roots for
> additional pointers.

That's the way the Boehm collector works, at any rate. And the
"roots" include all stack based and statically allocated memory.

> Now it will be enough to have 512MB of storage pointed to by
> roots (as those image data will be still active) and another
> 512MB inactive. Chances that active 512MB of white noise will
> mark another 512MB of already unreachable blocks are pretty
> high, are not they?

It really depends on the rest of the application, but typically,
you won't lose too much. 512MB means, roughly speaking, 128MB
"random" addresses; the probability of a random address falling
in a block of 512MB is only 1/2^15; for smaller blocks, it is
significantly less. Of course, if the addresses really are
random, and there are 2^17 of them, if you have a second block
of the same size, there is a good chance that one of the values
in the first block will point into it.

In such cases, it's possible to tell the garbage collector that
the block contains no addresses, and so eliminate it from the
sweep. Irrespective of the risk of failing to free memory, this
might be useful if you are dealing with very large blocks, since
simply sweeping a half a GB can take a certain amount of time.

Note that even without garbage collection, you may want to take
special precautions when dealing with very large blocks of
memory. My usual procedure, when I have an array, is to declare
an std::vector, and grow it by means of push_back, but with
blocks of a GB, or even a half a GB, this is probably not a very
good idea.

> > Also, it's always possible to concoct a program that will
> > wedge a GC, just like it's possible to concoct one that will
> > wedge C++'s operator new.

> I believe that my example is not artificial.

Rather special, at any rate. I don't know many programs that
will need large, monolithic blocks of totally random machine
words. (Note that if the data is in float format, with a linear
distribution of values in a certain range, the apparent
addresses are far from random -- with IEEE floats, something
like 3/4 of them will have the same high order byte.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Peter Dimov

unread,
Jan 3, 2006, 9:56:03 AM
to
Dave Harris wrote:
> pdi...@gmail.com (Peter Dimov) wrote (abridged):
> > Collected languages usually need more allocations, not less, because
> > they don't support stack variables.
>
> Some collected languages do support stack variables.

Yes of course. These languages would typically need the same number of
allocations to accomplish the same task.

> > In the string case, fewer allocations are needed because of
> > the immutability of the character data that allows representation
> > sharing, not because of GC.
>
> GC makes representation sharing easier, because it obviates the need to
> track responsibility for deleting. Immutability is important too, of
> course, but it's orthogonal to GC.

Immutability is more important than having tracing GC (as opposed to
reference counting GC). An immutable string can be reference counted
and require the same number of allocations, although it would need one
extra word (start, offset, length instead of start+offset, length) and
some "sink load" memory barriers on reference drop.

Defensive copies are only caused by (potential) in-place modifications;
there's no need to do extra allocations if the data is immutable,
regardless of the collector type. Tracing GC will still be more
efficient because it does less work on copy. Copying/generational GC
may be even more efficient because it can allocate much faster. :-)
Strings can be a nice testbed for GC comparisons... unfortunately all
researchers seem focused on Java for some reason or other. :-)

Thorsten Ottosen

unread,
Jan 3, 2006, 9:54:38 AM
to
Andrei Alexandrescu (See Website For Email) wrote:
> In an older message in thread "A safer/better C++?" () I wrote:
>
> >> I have to add on a related note that D implements contracts "so
> >> wrong it hurts."
>
> And Steven E Harris asked:
>
> > Can you elaborate?
>
> I thought it's interesting to open a discussion on implementing
> contracts in D (http://www.digitalmars.com/d/dbc.html) as opposed to the
> currently-proposed contracts for C++
> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1773.html).

Please see

http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1866.html

for the newest version.

-Thorsten

Mirek Fidler

unread,
Jan 4, 2006, 8:25:47 AM
to
>>Now it will be enough to have 512MB of storage pointed to by
>>roots (as those image data will be still active) and another
>>512MB inactive. Chances that active 512MB of white noise will
>>mark another 512MB of already unreachable blocks are pretty
>>high, are not they?
>
>
> It really depends on the rest of the application, but typically,
> you won't loose too much. 512MB means, roughly speaking, 128MB
> "random" addresses; the probability of a random address falling
> in a block of 512MB is only 1/2^15;

I must be misunderstanding something here. I always thought that this
probability is 1/8 (there is 4096MB total, 512 is 1/8 of 4096).

Mirek

Thorsten Ottosen

unread,
Jan 4, 2006, 9:24:15 AM
to
Hi Andrei,

Andrei Alexandrescu (See Website For Email) wrote:

> I have one nit and one question about C++ contracts. The nit is about
> section 2.2: the description of the translation is misleading because
> __old variables are initialized upon entrance of the function but could
> be used much later. The translation sketch suggested that the compiler
> has to only replace __old definitions right upon use.


I see, the sentence

"The copy of its argument is taken after the precondition is evaluated."

should be augmented with

"and before the function body is evaluated."

> The question is, why was precondition weakening left out? Section 2.3.1
> says:
>
> ============================================
> "Section 3.5 of
> [http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2004/n1613.pdf]
> explains how little redefinition of preconditions is used. Even though
> subcontracting is theoretically sound, it ends up being fairly useless
> in practice [8]."
>
> Footnote 8: A weaker precondition can be taken advantage of if we know
> the particular type of the object. If weaker preconditions should be
> allowed, then there exists two alternatives: to allow reuse of an
> existing contract or to require a complete redefinition. The former
> favours expressiveness, the latter favours overview.
> ============================================
>
> The relevant quoted reference is:
>
> ============================================
> Has subcontracting actually any use in practice? According to one
> professional Eiffel programmer I spoke with, Berend de Boer, he has not
> used the possibility to loosen preconditions, but he has used the
> ability to provide stronger postconditions regularly. As an example, the
> Gobo Eiffel Project consists of approximately 140k lines of code
> (including comments) and contains 219 stronger postconditions and 109
> weaker preconditions [Bez03].
> ============================================
>
> One data point, be it from a professional, can't be significant.
> Besides, the numbers given for the Gobo Eiffel project do show that
> there is significant use of weaker preconditions, which contradicts de
> Boer's experience.


I don't think you can say that Gobo gives significant data for either.

I recall that some of the base class preconditions are simply empty (i.e.
true) or something similarly silly.

> My understanding of a weaker precondition means "or"ing the overriden
> and the overriding precondition, so that's not much trouble for the
> implementation.


I must admit that looking at Gobo scared me a bit away from weaker
preconditions. I didn't find much good use of it and I found a lot of
really bad code: the functions called in the precondtion were virtual
and it was a nightmare to actually see what the precondition was. You
had to look a multiple files to figure it out. Adding weaker
preconditions into the mix could only make it worse.

There are situations where you might use weaker preconditions, but they
are quite few.

OTOH, not allowing preconditions to be altered can bring additional benefits:

1. it's easier to locate the contract

2. it can be used to give a compile error when an intentionally static
function by accident overrides a virtual function.

-Thorsten

PeteK

unread,
Jan 4, 2006, 1:02:36 PM
to

Mirek Fidler wrote:
> >>Now it will be enough to have 512MB of storage pointed to by
> >>roots (as those image data will be still active) and another
> >>512MB inactive. Chances that active 512MB of white noise will
> >>mark another 512MB of already unreachable blocks are pretty
> >>high, are not they?
> >
> >
> > It really depends on the rest of the application, but typically,
> > you won't lose too much. 512MB means, roughly speaking, 128MB
> > "random" addresses; the probability of a random address falling
> > in a block of 512MB is only 1/2^15;
>
> I must be misunderstanding something here. I always thought that this
> probability is 1/8 (there is 4096MB total, 512 is 1/8 of 4096).
>
> Mirek
>

512MB of white noise means 2^27 potential virtual addresses (one per
4-byte word).
A 512MB target means a 1/8 chance of a hit, or a 7/8 chance of a miss,
per address.

The chance of all those addresses missing is (7/8)^(2^27), or
virtually zero.
Therefore 512MB of white noise is almost certain to generate an apparent
virtual address within another 512MB block.

PeteK

Andrei Alexandrescu (See Website For Email)

unread,
Jan 4, 2006, 7:45:54 PM
to
Thorsten Ottosen wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>
>
>>I have one nit and one question about C++ contracts. The nit is about
>>section 2.2: the description of the translation is misleading because
>>__old variables are initialized upon entrance of the function but could
>>be used much later. The translation sketch suggested that the compiler
>>has to only replace __old definitions right upon use.
>
>
>
> I see, the sentence
>
> "The copy of its argument is taken after the precondition is evaluated."
>
> should be augmented with
>
> "and before the function body is evaluated."

Yah, something like that.

>>One data point, be it from a professional, can't be significant.
>>Besides, the numbers given for the Gobo Eiffel project do show that
>>there is significant use of weaker preconditions, which contradicts de
>>Boer's experience.
>
>
>
> I don't think you can say that Gobo gives significant data for either.

Oh, what I really meant was: "Besides, the numbers given for the Gobo
Eiffel project do show that there is significant use of weaker
preconditions within that project (numerically, about half of the
stronger postconditions), which contradicts de Boer's experience."

> I recall that some of the base class preconditions are simply empty (i.e.
> true) or something similarly silly.
>
>
>>My understanding of a weaker precondition means "or"ing the overriden
>>and the overriding precondition, so that's not much trouble for the
>>implementation.
>
>
>
> I must admit that looking at Gobo scared me a bit away from weaker
> preconditions. I didn't find much good use of it and I found a lot of
> really bad code: the functions called in the precondition were virtual
> and it was a nightmare to actually see what the precondition was. You
> had to look at multiple files to figure it out. Adding weaker
> preconditions into the mix could only make it worse.
>
> There are situations where you might use weaker preconditions, but they
> are quite few.
>
> OTOH, not allowing preconditions to be altered can bring additional benefits:
>
> 1. it's easier to locate the contract
>
> 2. it can be used to give a compile error when an intentionally static
> function by accident overrides a virtual function.

I think collecting more data and experience from the Eiffel community
would be immensely useful, as would be any further formalization of why
weaker preconditions are hard to encapsulate. Maybe it's possible to
come up with something usefully restricted instead of giving up the
feature entirely?


Andrei

Seungbeom Kim

unread,
Jan 5, 2006, 7:42:35 AM
to
Walter Bright wrote:
> "Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
> news:Ius9VcMs...@robinton.demon.co.uk...
>>Please note that it is completely
>>impossible for WG21 to specify the syntax for asm if C++ is to be
>>processor independent.
>
> I don't agree. Have an appendix for each processor family outlining the
> syntax for it. If a new CPU architecture comes out between updates to the
> standard, or before a TR could be done for it, nothing is stopping an
> implementation from forging ahead anyway to implement it. After all, that
> happens now with proposed new features of C++.

I don't see yet how C++, being an instruction set architecture-
independent language, could try to cover all the (major) ISAs in its
standard, even in the form of appendices.

I see the inline assembly feature of C++ as an interface to another
world, and the interface specifies only what is needed for the
interface; an airport gate doesn't care what other parts of airplanes
than the gate look like, and the USB specification doesn't need to say a
word about any file system that an external hard disk drive connected
through it may be using.

Besides that, considering the available resources of WG21 and the
current pace of maintaining the standard, I doubt whether such a
coverage is possible for the C++ standard. Even if the committee had
more resources, I would vote for other things of greater importance to
be improved first.

--
Seungbeom Kim

John Nagle

unread,
Jan 5, 2006, 7:45:15 AM
to
One big problem with this approach to "contract programming" is that
it doesn't provide a way to reestablish the invariant when calling
out of a class. This typically leads to trouble when the call out
eventually results in re-entry to a public method of the object.
This happens frequently in GUI systems utilizing callbacks, and
is a constant headache.

I mentioned this in the C++ standards group about two years
ago, and proposed a solution. More recently, it turns
out that the Microsoft Spec# group has implemented a solution for Spec#

http://research.microsoft.com/specsharp

If you're doing anything involving class invariants, read
the Spec# papers. They've done considerable work in this
area. They have to, because Microsoft code is callback-heavy.

John Nagle
Animats

Andrei Alexandrescu (See Website For Email) wrote:
> Thorsten Ottosen wrote:
>
>>Andrei Alexandrescu (See Website For Email) wrote:
>>
>>
>>
>>>I have one nit and one question about C++ contracts.


kanze

unread,
Jan 5, 2006, 7:52:45 AM
to
Mirek Fidler wrote:

> >>Now it will be enough to have 512MB of storage pointed to by
> >>roots (as those image data will be still active) and another
> >>512MB inactive. Chances that active 512MB of white noise
> >>will mark another 512MB of already unreachable blocks are
> >>pretty high, are not they?

> > It really depends on the rest of the application, but
> > typically, you won't lose too much. 512MB means, roughly
> > speaking, 128MB "random" addresses; the probability of a
> > random address falling in a block of 512MB is only 1/2^15;

> I must be misunderstanding something here. I always thought
> that this probability is 1/8 (there is 4096MB total, 512 is
> 1/8 of 4096).

I think I got something confused myself; it's pretty obvious
that there's a 1/8 chance of a random address falling into a
partition which represents 1/8 of the address space.

Of course, if you're dealing with single blocks which represent
1/8 of the address space, you need special handling for memory
management anyway; just growing an std::vector with successive
calls to push_back isn't going to cut it. But that's a
different argument than the one I presented as to why GC isn't a
problem in this case. In fact, the special handling may be
simpler in the case of garbage collection (but it will certainly
be different).

In general, garbage collection does require more memory than
manual allocation. It's probably not applicable to programs
which already use, say, over about a quarter of the total
available address range.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Sergey P. Derevyago

unread,
Jan 5, 2006, 2:02:27 PM
to
kanze wrote:
> Rather special, at any rate. I don't know many programs that
> will need large, monolithic blocks of totally random machine
> words.
>
IMHO crypto software is a good example of this kind of program. It is
widespread and very important.

In particular, my DersCrypt crypto algorithm (freeware, BTW) uses the
following function:

sh_ptr<vector<uchar> > blockEncrypt(sh_ptr<Key> key,
const vector<uchar>& plain,
const vector<uchar>& prevEncr);

So it generates a lot of unused memory blocks with virtually random data. I
believe conservative GC isn't a good choice for long-running crypto servers.
--
With all respect, Sergey. http://ders.angen.net/
mailto : ders at skeptik.net

Mirek Fidler

unread,
Jan 5, 2006, 2:03:52 PM
to

Is it? I guess, given the finalizer/destructor dilemma, it is quite
fragile stuff...

Then of course, another question is what happens with a lot of small
blocks. Code allocating 512-1024MB of memory is not unusual today
(happens to me quite often). It is hard to even predict problems that
can happen then...

Mirek

Walter Bright

unread,
Jan 6, 2006, 4:57:00 AM
to

"Seungbeom Kim" <musi...@bawi.org> wrote in message
news:dpi2r5$rnk$1...@news.Stanford.EDU...

> Walter Bright wrote:
>> "Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
>> news:Ius9VcMs...@robinton.demon.co.uk...
>>>Please note that it is completely
>>>impossible for WG21 to specify the syntax for asm if C++ is to be
>>>processor independent.
>> I don't agree. Have an appendix for each processor family outlining the
>> syntax for it. If a new CPU architecture comes out between updates to the
>> standard, or before a TR could be done for it, nothing is stopping an
>> implementation from forging ahead anyway to implement it. After all, that
>> happens now with proposed new features of C++.
> I don't see yet how C++, being an instruction set architecture-
> independent language, could try to cover all the (major) ISAs in its
> standard, even in the form of appendices.

How many major ones are there? And it's not like they're very complicated;
Intel describes its assembler syntax in 3 pages. There really isn't that
much to an inline assembler. It's just an instruction mnemonic followed by
operand expressions.

> I see the inline assembly feature of C++ as a interface to another
> world, and the interface specifies only what is needed for the
> interface; an airport gate doesn't care what other parts of airplanes
> than the gate look like, and the USB specification doesn't need to say a
> word about any file system that an external hard disk drive connected
> through it may be using.

Hence the mess we have now with pointlessly and wildly incompatible C++
assembler syntaxes.

> Besides that, considering the available resources of WG21 and the
> current pace of the maintaining the standard, I doubt whether such a
> coverage is possible for the C++ standard. Even if the committee had
> more resources, I would vote for other things of greater importance to
> be improved first.

Hence the motivation for the D programming language <g>.

Thorsten Ottosen

unread,
Jan 6, 2006, 5:17:20 AM
to
John Nagle wrote:
> One big problem with this approach to "contract programming" is that
> it doesn't provide a way to reestablish the invariant when calling
> out of a class. This typically leads to trouble when the call out
> eventually results in re-entry to a public method of the object.
> This happens frequently in GUI systems utilizing callbacks, and
> is a constant headache.
>
>
> I mentioned this in the C++ standards group about two years
> ago, and proposed a solution. More recently, it turns
> out that the Microsoft Spec# group has implemented a solution for Spec#
>
> http://research.microsoft.com/specsharp
>
> If you're doing anything involving class invariants, read
> the Spec# papers. They've done considerable work in this
> area. They have to, because Microsoft code is callback-heavy.

Could you point to your post about this matter?

From a quick reread of some of the spec# papers, I can't figure out
what the invariant-callback problem really is. If the invariant is
always checked as part of the precondition, then what is the problem
with re-entrancy?

-Thorsten

Thorsten Ottosen

unread,
Jan 6, 2006, 5:14:26 AM
to
Andrei Alexandrescu (See Website For Email) wrote:
> Thorsten Ottosen wrote:

>>I must admit that looking at Gobo scared me a bit away from weaker
>>preconditions. I didn't find much good use of it and I found a lot of
>>really bad code: the functions called in the precondition were virtual
>>and it was a nightmare to actually see what the precondition was. You
>>had to look at multiple files to figure it out. Adding weaker
>>preconditions into the mix could only make it worse.
>>
>>There are situations where you might use weaker preconditions, but they
>>are quite few.
>>
>>OTOH, not allowing preconditions to be altered can bring additional benefits:
>>
>>1. it's easier to locate the contract
>>
>>2. it can be used to give a compile error when an intentionally static
>>function by accident overrides a virtual function.

Actually, (2) would also work if we have special syntax for the weaker
precondition, like in Eiffel (require else).

> I think collecting more data and experience from the Eiffel community
> would be immensely useful,

Well, I tried on an Eiffel newsgroup, but I didn't get much feedback
except from my private mails with de Boer.

> as would be any further formalization of why
> weaker preconditions are hard to encapsulate.

what kind of formalization do you have in mind?

> Maybe it's possible to
> come up with something usefully restricted instead of giving up the
> feature entirely?

It could be. But if we add the feature, we can't take it away. If we
don't add it, we might do so later. So we took a conservative
approach on this matter.

-Thorsten

Thorsten Ottosen

unread,
Jan 6, 2006, 5:17:47 AM
to
Andrei Alexandrescu (See Website For Email) wrote:
> Thorsten Ottosen wrote:

> I think collecting more data and experience from the Eiffel community
> would be immensely useful, as would be any further formalization of why
> weaker preconditions are hard to encapsulate. Maybe it's possible to
> come up with something usefully restricted instead of giving up the
> feature entirely?

Incidently, while reading

http://research.microsoft.com/specsharp/papers/krml136.pdf

I saw

"Spec# does not allow any changes in the precondition, because
callers expect the specification at the static resolution of the method
to agree with the
dynamic checking."

Which I think can be understood much like my view that the information
of the precondition ought to be localized.

-Thorsten

Walter Bright

unread,
Jan 6, 2006, 5:18:12 AM
to

"James Kanze" <ka...@none.news.free.fr> wrote in message
news:43b806b5$0$7928$626a...@news.free.fr...
>>>What do you mean by "not gc friendly"?
>> 1) You've never seen code that stores bit flags in pointers?
> Where is the problem with regards to garbage collection?

If you store in the low bit, it makes the pointer odd. This can cause the gc
to not regard it as a pointer, as it may contain assumptions that pointers
to object starts are dword aligned. It can also cause the pointer to point
beyond the end of the object, into the next object.

If you store it in the high bit, well, I think you can guess what goes wrong
there.

> -- If the CPU is the same, you shouldn't have to rewrite system
> independant assembler today.

Well, you do, for a multitude of reasons. Let's see an unsigned long
multiply in C++, where the size of an unsigned long is 2 * the size of a CPU
register.

> -- You can't standardize enough to make it useful.

Yes, you can. The Digital Mars D compiler does. It works, and it certainly
is useful.

> You can't
> standardize the names of the machine instructions, for
> example

Absolutely, you can. All you have to do is write it down.

> -- on the 8080, there were two widespread sets, and
> I seem to recall seeing some 8086 assemblers which added a b
> post-fix for byte accesses. So you're still stuck that one
> compiler might require movb al,xxx and another mov al,xxx
> for the same instruction.

Isn't that the point of standardizing? So that there aren't multiple
spellings? Your argument seems to be that it can't be standardized because
it isn't standardized. If that were the prevailing attitude behind
standardization efforts, C wouldn't have been standardized (see differences
in preprocessor
behavior, and value-preserving vs sign-preserving integral promotions), and
neither would C++.

> If you could guarantee that I would never have to modify inline
> assembler when moving between two machines of the same
> architecture, you might have a small point, although even then,
> are sparc pre version 9 and sparc version 9 and above the same
> architecture? (I use the same compiler for both, but different
> machine instructions in the assembler for atomic fetch and add.)
> What about Intel 8086 and 80286?

I've already explained that, probably 6 or 7 times now.

> But the handling of
> export leaves me with no other conclusion possible.

The C++ committee is responsible for the export issue - by voting in an
essentially unimplementable feature for which no implementation experience
existed, and for which the utility was unknown. Yes, I know EDG implemented
it.

An inline assembler, on the other hand, is straightforward to implement, has
plenty of implementation experience, has proven utility, and so is not at
all comparable to export.

> From an abstract point of view, I'd say that you have no right
> asking for anything more to be standardized if you ignore the
> standard which is already there.

You'd have to kick half the members off of the C++ standards committee, then
<g>. Do you really want a language standards committee with only one
compiler vendor on it?

> If I had my way, implementors
> who didn't implement at least everything in C++98 would not have
> a legal right to call their product C++.

AT&T's lawyers graciously gave me permission to call my product C++ back in
1987 <g> (yes, I asked them).

But I don't advertise it as "fully conformant C++ 98". I don't see how
anyone could, as there is no official C++98 conformance test.


> But I rest my case. Even if the standards committee were to
> standardize inline assembler for IA-32, what would that buy you
> if Microsoft, G++ and your own compiler continued to only
> support what they currently do, and ignored the standard?

If BrandX implemented the standard version and the others didn't, BrandX
would make it a big point in the positioning of the compiler. That puts
BrandX at a competitive advantage. And, of course, this effect is recognized
by all the compiler vendors, as they all move, over time, to fuller and
fuller compliance.

The problem with export is this competitive marketing advantage plus the
utility of export is overwhelmed by the cost of implementation.

Walter Bright
www.digitalmars.com C, C++, D programming language compilers


kanze

unread,
Jan 6, 2006, 7:46:17 AM
to

Independently of the finalizers or destructors. You just can't
go around allocating 1/8 of the existing address space without
special handling.

> Then of course, another question is what happend for a lot of
> small blocks.

Once one of them frees up, all of the false hits in it
disappear. It's just not a problem in practice (although one
could presumably construct degenerate cases where it would be).

> Code allocating 512-1024MB of memory is not unusual today
> (happens to me quite often).

But that could only be because you are working on a 64 bit
machine. In which case, they only contain half as many
addresses, and the probability of hitting another mapped address
is extremely small.

> It is hard to even predict problems that can happen then...

Not really. I can predict very easily that if I allocate 1024MB
blocks of memory a lot (say more than four times) on a 32 bit
machine, I'm going to see a bad_alloc exception:-).

Realistically, any time you're using more than a small
percentage of the memory available, you have to take special
precautions.
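
James's prediction is cheap to check without actually touching that much memory. In this sketch (function name and sizes are illustrative, not from the thread), a request close to the size of the address space fails, and operator new[] reports it as std::bad_alloc; in C++11 an oversized array length may instead raise std::bad_array_new_length, which derives from std::bad_alloc, so one catch covers both:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Returns true if an allocation of n bytes fails with std::bad_alloc.
// The volatile pointer keeps the compiler from eliding the new/delete pair.
bool allocation_fails(std::size_t n) {
    try {
        char* volatile p = new char[n];
        delete[] p;
        return false;
    } catch (const std::bad_alloc&) {
        return true;  // also catches std::bad_array_new_length (C++11)
    }
}
```

Asking for half the addressable range fails on any realistic machine, while a small request succeeds.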

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Francis Glassborow

Jan 6, 2006, 9:31:15 AM
In article <A6ednTFqVuw...@comcast.com>, Walter Bright
<wal...@nospamm-digitalmars.com> writes

>An inline assembler, on the other hand, is straightforward to implement, has
>plenty of implementation experience, has proven utility, and so is not at
>all comparable to export.

Perhaps you should ask ECMA to standardise all the assemblers for
currently used processors. It would seem much better use of their time
and the results could be fast tracked to ISO Standards which the next
C++ standard could reference (basically requiring inline assembler to
conform to the relevant ISO standard where one exists).

For those whose understanding of English is poor, please sprinkle the
above with a liberal assortment of smileys.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

kanze

Jan 6, 2006, 9:35:22 AM
Walter Bright wrote:
> "James Kanze" <ka...@none.news.free.fr> wrote in message
> news:43b806b5$0$7928$626a...@news.free.fr...
> >>>What do you mean by "not gc friendly"?
> >> 1) You've never seen code that stores bit flags in pointers?
> > Where is the problem with regards to garbage collection?

> If you store in the low bit, it makes the pointer odd. This
> can cause the gc to not regard it as a pointer, as it may
> contain assumptions that pointers to object starts are dword
> aligned. It can also cause the pointer to point beyond the
> end of the object, into the next object.

The Boehm collector considers any aligned word as a pointer,
regardless of contents. It has no problems with tracing
internal pointers, either.

I suppose that if the pointer points to the very last byte of a
block, making it odd could cause problems. Except that because
blocks are allocated with enough granularity to ensure correct
alignment for all types, the length of a block is never odd, so
the pointer to the very last byte would be odd anyway.

> If you store it in the high bit, well, I think you can guess
> what goes wrong there.

Yes, but on modern machines, all of the bits in a word are used;
the only free bits are a few low order bits, IF you aren't
addressing individual bytes.
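
The low-bit tagging idiom being discussed can be sketched as follows. This is a hypothetical scheme (the names tag/untag/flags_of are illustrative), assuming the object is at least 8-byte aligned so the low 3 bits of its address are zero and free for flags:

```cpp
#include <cassert>
#include <cstdint>

// Force 8-byte alignment so the low 3 bits of any Node* are guaranteed zero.
struct alignas(8) Node { int value; };

// Setting a flag in the low bit makes the stored word odd -- a conservative
// collector that only recognizes suitably aligned words as pointers may no
// longer see this value as a reference to the Node.
inline Node* tag(Node* p, unsigned flags) {
    return reinterpret_cast<Node*>(
        reinterpret_cast<std::uintptr_t>(p) | (flags & 0x7u));
}

inline Node* untag(Node* p) {
    return reinterpret_cast<Node*>(
        reinterpret_cast<std::uintptr_t>(p) & ~std::uintptr_t(0x7));
}

inline unsigned flags_of(Node* p) {
    return static_cast<unsigned>(
        reinterpret_cast<std::uintptr_t>(p) & 0x7u);
}
```

The program still works (untag recovers the object), but the only word in memory that refers to the Node is now odd, which is exactly the case Walter describes.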

> > -- If the CPU is the same, you shouldn't have to rewrite system
> > independant assembler today.

> Well, you do, for a multitude of reasons. Let's see an
> unsigned long multiply in C++, where the size of an unsigned
> long is 2 * the size of a CPU register.

> > -- You can't standardize enough to make it useful.

> Yes, you can. The Digital Mars D compiler does. It works,
> and it certainly is useful.

> > You can't
> > standardize the names of the machine instructions, for
> > example

> Absolutely, you can. All you have to do is write it down.

You can standardize for a given platform, but that's normally
already been done anyway. Not by the language, but by the
architecture specifications of the platform: Intel has
standardized the names of the IA-32 instructions, the Sparc
group those for a Sparc, etc.

On a single platform, the names are more or less standardized.
In the case of two competing vendors, and no separate
standardization group (like the Sparc group), you may end up
with competing standards. That was the case with Intel and
Zilog for the 8080/Z80, but today, most architectures seem to be
defined by (nominally) independent groups. (I'm not sure about
IA-32, but I think that AMD follows the Intel standard.)

Of course, if some compilers choose not to follow the standard,
then standardization doesn't help. Regardless of who does it.

> > -- on the 8080, there were two widespread sets, and
> > I seem to recall seeing some 8086 assemblers which added a b
> > post-fix for byte accesses. So you're still stuck that one
> > compiler might require movb al,xxx and another mov al,xxx
> > for the same instruction.

> Isn't that the point of standardizing? So that there aren't
> multiple spellings? Your argument seems to be that it can't
> be standardized because it isn't standardized. If that was
> the prevailing attitude behind standardization efforts, C wouldn't have
> been standardized (see differences in preprocessor behavior,
> and value-preserving vs sign-preserving integral promotions),
> and neither would C++.

I guess I didn't make myself clear: I've seen some 8086
assemblers that didn't follow the standard established by Intel.
The Intel architecture, which determines the assembler, is
proprietary, so it is more or less normal that the standard is
established by Intel, and not by ISO.

> > If you could guarantee that I would never have to modify
> > inline assembler when moving between two machines of the
> > same architecture, you might have a small point, although
> > even then, are sparc pre version 9 and sparc version 9 and
> > above the same architecture? (I use the same compiler for
> > both, but different machine instructions in the assembler
> > for atomic fetch and add.) What about Intel 8086 and 80286?

> I've already explained that, probably 6 or 7 times now.

You've explained that you can't provide useful standardization,
yes:-).

> > But the handling of export leaves me with no other
> > conclusion possible.

> The C++ committee is responsible for the export issue - by
> voting in an essentially unimplementable feature for which no
> implementation experience existed, and for which the utility
> was unknown. Yes, I know EDG implemented it.

It's true that the lack of concrete experience never helps; it's
probable that with practical experience, the feature would look
somewhat different. On the other hand, it was recognized that
something along the lines was needed. And as you say, it has
been implemented, so it is hardly impossible to implement.

> An inline assembler, on the other hand, is straightforward to
> implement, has plenty of implementation experience, has proven
> utility, and so is not at all comparable to export.

The basic problem is the same: if implementors ignore the
standard, then there's no point standardizing. The current
situation is that there is very little standardization: the
syntax and the keyword, and the instruction set for each
individual processor. From the examples you've posted, I gather
that Microsoft ignores the first, and g++ the second.

So it's not totally comparable to export. When export was
adopted, there was no reason to suppose that compiler
implementors would ignore it (especially as they were voting for
it). Where as with inline assembler, it seems pretty clear that
any standard would be ignored, since even what little we have
today is being ignored.

> > From an abstract point of view, I'd say that you have no
> > right asking for anything more to be standardized if you
> > ignore the standard which is already there.

> You'd have to kick half the members off of the C++ standards
> committee, then <g>. Do you really want a language standards
> committee with only one compiler vendor on it?

Several vendors (at least two), just one implementor.

Actually, I'd hope that such a menace would stimulate more
implementors to implement it.

> > If I had my way, implementors who didn't implement at least
> > everything in C++98 would not have a legal right to call
> > their product C++.

> AT&T's lawyers graciously gave me permission to call my
> product C++ back in 1987 <g> (yes, I asked them).

The question is whether C++ is a trademark, or a generic name
for a specific programming language. And if the second, what
the definition for that name is. (In 1987, I think one could
legitimately ask the first question. Today, certainly not.)

> But I don't advertise it as "fully conformant C++ 98". I don't
> see how anyone could, as there is no official C++98
> conformance test.

Sun does. (Or did -- much to the displeasure of the people
actually involved with the compiler. I presume that Digital Mars
is small enough that you don't have separate marketing droids
writing things you don't approve of.)

> > But I rest my case. Even if the standards committee were to
> > standardize inline assembler for IA-32, what would that buy
> > you if Microsoft, G++ and your own compiler continued to
> > only support what they currently do, and ignored the
> > standard?

> If BrandX implemented the standard version and the others
> didn't, BrandX would make it a big point in the positioning of
> the compiler. That puts BrandX at a competitive advantage.
> And, of course, this effect is recognized by all the compiler
> vendors, as they all move, over time, to fuller and fuller
> compliance.

Apparently, none of the vendors of Intel compilers think it
important enough to even conform to what little exists.

> The problem with export is this competitive marketing
> advantage plus the utility of export is overwhelmed by the
> cost of implementation.

The problem with export is that there are a lot of non-technical
issues which condition acceptance. For example, if you wanted
me to use the Digital Mars compiler, there are at least two
things more important than export: The first is obvious, you'd
have to add support for Sparc under Solaris, since that remains
my most important target platform, even today. The second is
that you'd have to convince management that you have a big
enough organization behind it to support it, and that there's no
risk of your going out of business (but I'm sure you've run into
that problem before). Only once those two conditions are met
can we even begin to discuss truly technical issues. (Note that
g++ sidesteps the second one, because I can download it and use
it without asking anything from management. Which has been
important in the past -- today, the organization behind g++ is
generally recognized as being "serious".)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Andrei Alexandrescu (See Website For Email)

Jan 6, 2006, 8:42:42 PM
Thorsten Ottosen wrote:
>>I think collecting more data and experience from the Eiffel community
>>would be immensely useful,
>
>
> Well, I tried on an Eiffel newsgroup, but I didn't get much feedback except
> from my private mails with De Boer.

Unless de Boer is an authority on Eiffel, private correspondence with
him becomes irrelevant for making you decide one way or another. The
project you mention is a relevant data point, and that data point
suggests that people do use weaker preconditions in larger projects,
even though they had you scared.

>>as would be any further formalization of why
>>weaker preconditions are hard to encapsulate.
>
>
> what kind of formalization do you have in mind?

I'm thinking along the lines of disabling virtual calls - if you can
show that preventing virtual calls helps the human or the analysis in
any objective way (e.g., look what the precondition is in one module).
But maybe virtuals in preconditions are, on the contrary, powerful.
Traditionally, the spirit of C++ was to enable power even when it wasn't
clear what it could be used for.
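
Since the proposed contract syntax is not real C++, the question can only be emulated; here is a hypothetical sketch (Container and can_pop are invented names) in which the "precondition" is an assert on a virtual function, so a derived class can weaken it:

```cpp
#include <cassert>

class Container {
public:
    virtual ~Container() = default;
    // The precondition itself is virtual: derived classes may weaken it.
    virtual bool can_pop() const { return size_ > 0; }

    void pop() {
        assert(can_pop());   // checked against the *dynamic* type
        if (size_ > 0) --size_;
    }
    void push() { ++size_; }
    int size() const { return size_; }

protected:
    int size_ = 0;
};

// Weaker precondition: pop() on an empty container is a harmless no-op
// request rather than a contract violation.
class ForgivingContainer : public Container {
public:
    bool can_pop() const override { return true; }
};
```

The analysis problem Andrei mentions is visible even here: what pop() requires can no longer be read off in one module, because the effective precondition depends on the dynamic type.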

>>Maybe it's possible to
>>come up with something usefully restricted instead of giving up the
>>feature entirely?
>
>
> It could be. But if we add the feature, we can't take it away. If we
> don't add it, we might do so later. So we took a conservative
> approach on this matter.

If adding the feature takes us to C++19, then it's not very useful for
some of us is it :o).


Andrei

Walter Bright

Jan 7, 2006, 7:25:53 AM

"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
news:UzuqysDp...@robinton.demon.co.uk...

> In article <A6ednTFqVuw...@comcast.com>, Walter Bright
> <wal...@nospamm-digitalmars.com> writes
>>An inline assembler, on the other hand, is straightforward to implement,
>>has
>>plenty of implementation experience, has proven utility, and so is not at
>>all comparable to export.
>
> Perhaps you should ask ECMA to standardise all the assemblers for
> currently used processors. It would seem much better use of their time
> and the results could be fast tracked to ISO Standards which the next
> C++ standard could reference (basically requiring inline assembler to
> conform to the relevant ISO standard where one exists.

That wouldn't be useful, because an inline assembler is different from a
standalone assembler and has different requirements.

> For those whose understanding of English is poor, please sprinkle the
> above with a liberal assortment of smileys.


John Nagle

Jan 7, 2006, 7:29:18 AM
Thorsten Ottosen wrote:
> John Nagle wrote:
>
>> One big problem with this approach to "contract programming" is that
>>it doesn't provide a way to reestablish the invariant when calling
>>out of a class.
>
>
> Could you point to your post about this matter?

Here's the link to Google Groups: (comp.lang.c++.moderated, June 21,
2001, "Public methods implemented in terms of other public methods of
the same class"):

http://groups.google.com/group/comp.lang.c++.moderated/browse_thread/thread/a189110198e55f81/2a5f2e5b679b7d32?q=nagle+public&rnum=1#2a5f2e5b679b7d32

> From a quick reread of some of the spec# papers, I can't figure out
> what the invariant-callback problem really is. If the invariant is
> always checked as part of the precondition, then what is the problem

> with (sic) re-entrency?

The whole idea of class invariants is that either control is outside the
object, and the object is in its stable state with the invariant true,
or control is inside the object, and the invariant may be false.
If you call back into an object's public method with the object
in a transient state, you've potentially broken the object's
internal consistency. This may be caught by run-time checking
of the invariant, or it might be missed. But in a formal sense,
it's an error, like a potential race condition.

This kind of thing happens in GUI systems all too often, because
long chains of callbacks and event message passing can result
in unexpected reentrancy. This usually manifests itself at
the user level as "program crashed when user performed some
complex, unusual sequence of actions".
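
A minimal sketch of this hazard, using a hypothetical Account class (names and the clamp-to-zero fix-up are invented for illustration): inside withdraw() the invariant balance >= 0 is temporarily false, and a callback fired at that moment observes the broken state -- re-entering a public method from the callback would operate on an inconsistent object:

```cpp
#include <cassert>
#include <functional>
#include <utility>

class Account {
public:
    explicit Account(int balance) : balance_(balance) {}

    void on_change(std::function<void(Account&)> cb) { callback_ = std::move(cb); }

    void withdraw(int amount) {
        assert(invariant_ok());          // invariant holds on entry
        balance_ -= amount;              // transient state: may go negative
        if (callback_) callback_(*this); // callback sees the transient state!
        if (balance_ < 0) balance_ = 0;  // reestablish the invariant
        assert(invariant_ok());          // holds again on exit
    }

    int balance() const { return balance_; }
    bool invariant_ok() const { return balance_ >= 0; }

private:
    int balance_;
    std::function<void(Account&)> callback_;
};
```

The callback is exactly the "call out of the class" Nagle describes: control has left the object, yet the invariant is not reestablished.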

John Nagle
Animats

Walter Bright

Jan 8, 2006, 5:55:30 AM

"kanze" <ka...@gabi-soft.fr> wrote in message
news:1136551885.2...@f14g2000cwb.googlegroups.com...

> You can standardize for a given platform, but that's normally
> already been done anyway.

That's just the problem. It isn't normally done.

Thorsten Ottosen

Jan 8, 2006, 3:35:08 PM
John Nagle wrote:
> Thorsten Ottosen wrote:


>>Could you point to your post about this matter?
>
>
> Here's the link to Google Groups:

Thanks.

>>From a quick reread of some of the spec# papers, I can't figure out
>>what the invariant-callback problem really is. If the invariant is
>>always checked as part of the precondition, then what is the problem
>>with (sic) re-entrency?
>
>
> The whole idea of class invariants is that either control is outside the
> object, and the object is in its stable state with the invariant true,
> or control is inside the object, and the invariant may be false.
> If you call back into an object's public method with the object
> in a transient state, you've potentially broken the object's
> internal consistency. This may be caught by run-time checking
> of the invariant, or it might be missed.

Right, but runtime checking seems to be trivial.

-Thorsten

Mirek Fidler

Jan 8, 2006, 3:32:47 PM
Walter Bright wrote:
> "James Kanze" <ka...@none.news.free.fr> wrote in message
> news:43b806b5$0$7928$626a...@news.free.fr...
>
>>>>What do you mean by "not gc friendly"?
>>>
>>>1) You've never seen code that stores bit flags in pointers?
>>
>>Where is the problem with regards to garbage collection?
>
>
> If you store in the low bit, it makes the pointer odd. This can cause the gc
> to not regard it as a pointer, as it may contain assumptions that pointers

I think it can't. Consider:

const char *x = new char[200];
const char *b = x + 1;

Now if only 'b' is in roots at the time of collection, you still have to
mark that block as "used".
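
Mirek's point is that a conservative collector must treat an interior pointer as pinning the whole block. A toy model of the marking rule (hypothetical and greatly simplified; real collectors use block maps, not linear scans):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Block {
    std::uintptr_t start;
    std::size_t    size;
    bool           marked;
};

// A block is live if any root points anywhere *inside* it,
// not only at its first byte.
void mark(std::vector<Block>& heap, const std::vector<std::uintptr_t>& roots) {
    for (std::uintptr_t r : roots)
        for (Block& b : heap)
            if (r >= b.start && r < b.start + b.size)
                b.marked = true;  // an interior pointer still pins the block
}
```

With only the interior address (the analogue of b = x + 1) in the root set, the 200-byte block is still marked used.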

Mirek

Thorsten Ottosen

Jan 8, 2006, 8:50:11 PM
Andrei Alexandrescu (See Website For Email) wrote:
> Thorsten Ottosen wrote:
>
>>>I think collecting more data and experience from the Eiffel community
>>>would be immensely useful,
>>
>>
>>Well, I tried on an Eiffel newsgroup, but I didn't get much feedback except
>>from my private mails with De Boer.
>
>
> Unless de Boer is an authority on Eiffel, private correspondence with
> him becomes irrelevant for making you decide one way or another.

I don't think he is, but he did use the language on a daily basis. I've
certainly met authorities that had a poor understanding of this subject.

> The
> project you mention is a relevant data point, and that data point
> suggests that people do use weaker preconditions in larger projects,
> even though they had you scared.

Right, but I didn't see many good examples (if any), and I would
categorize most of them as misuses.

It's been a while since I looked into the source code itself, but I
wasn't convinced back then, and I doubt I would be if I went there again.

>>>as would be any further formalization of why
>>>weaker preconditions are hard to encapsulate.
>>
>>
>>what kind of formalization do you have in mind?
>
>
> I'm thinking along the lines of disabling virtual calls - if you can
> show that preventing virtual calls helps the human or the analysis in
> any objective way (e.g., look what the precondition is in one module).
> But maybe virtuals in preconditions are, on the contrary, powerful.
> Traditionally, the spirit of C++ was to enable power even when it wasn't
> clear what it could be used for.

I don't have a good answer to that issue other than locality seems to be
good here. OTOH, I'm not advocating to disallow virtual calls inside
preconditions.

>>>Maybe it's possible to
>>>come up with something usefully restricted instead of giving up the
>>>feature entirely?
>>
>>
>>It could be. But if we add the feature, we can't take it away. If we
>>don't add it, we might do so later. So we took a conservative
>>approach on this matter.
>
>
> If adding the feature takes us to C++19, then it's not very useful for
> some of us is it :o).

That's why somebody should start making a simple proposal for local
lambda functions in C++ so we can get it into C++0x.

-Thorsten

Walter Bright

Jan 9, 2006, 5:53:20 AM

"Mirek Fidler" <c...@volny.cz> wrote in message
news:42cng3F...@individual.net...

> Walter Bright wrote:
>> "James Kanze" <ka...@none.news.free.fr> wrote in message
>> news:43b806b5$0$7928$626a...@news.free.fr...
>>
>>>>>What do you mean by "not gc friendly"?
>>>>
>>>>1) You've never seen code that stores bit flags in pointers?
>>>
>>>Where is the problem with regards to garbage collection?
>>
>>
>> If you store in the low bit, it makes the pointer odd. This can cause the
>> gc
>> to not regard it as a pointer, as it may contain assumptions that
>> pointers
>
> I think it can't. Consider:
>
> const char *x = new char[200];
> const char *b = x + 1;
>
> Now if only 'b' is in roots at the time of collection, you still have to
> mark that block as "used".

Suppose one writes a library that depends on the default global operator new
returning allocations always aligned on 8 byte boundaries, and stores extra
flags in the lower 3 bits. Now suppose your program overrides it with a new
global gc operator new, that aligns things on 4 bytes. Now the flags can
overflow a pointer into pointing into the next chunk of memory rather than
the current one.

Can this happen? Sure. Me, I think it's just asking for trouble to override
global operator new with a gc allocator, and then expect third party
libraries to keep working.
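
The alignment mismatch Walter describes shows up in plain address arithmetic. In this hypothetical 3-bit tag scheme (set_tag/clear_tag are invented names), untagging an 8-byte-aligned address round-trips, but untagging a merely 4-byte-aligned one silently yields a different address -- the "pointer" now names the wrong chunk:

```cpp
#include <cassert>
#include <cstdint>

// The library's assumption: allocations are 8-byte aligned, so the low
// 3 bits are free for flags.
std::uintptr_t set_tag(std::uintptr_t addr, unsigned flags) {
    return addr | (flags & 0x7u);
}

std::uintptr_t clear_tag(std::uintptr_t addr) {
    return addr & ~std::uintptr_t(0x7);
}
```

With a 4-byte-aligned allocator substituted underneath, bit 2 of the address is real address information, and masking it off moves the pointer to the previous 8-byte boundary.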

For another reason why, consider if the library overrides global operator
new, too.

Here's a list of problems with code that may not be gc friendly, from the
Boehm collector instructions:

1. The collector did not intercept the creation of threads correctly in a
multithreaded application, e.g. because the client called pthread_create
without including gc.h, which redefines it.
2. The last pointer to an object in the garbage collected heap was stored
somewhere where the collector couldn't see it, e.g. in an object allocated
with system malloc, in certain types of mmaped files, or in some data
structure visible only to the OS. (On some platforms, thread-local storage
is one of these.)
3. The last pointer to an object was somehow disguised, e.g. by XORing it
with another pointer.


Walter Bright
www.digitalmars.com C, C++, D programming language compilers


kanze

Jan 9, 2006, 10:41:47 AM
Walter Bright wrote:
> "kanze" <ka...@gabi-soft.fr> wrote in message
> news:1136551885.2...@f14g2000cwb.googlegroups.com...
> > You can standardize for a given platform, but that's
> > normally already been done anyway.

> That's just the problem. It isn't normally done.

It isn't normally done, or the compiler writers just ignore it
when it is done? It's certainly done for Sparc, and from what
little I've tried, I can move between g++ and Sun CC without
having to change assemblers. Intel had a standard format for
ASM-86 -- I suspect that the same is true today. If two
implementations of inline assembler for Intel don't use the same
instruction mnemonics, then it's more a case of the implementors
ignoring the standard than of the lack of a standard, isn't it?

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

kanze

Jan 9, 2006, 11:09:32 AM
Walter Bright wrote:
> "Mirek Fidler" <c...@volny.cz> wrote in message
> news:42cng3F...@individual.net...
> > Walter Bright wrote:
> >> "James Kanze" <ka...@none.news.free.fr> wrote in message
> >> news:43b806b5$0$7928$626a...@news.free.fr...

> >>>>>What do you mean by "not gc friendly"?

> >>>>1) You've never seen code that stores bit flags in pointers?

> >>>Where is the problem with regards to garbage collection?

> >> If you store in the low bit, it makes the pointer odd. This
> >> can cause the gc to not regard it as a pointer, as it may
> >> contain assumptions that pointers

> > I think it can't. Consider:

> > const char *x = new char[200];
> > const char *b = x + 1;

> > Now if only 'b' is in roots at the time of collection, you
> > still have to mark that block as "used".

> Suppose one writes a library that depends on the default
> global operator new returning allocations always aligned on 8
> byte boundaries, and stores extra flags in the lower 3 bits.
> Now suppose your program overrides it with a new global gc
> operator new, that aligns things on 4 bytes.

Then in practice, you've changed implementations. An
implementation of garbage collection can't do just anything and
still be conforming. In particular, it cannot reduce the
alignment.

(You don't need any business concerning flags for this. If the
global operator new forces alignment to 8, there is a reason.
And this same reason will force the alignment of the garbage
collector to 8.)

> Now the flags can overflow a pointer into pointing into the
> next chunk of memory rather than the current one.

> Can this happen? Sure. Me, I think it's just asking for
> trouble to override global operator new with a gc allocator,
> and then expect third party libraries to keep working.

I suppose that it depends on the library, but the only problems
I can see are if the library keeps its pointers on disk, and not
in memory, or masks them in some other way. And I can't imagine
that being a problem in practice.

In fact, I worry more about compiler optimizers; there are
optimization techniques which can invalidate garbage collection.
In practice, they don't apply to risc processors, nor to the
Intel architecture, and g++ doesn't use them (presumably, since
gcc also supports Objective C with the Boehm collector, using
the same back end). So I feel fairly safe for a specific subset
of platforms, which happens to include all those which interest
me.

> For another reason why, consider if the library overrides
> global operator new, too.

Don't use it.

I'm serious about that. Global operator new is global. It's
not the property of any library. It's quite frequent to
replace it for debugging purposes, for example.

> Here's a list of problems with code that may not be gc
> friendly, from the Boehm collector instructions:

> 1. The collector did not intercept the creation of threads
> correctly in a multithreaded application, e.g. because the
> client called pthread_create without including gc.h, which
> redefines it.

> 2. The last pointer to an object in the garbage collected
> heap was stored somewhere where the collector couldn't see
> it, e.g. in an object allocated with system malloc, in
> certain types of mmaped files, or in some data structure
> visible only to the OS. (On some platforms, thread-local
> storage is one of these.)

> 3. The last pointer to an object was somehow disguised,
> e.g. by XORing it with another pointer.

It's not perfect, that's for sure. In practice, the only one
that really worries me is the first -- third party libraries do
use threads. (If the library is open source, this is easy to
fix. But a lot of interesting libraries aren't.) If there's
some possibility of a problem from malloc, of course, the
solution is to replace malloc rather than the global operator
new. As for the last, I suspect that it is mentioned only for
reasons of completeness. I can't imagine it really happening in
serious code.

Still, it remains true that when using third party libraries,
you would probably want to vet them for garbage collection
before committing.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
