Using vector<unsigned char> as raw memory

ka...@gabi-soft.fr

Feb 5, 2004, 6:45:05 AM
Is it possible to use std::vector< unsigned char > as raw memory? Or,
more specifically, if I have a std::vector< unsigned char > v, is &v[0]
guaranteed to be aligned for all possible data types? (Since an
implementation is required ultimately to use operator new to obtain the
buffer, I can't see how it couldn't be in practice, but I rather doubt
that the standard gives me this guarantee, even indirectly. But maybe
this was the intent.)
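
A minimal sketch of what "aligned for all possible data types" means as a runtime check, assuming a C++11 compiler (for alignof). The names is_aligned_for and vector_start_is_aligned_for are hypothetical; the check only reports what one particular implementation happens to do, it does not establish a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// True if p satisfies the alignment requirement of T.
template <typename T>
bool is_aligned_for(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper: checks the start of a byte vector.
// An empty vector has no element 0, so it never qualifies.
template <typename T>
bool vector_start_is_aligned_for(const std::vector<unsigned char>& v)
{
    return !v.empty() && is_aligned_for<T>(&v[0]);
}
```

Calling vector_start_is_aligned_for<double>(v) on a non-empty v tells you whether the buffer could hold a double on that platform and library.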

For those who are curious: certain Posix functions require some very
strange memory tricks. The second parameter of readdir_r is the one
that's giving me the problems -- and the actual size needed isn't known
until runtime, because it depends on the filesystem where the directory
is hosted. So I need to allocate dynamically, and I need RAII (and my
compiler is too old to support Boost, so scoped_array isn't an option).
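
A rough sketch of the runtime size computation described above, assuming a POSIX system. It uses the commonly cited formula offsetof(dirent, d_name) + NAME_MAX + 1; the function names and the fallback value of 255 are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <dirent.h>
#include <unistd.h>

// Bytes needed for a dirent whose d_name can hold name_max characters
// plus the terminating '\0'. Never smaller than sizeof(dirent).
std::size_t dirent_buffer_size(long name_max)
{
    std::size_t needed =
        offsetof(dirent, d_name) + static_cast<std::size_t>(name_max) + 1;
    return needed < sizeof(dirent) ? sizeof(dirent) : needed;
}

// Asks the filesystem hosting `path` for its NAME_MAX. Falls back to 255
// (an assumed default) when pathconf cannot give a limit.
std::size_t dirent_buffer_size_for(const char* path)
{
    long name_max = pathconf(path, _PC_NAME_MAX);
    if (name_max < 0) name_max = 255;
    return dirent_buffer_size(name_max);
}
```

The buffer of that size must also be aligned for struct dirent, which is exactly where the alignment question comes from.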

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Thomas Mang

Feb 6, 2004, 5:15:21 AM

ka...@gabi-soft.fr wrote:

> Is it possible to use std::vector< unsigned char > as raw memory? Or,
> more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> guaranteed to be aligned for all possible data types. (Since an
> implementation is required ultimately to use operator new to obtain the
> buffer, I can't see how it couldn't be in practice, but I rather doubt
> that the standard gives me this guarantee, even indirectly. But maybe
> this was the intent.)

Does the Standard somewhere require vector to obtain its memory always via
operator new?

Wouldn't it be possible that class vector, especially when optimized for
small classes (say, the built-in types), reserves internally some raw
memory for a limited number of T-objects, to avoid allocating memory from
the heap (the compiler could, of course, easily find out how this raw
memory would have to be aligned at instantiation time)? I remember having
heard of a string implementation (Intel's?) that was optimized for small
strings and had a charT[16] array as data member, exactly to avoid
operator new. Couldn't the same issue apply to vector?

In case this is true, then I think your question can be easily answered
that there is no alignment guarantee.
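
A small sketch of the layout being imagined here, assuming C++11 for alignof. SmallBytes and Holder are hypothetical types, not taken from any real library: storage held inside the object only needs the alignment of its element type, so it can end up at an odd address.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical small-buffer layout: up to 16 bytes live inside the
// object itself, so no allocation function is involved at all.
struct SmallBytes
{
    unsigned char inline_storage[16];
    unsigned char* begin() { return inline_storage; }
};

// An enclosing object may place SmallBytes at any byte offset,
// because its strictest member only needs alignment 1.
struct Holder
{
    char tag;
    SmallBytes bytes;
};
```

On common compilers offsetof(Holder, bytes) is 1, so begin() would not be usable as storage for an int or a double.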


regards,

Thomas

Joe

Feb 6, 2004, 5:31:52 AM
I am no expert on the STL or the thorny issues of data alignment, but are
you sure that vector<T> will use new to obtain memory for data storage?
Although I know of no implementation that does so, could vector<T> not (like
some string implementations) use some internal space to store "T" for small
vector<T> sizes and then use new as needed for larger sizes? That would be a
faster implementation for small vector<T> sizes.

In addition, I seem to remember a post some time ago about a standards
proposal that would require all elements of a vector<T> to occupy contiguous
memory locations -- thus implying that the current standard does not require
it. Although I strongly doubt that any implementation would not have
contiguous elements.

Joe


<ka...@gabi-soft.fr> wrote in message
news:d6652001.0402...@posting.google.com...


> Is it possible to use std::vector< unsigned char > as raw memory? Or,
> more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> guaranteed to be aligned for all possible data types. (Since an
> implementation is required ultimately to use operator new to obtain the
> buffer, I can't see how it couldn't be in practice, but I rather doubt
> that the standard gives me this guarantee, even indirectly. But maybe
> this was the intent.)
>
> For those who are curious: certain Posix functions require some very
> strange memory tricks. The second parameter of readdir_r is the one
> that's giving me the problems -- and the actual size needed isn't known
> until runtime, because it depends on the filesystem where the directory
> is hosted. So I need to allocate dynamically, and I need RAII (and my
> compiler is too old to support Boost, so scoped_array isn't an option).

Hendrik Belitz

Feb 6, 2004, 7:31:31 AM
ka...@gabi-soft.fr wrote:

> Is it possible to use std::vector< unsigned char > as raw memory? Or,
> more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> guaranteed to be aligned for all possible data types.

For most implementations of std::vector you will get the desired
behaviour. But since the standard does not mandate a specific implementation
of std::vector, this cannot be guaranteed. Using a different allocator for
the vector may also change this behaviour.

--
To get my real email address, remove the two onkas
--
Dipl.-Inform. Hendrik Belitz
Central Institute of Electronics
Research Center Juelich

Bogdan

Feb 6, 2004, 7:32:14 AM
> Is it possible to use std::vector< unsigned char > as raw memory? Or,
> more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> guaranteed to be aligned for all possible data types. (Since an
> implementation is required ultimately to use operator new to obtain the
> buffer, I can't see how it couldn't be in practice, but I rather doubt
> that the standard gives me this guarantee, even indirectly. But maybe
> this was the intent.)
>
> For those who are curious: certain Posix functions require some very
> strange memory tricks. The second parameter of readdir_r is the one
> that's giving me the problems -- and the actual size needed isn't known
> until runtime, because it depends on the filesystem where the directory
> is hosted. So I need to allocate dynamically, and I need RAII (and my
> compiler is too old to support Boost, so scoped_array isn't an option).

I don't know what the standard says about &v[0].
Maybe you could use std::basic_string<unsigned char>. I know that
data() returns a const pointer to an array owned by the string, but as
long as no non-const member functions are called on the string, that array
could be used.
Anyway, in this case it may be worth writing a simplified version of
boost::scoped_array.

Best regards,
Bogdan Sintoma

tom_usenet

Feb 6, 2004, 7:32:36 AM
On 5 Feb 2004 06:45:05 -0500, ka...@gabi-soft.fr wrote:

>Is it possible to use std::vector< unsigned char > as raw memory? Or,
>more specifically, if I have a std::vector< unsigned char > v, is &v[0]
>guaranteed to be aligned for all possible data types. (Since an
>implementation is required ultimately to use operator new to obtain the
>buffer, I can't see how it couldn't be in practice, but I rather doubt
>that the standard gives me this guarantee, even indirectly. But maybe
>this was the intent.)

Looking at 20.4.1.1, std::allocator<T>::allocate allocates memory
suitably aligned for T, so memory returned by std::allocator<unsigned
char>::allocate is only guaranteed to have an alignment of 1. In
addition, vector doesn't necessarily have to use the memory returned
by the allocator directly. e.g. one could envisage a specialization of
vector for unsigned char that stored the size of the vector at the
start of the allocation (using knowledge of std::allocator to
determine the capacity).

e.g.

template<>
class vector<unsigned char, std::allocator<unsigned char> >
{
    unsigned char* m_data; // only data member, never null
public:
    //...

    size_type size() const
    {
        // the element count is kept at the start of the allocation
        return *reinterpret_cast<size_type*>(m_data);
    }

    size_type capacity() const
    {
        // using knowledge of operator new (block size assumed to sit
        // 4 bytes before the returned pointer).
        return *reinterpret_cast<size_t*>(m_data - 4) - sizeof(size_type);
    }

    reference operator[](size_type t)
    {
        // elements start after the stored size
        return m_data[t + sizeof(size_type)];
    }
};

So even if std::allocator returned fully aligned memory,
vector<unsigned char> could make v[0] unaligned.
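
The effect can be sketched outside of vector with a small class that keeps its element count in the same block as the elements. HeaderBuffer is a hypothetical name, and this is only an illustration of the layout, not how any shipping std::vector works.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical layout: [ size_t element count | elements... ] in one block.
class HeaderBuffer
{
public:
    explicit HeaderBuffer(std::size_t n)
        : raw_(static_cast<unsigned char*>(
              ::operator new(sizeof(std::size_t) + n)))
    {
        std::memcpy(raw_, &n, sizeof n);   // store the count in the header
    }
    ~HeaderBuffer() { ::operator delete(raw_); }

    std::size_t size() const
    {
        std::size_t n;
        std::memcpy(&n, raw_, sizeof n);
        return n;
    }
    unsigned char* raw()  { return raw_; }                        // what operator new returned
    unsigned char* data() { return raw_ + sizeof(std::size_t); }  // where element 0 lives

private:
    HeaderBuffer(const HeaderBuffer&);             // non-copyable
    HeaderBuffer& operator=(const HeaderBuffer&);
    unsigned char* raw_;
};
```

data() is offset from the fully aligned raw() by sizeof(std::size_t), so it is only as aligned as that offset allows.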

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html

Vladimir Kouznetsov

Feb 6, 2004, 7:40:27 AM
<ka...@gabi-soft.fr> wrote in message
news:d6652001.0402...@posting.google.com...
> Is it possible to use std::vector< unsigned char > as raw memory? Or,
> more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> guaranteed to be aligned for all possible data types. (Since an
> implementation is required ultimately to use operator new to obtain the
> buffer, I can't see how it couldn't be in practice, but I rather doubt
> that the standard gives me this guarantee, even indirectly. But maybe
> this was the intent.)

I think that, to be on the safe side, you had better implement your own
allocator. There are no requirements on how the standard allocator lays out
data inside the storage allocated by ::operator new(), except that it should
be T-aligned. But I believe you know that already.

> James Kanze GABI Software mailto:ka...@gabi-soft.fr

thanks,
v

Balog Pal

Feb 6, 2004, 7:42:58 AM
<ka...@gabi-soft.fr> wrote in message
news:d6652001.0402...@posting.google.com...

> For those who are curious: certain Posix functions require some very
> strange memory tricks. The second parameter of readdir_r is the one
> that's giving me the problems -- and the actual size needed isn't known
> until runtime, because it depends on the filesystem where the directory
> is hosted. So I need to allocate dynamically, and I need RAII (and my
> compiler is too old to support Boost, so scoped_array isn't an option).

Come on, James, are you saying you can't write a wrapper, a raw memblock
handler class, in 20 minutes? If you would use scoped_array or some other boost
class were it not for your compiler problems, you can just delete the
template<> part and write a typedef 2 lines below it, creating a nontemplate
instance with the same functionality.

Why not do it the way it worked 10 years back if it does the job?
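
Such a wrapper might look roughly like the following. This is a minimal sketch written in pre-C++11 style; the class name RawBuffer and its interface are assumptions, not code from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Minimal non-template owner of a block obtained from ::operator new.
class RawBuffer
{
public:
    explicit RawBuffer(std::size_t n) : p_(::operator new(n)), n_(n) {}
    ~RawBuffer() { ::operator delete(p_); }

    void* get() const { return p_; }
    std::size_t size() const { return n_; }

private:
    RawBuffer(const RawBuffer&);             // copying would double-delete
    RawBuffer& operator=(const RawBuffer&);
    void* p_;
    std::size_t n_;
};
```

Because the pointer comes straight from ::operator new, the alignment requirement on operator new applies to get() directly.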

Paul

Chris Theis

Feb 6, 2004, 2:34:19 PM

"Thomas Mang" <a980...@unet.univie.ac.at> wrote in message
news:40222FD4...@unet.univie.ac.at...

>
>
> ka...@gabi-soft.fr wrote:
>
> > Is it possible to use std::vector< unsigned char > as raw memory? Or,
> > more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> > guaranteed to be aligned for all possible data types. (Since an
> > implementation is required ultimately to use operator new to obtain the
> > buffer, I can't see how it couldn't be in practice, but I rather doubt
> > that the standard gives me this guarantee, even indirectly. But maybe
> > this was the intent.)
>
> Does the Standard somewhere require vector to obtain its memory always via
> operator new?
>

The containers allocate memory via their allocators, and the implementation
of the default allocator uses global new & delete. However, AFAIK the
standard does not require other optimized or user-implemented allocators to do
so. Until now I assumed that this was the idea behind the allocator approach:
that using new & delete is not necessary.

[SNIP]


Chris

Maciej Sobczak

Feb 6, 2004, 2:58:55 PM
Hi,

ka...@gabi-soft.fr wrote:

> Is it possible to use std::vector< unsigned char > as raw memory?

I cannot answer your specific question (I can only share your
assumptions that it is OK), but there is a place for spinning off a
related question.

Why do you want to use unsigned char as the underlying type? Is it any
better than plain char when used as "raw memory" (where by "raw memory"
I mean that the only later use will involve reinterpret casts or object
copy)?

Consider:

3.9/2 states that POD can be copied back and forth using array of char
OR unsigned char, preserving its value.

3.9.1/1:
"A char [...] AND unsigned char [...] have the same object representation."

There are other places where such properties are defined to be the same
for both char and unsigned char.

The only relevant place that shows some asymmetry is 3.9/4:
"The object representation of an object of type T is the sequence of N
unsigned char objects taken up by the object of type T, where N equals
sizeof(T)."

This would state that unsigned char is better than plain char for "raw
memory" uses, but somehow I cannot believe it due to the 3.9.1/1 cited
above.

It was my habit to use unsigned char, but I gave that up and now
consistently use char buffers when "raw memory" is what I need.
I just found it more consistent with various API functions, where
pointer to char is expected as a buffer parameter.
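
The 3.9/2 guarantee cited above can be sketched as a round trip through either kind of byte array. Point and round_trip are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstring>

struct Point { int x; double y; };   // a POD type, chosen for illustration

// Copies a Point out to an array of Byte and back again.
// Byte is expected to be char or unsigned char.
template <typename Byte>
Point round_trip(const Point& in)
{
    Byte bytes[sizeof(Point)];
    std::memcpy(bytes, &in, sizeof in);
    Point out;
    std::memcpy(&out, bytes, sizeof out);
    return out;
}
```

Both round_trip<char> and round_trip<unsigned char> hand back an object with the original value, which is the symmetry the paragraph relies on.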

I will be glad to know your opinion on this.

--
Maciej Sobczak : http://www.msobczak.com/
Programming : http://www.msobczak.com/prog/

Hendrik Belitz

Feb 6, 2004, 3:12:15 PM
Thomas Mang wrote:

std::vector uses an allocator to obtain memory. So even if the standard
vector does not call new, you're always able to write your own allocator
and use it with the vector class without major modifications of your code.
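
A minimal sketch of such an allocator, assuming a C++11 or later library (it relies on std::allocator_traits to supply the members left out here, so it would not compile on a 2004-era compiler). NewAllocator and ByteVector are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Minimal allocator that forwards every request straight to ::operator new.
template <typename T>
struct NewAllocator
{
    typedef T value_type;

    NewAllocator() {}
    template <typename U> NewAllocator(const NewAllocator<U>&) {}

    T* allocate(std::size_t n)
    {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const NewAllocator<T>&, const NewAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const NewAllocator<T>&, const NewAllocator<U>&) { return false; }

typedef std::vector<unsigned char, NewAllocator<unsigned char> > ByteVector;
```

This pins down where the allocator gets its memory, but not what the vector does with the pointer afterwards, which is the remaining doubt raised elsewhere in the thread.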

--
To get my real email address, remove the two onkas
--
Dipl.-Inform. Hendrik Belitz
Central Institute of Electronics
Research Center Juelich

Francis Glassborow

Feb 6, 2004, 3:16:19 PM
In message <d6652001.0402...@posting.google.com>,
ka...@gabi-soft.fr writes

>Is it possible to use std::vector< unsigned char > as raw memory? Or,
>more specifically, if I have a std::vector< unsigned char > v, is &v[0]
>guaranteed to be aligned for all possible data types. (Since an
>implementation is required ultimately to use operator new to obtain the
>buffer, I can't see how it couldn't be in practice, but I rather doubt
>that the standard gives me this guarantee, even indirectly. But maybe
>this was the intent.)
>
>For those who are curious: certain Posix functions require some very
>strange memory tricks. The second parameter of readdir_r is the one
>that's giving me the problems -- and the actual size needed isn't known
>until runtime, because it depends on the filesystem where the directory
>is hosted. So I need to allocate dynamically, and I need RAII (and my
>compiler is too old to support Boost, so scoped_array isn't an option).

What I do not understand is why you do not just use operator new directly:

unsigned char * ptr = static_cast<unsigned char*>(operator new(requirement));

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Graeme Prentice

Feb 6, 2004, 3:18:05 PM
On 5 Feb 2004 06:45:05 -0500, ka...@gabi-soft.fr wrote:

>Is it possible to use std::vector< unsigned char > as raw memory? Or,
>more specifically, if I have a std::vector< unsigned char > v, is &v[0]
>guaranteed to be aligned for all possible data types. (Since an
>implementation is required ultimately to use operator new to obtain the
>buffer, I can't see how it couldn't be in practice, but I rather doubt
>that the standard gives me this guarantee, even indirectly. But maybe
>this was the intent.)

It does guarantee it. As you probably know, 20.4.1.1 requires operator
new(size_t) to be used by the default allocator. 3.7.3.1 para 2
requires the address returned by new to be suitably aligned so that it
can be converted to a pointer of any complete object type.

You also probably know that you can't use v[0] until there is at least
one element in the vector even if memory has been allocated/reserved for
one or more elements. One more thing, for the memory returned by
operator new, the address of one past the end of that memory has to be a
valid address because new has to allow for the fact that the memory
might be being used for an array, which requires one past the end to be
a valid address.
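
The point about v[0] can be sketched with a small guard. buffer_ptr is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a pointer to the first element, or a null pointer when the
// vector holds no elements (reserve() alone does not create any).
inline unsigned char* buffer_ptr(std::vector<unsigned char>& v)
{
    return v.empty() ? 0 : &v[0];
}
```

In other words, a buffer of n bytes has to be created with resize(n) or the sizing constructor, not with reserve(n).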

Graeme

Graeme Prentice

Feb 6, 2004, 3:21:33 PM
On 5 Feb 2004 06:45:05 -0500, ka...@gabi-soft.fr wrote:

>Is it possible to use std::vector< unsigned char > as raw memory? Or,
>more specifically, if I have a std::vector< unsigned char > v, is &v[0]
>guaranteed to be aligned for all possible data types. (Since an
>implementation is required ultimately to use operator new to obtain the
>buffer, I can't see how it couldn't be in practice, but I rather doubt
>that the standard gives me this guarantee, even indirectly. But maybe
>this was the intent.)

I take back what I said in another reply to this (which may or may not
have turned up yet). Even though operator new is required to be used,
allocator<T> returns memory suitably aligned for an object of type T
only, so if the default allocator breaks up the memory returned by
operator new (which it's allowed to do) then there's no alignment
guarantee for non char objects.
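
A sketch of the kind of "breaking up" meant here, assuming C++11 for alignof. BumpArena is a hypothetical pool, not the behaviour of any particular std::allocator: it takes one block from ::operator new and hands out consecutive byte ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical pool: one block from ::operator new, handed out in
// byte-granular pieces with no padding between them.
class BumpArena
{
public:
    explicit BumpArena(std::size_t capacity)
        : base_(static_cast<unsigned char*>(::operator new(capacity))),
          capacity_(capacity), used_(0) {}
    ~BumpArena() { ::operator delete(base_); }

    // Suitable for unsigned char only: the result is aligned to 1.
    unsigned char* allocate(std::size_t n)
    {
        if (n > capacity_ - used_) throw std::bad_alloc();
        unsigned char* p = base_ + used_;
        used_ += n;
        return p;
    }

private:
    BumpArena(const BumpArena&);             // non-copyable
    BumpArena& operator=(const BumpArena&);
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_;
};
```

After a 3-byte request, the next piece starts 3 bytes into the block, which is fine for unsigned char but not for int.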

Graeme

Dhruv Matani

Feb 6, 2004, 3:24:06 PM
On Thu, 05 Feb 2004 06:45:05 -0500, kanz wrote:

> For those who are curious: certain Posix functions require some very
> strange memory tricks. The second parameter of readdir_r is the one
> that's giving me the problems -- and the actual size needed isn't known
> until runtime, because it depends on the filesystem where the directory
> is hosted. So I need to allocate dynamically, and I need RAII (and my
> compiler is too old to support Boost, so scoped_array isn't an option).

What about std::auto_ptr<>?


Regards,
-Dhruv.

Dhruv Matani

Feb 6, 2004, 3:24:53 PM

Quoting from the holy standard:
8 Copy constructors for all container types defined in this clause copy
the allocator argument from their respective first parameters. All
other constructors for these container types take an Allocator& argument
(_lib.allocator.requirements_). A copy of this argument is used
for any memory allocation performed, by these constructors and by all
member functions, during the lifetime of each container object. In
all container types defined in this clause, the member get_allocator()
returns a copy of the Allocator object used to construct the
container.

Thus, vector MUST obtain memory from "Allocator" which in turn must obtain
memory from operator new, so vector always gets new'd memory. However,
since it is unspecified how many times new is called by "Allocator", we
cannot give any alignment guarantees. Of course unofficially, what you
(Kanze) want to do would work quite well in practice!


Now, AFAIK, basic_string<> is not a part of the standard, so it may use
any whacky optimization technique.


Regards,
-Dhruv.

Francis Glassborow

Feb 6, 2004, 3:25:15 PM
In message <diu420ptq8pk1oju9...@4ax.com>, tom_usenet
<tom_u...@hotmail.com> writes

>On 5 Feb 2004 06:45:05 -0500, ka...@gabi-soft.fr wrote:
>
> >Is it possible to use std::vector< unsigned char > as raw memory? Or,
> >more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> >guaranteed to be aligned for all possible data types. (Since an
> >implementation is required ultimately to use operator new to obtain the
> >buffer, I can't see how it couldn't be in practice, but I rather doubt
> >that the standard gives me this guarantee, even indirectly. But maybe
> >this was the intent.)
>
>Looking at 20.4.1.1, std::allocator<T>::allocate allocates memory
>suitably aligned for T, so memory returned by std::allocator<unsigned
>char>::allocate is only guaranteed to have an alignment of 1.

However see the requirement for the implementation provided operator new
as given in 18.4.1.1 para 1 which requires that the returned pointer is
to suitably aligned memory for any object of that size.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Nicola Musatti

Feb 6, 2004, 3:26:16 PM
> Is it possible to use std::vector< unsigned char > as raw memory? Or,
> more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> guaranteed to be aligned for all possible data types. (Since an
> implementation is required ultimately to use operator new to obtain the
> buffer, I can't see how it couldn't be in practice, but I rather doubt
> that the standard gives me this guarantee, even indirectly. But maybe
> this was the intent.)

I believe the standard does give you this guarantee: std::vector
specialized with a single explicit template argument must use a
standard allocator [23.2.4], which must use ::operator new [20.4.1.1],
which must return memory aligned for any complete object type
[3.7.3.1].

Cheers,
Nicola Musatti

Francis Glassborow

Feb 6, 2004, 3:27:19 PM
In message <1025dgn...@corp.supernews.com>, Vladimir Kouznetsov
<vladimir....@ngrain.com> writes

>I think that, to be on the safe side, you had better implement your own
>allocator. There are no requirements on how the standard allocator lays out
>data inside the storage allocated by ::operator new(), except that it should
>be T-aligned. But I believe you know that already.

Yes, but there is a requirement on the implementation provided
::operator new() which seems to meet James' requirements.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Frank Birbacher

Feb 7, 2004, 6:16:49 AM
Hi!

Hendrik Belitz wrote:
> For most implementations of std::vector you will get the desired
> behaviour. But since the standard does not mandate a specific implementation
> of std::vector, this cannot be guaranteed. Using a different allocator for
> the vector may also change this behaviour.

One could use a custom allocator which uses new/delete. But in this case
one could rewrite a boost::scoped_array as well.

Frank

Raoul Gough

Feb 7, 2004, 6:20:37 AM
"Dhruv Matani" <dhru...@gmx.net> writes:

> On Thu, 05 Feb 2004 06:45:05 -0500, kanz wrote:
>
>> For those who are curious: certain Posix functions require some
>> very strange memory tricks. The second parameter of readdir_r is
>> the one that's giving me the problems -- and the actual size needed
>> isn't known until runtime, because it depends on the filesystem
>> where the directory is hosted. So I need to allocate dynamically,
>> and I need RAII (and my compiler is too old to support Boost, so
>> scoped_array isn't an option).
>
> What about std::auto_ptr<>?

That's no good for arrays (of chars or anything else). It uses delete
instead of delete[].
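
The difference can be sketched with a simplified stand-in for boost::scoped_array. ScopedArray and the Counted demo type are hypothetical names, and this is far less complete than the Boost class.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for boost::scoped_array: owns a new[] array and
// releases it with delete[].
template <typename T>
class ScopedArray
{
public:
    explicit ScopedArray(T* p = 0) : p_(p) {}
    ~ScopedArray() { delete[] p_; }

    T& operator[](std::size_t i) const { return p_[i]; }
    T* get() const { return p_; }

private:
    ScopedArray(const ScopedArray&);             // non-copyable
    ScopedArray& operator=(const ScopedArray&);
    T* p_;
};

// Demo type that counts live instances, so the cleanup is observable.
struct Counted
{
    static int live;
    Counted()  { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;
```

With ScopedArray<Counted> a(new Counted[3]); every element's destructor runs when a goes out of scope, which std::auto_ptr does not promise for arrays.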

--
Raoul Gough.
export LESS='-X'

Thomas Mang

Feb 7, 2004, 6:25:12 AM

Dhruv Matani wrote:

The vector must get its memory from the allocator when it needs extra heap
memory.
That's what the allocator is good for.

However, my point is that I believe that as long as the vector does not need
extra memory, there is no need to call any of the functions provided by allocator.
It would not need memory when it has raw memory as data member (i.e. a
char-array), and the size of the vector never exceeds the limit of this raw
memory.
At least I cannot find anything in the Standard that would prohibit such an
implementation.

>
> Now, AFAIK, basic_string<> is not a part of the standard, so it may use
> any whacky optimization technique.

Well, basic_string<> is definitely part of the standard.

regards,

Thomas

Thomas Mang

Feb 7, 2004, 6:25:40 AM

Dhruv Matani wrote:

> On Thu, 05 Feb 2004 06:45:05 -0500, kanz wrote:
>
> > For those who are curious: certain Posix functions require some very
> > strange memory tricks. The second parameter of readdir_r is the one
> > that's giving me the problems -- and the actual size needed isn't known
> > until runtime, because it depends on the filesystem where the directory
> > is hosted. So I need to allocate dynamically, and I need RAII (and my
> > compiler is too old to support Boost, so scoped_array isn't an option).
>
> What about std::auto_ptr<>?

Doesn't work with arrays.


regards,

Thomas

Gennaro Prota

Feb 7, 2004, 6:26:24 AM
On 6 Feb 2004 14:58:55 -0500, Maciej Sobczak <no....@no.spam.com>
wrote:

>I cannot answer your specific question (I can only share your
>assumptions that it is OK), but there is a place for spinning off a
>related question.
>
>Why do you want to use unsigned char as the underlying type? Is it any
>better than plain char when used as "raw memory" (where by "raw memory"
>I mean that the only later use will involve reinterpret casts or object
>copy).

This is a thorny issue. The question comes down to whether a signed
char may have padding bits and/or trap representations. I'm not sure
the C++ standard was written having in mind a definite answer to that
question. I guess the committee simply thought to make explicit, or
reword, what were the C90 requirements.

However, C90 wasn't particularly clear on this point and there was a
fundamental DR by Clive Feather:

http://wwwold.dkuug.dk/JTC1/SC22/WG14/www/docs/dr_069.html

The response clarified many things that were in the committee
intentions but not explicitly spelt out in the standard text (note:
they didn't mean to change/fix the rules; just to clarify what was
always the case for any conforming C90 implementation). Based on that,
C99 is much more explicit and clear on these topics and there's text,
written by Clive Feather himself, that explicitly allows trap
representations in signed chars (see 6.2.6.2). A C99 implementation
where

CHAR_BIT = 10
UCHAR_MAX = 1023
SCHAR_MIN = -128
SCHAR_MAX = 127

would be conforming (I don't know if it really exists, though).
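
Whether a given implementation resembles that scenario can be checked from <climits>. A small sketch with hypothetical function names; on the hypothetical 10-bit platform above the second function would return false, while on ordinary 8-bit two's complement platforms both are expected to return true.

```cpp
#include <cassert>
#include <climits>

// unsigned char has no padding bits, so its maximum uses every bit.
inline bool unsigned_char_uses_all_bits()
{
    return static_cast<unsigned long>(UCHAR_MAX) == (1UL << CHAR_BIT) - 1;
}

// True when signed char has as many distinct values as unsigned char,
// so every bit pattern maps to a distinct value.
inline bool signed_char_uses_all_patterns()
{
    return static_cast<long>(SCHAR_MAX) - static_cast<long>(SCHAR_MIN)
           == static_cast<long>(UCHAR_MAX);
}
```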


In that scenario, and since C++ certainly aims at C compatibility in
this delicate area, I would say that if the C++ standard says anything
different from C99 that isn't intentional.


PS: You won't believe me but I simply gave up consulting the C++
standard for signed vs. unsigned char issues a long time ago, for the
reasons I said above; never noticed there's a related DR:

http://std.dkuug.dk/jtc1/sc22/wg21/docs/cwg_active.html#350

I had the scruple to check the DR list before hitting the send button,
and it was there :) As you may see, it confirms that much of the
wording you quote was written without a clear intent.

Genny.

Vladimir Kouznetsov

Feb 7, 2004, 6:26:51 AM
"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
news:FrnA8HDa...@robinton.demon.co.uk...

> In message <d6652001.0402...@posting.google.com>,
> ka...@gabi-soft.fr writes
> >Is it possible to use std::vector< unsigned char > as raw memory? Or,
> >more specifically, if I have a std::vector< unsigned char > v, is &v[0]
> >guaranteed to be aligned for all possible data types. (Since an
> >implementation is required ultimately to use operator new to obtain the
> >buffer, I can't see how it couldn't be in practice, but I rather doubt
> >that the standard gives me this guarantee, even indirectly. But maybe
> >this was the intent.)
> >

> what I do not understand is why you do not just use operator new directly:
>
> unsigned char * ptr = static_cast<unsigned char*>(operator new(requirement));

Perhaps the storage should grow, preserving the old content.

> Francis Glassborow ACCU

thanks,
v

Vladimir Kouznetsov

Feb 7, 2004, 6:27:44 AM
"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
news:yFWtnLCc...@robinton.demon.co.uk...

> In message <1025dgn...@corp.supernews.com>, Vladimir Kouznetsov
> <vladimir....@ngrain.com> writes
> >I think that, to be on the safe side, you had better implement your own
> >allocator. There are no requirements on how the standard allocator lays out
> >data inside the storage allocated by ::operator new(), except that it should
> >be T-aligned. But I believe you know that already.
>
> Yes, but there is a requirement on the implementation provided
> ::operator new() which seems to meet James' requirements.

I'm not sure what your point is. Are you suggesting that a raw allocated
array can be used instead? This array lacks some useful properties of
vector.

> Francis Glassborow ACCU

thanks,
v

Chris Theis

Feb 9, 2004, 1:58:47 PM

"Dhruv Matani" <dhru...@gmx.net> wrote in message
news:pan.2004.02.06....@gmx.net...

> On Fri, 06 Feb 2004 05:15:21 -0500, Thomas Mang wrote:
[SNIP]

>
> Quoting from the holy standard:
> 8 Copy constructors for all container types defined in this clause copy
> the allocator argument from their respective first parameters. All
> other constructors for these container types take an Allocator& argument
> (_lib.allocator.requirements_). A copy of this argument is used
> for any memory allocation performed, by these constructors and by all
> member functions, during the lifetime of each container object. In
> all container types defined in this clause, the member get_allocator()
> returns a copy of the Allocator object used to construct the
> container.
>
> Thus, vector MUST obtain memory from "Allocator" which in turn must obtain
> memory from operator new, so vector always gets new'd memory.

The standard requires that vector obtain its memory via the allocator.
However, AFAIK (and I just rechecked the standard without finding anything
to the contrary), an allocator is not necessarily required to use new & delete,
although the usual implementation does.

> However,
> since it is unspecified how many times new is called by "Allocator", we
> cannot give any alignment guarantees. Of course unofficially, what you
> (Kanze) want to do would work quite well in practice!
>
>
> Now, AFAIK, basic_string<> is not a part of the standard, so it may use
> any whacky optimization technique.
>

Hmm, why do you assume that basic_string<> is not a part of the standard?
It actually is (covered in chapter 21.3). Any "optimizations"
regarding memory management should be attributed to the allocators
rather than to the string class itself.

Regards
Chris

ka...@gabi-soft.fr

Feb 9, 2004, 7:41:33 PM
Thomas Mang <a980...@unet.univie.ac.at> wrote in message
news:<40240FF5...@unet.univie.ac.at>...
> Dhruv Matani wrote:

> > > ka...@gabi-soft.fr wrote:

My impression, too is that you have a point (although in my case, I
don't think I'll have to worry about small object optimizations:-)).
More generally, since you suggested the idea, I think an implementation
could maintain its own static memory pool, using some special allocation
algorithm, as long as it used operator new to obtain the memory for this
pool.

FWIW: one of my target compilers is very, very old, so I would prefer
avoiding any added template complexity like supplying my own allocator.
But given what you've said (and it sounds like a reasonable
interpretation to me), even with my own allocator, I couldn't guarantee
alignment.

I'm not sure if this was intentional or not. (The standard, as
published, very definitely allows non-contiguous implementations of
vector. It also seems clear that this freedom was not intentional.)

--


James Kanze GABI Software mailto:ka...@gabi-soft.fr

Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

ka...@gabi-soft.fr

Feb 9, 2004, 7:46:00 PM
"Balog Pal" <pa...@lib.hu> wrote in message
news:<4022...@andromeda.datanet.hu>...

> <ka...@gabi-soft.fr> wrote in message
> news:d6652001.0402...@posting.google.com...

> > For those who are curious: certain Posix functions require some
> > very strange memory tricks. The second parameter of readdir_r is
> > the one that's giving me the problems -- and the actual size needed
> > isn't known until runtime, because it depends on the filesystem
> > where the directory is hosted. So I need to allocate dynamically,
> > and I need RAII (and my compiler is too old to support Boost, so
> > scoped_array isn't an option).

> Come on, James, are you saying you can't write a wrapper class for a
> raw memory block in 20 minutes? If you would have used scoped_array or
> some other Boost class but for your compiler problems, you can just
> delete the template<> part and write a typedef 2 lines below it,
> creating a non-template instance with the same functionality.

Actually, I suspect that if I just extract scoped_array from the rest of
Boost (which I think the Boost licensing allows), it would probably
work. I doubt that it is the part of the Boost library which will cause
problems with my compiler. On the other hand, it means creating new
objects in the source code control system, and a few other hassles; my
predecessor has already done all this for std::vector:-).

There is also a consideration for the person who will read my code.
Presumably, he will know what std::vector is and does. The same thing
is not necessarily true of a class I write.

Of course, if the std::vector solution doesn't work, a try/catch block
will involve the least work for me. For this one case, of course.

I also thought that the question would be of more general interest. One
of the motivations behind the requirement for contiguity in std::vector,
if not THE motivation, was support for cases where C style arrays were
being used (legacy and C interfaces, for example). I stumbled on a case
where a C style interface was being used that didn't seem to be covered.

Of course, one could argue (very well, IMHO) that in this case, I am
going beyond the C style interface anyway -- Posix requires it, and
presumably, it will work on a Posix-compliant implementation of C, but
as far as ISO 9899 is concerned, it is undefined behavior.

Anyhow, given the point made by Thomas Mang, I think that std::vector is
out. Should it be? To be frank, I don't know. It would be very
convenient for me if I could use it, and would fit into the argument
that we don't need new[] in application code. On the other hand, I'm
not convinced that my personal convenience, especially when it involves
a work-around for a very poorly designed interface for one particular
system, is really a good justification for a standards requirement.

> Why not do it the way it worked 10 years back if it does the job?

Well, 10 years back, I didn't have exceptions to contend with:-).

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


ka...@gabi-soft.fr

Feb 9, 2004, 7:46:39 PM
Francis Glassborow <fra...@robinton.demon.co.uk> wrote in message
news:<FrnA8HDa...@robinton.demon.co.uk>...

> In message <d6652001.0402...@posting.google.com>,
> ka...@gabi-soft.fr writes
> >Is it possible to use std::vector< unsigned char > as raw memory?
> >Or, more specifically, if I have a std::vector< unsigned char > v, is
> >&v[0] guaranteed to be aligned for all possible data types. (Since
> >an implementation is required ultimately to use operator new to
> >obtain the buffer, I can't see how it couldn't be in practice, but I
> >rather doubt that the standard gives me this guarantee, even
> >indirectly. But maybe this was the intent.)

> >For those who are curious: certain Posix functions require some very
> >strange memory tricks. The second parameter of readdir_r is the one
> >that's giving me the problems -- and the actual size needed isn't
> >known until runtime, because it depends on the filesystem where the
> >directory is hosted. So I need to allocate dynamically, and I need
> >RAII (and my compiler is too old to support Boost, so scoped_array
> >isn't an option).

> what I do not understand is why you do not just use operator new
> direct:

> unsigned char * ptr = (unsigned char *) operator new(requirement);

I could, but I'd then require a try/catch block, with an explicit call
to operator delete in both the normal branch and in the catch block.

This is probably what I will end up doing. The context is limited
enough that it isn't really a problem. I just liked the idea that all
uses of new[] could be subsumed by std::vector; until this case, it has
always been the case in my code.
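
The try/catch shape described above could look roughly like the following. This is only a sketch: `withRawBuffer` and the `use` callback are hypothetical names, not part of the code under discussion, and it assumes a compiler recent enough to accept any callable as `use`.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: obtains `size` bytes from operator new, hands them to
// `use`, and releases them on both the normal and the exceptional path.
template <typename Function>
void withRawBuffer(std::size_t size, Function use)
{
    assert(size > 0);
    void* raw = ::operator new(size);   // throws std::bad_alloc on failure
    try {
        use(static_cast<unsigned char*>(raw), size);
    } catch (...) {
        ::operator delete(raw);         // exceptional branch
        throw;                          // let the caller see the exception
    }
    ::operator delete(raw);             // normal branch
}
```

The duplication of the operator delete call is exactly the cost being weighed here against an RAII wrapper.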

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


ka...@gabi-soft.fr

Feb 9, 2004, 7:49:00 PM
Graeme Prentice <inv...@yahoo.co.nz> wrote in message
news:<o5u6209bnqp4rhhli...@4ax.com>...

> On 5 Feb 2004 06:45:05 -0500, ka...@gabi-soft.fr wrote:

> >Is it possible to use std::vector< unsigned char > as raw memory?
> >Or, more specifically, if I have a std::vector< unsigned char > v, is
> >&v[0] guaranteed to be aligned for all possible data types. (Since
> >an implementation is required ultimately to use operator new to
> >obtain the buffer, I can't see how it couldn't be in practice, but I
> >rather doubt that the standard gives me this guarantee, even
> >indirectly. But maybe this was the intent.)

> It does guarantee it. As you probably know, 20.4.1.1 requires
> operator new(size_t) to be used by the default allocator. 3.7.3.1
> para 2 requires the address returned by new to be suitably aligned so
> that it can be converted to a pointer of any complete object type.

This was my interpretation, but it seemed indirect enough that I wanted
a confirmation. Regretfully, Thomas Mang seems to have pointed out a
real flaw in the reasoning, so I think that unless someone from the
committee says otherwise, I'll have to do without.
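
Since the guarantee is in doubt, one pragmatic option is to check the alignment at run time before casting. The sketch below is an illustration only: `isAlignedFor` is a hypothetical name, it assumes a C++11 compiler (for `alignof` and `std::uintptr_t`, both newer than this thread), and it relies on the usual but implementation-defined mapping from pointers to integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if `p` satisfies the alignment requirement of T on this platform.
// Converting a pointer to std::uintptr_t is implementation-defined, but on
// common flat-address platforms the low bits of the result reflect alignment.
template <typename T>
bool isAlignedFor(const void* p)
{
    assert(p != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A check such as `assert( isAlignedFor< dirent >( &myBuffer[ 0 ] ) )` would catch a misaligned buffer during testing; it does not make the technique any more portable.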

> You also probably know that you can't use v[0] until there is at least
> one element in the vector even if memory has been allocated/reserved
> for one or more elements. One more thing, for the memory returned by
> operator new, the address of one past the end of that memory has to be
> a valid address because new has to allow for the fact that the memory
> might be being used for an array, which requires one past the end to
> be a valid address.

I know. The actual code in question is:

    BasicDirReader::BasicDirReader(
        std::string const& dirName )
        : myBuffer( sizeof( dirent )
                    + pathconf( dirName.c_str(), _PC_NAME_MAX ) )
    {
    }

    dirent*
    BasicDirReader::tryRead(
        DIR* dir )
    {
        dirent* result ;
        if ( readdir_r( dir,
                        reinterpret_cast< dirent* >( &myBuffer[ 0 ] ),
                        &result ) != 0 )
        {
            throw GeneralException(
                "DirReader::tryRead",
                "Erreur lors de la lecture du répertoire" ) ;
        }
        return result ;
    }

(myBuffer is a std::vector< unsigned char >, BasicDirReader is a base
class used to abstract the threaded/non-threaded dependent code from
DirReader, which is in turn a class used to abstract the system
dependent aspects of reading a directory. And quite frankly, interfaces
like that of readdir_r are enough to make one turn to Windows:-).)

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


"Daniel Krügler (nee Spangenberg)"

Feb 10, 2004, 10:51:54 AM
Good Morning, James!

ka...@gabi-soft.fr wrote:

>I'm not sure if this was intentional or not. (The standard, as
>published, very definitely allows non-contiguous implementations of
>vector. It also seems clear that this freedom was not intentional.)


I have the new C++ 2003 Standard in my hands and find in 23.2.4/p. 1 the
added statement:

"The elements of a vector are stored contiguously, meaning that if v is
a vector<T, Allocator> where T is some type other than bool, then it
obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size()."

As far as I remember from statements by other, much better informed
people, there have never been implementations of std::vector which
violated this requirement (is that true?).
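
The identity in the quoted wording can be exercised directly, and it is what makes handing &v[0] to a C-style interface legitimate. The following is a small sketch; `obeysContiguityIdentity`, `sumArray` and `sumVector` are made-up names used for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Checks the identity &v[n] == &v[0] + n for every valid n.
// (Not usable with T = bool, which the quoted wording excludes.)
template <typename T>
bool obeysContiguityIdentity(const std::vector<T>& v)
{
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (&v[n] != &v[0] + n) {
            return false;
        }
    }
    return true;
}

// Stand-in for a C-style interface that expects a plain array.
inline long sumArray(const int* values, std::size_t count)
{
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Passes the vector's storage to the C-style function.
inline long sumVector(const std::vector<int>& v)
{
    assert(!v.empty());   // &v[0] requires at least one element
    return sumArray(&v[0], v.size());
}
```

Contiguity says where element n lives relative to element 0; it says nothing about how element 0 itself is aligned for other types.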

Regrettably, this does not answer your alignment question, of course...

Greetings from Bremen,

Daniel

E. Mark Ping

Feb 10, 2004, 10:52:18 AM
In article <d6652001.0402...@posting.google.com>,

<ka...@gabi-soft.fr> wrote:
>Is it possible to use std::vector< unsigned char > as raw memory? Or,
>more specifically, if I have a std::vector< unsigned char > v, is &v[0]
>guaranteed to be aligned for all possible data types.

Yes. I hesitated to answer this, but I finally tracked down the post
which convinces me.

Do a google groups search with:

c++ technical corrigendum vector contiguous

and you'll find some good posts. The summary is:

1) The standard was intended to mean this.
2) Every known implementation does this.
3) Technical Corrigendum #1 now guarantees this.

Herb Sutter's post regarding #3:

http://groups.google.com/groups?q=c%2B%2B+technical+corrigendum+vector+contiguous&hl=en&lr=&ie=UTF-8&oe=utf-8&safe=off&scoring=d&selm=tbf7muk89v0vtlujrn2babrg2ugd5456af%404ax.com&rnum=7

Pete Becker's post on #1:
http://groups.google.com/groups?q=c%2B%2B+technical+corrigendum+vector+contiguous&hl=en&lr=&ie=UTF-8&oe=utf-8&safe=off&scoring=d&selm=4BD201.C1E164B0%40acm.org&rnum=9

Herb Sutter's post on #2:
http://groups.google.com/groups?q=c%2B%2B+technical+corrigendum+vector+contiguous&hl=en&lr=&ie=UTF-8&oe=utf-8&safe=off&selm=mnm9asoe8gkfhenb1nbbusep665r4ddb67%404ax.com&rnum=2
--
Mark Ping
ema...@soda.CSUA.Berkeley.EDU

Dylan Nicholson

Feb 10, 2004, 10:52:44 AM
ka...@gabi-soft.fr wrote in message news:<d6652001.04020...@posting.google.com>...

> Graeme Prentice <inv...@yahoo.co.nz> wrote in message
> news:<o5u6209bnqp4rhhli...@4ax.com>...
>
> > It does guarantee it. As you probably know, 20.4.1.1 requires
> > operator new(size_t) to be used by the default allocator. 3.7.3.1
> > para 2 requires the address returned by new to be suitably aligned so
> > that it can be converted to a pointer of any complete object type.
>
> This was my interpretation, but it seemed indirect enough that I wanted
> a confirmation. Regretfully, Thomas Mang seems to have pointed out a
> real flaw in the reasoning, so I think that unless someone from the
> committee says otherwise, I'll have to do without.
>
Surely you have some control over which platforms you're supporting?
I use the technique you've described all the time, and never had a
problem on at least 6 platforms (including AS/400!). It may not be
100% guaranteed to be portable, but in real-world application
development that usually isn't an overriding factor. In fact, what
the standard *does* guarantee is next to useless if the compilers and
environments that people actually use don't conform.

Dylan

Rob Williscroft

Feb 10, 2004, 12:24:28 PM
wrote in news:d6652001.04020...@posting.google.com:

[snip]

>
> I know. The actual code in question is:
>
>     BasicDirReader::BasicDirReader(
>         std::string const& dirName )
>         : myBuffer( sizeof( dirent )
>                     + pathconf( dirName.c_str(), _PC_NAME_MAX ) )
>     {
>     }
>
>     dirent*
>     BasicDirReader::tryRead(
>         DIR* dir )
>     {
>         dirent* result ;
>         if ( readdir_r( dir,
>                         reinterpret_cast< dirent* >( &myBuffer[ 0 ] ),
>                         &result ) != 0 )
>         {
>             throw GeneralException(
>                 "DirReader::tryRead",
>                 "Erreur lors de la lecture du répertoire" ) ;
>         }
>         return result ;
>     }
>
> (myBuffer is a std::vector< unsigned char >, BasicDirReader is a base
> class used to abstract the threaded/non-threaded dependent code from
> DirReader, which is in turn a class used to abstract the system
> dependent aspects of reading a directory. And quite frankly,
> interfaces like that of readdir_r are enough to make one turn to
> Windows:-).)


Why can't you use

    std::vector< dirent > myBuffer(
        2 + ( pathconf( dirName.c_str(), _PC_NAME_MAX ) / sizeof( dirent ) )
    );

Instead of std::vector< unsigned char > ?
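
The element count in that suggestion is a rounding calculation: enough whole elements to cover one struct plus the trailing bytes. A sketch of the arithmetic follows, using a stand-in struct `Entry` rather than the real dirent (an assumption, to keep the example free of system headers); `elementsFor` and `makeTailedBuffer` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for dirent (assumption: the real struct comes from <dirent.h>).
struct Entry {
    long inode;
    char name[1];   // the C interface writes past this, into the extra space
};

// Number of T elements whose storage covers one T plus `extraBytes` after it.
template <typename T>
std::size_t elementsFor(std::size_t extraBytes)
{
    return 1 + (extraBytes + sizeof(T) - 1) / sizeof(T);
}

// Buffer whose first element is aligned for T (allocate() must return
// storage aligned for T), with room for `extraBytes` trailing bytes.
template <typename T>
std::vector<T> makeTailedBuffer(std::size_t extraBytes)
{
    std::vector<T> buffer(elementsFor<T>(extraBytes));
    assert(buffer.size() * sizeof(T) >= sizeof(T) + extraBytes);
    return buffer;
}
```

This settles the alignment of the first element, because the allocator must align for the element type. Letting the C function write a name across several elements is still outside what ISO C++ defines, as is the interface itself.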


Rob.
--
http://www.victim-prime.dsl.pipex.com/

ka...@gabi-soft.fr

Feb 10, 2004, 2:43:43 PM
Maciej Sobczak <no....@no.spam.com> wrote in message
news:<bvvjc0$gka$1...@atlantis.news.tpi.pl>...

> ka...@gabi-soft.fr wrote:

> > Is it possible to use std::vector< unsigned char > as raw memory?

> I cannot answer your specific question (I can only share your
> assumptions that it is OK), but there is a place for spinning off a
> related question.

> Why do you want to use unsigned char as the underlying type? Is it any
> better than plain char when used as "raw memory" (where by "raw
> memory" I mean that the only later use will involve reinterpret casts
> or object copy).

In C++, no. At least not from a language standpoint, both are equally
good.

> Consider:

> 3.9/2 states that POD can be copied back and forth using array of char
> OR unsigned char, preserving its value.

The equivalent guarantee in C only holds for unsigned char. A signed
char (and thus, a plain char) is allowed to have trapping
representations.

I don't know the reasons for the difference between C and C++ here, but
for some reason, they are different.

> 3.9.1/1:
> "A char [...] AND unsigned char [...] have the same object representation."

For a very strange meaning of "object representation"; the text
immediately before this makes it plain that all that is meant is that
they have the same size and alignment requirements.

The key guarantee is just below (in the same paragraph): "For character
types, all bits of the object representation participate in the value
representation." In the C standard, the equivalent guarantee, for this
and for the memcpy, only holds for unsigned char. In the C++ standard,
the next sentence also gives food for thought: "For unsigned character
types, all possible bit patterns of the value representation represent
numbers." I'm not too sure what this is meant to mean: could a 1's
complement machine arrange for all 0's to be positive, and trap on a
negative 0? (But this would invalidate the memcpy guarantee, supposing
plain char was signed.)
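
For reference, the copy guarantee of 3.9/2 under discussion amounts to the following round trip. `Point` is a hypothetical POD chosen for the illustration; the same code with plain char in place of unsigned char is what the C++ wording also permits.

```cpp
#include <cassert>
#include <cstring>

struct Point { int x; double y; };   // hypothetical POD used for illustration

// Copies `in` into an array of unsigned char and back again; for a POD the
// result must hold the same value as the original.
inline Point roundTrip(const Point& in)
{
    unsigned char bytes[sizeof(Point)];
    std::memcpy(bytes, &in, sizeof(Point));
    assert(std::memcmp(bytes, &in, sizeof(Point)) == 0);   // sanity check
    Point out;
    std::memcpy(&out, bytes, sizeof(Point));
    return out;
}
```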

> There are other places where such properties are defined to be the
> same for both char and unsigned char.

> The only relevant place that shows some asymmetry is 3.9/4: "The
> object representation of an object of type T is the sequence of N
> unsigned char objects taken up by the object of type T, where N equals
> sizeof(T)."

> This would state that unsigned char is better than plain char for "raw
> memory" uses, but somehow I cannot believe it due to the 3.9.1/1 cited
> above.

I think, despite the one worrying sentence I quoted above, that char and
unsigned char are equally valid in C++. However:

- I knew C before I knew C++, and got into the habit of unsigned char,

- it also seems to be the prevailing usage.

The latter point is, IMHO, very important. I only use plain char for
characters; if I want a small integer, I will use either unsigned char
or signed char, and if I want raw memory, I use unsigned char.
Independently of what the language allows, this seems to correspond to a
long tradition, and thus communicates my intent better.

> It was my habit to use unsigned char, but I resigned from it and now
> consequently use char buffers when the "raw memory" is what I need. I
> just found it more consistent with various API functions, where
> pointer to char is expected as a buffer parameter.

The "normal" convention when passing a pointer to raw memory is to use
void*. But obviously, neither void[] nor std::vector<void> will work,
so some type conversions are necessary.

The actual type requested by the interface in my case is a dirent*.
However, the interface requires that there be some free bytes behind the
dirent -- I can't just pass it "new dirent".

Formally, the interface takes a struct ending with a VLA. But we don't
have VLA's in C++ yet. The correct solution would be to write a small
wrapper in C, and call it from C++. But I don't have access to a C
compiler, so I'm stuck. (And to be honest, the interface was designed
before VLA was adopted in C, and was intended to use some awful C
hackery, that is in fact undefined behavior in C, but happens to work on
most systems.)

> I will be glad to know your opinion on this.

Unless there are definite reasons for doing otherwise, stick with
tradition, and what the others do.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


tom_usenet

Feb 10, 2004, 3:04:46 PM
On 6 Feb 2004 15:25:15 -0500, Francis Glassborow
<fra...@robinton.demon.co.uk> wrote:

>In message <diu420ptq8pk1oju9...@4ax.com>, tom_usenet
><tom_u...@hotmail.com> writes
>>On 5 Feb 2004 06:45:05 -0500, ka...@gabi-soft.fr wrote:
>>
>> >Is it possible to use std::vector< unsigned char > as raw memory? Or,
>> >more specifically, if I have a std::vector< unsigned char > v, is &v[0]
>> >guaranteed to be aligned for all possible data types. (Since an
>> >implementation is required ultimately to use operator new to obtain the
>> >buffer, I can't see how it couldn't be in practice, but I rather doubt
>> >that the standard gives me this guarantee, even indirectly. But maybe
>> >this was the intent.)
>>
>>Looking at 20.4.1.1, std::allocator<T>::allocate allocates memory
>>suitably aligned for T, so memory returned by std::allocator<unsigned
>>char>::allocate is only guaranteed to have an alignment of 1.
>
>However see the requirement for the implementation provided operator new
>as given in 18.4.1.1 para 1 which requires that the returned pointer is
>to suitably aligned memory for any object of that size.

Yes, but std::allocator doesn't need to return the pointer from
operator new directly. e.g.

    template<>
    class allocator<unsigned char>
    {
    public:
        typedef unsigned char* pointer;
        typedef std::size_t    size_type;

        //...
        pointer allocate(size_type size, const void* hint = 0)
        {
            //ridiculous implementation, but conforming.
            return static_cast<pointer>(::operator new(size + 1)) + 1;
        }

        //...
        void deallocate(pointer p, size_type)
        {
            ::operator delete(p - 1); //ridiculous implementation
        }

        //...
    };

I see no requirement that std::allocator return the output from
operator new directly, and indeed, most don't, since they employ
memory pools to speed smaller allocations.
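
A less contrived illustration of the same point is a pool that carves one block from operator new into byte-sized pieces. `BytePool` below is a hypothetical sketch, not taken from any real library implementation. Only the first piece inherits the alignment of the block; later pieces start wherever the previous one ended.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical byte pool: one block from operator new, carved up with no
// padding. That is sufficient for unsigned char, whose alignment is 1.
class BytePool {
public:
    explicit BytePool(std::size_t capacity)
        : base_(static_cast<unsigned char*>(::operator new(capacity))),
          capacity_(capacity), used_(0) {}
    ~BytePool() { ::operator delete(base_); }

    unsigned char* allocate(std::size_t n)
    {
        assert(n <= capacity_ - used_);   // sketch: no fallback when full
        unsigned char* p = base_ + used_;
        used_ += n;
        return p;
    }

    // Offset of `p` from the suitably aligned start of the pool.
    std::size_t offsetOf(const unsigned char* p) const
    {
        return static_cast<std::size_t>(p - base_);
    }

private:
    BytePool(const BytePool&);              // non-copyable
    BytePool& operator=(const BytePool&);

    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_;
};
```

After a 3-byte allocation, the next piece begins 3 bytes past an aligned address, which is fine for unsigned char and useless as storage for a dirent.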

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html

ka...@gabi-soft.fr

Feb 11, 2004, 4:38:54 PM
ema...@soda.csua.berkeley.edu (E. Mark Ping) wrote in message
news:<c0a0b2$13lo$1...@agate.berkeley.edu>...

> >Is it possible to use std::vector< unsigned char > as raw memory?
> >Or, more specifically, if I have a std::vector< unsigned char > v,
> >is &v[0] guaranteed to be aligned for all possible data types.

> Yes. I hesitated to answer this, but I finally tracked down the post
> which convinces me.

> Do a google groups search with:

> c++ technical corrigendum vector contiguous

> and you'll find some good posts. The summary is:

> 1) The standard was intended to mean this.
> 2) Every known implementation does this.
> 3) Technical Corrigendum #1 now guarantees this.

I'm quite aware of technical corrigendum #1, and much of the debate that
went on around it.

Could you please quote where it guarantees alignment? All I'm aware of
is the guarantee concerning contiguity (and I wouldn't have even
suggested the idea without that guarantee).

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


ka...@gabi-soft.fr

Feb 11, 2004, 4:39:20 PM
Rob Williscroft <r...@freenet.REMOVE.co.uk> wrote in message
news:<Xns948B15FE99DABuk...@195.129.110.131>...
> wrote in news:d6652001.04020...@posting.google.com:

> [snip]

Interesting idea. I probably could. I pity the poor maintenance
programmer who will have to figure out what is going on in this case,
though. I hate it when I have to write comments explaining code (as
opposed to defining an interface).

In fact, the code in question was originally part of a larger class,
DirReader. Since I've factored it out into an isolated base class, and
it is now the only dynamic resource in the class, malloc/free will do
the trick quite well.
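
A base class that owns nothing but the one block can be very small. The sketch below uses a hypothetical `MallocBuffer` (the real class in question is BasicDirReader, whose full definition is not shown in this thread). malloc is specified to return storage suitably aligned for any object type, so the alignment question does not arise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical owner of a single malloc'ed block. Because it is the only
// resource in the class, the destructor alone is enough; no try block.
class MallocBuffer {
public:
    explicit MallocBuffer(std::size_t size)
        : data_(0), size_(size)
    {
        assert(size > 0);
        data_ = std::malloc(size);
        if (data_ == 0) {
            throw std::bad_alloc();
        }
    }
    ~MallocBuffer() { std::free(data_); }

    void* get() const { return data_; }
    std::size_t size() const { return size_; }

private:
    MallocBuffer(const MallocBuffer&);              // non-copyable
    MallocBuffer& operator=(const MallocBuffer&);

    void* data_;
    std::size_t size_;
};
```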

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


ka...@gabi-soft.fr

Feb 11, 2004, 4:39:45 PM
wizo...@hotmail.com (Dylan Nicholson) wrote in message
news:<7d428a77.0402...@posting.google.com>...

> > Graeme Prentice <inv...@yahoo.co.nz> wrote in message
> > news:<o5u6209bnqp4rhhli...@4ax.com>...

> > > It does guarantee it. As you probably know, 20.4.1.1 requires
> > > operator new(size_t) to be used by the default allocator.
> > > 3.7.3.1 para 2 requires the address returned by new to be
> > > suitably aligned so that it can be converted to a pointer of any
> > > complete object type.

> > This was my interpretation, but it seemed indirect enough that I
> > wanted a confirmation. Regretfully, Thomas Mang seems to have
> > pointed out a real flaw in the reasoning, so I think that unless
> > someone from the committee says otherwise, I'll have to do without.

> Surely you have some control over which platforms you're supporting?

Yes and no. The only machine this code will ever run on is a Sun Sparc,
under Solaris. And of course, since it is a Posix interface I'm trying
to cope with, I can exclude any non-Unix platforms entirely.

The problem, however, isn't the platform, but the compiler, or more
precisely, the library which comes with the compiler. And I can't
eliminate the possibility that the compiler might be upgraded at some
point in the future.

> I use the technique you've described all the time, and never had a
> problem on at least 6 platforms (including AS/400!). It may not be
> 100% guaranteed to be portable, but in real-world application
> development that usually isn't an overriding factor. In fact, what
> the standard *does* guarantee is next to useless if the compilers and
> environments that people actually use don't conform.

I think I mentioned that I couldn't logically think of a reasonable
implementation where it didn't hold. But I was curious. And I can't
think of everything -- Thomas Mang pointed out a possible implementation
which isn't really too impossible, and in which it wouldn't hold.

If there were no other solution, or the other solutions were excessively
complicated, I would still take the risk and use it. As it is, however,
I think I'll switch to a try block. Actually, since I initially
posted, I've isolated this part of the functionality into a separate
base class, so even the try block won't be necessary. But I can't
exclude the possibility of a similar problem arising in the future.
And, as I say, I'm curious.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


Scott Meyers

Feb 14, 2004, 7:29:23 AM
On 6 Feb 2004 05:15:21 -0500, Thomas Mang wrote:
> Does the Standard somewhere require vector to obtain its memory always via
> operator new?
>
> Wouldn't it be possible that class vector, especially when optimized for
> small classes (say, the built-in types), reserves internally some raw
> memory for a limited number of T-objects, to avoid allocating memory from
> the heap (the compiler could, of course, easily find out how this raw
> memory would have to be aligned at instantiation time)? I remember having
> heard of a string implementation (Intels?) that was optimized for small
> strings and had a charT[16] - array as data member, exactly to avoid
> operator new. Couldn't the same issue apply to vector?

I don't think so. Unlike strings, swapping two vectors must not invalidate
any iterators. Now consider swapping a small vector with a big one. The
formerly small vector must now be a big one, but the iterators into the
internal memory space for the small one must remain valid. I can't think
of a reasonable way to do that without all vector data being stored
external to the vector object itself.

The only reason the small string optimization is legal is that string
iterators come with more flexible invalidation guarantees than vector.
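
The swap requirement can be made concrete with a short sketch (`swapKeepsIterators` is a made-up name, and the default allocator is assumed). An iterator taken from one vector before the swap has to refer to the same element afterwards, which is now held by the other vector.

```cpp
#include <cassert>
#include <vector>

// After a.swap(b), an iterator that referred to the first element of `a`
// must still be valid and must now refer to the first element of `b`.
inline bool swapKeepsIterators(std::vector<int>& a, std::vector<int>& b)
{
    assert(!a.empty());
    std::vector<int>::iterator it = a.begin();
    const int* address = &a[0];
    a.swap(b);
    return it == b.begin() && address == &b[0];
}
```

If the elements of the small vector lived inside the vector object itself, their addresses could not follow them into the other object, which is the argument made above.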

Scott

Thomas Mang

Feb 14, 2004, 6:14:44 PM

Scott Meyers wrote:

> On 6 Feb 2004 05:15:21 -0500, Thomas Mang wrote:
> > Does the Standard somewhere require vector to obtain its memory always via
> > operator new?
> >
> > Wouldn't it be possible that class vector, especially when optimized for
> > small classes (say, the built-in types), reserves internally some raw
> > memory for a limited number of T-objects, to avoid allocating memory from
> > the heap (the compiler could, of course, easily find out how this raw
> > memory would have to be aligned at instantiation time)? I remember having
> > heard of a string implementation (Intels?) that was optimized for small
> > strings and had a charT[16] - array as data member, exactly to avoid
> > operator new. Couldn't the same issue apply to vector?
>
> I don't think so. Unlike strings, swapping two vectors must not invalidate
> any iterators. Now consider swapping a small vector with a big one. The
> formerly small vector must now be a big one, but the iterators into the
> internal memory space for the small one must remain valid. I can't think
> of a reasonable way to do that without all vector data being stored
> external to the vector object itself.

Yes, it seems the memory where the Ts are stored has to be external to the
vector. However, I don't find any requirement that it has to be obtained via the
allocate - function of the allocator.

For example, couldn't an implementation set aside some static raw memory (of
course, only aligned for type T - which is, in case of std::vector<unsigned
char>, probably the weakest alignment requirements) and try to make use of this
memory pool before it gets the memory from the allocator?
Does the Standard prohibit such an implementation somehow?


Two additional issues I'd like to bring up:


1)
I studied the requirements for the basic_string<>::swap member function.
To quote the Standard:
"Complexity: constant time".

Now back to the optimized basic_string implementation I talked about in my first
response to James (and, BTW, have read about in your Effective STL book). Is
this implementation legal at all? If it stores internally some raw memory for a
limited number of charT to avoid heap allocation, how can it fulfil the
constant-time swap requirement?
Using some static memory pool, yes, it could meet the requirement, but with raw
memory per instance?

2)
The bottom of page 485 contains note 248), which says:
"reserve() uses Allocator::allocate()..."

and we all know that the default - allocator uses ::operator new in its
allocate-member function(20.4.1.1).


So in order to force std::vector to get its memory from the allocator, one
simply has to write this:

    std::size_t const BufferSize(100 * sizeof(Foo)); // the size of our buffer
    std::vector<unsigned char> Vec;
    Vec.reserve(std::max(BufferSize, Vec.capacity() + 1));

If I interpret the Standard correctly, this should force memory allocation from
the heap.
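
Whether a given library really goes through the allocator in reserve() can be observed with a counting allocator. The sketch below is an illustration only: `CountingAllocator` and `allocationsDuringReserve` are hypothetical names, and the minimal allocator interface assumes a C++11 library, which is newer than this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Number of allocate() calls seen so far (a global, for brevity).
static std::size_t g_allocateCalls = 0;

// Minimal allocator that counts calls and forwards to operator new.
template <typename T>
struct CountingAllocator {
    typedef T value_type;

    CountingAllocator() {}
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n)
    {
        ++g_allocateCalls;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&)
{
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&)
{
    return false;
}

// Reserves more than the current capacity and reports how many times the
// vector called allocate() while doing so.
inline std::size_t allocationsDuringReserve(std::size_t bytes)
{
    std::vector<unsigned char, CountingAllocator<unsigned char> > v;
    const std::size_t before = g_allocateCalls;
    v.reserve(std::max(bytes, v.capacity() + 1));
    assert(v.capacity() >= bytes);
    return g_allocateCalls - before;
}
```

This shows what one implementation does, not what the Standard requires of all of them, and it does not address the alignment of what allocate() returns.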


However, I see 2 problems:

a) the essential part I quoted is written in a note. AFAIK (from this group,
BTW), notes are non-normative and as such of little use. However, I may well be
wrong with that assumption.

b) The memory returned by std::allocator<T>::allocate is, as others have
already pointed out, only guaranteed to be "aligned appropriately for objects
of type T", and not necessarily the return value of the call to ::operator new
(or some other address with alignment guarantees for any type).
I have no idea if this was really the intent of the Standard. In case it wasn't,
a defect report seems to be in order.
For non-portable code, however, this problem can be circumvented by
instantiating the vector not with unsigned char, but with a
"max_alignment_requirements" type.
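
Such a type is traditionally written as a union of the usual suspects. The sketch below is exactly the non-portable guess described above: `MaxAlign` and `makeAlignedBuffer` are hypothetical names, and the member list is an assumption about which types have the strictest alignment on a given platform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "maximally aligned" type: a union of the types likely to have
// the strictest alignment. The list is a guess, not a guarantee.
union MaxAlign {
    long        l;
    double      d;
    long double ld;
    void*       p;
    void      (*f)();
};

// Vector of MaxAlign covering at least `bytes` bytes. &result[0] is aligned
// for MaxAlign, and therefore for each of its member types.
inline std::vector<MaxAlign> makeAlignedBuffer(std::size_t bytes)
{
    assert(bytes > 0);
    return std::vector<MaxAlign>(
        (bytes + sizeof(MaxAlign) - 1) / sizeof(MaxAlign));
}
```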


>
>
> The only reason the small string optimization is legal is that string
> iterators come with more flexible invalidation guarantees than vector.

Yes, thank you for the information, but I still don't see how the string
implementation presented in your book can fulfill the swap-requirement.


regards,

Thomas

Balog Pal

Feb 15, 2004, 6:41:53 AM
<ka...@gabi-soft.fr> wrote in message
news:d6652001.04020...@posting.google.com...

> Actually, I suspect that if I just extract scoped_array from the rest of
> Boost (which I think the Boost licensing allows), it would probably
> work. I doubt that it is the part of the Boost library which will cause
> problems with my compiler. On the other hand, it means creating new
> objects in the source code control system, and a few other hassles; my
> predecessor has already done all this for std::vector:-).

That sounds interesting. So you have a copy of formerly std::vector as your
own vector in your codebase already? Then why not peek at that internalized
implementation, and possibly even respecify it to add the guarantee you
want?

> There is also a consideration for the person who will read my code.
> Presumably, he will know what std::vector is and does. The same thing
> is not necessarily true of a class I write.

I personally use a modified version of std::auto_ptr. It has the same set of
get/reset/release... functions, but supports a 'policy' for the deleter
function as the second template param. Defaulting to simple delete. With
some "builtin" policies for foreseen uses, like delete[], Release().
Certainly any client can write his own 1-liner to handle anything else -- and
keep the uniform interface, look, feel, whatever.
I have other variants on the theme, several RAII classes for all kinds of
system objects, and those also have the same functions as auto_ptr.
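
A holder along those lines might be sketched as follows. All names here (`ScopedHolder`, `DeleteObject`, `DeleteArray`, `FreeMemory`) are invented for the illustration and are not the actual class being described; the policy is assumed to expose a static `release` function.

```cpp
#include <cassert>
#include <cstdlib>

// Deleter policies (hypothetical names): each exposes a static release().
struct DeleteObject {
    template <typename T> static void release(T* p) { delete p; }
};
struct DeleteArray {
    template <typename T> static void release(T* p) { delete[] p; }
};
struct FreeMemory {
    static void release(void* p) { std::free(p); }
};

// auto_ptr-like holder whose second parameter chooses how to release.
template <typename T, typename Deleter = DeleteObject>
class ScopedHolder {
public:
    explicit ScopedHolder(T* p = 0) : ptr_(p) {}
    ~ScopedHolder() { if (ptr_) Deleter::release(ptr_); }

    T* get() const { return ptr_; }
    T* release() { T* p = ptr_; ptr_ = 0; return p; }
    void reset(T* p = 0)
    {
        assert(p == 0 || p != ptr_);   // resetting to the held pointer is a bug
        if (ptr_) Deleter::release(ptr_);
        ptr_ = p;
    }

private:
    ScopedHolder(const ScopedHolder&);              // non-copyable
    ScopedHolder& operator=(const ScopedHolder&);

    T* ptr_;
};
```

With the `FreeMemory` policy, `ScopedHolder<unsigned char, FreeMemory>` would own a malloc'ed block, which comes close to what the readdir_r buffer needs.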

So I see no reason you couldn't pick any of your in-house interfaces that fits
the role, and create that magic class to be like them. [Also, knowing the
interface to vector is probably an advantage, but it's probably
counterbalanced if vector is used more as a hack. :-]

> Of course, if the std::vector solution doesn't work, a try/catch block
> will involve the least work for me. For this one case, of course.

Actually I had in mind a solution that avoided that try/catch -- the 'tailed
object' or whatever its regular name was. When the last member of the struct
was a char[1] and the space was allocated to have the needed amount of
characters. Even sounds like a pretty good candidate for a template :) I
just wonder why I never created one.

> I also thought that the question would be of more general interest.

Sure it is an interesting one ;-) I was merely poking.

> One
> of the motivations behind the requirement for contiguity in std::vector,
> if not THE motivation, was support for cases where C style arrays were
> being used (legacy and C interfaces, for example). I stumbled on a case
> where a C style interface was being used that didn't seem to be covered.

And it certainly can be used as such. The problem of alignment is the same
if you use some array of bytes and decide to use it as an array of doubles.
And the solution is similar too: use vector<double> instead of vector<char>,
and at the cost of a few possibly wasted bytes you get your alignment.
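
A small sketch of that suggestion, assuming the buffer only needs to be aligned as strictly as double. The helper names doubles_for and make_double_aligned_buffer are made up for the example. Note that this gives alignment for double, not necessarily for every possible type.

```cpp
#include <cstddef>
#include <vector>

// Number of doubles needed to cover `bytes` bytes, rounded up.
inline std::size_t doubles_for(std::size_t bytes) {
    return (bytes + sizeof(double) - 1) / sizeof(double);
}

// Hypothetical helper: a buffer of at least `bytes` bytes whose start is
// aligned for double, simply because the element type really is double.
inline std::vector<double> make_double_aligned_buffer(std::size_t bytes) {
    // Keep at least one element so that &buf[0] is always valid.
    return std::vector<double>(bytes ? doubles_for(bytes) : 1);
}
```

The raw address would then be taken as `void* raw = &buf[0];`, wasting at most sizeof(double) - 1 bytes at the end.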

> Anyhow, given the point made by Thomas Mang, I think that std::vector is
> out. Should it be? To be frank, I don't know.

You mean the byte vector is out.

> It would be very
> convenient for me if I could use it, and would fit into the argument
> that we don't need new[] in application code.

On the alternative solutions -- why would you need it? I grab raw memory
using 'operator new()', not new char[], and it follows the same error
handling as new. And for a nonthrowing version, malloc/realloc wins hands
down.

> > Why not do it the way it worked 10 years back if it does the job?
>
> Well, 10 years back, I didn't have exceptions to contend with:-).

That is almost true, but for some reason I see no real extra problems
brought in by exceptions in this case -- sure, I'm used to working with my smart
ptrs and other RAII objects (for which I create classes without hesitation
if some really new resource appears).

Paul

Scott Meyers

Feb 15, 2004, 7:03:28 AM
On 14 Feb 2004 18:14:44 -0500, Thomas Mang wrote:
> Yes, it seems the memory where the Ts are stored has to be external to the
> vector. However, I don't find any requirement that it has to be obtained via the
> allocate - function of the allocator.
>
> For example, couldn't an implementation set aside some static raw memory (of
> course, only aligned for type T - which is, in case of std::vector<unsigned
> char>, probably the weakest alignment requirements) and try to make use of this
> memory pool before it gets the memory from the allocator?

If I understand you correctly, you are suggesting that vector<T> might set
aside a chunk of static memory to use in lieu of heap allocation when it
can. Wow. So instead of worrying only about heap fragmentation (or, more
accurately, letting its allocator do the worrying), vector would also have
to worry about pseudo-heap fragmentation. Bizarre, and I'd call it
contrary to the spirit of the standard. But, yeah, I guess it's
technically legal.

> I studied the requirements for the basic_string<>::swap member function.
> To quote the Standard:
> "Complexity: constant time".
>
> Now back to the optimized basic_string implementation I talked about in
> my first response to James (and, BTW, have read about in your Effective
> STL book). Is this implementation legal at all? If it stores internally
> some raw memory for a limited number of charT to avoid heap allocation,
> how can it fulfil the constant time swap requirement?

The size of the per-instance buffer is fixed (i.e., constant), so copying
data to/from it is constant time. (The complexity is linear in the size of
the buffer, but it's still considered constant, because the size of the
buffer is constant.) If the internal buffer is insufficient, we just move
a pointer, which is also constant time. So I don't see why swap can't be
performed in constant time.
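
To make the argument concrete, here is a toy small-buffer string. It is a hypothetical sketch, far simpler than any real basic_string, and the name TinyString and the capacity kSmall are invented for the example. The point is only that swap() does the same fixed amount of work whatever the current lengths are.

```cpp
#include <cstddef>
#include <cstring>

// Toy small-buffer string: short contents live in small_, longer ones on
// the heap. Not a real basic_string implementation.
class TinyString {
public:
    static const std::size_t kSmall = 16;  // fixed per-instance capacity

    explicit TinyString(const char* s) : size_(std::strlen(s)), heap_(0) {
        std::memset(small_, 0, kSmall);
        if (size_ < kSmall) {
            std::memcpy(small_, s, size_ + 1);
        } else {
            heap_ = new char[size_ + 1];
            std::memcpy(heap_, s, size_ + 1);
        }
    }
    ~TinyString() { delete[] heap_; }

    const char* c_str() const { return heap_ ? heap_ : small_; }
    std::size_t size() const { return size_; }

    // Always copies kSmall bytes three times and swaps two scalars: the
    // cost depends on the constant kSmall, never on size().
    void swap(TinyString& other) {
        char tmp[kSmall];
        std::memcpy(tmp, small_, kSmall);
        std::memcpy(small_, other.small_, kSmall);
        std::memcpy(other.small_, tmp, kSmall);

        std::size_t n = size_; size_ = other.size_; other.size_ = n;
        char* h = heap_; heap_ = other.heap_; other.heap_ = h;
    }

private:
    TinyString(const TinyString&);             // declared, not implemented
    TinyString& operator=(const TinyString&);  // declared, not implemented

    std::size_t size_;
    char* heap_;
    char small_[kSmall];
};
```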

> a) the essential part I quoted is written in a note. AFAIK (from this group,
> BTW), notes are non-normative and such of little use. However, I may well be
> wrong with that assumption.

Geez, the people who like to point out that distinction used to only loiter
around comp.std.c++. When did that gap get breached?

> Yes, thank you for the information, but I still don't see how the string
> implementation presented in your book can fulfill the swap-requirement.

FWIW, while it's not inaccurate to say that my book "presents" that
implementation (among others), I think it'd be better to say that it
"describes" the implementation. I don't invent or propose any string
implementations of my own in Effective STL. I simply describe four that
happened to commonly exist at the time I wrote the book.

Scott

E. Mark Ping

Feb 16, 2004, 7:38:09 AM
In article <d6652001.04021...@posting.google.com>,

<ka...@gabi-soft.fr> wrote:
>Could you please quote where it guarantees alignment? All I'm aware
>of is the guarantee concerning contiguity (and I wouldn't have even
>suggested the idea without that guarantee).

Sadly (hangs head in shame) I completely missed the alignment issue.
I thought the concern was whether it was safe to use the allocated
memory on all platforms for now and in the future as well. Hence my
comments about current implementations, etc.
--
Mark Ping
ema...@soda.CSUA.Berkeley.EDU

ka...@gabi-soft.fr

Feb 16, 2004, 6:23:41 PM
"Balog Pal" <pa...@lib.hu> wrote in message
news:<402e...@andromeda.datanet.hu>...

> <ka...@gabi-soft.fr> wrote in message
> news:d6652001.04020...@posting.google.com...

> > Actually, I suspect that if I just extract scoped_array from the
> > rest of Boost (which I think the Boost licensing allows), it would
> > probably work. I doubt that it is the part of the Boost library
> > which will cause problems with my compiler. On the other hand, it
> > means creating new objects in the source code control system, and a
> > few other hassles; my predecessor has already done all this for
> > std::vector:-).

> That sounds interesting. So you have a copy of formerly std::vector as
> your own vector in your codebase already? Then why not peek that
> internalized implementation, and possibly even respecify it to add the
> guarantee you want?

The copy of std::vector is there because one of the compilers we support
doesn't support the STL yet. It is considered a temporary palliative,
and one that we'd like to get rid of.

> > There is also a consideration for the person who will read my code.
> > Presumably, he will know what std::vector is and does. The same
> > thing is not necessarily true of a class I write.

> I personally use a modified version of std::auto_ptr. It has the same
> set of get/reset/release... functions, but supports a 'policy' for the
> deleter function as the second template param. Defaulting to simple
> delete. With some "builtin" policies for previsioned use, like
> delete[], Release(). Certainly any client can write his own 1-liner to
> handle anything else -- and keep the uniform interface, look, feel
> whatever. I have other variants on the theme, several RAII classes
> for all kinds of system objects, and those also have the same functions
> as auto_ptr.

> So I see no reason you couldn't pick any of your in-house interfaces
> fit the role, and create that magic class to be like them. [Also,
> knowing interface to vector is probably an advantage, but it's
> probably counterbalanced if vector is used more like a hackery. :-]

Every new class is something additional for a maintenance programmer to
learn. It has a price. The question is: is the price worth it? If the
standard class can do the job just as well, the answer is almost
certainly no. If there is no standard class which does anything
remotely similar, the answer is a definite yes. In between...

> > Of course, if the std::vector solution doesn't work, a try/catch
> > block will involve the least work for me. For this one case, of
> > course.

> Actually I had in mind solution that avoided that try/catch -- the
> 'tailed object' or what was its regular name. When the last member of
> the struct was a char[1] and the space was allocated to have the
> needed amount of characters. Even sounds like a pretty good candidate
> for a template :) I just wonder why I never created one such.

Perhaps because it is illegal -- undefined behavior according to the C
and the C++ standards.

In fact, that is exactly what I am trying to do, but adapted to use
RAII. Illegal or not, the interface in question requires this.

> > I also thought that the question would be of more general interest.

> Sure it is an interesting one ;-) I was merely poking.

> > One of the motivations behind the requirement for contiguity in
> > std::vector, if not THE motivation, was support for cases where C
> > style arrays were being used (legacy and C interfaces, for example).
> > I stumbled on a case where a C style interface was being used that
> > didn't seem to be covered.

> And it certainly can be used as such. The problem of alignment is the
> same if you use some array of bytes and decide to use it as array of
> doubles. And the solution is similar too: use vector<double> instead
> of vector<char> and at a cost of a few possibly wasted bytes you get
> your alignment.

> > Anyhow, given the point made by Thomas Mang, I think that std::vector is
> > out. Should it be? To be frank, I don't know.

> You mean the byte vector is out.

> > It would be very convenient for me if I could use it, and would fit
> > into the argument that we don't need new[] in application code.

> On the alternative solutions -- why would you need it? I grab raw
> memory using 'operator new()' not new char[]. and follows the same
> error handling as new. And for nonthrowing version malloc/realloc
> wins hands down.

The difference is that std::vector has a destructor, so the memory is
automatically freed regardless of what happens elsewhere (exceptions,
etc.)
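
A small sketch illustrating the point. It assumes a C++11 or later library, because it uses a minimal counting allocator (CountingAlloc and live_blocks are names invented here) purely to observe that the vector's storage is returned when an exception unwinds the stack.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Number of blocks currently handed out by CountingAlloc (illustration only).
inline int& live_blocks() { static int n = 0; return n; }

// Hypothetical minimal allocator that only counts allocate/deallocate calls.
template <class T>
struct CountingAlloc {
    typedef T value_type;
    CountingAlloc() {}
    template <class U> CountingAlloc(const CountingAlloc<U>&) {}
    T* allocate(std::size_t n) {
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        ++live_blocks();
        return p;
    }
    void deallocate(T* p, std::size_t) {
        ::operator delete(p);
        --live_blocks();
    }
};
template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// Allocates a byte buffer, then fails; no explicit cleanup code is written.
inline void fill_then_fail(std::size_t bytes) {
    std::vector<unsigned char, CountingAlloc<unsigned char> > buf(bytes);
    buf[0] = 42;  // pretend to use the buffer (requires bytes > 0)
    throw std::runtime_error("simulated failure");
}  // buf's destructor runs during stack unwinding

// Returns the number of blocks still allocated after the failure.
inline int blocks_left_after_failure() {
    try { fill_then_fail(256); } catch (const std::runtime_error&) {}
    return live_blocks();
}
```

With a bare `operator new` or malloc the same function would need a try/catch block or a dedicated guard object to avoid leaking the buffer.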

> > > Why not do it the way it worked 10 years back if it does the job?

> > Well, 10 years back, I didn't have exceptions to contend with:-).

> That is almost true, but for some reason I see no real extra problems
> brought in by exceptions for the case -- sure I'm used to work with my
> smart ptrs and other RAII objects. (for which I create classes without
> hesitation if some really new resource appears.)

The problem is that to manage the resource, I must create a class.
Unless an existing class will do the trick. I had hopes that
std::vector might be that class.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


Thomas Mang

Feb 17, 2004, 6:30:32 AM

Scott Meyers schrieb:

> On 14 Feb 2004 18:14:44 -0500, Thomas Mang wrote:
> > Yes, it seems the memory where the Ts are stored has to be external to the
> > vector. However, I don't find any requirement that it has to be obtained via the
> > allocate - function of the allocator.
> >
> > For example, couldn't an implementation set aside some static raw memory (of
> > course, only aligned for type T - which is, in case of std::vector<unsigned
> > char>, probably the weakest alignment requirements) and try to make use of this
> > memory pool before it gets the memory from the allocator?
>
> If I understand you correctly, you are suggesting that vector<T> might set
> aside a chunk of static memory to use in lieu of heap allocation when it
> can. Wow. So instead of worrying only about heap fragmentation (or, more
> accurately, letting its allocator do the worrying), vector would also have
> to worry about pseudo-heap fragmentation. Bizarre, and I'd call it
> contrary to the spirit of the standard. But, yeah, I guess it's
> technically legal.

It might sound bizarre in the general case, but always keep the as-if-rule in mind. A
really very smart optimizer and code analyzer could use such a memory pool to gain both
space and runtime savings - it just depends on the case in question, and the
quality/possibilities of the compiler/optimizer.


BTW, I am not even sure anymore that raw memory (in case it is being used) has to be
external to the vector, when the operations performed on the vector never cause any
conflicts with the Standard.

Take this simple snippet as example:

{
    std::vector<int> Vec(3);

    for (std::size_t i(3); i > 0; --i)
    {
        Vec[i-1] += i;
        Vec.push_back(i * 2);
    }

    print(Vec);
}

Here the compiler/optimizer could figure out that the vector holds a maximum of 6 ints,
does no swapping operation, etc.
I think the compiler would be allowed, under the as-if-rule, to use a vector
implementation optimized for such cases, where the vector holds internally some memory
(most likely a simple int[6] in our case) and uses this, instead of requesting memory
from the heap via the allocator.


regards,

Thomas

Paavo Helde

Feb 17, 2004, 4:41:43 PM
ka...@gabi-soft.fr wrote in message
news:<d6652001.04021...@posting.google.com>...

> Every new class is something additional for a maintenance programmer to
> learn. It has a price. The question is: is the price worth it? If the

[...]

> The problem is that to manage the resource, I must create a class.
> Unless an existing class will do the trick. I had hopes that
> std::vector might be that class.

If the goal is to throw in RAII without creating any new classes, then
I would go for Alexandrescu's ScopeGuard and a couple of one-liners to
convert delete operators into function calls. ScopeGuard is pure RAII
and can be applied to any resource. Of course this pays off only if
you need it in several situations. In particular, ScopeGuard comes
in very handy when making old C-style code exception-safe.

#include <ScopeGuard/ScopeGuard.h>
#include <stdlib.h>   // malloc, free

// Function object wrapping delete[], so it can be handed to ON_BLOCK_EXIT.
template <class T>
struct DeleteArr {
    void operator()(T *t) const { delete[] t; }
};

int main() {
    unsigned char* buffer = new unsigned char[10000];
    ON_BLOCK_EXIT(DeleteArr<unsigned char>(), buffer);

    void* other_resource = malloc(1024);
    ON_BLOCK_EXIT(free, other_resource);

    // ....

}
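
For readers without that header at hand, the mechanism behind such a guard can be sketched in a few lines. This is a much-reduced, hypothetical stand-in, not the ScopeGuard library itself: BlockExit, counting_free and released are names invented for the example, and the real library offers considerably more (macros, several arities, member functions).

```cpp
#include <cstdlib>

// Hypothetical reduced guard: calls fn(arg) when it goes out of scope,
// unless dismiss() was called first.
template <class Fn, class Arg>
class BlockExit {
public:
    BlockExit(Fn fn, Arg arg) : fn_(fn), arg_(arg), active_(true) {}
    ~BlockExit() { if (active_) fn_(arg_); }
    void dismiss() { active_ = false; }

private:
    BlockExit(const BlockExit&);             // declared, not implemented
    BlockExit& operator=(const BlockExit&);  // declared, not implemented
    Fn fn_;
    Arg arg_;
    bool active_;
};

// Counter and wrapper used only to observe that the cleanup ran.
inline int& released() { static int n = 0; return n; }
inline void counting_free(void* p) { std::free(p); ++released(); }

// Returns how many times the cleanup function ran for one guarded block.
inline int guard_demo() {
    released() = 0;
    {
        void* p = std::malloc(1024);
        BlockExit<void (*)(void*), void*> guard(counting_free, p);
        // ... work that may throw goes here ...
    }  // guard's destructor frees p
    return released();
}
```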


Paavo

Balog Pal

Feb 18, 2004, 5:55:12 AM
<ka...@gabi-soft.fr> wrote in message
news:d6652001.04021...@posting.google.com...

> > Actually I had in mind solution that avoided that try/catch -- the
> > 'tailed object' or what was its regular name. When the last member of
> > the struct was a char[1] and the space was allocated to have the
> > needed amount of characters. Even sounds like a pretty good candidate
> > for a template :) I just wonder why I never created one such.
>
> Perhaps because it is illegal -- undefined behavior according to the C
> and the C++ standards.

Really? What part makes it UB? I thought C-style array indexing is specified
to behave like simple pointer math. And an expression using the pointer math
there will address a byte that is properly allocated. So how is it
undefined behavior?
And another question: if it is indeed illegal, is there a known scenario
where it can actually break?

For reference, I am talking about this kind of thing:

#include <stddef.h>   // offsetof
#include <string.h>   // memcpy
#include <new>        // operator new, operator delete

typedef unsigned char BYTE;
typedef int tQlen;
struct Qdata
{
    // additional members can be added here

    tQlen len;
    BYTE data[1]; // rest follows!

    inline tQlen GetLen() const {return len;}
    inline BYTE * GetData() {return &data[0];}

    static Qdata * Alloc(tQlen len)
    {
        Qdata * p = (Qdata *) operator new(len + offsetof(Qdata, data));
        if(p) // to work in both throwing and 0-returning environments
            p->len = len;
        return p;
    }
    static void Free(Qdata * p)
    {
        operator delete(p);
    }

    static Qdata * CreateNew(const void * data, tQlen len)
    {
        //ASSERT(len > 0);
        Qdata * p = Alloc(len);
        if(p)
        {
            memcpy(p->data, data, len);
        }
        return p;
    }
private: //cctor, op= declared, nonimplemented
};


Paul

ka...@gabi-soft.fr

Feb 19, 2004, 8:12:31 PM
"Balog Pal" <pa...@lib.hu> wrote in message
news:<4032...@andromeda.datanet.hu>...

> <ka...@gabi-soft.fr> wrote in message
> news:d6652001.04021...@posting.google.com...
> > > Actually I had in mind solution that avoided that try/catch -- the
> > > 'tailed object' or what was its regular name. When the last
> > > member of the struct was a char[1] and the space was allocated to
> > > have the needed amount of characters. Even sounds like a pretty
> > > good candidate for a template :) I just wonder why I never created
> > > one such.

> > Perhaps because it is illegal -- undefined behavior according to the
> > C and the C++ standards.

> Really? What part makes it UB? I thought C-style array indexing is
> ordered to behave like simple pointer math. And expression using the
> pointer math there will address a byte that is properly allocated. So
> how it is undefined behavior?

The start address is in the array of the struct, an array with one
element. Pointer math is only defined within a given array (and one
past the end). §5.7/5 (last sentence): "If both the pointer operand and
the result point to elements of the same array object, or one past the
last element of the array object, the evaluation shall not produce an
overflow; otherwise the behavior is undefined." The exact same words are
in §6.5.6/8 of the C standard, and (from memory, I don't have my copy
handy) were present in the original C standard.
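
A tiny illustration of the rule as quoted, for an ordinary array. The function name one_past_end_distance is made up for the example, and the undefined case is shown only as a comment, since it cannot be demonstrated meaningfully in running code.

```cpp
#include <cstddef>

// Pointer arithmetic that stays within a[] or lands one past its end is
// well defined; anything beyond that is not.
inline std::ptrdiff_t one_past_end_distance() {
    char a[4] = {'a', 'b', 'c', 'd'};
    char* first = &a[0];
    char* end = first + 4;      // OK: one past the last element (must not be dereferenced)
    // char* bad = first + 5;   // undefined: neither inside a[] nor one past its end
    return end - first;         // OK: both pointers belong to the same array
}
```

In the struct under discussion the array has a single element, so by the same reading only `data + 0` and `data + 1` are valid pointer values, and only the first may be dereferenced.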

> And another question: if it is indeed illegal, is there a known
> scenario when it can actually break?

Sure. Anytime the compiler uses fat pointers and does bounds checking.
I believe that CenterLine once sold such a compiler.

It was the expressed intent of at least some of the authors of the
original C standard that an implementation with full bounds checking
using fat pointers be legal.

> For the reference I talk about this kind of thingie:

> typedef unsigned char BYTE;
> typedef int tQlen;
> struct Qdata
> {
> // additional members can be added here
>
> tQlen len;
> BYTE data[1]; // rest follows!

> inline tQlen GetLen() const {return len;}
> inline BYTE * GetData() {return &data[0];}

Note here that the resulting BYTE* points into the array data. Any
attempt to add more than one to it is undefined behavior, as is any
attempt to dereference GetData() + 1.

As mentioned above, there has been at least one compiler where such
things were checked, and caused a run-time error.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


Val. Creux

Feb 20, 2004, 6:42:23 AM
"Balog Pal" <pa...@lib.hu> wrote in message news:<4032...@andromeda.datanet.hu>...

> <ka...@gabi-soft.fr> wrote in message
> news:d6652001.04021...@posting.google.com...
> > > Actually I had in mind solution that avoided that try/catch -- the
> > > 'tailed object' or what was its regular name. When the last member of
> > > the struct was a char[1] and the space was allocated to have the
> > > needed amount of characters. Even sounds like a pretty good candidate
> > > for a template :) I just wonder why I never created one such.
> >
> > Perhaps because it is illegal -- undefined behavior according to the C
> > and the C++ standards.
>
> Really? What part makes it UB? I thought C-style array indexing is ordered
> to behave like simple pointer math. And expression using the pointer math
> there will address a byte that is properly allocated. So how it is
> undefined behavior?

Because you declare an array of one element, and trying to access more than
this one element via the array has always been UB. The fact that you
have allocated memory is irrelevant.

The C (not C++) standard has created a different, specific syntax to
authorize this "struct hack" (ISO/IEC 9899:1999+TC1: 6.7.2.1p16).

Val.

Dave Harris

Feb 20, 2004, 8:45:29 AM
pa...@lib.hu (Balog Pal) wrote (abridged):

> > > Actually I had in mind solution that avoided that try/catch --
> > > the 'tailed object' or what was its regular name. When the
> > > last member of the struct was a char[1] and the space was
> > > allocated to have the needed amount of characters.
>
> > [...] it is illegal -- undefined behavior according to the C
> > and the C++ standards.
>
> Really? What part makes it UB?

I don't have a citation, but it's easy to see how it could go wrong
if the class had a virtual function:

struct Demo {
    char data1[1];
    virtual void method();
    char data2[1];
};

Although many current implementations put a vtable pointer at the start
of the struct (ie before data1), I knew of one which would put it
after the first virtual function (ie between data1 and data2), and they
are allowed to put it at the end (ie after data2). So:

demo.data2[1] = 0;

could overwrite the vtable pointer.

Even if the class has no virtual functions, the compiler is allowed to
add a vtable anyway. I think it is also allowed to add debugging aids
like:

struct Demo {
    byte __guard1;
    char data1[1];
    byte __guard2;
    char data2[1];
    byte __guard3;
};

where the __guard bytes are initialised to some known value (eg 0xcd)
and checked to detect bugs like:

demo.data1[1] = 0;

I believe this is allowed even if Demo is a POD.

The standard tries not to restrict implementation freedom unnecessarily,
and these things could be valuable.

-- Dave Harris, Nottingham, UK

Francis Glassborow

Feb 20, 2004, 3:42:44 PM
In message <d6652001.04021...@posting.google.com>,
ka...@gabi-soft.fr writes

>Note here that the resulting BYTE* points into the array data. Any
>attempt to add more than one to it is undefined behavior, as is any
>attempt to dereference GetData() + 1.
>
>As mentioned above, there has been at least one compiler where such
>things were checked, and caused a run-time error.

Jensen & Partners TopSpeed C++ compiler had optional array bounds
checking back in about 1990. Tragically this excellent product got lost
when the TopSpeed languages got sold on to some database specialist who
was largely interested in its smart linker.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Balog Pal

Feb 21, 2004, 6:01:30 AM
<ka...@gabi-soft.fr> wrote in message
news:d6652001.04021...@posting.google.com...
> > > Perhaps because it is illegal -- undefined behavior according to the
> > > C and the C++ standards.
>
> > Really? What part makes it UB?

> The start address is in the array of the struct, an array with one
> element. Pointer math is only defined within a given array (and one
> past the end).

I'm certainly aware of that one. But we actually have a bigger array at that
place, so pointer math should work for that object. Instead of arguments,
let's see some variants.
[please assume correct syntax where I mistype something. :]

void * pObj = operator new(100); // alloc 100 bytes

template<int N>
struct Arr
{
unsigned char data[N];
};

Arr<1> * p1 = reinterpret_cast<Arr<1> *>(pObj);
Arr<2> * p2 = reinterpret_cast<Arr<2> *>(pObj);
//... etc up to 100

I possibly forgot to mention in the previous post that I assumed the
implementation defines reinterpret_cast to work correctly in all those
situations where the underlying memory exists and alignment is observed.
So we can use p1->data[0], p2->data[1], ... p100->data[99] all within
defined behavior. (?)

Now let's have

unsigned char * pc1 = &p1->data[0];
unsigned char * pc2 = &p2->data[0];
....
unsigned char * pc100 = &p100->data[0];

pc100[0] and pc100[99] all shall address the correct byte.

But what about the other pointers? Are they different? All those pointers
are of the same type -- pointer to unsigned char, and point to the same byte
in memory. Are they allowed to carry extra info besides that? Are they
allowed to remember their history? To behave differently?

The example in the previous post could use an extra reinterpret_cast to
formally change the type from that char[1] to a char[many], and let it decay
to a simple pointer only after that.


> §5.7/5 (last sentence): "If both the pointer operand and
> the result point to elements of the same array object, or one past the
> last element of the array object, the evaluation shall not produce an
> overflow; otherwise the behavior is undefined."

And that IS the case as long as we stay within the object -- and the
object, the region of storage, is what we originally got from op new. We
got the char[1] thing with a reinterpret_cast in the first place.

> Sure. Anytime the compiler uses fat pointers and does bounds checking.

And is that legal and standard-conforming?
[though I think such an implementation can hardly give that reinterpret_cast
guarantee I built the theory on.]

> I believe that CenterLine once sold such a compiler.

That is really interesting. So having
char c[100];
char * p = &c[50];

produced a pointer that remembered it could go 50 back and forward? And how
did casts work there? In particular, could I cast some other char * to pick up
identical bounds info?

> It was the expressed intent of at least some of the authors of the
> original C standard that an implementation with full bounds checking
> using fat pointers be legal.

And I guess the embedded folk stood up yelling 'over my dead body' :)

The intent is a good one, but I don't think it fits the practice and the
ways of C and C++.
Pointer math should not be used at all in a wide range of applications; there,
bounds checking is irrelevant. In the rest it is either handy or necessary,
but it very often means some hackery or dealing with the metal, where bounds
checks are more in the way than helping.

Okay, too bad that in the REAL world there are yet other situations, and we get
all those buffer overruns. The most practical question: was that
CenterLine solution good enough to catch all possible buffer overruns? If
so, it is really a big achievement, one that would make me switch sides.

> > BYTE data[1]; // rest follows!

> > inline BYTE * GetData() {return &data[0];}
>
> Note here that the resulting BYTE* points into the array data. Any
> attempt to add more than one to it is undefined behavior, as is any
> attempt to dereference GetData() + 1.

But that can be prevented with an extra cast, can't it?

Paul

Balog Pal

Feb 21, 2004, 8:44:13 PM
"Dave Harris" <bran...@cix.co.uk> wrote in message
news:memo.20040219201012.396B@brangdon.m...

> I don't have a citation, but it's easy to see how it could go wrong
> if the class had a virtual function:

This whole thing is for PODs, and the example would break right at the
beginning if the struct had a vtable. Casting raw memory to a non-POD is
definitely a bad idea. ;-)

[For the record, I mistakenly edited a private section into my previous post --
when I edited down the example I disregarded my own comment in the original
code stating 'keep this thing POD'. ]

> struct Demo {
> char data1[1];
> virtual void method();
> char data2[1];
> };
>
> Although many current implementations put a vtable pointer at the start
> of the struct (ie before data1), I knew of one which would put it
> after the first virtual function

DOH. I thought any reasonable compiler collects the data and the functions
separately. If the struct had a parent with data members but no VMT, that
pointer could be located between the new and old members -- or even after the
data members, but placing it depending on where functions are placed in the
class definition sounds weird. (Though we should generally make no
assumptions about the layout of a non-POD.)

> Even if the class has no virtual functions, the compiler is allowed to
> add a vtable anyway.

Hmm, I'm sure this is absolutely not allowed -- that would break C
compatibility, and contradict lots of stuff regarding PODs in the standard.

>I think it is also allowed to added debugging aids
> like:
>
> struct Demo {
> byte __guard1;
> char data1[1];
> byte __guard2;
> char data2[1];
> byte __guard3;
> };

AFAIK padding is allowed inside the struct, but not at the beginning. But
most implementations define how they create the layout -- at least
for a set of situations -- and often allow tuning via switches or #pragma
pack. Knowing the layout is important in plenty of real-life problems.

Paul

Dave Harris

Feb 22, 2004, 5:56:01 AM
pa...@lib.hu (Balog Pal) wrote (abridged):
> This whole thing is for PODs

I went on to talk about PODs later in that message.


> DOH. I thought any reasonable compiler collects the data and the
> functions separately.

Why should it?

Placing the vtable after the first virtual function could make it
easier to keep non-PODs at least partially compatible with PODs and
C structs.

(With some CPUs, making the vtable at the front could make it
more efficient to call virtual functions. That's an argument for
putting the vtable first, not for keeping it separate to data.)


> > Even if the class has no virtual functions, the compiler is
> > allowed to add a vtable anyway.
>
> Hmm, I'm sure this is absolutely not allowed -- that would break C
> compatibility, and contradict lots of stuff regarding PODs in the
> standard.

The C++ standard does not require C compatibility for PODs. Having a
vtable need not prevent a class from being copied with memcpy.
Exactly what stuff do you think is contradicted?


> AFAIK padding is allowed inside the struct, but not at the beginning.

Oops, yes, you're right there. I should have inserted another member
before the array. (Technically I think there can be padding if
reinterpret_cast is smart enough to allow for it, but that rather
goes against the standard's intent.)
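
The "no padding at the beginning" point can be checked for a given POD on a given compiler. In this sketch the struct Header is a stand-in invented for the example, modelled on the len/data layout discussed earlier in the thread.

```cpp
#include <cstddef>

// Hypothetical POD with a length followed by a one-element tail array.
struct Header {
    int len;
    unsigned char data[1];
};

// True if the first member sits at the very start of the struct.
inline bool first_member_at_offset_zero() {
    Header h = {0, {0}};
    return static_cast<void*>(&h) == static_cast<void*>(&h.len)
        && offsetof(Header, len) == 0;
}

// Offset of the tail array; padding may appear before it, never before len.
inline std::size_t tail_offset() { return offsetof(Header, data); }
```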


> But most implementations define the way of how they create the layout
> -- at least for a set of situations. And often allow tuning via
> switches or #pragma pack. Knowing the layout is important in a plenty
> of real-life problems.

Oh, agreed. Implementations can make guarantees above and beyond the
standard. They can guarantee that the allocation trick will work for
PODs, if they want, and then you can rely on that guarantee, if you
want. That's just the way undefined behaviour may be interpreted for
that implementation.

-- Dave Harris, Nottingham, UK


Gabriel Dos Reis

Feb 23, 2004, 6:34:00 AM
Francis Glassborow <fra...@robinton.demon.co.uk> writes:

| In message <diu420ptq8pk1oju9...@4ax.com>, tom_usenet
| <tom_u...@hotmail.com> writes

| >On 5 Feb 2004 06:45:05 -0500, ka...@gabi-soft.fr wrote:
| >
| > >Is it possible to use std::vector< unsigned char > as raw memory? Or,
| > >more specifically, if I have a std::vector< unsigned char > v, is &v[0]
| > >guaranteed to be aligned for all possible data types. (Since an
| > >implementation is required ultimately to use operator new to obtain the
| > >buffer, I can't see how it couldn't be in practice, but I rather doubt
| > >that the standard gives me this guarantee, even indirectly. But maybe
| > >this was the intent.)
| >
| >Looking at 20.4.1.1, std::allocator<T>::allocate allocates memory
| >suitably aligned for T, so memory returned by std::allocator<unsigned
| >char>::allocate is only guaranteed to have an alignment of 1.
|
| However see the requirement for the implementation provided operator new
| as given in 18.4.1.1 para 1 which requires that the returned pointer is
| to suitably aligned memory for any object of that size.

Yes, but that is not a requirement that the pointer returned by operator
new is the one returned by std::allocator<T>::allocate(). It is fine
for std::allocator<T> to do internal book-keeping and adjustments, as
long as the pointer value returned is suitably aligned to store a T.
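
Whether a particular implementation happens to give more than that can only be observed, not relied upon. A sketch of such an observation, assuming a C++11 compiler for alignof and std::uintptr_t; is_aligned is a helper name invented here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// True if p is a multiple of `alignment` when viewed as an integer.
// The pointer-to-integer mapping is implementation-defined, so this is a
// practical check on common platforms rather than a portable guarantee.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

For a `std::vector<unsigned char> v(64)`, only `is_aligned(&v[0], 1)` follows from the allocator requirements. A call such as `is_aligned(&v[0], alignof(double))` reports what one library does on one run and proves nothing about the next.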

--
Gabriel Dos Reis
g...@cs.tamu.edu
Texas A&M University -- Computer Science Department
301, Bright Building -- College Station, TX 77843-3112

Gabriel Dos Reis

Feb 23, 2004, 6:36:06 AM
ka...@gabi-soft.fr writes:

| I also thought that the question would be of more general interest. One
| of the motivations behind the requirement for contiguity in std::vector,
| if not THE motivation, was support for cases where C style arrays were
| being used (legacy and C interfaces, for example). I stumbled on a case
| where a C style interface was being used that didn't seem to be covered.

std::vector<T> is, no doubt, meant to be a safe, good replacement for
C-arrays -- for most uses. I doubt, however, that even a C-array of
char is required to have its address suitably aligned to meet the
requirements of an arbitrary type.
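
One common workaround for a plain char array is to put it in a union with the types whose alignment is needed. The sketch below is only an approximation of "aligned for anything": it covers exactly the listed members, and the name AlignedBytes, the member names and the 256-byte size are arbitrary choices for the example. The check itself assumes C++11 for alignof and std::uintptr_t.

```cpp
#include <cstddef>
#include <cstdint>

// The byte array shares its address with every other member, so it is
// aligned at least as strictly as the strictest of them.
union AlignedBytes {
    unsigned char bytes[256];
    long          as_long;
    double        as_double;
    long double   as_long_double;
    void*         as_pointer;
    void        (*as_function)();
};

// True if the byte array inside b starts on a double boundary.
inline bool bytes_are_double_aligned(const AlignedBytes& b) {
    return reinterpret_cast<std::uintptr_t>(b.bytes) % alignof(double) == 0;
}
```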

Dylan Nicholson

Feb 23, 2004, 6:41:15 AM
bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20040222033438.780C@brangdon.m>...

>
> The C++ standard does not require C compatibility for PODs.

It doesn't? What about if the declaration is extern "C"?
If there were a compiler for which PODs declared in C++ code wouldn't
work when passed to C code (and vice versa), a hellavu lot of code
would surely be broken.

Dylan

Dave Harris

Feb 23, 2004, 1:58:55 PM
wizo...@hotmail.com (Dylan Nicholson) wrote (abridged):

> > The C++ standard does not require C compatibility for PODs.
>
> It doesn't? What about if the declaration is extern "C"?

These things /enable/ compatibility but don't require it. They
cannot. A given platform might not even have a C compiler. Or it
may not have one from the same vendor. Typically a vendor's C++
compiler isn't even layout-compatible with itself, if you fiddle
with the compile options between compiles.


> If there were a compiler for which PODs declared in C++ code
> wouldn't work when passed to C code (and vice versa), a hellavu
> lot of code would surely be broken.

It's Quality of Implementation. Vendors keep compatibility with C
for commercial reasons rather than because the standard requires it.

-- Dave Harris, Nottingham, UK


ka...@gabi-soft.fr

Feb 23, 2004, 2:34:21 PM
"Balog Pal" <pa...@lib.hu> wrote in message
news:<4036...@andromeda.datanet.hu>...

> <ka...@gabi-soft.fr> wrote in message
> news:d6652001.04021...@posting.google.com...
> > > > Perhaps because it is illegal -- undefined behavior according
> > > > to the C and the C++ standards.

> > > Really? What part makes it UB?

> > The start address is in the array of the struct, an array with one
> > element. Pointer math is only defined within a given array (and
> > one past the end).

> I'm certainly aware of that one. But we actually have a bigger array
> at that place, so pointer math should work for that object.

If the pointer is derived from the bigger array, it is guaranteed to
work. Consider the following example:

struct S { int i; char c[ 1 ] ; } ;
void* p = malloc( sizeof( S ) + 100 ) ;
char* pc1 = (char*)p + sizeof( S ) ; // derived from the whole block
char* pc2 = ((S*)p)->c ; // derived from the one-element array

pc1[ 50 ] ; // legal...
pc2[ 50 ] ; // illegal...

The bounds are determined by where you got the pointer from.
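
To make the distinction concrete, here is a minimal sketch of the "legal" variant, in which the pointer to the trailing bytes is computed from the start of the allocated block rather than from the one-element member array. The helper names alloc_s and trailing are hypothetical, introduced only for illustration.

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::memset

struct S { int i; char c[1]; };

// Hypothetical helper: allocates room for an S followed by `extra` bytes.
// Returns a null pointer if the allocation fails.
inline void* alloc_s(std::size_t extra)
{
    return std::malloc(sizeof(S) + extra);
}

// Hypothetical helper: pointer to the bytes that follow the S. It is
// derived from the block returned by malloc, so its bounds are those of
// the whole block, not those of the one-element array S::c.
inline char* trailing(void* block)
{
    assert(block != 0);
    return static_cast<char*>(block) + sizeof(S);
}
```

With void* p = alloc_s(100), the expression trailing(p)[50] stays inside the block obtained from malloc, which corresponds to the pc1 case above; the block is released with std::free(p).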

> Now let's have

Yes. This was the explicit intent of at least some of the members of
the C committee, it was what the C committee voted, and there is nothing
in C++ which changes this.

Centerline once sold a compiler in which pointers did carry extra
information. Francis cites another example of such a compiler, which I
wasn't aware of.

> The example in the previous post could use an extra reinterpret_cast to
> formally change type from that char[1] to a char[many], and let it
> decay to simple pointer only after that.

With such a system, the reinterpret cast might have to do some strange
things. I've not given the issue much thought. (I also don't know
exactly what Centerline did in such cases.)

> > §5.7/5 (last sentence): "If both the pointer operand and the result
> > point to elements of the same array object, or one past the last
> > element of the array object, the evaluation shall not produce an
> > overflow; otherwise the behavior is undefined."

> And that IS the case as long as we stay within the object -- and the
> object -- the region of storage is what we got originally from op new.

That is not what the C standard says. The object is the object we got
the pointer from. The operator new function (or malloc) returns a
pointer to an object of raw memory of a certain size. When we cast the
pointer, we obtain a new pointer, which points to a new object, albeit
at the same address.

> We got the char[1] thing with a reinterpret_cast in the first place.

Yep. Presumably, a reinterpret_cast would allow us to get back to where
we started (although the standard doesn't guarantee it).

> > Sure. Anytime the compiler uses fat pointers and does bounds
> > checking.

> And is that legal and standard-conforming?

That's the intent, at least according to the people involved in the C
standard.

> [though I think such an implementation can hardly give that
> reinterpret_cast guarantee I built the theory on.]

It might be able to. But I think that, generally speaking, it would allow
more restrictive reinterpret_casts, but not less restrictive ones. Thus,
if we consider that a pointer is actually a structure of three pointers:
begin, end and current, the return value of malloc(100) would have begin
and current set to the start of memory, and end to the start plus 100.
A reinterpret_cast would only work if the size of the target element is
less than 100 -- perhaps, too, it would adjust the end pointer so that
the total size is an exact multiple of the size of the target type.

> > I believe that CenterLine once sold such a compiler.

> That is really interesting. And having
> char c[100];
> char * p = &c[50];

> produced a pointer that remembered it can go 50 back and forward? And
> how did casts work here?

That's how it was documented, at least. I don't see where this case
would cause a problem at the implementation level.

> Especially could I cast some other char * to pick up identical bounds
> info?

The bounds info is not part of the type, but of the value of the
pointer. It is generated when creating the pointer from some larger
type. Thus, to continue the above example (values in comments refer to
raw byte pointer values):

struct S { int i ; char c[ 1 ] ; } ;
// sizeof( S ) == 8, on my machine, to have a concrete example...

void* p = malloc( sizeof( S ) + 100 ) ;
// p->begin == p->current, p->end == p->begin+108
S* ps = (S*)p ;
// ps->begin == ps->current, ps->end == ps->begin+104
// Note the adjustment of the end pointer....
char* pc = ps->c ;
// pc->begin == ps->begin+4, pc->current == pc->begin
// pc->end == pc->begin+1
And so on. Once you're down to pc, there is no going back.
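
The bookkeeping in the annotated example can be modelled in a few lines. What follows is only a sketch of the begin/current/end idea described here, using byte offsets from the start of the allocation; FatPtr and its helper functions are hypothetical names and make no claim about how the Centerline compiler actually worked.

```cpp
#include <cassert>  // assert
#include <cstddef>  // std::size_t

// Hypothetical model of a bounds-carrying ("fat") pointer. All fields are
// byte offsets relative to the start of the original allocation.
struct FatPtr {
    std::size_t begin;    // first valid byte
    std::size_t current;  // byte currently pointed at
    std::size_t end;      // one past the last valid byte
};

// Models the value returned by malloc(size).
inline FatPtr from_malloc(std::size_t size)
{
    FatPtr p = { 0, 0, size };
    return p;
}

// Models a cast to "pointer to element of elem_size bytes": the end bound
// is trimmed so that the range holds a whole number of elements.
inline FatPtr cast_to(FatPtr p, std::size_t elem_size)
{
    assert(elem_size != 0 && p.end - p.begin >= elem_size);
    p.end = p.begin + ((p.end - p.begin) / elem_size) * elem_size;
    return p;
}

// Models taking the address of a member that occupies `size` bytes at
// `offset` inside the pointed-to object: the bounds shrink to that member.
inline FatPtr member(FatPtr obj, std::size_t offset, std::size_t size)
{
    assert(obj.current + offset + size <= obj.end);
    FatPtr m = { obj.current + offset, obj.current + offset,
                 obj.current + offset + size };
    return m;
}

// Models pointer addition in bytes; one past the end is still allowed.
inline FatPtr advance(FatPtr p, std::size_t n)
{
    assert(p.current + n <= p.end);
    p.current += n;
    return p;
}

// True if dereferencing byte `index` relative to p stays within bounds.
inline bool in_bounds(const FatPtr& p, std::size_t index)
{
    return p.current + index < p.end;
}
```

With sizeof( S ) == 8 and c at offset 4, as in the example, from_malloc(108) followed by cast_to(..., 8) gives an end of 104, and member(..., 4, 1) gives the range [4, 5). In this model in_bounds(pc, 50) is false, while a byte pointer advanced 8 bytes from the original block still has index 50 in bounds.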

> > It was the expressed intent of at least some of the authors of the
> > original C standard that an implementation with full bounds
> > checking using fat pointers be legal.

> And guess the embedded folk stood up yelling 'over my dead body' :)

Not at all. There was never any intent to *require* such pointers.

It may surprise you, but most of Centerline's customers were "embedded
folk". Not all embedded systems are strapped for memory. And a lot of
them have very rigorous quality requirements, and like to detect errors
as early as possible.

> The intent is a good one, but I don't think it fits the practice and
> the ways of C and C++.

I suspect that you are right. At least at present, I think it is only
good intent on the part of the authors of the C standard. There doesn't
seem to be any driving market force for a compiler which actually does
this. There doesn't seem to be any driving market force for anything
which would improve the quality of the developed code, in fact.

> Pointer math is not used in a wide range of applications. There,
> bounds checking is irrelevant.

Given that the definition of [] in C and C++ involves pointer math, I
suspect that there are very few applications which make no use of it
whatsoever.
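
As a reminder of why: for built-in types, a[i] is defined as *(a + i). A small sketch (the function name is hypothetical) that checks the equivalent spellings against each other:

```cpp
#include <cassert>  // assert
#include <cstddef>  // std::size_t

// Hypothetical check: the three spellings of "address of element i" agree,
// because the built-in subscript a[i] is defined as *(a + i).
inline bool subscript_is_pointer_math(const int* a, std::size_t i)
{
    assert(a != 0);
    return &a[i] == a + i && &i[a] == a + i;
}
```

So any code that indexes a built-in array or a raw pointer is already doing pointer arithmetic, whether or not a + appears in the source.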

> In the rest it is either handy or necessary, but it very often means
> some hackery or dealing with the metal. Where bounds checks are more
> in the way than helping.

At some sufficiently low level, yes. You have to be able to turn bounds
checking off. It would certainly get in the way when implementing
malloc.

> Okay, too bad in the REAL world there are yet other situations, and we
> get all those buffer overruns. The most practical question: was that
> Centerline solution good enough to catch all possible buffer overruns?
> If so, it is really a big achievement, that would make me switch
> sides.

My impression is that Centerline never sold enough compilers so that we
could know. I know that it would catch everything that Purify catches,
and a lot more. And that Purify already catches most of those buffer
overruns. I also know that it had a very negative effect on
performance. Enough so that I don't think many people could have used
it as their production compiler.

> > > BYTE data[1]; // rest follows!
> > > inline BYTE * GetData() {return &data[0];}

> > Note here that the resulting BYTE* points into the array data. Any
> > attempt to add more than one to it is undefined behavior, as is any
> > attempt to dereference GetData() + 1.

> But that can be prevented with an extra cast, isn't it?

I don't think so. Not once you're down to that level.

My trick to make this work in C++ was to keep the pointer to the struct:
basically, in the above example, to work with ps (in the form of the
this pointer), and not with pc. I think that this is legal, although an
even stricter interpretation would require maintaining the original
void* returned by malloc/operator new somewhere.

In practice, of course, the struct hack works on all current compilers,
and is part of the Posix API.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


Dylan Nicholson

Feb 23, 2004, 10:32:49 PM
bran...@cix.co.uk (Dave Harris) wrote in message
news:<memo.20040223121819.1892B@brangdon.m>...

> wizo...@hotmail.com (Dylan Nicholson) wrote (abridged):
> > > The C++ standard does not require C compatibility for PODs.
> >
> > It doesn't? What about if the declaration is extern "C"?
>
> These things /enable/ compatibility but don't require it. They
> cannot. A given platform might not even have a C compiler. Or it
> may not have one from the same vendor. Typically a vendor's C++
> compiler isn't even layout-compatible with itself, if you fiddle
> with the compile options between compiles.
>
True enough, but that would certainly be outside of what any
standard should dictate.

>
> > If there were a compiler for which PODs declared in C++ code
> > wouldn't work when passed to C code (and vice versa), a hellavu
> > lot of code would surely be broken.
>
> It's Quality of Implementation. Vendors keep compatibility with C
> for commercial reasons rather than because the standard requires it.
>
Well, I suppose - it's always bemused me the way programmers fret over
standards-compliance when they frequently make assumptions about
compiler behaviour that are not part of any standard (and furthermore
assume that standards-compliant code is necessarily more portable).
If standards were so important, then all modern OSes should have
set-in-stone C ABI's, and any compiler written for them should be able
to label itself as "OS XXX-C ABI-compliant".
And ideally of course, all modern OSes should have set-in-stone C++
ABI's. One can but hope...

Dylan

ka...@gabi-soft.fr

Feb 24, 2004, 2:24:31 PM
Gabriel Dos Reis <g...@cs.tamu.edu> wrote in message
news:<m3hdxiq...@merlin.cs.tamu.edu>...
> ka...@gabi-soft.fr writes:

> | I also thought that the question would be of more general interest.
> | One of the motivations behind the requirement for contiguity in
> | std::vector, if not THE motivation, was support for cases where C
> | style arrays were being used (legacy and C interfaces, for example).
> | I stumbled on a case where a C style interface was being used that
> | didn't seem to be covered.

> std::vector<T> is, no doubt, meant to be a safe good replacement for
> C-arrays -- for the most uses. I however doubt that even a C-array of
> char is required to have its address suitably aligned to meet the
> requirement of an arbitrary type.

It doesn't. But since the size of the object isn't known until runtime,
the alternative was a C-style object created with malloc, and the return
value of malloc is guaranteed suitably aligned. I could also use the
operator new function, or new char[]. The only real advantage of
std::vector here was that it implements RAII, all by itself. (In this
case, boost::scoped_array would also be an option. Except that Boost won't
compile with my antediluvian compilers -- even our version of the STL
has been significantly compromised in order to pass.)
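
For comparison, the operator new alternative needs only a very small hand-written wrapper to get the same RAII behaviour. This is a minimal sketch in pre-C++11 style, assuming nothing beyond <new>; RawBuffer is a hypothetical class name, not part of any library.

```cpp
#include <cassert>  // assert
#include <cstddef>  // std::size_t
#include <new>      // ::operator new, ::operator delete

// Hypothetical minimal RAII wrapper around raw memory obtained from the
// operator new function, whose result is suitably aligned for any object
// that fits in the requested size.
class RawBuffer {
public:
    explicit RawBuffer(std::size_t size)
        : data_(::operator new(size)), size_(size) {}
    ~RawBuffer() { ::operator delete(data_); }

    void* get() const { return data_; }
    std::size_t size() const { return size_; }

    // Byte access, checked in debug builds.
    unsigned char& at(std::size_t i)
    {
        assert(i < size_);
        return static_cast<unsigned char*>(data_)[i];
    }

private:
    RawBuffer(const RawBuffer&);             // not copyable: declared,
    RawBuffer& operator=(const RawBuffer&);  // never defined

    void* data_;
    std::size_t size_;
};
```

A caller would construct RawBuffer buf(n) with the size computed at run time and pass static_cast<dirent*>(buf.get()) to the C interface; the memory is released when buf goes out of scope, including when an exception propagates.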

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


Francis Glassborow

Feb 24, 2004, 2:31:59 PM
In message <7d428a77.04022...@posting.google.com>, Dylan
Nicholson <wizo...@hotmail.com> writes

>bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20040222033438.780C@brangdon.m>...
> >
> > The C++ standard does not require C compatibility for PODs.
>
>It doesn't? What about if the declaration is extern "C"?
>If there were a compiler for which PODs declared in C++ code wouldn't
>work when passed to C code (and vice versa), a hellavu lot of code
>would surely be broken.

But extern "C" does not guarantee your code will link with just any C.
It assumes a compatible compiler. The PODs declared in C++ code must
work with a compatible C compiler (if there is one, but C++ does not
require there to be).
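
Since the guarantee depends on the companion C compiler, code that shares a struct with C can at least state its layout assumptions explicitly. A minimal sketch, assuming a C++11 compiler; Record and layout_matches are hypothetical names, and the sizes to compare against would have to be obtained from the C side (for example, printed by a small C program built with the intended compiler and options).

```cpp
#include <cassert>      // assert (for callers checking layout_matches)
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout (C++11)

extern "C" {
    // Hypothetical record shared with C code. The linkage specification
    // affects names of functions and variables; it does not by itself
    // change how the struct is laid out.
    struct Record {
        int  id;
        char tag[4];
    };
}

// Compile-time statement of what the C++ side relies on.
static_assert(std::is_standard_layout<Record>::value,
              "Record must be a standard-layout type");
static_assert(offsetof(Record, id) == 0, "id expected at offset 0");

// Hypothetical run-time check against values reported by the C side.
inline bool layout_matches(std::size_t c_sizeof, std::size_t c_tag_offset)
{
    return sizeof(Record) == c_sizeof
        && offsetof(Record, tag) == c_tag_offset;
}
```

Such checks do not create compatibility; they only make a mismatch visible early instead of leaving it to surface as corrupted data.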


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects
