Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Why no VLAs in C++0x?

11,110 views
Skip to first unread message

Scott Meyers

unread,
Nov 15, 2009, 1:47:31 AM11/15/09
to
C99 offers variable length arrays (VLAs), whereby the size of an array
with automatic storage duration is specified at runtime:

void f(size_t n)
{
int vla[n];
...
}

There is nothing like this in C++0x, although
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2648.html
proposes essentially this feature for TR2. Searching the net has
yielded scattered mentionings of how VLAs are evil and C++ should
never adopt them, but I have not been able to find any summaries of
the technical reasons behind such a view. Having just finished talking
with a group of developers about C++0x and having had to respond to
the question, "Does C++0x have my favorite feature from C99 --
variable length arrays?" with "No, but I don't know why," I'd be
interested to know why what seems like a useful idea --
stack-allocated arrays with runtime-determined bounds -- doesn't seem
to have received any serious consideration for C++0x.

Thanks,

Scott

--
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@netlab.cs.rpi.edu]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]

Florian Weimer

unread,
Nov 15, 2009, 8:04:30 PM11/15/09
to
* Scott Meyers:

> C99 offers variable length arrays (VLAs), whereby the size of an array
> with automatic storage duration is specified at runtime:
>
> void f(size_t n)
> {
> int vla[n];
> ...
> }

What would std::vector<decltype(vla)> mean?

Daniel Krügler

unread,
Nov 15, 2009, 8:05:14 PM11/15/09
to
On 15 Nov., 07:47, Scott Meyers <use...@aristeia.com> wrote:
> C99 offers variable length arrays (VLAs), whereby the size of an array
> with automatic storage duration is specified at runtime:
>
> void f(size_t n)
> {
> int vla[n];
> ...
> }
>
> There is nothing like this in C++0x, althoughhttp://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2648.html

> proposes essentially this feature for TR2. Searching the net has
> yielded scattered mentionings of how VLAs are evil and C++ should
> never adopt them, but I have not been able to find any summaries of
> the technical reasons behind such a view. Having just finished talking
> with a group of developers about C++0x and having had to respond to
> the question, "Does C++0x have my favorite feature from C99 --
> variable length arrays?" with "No, but I don't know why," I'd be
> interested to know why what seems like a useful idea --
> stack-allocated arrays with runtime-determined bounds -- doesn't seem
> to have received any serious consideration for C++0x.

I believe that a proposal like N2648 could indeed provide
similar features to C++ as VLA's do in C99 (so I don't think
that technical arguments would exclude this idea). During the
Sophia Antipolis meeting 2008 this paper was discussed. The
result was a much stronger preference to reconsider it for a
TR2 instead of C++0x. I assume, this was also partly due to
the lack of convincing use-cases.

HTH & Greetings from Bremen,

Daniel Kr�gler

Kaz Kylheku

unread,
Nov 15, 2009, 8:06:55 PM11/15/09
to
On 2009-11-15, Scott Meyers <use...@aristeia.com> wrote:
> C99 offers variable length arrays (VLAs), whereby the size of an array
> with automatic storage duration is specified at runtime:
>
> void f(size_t n)
> {
> int vla[n];
> ...
> }

There is no need for this stupid, dangerous stupidity in C++, since you have
std::vector.

What might be useful would be a way to provide a hint to std::vector to allow
the compiler to allocate an instance in automatic storage, if possible, as an
optimization.

(E.g. semantically similar to (declare (dynamic-extent ...)) in Common Lisp).

VLA's can blow up your stack.

VLA's are a moronic idea. C did not need another form of second class array.
Please keep this out of C++.

> the question, "Does C++0x have my favorite feature from C99 --
> variable length arrays?" with "No, but I don't know why," I'd be

That developer needs to be flogged and then have his head examined.

VLA's are a good way to turn your codebase into one for which it's impossible
to compute the worst-case stack usage. That's, like, wonderful for embedded
systems, the last frontier for C development!

These days people are using threads. If you use a lot of threads, you have
to constrain the stack size, even if you are not in a small system with
a severely restricted memory. E.g. in Linux, thread stacks are mapped with
mmap. So if you have N threads with a stack size of M, your virtual memory
footprint is N * M. The only way you can run your program, if this exceeds your
physical memory size, is to have overcommit enabled and cross your fingers.

C99 provides no way for the program to detect and recover from a failure to
allocate the VLA. The behavior is simply undefined. If the VLA blows past
a guard page at the top of the stack, too bad.

Basically, this feature is another gets().

In C++, at least, VLA's could have a std::bad_alloc exception.

VLA's are still low-level arrays; the expression ``vla'' in your example
decays to an ``int *'' pointer, which provides no bounds checking.
The function may return this pointer, which becomes garbage.

There is no way to do what you often really want: to dynamically grow the
array! Even though they are sized and allocated on the fly, VLA's have a fixed
size. You can't increase the size of a VLA: to do that, you are back to
malloc a new array and memcpy to relocate your data there.

With alloca, hacky as it is, you can make requests to obtain more storage.
The objects don't die until you return from the function, so you can do
something like this:

for (...) {
struct foo *node = alloca(...)
/* insert node into dynamic set, keep going */
}

Can't do this with a VLA because they have a lifetime tied to the instance
of their enclosing block scope.

Here is an idea: have an array-declaration-like syntax (perhaps hightly
compatible with VLA's) but which is understood to be a veneer for std::vector).

So that is to say,

int vla[n];

would actually make a std::vector<int>, pre-allocated to size n. That would be
the actual type of the object. Consequently, in the scope of this
identifier, the expression ``vla'' would not simply be an ``int *'' pointer.

Alf P. Steinbach

unread,
Nov 15, 2009, 8:10:30 PM11/15/09
to
* Scott Meyers:

> C99 offers variable length arrays (VLAs), whereby the size of an array
> with automatic storage duration is specified at runtime:
>
> void f(size_t n)
> {
> int vla[n];
> ...
> }
>
> There is nothing like this in C++0x, although
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2648.html
> proposes essentially this feature for TR2. Searching the net has
> yielded scattered mentionings of how VLAs are evil and C++ should
> never adopt them, but I have not been able to find any summaries of
> the technical reasons behind such a view. Having just finished talking
> with a group of developers about C++0x and having had to respond to
> the question, "Does C++0x have my favorite feature from C99 --
> variable length arrays?" with "No, but I don't know why," I'd be
> interested to know why what seems like a useful idea --
> stack-allocated arrays with runtime-determined bounds -- doesn't seem
> to have received any serious consideration for C++0x.

Because the C99 implementation is an abomination, and so gave VLAs a bad name.

In earlier discussions, e.g. in cl++, it seems that committee members are unable
to imagine any other implementation (that includes syntax) than the C99 one. Of
course, it may be that there are other unarticulated reasons, like, "this is
argued for by camp W (the Windows/Microsoft guys, they use it a lot because of
Windows' UTF-16 based core whence they need "inline" string conversions
everywhere), and hence we in camp US (the Unix/Sun guys, hah, UTF-8 everywhere,
just bytes) are automatically opposed". I.e., it may be that the association to
C99 VLAs has just been proffered as a vicarious argument, wouldn't surprise me.

But anyway, as a consequence, people will have to cope using the de-facto
standard and type unsafe alloca. :-(


Cheers & hth.,

- Alf

Scott Meyers

unread,
Nov 15, 2009, 11:24:38 PM11/15/09
to
Kaz Kylheku frothed:

>
> VLA's can blow up your stack.

So can recursion. So can unusually long call chains. This hardly
seems like a compelling argument.

> VLA's are a good way to turn your codebase into one for which it's impossible
> to compute the worst-case stack usage.

In the general case, this is already impossible. To reduce the
general case to something more amenable to analysis, developers
concerned with this issue impose restrictions on themselves to make
the analysis possible. Such restrictions could be applied on the
sizes of VLAs, too (e.g., require that the maximum of all possible
sizes must be statically computable).

> VLA's are still low-level arrays; the expression ``vla'' in your example
> decays to an ``int *'' pointer, which provides no bounds checking.
> The function may return this pointer, which becomes garbage.

Both of which you can currently do with non-VLA arrays. VLAs don't
make things any safer, but they don't introduce any new dangers in
this regard, either. Developers who don't think the danger is worth
the benefit can avoid the feature, as they can currently do with
non-VLA arrays.

> There is no way to do what you often really want: to dynamically grow the
> array! Even though they are sized and allocated on the fly, VLA's have a fixed
> size. You can't increase the size of a VLA: to do that, you are back to
> malloc a new array and memcpy to relocate your data there.

If you need to change the size of your data structure, an array (of
any kind) is a poor choice.

> Here is an idea: have an array-declaration-like syntax (perhaps hightly
> compatible with VLA's) but which is understood to be a veneer for std::vector).
>
> So that is to say,
>
> int vla[n];
>
> would actually make a std::vector<int>, pre-allocated to size n. That would be
> the actual type of the object. Consequently, in the scope of this
> identifier, the expression ``vla'' would not simply be an ``int *'' pointer.

But it would run contrary to what I perceive to be the primary reason
for choosing a VLA: no dynamic memory allocation.

Scott

Scott Meyers

unread,
Nov 15, 2009, 11:24:53 PM11/15/09
to
Alf P. Steinbach wrote:
>
> Because the C99 implementation is an abomination, and so gave VLAs a bad name.

Can you elaborate? I don't know anything about the C99 implementation.

Thanks,

Scott

Scott Meyers

unread,
Nov 15, 2009, 11:24:03 PM11/15/09
to
Florian Weimer wrote:
>
> * Scott Meyers:
>
>> C99 offers variable length arrays (VLAs), whereby the size of an array
>> with automatic storage duration is specified at runtime:
>>
>> void f(size_t n)
>> {
>> int vla[n];
>> ...
>> }
>
> What would std::vector<decltype(vla)> mean?

I don't know. Nor do I know what sizeof(vla) would yield. But I
don't think these questions are unanswerable. One possible answer
would be that expressions requiring full compile-time resolution would
not compile for vlas.

The fact that questions such as this would have to be answered doesn't
strike me as the likely explanation for why VLAs don't appear to have
been seriously considered for C++0x.

Scott

REH

unread,
Nov 16, 2009, 2:39:08 PM11/16/09
to
On Nov 15, 11:24 pm, Scott Meyers <use...@aristeia.com> wrote:
> But it would run contrary to what I perceive to be the primary reason
> for choosing a VLA: no dynamic memory allocation.
>

I don't believe the standard guarantees that VLAs cannot be allocated
from the free store. In fact, it says the memory allocated to VLA may
be lost when executing a longjmp. I don't think that could happen if
it were forced to be stack based.

REH

Alf P. Steinbach

unread,
Nov 16, 2009, 2:40:28 PM11/16/09
to
* Scott Meyers:

Alf P. Steinbach wrote:


Because the C99 implementation is an abomination, and so gave
VLAs a bad name.


Can you elaborate? I don't know anything about the C99 implementation.


In C99 the syntax for a VLA is the same as for a raw array, as in your example,

void f(size_t n)
{
int vla[n];
...
}

With that choice follows that sizeof(vla) is evaluated at run time,
and that the type of vla is, well I don't know what it is, but it
isn't good.

In C++ one would use something similar to std::vector, but with
compiler (core language) support. Compiler (core langauge) support is
necessary to ensure that it's only used for automatic variables, and
to do the alloca from within what in a user-defined class solution
would be a nested stack frame. It might look like

void f( ptrdiff_t const n )
{
std::local_array<int> vla( n );
...
}

The point being to avoid std::vector's dynamic allocation overhead.

Existence of de facto standard alloca in a number of compilers (*nix,
Windows) says that there is a practical need for this functionality.


Cheers & hth.,

- Alf

--

Gabriel Dos Reis

unread,
Nov 16, 2009, 2:41:35 PM11/16/09
to
Scott Meyers <use...@aristeia.com> writes:

| Florian Weimer wrote:
| >
| > * Scott Meyers:
| >
| >> C99 offers variable length arrays (VLAs), whereby the size of an array
| >> with automatic storage duration is specified at runtime:
| >>
| >> void f(size_t n)
| >> {
| >> int vla[n];
| >> ...
| >> }
| >
| > What would std::vector<decltype(vla)> mean?
|
| I don't know. Nor do I know what sizeof(vla) would yield. But I
| don't think these questions are unanswerable. One possible answer
| would be that expressions requiring full compile-time resolution would
| not compile for vlas.
|
| The fact that questions such as this would have to be answered doesn't
| strike me as the likely explanation for why VLAs don't appear to have
| been seriously considered for C++0x.

Appearance may sometimes be deceptive. I would not be surprised that
since nobody came up with good answers, it wasn't included in C++0x.

Despite urban legends, committee members have to consider these
questions and, most (rightfully) expect good answers. It isn't just a
matter of 'answerable' questions.

Gabriel Dos Reis

unread,
Nov 16, 2009, 5:51:54 PM11/16/09
to
Scott Meyers <use...@aristeia.com> writes:

| C99 offers variable length arrays (VLAs), whereby the size of an array
| with automatic storage duration is specified at runtime:
|
| void f(size_t n)
| {
| int vla[n];
| ...
| }
|
| There is nothing like this in C++0x, although
| http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2648.html
| proposes essentially this feature for TR2. Searching the net has
| yielded scattered mentionings of how VLAs are evil and C++ should
| never adopt them, but I have not been able to find any summaries of
| the technical reasons behind such a view. Having just finished talking
| with a group of developers about C++0x and having had to respond to
| the question, "Does C++0x have my favorite feature from C99 --
| variable length arrays?" with "No, but I don't know why," I'd be
| interested to know why what seems like a useful idea --
| stack-allocated arrays with runtime-determined bounds -- doesn't seem
| to have received any serious consideration for C++0x.

I don't recall the charter of C++0x was to win the contest of

Does C++0x have my favorite feature from C99


I'm sure people who want to VLA in C++0x have reasons beyond feature
accretion from C99.

Bo Persson

unread,
Nov 16, 2009, 5:52:45 PM11/16/09
to
Scott Meyers wrote:
> Florian Weimer wrote:
>>
>> * Scott Meyers:
>>
>>> C99 offers variable length arrays (VLAs), whereby the size of an
>>> array with automatic storage duration is specified at runtime:
>>>
>>> void f(size_t n)
>>> {
>>> int vla[n];
>>> ...
>>> }
>>
>> What would std::vector<decltype(vla)> mean?
>
> I don't know. Nor do I know what sizeof(vla) would yield. But I
> don't think these questions are unanswerable. One possible answer
> would be that expressions requiring full compile-time resolution
> would not compile for vlas.
>
> The fact that questions such as this would have to be answered
> doesn't strike me as the likely explanation for why VLAs don't
> appear to have been seriously considered for C++0x.
>

I believe the committee has considered it briefly, but didn't like to
have sizeof being a run time operator for some types. Adding to this
the potential problem of the exact type of a vla, a structure
containing a vla, a class derived from that struct, etc, no one wanted
to invest time in this issue. There hasn't exactly been any lack of
items for the committee to decide on!

There is also the problem of specifying the runtime complexity of
vlas. What happens on machines without a stack? Should C++ include a
feature that is a potential performace optimization only on some
hardware?

Bo Persson

Florian Weimer

unread,
Nov 17, 2009, 2:19:18 PM11/17/09
to
* Scott Meyers:

>> What would std::vector<decltype(vla)> mean?
>
> I don't know. Nor do I know what sizeof(vla) would yield. But I
> don't think these questions are unanswerable.

You have two choices: disallow such template instantiations, or extend
the type systems. GCC currently does the former.

Extending the type system is hard, and the end effect might be "good
bye, concepts" because you've introduced dependent types and the
concepts type system might no longer be decidable.

> One possible answer would be that expressions requiring full
> compile-time resolution would not compile for vlas.

This means that you'd have to special-case them when language features
are specified in terms of template instantiations (e.g. the enhanced
for loop). For some things, this might be acceptable because they are
sufficiently obscure, but probably not for arrays.

Kaz Kylheku

unread,
Nov 17, 2009, 2:20:40 PM11/17/09
to
On 2009-11-16, Scott Meyers <use...@aristeia.com> wrote:
> Kaz Kylheku frothed:
>>
>> VLA's can blow up your stack.
>
> So can recursion. So can unusually long call chains. This hardly
> seems like a compelling argument.

You can die from drinking too much water. So that's not a compelling argument
against drinking paint thinner.

Recursion gives us a large, uniquely expressive space, which is safe at the
same time. Lots of very useful, recursive algorithms generate only
a small number of execution frames (for instance proportional to the logarithm
of the size of a dynamic data set, which means that a data set with
2**128 nodes, more than the number of atoms in the Sun, could be recursively
processed by binary subdivision with a stack usage on order of 128 frames).
Furthermore, some forms of recursion can express iteration, and not consume
space.

In many cases, if you don't express a solution recursively, the alternativee
is ugly. That's what I mean by uniquely expressive.

VLA's have no large space of safe use which is uniquely expressive in the same
way as recursion.

>> VLA's are a good way to turn your codebase into one for which it's impossible
>> to compute the worst-case stack usage.
>
> In the general case, this is already impossible.

In most well-designed code, this is easy to ascertain.

> general case to something more amenable to analysis, developers
> concerned with this issue impose restrictions on themselves to make
> the analysis possible. Such restrictions could be applied on the
> sizes of VLAs, too (e.g., require that the maximum of all possible
> sizes must be statically computable).

If the upper bound for the VLA is known, you can replace the VLA
with a regular array, by substituting that upper bound constant into the
declaration.

VLA's are used by bad programmers specifically when they don't know what the
upper bound is; that is precisely why they substitute a parameter which is
computed at run time.

VLA means ``I have no idea how big this quantity may be, but I think it's
okay if the compiler tries to grab that many bytes from the stack in
one fell swoop, and don't bother me with whether it worked or not.''

> Both of which you can currently do with non-VLA arrays. VLAs don't
> make things any safer, but they don't introduce any new dangers in
> this regard, either.

That's a good rationale for standardizing something which is already
widely implemented, not for a committee invention.

New revisions of a programming languages should close safety holes, not imitate
them.

> Developers who don't think the danger is worth
> the benefit can avoid the feature, as they can currently do with
> non-VLA arrays.

That is false. You can choose to avoid VLA arrays only the code you
are writing yourself. In the real world, you don't write every piece
of yourself. There are module from the team next door, and third-party
code.

How do I defend myself against stupid VLA problems in some library I'm
calling. How do I know that some parameter ``int X'' I'm passing to
a function isn't being used to allocate a VLA?

VLA's are broadly bad for the C landscape.

jacob navia

unread,
Nov 17, 2009, 7:31:05 PM11/17/09
to
Kaz Kylheku a �crit :

> If the upper bound for the VLA is known, you can replace the VLA
> with a regular array, by substituting that upper bound constant into the
> declaration.
>

This provokes much more stack usage than what is needed!

The usual case, for instance getting a temporary buffer for a
character string that in most cases will be 30-50 chars, but in the worst case
can be 1000 leads to reserving 1000 bytes when in most calls you need only
50! VLAs LIMIT stack usage precisely because you are not forced to
allocate the MAXIMUM SIZE!

> VLA's are used by bad programmers specifically when they don't know what the
> upper bound is; that is precisely why they substitute a parameter which is
> computed at run time.
>

This is just nonsense. See above. Even if I know the maximum possible size
(because I can LIMIT it) I surely do not want to allocate the maximum
EACH TIME. You are just throwing words around because all that smells
"C" is an abomination to you.

> VLA means ``I have no idea how big this quantity may be, but I think it's
> okay if the compiler tries to grab that many bytes from the stack in
> one fell swoop, and don't bother me with whether it worked or not.''
>

You can test the input parameter before using it in a VLA. Yes, bad
programmers will not test input parameters, and that is why they are
bad programmers. So what?

>> Both of which you can currently do with non-VLA arrays. VLAs don't
>> make things any safer, but they don't introduce any new dangers in
>> this regard, either.
>
> That's a good rationale for standardizing something which is already
> widely implemented, not for a committee invention.
>

VLAs are widely implemented since C99 requires them.

> New revisions of a programming languages should close safety holes, not imitate
> them.
>


SO, a program that in the average case allocates only 20
bytes is preferred to one that allocates always the maximum?

int foo(int table_size)
{
// Allocates in average 20 bytes
int table[table_size];
}
#define MAX_SIZE 1000
int foo(int table_size)
{
// Allocates in average 20 bytes
int table[MAX_SIZE];
}

The second is more likely to crash!
And no, there is no way to know if it crashes!

Besides, a better usage would be
int foo(int table_size)
{
if (table_size > 1000)
table_size=1000;
// Allocates in average 20 bytes
int table[table_size];
}


>> Developers who don't think the danger is worth
>> the benefit can avoid the feature, as they can currently do with
>> non-VLA arrays.
>
> That is false. You can choose to avoid VLA arrays only the code you
> are writing yourself. In the real world, you don't write every piece
> of yourself. There are module from the team next door, and third-party
> code.
>
> How do I defend myself against stupid VLA problems in some library I'm
> calling. How do I know that some parameter ``int X'' I'm passing to
> a function isn't being used to allocate a VLA?
>

And if you call a library and it crashes?
And if you call a library and the machine crashes?

That happens!

This is one of the most stupid arguments against VLAs that I have ever seen:

It could be misused in a library provoking the crash of my program.

GREAT!

Greg Herlihy

unread,
Nov 18, 2009, 1:30:27 PM11/18/09
to
On Nov 17, 7:31 pm, jacob navia <ja...@nospam.org> wrote:
> Kaz Kylheku a �crit :
>
> > If the upper bound for the VLA is known, you can replace the VLA
> > with a regular array, by substituting that upper bound constant into the
> > declaration.
>
> This provokes much more stack usage than what is needed!

Yes, it does.

> The usual case, for instance getting a temporary buffer for a
> character string that in most cases will be 30-50 chars, but in the worst case
> can be 1000 leads to reserving 1000 bytes when in most calls you need only
> 50! VLAs LIMIT stack usage precisely because you are not forced to
> allocate the MAXIMUM SIZE!

Yes - and not being forced to allocate the maximum size is precisely
the problem with variable-length arrays.

> > VLA's are used by bad programmers specifically when they don't know what the
> > upper bound is; that is precisely why they substitute a parameter which is
> > computed at run time.
>
> This is just nonsense. See above. Even if I know the maximum possible size
> (because I can LIMIT it) I surely do not want to allocate the maximum
> EACH TIME.

But you should. The size of stack-allocated array should be as large
as the array would ever need to be - not any larger and not any
smaller.

> > New revisions of a programming languages should close safety holes, not imitate
> > them.
>
> SO, a program that in the average case allocates only 20
> bytes is preferred to one that allocates always the maximum?

Absolutely.

> int foo(int table_size)
> {
> // Allocates in average 20 bytes
> int table[table_size];}
>
> #define MAX_SIZE 1000
> int foo(int table_size)
> {
> // Allocates in average 20 bytes
> int table[MAX_SIZE];
>
> }
>
> The second is more likely to crash!
> And no, there is no way to know if it crashes!

Yes, the second version is more likely to crash - but - if the program
does crash then it is very likely to crash often. In other words, if
there is no room for a 1,000 element array on the stack, than that
fact will quickly become apparent as soon as the programmer runs the
program. And a C++ programmer would then replace the local array with
a std::vector<int> - which is what the programmer should have been
using in the first place. (Unfortunately, C programmers do not have
this option - which explains why C99 has VLAs and why C++ should not).

With the VLA option, the programmer has far less assurance that the
1,000 element array has been in fact been tested. Perhaps an array of
that size is needed only when the customer runs the program - or is
needed only when the satellite goes into orbit. In other words, by
making the crash less frequent - the programmer has also made the
crash more serious.

The principle in this case is well-known in the Eiffel programming
language, that is, that all failures should be "hard" failures. And
along those same lines: if a bug can crash a program, then that bug
should always crash the program. In that way, the programmer is
certain not only to discover the bug - but must fix the bug as well.

> This is one of the most stupid arguments against VLAs that I have ever seen:

Personally, I don't think such comments are constructive. Nonetheless,
you are free to criticize Alf's arguments all you like. Keep in mind,
though, that his arguments are correct.

Greg

Edward Rosten

unread,
Nov 18, 2009, 1:30:30 PM11/18/09
to
On Nov 16, 7:40�pm, "Alf P. Steinbach" <al...@start.no> wrote:

> In C99 the syntax for a VLA is the same as for a raw array, as in your example,
>
> �void f(size_t n)
> �{
> � �int vla[n];
> � �...
> �}
>
> With that choice follows that sizeof(vla) is evaluated at run time,
> and that the type of vla is, well I don't know what it is, but it
> isn't good.

That's OK for C: it doesn't have the programmable type system, so it
seems pretty harmless to me.

> In C++ one would use something similar to std::vector, but with
> compiler (core language) support. Compiler (core langauge) support is
> necessary to ensure that it's only used for automatic variables, and
> to do the alloca from within what in a user-defined class solution
> would be a nested stack frame. It might look like
>
> �void f( ptrdiff_t const n )
> �{
> � � �std::local_array<int> vla( n );
> � � �...
> �}
>
> The point being to avoid std::vector's dynamic allocation overhead.
>
> Existence of de facto standard alloca in a number of compilers (*nix,
> Windows) says that there is a practical need for this functionality.

I'm curious if it is possible to make such a feature work cleanly on
both automatic variables and the free store. That way one could write
a class that gets space from the stack where it can, but uses the free
store if it is not a local variable. If one hypothesised the existence
of a function:

bool is_automatic(void* pointer);

which could deduce the allocation mechanism for a pointer, and alloca,
then local_array<> could be implemented as (eg):

template<int T> class local_array
{
T* ptr;
public:
local_array(size_t n)
{
if(is_automatic(this) && n < TOO_BIG)
ptr = alloca(sizeof(T) * n);
else
ptr = new T[n];
}

~local_array()
{
if(!is_automatic(ptr))
delete ptr;
}

T& operator[](size_t i)
{
return ptr[i];
}

//etc
};

I expect that is_automatic is generally as simple as a comparison of
pointers on most CPUs, and in many cases could probably be optimized
away. Either way, for large arrays or machines with no hardware stack
(let TOO_BIG =0), there is little to no penalty, especially relative
to the cost of malloc() / free(). I believe that there have been
attempts to make moderately portable is_automatic functions before.
They generally rely on detailed knowledge of each system, but that is
available to the compiler. On systems with no hardware stack, it could
be hard to deduce. Perhaps then it could always return false, so all
code would use new/delete instead.

The implementation would require a working alloca() which is not very C
++ ish. Perhaps the old meaning for auto could be reinstated with a
syntax like:

new auto T[n]

meaning stack allocation.

The disadvantage relative to C99 is that free-store allocated classes
would require two calls to new/delete, since there can not be variable
length structs. In practise this may not be such a burden. Assuming
structs are small and arrays are large, new/delete on the struct can
be made fast (eg glibc seems to use a fast allocator for small sizes,
distinct from the allocator for large sizes) so the main speed penalty
will be allocating a large chunk of memory.

-Ed
--
(You can't go wrong with psycho-rats.)(http://mi.eng.cam.ac.uk/~er258)

/d{def}def/f{/Times s selectfont}d/s{11}d/r{roll}d f 2/m{moveto}d -1
r 230 350 m 0 1 179{ 1 index show 88 rotate 4 mul 0 rmoveto}for/s 12
d f pop 235 420 translate 0 0 moveto 1 2 scale show showpage

Alf P. Steinbach

unread,
Nov 18, 2009, 2:24:21 PM11/18/09
to
* Edward Rosten:
>
[snip]

>
> I'm curious if it is possible to make such a feature work cleanly on
> both automatic variables and the free store. That way one could write
> a class that gets space from the stack where it can, but uses the free
> store if it is not a local variable. If one hypothesised the existence
> of a function:
>
> bool is_automatic(void* pointer);
>
> which could deduce the allocation mechanism for a pointer, and alloca,
> then local_array<> could be implemented as (eg):
>
> template<int T> class local_array
> {
> T* ptr;
> public:
> local_array(size_t n)
> {
> if(is_automatic(this) && n < TOO_BIG)
> ptr = alloca(sizeof(T) * n);

At this point you're in a nested stack frame. alloca can't do magic. So compiler
support is (practically) required to do the alloca in the stack frame of the
variable declaration and pass the pointer to the constructor.


> else
> ptr = new T[n];
> }
>
> ~local_array()
> {
> if(!is_automatic(ptr))
> delete ptr;

This needs to call destructors of the elements.

> }
>
> T& operator[](size_t i)
> {
> return ptr[i];
> }
>
> //etc
> };


Cheers,

- Alf

aus...@gmail.com

unread,
Nov 18, 2009, 2:25:27 PM11/18/09
to
On Nov 14, 10:47 pm, Scott Meyers <use...@aristeia.com> wrote:
> C99 offers variable length arrays (VLAs), whereby the size of an array
> with automatic storage duration is specified at runtime:
>
> void f(size_t n)
> {
> int vla[n];
> ...
> }
>
> There is nothing like this in C++0x, althoughhttp://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2648.html

> proposes essentially this feature for TR2. Searching the net has
> yielded scattered mentionings of how VLAs are evil and C++ should
> never adopt them, but I have not been able to find any summaries of
> the technical reasons behind such a view. Having just finished talking
> with a group of developers about C++0x and having had to respond to
> the question, "Does C++0x have my favorite feature from C99 --
> variable length arrays?" with "No, but I don't know why," I'd be
> interested to know why what seems like a useful idea --
> stack-allocated arrays with runtime-determined bounds -- doesn't seem
> to have received any serious consideration for C++0x.

At the most basic level, the answer is pretty simple: there
wasn't any serious discussion of this feature because there
wasn't ever a proposal to discuss. Nobody did the work of
writing and presenting a paper, and this feature would
obviously be complicated enough that a discussion couldn't
go anywhere without a paper. There are no records of an
official decision because there was none.

Certainly everyone on the committee was and is aware that
this feature is part of C, and that it exists as an
extension in some C++ compilers. So certainly it would have
been considered if someone had presented a full proposal.

Presumably you're looking for a deeper reason. That's hard
to answer, since nobody has to give reasons for not writing
a paper. Presumably some people thought that C++ VLAs
would be an actively bad idea, other people thought that
they might be OK but would be too much work, and still
others just didn't care very much.

I was somewhere in the latter two camps, for what it's
worth. I thought this feature would be much harder to
specify fully in C++ than in C (because of exceptions,
constructors and destructors, and C++'s greater
emphasis on knowing the exact types of things), and I
also thought it was much less important for C++ than for
C (because C++ has abstraction features that make it
possible to use other data structures more than raw
built-in arrays). The technical issues with integrating
C VLAs into C++ were perhaps solvable, but solving them
was nowhere near the top of my list of priorities. For
whatever reason or combination of reasons, it never
made it to the top of anyone else's list either.

--Matt

Leigh Johnston

unread,
Nov 19, 2009, 6:16:47 PM11/19/09
to
Take a look at my vecarray container for a "safe" alternative to VLAs, a
max-size stack (or free store) based container.

http://i42.co.uk/stuff/vecarray.htm

VLAs are a dodgy hack as having no idea how much stack you are using is
dodgy. Stack for known (static) quantities, freestore for unknown (dynamic)
quantities.

/Leigh

"Greg Herlihy" <gre...@mac.com> wrote in message
news:684cc3d7-8b81-4938...@m20g2000vbp.googlegroups.com...

Kaz Kylheku

unread,
Nov 20, 2009, 1:29:03 AM11/20/09
to
On 2009-11-19, Leigh Johnston <le...@i42.co.uk> wrote:
> Take a look at my vecarray container for a "safe" alternative to VLAs, a
> max-size stack (or free store) based container.
>
> http://i42.co.uk/stuff/vecarray.htm
>
> VLAs are a dodgy hack as having no idea how much stack you are using is
> dodgy. Stack for known (static) quantities, freestore for unknown (dynamic)
> quantities.

There is one rasonable use case for VLA's and it is this: You have some
situation where multiple arrays are being defined, at different levels in a
function activation chain. For whatever algorithmic reason, the upper bound on
the total size of these arrrays is known. However, it is not known how that
upper bound is partitioned among the individual arrays, only that the
total is bounded. In this case, the individual arrays can be allocated using
the variable lenghth. You know that if one of the arrays takes more space, this
is compensated by the others taking less space (the total being bounded).

This contrived situation can be handled easily without VLA's. The functions
are almost certainly closely related and can be augmented to pass an
extra parameter to each other, a pointer to allocator object which provides
allocations with a stack-like discipline (something like the ``obstacks'' in
GNU's ``libiberty''). The functions likely share some kind of context anyway,
and so this could just be an extra field in that context.

Code like:

if (parameter >= 1000)
parameter = 1000;

int vla[parameter];

is mostly useless and dangerous. All that you are doing is making it more
difficult to reproduce the worst case during development. The worst case still
exists, and may occur in the field, which is where you don't want it to occur.

Smart programmers do not write programs in which functions nest and allocate
big arrays at every level. There is usually no need for that.

I've seen situations in code where a debug trace macro would insert a large
array into every function where it is expanded. The right way to solve that is
for code to have access to an object which represents the trace context of the
calling thread. The trace object can hold the array needed to do formatting.
The trace context can be passed as an extra parameter, or retrieved with a
thread-local-storage API. Since the formatting takes place in one thread, it is
not interrupted; there is no need for reentrancy with multiple buffers at
different function nesting levels. Where asynchronous interrupts or signal
handlers can be dispatched, there are reasonable ways of providing tracing.
The Linux kernel's printk function can be called from ISR's and does not
allocate a big buffer on the stack. The solution is very simple: a
``static char printk_buf[1024]'' is defined right inside the vprintk
function (linux 2.6.26). Parts of the function run with interrupts locally
disabled on each CPU that is running the code, and protect the buffer with
spinlock. Problem solved, and no VLA braindamage threatening to take down the
kernel. Sure, this global locking means that printk will not scale. But code
that makes heavy calls to printk is not inherently scalable (doh!)

VLA's cannot help with this kind this problem because the users of the trace
macro or functions do not know (and should not know) how large of a parameter
ot use anyway. You need a buffer which is ``reasonably large'': i.e. much
larger than a typical trace message, while ensuring that there is no buffer
overflow if an anomalous formatting call occurs (e.g. swprintf, snprintf).

It would not be reasonable to expect the users to advise the tracer that a
buffer that is only 20 characters wide is needed. To know the exact buffer
size, the formatting job has to be done twice: first a dry run which computes
how many characters are needed, and then a real run which stores the
characters. (The C99 function snprintf supports this usage well, the swprintf
does not). To avoid the overhead of always formatting the message twice, you
might start with a ``reasonably large'' size, and just use the strategy to deal
with the exceptions. So you are back to having some fixed buffer size that is
reasonably large for the typical case. If the exceptions occur, you probably
don't want to be stack-allocating them.

Alf P. Steinbach

unread,
Nov 20, 2009, 12:35:53 PM11/20/09
to
* Kaz Kylheku:

>
> There is one rasonable use case for VLA's and it is this: You have some
> situation where multiple arrays are being defined, at different levels in a
> function activation chain. For whatever algorithmic reason, the upper bound on
> the total size of these arrrays is known. However, it is not known how that
> upper bound is partitioned among the individual arrays, only that the
> total is bounded. In this case, the individual arrays can be allocated using
> the variable lenghth. You know that if one of the arrays takes more space, this
> is compensated by the others taking less space (the total being bounded).
>
> This contrived situation can be handled easily without VLA's. The functions
> are almost certainly closely related and can be augmented to pass an
> extra parameter to each other, a pointer to allocator object which provides
> allocations with a stack-like discipline (something like the ``obstacks'' in
> GNU's ``libiberty''). The functions likely share some kind of context anyway,
> and so this could just be an extra field in that context.

I think the most well known use of dynamic size stack allocated arrays is in
Windows' narrow character API, where each API routine has to convert textual
arguments to wide characters before calling the corresponding wide character API
routine (the base API is exclusively based on wide characters).

I'm guessing that where there's no reasonable upper bound on a string length,
then dynamic allocation is used, otherwise (as with e.g. paths) stack allocation
via alloca is used.

Please explain how that is a case of the "one reasonable use" above, *and* how
the API routines can be "augmented to pass an extra parameter to each other".


> Code like:
>
> if (parameter >= 1000)
> parameter = 1000;
>
> int vla[parameter];
>
> is mostly useless and dangerous. All that you are doing is making it more
> difficult to reproduce the worst case during development. The worst case still
> exists, and may occur in the field, which is where you don't want it to occur.

That seems to be a straw man argument.

Nobody would write code like that.


Cheers,

- Alf

Mathias Gaunard

unread,
Dec 7, 2009, 12:22:41 PM12/7/09
to
On Nov 15, 6:47 am, Scott Meyers <use...@aristeia.com> wrote:
> C99 offers variable length arrays (VLAs), whereby the size of an array
> with automatic storage duration is specified at runtime:

This basically means you're introducing dynamically-sized objects.
This of course means a whole new category of types, and thus requires
an overhaul of the whole type system, but that's not necessarily a
stopping point if there really is interest.
While this is interesting and has many uses, the way VLAs work is
quite restrained and make it of very limited interest in C++, at least
in my opinion.
You wouldn't be able to use VLAs to build a string type that uses the
stack, for example, and ability to abstract is the main feature of C+
+.

I believe a much more interesting solution would be to introduce a two-
phase construction mechanism in the object system: first you call a
static member function that takes the same arguments as the
constructor, returns the size the object wants to be, then you
allocate memory, then you call the actual constructor.

This would not only allow to define containers that use the stack, but
would also avoid a lot of overhead when using typical type erasure
techniques, removing indirection and memory management altogether, and
could be applied to have high-performance heterogeneous node-based
containers (each node would simply be of a different size, no need to
introduce another level of indirection -- std::list could probably be
extended to support it so that you can add derived objects in a list
of T).
Basically, we get lower coupling, more type dynamism, better
efficiency, more compact memory usage, etc.

A dynamically-sized object could fairly easily be "converted" to a
statically-sized one by simply copying it on the freestore and
wrapping the pointer.

Note all this can be implemented as a library, except you'd have to
call a macro instead of a constructor.
I have been toying with this kind of library framework myself to see
what the extent of what can be done, and I haven't reached it yet.

0 new messages