
dynamic buffer allocation at char buf[1]


Kok How Teh

Apr 10, 2010, 9:43:49 AM
Hi;
Is it possible to allocate a 20-byte buffer at buf in the following
structure?

struct {
    int len;
    char buf[1];
};
If yes, how? Thanks.

Regards.

August Karlstrom

Apr 10, 2010, 12:17:49 PM

No, at least not without breaking the abstraction that the C language
provides. Anyway, why would you want to do that? And what is the purpose
of the field len?


August

Ben Bacarisse

Apr 10, 2010, 1:16:12 PM
August Karlstrom <fusio...@gmail.com> writes:

> Kok How Teh wrote:
>> Hi;
>> Is it possible to allocate 20-byte buffer at buf in the following
>> structure?
>>
>> struct {
>> int len;
>> char buf[1];
>> };
>> If yes, how? Thanks.
>
> No, at least not without breaking the abstraction that the C language
> provides. Anyway, why would you want to do that?

Are you pointing out that the technique often called the "struct hack"
involves, technically, undefined behaviour? If so, you should probably
have been more explicit about that simply because it is such a commonly
used device despite being undefined.

So common, in fact, that there is an official C99 version:

struct buffer {
    size_t len;
    char buf[];
};

where buf is called a "flexible array member". There can be only one and
it must be the last member and there must be at least one other member.

> And what is the purpose of the field len?

len will presumably be used to record the number of elements that can be
stored and will be set after using malloc to allocate more than
sizeof(struct buffer) bytes.
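
To make that concrete, here is a minimal sketch of the C99 form applied to the 20-byte case from the original question. The helper names buffer_new and buffer_set are hypothetical and used only for illustration; a C99 (or later) compiler is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct buffer {
    size_t len;   /* number of usable bytes in buf */
    char buf[];   /* C99 flexible array member; must be the last member */
};

/* Hypothetical helper: one malloc covers the header plus len payload bytes. */
struct buffer *buffer_new(size_t len)
{
    struct buffer *bp;

    if (len > SIZE_MAX - sizeof *bp)   /* guard against size overflow */
        return NULL;
    bp = malloc(sizeof *bp + len);
    if (bp != NULL) {
        bp->len = len;                 /* record the capacity after allocating */
        memset(bp->buf, 0, len);
    }
    return bp;
}

/* Hypothetical helper: len is what makes a bounds check possible. */
void buffer_set(struct buffer *bp, size_t i, char c)
{
    assert(bp != NULL && i < bp->len);
    bp->buf[i] = c;
}
```

A caller would write struct buffer *bp = buffer_new(20); and release everything with a single free(bp).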

--
Ben.

August Karlstrom

Apr 10, 2010, 8:06:39 PM
Ben Bacarisse wrote:
[...]

> So common, in fact, that there is an official C99 version:
>
> struct buffer {
> size_t len;
> char buf[];
> };
>
> where buf is called a "flexible array member". There can be only one and
> it must be the last member and there must be at least one other member.

Oh my, that's ugly. And some say C is a simple language... well I have
my doubts.


August

Ian Collins

Apr 10, 2010, 8:57:05 PM

Unless you can suggest a cleaner way to have a variable-length field in a
struct, we're stuck with it. I've yet to come across a pretty hack!

--
Ian Collins

August Karlstrom

Apr 11, 2010, 9:05:15 AM

I don't quite get the advantage of the struct hack compared to just
having a pointer field like in

struct buffer {
    size_t len;
    char *buf;
};

Is it to avoid an extra call to malloc?


August

Willem

Apr 11, 2010, 9:34:39 AM
August Karlstrom wrote:
) I don't quite get the advantage of the struct hack compared to just
) having a pointer field like in
)
) struct buffer {
) size_t len;
) char *buf;
) };
)
) Is it to avoid an extra call to malloc?

- It also avoids an extra call to free()
- It avoids the extra allocated space for the pointer
- It keeps the items together in memory which improves locality of
reference (which is good for performance because of caching)
- A lot of file formats consist of blocks with a fixed-size header,
followed by a variable-size bit of data
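
The first two points can be seen side by side in a rough sketch like the one below. All type and function names here are hypothetical, and the flexible array member assumes a C99 compiler.

```c
#include <assert.h>
#include <stdlib.h>

/* Pointer version: header and data live in two separate allocations. */
struct ptr_buffer {
    size_t len;
    char *buf;
};

/* Flexible array member version: header and data share one allocation. */
struct flex_buffer {
    size_t len;
    char buf[];
};

struct ptr_buffer *ptr_buffer_new(size_t len)
{
    struct ptr_buffer *p;

    assert(len > 0);                 /* malloc(0) may legitimately return NULL */
    p = malloc(sizeof *p);           /* first malloc: the header */
    if (p == NULL)
        return NULL;
    p->buf = malloc(len);            /* second malloc: the data */
    if (p->buf == NULL) {
        free(p);
        return NULL;
    }
    p->len = len;
    return p;
}

void ptr_buffer_free(struct ptr_buffer *p)
{
    if (p != NULL) {
        free(p->buf);                /* two frees to match the two mallocs */
        free(p);
    }
}

struct flex_buffer *flex_buffer_new(size_t len)
{
    struct flex_buffer *f;

    assert(len > 0);
    f = malloc(sizeof *f + len);     /* one malloc, no stored pointer */
    if (f != NULL)
        f->len = len;
    return f;
}

void flex_buffer_free(struct flex_buffer *f)
{
    free(f);                         /* one free */
}
```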


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Ben Bacarisse

Apr 11, 2010, 10:33:58 AM
Willem <wil...@turtle.stack.nl> writes:

> August Karlstrom wrote:
> ) I don't quite get the advantage of the struct hack compared to just
> ) having a pointer field like in
> )
> ) struct buffer {
> ) size_t len;
> ) char *buf;
> ) };
> )
> ) Is it to avoid an extra call to malloc?
>
> - It also avoids an extra call to free()
> - It avoids the extra allocated space for the pointer
> - It keeps the items together in memory which improves locality of
> reference (which is good for performance because of caching)
> - A lot of file formats consist of blocks with a fixed-size header,
> followed by a variable-size bit of data

It's probably worth saying that in this specific instance (where the
buffer elements are characters) one can combine some of the
advantages of both (single allocation/free and some degree of locality)
by doing this:

struct buffer *bp = malloc(sizeof *bp + BUF_LENGTH);
if (bp) {
    bp->len = BUF_LENGTH;
    bp->buf = (char *)bp + sizeof *bp;
}

Space is wasted and it is of no use if one is trying to match a file
format, but it is well-defined where C99's solution is not available.

Even so, I am not sure I'd bother.
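
For completeness, a self-contained sketch of the same idea follows. The function name buffer_new_inline is hypothetical, and BUF_LENGTH is set to 20 only because that is the size in the original question. Pointing buf just past the header is safe here because the elements are char; other element types might need stricter alignment.

```c
#include <assert.h>
#include <stdlib.h>

#define BUF_LENGTH 20   /* assumed value, taken from the original question */

struct buffer {
    size_t len;
    char *buf;          /* points into the same allocation, just past the header */
};

/* Hypothetical helper wrapping the single-allocation trick. */
struct buffer *buffer_new_inline(void)
{
    struct buffer *bp = malloc(sizeof *bp + BUF_LENGTH);

    if (bp) {
        bp->len = BUF_LENGTH;
        bp->buf = (char *)bp + sizeof *bp;
        /* Same address as the first byte after the struct. */
        assert(bp->buf == (char *)(bp + 1));
    }
    return bp;
}
```

A single free(bp) releases both the header and the data, since they were obtained with one malloc.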

--
Ben.

Eric Sosman

Apr 11, 2010, 10:48:29 AM
On 4/11/2010 9:34 AM, Willem wrote:
> August Karlstrom wrote:
> ) I don't quite get the advantage of the struct hack compared to just
> ) having a pointer field like in
> )
> ) struct buffer {
> ) size_t len;
> ) char *buf;
> ) };
> )
> ) Is it to avoid an extra call to malloc?
>
> - It also avoids an extra call to free()
> - It avoids the extra allocated space for the pointer
> - It keeps the items together in memory which improves locality of
> reference (which is good for performance because of caching)
> - A lot of file formats consist of blocks with a fixed-size header,
> followed by a variable-size bit of data

For that last, a struct of any kind is risky. Padding,
you know. (Also byte order and other such representational
issues, but they're not unique to structs.)
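
One common way around both problems is to read and write the header byte by byte instead of copying the struct. The sketch below assumes a made-up 5-byte header (one kind byte, then a 32-bit big-endian length); the layout and all names are hypothetical and not taken from any real file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory form; the compiler may insert padding after kind. */
struct header {
    uint8_t kind;
    uint32_t length;
};

/* On-disk size of the hypothetical format: 1 + 4 bytes, no padding. */
enum { HEADER_WIRE_SIZE = 5 };

/* Write the fields one byte at a time, so padding and host byte order
   never reach the file. */
void header_encode(const struct header *h, unsigned char out[HEADER_WIRE_SIZE])
{
    assert(h != NULL && out != NULL);
    out[0] = h->kind;
    out[1] = (unsigned char)(h->length >> 24);
    out[2] = (unsigned char)(h->length >> 16);
    out[3] = (unsigned char)(h->length >> 8);
    out[4] = (unsigned char)(h->length);
}

/* Rebuild the in-memory struct from the 5 on-disk bytes. */
void header_decode(const unsigned char in[HEADER_WIRE_SIZE], struct header *h)
{
    assert(h != NULL && in != NULL);
    h->kind = in[0];
    h->length = ((uint32_t)in[1] << 24) | ((uint32_t)in[2] << 16)
              | ((uint32_t)in[3] << 8)  |  (uint32_t)in[4];
}
```

On many implementations sizeof(struct header) is 8 rather than 5, which is exactly the padding being warned about.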

--
Eric Sosman
eso...@ieee-dot-org.invalid

Willem

Apr 11, 2010, 11:54:45 AM
Eric Sosman wrote:
) On 4/11/2010 9:34 AM, Willem wrote:
)> - A lot of file formats consist of blocks with a fixed-size header,
)> followed by a variable-size bit of data
)
) For that last, a struct of any kind is risky. Padding,
) you know. (Also byte order and other such representational
) issues, but they're not unique to structs.)

I know. That's why a lot of compilers have extensions to force a
struct to not be padded. Should have been added to the language IMO.

Along with multi-character character initializers.
For example: unsigned long exif_tag = 'EXIF';

Unfortunately, that was never standardized. A pity.

Ersek, Laszlo

Apr 11, 2010, 12:14:39 PM
On Sun, 11 Apr 2010, Willem wrote:

> Along with multi-character character initializers.
> For example: unsigned long exif_tag = 'EXIF';
>
> Unfortunately, that was never standardized. A pity.

{ little endian, big endian, ... } x { ascii, ebcdic, ... }

OTOH, it is standardized to some degree:

C89 6.1.3.4 Character constants

----v----
[...] The value of an integer character constant containing more than one
character, or containing a character or escape sequence not represented in
the basic execution character set, is implementation-defined. [...]
----^----

C99 6.4.4.4 Character constants p10

----v----
The value of an integer character constant containing more than one
character (e.g., 'ab'), or containing a character or escape sequence that
does not map to a single-byte execution character, is
implementation-defined.
----^----

Cheers,
lacos

Moi

Apr 11, 2010, 2:15:43 PM
On Sun, 11 Apr 2010 15:33:58 +0100, Ben Bacarisse wrote:

> Willem <wil...@turtle.stack.nl> writes:
>
>
> It's probably worth saying that in this specific instance (where the
> buffer elements are characters) one can combine some of the advantages
> of both (single allocation/free and some degree of locality) by doing
> this:
>
> struct buffer *bp = malloc(sizeof *bp + BUF_LENGTH);
> if (bp) {
> bp->len = BUF_LENGTH;
> bp->buf = (char *)bp + sizeof *bp;
> }
>

Personally I would prefer:
bp->buf = (char *) (bp+1);

or maybe even...
bp->buf = (char*) (&bp[1]);


> Even so, I am not sure I'd bother.

Me neither. I use the struct hack wherever I go.
But that is not allowed here.

AvK

Eric Sosman

Apr 11, 2010, 2:19:30 PM
On 4/11/2010 11:54 AM, Willem wrote:
> Eric Sosman wrote:
> ) On 4/11/2010 9:34 AM, Willem wrote:
> )> - A lot of file formats consist of blocks with a fixed-size header,
> )> followed by a variable-size bit of data
> )
> ) For that last, a struct of any kind is risky. Padding,
> ) you know. (Also byte order and other such representational
> ) issues, but they're not unique to structs.)
>
> I know. That's why a lot of compilers have extensions to force a
> struct to not be padded. Should have been added to the language IMO.

To provide a still-inadequate non-solution for the
benefit of lazy programmers? Great idea: Submit a proposal
to the C1x Committee ...

> Along with multi-character character initializers.
> For example: unsigned long exif_tag = 'EXIF';
>
> Unfortunately, that was never standardized. A pity.

Looks like the meat of another proposal.

(In short: Bah!)

--
Eric Sosman
eso...@ieee-dot-org.invalid

Seebs

Apr 11, 2010, 2:24:03 PM
On 2010-04-11, Eric Sosman <eso...@ieee-dot-org.invalid> wrote:
> On 4/11/2010 11:54 AM, Willem wrote:
>> I know. That's why a lot of compilers have extensions to force a
>> struct to not be padded. Should have been added to the language IMO.

> To provide a still-inadequate non-solution for the
> benefit of lazy programmers? Great idea: Submit a proposal
> to the C1x Committee ...

There are cases where it's useful in code which isn't expected to be
portable to many machines, but yeah.

>> Along with multi-character character initializers.
>> For example: unsigned long exif_tag = 'EXIF';
>>
>> Unfortunately, that was never standardized. A pity.

> Looks like the meat of another proposal.

There's a good reason not to standardize it, too, which is that its
meaning is deeply ambiguous, and whichever answer you pick will surprise
some people and break their code.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Nick Keighley

Apr 11, 2010, 4:56:57 PM
On 11 Apr, 16:54, Willem <wil...@turtle.stack.nl> wrote:
> Eric Sosman wrote:
>
> ) On 4/11/2010 9:34 AM, Willem wrote:
> )> - A lot of file formats consist of blocks with a fixed-size header,
> )>    followed by a variable-size bit of data
> )
> )      For that last, a struct of any kind is risky.  Padding,
> ) you know.  (Also byte order and other such representational
> ) issues, but they're not unique to structs.)
>
> I know.  That's why a lot of compilers have extensions to force a
> struct to not be padded.  Should have been added to the language IMO.

nooo!!

on some architectures this becomes very inefficient. I think having
the option tends to lead to its overuse.


> Along with multi-character character initializers.
> For example: unsigned long exif_tag = 'EXIF';
>
> Unfortunately, that was never standardized.  A pity.

jolly good thing

Michael Foukarakis

Apr 12, 2010, 3:02:56 AM
On Apr 11, 6:54 pm, Willem <wil...@turtle.stack.nl> wrote:

> Along with multi-character character initializers.
> For example: unsigned long exif_tag = 'EXIF';

Posting that as if it were a piece of code actually encountered in the
wild and considered half-useful made me shed a little tear.

Seebs

Apr 12, 2010, 2:59:40 AM

It was an idiom very popular on pre-OS X macs, where it was used to
create 32-bit values for use as "creator" and "type" tags.

Not a big fan, myself, but it was a plausible idiom, because it allowed
you to compare the 32-bit values rather than iterating through strings.

Alan Curry

Apr 12, 2010, 6:54:48 AM
In article <slrnhs5hfh.e5q...@guild.seebs.net>,
Seebs <usenet...@seebs.net> wrote:
|On 2010-04-12, Michael Foukarakis <electr...@gmail.com> wrote:
|> On Apr 11, 6:54 pm, Willem <wil...@turtle.stack.nl> wrote:
|>> Along with multi-character character initializers.
|>> For example: unsigned long exif_tag = 'EXIF';
|
|> Posting that as if it were a piece of code actually encountered in the
|> wild and considered half-useful made me shed a little tear.
|
|It was an idiom very popular on pre-OS X macs, where it was used to
|create 32-bit values for use as "creator" and "type" tags.

I suppose the byte order question didn't bother anyone, because everyone
knew Macs would forever be big-endian.

--
Alan Curry

Willem

Apr 12, 2010, 11:51:28 AM
Alan Curry wrote:
) In article <slrnhs5hfh.e5q...@guild.seebs.net>,
) Seebs <usenet...@seebs.net> wrote:
)|On 2010-04-12, Michael Foukarakis <electr...@gmail.com> wrote:
)|> On Apr 11, 6:54 pm, Willem <wil...@turtle.stack.nl> wrote:
)|>> Along with multi-character character initializers.
)|>> For example: unsigned long exif_tag = 'EXIF';
)|
)|> Posting that as if it were a piece of code actually encountered in the
)|> wild and considered half-useful made me shed a little tear.
)|
)|It was an idiom very popular on pre-OS X macs, where it was used to
)|create 32-bit values for use as "creator" and "type" tags.
)
) I suppose the byte order question didn't bother anyone, because everyone
) knew Macs would forever be big-endian.

For the record: I encountered it in C code for AmigaOS, for reading
their standard image format (IFF). This was in the early nineties.

Seebs

Apr 12, 2010, 12:39:02 PM
On 2010-04-12, Willem <wil...@turtle.stack.nl> wrote:
[in reference to:]

> )|>> For example: unsigned long exif_tag = 'EXIF';

> For the record: I encountered it in C code for AmigaOS, for reading
> their standard image format (IFF). This was in the early nineties.

Trivia point: IFF is not just an image format, but an all-purpose file
format definition, used as well for many other types of data. It's actually
still in use, somewhat, although the widespread existence of datums in
excess of 4GB has reduced its appeal slightly.

It's actually a really *good* format. I believe it was developed by
Electronic Arts, although C= standardized on it.

IFF is a great example of how to design a file format which is reasonably
friendly to efficient use in C, and flexible/powerful enough to be a good
fit for many tasks.
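
As a rough illustration of why it maps well onto C, here is a sketch of reading one chunk header from a memory buffer. It assumes the usual IFF chunk layout as I understand it (a 4-byte ID, a 32-bit big-endian size, then the data, padded to an even length); the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of one chunk. */
struct iff_chunk {
    char id[5];                  /* 4-byte chunk ID plus a terminating NUL */
    uint32_t size;               /* payload size in bytes; 32 bits, hence the 4GB limit */
    const unsigned char *data;   /* points into the caller's buffer */
};

/* Parses one chunk at p. Returns the number of bytes the chunk occupies
   including its pad byte (which may exceed avail if the final pad byte is
   missing), or 0 if the buffer is too short for header plus payload. */
size_t iff_chunk_parse(const unsigned char *p, size_t avail, struct iff_chunk *out)
{
    size_t total;

    assert(p != NULL && out != NULL);
    if (avail < 8)
        return 0;
    memcpy(out->id, p, 4);
    out->id[4] = '\0';
    /* The size is assembled byte by byte, so host endianness does not matter. */
    out->size = ((uint32_t)p[4] << 24) | ((uint32_t)p[5] << 16)
              | ((uint32_t)p[6] << 8)  |  (uint32_t)p[7];
    if (out->size > avail - 8)
        return 0;
    out->data = p + 8;
    total = 8 + (size_t)out->size;
    if (out->size & 1)
        total++;                 /* odd-sized payloads are followed by a pad byte */
    return total;
}
```

Walking a file is then a loop that calls iff_chunk_parse and advances by the returned count until it yields 0 or the buffer is exhausted.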

Nick

Apr 13, 2010, 2:43:14 PM
Ben Bacarisse <ben.u...@bsb.me.uk> writes:

I do that somewhere. One reason is that you can mix-and-match these
structures with others where the pointer points to a string held
elsewhere. Each needs its own allocate and initialise function(s), but
you can then do whatever you like to them, including free-ing.
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk

Nick

Apr 13, 2010, 2:45:57 PM
Willem <wil...@turtle.stack.nl> writes:

> Eric Sosman wrote:
> ) On 4/11/2010 9:34 AM, Willem wrote:
> )> - A lot of file formats consist of blocks with a fixed-size header,
> )> followed by a variable-size bit of data
> )
> ) For that last, a struct of any kind is risky. Padding,
> ) you know. (Also byte order and other such representational
> ) issues, but they're not unique to structs.)
>
> I know. That's why a lot of compilers have extensions to force a
> struct to not be padded. Should have been added to the language IMO.

I'd also like the opposite - a way to say "nothing depends on the order
and relative alignment of the members of this structure - feel free to
re-arrange to minimise space, maximise access speed, whatever".

That would be really useful for something like:

struct thing {
    char type_of_item_1;
    void *item_1;
    char type_of_item_2;
    void *item_2;
};

or even things involving bit-fields. You can happily put the elements
in the most logical order, without caring that you are wasting memory.
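
The cost of the logical order can be measured by comparing it with a hand-reordered version. The second struct and the helper function below are hypothetical; actual sizes depend on the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Logical order: each char is typically followed by padding so that the
   pointer after it is properly aligned. */
struct thing_logical {
    char type_of_item_1;
    void *item_1;
    char type_of_item_2;
    void *item_2;
};

/* Hand-reordered: pointers first, chars grouped at the end. */
struct thing_by_hand {
    void *item_1;
    void *item_2;
    char type_of_item_1;
    char type_of_item_2;
};

/* Hypothetical helper: bytes saved per object by reordering by hand. */
size_t thing_bytes_saved(void)
{
    /* The standard guarantees no padding before the first member. */
    assert(offsetof(struct thing_by_hand, item_1) == 0);
    if (sizeof(struct thing_by_hand) > sizeof(struct thing_logical))
        return 0;
    return sizeof(struct thing_logical) - sizeof(struct thing_by_hand);
}
```

On a typical implementation with 8-byte pointers the two sizes are 32 and 24 bytes, though nothing in the standard promises those numbers.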

Phil Carmody

Apr 13, 2010, 5:04:59 PM
Nick <3-no...@temporary-address.org.uk> writes:
> Willem <wil...@turtle.stack.nl> writes:
>> Eric Sosman wrote:
>> ) On 4/11/2010 9:34 AM, Willem wrote:
>> )> - A lot of file formats consist of blocks with a fixed-size header,
>> )> followed by a variable-size bit of data
>> )
>> ) For that last, a struct of any kind is risky. Padding,
>> ) you know. (Also byte order and other such representational
>> ) issues, but they're not unique to structs.)
>>
>> I know. That's why a lot of compilers have extensions to force a
>> struct to not be padded. Should have been added to the language IMO.
>
> I'd also like the opposite - a way to say "nothing depends on the order
> and relative alignment of the members of this structure - feel free to
> re-arrange to minimise space, maximise access speed, whatever".
>
> That would be really useful for something like:
>
> struct thing {
> char type_of_item_1;
> void *item_1;
> char type_of_item_2;
> void *item_2;
> };
>
> or even things involving bit-fields. You can happily put the elements
> in the most logical order, without caring that you are wasting memory.

I sometimes wish C compilers could be persuaded to lay out

struct { int x, y, z; } things[1024];

as if it were more like:

struct lots_of_things {
    int things_x[1024];
    int things_y[1024];
    int things_z[1024];
} rearranged_things;

If I'm having to pay attention to cache usage, and I don't use half
of the members of thousands of structures, I don't want to waste the
cache by pulling them in.
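
Since no compiler does this automatically, the rearrangement has to be written by hand. The sketch below shows both layouts and a loop that touches only the x values; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { N_THINGS = 1024 };

/* Array-of-structures layout: x, y and z are interleaved in memory. */
struct thing {
    int x, y, z;
};

/* Structure-of-arrays layout: all x values are contiguous. */
struct lots_of_things {
    int things_x[N_THINGS];
    int things_y[N_THINGS];
    int things_z[N_THINGS];
};

/* Copies the first n structures into the rearranged layout. */
void things_to_soa(const struct thing *in, size_t n, struct lots_of_things *out)
{
    size_t i;

    assert(n <= N_THINGS);
    for (i = 0; i < n; i++) {
        out->things_x[i] = in[i].x;
        out->things_y[i] = in[i].y;
        out->things_z[i] = in[i].z;
    }
}

/* Sums x over the interleaved layout: every cache line fetched also
   carries y and z values that are never used. */
long sum_x_aos(const struct thing *things, size_t n)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += things[i].x;
    return sum;
}

/* Sums x over the rearranged layout: only things_x is read. */
long sum_x_soa(const struct lots_of_things *t, size_t n)
{
    long sum = 0;
    size_t i;

    assert(n <= N_THINGS);
    for (i = 0; i < n; i++)
        sum += t->things_x[i];
    return sum;
}
```

Both functions return the same result; whether the second is measurably faster depends on the machine and on how much of each structure the surrounding code really uses.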

Phil
--
I find the easiest thing to do is to k/f myself and just troll away
-- David Melville on r.a.s.f1
