conservative static region allocator...

Chris M. Thomasson

unread,

Mar 7, 2009, 10:45:55 AM3/7/09

to

Here is the code:

This region allocator does not force all allocations to be aligned up to a
so-called maximum alignment like `malloc' and friends are required to do.
For instance, allocations for a single char will not be forced to waste the
extra alignment space. This can have benefits in space constrained
environments (e.g., embedded devices). Also, its static in nature and can be
fed with a buffer residing on the stack. There are no underlying calls to
`malloc'. This can be useful in free standing environments that do not
support heap allocation. One other thing. Its a pure region allocator which
means you do not need to free individual objects. You can free the entire
region. This is more efficient than calling `malloc()/free()' for each
object. For instance, you can destroy all the nodes in a linked list just by
freeing the region(s) in which the nodes were allocated from. There is no
need to traverse the list to free its nodes. Its a sort of "manual" garbage
collection... ;^)

Anyway, do you have any suggestions on how to improve the code?

Thanks, and I hope somebody might find this useful...

;^o

Malcolm McLean

unread,

Mar 8, 2009, 11:09:57 AM3/8/09

to

"Chris M. Thomasson" <n...@spam.invalid> wrote in message

>
> Anyway, do you have any suggestions on how to improve the code?
>
> Thanks, and I hope somebody might find this useful...
>

You very cleverly fight the C language to get alignments out of a buffer.

One suggestion I would make is to change the interface so that
region_alloc_type() takes a count as well as a type. In practise people are
going to want to allocate short arrays with this, most of the time.

I'd also consider changing the name to rgalloc_type(). The rule of two.

I'd strip out all the debugging code, for the release version. People don't
want clutter in code they're trying to read.

The strategy of assert failing on out of memory I'm not happy about. I think
maybe have two constructors, the default one which assert fails on out of
meory conditions, another one which returns NULL. This is easily implemented
by having a flag in region.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Keith Thompson

unread,

Mar 8, 2009, 11:02:37 PM3/8/09

to

"Malcolm McLean" <regn...@btinternet.com> writes:
> "Chris M. Thomasson" <n...@spam.invalid> wrote in message
>>
>> Anyway, do you have any suggestions on how to improve the code?
>>
>> Thanks, and I hope somebody might find this useful...
>>
> You very cleverly fight the C language to get alignments out of a buffer.
>
> One suggestion I would make is to change the interface so that
> region_alloc_type() takes a count as well as a type. In practise
> people are going to want to allocate short arrays with this, most of
> the time.
>
> I'd also consider changing the name to rgalloc_type(). The rule of two.

[...]

Where does this "rule of two" come from? Personally, I think that
abbreviating "region_" to "rg" just makes the name harder to
understand; do you see some overriding benefit?

--
Keith Thompson (The_Other_Keith) k...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Flash Gordon

unread,

Mar 9, 2009, 3:27:37 AM3/9/09

to

Keith Thompson wrote:
> "Malcolm McLean" <regn...@btinternet.com> writes:
>> "Chris M. Thomasson" <n...@spam.invalid> wrote in message
>>> Anyway, do you have any suggestions on how to improve the code?
>>>
>>> Thanks, and I hope somebody might find this useful...
>>>
>> You very cleverly fight the C language to get alignments out of a buffer.
>>
>> One suggestion I would make is to change the interface so that
>> region_alloc_type() takes a count as well as a type. In practise
>> people are going to want to allocate short arrays with this, most of
>> the time.
>>
>> I'd also consider changing the name to rgalloc_type(). The rule of two.
> [...]
>
> Where does this "rule of two" come from? Personally, I think that
> abbreviating "region_" to "rg" just makes the name harder to
> understand; do you see some overriding benefit?

All of the rules Malcolm quotes are ones he has invented with, as far as
I can see, no supporting evidence. I agree that the abbreviation would
make it harder to read.
--
Flash Gordon

nick_keigh...@hotmail.com

unread,

Mar 9, 2009, 4:16:08 AM3/9/09

to

On 8 Mar, 15:09, "Malcolm McLean" <regniz...@btinternet.com> wrote:

<snip>

> I'd also consider changing the name [of region_alloc_type()] to

> rgalloc_type(). The rule of two.

what is the Rule Of Two?

<snip>

pete

unread,

Mar 9, 2009, 7:46:43 AM3/9/09

to

Two by two, hands of blue

Agents of the Blue Sun Corporation,
always travel in pairs, and wear blue gloves.

http://www.google.com/search?hl=en&q=Two+by+two%2C+hands+of+blue

Results 1 - 10 of about 42,000,000 for Two by two, hands of blue.

--
pete

Richard Bos

unread,

Mar 11, 2009, 2:08:22 PM3/11/09

to

nick_keigh...@hotmail.com wrote:

There is no such thing, but there _is_ a Double Rule of Three. Ask the
gardener if you don't believe me.

Richard

Richard Heathfield

unread,

Mar 11, 2009, 2:30:10 PM3/11/09

to

nick_keigh...@hotmail.com said:

If there is such a rule, it is unreasonable, relying as it does on
an unreasonable number. There are in fact only four reasonable
numbers for programming languages, so it ought to be the Rule of
Four.

The four reasonable numbers are 0, 1, 4, and oo:

0: it is reasonable not to allow something at all (e.g. forbidding
void functions to return a value);
1: it is reasonable to allow something once but no more (e.g.
forbidding int functions to return two values);
4: it is reasonable for the list of reasonable numbers to include
the number of items on the list;
oo (NaN, but let's pretend not to notice): it is reasonable to allow
something infinitely often, where "infinitely" is defined as
"enough times that you are unlikely to run out accidentally if
you're being sensible" (e.g. where the length of variable names is
concerned, 30+ can reasonably be considered infinite).

When a programming language designer allows you to do something a
small number of times (but more than once), that's unreasonable. In
C's case, the obvious candidate is the six-unique-characters
limitation in C90 linkers, and this is a flaw not so much of C as
of ISO. As far as I'm aware, K&R C imposed no such limit. Of
course, even C90 didn't exactly "impose" it - it just gave you no
recourse if breaching the limit broke things.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Ben Bacarisse

unread,

Mar 11, 2009, 3:09:20 PM3/11/09

to

Richard Heathfield <r...@see.sig.invalid> writes:

> nick_keigh...@hotmail.com said:
>
>> On 8 Mar, 15:09, "Malcolm McLean" <regniz...@btinternet.com>
>> wrote:
>>
>> <snip>
>>
>>> I'd also consider changing the name [of region_alloc_type()] to
>>> rgalloc_type(). The rule of two.
>>
>> what is the Rule Of Two?
>
> If there is such a rule, it is unreasonable, relying as it does on
> an unreasonable number.

<snip>

> When a programming language designer allows you to do something a
> small number of times (but more than once), that's unreasonable. In
> C's case, the obvious candidate is the six-unique-characters
> limitation in C90 linkers, and this is a flaw not so much of C as
> of ISO. As far as I'm aware, K&R C imposed no such limit.

If anything K&R C is worse (as might be expected given the era) in
that is states that not more than 8 are significant for internal names
and external names "may be less". The effect being that no minimum
length is assured, though some example systems are given with the
least generous being 6 characters ignoring case (i.e. Strcpy ==
strcpy).

> Of
> course, even C90 didn't exactly "impose" it - it just gave you no
> recourse if breaching the limit broke things.

Ditto K&R.

--
Ben.

Richard Heathfield

unread,

Mar 11, 2009, 3:22:13 PM3/11/09

to

Ben Bacarisse said:

> Richard Heathfield <r...@see.sig.invalid> writes:
>
<snip>

>> In C's case, the obvious candidate is the six-unique-characters
>> limitation in C90 linkers, and this is a flaw not so much of C as
>> of ISO. As far as I'm aware, K&R C imposed no such limit.
>
> If anything K&R C is worse (as might be expected given the era) in

> that [it] states that not more than 8 are significant for internal

> names and external names "may be less".

Well well - I just learned something. Thanks, Ben.

Walter Banks

unread,

Mar 11, 2009, 6:08:26 PM3/11/09

to

Richard Heathfield wrote:

> When a programming language designer allows you to do something a
> small number of times (but more than once), that's unreasonable. In
> C's case, the obvious candidate is the six-unique-characters
> limitation in C90 linkers, and this is a flaw not so much of C as
> of ISO.

Its origins go back to early tool sets. The Honeywell 6000 stored
6 characters in a 36 bit word later 4 9 bit chars in the same word.
PDP 8's stored two characters in 12 bit words and some PDP11
linkers which had a 6 character limit in external symbols. The linker
had 40 symbols in the supported character set and could encode
6 characters in 2 16 bit numbers.

> As far as I'm aware, K&R C imposed no such limit. Of
> course, even C90 didn't exactly "impose" it - it just gave you no
> recourse if breaching the limit broke things.

K&R (pp179) defines first 8 at most significant characters in
identifiers and notes some of the processor exceptions.

Regards,

--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com

Eric Sosman

unread,

Mar 11, 2009, 5:24:27 PM3/11/09

to

Walter Banks wrote:
>
> Richard Heathfield wrote:
>
>> When a programming language designer allows you to do something a
>> small number of times (but more than once), that's unreasonable. In
>> C's case, the obvious candidate is the six-unique-characters
>> limitation in C90 linkers, and this is a flaw not so much of C as
>> of ISO.
>
> Its origins go back to early tool sets. The Honeywell 6000 stored
> 6 characters in a 36 bit word later 4 9 bit chars in the same word.
> PDP 8's stored two characters in 12 bit words and some PDP11
> linkers which had a 6 character limit in external symbols. The linker
> had 40 symbols in the supported character set and could encode
> 6 characters in 2 16 bit numbers.

... using an encoding called "RAD50," for "radix forty." Yes.

--
Eric....@sun.com

Bartc

unread,

Mar 11, 2009, 6:39:19 PM3/11/09

to

"Walter Banks" <wal...@bytecraft.com> wrote in message
news:49B8365A...@bytecraft.com...

>
>
> Richard Heathfield wrote:
>
>> When a programming language designer allows you to do something a
>> small number of times (but more than once), that's unreasonable. In
>> C's case, the obvious candidate is the six-unique-characters
>> limitation in C90 linkers, and this is a flaw not so much of C as
>> of ISO.
>
> Its origins go back to early tool sets. The Honeywell 6000 stored
> 6 characters in a 36 bit word later 4 9 bit chars in the same word.
> PDP 8's stored two characters in 12 bit words and some PDP11
> linkers which had a 6 character limit in external symbols. The linker
> had 40 symbols in the supported character set and could encode
> 6 characters in 2 16 bit numbers.

I remember the use of 'sixbit' on the 36-bit pdp10, giving 6 upper-case
characters per word. And this seemed to pervade everything including
filenames and program variables.

Far from being a limitation, it made some text processing (and compiler work
in my case) easy and very efficient. Made easier by the users at the time
accepting the six-character limit on names.

--
Bartc

Richard

unread,

Mar 11, 2009, 7:01:47 PM3/11/09

to

"Bartc" <ba...@freeuk.com> writes:

Come off it Bart, of course it was a limitation. The "efficiency" offset
by the obvious limitations imposed by striving to get a meaningful name
into 6 characters.

--
You can’t prove anything about a program written in C or FORTRAN.
It’s really just Peek and Poke with some syntactic sugar.

Chris M. Thomasson

unread,

Mar 11, 2009, 9:03:50 PM3/11/09

to

"Malcolm McLean" <regn...@btinternet.com> wrote in message
news:mcydnV52QfAuQi7U...@bt.com...

> "Chris M. Thomasson" <n...@spam.invalid> wrote in message
>>
>> Anyway, do you have any suggestions on how to improve the code?
>>
>> Thanks, and I hope somebody might find this useful...
>>
> You very cleverly fight the C language to get alignments out of a buffer.

;^)

Yeah, it does seem as if it will work pretty darn good as long as
`sizeof(size_t) == sizeof(void*)'. Humm... Perhaps I should add a macro that
the user can pre-define in order to setup a custom integral type that can be
freely converted to a void* in the cast that `sizeof(size_t) !=
sizeof(void*)'. Something simple like this:
___________________________________________________________
#if ! defined (CUSTOM_UINTPTR_TYPE)
typedef size_t uintptr_type;
#else
typedef CUSTOM_UINTPTR_TYPE uintptr_type;
#endif

typedef char static_assert[
sizeof(uintptr_type) == sizeof(void*) ? 1 : -1
];
___________________________________________________________

That way, if the user knows that size_t is not sufficient, they can simply
define `CUSTOM_UINTPTR_TYPE' to a type that is sufficient. Also, I should
probably try to detect if the user is compiling on a C99 compiler, and just
use the types from <stdint.h> (e.g., uintptr_t). However, I would have to
check to see if its under a XSI conformant system. AFAICT, the types
`intptr_t/uintptr_t' are optional. Well, I guess I could just see if the
`UINTPTR_MAX' macro is defined.

> One suggestion I would make is to change the interface so that
> region_alloc_type() takes a count as well as a type. In practise people
> are going to want to allocate short arrays with this, most of the time.

Completely agreed; here is the mutated code:

http://pastebin.com/f207f6232

> I'd also consider changing the name to rgalloc_type(). The rule of two.

I defined an alternative "condensed" API as follows:
______________________________________________________
#define rinit region_init
#define ralloc region_alloc
#define ralloct region_alloc_type
#define rallocex region_alloc_ex
#define rflush region_flush
______________________________________________________

It certainly reduces the amount of typing...

;^)

> I'd strip out all the debugging code, for the release version. People
> don't want clutter in code they're trying to read.

You mean remove it completely from the posted code? The preprocessor already
removes it when `NDEBUG' is defined. I must be totally misunderstanding you.

> The strategy of assert failing on out of memory I'm not happy about. I
> think maybe have two constructors, the default one which assert fails on
> out of meory conditions, another one which returns NULL. This is easily
> implemented by having a flag in region.

Where are you seeing this condition? Humm... Well, if the current state of a
region cannot satisfy the requested allocation, it does indeed return NULL,
and no assertion is tripped. For instance, the following program with the
current region code as-is does not barf up an assertion failure:
______________________________________________________
int main(void) {
struct region region;
unsigned char buffer[128] = { '\0' };
rinit(&region, buffer, sizeof(buffer));
ralloc(&region, 10000);
return 0;
}
______________________________________________________

What am I missing?

CBFalconer

unread,

Mar 11, 2009, 9:05:16 PM3/11/09

to

Bartc wrote:
> "Walter Banks" <wal...@bytecraft.com> wrote in message

Remember that those 40 characters included 26 upper-case letters,
ten digits, and a blank to mark 'the end'. That's 37 chars, which
left 3 available for other uses. Very useful for compressing
Pascal names (which have only one case).

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.