[OT] Buddy System Memory Allocator

Mark

unread,

Nov 4, 2012, 3:10:48 PM11/4/12

to

Hello

I am looking at writing a buddy system memory allocator in assembler for a
Forth system.

I am not sure how best to implement the tracking of allocated blocks. I've
read lots of papers and articles regarding buddy allocation and think that I
understand the concepts quite well. I am working with embedded systems so I
want to keep the memory and processing overheads as low as possible.

I understand how I am going to handle free blocks, i.e. using a linked-list
where, apart from the initial pointer, the free list pointers are contained
within the free blocks themselves. I am going to maintain a free list per
block size, so the only additional memory overhead is an initial pointer per
block size. In this case, for example, 32 free lists will cover up to a
total memory of 2^31 * (minimum block size), so I could quite easily operate
with a minimum block size of 1 byte if I so desired.

However, I'm not sure about the tracking of allocated blocks. I want a quick
and easy way of tracking allocated blocks and buddies using a method that
doesn't involve dynamically growing a linked-list or tree at run-time. It
seems that this is commonly done using a bitmap, but I don't understand how
you keep this manageable if you have a reasonable amount of memory and want
a relatively small minimum block size. A bitmap of 32 bits will only give me
coverage for 8k of memory if I use a minimum block size of 256 bytes. I am
trying to figure out how to avoid having a bitmap that is hundreds or
thousands of bits long. Is there any easier or alternative technique? I also
imagine that I need a method of tracking the sizes of the allocated blocks
(in addition to the bitmap) so that a 'free' knows how many bytes to
de-allocate.

Any help would be much appreciated.

Thanks

Mark

Hugh Aguilar

unread,

Nov 4, 2012, 6:05:49 PM11/4/12

to

On Nov 4, 1:11 pm, "Mark" <m...@dibsco.co.uk> wrote:
> Hello
>
> I am looking at writing a buddy system memory allocator in assembler for a
> Forth system.
>
> I am not sure how best to implement the tracking of allocated blocks. I've
> read lots of papers and articles regarding buddy allocation and think that I
> understand the concepts quite well. I am working with embedded systems so I
> want to keep the memory and processing overheads as low as possible.

I've never heard of a "buddy system" memory allocator. Can you provide
any links to discussions of this?

> I understand how I am going to handle free blocks, i.e. using a linked-list
> where, apart from the initial pointer, the free list pointers are contained
> within the free blocks themselves. I am going to maintain a free list per
> block size, so the only additional memory overhead is an initial pointer per
> block size. In this case, for example, 32 free lists will cover up to a
> total memory of 2^31 * (minimum block size), so I could quite easily operate
> with a minimum block size of 1 byte if I so desired.

This seems pretty straight-forward. When you want to allocate some
memory, you find the smallest (or maybe the largest) block that will
work, and use that. Whatever is left over gets moved to the
appropriate list.

I wouldn't use a minimum of 1 byte as that is a waste of memory
considering that the link field is one word in size. Also, a lot of
processors have a "paragraph" that can be worked with efficiently. On
the 16-bit x86 it is 16 bytes. On most micro-controllers (including
the MSP-430 which I presume you are working on) it is 1 word.

On a processor with data caching, I would use trees rather than lists,
and sort them by their address. If you always prefer the lowest (or
the highest) address, your blocks will tend to be close together. This
might help somewhat in that if you are accessing several blocks at the
same time, all of them might be close enough together to be in the
cache together.

Be sure to provide support for my ALLOCATION word! That is a huge help
to the user. See my novice package for more on this. Also, read this
article:
http://www.forth.org/novice.pdf

> However, I'm not sure about the tracking of allocated blocks. I want a quick
> and easy way of tracking allocated blocks and buddies using a method that
> doesn't involve dynamically growing a linked-list or tree at run-time. It
> seems that this is commonly done using a bitmap, but I don't understand how
> you keep this manageable if you have a reasonable amount of memory and want
> a relatively small minimum block size. A bitmap of 32 bits will only give me
> coverage for 8k of memory if I use a minimum block size of 256 bytes. I am
> trying to figure out how to avoid having a bitmap that is hundreds or
> thousands of bits long. Is there any easier or alternative technique? I also
> imagine that I need a method of tracking the sizes of the allocated blocks
> (in addition to the bitmap) so that a 'free' knows how many bytes to
> de-allocate.

Why do you want to track allocated blocks??? The only reason would be
for GC, but that is very un-Forth-like. You really don't have a good
way to track what blocks are still in use (neither a reference count
nor yet searching through the allocated blocks for any that aren't
referenced), as Forth doesn't distinguish pointers in any way --- they
are just a value on the stack, and they can't be distinguished from
integers.

I wouldn't mess with tracking allocated blocks at all --- just leave
it up to the user to deallocate memory blocks when they are no longer
in use.

For debugging purposes, it might be useful to have a word that
determines if any memory is allocated, or if everything is free (but
doesn't tell you when or where the memory got allocated). Sometimes,
at certain points in your program, you know that everything should be
free. If anything isn't, then there must be a leak somewhere. For
example, in a micro-controller you might have a paced-loop. Some tasks
will be spread out over several iterations of the loop, and they will
hold data in the heap while they are active. You can determine,
however, if nothing is active, at which time you can do your check.
You don't have to track the allocated blocks at all --- just traverse
all of your free nodes and build a sorted list of allocated blocks ---
that is somewhat slow, but it is just for debugging purposes and it
can be removed from the production release of the program so the micro-
controller doesn't have any awkward moments when it freezes up.

Because Straight Forth will support ALLOCATION (which Forth-200x
won't), I will need to write my own heap. I'll definitely be
interested in seeing what you come up with, and perhaps incorporating
it directly into Straight Forth. :-) I don't know enough about the
subject right now to dive into it, not without some research first.

Paul Rubin

unread,

Nov 4, 2012, 7:55:01 PM11/4/12

to

"Mark" <ma...@dibsco.co.uk> writes:
> bitmap of 32 bits will only give me coverage for 8k of memory if I use
> a minimum block size of 256 bytes. I am trying to figure out how to
> avoid having a bitmap that is hundreds or thousands of bits long.

I wouldn't worry about this. You'll lose mucm more memory to the
powers-of-two allocation units if the actual requests will be for blocks
of random size. You're doing pretty good if your allocator's overhead
is a few percent of the total memory. In a very small system, the
total number of blocks will be small so the bitmap will be small. In a
larger system, you can afford a few kilobytes for bitmaps.

Hugh Aguilar

unread,

Nov 4, 2012, 10:08:25 PM11/4/12

to

On Nov 4, 5:55 pm, Paul Rubin <no.em...@nospam.invalid> wrote:
> You'll lose mucm more memory to the
> powers-of-two allocation units if the actual requests will be for blocks
> of random size.

They aren't for random sizes. In most applications, most of the
allocations are for records or for strings, both of which tend to be
pretty small (usually < 32 bytes). I don't think the problem of
"internal fragmentation" is very bad practically.

If you are really worried about this, then allow for arbitrary sizes.
Use a binary tree sorted by block size to hold the free blocks. When
allocating, you can search for the smallest block that will work, or
you can just use the largest block available (the rightmost leaf node)
--- either way you have to insert the freed remainder in the tree
afterward --- so you have one or two searches.

You also have two links rather than one for every node, because it is
a tree rather than a list (you could use a sorted list too, but your
searches would be a lot slower). Modern micro-controllers tend to be
faster than necessary, but to still have memory shortages --- so it is
generally more important to save memory than save time ---meaning that
a sorted list might be better than a tree (you are just comparing
integers, so the search is going to be pretty fast even with a list).
If you use a list, and you want to use the biggest block available,
then your list should be sorted backwards (large to small) to give you
easy access to the node of the biggest block.

Hugh Aguilar

unread,

Nov 5, 2012, 1:02:43 AM11/5/12

to

On Nov 4, 4:05 pm, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
> On Nov 4, 1:11 pm, "Mark" <m...@dibsco.co.uk> wrote:
> On most micro-controllers (including
> the MSP-430 which I presume you are working on) it is 1 word.

When I wrote this, I was thinking that you were Mark Wills, our brave
TI afficianado. Now I realize that you are a different Mark.

Can you tell us which Forth system you are working with? Is this for a
homebrew system, or a publicly-available one? Is this for a small 8-
bit or 16-bit micro-controller, or for a big 32-bit processor? What is
your ultimate goal?

Ron Aaron

unread,

Nov 5, 2012, 1:17:02 AM11/5/12

to

On 11/04/2012 10:10 PM, Mark wrote:

> However, I'm not sure about the tracking of allocated blocks. I want a
> quick and easy way of tracking allocated blocks and buddies using a
> method that doesn't involve dynamically growing a linked-list or tree at
> run-time. It seems that this is commonly done using a bitmap, but I
> don't understand how you keep this manageable if you have a reasonable
> amount of memory and want a relatively small minimum block size.

You keep a bitmap multiple words long, depending on how you've balanced
block size vs available memory. I did this once for a device-driver on
a no-disk device.

However, I would be carefully weigh alternatives such as preallocating
pools, rather than using the allocate/free model. You don't say what
sort of application you are envisioning, that would make a difference as
well.

Alex McDonald

unread,

Nov 5, 2012, 9:33:47 AM11/5/12

to

On Nov 4, 11:05 pm, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
> On Nov 4, 1:11 pm, "Mark" <m...@dibsco.co.uk> wrote:
>
> > Hello
>
> > I am looking at writing a buddy system memory allocator in assembler for a
> > Forth system.
>
> > I am not sure how best to implement the tracking of allocated blocks. I've
> > read lots of papers and articles regarding buddy allocation and think that I
> > understand the concepts quite well. I am working with embedded systems so I
> > want to keep the memory and processing overheads as low as possible.
>
> I've never heard of a "buddy system" memory allocator. Can you provide
> any links to discussions of this?

http://www.cs.purdue.edu/homes/hosking/690M/p421-peterson.pdf. It's
been around for half a century.

>
> > I understand how I am going to handle free blocks, i.e. using a linked-list
> > where, apart from the initial pointer, the free list pointers are contained
> > within the free blocks themselves. I am going to maintain a free list per
> > block size, so the only additional memory overhead is an initial pointer per
> > block size. In this case, for example, 32 free lists will cover up to a
> > total memory of 2^31 * (minimum block size), so I could quite easily operate
> > with a minimum block size of 1 byte if I so desired.

That sounds like a slab allocator.
http://www.usenix.org/publications/library/proceedings/bos94/full_papers/bonwick.ps

>
> This seems pretty straight-forward. When you want to allocate some
> memory, you find the smallest (or maybe the largest) block that will
> work, and use that. Whatever is left over gets moved to the
> appropriate list.
>

See D.E. Knuth. The Art Of Computer Programming, Volume 1: Fundamental
Algorithms, Addison-Wesley, 1973 for an analysis of first fit vs best
fit in buddy systems.

> I wouldn't use a minimum of 1 byte as that is a waste of memory
> considering that the link field is one word in size. Also, a lot of
> processors have a "paragraph" that can be worked with efficiently. On
> the 16-bit x86 it is 16 bytes. On most micro-controllers (including
> the MSP-430 which I presume you are working on) it is 1 word.
>
> On a processor with data caching, I would use trees rather than lists,
> and sort them by their address. If you always prefer the lowest (or
> the highest) address, your blocks will tend to be close together. This
> might help somewhat in that if you are accessing several blocks at the
> same time, all of them might be close enough together to be in the
> cache together.

A analysis of using binary trees is here; http://csrl.unt.edu/~kavi/Research/southeastcon.pdf.

In this paper we described the use of Binary
Trees for maintaining the available chunks of
memory. The Binary Tree is based on the starting
address of memory chunks. In addition, we keep
track of the sizes of largest blocks of memory in the
left and right sub-trees. This information is used
during allocation to find a suitable chunk of memory.
Our data shows that Best Fit (or Better Fit)
allocation policies can easily be implemented using
the chunk sizes in the left and right sub-trees. The
Binary Tree implementation permits immediate
coalescing of newly freed memory with other free
chunks of memory. Binary Tree naturally improves
the search for appropriate size blocks of memory over
Linear Linked lists.

[snip]

Rod Pemberton

unread,

Nov 5, 2012, 2:10:56 PM11/5/12

to

"Mark" <ma...@dibsco.co.uk> wrote in message
news:yZzls.238791$it2....@fx22.am4...
...

> However, I'm not sure about the tracking of allocated blocks. I want a
> quick and easy way of tracking allocated blocks and buddies using a
> method that doesn't involve dynamically growing a linked-list or tree
> at run-time. It seems that this is commonly done using a bitmap, but
> I don't understand how you keep this manageable if you have a reasonable
> amount of memory and want a relatively small minimum block size. A
> bitmap of 32 bits will only give me coverage for 8k of memory if I use
> a minimum block size of 256 bytes. I am trying to figure out how to
> avoid having a bitmap that is hundreds or thousands of bits long. Is
> there any easier or alternative technique?

This issue affects file systems too. So, you may wish to think of your
memory allocator as a type of in-memory file-system or read up on how file
systems to understand how they solve the problem.

Bitmaps are good for known and fixed sizes since they're so compact. But,
like everything else, they do consume space. I'm not sure that there is an
"easy" solution of a small bitmap with small allocation sizes when being
used to map a large amount of memory. You're going to need alot of bits.
Of course, the total size of system memory depends on how much was installed
by the user. This is unlike removable media which is always fixed, or
multiple sizes with one maximum size. For an embedded system or OS, you
could pick a maximum size of memory for which the code will work, then
calculate backwards for what you need to implement ... In the future, the
code may need to be 'fixed' to handle more memory.

E.g., you're likely familiar with Microsoft's FAT12/16 use FATs (File
Allocation Table) to keep track of files or perhaps Unix's inodes. You're
likely unfamiliar with CBM (Commodore Business Machines) filesystem, which
used bitmaps. CBM's PC's (personal computers) like the C64's and Vic 20's,
used CBM disk drives, such as the 1541, that ran CBM's DOS (Disk Operating
System). CBM's DOS used bitmaps called BAMs (Block Availability Map) to
keep track of allocated sectors. The linked-list of sectors for the
ordering of a file's sectors was stored in the sectors with the data, unlike
FATs. If interested, the D64 format documents 1541's format:
http://ist.uwaterloo.ca/~schepers/formats/D64.TXT

As for memory allocators, there are a few to be found on the internet, none
of which are coded in Forth. They may provide you with some ideas, e.g.:

"A Memory Allocator," by Doug Lea
http://g.oswego.edu/dl/html/malloc.html

"The BGET Memory Allocator," by John Walker
http://www.fourmilab.ch/bget/

"Dynamic Storage Allocator," Richard Harter, comp.lang.c, Nov. 11, 1990
http://groups.google.com/group/comp.lang.c/msg/7da27dcbc6e2ace1

Rod Pemberton

Alex McDonald

unread,

Nov 5, 2012, 3:58:19 PM11/5/12

to

On Nov 5, 7:06 pm, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
wrote:
> "Mark" <m...@dibsco.co.uk> wrote in message

>
> news:yZzls.238791$it2....@fx22.am4...
> ...
>
> > However, I'm not sure about the tracking of allocated blocks. I want a
> > quick and easy way of tracking allocated blocks and buddies using a
> > method that doesn't involve dynamically growing a linked-list or tree
> > at run-time. It seems that this is commonly done using a bitmap, but
> > I don't understand how you keep this manageable if you have a reasonable
> > amount of memory and want a relatively small minimum block size. A
> > bitmap of 32 bits will only give me coverage for 8k of memory if I use
> > a minimum block size of 256 bytes. I am trying to figure out how to
> > avoid having a bitmap that is hundreds or thousands of bits long. Is
> > there any easier or alternative technique?
>
> This issue affects file systems too. So, you may wish to think of your
> memory allocator as a type of in-memory file-system or read up on how file
> systems to understand how they solve the problem.

Although that may superficially appear to be the case, file systems
are designed to solve a quite different problem; how do you minimize
file IO, reduce seeks to a minimum and parallelise operations across
many devices to decrease latency and increase bandwidth. Many file
systems are btree or b+tree based as those algorithms match well the
limitations of disks.

Memory allocators that use btrees can be found; http://locklessinc.com/
for example. But their domain is something like MPI (message passing)
where the interface benefits from a btree implementation; and even
this allocator reverts to a slab allocator for small allocations.

The OP needs to state the use case for his allocator.

[snip]

Rod Pemberton

unread,

Nov 6, 2012, 8:54:41 PM11/6/12

to

"Alex McDonald" <bl...@rivadpm.com> wrote in message
news:94819742-bce4-43d3...@a6g2000vbl.googlegroups.com...

> On Nov 5, 7:06 pm, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
> wrote:
> > "Mark" <m...@dibsco.co.uk> wrote in message
> > news:yZzls.238791$it2....@fx22.am4...
...

> > > However, I'm not sure about the tracking of allocated blocks. I want a
> > > quick and easy way of tracking allocated blocks and buddies using a
> > > method that doesn't involve dynamically growing a linked-list or tree
> > > at run-time. It seems that this is commonly done using a bitmap, but
> > > I don't understand how you keep this manageable if you have a
> > > reasonable amount of memory and want a relatively small minimum
> > > block size. A bitmap of 32 bits will only give me coverage for 8k of
> > > memory if I use a minimum block size of 256 bytes. I am trying to
> > > figure out how to avoid having a bitmap that is hundreds or thousands
> > > of bits long. Is there any easier or alternative technique?
>
> > This issue affects file systems too. So, you may wish to think of your
> > memory allocator as a type of in-memory file-system or read up on how
> > file systems to understand how they solve the problem.
>

> Although that may superficially appear to be the case, [...]

It's not superficial. It affects old and modern filesystems. However, it
especially affects older, simple filesystems.

> [...] file systems are designed to solve a quite different problem;
> how do you minimize file IO, [...]

Minimizing the access time of blocks of memory is an issue for a memory
allocator too.

1) If the blocks of memory contain a file as raw sectors from the disk,
e.g., a ramdisk, you've got the same problem in memory as on disk. You want
those sectors ordered and contiguous. You don't want them randomly arranged
and distributed all across memory.

2) Many programs allocate and free large quantites of small sized memory
blocks. After a while, the resulting memory fragmentation results in
various problems for the memory allocator. Any one or perhaps all of which
can slow the allocator down dramatically.

> [...] reduce seeks to a minimum [...]

This only affects physical hard disks (HD) which have to physically move a
read/write head. It doesn't affect solid state disks (SDD) which have the
same (near zero) seek time per sector. For many years now, HDs have come
with cache's for minimizing that problem. I'm not sure what improvement a
filesystem can offer over a large, fast cache in hardware on the hard disk.
I wouldn't doubt it if modern performance drives automatically moved sectors
around to boost performance, but I don't know if they do... They do map and
lock out bad tracks.

Also, I'd think you'd want to 'reduce seek time' to a minimum for memory
allocation too, i.e., have good spatial locality. Without it, it's possible
you could up with situations with unwanting "thrashing", possibly of the
memory allocator code, the microprocessor cache, or the swap space on disk.

> [...] and parallelise operations across many devices

> to decrease latency and increase bandwidth.

That would definately be true for certain RAID configurations, e.g., data
striping. It'd also be true for multiple devices each with a portion of a
filesystem when mounted into a single filesystem, like for POSIX
environments. However, some filesystems treat each device as having a
separate filesystem. And, some hardware doesn't have hardware support
configuring multiple physical drives as one virtual drive. I.e., this can
be true for more advanced modern hardware.

> The OP needs to state the use case for his allocator.

We're not expecting him to write a specialized allocator or world-class
allocator for Forth are we? As I understood it, He just wanted something
simple and quick.

Rod Pemberton

Hugh Aguilar

unread,

Nov 7, 2012, 12:50:25 AM11/7/12

to

On Nov 6, 6:50 pm, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
wrote:
> "Alex McDonald" <b...@rivadpm.com> wrote in message

> > The OP needs to state the use case for his allocator.
>
> We're not expecting him to write a specialized allocator or world-class
> allocator for Forth are we? As I understood it, He just wanted something
> simple and quick.

Why not write a world-class allocator for Forth? :-) You make it sound
like there is a rule somewhere that Forth software has to suck.

Actually, I think our OP has wandered off. It was pretty exciting
there for a while, as we thought that somebody other than the usual
gang of idiots was going to write some Forth code --- but then nothing
came of it. :-(

Mark

unread,

Nov 7, 2012, 3:36:59 AM11/7/12

to

"Hugh Aguilar" wrote in message
news:f6aa048c-66f8-4b54...@v9g2000pbi.googlegroups.com...

I haven't wandered off, I've just got very little free time at the moment.

Mark

unread,

Nov 7, 2012, 3:49:22 AM11/7/12

to

"Paul Rubin" wrote in message news:7xvcdku...@ruckus.brouhaha.com...

You are right, I guess that in the general scheme of things the bitmaps
represent a low overhead compared to the internal fragmentation associated
with the buddy system. It's just that the free list handling seems much more
elegant/efficient because of the zero overhead associated with keeping the
list pointers inside the empty blocks along with the fact that only 32
pointers/lists will cover a huge range of memory. But then I suppose that
you can't really store data about allocation inside an allocated block.

Mark

unread,

Nov 7, 2012, 4:02:28 AM11/7/12

to

"Ron Aaron" wrote in message news:k77lkv$g5h$1...@dont-email.me...

Pre-allocating pools? I haven't come across this - only the allocate/free
model. I am trying to write a general purpose allocator that fulfils the
requirements of the ANS memory allocation wordset.

Mark

unread,

Nov 7, 2012, 4:34:30 AM11/7/12

to

"Rod Pemberton" wrote in message news:k792o8$c4l$1...@speranza.aioe.org...

>"Mark" <ma...@dibsco.co.uk> wrote in message
>news:yZzls.238791$it2....@fx22.am4...
>...
>
>> However, I'm not sure about the tracking of allocated blocks. I want a
>> quick and easy way of tracking allocated blocks and buddies using a
>> method that doesn't involve dynamically growing a linked-list or tree
>> at run-time. It seems that this is commonly done using a bitmap, but
>> I don't understand how you keep this manageable if you have a reasonable
>> amount of memory and want a relatively small minimum block size. A
>> bitmap of 32 bits will only give me coverage for 8k of memory if I use
>> a minimum block size of 256 bytes. I am trying to figure out how to
>> avoid having a bitmap that is hundreds or thousands of bits long. Is
>> there any easier or alternative technique?
>
>This issue affects file systems too. So, you may wish to think of your
>memory allocator as a type of in-memory file-system or read up on how file
>systems to understand how they solve the problem.

Thanks, I'll take a look at the subject.

>
>Bitmaps are good for known and fixed sizes since they're so compact. But,
>like everything else, they do consume space. I'm not sure that there is an
>"easy" solution of a small bitmap with small allocation sizes when being
>used to map a large amount of memory. You're going to need alot of bits.
>Of course, the total size of system memory depends on how much was
>installed
>by the user. This is unlike removable media which is always fixed, or
>multiple sizes with one maximum size. For an embedded system or OS, you
>could pick a maximum size of memory for which the code will work, then
>calculate backwards for what you need to implement ... In the future, the
>code may need to be 'fixed' to handle more memory.
>

I had assumed that I would work with a fixed memory size, number of blocks
etc. as most embedded boards probably ship with a fixed (soldered) amount of
RAM. It is not a problem to have a few '#defines' or equivalent somewhere to
configure these things at build time.

>
>E.g., you're likely familiar with Microsoft's FAT12/16 use FATs (File
>Allocation Table) to keep track of files or perhaps Unix's inodes. You're
>likely unfamiliar with CBM (Commodore Business Machines) filesystem, which
>used bitmaps. CBM's PC's (personal computers) like the C64's and Vic 20's,
>used CBM disk drives, such as the 1541, that ran CBM's DOS (Disk Operating
>System). CBM's DOS used bitmaps called BAMs (Block Availability Map) to
>keep track of allocated sectors. The linked-list of sectors for the
>ordering of a file's sectors was stored in the sectors with the data,
>unlike
>FATs. If interested, the D64 format documents 1541's format:
>http://ist.uwaterloo.ca/~schepers/formats/D64.TXT
>

Thanks, I'll take a look.

>
>As for memory allocators, there are a few to be found on the internet, none
>of which are coded in Forth. They may provide you with some ideas, e.g.:
>
>"A Memory Allocator," by Doug Lea
>http://g.oswego.edu/dl/html/malloc.html
>
>"The BGET Memory Allocator," by John Walker
>http://www.fourmilab.ch/bget/
>
>"Dynamic Storage Allocator," Richard Harter, comp.lang.c, Nov. 11, 1990
>http://groups.google.com/group/comp.lang.c/msg/7da27dcbc6e2ace1
>

I've looked at a lot of papers on memory allocators, including your first
link above. I had settled on the buddy system approach because it seemed to
offer a good trade-off between speed and fragmentation.

Mark Wills

unread,

Nov 7, 2012, 7:07:04 AM11/7/12

to

On Nov 7, 9:34 am, "Mark" <m...@dibsco.co.uk> wrote:
> "Rod Pemberton" wrote in messagenews:k792o8$c4l$1...@speranza.aioe.org...
> >"Mark" <m...@dibsco.co.uk> wrote in message

> offer a good trade-off between speed and fragmentation.- Hide quoted text -
>
> - Show quoted text -

What is the definition of 'buddy' in this context? Are two more
consecutive memory blocks that are assigned to the same object
considered to be buddies?

I've yet to study dynamic memory allocation in any detail, as in my
assembler and C days we relied on static allocation (which worked
perfectly) and in my later higher-level language (VB and .Net) it was
handled automagically. It's a very interesting subject.

Chris Hinsley

unread,

Nov 7, 2012, 8:10:42 AM11/7/12

to

I don't have a buddy allocator, but if you want to look at my version
of a best fit, sorted by size of block allocator, based on a simple
double linked freelist have a look here:

https://sites.google.com/site/chrishinsley/

You want the list.f and memory.f files from the Forth.zip.

Hope that gives you some ideas.

Best Regards

Chris

Chris Hinsley

unread,

Nov 7, 2012, 8:31:30 AM11/7/12

to

Only words that you need from the forth.f file (beyond the standard
stuff, that your Forth will have anyways) are the structure definition
words, and the >FIELD word makes use of a special I did in my kernel
'ADD-IMMED,' which just compiles an immediate add. But you can do that
instead as:

: >FIELD \ ( s-addr 'name' -- f-addr )
' EXECUTE
?DUP
IF
POSTPONE LITERAL
POSTPONE +
THEN
; IMMEDIATE

Regards

Chris

Alex McDonald

unread,

Nov 7, 2012, 3:46:30 PM11/7/12

to

On Nov 7, 1:50 am, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
wrote:
> "Alex McDonald" <b...@rivadpm.com> wrote in message

If I read a fragmented disk file, are you claiming that the memory is
fragmented? No file system I'm aware of does that. They all either
read to buffers under program management, or map to contiguous memory;
for example, mmap over a file.

>
> 2) Many programs allocate and free large quantites of small sized memory
> blocks. After a while, the resulting memory fragmentation results in
> various problems for the memory allocator. Any one or perhaps all of which
> can slow the allocator down dramatically.

Yes, but that's not a good reason to use a filesystem as a starting
point. It's a quite separate problem, and many memory allocators use
different algorithms to satisfy small requests than those used to
satisfy large requests.

>
> > [...] reduce seeks to a minimum [...]
>
> This only affects physical hard disks (HD) which have to physically move a
> read/write head. It doesn't affect solid state disks (SDD) which have the
> same (near zero) seek time per sector. For many years now, HDs have come
> with cache's for minimizing that problem. I'm not sure what improvement a
> filesystem can offer over a large, fast cache in hardware on the hard disk.
> I wouldn't doubt it if modern performance drives automatically moved sectors
> around to boost performance, but I don't know if they do... They do map and
> lock out bad tracks.

SSDs suffer from random write problems; it's called stutter, due to
the fact that SSD pages are managed in large 100s of KB lumps. It is
most apparent when an SSD has been aged; that is, when the SSD has
written to every page at least once. Seeking and writing on SSDs is as
bad as seeking on HDs. No, drives do not move sectors around to
improve performance; it would have the reverse effect. Motto: Never
fix a global problem at the local level.

>
> Also, I'd think you'd want to 'reduce seek time' to a minimum for memory
> allocation too, i.e., have good spatial locality. Without it, it's possible
> you could up with situations with unwanting "thrashing", possibly of the
> memory allocator code, the microprocessor cache, or the swap space on disk.

With a very poor allocator, thrashing can be a problem. Page size on
an x86 in 32bit mode is 4K. Expecting an allocator to understand that
a specific request is best satisfied in the same 4K block as some
previous request, because they're referenced by *time* locality, is a
very tough nut to crack. Compacting memory can help, but that implies
garbage collection and some way of modifying pointers. This kind of
activity is best left to the application. If it knows that structure A
is generally followed by a reference to structure B, then allocate
them from a single request for A and B.

In other words, spatial locality is important on disks for fast
reading and writing sequentially, but it's time locality that's
important in random access virtual memory systems.

>
> > [...] and parallelise operations across many devices
> > to decrease latency and increase bandwidth.
>
> That would definately be true for certain RAID configurations, e.g., data
> striping. It'd also be true for multiple devices each with a portion of a
> filesystem when mounted into a single filesystem, like for POSIX
> environments. However, some filesystems treat each device as having a
> separate filesystem. And, some hardware doesn't have hardware support
> configuring multiple physical drives as one virtual drive. I.e., this can
> be true for more advanced modern hardware.
>
> > The OP needs to state the use case for his allocator.
>
> We're not expecting him to write a specialized allocator or world-class
> allocator for Forth are we? As I understood it, He just wanted something
> simple and quick.

Then a slab allocator would seem to be appropriate. It is relatively
easy to implement and has well understood characteristics, but it is
not like a filesystem.

>
> Rod Pemberton

Rod Pemberton

unread,

Nov 7, 2012, 7:44:17 PM11/7/12

to

"Mark Wills" <forth...@gmail.com> wrote in message
news:1b12d69b-ae6e-4a1a...@k6g2000vbr.googlegroups.com...
...

> I've yet to study dynamic memory allocation in any detail, as in my
> assembler and C days we relied on static allocation (which worked
> perfectly) and in my later higher-level language (VB and .Net) it was
> handled automagically. It's a very interesting subject.
>

It's "automagically" done for you in C too. In C, it's hidden behind C
functions. At some point, these call the host OS' memory allocator. E.g.,
malloc() and free() are the easiest C functions to notice. But, the file
I/O functions also use the host OS' memory allocator too. Technically,
they allocate and free file storage space, but the file I/O is usually
buffered in memory. Some functions, like tmpfile(), are commonly
implemented in memory only. So, you can use file I/O functions to do
"memory" allocation in C. From the programming perspective with C,
it doesn't really matter if the data is in a file or memory.

Rod Pemberton

Rod Pemberton

unread,

Nov 7, 2012, 7:49:26 PM11/7/12

to

"Hugh Aguilar" <hughag...@yahoo.com> wrote in message
news:f6aa048c-66f8-4b54...@v9g2000pbi.googlegroups.com...

> On Nov 6, 6:50 pm, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
> wrote:
> > "Alex McDonald" <b...@rivadpm.com> wrote in message

...

> > > The OP needs to state the use case for his allocator.
>
> > We're not expecting him to write a specialized allocator
> > or world-class allocator for Forth are we? As I understood

> > it, he just wanted something simple and quick.

>
> Why not write a world-class allocator for Forth? :-)

Hey, you know what? You're elected!

> You make it sound like there is a rule somewhere
> that Forth software has to suck.

No, that wasn't intended. Freudian-slip interpretation? ;-)

I just thought Marcel (the OP) said he wanted something easy.

Alex seemed to be heading towards a discussion of the full
technical merits of different memory allocators.

Rod Pemberton

Paul Rubin

unread,

Nov 7, 2012, 7:57:57 PM11/7/12

to

"Rod Pemberton" <do_no...@notemailnotz.cnm> writes:
>> assembler and C days we relied on static allocation (which worked
>> perfectly) and in my later higher-level language (VB and .Net) it was
>> handled automagically.

> It's "automagically" done for you in C too....
> malloc() and free() ...

I think "automagically" in HLL's these days means "garbage collected".

I get the impression that in C++11, smart pointers (i.e. reference
counted) are included in the standard library and generally considered
preferable to manual allocation with new and delete (equivalent to
malloc/free).

Rod Pemberton

unread,

Nov 7, 2012, 8:14:48 PM11/7/12

to

"Paul Rubin" <no.e...@nospam.invalid> wrote in message
news:7xhap0d...@ruckus.brouhaha.com...

> "Rod Pemberton" <do_no...@notemailnotz.cnm> writes:

> >> assembler and C days we relied on static allocation (which worked
> >> perfectly) and in my later higher-level language (VB and .Net) it was
> >> handled automagically.
> > It's "automagically" done for you in C too....
> > malloc() and free() ...
>
> I think "automagically" in HLL's these days means "garbage collected".
>

Ah, that'd be a different meaning ...

> I get the impression that in C++11, smart pointers (i.e. reference
> counted) are included in the standard library and generally considered
> preferable to manual allocation with new and delete (equivalent to
> malloc/free).

Just how "equivalent" is equivalent?

It's my understanding from what I've read that coding an operating system in
C++ is very difficult, primarily because of new and delete. It's not
difficult to code an OS in C. I.e., they are not equivalent.

I'll take it that you mean that new and delete are used in C++ for memory
allocation, whereas malloc and free are used in C for memory allocation.

Rod Pemberton

Paul Rubin

unread,

Nov 7, 2012, 8:32:41 PM11/7/12

to

"Rod Pemberton" <do_no...@notemailnotz.cnm> writes:
> It's my understanding from what I've read that coding an operating system in
> C++ is very difficult, primarily because of new and delete. It's not
> difficult to code an OS in C. I.e., they are not equivalent.

That makes no sense at all; lots of OS's are coded in C++, and C++ is
almost exactly a superset of C (that is, C programs can be compiled and
run under C++ with very little if any modification). C++'s new and
delete operators give class instances instead of raw pointers, but in
the case of an OS you probably want some kind of block allocator even
lower level than malloc/free, so the difference between new and malloc
is irrelevant.

> I'll take it that you mean that new and delete are used in C++ for memory
> allocation, whereas malloc and free are used in C for memory allocation.

Yes. You can still use malloc/free in C++ if you have to, though it
would be considered bad style unless you had a clear reason to do it.

Mark Wills

unread,

Nov 8, 2012, 3:22:28 AM11/8/12

to

On Nov 8, 12:57 am, Paul Rubin <no.em...@nospam.invalid> wrote:

Correct. In .Net the garbage collector is actually a global-scoped
object that you can send messages to.

Can't remember the Java paradigm. IIRC Java's garbage collector is
fairly invisible and operates only behind the scenes. Correct me if
I'm wrong.

Anton Ertl

unread,

Nov 8, 2012, 7:32:27 AM11/8/12

to

Paul Rubin <no.e...@nospam.invalid> writes:
>"Rod Pemberton" <do_no...@notemailnotz.cnm> writes:
>> It's my understanding from what I've read that coding an operating system in
>> C++ is very difficult, primarily because of new and delete. It's not
>> difficult to code an OS in C. I.e., they are not equivalent.
>
>That makes no sense at all; lots of OS's are coded in C++

Really? Please name a lot.

>and C++ is
>almost exactly a superset of C (that is, C programs can be compiled and
>run under C++ with very little if any modification).

That may be the view of some "C" compiler maintainers, which may explain
why "C" compilers like to miscompile C programs.

I particular, I remember reading about a discussion of a GNU C
extension that the GCC maintainers did not want to support properly
anymore because it does not combine well with some C++ features. As a
GNU C programmer, I don't care about C++ and don't see why GNU C
should be maimed because of C++.

>Yes. You can still use malloc/free in C++ if you have to, though it
>would be considered bad style unless you had a clear reason to do it.

Which shows that C++ is not a superset of C idiomatically, and it's
certainly not a superset formally (e.g., a C++ compiler does not
compile C programs that contain a variable called "this").

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2012: http://www.euroforth.org/ef12/

Alex McDonald

unread,

Nov 8, 2012, 7:42:52 AM11/8/12

to

On Nov 8, 12:45 am, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
wrote:

>

> Alex seemed to be heading towards a discussion of the full
> technical merits of different memory allocators.

You keep heading towards filesystems; I'm dragging you back to where
Mark started.

Hugh Aguilar

unread,

Nov 8, 2012, 10:43:29 PM11/8/12

to

On Nov 7, 5:45 pm, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
wrote:
> "Hugh Aguilar" <hughaguila...@yahoo.com> wrote in message

> > Why not write a world-class allocator for Forth? :-)
>
> Hey, you know what? You're elected!

Well, I said in my very first response that I don't know enough about
the subject to dive in --- not without some research. I have only a
rudimentary understanding of the subject. I still don't know what this
"buddy" system is.

Eventually I will have to dive in though --- Straight Forth will need
a memory allocator, and it can't use GLIBC because I need to have
ALLOCATION --- this, according to Anton, is why Forth-200x won't have
ALLOCATION (because Forth-200x depends upon GLIBC which doesn't have
ALLOCATION).

I'm definitely interested in seeing what this guy comes up with. If it
works, and it is public-domain, I may just use it in Straight Forth

> I just thought Marcel (the OP) said he wanted something easy.

Marcel? I thought his name was "Mark" (although "Marcel" is the French
equivalent of "Mark"). Do you know more about Mark/Marcel than is
apparent in this thread? This isn't a sock-puppet for Marcel Hendrix
is it?

Paul Rubin

unread,

Nov 8, 2012, 11:00:01 PM11/8/12

to

an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>>That makes no sense at all; lots of OS's are coded in C++
> Really? Please name a lot.

http://www.google.com/search?q="operating+system+written+in+c%2B%2B"

> Which shows that C++ is not a superset of C idiomatically, and it's
> certainly not a superset formally (e.g., a C++ compiler does not
> compile C programs that contain a variable called "this").

It's never claimed to be a formal superset, just pretty close to one.
You'd have to rename any variable called "this". There is a list of
further such issues on Stroustrup's web site and they are all pretty
minor iirc.

Of course it's idiomatically different, that's the point. I'd also say
Forth with files and locals is idiomatically different from Forth with
blocks and stackrobatics. I'm not exactly a C++ fan and I haven't used
it a whole lot, but in my limited experience it is improves
substantially on C. Of course I'm not a fan of C either.

Rod Pemberton

unread,

Nov 9, 2012, 2:20:54 AM11/9/12

to

"Paul Rubin" <no.e...@nospam.invalid> wrote in message

news:7xa9us3...@ruckus.brouhaha.com...
> "Rod Pemberton" <do_no...@notemailnotz.cnm> writes:
...

> > It's my understanding from what I've read that coding an operating
> > system in C++ is very difficult, primarily because of new and delete.
> > It's not difficult to code an OS in C. I.e., they are not equivalent.
>
> That makes no sense at all;

Well, I've read that more than a few times from various sources. So, I'm
assuming there is some truth to it. Someone coding an OS (operating system)
in C++ must implement new and delete - custom to their OS. Perhaps, that's
were the problem is. Personally, I've got no familiarity with new, delete,
or an OS in C++, so it's possible it's a myth. But, I can't imagine why
people would intentionally spread such a myth ... Who opposes using C++
for OS development? There are enough C++ programmers that I'm sure a few
have tried.

> [...] lots of OS's are coded in C++, [...]

No offense, but I seriously doubt that. I wouldn't doubt that a few exist
though.

I've seen people attempt OSes in all sorts of languages. But, I've not come
across a single OS on the Internet coded in C++. Nor do I recall anyone
posting about using C++ for their OS to alt.os.development.

C and the host's assembly still seem to be the predominant languages that
OSes are coded in.

I wouldn't doubt it that a "C++" OS is mostly coded in C with a bit of
trivial C++ usage, just as many C++ programs aren't actually C++ either but
C masquerading as C++.

I can't imagine an OS developer wanting to deal with additional complexities
within their OS, such as inheritance, object oriented programming,
polymorphism, classes and overloading of C++. At some point, an OS
developer coding in a HLL (high-level language) will likely create their own
compiler, or attempt to bootstrap an existing compiler. The more
complicated the compiler, the less likely it is that they'll succeed.
Simplicity is needed initially. Complexity can be added at a later time.

For my OS project, I really don't even want C. I want a very small C
subset. Technically, even that is a bit more than I really want. If
assemblers could manage variables like C, I'd probably use an assembler.
Today, most are powerful enough that they can handle structures, control
flow, macro's, procedures, etc. I.e., many are almost HLLs.

> [...] and C++ is almost exactly a superset of C (that is, C programs

> can be compiled and run under C++ with very little if any modification).

Well, I've never coded C++. AIR (as I recall), C is a subset of C++. From
what I've seen, "pure" C++ programs appear very different syntactically from
C programs and C treated as C++. E.g., from Wikipedia's C++ page:
'std::cout << "..."' AIUI, std class, cout function, i.e., functionally
equivalent to C's printf(). Clearly, that's not a 'superset' of C. That's
purely C++. No C programmer recognizes that without being provided an
explanation. A superset would extend the existing C functionality and
syntax, e.g., printf::file. A C programmer would recognize printf and might
be able to guess that '::file' makes printf function similar to fprintf.

Rod Pemberton

Paul Rubin

unread,

Nov 9, 2012, 3:53:58 AM11/9/12

to

"Rod Pemberton" <do_no...@notemailnotz.cnm> writes:
> Well, I've read that more than a few times from various sources. So,
> I'm assuming there is some truth to it. Someone coding an OS
> (operating system) in C++ must implement new and delete - custom to
> their OS. Perhaps, that's were the problem is.

new and delete are basically syntax sugar for a typed version of malloc
and free. The API is here:

http://en.cppreference.com/w/cpp/memory/new/operator_new

so instead of saying
p = (struct foo *) malloc(sizeof (struct foo));
you'd say
p = new foo;

The default action of "new" is to call a malloc-like library routine but
you can override it to call some code of your own, and (it occurs to me)
this might be a reasonable way to handle something like kernel buffers
in linux.

> Who opposes using C++ for OS development? There are enough C++
> programmers that I'm sure a few have tried.

Haiku (successor to BeOS) and Symbian were written in C++; Haiku is
trendy in a certain crowd and Symbian was highly successful for a long
time (though now pretty much crushed under the Android/iOS juggernaut).
According to Wikipedia, MacOS now contains some C++ though I don't know
specifics.

>> [...] lots of OS's are coded in C++, [...]
> No offense, but I seriously doubt that. I wouldn't doubt that a few
> exist though.

Try a web search. Lots are there that are not very famous.

> C and the host's assembly still seem to be the predominant languages that
> OSes are coded in.

There has to be a little bit of low-level asm in any OS, but if the OS
is more than a toy, for the past 30 years it's (almost) always written
predominantly in a HLL. As with compilers, for projects started before
the 1990's, a lot of the time the HLL was C. The OS's that we've heard
of (Linux, BSD, etc.) are all quite old so they fit that pattern.

> I can't imagine an OS developer wanting to deal with additional
> complexities within their OS, such as inheritance, object oriented
> programming, polymorphism, classes and overloading of C++.

http://okmij.org/ftp/cpp-digest/toy_OS.txt

And anyway, OS's in C usually have dispatch tables for device drivers
etc. that amount to manually implemented OOP.

> At some point, an OS developer coding in a HLL (high-level language)
> will likely create their own compiler, or attempt to bootstrap an
> existing compiler. The more complicated the compiler, the less likely
> it is that they'll succeed.

I don't understand this. I don't see why someone writing an OS would
write their own compiler, under normal circumstances. If they were
writing a general purpose OS then of course they'd have to target a
toolchain to it (probably more about the linker etc. than the compiler
proper).

> For my OS project, I really don't even want C.

I thought you were writing a Forth. You're writing an OS too?

> I want a very small C subset. Technically, even that is a bit more
> than I really want. If assemblers could manage variables like C, I'd
> probably use an assembler.

Why not use the Forth you're writing?

> Well, I've never coded C++. AIR (as I recall), C is a subset of C++.

Pretty much so. There are some minor deviations, like some more
keywords, so in C++ you possibly can't have a variable called "new".

> From what I've seen, "pure" C++ programs appear very different
> syntactically from C programs and C treated as C++. E.g., from
> Wikipedia's C++ page: 'std::cout << "..."' AIUI, std class, cout
> function, i.e., functionally equivalent to C's printf().

cout is like stdout. << is operator overloading, so cout << x is the
equivalent of something like fwrite(stdout,x). It's specialized on
various datatypes so if cout << n (for an int n) prints n in decimal,
while cout << s (for string s) prints s directly, etc.

If you write a new class, you can specify what << is supposed to do with
it, so it's better than printf in the sense that you can extend it easily.
It's uglier and clumsier in some other ways. You can still use printf if
you want, and I sometimes do that.

> Clearly, that's not a 'superset' of C. That's purely C++. No C
> programmer recognizes that without being provided an explanation.

Of course it's a (near) superset. It's just like today if you want to
go somewhere, you can walk, drive, take a taxi, buy a plane ticket, ride
a horse, etc.; while if you lived in the 18th century your choices would
have been walk or horse. We would say your options today are a superset
of the 18th century options. That doesn't mean an 18th century person
would understand airplanes without an explanation, and it doesn't mean
that the choices an 18th century person would have made are sensible
choices now (riding a horse in a town where cars are available would
usually be silly). But they're still available if that's what a given
situation calls for.

> A superset would extend the existing C functionality and syntax, e.g.,
> printf::file. A C programmer would recognize printf and might be able
> to guess that '::file' makes printf function similar to fprintf.

It's not like that--using it idiomatically really has a completely
different feeling than writing in C. It's actually pretty easy to get
used to though; it's "well-worn" if that makes any sense. I mean
something like: the people who designed it had clearly written a lot of
C code and the stuff that went into C++ was all invented to solve real
problems.

At least for 21st century PC-class computers (that would include things
like smartphones, which are equivalent to PC's of the early 2000's) I just
don't see much point of writing anything in pure C. You do get code bloat
if you use idiomatic C++ and its template library, but you gain
convenience, and with some care you can avoid taking any performance hit.

Albert van der Horst

unread,

Nov 9, 2012, 5:48:23 AM11/9/12

to

In article <0efea753-0396-4bb0...@qi8g2000pbb.googlegroups.com>,

Hugh Aguilar <hughag...@yahoo.com> wrote:
>On Nov 7, 5:45�pm, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
>wrote:
>> "Hugh Aguilar" <hughaguila...@yahoo.com> wrote in message
>> > Why not write a world-class allocator for Forth? :-)
>>
>> Hey, you know what? �You're elected!
>
>Well, I said in my very first response that I don't know enough about
>the subject to dive in --- not without some research. I have only a
>rudimentary understanding of the subject. I still don't know what this
>"buddy" system is.
>
>Eventually I will have to dive in though --- Straight Forth will need
>a memory allocator, and it can't use GLIBC because I need to have
>ALLOCATION --- this, according to Anton, is why Forth-200x won't have
>ALLOCATION (because Forth-200x depends upon GLIBC which doesn't have
>ALLOCATION).

You can steal the memory words from the ciforth version 5.
It has no ALLOCATION, but it can be trivially added.
(like DUP 1 CELLS - @ SWAP - or some such).

>
>I'm definitely interested in seeing what this guy comes up with. If it
>works, and it is public-domain, I may just use it in Straight Forth
>
>> I just thought Marcel (the OP) said he wanted something easy.
>
>Marcel? I thought his name was "Mark" (although "Marcel" is the French
>equivalent of "Mark"). Do you know more about Mark/Marcel than is
>apparent in this thread? This isn't a sock-puppet for Marcel Hendrix
>is it?

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Bernd Paysan

unread,

Nov 9, 2012, 4:15:38 PM11/9/12

to

Hugh Aguilar wrote:
> Eventually I will have to dive in though --- Straight Forth will need
> a memory allocator, and it can't use GLIBC because I need to have
> ALLOCATION --- this, according to Anton, is why Forth-200x won't have
> ALLOCATION (because Forth-200x depends upon GLIBC which doesn't have
> ALLOCATION).

It doesn't have it as public exported function. And the argument of
Anton was that a number of Forth systems don't have their own allocater
implemented, and therefore would require such an interface.

If you want to rely on "undocumented features" that change whenever
Ulrich Drepper feels like to change it (and he won't tell you), this is
ALLOCATION for glibc:

: allocation ( addr -- len ) 1 cells - @ 1 cells - 1- ;

You could test if this gives reasonable values, and if not, use a
wrapper around malloc() and free() that allocates one cell more and
stores the allocation value in the first cell.

No real need to write your own allocator.

BTW: If you make an RFC, you will create a *discussion*. Arguments are
exchanged, and a flame-proof suit is considered minimum requirement for
starting such a discussion. You don't have one, so you withdraw your
proposals after the first round, pretending they were knocked out.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

Rod Pemberton

unread,

Nov 9, 2012, 7:10:00 PM11/9/12

to

"Paul Rubin" <no.e...@nospam.invalid> wrote in message

news:7xpq3nj...@ruckus.brouhaha.com...

> "Rod Pemberton" <do_no...@notemailnotz.cnm> writes:
...

> > At some point, an OS developer coding in a HLL (high-level language)
> > will likely create their own compiler, or attempt to bootstrap an
> > existing compiler. The more complicated the compiler, the less likely
> > it is that they'll succeed.
>
> I don't understand this. I don't see why someone writing an OS would
> write their own compiler, under normal circumstances. If they were
> writing a general purpose OS then of course they'd have to target a
> toolchain to it (probably more about the linker etc. than the compiler
> proper).
>

Well, I started my OS using existing C compilers, none of which have ever
been used to produce an OS. At first, development was very fast. I didn't
have to create my own toolchain. I didn't have to code in "primitive"
assembly. However, C compilers are not completely independent of the
environment they were coded for. Much of the C language is and many parts
of the C library can be. But, there are still many things in the toolchain
that produce environment specific code, e.g., code for interrupts. You
can't get around it, without rewriting the toolchain you're using. Of
course, the toolchain is so large and complex, you're not going to waste
time doing that. You just have to handle the problems such environment
specific code creates. So, eventually, you become frustrated with the lack
of control. Next, you find bugs in the compiler or library, which you have
to code around too. Finally, you come to the conclusion that you should've
developed your own simple toolchain first, because you can't control
everything you need to control for your OS without total control of the
toolchain.

> > For my OS project, I really don't even want C.
>
> I thought you were writing a Forth. You're writing an OS too?
>

Yes, I have an in-progress Forth interpreter. Yes, I have an stalled OS
project too. Also, x86 assembler, x86 interpreter, a bunch of different
attempts at C parsers and/or lexers, and C compilers, a simple C-like
programming language, etc. None are entirely complete. I have too many
projects for my time.

The OS project stalled after coming across two problems: computer with
needed IDE hardware died, and the "lack of control" toolchain problem above.

> > I want a very small C subset. Technically, even that is a bit more
> > than I really want. If assemblers could manage variables like C, I'd
> > probably use an assembler.
>
> Why not use the Forth you're writing?
>

I can't not offend Forth users by replying honestly to that.

Basically, I have yet to see Forth as being an effective language.
Although, I've only coded various Forth dictionary words so far. However,
there are a few things I find negative or detrimental about Forth. C or
PL/1 are far more powerful and effective than any other languages from what
I've seen. C and Forth are noteworthy in that they are far more widespread
than other languages because of their minimal, simple machine models.
Pascal, Fortran, BASIC, etc just aren't that useful, i.e., generally
high-level, but no low-level, or other problematic features.

I'd also need a Forth compiler, not a Forth interpreter. That would be
another project... Generating code, even for very simple languages, is far
more complicated than creating address-lists used by DTC and ITC.

Rod Pemberton

Paul Rubin

unread,

Nov 9, 2012, 9:47:35 PM11/9/12

to

"Rod Pemberton" <do_no...@notemailnotz.cnm> writes:
> C compilers are not completely independent of the environment they

> were coded for... there are still many things in the toolchain that

> produce environment specific code, e.g., code for interrupts.

Right, you write an interrupt vector in assembler and have the
interrupts call a trampoline that calls a C function. So you have to
know the calling conventions, and in the case of a few interrupts
needing very fast response, you might have to code the handler in
assembler, but the amount of asm code involved is generally pretty
small.

> Finally, you come to the conclusion that you should've developed your
> own simple toolchain first, because you can't control everything you
> need to control for your OS without total control of the toolchain.

Sounds like you are reinventing Forth ;-). In practice other people
manage to use existing toolchains.

>> > For my OS project, I really don't even want C.

What kind of hardware is it for, if I may ask?

> I'd also need a Forth compiler, not a Forth interpreter. That would be
> another project... Generating code, even for very simple languages, is far
> more complicated than creating address-lists used by DTC and ITC.

It's true that compilers are more complicated. I wonder how much time
is spent in OS functions on today's computers, though. Writing the OS
in threaded Forth with a few asm words for critical loops is a
time-tested approach. Traditional Forths are very well integrated with
assemblers, normally also written in Forth.

Elizabeth D. Rather

unread,

Nov 9, 2012, 10:51:42 PM11/9/12

to

On 11/9/12 2:10 PM, Rod Pemberton wrote:
...

>
> Well, I started my OS using existing C compilers, none of which have ever
> been used to produce an OS. At first, development was very fast. I didn't
> have to create my own toolchain. I didn't have to code in "primitive"
> assembly. However, C compilers are not completely independent of the
> environment they were coded for. Much of the C language is and many parts
> of the C library can be. But, there are still many things in the toolchain
> that produce environment specific code, e.g., code for interrupts. You
> can't get around it, without rewriting the toolchain you're using. Of
> course, the toolchain is so large and complex, you're not going to waste
> time doing that. You just have to handle the problems such environment
> specific code creates. So, eventually, you become frustrated with the lack
> of control. Next, you find bugs in the compiler or library, which you have
> to code around too. Finally, you come to the conclusion that you should've
> developed your own simple toolchain first, because you can't control
> everything you need to control for your OS without total control of the
> toolchain.

Love it. This is exactly, precisely why Chuck invented Forth in the
first place!

...

>> Why not use the Forth you're writing?
>>
>
> I can't not offend Forth users by replying honestly to that.
>
> Basically, I have yet to see Forth as being an effective language.
> Although, I've only coded various Forth dictionary words so far. However,
> there are a few things I find negative or detrimental about Forth. C or
> PL/1 are far more powerful and effective than any other languages from what
> I've seen. C and Forth are noteworthy in that they are far more widespread
> than other languages because of their minimal, simple machine models.
> Pascal, Fortran, BASIC, etc just aren't that useful, i.e., generally
> high-level, but no low-level, or other problematic features.
>
> I'd also need a Forth compiler, not a Forth interpreter. That would be
> another project... Generating code, even for very simple languages, is far
> more complicated than creating address-lists used by DTC and ITC.

Yet another occasion to point out that you really should learn to
program in Forth.... Those of us who do it regularly find it *more*
powerful than C or PL/1, not less. And quite a few of us worked
successfully for years on Forth OSs based on an ITC model (though one
designed for efficiency with assembler primitives).

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Mark Wills

unread,

Nov 10, 2012, 1:49:58 AM11/10/12

to

On Nov 10, 12:05 am, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
wrote:
>

> Well, I started my OS using existing C compilers, none of which have ever
> been used to produce an OS. At first, development was very fast. I didn't
> have to create my own toolchain. I didn't have to code in "primitive"
> assembly. However, C compilers are not completely independent of the
> environment they were coded for. Much of the C language is and many parts
> of the C library can be. But, there are still many things in the toolchain
> that produce environment specific code, e.g., code for interrupts. You
> can't get around it, without rewriting the toolchain you're using. Of
> course, the toolchain is so large and complex, you're not going to waste
> time doing that. You just have to handle the problems such environment
> specific code creates. So, eventually, you become frustrated with the lack
> of control. Next, you find bugs in the compiler or library, which you have
> to code around too. Finally, you come to the conclusion that you should've
> developed your own simple toolchain first, because you can't control
> everything you need to control for your OS without total control of the
> toolchain.
>

You're aware of *why* Forth was invented, right?

Mark Wills

unread,

Nov 10, 2012, 1:51:32 AM11/10/12

to

> Los Angeles, CA 90045http://www.forth.com

>
> "Forth-based products and Services for real-time
> applications since 1973."
> ==================================================

Ah! Sorry - should have read to the end first!

Mark Wills

unread,

Nov 10, 2012, 1:57:12 AM11/10/12

to

> > > I want a very small C subset. Technically, even that is a bit more
> > > than I really want. If assemblers could manage variables like C, I'd
> > > probably use an assembler.

Well, just write your OS in assembler. Forth can be your entire tool-
chain.

My own Forth system has a TMS9900 assembler, written in Forth. It's
the most powerful assembler I've ever used, not because I wrote it or
there's anything particularly special about it, but because Forth is
available to you, even at the assembly language level. It's difficult
to describe until you have tried it and experienced it. Also, having
things like IF...THEN...ELSE and BEGIN...UNTIL etc. at the assembly
language level is lovely.

Rod Pemberton

unread,

Nov 10, 2012, 2:15:29 AM11/10/12

to

"Mark Wills" <forth...@gmail.com> wrote in message

news:100469c5-bd7b-4936...@s12g2000vbw.googlegroups.com...

> On Nov 10, 12:05 am, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
> wrote:

...

For interactive control of a radio telescope, or because CM was frustrated
with high-level languages on now ancient, memory constrained, machines.

RP

Rod Pemberton

unread,

Nov 10, 2012, 2:16:19 AM11/10/12

to

"Paul Rubin" <no.e...@nospam.invalid> wrote in message

news:7xpq3md...@ruckus.brouhaha.com...

> "Rod Pemberton" <do_no...@notemailnotz.cnm> writes:
...

> What kind of hardware is it for, if I may ask?
>

x86 PC

Rod Pemberton

Rod Pemberton

unread,

Nov 10, 2012, 2:26:20 AM11/10/12

to

"Elizabeth D. Rather" <era...@forth.com> wrote in message
news:BLSdna1LYZfSUgDN...@supernews.com...

> On 11/9/12 2:10 PM, Rod Pemberton wrote:
...

> > [RP's OS experience]

>
> Love it. This is exactly, precisely why Chuck invented Forth in the
> first place!

I recall he developed Forth for control of a radio telescope, and then there
is the apparent frustration he had with programming high-level languages on
memory constrained machines. Given that scenario, I'm not sure how you see
my experience as being "exactly, precisely why Chuck invented Forth in the
first place!" I.e., other than "frustration" with some issue or another, I
don't see the similarity. I'm clearly not frustrated with coding
applications in C, just with the lack of control when coding an OS in C
using C compilers that weren't setup to do so. Had I used an "appropriate"
C compiler, I might not have ever had those issues. That was a mistake on
my part, but I chose them because of the environment, not the compiler's
capabilities.

Rod Pemberton

Elizabeth D. Rather

unread,

Nov 10, 2012, 3:01:30 AM11/10/12

to

Chuck's frustration wasn't limited to resource-constrained machines. At
the time he developed Forth he also had virtually unlimited access to an
IBM mainframe. It was frustration with the *tool chain*: the number of
different programs (each of which had some sort of language) he had to
use: OS control language, editor, compiler, assembler, etc. With every
step it was necessary to move between these different programs, each
with its own syntax, command set, rules, procedures, limitations, etc.
And every time he had to move from one of these to another it was a
break in his concentration on the actual problem he was trying to solve.
What he wanted was a single, unified environment that he could control,
and within which he was free to concentrate entirely on the program he
was trying to write.

Forth was conceived as that single, unified environment incorporating
its own OS, editor, compiler, assembler, etc., in which all were
available all the time, with consistent syntax and rules, so that moving
from one to another was transparent.

Since obviously he wasn't going to be permitted to replace IBM's OS, the
opportunity presented by dedicated minicomputers was what he needed.

Mark

unread,

Nov 10, 2012, 7:05:33 AM11/10/12

to

"Hugh Aguilar" wrote in message
news:60a11b41-caa8-42b8...@n2g2000pbp.googlegroups.com...

>On Nov 4, 1:11 pm, "Mark" <m...@dibsco.co.uk> wrote:
>> Hello
>>
>> I am looking at writing a buddy system memory allocator in assembler for
>> a
>> Forth system.
>>
>> I am not sure how best to implement the tracking of allocated blocks.
>> I've
>> read lots of papers and articles regarding buddy allocation and think
>> that I
>> understand the concepts quite well. I am working with embedded systems so
>> I
>> want to keep the memory and processing overheads as low as possible.
>
>I've never heard of a "buddy system" memory allocator. Can you provide
>any links to discussions of this?

Here are a few of the articles that I have looked at:

http://en.wikipedia.org/wiki/Buddy_memory_allocation
http://www.memorymanagement.org/articles/alloc.html
https://umdrive.memphis.edu/blstuart/htdocs/excerpt3.pdf
http://courses.engr.illinois.edu/cs241/sp2012/lectures/09-malloc.pdf
http://ssw.jku.at/Misc/SSW/01.MemoryManagement.pdf

>
>> I understand how I am going to handle free blocks, i.e. using a
>> linked-list
>> where, apart from the initial pointer, the free list pointers are
>> contained
>> within the free blocks themselves. I am going to maintain a free list per
>> block size, so the only additional memory overhead is an initial pointer
>> per
>> block size. In this case, for example, 32 free lists will cover up to a
>> total memory of 2^31 * (minimum block size), so I could quite easily
>> operate
>> with a minimum block size of 1 byte if I so desired.
>
>This seems pretty straight-forward. When you want to allocate some
>memory, you find the smallest (or maybe the largest) block that will
>work, and use that. Whatever is left over gets moved to the
>appropriate list.

Buddy systems work with blocks that are powers of two in size. All I need to
do to allocate N bytes is to round N up to the next nearest power of two and
look in the free list for that size. If there are no free blocks of that
size, you look at the next block size up and split it in half and so on.

>
>I wouldn't use a minimum of 1 byte as that is a waste of memory
>considering that the link field is one word in size. Also, a lot of
>processors have a "paragraph" that can be worked with efficiently. On
>the 16-bit x86 it is 16 bytes. On most micro-controllers (including
>the MSP-430 which I presume you are working on) it is 1 word.

You are right, a 1 byte block size would be a little wasteful. I was merely
trying to show the flexible range of block sizes that you can have with 32
free lists.

>
>On a processor with data caching, I would use trees rather than lists,
>and sort them by their address. If you always prefer the lowest (or
>the highest) address, your blocks will tend to be close together. This
>might help somewhat in that if you are accessing several blocks at the
>same time, all of them might be close enough together to be in the
>cache together.
>
>Be sure to provide support for my ALLOCATION word! That is a huge help
>to the user. See my novice package for more on this. Also, read this
>article:
>http://www.forth.org/novice.pdf

>
>> However, I'm not sure about the tracking of allocated blocks. I want a
>> quick
>> and easy way of tracking allocated blocks and buddies using a method that
>> doesn't involve dynamically growing a linked-list or tree at run-time. It
>> seems that this is commonly done using a bitmap, but I don't understand
>> how
>> you keep this manageable if you have a reasonable amount of memory and
>> want
>> a relatively small minimum block size. A bitmap of 32 bits will only give
>> me
>> coverage for 8k of memory if I use a minimum block size of 256 bytes. I
>> am
>> trying to figure out how to avoid having a bitmap that is hundreds or

>> thousands of bits long. Is there any easier or alternative technique? I
>> also
>> imagine that I need a method of tracking the sizes of the allocated
>> blocks
>> (in addition to the bitmap) so that a 'free' knows how many bytes to
>> de-allocate.
>
>Why do you want to track allocated blocks??? The only reason would be
>for GC, but that is very un-Forth-like. You really don't have a good
>way to track what blocks are still in use (neither a reference count
>nor yet searching through the allocated blocks for any that aren't
>referenced), as Forth doesn't distinguish pointers in any way --- they
>are just a value on the stack, and they can't be distinguished from
>integers.
>
>I wouldn't mess with tracking allocated blocks at all --- just leave
>it up to the user to deallocate memory blocks when they are no longer
>in use.

I need to track allocated blocks so that I know how much memory to free and
which buddies to merge if any.

When the user calls a 'free' word with a single address of the allocated
region, the memory allocator needs to know how big the memory block is so
that it can return the correct amount of memory to the free list(s). With a
buddy system the memory allocator also needs to calculate the address of the
memory block's buddy so that the two blocks can be merged if necessary. You
need to know the size of the block so that you can work out where the buddy
block starts.

>
>For debugging purposes, it might be useful to have a word that
>determines if any memory is allocated, or if everything is free (but
>doesn't tell you when or where the memory got allocated). Sometimes,
>at certain points in your program, you know that everything should be
>free. If anything isn't, then there must be a leak somewhere. For
>example, in a micro-controller you might have a paced-loop. Some tasks
>will be spread out over several iterations of the loop, and they will
>hold data in the heap while they are active. You can determine,
>however, if nothing is active, at which time you can do your check.

I agree that some debugging words would be useful. They'll probably come
from debugging words written as part of the development process.

>You don't have to track the allocated blocks at all --- just traverse
>all of your free nodes and build a sorted list of allocated blocks ---
>that is somewhat slow, but it is just for debugging purposes and it
>can be removed from the production release of the program so the micro-
>controller doesn't have any awkward moments when it freezes up.
>
>Because Straight Forth will support ALLOCATION (which Forth-200x
>won't), I will need to write my own heap. I'll definitely be
>interested in seeing what you come up with, and perhaps incorporating
>it directly into Straight Forth. :-) I don't know enough about the
>subject right now to dive into it, not without some research first.

I am going to be writing the allocator in ARM assembler, but I'm sure that I
could back-port the algorithm to Forth if others would find it useful.

Mark

unread,

Nov 10, 2012, 7:53:27 AM11/10/12

to

"Hugh Aguilar" wrote in message

news:d77f4ffe-1c80-45d9...@jj5g2000pbc.googlegroups.com...

>On Nov 4, 4:05 pm, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
>> On Nov 4, 1:11 pm, "Mark" <m...@dibsco.co.uk> wrote:
>> On most micro-controllers (including
>> the MSP-430 which I presume you are working on) it is 1 word.
>

>When I wrote this, I was thinking that you were Mark Wills, our brave
>TI afficianado. Now I realize that you are a different Mark.

Yes, I am a different Mark.

>
>Can you tell us which Forth system you are working with? Is this for a
>homebrew system, or a publicly-available one? Is this for a small 8-
>bit or 16-bit micro-controller, or for a big 32-bit processor? What is
>your ultimate goal?

This is for my homebrew system. I've been dabbling in Forth for the last 30
years or so, but never that seriously (former teenage Saturday assistant at
Skywave Software http://www.dibsco.co.uk/index.php/skywave-software ).

I decided to teach myself ARM assembler and writing a homebrew Forth system
seemed like a good way of 'killing two birds with one stone'. I work
professionally with embedded systems and embedded Linux, so my current
target system is a Linux-based ARM9 system. I'm not quite sure what my
ultimate goal is at the moment, although I do have the following current
goals in mind:

- Full ANS-94 compliant system with all wordsets.
- Small footprint and fast operation (with some loss of portability).
- Flexible build options - inline or common NEXT, Linux or native, threading
technique etc.

I have been developing and testing on a Linux-based PC using qemu to test my
code. I just have the core, core extension, double and double extension
wordsets completed at the moment. I have a number of longer terms goals in
the back of my mind including:

- native Forth system, i.e. not running under Linux, that could be used for
embedded control applications or as a board test/diagnostic suite and
bootloader.
- as a CGI or general purpose scripting language.

Both of these goals come from ideas that would potentially make my
professional development work a little easier.

Mark

unread,

Nov 10, 2012, 8:09:52 AM11/10/12

to

"Alex McDonald" wrote in message
news:94819742-bce4-43d3...@a6g2000vbl.googlegroups.com...

>On Nov 5, 7:06 pm, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
>wrote:
>> "Mark" <m...@dibsco.co.uk> wrote in message
>>
>> news:yZzls.238791$it2....@fx22.am4...
>> ...

>>
>> > However, I'm not sure about the tracking of allocated blocks. I want a
>> > quick and easy way of tracking allocated blocks and buddies using a
>> > method that doesn't involve dynamically growing a linked-list or tree
>> > at run-time. It seems that this is commonly done using a bitmap, but
>> > I don't understand how you keep this manageable if you have a
>> > reasonable
>> > amount of memory and want a relatively small minimum block size. A
>> > bitmap of 32 bits will only give me coverage for 8k of memory if I use
>> > a minimum block size of 256 bytes. I am trying to figure out how to
>> > avoid having a bitmap that is hundreds or thousands of bits long. Is
>> > there any easier or alternative technique?
>>

>> This issue affects file systems too. So, you may wish to think of your
>> memory allocator as a type of in-memory file-system or read up on how
>> file
>> systems to understand how they solve the problem.
>
>Although that may superficially appear to be the case, file systems
>are designed to solve a quite different problem; how do you minimize
>file IO, reduce seeks to a minimum and parallelise operations across
>many devices to decrease latency and increase bandwidth. Many file
>systems are btree or b+tree based as those algorithms match well the
>limitations of disks.
>
>Memory allocators that use btrees can be found; http://locklessinc.com/
>for example. But their domain is something like MPI (message passing)
>where the interface benefits from a btree implementation; and even
>this allocator reverts to a slab allocator for small allocations.

>
>The OP needs to state the use case for his allocator.

I am trying to implement a general purpose memory allocator to provide the
ANS-94 memory allocation wordset features for an embedded ARM Forth. Speed
and simplicity are the most important factors. The buddy system seems to
offer this, possibly at the expense of internal fragmentation. I don't want
to have to spend a lot of time walking through lists and I don't want to
have any dynamic structures that need to grow or shrink at runtime.

Mark

unread,

Nov 10, 2012, 8:43:37 AM11/10/12

to

"Mark Wills" wrote in message
news:1b12d69b-ae6e-4a1a...@k6g2000vbr.googlegroups.com...

>On Nov 7, 9:34 am, "Mark" <m...@dibsco.co.uk> wrote:
>> "Rod Pemberton" wrote in messagenews:k792o8$c4l$1...@speranza.aioe.org...

>> >"Mark" <m...@dibsco.co.uk> wrote in message
>> >news:yZzls.238791$it2....@fx22.am4...
>> >...
>>
>> >> However, I'm not sure about the tracking of allocated blocks. I want a
>> >> quick and easy way of tracking allocated blocks and buddies using a
>> >> method that doesn't involve dynamically growing a linked-list or tree
>> >> at run-time. It seems that this is commonly done using a bitmap, but
>> >> I don't understand how you keep this manageable if you have a
>> >> reasonable
>> >> amount of memory and want a relatively small minimum block size. A
>> >> bitmap of 32 bits will only give me coverage for 8k of memory if I use
>> >> a minimum block size of 256 bytes. I am trying to figure out how to
>> >> avoid having a bitmap that is hundreds or thousands of bits long. Is
>> >> there any easier or alternative technique?
>>
>> >This issue affects file systems too. So, you may wish to think of your
>> >memory allocator as a type of in-memory file-system or read up on how
>> >file
>> >systems to understand how they solve the problem.
>>

>> Thanks, I'll take a look at the subject.
>>
>>
>>
>> >Bitmaps are good for known and fixed sizes since they're so compact.
>> >But,
>> >like everything else, they do consume space. I'm not sure that there is
>> >an
>> >"easy" solution of a small bitmap with small allocation sizes when being
>> >used to map a large amount of memory. You're going to need alot of
>> >bits.
>> >Of course, the total size of system memory depends on how much was
>> >installed
>> >by the user. This is unlike removable media which is always fixed, or
>> >multiple sizes with one maximum size. For an embedded system or OS, you
>> >could pick a maximum size of memory for which the code will work, then
>> >calculate backwards for what you need to implement ... In the future,
>> >the
>> >code may need to be 'fixed' to handle more memory.
>>
>> I had assumed that I would work with a fixed memory size, number of
>> blocks
>> etc. as most embedded boards probably ship with a fixed (soldered) amount
>> of
>> RAM. It is not a problem to have a few '#defines' or equivalent somewhere
>> to
>> configure these things at build time.
>>
>>
>>
>> >E.g., you're likely familiar with Microsoft's FAT12/16 use FATs (File
>> >Allocation Table) to keep track of files or perhaps Unix's inodes.
>> >You're
>> >likely unfamiliar with CBM (Commodore Business Machines) filesystem,
>> >which
>> >used bitmaps. CBM's PC's (personal computers) like the C64's and Vic
>> >20's,
>> >used CBM disk drives, such as the 1541, that ran CBM's DOS (Disk
>> >Operating
>> >System). CBM's DOS used bitmaps called BAMs (Block Availability Map) to
>> >keep track of allocated sectors. The linked-list of sectors for the
>> >ordering of a file's sectors was stored in the sectors with the data,
>> >unlike
>> >FATs. If interested, the D64 format documents 1541's format:
>> >http://ist.uwaterloo.ca/~schepers/formats/D64.TXT
>>
>> Thanks, I'll take a look.
>>
>>
>>
>> >As for memory allocators, there are a few to be found on the internet,
>> >none
>> >of which are coded in Forth. They may provide you with some ideas,
>> >e.g.:
>>
>> >"A Memory Allocator," by Doug Lea
>> >http://g.oswego.edu/dl/html/malloc.html
>>
>> >"The BGET Memory Allocator," by John Walker
>> >http://www.fourmilab.ch/bget/
>>
>> >"Dynamic Storage Allocator," Richard Harter, comp.lang.c, Nov. 11, 1990
>> >http://groups.google.com/group/comp.lang.c/msg/7da27dcbc6e2ace1
>>
>> I've looked at a lot of papers on memory allocators, including your first
>> link above. I had settled on the buddy system approach because it seemed
>> to
>> offer a good trade-off between speed and fragmentation.- Hide quoted
>> text -
>>
>> - Show quoted text -
>
>What is the definition of 'buddy' in this context? Are two more
>consecutive memory blocks that are assigned to the same object
>considered to be buddies?

The buddy system memory allocator handles memory blocks that are always
powers of two in size. A request for N bytes will be rounded up to the next
power of two. If a block of that size exists, it will be allocated. If it
doesn't exist, larger blocks will be split into two pieces until a block of
the correct size exists. Each time a block is split the other half becomes
it's buddy.

For example, if your total memory (heap) is 64 bytes and you want to
allocate 12 bytes...

12 is rounded up to 16 bytes. No blocks of 16 bytes exist. The 64 bytes (A)
are split into two blocks of 32 bytes (B and C are buddies). There are still
no blocks of 16 bytes. Block B is split into two blocks of 16 bytes (D and E
are buddies). Block D is allocated.

A: 64 bytes
B: 32 bytes C: 32 bytes
D: 16 bytes E: 16 bytes C: 32 bytes
D: [16 bytes] E: 16 bytes C: 32 bytes

When a block is freed it is merged with it's buddy if it is also free. This
process is repeated until a block cannot be merged with it's buddy.

Using powers of two makes it easy to locate a block's buddy without having
to search any lists.

See:
http://en.wikipedia.org/wiki/Buddy_memory_allocation

>
>I've yet to study dynamic memory allocation in any detail, as in my

>assembler and C days we relied on static allocation (which worked
>perfectly) and in my later higher-level language (VB and .Net) it was

>handled automagically. It's a very interesting subject.

There was a time in my life too when the projects that I worked on avoided
dynamic allocation at all costs.

Alex McDonald

unread,

Nov 10, 2012, 10:43:42 AM11/10/12

to

The buddy system is fine as long as you realise that you need to be
able to tell if a buddy is free or allocated when coalescing blocks.
That takes a marker; either something in the buddy itself, or a tree
of pointers to allocations. The marker technique means that you need
something unique in a free block that an allocated block can't
contain. Some memory systems have special tags associated with each
word that can store this information, much like an ECC bit. But most
don't, which means building and handling trees; and they too need to
live somewhere in memory.

A slab allocator can be much simpler and more efficient. For optimal
memory use it requires that you (roughly) know the size and number of
allocations in advance. Running the code and logging allocations and
frees can provide the information you need. Even without making it
tunable or worrying about optimal use, it's still a fairly simple
programming exercise, and by using bitmaps, the memory overhead of
maintaining free & allocated blocks is minimized.

Rod Pemberton

unread,

Nov 10, 2012, 8:25:40 PM11/10/12

to

"Mark" <ma...@dibsco.co.uk> wrote in message
news:r7sns.233489$A%.87153@fx26.am4...

> [...] I'm not quite sure what my

> ultimate goal is at the moment, although I do have the following current
> goals in mind:
>
> - Full ANS-94 compliant system with all wordsets.

Why?

My thought process says that there should only be a few ANS Forth systems
with all ANS wordsets implemented, e.g., commercial Forths and
maybe a few hobbyist Forths. I.e., I think that there will be very few
Forth's that have much more than ANS CORE and CORE EXT implemented.
If my thinking is correct, then Forth code that needs more than just the
CORE and CORE EXT wordsets is likely to be scarce too. I.e., if you're
going to use Forth code provided by others, you're probably fine without
a full Forth.

Rod Pemberton

Josh Grams

unread,

Nov 11, 2012, 6:49:49 AM11/11/12

to

Rod Pemberton wrote: <k7muil$93u$1...@speranza.aioe.org>

I know of *very* few non-trivial Forth programs that use only CORE and
CORE EXT, and I don't think there are *any* Forth systems that provide
only that. Most of the major Forth systems provide all wordsets, or at
least a most of most of the wordsets, and even toy systems usually
include some things from TOOLS and FILE and STRING.

--Josh

Charles Mélice

unread,

Nov 11, 2012, 8:29:53 AM11/11/12

to Mark

Le dimanche 4 novembre 2012 21:11:10 UTC+1, Mark a écrit :
> Hello
>
>
>
> I am looking at writing a buddy system memory allocator in assembler for a
>
> Forth system.
>
>
>
> I am not sure how best to implement the tracking of allocated blocks. I've
>
> read lots of papers and articles regarding buddy allocation and think that I
>
> understand the concepts quite well. I am working with embedded systems so I
>
> want to keep the memory and processing overheads as low as possible.
>
>
>

> I understand how I am going to handle free blocks, i.e. using a linked-list
>
> where, apart from the initial pointer, the free list pointers are contained
>
> within the free blocks themselves. I am going to maintain a free list per
>
> block size, so the only additional memory overhead is an initial pointer per
>
> block size. In this case, for example, 32 free lists will cover up to a
>
> total memory of 2^31 * (minimum block size), so I could quite easily operate
>
> with a minimum block size of 1 byte if I so desired.
>
>
>

> However, I'm not sure about the tracking of allocated blocks. I want a quick
>
> and easy way of tracking allocated blocks and buddies using a method that
>
> doesn't involve dynamically growing a linked-list or tree at run-time. It
>
> seems that this is commonly done using a bitmap, but I don't understand how
>
> you keep this manageable if you have a reasonable amount of memory and want
>
> a relatively small minimum block size. A bitmap of 32 bits will only give me
>
> coverage for 8k of memory if I use a minimum block size of 256 bytes. I am
>
> trying to figure out how to avoid having a bitmap that is hundreds or
>

> thousands of bits long. Is there any easier or alternative technique? I also
>
> imagine that I need a method of tracking the sizes of the allocated blocks
>
> (in addition to the bitmap) so that a 'free' knows how many bytes to
>
> de-allocate.
>
>
>

> Any help would be much appreciated.
>
>
>
> Thanks
>
>
>
> Mark

Here is a simpler system I have used:

base-memory dictionary-size + value heap-link

: heap ( size -- addr )
heap-link swap - ( addr )
heap-link over cell- ( addr link addr- )
dup to heap-link ! ;

: release ( -- )
heap-link @ to heap-link ;

\ NB: release can be automated inside ';' -- then no more necessary

\ ---- SAMPLE TEST ----

: use-mem ( addr -- )
314 swap ! ;

: test
cell heap use-mem
...
cell heap use-mem
...
release
...
release ;

Bernd Paysan

unread,

Nov 11, 2012, 11:43:39 AM11/11/12

to

Josh Grams wrote:
> I know of *very* few non-trivial Forth programs that use only CORE and
> CORE EXT, and I don't think there are *any* Forth systems that provide
> only that. Most of the major Forth systems provide all wordsets, or
> at least a most of most of the wordsets, and even toy systems usually
> include some things from TOOLS and FILE and STRING.

Embedded Forth systems usually limit themselves to CORE plus a few
things you need on these idiosyncratic processors.

I'd say that even the peg solitaire robot I did using the b16 is a "non-
trivial Forth program", and the b16 has just ~30 Forth words as
instructions. Far less than CORE.

On the other hand, on a PC or smartphone, I expect my Forth system to
provide access to system libraries like OpenGL.

Elizabeth D. Rather

unread,

Nov 11, 2012, 12:06:10 PM11/11/12

to

It really depends on your objectives in developing the system. Some
people want to write a Forth just for the experience of getting it up
and running, and don't really plan to use it for applications. Why
should they feel obligated to include wordsets they don't need? Others
have specific application (or application domains) in mind, and should
pick and choose those features and wordsets directly applicable to the
intended use.

Those of us who are developing systems for general use obviously need to
cover a broader spectrum of utility, and there's a distribution
advantage to claiming not only full Standard coverage but also
additional features (programmer aids, libraries, utilities, etc.), but
we never forget that Forth is primarily a tool for application
development and one that we use daily for that purpose.

Josh Grams

unread,

Nov 11, 2012, 8:55:20 PM11/11/12

to

Bernd Paysan wrote: <3845103.S...@sunwukong.fritz.box>

> Josh Grams wrote:
>> I know of *very* few non-trivial Forth programs that use only CORE and
>> CORE EXT, and I don't think there are *any* Forth systems that provide
>> only that. Most of the major Forth systems provide all wordsets, or
>> at least a most of most of the wordsets, and even toy systems usually
>> include some things from TOOLS and FILE and STRING.
>
> Embedded Forth systems usually limit themselves to CORE plus a few
> things you need on these idiosyncratic processors.
>
> I'd say that even the peg solitaire robot I did using the b16 is a "non-
> trivial Forth program", and the b16 has just ~30 Forth words as
> instructions. Far less than CORE.

Yeah, sorry; I was thinking desktop use, not embedded.

--Josh

Hugh Aguilar

unread,

Nov 13, 2012, 10:49:51 PM11/13/12

to

On Nov 10, 5:07 am, "Mark" <m...@dibsco.co.uk> wrote:
> "Hugh Aguilar" wrote in message

> Buddy systems work with blocks that are powers of two in size. All I need to
> do to allocate N bytes is to round N up to the next nearest power of two and
> look in the free list for that size. If there are no free blocks of that
> size, you look at the next block size up and split it in half and so on.

Okay, I'm aware of the idea of making all of the allocated blocks a
power-of-2 size. I hadn't known that was called "buddy system." I'll
read up it, starting with those articles that you mentioned.

> >Be sure to provide support for my ALLOCATION word! That is a huge help
> >to the user. See my novice package for more on this. Also, read this
> >article:
> >http://www.forth.org/novice.pdf

> >...

> >Why do you want to track allocated blocks??? The only reason would be
> >for GC, but that is very un-Forth-like. You really don't have a good
> >way to track what blocks are still in use (neither a reference count
> >nor yet searching through the allocated blocks for any that aren't
> >referenced), as Forth doesn't distinguish pointers in any way --- they
> >are just a value on the stack, and they can't be distinguished from
> >integers.
>
> >I wouldn't mess with tracking allocated blocks at all --- just leave
> >it up to the user to deallocate memory blocks when they are no longer
> >in use.
>
> I need to track allocated blocks so that I know how much memory to free and
> which buddies to merge if any.
>
> When the user calls a 'free' word with a single address of the allocated
> region, the memory allocator needs to know how big the memory block is so
> that it can return the correct amount of memory to the free list(s). With a
> buddy system the memory allocator also needs to calculate the address of the
> memory block's buddy so that the two blocks can be merged if necessary. You
> need to know the size of the block so that you can work out where the buddy
> block starts.

Well, you don't need to keep track of the allocated blocks to do this.
You wouldn't want to anyway, as that would involve searching through a
list or whatever to find it, which is slow.

If you are going to support ALLOCATION (please do!), then you have
killed two birds with one stone. At the front of every allocated block
you have the size of the block (which is a power of two). This tells
you where the buddy is (immediately after it) and how big the buddy
is.

Note that it is okay to store the size of the memory-block (a power-
of-2 in your case) rather than the size that the user requested (which
is <= to what he got). I use ALLOCATION primarily for CLONE-NODE that
clones a node in a list (or a tree or whatever). This needs ALLOCATION
so that it knows how much data to CMOVE from the original node to the
clone. If it moves more than necessary (more than the user originally
requested, but the amount that he actually got) that is okay --- this
would be slightly slower because CMOVE is moving some garbage data at
the tail that it doesn't strictly need to, but the speed difference is
so small as to be meaningless.

> I am going to be writing the allocator in ARM assembler, but I'm sure that I
> could back-port the algorithm to Forth if others would find it useful.

You may want to look into Menuet. This is an OS written in NASM for
the x86. The 32-bit version is open-source and the 64-bit version does
not come with source-code but is freely available to use. I think
there is an ARM version also, but you would have to ask about this on
the forum:
http://www.menuetos.net/
I'm considering writing a Forth for Menuet --- that would be the first
high-level-language available for Menuet, as those guys normally write
everything in NASM and distain to use HLLs at all. They might use
Forth though, if I present Forth not as an HLL but rather as a wrapper
and a framework for assembly-language code (which is the way that I
thought of Forth in my Commodore-64 days).

Note that Menuet uses an old version of NASM.