Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Simplifying memory management

33 views
Skip to first unread message

Marven Lee

unread,
May 19, 2012, 2:00:33 PM5/19/12
to
I've been thinking of modifying my memory management code
that I touched upon in the "how does paging work" thread. As I
mentioned I've got a high level description of a process's address
space. Copying from an old post an address space looks like this:

AddressSpace (range 0x0000 to 0x10000) ---> Pmap (pagetables)
|
\|/
MemRegion7 (0x0000 to 0x1000, mapped) -> pf1
|
MemRegion3 (0x1000 to 0x3000, unmapped)
|
MemRegion1 (0x3000 to 0x6000, mapped) -> pf2 - pf5 - pf7
|
MemRegion4 (0x6000 to 0x8000, mapped) -> pf6 - pf3
|
MemRegion2 (0x8000 to 0x10000, unmapped)

The AddressSpace maintains a linked list of MemRegion structures
sorted by address. There is also a linked list of free MemRegions
to make allocation a little faster. These MemRegions are allocated
from a fixed pool in the kernel, shared by all processes.

Each MemRegion maintains a linked list of Pageframes that have
been allocated for that area, sorted by virtual address. From these
structures the actually page tables in the Pmap can be constructed.

I am thinking of simplifying things a lot as I don't intend to add
swapping or shared memory now.

Instead of having a pool of MemRegions, allocating them and
attaching them onto a linked list I was thinking of having a fixed
size array per process of MemRegions, perhaps room for
1000 entries and a count of the number of MemRegions in use.

struct MemRegion
{
uint32 base;
uint32 ceiling;
uint32 type; AREA_FREE, AREA_PHYS or AREA_ANON
uint32 padding/other;
};

Initially there is one entry in the array that describes the whole
of user mode as free, base = 0 ceiling = 0xF0000000. As
entries are added and the address space is carved up the array
expands.

Finding a specific area within an address space would just be a
binary search on the array. A search for a free area might
be a linear search or an extra field added to the MemRegion
to maintain a list of free areas.

I'm not sure if 1000 entries would be enough for a process. Typically
I allocate areas the size of 64k and malloc() then carves them up
into smaller units, currently using a slab allocator. Bigger allocations
just allocate an area directly.

I only intend to have say 128 processes so 16k for an array of
MemRegion structures isn't bad. Perhaps I could use just the base
and type fields, using only 8 bytes per MemRegion, it could
determine the ceiling of an area from next Memregion's base.

My thinking is that an array of MemRegions is easier to search,
just a simple binary search. Inserting a region may not be that bad,
especially if inserting at or near the tail of the array. Also linked
lists in my current design may go through a lot of TLB entries
when searching.

In the old version above Pageframes are attached to the MemRegion
on linked lists. There is a Pageframe for each page of physical memory
in the system. With the new way, just the page tables are used to
keep track of which pages are allocated for an area of memory.
So a Pageframe might be something simple like:

struct Pageframe
{
uint32 flags;
struct Pageframe *next_free
};


One of the reasons for wanting to simplify my memory management
is that I've been thinking of different ways to do IPC. Currently I've
got an asynchronous message passing system that copies data
between processes using a single copy. It's sort of a mixture of
AmigaOS's and QNX. It works but I was thinking of alternatives
or possibly adding to the existing code.

I was thinking of zero-copy IPC where an area of memory is unmapped
from one address space and then mapped into another. With simpler
data structures it would make remapping pages from one process to
another quicker.

If an area was made irrevocably read-only then it could be shared
between several processes instead of unmapping, that would need
a reference count inside the Pageframe structures. I gather an
old system called MUSS did something similar but perhaps that
used segmentation.

I think KVP had a similar scheme in mind years ago, not sure how
well it worked:

http://groups.google.com/group/alt.os.development/msg/bec3b82a84289d82

If my thoughts get any clearer on it I'll post some more.


--
Marv

Rod Pemberton

unread,
May 19, 2012, 7:58:32 PM5/19/12
to
"Marven Lee" <marv...@gmail.com> wrote in message
news:a1q5eo...@mid.individual.net...
> I've been thinking of modifying my memory management code

...

> [...] I was thinking of having a fixed
> size array per process of MemRegions, perhaps room for
> 1000 entries and a count of the number of MemRegions in use.
>
> struct MemRegion
> {
> uint32 base;
> uint32 ceiling;
> uint32 type; AREA_FREE, AREA_PHYS or AREA_ANON
> uint32 padding/other;
> };

If this is per process, why does the process need to know about AREA_FREE?

Shouldn't all AREAs be NOT_FREE for a process ... ?


struct MemRegion
{
uint32 base;
uint32 ceiling;
};

Smaller is better.

Of course, I'm surprised there isn't a pointer or two for your linked-list
within your struct and no typedef...

typedef struct MmRgn
{
struct MmRgn *link;
uint32 base;
uint32 ceiling;
} MemRegion;

MemRegion *node;

Then, you can apply node to any address of an array or malloc'd area.

Of course, you can keep the pointers in a separate array, i.e., one array
for MemRegion's and one array for the linked-list. That might be an
advantage.

typedef struct MmRgn
{
uint32 base;
uint32 ceiling;
} MemRegion;

typedef struct PtrMmRgn
{
struct PtrMmRgn *link;
struct MmRgn *loc;

} PtrMemRegion;

That's definately more flexible, since you can change both links, i.e.,
re-link, and which memory block a link points to.

> Initially there is one entry in the array that describes the whole
> of user mode as free, base = 0 ceiling = 0xF0000000. As
> entries are added and the address space is carved up the array
> expands.
>

Do you have "garbage collection" (or coalescing)? I.e., once the address
space is all chopped up (fragmentation), how do programs allocate large
blocks of memory?

> Finding a specific area within an address space would just be a
> binary search on the array. A search for a free area might
> be a linear search or an extra field added to the MemRegion
> to maintain a list of free areas.

I read this today on Wikipedia:
http://en.wikipedia.org/wiki/Skip_list

> I'm not sure if 1000 entries would be enough for a process.

Set it to 20, see if it crashes ...
(No, that was not a humorous response.)

The point being, you sort of need to know the average allocation by a
program of memory and stack space to provide balance your code design. Of
course, you're always going to have some processes which allocate huge
amounts. So, you need to know peak allocation too.

> I allocate areas the size of 64k and malloc() then carves them up
> into smaller units, currently using a slab allocator. Bigger allocations
> just allocate an area directly.

According to Wikipedia, you shouldn't have fragmentation issues with a slab
allocator.
http://en.wikipedia.org/wiki/Slab_allocator

From it's description, it seems like you'd need to have some good statistics
on the sizes and quantity of objects that would be allocated by a program.
Apparently, the slab allocator pre-divides memory into blocks of fixed size
for the expected allocations of various data objects.

> My thinking is that an array of MemRegions is easier to search,
> just a simple binary search.

Yes, especially if you have a pointer in each MemRegion so that you can skip
over no longer used MemRegions.

> Inserting a region may not be that bad,
> especially if inserting at or near the tail of the array.

Why insert?

If a MemRegion is not in use, leave it in place and re-link around it. If
you need a free MemRegion, re-use a skipped over MemRegion, or concatenate a
new one onto the end of the array.


Rod Pemberton



Marven Lee

unread,
May 25, 2012, 9:40:47 AM5/25/12
to

Rod Pemberton <do_no...@notemailntt.cmm> wrote:
> Marven Lee <marv...@gmail.com> wrote:
> [...]
> If this is per process, why does the process need to know about AREA_FREE?
>
> Shouldn't all AREAs be NOT_FREE for a process ... ?

Currently my processes keep a sorted double linked list of MemRegions
that are either:

1) allocated (anonymous memory)
2 ) mapped region of physical memory
3) free

In the allocator it was easier to search for a suitable free MemRegion and
then use a pointer to it in the remaining code to split it into 1, 2 or 3
MemRegions. The allocator has an option to allocate at specific virtual
addresses such as for mapping an executable's code and data sections.
So a MemRegion was split depending on:

1) region is same size as allocation, mark as allocated
2) split region in half, bottom half is allocated, to half marked free
3) split region in half, bottom half is free, to half marked allocated
4) split region into 3, bottom is free, middle is allocated, top is free

There is a secondary list maintained by each process of just the free
regions which made searching for free memory a bit quicker. There is
a third list that a MemRegion can be put on, the unused list which is just
a global pool of free MemRegion structures that aren't being used.

But your comment got me thinking about whether I'd need a list of free
regions if I convert my code to an array instead of the linked lists. I
suppose it's easy enough to determine the size of the free regions
between the allocated regions.

> struct MemRegion
> {
> uint32 base;
> uint32 ceiling;
> };
>
> Smaller is better.
>
> Of course, I'm surprised there isn't a pointer or two for your linked-list
> within your struct and no typedef...

I tend to only use typedefs for simple types, like bool, uint32, function
pointers and occasional structures that don't have pointers or refer to
themselves. Perhaps I could

> Of course, you can keep the pointers in a separate array, i.e., one array
> for MemRegion's and one array for the linked-list. That might be an
> advantage.

I do something like that in my Process structure to maintain a list of
free handles, it's sort of like a FAT table of handles.

> Do you have "garbage collection" (or coalescing)? I.e., once the address
> space is all chopped up (fragmentation), how do programs allocate large
> blocks of memory?

No garbage collection. If a region is freed then it is merged into a single
region if the regions below and above are free. This is the opposite of
the 4 options I listed that the allocator does.

> According to Wikipedia, you shouldn't have fragmentation issues with a
> slab allocator.

There is no internal fragmentation within a slab but there will be external
fragmentation between slabs and other allocations.. The slabs are always
64k aligned and have a maximum size of 64k, this makes it easy to round
down an object's address to find the slab header. Slabs start out at 4k
and then and each new slab grows by 4k upto 64k.

Objects larger than a few kilobytes don't use the slab allocator, just the
VirtualAlloc()/VirtualFree() system calls directly, these are also
64k aligned. Slab objects won't be 64k aligned due to the header,
so when it comes to freeing an object it is easy to determine if it is
a slab object or allocated directly with VirtualAlloc().

>> Inserting a region may not be that bad,
>> especially if inserting at or near the tail of the array.
>
> Why insert?
>
> If a MemRegion is not in use, leave it in place and re-link around it. If
> you need a free MemRegion, re-use a skipped over MemRegion, or concatenate
> a > new one onto the end of the array.

I hadn't really understood why you were mentioning the link lists in your
original post, especially as I wanted to get away from them by using an
array.
The thought hadn't occurred to me to leave unused regions in place and
skip over them in the array.


--
Marv

Rod Pemberton

unread,
May 25, 2012, 2:40:18 PM5/25/12
to
"Marven Lee" <marv...@gmail.com> wrote in message
news:a29gfd...@mid.individual.net...

> I hadn't really understood why you were mentioning the link lists in your
> original post, especially as I wanted to get away from them by using an
> array.
>
> The thought hadn't occurred to me to leave unused regions in place and
> skip over them in the array.
>

The less work your code does, the faster it is, e.g., "lazy" garbage
collection. Of course, negatives can come up by doing so, requiring more
work to balance things, e.g., fragmentation, larger amount of work to do
later and all-at-once, etc.

If the array is "fresh," then all the nodes will be sequential, just like an
array. After some time, those links
will become quite jumbled, and they might be jumping from the beginning to
the end and back. That
probably doesn't delay pointer access to the nodes. But, all that jumping
around might start to mess with caching of the data (or locality of
reference). If so, you can call a routine to compact the array,
periodically. At the same time, it'll be re-linked and re-ordered.

Personally, I think the idea of a skip list (or similar) with slab allocator
"sounds" very efficient to me. I'm going to keep that in mind.


Rod Pemberton



0 new messages