
Should memory be washed?


Peter Gibbs

Nov 4, 2002, 3:27:04 PM
to perl6-internals
What is the official position with respect to laundry services in the
Parrot memory allocation code?

Some code assumes that the memory returned by Parrot_allocate
and its cousins will be pre-washed, while other code does its own
laundry. The same applies to 'bufferlike' headers - the buffer fields
will always be initialised, but the remainder of the allocated header
is sometimes assumed dirty, and sometimes assumed clean.
(The latter currently breaks recycling of bufferlike headers)

Making the allocation routines responsible for providing clean
memory simplifies things for everybody else, but with a potential
performance cost.
--
Peter Gibbs
EmKel Systems


Leopold Toetsch

Nov 5, 2002, 4:03:34 AM
to Peter Gibbs, perl6-internals
Peter Gibbs wrote:

> What is the official position with respect to laundry services in the
> Parrot memory allocation code?


I would strongly argue for calloc()ed memory, as is done now.


> Some code assumes that the memory returned by Parrot_allocate
> and its cousins will be pre-washed, while other code does its own
> laundry.


By just deleting all this x->y = 0 and memsets all over the place we
would save hundreds of source lines.

> ... The same applies to 'bufferlike' headers - the buffer fields
> will always be initialised, but the remainder of the allocated header
> is sometimes assumed dirty, and sometimes assumed clean.
> (The latter currently breaks recycling of bufferlike headers)


Ah, this is the reason - well spotted. I did try to add_free these
buffers, with a lot of breaking tests in list and hash, which are the
main users of buffer_likes. Cleaning the part beyond a plain Buffer in
dod.c before add_free_buffer would do the trick I presume?
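
The clean-up suggested here can be sketched as follows. This is a minimal illustration under assumed names: `Buffer`, `ListHeader` and `wash_bufferlike_tail` are hypothetical stand-ins, not Parrot's actual header layout or the code in dod.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified plain buffer header. */
typedef struct Buffer {
    void    *bufstart;  /* start of the managed memory */
    size_t   buflen;    /* length of that memory */
    unsigned flags;
} Buffer;

/* Hypothetical "bufferlike" header: a plain Buffer plus extra fields. */
typedef struct ListHeader {
    Buffer  buf;          /* the allocator always initialises this part */
    size_t  item_count;   /* extra fields: dirty after recycling unless washed */
    void   *first_chunk;
} ListHeader;

/* Zero everything past the plain Buffer part of a header. Meant to be
 * called just before the header is put back on the free list. */
void wash_bufferlike_tail(void *header, size_t header_size)
{
    assert(header != NULL && header_size >= sizeof(Buffer));
    memset((char *)header + sizeof(Buffer), 0, header_size - sizeof(Buffer));
}
```

With this in place, code that takes a recycled `ListHeader` may rely on `item_count == 0` and `first_chunk == NULL`, while the `Buffer` part is still set up by the header allocator as before.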


> Making the allocation routines responsible for providing clean
> memory simplifies things for everybody else, but with a potential
> performance cost.


Yes to first - but I don't think it's a performance penalty. Almost all
used fields have to be cleaned first (and currently are, explicitly).

The only exception is currently Parrot_reallocate_string in res_lea.c,
which doesn't clean the realloced part - and this does no harm.
I think we could go for uncleaned memory in strings and res_lea. The
allocator in resources uses cleaned memory pools, so this wouldn't change.

leo


Leopold Toetsch

Nov 5, 2002, 4:45:59 AM
to Peter Gibbs, perl6-internals
Peter Gibbs wrote:


> (The latter currently breaks recycling of bufferlike headers)

Fixed. Thanks again for this.

See also the thread "Questions about Px registers and memory usage".

leo

Dan Sugalski

Nov 6, 2002, 12:47:58 PM
to Leopold Toetsch, Peter Gibbs, perl6-internals
At 10:03 AM +0100 11/5/02, Leopold Toetsch wrote:
>Peter Gibbs wrote:
>
>>What is the official position with respect to laundry services in the
>>Parrot memory allocation code?
>
>
>I would strongly argue for calloc()ed memory, as is done now.

This is one place where the current copying scheme is a win--we
allocate zeroed pages from the OS, which often plays games with the
MMU to do this really cheaply. That's one of the reasons lots of the
older code assumes freshly allocated memory is zeroed.
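
As a rough illustration of the point about zeroed pages: on POSIX-like systems that provide `MAP_ANONYMOUS` (an assumption; Parrot's own page allocation is not shown in this thread), freshly mapped anonymous memory reads as zero without the program ever running a memset. `fresh_pages` and `all_zero` are hypothetical helper names.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `size` bytes of fresh anonymous memory; returns NULL on failure. */
void *fresh_pages(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Return 1 if all `size` bytes starting at p are zero, else 0. */
int all_zero(const void *p, size_t size)
{
    const unsigned char *b = p;
    size_t i;

    assert(p != NULL);
    for (i = 0; i < size; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

A caller would check `all_zero(fresh_pages(n), n)` and release the mapping with `munmap`. Memory handed back by a malloc-style free list gives no such guarantee, which is the distinction drawn in the paragraphs that follow.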

I don't have a problem with guaranteeing zeroed memory, but it's not
free. If we're going with a malloc-style allocator then I explicitly
do *not* want to give guarantees of zeroed memory.

Arguably any code that assumes zeroed memory is broken, as it ought
not be using any memory it's not actually filled in data for.

>>Some code assumes that the memory returned by Parrot_allocate
>>and its cousins will be pre-washed, while other code does its own
>>laundry.
>
>
>By just deleting all this x->y = 0 and memsets all over the place we
>would save hundreds of source lines.

Buffers and PMCs should be sanitized when put on the free list. This
should be done in the DOD sweep. Extended buffer headers should be
zeroed past the end of the 'known' bits.

>
>>Making the allocation routines responsible for providing clean
>>memory simplifies things for everybody else, but with a potential
>>performance cost.
>
>
>Yes to first - but I don't think it's a performance penalty. Almost
>all used fields have to be cleaned first (and currently are,
>explicitly).

There's a big difference between clean memory and clean headers. The
headers we need to clean up ourselves, and there's a cost there we
pay regardless. (Though keeping it in one tight part of the DOD code
minimizes the cost somewhat)

--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                        have teddy bears and even
                                      teddy bears get drunk

Leopold Toetsch

Nov 6, 2002, 2:58:41 PM
to Dan Sugalski, Peter Gibbs, perl6-internals
Dan Sugalski wrote:

> I don't have a problem with guaranteeing zeroed memory, but it's not
> free. If we're going with a malloc-style allocator then I explicitly do
> *not* want to give guarantees of zeroed memory.


It doesn't make much sense to give Parrot_{re,}alloc different semantics
depending on the memory manager and forcing users to memset their
(maybe already zeroed) allocated memory.

If we want this, then let's have Parrot_{re,}allocate{,zeroed}.

The allocate_string variants are ok with unzeroed mem already.
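
A "zeroed" reallocate variant mainly has to deal with the newly added tail, since realloc() preserves the old contents and leaves the rest unspecified. A minimal sketch, assuming the caller tracks the old size; `reallocate_zeroed` is a hypothetical name and signature, not one of the Parrot functions:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Resize a block and zero any bytes that were added by the resize.
 * The caller passes old_size because realloc() does not expose it.
 * Returns NULL on failure, in which case the old block is untouched. */
void *reallocate_zeroed(void *mem, size_t old_size, size_t new_size)
{
    char *p;

    assert(new_size > 0);
    p = realloc(mem, new_size);
    if (p != NULL && new_size > old_size)
        memset(p + old_size, 0, new_size - old_size);
    return p;
}
```

Only the grown part is washed, so a caller that does not care (such as string code that overwrites the tail anyway) can keep using the plain variant and skip that cost.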


> Arguably any code that assumes zeroed memory is broken, as it ought not
> be using any memory it's not actually filled in data for.


When implementing list.c I looked at _allocate, saw calloc, and concluded
that it's OK to assume mem is zeroed - so ...

> ... Extended buffer headers should be
> zeroed past the end of the 'known' bits.


Done already, thanks to Peter's hint. PMCs were ok.


leo

Dan Sugalski

Nov 6, 2002, 3:59:07 PM
to Leopold Toetsch, Peter Gibbs, perl6-internals
At 8:58 PM +0100 11/6/02, Leopold Toetsch wrote:
>Dan Sugalski wrote:
>
>>I don't have a problem with guaranteeing zeroed memory, but it's
>>not free. If we're going with a malloc-style allocator then I
>>explicitly do *not* want to give guarantees of zeroed memory.
>
>
>It doesn't make much sense to give Parrot_{re,}alloc different
>semantics depending on the memory manager and forcing users to
>memset their (maybe already zeroed) allocated memory.

Nope, you're right.

>If we want this, then let's have Parrot_{re,}allocate{,zeroed}.
>The allocate_string variants are ok with unzeroed mem already.

Which was my thought here. Things that care can ask for zeroed
memory, which they may get anyway. (Or we may keep a big pool of
zeroed memory around as it's cheaper to clean large blocks or
something)

>>... Extended buffer headers should be zeroed past the end of the
>>'known' bits.
>
>
>Done already, thanks to Peter's hint. PMCs were ok.

Cool, thanks.

Leopold Toetsch

Nov 7, 2002, 3:45:20 AM
to Dan Sugalski, Peter Gibbs, perl6-internals
Dan Sugalski wrote:

> At 8:58 PM +0100 11/6/02, Leopold Toetsch wrote:


>> If we want this, then let's have Parrot_{re,}allocate{,zeroed}.
>> The allocate_string variants are ok with unzeroed mem already.

> Which was my thought here. Things that care can ask for zeroed memory,
> which they may get anyway. (Or we may keep a big pool of zeroed memory
> around as it's cheaper to clean large blocks or something)


Appended is a test program that shows timings (i386 w rdtsc) and the
limit where malloc changes strategy to use mmap and returns zeroed memory.
The program does 3 allocations, clobbers and frees mem between the 2nd and
3rd, and tests if the memory from the 3rd malloc is zeroed. It includes
timing for memset as well.

A typical snippet for output is:

    size    1.mal    2.mal    3.mal   memset clean
   16384    13297     9338      207    40246 **
   32768    13310     9589      207    83827 **
   65536    14164    10208      209   193610 **
  131072    19951    12077     4085        0 0

So chunks with size >= 131072 are already zeroed due to mmap. With
malloc.c this limit defaults to twice that size, but is configurable.

Parrot_allocate is currently allocating chunks of 32768 bytes, where
zeroing mem is already expensive compared to allocating mmaped chunks.

Maybe it's possible to turn this into a test that detects whether there is
such a limit above which cleaned mmaped mem is returned, and where it lies.
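
Such a probe might look roughly like the sketch below. It is an assumption-laden illustration, not part of the posted test program: `recycled_block_is_zeroed` and `find_zeroed_threshold` are hypothetical names, only powers of two are tried, and the answer describes the malloc in use at the moment of the call, nothing more.

```c
#include <assert.h>
#include <stdlib.h>

/* Return 1 if a block of `size` bytes comes back all-zero right after a
 * block of the same size was dirtied and freed. Accesses go through a
 * volatile pointer so the compiler cannot drop the stores before free()
 * or guess the contents of fresh memory. */
int recycled_block_is_zeroed(size_t size)
{
    volatile unsigned char *p;
    size_t i;
    int clean = 1;

    assert(size > 0);
    p = malloc(size);
    if (p == NULL)
        return 0;
    for (i = 0; i < size; i++)
        p[i] = 0xFF;                /* dirty every byte */
    free((void *)p);

    p = malloc(size);
    if (p == NULL)
        return 0;
    for (i = 0; i < size; i++)
        if (p[i] != 0) {
            clean = 0;
            break;
        }
    free((void *)p);
    return clean;
}

/* Smallest size in min_size, 2*min_size, 4*min_size, ... <= max_size for
 * which recycled memory was observed to be zeroed, or 0 if none was. */
size_t find_zeroed_threshold(size_t min_size, size_t max_size)
{
    size_t size;

    for (size = min_size; size != 0 && size <= max_size; size <<= 1)
        if (recycled_block_is_zeroed(size))
            return size;
    return 0;
}
```

Starting the search at a page-sized `min_size` rather than 1 avoids tiny blocks that merely look clean; a call such as `find_zeroed_threshold(4096, 1 << 23)` covers the upper range of the table above.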

leo

Leopold Toetsch

Nov 7, 2002, 4:47:13 AM
to Leopold Toetsch, Dan Sugalski, Peter Gibbs, perl6-internals
Leopold Toetsch wrote:
> Appended is a test program

Arg, damned Mozilla, shows attachment and doesn't include it

/* test program for malloc */
/* run program with
 *   cc -o chkm -Wall chkm.c -O3 && ./chkm
 *   cc -o chkm -Wall chkm.c malloc.c -O3 && ./chkm
 *
 * the timing macro needs adjustment for !i386
 */

#include <stdio.h>
#include <stdlib.h>     /* malloc, free */
#include <string.h>     /* memset */
#include <stdarg.h>

/* read the low 32 bits of the CPU time stamp counter; edx is clobbered */
#define rdtscl(low) \
    __asm__ __volatile__ ("rdtsc" : "=a" (low) : : "edx")


/* stub, presumably so that malloc.c links without the rest of Parrot */
int
PIO_eprintf(void *i, const char *s, ...) {
    va_list args;
    int ret;

    (void)i;
    va_start(args, s);
    ret = vfprintf(stderr, s, args);
    va_end(args);
    return ret;
}

int main(int argc, char *argv[])
{
    unsigned long a, b, c, d, e, f, g;
    char *buf, *buf2;
    size_t size = 1;
    int i, j, u;

    (void)argc;
    (void)argv;

    printf("    size    1.mal    2.mal    3.mal   memset clean\n");
    for (i = 0; i < 24; i++) {
        rdtscl(a);
        buf = malloc(size);
        rdtscl(b);
        buf2 = malloc(size);    /* second block of the same size */
        rdtscl(c);
        if (buf == NULL || buf2 == NULL)
            return 1;
        /* clobber the first i bytes (i < size always holds), then free */
        for (j = 0; j < i; j++)
            buf[j] = j & 0xff;
        free(buf);
        rdtscl(d);
        buf = malloc(size);     /* third allocation: is it zeroed? */
        rdtscl(e);
        if (buf == NULL)
            return 1;
        u = 0;
        for (j = 0; j < i; j++)
            if (buf[j] != 0) {
                u = 1;
                break;
            }
        /* time memset only when the block came back dirty */
        f = g = 0;
        if (u) {
            rdtscl(f);
            memset(buf, 0, size);
            rdtscl(g);
        }

        printf("%8lu %8lu %8lu %8lu %8lu %s\n",
               (unsigned long)size, b-a, c-b, e-d, g-f, u ? "**" : "0");
        size <<= 1;
    }
    return 0;
}
/*
 * Local variables:
 * c-indentation-style: bsd
 * c-basic-offset: 4
 * indent-tabs-mode: nil
 * End:
 *
 * vim: expandtab shiftwidth=4:
 */
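
For machines without rdtsc, one possible adjustment of the timing is a monotonic clock instead of the cycle counter. The sketch below assumes a POSIX system with `clock_gettime`; it reports nanoseconds rather than cycles, so its numbers are not comparable with the tables in this thread. `now_ns` and `time_malloc_ns` are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* exposes clock_gettime in strict modes */
#endif
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Monotonic timestamp in nanoseconds. */
long long now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Time a single malloc(size) call. Returns the elapsed nanoseconds,
 * or -1 if the allocation failed. */
long long time_malloc_ns(size_t size)
{
    long long t0, t1;
    void *p;

    assert(size > 0);
    t0 = now_ns();
    p = malloc(size);
    t1 = now_ns();
    if (p == NULL)
        return -1;
    *(volatile char *)p = 1;    /* touch the block so the call is not optimised away */
    free(p);
    return t1 - t0;
}
```

The clock call itself has overhead that can dominate for small allocations, so averaging over many repetitions would be needed before reading much into the values.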


Aldo Calpini

Nov 7, 2002, 6:45:47 AM
to Leopold Toetsch, Dan Sugalski, Peter Gibbs, perl6-internals
Leopold Toetsch wrote:
> Appended is a test program that shows timings (i386 w rdtsc) and
> the limit, where malloc changes strategy to use mmap and returns
> zeroed memory.

I don't know if it helps, but here are the results on my machine,
using Windows XP Pro and Cygwin 1.3.10 and GCC 2.95.3:

# gcc -o chkm chkm.c -O3 && ./chkm


    size    1.mal    2.mal    3.mal     memset clean
       1     8500     1188      932          0 0
       2     8908      932      892        664 **
       4     6828      904      928        276 **
       8     6828      904      936        276 **
      16    10520      968      928        612 **
      32     8796      908      912        840 **
      64     8092      988      916        892 **
     128     7584     1032      936        932 **
     256     6328     1052      936       1088 **
     512     6952     1236      940       1096 **
    1024    81784     1408      944       1912 **
    2048    55984     1108      944       3940 **
    4096    53812    13592     1044       7928 **
    8192    51976    15932      980      28880 **
   16384    56988    15520      996      70516 **
   32768    60780    16584      952     159332 **
   65536    60876    16580     1104     356616 **
  131072    62492    16440      964     748300 **
  262144    68668    17628      996    1545984 **
  524288    69152    17604      924    3272504 **
 1048576    74324    38948      992    6678812 **
 2097152    86604    38012     1052   14211056 **
 4194304   106172    40908     1060  458976804 **
 8388608   147276    74432     1144   56008064 **

# gcc -o chkm chkm.c malloc.c -O3 && ./chkm


    size    1.mal    2.mal    3.mal     memset clean
       1     7104      300      204          0 0
       2     3964      188      180          0 0
       4     3304      188      180          0 0
       8     3264      256      172          0 0
      16     6508      164      180          0 0
      32     2112      388      168       1500 **
      64     3496      260      192        904 **
     128     3268      208      296       1304 **
     256     3276      392      272       1232 **
     512     1924      492      196       1344 **
    1024     2664    78572      268       1676 **
    2048     3036    31248      116       3420 **
    4096    63312    14524      144       6588 **
    8192    54508    15632      116      26584 **
   16384    52724    14888      172      73844 **
   32768    58336    14632      164     165624 **
   65536    57680    14344      164     421396 **
  131072    58164    15748      236     757964 **
  262144   249528    49220    49836          0 0
  524288   160816    72180    53728          0 0
 1048576   173212    71464    66964          0 0
 2097152   214488    95112    98340          0 0
 4194304   296548   171728   161160          0 0
 8388608   405532   283444   291892          0 0

to compile with Visual C++ I had to change the rdtscl macro to:

#define rdtscl(low) \
{ \
_asm rdtsc \
_asm mov dword ptr [low], eax \
}

the low word seems to be in eax instead of edx, don't know why.
and these are the results:

# cl -nologo -o chkm_vc.exe chkm_vc.c && chkm_vc
chkm_vc.c


    size    1.mal    2.mal    3.mal     memset clean
       1     2288      668      352          0 0
       2     5068      456      348        176 **
       4     2040      552      304        260 **
       8     1936      380      280        172 **
      16     2864      368      296        172 **
      32     1408      756      328        172 **
      64     1360      408      308        188 **
     128     1260      500      376        228 **
     256     1276      828      396        292 **
     512     1424      400      304        568 **
    1024     3936     1080      420       1168 **
    2048    39664      432      388       4492 **
    4096    49412    12760      420       7684 **
    8192    42512    15484      524      15476 **
   16384    67436    26780      544      75100 **
   32768    43680    25584      532     167956 **
   65536    46392    26052   125328     359224 **
  131072    57456    26324    27228     788760 **
  262144    52196   116656    26800    1598200 **
  524288    58720    29900    17500          0 0
 1048576    26504    16048    15168          0 0
 2097152    38564    27296    15904          0 0
 4194304    34008    27608    15804          0 0
 8388608    40480    24464    25492          0 0

the results are apparently the same with -Ox (maximum opts).

cheers,
Aldo

__END__
$_=q,just perl,,s, , another ,,s,$, hacker,,print;

Leopold Toetsch

Nov 9, 2002, 4:09:43 AM
to Dan Sugalski, perl6-internals
Dan Sugalski wrote:

> At 8:58 PM +0100 11/6/02, Leopold Toetsch wrote:

>> If we want this, then let's have Parrot_{re,}allocate{,zeroed}.
>> The allocate_string variants are ok with unzeroed mem already.

> Which was my thought here. Things that care can ask for zeroed memory,
> which they may get anyway. (Or we may keep a big pool of zeroed memory
> around as it's cheaper to clean large blocks or something)

This patch implements the first step of mem alloc cleanup:

- NEW mem_sys_allocate_zeroed (calloc)
- mem_sys_allocate uses malloc now
- NEW Parrot_allocate_zeroed
- Parrot_allocate uses malloc in res_lea now

implying the following rules:

- Buffer headers and such are guaranteed to be zeroed.
- Mem allocated via Parrot_allocate may or may not be zeroed, depending on
  the allocator scheme.
- The _allocate_string variants never guarantee to deliver zeroed memory.

The next step is to go through the files, look where allocating zeroed
memory looks cheaper, and use mem_sys_allocate_zeroed there.
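
The split described in this message can be pictured with a pair of thin wrappers. This is a hedged sketch only: `sketch_allocate`, `sketch_allocate_zeroed`, `copy_string`, `Slot` and `new_slot_table` are hypothetical names with simplified signatures, and the patch itself is not reproduced in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* malloc-backed: the contents of the returned block are unspecified. */
void *sketch_allocate(size_t size)
{
    assert(size > 0);
    return malloc(size);
}

/* calloc-backed: every byte of the returned block is zero. */
void *sketch_allocate_zeroed(size_t size)
{
    assert(size > 0);
    return calloc(1, size);
}

/* A caller that overwrites every byte it asked for gains nothing from
 * zeroing, so it uses the plain variant. */
char *copy_string(const char *src)
{
    size_t len = strlen(src) + 1;
    char *dst = sketch_allocate(len);

    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}

/* A caller that treats 0/NULL as "empty" asks for zeroed memory
 * instead of clearing each field by hand. */
typedef struct Slot {
    void  *value;
    size_t used;
} Slot;

Slot *new_slot_table(size_t n)
{
    return sketch_allocate_zeroed(n * sizeof(Slot));
}
```

Choosing per call site keeps the zeroing cost where it buys something and leaves string-style callers on the cheaper path, which matches the rules listed above.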

leo
