
Leaks in mmap address space: 2.6.11.4


Wolfgang Wander

Apr 15, 2005, 12:50:10 PM
Hi,

we are running some pretty large applications in 32-bit mode on 64-bit
AMD kernels (8 GB RAM, dual AMD64 CPUs, SMP). The kernels are 2.6.11.4
and 2.4.21.

Some of these applications consistently run out of memory, but only
on the 2.6 machines. Large memory allocations that libc satisfies
with private mmaps seem to contribute to the problem: 2.4 kernels
are able to coalesce these mmaps into large chunks, whereas 2.6
produces a rather fragmented /proc/self/maps table.
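
As background, glibc serves allocations above its mmap threshold
(128kB by default in the glibc of that era) with private anonymous
mmaps instead of growing the brk heap. A minimal illustration, not
part of our test case, with arbitrary sizes:

----------------------------------------------------------------------
#include <malloc.h>
#include <stdlib.h>

int main() {
  /* make the cutoff explicit; 128kB is the glibc default anyway */
  mallopt(M_MMAP_THRESHOLD, 128*1024);
  char *small = malloc(4096);     /* taken from the brk heap */
  char *big   = malloc(9500000);  /* served by a private anonymous mmap */
  free(big);                      /* returned to the kernel via munmap */
  free(small);
  return 0;
}
----------------------------------------------------------------------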

The following C++ program reproduces the error (compiled statically
on a 32bit machine to get exactly the same executable for 2.4 and
2.6 environments):

----------------------------------------------------------------------
#include <iostream>
#include <vector>
#include <unistd.h>
#include <fstream>
#include <string>
#include <iterator>
#include <cstdlib>   // for rand()

void
aLlocator()
{
  const int bsz = 160;      // big blocks kept live
  const int ssz = 1000000;  // small blocks kept live
  std::vector<char*> svec(ssz);
  std::vector<char*> bvec(bsz);
  std::fill( bvec.begin(), bvec.end(), (char*)0);
  std::fill( svec.begin(), svec.end(), (char*)0);
  for( unsigned i = 0, j = 0;
       i < 1843750 /* for our setup we crash in 2.6 if we
                      iterate one more time, YMMV */
       ; ++i ) {
    unsigned oidx;
    unsigned kidx;
    if( i % (ssz/bsz/2) == 0 ) {
      // every ssz/bsz/2 rounds: allocate one big block, free another
      kidx = j % bsz;
      oidx = (j+bsz/10) % bsz;
      bvec[kidx] = new char[ 9500000 ]; // served via private mmap
      delete [] bvec[oidx];
      bvec[oidx] = (char*)0;
      ++j;
    }
    // on every round, replace one random small block
    kidx = rand() % ssz;
    delete [] svec[kidx];
    svec[kidx] = new char[kidx%3500+1]; // served mostly via brk
  }
  std::ifstream ifs("/proc/self/maps");
  std::string line;
  while( std::getline(ifs, line))
    std::cout << line << std::endl;
}

int main() {
  aLlocator();
}

----------------------------------------------------------------------

The final output of this program is a very large and fragmented
/proc/self/maps table on 2.6 and a fairly small one on 2.4:

2.4:

08048000-080bd000 r-xp 00000000 00:82 16643931 /tmp/leak-me2
080bd000-080c9000 rwxp 00074000 00:82 16643931 /tmp/leak-me2
080c9000-55542000 rwxp 00000000 00:00 0
55555000-56227000 rwxp 00000000 00:00 0
56236000-58e77000 rwxp 00ce1000 00:00 0
58f87000-5b0b7000 rwxp 03a32000 00:00 0
5b9c7000-a54e7000 rwxp 06472000 00:00 0
a58f7000-a9457000 rwxp 503a2000 00:00 0
a9667000-b3077000 rwxp 54112000 00:00 0
ffffd000-ffffe000 rwxp 00000000 00:00 0

2.6:

08048000-080bd000 r-xp 00000000 00:5a 16643931 /tmp/leak-me2
080bd000-080c9000 rwxp 00074000 00:5a 16643931 /tmp/leak-me2
080c9000-55542000 rwxp 080c9000 00:00 0
55555000-55b26000 rwxp 55555000 00:00 0
56236000-56338000 rwxp 56236000 00:00 0
56b48000-56c48000 rwxp 56b48000 00:00 0
57458000-57658000 rwxp 57458000 00:00 0
57d68000-57e68000 rwxp 57d68000 00:00 0
58678000-58778000 rwxp 58678000 00:00 0

[ removed 150 lines ]

b0c38000-ff7e8000 rwxp b0c38000 00:00 0
ffffc000-ffffe000 rwxp ffffc000 00:00 0
ffffe000-fffff000 ---p 00000000 00:00 0
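
To put numbers on the fragmentation instead of eyeballing the tables,
here is a small helper (mine, not part of the test programs, call it
mapstat.c) that reads saved maps output on stdin and reports the
mapping count and the largest remaining hole:

----------------------------------------------------------------------
#include <stdio.h>

/* Count the mappings in a maps listing and report the largest hole
   between them. Usage: ./mapstat < saved-maps-output */
int main() {
  unsigned long start, end, prev_end = 0, max_hole = 0;
  char line[512];
  int n = 0;

  while( fgets(line, sizeof(line), stdin) ) {
    if( sscanf(line, "%lx-%lx", &start, &end) != 2 )
      continue;
    if( prev_end && start > prev_end && start - prev_end > max_hole )
      max_hole = start - prev_end;
    prev_end = end;
    ++n;
  }
  printf("%d mappings, largest hole %lu kB\n", n, max_hole / 1024);
  return 0;
}
----------------------------------------------------------------------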

Any ideas?

Thanks!
Wolfgang

Wolfgang Wander

Apr 15, 2005, 6:00:15 PM
Here is another program that illustrates the problem, this time in
C and without going through glibc's allocation layer.

----------------------------------------------------------------------
/* run in 32-bit mode on a 64-bit kernel; >4GB of RAM is helpful */

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define bsz 600 /* number of mmaps to keep */
#define large 9500000 /* some odd large number */
#define success 1000000 /* number of iterations before we believe we are ok */

/* program fails here on a 2.6.11.4 kernel after 52K iterations
   with a fragmented /proc/self/maps; 2.4 kernels behave fine */
void
aLLocator()
{
  char* bvec[bsz];
  unsigned int i;
  memset( bvec, 0, sizeof(bvec));

  for( i = 0; i < success ; ++i ) {
    unsigned oidx;
    unsigned kidx;
    int len;
    kidx = i % bsz;
    oidx = (i+bsz/10) % bsz;
    /* unmap the block mapped bsz-bsz/10 rounds ago, recomputing
       the size with the same formula it was mapped with */
    len = (oidx & 7) ? ((oidx&7) * 1048576) : large;
    if( bvec[oidx] ) { munmap( bvec[oidx], len ); bvec[oidx] = 0; }
    /* map a new block of 1-7 MB, or `large` bytes every 8th round */
    len = (kidx & 7) ? ((kidx&7) * 1048576) : large;
    bvec[kidx] = (char*)mmap(0, len, PROT_READ|PROT_WRITE,
                             MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    if( bvec[kidx] == MAP_FAILED ) {
      printf("Failed after %d rounds\n", i);
      break;
    }
  }
}

int main() {
  FILE *f;
  int c;

  aLLocator();

  f = fopen( "/proc/self/maps", "r" );
  while( (c = fgetc(f)) != EOF )
    putchar(c);
  fclose(f);

  return 0;
}
----------------------------------------------------------------------

Wolfgang Wander writes:
> Some of these applications consistently run out of memory, but only
> on the 2.6 machines. Large memory allocations that libc satisfies
> with private mmaps seem to contribute to the problem: 2.4 kernels
> are able to coalesce these mmaps into large chunks, whereas 2.6
> produces a rather fragmented /proc/self/maps table.

Wolfgang Wander

Apr 21, 2005, 11:00:25 AM
Looks like I have to answer myself here with you guys all busy
gitting...

I posted two sample programs last week that showed that large
applications can run out of memory a lot quicker on 2.6 than on 2.4.
The reason is that the mapped address space (as seen in /proc/*/maps)
fragments a lot faster on 2.6 kernels than on 2.4.

2.4 searched for unused space starting at the mmap base and moving up
towards the stack; 2.6 starts its search at the end of the most
recently mapped area (mm->free_area_cache).

The difference between the two algorithms is obvious (apart from
efficiency, which is undoubtedly better in 2.6):

Whereas 2.4 naturally filled small holes close to the base, leaving
larger areas open towards the stack, 2.6 places small maps all over
the mappable area, carving up the potential large holes.
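
To make the effect concrete, here is a toy first-fit simulator, pure
illustration with nothing kernel-specific in it: an address space of
discrete slots, an allocation mix like the test programs above, and a
switch between scanning from the base (2.4-style) and scanning from a
free_area_cache-style hint (2.6-style):

----------------------------------------------------------------------
#include <stdio.h>
#include <string.h>

#define SLOTS 4096

static char used[SLOTS];

/* first-fit scan for a run of `len` free slots, starting at `start` */
static int find_hole(int start, int len)
{
  int i, run = 0;
  for( i = start; i < SLOTS; ++i ) {
    run = used[i] ? 0 : run + 1;
    if( run == len )
      return i - len + 1;
  }
  return -1;
}

int main() {
  int use_cache = 1; /* 1: 2.6-style cache hint, 0: 2.4-style base scan */
  int cache = 0;     /* plays the role of mm->free_area_cache */
  int i, addr, holes = 0;

  /* interleave long-lived small maps with short-lived large ones,
     roughly the pattern of the test programs above */
  for( i = 0; i < 2000; ++i ) {
    int len = (i % 8) ? 1 : 64;
    addr = find_hole(use_cache ? cache : 0, len);
    if( addr < 0 )
      addr = find_hole(0, len); /* wrap around, like full_search */
    if( addr < 0 )
      break;
    memset(used + addr, 1, len);
    cache = addr + len;
    if( len == 64 )
      memset(used + addr, 0, len); /* the large map is short-lived */
  }

  for( i = 1; i < SLOTS; ++i ) /* count free gaps after a mapping */
    if( !used[i] && used[i-1] )
      ++holes;
  printf("%s scan: %d holes\n", use_cache ? "cache" : "base", holes);
  return 0;
}
----------------------------------------------------------------------

The cache scan strands each freed large region behind the advancing
hint, while the base scan keeps the small maps packed low, which is
exactly the difference visible in the maps tables above.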

The attached patch is deliberately left as a hack so that you guys can
come up (please ;-) with a neater solution, but something ought to be
done to protect large holes from small-map clutter...

Wolfgang

PS: The Patch^H^H^H^H^H Ugly_Hack is against 2.6.11.7 and only 'fixes'
the two architectures I'm interested in (i386 and x86_64).

diff -ru linux-2.6.11.7.orig/arch/x86_64/kernel/sys_x86_64.c linux-2.6.11.7/arch/x86_64/kernel/sys_x86_64.c
--- linux-2.6.11.7.orig/arch/x86_64/kernel/sys_x86_64.c 2005-03-02 02:38:13.000000000 -0500
+++ linux-2.6.11.7/arch/x86_64/kernel/sys_x86_64.c 2005-04-21 09:27:38.000000000 -0400
@@ -112,8 +112,8 @@
 		    (!vma || addr + len <= vma->vm_start))
 			return addr;
 	}
-	addr = mm->free_area_cache;
-	if (addr < begin)
+	/* addr = mm->free_area_cache;
+	if (addr < begin) */
 		addr = begin;
 	start_addr = addr;

diff -ru linux-2.6.11.7.orig/mm/mmap.c linux-2.6.11.7/mm/mmap.c
--- linux-2.6.11.7.orig/mm/mmap.c 2005-03-02 02:38:12.000000000 -0500
+++ linux-2.6.11.7/mm/mmap.c 2005-04-21 09:32:06.000000000 -0400
@@ -1173,7 +1173,7 @@
 		    (!vma || addr + len <= vma->vm_start))
 			return addr;
 	}
-	start_addr = addr = mm->free_area_cache;
+	start_addr = addr = TASK_UNMAPPED_BASE; /* mm->free_area_cache; */
 
 full_search:
 	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
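
Both hunks do the same thing: they ignore the free_area_cache hint and
pin every search back to the start of the mmap range, restoring the
2.4 placement behaviour. The cost is rescanning already-populated
regions on every mmap, which is exactly the linear work the cache was
introduced to avoid, hence hack, not patch.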

Wolfgang Wander writes:
> Here is another program that illustrates the problem, this time in
> C and without going through glibc's allocation layer.
