VM/kswapd with 4GB

Eckhard Neber

Dec 5, 2002, 10:57:16 AM

Hi,

We encountered a problem with the swapping/paging mechanism when using
4GB of main memory.
If we use tar to copy a large number of files over the network, the
available memory is used for caching and free memory drops to
10 MB. Once this limit is reached, the 'kswapd' and 'kinoded' daemons
consume most of the CPU time, even though little or no memory is
swapped out.
We found the same behaviour with other utilities, e.g. when the Tivoli
backup software is transferring many files. With swap turned off,
some processes were killed (the usual behaviour when the kernel runs out
of memory), even though nearly all main memory was used for caching.
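
The symptom described above (very little free memory, a large cache, and almost no swap in use) can be watched while the copy runs. The following is a minimal sketch, assuming the kernel exposes "Key: value kB" lines in /proc/meminfo; the names parse_meminfo and summarize and the sample figures are hypothetical and were not measured on the affected machine.

```python
import re

# Hypothetical sample in the "Key: value kB" style of /proc/meminfo.
# The figures are made up for illustration, not measured values.
SAMPLE = """\
MemTotal:      4194304 kB
MemFree:         10240 kB
Buffers:        102400 kB
Cached:        3670016 kB
SwapTotal:    10485760 kB
SwapFree:     10485760 kB
"""

LINE_RE = re.compile(r"^(\w+):\s+(\d+)\s+kB\s*$")


def parse_meminfo(text):
    """Return a dict mapping field name to size in kB; skip other lines."""
    fields = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def summarize(fields):
    """Return (free_mb, cache_mb, swap_used_mb) as integers."""
    free_mb = fields["MemFree"] // 1024
    cache_mb = (fields["Buffers"] + fields["Cached"]) // 1024
    swap_used_mb = (fields["SwapTotal"] - fields["SwapFree"]) // 1024
    return free_mb, cache_mb, swap_used_mb


if __name__ == "__main__":
    # On a live system, replace SAMPLE with open("/proc/meminfo").read().
    free_mb, cache_mb, swap_used_mb = summarize(parse_meminfo(SAMPLE))
    print("free %d MB, cache %d MB, swap used %d MB"
          % (free_mb, cache_mb, swap_used_mb))
```

Lines that do not match the "Key: value kB" pattern are skipped, so any header or summary lines in the file are ignored rather than causing an error.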

We did some tests which can be summarized as follows:
1. Using only 1GB (/etc/lilo.conf entry 'append="mem=0x40000000"')
or 2GB memory, these problems don't occur.
2. Using kernel 2.4.20: Same problems.
3. tar onto an ext2 partition instead of reiserfs: Same problems.
4. Copying on the local machine: Same problems.
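
The 'mem=' value in test 1 is the memory limit in bytes, written in hexadecimal. A minimal sketch of that arithmetic follows; the helper names mem_param and lilo_append are hypothetical, and the 2GB value is computed here rather than quoted from our configuration.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def mem_param(gib):
    """Return a 'mem=' boot argument limiting RAM to `gib` GiB, in hex bytes."""
    return "mem=0x%x" % (gib * GIB)


def lilo_append(gib):
    """Return a lilo.conf append line in the style used in test 1."""
    return 'append="%s"' % mem_param(gib)


if __name__ == "__main__":
    for gib in (1, 2):
        print(lilo_append(gib))
```

For 1GB this yields the same 'append="mem=0x40000000"' entry as in test 1.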

We've already searched on Google and found some similar descriptions,
but no solution.
Any helpful hint will be appreciated.

Hardware:
Supermicro P4DMS-6GM motherboard, Intel 7500 chipset, 2x Xeon 2.4GHz,
4x1GB DDR-RAM, 100 Mbit e100 LAN on board.

Software:
Linux kernel 2.4.19 with 4GB support enabled, patches of lvm-1.0.5 and
reiserfs (quota). Reiserfs, 5 swap partitions of 2GB

Thanks,
Eckhard

Paul Lutus

Dec 5, 2002, 12:46:04 PM

On Thu, 05 Dec 2002 16:57:16 +0100, Eckhard Neber wrote:

/ ...

> Software:
> Linux kernel 2.4.19 with 4GB support enabled, patches of lvm-1.0.5 and
> reiserfs (quota). Reiserfs, 5 swap partitions of 2GB

Just a suggestion for an experiment. Try *one* swap partition that is
twice the size of physical RAM. Your system may have too many choices
available to be able to make them quickly.

> Thanks,
> Eckhard

--
Paul Lutus
www.arachnoid.com

F. Michael Orr

Dec 6, 2002, 11:24:32 AM

"Eckhard Neber" <eckhar...@web.de> wrote in message
news:3def775c$1...@news.uni-ulm.de...

This has been a problem Red Hat has been working on for us for quite some
time, because it has been killing our high-end server machines. They have
told us it was related to Advanced Server 2.1 and the 2.4.9 kernel. In our
shop, this seemed to be the case, because we have not seen this behaviour on
any of our RH7.1, RH7.2, or RH7.3 systems; only our RHAS 2.1 systems. I was
told it was related to the VMM data structures, and that on large memory
machines it spends so much time in memory management routines that it does
not have time for anything else. This is the first time I have heard of it
occurring on a later version of the kernel, or on a 2 CPU system. If it is
an option for you, reduce the memory visible to the OS via the 'mem=' kernel
boot parameter as a workaround. I am told that a fix will be available
'real soon now' from Red Hat (if that is your distribution; SuSE supposedly
already has one).

Eckhard Neber

Dec 7, 2002, 8:03:28 AM

>>Software:
>>Linux kernel 2.4.19 with 4GB support enabled, patches of lvm-1.0.5 and
>>reiserfs (quota). Reiserfs, 5 swap partitions of 2GB
>
>
> Just a suggestion for an experiment. Try *one* swap partition that is
> twice the size of physical RAM. Your system may have too many choices
> available to be able to make them quickly.

Thanks, but it occurred without swap as well.

Eckhard

Eckhard Neber

Dec 7, 2002, 10:39:30 AM

F. Michael Orr wrote:
>
> This has been a problem Red Hat has been working on for us for quite some
> time, because it has been killing our high-end server machines. They have
> told us it was related to Advanced Server 2.1 and the 2.4.9 kernel. In our
> shop, this seemed to be the case, because we have not seen this behaviour on
> any of our RH7.1, RH7.2, or RH7.3 systems; only our RHAS 2.1 systems. I was
> told it was related to the VMM data structures, and that on large memory
> machines it spends so much time in memory management routines that it does
> not have time for anything else. This is the first time I have heard of it
> occurring on a later version of the kernel, or on a 2 CPU system. If it is
> an option for you, reduce the memory visible to the OS via the 'mem=' kernel
> boot parameter as a workaround. I am told that a fix will be available
> 'real soon now' from Red Hat (if that is your distribution; SuSE supposedly
> already has one).

Thanks, this was the right hint. We had been using a vanilla kernel with
only a few patches.
We have now switched to the 2.4.19 kernel from SuSE 8.1 and it works!
Thanks a lot!

Eckhard

Paul Lutus

Dec 7, 2002, 2:18:55 PM

On Sat, 07 Dec 2002 14:03:28 +0100, Eckhard Neber wrote:

/ ...

>> Just a suggestion for an experiment. Try *one* swap partition that is
>> twice the size of physical RAM. Your system may have too many choices
>> available to be able to make them quickly.
>
> Thanks, but it occurred without swap as well.

If it occurred without swap, then swap is not the problem, but likely the
virtual memory algorithm is trying to do something fatally complex in the
name of optimization.

--
Paul Lutus
www.arachnoid.com