page allocation failure kernel messages on storage nodes


Christian Iseli

Apr 5, 2012, 2:43:59 AM
to fhgfs...@googlegroups.com
Hi,

I have two storage nodes running
- CentOS release 6.2
- kernel 2.6.32-220.4.2.el6.x86_64
- fhgfs-storage-2011.04.r15-el6.x86_64

On both machines, I see kernel messages in /var/log/messages of the form:
Apr  4 12:20:55 fhgfs02 kernel: fhgfs-storage/W: page allocation failure. order:5, mode:0xd0
Apr  4 12:20:55 fhgfs02 kernel: Pid: 7433, comm: fhgfs-storage/W Not tainted 2.6.32-220.4.2.el6.x86_64 #1
Apr  4 12:20:55 fhgfs02 kernel: Call Trace:
[... lots of following messages]

Is this expected? Is it worrisome? Can I do something about it?

Christian Mohrbacher

Apr 5, 2012, 3:45:03 AM
to fhgfs...@googlegroups.com
Hi,
Basically, "page allocation errors" sometimes occur in the kernel, and they are not necessarily something to worry about.

In this case, the error message most likely does not come from fhgfs-storage itself, as our servers run in user space, and page allocation errors can normally only be caused by memory allocations inside the kernel. Are you running XFS as the storage filesystem for the server? I'm asking because, looking through mailing lists, page allocation errors on kernel 2.6.32 often occur when using XFS under higher loads.

One thing to try is raising the kernel parameter "min_free_kbytes". The kernel will then try to keep this amount of memory free for its own operations at all times, which reduces the risk of allocation errors at critical times. If you have less than 8GB of memory in that machine we usually recommend setting the value to 64MB, otherwise we recommend 256MB (or even more).

To set it to 256MB you can execute the following command:

echo 262144 > /proc/sys/vm/min_free_kbytes

This change will be gone after a reboot. To make it persistent, you can add this line to /etc/sysctl.conf:

vm.min_free_kbytes=262144
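
The parameter is given in kilobytes, so the value is simply the size in MB times 1024. A minimal sketch of that arithmetic follows; the helper name mb_to_kb is made up for illustration and is not part of fhgfs or any system tool:

```shell
#!/bin/sh
# Hypothetical helper: convert a size in megabytes to the kilobyte
# value that vm.min_free_kbytes expects.
mb_to_kb() {
    echo $(( $1 * 1024 ))
}

# Print the sysctl.conf lines for the two sizes mentioned above.
echo "vm.min_free_kbytes=$(mb_to_kb 64)"     # less than 8GB of memory
echo "vm.min_free_kbytes=$(mb_to_kb 256)"    # 8GB of memory or more
```

After editing /etc/sysctl.conf, running "sysctl -p" as root loads the file without a reboot.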


If you still get errors afterwards you can send us your /var/log/messages and the log file of the storage server. Then we can see if we find anything else.

Regards,
Christian Mohrbacher
Fraunhofer

Bernd Schubert

Apr 5, 2012, 4:20:47 AM
to fhgfs...@googlegroups.com, Christian Iseli
On 04/05/2012 08:43 AM, Christian Iseli wrote:

Out of interest, could you please post a few complete traces? I would
like to know exactly where this came from. Usually drivers request
physically contiguous pages (order 5 = 2^5 = 32 pages, i.e. 128kB with 4kB pages, in your case).
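
The "order" in the log message is a power-of-two page count. As a rough sketch of the arithmetic, assuming the usual 4kB base page size on x86_64 (the function name order_to_kb is made up for illustration):

```shell
#!/bin/sh
# Size in kB of a single allocation of the given order.
# Assumption: 4kB base pages; a different page size can be passed as $2.
order_to_kb() {
    order=$1
    page_kb=${2:-4}
    echo $(( (1 << order) * page_kb ))
}

echo "order 5 = $(( 1 << 5 )) pages = $(order_to_kb 5) kB"
```

This matches the per-order columns in the Mem-Info output, where the sixth column (order 5) is labelled 128kB.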

Other than that, please follow Christian's advice.
Also, did you disable transparent huge pages, which are known to have
bugs in RHEL kernels?

http://www.fhgfs.com/wiki/wikka.php?wakka=ServerTuning
(section "Huge Pages").

Cheers,
Bernd

Christian Iseli

Apr 5, 2012, 8:58:47 AM
to fhgfs...@googlegroups.com
Hi Christian,


On Thursday, April 5, 2012 9:45:03 AM UTC+2, Christian Mohrbacher wrote:
> Are you running XFS as storage filesystem for the server?

Yes, indeed.

> If you have less than 8GB of memory in that machine we usually recommend setting the value to 64MB, otherwise we recommend 256MB (or even more).

This is what I currently have in /etc/rc.local (following the recommendations in the fhgfs documentation, if I remember correctly):
# vm things
echo 5 > /proc/sys/vm/dirty_background_ratio
echo 10 > /proc/sys/vm/dirty_ratio
echo 131072 > /proc/sys/vm/min_free_kbytes

# RedHat special
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
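
To check that these settings are actually in effect, something along the following lines could be used. It is only a sketch: it assumes the sysfs file marks the active choice with square brackets (for example "always [never]"), and the helper name active_choice is made up for illustration:

```shell
#!/bin/sh
# Print the active choice from a sysfs "multiple choice" line read on stdin.
# Assumption: the kernel marks the current value with square brackets.
active_choice() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Path taken from the rc.local lines above (RHEL/CentOS 6 specific).
thp=/sys/kernel/mm/redhat_transparent_hugepage/enabled
if [ -r "$thp" ]; then
    echo "transparent hugepages: $(active_choice < "$thp")"
fi

# Show the current values of the three vm settings, where available.
for name in dirty_background_ratio dirty_ratio min_free_kbytes; do
    f=/proc/sys/vm/$name
    if [ -r "$f" ]; then
        echo "vm.$name = $(cat "$f")"
    fi
done
```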

The machine has exactly 8GB of RAM, so I suppose I should increase it to 256MB in this case?

Christian Iseli

Apr 5, 2012, 9:06:03 AM
to fhgfs...@googlegroups.com, Christian Iseli
Hi,

Thanks for all the replies :-)


On Thursday, April 5, 2012 10:20:47 AM UTC+2, Bernd Schubert wrote:

> Out of interest, could you please post a few complete traces? I would
> like to know exactly where this came from. Usually drivers request
> physically contiguous pages (order 5 = 2^5 = 32 pages, i.e. 128kB with 4kB pages, in your case).

Sure:

Apr  4 12:20:55 fhgfs02 kernel: fhgfs-storage/W: page allocation failure. order:5, mode:0xd0
Apr  4 12:20:55 fhgfs02 kernel: Pid: 7433, comm: fhgfs-storage/W Not tainted 2.6.32-220.4.2.el6.x86_64 #1
Apr  4 12:20:55 fhgfs02 kernel: Call Trace:
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff81123daf>] ? __alloc_pages_nodemask+0x77f/0x940
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115dc62>] ? kmem_getpages+0x62/0x170
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115e87a>] ? fallback_alloc+0x1ba/0x270
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115e2cf>] ? cache_grow+0x2cf/0x320
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115e5f9>] ? ____cache_alloc_node+0x99/0x160
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180

Apr  4 12:20:55 fhgfs02 kernel: fhgfs-storage/W: page allocation failure. order:5, mode:0xd0
Apr  4 12:20:55 fhgfs02 kernel: Pid: 7440, comm: fhgfs-storage/W Not tainted 2.6.32-220.4.2.el6.x86_64 #1

Apr  4 12:20:55 fhgfs02 kernel: Call Trace:
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff81123daf>] ? __alloc_pages_nodemask+0x77f/0x940
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115dc62>] ? kmem_getpages+0x62/0x170
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115e87a>] ? fallback_alloc+0x1ba/0x270
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115e2cf>] ? cache_grow+0x2cf/0x320
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115e5f9>] ? ____cache_alloc_node+0x99/0x160
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115f4bf>] ? kmem_cache_alloc_node_notrace+0x6f/0x130
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115f6fb>] ? __kmalloc_node+0x7b/0x100
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8146f341>] ? sk_stream_alloc_skb+0x41/0x110
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8146f760>] ? tcp_sendmsg+0x350/0xa10
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff811845d4>] ? follow_managed+0x184/0x370
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff81419a0a>] ? sock_sendmsg+0x11a/0x150
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff81183795>] ? path_to_nameidata+0x45/0x60
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffffa049a2a1>] ? xfs_icsb_sync_counters_locked+0x61/0x80 [xfs]
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffffa049a2f9>] ? xfs_icsb_sync_counters+0x39/0x50 [xfs]
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff811a7034>] ? statfs_by_dentry+0x74/0xa0
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff811a716b>] ? vfs_statfs+0x1b/0xb0
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8141a329>] ? sys_sendto+0x139/0x190
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff811a75d1>] ? sys_statfs+0x81/0xb0
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Apr  4 12:20:55 fhgfs02 kernel: Mem-Info:
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA per-cpu:
Apr  4 12:20:55 fhgfs02 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    4: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    5: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    6: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    7: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA32 per-cpu:
Apr  4 12:20:55 fhgfs02 kernel: CPU    0: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    1: hi:  186, btch:  31 usd:  35
Apr  4 12:20:55 fhgfs02 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: Node 0 Normal per-cpu:
Apr  4 12:20:55 fhgfs02 kernel: CPU    0: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    1: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: active_anon:440115 inactive_anon:74690 isolated_anon:0
Apr  4 12:20:55 fhgfs02 kernel: active_file:1052482 inactive_file:249682 isolated_file:0
Apr  4 12:20:55 fhgfs02 kernel: unevictable:1250 dirty:62697 writeback:4136 unstable:0
Apr  4 12:20:55 fhgfs02 kernel: free:49996 slab_reclaimable:30374 slab_unreclaimable:51020
Apr  4 12:20:55 fhgfs02 kernel: mapped:2074 shmem:203 pagetables:2583 bounce:0
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA free:15720kB min:240kB low:300kB high:360kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15332kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Apr  4 12:20:55 fhgfs02 kernel: lowmem_reserve[]: 0 3253 8050 8050
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA32 free:93840kB min:52864kB low:66080kB high:79296kB active_anon:145632kB inactive_anon:29240kB active_file:2008116kB inactive_file:676588kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3331420kB mlocked:0kB dirty:221604kB writeback:11280kB mapped:20kB shmem:0kB slab_reclaimable:48088kB slab_unreclaimable:24764kB kernel_stack:112kB pagetables:832kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Apr  4 12:20:55 fhgfs02 kernel: lowmem_reserve[]: 0 0 4797 4797
Apr  4 12:20:55 fhgfs02 kernel: Node 0 Normal free:90424kB min:77960kB low:97448kB high:116940kB active_anon:1614828kB inactive_anon:269520kB active_file:2201812kB inactive_file:322140kB unevictable:5000kB isolated(anon):0kB isolated(file):0kB present:4912636kB mlocked:5000kB dirty:29184kB writeback:5264kB mapped:8276kB shmem:812kB slab_reclaimable:73408kB slab_unreclaimable:179316kB kernel_stack:3016kB pagetables:9500kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Apr  4 12:20:55 fhgfs02 kernel: lowmem_reserve[]: 0 0 0 0
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA: 2*4kB 2*8kB 1*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15720kB
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA32: 19771*4kB 511*8kB 118*16kB 46*32kB 31*64kB 5*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 95044kB
Apr  4 12:20:55 fhgfs02 kernel: Node 0 Normal: 21696*4kB 149*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 92072kB
Apr  4 12:20:55 fhgfs02 kernel: 1403768 total pagecache pages
Apr  4 12:20:55 fhgfs02 kernel: 100263 pages in swap cache
Apr  4 12:20:55 fhgfs02 kernel: Swap cache stats: add 158365, delete 58102, find 283089/283491
Apr  4 12:20:55 fhgfs02 kernel: Free swap  = 7752216kB
Apr  4 12:20:55 fhgfs02 kernel: Total swap = 8187896kB
Apr  4 12:20:55 fhgfs02 kernel: 2097150 pages RAM
Apr  4 12:20:55 fhgfs02 kernel: 83747 pages reserved
Apr  4 12:20:55 fhgfs02 kernel: 664291 pages shared
Apr  4 12:20:55 fhgfs02 kernel: 1313693 pages non-shared
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115f4bf>] ? kmem_cache_alloc_node_notrace+0x6f/0x130
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8115f6fb>] ? __kmalloc_node+0x7b/0x100
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8146f341>] ? sk_stream_alloc_skb+0x41/0x110
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8146f760>] ? tcp_sendmsg+0x350/0xa10
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff811845d4>] ? follow_managed+0x184/0x370
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff81419a0a>] ? sock_sendmsg+0x11a/0x150
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff81183795>] ? path_to_nameidata+0x45/0x60
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffffa049a2a1>] ? xfs_icsb_sync_counters_locked+0x61/0x80 [xfs]
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffffa049a2f9>] ? xfs_icsb_sync_counters+0x39/0x50 [xfs]
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff811a7034>] ? statfs_by_dentry+0x74/0xa0
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff811a716b>] ? vfs_statfs+0x1b/0xb0
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8141a329>] ? sys_sendto+0x139/0x190
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff811a75d1>] ? sys_statfs+0x81/0xb0
Apr  4 12:20:55 fhgfs02 kernel: [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Apr  4 12:20:55 fhgfs02 kernel: Mem-Info:
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA per-cpu:
Apr  4 12:20:55 fhgfs02 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    4: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    5: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    6: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    7: hi:    0, btch:   1 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA32 per-cpu:
Apr  4 12:20:55 fhgfs02 kernel: CPU    0: hi:  186, btch:  31 usd:  37
Apr  4 12:20:55 fhgfs02 kernel: CPU    1: hi:  186, btch:  31 usd:  68
Apr  4 12:20:55 fhgfs02 kernel: CPU    2: hi:  186, btch:  31 usd:  46
Apr  4 12:20:55 fhgfs02 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    4: hi:  186, btch:  31 usd:  68
Apr  4 12:20:55 fhgfs02 kernel: CPU    5: hi:  186, btch:  31 usd: 172
Apr  4 12:20:55 fhgfs02 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: Node 0 Normal per-cpu:
Apr  4 12:20:55 fhgfs02 kernel: CPU    0: hi:  186, btch:  31 usd:  96
Apr  4 12:20:55 fhgfs02 kernel: CPU    1: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    5: hi:  186, btch:  31 usd: 165
Apr  4 12:20:55 fhgfs02 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Apr  4 12:20:55 fhgfs02 kernel: active_anon:440091 inactive_anon:74703 isolated_anon:19
Apr  4 12:20:55 fhgfs02 kernel: active_file:1018522 inactive_file:273458 isolated_file:0
Apr  4 12:20:55 fhgfs02 kernel: unevictable:1250 dirty:53398 writeback:0 unstable:0
Apr  4 12:20:55 fhgfs02 kernel: free:59808 slab_reclaimable:30367 slab_unreclaimable:51066
Apr  4 12:20:55 fhgfs02 kernel: mapped:2078 shmem:203 pagetables:2583 bounce:0
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA free:15720kB min:240kB low:300kB high:360kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15332kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Apr  4 12:20:55 fhgfs02 kernel: lowmem_reserve[]: 0 3253 8050 8050
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA32 free:132292kB min:52864kB low:66080kB high:79296kB active_anon:145508kB inactive_anon:29320kB active_file:1877584kB inactive_file:768328kB unevictable:0kB isolated(anon):76kB isolated(file):0kB present:3331420kB mlocked:0kB dirty:192880kB writeback:0kB mapped:36kB shmem:0kB slab_reclaimable:48072kB slab_unreclaimable:24948kB kernel_stack:112kB pagetables:832kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:64 all_unreclaimable? no
Apr  4 12:20:55 fhgfs02 kernel: lowmem_reserve[]: 0 0 4797 4797
Apr  4 12:20:55 fhgfs02 kernel: Node 0 Normal free:91840kB min:77960kB low:97448kB high:116940kB active_anon:1614856kB inactive_anon:269492kB active_file:2197008kB inactive_file:325516kB unevictable:5000kB isolated(anon):0kB isolated(file):0kB present:4912636kB mlocked:5000kB dirty:20712kB writeback:0kB mapped:8276kB shmem:812kB slab_reclaimable:73396kB slab_unreclaimable:179316kB kernel_stack:3016kB pagetables:9500kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1304 all_unreclaimable? no
Apr  4 12:20:55 fhgfs02 kernel: lowmem_reserve[]: 0 0 0 0
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA: 2*4kB 2*8kB 1*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15720kB
Apr  4 12:20:55 fhgfs02 kernel: Node 0 DMA32: 20929*4kB 2481*8kB 1037*16kB 89*32kB 31*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 130876kB
Apr  4 12:20:55 fhgfs02 kernel: Node 0 Normal: 21707*4kB 191*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 92452kB
Apr  4 12:20:55 fhgfs02 kernel: 1394366 total pagecache pages
Apr  4 12:20:55 fhgfs02 kernel: 100270 pages in swap cache
Apr  4 12:20:55 fhgfs02 kernel: Swap cache stats: add 158385, delete 58115, find 283104/283509
Apr  4 12:20:55 fhgfs02 kernel: Free swap  = 7752268kB
Apr  4 12:20:55 fhgfs02 kernel: Total swap = 8187896kB
Apr  4 12:20:56 fhgfs02 kernel: 2097150 pages RAM
Apr  4 12:20:56 fhgfs02 kernel: 83747 pages reserved
Apr  4 12:20:56 fhgfs02 kernel: 658610 pages shared
Apr  4 12:20:56 fhgfs02 kernel: 1311445 pages non-shared
 

> Other than that, please follow Christian's advice.
> Also, did you disable transparent huge pages, which are known to have
> bugs in RHEL kernels?
>
> http://www.fhgfs.com/wiki/wikka.php?wakka=ServerTuning
> (section "Huge Pages").


Yes, I followed those pages already.

I'm happy to send server logs if you wish. I suppose they should go to the support address?

Bernd Schubert

Apr 5, 2012, 11:00:13 AM
to fhgfs...@googlegroups.com, Christian Iseli

Yeah, I would increase it to 128MB or 256MB.


Cheers,
Bernd

Bernd Schubert

Apr 5, 2012, 11:07:44 AM
to fhgfs...@googlegroups.com, Christian Iseli
Hello Christian,

On 04/05/2012 03:06 PM, Christian Iseli wrote:
> Hi,
>
> thanks for all the replies :-)
>
> On Thursday, April 5, 2012 10:20:47 AM UTC+2, Bernd Schubert wrote:
>
> Out of interest, could you please post a few complete traces? I would
> like to know exactly where this came from. Usually drivers request
> physically contiguous pages (order 5 = 2^5 = 32 pages, i.e. 128kB with 4kB pages, in your case).
>
> sure:
> Apr 4 12:20:55 fhgfs02 kernel: fhgfs-storage/W: page allocation failure.
> order:5, mode:0xd0

[...]

> ____cache_alloc_node+0x99/0x160


> kmem_cache_alloc_node_notrace+0x6f/0x130
> Apr 4 12:20:55 fhgfs02 kernel: [<ffffffff8115f6fb>] ?
> __kmalloc_node+0x7b/0x100
> Apr 4 12:20:55 fhgfs02 kernel: [<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
> Apr 4 12:20:55 fhgfs02 kernel: [<ffffffff8146f341>] ?
> sk_stream_alloc_skb+0x41/0x110
> Apr 4 12:20:55 fhgfs02 kernel: [<ffffffff8146f760>] ?
> tcp_sendmsg+0x350/0xa10

[...]

So as I thought, it is coming from the kernel network stack, which
needs physically contiguous pages for driver DMA.

In principle you should open a RHEL bugzilla and complain that something
is fragmenting your memory.
I would also install collectl, which saves quite verbose logs that
might help to narrow down the issue. Of particular interest is
/proc/buddyinfo, which shows the number of free blocks
for each order (2^0, 2^1, ... pages).
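
As a rough sketch of how /proc/buddyinfo could be summarised: the function below counts, per zone, the free blocks of at least a given order. It assumes the usual line layout "Node N, zone NAME count0 count1 ..."; the function name free_blocks_at_least is made up for illustration:

```shell
#!/bin/sh
# Print "node zone count" for each zone, where count is the number of
# free blocks of at least the given order.
# Reads /proc/buddyinfo by default; a file name can be passed as $2.
free_blocks_at_least() {
    min_order=$1
    file=${2:-/proc/buddyinfo}
    awk -v min="$min_order" '
        $1 == "Node" && $3 == "zone" {
            node = $2
            sub(/,$/, "", node)      # "0," -> "0"
            n = 0
            # Field 5 holds the count for order 0, field 6 for order 1, ...
            for (i = 5 + min; i <= NF; i++) n += $i
            printf "%s %s %d\n", node, $4, n
        }' "$file"
}

# Zones with few or no free blocks of order 5 and above are the ones
# where an order:5 allocation is likely to fail.
if [ -r /proc/buddyinfo ]; then
    free_blocks_at_least 5
fi
```

Run periodically (for example from cron), this gives a simple picture of how fragmentation develops under load.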

Cheers,
Bernd
