kswapd is eating one of my cores

Shell Xu

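Apr 7, 2014, 7:02:43 AM

to shlug

For the past three days, ever since a system upgrade, kswapd has been eating one of my cores. Running echo 1 > /proc/sys/vm/drop_caches fixes it for a few seconds, but CPU usage soon climbs again. swapoff and vm.swappiness = 0 did not help.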

The recent upgrades are listed below (the "[升级]" tag in the log means "upgrade").

[升级] libtotem-plparser18:amd64 3.10.1-2 -> 3.10.2-1

[升级] python-pandas-lib:amd64 0.13.1-2 -> 0.13.1-2+b1

[升级] tar:amd64 1.27.1-1 -> 1.27.1-2

[升级] usbmuxd:amd64 1.0.8-2 -> 1.0.8-3

[升级] apache2-bin:amd64 2.4.7-1 -> 2.4.9-1

[升级] apache2-utils:amd64 2.4.7-1 -> 2.4.9-1

[升级] bash:amd64 4.3-4 -> 4.3-5

[升级] binfmt-support:amd64 2.1.3-2 -> 2.1.4-1

[升级] gir1.2-upowerglib-1.0:amd64 0.9.23-2+b1 -> 0.9.23-2+b2

[升级] gvfs:amd64 1.20.0-1 -> 1.20.0-1+b1

[升级] gvfs-backends:amd64 1.20.0-1 -> 1.20.0-1+b1

[升级] gvfs-bin:amd64 1.20.0-1 -> 1.20.0-1+b1

[升级] gvfs-daemons:amd64 1.20.0-1 -> 1.20.0-1+b1

[升级] gvfs-fuse:amd64 1.20.0-1 -> 1.20.0-1+b1

[升级] gvfs-libs:amd64 1.20.0-1 -> 1.20.0-1+b1

[升级] im-config:amd64 0.24-1 -> 0.25-1

[升级] libimobiledevice4:amd64 1.1.5-2+b1 -> 1.1.5-2+b2

[升级] libreadline6:amd64 6.3-4 -> 6.3-5

[升级] libsecret-1-0:amd64 0.15-2 -> 0.18-1

[升级] libsecret-common:amd64 0.15-2 -> 0.18-1

[升级] libupower-glib1:amd64 0.9.23-2+b1 -> 0.9.23-2+b2

[升级] python-nose:amd64 1.3.1-1 -> 1.3.1-2

[升级] readline-common:amd64 6.3-4 -> 6.3-5

[升级] upower:amd64 0.9.23-2+b1 -> 0.9.23-2+b2

[升级] gir1.2-packagekitglib-1.0:amd64 0.8.16-1 -> 0.8.17-1

[升级] gir1.2-telepathyglib-0.12:amd64 0.22.1-1 -> 0.24.0-1

[升级] libcups2:amd64 1.7.1-10 -> 1.7.1-12

[升级] libcupsimage2:amd64 1.7.1-10 -> 1.7.1-12

[升级] libegl1-mesa:amd64 10.1.0-4 -> 10.1.0-5

[升级] libegl1-mesa-drivers:amd64 10.1.0-4 -> 10.1.0-5

[升级] libgbm1:amd64 10.1.0-4 -> 10.1.0-5

[升级] libgl1-mesa-dri:amd64 10.1.0-4 -> 10.1.0-5

[升级] libgl1-mesa-glx:amd64 10.1.0-4 -> 10.1.0-5

[升级] libglapi-mesa:amd64 10.1.0-4 -> 10.1.0-5

[升级] libgles2-mesa:amd64 10.1.0-4 -> 10.1.0-5

[升级] libgnupg-interface-perl:amd64 0.50-1 -> 0.50-2

[升级] libgtk2-perl:amd64 2:1.249-1 -> 2:1.249-2

[升级] libopenvg1-mesa:amd64 10.1.0-4 -> 10.1.0-5

[升级] libpackagekit-glib2-16:amd64 0.8.16-1 -> 0.8.17-1

[升级] libsvn1:amd64 1.8.8-1 -> 1.8.8-2

[升级] libtelepathy-glib0:amd64 0.22.1-1 -> 0.24.0-1

[升级] libwayland-egl1-mesa:amd64 10.1.0-4 -> 10.1.0-5

[升级] mobile-broadband-provider-info:amd64 20130915-1 -> 20140317-1

[升级] openssh-client:amd64 1:6.5p1-6 -> 1:6.6p1-2

[升级] openssh-server:amd64 1:6.5p1-6 -> 1:6.6p1-2

[升级] openssh-sftp-server:amd64 1:6.5p1-6 -> 1:6.6p1-2

[升级] packagekit:amd64 0.8.16-1 -> 0.8.17-1

[升级] packagekit-backend-aptcc:amd64 0.8.16-1 -> 0.8.17-1

[升级] packagekit-tools:amd64 0.8.16-1 -> 0.8.17-1

[升级] python-scipy:amd64 0.12.0-3+b1 -> 0.13.3-2

[升级] python3-packagekit:amd64 0.8.16-1 -> 0.8.17-1

[升级] subversion:amd64 1.8.8-1 -> 1.8.8-2

[升级] xserver-common:amd64 2:1.15.0-2 -> 2:1.15.0.901-1

[升级] xserver-xephyr:amd64 2:1.15.0-2 -> 2:1.15.0.901-1

[升级] xserver-xorg-core:amd64 2:1.15.0-2 -> 2:1.15.0.901-1

[升级] xserver-xorg-video-vmware:amd64 1:13.0.1-3+b1 -> 1:13.0.2-2

Has anyone seen a similar problem?
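One way to check whether kswapd is really reclaiming pages, rather than only burning CPU, is to sample the reclaim counters in /proc/vmstat twice and compare them. The sketch below is an illustration and not something posted in the thread: it assumes the relevant counters start with pgscan_kswapd and pgsteal_kswapd (exact names differ between kernel versions, so it matches by prefix), and both helper names are hypothetical.

```python
def kswapd_counters(vmstat_text):
    """Sum the kswapd scan/steal counters found in /proc/vmstat text.

    Counter names vary by kernel (older kernels add per-zone suffixes),
    so counters are matched by prefix rather than by exact name.
    """
    totals = {"pgscan_kswapd": 0, "pgsteal_kswapd": 0}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip anything that is not "name value"
        name, value = parts[0], int(parts[1])
        for prefix in totals:
            if name.startswith(prefix):
                totals[prefix] += value
    return totals


def kswapd_delta(before, after):
    """Per-counter difference between two kswapd_counters() samples."""
    return {key: after[key] - before[key] for key in before}
```

Reading /proc/vmstat twice a few seconds apart and passing both results to kswapd_delta shows how many pages kswapd scanned and reclaimed in that interval. A large scan count with very little reclaimed would suggest that it is scanning without making progress.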

--

彼節者有間，而刀刃者無厚；以無厚入有間，恢恢乎其於游刃必有餘地矣。
blog: http://shell909090.org/blog/

twitter: @shell909090
about.me: http://about.me/shell909090

Chaos Eternal

Apr 7, 2014, 8:16:44 AM

to sh...@googlegroups.com

Run SystemTap/ktap and see what it is doing.

By the way, which version is this?

> --
> -- You received this message because you are subscribed to the Google Groups
> Shanghai Linux User Group group. To post to this group, send email to
> sh...@googlegroups.com. To unsubscribe from this group, send email to
> shlug+un...@googlegroups.com. For more options, visit this group at
> https://groups.google.com/d/forum/shlug?hl=zh-CN
> ---
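> You received this message because you are subscribed to the "Shanghai Linux User Group" group on Google Groups.
> To unsubscribe from this group and stop receiving emails from it, send an email to shlug+un...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.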

none_nobody

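Apr 7, 2014, 11:57:11 AM

to sh...@googlegroups.com

I don't think kswapd itself is what's broken; going after it is effort spent in the wrong place.

Some service or process, perhaps more than one, is contending for memory or cache.

So stop the services first. If they can't be stopped, divert the traffic away. If even that isn't possible, the only option is to reproduce the problem in a backup environment.

On Monday, April 7, 2014 8:16:44 PM UTC+8, Soahc Lanrete wrote:

Run SystemTap/ktap and see what it is doing.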

Shell Xu

Apr 7, 2014, 12:08:53 PM

to shlug

I suspect so too, because I haven't upgraded the kernel within the past week. I think the kernel was last upgraded a few weeks ago.

from nexus 4

liyaoshi

Apr 7, 2014, 9:29:58 PM

to sh...@googlegroups.com

Which kernel are you on now? 3.11?

http://forums.opensuse.org/showthread.php/487620-Kswapd0-high-CPU-usage

This one looks a lot like your case.

Shell Xu

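Apr 7, 2014, 9:56:01 PM

to shlug

It should be 3.13.7-1. The system is Debian testing.

Red Hat's Bugzilla has a record of a similar problem from 2011, which was also worked around by dropping caches. But that was on a 2.6-series kernel, and it is marked as resolved.

Also, I just read the discussion you linked, and the problem there was never solved... Does that mean that, for now, I can neither use the affected machine nor upgrade it?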

liyaoshi

Apr 7, 2014, 11:18:46 PM

to sh...@googlegroups.com

sysctl -w vm.vfs_cache_pressure=100

Give that a try?

liyaoshi

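Apr 7, 2014, 11:21:27 PM

to sh...@googlegroups.com

Another crude approach is to switch to btrfs. Filesystems like ext4 tend to lean heavily on the VFS cache, while xfs and btrfs don't make much use of it.

In my own experience, though, xfs is really slow.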

liyaoshi

Apr 7, 2014, 11:25:14 PM

to sh...@googlegroups.com

While I'm at it ("顺大便", a crude pun on 顺便, "by the way"), here is mine for comparison:

Linux Z620-NAV 3.13.0-rc8+ #3 SMP Tue Jan 14 12:38:24 CST 2014 x86_64 x86_64 x86_64 GNU/Linux

xihuang@Z620-NAV:~$ top

top - 10:28:33 up 21 days, 1:13, 4 users, load average: 0.15, 0.05, 0.06

Tasks: 266 total, 1 running, 265 sleeping, 0 stopped, 0 zombie

Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu8 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 32827440k total, 26453196k used, 6374244k free, 488k buffers

Swap: 0k total, 0k used, 0k free, 23715800k cached

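Of course, I'm not running any real workload.

Does it still eat a core when your machine is idle?

Is there any test code that can reproduce it?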

liyaoshi

Apr 7, 2014, 11:28:06 PM

to sh...@googlegroups.com

Also, check top. If you are running 100 Java instances that each use 4G of memory, all bets are off.

Shell Xu

Apr 7, 2014, 11:34:40 PM

to shlug

Memory is not under pressure; there is still about 2G of free+cache left.

And please, no more "顺大便"...
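The "free+cache" figure mentioned here can be derived from /proc/meminfo. The following is a minimal sketch under the assumption that "free+cache" means MemFree + Buffers + Cached; the thread does not say exactly which fields were added up, and the function names are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number as reported}.

    Size fields are reported in kB; a few fields (e.g. HugePages_Total)
    are plain counts and carry no unit.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def free_plus_cache_kb(info):
    """MemFree + Buffers + Cached in kB (missing fields count as 0)."""
    return sum(info.get(key, 0) for key in ("MemFree", "Buffers", "Cached"))
```

Passing the contents of /proc/meminfo to parse_meminfo and the result to free_plus_cache_kb gives a single number to compare against total memory.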

liyaoshi

Apr 8, 2014, 12:52:42 AM

to sh...@googlegroups.com

Then which pun would you prefer?

Have a look at slabtop:

Active / Total Objects (% used) : 5980864 / 6305931 (94.8%)

Active / Total Slabs (% used) : 231044 / 231044 (100.0%)

Active / Total Caches (% used) : 73 / 115 (63.5%)

Active / Total Size (% used) : 2286859.05K / 2376614.71K (96.2%)

Minimum / Average / Maximum Object : 0.01K / 0.38K / 8.00K

OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME

1788213 1787691 99% 0.19K 85153 21 340612K dentry

1172668 1172668 100% 1.01K 37828 31 1210496K btrfs_inode

743445 743445 100% 0.29K 27535 27 220280K btrfs_transaction

681996 561465 82% 0.09K 16238 42 64952K btrfs_extent_state

594275 482156 81% 0.31K 23771 25 190168K btrfs_extent_buffer

440748 368352 83% 0.55K 15741 28 251856K radix_tree_node

342940 342824 99% 0.15K 13190 26 52760K btrfs_extent_map

132440 132440 100% 0.07K 2365 56 9460K Acpi-ParseExt

101312 91295 90% 0.06K 1583 64 6332K kmalloc-64

52350 51643 98% 0.62K 2094 25 33504K proc_inode_cache

43008 43008 100% 0.03K 336 128 1344K kmalloc-32

33444 33289 99% 0.11K 929 36 3716K sysfs_dir_cache

28544 24259 84% 0.06K 446 64 1784K ext4_free_data

14364 14364 100% 0.55K 513 28 8208K inode_cache

13872 13362 96% 0.04K 136 102 544K ext4_extent_status

13824 13824 100% 0.01K 27 512 108K kmalloc-8

12512 12307 98% 0.17K 544 23 2176K vm_area_struct

10812 10812 100% 0.04K 106 102 424K Acpi-Namespace

10432 10432 100% 0.06K 163 64 652K anon_vma

8192 8192 100% 0.02K 32 256 128K kmalloc-16

7232 6829 94% 0.12K 226 32 904K kmalloc-128

6132 5319 86% 0.09K 146 42 584K kmalloc-96

6080 5455 89% 0.25K 190 32 1520K kmalloc-256

5796 5003 86% 0.19K 276 21 1104K kmalloc-192

5290 5244 99% 0.09K 115 46 460K ext3_xattr

3485 3485 100% 0.05K 41 85 164K nsproxy

3200 3200 100% 0.50K 100 32 1600K kmalloc-512

3003 3003 100% 0.81K 77 39 2464K task_xstate

2418 2418 100% 0.10K 62 39 248K buffer_head

2304 2142 92% 1.00K 72 32 2304K kmalloc-1024

2205 2205 100% 0.37K 105 21 840K btrfs_ordered_extent

2040 2040 100% 0.02K 12 170 48K fsnotify_event_holder

1650 1472 89% 1.06K 55 30 1760K signal_cache

1440 1440 100% 0.13K 48 30 192K ext4_allocation_context

1320 1206 91% 4.00K 165 8 5280K kmalloc-4096

1225 1225 100% 0.62K 49 25 784K shmem_inode_cache

1050 1050 100% 0.62K 42 25 672K files_cache

1044 1044 100% 0.88K 29 36 928K mm_struct

925 925 100% 0.62K 37 25 592K sock_inode_cache

810 775 95% 2.06K 54 15 1728K sighand_cache

688 640 93% 2.00K 43 16 1376K kmalloc-2048

646 646 100% 0.12K 19 34 76K fsnotify_event

612 612 100% 0.08K 12 51 48K Acpi-State

525 395 75% 0.31K 21 25 168K mnt_cache

490 447 91% 5.81K 98 5 3136K task_struct

483 369 76% 0.38K 23 21 184K blkdev_requests

432 432 100% 0.88K 12 36 384K UDP

429 429 100% 0.81K 11 39 352K bdev_cache

384 384 100% 0.06K 6 64 24K kmem_cache_node

375 375 100% 2.06K 25 15 800K idr_layer_cache

336 336 100% 0.14K 12 28 48K btrfs_path

325 325 100% 0.16K 13 25 52K sigqueue

300 300 100% 0.16K 12 25 48K btrfs_trans_handle

288 288 100% 0.32K 12 24 96K taskstats

270 270 100% 1.06K 9 30 288K UDPv6
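To compare slabtop output between machines, it can help to rank caches by size instead of reading the table by eye. The sketch below parses data rows in the column layout shown above (OBJS, ACTIVE, USE, OBJ SIZE, SLABS, OBJ/SLAB, CACHE SIZE, NAME). The function names are hypothetical and the code assumes cache sizes are printed with a "K" suffix, as they are here.

```python
def parse_slabtop_row(row):
    """Parse one slabtop data row into a dict.

    Expected columns: OBJS ACTIVE USE OBJ-SIZE SLABS OBJ/SLAB CACHE-SIZE NAME
    """
    objs, active, use, obj_size, slabs, per_slab, cache_size, name = row.split()
    return {
        "name": name,
        "objs": int(objs),
        "active": int(active),
        "slabs": int(slabs),
        "obj_per_slab": int(per_slab),
        "cache_kb": int(cache_size.rstrip("K")),
    }


def largest_caches(rows, count=3):
    """Return the `count` largest caches by CACHE SIZE, biggest first."""
    parsed = [parse_slabtop_row(r) for r in rows if r.strip()]
    return sorted(parsed, key=lambda c: c["cache_kb"], reverse=True)[:count]


def kb_per_slab(cache):
    """Average slab size in KiB (cache size divided by slab count)."""
    return cache["cache_kb"] / cache["slabs"]
```

Fed the rows above, this ranks btrfs_inode first at 1210496K, followed by dentry at 340612K.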

liyaoshi

Apr 8, 2014, 12:56:55 AM

to sh...@googlegroups.com

Is your workload write-heavy or read-heavy?

Try adding noatime in fstab.

How would I even simulate a setup like yours?

none_nobody

Apr 8, 2014, 1:09:29 AM

to sh...@googlegroups.com

I've been stuck on 12.04 this whole time.

Linux gluster 3.2.0-51-generic #77-Ubuntu SMP Wed Jul 24 20:18:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Here's mine, for show.

top - 13:06:55 up 233 days, 18:19, 2 users, load average: 1.33, 1.27, 1.19
Tasks: 110 total,   1 running, 109 sleeping,   0 stopped,   0 zombie
Cpu(s): 1.1%us, 0.9%sy, 0.0%ni, 97.9%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16435924k total, 13221152k used, 3214772k free,   601404k buffers
Swap:        0k total,        0k used,        0k free, 8684920k cached

PID USER      PR NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
25262 root      20   0 14.0g 2.8g 1780 S   98 17.8 145:17.05 livermore
    1 root      20   0 24460 2152 1116 S    0 0.0   0:01.79 init
    2 root      20   0     0    0    0 S    0 0.0   0:01.92 kthreadd
    3 root      20   0     0    0    0 S    0 0.0   8:00.76 ksoftirqd/0
    6 root      RT   0     0    0    0 S    0 0.0 20:25.30 migration/0

liyaoshi

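Apr 8, 2014, 1:17:00 AM

to sh...@googlegroups.com

Theirs may be caused by some unusual application, or the machine is simply so beefy that the people who wrote the code never imagined anyone would actually run it like this.

Also, Debian used to build its kernels mostly with -march=i486; I don't know what optimization target they use for 64-bit.

I strongly suggest trying the kernel config I posted last time.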

Shell Xu

Apr 8, 2014, 3:08:28 AM

to shlug

I just found that the problem never occurs while the machine is not under load. It does not occur under vbox either.

from nexus 4

Shell Xu

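Apr 8, 2014, 3:29:59 AM

to shlug

Setting vfs_cache_pressure to 100 or to 0 makes no difference; kswapd still shows up either way.

slabtop shows nothing odd. The only thing I hadn't seen before is kmalloc-16384: 400 objects, 6M. buffer_head is the largest, at 12M.

Both ext4 and xfs are in use.

The machine is my own small laptop, so it shouldn't have much of a read/write load. noatime was already set.

On April 8, 2014 at 12:52 PM, liyaoshi <liya...@gmail.com> wrote: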

liyaoshi

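Apr 8, 2014, 3:55:28 AM

to sh...@googlegroups.com

Are you here to put me to the test?

One more trick: don't use Unity; use twm and see whether the display stack is the cause. Or try shutting down lightdm?

A 16k kmalloc... that's probably nothing.

I feel I'm about to be found out; I'm nearly out of ideas.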

liyaoshi

Apr 8, 2014, 4:01:31 AM

to sh...@googlegroups.com

How about posting your dmesg as well?

Shell Xu

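Apr 8, 2014, 4:05:23 AM

to shlug

Got it. Yaoshi (liyaoshi) called it. With X shut down, kswapd's CPU usage basically fluctuates within the normal range. Combined with these entries from the upgrade list: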

[升级] xserver-common:amd64 2:1.15.0-2 -> 2:1.15.0.901-1

[升级] xserver-xephyr:amd64 2:1.15.0-2 -> 2:1.15.0.901-1

[升级] xserver-xorg-core:amd64 2:1.15.0-2 -> 2:1.15.0.901-1

[升级] xserver-xorg-video-vmware:amd64 1:13.0.1-3+b1 -> 1:13.0.2-2

it looks like the xserver upgrade is what caused the problem.

The question is how to downgrade :(

On April 8, 2014 at 3:55 PM, liyaoshi <liya...@gmail.com> wrote:

liyaoshi

Apr 8, 2014, 4:13:38 AM

to sh...@googlegroups.com

For that one, I'm calling on chaos.

I usually do it the dumb way: dpkg -x, then cp -arf xxx

When I'm unlucky, though, that ends in a load symbol error, so my ultimate move is

reinstalling.

Shell Xu

Apr 8, 2014, 4:15:00 AM

to shlug

I googled it; apparently it's apt-get install package=version
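Since the upgrade log at the top of the thread records the old version of each package, the package=version arguments can be generated from it rather than typed by hand. This is only a sketch: the function names are invented, it assumes log lines shaped like "[tag] name:arch old -> new", and it drops the architecture suffix on the assumption that the native architecture is wanted. apt can only install an old version that is still available from a configured source.

```python
def parse_upgrade_line(line):
    """Return (package, old_version, new_version), or None if not a log line."""
    parts = line.split()
    if len(parts) != 5 or parts[3] != "->":
        return None
    package = parts[1].split(":", 1)[0]  # drop the ":amd64" architecture suffix
    return package, parts[2], parts[4]


def downgrade_command(log_lines, prefix="xserver-"):
    """Build an apt-get argument list pinning matching packages to old versions."""
    targets = []
    for line in log_lines:
        parsed = parse_upgrade_line(line)
        if parsed and parsed[0].startswith(prefix):
            targets.append(f"{parsed[0]}={parsed[1]}")
    return ["apt-get", "install"] + targets
```

The function only builds the argument list and does not run anything, so the result can be reviewed before it is executed as root.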

liyaoshi

Apr 8, 2014, 4:17:37 AM

to sh...@googlegroups.com

Oh, teaching and learning go hand in hand; I've picked up another trick.

Shell Xu

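Apr 8, 2014, 4:19:12 AM

to shlug

Not yet. I just noticed that xserver-xorg has an upgrade available too. In the spirit of a fearless lab rat, I'll upgrade it and see whether a version mismatch was causing the problem.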

Shell Xu

Apr 8, 2014, 4:24:40 AM

to shlug

Confirmed: a version mismatch between xserver-common and xserver-xorg led to kswapd eating all the CPU of one core. After the upgrade, the problem did not show up within 5 minutes under my load.

Great, thank you guys.

liyaoshi

Apr 8, 2014, 4:29:57 AM

to sh...@googlegroups.com

Phew, I wasn't found out after all.

AR

Apr 8, 2014, 5:26:18 AM

to sh...@googlegroups.com

2014-04-08 16:24 GMT+08:00 Shell Xu <shell...@gmail.com>:
> Confirmed: a version mismatch between xserver-common and xserver-xorg led to
> kswapd eating all the CPU of one core. After the upgrade, the problem did not
> show up within 5 minutes under my load.
>
> Great, thank you guys.

So that's what it was. Wow.

--
Silence is golden.

Felix Yan

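Apr 8, 2014, 6:07:31 AM

to sh...@googlegroups.com

On Tuesday, April 08, 2014 16:19:12 Shell Xu wrote:
> Not yet. I just noticed that xserver-xorg has an upgrade available too. In the spirit of a fearless lab rat, I'll upgrade it and see whether a version mismatch was causing the problem.

Let me borrow this thread to ask two questions, by the way:

- In routine system upgrades, isn't everything upgraded together? I use Arch, so my idea of "upgrading together" may be rather extreme, but even on other distributions, keeping all packages at their latest versions should reduce the chance of problems... right?

- And if such a matching requirement exists, the dependencies of the xserver-xorg packages should state the required versions... So this would count as a packaging bug?

Regards,
Felix Yan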

AR

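Apr 8, 2014, 6:11:52 AM

to sh...@googlegroups.com

2014-04-08 18:07 GMT+08:00 Felix Yan <felix...@gmail.com>:
>
> Let me borrow this thread to ask two questions, by the way:
>
> - In routine system upgrades, isn't everything upgraded together? I use Arch, so my idea of "upgrading together" may be rather extreme, but even on other distributions, keeping all packages at their latest versions should reduce the chance of problems... right?
>
> - And if such a matching requirement exists, the dependencies of the xserver-xorg packages should state the required versions... So this would count as a packaging bug?
>
> Regards,
> Felix Yan

For production, upgrading is inherently risky, and it is hard to talk about probabilities. If you can test, then of course you deploy only after full testing. As for Arch, which often runs bleeding edge, I'd say upgrading blindly is even more likely to cause trouble, lol. That isn't a dig, though; I actually quite like Arch.

As for the version mismatch, it depends on the specifics: whether it is an unconfirmed bug or simply a combination that was never meant to be used.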

--
Silence is golden.

Felix Yan

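Apr 8, 2014, 6:19:48 AM

to sh...@googlegroups.com

On Tuesday, April 08, 2014 18:11:52 AR wrote:
> For production, upgrading is inherently risky, and it is hard to talk about probabilities. If you can test, then of course you deploy only after full testing. As for Arch, which often runs bleeding edge,
> I'd say upgrading blindly is even more likely to cause trouble, lol. That isn't a dig, though; I actually quite like Arch.

Oh, I may not have made myself clear.

What I meant is: which of these two approaches carries more risk?

- Something like apt-get upgrade (without the dist- prefix): a plain upgrade of all system components. On version-locked distributions such as Ubuntu, apart from third-party sources such as PPAs, almost the only updates available are backported bug fixes, security patches, and so on. (Full update)

- Selectively updating only the packages you happen to think of, when you see a security hole fixed (such as today's openssl one) or a new package fixes a bug you care about, in order to avoid problems that updating unrelated packages might bring. (Partial update)

Personally I lean toward the latter being riskier, because it is hard to keep track of every package that needs attention, including the software you use and the libraries that software depends on. As a result you will miss security patches. It may also introduce version incompatibilities that the packagers never considered.

I didn't mean to compare upgrading versus not upgrading...

Regards,
Felix Yan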

Shell Xu

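Apr 8, 2014, 7:47:27 AM

to shlug

I always upgrade everything, but because of upstream sync issues, at any given moment only half of the updates may be available.
Also, the two packages involved don't share a version numbering scheme, so you can't tell from the numbers how they relate.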

from nexus 4

Shell Xu

Apr 8, 2014, 8:20:01 AM

to shlug

Shit, it's acting up again, and late at night too. Really annoying.

AR

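Apr 8, 2014, 8:21:33 AM

to sh...@googlegroups.com

Still, the upstream problem Shell ran into doesn't change my basic preference for keeping things upgraded, because I don't think fear of upgrade risk is a reason to stop updating. (Of course, if your environment is especially sensitive to this and cannot tolerate the slightest error, you should guarantee that by other means; in that case, not upgrading may be an option.)

The Ubuntu Server installer also offers several policies for this:

1. Automatic updates
2. Security updates only
3. Leave everything alone

From a general standpoint, considering only problems caused by package upgrades, safety ranks 3 > 2 > 1. I have also seen ops teams at more than one or two companies choose the third option (they barely even apply security updates, on the theory that if nothing changes, nothing breaks). Red Hat and its derivatives, for their own needs and their customers', still backport all kinds of security fixes to old versions (that is why the 2.6.32 branch of the Linux kernel is still maintained as LTS today).

So there is no standard answer to this question; whoever is choosing should decide based on their actual needs. Personally, I upgrade as much as possible; at the very least, security updates are necessary.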

--
Silence is golden.

Felix Yan

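Apr 8, 2014, 8:50:06 AM

to sh...@googlegroups.com

On Tuesday, April 08, 2014 19:47:27 Shell Xu wrote:
> I always upgrade everything, but because of upstream sync issues, at any given moment only half of the updates may be available.

Hmm, it is possible that upstream got something wrong here...

On most distributions, the package database and the packages themselves are kept separate, so every synced version of the package database should be complete. If instead the files have not finished syncing, the upgrade should hit a 404 and abort the whole update.

So if you do hit a partial-sync problem, it can only be because the distribution publishes packages in an unsound way... for example, by finishing the whole update procedure for one package before moving on to the next.
(I'm not familiar with the process on other distributions. On Arch, though, after committing the build script and uploading the package, a separate manual "update database" step is needed before the package you just submitted actually appears.
So if, say, fcitx gets a new version and several related components need to be updated with it, I build and upload every package in that series first and only then run "update database". That way the package database is complete at any point in time
and never contains only part of the fcitx packages, which would break things for users after an update...)

> Also, the two packages involved don't share a version numbering scheme, so you can't tell from the numbers how they relate.

That does make it a bit awkward...

A quick look suggests that Xorg upstream switched entirely to the 1.xx.x version style quite a while ago, so there seems to be little reason for Debian and other distributions to keep the 7.x versions...

Regards,
Felix Yan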

Felix Yan

Apr 8, 2014, 8:54:08 AM

to sh...@googlegroups.com

On Tuesday, April 08, 2014 20:20:01 Shell Xu wrote:
> Shit, it's acting up again, and late at night too. Really annoying.

Does it still go back to normal once you shut down X?

Regards,
Felix Yan

Shell Xu

Apr 8, 2014, 9:47:47 AM

to shlug

Yes.

The catch is that with X shut down, the load also drops to nearly zero, so that result is not very conclusive.

Felix Yan

Apr 8, 2014, 10:13:08 AM

to sh...@googlegroups.com

On Tuesday, April 08, 2014 21:47:47 Shell Xu wrote:
> Yes.
> The catch is that with X shut down, the load also drops to nearly zero, so that result is not very conclusive.

Can you confirm that what kswapd is consuming is actual CPU time, rather than iowait?

Regards,
Felix Yan

Shell Xu

Apr 8, 2014, 10:45:00 AM

to shlug

It is not iowait; it is sys time that is being eaten.
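The sys versus iowait distinction can be checked from two samples of a per-CPU line in /proc/stat. The sketch below is illustrative only. It relies on the field order documented for /proc/stat (user, nice, system, idle, iowait, irq, softirq, steal) and ignores any later guest fields; the helper names are made up.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse a 'cpu' or 'cpuN' line from /proc/stat into {field: ticks}."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Fraction of elapsed ticks spent in each state between two samples."""
    delta = {name: after[name] - before[name] for name in before}
    total = sum(delta.values())
    if total == 0:
        return {name: 0.0 for name in delta}
    return {name: ticks / total for name, ticks in delta.items()}
```

If the core that kswapd is running on shows a high "system" share and a low "iowait" share over the interval, that matches the description here: kernel CPU time, not time spent waiting on disk.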

Shell Xu

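Apr 8, 2014, 11:48:54 AM

to shlug

I switched kernels, and so far there has been no problem. The kernel I switched to is 3.2.0, though, and I plan to move up to something newer.

My guess is that the new kernel combined with the new xorg is what triggers this weird problem.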

Chaos Eternal

Apr 9, 2014, 7:41:33 AM

to sh...@googlegroups.com

Use SystemTap!

Shell Xu

Apr 9, 2014, 11:26:18 PM

to shlug

Adding some test results.

Kernels 3.12 and above all fail, whether they are Debian's own builds or compiled directly from kernel source. The 3.10 kernel I built had some problems, so I did not test it.

Shell Xu

Apr 13, 2014, 10:25:40 PM

to shlug

One more data point: I downgraded the xserver-xorg packages, and there was no improvement at all. I'm going to go back and use stap.

胡瀚森

Apr 17, 2014, 8:53:02 PM

to sh...@googlegroups.com

If there is no memory pressure and kblockd isn't doing anything unusual, don't worry about kswapd...

none_nobody

Apr 17, 2014, 10:08:38 PM

to sh...@googlegroups.com

What if it stalls the machine?

On Friday, April 18, 2014 8:53:02 AM UTC+8, SoftRank.net wrote:

If there is no memory pressure and kblockd isn't doing anything unusual, don't worry about kswapd...

胡瀚森

Apr 17, 2014, 10:15:49 PM

to sh...@googlegroups.com

Do you mean that kswapd still uses a lot of CPU even when the CPU is under heavy load?

Shell Xu

Apr 17, 2014, 10:25:36 PM

to shlug

The CPU is not under load, yet kswapd uses up a whole core.

liyaoshi

Apr 17, 2014, 10:47:40 PM

to sh...@googlegroups.com

Still not fixed?

Shell Xu

Apr 17, 2014, 10:50:43 PM

to shlug

Far from it. The problem now points back at the kernel itself.

胡瀚森

Apr 18, 2014, 2:42:45 AM

to sh...@googlegroups.com

When the CPU is not under load, kswapd's usage can be treated as equivalent to idle... The statistics actually displayed vary with the kernel version.

胡瀚森

Apr 18, 2014, 2:44:13 AM

to sh...@googlegroups.com

The main thing is to test whether this usage is still high under a full compute-bound load. If it is very low, then there is no problem.

Shell Xu

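Apr 18, 2014, 2:57:49 AM

to shlug

Usage is already high with no load at all; it never even gets to full load. Think about it: with only one core left, running anything at all makes the whole machine as laggy as the network in China...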

Shell Xu

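Apr 28, 2014, 10:19:15 AM

to shlug

After an xserver upgrade, the problem has eased. kswapd still occasionally eats a core, but only for a few seconds at a time, and not very often. By comparison I still don't think it is quite normal, but it is within an acceptable range.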
