Can't connect to SSH

Alex Keiti Nosse

Oct 16, 2017, 9:16:56 AM

to gce-discussion

I can no longer connect to my instance, either through the browser SSH or from a terminal.

Output from gcloud compute instances get-serial-port-output:

[3144931.651399] Out of memory: Kill process 30949 (php.bin) score 19 or sacrifice child
[3144931.659366] Killed process 30949 (php.bin) total-vm:264728kB, anon-rss:8356kB, file-rss:25972kB
[3145082.039628] INFO: task kswapd0:26 blocked for more than 120 seconds.
[3145082.046316]       Not tainted 3.16.0-4-amd64 #1
[3145082.051125] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[3145082.059233] kswapd0         D ffff880069a1e628     0    26      2 0x00000000
[3145082.066794]  ffff880069a1e1d0 0000000000000046 0000000000012f40 ffff88006648ffd8
[3145082.198983]  0000000000012f40 ffff880069a1e1d0 ffff88006656e000 000000000013bde1
[3145082.207114]  ffff88006656e088 ffff88006656e024 ffff88006648fba0 ffff88006656e0a0
[3145082.215248] Call Trace:
[3145082.217994]  [<ffffffffa00ca7d5>] ? jbd2_log_wait_commit+0x95/0x100 [jbd2]
[3145082.225152]  [<ffffffff810a9610>] ? prepare_to_wait_event+0xf0/0xf0
[3145082.231725]  [<ffffffffa00f7c0a>] ? ext4_evict_inode+0x2ea/0x4e0 [ext4]
[3145082.238624]  [<ffffffff811c4efc>] ? evict+0xac/0x170
[3145082.243870]  [<ffffffff811c4ff9>] ? dispose_list+0x39/0x50
[3145082.249725]  [<ffffffff811c5e34>] ? prune_icache_sb+0x44/0x50
[3145082.255756]  [<ffffffff811ae2bf>] ? super_cache_scan+0xff/0x170
[3145082.261958]  [<ffffffff8114f9f7>] ? shrink_slab_node+0x137/0x2f0
[3145082.268374]  [<ffffffff811513d2>] ? shrink_slab+0x82/0x150
[3145082.274141]  [<ffffffff81154c6e>] ? balance_pgdat+0x3be/0x5c0
[3145082.280176]  [<ffffffff81154fcc>] ? kswapd+0x15c/0x460
[3145082.285598]  [<ffffffff810a9610>] ? prepare_to_wait_event+0xf0/0xf0
[3145082.292143]  [<ffffffff81154e70>] ? balance_pgdat+0x5c0/0x5c0
[3145082.298175]  [<ffffffff8108954d>] ? kthread+0xbd/0xe0
[3145082.303509]  [<ffffffff81089490>] ? kthread_create_on_node+0x180/0x180
[3145082.310326]  [<ffffffff8151a3d8>] ? ret_from_fork+0x58/0x90
[3145082.316182]  [<ffffffff81089490>] ? kthread_create_on_node+0x180/0x180
[3145082.446916] INFO: task jbd2/sda1-8:96 blocked for more than 120 seconds.
[3145082.453904]       Not tainted 3.16.0-4-amd64 #1
[3145082.458712] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[3145082.466820] jbd2/sda1-8     D ffff8800666084a8     0    96      2 0x00000000
[3145082.474376]  ffff880066608050 0000000000000046 0000000000012f40 ffff8800368b7fd8
[3145082.482499]  0000000000012f40 ffff880066608050 ffff8800368b7dc8 ffff8800368b7e60
[3145082.490632]  ffff88006656e0b8 ffff880066608050 ffff880068d0c900 ffff8800368b7db

If I use the browser SSH, it gets stuck forever on the message:

Establishing connection to SSH server...

For some unknown reason, CPU utilization has been at 100% for over an hour.

Can anyone help me here?

Digil (Google Cloud Platform Support)

Oct 16, 2017, 4:44:27 PM

to gce-discussion

Have you tried accessing the VM through the serial console? This help center article explains how to use it [1].

If you are able to log in to the instance through the serial console, you can check for errors from there or make the appropriate configuration changes.

[1] https://cloud.google.com/compute/docs/instances/interacting-with-serial-console
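
For reference, here is a minimal sketch of the commands involved, written as a small Python helper that only builds the gcloud argument lists. The instance name and zone are placeholders, and the sketch assumes the interactive serial console has to be enabled through the serial-port-enable metadata key before gcloud compute connect-to-serial-port will work. Check the article above for the details that apply to your project.

```python
def serial_console_commands(instance, zone):
    """Build gcloud commands to enable and open the interactive serial console."""
    enable = [
        "gcloud", "compute", "instances", "add-metadata", instance,
        "--zone", zone,
        "--metadata", "serial-port-enable=TRUE",
    ]
    connect = [
        "gcloud", "compute", "connect-to-serial-port", instance,
        "--zone", zone,
    ]
    return [enable, connect]


if __name__ == "__main__":
    # "my-instance" and "us-central1-a" are hypothetical placeholders.
    for cmd in serial_console_commands("my-instance", "us-central1-a"):
        print(" ".join(cmd))
```

Note that logging in at the serial console prompt normally requires a local user account with a password on the VM, so this may not help if no such account exists.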

Gerrit DeWitt

Oct 17, 2017, 2:56:24 PM

to gce-discussion

Hello Alex,

In this case, it appears that your instance has run out of memory (RAM), and the kernel has killed processes [1] in order to keep the operating system running.

These log entries indicate that your instance is out of memory:

[3144931.651399] Out of memory: Kill process 30949 (php.bin) score 19 or sacrifice child

[3144931.659366] Killed process 30949 (php.bin) total-vm:264728kB, anon-rss:8356kB, file-rss:25972kB
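
If the serial output is long, a short script can pull out the OOM kills for you. This is only a sketch: the regular expression is written against the "Killed process" line format shown above (kernel 3.16), and other kernel versions may print different fields. The function name and dictionary keys are my own.

```python
import re

# Matches lines such as:
# [3144931.659366] Killed process 30949 (php.bin) total-vm:264728kB, anon-rss:8356kB, file-rss:25972kB
KILLED_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r"\s+total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def find_oom_kills(serial_output):
    """Return one dict per process the OOM killer reported as killed."""
    kills = []
    for line in serial_output.splitlines():
        match = KILLED_RE.search(line)
        if match:
            kills.append({
                "timestamp": float(match.group("ts")),
                "pid": int(match.group("pid")),
                "name": match.group("name"),
                "total_vm_kb": int(match.group("total_vm")),
                "anon_rss_kb": int(match.group("anon_rss")),
                "file_rss_kb": int(match.group("file_rss")),
            })
    return kills
```

Feeding it the saved serial output as a string gives a list you can sort or count by process name to see what is being killed most often.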

Most likely, the sshd process itself (or some critical system on which it depends) has already been killed, which is why you're unable to connect via SSH. To fix this problem, stop your instance, then edit it. Increase the amount of RAM available by changing to a larger machine type [2], and start the instance again [3].
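
If you prefer the command line to the console UI, the same stop, resize, start sequence can be sketched as below. The Python helper only assembles the gcloud commands and prints them by default. The instance name, zone, and target machine type are placeholders you would replace with your own values.

```python
import subprocess


def resize_commands(instance, zone, machine_type):
    """Build the gcloud commands to stop, resize, and restart an instance."""
    base = ["gcloud", "compute", "instances"]
    return [
        base + ["stop", instance, "--zone", zone],
        base + ["set-machine-type", instance, "--zone", zone,
                "--machine-type", machine_type],
        base + ["start", instance, "--zone", zone],
    ]


def resize_instance(instance, zone, machine_type, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    for cmd in resize_commands(instance, zone, machine_type):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence if any step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values; nothing is executed while dry_run is True.
    resize_instance("my-instance", "us-central1-a", "n1-standard-2")
```

The machine type can only be changed while the instance is stopped, which is why the stop step comes first.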

--Gerrit

Cloud TSE, Seattle

References:

1: https://linux-mm.org/OOM

2: https://cloud.google.com/compute/docs/instances/changing-machine-type-of-stopped-instance#changing_a_machine_type

3: https://cloud.google.com/compute/docs/instances/changing-machine-type-of-stopped-instance#billing_implications

RED GAMER

Nov 25, 2019, 9:16:26 AM

to gce-discussion

It seems I have the same issue.

How did you get these logs?

I can't access WordPress (Bitnami) or even SSH on the instance.

Thank you.

Jason

Nov 25, 2019, 4:12:40 PM

to gce-discussion

Hi,

The logs shown above come from the instance's serial port. You can view them by following the steps in the linked article [1]. Keep in mind that the VM instance needs to be running to check these logs.

[1] https://cloud.google.com/compute/docs/instances/interacting-with-serial-console#connectserialconsole
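
The dump in the original post came from gcloud compute instances get-serial-port-output. A minimal Python sketch for capturing that output so you can save or search it is below. The instance name and zone are placeholders, and it assumes gcloud is installed and authenticated on the machine where you run it.

```python
import subprocess


def serial_output_command(instance, zone):
    """Build the gcloud command that dumps an instance's serial port output."""
    return [
        "gcloud", "compute", "instances", "get-serial-port-output",
        instance, "--zone", zone,
    ]


def fetch_serial_output(instance, zone):
    """Run the command and return the serial port output as text."""
    result = subprocess.run(
        serial_output_command(instance, zone),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical placeholders; this only prints the command.
    print(" ".join(serial_output_command("my-instance", "us-central1-a")))
```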

Kirill Katsnelson

Nov 27, 2019, 6:23:31 PM

to gce-discussion

This is an old thread, but you are so OOM that the kernel tries to save itself by randomly (well, not really randomly at all) shooting down processes. The sshd daemon is not the first in line to be killed, but the VM is so overloaded that sshd, not having been used in a while, may have been swapped out, and the kernel is simply unable to swap it back into RAM (the "kswapd0:26 blocked for more than 120 seconds" message is a terminal diagnosis, pretty much).
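
To check whether your own serial output contains the same kind of hung-task warnings, something like the following Python sketch could be used. The pattern is written against the "blocked for more than N seconds" lines in the original post, and the function name and dictionary keys are my own choices.

```python
import re

# Matches lines such as:
# [3145082.039628] INFO: task kswapd0:26 blocked for more than 120 seconds.
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<task>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(serial_output):
    """Return one dict per hung-task warning found in the serial output."""
    hung = []
    for line in serial_output.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            hung.append({
                "timestamp": float(match.group("ts")),
                "task": match.group("task"),
                "pid": int(match.group("pid")),
                "blocked_seconds": int(match.group("secs")),
            })
    return hung
```

Seeing kswapd0 or a jbd2 journal thread in the result, as in the original log, suggests the machine is stuck reclaiming memory or waiting on disk rather than just briefly busy.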

Always try the simplest things first. If you were able to SSH into the VM before, the quickest thing to try is to bump its CPU and memory temporarily, e.g. by selecting a much beefier machine type (if you are on n1-standard-1, switch to n1-standard-4), then boot and quickly try SSH again. CPUs and RAM are relatively cheap, and while it may set you back maybe $10 while you analyze the issue, you'll save much more of your time.

Maybe you have simply undersized the machine for its load.

As a side note, kernel 3.16 is quite old; think about upgrading if possible. Memory management has become significantly smoother since then.

Digil (Google Cloud Platform Support)

Nov 28, 2019, 10:30:53 AM

to gce-dis...@googlegroups.com

SSH issues are common in Linux-based environments. Some are easy to fix (a simple reboot or restart might help), while others are quite complex to resolve. There isn't a one-touch resolution guide for this issue. However, you can use the 'Troubleshooting SSH' help center article to find common causes of failed SSH connections.

Additionally, a 'General troubleshooting' page describes steps you might find helpful if you run into other problems using Compute Engine instances.