Range Server crash - low memory

Tomaž Hauzer

Dec 24, 2015, 2:34:50 AM
to Hypertable User
Hi,

a few days ago we had a problem with the Range Server. Below is the only useful information I got from the logs. One other possibly related thing is that we had a swap space spike around that time: free swap went from 80% to zero within a minute.

Are there any other logs I could check? Are there other things we could do to prevent the low-memory situation, besides adding more memory?

The database ran for almost a month before the crash.

RangeServer.log file:
1450803429 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:824) Finished Compaction of 2/26641[..ÿÿ](default) to /hypertable/tables/2/26641/default/qyoNKN5rd__dbHKv/cs7
1450803429 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:824) Finished Compaction of 2/26690[..ÿÿ](default) to /hypertable/tables/2/26690/default/qyoNKN5rd__dbHKv/cs399
1450803429 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:552) Starting Minor Compaction of 2/26691[..ÿÿ](default)
1450803429 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:545) Starting GC Compaction of 2/6970[6,bdb050-1458873               1448233795:148131609..ÿÿ](default)
1450803429 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/TimerHandler.cc:211) Application queue PAUSED due to low memory
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:824) Finished Compaction of 2/26691[..ÿÿ](default) to /hypertable/tables/2/26691/default/qyoNKN5rd__dbHKv/cs399
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:824) Finished Compaction of 2/26638[..ÿÿ](default) to /hypertable/tables/2/26638/default/qyoNKN5rd__dbHKv/cs56
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:824) Finished Compaction of 2/6970[6,4fd5a-89610             1437206883:7659068..6,bdb050-1458873            1448233795:148131609](default) to /hypertable/tables/2/6970/default/eypYI2CLSiaxIWNW/cs125
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/Lib/CommitLog.cc:341) purge(/hypertable/servers/rs1/log/root,rev=1433507403402040001) breaking on FileInfo=(logdir=/hypertable/servers/rs1/log/root,num=0,revision=1433507403426878001,references=0,rmOk=0)
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/Lib/CommitLog.cc:341) purge(/hypertable/servers/rs1/log/metadata,rev=1433507403515678001) breaking on FileInfo=(logdir=/hypertable/servers/rs1/log/metadata,num=0,revision=1435758422825719001,references=0,rmOk=0)
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/Lib/CommitLog.cc:341) purge(/hypertable/servers/rs1/log/system,rev=1446958813749711001) breaking on FileInfo=(logdir=/hypertable/servers/rs1/log/system,num=13,revision=1447016414209019154,references=0,rmOk=0)
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/Lib/CommitLog.cc:374) Removing log fragment '/hypertable/servers/rs1/log/user/3583' revision=1450787232845818004, parent=0
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/Lib/CommitLog.cc:341) purge(/hypertable/servers/rs1/log/user,rev=1450787256719318001) breaking on FileInfo=(logdir=/hypertable/servers/rs1/log/user,num=3584,revision=1450789724245787002,references=0,rmOk=0)
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc:349) Memory Statistics (MB): VM=153023.64, RSS=9849.67, tracked=7307.29, computed=7301.03 limit=7142.40
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc:354) Memory Allocation: BlockCache=60.07% BlockIndex=0.58% BloomFilter=14.38% CellCache=24.31% ShadowCache=0.00% QueryCache=0.65%
1450803443 INFO htRangeServer : (/root/src/hypertable/src/cc/Hypertable/RangeServer/MaintenancePrioritizerLowMemory.cc:199) WRITE workload prioritization (update_bytes=581335608, scan_count=4)
1450806984 INFO htRangeServer : (/root/src/hypertable/src/cc/Common/Config.cc:632) Initializing htRangeServer (Hypertable 0.9.8.6 (v0.9.8.6-0-gf4a780a))...
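
In the excerpt above, the "Memory Statistics" line reports tracked memory (7307.29 MB) above the limit (7142.40 MB), which is consistent with the "Application queue PAUSED due to low memory" message a few lines earlier. Below is a minimal Python sketch for pulling those numbers out of such a line. The function names are hypothetical helpers, not part of Hypertable, and the regular expressions only assume the format seen in this one log line.

```python
import re

# Line copied verbatim from the RangeServer.log excerpt above.
LINE = (
    "1450803443 INFO htRangeServer : "
    "(/root/src/hypertable/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc:349) "
    "Memory Statistics (MB): VM=153023.64, RSS=9849.67, "
    "tracked=7307.29, computed=7301.03 limit=7142.40"
)

# Assumption: the statistics are always written as name=number pairs.
STATS_RE = re.compile(r"Memory Statistics \(MB\): (.*)$")
PAIR_RE = re.compile(r"(\w+)=([0-9.]+)")


def parse_memory_stats(line):
    """Return {name: megabytes} for a 'Memory Statistics (MB)' line, else None."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {name: float(value) for name, value in PAIR_RE.findall(match.group(1))}


def overshoot_mb(stats):
    """Megabytes by which tracked memory exceeds the limit (negative if under)."""
    return stats["tracked"] - stats["limit"]


stats = parse_memory_stats(LINE)
print("tracked is %.2f MB over the limit" % overshoot_mb(stats))
```

Running the same function over every "Memory Statistics" line in the log would show whether tracked memory crept up gradually or jumped shortly before the restart.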

Doug Judd

Dec 29, 2015, 7:22:22 PM
to hypertable-user
How many CPUs on the RangeServer machine?  You can find out with a command such as the following:

grep "^processor" /proc/cpuinfo | wc -l

Also, are you running the RangeServer in a standalone setup over RAID?  Or is it a single disk?

- Doug


--
You received this message because you are subscribed to the Google Groups "Hypertable User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hypertable-us...@googlegroups.com.
To post to this group, send email to hyperta...@googlegroups.com.
Visit this group at https://groups.google.com/group/hypertable-user.
For more options, visit https://groups.google.com/d/optout.



--
Doug Judd
CEO, Hypertable Inc.

Tomaž Hauzer

Jan 4, 2016, 4:44:02 AM
to Hypertable User, do...@hypertable.com
Number of CPUs on the server:
[root@node01 ~]# grep "^processor" /proc/cpuinfo | wc -l
6

I forgot to add the info about the server:
Hypertable is running on the Hadoop filesystem. The RangeServer has 12GB of RAM and six cores. I think it's a single disk, but I'm waiting for a response from our systems guy.
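
As a rough sanity check, the sketch below relates the RSS and limit values from the "Memory Statistics" log line in the first message to the 12GB of RAM. It assumes "12GB" means 12 × 1024 MB and that the log reports the same kind of MB; both are assumptions. The helper name is made up for illustration.

```python
# Values taken from the "Memory Statistics (MB)" log line in the first message.
RSS_MB = 9849.67
LIMIT_MB = 7142.40

# Assumption: "12GB of RAM" means 12 * 1024 MB, in the same unit the log uses.
RAM_MB = 12 * 1024


def percent_of_ram(mb, ram_mb=RAM_MB):
    """Express a memory figure as a percentage of total RAM."""
    return 100.0 * mb / ram_mb


headroom_mb = RAM_MB - RSS_MB

print("RSS   : %.1f%% of RAM" % percent_of_ram(RSS_MB))
print("limit : %.1f%% of RAM" % percent_of_ram(LIMIT_MB))
print("RAM left outside the RangeServer RSS: %.0f MB" % headroom_mb)
```

Under those assumptions the RangeServer's RSS was around 80% of the machine's RAM, leaving roughly 2,400 MB for everything else on the VM, which may be relevant to the swap spike mentioned earlier.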

Tomaž Hauzer

Jan 4, 2016, 5:01:45 AM
to Hypertable User, do...@hypertable.com
I got more info about the server, if this helps.
All physical servers run on RAID10 with four disks.
Each server runs a few virtual machines, and one of them is the RangeServer machine.

Tomaž Hauzer

Jan 22, 2016, 4:01:15 AM
to Hypertable User
Hi,

yesterday we had another crash of the Range Server.
Our setup is three servers:
 - first server - running hyperspace
 - second server - running master
 - third server - running range server

All of this is running on Hadoop HDFS.

The crash is basically the same as before. But this time I couldn't find anything in the logs, at least nothing unusual. The only clue we have is that there was a processor load spike when this happened. Could this be the cause of the crash?
As before, is there anything more I can check? Can I look for anything specific that could help?

Cheers, Tomaz Hauzer

Doug Judd

Jan 22, 2016, 12:10:22 PM
to hypertable-user
Hi Tomaz,

Sorry for the lag in response.  Do you have the monitoring UI available?  If so, look at the graphs for some of the RangeServers over a 1 week period and let me know if you see anything that looks odd such as growing CPU consumption or Load Average.  I'm currently chasing a problem down for a customer that is reporting such a problem.  Also, what version of Hypertable are you running?

- Doug

Tomaž Hauzer

Jan 26, 2016, 5:54:06 AM
to Hypertable User, do...@hypertable.com
Hi Doug,

we are using version 0.9.8.6 of Hypertable.
The Hypertable monitoring UI was disabled because it was taking too much space. But I checked in Zabbix and nothing unusual was happening last week.