Restarting HBase

shubham agarwal

Nov 24, 2015, 6:44:33 AM
to Async HBase
Hi,
I accidentally deleted some DataNodes from the HBase cluster. Now whenever I try to restart the HBase master and RegionServers, they keep trying to split WALs and then fail after some time. Can someone help me with how to restart them?
The RegionServer logs:


2015-11-24 11:26:00,548 INFO  [RS_LOG_REPLAY_OPS-] util.FSHDFSUtils: recoverLease=false, attempt=14 on file=hdfs://localhost/hbase/WALs/,1448343272089-splitting/%2C60020%2C1448343272089..meta.1448343280481.meta after 837759ms
2015-11-24 11:26:58,137 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=3.29 MB, freeSize=3.13 GB, max=3.13 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=89, evicted=0, evictedPerRun=0.0
2015-11-24 11:27:04,654 INFO  [RS_LOG_REPLAY_OPS-:60020-0] util.FSHDFSUtils: recoverLease=false, attempt=15 on file=hdfs://localhost/hbase/WALs/,60020,1448343272089-splitting/%2C60020%2C1448343272089..meta.1448343280481.meta after 901865ms
2015-11-24 11:27:04,654 WARN  [RS_LOG_REPLAY_OPS-:60020-0] util.FSHDFSUtils: Cannot recoverLease after trying for 900000ms (hbase.lease.recovery.timeout); continuing, but may be DATALOSS!!!; attempt=15 on file=hdfs://localhost/hbase/WALs/,60020,1448343272089-splitting/.visa.com%2C60020%2C1448343272089..meta.1448343280481.meta after 901865ms
2015-11-24 11:27:04,922 WARN  [RS_LOG_REPLAY_OPS-:60020-0] wal.WALFactory: Lease should have recovered. This is not expected. Will retry
java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1837467880-10.211.26.203-1439511762417:blk_1073763333_22515; getBlockSize()=83; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[127.0.0.1:50010,DS-b74ae2a5-b2d6-42df-848d-c1aee5cfc112,DISK]]}
at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:386)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:329)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:265)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:257)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1492)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:298)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:290)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:266)
at org.apache.hadoop.hbase.wal.WALSplitter.getReader(WALSplitter.java:839)
at org.apache.hadoop.hbase.wal.WALSplitter.getReader(WALSplitter.java:763)
at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:304)
at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:242)
at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:104)
at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-11-24 11:31:58,137 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=3.29 MB, freeSize=3.13 GB, max=3.13 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=119, evicted=0, evictedPerRun=0.0
2015-11-24 11:32:05,451 ERROR [RS_LOG_REPLAY_OPS-:60020-0] wal.WALFactory: Can't open after 300 attempts and 300797ms  for hdfs://localhost/hbase/WALs/,60020,1448343272089-splitting/%2C60020%2C1448343272089..meta.1448343280481.meta
2015-11-24 11:32:05,453 INFO  [RS_LOG_REPLAY_OPS--0] wal.WALSplitter: Processed 0 edits across 0 regions; edits skipped=0; log file=hdfs://localhost/hbase/WALs/,60020,1448343272089-splitting/%2C60020%2C1448343272089..meta.1448343280481.meta, length=83, corrupted=false, progress failed=false
2015-11-24 11:32:05,453 WARN  [RS_LOG_REPLAY_OPS:60020-0] regionserver.SplitLogWorker: log splitting of WALs/,60020,1448343272089-splitting/%2C60020%2C1448343272089..meta.1448343280481.meta failed, returning error
java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1837467880-10.211.26.203-1439511762417:blk_1073763333_22515; getBlockSize()=83; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[127.0.0.1:50010,DS-b74ae2a5-b2d6-42df-848d-c1aee5cfc112,DISK]]}
at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:386)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:329)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:265)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:257)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1492)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:298)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:290)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:266)
at org.apache.hadoop.hbase.wal.WALSplitter.getReader(WALSplitter.java:839)
at org.apache.hadoop.hbase.wal.WALSplitter.getReader(WALSplitter.java:763)
at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:304)
at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:242)
at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:104)
at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-11-24 11:32:05,458 INFO  [RS_LOG_REPLAY_OPS-:60020-0] coordination.ZkSplitLogWorkerCoordination: successfully transitioned task /hbase/splitWAL/WALs%2F%2C60020%2C1448343272089-splitting%2Fsl73operadbd001.visa.com%252C60020%252C1448343272089..meta.1448343280481.meta to final state ERR ,60020,1448363517961
2015-11-24 11:32:05,464 INFO  [RS_LOG_REPLAY_OPS-:60020-0] handler.WALSplitterHandler: worker ,60020,1448363517961 done with task org.apache.hadoop.hbase.coordination.ZkSplitLogWorkerCoordination$ZkSplitTaskDetails@a61bf66 in 1202744ms
2015-11-24 11:36:58,137 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=3.29 MB, freeSize=3.13 GB, max=3.13 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=149, evicted=0, evictedPerRun=0.0
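
For context on the failure above: the repeated "Cannot obtain block length for LocatedBlock" errors usually mean the old WAL file under the -splitting directory is still open for write in HDFS, so the reader cannot determine its last block length until the NameNode recovers the file's lease. Below is a minimal sketch, not from this thread, of triggering that lease recovery through the standard HDFS client API; the class name RecoverWalLease and the argument handling are hypothetical, and it assumes the cluster's core-site.xml/hdfs-site.xml are on the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Hypothetical helper: asks the NameNode to recover the lease on a stuck
// -splitting WAL file so that its last block length becomes readable again.
public class RecoverWalLease {
  public static void main(String[] args) throws Exception {
    // e.g. hdfs://localhost/hbase/WALs/<server>-splitting/<wal file>
    Path wal = new Path(args[0]);
    Configuration conf = new Configuration();   // picks up the cluster config from the classpath
    FileSystem fs = FileSystem.get(wal.toUri(), conf);
    if (!(fs instanceof DistributedFileSystem)) {
      throw new IllegalStateException("Not an HDFS path: " + wal);
    }
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    boolean closed = dfs.recoverLease(wal);      // true if the file is already closed/recovered
    System.out.println("recoverLease(" + wal + ") returned " + closed);
  }
}

On recent Hadoop versions (2.7+) the same recovery can be triggered from the shell with hdfs debug recoverLease -path <file> -retries <n>. If the WAL holds nothing recoverable anyway (the split above reports length=83 and 0 edits), sidelining the -splitting directory out of /hbase/WALs before restarting is another option, at the cost of losing whatever edits it contained, which is the DATALOSS risk the log itself warns about.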

stand...@gmail.com

Jan 22, 2016, 5:05:29 AM
to Async HBase
Hi,

I am having the same issue. Did you manage to resolve this at all, and if so how?

Many thanks.