Problem with namenode going to safe mode


mich.ta...@gmail.com

Feb 28, 2021, 12:59:30 PM
to Google Cloud Dataproc Discussions
Hi,

Over Friday and this weekend we created a new GCP Dataproc cluster with one master and two worker nodes. This provides the latest Spark (3.1.1).

This was working OK. However, due to an unplanned shutdown, the master node has gone into safe mode, so spark-shell could not start.

I performed the following as root

 hdfs dfsadmin -safemode get
Safe mode is ON

Then I tried to make it leave safe mode

 hdfs dfsadmin -safemode leave
Safe mode is OFF
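Note that forcing safe mode off only lifts the NameNode's write restriction; it does not bring back any missing DataNodes. As a sanity check before writing (a sketch, assuming a standard HDFS install on the master), the block health and safe-mode state can be inspected:

```shell
# Check overall block health; safe mode after a restart usually means the
# NameNode has not yet received enough block reports from DataNodes.
hdfs fsck / | tail -n 20

# Confirm the safe-mode state actually changed after 'leave':
hdfs dfsadmin -safemode get
```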

However, when I try to put a file into /tmp it comes back with error

 hdfs dfs -put log4j.properties /tmp
2021-02-28 17:37:58,101 WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/log4j.properties._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 0 datanode(s) running and 0 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2278)
        at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2808)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:905)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1029)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:957)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2957)
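The key line in the exception is "There are 0 datanode(s) running": with no live DataNodes, no block can be written regardless of safe mode. One way to confirm this from the master (a sketch) is to ask the NameNode for its cluster report:

```shell
# Summarise DataNode liveness as seen by the NameNode; given the error
# above, this would be expected to show zero live datanodes.
hdfs dfsadmin -report | grep -E 'Live datanodes|Dead datanodes'
```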

Any idea how this can be resolved?

Thanks,

Mich

i...@google.com

Mar 1, 2021, 4:02:05 PM
to Google Cloud Dataproc Discussions

Hello,

You need to inspect the HDFS DataNode logs on the worker nodes to understand why the DataNodes cannot join the cluster and register with the NameNode.
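For example (a sketch; the service name and log path are assumptions based on a typical Dataproc/Hadoop image and may differ on yours):

```shell
# Run on each worker node. Check whether the DataNode service is up:
sudo systemctl status hadoop-hdfs-datanode

# Inspect its recent log output for startup or registration errors
# (log location is an assumption; it varies by image version):
sudo tail -n 50 /var/log/hadoop-hdfs/hadoop-hdfs-datanode-*.log

# If the service simply died during the unplanned shutdown, restart it:
sudo systemctl restart hadoop-hdfs-datanode
```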

mich.ta...@gmail.com

Mar 1, 2021, 4:08:43 PM
to Google Cloud Dataproc Discussions
Thanks, I created a new Dataproc cluster.