Hi All,
Here are some tips for quickly debugging the 'Bootstrap Failed' error.
We may add them to the Troubleshooting guide for M5.
When creating, starting, or configuring a cluster, if some nodes report
'Bootstrap Failed', follow these steps to find the reason:
ssh serengeti@node_ip
cat /var/chef/cache/chef-stacktrace.out
This shows the error log of the 'chef-client' process started by
Serengeti.
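When several nodes fail at once, the two commands above can be wrapped in a small loop. This is only a sketch: the function name show_stacktraces is made up for illustration, and it assumes you can ssh to every node as the serengeti user.

```shell
#!/bin/sh
# show_stacktraces NODE_IP...
# Print the chef-client stack trace from each node listed on the command line.
# (Hypothetical helper; it only repeats the ssh + cat steps described above.)
show_stacktraces() {
    stacktrace=/var/chef/cache/chef-stacktrace.out
    for node_ip in "$@"; do
        echo "===== $node_ip ====="
        ssh "serengeti@$node_ip" "cat $stacktrace" \
            || echo "could not read $stacktrace on $node_ip"
    done
}
```

For example, 'show_stacktraces node_ip1 node_ip2' prints one trace per node, each under its own header line.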
Here are some typical errors we have seen:
1)
Generated at 2013-05-26 23:53:44 -0400
Errno::EHOSTUNREACH:
remote_file[/etc/yum.repos.d/cloudera-cdh4.repo]
(hadoop_common::add_repo line 45) had an error:
Errno::EHOSTUNREACH: No route to host - connect(2)
/usr/lib/ruby/1.9.1/net/http.rb:644:in `initialize'
/usr/lib/ruby/1.9.1/net/http.rb:644:in `open'
/usr/lib/ruby/1.9.1/net/http.rb:644:in `block in connect'
/usr/lib/ruby/1.9.1/timeout.rb:44:in `timeout'
/usr/lib/ruby/1.9.1/timeout.rb:89:in `timeout'
/usr/lib/ruby/1.9.1/net/http.rb:644:in `connect'
This happens because the yum server cannot be reached from the node.
Ensure the yum server is running and the yum repo URL
http://.../cloudera-cdh4.repo can be reached.
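One way to confirm this from the failed node is to fetch the repo URL by hand. The sketch below assumes curl is installed on the node. The function name check_repo_url is hypothetical, and the URL is passed in as an argument because it depends on your yum server.

```shell
#!/bin/sh
# check_repo_url URL
# Return 0 if URL can be fetched from this machine, non-zero otherwise.
# (Hypothetical helper; assumes curl is available on the node.)
check_repo_url() {
    url=$1
    # -f: treat HTTP errors as failures; -sS: quiet, but still print errors
    if curl -fsS -o /dev/null --connect-timeout 5 "$url"; then
        echo "OK: $url is reachable"
    else
        echo "FAIL: cannot fetch $url" >&2
        return 1
    fi
}
```

Run it on the node that reported 'Bootstrap Failed', not on your own workstation, since the route that matters is the one from the node to the yum server.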
2) ERROR: package[hadoop] (hadoop_cluster::namenode line 438) has had an error ...
This happens because yum cannot install the rpm named 'hadoop'. The
root cause might be that the rpm, or an rpm it depends on, is missing
from the yum server (which means the OVA or the code has a problem),
or that the yum server was not created correctly (in which case it
needs to be recreated).
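To check whether the node can see the package at all, you can ask yum directly. This is a rough sketch, not part of Serengeti: the function name check_yum_package is made up, and it relies on 'yum list' exiting with a non-zero status when no matching package is found.

```shell
#!/bin/sh
# check_yum_package NAME
# Return 0 if yum knows a package called NAME (installed or available).
# (Hypothetical helper; run it on the node that failed.)
check_yum_package() {
    pkg=$1
    if yum -q list "$pkg" >/dev/null 2>&1; then
        echo "OK: yum can see package '$pkg'"
    else
        echo "FAIL: yum cannot find package '$pkg'" >&2
        return 1
    fi
}
```

For example, 'check_yum_package hadoop'. If the package is visible but the install still fails, 'yum deplist hadoop' lists its dependencies, which may help spot a missing rpm.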
3) For other errors, log in to the node as root or serengeti and run
'sudo chef-client' to get the full log of 'chef-client'.
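If you want to keep that log for later, something like the sketch below saves a copy while still printing it. The function name run_chef_client and the default log path are made up for illustration, and the '-l debug' option (the chef-client log level) is my addition to get more detail; drop it if the output is too noisy.

```shell
#!/bin/sh
# run_chef_client [LOGFILE]
# Run chef-client with debug logging and save a copy of the output.
# (Hypothetical helper; the default log path is an assumption.)
run_chef_client() {
    logfile=${1:-/tmp/chef-client-debug.log}
    # 2>&1 so that errors end up in the saved log as well
    sudo chef-client -l debug 2>&1 | tee "$logfile"
}
```

For example, 'run_chef_client /tmp/bootstrap.log', then attach that file when reporting the problem.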
Thanks
Jesse Hu | Project Serengeti