Tips for quickly debugging 'Bootstrap Failed'


Jesse Hu

May 27, 2013, 2:45:55 AM
to serenge...@googlegroups.com
Hi All,

Here are some tips for quickly debugging the 'Bootstrap Failed' error. We may add them to the Troubleshooting guide for M5.

When creating, starting, or configuring a cluster, if some nodes report 'Bootstrap Failed', follow these steps to find the cause.

ssh serengeti@node_ip
cat /var/chef/cache/chef-stacktrace.out  

This shows the error log of the 'chef-client' process started by Serengeti.

Here are some typical errors we have seen:
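If several nodes fail at once, the two commands above can be wrapped in a small loop. This is only a sketch: the function name fetch_stacktrace and the SSH override variable are made up for illustration, and it assumes the serengeti user can ssh into every node.

```shell
#!/bin/sh
# Hypothetical helper: print the chef-client stack trace from each node.
# SSH can be overridden (e.g. for a dry run); it defaults to the real ssh.
SSH="${SSH:-ssh}"

fetch_stacktrace() {
    # $1 is the node IP; the user and the path are the ones given above.
    "$SSH" "serengeti@$1" cat /var/chef/cache/chef-stacktrace.out
}

# Node IPs are passed as arguments to the script.
for node_ip in "$@"; do
    echo "===== $node_ip ====="
    fetch_stacktrace "$node_ip"
done
```

Save it under any name (say fetch_traces.sh) and run it with the IPs of the failed nodes as arguments.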

1)
Generated at 2013-05-26 23:53:44 -0400
Errno::EHOSTUNREACH: remote_file[/etc/yum.repos.d/cloudera-cdh4.repo] (hadoop_common::add_repo line 45) had an error: Errno::EHOSTUNREACH: No route to host - connect(2)
/usr/lib/ruby/1.9.1/net/http.rb:644:in `initialize'
/usr/lib/ruby/1.9.1/net/http.rb:644:in `open'
/usr/lib/ruby/1.9.1/net/http.rb:644:in `block in connect'
/usr/lib/ruby/1.9.1/timeout.rb:44:in `timeout'
/usr/lib/ruby/1.9.1/timeout.rb:89:in `timeout'
/usr/lib/ruby/1.9.1/net/http.rb:644:in `connect'

This is because the yum server is not available. Ensure the yum server is running and that the yum repo URL http://.../cloudera-cdh4.repo can be reached from the node.

2) ERROR: package[hadoop] (hadoop_cluster::namenode line 438) has had an error ...

This is because the yum server doesn't have an rpm named 'hadoop'. The root cause might be that the rpm, or an rpm it depends on, is not on the yum server (which means the OVA or the code has a problem), or that the yum server was not created correctly (in which case it needs to be recreated).
 
3) For other errors, log in to the node as root or serengeti and run 'sudo chef-client' to get the full 'chef-client' log.
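The three cases above could be checked on the node with a few helper functions. This is a rough sketch, not something shipped with Serengeti: the names check_repo_url, check_package and triage_stacktrace are hypothetical, the patterns only match the sample messages shown above, and the repo URL has to be supplied by you. It assumes curl and yum are installed on the node.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical. YUM can be overridden for a dry run.
YUM="${YUM:-yum}"

# Case 1: is the repo file reachable? Pass the full repo URL yourself.
check_repo_url() {
    if [ -z "$1" ]; then
        echo "usage: check_repo_url <repo-url>" >&2
        return 2
    fi
    if curl -fsS -o /dev/null --connect-timeout 5 "$1"; then
        echo "reachable: $1"
    else
        echo "NOT reachable: $1" >&2
        return 1
    fi
}

# Case 2: does yum know the package, and what does it depend on?
check_package() {
    if "$YUM" list "$1" >/dev/null 2>&1; then
        echo "$1: known to yum; dependencies follow"
        "$YUM" deplist "$1"
    else
        echo "$1: not found in any enabled repo" >&2
        return 1
    fi
}

# Guess which case a stack trace belongs to (defaults to the path above).
triage_stacktrace() {
    trace="${1:-/var/chef/cache/chef-stacktrace.out}"
    if [ ! -r "$trace" ]; then
        echo "cannot read $trace" >&2
        return 2
    fi
    if grep -qF 'Errno::EHOSTUNREACH' "$trace"; then
        echo "case 1: host unreachable; check the yum server and the repo URL"
    elif grep -qF 'package[' "$trace"; then
        echo "case 2: package error; check the rpm and its dependencies on the yum server"
    else
        echo "case 3: unrecognized; run 'sudo chef-client' for the full log"
    fi
}
```

After sourcing the file on the node, run triage_stacktrace with no arguments first, then check_repo_url with your repo URL or check_package hadoop depending on what it reports.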

Thanks
Jesse Hu  | Project Serengeti