Hi,
We recently upgraded from Ironfan 3.1.6 to 4.5 and noticed that knife cluster kick is broken and knife cluster sync does not seem to be working as expected. Below are the details about the Ironfan cluster configuration and the workflow followed when the issue was observed.
Ironfan gem version: 4.5
Ironfan-homebase git-sha: c1d68b90bfe3bc2e798248a407791b16c806283d
cluster configuration:
Ironfan.cluster 'test0' do
  cloud(:ec2) do
    permanent false
    availability_zones ['us-east-1d']
    flavor 't1.micro'
    backing 'ebs'
    image_name 'natty'
    bootstrap_distro 'ubuntu10.04-ironfan'
    chef_client_script 'client.rb'
    mount_ephemerals
  end
  environment :development
  role :systemwide
  role :chef_client
  role :ssh
  role :org_base
  node_attributes = { :a => 1, :b => 2 }
  cluster_overrides = Mash.new(node_attributes)
  facet :foobar do
    instances 1
    facet_role do
      override_attributes(node_attributes)
    end
  end
  cluster_role.override_attributes(cluster_overrides)
end
KNIFE CLUSTER KICK:
-------------------------------------
$ bundle exec knife cluster kick test0-foobar-0
Error:
INFO: Inventoried 1 computers
+----------------+-------+---------+----------+------------+-------------+------------+---------------+----------------+------------+--------------+--------------+---------+-----------+------------+
| Name | Chef? | State | Flavor | AZ | Env | MachineID | Public IP | Private IP | Created On | Image | Volumes | SSH Key | Startable | Launchable |
+----------------+-------+---------+----------+------------+-------------+------------+---------------+----------------+------------+--------------+--------------+---------+-----------+------------+
| test0-foobar-0 | yes | running | t1.micro | us-east-1d | development | i-a8cf6dd7 | 23.22.194.215 | 10.242.117.187 | 2012-11-16 | ami-fd589594 | vol-55a81329 | test0 | no | no |
+----------------+-------+---------+----------+------------+-------------+------------+---------------+----------------+------------+--------------+--------------+---------+-----------+------------+
DEBUG: connection established
INFO: negotiating protocol version
DEBUG: remote is `SSH-2.0-OpenSSH_5.8p1 Debian-1ubuntu3'
DEBUG: local is `SSH-2.0-Ruby/Net::SSH_2.2.2 x86_64-darwin11.0.0'
-- snip --
DEBUG: received packet nr 5 type 51 len 28
DEBUG: allowed methods: publickey
ERROR: all authorization methods failed (tried publickey)
/Users/abhi/.rvm/gems/ruby-1.9.3-p286/gems/chef-10.16.2/lib/chef/knife/ssh.rb:101:in `block in session': undefined method `each' for nil:NilClass (NoMethodError)
from /Users/abhi/.rvm/gems/ruby-1.9.3-p286/gems/net-ssh-multi-1.1/lib/net/ssh/multi/session.rb:499:in `call'
from /Users/abhi/.rvm/gems/ruby-1.9.3-p286/gems/net-ssh-multi-1.1/lib/net/ssh/multi/session.rb:499:in `block in next_session'
from /Users/abhi/.rvm/gems/ruby-1.9.3-p286/gems/net-ssh-multi-1.1/lib/net/ssh/multi/session.rb:499:in `catch'
from /Users/abhi/.rvm/gems/ruby-1.9.3-p286/gems/net-ssh-multi-1.1/lib/net/ssh/multi/session.rb:499:in `rescue in next_session'
from /Users/abhi/.rvm/gems/ruby-1.9.3-p286/gems/net-ssh-multi-1.1/lib/net/ssh/multi/session.rb:482:in `next_session'
from /Users/abhi/.rvm/gems/ruby-1.9.3-p286/gems/net-ssh-multi-1.1/lib/net/ssh/multi/server.rb:138:in `session'
from /Users/abhi/.rvm/gems/ruby-1.9.3-p286/gems/net-ssh-multi-1.1/lib/net/ssh/multi/session_actions.rb:36:in `block (2 levels) in sessions'
KNIFE CLUSTER SYNC:
--------------------------------------
Changed node_attributes to { :a => 100, :b => 100 }:
Ironfan.cluster 'test0' do
  cloud(:ec2) do
    permanent false
    availability_zones ['us-east-1d']
    flavor 't1.micro'
    backing 'ebs'
    image_name 'natty'
    bootstrap_distro 'ubuntu10.04-ironfan'
    chef_client_script 'client.rb'
    mount_ephemerals
  end
  environment :development
  role :systemwide
  role :chef_client
  role :ssh
  role :org_base
  node_attributes = { :a => 100, :b => 100 }
  cluster_overrides = Mash.new(node_attributes)
  facet :foobar do
    instances 1
    facet_role do
      override_attributes(node_attributes)
    end
  end
  cluster_role.override_attributes(cluster_overrides)
end
Ran the cluster sync command after changing the cluster definition:
$ bundle exec knife cluster sync test0-foobar-0
Checking the node attributes on Opscode shows that the values are unchanged: they are the same as they were after the initial bootstrap, before the sync.
However, running a bootstrap again does sync the values:
$ bundle exec knife cluster bootstrap test0-foobar-0 -y
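In case it helps to reproduce, the comparison described above (what the cluster definition asks for versus what the Chef server reports) can be sketched in plain Ruby. This is only an illustration: the hashes are typed in by hand from the values in this report, and the helper name stale_attributes is made up for this example; it is not part of Ironfan or Chef.

```ruby
# Hypothetical helper (not an Ironfan or Chef API): returns the keys whose
# value on the server differs from what the cluster definition asks for.
def stale_attributes(expected, actual)
  expected.each_with_object({}) do |(key, want), stale|
    got = actual[key]
    stale[key] = { :expected => want, :actual => got } unless got == want
  end
end

# Values from the cluster definition after the edit.
expected = { :a => 100, :b => 100 }
# Values seen on the node after `knife cluster sync` (unchanged since the first bootstrap).
after_sync = { :a => 1, :b => 2 }
# Values seen on the node after re-running `knife cluster bootstrap`.
after_bootstrap = { :a => 100, :b => 100 }

p stale_attributes(expected, after_sync)
p stale_attributes(expected, after_bootstrap)
```

With the values from this report, the first call lists both :a and :b as stale and the second returns an empty hash, which matches what I see on Opscode.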
Please let me know if any other information is required to diagnose the issue. Also, is there a 4.x version of Ironfan in which sync and kick are known to work? Perhaps I could test with that version and confirm whether this is an issue specific to 4.5.
Note: The above issue was also observed with Ironfan gem 4.4.3.
--
Abhi