Iterating through all of the nodes is slow, so I would much rather use search instead. The problem is that, for search, Chef indexes all nested attributes as if they were top-level attributes, which makes it easy to find values buried deep within the node structure. This feature frustrates using search for cluster discovery because of some unfortunate attribute choices in other configuration environments:
What happened? goldencap and chimpmark are configured to talk to the HBase residing on greeneggs via the hbase.cluster_name attribute. Maybe this was a poor choice of attribute name for indicating which HBase cluster to talk to (it was my choice, actually). But it brings up a general problem with Chef search and attribute collisions that we need to be aware of.
Another feature of Chef search is that node attributes are also indexed by their full path. A node's node.hbase.cluster_name attribute gets indexed as both "cluster_name:<value>" and "hbase_cluster_name:<value>":
chris@basqueseed:~/.ssh$ knife exec -E 'nodes.search("hbase_cluster_name:greeneggs") {|n| puts "#{n.node_name} #{n.cluster_name} #{n.hbase.cluster_name}" }' | sort
chimpmark-master-0 chimpmark greeneggs
chimpmark-slave-0 chimpmark greeneggs
chimpmark-slave-10 chimpmark greeneggs
chimpmark-slave-11 chimpmark greeneggs
chimpmark-slave-12 chimpmark greeneggs
chimpmark-slave-13 chimpmark greeneggs
chimpmark-slave-14 chimpmark greeneggs
chimpmark-slave-15 chimpmark greeneggs
chimpmark-slave-16 chimpmark greeneggs
chimpmark-slave-17 chimpmark greeneggs
chimpmark-slave-18 chimpmark greeneggs
chimpmark-slave-1 chimpmark greeneggs
chimpmark-slave-2 chimpmark greeneggs
chimpmark-slave-3 chimpmark greeneggs
chimpmark-slave-4 chimpmark greeneggs
chimpmark-slave-5 chimpmark greeneggs
chimpmark-slave-6 chimpmark greeneggs
chimpmark-slave-7 chimpmark greeneggs
chimpmark-slave-8 chimpmark greeneggs
chimpmark-slave-9 chimpmark greeneggs
goldencap-nikko-0 goldencap greeneggs
goldencap-twscraper-0 goldencap greeneggs
goldencap-twscraper-1 goldencap greeneggs
goldencap-twstream-0 goldencap greeneggs
greeneggs-alpha greeneggs greeneggs
greeneggs-beta greeneggs greeneggs
greeneggs-delta-0 greeneggs greeneggs
greeneggs-delta-1 greeneggs greeneggs
greeneggs-delta-2 greeneggs greeneggs
greeneggs-delta-3 greeneggs greeneggs
greeneggs-delta-4 greeneggs greeneggs
greeneggs-delta-5 greeneggs greeneggs
greeneggs-delta-6 greeneggs greeneggs
greeneggs-gamma-0 greeneggs greeneggs
greeneggs-gamma-1 greeneggs greeneggs
greeneggs-gamma-2 greeneggs greeneggs
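The collision can be reproduced with a small model of the indexing behavior described above. This is only a sketch in plain Ruby, not Chef's actual indexer: the function name flatten_for_index and the sample attribute hash are hypothetical, and the model assumes that each leaf value is indexed under its bare key and under its underscore-joined full path.

```ruby
# Simplified model of the flattening described in the text (NOT Chef's real
# indexer). Each leaf value is recorded under its bare key, and nested leaves
# are additionally recorded under their underscore-joined full path.
def flatten_for_index(attrs, path = [], index = Hash.new { |h, k| h[k] = [] })
  attrs.each do |key, value|
    full_path = path + [key.to_s]
    if value.is_a?(Hash)
      flatten_for_index(value, full_path, index)
    else
      index[key.to_s] << value                                # bare leaf key
      index[full_path.join("_")] << value if full_path.size > 1 # full path
    end
  end
  index
end

# Hypothetical attributes for a chimpmark node that talks to greeneggs' HBase.
chimpmark_node = {
  "cluster_name" => "chimpmark",
  "hbase"        => { "cluster_name" => "greeneggs" }
}

index = flatten_for_index(chimpmark_node)
index.each { |field, values| puts "#{field}: #{values.join(', ')}" }
```

In this model the cluster_name field holds both "chimpmark" and "greeneggs", which is why a search on the bare key matches nodes from the wrong cluster, while hbase_cluster_name holds only "greeneggs".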
I propose that cluster chef mark nodes in an unequivocal way, so that it can search for them and get the answer it expects. Concretely, this means adding a top-level "clusterchef" attribute that contains cluster_name, cluster_facet, and cluster_facet_index. The top-level "clusterchef" attribute gives us a unique namespace that we can use with the Chef search interface. (Note that the top-level cluster_name attribute will still be there.)
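As a rough illustration of the proposal, the sketch below builds the namespaced attribute and a matching search query. Both helper names (clusterchef_attributes and cluster_query) are hypothetical, and the query field names assume the full-path indexing described earlier, so that clusterchef.cluster_name is searchable as clusterchef_cluster_name.

```ruby
# Hypothetical helper: the proposed top-level "clusterchef" attribute for a node.
def clusterchef_attributes(cluster_name, facet, facet_index)
  {
    "clusterchef" => {
      "cluster_name"        => cluster_name,
      "cluster_facet"       => facet,
      "cluster_facet_index" => facet_index
    }
  }
end

# Hypothetical helper: a search query against the full-path field names, which
# are assumed to be the underscore-joined attribute paths.
def cluster_query(cluster_name, facet = nil)
  terms = ["clusterchef_cluster_name:#{cluster_name}"]
  terms << "clusterchef_cluster_facet:#{facet}" if facet
  terms.join(" AND ")
end

attrs = clusterchef_attributes("greeneggs", "delta", 0)
puts attrs["clusterchef"]["cluster_facet"]
puts cluster_query("greeneggs")
puts cluster_query("greeneggs", "delta")
```

The resulting query string could be passed to nodes.search in knife exec, in the same way as the hbase_cluster_name query shown earlier. Because no other attribute shares the clusterchef prefix, the full-path field should match only nodes that really belong to the cluster.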