I'm trying to set up a Salt Stack master with salt-cloud on GCE, to be able to manage the infrastructure completely with salt-cloud.
I was trying to follow the instructions at e.g. https://github.com/GoogleCloudPlatform/compute-video-demo-salt and it all seems fine, except for the fairly simplified section:
"Create a Compute Engine SSH key and upload it to the metadata server. The easiest way to do this is to use the gcutil command-line utility and try to SSH from the machine back into itself."
Why do I say it is simplified? Because it doesn't seem to be written with "best practices" in mind, and the tools don't seem to work as advertised.
The first time I set up the Google Cloud SDK tools on the GCE instance, I ran ```gcloud auth login``` as I was told. It told me that to authenticate on a GCE instance I should use a service account instead of my private account for security (the "best practices" that I mentioned) or whatnot, which makes sense to me.
So I found out how to create a service account, got the email address and P12 file for it, put that file on the server and ran, as per the instructions (replacing the <> tokens with real values):
gcloud auth activate-service-account --project <my-project-id> '<service-account@developer.gserviceaccount.com>' --key-file </path/to/p12-file>
The output for this is just:
"Activated service account credentials for <service-account@developer.gserviceaccount.com>"
Hi Janne,
Thank you for reporting this and taking the time to write up the details of your (poor) experience. I think there are two issues here and I'll work on getting the guide cleaned up. The main issue is that the docs are outdated with respect to changes that have been made to 'gcloud' and 'gcutil', which I think I can address by updating the procedure. Your last traceback will take a bit more digging, but I hope to find and fix the issue as I dig into this further.
Could I ask you to reply with the output of 'salt --versions-report' and 'pip freeze' so I can make sure I'm able to reproduce the errors?
I'll report back shortly with some suggestions to get you unblocked.
Kind regards,
-erjohnso
On Thursday, August 28, 2014 7:33:58 AM UTC-7, Janne Enberg wrote:
Issues with Salt setup
Firstly, the guide suggests installing an old version of Salt and then "patching" it. Why not just a) update the instructions to a newer version (e.g. 2014.1.10 or 2014.7), or b) install the latest stable?
Google Cloud SDK Setup
The guide skips any mention of installing the Cloud SDK on the salt master and just assumes it is installed. I also *assume* I need to run gcloud auth activate-service-account here to get things working right.
After setting it up I can run "gcloud compute instances list", but trying "gcloud compute ssh salt --zone europe-west1-a" just gives me permission denied a bunch of times and then tells me to try again later because "Your SSH key has not propagated to your instance yet". No matter how long I wait, it never works.
Also, it doesn't seem to matter whether the gcloud compute ssh command works, since I could get salt-cloud working just fine without SSH working, so maybe it's better to use "gcloud compute instances list" as the test command too?
- Janne
Hi,
So I think I figured out what all the errors about the disks meant and what the core issue has been all this time.
It seems that everything in Salt is littered with horribly bad error messages. Every time I've had the error listed above, it said something like:
{u'domain': u'global', u'message': u"The resource 'projects/my-test-project/zones/europe-west1-b/disks/gw-1' was not found", u'reason': u'notFound'}
...
AttributeError: 'bool' object has no attribute 'pop'
The issue was that I had once successfully created a VM with that name already, and its boot disk was left behind, blocking creation of a new one with the same name, which totally confuses salt-cloud. I have hit that error quite a lot of times, meaning I've had a working build a lot of times, but the error messages were so bad that it was impossible to tell what the issue was.
The cause seems to be that I had probably deleted the VM I had managed to create successfully without deleting its boot disk. Deleting the boot disk is a checkbox that is unchecked by default in the GCE web UI for machines created by salt-cloud, even with the delete_boot_pd: True option. I guess this option only affects salt-cloud and doesn't actually set the "Delete boot disk when instance is deleted" option on the VM.
When I create a VM manually from the web UI with the "Delete boot disk when instance is deleted" option, I don't need to worry about the checkbox when deleting the instance, so I never even thought there would be such an option to look for. Only when I was once again deleting some of the VMs I had managed to create did I notice it and realize what the issue was.
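A quick way to spot this situation before running salt-cloud again is to compare disk names against instance names. The following is only a minimal sketch in Python 3, assuming the boot disk carries the same name as its instance (as with the 'gw-1' disk in the error above). The function name and the sample listings are hypothetical; the names would have to be copied from `gcloud compute disks list` and `gcloud compute instances list` or from the web UI.

```python
def find_leftover_boot_disks(disk_names, instance_names):
    """Return disk names that have no instance of the same name.

    Assumes a boot disk is named after its instance, as with the
    'gw-1' disk in the error message above.
    """
    instances = set(instance_names)
    return sorted(name for name in disk_names if name not in instances)


# Hypothetical listings, e.g. copied by hand from
# `gcloud compute disks list` and `gcloud compute instances list`.
disks = ["gw-1", "salt", "vpn-1"]
instances = ["salt", "vpn-1"]
print(find_leftover_boot_disks(disks, instances))  # ['gw-1']
```

Any extra data disks that are intentionally named differently from their instance would also show up in the result, so the output is a list of candidates to check rather than a list of disks that are safe to delete.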
Another thing that started causing me issues very fast was the amount of memory on my f1-micro instance, since apparently salt-master takes several hundred megs of RAM to run. To make sure things work, I switched to g1-small. Maybe some swap on the f1-micro would work too, but I'm not looking into that at the moment.
After these realizations, I've easily been able to get to the state where I can create new instances, but my CentOS minions just don't work out of the box. I assume this is because on the CentOS image sshd is set up (correctly) with "PermitRootLogin no", as I get these kinds of messages in the salt-cloud output:
[INFO ] Creating GCE instance web-1 in europe-west1-b
[INFO ] Creating GCE instance vpn-1 in europe-west1-b
[INFO ] Rendering deploy script: /usr/lib/python2.6/site-packages/salt/cloud/deploy/bootstrap-salt.sh
[INFO ] Rendering deploy script: /usr/lib/python2.6/site-packages/salt/cloud/deploy/bootstrap-salt.sh
[ERROR ] Authentication failed: status code 255
[ERROR ] Failed to start Salt on Cloud VM vpn-1
[INFO ] Created Cloud VM 'vpn-1'
[ERROR ] Authentication failed: status code 255
[ERROR ] Failed to start Salt on Cloud VM web-1
[INFO ] Created Cloud VM 'web-1'
I tried to find some workarounds. I set up a new SSH key for salt, configured it with ssh_key_file in the main config and ssh_username in the profile, and added it to the project metadata's SSH keys, but for some reason those project-wide SSH keys fail to propagate to new GCE instances, at least when they are created via salt-cloud.
I then tried to set up a startup-script metadata entry with a script that creates a salt user and ~/.ssh/authorized_keys for it, but the gcloud tool is broken and crashes if I try to do anything with metadata.
# gcloud compute project-info add-metadata --metadata-from-file startup-script=startup.sh
Traceback (most recent call last):
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/gcloud/gcloud.py", line 150, in <module>
    main()
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/gcloud/gcloud.py", line 146, in main
    _cli.Execute()
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/calliope/cli.py", line 431, in Execute
    post_run_hooks=self.__post_run_hooks, kwargs=kwargs)
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/calliope/frontend.py", line 274, in _Execute
    pre_run_hooks=pre_run_hooks, post_run_hooks=post_run_hooks)
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/calliope/backend.py", line 885, in Run
    output_formatter(result)
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/calliope/backend.py", line 870, in OutputFormatter
    command_instance.Display(args, obj)
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/compute/lib/base_classes.py", line 918, in Display
    list(resources)
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/compute/lib/base_classes.py", line 881, in Run
    new_object = self.Modify(args, objects[0])
  File "/opt/google-cloud-sdk/./lib/googlecloudsdk/compute/lib/base_classes.py", line 929, in Modify
    new_object = copy.deepcopy(existing)
  File "/usr/lib64/python2.6/copy.py", line 189, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib64/python2.6/copy.py", line 338, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib64/python2.6/copy.py", line 162, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python2.6/copy.py", line 255, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python2.6/copy.py", line 162, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python2.6/copy.py", line 255, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python2.6/copy.py", line 189, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib64/python2.6/copy.py", line 338, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib64/python2.6/copy.py", line 162, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python2.6/copy.py", line 255, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python2.6/copy.py", line 162, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python2.6/copy.py", line 255, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python2.6/copy.py", line 189, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib64/python2.6/copy.py", line 329, in _reconstruct
    y.append(item)
  File "/opt/google-cloud-sdk/./lib/protorpc/messages.py", line 1087, in append
    self.__field.validate_element(value)
AttributeError: 'FieldList' object has no attribute '_FieldList__field'
If I try to create one via the web UI, the field is not a textarea but a single-line text input, so pasting the script into it puts everything on one line. I changed the input to a textarea via Chrome dev tools, but submitting the multiline text that way seems to freak out the web UI totally and it just clears the value. I confirmed this via curl -H 'Metadata-Flavor: Google' metadata/computeMetadata/v1/project/attributes/startup-script, so it is not just the web UI showing it wrong.
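The same check can be scripted. This is a minimal sketch in Python 3 (not the Python 2.6 seen in the traceback), using the URL and header from the curl command above. The function names are hypothetical, and the "metadata" host name only resolves from inside a GCE instance, so `fetch_startup_script` is defined here but not called.

```python
import urllib.request

# URL and header taken from the curl command above; the "metadata"
# host name only resolves from inside a GCE instance.
STARTUP_SCRIPT_URL = (
    "http://metadata/computeMetadata/v1/project/attributes/startup-script"
)


def build_metadata_request(url=STARTUP_SCRIPT_URL):
    """Build the request with the Metadata-Flavor header that curl sends."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def count_lines(script_text):
    """Number of lines in the stored script (1 means it was flattened)."""
    return len(script_text.splitlines())


def fetch_startup_script(timeout=5):
    """Fetch the stored script text. Only works on a GCE instance."""
    request = build_metadata_request()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On an instance, `count_lines(fetch_startup_script())` returning 1 for a script that was pasted as several lines would confirm that the value was stored flattened.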
Then I tried to pass the bash script through a quick tr '\n' ';' and pasted that to the project startup-script metadata input, and told salt-cloud to create the instances.
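One thing to watch with a plain tr '\n' ';' is that any comment line (including a shebang) ends up commenting out everything after it once the script is on a single line. The sketch below, in Python 3, shows a slightly more careful flattening that drops blank lines and whole-line comments first. The function name and the sample script are hypothetical and are not the script used in this thread.

```python
def flatten_script(script_text):
    """Join a simple shell script into one ';'-separated line.

    Blank lines and whole-line comments (including the shebang) are
    dropped, because after joining, a '#' would comment out everything
    that follows it on the line.
    """
    commands = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        commands.append(stripped)
    return "; ".join(commands)


# Hypothetical startup script, not the one from the thread.
example = """#!/bin/bash
# create the deploy user
useradd -m salt

mkdir -p /home/salt/.ssh
"""
print(flatten_script(example))  # useradd -m salt; mkdir -p /home/salt/.ssh
```

This only handles scripts made of plain one-line commands. Scripts with if/then blocks, loops, here-documents, or backslash continuations are not handled and would need a different approach.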
I get the same errors:
[ERROR ] Authentication failed: status code 255
[ERROR ] Failed to start Salt on Cloud VM vpn-1
[INFO ] Created Cloud VM 'vpn-1'
[ERROR ] Authentication failed: status code 255
[ERROR ] Failed to start Salt on Cloud VM web-1
[INFO ] Created Cloud VM 'web-1'
But after the failure I can SSH in via ssh -i /etc/salt/ssh_key.pem (same SSH key as defined in my main config) salt@vpn-1 (same SSH username as in my profiles).
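One possible reading, which is an assumption and not confirmed anywhere in this thread, is that salt-cloud attempts the login before the startup script has finished creating the user. A rough way to test that idea is to poll the login on a freshly created instance and see how long it takes to start working. The sketch below is Python 3; the function names are hypothetical, and the key path, user, and host are the ones mentioned above.

```python
import subprocess
import time


def ssh_probe(host, user="salt", key_file="/etc/salt/ssh_key.pem"):
    """Return True if a non-interactive SSH login succeeds."""
    cmd = [
        "ssh", "-i", key_file,
        "-o", "BatchMode=yes",             # never prompt for a password
        "-o", "ConnectTimeout=5",
        "-o", "StrictHostKeyChecking=no",  # fresh instance, unknown host key
        "{0}@{1}".format(user, host),
        "true",
    ]
    return subprocess.call(cmd) == 0


def seconds_until_login(host, probe=ssh_probe, attempts=30, delay=10,
                        sleep=time.sleep):
    """Poll until probe(host) succeeds.

    Returns the number of seconds slept before the first success,
    or None if every attempt failed.
    """
    for attempt in range(attempts):
        if probe(host):
            return attempt * delay
        sleep(delay)
    return None
```

Running `seconds_until_login("vpn-1")` right after the instance is created would show whether the login only starts working some time after boot. If it returns 0, timing is probably not the cause and the problem lies elsewhere.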
I'm again running out of ideas here. I've also been asking around on #salt and #gcloud on Freenode, but that hasn't been very fruitful so far.
- Janne