Jira (BOLT-592) Running Bolt task in CI to non-localhost fails

Christopher Thorn (JIRA)

Jun 13, 2018, 3:08:04 PM
to puppe...@googlegroups.com
Christopher Thorn created an issue
 
Puppet Task Runner / Bug BOLT-592
Running Bolt task in CI to non-localhost fails
Issue Type: Bug
Affects Versions: BOLT 0.20.5
Assignee: Unassigned
Created: 2018/06/13 12:07 PM
Priority: Normal
Reporter: Christopher Thorn

As a PE user who wants to run Bolt tasks on multiple nodes in CI, I expect that if I pass --nodes a list of nodes along with the root user and password, the task will run on all of the nodes without error.
Instead, every connection fails with "Connection reset by peer".

The command I'm running in Beaker, from this fork of pe_acceptance_tests, is:

on(master, "/opt/puppetlabs/puppet/bin/bolt task run touch -m #{repo_dir}/tasks --nodes #{hosts_string} --no-host-key-check --debug --user root --password **redacted**")

Here repo_dir is a checked-out copy of enterprise_tasks, and hosts_string is a comma-separated list of the hosts the task will run on. The password is redacted here, but it is the default password for our vmpooler hosts.
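
For illustration, a minimal sketch of how such a command string might be assembled from a list of hostnames. The helper name bolt_task_command and its keyword arguments are hypothetical and do not come from pe_acceptance_tests; only the bolt flags shown in the command above are used. Shell-escaping the arguments is an added precaution, since a password containing characters such as "!" could otherwise be altered by the remote shell.

```ruby
require 'shellwords'

# Hypothetical helper (not part of pe_acceptance_tests): builds the bolt
# command line used above from a list of target hostnames.
def bolt_task_command(task:, modulepath:, hostnames:, user:, password:)
  # Bolt's --nodes flag takes the targets as one comma-separated string.
  hosts_string = hostnames.join(',')
  [
    '/opt/puppetlabs/puppet/bin/bolt', 'task', 'run', task,
    '-m', modulepath,
    '--nodes', hosts_string,
    '--no-host-key-check', '--debug',
    '--user', user,
    '--password', password
  ].shelljoin # escape each argument for the shell, then join with spaces
end
```

The resulting string could then be passed to Beaker's on(master, ...) in place of the interpolated string.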

Here is the output of that command when I run it in CI:

  * Run the touch task with Bolt on all agent nodes
    
    p4nwq9sxpvfhigr.delivery.puppetlabs.net (centos6-64-1) 19:00:07$ /opt/puppetlabs/puppet/bin/bolt task run touch -m /tmp/enterprise_task_repo20180613-46158-1vbbywc/tasks --nodes p4nwq9sxpvfhigr.delivery.puppetlabs.net,uhyitpa95tl8cen.delivery.puppetlabs.net,qcxf72193e8fhvg.delivery.puppetlabs.net --no-host-key-check --debug --user root --password Qu@lity!
      Did not find config for p4nwq9sxpvfhigr.delivery.puppetlabs.net in inventory
      Did not find config for uhyitpa95tl8cen.delivery.puppetlabs.net in inventory
      Did not find config for qcxf72193e8fhvg.delivery.puppetlabs.net in inventory
      Started with 100 max thread(s)
      ModuleLoader: module 'boltlib' has unknown dependencies - it will have all other modules visible
      Did not find config for p4nwq9sxpvfhigr.delivery.puppetlabs.net in inventory
      Did not find config for uhyitpa95tl8cen.delivery.puppetlabs.net in inventory
      Did not find config for qcxf72193e8fhvg.delivery.puppetlabs.net in inventory
      Starting: task touch on p4nwq9sxpvfhigr.delivery.puppetlabs.net, uhyitpa95tl8cen.delivery.puppetlabs.net, qcxf72193e8fhvg.delivery.puppetlabs.net
      Authentication method 'gssapi-with-mic' is not available
      Running task touch with '{}' via both on ["p4nwq9sxpvfhigr.delivery.puppetlabs.net"]
      Running task touch with '{}' via both on ["uhyitpa95tl8cen.delivery.puppetlabs.net"]
      Running task run 'Task({'name' => 'touch', 'implementations' => [{'name' => 'init.rb', 'path' => '/tmp/enterprise_task_repo20180613-46158-1vbbywc/tasks/touch/tasks/init.rb', 'requirements' => []}], 'description' => 'Touch file, used for testing', 'parameters' => {}})' on p4nwq9sxpvfhigr.delivery.puppetlabs.net
      Running task run 'Task({'name' => 'touch', 'implementations' => [{'name' => 'init.rb', 'path' => '/tmp/enterprise_task_repo20180613-46158-1vbbywc/tasks/touch/tasks/init.rb', 'requirements' => []}], 'description' => 'Touch file, used for testing', 'parameters' => {}})' on uhyitpa95tl8cen.delivery.puppetlabs.net
      Running task touch with '{}' via both on ["qcxf72193e8fhvg.delivery.puppetlabs.net"]
      Running task run 'Task({'name' => 'touch', 'implementations' => [{'name' => 'init.rb', 'path' => '/tmp/enterprise_task_repo20180613-46158-1vbbywc/tasks/touch/tasks/init.rb', 'requirements' => []}], 'description' => 'Touch file, used for testing', 'parameters' => {}})' on qcxf72193e8fhvg.delivery.puppetlabs.net
      {"node":"p4nwq9sxpvfhigr.delivery.puppetlabs.net","status":"failure","result":{"_error":{"kind":"puppetlabs.tasks/connect-error","msg":"Failed to connect to p4nwq9sxpvfhigr.delivery.puppetlabs.net: Connection reset by peer","details":{},"issue_code":"CONNECT_ERROR"}}}
      {"node":"uhyitpa95tl8cen.delivery.puppetlabs.net","status":"failure","result":{"_error":{"kind":"puppetlabs.tasks/connect-error","msg":"Failed to connect to uhyitpa95tl8cen.delivery.puppetlabs.net: Connection reset by peer","details":{},"issue_code":"CONNECT_ERROR"}}}
      {"node":"qcxf72193e8fhvg.delivery.puppetlabs.net","status":"failure","result":{"_error":{"kind":"puppetlabs.tasks/connect-error","msg":"Failed to connect to qcxf72193e8fhvg.delivery.puppetlabs.net: Connection reset by peer","details":{},"issue_code":"CONNECT_ERROR"}}}
      Finished: task touch with 3 failures in 0.33 sec
      Started on p4nwq9sxpvfhigr.delivery.puppetlabs.net...
      Started on uhyitpa95tl8cen.delivery.puppetlabs.net...
      Started on qcxf72193e8fhvg.delivery.puppetlabs.net...
      Failed on p4nwq9sxpvfhigr.delivery.puppetlabs.net:
        Failed to connect to p4nwq9sxpvfhigr.delivery.puppetlabs.net: Connection reset by peer
      Failed on uhyitpa95tl8cen.delivery.puppetlabs.net:
        Failed to connect to uhyitpa95tl8cen.delivery.puppetlabs.net: Connection reset by peer
      Failed on qcxf72193e8fhvg.delivery.puppetlabs.net:
        Failed to connect to qcxf72193e8fhvg.delivery.puppetlabs.net: Connection reset by peer
      Failed on 3 nodes: p4nwq9sxpvfhigr.delivery.puppetlabs.net,uhyitpa95tl8cen.delivery.puppetlabs.net,qcxf72193e8fhvg.delivery.puppetlabs.net
      Ran on 3 nodes in 0.42 seconds
    
    p4nwq9sxpvfhigr.delivery.puppetlabs.net (centos6-64-1) executed in 1.65 seconds
    Exited: 2
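
The per-node results in the debug output above are single-line JSON objects, so a run's failures can be summarized mechanically. The following is a rough sketch, assuming the output has been captured as a string and that result lines keep the shape shown above; failed_nodes is a hypothetical helper, not part of Bolt or Beaker.

```ruby
require 'json'

# Hypothetical helper: scans captured bolt --debug output and returns one
# hash per node whose JSON result line reports status "failure".
def failed_nodes(debug_output)
  debug_output.each_line.map do |line|
    text = line.strip
    next unless text.start_with?('{')
    begin
      result = JSON.parse(text)
    rescue JSON::ParserError
      next # not a complete JSON object, e.g. a lone "{" in pretty output
    end
    next unless result['status'] == 'failure'
    error = result.dig('result', '_error') || {}
    { node: result['node'], kind: error['kind'], msg: error['msg'] }
  end.compact
end
```

Applied to the output above, this would report all three nodes with kind puppetlabs.tasks/connect-error.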

If I change hosts_string to just localhost, the Bolt task runs fine. Here is an example of that from CI:

 * Run the touch task with Bolt on all agent nodes
    
    l2x5gg1orjym6n2.delivery.puppetlabs.net (centos6-64-1) 18:40:22$ /opt/puppetlabs/puppet/bin/bolt task run touch -m /tmp/enterprise_task_repo20180613-45700-fw4dpb/tasks --nodes localhost --no-host-key-check --debug --user root --password Qu@lity!
      Started with 100 max thread(s)
      ModuleLoader: module 'boltlib' has unknown dependencies - it will have all other modules visible
      Starting: task touch on localhost
      Running task touch with '{}' via both on ["localhost"]
      Running task run 'Task({'name' => 'touch', 'implementations' => [{'name' => 'init.rb', 'path' => '/tmp/enterprise_task_repo20180613-45700-fw4dpb/tasks/touch/tasks/init.rb', 'requirements' => []}], 'description' => 'Touch file, used for testing', 'parameters' => {}})' on localhost
      Running '/tmp/d20180613-23485-qruah7/init.rb' with {}
      Started on localhost...
      {"node":"localhost","status":"success","result":{"_output":""}}
      Finished: task touch with 0 failures in 0.08 sec
      Finished on localhost:
       
        {
        }
      Successful on 1 node: localhost
      Ran on 1 node in 0.17 seconds

All of the examples above were run with Jenkins acting as the Beaker test runner.
If I use my personal laptop as the Beaker test runner and pass a list of nodes to --nodes, Bolt tasks work perfectly. This only fails in CI.

Looking at a module job that uses Bolt (here), it only runs Bolt against localhost.

Is there an example elsewhere of Bolt being used in our CI that is not using --nodes localhost?


Christopher Thorn (JIRA)

Jun 13, 2018, 3:11:02 PM
to puppe...@googlegroups.com
Christopher Thorn commented on Bug BOLT-592
 
Re: Running Bolt task in CI to non-localhost fails

ping Michael Smith, Nick Lewis. This is the issue that I brought up last week, but I've simplified it enough to not be using certs.

David Kramer (JIRA)

Jun 19, 2018, 12:25:03 PM
to puppe...@googlegroups.com

Nick Walker (JIRA)

Jun 19, 2018, 4:06:03 PM
to puppe...@googlegroups.com
Nick Walker commented on Bug BOLT-592
 
Re: Running Bolt task in CI to non-localhost fails

Yasmin Rajabi, Michael Smith, Nick Lewis: is this something that can be worked on soon? It is currently blocking the installer team's ability to work on TOTES.

Let us know either way so we can plan.

Michael Smith (JIRA)

Jun 19, 2018, 4:12:04 PM
to puppe...@googlegroups.com
Michael Smith commented on Bug BOLT-592

We added it to our board, so hopefully we'll take a look at it this week.

Nick Walker (JIRA)

Jun 19, 2018, 4:13:03 PM
to puppe...@googlegroups.com
Nick Walker commented on Bug BOLT-592

Excellent, thanks for the update.

Alex Dreyer (JIRA)

Jun 19, 2018, 4:19:03 PM
to puppe...@googlegroups.com
Alex Dreyer commented on Bug BOLT-592

Do you have an ssh key configured for these nodes in ~/.ssh/config on your laptop but not in CI? What happens when you use ssh://localhost? (localhost is special and will not use the ssh transport by default.)

This does not look like a bolt bug to me. I suggest sshing onto the node beaker is running on in CI and trying to connect from there. If that doesn't work, pairing will probably be faster than treating this as a bug.
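
One way to script that connectivity check is sketched below, under the assumption that Ruby is available on the node running the test. It opens a TCP connection to the SSH port and reads the server's identification line, which separates "the port is unreachable or the connection is reset" from "the server answers but authentication fails". ssh_banner is a hypothetical helper; it uses only Ruby's standard socket library, not Bolt.

```ruby
require 'socket'
require 'timeout'

# Hypothetical probe: returns the SSH server's identification line
# (normally something like "SSH-2.0-..."), nil if nothing arrives within
# the timeout, or an "error: ..." string if the connection fails.
def ssh_banner(host, port: 22, timeout: 5)
  Socket.tcp(host, port, connect_timeout: timeout) do |sock|
    # Wait until the server has sent something, then read one line.
    sock.gets&.chomp if IO.select([sock], nil, nil, timeout)
  end
rescue SystemCallError, SocketError, Timeout::Error => e
  # Covers refused, reset, unreachable, DNS failure and timeouts.
  "error: #{e.class}: #{e.message}"
end
```

For example, calling ssh_banner on each host in the node list from the CI node would show whether the reset happens before or after the SSH identification exchange.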

Christopher Thorn (JIRA)

Jun 19, 2018, 5:16:03 PM
to puppe...@googlegroups.com

Alex Dreyer I've tried to simplify my testing scenario down to the basics, so I don't believe SSH is configured on any of the nodes, unless Beaker is handling that as part of the provision/setup process pre-pre-suite?

I do agree with you that this isn't a Bolt bug, but more likely a bug in how I'm setting up my testing environment in our CI. But I'm at my wits' end trying to work out what I could possibly be doing wrong.
Are there any examples in the modules that are using beaker-task-helper to install Bolt, that are not using localhost?

I need to find time to get my testing environment set up with Jenkins, then I'll try the ssh://localhost and get back to you.

Alex Dreyer (JIRA)

Jun 19, 2018, 5:23:04 PM
to puppe...@googlegroups.com
Alex Dreyer commented on Bug BOLT-592

I meant ssh being configured on your laptop is probably the reason your tests work there but not in CI.

Christopher Thorn (JIRA)

Jun 22, 2018, 1:54:03 PM
to puppe...@googlegroups.com

Alex Dreyer here is what happens when I'm running with ssh://localhost:

    gh5tgyl2936nnlp.delivery.puppetlabs.net (oracle6-64-1) 17:51:07$ /opt/puppetlabs/puppet/bin/bolt task run touch -m /tmp/enterprise_task_repo20180622-4466-dkxp7s/tasks --nodes ssh://localhost --no-host-key-check --debug --user root --password Qu@lity!
      Did not find config for ssh://localhost in inventory
      Started with 100 max thread(s)
      ModuleLoader: module 'boltlib' has unknown dependencies - it will have all other modules visible
      Did not find config for ssh://localhost in inventory
      Starting: task touch on ssh://localhost
      Authentication method 'gssapi-with-mic' is not available
      Running task touch with '{}' via both on ["ssh://localhost"]
      Running task run 'Task({'name' => 'touch', 'implementations' => [{'name' => 'init.rb', 'path' => '/tmp/enterprise_task_repo20180622-4466-dkxp7s/tasks/touch/tasks/init.rb', 'requirements' => []}], 'description' => 'Touch file, used for testing', 'parameters' => {}})' on ssh://localhost
      {"node":"ssh://localhost","status":"failure","result":{"_error":{"kind":"puppetlabs.tasks/connect-error","msg":"Failed to connect to ssh://localhost: Connection reset by peer","details":{},"issue_code":"CONNECT_ERROR"}}}
      Finished: task touch with 1 failure in 0.14 sec
      Started on localhost...
      Failed on localhost:
        Failed to connect to ssh://localhost: Connection reset by peer
      Failed on 1 node: ssh://localhost
      Ran on 1 node in 0.24 seconds
    
    gh5tgyl2936nnlp.delivery.puppetlabs.net (oracle6-64-1) executed in 1.52 seconds
    Exited: 2

Alex Dreyer (JIRA)

Jun 22, 2018, 5:05:02 PM
to puppe...@googlegroups.com
Alex Dreyer commented on Bug BOLT-592

Are you gem installing bolt instead of using the package?

Christopher Thorn (JIRA)

Jun 22, 2018, 5:27:03 PM
to puppe...@googlegroups.com

Alex Dreyer thanks for the tips! Turns out in our weird setup in Jenkins, we need to set ENABLE_SSH_AGENT='true' in our CJC configuration. This will in turn set up SSH properly on the agents via https://github.com/puppetlabs/ci-job-configs/blob/master/resources/scripts/integration-beaker.sh#L157-L162.
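
A quick sanity check for a setup like this is sketched below. It assumes, without having been verified against the linked script, that the setting results in an ssh-agent whose socket path is exported through the standard SSH_AUTH_SOCK environment variable. ssh_agent_socket is a hypothetical helper, not part of Beaker or Bolt.

```ruby
# Hypothetical check: returns the ssh-agent socket path if SSH_AUTH_SOCK is
# set and points at an existing socket file, otherwise nil.
def ssh_agent_socket(env = ENV)
  path = env['SSH_AUTH_SOCK']
  return nil if path.nil? || path.empty?

  File.socket?(path) ? path : nil
end
```

Logging the result of ssh_agent_socket at the start of a test run would make it obvious when a CI job is running without an agent.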
