Jira (BOLT-1397) The wait_until_available() function should return (or make accessible) details of why it timed out.


Joshua Partlow (JIRA)

Jun 14, 2019, 6:23:04 PM
to puppe...@googlegroups.com
Joshua Partlow created an issue
 
Puppet Task Runner / Improvement BOLT-1397
The wait_until_available() function should return (or make accessible) details of why it timed out.
Issue Type: Improvement
Assignee: Unassigned
Created: 2019/06/14 3:22 PM
Priority: Normal
Reporter: Joshua Partlow

If a user has a basic connection problem, such as needing --no-host-key-check or --user=root, Bolt provides a useful error message explaining the problem.

It would be nice if the error returned by wait_until_available() included the underlying reason for the failure, or at least made it possible to find that reason via _catch_errors.

Example of a host key error:

jpartlow@work1804:~/work/src/enterprise_tasks$ bolt plan run enterprise_tasks::testing::install_pe master=n7il9hevr7qj949.delivery.puppetlabs.net tarball=/home/jpartlow/pe_builds/puppet-enterprise-2019.2.0-rc1-158-ga04d97d-ubuntu-18.04-amd64.tar
Starting: plan enterprise_tasks::testing::install_pe                                                                                                                                                               
Install_pe: Checking connectivity to infrastructure nodes.                                                                                                                                                         
Starting: command 'true' on n7il9hevr7qj949.delivery.puppetlabs.net                                                                                                                                                
Finished: command 'true' with 1 failure in 0.11 sec                                                                                                                                                                
Finished: plan enterprise_tasks::testing::install_pe in 0.13 sec                                                                                                                                                   
{                                                                                                                                                                                                                  
  "kind": "bolt/run-failure",                                                                                                                                                                                      
  "msg": "Plan aborted: run_command 'true' failed on 1 nodes",                                                                                                                                                     
  "details": {                                                                                                                                                                                                     
    "action": "run_command",                        
    "object": "true",                                     
    "result_set": [                                                
      {                                             
        "node": "n7il9hevr7qj949.delivery.puppetlabs.net",      
        "target": "n7il9hevr7qj949.delivery.puppetlabs.net",
        "action": null,       
        "object": null,
        "status": "failure",
        "result": {
          "_error": {
            "kind": "puppetlabs.tasks/connect-error",
            "msg": "Host key verification failed for n7il9hevr7qj949.delivery.puppetlabs.net: fingerprint SHA256:LBaIoBI8pLYMNsO3uNADoCkVPATgBul1LImLBBoSP+s is unknown for \"n7il9hevr7qj949.delivery.puppetlabs.net,10.16.126.186\"",
            "details": {                                  
            },                                                           
            "issue_code": "HOST_KEY_ERROR"               
          }                                                     
        }
      }                      
    ]                                                           
  }           
}     

 

But if a plan calls wait_until_available(), a generic timeout error is returned.

jpartlow@work1804:~/work/src/enterprise_tasks$ bolt plan run enterprise_tasks::testing::install_pe master=n7il9hevr7qj949.delivery.puppetlabs.net tarball=/home/jpartlow/pe_builds/puppet-enterprise-2019.2.0-rc1-158-ga04d97d-ubuntu-18.04-amd64.tar
Starting: plan enterprise_tasks::testing::install_pe
Install_pe: Checking connectivity to infrastructure nodes.
Starting: wait until available on n7il9hevr7qj949.delivery.puppetlabs.net
Finished: wait until available with 1 failure in 0.14 sec
Finished: plan enterprise_tasks::testing::install_pe in 0.16 sec
{        
  "kind": "bolt/run-failure",
  "msg": "Plan aborted: wait_until_available failed on 1 nodes",
  "details": {
    "action": "wait_until_available",
    "object": null,                                                                                                                                                                                                
    "result_set": [                                   
      {                                             
        "node": "n7il9hevr7qj949.delivery.puppetlabs.net",
        "target": "n7il9hevr7qj949.delivery.puppetlabs.net",       
        "action": null,                                           
        "object": null,                            
        "status": "failure",                                   
        "result": {
          "_error": {        
            "kind": "puppetlabs.tasks/exception-error",       
            "issue_code": "EXCEPTION",
            "msg": "Timed out waiting for target",
            "details": {
              "class": "Bolt::Executor::TimeoutError",
              "stack_trace": "/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/bolt-1.22.0/lib/bolt/executor.rb:310:in `wait_until'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/bolt-1.22.0/lib/bolt/executor.rb:$92:in `block (3 levels) in wait_until_available'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/bolt-1.22.0/lib/bolt/executor.rb:290:in `block (2 levels) in wait_until_available'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/bolt-1.22.0/lib/bolt/executor.rb:100:in `block (3 levels) in queue_execute'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:348:in `run_task'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:337:in `block (3 levels) in create_worker'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:3$0:in `loop'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:320:in `block (2 levels) in create_worker'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:319:in `catch'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/concurrent-ruby-1.1.4/lib/concurrent/executor/ruby_thread_pool_executor.rb:319:in `block in create_worker'\\n/opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/logging-2.2.2/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'"
            }
          }
        }
      }
    ]
  }
}

This message was sent by Atlassian JIRA (v7.7.1#77002-sha1:e75ca93)

Alex Dreyer (JIRA)

Jun 14, 2019, 6:35:02 PM
to puppe...@googlegroups.com
Alex Dreyer commented on Improvement BOLT-1397
 
Re: The wait_until_available() function should return (or make accessible) details of why it timed out.

This probably means tracking the last error received from each host and making it available in the details of the timeout error.
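
A minimal sketch of that idea, for illustration only. It is not Bolt's implementation: WaitTimeoutError, the 'last_error' details key, and this standalone wait_until are hypothetical names, and the sketch assumes the polled block raises when a connection attempt fails.

```ruby
# Hypothetical error class; Bolt's real class is Bolt::Executor::TimeoutError.
class WaitTimeoutError < StandardError
  attr_reader :details

  def initialize(msg, details = {})
    super(msg)
    @details = details
  end
end

# Polls the block until it returns a truthy value or the timeout elapses.
# Assumption: the block raises a StandardError when a connection attempt fails.
def wait_until(timeout, retry_interval)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  last_error = nil

  loop do
    begin
      return true if yield
    rescue StandardError => e
      # Remember the most recent failure instead of discarding it.
      last_error = e
    end

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      details = {}
      if last_error
        details['last_error'] = {
          'class' => last_error.class.name,
          'msg' => last_error.message
        }
      end
      raise WaitTimeoutError.new('Timed out waiting for target', details)
    end

    sleep(retry_interval)
  end
end
```

With something along these lines, a host key failure would still surface as a timeout, but the underlying message would be readable from the error's details rather than lost.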

Joshua Partlow (JIRA)

Jun 14, 2019, 7:33:02 PM
to puppe...@googlegroups.com

Some notes from Cas: "I think that instead of batch_connected and connected? simply rescuing connection errors and returning false, they would need to pass along the connection error.
https://github.com/puppetlabs/bolt/blob/b062fa3faed7fa8600ef17275a46933b37a12bc6/lib/bolt/transport/base.rb#L175
https://github.com/puppetlabs/bolt/blob/b062fa3faed7fa8600ef17275a46933b37a12bc6/lib/bolt/transport/ssh.rb#L79-L83
That way, instead of the timeout error, we could pass through the connection error."
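
To make the contrast concrete, here is a rough sketch of the two shapes being discussed. SketchTransport, ConnectError, and connection_status are hypothetical stand-ins and not Bolt's transport API; the connected? method only mirrors the behavior as described in the notes above.

```ruby
# Hypothetical stand-in for a transport connection error.
class ConnectError < StandardError; end

# Hypothetical transport; the block passed to new attempts the connection
# and is assumed to raise ConnectError on failure.
class SketchTransport
  def initialize(&connector)
    @connector = connector
  end

  # Behavior as described in the notes: the error is rescued and only
  # false comes back, so the reason for the failure is lost.
  def connected?(target)
    @connector.call(target)
    true
  rescue ConnectError
    false
  end

  # One possible alternative: report the status together with the error,
  # so a caller that eventually times out can pass the error along.
  def connection_status(target)
    @connector.call(target)
    { connected: true, error: nil }
  rescue ConnectError => e
    { connected: false, error: e }
  end
end
```

A caller polling connection_status could keep the most recent :error value and attach it to the timeout error, or raise it directly once the wait expires.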
