[JIRA] [ec2-plugin] (JENKINS-32915) Spot instance launched one after another until capacity reached for single task in queue


deejay.zajac+jenkins@gmail.com (JIRA)

Feb 12, 2016, 7:54:03 AM
to jenkinsc...@googlegroups.com
Mateusz Zając created an issue
 
Jenkins / Bug JENKINS-32915
Spot instance launched one after another until capacity reached for single task in queue
Issue Type: Bug
Assignee: Francis Upton
Attachments: Capture.PNG, Capture2.PNG
Components: ec2-plugin
Created: 12/Feb/16 12:53 PM
Environment: Jenkins ver. 1.647 (Java 7)
Slaves: Java 8 from Oracle
EC2 plugin 1.31
Launched in a VPC in AWS along with the slaves. Open ports: 443, 22, 9090 (for slave communication). Communication within the VPC is allowed only over private IP addresses.
Labels: ec2-plugin plugin spot
Priority: Minor
Reporter: Mateusz Zając

With a single item in the queue, no slaves attached, and master executors set to 0, the plugin starts launching Spot instances, as designed. However, it does not stop after the first one: it launches one Spot instance after another until the instance capacity is reached. It then keeps attempting to provision until the queued task is picked up by a worker, which finishes launching after about 3 minutes.

logs ALL:

Feb 12, 2016 12:17:38 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Launching ami-6c645106 for template Jenkins Build Agent
Feb 12, 2016 12:17:39 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Spot instance id in provision: sir-02g4tk2d
Feb 12, 2016 12:17:39 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 0
Feb 12, 2016 12:17:39 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
Started provisioning Jenkins Build Agent (ami-6c645106) from ec2-Amazon profile with 1 executors. Remaining excess workload: -0.282
Feb 12, 2016 12:17:39 PM INFO hudson.plugins.ec2.EC2Cloud$1 call
Expected - Spot instance null failed to connect on initial provision
Feb 12, 2016 12:17:47 PM INFO hudson.slaves.NodeProvisioner$2 run
Jenkins Build Agent (ami-6c645106) provisioning successfully completed. We have now 3 computer(s)
Feb 12, 2016 12:17:48 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Launching ami-6c645106 for template Jenkins Build Agent
Feb 12, 2016 12:17:49 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Spot instance id in provision: sir-02g3vkvj
Feb 12, 2016 12:17:49 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 0
Feb 12, 2016 12:17:49 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
Started provisioning Jenkins Build Agent (ami-6c645106) from ec2-Amazon profile with 1 executors. Remaining excess workload: -0.254
Feb 12, 2016 12:17:49 PM INFO hudson.plugins.ec2.EC2Cloud$1 call
Expected - Spot instance null failed to connect on initial provision
Feb 12, 2016 12:17:57 PM INFO hudson.slaves.NodeProvisioner$2 run
Jenkins Build Agent (ami-6c645106) provisioning successfully completed. We have now 4 computer(s)
Feb 12, 2016 12:17:58 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Launching ami-6c645106 for template Jenkins Build Agent
Feb 12, 2016 12:17:59 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Spot instance id in provision: sir-02g254a9
Feb 12, 2016 12:17:59 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 0
Feb 12, 2016 12:17:59 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
Started provisioning Jenkins Build Agent (ami-6c645106) from ec2-Amazon profile with 1 executors. Remaining excess workload: -0.229
Feb 12, 2016 12:17:59 PM INFO hudson.plugins.ec2.EC2Cloud$1 call
Expected - Spot instance null failed to connect on initial provision
Feb 12, 2016 12:18:07 PM INFO hudson.slaves.NodeProvisioner$2 run
Jenkins Build Agent (ami-6c645106) provisioning successfully completed. We have now 5 computer(s)
Feb 12, 2016 12:18:08 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Launching ami-6c645106 for template Jenkins Build Agent
Feb 12, 2016 12:18:09 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Spot instance id in provision: sir-02g5r01b
Feb 12, 2016 12:18:09 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 0
Feb 12, 2016 12:18:09 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
Started provisioning Jenkins Build Agent (ami-6c645106) from ec2-Amazon profile with 1 executors. Remaining excess workload: -0.206
Feb 12, 2016 12:18:09 PM INFO hudson.plugins.ec2.EC2Cloud$1 call
Expected - Spot instance null failed to connect on initial provision
Feb 12, 2016 12:18:17 PM INFO hudson.slaves.NodeProvisioner$2 run
Jenkins Build Agent (ami-6c645106) provisioning successfully completed. We have now 6 computer(s)
Feb 12, 2016 12:18:19 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:18:29 PM INFO hudson.plugins.ec2.EC2Cloud provisionSlaveIfPossible
Cannot provision - no capacity for instances: -3
Feb 12, 2016 12:18:29 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:18:39 PM INFO hudson.plugins.ec2.EC2Cloud provisionSlaveIfPossible
Cannot provision - no capacity for instances: -3
Feb 12, 2016 12:18:39 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:18:49 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:18:59 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:19:09 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:19:17 PM INFO hudson.model.AsyncPeriodicWork$1 run
Started EC2 alive slaves monitor
Feb 12, 2016 12:19:19 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:19:19 PM INFO hudson.plugins.ec2.EC2SlaveMonitor execute
EC2 instance is dead: null
Feb 12, 2016 12:19:19 PM INFO hudson.plugins.ec2.EC2SpotSlave terminate
Canceled Spot request: sir-02g5r01b
Feb 12, 2016 12:19:20 PM INFO hudson.model.AsyncPeriodicWork$1 run
Finished EC2 alive slaves monitor. 2,391 ms
Feb 12, 2016 12:19:28 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Launching ami-6c645106 for template Jenkins Build Agent
Feb 12, 2016 12:19:29 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Spot instance id in provision: sir-02fzgvxj
Feb 12, 2016 12:19:29 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 0
Feb 12, 2016 12:19:29 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
Started provisioning Jenkins Build Agent (ami-6c645106) from ec2-Amazon profile with 1 executors. Remaining excess workload: -0.089
Feb 12, 2016 12:19:29 PM INFO hudson.plugins.ec2.EC2Cloud$1 call
Expected - Spot instance null failed to connect on initial provision
Feb 12, 2016 12:19:37 PM INFO hudson.slaves.NodeProvisioner$2 run
Jenkins Build Agent (ami-6c645106) provisioning successfully completed. We have now 6 computer(s)
Feb 12, 2016 12:19:39 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:19:43 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #86 from /10.0.0.186:44954
Feb 12, 2016 12:19:43 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #86 with /10.0.0.186:44954 is aborted: 381e79a9-c093-47df-9d79-0b921c35135d is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:19:43 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #87 from /10.0.0.186:44955
Feb 12, 2016 12:19:43 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #87 with /10.0.0.186:44955 is aborted: 381e79a9-c093-47df-9d79-0b921c35135d is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:19:43 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #88 from /10.0.0.186:44956
Feb 12, 2016 12:19:43 PM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #88 failed
java.io.EOFException
	at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
	at java.io.DataInputStream.readUTF(DataInputStream.java:589)
	at java.io.DataInputStream.readUTF(DataInputStream.java:564)
	at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:150)

Feb 12, 2016 12:19:49 PM INFO hudson.plugins.ec2.EC2Cloud provision
Attempting provision - finished, excess workload: 1
Feb 12, 2016 12:19:53 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #89 from /10.0.0.88:47938
Feb 12, 2016 12:19:54 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #90 from /10.0.0.234:45467
Feb 12, 2016 12:19:56 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #91 from /10.0.0.9:37386
Feb 12, 2016 12:24:41 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #92 from /10.0.0.9:37394
Feb 12, 2016 12:24:41 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #92 with /10.0.0.9:37394 is aborted: b5c0b783-ccf3-44dd-ad2a-e89f18327dc2 is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:24:41 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #93 from /10.0.0.9:37395
Feb 12, 2016 12:24:41 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #93 with /10.0.0.9:37395 is aborted: b5c0b783-ccf3-44dd-ad2a-e89f18327dc2 is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:24:41 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #94 from /10.0.0.9:37396
Feb 12, 2016 12:24:41 PM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #94 failed
java.io.EOFException
	at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
	at java.io.DataInputStream.readUTF(DataInputStream.java:589)
	at java.io.DataInputStream.readUTF(DataInputStream.java:564)
	at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:150)

Feb 12, 2016 12:24:42 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #95 from /10.0.0.88:47946
Feb 12, 2016 12:24:42 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #95 with /10.0.0.88:47946 is aborted: 82aeacc2-cf06-4cca-8374-9b9c0033f8e9 is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:24:42 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #96 from /10.0.0.88:47947
Feb 12, 2016 12:24:42 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #96 with /10.0.0.88:47947 is aborted: 82aeacc2-cf06-4cca-8374-9b9c0033f8e9 is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:24:42 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #97 from /10.0.0.88:47948
Feb 12, 2016 12:24:42 PM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #97 failed
java.io.EOFException
	at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
	at java.io.DataInputStream.readUTF(DataInputStream.java:589)
	at java.io.DataInputStream.readUTF(DataInputStream.java:564)
	at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:150)

Feb 12, 2016 12:24:42 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #98 from /10.0.0.186:44961
Feb 12, 2016 12:24:42 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #98 with /10.0.0.186:44961 is aborted: 381e79a9-c093-47df-9d79-0b921c35135d is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:24:42 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #99 from /10.0.0.186:44962
Feb 12, 2016 12:24:42 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #99 with /10.0.0.186:44962 is aborted: 381e79a9-c093-47df-9d79-0b921c35135d is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:24:42 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #100 from /10.0.0.186:44963
Feb 12, 2016 12:24:42 PM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #100 failed
java.io.EOFException
	at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
	at java.io.DataInputStream.readUTF(DataInputStream.java:589)
	at java.io.DataInputStream.readUTF(DataInputStream.java:564)
	at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:150)

Feb 12, 2016 12:24:43 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #101 from /10.0.0.234:45488
Feb 12, 2016 12:24:43 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #101 with /10.0.0.234:45488 is aborted: 01df9f0d-6b5d-4051-b944-dffc698ae1d3 is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:24:43 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #102 from /10.0.0.234:45489
Feb 12, 2016 12:24:43 PM WARNING jenkins.slaves.JnlpSlaveHandshake error
TCP slave agent connection handler #102 with /10.0.0.234:45489 is aborted: 01df9f0d-6b5d-4051-b944-dffc698ae1d3 is already connected to this master. Rejecting this connection.
Feb 12, 2016 12:24:43 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #103 from /10.0.0.234:45490
Feb 12, 2016 12:24:43 PM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #103 failed
java.io.EOFException
	at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
	at java.io.DataInputStream.readUTF(DataInputStream.java:589)
	at java.io.DataInputStream.readUTF(DataInputStream.java:564)
	at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:150)

Feb 12, 2016 12:24:50 PM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted connection #104 from /10.0.0.243:48286

Screenshots of the configuration are attached.

This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265)

deejay.zajac+jenkins@gmail.com (JIRA)

Feb 12, 2016, 7:56:02 AM
to jenkinsc...@googlegroups.com
Mateusz Zając updated an issue
Change By: Mateusz Zając
With a single item in the queue, no slaves attached, and master executors set to 0, the plugin starts launching Spot instances, as designed. However, it does not stop after the first one: it launches one Spot instance after another until the instance capacity is reached. It then keeps attempting to provision until the queued task is picked up by a worker, which finishes launching after about 3 minutes.

logs ALL:

{code:java}
{code}


Screenshots of the configuration are attached.

Please contact me here or via email if I can help in any way.

deejay.zajac+jenkins@gmail.com (JIRA)

Feb 25, 2016, 4:43:02 AM
to jenkinsc...@googlegroups.com
Mateusz Zając started work on Bug JENKINS-32915
 
Change By: Mateusz Zając
Status: Open → In Progress

scm_issue_link@java.net (JIRA)

May 3, 2016, 11:57:01 PM
to jenkinsc...@googlegroups.com

Code changed in jenkins
User: James Judd
Path:
src/main/java/hudson/plugins/ec2/EC2AbstractSlave.java
src/main/java/hudson/plugins/ec2/EC2Cloud.java
http://jenkins-ci.org/commit/ec2-plugin/ac5574e23654ae1a7cff8be6bced44b3d6320470
Log:
JENKINS-32915 (#193)

  • JENKINS-32915: Corrected horizontal scaling when the cloud/job label is null, i.e. not defined in the configuration. Better handling of improper Jenkins core management of excessWorkload: if an instance takes 5 minutes to wake up, Jenkins does not take it into consideration and tries to provision more until capacity is reached or the queued item is finally picked up. Added a log message stating that provisioning a spot instance will not be possible if the label is not configured. The configuration should make it explicit to the user that the label is needed.
  • JENKINS-32915: Amendments to match tested bottom up code to minimize risk of improper behaviour.
  • JENKINS-32915: Refactored code to make it more readable and perform better
  • JENKINS-32915: Refactored code to scale out properly. Added warning messages, information messages, amended future task to hold for timeout time
  • Cleaning up PR as requested by @francisu
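
For readers following the commit log above, here is a minimal sketch of the idea it describes: counting Spot requests that have been placed but whose agents have not yet connected, so that they are not requested again on the next provisioning pass. This is an illustrative model only, not the code from the commit. The class `SpotProvisioningSketch`, its fields, and its methods are all hypothetical names, and the instance cap is an assumed parameter.

```java
/**
 * Illustrative model only; this is NOT the ec2-plugin code.
 * Class, field, and method names are hypothetical.
 */
final class SpotProvisioningSketch {
    private final int instanceCap;      // assumed upper bound on instances
    private final boolean countPending; // true = pending requests count as supply
    private int pending;                // Spot requests placed, agent not yet connected
    private int connected;              // agents that have connected

    SpotProvisioningSketch(int instanceCap, boolean countPending) {
        this.instanceCap = instanceCap;
        this.countPending = countPending;
    }

    /** Runs one provisioning pass and returns how many Spot requests it placed. */
    int provision(int queuedTasks) {
        // Supply that is already available or, optionally, already on its way.
        int supply = connected + (countPending ? pending : 0);
        int excessWorkload = Math.max(0, queuedTasks - supply);
        int remainingCapacity = Math.max(0, instanceCap - (pending + connected));
        int toLaunch = Math.min(excessWorkload, remainingCapacity);
        pending += toLaunch;
        return toLaunch;
    }

    /** Marks one pending Spot request as connected, if there is one. */
    void agentConnected() {
        if (pending > 0) {
            pending--;
            connected++;
        }
    }

    /** Total instances requested so far, connected or not. */
    int totalInstances() {
        return pending + connected;
    }
}
```

In this model, with one queued task, an assumed cap of 4, and no agent connecting in the meantime, repeated calls to `provision(1)` place four requests when `countPending` is false and only one when it is true. The first case mirrors the behaviour reported in this issue; the second is the behaviour the fix aims for.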

francisu@gmail.com (JIRA)

May 9, 2016, 2:56:03 AM
to jenkinsc...@googlegroups.com
Francis Upton closed an issue as Fixed
 

1.32

Change By: Francis Upton
Status: In Progress → Closed
Resolution: Fixed