[JIRA] (JENKINS-55618) When using spot instances, requests 3-4 nodes when one is needed

vladislav.naumov@gmail.com (JIRA)

Jan 16, 2019, 6:56:01 AM
to jenkinsc...@googlegroups.com
Vladislav Naumov created an issue
 
Jenkins / Bug JENKINS-55618
When using spot instances, requests 3-4 nodes when one is needed
Issue Type: Bug
Assignee: FABRIZIO MANFREDI
Components: ec2-plugin
Created: 2019-01-16 11:55
Labels: spot
Priority: Major
Reporter: Vladislav Naumov

When it is time to request a worker node, more than one can be requested.

The plugin requests one instance, then quickly checks whether it is up (a node needs a minute or so to come up when using spot instances), and when it is not, fires another request, then another, until one of the spot instances finally comes up in time. That one becomes the Jenkins slave; the rest linger until the Spot Marketplace kills them.

Seems to be 1.42 specific: I reverted to version 1.41 and it seems to work fine.

This is how it looks in the Jenkins log file
(notice the really short delays; only sir-bd78adpq makes it into the Jenkins slave list):

Jan 15, 2019 2:17:40 PM hudson.plugins.ec2.SlaveTemplate provisionSpot
INFO: Spot instance id in provision: sir-dvfib1pn
Jan 15, 2019 2:17:40 PM hudson.plugins.ec2.EC2Cloud provision
INFO: SlaveTemplate{ami='ami-XXXXX', labels=''}. Attempting provision finished, excess workload: 0
Jan 15, 2019 2:17:40 PM hudson.plugins.ec2.EC2Cloud provision
INFO: We have now 1 computers, waiting for 1 more
Jan 15, 2019 2:17:40 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning current_ami (ami-XXXXX) from ec2-ec2 cloud with 1 executors. Remaining excess workload: 0
Jan 15, 2019 2:17:40 PM hudson.plugins.ec2.EC2Cloud$1 call
WARNING: SlaveTemplate{ami='ami-XXXXX', labels=''}. Node terminated is neither pending, neither running, its {2}. Terminate provisioning
Jan 15, 2019 2:17:48 PM hudson.plugins.ec2.EC2Cloud provision
INFO: SlaveTemplate{ami='ami-XXXXX', labels=''}. Attempting to provision slave needed by excess workload of 1 units
Jan 15, 2019 2:17:49 PM hudson.plugins.ec2.SlaveTemplate provisionSpot
INFO: Launching ami-XXXXX for template current_ami
Jan 15, 2019 2:17:50 PM hudson.plugins.ec2.SlaveTemplate provisionSpot
INFO: Spot instance id in provision: sir-bieg943m
Jan 15, 2019 2:17:50 PM hudson.plugins.ec2.EC2Cloud provision
INFO: SlaveTemplate{ami='ami-XXXXX', labels=''}. Attempting provision finished, excess workload: 0
Jan 15, 2019 2:17:50 PM hudson.plugins.ec2.EC2Cloud provision
INFO: We have now 1 computers, waiting for 1 more
Jan 15, 2019 2:17:50 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning current_ami (ami-XXXXX) from ec2-ec2 cloud with 1 executors. Remaining excess workload: 0
Jan 15, 2019 2:17:50 PM hudson.plugins.ec2.EC2Cloud$1 call
WARNING: SlaveTemplate{ami='ami-XXXXX', labels=''}. Node terminated is neither pending, neither running, its {2}. Terminate provisioning
Jan 15, 2019 2:17:58 PM hudson.plugins.ec2.EC2Cloud provision
INFO: SlaveTemplate{ami='ami-XXXXX', labels=''}. Attempting to provision slave needed by excess workload of 1 units
Jan 15, 2019 2:17:59 PM hudson.plugins.ec2.SlaveTemplate provisionSpot
INFO: Launching ami-XXXXX for template current_ami
Jan 15, 2019 2:18:00 PM hudson.plugins.ec2.SlaveTemplate provisionSpot
INFO: Spot instance id in provision: sir-5mtr9erp
Jan 15, 2019 2:18:00 PM hudson.plugins.ec2.EC2Cloud provision
INFO: SlaveTemplate{ami='ami-XXXXX', labels=''}. Attempting provision finished, excess workload: 0
Jan 15, 2019 2:18:00 PM hudson.plugins.ec2.EC2Cloud provision
INFO: We have now 1 computers, waiting for 1 more
Jan 15, 2019 2:18:00 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning current_ami (ami-XXXXX) from ec2-ec2 cloud with 1 executors. Remaining excess workload: 0
Jan 15, 2019 2:18:00 PM hudson.plugins.ec2.EC2Cloud$1 call
WARNING: SlaveTemplate{ami='ami-XXXXX', labels=''}. Node terminated is neither pending, neither running, its {2}. Terminate provisioning
Jan 15, 2019 2:18:08 PM hudson.plugins.ec2.EC2Cloud provision
INFO: SlaveTemplate{ami='ami-XXXXX', labels=''}. Attempting to provision slave needed by excess workload of 1 units
Jan 15, 2019 2:18:09 PM hudson.plugins.ec2.SlaveTemplate provisionSpot
INFO: Launching ami-XXXXX for template current_ami
Jan 15, 2019 2:18:10 PM hudson.plugins.ec2.SlaveTemplate provisionSpot
INFO: Spot instance id in provision: sir-bd78adpq
Jan 15, 2019 2:18:10 PM hudson.plugins.ec2.EC2Cloud provision
INFO: SlaveTemplate{ami='ami-XXXXX', labels=''}. Attempting provision finished, excess workload: 0
Jan 15, 2019 2:18:10 PM hudson.plugins.ec2.EC2Cloud provision
INFO: We have now 1 computers, waiting for 1 more
Jan 15, 2019 2:18:10 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning current_ami (ami-XXXXX) from ec2-ec2 cloud with 1 executors. Remaining excess workload: 0
Jan 15, 2019 2:18:21 PM hudson.plugins.ec2.EC2Cloud$1 call
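
To check how many spot requests a single build triggered, the request ids can be pulled out of a log like the one above. This is a minimal Python sketch that assumes only the "Spot instance id in provision: sir-..." line format seen in that log; the function name spot_request_ids and the sample text are made up for illustration and are not part of the plugin.

```python
import re

# Matches the request id in lines such as
# "INFO: Spot instance id in provision: sir-dvfib1pn"
SPOT_ID = re.compile(r"Spot instance id in provision: (sir-[0-9a-z]+)")


def spot_request_ids(log_text):
    """Return the spot request ids in order of appearance, without duplicates."""
    seen = []
    for match in SPOT_ID.finditer(log_text):
        request_id = match.group(1)
        if request_id not in seen:
            seen.append(request_id)
    return seen


# Hypothetical excerpt: only the id lines from the log in this report.
sample = """\
INFO: Spot instance id in provision: sir-dvfib1pn
INFO: Spot instance id in provision: sir-bieg943m
INFO: Spot instance id in provision: sir-5mtr9erp
INFO: Spot instance id in provision: sir-bd78adpq
"""

ids = spot_request_ids(sample)
print(len(ids), "spot requests:", ", ".join(ids))
```

More than one id for a single unit of excess workload would point to the behaviour described in this report.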

This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

sylvain.lebeaupin@z-battle.net (JIRA)

Jan 17, 2019, 5:25:02 PM
to jenkinsc...@googlegroups.com
Sylvain LEBEAUPIN commented on Bug JENKINS-55618
 
Re: When using spot instances, requests 3-4 nodes when one is needed

I have exactly the same issue with version 1.42.

In my settings, I set only 1 executor (advanced section) for the spot instance.

It works fine with the previous version, 1.40.1.

Since the update to 1.42:

  • I see only 1 slave (good);
  • several spot instances are started (~4-5, argh);
  • the Jenkins logs show a provisioning attempt every 10s (jenkins.ec2.142.txt)

alistair.gilbert@basware.com (JIRA)

Mar 30, 2019, 7:22:02 AM
to jenkinsc...@googlegroups.com

harrymclaughlinwork@gmail.com (JIRA)

Apr 9, 2019, 6:25:02 AM
to jenkinsc...@googlegroups.com

Also facing the same issue; this is quite a problem as the entire benefit of spot instances is reduced user cost.

rinat_khairullin@epam.com (JIRA)

May 2, 2019, 9:30:06 AM
to jenkinsc...@googlegroups.com

rinat_khairullin@epam.com (JIRA)

May 2, 2019, 9:33:17 AM
to jenkinsc...@googlegroups.com
Rinat Khairullin commented on Bug JENKINS-55618
 
Re: When using spot instances, requests 3-4 nodes when one is needed

We are experiencing the same issue and are totally unable to use spot instances, as the plugin creates a huge number of instances, limited only by the cap value.

vladislav.naumov@gmail.com (JIRA)

May 23, 2019, 8:16:02 AM
to jenkinsc...@googlegroups.com

smishraec@gmail.com (JIRA)

May 28, 2019, 3:42:02 AM
to jenkinsc...@googlegroups.com

Vladislav Naumov, I used version 1.43 but I am still getting the same issue.

Does anyone have any clue how to resolve this issue? Please help.

fabrizio.manfredi@gmail.com (JIRA)

May 28, 2019, 3:59:02 AM
to jenkinsc...@googlegroups.com

The existing ec2-plugin algorithm for raising nodes is quite reactive and does not wait for the node to come online. This is for historical reasons: it tries to follow the peak in demand as closely as possible (and Linux nodes are billed by the second).

To fix your problem we need to implement a new algorithm, or an option to wait for nodes to come online. That will not be possible before 1.46/7; we don't have enough capacity for that at the moment.

One question: how long does your node take to be ready for use?
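
The difference between reacting on every cycle and waiting for a pending node can be illustrated with a toy model. This is only a sketch under assumptions, not the plugin's code: simulate, startup_s, tick_s and wait_for_pending are hypothetical names, the 10-second cycle is taken from the timestamps in the logs in this thread, and the 60-second startup from the "a minute or so" estimate in the report.

```python
def simulate(startup_s, tick_s, wait_for_pending, horizon_s=120):
    """Count spot requests issued for a single unit of excess workload.

    Toy model (assumption, not the plugin's algorithm): the provisioner
    re-evaluates every tick_s seconds, and a request turns into a usable
    node startup_s seconds after it was issued.
    """
    requests = []  # issue time of every spot request so far
    for now in range(0, horizon_s, tick_s):
        # Stop as soon as any requested node has had time to come online.
        if any(now - issued >= startup_s for issued in requests):
            break
        # No node is online yet, so every earlier request is still pending.
        if wait_for_pending and requests:
            continue
        requests.append(now)
    return len(requests)


print("reactive:", simulate(60, 10, wait_for_pending=False))          # 6
print("waits for pending:", simulate(60, 10, wait_for_pending=True))  # 1
```

In this model the reactive variant keeps requesting until the first node is up, so the number of extra instances grows with the ratio of startup time to cycle time, while the waiting variant issues a single request.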

smishraec@gmail.com (JIRA)

May 28, 2019, 4:03:03 AM
to jenkinsc...@googlegroups.com

My node is already in an active state, but I am not sure why the slave spot instances are not being created. Here is the error log:

SlaveTemplate{ami='ami-XXXXXXX', labels='SPOT_TEST1'}. Attempting to provision slave needed by excess workload of 1 units
May 28, 2019 4:10:40 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Launching ami-XXXXXXX for template ci-tools_test_SPOT_TEST
May 28, 2019 4:10:41 PM INFO hudson.plugins.ec2.SlaveTemplate provisionSpot
Spot instance id in provision: sir-4cqg8y5j
May 28, 2019 4:10:41 PM INFO hudson.plugins.ec2.EC2Cloud provision
SlaveTemplate{ami='ami-XXXXXXX', labels='SPOT_TEST1'}. Attempting provision finished, excess workload: 0
May 28, 2019 4:10:41 PM INFO hudson.plugins.ec2.EC2Cloud provision
We have now 13 computers, waiting for 1 more
May 28, 2019 4:10:41 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
Started provisioning EC2 (AWS) - ci-tools_test_SPOT_TEST from ec2-AWS with 1 executors. Remaining excess workload: 0
May 28, 2019 4:10:47 PM WARNING hudson.plugins.ec2.EC2Cloud$1 call
SlaveTemplate{ami='ami-XXXXXXX', labels='SPOT_TEST1'}. Node shutting-down is neither pending, neither running, its {2}. Terminate provisioning

vladislav.naumov@gmail.com (JIRA)

May 28, 2019, 4:42:02 AM
to jenkinsc...@googlegroups.com

> how long does your node take to be ready for use?

Up to a minute with spot instances. It used to be much worse a few years ago; it could take up to 5 minutes back then.

Most of this time is spent in the Spot Marketplace: an on-demand instance comes up 2-3x faster from the same AMI.

smishraec@gmail.com (JIRA)

May 29, 2019, 3:46:03 AM
to jenkinsc...@googlegroups.com

Can someone help me resolve this issue? I am stuck and unable to fix it myself.

Vladislav Naumov, could you please help me resolve this issue?

smishraec@gmail.com (JIRA)

May 29, 2019, 5:15:03 AM
to jenkinsc...@googlegroups.com

Hi Sylvain LEBEAUPIN, I am also getting the same error in my error logs while using ec2 plugin version 1.43.

If I change the version to 1.40, would spot instances work fine?

Please suggest.

vladislav.naumov@gmail.com (JIRA)

May 29, 2019, 6:27:03 AM
to jenkinsc...@googlegroups.com

> If I change the version to 1.40, would spot instances work fine?

Try rolling back to 1.41 first.
It worked fine for me.
But then again, 1.43 works for me, too.

smishraec@gmail.com (JIRA)

May 29, 2019, 10:10:03 PM
to jenkinsc...@googlegroups.com

Thanks Vladislav Naumov, I will check and test it.

BTW, do we also need to downgrade the Jenkins version? As of now I am using Jenkins version 2.138 and ec2 plugin version 1.43.

Please suggest.

fabrizio.manfredi@gmail.com (JIRA)

Aug 10, 2019, 4:00:05 PM
to jenkinsc...@googlegroups.com