[JIRA] (JENKINS-61370) EC2 instances are terminated during launch


bochenski.kuba+jenkins@gmail.com (JIRA)

Mar 6, 2020, 10:35:04 AM
to jenkinsc...@googlegroups.com
Jakub Bochenski created an issue
 
Jenkins / Bug JENKINS-61370
EC2 instances are terminated during launch
Issue Type: Bug
Assignee: FABRIZIO MANFREDI
Components: ec2-plugin
Created: 2020-03-06 15:34
Environment: Jenkins ver. 2.204.2
ec2 plugin 1.49.1
Priority: Major
Reporter: Jakub Bochenski

I'm seeing this in the logs every 10 minutes.
The monitor thread starts and then dies.
10 minutes later it repeats.

Mar 05, 2020 10:54:35 AM INFO hudson.model.AsyncPeriodicWork lambda$doRun$0

Started EC2 alive slaves monitor

Mar 05, 2020 10:54:35 AM SEVERE hudson.init.impl.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler uncaughtException

A thread (EC2 alive slaves monitor thread/10354) died unexpectedly due to an uncaught exception, this may leave your Jenkins in a bad way and is usually indicative of a bug in the code.
java.lang.NullPointerException
	at hudson.plugins.ec2.util.MinimumInstanceChecker.lambda$countQueueItemsForAgentTemplate$8(MinimumInstanceChecker.java:67)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.LongPipeline.reduce(LongPipeline.java:461)
	at java.util.stream.LongPipeline.sum(LongPipeline.java:419)
	at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:593)
	at hudson.plugins.ec2.util.MinimumInstanceChecker.countQueueItemsForAgentTemplate(MinimumInstanceChecker.java:68)
	at hudson.plugins.ec2.util.MinimumInstanceChecker.lambda$null$11(MinimumInstanceChecker.java:87)
	at java.util.ArrayList.forEach(ArrayList.java:1257)
	at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1082)
	at hudson.plugins.ec2.util.MinimumInstanceChecker.lambda$checkForMinimumInstances$12(MinimumInstanceChecker.java:76)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
	at hudson.plugins.ec2.util.MinimumInstanceChecker.checkForMinimumInstances(MinimumInstanceChecker.java:75)
	at hudson.plugins.ec2.EC2SlaveMonitor.execute(EC2SlaveMonitor.java:41)
	at hudson.model.AsyncPeriodicWork.lambda$doRun$0(AsyncPeriodicWork.java:100)
	at java.lang.Thread.run(Thread.java:748)
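
The trace shows the NullPointerException being thrown inside a lambda that a stream pipeline ending in count() evaluates, and because nothing catches it, the whole monitor thread dies. The report does not say which reference was null. The following is only a minimal sketch of one plausible way this kind of failure arises and how a null guard avoids it. It is not the plugin's code: the `Item` class, its `getLabel()` method, and both `count...` methods are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class NullSafeCount {
    // Hypothetical stand-in for a queued item; its label may be null.
    static final class Item {
        final String label;
        Item(String label) { this.label = label; }
        String getLabel() { return label; }
    }

    // Unguarded pipeline: the predicate dereferences the label directly,
    // so a single null label aborts the whole count with an NPE.
    static long countUnsafe(List<Item> items, String wanted) {
        return items.stream()
                .map(Item::getLabel)
                .filter(label -> label.equals(wanted))
                .count();
    }

    // Guarded pipeline: null labels are dropped before the predicate runs.
    static long countSafe(List<Item> items, String wanted) {
        return items.stream()
                .map(Item::getLabel)
                .filter(Objects::nonNull)
                .filter(label -> label.equals(wanted))
                .count();
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("docker"), new Item(null), new Item("docker"));
        System.out.println(countSafe(items, "docker")); // prints 2
        try {
            countUnsafe(items, "docker");
        } catch (NullPointerException e) {
            System.out.println("unguarded variant threw NullPointerException");
        }
    }
}
```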

This message was sent by Atlassian Jira (v7.13.12#713012-sha1:6e07c38)

bochenski.kuba+jenkins@gmail.com (JIRA)

Mar 6, 2020, 10:36:03 AM
to jenkinsc...@googlegroups.com
Jakub Bochenski updated an issue
Change By: Jakub Bochenski
When trying to provision a new agent the plugin would start and then terminate an EC2 instance several times before succeeding in the end.
I have no explanation for this behavior. Might be related to JENKINS-61343
Might be because the plugin somehow allocates "2 computer(s)" even though the instance cap is 1.

{code}Mar 06, 2020 3:05:22 PM INFO hudson.plugins.ec2.EC2Cloud log

bootstrap()

Mar 06, 2020 3:05:22 PM INFO hudson.plugins.ec2.EC2Cloud log

Getting keypair...

Mar 06, 2020 3:05:22 PM INFO hudson.plugins.ec2.EC2Cloud log

Using private key j4a-ec2-ssh-key (SHA-1 fingerprint a7:b4:70:08:35:11:e3:cf:4b:f5:92:57:b8:02:7f:c6:8e:54:52:02)

Mar 06, 2020 3:05:22 PM INFO hudson.plugins.ec2.EC2Cloud log

Authenticating as admin

Mar 06, 2020 3:05:22 PM INFO hudson.slaves.NodeProvisioner lambda$update$6

EC2 (ec2) - ec2 (ami-028d96c69234f9d1a) provisioning successfully completed. We have now 2 computer(s)

Mar 06, 2020 3:05:22 PM INFO hudson.plugins.ec2.EC2Cloud log

Connecting to 10.20.4.41 on port 22, with timeout 10000.

Mar 06, 2020 3:05:29 PM INFO hudson.plugins.ec2.EC2Cloud log

Connected via SSH.

Mar 06, 2020 3:05:29 PM INFO hudson.plugins.ec2.EC2Cloud log

connect fresh as root

Mar 06, 2020 3:05:29 PM INFO hudson.plugins.ec2.EC2Cloud log

Connecting to 10.20.4.41 on port 22, with timeout 10000.

Mar 06, 2020 3:05:30 PM INFO hudson.plugins.ec2.EC2Cloud log

Connected via SSH.

Mar 06, 2020 3:05:30 PM INFO hudson.plugins.ec2.EC2Cloud log

Creating tmp directory (/tmp) if it does not exist

Mar 06, 2020 3:05:30 PM INFO hudson.plugins.ec2.EC2Cloud log

Verifying: java -fullversion

Mar 06, 2020 3:05:30 PM INFO hudson.plugins.ec2.EC2Cloud log

Verifying: which scp

Mar 06, 2020 3:05:30 PM INFO hudson.plugins.ec2.EC2Cloud log

Copying remoting.jar to: /tmp

Mar 06, 2020 3:05:30 PM INFO hudson.plugins.ec2.EC2Cloud log

Launching remoting agent (via Trilead SSH2 Connection): java -jar /tmp/remoting.jar -workDir /opt/jenkins

Mar 06, 2020 3:05:31 PM INFO hudson.plugins.ec2.EC2OndemandSlave terminate

Terminated EC2 instance (terminated): i-021d76d0ffff3375f

Mar 06, 2020 3:05:31 PM INFO hudson.plugins.ec2.EC2OndemandSlave terminate

Removed EC2 instance from jenkins master: i-021d76d0ffff3375f

Mar 06, 2020 3:05:32 PM INFO hudson.plugins.ec2.EC2Cloud provision

SlaveTemplate{ami='ami-028d96c69234f9d1a', labels='docker docker-bakery'}. Attempting to provision slave needed by excess workload of 1 units

Mar 06, 2020 3:05:32 PM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo

SlaveTemplate{ami='ami-028d96c69234f9d1a', labels='docker docker-bakery'}. Considering launching
{code}

bochenski.kuba+jenkins@gmail.com (JIRA)

Mar 30, 2020, 12:25:02 PM
to jenkinsc...@googlegroups.com
Jakub Bochenski commented on Bug JENKINS-61370
 
Re: EC2 instances are terminated during launch

After changing to the "SSH process" connection method, I was able to see additional errors that are swallowed by the Trilead Java connector.

Exception in thread "main" java.nio.file.AccessDeniedException: /opt/jenkins
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
	at java.nio.file.Files.createDirectory(Files.java:674)
	at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
	at java.nio.file.Files.createDirectories(Files.java:767)
	at org.jenkinsci.remoting.engine.WorkDirManager.initializeWorkDir(WorkDirManager.java:211)

It turns out that if you mount a block device in the user-data script, the mount is sometimes not yet available when the master's SSH connection comes in, so the agent cannot create its work directory.
The solution seems to be to do the mount in the init script instead.
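
To check whether an instance is hitting this race, one could probe the work directory the same way the trace above fails, by calling `Files.createDirectories` and retrying for a while. This is only a diagnostic sketch under assumptions: the `WorkDirProbe` class, the `waitForWorkDir` method, and the attempt count and delay are all hypothetical, and the default path is simply the `-workDir` value from the launch log. It is not part of the plugin or of remoting.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WorkDirProbe {
    /**
     * Repeatedly tries to create workDir. Returns true as soon as the
     * directory exists and is writable, false if all attempts fail.
     */
    static boolean waitForWorkDir(Path workDir, int attempts, long delayMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try {
                Files.createDirectories(workDir);
                if (Files.isWritable(workDir)) {
                    return true;
                }
            } catch (AccessDeniedException e) {
                // Same failure as in the trace: the location is not writable
                // yet, for example because the volume is not mounted.
            } catch (IOException e) {
                // Any other I/O failure is also retried in this sketch.
            }
            if (i < attempts - 1) {
                Thread.sleep(delayMillis);
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Assumed default: the work directory from the launch command.
        Path workDir = Paths.get(args.length > 0 ? args[0] : "/opt/jenkins");
        boolean ready = waitForWorkDir(workDir, 30, 1000L);
        System.out.println(workDir + (ready ? " is ready" : " is still not usable"));
    }
}
```

If the probe only succeeds after several attempts, that points to the mount finishing after SSH becomes reachable, which matches the behavior described above.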
