[JIRA] (JENKINS-36242) Conflicts with ec2-plugin when there are dead spot fleet instances still registered

26 views
Skip to first unread message

jdrawneek@nationaltheatre.org.uk (JIRA)

unread,
Jun 27, 2016, 10:08:01 AM6/27/16
to jenkinsc...@googlegroups.com
John-Paul Drawneek created an issue
 
Jenkins / Bug JENKINS-36242
Conflicts with ec2-plugin when there are dead spot fleet instances still registered
Issue Type: Bug Bug
Assignee: Aleksei Besogonov
Components: ec2-fleet-plugin
Created: 2016/Jun/27 2:07 PM
Environment: 1.651.3 LTS
ec2-fleet 1.0
ec2-plugin 1.34
Priority: Major Major
Reporter: John-Paul Drawneek

I can create both spot fleet instances and spot instances from ec2-plugin.

The issue is when a spot fleet instance had been terminated it's still logged in the system and the ec2-plugin can see it. When you delete a ec2-plugin instance it removes it from AWS but can not clean up locally as it falls over the dead ec2 fleet instance. This leaves the ec2 plugin instance still configured locally but with no matching record in AWS so it will cause Jenkins to crash when it tries to start up, also you end up will lots of slave nodes listed.

I will try to capture stack traces next time it breaks and include them, priority was to get Jenkins back up.

Add Comment Add Comment
 
This message was sent by Atlassian JIRA (v7.1.7#71011-sha1:2526d7c)
Atlassian logo

jdrawneek@nationaltheatre.org.uk (JIRA)

unread,
Jun 27, 2016, 6:01:02 PM6/27/16
to jenkinsc...@googlegroups.com
John-Paul Drawneek commented on Bug JENKINS-36242
 
Re: Conflicts with ec2-plugin when there are dead spot fleet instances still registered

From log files:
Timer task hudson.slaves.ComputerRetentionWork@e76edd2 failed
java.lang.IllegalStateException: Unknown instance terminated: i-1fa5d993
at com.amazon.jenkins.ec2fleet.EC2FleetCloud.terminateInstance(EC2FleetCloud.java:266)
at com.amazon.jenkins.ec2fleet.IdleRetentionStrategy.check(IdleRetentionStrategy.java:28)
at com.amazon.jenkins.ec2fleet.IdleRetentionStrategy.check(IdleRetentionStrategy.java:12)
at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:71)
at hudson.model.Queue._withLock(Queue.java:1306)
at hudson.model.Queue.withLock(Queue.java:1189)
at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:62)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:50)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

This instance has been terminated by AWS due to cost spike.
Its still listed under slave nodes.

jdrawneek@nationaltheatre.org.uk (JIRA)

unread,
Jun 27, 2016, 6:06:01 PM6/27/16
to jenkinsc...@googlegroups.com

When I try to launch a new instance with a dead ec2-fleet instance still listed:

javax.servlet.ServletException: java.lang.IllegalStateException: Unknown instance terminated: i-b56b793d
at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:796)
at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
at org.kohsuke.stapler.MetaClass$5.doDispatch(MetaClass.java:233)
at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:58)
at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
at org.kohsuke.stapler.Stapler.invoke(Stapler.java:649)
at org.kohsuke.stapler.Stapler.service(Stapler.java:238)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:686)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1494)
at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:135)
at org.tuckey.web.filters.urlrewrite.RuleChain.handleRewrite(RuleChain.java:176)
at org.tuckey.web.filters.urlrewrite.RuleChain.doRules(RuleChain.java:145)
at org.tuckey.web.filters.urlrewrite.UrlRewriter.processRequest(UrlRewriter.java:92)
at com.marvelution.jenkins.plugins.jira.filter.UrlRewriteFilter.doFilter(UrlRewriteFilter.java:51)
at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)
at com.marvelution.jenkins.plugins.jira.filter.OAuthFilter.doFilter(OAuthFilter.java:87)
at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)
at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:201)
at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:178)
at net.bull.javamelody.PluginMonitoringFilter.doFilter(PluginMonitoringFilter.java:85)
at org.jvnet.hudson.plugins.monitoring.HudsonMonitoringFilter.doFilter(HudsonMonitoringFilter.java:102)
at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)
at hudson.plugins.audit_trail.AuditTrailFilter.doFilter(AuditTrailFilter.java:95)
at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)
at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:126)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:49)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:93)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:171)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:49)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
at org.kohsuke.stapler.DiagnosticThreadNameFilter.doFilter(DiagnosticThreadNameFilter.java:30)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:68)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:960)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1021)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)


at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Caused by: java.lang.IllegalStateException: Unknown instance terminated: i-b56b793d


at com.amazon.jenkins.ec2fleet.EC2FleetCloud.terminateInstance(EC2FleetCloud.java:266)
at com.amazon.jenkins.ec2fleet.IdleRetentionStrategy.check(IdleRetentionStrategy.java:28)
at com.amazon.jenkins.ec2fleet.IdleRetentionStrategy.check(IdleRetentionStrategy.java:12)

at hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:717)


at hudson.model.Queue._withLock(Queue.java:1306)
at hudson.model.Queue.withLock(Queue.java:1189)

at hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:714)
at hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:118)
at hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:44)
at hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:186)


at hudson.model.Queue._withLock(Queue.java:1306)
at hudson.model.Queue.withLock(Queue.java:1189)

at hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:169)
at jenkins.model.Jenkins.updateComputerList(Jenkins.java:1260)
at jenkins.model.Nodes$2.run(Nodes.java:136)


at hudson.model.Queue._withLock(Queue.java:1306)
at hudson.model.Queue.withLock(Queue.java:1189)

at jenkins.model.Nodes.addNode(Nodes.java:132)
at jenkins.model.Jenkins.addNode(Jenkins.java:1753)
at hudson.plugins.ec2.EC2Cloud.doProvision(EC2Cloud.java:343)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:320)
at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:163)
at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:96)
at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:124)
at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:58)
at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
... 79 more

me@joshma.com (JIRA)

unread,
Aug 8, 2016, 4:19:01 PM8/8/16
to jenkinsc...@googlegroups.com

Having this issue as well - this is blocking us from allocating any new nodes via EC2 cloud

pete@weargoggles.co.uk (JIRA)

unread,
Dec 29, 2016, 6:53:03 AM12/29/16
to jenkinsc...@googlegroups.com

This issue, along with a number of others, is claimed to be resolved by the PR https://github.com/jenkinsci/ec2-fleet-plugin/pull/3

Please advise whether further action is required.

Reply all
Reply to author
Forward
0 new messages