[JIRA] (JENKINS-50410) Periods of high load cause "Invalid virtual machine image" exceptions

2 views
Skip to first unread message

tyler@monkeypox.org (JIRA)

unread,
Mar 26, 2018, 12:35:03 PM3/26/18
to jenkinsc...@googlegroups.com
R. Tyler Croy created an issue
 
Jenkins / Bug JENKINS-50410
Periods of high load cause "Invalid virtual machine image" exceptions
Issue Type: Bug Bug
Assignee: Azure DevOps
Components: azure-vm-agents-plugin
Created: 2018-03-26 16:34
Priority: Critical Critical
Reporter: R. Tyler Croy

Periods of high usage (in this case on ci.jenkins.io) result in the ARM APIs apparently throttiling requests from the plugin, leading to these incorrect "invalid virtual machine image" exceptions.

For the record, we're using 100% stock virtual machines, so these just appear to be poorly handled exceptions by the plugin


Mar 26, 2018 2:45:42 PM SEVERE com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate createStorageAccount

The request is being throttled.

Mar 26, 2018 2:45:42 PM SEVERE com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate createDeployment

AzureVMManagementServiceDelegate: deployment: Unable to deploy
com.microsoft.azure.vmagent.exceptions.AzureCloudException: Failed to create storage account with account name jnju95mbzq9arxtjl19kvjma, location eastus, resourceGroupName jenkinsinfra-agents: The request is being throttled.
	at com.microsoft.azure.vmagent.exceptions.AzureCloudException.create(AzureCloudException.java:49)
	at com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate.createStorageAccount(AzureVMManagementServiceDelegate.java:2254)
	at com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate.createDeployment(AzureVMManagementServiceDelegate.java:210)
	at com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate.createDeployment(AzureVMManagementServiceDelegate.java:167)
	at com.microsoft.azure.vmagent.AzureVMAgentTemplate.provisionAgents(AzureVMAgentTemplate.java:807)
	at com.microsoft.azure.vmagent.AzureVMCloud$3.call(AzureVMCloud.java:790)
	at com.microsoft.azure.vmagent.AzureVMCloud$3.call(AzureVMCloud.java:786)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Mar 26, 2018 2:46:52 PM SEVERE com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate verifyVirtualMachineImage

Invalid virtual machine image
java.net.SocketException: Socket closed
	at java.net.SocketInputStream.read(SocketInputStream.java:204)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
	at sun.security.ssl.InputRecord.read(InputRecord.java:503)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
	at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
	at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
	at okio.Okio$2.read(Okio.java:138)
	at okio.AsyncTimeout$2.read(AsyncTimeout.java:238)
Caused: java.net.SocketTimeoutException: timeout
	at okio.Okio$3.newTimeoutException(Okio.java:210)
	at okio.AsyncTimeout.exit(AsyncTimeout.java:288)
	at okio.AsyncTimeout$2.read(AsyncTimeout.java:242)
	at okio.RealBufferedSource.indexOf(RealBufferedSource.java:325)
	at okio.RealBufferedSource.indexOf(RealBufferedSource.java:314)
	at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:210)
	at okhttp3.internal.http.Http1xStream.readResponse(Http1xStream.java:186)
	at okhttp3.internal.http.Http1xStream.readResponseHeaders(Http1xStream.java:127)
	at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:53)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.rest.interceptors.LoggingInterceptor.intercept(LoggingInterceptor.java:116)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:109)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:124)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.rest.retry.RetryHandler.intercept(RetryHandler.java:75)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.rest.interceptors.CustomHeadersInterceptor.intercept(CustomHeadersInterceptor.java:140)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.rest.interceptors.UserAgentInterceptor.intercept(UserAgentInterceptor.java:83)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.azure.credentials.AzureTokenCredentialsInterceptor.intercept(AzureTokenCredentialsInterceptor.java:40)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.azure.management.resources.fluentcore.utils.ResourceManagerThrottlingInterceptor.intercept(ResourceManagerThrottlingInterceptor.java:43)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.azure.management.resources.fluentcore.utils.ProviderRegistrationInterceptor.intercept(ProviderRegistrationInterceptor.java:43)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.azure.vmagent.AzureVMAgentPlugin$AzureTelemetryInterceptor.intercept(AzureVMAgentPlugin.java:29)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.rest.interceptors.BaseUrlHandler.intercept(BaseUrlHandler.java:43)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at com.microsoft.rest.interceptors.RequestIdHeaderInterceptor.intercept(RequestIdHeaderInterceptor.java:29)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:170)
	at okhttp3.RealCall.execute(RealCall.java:60)
	at retrofit2.OkHttpCall.execute(OkHttpCall.java:174)
	at retrofit2.adapter.rxjava.RxJavaCallAdapterFactory$RequestArbiter.request(RxJavaCallAdapterFactory.java:171)
	at rx.Subscriber.setProducer(Subscriber.java:211)
	at rx.internal.operators.OnSubscribeMap$MapSubscriber.setProducer(OnSubscribeMap.java:102)
	at retrofit2.adapter.rxjava.RxJavaCallAdapterFactory$CallOnSubscribe.call(RxJavaCallAdapterFactory.java:152)
	at retrofit2.adapter.rxjava.RxJavaCallAdapterFactory$CallOnSubscribe.call(RxJavaCallAdapterFactory.java:138)
	at rx.Observable.unsafeSubscribe(Observable.java:10142)
	at rx.internal.operators.OnSubscribeMap.call(OnSubscribeMap.java:48)
	at rx.internal.operators.OnSubscribeMap.call(OnSubscribeMap.java:33)
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
	at rx.Observable.subscribe(Observable.java:10238)
	at rx.Observable.subscribe(Observable.java:10205)
	at rx.observables.BlockingObservable.blockForSingle(BlockingObservable.java:444)
Caused: java.lang.RuntimeException
	at rx.exceptions.Exceptions.propagate(Exceptions.java:58)
	at rx.observables.BlockingObservable.blockForSingle(BlockingObservable.java:464)
	at rx.observables.BlockingObservable.single(BlockingObservable.java:341)
	at com.microsoft.azure.management.compute.implementation.VirtualMachineImagesInner.listPublishers(VirtualMachineImagesInner.java:480)
	at com.microsoft.azure.management.compute.implementation.VirtualMachinePublishersImpl.listByRegion(VirtualMachinePublishersImpl.java:50)
	at com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate.verifyVirtualMachineImage(AzureVMManagementServiceDelegate.java:2041)
	at com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate$4.call(AzureVMManagementServiceDelegate.java:1870)
	at com.microsoft.azure.vmagent.AzureVMManagementServiceDelegate$4.call(AzureVMManagementServiceDelegate.java:1866)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Add Comment Add Comment
 
This message was sent by Atlassian JIRA (v7.3.0#73011-sha1:3c73d0e)
Atlassian logo

puichan@microsoft.com (JIRA)

unread,
Apr 3, 2018, 12:59:02 PM4/3/18
to jenkinsc...@googlegroups.com
Pui Chee Chan started work on Bug JENKINS-50410
 
Change By: Pui Chee Chan
Status: Open In Progress

puichan@microsoft.com (JIRA)

unread,
Apr 3, 2018, 1:00:04 PM4/3/18
to jenkinsc...@googlegroups.com
Pui Chee Chan stopped work on Bug JENKINS-50410
 
Change By: Pui Chee Chan
Status: In Progress Open

vscjenkins@microsoft.com (JIRA)

unread,
Apr 3, 2018, 10:40:03 PM4/3/18
to jenkinsc...@googlegroups.com

chenyl@microsoft.com (JIRA)

unread,
Apr 3, 2018, 11:16:02 PM4/3/18
to jenkinsc...@googlegroups.com
Chenyang Liu commented on Bug JENKINS-50410
 
Re: Periods of high load cause "Invalid virtual machine image" exceptions

From Azure Docs "For each subscription and tenant, Resource Manager limits read requests to 15,000 per hour and write requests to 1,200 per hour. "  

You said high usage, so how many VMs do you provision in one hour. And if the frequency is really high so that hit the limit, we seems nothing can do to solve it.

 

tyler@monkeypox.org (JIRA)

unread,
Apr 4, 2018, 9:44:02 AM4/4/18
to jenkinsc...@googlegroups.com

Our max configured agents was 50 at this time. We were not making thousands of VMs an hour.

The problem here, in my opinion, could be two-fold:

  1. The plugin is making lots of ARM API requests
  2. The plugin is not handling rate-limiting gracefully.

Perhaps there's something that's polling and iterating over each agent?

chenyl@microsoft.com (JIRA)

unread,
Jun 28, 2018, 4:16:02 AM6/28/18
to jenkinsc...@googlegroups.com

In 0.7.1, this issue should be resolved.

This message was sent by Atlassian JIRA (v7.10.1#710002-sha1:6efc396)

chenyl@microsoft.com (JIRA)

unread,
Jun 28, 2018, 9:26:02 PM6/28/18
to jenkinsc...@googlegroups.com
Chenyang Liu resolved as Fixed
 
Change By: Chenyang Liu
Status: Open Resolved
Resolution: Fixed
Reply all
Reply to author
Forward
0 new messages