Last night we discovered an issue with loading files from AWS S3.
There was one Thread hanging in SocketInputStream.socketRead0:

"pool-11-thread-2679" prio=10 tid=0x00002aad7c9ebc00 nid=0x19a4 runnable [0x0000000042c01000..0x0000000042c02b10]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
        at com.sun.net.ssl.internal.ssl.InputRecord.readV3Record(InputRecord.java:405)
        at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:360)
        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:789)
        - locked <0x00002aabe5ac8088> (a java.lang.Object)
        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:746)
        at com.sun.net.ssl.internal.ssl.AppInputStream.read(AppInputStream.java:75)
        - locked <0x00002aabe5aca1a8> (a com.sun.net.ssl.internal.ssl.AppInputStream)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        - locked <0x00002aabe5aca1d0> (a java.io.BufferedInputStream)
        at sun.net.www.MeteredStream.read(MeteredStream.java:116)
        - locked <0x00002aabe5acb218> (a sun.net.www.http.KeepAliveStream)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2504)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at java.io.FilterInputStream.read(FilterInputStream.java:90)

Not sure about the thread state RUNNABLE, but this thread stood there
for more than half an hour, and according to our logs there is no
other conclusion than that this must have been the case for hours.
The InputStream was opened via
    blobStore.getBlob(containerName, targetName).getPayload().getInput();
and the blobStore was created via
    new BlobStoreContextFactory().createContext("s3", username, password).getBlobStore();
We're still on jclouds 1.0-beta-7.
When digging through the code, I'd say that the connection providing
the InputStream was created in
JavaUrlHttpCommandExecutorService.convert via url.openConnection(), and
without any configuration the timeout settings are still at their
default values (initially 0 in URLConnection). As 0 indicates an
infinite timeout for both connect and read, this observation seems to
make sense.
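
The default described above can be checked with plain JDK classes. The following is a minimal sketch, not jclouds code: the class name UrlTimeoutDefaults, the helper openWithTimeouts, the placeholder URL and the timeout values are all assumptions made for illustration. It relies on the fact that URL.openConnection() only creates the connection object and does not touch the network. As a side note, the JVM reports a thread that is blocked inside a native socket read as RUNNABLE, which matches the dump above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

final class UrlTimeoutDefaults {

    /**
     * Opens a URLConnection without connecting and applies explicit timeouts.
     * Both values are in milliseconds; 0 would mean "wait forever".
     * (Hypothetical helper, not part of jclouds.)
     */
    static URLConnection openWithTimeouts(URL url, int connectMillis, int readMillis)
            throws IOException {
        URLConnection connection = url.openConnection(); // creates the object only, no network I/O yet
        connection.setConnectTimeout(connectMillis);
        connection.setReadTimeout(readMillis);
        return connection;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL; nothing is resolved or contacted in this example.
        URL url = URI.create("http://example.invalid/some-object").toURL();

        URLConnection defaults = url.openConnection();
        System.out.println("default connect timeout: " + defaults.getConnectTimeout());
        System.out.println("default read timeout:    " + defaults.getReadTimeout());

        URLConnection bounded = openWithTimeouts(url, 5_000, 30_000);
        System.out.println("bounded connect timeout: " + bounded.getConnectTimeout());
        System.out.println("bounded read timeout:    " + bounded.getReadTimeout());
    }
}
```

With a non-zero read timeout, a stalled read should fail with a java.net.SocketTimeoutException instead of blocking indefinitely, which gives the calling thread a chance to retry.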
Looking through the documentation I'd try to upgrade to the latest
jclouds version and follow
http://code.google.com/p/jclouds/wiki/jcloudsAPI#Configuration_Modules
Would you confirm that this should fix the issue?
Cheers,
Martin
Yes, setting a timeout will release control back to your thread for retry, etc. Make sure you use the following dependency from Maven Central:
org.jclouds.provider/aws-s3 1.0-beta-9b
When you switch, make sure you use the provider aws-s3, not s3.
Cheers,
Adrian
I'm just making the change to our code base. Would you recommend
adding the EnterpriseConfigurationModule as shown in
http://code.google.com/p/jclouds/wiki/jcloudsAPI#Configuration_Modules ?
Is this required to make use of the properties, or will they also be
applied without the EnterpriseConfigurationModule?
Thanx && cheers,
Martin
--
Martin Grotzke
http://twitter.com/martin_grotzke
Enterprise only swaps out encryption and date libraries w/ more
performant versions, at the moment. So, the parameters are not
dependent on the Enterprise module.
-A
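
For readers following along, here is a minimal sketch of what such timeout overrides might look like. It only builds a java.util.Properties object; the two property keys are what I believe jclouds exposes as Constants.PROPERTY_CONNECTION_TIMEOUT and Constants.PROPERTY_SO_TIMEOUT, but that is an assumption to verify against the jclouds version in use. The class name, method name and the millisecond values are hypothetical.

```java
import java.util.Properties;

final class JcloudsTimeoutOverrides {

    // Assumed property keys; check org.jclouds.Constants in your jclouds version.
    static final String CONNECTION_TIMEOUT_KEY = "jclouds.connection-timeout";
    static final String SO_TIMEOUT_KEY = "jclouds.so-timeout";

    /** Builds override properties with both timeouts given in milliseconds. */
    static Properties timeoutOverrides(int connectMillis, int socketReadMillis) {
        if (connectMillis < 0 || socketReadMillis < 0) {
            throw new IllegalArgumentException("timeouts must not be negative");
        }
        Properties overrides = new Properties();
        overrides.setProperty(CONNECTION_TIMEOUT_KEY, Integer.toString(connectMillis));
        overrides.setProperty(SO_TIMEOUT_KEY, Integer.toString(socketReadMillis));
        return overrides;
    }

    public static void main(String[] args) {
        // Illustrative values only: 10 s to connect, 60 s of read inactivity.
        Properties overrides = timeoutOverrides(10_000, 60_000);
        overrides.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The resulting Properties object would be passed as the overrides argument when the context is created, with or without the Enterprise module.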
On Wed, Mar 30, 2011 at 8:19 AM, Martin Grotzke
Even after our upgrade to 1.0-beta-9b and setting socket and connection
timeouts (following
http://code.google.com/p/jclouds/wiki/jcloudsAPI#Configuration_Modules)
we're still getting into this situation with threads stuck in
SocketInputStream.socketRead0.
Is it possible that the timeouts are not properly applied?
Do you have a clue what might be the issue?
Cheers,
Martin
On 04/07/2011 10:28 PM, Adrian Cole wrote:
> Hi, Martin.
>
> Threads stuck in socketRead could be very network sensitive. What
> environment is your application running in?
Dedicated hosting in a data center located in Germany. Further details
about the current infrastructure I would have to clarify with the
customer and operations team. What exactly do you want to know?
>
> P.S. I think there might be a problem in beta-9 that has since been
> fixed.
>
> I touched up demos/perftest and found that we run fine in snapshot, but
> I had an IOException on beta-9. Tibor's done a lot of work on snapshot,
> and we could find an explanation in comparing the two implementations of
> JavaUrlHttpCommandExecutorService.
AFAIK we deployed our app with the HttpClient Module activated to our
test system. If it was an issue with the
JavaUrlHttpCommandExecutorService the problem shouldn't occur with this
setup, right?
Is the other issue ("IOException: Error writing request body to server")
also related to JavaUrlHttpCommandExecutorService and might/should be
gone with HCModule?
>
> Can you try out snapshot?
If the issues are gone with HCModule, we would probably prefer this
option over trying out the snapshot. If this is not the case, we can try
out the snapshot on the test system.
As I'm currently sick at home and not in the office, communication from
my side might be delayed. Perhaps one of our team might step in, so
don't be surprised :-)
Cheers,
Martin
>
> Cheers,
> -Adrian
>
> P.S. our apachehc implementation is not fast, as you can see below.
>
> *in snapshot vs amazon's client on the same machine (using home broadband):*
> *using 1.0-beta-9 branch vs amazon's client on the same machine (using
>
> On Thu, Apr 7, 2011 at 2:53 PM, Martin Grotzke wrote:
>
> On 04/07/2011 10:28 PM, Adrian Cole wrote:
> > Hi, Martin.
> >
> > Threads stuck in socketRead could be very network sensitive. What
> > environment is your application running in?
> Dedicated hosting in a data center located in germany, further details
> about the current infrastructure I had to clarify with the customer and
> operations team. What exactly do you want to know?
>
> Curious as to whether the Amazon region your buckets are located in is
> close to you. For example, are you using EU as an endpoint? If not,
> this could invite more connection breaks.
>
> Properties overrides = new Properties();
> overrides.setProperty("aws-s3.endpoint", "https://s3-eu-west-1.amazonaws.com");
> context = new BlobStoreContextFactory().createContext("aws-s3", accessKey, secretKey,
>     ImmutableSet.of(new EnterpriseConfigurationModule()), overrides);
I tried running the AWSS3PutImageIntegrationLiveTest with the endpoint
set, but then I get a
com.google.common.collect.AsynchronousComputationException: java.lang.NullPointerException
    at com.google.common.collect.ComputingConcurrentHashMap$ComputationExceptionReference.waitForValue(ComputingConcurrentHashMap.java:279)
for all uploads.
The S3 bucket is located in Ireland, btw.
What's the meaning of the "aws-s3.endpoint" property? I would have
thought that AWS automatically routes to the correct endpoint using as
few hops as possible...
>
> Also note that there are a number of blobstores in germany itself,
> including scaleup-storage.
AFAICS it's not supported by CloudFront, is it?
Do you have any experience with the performance of e.g. scaleup-storage
compared to S3, and would you suggest scaleup-storage over S3 (together
with another CDN)?
Cheers,
Martin
On 04/08/2011 01:13 PM, Kiss Tibor wrote:
> Hi, Martin, Adrian,
>
> The initial email (below) has a socket read error, but the previous one
> has a socket write error.
Right. The socketRead issue motivated us to upgrade to beta-9 (with
setting properties), after that we got the socket write error, and still
saw the socketRead error.
>
> I discovered a socket read error when downloading a relatively large
> file (in my case 180MB); the socket read error happened exactly at the
> end of the stream. I mean the entire content was downloaded, but the
> peer didn't close the socket. I uploaded the same file to an Apache
> server and started to download it with the same Payload.writeTo...; in
> that case the socket read didn't get blocked.
> Therefore I created a download tool (small app) using the same jclouds
> API, just to download in smaller pieces and write to the same output
> stream. I used jclouds' ranges and the result was successful.
> I was able to download a 32GB file with it. I used a split size of
> 32MB. As for download speed, when I used this 32MB split size, the
> download speed increased to ~32MB/s.
> My conclusion is that the Amazon web service handles large downloads in
> a single HTTP body inefficiently. Maybe behind the scenes the blocks of
> that data sit on different servers and that kind of proxying is costly
> or, much worse, it does not handle the end of file correctly.
>
> All in all, I opted to make large downloads in 32MB splits using the
> range param, and this works with older versions of jclouds too. Since
> then I have transferred a few terabytes of data and have had no problems.
Sounds interesting! The images we're downloading/uploading are not
bigger than, say, 1, 2 or 3 MB. As your chunks are 32MB and this is
working well for you, it seems as if we have a different issue.
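
The split-download approach described in the quoted text can be sketched roughly as follows. Note the swap: Tibor used the jclouds range options, whereas this sketch uses plain HttpURLConnection with a Range header, so it is an illustration of the technique and not his tool. The class RangedDownload, its method names and the timeout values are hypothetical, and the total size is assumed to be known beforehand (for example from the object's metadata).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

final class RangedDownload {

    /** Splits [0, totalBytes) into inclusive {start, end} offsets of at most chunkBytes each. */
    static List<long[]> split(long totalBytes, long chunkBytes) {
        if (totalBytes < 0 || chunkBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and chunkBytes > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < totalBytes; start += chunkBytes) {
            long end = start + Math.min(chunkBytes, totalBytes - start) - 1;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    /** Formats one range as an HTTP Range header value. */
    static String rangeHeader(long[] range) {
        return "bytes=" + range[0] + "-" + range[1];
    }

    /** Fetches the resource piece by piece and appends every piece to the same output stream. */
    static void download(URL url, long totalBytes, long chunkBytes, OutputStream out)
            throws IOException {
        for (long[] range : split(totalBytes, chunkBytes)) {
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setConnectTimeout(10_000); // illustrative values, in milliseconds
            connection.setReadTimeout(60_000);
            connection.setRequestProperty("Range", rangeHeader(range));
            try {
                if (connection.getResponseCode() != HttpURLConnection.HTTP_PARTIAL) {
                    throw new IOException("Expected 206 Partial Content for " + rangeHeader(range));
                }
                try (InputStream in = connection.getInputStream()) {
                    in.transferTo(out); // requires Java 9 or later
                }
            } finally {
                connection.disconnect();
            }
        }
    }
}
```

Each piece uses its own request with its own read timeout, so a stalled piece fails on its own and can be retried without restarting the whole transfer.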
> Regarding the previous problem (from the email below mine),
> java.io.IOException: Error writing request body to server
> may be caused by the missing 100-continue in the HTTP layer. I added it
> in 1.0-SNAPSHOT. Without 100-continue, the web service layer cannot send
> back some error responses, for example authentication problems (in my
> case); it drops the socket connection and jclouds is then not
> able to write the body.
AFAIU you can either
1) just PUT/POST some data
2) or send the request header with the header field
"Expect: 100-continue" and wait for the response with status 100
(Continue) before sending the actual request body.
Is 1) the way jclouds worked before, and have you now implemented 2)?
I'd expect 2) to be slower than 1) for smaller files, so starting from
which file size is 2) going to be used? Is it possible for a jclouds
client to control this per operation, and to set a preferred default?
Would you say that 1) has something to do with the socket write issues
we're experiencing?
Cheers,
Martin
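
To make the two options concrete, here is a small sketch of a per-request choice between them. It does not show how jclouds decides this; it uses the JDK's java.net.http API (Java 11 or later) as a stand-in, and the class ExpectContinuePolicy, the method buildPut and the size threshold are hypothetical. The request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ExpectContinuePolicy {

    /** Hypothetical cut-off: bodies of this size or larger ask for "100 Continue" first. */
    static final long DEFAULT_THRESHOLD_BYTES = 1024L * 1024L;

    /**
     * Builds a PUT request. Bodies below the threshold are sent right away (option 1),
     * larger ones carry "Expect: 100-continue" and wait for the interim response (option 2).
     */
    static HttpRequest buildPut(URI target, byte[] body, long thresholdBytes) {
        return HttpRequest.newBuilder(target)
                .expectContinue(body.length >= thresholdBytes)
                .PUT(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    public static void main(String[] args) {
        URI target = URI.create("http://example.invalid/bucket/image.jpg"); // placeholder
        HttpRequest small = buildPut(target, new byte[16 * 1024], DEFAULT_THRESHOLD_BYTES);
        HttpRequest large = buildPut(target, new byte[4 * 1024 * 1024], DEFAULT_THRESHOLD_BYTES);
        System.out.println("small upload expects 100-continue: " + small.expectContinue());
        System.out.println("large upload expects 100-continue: " + large.expectContinue());
    }
}
```

The trade-off is the one raised above: waiting for the interim response adds latency on small uploads, while on larger ones it lets the server reject the request (for example on an authentication failure) before the body is written.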
>
> Cheers,
> Tibor
com.google.common.collect.AsynchronousComputationException: java.lang.NullPointerException
    at com.google.common.collect.ComputingConcurrentHashMap$ComputationExceptionReference.waitForValue(ComputingConcurrentHashMap.java:279)
    at com.google.common.collect.ComputingConcurrentHashMap.waitForValue(ComputingConcurrentHashMap.java:241)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingSegment.compute(ComputingConcurrentHashMap.java:133)
    at com.google.common.collect.ComputingConcurrentHashMap.apply(ComputingConcurrentHashMap.java:67)
    at com.google.common.collect.MapMaker$ComputingMapAdapter.get(MapMaker.java:623)
    at org.jclouds.s3.blobstore.S3BlobStore.putBlob(S3BlobStore.java:236)
    at org.jclouds.aws.s3.blobstore.integration.AWSS3PutImageIntegrationLiveTest.testPutImage(AWSS3PutImageIntegrationLiveTest.java:115)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:74)
    at org.testng.internal.Invoker.invokeMethod(Invoker.java:673)
    at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:846)
    at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1170)
    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
    at org.testng.internal.thread.ThreadUtil$CountDownLatchedRunnable.run(ThreadUtil.java:147)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException
    at org.jclouds.blobstore.functions.ThrowContainerNotFoundOn404.apply(ThrowContainerNotFoundOn404.java:41)
    at org.jclouds.blobstore.functions.ThrowContainerNotFoundOn404.apply(ThrowContainerNotFoundOn404.java:1)
    at org.jclouds.concurrent.ExceptionParsingListenableFuture.attemptConvert(ExceptionParsingListenableFuture.java:69)
    at org.jclouds.concurrent.ExceptionParsingListenableFuture.get(ExceptionParsingListenableFuture.java:77)
    at org.jclouds.concurrent.internal.SyncProxy.invoke(SyncProxy.java:131)
    at $Proxy65.getBucketACL(Unknown Source)
    at org.jclouds.s3.blobstore.config.S3BlobStoreContextModule$3.apply(S3BlobStoreContextModule.java:87)
    at org.jclouds.s3.blobstore.config.S3BlobStoreContextModule$3.apply(S3BlobStoreContextModule.java:1)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingSegment.compute(ComputingConcurrentHashMap.java:155)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingSegment.compute(ComputingConcurrentHashMap.java:116)
To give you an update on our current status: our production system has
been running with beta-9, but without overridden properties and without
the enterprise module, since yesterday. Download/upload times are quite
good (as before the upgrade). On the test system the same version is
just being deployed without overridden properties but with the Apache HC
module. On Monday we can see whether there were issues with the
production system and how the test system behaved.
So right now it seems as if we could relax over the weekend :-)
Thanx && cheers,
Martin
-A
Still getting an NPE:
com.google.common.collect.AsynchronousComputationException: java.lang.NullPointerException
    at com.google.common.collect.ComputingConcurrentHashMap$ComputationExceptionReference.waitForValue(ComputingConcurrentHashMap.java:228)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingValueReference.waitForValue(ComputingConcurrentHashMap.java:298)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingSegment.compute(ComputingConcurrentHashMap.java:157)
    at com.google.common.collect.ComputingConcurrentHashMap.apply(ComputingConcurrentHashMap.java:71)
    at com.google.common.collect.MapMaker$ComputingMapAdapter.get(MapMaker.java:848)
    at org.jclouds.s3.blobstore.S3BlobStore.putBlob(S3BlobStore.java:236)
    at org.jclouds.aws.s3.blobstore.integration.AWSS3PutImageIntegrationLiveTest.testPutImage(AWSS3PutImageIntegrationLiveTest.java:114)
Caused by: java.lang.NullPointerException
    at org.jclouds.blobstore.functions.ThrowContainerNotFoundOn404.apply(ThrowContainerNotFoundOn404.java:40)
    at org.jclouds.blobstore.functions.ThrowContainerNotFoundOn404.apply(ThrowContainerNotFoundOn404.java:35)
    at org.jclouds.concurrent.ExceptionParsingListenableFuture.attemptConvert(ExceptionParsingListenableFuture.java:68)
    at org.jclouds.concurrent.ExceptionParsingListenableFuture.get(ExceptionParsingListenableFuture.java:76)
    at org.jclouds.concurrent.internal.SyncProxy.invoke(SyncProxy.java:130)
    at $Proxy65.getBucketACL(Unknown Source)
    at org.jclouds.s3.blobstore.config.S3BlobStoreContextModule$3.apply(S3BlobStoreContextModule.java:86)
    at org.jclouds.s3.blobstore.config.S3BlobStoreContextModule$3.apply(S3BlobStoreContextModule.java:84)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingValueReference.compute(ComputingConcurrentHashMap.java:316)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingSegment.compute(ComputingConcurrentHashMap.java:140)
Cheers,
Martin
Apologies. I'll run the whole test-suite with this endpoint. I've
found an issue that causes what you are seeing.
-A
On Mon, Apr 11, 2011 at 10:14 AM, Martin Grotzke