more timeout issues


aweber1nj

Feb 1, 2013, 9:41:06 AM
to jets3t...@googlegroups.com
I'm still seeing timeout exceptions from S3, which I'm guessing happen when one or more of the actual S3 hosts goes down.  For example, I have a retry loop (3 retries, waiting 1 second between them, with the connection timeout in the httpclient settings left unchanged), and I can get this error all 3 times:

org.jets3t.service.S3ServiceException: Request Error: org.apache.http.conn.ConnectTimeoutException: Connect to s3.amazonaws.com/s3.amazonaws.com/176.32.100.65 timed out
        at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:2486)
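
For readers following along, a retry loop of the kind described above might look roughly like the sketch below. It is not the poster's actual code: `FixedDelayRetry` and its parameters are hypothetical names, and it assumes "3x" means three attempts in total with a fixed pause between them.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retries a call a fixed number of times with a fixed pause. */
final class FixedDelayRetry {
    private FixedDelayRetry() {}

    /**
     * Runs the action up to maxAttempts times, sleeping delayMillis between
     * failed attempts, and rethrows the last failure if every attempt fails.
     */
    static <T> T call(Callable<T> action, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                // No point sleeping after the final attempt.
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

A call such as `FixedDelayRetry.call(() -> service.getObjectDetails(bucket, key), 3, 1000L)` would wrap the failing request, where `service`, `bucket` and `key` stand in for the application's own objects. If the host name keeps resolving to the same cached address, every attempt in such a loop targets the same IP, which matches the three identical failures reported here.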

After seeing this error reported, I tried pinging that IP address (I assume this is a valid test; correct me if I'm wrong), and it does not respond.  I also tried it from other hosts in other locations, which rules out some routing possibilities.

So, one thing I'd like to add to my retry logic is a way to force JetS3t to "change hosts" or re-resolve the generic host name (I think it's s3.amazonaws.com; I didn't change it in the properties file).  Is there a way to do that?

Thanks,
AJ

aweber1nj

Feb 1, 2013, 10:14:59 AM
to jets3t...@googlegroups.com
UPDATE: I have not restarted this webapp (because it's in production use), and it appears to keep trying that server and keep timing out.  This is not a good thing at all.  Does anyone know how to tell JetS3t or the underlying httpclient/connection manager to drop that connection or re-resolve the host?

James Murty

Feb 1, 2013, 10:23:20 AM
to jets3t...@googlegroups.com
Unfortunately working with DNS resolution timeouts in Java is a horrible mess. JetS3t tries to set sensible host name lookup timeouts but this attempt will fail in many (most?) cases.

There's a fair bit of discussion about this problem, without a good code-only solution, here: https://bitbucket.org/jmurty/jets3t/issue/151/do-not-overwrite-networkaddresscache  

Based on the research in that ticket, if your app is failing to re-lookup the S3 hostname via DNS the only thing likely to work is to modify your system's TTL settings in the java.security file, specifically the "networkaddress.cache.ttl" and "networkaddress.cache.negative.ttl" properties. For this to take effect you will need to restart your app.

On the plus side, if your timeout issues were all caused by this problem then fixing the TTL settings in your system's java.security file should fix it once and for all.
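
As a rough illustration of the two properties named above, the sketch below reads them through `java.security.Security` and, optionally, sets them from code. Setting them programmatically is an assumption, not the fix recommended in this reply: the JVM may already have fixed its address-caching policy by the time the code runs, in which case only the `java.security` file plus a restart will help. `DnsCacheSettings` is a hypothetical class name, and the values 60 and 10 in the usage note are placeholders.

```java
import java.security.Security;

/** Hypothetical helper for inspecting and setting the JVM's DNS cache TTL properties. */
final class DnsCacheSettings {
    static final String TTL = "networkaddress.cache.ttl";
    static final String NEGATIVE_TTL = "networkaddress.cache.negative.ttl";

    private DnsCacheSettings() {}

    /**
     * Sets both security properties (values in seconds). To have any chance of
     * taking effect this must run before the JVM performs its first host lookup.
     */
    static void apply(int ttlSeconds, int negativeTtlSeconds) {
        Security.setProperty(TTL, Integer.toString(ttlSeconds));
        Security.setProperty(NEGATIVE_TTL, Integer.toString(negativeTtlSeconds));
    }

    /** Returns the current property values as text, for logging at startup. */
    static String describe() {
        return TTL + "=" + valueOrUnset(TTL) + ", " + NEGATIVE_TTL + "=" + valueOrUnset(NEGATIVE_TTL);
    }

    private static String valueOrUnset(String key) {
        String value = Security.getProperty(key);
        return value == null ? "(unset)" : value;
    }
}
```

Logging `DnsCacheSettings.describe()` at startup shows what the running JVM currently has configured; calling `DnsCacheSettings.apply(60, 10)` first is the code-only variant, with the caveat above.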

aweber1nj

Feb 1, 2013, 10:37:13 AM
to jets3t...@googlegroups.com
I will not argue with you at all on that.  But here's the kicker:

I am using my own "pool" of S3Service objects (this was partly to test a possible threading issue, but as per your reply to a previous post, it is probably not entirely necessary).  I am theorizing that each of these actually has up to 20 (?) http connections via the underlying httpclient.

Some/most S3 requests are working fine.  It is just consistently the issue reported with the one S3-url.  Now, I do not have enough data being logged to indicate whether this is one specific connection within one of my pooled S3Service objects, or whether it is happening for all requests from that particular S3Service object (as it is used from the pool).
I am imagining a tree of objects at this point with my tomcat webapp (that actually has plenty of threads too) at the top, "n" S3Service objects in my pool, and each of those having "n" httpclient connection objects - I guess being managed by a connection-manager.
Thus, I'm not sure at which level we need to address -- the httpclient (leaf), its connectionmanager, or the S3Service object.

I am considering adding code where I catch the ServiceException and, if it is a timeout, destroy that S3Service object and replace it with a new one.  Do you think this could be effective?  Or will the new object possibly still pick up the cached info regarding the host resolution?
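
The catch-and-replace idea above could be sketched as follows. This is only an illustration: `TimeoutRecovery`, `looksLikeTimeout` and `replaceIfTimeout` are hypothetical names, the pooled object is kept generic so the sketch has no JetS3t dependency, and matching on the text "ConnectTimeoutException" is an assumption based on the message in the stack trace earlier in this thread.

```java
import java.io.InterruptedIOException;
import java.util.function.Supplier;

/** Hypothetical helper: detect a timeout-style failure and swap in a fresh pooled object. */
final class TimeoutRecovery {
    private TimeoutRecovery() {}

    /**
     * Walks the cause chain looking for an I/O timeout type, or for a message
     * that names ConnectTimeoutException (as in the wrapped "Request Error" text).
     */
    static boolean looksLikeTimeout(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            // java.net.SocketTimeoutException is a subclass of InterruptedIOException.
            if (t instanceof InterruptedIOException) {
                return true;
            }
            String message = t.getMessage();
            if (message != null && message.contains("ConnectTimeoutException")) {
                return true;
            }
        }
        return false;
    }

    /** Returns a newly built instance after a timeout, otherwise the current one. */
    static <S> S replaceIfTimeout(S current, Throwable error, Supplier<S> factory) {
        return looksLikeTimeout(error) ? factory.get() : current;
    }
}
```

Whether the replacement object would actually connect to a different address is the open question here: Java caches resolved addresses per JVM rather than per client object, so a fresh instance may well be handed the same cached IP.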

Thanks again,
AJ