Fwd: vso_get 'No response from server' errors


Joe Hourcle

Jun 22, 2012, 2:49:34 PM
to vso...@googlegroups.com
I suspect this didn't go through because I tried sending from my personal e-mail address, not my work one.

-Joe


Begin forwarded message:

> From: Joe Hourcle <one...@annoying.org>
> Date: June 20, 2012 8:39:18 PM EDT
> To: vso...@googlegroups.com
> Subject: vso_get 'No response from server' errors
>
>
> So, for a couple of months now, we've been getting reports where people sporadically get messages from vso_get like:
>
> % HTTP::READF: No response from server
> % HTTP::COPY: No response from server
>
> ... and when I try it, I can't re-create it.
>
> So, first, the good news:
>
> 1. I've managed to re-create it reliably.
> 2. Igor helped me track down what it is.
> 3. We know why it's sporadic.
> 4. We have a work-around.
> 5. Or, you can just re-request the file a few minutes later, and it should be okay.
>
> Now for the bad news:
>
> The only current work-around shouldn't be used blindly.
>
>
> ...
>
> So, the work-around:
>
> Add to your vso_get call:
>
> read_timeout=60
>
> ... which still isn't enough time for the tarballs. For those, you need to shut off the timeout entirely, by adding:
>
> read_timeout=0
>
>
> ...
>
> And, for those who care, the problem:
>
> sock_copy does a little check to see if it needs to download the file, before it actually requests that the server send a copy.
>
> For SDO data through the VSO, if it's a bunch of files (that would result in a tarball), we send a response to the initial check telling it we have no idea what the size is (which is how it checks), so it always re-downloads the files. If it's a single file, we return the size.
>
> In order to return the size, we have to have a copy of the file being requested ... which a given caching site might not have, and so it downloads from somewhere else ... but if we can't get it within 10 seconds, we time out.
>
> I'm not sure what's changed the timeout to 10 seconds. I had thought we suppressed the timeout, as some of the larger tarballs might take a few minutes before we can start serving them.
>
> Under most circumstances, we can respond to single image requests fast enough that people don't see this ... however, the JSOC to SDAC transfers are slow. (SAO or NSO to SDAC are fine, as are JSOC to SAO or NSO) ... and just slow enough that it might take 15-20 seconds for a typical AIA lev1 file.
>
>
> ...
>
> We'll have to work on a longer-term solution; I may be able to modify Dominic's HTTP object to use a longer timeout on the initial HEAD request, but this means that it won't time out as quickly when there really are problems (which is why always specifying 'read_timeout=0' is bad).
>
>
> -Joe
>
>
>
>
>
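
For anyone who wants to see the shape of the check described in the forwarded message, here is a rough sketch in Python. It is not the sock_copy code (that is IDL). The function names, the use of Content-Length as the reported size, and the 10-second default are assumptions made for illustration only.

```python
import os
import urllib.request
from typing import Optional


def remote_size(url: str, head_timeout: float = 10.0) -> Optional[int]:
    """Ask the server for the file size with a HEAD request.

    Returns None when the server does not report a size (the tarball case).
    A slow server raises a timeout error here, which is roughly where the
    'No response from server' message would come from (assumed).
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=head_timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None


def needs_download(local_path: str, size_on_server: Optional[int]) -> bool:
    """Decide whether to fetch the file, based on the initial size check."""
    if size_on_server is None:
        return True   # size unknown: always re-download
    if not os.path.exists(local_path):
        return True   # nothing local yet
    # A local copy with a different size is treated as stale or partial.
    return os.path.getsize(local_path) != size_on_server
```

Raising head_timeout in a sketch like this hides the slow-transfer case, but it also means a genuinely dead server takes that much longer to report as a failure, which is the same trade-off as read_timeout.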
