problem hosting tracks on Dropbox


Carlos Infante

Mar 3, 2014, 10:29:19 AM
to gen...@soe.ucsc.edu
Dear UCSC,

I have the same problem with bigWigs hosted on Dropbox that Daofeng Li and Elphege reported. I emailed back and forth with Dropbox support, and they blame the UCSC web browser, since they are serving the files. See the final email from Dropbox support below. Has the approach to loading custom tracks changed?

Thanks,

Carlos


Begin forwarded message:

I've looked into this issue further, and it appears that the issue you're experiencing comes as a result of the 3rd party website you've been using, not with Dropbox.

From what we can tell, it's the website that's not loading the Dropbox links correctly; we're serving the files to the website, but we can't control how the website handles the files after we do so.

I'm sorry not to be able to assist you with this issue further, and please feel free to reach out with further questions!

Gert Hulselmans

Mar 4, 2014, 11:46:25 AM
to Carlos Infante, elph...@gmail.com, lid...@gmail.com, gen...@soe.ucsc.edu
Hi all,

I just ran some tests, tracing the open and write calls made by hubCheck, to see what happens with track hubs hosted on Dropbox:

# Print strings up to length of 1024 bytes so we can see the full GET request:
$ strace -s 1024 -e open,write /software/kent/bin/hubCheck https://dl.dropboxusercontent.com/u/160815136/hub.txt
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libssl.so.1.0.0", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libz.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
write(3, "HEAD /u/160815136/hub.txt HTTP/1.0\r\nUser-Agent: genome.ucsc.edu/net.c\r\nHost: dl.dropboxusercontent.com\r\nAccept: */*\r\n\r\n", 119) = 119
open("/usr/share/zoneinfo/GMT0", O_RDONLY|O_CLOEXEC) = 3
open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3
open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/bitmap", O_RDWR) = 3
open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/bitmap", O_WRONLY|O_CREAT|O_TRUNC, 0664) = 3
write(3, "\366\342\207A", 4)            = 4
write(3, "\0 \0\0", 4)                  = 4
write(3, "p\372\25S\0\0\0\0", 8)        = 8
write(3, "\220\0\0\0\0\0\0\0", 8)       = 8
write(3, "\n\0\0\0", 4)                 = 4
write(3, "\0\0\0\0", 4)                 = 4
write(3, "\0\0\0\0\0\0\0\0", 8)         = 8
write(3, "\0\0\0\0\0\0\0\0", 8)         = 8
write(3, "\0\0\0\0\0\0\0\0", 8)         = 8
write(3, "\0\0\0\0\0\0\0\0", 8)         = 8
write(3, "\0", 1)                       = 1
open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/sparseData", O_WRONLY|O_CREAT|O_TRUNC, 0664) = 3
open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/bitmap", O_RDWR) = 3
open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/sparseData", O_RDWR) = 4
write(5, "GET /u/160815136/hub.txt HTTP/1.0\r\nUser-Agent: genome.ucsc.edu/net.c\r\nHost: dl.dropboxusercontent.com\r\nAccept: */*\r\nRange: bytes=0-\r\n\r\n", 135) = 135
https timeout expired
) = 75
write(1, "unable to fetch 144 bytes from https://dl.dropboxusercontent.com/u/160815136/hub.txt @0 (got 1 bytes)\n", 102unable to fetch 144 bytes from https://dl.dropboxusercontent.com/u/160815136/hub.txt @0 (got 1 bytes)
) = 102
write(1, "\n", 1
)                       = 


So this is the GET request hubCheck makes:

GET /u/160815136/hub.txt HTTP/1.0\r\n
Accept: */*\r\n
Range: bytes=0-\r\n\r\n


Sending the same headers with curl results in a similar failure (it takes a while before curl exits):

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*' -H 'Range: bytes=0-' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'
curl: (18) transfer closed with 143 bytes remaining to read


Removing the "Range:" header solves the problem:

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*'  'https://dl.dropboxusercontent.com/u/160815136/hub.txt'
hub Elphege_track_hub
shortLabel Elphege_track_hub
longLabel Elphege_track_hub
genomesFile genomes.txt


Specifying both a start and an end for the range also solves the problem:

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*'  -H 'Range: bytes=0-143' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'
hub Elphege_track_hub
shortLabel Elphege_track_hub
longLabel Elphege_track_hub
genomesFile genomes.txt


Specifying a start and end that do not coincide with the start and end of the file also works fine:

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*'  -H 'Range: bytes=60-100' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'
el Elphege_track_hub
genomesFile genomes


Specifying an invalid end for the range returns the full file:

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*'  -H 'Range: bytes=0-a' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'
hub Elphege_track_hub
shortLabel Elphege_track_hub
longLabel Elphege_track_hub
genomesFile genomes.txt


For the hub.txt and genomes.txt files, UCSC should fetch the file with a normal GET request without a Range header.
Alternatively, it could use the value of the "Content-length:" header in the response to the HEAD request to determine the end of the range:

HTTP/1.1 200 OK
accept-ranges: bytes
cache-control: max-age=0
Content-length: 144
Content-Type: text/plain; charset=ascii
Date: Tue, 04 Mar 2014 16:38:29 GMT
etag: 38n
pragma: public
Server: nginx
x-dropbox-request-id: d162b3d336cf6d378090085bd1cc7f5a
X-RequestId: 12137b494e52ade5183ebcfacd010b17
x-robots-tag: noindex, nofollow
x-server-response-time: 241
Connection: keep-alive

==> end range = content-length - 1
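
A minimal sketch of that second option, written in Java to match the snippet quoted later in this thread. The class and method names (ClosedRange, header) are hypothetical, and this is not how the UCSC net.c code is implemented; it only illustrates the arithmetic of turning a Content-Length into a range with an explicit end.

```java
public class ClosedRange {
    /**
     * Builds a Range header value with an explicit end offset.
     * contentLength is the value reported by the HEAD response.
     */
    public static String header(long start, long contentLength) {
        if (contentLength <= 0 || start < 0 || start >= contentLength) {
            throw new IllegalArgumentException("start must lie inside the file");
        }
        // Byte offsets are zero-based and inclusive, so the last byte is length - 1.
        long end = contentLength - 1;
        return "bytes=" + start + "-" + end;
    }

    public static void main(String[] args) {
        // The HEAD response above reports Content-length: 144 for hub.txt.
        System.out.println(header(0, 144)); // prints "bytes=0-143"
    }
}
```

For the 144-byte hub.txt this gives the same "bytes=0-143" value that worked in the curl test above.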




As far as I know, 'Range: bytes=0-' should be supported by a web server (at least in HTTP/1.1).
So it seems that the Dropbox servers are not HTTP/1.1 compliant (anymore).
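
To make the expected behaviour concrete, here is a rough Java sketch of how a single byte range could be resolved against a known file length. RangeResolver and resolve are hypothetical names, not part of any server or of the UCSC code. Suffix ranges ("bytes=-500") and multiple ranges are deliberately left out, so the sketch returns null for them as well as for malformed values.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RangeResolver {
    // One "bytes=first-last" range; last is optional. Digit counts are
    // capped at 18 so Long.parseLong cannot overflow.
    private static final Pattern SPEC =
            Pattern.compile("bytes=(\\d{1,18})-(\\d{0,18})");

    /**
     * Returns {first, last} (inclusive offsets) for a file of the given
     * length, or null when this sketch cannot resolve the header value.
     */
    public static long[] resolve(String headerValue, long length) {
        Matcher m = SPEC.matcher(headerValue.trim());
        if (!m.matches()) {
            return null; // e.g. "bytes=0-a"
        }
        long first = Long.parseLong(m.group(1));
        // An empty last position means "through the end of the file".
        long last = m.group(2).isEmpty() ? length - 1 : Long.parseLong(m.group(2));
        last = Math.min(last, length - 1); // clamp an end beyond the last byte
        if (first > last) {
            return null; // start lies past the end of the range or file
        }
        return new long[] {first, last};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(resolve("bytes=0-", 144)));     // [0, 143]
        System.out.println(Arrays.toString(resolve("bytes=60-100", 144))); // [60, 100]
        System.out.println(Arrays.toString(resolve("bytes=100-", 144)));   // [100, 143]
        System.out.println(Arrays.toString(resolve("bytes=0-a", 144)));    // null
    }
}
```

Under this reading, "bytes=0-" and "bytes=0-143" describe the same 144 bytes of hub.txt, which is why the open-ended request would be expected to succeed as well.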



Greetings,
Gert Hulselmans




Brian Lee

Mar 4, 2014, 3:42:10 PM
to Gert Hulselmans, Carlos Infante, elph...@gmail.com, Daofeng Li, gen...@soe.ucsc.edu
Dear Carlos and Gert,

Thank you very much for sharing your troubleshooting regarding the Dropbox problem. We have contacted Dropbox about the changes and hope to hear a response soon. Our engineers are also looking into this development to try to discover a workaround for our users.

Until there is a resolution, perhaps an Amazon S3 bucket might be an option; they offer 1 year of free service within certain usage limits.

Thank you again for your detailed help, it is greatly appreciated. We will contact you if we learn of any solution. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian



Gert Hulselmans

Mar 5, 2014, 8:04:16 AM
to Brian Lee, gen...@soe.ucsc.edu

Dear Brian,


Apparently Sean s already found and reported this "Range: bytes=0-" problem on the Dropbox forum.

It might be useful to mention this report in your own bug report too:

https://forums.dropbox.com/topic.php?id=112637#post-596782


I noticed an issue: when Dropbox receives a download request with an HTTP header of Range: bytes=0-, it will block after reading the first byte. If you use bytes=0-100 (or some other ending value), then it will work, but it fails when the end is not specified. Likewise, it also fails if you use bytes=100- (which should start downloading at a 100-byte offset and continue to the end, as is common for resumable downloads).

The following Java code snippet demonstrates the problem: the connection will time out after reading the first byte. (Just add an end to the range and see it work, or change to a non-Dropbox URL.)

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class TestURLConnection {
    public static void main(String args[]) throws IOException {
        String url = "http://dl.dropboxusercontent.com/u/408295/sageplugins/mediastreaming-war-file-1.3.7.59.zip";
        //String url = "https://github.com/stuckless/sagetv-phoenix-plex-channel/releases/download/v1.0.3/SageTVPhoenix-1.0.3-beta.zip";
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setInstanceFollowRedirects(true);
        // Open-ended range: this is the request that stalls.
        conn.setRequestProperty("Range", "bytes=0-");
        conn.setReadTimeout(10000);
        conn.setConnectTimeout(10000);
        conn.connect();
        System.out.println("  RESP: " + conn.getResponseCode());
        System.out.println("  MESG: " + conn.getResponseMessage());
        System.out.println("LENGTH: " + conn.getContentLengthLong());
        InputStream is = conn.getInputStream();
        // Read byte by byte, printing a dot for each byte received.
        while (is.read()>=0) {
            System.out.print(".");
        }
        is.close();
        System.out.println("DONE");
    }
}


Greetings,
Gert Hulselmans
