problem hosting tracks on Dropbox

Carlos Infante

Mar 3, 2014, 10:29:19 AM

to gen...@soe.ucsc.edu

Dear UCSC,

I have the same problem with bigWigs hosted on Dropbox that Daofeng Li and Elphege reported. After I emailed back and forth with Dropbox support, they blamed the UCSC Genome Browser, saying that they are serving the files. See the final email from Dropbox support below. Has the approach to loading custom tracks changed?

Thanks,

Carlos

Begin forwarded message:

I've looked into this issue further, and it appears that the issue you're experiencing comes as a result of the 3rd party website you've been using, not with Dropbox.

From what we can tell, it's the website that's not loading the Dropbox links correctly; we're serving the files to the website, but we can't control how the website handles the files after we do so.

I'm sorry not to be able to assist you with this issue further, and please feel free to reach out with further questions!

Gert Hulselmans

Mar 4, 2014, 11:46:25 AM

to Carlos Infante, elph...@gmail.com, lid...@gmail.com, gen...@soe.ucsc.edu

Hi all,

I just ran some tests, tracing the open and write calls made by hubCheck, to see what happens with track hubs hosted on Dropbox:

# Print strings up to length of 1024 bytes so we can see the full GET request:

$ strace -s 1024 -e open,write /software/kent/bin/hubCheck https://dl.dropboxusercontent.com/u/160815136/hub.txt

open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

open("/lib/x86_64-linux-gnu/libssl.so.1.0.0", O_RDONLY|O_CLOEXEC) = 3

open("/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", O_RDONLY|O_CLOEXEC) = 3

open("/lib/x86_64-linux-gnu/libz.so.1", O_RDONLY|O_CLOEXEC) = 3

open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3

open("/lib/x86_64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3

open("/lib/x86_64-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3

open("/lib/x86_64-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3

open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3

write(3, "HEAD /u/160815136/hub.txt HTTP/1.0\r\nUser-Agent: genome.ucsc.edu/net.c\r\nHost: dl.dropboxusercontent.com\r\nAccept: */*\r\n\r\n", 119) = 119

open("/usr/share/zoneinfo/GMT0", O_RDONLY|O_CLOEXEC) = 3

open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3

open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/bitmap", O_RDWR) = 3

open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/bitmap", O_WRONLY|O_CREAT|O_TRUNC, 0664) = 3

write(3, "\366\342\207A", 4) = 4

write(3, "\0 \0\0", 4) = 4

write(3, "p\372\25S\0\0\0\0", 8) = 8

write(3, "\220\0\0\0\0\0\0\0", 8) = 8

write(3, "\n\0\0\0", 4) = 4

write(3, "\0\0\0\0", 4) = 4

write(3, "\0\0\0\0\0\0\0\0", 8) = 8

write(3, "\0", 1) = 1

open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/sparseData", O_WRONLY|O_CREAT|O_TRUNC, 0664) = 3

open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/bitmap", O_RDWR) = 3

open("/tmp/udcCache/https/dl.dropboxusercontent.com/u/160815136/hub.txt/sparseData", O_RDWR) = 4

write(5, "GET /u/160815136/hub.txt HTTP/1.0\r\nUser-Agent: genome.ucsc.edu/net.c\r\nHost: dl.dropboxusercontent.com\r\nAccept: */*\r\nRange: bytes=0-\r\n\r\n", 135) = 135

https timeout expired

write(1, "Errors with hub at 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'\n", 75Errors with hub at 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'

) = 75

write(1, "unable to fetch 144 bytes from https://dl.dropboxusercontent.com/u/160815136/hub.txt @0 (got 1 bytes)\n", 102unable to fetch 144 bytes from https://dl.dropboxusercontent.com/u/160815136/hub.txt @0 (got 1 bytes)

) = 102

write(1, "\n", 1

) =

So this is the GET request hubCheck makes:

GET /u/160815136/hub.txt HTTP/1.0\r\n

User-Agent: genome.ucsc.edu/net.c\r\n

Host: dl.dropboxusercontent.com\r\n

Accept: */*\r\n

Range: bytes=0-\r\n\r\n
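For reference, the request can be reassembled from the strace output above. This is only a sketch: build_request is a hypothetical helper written for illustration, not a function from the UCSC source tree. It shows that the open-ended Range header is the only difference between the HEAD request (119 bytes) and the failing GET request (135 bytes).

```python
def build_request(method, path, host, range_value=None):
    """Assemble an HTTP/1.0 request like the one seen in the strace output.

    Hypothetical helper for illustration; header values are copied from the trace.
    """
    lines = [
        "{} {} HTTP/1.0".format(method, path),
        "User-Agent: genome.ucsc.edu/net.c",
        "Host: {}".format(host),
        "Accept: */*",
    ]
    if range_value is not None:
        lines.append("Range: {}".format(range_value))
    # Header lines are separated by CRLF and the block ends with an empty line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


head = build_request("HEAD", "/u/160815136/hub.txt", "dl.dropboxusercontent.com")
get = build_request("GET", "/u/160815136/hub.txt", "dl.dropboxusercontent.com",
                    range_value="bytes=0-")
print(len(head))  # 119, matching write(3, ..., 119)
print(len(get))   # 135, matching write(5, ..., 135)
```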

Sending the same headers with curl results in a similar failure (it takes a while before curl's execution ends):

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*' -H 'Range: bytes=0-' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'

curl: (18) transfer closed with 143 bytes remaining to read

Removing the "Range:" header solves the problem:

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'

hub Elphege_track_hub

shortLabel Elphege_track_hub

longLabel Elphege_track_hub

genomesFile genomes.txt

email elpheg...@gladstone.ucsf.edu

Specifying both a start and an end range also solves the problem:

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*' -H 'Range: bytes=0-143' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'

hub Elphege_track_hub

shortLabel Elphege_track_hub

longLabel Elphege_track_hub

genomesFile genomes.txt

email elpheg...@gladstone.ucsf.edu

A start and end range that does not cover the whole file also works fine:

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*' -H 'Range: bytes=60-100' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'

el Elphege_track_hub

genomesFile genomes

Specifying an invalid end range returns the full file:

$ curl -H 'GET /u/160815136/hub.txt HTTP/1.0' -H 'User-Agent: genome.ucsc.edu/net.c' -H 'Host: dl.dropboxusercontent.com' -H 'Accept: */*' -H 'Range: bytes=0-a' 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'

hub Elphege_track_hub

shortLabel Elphege_track_hub

longLabel Elphege_track_hub

genomesFile genomes.txt

email elpheg...@gladstone.ucsf.edu

For the hub.txt and genomes.txt files, UCSC should fetch the file with a normal GET request without ranges.

Alternatively, it should use the value of the "Content-length: " header from the HEAD response to determine the end of the range:

$ curl -I 'https://dl.dropboxusercontent.com/u/160815136/hub.txt'

HTTP/1.1 200 OK

accept-ranges: bytes

cache-control: max-age=0

Content-length: 144

Content-Type: text/plain; charset=ascii

Date: Tue, 04 Mar 2014 16:38:29 GMT

etag: 38n

pragma: public

Server: nginx

x-dropbox-request-id: d162b3d336cf6d378090085bd1cc7f5a

X-RequestId: 12137b494e52ade5183ebcfacd010b17

x-robots-tag: noindex, nofollow

x-server-response-time: 241

Connection: keep-alive

==> end range = content-length - 1
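The rule above can be sketched in a few lines of Python. The function name closed_range is an assumption made for this example; it is not how hubCheck or the udc code computes ranges. It only illustrates turning the Content-length value from the HEAD response into an explicit end offset.

```python
def closed_range(content_length, start=0):
    """Return a Range header value with an explicit end offset.

    Hypothetical helper: content_length is the value of the Content-length
    header from the HEAD response.
    """
    if content_length <= 0:
        raise ValueError("content_length must be positive")
    if not 0 <= start < content_length:
        raise ValueError("start must lie inside the file")
    # Byte offsets are zero-based and inclusive, so the last byte is length - 1.
    return "bytes={}-{}".format(start, content_length - 1)


print(closed_range(144))  # bytes=0-143
```

For the 144-byte hub.txt this yields the same "bytes=0-143" range that worked in the curl test above.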

As far as I know, 'Range: bytes=0-' should be supported by a webserver (at least in HTTP/1.1).

So it seems that the Dropbox servers are no longer HTTP/1.1 compliant.
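To make the expected behaviour concrete, here is a rough sketch of how a server is expected to resolve a single byte range, based on my reading of the HTTP/1.1 byte-range rules (RFC 2616, section 14.35). The function resolve_range is hypothetical, handles only a single range, and is not taken from any server implementation.

```python
import re

# Single-range form only: "bytes=first-last", "bytes=first-" or "bytes=-suffix".
_BYTE_RANGE = re.compile(r"^bytes=([0-9]*)-([0-9]*)$")


def resolve_range(header_value, length):
    """Return the inclusive (first, last) offsets a server should send.

    Returns None when the header is malformed, in which case the server is
    expected to ignore it and send the whole file. Raises ValueError when
    the range is well-formed but cannot be satisfied.
    """
    match = _BYTE_RANGE.match(header_value.strip())
    if match is None:
        return None
    first_text, last_text = match.groups()
    if first_text == "" and last_text == "":
        return None  # "bytes=-" carries no offsets at all
    if first_text == "":
        # Suffix form: the final N bytes of the file.
        suffix = int(last_text)
        if suffix == 0 or length == 0:
            raise ValueError("unsatisfiable range")
        return (max(length - suffix, 0), length - 1)
    first = int(first_text)
    if last_text != "" and int(last_text) < first:
        return None  # end before start: treated as malformed
    if first >= length:
        raise ValueError("unsatisfiable range")
    # An open end, or an end past the file, both stop at the last byte.
    last = length - 1 if last_text == "" else min(int(last_text), length - 1)
    return (first, last)


for value in ("bytes=0-", "bytes=0-143", "bytes=60-100", "bytes=100-", "bytes=0-a"):
    print(value, resolve_range(value, 144))
```

Under this reading, "bytes=0-" and "bytes=0-143" should return identical data for the 144-byte hub.txt, and the malformed "bytes=0-a" is ignored, which is consistent with the full file coming back in the curl test above.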

Greetings,

Gert Hulselmans


Brian Lee

Mar 4, 2014, 3:42:10 PM

to Gert Hulselmans, Carlos Infante, elph...@gmail.com, Daofeng Li, gen...@soe.ucsc.edu

Dear Carlos and Gert,

Thank you very much for sharing your troubleshooting regarding the Dropbox problem. We have contacted Dropbox about the changes and hope to hear a response soon. Our engineers are also looking into this development to try to discover a workaround for our users.

Until there is a resolution, an Amazon S3 bucket might be an option. Amazon offers one year of free service within certain usage limits:

http://aws.amazon.com/free/faqs/

http://havecamerawilltravel.com/tidbits/how-to-allow-public-access-to-an-amazon-s3-bucket/

http://www.3hubapp.com/

Thank you again for your detailed help, it is greatly appreciated. We will contact you if we learn of any solution. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian

Gert Hulselmans

Mar 5, 2014, 8:04:16 AM

to Brian Lee, gen...@soe.ucsc.edu

Dear Brian,

Apparently Sean s already found and reported this "Range: bytes=0-" problem on the Dropbox forum.

It might be useful to mention this report in your own bug report too:

https://forums.dropbox.com/topic.php?id=112637#post-596782

I noticed an issue: when Dropbox receives a download request with an HTTP header of Range: bytes=0-, it will block after reading the first byte. If you use bytes=0-100 (or some other ending value), then it will work, but it fails when the end is not specified. Likewise, it also fails if you use bytes=100- (which should start downloading at a 100-byte offset and continue to the end, as is common for resumable downloads).

The following Java code snippet demonstrates the problem: the connection will time out after reading the first byte. (Just add an end range and see it work, or change to a non-Dropbox URL.)

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class TestURLConnection {
    public static void main(String args[]) throws IOException {
        String url = "http://dl.dropboxusercontent.com/u/408295/sageplugins/mediastreaming-war-file-1.3.7.59.zip";
        // Non-Dropbox URL for comparison:
        //String url = "https://github.com/stuckless/sagetv-phoenix-plex-channel/releases/download/v1.0.3/SageTVPhoenix-1.0.3-beta.zip";
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setInstanceFollowRedirects(true);
        // Open-ended range; change to e.g. "bytes=0-100" to see the request succeed.
        conn.setRequestProperty("Range", "bytes=0-");
        conn.setReadTimeout(10000);
        conn.setConnectTimeout(10000);
        conn.connect();
        System.out.println("  RESP: " + conn.getResponseCode());
        System.out.println("  MESG: " + conn.getResponseMessage());
        System.out.println("LENGTH: " + conn.getContentLengthLong());
        InputStream is = conn.getInputStream();
        // Read byte by byte, printing one dot per byte received.
        while (is.read()>=0) {
            System.out.print(".");
        }
        is.close();
        System.out.println("DONE");
    }
}