Hello, Rob.
Thank you for your interest in the Genome Browser and for reporting your issue.
Do you have a network administrator we can contact? Our engineers have investigated the connection issues with your files, and we will need to work with your network admin to help resolve the problem.
You may also be interested in converting your custom track data into a track hub. Track hubs allow you to load a custom set of annotations for an organism we host and give you control over how those annotations are displayed, similar to our native tracks. You can learn more about track hubs here: http://genome.ucsc.edu/goldenPath/help/hgTrackHubHelp.html
If your track hub meets certain standards, you may also submit it to be included as a public hub (http://genome.ucsc.edu/cgi-bin/hgHubConnect#publicHubs). Here are guidelines that we share with users who are trying to list their hub as a public hub: https://genome.ucsc.edu/goldenPath/help/publicHubGuidelines.html.
I hope this is helpful. Please include gen...@soe.ucsc.edu in any replies to ensure visibility by the team. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
Gerardo Perez
UCSC Genomics Institute
Hi – Apologies for the delayed response.
Our network admin ran some tests for me. The details are copied below, and their contact emails are CCed so you can ask further questions.
Rob
Dr Rob White
Senior Lecturer In Virology
Imperial College London
Section of Virology
Sir Alexander Fleming Building
Imperial College Road
London SW7 2AZ
tel: 0207 594 1124
www.ebv.org.uk
www.imperial.ac.uk/people/robert.e.white/
Hi Rob,
OK, so your http://www.ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bam file, like some of the others, is 1.4G in size. You are not throttled, as previously mentioned, but a file that size may be encountering timeouts anywhere between the server and the requester.
With regard to http/https – yes, all http traffic is redirected from http to https. There are at least three layers on our side of things which will do that automatically. Four if you consider the HSTS policy on the traffic. We don’t even pass the traffic between servers in the different tiers unencrypted for speed.
Following your instructions, if I paste in a track and then change the URL to https, and ignore the error and let it get processed, it appears that it’s complaining because it’s tried to request an associated .bai file – on http, and not https. So, if there’s an issue with the protocol, it may be there. The error page comes back very fast – it’s certainly not had time to download the 1.4G file. Looking at the logs, we can see the requests coming in as HTTP HEAD requests and returning successfully:
ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bam HTTP/1.0" 200 - "-" "genome.ucsc.edu/net.c" 979 759
ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bai HTTP/1.0" 301 - "-" "genome.ucsc.edu/net.c" 979 756
ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 200 - "-" "genome.ucsc.edu/net.c" 983 756
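For anyone checking a longer log by hand, entries like these can be filtered mechanically. The following is only a minimal sketch: the field layout (virtual host, client IP, two placeholders, timestamp, request line, status, size, referer, user agent) is inferred from the entries quoted in this message rather than from the server's actual log configuration, the trailing numeric fields are ignored because their meaning is not stated here, and the names `parse_entry` and `non_200_requests` are hypothetical.

```python
import re

# Assumed layout, inferred from the quoted log entries only.
LOG_PATTERN = re.compile(
    r'(?P<vhost>\S+) (?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_entry(line):
    """Return the named fields of one log entry, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def non_200_requests(lines):
    """Return (path, status) pairs for every parsed entry whose status is not 200."""
    results = []
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry["status"] != "200":
            results.append((entry["path"], int(entry["status"])))
    return results
```

Run over the three HEAD entries, this would single out the request for the .bai file, the only one that was answered with a redirect.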
What is really interesting, though, is that the requests come in as HTTP/1.0 – which stopped being widely used at the end of the 1990s (97/98?). We really don’t see much traffic coming in on that, and I’m surprised, actually, that it wasn’t fingerprinted as being a bot of some kind. HTTP/1.0 doesn’t require a Host header, because back then every website had its own IP address and needed its own web server. Because of the way we have the load balancers and proxies set up, the traffic has been successfully associated with the correct website, but you’re lucky it’s not just getting sent to the default host each time, for whichever backend web server the request was sent to.
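To make the Host header point concrete, here is a minimal sketch of what such a request looks like on the wire. An HTTP/1.0 request is valid without a Host header, while HTTP/1.1 requires one; an HTTP/1.0 client may still choose to send it. The access logs quoted here do not record request headers, so whether genome.ucsc.edu/net.c sends one is not established by this message. The function name `build_head_request` is hypothetical.

```python
def build_head_request(host, path, http_version="1.0", include_host=True):
    """Build the raw bytes of a HEAD request.

    With include_host=False and http_version="1.0" this is still a valid
    request, but a server hosting many sites on one IP address cannot tell
    from the request alone which site is meant.
    """
    request_lines = [f"HEAD {path} HTTP/{http_version}"]
    if include_host:
        request_lines.append(f"Host: {host}")
    # A blank line terminates the header section.
    return ("\r\n".join(request_lines) + "\r\n\r\n").encode("ascii")
```

Calling `build_head_request("ebv.org.uk", "/ChIPseq2/EBNA3A_FLAG.bam", include_host=False)` gives the bare form, which names a path but no site.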
That 301 redirect entry is not from our server configuration. We don’t use 301 status codes – they are TOO permanent these days. For our configuration, you’d likely see a 307 or a 302 status code used.
If I request that URL from the command line, so we can track what’s happening better, it seems to redirect to the .bam file(!):
#> wget -S -O /dev/null http://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bai
HTTP request sent, awaiting response...
  HTTP/1.1 301 Moved Permanently
  date: Fri, 16 May 2025 09:19:41 GMT
  location: https://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bam
  content-length: 251
[snip]
If I copy your .bam.bai file to just a .bai extension, and try the form, then it gets further but then still tries to download the .bam.bai file(!) and then complains about a download error. So, it seems to test for the .bai file, and then not use it.
But from the command line it works just fine:
# wget -S http://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bai
URL transformed to HTTPS due to an HSTS policy
--2025-05-16 10:14:44-- https://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bai
Resolving ebv.org.uk (ebv.org.uk)... 2a0c:5bc0:88:100:1::16, 2a0c:5bc0:80:100:1::16, 146.179.12.79, ...
Connecting to ebv.org.uk (ebv.org.uk)|2a0c:5bc0:88:100:1::16|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  date: Fri, 16 May 2025 09:14:44 GMT
  upgrade: h2
  connection: Upgrade
  last-modified: Fri, 11 Mar 2016 09:36:03 GMT
  accept-ranges: bytes
  content-length: 6550216
  strict-transport-security: max-age=15768000; includeSubDomains
  x-xss-protection: 1; mode=block
  cache-control: private
Length: 6550216 (6.2M)
Saving to: ‘EBNA3A_FLAG.bai’
EBNA3A_FLAG.bai 100%[=========================================================>] 6.25M --.-KB/s in 0.02s
2025-05-16 10:14:44 (402 MB/s) - ‘EBNA3A_FLAG.bai’ saved [6550216/6550216]

URL transformed to HTTPS due to an HSTS policy
--2025-05-16 10:27:29-- https://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bam.bai
Resolving ebv.org.uk (ebv.org.uk)... 2a0c:5bc0:88:100:1::16, 2a0c:5bc0:80:100:1::16, 146.179.12.79, ...
Connecting to ebv.org.uk (ebv.org.uk)|2a0c:5bc0:88:100:1::16|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  date: Fri, 16 May 2025 09:27:29 GMT
  upgrade: h2
  connection: Upgrade
  last-modified: Fri, 11 Mar 2016 09:36:03 GMT
  accept-ranges: bytes
  content-length: 6550216
  strict-transport-security: max-age=15768000; includeSubDomains
  x-xss-protection: 1; mode=block
  set-cookie: SR=12.02.4; path=/; HttpOnly; Secure
  cache-control: private
Length: 6550216 (6.2M)
Saving to: ‘EBNA3A_FLAG.bam.bai’
EBNA3A_FLAG.bam.bai 100%[========================>] 6.25M --.-KB/s in 0.02s
2025-05-16 10:27:29 (303 MB/s) - ‘EBNA3A_FLAG.bam.bai’ saved [6550216/6550216]
The requests we get from the remote website are split – we get HTTP range requests. See below; the 206 HTTP status code indicates a successful response with partial content:
ebv.org.uk 146.179.33.29 - - [16/May/2025:10:13:05 +0100] "GET /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 206 979656 "-" "genome.ucsc.edu/net.c" 1005 983154
ebv.org.uk 146.179.33.29 - - [16/May/2025:10:13:19 +0100] "GET /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 206 979656 "-" "genome.ucsc.edu/net.c" 1005 983154
ebv.org.uk 146.179.33.29 - - [16/May/2025:10:13:19 +0100] "GET /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 206 717512 "-" "genome.ucsc.edu/net.c" 1005 720284
Range requests are what you’d expect – the first request will be for the first n bytes of the file, the second for n -> n1, then n1 -> n2, and so on until the content has all been sent.
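As a rough illustration of how a client might split a file into consecutive ranges, here is a minimal sketch. The chunk size and total used in the usage note are taken from the byte counts in the three 206 entries quoted in this message; whether those three requests were actually contiguous, and how the remote client really chooses its ranges, are assumptions. The function name `plan_ranges` is hypothetical.

```python
def plan_ranges(total_size, chunk_size):
    """Return the Range header values needed to cover total_size bytes.

    Byte positions in a Range header are zero-based and inclusive, so the
    first chunk_size bytes are requested as bytes=0-(chunk_size - 1).
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

For example, `plan_ranges(2676824, 979656)` yields three ranges of 979656, 979656 and 717512 bytes, which are the sizes seen in the three log entries.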
I’ve reverted my changes, but as a first step, I’d suggest copying/moving your .bam.bai to just .bai, as in my test above, and then continuing from there. Both sets of errors seem to be related to the pages at genome-euro.ucsc.edu, and not the hosting here.
There are no errors being thrown on the server side for any of these requests, successful ones or failures. I suspect the issues are down to the remote site using protocols that are twenty years obsolete, but the information above may help when conversing with the site owners.
Best regards,
Kenneth
From: White, Rob <robert....@imperial.ac.uk>
Sent: 16 May 2025 08:58
To: ASK Imperial <a...@imperial.ac.uk>
Cc: Cope, Kenneth C <kennet...@imperial.ac.uk>; Darvell, Iain <i.da...@imperial.ac.uk>; Miller, Austin J <a.mi...@imperial.ac.uk>; Seaman, Ian R <i.se...@imperial.ac.uk>
Subject: Re: CS0646467 - Please reopen CS0633618:Is data upload from my web farm website ebv.org.uk being throttled?
Hi Kenneth
Instructions for reproducing my issue:
Go to https://genome-euro.ucsc.edu/cgi-bin/hgGateway
From drop-down, select Feb2009 (GRCh37/hg19) and click go.
[BCL2L11 in the gene ID will get somewhere that I expect to see a signal in the data, but should not matter for reproducing the upload issue]
From the buttons below the visualisation, click “add custom tracks”.
You can then paste the URLs from the website https://www.epstein-barrvirus.org.uk/ChIP-seq.php - any of the green text URLs should work, either with or without the qualifiers.
Two of the files have worked for me (the bottom two in the West Lab list), although I do not know if it just happened to throttle at the end of a line, so was approved, but as a partial upload. The rest failed with errors.
I looked on their forum yesterday, and the only similar issue with any resolution was to do with how the hosting service was using redirects for http and https. (https://groups.google.com/a/soe.ucsc.edu/g/genome/c/Cf4T2Bg-X90/m/kyzMzq_oBwAJ for details). However I have tried https in the address and it didn’t help. [Also posted a help request there in case they can diagnose it].
Thanks.
Rob
From: Gerardo Perez <gpe...@ucsc.edu>
Date: Friday, 23 May 2025 at 18:51
To: Rob White <robert...@gmail.com>
Cc: UCSC Genome Browser Public Support <gen...@soe.ucsc.edu>, White, Rob <robert....@imperial.ac.uk>
Subject: Re: [genome] Failure to upload my hosted custom tracks
Hi Galt,
We’ve tracked down the redirect for Rob and removed it. With regard to the interrupted downloads, we are unable to replicate them, within the UK or without, with direct requests, either in browsers or using wget or curl. I noted in my previous tests, which Rob forwarded, that the incoming connections are HTTP/1.0; this HTTP version had much less resiliency for connections. It’s possible that there’s something there that might be having an effect(?)
As far as we can tell, it’s only you, or connections from your website, that are failing with dropped connections. Are you able to do some tests and let us know the timestamps of the ones that fail, so that we can check the different layers of our infrastructure and see if there’s anything logged? As I’ve mentioned to Rob, off this email chain, a “connection reset by peer” error can indicate that either side was the one that dropped the connection – not just the side that’s remote to the tester.
Best regards,
Kenneth
Kenneth C. Cope (He/Him)
Server Engineer, (Web Infrastructure),
Information & Communication Technologies
Imperial College London
White City
www.imperial.ac.uk/admin-services/ict/
Hi Galt
Just a little extra context: I have been able to access the bam files at their URL using the URL option in IGV (which you could also try, to test whether it is something to do with your domain arguing with ours, and not just the UCSC genome browser). Also, the issue replicates using the European mirror site.
Hello,
Thank you for using the UCSC Genome Browser and sending your follow-up.
It looks like you have fixed the URL to use HTTPS directly in the bigDataUrl instead of HTTP with redirection. You also added the filename extension (.bai) to the URL as requested.
i.e., you are using EBNA3C_Rep1.sorted.bam.bai, and not EBNA3C_Rep1.sorted.bai with a redirect to EBNA3C_Rep1.sorted.bam.bai
However, their server is still cutting off the data by closing the socket in the middle of the download. Most http(s) clients just quit. The wget command supports retries, but the curl command fails, and our binary libraries also fail.
e.g., fetchUrlTest
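To illustrate the difference between a client that quits and one that retries, here is a minimal sketch of resuming a transfer from the last byte received. It is not how wget, curl, or our libraries are implemented; `download_with_resume` and the `fetch_from` callback are hypothetical names, and the callback stands in for whatever issues a range request starting at a given offset.

```python
def download_with_resume(fetch_from, total_size, max_retries=5):
    """Collect total_size bytes, resuming after dropped connections.

    fetch_from(offset) is a caller-supplied function (hypothetical) that
    returns the bytes it managed to read starting at offset, which may be
    fewer than requested, or raises ConnectionError if the socket was
    closed before any data arrived.
    """
    received = bytearray()
    consecutive_failures = 0
    while len(received) < total_size:
        try:
            chunk = fetch_from(len(received))
        except ConnectionError:
            chunk = b""
        if chunk:
            received.extend(chunk)
            consecutive_failures = 0  # progress was made, so reset the counter
        else:
            consecutive_failures += 1
            if consecutive_failures > max_retries:
                raise ConnectionError(
                    f"gave up at byte {len(received)} of {total_size}"
                )
    return bytes(received[:total_size])
```

A client without this kind of loop reports an error on the first dropped connection, which would match the behaviour described above for clients that do not retry.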
Using traceroute and ping did not turn up anything obviously wrong. Some example error messages we see are:
Are you using Apache or nginx for your web server?
I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
Jairo Navarro
UCSC Genome Browser
Hi Jairo,
Rob has asked me to reply to you directly. Please note that I’m not his web developer but one of the system administrators for the cluster the site sits on.
We’ve not been able to find any server-side issues, and when we’ve done remote tests we have not been able to find any issues with downloads either. Here’s a (slightly edited) copy of my last communication with him in response to his reply to your email. We have done additional tests for the site, similar to the one below, with the same results.
We are using Apache, and here are the tests from what was mentioned. I used a personal VPS for these, so they were not run from the Imperial network, though the tests from there came out the same.
1  $> curl -O 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai'
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100 2072k  100 2072k    0     0  6862k      0 --:--:-- --:--:-- --:--:-- 6862k
2  $> ls -l EBNA3C_Rep1.sorted.bam.bai
   -rw-r--r--. 1 kenneth kenneth 2122264 Jul 21 11:49 EBNA3C_Rep1.sorted.bam.bai
3  $> file EBNA3C_Rep1.sorted.bam.bai
   EBNA3C_Rep1.sorted.bam.bai: SAMtools BAI (BAM indexing format), with 25 reference sequences
4  $> curl -O 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai' > foo.galt
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100 2072k  100 2072k    0     0  8002k      0 --:--:-- --:--:-- --:--:-- 8002k
5  $> less foo.galt
6  $> ls -l foo.galt
   -rw-r--r--. 1 kenneth kenneth 0 Jul 21 11:49 foo.galt
7  $> wget -S -O foo.galt 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai'
   --2025-07-21 11:50:43-- https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai
   Resolving www.ebv.org.uk (www.ebv.org.uk)... 146.179.42.79, 2a0c:5bc0:88:100:1::16
   Connecting to www.ebv.org.uk (www.ebv.org.uk)|146.179.42.79|:443... connected.
   HTTP request sent, awaiting response...
     HTTP/1.1 200 OK
     date: Mon, 21 Jul 2025 10:50:43 GMT
     upgrade: h2
     connection: Upgrade
     last-modified: Tue, 04 Aug 2015 18:03:39 GMT
     accept-ranges: bytes
     content-length: 2122264
     strict-transport-security: max-age=15768000; includeSubDomains
     x-xss-protection: 1; mode=block
     set-cookie: SR=12.02.4; path=/; HttpOnly; Secure
     cache-control: private
   Length: 2122264 (2.0M)
   Saving to: ‘foo.galt’
   foo.galt 100%[===============>] 2.02M 13.2MB/s in 0.2s
   2025-07-21 11:50:43 (13.2 MB/s) - ‘foo.galt’ saved [2122264/2122264]
8  $> file foo.galt
   foo.galt: SAMtools BAI (BAM indexing format), with 25 reference sequences
So, 1 worked as expected; I don’t think 4 is going to work anyway, as that shell redirect would not get any content; 7 is simply the same test with a different program, as I don’t have whatever fetchUrlTest is.
None of them had any issues.
At the moment, all of the tests we’ve done seem ok. This doesn’t look like a server-side issue.
Best regards,
Kenneth
Hello,
Thank you for providing the extra information.
Unfortunately, we are unable to resolve your server configuration to work with the UCSC Genome Browser. In the near future, we will be releasing Hub Space to provide users 10 GB of storage space for free. The feature is still in development, but if you are interested in testing, please use our development server:
https://genome-test.gi.ucsc.edu/cgi-bin/hgHubConnect#hubUpload
Any feedback you have on using the feature would be greatly appreciated. We will be removing all files after the feature is released to clear up storage space on the development server, so please keep backups of your files.
I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
Jairo Navarro
UCSC Genome Browser