Failure to upload my hosted custom tracks


Rob White

May 15, 2025, 12:10:16 PM
to UCSC Genome Browser Public Support
I have hosted a number of old bed and bam files, with UCSC upload guidance, on my website (all URL details are at www.ebv.org.uk/Chip-seq.php ).  Sometime in the last year or two something has changed so that these files will no longer upload. The error messages range from failure to find the .bai file to partial upload errors (bed files with the wrong number of items in the line, or "unable to fetch X bytes"). Two bed files do sometimes upload, but I am worried that this is just because they are being truncated at the end of a line, so the failed upload is not recognised. I have been able to load the bam files into IGV just fine, so it is not a problem with the files.

Any suggestions as to why this error is now occurring? I did try switching between http and https in the URLs in case that helped (per another comment).

Thanks
Rob
ps: If feasible please copy reply to robert....@imperial.ac.uk as my gmail is infrequently checked.

Gerardo Perez

May 23, 2025, 1:51:30 PM
to Rob White, UCSC Genome Browser Public Support, robert....@imperial.ac.uk

Hello, Rob.

Thank you for your interest in the Genome Browser and for reporting your issue.

Do you have a network administrator we can contact? Our engineers have investigated the connection issues with your files, and we will need to work with your network admin to help resolve the problem.

You may also be interested in converting your custom track data into a track hub. Track hubs allow you to load a custom set of annotations for an organism we host and give you control over how those annotations are displayed, similar to our native tracks. You can learn more about track hubs here: http://genome.ucsc.edu/goldenPath/help/hgTrackHubHelp.html

If your track hub meets certain standards, you may also submit it to be included as a public hub (http://genome.ucsc.edu/cgi-bin/hgHubConnect#publicHubs). Here are guidelines that we share with users who are trying to list their hub as a public hub: https://genome.ucsc.edu/goldenPath/help/publicHubGuidelines.html.

I hope this is helpful. Please include gen...@soe.ucsc.edu in any replies to ensure visibility by the team. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute



White, Rob

Jun 2, 2025, 1:17:53 PM
to Gerardo Perez, Rob White, UCSC Genome Browser Public Support, Cope, Kenneth C, Web Infrastructure Team

Hi – Apologies for the delayed response.

 

Our network admin ran some tests for me. The details are copied below, and their contact emails are CCed so you can ask further questions.

 

Rob

 

Dr Rob White
Senior Lecturer In Virology
Imperial College London
Section of Virology

Sir Alexander Fleming Building
Imperial College Road
London
  SW7 2AZ

tel: 0207 594 1124

www.ebv.org.uk
www.imperial.ac.uk/people/robert.e.white/

 

Hi Rob,

 

   OK, so your http://www.ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bam file, like some of the others, is 1.4G in size. You are not throttled, as previously mentioned, but a file that size may be encountering timeouts anywhere between the server and the requester.

    With regard to http/https – yes,  all http traffic is redirected from http to https.  There are at least three layers on our side of things which will do that automatically. Four if you consider the HSTS policy on the traffic.  We don’t even pass the traffic between servers in the different tiers unencrypted for speed.  

    Following your instructions, if I paste in a track, change the URL to https, ignore the error and let it get processed, it appears that it’s complaining because it has tried to request an associated .bai file on http, and not https.  So, if there’s an issue with the protocol, it may be there.  The error page comes back very fast; it has certainly not had time to download the 1.4G file.  Looking at the logs, we can see the requests coming in as HTTP HEAD requests and returning successfully:


ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bam HTTP/1.0" 200 - "-" "genome.ucsc.edu/net.c" 979 759

ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bai HTTP/1.0" 301 - "-" "genome.ucsc.edu/net.c" 979 756

ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 200 - "-" "genome.ucsc.edu/net.c" 983 756
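To see at a glance which of the three probes was redirected, log lines like these can be parsed with a few lines of Python. This is only a sketch: the field layout is inferred from the excerpt above, `parse_line` is a hypothetical helper name, and the two trailing numbers on each line are not explained in the thread, so they are ignored.

```python
import re

# Field layout assumed from the log excerpt in this thread. The two trailing
# numeric fields are not explained there, so they are deliberately not captured.
LOG_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" (?P<status>\d{3}) '
)


def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


# The three HEAD requests quoted in the thread.
LINES = [
    'ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bam HTTP/1.0" 200 - "-" "genome.ucsc.edu/net.c" 979 759',
    'ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bai HTTP/1.0" 301 - "-" "genome.ucsc.edu/net.c" 979 756',
    'ebv.org.uk 146.179.33.29 - - [16/May/2025:09:53:29 +0100] "HEAD /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 200 - "-" "genome.ucsc.edu/net.c" 983 756',
]

for line in LINES:
    rec = parse_line(line)
    print(rec["status"], rec["method"], rec["protocol"], rec["path"])
```

Only the middle request (for the plain .bai name) comes back with a 301; the .bam and .bam.bai probes both return 200.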

 

    What is really interesting, though, is that the requests come in as HTTP/1.0, which stopped being widely used at the end of the 1990s (97/98?).  We really don’t see much traffic coming in on that, and I’m surprised, actually, that it wasn’t fingerprinted as being a bot of some kind.  HTTP/1.0 doesn’t require a Host header, because back then every website had its own IP address and needed its own web server.  Because of the way we have the load-balancers and proxies set up, the traffic has been successfully associated with the correct web site, but you’re lucky it’s not just getting sent to the default host each time, for whichever backend web server the request was sent to.
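As an illustration of the virtual-hosting point, here is a sketch of what a raw HEAD request looks like on the wire. The thread does not show which headers the Genome Browser's net.c client actually sends, so the Host header below is an assumption (an HTTP/1.0 client may still include one), and `build_head_request` is a hypothetical helper. The user-agent string is the one visible in the logs.

```python
def build_head_request(host, path, http_version="1.0",
                       user_agent="genome.ucsc.edu/net.c"):
    """Build the raw bytes of a HEAD request.

    The Host header is what lets a name-based virtual host setup pick the
    right site; including it with HTTP/1.0 is an assumption made for this
    illustration, not something confirmed in the thread.
    """
    lines = [
        f"HEAD {path} HTTP/{http_version}",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_head_request("ebv.org.uk", "/ChIPseq2/EBNA3A_FLAG.bam")
print(request.decode("ascii"))
```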

 

    That 301 redirect entry is not from our server configuration.  We don’t use 301 status codes – they are TOO permanent these days.  For our configuration, you’d likely see a 307, or a 302 status code used.

   If I request that URL from the command line, so we can track what’s happening better, it seems to redirect to the .bam file(!)


#> wget -S -O /dev/null http://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bai
[snip]

HTTP request sent, awaiting response...

  HTTP/1.1 301 Moved Permanently

  date: Fri, 16 May 2025 09:19:41 GMT

  location: https://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bam

  content-length: 251

[snip]
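The redirect above changes two things at once: the scheme (http to https) and the file being requested (.bai to .bam). A small sketch that separates the two, using only the URLs shown in the wget output; `describe_redirect` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def describe_redirect(requested_url, location):
    """Report which parts of the URL a redirect changed."""
    req, loc = urlsplit(requested_url), urlsplit(location)
    return {
        "scheme_changed": req.scheme != loc.scheme,
        "host_changed": req.netloc != loc.netloc,
        "path_changed": req.path != loc.path,
    }


# URLs taken from the wget transcript above.
result = describe_redirect(
    "http://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bai",
    "https://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bam",
)
print(result)
```

A plain http-to-https upgrade would change only the scheme; a changed path means the client is being sent to a different file.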

 

   If I copy your .bam.bai file to just a .bai extension, and try the form, then it gets further but then still tries to download the .bam.bai file(!) and then complains about a download error. So, it seems to test for the .bai file, and then not use it.

   But from the command line it works just fine

# wget -S  http://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bai

URL transformed to HTTPS due to an HSTS policy

--2025-05-16 10:14:44--  https://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bai

Resolving ebv.org.uk (ebv.org.uk)... 2a0c:5bc0:88:100:1::16, 2a0c:5bc0:80:100:1::16, 146.179.12.79, ...

Connecting to ebv.org.uk (ebv.org.uk)|2a0c:5bc0:88:100:1::16|:443... connected.

HTTP request sent, awaiting response...

  HTTP/1.1 200 OK

  date: Fri, 16 May 2025 09:14:44 GMT

  upgrade: h2

  connection: Upgrade

  last-modified: Fri, 11 Mar 2016 09:36:03 GMT

  accept-ranges: bytes

  content-length: 6550216

  strict-transport-security: max-age=15768000; includeSubDomains

  x-xss-protection: 1; mode=block

  cache-control: private

Length: 6550216 (6.2M)

Saving to: ‘EBNA3A_FLAG.bai’

 

EBNA3A_FLAG.bai  100%[=========================================================>]   6.25M  --.-KB/s    in 0.02s

 

2025-05-16 10:14:44 (402 MB/s) - ‘EBNA3A_FLAG.bai’ saved [6550216/6550216]




#> wget -S  http://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bam.bai

URL transformed to HTTPS due to an HSTS policy

--2025-05-16 10:27:29--  https://ebv.org.uk/ChIPseq2/EBNA3A_FLAG.bam.bai

Resolving ebv.org.uk (ebv.org.uk)... 2a0c:5bc0:88:100:1::16, 2a0c:5bc0:80:100:1::16, 146.179.12.79, ...

Connecting to ebv.org.uk (ebv.org.uk)|2a0c:5bc0:88:100:1::16|:443... connected.

HTTP request sent, awaiting response...

  HTTP/1.1 200 OK

  date: Fri, 16 May 2025 09:27:29 GMT

  upgrade: h2

  connection: Upgrade

  last-modified: Fri, 11 Mar 2016 09:36:03 GMT

  accept-ranges: bytes

  content-length: 6550216

  strict-transport-security: max-age=15768000; includeSubDomains

  x-xss-protection: 1; mode=block

  set-cookie: SR=12.02.4; path=/; HttpOnly; Secure

  cache-control: private

Length: 6550216 (6.2M)

Saving to: ‘EBNA3A_FLAG.bam.bai’

 

EBNA3A_FLAG.bam.bai     100%[========================>]   6.25M  --.-KB/s    in 0.02s

 

2025-05-16 10:27:29 (303 MB/s) - ‘EBNA3A_FLAG.bam.bai’ saved [6550216/6550216]




 

   The requests we get from the remote web site are split up: they come in as HTTP range requests. See below; the 206 HTTP status code indicates a successful partial-content response.

 

ebv.org.uk 146.179.33.29 - - [16/May/2025:10:13:05 +0100] "GET /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 206 979656 "-" "genome.ucsc.edu/net.c" 1005 983154

ebv.org.uk 146.179.33.29 - - [16/May/2025:10:13:19 +0100] "GET /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 206 979656 "-" "genome.ucsc.edu/net.c" 1005 983154

ebv.org.uk 146.179.33.29 - - [16/May/2025:10:13:19 +0100] "GET /ChIPseq2/EBNA3A_FLAG.bam.bai HTTP/1.0" 206 717512 "-" "genome.ucsc.edu/net.c" 1005 720284

 

   Range requests are what you’d expect – the request will be for the first n bytes of the file, the second for n -> n1, then  n1 -> n2, and so on until the content has all been sent.
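The splitting described above can be sketched as follows. The chunk size is a made-up value for illustration; the thread does not say what chunk size the Genome Browser uses, and the logged response sizes are not reproduced here. The file size is the 6550216 bytes reported for EBNA3A_FLAG.bam.bai.

```python
def byte_ranges(total_size, chunk_size):
    """Yield inclusive (start, end) byte offsets that cover the whole file."""
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_header(start, end):
    """Format the value of an HTTP Range request header (offsets are inclusive)."""
    return f"bytes={start}-{end}"


FILE_SIZE = 6550216      # content-length of EBNA3A_FLAG.bam.bai from the transcript
CHUNK_SIZE = 1024 * 1024  # hypothetical 1 MiB chunk, chosen only for illustration

for start, end in byte_ranges(FILE_SIZE, CHUNK_SIZE):
    print("Range:", range_header(start, end))
```

A server that honours each request answers with status 206 and only the requested bytes.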

 

   I’ve reverted my changes, but as a first step, I’d suggest copying/moving your .bam.bai to just .bai, as in my test above, and then continuing from there.  Both sets of errors seem to be related to the pages at genome-euro.ucsc.edu, and not the hosting here.

 

    No errors are being thrown on the server side for any of these requests, whether successful or failed.  I suspect the issues are down to the remote site using protocols that are twenty years obsolete, but the information above may help when conversing with the site owners.

 

Best regards,

 

    Kenneth

 

 

From: White, Rob <robert....@imperial.ac.uk>
Sent: 16 May 2025 08:58
To: ASK Imperial <a...@imperial.ac.uk>
Cc: Cope, Kenneth C <kennet...@imperial.ac.uk>; Darvell, Iain <i.da...@imperial.ac.uk>; Miller, Austin J <a.mi...@imperial.ac.uk>; Seaman, Ian R <i.se...@imperial.ac.uk>
Subject: Re: CS0646467 - Please reopen CS0633618:Is data upload from my web farm website ebv.org.uk being throttled?

 

Hi Kenneth

 

Instructions for reproducing my issue:

Go to https://genome-euro.ucsc.edu/cgi-bin/hgGateway

From drop-down, select Feb2009 (GRCh37/hg19) and click go. 

[BCL2L11 in the gene ID will get somewhere that I expect to see a signal in the data, but should not matter for reproducing the upload issue]

From the buttons below the visualisation, click “add custom tracks”. 

You can then paste the URLs from the website https://www.epstein-barrvirus.org.uk/ChIP-seq.php - any of the green text URLs should work, either with or without the qualifiers. 

Two of the files have worked for me (the bottom two in the West Lab list), although I do not know if it just happened to throttle at the end of a line, so was approved, but as a partial upload. The rest failed with errors. 

 

I looked on their forum yesterday, and the only similar issue with any resolution was to do with how the hosting service was using redirects for http and https. (https://groups.google.com/a/soe.ucsc.edu/g/genome/c/Cf4T2Bg-X90/m/kyzMzq_oBwAJ for details). However I have tried https in the address and it didn’t help. [Also posted a help request there in case they can diagnose it].

 

Thanks. 

Rob

 


Galt Barber

Jun 2, 2025, 10:22:37 PM
to White, Rob, Gerardo Perez, Rob White, UCSC Genome Browser Public Support, Cope, Kenneth C, Web Infrastructure Team
Thanks for the detailed information. It was a big help.


I recommend the following actions:

1. Remove this redirection. It is not needed,
and worse, it causes the system to redirect an index file to a data file,
which is not logical as they are two distinct files with different extensions
and types.

[hgwdev:tests> curl -I 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bai'
HTTP/1.1 301 Moved Permanently
date: Tue, 03 Jun 2025 01:38:12 GMT
location: https://ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam

This must be coming from your own web server, because that is where redirects
are created, maintained and controlled.

2. Keep the index files as .bam.bai since that is the standard.
No need to rename the index files to .bai.
When you observe us fetching .bai, it is testing for the old way of doing things:
we probe first for file.bai, and then file.bam.bai, for backwards compatibility.

3. Keep the http to https redirect, or better still just update your
main page to use https directly in the URL given to users, under bigDataUrl=https://.

There is a track setting that probably works in custom tracks too,

bigDataIndex=https://EBNA3C_Rep1.sorted.bam.bai

and it might be handy for very limited web servers,
but you do not need it.

4. Something odd happens sometimes where your server cuts off the connection to us,
dropping the connection. Perhaps if you can easily see the result,
you can try settings until the problem goes away.

If I repeat this a few times it begins failing, even for this small 2.1MB file.

wget -v --tries=1 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai'

This should run every single time without fails and without retries.

I also see this error with curl:

curl 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai' > /dev/null
curl: (56) Recv failure: Connection reset by peer
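The probing order from point 2 and the https advice from point 3 can be sketched like this. It is an illustration of the behaviour described in this message, not the Genome Browser's actual code; `index_candidates` and `track_line` are hypothetical helper names, and the track name is made up.

```python
def index_candidates(bam_url):
    """Return the index URLs in the order described above:
    first the older file.bai form, then the standard file.bam.bai."""
    if not bam_url.endswith(".bam"):
        raise ValueError("expected a URL ending in .bam")
    stem = bam_url[: -len(".bam")]
    return [stem + ".bai", bam_url + ".bai"]


def track_line(name, bam_url):
    """Build a BAM custom track line that points at the https URL directly."""
    return f'track type=bam name="{name}" bigDataUrl={bam_url}'


BAM_URL = "https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam"

for url in index_candidates(BAM_URL):
    print(url)
print(track_line("EBNA3C Rep1", BAM_URL))  # track name is a made-up example
```

With the stray redirect removed, the first candidate should simply not be found and the second (.bam.bai) is the one that gets used.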



On Mon, 2 Jun 2025 at 10:18, White, Rob <robert....@imperial.ac.uk> wrote:

Cope, Kenneth C

Jun 17, 2025, 1:11:17 PM
to Galt Barber, White, Rob, Gerardo Perez, Rob White, UCSC Genome Browser Public Support, Web Infrastructure Team

Hi Galt,

    We’ve tracked down the redirect for Rob and removed that.  With regard to the interrupted downloads, we are unable to replicate that, within the UK or without, with direct requests, either in browsers, or using wget, or curl.  I noted in my previous tests, which Rob forwarded, that the incoming connections are HTTP/1.0; this HTTP version had much less resiliency for connections.  It’s possible that there’s something there that might be having an effect(?)

    As far as we can tell, it’s only you, or connections from your website, that are failing with dropped connections.  Are you able to do some tests and let us know the timestamps of the ones that fail, so that we can check the different layers of our infrastructure and see if there’s anything logged?  As I’ve mentioned to Rob, off this email chain, a “connection reset by peer” error can indicate that either side was the one that dropped the connection, not just the side that’s remote to the tester.

 

Best regards,

 

    Kenneth

 

Kenneth C. Cope (He/Him)
Server Engineer,   (Web Infrastructure),

Information & Communication Technologies
Imperial College London
White City

www.imperial.ac.uk/admin-services/ict/

White, Rob

Jun 17, 2025, 1:12:07 PM
to Cope, Kenneth C, Galt Barber, Gerardo Perez, Rob White, UCSC Genome Browser Public Support, Web Infrastructure Team

Hi Galt

 

Just a little extra context: I have been able to access the bam files at their URL using the URL option in IGV (which you could also try, to test whether it is something to do with your domain arguing with ours, and not just the UCSC genome browser). Also, the issue replicates using the European mirror site.

Jairo Navarro Gonzalez

Jun 27, 2025, 8:38:02 PM
to White, Rob, Cope, Kenneth C, Galt Barber, Gerardo Perez, Rob White, UCSC Genome Browser Public Support, Web Infrastructure Team

Hello,

Thank you for using the UCSC Genome Browser and sending your follow-up.

It looks like you have fixed the URL to use HTTPS directly in the bigDataUrl instead of HTTP with redirection. You also added the filename extension (.bai) to the URL as requested.

i.e., you are using EBNA3C_Rep1.sorted.bam.bai, and not EBNA3C_Rep1.sorted.bai, with redirect to EBNA3C_Rep1.sorted.bam.bai

However, their server is still cutting off the data by closing the socket in the middle of the download. Most http(s) clients just quit. The wget command supports retries, but the curl command fails, and our binary libraries also fail.

e.g., fetchUrlTest

Using traceroute and ping did not turn up anything obviously wrong. Some example error messages we see are:

curl: (56) Recv failure: Connection reset by peer

[root@www ~]# curl -O 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai'
curl: (56) OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 104

[hgwdev:tests> fetchUrlTest 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai' > foo.galt
Error reading SSL connection
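Because the failures are intermittent, a single successful download does not rule them out. Below is a sketch of a repeat-download check that records which attempts fail, so the attempt times can be matched against server logs. `fetch_once` and `count_failures` are hypothetical helper names; the expected size of 2122264 bytes is the size of EBNA3C_Rep1.sorted.bam.bai reported elsewhere in this thread.

```python
import http.client
import urllib.request


def fetch_once(url, timeout=30):
    """Download the whole body once, with no retries, and return its length."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def count_failures(fetch, url, attempts=10, expected_size=None):
    """Call fetch(url) repeatedly; return a list of (attempt_number, reason)."""
    failures = []
    for attempt in range(1, attempts + 1):
        try:
            size = fetch(url)
        except (OSError, http.client.HTTPException) as exc:
            # Covers connection resets, TLS read errors, URL errors and
            # incomplete reads.
            failures.append((attempt, f"{type(exc).__name__}: {exc}"))
            continue
        if expected_size is not None and size != expected_size:
            failures.append((attempt, f"short read: {size} of {expected_size} bytes"))
    return failures
```

For example, `count_failures(fetch_once, "https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai", attempts=20, expected_size=2122264)` would return an empty list if every attempt completes. The result depends on where the check is run from, which is the open question in this thread.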

Are you using Apache or nginx for your web server?

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.


If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser


Cope, Kenneth C

Jul 28, 2025, 1:43:35 PM
to Jairo Navarro Gonzalez, White, Rob, Galt Barber, Gerardo Perez, Rob White, UCSC Genome Browser Public Support, Web Infrastructure Team

Hi Jairo,

 

    Rob has asked me to reply to you directly.  Please note that I’m not his web developer but one of the system administrators for the cluster the site sits on. 

    We’ve not been able to find any server-side issues, and when we’ve done remote tests have not been able to find any issues with downloads either.  Here’s a (slightly edited) copy of my last communication with him in response to his reply to your email.  We have done additional tests for the site, similar to the one below, with the same results.

   We are using Apache, and here are the tests from what was mentioned.  I use a personal VPS for this. So, not from the Imperial network, though those tests came out the same.


$> curl -O 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current

                                 Dload  Upload   Total   Spent    Left  Speed

100 2072k  100 2072k    0     0  6862k      0 --:--:-- --:--:-- --:--:-- 6862k

 

$> ls -l EBNA3C_Rep1.sorted.bam.bai

-rw-r--r--. 1 kenneth kenneth 2122264 Jul 21 11:49 EBNA3C_Rep1.sorted.bam.bai

 

$> file  EBNA3C_Rep1.sorted.bam.bai

EBNA3C_Rep1.sorted.bam.bai: SAMtools BAI (BAM indexing format), with 25 reference sequences

 

 

 

$> curl -O 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai' > foo.galt

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current

                                 Dload  Upload   Total   Spent    Left  Speed

100 2072k  100 2072k    0     0  8002k      0 --:--:-- --:--:-- --:--:-- 8002k

 

 

$> less foo.galt

 

$> ls -l foo.galt

-rw-r--r--. 1 kenneth kenneth 0 Jul 21 11:49 foo.galt

 

 

$> wget -S -O foo.galt 'https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai'

--2025-07-21 11:50:43--  https://www.ebv.org.uk/ChIPseq2/EBNA3C_Rep1.sorted.bam.bai

Resolving www.ebv.org.uk (www.ebv.org.uk)... 146.179.42.79, 2a0c:5bc0:88:100:1::16

Connecting to www.ebv.org.uk (www.ebv.org.uk)|146.179.42.79|:443... connected.

HTTP request sent, awaiting response...

  HTTP/1.1 200 OK

  date: Mon, 21 Jul 2025 10:50:43 GMT

  upgrade: h2

  connection: Upgrade

  last-modified: Tue, 04 Aug 2015 18:03:39 GMT

  accept-ranges: bytes

  content-length: 2122264

  strict-transport-security: max-age=15768000; includeSubDomains

  x-xss-protection: 1; mode=block

  set-cookie: SR=12.02.4; path=/; HttpOnly; Secure

  cache-control: private

Length: 2122264 (2.0M)

Saving to: ‘foo.galt’

 

foo.galt            100%[===============>]   2.02M  13.2MB/s    in 0.2s

 

2025-07-21 11:50:43 (13.2 MB/s) - ‘foo.galt’ saved [2122264/2122264]

 

$> file foo.galt

foo.galt: SAMtools BAI (BAM indexing format), with 25 reference sequences


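    So, command 1 (the plain curl -O) worked as expected.  I don’t think command 4 (curl -O with its output redirected to foo.galt) was ever going to work, as that shell redirect would not capture any content.  Command 7 (wget) is simply the same test with a different program, as I don’t have fetchUrlTest, whatever that is.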

    None of them had any issues.

    At the moment, all of the tests we’ve done seem ok.  This doesn’t look like a server-side issue.

 

Best regards,

 

    Kenneth


Jairo Navarro Gonzalez

Aug 5, 2025, 2:21:11 PM
to Cope, Kenneth C, White, Rob, Galt Barber, Gerardo Perez, Rob White, UCSC Genome Browser Public Support, Web Infrastructure Team

Hello,

Thank you for providing the extra information.

Unfortunately, we are unable to resolve your server configuration to work with the UCSC Genome Browser. In the near future, we will be releasing Hub Space to provide users 10 GB of storage space for free. The feature is still in development, but if you are interested in testing, please use our development server:

https://genome-test.gi.ucsc.edu/cgi-bin/hgHubConnect#hubUpload

Any feedback you have on using the feature would be greatly appreciated. We will be removing all files after the feature is released to clear up storage space on the development server, so please keep backups of your files.

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser
