Uploading GEO data into UCSC using http/ftp links

Michael Kosicki

Oct 17, 2012, 1:57:42 PM
to gen...@soe.ucsc.edu
I would like to be able to use processed data available on GEO (mostly
ChIP-seq experiments in bed or wig format) directly in UCSC without
the need to download/upload. This seems to have been possible two years
ago with a simple trick, as described here:
https://lists.soe.ucsc.edu/pipermail/genome/2010-June/022734.html.

E.g. pasting this:
http://www.ncbi.nlm.nih.gov/geosuppl/?acc=GSM540722&file=GSM540722%5FStat3il6WTTh17%2Ebedgraph%2Egz
into "add custom tracks" results in an error.

Brooke Rhead

Oct 17, 2012, 4:21:28 PM
to Michael Kosicki, gen...@soe.ucsc.edu
Hi Michael,

The problem is that the URL encoding
(http://www.w3schools.com/tags/ref_urlencode.asp) in the http link turns
the final ".gz" into "%2Egz", which keeps our custom track loader from
recognizing that this is a compressed file.

If you change the end of the URL from "bedgraph%2Egz" to "bedgraph.gz"
the track will load:

http://www.ncbi.nlm.nih.gov/geosuppl/?acc=GSM540722&file=GSM540722%5FStat3il6WTTh17%2Ebedgraph.gz
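
For anyone scripting this rather than editing links by hand, here is a minimal Python sketch of the same fix. It decodes only percent-escapes that stand for characters which never need escaping in a URL (letters, digits, "-", ".", "_", "~"), so an escaped "&" or "=" inside a query value is left alone. The name decode_unreserved is made up for illustration and is not part of any UCSC or GEO tool. Note that it also decodes %5F, which goes one step further than the hand-edited link; the assumption is that the loader accepts both forms.

```python
import re
import string

# Characters that RFC 3986 calls "unreserved": safe to write literally in a URL.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def decode_unreserved(url: str) -> str:
    """Decode %XX escapes only when they encode an unreserved character."""

    def _replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        # Keep the escape as-is if decoding could change the URL's meaning.
        return char if char in UNRESERVED else match.group(0)

    return re.sub(r"%([0-9A-Fa-f]{2})", _replace, url)


geo_link = (
    "http://www.ncbi.nlm.nih.gov/geosuppl/?acc=GSM540722"
    "&file=GSM540722%5FStat3il6WTTh17%2Ebedgraph%2Egz"
)
print(decode_unreserved(geo_link))
```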

I see there are also ftp links on the GEO page for GSM540722
(http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM540722), and the
ftp links don't have the URL encoding:

ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM540nnn/GSM540722/GSM540722_Stat3il6WTTh17.bedgraph.gz

You can just paste the ftp link instead.
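
The ftp path appears to follow a pattern: the accession with its last three digits replaced by "nnn", then the accession itself, then the file name. Below is a small Python sketch under that assumption. The pattern is generalized from this single example, not taken from GEO documentation, and geo_sample_ftp_url is a hypothetical helper.

```python
# Base directory copied from the ftp link for GSM540722.
GEO_SAMPLES_BASE = "ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples"


def geo_sample_ftp_url(accession: str, filename: str) -> str:
    """Build a GEO supplementary-file ftp link for one sample.

    Assumes the directory name is the accession with its last three
    characters replaced by "nnn" (GSM540722 -> GSM540nnn).
    """
    series_dir = accession[:-3] + "nnn"
    return f"{GEO_SAMPLES_BASE}/{series_dir}/{accession}/{filename}"


print(geo_sample_ftp_url("GSM540722", "GSM540722_Stat3il6WTTh17.bedgraph.gz"))
```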

If you have further questions, please contact us again at
gen...@soe.ucsc.edu.

--
Brooke Rhead
UCSC Genome Bioinformatics Group

Michael Kosicki

Oct 18, 2012, 1:27:41 PM
to Brooke Rhead, gen...@soe.ucsc.edu
Thanks a lot Brooke, works like a charm!

Some smaller issues are left though:

1) Is there any possibility of editing track information (name,
description) in this case? The field is grayed out...

2) For general information: my browser (Firefox 15.0.1) converts the
ftp address anyway, so all %5F and %2E need to be changed to _ and .
respectively for it to work. In the case of HTTP links, replacing the
last 'dot' is sufficient, as suggested.

Brooke Rhead

Oct 18, 2012, 3:14:17 PM
to Michael Kosicki, gen...@soe.ucsc.edu
Hi Michael,

Since there is already a track line
(http://genome.ucsc.edu/goldenPath/help/customTrack.html#TRACK) included
at the top of the file at GEO, there isn't a way for you to supply your
own track info, short of downloading the file, editing your own copy,
and then uploading that version as a custom track.
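
The download/edit/re-upload route can be scripted. Here is a minimal Python sketch, assuming the file is a gzip-compressed bedGraph whose track line begins with the word "track". set_track_line is a hypothetical helper, and it overwrites the whole track line, so any other settings on the original line (color, visibility, and so on) are dropped.

```python
import gzip


def set_track_line(gz_bytes: bytes, name: str, description: str) -> bytes:
    """Return a gzip-compressed copy of the file with a new track line.

    The first line starting with "track" is replaced; if there is none,
    the new track line is added at the top.
    """
    lines = gzip.decompress(gz_bytes).decode("utf-8").splitlines()
    new_line = f'track type=bedGraph name="{name}" description="{description}"'

    for i, line in enumerate(lines):
        if line.split(None, 1)[:1] == ["track"]:
            lines[i] = new_line  # overwrite the existing track line
            break
    else:
        lines.insert(0, new_line)  # no track line found

    return gzip.compress(("\n".join(lines) + "\n").encode("utf-8"))
```

Read the downloaded file with open(path, "rb").read(), pass the bytes through set_track_line, and write the result to a new .gz file to upload as a custom track.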

Regarding the URL encoding problems, after some discussion here
yesterday, we decided to change our custom track interface so that it
will decode URLs automatically. So, you should be able to paste in the
http or ftp links from GEO without having to convert either of them.
(Thanks for pointing out that Firefox converts the ftp addresses, too.)
You can try it on our test server, here:
http://genome-test.cse.ucsc.edu/cgi-bin/hgGateway. Keep in mind that
there is a lot of untested/experimental data up on the test server.

If you try it out, please let us know if you see any problems with the
new functionality!

--
Brooke Rhead
UCSC Genome Bioinformatics Group


