Retrieve download link for a given file

Kai Zhang

Feb 23, 2014, 9:43:33 PM
to gen...@soe.ucsc.edu
Hi UCSC,

I would like to know how to retrieve the download link for a specific file name, e.g. wgEncodeHaibTfbsH1hescAtf3V0416102AlnRep1.bam, through the UCSC public MySQL server. I have hundreds of file names obtained from querying the metaDb, and I now want to get the download links for these files. Thanks!

Best,

--
Kai

Brian Lee

Feb 24, 2014, 12:25:33 PM
to Kai Zhang, gen...@soe.ucsc.edu
Dear Kai,

Thank you for using the UCSC Genome Browser and for your question about acquiring the download link for an ENCODE file from just the file name.

To begin with, please see this resources and FAQ page about ENCODE data. It includes a link to the ENCODE experiment matrix, which provides a graphical way to identify files and tracks: http://genome.ucsc.edu/ENCODE/FAQ/index.html

If you were interested in all the ENCODE files in a given experiment (wgEncodeHaibTfbs), you could rsync the entire directory (trailingSlash/) with a command such as the following:
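The command itself is missing from this archived copy of the message. A plausible form, assuming the same server and hg19 path as the per-file command later in this reply, is sketched here; the names `ENCODE_DIR_URL` and `sync_encode_dir` are hypothetical and only keep the example self-contained.

```shell
# ASSUMED reconstruction: the server and path are copied from the per-file
# rsync command in this reply; the original directory command was not preserved.
ENCODE_DIR_URL="rsync://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibTfbs/"

# The trailing slash on the source makes rsync copy the directory's contents
# into ./ instead of creating a wgEncodeHaibTfbs subdirectory.
sync_encode_dir() {
  rsync -aP "$ENCODE_DIR_URL" ./
}
```

Calling `sync_encode_dir` would start the transfer. The function is only defined here, not run, because the whole directory is large.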


Since you have hundreds of file names and are interested in only those files, you will need to build a script that specifies each file ($i). For example, your file of interest, wgEncodeHaibTfbsH1hescAtf3V0416102AlnRep1.bam (and likely the associated .bam.bai index file, which you should include in your list), can be parsed to identify the directory, wgEncodeHaibTfbs. With all the HAIB TFBS files listed in a file called fileListWgEncodeHaibTfbs, you could then run the following to rsync those files to your current directory (./):

for i in $(cat fileListWgEncodeHaibTfbs); do rsync -aP "rsync://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibTfbs/$i" ./; done
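If the list mixes files from several experiments, the "parse the file name to identify the directory" step could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, hg19 is assumed for every file, and the rule that the directory is "wgEncode" plus the next two capitalized words is a guess generalized from the single wgEncodeHaibTfbs example, so it may pick the wrong directory for other experiments.

```shell
# Base path copied from the rsync command above; hg19 is ASSUMED for all files.
ENCODE_BASE="rsync://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC"

# Hypothetical helper: print the directory part of an ENCODE file name.
# ASSUMPTION: directory = "wgEncode" + the next two capitalized words,
# e.g. wgEncodeHaibTfbs. Returns non-zero if the name does not fit.
encode_dir() (
  dir=$(printf '%s\n' "$1" | sed -n -E \
    's/^(wgEncode[[:upper:]][[:lower:][:digit:]]*[[:upper:]][[:lower:][:digit:]]*).*$/\1/p')
  [ -n "$dir" ] && printf '%s\n' "$dir"
)

# Hypothetical helper: print the rsync URL for one file name.
encode_url() (
  dir=$(encode_dir "$1") || {
    printf 'cannot derive directory for: %s\n' "$1" >&2
    exit 1
  }
  printf '%s/%s/%s\n' "$ENCODE_BASE" "$dir" "$1"
)

# Print one URL per non-empty line of the given list file.
encode_urls_from_list() {
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    encode_url "$name"
  done < "$1"
}
```

For example, `encode_urls_from_list myFileList > urls.txt` (with `myFileList` standing in for your own list) writes the links to a file, and each line can then be passed to `rsync -aP <url> ./`. It is worth checking a few of the generated URLs by hand before downloading everything.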

Thank you again for your inquiry and using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group



Kai Zhang

Feb 24, 2014, 12:30:11 PM
to Brian Lee, gen...@soe.ucsc.edu
Thank you very much!