rsync not working


Kevin Luo

Mar 21, 2014, 9:23:14 AM
to gen...@soe.ucsc.edu
Hello UCSC experts:

I ran into a problem downloading ENCODE files with rsync. It failed with a "Connection refused (61)" error, shown below. This happened last night when I tried to download multiple BAM files (~50) in parallel. Do you think the connections were blocked on your end? Any ideas for solving this problem?


My command:


Error message:

rsync: failed to connect to hgdownload.cse.ucsc.edu: Connection refused (61)
rsync error: error in socket IO (code 10) at /SourceCache/rsync/rsync-42/rsync/clientserver.c(105) [receiver=2.6.9]


Thanks.
Kevin

Brian Lee

Mar 21, 2014, 1:56:39 PM
to Kevin Luo, gen...@soe.ucsc.edu
Dear Kevin,

Thank you for using the UCSC Genome Browser and informing us about the crashed rsync server on hgdownload.

The browser serves thousands of users, and it appears that several parallel connections overloaded the server. We do not recommend parallel downloads, in case you were taking that approach. You can download an entire directory with a single rsync connection by using a trailing slash. For example, you can list the files in a directory before downloading them with the following command (adding " ./" would download them):

More importantly, you may be interested in using a new protocol called UDR (UDT Enabled Rsync). Once installed, it can provide much faster download speeds. You can read more about it here: http://genome.ucsc.edu/ENCODE/newsarch.html#091213
And here is UDR's GitHub page: https://github.com/LabAdvComp/UDR

Thank you again for informing us that our rsync server crashed, and for downloading responsibly from the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group



Kevin Luo

Mar 21, 2014, 2:19:40 PM
to Brian Lee, gen...@soe.ucsc.edu
Hi Brian,

Thanks so much for your prompt help! Sorry for the trouble caused by the parallel downloads. I'm trying to process multiple BAM files from ENCODE, but I don't need all the files in an entire directory, and I don't have the space to keep the entire directory either. So downloading the entire directory would not be efficient in my case. Do you have any suggestions for how to download multiple files efficiently? Can I pause for a few minutes before each download command, or do I have to download one file at a time?

Thanks for recommending the UDR protocol. Does it accept multiple download requests at the same time? Similarly, are there any constraints on downloading multiple files with UDR in my situation? Thanks.

Kevin

Jonathan Casper

Mar 24, 2014, 2:31:03 PM
to Kevin Luo, gen...@soe.ucsc.edu

Hello Kevin,

There are several options for rsync that tell it to download multiple files. The --include, --exclude, and --files-from arguments may all be useful to you. The last option, in particular, lets you give rsync a file that contains a list of the files you wish to download. It will then download all of those files in sequence for you.
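
As a rough illustration of the --files-from approach described above, here is a minimal Python sketch that writes a file list and builds a single rsync command around it. The remote directory (`exampleTrack/`), the BAM file names, and the helper names `write_file_list` and `build_rsync_command` are all hypothetical placeholders, not actual ENCODE paths; only the rsync options themselves (-a, -v, --partial, --progress, --files-from) are standard. The sketch prints the command rather than running it.

```python
import shlex
import tempfile
from pathlib import Path

# Hypothetical values for illustration only: substitute the real
# ENCODE directory and the file names you actually need.
REMOTE_DIR = "rsync://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/exampleTrack/"
WANTED = ["sampleA.bam", "sampleB.bam"]


def write_file_list(names, list_path):
    """Write one path per line (relative to the remote directory),
    which is the format rsync's --files-from option reads."""
    Path(list_path).write_text("".join(f"{name}\n" for name in names))
    return list_path


def build_rsync_command(list_path, remote_dir, dest="./"):
    """Build one rsync invocation that fetches every listed file
    over a single connection instead of one connection per file."""
    return [
        "rsync", "-av", "--partial", "--progress",
        f"--files-from={list_path}",
        remote_dir,
        dest,
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        list_path = write_file_list(WANTED, Path(tmp) / "bam_list.txt")
        cmd = build_rsync_command(list_path, REMOTE_DIR)
        # Print the command for inspection; to really download, one could
        # pass `cmd` to subprocess.run(cmd, check=True) instead.
        print(shlex.join(cmd))
```

Because the whole list goes through one rsync process, the server sees a single connection, which fits the advice earlier in this thread to avoid parallel downloads.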

With UDR, we still recommend downloading only one file at a time. The advantage of UDR is that you may see significantly faster download rates in some situations.

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. Questions sent to that address will be archived in a publicly-accessible forum for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Group


