Hello Mirror Sites,
The UCSC Genome Browser is pleased to offer a new download protocol to use when downloading large sets of files from our download servers: UDR (UDT Enabled Rsync). UDR utilizes rsync as the transport mechanism but sends the data over the UDT protocol, which is very efficient at sending large amounts of data over long distances.
If you are a casual or occasional manual downloader of data, there is no need to change your method, or read any further; continue to visit our download server to download the files you need. This new protocol has been put in place to enable huge amounts of data to be downloaded quickly over long distances.
Remember that we now have two identical download servers to better serve your needs. You can use either one:
- http://hgdownload.cse.ucsc.edu
- http://hgdownload-sd.sdsc.edu
The Background:
------------------
Typical TCP-based protocols like http, ftp and rsync share a problem: the further away the download source is from you, the slower the transfer becomes. Protocols like UDT/UDR allow many UDP packets to be sent in a batch, which allows much higher transmit speeds over long distances. UDR will be especially useful for users who are downloading from places that are far away from California. The US East Coast and the international community will likely see much higher download speeds by using UDR rather than rsync, http or ftp.
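To illustrate why distance matters for TCP, here is a rough back-of-the-envelope sketch. It assumes a single TCP connection with a fixed 64 KiB window and no window scaling, in which case throughput cannot exceed window size divided by round-trip time. The function name max_tcp_mbps and the round-trip times are illustrative assumptions, not measurements of our servers, and modern TCP stacks that scale their windows will behave differently.

```shell
#!/bin/sh
# Hypothetical helper: upper bound on throughput for one TCP connection
# with a fixed window, in Mbit/s.
# Usage: max_tcp_mbps <window_bytes> <rtt_ms>
max_tcp_mbps() {
    # bound = window (bits) / round-trip time (seconds), scaled to Mbit/s
    awk -v w="$1" -v r="$2" \
        'BEGIN { printf "%.2f\n", (w * 8) / (r / 1000) / 1000000 }'
}

# Assumed round-trip times: nearby, cross-country, intercontinental.
for rtt in 20 80 150; do
    echo "RTT ${rtt} ms: $(max_tcp_mbps 65536 "$rtt") Mbit/s at most"
done
```

Under these assumptions the bound drops from roughly 26 Mbit/s at 20 ms to about 3.5 Mbit/s at 150 ms, regardless of how fast the underlying link is.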
Getting UDR & Setting it up on your System:
-----------------------------------------------
Please note that UDR is not written or managed by UCSC; it was written by the Laboratory for Advanced Computing at the University of Chicago. It has been tested to work under Linux, FreeBSD and Mac OSX, but may work under other UNIX variants. The source code can be obtained here, through GitHub:
https://github.com/LabAdvComp/UDR
UDR is written in C++. It is Open Source and is released under the Apache 2.0 License. You must first have rsync installed on your system. We recommend reading the documentation on the UDR GitHub page to better understand how UDR works. If you need help building the UDR binaries or have questions about how UDR functions, please read that documentation and, if necessary, contact the UDR authors via the GitHub page.
For your convenience, we are offering a binary distribution of UDR for Red Hat Enterprise Linux 6.x (or variants such as CentOS 6 or Scientific Linux 6). You'll find both a 64-bit and 32-bit rpm here:
http://hgdownload.cse.ucsc.edu/admin/udr
Using UDR to Download Data from the UCSC Genome Browser Download Server(s):
-------------------------------------------------------------
Once you have a working UDR binary, either by building from source or by installing the rpm (if you are using RHEL 6.x or a variant), you can download files from either of our download servers in much the same way as with rsync. For example, using rsync, you may want to download all of the MySQL tables for the hg19 database using the following command:
rsync -avP rsync://hgdownload.cse.ucsc.edu/mysql/hg19/ /my/local/hg19/
Using UDR is very similar. The UDR syntax for downloading the same data would be:
udr rsync -avP hgdownload.cse.ucsc.edu::mysql/hg19/ /my/local/hg19/
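For mirror scripts that run on several machines, a small wrapper can choose between the two forms shown above. The following is only a sketch: the function name fetch_cmd and its argument order are hypothetical, and it prints the command rather than running it so that you can review it first. It uses udr when a udr binary is found on the PATH and falls back to plain rsync otherwise.

```shell
#!/bin/sh
# Hypothetical helper: print the command that downloads the MySQL tables
# for one database. It only prints; nothing is transferred.
# Usage: fetch_cmd <server> <db> <local_dir> [udr|rsync]
fetch_cmd() {
    server=$1; db=$2; dest=$3; tool=${4:-auto}
    if [ "$tool" = auto ]; then
        # Prefer udr when it is installed, otherwise use plain rsync.
        if command -v udr >/dev/null 2>&1; then tool=udr; else tool=rsync; fi
    fi
    case $tool in
        udr)   echo "udr rsync -avP ${server}::mysql/${db}/ ${dest}/" ;;
        rsync) echo "rsync -avP rsync://${server}/mysql/${db}/ ${dest}/" ;;
        *)     echo "unknown tool: $tool" >&2; return 1 ;;
    esac
}

fetch_cmd hgdownload.cse.ucsc.edu hg19 /my/local/hg19
```

Passing udr or rsync as the fourth argument forces one form, which is handy for comparing download speeds between the two.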
If you installed the rpm, use the 'man udr' command for more information via the man page; if you installed from source please refer to the UDR GitHub page for more details on the capabilities of UDR and how to use it.
Firewall Considerations:
--------------------------
UDR establishes connections on TCP/9000, then transmits the data stream over UDP/9000-9100. Your institution may need to modify its firewall rules to allow inbound and outbound ports TCP/9000 and UDP/9000-9100 from either of the two download machines.
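As a starting point for a conversation with your firewall administrator, the sketch below prints candidate rules for the ports listed above. It assumes your site uses iptables, which may not be the case, and the function name udr_firewall_rules is hypothetical. The rules use the multiport match with --ports, which matches either the source or the destination port, so the sketch does not assume which end of the connection uses which port. It prints the rules only; nothing is applied.

```shell
#!/bin/sh
# Hypothetical helper: print (do not apply) iptables rules that allow
# TCP/9000 and UDP/9000-9100 to and from the given hosts.
# Assumption: the site firewall is iptables; adapt for other firewalls.
udr_firewall_rules() {
    for host in "$@"; do
        for proto_ports in tcp:9000 udp:9000:9100; do
            proto=${proto_ports%%:*}   # text before the first colon
            ports=${proto_ports#*:}    # text after the first colon
            # --ports matches either the source or the destination port.
            echo "iptables -A INPUT -p $proto -s $host -m multiport --ports $ports -j ACCEPT"
            echo "iptables -A OUTPUT -p $proto -d $host -m multiport --ports $ports -j ACCEPT"
        done
    done
}

udr_firewall_rules hgdownload.cse.ucsc.edu hgdownload-sd.sdsc.edu
```

Review the printed rules against your existing policy before applying any of them. Note that iptables resolves host names when a rule is added, so rules written this way would need refreshing if a server's address changes.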
If you decide to install and use UDR, we hope that you experience greatly increased download speeds. As always, if you have questions about mirroring the UCSC Genome Browser, send an email to this list: genome...@soe.ucsc.edu. If you have difficulties installing UDR on your system, please contact the Laboratory for Advanced Computing through their GitHub page:
https://github.com/LabAdvComp/UDR