The "/gbdb" directory...


Mike Rightmire

Oct 20, 2017, 11:22:21 AM
to gen...@soe.ucsc.edu
Hello All,

I'm in the process of creating a mirror of the UCSC Genome Browser. I've noticed that the /gbdb directory storage is substantially bigger than the actual genome files within the MySQL database. For example, while the mm10 MySQL data is only 189GB, the /gbdb/mm10 is greater than 500GB.

My questions:

1. Is this normal, or am I somehow unnecessarily duplicating data? For example....
217G    /gbdb/mm10/multiz60way
258G    /gbdb/mm10/multiz60way.old

Is there supposed to be both a "dir" and a "dir.old", or is this the result of some form of duplication from running the rsync command (browserSetup.sh download mm10) multiple times?

2. We won't have a need for "multiz60way". So, if I simply deleted this directory, would it break the proper functioning of the website or its tools?

3. Assuming deleting it would break the site, can the offline/online mode somehow be set to something like...
mm10(mysql) = offline mode only
mm10(gbdb)  = ON-line mode only
...?

Many thanks!
Mike

--

Universitäts Klinikum Heidelberg - University Hospital Heidelberg

Section of Bioinformatics and Systems Cardiology
Analysezentrum III - Klaus Tschira Institute

Mike Rightmire 

Bioinformatics and IT

Im Neuenheimer Feld 669

69120 Heidelberg

Tel.: +49 6221 56 - 34213
Fax.: +49 6221 56 - 6868

Cell: +49 176 7131 8758

Michael....@uni-heidelberg.de
http://www.klinikum.uni-heidelberg.de

Christopher Lee

Oct 20, 2017, 3:59:39 PM
to Mike Rightmire, UCSC Genome Browser Discussion List

Hi Mike,

Thank you for your question about the /gbdb/ directory. Yes, the /gbdb directory is indeed very large, especially for the Mouse and Human assemblies. This directory contains all non-MySQL data, including the sequence files, liftovers, and multi-way alignment MAF files, and the MAF files in particular can be quite large.

If you don't need these large files, you can download only the MySQL databases you need and set your mirror to online mode. The browser will then read MySQL data from your local MySQL database and pull /gbdb/ files from us, which is the setup you describe.
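
To make the idea concrete, here is a minimal sketch of a "local first, remote fallback" lookup for /gbdb files. It is only an illustration of the behavior described above, not the browser's actual code: the function name `resolve_gbdb` and the remote base URL are hypothetical placeholders.

```python
import os


def resolve_gbdb(rel_path, local_root, remote_base):
    """Return a local path if the file exists there, else a remote URL.

    rel_path    -- path relative to the gbdb root, e.g. "mm10/some.file"
    local_root  -- local gbdb directory (hypothetical, e.g. "/gbdb")
    remote_base -- base URL of a remote gbdb copy (placeholder value)
    """
    # Drop a leading slash so os.path.join does not discard local_root.
    rel = rel_path.lstrip("/")
    local_path = os.path.join(local_root, rel)
    if os.path.exists(local_path):
        return local_path
    # Not available locally: point at the remote copy instead.
    return remote_base.rstrip("/") + "/" + rel
```

With a lookup of this kind, any file you keep under the local directory is used directly, and anything you have deleted or never downloaded is fetched remotely.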

The multiz60way.old directory is not the result of you duplicating data; it is an older version of the multiz60way directory. If you would like to drop one (or both), everything will be fine, except that the Multiz 60way track won't display in your Genome Browser, which may or may not lead to other bugs.
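
Before dropping anything, it can help to see which subdirectories take up the space. The following is a minimal sketch, assuming a local gbdb directory that you can read; the function names are made up for illustration. It sums apparent file sizes, so the numbers may differ somewhat from what `du` reports.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def largest_subdirs(root, top=5):
    """Return up to `top` (size_in_bytes, name) pairs, largest first."""
    sizes = []
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size_bytes(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]
```

For example, `largest_subdirs("/gbdb/mm10")` would list the biggest directories under that path, which in the listing you posted would presumably be multiz60way.old and multiz60way.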

Please let us know if you have any other questions!

Thank you again for your inquiry and using the UCSC Genome Browser. If
you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a
publicly-accessible forum. If your question includes sensitive data,
you may send it instead to genom...@soe.ucsc.edu.

Christopher Lee
UCSC Genomics Institute

