Data base question


Ziyue Chen

Feb 2, 2017, 11:29:31 AM
to genome...@soe.ucsc.edu

Dear Genome Browser team,


We are trying to set up a mirror for research use and are currently downloading the mm9 database. It is 2.7 TB. If I download only the full data set of mm9, which is much smaller, how can we set it up as the database?


Thank you 


Ziyue Chen

Cath Tyner

Feb 7, 2017, 12:57:45 PM
to Ziyue Chen, genome...@soe.ucsc.edu
Hello Ziyue,

Thank you for your question regarding your UCSC Genome Browser installation and the mm9 size concerns. One of our engineers has shared the following information:


It is the /gbdb/mm9/bbi/wgEncode* data that consumes the most space:
[qateam@hgdownload1 /mirrordata] du --apparent-size -hsc gbdb/mm9 mysql/mm9 goldenPath/mm9/database
 2.1T    gbdb/mm9
82G     mysql/mm9
19G     goldenPath/mm9/database
2.2T    total
The only size oddity is that the text database dumps in goldenPath/mm9/database
are smaller than the loaded database, because the indexes on the database tables
take up a lot of space.
If possible, you might consider skipping the wgEncode files in gbdb/mm9/bbi/wgEncode*.
Doing so saves most of the space:
du --apparent-size -hsc gbdb/mm9/bbi/wgEncode*
2.0T total
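
As a rough illustration of skipping those files during download, here is a minimal sketch that builds an rsync command with an exclude pattern. The source URL and the local /gbdb/mm9/ target are assumptions for illustration and are not taken from this thread, and the function name mm9_rsync_cmd is hypothetical. Please check them against your own mirror setup before running anything.

```shell
#!/bin/sh
# Sketch: mirror gbdb/mm9 while skipping the ENCODE files in bbi/wgEncode*.
# ASSUMPTIONS (not from this thread): the rsync source URL and the local
# destination directory are illustrative; adjust both for your mirror.
SRC="rsync://hgdownload.soe.ucsc.edu/gbdb/mm9/"
DEST="/gbdb/mm9/"

# Print the rsync command instead of running it, so it can be reviewed first.
# The leading slash anchors the pattern at the transfer root (the contents of
# gbdb/mm9/), so only bbi/wgEncode* is skipped.
mm9_rsync_cmd() {
    printf '%s\n' "rsync -avP --exclude='/bbi/wgEncode*' $SRC $DEST"
}

mm9_rsync_cmd
```

After reviewing the printed command, it can be pasted into a shell to start the transfer. Adding rsync's --dry-run option first lists what would be copied without downloading anything.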


As a side note, I thought I would point out a new installation script that was recently released. It automates the setup of a UCSC Genome Browser mirror, essentially replacing the manual procedures. Although the script is named "Genome Browser in the Cloud (GBiC)," it automates an installation equally well on a dedicated server or a cloud server.

GBiC offers an easy option to install mm9 without the ENCODE data, with examples given in the User Guide.

For more information see:
The Genome Browser in the Cloud User Guide

If this response still does not provide you with the help that you need, please feel free to respond to this forum with follow-up questions so that our support team can help you further!

Thank you again for your inquiry and for using the UCSC Genome Browser.
Please send new and follow-up questions to one of our UCSC Genome Browser mailing lists below:

  * Post to the Public Help Forum: Email gen...@soe.ucsc.edu or search the Public Archives
  * Post to the Mirror Help Forum: Email genome...@soe.ucsc.edu or search the Mirror Archives
  * Confidential/private help: Email genom...@soe.ucsc.edu

UCSC Genome Browser Announcements List (email alerts for new data & software):
  * Subscribe: Email genome-announce+subscribe@soe.ucsc.edu 
  * Unsubscribe: Email genome-announce+unsubscribe@soe.ucsc.edu

Join us on Social Media! Facebook, Twitter, WordPress Blog, YouTube

Enjoy,
Cath
. . .
Cath Tyner
UCSC Genome Browser, Software QA & User Support
UC Santa Cruz Genomics Institute

