Dear Yi,
Thank you for using the UCSC Genome Browser and the new GBiB, and for your message about adding BLAT to a custom assembly hub. Next week we will be releasing a new feature that will enable BLAT support for assembly hubs.
To create an assembly hub on the browser, and the GBiB, you do not have to use hgcentral, complex development tools or mysql commands. Rather, follow these steps to create a few text files (hub.txt, genomes.txt, trackDb.txt) and other associated assembly files in an accessible directory: http://genomewiki.ucsc.edu/index.php/Assembly_Hubs
Below I outline copying a small assembly hub to your local disk and then loading it on your GBiB to have a working example. On your computer, find a place where you do not mind copying the files below, about 33M total. In that directory, run the following wget command to recursively grab the directory structure and files needed:
wget -r --no-parent --reject "index.html*" -nH --cut-dirs=3 http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubAssembly/plantAraTha1/
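As a quick sanity check once the wget finishes, something like the following sketch could confirm that the files mentioned in this message landed where expected. The function name check_hub_dir is made up for illustration, and the list only covers files named in this message (hub.txt, genomes.txt and araTha1/araTha1.2bit); the real hub contains more.

```shell
#!/bin/sh
# Hypothetical helper: report whether the files this message relies on exist
# under the copied hub directory. Returns non-zero if any are missing.
check_hub_dir() {
    hub_dir=$1
    missing=0
    for f in hub.txt genomes.txt araTha1/araTha1.2bit; do
        if [ -f "$hub_dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Run from the directory where wget was run. With -nH --cut-dirs=3 the copy
# should land under hubExamples/hubAssembly/plantAraTha1.
check_hub_dir hubExamples/hubAssembly/plantAraTha1 || echo "some files are missing"
```

If anything is reported missing, re-running the wget command from the same directory is a reasonable first step.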
Once you have copied this assembly hub, follow the GBiB instructions on sharing the folder where it is located. First power off the machine, then select the Settings option for the machine in VirtualBox, and click the "Shared Folders" tab and the plus folder icon, as described here with pictures: http://genome.ucsc.edu/goldenPath/help/gbib#YourTracks
Then you can restart your GBiB and navigate to the folders location to see your shared files, http://127.0.0.1:1234/folders/, and then look for where you copied the hub.txt. In my example, this assembly hub.txt is in my shared Google Drive folder: http://127.0.0.1:1234/folders/sf_Google_Drive/trackHubAssembly/hubExamples/hubAssembly/plantAraTha1/hub.txt
Paste the URL to the shared folder location of your copied version of this hub.txt into the My Hubs tab on the Hub page, http://127.0.0.1:1234/cgi-bin/hgHubConnect, and it should load fine.
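The mapping from a shared-folder path on the GBiB to the URL you paste can be sketched as below. This is only an illustration: folders_url is a made-up helper name, and it assumes the GBiB answers at http://127.0.0.1:1234 as in the URLs above.

```shell
#!/bin/sh
# Assumption: the GBiB web server is reachable at this address, as in the URLs above.
GBIB_BASE="http://127.0.0.1:1234"

# Hypothetical helper: turn a path under /folders on the GBiB into the URL
# it is served at. Paths outside /folders are rejected.
folders_url() {
    case $1 in
        /folders/*) printf '%s%s\n' "$GBIB_BASE" "$1" ;;
        *) echo "not under /folders: $1" >&2; return 1 ;;
    esac
}

# Path from my example; yours will differ depending on the shared folder name.
folders_url /folders/sf_Google_Drive/trackHubAssembly/hubExamples/hubAssembly/plantAraTha1/hub.txt
```

The printed URL is what goes into the My Hubs tab.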
Now that you have an assembly hub loaded and working fine on your GBiB, you can explore the details regarding its structure. This small plant assembly hub is a slice of a larger assembly hub you can also explore: http://genome-test.cse.ucsc.edu/~hiram/hubs/Plants/
At the end of next week we will add BLAT capabilities to assembly hubs. When the changes are public you can acquire these changes by running gbibUpdate on the command line of your GBiB.
I will now go over how to activate BLAT on the above Assembly Hub example once those updates have been acquired after next week.
First navigate to the location of your copied genomes.txt and remove the comment "#" from the two lines mentioning BLAT:
blat localhost 17779
transBlat localhost 17777
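If you would rather not edit the file by hand, the same change could be sketched with sed. The helper name uncomment_blat_lines is made up, and the sketch assumes the two lines begin with "#" (optionally followed by spaces); check your copy of genomes.txt before relying on it. A backup is written next to the file.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the blat and transBlat lines of a genomes.txt.
# Assumes each such line starts with "#" (optionally followed by spaces).
# The original is kept alongside as genomes.txt.bak.
uncomment_blat_lines() {
    file=$1
    cp "$file" "$file.bak" || return 1
    sed -E 's/^#[[:space:]]*((blat|transBlat)[[:space:]])/\1/' "$file.bak" > "$file"
}

# Demo on a throwaway file with made-up sample content; in practice you
# would pass the path of the hub's copied genomes.txt.
demo=$(mktemp -d)
printf '%s\n' 'genome araTha1' '#blat localhost 17779' '#transBlat localhost 17777' > "$demo/genomes.txt"
uncomment_blat_lines "$demo/genomes.txt"
cat "$demo/genomes.txt"
rm -rf "$demo"
```

Other comment lines are left untouched, since only lines whose first word after the "#" is blat or transBlat are rewritten.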
Now that the genomes.txt is updated for this assembly hub we only need to start BLAT servers on the GBiB with gfServer. To do this navigate on your GBiB to where the 2bit is located for this assembly hub (araTha1.2bit). In my example:
cd /folders/sf_Google_Drive/trackHubAssembly/hubExamples/hubAssembly/plantAraTha1/araTha1/
I suggest you take advantage of the ability to ssh into your GBiB to run the gfServer commands. When your GBiB is running, use the following command to access it (the password is "browser"):
ssh browser@localhost -p 1235
Read more about accessing GBiB with ssh here: http://genome.ucsc.edu/goldenpath/help/gbib.html#YourTracks
From this location on your GBiB run the following gfServer commands that will start two BLAT servers in the background to enable amino acid and DNA sequence blatting:
gfServer start localhost 17777 -trans -mask araTha1.2bit &
gfServer start localhost 17779 -stepSize=5 araTha1.2bit &
You can use ps to see these operations going, and kill -9 #### to end them. Note that the 17777 and "-trans" option for amino acid blatting matches the port added to genomes.txt in the line transBlat localhost 17777. Read more about gfServer and BLAT configuration here:
https://genome.ucsc.edu/FAQ/FAQblat.html#blat5
http://genomewiki.ucsc.edu/index.php/Running_your_own_gfServer
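To make the link between genomes.txt and the gfServer ports explicit, here is a small sketch that reads the active blat and transBlat lines and prints the matching gfServer commands. The helper name blat_commands is made up, the options are simply the ones used in this message, and nothing is started; the commands are only printed for review.

```shell
#!/bin/sh
# Hypothetical helper: for each active blat/transBlat line in a genomes.txt,
# print the gfServer command that would listen on the same host and port.
# Commented-out lines are ignored. Commands are only printed, never started.
blat_commands() {
    genomes=$1
    twobit=$2
    awk -v twobit="$twobit" '
        $1 == "blat"      { print "gfServer start", $2, $3, "-stepSize=5", twobit }
        $1 == "transBlat" { print "gfServer start", $2, $3, "-trans -mask", twobit }
    ' "$genomes"
}

# Demo on a throwaway file holding just the two lines from this message.
demo=$(mktemp -d)
printf '%s\n' 'blat localhost 17779' 'transBlat localhost 17777' > "$demo/genomes.txt"
blat_commands "$demo/genomes.txt" araTha1.2bit
rm -rf "$demo"
```

If the printed ports differ from the ones your gfServer processes were started with, BLAT requests from the hub will not reach them.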
With these gfServer commands running in the background on the GBiB you can now load the Assembly Hub and run BLAT operations.
Regarding the utilities on the GBiB, if you run ls $HOME/bin you should see the entire list of them available. One can also run a gbibAddTools command, but it should not be necessary. Again I highly recommend using ssh, ssh browser@localhost -p 1235, to enter your GBiB from your computer's terminal program. Also, while not necessarily recommended, here are some internal notes about converting a GBiB into a machine for development: http://genomewiki.soe.ucsc.edu/genecats/index.php/Gbib_development#Converting_your_gbib_into_a_machine_for_development
Thank you again for trying out the GBiB and using the assembly hub features. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
All the best,
Brian Lee
UCSC Genome Bioinformatics Group
P.S. Regarding shared folders, if desired you can build them entirely in the GBiB. What follows is a condensed review of the above with the relative locations that would result. I suggest first using
ssh browser@localhost -p 1235
to access the running GBiB from your computer's terminal. Then, after going to the shared folders with
cd /folders
one could use sudo to wget the assembly hub:
sudo wget -r --no-parent --reject "index.html*" -nH --cut-dirs=3 http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubAssembly/plantAraTha1/
You could then load this hub by loading this URL, selecting it under "group" with "Plant araTha1":
http://127.0.0.1:1234/cgi-bin/hgGateway?hubUrl=http://127.0.0.1:1234/folders/hubExamples/hubAssembly/plantAraTha1/hub.txt
Then, when the blat update is out and obtained with gbibUpdate, you could navigate to the genomes.txt file,
cd /folders/hubExamples/hubAssembly/plantAraTha1/
edit the commented blat lines,
sudo vi genomes.txt
change directories to the 2bit file,
cd /folders/hubExamples/hubAssembly/plantAraTha1/araTha1
and run the two gfServer commands to start the BLAT servers:
gfServer start localhost 17777 -trans -mask araTha1.2bit &
gfServer start localhost 17779 -stepSize=5 araTha1.2bit &
--
Dear Yi,
The new feature enabling blat on assembly hubs has been released to the browser and can be obtained on the GBiB. In case you, or future users reviewing this archived mailing list, would like to try this feature on a GBiB, here is a review of some steps to take.
1. First open your operational GBiB, here is the user guide: http://genome.ucsc.edu/goldenPath/help/gbib.html
2. With your GBiB operational, use your computer's terminal program to ssh into your GBiB with ssh browser@localhost -p 1235 (password "browser"). In case you have an older GBiB, you can run gbibUpdate to synchronize your GBiB.
3. To test out the blat feature on assembly hubs you can grab this example assembly hub. Go to the GBiB's folders directory with cd /folders and then use sudo to wget this assembly hub:
sudo wget -r --no-parent --reject "index.html*" -nH --cut-dirs=3 http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubAssembly/plantAraTha1/
4. On your terminal navigate to the genomes.txt file of this assembly hub, cd /folders/hubExamples/hubAssembly/plantAraTha1/, and edit the currently commented-out blat lines with sudo vi genomes.txt. Use "x" when the cursor is over # at the start of the line to remove it, :w! to save the changes, and :q to quit.
blat localhost 17779
transBlat localhost 17777
5. With these blat lines in place in the genomes.txt of the assembly hub, you can change directories to the 2bit file with cd /folders/hubExamples/hubAssembly/plantAraTha1/araTha1 and run the two gfServer commands to start the blat servers. Use ps to see the processes running.
gfServer start localhost 17777 -trans -mask araTha1.2bit &
gfServer start localhost 17779 -stepSize=5 araTha1.2bit &
6. This assembly hub can now be loaded on your GBiB by clicking this URL and selecting it under the "group" category where "Plant araTha1" displays: http://127.0.0.1:1234/cgi-bin/hgGateway?hubUrl=http://127.0.0.1:1234/folders/hubExamples/hubAssembly/plantAraTha1/hub.txt
7. Now on the blat page, http://127.0.0.1:1234/cgi-bin/hgBlat, you can select the Arabidopsis thaliana assembly and blat plant amino acid sequences, like
IYQTRENKYIIGEIQITESERDRRRSSLPGNH
or DNA sequences, like
TAAGTAAAAAATAATATGATTAAGACTAATAAATCTTAATAGTTAATACT
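For step 5, plain ps output can be long, so a small filter such as the sketch below may help pick out just the gfServer processes and their PIDs. The helper name gfserver_procs is made up, and the sample lines in the demo (including the PIDs) are invented purely to show the input shape.

```shell
#!/bin/sh
# Hypothetical helper: read "PID COMMAND ARGS..." lines on stdin (the shape
# printed by `ps -e -o pid,args`) and keep only lines whose command is
# gfServer, whether invoked by bare name or by full path.
gfserver_procs() {
    awk '{ n = split($2, parts, "/"); if (parts[n] == "gfServer") print }'
}

# Demo on invented sample lines. On the GBiB itself you would run:
#   ps -e -o pid,args | gfserver_procs
printf '%s\n' \
    '  PID COMMAND' \
    ' 1234 gfServer start localhost 17777 -trans -mask araTha1.2bit' \
    ' 1300 grep gfServer' |
    gfserver_procs
```

The first column of each remaining line is the PID. A plain kill with that PID is usually worth trying before resorting to kill -9.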
Thank you again for trying out the GBiB. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
All the best,
Brian Lee
UCSC Genome Bioinformatics Group