Calling specific region by peakname

Tunc Morova

Dec 2, 2016, 10:52:44 AM
to gen...@soe.ucsc.edu
Hi,

I would like to jump to a specific peak from my BED file using the peak
IDs in the 4th column. However, when I enter the peak name I get
"Sorry, couldn't locate peakno_6 in genome database". Is there a
specific format that I should use for my BED file?

Also, the 5th column in my BED file holds the peak heights obtained
from peak calling. However, the score defined in the BED format is
quite different from this. Could this also be a cause of the problem?


Thank you very much for your help,

Best regards,

Tunc.



chr6 51217007 51217843 peakno_1 40.3075702308 .
chr7 66969066 66969892 peakno_2 32.5309377692 .
chr6 51567477 51568269 peakno_3 47.5875009231 .
chr6 51602203 51602971 peakno_4 55.1781006154 .
chr6 51877697 51878445 peakno_5 50.4969716923 .
chr7 68195682 68196529 peakno_6 19.0570760769 .
chr11 117046695 117047479 peakno_7 26.1636610769 .

Jairo Navarro Gonzalez

Dec 8, 2016, 12:10:32 PM
to Tunc Morova, UCSC Genome Browser Mailing List

Hello Tunc,

Thank you for using the UCSC Genome Browser and for your question about indexing a BED column.
BigBed files are binary, indexed files, and they can carry an extra index on a column such as your peak IDs. To make the peak IDs searchable, you will need to create a bigBed file, index it on that column, create a hub, and host the hub at a web-accessible location so the browser can load it. It might also be useful to read our hub documentation and our example of creating a bigBed from a BED file.

Step One: Sort the BED file

Run the following command, which writes the sorted data to a new file called inputBed.sorted.bed:
sort -k1,1 -k2,2n inputBed.bed > inputBed.sorted.bed
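
As a quick illustration using the seven rows from the original post, the sketch below writes them to inputBed.bed, sorts them, and then asks sort to confirm the order. Setting LC_ALL=C is an assumption added here so that chromosome names are compared byte by byte regardless of the local locale; the sort keys themselves are the ones from the command above.

```shell
# Recreate the example rows from the question as a tab-separated BED file.
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr6  51217007  51217843  peakno_1 40.3075702308 . \
  chr7  66969066  66969892  peakno_2 32.5309377692 . \
  chr6  51567477  51568269  peakno_3 47.5875009231 . \
  chr6  51602203  51602971  peakno_4 55.1781006154 . \
  chr6  51877697  51878445  peakno_5 50.4969716923 . \
  chr7  68195682  68196529  peakno_6 19.0570760769 . \
  chr11 117046695 117047479 peakno_7 26.1636610769 . \
  > inputBed.bed

# Sort by chromosome name, then numerically by start position.
LC_ALL=C sort -k1,1 -k2,2n inputBed.bed > inputBed.sorted.bed

# sort -c exits non-zero if the file is not in the requested order.
LC_ALL=C sort -c -k1,1 -k2,2n inputBed.sorted.bed && echo "sorted OK"
```

With byte-wise comparison, chr11 sorts before chr6 and chr7, which is expected and differs from the "natural" chromosome order.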

Step Two: Create an AutoSql (.as) file

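In this case, we are using BED4+2: the first four columns are standard BED columns, and the last two are declared as additional custom columns. This lets the fifth column hold your floating-point peak heights, whereas the standard BED score is an integer from 0 to 1000. In the same directory as your inputBed.sorted.bed file, create a file named inputBed.as with the following contents: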

table userData
"BED4+2" 
(
string chrom;    "Chromosome (or contig, scaffold, etc.)" 
uint   chromStart;    "Start position in chromosome" 
uint   chromEnd;    "End position in chromosome" 
string name;    "Name of item" 
float   score;    "Score data" 
char[1] strand;    "+ or - or ." 
)
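
Before converting, it can help to confirm that every row actually fits the six fields declared above. The following is only a sketch: the function name check_bed4plus2 and the file check.bed are made up for illustration, and the score pattern deliberately accepts plain decimal numbers only (no scientific notation), which is an assumption about the input.

```shell
# Two example rows from the question, written tab-separated for the check.
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr6 51217007 51217843 peakno_1 40.3075702308 . \
  chr7 68195682 68196529 peakno_6 19.0570760769 . \
  > check.bed

# Report rows that do not match the .as layout:
# string, uint, uint, string, float, char[1].
check_bed4plus2() {
  awk -F'\t' '
    NF != 6 { print "line " NR ": expected 6 fields, found " NF; bad = 1; next }
    $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { print "line " NR ": start/end must be unsigned integers"; bad = 1 }
    $5 !~ /^-?[0-9]+(\.[0-9]+)?$/ { print "line " NR ": score is not a plain decimal number"; bad = 1 }
    $6 !~ /^[-+.]$/ { print "line " NR ": strand must be +, - or ."; bad = 1 }
    END { if (!bad) print "all rows match the BED4+2 layout"; exit bad + 0 }
  ' "$1"
}

check_bed4plus2 check.bed
```

The function exits with a non-zero status when any row fails, so it can also be used as a guard in a larger script.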

Step Three: Index the peak IDs by the fourth column

Now that we have a sorted BED file and a .as file that specifies the additional columns to the standard BED file, we can index the peak IDs with the following command:

bedToBigBed inputBed.sorted.bed -as=inputBed.as -type=bed4+2 -extraIndex=name http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.chrom.sizes output.search.bb

This kent utility takes the sorted BED file from step one, the AutoSql file from step two, and the hg19.chrom.sizes file from our downloads server, and creates the bigBed file output.search.bb.
The -type=bed4+2 option declares the BED layout, and the -extraIndex=name option builds the index on the name column.
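
One way to confirm that the index was built is sketched below. It assumes that bigBedInfo and bigBedNamedItems, from the same kent utilities collection, are installed alongside bedToBigBed, and it downloads hg19.chrom.sizes to a local file first (a choice made here, not a requirement). The whole block is skipped when the tools or input files are missing.

```shell
BED=inputBed.sorted.bed
AS=inputBed.as
SIZES=hg19.chrom.sizes
BB=output.search.bb

if command -v bedToBigBed >/dev/null 2>&1 && [ -f "$BED" ] && [ -f "$AS" ]; then
  # Fetch the chromosome sizes once and reuse the local copy afterwards.
  [ -f "$SIZES" ] || wget -q -O "$SIZES" \
    http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.chrom.sizes

  # Same options as the command above, with a local chrom.sizes file.
  bedToBigBed -as="$AS" -type=bed4+2 -extraIndex=name "$BED" "$SIZES" "$BB"

  # List the extra indexes stored in the bigBed; "name" should appear.
  bigBedInfo -extraIndex "$BB"

  # Retrieve a single peak through the name index and print it.
  bigBedNamedItems "$BB" peakno_6 peakno_6.bed && cat peakno_6.bed
else
  echo "bedToBigBed or its input files not found; skipping conversion"
fi
```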

Step Four: Create a hub

Now that the file is indexed by peak ID, we can load this bigBed into the browser by creating a hub. Copy the example hub for hg19 with the following command:

wget -r --no-parent --reject "index.html*" -nH --cut-dirs=3 http://genome.ucsc.edu/goldenPath/help/examples/hubDirectory/

Using a text editor, edit the hubDirectory/hg19/trackDb.txt file and remove all stanzas except the bigBed example stanza.
Then add the following new line before the visibility line:

searchIndex name

Then change the bigDataUrl line to point to the bigBed file created in step three.
Here is an example trackDb.txt file that you can reference: http://hgwdev.cse.ucsc.edu/~jairo/wgetHub/hubDirectory/hg19/trackDb.txt
You can learn how to customize this hub further in our hub documentation.
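
For orientation, a trimmed-down stanza might look like the sketch below. The track name, the labels, the visibility value, and the relative bigDataUrl are placeholders chosen for illustration, not the values from the example hub; a relative bigDataUrl is resolved against the location of trackDb.txt, so this version assumes output.search.bb sits in the same hg19 directory.

```shell
mkdir -p hubDirectory/hg19

# Placeholder track name, labels and file location; adjust to your own hub.
cat > hubDirectory/hg19/trackDb.txt <<'EOF'
track peakSearch
bigDataUrl output.search.bb
shortLabel Peaks
longLabel Peaks searchable by peak ID
type bigBed 4 +
searchIndex name
visibility pack
EOF

# Show where the searchIndex line ended up in the stanza.
grep -n '^searchIndex' hubDirectory/hg19/trackDb.txt
```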

Step Five: Load the hub in the browser

Once you have built the hub, move its contents to a web-accessible location.
After the hub is on a server, copy the URL of its hub.txt file, go to http://genome.ucsc.edu/cgi-bin/hgHubConnect, paste the URL under the My Hubs tab, and click Add Hub.
You should now be able to search for terms such as peakno_1.

When a bigBed is indexed on a field, as with "searchIndex name" in this example, the search term must exactly match what was indexed ("peakno_1" will get a hit, while "peakno_" will not find a result). To enhance searching further, you can build an additional index file and add a line such as "searchTrix myFurtherIndex.ix" to the trackDb stanza; please see this prior mailing list response for more information.

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genomics Institute


