How to implement case-insensitive or incomplete searches at an assembly hub?


David da Silva Pires

May 28, 2015, 1:28:11 PM
to gen...@soe.ucsc.edu
Hi, everyone.

I would like to know how to implement case-insensitive or incomplete searches in an assembly hub. If I search for SOD1, Sod1, sod1, or even just od1 in the Human assembly, all of these searches succeed, but they fail in my assembly hub.

For example, if I search for smp_186980 at the following assembly hub:
the following message appears:

Warning/Error(s): Sorry, couldn't locate smp_186980 in genome database

but a gene named Smp_186980 exists in the SMPs v5.2 track.

The same happens with an incomplete search, such as 186980.

Is there a parameter that should be passed to the bedToBigBed command to enable case-insensitive or incomplete searches?

Thanks in advance.

--
David da Silva Pires

Brian Lee

May 28, 2015, 4:21:41 PM
to David da Silva Pires, gen...@soe.ucsc.edu
Dear David,

Thank you for using the UCSC Genome Browser and for your question about creating complex searches in track hubs.

There is a searchTrix feature that will allow for enhanced searches:
http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#searchTrix
http://genome.ucsc.edu/goldenPath/help/trix.html

Here is an example hub you can load:

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubIndexedBigBedSearchable/hub.txt

In essence, a program called ixIxx creates additional "trix" index files. Then, in trackDb.txt, alongside the "searchIndex name" line that makes the indexed bigBed searchable by name, a second line, "searchTrix outFile3.ix", is added. Beyond searching for named items like "SJ_96833_AA|GG", you can then also search for text indexed with those items, such as "2053" from the line used to create the second index: "SJ_96833_AA|GG 996 + 2053 -1 996".
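
To make the idea concrete, here is a toy Python model of what such an index does with a line like the one above. It is only a sketch for illustration: it assumes the first word of each line is the item name and the remaining words are search terms, and the function name build_word_index is made up here. It is not how ixIxx is implemented and it does not produce the real .ix/.ixx file format.

```python
from collections import defaultdict


def build_word_index(lines):
    """Map each lowercased search word to the set of item names it belongs to.

    Assumption for this sketch: the first whitespace-separated word on a line
    is the item name, and every following word is a search term for that item.
    """
    index = defaultdict(set)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        item_name, words = fields[0], fields[1:]
        for word in words:
            index[word.lower()].add(item_name)
    return index


# The example line quoted from the hub above.
lines = ["SJ_96833_AA|GG 996 + 2053 -1 996"]
index = build_word_index(lines)

# Looking up a term returns the named item(s) it was indexed with.
print(sorted(index["2053"]))
```

In this model, a lookup of "2053" leads back to SJ_96833_AA|GG, which mirrors the search behavior described for the example hub.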

If you search for 2053 in the hub linked above, scroll down to the bottom of the results to see the hub_56229_bigBed1 section with a match on SJ_96833_AA|GG at chr21:33032156-33036103.

You might have to create your own outFile3.txt listing the desired search terms for each item, such as SOD1 sod1 od1 superoxide dismutase 1. There are some notes in the above hub's trackDb.txt: http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubIndexedBigBedSearchable/hg19/trackDb.txt
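
As a rough sketch of how such a terms file could be generated for names like Smp_186980, the Python below reads BED lines and writes one line per item: the name first, followed by its lowercase form and the part after the last underscore. The helper names (search_terms, trix_input_lines), the choice of variants, and the sample coordinates are assumptions made for illustration, not something taken from the example hub.

```python
def search_terms(name):
    """Return extra lowercase terms to index for an item name."""
    lowered = name.lower()
    terms = [lowered]
    # Also index the part after the last underscore, e.g. "186980"
    # from "Smp_186980", so a partial search on the number can match.
    suffix = lowered.rsplit("_", 1)[-1]
    if suffix and suffix != lowered:
        terms.append(suffix)
    return terms


def trix_input_lines(bed_lines):
    """Yield 'name term term ...' lines from BED lines (name in column 4)."""
    seen = set()
    for line in bed_lines:
        if line.startswith(("#", "track", "browser")):
            continue  # skip comments and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # no name column
        name = fields[3]
        if name not in seen:
            seen.add(name)
            yield " ".join([name] + search_terms(name))


# Hypothetical BED line; the chromosome and coordinates are placeholders.
bed = ["chr1\t100\t200\tSmp_186980\n"]
for out in trix_input_lines(bed):
    print(out)
```

The resulting text file would then be indexed with ixIxx as described in the trix documentation linked above, and the bigBed itself would still need its name field indexed for the "searchIndex name" line to work.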

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group
