Making gene annotation tracks searchable

Stephen Turner

May 16, 2024, 4:29:20 PM
to genome...@soe.ucsc.edu
Hello.

We recently set up a local/private mirror (after purchasing an enterprise license). I'm loading genomes that are either private or aren't in annotation hub. 

I have my genome loaded following the directions at https://genomewiki.ucsc.edu/index.php?title=Building_a_new_genome_database

I have a GTF with an annotation, and I'm running the code below to load the gene track data. The tracks show up, and I can browse by location to find a gene, but I can't search by gene ID or name. 

# gtf to genepred, bigbed
export gtf=/path/to/my.gtf
gtfToGenePred -genePredExt $gtf stdout | sort -k2,2 -k4n,4n > ${gtf}.genePred
genePredToBigGenePred ${gtf}.genePred ${gtf}.txt
bedToBigBed -extraIndex=name,name2 -type=bed12+8 -tab -as=/mnt/db/kent/src/hg/lib/bigGenePred.as ${gtf}.txt ${genomeid}.chrom.sizes ${gtf}.bb

# Make searchable
awk '{print $1"\t"$12"\t"$1}' ${gtf}.genePred > ${gtf}.input.txt
ixIxx ${gtf}.input.txt ${gtf}.input.ix ${gtf}.input.ixx

# Add this to trackDb.ra
track geneannotation
group genes
bigDataUrl /mnt/db/gbdb/${genomeid}/${gtf}.bb
shortLabel ${genomeid} genes
longLabel ${genomeid} gene annotation track
type bigGenePred
visibility pack
baseColorDefault genomicCodons
searchIndex name,name2
searchTrix /mnt/db/gbdb/${genomeid}/${gtf}.input.ix

# Load/update
hgTrackDb . $genomeid trackDb /mnt/db/kent/src/hg/lib/trackDb.sql $genomeid
hgFindSpec . $genomeid hgFindSpec /mnt/db/kent/src/hg/lib/hgFindSpec.sql $genomeid


I've looked through the mailing list here and the public genome google group messages, and I've looked through the documentation below, and can't seem to find an answer.
Thanks for pointing me in the right direction!

Stephen

Christopher Lee

May 17, 2024, 1:40:52 PM
to Stephen Turner, genome...@soe.ucsc.edu
Hi Stephen,

I don't think you require a trix file in your case as you are just
trying to make the name and name2 fields searchable. For that you can
just build the bigBed the way you did, add the searchIndex trackDb
setting as you did, and then add this stanza to the trackDb file, as a
separate stanza from what you have now:

searchName geneannotation
searchTable geneannotation
searchType bigBed
searchDescription Gene Annotation
# termRegex is optional: set it to a regular expression only if all of
# the indexed fields match one particular expression; otherwise omit it.
termRegex ...

This is understandably confusing, because all of our documentation
for making bigBeds searchable assumes a track hub is being built, but
since you are adding tracks directly to the database you don't need
that. I am not sure whether you need one stanza per indexed field of
the bigBed; try starting with just one and see what happens.
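
As a rough sketch of how the two stanzas might sit together, the shell snippet below writes the track stanza from the original post and the search stanza suggested above into one .ra file, separated by the blank line that .ra files use between stanzas. The `mygenome` database name, the `my.gtf.bb` file name, and the `trackDb.example.ra` output file are placeholders, not values from this thread. The searchTrix and termRegex lines are left out because both are described above as optional.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute your own database name and paths.
genomeid=mygenome
ra=./trackDb.example.ra

# Two stanzas: the track itself, then a blank line, then the search spec.
cat > "$ra" <<EOF
track geneannotation
group genes
bigDataUrl /mnt/db/gbdb/${genomeid}/my.gtf.bb
shortLabel ${genomeid} genes
longLabel ${genomeid} gene annotation track
type bigGenePred
visibility pack
baseColorDefault genomicCodons
searchIndex name,name2

searchName geneannotation
searchTable geneannotation
searchType bigBed
searchDescription Gene Annotation
EOF

# Afterwards, reload both tables as in the original post, e.g.:
#   hgTrackDb . $genomeid trackDb /mnt/db/kent/src/hg/lib/trackDb.sql $genomeid
#   hgFindSpec . $genomeid hgFindSpec /mnt/db/kent/src/hg/lib/hgFindSpec.sql $genomeid
```

Whether one search stanza covers both indexed fields is the open question mentioned above, so this starts with a single stanza.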

You could use a trix file in addition to what you have if you wanted
some text other than what's in the bigBed to be searchable. For that,
you did everything correctly; you would just need a search stanza
like the one above. In general, a native track needs a search stanza
so that the settings get into the hgFindSpec table. If the native
track is a bigBed, you also need to build the bigBed index when
running bedToBigBed (you did this already) AND have the searchIndex
trackDb setting in the bigBed's trackDb stanza. Native tracks are just
a different beast than track hubs. For more examples of search
stanzas, look through the various trackDb.ra files for "searchName" or
"searchTable" settings:
https://github.com/ucscGenomeBrowser/kent/tree/master/src/hg/makeDb/trackDb
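
One possible way to hunt for those examples, assuming a local checkout of the kent source tree: the sketch below lists the .ra files under a directory that contain a searchName or searchTable line. The function name `find_search_stanzas` is made up for illustration, and the example path simply reuses the /mnt/db/kent location from the original post.

```shell
#!/bin/sh
# find_search_stanzas DIR: print the .ra files under DIR that contain
# at least one searchName or searchTable setting.
find_search_stanzas() {
    find "$1" -type f -name '*.ra' \
        -exec grep -lE '^[[:space:]]*search(Name|Table)[[:space:]]' {} +
}

# Example (assumes the kent tree lives under /mnt/db/kent):
#   find_search_stanzas /mnt/db/kent/src/hg/makeDb/trackDb | head
```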

Let us know if you have any further questions,
Christopher Lee
UCSC Genomics Institute