Dear Michael,
Thank you for using the UCSC Genome Browser and for your question about finding motifs in different genes.
You could employ a command line utility, such as tacg or findMotif, to isolate your motifs:
If you wish to use the findMotif utility, review this archived mailing list question:
You could then create a custom track of your motifs and use the Table Browser to intersect those regions with a gene track.
For example, after building a BED file of motif locations with findMotif -motif=cacgtg /gbdb/hg19/hg19.2bit > cacgtgMotifHg19.bed, you could add it as a custom track at
http://genome.ucsc.edu/cgi-bin/hgCustom. Once it is added, you can edit it to have the name cacgtgMotifHg19 to make things simpler. (Please note that this motif will match twice, once on each strand: C-G, A-T, C-G, G-C, T-A, G-C.)
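To see why this motif is reported on both strands, a minimal Python sketch can check whether a motif equals its own reverse complement. The function name reverse_complement is only illustrative and is not part of findMotif.

```python
# Translation table that maps each base to its complement (both cases).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

motif = "cacgtg"
# A motif that equals its reverse complement reads the same on both strands,
# so every occurrence can be reported once per strand.
same_on_both_strands = reverse_complement(motif) == motif
print(motif, reverse_complement(motif), same_on_both_strands)
```

A motif that is not its own reverse complement would instead need each strand searched for a different sequence.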
Then navigate to the Table Browser,
http://genome.ucsc.edu/cgi-bin/hgTables, and select "group:" Custom Tracks and "track:" cacgtgMotifHg19, then click the "create" button next to the "Intersection" option. With "All cacgtgMotifHg19 track records that have any overlap with UCSC Genes" selected, you can click "submit" and change "output format" to "custom track". Then give this new custom track a name like "cacgtg UCSC gene exons" and click "get custom track in genome browser". The result will be only the motifs that fall within the UCSC gene exons, in this case 28,162.
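The "any overlap" idea can also be illustrated outside the Table Browser. The following is a minimal Python sketch, not the Table Browser's actual implementation; the function name any_overlap and the toy coordinates are hypothetical, and intervals are assumed to use BED-style half-open coordinates (start included, end excluded).

```python
from collections import defaultdict

def any_overlap(records, regions):
    """Return the records that overlap at least one region.

    Both arguments are lists of (chrom, start, end) tuples in
    half-open BED coordinates.
    """
    # Group regions by chromosome so each record is only compared
    # against regions on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in records:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, ())):
            kept.append((chrom, start, end))
    return kept

# Toy data for illustration only (not real annotations).
motifs = [("chr1", 100, 106), ("chr1", 500, 506), ("chr2", 100, 106)]
exons = [("chr1", 90, 103), ("chr1", 506, 600)]
print(any_overlap(motifs, exons))
```

In this toy data the second motif ends exactly where an exon starts, so under half-open coordinates it does not count as overlapping.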
By first creating a file of entire gene regions, you could do another intersection to find where the motifs fall in all regions of a gene. From the Table Browser, select "group:" Genes... and "track:" UCSC Genes (or whatever gene track you wish), set "output format:" to selected fields..., click "get output", and select only "chrom", "txStart", and "txEnd". Then upload the results as a custom track. The intersection will contain many more motifs, 279,538 in this case, as it now includes non-coding UCSC gene regions.
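If you save the selected-fields output to a file first, a small script can tidy it into three-column BED lines before upload. This is a minimal sketch that assumes the output is tab-separated with chrom, txStart, and txEnd as the first three columns and that any header line begins with "#"; the function name to_bed3 and the sample rows are hypothetical.

```python
def to_bed3(lines):
    """Convert tab-separated chrom/txStart/txEnd lines to BED3 strings."""
    bed = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and header/comment lines (assumed to start with '#').
        if not line or line.startswith("#"):
            continue
        chrom, tx_start, tx_end = line.split("\t")[:3]
        # int() fails loudly if a coordinate column is not numeric.
        bed.append(f"{chrom}\t{int(tx_start)}\t{int(tx_end)}")
    return bed

# Toy rows for illustration only.
sample = ["#chrom\ttxStart\ttxEnd", "chr1\t1000\t2000", "", "chr2\t300\t450"]
print("\n".join(to_bed3(sample)))
```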
Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to
gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to
genom...@soe.ucsc.edu.
All the best,
Brian Lee
UCSC Genome Bioinformatics Group