Hello David,
Thank you for your question about obtaining CDS sequence for items in your BED 12 track. We would like to improve the DNA retrieval options for the description pages of individual features, but there is no timetable for it right now. In the interim, the easiest way to get the sequence filtering options you describe is to use the Table Browser. First load your hub on our site, and then follow these steps:
1. Open the UCSC Table Browser at http://genome.ucsc.edu/cgi-bin/hgTables (or click "Table Browser" from the top "Tools" menu on our site).
2. Select your track hub and track from the drop-down menus, set the region to "genome", then click the "Identifiers: paste list" button.
3. On the new page, enter the name of the BED 12 item whose sequence you want in the text box, then click "submit".
4. Select the output format "sequence" and click "get output".
On the resulting page, you should be able to choose which portions of your feature to retrieve sequence for (CDS, exons, UTR, etc.).
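For reference, here is a minimal Python sketch of how the CDS portion of a BED 12 item relates to its fields: the coding region is the part of each block (exon) that falls between thickStart and thickEnd. The function name cds_blocks and the example item are hypothetical and used only for illustration. The sketch computes coordinates only; it does not fetch sequence or reverse-complement minus-strand items.

```python
def cds_blocks(bed12_line):
    """Return (chrom, start, end) for the coding part of each block in a BED 12 line."""
    fields = bed12_line.rstrip("\n").split("\t")
    chrom, chrom_start = fields[0], int(fields[1])
    thick_start, thick_end = int(fields[6]), int(fields[7])
    # blockSizes and blockStarts are comma-separated and may end with a trailing comma.
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    rel_starts = [int(x) for x in fields[11].rstrip(",").split(",")]

    coding = []
    for size, rel in zip(sizes, rel_starts):
        block_start = chrom_start + rel  # blockStarts are relative to chromStart
        block_end = block_start + size
        # Clip the block to the thick (coding) region.
        start, end = max(block_start, thick_start), min(block_end, thick_end)
        if start < end:
            coding.append((chrom, start, end))
    return coding


# Hypothetical three-exon item; the last exon lies entirely in the 3' UTR.
example = "chr1\t1000\t2000\titemA\t0\t+\t1100\t1900\t0\t3\t200,300,100,\t0,400,900,"
print(cds_blocks(example))
```

An item with thickStart equal to thickEnd has no coding region, so the function returns an empty list for it.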
I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu or genome...@soe.ucsc.edu. Questions sent to those addresses will be archived in publicly-accessible forums for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.
--
Jonathan Casper
UCSC Genome Bioinformatics Group
--
Hello David,
Thank you for reporting that the identifiers filter isn't correctly limiting sequence output! One of our engineers is working on several problems related to accessing assembly hubs with the UCSC Table Browser, and this one is definitely on the list. The issue is not fixed yet. In the meantime, you can keep using the Table Browser on our test server at http://genome-test.soe.ucsc.edu; any fix for this problem will appear there first. As a temporary workaround, you can pass the output of your Table Browser query through our faFilter tool (available at http://hgdownload.soe.ucsc.edu/admin/exe). faFilter accepts the names of the FASTA records you want and extracts only those records from the data. Use -name to select a record by name (wildcards are allowed), or -namePatList to read a list of name patterns from a file, one per line:
faFilter -name='accession' TableBrowserOutput.fa MyCDSResults.fa
faFilter -namePatList=accessionFile TableBrowserOutput.fa MyCDSResults.fa
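If faFilter is not available on your system, the same kind of filtering can be sketched in a few lines of Python. This is only an illustration, not a replacement for faFilter: the function name filter_fasta and the sample records are hypothetical, and the sketch assumes the record name is the first whitespace-delimited token after ">". Table Browser headers may carry a prefix before the item name, so check your file and use a wildcard pattern such as "*itemA" if needed.

```python
from fnmatch import fnmatchcase


def filter_fasta(lines, patterns):
    """Yield the lines of FASTA records whose name matches any shell-style pattern."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            # Decide once per record; the sequence lines that follow inherit the decision.
            keep = any(fnmatchcase(name, pattern) for pattern in patterns)
        if keep:
            yield line


# Hypothetical input standing in for TableBrowserOutput.fa
sample = [
    ">hub_track_itemA range=chr1:1101-1700\n",
    "ATGGCC\n",
    ">hub_track_itemB range=chr2:1-300\n",
    "ATGTTT\n",
]
kept = list(filter_fasta(sample, ["*itemA"]))
print("".join(kept), end="")
```

To filter a real file, open TableBrowserOutput.fa, pass the file object as lines, and write the yielded lines to MyCDSResults.fa.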
--
Jonathan Casper
UCSC Genome Bioinformatics Group