Extract mRNA : CDS/ UTR


Naman Mangukia

Jul 6, 2018, 11:16:30 AM
to gen...@soe.ucsc.edu, ke...@soe.ucsc.edu
Dear Professor,
Greetings !

I am exploring the UCSC Genome Browser to get in-depth information about human mRNA.
I want the 5'UTR, CDS, and 3'UTR of each mRNA as separate sequences, along with their positions in the mRNA.

I have reached the result page that shows the UTR and CDS regions in
red / blue for a single gene. Below are the steps in brief for your reference:

======================================================================== 
  NM_032291|SGIP1
        |
        v
    reference : https://www.ncbi.nlm.nih.gov/gene/84251/
        |
        v
    Annotation release     Status     Assembly     Chr     Location
    109     current     GRCh38.p12 (GCF_000001405.38)     1     NC_000001.11 (66533272..66751139)

        |
        | ____________Searching into UCSC_________
   
    https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr1%3A66533956%2D66751139&hgsid=680239491_3GbmjAYEYuzr3e1nR09DX9wcolra
        |
        v
    https://genome.ucsc.edu/cgi-bin/hgc?c=chr1&l=66533955&r=66751139&o=66533955&t=66751139&g=ncbiRefSeqCurated&i=NM_032291.3&db=hg38
        |
        v
    mRNA/Genomic Alignments (NM_032291.3)

BROWSER | SIZE IDENTITY CHROMOSOME  STRAND    START     END              QUERY      START  END  TOTAL
-----------------------------------------------------------------------------------------------------
browser | 10934  100.0%          1     +  66533956  66751139           NM_032291.3     1 10934 10951

        |
        v
    clicked on "10934  100.0%          1     +  66533956  66751139           NM_032291.3     1 10934 10951"
        |
        v
    https://genome.ucsc.edu/cgi-bin/hgc?hgsid=680241299_Ycl692Y3U2sya1HISPtiBHYUIGWO&g=htcCdnaAli&i=NM_032291.3&c=chr1&l=66533955&r=66751139&o=66533955&aliTable=ncbiRefSeqPsl
========================================================================


With the above procedure, I was able to fetch the CDS/UTR regions manually for one transcript.

I want the same information for all available mRNAs of the latest human assembly.


What is the proper way to do it?
Waiting for your response.

Regards,
Naman Mangukia
Research Scholar




Matthew Speir

Jul 11, 2018, 6:30:20 PM
to Naman Mangukia, UCSC Genome Browser Discussion List
Hello, Naman.

Thank you for your question about obtaining CDS/UTR sequence from the UCSC Genome Browser.

From the steps above, it looks like you are working with the NCBI RefSeq Curated track. More information on this track can be found on the description page: https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=refSeqComposite. In short, this 'Curated' track contains only the NM, NR, and YP accessioned transcripts which are curated by the RefSeq database. In addition to this track, we have the NCBI RefSeq Predicted track, which contains XM and XR accessioned transcripts that are predicted by RefSeq, and the NCBI RefSeq All track, which combines both the 'Predicted' and 'Curated' tracks.

The mRNA sequences for the transcripts (NM/NR/YP/XM/XR) in all these tracks are contained in a file on our downloads server: http://hgdownload.soe.ucsc.edu/gbdb/hg38/ncbiRefSeq/seqNcbiRefSeq.rna.fa. If you only want a subset of these transcripts, such as just those in the 'Curated' track, you can use our command-line utility 'faSomeRecords' which can also be obtained from our downloads server: http://hgdownload.soe.ucsc.edu/admin/exe/. In addition to a fasta file, the faSomeRecords utility takes as input a file containing a list of IDs that you want to extract from the input fasta file. You can use our Table Browser or MySQL server to obtain this information.

(A) Accessions from MySQL server

A single line can be used to grab the accessions for all of the items in the 'Curated' track:

mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -P 3306 -Ne 'select name from ncbiRefSeqCurated' hg38 > ncbiRefSeqCurated.ids.txt

In this command, you can replace ncbiRefSeqCurated with ncbiRefSeq or ncbiRefSeqPredicted to get the accessions for items in the 'All' or 'Predicted' tracks, respectively.

(B) Accessions from Table Browser

In a few short steps, you can get the accessions for all of the items in the 'Curated' track:

1. Navigate to the Table Browser, https://genome.ucsc.edu/cgi-bin/hgTables.
2. Make the following selections:
clade: Mammal
genome: Human
assembly: Dec. 2013 (GRCh38/hg38)
group: Genes and Gene Predictions
track: NCBI RefSeq
table: RefSeq Curated (ncbiRefSeqCurated)
region: genome
output format: selected fields from primary and related tables
output file: enter a file name or leave blank to view in web browser

3. Click 'get output'.
4. Under the section "Select Fields from hg38.ncbiRefSeqCurated", check the box next to "name". 
5. Click 'get output'.

As with the MySQL instructions above, you can swap out the table selected in step 2 with any of the different NCBI RefSeq tracks I described.
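
Once the list of accessions exists, the subsetting step that faSomeRecords performs can also be sketched in a few lines of Python. This is only an illustration, not the utility's actual implementation; the function names are hypothetical, and it assumes the record ID is the first whitespace-delimited word of each fasta header.

```python
from typing import Iterable, Iterator, Set, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header without '>', sequence) for each FASTA record."""
    header = None
    chunks = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(lines: Iterable[str], wanted: Set[str]) -> Iterator[Tuple[str, str]]:
    """Keep only records whose ID is in `wanted`.

    Assumption: the ID is the first whitespace-delimited word of the header.
    """
    for header, seq in read_fasta(lines):
        words = header.split()
        if words and words[0] in wanted:
            yield header, seq
```

To use it, read ncbiRefSeqCurated.ids.txt into a set (one accession per line) and pass an open handle on seqNcbiRefSeq.rna.fa as `lines`. For the full file, faSomeRecords itself is likely the simpler choice.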

Next, you can download the coordinates of the CDS within the mRNA from either the Table Browser, the MySQL server, or our downloads server.
 
(C) CDS coordinates from Table Browser

The steps for obtaining the CDS coordinates are similar to those for obtaining the accessions above, except with a few more selections to get the CDS fields:

Steps 1-3 are the same as those described under (B) above for getting a list of accessions.
4. Under "Linked Tables", check the box next to ncbiRefSeqCds.
5. Click "allow selection from selected tables".
6. Under "Select Fields from hg38.ncbiRefSeqCurated", check the box next to "name". 
7. Under "hg38.ncbiRefSeqCds fields", check the box next to "cds".
8. Click "get output".

(D) CDS coordinates from the MySQL server

The command for obtaining the CDS coordinates from the MySQL server is similar to that for obtaining the accessions:

mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -P 3306 -Ne 'select ncbiRefSeqCurated.name, ncbiRefSeqCds.cds from ncbiRefSeqCurated, ncbiRefSeqCds where ncbiRefSeqCurated.name=ncbiRefSeqCds.id' hg38

As with obtaining the accessions, you can replace ncbiRefSeqCurated with ncbiRefSeq or ncbiRefSeqPredicted to get CDS coordinates for the "All" or "Predicted" tracks, respectively.

(E) CDS coordinates from Downloads Server

You can download a file containing the CDS coordinates here: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/ncbiRefSeqCds.txt.gz.

Like the fasta file, this file contains the CDS coordinates for both the 'Curated' and 'Predicted' tracks, so if you only want a subset you will need to filter it. One way to do that is the UNIX 'grep' command together with the same list of accessions generated for filtering the fasta file.
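
As an alternative to grep, the filtering can be sketched in Python with an exact match on the accession. This sketch assumes the downloaded table has two tab-separated columns, the accession followed by the CDS string, in the same order as the `id` and `cds` fields used in the MySQL query above; please check the file before relying on that. The function name is hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def load_cds(lines: Iterable[str], wanted: Optional[Set[str]] = None) -> Dict[str, str]:
    """Map accession -> CDS string, e.g. 'NM_015276.1' -> '205..1782'.

    Assumption: each line is '<id>TAB<cds>'. If `wanted` is given,
    only accessions in that set are kept (exact match).
    """
    cds = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        acc, spec = fields[0], fields[1]
        if wanted is None or acc in wanted:
            cds[acc] = spec
    return cds
```

The exact match matters because a plain grep for NM_015276.1 would also hit an accession such as NM_015276.10. The compressed file can be read directly by passing `gzip.open(path, "rt")` as `lines`.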

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group


--
Matthew Speir
Outreach, User Experience, Quality Assurance and User Support
HCA, CIRM, and UCSC Genome Browser
UCSC Genomics Institute

Naman Mangukia

Jul 12, 2018, 11:44:00 AM
to Matthew Speir, UCSC Genome Browser Discussion List
Dear Sir Matthew,
Greetings !

I am going through the procedure you described.

Yes, I am working on Refseq mRNA sequences.

My remaining query:
Can I also extract the 5'UTR and 3'UTR coordinates, along with the CDS coordinates, for each mRNA sequence in the FASTA file?
Please find below an example of what I want from UCSC:

=============================================================
Gene : USP22
mRNA : NM_015276.1

cDNA NM_015276.1


GCAGCCGCAG CTCGGGGGCG GTGCCTGCCT TGCAGCCTCC CCTCGGCGAT  50
CGCGCAGCCC CATCTTTGTC CGGCCTCCGC GCTTTGTTCT CGGCGCCCGG  100
GCCTTGGCCA GCCTGGCCAG CCGCCGAGCA GCCCCCACGC CGCGCTGGCG  150
TCGTCCTCGC CTCCCTCGCC GCCGCCCCCC GCGCGCGGCC GGGCCTTGCC  200
CCCCATGGTG TCCCGGCCAG AGCCCGAGGG CGAGGCCATG GACGCCGAGC  250
TGGCGGTAGC GCCGCCGGGC TGCTCGCACC TGGGCAGCTT CAAGGTGGAC  300
AACTGGAAGC AGAACCTGCG GGCCATCTAC CAGTGCTTCG TGTGGAGCGG  350
CACGGCTGAG GCCCGCAAGC GCAAGGCCAA GTCCTGTATC TGCCATGTCT  400
GTGGCGTCCA CCTCAACAGG CTGCATTCCT GCCTCTACTG TGTCTTCTTC  450
GGCTGTTTCA CAAAGAAGCA TATTCACGAG CATGCGAAGG CGAAGCGGCA  500
CAACCTGGCC ATTGATCTGA TGTACGGAGG CATCTACTGT TTTCTGTGCC  550
AGGACTACAT CTATGACAAA GACATGGAAA TAATCGCCAA GGAGGAGCAG  600
CGAAAAGCTT GGAAAATGCA AGGCGTTGGA GAGAAGTTTT CAACTTGGGA  650
ACCAACCAAA CGGGAGCTTG AACTGCTGAA GCACAACCCG AAAAGGAGAA  700
AGATCACCTC GAACTGCACC ATAGGTCTGC GTGGGCTGAT CAACCTTGGG  750
AACACATGCT TCATGAACTG CATCGTGCAG GCCCTGACCC ACACGCCACT  800
TCTGCGGGAC TTCTTCCTGT CTGACAGGCA CCGCTGTGAG ATGCAGAGCC  850
CCAGCTCCTG TCTGGTCTGT GAGATGTCCT CACTGTTTCA GGAGTTTTAC  900
TCTGGACACC GGTCCCCTCA CATCCCGTAT AAGTTGCTGC ACCTGGTGTG  950
GACCCACGCG AGGCACCTAG CAGGCTACGA GCAGCAGGAC GCCCACGAGT  1000
TCCTCATCGC GGCCCTGGAC GTGCTCCACC GACACTGCAA AGGTGATGAC  1050
AATGGGAAGA AGGCCAACAA CCCCAACCAC TGCAACTGCA TCATAGACCA  1100
GATCTTCACA GGCGGGTTGC AGTCAGACGT CACCTGCCAA GTCTGCCATG  1150
GAGTCTCCAC CACCATCGAC CCCTTCTGGG ACATCAGCTT GGATCTCCCC  1200
GGCTCTTCCA CCCCATTCTG GCCCCTGAGC CCAGGGAGCG AGGGCAACGT  1250
GGTAAACGGG GAAAGCCACG TGTCGGGAAC CACCACGCTC ACGGACTGCC  1300
TGCGACGATT CACCAGACCA GAGCACTTGG GCAGCAGCGC CAAGATCAAG  1350
TGCAGCGGTT GCCATAGCTA CCAGGAGTCC ACAAAGCAGC TCACTATGAA  1400
GAAACTGCCC ATCGTAGCCT GTTTTCATCT CAAACGATTT GAACACTCAG  1450
CCAAGCTGCG GCGGAAGATC ACCACGTATG TGTCCTTCCC CCTGGAGCTG  1500
GACATGACCC CTTTCATGGC CTCCAGCAAA GAGAGCAGGA TGAATGGACA  1550
GTACCAGCAG CCCACGGACA GTCTCAACAA TGACAACAAG TATTCCCTGT  1600
TTGCTGTTGT TAACCATCAA GGGACCTTGG AGAGTGGCCA CTACACCAGC  1650
TTTATCCGGC AGCACAAAGA CCAGTGGTTC AAGTGTGACG ATGCCATCAT  1700
CACCAAGGCC AGCATCAAGG ACGTCCTGGA CAGCGAAGGG TACTTGCTGT  1750
TCTATCACAA ACAGTTCCTG GAATACGAGT AGCCTTATCT GCAGCTGGTC  1800
AGAAAAACAA AGGCAATGCA TTGGCAAGCC TCACAAAGTG ATCCTCCCTG  1850
GCCCCCCCCT CCCCCAAGTC TCCCGCCGCC TCCCCGGCCT GGTGACACCA  1900
CCTCCCATGC AGATGTGGCC CCTCTGCACC TGGGACCCAT CGGGTCGGGA  1950
TGGACCACAC GGACGGGGAG GCTCCTGGAG CTGCTTTGAA GATGGATGAG  2000
ATGAGGGGTG TGCTCTGGGT GGGAGGAGCA GCGTACACCC GTCACCAGAA  2050
CATCTCTTGT GTCATGACAT GGGGGTGCAA CGGGGGCCTC ACAGCACAGA  2100
GTGACCGCTG CCTGGCGTTC CCCAGCACTC GGTGTGGAAA GGCCCCTACC  2150
TGCTGTAAGA TTATGGGTCC ATGAAAGCAG TAAGCTGGAC ACAGAGGTGT  2200
AGTGTGCGGG ACAGAGGGCC TTGCAGATGC CTTTCTGTTG GTGTTTTAGT  2250
GTTAAAATAC GGAGAGTATG GAACTCTTCA CCTCCATTTT CTCAGCGGCT  2300
GTGAAGCAGC CTCCTAGCTT CGGAAGTACG GACACTACGT CGCGTTTTCA  2350
AGCGTGTCTG TTCTGCAGGT AACAGCATCA AGCTGCACGT GGAAGCATCT  2400
CGCGGTTTTC TAGAAACAGG CATTTTCTTA TCCCTCTCCC GCTCCTTTTT  2450
CCACAAAGGT GAATTTCATA AATGTAATAC TAGTAAAGTG AATGAATTAC  2500
TGAGTTTATA CAGAAATTTA GGTAACTTCT CCTTTAGTCT CAAGAGCGAG  2550
TCTTGCTTTT TAATGGGTGC CGTTTATGTT GCTGCCCGCC CTGTGTGCCT  2600
GGCTCCTCTG GGTGCCTTGG TGTCTGCTGG TGGCTGGCAG TGGGCGCAGC  2650
GGAGGAGAGT TGTGCTGCAG CTCATACGGT GTGTCTGTCA TCTCAGTCTG  2700
GAGTAAATGC AGTGTCTGCC GGTGTCTGAT GGGTTCTGTC CCTCGTATTT  2750
TCTTTGCCTT CTATCCCATT GCCTGGCTAC CGCTGCCTGG CAGCCAAGGG  2800
TGTTGGTCGC GAAGCTGGAG TGGCCTCTGG TGGAGCCTGC ATCTTGTCTC  2850
GTCTGCCTCT GCTTTACATT TGGTGTACTT TCGGGCGTGG TGGCAGTAAA  2900
ATGACACCGT GATTGAGCTT GTCAGCAGAG CTGAAAGAGA AAGTAGAAGG  2950
ATGTGCATTG TTTCTTGTAA GATATCTTGC ATGTATCTGT GTATTCAAAT  3000
TCAAACAGAG ATGGTTTGTC CATTTGTCCA CTGAGAAATT AGAAACTAGG  3050
GACAAGGGGG AGGAAAAGTA CTGAAATACA GTTTATGAAG CAAGTGTGTC  3100
TCGGGCTGTG CTTGTCCCAG GAGCCCCAGC AGCATCTGAA CTGAGGCTTC  3150
TTCAGTCCTG CAGGAACAGG ATCATCTGTC TCAGCGGTGG GCAGATGTTT  3200
TCATAGACAG CCAGGGAGTA AACACTGTTG GCTCTGTGGG CTGTATGGTC  3250
TCTGCCATAA ATAGTACAGA GATGTGGCTG TGTCTAGTAC AACTTTTAGA  3300
CACAGAAATC TGAATGACAT ATATTGTTCT GTGTCAAGAA ACTTAGATTT  3350
TTTTTTTAAC TATTTAAAAA CGTGAAACCT ATTCTTAGCT CACAGGCCAT  3400
GGAGAAGCTG GTGGGGACCA GACCCAGCTC CTTAGCTGGC TGGGCTGGGG  3450
AGGGGGTAGT GACAGTGGCA GCTGCTACTC ACTGCTCAGT GTGGAAAACA  3500
CAGGACTTGG CAATCACAGC CCGCAGAACC ATCATGTGTG GCAGAAGCCT  3550
GAGGGATGCG GTTTCTTGCC CACGTGCTCT GTTCATTTTC TGTTGTTTTT  3600
CTGCACTTAA AGAATTCACA TGGAAGCATG TTTTATAAAA TGAATTACCA  3650
GAGAAACAGA GATGGGCCGA GATTTTCAGA AATGGTCCCA TGTGACCAAG  3700
TTCTGCTGTT TGGGTGACAG TGCTTTGAAG ATCTCCTTTG AGGATGTGCA  3750
GTCTTTTTTT TTTTTTTTTT GAGATGGAGT TTGTTGCCCA GGCTGGAGTG  3800
AGTGGCACAG TCTCGGCTCA CTGCAACCTC CACCTCCTGG GTTCAAGCAG  3850
TTCTCGTGCC GCAGCCTCCC AAGTAGCTGG GACTACAGGC ATGCACCACC  3900
ACGCCAGGCT AATTTTTGTA TTTTTAGTAG AGATGGGGTT TCACCATGTC  3950
TCAAACTCCT GACCTCAGGC GATCCACCCA CCTCAGCGTC CCAAAGTGCT  4000
GGGATTATAG GCGTGAGCCA CCGCACCTGG CCTATGAGTG GTCTTTTAAT  4050
TAGGAACAAA TCTAATGGAA AGGAGAGTTG ACTGAAGTTG GCCCACAGGA  4100
TTGTGAGCTG GGCAGTGCCT TCATGAAGGC TTGCCACCTT GGGACGCCCC  4150
AGTTTACTGG GGTGTCTTGC GGAGTGCAGA AGGCTTTCTG GCAGCTGCCT  4200
GGGTTTGGCC AGACCCTGCC TCCCCTCCCG CCGGCCAACC CCTAGTCCCC  4250
TTCCTGTCTC CACTTGCATT CAGGGGTGGC TGCTGTTCTG AGAACATTAG  4300
AACTGGGAAG AGAGATGGAG TCACATGGAT TTTTGGTGGG CATTATTCTA  4350
AACTTTCGTA TCCAAGTTAG TCCCCCTTAT TCCACTGTGG CATTGCCGTT  4400
CTAAGCAGTT ACCTGATGCC TGCTGCTGAA GAGCTGCTCA CAGGAGGCGG  4450
CGGCGGCCCT GGCACTGCCC CTTGCATTAG GTCTTGTGTT TGATGTGTTC  4500
TTGTGAATTT ACTTTGTCAG AACAAAATAT TTACGCGTTG GGTTCAGGAA  4550
TTTCTTTTAG CTCCCCATCT GGCTGTGAAA TTCAGGAAAC CTCCCGTTGC  4600
CTAGTAATCA CCCCATGTAG GTGTACATTG TGACAAAGTG CATCTGACCA  4650
CTAAGGGGCC CCCTTGGTGA CCCCAGCACA TTCACAGCAG TGTTAAAATG  4700
GCCTGCATTT TGGAGATGCT GGCTGGCCTT TCAGTGCCTC CCAGGAAGAC  4750
ACATGGCCTT TCCCTCTTCA GATGCCTGAA GGGAGTGCTT TGAGGCAGGT  4800
GATGTGCTGG GAGTGTGGGC GGCCTCCCTC TGGCCCCGGG GCCCTCTGTG  4850
GACCTTGGCT CCCTCCGTGG ACCTGGGCTT CGTGGTGAGC ACTGCAGCCT  4900
CCCTGGGCAT TCCCTCCAGC GCCAGCACCA CTGCAACATA TAGACCTGAG  4950
TGCTATTGTA TTTTGGCTTG GTGTGTATGC TCTTCATTGT GTAAAATTGC  5000
TGTTCTTTTG ACAATTTAAG TGATTGTTTT GTTTACTGTA AGTTTGAAAA  5050
TAAAAATGAA GAAAAAAATT CCAATGACTG TGCTGTGGTT GGAGACTTTA  5100
TTTACCAAGA TGTTTACTCT TCCTTTCCCC TTCCATTTTG AGGAGCTGTG  5150
TCACTCCTCC TCCCCCCCAG TGCTTTGTAG TCTCTCCTAT GTCATAATAA  5200
AGCTACATTT TCTCTGAGAA 
=============================================================

From the above data, I want 4 things:

1. FASTA sequence of this mRNA
2. 5'UTR coordinates : 1-204 (Red color)
3. CDS coordinates : 205-1782 (Blue color)
4. 3' UTR coordinates : 1783-5220 (Red color)
____________________________________________
At the end, I want 4 fasta sequences for an mRNA:

1.
>mRNA|NM_015276.1
GCAGCCGCAGCTCGGGGGCGGTGCCTGCCTTGCAGCCTCCCC
TC.....ATGGTGTCCCGGCCAGAGCCCGAGGGCGAGGCCATGGAC
GCCGAGC.....TCTATCACAAACAGTTCCTGGAATACGAGTAG....TC
CCCCCCAGTGCTTTGTAGTCTCTCCTATGTCATAATAAAGCTACAT
TTTCTCTGAGAA

2.
>5'UTR|NM_015276.1
GCAGCCGCAGCTCGGGGGCGGTGCCTGCCTTGCAGCCTCCCC
TC...CGCGCGGCCGGGCCTTGCCCCCC

3.
>CDS|NM_015276.1
ATGGTGTCCCGGCCAGAGCCCGAGGGCGAGGCCATGGAC
GCCGAGC.....TCTATCACAAACAGTTCCTGGAATACGAGTAG

4.
>3'UTR|NM_015276.1
CCTTATCTGCAGCTGGTCAGAAAAACAAAGGCAATGCATTGG
CAAGCC.....TCCCCCCCAGTGCTTTGTAGTCTCTCCTATGTCATAATAA
AGCTACATTTTCTCTGAGAA

Similarly, I want these 4 sequences for each available Refseq mRNA entry.


Regards,
Naman




Brian Lee

Jul 13, 2018, 7:07:11 PM
to Naman Mangukia, Matthew Speir, UCSC Genome Browser Discussion List
Dear Naman,

Thank you for using the UCSC Genome Browser and for your question.

One of our engineers suggests that your goal could be accomplished with some scripting like this:

1. Make a BED file, using the CDS coordinates and the size of each RefSeq transcript, that enumerates the regions to be output (in BED 0-based half open coordinates; see http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/). For example, it would have these 4 lines for NM_015276.1 above (CDS = 205..1782, size = 5220):

NM_015276.1  0     5220  mRNA|NM_015276.1
NM_015276.1  0     204   5'UTR|NM_015276.1
NM_015276.1  204   1782  CDS|NM_015276.1
NM_015276.1  1782  5220  3'UTR|NM_015276.1

2. Use faToTwoBit to convert seqNcbiRefSeq.rna.fa to 2bit

3. Use twoBitToFa -bed=<fileCreatedInStep1> to get the desired fasta.

Step 1 is definitely the complex part, but with some background in bioinformatics and scripting it is possible. The size of each mRNA can be obtained from the fasta file like this:

faSize -detailed seqNcbiRefSeq.rna.fa > mrna.sizes
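
Step 1 could be sketched in Python along the following lines. This is a minimal illustration under stated assumptions rather than a finished script: the function names are hypothetical, the sizes are read from the two-column (name, size) output of faSize -detailed, and only a simple CDS of the form "start..end" is handled. Anything else returns None so it can be reviewed separately.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

BedRow = Tuple[str, int, int, str]
SIMPLE_CDS = re.compile(r"(\d+)\.\.(\d+)")


def parse_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Parse 'name<whitespace>size' lines, as written to mrna.sizes."""
    sizes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def bed_rows(acc: str, cds: str, size: int) -> Optional[List[BedRow]]:
    """Return BED rows (0-based, half-open) for mRNA, 5'UTR, CDS and 3'UTR.

    `cds` is 1-based and inclusive, e.g. '205..1782'. Returns None when
    the CDS is not a single plain range or does not fit in the transcript.
    """
    m = SIMPLE_CDS.fullmatch(cds.strip())
    if m is None:
        return None
    start, end = int(m.group(1)), int(m.group(2))
    if not (1 <= start <= end <= size):
        return None
    rows = [(acc, 0, size, "mRNA|" + acc)]
    if start > 1:  # omit an empty 5'UTR
        rows.append((acc, 0, start - 1, "5'UTR|" + acc))
    rows.append((acc, start - 1, end, "CDS|" + acc))
    if end < size:  # omit an empty 3'UTR
        rows.append((acc, end, size, "3'UTR|" + acc))
    return rows


def to_bed_line(row: BedRow) -> str:
    """Format one row as a tab-separated BED line."""
    return "\t".join(str(field) for field in row)
```

Calling bed_rows("NM_015276.1", "205..1782", 5220) yields the four rows listed under step 1. Only the start coordinate shifts by one, because a 1-based inclusive end equals the 0-based exclusive end.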

It may be helpful to be aware that some transcripts have incomplete or complex CDS. For example, a few transcripts depend on ribosomal slippage, e.g. NM_001134939.1 with CDS "join(168..260,262..741)". The incomplete CDS seem to be all XM_ at this point.
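
To find such transcripts before building the BED file, the CDS strings could be classified with a small parser like the sketch below. It assumes the strings follow GenBank-style location syntax: plain "start..end" ranges, optionally wrapped in "join(...)", with "<" or ">" marking an incomplete end. That marker convention is an assumption on my part and is not stated in this thread; the function name is hypothetical, and other location forms raise an error rather than being guessed at.

```python
import re
from typing import List, Tuple

# One range, with optional '<' / '>' markers for an incomplete end.
SEGMENT = re.compile(r"([<>]?)(\d+)\.\.([<>]?)(\d+)")


def parse_cds(spec: str) -> Tuple[List[Tuple[int, int]], bool]:
    """Return (segments, is_partial) for a CDS location string.

    Segments are 1-based inclusive (start, end) pairs. `is_partial` is
    True if any segment carries a '<' or '>' marker (assumed convention).
    """
    body = spec.strip()
    if body.startswith("join(") and body.endswith(")"):
        body = body[len("join("):-1]
    segments = []
    partial = False
    for part in body.split(","):
        m = SEGMENT.fullmatch(part.strip())
        if m is None:
            raise ValueError("unsupported CDS location: " + spec)
        partial = partial or bool(m.group(1) or m.group(3))
        segments.append((int(m.group(2)), int(m.group(4))))
    return segments, partial
```

A transcript with more than one segment, or with is_partial set, probably needs its UTR and CDS boundaries decided by hand rather than by a single "start..end" rule.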

These utilities like faToTwoBit, faSize, and twoBitToFa can be obtained here: http://hgdownload.soe.ucsc.edu/admin/exe/

Thank you again for your inquiry and using the UCSC Genome Browser. If you have any further public questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee
UC Santa Cruz Genomics Institute


