[Genome] GenePred Format


Alden Huang

Feb 18, 2011, 7:32:12 PM
to Gen...@soe.ucsc.edu
Hi,

I just had a quick question...

In the description of the Gene Predictions table format used by UCSC,
specifically under Gene Predictions (Extended), two fields are listed:

string cdsStartStat; "enum('none','unk','incmpl','cmpl')"
string cdsEndStat; "enum('none','unk','incmpl','cmpl')"

I would simply like to know the significance of these particular fields.

thanks,

alden

Katrina Learned

Feb 23, 2011, 3:33:17 PM
to Alden Huang, Gen...@soe.ucsc.edu
Hi Alden,

These fields provide additional information about the status of the
start and end of a gene's coding region. The possible statuses are:

- none - no CDS is specified by the sequence's data source.
- unk - unknown; it is not known whether the CDS start/end is complete.
- incmpl - the CDS start/end is incomplete.
- cmpl - the CDS start/end is complete.
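
As a rough illustration of the four statuses listed above, they can be modeled as an enumeration in Python. This is only a sketch, not UCSC code: the names CdsStat and parse_cds_stat are hypothetical and were chosen for this example.

```python
from enum import Enum


class CdsStat(Enum):
    """The four values allowed in cdsStartStat and cdsEndStat."""
    NONE = "none"      # no CDS specified by the data source
    UNK = "unk"        # not known whether the CDS start/end is complete
    INCMPL = "incmpl"  # the CDS start/end is incomplete
    CMPL = "cmpl"      # the CDS start/end is complete


def parse_cds_stat(value: str) -> CdsStat:
    """Convert a raw field value to a CdsStat, rejecting anything else."""
    try:
        return CdsStat(value)
    except ValueError:
        raise ValueError(f"unexpected CDS status: {value!r}") from None


print(parse_cds_stat("cmpl"))
```

Rejecting unknown strings up front is a design choice for the sketch; it makes a typo such as "complete" fail loudly instead of being carried along silently.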

cdsStartStat refers to the cdsStart end of the gene, which is the start
codon for a positive strand gene and the stop codon for a negative
strand gene. cdsEndStat refers to the cdsEnd end of the gene, which is
the stop codon for a positive strand gene and the start codon for a
negative strand gene.
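
The strand rule above can be sketched in Python as follows. This is a minimal example under stated assumptions, not part of any UCSC tool: the function name codon_status and the dictionary keys are hypothetical, and the strand and the two status values are assumed to have already been read from a genePred (Extended) record.

```python
def codon_status(strand: str, cds_start_stat: str, cds_end_stat: str) -> dict:
    """Return the status of the start and stop codon for one transcript.

    cdsStart is the start codon on the positive strand and the stop codon
    on the negative strand; cdsEnd is the reverse.
    """
    if strand == "+":
        return {"start_codon": cds_start_stat, "stop_codon": cds_end_stat}
    if strand == "-":
        return {"start_codon": cds_end_stat, "stop_codon": cds_start_stat}
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


# A negative strand gene whose cdsStartStat is 'incmpl' has an
# incomplete stop codon, not an incomplete start codon.
print(codon_status("-", "incmpl", "cmpl"))
```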

Please don't hesitate to contact the mail list again if you have any
further questions.

Katrina Learned
UCSC Genome Bioinformatics Group
