questions on alternative chromosomes and different types of tracks in the human assembly

Tzachi Hagai

Nov 7, 2016, 5:46:14 PM
to gen...@soe.ucsc.edu
Hello,

I was wondering if you could help me with the following questions:

(1) In the "Table Browser", for the human assembly with the default genome selected, what is the difference between "GENCODE V24" and "All GENCODE V24" in the track field?

(2) Alternative chromosomes:
When downloading the human genome chromosome sequences or the BED / GTF files of gene annotations, some chromosomes are annotated as "alt" (e.g., chr14_KI270847v1_alt). Are these regions alternative haplotypes that have analogous regions in the primary assembly?

Thank you very much in advance,
Tzachi

Tzachi Hagai

Nov 14, 2016, 10:21:30 AM
to gen...@soe.ucsc.edu
Hello,

I was wondering if you could help me with the following questions:

(1) In the "Table Browser", for the human assembly with the default genome selected, what is the difference between "GENCODE V24" and "All GENCODE V24" in the track field? (And what is the difference between the comprehensive and basic subsets?)

(2) Alternative chromosomes:
When downloading the human genome chromosome sequences or the BED / GTF files of gene annotations, some chromosomes are annotated as "alt" (e.g., chr14_KI270847v1_alt). Are these regions alternative haplotypes that have analogous regions in the primary assembly?

Thank you very much in advance,
Tzachi


Christopher Lee

Nov 15, 2016, 4:18:35 PM
to Tzachi Hagai, UCSC Genome Browser Discussion List

Hi Tzachi,

Thank you for your questions about the GENCODE V24 and ALL GENCODE V24 tracks, the differences between the Basic and Comprehensive sets, and alternative chromosomes.

There are a couple of differences between the GENCODE V24 and ALL GENCODE V24 tracks, summarized below:

ALL GENCODE V24 - super-track of 5 subtracks/tables:
1. Basic (a subset of the "Comprehensive" set, more info on the criteria for selection here: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=wgEncodeGencodeV24#basicSetSelection)
2. Comprehensive
3. Pseudogenes
4. 2-way Pseudogenes
5. PolyA

More info on these tracks can be found here: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=wgEncodeGencodeV24

GENCODE V24 - just one track/table:
- Consists of the Comprehensive set from ALL GENCODE V24 (which includes the Basic set)
- Does not include the tables Pseudogenes, 2-way Pseudogenes or PolyA.
- Does not "split up" the Comprehensive track into 2 subtracks, as is seen in "ALL GENCODE V24" where the "Basic" set is a separate track.
- Shows only the items from the Basic set by default in the main Genome Browser display.

More information on the GENCODE V24 track can be found here: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=knownGene

When getting output from the Table Browser, the output from GENCODE V24 and ALL GENCODE V24 Comprehensive should correspond. The main difference you will see is that the supporting tables behind the two tracks are different, which is due to how the knownGene table used to be built. You can read more about the transition from the old-style 'UCSC Genes' track to the 'GENCODE V24' track here:
http://genome.ucsc.edu/blog/new-default-gene-set-on-grch38-gencode-basic-genes/

As for your question about alternative chromosome names: yes, these names refer to alternative haplotype sequences that map to the region of the chromosome indicated in their name. For instance, in your example, chr14_KI270847v1_alt is a chr14 alternative haplotype sequence. For more information on this naming convention, please see the section "Chromosome Naming Conventions" on this page:
http://genome.ucsc.edu/cgi-bin/hgGateway?db=hg38
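
If it helps when filtering downloaded sequence or annotation files, the naming pattern can be split apart programmatically. The following is only a minimal Python sketch: the function name `parse_seq_name`, the `SeqName` type, and the regular expression are illustrative assumptions based on names such as chr14_KI270847v1_alt, not an official UCSC utility, and the list of suffixes it recognizes (`alt`, `random`, `fix`) is assumed rather than taken from the page above.

```python
import re
from typing import NamedTuple, Optional


class SeqName(NamedTuple):
    """Parts of an hg38-style sequence name (hypothetical helper type)."""
    chrom: str                 # e.g. "chr14", or "chrUn" for unplaced contigs
    accession: Optional[str]   # e.g. "KI270847"; None for primary chromosomes
    version: Optional[int]     # e.g. 1; None for primary chromosomes
    kind: str                  # "primary", "alt", "random", "fix" or "unplaced"


# Assumed pattern: chr{chrom}_{accession}v{version}[_{suffix}]
_NON_PRIMARY = re.compile(
    r"^(chr[0-9A-Za-z]+)_([A-Z]+[0-9]+)v([0-9]+)(?:_(alt|random|fix))?$"
)


def parse_seq_name(name: str) -> SeqName:
    """Split a sequence name into chromosome, accession, version and kind."""
    match = _NON_PRIMARY.match(name)
    if match is None:
        # Names without an accession part (chr1, chrX, chrM, ...) are
        # treated as primary assembly chromosomes in this sketch.
        return SeqName(name, None, None, "primary")
    chrom, accession, version, suffix = match.groups()
    # An accession with no suffix (e.g. chrUn_KI270302v1) is assumed unplaced.
    kind = suffix if suffix is not None else "unplaced"
    return SeqName(chrom, accession, int(version), kind)


if __name__ == "__main__":
    for example in ("chr14", "chr14_KI270847v1_alt", "chrUn_KI270302v1"):
        print(example, "->", parse_seq_name(example))
```

For example, keeping only the lines of a BED file whose first column parses with `kind == "primary"` would restrict the annotations to the primary assembly chromosomes.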

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Christopher Lee
UCSC Genomics Institute

