The new GRCh38 Human Genome Browser has arrived!


Donna Karolchik

Mar 6, 2014, 20:04:29
– Genome-announce, Genecats, David Haussler, cbseadmin, Branwyn Wagman
hi all --

In the final days of 2013, the Genome Reference Consortium (GRC)
released the eagerly awaited GRCh38 human genome assembly, the first
major revision of the human genome in more than four years. During the
past two months, the UCSC team has been hard at work building a browser
that will let our users explore the new assembly using their favorite
Genome Browser features and tools. Today we're announcing the release of
a preliminary browser on the GRCh38 assembly. Although we still have
plenty of work ahead of us in constructing the rich feature set that our
users have come to expect, this early release will allow you to take a
peek at what's new.

Starting with this release, the UCSC Genome Browser version numbers for
human assemblies will match those of the GRC to minimize version
confusion. Hence, the GRCh38 assembly is referred to as hg38 in Genome
Browser datasets and documentation. We've also made some slight changes
to our chromosome naming scheme that affect primarily the names of
haplotype chromosomes, unplaced contigs and unlocalized contigs. For
more details about this, as well as information about the GRCh38
assembly files, statistics, and links for downloading the UCSC data
files, see the Genome Browser hg38 gateway page
(http://genome.ucsc.edu/cgi-bin/hgGateway?db=hg38).
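
For scripts that need to tell the new sequence names apart, here is a minimal sketch in Python. It assumes the hg38 naming patterns documented on the gateway page rather than anything spelled out in this message: alt loci named like chr6_GL000250v2_alt, unlocalized contigs like chr1_KI270706v1_random, and unplaced contigs like chrUn_KI270302v1. The function name classify_sequence_name and the category labels are hypothetical, chosen only for illustration.

```python
import re

# Assumed hg38 naming patterns (check the hg38 gateway page before relying on them):
#   chr6_GL000250v2_alt     -> alternate locus (haplotype) scaffold
#   chr1_KI270706v1_random  -> unlocalized contig (chromosome known, position not)
#   chrUn_KI270302v1        -> unplaced contig (chromosome unknown)
_CHROM = r"chr(?:[0-9]{1,2}|X|Y)"
_ACCESSION = r"[A-Z]{2}[0-9]+v[0-9]+"

_PATTERNS = [
    ("alt", re.compile(rf"^{_CHROM}_{_ACCESSION}_alt$")),
    ("unlocalized", re.compile(rf"^{_CHROM}_{_ACCESSION}_random$")),
    ("unplaced", re.compile(rf"^chrUn_{_ACCESSION}$")),
    ("mitochondrial", re.compile(r"^chrM$")),
    ("primary", re.compile(rf"^{_CHROM}$")),
]


def classify_sequence_name(name: str) -> str:
    """Return a category label for an hg38-style sequence name (hypothetical helper)."""
    for label, pattern in _PATTERNS:
        if pattern.match(name):
            return label
    # Names in the older hg19 style (for example chr6_apd_hap1) land here.
    return "unknown"


if __name__ == "__main__":
    for seq in ["chr1", "chr6_GL000250v2_alt", "chr1_KI270706v1_random",
                "chrUn_KI270302v1", "chrM", "chr6_apd_hap1"]:
        print(seq, classify_sequence_name(seq))
```

A pipeline that wants only the primary chromosomes could keep the names for which this function returns "primary" (plus "mitochondrial" if needed) and skip the rest.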

What's new in GRCh38?

- Alternate sequences - Several human chromosomal regions exhibit
sufficient variability to prevent adequate representation by a single
sequence. To address this, the GRCh38 assembly provides alternate
sequence for selected variant regions through the inclusion of alternate
loci scaffolds (or alt loci). Alt loci are separate accessioned
sequences that are aligned to reference chromosomes. This assembly
contains 261 alt loci, many of which are associated with the LRC/KIR
area of chr19 and the MHC region on chr6.

- Centromere representation - Debuting in this release, the large
megabase-sized gaps that were previously used to represent centromeric
regions in human assemblies have been replaced by sequences from
centromere models created by Karen Miga et al. The models were built
using centromere databases developed during her work in the Willard lab
at Duke University and analysis software developed while she was working
in the Kent lab at UCSC. They provide the approximate repeat number and
order for each centromere and will be useful for read mapping and
variation studies.

- Mitochondrial genome - The mitochondrial reference sequence included
in the GRCh38 assembly and hg38 Genome Browser (termed "chrM" in the
browser) is the Revised Cambridge Reference Sequence (rCRS) from MITOMAP
with GenBank accession number J01415.2 and RefSeq accession number
NC_012920.1. This differs from the chrM sequence (RefSeq accession
number NC_001807) used by the previous hg19 Genome Browser, which was
not updated when the GRCh37 assembly later transitioned to the rCRS.

- Sequence updates - Several erroneous bases and misassembled regions in
GRCh37 have been corrected in the GRCh38 assembly, and more than 100
gaps have been filled or reduced. Much of the data used to improve the
reference sequence was obtained from other genome sequencing and
analysis projects, such as the 1000 Genomes Project.

- Analysis set - The GRCh38 assembly offers an "analysis set" that was
created to accommodate next-generation sequencing read alignment
pipelines. Several GRCh38 regions have been eliminated from this set to
improve read mapping. The analysis set may be downloaded from the Genome
Browser downloads page.

There's much more to come! This initial release of the hg38 Genome
Browser provides a rudimentary set of annotations. Many of our
annotations rely on data sets from external contributors (such as our
popular SNPs tracks) or require massive computational effort (our
comparative genomics tracks). In the coming months and years, we will
release many more annotation tracks as they become available. To stay
abreast of new datasets, join our genome-announce mailing list or follow
us on Twitter.

We'd like to thank our GRC and NCBI collaborators who worked closely
with us in producing the hg38 browser. Their quick responses and helpful
feedback were a key factor in expediting this release. The production of
the hg38 Genome Browser was a team effort, but in particular we'd like
to acknowledge the engineering efforts of Hiram Clawson and Brian Raney,
the QA work done by Steve Heitner, project guidance provided by Ann
Zweig, Robert Kuhn, and Jim Kent, and documentation work by Donna
Karolchik.

-Donna
---------------
Donna Karolchik
UCSC Genome Browser Senior Project Manager
http://genome.ucsc.edu