Human Genome has gaps??

Sucheta Tripathy

Dec 27, 2013, 9:42:23 PM
to gen...@soe.ucsc.edu
Hi,

We have downloaded the hg19 human genome from various sources and found that it contains a number of N's. I am wondering whether there are still gaps in the human genome. The strange thing is that some CAGE mapping data maps to chromosome 1 at position 18, but at position 18 we only see N's. Does anyone know where to download the completed version of the human genome?

Many Thanks

Sucheta

--
Sucheta Tripathy, Ph.D
Scientist, Ramalingaswamy Fellow,
Indian Institute of Chemical Biology,
Kolkata, India.

Brian Lee

Dec 30, 2013, 12:40:43 PM
to Sucheta Tripathy, gen...@soe.ucsc.edu
Dear Sucheta,

Thank you for using the UCSC Genome Browser and for your question about the human genome assembly.

You are correct that there are gaps in the human genome, denoted as NNNNs, which you can view in the browser. When you are viewing the hg19 assembly in the browser, http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&position=chr1%3A1-100, you can scroll down and expand the "Mapping and Sequencing Tracks" section to find the Gap track. Its default visibility is hide; if you set it to full, http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=gap, you will see the gaps annotated.
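As a rough illustration of what the Gap track annotates, the sketch below scans a sequence string for runs of N and reports them as 1-based inclusive intervals. The function name `find_n_runs` and the toy sequence are assumptions made for this example and are not part of any UCSC tool; a real chromosome would first have to be read from a FASTA file.

```python
import re

def find_n_runs(seq):
    """Return 1-based inclusive (start, end) intervals for each run of N/n in seq."""
    runs = []
    for match in re.finditer(r"[Nn]+", seq):
        # match.start() is 0-based and match.end() is exclusive,
        # so start + 1 .. end is the 1-based inclusive interval.
        runs.append((match.start() + 1, match.end()))
    return runs

# Toy sequence (hypothetical), not real genome data.
toy = "NNNACGTNNAC"
print(find_n_runs(toy))
```

Intervals produced this way can be compared against the Gap track for the same region as a sanity check on a downloaded sequence.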

You describe CAGE mapping information that maps to a region of NNNNs. Likely what you are seeing is data mapped to a previous assembly, hg18, at coordinates that now fall in a telomere gap at the beginning of the chromosome in the newer hg19 assembly. For example, if you navigate to hg18 for the above chr1:1-100 coordinates, http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&position=chr1%3A1-100, you will see sequence information.

You can find the corresponding hg18 to hg19 coordinates by clicking the "View" button in the top blue bar of the browser and then selecting the "In Other Genomes (Convert)" option. There you can select hg19 and see that the hg18 coordinates of chr1:1-100 correspond to chr1:10,001-10,100 in hg19. With the Gap track on in hg19, you can see that the new hg19 assembly annotates a telomere at the start of the chromosome, chr1:1-10000, which is all NNNNs. You can see the same pattern on other chromosomes when comparing hg18 and hg19 coordinates.
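The arithmetic behind the conversion described above can be sketched for this one example. The code below is only an illustration under simplifying assumptions: it handles a single gapless aligned block, and the name `convert_interval` and the block tuple layout are hypothetical. Real conversions between assemblies rely on the browser's Convert utility or the liftOver tool with chain files, which cover many blocks per chromosome.

```python
# One aligned block taken from the example above:
# hg18 chr1:1-100 corresponds to hg19 chr1:10,001-10,100.
# The tuple layout (src_start, src_end, dst_start) is a hypothetical
# simplification, not the chain file format.
EXAMPLE_BLOCK = (1, 100, 10001)

def convert_interval(start, end, block):
    """Shift a 1-based inclusive interval through one gapless aligned block.

    Returns None when the interval is not fully contained in the block.
    """
    src_start, src_end, dst_start = block
    if start > end or start < src_start or end > src_end:
        return None
    offset = dst_start - src_start
    return (start + offset, end + offset)

print(convert_interval(1, 100, EXAMPLE_BLOCK))
print(convert_interval(18, 18, EXAMPLE_BLOCK))
```

Under this block, a feature at hg18 chr1 position 18 lands at position 10,018 in hg19, whereas position 18 in hg19 itself lies inside the chr1:1-10000 gap.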

I highly recommend taking advantage of some introductory resources available for the browser. Please watch these tutorials: http://www.openhelix.com/ucsc. You can also find further training here: http://genome.ucsc.edu/training.html. Lastly, before mailing the list with new questions, please search our archives for similar, previously answered questions: https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!forum/genome

You may be interested to learn there is a new human genome assembly becoming available. Please note that most of the data available will likely be mapped to hg19, and in some cases hg18. Also, in case you are wondering, the new assembly will have gaps too. You can read more here: http://www.ncbi.nlm.nih.gov/news/12-23-2013-grch38-released/

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee

