Question: Numbers of CpG islands


Sun, Shuying

Jul 17, 2017, 10:37:17 AM
to gen...@soe.ucsc.edu, ssun...@gmail.com

Dear UCSC genome colleagues, 


I recently downloaded the hg17, hg18, and hg19 versions of the CpG islands table from the UCSC Genome Browser (links below). The numbers of CpG islands in these three versions differ dramatically, as shown below: hg18 has only about 28K, while hg17 and hg19 each have more than 50K, almost twice as many as hg18. I suspect that something is wrong here. Could you please help me check what might be wrong? Thank you.


wc -l *Unmask*

   51200 hg17.cpgIslandExtUnmasked.txt

   28226 hg18.cpgIslandExtUnmasked.txt

   52502 hg19.cpgIslandExtUnmasked.txt

   

# I downloaded the above three datasets from these links:

http://hgdownload.soe.ucsc.edu/goldenPath/hg17/database/cpgIslandExtUnmasked.txt.gz


http://hgdownload.soe.ucsc.edu/goldenPath/hg18/database/cpgIslandExtUnmasked.txt.gz


http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/cpgIslandExtUnmasked.txt.gz


Shuying 



Matthew Speir

Jul 25, 2017, 2:54:49 PM
to Sun, Shuying, gen...@soe.ucsc.edu, ssun...@gmail.com
Hi Shuying,

Thank you for your question about CpG Islands in the UCSC Genome Browser.

It appears you have uncovered a bug in the process we used to generate the data for the human assembly hg18. We have corrected this issue and the updated table is now available on our public site. You should see that this new table also has roughly the same number of CpG islands as the other assemblies you mentioned. Additionally, we've updated the download file you referenced, so if you re-download that file for hg18, it should match the table on our public site.

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group
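
The row counts discussed in this thread can be rechecked after re-downloading with a short script. What follows is only a minimal sketch, not a UCSC tool: the local file names (`hg17.cpgIslandExtUnmasked.txt.gz` and so on) are assumed renamings of the downloads, and the helper names `count_rows` and `flag_outliers` as well as the 25% tolerance are hypothetical choices for illustration. It counts non-empty rows in each gzipped table dump and flags any assembly whose count is far from the median.

```python
import gzip
from pathlib import Path


def count_rows(path):
    """Count non-empty lines in a gzip-compressed table dump (one CpG island per row)."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.strip())


def flag_outliers(counts, tolerance=0.25):
    """Return the names whose count differs from the median by more than
    `tolerance`, expressed as a fraction of the median (hypothetical threshold)."""
    ordered = sorted(counts.values())
    median = ordered[len(ordered) // 2]
    return [name for name, n in counts.items() if abs(n - median) > tolerance * median]


if __name__ == "__main__":
    # Assumed local file names; adjust to wherever the downloads were saved.
    files = {
        assembly: Path(f"{assembly}.cpgIslandExtUnmasked.txt.gz")
        for assembly in ("hg17", "hg18", "hg19")
    }
    counts = {a: count_rows(p) for a, p in files.items() if p.exists()}
    for assembly, n in counts.items():
        print(assembly, n)
    if counts:
        print("outliers:", flag_outliers(counts))
```

With the counts originally reported in the question (51200, 28226, 52502), `flag_outliers` would single out hg18; after the corrected hg18 file is downloaded, the list should come back empty if the counts are roughly in line, as the reply suggests.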