CPG


Richards, Keri

May 20, 2021, 10:56:42 AM
to genome...@soe.ucsc.edu

Good Morning,

 

How do I download the CpG annotation for an entire genome?

 

All the best,

 

Keri Richards

 

Kerianne Richards

Bioinformatics Scientist, Next Generation Sequencing Team

 

GENEWIZ | Solid science. Superior service.

(remote)

115 Corporate Blvd

South Plainfield, NJ 07080

+1-908-222-0711 ext. 1

www.genewiz.com


 


Gerardo Perez

May 21, 2021, 1:53:04 PM
to Richards, Keri, genome...@soe.ucsc.edu

Hello Keri,

Thank you for your question about downloading the CpG annotation for an entire genome.

We have two native tracks that display CpG island data. One displays CpG islands found on the repeat-masked genome, and the other, named "Unmasked CpG", shows CpG islands found across the whole genome. You can read more about these tracks on the description page here: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=cpgIslandSuper

You can download these data from our download page: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/

I'll also include direct links to the masked and unmasked files:
Masked: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/cpgIslandExt.txt.gz
Unmasked: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/cpgIslandExtUnmasked.txt.gz

If you would like to see a description of what each of the columns represents, you can see the "table schema" page for that track: http://genome.ucsc.edu/cgi-bin/hgTables?db=hg38&hgta_group=regulation&hgta_track=cpgIslandExt&hgta_table=cpgIslandExt&hgta_doSchema=describe+table+schema

field           description
bin             Indexing field to speed chromosome range queries.
chrom           Reference sequence chromosome or scaffold
chromStart      Start position in chromosome
chromEnd        End position in chromosome
name            CpG Island
length          Island Length
cpgNum          Number of CpGs in island
gcNum           Number of C and G in island
perCpg          Percentage of island that is CpG
perGc           Percentage of island that is C or G
obsExp          Ratio of observed(cpgNum) to expected(numC*numG/length) CpG in island
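
As a rough illustration of how the columns above map onto a downloaded file, here is a minimal Python sketch that reads cpgIslandExt.txt.gz into typed records. It assumes the SQL table dump is tab-separated with no header row and with the 11 fields in the schema order shown above. The names CpgIsland, parse_row, and read_cpg_islands are hypothetical helpers, not part of any UCSC tool, and SAMPLE_ROW is a synthetic row made up for the example, not real data.

```python
import gzip
from typing import Iterator, NamedTuple


class CpgIsland(NamedTuple):
    """One row of the cpgIslandExt table, fields in schema order."""
    bin: int
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    length: int
    cpg_num: int
    gc_num: int
    per_cpg: float
    per_gc: float
    obs_exp: float


def parse_row(line: str) -> CpgIsland:
    """Parse one tab-separated line (assumed layout: 11 schema fields)."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 11:
        raise ValueError(f"expected 11 fields, got {len(f)}")
    return CpgIsland(
        int(f[0]), f[1], int(f[2]), int(f[3]), f[4],
        int(f[5]), int(f[6]), int(f[7]),
        float(f[8]), float(f[9]), float(f[10]),
    )


def read_cpg_islands(path: str) -> Iterator[CpgIsland]:
    """Yield one record per non-empty line of a gzipped table dump."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_row(line)


# Synthetic row for illustration only (not taken from the real table).
SAMPLE_ROW = "585\tchr1\t1000\t1200\tCpG: 20\t200\t20\t130\t20.0\t65.0\t0.95\n"

if __name__ == "__main__":
    island = parse_row(SAMPLE_ROW)
    print(island.chrom, island.chrom_start, island.chrom_end, island.obs_exp)
```

With a real download, something like `for island in read_cpg_islands("cpgIslandExt.txt.gz"): ...` should iterate over every island in the genome, provided the file matches the assumed layout.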

All of these examples are on the hg38 human assembly, but we have these data available for our other assemblies as well. To find them, select the organism of interest on our downloads page (http://hgdownload.soe.ucsc.edu/downloads.html), click "Annotations", then click the "SQL table dump annotations" link.
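
For scripted downloads, the direct links above suggest a URL pattern that can be built from the assembly name. The Python sketch below assumes that other assemblies follow the same goldenPath/<db>/database/ layout and use the same table names as hg38, which may not hold for every assembly, so check the directory listing first. The functions cpg_table_url and download are hypothetical helpers written for this example.

```python
import shutil
import urllib.request

# Base of the download server, taken from the hg38 links in this thread.
BASE = "http://hgdownload.soe.ucsc.edu/goldenPath"


def cpg_table_url(db: str, unmasked: bool = False) -> str:
    """Build the table dump URL for an assembly such as "hg38".

    Assumption: the assembly has a cpgIslandExt (or cpgIslandExtUnmasked)
    table under the same path layout as hg38.
    """
    table = "cpgIslandExtUnmasked" if unmasked else "cpgIslandExt"
    return f"{BASE}/{db}/database/{table}.txt.gz"


def download(url: str, dest: str) -> None:
    """Stream the file at url to the local path dest."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


if __name__ == "__main__":
    url = cpg_table_url("hg38")
    print(url)
    # Uncomment to fetch the file (requires network access):
    # download(url, "cpgIslandExt.txt.gz")
```

For hg38 the function reproduces the two direct links given above. For any other assembly, treat the result as a guess until the file is confirmed to exist on the server.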

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute


--

---
You received this message because you are subscribed to the Google Groups "UCSC Genome Browser Mirror-Specific Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to genome-mirro...@soe.ucsc.edu.
To view this discussion on the web visit https://groups.google.com/a/soe.ucsc.edu/d/msgid/genome-mirror/BLAPR16MB3922777623583C724FFEC47A962A9%40BLAPR16MB3922.namprd16.prod.outlook.com.

Richards, Keri

May 21, 2021, 4:12:54 PM
to Gerardo Perez, genome...@soe.ucsc.edu

Thank you very much!

 

 


Michael Hiller

May 24, 2021, 12:40:05 PM
to genome...@soe.ucsc.edu
Dear UCSC, 

Our browser mirror runs on CentOS Linux, and ever since FreeType became enabled by default, it has stopped working with the error "FreeType not enabled".
If we set freeType=off in hg.conf, it works again.

I was wondering which font package needs to be installed to get the pfb files with the exact md5sum specified at

Thanks a lot
- Michael 


-- 
Michael Hiller, PhD
Professor of Comparative Genomics
LOEWE Centre for Translational Biodiversity Genomics,
Senckenberg Society for Nature Research & Goethe University, 
Frankfurt am Main, Germany




Brian Lee

May 25, 2021, 6:47:33 PM
to Michael Hiller, genome...@soe.ucsc.edu
Dear Michael,

Thank you for sharing your experiences with FreeType errors when compiling.

We have updated our src/inc/common.mk file with improved checks for whether freetype-config exists, and you can get the latest version here:
https://genome-source.gi.ucsc.edu/gitlist/kent.git/blob/master/src/inc/common.mk

Or see the specific changes here:
https://genome-source.gi.ucsc.edu/gitlist/kent.git/commit/8fff967390b185e152a6faf2a7335e5db23a9ff7

Thank you again for sharing these experiences and helping us improve the UCSC Genome Browser for mirror administrators!

All the best,
