UCSC-help CpG islands


Muhammad Sohail Raza

Nov 6, 2015, 12:16:08 PM
to UCSC help
Hi, 

I would like to download the CpG islands of the human genome in order to compare them against my variant data.

We came across two CpG island files available for download, i.e.
cpgIslandExt.txt.gz
cpgIslandExtUnmasked.txt.gz

Can you please tell me what the difference between the two files is, and which one would be more useful?
Many thanks!
--
************************************************************************************
Muhammad Sohail Raza
CAS-TWAS PhD Fellow
Center of Genome Variation and Biomedicine
Beijing Institute of Genomics, CAS
Beijing, China.
              muhammads...@live.com

Cath Tyner

Nov 7, 2015, 12:51:25 AM
to Muhammad Sohail Raza, UCSC help

Dear Sohail,

Thank you for using the UCSC Genome Browser and for submitting your question regarding the differences between the following CpG island files:

cpgIslandExt is the repeat-masked version of these data; repetitive elements are excluded. In the UCSC Genome Browser, only the masked version is displayed (by default).

cpgIslandExtUnmasked is the unmasked version of these data; potential CpG islands existing in repeat regions are displayed, and would otherwise not be visible in the repeat-masked version.

You can read more at the CpG Islands Track Description page: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=cpgIslandSuper

For further information, please see the related publication which is listed in the Track Description References section:  http://www.sciencedirect.com/science/article/pii/0022283687906899
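
To see concretely what the unmasked table adds, a minimal Python sketch along the following lines could compare the two downloads. It is an illustration rather than an official UCSC script. It assumes both files have been saved locally and that the rows follow the column order listed in COLUMNS (bin, chrom, chromStart, chromEnd, and so on), which should be confirmed on the table schema page. The function names are made up for this example.

```python
import gzip

# Assumed column layout of cpgIslandExt / cpgIslandExtUnmasked (hg19).
# Confirm against the table schema page before relying on it.
COLUMNS = ["bin", "chrom", "chromStart", "chromEnd", "name", "length",
           "cpgNum", "gcNum", "perCpg", "perGc", "obsExp"]


def parse_islands(lines):
    """Return a set of (chrom, start, end) tuples from tab-separated rows."""
    islands = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        row = dict(zip(COLUMNS, line.split("\t")))
        islands.add((row["chrom"], int(row["chromStart"]), int(row["chromEnd"])))
    return islands


def load_islands(path):
    """Read a downloaded .txt.gz table dump from disk."""
    with gzip.open(path, "rt") as handle:
        return parse_islands(handle)


def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def only_in_unmasked(masked, unmasked):
    """Islands in the unmasked set that overlap no island in the masked set."""
    return sorted(u for u in unmasked
                  if not any(overlaps(u, m) for m in masked))
```

With both files downloaded, `only_in_unmasked(load_islands("cpgIslandExt.txt.gz"), load_islands("cpgIslandExtUnmasked.txt.gz"))` would list the candidate islands that appear only when repeats are not masked. The pairwise comparison is quadratic, so for whole-genome tables it is better run one chromosome at a time or replaced with a dedicated interval tool.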

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Enjoy,
Cath
. . .
Cath Tyner
UC Santa Cruz Genomics Institute



Muhammad Sohail Raza

Nov 10, 2015, 11:51:23 AM
to UCSC help
Hi UCSC team,

I was trying to download the VISTA enhancers dataset from the UCSC Genome Browser, entitled vistaEnhancers.txt.gz (http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/). However, this data looks old, judging by its upload date (26-Dec-2010).

According to the current release of the VISTA Enhancer Browser (11/9/2015), the database contains information on 2294 in vivo tested elements, 1215 of which have enhancer activity.

Could you please tell me where I can download the updated VISTA enhancers for the hg19/GRCh37 release?

Many thanks
--


************************************************************************************
Muhammad Sohail Raza
CAS-TWAS PhD Fellow
Center of Genome Variation and Biomedicine
Beijing Institute of Genomics, CAS
Beijing, China.
              muhammads...@live.com




Steve Heitner

Nov 13, 2015, 3:56:42 PM
to Muhammad Sohail Raza, UCSC help

Hello, Muhammad.

Yes, the data for the hg19 VISTA Enhancers track is from 2010.  We do have plans to update this track, but unfortunately, we have no time estimate of when this might happen.

There is also a VISTA Enhancers public hub which contains much more recent data.  This can be viewed by navigating to http://genome.ucsc.edu/cgi-bin/hgHubConnect and clicking the “Connect” button to the left of the “Vista Enhancers” hub close to the bottom of the list.  Then click the “submit” button when you are brought to the gateway page.

We do not have any of the new VISTA Enhancers data for download.  The bigBed data file for the hg19 public hub is available at
http://portal.nersc.gov/dna/RD/ChIP-Seq/VISTA_enhancer_e/hg19_ext_latest.bb.  If you desire any data beyond this, you should contact them directly at enhancer...@gmail.com.
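
As a rough sketch of how the hub's bigBed file could be turned into plain intervals, the Python below builds a call to UCSC's bigBedToBed command-line utility and parses the resulting BED text. It assumes that bigBedToBed is installed and on the PATH, that the URL above still resolves, and that the fourth BED column holds an element name. The function names are hypothetical and the field meanings should be checked against the hub's own documentation.

```python
import subprocess

# URL quoted in the message above; it may have moved since 2015.
HUB_BIGBED = ("http://portal.nersc.gov/dna/RD/ChIP-Seq/"
              "VISTA_enhancer_e/hg19_ext_latest.bb")


def bigbed_to_bed_command(source, out_path):
    """Build the argument list for UCSC's bigBedToBed utility."""
    return ["bigBedToBed", source, out_path]


def convert(source, out_path):
    """Run the conversion; needs bigBedToBed on PATH and network access."""
    subprocess.run(bigbed_to_bed_command(source, out_path), check=True)


def read_bed(lines):
    """Parse chrom, start, end and (if present) name from BED lines."""
    records = []
    for line in lines:
        if line.startswith(("#", "track", "browser")):
            continue  # skip comment and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a valid BED record
        name = fields[3] if len(fields) > 3 else ""
        records.append((fields[0], int(fields[1]), int(fields[2]), name))
    return records
```

After `convert(HUB_BIGBED, "vista_hg19.bed")`, opening the output file and passing it to `read_bed` would give a list whose length can be compared with the element counts reported by the VISTA Enhancer Browser.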

Please contact us again at gen...@soe.ucsc.edu if you have any further questions. 
All messages sent to that address are archived on a publicly-accessible Google Groups forum.  If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

---
Steve Heitner
UCSC Genome Bioinformatics Group

--

Muhammad Sohail Raza

Nov 16, 2015, 12:34:23 PM
to UCSC help
Hi, 

I was looking at the "Table schema" of UCSC datasets and I came across two different terms:

Primary table
Connected Tables and Joining Fields

What is the difference between these two, and if we want to download the dataset, is one of them complete or do they complement each other?



Many thanks..
--
************************************************************************************
Muhammad Sohail Raza
CAS-TWAS PhD Fellow
Center of Genome Variation and Biomedicine
Beijing Institute of Genomics, CAS
Beijing, China.
              muhammads...@live.com



Steve Heitner

Nov 16, 2015, 1:33:28 PM
to Muhammad Sohail Raza, UCSC help

Hello, Sohail.

For many of our data tracks, we have a primary table which contains the bulk of the data, but we also have linked tables which provide additional data and are linked by a specific field in the primary table.  In the example you provided, note that the tRNAs table is the primary table and it is linked to the kgXref table by the following fields: tRNAs.name <-> kgXref.tRnaName.  Each table schema page contains a Sample Rows section so you can see examples of what the data in those tables looks like.

For the purposes of downloading, you could consider the primary table to be the complete dataset, but the linked tables provide additional data if it is desired.  Note that if you use the Table Browser to query the data, it gives you the option to include data from linked tables in your output with the “selected fields from primary and related table” output format option.
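
The relationship between a primary table and a linked table can be pictured as a join on the shared field. The sketch below uses Python's built-in sqlite3 module with two tiny in-memory tables named after the ones above. The rows are invented for illustration, and only a few columns are included (the real tables have more), so it shows the idea rather than the actual UCSC data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up rows in both tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tRNAs (name TEXT, chrom TEXT,
                            chromStart INTEGER, chromEnd INTEGER);
        CREATE TABLE kgXref (kgID TEXT, tRnaName TEXT, description TEXT);
        -- Hypothetical rows, not real annotations.
        INSERT INTO tRNAs VALUES ('demo-tRNA-1', 'chr1', 100, 172);
        INSERT INTO tRNAs VALUES ('demo-tRNA-2', 'chr2', 500, 573);
        INSERT INTO kgXref VALUES ('demoKg1', 'demo-tRNA-1',
                                   'made-up description');
    """)
    return conn


def primary_with_linked(conn):
    """Every primary row, plus linked data where tRNAs.name matches."""
    query = """
        SELECT t.name, t.chrom, t.chromStart, t.chromEnd, x.description
        FROM tRNAs AS t
        LEFT JOIN kgXref AS x ON t.name = x.tRnaName
        ORDER BY t.name
    """
    return conn.execute(query).fetchall()
```

Calling `primary_with_linked(build_demo_db())` returns both tRNAs rows. The second row has no match in kgXref, so its description comes back as None, which mirrors the point that the primary table is complete on its own and the linked table only adds extra fields where a match exists.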






---
Steve Heitner
UCSC Genome Bioinformatics Group

 

