Question about histone modification dataset


Ying Sun

Apr 8, 2024, 11:23:17 AM
to gen...@soe.ucsc.edu

Hi UCSC organization,

 

I have a naive question about the histone modification datasets at UCSC.

I am interested in genomic elements that are not cell-type-specific.

UCSC is very user-friendly in that it provides TFBS clusters from multiple cell lines and tissues, which I can use directly.

I’m also interested in histone modifications, but the files from UCSC are cell-specific. I’m wondering why they are not clustered like the TFBS. I can download the datasets from ENCODE and merge them myself, but is there a special reason for not doing this?

 

Thanks for your time. Looking forward to your reply.

 

Best wishes,

Ying

 

Luis Nassar

Apr 12, 2024, 7:32:52 PM
to Ying Sun, gen...@soe.ucsc.edu

Hi, Ying.

Can you share which specific ENCODE merged dataset you are referring to?

We try to host a variety of data useful to most users, but ultimately there is more data than we have capacity to host. We do offer various options for visualizing these data, through custom tracks (https://genome.ucsc.edu/cgi-bin/hgCustom) and hubs (https://genome.ucsc.edu/cgi-bin/hgHubConnect).

I hope this is helpful. Please include genom...@soe.ucsc.edu in any replies to ensure visibility by the team. All messages sent to that address are archived on our public forum. If your question includes sensitive information, you may send it instead to genom...@soe.ucsc.edu.

Lou Nassar
UCSC Genomics Institute



Maximilian Haeussler

Apr 13, 2024, 12:21:44 PM
to Luis Nassar, Ying Sun, gen...@soe.ucsc.edu
Hi Ying,

It sounds as if you are looking for a merged version of all histone modifications across all cell types, but I may have misunderstood. Can you let us know why? This is not a common question, and I admit that I am not sure why this is not done. Histone modification regions are relatively long, and merging them across cell types may not be very meaningful, but that's just a hypothesis. We can't find clustered histone data on the ENCODE portal either, but if you found it, please point us to it.

However, among our public tracks under "My data > Track hubs" there are various summaries of ENCODE data produced mostly by the ENCODE Analysis Center. You can enter a term such as "ENCODE" into the search box and press Enter to see the relevant ones. For example, the "ENCODE integrative Trackhub" has a track "all cCREs (cell type agnostic)", which clusters histone modifications across all cell types, and then also clusters TF binding sites in them. This may be close to what you're looking for?

If you can't find what you need, you can also reach out to the ENCODE Analysis Center directly and CC us. We are about to discuss with them how to integrate more ENCODE data and make it easier to find on our website, and your question reminded us that their "integrated" summary tracks should be highlighted in our new ENCODE tracks.

best
Max

Maximilian Haeussler

Apr 13, 2024, 1:02:24 PM
to Ying Sun, Luis Nassar, gen...@soe.ucsc.edu
Hi Ying,

Yes, I see. In this case, I hope the integrated ENCODE tracks will solve the problem.

TADs! That's a great topic: we don't have a track for that, right? But we should have one. Which annotation are you using?

best
Max
 

On Sat, Apr 13, 2024 at 9:39 AM Ying Sun <sun....@sund.ku.dk> wrote:

Thanks! I think your reply has already answered my question.

 

Right now I’m working on topologically associating domains (TADs) across multiple cell lines and tissues.

I’m collecting epigenetic datasets across different cell lines and tissues for functional analysis.

It’s super nice that I can download the TFBS from UCSC.

Histone modifications are also important for TADs and might be enriched at TAD boundaries, so I also want to collect histone modifications across cell lines.

I didn’t find a merged dataset anywhere, so I am writing to you for a professional opinion. :)

 

 

Best wishes,

Ying

 

 


 

Ying Sun

Apr 15, 2024, 12:47:59 PM
to Luis Nassar, gen...@soe.ucsc.edu

Hi Lou,

 

Thanks for your reply!

I’m focusing on this file: encRegTfbsClusteredWithCells.hg38.bed.gz

 

The UCSC webpage says “ChIP-seq datasets were clustered using the UCSC hgBedsToBedExps tool.”

Histone modifications are also measured by ChIP-seq, and I also saw histone modification datasets from different cell lines in ENCODE.

I am just curious why you didn’t do something similar for histone modifications using the same pipeline. I was thinking that maybe histone modification peaks are longer than TFBS peaks, or that histone modifications are more complex than TFs, so that clustering them across cell lines is not reasonable.
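To make the peak-length concern concrete, here is a minimal sketch of a plain union merge of peaks across cell lines. It is not the hgBedsToBedExps algorithm; the function name `merge_peaks`, the cell-line labels, and the toy coordinates are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def merge_peaks(peaks_by_cell):
    """Union-merge overlapping or touching peaks from several cell lines.

    peaks_by_cell maps a (hypothetical) cell-line name to a list of
    (chrom, start, end) tuples in BED-style half-open coordinates.
    Returns a list of (chrom, start, end, sorted_cell_names).
    """
    by_chrom = defaultdict(list)
    for cell, peaks in peaks_by_cell.items():
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, cell))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        cur_cells = set()
        for start, end, cell in sorted(by_chrom[chrom]):
            if cur_start is not None and start <= cur_end:
                # Peak overlaps (or touches) the current cluster: extend it.
                cur_end = max(cur_end, end)
                cur_cells.add(cell)
            else:
                # Gap found: close the current cluster and start a new one.
                if cur_start is not None:
                    merged.append((chrom, cur_start, cur_end, sorted(cur_cells)))
                cur_start, cur_end, cur_cells = start, end, {cell}
        if cur_start is not None:
            merged.append((chrom, cur_start, cur_end, sorted(cur_cells)))
    return merged


# Toy data (invented): narrow TF-like peaks versus broad histone-like peaks.
tf_like = {
    "cellA": [("chr1", 100, 150)],
    "cellB": [("chr1", 120, 170)],
    "cellC": [("chr1", 900, 950)],
}
histone_like = {
    "cellA": [("chr1", 100, 600)],
    "cellB": [("chr1", 500, 1000)],
    "cellC": [("chr1", 900, 1500)],
}

tf_clusters = merge_peaks(tf_like)
histone_clusters = merge_peaks(histone_like)
```

In this toy case the narrow peaks stay as two separate clusters, while the three broad peaks chain into a single interval spanning 100 to 1500, which no longer says much about any individual cell line.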

 

Your insights on this matter would be invaluable to me. Looking forward to your reply.

 

 

 

Best wishes,

Ying

 

 


 


Ying Sun

Apr 15, 2024, 12:49:36 PM
to Maximilian Haeussler, Luis Nassar, gen...@soe.ucsc.edu

Hi Max,

 

That’s true!

I am currently using annotations from DomainCaller. I collected the dataset from the 3D Genome Browser: http://3dgenome.fsm.northwestern.edu/view.php.

The annotation from the 4DN Data Portal (https://data.4dnucleome.org/) is also good. Their TADs are called by insulation score.
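For context, the general idea behind an insulation score can be sketched as follows. This is a simplified illustration, not the 4DN pipeline: the function name `insulation_scores`, the window size, and the toy contact matrix are assumptions, and real pipelines add normalization and boundary-strength filtering on top of this.

```python
def insulation_scores(matrix, window):
    """Mean contact frequency in the window x window square spanning each bin.

    For bin i, average matrix[a][b] over upstream bins a in [i - window, i)
    and downstream bins b in (i, i + window]. Bins too close to either end
    of the matrix get None.
    """
    n = len(matrix)
    scores = [None] * n
    for i in range(window, n - window):
        total = 0.0
        for a in range(i - window, i):
            for b in range(i + 1, i + window + 1):
                total += matrix[a][b]
        scores[i] = total / (window * window)
    return scores


# Toy 8-bin contact matrix (invented): bins 0-3 form one domain and
# bins 4-7 another; contacts are 10 within a domain and 1 between domains.
toy = [[10.0 if (a < 4) == (b < 4) else 1.0 for b in range(8)] for a in range(8)]

scores = insulation_scores(toy, window=2)
```

The score drops at the bins flanking the junction between the two toy domains (bins 3 and 4), and TAD boundaries are typically taken at such local minima.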

 

Best wishes,

Ying

 

 


Maximilian Haeussler

Apr 15, 2024, 10:35:55 PM
to Ying Sun, Luis Nassar, gen...@soe.ucsc.edu
Hi Ying,

These are ENCODE2 data from more than 13 years ago. I don’t know why histones weren’t clustered. I imagine it has to do with the way in which histone marks are expected to be functional: as broad markers of function individually, not necessarily working together, unlike TFs.

As I’ve mentioned, you can contact the ENCODE portal. They are the experts on ENCODE data now.

Best
Max 
