Replicate BAM files for HINT

Robert Bronstein

Sep 13, 2018, 4:36:12 PM
to RGT Users
I'm wondering how people handle biological replicates at the BAM level. Does everyone simply use merged BAMs? Thanks!

Zhijian Li

Sep 13, 2018, 4:41:03 PM
to RGT Users
In our work, we simply merged the replicates to get more reads.

Eduardo Gade Gusmao

Sep 13, 2018, 5:14:20 PM
to 李志坚, RGT Users
Hi Robert,

A good sanity check is to actually evaluate the correlation between your BAM replicates. Download two other replicates from a completely different cell type and tile the genome into bins (the size is up to you; I suggest 1 kbp). Then calculate the Spearman correlation between all datasets, using each dataset's vector of average counts over the bins. Good replicates should correlate highly with each other, and usually much more highly than with the "control" files downloaded from the other cell type.
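A minimal sketch of this check in plain Python, assuming the per-bin read counts have already been extracted from each BAM file by some other tool. The sample names and counts below are made up for illustration, and the helper functions are hypothetical, not part of RGT.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-bin counts (illustrative only), one vector per BAM file.
counts = {
    "rep1": [12, 0, 35, 7, 3, 50],
    "rep2": [10, 1, 40, 9, 2, 44],
    "other_cell_type": [0, 30, 2, 25, 41, 1],
}

pairwise = {
    (a, b): spearman(counts[a], counts[b]) for a, b in combinations(counts, 2)
}
for pair, rho in pairwise.items():
    print(pair, round(rho, 3))
```

With real data, each vector would have one entry per genomic bin, and the replicate pair would be expected to score clearly higher than any pair involving the unrelated cell type.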

Hope this helps.

Thanks!

Eduardo Gade Gusmao, Ph.D.
AG Papantonis, Zentrum für Molekulare Medizin, Universität zu Köln,
Robert-Koch-Straße 21, 50931 Köln, Deutschland
Skype: eduardo.gade.gusmao


--
You received this message because you are subscribed to the Google Groups "RGT Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rgtusers+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rgtusers/d175bf93-f4f2-4b89-9fa3-ae88bca14867%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

shokoh...@gmail.com

Oct 28, 2019, 7:38:28 AM
to RGT Users
Hi, I also had the same question - thank you for clarifying this!

I have three replicates for each of two conditions.
I have already run `rgt-hint footprinting` and `rgt-motifanalysis` on the individual samples, and was wondering whether `rgt-hint differential` accepts repeated condition labels in the `--conditions` argument.

For example:
rgt-hint differential --mpbs-files=A.bed,B.bed,C.bed,D.bed,E.bed,F.bed --reads-files=A.bam,B.bam,C.bam,D.bam,E.bam,F.bam --conditions=treatment,treatment,treatment,control,control,control

Thank you very much for your help.

Best,
Shoko

Zhijian Li

Oct 28, 2019, 9:12:38 AM
to shokoh...@gmail.com, RGT Users
Hi Shoko,

`rgt-hint differential` doesn't merge replicates, so in your case you need to do the following for each condition (shown here for A, B, and C):
1) merge the BED files: cat A.bed B.bed C.bed > merge.bed
2) merge the BAM files with samtools: samtools merge merge.bam A.bam B.bam C.bam
3) run rgt-hint differential on the merged files
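The three steps above can be sketched for both conditions at once. This Python snippet only builds and prints the shell commands; it does not run them. The per-condition output names (treatment.bed, control.bam, and so on) are hypothetical, the `rgt-hint differential` flags are the ones quoted earlier in this thread, and any other options the tool requires are left out.

```python
import shlex


def merge_commands(replicates):
    """Return shell command strings that merge replicates per condition.

    `replicates` maps a condition name to its sample prefixes, where the
    prefix "A" stands for the files A.bed and A.bam.
    """
    commands = []
    for condition, samples in replicates.items():
        beds = [f"{s}.bed" for s in samples]
        bams = [f"{s}.bam" for s in samples]
        # 1) concatenate the per-replicate BED files
        commands.append(
            f"cat {shlex.join(beds)} > {shlex.quote(condition + '.bed')}"
        )
        # 2) merge the per-replicate BAM files with samtools
        commands.append(
            shlex.join(["samtools", "merge", f"{condition}.bam", *bams])
        )
    conditions = list(replicates)
    # 3) differential footprinting with one merged BED/BAM pair per condition
    commands.append(
        shlex.join(
            [
                "rgt-hint",
                "differential",
                "--mpbs-files=" + ",".join(f"{c}.bed" for c in conditions),
                "--reads-files=" + ",".join(f"{c}.bam" for c in conditions),
                "--conditions=" + ",".join(conditions),
            ]
        )
    )
    return commands


plan = merge_commands(
    {"treatment": ["A", "B", "C"], "control": ["D", "E", "F"]}
)
print("\n".join(plan))
```

Printing the plan first makes it easy to review the file pairing before anything is executed.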

I hope this helps.

I can add an option that lets users supply replicate data to `rgt-hint differential`, but it will take a while.
Maybe in our next version.

Best,
LI


______________________________
Zhijian Li
Institute for Computational Genomics
RWTH Aachen University 
Pauwelsstrasse 19
52074 Aachen, Germany





shokoh...@gmail.com

Oct 28, 2019, 9:34:58 AM
to RGT Users
Hi LI,

Thank you so much for your reply!

May I ask you a few more questions?

1) I previously merged the three replicates so that the output file keeps every peak present in at least two replicates. It sometimes contains overlapping peaks from the three samples.
Do you think I can use this as the merged.bed? Do I need to sort or merge overlapping peaks to run any of the commands in this tutorial? (https://www.regulatory-genomics.org/hint/tutorial/)

2) When do you think is the best time to merge the BED and BAM files? Did you merge them even before running `rgt-hint footprinting`?

Thank you so much for your help.

Best,
Shoko

Zhijian Li

Oct 28, 2019, 9:47:27 AM
to shokoh...@gmail.com, RGT Users
Hi Shoko,

I usually merge the replicates if they are of good quality, call peaks on the merged BAM file, and then run footprinting, motif matching, and differential footprinting analysis.

By good quality, I mean that the replicates show a high correlation with each other, while samples from different conditions show a lower correlation. To visualize the correlation, you can use the deepTools commands multiBamSummary and plotCorrelation.
Briefly, you count the number of reads per bin and then compute a correlation for each pair of your samples.
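As a rough illustration of this quality criterion, the sketch below compares the mean pairwise correlation within a condition against the mean between conditions. It assumes the pairwise correlations have already been computed by some tool. The sample names and values are made up, the function is hypothetical, and no numeric cutoff is implied since none is given in this thread.

```python
from itertools import combinations
from statistics import mean


def within_vs_between(correlations, condition_of):
    """Return (mean within-condition, mean between-condition) correlation.

    `correlations` maps frozenset({sample_a, sample_b}) to a correlation
    value; `condition_of` maps each sample name to its condition label.
    """
    within, between = [], []
    for a, b in combinations(condition_of, 2):
        rho = correlations[frozenset((a, b))]
        if condition_of[a] == condition_of[b]:
            within.append(rho)
        else:
            between.append(rho)
    return mean(within), mean(between)


# Illustrative labels and correlation values only (not real data).
condition_of = {
    "T1": "treatment",
    "T2": "treatment",
    "C1": "control",
    "C2": "control",
}
correlations = {
    frozenset(("T1", "T2")): 0.94,
    frozenset(("C1", "C2")): 0.92,
    frozenset(("T1", "C1")): 0.71,
    frozenset(("T1", "C2")): 0.69,
    frozenset(("T2", "C1")): 0.73,
    frozenset(("T2", "C2")): 0.70,
}

within, between = within_vs_between(correlations, condition_of)
print(f"within: {within:.3f}, between: {between:.3f}")
if within > between:
    print("replicates look more similar to each other than to the other condition")
else:
    print("inspect the replicates before merging")
```

Using unordered pairs as keys avoids having to store each correlation twice.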


Let me know if you have other questions. :)

Best,
Li
 


shokoh...@gmail.com

Oct 28, 2019, 10:11:21 AM
to RGT Users
Hi Li,

Thank you so much for your prompt reply :) 

I understand your approach. 
Do you mind if I ask what the recommended protocol is when the replicates are not of good quality?

Thank you!

Best,
Shoko

Zhijian Li

Oct 28, 2019, 11:15:52 AM
to shokoh...@gmail.com, RGT Users
Hi Shoko,

Do you mean that they do not behave like replicates (e.g., they show a low correlation),
or that the data quality itself is poor (e.g., low FRiP)?

In both cases, I would recommend discussing with biologists to see whether the data is biologically relevant.
If differential footprinting does not work out, an alternative is to run footprinting only on differential peaks,
which has worked pretty well in our recent work.

You can use rgt-thor to detect differential peaks.

Best,
Li


 

shokoh...@gmail.com

Oct 28, 2019, 11:29:52 AM
to RGT Users
Hi Li,

Thank you so much for all your helpful suggestions! :)

Best,
Shoko

shokoh...@gmail.com

Oct 31, 2019, 10:20:57 AM
to RGT Users
Hi Li,

Sorry again.

The tutorial says the BAM file needs to be "sorted".
Does it need to be sorted by name?

Thank you very much.

Best,
Shoko

Zhijian Li

Oct 31, 2019, 10:22:35 AM
to shokoh...@gmail.com, RGT Users
Hi Shoko,

Not by name, but by position (coordinate-sorted).
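One way to check this is to look at the `SO` field on the `@HD` line of the BAM header, which records the sort order as `coordinate`, `queryname`, `unsorted`, or `unknown`. The sketch below is a hypothetical helper, not part of RGT. It assumes the header text has already been obtained, for example from `samtools view -H`, and the example header is made up.

```python
def sort_order(header_text):
    """Return the SO value from the @HD line of a SAM/BAM header, or None."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                tag, _, value = field.partition(":")
                if tag == "SO":
                    return value
            return None
    return None


# Illustrative header text, as `samtools view -H` would print it.
example_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n"

order = sort_order(example_header)
print(order)
if order != "coordinate":
    print("the BAM file should be re-sorted by position")
```

`samtools sort` sorts by coordinate by default, while its `-n` option sorts by read name, which is not what is needed here.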

best,
Li
