Questions about bamCompare

Kremsky, Isaac J.

Oct 10, 2017, 10:51:58 AM

to deep...@googlegroups.com

Hi,

I am interested in using bamCompare to compare two samples, with and without treatment, as it seems like a very useful tool for our purposes. We want to use bamCompare because the two samples we are comparing appear to have very different signal-to-noise ratios. Am I correct in my understanding that using the SES method to determine the scaling factor can correct for this discrepancy?

Would it be correct in my case to use the flags

--scaleFactorsMethod SES --ratio subtract

I.e., will bamCompare scale each sample separately in this case? Or is it better to run bamCompare separately on each sample vs. input first, and then compare the log2 ratios of each sample vs. input to find regions where there is a significant difference between sample 1 and sample 2?

Also, is the flag

--scaleFactorsMethod SES

a complete normalization on its own, or do I need to use the flag combination

--normalizeUsingRPKM --scaleFactorsMethod SES

to account for differences in both signal-to-noise ratio and overall sequencing depth?

Thanks a lot for your help. I look forward to hearing from you and seeing what results I get with bamCompare!

Sincerely,

Isaac Kremsky

Postdoc in Victor Corces' Lab


Devon Ryan

Oct 13, 2017, 5:58:27 AM

to Kremsky, Isaac J., deep...@googlegroups.com

Hi Isaac,

It's not so much that SES corrects for the signal-to-noise ratio;
rather, it scales according to the non-enriched regions, so you can get
peaks that stand out more than if you had included the signal in them
when determining how to normalize.
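
To make the point above concrete, here is a toy sketch of why scaling on non-enriched bins makes peaks stand out more after subtraction. It is not the SES algorithm and not deepTools code: it assumes we already know which bins are background (SES has to estimate that), and the bin counts, the function names, and the choice to scale the control toward the ChIP sample are all made up for illustration.

```python
def depth_scale_factor(chip, control):
    """Scale factor for the control using counts from every bin, peaks included."""
    return sum(chip) / sum(control)


def background_scale_factor(chip, control, is_background):
    """Scale factor for the control using only bins flagged as non-enriched."""
    chip_bg = sum(c for c, bg in zip(chip, is_background) if bg)
    control_bg = sum(k for k, bg in zip(control, is_background) if bg)
    return chip_bg / control_bg


def subtract_scaled(chip, control, scale):
    """Per-bin difference after multiplying the control by the scale factor."""
    return [c - scale * k for c, k in zip(chip, control)]


# Hypothetical per-bin read counts; bins 2 and 3 carry a peak in the ChIP sample.
chip = [10, 12, 90, 110, 9, 11]
control = [20, 24, 22, 18, 18, 22]
is_background = [True, True, False, False, True, True]

total_scale = depth_scale_factor(chip, control)
bg_scale = background_scale_factor(chip, control, is_background)

diff_total = subtract_scaled(chip, control, total_scale)
diff_bg = subtract_scaled(chip, control, bg_scale)

print("scale from all bins:       ", round(total_scale, 3))
print("scale from background bins:", bg_scale)
print("difference (all bins):     ", [round(d, 1) for d in diff_total])
print("difference (background):   ", diff_bg)
```

With these made-up numbers, the factor computed from all bins is inflated by the reads inside the peak, so the control is over-scaled: background bins go negative and the peak is partly subtracted away. The background-only factor leaves the background bins near zero and the peak bins higher.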

"--scaleFactorsMethod SES --ratio subtract" will scale the two samples
indicated relative to each other. If you have multiple samples with
the same input then the scaling factors will likely be a bit different
each time. SES normalization should be a bit more robust to IP
efficiency differences, but keep in mind that nothing is perfect so
don't over-interpret the exact value returned by the subtraction.

"--scaleFactorsMethod SES" is a complete normalization. Unless you
specify "--ratio subtract", "--normalizeUsingRPKM" will be ignored
(the documentation for this is somewhat confusing and we're trying to
change how all of this works in the next release).

Devon

--
Devon Ryan, Ph.D.
Email: dpr...@dpryan.com
Data Manager/Bioinformatician
Max Planck Institute of Immunobiology and Epigenetics
Stübeweg 51
79108 Freiburg
Germany


Kremsky, Isaac J.

Oct 13, 2017, 10:03:27 AM

to Devon Ryan, deep...@googlegroups.com

Hi Devon,

Great, thanks for your help. That clears up a lot! I have a few more questions, though.

So if I understood you correctly, in the case where I use "--scaleFactorsMethod SES --ratio subtract --normalizeUsingRPKM", it won't ignore the flag "--normalizeUsingRPKM"? Does that mean it will apply both RPKM and SES normalization? Or are the two methods mutually exclusive? If not, would it make sense to use both if the two samples differ substantially in total reads as well as in IP efficiency?

Thanks!

Isaac


Devon Ryan

Oct 13, 2017, 12:31:23 PM

to Kremsky, Isaac J., deep...@googlegroups.com

It will SES normalize the samples and then RPKM normalize the resulting difference between the samples. SES is nice when you have decent IPs and want to make the enriched regions pop out a bit more. This will help a bit in accounting for IP efficiency differences, but there's nothing it can do to fully account for such differences.

Devon

-- 
Devon Ryan, PhD
Bioinformatician / Data manager
Bioinformatics Core Facility
Max Planck Institute for Immunobiology and Epigenetics
Email: dpry...@gmail.com

Kremsky, Isaac J.

Oct 13, 2017, 3:55:18 PM

to Devon Ryan, deep...@googlegroups.com

Oh, I see. So the RPKM normalization wouldn't affect the direction of change, because it's normalizing after taking the difference between samples, right? That would be more for making differences between two sets of samples comparable to each other, I guess? Does it just normalize based on the sum of the reads from the two samples?

Thanks again for all your help!

Isaac


Devon Ryan

Oct 13, 2017, 6:43:19 PM

to Kremsky, Isaac J., deep...@googlegroups.com

Correct, it just changes the scaling. The final scaling is based on the reads in the file with lower coverage (the same is true for 1x normalization).
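
As a rough illustration of the order of operations discussed in this thread (scale the samples, subtract, then RPKM-scale the difference), here is a small sketch. It is not deepTools code. The RPKM formula is the standard one (reads per kilobase per million mapped reads); using the smaller of the two read totals is my reading of "the file with lower coverage" and should be treated as an assumption, as should all function names and numbers.

```python
def rpkm_factor(total_reads, bin_length_bp):
    """Multiplier that turns a per-bin read count into RPKM."""
    return 1.0 / ((total_reads / 1e6) * (bin_length_bp / 1e3))


def subtract_then_rpkm(chip, control, control_scale,
                       reads_chip, reads_control, bin_length_bp):
    """Scale the control, subtract it, then RPKM-scale the difference.

    Assumption: the RPKM step uses the smaller of the two read totals.
    """
    diff = [c - control_scale * k for c, k in zip(chip, control)]
    factor = rpkm_factor(min(reads_chip, reads_control), bin_length_bp)
    return diff, [d * factor for d in diff]


# Hypothetical inputs: three 50 bp bins and a control scale factor of 0.5.
diff, scaled = subtract_then_rpkm(
    chip=[10, 90, 9],
    control=[20, 22, 24],
    control_scale=0.5,
    reads_chip=2_000_000,
    reads_control=5_000_000,
    bin_length_bp=50,
)

print("difference:        ", diff)
print("RPKM-scaled values:", scaled)
```

Because the RPKM factor is a single positive constant applied to every bin, it stretches or shrinks the differences but cannot flip their sign, which matches the "it just changes the scaling" answer above.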