bigWigAverageOverBed -minMax sometimes shows wrong value for min.

Gert Hulselmans

Jun 18, 2024, 6:50:45 PM
to gen...@soe.ucsc.edu
Hi,

bigWigAverageOverBed -minMax sometimes shows the wrong value for min.

# Processing a bunch of regions with Kent tools bigWigAverageOverBed:
$ ./bigWigAverageOverBed -minMax test.bw consensus_peaks_bicnn.bed consensus_peaks.ucsc.bwaob
processing chromosomes....................

# Looking at a specific region:
$ grep chr1:3265027-3265527 consensus_peaks.ucsc.bwaob
chr1:3265027-3265527	500	390	6.31398	0.012628	0.0161897	0	0.0298533

# Creating a new BED file with only that region:
$ grep chr1:3265027-3265527 consensus_peaks_bicnn.bed > consensus_peaks_bicnn.chr1:3265027-3265527.bed

$ cat consensus_peaks_bicnn.chr1:3265027-3265527.bed
chr1	3265027	3265527	chr1:3265027-3265527

# Processing only this region with Kent tools bigWigAverageOverBed:
$ ./bigWigAverageOverBed -minMax pybigtools/Astro.bw consensus_peaks.chr1\:3265027-3265527.bed /dev/stdout
chr1:3265027-3265527	500	390	6.31398	0.012628	0.0161897	0.0149267	0.0298533

==> Different minimum for the same region: 0 vs 0.0149267.

# Actual regions in the bigWig:
$ bigWigToBedGraph test.bw /dev/stdout consensus_peaks.chr1\:3265027-3265527.bed
chr1	3265027	3265208	0.014926667
chr1	3265208	3265241	0.029853335
chr1	3265241	3265417	0.014926667
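
As a cross-check, the expected statistics can be recomputed by hand from the three intervals above. The following is only a minimal sketch: `region_stats` is a hypothetical helper name (not part of the Kent tools), and it assumes the usual bigWigAverageOverBed column meanings (size, covered, sum, mean0, mean, min, max).

```shell
# region_stats: hypothetical helper, not part of the Kent tools.
# Reads bedGraph lines on stdin and prints: size covered sum mean0 mean min max
# for the region [start, end) given as $1 and $2 (0-based, half-open).
region_stats() {
    awk -v rs="$1" -v re="$2" '
        {
            s = ($2 + 0 > rs + 0) ? $2 + 0 : rs + 0   # clip interval start to the region
            e = ($3 + 0 < re + 0) ? $3 + 0 : re + 0   # clip interval end to the region
            if (e <= s) next                          # interval does not overlap the region
            n = e - s
            covered += n
            sum += n * $4
            if (!seen || $4 + 0 < min) min = $4 + 0   # min over defined bases only
            if (!seen || $4 + 0 > max) max = $4 + 0
            seen = 1
        }
        END {
            size = re - rs
            if (!seen) { print size, 0, 0, 0, 0, 0, 0; exit }
            printf "%d %d %.6g %.6g %.6g %.6g %.6g\n",
                   size, covered, sum, sum / size, sum / covered, min, max
        }'
}

# The three intervals reported for chr1:3265027-3265527:
printf 'chr1\t3265027\t3265208\t0.014926667\nchr1\t3265208\t3265241\t0.029853335\nchr1\t3265241\t3265417\t0.014926667\n' |
    region_stats 3265027 3265527
# prints: 500 390 6.31398 0.012628 0.0161897 0.0149267 0.0298533
```

This agrees with the single-region run, so 0.0149267 (not 0) appears to be the correct minimum over the covered bases.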


This happens more than once.
Grepping 10 lines before and after "chr1:3265027-3265527" in the full bigWigAverageOverBed -minMax output:

$ grep -C 10 chr1:3265027-3265527 consensus_peaks_bicnn.ucsc.bwaob
chr1:3184966-3185466 500 500 33.4059 0.0668118 0.0668118 0.04478 0.104487
chr1:3204663-3205163 500 413 12.0607 0.0241215 0.0292028 0 0.04478
chr1:3206943-3207443 500 246 3.67196 0.00734392 0.0149267 0 0.0149267
chr1:3210217-3210717 500 500 21.4645 0.0429291 0.0429291 0.0298533 0.0597067
chr1:3210823-3211323 500 99 1.47774 0.00295548 0.0149267 0 0.0149267
chr1:3212731-3213231 500 83 1.23891 0.00247783 0.0149267 0 0.0149267
chr1:3216984-3217484 500 484 12.9713 0.0259425 0.0268002 0 0.04478
chr1:3220981-3221481 500 314 8.89629 0.0177926 0.0283321 0 0.04478
chr1:3235261-3235761 500 332 10.0755 0.020151 0.0303479 0 0.04478
chr1:3251359-3251859 500 500 17.8075 0.035615 0.035615 0.0149267 0.0746333
chr1:3265027-3265527 500 390 6.31398 0.012628 0.0161897 0 0.0298533
chr1:3289281-3289781 500 53 0.791113 0.00158223 0.0149267 0 0.0149267
chr1:3292604-3293104 500 500 21.6437 0.0432873 0.0432873 0.0149267 0.0746333
chr1:3293427-3293927 500 500 13.7027 0.0274054 0.0274054 0.0149267 0.04478
chr1:3296410-3296910 500 0 0 0 0 0 0
chr1:3297362-3297862 500 274 4.46307 0.00892615 0.0162886 0 0.0298533
chr1:3299550-3300050 500 280 7.55289 0.0151058 0.0269746 0 0.04478
chr1:3301086-3301586 500 0 0 0 0 0 0
chr1:3304525-3305025 500 0 0 0 0 0 0
chr1:3309959-3310459 500 500 25.9425 0.0518851 0.0518851 0.0298533 0.08956
chr1:3321446-3321946 500 371 5.97067 0.0119413 0.0160934 0 0.0298533


Running bigWigAverageOverBed with only those input regions:
$ ./bigWigAverageOverBed -minMax test.bw consensus_peaks_bicnn.chr1\:3265027-3265527_c10.bed /dev/stdout
chr1:3184966-3185466 500 500 33.4059 0.0668118 0.0668118 0.04478 0.104487
chr1:3204663-3205163 500 413 12.0607 0.0241215 0.0292028 0.0149267 0.04478
chr1:3206943-3207443 500 246 3.67196 0.00734392 0.0149267 0.0149267 0.0149267
chr1:3210217-3210717 500 500 21.4645 0.0429291 0.0429291 0.0298533 0.0597067
chr1:3210823-3211323 500 99 1.47774 0.00295548 0.0149267 0.0149267 0.0149267
chr1:3212731-3213231 500 83 1.23891 0.00247783 0.0149267 0.0149267 0.0149267
chr1:3216984-3217484 500 484 12.9713 0.0259425 0.0268002 0.0149267 0.04478
chr1:3220981-3221481 500 314 8.89629 0.0177926 0.0283321 0.0149267 0.04478
chr1:3235261-3235761 500 332 10.0755 0.020151 0.0303479 0.0149267 0.04478
chr1:3251359-3251859 500 500 17.8075 0.035615 0.035615 0.0149267 0.0746333
chr1:3265027-3265527 500 390 6.31398 0.012628 0.0161897 0.0149267 0.0298533
chr1:3289281-3289781 500 53 0.791113 0.00158223 0.0149267 0.0149267 0.0149267
chr1:3292604-3293104 500 500 21.6437 0.0432873 0.0432873 0.0149267 0.0746333
chr1:3293427-3293927 500 500 13.7027 0.0274054 0.0274054 0.0149267 0.04478
chr1:3296410-3296910 500 0 0 0 0 0 0
chr1:3297362-3297862 500 274 4.46307 0.00892615 0.0162886 0.0149267 0.0298533
chr1:3299550-3300050 500 280 7.55289 0.0151058 0.0269746 0.0149267 0.04478
chr1:3301086-3301586 500 0 0 0 0 0 0
chr1:3304525-3305025 500 0 0 0 0 0 0
chr1:3309959-3310459 500 500 25.9425 0.0518851 0.0518851 0.0298533 0.08956
chr1:3321446-3321946 500 371 5.97067 0.0119413 0.0160934 0.0149267 0.0298533

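All regions that have 0 as the minimum despite being partially covered (column 3 is non-zero but smaller than the region size in column 2)
have the wrong minimum in the full bigWigAverageOverBed output; the run restricted to these regions reports the correct non-zero minimum for them.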
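
To list every affected region rather than comparing by eye, the two outputs can be joined on the region name. This is a rough sketch only: `diff_min` is a hypothetical helper name, and it assumes both files are `-minMax` outputs with the region name in column 1 and min in column 7.

```shell
# diff_min: hypothetical helper that reports regions whose min (column 7)
# differs between two bigWigAverageOverBed -minMax outputs.
# $1 = output of the full run, $2 = output of the run on a subset of regions.
diff_min() {
    awk '
        NR == FNR { min[$1] = $7; next }           # first file: remember min per region name
        ($1 in min) && (min[$1] + 0 != $7 + 0) {   # second file: numeric mismatch on min
            print $1, "covered=" $3 "/" $2, "min_full=" min[$1], "min_subset=" $7
        }' "$1" "$2"
}

# Example call (the second file name is a placeholder for the subset run output):
# diff_min consensus_peaks_bicnn.ucsc.bwaob subset.bwaob
```

Regions with no coverage at all report 0 in both runs, so they are not flagged; only the partially covered regions show up.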

Cheers,
Gert

Matthew Speir

Jun 25, 2024, 12:42:39 PM
to Gert Hulselmans, gen...@soe.ucsc.edu
Hello, Gert.

Thank you for your question about bigWigAverageOverBed.

We think one of the issues is how bigWigAverageOverBed handles regions where there is no data, in other words regions that are "undefined." It seems that bigWigAverageOverBed outputs a "0" in this case. If you restrict your output to only regions with data, does the output change?

Additionally, we don't use bigWigAverageOverBed much internally anymore. For this operation, it may be better to use a package like WiggleTools.

If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

---

Matthew Speir

UCSC Genome Browser, User Support



Gert Hulselmans

Jun 28, 2024, 2:09:36 PM
to Matthew Speir, gen...@soe.ucsc.edu
Hello Matthew,

The issue is that bigWigAverageOverBed does not always report the same "min" value for a region that has only partial coverage.
In the case from my original message, it reports the correct value when only that specific region is given, but when other regions are given as well
it reports 0 for that region, even though the region does contain defined values.

I found this bug while implementing min/max support for bigtools bigwigaverageoverbed:

We use bigWigAverageOverBed quite regularly (but mostly the mean0 and max columns).

For mean, bigWigAverageOverBed also reports 0 when the region is not covered at all.
Since there is already a mean0 column (which gives 0 in that case), it seems to me that mean, min and max should return NaN
if no entries are found in the bigWig file for a region. min0 and max0 columns could be added to report 0 in those cases.
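
Until something like that exists, the proposed behaviour could be approximated by post-processing the output. This is only a sketch of the idea: `nan_if_uncovered` is a hypothetical filter, not an existing bigWigAverageOverBed option, and it assumes `-minMax` output with covered in column 3 and mean, min, max in columns 6 to 8.

```shell
# nan_if_uncovered: hypothetical post-processing filter (not a Kent tools option).
# For regions with no covered bases (column 3 == 0), replace mean, min and max
# (columns 6, 7, 8) with NaN; mean0 (column 5) keeps its value of 0.
nan_if_uncovered() {
    awk 'BEGIN { OFS = "\t" }
         $3 == 0 { $6 = "NaN"; $7 = "NaN"; $8 = "NaN" }   # undefined statistics
         { $1 = $1; print }'                               # rebuild each line tab-separated
}

# Example call on an existing -minMax output file:
# nan_if_uncovered < consensus_peaks_bicnn.ucsc.bwaob
```

This does not fix the incorrect 0 minimum for partially covered regions; it only separates "no data" from a real value of 0.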

Thanks,
Gert

On Tue, Jun 25, 2024 at 18:42, Matthew Speir <msp...@ucsc.edu> wrote: