Rscript run_spp_nodups.R -c=chipSampleRep1.tagAlign.gz -i=controlSampleRep0.tagAlign.gz -npeak=300000 -odir=/peaks/reps -savr -savp -rf -out=/stats/phantomPeakStatsReps.tab
I used the above command on my data but do not come anywhere near calling 150 - 300k peaks. Here is example output:
################
ChIP data: /path/to/file/AP.tagAlign.gz
Control data: /path/to/file/Input.tagAlign.gz
strandshift(min): -500
strandshift(step): 5
strandshift(max) 1500
user-defined peak shift NA
exclusion(min): 10
exclusion(max): NaN
num parallel nodes: 4
FDR threshold: 0.01
NumPeaks Threshold: 3e+05
Output Directory: /path/to/folder/spp/peaks/reps
narrowPeak output file name: NA
regionPeak output file name: /path/to/file/spp/peaks/reps/AP.tagAlign_VS_Input.tagAlign.regionPeak
Rdata filename: NA
plot pdf filename: /path/to/file/spp/peaks/reps/AP.tagAlign.pdf
result filename: /path/to/file/spp/stats/_AP-phantomPeakStatsReps.tab
Overwrite files?: TRUE
Decompressing ChIP file
Decompressing control file
Loading required package: caTools
Reading ChIP tagAlign/BAM file /path/to/file/AP.tagAlign.gz
opened /tmp/RtmpzJyRih/AP.tagAlign10e3418b6efa
done. read 13635712 fragments
ChIP data read length 50
[1] TRUE
Reading Control tagAlign/BAM file /path/to/file/Input.tagAlign.gz
opened /tmp/RtmpzJyRih/Input.tagAlign10e35aee9044
done. read 32294683 fragments
Control data read length 50
Loading required namespace: Rmpi
4 slaves are spawned successfully. 0 failed.
Calculating peak characteristics
[1] 1
Minimum cross-correlation value 0.04405169
Minimum cross-correlation shift 1500
Top 3 cross-correlation values 0.0576014765287295,0.0575764175222504
Top 3 estimates for fragment length 185,195
Window half size 300
Phantom peak location 50
Phantom peak Correlation 0.05390492
Normalized Strand cross-correlation coefficient (NSC) 1.307588
Relative Strand Cross correlation Coefficient (RSC) 1.375162
Phantom Peak Quality Tag 1
null device
1
Removing read stacks
4 slaves are spawned successfully. 0 failed.
Finding peaks
finding background exclusion regions ... done
determining peaks on provided 1 control datasets:
using reversed signal for FDR calculations
bg.weight= 32.95003 excluding systematic background anomalies ... done
determining peaks on real data:
bg.weight= 0.03034898 excluding systematic background anomalies ... done
calculating statistical thresholds
FDR 0.99 threshold= 2.000461
Detected 15288 peaks
I'm not sure how to proceed, are enough peaks called to move forward for IDR analysis? Are there other options to call to increase the number of peaks?
When I modified the command to include -fdr=0.1 However, I get the same output with one exception: line 11: FDR Threshold: 0.1
Also, I am confused by the -npeak threshold option. Does this option indicate the max number of peaks to be called or does it tell the program to call 300k peaks.
Any help would be great!
Thanks,
Paul