I am still seeing a discrepancy in the number of samples with CNVs reported when visualizing a specific region. Here is detailed output showing the behavior:
Code:
filename = paste("chr", chrom, ":", start, "-", end, ".pdf", sep = "")
pdf(file = filename)
visualizeRegion(CNVdata = objCNVineta,
                batch.case = c("Cell"), batch.control = c("Whole"),
                chrom = chrom, start = start, end = end,
                add.heatmap = TRUE)
dev.off()
R output:
3849 samples to process
2 samples with CNV
[1] "cases : 1 segments in 1 samples"
[1] "controls: 1 segments in 1 samples"
[1] "got 0 transcripts"
3849 samples to process
2 samples with CNV
      size 0 1  2 3 4+ #del samples #dup samples #CNP samples
Cell  1335 0 0 NA 0  0            0            0            0
sum   1335 0 0 NA 0  0            0            0            0
      size 0 1  2 3 4+ #del samples #dup samples #CNP samples
Whole 2514 0 0 NA 1  0            0            1            1
sum   2514 0 0 NA 1  0            0            1            1
                      Deletion Duplication Copy Number Polymorphism
Pearson's Chi-squared        0   0.7475869                0.7475869
Fisher's Exact               1   1.0000000                1.0000000
I expect every reported CNV count to refer to the region I specified. I understand that the graphics have not been updated, and that changing the behavior of the plotting code may be very difficult, but there should be an explicit warning that the counts in the first part of the R output ("X samples with CNV") and in the graphical output ("X/Y cases/controls with CNVs") are not for the region specified. Thank you, though, for making the tabular output and statistics accurate for the region specified.
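To make the expected behavior concrete, here is a minimal base-R sketch of counting only the samples whose segments overlap the requested region. The `segments` data frame, its column names, and the `countSamplesInRegion` helper are made up for illustration; they are not the internal structure of a CNVineta object or part of the package's API.

```r
# Hypothetical segment table (one row per CNV segment).
# This is NOT how CNVineta stores its data; it only illustrates the idea.
segments <- data.frame(
  sample = c("s1", "s2", "s3"),
  batch  = c("Cell", "Whole", "Whole"),
  chrom  = c(1, 1, 2),
  start  = c(100, 450, 100),
  end    = c(200, 900, 300),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per batch, count the distinct samples that have at
# least one segment overlapping chrom:start-end.
countSamplesInRegion <- function(segments, chrom, start, end) {
  # Two intervals overlap when each one starts before the other ends.
  overlaps <- segments$chrom == chrom &
    segments$start <= end &
    segments$end >= start
  # Keep one row per sample so multiple segments are not double counted.
  hits <- unique(segments[overlaps, c("sample", "batch")])
  # Fix the factor levels so batches with no hits are reported as 0.
  table(factor(hits$batch, levels = unique(segments$batch)))
}

region_counts <- countSamplesInRegion(segments, chrom = 1, start = 400, end = 1000)
print(region_counts)
```

If the "X samples with CNV" line and the plot caption were computed this way, they would agree with the tabular output for the same region.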