min_uncorr in log.regression.peaks.txt

7 views
Skip to first unread message

Matt Shirley

unread,
Jul 13, 2011, 1:57:57 PM7/13/11
to cnvi...@googlegroups.com
Is it safe to assume that this is the minimum (least significant) p-value in each significant region of the manhattan plot? I would like to use these values for a supplementary table in my manuscript, but want to make sure that we are reporting the correct values. Thanks,

Matt

Michael

unread,
Jul 18, 2011, 9:38:06 AM7/18/11
to CNVineta
The reported p value is the minimum of all p-values within the
specified region. So it is the most significant.

I should implement min, max and average for this output ... !

On Jul 13, 7:57 pm, Matt Shirley <mds...@gmail.com> wrote:
> Is it safe to assume that this is the minimum (least significant) p-value in
> each significant region of the manhattan plot? I would like to use these
> values <http://paste2.org/p/1518502> for a supplementary table in my

Matt Shirley

unread,
Jul 18, 2011, 9:51:29 AM7/18/11
to cnvi...@googlegroups.com
Thanks for the clarification. It would be useful to have a maximum numerical p-value so that you can report something that is a little more conservative. If the least significant p-value for a marker in a region is still very significant, then you can say something about the significance of the rest of the markers, but not the other way around.
--
Matt Shirley
Ph.D Candidate - BCMB
Pevsner Lab
Johns Hopkins Medicine
Google Profile

Michael

unread,
Jul 19, 2011, 1:05:19 PM7/19/11
to CNVineta
I've changed the source an added values for max and and Q1-Q3 (25%,
50% and 75% quantiles). The source is changed but it is coming with
the next release ...


You should have a file with all p values. The file name ends
".segment.pValues.txt". First line has following column names TAB
delimited:
"chrom","pos","Estimate","Std. Error","z value","Pr(>|z|)","Estimate
corrected","Std. Error corrected","z value corrected","Pr(>|z|)
corrected","sign. cov. infl.","peak"


# Load the file:
values = read.table(file =
"yourFilename.segment.pValues.txt",header=TRUE,stringsAsFactors=FALSE)

# Run following commands:
the_peaks = unique(as.integer(values[,"peak"]))
return_table = rep(NA,max(the_peaks,na.rm=TRUE)*13)
dim(return_table) = c(max(the_peaks,na.rm=TRUE),13)
colnames(return_table) =
c("chrom","start","end","min_uncorr","min_corr","max_uncorr","max_corr","Q1_uncorr","Q1_corr","Q2_uncorr","Q2_corr","Q3_uncorr","Q3_corr")
for(i in the_peaks)
{
if(is.na(i))
next
which_belong = which(values[,"peak"]==i)
return_table[i,"chrom"] = values[which_belong[1],"chrom"]
return_table[i,"start"] =
values[min(which_belong,na.rm=TRUE),"pos"]
return_table[i,"end"] =
values[max(which_belong,na.rm=TRUE),"pos"]
return_table[i,"min_uncorr"]=
min(as.double(values[which_belong,"Pr(>|z|)"]),na.rm=TRUE)
return_table[i,"min_corr"] =
max(as.double(values[which_belong,"Pr(>|z|) corrected"]),na.rm=TRUE)
return_table[i,"max_uncorr"]=
min(as.double(values[which_belong,"Pr(>|z|)"]),na.rm=TRUE)
return_table[i,"max_corr"] =
max(as.double(values[which_belong,"Pr(>|z|) corrected"]),na.rm=TRUE)
Q_uncorr =
quantile(as.double(values[which_belong,"Pr(>|
z|)"]),probs=c(0.25,0.5,0.75),na.rm=TRUE)
Q_corr =
quantile(as.double(values[which_belong,"Pr(>|z|)
corrected"]),probs=c(0.25,0.5,0.75),na.rm=TRUE)
return_table[i,"Q1_uncorr"] = Q_uncorr[1]
return_table[i,"Q1_corr"] = Q_corr[1]
return_table[i,"Q2_uncorr"] = Q_uncorr[2]
return_table[i,"Q2_corr"] = Q_corr[2]
return_table[i,"Q3_uncorr"] = Q_uncorr[3]
return_table[i,"Q3_corr"] = Q_corr[3]
}


#write the results:
write.table(return_table, file = "yourFilename.peaks.txt", append =
FALSE, quote = FALSE,sep = "\t",eol = "\n", na = "NA", dec =
".",row.names = FALSE,col.names = TRUE)

Matt Shirley

unread,
Jul 19, 2011, 1:11:37 PM7/19/11
to cnvi...@googlegroups.com
Thanks! This is perfect. 

Michael

unread,
Jul 19, 2011, 3:22:23 PM7/19/11
to CNVineta
!!! within the code min() and max() were switched between two lines,
corrected :

# Load the file:
values = read.table(file =
"yourFilename.segment.pValues.txt",header=TRUE,stringsAsFactors=FALSE)

# Run following commands:
the_peaks = unique(as.integer(values[,"peak"]))
return_table = rep(NA,max(the_peaks,na.rm=TRUE)*13)
dim(return_table) = c(max(the_peaks,na.rm=TRUE),13)
colnames(return_table) =
c("chrom","start","end","min_uncorr","min_corr","max_uncorr","max_corr","Q1_uncorr","Q1_corr","Q2_uncorr","Q2_corr","Q3_uncorr","Q3_corr")
for(i in the_peaks)
{
if(is.na(i))
next
which_belong = which(values[,"peak"]==i)
return_table[i,"chrom"] = values[which_belong[1],"chrom"]
return_table[i,"start"]
=values[min(which_belong,na.rm=TRUE),"pos"]
return_table[i,"end"]
=values[max(which_belong,na.rm=TRUE),"pos"]

return_table[i,"min_uncorr"]=min(as.double(values[which_belong,"Pr(>|
z|)"]),na.rm=TRUE)
return_table[i,"min_corr"]
=min(as.double(values[which_belong,"Pr(>|z|) corrected"]),na.rm=TRUE)

return_table[i,"max_uncorr"]=max(as.double(values[which_belong,"Pr(>|
z|)"]),na.rm=TRUE)
return_table[i,"max_corr"]
=max(as.double(values[which_belong,"Pr(>|z|) corrected"]),na.rm=TRUE)
Q_uncorr
=quantile(as.double(values[which_belong,"Pr(>|
z|)"]),probs=c(0.25,0.5,0.75),na.rm=TRUE)
Q_corr
=quantile(as.double(values[which_belong,"Pr(>|z|)
Reply all
Reply to author
Forward
0 new messages