GSE63525 GM12878 gives nan p-values and q-values


Qin CAO

Aug 12, 2019, 11:40:48 PM
to Fit-Hi-C
Dear admin,

I ran Fit-Hi-C on GSE63525 GM12878 (GSE63525_GM12878_insitu_primary+replicate_combined.hic) at 50 kb resolution, and the run appeared to complete normally. However, I got nan for the p-values and q-values with "-x All" and "-x intraOnly". With "-x interOnly" the results are fine, and the results for all the other cell lines are fine as well. I'm wondering what the possible reason could be. The version is 2.0.7 (the latest), installed via git.

The head of the resulting file and the log file for GM12878 are below:

zcat < FitHiC.spline_pass1.res50000.significances.txt.gz | head
chr1    fragmentMid1    chr2    fragmentMid2    contactCount    p-value q-value bias1   bias2
chr10   75000   chr10   125000  1736    nan     nan     1.000000e+00    1.000000e+00
chr10   75000   chr10   175000  676     nan     nan     1.000000e+00    1.000000e+00
chr10   125000  chr10   175000  6191    nan     nan     1.000000e+00    1.000000e+00
chr10   75000   chr10   225000  357     nan     nan     1.000000e+00    1.000000e+00
chr10   125000  chr10   225000  2243    nan     nan     1.000000e+00    1.000000e+00
chr10   175000  chr10   225000  6357    nan     nan     1.000000e+00    1.000000e+00
chr10   75000   chr10   275000  323     nan     nan     1.000000e+00    1.000000e+00
chr10   125000  chr10   275000  1717    nan     nan     1.000000e+00    1.000000e+00
chr10   175000  chr10   275000  2993    nan     nan     1.000000e+00    1.000000e+00

log:
Interactions file read successfully
------------------------------------------------------------------------------------
Observed, Intra-chr in range: pairs= 79549520    totalCount= 2536425942
Observed, Intra-chr all: pairs= 79606379         totalCount= 3678507965
Observed, Inter-chr all: pairs= 710072167        totalCount= 1216856242
Range of observed genomic distances [50000 249200000]

Making equal occupancy bins
------------------------------------------------------------------------------------
Observed intra-chr read counts in range 2536425942
Desired number of contacts per bin      25364259.42,
Number of bins  100
Equal occupancy bins generated

Looping through all possible fragment pairs in-range
------------------------------------------------------------------------------------
Chromosome 'chr1',      4986 mappable fragments,        12427605 possible intra-chr fragment pairs in range,    277984458 possible inter-chr fragment pairs
Chromosome 'chr10',     2711 mappable fragments,        3673405 possible intra-chr fragment pairs in range,     157313908 possible inter-chr fragment pairs
Chromosome 'chr11',     2701 mappable fragments,        3646350 possible intra-chr fragment pairs in range,     156760638 possible inter-chr fragment pairs
Chromosome 'chr12',     2678 mappable fragments,        3584503 possible intra-chr fragment pairs in range,     155487358 possible inter-chr fragment pairs
Chromosome 'chr13',     2304 mappable fragments,        2653056 possible intra-chr fragment pairs in range,     134634240 possible inter-chr fragment pairs
Chromosome 'chr14',     2147 mappable fragments,        2303731 possible intra-chr fragment pairs in range,     125797024 possible inter-chr fragment pairs
Chromosome 'chr15',     2051 mappable fragments,        2102275 possible intra-chr fragment pairs in range,     120369088 possible inter-chr fragment pairs
Chromosome 'chr16',     1808 mappable fragments,        1633528 possible intra-chr fragment pairs in range,     106547248 possible inter-chr fragment pairs
Chromosome 'chr17',     1624 mappable fragments,        1317876 possible intra-chr fragment pairs in range,     96002760 possible inter-chr fragment pairs
Chromosome 'chr18',     1562 mappable fragments,        1219141 possible intra-chr fragment pairs in range,     92434474 possible inter-chr fragment pairs
Chromosome 'chr19',     1183 mappable fragments,        699153 possible intra-chr fragment pairs in range,      70454748 possible inter-chr fragment pairs
Chromosome 'chr2',      4864 mappable fragments,        11826816 possible intra-chr fragment pairs in range,    271776000 possible inter-chr fragment pairs
Chromosome 'chr20',     1261 mappable fragments,        794430 possible intra-chr fragment pairs in range,      75001758 possible inter-chr fragment pairs
Chromosome 'chr21',     963 mappable fragments,         463203 possible intra-chr fragment pairs in range,      57564288 possible inter-chr fragment pairs
Chromosome 'chr22',     1027 mappable fragments,        526851 possible intra-chr fragment pairs in range,      61324224 possible inter-chr fragment pairs
Chromosome 'chr3',      3961 mappable fragments,        7842780 possible intra-chr fragment pairs in range,     224897658 possible inter-chr fragment pairs
Chromosome 'chr4',      3824 mappable fragments,        7309576 possible intra-chr fragment pairs in range,     217642960 possible inter-chr fragment pairs
Chromosome 'chr5',      3619 mappable fragments,        6546771 possible intra-chr fragment pairs in range,     206717280 possible inter-chr fragment pairs
Chromosome 'chr6',      3423 mappable fragments,        5856753 possible intra-chr fragment pairs in range,     196192668 possible inter-chr fragment pairs
Chromosome 'chr7',      3183 mappable fragments,        5064153 possible intra-chr fragment pairs in range,     183200748 possible inter-chr fragment pairs
Chromosome 'chr8',      2928 mappable fragments,        4285128 possible intra-chr fragment pairs in range,     169270608 possible inter-chr fragment pairs
Chromosome 'chr9',      2825 mappable fragments,        3988900 possible intra-chr fragment pairs in range,     163607050 possible inter-chr fragment pairs
Chromosome 'chrX',      3106 mappable fragments,        4822065 possible intra-chr fragment pairs in range,     179008098 possible inter-chr fragment pairs
Number of all fragments= 60739
Possible, Intra-chr in range: pairs= 94588049
Possible, Intra-chr all: pairs= 94648788.0
Possible, Inter-chr all: pairs= 1749994642.0
Desired genomic distance range   [0 inf]
Range of possible genomic distances  [50000  249250000]
Baseline intrachromosomal probability is 1.056537564960684e-08
Interchromosomal probability is 5.714303209849485e-10

Calculating probability means and standard deviations of contact counts
------------------------------------------------------------------------------------
Means and error written to hic_GM12878/All/hic_GM12878/FitHiC.fithic_pass1.res50000.txt


Fitting a univariate spline to the probability means
------------------------------------------------------------------------------------
Spline successfully fit
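The figures in the log above can be cross-checked with a few lines of arithmetic. This is only a sanity-check sketch, not Fit-Hi-C code: the variable names are made up for illustration, and the idea that the two baseline probabilities are the reciprocals of the "Possible ... all" pair counts is inferred from the printed numbers, not taken from the tool's source.

```python
import math

# Values copied from the log above.
observed_intra_in_range = 2_536_425_942  # "Observed intra-chr read counts in range"
n_bins = 100                             # "Number of bins"
possible_intra_all = 94_648_788          # "Possible, Intra-chr all: pairs"
possible_inter_all = 1_749_994_642       # "Possible, Inter-chr all: pairs"

# Desired contacts per equal-occupancy bin.
contacts_per_bin = observed_intra_in_range / n_bins

# Assumption: baseline probabilities are 1 / (number of possible pairs).
baseline_intra = 1.0 / possible_intra_all
baseline_inter = 1.0 / possible_inter_all

# Compare against the values printed in the log.
print(math.isclose(contacts_per_bin, 25364259.42, rel_tol=1e-9))
print(math.isclose(baseline_intra, 1.056537564960684e-08, rel_tol=1e-5))
print(math.isclose(baseline_inter, 5.714303209849485e-10, rel_tol=1e-5))
```

If all three comparisons hold, the binning and baseline stages of the log are internally consistent, which suggests the nan values arise later, during or after the spline fit.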





And the console output (partial):

All genomic regions will be analyzed                                            
Lower bound of bias values is 0.5                                               
Upper bound of bias values is 2                                                 
All arguments processed. Running FitHiC now...                                 
=========================                              
Reading the contact counts file to generate bins...                             
Interactions file read. Time took 5700.171980857849                           
Fragments file read. Time took 0.3304603099822998                              
Writing hic_GM12878/All/hic_GM12878/FitHiC.fithic_pass1.res50000.txt            
Spline fit Pass 1 starting...                                                   
Outlier threshold is... 5.421280406018946e-10                                   
Writing p-values and q-values to file hic_GM12878/All/hic_GM12878/FitHiC.spline_pass1.significances.txt                                                         
Number of outliers is... 1351104                                                
Spline fit Pass 1 completed. Time took 56646.725069761276                      
=========================                                                       
Fit-Hi-C completed successfully                    

Ferhat Ay

Aug 13, 2019, 2:32:06 PM
to fit...@googlegroups.com
Hi,
We may have to look into this in more detail, but knowing the human genome, the first ~3 Mb of chr1 has very low mappability. A bias value of exactly 1 indicates that these regions have bias values out of range (less than 0.5). So if the pairs without nan in their p-/q-values are all you need, you can simply ignore the nans. We should do a better job of reporting bias values when they are out of range. The reason we set them to 1, rather than, say, nan or -1, is for normalization of the bias vector, but we can keep that internal and report a different value in the output.
Hope this helps.
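For readers who want to follow the "simply ignore nans" suggestion, a minimal sketch of such a filter is given here. It is not part of Fit-Hi-C; the function name `finite_rows` is hypothetical, and the column positions (p-value sixth, q-value seventh, whitespace-separated) are assumed from the output shown earlier in this thread.

```python
import gzip
import math

# Assumed zero-based column positions in the significances file.
P_COL, Q_COL = 5, 6

def finite_rows(path):
    """Yield split rows whose p-value and q-value are both finite numbers."""
    with gzip.open(path, "rt") as handle:
        next(handle, None)  # skip the header line
        for line in handle:
            fields = line.split()
            if len(fields) <= Q_COL:
                continue  # skip blank or truncated lines
            p, q = float(fields[P_COL]), float(fields[Q_COL])
            if math.isfinite(p) and math.isfinite(q):
                yield fields
```

Usage would look like `kept = list(finite_rows("FitHiC.spline_pass1.res50000.significances.txt.gz"))`. If `kept` comes back empty, every row is nan and filtering alone will not help.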

--
You received this message because you are subscribed to the Google Groups "Fit-Hi-C" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fithic+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/fithic/8b366d93-7995-407f-a1fc-3b30a90e47b9%40googlegroups.com.

Qin CAO

Aug 13, 2019, 2:43:22 PM
to Fit-Hi-C
Hi Ferhat,

No, all the p-values and q-values for all chromosomes in FitHiC.spline_pass1.res50000.significances.txt.gz are nan, with both -x All and -x intraOnly.
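A per-chromosome tally is one way to confirm that every row is affected and not just the low-mappability regions. The sketch below is illustrative only: `nan_tally` is a hypothetical helper, the p-value is assumed to be the sixth whitespace-separated column, and the sample rows are copied from the output earlier in this thread.

```python
import math
from collections import Counter

def nan_tally(lines, p_col=5):
    """Return (total, nan) Counters keyed by the first chromosome column."""
    total, nan = Counter(), Counter()
    for line in lines:
        fields = line.split()
        if len(fields) <= p_col:
            continue  # skip blank or truncated lines
        try:
            p = float(fields[p_col])
        except ValueError:
            continue  # header line: the p-value column is not numeric
        total[fields[0]] += 1
        if math.isnan(p):
            nan[fields[0]] += 1
    return total, nan

# Rows copied from the head of the significances file shown earlier.
sample = [
    "chr1 fragmentMid1 chr2 fragmentMid2 contactCount p-value q-value bias1 bias2",
    "chr10 75000 chr10 125000 1736 nan nan 1.000000e+00 1.000000e+00",
    "chr10 75000 chr10 175000 676 nan nan 1.000000e+00 1.000000e+00",
    "chr10 125000 chr10 175000 6191 nan nan 1.000000e+00 1.000000e+00",
]
total, nan = nan_tally(sample)
for chrom in sorted(total):
    print(chrom, nan[chrom], "of", total[chrom], "rows are nan")
```

On the full file, the same function can be fed the lines of `gzip.open(path, "rt")`. Equal counts for every chromosome would match the report above.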

         

