Bias Values are super low. Fithic dumps all bins

James Smith

Apr 10, 2021, 8:48:40 AM
to Fit-Hi-C

Hi, I have used HiCKRy to generate a bias file from 1 kb resolution data dumped from Juicer's .hic file. The output is below. Do you have any suggestions for how to run this? The data is Micro-C with nearly 3 billion contacts.

The maps load and look visually excellent in Juicebox. Would I be better off converting the KR-normalised data from Juicer instead? The current data was dumped with no normalisation applied.

Bias file output info

Creating sparse matrix...
Sparse matrix creation took 993.3545796871185 seconds
Removing 0.05 percent of most sparse bins
... corresponds to 154414 total rows
... corresponds to all bins with less than or equal to 0.0 total interactions
Sparse rows removed
Initial matrix size: 3088281 rows and 3088281 columns
New matrix size: 2796973 rows and 2796973 columns
Normalizing with KR Algorithm
WARNING... Bias vector has a median outside of typical range (0.5, 2).
Consider running with a larger -x option if problems occur
Mean    0.8113461825526884
Median    4.6122513483594347e-07
Std. Dev.    1328.74827359699

FitHiC.fithic.log info

Number of all fragments= 3088281
Possible, Intra-chr in range: pairs= 474255561360
Possible, Intra-chr all: pairs= 237133957218.0
Possible, Inter-chr all: pairs= 4531607354403.0
Desired genomic distance range   [2000 inf]
Range of possible genomic distances  [2000  248956000]
Baseline intrachromosomal probability is 4.2170257340271535e-12
Interchromosomal probability is 0
5th quantile of biases: -1.0
50th quantile of biases: 4.6122513483594347e-07
95th quantile of biases: 1.7545576787840297e-06
Out of 3088281 loci 3088280 were discarded with biases not in range [0.5 2]

Ferhat Ay

Apr 12, 2021, 12:28:21 PM
to Fit-Hi-C
The normalization does not seem to have worked or converged as expected. I would suggest trying the KR normalization results from Juicer instead. This may be an issue with HiCKRy's default bin filter, which has to be set to a larger value for higher resolutions.
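A quick sanity check on a bias vector before passing it to Fit-Hi-C can be sketched as below. This is only an illustration: `summarize_biases` is a hypothetical helper, the [0.5, 2] window is taken from the Fit-Hi-C log above, and treating non-positive values as bins removed before normalization is an assumption based on the -1.0 quantile in that log.

```python
import statistics


def summarize_biases(biases, low=0.5, high=2.0):
    """Summarize a list of per-bin bias values (hypothetical helper).

    Assumption: non-positive values (e.g. -1.0) flag bins that were
    filtered out before normalization, so they are excluded from the median.
    """
    valid = [b for b in biases if b > 0]
    in_range = [b for b in valid if low <= b <= high]
    return {
        "n_total": len(biases),
        "n_flagged": len(biases) - len(valid),
        "median": statistics.median(valid) if valid else None,
        # Fraction of ALL bins whose bias falls inside [low, high].
        "frac_in_range": len(in_range) / len(biases) if biases else 0.0,
    }


# Toy values only; one flagged bin and one bin with a collapsed bias.
example = [-1.0, 0.8, 1.0, 1.2, 4.6e-07]
summary = summarize_biases(example)
print(summary)
```

If the median of the valid values sits far from 1, or only a small fraction of bins falls inside the window, the normalization likely needs to be redone before running Fit-Hi-C.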

James Smith

Apr 12, 2021, 1:38:35 PM
to Fit-Hi-C
Thank you for the reply. This seems to work OK after I converted the Juicer KR vector. Something looks wrong with the spline fitting, however: it appears as though all the detected interactions are significant. Does this image look right to you?
1kb_test_KR_juicer.spline_pass1.png

Ferhat Ay

Apr 12, 2021, 1:42:39 PM
to Fit-Hi-C
It actually looks all good. The y-axis in these spline plots is not the p-value; it is the prior contact probability, which is supposed to look like that. You can now go through the results files and filter with respect to whatever p/q-value threshold you need. One thing to note: I would advise against using the full chromosome length or 100 Mb as the upper distance limit if you are doing 1 kb analysis. I would go with 5 Mb or 10 Mb at most, unless you have a very good reason to look that far.
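The filtering step described above could look roughly like the following sketch. The column names (`chr1`, `fragmentMid1`, `chr2`, `fragmentMid2`, `q-value`) are an assumption about the header of the results file, and `filter_significant`, `q_max` and `max_dist` are hypothetical names; adjust them to match your own output.

```python
import csv
import io


def filter_significant(lines, q_max=0.01, max_dist=5_000_000):
    """Keep intra-chromosomal rows with q-value <= q_max and
    genomic distance <= max_dist (hypothetical helper).

    `lines` is any iterable of tab-separated text lines with a header row.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    kept = []
    for row in reader:
        if row["chr1"] != row["chr2"]:
            continue  # skip inter-chromosomal pairs
        dist = abs(int(row["fragmentMid2"]) - int(row["fragmentMid1"]))
        if dist <= max_dist and float(row["q-value"]) <= q_max:
            kept.append(row)
    return kept


# Toy input only; the values are made up for illustration.
example = io.StringIO(
    "chr1\tfragmentMid1\tchr2\tfragmentMid2\tcontactCount\tp-value\tq-value\n"
    "chr1\t500\tchr1\t20500\t40\t1e-9\t1e-6\n"      # close and significant
    "chr1\t500\tchr1\t9000500\t3\t1e-9\t1e-6\n"     # beyond 5 Mb
    "chr1\t500\tchr1\t30500\t2\t0.4\t0.9\n"         # not significant
)
kept = filter_significant(example)
print(len(kept), kept[0]["fragmentMid2"])
```

For a gzipped results file, the same function can be fed a handle from `gzip.open(path, "rt")`. Restricting the distance range when running Fit-Hi-C itself (see `fithic --help` for the available bounds options) avoids testing the long-range pairs in the first place.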