Resolution guidelines


Owen Chapman

Feb 20, 2020, 3:30:43 PM
to Fit-Hi-C
Hi Ay lab,
I'm wondering whether you can give a set of guidelines or intuitions about what resolution is appropriate for running FitHiC on different Hi-C experiments. For example, it seems obvious that depth of coverage will influence the choice of resolution, but which sequencing depths are appropriate for which resolutions? What other factors should I consider when choosing a resolution for my FitHiC analysis?
Thanks,
Owen

Ferhat Ay

Feb 20, 2020, 4:15:28 PM
to fit...@googlegroups.com
Hi Owen,

We do have some discussion on this in our new Nature Protocols paper below. I hope it helps. If not, find me after one of the Thursday BISB talks and we can discuss. 
Best,

Choosing an appropriate binning strategy. Even though the native resolution of Hi-C is at a single restriction fragment (i.e., a genomic region demarcated on both sides by the cut site of the restriction enzyme used), the default mode of Hi-C data analysis has been binning the genome into fixed-size, non-overlapping regions (e.g., 40-kb bins). FitHiC2 can handle both cases and leaves it to the user to determine which mode is more appropriate for their data. For instance, for a small genome (e.g., budding yeast) with sufficient sequencing depth and non-frequently cutting restriction enzymes (e.g., 6-bp cutters such as HindIII), it may make sense to use restriction fragment level contact maps to achieve a high-resolution picture of the genome organization. However, for large genomes (e.g., human) with frequent cutters (e.g., 4-bp cutters such as MboI), unless extremely high depth sequencing is available, it may be more appropriate to bin the contacts at a fixed size such as 5 or 40 kb.
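To make the fixed-size binning idea above concrete, here is a minimal sketch of how read pairs could be assigned to non-overlapping bins. The function name `bin_contacts` and the `(chrom1, pos1, chrom2, pos2)` tuple layout are assumptions made for illustration only; this is not FitHiC2's input format or its own binning code.

```python
from collections import Counter


def bin_contacts(read_pairs, bin_size=40_000):
    """Count contacts between fixed-size, non-overlapping genomic bins.

    read_pairs: iterable of (chrom1, pos1, chrom2, pos2) tuples
                (hypothetical layout, chosen for this sketch).
    bin_size:   bin width in base pairs (e.g. 5_000 or 40_000).
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in read_pairs:
        # Integer division maps a coordinate to its bin index.
        a = (chrom1, pos1 // bin_size)
        b = (chrom2, pos2 // bin_size)
        # Sort the two ends so (a, b) and (b, a) count as the same contact.
        counts[tuple(sorted((a, b)))] += 1
    return counts


pairs = [
    ("chr1", 10_000, "chr1", 85_000),
    ("chr1", 90_000, "chr1", 5_000),
    ("chr1", 41_000, "chr2", 100),
]
binned = bin_contacts(pairs, bin_size=40_000)
```

Re-running the same function with a larger `bin_size` merges neighbouring bins, which is the coarser, higher-count end of the tradeoff discussed in the next paragraph.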
Choosing an appropriate contact map resolution. Another important choice is the bin size or the resolution of the contact map. This choice is critical as it will have a significant effect on the downstream analysis and is a tradeoff between the resolution of the analysis and its statistical power. Unfortunately, there is no consensus on how to pick the most appropriate bin size, and only a few articles provide any guideline [4,33]. For instance, Rao et al. [4] suggest using a resolution that results in ≥80% of all possible bins/loci having >1,000 contacts in total. In addition, one could use the density (i.e., percentage of non-zero entries) of the cis- or trans-contact matrices as the cutoff threshold instead of the total contact counts per locus. As determining a correct bin size is critical, it may also be worthwhile repeating some analyses such as FitHiC2 with different resolutions and extracting results that are consistent and robust to the change in resolution.
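The two criteria mentioned above (the share of bins with more than 1,000 contacts, and the matrix density) can be checked with a few lines of code. The sketch below assumes a small, dense contact matrix given as a list of lists and uses hypothetical helper names; none of it is part of FitHiC2, and real genome-wide matrices would normally need a sparse representation.

```python
def bin_coverage_fraction(contacts, min_contacts=1000):
    """Fraction of bins whose total contact count exceeds min_contacts."""
    totals = [sum(row) for row in contacts]
    return sum(t > min_contacts for t in totals) / len(totals)


def matrix_density(contacts):
    """Percentage of non-zero entries in the contact matrix."""
    n_entries = sum(len(row) for row in contacts)
    n_nonzero = sum(1 for row in contacts for value in row if value != 0)
    return 100.0 * n_nonzero / n_entries


def finest_passing_bin_size(matrices_by_bin_size, min_fraction=0.8,
                            min_contacts=1000):
    """Smallest bin size whose matrix meets the coverage criterion.

    matrices_by_bin_size: dict mapping bin size (bp) to a contact matrix.
    Returns None if no candidate resolution passes.
    """
    passing = [
        size
        for size, matrix in matrices_by_bin_size.items()
        if bin_coverage_fraction(matrix, min_contacts) >= min_fraction
    ]
    return min(passing) if passing else None


# Toy matrices standing in for the same data binned at two resolutions.
candidates = {
    40_000: [[600, 600], [600, 600]],    # every bin totals 1,200 contacts
    5_000: [[100] * 4 for _ in range(4)],  # every bin totals 400 contacts
}
chosen = finest_passing_bin_size(candidates)
```

The default thresholds (0.8 and 1,000) mirror the numbers quoted from Rao et al. above; swapping `bin_coverage_fraction` for a cutoff on `matrix_density` gives the density-based alternative, though the text does not specify what density cutoff to use.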
