Directionality index vs Eigenvector


Qi Yu

Jul 12, 2017, 11:39:03 AM
to 3D Genomics
Hello,

I notice that Juicebox can calculate an eigenvector. By definition it is different from the directionality index (DI), but does it represent a similar thing at a lower resolution?

Also, when I use Juicebox to extract the eigenvector at different resolutions (500 kb, 2 Mb), the values are flipped (a positive value in the 2 Mb map is negative in the 500 kb map). I was wondering whether the positive and negative signs mean anything.

Thanks!

Bests,
Qi



Ilya Flyamer

Jul 31, 2017, 7:54:27 PM
to 3D Genomics
Eigenvector is, to put it simply, the A/B compartment signal. The directionality index is TAD signal. So they are different things.

The sign of the eigenvector is assigned arbitrarily during computation. The easiest way to make it uniform across samples and chromosomes is to flip eigenvectors when required to obtain a positive correlation with GC content (Imakaev 2012).
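
A minimal sketch of that flipping step in Python with NumPy. It assumes you already have the eigenvector and a GC-content track binned at the same resolution; the function name `orient_by_gc` and the toy arrays are hypothetical, not part of Juicebox or any published pipeline.

```python
import numpy as np

def orient_by_gc(eigvec, gc_content):
    """Return eigvec, negated if it correlates negatively with gc_content."""
    eigvec = np.asarray(eigvec, dtype=float)
    gc_content = np.asarray(gc_content, dtype=float)
    # Use only bins where both tracks are defined (e.g. skip unmappable bins).
    ok = np.isfinite(eigvec) & np.isfinite(gc_content)
    r = np.corrcoef(eigvec[ok], gc_content[ok])[0, 1]
    return -eigvec if r < 0 else eigvec

# Toy data (made up): the same compartment signal with opposite arbitrary signs.
gc = np.array([0.35, 0.38, 0.52, 0.55, 0.36, 0.50])
ev = np.array([-0.4, -0.3, 0.5, 0.6, -0.5, 0.4])
flipped = -ev

# After orientation both versions agree.
same = np.allclose(orient_by_gc(ev, gc), orient_by_gc(flipped, gc))
```

Applying the same rule to every chromosome, sample, and resolution should give a consistent sign, as long as the eigenvector really does track GC content in your data.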

On Wednesday, July 12, 2017 at 16:39:03 UTC+1, Qi Yu wrote:

Erez Aiden

Jul 31, 2017, 9:31:02 PM
to 3D Genomics
Hey -

I just thought I'd clarify this because there's a lot of confusion about these points in the literature. 

Genome compartmentalization refers to the tendency of certain intervals of the genome to co-segregate inside the nucleus. Intervals in the same compartment usually exhibit a similar long-range contact pattern in the Hi-C contact matrix.

Eigenvectors (and their corresponding eigenvalues) are vectors (respectively, scalars) that are defined in linear algebra with respect to a particular matrix. If a genome exhibits compartmentalization, and a high-quality Hi-C contact matrix obtained from that genome is pre-processed appropriately, then one or more of the corresponding eigenvectors will sometimes reflect the above-mentioned long-range patterns, in the sense that there will be a correlation between the sign of the eigenvector at a locus and the long-range contact pattern at that locus. (The eigenvectors that correspond to compartmentalization are usually the eigenvectors with the highest eigenvalues. In general, these concepts relate to things like principal component analysis and the Fiedler vector.)

If the genome does not exhibit compartmentalization, the contact matrix will still have eigenvectors, and those will reflect things about the matrix that have nothing to do with genome compartmentalization. Moreover, even if the genome does exhibit compartmentalization, the eigenvectors still may not reflect it.

The point is, your mileage may vary. Eigenvectors can be a useful tool, but they are not a silver bullet. Before you draw any conclusions about compartmentalization from an eigenvector, spend time looking at the Hi-C matrix.

Similarly, the directionality index is a measure of whether a particular locus tends to interact more with loci that lie upstream, or loci that lie downstream. Sometimes it reflects compartmentalization. Sometimes it reflects other kinds of domain boundaries. Again, your mileage may vary.
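
To make the upstream/downstream idea concrete, here is a rough sketch in Python with NumPy. It assumes the commonly used formulation from Dixon et al. (2012), where A and B are the contact sums in a fixed window upstream and downstream of a bin and E = (A + B) / 2; the thread itself does not specify a formula. The function name `directionality_index`, the `window_bins` parameter, and the toy matrix are hypothetical.

```python
import numpy as np

def directionality_index(contacts, window_bins):
    """Per-bin DI: sign(B - A) * ((A - E)^2 / E + (B - E)^2 / E)."""
    contacts = np.asarray(contacts, dtype=float)
    n = contacts.shape[0]
    di = np.zeros(n)
    for i in range(n):
        lo = max(0, i - window_bins)          # window is truncated at the ends
        hi = min(n, i + window_bins + 1)
        a = contacts[i, lo:i].sum()           # contacts with upstream bins
        b = contacts[i, i + 1:hi].sum()       # contacts with downstream bins
        e = (a + b) / 2.0
        if e == 0 or a == b:
            continue                          # no bias: DI stays 0
        di[i] = np.sign(b - a) * ((a - e) ** 2 / e + (b - e) ** 2 / e)
    return di

# Toy matrix (made up): two 4-bin domains with strong internal contacts.
toy = np.ones((8, 8))
toy[:4, :4] = 10
toy[4:, 4:] = 10
di = directionality_index(toy, window_bins=3)
```

In this toy case the DI is negative at the last bin of the first domain and positive at the first bin of the second, which is the kind of sign switch a domain boundary produces. Real matrices are much noisier, so the caveat above still applies.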

In my experience, one should never blindly rely on a particular statistical method. Use your eyeballs. Get to know and understand the patterns in Hi-C matrices. That way, the above methods will sometimes help you - and you'll recognize when they've been pushed beyond their limits.

Erez

Gabriel D

Oct 30, 2020, 1:05:17 PM
to 3D Genomics
Erez,
I am having trouble finding this explanation in your paper – is this eigenvector simply the Fiedler vector of the contact map, i.e. you are simply performing spectral clustering on the graph induced by the contact map?

Moshe Olshansky

Nov 2, 2020, 7:08:43 PM
to 3D Genomics
The eigenvector computed by Juicer is the first (principal) eigenvector of the correlation matrix of the (binned) Hi-C contacts. That is, if X is your contact matrix (say of size n by p, with p = n for an intrachromosomal or genome-wide matrix) and A(i,j) is the correlation between columns i and j of X (for 1 <= i,j <= p), then we compute the eigenvector of A corresponding to the largest eigenvalue. Since A is symmetric and positive semi-definite, all its eigenvalues are real and non-negative.
Please note that if v is the desired eigenvector, then -v is also an eigenvector for the same eigenvalue, which is why the sign may be inconsistent between resolutions. You can pick a locus, decide that the eigenvector should be, say, positive at that locus, and flip the sign whenever it is not.
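
The description above can be sketched in Python with NumPy as follows. This is an illustration of the linear algebra only, not Juicer's actual code: any preprocessing or normalization the tool applies is not reproduced, bins with constant columns (which give undefined correlations) are not handled, and the names `principal_eigenvector`, `fix_sign`, and `ref_bin` are hypothetical.

```python
import numpy as np

def principal_eigenvector(contacts):
    """Eigenvector of the column-correlation matrix with the largest eigenvalue."""
    x = np.asarray(contacts, dtype=float)
    a = np.corrcoef(x, rowvar=False)       # A[i, j] = corr(column i, column j)
    eigvals, eigvecs = np.linalg.eigh(a)   # symmetric input, ascending eigenvalues
    return eigvecs[:, -1]

def fix_sign(eigvec, ref_bin):
    """Flip the vector if needed so that it is non-negative at ref_bin."""
    return -eigvec if eigvec[ref_bin] < 0 else eigvec

# Toy checkerboard matrix (made up): bins alternate between two compartments.
labels = np.array([1, 1, -1, -1, 1, 1, -1, -1])
x = 5.0 + 3.0 * np.outer(labels, labels)

# Anchor the sign at bin 0, so -v and v give the same result.
v = fix_sign(principal_eigenvector(x), ref_bin=0)
```

Choosing a reference bin is the simplest convention; it only works well if that bin has a clearly non-zero value at every resolution you compare.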
