Geometrically, vectors are multi-dimensional quantities with magnitude and direction, often pictured as arrows. A linear transformation rotates, stretches, or shears the vectors upon which it acts. Its eigenvectors are those vectors that are only stretched, with neither rotation nor shear. The corresponding eigenvalue is the factor by which an eigenvector is stretched or squished. If the eigenvalue is negative, the eigenvector's direction is reversed.[1]
The eigenvectors and eigenvalues of a linear transformation serve to characterize it, and so they play important roles in all the areas where linear algebra is applied, from geology to quantum mechanics. In particular, it is often the case that a system is represented by a linear transformation whose outputs are fed as inputs to the same transformation (feedback). In such an application, the largest eigenvalue is of particular importance, because it governs the long-term behavior of the system after many applications of the linear transformation, and the associated eigenvector is the steady state of the system.
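The feedback behavior described above can be sketched with repeated matrix-vector multiplication. This is a minimal illustration, not a production routine: the matrix `A`, the starting vector, and the helper names `mat_vec` and `power_iteration` are assumptions chosen for the example, and the sketch assumes that one eigenvalue is strictly largest in absolute value and that the starting vector has a nonzero component along its eigenvector.

```python
import math

def mat_vec(A, x):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def power_iteration(A, x0, steps=100):
    """Apply A repeatedly, rescaling to unit length after each step.

    Returns an estimate of the dominant eigenvalue and a unit vector
    approximating the associated eigenvector.
    """
    x = list(x0)
    for _ in range(steps):
        y = mat_vec(A, x)
        norm = math.sqrt(sum(c * c for c in y))
        x = [c / norm for c in y]
    # For a unit vector x close to an eigenvector, x . (A x) estimates the eigenvalue.
    Ax = mat_vec(A, x)
    eigenvalue = sum(a * b for a, b in zip(x, Ax))
    return eigenvalue, x

# Illustrative matrix whose columns each sum to 1; its eigenvalues are 1 and 0.7.
A = [[0.9, 0.2],
     [0.1, 0.8]]
eigenvalue, vector = power_iteration(A, [1.0, 0.0])
print(eigenvalue, vector)
```

For this matrix the iterates settle on the direction (2, 1), the eigenvector of the largest eigenvalue, regardless of the admissible starting vector.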
There is a direct correspondence between n-by-n square matrices and linear transformations from an n-dimensional vector space into itself, given any basis of the vector space. Hence, in a finite-dimensional vector space, it is equivalent to define eigenvalues and eigenvectors using either the language of matrices, or the language of linear transformations.[3][4]
Eigenvalues and eigenvectors feature prominently in the analysis of linear transformations. The prefix eigen- is adopted from the German word eigen (cognate with the English word own) for 'proper', 'characteristic', 'own'.[6][7] Originally used to study principal axes of the rotational motion of rigid bodies, eigenvalues and eigenvectors have a wide range of applications, for example in stability analysis, vibration analysis, atomic orbitals, facial recognition, and matrix diagonalization.
The example here, based on the Mona Lisa, provides a simple illustration. Each point on the painting can be represented as a vector pointing from the center of the painting to that point. The linear transformation in this example is called a shear mapping. Points in the top half are moved to the right, and points in the bottom half are moved to the left, proportional to how far they are from the horizontal axis that goes through the middle of the painting. The vectors pointing to each point in the original image are therefore tilted right or left, and made longer or shorter by the transformation. Points along the horizontal axis do not move at all when this transformation is applied. Therefore, any vector that points directly to the right or left with no vertical component is an eigenvector of this transformation, because the mapping does not change its direction. Moreover, these eigenvectors all have an eigenvalue equal to one, because the mapping does not change their length either.
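The shear mapping can be written out for individual vectors. The following sketch assumes a horizontal shear with factor `k = 0.5`; the factor and the function name `shear` are illustrative choices, not values taken from the painting example.

```python
def shear(v, k=0.5):
    """Horizontal shear: shift x by k times the height y and leave y unchanged."""
    x, y = v
    return (x + k * y, y)

horizontal = (3.0, 0.0)   # no vertical component
tilted = (1.0, 2.0)       # has a vertical component

print(shear(horizontal))  # (3.0, 0.0): unchanged, so an eigenvector with eigenvalue 1
print(shear(tilted))      # (2.0, 2.0): direction changed, so not an eigenvector
```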
In the 18th century, Leonhard Euler studied the rotational motion of a rigid body, and discovered the importance of the principal axes.[a] Joseph-Louis Lagrange realized that the principal axes are the eigenvectors of the inertia matrix.[11]
In the early 19th century, Augustin-Louis Cauchy saw how their work could be used to classify the quadric surfaces, and generalized it to arbitrary dimensions.[12] Cauchy also coined the term racine caractéristique (characteristic root), for what is now called eigenvalue; his term survives in characteristic equation.[b]
Later, Joseph Fourier used the work of Lagrange and Pierre-Simon Laplace to solve the heat equation by separation of variables in his famous 1822 book Théorie analytique de la chaleur.[13] Charles-François Sturm developed Fourier's ideas further, and brought them to the attention of Cauchy, who combined them with his own ideas and arrived at the fact that real symmetric matrices have real eigenvalues.[12] This was extended by Charles Hermite in 1855 to what are now called Hermitian matrices.[14]
Around the same time, Francesco Brioschi proved that the eigenvalues of orthogonal matrices lie on the unit circle,[12] and Alfred Clebsch found the corresponding result for skew-symmetric matrices.[14] Finally, Karl Weierstrass clarified an important aspect in the stability theory started by Laplace, by realizing that defective matrices can cause instability.[12]
At the start of the 20th century, David Hilbert studied the eigenvalues of integral operators by viewing the operators as infinite matrices.[17] He was the first to use the German word eigen, which means "own",[7] to denote eigenvalues and eigenvectors in 1904,[c] though he may have been following a related usage by Hermann von Helmholtz. For some time, the standard term in English was "proper value", but the more distinctive term "eigenvalue" is the standard today.[18]
The first numerical algorithm for computing eigenvalues and eigenvectors appeared in 1929, when Richard von Mises published the power method. One of the most popular methods today, the QR algorithm, was proposed independently by John G. F. Francis[19] and Vera Kublanovskaya[20] in 1961.[21][22]
Eigenvalues and eigenvectors are often introduced to students in the context of linear algebra courses focused on matrices.[23][24] Furthermore, linear transformations over a finite-dimensional vector space can be represented using matrices,[3][4] which is especially common in numerical and computational applications.[25]
If the entries of the matrix A are all real numbers, then the coefficients of the characteristic polynomial will also be real numbers, but the eigenvalues may still have nonzero imaginary parts. The entries of the corresponding eigenvectors therefore may also have nonzero imaginary parts. Similarly, the eigenvalues may be irrational numbers even if all the entries of A are rational numbers or even if they are all integers. However, if the entries of A are all algebraic numbers, which include the rationals, the eigenvalues must also be algebraic numbers.
The non-real roots of a polynomial with real coefficients can be grouped into pairs of complex conjugates, the two members of each pair having the same real part and imaginary parts that differ only in sign. If the degree is odd, then by the intermediate value theorem at least one of the roots is real. Therefore, any real matrix with odd order has at least one real eigenvalue, whereas a real matrix with even order may not have any real eigenvalues. The eigenvectors associated with these complex eigenvalues are also complex and also appear in complex conjugate pairs.
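As a worked illustration (the matrix is chosen here for convenience and is not taken from the text above), consider the real matrix of a quarter-turn rotation of the plane:

$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \det(A - \lambda I) = \lambda^2 + 1 .$$

The characteristic polynomial has real coefficients but no real roots; its roots are the conjugate pair $\lambda = \pm i$. The associated eigenvectors are complex and are conjugates of each other:

$$A \begin{pmatrix} 1 \\ -i \end{pmatrix} = i \begin{pmatrix} 1 \\ -i \end{pmatrix}, \qquad A \begin{pmatrix} 1 \\ i \end{pmatrix} = -i \begin{pmatrix} 1 \\ i \end{pmatrix} .$$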
where $\kappa$ is a scalar and $u$ is a $1 \times n$ matrix. Any row vector $u$ satisfying this equation is called a left eigenvector of $A$ and $\kappa$ is its associated eigenvalue. Taking the transpose of this equation, $A^\textsf{T} \mathbf{u}^\textsf{T} = \kappa \mathbf{u}^\textsf{T}$.
Comparing this equation to equation (1), it follows immediately that a left eigenvector of $A$ is the same as the transpose of a right eigenvector of $A^\textsf{T}$, with the same eigenvalue. Furthermore, since the characteristic polynomial of $A^\textsf{T}$ is the same as the characteristic polynomial of $A$, the left and right eigenvectors of $A$ are associated with the same eigenvalues.
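The relationship between left eigenvectors and the transpose can be checked on a small case. In this sketch the matrix `A`, the row vector `u`, and the helper names are illustrative assumptions; integer entries are used so that the comparison is exact.

```python
def row_times_matrix(u, A):
    """Row vector u (list) times matrix A (list of rows)."""
    return [sum(u[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

def matrix_times_column(A, v):
    """Matrix A (list of rows) times column vector v (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

A = [[2, 0],
     [1, 3]]
u = [1, 1]   # candidate left eigenvector
kappa = 3    # candidate eigenvalue

scaled = [kappa * c for c in u]
print(row_times_matrix(u, A) == scaled)                 # u A = kappa u
print(matrix_times_column(transpose(A), u) == scaled)   # A^T u^T = kappa u^T
```

Both checks print `True`: the same entries that form a left eigenvector of `A` form a right eigenvector of its transpose, with the same eigenvalue.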
A can therefore be decomposed into a matrix composed of its eigenvectors, a diagonal matrix with its eigenvalues along the diagonal, and the inverse of the matrix of eigenvectors. This is called the eigendecomposition and it is a similarity transformation. Such a matrix A is said to be similar to the diagonal matrix Λ or diagonalizable. The matrix Q is the change of basis matrix of the similarity transformation. Essentially, the matrices A and Λ represent the same linear transformation expressed in two different bases. The eigenvectors are used as the basis when representing the linear transformation as Λ.
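A small numerical check can make the decomposition concrete. The sketch below assumes a 2-by-2 example whose eigenvectors (1, 1) and (1, -1) and eigenvalues 3 and 1 were worked out by hand; the names `Q`, `Lam`, `Q_inv`, and `mat_mul` are illustrative.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

Q = [[1.0, 1.0],        # columns are the eigenvectors (1, 1) and (1, -1)
     [1.0, -1.0]]
Lam = [[3.0, 0.0],      # eigenvalues on the diagonal, in the same order
       [0.0, 1.0]]
Q_inv = [[0.5, 0.5],    # inverse of Q, computed by hand for this example
         [0.5, -0.5]]

A = mat_mul(mat_mul(Q, Lam), Q_inv)
print(A)  # [[2.0, 1.0], [1.0, 2.0]]
```

The order of the columns of `Q` must match the order of the eigenvalues in `Lam`; permuting both in the same way gives another valid decomposition of the same matrix.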
A matrix that is not diagonalizable is said to be defective. For defective matrices, the notion of eigenvectors generalizes to generalized eigenvectors and the diagonal matrix of eigenvalues generalizes to the Jordan normal form. Over an algebraically closed field, any matrix A has a Jordan normal form and therefore admits a basis of generalized eigenvectors and a decomposition into generalized eigenspaces.
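A standard small example of a defective matrix (chosen here for illustration) is

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad \det(A - \lambda I) = (1 - \lambda)^2 .$$

The only eigenvalue is $\lambda = 1$, with algebraic multiplicity 2, but $(A - I)\mathbf{v} = \mathbf{0}$ forces the second entry of $\mathbf{v}$ to be zero, so the eigenspace is spanned by $(1, 0)^\textsf{T}$ alone and no basis of eigenvectors exists. The vector $(0, 1)^\textsf{T}$ is a generalized eigenvector, since

$$(A - I) \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad (A - I)^2 = 0 ,$$

and in this case $A$ is already in Jordan normal form.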
In the Hermitian case, eigenvalues can be given a variational characterization. The largest eigenvalue of $H$ is the maximum value of the quadratic form $\mathbf{x}^\textsf{T} H \mathbf{x} / \mathbf{x}^\textsf{T} \mathbf{x}$. A value of $\mathbf{x}$ that realizes that maximum is an eigenvector.
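The variational characterization can be explored numerically. This sketch assumes a real symmetric matrix (a special case of a Hermitian matrix) whose largest eigenvalue, 3, is known by hand; the matrix `H`, the random sampling, and the name `rayleigh_quotient` are illustrative assumptions.

```python
import random

def rayleigh_quotient(H, x):
    """Quotient (x^T H x) / (x^T x) for a real symmetric H and nonzero x."""
    Hx = [sum(h * c for h, c in zip(row, x)) for row in H]
    return sum(a * b for a, b in zip(x, Hx)) / sum(c * c for c in x)

H = [[2.0, 1.0],
     [1.0, 2.0]]   # eigenvalues 3 and 1, eigenvectors (1, 1) and (1, -1)

random.seed(0)
samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(1000)]
best = max(rayleigh_quotient(H, x) for x in samples)

print(best)                               # close to, and never above, 3
print(rayleigh_quotient(H, [1.0, 1.0]))   # 3.0, attained at the eigenvector
```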
A matrix whose elements above the main diagonal are all zero is called a lower triangular matrix, while a matrix whose elements below the main diagonal are all zero is called an upper triangular matrix. As with diagonal matrices, the eigenvalues of triangular matrices are the elements of the main diagonal.
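The reason can be seen from the characteristic polynomial. For an upper triangular 3-by-3 matrix, written here with generic entries $a_{ij}$ as an illustration, $A - \lambda I$ is again triangular, and the determinant of a triangular matrix is the product of its diagonal entries:

$$\det(A - \lambda I) = \det \begin{pmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ 0 & a_{22} - \lambda & a_{23} \\ 0 & 0 & a_{33} - \lambda \end{pmatrix} = (a_{11} - \lambda)(a_{22} - \lambda)(a_{33} - \lambda) .$$

The roots are therefore exactly $a_{11}$, $a_{22}$, and $a_{33}$. The same argument applies to lower triangular matrices and to any size.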
The eigenspaces of T always form a direct sum. As a consequence, eigenvectors of different eigenvalues are always linearly independent. Therefore, the sum of the dimensions of the eigenspaces cannot exceed the dimension n of the vector space on which T operates, and there cannot be more than n distinct eigenvalues.[d]
The eigenvalues of a matrix $A$ can be determined by finding the roots of the characteristic polynomial. This is easy for $2 \times 2$ matrices, but the difficulty increases rapidly with the size of the matrix.
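For the 2-by-2 case, the characteristic polynomial is a quadratic in the trace and determinant, so the quadratic formula gives both eigenvalues directly. The following is a minimal sketch; the function name `eigenvalues_2x2` and the sample matrices are illustrative, and `cmath.sqrt` is used so that a negative discriminant yields complex eigenvalues instead of an error.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]].

    They are the roots of lambda^2 - (a + d) * lambda + (a * d - b * c) = 0.
    """
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

print(eigenvalues_2x2(2, 1, 1, 2))    # real eigenvalues 3 and 1, returned as complex numbers
print(eigenvalues_2x2(0, -1, 1, 0))   # a conjugate pair, i and -i
```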
In theory, the coefficients of the characteristic polynomial can be computed exactly, since they are sums of products of matrix elements; and there are algorithms that can find all the roots of a polynomial of arbitrary degree to any required accuracy.[44] However, this approach is not viable in practice because the coefficients would be contaminated by unavoidable round-off errors, and the roots of a polynomial can be an extremely sensitive function of the coefficients (as exemplified by Wilkinson's polynomial).[44] Even for matrices whose elements are integers the calculation becomes nontrivial, because the sums are very long; the constant term is the determinant, which for an $n \times n$ matrix is a sum of $n!$ different products.[e]
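The growth in the number of products can be seen by expanding the determinant as a signed sum over permutations (the Leibniz formula). The sketch below is intended only to count terms, not as a practical way to compute determinants; the function names and the sample matrix are illustrative assumptions.

```python
from itertools import permutations
from math import factorial

def permutation_sign(p):
    """Sign of a permutation: +1 for an even number of inversions, -1 for odd."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def leibniz_determinant(A):
    """Determinant as a signed sum over all permutations of the column indices.

    Returns the determinant and the number of products that were summed.
    """
    n = len(A)
    total, terms = 0, 0
    for p in permutations(range(n)):
        product = permutation_sign(p)
        for i in range(n):
            product *= A[i][p[i]]
        total += product
        terms += 1
    return total, terms

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
det, terms = leibniz_determinant(A)
print(det, terms, factorial(3))  # 18 6 6
```

Already for a 10-by-10 matrix the same expansion would involve 3,628,800 products, which is why practical algorithms avoid forming the characteristic polynomial this way.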