8.5 Matlab Least Squares Approximation

Poochie Tenharmsel

Aug 4, 2024, 2:26:34 PM

to neuvedonla

I have 37 linear equations in 36 variables, written as a matrix equation A*X = B. The equations don't have an exact solution. I want to use a least-squares method in Matlab to find the solution with the smallest error. I am new to Matlab, so any comments will help. Thank you.

The pseudoinverse (pinv) and the left-division operator (\) can also deal with underdetermined systems of linear equations, but they give different solutions in that case: the pseudoinverse gives the solution where x has the smallest sum of squares, while the left-division operator gives a solution with as many zero coefficients as possible.
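For the 37-by-36 system in the question, the direct approaches could be sketched roughly as follows. The matrices are random stand-ins (an assumption, since the actual A and B were not posted), and lsqminnorm needs R2017b or later. The small 2-by-3 system at the end is also made up, only to show how the two solutions differ in the underdetermined case.

```matlab
% Stand-in data: the poster's actual A (37x36) and B (37x1) are not shown,
% so random values are assumed here purely for illustration.
rng(0);
A = randn(37, 36);
B = randn(37, 1);

% Overdetermined with full column rank: all three give the same fit.
X_backslash = A \ B;             % QR-based least squares
X_pinv      = pinv(A) * B;       % pseudoinverse (SVD-based)
X_minnorm   = lsqminnorm(A, B);  % minimum-norm least squares

fprintf('residual norm: %.4g\n', norm(B - A*X_backslash));

% Underdetermined case (2 equations, 3 unknowns): the solutions differ.
A2 = [1 2 3; 4 5 6];
b2 = [1; 2];
x_basic = A2 \ b2;         % solution from the left-division operator
x_small = pinv(A2) * b2;   % solution with the smallest norm(x)
disp([x_basic, x_small]);
```

For a dense problem of this size, the backslash operator is usually all that is needed; the iterative solver discussed next is aimed at large sparse systems.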

x = lsqr(A,b) attempts to solve the system of linear equations A*x = b for x using the Least Squares Method. lsqr finds a least squares solution for x that minimizes norm(b-A*x). When the system is consistent, the least squares solution is also a solution of the linear system. When the attempt is successful, lsqr displays a message to confirm convergence. If lsqr fails to converge after the maximum number of iterations or halts for any reason, it displays a diagnostic message that includes the relative residual norm(b-A*x)/norm(b) and the iteration number at which the method stopped.

[x,flag] = lsqr(___) returns a flag that specifies whether the algorithm successfully converged. When flag = 0, convergence was successful. You can use this output syntax with any of the previous input argument combinations. When you specify the flag output, lsqr does not display any diagnostic messages.
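A minimal sketch of the two-output call is shown here. The 37-by-36 matrix is a hypothetical, well-conditioned stand-in for the system in the question, chosen so that the default settings are enough; a real matrix may need more iterations.

```matlab
% Hypothetical stand-in for an inconsistent 37-by-36 system.
A = sparse([5*eye(36) + diag(ones(35,1), 1); ones(1, 36)]);
b = (1:37)';

% Requesting flag suppresses the diagnostic message, so check it explicitly.
[x, flag] = lsqr(A, b);

if flag == 0
    fprintf('lsqr converged, residual norm = %.4g\n', norm(b - A*x));
else
    fprintf('lsqr did not converge (flag = %d)\n', flag);
end
```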

By default lsqr uses 20 iterations and a tolerance of 1e-6, but the algorithm is unable to converge in those 20 iterations for this matrix. Since the residual is still large, it is a good indicator that more iterations (or a preconditioner matrix) are needed. You also can use a larger tolerance to make it easier for the algorithm to converge.

Solve the system again using a tolerance of 1e-4 and 70 iterations. Specify six outputs to return the relative residual relres of the calculated solution, as well as the residual history resvec and the least-squares residual history lsvec.
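The call with six outputs could look like the sketch below. The test matrix is not shown in the text above, so a random sparse 400-by-300 matrix is assumed here; the flag and residual values you get depend on that choice.

```matlab
% Assumed test problem (not given in the post): random sparse system.
rng default
A = sprand(400, 300, 0.5);
b = rand(400, 1);

tol   = 1e-4;   % looser than the default 1e-6
maxit = 70;     % more than the default 20
[x, flag, relres, iter, resvec, lsvec] = lsqr(A, b, tol, maxit);

fprintf('flag = %d, relres = %.4g, iterations = %d\n', flag, relres, iter);
```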

Since flag is 0, the algorithm was able to meet the desired error tolerance in the specified number of iterations. You can generally adjust the tolerance and number of iterations together to make trade-offs between speed and precision in this manner.

These residual norms indicate that x is a least-squares solution, because relres is not smaller than the specified tolerance of 1e-4. Since no consistent solution to the linear system exists, the best the solver can do is to make the least-squares residual satisfy the tolerance.

Plot the residual histories. The relative residual resvec quickly reaches a minimum and cannot make further progress, while the least-squares residual lsvec continues to be minimized on subsequent iterations.
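One way to draw the two histories is sketched below, again on an assumed random sparse matrix. resvec is divided by norm(b) so that it is plotted as a relative residual, and each history gets its own iteration axis in case the two vectors differ in length.

```matlab
rng default
A = sprand(400, 300, 0.5);   % assumed example matrix
b = rand(400, 1);
[~, ~, ~, ~, resvec, lsvec] = lsqr(A, b, 1e-4, 70);

figure
semilogy(0:numel(resvec)-1, resvec/norm(b), '-o'); hold on
semilogy(0:numel(lsvec)-1, lsvec, '--s'); hold off
legend('Relative residual', 'Least-squares residual')
xlabel('Iteration number'); ylabel('Residual')
```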

Use lsqr to solve Ax=b twice: one time with the default initial guess, and one time with a good initial guess of the solution. Use 75 iterations and the default tolerance for both solutions. Specify the initial guess in the second solution as a vector with all elements equal to 0.99.
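The comparison could be set up as in this sketch. The matrix is an assumption (random sparse, 700-by-900), with b taken as the row sums so that a vector of ones solves the system. The initial guess x0 is the seventh input, so the unused tolerance and preconditioner slots are passed as [].

```matlab
rng default
A = sprand(700, 900, 0.1);   % assumed example matrix
b = sum(A, 2);               % row sums, so a vector of ones solves A*x = b
maxit = 75;

% Default initial guess (all zeros); [] keeps the default tolerance.
[x1, flag1, relres1, iter1] = lsqr(A, b, [], maxit);

% Initial guess close to the expected solution.
x0 = 0.99 * ones(size(A, 2), 1);
[x2, flag2, relres2, iter2] = lsqr(A, b, [], maxit, [], [], x0);

fprintf('zero guess: flag %d after %d iterations\n', flag1, iter1);
fprintf('0.99 guess: flag %d after %d iterations\n', flag2, iter2);
```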

You also can use the initial guess to get intermediate results by calling lsqr in a for-loop. Each call to the solver performs a few iterations and stores the calculated solution. Then you use that solution as the initial vector for the next batch of iterations.
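A sketch of that loop follows, using the same kind of assumed random matrix. The batch size of 25 iterations and the number of batches are arbitrary choices for illustration.

```matlab
rng default
A = sprand(700, 900, 0.1);   % assumed example matrix
b = sum(A, 2);

x0       = zeros(size(A, 2), 1);
tol      = 1e-8;
maxit    = 25;               % iterations per batch (arbitrary)
nBatches = 4;

X = zeros(size(A, 2), nBatches);   % one column per intermediate solution
R = zeros(1, nBatches);            % relres reported after each batch
for k = 1:nBatches
    [x, ~, relres] = lsqr(A, b, tol, maxit, [], [], x0);
    X(:, k) = x;
    R(k)    = relres;
    x0      = x;                   % warm start for the next batch
end
disp(R)
```

Note that each call restarts the iteration from x0, so four batches of 25 are not necessarily equivalent to a single run of 100 iterations.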

Since this tridiagonal matrix has a special structure, you can represent the operation A*x with a function handle. When A multiplies a vector, most of the elements in the resulting vector are zeros. The nonzero elements in the result correspond with the nonzero tridiagonal elements of A.

Now, solve the linear system Ax=b by providing lsqr with the function handle that calculates A*x and A'*x. Use a tolerance of 1e-6 and 25 iterations. Specify b as the row sums of A so that the true solution for x is a vector of ones.
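As a rough illustration, the sketch below uses a made-up 21-by-21 tridiagonal matrix (constant entries -1, 10, -2, which are assumptions and not the matrix from the original example). The names Ax, Atx and ops are hypothetical helpers. A named function with an if/else on the second argument is the more common way to write afun; the anonymous form is used here only to keep the example in one script.

```matlab
n   = 21;
sub = -1;  d = 10;  sup = -2;   % assumed tridiagonal entries (illustrative)

% A*x and A'*x for the tridiagonal matrix, without ever forming A.
Ax  = @(x) sub*[0; x(1:n-1)] + d*x + sup*[x(2:n); 0];
Atx = @(x) sup*[0; x(1:n-1)] + d*x + sub*[x(2:n); 0];

% lsqr calls afun(x,'notransp') for A*x and afun(x,'transp') for A'*x.
ops  = {Ax, Atx};
afun = @(x, opt) feval(ops{1 + strcmp(opt, 'transp')}, x);

% b = A*ones(n,1) is the vector of row sums, so the true solution is ones.
b = afun(ones(n, 1), 'notransp');

[x, flag, relres, iter] = lsqr(afun, b, 1e-6, 25);
fprintf('flag = %d, relres = %.3g, iterations = %d\n', flag, relres, iter);
```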

Coefficient matrix, specified as a matrix or function handle. This matrix is the coefficient matrix in the linear system A*x = b. Generally, A is a large sparse matrix or a function handle that returns the product of a large sparse matrix and column vector.

You can optionally specify the coefficient matrix as a function handle instead of a matrix. The function handle returns matrix-vector products instead of forming the entire coefficient matrix, making the calculation more efficient.

To use a function handle, use the function signature function y = afun(x,opt). Parameterizing Functions explains how to provide additional parameters to the function afun, if necessary. The function afun must satisfy these conditions: afun(x,'notransp') returns the product A*x, and afun(x,'transp') returns the product A'*x.

Method tolerance, specified as a positive scalar. Use this input to trade off accuracy and runtime in the calculation. lsqr must meet the tolerance within the number of allowed iterations to be successful. A smaller value of tol means the answer must be more precise for the calculation to be successful.

Maximum number of iterations, specified as a positive scalar integer. Increase the value of maxit to allow more iterations for lsqr to meet the tolerance tol. Generally, a smaller value of tol means more iterations are required to successfully complete the calculation.

Preconditioner matrices, specified as separate arguments of matrices or function handles. You can specify a preconditioner matrix M or its matrix factors M = M1*M2 to improve the numerical aspects of the linear system and make it easier for lsqr to converge quickly. For square coefficient matrices, you can use the incomplete matrix factorization functions ilu and ichol to generate preconditioner matrices. You also can use equilibrate prior to factorization to improve the condition number of the coefficient matrix. For more information on preconditioners, see Iterative Methods for Linear Systems.

You can optionally specify any of M, M1, or M2 as function handles instead of matrices. The function handle performs matrix-vector operations instead of forming the entire preconditioner matrix, making the calculation more efficient.
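For a rectangular coefficient matrix, one simple choice is a diagonal preconditioner built from the column norms. The sketch below is only an illustration under stated assumptions: the 37-by-36 matrix is made up (a well-conditioned matrix whose columns are then scaled over four orders of magnitude), and column scaling is a swapped-in technique, not something the text above prescribes. M must be n-by-n, where n is the number of columns of A.

```matlab
% Assumed example: well-conditioned 37-by-36 matrix with badly scaled columns.
A0 = [5*eye(36) + diag(ones(35,1), 1); ones(1, 36)];
A  = sparse(A0 * diag(logspace(0, 4, 36)));
b  = A * ones(36, 1);            % consistent system with a known solution

tol = 1e-6;  maxit = 36;

% Without a preconditioner.
[~, flagPlain, ~, iterPlain] = lsqr(A, b, tol, maxit);

% Diagonal preconditioner built from the column norms of A.
colNorms = full(sqrt(sum(A.^2, 1)))';
M = spdiags(colNorms, 0, 36, 36);
[xM, flagM, relresM, iterM] = lsqr(A, b, tol, maxit, M);

fprintf('plain:          flag %d, %d iterations\n', flagPlain, iterPlain);
fprintf('preconditioned: flag %d, %d iterations\n', flagM, iterM);
```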

To use a function handle, first create a function with the signature function y = mfun(x,opt). Parameterizing Functions explains how to provide additional parameters to the function mfun, if necessary. The function mfun must satisfy these conditions: mfun(x,'notransp') returns the value of M\x or M2\(M1\x), and mfun(x,'transp') returns the value of M'\x or M1'\(M2'\x).

Initial guess, specified as a column vector with length equal to size(A,2). If you can provide lsqr with a more reasonable initial guess x0 than the default vector of zeros, then it can save computation time and help the algorithm converge faster.

Convergence flag, returned as one of the scalar values in this table. The convergence flag indicates whether the calculation was successful and differentiates between several different forms of failure.

Relative residual error, returned as a scalar. The relative residual error is an indication of how accurate the returned answer x is. lsqr tracks the relative residual and least-squares residual at each iteration in the solution process, and the algorithm converges when either residual meets the specified tolerance tol. The relres output contains the value of the residual that converged, either the relative residual or the least-squares residual:

The relative residual error is equal to norm(b-A*x)/norm(b) and is generally the residual that meets the tolerance tol when lsqr converges. The resvec output tracks the history of this residual over all iterations.

Residual error, returned as a vector. The residual error norm(b-A*x) reveals how close the algorithm is to converging for a given value of x. The number of elements in resvec is equal to the number of iterations. You can examine the contents of resvec to help decide whether to change the values of tol or maxit.

Convergence of most iterative methods depends on the condition number of the coefficient matrix, cond(A). When A is square, you can use equilibrate to improve its condition number, and on its own this makes it easier for most iterative solvers to converge. However, using equilibrate also leads to better quality preconditioner matrices when you subsequently factor the equilibrated matrix B = R*P*A*C.
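A small sketch of the equilibration step is given below. The matrix is an assumed example (a tridiagonal matrix with deliberately bad row and column scaling), and equilibrate needs R2019a or later. The right-hand side gets the same row operations, and the solution of the equilibrated system is mapped back with C. Backslash is used for the solve here only to keep the sketch short; an iterative solver can be applied to B*y = d in the same way.

```matlab
% Assumed example: square sparse matrix with badly scaled rows and columns.
n  = 50;
A0 = spdiags([-ones(n,1), 4*ones(n,1), -ones(n,1)], -1:1, n, n);
Dr = spdiags(logspace(0, 3, n)', 0, n, n);    % row scaling
Dc = spdiags(logspace(-2, 2, n)', 0, n, n);   % column scaling
A  = Dr * A0 * Dc;
b  = A * ones(n, 1);

[P, R, C] = equilibrate(A);
B = R*P*A*C;      % equilibrated matrix
d = R*P*b;        % apply the same row permutation and scaling to b

fprintf('condest(A) = %.3g, condest(B) = %.3g\n', condest(A), condest(B));

% Solve B*y = d, then recover the solution of A*x = b.
y = B \ d;
x = C*y;
```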

You can use matrix reordering functions such as dissect and symrcm to permute the rows and columns of the coefficient matrix and minimize the number of nonzeros when the coefficient matrix is factored to generate a preconditioner. This can reduce the memory and time required to subsequently solve the preconditioned linear system.
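The effect of reordering on fill-in can be checked with a short experiment such as the following. The matrix is an assumed example (a 2-D Laplacian whose rows and columns are deliberately scrambled), and a complete Cholesky factorization is used instead of an incomplete one only because its nonzero count makes the fill-in easy to see.

```matlab
% Assumed example: 2-D Laplacian on an 18-by-18 interior grid, scrambled so
% that the starting ordering has a large bandwidth.
A = delsq(numgrid('S', 20));
n = size(A, 1);                  % 324 unknowns
q = mod((0:n-1) * 7, n) + 1;     % deterministic scramble (7 and 324 coprime)
A = A(q, q);

p = symrcm(A);                   % reverse Cuthill-McKee ordering

nnzOriginal  = nnz(chol(A));
nnzReordered = nnz(chol(A(p, p)));
fprintf('nnz in Cholesky factor: %d before, %d after symrcm\n', ...
        nnzOriginal, nnzReordered);
```

If the reordered matrix A(p,p) is used in the solve, the right-hand side has to be permuted as b(p), and the computed solution has to be mapped back to the original ordering.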

All the algorithms except lsqlin active-set are large-scale; see Large-Scale vs. Medium-Scale Algorithms. For a general survey of nonlinear least-squares methods, see Dennis [8]. Specific details on the Levenberg-Marquardt method can be found in Moré [28].

For linear least squares without constraints, the problem is to come up with a least-squares solution to the problem Cx = d. You can solve this problem with mldivide or lsqminnorm. When the problem has linear or bound constraints, use lsqlin. For general nonlinear constraints, use lsqnonlin.
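The unconstrained and bound-constrained cases might be compared as in this sketch. The data are random stand-ins (an assumption), the bounds 0 <= x <= 1 are chosen only for illustration, and lsqlin requires Optimization Toolbox.

```matlab
% Stand-in data for C*x = d (37 equations, 36 unknowns); values are assumed.
rng(0);
C = randn(37, 36);
d = randn(37, 1);

x_unc = C \ d;               % unconstrained least squares (mldivide)
x_min = lsqminnorm(C, d);    % differs from x_unc only if C is rank deficient

% Bound-constrained least squares with 0 <= x <= 1.
lb   = zeros(36, 1);
ub   = ones(36, 1);
opts = optimoptions('lsqlin', 'Display', 'off');
x_bnd = lsqlin(C, d, [], [], [], [], lb, ub, [], opts);

fprintf('residual norm: %.4g unconstrained, %.4g with bounds\n', ...
        norm(C*x_unc - d), norm(C*x_bnd - d));
```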

The lsqlin 'interior-point' algorithm uses the quadprog 'interior-point-convex' algorithm, and the lsqlin 'active-set' algorithm uses the quadprog 'active-set' algorithm. The quadprog problem definition is to minimize a quadratic function, (1/2)*x'*H*x + c'*x, subject to linear constraints and bound constraints.
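Written out, the connection between the two problems is short. The symbols H and c below are the quadprog Hessian and linear term, and C and d are the lsqlin data. Expanding the squared norm gives:

```latex
\min_x \; \tfrac{1}{2} x^{T} H x + c^{T} x
\qquad \text{(quadprog objective)}

\tfrac{1}{2}\,\lVert Cx - d \rVert_2^{2}
  = \tfrac{1}{2}\, x^{T} C^{T} C\, x \;-\; d^{T} C\, x \;+\; \tfrac{1}{2}\, d^{T} d
\qquad \text{(lsqlin objective)}

\Longrightarrow \quad H = C^{T} C, \qquad c = -\,C^{T} d
```

The constant term (1/2) d'd does not depend on x, so it has no effect on where the minimum is.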

The quadprog 'interior-point-convex' algorithm has two code paths. It takes one when the Hessian matrix H is an ordinary (full) matrix of doubles, and it takes the other when H is a sparse matrix. For details of the sparse data type, see Sparse Matrices. Generally, the algorithm is faster for large problems that have relatively few nonzero terms when you specify H as sparse. Similarly, the algorithm is faster for small or relatively dense problems when you specify H as full.

To understand the trust-region approach to optimization, consider the unconstrained minimization problem, minimize f(x), where the function takes vector arguments and returns scalars. Suppose you are at a point x in n-space and you want to improve, i.e., move to a point with a lower function value. The basic idea is to approximate f with a simpler function q, which reasonably reflects the behavior of function f in a neighborhood N around the point x. This neighborhood is the trust region. A trial step s is computed by minimizing (or approximately minimizing) q over N. This is the trust-region subproblem: minimize q(s) subject to s in N.