f (x) = ||R(x)||2 = R(x)TR(x)= [R„(x)]2, R(x) e IRN. (A.3.1)

This structure may arise either from a nonlinear over-determined system of equations

or from a data-fitting problem, e.g., the one described in Chap. 4 [formula (4.1.11)] with N given data points (tV, YV) and variances aV, a model function F(t, x), and n adjustable parameters x:

Rv := Rv(x) = Yv - Fv(x) = [Y„ - F(tv, x)]. (A.3.3)

The weights wV are related to the variances aV by

Traditionally, the weights are scaled to a variance of unit weights. The factor P is chosen so as to make the weights come out in a convenient range. Sometimes, if variances are not known, they may be estimated by other considerations as described in Sect. In short vector notation we get

R := Y - F(x) = [R1(x),..., Rn (x)]T , F(x), Y e JRN. (A.3.5)

Our least-squares problem requires us to provide the following input:

7 The minimization of this functional, i.e., the minimization of the sum of weighted quadratic residuals, under the assumption that the statistical errors follow a GauBian distribution with variances as in (A.3.4), provides a

3. variances associated with the data; and

4. measure of goodness of fit, e.g., the Euclidean norm.

In many practical applications, unfortunately, less attention is paid to the third point. It is also very important to point out that the fourth point requires preinformation related to the problem and statistical properties of the data.

Before we treat the general case of nonlinear models in Appendix A.3.3, we discuss the linear case first, i.e., F(x) = Ax with a constant matrix A.

A.3.1 Linear Case: Normal Equations

With y =VWy and Ax =VWA the weighted residual vector

is linear in x and leads to the linear least-squares problem min ||y - Ax||2, x e JRn, y e JRN, A e m(n, n), (A.3.7)

with a constant matrix A. It can be shown that the linear least-squares problem has at least one solution x*. The solution may not be unique. If x'+ denotes another solution, the relation Ax' = Ax* holds. All solutions of (A.3.7) are solution of the normal equations

and vice versa: All solutions of the normal equations are solutions of (A.3.7). Thus, the normal equations represent the necessary and sufficient conditions for the existence and determination of the least-squares solution.

The modulus R* = |R* | of the residual vector R at the solution is uniquely determined by min ||y - Ax||2 = R* = |R*|, R* = y - Ax*• (A.3.9)

However, only if A has full rank, the problem (A.3.7) has a unique solution and there exists a unique solution for R* which can be obtained, for instance, as the solution of the linear system of equations

In this case the symmetric matrix ATA has full rank rank ATA = n.

From a numerical point of view the normal equation approach for solving least-squares problems should, if used at all, be applied with great caution for the following reasons:

• the computation of ATA involves the evaluation of scalar products (numerical analysts usually try to avoid the evaluation of scalar products because this often leads to loss of significant digits due to adding positive and negative terms of similar size); and

• strong propagation of errors on the right-hand side term ATy may occur when solving (A.3.8) as the propagation depends on the condition number k2(AtA) = K2(A).

As shown in Deuflhard & Hohmann (1993, p. 74), during solving least-squares problems, for large residual problems the error propagation related to perturbations of the matrix A is approximately given by (A) while for small residual problems it is better approximated by k2(A). In that case solving (A.3.7) by the normal equation approach and, for instance, the Cholesky-Banachchiewicz algorithm ,is not recommended. Note that in practical problems A itself may already be badly conditioned. Condition numbers of 102-103 are common which leads to k|(A) of the order of 104-106. Therefore, it is recommended to use methods which use only the matrix A itself. Such an algorithm is described in Sect. A.3.2.

A.3.2 The Linear Case: An Orthogonalization Method

Numerical problems due to bad condition numbers of A can be limited if the least-squares problem is solved using only A. Orthogonalization methods involve orthogonal transformations P to solve linear least-squares problems. Orthogonal transformations leave the Euclidean norm of matrices invariant and lead to stable linear least-squares solvers. Householder transformations are special types of orthogonal transformations. The matrix A is transformed in such a way that

1. the solution of the problem is not changed;

2. the condition number k2(PA) of the transformed matrix is not larger than k2(A); and

3. the transformed matrix PA has a triangular structure very suitable for numerical computations.

Let Pk, k e IN, be a sequence of Householder transformations. Householder transformations are special matrices of the form 8

8 Although for our current case it is not necessary to use complex vectors, we describe the general case, for reasons of consistency, with most of the mathematical literature. Unitary matrices in complex vector spaces correspond to orthogonal matrices in real vector spaces.

where 1 is the (n, n) identity matrix and v is an arbitrary n-dimensional vector. Householder transformations x ^ Px are reflections of the vector space Cn with respect to the orthogonal complement

They have the property

where PH is the conjugate transpose, i.e., that P is unitary. Unitary matrices U are known to conserve the norm of vectors x and linear operators A, i.e., l|Ux||2 = ||X||2 ,

The matrix

is unitary as well, as it is a product of unitary matrices. The vector w is chosen such that P maps a given vector x = (x1,..., xn )T with9 x1 = 0 onto a multiple of the first unit vector e1 = (1, 0,..., 0)T, i.e.,

For a given vector x = 0 this implies the following formulas for computing the associated Householder transformation (for x = 0 there is nothing to do):

0 = a (a + |X11), e'a = X1/ |X1|, k = -a ei u = x — k e1, P = 11 - 0-1uuH.

If the matrix A has n linearly independent columns a1,..., an, the matrix A and the vector y in the linear least-squares problem (A.3.7) are finally transformed by P into a upper triangular matrix A

9 If x = 0 we can, by appropriate permutations, always achieve x1 = 0. For x = 0 there is nothing to do.

sii Sin

h1 h2

As usually in numerical computations, for accuracy and stability, P is not computed via the matrix multiplication (A.3.16) but is established by successive Householder transformations, i.e., successive modifications of A. The original problem (A.3.7) now takes the form min ||y — Ax||2 = min ||P (y — Ax)||2 = min

As we are using the Euclidean norm and as P is unitary we get min ||y — Ax||2 = min ||Sx — h^2 + ||h2 N2 •

Because h2 is a constant vector, ||y — Ax||2 takes its minimum when the unknown vector x is the solution of the linear system of equations

Thus, the solution of Sx = h1 solves our problem (A.3.21). The upper triangular matrix S has a unique inverse matrix if and only if Sii = 0 for all i .As P is regular, regularity of S is equivalent to regularity of A.

Telescopes Mastery

Telescopes Mastery

Through this ebook, you are going to learn what you will need to know all about the telescopes that can provide a fun and rewarding hobby for you and your family!

Get My Free Ebook

Post a comment