Conditioning and Stability

Deepayan Sarkar

Condition of a problem

  • Abstract problem: compute \(f: X \to Y\)

  • \(X\) and \(Y\) are normed vector spaces, usually \(\mathbb{R}^k\) for some \(k\)

  • \(f\) is referred to as the “problem”, and is usually continuous

  • We are interested in the behaviour of the problem at a particular “instance” \(x \in X\)

  • A problem instance \(f(x)\) is

    • well-conditioned if small perturbations in \(x\) lead to only small changes in \(f(x)\)

    • ill-conditioned if small perturbations in \(x\) can lead to large changes in \(f(x)\)

  • Depending on context, “small” and “large” may be either absolute or relative change

Absolute condition number

  • Consider a small perturbation \(\delta x\) in \(x\)

  • Define the change in \(f\) to be \(\delta f = f(x + \delta x) - f(x)\)

  • The absolute condition number \(\hat{\kappa} = \hat{\kappa}(x)\) of the problem \(f\) at \(x\) is

\[ \hat{\kappa}(x) = \lim_{h \to 0} \sup_{\lVert \delta x \rVert \leq h} \frac{\lVert \delta f \rVert}{\lVert \delta x \rVert} \]

  • For readability, this is often written informally as (implicitly assuming \(\delta x\) is infinitesimally small)

\[ \hat{\kappa}(x) = \sup_{\delta x} \frac{\lVert \delta f \rVert}{\lVert \delta x \rVert} \]

  • If \(f: \mathbb{R}\to \mathbb{R}\) is differentiable, it is easy to see that \(\hat{\kappa}(x) = \lvert f^{\prime}(x) \rvert\)

Absolute condition number

  • More generally, if \(f: \mathbb{R}^k \to \mathbb{R}\) is differentiable, and \(J(x)\) is the Jacobian matrix of partial derivatives of \(f\) at \(x\), then

\[ \hat{\kappa}(x) = \lVert J(x) \rVert \]

  • Here \(\lVert J(x) \rVert\) represents a “matrix norm” induced by a vector norm (on \(\mathbb{R}^k\))

  • Definition: For \(A_{m \times n}\), the matrix norm induced by vector norms on \(\mathbb{R}^m\) and \(\mathbb{R}^n\) is

\[ \lVert A \rVert = \sup \left\lbrace \frac{\lVert A x \rVert}{\lVert x \rVert} : x \in \mathbb{R}^n, x \neq 0 \right\rbrace \]

  • Note that here, the first-order Taylor series expansion of \(f\) gives

\[ \delta f = f(x + \delta x) - f(x) \approx J(x) \delta x \implies \sup_{\delta x} \frac{\lVert \delta f \rVert}{\lVert \delta x \rVert} \approx \sup_{\delta x} \frac{\lVert J(x) \delta x \rVert}{\lVert \delta x \rVert} \]

  • Exercise: Show that \(\hat{\kappa} = \lVert J(x) \rVert\)
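  • For the common norms these are directly computable; e.g., in R, norm() gives the matrix norms induced by the \(L_1\), \(L_2\), and \(L_{\infty}\) vector norms (a small illustration, using a matrix that reappears below):

    A <- rbind(c(1, -1))    # a 1 x 2 matrix
    norm(A, type = "O")     # induced by L1: max absolute column sum = 1
    norm(A, type = "I")     # induced by L-infinity: max absolute row sum = 2
    norm(A, type = "2")     # induced by L2: largest singular value = sqrt(2)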

Relative condition number

  • The nature of floating point computations makes it more important to study relative changes

  • The relative condition number \(\kappa = \kappa(x)\) of \(f\) at \(x\) is

\[ \kappa(x) = \lim_{h \to 0} \sup_{\lVert \delta x \rVert\leq h} \left( \frac{\lVert \delta f \rVert}{\lVert f(x) \rVert} \middle/ \frac{\lVert \delta x \rVert}{\lVert x \rVert} \right) = \frac{\lVert x \rVert}{\lVert f(x) \rVert} \lim_{h \to 0} \sup_{\lVert \delta x \rVert \leq h} \frac{\lVert \delta f \rVert}{\lVert \delta x \rVert} \]

  • If \(f\) is differentiable, we get

\[ \kappa(x) = \frac{\lVert J(x) \rVert \cdot \lVert x \rVert }{\lVert f(x) \rVert}, \quad \text{which for scalar } f \text{ reduces to } \left\lvert \frac{ f^{\prime}(x)\, x }{f(x)} \right\rvert \]

  • A problem \(f\) is well-conditioned if \(\kappa\) is small (e.g., \(1, 10, 10^2\)) and ill-conditioned if \(\kappa\) is large (e.g., \(10^6\), …)
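  • For scalar \(f\), \(\kappa(x)\) can be estimated numerically using a small but finite \(\delta x\); a minimal sketch in R (the helper name rel.condition is ours):

    rel.condition <- function(f, x, h = 1e-8) {
        dx <- h * x                              # small relative perturbation
        abs((f(x + dx) - f(x)) / f(x)) / abs(dx / x)
    }
    rel.condition(sqrt, x = 1e6)                 # ~0.5, as derived next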

Examples

  • \(f(x) = \sqrt{x}, x \geq 0\)

    • \(f'(x) = \frac{1}{2} x^{-\frac{1}{2}}\)

    • So the condition of \(f\) at \(x\) is \[ \left\lvert \frac{f'(x) x}{f(x)} \right\rvert = \frac{1}{2} \frac{x^{-\frac{1}{2}}}{x^\frac{1}{2}} x = \frac{1}{2} \]

    • So \(f\) is well-conditioned for all \(x\).

  • \(f(x) = x^\alpha\)

    • Exercise: Condition of \(f\) is \(|\alpha|\) at all \(x\)
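  • A quick numerical check using the rel.condition() helper from above:

    rel.condition(function(x) x^(-2.5), x = 7)   # ~2.5 = |alpha|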

Examples

  • \(f(x) = \frac{1}{1-x^2}\)

    • \(f'(x) = 2x (1-x^2)^{-2}\)

    • So condition of \(f\) at \(x\) is \[ \left\lvert \frac{f'(x) x}{f(x)} \right\rvert = \lvert 2x (1-x^2)^{-2} x (1-x^2) \rvert = \frac{2x^2}{|1-x^2|} \]

    • Can be large for \(x\) close to \(\pm1\).
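  • Checking numerically with rel.condition():

    rel.condition(function(x) 1 / (1 - x^2), x = 0.999)   # ~998.5 = 2 x^2 / |1 - x^2|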

Examples

  • \(f(x_1, x_2) = x_1 - x_2\)

    • The Jacobian of \(f\) is \(J = \begin{bmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 1 & -1 \end{bmatrix}\)

    • So \(\kappa = \frac{\lVert J \rVert \cdot \lVert x \rVert}{\lvert x_1 - x_2 \rvert}\)

    • What is \(\lVert x \rVert\)? Common choices are

      • \(L_1\): \(\lvert x_1 \rvert + \lvert x_2 \rvert\)

      • \(L_2\): \(\sqrt{x_1^2 + x_2^2}\)

      • \(L_{\infty}\): \(\max \{ \lvert x_1 \rvert, \lvert x_2 \rvert \}\)

  • What is \(\lVert J \rVert\)? It depends on the choice of vector norm, but is some constant \(c\) for this \(J\) whatever the choice (e.g., \(c = 2\) under \(L_{\infty}\))

    • So \(\kappa\) is, with the \(L_{\infty}\) norm, \(\kappa = \frac{c \max\{ \lvert x_1 \rvert, \lvert x_2 \rvert \}}{\lvert x_1 - x_2 \rvert}\)

    • Ill-conditioned when \(x_1 \approx x_2\)
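  • A numerical illustration of this in R: the stored inputs carry relative rounding errors of order \(\epsilon_m\), which the subtraction amplifies enormously:

    x1 <- 1 + 1e-12
    x2 <- 1
    print(x1 - x2, digits = 17)   # 1.000088900582341e-12, not 1e-12: the
                                  # ~1e-16 relative rounding in x1 becomes a
                                  # ~9e-5 relative error in x1 - x2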

Examples

  • Roots of polynomials: e.g., \(ax^2 + bx + c = 0\)

\[f(a, b, c) = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\]

  • Exercise: Show that for \(x^2 - 2x + 1 = (x-1)^2 = 0\), \(f(1, -2, 1)\) has \(\kappa = \infty\)

  • Hint: try perturbing one coefficient at a time

  • Graphical demonstration:

Examples

[Plot: graphical demonstration of how the double root of \((x-1)^2 = 0\) moves under small perturbations of the coefficients]
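  • The effect is easy to reproduce with polyroot(), e.g., by perturbing only the constant coefficient (the perturbation sizes below are arbitrary choices):

    eps <- 10^-(10:4)                             # perturbations of the constant term
    shift <- sapply(eps, function(e)
        max(abs(polyroot(c(1 + e, -2, 1)) - 1)))  # how far the roots move from 1
    cbind(eps, root.shift = shift)                # shift = sqrt(eps): the relative
                                                  # change blows up as eps -> 0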

Formal model for floating point arithmetic

  • Recall that floating point numbers are represented as

\[ \text{significand} \times \text{base}^{\text{exponent}} \]

  • Ignoring the limitations imposed by the finite range of the exponent, define

\[ \mathbb{F}= \{ 0 \} \cup \left\{ \pm \frac{m}{2^t} \times 2^e : e \in \mathbb{Z} \text{ and $m$ integer with } 1 \leq m \leq 2^t \right\} \]

  • Here the integer \(t\) is the precision of the representation (usually 24 or 53)

  • \(e\) can be an arbitrary integer, so there is no “overflow” or “underflow” (\(\mathbb{F}= 2 \mathbb{F}\))

  • This is still a useful formal model for the subset of \(\mathbb{R}\) that has a floating point representation

  • For example, with \(t=53\),

\[\begin{eqnarray*}
\mathbb{F}\cap [1,2] &=& \{ 1,\ 1 + 2^{-52},\ 1 + 2 \times 2^{-52},\ 1 + 3 \times 2^{-52},\ \ldots,\ 2 \}, \\
\mathbb{F}\cap [2,4] &=& \{ 2,\ 2 + 2^{-51},\ 2 + 2 \times 2^{-51},\ 2 + 3 \times 2^{-51},\ \ldots,\ 4 \}, \quad \text{etc.}
\end{eqnarray*}\]

Machine epsilon

  • The resolution of \(\mathbb{F}\) is quantified by a number known as machine epsilon, \(\epsilon_m\)

  • Let us tentatively define \(\epsilon_m\) to be half the distance between \(1\) and the next larger number in \(\mathbb{F}\)

  • Clearly, \(\epsilon_m = \frac12 \times (0.000\cdots0001)_2 = \frac12 \times 2^{-(t-1)} = 2^{-t}\), and it has the following property:

For all \(x \in \mathbb{R}\), there exists \(x^* \in \mathbb{F}\) such that \(\lvert x - x^* \rvert \leq \epsilon_m \cdot \lvert x \rvert\)

  • For \(t = 24\) (Float32), \(\epsilon_m = 2^{-24} \approx 6 \times 10^{-8}\)

  • For \(t = 53\) (Float64), \(\epsilon_m = 2^{-53} \approx 1.1 \times 10^{-16}\)
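  • In R, .Machine$double.eps is the full spacing \(2^{-52}\) between \(1\) and the next larger double, i.e., \(2 \epsilon_m\) under the definition above:

    .Machine$double.eps    # 2^-52 ~ 2.22e-16
    (1 + 2^-53) == 1       # TRUE: 2^-53 is below the resolution at 1
    (1 + 2^-52) == 1       # FALSE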

Machine epsilon

  • For any \(x \in \mathbb{R}\), define \(\textrm{fl}(x)\) to be the element in \(\mathbb{F}\) closest to \(x\)

  • Then, a restatement of the above property is

For all \(x \in \mathbb{R}\), there exists \(\epsilon\) with \(\lvert \epsilon \rvert \leq \epsilon_m\) such that \(\textrm{fl}(x) = x (1 + \epsilon)\)

  • In other words, the relative approximation error of any real number is bounded by \(\epsilon_m\)
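  • For example, \(0.1\) has no exact binary representation, so R can only store \(\textrm{fl}(0.1)\):

    print(0.1, digits = 20)   # 0.10000000000000000555...: the relative
                              # representation error is ~5.6e-17 < eps_m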

Arithmetic of floating point numbers

  • Consider the elementary arithmetic operations \(+, -, \times, \div\)

  • How should we expect these to behave on \(\mathbb{F}\)?

  • Let \(*\) denote one of these elementary operations, and \(\circledast\) denote the corresponding operation on \(\mathbb{F}\)

  • Then we would ideally want, for \(x, y \in \mathbb{F}\),

\[x \circledast y = \textrm{fl}( x * y)\]

  • If this is indeed true, then we have the Fundamental axiom of floating point arithmetic:

For all \(x, y \in \mathbb{F}\), there exists \(\epsilon\) with \(\lvert \epsilon \rvert \leq \epsilon_m\) such that \(x \circledast y = (x * y) (1 + \epsilon)\)

  • In practice, this may not hold for the theoretical \(\epsilon_m\), but only for some larger value

  • The smallest \(\epsilon_m\) for which this is guaranteed (on a given machine) is defined to be the machine epsilon
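  • A familiar consequence in R:

    0.1 + 0.2 == 0.3     # FALSE: fl(0.1), fl(0.2), fl(0.3), and the addition
                         # each involve their own rounding
    (0.1 + 0.2) - 0.3    # ~5.6e-17, a small multiple of eps_m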

Algorithms and stability

  • Suppose we want to solve a problem \(f: X \to Y\)

  • There can be multiple algorithms to calculate a candidate solution

  • Let \(\tilde{f}: X \to Y\) be the actual implementation of an algorithm to solve \(f\)

  • At a minimum, this will involve the approximation of \(x\) by \(\textrm{fl}(x)\)

  • In practice, suppose we want to calculate \(f(x)\), and actually compute \(\tilde{f}(x)\)

  • The relative error is

\[ \frac{ \lVert \tilde{f}(x) - f(x) \rVert}{\lVert f(x) \rVert} \]

Algorithms and stability

  • Recall that \(\textrm{fl}(x) = x (1 + \epsilon)\) with \(\lvert \epsilon \rvert \leq \epsilon_m\), so that \(\frac{\lVert \textrm{fl}(x) - x\rVert}{\lVert x \rVert} \leq \epsilon_m\)

  • If \(\kappa= \kappa(x)\) is the relative condition number of \(f(x)\), we expect (note: for \(f\), not \(\tilde{f}\))

\[ \frac{ \lVert f(\textrm{fl}(x)) - f(x) \rVert}{\lVert f(x) \rVert} \approx \kappa \frac{\lVert \textrm{fl}(x) - x\rVert}{\lVert x \rVert} \approx \kappa \epsilon_m \]

  • This is the best we can hope for with \(\tilde{f}\) instead of \(f\)

\[ \frac{ \lVert \tilde{f}(x) - f(x) \rVert}{\lVert f(x) \rVert} \approx \kappa \epsilon_m \]

  • Informally, an algorithm \(\tilde{f}\) is unstable if this fails to hold, i.e., if its relative error is much larger than \(\kappa \epsilon_m\)

Instability

  • Instability arises due to ill-conditioned intermediate steps in an algorithm \(\tilde{f}\)

  • The basic idea is to compare the (inherent) condition of \(f(x)\) with the conditions of intermediate steps

  • Badly conditioned intermediate steps make the process unstable.

Instability: a toy example

  • To make the idea concrete, consider the problem: \(f(x) = \sqrt{x+1} - \sqrt{x}, x > 0\)

  • It is easily seen that the condition of \(f\) at \(x\) is \(\frac{1}{2} \frac{x}{\sqrt{x+1} \sqrt{x}} \approx \frac{1}{2}\) when \(x\) is large

  • A possible algorithm \(\tilde{f}\), directly using the definition, will proceed as follows

    • \(x_0 = x\)

    • \(x_1 = x_0 + 1\)

    • \(x_2 = \sqrt{x_1}\)

    • \(x_3 = \sqrt{x_0}\)

    • \(x_4 = x_2 - x_3\)

Instability: a toy example

  • In general, suppose \(y = \tilde{f}(x)\) is computed in \(n\) steps

  • Let \(x_i\) be the output of the \(i\)th step (define \(x_0 = x\))

  • Then \(y = \tilde{f}(x) = x_n\) can also be viewed as a function of each of the intermediate \(x_i\)s

  • Denote the \(i\)th such function by \(\tilde{f_i}\), such that \(y = \tilde{f_i} (x_i)\)

  • In particular, \(\tilde{f}_0 = \tilde{f}\)

  • Then the instability in the total computation is dominated by the most ill-conditioned \(\tilde{f_i}\)

Instability: a toy example

  • For the \(\tilde{f}\) given above, we have

    • \(\tilde{f}(t) = \sqrt{t+1} - \sqrt{t}\)

    • \(x_0 = x \implies \tilde{f_0}(t) = \sqrt{t+1} - \sqrt{t}\)

    • \(x_1 = x_0 + 1 \implies \tilde{f_1}(t) = \sqrt{t} - \sqrt{x_0}\)

    • \(x_2 = \sqrt{x_1} \implies \tilde{f_2}(t) = t - \sqrt{x_0}\)

    • \(x_3 = \sqrt{x_0} \implies \tilde{f_3}(t) = x_2 - t\)

Instability: a toy example

  • Consider the condition of \(\tilde{f_3}(t) = x_2 - t\), which is (treating \(x_2\) as fixed)

\[ \left\lvert \frac{\tilde{f_3}^\prime(t) \, t}{\tilde{f_3}(t)} \right\rvert = \left\lvert \frac{t}{x_2 - t} \right\rvert \]

  • This can be arbitrarily large for large \(x\): at \(t = x_3 = \sqrt{x}\) it is roughly \(2x\); e.g., for \(x = 10, 10^2, 10^3, 10^4\),

[1]    20.48809   200.49876  2000.49988 20000.49999
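  • One way to compute the values shown above, in R:

    x <- 10^(1:4)
    sqrt(x) / (sqrt(x + 1) - sqrt(x))   # the condition |t / (x2 - t)| at t = sqrt(x)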


  • Here \(x_2\) and \(t\) are related, but the condition number is w.r.t. perturbations in \(t\) keeping \(x_2\) fixed

Instability: a toy example

  • An alternative formula for \(f\) is \(f(x) = \frac{1}{\sqrt{x+1} + \sqrt{x}}\)

  • An algorithm based on this formula would proceed as

    • \(x_0 = x \implies \tilde{f_0}(t) = \frac{1}{\sqrt{t+1} + \sqrt{t}}\)

    • \(x_1 = x_0 + 1 \implies \tilde{f_1}(t) = \frac{1}{\sqrt{t} + \sqrt{x_0}}\)

    • \(x_2 = \sqrt{x_1} \implies \tilde{f_2}(t) = \frac{1}{t + \sqrt{x_0}}\)

    • \(x_3 = \sqrt{x_0} \implies \tilde{f_3}(t) = \frac{1}{x_2 + t}\)

    • \(x_4 = x_2 + x_3 \implies \tilde{f_4}(t) = \frac{1}{t}\)

    • \(x_5 = 1 / x_4\)

  • Exercise: All these have good condition when \(t\) is large
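  • A direct comparison of the two algorithms in R (a minimal sketch; the function names are ours):

    f.naive  <- function(x) sqrt(x + 1) - sqrt(x)        # first algorithm
    f.stable <- function(x) 1 / (sqrt(x + 1) + sqrt(x))  # alternative algorithm
    x <- 10^(10:15)
    cbind(x, naive = f.naive(x), stable = f.stable(x),
          rel.diff = abs(f.naive(x) - f.stable(x)) / f.stable(x))
    # the disagreement grows roughly like 2 * x * eps_m,
    # reaching order 0.1 by x = 1e15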