Krylov subspace methods are powerful tools for solving large-scale linear systems and eigenvalue problems. These iterative techniques use successive matrix-vector multiplications to build subspaces, making them efficient for sparse matrices common in scientific computing.
The methods leverage properties of Krylov subspaces to approximate solutions, offering advantages over direct methods for large problems. Key algorithms include Conjugate Gradient, GMRES, and Arnoldi iteration, each tailored to specific problem types and matrix properties.
Fundamentals of Krylov subspaces
- Krylov subspace methods form a crucial part of Numerical Analysis II, providing efficient techniques for solving large-scale linear systems and eigenvalue problems
- These methods leverage the properties of Krylov subspaces to iteratively approximate solutions, making them particularly useful for sparse matrices encountered in various scientific and engineering applications
Definition and properties
- Krylov subspace defined as K_k(A, b) = span{b, Ab, A^2 b, ..., A^(k-1) b}, where A is a square matrix and b is a vector
- Dimension of Krylov subspace increases with each iteration until it reaches the degree of the minimal polynomial of A with respect to b
- Krylov subspaces exhibit the nested property K_1(A, b) ⊆ K_2(A, b) ⊆ ... ⊆ K_k(A, b)
- Each K_k is closed under scalar multiplication and vector addition, as required of any subspace
Basis vectors
- Krylov subspace basis vectors generated by successive matrix-vector multiplications
- Raw basis vectors b, Ab, A^2 b, ... quickly become nearly linearly dependent (they align with the dominant eigendirection), and rounding errors in finite precision arithmetic make this worse
- Orthogonalization techniques (Gram-Schmidt, Arnoldi) used to maintain numerical stability
- Modified Gram-Schmidt process often preferred for improved stability in practice
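As a quick illustration of why orthogonalization matters, the following sketch (assuming NumPy; the random matrix A and vector b are just stand-ins) builds the raw power basis b, Ab, ..., A^(k-1) b and compares its conditioning with an orthonormalized basis for the same subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 12
A = rng.standard_normal((n, n))   # illustrative dense matrix
b = rng.standard_normal(n)

# Raw Krylov matrix [b, Ab, ..., A^(k-1) b]: columns align with the dominant
# eigendirection, so the basis rapidly loses numerical linear independence.
K = np.empty((n, k))
K[:, 0] = b
for j in range(1, k):
    K[:, j] = A @ K[:, j - 1]

print("cond of raw power basis:      ", np.linalg.cond(K))
# Orthonormal basis for the same subspace (QR = Gram-Schmidt in exact arithmetic)
Q, _ = np.linalg.qr(K)
print("cond after orthogonalization: ", np.linalg.cond(Q))
```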
Relation to matrix powers
- Krylov subspaces intimately connected to powers of the matrix A
- Each new basis vector represents a higher power of A applied to the initial vector b
- Matrix powers capture information about eigenvalues and eigenvectors of A
- Maximum dimension of the Krylov subspace equals the degree of the minimal polynomial of A with respect to b, which by the Cayley-Hamilton theorem is at most the degree n of the characteristic polynomial
Krylov subspace methods overview
- Krylov subspace methods encompass a family of iterative algorithms for solving large-scale linear systems and eigenvalue problems
- These methods have revolutionized numerical linear algebra by providing efficient solutions for problems that were previously intractable using direct methods
Iterative linear solvers
- Krylov methods solve Ax = b by constructing approximate solutions in Krylov subspaces
- Conjugate Gradient (CG) method for symmetric positive definite matrices
- Generalized Minimal Residual (GMRES) method for general nonsymmetric matrices
- Biconjugate Gradient Stabilized (BiCGSTAB) method for nonsymmetric systems
- Iterative nature allows for early termination when desired accuracy reached
Eigenvalue problems
- Arnoldi iteration for general matrices to compute a few eigenvalues
- Lanczos algorithm for symmetric matrices, producing tridiagonal matrices
- Implicitly Restarted Arnoldi Method (IRAM) for large-scale eigenvalue problems
- Krylov-Schur method combines benefits of Arnoldi and QR algorithms
Model order reduction
- Krylov subspace methods used to create reduced-order models of large dynamical systems
- Moment matching techniques based on Krylov subspaces for model reduction
- Rational Krylov methods for more accurate reduced models over wider frequency ranges
- Applications in control theory and circuit simulation
Arnoldi iteration
- Arnoldi iteration serves as a fundamental building block for many Krylov subspace methods
- This algorithm constructs an orthonormal basis for the Krylov subspace while simultaneously producing a Hessenberg matrix representation of the original matrix
Algorithm steps
- Initialize with a normalized vector v_1 = b / ||b||_2
- For k = 1, 2, ..., m:
- Compute w = A v_k
- Orthogonalize w against previous vectors: h_{i,k} = v_i^T w, w ← w - h_{i,k} v_i for i = 1, ..., k
- Normalize the resulting vector to obtain v_{k+1} = w / h_{k+1,k}, where h_{k+1,k} = ||w||_2
- Arnoldi process terminates if w becomes zero, indicating invariant subspace found
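A minimal sketch of the Arnoldi iteration in Python/NumPy, using the modified Gram-Schmidt orthogonalization discussed in the next subsection; the function and variable names are illustrative, not taken from any particular library.

```python
import numpy as np

def arnoldi(A, b, m):
    """Arnoldi iteration with modified Gram-Schmidt (illustrative sketch).

    Returns V (n x (m+1)) with orthonormal columns and H ((m+1) x m)
    upper Hessenberg, satisfying A @ V[:, :m] = V @ H up to rounding.
    """
    n = b.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / np.linalg.norm(b)
    for k in range(m):
        w = A @ V[:, k]
        # Modified Gram-Schmidt: orthogonalize w against v_1, ..., v_{k+1}
        for i in range(k + 1):
            H[i, k] = V[:, i] @ w
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] < 1e-12:      # happy breakdown: invariant subspace found
            return V[:, :k + 1], H[:k + 1, :k]
        V[:, k + 1] = w / H[k + 1, k]
    return V, H
```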
Orthogonalization process
- Modified Gram-Schmidt (MGS) process commonly used for orthogonalization
- MGS performs better in finite precision arithmetic compared to classical Gram-Schmidt
- Reorthogonalization techniques (iterative refinement) improve numerical stability
- Householder transformations provide an alternative, more stable orthogonalization method
Hessenberg matrix formation
- Arnoldi iteration produces upper Hessenberg matrix H_m of size m x m
- H_m represents projection of A onto the Krylov subspace
- Entries of H_m obtained from orthogonalization coefficients
- Relation A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T holds, where V_m contains the orthonormal basis vectors
Lanczos algorithm
- Lanczos algorithm specializes the Arnoldi iteration for symmetric matrices
- Exploits symmetry to achieve significant computational savings and storage reduction
Symmetric vs nonsymmetric matrices
- Lanczos algorithm applicable only to symmetric (or Hermitian) matrices
- Produces tridiagonal matrix instead of upper Hessenberg matrix
- Requires only three-term recurrence relation, reducing storage and computation
- Arnoldi iteration reduces to Lanczos algorithm for symmetric matrices
Tridiagonalization process
- Lanczos algorithm generates orthonormal basis vectors
- Tridiagonal matrix T_m formed with diagonal elements α_i and off-diagonal elements β_i
- Relation A V_m = V_m T_m + β_{m+1} v_{m+1} e_m^T holds
- Tridiagonal structure allows for efficient eigenvalue computations
Recurrence relations
- Three-term recurrence relation β_{j+1} v_{j+1} = A v_j - α_j v_j - β_j v_{j-1} generates new basis vectors
- α_j computed as the inner product α_j = v_j^T A v_j
- β_{j+1} determined by normalizing the resulting vector: β_{j+1} = ||A v_j - α_j v_j - β_j v_{j-1}||_2
- Loss of orthogonality in finite precision arithmetic requires careful implementation
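A minimal Lanczos sketch in NumPy following the three-term recurrence above; it omits reorthogonalization, so (as the last bullet notes) orthogonality of the basis degrades for long runs. Names are illustrative.

```python
import numpy as np

def lanczos(A, b, m):
    """Lanczos tridiagonalization for symmetric A (illustrative sketch,
    no reorthogonalization)."""
    n = b.shape[0]
    V = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m)                 # beta[j] corresponds to beta_{j+1} above
    V[:, 0] = b / np.linalg.norm(b)
    v_prev, b_prev = np.zeros(n), 0.0
    for j in range(m):
        w = A @ V[:, j] - b_prev * v_prev     # three-term recurrence
        alpha[j] = V[:, j] @ w
        w = w - alpha[j] * V[:, j]
        beta[j] = np.linalg.norm(w)
        if j + 1 < m:
            if beta[j] < 1e-12:               # invariant subspace reached
                return V[:, :j + 1], alpha[:j + 1], beta[:j]
            V[:, j + 1] = w / beta[j]
            v_prev, b_prev = V[:, j], beta[j]
    # T_m has diagonal alpha and off-diagonals beta[:-1]
    return V, alpha, beta[:-1]
```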
Conjugate gradient method
- Conjugate Gradient (CG) method solves symmetric positive definite linear systems
- Combines ideas from steepest descent and conjugate directions methods
Derivation and theory
- Minimizes the quadratic function φ(x) = (1/2) x^T A x - b^T x over Krylov subspaces
- Generates a sequence of A-conjugate search directions p_k satisfying p_i^T A p_j = 0 for i ≠ j
- Residual vectors orthogonal to previous search directions
- Theoretical convergence in at most n steps for n x n matrix in exact arithmetic
Implementation details
- Initialize x_0, compute r_0 = b - A x_0, set p_0 = r_0
- For k = 0, 1, 2, ...:
- Compute α_k = (r_k^T r_k) / (p_k^T A p_k)
- Update solution x_{k+1} = x_k + α_k p_k
- Update residual r_{k+1} = r_k - α_k A p_k
- Compute β_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k)
- Update search direction p_{k+1} = r_{k+1} + β_k p_k
- Terminate when residual norm falls below specified tolerance
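A compact NumPy sketch of the CG loop above, assuming A is symmetric positive definite; the tolerance and iteration limit are arbitrary illustrative choices.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-8, maxiter=1000):
    """Conjugate Gradient for symmetric positive definite A (illustrative sketch)."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x                      # initial residual
    p = r.copy()                       # initial search direction
    rs_old = r @ r
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)      # step length
        x = x + alpha * p              # update solution
        r = r - alpha * Ap             # update residual
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * np.linalg.norm(b):   # relative residual test
            break
        beta = rs_new / rs_old         # conjugacy coefficient
        p = r + beta * p               # new search direction
        rs_old = rs_new
    return x
```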
Convergence properties
- Convergence rate depends on condition number κ(A) of matrix A
- Error bound ||e_k||_A ≤ 2 ((√κ(A) - 1) / (√κ(A) + 1))^k ||e_0||_A
- Superlinear convergence often observed in practice
- Sensitive to rounding errors, may require reorthogonalization for ill-conditioned problems
GMRES method
- Generalized Minimal Residual (GMRES) method solves general nonsymmetric linear systems
- Builds upon Arnoldi iteration to minimize residual norm over growing Krylov subspaces
Minimization principle
- GMRES minimizes the residual norm ||b - A x_k||_2 over the Krylov subspace K_k(A, b)
- Approximate solution expressed as x_k = x_0 + V_k y_k
- Minimization problem formulated as the least squares problem min_y ||β e_1 - H_k y||_2, where H_k is the (k+1) × k Hessenberg matrix from the Arnoldi process
- β represents the initial residual norm, e_1 is the first unit vector
Arnoldi-based implementation
- Use Arnoldi iteration to generate orthonormal basis V_k and Hessenberg matrix H_k
- Solve least squares problem using QR factorization of H_k
- Update solution x_k = x_0 + V_k y_k, where y_k minimizes the residual
- Residual norm computed without explicitly forming x_k
- Requires storage of all Arnoldi vectors, leading to increasing memory usage
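The following sketch assembles these pieces into an unrestarted GMRES in NumPy (x_0 = 0 assumed). For clarity it solves the small least squares problem with np.linalg.lstsq instead of the incremental Givens-rotation update used in production implementations; names are illustrative.

```python
import numpy as np

def gmres_basic(A, b, m):
    """Unrestarted GMRES sketch: m Arnoldi steps, then solve
    min_y || beta*e_1 - H y ||_2 for the small least squares problem."""
    n = b.shape[0]
    beta = np.linalg.norm(b)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    for k in range(m):
        w = A @ V[:, k]
        for i in range(k + 1):                  # modified Gram-Schmidt
            H[i, k] = V[:, i] @ w
            w -= H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] > 1e-12:
            V[:, k + 1] = w / H[k + 1, k]
    e1 = np.zeros(m + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H, e1, rcond=None)  # least squares problem
    return V[:, :m] @ y                         # x_m = V_m y  (x_0 = 0 assumed)
```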
Restarted vs full GMRES
- Full GMRES guarantees convergence but becomes memory-intensive for large k
- Restarted GMRES (GMRES(m)) limits memory usage by restarting after m iterations
- GMRES(m) discards previous basis vectors and restarts with current approximate solution
- Restarting may slow convergence or cause stagnation for difficult problems
- Flexible GMRES (FGMRES) allows for variable preconditioning between restarts
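For practical use, restarted GMRES is available in SciPy. A minimal usage sketch of GMRES(30); the diagonally dominant test matrix is an arbitrary stand-in, and tolerance keyword names vary between SciPy versions, so defaults are used.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres

n = 1000
# Diagonally dominant nonsymmetric test matrix (illustrative stand-in)
A = sp.diags([-1.0, 3.0, -0.5], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = gmres(A, b, restart=30, maxiter=200)   # GMRES(30), at most 200 cycles
print("info:", info, "relative residual:",
      np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```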
BiCGSTAB method
- Biconjugate Gradient Stabilized (BiCGSTAB) method solves nonsymmetric linear systems
- Combines ideas from BiCG method with additional stabilization techniques
Bi-Lanczos process
- Generates two sequences of vectors: residuals r_k and shadow residuals r̂_k
- Shadow residuals r̂_k chosen to be biorthogonal to the residuals r_k
- Avoids explicit transpose matrix-vector products required in BiCG
- Constructs bases for the Krylov subspaces K_k(A, r_0) and K_k(A^T, r̂_0)
Stabilization technique
- Introduces additional polynomial factor to smooth convergence behavior
- Computes two scalar parameters α and ω in each iteration
- Updates solution and residual using combination of BiCG and GMRES-like steps
- Stabilization aims to reduce irregular convergence often observed in BiCG
Convergence behavior
- Generally more stable convergence compared to BiCG method
- May exhibit superlinear convergence for some problems
- Breakdown can occur if ω becomes zero or very small
- Sensitive to round-off errors, may require careful implementation for stability
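A minimal usage sketch of SciPy's BiCGSTAB on a nonsymmetric system; the tridiagonal test matrix is an arbitrary stand-in, and tolerance keywords are left at their defaults since their names differ between SciPy versions.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import bicgstab

n = 1000
# Nonsymmetric, diagonally dominant test matrix (illustrative stand-in)
A = sp.diags([-1.0, 3.0, -0.5], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = bicgstab(A, b, maxiter=500)            # info == 0 signals convergence
print("info:", info, "relative residual:",
      np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```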
Preconditioning techniques
- Preconditioning improves convergence of Krylov subspace methods
- Transforms original system into equivalent system with better spectral properties
Left vs right preconditioning
- Left preconditioning solves M^(-1) A x = M^(-1) b
- Right preconditioning solves A M^(-1) u = b with x = M^(-1) u
- Split preconditioning solves M_1^(-1) A M_2^(-1) u = M_1^(-1) b with x = M_2^(-1) u
- Choice between left and right preconditioning affects implementation and convergence
Common preconditioners
- Jacobi preconditioner uses diagonal of A
- Symmetric Successive Over-Relaxation (SSOR) based on matrix splitting
- Incomplete LU factorization (ILU) approximates full LU decomposition
- Algebraic Multigrid (AMG) methods for problems with multiple scales
- Approximate inverse preconditioners directly approximate A^(-1)
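A sketch of Jacobi (diagonal) preconditioning with SciPy's CG. The operator M passed to SciPy approximates A^(-1), so for Jacobi it simply divides by the diagonal of A. The SPD test matrix with a varying diagonal is an illustrative stand-in (Jacobi does nothing useful when the diagonal is constant).

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, LinearOperator

n = 2000
main = 2.0 + np.linspace(0.0, 10.0, n)            # varying diagonal keeps A SPD
A = sp.diags([-np.ones(n - 1), main, -np.ones(n - 1)], [-1, 0, 1], format="csr")
b = np.ones(n)

# Jacobi preconditioner: apply diag(A)^(-1) as a LinearOperator
M = LinearOperator((n, n), matvec=lambda r: r / main)

x, info = cg(A, b, M=M, maxiter=5000)
print("info:", info, "relative residual:",
      np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```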
Effect on convergence
- Effective preconditioning reduces condition number of coefficient matrix
- Clusters eigenvalues, potentially leading to superlinear convergence
- Improves robustness and reduces iteration count
- Trade-off between preconditioning cost and iteration reduction
- Problem-specific preconditioners often outperform general-purpose ones
Error analysis and convergence
- Understanding error behavior and convergence properties crucial for effective use of Krylov methods
- Analysis provides insights into algorithm performance and termination criteria
Residual vs true error
- Residual r_k = b - Ax_k easily computed during iterations
- True error e_k = x - x_k generally unknown without exact solution
- Residual and error related by r_k = A e_k, so e_k = A^(-1) r_k
- Residual norm may not accurately reflect true error magnitude
- Backward error analysis relates residual to perturbations in original problem
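A small NumPy experiment illustrating the gap between residual and true error for an ill-conditioned matrix, using the bound ||e||/||x|| ≤ κ(A) ||r||/||b||; the "iterate" here is just a hand-made perturbation of the exact solution, not the output of any particular solver.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.logspace(0, 6, n)) @ U.T        # SPD with kappa(A) = 1e6
x_true = rng.standard_normal(n)
b = A @ x_true

x_approx = x_true + 1e-6 * rng.standard_normal(n)  # a perturbed "iterate"
r = b - A @ x_approx
rel_res = np.linalg.norm(r) / np.linalg.norm(b)
rel_err = np.linalg.norm(x_true - x_approx) / np.linalg.norm(x_true)
print(f"relative residual {rel_res:.2e}, relative error {rel_err:.2e}, "
      f"bound kappa*res {np.linalg.cond(A) * rel_res:.2e}")
```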
Convergence rates
- Asymptotic convergence rate determined by spectral properties of preconditioned matrix
- Linear convergence rate for most Krylov methods in general case
- Convergence bound for CG method depends on condition number κ(A)
- GMRES minimizes residual norm over growing subspaces, non-increasing residual
- BiCGSTAB and other methods may exhibit non-monotonic convergence behavior
Superlinear convergence
- Superlinear convergence observed when convergence rate improves as iterations progress
- Often attributed to effective clustering of eigenvalues
- GMRES can exhibit superlinear convergence for certain eigenvalue distributions
- Ritz values approximate extremal eigenvalues, improving effective condition number
- Preconditioning can enhance superlinear convergence behavior
Krylov methods for eigenproblems
- Krylov subspace methods adapted for large-scale eigenvalue problems
- Focus on computing a few eigenvalues and eigenvectors of large, sparse matrices
Implicitly restarted Arnoldi
- Combines Arnoldi iteration with implicit QR algorithm
- Maintains small subspace dimension while refining desired eigenpairs
- Applies filters to focus on specific parts of spectrum
- Efficient for computing a few exterior eigenvalues of large nonsymmetric matrices
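SciPy's eigs wraps ARPACK's implicitly restarted Arnoldi method; a minimal usage sketch on a random sparse nonsymmetric matrix (the matrix and parameters are purely illustrative).

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigs

n = 1000
A = sp.random(n, n, density=0.01, format="csr", random_state=0)  # illustrative matrix

# 6 eigenvalues of largest magnitude via ARPACK's implicitly restarted Arnoldi
vals, vecs = eigs(A, k=6, which="LM")
print(np.sort(np.abs(vals))[::-1])
```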
Jacobi-Davidson method
- Combines ideas from Davidson's method with Jacobi orthogonal component correction
- Solves correction equation to expand subspace
- Effective for interior eigenvalues and clustered spectra
- Allows for flexible preconditioning of correction equation
Shift-and-invert techniques
- Transforms eigenvalue problem to accelerate convergence to desired eigenvalues
- Applies the Krylov process to (A - σI)^(-1) instead of A; its eigenvalues θ = 1/(λ - σ) are largest for the eigenvalues λ of A closest to the shift σ
- Shifts spectrum to make desired eigenvalues dominant
- Requires efficient solvers for shifted linear systems
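With SciPy/ARPACK, passing sigma to eigsh activates shift-and-invert: ARPACK then works with (A - σI)^(-1) internally and returns the eigenvalues of A nearest the shift. A minimal sketch; the 1D Laplacian-style matrix is just a convenient symmetric example.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 2000
# Symmetric positive definite test matrix (1D Laplacian stencil); CSC format
# suits the sparse factorization that shift-and-invert performs internally
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")

vals, vecs = eigsh(A, k=4, sigma=1.0)   # 4 eigenvalues of A nearest the shift 1.0
print(np.sort(vals))
```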
Applications and case studies
- Krylov subspace methods find widespread use in scientific computing and engineering applications
- Ability to handle large, sparse systems makes them indispensable in many fields
Large sparse systems
- Finite element analysis in structural mechanics and heat transfer
- Circuit simulation in electronic design automation
- Optimization problems in operations research
- Markov chain analysis in stochastic modeling
- Graph algorithms in network analysis and machine learning
Computational fluid dynamics
- Solution of Navier-Stokes equations for fluid flow simulations
- Pressure correction methods in incompressible flow solvers
- Preconditioning techniques for convection-diffusion problems
- Coupled multi-physics simulations (fluid-structure interaction)
- Climate modeling and weather prediction
Structural mechanics problems
- Static and dynamic analysis of large structures
- Eigenvalue problems for vibration analysis and modal decomposition
- Nonlinear finite element analysis with Newton-Krylov methods
- Topology optimization in structural design
- Crack propagation and fracture mechanics simulations
Advanced topics
- Ongoing research in Krylov subspace methods addresses challenges in high-performance computing and emerging applications
- Advanced techniques aim to improve scalability, robustness, and efficiency of existing methods
Block Krylov methods
- Solve multiple right-hand sides simultaneously
- Exploit cache efficiency and reduce communication costs
- Block GMRES and Block CG for systems with multiple right-hand sides
- Block Arnoldi methods for eigenvalue problems with invariant subspaces
- Applications in model reduction and parametric studies
Flexible Krylov methods
- Allow for variable preconditioning between iterations
- Flexible GMRES (FGMRES) accommodates nonlinear preconditioners
- Flexible CG for symmetric positive definite systems
- Useful for adaptive and nonlinear preconditioning strategies
- Applications in domain decomposition and multigrid methods
Krylov-Schur methods
- Combine benefits of Arnoldi/Lanczos methods with QR algorithm
- Implicit restarting technique for large-scale eigenvalue problems
- Thick-restart Lanczos for symmetric eigenvalue problems
- Krylov-Schur decomposition provides stable and flexible restarting
- Efficient implementations available in software libraries (e.g., ARPACK for implicitly restarted Arnoldi, SLEPc for Krylov-Schur)