Solving Nonlinear Equations With Newton's Method

Fundamentals of Algorithms
Editor-in-Chief: Nicholas J. Higham, University of Manchester

The SIAM series on Fundamentals of Algorithms publishes monographs on state-of-the-art numerical methods to provide the reader with sufficient knowledge to choose the appropriate method for a given application and to aid the reader in understanding the limitations of each method. The monographs focus on numerical methods and algorithms to solve specific classes of problems and are written for researchers, practitioners, and students. The goal of the series is to produce a collection of short books written by experts on numerical methods that include an explanation of each method and a summary of theoretical background. What distinguishes a book in this series is its emphasis on explaining how to best choose a method, algorithm, or software program to solve a specific type of problem and its descriptions of when a given algorithm or method succeeds or fails.

Solving Nonlinear Equations with Newton's Method
C. T. Kelley, North Carolina State University, Raleigh, North Carolina
Society for Industrial and Applied Mathematics, Philadelphia

Copyright 2003 by the Society for Industrial and Applied Mathematics. Printed in the United States of America. All rights reserved. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688. No warranties, express or implied, are made by the publisher, author, and their employers that the programs contained in this volume are free of error. They should not be relied on as the sole basis to solve a problem whose incorrect solution could result in injury to person or property. If the programs are employed in such a manner, it is at the user's own risk and the publisher, author, and their employers disclaim all liability for such misuse. Apple and Macintosh are trademarks of Apple Computer, Inc., registered in the U.S. and other countries. VAIO is a registered trademark of Sony Corporation.

Library of Congress Cataloging-in-Publication Data: Kelley, C. T. Solving nonlinear equations with Newton's method / C. T. Kelley. p. cm. (Fundamentals of algorithms). Includes bibliographical references and index. ISBN 0-89871-546-6 (pbk.) 1. Newton-Raphson method. 2. Iterative methods (Mathematics). 3. Nonlinear theories. I. Title. II. Series. QA297.8.K455 2003 511'.4 dc21 2003050663.

To my students

Contents

Preface
How to Get the Software
1 Introduction
  1.1 What Is the Problem?
    1.1.1 Notation
  1.2 Newton's Method
    1.2.1 Local Convergence Theory
  1.3 Approximating the Jacobian
  1.4 Inexact Newton Methods
  1.5 Termination of the Iteration
  1.6 Global Convergence and the Armijo Rule
  1.7 A Basic Algorithm
    1.7.1 Warning!
  1.8 Things to Consider
    1.8.1 Human Time and Public Domain Codes
    1.8.2 The Initial Iterate
    1.8.3 Computing the Newton Step
    1.8.4 Choosing a Solver
  1.9 What Can Go Wrong?
    1.9.1 Nonsmooth Functions
    1.9.2 Failure to Converge
    1.9.3 Failure of the Line Search
    1.9.4 Slow Convergence
    1.9.5 Multiple Solutions
    1.9.6 Storage Problems
  1.10 Three Codes for Scalar Equations
    1.10.1 Common Features
    1.10.2 newtsol.m
    1.10.3 chordsol.m
    1.10.4 secant.m
  1.11 Projects
    1.11.1 Estimating the q-order
    1.11.2 Singular Problems
2 Finding the Newton Step with Gaussian Elimination
  2.1 Direct Methods for Solving Linear Equations
  2.2 The Newton-Armijo Iteration
  2.3 Computing a Finite Difference Jacobian
  2.4 The Chord and Shamanskii Methods
  2.5 What Can Go Wrong?
    2.5.1 Poor Jacobians
    2.5.2 Finite Difference Jacobian Error
    2.5.3 Pivoting
  2.6 Using nsold.m
    2.6.1 Input to nsold.m
    2.6.2 Output from nsold.m
  2.7 Examples
    2.7.1 Arctangent Function
    2.7.2 A Simple Two-Dimensional Example
    2.7.3 Chandrasekhar H-equation
    2.7.4 A Two-Point Boundary Value Problem
    2.7.5 Stiff Initial Value Problems
  2.8 Projects
    2.8.1 Chandrasekhar H-equation
    2.8.2 Nested Iteration
  2.9 Source Code for nsold.m
3 Newton-Krylov Methods
  3.1 Krylov Methods for Solving Linear Equations
    3.1.1 GMRES
    3.1.2 Low-Storage Krylov Methods
    3.1.3 Preconditioning
  3.2 Computing an Approximate Newton Step
    3.2.1 Jacobian-Vector Products
    3.2.2 Preconditioning Nonlinear Equations
    3.2.3 Choosing the Forcing Term
  3.3 Preconditioners
  3.4 What Can Go Wrong?
    3.4.1 Failure of the Inner Iteration
    3.4.2 Loss of Orthogonality
  3.5 Using nsoli.m
    3.5.1 Input to nsoli.m
    3.5.2 Output from nsoli.m
  3.6 Examples
    3.6.1 Chandrasekhar H-equation
    3.6.2 The Ornstein-Zernike Equations
    3.6.3 Convection-Diffusion Equation
    3.6.4 Time-Dependent Convection-Diffusion Equation
  3.7 Projects
    3.7.1 Krylov Methods and the Forcing Term
    3.7.2 Left and Right Preconditioning
    3.7.3 Two-Point Boundary Value Problem
    3.7.4 Making a Movie
  3.8 Source Code for nsoli.m
4 Broyden's Method
  4.1 Convergence Theory
  4.2 An Algorithmic Sketch
  4.3 Computing the Broyden Step and Update
  4.4 What Can Go Wrong?
    4.4.1 Failure of the Line Search
    4.4.2 Failure to Converge
  4.5 Using brsola.m
    4.5.1 Input to brsola.m
    4.5.2 Output from brsola.m
  4.6 Examples
    4.6.1 Chandrasekhar H-equation
    4.6.2 Convection-Diffusion Equation
  4.7 Source Code for brsola.m
Bibliography
Index

Preface

This small book on Newton's method is a user-oriented guide to algorithms and implementation. Its purpose is to show, via algorithms in pseudocode, in MATLAB, and with several examples, how one can choose an appropriate Newton-type method for a given problem and write an efficient solver or apply one written by others. This book is intended to complement my larger book [42], which focuses on in-depth treatment of convergence theory, but does not discuss the details of solving particular problems, implementation in any particular language, or evaluating a solver for a given problem.

The computational examples in this book were done with MATLAB v6.5 on an Apple Macintosh G4 and a SONY VAIO. The MATLAB codes for the solvers and all the examples accompany this book. MATLAB is an excellent environment for prototyping and testing and for moderate-sized production work. Large-scale problems are best done in a compiled language with a high-quality public domain code. The codes were designed for production work on small- to medium-scale problems having at most a few thousand unknowns. I have used the three main solvers nsold.m, nsoli.m, and brsola.m from the collection of MATLAB codes in my own research.

We assume that the reader has a good understanding of elementary numerical analysis at the level of [4] and of numerical linear algebra at the level of [23,76]. Because the examples are so closely coupled to the text, this book cannot be understood without a working knowledge of MATLAB. There are many introductory books on MATLAB. Either of [71] and [37] would be a good place to start.

Parts of this book are based on research supported by the National Science Foundation and the Army Research Office, most recently by grants DMS-0070641, DMS-0112542, DMS-0209695, DAAD19-02-1-0111, and DAAD19-02-1-0391. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation or the Army Research Office.

Many of my students, colleagues, and friends helped with this project. I'm particularly grateful to these stellar rootfinders for their direct and indirect assistance and inspiration: Don Alfonso, Charlie Berger, Paul Boggs, Peter Brown, Steve Campbell, Todd Coffey, Hong-Liang Cui, Steve Davis, John Dennis, Matthew Farthing, Dan Finkel, Tom Fogwell, Jorg Gablonsky, Jackie Hallberg, Russ Harmon, Jan Hesthaven, Nick Higham, Alan Hindmarsh, Jeff Holland, Stacy Howington, Mac Hyman, Ilse Ipsen, Lea Jenkins, Katie Kavanagh, Vickie Kearn, Chris Kees, Carl and Betty Kelley, David Keyes, Dana Knoll, Tammy Kolda, Matthew Lasater, Debbie Lockhart, Carl Meyer, Casey Miller, Tom Mullikin, Stephen Nash, Chung-Wei Ng, Jim Ortega, Jong-Shi Pang, Mike Pernice, Monte Pettitt, Linda Petzold, Greg Racine, Jill Reese, Ekkehard Sachs, Joe Schmidt, Bobby Schnabel, Chuck Siewert, Linda Thiel, Homer Walker, Carol Woodward, Dwight Woolard, Sam Young, Peiji Zhao, and every student who ever took my nonlinear equations course.

C. T. Kelley
Raleigh, North Carolina
May 2003

How to Get the Software

This book is tightly coupled to a suite of MATLAB codes. The codes are available from SIAM at the URL http://www.siam.org/books/fa01

The software is organized into the following five directories. You should put the SOLVERS directory in your MATLAB path.

(1) SOLVERS
    nsold.m   Newton's method, direct factorization of Jacobians
    nsoli.m   Newton-Krylov methods, no matrix storage
    brsol.m   Broyden's method, no matrix storage
(2) Chapter 1: solvers for scalar equations with examples
(3) Chapter 2: examples that use nsold.m
(4) Chapter 3: examples that use nsoli.m
(5) Chapter 4: examples that use brsol.m

One can obtain MATLAB from
The MathWorks, Inc.
3 Apple Hill Drive
Natick, MA 01760-2098
(508) 647-7000
Fax: (508) 647-7001
Email: info@mathworks.com
WWW: http://www.mathworks.com

Chapter 1
Introduction

1.1 What Is the Problem?

Nonlinear equations are solved as part of almost all simulations of physical processes. Physical models that are expressed as nonlinear partial differential equations, for example, become large systems of nonlinear equations when discretized. Authors of simulation codes must either use a nonlinear solver as a tool or write one from scratch. The purpose of this book is to show these authors what technology is available, sketch the implementation, and warn of the problems. We do this via algorithmic outlines, examples in MATLAB, nonlinear solvers in MATLAB that can be used for production work, and chapter-ending projects.

We use the standard notation

    F(x) = 0                                                      (1.1)

for systems of N equations in N unknowns. Here F : R^N -> R^N. We will call F the nonlinear residual or simply the residual. Rarely can the solution of a nonlinear equation be given by a closed-form expression, so iterative methods must be used to approximate the solution numerically. The output of an iterative method is a sequence of approximations to a solution.
1.1.1 Notation

In this book, following the convention in [42,43], vectors are to be understood as column vectors. The vector x* will denote a solution, x a potential solution, and {x_n}_{n>=0} the sequence of iterates. We will refer to x_0 as the initial iterate (not guess!). We will denote the ith component of a vector x by (x)_i (note the parentheses) and the ith component of x_n by (x_n)_i. We will rarely need to refer to individual components of vectors. We will let df/d(x)_i denote the partial derivative of f with respect to (x)_i. As is standard [42], e = x - x* will denote the error, and e_n = x_n - x* is the error in the nth iterate. Throughout the book, || . || will denote the Euclidean norm on R^N.

If the components of F are differentiable at x in R^N, we define the Jacobian matrix F'(x) by

    (F'(x))_ij = d(F)_i / d(x)_j.

1.2 Newton's Method

The methods in this book are variations of Newton's method. The Newton sequence is

    x_{n+1} = x_n - F'(x_n)^{-1} F(x_n).                          (1.2)

The interpretation of (1.2) is that we model F at the current iterate x_n with a linear function

    M_n(x) = F(x_n) + F'(x_n)(x - x_n)

and let the root of M_n be the next iteration. M_n is called the local linear model. If F'(x_n) is nonsingular, then M_n(x_{n+1}) = 0 is equivalent to (1.2).

Figure 1.1 illustrates the local linear model and the Newton iteration for the scalar equation arctan(x) = 0 with initial iterate x_0 = 1. We graph the local linear model at x_j from the point (x_j, y_j) = (x_j, F(x_j)) to the next iteration (x_{j+1}, 0). The MATLAB program ataneg.m creates Figure 1.1 and the other figures in this chapter for the arctan function. The iteration converges rapidly and one can see the linear model becoming more and more accurate. The third iterate is visually indistinguishable from the solution.

Figure 1.1. Newton iteration for the arctan function.
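The book's ataneg.m is not reproduced here, but the following minimal sketch carries out the iteration (1.2) for arctan(x) = 0 with an analytic derivative. It is an illustration, not one of the book's codes; the iteration limit and stopping tolerance are arbitrary choices for this example.

% Newton iteration for arctan(x) = 0, starting from x_0 = 1.
x = 1;
for n = 1:8
    fx = atan(x);                 % residual F(x_n)
    if abs(fx) < 1.e-15, break; end
    fpx = 1/(1 + x^2);            % derivative F'(x_n)
    x = x - fx/fpx;               % Newton step, taken with full step length
end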
The computation of a Newton iteration requires

1. evaluation of F(x_n) and a test for termination,
2. approximate solution of the equation

       F'(x_n) s = -F(x_n)                                        (1.3)

   for the Newton step s, and
3. construction of x_{n+1} = x_n + lambda*s, where the step length lambda is selected to guarantee decrease in ||F||.

Item 2, the computation of the Newton step, consumes most of the work, and the variations in Newton's method that we discuss in this book differ most significantly in how the Newton step is approximated. Computing the step may require evaluation and factorization of the Jacobian matrix or the solution of (1.3) by an iterative method, which, as we will see in Chapter 2, can be very expensive. Not all methods for computing the Newton step require the complete Jacobian matrix. In the example from Figure 1.1, the step s in item 2 was satisfactory, and item 3 was not needed. The reader should be warned that attention to the step length is generally very important. One should not write one's own nonlinear solver without step-length control (see section 1.6).

1.2.1 Local Convergence Theory

The convergence theory for Newton's method [24,42,57] that is most often seen in an elementary course in numerical methods is local. This means that one assumes that the initial iterate x_0 is near a solution. The local convergence theory from [24,42,57] requires the standard assumptions.

Assumption 1.1. (standard assumptions)
1. Equation 1.1 has a solution x*.
2. F' : Omega -> R^(N x N) is Lipschitz continuous near x*.
3. F'(x*) is nonsingular.

Recall that Lipschitz continuity near x* means that there is gamma > 0 (the Lipschitz constant) such that

    ||F'(x) - F'(y)|| <= gamma ||x - y||

for all x, y sufficiently near x*.
The classic convergence theorem is as follows.

Theorem 1.1. Let the standard assumptions hold. If x_0 is sufficiently near x*, then the Newton sequence exists (i.e., F'(x_n) is nonsingular for all n >= 0) and converges to x*, and there is K > 0 such that

    ||e_{n+1}|| <= K ||e_n||^2                                    (1.4)

for n sufficiently large.

The convergence described by (1.4), in which the error in the solution will be roughly squared with each iteration, is called q-quadratic. Squaring the error roughly means that the number of significant figures in the result doubles with each iteration. Of course, one cannot examine the error without knowing the solution. However, we can observe the quadratic reduction in the error computationally, because the nonlinear residual will also be roughly squared with each iteration. Therefore, if F'(x*) is well conditioned (see (1.13)), we should see the exponent field of the norm of the nonlinear residual roughly double with each iteration until stagnation takes over.

In Table 1.1 we report the Newton iteration for the scalar (N = 1) nonlinear equation

    tan(x) - x = 0.                                               (1.5)

The solution is x* ~ 4.493. We report the norm of the nonlinear residual against the iteration number.

Table 1.1. Residual history for Newton's method.

    n    |F(x_n)|
    0    1.3733e-01
    1    4.1319e-03
    2    3.9818e-06
    3    5.5955e-12
    4    8.8818e-16
    5    8.8818e-16

The decrease in the function is as the theory predicts for the first three iterations; then progress slows down for iteration 4 and stops completely after that. The reason for this stagnation is clear: one cannot evaluate the function to higher precision than (roughly) machine unit roundoff, which in the IEEE [39,58] floating point system is about 10^-16.

The results reported in Table 1.1 used a forward difference approximation to the derivative with a difference increment of 10^-6. With this choice of difference increment, the convergence speed of the nonlinear iteration is as fast as that for Newton's method, at least for this example. Stagnation is not affected by the accuracy in the derivative. The reader should be aware that difference approximations to derivatives, while usually reliable, are often expensive and can be very inaccurate. An inaccurate Jacobian can cause many problems (see section 1.9). An analytic Jacobian can require some human effort, but can be worth it in terms of computer time and robustness when a difference Jacobian performs poorly.

One can quantify this stagnation by adding the errors in the function and derivative evaluations to the statement of Theorem 1.1.

Theorem 1.2. Let the standard assumptions hold. Let a matrix-valued function Delta(x) and a vector-valued function epsilon(x) be such that ||Delta(x)|| and ||epsilon(x)|| are sufficiently small for all x near x*. Then, if x_0 is sufficiently near x*, the sequence

    x_{n+1} = x_n - (F'(x_n) + Delta(x_n))^{-1} (F(x_n) + epsilon(x_n))

is defined (i.e., F'(x_n) + Delta(x_n) is nonsingular for all n) and satisfies

    ||e_{n+1}|| <= K (||e_n||^2 + ||Delta(x_n)|| ||e_n|| + ||epsilon(x_n)||)

for some K > 0.

The messages of Theorem 1.2 are as follows:
• Small errors, for example, machine roundoff, in the function evaluation can lead to stagnation. This type of stagnation is usually benign and, if the Jacobian is well conditioned (see (1.13) in section 1.5), the results will be as accurate as the evaluation of F.
• Errors in the Jacobian and in the solution of the linear equation for the Newton step (1.3) will affect the speed of the nonlinear iteration, but not the limit of the sequence.

We will ignore the errors in the function in the rest of this book and assume that the asymptotic convergence results for exact arithmetic describe the observations well for most problems. However, one needs to be aware that stagnation of the nonlinear iteration is all but certain in finite-precision arithmetic.

While Table 1.1 gives a clear picture of quadratic convergence, it's easier to appreciate a graph. Figure 1.2 is a semilog plot of residual history, i.e., the norm of the nonlinear residual against the iteration number. One uses the semilogy command in MATLAB for this. See the file tandemo.m, which generated Figures 1.2 and 1.3, for an example. The concavity of the plot is the signature of superlinear convergence.

Figure 1.2. Newton iteration for tan(x) - x = 0.

1.3 Approximating the Jacobian

As we will see in the subsequent chapters, it is usually most efficient to approximate the Newton step in some way. One way to do this is to approximate F'(x_n) in a way that not only avoids computation of the derivative, but also saves linear algebra work and matrix storage. The price for such an approximation is that the nonlinear iteration converges more slowly; i.e., more nonlinear iterations are needed to solve the problem. However, because the computation of the Newton step is less expensive, the overall cost of the solve is usually significantly less.

One way to approximate the Jacobian is to compute F'(x_0) and use that as an approximation to F'(x_n) throughout the iteration. This is the chord method or modified Newton method. The convergence of the chord iteration is not as fast as Newton's method. Assuming that the initial iterate is near enough to x*, the convergence is q-linear. This means that there is rho in (0,1) such that

    ||e_{n+1}|| <= rho ||e_n||

for n sufficiently large. The constant rho is called the q-factor. The formal definition of q-linear convergence allows for faster convergence; q-quadratic convergence is also q-linear, as you can see from the definition (1.4). In many cases of q-linear convergence, however, one observes that

    ||e_{n+1}|| ~ rho ||e_n||.

In these cases, q-linear convergence is usually easy to see on a semilog plot of the residual norms against the iteration number. The curve appears to be a line with slope ~ log(rho). We can apply Theorem 1.2 to the chord method with epsilon = 0 and ||Delta(x_n)|| = O(||e_0||) and conclude that rho is proportional to the initial error.
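The sketch below freezes a forward difference approximation to F'(x_0) and reuses it for every iterate, which is the chord idea in its simplest scalar form. It is an illustration, not the book's chordsol.m; the difference increment, iteration limit, and tolerance are arbitrary choices for this example.

% Chord (modified Newton) iteration for tan(x) - x = 0.
f = @(x) tan(x) - x;
x = 4.5;                          % initial iterate
h = 1.e-6;                        % difference increment
fp0 = (f(x + h) - f(x))/h;        % approximate F'(x_0), computed once
for n = 1:40
    fx = f(x);
    if abs(fx) < 1.e-12, break; end
    x = x - fx/fp0;               % reuse the frozen derivative
end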
The secant method for scalar equations approximates the derivative using a finite difference of the most recent two iterations to form the difference quotient. So

    x_{n+1} = x_n - F(x_n) (x_n - x_{n-1}) / (F(x_n) - F(x_{n-1})),          (1.8)

where x_n is the current iteration and x_{n-1} is the iteration before that. The secant method must be initialized with two points. One way to do that is to let x_{-1} = 0.99 x_0. This is what we do in our MATLAB code secant.m. The formula for the secant method does not extend to systems of equations (N > 1) because the denominator in the fraction would be a difference of vectors. We discuss one of the many generalizations of the secant method for systems of equations in Chapter 4.

The secant method's approximation to F'(x_n) converges to F'(x*) as the iteration progresses. Theorem 1.2, with epsilon = 0 and ||Delta(x_n)|| = O(||e_{n-1}||), implies that the iteration converges q-superlinearly. This means that either x_n = x* for some finite n or

    lim_{n -> inf} ||e_{n+1}|| / ||e_n|| = 0.

More generally, if x_n -> x* and, for some p > 1,

    ||e_{n+1}|| = O(||e_n||^p),

we say that x_n -> x* q-superlinearly with q-order p. Q-quadratic convergence is a special case of q-superlinear convergence. Q-superlinear convergence is hard to distinguish from q-quadratic convergence by visual inspection of the semilog plot of the residual history. The residual curve for q-superlinear convergence is concave down but drops less rapidly than the one for Newton's method.

In Figure 1.3, we compare Newton's method with the chord method and the secant method for our model problem (1.5). We see the convergence behavior that the theory predicts in the linear curve for the chord method and in the concave curves for Newton's method and the secant method. We also see the stagnation in the terminal phase. The figure does not show the division by zero that halted the secant method computation at iteration 6. The secant method has the dangerous property that the difference between x_n and x_{n-1} could be too small for an accurate difference approximation. The division by zero that we observed is an extreme case.

Figure 1.3. Newton/chord/secant comparison for tan(x) - x.

The MATLAB codes for these examples are ftan.m for the residual; newtsol.m, chordsol.m, and secant.m for the solvers; and tandemo.m to apply the solvers and make the plots. These solvers are basic scalar codes which have user interfaces similar to those of the more advanced codes which we cover in subsequent chapters. We will discuss the design of these codes in section 1.10.

1.4 Inexact Newton Methods

Rather than approximate the Jacobian, one could instead solve the equation for the Newton step approximately. An inexact Newton method [22] uses as a Newton step a vector s that satisfies the inexact Newton condition

    ||F'(x_n) s + F(x_n)|| <= eta_n ||F(x_n)||.                   (1.10)

The parameter eta (the forcing term) can be varied as the Newton iteration progresses. Choosing a small value of eta will make the iteration more like Newton's method, therefore leading to convergence in fewer iterations. However, a small value of eta may make computing a step that satisfies (1.10) very expensive. The local convergence theory [22,42] for inexact Newton methods reflects the intuitive idea that a small value of eta leads to fewer iterations. Theorem 1.3 is a typical example of such a convergence result.

Theorem 1.3. Let the standard assumptions hold. Then there are delta and eta_bar such that, if x_0 is in B(delta) and {eta_n} is contained in [0, eta_bar], then the inexact Newton iteration

    x_{n+1} = x_n + s_n,  where  ||F'(x_n) s_n + F(x_n)|| <= eta_n ||F(x_n)||,

converges q-linearly to x*. Moreover,
• if eta_n -> 0, the convergence is q-superlinear, and
• if eta_n <= K_eta ||F(x_n)||^p for some K_eta > 0, the convergence is q-superlinear with q-order 1 + p.
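One way to see the inexact Newton condition in action is to hand the linear equation for the step to an iterative solver whose relative residual tolerance is the forcing term. The sketch below does this with MATLAB's built-in gmres for a single step; the test function, its Jacobian-vector product, and all parameter values are made up for the illustration, and the book's nsoli.m implements its own Krylov solvers rather than calling gmres.

% One inexact Newton step computed with gmres, using eta as the
% relative residual tolerance so that the step satisfies (1.10).
N  = 64;
A0 = gallery('poisson', 8);                % 64 x 64 sparse test matrix
F  = @(x) A0*x + 0.1*x.^3 - ones(N,1);     % hypothetical nonlinear residual
Jv = @(v,x) A0*v + 0.3*(x.^2).*v;          % Jacobian-vector product for this F
x   = zeros(N,1);                          % current iterate
eta = 0.1;                                 % forcing term
[s, flag] = gmres(@(v) Jv(v,x), -F(x), [], eta, N);
x = x + s;                                 % one inexact Newton step, full step length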
r.5 Termination of the Iteration While one cannot know the error without knowing the solution.10) as a termination criterion. and a more complex approach (see section 3.5. In this case. 1.F(a. One can use Theorem 1. lead to stagnation of the iteration.m t see this. For the secant method. One way to quantify the utility of termination when ||. In the case of the chord method.3 to analyze the chord method or the secant method. the overall nonlinear solver is called a Newton iterative method.. the author's personal favorite.m code.m. Iterative methods for solving the equation for the Newton step would typically use (1.11) with which implies q-linear convergence if \\eo\\ is sufficiently small.3) from [29] and [42] that is the default in nsoli. If the standard assumptions hold and XQ and x are sufficiently near the root.3 does not fully describe the performance of inexact methods in practice because the theorem ignores the method used to obtain a step that satisfies (1. we terminate the iteration in our codes when The relative rr and absolute ra error tolerances are both important.2. which we describe in Chapter 3. is an implementation of several Newton— Krylov methods.9 in nsoli.12) impossible to satisfy with ra = 0. but at a much lower cost for the linear solver than a very small forcing term such as 77 = 10~4.n = O(||en_i||).e. Using only the relative reduction in the nonlinear residual as a basis for termination (i. Theorem 1.1. the nsoli. in most cases the norm of F(x) can be used as a reliable indicator of the rate of decay in \\e\\ as the iteration progresses [42]. Based on this heuristic. . This is analogous to the linear case. then and hence Therefore.10 where Chapter 1.(F'(x*}) is not very large). Prom (1. This can happen in practice if the Jacobian is ill conditioned and the initial iterate is far from the solution [45]. One can estimate the current rate of convergence from above by Hence. K. where a small residual implies a small error if the matrix is well conditioned. is to exploit the fast convergence to estimate the error in terms of the step. one may use \\sn\\ as an estimate of ||en||. Termination using (1. for a superlinearly convergent method.16) will imply that In practice.12) is a useful termination criterion.e. a safety factor is used on the left side of (1.61]. the estimate of p is much smaller than the actual q-factor. by Assuming that the estimate of p is reasonable. but is used for linearly convergent methods in some initial value problem solvers [8. If the iteration is converging superlinearly. say. So. which is supported by theory only for superlinearly convergent methods. if we terminate the iteration when and the estimate of p is an overestimate.17) to guard against an underestimate.13) we conclude that. the iteration can terminate too soon. . then implies that Hence. however. for n sufficiently large. then (1. if the Jacobian is well conditioned (i. then (1. Another approach. If.14) is only supported by theory for superlinearly convergent methods. terminating the iteration with xn+1 as soon as will imply that ||en+i|| < r. The trick is to estimate the q-factor p. when the iteration is converging superlinearly. Introduction is the condition number of F'(x*) relative to the norm || • ||. 6. created by ataneg. we find the smallest integer m > 0 such that and let the step be s = 2~md and xn+i = xn + 2~md. i. The simple artifice of reducing the step by half until ||-F(a. The initial iterate and the four subsequent iterates are As you can see.m. say) . 
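As a concrete illustration of the combined relative-absolute criterion described above, the few lines below fix the threshold tau_r*||F(x_0)|| + tau_a once at the initial iterate and test the residual norm inside the loop. This is a sketch for a scalar problem, not code from the book's solvers; the tolerance values and the forward difference step are arbitrary choices.

% Terminate when |F(x)| <= taur*|F(x_0)| + taua.
f    = @(x) tan(x) - x;
x    = 4.5;  h = 1.e-6;                 % initial iterate and difference increment
taur = 1.e-6;  taua = 1.e-9;            % relative and absolute tolerances
tau  = taur*abs(f(x)) + taua;           % threshold fixed at the initial iterate
fx = f(x);
while abs(fx) > tau
    x  = x - fx*h/(f(x + h) - fx);      % forward difference Newton step
    fx = f(x);
end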
the step while in the correct direction.4. The line search in our codes manages the reduction in the step size with more sophistication than simply halving an unsuccessful step.xn + d]. To see this.e. as we have been doing up to now in this book. When we talk about local convergence and are taking full steps (A = 1 and s = d). In Figure 1. we apply Newton's method to find the root x* = 0 of the function F(x) = arctan(x) with initial iterate XQ — 10. but might do much better if a more aggressive step-length reduction (by factors of 1/10. Global Convergence and the Armijo Rule 11 1.. we will now make a distinction between the Newton direction d = —F'(x)~1F(x) and the Newton step when we discuss global convergence. The circled points are iterations for which m > 1 and the value of m is above the circle.18) is called the sufficient decrease of ||F||. the Newton step points in the correct direction. The motivation for this is that some problems respond well to one or two reductions in the step length by modest amounts (such as 1/2) and others require many such reductions. is far too large in magnitude. In order to clearly describe this. For the methods in this book. toward x* = 0. Methods like the Armijo rule are called line search methods because one searches for a decrease in ||F|| along the line segment [xn. the Newton step will be a positive scalar multiple of the Newton direction. a = 10~4 is typical and used in our codes. succeeds.18) as easy as possible to satisfy. 1) is a small number intended to make (1. but overshoots by larger and larger amounts. called the Armijo rule [2]. The condition in (1. This initial iterate is too far from the root for the local convergence theory to hold.1. In fact. The parameter a £ (0. A rigorous convergence analysis requires a bit more detail.6 Global Convergence and the Armijo Rule The requirement in the local convergence theory that the initial iterate be near the solution is more than mathematical pedantry. we will not make this distinction and only refer to the step.)|| has been reduced will usually solve this problem. We begin by computing the Newton direction To keep the step from going too far. we show how this approach. </>(A m ). A m _i/2]. Introduction Figure 1. The line search terminates with the smallest m > 0 such that In the advanced codes from the subsequent chapters.57] for a discussion of other ways to implement a line search. AI = 1/2.28. Newton-Armijo for arctan(o. In this approach. We refer the reader to [42] for the details and to [24. 1. Am is the minimum of this parabola on the interval [A m _i/10.7 A Basic Algorithm Algorithm nsolg is a general formulation of an inexact Newton-Armijo iteration.19) is squared to make <j> a smooth function that can be accurately modeled by a quadratic over small ranges of A. is used. and 0(A m _i). To compute Am for m > 1.4. The next A is the minimizer of the quadratic model. There is a lot of .). So the algorithm generates a sequence of candidate step-length factors {Am} with AO = 1 and The norm in (1. a parabola is fitted to the data </>(0). To address this possibility.42. after two reductions by halving do not lead to sufficient decrease.12 Chapter 1. we use the three-point parabolic model from [42]. The methods in Chapters 2 and 3 are special cases of nsolg. we build a quadratic polynomial model of based on interpolation of 0 at the three most recent values of A. subject to the safeguard that the reduction in A be at least a factor of two and at most a factor of ten. ra. and 3. 
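The two sketches below are illustrations only and are not the line searches in the book's codes. The first reduces the step by halving until the sufficient decrease condition holds, with alpha = 10^-4 as in the text. The second fits the three-point parabolic model to phi(lambda) = ||F(x_n + lambda*d)||^2 at lambda = 0 and the two most recent step-length factors, and safeguards the minimizer to lie in [lambda_c/10, lambda_c/2]. The function names are hypothetical.

function [x, lambda] = armijo_halving(F, x, d)
% Reduce the step by halving until ||F|| satisfies sufficient decrease.
alpha = 1.e-4;  lambda = 1;  m = 0;
f0 = norm(F(x));
while norm(F(x + lambda*d)) > (1 - alpha*lambda)*f0
    lambda = lambda/2;  m = m + 1;        % halve the step length
    if m > 20, error('line search failure'); end
end
x = x + lambda*d;                         % accept the step s = lambda*d
end

function lambdap = parab3pt(lambdac, lambdam, ff0, ffc, ffm)
% Safeguarded minimizer of the quadratic through (0, ff0),
% (lambdac, ffc), and (lambdam, ffm), where ff denotes ||F||^2.
c2 = lambdam*(ffc - ff0) - lambdac*(ffm - ff0);
if c2 >= 0
    lambdap = 0.5*lambdac;                % no interior minimizer; just halve
else
    c1 = lambdac^2*(ffm - ff0) - lambdam^2*(ffc - ff0);
    lambdap = -0.5*c1/c2;                 % minimizer of the parabola
    lambdap = min(max(lambdap, 0.1*lambdac), 0.5*lambdac);   % safeguard
end
end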
x will be the approximate solution on output. which we describe in Chapter 3).aA)||F(z)|| do A <— 0-A. If nsolg terminates successfully. the computation of the Newton direction d can be done with direct or iterative linear solvers.6. . Theorem 1.9.4 states this precisely. and the relative and absolute termination tolerances ra and rr. then the forcing term 77 is determined implicitly. If you use an iterative linear solver.rr\F(x)\ + ra. nsolg(z. the convergence is as fast as the quality of the linear solver permits. then the iteration converges to a solution and. then rj is proportional to the error in the Jacobian. while ||F(z)|| > r do Find d such that \\F'(x}d + F(x}\\ < rj\\F(x}\\ If no such d can be found. For example.1.1. Having computed the Newton direction. the function F. 77 is bounded away from one (in the sense of (1. The algorithm does not cover all aspects of a useful implementation. we compute a step length A and a step s = Ad so that the sufficient decrease condition (1. but is not necessary in practice if you use direct solvers. where a 6 [1/10. You'll need to make a decision about the forcing term in that case (or accept the defaults from a code like nsoli.1/2] is computed by minimizing the polynomial model of ||F(arn + Ad)||2. linear iterations. and the sequence {xn} remains bounded.22).4.10) is the termination criterion for that linear solver.5. then rj = 0 in exact arithmetic. end while x <— x + \d end while The theory for Algorithm nsolg is very satisfying. Failure of any of these loops to terminate reasonably rapidly indicates that something is wrong.22)). the Jacobians remain well conditioned throughout the iteration.F. when near the solution. 2. using either the Jacobian F'(x] or an approximation of it. The essential input arguments are the initial iterate x. m. Algorithm 1. A Basic Algorithm 13 freedom in Algorithm nsolg.Tr) Evaluate F(x). T <. It's standard in line search implementations to use a polynomial model like the one we described in section 1. The number of nonlinear iterations. A= l while \\F(x + Xd)\\ > (1 . If F is sufficiently smooth. The theoretical requirements on the forcing term 77 are that it be safely bounded away from one (1. We list some of the potential causes of failure in sections 1. Knowing about n helps you understand and apply the theory. If you use a direct solver. you do not need to provide one. If you use an approximate Jacobian and solve with a direct method. Within the algorithm. and changes in the step length all should be limited. then usually (1.7. if you solve the equation for the Newton step with a direct method.21) holds. terminate with failure. full steps (X = I) are taken for n sufficiently large.25. However. Introduction but not as generally as the results in [24.42. therefore. Difference approximations to the Jacobian are usually sufficiently accurate.14 Chapter 1. or a very good approximation (forward difference. use a forward difference approximation for Jacobian-vector products (with vectors that are natural for the problem) and. and steps. Assume that {xn} is given by Algorithm nsolg.4. • {xn} will be unbounded. which are natural directions for the problem.1 Warning! The theory for convergence of the inexact Newton-Armijo iteration is only valid if F'(xn). A good code will watch for this failure and respond by using a more accurate Jacobian or Jacobianvector product. The reason for this is that the success of the line search is very sensitive to the direction. 
While this can result in slow convergence when the iterations are near the root.1) be given. is very accurate. other methods are available and can sometimes overcome stagnation or. While the line search paradigm is the simplest way to find a solution if the initial iterate is far from a root. or • F'(xn) will become singular. residuals. for smooth F. such as the Newton-Krylov methods in Chapter 3. In particular.7. whereas differentiation in the directions of the iterations. the outcome can be much worse when far from a solution. at which the standard assumptions hold. F is Lipschitz continuously differentiate. find the solution that is appropriate to a physical problem.44]. .60]. Then {xn} converges to a root x* of F at which the standard assumptions hold.57]. is used to compute the step. but users of nonlinear solvers should be aware that the line search can fail. and {xn} and { \ \ F f ( x n ) ~ l | | } are bounded. Trust region globalization [24. A poor approximation to the Jacobian will cause the Newton step to be inaccurate.3). there are particularly hard problems [48] for which differentiation in the coordinate directions is very inaccurate. and homotopy methods [78] are three such alternatives. Let XQ e RN and a e (0. will usually (but not always) work well when far from a solution. there are only three possibilities for the iteration of Algorithm nsolg: • {xn} will converge to a solution :r*. The important thing that you should remember is that.36. pseudotransient continuation [19. Theorem 1. Sometimes methods like the secant and chord methods work fine with a line search when the initial iterate is far from a solution. in the case of many solutions. for example). 1. if XQ is far from x* there is no reason to expect the secant or chord method to converge. and the convergence behavior in the final phase of the iteration is that given by the local theory for inexact Newton methods (Theorem 1. The inexact Newton methods. 8. 60] rather than the line search method we use here. In some applications the initial iterate is known to be good. Thinking about the problem and the qualitative properties of the solution while choosing the initial iterate can ensure that the solver converges more rapidly and avoids solutions that are not the ones you want.8.5).7. Therefore.1 Human Time and Public Domain Codes When you select a nonlinear solver for your problem. The Newton-Krylov solvers we discuss in Chapter 3 are at present (2003) the solvers of choice for large problems on advanced computers.2 The Initial Iterate Picking an initial iterate at random (the famous "initial guess") is a bad idea. or FORTRAN) or high-performance computing environments. and Broyden's method become very attractive.7. NKSOL [13]. which are based on direct factorizations.6] and the NITSOL [59]. these algorithms are getting most of the attention from the people who build libraries. MINPACK and several other codes for solving nonlinear equations are available from the NETLIB repository at http://www.8. and KINSOL [75] codes are good implementations. C++.1. if you need support for other languages (meaning C. However. There is an implementation of Broyden's method in UNCMIN. you need to consider not only the computational cost (in CPU time and storage) but also YOUR TIME. there are several sources for public domain implementations of the algorithms in this book. The globalization is via a trust region approach [24. Some problems come with a good initial iterate. 1.netlib.to medium-scale production work. 
so methods like the chord. .1 are not an issue. The MATLAB implementation that accompanies this book requires much less storage and computation. The MINPACK [51] library is a suite of FORTRAN codes that includes an implementation of Newton's method for dense Jacobians. The MATLAB codes that accompany this book are a good start and can be used for small.org/. Some careful implementations can be found in the MINPACK and UNCMIN libraries. The SNES solver in the PETSc library [5. it is usually your job to create one that has as many properties of the solution as possible. the secant. The UNCMIN [65] library is based on the algorithms from [24] and includes a Newton-Armijo nonlinear equations solver. 1. However. The methods from Chapter 2. or you're an expert in this field.8 Things to Consider Here is a short list of things to think about when you select and use a nonlinear solver. have received less attention recently. This implementation is based on dense matrix methods. Unless your problem is very simple. since the problems with the line search discussed in section 1. your best bet is to use a public domain code. A fast code for your problem that takes ten years to write has little value. Things to Consider 15 1. Two examples of this are implicit methods for temporal integration (see section 2. • If AT is small and F is cheap. The items in the list above are not independent.2). 3.21]. Both methods avoid explicit computation of Jacobians. in which case one should try to exploit those data about the solution.4 Choosing a Solver The most important issues in selecting a solver are • the size of the problem. Low-storage Newton-Krylov methods. It is more common to have a little information about the solution in advance. If you know the signs of some components of the solution. 1. Introduction in which the initial iterate is the output of a predictor. storing a Jacobian is difficult and factoring one may be impossible. The reader in a hurry could use the outline below and probably do well. may be the only choice.8. • Sparse differencing can be done in considerable generality [20. be sure that the signs of the corresponding components of the initial iterate agree with those of the solution. then you must either reformulate the problem or find the storage for a direct method.3. make sure that any boundary conditions are reflected in your initial iterate. Even if the storage is available. • the cost of evaluating F and F'. so it is worth considerable effort to build a good preconditioner for an iterative method. A direct method is not always the best choice for a small problem.1. Integral equations. where problems such as differential equations are solved on a coarse mesh and the initial iterate for the solution on finer meshes is an interpolation of the solution from a coarser mesh. such as the example in sections 2. but usually require preconditioning (see sections 3.3 and 3. and nested iteration (see section 2.2. and • the way linear systems of equations will be solved. if your problem is a discretized differential equation.3 Computing the Newton Step If function and Jacobian evaluations are very costly. such as Newton-BiCGSTAB.2. you will save a significant amount of work in .1.8.6.8. are one type for which iterative methods perform better than direct methods even for problems with small numbers of unknowns and dense Jacobians. For example. though. the Newton-Krylov methods from Chapter 3 and Broyden's method from Chapter 4 are worth exploring. For very large problems. 
These methods are probably the optimal choice in terms of saving your time.7. 1. computing F' with forward differences and using direct solvers for linear algebra makes sense. If you can exploit sparsity in the Jacobian.3). factorization of the Jacobian is usually a poor choice for very large problems. The methods from Chapter 2 are a good choice.16 Chapter 1. and 4. If these efforts fail and the linear iteration fails to converge. you can use that matrix to build an incomplete factorization [62] preconditioner. including the ones that accompany this book. you may well have a nondifferentiable problem. If you can store F'. The internal MATLAB code numjac will do sparse differencing. The codes will behave unpredictably if your function is not Lipschitz continuously differentiable. for example. What Can Go Wrong? 17 the computation of the Jacobian and may be able to use a direct solver.1 Nonsmooth Functions Most nonlinear equation codes. when we discuss problems that are specific to a method for approximating the Newton direction. We discuss how to do this for banded Jacobians in section 2. or calls to other codes. These are some problems that can arise for all choices of methods. 1. If. — If you can't compute or store F' at all. you might be able to use a sparse differencing method to approximate F' and a sparse direct solver. 1. On the other hand. In this section we give some guidance that may help you troubleshoot your own solvers or interpret hard-to-understand results from solvers written by others. a vector norm.1 will help you choose a Krylov method. The discussion in section 3. — If F' is sparse. a nonsmooth nonlinearity can cause any of the failures listed in this section.m. If your function is close to a smooth function. • control structures like case or if-then-else that govern the value returned by F. • internal interpolations from tabulated data.9. the codes may do very well. .9 What Can Go Wrong? Even the best and most robust codes can (and do) fail in practice. or a fractional power. are intended to solve problems for which F' is Lipschitz continuous. but requires the sparsity pattern from you. then the matrix-free methods in Chapters 3 and 4 may be your only options. the code for your function contains • nondifferentiable functions such as the absolute value. If you can obtain the sparsity pattern easily and the computational cost of a direct factorization is acceptable. If you have a good preconditioner.1. • If AT is large or computing and storing F' is very expensive.3 and implement a banded differencing algorithm in nsold. you may not be able to use a direct method. We will also repeat some of these things in subsequent chapters.9. a Newton-Krylov code is a good start. a direct method is a very attractive choice. If F(x) = e~x. No solution If your problem has no solution. If F is a model of a physical problem. So. internal tolerances to algorithms within the computation of F may be too loose. errors in programming (a. while complex and often more costly.9. while technically correct. may have been realized in a way that destroys the solution. internal calculations based on table lookup and interpolation may be inaccurate. as stated in Theorem 1. there are alternatives to line search globalization that. if the iteration fails to converge to a root. The clear symptoms of this are divergence of the iteration or failure of the residual to converge to zero. which is not a root. so if one terminates when the step is small and fails to check that F is approaching zero. 
including the ones that accompany this book. Singular Jacobian The case where F' approaches singularity is particularly dangerous. Introduction 1. the model itself may be wrong.)|. and if-then-else constructs can make F nondifferentiable. does not imply that the iteration will converge. then any solver will have trouble. then the Newton iteration will diverge to +00 from any starting point. then either the iteration will become unbounded or the Jacobian will become singular. 60]. use a difference increment of « 10~7 for finite difference Jacobians and Jacobian-vector products. therefore.2 Failure to Converge The theory.k. the Newton-Armijo iteration will converge to 0. bugs) are the likely source.18 Chapter 1.4.a. There are public domain . If F(x) = x2 + 1. changing the difference increment in the solvers will usually solve this problem. only that nonconvergence can be identified easily. if necessary. one can incorrectly conclude that a root has been found. If the error in your function evaluation is larger than that. the minimum of |F(a. assume that the errors in the evaluation are on the order of machine roundoff and. Alternatives to Newton-Armijo If you find that a Newton-Armijo code fails for your problem.2 illustrates how an unfortunate choice of initial iterate can lead to this behavior. The algorithm for computing F.7. can be more robust than Newton-Armijo. Thinking about the errors in your function and. In this case the step lengths approach zero. homotopy [78]. Inaccurate function evaluation Most nonlinear solvers. Among these methods are trust region methods [24. For example. The causes in practice are less clear. the Newton direction can be poor enough for the iteration to fail. and pseudotransient continuation [44]. The example in section 2. an analytic Jacobian may make the line search perform much better. be sure that you have specified the correct sparsity pattern. which is a good choice unless the function contains components such as a table lookup or output from an instrument that would reduce the accuracy. then increase the difference increment in a difference Jacobian to roughly the square root of the errors in the function [42].3 only hold if the correct linear system is solved to high accuracy.1 that the theory for convergence of the Armijo rule depends on using the exact Jacobian.9. If you expect to see superlinear convergence. See section 3. • If you are using a sparse-matrix code to solve for the Newton step.3).1. for example). but not always. The difference increment in a forward difference approximation to a Jacobian or a Jacobian-vector product should be a bit more than the square root of the error in the function. but do not. or linear solver is inaccurate.7. the quality of the Newton direction is poor. What Can Go Wrong? 19 codes for the first two of these alternatives. If you're using a direct method to compute the Newton step. where the optimal increment is roughly the cube root of the error in the function. the chances are good that the Jacobian. sufficient. can improve the performance of the solver. A difference approximation to a Jacobian or Jacobian-vector product is usually.4. We repeat the caution from section 1. 1. twice that of a forward difference. • Check your computation of the Jacobian (by comparing it to a difference. Jacobian-vector product. you might try these things: • If the errors in F are significantly larger than floating point roundoff. 
Our codes use h = 10~7.1 and 1.3 Failure of the Line Search If the line search reduces the step size to an unacceptably small value and the Jacobian is not becoming singular.9. 1.9. Check for errors in the preconditioner and try to investigate its quality.2 for more about this problem. . is rarely justified. If these methods fail. Failure of the line search in a Newton—Krylov iteration may be a symptom of loss of orthogonality in the linear solver. Central difference approximations. • Make sure the tolerances for an iterative linear solver are set tightly enough to get the convergence you want. The local superlinear convergence results from Theorems 1. but for large problems the cost. you should see if you've made a modeling error and thus posed a problem with no solution. One should scale the finite difference increment to reflect the size of x (see section 2.4 Slow Convergence If you use Newton's method and observe slow convergence. No theory can say that the iteration will converge to the solution that you want. Your best option is to find a computer with more memory. you may not be able to store the data that the method needs to converge. you may not be able to store the factors that the sparse Gaussian elimination in MATLAB creates. and secant. so you do best if you can keep data in registers as long as possible.3. Modern computer architectures have complex memory hierarchies.20 Chapter 1. The discussion of loop ordering in [23] is a good place to start learning about efficient programming for computers with memory hierarchies. Many computing environments. Cache memory is faster than RAM. and below that is disk.4. Below the cache is RAM. m are MATLAB implementations of Newton's method. the chord method. The registers in the CPU are the fastest. The problems we discuss in sections 2. This is rarely acceptable. but in FORTRAN or C. The solvers we discuss in this book. GMRES needs a vector for each linear iteration. Below the registers can be several layers of cache memory.2 have multiple solutions. will tell you that there is not enough storage for your job.9. Introduction • If you are using a GMRES solver.9. but much more expensive. for example. Other computing environments solve run-time storage problems with virtual memory.7. 2. 1.6 Storage Problems If your problem is large and the Jacobian is dense. there is no guarantee that an equation has a unique solution.7. MATLAB. for scalar . When this happens.2).2. You probably don't have to think about cache in MATLAB. m. respectively. so a cache is small. 1. 1. are supported by the theory that says that either the solver will converge to a root or it will fail in some well-defined manner.9. you do. as well as the alternatives we listed in section 1.4. MATLAB among them. make sure that you have not lost orthogonality (see section 3. The Newton—Krylov methods and Broyden's method are good candidates for the latter. and 3. for example. newt sol.5 Multiple Solutions In general. or use a solver that requires less storage. will print this message: Out of memory.10 Three Codes for Scalar Equations Three simple codes for scalar equations illustrate the fundamental ideas well. This is called paging and will slow down the computation by factors of 100 or more. Even if you use an iterative method. and the secant method.6. If your Jacobian is sparse. you can find a way to obtain more memory or a larger computer. Type HELP MEMORY for your options. 
Simple things such as ordering loops to improve the locality of reference can speed up a code dramatically. you may be unable to store that Jacobian. chordsol. This means that data are sent to and from disk as the computation proceeds.m. tola.hist(:. Setting jdiff function / with two output arguments [y. jdiff is an optional argument. optionally. The output is the final result and (optionally) a history of the iteration.m lets you choose between evaluating the derivative with a forward difference (the default) and analytically in the function evaluation.1. The calling sequence is [x. The step-length reduction is done by halving. tolr). are the number of times the line search reduced the step size and the Newton sequence {xn}. The secant and chord method codes do not. jdiff). One MATLAB command will make a semilog plot of the residual history: semilogy(hist(:.7. tola. x = solver(x. The Newton's method code includes a line search. hist] = newtsol (x. The history is kept in a two. newtsol. Three Codes for Scalar Equations 21 equations. if you're not interested in the history array. The codes can be called as follows: [x. As codes for scalar equations. f. tola. but. The function f atan. tolr) or. hist] = solver(x.1 Common Features The three codes require an initial iterate or.1). The first column is the iteration counter and the second the absolute value of the residual after that iteration. they do not need to pay attention to numerical linear algebra or worry about storing the iteration history.2 newtsol. m is the only one of the scalar codes that uses a line search. f. one need not keep the iteration number in the history and our codes for systems do not. The most efficient way to write such a function is to only compute F' if it is requested.1 a bit too seriously. for Newton's method only. The third and fourth.m to expect a . not by the more sophisticated polynomial model based method used in the codes for systems of equations. taking the warning in section 1. for a simple example. 1. where y = F(x) and yp = F'(x). tolr. and relative and absolute residual tolerances tola and tolr.2)).10.10. Each of the scalar codes has a limit of 100 nonlinear iterations.10.m returns the arctan function and. They have features in common with the more elaborate codes from the rest of the book. f.yp]=f(x). Of course. the function /. doing so makes it as easy as possible to plot iteration statistics. 1. its derivative: = 1 directs newtsol. Here is an example.m newt sol.or four-column hist array. 3 chordsol.4.0000e+00 3. 7. The code below.:) ans = 0 l. After that.hist (2:5. m has four columns. The third column is the number of times the step size was reduced in the line search. (optionally) YP = 1/(1+X~2). 3.hist] = newtsoKxO.tol). full steps were taken.d-12. EXAMPLE Draw Figure 1. 7. This is the information we need to locate the circles and the numbers on the graph in Figure 1. Once we know that the line search is active only on iterations 1. end The history array for newt sol. tol=l.10. semilogy (hist (:.5730e+00 4.8549e+00 1.4.4547e+00 1. This allows you to make plots like Figure 1. 2. [Y. and 4. tol=l.3670e+00 The third column tell us that the step size was reduced for the first through fourth iterates. 'fatan'. if nargout == 2 yp = l/(l+x~2). 4.3170e+00 9. for example. 1.22 Chapter 1.0000e+00 1. [x.YP] = FATAN(X) returns Y\.4.3724e+00 1.3921e-01 0 3. 'fatan'. The fourth column contains the Newton sequence.0000e+00 2.0000e+00 S. ylabel('function absolute values').1) . 
To use the semilogy to plot circles when the line search was required in this example.4711e+00 1.m approximates the Jacobian at the initial iterate with a forward difference and uses that approximation for the entire nonlinear iteration.2)) . creates the plot in Figure 1.OOOOe+OO 4. 7. we can use rows 2. » hist(1:5. knowledge of the history of the iteration was needed. 'o') xlabel('iterations'). 3.m chordsol.yp] = fatan(x) 7.OOOOe+00 2. tol. Introduction function [y.2)) .atan(X) and '/. The calling sequence is . 1) .0000e+00 l. tol.hist] = newtsoKxO.4. » [x.OOOOe+01 -8. xO=10.=\. and 5 of the history array in the plot.abs(hist(2:5.9730e+00 -3.d-12. FATAN Arctangent function with optional derivative */. y = atan(x).tol).0000e+00 2.abs(hist(: . Here is a call to newt sol followed by an examination of the first five rows of the history array: » xO=10. 1.10. Three Codes for Scalar Equations 23 [x, hist] = chordsol (x, f, tola, tolr). The hist array has two columns, the iteration counter and the absolute value of the nonlinear residual. If you write / as you would for newtsol.m, with an optional second output argument, chordsol.m will accept it but won't exploit the analytic derivative. We invite the reader to extend chordsol.m to accept analytic derivatives; this is not hard to do by reusing some code from newtsol.m. 1.10.4 secant, m The secant method needs two approximations to x* to begin the iteration, secant .m uses the initial iterate XQ = x and then sets When stagnation takes place, a secant method code must take care to avoid division by zero in (1.8). secant .m does this by only updating the iteration if xn-i ^ xn. The calling sequence is the same as for chordsol.m: [x, hist] = secant(x, f, tola, tolr). The three codes newtsol.m, chordsol.m, and secant.m were used together in tandemo.m to create Figure 1.3, Table 1.1, and Figure 1.2. The script begins with initialization of the solvers and calls to all three: 7. EXAMPLE 7. Draw Figure 1.3. 7. xO=4.5; tol=l.d-20; 7. 7t Solve the problem three times. 7. [x,hist]=newtsol(xO,'ftan',tol,tol,l); [x,histc]=chordsol(xO,'ftan',tol,tol); [x,hists]=secant(xO,'ftan',tol,tol); 7. 7. Plot 15 iterations for all three methods. 7. maxit=15; semilogy(hist(l:maxit,l),abs(hist(l:maxit,2),'-',... histc(l:maxit,l),abs(histc(l:maxit,2)),'—',... hists(l:maxit,l),abs(hists(l:maxit,2)),'-.'); legend('Newton','Chord','Secant'); xlabel('Nonlinear iterations'); ylabelOAbsolute Nonlinear Residual'); 24 Chapter 1. Introduction 1.11 1.11.1 Projects Estimating the q-order One can examine the data in the itJiist array to estimate the q-order in the following way. If xn —> x* with q-order p, then one might hope that for some K > 0. If that happens, then, as n —» oo, and so Hence, by looking at the itJiist array, we can estimate p. This MATLAB code uses nsold.m to do exactly that for the functions f(x) = x — cos(a:) and f(x) = arctan(x). 7. QORDER a program to estimate the q-order 7, 7, Set nsold for Newton's method, tight tolerances. 7. xO = 1.0; parms = [40,1,0]; tol = [l.d-8,l.d-8] ; [x.histc] = nsold(xO, 'fcos' , tol, parms); lhc=length(histc(: , ) ; 2) 7, 7. Estimate the q-order. 7. qc = log(histc(2:lhc,l))./log(histc(l:lhc-l,D); 7. 7. Try it again with f(x) = atan(x) . 7. [x,histt] = nsold(xO, 'atan' , tol, parms) ; lht=length(histt(: ,2)) ; 7. 7. Estimate the q-order. 7. qt = log(histt(2:lht,l))./log(histt(l:lht-l,l)); If we examine the last few elements of the arrays qc and qt we should see a good estimate of the q-order until the iteration stagnates. 
The last three elements of qc are 3.8,2.4, and 2.1, as close to the quadratic convergence q-order of 2 as we're likely to see. For f(x) = arctan(o;), the residual at the end is 2 x 10-24, and the final four elements of qt are 3.7, 3.2, 3.2, and 3.1. In fact, the correct q-order for this problem is 3. Why? Apply this idea to the secant and chord methods for the example problems in this chapter. Try it for sin(ar) = 0 with an initial iterate of XQ = 3. Are the 1.11. Projects 25 estimated q-orders consistent with the theory? Can you explain the q-order that you observe for the secant method? 1.11.2 Singular Problems Solve F(x) = x2 = 0 with Newton's method, the chord method, and the secant method. Try the alternative iteration Can you explain your observations? This page intentionally left blank . one should expect to see q-quadratic convergence until finite-precision effects produce stagnation (as predicted in Theorem 1.. Jacobian factorization and storage of that factorization may be more expensive than a solution by iteration. of course.32. symmetry. for example).Chapter 2 Finding the Newton Step with Gaussian Elimination Direct methods for solving the equation for the Newton step are a good idea if • the Jacobian can be computed and stored efficiently and • the cost of the factorization of the Jacobian is not excessive or • iterative methods do not converge for your problem. positivity. 77 = 0 in Algorithm nsolg).2)..1 Direct Methods for Solving Linear Equations In this chapter we solve the equation for the Newton step with Gaussian elimination.e. However. One can.27.) [1. 27 . exchanging an increase in the number of nonlinear iterations for a dramatic reduction in the cost of the computation of the steps. The typical implementation of Gaussian elimination.74. called an LU factorization. As is standard in numerical linear algebra (see [23. .76]. 2..32. If the linear equation for the Newton step is solved exactly and the Jacobian is computed and factored with each nonlinear iteration (i. Even when direct methods work well. we distinguish between the factorization and the solve.23. factors the coefficient matrix A into a product of a permutation matrix and lower and upper triangular factors: The factorization may be simpler and less costly if the matrix has an advantageous structure (sparsity. approximate the Jacobian or evaluate it only a few times during the nonlinear iteration. direct methods are more robust than iterative methods and do not require your worrying about the possible convergence failure of an iterative method or preconditioning.74.76]. Most linear algebra software [1. where.u]=lu(a) 1= 5. Following the factorization. The factorization can fail if. highly ill conditioned.OOOOe+00 0 u = 7.0000e-01 0 0 l.0000e-01. in MATLAB. 2.OOOOe+00 l.27] manages the permutation for you in some way. The cost of the two triangular solves is N2 + O(N) flops. In MATLAB. F' is singular or. The factorization is the most expensive part of the solution. The cost of an LU factorization of an N x N matrix is N3/3 + O(N2) flops. a multiply.0000e+00 1. For example. for example.4286e+00 0 l.0000e+01 2. We will ignore the permutation for the remainder of this chapter. following [27]. and some address computations.u]=lu(A) returned by the MATLAB command is » El. .8571e-01 2. P is not explicitly referenced. Finding the Newton Step with Gaussian Elimination The permutation matrix reflects row interchanges that are done during the factorization to improve stability. 
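To make the factor/solve split concrete, here is a minimal MATLAB sketch of factoring once and then solving; the matrix and right-hand side are made up for illustration, and the three-output form of lu is used so that the permutation appears explicitly rather than being encoded in L.

% Factor once, then solve As = b with two triangular solves.
A = [4 3; 6 3]; b = [1; 2];   % made-up data
[L, U, P] = lu(A);            % PA = LU
z = L \ (P*b);                % forward substitution with the lower triangle
s = U \ z;                    % back substitution with the upper triangle
norm(A*s - b)                 % should be at the level of floating point roundoff

The point of keeping the factors is that, once L and U are available, additional right-hand sides cost only the two triangular solves.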
one can solve the linear system As = b by solving the two triangular systems Lz = b and Us = z.7143e-01 2. but the reader should remember that it is important. if the LU factorization [l.0000e+00 0 0 8.8571e-01 l.OOOOe+00 -2.28 Chapter 2.2 The Newton-Armijo Iteration Algorithm newton is an implementation of Newton's method that uses Gaussian elimination to compute the Newton step. we define a flop as an add. but is encoded in L. The significant contributors to the computational cost are the computation and LU factorization of the Jacobian. 3 Computing a Finite Difference Jacobian The effort in the computation of the Jacobian can be substantial. then approximating the Jacobian by differences is the only option. while ||F(a.2) than the evaluation of the function. this usually causes no problems in the nonlinear iteration and a forward difference approximation is probably sufficient. factor F'(x) = LU. The difference increment in (2. roughly the square root of the error in F. ra. In some cases one can compute the function and the Jacobian at the same time and the Jacobian costs little more (see the example in section 2.rr||F(x)|| + ra. One computes the forward difference approximation (V/l-F)(ar) to the Jacobian by columns.3. F. Computing a Finite Difference Jacobian 29 Algorithm 2.1) should be scaled. T <.1. if only function evaluations are available. x <— x + As Evaluate F(x). end while 2. As we said in Chapter 1. it can be crucial if \(x)j\ is very large. rr) Evaluate F(x). newton(or. The jth column is In (2. Rather than simply perturb a: by a difference increment /i.)|| > r do Compute F'(x). therefore. in each coordinate direction. we multiply the perturbation to compute the jth column by with a view toward varying the correct fraction of the low-order bits in (x)j. For example.7. Each column of V^F requires one new function evaluation and.2. a finite difference Jacobian costs N function evaluations.5. The difference increment h should be no smaller than the square root of the inaccuracy in F. However. also see section 2.1) 6j is the unit vector in the jth coordinate direction. .3. Note that we do not make adjustments if | (x) j \ is very small because the lower limit on the size of the difference increment is determined by the error in F. if the factorization fails then report an error and terminate else solve LUs = —F(x) end if Find a step length A using a polynomial model. While this scaling usually makes little difference. 30 Chapter 2. Finding the Newton Step with Gaussian Elimination if evaluations of F are accurate to 16 decimal digits. then only columns 1 and 2 depend on (x)i.. The Jacobian F' is banded with upper and lower bandwidths nu and HI if (F}i depends only on (x)j for For example. one can compute several columns of the Jacobian with a single new function evaluation. the difference increment should change roughly the last 8 digits of x. for which sign(O) = 0. The LU factorization of a banded matrix takes less time and less storage than that of a full matrix [23].21] are too complex for this book. Continuing in this way. fourth. we can let and compute From D^F we can recover the first. The cost of the factorization. where In (2. The methods for doing this for general sparsity patterns [20. The cost estimates for a difference Jacobian change if F' is sparse. If F' is tridiagonal. columns of V^-F from D^F as . but we can illustrate the ideas with a forward difference algorithm for banded Jacobians. If F' is sparse. 
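To make the column-by-column construction and the scaling of the difference increment concrete, here is a minimal dense forward-difference Jacobian in the spirit of the diffjac and dirder functions inside nsold.m. This is only a hedged sketch; the increment 1.d-7 and the treatment of zero components are assumptions consistent with the discussion above, and none of nsold.m's safeguards are included.

function jac = fdjac(f, x, f0)
% FDJAC  Minimal dense forward difference Jacobian sketch.
%        f0 = f(x) is passed in so it need not be recomputed.
n = length(x);
jac = zeros(n, n);
for j = 1:n
    h = 1.d-7;                                % roughly sqrt of the error in f
    if x(j) ~= 0
        h = h*max(abs(x(j)), 1)*sign(x(j));   % perturb the low-order bits of (x)_j
    end
    xp = x; xp(j) = xp(j) + h;
    jac(:, j) = (feval(f, xp) - f0)/h;        % jth column of the difference Jacobian
end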
when n/ and nu are small in comparison to JV.2) This is different from the MATLAB sign function. as does the cost of the factorization. The MATLAB sparse matrix commands exploit this structure. A matrix A is banded with upper bandwidth nu and lower bandwidth nl if Aij = Q i f j < i — niOYJ>i + n u . Since (F)k for k > 4 is completely independent of any variables upon which (F)i or (F)<2 depend. then one can compute a numerical Jacobian several columns at a time. is 2Nninu(l + o(l)) floating point operations. The factors have at most ni + nu + 1 nonzeros.. If F' is banded. if F' is tridiagonal. Hence we use the scaled perturbation (Tjh. . HI = nu — I. we can differentiate F with respect to (rr)i and (0^)4 at the same time. 3. x = function and point 7.nl. Repeat the process with to compute the final third of the columns. jac = sparse(n. Our nsold. fifth. Continuing in this way we define pk for 1 < k < 1 + nu + nu by where there are k — I zeros before the first one and HI + nu zeros between the ones. columns. 7t Inputs: f. precomputed function value 7e nl. For a general banded matrix. Hence the next admissible coordinate for perturbation is 2 + ni + nu. then (F)k depends on (x)i for 1 < k < 1 + nj.5) to obtain the second. we can compute the forward difference Jacobian with 1 + HI + nu perturbations. So we can compute the forward difference approximations of dF/d(x}\ and dF/d(x)2+nu+nu with a single perturbation. fO = f(x). . BANDJAC Compute a banded Jacobian f (x) by forward differences. Hence a tridiagonal Jacobian can be approximated with differences using only three new function evaluations. nu = lower and upper bandwidth 7. we cannot perturb in any other direction that influences any (F)k that depends on (or)i. n = length(x). the bookkeeping is a bit more complicated. When MATLAB factors a matrix in this format.fO. .x.n). function jac = bandjac(f. but the central idea is the same. 7. Computing a Finite Difference Jacobian follows: 31 We can compute the remainder of the Jacobian after only two more evaluations.nu) 7. If the upper and lower bandwidths are nu < N and HI < N...2. The matrix is stored in MATLAB's sparse format.4) and (2.m solver uses this algorithm if the upper and lower bandwidths are given as input arguments. If we set we can use formulas analogous to (2. By using the vectors {p^} as the differencing directions. If we perturb in the first coordinate direction. it uses efficient factorization and storage methods for banded matrices. end 7. 7. 7. ist = delr(ist)+l. 7. pt = zeros(n. 7. 7» Compute the forward difference. m = iht-ilt. We'll need delr(l) new function evaluations. epsnew = l. % Fill the appropriate columns of the Jacobian. end 7.l). while ist <= n ilt = il(ist). 7. Finding the Newton Step with Gaussian Elimination dv = zeros(n. il(ip) = range of indices that influence f(ip) for ip = l:n delr(ip) = min([nl+nu+ip. xl = x+epsnew*pt. Sweep through the delr(l) perturbations of f. . il(ip) = max([ip-nu.n]). 7. fl = feval(f. 7. ist = delr(ist)+l.l). while ist <= n pt(ist) = 1.l]). perturbation vector pt 7.d-7. 7o Build the perturbation vector.n]). 7. end end 7. 7o delr(ip)+l = next row to include after ip in the 7. 7. for is = l:delr(l) ist = is. 7. jac(ilt:iht.ist) = dv(ilt:iht). ih(ip). dv = (fl-fO)/epsnew. iht = ih(ist). 7. ist = is.xl).32 Chapter 2. ih(ip) = min([ip+nl. 8. the overall cost of the solve will usually be much less. .2. if the factorization fails then report an error and terminate else while \\F(x)\\ > T do Solve LUs = -F(x). 
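% As a quick sanity check of the banded differencing idea, one can compare
% bandjac's output with an analytic tridiagonal Jacobian. This is a hedged
% sketch: the test function and dimension are made up, and it assumes
% bandjac.m from the listing above is available as a file on the path.
n = 8;
T = spdiags([-ones(n,1) 3*ones(n,1) -ones(n,1)], -1:1, n, n);
f = @(x) T*x + x.^3;                 % made-up function with a tridiagonal Jacobian
x = ones(n, 1);
jac = bandjac(f, x, f(x), 1, 1);     % nl = nu = 1: three perturbations in all
jexact = T + spdiags(3*x.^2, 0, n, n);
norm(full(jac - jexact))             % should be roughly the size of the difference increment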
numjac was designed to work with the stiff ordinary differential equation integrators [68] in MATLAB.4 The Chord and Shamanskii Methods If the computational cost of a forward difference Jacobian is high (F is expensive and/or N is large) and if an analytic Jacobian is not available. where the Jacobian may not be updated for several time steps.2.ra. The advantages of the chord method increase as N increases. it is wise to amortize this cost over several nonlinear iterations. factor F'(x] = LU. let you input a general sparsity pattern for the Jacobian and then use a sophisticated sparse differencing algorithm. chord(x. Algorithms chord and Shamanskii are special cases of nsolg. X «— X + S Evaluate F(x). while the convergence is q-linear and more nonlinear iterations will be needed than for Newton's method. Compute F'(x). The Chord and Shamanskii Methods 33 The internal MATLAB code numjac is a more general finite difference Jacobian code.rr) Evaluate F(x). and the computation of the step is based on an LU factorization of F'(x) at an iterate that is generally not the current one. The chord method is the solver of choice in many codes for stiff initial value problems [3.61]. so the step and the direction are the same. Recall that the chord method differs from Newton's method in that the evaluation and factorization of the Jacobian are done only once for F'(X0). So. Global convergence problems have been ignored.rr\F(x}\ + ra. The chord method from section 1. Here the Jacobian factorization and matrix function evaluation are done after every ra computations of the step.F. numjac will. end while end if A middle ground is the Shamanskii method [66].3 does exactly that. Algorithm 2. for example. 2. since both the N function evaluations and the O(N3) work (in the dense matrix case) in the matrix factorization are done only once.4. T <. of course. rr. 2. the convergence may be slower than you'd like. end for end while If one counts as a complete iteration the full m steps between Jacobian computations and factorizations. F... to which a code will switch after a Newton-Armijo iteration has resolved any global convergence problems. Our code nsold. .)|| < r terminate. x <— x + s Evaluate F(x)\ if ||F(a.34 Chapter 2.1 Poor Jacobians The chord method and other methods that amortize factorizations over many nonlinear iterations perform well because factorizations are done infrequently. Finding the Newton Step with Gaussian Elimination Algorithm 2. You should think of the chord and Shamanskii methods as local algorithms.5. ra. but it's worth thinking about a few specific problems that can arise when you compute the Newton step with a direct method. the Jacobians will be accurate enough for the overall performance to be far better than a Newton iteration. for some K > 0. if the initial iterate is good.3. 2. but. then the line search can fail. if you use an approximation to the Jacobian. However. Compute F'(x). if your initial iterate is far from a solution. if the factorization fails then report an error and terminate end if for p = I : m do Solve LUs = -F(x).e. this inaccuracy can cause a line search to fail.6) watches for these problems and updates the Jacobian if either the line search fails or the rate of reduction in the nonlinear residual is too slow. Even if the initial iterate is acceptable. i.m (see section 2. is the m = 1 case. shamanskii(j. factor F'(x) = LU. This means that the Jacobians will be inaccurate. Newton's method. The major point to remember is that. 
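To make Algorithm chord concrete, here is a minimal MATLAB sketch of the local iteration. Here f, jac0 (the Jacobian at the initial iterate), x0, atol, and rtol are assumed to be given; all of the safeguards in nsold.m, including the limit on the number of iterations, are omitted.

% Chord iteration sketch: one factorization, reused for every step.
x = x0; fx = feval(f, x);
[L, U] = lu(jac0);                  % factor F'(x0) once
tau = atol + rtol*norm(fx);         % termination criterion from Chapter 1
while norm(fx) > tau
    s  = -(U \ (L \ fx));           % triangular solves only; no new factorizations
    x  = x + s;
    fx = feval(f, x);
end

The same pattern, with a new factorization every m steps, gives the Shamanskii method.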
the Shamanskii method converges q-superlinearly with q-order ra+ 1. ra) while ||F(x)|| > r do Evaluate F(x).9 is complete.5 What Can Go Wrong? The list in section 1. r <— rr|F(x)| + ra. you may have the option to compute a sparse factorization without pivoting. assume that the error in the function is on the order of floating point roundoff. Automatic differentiation software takes as its input a code for F and produces a code for F and F'. it's probably a good idea to re-enable it. For general F'.tol.m is to try to avoid computation of the Jacobian and. 2. Using nsold. If.9.3 Pivoting If F' is sparse. If the components of x differ dramatically in size. Using nsold.m 2. F' is symmetric and positive definite. this is the way to proceed. If F is smooth and can be evaluated for complex arguments.9. If line search fails and you have disabled pivoting in your sparse factorization.6 [sol.3 and 1.2 Finite Difference Jacobian Error 35 The choice of finite difference increment h deserves some thought. for example. then you can get a second-order accurate derivative with a single function evaluation by using the formula One should use (2. the difference increment must be adjusted to reflect that. 2. of course. Switching to centered differences can also help. Another approach [49.m is a Newton-Armijo code that uses Gaussian elimination to compute the Newton step. One other approach to more accurate derivatives is automatic differentiation [34]. For sparse problems.6) with some care if there are errors in F and.5.5. if the reduction in the norm of the nonlinear residual is large enough (a factor of two). one should scale h. If that assumption is not valid for your problem. but the codes are usually less efficient and larger than a hand-coded Jacobian program would be. The derivatives are exact. including ours. however. x_hist] = nsold(x. the cost of pivoting can be large and it is tempting to avoid it.4 that the difference increment in a forward difference approximation to a Jacobian or a Jacobian-vector product should be a bit more than the square root of the error in the function.73] uses complex arithmetic to get higher order accuracy. consider a change of independent variables to rescale them. You were warned in sections 1.f. The calling sequence is The default behavior of nsold.2. Most codes.6.1). The . pivoting can be essential for a factorization to produce useful solutions. Check that you have scaled the difference increment to reflect the size of x. but the cost of a centered difference Jacobian is very high.m becomes the chord method once the iteration is near the solution.m it_hist. This means that nsold. Automatic differentiation software for C and FORTRAN is available from Argonne National Laboratory [38]. as we did in (2. ierr. not to update the Jacobian and to reuse the factorization.parms). nsold. 12).m from section 2. the function /.e. The components of parms are maxit is the upper limit on the nonlinear iteration. Finding the Newton Step with Gaussian Elimination reader was warned in section 1.36 Chapter 2.m from section 1.10.6. the default is 40. All our codes expect x and / to be column vectors of the same length.j acobian]=f(x). you must compute the Jacobian and store it as a MATLAB sparse matrix.e. nsold.r r ) contains the tolerances for the termination criterion (1. when the iteration is near a solution that satisfies the standard assumptions).7. which is usually enough. The parms array controls the details of the iteration. 
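As an aside on the accuracy questions in section 2.5: when F is smooth and its MATLAB code accepts complex arguments, the complex-arithmetic approach of [49,73] delivers derivatives accurate to near machine precision from a single function evaluation, via F'(x)w = Im(F(x + ihw))/h + O(h^2). Here is a minimal sketch; the test function and the increment are assumptions made for illustration.

% Complex-step directional derivative sketch (scalar case).
f  = @(x) atan(x);                 % made-up test function
x  = 2; h = 1.d-20;                % h can be tiny; there is no subtractive cancellation
fp = imag(f(x + 1i*h))/h;
abs(fp - 1/(1 + x^2))              % error is at the level of floating point roundoff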
so the Jacobian is updated only if the decrease in the nonlinear residual is not sufficiently rapid. the vector tol = (r a . The syntax for the function / is function=f(x) or [function.5 or the line search fails. The H-equation code heq.5.1 that this strategy could defeat the line search. So. In this way the risk (see section 1. when the iteration is far from the solution) of the iteration and that it is almost never updated in the local phase (i. The next parameter controls the computation of the Jacobian. and the tolerances for termination. and you want to use the MATLAB sparse matrix functions. for example.2 is a simpler example.2). If you can provide an analytic Jacobian (using the optional second . but not banded. 2. The scalar function f atan.. If nsold. it is generally faster if you do that rather than let nsold compute the Jacobian as a full or banded matrix with a forward difference.6.1 Input to nsold. isham — 1 and rsham = 0 is Newton's method. In practice this means that the Jacobian is almost always updated in the global phase (i.. If your Jacobian is sparse.3 in the software collection is a nontrivial example of a function with an optional Jacobian.m The required input data are an initial iterate x. A forward difference approximation (jdiff = 1) is the default.7. The default is isham = 1000 and rsham — 0. then a forward difference Jacobian is computed and factored only if the ratio ||F(xn)||/||F(o:n_i)|| > 0. The Jacobian is computed and factored after every isham nonlinear iterations or whenever the ratio of successive norms of the nonlinear residual is larger than rsham. If it is easy for you to compute a Jacobian analytically.m is called with no optional arguments.1) of using an out-of-date Jacobian when far from a solution is reduced. As in all our codes. You can leave this argument out if you want a difference Jacobian and you are not using the banded Jacobian factorization.m takes this danger into account by updating the Jacobian if the reduction in the norm of the residual is too small or if the line search fails (see section 2.7. 3 the difference Jacobian computation takes more time than the rest of the solve! We also give simple examples of how one can use the solver from the command line. One can use xJiist to create figures like Figure 2. For the H-equation example in section 2. Examples 37 output argument to the function). can expend all of MATLAB's storage. .2. Automatic differentiation (see section 2.21). an error flag.1.7 Examples The purposes of these examples are to illustrate the use of nsold. sometimes more than is worthwhile. and ierr = 2. 2. indicating that we provide an analytic Jacobian.2 Output from nsold.5. If your Jacobian is sparse.2) is a different way to obtain exact Jacobian information. If your Jacobian is banded.6. These can be left out for full Jacobians. a history of the iteration.m The outputs are the solution sol and. The first is the J2-norm of the nonlinear residual and the second is the number of step-size reductions done in the line search. Analytic Jacobians almost always make the solver more efficient. MATLAB will automatically use a sparse factorization. The history array itJiist has two columns. The failure modes are ierr = 1. which means that the termination criterion is not met after maxit iterations. give the lower and upper bandwidths to nsold.m and to compare the pure Newton's method with the default strategy. 
which means that the step length was reduced 20 times in the line search without satisfaction of the sufficient decrease condition (1.m twice. 2. for example. We provide codes for each example that call nsold. Be warned: asking for the iteration history.7. The error flag is useful. We invite the reader to try jdiff = I. and the entire sequence {xn}. but require human effort. but also requires some human and computational effort. set jdiff = 0.7. if nsold is used within a larger code and one needs a test for success. once with the default iteration parameters and once with the parameters for Newton's method Note that the parameter jdiff = 0. The sequence of iterates is useful for making movies or generating figures like Figure 2. {xn} stored in columns of the array x-hist.m as the last two parameters. The limit of 20 can be changed with an internal parameter maxarm in the code.1. optionally. The error flag ierr is 0 if the nonlinear iteration terminates successfully. Run the code and compare the plots yourself. » tol=[l. » [sol. params). Why is that? » xO=10.d-2].3724e+00 1.m with ra = rr = 10~6 and compares the iteration histories graphically.0]. The alert reader will see that the solution and the residual norm are the same to five significant figures. tol.0000e+00 0 0 0 0 0 0 .m. With an initial iterate of X0 = 10. The MATLAB code atandemo. The line search in nsold. therefore.4.3920e-01 9.87116-01 7. even this small problem is difficult for the solver and the step length is reduced many times. » params=[40.OOOOe+OO 3. » sol sol 9.3170e+00 9.1402e-01 1.4711e+00 1.38 Chapter 2. The function only computes a Jacobian if there are two output arguments.hist]=nsold(x. The columns in the hist array are the residual norms and the number of times the line search reduced the step length.66056-04 0 S.l. In the lines below we apply Newton's method with coarse tolerances and report the solutions and iteration history. One can run the solver from the command line to get a feel for its operation and its output.0000e+00 2. the iteration history for Newton's method is a bit different from that in Figure 1.1 Arctangent Function This is a simple example to show how a function should be built for nsold. 0. It takes several iterations before nsold's default mode stops updating the Jacobian and the two iterations begin to differ.m solves this problem using nsold.8343e-01 5.0000e+00 2.4547e+00 1.12786-01 9. 'fatan'.2507e-01 8.7.m uses the polynomial model and.6605e-04 » hist hist = 1. 1.d-2. Finding the Newton Step with Gaussian Elimination 2. '/.m and the code that generated Figure 2.5)T. SIMPDEMO '/.vr]. In this example ra = rr = 10~6.9. For XQ = (2. This program solves the simple two-dimensional °/. f(l)=x(l)*x(l)+x(2)*x(2) .7. The iteration that stagnates converges..2 A Simple Two-Dimensional Example This example is from [24]. f=zeros(2. 2*x(2)].m.1. jac]=simple(x) */. '/. problem in Chapter 2 and makes Figure 2. Here N = 2 and This function is simple enough for us to put the MATLAB code that computes the function and Jacobian here.7. v=[vl. where the Jacobian is singular. .2. exp(x(l)-l). Full steps were taken after that. m. SIMPLE simple two-dimensional problem with interesting */.l:. If XQ = (3.2. '/.1. We investigated two initial iterates. function [f. This is an interesting example because the iteration can stagnate at a point where F'(x) is singular. % Return the Jacobian if it's needed.0. end The MATLAB code for this function is simple .5:4:40. y.25. This is a fragment from simpdemo. 2*x(2). '/. '/. 
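A compact way to see the difference between the default strategy and a pure Newton iteration is to call nsold.m twice and overlay the residual histories, in the spirit of the demo codes. This is a hedged sketch using the arctangent example; the tolerances and plotting details are assumptions.

x0  = 10; tol = [1.d-6, 1.d-6];
[s1, h1] = nsold(x0, 'fatan', tol);                % default parameters
[s2, h2] = nsold(x0, 'fatan', tol, [40, 1, 0]);    % isham = 1, rsham = 0: Newton's method
semilogy(0:length(h1(:,1))-1, h1(:,1), '-o', ...
         0:length(h2(:,1))-1, h2(:,1), '-*');
xlabel('Nonlinear iterations'); ylabel('Absolute Nonlinear Residual');
legend('default', 'Newton');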
In Figure 2. the line search will fail and the stagnation point will be near the x(l)-axis. v=[. the step length was reduced twice on the first iteration. if nargout == 2 jac=[2*x(l). v=. perhaps by evaluating F (see section 1. but not to a root! Line search codes that terminate when the step is small should also check that the solution is an approximate root.5:2.1 is simpdemo.1 we plot the iteration history for both choices of initial iterate on a contour plot of ||F||. % vl=. Examples 39 2. tol=[l.l).2).l.5)T. Create the mesh for the contour plot of II f I I .d-6. global convergence behavior '/. f(2)=exp(x(l)-l) + x(2)*x(2) .2.d-6]. Here's the code that produced Figure 2.5:2:40]. vr=2:4:40. 7. 1. Newton's method 7. Finding the Newton Step with Gaussian Elimination Figure 2. end end '/. xl is a poor initial iterate. params). 'simple'. xO is a good initial iterate. '/..m.1. errsn2. tol. '/.5]'. for i=l:n for j=l:n w=[xr(i). ierrn2.5]'. xr=-5:. xO=[2.j)=norm(simple(w)).40 Chapter 2. 7. errsn. params= [40.0]. at a point where F' is singular. The iteration from xl will stagnate 7. params). Solution of two-dimensional example with nsold. 7o Draw a contour plot of II f I I . z=zeros(n. 7.n).xr(j)]'. 7. z(i. x_hist2]=nsold(xl. [sn. ierrn. x_hist]=nsold(xO. . tol. 7. 7. [sn2. 0. n=length(xr). xl=[3. 'simple'.2:5. '/. :).7. x_hist2(l. Once A is stored.8) in a more compact form. ylabel('x_2'). Let A be the matrix Our program heqdemo.'-*'.m for solving the H-equation stores -A as a MATLAB global variable and uses it in the evaluation of both F and F1. F(x) can be rapidly evaluated as The Jacobian is given by .17] is This equation arises in radiative transfer theory. xlabel('x_l'). The resulting discrete problem is We will express (2.xr. The algorithms and initial iterates we use in this book find the solution that is of interest physically [46]. legend('Convergence'. '/. axis([0 5-5 5]) 2.2. J -o').x_hist2(2.:). */.7. Use the x_hist array to plot the iterations on the contour plot.:).'Stagnation').v) hold '/. plot(x_hist(l.:). There are two solutions unless c = 0 or c = 1.3 Chandrasekhar H-equation The Chandrasekhar H-equation [15.x_hist(2. Examples 41 figured) contour(xr. Can you find the other one? We will approximate the integrals by the composite midpoint rule: where ni = (i — 1/2) /N for 1 < i < N.z. mu=(mu-. 7..d-6].5)/n. % Be sure to store the correct data in the global array A_heq. h=ones(n. % global A_heq. if nargout==2 hj ac=(ph. once F has been computed.hjac]=heq(x) % HEQ Chandrasekhar H-equation residual */.l)*mu'. function [h. % Set the nodal points for the midpoint rule. tol=[l. mu=l:n. 7. x=ones(n.*ph)*ones(1. 7. Jacobian uses precomputed data for fast evaluation. n=length(x). cc=./(A_heq+A_heq'). 7. 7.n). A_heq=ones(n./h. hj ac=eye(n)-hj ac. 7.2. 7. Solve the H-equation with the default parameters in nsold and plot 7o the results. The output is the plot in Figure 2. Notice how the analytic Jacobian appears in the argument list.9.. 7. 7. global A_heq.l). and c = 0. hj ac=A_heq.5*c/n. m solves this equation with initial iterate XQ = (1. ph=ones(n. ra = rr = 10~6. n=100. . 1)T.m. 7.42 Chapter 2. mu=mu'. A_heq=cc*A_heq'. The MATLAB code for the H-equation is heq. 7. Finding the Newton Step with Gaussian Elimination Hence. h=x-ph..*h j ac.l. Form and store the kernel of the integral operator in a 7o global variable..d-6. N = 100.1)-(A_heq*x).l). c=. HEQDEMO This program creates the H-equation example in Chapter 2. there is almost no new computation needed to obtain F'.9. 
end The MATLAB code heqdemo. is not interesting. an exercise from [3]. ierrd]=nsold(x. Solution of the H-equation for c = 0. Use the default parameters in nsold. We begin by converting (2. shows how to use the banded differencing algorithm from section 2. errsd.7.m 7. 7o Plot the results.9. 2. % '/. 'heq'. We seek v € C2([0.\mu'). 7. tol). 'Rotation' .9) to a first-order system for . ylabel('H'.7. xlabel(. [hc.he).2.1. so the objective is to find a nonzero solution. for example. plot(gr. This is a very easy problem and the Jacobian is only computed and factored once with the default settings of the parameters. 7. Things are somewhat different with. c= 1 and rsham = 0. One.1) .20]) such that This problem has at least two solutions.3. Examples 43 Figure 2.4 A Two-Point Boundary Value Problem This example.2. v = 0. 1. Problem 7. It is a direct translation of the formulas above.40] on an equally spaced mesh {*»}£=!. The discretization approximates the differential equation with the IN — 2 equations for Ui « U(U) The boundary data provide the remaining two equations We can express the problem for {f/i}^:1 as a nonlinear equation F(x) = 0 with a banded Jacobian of upper and lower bandwidth two by grouping the unknowns at the same point on the mesh: In this way (z)2i+i ~ v(ti) and (x)^ ~ v'(ti). */. BVPSYS Two-point BVP for two unknown functions 7. page 187 in [3] .4. The boundary conditions are the first and last equations w'i = u-2 is expressed in the odd components of F as for 1 < i < N .44 Chapter 2.m.1) * h for 1 < i < N and h = 2Q/(N . The even components of F are the discretization of the original differential equation Here and The MATLAB code for the nonlinear function is bvpsys. Finding the Newton Step with Gaussian Elimination The equation for U is We will discretize this problem with the trapezoid rule [3. 7. where U = (i .1). l). cof=4. 7. with the line search being active for three of the nine iterations. The division by zero really doesn't happen. Set the boundary conditions.l). 7. 7. rhs=cof. 7. The zero solution is easy to find. fb(2:2:n2)=f2. Separate v and v' from their storage in u. % function fb=bvpsys(u) global L n2=length(u). h=L/(n-l). 7. but you may not get the same solution each time! Run the code bvp2demo. Examples 7.*vp + (r. fb=zeros(n2. 7. fb(l:2:n2-l)=fl. v(L) = 0. 7.m. cof(l)=0.5*(vp(2:n)+vp(l:n-D). bvp2demo. 7. 7. vp=u(2:2:n2). too. f2(l:n-l)= vp(2:n)-vp(l:n-l) + h*.m plots v and v' as functions of*. 45 Calling nsold./cof.m is equally straightforward.2. cof(l)=l. This script solves the system of two-point boundary value 7.7. r=r'*h. fl(l)=vp(l). problems in Chapter 2 with nsold. fl=zeros(n. We can find a nonzero solution using the initial iterate The solver struggles. cof=r.*v.1) v 4t 7.*v . v=u(l:2:n2-l). r=0:n-l.l). 7. 7. BVP2DEMO 7. n=n2/2.3. f2=zeros(n.5*(rhs(2:n)+rhs(l:n-l)). 7.m. v" = ( / ) v' + (t v .l). Fix it up. We plot that solution in Figure 2. fl(2:n)= v(2:n)-v(l:n-l)-h*. 7. 7. v>(0) = 0 f2(n)=v(n). 7. . and then change the initial iterate to the zero vector and see what happens. 7. nh=n/2. vp=-. This choice of initial iterate gives the "correct" result. [sol. 7.'v\prime'). 0. watch Newton find a different solution! % v=exp(-r. u(l:2:n-l)=v.*r*. xlabel('t').vp. r=0:nh-l. 7. n=800. 7. u(2:2:n)=vp. 2. Try different initial iterates and 7. The upper and lower bandwidths are both 2.l. Solution of (2. ierr] = nsold(u.parms). tol=[l. Finding the Newton Step with Gaussian Elimination Figure 2. u=zeros(n.l).d-12. parms= [ 0 1.*v.3.tol. 1.'-'. 
Use Newton's method. it_hist plot(r. v=sol(l:2:n-l).2*r.d-12].46 Chapter 2. global L L=20. it_hist. 2] . 7. legend('v'.'bvpsys'. 4. 7. r=r'*h. vp=sol(2:2:n).l). 7.v.r. h=L/(nh-l). .9).'--'). Our discretization in space is the standard central difference approximation to the second derivative with homogeneous Dirichlet boundary conditions.1. Hence the solver sees a different function (varying un~l and h) at each time step. usually something like (1.5 Stiff Initial Value Problems Nonlinear solvers are important parts of codes for stiff initial value problems. This eliminates the need to evaluate the function only to verify a termination condition. The initial iterate is usually either UQ = un~l or a linear predictor C/o = 2un~1 —un~2. where un solves the nonlinear equation The nonlinear solver is given the function and an initial iterate. a nonlinear solver must be used at each time step. consider the nonlinear parabolic problem with boundary data and initial data We solve this on a spatial mesh with width 6X = 1/64 and use a time step of dt = 0. the Jacobian is updated very infrequently—rarely at every time step and certainly not at every nonlinear iteration.7. but is usually very robust.8. very small time steps must be taken. The discretized problem is a stiff system of 63 ordinary differential equations.2.67].7. stiffness means that either implicit methods must be used to integrate in time or. in the case of an explicit method. In most modern codes [3. The time step h depends on n in any modern initial value problem code. Similarly. In general terms [3. To solve the initial value problem with the implicit Euler method. The unknowns are approximations to u(xi. we specify a time step 6t and approximate the value of the solution at the mesh point n8t by un. If the problem is nonlinear.tn) for the interior nodes {xi}i=i = {ifix}f=i and times {ti}i£i = {i8t}i=i.61] the termination criterion is based on small step lengths. . As an example. We refer the reader to the literature for a complete account of how nonlinear solvers are managed in initial value problem codes and focus here on a very basic example. This combination can lead to problems. Examples 47 2.17). The most elementary example is the implicit Euler method. so we can store the time history of the . The code timedep. timedep. 7. 7.0) =0. calling nsold. n=length(u). to the nonlinear residual as MATLAB global variables. The Jacobian is tridiagonal. use the banded differencing function. Finding the Newton Step with Gaussian Elimination For a given time step n and time increment 6t.m integrates the initial value problem. we use the banded difference Jacobian approximation in nsold. 7.uold) . 7. 7.m are given by for 1 < i < N = 63. This code has the zero boundary conditions built in. The value of u at the current time and the time step are passed 7. d2u(2:n)=d2u(2:n)-u(l:n-l). Newton's method is used 7. 7. with the backward Euler discretization. d2u=2*u.t) = u(l. d2u is the numerical negative second derivative. All of this is encoded in the MATLAB code ftime. 7. u(x.d2u). d2u(l:n-l)=d2u(l:n-l)-u(2:n). so we 7. FTIME '/. while computing it analytically is easy. function ft=ftime(u) global uold dt 7. d2u=d2u/OT2). h=l/(n+l). for the nonlinear solver. 7t Nonlinear residual for implicit Euler discretization 7.m generates the time-space plot of the solution in Figure 2. the components of the function F sent to nsold. 7. The Jacobian is tridiagonal and.dt * (exp(u) . u_t = u_xx + exp(u). 7. 7.m. This problem is 1-D. 
Nonlinear residual for time-dependent problem in Chapter 2 7. u(0. 0 < t < 1 7.t) = 0.and superdiagonals.m with MATLAB global variables. 7o TIMEDEP This code solves the nonlinear parabolic pde 7. 7.4. We pass the time step and un~l to ftime.48 Chapter 2. ft=(u . 7. The discrete second derivative D^ is the tridiagonal matrix with —2 along the diagonal and 1 along the sub.m. The time step and solution are passed as globals.m at each time step. ierr] =nsold (uold. 7. end 7. Newton's method. for it=l:nt-l [unew. nx=63. it_hist. . uold=zeros(nx. global uold dt dt-. uold=unew. Solution of (2.l. 1.l. tval=0:dt:1. Plot the results.4. Use tight tolerances.2. uhist(2:nx+1.nt). tol=[l.d-6. 7. parms=[40.13).1). 7. mesh(tval. nt=l+l/dt. 7.xval. 1. Examples 49 '/. integration and draw a surface plot. Figure 2. parms) .uhist) y. 1. 1].d-6].7. and a tridiagonal Jacobian. tol. xval=0:dx:1. 7.it+1)=unew.' f time'. 0. uhist=zeros(nx+2. dx=l/(nx+l). m.0. Apply this idea to some of the examples in the text.8. integrating accurately in time is a wasteful way to solve the steady-state problem.8 2. For c £ (—00.8.m. interpolating the solution to a finer mesh. then the H-equation has two solutions [41. and then repeating the process until you have a solution on a target mesh. In this case that limit is a solution of the steady-state (time-independent) equation with boundary data This might give you the idea that one way to solve (2. 25.1.1 Projects Chandrasekhar H-equation Solve the H-equation and plot residual histories for all of nsold.1]. you can estimate the q-factor by examining the ratios of successive residual norms. nsoli . but an extension of this idea.m affect your results? If you suspect the convergence is q-linear. for c = 0. does work [19. If the discretization is secondorder accurate and you halve the mesh width at each level. Try to find the other one.9999.5. resolving on the finer mesh.52]. t) tends to a limit as t —> oo.m.14) would be to solve the time-dependent problem and look for convergence of u as t —> oo. 2. Do the data in the itJiist array indicate superlinear convergence? Does the choice of the forcing term in nsoli.1. How would you compute them? 2. Finding the Newton Step with Gaussian Elimination You can see from the plot that u(x. Do this for these examples and explain your results.0. the two solutions are complex. If c 7^ 0.0. This is especially entertaining for c < 0. The one you have been computing is easy to find. how should you terminate the solver at each level? What kind of iteration statistics would tell you that you've done a satisfactory job? .50 Chapter 2.2 Nested Iteration Solving a differential or integral equation by nested iteration or grid sequencing means resolving the rough features of the solution of a differential or integral equation on a coarse mesh. called pseudotransient continuation.9. Of course.99. brsola.0.44]. using piecewise linear interpolation to move from coarse to fine meshes.1.36. function . T. nl. This is an 67 % OPTIONAL argument. 1] 4. computed and factored after isham 24 % updates of x or whenever the ratio 25 % of successive 12-norms of the 26 % nonlinear residual exceeds rsham. isham • 1. Note that x_hist is not in 78 % the output list. l. 37 % 38 % jdiff = 1: compute Jacobians with forward differences. x_hist = matrix of the entire iteration history.tol.nsold(x.f 17 7.parms) 13 % 14 % inputs: 15 '/. 0]. The iteration 60 % is terminated if too many step-length reductions 61 % are taken. is reduced in the line search. rsham = . 82 "/. for example. 
0.hist and ierr and set the iteration parameters. Chord 7 % 8 % C.Jacobian] = f(x). 45 % the Jacobian will be evaluated with a banded differencing 46 % scheme and stored as a sparse matrix. 10 '/.d-6. tol. Source Code for nsold. function [sol. x_hist] • nsold(x.5 is the default.f . 63 7. 33 % 34 '/. x_hist] . isham .-1. internal parameter: 72 '/. nu: lower and upper bandwidths of a banded Jacobian. ierr. 69 % 70 % 71 '/. NSOLD Newton-Arm!jo nonlinear solver 4 X Factor Jacobians with Gaussian Elimination 5 % 6 % Hybrid of Newton. it_hist = array of iteration history. initial iterate = x 16 '/. [result. 86 % 87 % Set the debug parameter. is useful for making movies. jdiff.hist. 48 % 49 % 11 •/. 55 % 56 % ierr = 0 upon successful termination. This 65 '/. rsham = 1 is the chord method. The two columns are the residual norm and 54 % number of step-size reductions done in the line search. 'sin'. April 1. This code comes with no guarantee or warranty of any kind. 50 % output: 51 % sol " solution 52 7. 83 % result '4 % semilogy(errs) 85 7. it.0. errs. 29 7. 2003 9 '/. 42 '/. 1. jdiff . rsham = 0 is Newton's method. ierr = 2 failure in the line search. 47 '/.1 if after maxit iterations 58 % the termination criterion is not satisfied 59 '/. rtol] relative/absolute 18 % error tolerances 19 7. parms = [maxit. if x_hist is in the output argument list. params). nu] 20 % maxit = maximum number of iterations 21 % default . % % Initialize it. if nargin >= 4ftlength(parms) "» 0 . maxit = 40. but 66 7. useful for tables and plots. 53 '/. tol = [l. 31 % isham = m. rsham: The Jacobian matrix is 23 7. ierr] = nsold(x.9 1 2 3 7. The example computes pi as a root of sin(x) 76 % with Newton's method and forward difference derivatives 77 % and plots the iteration history. 62 '/. 39 % jdiff • 0: a call to f will provide analytic Jacobians 40 % using the syntax [function. iband = 0. maxarm = 20.5.40 22 % isham. Kelley. can consume way too much storage. tol = [atol.parms) 7. 44 % If you include nl and nu in the parameter list. The Jacobian is computed and factored 35 % whenever the step size 36 '/. 1 turns display on. 41 % defaults = [ 0 1000.3. rsham = 1 is the Shamanskii method with 32 X m steps per Jacobian evaluation.hist. otherwise off. Storage is only allocated 68 1.tol. Shamanskii. rsham = 0. rsham. isham. params = [40. 88 % 89 90 91 92 93 94 95 96 97 98 99 100 debug = 0. 57 % ierr .1.5. it. 79 X 80 % 81 % x . 64 % The columns are the nonlinear iterates. 30 % isham = -1. debug = turns on/off iteration statistics display as 73 % the iteration progresses 74 % 75 % Here is an example. 27 % 28 % isham = -1. ierr. 12 '/.d-6]. 43 % nl.2. % ierr .m function [sol.f. 137 fnrmo = fnrm. x_hist = [x_hist. disp('Armijo failure. fold = fO. if nargout == 4.x). 107 iband = 1.fO.nu).u] = lu(jacb).[fnrm. Keep track of the ratio (rat = fnrm/frnmo) 132 '/.x. if fnrm > stop_tol.iarm.tol = atol+rtol*fnrm. 121 fnrm = norm(f0).x]. ierr = 1. if debug == 1. isham = parms(2). rsham = parms(3). . jac_age = jac_age+l.fO. 162 7. Add one to the age of the Jacobian after the factors have been '/. 146 jac_age = -1. 163 tmp . 113 if nargout == 4. 150 else 151 jacb . 160 7. nu = parms(6). Compute the stop tolerance. 117 '/. disp([itc fnrm rat]). 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 7. else disp('Complete Armijo failure. xold = x.x). of successive residual norms and 133 7.'). recompute Jacobian. f O ) .iarm]'] '. 
if armflag == 1 if jac_age > 0 sol = xold. update it.feval(f. used in a solve. 144 if(itc == 1 I rat > rsham I itsham == 0 I armflag == 1) 145 itsham . 7. you're dead. :) = [itc fnrm rat]. end outstat(itc+1. 131 7. 111 it_hist = []. or 142 '/. 120 fO . 104 end 105 if length(parms) >= 6 106 nl = parms(5). end 202 7. 136 outstat(itc+1.').maxarm). 112 n = length(x).armflag] = armijo(direction. every isham iterate. 116 7.tol & itc < maxit) 130 '/.nl. '/. 123 fnrmo = 1. 7. 147 if jdiff == 1 148 if iband == 0 149 [1. 134 7. A fresh Jacobian has an age of -1 at birth. 152 [l. [fnrm.101 maxit = parms(l). 7t On failure. 126 % 127 '/. end 114 fnrm = 1. Evaluate f at the initial iterate. 7. disp(outstat). if debug == 1. 125 stop. 102 if length(parms) >= 4 103 jdiff = parms(4). x_hist = x. return end end fnrm = norm(f0). 129 while(fnrm > stop. fO = fold. on the first iteration. 157 end 158 end 159 itsham = itsham-1. 115 itc = 0. ierr = 2. sol = xold. 124 itsham = isham. set the error flag. 122 it_hist . 7. [step. end rat = fnrm/fnrmo.jac] = feval(f. 139 7. main iteration loop 128 '/.x. the iteration counter (itc) . Compute the Newton direction.fO. end while end sol = x. 156 [l. atol = tol(l). it_hist = [it_hist' . 7. % If the Jacobian is fresh. 153 end 154 else 155 [fv.f. 135 rat = fnrm/fnrmo. end 7.u] = lu(jac).O]. 143 7. if the ratio of successive residual norms is too large.bandjac(f. 119 '/. 138 itc = itc+1.x. :) « [itc fnrm rat]. If the line search fails and the Jacobian is old. 7. 118 '/. 161 7. 108 end 109 end 110 rtol = tol(2).isham. u] = d i f f j a c ( x . f .-l\fO. x = xold. 140 7t Evaluate and factor the Jacobian 141 7. 164 direction = u\tmp. j) • dirder(x. '/. % % ih(ip). iht = ih(ist). dv = (f1-f0).203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 V. 2003 % % This code conies with no guarantee or warranty of any kind.l). u] . w = point and direction f » function fO = f(x). % x. 7. end % '/.f. % % C. % % delr(ip)+l . % % n . il(ip) . 5% % Inputs: f. (uses dirder) X V. ist = delr(ist)+1.fO). jac(ilt:iht. fO) '/. preevaluated •/. fO = f(x). % % Compute the forward difference. % pt = zeros(n.fO) % Compute a finite difference directional derivative. while ist <= n pt(ist) . zj jac(:.d-7.is. fl .d-7./epsnew. . Kelley. T. % % April 1. '/.f(x).nl. z ( ) = 1. end end % % function z = dirder(x. X xl = x+epsnew*pt.length(x).l]). We'll need delr(l) new function evaluations. ih(ip) = min([ip+nl.l). % for is = l:delr(l) 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 ist .f0) % inputs: '/. jac = sparse(n.feval(f.f. % % Hardwired difference increment epsnew = l. function [1.n]). m . '/. il(ip) = max([ip-nu.f. Kelley.x.w.diffjac(x.dv(ilt:iht). April 1.1. '/. u] . for j = l:n zz . end [1. C. 2 0 03 % This code comes with no guarantee or warranty of any kind. fO . f. f = point and function •/.length(x). Compute a forward difference dense Jacobian f' ( ) return lu factors. '/. ist .next row to include after ip in the % perturbation vector pt % '/.range of indices that influence f(ip) % for ip = l:n delr(ip) = min([nl+nu+ip. dv " zeros(n. '/. % function z = dirder(x.w. epsnew = l.bandjac(f. T. x = function and point 7. ist = is.iht-ilt. % inputs: % x. 
nu = lower and upper bandwidth % n = length(x).delr(ist)+l. precomputed function value % nl. in nonlinear iterations f (x) has usually been computed before the call to dirder.fO.nu) '/.ist) . x.n). while ist <= n ilt = il(ist).lu(jac). function jac .n]).x1). % '/. % end % Sweep through the delr(l) perturbations of f. % Approximate f'(x) w. % % Build the perturbation vector.l). n .zeros(n. BANDJAC Compute a banded Jacobian f (x) by forward differences. Fill the % appropriate columns of the Jacobian.zz. % xold = x.l). '/• Update x. Kelley. if iarm > maxarm disp('Armijo failure. % sigma0 % = . xt = x + step. ffO « nfO*nfO.del). too many reductions'). safeguarding bounds for the line search % % % Set internal parameters. lamm = 1. '/.0. I step • lambda*direction. ft = feval(f. output: % lambdap = new value of lambda given parabolic model '/.1.f. ffm). '/. lamc = lambda.5. is more important than clarity. step = lambda*direction. fp » fO. C. •/. iarm = iarm+1. return. function [step. % input: '/• lambdac = current step-length '/. xt = x + step. keep the books on lambda.d0 epsnew=epsnew*max(abs(xs). ffc = nft*nft. ffc. '/. Apply three-point safeguarded parabolic model for a line search.ft. if iarm == 0 lambda . while nft >= (1 . % function lambdap = parab3p(lambdac.305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 % Scale the step. ffc. 2003 '/. . nft = norm(ft). armflag « 1.d-4. Keep the books on the function norms. alpha = l. lambda = 1. ffm = nft*nft. ffO. April 1.parab3p(lambdac. ft = feval(f. fl . fp . lambdam = previous step-length '/. % Apply the three-point parabolic model. z = (fl .5. armflag . % Now scale the difference increment. '/.xt). % % Compute % the step length with the three-point parabolic model. '/.lamc. del and fl could share the same space if storage '/. ffm) '/. '/. internal parameters: '/. sol .xt).armflag] = armijo(direction. lambdam.dO)*sign(xs). '/. ffc. if norm(w) == 0 z = zeros(n.fO)/epsnew. sigmal = . ffO = value of II F(x_c) II "2 */.xp.sigmal*lambda. function lambdap . 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 lamm .fO. lamm. lamc = lambda. if xs "= 0. sigma0 = 0. ffc = value of II F(x_c + lambdac d) II "2 X ffm » value of I I F(x_c + lambdam d) I I " 2 % % '/. else lambda « parab3p(lamc. '/. '/. iarm = 0. end '/.5. ffc = nft*nft.norm(ft). ffO. ffm) '/. '/. sigmal = 0. '/.alpha*lambda) * nfO.iarm. '/.maxarm) iarm » 0. end epsnew=epsnew/norm(w). ffO. nft .x.xold.1. del = x+epsnew*w. lambdam.feval(f. '/. T. return end '/. % '/.xt. % This code comes with no guarantee or warranty of any kind. end of line search '/. nfO = norm(fO). end end xp .1. xs=(x'*w)/norm(w). sigmal = . ffm = ffc.fp. xp » x. 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 % Compute coefficients of interpolation polynomial. % X p(lambda) • ffO + (cl lambda + c2 lambda'2)/dl 51 % dl - (lambdac - lambdam)*lambdac*lambdam < 0 % So if c2 > 0 we have negative curvature and default to % lambdap - sigaml * lambda. 
%
c2 = lambdam*(ffc-ff0)-lambdac*(ffm-ff0);
if c2 >= 0
    lambdap = sigma1*lambdac; return
end
c1 = lambdac*lambdac*(ffm-ff0)-lambdam*lambdam*(ffc-ff0);
lambdap = -c1*.5/c2;
if lambdap < sigma0*lambdac, lambdap = sigma0*lambdac; end
if lambdap > sigma1*lambdac, lambdap = sigma1*lambdac; end

Chapter 3
Newton–Krylov Methods

Recall from section 1.4 that an inexact Newton method approximates the Newton direction with a vector d such that

|| F'(x)d + F(x) || <= eta || F(x) ||.     (3.1)

The parameter eta is called the forcing term. Newton iterative methods realize the inexact Newton condition (3.1) by applying a linear iterative method to the equation for the Newton step and terminating that iteration when (3.1) holds. We sometimes refer to this linear iteration as an inner iteration. Similarly, the nonlinear iteration (the while loop in Algorithm nsolg) is often called the outer iteration. The Newton–Krylov methods, as the name suggests, use Krylov subspace-based linear solvers. The methods differ in storage requirements, cost in evaluations of F, and robustness. Our code, nsoli.m, includes three Krylov linear solvers: GMRES [64], BiCGSTAB [77], and TFQMR [31]. Following convention, we will refer to the nonlinear methods as Newton-GMRES, Newton-BiCGSTAB, and Newton-TFQMR.

3.1 Krylov Methods for Solving Linear Equations

Krylov iterative methods approximate the solution of a linear system Ad = b with a sum of the form

d_k = sum_{j=0}^{k-1} gamma_j A^j r_0,

where r_0 = b - Ad_0 and d_0 is the initial iterate. If the goal is to approximate a Newton step, as it is here, the most sensible initial iterate is d_0 = 0, because we have no a priori knowledge of the direction, but, at least in the local phase of the iteration, expect it to be small. We express this in compact form as d_k in K_k, where the kth Krylov subspace is

K_k = span(r_0, Ar_0, ..., A^{k-1} r_0).
like most implementations of Newton–Krylov methods.m. like other Krylov methods. Newton–Krylov Methods Krylov methods build the iteration by evaluating matrix-vector products. GMRES (m) does this by restarting the iteration when the size of the Krylov space exceeds m vectors.1 GMRES The easiest Krylov method to understand is the GMRES [64] method. The convergence theory for GMRES does not apply to GMRES(m).2. Convergence of GMRES As a general rule (but not an absolute law! [53]). implemented as a matrix-free method. Then. this is a very useful bound. See [16] for examples of similar estimates when the eigenvalues are contained in a small number of clusters. k(V) = 100. then A is normal if the diagonalizing transformation V is unitary.1. for example. Proof. say. that A is diagonalizable. Theorem 3. GMRES will reduce the residual by a factor of. In exact arithmetic the Kth CG iteration minimizes .1. Let A = VAV~l be a nonsingular diagonalizable matrix. A is diagonalizable if there is a nonsingular matrix V such that Here A is a diagonal matrix with the eigenvalues of A on the diagonal. Let Pk £ Pk. Here VH is the complex conjugate transpose of V.1. for all pk Pk. is a convergence result for diagonalizable matrices.3) is to change A to obtain an advantageous distribution of eigenvalues. Krylov Methods for Solving Linear Equations 59 Here.1. Let dk be the kth GMRES iterate.3. 3. We can easily estimate ||pk(A)|| by as asserted. and all the eigenvalues of A lie in a disk of radius 0.1 centered about 1 in the complex plane. The reader should be aware that V and A can be complex even if A is real. In this case the columns of V are the eigenvectors of A and V–l = VH.2 Low-Storage Krylov Methods If A is symmetric and positive definite.1 implies (using Pk(z) = (1 — z)k] that Hence. If A is a diagonalizable matrix and p is a polynomial. Theorem 3. Suppose. One objective of preconditioning (see section 3. Since reduction of the residual is the goal of the linear iteration in an inexact Newton method. for example. the conjugate gradient (CG) method [35] has better convergence and storage properties than the more generally applicable Krylov methods. 105 after seven iterations. CGNR and CGNE are used far less frequently than the other low-storage methods. GMRES (m) should be your first choice. and hence the convergence of the CG iteration can be far too slow. at least in exact arithmetic.60 Chapter 3. Aside from GMRES(m).42].77] but do not have the robust theoretical properties of GMRES or CG.. . is guaranteed [33. called CGNE.42. This is not an artifact of the floating point number system but is intrinsic to the methods.m. Of course. Because the condition number is squared and a transpose-vector multiplication is needed. left. It is simple (see section 3. One needs only a function that performs a preconditioner-vector product. which has a symmetric positive definite coefficient matrix ATA. If you consider BiCGSTAB and TFQMR as solvers. A tempting idea is to multiply a general system Ax = 6 by AT to obtain the normal equations ATAx = ATb and then apply CG to the new problem. this means that the iteration will cause a division by zero. Newton–Krylov Methods over the Kth Krylov subspace. two such low-storage solvers. can be used in nsoli.2. that failure will manifest itself as a stagnation in the iteration. This approach. solves AATz = 6 with CG and then sets x = ATz. there are some problems. however. called CGNR.1) to approximate a Jacobian-vector product with a forward difference. 
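The clustered-eigenvalue estimate is easy to see in action. In the following hedged sketch the coefficient matrix is a made-up normal matrix whose eigenvalues lie in [0.9, 1.1], so kappa(V) = 1 and the residual must drop by at least a factor of ten per GMRES iteration (and typically faster).

% GMRES convergence with eigenvalues clustered about 1.
n = 500;
A = diag(1 + 0.1*(2*rand(n,1) - 1));     % eigenvalues in [0.9, 1.1]
b = randn(n, 1);
[d, flag, relres, iter, resvec] = gmres(A, b, [], 1.e-10, 20);
semilogy(0:length(resvec)-1, resvec/norm(b), '-o');
xlabel('GMRES iterations'); ylabel('Relative residual');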
Either method can break down. not a division by zero or an overflow. One does this with the expectation that systems with the coefficient matrix MA or AM are easier to solve than those with A. while both have the advantage of a fixed storage requirement throughout the linear iteration. one of BiCGSTAB or TFQMR may solve your problem. two new evaluations of F). or can compute a transposevector product in an efficient way. The number of linear iterations that BiCGSTAB and TFQMR need for convergence can be roughly the same as for GMRES. has the disadvantage that the condition number of ATA is the square of that of A. BiCGSTAB [77] and TFQMR [31]. you should be aware that.77] for detailed descriptions of these methods. preconditioning can be done in a matrix-free manner. While GMRES (m) can also fail to converge. but each linear iteration needs two matrix-vector products (i. While the cost of a single iteration is two matrix-vector products. The need for a transpose-vector multiplication is a major problem unless one wants to store the Jacobian matrix. or both sides by a preconditioner M. and the Jacobian is well conditioned. but no matrix-free way to obtain a transpose-vector product is known.1. Low-storage alternatives to GMRES that do not need a transpose-vector product are available [31. 3. If.42. The symmetry and positivity can be exploited so that the storage requirements do not grow with the number of iterations. If you can store the Jacobian. you cannot allocate the storage that GMRES(m) needs to perform well.3 Preconditioning Preconditioning the matrix A means multiplying A from the right. applying CG iteration to the normal equations can be a good idea. We refer the reader to [31. A similar approach.33.e. convergence. Right preconditioning has the feature that the residual upon which termination is based is the residual for the original problem. Two-sided preconditioning replaces A with M left AM right .3.1).m expects you to incorporate preconditioning into F. and to use Jacobian-vector products and the forward difference method for banded Jacobians from section 2.3 to form a banded approximation to the Jacobian. Computing an Approximate Newton Step 61 Left preconditioning multiplies the equation As — b on both sides by M to obtain the preconditioned system MAx = Mb. is to pretend that the Jacobian is banded. The reason for this is that the data structures and algorithms for the construction and application of preconditioners are too diverse to all fit into a nonlinear solver code.3). since the preconditioned residual will be used to terminate the linear iteration. To precondition the equation for the Newton step from the left.1 Computing an Approximate Newton Step Jacobian-Vector Products For nonlinear equations.2. If h is roughly the square root of the error in F.2. One would hope so. We first scale w to be a unit vector and take a numerical directional derivative in the direction w/||w||. Then the solution of the original problem is recovered by setting x = My. One factors the banded approximation and uses that factorization as the preconditioner.12]. which is defined by (2.2. If the condition number of MA is really smaller than that of A. So we multiply h by The same scaling was used in the forward difference Jacobian in (2. 3. Right preconditioning solves the system AMy = b with the Krylov method. which is integrated into some initial value problem codes [10. one simply applies nsoli. 
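For a feel for the trade-offs among the linear solvers, one can run MATLAB's built-in gmres and bicgstab on the same sparse test matrix. This is only a hedged sketch; the test problem, tolerances, and restart value are assumptions.

% Compare GMRES(40) and BiCGSTAB on a made-up sparse test problem.
A = gallery('poisson', 30);              % 900 x 900 discrete Laplacian
b = ones(size(A,1), 1);
[xg, flagg, relresg, iterg] = gmres(A, b, 40, 1.e-8, 50);
[xb, flagb, relresb, iterb] = bicgstab(A, b, 1.e-8, 900);
% GMRES(40) stores up to 40 basis vectors and needs one matrix-vector
% product per inner iteration; BiCGSTAB has fixed storage but needs two
% matrix-vector products per iteration.
disp([flagg relresg; flagb relresb]);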
the residual of the preconditioned system will be a better reflection of the error than that of the original system. A different approach. The forward difference directional derivative at x in the direction w is The scaling is important. 3. One then applies the Krylov method to the preconditioned system. the Jacobian-vector product is easy to approximate with a forward difference directional derivative.2 3.2 Preconditioning Nonlinear Equations Our code nsoli.m to the preconditioned nonlinear problem . we use a difference increment in the forward difference to make sure that the appropriate low-order bits of x are perturbed. Remember not to use the MATLAB sign function for sgn. even if it isn't. One might base a choice of 77 on residual norms. but also to obtain quadratic convergence when near a solution. right preconditioning. You should keep in mind that the two approaches terminate the nonlinear iteration differently. Left preconditioning will terminate the iteration when ||MF(o. then the equation for the step is which is the right-preconditioned equation for the step. Linear equations present us with exactly the same issues. If rj^es is bounded away from 1 for the entire iteration. so you need to decide what you're interested in. x+ = M(y+ +. the choice 7]n = rj^es will do the job. captures the behavior of the residual.)|| is small.1) as the nonlinear iteration progresses. but it's simpler to solve G(y) = 0 to the desired accuracy and set x = My at the end of the nonlinear solve. To recover the step s in x we might use s = Ms or.*)"1. 3.2. The overall goal in [29] is to solve the linear equation for the Newton step to just enough precision to make good progress when far from a solution. assuming we make a good choice . If M is a good approximation to F^a. by terminating when ||F(a. The formula is complex and motivated by a lengthy story. equivalently. Left or Right Preconditioning? There is no general rule for choosing between left and right preconditioning. Newton–Krylov Methods which is the left-preconditioned equation for the Newton step for F.3 Choosing the Forcing Term The approach in [29] changes the forcing term 77 in (3. which we condense from [42]. one way to do this is where 7 € (0.1] is a parameter. If we set x = My and solve with Newton's method. then and this termination criterion captures the actual error." which is often the real objective. As in the linear case.)|| is small. responding to the problem statement "Make ||F|| small.62 The equation for the Newton step for G is Chapter 3. the nonlinear residual is the same as that for the original problem.s). On the other hand. one can get away with far less. Nmax is an upper limit on the forcing term and In [29] the choices 7 = 0.9999 are used. Preconditioners 63 for no. [29] suggests limiting the decrease to a factor of rj n -1. Domain decomposition preconditioners [72] approximate the inverse of the high-order term (or the entire operator) by subdividing the geometric domain of the differential operator. Multigrid methods exploit the smoothing properties of the classical stationary iterative methods by mapping the equation through a sequence of grids. for example. then the inverse of the high-order part of the differential operator (with the correct boundary conditions) is an excellent preconditioner [50]. a method of safeguarding was proposed in [29] to avoid volatile decreases in Nn.9 and Nmax = 0. 
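Since nsoli.m expects any preconditioning to be built into the nonlinear function, the two options amount to small wrappers around the residual. In this sketch, F and Mfun are assumed function handles (hypothetical names) for the nonlinear residual and the preconditioner-vector product, and x0, y0 are initial iterates; none of these names come from the book's software.

% Hedged sketch: folding a preconditioner into the nonlinear function.
fleft  = @(x) Mfun(F(x));     % left:  solve M F(x) = 0, terminate on ||M F(x)||
fright = @(y) F(Mfun(y));     % right: solve F(M y) = 0, terminate on ||F(M y)||

tol = [1e-8, 1e-8];
[xl, histl, ierrl] = nsoli(x0, fleft, tol);   % xl solves the original problem
[yr, histr, ierrr] = nsoli(y0, fright, tol);
xr = Mfun(yr);                % recover x = M y at the end of the nonlinear solve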
To make sure that Nn stays well away from one.3 Preconditioners This section is not an exhaustive account of preconditioning and is only intended to point the reader to the literature. computing the inverses on the subdomains. we do not let Nn decrease by too much. If the high-order term is linear. If your problem is a discretization of an elliptic differential equation. 3. we can simply limit its maximum size.9.3. After taking all this into account. Ideally the preconditioner should be close to the inverse of the Jacobian. Of course. one might be able to compute the preconditionervector product rapidly with. The idea is that if rjn-i is sufficiently large. then the linear equation for the Newton step can be solved to far more precision than is really needed. a fast transform method (see section 3.9 and rjmax = 0. one finally arrives at [42] The term is the termination tolerance for the nonlinear iteration and is included in the formula to prevent oversolving on the final iteration. The defaults in nsoli.3) or a multigrid iteration [9]. it can often be shown that a solution can be obtained at a cost of O(N) operations. where N is the number of unknowns.m are 7 = 0. In practice. To protect against oversolving.3. When multigrid methods are used as a solver. Multigrid implementation is difficult and a more typical application is to use a single multigrid iteration (for the high-order term) as a preconditioner. if Nn is too small in the early stage of the iteration.6. and combining . conserve storage. The MATLAB commands luinc and cholinc implement incomplete LU and Cholesky factorizations. which may be generated by computer programs. Algebraic multigrid attempts to recover geometric information from the sparsity pattern of the Jacobian and thereby simulate the intergrid transfers and smoothing used in a conventional geometric multigrid preconditioner. 3. including nsoli . Algebraic preconditioners use the sparsity structure of the Jacobian matrix. the condition number of the preconditioned matrix is independent of the discretization mesh size. in extreme cases. a sensible response is to warn the user and return the step to the nonlinear iteration. Most codes.4 What Can Go Wrong? Any problem from section 1. or. The GMRES code in nsoli. An example of such a preconditioner is algebraic multigrid. for example. The symptoms of these problems are unexpectedly slow convergence or even failure/stagnation of the nonlinear iteration.1 Failure of the Inner Iteration When the linear iteration does not satisfy the inexact Newton condition (3. This is a much more subtle problem than failure to converge because the linear solver can report success but return an inaccurate and useless step. Another algebraic approach is incomplete factorization [62. like the ones based on the GMRES solver in [11].4. When implemented in an optimal way. in the case of CG. Newton-Krylov Methods those inverses. do this. Incomplete factorization preconditioners compute a factorization of a sparse matrix but do not store those elements in the factors that are too small or lie outside a prescribed sparsity pattern. There are a few problems that are unique to Newton iterative methods. These preconditioners require that the Jacobian be stored as a sparse matrix. The iteration could terminate prematurely because the estimated residual satisfies (3.64 Chapter 3.m. can arise if you solve linear systems by iteration. 3. find enough storage to use a direct solver.63].m.9. 
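The safeguarded forcing term condensed above fits in a few lines. The following sketch restates the formula as described in the text (nsoli.m does the same thing internally); the function and variable names are mine.

function eta = forcing(fnrm, fnrmo, etaold, gamma, etamax, stop_tol)
% Sketch of the safeguarded Eisenstat-Walker forcing term described in the text.
% fnrm, fnrmo: current and previous ||F||; etaold: previous forcing term.
etanew = gamma*(fnrm/fnrmo)^2;           % eta_n based on residual norms
if gamma*etaold^2 > 0.1                  % safeguard: do not let eta drop too fast
    etanew = max(etanew, gamma*etaold^2);
end
eta = min(etanew, etamax);               % never exceed the upper limit
eta = max(eta, 0.5*stop_tol/fnrm);       % avoid oversolving on the final iteration
end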
While it is likely that the nonlinear iteration will continue to make progress. for problems that do not come from discretizations of differential or integral equations or for discretizations of differential equations on unstructured grids.1) while the true residual does not.4. This is important. convergence is not certain and one may have to allow the linear solver more iterations. use a different linear solver. 3.1) and the limit on linear iterations has been reached. which is designed for discretized differential equations on unstructured grids. of course. In finite-precision arithmetic this orthogonality can be lost and the estimate of the residual in the iteration can be poor. We refer the reader .2 Loss of Orthogonality GMRES and CG exploit orthogonality of the Krylov basis to estimate the residual and. tests for loss of orthogonality and tries to correct it. The choice of Krylov method is governed by the parameter Imeth. parms). the function /. maxit is the upper limit on the nonlinear iterations. ierr.5. GMRES (Imeth = 1) is the default. where it is the maximum number of iterations before a restart.m is a Newton-Krylov code that uses one of several Krylov methods to satisfy the inexact Newton condition (3. Imeth. You have the option in most GMRES codes of forcing the iteration to maintain the orthogonality of the Krylov basis at a cost of doubling the number of scalar products in the linear iteration. The default is 20. and the entire sequence {xn}. For large problems.r r ) contains the tolerances for the termination criterion (1.m 65 to [42] for the details. x and / must be column vectors of the same length. . 3. This parameter has a dual role.m: [sol.1.m The required data for nsoli .m. nsoli. Don't ask for the sequence {xn} unless you have enough storage for this array. The syntax for / is function = f ( x ) .1). then r\ = \etamax\. If etamax < 0.1).5). If etamax > 0.m. The other alternatives are GMRES (m) (Imeth — 2).5 Using nsoli. optionally. The default is 40.1. restart Jimit]. it_hist. If GMRES (m) is the linear solver. etamax controls the linear tolerance in the inexact Newton condition (3. tol.6. the outputs are the solution sol and. The values of maxit.5. The components parms = [maxit. The default is 40.9. x_hist] = nsoli(x. then 77 is determined by (3.m Like nsold. as it is in all our codes. 3. Using nsoli. and rj must be set if you change the value of Imeth.12).m nsoli. f.3. The default is etamax = 0. and TFQMR (Imeth = 4). These are the same as for nsold.m (see section 2.1 Input to nsoli.m expects the preconditioner to be part of the nonlinear function.5. maxitl. The vector tol = (r a . an error flag.2 Output from nsoli. The calling sequence is similar to that for nsold. The sequence of iterates is useful for making movies or generating figures like Figure 2. except for in GMRES (m). 3. etamax.1).m are x.3. one must also specify the total number of restarts in restart-limit. BiCGSTAB (Imeth = 3). maxitl is the maximum number of linear iterations per nonlinear iteration. and the tolerances for termination. which means that GMRES (m) is allowed 20 x 40 = 800 linear iterations per nonlinear iteration.maxitl. a history of the iteration. as described in section 3. are The parms array is more complex than that for nsold. 3. That's the case with the H-equation in our first example in section 3.6. factoring the Jacobian is not an option. 3. Newton-Krylov Methods asking for the iteration history {xn} by including xJiist in the argument list can expend all of MATLAB's storage. 
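When the Jacobian is available as a sparse matrix, an incomplete factorization is easy to try as a preconditioner. The sketch below uses MATLAB's gmres with an incomplete LU factorization; note that in recent versions of MATLAB the ilu command supersedes luinc. The matrix A here is only a stand-in for a stored sparse Jacobian.

% Hedged sketch: incomplete LU preconditioning of a sparse system with GMRES.
n = 50;
A = gallery('poisson', n);               % sparse test matrix (2-D Laplacian)
b = ones(n^2, 1);
[L, U] = ilu(A);                         % ILU(0); ilu replaces luinc in newer MATLAB
[x, flag, relres, iter] = gmres(A, b, [], 1e-10, 200, L, U);
fprintf('preconditioned GMRES: %d iterations, relres %.2e\n', iter(2), relres);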
for this example. The first is the Euclidean norm of the nonlinear residual ||F(x)||. This MATLAB fragment does the job with Newton-TFQMR and an initial iterate of H = I to produce a plot of the norm of the nonlinear residual against the number of function evaluations (the dot-dash curve in Figure 3. '/.1). which means that the termination criterion is not met after maxit iterations.4 for Newton-GMRES. and TJ are the defaults but must be included if Imeth is to be varied.1. which means that the step length was reduced 20 times in the line search without satisfaction of the sufficient decrease condition (1. If the Jacobian is too expensive to compute and store. Newton-BiCGSTAB. Note that the values of maxit.1) and ra = rr = 10~8. The code heqkdemo.5). Generating such a plot is simple. The error flag ierr is 0 if the nonlinear iteration terminates successfully.m draws two figures. The limit of 20 can be changed with an internal parameter maxarm in the code. The initial iterate was the vector ones(100. and Newton-TFQMR. NEWTON-TFQMR SOLUTION OF H-EQUATION 7. and the third is the number of step-size reductions done in the line search. we solve the H-equation (2.3. The failure modes are ierr = 1. .m in the directory for this chapter is an example of how to use the sequence of iterates to make a movie.66 Chapter 3. shown in Figure 3.21).m with three sets of the parameter array with Imeth = 1. TFQMR and BiCGSTAB need two Jacobian-vector products for each linear iteration. one that plots the residual against the nonlinear iteration count and another. In this way we can better estimate the total cost and see. The forcing term is computed using (3. that GMRES requires fewer calls to F than the other two linear solvers and therefore is preferable if the storage that GMRES needs is available. Call nsoli to solve the H-equation with Newton-TFQMR. and ierr = 2. with the number of calls to F on the horizontal axis. heqkdemo . maxitl.1 Chandrasekhar H-equation To get started. the second is the cumulative number of calls to F.1. The history array it-hist has three columns.7) on a mesh of 100 points with a variety of Newton-Krylov methods and compare the performance by plotting the relative nonlinear residual ||F(xn)||/||F(xo)|| against the number of calls to F. respectively. which accounts for their added cost.6 Examples Often iterative methods are faster than direct methods even if the Jacobian is small and dense.m calls nsoli .6. The code ozmovie. as is the case with the other two examples. 4]. For this example L = 9. making the computation of the Newton step with a direct method impractical.6. The problem is an integral equation coupled with an algebraic constraint. L]. [sol. Nonlinear residual versus calls to F. In their simplest isotropic form the Ornstein-Zernike (OZ) equations are a system consisting of an integral equation where .40. it_hist. Figure 3. parms = [40.tol.9.'heq'.l.18. semilogy(it.hist(:.2 The Ornstein–Zernike Equations This example is taken from [7.2).d-8. c € C[0. x=ones(100.d-8]..3. '/.6. 7.56]. we find that the function can be most efficiently evaluated with a fast Fourier transform (FFT). It is standard to truncate the computational domain and seek h.it_hist(:.1)) . After approximating the integral with the trapezoid rule.1. tol=[l.parms).1).l)/it_hist(1. ierr] = nsoli(x. Plot a residual history. The unknowns are two continuous functions h and c defined on 0 < r < oo. Examples 67 '/. '/. 3. for 2 < j < N . and a = 2. To approximate h * c.1.9) and /?. p = 0. 
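The comparison of the three Krylov solvers is easy to reproduce by varying lmeth in the parms array, in the spirit of heqkdemo.m. The sketch below assumes that the book's H-equation residual heq.m and its parameter setup (for example, the global constant it uses) are on the path; the other entries of parms are the defaults, which must be supplied once lmeth is changed.

% Hedged sketch: residual history versus calls to F for the three Krylov solvers.
x0 = ones(100,1); tol = [1e-8, 1e-8];
styles = {'-', '--', '-.'}; k = 0;
for lmeth = [1 3 4]                      % GMRES, BiCGSTAB, TFQMR
    k = k + 1;
    parms = [40, 40, 0.9, lmeth];
    [sol, it_hist] = nsoli(x0, 'heq', tol, parms);
    semilogy(it_hist(:,2), it_hist(:,1)/it_hist(1,1), styles{k}); hold on
end
hold off; legend('GMRES', 'BiCGSTAB', 'TFQMR');
xlabel('calls to F'); ylabel('relative nonlinear residual');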
If h decays sufficiently rapidly. The nonlinear algebraic constraint is In (3.68 Chapter 3. The convolution h * c in (3. for 2 < i < N .2.1. Here p is a parameter.1.0. Let kj = (j — l)6k. where 6k = ir6/(N . Discrete Problem We will approximate the values of h and c on the mesh where 6 = L/(N — 1) is the mesh width and rf = (i — 1)6. Newton-Krylov Methods and the integral is over R3. we begin by discretizing frequency in a way that allows us to use the FFT to evaluate the convolution. We define. Then.7) can be computed with only one-dimensional integrals using the spherical-Bessel transform. we define and We compute h * c by discretizing the formula where he is the pointwise product of functions. . and <r are parameters. as we assume.1). e. For this example we use ft = 10. e = 0. 12) and (3.m solves this problem on a 201-point mesh. 7.2. 69 where uv denotes the componentwise product. epsilon=. The function oz.3. organizing the computation as was done in [47]. 7. dx=L/(n-l).13) can be done with a fast sine transform using the MATLAB FFT. r=0:dx:L. rho=. .m does this. The MATLAB code ozdemo.m that produces the graph of the solution in Figure 3. 7. and compares the cost of a few strategies for computing the forcing term. lf=imag(ft(2:n+D). To prepare this problem for nsoli.2. we define. L=9. 7t 7t 7.f']'.l.C]=DZDEMO returns the solution on a grid with a mesh spacing of 1/256. We also use global variables to avoid repeated computations of the potential u in (3.10). 7. We set (u * V)N = 0 and define (u * v)i by linear interpolation as The sums in (3. r=r'. function [h.c]=ozdemo global L U rho n=257. for 2 < i < N — 1. Fast sine transform with MATLAB's FFT 7. ft=-fft([0.cT)T. Examples Finally. The sine transform code is as follows: % LSINT 7. plots h and c as functions of r. OZDEMO This program creates the OZ example in Chapter 3. sigma=2. 7e Compute the potential and store it in a global variable. Here is the part of ozdemo. beta=10.m we must first consolidate h and c into a single vector x = (hT.0.6. [H. function lf=lsint(f) n=length(f). To compute the sums for 1 < i < N — 1 one can use the MATLAB code Isint to compute 1 = Isint (f).2*n+2). 7. Plot the solution. Figure 3. 7. m.m. It is easy to compare methods for computing the forcing term with nsoli. tol=[l.1).d-8. 7.-.2).!]. much smaller than is needed for a mesh this coarse and used only to illustrate the differences between the choices of the forcing term.6) with 77 = 0.80. ierr] = nsoli(x.l. [sol. Newton-Krylov Methods U=elj(r.d-8].2. ylabeK'h'.1). xlabel('r'). x=zeros(2*n. subplot(1. For this example. . xlabel('r'). plot(r. plot(r.sigma.c.2.2.beta).tol). h=sol(l:n). but the choice 77 = 0.70 Chapter 3. c=sol(n+l:2*n).l).'oz'. 7. 7.1 is superior for realistic values of the tolerances. it_hist. we compare the default strategy (3. the default choice of 77 is best if the goal is very small residuals. 'Rotation'. subplot(1. 7. also produced with ozdemo .3.'-').h. % Unpack h and c.'Rotation».1.epsilon. In Figure 3. Solution of the OZ equations. parms=[40. ylabeK'c'. For both computations the tolerances were ra = rr = 10~8.'-'). 1) . MATLAB makes it easy to alternate between a two-dimensional u (padded with the zero boundary conditions).1) x (0. 3. but solvers expect one-dimensional vectors. 
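The fast sine transform routine quoted above arrives scrambled in this extraction. The following sketch reassembles the surviving pieces into runnable form; it computes the sums the text describes by padding with zeros and taking the imaginary part of an FFT.

function lf = lsint(f)
% LSINT  Fast sine transform with MATLAB's FFT (reassembled sketch).
% Computes lf(i) = sum_j f(j)*sin(pi*i*j/(n+1)) for a column vector f.
n = length(f);
ft = -fft([0; f(:); zeros(n+1,1)]);     % zero-padded to length 2n+2
lf = imag(ft(2:n+1));
end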
Here V2 is the Laplacian operator / has been constructed so that the exact solution is the discretization of We discretized on a uniform mesh with 31 interior grid points in each direction using centered differences and terminated the iterations consistently with the second-order accuracy of the difference scheme by setting The physical grid is two-dimensional.m (d/dx).6.m (d/dy].3. All of this was done within the matrix-free difference operators dxmf . . As an example.f (Laplacian). where one applies the differential operators. and the one-dimensional vector that the solvers require. and lapmf .1). Nonlinear residual versus calls to F. here is the source of dxmf .6.3.m. dymf . The problem is a semilinear (i.3 Convection-Diffusion Equation This example is taken from [42] and shows how to incorporate left and right preconditioning into F. Examples 71 Figure 3. linear in the highest order derivative) convection-diffusion equation with homogeneous Dirichlet boundary conditions on the unit square (0.e.. 2:n+l)=w. function w=pdeleft(u) '/. y. There is no need to apply f ish2d. vv(:)=u.m uses the MATLAB FFT to solve the discrete form of with homogeneous Dirichlet boundary conditions to return g = Mu. % Compute the partial derivative.n). uu(2:n+l.72 Chapter 3. Our solver f ish2d. m is the MATLAB code for the nonlinear residual. % Turn u into a 2-D array y. % % Divide by 2*h. preconditioned pde example with C=20.n+2). y. global rhsf prhsf y. % Compute the low-order nonlinear y. y. homogeneous Dirichlet BC n2=length(u). dxuu=zeros(n. Notice that the preconditioner is applied to the low-order term. y.m to lampmf (u) simply to recover u. . uu=zeros(n+2. n=sqrt(n2). vv=zeros(n.2:n+l)-uu(l:n. h=l/(n+l).n). term. PDELEFT W=PDELEFT(U) is the nonlinear residual of the left°/. */. Newton-Krylov Methods function dxu = dxmf(u) % DXMF Matrix-free partial derivative wrt x. y. i We can exploit the regular grid by using a fast Poisson solver M as a preconditioner.15) to obtain the preconditioned equation pdelef t.5*dxuu(:)/h. y.2:n+l). dxu=. with the BCs built in. dxuu=uu(3:n+2. We apply the preconditioner from the left to (3. The preconditioned right side 73 is stored as the global variable prhsf in pdeldemo. so the solution of the steady-state (time-independent) problem is We expect the solution u(x.3. 3. We will use the implicit Euler method to integrate the nonlinear parabolic initial boundary value problem in time for 0 < t < 1.6. °/. Apply fish2d to the entire pde. Were this a larger problem. '/. As in section 3.6. 7.m. Examples v=20*u. Multigrid or domain decomposition preconditioners [9. The Armijo rule made a difference for the right-preconditioned problem. the number of function evaluations is lower for GMRES. the storage for full GMRES could well be unavailable and the low-storage solvers could be the only choices.3. With right preconditioning.6. The function f ( x . that.m. It does make sense to compare the choices for linear solvers and forcing terms. once again on the second nonlinear iteration.17) to converge to usteady as t —> oo.6. w=u+fish2d(v)-prhsf.3. as before. the step length was reduced once at the first nonlinear iteration for all three choices of linear solver and. in three space dimensions. Preconditioning a semilinear boundary value problem with an exact inverse for the high-order term. * (dxmf (u) +dymf (u) ) .1). while the number of nonlinear iterations is roughly the same. we impose homogeneous Dirichlet boundary conditions on the unit square (0. 
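The matrix-free d/dx operator quoted above is also garbled by the extraction. Here is a sketch of the same routine, showing the reshaping between the one-dimensional solver vector and the two-dimensional grid with the homogeneous Dirichlet boundary values built in.

function dxu = dxmf(u)
% DXMF  Matrix-free partial derivative with respect to x (reassembled sketch).
% u is the 1-D vector of interior grid values; homogeneous Dirichlet BCs.
n2 = length(u); n = sqrt(n2); h = 1/(n+1);
uu = zeros(n+2, n+2);                   % padded array holds the zero boundary
vv = zeros(n, n); vv(:) = u;            % 1-D vector -> 2-D interior grid
uu(2:n+1, 2:n+1) = vv;
dxuu = uu(3:n+2, 2:n+1) - uu(1:n, 2:n+1);   % centered difference
dxu = 0.5*dxuu(:)/h;                    % divide by 2h; back to a 1-D vector
end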
is optimal in the sense that the convergence speed of the linear iteration will be independent of the mesh spacing [50]. After discretization in space. for BiCGSTAB and TFQMR.and right-preconditioned iterations. One can examine the performance for the three linear solvers and find. so it isn't completely valid to compare the left. t) of (3.6. the problem becomes a large system of ordinary differential . y ) is the same as in section 3.1) x (0. we set u = Mw and solve The MATLAB code for this is pderight. as we do here. For the right-preconditioned problem. however. Recall that the residual has a different meaning than for the left-preconditioned problem. which calls the solver. but are more complicated to implement. say.72] also do this.3.4 Time-Dependent Convection-Diffusion Equation This example is a time-dependent form of the equation in section 3. integration in time proceeds just as it did in section 2.m. Compare the accuracy of the results. the nonlinear equation where M represents the application of the fish2d. Newton-Krylov Methods equations.5. solves linear systems with GMRES. and the convection-diffusion equation. at the end. Try using a factorization of F'(XQ] to build one.7. we discretize in space with centered differences to obtain a system of ordinary differential equations. and. We follow the procedure from section 2.7. number of function evaluations needed to reach a given tolerance.7.1.1 Projects Krylov Methods and the Forcing Term Compare the performance of the three Krylov methods and various choices of the forcing term for the H-equation.3 Two-Point Boundary Value Problem Try to solve the boundary value problem from section 2. Would an incomplete factorization (like luinc from MATLAB) work? . how well does it perform? Do all choices of the forcing term lead to the same root? 3. This system is stiff.m and pderdemo.7. Modify the codes to refine the mesh and see if the performance of the preconditioner degrades as the mesh is refined. at each time step.7 3. the OZ equations.4 with nsoli.7. Make the comparison in terms of computing time.m for the solve is shorter than the explanation above. one solves. which uses a 63 x 63 spatial mesh and.m. The integration code is pdetimedemo. so implicit methods are necessary if we want to avoid unreasonably small time steps. Because the nonlinear residual F has been constructed. You'll need a preconditioner to get good performance. which we write as The nonlinear equation that must be solved for the implicit Euler method with a fixed time step of 6t is To precondition from the left with the fast Poisson solver f ish2d.5 to prepare the nonlinear systems that must be solved for each time step. 3.m codes in the directory for this chapter to examine the quality of the Poisson solver preconditioner.m. If GMRES is limited to the storage that BiCGSTAB or TFQMR needs. compares the result at t = 1 with the steady-state solution. 3.7. The code pdetime. a time step of 0. First.74 Chapter 3. and storage requirements.m solver.2 Left and Right Preconditioning Use the pdeldemo. Use this code as a template to make movies of solutions.4 Making a Movie The code ozmovie. Projects 75 3.7. steps. and nonlinear residuals for some of the other examples.7. This is especially interesting for the differential equation problems. .3.m in the directory for this chapter solves the OZ equations and makes a movie of the iterations for h. if x_hist is in the output argument list. 7. 1 (GMRES). The inner iteration terminates '/. 7. Imeth = 2 7. 7. solver for f(x) =0 '/. 
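The time integration described above amounts to a short loop that calls the nonlinear solver once per step. This sketch assumes a hypothetical residual function ftime.m of the form F(u) = u - uold - dt*(right-hand side of the PDE), with uold and dt shared through global variables as the book's codes do; the mesh size, time step, and tolerances below are illustrative assumptions, not the book's values.

% Hedged sketch: implicit Euler with nsoli.m as the nonlinear solver per step.
global uold dt
nx = 63; dt = 0.05; tfinal = 1.0;        % illustrative choices
u = zeros(nx^2, 1);                      % initial condition u = 0
tol = [1e-6, 1e-6];                      % illustrative tolerances
for t = dt:dt:tfinal
    uold = u;
    [u, it_hist, ierr] = nsoli(u, 'ftime', tol);
    if ierr ~= 0
        error('nonlinear solve failed at t = %g (ierr = %d)', t, ierr);
    end
end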
when the relative linear residual is 7. alpha = l. otherwise off. ierr = 0 upon successful termination 7i ierr = 1 if after maxit iterations '/. gamma = . default = 40 7. 7. is terminated if too many step-length reductions 7> are taken.5.3). % and number of step-length reductions 7. default . x_hist ™ matrix of the entire iteration history. NSOLI Hewton-Krylov solver. 100 7.3) * 12-norms of nonlinear residuals % for the iteration. % Parabolic line search via three-point interpolation % '/.1. This is an '/. Inexact Newton-Armijo iteration '/. 7.1 (GMRES. 7. If etamax < 0. OPTIONAL argument. Set internal parameters.9. it. parameter to measure sufficient decrease 7. 7.Maximum error tolerance for residual in inner '/. x_hist] = nsoli(x. restart.1. Eisenstat-Walker forcing term '/. This code comes with no guarantee or warranty of any kind.limit = 20. maxitl. C. 7.f. restart.m function [sol. 7. Set the debug parameter. 7. '/. iteration.3. failure is reported 7. 7. number of function evaluations. Imeth. iteration. 2003 '/. 7.parms) '/. 7.1. 7. 7. m = maxitl V. 7. ierr.9 7. default » 20 7. x. For iterative methods other than GHRES(m) maxitl '/. maxarm » 20. x. safeguarding bounds for the line search 7. in CURES (m). I etamax I . debug ™ turns on/off iteration statistics display as 7.limit = max number of restarts for GMRES if 7. those iteration parameters that are optional inputs. '/. The columns are the nonlinear iterates. eta is determined '/. default: etamax = 0. tol .histx = zeros(maxit. by the modified Eisenstat-Walker formula if etamax > 0. ierr. T. 3 (BICGSTAB).limit] 7i maxit = maximum number of nonlinear iterations 7. '/.d-4.hist] = nsoli(x. etamax. smaller than eta*I F(x_c) I. Kelley. 97 Imeth . This '/. Imeth = choice of linear iterative method 7. 7. 7.hist = x. it_hist. function [sol. 1 turns display on.hist. restart. function = f 7. sigmaO = . 7. x. 96 ierr = 0. alpha = l. the termination criterion is not satisfied 7t ierr = 2 failure in the line search. for example. 98 if nargout == 4. is useful for making movies. 7. 2 GMRES(m). debug = 0. sol = solution % it_hist(maxit. but % can consume way too much storage. April 1.I etamax I for the entire 7. the iteration progresses 7. globally convergent '/. 4 (TFQMR) 7. 7t inputs: '/.5. parms = [maxit. .hist and set the default values of '/. 7.[atol. '/. sigmaO = 0. sigmal = . etamax = .d-4. 7. The iteration '/.9. no restarts) 7. then eta . ierr. % internal parameters: 1. Initialize parameters for the iterative methods. error tolerances for the nonlinear iteration '/. maximum number of step-length reductions before 7. maxit = 40. Storage is only allocated 7. 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 'It output: 7.8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 Source Code for nsoli. 7. 7. maxarm * 20. Initialize it. 7. initial iterate » x 'I. is the upper bound on linear iterations. sigmal = 0.tol.f. maxitl = maximum number of inner iterations before restart 7. 7. end 99 7.hist. rtol] relative/absolute 7. 7. parms) '/. Imaxit = 40.tol. 7. default = 40 '/. 'I. it. 164 lame .alpha*lambda) * nfO.2) = 0. parms(5). rat = fnrm/fnrmo. Imazit . too many reductions').fnrm. lame = lambda. Imaxit. The line search starts here. 168 ft = feval(f.lambda. else 187 y.1. and 119 'I. itc = 0.inner_f_evals] = . 
110 end 111 if length(parms) == 5 112 gmparms * [abs(etamax).[itc fnrm 0 0 0 ] . end fnrm . if nargout — 4. Adjust eta as per Eisenstat-Walker. ffO. if etamax > 0 143 % 144 % 145 146 147 148 149 150 151 '/.nft*nft.x]. 163 lamm » lame.(1 . 137 fnrmo » fnrm.1) .parab3p(lamc. lambda . it_histx(itc+l. :) . 120 % 121 fO = feval(f. ffm).tol(2). lambda .2)+l. iarm = 0. Imaxit]. 169 nft « norm(ft). it_histx(itc+l. it_histx(itc+l. ffc. fnrm . 138 itc = itc+1. 124 fnrmo .[abs(etamax).1.3) = iarm. x. Imeth). lamm = 1.hist. while nft >. 108 if length(parms)>= 4 109 Imeth . xt = x + lambda*step. nfO = norm(fO).2) = it_histx(itc. 172 iarm . 162 xt » x+lambda*step.iarm+1. xold = x.norm(ft). 178 if nargout — 4. 117 % 118 V. 184 fO = ft.zeros(maxit.101 V. How many function evaluations did this iteration require? it_histx(itc+l. 185 % 186 % end of line search 188 189 190 191 % 192 % 193 % 194 195 196 197 % 198 199 '/. it_histx(itc+l. gmparms. 165 X 166 % Keep the books on the function norms. it_histx(itc+l.2) = it_histx(itc+l. 171 ffc . inner_it_count.x).hist = [x. if iarm — 0 lambda = sigmal*lambda. lamm. keep the books on lambda. Imaxit]. if itc == 1..1.norm(fO). 123 it_histx(itc+1. 170 ffm = ffc. 180 return. 122 fnrm . 106 it_histx . 135 '/. 176 disp(outstat) 177 it_hist = it_histx(l:itc+1. end % % Update x. 167 '/. etamax * parms(3).x]. 127 X 128 % main iteration loop 129 % 130 while(fnrm > stop_tol t itc < maxit) 131 X 132 % Keep track of the ratio ( a = fnrm/frnmo) rt 133 % of successive residual norms and 134 % the iteration counter (itc) .3) = 0. nft . 126 outstat(itc+1. 175 ierr = 2.3). x_hist • [x_hist. ffc = nft*nft. end.parms(4). 104 if nargin — 4 105 maxit » parms(l). 141 142 % 152 153 154 155 156 157 158 159 160 % % Apply the three-point parabolic model.xt). 113 end 114 end 115 X 116 rtol .parms(2).:). 102 % 103 gmparms » [aba(etamax). ffO = nfO*nfO. errstep. f. ft = feval(f. 139 [step.. 107 gmparms . end 179 sol . . Check for optioned inputs. Evaluate f at the initial iterate. 125 stop_tol = atol + rtol*fnrm. 136 rat » inrm/fnrmo. x. compute the stop tolerance.2)+inner_f_evails+iarm+l.xold. n = length(x). ffm = nft*nft.norm(f0). 201 '/. 173 if iarm > maxarm 174 dispCArmijo failure. 181 end 182 end 183 x = xt.xt). 200 '/. 1]. 140 dkrylov(fO. atol « tol(l). 202 161 y.l) » fnrm. ffm = value of I I F(x_c + lambdam d) I I "2 X X output: '/. f_evals] = . lambdap . end if lambdap > sigmal*lambdac. total_iters. total_iters. 2003 X X X This code comes with no guarantee or warranty of any kind. errstep. April 1. end sol = x. ffO. params.it_histx(l:itc+l. lambdam. end X X X function [step. f. lambdap . gmparms(l) • max(gmparms(l) . end '/. 2003 X X This code comes with no guarantee or warranty of any kind. ffm) X Apply three-point safeguarded parabolic model for a line search. Imeth) X Krylov linear equation solver for use in nsoli X X C.. Kelley. dkrylov(fO. Imeth = method choice 1 GMRES without restarts (default) 2 GMRES(m). '/.sigaml * lambda.sigmal*lambdac. X X p(lambda) = ffO + (cl lambda + c2 lambda's)/dl X X dl = (lambdac . lambdam = previous step length X ffO .5/c2.1.:). it_hist . end V. end gmparms(l) = min([etanew. safeguarding bounds for the line search X 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 294 295 296 297 298 299 300 301 302 303 304 x 293 % X X X X X X X X X X X X Compute coefficients of interpolation polynomial. Kelley. x. sigmal ™ . 
X function lambdap = parab3p(lambdac.5*stop_tol/fnrm) . Imeth) X X X Input: fO = function at current point X f = nonlinear function X The format for f is function fx . X if fnrm > stop_tol. X sigmaO « . outstat(itc+l. params. f. X X C. ffm) % % input: X lambdac » current step length '/. '/.:). sigmal = 0.gamma*etaold*etaold). ffc. ierr = 1. On failure.5. X Note that for Newton-GMRES we incorporate any X preconditioning into the function routine. T. X x = current point params= vector to control iteration params(l) params(2) params(3) params(4) 1 — 2 — 3 — * relative residual reduction factor = max number of iterations = max number of restarts for GMRES(m) (Optional) .it_histx(l:itc+l. ffc.c + lambdac d) I I "2 '/. T.value of II F(x. X X function [step.max(etanew. :) « [itc fnrm inner_it_count rat iann].reorthogonalization method in GMRES Brown/Hindmarsh condition (default) Never reorthogonalize (not recommended) Always reorthogonalize (not cheap!) X X Set internal parameters.1 etanew . f.5.f(x). ffO.1. if lambdap < sigmaO*lambdac. lambdap = new value of lambda given parabolic model X X internal parameters: X sigmaO .0. if debug == 1 disp(outstat) it_hist . '/. return end cl = Iambdac*lambdac*(ffm-ff0)-lambdam*lambdam*(ffc-ff0). April 1. lambdam. function lambdap = parab3p(lambdac. lambdap « -cl*.203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 % end etaold .value of II F(x_c) ||-2 X ffc .lambdam)*lambdac*lambdam < 0 X so if c2 > 0 we have negative curvature and default to X lambdap . set the error flag. x.evals] X • dkrylov(fO. etanew = gamma*rat*rat.gmparms(l).etamax]). errstep.sigmal*lambdac. X c2 = Iambdam*(ffc-ff0)-lambdac*(ffm-ff0). if c2 >= 0 lambdap .sigmaO*lambdac. '/. . m = params(2) and the maximum number .. if gamma*etaold*etaold > . Compute the step using a GHRES routine especially designed 7. errstep. total.. w " point and direction % % f = function fO = f(x). end y. f. Kelley. restart. gmparms).evals = 2*total iters.l). for this purpose.vector of residual norms for the history of X. params(4)]. x. x.. April 1. errstep. f.limit = 0. 7. epsnew = l. end if Imeth — 1. Imaxit = params(2). params(2). 2003 y.dgmres(f0. % function z = dirder(x.iters] . errstep.number of iterations '/. % 3 BiCGSTAB '/. % f(x) has usually been computed 7. f. total_iters] . Scale the step. C. restart. % This code comes with no guarantee or warranty y.iters .step).iters] = dgmres(fO.305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 1. f. % inputs: x. the iteration % total.fO) % Finite difference directional derivative % Approximate f ( ) w. y. BiCGSTAB % y. n = length(x).3 [step. [step. before the call to dirder. y. 4 TFQHR % % Output: x = solution % errstep .evals « 2*total_iters.0. 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 end y.dcgstab(f0. if norm(w) == 0 z = zeros(n. y. in nonlinear iterations. T.w.limit = 20. x. % TFQMR y. return end .f .. params(2). elseif Imeth — 4 [step.f. y. '/. y. gmparms. kinn .iters == Imaxit ft . 7. gmparms).dirder(x. total_iters] » dtfqmr(fO. ' x of any kind. 7. else error('Imeth error in fdkrylov') end % '/. 
y.fO) y. end if length(params) ~ 3 V. else gmparms . x.. f. while total. elseif length(params) == 4 % % reorthogonalization method is params(4) % gmparms = Cparams(l). Restart at most restart_limit times. % linear y. kinn < restart_limit kinn = kinn+1. Use a hardwired difference increment. % default reorthogonalization % gmparms = [params(l). '/. function z . errstep. iterative methods if Imeth — 1 I Imeth == 2 % GHRES or GHRES(m) X y. f.w. total. errstep(total.d-7.limit . y. total. elseif Imeth -.iters = total_iters+kinn*Imaxit. f_evals = total_iters+kinn.params(3). y. gmparms).iters) > gmparms(l)*norm(fO) ft . 1]. % y. params(2)]. y. of restarts is params(3). y. % % initialization '/.[params(l). [step. if length(params) >= 3 restart. . dispCearly termination') 499 return 500 end 501 % 502 % 503 v(:. params. xinit . xinit = initial iterate. total. dirder. Input: fO " function at current point 434 'I.fO)/epsnew. total_iters] = dgmres(fO.rho].relative residual reduction factor 441 '/. This 448 '/. f. 506 '/.l).dO)*sign(xs). params (2) = max number of iterations 442 'I. will be used as the linear solver. error. Kelley. 428 % This code comes with no guarantee or warranty of any kind. 419 fl = feval(f. 481 end 482 % 483 '/. the iteration 454 % total. del and fl could share the same space if storage 416 % is more important than clarity. 492 % 493 % Test for termination on entry. 410 if xs -• O. 505 k = 0. 451 % Output: x .l. params (1) .solution 452 '/t error « vector of residual norms for the history of 453 'I. 421 % 422 423 function [x. 430 % function [x. 494 '/.dO 411 epsnew=epsnew*max(abs(xs). f = nonlinear function 435 '/. 507 % GMRES iteration 508 '/. 436 % Note that for Newton-GHRES we incorporate any 437 % preconditioning into the function routine. error. 462 kmax = params(2). 489 g . 461 errtol = params(l). 3 — Always reorthogonalize (not cheap!) 446 */. xinit) % 432 % 433 */. 420 z « (fl . 417 % 418 del = x+epsnew*w.kmax). 497 if(rho < errtol) 498 */. initialization 460 */. 477 r = b. m) 475 '/. 469 % 470 b . 480 r « -dirder(xc. f. April 1.iters .iters = number of iterations 455 % 456 % requires givapp. 490 errtol = errtol*norm(b). 438 '/. 2 — Never reorthogonalize ( o recommended) nt 445 '/. 2 0 03 427 '/. CURES linear equation solver for use in Nevton-GMRES solver 425 % 426 */. 447 •/. 1 — Brown/Hindmarsh condition (default) 444 '/.m 457 431 y.0. 415 '/. xc. 450 '/.m. f. C. 495 error » [error. params(3) (Optional) = reorthogonalization method 443 '/.zeros(kmax).l).407 '/.l). 468 '/. 466 end 467 '/.1). 471 n = length(b). is a reasonable choice unless restarted GMRES 449 '/. 485 v • zeros(n.dgmres(fO.f (x) . params. 463 reorth « 1.zeros(kmax+l.iters] . 472 X 473 % Use zero vector as initial iterate for Newton step unless 474 •/. xinit) 424 y. 429 '/. x. fO)-fO.del). 476 x = zeros(n. 496 total. 478 if nargin == S 479 x = xinit.-fO. the calling routine has a better idea (useful for GMRES ( ) . 504 beta = rho. 484 h . T. xc = current point 439 % params » vector to control iteration 440 '/. 464 if length(params) == 3 465 reorth = params(3). 408 % 409 xs=(x'*w)/norm(w). 414 '/. Now scale the difference increment. 488 rho = norm(r). The format for f is function f x . 486 c = zeros(kmax+1. 487 s .rho*eye(kmax+l. xc. 412 end 413 epsnew=epsnew/norm(w). The right side of the linear equation for the step is -fO. 491 error = [] .0 is the default.l) = r/rho. 458 % 459 '/. g(k:k+l) .001*normav2 »• normav) I reorth -.norm(v(:.h(j.l:k h(j. rho . y. 
end Form and store the information for the new Givens rotation.k)/nu. 543 % 544 % 545 546 547 548 % 549 % 550 X 551 552 553 '/.iters] 'I. y. end function [x. v(:. y.k+l) • dirder(xc.k+l)). T.3 for j = l:k hr .k+1. Kelley. s(k) = -h(k+l.509 510 511 512 513 514 515 516 while((rho > errtol) ft (k < kmax)) k . normav . % function vrot = givapp(c. h(k. 537 '/.h(l:k. total.abs(g(k+O). Thanks to Howard Elman for this.k+l)-h(j.v(:.0) v(:.s.k+O). '/.k).h(l:k. April 1. function [x. % 517 % 518 % 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 Modified Gram-Schmidt for j .k). f.k) . total.. '/.s(k).k). vin.l:k)*y. xinit) % Forward difference BiCGSTAB solver for use in nsoli y. 2003 y.k) ". end '/.k+l)-hr*v(:. 2003 y. y.nonn(v(:. xc. % At this point either k > kmax or rho < errtol. % C.k-l).k)*v(:.j)'*v(:. y. Kelley.k) .k+l) = v(:. 538 */.k)).1).k) = norm(v(:.j). April 1. s. params.k+l) = v(:. v(:. % v(:. end 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 y.k) = v(:.v(:. X y.k+l). f 0) . w2 . '/. total.w2].k)/nu. vrot(i:i+l) = [wl.givapp(c..k+l). V.j)'*v(:. Update the residual norm. if nu ~= 0 c(k) = h(k.l:k)\g(l:k).rho]. X % end of the main while loop % end y.iters = k. for i = l:k wl = c(i)*vrot(i)-s(i)*vrot(i+l).iters] «. y.k)-s(k)*h(k+l. end h(k+l.k) = c(k)*h(k. X y.vin. nu « norm(h(k:k+1. dcgstab(fO.k)/nu). Here's a modest change that makes the code work in complex % arithmetic.j). error • [error. T. if Reorthogonalize? (reorth — 1 t normav + . k) vrot « vin. % % function vrot . f. used within GMRES codes. error. f. c(k) . 554 555 556 557 558 559 if(h(k+l. end h(k+l.s(i)*vrot(i)+c(i)*vrot(i+l).s(i)*vrot(i)+conj(c(i))*vrot(i+l). xc.k).k+O). x • x + v(l:n. % It's time to compute x and leave. v(:.k).k) '/. C. % w2 .k)+hr. = dcgstabCfO. This code comes with no guarantee or warranty of any kind. h(j. normav2 • h(k+l.k) • 0.g(k:k+l).givapp(c(k). y.conj(h(k. xinit) 00 . '/. % y.s(l:k-l). h(k+l. This code comes with no guarantee or warranty of any kind. % Call directional derivative function. Watch out for happy breakdown. '/. y . if k > 1 h(l:k.k+l)/h(k+l. 539 540 541 542 7. end Don't divide by zero if solution has been found. params.givapp(c(l:k-l).k+l) . error. y. 1. Apply a sequence of k Givens rotations.k) . 693 % This code comes with no guarantee or warranty of any kind. f.. f. r = -dirder(xc. V. xinit = initial iterate.zeta].number of iterations '/.f. This '/. omega ™ 1. . if omega ~ 0 error('BiCGSTAB breakdown. y. oo ro 712 y. hatrO . This 711 '/.s.f. t .p. rho « zeros(kmax+1. The format for f is function fx = f(x).1). 666 if tau == 0 667 error('BiCGSTAB breakdown. error. 665 tau « hatrO'*v. xinit = initial iterate.hatrO'*r.(rho(k+l)/rho(k))*(alpha/omega).l). 705 */. initialization % b .iters » k. 683 end 64 '. '/. xinit . 682 error » [error. Output: x « solution "/. 7. % Requires: dirder. alpha « 1. errtol = params(1)*norm(b). if nargin == 5 x . zeta]. omega . k = 0. 662 beta. 2003 692 '/. 8 / 685 % 686 687 function [x. params.l). error = [].max number of iterations '/. 678 x " x+alpha*p+omega*s. rho(l) = 1. params. 664 v = dirder(xc. 675 end 676 omega = t'*s/tau. fO)-fO.0 ) '. 672 tau « t'*t. is a reasonable choice unless restarts are needed. v » zeros(n. while((zeta > errtol) ft (k < kmax)) k . xc » current point 706 % params = vector to control iteration 707 % params(1) = relative residual reduction factor 708 '/. 
673 if tau == 0 674 error('BiCGSTAB breakdown.xinit. zeta = norm(r). 688 dtfqmr(fO.611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 '/. % BiCGSTAB iteration •/. tau =0'). xinit) 689 % Forward difference TFQHR solver for use in nsoli 60 '. x = zeros(n. end y.zeros(n. 680 zeta » norm(r).0 is the default. Kelley.length(b).relative residual reduction factor '/. error = [error.omega*v). 668 end 669 alpha .0 ) '.l). end '/. Use zero vector as initial iterate for Newton step unless '/. rho(2) . kmax = params(2). April 1. params(l) . 681 total. 64 '. 677 rho(k+2) = -omega*(hatrO'*t). '/.fO). x. 703 % Note that for Newton-GMRES we incorporate any 704 % preconditioning into the function routine. xinit = 0 is the default. 671 t = dirder(xc. xc. the calling routine has a better idea (useful for CURES ( ) . Input: fO = function at current point % f » nonlinear function % The format for f is function fx . r = b. n ..iters .fO).rho(k+l)/tau. total. is a reasonable choice unless restarts are needed.s-omega*t. error = vector of residual norms for the history of */. 663 p = r+beta*(p .r-alpha*v. function [x. 679 r . T.f(x).k+1. y.m % */. m) '/. 9 / 691 '/. xinit) 697 •/. error. 698 % 699 % 700 % Input: fO = function at current point 701 % f • nonlinear function 702 y. '/. C. params (2) . total_iters] 696 % = dtfqmr(fO. p . 9 / 695 7.iters] = .r. xc. params = two-dimensional vector to control iteration '/. Note that for Newton-GMRES we incorporate any X preconditioning into the function routine. params (2) ™ max number of iterations 709 % 710 '/. ic = current point '/.-fO. 670 s . '/. the iteration % total. y. f. 759 if j — 2 760 y(:.v.[error. d • y(:. tau].l). 754 •/.tau]. beta .zeros(n.l) . 739 rho = tau*tau.[error. total.1).y(:. rho . 734 u = zeros(n. 764 765 76 6 767 768 769 % 770 7.j). 745 sigma = r'*v.xinit. 746 '/.2). x.rho/sigma. OO W .tau*theta*c. Try to terminate the iteration at each pass through the loop. y .j)+(theta*theta*eta/alpha)*d.l)-alpha»v. 758 '/.2).f.fO). y(:.rhon/rho. 725 b = -fO.2) = dirder(xc.l)+beta*(u(:.l).0') end rhon = r'*w.fO). 732 end 733 7.l) . error . x . kmax = params(2).f. y(:. 726 n » length(b). return end end if rho — 0 error('TFQMR breakdown. 793 Y.0. 731 r = -dirder(xc.x+eta*d. 740 % 741 % TFQMR iteration 742 X 743 while( k < kmax) 744 k = k+1. 757 % Compute y2 and u2 only if you have to.iters • number of iterations 718 '/.iters = k.r. tau = norm(r). theta . f. eta .m 720 '/.2)+beta*v). 723 % initialization 724 7.713 % 714 % Output: x .f.0. Requires: dirder. rho . 735 k = 0.l). eta = c*c*alpha.rhon.2). 761 u(:. 772 773 774 775 776 777 778 % 779 % 780 % 781 782 783 784 % 785 786 787 788 789 790 791 end 792 '/. 736 v • dirder(xc. total.2). 762 end 763 m » 2*k-2+j. 755 for j = 1:2 756 7. tau]. if tau*sqrt(m+l) <= errtol error = [error. sigma = 0') 749 end 750 X 751 alpha . total.l) = w + beta*y(:. 728 r = b. fO)-fO.l) . y(:. tau . error . w " r. 719 '/.vector of residual norms for the history of 716 % the iteration 717 7. y(:. 747 if sigma — 0 748 error('TFQMR breakdown. d • zeros(n. 752 % 753 '/. 771 '/. 727 x = zeros(n. errtol = params(l)*norm(b).norm(w)/tau. 729 if nargin == 5 730 x .fO). v » w-alpha*u(:. u(:. error = []. c = l/sqrt(l+theta*theta).2) .iters = k. 737 u(:.solution 715 % error . y(:. 738 theta .dirderCxc. 721 722 */. v = u(:. This page intentionally left blank . 
Chapter 4

Broyden's Method

Broyden's method [14] approximates the Newton direction by using an approximation of the Jacobian (or its inverse), which is updated as the nonlinear iteration progresses. Broyden's method is the simplest of the quasi-Newton methods. These methods are extensions of the secant method to several variables. Recall that the secant method approximates f'(x_n) with

b_n = (f(x_n) - f(x_{n-1}))/(x_n - x_{n-1})

and then takes the step

x_{n+1} = x_n - f(x_n)/b_n.

One way to mimic this in higher dimensions is to carry an approximation to the Jacobian along with the approximation to x* and update the approximate Jacobian as the iteration progresses. The formula for b_n will not do, because one cannot divide by a vector; instead one can ask that B_n, the current approximation to F'(x_n), satisfy the secant equation

B_n s_{n-1} = y_{n-1},  where s_{n-1} = x_n - x_{n-1} and y_{n-1} = F(x_n) - F(x_{n-1}).

The cost of this updating in the modern implementation we advocate here is one vector of storage for each nonlinear iteration. Contrast this cost to Newton-GMRES, where the storage is accumulated during a linear iteration. For a problem where the initial iterate is far from a solution and the number of nonlinear iterations will be large, this is a significant disadvantage for Broyden's method. Broyden's method usually requires preconditioning to perform well, so the decisions you will make are the same as those for a Newton-Krylov method. However, Broyden's method, like the secant method for scalar equations, does not guarantee that the approximate Newton direction will be a descent direction for ||F||, and therefore a line search may fail. For these reasons, the Newton-Krylov methods are now (2003) used more frequently than Broyden's method. Having said that, Broyden's method can perform very well when the initial iterate is near the solution.
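Because Broyden's method is built by analogy with the scalar secant method, it may help to see that iteration in a few lines. This is a minimal sketch for an illustrative scalar example; it is not part of the book's software.

% Minimal sketch of the scalar secant iteration that Broyden's method extends:
%   b_n = (f(x_n) - f(x_{n-1}))/(x_n - x_{n-1}),  x_{n+1} = x_n - f(x_n)/b_n.
f = @(x) x^2 - 2;                        % illustrative problem, root sqrt(2)
xm = 1; x = 2;                           % two starting points
for n = 1:8
    b = (f(x) - f(xm))/(x - xm);         % secant slope
    xnew = x - f(x)/b;
    xm = x; x = xnew;
end
fprintf('secant approximation to sqrt(2): %.15f\n', x);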
One can also factor BQ and update that factorization (see [24] for one way to do this). Left preconditioning works in the following way. Rather than use BQ = A. BQ = F'(XQ] is a good choice.3 Computing the Broyden Step and Update One way to solve the equation for the Broyden step is to factor Bn with each iteration. v € RN. One then applies this factorization at a cost of O(7V2) floating point operations whenever one wants to compute A~lF(x) or F(A~lx). Use a line search to compute a step length A. but the nonlinear iteration will be different. If the line search fails. are very similar to those for Newton-GMRES.1. one could apply Broyden's method to the left-preconditioned problem A~lF(x) = 0 and use BQ — I. Computing the Broyden Step and Update 87 Algorithm 4. BQ = I is still a good choice. This will amortize the O(N3} factorization of A over the entire nonlinear iteration. eliminates part of the advantage of approximating the Jacobian. one uses the right-preconditioned problem F(A~1x) = 0. Keep in mind that one will never compute and store A~1. Unlike inexact Newton methods or Newton iterative methods. F. If.3. using preconditioning to arrange things so that BQ = I. If the initial iterate is accurate.70]. terminate. but rather factor A and store the factors. The two sequences of approximate solutions are exactly the same [42]. rr) Evaluate while || F (or) || > r do Solve Bd = -F(x). Ta. as we will see. quasi-Newton methods need only one function evaluation for each nonlinear iteration. then B + UVT is nonsingular if and only if . instead. The storage requirements for Broyden's method. This. If the standard assumptions hold and the data XQ and BQ are accurate approximations to x* and F'(x*).4. Suppose A « F'(x*). B. If B is a nonsingular matrix and u. end while The local convergence theory for Broyden's method is completely satisfactory. 4. of course. There are many ways to obtain a good BQ. 5) to Broyden's method. Chapter 4. we use (4. The storage can be halved with a trick [26. for k > 0. d < F(x}.6) at a cost of O(Nn) floating point operations and storage of the 2n vectors {u>fc}£~Q and {sfc}fcZ0. rffl. where. Note that we also must store the sequence of step lengths.88 1 + vTB~lu ^ 0. to apply B~l to a vector p. Algorithm 4. Keep in mind that we assume that F has been preconditioned and that BQ = I. Broyden's Method To apply (4. broyden(x. x <— x + s n<-0 while ||F(x)|| > T do . we write (4.m.2.42] using the observation that the search direction satisfies Hence (see [42] for details) one can compute the search direction and update B simultaneously and only have to store one new vector for each nonlinear iteration. keeping in mind that BQ = /. F. So. Algorithm broyden shows how this is implemented in our Broyden-Armijo MATLAB code brsola. Terminate if the line search fails. compute AQ with a line search.4) as where Then. SG <~ Aod. In that case. r <— rr|F(x)| + ra. rr) Evaluate F(x). 4. 4. Terminate if the line search fails. There are a few failure modes that are unique to Broyden's method. As with line search failure. the iteration can be restarted if there is no more room to store the vectors [30.m is an implementation of Broyden's method as described in Algorithm broyden. parms).tol.55]. called limited memory in the optimization literature [54. When the nonlinear iteration converges slowly or the method completely fails.4.m allows for this.m has a line search. .4. is to replace the oldest of the stored steps with the most recent one.m and nsoli.m brsola. 
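Before the storage-saving implementation, it may help to see the update in its plainest form. The sketch below runs Broyden's method with an explicit dense matrix B, no line search, and no Sherman-Morrison trick; it is only meant to make the update B_{n+1} = B_n + (y - B_n s) s^T / (s^T s) concrete, not to replace brsola.m.

function [x, resnorms] = broyden_dense(F, x, B, tol, maxit)
% Sketch: Broyden's method with an explicit B and no line search (illustration only).
fx = F(x); resnorms = norm(fx);
for n = 1:maxit
    if norm(fx) <= tol, return; end
    s = -(B\fx);                         % solve B s = -F(x)
    x = x + s;
    fnew = F(x);
    y = fnew - fx;
    B = B + ((y - B*s)*s')/(s'*s);       % Broyden update; B_{n+1} s = y holds
    fx = fnew;
    resnorms(end+1,1) = norm(fx);        % record the residual history
end
end

With B equal to the identity and the preconditioning folded into F, this produces the same iterates as the vector-storage implementation described in the text, but at a cost of O(N^2) storage and O(N^3) work per step.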
the preconditioner is one likely cause. If the data are poor or you use all available storage for updating B. Our MATLAB code brsola.42]. you may need to find a better preconditioner or switch to a Newton-Krylov method. the nonlinear iteration may fail.1 Failure of the Line Search There is no guarantee that a line search will succeed with Broyden's method. The user interface is similar to those of nsold.2 Failure to Converge The local theory for Broyden states that the convergence is superlinear if the data XQ and BQ are good.5 Using brsola. n – 1 do end for Compute A n +i with a line search. 4.m: [sol.4.f. What Can Go Wrong? 89 for j = 0.4. but if you find that it fails. A different approach. like the chord method. 4. x_hist] = brsola(x. ierr. which. better preconditioning may fix this. Our code brsola.4 What Can Go Wrong? Most of the problems you'll encounter are shared with the Newton-Krylov methods. it_hist. has no guarantee of global convergence. end while As is the case with GMRES. 4. maxitl is the maximum number of nonlinear iterations before a restart (so maxitl — I vectors are stored for the nonlinear iteration).m. optionally. However. One can increase this by changing an internal parameter maxarm in the code.m The required data for brsola. the outputs are the solution sol and. and ierr = 2. The first is the Euclidean norm of the nonlinear residual ||F(x)||.m Exactly as in nsoli. rather than the generous 20 given to nsoli. the function /. which means that the step length was reduced 10 times in the line search without satisfaction of the sufficient decrease condition (1. Because of the uncertainty of the line search.r r ) contains the tolerances for the termination criterion (1. 4. Notice that we give the line search only 10 chances to satisfy (1. the line search will fail if you use brsola.m in the directory for this chapter is an example of how to use brsola.2 (unless you find a good preconditioner). The vector tol = (r a . a history of the iteration. which means that the termination criterion is not met after maxit iterations.1 Input to brsola. The history array itJiist has three columns. For example. asking for the iteration history {xn} by including xJiist in the argument list can expend all of MATLAB's storage. Broyden's Method 4. Broyden's method is not as generally applicable as a Newton-Krylov method.6 Examples Broyden's method. where the Jacobian-vector product is highly accurate.12). The parms array is maxit is the upper limit on the nonlinear iterations.6. .m are x. The error flag ierr is 0 if the nonlinear iteration terminates successfully. The default is 40. is superlinearly convergent in the terminal phase of the iteration. For large problems. the second is the cumulative number of calls to F.m and the sequence of iterates to make a movie. x and / must be column vectors of the same length.21). The failure modes are ierr = 1. and the tolerances for termination. The syntax for / is function = f ( x ) . an error flag. We warn you again not to ask for the sequence {xn} unless you have the storage for this array.m.90 Chapter 4. and the third is the number of step-size reductions done in the line search. The code heqmovie. Broyden's method is useless. the default is 40. and the entire sequence {xn}.5. when working well.5. when the line search fails.m to solve the OZ equations from section 3.2 Output from brsola.21). We used the identity as the initial approximation for the Jacobian (i. initial iterate. Nonlinear residual versus calls to F. 
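If Broyden's method fails in one of the ways just described, an inexpensive first response is to check the error flag and fall back, either to a better-preconditioned residual or to a Newton-Krylov solver. The residual function name in this sketch is hypothetical; the pattern is the point.

% Hedged sketch: react to a brsola.m failure as the text suggests.
% 'fprecond' is a hypothetical preconditioned residual function.
[sol, it_hist, ierr] = brsola(x0, 'fprecond', tol);
if ierr == 2
    fprintf('line search failed in brsola; trying nsoli\n');
    [sol, it_hist, ierr] = nsoli(x0, 'fprecond', tol);
elseif ierr == 1
    fprintf('brsola did not converge in maxit iterations; trying nsoli\n');
    [sol, it_hist, ierr] = nsoli(x0, 'fprecond', tol);
end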
but at a cost of 15 function evaluations.2 Convection-Diffusion Equation In this section we compare Broyden's method to the right-preconditioned partial differential equation from section 3. it_hist.3 and 3.6.m with both nsoli. the overall cost is about the same. Since we can evaluate the Jacobian for the H-equation very efficiently.1. Figure 4. after which the updating takes effect.3. x=ones(n.m.l). nsoli terminates in 5 iterations.6.m is the call to brsola.'heq'.m and nsold. 4.1 with brsola. m generated these results. Right preconditioning.4. % Solve the H-equation with brsola.6. we did not precondition). Examples 91 4. Broyden's method terminated after 7 nonlinear iterations and 8 function evaluations. The MATLAB code that generated this .7. ierr] = brsola(x. broyden is at its best for this kind of problem.6. [sol.1 Chandrasekhar H-equation We'll solve the same problem (equation. one can see that the nonlinear iteration is slower than the two Newton-based methods for the first few iterations. The MATLAB code heqbdemo. We compare brsola..e.6. nsold evaluates the Jacobian only once and takes 12 nonlinear iterations and 13 function evaluations to terminate.m using the default choices of the parameters. and tolerances) as we did in sections 2.tol). This fragment from heqbdemo.m. 92 Chapter 4. Broyden's method takes more than 20% fewer nonlinear function evaluations. When storage is limited. While Broyden's method takes more nonlinear iterations. Broyden's Method example is pdebrr. This is an interesting example because the line search in brsola.m succeeds.m reduces the step length once on the second nonlinear iteration. Even so. which required no reduction.m. shows that simply counting iterations is not enough to compare the methods.1. nsoli. Newton-GMRES. In the case of the left-preconditioned convection-diffusion problem. took at most 6 linear iterations for each nonlinear iteration. . the cost in terms of calls to the function is significantly less. the results (obtained with pdebrl. Broyden's method is less impressive.m does not need the line search at all. but brsola. on the other hand. For left preconditioning. one of the plots created by pdebrr. In spite of the extra work in the line search. Broyden's method required 10 nonlinear iterations at a cost of 10 vectors of storage. Broyden's method does best on this example. Figure 4. for example. Contrast this with the two solves using nsoli . reducing the step length once on iterations 2 and 3.m) are similar.m.m. fc-fO.l)'*stp(:.3)=0. n = length(x). 1 turns display on. x. x_hist] « brsola. return end % % initialize the iteration history storage matrices '/. ierr. :) .maxdim). so maxdim-1 vectors are % stored 7. % % evaluate f at the initial iterate 7. This % is useful for making movies.turns on/off iteration statistics display as % the iteration progresses % % alpha = l. if fnrm < stop. one vector storage X % C.3) = 12 norms of nonlinear residuals % for the iteration. it_hist.[itc fnrm 0 0]. maxit-40. debug=0. % % initialize it_hist. function . parms) % BRSOLA Broyden's Method solver.hist . for example. end if nargout«™4 x_hist-x.f '/.hist] » brsola(z. end rtol=tol(2). maxdim=39. tol . Storage is only allocated % if x_hist is in the output argument list.f. % % x. Armijo rule. maxarm-10.1 if after maxit iterations 'I.histx-zeros(maxit. and set the iteration parameters % ierr = 0. Kelley.2 failure in the line search.1). maximum number of steplength reductions before failure is reported % % set the debug parameter. 
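The comparisons in this section are easy to reproduce because both codes return the same three-column iteration history. The sketch below plots relative residual against the cumulative number of calls to F (column 2 of it_hist in both codes); it assumes the book's H-equation residual and its parameter setup are on the path.

% Hedged sketch: compare brsola.m and nsoli.m on the H-equation.
x0 = ones(100,1); tol = [1e-8, 1e-8];
[soln, histn] = nsoli(x0, 'heq', tol);
[solb, histb] = brsola(x0, 'heq', tol);
semilogy(histn(:,2), histn(:,1)/histn(1,1), '-', ...
         histb(:,2), histb(:,1)/histb(1,1), '--');
legend('Newton-GMRES', 'Broyden'); xlabel('calls to F');
ylabel('relative nonlinear residual');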
4.7 Source Code for brsola.m

function [sol, it_hist, ierr, x_hist] = brsola(x,f,tol,parms)
% BRSOLA  Broyden's Method solver, globally convergent
%         solver for f(x) = 0, Armijo rule, one vector storage
%
% C. T. Kelley, April 1, 2003
%
% This code comes with no guarantee or warranty of any kind.
%
% function [sol, it_hist, ierr, x_hist] = brsola(x,f,tol,parms)
%
% inputs:
%        initial iterate = x
%        function = f
%        tol = [atol, rtol] relative/absolute
%              error tolerances for the nonlinear iteration
%        parms = [maxit, maxdim]
%              maxit = maximum number of nonlinear iterations
%                  default = 40
%              maxdim = maximum number of Broyden iterations
%                  before restart, so maxdim-1 vectors are stored
%                  default = 40
%
% output:
%        sol = solution
%        it_hist(maxit,3) = l2 norms of nonlinear residuals
%            for the iteration, number of function evaluations,
%            and number of steplength reductions
%        ierr = 0 upon successful termination
%        ierr = 1 if after maxit iterations
%            the termination criterion is not satisfied
%        ierr = 2 failure in the line search. The iteration
%            is terminated if too many steplength reductions
%            are taken.
%
%        x_hist = matrix of the entire iteration history.
%            The columns are the nonlinear iterates. This
%            is useful for making movies, for example, but
%            can consume way too much storage. This is an
%            OPTIONAL argument. Storage is only allocated
%            if x_hist is in the output argument list.
%
% internal parameters:
%        debug = turns on/off iteration statistics display as
%            the iteration progresses
%
%        alpha = 1.d-4, parameter to measure sufficient decrease
%
%        maxarm = 10, maximum number of steplength reductions before
%            failure is reported
%
%
% set the debug parameter, 1 turns display on, otherwise off
%
debug = 0;
%
% initialize it_hist, ierr, x_hist, and set the iteration parameters
%
ierr = 0; maxit = 40; maxdim = 39;
it_histx = zeros(maxit,3);
maxarm = 10;
%
if nargin == 4
    maxit = parms(1); maxdim = parms(2) - 1;
end
if nargout == 4
    x_hist = x;
end
rtol = tol(2); atol = tol(1); n = length(x); fnrm = 1; itc = 0; nbroy = 0;
%
% evaluate f at the initial iterate
% compute the stop tolerance
%
f0 = feval(f,x);
fc = f0;
fnrm = norm(f0);
it_histx(itc+1,1) = fnrm; it_histx(itc+1,2) = 0; it_histx(itc+1,3) = 0;
fnrmo = 1;
stop_tol = atol + rtol*fnrm;
outstat(itc+1,:) = [itc fnrm 0 0];
%
% terminate on entry?
%
if fnrm < stop_tol
    sol = x;
    it_hist = it_histx(1,:);
    return
end
%
% initialize the iteration history storage matrices
%
stp = zeros(n,maxdim);
stp_nrm = zeros(maxdim,1);
lam_rec = ones(maxdim,1);
%
% Set the initial step to -F, compute the step norm
%
lambda = 1;
stp(:,1) = -fc;
stp_nrm(1) = stp(:,1)'*stp(:,1);
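%
% Orientation comment (added by the editor; it does not change the code):
% rather than storing an approximate Jacobian B_n, the code keeps the steps
% stp(:,k) (scaled by the line search), their squared norms stp_nrm(k), and
% the step lengths lam_rec(k). The next search direction is rebuilt from
% these stored vectors at the bottom of the loop below, so the storage cost
% is one vector per nonlinear iteration between restarts.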
%
% main iteration loop
%
while(itc < maxit)
%
    nbroy = nbroy+1;
%
%   keep track of successive residual norms and
%   the iteration counter (itc)
%
    fnrmo = fnrm; itc = itc+1;
%
%   compute the new point, test for termination before
%   adding to iteration history
%
    xold = x; lambda = 1; iarm = 0; lrat = .5; alpha = 1.d-4;
    x = x + stp(:,nbroy);
    if nargout == 4
        x_hist = [x_hist,x];
    end
    fc = feval(f,x);
    fnrm = norm(fc);
    ff0 = fnrmo*fnrmo; ffc = fnrm*fnrm; lamc = lambda;
%
%   Line search, we assume that the Broyden direction is an
%   inexact Newton direction. If the line search fails to
%   find sufficient decrease after maxarm steplength reductions
%   brsola.m returns with failure.
%
%   Three-point parabolic line search
%
    while fnrm >= (1 - lambda*alpha)*fnrmo & iarm < maxarm
        if iarm == 0
            lambda = lambda*lrat;
        else
            lambda = parab3p(lamc, lamm, ff0, ffc, ffm);
        end
        lamm = lamc; ffm = ffc; lamc = lambda;
        x = xold + lambda*stp(:,nbroy);
        fc = feval(f,x);
        fnrm = norm(fc); ffc = fnrm*fnrm;
        iarm = iarm+1;
    end
%
%   set error flag and return on failure of the line search
%
    if iarm == maxarm
        disp('Line search failure in brsola.m ')
        ierr = 2;
        it_hist = it_histx(1:itc+1,:);
        sol = xold;
        if nargout == 4
            x_hist = [x_hist,x];
        end
        return;
    end
%
%   How many function evaluations did this iteration require?
%
    it_histx(itc+1,1) = fnrm;
    it_histx(itc+1,2) = it_histx(itc,2)+iarm+1;
    if (itc == 1)
        it_histx(itc+1,2) = it_histx(itc+1,2)+1;
    end
    it_histx(itc+1,3) = iarm;
%
%   terminate?
%
    if fnrm < stop_tol
        sol = x;
        rat = fnrm/fnrmo;
        outstat(itc+1,:) = [itc fnrm iarm rat];
        it_hist = it_histx(1:itc+1,:);
        if debug == 1
            disp(outstat(itc+1,:))
        end
        return
    end
%
%   modify the step and step norm if needed to reflect the line search
%
    lam_rec(nbroy) = lambda;
    if lambda ~= 1
        stp(:,nbroy) = lambda*stp(:,nbroy);
        stp_nrm(nbroy) = lambda*lambda*stp_nrm(nbroy);
    end
%
    rat = fnrm/fnrmo;
    outstat(itc+1,:) = [itc fnrm iarm rat];
    if debug == 1
        disp(outstat(itc+1,:))
    end
%
%   if there's room, compute the next search direction and step norm
%   and add to the iteration history
%
    if nbroy < maxdim+1
        z = -fc;
        if nbroy > 1
            for kbr = 1:nbroy-1
                ztmp = stp(:,kbr+1)/lam_rec(kbr+1);
                ztmp = ztmp + (1 - 1/lam_rec(kbr))*stp(:,kbr);
                ztmp = ztmp*lam_rec(kbr);
                z = z + ztmp*((stp(:,kbr)'*z)/stp_nrm(kbr));
            end
        end
%
%       store the new search direction and its norm
%
        a2 = -lam_rec(nbroy)/stp_nrm(nbroy);
        a1 = 1 - lam_rec(nbroy);
        zz = stp(:,nbroy)'*z;
        a3 = a1*zz/stp_nrm(nbroy);
        a4 = 1 + a2*zz;
        stp(:,nbroy+1) = (z - a3*stp(:,nbroy))/a4;
        stp_nrm(nbroy+1) = stp(:,nbroy+1)'*stp(:,nbroy+1);
%
    else
%
%       out of room, time to restart
%
        stp(:,1) = -fc;
        stp_nrm(1) = stp(:,1)'*stp(:,1);
        nbroy = 0;
%
    end
%
% end while
end
%
% We're not supposed to be here, we've taken the maximum
% number of iterations and not terminated.
%
sol = x;
it_hist = it_histx(1:itc+1,:);
if nargout == 4
    x_hist = [x_hist,x];
end
ierr = 1;
if debug == 1
    disp(outstat)
end
%
function lambdap = parab3p(lambdac, lambdam, ff0, ffc, ffm)
% Apply three-point safeguarded parabolic model for a line search.
%
% C. T. Kelley, April 1, 2003
%
% This code comes with no guarantee or warranty of any kind.
%
% function lambdap = parab3p(lambdac, lambdam, ff0, ffc, ffm)
%
% input:
%       lambdac = current steplength
%       lambdam = previous steplength
%       ff0 = value of \| F(x_c) \|^2
%       ffc = value of \| F(x_c + \lambdac d) \|^2
%       ffm = value of \| F(x_c + \lambdam d) \|^2
%
% output:
%       lambdap = new value of lambda given parabolic model
%
% internal parameters:
%       sigma0 = .1, sigma1 = .5, safeguarding bounds for the linesearch
%
% Set internal parameters.
%
sigma0 = .1; sigma1 = .5;
%
% Compute coefficients of interpolation polynomial.
%
% p(lambda) = ff0 + (c1 lambda + c2 lambda^2)/d1
%
% d1 = (lambdac - lambdam)*lambdac*lambdam < 0
%      so if c2 > 0 we have negative curvature and default to
%      lambdap = sigma1 * lambda
%
c2 = lambdam*(ffc-ff0) - lambdac*(ffm-ff0);
if c2 >= 0
    lambdap = sigma1*lambdac; return
end
c1 = lambdac*lambdac*(ffm-ff0) - lambdam*lambdam*(ffc-ff0);
lambdap = -c1*.5/c2;
if lambdap < sigma0*lambdac, lambdap = sigma0*lambdac; end
if lambdap > sigma1*lambdac, lambdap = sigma1*lambdac; end
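The safeguarded parabolic model in parab3p can be examined in isolation. The fragment below is a sketch with made-up numbers; since parab3p is a subfunction of brsola.m, running it requires saving a copy of the function in its own file, parab3p.m.

% Sketch: the safeguarded parabolic step length predictor on sample data.
% Suppose the full step (lambda = 1) and the first reduction (lambda = .5)
% both failed to give sufficient decrease. The values below are invented.
ff0 = 100;               % || F(x_c) ||^2
ffm = 700;               % || F(x_c + 1.0*d) ||^2, the previous (full) step
ffc = 150;               % || F(x_c + 0.5*d) ||^2, the current step
lambdac = 0.5; lambdam = 1.0;
lambdap = parab3p(lambdac, lambdam, ff0, ffc, ffm)
% The interpolating parabola has its minimizer at lambda = 0.2, which lies
% inside the safeguard interval [sigma0*lambdac, sigma1*lambdac] = [0.05, 0.25],
% so parab3p returns 0.2.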