# A note on the relation between the Newton homotopy method and the damped Newton method.

1. Introduction. The primary goal of this paper is to present recently observed relations between the homotopy continuation and the damped Newton methods for nonlinear equations F(x) = 0, where F is a nonlinear map from a Banach space X to a Banach space Z. Specifically, the homotopy considered in this paper is the so called Newton homotopy,

(1.1) H(x,t) = F(x) + (t - 1)F([x.sup.0]), t [member of] [0,1].

Some results on such relations were derived earlier; see for example [5] and [3]. These papers are mainly concerned with the relations between the differential equations that determine the paths followed by the algorithms. However, what mainly interests us are the relations between the two sequences generated by the predictor-corrector (PC) homotopy continuation method and the damped Newton method, since in actual implementation these two sequences generally deviate somewhat from the continuous paths. The iteration processes will be investigated and relations will be explored in terms of the marching directions and the step sizes.

The paper is organized as follows. In Section 2, the explicit Euler method for an ordinary differential equation (ODE) initial value problem (IVP) is investigated under a parameter transformation. In Section 3, the relations between the PC homotopy continuation method and the damped Newton method are explored under the same space setting. The numerical performance is illustrated in Section 4 and concluding remarks are presented in Section 5.

2. The explicit Euler method under parameter transformation. For clarity, in the following we denote the sequence generated by the homotopy continuation method as ([x.sup.k], [t.sub.k]) and the sequence generated by the damped Newton method as ([y.sup.k], [[lambda].sub.k]); ([[bar.y].sup.k], [t.sub.k]) denotes the same sequence after the parameter transformation. As it is known, the damped Newton method

[y.sup.k+1] = [y.sup.k] - [DELTA][[lambda].sub.k]F' [([y.sup.k]).sup.-1] F([y.sup.k]), [y.sup.0] = [x.sup.0], can be considered an explicit Euler method for the following ODE IVP,

(21) {dy/d[lambda] = -F'[(y).sup.-1]F(y), [lambda] [member of] [0, + [infinity]), y(0) = [x.sup.0].

In conventional terms, the solution y([lambda]) of (2.1) is referred to as a phase trajectory, and (y([lambda]), [lambda]) as an integral curve. From the ODE in (2.1), it holds that

F'(y) dy/d[lambda] + F(y) = 0,

which amounts to

d/d[lambda] ([e.sup.[lambda]]F(y([lambda]))) [equivalent to] 0.

Thus, in the PC homotopy continuation framework, this integral curve (y([lambda]), [lambda]) is the path determined by the homotopy

(2.2) H(y, [lambda]) = F(y) - [e.sup.-[lambda]] F([x.sup.0]) = 0, (y, [lambda]) [member of] X x [0, +[infinity]).

Since, conventionally, the domain of the imbedding parameter in the homotopy continuation method is [0,1], the parameter transformation

t = [phi]([lambda]) = 1 - [e.sup.-[lambda]],

is needed. By this transformation, the homotopy (2.2) is changed to

H([bar.y], t) = F([bar.y]) + (t - 1)F([x.sup.0]) = 0, ([bar.y],t) [member of] X x [0,1],

which is essentially the same as the homotopy (1.1). It is therefore necessary to consider the explicit Euler method under parameter transformation for an ODE IVP.

Consider the ODE IVP

(2.3) {du/d[lambda] = f([lambda], u), {u(0) = [u.sup.0].

Under the transformation

[lambda] = [psi](t), 0 = [psi](0),

the ODE IVP (2.3) is transformed to

(2.4) d[bar.u]/dt = [psi]'(t)[bar.f](t,[bar.u], {[bar.u](0) = [u.sup.0].

The explicit Euler methods for (2.3) and (2.4) are respectively

(2.5) [u.sup.k+1] = [u.sup.k] + [DELTA][[lambda].sub.k]f([[lambda].sub.k], [u.sup.k])

and

(2.6) [[bar.u].sup.k+1] = [[bar.u].sup.k] + [DELTA][t.sub.k] [psi]'([t.sub.k])[bar.f]([t.sub.k], [[bar.u].sup.k]).

If [[lambda].sub.k] = [psi]([t.sub.k]) and [[lambda].sub.k+1] = [psi]([t.sub.k+1]), then in general [DELTA][[lambda].sub.k] [not equal to] [DELTA][t.sub.k][psi]'([t.sub.k]), and the sequences {[u.sup.k]} and {[[bar.u].sup.k]} generated by (2.5) and (2.6) are different. In this sense, the explicit Euler method is not invariant under a parameter transformation. In order for {[u.sup.k]} and {[[bar.u].sup.k]} to be the same, [DELTA][t.sub.k] should be chosen such that [DELTA][[lambda].sub.k] = [DELTA][t.sub.k][psi]'([t.sub.k]), in which case [[lambda].sub.k] is different from [psi]([t.sub.k]).

3. The relations. In order to compare the damped Newton method with the PC homotopy continuation method, it is transformed to the underlying space X x [0,1]. Under the parameter transformation

[lambda] = - ln(1 - t),

the ODE IVP (2.1) is changed to

(3.1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The damped Newton method in X x [0, [infinity]),

(3.2) [y.sup.k+1] = [y.sup.k] - [DELTA][[lambda].sub.k]F'([y.sup.k])F([y.sup.k]), [y.sup.0] = [x.sup.0],

interpreted as the explicit Euler method for (2.1), is changed to its counterpart in X x [0,1],

(3.3) [[bar.y].sup.k+1] = [[bar.y].sup.k] - [DELTA][t.sub.k]/1 - [t.sub.k] F'[([[bar.y].sup.k]).sup.-1] F([[bar.y].sup.k]), [[bar.y].sup.0] = [x.sup.0],

which is the explicit Euler method for (3.1). In order for [[bar.y].sup.k] = [y.sup.k] in the phase space X, [DELTA][t.sub.k] should be chosen such that [DELTA][t.sub.k] / 1-[t.sub.k] = [DELTA][[lambda].sub.k]. This section is mainly focused on the relations between the sequence {[[bar.y].sup.k]}, generated by (3.3), and the sequence {[x.sup.k]} generated by the PC homotopy continuation method for (1.1).

Assume for simplicity that the path determined by (1.1) is (x(t), t). Then x(t) satisfies the following ODE IVP

(3.4)[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

In a continuous framework, the ODE IVPs (3.1) and (3.4) are equivalent, as the phase trajectories determined by them are the same. Since F'(x)dx/dt + F([x.sup.0]) = 0 is used to compute the tangent vectors in the PC method, the explicit Euler method for (3.4) is equivalent to the homotopy continuation method for the homotopy (1.1), where only the predictor step is performed. Such a continuation method will be called a no-corrector homotopy continuation, for convenience. Therefore, it may seem that the damped Newton method (3.2) is nothing more than the no-corrector homotopy continuation method. However, this is not true. When the explicit Euler method is employed, the ODE IVPs (3.1) and (3.4) are not equivalent anymore, since the vector fields of the two ODEs are different, and so the sequences generated by them will not be the same. This will be discussed in Section 3.1.

3.1. Relation of marching directions. Recall the assumption that the path determined by the homotopy (1.1) is (x(t), t), and the tangent vectors in the PC method are computed by F'(x)dx/dt + F ([x.sup.0]) = 0.

First iteration. For the PC method the first tangent ([d.sup.0], [[tau].sub.0]) at ([x.sup.0], 0) is

(3.5) [d.sup.0] = -F'[([x.sup.0]).sup.-1]F ([x.sup.0]), [[tau].sub.0] = 1.

After one PC iteration, ([x.sup.1], [t.sub.1]) is generated. We can see that [d.sup.0] in (3.5) is just the Newton direction. If the same step sizes are chosen, and the corrector step is not performed, then the same iterate [x.sup.1] = [[bar.y].sup.1] is obtained.

Second iteration. The tangent ([d.sup.1], [[tau].sub.1]) at ([x.sup.1], [t.sub.1]) can be computed as follows. If ([x.sup.1], [t.sub.1]) lies on the path [H.sup.-1] (0) exactly, then

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

If ([x.sup.1], [t.sub.1]) does not lie on the path [H.sup.-1] (0) exactly and H([x.sup.1], [t.sub.1]) = [c.sub.1] [not equal to] 0, then [d.sup.1] is instead

d1 = -F'(x1)-1 F(x0) = ----3F'(x1)-1 (F(x1) - C1)

(3.6) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

In fact, this is the tangent on the path defined by H(x, t) = [c.sub.1] [not equal to] 0. From (3.6) it is obvious that, even if [x.sup.1] = [[bar.y].sup.1], the tangent component [d.sup.1] in the PC method is no longer the Newton direction of F at [x.sup.1].

By iterating, it can be seen that the damped Newton sequence {[[bar.y].sup.k]} is not generally the same as the sequence {[x.sup.k]} generated by the no-corrector homotopy continuation method. Using the terminology of dynamical system, the damped Newton method is marching along the vector field -1/(1 - t)F'[(x).sup.-1]F(x), while the homotopy continuation method is marching along the vector field -F'[(x).sup.-1] F([x.sup.0]). The integral curves for -1/(1 - t)F'[(x).sup.-1]F(x) are the paths implicitly determined by

(3.7) [bar.H] (x,t) = F(x) + (t - 1)F([c.sub.1]) = 0,

while those for -F'[(x).sup.-1] F([x.sup.0]) are

(3.8) H(x,t) = F(x) + (t - 1)F([x.sup.0]) = [c.sub.2],

where [c.sub.1] and [c.sub.2] are generic constant vectors. These two families of paths are different, and the common path is determined by F(x) + (t - 1)F([x.sup.0]) = 0, i.e., [c.sub.1] = [x.sup.0] and [c.sub.2] = 0.

The paths determined by (3.7) and (3.8) are illustrated in Figure 3.1, where the solid lines are paths corresponding to (3.7), the dotted lines corresponding to (3.8) and the thick line to the common path. As illustrated in Figure 3.1, the directions along which the damped Newton method marches are the tangents to the solid lines, while the directions of prediction in the homotopy continuation method are the tangents to the dotted lines. The Newton directions are good directions, since even though the damped Newton iterates may deviate from the path that the method is supposed to follow, they will lie in a curvy cone formed by the paths of F(x) + (t - 1)F([c.sub.1]) = 0, and will bend back to the target point at the end. Thus, even though there may be some singular point on the path of F(x) + (t - 1)F([x.sup.0]) = 0, the damped Newton iterates may steer away from it. Also, there is no need of a corrector step in the damped Newton method, therefore a lot of computational cost can be saved. However, for the PC homotopy continuation method a corrector step is of essential importance. If a correction is performed, the iterates of the PC method will stick to the path determined by the homotopy H(x,t) = F(x) + (t - 1)F([x.sup.0]) = 0, otherwise the iterates will always deviate from the path and never bend back to the target point.

3.2. Relation of step sizes. First of all, the notion of step size needs to be clarified, since the definitions of step size are different in the PC method and the damped Newton method. In the PC method, when the homotopy path is parameterized with respect to arc length s, the step size h in the predictor step is understood as

([[??].sup.k+1], [[??].sub.k+1]) = ([x.sup.k], [t.sub.k]) + h([d.sup.k], [[tau].sub.k]),

where ([d.sup.k], [[tau].sub.k]) is the unit tangent. When the path can be parameterized with respect to t, the step size h in the predictor step is usually referred to as [DELTA][t.sub.k]:

(3.9) ([[??].sup.k+1], [[??].sub.k+1]) = ([x.sup.k], [t.sub.k]) + [DELTA][t.sub.k] (dx/dt ([t.sub.k]), 1).

In the damped Newton method the step size [DELTA][[lambda].sub.k] instead occurs as

[y.sup.k+1] = [y.sup.k] - [DELTA][[lambda].sub.k]F'[([y.sup.k]).sup.-1] F([y.sup.k]).

In this paper, for the PC method tracing the homotopy path (x(t), t), we assume the definition (3.9). As mentioned in Section 2, in order that [[bar.y].sup.k] = [y.sup.k], [DELTA][t.sub.k] in (3.3) should be chosen such that

(3.10) [DELTA][t.sub.k] / 1 - [t.sub.k] = [DELTA][[lambda].sub.k].

With the relation (3.10) and the condition 0 < [DELTA][[lambda].sub.k] < 1, it is easily derived by induction that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

If [DELTA][[lambda].sub.k] [equivalent to] [DELTA][lambda], then

[DELTA][t.sub.k] = [(1 - [DELTA][lambda]).sup.k] [DELTA][lambda].

On the other hand, if [DELTA][t.sub.k] [equivalent to] [DELTA]t, then setting K = 1/[DELTA]t it can be shown that

[DELTA][[lambda].sub.k] = [DELTA]t / 1 - k[DELTA]t = 1 / K - k.

Thus, the damped Newton method with step size [DELTA][[lambda].sub.k] [equivalent to] [DELTA][lambda] approximately corresponds to the PC method with step size [DELTA][t.sub.k] = [(1 - [DELTA][lambda]).sup.k] [DELTA][lambda], while the PC method with step size [DELTA][t.sub.k] [equivalent to] [DELTA]t approximately corresponds to the damped Newton method with step size [DELTA][[lambda].sub.k] = 1/(K - k).

4. Numerical illustrations. The test problem we consider is the Bratu problem

(4.1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The initial guess

[u.sub.0] = 2.4 sin([pi]x) sin([pi]y) + 1.985 x 2.4 sin(10[pi]x) sin(10[pi]y)

is chosen so far away from the solution that the standard Newton method does not converge. The initial guess and the solution are plotted in Figure 4.1. The differential equation is discretized by the P1 finite element method on a triangular mesh with node number N = 1089 and free node number n = 961. Let F(U) = 0 be the discretized equations of (4.1) and H(U, t) = F(U) + (t - 1)F([U.sup.0]) be the homotopy, where [U.sup.0] corresponds to the initial guess function [u.sub.0]. The path which the homotopy continuation method and the damped Newton method are supposed to follow is determined by H(U, t) = 0. For details on the finite element method, see [2].

The damped Newton method is applied with step sizes [DELTA][[lambda].sub.k] [equivalent to] 0.01 and [DELTA][[lambda].sub.k] = 1/(K - k). Here K = 50. The norm of the residual versus the number of iterations is plotted in Figure 4.2. It can be seen that with [DELTA][[lambda].sub.k] = 0. 01 the convergence rate of the damped Newton method becomes very slow at the last stage of the iteration process. This phenomenon can be explained, from the viewpoint of the homotopy continuation method, by noting that [DELTA][[lambda].sub.k] [equivalent to] 0.01 in the damped Newton method corresponds approximately to [DELTA][t.sub.k] = [(1 - [DELTA][lambda]).sup.k] [DELTA][lambda] [right arrow] 0 in the homotopy continuation method. On the other hand, with [DELTA][[lambda].sub.k] = 1/(K - k) the convergence rate is almost uniform, since [DELTA][t.sub.k] is approximately 1/K in the homotopy continuation method.

To see the difference in marching directions, the performance of the no-corrector homotopy continuation method and the damped Newton method is shown in Figure 4.3, in terms of the angle between tangents versus the number of iterations.

As mentioned in Section 3.1, the iterates of the homotopy continuation method will follow the path H(U, t) = 0. There are two simple bifurcation points on this path detected and jumped over by the homotopy continuation method, in which the "Jumping Over A Bifurcation Point" algorithm of [1] is implemented. A simple bifurcation point ([??], [??]) is characterized by rank([H.sub.U] ([??], [??])) = n - 1 and rank([[H.sub.U] ([??], [??]), [H.sub.t]([??], [??])]) = n - 1. The essence of the algorithm is as follows. The direction of the tangent (d, [tau]) of the homotopy path H(U, t) = 0 is determined by the following orientation at the starting point ([U.sup.0],0),

(4.2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

During the iteration process, if at two successive iterates ([x.sup.k], [t.sup.k]) and ([x.sup.k+1], [t.sup.k+1]) the angle of their tangents ([d.sup.k], [[tau].sup.k]) and ([d.sup.k+1], [[tau].sup.k+1]) is greater than 90[degrees], then there must be a bifurcation point between the two iterates along the path, and the orientation (4.2) is reversed in order to march ahead to reach t =1. For example, one of such bifurcation point is roughly at t = 0.137274, where the angle of the tangents is nearly 180[degrees] and (numerically) rank([H.sup.U]) = rank([[H.sub.U], [H.sub.t]]) = n - 1 = 960, which characterizes a simple bifurcation point. For more details on detection of bifurcation and continuation past bifurcation, see [4] and [1]. On the other hand, the iterates of the damped Newton method deviated from the path at its beginning, and bent back at its end. Thus, the iterates steered away from these two bifurcation points. If turning points with respect to t were encountered, the damped Newton method would break down, while the pseudo-arclength homotopy continuation method could pass through. Furthermore, thanks to the corrector, the maximum of allowed uniform step sizes which assures convergence for the homotopy continuation method is [DELTA]t = 0.2, larger than [DELTA]t = 0.08 for the damped Newton method. In these senses, the homotopy continuation method is more robust than the damped Newton method.

5. Concluding remarks. The Newton homotopy continuation method and the damped Newton method are closely related. After a transformation to the same space X x [0,1], the homotopy continuation method marches along the vector field -F'[(x).sup.-1] F([x.sup.0]), while the damped Newton method marches along -1/1-t F'[(x).sup.-1]F(x). In addition, the step sizes of both methods have an approximate correspondence. The damped Newton method is not equivalent to the no-corrector homotopy continuation method. The Newton directions are good directions and there is no need for a corrector, so there is a saving in the computational cost. For the homotopy continuation method, the corrector is of essential importance for convergence and robustness. Relatively speaking, the homotopy continuation method is more robust, while the damped Newton method is more efficient.

Acknowledgments. The authors would like to thank Dr. Long Chen for the FEM package iFEM [2] in Matlab, and would also like to thank the anonymous referees for their valuable suggestions that led to improvements in the paper.

REFERENCES

[1] E. L. Allgower and K. Georg, Introduction to Numerical Continuation Methods, Classics in Applied Mathematics, 45, SIAM, Philadelphia, 2003.

[2] L. Chen, iFEM: an integrated finite element methods package in MATLAB, Technical report, Department of Mathematics, University of California at Irvine, CA, November 2008.

[3] C. B. Garcia and F. J. Gould, Relations between several path following algorithms and local and global Newton methods, SIAM Rev., 22 (1980), pp. 263-274.

[4] H. B. Keller, Lectures on Numerical Methods In Bifurcation Problems, Springer, Berlin, 1986.

[5] G. H. Meyer, On solving nonlinear equations with a one-parameter operator imbedding, SIAM J. Numer. Anal., 5 (1968), pp. 739-752.

XUPING ZHANG ([dagger]) AND BO YU ([dagger])

* Received October 14, 2011. Accepted August 16, 2013. Published online October 14, 2013. Recommended by Wolfgang Ring.

([dagger]) School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116025, P. R. China (zxp666@mail.dlut.edu.cn, yubo@dlut.edu.cn). This research was supported by the National Natural Science Foundation of China (11171051, 91230103).
COPYRIGHT 2013 Institute of Computational Mathematics
No portion of this article can be reproduced without the express written permission from the copyright holder.