# Remark on Algorithm 702--The Updated Truncated Newton Minimization Package.

A truncated Newton minimization package, TNPACK, was described in ACM Transactions on Mathematical Software 18, 1 (Mar. 1992), pp. 46-111. Modifications to enhance performance, especially for large-scale minimization of molecular potential functions, are described here. They involve three program segments of TNPACK: the negative curvature test, the modified Cholesky factorization, and the line-search stopping rule.

Categories and Subject Descriptors: G.1.6 [Numerical Analysis]: Optimization--nonlinear programming; G.4 [Mathematics of Computing]: Mathematical Software; J.3 [Computer Applications]: Life and Medical Sciences

General Terms: Algorithms, Performance

Additional Key Words and Phrases: Indefinite preconditioner, modified Cholesky factorization, molecular potential minimization, truncated Newton method

1. INTRODUCTION

In 1992, Schlick and Fogelson described a Fortran package of subprograms for unconstrained minimization problems known as TNPACK [Schlick and Fogelson 1992a; 1992b]. In 1994, TNPACK was adapted by Derreumaux et al. [1994] for the widely used molecular mechanics and dynamics program CHARMM and was shown to be an efficient tool for the minimization of molecular potential functions in comparison to other available minimizers. Recently, we have examined practical issues of efficiency and have implemented modifications to the TNPACK version of CHARMM to improve reliability and enhance convergence for large-scale complex nonlinear problems [Xie and Schlick 1999a; 1999b].

This note summarizes the modifications made in TNPACK and provides details regarding program usage. The basic changes involve the following program segments: (1) negative curvature test, (2) modified Cholesky (MC) factorization, and (3) line-search algorithm. These modifications have been analyzed theoretically and numerically in recent works [Xie and Schlick 1999a; 1999b].

In Section 2 we introduce the algorithm and describe the modified negative curvature test and a strong negative curvature test. In Section 3 we describe our MC method, termed "UMC" for unconventional MC, as implemented in TNPACK. In Section 4, we present the more lenient stopping rule for the line search and a modification of the associated trial value. The reader should consult Schlick and Fogelson [1992a; 1992b] for general user instructions. Section 5 here only summarizes usage details related to our modifications. Appendix A presents the full algorithm of TNPACK, and Appendix B includes some numerical examples that show the performance of the updated program. We welcome users to contact us for further information.

2. TERMINATION RULES FOR PCG

The truncated Newton (TN) method [Dembo and Steihaug 1983] solves the unconstrained minimization problem E(X*) = min_{X ∈ D} E(X), where E(X) is a twice continuously differentiable real-valued function on an open set D of the n-dimensional vector space R^n. TN uses a nested sequence of iterations. The sequence of "outer" solution vectors {X^k} can be expressed in the form

X^{k+1} = X^k + λ_k P^k,  k = 0, 1, 2, ...,

where P^k is a descent direction, λ_k is the step length, and X^0 is an initial guess. The inner loop obtains P^k by a "truncated" preconditioned conjugate gradient (PCG) loop; each λ_k is then generated by a line-search scheme (e.g., Moré and Thuente [1994]). The search vector P^k is obtained from the PCG approximate solution of the Newton equations

(1) H(X^k) P = -g(X^k),

where g(X^k) and H(X^k) are the gradient and Hessian, respectively, of the objective function E at X^k (often denoted g_k and H_k for brevity). Important in practice are details regarding the efficient approximate solution of system (1), including handling indefiniteness and avoiding too many iterations when the quadratic model is poor.

Let p_j and d_j be the jth iterate and direction vectors of PCG, respectively. Since the Hessian matrix H_k may be indefinite, a negative curvature test is required to guarantee a descent direction for TN. The original package used the following Test 1A, proposed in Dembo and Steihaug [1983].

Test 1A (Negative Curvature Test). If d_j^T H_k d_j ≤ δ d_j^T d_j, where δ is a given small positive number (e.g., δ = 10^-10), and d_j is a vector generated in PCG (see the algorithm in Appendix A), exit with search direction P^k = p_j or d_j (for j = 1, set P^k = -g_k).

Since our analysis in Xie and Schlick [1999a] showed that d_j is a poorer choice than p_j as an exit search direction, this option has been removed, leading to the following modified Test 1A'.

Test 1A' (Modified Negative Curvature Test). If d_j^T H_k d_j ≤ δ d_j^T d_j, where δ is a given small positive number (e.g., δ = 10^-10), exit with search direction P^k = p_j (for j = 1, set P^k = -g_k).

In Xie and Schlick [1999a], we proved that if d_j^T H_k d_j > 0 for j = 1, 2, ..., l, then all p_j with 2 ≤ j ≤ l + 1 are descent directions and satisfy

(2) g_k^T p_{j+1} ≤ g_k^T p_j < 0,  2 ≤ j ≤ l.

In practice, however, (2) does not necessarily hold due to computer rounding errors. Hence, to guarantee a good descent direction in finite-precision arithmetic, we also add the following Test 2A, called the "strong negative curvature test," as an alternative [Xie and Schlick 1999a].

Test 2A (Strong Negative Curvature Test). If g_k^T p_{j+1} ≥ g_k^T p_j - ζ, where ζ is a given small positive number, exit with search direction P^k = p_j (for j = 1, set P^k = -g_k).

Theoretically, Test 2A is equivalent to Test 1A', but in practice the two may lead to different performance. This is because Test 1A' may not guarantee a descent direction, or a "good" descent direction in the sense of (2), in finite-precision arithmetic, while Test 2A does. Numerical results and analyses [Xie and Schlick 1999a] show that Test 2A can improve the performance of TNPACK in comparison to Test 1A'. We let the user choose the test in an option list of input parameters to TNPACK, and we leave Test 2A as the default.

The above tests halt the inner loop in case of negative curvature or nonmonotonic improvements in the search vectors. In addition, TN methods also exit the inner loop when the residual of (1), in which [H.sub.k] may be replaced by a related matrix, is sufficiently small. Test 1B serves this purpose. It is the same as in the previous version.

Test 1B (Truncation Test). Let r_j be the residual vector satisfying r_j = -g_k - H_k p_j. If ||r_j|| ≤ min{c_r/k, ||g_k||} ||g_k||, where c_r is a given positive number (e.g., c_r = 0.5), and ||·|| is the standard Euclidean norm divided by √n, exit with search direction P^k = p_j.
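The truncation threshold can be sketched numerically. This Python fragment is illustrative only; it follows the appendix form of the test, in which the scaled residual norm is compared against η_k ||g|| with η_k = min{c_r/k, ||g||}.

```python
import numpy as np

def truncation_exit(r, g, k, n, c_r=0.5):
    # Norms are Euclidean norms divided by sqrt(n), as in TNPACK.
    nrm = lambda v: np.linalg.norm(v) / np.sqrt(n)
    eta_k = min(c_r / k, nrm(g))      # accuracy parameter for outer step k
    return nrm(r) <= eta_k * nrm(g)   # True -> exit PCG with current p

# A small residual triggers the exit; a large one does not:
g = np.array([3.0, 4.0])
assert truncation_exit(np.array([0.1, 0.1]), g, k=1, n=2)
assert not truncation_exit(np.array([10.0, 10.0]), g, k=1, n=2)
```

Note that the bound tightens as the outer iteration count k grows, so later Newton systems are solved more accurately.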

The inner PCG loop is also halted when the number of iterations exceeds IT_PCG, the maximum number of allowable PCG iterations per inner loop. Our previously recommended value for IT_PCG was n, but we found that a smaller value, such as 40, works better when n ≫ 40 [Xie and Schlick 1999a].

3. AN UNCONVENTIONAL MODIFIED CHOLESKY (UMC) FACTORIZATION

In TNPACK, a symmetric sparse approximation to the Hessian, namely the preconditioner M_k, is used to accelerate convergence of the inner CG loop. This matrix is not necessarily positive definite when chosen according to a physical subdivision of energy components, as in molecular applications [Schlick 1993]. Thus, a sparse modified Cholesky (MC) factorization based on the Yale Sparse Matrix Package [Gill and Murray 1974; Eisenstat et al. 1981; 1982; Schlick and Fogelson 1992a] is used to construct a modified preconditioner M̄_k = M_k + E_k, where E_k is a nonnegative diagonal matrix.

Our numerical experience with biomolecular-structure optimization [Xie and Schlick 1999a] showed that various MC schemes [Cheng and Higham 1998; Gill and Murray 1974; Gill et al. 1981; Schnabel and Eskow 1990] can produce very large modifications to [M.sub.k] that can slow down the overall convergence of the method. To overcome this difficulty, we proposed the UMC [Xie and Schlick 1999a] summarized below.

Let L be a unit lower-triangular matrix and D and E diagonal matrices: L = (l_ij)_{n×n}, D = diag(d_1, d_2, ..., d_n), and E = diag(e_1, e_2, ..., e_n). For a given symmetric matrix M = (m_ij)_{n×n} and a given parameter τ, UMC generates a factorization LDL^T = M̄ = M + E with the diagonals d_j (j = 1, 2, ..., n) of D defined by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]; ξ is the largest magnitude of an element of M; ε is a small positive number (we set ε = 10^-6); and β^2 = ξ/√(n(n - 1)), an a priori choice of β in our UMC.

Essentially, UMC produces, in a numerically stable procedure, an LDL^T factorization for a matrix M that may be indefinite. Still, the PCG-generated vectors in our context are guaranteed to be directions of descent. Moreover, a user-prescribed parameter τ is set so that near a minimum the preconditioner resulting from UMC is positive definite. If τ > |λ_min(M)|, where λ_min(M) is the minimum eigenvalue of M, then M̄ is positive definite with error matrix E = M̄ - M = τI; otherwise, by arguments similar to those used in Gill and Murray [1974], a worst-case bound on ||E||_∞ can be derived [Xie and Schlick 1999a].
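The positive-definiteness claim for the shifted matrix can be verified numerically. A minimal Python check (illustrative, using a dense matrix rather than TNPACK's sparse factorization code): for an indefinite symmetric M, the shift M + τI with τ > |λ_min(M)| yields a matrix that factors without further modification, so the error matrix is simply E = τI.

```python
import numpy as np

# An indefinite symmetric test matrix (its 2x2 leading minor is negative).
M = np.array([[ 1.0,  2.0, 0.0],
              [ 2.0, -3.0, 1.0],
              [ 0.0,  1.0, 2.0]])
lam_min = np.linalg.eigvalsh(M).min()
assert lam_min < 0.0          # M is indefinite

tau = abs(lam_min) + 0.1      # any tau exceeding |lambda_min(M)| works
M_bar = M + tau * np.eye(3)   # shifted matrix; error matrix E = tau * I
assert np.linalg.eigvalsh(M_bar).min() > 0.0

# An unmodified Cholesky factorization of M_bar therefore succeeds:
L = np.linalg.cholesky(M_bar)
assert np.allclose(L @ L.T, M_bar)
```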

Note that UMC can produce an indefinite preconditioner M̄ when τ < |λ_min(M)|. With the use of the singularity and negative curvature tests shown in Appendix A, the recursive formulas of PCG are well defined for a nonsingular preconditioner. Furthermore, the PCG-generated search vectors {P^k} of TN are directions of descent (expression (2) above) even when the preconditioner M̄_k is indefinite. We found this overall procedure to work much better than one that allows excessively large modifications in MC [Xie and Schlick 1999a].

The choice of τ affects the overall performance of the minimization. Our experience over many problems and system sizes suggests that τ = 10 is a reasonable value [Xie and Schlick 1999a]. This is the default value in TNPACK, but users can set τ to other values suitable for their problems.

4. LINE-SEARCH MODIFICATIONS

A line-search scheme is an iterative algorithm for solving the one-dimensional minimization problem min_{λ > 0} f(λ), where

f(λ) = E(X^k + λ P^k),  λ > 0.

The line-search stopping rule influences the overall efficiency of the minimization method. TNPACK uses the line-search algorithm of Moré and Thuente [1994], which relies on the following stopping rule:

Criterion 1 (Stopping Rule for Line Search). Terminate the line search when the step length λ_k > 0 satisfies

(3) f(λ_k) ≤ f(0) + α λ_k f'(0)

and

(4) |f'(λ_k)| ≤ β |f'(0)|,

where α and β are given constants satisfying 0 < α < β < 1.

Our more lenient stopping rule [Xie and Schlick 1999b] in TNPACK is the following:

Criterion 2 (More Lenient Stopping Rule for Line Search). Terminate the line search when the step length λ_k > 0 satisfies conditions (3) and

(5) f'(λ_k) ≥ β f'(0),

or, alternatively, conditions (3) and

(6) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII: a condition on g(X^k + λ_k P^k)^T P^k],

where α and β are given constants satisfying 0 < α < β < 1.

Condition (5) is a component of (4), which works only for a strictly convex function f. Condition (6) is introduced to produce a sufficiently large step length when the function f is not strictly convex; for details, see Xie and Schlick [1999b].

Using the more lenient second criterion can reduce the number of line-search iterations per outer step in special cases. It also forces sufficient function decrease according to (3) and guarantees a sufficiently large step length. Hence, we found it useful in practice for the overall efficiency of minimization of large-scale multivariate functions whose evaluations are expensive. Moreover, in theory, global convergence for a descent method using Criterion 2 can be proven in the same way as for Criterion 1 [Dennis and Schnabel 1983]. However, in most cases condition (6) is not encountered, so performance is overall very similar to that of the first criterion [Xie and Schlick 1999b]. Thus, we leave Criterion 1 as the default choice in TNPACK and let the user select Criterion 2 if desired.
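The two building blocks of the stopping rules can be sketched in Python. This is an illustrative fragment, assuming the standard sufficient-decrease and one-sided (weak Wolfe) curvature forms for conditions (3) and (5); the alternative condition (6) is omitted.

```python
def sufficient_decrease(f0, df0, f_lam, lam, alpha=1e-4):
    # Sufficient function decrease: f(lam) <= f(0) + alpha * lam * f'(0).
    return f_lam <= f0 + alpha * lam * df0

def lenient_curvature(df0, df_lam, beta=0.9):
    # One-sided curvature condition: f'(lam) >= beta * f'(0),
    # which rules out excessively small steps when f'(0) < 0.
    return df_lam >= beta * df0

# One-dimensional model f(lam) = (lam - 1)^2, so f'(0) = -2 < 0.
f = lambda lam: (lam - 1.0) ** 2
df = lambda lam: 2.0 * (lam - 1.0)

assert sufficient_decrease(f(0.0), df(0.0), f(1.0), 1.0)  # exact minimizer passes
assert lenient_curvature(df(0.0), df(1.0))
assert not lenient_curvature(df(0.0), df(1e-3))           # tiny step rejected
```

A line search accepts λ_k once both checks return True (or, under Criterion 2, the alternative pair of conditions holds).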

In addition, we have added a safeguard to the rule determining λ_k at each line-search iteration. Instead of defining the line-search iterate

λ^(j+1) = λ_c,

where λ_c is the minimum point of the cubic interpolant [Moré and Thuente 1994], we use

λ^(j+1) = max(λ_l + σ(λ_t - λ_l), λ_c),

where σ is a small positive number such as 0.001, and λ_l and λ_t are the two endpoints of the interval generated by the line-search algorithm [Moré and Thuente 1994]. This modification avoids ending the line search with a very small λ_c, a case we have encountered in practice. See Xie and Schlick [1999b] for further details.
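The safeguard is a one-line formula and easy to illustrate. A minimal Python sketch (parameter names assumed):

```python
def safeguarded_trial(lam_l, lam_t, lam_c, sigma=0.001):
    # Next trial step: the cubic-interpolant minimizer lam_c, but never
    # closer to the endpoint lam_l than the fraction sigma of the
    # current interval [lam_l, lam_t].
    return max(lam_l + sigma * (lam_t - lam_l), lam_c)

# A collapsing lam_c is pushed up to the safeguard floor:
assert safeguarded_trial(0.0, 1.0, 1e-12) == 0.001
# A reasonable lam_c is left unchanged:
assert safeguarded_trial(0.0, 1.0, 0.5) == 0.5
```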

5. SOME USAGE CHANGES TO TNPACK

The usage details for the new version of TNPACK are the same as described in Schlick and Fogelson [1992a] except for small changes involving the two work arrays OPLIST and PLIST.

The arrays OPLIST and PLIST (each of dimension 20) are required in TNPACK to specify options and parameters for minimization. Only 14 entries of OPLIST and 7 entries of PLIST were used in the original TNPACK; see Schlick and Fogelson [1992a]. In particular, entry OPLIST(12) specifies the choice of exit directions in the case of negative curvature for PCG, with five options provided in the original program. Since the PCG direction d_j defines a poor search direction for TN and the preconditioner M̄_k may be indefinite, we now use only the option OPLIST(12) = 1 (i.e., Test 1A'); the other four options have been deleted. As a result, the user's input value for OPLIST(12) does not matter.

We also introduced four new user-specified parameters, OPLIST(15), OPLIST(16), OPLIST(17), and PLIST(8), to specify the PCG termination test, the line-search stopping rule, the modified Cholesky factorization, and the UMC threshold parameter, respectively (see Table I).

Table I. The Four New User-Specified Parameters

| Parameter | Function |
| --- | --- |
| OPLIST(15) | Specifies the choice of PCG termination test in the case of negative curvature: 1 = Test 1A', 2 = Test 2A. The default value is 1. |
| OPLIST(16) | Specifies the choice of line-search stopping rule: 1 = Criterion 1, 2 = Criterion 2. The default value is 1. |
| OPLIST(17) | Specifies the choice of MC method: 1 = standard MC, 2 = UMC. The default value is 1. |
| PLIST(8) | Specifies the parameter τ of UMC. The default value is 10.0. |

Like other parameters, sample values are produced in subroutine SETLIS, and they can be modified in the driver interface between the SETLIS and TNMIN calls.

REFERENCES

AVERBUKH, V. Z., FIGUEROA, S., AND SCHLICK, T. 1992. HESFCN--A FORTRAN package of Hessian subroutines for testing nonlinear optimization software. Tech. Rep. 610. Courant Mathematics and Computing Laboratory, New York University, New York, NY.

AVERBUKH, V. Z., FIGUEROA, S., AND SCHLICK, T. 1994. Remark on Algorithm 566. ACM Trans. Math. Softw. 20, 3 (Sept. 1994), 282-285.

BROOKS, B. R., BRUCCOLERI, R. E., OLAFSON, B. D., STATES, D. J., SWAMINATHAN, S., AND KARPLUS, M. 1983. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4, 187-217.

CHENG, S. H. AND HIGHAM, N. J. 1998. A modified Cholesky algorithm based on a symmetric indefinite factorization. SIAM J. Matrix Anal. Appl. 19, 4, 1097-1110.

DEMBO, R. S. AND STEIHAUG, T. 1983. Truncated-Newton algorithms for large-scale unconstrained optimization. Math. Program. 26, 190-212.

DENNIS, J. E. AND SCHNABEL, R. B. 1983. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Inc., Upper Saddle River, NJ.

DERREUMAUX, P., ZHANG, G., SCHLICK, T., AND BROOKS, B. 1994. A truncated Newton minimizer adapted for CHARMM and biomolecular applications. J. Comput. Chem. 15, 5 (May 1994), 532-552.

EISENSTAT, S. C., GURSKY, M. C., SCHULTZ, M. H., AND SHERMAN, A. H. 1982. Yale sparse matrix package, I: The symmetric codes. Int. J. Numer. Method. Eng. 18, 1145-1151.

EISENSTAT, S. C., SCHULTZ, M. H., AND SHERMAN, A. H. 1981. Algorithms and data structures for sparse symmetric Gaussian elimination. SIAM J. Sci. Stat. Comput. 2, 225-237.

GILL, P. E. AND MURRAY, W. 1974. Newton-type methods for unconstrained and linearly constrained optimization. Math. Program. 28, 311-350.

GILL, P. E., MURRAY, W., AND WRIGHT, M. H. 1981. Practical Optimization. Academic Press Ltd., London, UK.

LIU, D. C. AND NOCEDAL, J. 1989. On the limited memory BFGS method for large scale optimization. Math. Program. 45, 3 (Dec. 1989), 503-528.

MORÉ, J. J. AND THUENTE, D. J. 1994. Line search algorithms with guaranteed sufficient decrease. ACM Trans. Math. Softw. 20, 3 (Sept. 1994), 286-307.

MORÉ, J. J., GARBOW, B. S., AND HILLSTROM, K. E. 1981a. Testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 1 (Mar.), 17-41.

MORÉ, J. J., GARBOW, B. S., AND HILLSTROM, K. E. 1981b. Algorithm 566: Fortran subroutines for testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 1 (Mar.), 136-140.

SCHLICK, T. 1993. Modified Cholesky factorizations for sparse preconditioners. SIAM J. Sci. Comput. 14, 2 (Mar. 1993), 424-445.

SCHLICK, T. AND FOGELSON, A. 1992a. TNPACK--A truncated Newton minimization package for large-scale problems: I. Algorithm and usage. ACM Trans. Math. Softw. 18, 1 (Mar.), 46-70.

SCHLICK, T. AND FOGELSON, A. 1992b. TNPACK--A truncated Newton minimization package for large-scale problems: II. Implementation examples. ACM Trans. Math. Softw. 18, 1 (Mar.), 71-111.

SCHLICK, T. AND OVERTON, M. L. 1987. A powerful truncated Newton method for potential energy functions. J. Comput. Chem. 8, 1025-1039.

SCHNABEL, R. B. AND ESKOW, E. 1990. A new modified Cholesky factorization. SIAM J. Sci. Stat. Comput. 11, 6 (Nov. 1990), 1136-1158.

XIE, D. AND SCHLICK, T. 1999a. Efficient implementation of the truncated-Newton algorithm for large-scale chemistry applications. SIAM J. Optim. 9, 3. To be published.

XIE, D. AND SCHLICK, T. 1999b. A more lenient stopping rule for line search algorithms. Preprint.

ZOU, X., NAVON, I. M., BERGER, M., PHUA, K. H., SCHLICK, T., AND LE DIMET, F. X. 1993. Numerical experience with limited-memory quasi-Newton and truncated Newton methods. SIAM J. Optim. 3, 582-608.

APPENDIX

A. THE TRUNCATED NEWTON ALGORITHM

The new algorithmic components are marked by asterisks. These include the line-search scheme, UMC factorization, and negative curvature test (steps (2)-(4) of the inner loop). Default parameter settings are indicated. The indices k and i are used respectively to denote outer loop and inner loop iterates.

Outer Loop of the Truncated Newton Method

(1) Initialization

--Set k = 0 and evaluate E(X^0) and g(X^0) corresponding to the initial guess X^0, where E is the objective function to be minimized.
--If ||g(X^0)|| < 10^-8 max(1, ||X^0||), where ||·|| is the standard Euclidean norm divided by √n, exit the algorithm; otherwise, continue to step (2).

(2) Preparation for UMC

--Evaluate the preconditioning matrix M_k.
--Determine the sparsity pattern of M_k. The upper triangle of M_k is stored in a compressed row format, and the pattern is specified by two integer arrays that serve as row and column pointers [Schlick and Fogelson 1992a].
--Compute the symbolic factorization LDL^T of M_k, that is, the sparsity structure of the factor L.
--Evaluate the Hessian matrix H_k.

(3) Inner Loop

--Compute the search vector P^k by the linear preconditioned conjugate gradient method for solving the Newton equations H_k P = -g(X^k) approximately, based on the UMC factorization of M_k. See below.

(4) * Line search

--Compute a step length λ by safeguarded cubic and quadratic interpolation [Moré and Thuente 1994; Xie and Schlick 1999b] so that X^{k+1} = X^k + λ P^k satisfies either (A1) and (A2a) or (A1) and (A2b), where

(A1) E(X^k + λ P^k) ≤ E(X^k) + α λ g_k^T P^k,

(A2a) g(X^k + λ P^k)^T P^k ≥ β g_k^T P^k,

(A2b) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

with α = 10^-4 and β = 0.9.

(5) Convergence tests

--Check the following inequalities:

(A3a) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(A3b) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(A3c) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(A3d) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where ε_f = 10^-10 and ε_g = 10^-8.

--If conditions (A3a)-(A3c) or condition (A3d) are satisfied, exit the algorithm.

(6) Preparation for the next Newton step

--Compute the new preconditioner M_{k+1} by using the pattern determined originally.
--Evaluate the new Hessian matrix H_{k+1}.
--Set k ← k + 1 and go to step (3).

Inner Loop k of the Truncated Newton Method (Step (3) above)

The sequence {p_i} below represents the PCG vectors used to construct P^k in step (3) of the outer loop. We omit the Newton subscript k from g, H, and M and the superscript k from P for clarity. In step (4)*, either Test 2A or Test 1A' can be used.

(1) Initialization

--Set i = 1, p_1 = 0, and r_1 = -g.
--Set the parameter η_k controlling the accuracy of the computed search vector: η_k = min{c_r/k, ||g||}, where c_r ≤ 1 (we use c_r = 0.5).
--Set the parameter IT_PCG, the maximum number of allowable PCG iterations for each inner loop (default IT_PCG = 40).

(2) * The unconventional modified Cholesky (UMC) factorization

--Perform the numerical factorization of M by UMC with a chosen parameter τ (default τ = 10). The resulting modification M̄ of M is used as the preconditioner for PCG and is stored in the same sparse row format used for M.
--Solve for z_i in M̄ z_i = r_i.
--Set d_i = z_i.

(3) * Singularity test

--Compute the matrix-vector product q_i = H d_i.
--If [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], exit the PCG loop with P = p_i (for i = 1, set P = -g).

(4) * Negative curvature test: implement one of the following two tests.

--Test 1A': If d_i^T q_i ≤ δ d_i^T d_i, exit the inner loop with P = p_i (for i = 1, set P = -g); else update α_i and p_{i+1} as in (A4), and continue to step (5).
--Test 2A: Update the quantities

(A4) α_i = r_i^T z_i / d_i^T q_i,  p_{i+1} = p_i + α_i d_i.

If g^T p_{i+1} ≥ g^T p_i - ζ, exit the inner loop with P = p_i (for i = 1, set P = -g); else continue to step (5).

(5) Truncation test

--Compute r_{i+1} = r_i - α_i q_i.
--If ||r_{i+1}|| ≤ η_k ||g|| or i + 1 > IT_PCG, exit the inner loop with search direction P = p_{i+1}; else continue to step (6).

(6) Continuation of PCG

--Solve for z_{i+1} in M̄ z_{i+1} = r_{i+1}.
--Update the quantities β_i = r_{i+1}^T z_{i+1} / r_i^T z_i and d_{i+1} = z_{i+1} + β_i d_i.
--Set i ← i + 1, and go to step (3).
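The inner loop above can be condensed into a short Python sketch. This is an illustrative dense-matrix version, not TNPACK's sparse Fortran implementation: the preconditioner solve stands in for the LDL^T forward/back substitution, the singularity test is omitted, and the negative curvature exit is written in the d^T H d ≤ δ d^T d form.

```python
import numpy as np

def pcg_inner(H, g, M_bar, c_r=0.5, k=1, it_max=40, delta=1e-10):
    # Approximately solve H p = -g with preconditioner M_bar, returning
    # a descent direction P for the outer truncated Newton iteration k.
    n = len(g)
    nrm = lambda v: np.linalg.norm(v) / np.sqrt(n)
    eta_k = min(c_r / k, nrm(g))          # truncation accuracy
    p = np.zeros(n)
    r = -g.astype(float)
    z = np.linalg.solve(M_bar, r)         # stand-in for the LDL^T solve
    d = z.copy()
    for i in range(1, it_max + 1):
        q = H @ d
        if d @ q <= delta * (d @ d):      # negative curvature: exit
            return p if i > 1 else -g
        alpha_i = (r @ z) / (d @ q)
        p = p + alpha_i * d               # p_{i+1} = p_i + alpha_i d_i
        r_new = r - alpha_i * q
        if nrm(r_new) <= eta_k * nrm(g):  # truncation test: exit
            return p
        z_new = np.linalg.solve(M_bar, r_new)
        beta_i = (r_new @ z_new) / (r @ z)
        d = z_new + beta_i * d            # d_{i+1} = z_{i+1} + beta_i d_i
        r, z = r_new, z_new
    return p

# On a small positive definite system the loop behaves like ordinary PCG:
H = np.array([[4.0, 1.0], [1.0, 3.0]])
g = np.array([1.0, -2.0])
p = pcg_inner(H, g, np.eye(2))
assert g @ p < 0.0  # a descent direction
# A negative curvature direction at i = 1 falls back to steepest descent:
assert np.allclose(pcg_inner(np.diag([-1.0, 1.0]), np.array([1.0, 0.0]), np.eye(2)),
                   np.array([-1.0, 0.0]))
```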

B. IMPLEMENTATION EXAMPLES

Three implementation examples are included with the TNPACK software to illustrate its performance to users. The objective functions are familiar optimization test problems from Algorithm 566 of Moré et al. [1981a; 1981b]. The second-derivative routines for the 18 test functions of Algorithm 566 can be obtained from the supplementary Hessian package, HESFCN, developed by Averbukh et al. [1992; 1994]. We used the default starting points specified in Algorithm 566 for the 18 test functions. The preconditioner for PCG was chosen as the diagonal of the Hessian matrix H_k for all problems.

Included among the 18 test functions are the extended Rosenbrock function with an even integer n,

E(X) = Σ_{i=1}^{n/2} [100 (x_{2i} - x_{2i-1}^2)^2 + (1 - x_{2i-1})^2],

and the trigonometric function

E(X) = Σ_{i=1}^{n} [n - Σ_{j=1}^{n} cos x_j + i(1 - cos x_i) - sin x_i]^2.

Detailed descriptions of TNPACK performance for these two functions are given in Schlick and Fogelson [1992b]. We include them as two separate implementation examples with the TNPACK code so that users can easily test TNPACK options. We used the following starting points X^0 for the Rosenbrock and trigonometric functions:

X^0 = (-1.2 - cos 1, 1 + cos 1, -1.2 - cos 3, 1 + cos 3, ..., -1.2 - cos(n - 1), 1 + cos(n - 1))

and

X^0 = (1/n + 0.2 cos 1, 1/n + 0.2 cos 2, 1/n + 0.2 cos 3, ..., 1/n + 0.2 cos n),

respectively. Note that they differ from the default starting points given in Algorithm 566 for these two functions. To test nondiagonal preconditioners and the reordering option of TNPACK, we defined the following preconditioner M_k for the trigonometric function: element m_ii is set to the ith diagonal element of H_k for 1 ≤ i ≤ n; the off-diagonal elements are m_{1,n-1} = m_{n-1,1} = 0.1 and m_{1,n} = m_{n,1} = -0.1; and all other elements are zero. We set τ = 0.5 for the trigonometric function for better performance, while the default value of τ was used for the Rosenbrock function.
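For completeness, the two test functions can be written out in Python, assuming the standard Algorithm 566 forms of Moré, Garbow, and Hillstrom:

```python
import numpy as np

def rosenbrock_ext(x):
    # Extended Rosenbrock function (n even):
    # sum over pairs of 100*(x_{2i} - x_{2i-1}^2)^2 + (1 - x_{2i-1})^2.
    odd, even = x[0::2], x[1::2]
    return np.sum(100.0 * (even - odd ** 2) ** 2 + (1.0 - odd) ** 2)

def trigonometric(x):
    # Trigonometric function:
    # E(x) = sum_i [n - sum_j cos x_j + i*(1 - cos x_i) - sin x_i]^2.
    n = len(x)
    i = np.arange(1, n + 1)
    f = n - np.sum(np.cos(x)) + i * (1.0 - np.cos(x)) - np.sin(x)
    return np.sum(f ** 2)

# The extended Rosenbrock minimum is E = 0 at x = (1, ..., 1):
assert rosenbrock_ext(np.ones(10)) == 0.0
# The trigonometric residuals all vanish at x = 0:
assert trigonometric(np.zeros(10)) == 0.0
```

Either function, together with its gradient and Hessian routines, can be handed to a minimizer to reproduce runs of the kind reported in Tables II and III.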

We also include below numerical examples for large-scale molecular potential functions. We consider two molecular systems: alanine dipeptide (22 atoms) and the protein BPTI (568 atoms). The widely used molecular mechanics and dynamics program CHARMM [Brooks et al. 1983] (version 24b1) was used to compute the potential energy functions and their derivatives; parameter files for the energy functions were taken from CHARMM version 19. All nonbonded interactions of the potential energy function were considered, and a distance-dependent dielectric function was used. An effective preconditioner M_k is constructed naturally from the local chemical interactions: bond-length, bond-angle, and torsional potentials [Schlick and Overton 1987]. More numerical results and discussion can be found in Xie and Schlick [1999a; 1999b], where a CHARMM version of TNPACK was used and some changes were made to TNPACK for better performance.

All computations were performed in double precision in serial mode on an SGI Power Challenge L computer with 195-MHz R10000 processors at New York University. The vector norm ||·|| in the tables is the standard Euclidean norm divided by √n.

Tables II and III display the minimization performance of the updated TNPACK using the new options for these test functions and report, for reference, results with the previous TNPACK version using default options. Note that the default values of n in Algorithm 566 are small; for example, n = 3 for the trigonometric function (Problem 13 in Table III) and n = 2 for the Rosenbrock function (Problem 14). Hence, the CPU times for running TNPACK on the 18 functions of Algorithm 566 on the R10000 processor were very small (a fraction of a second).

Table II. TNPACK Performance for Rosenbrock and Trigonometric Functions

| Function | TNPACK | Final Energy | Final ‖g‖ | TN (PCG) Itns. | E Evals. | CPU Time (sec.) |
| --- | --- | --- | --- | --- | --- | --- |
| Rosenbrock (n = 1000) | Modified | 4.3512 × 10^-18 | 2.82 × 10^-9 | 28 (500) | 45 | 0.208 |
| Rosenbrock (n = 1000) | Original | 7.3024 × 10^-24 | 2.52 × 10^-12 | 50 (744) | 117 | 0.302 |
| Trigonometric (n = 1000) | Modified | 1.1215 × 10^-13 | 9.43 × 10^-9 | 21 (73) | 23 | 8.045 |
| Trigonometric (n = 1000) | Original | 1.3833 × 10^-17 | 1.06 × 10^-10 | 27 (74) | 41 | 8.488 |

Table III. TNPACK Performance for the 18 Test Functions of Algorithm 566

| Problem | TNPACK | Final Energy | Final ‖g‖ | TN (PCG) Itns. | E Evals. |
| --- | --- | --- | --- | --- | --- |
| 1 | Modified | 1.7884 × 10^-9 | 3.35 × 10^-19 | 16 (41) | 19 |
| 1 | Original | 2.8711 × 10^-32 | 6.67 × 10^-16 | 17 (40) | 23 |
| 2 | Modified | 3.2182 × 10^-14 | 1.22 × 10^-9 | 271 (948) | 295 |
| 2 | Original | 2.4268 × 10^-1 | 3.05 × 10^-7 | 1274 (6781) | 2963 |
| 3 | Modified | 1.1279 × 10^-8 | 6.74 × 10^-9 | 2 (3) | 3 |
| 3 | Original | 1.1279 × 10^-8 | 5.60 × 10^-11 | 2 (4) | 3 |
| 4 | Modified | 7.6372 × 10^-6 | 1.28 × 10^-5 | 36 (53) | 52 |
| 4 | Original | 2.9769 × 10^-7 | 1.28 × 10^-7 | 107 (182) | 167 |
| 5 | Modified | 5.6077 × 10^-13 | 1.85 × 10^-8 | 14 (29) | 20 |
| 5 | Original | 1.0454 × 10^-18 | 3.43 × 10^-10 | 16 (34) | 20 |
| 6 | Modified | 3.2357 × 10^-22 | 8.04 × 10^-11 | 9 (14) | 10 |
| 6 | Original | 1.9639 × 10^-18 | 6.27 × 10^-9 | 9 (14) | 10 |
| 7 | Modified | 4.7140 × 10^-1 | 1.19 × 10^-9 | 9 (16) | 10 |
| 7 | Original | 4.7140 × 10^-1 | 7.52 × 10^-15 | 9 (19) | 10 |
| 8 | Modified | 1.5179 × 10^-5 | 3.49 × 10^-8 | 45 (101) | 56 |
| 8 | Original | 1.5179 × 10^-5 | 3.43 × 10^-9 | 52 (96) | 64 |
| 9 | Modified | 3.200 × 10^-6 | 4.46 × 10^-8 | 9 (17) | 13 |
| 9 | Original | 3.1981 × 10^-6 | 1.85 × 10^-10 | 35 (95) | 44 |
| 10 | Modified | 1.9722 × 10^-31 | 6.28 × 10^-10 | 4 (5) | 14 |
| 10 | Original | 5.4210 × 10^-20 | 7.09 × 10^-10 | 4 (5) | 10 |
| 11 | Modified | 8.5822 × 10^4 | 8.96 × 10^-3 | 10 (27) | 11 |
| 11 | Original | 8.5822 × 10^4 | 7.22 × 10^-3 | 10 (27) | 11 |
| 12 | Modified | 7.9990 × 10^-11 | 1.40 × 10^-7 | 29 (53) | 39 |
| 12 | Original | Line search failed at the 3rd TN iteration after 30 line-search iterations | | | |
| 13 | Modified | 2.5737 × 10^-3 | 1.10 × 10^-12 | 9 (24) | 11 |
| 13 | Original | 2.5737 × 10^-3 | 7.79 × 10^-9 | 8 (21) | 11 |
| 14 | Modified | 1.3433 × 10^-20 | 9.08 × 10^-11 | 28 (49) | 34 |
| 14 | Original | 1.4800 × 10^-25 | 2.97 × 10^-13 | 33 (57) | 43 |
| 15 | Modified | 1.4061 × 10^-12 | 3.34 × 10^-9 | 22 (80) | 23 |
| 15 | Original | 7.3082 × 10^-13 | 3.13 × 10^-9 | 21 (75) | 22 |
| 16 | Modified | 2.0461 × 10^-21 | 5.97 × 10^-11 | 9 (14) | 11 |
| 16 | Original | 7.9394 × 10^-24 | 2.37 × 10^-12 | 9 (16) | 12 |
| 17 | Modified | 1.5576 × 10^-19 | 1.11 × 10^-9 | 94 (341) | 100 |
| 17 | Original | 4.3014 × 10^-23 | 1.43 × 10^-10 | 52 (175) | 64 |
| 18 | Modified | 3.3521 × 10^-25 | 1.33 × 10^-12 | 7 (11) | 9 |
| 18 | Original | 2.9041 × 10^-17 | 9.18 × 10^-9 | 6 (9) | 11 |

Table IV. Performance of TNPACK for Alanine Dipeptide (66 Variables) Minimization Using the Standard Modified Cholesky Factorization (MC) versus Our UMC with τ = 10

| | Final E | Final ‖g‖ | TN (PCG) Itns. | E Evals. | CPU Time (sec.) |
|---|---|---|---|---|---|
| MC | -15.245 | 3.27 × 10^-10 | 2386 (2738) | 5939 | 9.07 |
| UMC | -15.245 | 1.31 × 10^-10 | 30 (227) | 45 | 0.92 |

From these two tables we see that the new options perform slightly better overall on all test functions except function 17 (the Wood function). Significant improvements are noted for function 2 (the Biggs function) and function 12 (the Gulf research and development function). For function 2, a zero minimum value is expected, but the older TNPACK version found a larger value. The older TNPACK version failed on function 12 due to rounding errors at the third truncated Newton iterate (after 30 line-search iterations) using Criterion 1.

Table IV compares the performance of TNPACK based on the standard MC [Gill and Murray 1974] with our UMC for alanine dipeptide minimization. Because the preconditioner M_k was indefinite far from a local minimum, the standard MC generated an excessively large modification to M_k, so that the modified matrix was a very poor approximation of the Hessian matrix H_k and had a very large condition number. Consequently, TNPACK with the standard MC performed poorly in comparison with TNPACK with UMC. We also found that TNPACK with the standard MC did not work for large-scale molecular minimization problems such as BPTI.
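The general "modify until positive definite" idea behind such preconditioning can be sketched as follows: if a plain Cholesky factorization of the matrix fails, add an increasing multiple of the identity and retry. The Python sketch below (function names and the retry strategy are ours) illustrates only this generic shift-and-retry idea; it is neither the Gill–Murray MC algorithm nor TNPACK's UMC routine, both of which modify the matrix column by column inside the factorization.

```python
import math

def cholesky(a):
    """Plain Cholesky factorization of a symmetric matrix (list of lists).
    Raises ValueError if the matrix is not numerically positive definite."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l

def shifted_cholesky(a, tau=10.0):
    """Illustrative shift-and-retry scheme: factor A + lam*I, increasing
    lam (starting from tau) until the factorization succeeds.
    Returns (L, lam) with L*L^T = A + lam*I."""
    n = len(a)
    lam = 0.0
    for _ in range(60):  # safety cap on retries
        try:
            shifted = [[a[i][j] + (lam if i == j else 0.0)
                        for j in range(n)] for i in range(n)]
            return cholesky(shifted), lam
        except ValueError:
            lam = tau if lam == 0.0 else lam * 10.0
    raise RuntimeError("no positive definite shift found")
```

For example, the indefinite 2×2 matrix [[1, 2], [2, 1]] (eigenvalues 3 and -1) fails the first factorization, and the scheme instead factors the shifted matrix with lam = τ = 10. A scheme like this can make the shifted matrix a poor approximation of the original when the shift is large, which is the drawback of aggressive modifications noted above.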

Table V. Comparison of TNPACK with Two Other CHARMM Minimizers and LM-BFGS for BPTI (1,704 Variables)

| Minimizer | Itns. (PCG Itns.) | Final E | Final ‖g‖ | E Evals. | CPU Time (min.) |
|---|---|---|---|---|---|
| TNPACK | 78 (1604) | -2776.58 | 1.14 × 10^-6 | 227 | 5.7 |
| LM-BFGS | 4486 | -2792.96 | 6.3 × 10^-5 | 4622 | 12.61 |
| ABNR | 8329 | -2792.96 | 8.9 × 10^-6 | 8330 | 25.17 |
| CONJ | 12469 | -2792.93 | 9.9 × 10^-6 | 32661 | 97.80 |

Table V compares the performance of TNPACK for BPTI minimizations with two other CHARMM minimizers, ABNR (an adopted-basis Newton-Raphson method) and CONJ (a nonlinear conjugate gradient method), as well as LM-BFGS with five stored updates [Liu and Nocedal 1989]. For simplicity, we used the default parameters in CHARMM for ABNR, CONJ, and TNPACK. The results in Table V show that TNPACK requires less CPU time than the other methods and reaches very low gradient norms. LM-BFGS is the next most efficient minimizer, but the CPU time of TNPACK is less than that of LM-BFGS by a factor of two or more.

Table VI. Performance of TNPACK for BPTI Using Criterion 1 (C1) versus Criterion 2 (C2) in the Line Search

| Criterion | Final E | Final ‖g‖ | TN (PCG) Itns. | E Evals. | CPU Time (min.) |
|---|---|---|---|---|---|
| C1 | -2762.5 | 3.72 × 10^-6 | 176 (1387) | 537 | 6.99 |
| C2 | -2749.2 | 7.14 × 10^-6 | 85 (1029) | 250 | 4.35 |

To show that a significant improvement can be observed when Criterion 2 (C2) of the line search is used, we consider TNPACK for BPTI potential function minimization. Because the function has many local minima, and different starting position vectors may lead to different minima, we constructed a starting position vector as follows:

X^{0,Ω} = X^0 + Υ,

where X^0 is the original starting position vector (from an experimental structure), and Υ is a random vector, each component of which is drawn from a uniform distribution on [0, 1]. An initial seed value of 1 is chosen for the pseudorandom number generator.
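In code, this construction amounts to adding an independent uniform draw on [0, 1] to each coordinate of the starting vector. A minimal Python sketch follows; the function name is ours, and the original experiments used CHARMM's own pseudorandom generator (seeded with 1), so the actual perturbation values differ from those produced here.

```python
import random

def perturbed_start(x0, seed=1):
    """Return a perturbed starting vector X^0 + Upsilon, where each
    component of the random vector Upsilon is drawn uniformly from [0, 1].

    Illustrative stand-in for the construction described in the text;
    the generator here is Python's Mersenne Twister, not CHARMM's.
    """
    rng = random.Random(seed)  # fixed seed -> reproducible starting point
    return [x + rng.uniform(0.0, 1.0) for x in x0]
```

Because the seed is fixed, repeated calls produce the same perturbed vector, so minimization runs started from the same X^{0,Ω} can be compared fairly across line-search criteria.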

Table VI shows that C2 requires less CPU time than C1 to reach a minimum.

Finally, Table VII gives a summary of TNPACK results that appeared in the following four papers: [Schlick and Fogelson 1992b; Zou et al. 1993; Derreumaux et al. 1994; Xie and Schlick 1999a].

Table VII. A Summary of TNPACK Results

| Reference | Test Problems (n) | Main Findings |
|---|---|---|
| Schlick and Fogelson [1992b] | (1) Rosenbrock's function (1,000); (2) trigonometric function (1,000); (3) potential energy function of deoxycytidine (87); (4) energy function from models of platelet aggregation (640) | Various TNPACK options and their influence on performance are illustrated. |
| Zou et al. [1993] | (1) Oceanography problems (7,330); (2) meteorology problems (14,763) | Truncated Newton methods are competitive with L-BFGS, especially for large-scale meteorological problems. |
| Derreumaux et al. [1994] | Potential energy functions of (1) alanine dipeptide (36); (2) N-methyl-acetamide (36); (3) deca-alanine (198); (4) mellitin (765); (5) rubredoxin (1,413); (6) avian (1,104); (7) a dimer of insulin (2,892); (8) BPTI (1,740); (9) lysozyme (3,897) | TNPACK performs significantly better than ABNR when Hessian/vector products are approximated by a finite-difference expression of gradients. Curvature information is important for directing minimization progress, and the use of local structure accelerates convergence of PCG. Very low final gradient norms can be obtained. |
| Xie and Schlick [1999a] | Potential energy functions of (1) butane (42); (2) alanine dipeptide (66); (3) BPTI (1,704); (4) lysozyme (6,090) | The UMC and other modifications improve the performance of TNPACK for large-scale molecular systems. |

Received: September 1997; accepted: November 1998

This work was supported in part by the National Science Foundation. T. Schlick is an investigator of the Howard Hughes Medical Institute.

Authors' address: Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012; email: dexuan@cims.nyu.edu; schlick@nyu.edu.

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.

Authors: Xie, Dexuan; Schlick, Tamar

Publication: ACM Transactions on Mathematical Software

Date: March 1, 1999