Printer Friendly

Remark on Algorithm 746: New Features of PCOMP, a Fortran Code for Automatic Differentiation.

1. INTRODUCTION

Let f(x) be a real-valued differentiable function defined for x [element of] [R.sup.n]. By automatic differentiation we understand the numerical computation of the derivative value [del]f(x) at a given point x without using approximation techniques or explicit formulas for the derivatives. Automatic differentiation has become an important tool for numerical algorithms that require derivatives in areas such as nonlinear programming, optimal control, parameter estimation, or differential equations.

It is beyond the scope of this article to give a decisive description of the underlying forward and backward accumulation methods, so we assume that the reader is already familiar with Dobmann et al. [1995]. For an explanation of the theoretical background, we refer to Griewank [1989]. PCOMP is a collection of Fortran subroutines that implement both approaches (see Dobmann et al. [1995] for details). PCOMP uses a special modeling language similar to Fortran for the declaration of arbitrary nonlinear functions. After parsing the input file and generating an intermediate code, function, gradient, and Hessian values can be directly evaluated by special subroutines called from a user program. On the other hand, it is possible to generate Fortran code for function and gradient evaluation that has to be compiled and linked separately.

Meanwhile there exists a large variety of alternative approaches especially with the goal of differentiating Fortran or C programs. See Juedes [1991] for an overview, or Griewank et al. [1991] for the description of an implementation that uses operator overloading in C++.

The new PCOMP is upward compatible with the 1995 version, i.e., any PCOMP program written for the 1995 version may be run with the new version. The extensions that are to be outlined in detail are as follows:

--more flexible identifier names

--interpolation functions (piecewise constant, piecewise linear, splines)

--macros for groups of PCOMP statements

--index variables, e.g., for the implementation of loops

--GOTO statements

--evaluation of gradients with respect to arbitrary subsets of variables

--second derivatives in forward mode

--additional error messages, especially for run-time execution

Moreover, a few bugs have been fixed, and PCOMP has been tested on a multitude of different Fortran compilers. As part of programs for parameter estimation in systems of algebraic equations, in ordinary differential equations, in differential algebraic equations, and in time-dependent one-dimensional partial differential equations, PCOMP is in permanent use in many different disciplines, e.g., pharmaceutics, chemical engineering, mechanical engineering, and geology (see Schittkowski [1994; 1995; 1997; 1999; 2000]).

2. NEW LANGUAGE ELEMENTS

2.1 Identifiers

In order to facilitate the problem formulation, the nomenclature of Fortran 77 has been extended, and identifiers for variables or functions may now contain underscores and may consist of up to 20 characters.

2.2 Index Variables

The concept of index variables already exists in the 1995 version of PCOMP; they are used in the definition of vectors and matrices and in sum and PROD statements. For example, in the following statement to compute the scalar product of two vectors

F = SUM(A(I)*X(I), I IN INDEX)

the variable I is implicitly declared as an index variable.

In order to make the structure of a PCOMP program more consistent, these index variables can also be explicitly declared in a new program block called INDEX, for instance:
* INDEX
  I, J
  L


Especially together with the enhanced control structures, it is sensible to allow for assignments to these variables. If we assume that A is declared as an integer array and that F and G are scalar and vector-valued functions, respectively, the following statements show some typical applications:
I = 1+2*4-3
I = A(1)
F = A(I+2)+I*2.0
F = SUM(A(M-I), M IN INDEX)
F = I
F = G(I)


Since these index variables are primarily used as indices for vectors and matrices, it is quite obvious that they should only take integer values. Thus the following statements are not permitted, if we assume that B is declared as a real array:
I = 1.0
I = 4/2
I = B(3)


Because of the special role of index variables, we do not allow these assignments to avoid a misleading usage. Note, also, that although PCOMP distinguishes between integer and real constants, integer variables do not exist.

2.3 Interpolation Functions

A new feature of the 1999 version of PCOMP is the possibility of interpolating user-defined data, using either a piecewise constant, piecewise linear, or a cubic spline function. Given n pairs of real values ([t.sub.1], [y.sub.1]), ..., ([t.sub.n], [y.sub.n]), we are looking for a nonlinear function interpolating these data.

In the first case, we define a piecewise constant interpolation by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

A continuous piecewise linear interpolation function is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

and a piecewise cubic spline is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where p(t;[t.sub.1], [t.sub.2], [t.sub.3], [t.sub.4], [y.sub.1], [y.sub.2], [y.sub.3], [y.sub.4]) is a cubic polynomial with p([t.sub.i];[t.sub.1], [t.sub.2], [t.sub.3], [t.sub.4], [y.sub.1], [y.sub.2], [y.sub.3], [y.sub.4]) = [y.sub.i] i = 1, ..., 4,

and [bar]s(t;[[bar]t.sub.1], ..., [[bar]t.sub.m], [[bar]y.sub.1], ..., [[bar]y.sub.m], [[bar]y'.sub.1], [[bar]y'.sub.m]) is a cubic spline function interpolating ([[bar]t.sub.1], [[bar]y.sub.1), ..., ([[bar]t.sub.m], [[bar]y.sub.m]) subject to boundary conditions

d/dt [[bar]s([[bar]t.sub.i];[[bar]t, ..., [[bar]t.sub.m], [[bar]y.sub.1], ..., [[bar]y.sub.m], [[bar]y'.sub.1][[bar]y'.sub.m]) = [[bar]y'.sub.i], i = 1 and i = m.

It is essential to understand that the constant and spline interpolation functions are not symmetric. As noted before, a very typical class of application problems consists of dynamic systems, say ordinary or partial differential equations, where the initial value is set to 0 without loss of generality, leading to a nonsymmetric domain. In these cases, derivatives of the right-hand side of the differential equations and the initial conditions are required for two reasons: first the Jacobian of the right-hand side with respect to the state variables is needed for applying implicit integration routines, in case of stiff systems or additional algebraic equations. Secondly, dynamic systems are often embedded in optimization problems, e.g., in data fitting or optimal control applications. Thus, one has to compute derivatives of the solution with respect to certain design parameters, e.g., by integrating the sensitivity or variational equations, respectively, or by any related technique. Even worser, if we integrate the resulting system by an implicit method, we have to compute also second mixed partial derivatives with respect to state and design variables.

Moreover, interpolated data are often based on experiments that reach a steady state, i.e., a constant value. Thus a zero derivative is chosen at the right end point for spline interpolation to facilitate the input of interpolated steady-state data. On the other hand, any other boundary conditions can be enforced by adding artificial interpolation data.

To give an example, we assume that we want to interpolate the nonlinear function f(t) given by the discrete values f([t.sub.i]) = [y.sub.i] from Table I, using the different techniques mentioned above.
Table I. Interpolation Data

i   [t.sub.i]   [y.sub.i]

1      0.0         0.00
2      1.0         4.91
3      2.0         4.43
4      3.0         3.57
5      4.0         2.80
6      5.0         2.19
7      6.0         1.73
8      7.0         1.39
9      8.0         1.16
10     9.0         1.04
11    10.0         1.00


Interpolation functions are defined by a program block starting with the keyword CONINT for piecewise constant functions, LININT for piecewise linear functions, or SPLINE for piecewise cubic splines, followed by the name of the function. The numerical values of the break points and the function values are given on the subsequent lines, using any standard format starting at column 7 or later. Using piecewise constant approximations, we get the following for our example:
* CONINT F
  0.0   0.00
  1.0   4.91
  2.0   4.43
  3.0   3.57
  4.0   2.80
  5.0   2.19
  6.0   1.73
  7.0   1.39
  8.0   1.16
  9.0   1.04
 10.0   1.00


Within a function definition block, the interpolation functions are treated as intrinsic Fortran functions, i.e., they have to contain a variable or constant as a parameter. If we assume that T has previously been declared as a variable, a valid statement could look like this:
* FUNCTION KT
  TEMP = F(T) + 273
  KT = K0*EXP (E/ (R*TEMP))


The resulting approximations for piecewise constant functions, piecewise linear functions, or piecewise cubic spline functions are depicted in Figures 1-3. Whereas the cubic spline approximation is twice differentiable on the whole interval, the other two approximations are not differentiable at the break points and PCOMP uses the right-hand-sided derivatives instead.

[Figures 1-3 ILLUSTRATION OMITTED]

2.4 Control Statements

Using GOTO and corresponding CONTINUE statements is another possibility to control the execution of a program. The syntax of these statements is
        GOTO <label>
and
<label> CONTINUE


where <label> has to be a number between 1 and 9999. Since PCOMP uses labels in the Fortran code that are generated in reverse accumulation mode, the user-defined labels must be between 5000 and 9999 in this case to avoid unnecessary clashes. The <label> part of the CONTINUE statement has to be located between columns 2 and 5 of an input line. Together with an index variable, the GOTO statement can also be used to implement DO loops, which are not yet available in PCOMP:
     I = 1
     S = 0.0
6000 CONTINUE
      S = S+A (I) *B (I)
      I = I+1
      IF (I.LE.  N)   THEN
        GOTO  6000
      ENDIF


However, it is recommended to avoid GOTO statements whenever possible, and to replace them by SUM and PROD statements, if applicable. If not implemented very carefully, there is a danger that the defined function is nondifferentiable.

2.5 Macros

PCOMP does not allow the declaration of subroutines. However, it is now possible to define macros, i.e., arbitrary sequences of PCOMP statements that define an auxiliary variable to be inserted into the beginning of subsequent function declaration blocks. Macros are identified by a name that can be used in any right-hand side of an assignment statement,

* MACRO <identifier>

followed by a group of PCOMP statements that assign a numerical value to the given identifier. This group of statements is inserted into the source code block that contains the macro name. Macros have no arguments, but they may access all variables,' constants, or functions that have been declared up to their first usage. Any values assigned to local variables within a macro are also available outside in the corresponding function block.

If we assume that x is a variable and we want to define a macro that computes the square of x, we would write something like the following:
* MACRO SQRX
  SQRX = X*X


Now it is possible to replace each occurrence of the term X*X with an invocation of the macro that we have just defined, e.g.,

F = SQRX-5.2

It should be noted that empty lines and all lines after the final END statement are ignored by PCOMP.

3. DERIVATIVES WITH RESPECT TO SUBSETS OF THE VARIABLES

In the previous version of PCOMP, derivatives are always evaluated for the full set of all declared variables. However, there are situations where we need Jacobians or Hessians only with respect to a certain subset of parameters that could change within the outer algorithm.

Typical examples are parameter estimation and optimal control of dynamic systems, see Schittkowski [1995; 2000] or Blatt and Schittkowski [2000]. In these cases we need to calculate derivatives with respect to parameters that occur in the initial values or the right-hand side of an ordinary differential equation of the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

To avoid the approximation of derivatives by divided differences, we can solve the variational equations to compute [del]y(x, t), where y(x, t) denotes the solution of the ODE. To formulate these equations, we need the partial derivatives

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

for i, j = 1, ..., m and k = 1, ..., n.

Moreover, the outer optimization algorithm might require the evaluation of gradients of an objective function that is defined in the same input file, with respect to the optimization variables x only, or a separate integration of the ODE by implicit methods, for which we need only the derivatives with respect to y. But we do not need any derivatives for the additional variable t. Note that also in the previous PCOMP version it is possible to evaluate derivatives for certain subsets of all declared functions.

The new version of PCOMP uses an index array to determine for which indices the first or second derivatives are to be evaluated.

4. SECOND DERIVATIVES

Second derivatives are often highly desirable to achieve reliable and efficient algorithms. If we need to apply an implicit method to solve the variational equations of an ordinary differential equation for which derivatives of the solution are to be computed, then second derivatives of the original system are required. Another example is the usage of Newton's method for the solution of systems of nonlinear equations or nonlinear programming problems.

Second derivatives are easily obtained by an extension of the forward accumulation method that has been implemented in PCOMP. In order to apply any automatic differentiation technique to calculate the derivatives of a function f(x) for x [element of] [R.sup.n], the first step is to convert f into a sequence of basis functions {[f.sub.i]}, using auxiliary variables for intermediate values. We assume that there is a finite sequence of basis functions [f.sub.n+1], ..., [f.sub.m] and index sets [j.sub.i] [subset] {1, ..., i - 1}, such that the function value of f at any given point x can be obtained by the following program:
for i := n + 1 to m do
    [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
    f(x) := [x.sub.m];


See Dobmann et al. [1995] for details. Basis functions are elementary arithmetic operations and intrinsic or external functions, where the number of operands is limited by a constant independent of n.

Proceeding from a given factorization, derivative values up to second order can be obtained by inserting the corresponding derivatives of the basis functions into the algorithm above:
for i := 1 to m do begin
    [[del][x.sub.i] := [e.sub.i];
    [[del].sup.2][x.sub.i] := 0 ;
end;
for i := n + 1 to n do begin
    [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII];

end;
f(x) := [x.sub.m] ;
[del]f(x) := [del][x.sub.m] ;
[[del].sup.2]f(x) := [[del].sup.2][x.sub.m];


It is important to understand that function and derivative values are simultaneously computed in forward accumulation mode, i.e., PCOMP uses alternative subroutines for function evaluation only, for function and gradient evaluation, and for function, gradient, and Hessian evaluation.

Although generated Fortran code for the evaluation of functions and gradients based on the reverse mode is substantially faster than interpreting the intermediate code, this approach cannot be used to calculate second derivatives. This is a fundamental drawback of the implemented reverse accumulation algorithm. See Griewank [1989] for a more rigorous treatment of the problem.

5. PROGRAM ORGANIZATION

PCOMP is a collection of Fortran subroutines that can be subdivided into four different categories. In this section we want to give only a short overview of the way PCOMP works; for more details we refer to the user guide pcompdoc.ps contained in the subdirectory ./doc of the PCOMP package.

5.1 Generation of Intermediate Code

First of all, the PCOMP source code is analyzed by the subroutine SYMINP. Intermediate code is then generated for subsequent evaluation by SYMFUN, SYMGRA, or SYMHES, or alternatively for transformation into executable Fortran code by SYMFOR. If there is any error during the processing of the input file, SYMINP is interrupted and the error code can be retrieved from the integer parameter IERR. This error code and the corresponding line number can be passed to SYMERR to generate an error message on standard output. Note that the parser stops at the first error found.

If no errors have been detected, the generated intermediate code and additional data are stored in a double-precision working array WA and an integer working array IWA. These two working arrays must be passed to all subsequently called subroutines for function and derivative evaluation or for code generation, respectively.

Since PCOMP supports a modular concept, the generated intermediate code and the additional data are stored in an external file SYMFIL, from where it can be retrieved using the subroutine SYMPRP. A new feature of the 1999 version of PCOMP is the possibility to generate additional debugging information (see Dobmann et al. [1996] for details of this option).

5.2 Runtime Evaluation of Functions and Derivatives

As outlined in the previous subsection, the intermediate code generated by SYMINP is passed to SYMFUN in the form of a real and an integer working array. Given any variable vector x, this subroutine computes the corresponding function values [f.sub.i](x) by interpreting the intermediate code. The subroutines SYMGRA and SYMHES are very similar to SYMFUN. The only distinction is that they also compute the gradients or gradients and Hessians of the symbolically defined functions, respectively.

The 1999 version of PCOMP requires an additional index array DFX to indicate for which variables the first or second derivatives are to be evaluated. Note that already in the previous PCOMP version it is possible to calculate function values and derivatives only for some of the declared functions.

5.3 Generation of Fortran Code

Proceeding from the intermediate code generated by the parser SYMINP, the subroutine SYMFOR generates two Fortran subroutines for function and gradient evaluation on a given output file. These routines have to be compiled and linked separately. The calling sequences of the generated subroutines XFUN and XGRA are analogous to SYMFUN and SYMGRA, except that no working arrays are required.

5.4 Interface to External Subroutines

The basic idea of PCOMP is to factorize a given function f(x) with respect to a set of library functions, for which derivatives can easily be computed, e.g., elementary arithmetic and intrinsic functions. For a complete list of available operations, we refer to Dobmann et al. [1995].

In some applications it is desirable to extend the list of library functions and to allow for user-provided symbols in the source code. Thus, PCOMP can easily be extended to accept additional functions by inserting information about structure, type, and symbolic name into the parser and by defining subroutines for function, gradient, and Hessian evaluation called EXTFUN, EXTGRA, and EXTHES. For more details and an example see Dobmann et al. [1996].

REFERENCES

BLATT, M. AND SCHITTKOWSKI, K. 2000. Optimal control of one-dimensional partial differential algebraic equations with applications. Ann. Oper. Res. To appear.

DOBMANN, M., LIEPELT M., AND SCHITTKOWSKI, K. 1995. Algorithm 746: PCOMP: a Fortran code for automatic differentiation. ACM Trans. Math. Softw. 21, 3 (Sept.), 233-266.

DOBMANN, M., LIEPELT, M., SCHITTKOWSKI, K., AND TRASSL, C. 1996. PCOMP: A Fortran code for automatic differentiation--language description and user guide (version 5.3). Tech. Rep. University of Bayreuth, Bayreuth, Germany.

GRIEWANK, A. 1989. On automatic differentiation. In Mathematical Programming: Recent Developments and Applications, M. Iri and K. Tanabe, Eds. Kluwer Academic Publishers, Hingham, MA, 83-107.

GRIEWANK, A., JUEDES, D. W., AND SRINIVASAN, J. 1991. ADOL-C: A package for the automatic differentiation of algorithms written in C/C++. Preprint MCS-P180-1190. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL.

JUEDES, D.W. 1991. A taxonomy of automatic differentiation tools. In Proceedings of the Workshop on Automatic Differentiation of Algorithms: Theory, Implementation and Applications (Breckenridge, CO), A. Griewank and G. Corliss, Eds. SIAM, Philadelphia, PA, 315-330.

SCHITTKOWSKI, K. 1994. Parameter estimation in systems of nonlinear equations. Numer. Math. 68, 129-142.

SCHITTKOWSKI, K. 1995. Parameter estimation in differential equations. In Recent Trends in Optimization Theory and Applications, R. P. Agarwal, Ed. World Scientific Publishing Co., Inc., River Edge, NJ, 353-370.

SCHITTKOWSKI, K. 1997. Parameter estimation in one-dimensional time-dependent partial differential equations. Optim. Meth. Softw. 7, 165-210.

SCHITTKOWSKI, K. 1999. PDEFIT: A. Fortran code for parameter estimation in partial differential equations. Optim. Meth. Softw. 10, 539-582.

SCHITTKOWSKI, K. 2000. EASY-FIT: A software system for data fitting in dynamic systems. J. Des. Optim. To appear.

Received: January 1997; revised: June 1997 and January 1998; accepted: December 1999

Authors' address: Department of Mathematics, University of Bayreuth, Bayreuth, D-95440, Germany; email: Michael.Liepelt@uni-bayreuth.de; Klaus.Schittkowski@uni-bayreuth.de.
COPYRIGHT 2000 Association for Computing Machinery, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2000 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Author:LIEPELT, MICHAEL; SCHITTKOWSKI, KLAUS
Publication:ACM Transactions on Mathematical Software
Date:Sep 1, 2000
Words:3503
Previous Article:Superconvergent Interpolants for Collocation Methods Applied to Mixed-Order BVODEs.
Next Article:A Simple Method for Generating Gamma Variables.
Topics:

Terms of use | Privacy policy | Copyright © 2020 Farlex, Inc. | Feedback | For webmasters