Mind the Gap in the Chain Rule:

Software Tool Support for Advanced Algorithmic Differentiation of Numerical Simulation Code.


Uwe Naumann

Adjoint methods play an increasingly important role in Computational Science, Engineering, and Finance. Typical targets are derivative-based large-scale parameter sensitivity analysis, numerical model calibration, nonlinear optimization, and uncertainty quantification. Classical numerical differentiation by finite differences often turns out to be infeasible due to prohibitive run time and/or inaccuracy of the approximated derivatives. Algorithmic Differentiation (AD) enables their computation with machine accuracy for a given implementation of the underlying (primal) numerical simulation as a computer program.

In adjoint mode AD, sensitivities (gradients, Jacobians) of the simulated quantities with respect to a potentially very large number N of free model parameters can be computed at a computational cost that, relative to the cost of the primal simulation, is independent of N. However, the memory requirement of an adjoint code is known to scale with the number of floating-point operations performed by the primal code, which makes a naïve application of adjoint mode AD infeasible in most real-world situations.
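The cost and memory behaviour described above can be illustrated with a minimal tape-based adjoint sketch in Python. It is not the dco interface; the names `Var`, `sin`, `gradient`, and `primal` are hypothetical and chosen for illustration only. Every elementary operation is recorded on a tape, and a single reverse sweep over that tape yields all N gradient entries, while the tape length grows with the number of operations performed by the primal.

```python
import math

class Var:
    """A value recorded on a global tape (illustrative helper, not dco's API)."""
    tape = []  # one entry per input and per elementary operation

    def __init__(self, value, parents=()):
        self.value = value
        self.adjoint = 0.0
        self.parents = parents  # pairs (argument, local partial derivative)
        Var.tape.append(self)

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def sin(x):
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))

def gradient(f, xs):
    """Return f(xs), its gradient, and the number of tape entries."""
    Var.tape = []
    inputs = [Var(x) for x in xs]
    y = f(inputs)                 # primal run records the tape
    y.adjoint = 1.0               # seed the adjoint of the output
    for v in reversed(Var.tape):  # one reverse sweep, whatever len(xs) is
        for arg, partial in v.parents:
            arg.adjoint += partial * v.adjoint
    return y.value, [x.adjoint for x in inputs], len(Var.tape)

def primal(x):
    # Toy stand-in for a simulation: three operations per additional parameter.
    acc = x[0]
    for xi in x[1:]:
        acc = acc * xi + sin(xi)
    return acc

y, g, tape_len = gradient(primal, [1.0, 2.0, 3.0])
print(g, tape_len)
```

In this toy example the tape holds one entry per input and per operation, so storing it for a long-running simulation quickly becomes the limiting factor.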

AD is based on the knowledge of partial derivatives (local Jacobians) of the building blocks of a numerical simulation code (e.g., the built-in functions of the programming language, numerical methods such as linear and nonlinear solvers, or black-box library routines) and on the chain rule of differential calculus. The chain rule turns out to be associative, which implies an exponential number of different orders in which the underlying chained product of local Jacobian matrices can be evaluated. This fact is exploited by checkpointing methods to limit the memory requirement of adjoint code, by methods for differentiating implicit functions, and by smoothing techniques for locally non-differentiable primal code. In this talk we discuss the concept of "gaps in the chain rule" and its implementation in our AD software tool dco (derivative code by overloading). The relevance and feasibility of the proposed approach have been demonstrated in numerous real-world applications, which the talk will comment on.
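The effect of associativity can be made concrete with a small Python sketch for a chain of dense local Jacobians. It assumes a composition of q building blocks with intermediate dimensions `dims[0], ..., dims[q]`, so that the k-th local Jacobian is a `dims[k]` x `dims[k-1]` matrix, and it counts scalar multiplications only. The function names are hypothetical, and the dynamic program is the textbook matrix-chain algorithm, used here for illustration rather than as the method implemented in dco.

```python
from functools import lru_cache
from math import comb

def num_bracketings(q):
    """Number of ways to bracket a product of q matrices (Catalan number)."""
    return comb(2 * (q - 1), q - 1) // q

def forward_cost(dims):
    """Cost of J_q (... (J_2 J_1)): accumulate from inputs towards outputs."""
    n = dims[0]
    return sum(dims[i + 1] * dims[i] * n for i in range(1, len(dims) - 1))

def reverse_cost(dims):
    """Cost of ((J_q J_{q-1}) ...) J_1: accumulate from outputs towards inputs."""
    m = dims[-1]
    return sum(m * dims[i + 1] * dims[i] for i in range(len(dims) - 2))

def optimal_cost(dims):
    """Minimal cost over all bracketings (classic matrix-chain dynamic program)."""
    @lru_cache(maxsize=None)
    def best(i, j):
        # Cheapest way to form the product J_j ... J_i.
        if i == j:
            return 0
        return min(best(i, k) + best(k + 1, j) + dims[j] * dims[k] * dims[i - 1]
                   for k in range(i, j))
    return best(1, len(dims) - 1)

dims = [10, 1, 10, 1, 10]  # four local Jacobians with alternating shapes
print(num_bracketings(4), forward_cost(dims), reverse_cost(dims), optimal_cost(dims))
```

For these dimensions, forward and reverse accumulation each need 300 multiplications, whereas the best of the 5 possible bracketings needs 120. Neither of the two standard modes is optimal here, and the number of bracketings grows exponentially with the length of the chain.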