jacobian

Approximates the Jacobian of m functions in n unknowns using divided differences.

Synopsis

void jacobian (fcn, y, f, fjac)

Required Arguments

void fcn (indx, y[], f[]) (Input/Output)

User-supplied function to compute the value of the function whose Jacobian is to be calculated using divided differences and/or the value of the analytic derivative of that function.

Required Arguments

int indx (Input)
Index of the variable whose derivative is to be computed. jacobian sets this argument to the index of the variable whose derivative is being computed. In those cases where finite differencing and direct analytic computation are combined to calculate a derivative (see optional argument accumulate), jacobian makes an extra call to fcn (with argument indx negative) each time the derivative with respect to variable indx is calculated, in order to calculate the analytic component of that derivative. Note that indx runs from 1 to n, where n is the number of variables.
float y[] (Input)
Array of length n containing the point at which the function is to be computed.
float f[] (Output)
Array of length m, where m is the number of functions to be evaluated at point y, containing the function values at point y. Normally, the user returns the values of the functions evaluated at point y in f. However, when the function can be broken into two parts, a part which is known analytically and a part to be differenced, fcn is called by jacobian once with indx positive for the portion to be differenced and again with indx negative for the portion which is known analytically. In the case where optional argument accumulate has been specified by the user, fcn must be written to handle the known analytic portion separately from the part to be differenced. (See Example 4 for an example where accumulate is used.)
float y[] (Input)
Array of length n containing the point at which the Jacobian is to be evaluated.
float f[] (Output)
Array of length m containing the function values at point y.
float fjac[] (Input/Output)
m by n array which, on output, contains the estimated Jacobian. Note that if optional argument method is used, then for each variable i for which method[i] is set to DD_SKIP, array elements fjac[j=0,…, m-1][i] are input arguments and must be set to the analytic derivatives with respect to variable i prior to calling jacobian. (See the description of optional argument method and Example 3 below.)

Optional Arguments

yscale, float[] (Input)

An array of length n containing the diagonal scaling matrix for the variables. yscale can also be used to provide appropriate signs for the increments. If the user sets yscale[0] to 0.0, then differencing increment \(del_j\) (for variable \(j=1,\ldots, n\)) is set to \(\sigma_j\) * |y[j-1]| * factor[j-1]. Otherwise, \(del_j\) is set to \(\sigma_j\) * |yscale[j-1]| * factor[j-1]. (See the discussion of optional argument factor below for more information about the calculation of \(del_j\).)

Default: yscale[i=0,...,n-1] = 1.0.

method, int[] (Input)

An array of length n containing the methods used to compute the derivatives. method[i] is the method to be used for variable i. method[i] can be one of the values in the following table:

Value Description
DD_ONE_SIDED Indicates one-sided differences.
DD_CENTRAL Indicates central differences.
DD_SKIP Indicates that the user has set the input Jacobian fjac[j=0,…,m-1][i] to the exact analytic derivative of the function with respect to variable i at point y, and that the calculation of the divided difference approximation is to be skipped.

See Example 2 and Example 3 below for demonstrations of how this optional argument is used.

Note that if (and only if) DD_SKIP is specified for a variable i, the required array elements fjac[j=0,...,m-1][i] must be set to the analytic derivatives with respect to variable i prior to calling jacobian. See Example 3 below.

Default: If optional argument method is not used, then one-sided differences are used for all variables.
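
For example, to request central differences for every variable (using the function from Example 1 below), the method array can simply be filled with DD_CENTRAL; this is a minimal sketch of the call:

from numpy import zeros, double
from pyimsl.math.jacobian import jacobian, DD_CENTRAL

def fcn(indx, y, f):
    f[0] = y[0] * y[1] - 2.0
    f[1] = y[0] - y[0] * y[1] + 1.0

m, n = 2, 2
y = [1.0, 1.0]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)

# Central differences for all n variables.
jacobian(fcn, y, f, fjac, method=[DD_CENTRAL] * n)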

accumulate (Input)
Indicates that divided differences are to be accumulated with a Jacobian value previously initialized by the user with analytically calculated components of the derivatives via a call to fcn using negative values of fcn argument indx. See the description of indx above and Example 4 below.
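
When accumulate is used, fcn must dispatch on the sign of indx. The skeleton below (for a single function, m = 1, in two variables) shows the required structure; the four branch bodies are hypothetical placeholders, and Example 4 below gives a complete, runnable version.

def fcn(indx, y, f):
    # indx > 0: return the portion of the function that jacobian should
    #           difference with respect to variable indx.
    # indx < 0: return the analytically known component of the derivative
    #           with respect to variable |indx|.
    if indx == 1:
        f[0] = part_to_difference_wrt_y1(y)      # hypothetical placeholder
    elif indx == -1:
        f[0] = analytic_derivative_wrt_y1(y)     # hypothetical placeholder
    elif indx == 2:
        f[0] = part_to_difference_wrt_y2(y)      # hypothetical placeholder
    elif indx == -2:
        f[0] = analytic_derivative_wrt_y2(y)     # hypothetical placeholder
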
factor, float[] (Input)

An array of length n containing the percentage factor for differencing.

For each divided difference for variable \(j=1,\ldots,n\), the differencing increment used is \(del_j\). (See the Description below for a discussion of the differencing methods.) The value of \(del_j\) is computed as follows: If yscale[0] has been set to 0.0 (see the description of optional argument yscale above), define \(y_{a,j}\) = |y[j-1]| and \(\sigma_j=1\). Otherwise, define \(y_{a,j}\) = |yscale[j-1]| and set \(\sigma_j=-1\) if yscale[j-1] < 0 or \(\sigma_j=1\) if yscale[j-1] > 0. Finally, compute \(del_j\) = \(\sigma_j\) \(y_{a,j}\) factor[j-1].

By changing the sign of yscale[j-1], the difference \(del_j\) can be given any desired orientation, such as staying within bounds on variable j. For central differences, a reduced factor is used for \(del_j\) that normally results in relative errors as small as \(\varepsilon^{2/3}\), where \(\varepsilon\) is the machine precision for the working (single or double) precision. The elements of factor must satisfy \(\varepsilon^{3/4} \leq\) factor[j-1] ≤ 0.1.

Default: All elements of factor are set to \(\varepsilon^{1/2}\).
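
The increment computation described above can be written out explicitly. The following sketch (plain Python, not part of the library, and ignoring the internal adjustments reported in iStatus) shows how \(del_j\) is formed from y, yscale, and factor:

import math

def increments(y, yscale, factor):
    # Sketch of the del_j computation described above.
    dels = []
    for j in range(len(y)):
        if yscale[0] == 0.0:
            ya, sigma = abs(y[j]), 1.0
        else:
            ya = abs(yscale[j])
            sigma = -1.0 if yscale[j] < 0.0 else 1.0
        dels.append(sigma * ya * factor[j])
    return dels

eps = 2.220446049250313e-16           # double-precision machine epsilon
# With the default factor = sqrt(eps) and the yscale values of Example 2:
print(increments([2.1, 3.2], [1.0, 8000.0], [math.sqrt(eps)] * 2))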

iStatus, int[] (Output)
Array of length 10 containing status information that might prove useful to a user wanting to gain better control over the differencing parameters. This information can often be ignored. The following table describes the diagnostic information that is returned in each of the entries of array iStatus[]:
Index Description
0 The number of times a function evaluation was computed.
1 The number of columns in which three attempts were made to increase a percentage factor for differencing (i.e., a component of the factor array) but the computed \(del_j\) (for \(j=1,\ldots,n\)) remained unacceptably small relative to y[j-1] or yscale[j-1]. In such cases the percentage factor is set to the square root of machine precision.
2 The number of columns in which the computed \(del_j\) was zero to machine precision because y[j-1] or yscale[j-1] was zero. In such cases \(del_j\) is set to the square root of machine precision.
3 The number of Jacobian columns that had to be recomputed because the largest difference formed in the column was close to zero relative to scale, where \(\mathit{scale}=\max\left( \left| f_i(y) \right|, \left| f_i(y+\mathit{del}_j e_j) \right| \right)\) and i denotes the row index of the largest difference in the column currently being processed. iStatus[9] gives the last column where this occurred.
4 The number of columns whose largest difference is close to zero relative to scale after the column has been recomputed.
5 The number of times scale information was not available for use in the round-off and truncation error tests. This occurs when \(\min\left( \left| f_i(y) \right|, \left| f_i(y+\mathit{del}_j e_j) \right| \right)=0\), where i is the index of the largest difference for the column currently being processed.
6 The number of times the increment for differencing (del) was computed and had to be increased because (yscale[j-1] + \(del_j\)) - yscale[j-1] was too small relative to y[j-1] or yscale[j-1].
7 The number of times a component of the factor array was reduced because changes in function values were large and excess truncation error was suspected. iStatus[8] gives the last column in which this occurred.
8 The index of the last column where the corresponding component of the factor array had to be reduced because excessive truncation error was suspected.
9 The index of the last column where the difference was small and the column had to be recomputed with an adjusted increment (see iStatus[3]). The largest derivative in this column may be inaccurate due to excessive round-off error.
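
Following the pattern PyIMSL uses for other output optional arguments (for example, fvec in Example 5 below), iStatus can be retrieved by passing a list that jacobian fills; the keyword usage below is a sketch based on that assumption, reusing the function from Example 1:

from numpy import zeros, double
from pyimsl.math.jacobian import jacobian

def fcn(indx, y, f):
    f[0] = y[0] * y[1] - 2.0
    f[1] = y[0] - y[0] * y[1] + 1.0

y = [1.0, 1.0]
f = zeros(2, dtype=double)
fjac = zeros((2, 2), dtype=double)

# Assumed calling pattern: jacobian fills the user-supplied list istat.
istat = []
jacobian(fcn, y, f, fjac, iStatus=istat)
print("Function evaluations used:", istat[0])
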
fcnWData, void fcn_w_data(), void data[] (Input/Output)
User-supplied function whose Jacobian is being calculated, and which can also accept data supplied by the user. data is the data to be passed to the user-supplied function. See User-Supplied Functions in the “Introduction” chapter for more details on the use of fcnWData.

Description

The function jacobian computes the Jacobian matrix for a function f(y) with m components and n independent variables. jacobian uses divided finite differences to compute the Jacobian. This function is designed for use in numerical methods for solving nonlinear problems where a Jacobian is evaluated repeatedly at neighboring arguments. For example, this occurs in a Gauss-Newton method for solving non-linear least squares problems, or in a non-linear optimization method.

jacobian is suited for applications where the Jacobian is a dense matrix. All cases \(m<n\), \(m=n\), or \(m>n\) are allowed. Both one-sided and central divided differences can be used.

The design allows for computation of derivatives in a variety of contexts. Note that a gradient should be considered as the special case with \(m=1\), \(n\geq 1\). A derivative of a single function of one variable is the case \(m=1\), \(n=1\). Any non-linear solving routine that optionally requests a Jacobian or gradient can use jacobian. This should be considered if there are special properties or scaling issues associated with \(f(y)\). Use the optional argument method to specify different differencing options for numerical differentiation. These can be combined with some analytic subexpressions or other known relationships.

Two divided differences methods are implemented in jacobian for computing the Jacobian: one-sided and central differences.

One-sided differences are computed:

\[\frac{\partial f_i(\mathrm{y})}{\partial y_j} = \frac {f_i\left(\mathrm{y} + \mathit{del}_j \mathrm{e_j}\right) - f_i\left(\mathrm{y}\right)} {\mathit{del}_j}\]

using values of the independent variables at the Jacobian evaluation point \(y=\{ y_j,j=1,\ldots,n \}\) and differenced points \(y+del_j e_j\), where the \(e_j\), \(j=1,\ldots,n\) are the unit coordinate vectors.

Central differences are computed:

\[\frac{\partial f_i(\mathrm{y})}{\partial y_j} = \frac {f_i\left(\mathrm{y} + \mathit{del}_j \mathrm{e_j}\right) - f_i\left(\mathrm{y} - \mathit{del}_j \mathrm{e_j}\right)} {2 \mathit{del}_j}\]

The value for each difference \(del_j\) depends on the variable \(y_j\), the differencing method, and the scaling for that variable. This difference is computed internally; see the description of optional argument factor for computational details. The function values \(f_i(y)\) are obtained from the user-supplied function fcn, which receives the variable index (argument indx) and the point y, and returns the function values in the output array f.
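
The two formulas are easy to verify by hand. The sketch below forms both differences for \(\partial f_1 / \partial y_1\) of the Example 1 function at y = (1, 1), using a fixed increment rather than the internally computed \(del_j\); both quotients are close to the exact derivative value 1:

import numpy as np

def f1(y):
    # First component of Example 1: f_1(y) = y_1 * y_2 - 2
    return y[0] * y[1] - 2.0

y = np.array([1.0, 1.0])
e1 = np.array([1.0, 0.0])                  # unit coordinate vector e_1
delta = np.sqrt(np.finfo(float).eps)       # a typical one-sided increment

one_sided = (f1(y + delta * e1) - f1(y)) / delta
central = (f1(y + delta * e1) - f1(y - delta * e1)) / (2.0 * delta)
print(one_sided, central)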

There are five examples provided that illustrate various ways to use jacobian. For a discussion of the expected errors for the difference methods, see Ralston (1965).

Function jacobian is based upon the Fortran 77 program SJACG, which was designed and programmed by D. A. Salane, Sandia Labs (1986), and modified by R. J. Hanson, Rice University (June, 2002), with advice from F. T. Krogh. See Salane (1986).

Examples

Example 1

In this example, the Jacobian matrix of

\[f_1(x) = x_1x_2 - 2\]
\[f_2(x) = x_1 - x_1x_2 + 1\]

is estimated by the finite-difference method at the point (1.0, 1.0).

from numpy import *
from pyimsl.math.jacobian import jacobian
from pyimsl.math.writeMatrix import writeMatrix


def fcn(indx, y, f):
    f[0] = y[0] * y[1] - 2.0
    f[1] = y[0] - y[0] * y[1] + 1.0
    return


m = 2
n = 2
y = [1.0, 1.0]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)

jacobian(fcn, y, f, fjac)

writeMatrix("The Jacobian is:", fjac, writeFormat="%14.5e")

Output

 
        The Jacobian is:
                1               2
1     1.00000e+00     1.00000e+00
2     0.00000e+00    -1.00000e+00

Example 2

A simple use of jacobian is shown. The gradient of the function

\[f(y_1,y_2) = a \exp(by_1) + cy_1y_2^2\]

is required for values

\[a = 2.5 \times 10^6,\ b = 3.4,\ c = 4.5,\ y_1 = 2.1,\ y_2 = 3.2\]

The analytic gradient of this function is:

\[\mathrm{grad}(f) = \left[ab \exp(by_1) + cy_2^2,\ 2cy_1y_2\right]\]

Note that the comparison of the Jacobian estimates using one-sided and central differences with the exact analytic Jacobian results given in this example demonstrates the increased accuracy afforded by use of central differences. However, these estimates require up to twice as many function calculations as do the one-sided differences estimates for Jacobians with a large number of variables.

from numpy import *
from pyimsl.math.jacobian import jacobian, DD_ONE_SIDED, DD_CENTRAL
from pyimsl.math.writeMatrix import writeMatrix


def fcn(indx, y, f):
    a = 2500000.
    b = 3.4
    c = 4.5
    f[0] = a * exp(b * y[0]) + c * y[0] * y[1] * y[1]
    return


def fcnDrv(y, fjac):
    a = 2500000.
    b = 3.4
    c = 4.5
    fjac[0][0] = a * b * exp(b * y[0]) + c * y[1] * y[1]
    fjac[0][1] = 2 * c * y[0] * y[1]
    return


m = 1
n = 2
method = [0, 0]

y = [2.1, 3.2]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)
scale = [1.0, 8000.0]

# Use one-sided and central differences
# to approximate gradient and print results:
for j in range(2):
    if (j == 0):
        method[0] = DD_ONE_SIDED
        method[1] = DD_ONE_SIDED
    else:
        method[0] = DD_CENTRAL
        method[1] = DD_CENTRAL

    jacobian(fcn, y, f, fjac,
             method=method, yscale=scale)
    if (j == 0):
        writeMatrix("One sided Jacobian:", fjac,
                    writeFormat="%14.5e")
    else:
        writeMatrix("Central Jacobian:", fjac,
                    writeFormat="%14.5e")

# Calculate analytic Jacobian:
fcnDrv(y, fjac)
writeMatrix("Analytic Jacobian:", fjac, writeFormat="%14.5e")

Output

 
      One sided Jacobian:
             1               2
   1.07221e+10     6.04800e+01
 
       Central Jacobian:
             1               2
   1.07221e+10     6.04800e+01
 
      Analytic Jacobian:
             1               2
   1.07221e+10     6.04800e+01

Example 3

This example uses the same data as in Example 2. Here we assume that the second component of the gradient is analytically known. Therefore only the first gradient component needs numerical approximation. The input value DD_SKIP of array element method[1] specifies that numerical differentiation with respect to \(y_2\) is skipped.

from numpy import *
from pyimsl.math.jacobian import jacobian, DD_ONE_SIDED, DD_SKIP
from pyimsl.math.writeMatrix import writeMatrix


def fcn(indx, y, f):
    a = 2500000.
    b = 3.4
    c = 4.5
    f[0] = a * exp(b * y[0]) + c * y[0] * y[1] * y[1]
    return


m = 1
n = 2
method = [0, 0]

y = [2.1, 3.2]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)
scale = [1.0, 8000.0]

# One-sided differences for fjac[0]:
method[0] = DD_ONE_SIDED
# Set method[1] to skip differencing for fjac[0][1]
# and initialize that element to the analytic derivative:
method[1] = DD_SKIP
fjac[0][1] = 2.0 * 4.5 * y[0] * y[1]

# Get Gradient approximation:
jacobian(fcn, y, f, fjac,
         method=method, yscale=scale)

writeMatrix("The Jacobian is:", fjac,
            writeFormat="%14.5e")

Output

 
       The Jacobian is:
             1               2
   1.07221e+10     6.04800e+01

Example 4

This example uses the same data as in Example 2, computing the Jacobian (gradient) of the function:

\[f(y_1, y_2) = a \exp(by_1) + cy_1y_2^2\]

For this example, the analytic derivative of the second term with respect to \(y_1\), \(cy_2^2\), is provided by the user (see the indx == -1 branch within the user-supplied function fcn). This leaves only the first term, \(a \exp(by_1)\), to be evaluated in order to use direct differencing to calculate the first partial derivative (see the indx == 1 branch). Also, since the first term does not depend on the second variable, \(y_2\), it can be left out of the function evaluation when computing the partial derivative with respect to \(y_2\) by differencing, potentially avoiding cancellation errors (see the indx == 2 branch). Since the code does not specify the analytic derivative with respect to \(y_2\) for either of the two terms of \(f(y_1,y_2)\), fcn returns f[0] set to 0 for indx == -2. The use of optional argument accumulate thereby allows jacobian to use these facts to obtain greater accuracy with a minimum number of evaluations of the exponential function.

from numpy import *
from pyimsl.math.jacobian import jacobian
from pyimsl.math.writeMatrix import writeMatrix


def fcn(indx, y, f):
    a = 2500000.
    b = 3.4
    c = 4.5

    # Handle both the differenced part and the part that is
    # known analytically for each dependent variable:
    if indx == 1:
        f[0] = a * exp(b * y[0])
    elif indx == -1:
        f[0] = c * y[1] * y[1]
    elif indx == 2:
        f[0] = c * y[0] * y[1] * y[1]
    elif indx == -2:
        f[0] = 0.0
    return


m = 1
n = 2
y = [2.1, 3.2]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)
scale = [1.0, 8000.0]

"""
 Use optional argument accumulate so that the
 user can specify which function components are to be
 used for divided difference approximation of
 derivatives and which are to be replaced with exact
 analytically calculated derivatives. Both components
 are set to the default one-sided differences method.
"""
# Calculate and print Jacobian approximation:
jacobian(fcn, y, f, fjac,
         accumulate=True, yscale=scale)
writeMatrix("The Jacobian is:", fjac,
            writeFormat="%14.5e")

Output

 
       The Jacobian is:
             1               2
   1.07221e+10     6.04800e+01

Example 5

This example uses function boundedLeastSquares to solve the nonlinear least-squares problem:

\[\begin{split}\begin{array}{l} \min \tfrac{1}{2} \sum_{i=0}^{1} f_i(x)^2 \\ -2 \leq x_0 \leq 0.5 \\ -1 \leq x_1 \leq 2 \\ \end{array}\end{split}\]

where: \(f_0(x)=10(x_1-x_0^2)\), \(f_1(x)=1-x_0\), an initial guess (-1.2, 1.0) is supplied, and the residual at the approximate solution is returned. This example is identical to Example 2 of boundedLeastSquares, except that Example 2 uses an analytic Jacobian, and this example uses jacobian to approximate the Jacobian using the default one-sided differences.

Note that the function vector whose sum of squares is to be minimized, rosbck, is supplied directly (as a required argument) to boundedLeastSquares and indirectly to jacobian, wrapped in function fcn. Function fcn is passed to jacobian as its required fcn argument, and jacobian itself is called from within function jacobianFcn, which is passed to boundedLeastSquares via the optional argument jacobian.

Also note that the array size parameters m and n required by rosbck are supplied directly (as the literal values 2 and 2) inside the wrapper function fcn. If such integer or real data had to be determined at run time instead, it could be passed to fcn through the optional argument fcnWData described above.

Example 5 demonstrates how jacobian can be used to supply estimates of the Jacobian matrix that are necessary for solving many optimization problems when the function to be minimized is complex and its Jacobian cannot be calculated analytically.

from __future__ import print_function
from numpy import *
from pyimsl.math.boundedLeastSquares import boundedLeastSquares
from pyimsl.math.jacobian import jacobian


def rosbck(m, n, x, f):
    f[0] = 10.0 * (x[1] - x[0] * x[0])
    f[1] = 1.0 - x[0]


def fcn(indx, x, f):
    rosbck(2, 2, x, f)
    return


def jacobianFcn(m, n, x, fjac, fjac_col_dim):
    f = zeros(2, dtype=double)
    fj = zeros((2, 2), dtype=double)
    y = zeros(2, dtype=double)
    y[0] = x[0]
    y[1] = x[1]
    # Approximate the Jacobian at x using the default one-sided differences.
    jacobian(fcn, y, f, fj)
    # Copy the 2 by 2 result into the flat array expected by boundedLeastSquares.
    fjac[0] = fj[0][0]
    fjac[1] = fj[0][1]
    fjac[2] = fj[1][0]
    fjac[3] = fj[1][1]


m = 2
n = 2
ibtype = 0
xlb = [-2.0, -1.0]
xub = [0.5, 2.0]
xguess = [-1.2, 1.0]
fvec = []
x = boundedLeastSquares(rosbck, m, n, ibtype, xlb, xub,
                        jacobian=jacobianFcn,
                        xguess=xguess,
                        fvec=fvec)
print("x[0] = %f" % x[0])
print("x[1] = %f" % x[1])
print("fvec[0] = %f" % fvec[0])
print("fvec[1] = %f" % fvec[1])

Output

x[0] = 0.500000
x[1] = 0.250000
fvec[0] = 0.000000
fvec[1] = 0.500000