jacobian¶
Approximates the Jacobian of m functions in n unknowns using divided differences.
Synopsis¶
jacobian (fcn, y, f, fjac)
Required Arguments¶
- void fcn (indx, y[], f[]) (Input/Output)
  User-supplied function to compute the value of the function whose Jacobian is to be calculated using divided differences and/or the value of the analytic derivative of that function.

  Required Arguments

  - int indx (Input)
    Index of the variable whose derivative is to be computed. jacobian sets this argument to the index of the variable whose derivative is being computed. In those cases where finite differencing and direct analytic computation are combined to calculate a derivative (see optional argument accumulate), jacobian makes an extra call to fcn (with argument indx negative) each time the derivative with respect to variable indx is calculated, in order to calculate the analytic component of that derivative. Note that indx runs from 1 to n, where n is the number of variables. (A simplified sketch of this calling pattern follows this argument list.)
  - float y[] (Input)
    Array of length n containing the point at which the function is to be computed.
  - float f[] (Output)
    Array of length m, where m is the number of functions to be evaluated at point y, containing the function values at point y. Normally, the user returns the values of the functions evaluated at point y in f. However, when the function can be broken into two parts, a part which is known analytically and a part to be differenced, fcn is called by jacobian once with indx positive for the portion to be differenced and again with indx negative for the portion which is known analytically. In the case where optional argument accumulate has been specified by the user, fcn must be written to handle the known analytic portion separately from the part to be differenced. (See Example 4 for an example where accumulate is used.)
- float y[] (Input)
  Array of length n containing the point at which the Jacobian is to be evaluated.
- float f[] (Output)
  Array of length m containing the function values at point y.
- float fjac[] (Input/Output)
  m by n array which, on output, contains the estimated Jacobian. Note that if optional argument method is used, then for each variable i for which method[i] is set to DD_SKIP, array elements fjac[j=0,...,m-1][i] are input arguments and must be set to the analytic derivatives with respect to variable i prior to calling jacobian. (See the description of optional argument method and Example 3 below.)
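The following is a simplified, illustrative sketch of the one-sided calling pattern jacobian uses with fcn; it shows how indx runs from 1 to n and how each Jacobian column is formed. This is not the library's implementation: the helper name one_sided_jacobian and the explicit del_ argument are hypothetical, and the library chooses and adjusts the increments itself (see optional arguments yscale and factor below).

    def one_sided_jacobian(fcn, y, m, n, del_):
        # Base point evaluation, f(y). (Which indx value the library
        # uses for this call is an internal detail; a positive index
        # is assumed here.)
        f0 = [0.0] * m
        fcn(1, y, f0)
        fjac = [[0.0] * n for _ in range(m)]
        for j in range(1, n + 1):          # indx runs from 1 to n
            yj = list(y)
            yj[j - 1] += del_[j - 1]       # differenced point y + del_j e_j
            fj = [0.0] * m
            fcn(j, yj, fj)
            for i in range(m):
                # One-sided divided difference for column j:
                fjac[i][j - 1] = (fj[i] - f0[i]) / del_[j - 1]
        return fjac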
Optional Arguments¶
- yscale, float[] (Input)
  An array of length n containing the diagonal scaling matrix for the variables. yscale can also be used to provide appropriate signs for the increments. If the user sets yscale[0] to 0.0, then the differencing increment \(del_j\) (for variable \(j=1,\ldots,n\)) is set to \(\sigma_j\) * |y[j-1]| * factor[j-1]. Otherwise, \(del_j\) is set to \(\sigma_j\) * |yscale[j-1]| * factor[j-1]. (See the discussion of optional argument factor below for more information about the calculation of \(del_j\).)
  Default: yscale[i=0,...,n-1] = 1.0. (A short usage sketch covering yscale, factor, and iStatus follows this list of optional arguments.)
- method, int[] (Input)
  An array of length n containing the methods used to compute the derivatives. method[i] is the method to be used for variable i. method[i] can be one of the values in the following table:

  Value | Description
  ---|---
  DD_ONE_SIDED | Indicates one-sided differences.
  DD_CENTRAL | Indicates central differences.
  DD_SKIP | Indicates that the user has set the input Jacobian fjac[j=0,...,m-1][i] to the exact analytic derivative of the function with respect to variable i at point y, and that the calculation of the divided difference approximation is to be skipped.

  See Example 2 and Example 3 below for demonstrations of how this optional argument is used.
  Note that if (and only if) DD_SKIP is specified for a variable i, the required array elements fjac[j=0,...,m-1][i] must be set to the analytic derivatives with respect to variable i prior to calling jacobian. See Example 3 below.
  Default: If optional argument method is not used, then one-sided differences are used for all variables.
- accumulate (Input)
  Indicates that divided differences are to be accumulated with a Jacobian value previously initialized by the user with analytically calculated components of the derivatives via a call to fcn using negative values of fcn argument indx. See the description of indx above and Example 4 below.
- factor, float[] (Input)
  An array of length n containing the percentage factors for differencing. For each divided difference for variable \(j=1,\ldots,n\), the differencing increment used is \(del_j\). (See the Description below for a discussion of the differencing methods.) The value of \(del_j\) is computed as follows: If yscale[0] has been set to 0.0 (see the description of optional argument yscale above), define \(y_{a,j}\) = |y[j-1]| and \(\sigma_j=1\). Otherwise, if yscale[j-1] {<, >} 0, define \(y_{a,j}\) = |yscale[j-1]| and \(\sigma_j=\{-1,1\}\). Finally, compute \(del_j = \sigma_j\,y_{a,j}\) * factor[j-1].
  By changing the sign of yscale[j-1], the difference \(del_j\) can be given any desired orientation, such as staying within bounds on variable j. For central differences, a reduced factor is used for \(del_j\) that normally results in relative errors as small as \(\varepsilon^{2/3}\), where \(\varepsilon\) is the machine precision, machine(4). The elements of factor must satisfy \(\varepsilon^{3/4}\) ≤ factor[j-1] ≤ 0.1.
  Default: All elements of factor are set to \(\varepsilon^{1/2}\).
- iStatus, int[] (Output)
  Array of length 10 containing status information that might prove useful to a user wanting to gain better control over the differencing parameters. This information can often be ignored. The following table describes the diagnostic information that is returned in each of the entries of array iStatus[]:

  Index | Description
  ---|---
  0 | The number of times a function evaluation was computed.
  1 | The number of columns in which three attempts were made to increase a percentage factor for differencing (i.e., a component of the factor array) but the computed \(del_j\) (for \(j=1,\ldots,n\)) remained unacceptably small relative to y[j-1] or yscale[j-1]. In such cases the percentage factor is set to the square root of machine precision.
  2 | The number of columns in which the computed \(del_j\) was zero to machine precision because y[j-1] or yscale[j-1] was zero. In such cases \(del_j\) is set to the square root of machine precision.
  3 | The number of Jacobian columns that had to be recomputed because the largest difference formed in the column was close to zero relative to scale, where \(\mathit{scale}=\max\left(\left|f_i(y)\right|, \left|f_i(y+\mathit{del}_j e_j)\right|\right)\) and i denotes the row index of the largest difference in the column currently being processed. \(index=9\) gives the last column where this occurred.
  4 | The number of columns whose largest difference is close to zero relative to scale after the column has been recomputed.
  5 | The number of times scale information was not available for use in the round-off and truncation error tests. This occurs when \(\min\left(\left|f_i(y)\right|, \left|f_i(y+\mathit{del}_j e_j)\right|\right)=0\), where i is the index of the largest difference for the column currently being processed.
  6 | The number of times the increment for differencing (\(del_j\)) was computed and had to be increased because (y[j-1] + \(del_j\)) - y[j-1] was too small relative to y[j-1] or yscale[j-1].
  7 | The number of times a component of the factor array was reduced because changes in function values were large and excess truncation error was suspected. \(index=8\) gives the last column in which this occurred.
  8 | The index of the last column where the corresponding component of the factor array had to be reduced because excessive truncation error was suspected.
  9 | The index of the last column where the difference was small and the column had to be recomputed with an adjusted increment (see \(index=3\)). The largest derivative in this column may be inaccurate due to excessive round-off error.
- fcnWData, void fcn_w_data (), void data[] (Input/Output)
  User-supplied function whose Jacobian is being calculated, and which can also accept data supplied by the user. data is the data to be passed to the user-supplied function. See User-Supplied Functions in the "Introduction" chapter for more details on the use of fcnWData.
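The following short sketch pulls these optional arguments together. It is illustrative only: the yscale and method keywords appear in the Examples below, while passing factor and retrieving iStatus through the analogously named keywords is an assumption based on the argument descriptions above.

    from numpy import *
    from pyimsl.math.jacobian import jacobian

    def fcn(indx, y, f):
        f[0] = y[0] * y[0] - y[1]
        return

    y = [3.0, 1.0]
    f = zeros(1, dtype=double)
    fjac = zeros((1, 2), dtype=double)

    # A negative yscale entry orients del_0 downward (sigma_0 = -1),
    # e.g. to stay below an upper bound on y[0]; factor overrides the
    # default percentage factors; istat receives the 10 status values.
    istat = []
    jacobian(fcn, y, f, fjac,
             yscale=[-3.0, 1.0],
             factor=[1.0e-7, 1.0e-7],
             iStatus=istat)
    # fjac is approximately [[6.0, -1.0]]; istat[0] is the number of
    # function evaluations used.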
Description¶
The function jacobian
computes the Jacobian matrix for a function
f(y) with m components and n independent variables. jacobian
uses divided finite differences to compute the Jacobian. This function is
designed for use in numerical methods for solving nonlinear problems where a
Jacobian is evaluated repeatedly at neighboring arguments. For example, this
occurs in a Gauss-Newton method for solving non-linear least squares
problems, or in a non-linear optimization method.
jacobian
is suited for applications where the Jacobian is a dense matrix.
All cases \(m<n\), \(m=n\), or \(m>n\) are allowed. Both
one-sided and central divided differences can be used.
The design allows for computation of derivatives in a variety of contexts.
Note that a gradient should be considered as the special case with
\(m=1\), \(n\geq 1\). A derivative of a single function of one
variable is the case \(m=1\), \(n=1\). Any non-linear solving routine
that optionally requests a Jacobian or gradient can use jacobian
. This
should be considered if there are special properties or scaling issues
associated with \(f(y)\). Use the optional argument method to specify different differencing options for
numerical differentiation. These can be combined with some analytic
subexpressions or other known relationships.
Two divided difference methods are implemented in jacobian for computing the Jacobian: one-sided and central differences.
One-sided differences are computed as

\[\frac{f_i\left(y+\mathit{del}_j e_j\right)-f_i(y)}{\mathit{del}_j}, \quad i=1,\ldots,m,\]

using values of the independent variables at the Jacobian evaluation point \(y=\{ y_j,\, j=1,\ldots,n \}\) and at the differenced points \(y+del_j e_j\), where the \(e_j\), \(j=1,\ldots,n\), are the unit coordinate vectors.
Central differences are computed as

\[\frac{f_i\left(y+\mathit{del}_j e_j\right)-f_i\left(y-\mathit{del}_j e_j\right)}{2\,\mathit{del}_j}, \quad i=1,\ldots,m.\]

The value for each difference \(del_j\) depends on the variable \(y_j\), the differencing method, and the scaling for that variable. This difference is computed internally. (See the description of optional argument factor for computational details.) \(f_i(y)\) is evaluated with user-supplied argument fcn, where the index j, the variables y, and the output f = \(f_i(y)\) are arguments to fcn.
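For concreteness, a rough worked computation of \(del_j\): in double precision \(\varepsilon \approx 2.2\times 10^{-16}\), so each factor[j-1] defaults to \(\varepsilon^{1/2} \approx 1.5\times 10^{-8}\). With the default yscale of all ones, \(del_j \approx 1.5\times 10^{-8}\) for every variable. If instead yscale[0] is set to 0.0 so that increments scale with the variables, then at \(y=(2.1,\,3.2)\) the one-sided increments are \(del_1 \approx 2.1 \times 1.5\times 10^{-8} \approx 3.1\times 10^{-8}\) and \(del_2 \approx 3.2 \times 1.5\times 10^{-8} \approx 4.8\times 10^{-8}\).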
There are five examples provided that illustrate various ways to use jacobian. For a discussion of the expected errors for the difference methods, see Ralston (1965).
Function jacobian
is based upon the Fortran 77 program SJACG, which was
designed and programmed by D. A. Salane, Sandia Labs (1986) and modified by
R. J. Hanson, Rice University (June, 2002) with advice from F. T. Krogh. See
Salane (1986).
Examples¶
Example 1¶
In this example, the Jacobian matrix of

\[f_1(y_1,y_2) = y_1 y_2 - 2, \qquad f_2(y_1,y_2) = y_1 - y_1 y_2 + 1\]

is estimated by the finite-difference method at the point (1.0, 1.0).
from numpy import *
from pyimsl.math.jacobian import jacobian
from pyimsl.math.writeMatrix import writeMatrix


def fcn(indx, y, f):
    f[0] = y[0] * y[1] - 2.0
    f[1] = y[0] - y[0] * y[1] + 1.0
    return


m = 2
n = 2
y = [1.0, 1.0]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)

jacobian(fcn, y, f, fjac)

writeMatrix("The Jacobian is:", fjac, writeFormat="%14.5e")
Output¶
The Jacobian is:
1 2
1 1.00000e+00 1.00000e+00
2 0.00000e+00 -1.00000e+00
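As a check, the analytic Jacobian of this system has entries \(\partial f_1/\partial y_1 = y_2\), \(\partial f_1/\partial y_2 = y_1\), \(\partial f_2/\partial y_1 = 1-y_2\), and \(\partial f_2/\partial y_2 = -y_1\), which at (1.0, 1.0) give the values 1, 1, 0, and -1 seen in the differenced output above.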
Example 2¶
A simple use of jacobian is shown. The gradient of the function

\[f(y_1,y_2) = a e^{b y_1} + c y_1 y_2^2\]

is required for values \(a=2.5\times 10^6\), \(b=3.4\), \(c=4.5\), \(y_1=2.1\), and \(y_2=3.2\).
The analytic gradient of this function is:

\[\nabla f = \left( a b e^{b y_1} + c y_2^2,\; 2 c y_1 y_2 \right)\]
Note that the comparison of the Jacobian estimates using one-sided and central differences with the exact analytic Jacobian results given in this example demonstrates the increased accuracy afforded by use of central differences. However, these estimates require up to twice as many function calculations as do the one-sided differences estimates for Jacobians with a large number of variables.
from numpy import *
from pyimsl.math.jacobian import jacobian, DD_ONE_SIDED, DD_CENTRAL
from pyimsl.math.writeMatrix import writeMatrix


def fcn(indx, y, f):
    a = 2500000.
    b = 3.4
    c = 4.5
    f[0] = a * exp(b * y[0]) + c * y[0] * y[1] * y[1]
    return


def fcnDrv(y, fjac):
    a = 2500000.
    b = 3.4
    c = 4.5
    fjac[0][0] = a * b * exp(b * y[0]) + c * y[1] * y[1]
    fjac[0][1] = 2 * c * y[0] * y[1]
    return


m = 1
n = 2
method = [0, 0]
y = [2.1, 3.2]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)
scale = [1.0, 8000.0]

# Use one-sided and central differences
# to approximate the gradient and print results:
for j in range(2):
    if (j == 0):
        method[0] = DD_ONE_SIDED
        method[1] = DD_ONE_SIDED
    else:
        method[0] = DD_CENTRAL
        method[1] = DD_CENTRAL
    jacobian(fcn, y, f, fjac,
             method=method, yscale=scale)
    if (j == 0):
        writeMatrix("One sided Jacobian:", fjac,
                    writeFormat="%14.5e")
    else:
        writeMatrix("Central Jacobian:", fjac,
                    writeFormat="%14.5e")

# Calculate the analytic Jacobian:
fcnDrv(y, fjac)
writeMatrix("Analytic Jacobian:", fjac, writeFormat="%14.5e")
Output¶
One sided Jacobian:
1 2
1.07221e+10 6.04800e+01
Central Jacobian:
1 2
1.07221e+10 6.04800e+01
Analytic Jacobian:
1 2
1.07221e+10 6.04800e+01
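In terms of cost for this \(n=2\) gradient: the one-sided approximation needs the base value \(f(y)\) plus one differenced value per variable (three evaluations of fcn), while central differences need two differenced values per variable (four evaluations); for large n this approaches the factor of two noted above.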
Example 3¶
This example uses the same data as in
Example 2. Here we assume that the second
component of the gradient is analytically known. Therefore only the first
gradient component needs numerical approximation. The input value
DD_SKIP
of array element method[1]
specifies that numerical
differentiation with respect to \(y_2\) is skipped.
from numpy import *
from pyimsl.math.jacobian import jacobian, DD_ONE_SIDED, DD_SKIP
from pyimsl.math.writeMatrix import writeMatrix


def fcn(indx, y, f):
    a = 2500000.
    b = 3.4
    c = 4.5
    f[0] = a * exp(b * y[0]) + c * y[0] * y[1] * y[1]
    return


m = 1
n = 2
method = [0, 0]
y = [2.1, 3.2]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)
scale = [1.0, 8000.0]

# One-sided differences for fjac[0][0]:
method[0] = DD_ONE_SIDED
# Set method[1] to skip differencing for fjac[0][1]
# and initialize it to the analytic derivative:
method[1] = DD_SKIP
fjac[0][1] = 2.0 * 4.5 * y[0] * y[1]

# Get the gradient approximation:
jacobian(fcn, y, f, fjac,
         method=method, yscale=scale)

writeMatrix("The Jacobian is:", fjac,
            writeFormat="%14.5e")
Output¶
The Jacobian is:
1 2
1.07221e+10 6.04800e+01
Example 4¶
This example uses the same data as in Example 2, computing the Jacobian (gradient) of the function:

\[f(y_1,y_2) = a e^{b y_1} + c y_1 y_2^2\]

For this example, the analytic derivative of the second term with respect to \(y_1\), \(cy_2^2\), is provided by the user (in the example, see the indx == -1 branch within the user-supplied function fcn). This leaves only the first term, \(a \exp(by_1)\), to be evaluated in order to use direct differencing to calculate the first partial (see the indx == 1 branch). Also, since the first term does not depend on the second variable, \(y_2\), it can be left out of the function evaluation when computing the partial with respect to \(y_2\) using differencing methods, potentially avoiding cancellation errors (see the indx == 2 branch). Since the code does not specify the analytic derivative with respect to \(y_2\) for either of the two terms of \(f(y_1,y_2)\), fcn returns f[0] set to 0 for indx == -2. The use of optional argument accumulate thereby allows jacobian to use these facts to obtain greater accuracy using a minimum number of computations of the exponential function.
from numpy import *
from pyimsl.math.jacobian import jacobian
from pyimsl.math.writeMatrix import writeMatrix


def fcn(indx, y, f):
    a = 2500000.
    b = 3.4
    c = 4.5
    # Handle both the differenced part and the part that is
    # known analytically for each independent variable:
    if indx == 1:
        f[0] = a * exp(b * y[0])
    elif indx == -1:
        f[0] = c * y[1] * y[1]
    elif indx == 2:
        f[0] = c * y[0] * y[1] * y[1]
    elif indx == -2:
        f[0] = 0.0
    return


m = 1
n = 2
y = [2.1, 3.2]
f = zeros(m, dtype=double)
fjac = zeros((m, n), dtype=double)
scale = [1.0, 8000.0]

# Use optional argument accumulate so that the user can specify
# which function components are to be used for divided difference
# approximation of derivatives and which are to be replaced with
# exact analytically calculated derivatives. Both components use
# the default one-sided differences method.

# Calculate and print the Jacobian approximation:
jacobian(fcn, y, f, fjac,
         accumulate=True, yscale=scale)

writeMatrix("The Jacobian is:", fjac,
            writeFormat="%14.5e")
Output¶
The Jacobian is:
1 2
1.07221e+10 6.04800e+01
Example 5¶
This example uses function boundedLeastSquares to solve the nonlinear least-squares problem

\[\min_x \left( f_0(x)^2 + f_1(x)^2 \right), \qquad -2 \le x_0 \le 0.5, \quad -1 \le x_1 \le 2,\]

where \(f_0(x)=10(x_1-x_0^2)\) and \(f_1(x)=1-x_0\). An initial guess (-1.2, 1.0) is supplied, and the residual at the approximate solution is returned. This example is identical to Example 2 of boundedLeastSquares, except that Example 2 uses an analytic Jacobian, and this example uses jacobian to approximate the Jacobian using the default one-sided differences.
Note that the function vector whose sum of squares is to be minimized, rosbck, is supplied directly (as a required argument) to boundedLeastSquares and indirectly to jacobian, wrapped in function fcn. jacobian is called from within function jacobianFcn, which is passed to boundedLeastSquares via optional argument jacobian.
Also note that the array size parameters m and n required by rosbck are supplied by the wrapper function fcn, which adapts rosbck to the argument list that jacobian expects. If additional integer or real data must be passed at run time to the function being differenced, optional argument fcnWData can be used for that purpose (see the description of fcnWData above).
Example 5 demonstrates how jacobian
can be used to supply estimates of
the Jacobian matrix that are necessary for solving many optimization
problems when the function to be minimized is complex and its Jacobian
cannot be calculated analytically.
from __future__ import print_function
from numpy import *
from pyimsl.math.boundedLeastSquares import boundedLeastSquares
from pyimsl.math.jacobian import jacobian


def rosbck(m, n, x, f):
    f[0] = 10.0 * (x[1] - x[0] * x[0])
    f[1] = 1.0 - x[0]


def fcn(indx, x, f):
    rosbck(2, 2, x, f)
    return


def jacobianFcn(m, n, x, fjac, fjac_col_dim):
    f = zeros(2, dtype=double)
    fj = zeros((2, 2), dtype=double)
    y = zeros(2, dtype=double)
    y[0] = x[0]
    y[1] = x[1]
    jacobian(fcn, y, f, fj)
    # Copy the 2x2 result into the flat array expected by
    # boundedLeastSquares:
    fjac[0] = fj[0][0]
    fjac[1] = fj[0][1]
    fjac[2] = fj[1][0]
    fjac[3] = fj[1][1]


m = 2
n = 2
ibtype = 0
xlb = [-2.0, -1.0]
xub = [0.5, 2.0]
xguess = [-1.2, 1.0]
fvec = []

x = boundedLeastSquares(rosbck, m, n, ibtype, xlb, xub,
                        jacobian=jacobianFcn,
                        xguess=xguess,
                        fvec=fvec)

print("x[0] = %f" % x[0])
print("x[1] = %f" % x[1])
print("fvec[0] = %f" % fvec[0])
print("fvec[1] = %f" % fvec[1])
Output¶
x[0] = 0.500000
x[1] = 0.250000
fvec[0] = 0.000000
fvec[1] = 0.500000