Performs Kalman filtering and evaluates the likelihood function for the state-space model.
#include <imsls.h>
void
imsls_f_kalman (int
nb,
float nb[], float
covb[],
int *n,
float *ss,
float *alndet, ..., 0)
The type double function is imsls_d_kalman.
int nb
(Input)
Number of elements in the state vector.
float b[]
(Input/Output)
Array of length nb containing the
estimated state vector. The input is the estimated state vector at time k
given the observations through time
k - 1. The output is the estimated state vector
at time k + 1 given the observations through time k. On the first
call to imsls_f_kalman,
the input b must
be the prior mean of the state vector at time 1.
float covb[]
(Input/Output)
Array of size nb by nb such that covb* s2 is the mean squared error matrix for
b.
Before the first call to imsls_f_kalman,
covb * s2 must equal the variance-covariance matrix
of the state vector.
int *n
(Input/Output)
Pointer to the rank of the variance-covariance matrix for all
the observations. n must be initialized
to zero before the first call to imsls_f_kalman.
In the usual case when the variance-covariance matrix is nonsingular, n equals the sum of
the ny’s from
the invocations to imsls_f_kalman. See optional argument IMSLS_UPDATE below for
the definition of ny.
float *ss
(Input/Output)
Pointer to the generalized sum of squares.
ss must be initialized
to zero before the first call to imsls_f_kalman.
The estimate of s2 is given by
.
float *alndet
(Input/Output)
Pointer to the natural log of the product of the nonzero
eigenvalues of
P where P * s2 is the variance-covariance matrix of the
observations. Although alndet is computed,
imsls_f_kalman
avoids the explicit computation of P. alndet must be
initialized to zero before the first call to imsls_f_kalman.
In the usual case when P is nonsingular, alndet is the natural
log of the determinant of P.
#include <imsls.h>
voidt
*imsls_f_random_sample (int
nb,
float nb[], float
covb[],
int *n,
float *ss, float
*alndet,
IMSLS_UPDATE, int ny, float *y, float *z, float *r,
IMSLS_Z_COL_DIM, int
z_col_dim,
IMSLS_R_COL_DIM, int
r_col_dim,
IMSLS_T, float *t,
IMSLS_T_COL_DIM, int
t_col_dim,
IMSLS_Q, float *q,
IMSLS_Q_COL_DIM, int
t_col_dim,
IMSLS_TOLERANCE, float
tolerance,
IMSLS_V, float
**v,
IMSLS_V_USER, float
v[],
IMSLS_COVV, float
**v,
IMSLS_COVV_USER, float
v[],
0)
IMSLS_UPDATE, int
ny, float *y, float *z, float
*r (Input)
Perform computation of the update
equations.
ny: Number of observations for current update.
y: Array of length ny containing the observations.
z: ny by nb array containing the matrix relating the observations to the state vector in the observation equation.
r: ny by ny array containing
the matrix such that r * s2 is the variance-covariance matrix of
errors in the observation equation.
s2 is a positive unknown scalar. Only
elements in the upper triangle of r are referenced.
IMSLS_Z_COL_DIM, int
z_col_dim (Input)
Column dimension of
the matrix z.
Default:
z_col_dim = nb
IMSLS_R_COL_DIM,
int r_col_dim (Input)
Column
dimension of the matrix r.
Default:
r_col_dim = ny
IMSLS_T, float
*t (Input)
nb by nb transition matrix in the state equation
Default: t = identity matrix
IMSLS_T_COL_DIM, int r_col_dim
(Input)
Column dimension of the matrix
t.
Default:
t_col_dim = nb
IMSLS_Q, float *q
(Input)
nb by
nb matrix such
that q * s2 is the variance-covariance matrix of the
error vector in the state equation.
Default: There is no
error term in the state equation.
IMSLS_Q_COL_DIM, int q_col_dim
(Input)
Column dimension of the matrix
q.
Default:
q_col_dim = nb
IMSLS_TOLERANCE, float tolerance
(Input)
Tolerance used in determining linear dependence.
Default: tolerance = 100.0*imsls_f_machine(4)
IMSLS_V, float **v
(Output)
Address to a pointer v to an array of
length ny
containing the one-step-ahead prediction error.
IMSLS_V_USER, float v[]
(Output)
Storage for v is provided by the
user. See IMSLS_V.
IMSLS_COVV, float **covv
(Output)
The address to a pointer of size ny by ny containing a matrix
such that covv
* s2 is the variance-covariance matrix of
v.
IMSLS_COVV_USER, float covv[]
(Output)
Storage for covv is provided by
the user. See IMSLS_COVV.
Routine imsls_f_kalman is based on a recursive algorithm given by Kalman (1960), which has come to be known as the Kalman filter. The underlying model is known as the state-space model. The model is specified stage by stage where the stages generally correspond to time points at which the observations become available. The routine imsls_f_kalman avoids many of the computations and storage requirements that would be necessary if one were to process all the data at the end of each stage in order to estimate the state vector. This is accomplished by using previous computations and retaining in storage only those items essential for processing of future observations.
The notation used here follows that of Sallas and Harville (1981). Let yk (input in y using optional argument IMSLS_UPDATE) be the nk × 1 vector of observations that become available at time k. The subscript k is used here rather than t, which is more customary in time series, to emphasize that the model is expressed in stages k = 1, 2, ¼ and that these stages need not correspond to equally spaced time points. In fact, they need not correspond to time points of any kind. The observation equation for the state-space model is
yk = Zkbk + ek k = 1, 2, ¼
Here, Zk (input in z using optional argument IMSLS_UPDATE) is an nk × q known matrix and bk is the q × 1 state vector. The state vector bk is allowed to change with time in accordance with the state equation
bk1 = Tk1bk + wk1 k = 1, 2, ¼
starting with b1 = m1 + w1.
The change in the state vector from time k to
k + 1 is explained in part by the transition matrix
Tk+1 (the identity matrix
by default, or optionally input using IMSLS_T),
which is assumed known. It is assumed that the q-dimensional
wks
(k = 1,
2,…) are independently distributed multivariate normal with mean vector 0 and
variance-covariance matrix s2Qk, that the
nk-dimensional
eks (k = 1, 2,…)
are independently distributed multivariate normal with mean vector 0 and
variance-covariance matrix s2Rk, and that the
wks and eks are independent of
each other. Here, m1 is the mean of
b1 and is assumed known,
s2 is an unknown positive
scalar. Qk+1 (input in Q)
and Rk (input in R)
are assumed known.
Denote the estimator of the realization of the state vector bk given the observations y1, y2, …, yj by
By definition, the mean squared error matrix for
is
At the time of the k-th invocation, we have
and
Ck|k1, which were computed from the (k-1)-st invocation, input in b and covb, respectively. During the k-th invocation, routine imsls_f_kalman computes the filtered estimate
along with Ck|k. These quantities are given by the update equations:
where
and where
Here, vk (stored in v) is the one-step-ahead prediction error, and s2Hk is the variance-covariance matrix for vk. Hk is stored in covv. The “start-up values” needed on the first invocation of imsls_f_kalman are
and C1|0 = Q1 input via b and covb, respectively. Computations for the k-th invocation are completed by imsls_f_kalman computing the one-step-ahead estimate
along with Ck1|k given by the prediction equations:
If both the filtered estimates and one-step-ahead estimates are needed by the user at each time point, imsls_f_kalman can be invoked twice for each time point—first without IMSLS_T and IMSLS_Q to produce
and Ck|k, and second without IMSLS_UPDATE to produce
and Ck1|k (Without IMSLS_T and IMSLS_Q, the prediction equations are skipped. Without IMSLS_UPDATE, the update equations are skipped.).
Often, one desires the estimate of the state vector more than one-step-ahead, i.e., an estimate of
is needed where k > j + 1. At time j, imsls_f_kalman is invoked with IMSLS_UPDATE to compute
Subsequent invocations of imsls_f_kalman without IMSLS_UPDATE can compute
Computations for
and Ck|j assume the variance-covariance matrices of the errors in the observation equation and state equation are known up to an unknown positive scalar multiplier, s2. The maximum likelihood estimate of s2 based on the observations y1, y2, …, ym, is given by
where
N and SS are the input/output arguments n and ss.
If s2 is known, the Rks and Qks can be input as the variance-covariance matrices exactly. The earlier discussion is then simplified by letting s2 = 1.
In practice, the matrices Tk, Qk, and Rk are generally not completely known. They may be known functions of an unknown parameter vector q. In this case, imsls_f_kalman can be used in conjunction with an optimization program (see routine imsl_f_min_uncon_multivar, IMSL C/Math/Library, Chapter 8, “Optimization”) to obtain a maximum likelihood estimate of q. The natural logarithm of the likelihood function for y1, y2, …, ym differs by no more than an additive constant from
(Harvey 1981, page 14, equation 2.21).
Here,
(stored in alndet) is the natural logarithm of the determinant of V where s2V is the variance-covariance matrix of the observations.
Minimization of -2L(q, s2; y1, y2, ¼, ym) over all q and s2 produces maximum likelihood estimates. Equivalently, minimization of -2Lc(q; y1, y2, ¼, ym) where
produces maximum likelihood estimates
The minimization of -2Lc(q; y1, y2, ¼, ym) instead of -2L(q, s2; y1, y2, ¼, ym), reduces the dimension of the minimization problem by one. The two optimization problems are equivalent since
minimizes -2L(q, s2; y1, y2, ¼, ym) for all q, consequently,
can be substituted for s2 in L(q, s2; y1, y2, ¼, ym) to give a function that differs by no more than an additive constant from Lc(q; y1, y2, ¼, ym).
The earlier discussion assumed Hk to be nonsingular. If Hk is singular, a modification for singular distributions described by Rao (1973, pages 527–528) is used. The necessary changes in the preceding discussion are as follows:
1. Replace
by a generalized inverse.
2. Replace det(Hk) by the product of the nonzero eigenvalues of Hk.
3. Replace N by
Maximum likelihood estimation of parameters in the Kalman filter is discussed by Sallas and Harville (1988) and Harvey (1981, pages 111–113).
Function imsls_f_kalman
is used to compute the filtered estimates and one-step-ahead estimates for a
scalar problem discussed by Harvey (1981, pages
116–117). The observation
equation and state equation are given by
where the eks are identically and independently distributed normal with mean 0 and variance s2, the wks are identically and independently distributed normal with mean 0 and variance 4s2, and b1 is distributed normal with mean 4 and variance 16s2. Two invocations of imsls_f_kalman are needed for each time point in order to compute the filtered estimate and the one-step-ahead estimate. The first invocation does not use the optional arguments IMSLS_T and IMSLS_Q so that the prediction equations are skipped in the computations. The update equations are skipped in the computations in the second invocation.
This example also computes the one-step-ahead prediction
errors. Harvey (1981, page 117) contains a misprint for the value v4 that he gives as
1.197. The correct value of
v4 = 1.003 is computed by
imsls_f_kalman.
.
#include <stdio.h>
#include <imsls.h>
#define NB 1
#define NOBS 4
#define NY 1
void main()
{
int nb = NB, nobs = NOBS, ny = NY;
int ldcovb, ldcovv, ldq, ldr, ldt, ldz;
int i, iq, it, n, nout;
float alndet, b[NB], covb[NB][NB], covv[NY][NY],
q[NB][NB], r[NY][NY], ss,
t[NB][NB], tol, v[NY], y[NY], z[NY][NB];
float ydata[] = {4.4, 4.0, 3.5, 4.6};
z[0][0] = 1.0;
r[0][0] = 1.0;
q[0][0] = 4.0;
t[0][0] = 1.0;
b[0] = 4.0;
covb[0][0] = 16.0;
/* Initialize arguments for initial call to imsls_f_kalman. */
n = 0;
ss = 0.0;
alndet = 0.0;
printf("k/j b covb n ss alndet v covv\n");
for (i = 0; i < nobs; i++) {
/* Update */
y[0] = ydata[i];
imsls_f_kalman(nb, b, (float*)covb, &n, &ss, &alndet,
IMSLS_UPDATE, ny, y, z, r,
IMSLS_V_USER, v,
IMSLS_COVV_USER, covv,
0);
printf("%d/%d %8.3f %8.3f %d %8.3f %8.3f %8.3f %8.3f\n",
i, i, b[0], covb[0][0], n, ss, alndet, v[0], covv[0][0]);
/* Prediction */
imsls_f_kalman(nb, b, (float*)covb, &n, &ss, &alndet,
IMSLS_T, t,
IMSLS_Q, q,
0);
printf("%d/%d %8.3f %8.3f %d %8.3f %8.3f %8.3f %8.3f\n",
i+1, i, b[0], covb[0][0], n, ss, alndet, v[0], covv[0][0]);
}
}
k/j b covb n ss alndet v covv
0/0 4.376 0.941 1 0.009 2.833 0.400 17.000
1/0 4.376 4.941 1 0.009 2.833 0.400 17.000
1/1 4.063 0.832 2 0.033 4.615 -0.376 5.941
2/1 4.063 4.832 2 0.033 4.615 -0.376 5.941
2/2 3.597 0.829 3 0.088 6.378 -0.563 5.832
3/2 3.597 4.829 3 0.088 6.378 -0.563 5.832
3/3 4.428 0.828 4 0.260 8.141 1.003 5.829
4/3 4.428 4.828 4 0.260 8.141 1.003 5.829
Function imsls_f_kalman is used with routine imsl_f_min_uncon_multivar, (see IMSL C/Math/Library, Chapter 8, “Optimization”) to find a maximum likelihood estimate of the parameter q in a MA(1) time series represented by yk = ek - qek−1. Function imsls_f_random_arma (see IMSL C/Stat/Library, Chapter 12, “Random Number Generation”) is used to generate 200 random observations from an MA(1) time series with q = 0.5 and s2 = 1.
The MA(1) time series is cast as a state-space model of the following form (see Harvey 1981, pages 103–104, 112):
where the two-dimensional wks are independently distributed bivariate normal with mean 0 and variance σ2 Qk and
The warning error that is printed as part of the output is not serious and indicates that imsl_f_min_uncon_multivar (See Chapter 8, “Optimization” in the math manual) is generally used for multi-parameter minimization.
#include <stdio.h>
#include <math.h>
#include <imsls.h>
#define NOBS 200
#define NTHETA 1
#define NB 2
#define NY 1
float fcn(int ntheta, float theta[]);
float *ydata;
void main ()
{
int lagma[1];
float pma[1];
float *theta;
imsls_random_seed_set(123457);
pma[0] = 0.5;
lagma[0] = 1;
ydata = imsls_f_random_arma(200, 0, NULL, 1, pma,
IMSLS_ACCEPT_REJECT_METHOD,
IMSLS_NONZERO_MALAGS,
lagma,
0);
theta = imsl_f_min_uncon_multivar(fcn, NTHETA, 0);
printf("* * * Final Estimate for THETA * * *\n");
printf("Maximum likelihood estimate, THETA = %f\n", theta[0]);
}
float fcn(int ntheta, float theta[])
{
int i, n;
float res, ss, alndet;
float t[] = {0.0, 1.0, 0.0, 0.0};
float z[] = {1.0, 0.0};
float q[NB][NB], r[NY][NY], b[NB], covb[NB][NB], y[NY];
if (fabs(theta[0]) > 1.0) {
res = 1.0e10;
} else {
q[0][0] = 1.0;
q[0][1] = -theta[0];
q[1][0] = -theta[0];
q[1][1] = theta[0]*theta[0];
r[0][0] = 0.0;
b[0] = 0.0;
b[1] = 0.0;
covb[0][0] = 1.0 + theta[0]*theta[0];
covb[0][1] = -theta[0];
covb[1][0] = -theta[0];
covb[1][1] = theta[0]*theta[0];
n = 0;
ss = 0.0;
alndet = 0.0;
for (i = 0; i<NOBS; i++) {
y[0] = ydata[i];
imsls_f_kalman(NB, b, (float*)covb, &n, &ss, &alndet,
IMSLS_UPDATE, NY, y, z, r,
IMSLS_Q, q,
IMSLS_T, t,
0);
}
res = n*log(ss/n) + alndet;
}
return(res);
}
*** WARNING_IMMEDIATE Error from imsl_f_min_uncon_multivar. This routine
*** may be inefficient for a problem of size "n" = 1.
*** WARNING_IMMEDIATE Error from imsl_f_min_uncon_multivar. The last global
*** step failed to locate a lower point than the current X value.
*** The current X may be an approximate local minimizer and no more
*** accuracy is possible or the step tolerance may be too large
*** where "step_tol" = 2.422181e-05 is given.
* * * Final Estimate for THETA * * *
Maximum likelihood estimate, THETA = 0.453256
Visual Numerics, Inc. PHONE: 713.784.3131 FAX:713.781.9260 |