arma


Computes least-square estimates of parameters for an ARMA model.

Synopsis

arma (z, p, q)

Required Arguments

float z[] (Input)
Array of length nObservations containing the observations.
int p (Input)
Number of autoregressive parameters.
int q (Input)
Number of moving average parameters.

Return Value

An array of length 1 + p + q with the estimated constant, AR, and MA parameters. If noConstant is specified, the 0-th element of this array is 0.0.
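
For reference, here is a minimal sketch of how the returned array can be unpacked, given the layout described above (z, p, and q as in the synopsis; the slicing is illustrative and not part of the library API):

from pyimsl.stat.arma import arma

# Unpack the return value of arma according to the documented layout.
parameters = arma(z, p, q)
constant = parameters[0]           # estimated constant (0.0 if noConstant was used)
ar_params = parameters[1:p + 1]    # p autoregressive parameter estimates
ma_params = parameters[p + 1:]     # q moving average parameter estimates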

Optional Arguments

noConstant

or

constant
If noConstant is specified, the time series is not centered about its mean, meanEstimate. If constant (the default) is specified, the time series is centered about its mean.
arLags, int[] (Input)

Array of length p containing the order of the autoregressive parameters. The elements of arLags must be greater than or equal to 1.

Default: arLags = [1, 2, …, p]

maLags, int[] (Input)

Array of length q containing the order of the moving average parameters. The maLags elements must be greater than or equal to 1.

Default: maLags = [1, 2, …, q]

methodOfMoments

or

leastSquares

If methodOfMoments is specified, the autoregressive and moving average parameters are estimated by a method of moments procedure. If leastSquares is specified, the autoregressive and moving average parameters are estimated by a least-squares procedure.

Default: methodOfMoments is used.

backcasting, int maxbc, float tolerance (Input)

If backcasting is specified, maxbc is the maximum length of backcasting and must be greater than or equal to 0. Argument tolerance is the tolerance level used to determine convergence of the backcast algorithm. Typically, tolerance is set to a fraction of an estimate of the standard deviation of the time series.

Default: maxbc = 10; tolerance = 0.01 × standard deviation of z.

relativeError, float (Input)

Stopping criterion for use in the nonlinear equation solver used in both the method of moments and least-squares algorithms.

Default: relativeError = 100 × machine(4).

See documentation for function machine (Chapter 15, Utilities).

maxIterations, int (Input)

Maximum number of iterations allowed in the nonlinear equation solver used in both the method of moments and least-squares algorithms.

Default: maxIterations = 200.

meanEstimate, float (Input or Input/Output)

On input, meanEstimate is an initial estimate of the mean of the time series z. On return, meanEstimate contains an update of the mean.

If noConstant and leastSquares are specified, meanEstimate is not used in parameter estimation.

initialEstimates, float ar[], float ma[] (Input)
If specified, ar is an array of length p containing preliminary estimates of the autoregressive parameters, and ma is an array of length q containing preliminary estimates of the moving average parameters; otherwise, these are computed internally. initialEstimates is only applicable if leastSquares is also specified.
residual (Output)

An array of length na = nObservations - max(arLags[i]) + maxbc containing the residuals (including backcasts) at the final parameter estimate point. The residuals occupy the first nObservations - max(arLags[i]) + nb locations, where nb is the number of values backcast.

paramEstCov (Output)
An array containing the variance-covariance matrix of the estimated ARMA parameters and (optionally) of the estimated mean of series z. The size of the array is np × np, where np = p + q + 1 if z is centered about z_mean, and np = p + q if z is not centered. The ordering of variables in paramEstCov is meanEstimate, ar, and ma. Argument np must be 1 or larger.
autocov (Output)
An array of length p + q + 2 containing the variance and autocovariances of the time series z. Argument autocov[0] contains the variance of the series z. Argument autocov[k] contains the autocovariance at lag k, where k = 1, …, p + q + 1.
ssResidual (Output)
If specified, ssResidual contains the sum of squares of the random shock, ssResidual = residual[1]² + … + residual[na]², where na is the number of residuals.
varNoise (Output)
If specified, varNoise contains the innovation variance of the series.
armaInfo (Output)
A structure that contains information necessary in the call to armaForecast.

Description

Function arma computes estimates of parameters for a nonseasonal ARMA model given a sample of observations, \{W_t\}, for t = 1, 2, \ldots, n, where n = nObservations. There are two methods, method of moments and least squares, from which to choose. The default is method of moments.

The method of moments algorithm is used when the optional argument methodOfMoments is specified (or when neither method argument is given); the least-squares algorithm is used when leastSquares is specified. If the least-squares algorithm is chosen, the preliminary estimates are the method of moments estimates by default; alternatively, initial estimates can be supplied with the optional argument initialEstimates. The following table lists the optional arguments that apply to the method of moments algorithm, the least-squares algorithm, or both:

Method of Moments Only    Least Squares Only          Both Method of Moments and Least Squares
methodOfMoments           leastSquares                relativeError
                          constant (or noConstant)    maxIterations
                          arLags                      meanEstimate
                          maLags                      autocov
                          backcasting                 armaInfo
                          initialEstimates
                          residual
                          paramEstCov
                          ssResidual
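
As a rough sketch of how these arguments combine in a call (the leastSquares, relativeError, and maxIterations keywords appear in the examples below; passing arLags and maLags as keyword lists is an assumption made here for illustration):

# Hedged sketch: least-squares estimation with explicit stopping criteria
# and (assumed keyword form) explicit AR/MA lag orders.
parameters = arma(z, p, q,
                  leastSquares=True,
                  relativeError=1.0e-5,
                  maxIterations=200,
                  arLags=[1, 2],   # assumed keyword form for the AR lag orders
                  maLags=[1])      # assumed keyword form for the MA lag orders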

Method of Moments Estimation

Suppose the time series {Zt} is generated by an ARMA(p,q) model of the form

\phi(B)Z_t = \theta_0 + \theta(B)A_t \quad \text{for } t \in \{0, \pm 1, \pm 2, \ldots\}

Let \hat{\mu} = meanEstimate be the estimate of the mean μ of the time series \{Z_t\}, where \hat{\mu} equals the following:

\begin{split}\hat{\mu} = \begin{cases} \mu & \text{for } \mu \text{ known} \\ \tfrac{1}{n} \sum\limits_{t=1}^{n} Z_t & \text{for } \mu \text{ unknown} \end{cases}\end{split}

The autocovariance function is estimated by

\hat{\sigma}(k) = \frac{1}{n} \sum_{t=1}^{n-k} \left(Z_t - \hat{\mu}\right) \left(Z_{t+k} - \hat{\mu}\right)

for k=0,1,\ldots,K, where K=p+q. Note that \hat{\sigma}(0) is an estimate of the sample variance.
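
As a plain NumPy sketch of this estimator (illustrative only, not the library's internal code):

import numpy as np

def sample_autocov(z, K):
    # sigma_hat(k) = (1/n) * sum_{t=1}^{n-k} (Z_t - mu_hat) * (Z_{t+k} - mu_hat)
    z = np.asarray(z, dtype=float)
    n = z.size
    mu_hat = z.mean()
    d = z - mu_hat
    return np.array([np.dot(d[:n - k], d[k:]) / n for k in range(K + 1)])

# sample_autocov(z, p + q)[0] is the sample-variance estimate sigma_hat(0).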

Given the sample autocovariances, the function computes the method of moments estimates of the autoregressive parameters using the extended Yule-Walker equations as follows:

\hat{\Sigma} \hat{\phi} = \hat{\sigma}

where

\begin{split}\begin{array}{ll} \hat{\phi} = \left( \hat{\phi}_1, \ldots, \hat{\phi}_p \right)^T & \\ \hat{\Sigma}_{ij} = \hat{\sigma} \left( | q + i - j | \right), & i,j = 1, \ldots, p \\ \hat{\sigma}_i = \hat{\sigma} (q + i), & i = 1, \ldots, p \end{array}\end{split}

The overall constant \theta_0 is estimated by the following:

\begin{split}\hat{\theta}_0 = \begin{cases} \hat{\mu} & \text{for } p = 0 \\ \hat{\mu}\left(1 - \sum\limits_{i=1}^{p} \hat{\phi}_i\right) & \text{for } p > 0 \\ \end{cases}\end{split}
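
Continuing the sketch, the extended Yule-Walker system and the constant estimate can be formed as follows (a sketch only, written against the notation above):

import numpy as np

def yule_walker_ar(acov, p, q):
    # Solve Sigma_hat * phi_hat = sigma_hat, where
    # Sigma_hat[i, j] = sigma_hat(|q + i - j|) and sigma_hat[i] = sigma_hat(q + i),
    # given sample autocovariances acov[0], ..., acov[p + q].
    Sigma = np.array([[acov[abs(q + i - j)] for j in range(1, p + 1)]
                      for i in range(1, p + 1)])
    rhs = np.array([acov[q + i] for i in range(1, p + 1)])
    return np.linalg.solve(Sigma, rhs)

# Usage with the sample_autocov sketch above:
#   acov = sample_autocov(z, p + q)
#   phi_hat = yule_walker_ar(acov, p, q)
#   mu_hat = float(np.mean(z))
#   theta0_hat = mu_hat if p == 0 else mu_hat * (1.0 - phi_hat.sum())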

The moving average parameters are estimated based on a system of nonlinear equations given K = p + q + 1 autocovariances, \sigma(k) for k=1,\ldots,K, and p autoregressive parameters \phi_i for i=1,\ldots,p.

Let Z'_t=\phi(B)Z_t. The autocovariances of the derived moving average process Z'_t=\theta(B)A_t are estimated by the following relation:

\begin{split}\hat{\sigma}'(k) = \begin{cases} \hat{\sigma}(k) & \text{for } p = 0 \\ \displaystyle\sum_{i=0}^{p} \displaystyle\sum_{j=0}^{p} \hat{\phi}_i \hat{\phi}_j \left( \hat{\sigma} \left( | k + i - j | \right) \right) & \text{for } p \geq 1, \hat{\phi}_0 \equiv -1 \end{cases}\end{split}

The iterative procedure for determining the moving average parameters is based on the relation

\begin{split}\sigma(k) = \begin{cases} \left(1 + \theta_1^2 + \ldots + \theta_q^2\right) \sigma_A^2 & \text{for } k = 0 \\ \left(-\theta_k + \theta_1 \theta_{k+1} + \ldots + \theta_{q-k} \theta_{q}\right) \sigma_A^2 & \text{for } k \geq 1 \\ \end{cases}\end{split}

where σ(k) denotes the autocovariance function of the original Z_t process.
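
For intuition, the right-hand side of this relation can be evaluated directly for a given θ and σ_A² (a sketch, not library code):

import numpy as np

def ma_autocov(theta, sigma_a2, k):
    # sigma(0) = (1 + theta_1^2 + ... + theta_q^2) * sigma_A^2
    # sigma(k) = (-theta_k + theta_1*theta_{k+1} + ... + theta_{q-k}*theta_q) * sigma_A^2
    theta = np.asarray(theta, dtype=float)
    q = theta.size
    if k == 0:
        return (1.0 + np.dot(theta, theta)) * sigma_a2
    if k > q:
        return 0.0
    return (-theta[k - 1] + np.dot(theta[:q - k], theta[k:])) * sigma_a2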

Let \tau=\left( \tau_0,\tau_1,\ldots,\tau_q \right)^T and f=(f_0,f_1,\ldots,f_q)^T, where

\begin{split}\tau_j = \begin{cases} \sigma_A & \text{for } j = 0 \\ -\theta_j \tau_0 & \text{for } j = 1, \ldots, q \end{cases}\end{split}

and

f_j = \sum_{i=0}^{q-j} \tau_i \tau_{i+j} - \hat{\sigma}'(j) \phantom{...} \text{for } j = 0,1, \ldots, q

Then, the value of \tau at the (i + 1)-th iteration is determined by the following:

\tau^{i+1} = \tau^i - \left(T^i\right)^{-1} f^i

The estimation procedure begins with the initial value

\tau^0 = \left(\sqrt{\hat{\sigma}'(0)}, 0, \ldots, 0\right)^T

and terminates at iteration i when either \|f^i\| is less than relativeError or i equals maxIterations. The moving average parameter estimates are obtained from the final estimate of \tau by setting

\hat{\theta}_j = - \tau_j / \tau_0 \text{ for } j = 1, \ldots, q

The random shock variance is estimated by the following:

\begin{split}\hat{\sigma}_A^2 = \begin{cases} \hat{\sigma}(0) - \sum\limits_{i=1}^{p} \hat{\phi}_i \hat{\sigma}(i) & \text{for } q = 0 \\ \tau_0^2 & \text{for } q \geq 1 \\ \end{cases}\end{split}
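
A compact sketch of this iteration follows, assuming T^i is the Jacobian of f at τ^i (approximated below by finite differences); it is illustrative only and not the library's internal algorithm:

import numpy as np

def estimate_ma(acov_prime, q, rel_err=1e-10, max_iter=200):
    # acov_prime[j] = sigma_hat'(j), j = 0, ..., q (derived MA-process autocovariances)
    def f(tau):
        return np.array([np.dot(tau[:q + 1 - j], tau[j:]) - acov_prime[j]
                         for j in range(q + 1)])

    tau = np.zeros(q + 1)
    tau[0] = np.sqrt(acov_prime[0])          # tau^0 = (sqrt(sigma_hat'(0)), 0, ..., 0)^T
    for _ in range(max_iter):
        fi = f(tau)
        if np.linalg.norm(fi) < rel_err:
            break
        # Finite-difference Jacobian standing in for T^i (an assumption of this sketch).
        eps = 1e-7
        T = np.empty((q + 1, q + 1))
        for j in range(q + 1):
            step = np.zeros(q + 1)
            step[j] = eps
            T[:, j] = (f(tau + step) - fi) / eps
        tau = tau - np.linalg.solve(T, fi)

    theta_hat = -tau[1:] / tau[0]            # theta_hat_j = -tau_j / tau_0
    sigma_a2_hat = tau[0] ** 2               # random shock variance estimate for q >= 1
    return theta_hat, sigma_a2_hat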

See Box and Jenkins (1976, pp. 498-500) for a description of a function that performs similar computations.

Least-squares Estimation

Suppose the time series \{Z_t\} is generated by a nonseasonal ARMA model of the form,

φ(B) (Z_t - μ) = θ(B)A_t for t ∈ \{0, ±1, ±2, …\}

where B is the backward shift operator, μ is the mean of Z_t, and

\begin{split}\begin{array}{l} \phi(B) = 1 - \phi_1B^{l_\phi(1)} - \phi_2B^{l_\phi(2)} - \ldots - \phi_pB^{l_\phi(p)} \phantom{.....} \text{for } p \geq 0 \\ \theta(B) = 1 - \theta_1B^{l_\theta(1)} - \theta_2B^{l_\theta(2)} - \ldots - \theta_qB^{l_\theta(q)} \phantom{.....} \text{for } q \geq 0 \\ \end{array}\end{split}

with p autoregressive and q moving average parameters. Without loss of generality, the following is assumed:

1 ≤ l_φ(1) ≤ l_φ(2) ≤ … ≤ l_φ(p)
1 ≤ l_θ(1) ≤ l_θ(2) ≤ … ≤ l_θ(q)

so that the nonseasonal ARMA model is of order (p', q'), where p' = l_\phi(p) and q' = l_\theta(q). Note that the usual hierarchical model assumes the following:

l_φ (i) = i, 1 ≤ i ≤ p
l_θ (j) = j, 1 ≤ j ≤ q

Consider the sum-of-squares function

S_T(\mu, \phi, \theta) = \sum_{t=-T+1}^{n} \left[A_t\right]^2

where

\left[A_t\right] = E\left[A_t | (\mu, \phi, \theta, Z)\right]

and T is the backward origin. The random shocks {A_t} are assumed to be independent and identically distributed

N\left(0, \sigma_A^2\right)

random variables. Hence, the log-likelihood function is given by

l\left(\mu, \phi, \theta, \sigma_A\right) = f(\mu, \phi, \theta) - n \ln \left(\sigma_A\right) - \frac{S_T(\mu, \phi, \theta)}{2 \sigma_A^2}

where f(\mu,\varphi,\theta) is a function of μ, φ, and θ.

For T=0, the log-likelihood function is conditional on the past values of both Z_t and A_t required to initialize the model. The method of selecting these initial values usually introduces transient bias into the model (Box and Jenkins 1976, pp. 210-211). For T=\infty, this dependency vanishes, and the estimation problem becomes one of maximizing the unconditional log-likelihood function. Box and Jenkins (1976, p. 213) argue that

S_{\infty}(\mu, \phi, \theta) / \left(2 \sigma_A^2\right)

dominates

l\left(\mu, \phi, \theta, \sigma_A^2\right)

The parameter estimates that minimize the sum-of-squares function are called least-squares estimates. For large n, the unconditional least-squares estimates are approximately equal to the maximum likelihood estimates.

In practice, a finite value of T is sufficient to approximate the unconditional sum-of-squares function. The values of \left[A_t\right] needed to compute the unconditional sum of squares are computed iteratively, with initial values of Z_t obtained by back forecasting (backcasting). The residuals (including backcasts), the estimate of the random shock variance, and the covariance matrix of the final parameter estimates are also computed. ARIMA parameters can be estimated by using function difference together with arma.
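
To make the conditional (T = 0) case concrete, here is a small sketch of the residual recursion behind the sum-of-squares function, with pre-sample values and shocks set to zero rather than backcast (illustrative only; arma itself backcasts as described above):

import numpy as np

def conditional_residuals(z, mu, phi, theta):
    # A_t = (Z_t - mu) - sum_i phi_i (Z_{t-i} - mu) + sum_j theta_j A_{t-j},
    # assuming the usual hierarchical lags l_phi(i) = i and l_theta(j) = j,
    # with unavailable pre-sample terms treated as zero (conditional, T = 0).
    z = np.asarray(z, dtype=float)
    w = z - mu
    p, q = len(phi), len(theta)
    a = np.zeros_like(w)
    for t in range(w.size):
        ar_part = sum(phi[i] * w[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma_part = sum(theta[j] * a[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        a[t] = w[t] - ar_part + ma_part
    return a

# Conditional sum of squares for given parameters:
#   ss = np.sum(conditional_residuals(z, mu, phi, theta) ** 2)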

Examples

Example 1

Consider the Wolfer Sunspot Data (Anderson 1971, p. 660) consisting of the number of sunspots observed each year from 1749 through 1924. The data set for this example consists of the number of sunspots observed from 1770 through 1869. Function arma is used to compute the method of moments estimates

\hat{\theta}_0, \hat{\phi}_1, \hat{\phi}_2, \text{ and } \hat{\theta}_1

for the ARMA(2, 1) model

z_t = \theta_0 + \phi_1 z_{t-1} + \phi_2 z_{t-2} - \theta_1 A_{t-1} + A_t

where the errors A_t are independently normally distributed with mean zero and variance

\sigma_A^2

from __future__ import print_function
from numpy import empty
from pyimsl.stat.arma import arma
from pyimsl.stat.dataSets import dataSets

p = 2                    # number of autoregressive parameters
q = 1                    # number of moving average parameters
n_observations = 100
z = empty(n_observations)
relative_error = 0.0
max_iterations = 0

# Wolfer sunspot data (data set 2); use the 100 observations for 1770-1869.
w = dataSets(2)
for i in range(n_observations):
    z[i] = w[21 + i][1]

# Method of moments estimation (the default).
parameters = arma(z, p, q,
                  relativeError=relative_error,
                  maxIterations=max_iterations)

print("AR estimates are %11.4f and %11.4f." % (parameters[1], parameters[2]))
print("MA estimate is %11.4f." % parameters[3])

Output

AR estimates are      1.2443 and     -0.5751.
MA estimate is     -0.1241.

Example 2

The data for this example are the same as those for Example 1. Preliminary method of moments estimates are computed by default, and the method of least squares is used to find the final estimates.

from __future__ import print_function
from numpy import empty
from pyimsl.stat.arma import arma
from pyimsl.stat.dataSets import dataSets

p = 2                    # number of autoregressive parameters
q = 1                    # number of moving average parameters
n_observations = 100
z = empty(n_observations)

# Wolfer sunspot data (data set 2); use the 100 observations for 1770-1869.
w = dataSets(2)
for i in range(n_observations):
    z[i] = w[21 + i][1]

# Least-squares estimation; preliminary estimates come from the method of moments.
parameters = arma(z, p, q,
                  leastSquares=True)

print("AR estimates are %11.4f and %11.4f." % (parameters[1], parameters[2]))
print("MA estimate is %11.4f." % parameters[3])

Output

AR estimates are      1.5313 and     -0.8944.
MA estimate is     -0.1320.

Warning Errors

IMSLS_LEAST_SQUARES_FAILED Least-squares estimation of the parameters has failed to converge. Solution from last iteration is returned. The estimates of the parameters at the last iteration may be used as new starting values.

Fatal Errors

IMSLS_TOO_MANY_CALLS The number of calls to the function has exceeded “itmax”(“n”+1) = %(i1). The user may try a new initial guess.
IMSLS_INCREASE_ERRREL The bound for the relative error, “errrel” = %(r1), is too small. No further improvement in the approximate solution is possible. The user should increase “errrel”.
IMSLS_NEW_INITIAL_GUESS The iteration has not made good progress. The user may try a new initial guess.