Usage Notes

The functions in this chapter assume the time series does not contain any missing values. If missing values are present, they should be set to NaN (see Chapter 15, Utilities, function machine), and the function will return an appropriate error message. To enable fitting of the model, the missing values must be replaced by appropriate estimates.

Model Construction and Evaluation Utilities

A major component of the model identification step concerns determining if a given time series is stationary. The sample correlation functions computed by functions autocorrelation, crosscorrelation, multiCrosscorrelation, and partialAutocorrelation may be used to diagnose the presence of nonstationarity in the data, as well as to indicate the type of transformation required to induce stationarity. The family of power transformations provided by function boxCoxTransform coupled with the ability to difference the transformed data using function difference affords a convenient method of transforming a wide class of nonstationary time series to stationarity.
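The library's own boxCoxTransform and difference calls are not reproduced here. As a hedged illustration of the same pipeline in Python (scipy.stats.boxcox is a standard implementation of the power family; the data and lag choices below are arbitrary)::

    import numpy as np
    from scipy import stats

    # Synthetic nonstationary series: exponential trend plus noise
    rng = np.random.default_rng(0)
    z = np.exp(0.02 * np.arange(200)) + rng.normal(0.0, 0.1, 200) + 2.0

    # Box-Cox power transformation to stabilize the variance;
    # scipy estimates lambda by maximum likelihood when it is not given.
    w, lam = stats.boxcox(z)

    # First difference (d = 1) to remove the remaining trend: w_t - w_{t-1}
    w_diff = np.diff(w, n=1)

    # Sample autocorrelation function of the differenced series, lags 0..20
    def acf(x, nlags):
        x = x - x.mean()
        c0 = np.dot(x, x) / len(x)
        return np.array([np.dot(x[:len(x) - k], x[k:]) / (len(x) * c0)
                         for k in range(nlags + 1)])

    print(acf(w_diff, 20))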

The “raw” data, transformed data, and sample correlation functions also provide insight into the nature of the underlying model. Typically, this information is displayed in graphical form via time series plots, plots of the lagged data, and various correlation function plots.

The observed time series may also be compared with time series generated from various theoretical models to help identify possible candidates for model fitting. The function randomArma (see Chapter 12, Random Number Generation) may be used to generate a time series according to a specified autoregressive moving average model.
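The exact signature of randomArma is not shown here, but the recursion it implements can be sketched directly. A minimal numpy simulation of an ARMA(2, 1) series under the Box-Jenkins sign convention defined in the next subsection (coefficients chosen arbitrarily for illustration)::

    import numpy as np

    def simulate_arma(phi, theta, n, sigma=1.0, burn=200, seed=0):
        """Simulate w_t = sum_i phi_i w_{t-i} + a_t - sum_j theta_j a_{t-j}."""
        rng = np.random.default_rng(seed)
        p, q = len(phi), len(theta)
        a = rng.normal(0.0, sigma, n + burn)
        w = np.zeros(n + burn)
        for t in range(max(p, q), n + burn):
            ar = sum(phi[i] * w[t - 1 - i] for i in range(p))
            ma = sum(theta[j] * a[t - 1 - j] for j in range(q))
            w[t] = ar + a[t] - ma
        return w[burn:]  # discard burn-in so the start-up transient is gone

    w = simulate_arma(phi=[0.6, -0.2], theta=[0.4], n=300)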

ARIMA Models

Once the data are transformed to stationarity, a tentative model in the time domain is often proposed, and parameter estimation, diagnostic checking, and forecasting are performed.

ARIMA Model (Autoregressive Integrated Moving Average)

A small, yet comprehensive, class of stationary time-series models consists of the nonseasonal ARMA processes defined by

\[\phi(B)(W_t - \mu) = \theta(B)A_t, \qquad t \in Z\]

where \(Z=\{\ldots,-2,-1,0,1,2,\ldots\}\) denotes the set of integers, B is the backward shift operator defined by \(B^k W_t=W_{t-k}\), μ is the mean of \(W_t\), and the following equations are true:

\[\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p, \quad p \geq 0\]
\[\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q, \quad q \geq 0\]

The model is of order (p, q) and is referred to as an ARMA (p, q) model.

An equivalent version of the ARMA (p, q) model is given by

\[\phi(B) W_t = \theta_0 + \theta(B)A_t, \qquad t \in Z\]

where \(\theta_0\) is an overall constant defined by the following:

\[\theta_0 = \mu \left(1 - \sum_{i=1}^{p} \phi_i\right)\]

See Box and Jenkins (1976, pp. 92-93) for a discussion of the meaning and usefulness of the overall constant.
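For example, an AR(1) model with \(\phi_1 = 0.5\) and mean \(\mu = 10\) has overall constant \(\theta_0 = 10\,(1 - 0.5) = 5\).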

If the “raw” data, \(\{Z_t\}\), are homogeneous and nonstationary, then differencing using difference induces stationarity, and the model is called ARIMA (AutoRegressive Integrated Moving Average). Parameter estimation is performed on the stationary time series \(W_t=\nabla^d Z_t\), where \(\nabla^d=(1-B)^d\) is the backward difference operator with period 1 and order d, \(d>0\).

Typically, the method of moments is selected by including argument methodOfMoments in a call to function arma to compute preliminary parameter estimates. These estimates can then be used as initial values in the least-squares procedure by including argument leastSquares in a call to function arma; alternatively, initial estimates supplied by the user can be used. The least-squares procedure can compute conditional or unconditional least-squares estimates of the parameters, depending on the choice of the backcasting length. The parameter estimates from either the method-of-moments or least-squares procedure can be input to function armaForecast through the armaInfo structure. The functions for preliminary parameter estimation, least-squares parameter estimation, and forecasting follow the approach of Box and Jenkins (1976, Programs 2-4, pp. 498-509).
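The arma and armaForecast signatures themselves are not reproduced here. For a rough picture of the same fit-then-forecast workflow, a hedged Python analogue using statsmodels (which exposes maximum likelihood rather than the method-of-moments/least-squares pair; an illustration, not this library's API)::

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(1)
    y = rng.normal(size=200).cumsum()  # synthetic series for illustration

    # Fit an ARIMA(1, 1, 1): differencing order d = 1 handles the trend,
    # then an ARMA(1, 1) is estimated on the differenced series.
    result = ARIMA(y, order=(1, 1, 1)).fit()

    # Forecast the next 10 observations from the fitted model
    print(result.forecast(steps=10))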

Regression in Autoregressive Integrated Moving Average

There may be one or more external time series that relate to the time series of interest and that may be useful in improving forecasts. Function regressionArima allows for the inclusion of one or more regression time series in the above ARIMA model. That is, if there are r time series \(\{X_{i,t}, i=1,\ldots,r\}\) associated with a time series \(Y_t\), the regression ARIMA model (integrated of order d) is

\[W_t = \nabla^d Z_t\]

where

\[Z_t = Y_t - \sum_{i=1}^{r} \beta_i X_{i,t}\]

That is, \(Z_t\) is the residual (indexed by t) of the regression of \(Y_t\) on \(\{X_{i,t},i=1,\ldots,r\}\).
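regressionArima's interface is likewise not shown here; conceptually this is regression with ARIMA errors. A hedged statsmodels analogue, where exog carries the r regression series (all values synthetic)::

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(2)
    n = 200
    x = rng.normal(size=(n, 2))                  # r = 2 external regressors
    beta = np.array([1.5, -0.8])
    y = x @ beta + rng.normal(size=n).cumsum()   # regression plus integrated noise

    # Regression with ARIMA(1, 1, 0) errors: the residual
    # Z_t = Y_t - sum_i beta_i X_i,t is modeled as an integrated AR process.
    result = ARIMA(y, exog=x, order=(1, 1, 0)).fit()
    print(result.params)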

Automatic ARIMA Selection and Fitting Utilities

A popular criterion for comparing autoregressive moving average (ARMA) models with different lags is a measure known as Akaike’s Information Criterion (AIC). The AIC for a univariate ARMA series is calculated as:

\[\mathit{AIC} = -2 \cdot \ln (L) + 2r\]

where L is the maximized likelihood for the fitted model and \(r=p+q+1\) is the number of parameters in the ARMA model. To use the criterion, several choices of p and q are fit to a time series, and the fit with the smallest AIC is considered best. The function autoUniAr uses the AIC criterion to select the best-fitting AR model. The function autoArima performs a more comprehensive search, considering not only the ARMA parameters but also the appropriate Box-Cox transformation, degree of differencing, and seasonal adjustment; it also filters the data for outliers by calling tsOutlierIdentification.
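A minimal sketch of the AIC comparison itself (not autoArima's full search) that tries a small grid of (p, q) orders with statsmodels and keeps the smallest AIC; note that statsmodels' parameter count may differ slightly from \(r = p + q + 1\)::

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(3)
    y = rng.normal(size=300)          # stand-in for a stationary series

    best = None
    for p in range(3):
        for q in range(3):
            aic = ARIMA(y, order=(p, 0, q)).fit().aic   # -2 ln(L) + 2r
            if best is None or aic < best[0]:
                best = (aic, p, q)

    print("best (AIC, p, q):", best)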

The function autoParm uses a second criterion, called “Minimum Description Length” or MDL, to automatically fit piecewise AR models to a time series with structural breaks (i.e., a potentially non-stationary time series having stationary segments).

The MDL is defined as

\[\begin{split}\begin{array}{l} \mathit{MDL}\left(m, \tau_1, \ldots \tau_m, p_1, \ldots, p_{m+1}\right) \\ \phantom{.....} = \ln m+(m-1) \ln n + \sum\limits_{j=1}^{m+1} \frac{2+p_j}{2} \ln n_j - \ln L, \\ \end{array}\end{split}\]

where m is the number of structural breaks in the series, \(\left\{ \tau_1, \tau_2, \ldots, \tau_m \right\}\) are the locations of the breaks, \(n_j\) is the number of observations in the j‑th segment, \(p_j\) is the order of the AR model fit to the j‑th segment, and L is the combined likelihood over all segments. autoParm also allows the choice to use the AIC criterion,

\[\begin{split}\begin{aligned} \mathit{AIC}\left( m, \tau_1, \ldots, \tau_m, p_1, \ldots, p_{m+1} \right) &= 2 (\textit{number of parameters}) - 2 \ln L \\ &= 2 \left( 1 + m + \sum_{j=1}^{m+1} \left( 2 + p_j \right) \right) - 2 \ln L \end{aligned}\end{split}\]
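A sketch of evaluating the MDL for one fixed segmentation (autoParm's search over break locations and orders is omitted). Segment log-likelihoods here come from Gaussian least-squares AR fits, an assumption made for illustration; at least one break (m >= 1) is assumed, matching the ln m term::

    import numpy as np

    def ar_loglik(x, p):
        """Gaussian log-likelihood of an AR(p) least-squares fit to segment x (p >= 1)."""
        X = np.column_stack([x[p - i - 1:len(x) - i - 1] for i in range(p)])
        y = x[p:]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        s2 = (y - X @ coef).var()          # MLE of the innovation variance
        return -0.5 * len(y) * (np.log(2 * np.pi * s2) + 1)

    def mdl(segments, orders):
        m = len(segments) - 1              # number of structural breaks
        n = sum(len(s) for s in segments)
        logL = sum(ar_loglik(s, p) for s, p in zip(segments, orders))
        penalty = sum((2 + p) / 2 * np.log(len(s))
                      for s, p in zip(segments, orders))
        return np.log(m) + (m - 1) * np.log(n) + penalty - logL

    # Example: one break at t = 150, AR(2) before and AR(1) after
    series = np.random.default_rng(4).normal(size=300)
    print(mdl(np.split(series, [150]), orders=[2, 1]))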

Exponential Smoothing Methods

Exponential smoothing approximates the value of a time series at time t with a weighted average of previous values, with weights defined in such a way that they decay exponentially over time.

The weights can be determined by a smoothing parameter α and the relation,

\[\hat{y}_t = \alpha y_t + (1 - \alpha)\hat{y}_{t-1}\]
\[\Rightarrow\]
\[\hat{y}_t = \sum_{j=0}^{t-1} \alpha(1-\alpha)^j y_{t-j} = \sum_{j=0}^{t-1} w_j y_{t-j}\]

The parameter α is on the interval (0,1) and controls the rate of decay. For values close to 1, the weights decay rapidly; for values close to 0, the influence of a past value persists for some time. Exponential smoothing as a forecasting procedure is widely used and largely effective for short-term, mean-level forecasting. With the addition of a term for linear trend and terms or factors for seasonal patterns, exponential smoothing is an intuitive procedure for many series that arise in business applications. The function hwTimeSeries performs the Holt-Winters method, also known as triple exponential smoothing, and allows for either an additive or a multiplicative seasonal effect.
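A minimal sketch of single exponential smoothing per the recursion above (hwTimeSeries's Holt-Winters extension adds the trend and seasonal terms, not shown here)::

    import numpy as np

    def exp_smooth(y, alpha):
        """Single exponential smoothing: yhat_t = alpha*y_t + (1 - alpha)*yhat_{t-1}."""
        yhat = np.empty(len(y))
        yhat[0] = y[0]                    # a common initialization choice
        for t in range(1, len(y)):
            yhat[t] = alpha * y[t] + (1 - alpha) * yhat[t - 1]
        return yhat

    y = np.array([3.0, 4.1, 3.8, 5.2, 4.9, 5.5])
    print(exp_smooth(y, alpha=0.3))       # small alpha -> heavier smoothing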

GARCH Models

An important assumption in the ARMA model

\[\phi(B) W_t = \theta_0 + \theta(B)A_t, \qquad t \in Z\]

is that the errors \(A_t\) are independent random variables with mean 0 and constant variance, \(\sigma^2\).

For some time series, the assumptions of independent errors and constant variance do not hold for the underlying series. For example, in the stock market, large errors in one direction are often followed by large errors in the opposite direction, and variability over time can increase with price level or trading volume. To account for heteroscedastic (non-constant) variances, Engle (1982) introduced the Autoregressive Conditional Heteroscedastic, or ARCH, model:

\[\begin{split}\begin{array}{l} A_t = z_t \sigma_t \\ \sigma_t^2 = \sigma^2 + \sum\limits_{i=1}^{q} \alpha_i A_{t-i}^{2}, \end{array}\end{split}\]

where the \(z_t\)’s are independent and identically distributed standard normal random variables. In the ARCH model, the variance term depends on previous squared errors, up to a given lag q. A generalized ARCH model, called GARCH, was introduced by Bollerslev (1986) and has the form:

\[\begin{split}\begin{array}{l} A_t = z_t \sigma_t \\ \sigma_t^2 = \sigma^2 + \sum\limits_{i=1}^{p} \beta_i \sigma_{t-i}^2 + \sum\limits_{i=1}^{q} \alpha_i A_{t-i}^{2}. \end{array}\end{split}\]

In the GARCH model, the variance has an auto-regressive term in addition to the squared error term. The function garch estimates ARCH or GARCH models.
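garch's estimation interface is not shown here. To make the recursion concrete, a numpy simulation of a GARCH(1, 1) error process following the equations above, with omega playing the role of the constant \(\sigma^2\) (parameter values arbitrary)::

    import numpy as np

    def simulate_garch11(n, omega, alpha, beta, seed=0):
        """A_t = z_t*sigma_t,  sigma_t^2 = omega + beta*sigma_{t-1}^2 + alpha*A_{t-1}^2."""
        rng = np.random.default_rng(seed)
        z = rng.standard_normal(n)
        a = np.zeros(n)
        # Start at the unconditional variance; requires alpha + beta < 1
        sig2 = np.full(n, omega / (1.0 - alpha - beta))
        a[0] = z[0] * np.sqrt(sig2[0])
        for t in range(1, n):
            sig2[t] = omega + beta * sig2[t - 1] + alpha * a[t - 1] ** 2
            a[t] = z[t] * np.sqrt(sig2[t])
        return a, np.sqrt(sig2)

    # alpha + beta < 1 keeps the process covariance-stationary
    errors, vol = simulate_garch11(1000, omega=0.1, alpha=0.1, beta=0.8)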

State-Space Models

A state‑space model is represented by two equations: the state equation

\[\dot{b}(t) = f(t,b(t),u(t))\]

and the observation equation,

\[Y(t) = h(t,b(t),u(t))\]

where \(b(t)\) is the state variable, \(Y(t)\) is the observation variable, \(u(t)\) is a vector of inputs, and

\[\dot{b}(t) := \frac{d}{dt} b(t)\]

State‑space models originated in the study of dynamical systems. The system state \(b(t)\) is not directly observed but is inferred from the observable variable, \(Y(t)\), through the relation defined by the function, h. \(Y(t)\) is sometimes called the measurement variable or output variable. While f and h are completely general functions in the definition, they are most often linear, based on the assumption that the underlying system behaves linearly or approximately so. There is often a stochastic or noise term added to the equations and, in the time series context, there are usually no control inputs (\(u(t)=0\)). Under these conditions, the state‑space model takes the form,

\[\dot{b}(t) = Z(t) b(t) + w(t)\]

and

\[Y(t) = T(t) b(t) + e(t)\]

where Z and T are known matrices and w and e are noise variables. Time may evolve continuously or discretely. For a discrete time variable, it is customary to write the equations as:

\[b(k+1) = Z(k) b(k) + w(k)\]

and

\[Y(k) = T(k) b(k) + e(k)\]

where \(k=\ldots,-3,-2,-1,0,1,2,3,\ldots\).

Many time series can be expressed in the state‑space form, including ARIMA models (See Section 5.5 in Box, Jenkins, and Reinsel (2008)). For ARIMA models, the state‑space form is convenient for the calculation of the likelihood and forecasts, and for handling missing values. Once a time series is formulated as a state‑space model, the problem becomes one of solving the equations for the unknown parameters. The Kalman filter (kalman) is a recursive method for solving the state‑space equations when the functions f and h are linear.
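The kalman function's interface is not reproduced here. A minimal sketch of the discrete-time filter recursions for the model above, assuming Gaussian noise with known covariances Q and R (those covariances, and the initial state mean b0 and covariance P0, are assumptions added for the sketch)::

    import numpy as np

    def kalman_filter(Y, Z, T, Q, R, b0, P0):
        """Filtered state means for b(k+1) = Z b(k) + w(k), Y(k) = T b(k) + e(k)."""
        b, P = b0, P0
        states = []
        for y in Y:
            # Predict: propagate the state mean and covariance through Z
            b = Z @ b
            P = Z @ P @ Z.T + Q
            # Update: correct the prediction with the new observation y
            S = T @ P @ T.T + R                # innovation covariance
            K = P @ T.T @ np.linalg.inv(S)     # Kalman gain
            b = b + K @ (y - T @ b)
            P = P - K @ T @ P
            states.append(b)
        return np.array(states)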