Package com.imsl.stat

Class ARMAOutlierIdentification

java.lang.Object
com.imsl.stat.ARMAOutlierIdentification
All Implemented Interfaces:
Serializable, Cloneable

public class ARMAOutlierIdentification extends Object implements Serializable, Cloneable
Detects and determines outliers and simultaneously estimates the model parameters in a time series whose underlying outlier free series follows a general seasonal or nonseasonal ARMA model. This class also allows computation of forecasts.

Consider a univariate time series \(\{Y_t\}\) that can be described by the following multiplicative seasonal ARIMA model of order \((p,0,q) \times (0,d,0)_s\): $$ Y_t - \mu = \frac{\theta(B)}{\Delta_s^d \phi(B)} a_t,\, t = 1,\ldots,n $$ Here, \(\Delta_s^d=(1-B^s)^d,\, \theta(B)=1-\theta_1B-\ldots-\theta_qB^q,\, \phi(B)=1-\phi_1B-\ldots-\phi_pB^p\). B is the lag operator, \(B^kY_t=Y_{t-k}\), \(\{a_t\}\) is a white noise process, and \(\mu\) denotes the mean of the series \(\{Y_t\}\).

Outlier detection and parameter estimation

In general, \(\{Y_t\}\) is not directly observable due to the influence of outliers. Chen and Liu (1993) distinguish between four types of outliers: innovational outliers (IO), additive outliers (AO), temporary changes (TC) and level shifts (LS). If an outlier occurs as the last observation of the series, then Chen and Liu's algorithm is unable to determine the outlier's classification. In class ARMAOutlierIdentification, such an outlier is called a UI (unable to identify) and is treated as an innovational outlier.

In order to take the effects of multiple outliers occurring at time points \(t_1,\ldots,t_m\) into account, Chen and Liu consider the following model: $$ Y_t^\ast - \mu = \sum_{j=1}^m\omega_jL_j(B)I_t(t_j)+\frac{\theta(B)}{\Delta_s^d\phi(B)}a_t $$ Here, \(\{Y_t^\ast\}\) is the observed outlier contaminated series, and \(\omega_j\) and \(L_j(B)\) denote the magnitude and dynamic pattern of outlier j, respectively. \(I_t(t_j)\) is an indicator function that determines the temporal course of the outlier effect, \(I_{t_j}(t_j)=1, I_t(t_j)=0\) otherwise. Note that \(L_j(B)\) operates on \(I_t\) via \(B^kI_t=I_{t-k}, \, k=0,1,\ldots\).

The last formula shows that the outlier free series \(\{Y_t\}\) can be obtained from the original series \(\{Y_t^\ast\}\) by removing all occurring outlier effects: $$ Y_t = Y_t^\ast-\sum_{j=1}^m\omega_jL_j(B)I_t(t_j) $$ The different types of outliers are characterized by different values for \(L_j(B)\):

  1. \(L_j(B)=\frac{\theta(B)}{\Delta_s^d\phi(B)}\) for an innovational outlier,
  2. \(L_j(B)=1\) for an additive outlier,
  3. \(L_j(B)=(1-B)^{-1}\) for a level shift outlier and
  4. \(L_j(B)=(1-\delta B)^{-1},\, 0 \lt \delta \lt 1,\) for a temporary change outlier.

Class ARMAOutlierIdentification is an implementation of Chen and Liu's algorithm. It determines the coefficients in \(\phi(B)\) and \(\theta(B)\) and the outlier effects in the model for the observed series jointly in three stages. The magnitude of the outlier effects is determined by least squares estimates. Outlier detection itself is realized by examination of the maximum value of the standardized statistics of the outlier effects. For a detailed description, see Chen and Liu's original paper (1993).

Intermediate and final estimates for the coefficients in \(\phi(B)\) and \(\theta(B)\) are computed by the compute methods from JMSL classes ARMA and ARMAMaxLikelihood. If the roots of \(\phi(B)\) or \(\theta(B)\) lie on or within the unit circle, then the algorithm stops with an appropriate exception. In this case, different values for p and q should be tried.

Forecasting

From the relation between original and outlier free series, $$ Y_t^\ast = Y_t+\sum_{j=1}^m\omega_jL_j(B)I_t(t_j) $$ it follows that the Box-Jenkins forecast at origin t for lead time l, \(\hat{Y}_t^\ast(l)\), can be computed as $$ \hat{Y}_t^\ast(l) = \hat{Y}_t(l)+\sum_{j=1}^m \omega_jL_j(B)I_{t+l}(t_j) ,\; l=1,\ldots,{\rm{nForecast}} $$ Therefore, computation of the forecasts for \(\{Y^\ast_t\}\) is done in two steps:

  1. Computation of the forecasts for the outlier free series \(\{Y_t\}\).
  2. Computation of the forecasts for the original series \(\{Y^\ast_t\}\) by adding the multiple outlier effects to the forecasts for \(\{Y_t\}\).

Step 1: Computation of the forecasts for the outlier free series \(\{Y_t\}\)

Since $$ \varphi(B)(Y_t-\mu) = \theta(B)a_t $$ where $$ \varphi(B):=\Delta_s^d\phi(B)=1-\varphi_1B-\ldots-\varphi_{p+sd}B^{p+sd} $$ the Box-Jenkins forecast at origin t for lead time l, \(\hat{Y}_t(l)\), can be computed recursively as $$ \hat{Y}_t(l)=(1-\sum_{j=1}^{p+sd}\varphi_j)\mu+\sum_{j=1}^{p+sd}\varphi_j\hat{Y}_t(l-j)-\sum_{j=1}^q\theta_ja_{t+l-j} $$ Here, $$ \hat{Y}_t(l-j)= \left\{ \begin{array}{ccc} Y_{t+l-j} & {\mbox{for}} & l-j \le 0 \\ \hat{Y}_t(l-j) & {\mbox{for}} & l-j \gt 0 \end{array} \right. $$ and $$ a_k = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & k \le \max\{1, p+sd\} \\ Y_k-\hat{Y}_{k-1}(1) & {\mbox{for}} & k = \max\{1, p+sd\}+1,\ldots,n \end{array} \right. $$

Step 2: Computation of the forecasts for the original series \(\{{Y_t}^*\}\) by adding the multiple outlier effects to the forecasts for \(\{Y_t\}\)

The formulas for \(L_j(B)\) for the different types of outliers are as follows:

Innovational outlier (IO)

$$ L_j(B) = \frac{\theta(B)}{\Delta_s^d\phi(B)}:=\psi(B)=\sum_{k=0}^\infty\psi_kB^k,\, \psi_0=1 $$

Additive outliers (AO)

$$ L_j(B) = 1 $$

Level shifts (LS)

$$ L_j(B) = \frac{1}{1-B}=\sum_{k=0}^{\infty}B^k $$

Temporary changes (TC)

$$ L_j(B) = \frac{1}{1-\delta B}=\sum_{k=0}^\infty\delta^kB^k $$ Assuming the outlier occurs at time point \(t_j\), the outlier impact is therefore:

Innovational outliers (IO)

$$ \omega_jL_j(B)I_t(t_j) = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & t \lt t_j \\ \omega_j \psi_k & {\mbox{for}} & t=t_j+k, \, k \ge 0 \end{array} \right. $$

Additive outliers (AO)

$$ \omega_jL_j(B)I_t(t_j) = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & t \ne t_j \\ \omega_j & {\mbox{for}} & t=t_j \end{array} \right. $$

Level shifts (LS)

$$ \omega_jL_j(B)I_t(t_j) = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & t \lt t_j \\ \omega_j & {\mbox{for}} & t=t_j+k, \, k \ge 0 \end{array} \right. $$

Temporary changes (TC)

$$ \omega_jL_j(B)I_t(t_j) = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & t \lt t_j \\ \omega_j \delta^k & {\mbox{for}} & t=t_j+k, \, k \ge 0 \end{array} \right. $$ From these formulas, the forecasts \(\hat{Y}_t^\ast(l)\) can be computed easily. The \(100(1-\alpha)\) percent probability limits for \(Y_{t+l}^\ast\) and \(Y_{t+l}\) are given by $$ \hat{Y}_t^\ast(l)\, {\rm{(or }}\,\,\, \hat{Y}_t(l) {\rm{, resp.)}} \pm u_{\alpha/2}(1+\sum_{j=1}^{l-1}\psi_j^2)^{1/2}s_a $$ where \(u_{\alpha/2}\) is the \(100(1-\alpha/2)\) percentile of the standard normal distribution, \(s_a^2\) is an estimate of the variance \(\sigma_a^2\) of the random shocks, and the \(\psi\) weights \(\{\psi_j\}\) are the coefficients in $$ \psi(B) := \sum_{k=0}^\infty \psi_k B^k := \frac{\theta(B)}{\Delta_s^d\phi(B)}, \, \psi_0=1 \,. $$ For a detailed explanation of these concepts, see chapter 5:"Forecasting" in Box, Jenkins and Reinsel (1994).
See Also:
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    Indicates detection of an additive outlier.
    static final int
    Indicates detection of an innovational outlier.
    static final int
    Indicates detection of a level shift outlier.
    static final int
    Indicates detection of a temporary change outlier.
    static final int
    Indicates detection of an outlier that cannnot be categorized.
  • Constructor Summary

    Constructors
    Constructor
    Description
    Constructor for ARMAOutlierIdentification.
  • Method Summary

    Modifier and Type
    Method
    Description
    final void
    compute(int[] model)
    Detects and determines outliers and simultaneously estimates the model parameters for the given time series.
    final void
    computeForecasts(int nForecast)
    Computes forecasts, associated probability limits and \(\psi\) weights for an outlier contaminated time series whose underlying outlier free series obeys a general seasonal or non-seasonal ARMA model.
    double
    Returns Akaike's information criterion (AIC).
    double
    Returns Akaike's Corrected Information Criterion (AICC).
    double[]
    Returns the final autoregressive parameter estimates.
    double
    Returns the Bayesian Information Criterion (BIC).
    double
    Returns the constant parameter estimate.
    double[]
    Returns the deviations used for calculating the forecast confidence limits.
    double[]
    Returns forecasts for the original outlier contaminated series.
    double[]
    Returns the final moving average parameter estimates.
    int
    Returns the number of outliers detected.
    double[]
    Returns the \(\omega\) weights for the detected outliers.
    double[]
    Returns forecasts for the outlier free series.
    double[]
    Returns the outlier free series.
    int[][]
    Returns the outlier statistics.
    double[]
    Returns the \(\psi\) weights of the infinite order moving average form of the model.
    double[]
    Returns the residuals.
    double
    Returns the residual standard error of the outlier free series.
    double[]
    Returns the t value for each detected outlier.
    void
    setAccuracyTolerance(double epsilon)
    Sets the tolerance value controlling the accuracy of the parameter estimates.
    void
    setConfidence(double confidence)
    Sets the confidence level for calculating confidence limit deviations returned from getDeviations.
    void
    setCriticalValue(double critical)
    Sets the critical value used as a threshold during outlier detection.
    void
    setDelta(double delta)
    Sets the dampening effect parameter.
    void
    setRelativeError(double relativeError)
    Sets the stopping criterion for use in the nonlinear equation solver.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • INNOVATIONAL

      public static final int INNOVATIONAL
      Indicates detection of an innovational outlier.
      See Also:
    • ADDITIVE

      public static final int ADDITIVE
      Indicates detection of an additive outlier.
      See Also:
    • LEVEL_SHIFT

      public static final int LEVEL_SHIFT
      Indicates detection of a level shift outlier.
      See Also:
    • TEMPORARY_CHANGE

      public static final int TEMPORARY_CHANGE
      Indicates detection of a temporary change outlier.
      See Also:
    • UNABLE_TO_IDENTIFY

      public static final int UNABLE_TO_IDENTIFY
      Indicates detection of an outlier that cannnot be categorized.
      See Also:
  • Constructor Details

    • ARMAOutlierIdentification

      public ARMAOutlierIdentification(double[] z)
      Constructor for ARMAOutlierIdentification.
      Parameters:
      z - a double array containing the observations.
  • Method Details

    • compute

      Detects and determines outliers and simultaneously estimates the model parameters for the given time series.
      Parameters:
      model - an int array of length 4 containing the numbers p, q, s, d of the \(\text{ARIMA}(p,0,q)\times(0,d,0)_s\) model the outlier free series is following. It is required that p, q and d are non-negative and s is positive and consistent with z.length.
      Throws:
      ARMAMaxLikelihood.NonStationaryException - is thrown if the intermediate or final maximum likelihood estimates for the time series are non-stationary.
      ARMAMaxLikelihood.NonInvertibleException - is thrown if the intermediate or final maximum likelihood estimates for the time series are noninvertible.
      ARMAMaxLikelihood.InitialMAException - is thrown if the initial values provided for the moving average terms are noninvertible. In this case, ARMAMaxLikelihood terminates and does not compute the time series estimates.
      ZeroPolynomial.DidNotConvergeException - is thrown if the algorithm computing the roots of the AR- or MA- polynomial does not converge.
      ARMA.MatrixSingularException - is thrown if the input matrix is singular.
      ARMA.TooManyCallsException - is thrown if the number of calls to the function has exceeded the maximum number of iterations times the number of moving average (MA) parameters + 1.
      ARMA.IncreaseErrRelException - is thrown if the bound for the relative error is too small.
      ARMA.NewInitialGuessException - is thrown if the iteration has not made good progress.
      ARMA.IllConditionedException - is thrown if the problem is ill-conditioned.
      ARMA.TooManyITNException - is thrown if the maximum number of iterations is exceeded.
      ARMA.TooManyFcnEvalException - is thrown if the maximum number of function evaluations is exceeded.
      ARMA.TooManyJacobianEvalException - is thrown if the maximum number of Jacobian evaluations is exceeded.
      SingularMatrixException - is thrown if during the computation of a small perturbation of the matrix product \(A^TA\), it is found that A, the matrix used in the determination of the \(\omega\) weights, is singular.
      ARMA.ResidualsTooLargeException - is thrown if the residuals computed in one step of the Least Squares estimation of the ARMA coefficients become too large.
    • getMA

      public double[] getMA()
      Returns the final moving average parameter estimates.
      Returns:
      a double array of length q = model[1] containing the final moving average parameter estimates. Note that the compute method must be invoked first before invoking this method. Otherwise, the method throws an IllegalStateException exception.
    • getAR

      public double[] getAR()
      Returns the final autoregressive parameter estimates.
      Returns:
      a double array of length p = model[0] containing the final autoregressive parameter estimates. Note that the compute method must be invoked first before invoking this method. Otherwise, the method throws an IllegalStateException exception.
    • getConstant

      public double getConstant()
      Returns the constant parameter estimate.
      Returns:
      a double scalar containing the constant parameter estimate. The compute method must be invoked first before invoking this method. Otherwise, the method throws an IllegalStateException exception.
    • getResidualStandardError

      public double getResidualStandardError()
      Returns the residual standard error of the outlier free series.
      Returns:
      a double scalar containing the standard error of the outlier free series. Note that the compute method must be invoked first before invoking this method. Otherwise, an IllegalStateException exception is thrown.
    • getResidual

      public double[] getResidual()
      Returns the residuals.
      Returns:
      a double array containing the residuals for the outlier free series at the final parameter estimation point. The compute method must be invoked first before invoking this method. Otherwise, the method throws an IllegalStateException exception.
    • getNumberOfOutliers

      public int getNumberOfOutliers()
      Returns the number of outliers detected.
      Returns:
      an int scalar containing the number of outliers detected. The compute method must be invoked first before invoking this method. Otherwise, an IllegalStateException exception is thrown.
    • getOutlierStatistics

      public int[][] getOutlierStatistics()
      Returns the outlier statistics.
      Returns:
      an int matrix of length nOutliers by 2, where nOutliers is the number of detected outliers, containing the outlier statistics. The first column contains the time at which the outlier was observed (time ranging from 1 to z.length, the number of observations in the time series) and the second column contains an identifier indicating the type of outlier observed. Outlier types fall into one of five categories:
      Identifier Outlier type
      INNOVATIONAL Innovational Outliers (IO)
      ADDITIVE Additive Outliers (AO)
      LEVEL_SHIFT Level Shift Outliers (LS)
      TEMPORARY_CHANGE Temporary Change Outliers (TC)
      UNABLE_TO_IDENTIFY Unable to Identify (UI)

      If the number of detected outliers equals zero, then an int array of size 0 is returned.

      The compute method must be invoked first before invoking this method. Otherwise, an IllegalStateException exception is thrown.
    • getTauStatistics

      public double[] getTauStatistics()
      Returns the t value for each detected outlier.
      Returns:
      a double array containing the t statistics for each detected outlier. If the number of detected outliers equals zero, then a vector of length 0 is returned. The compute method must be invoked before using this method. Otherwise, an IllegalStateException exception is thrown.
    • getOmegaWeights

      public double[] getOmegaWeights()
      Returns the \(\omega\) weights for the detected outliers.
      Returns:
      a double array containing the computed \(\omega\) weights for the detected outliers. If the number of detected outliers equals zero, then an array of length zero is returned. The compute method must be invoked before using this method. Otherwise, an IllegalStateException exception is thrown.
    • getAIC

      public double getAIC()
      Returns Akaike's information criterion (AIC).
      Returns:
      a double scalar containing Akaike's information criterion (AIC) for the outlier free series. The compute method must be called before invoking this method. Otherwise, an IllegalStateException exception is thrown.
    • getAICC

      public double getAICC()
      Returns Akaike's Corrected Information Criterion (AICC).
      Returns:
      a double scalar containing Akaike's Corrected Information Criterion (AICC) for the outlier free series. The compute method must be called before invoking this method. Otherwise, an IllegalStateException exception is thrown.
    • getBIC

      public double getBIC()
      Returns the Bayesian Information Criterion (BIC).
      Returns:
      a double scalar containing the Bayesian Information Criterion (BIC) for the outlier free series. The compute method must be called before invoking this method. Otherwise, an IllegalStateException exception is thrown.
    • getOutlierFreeSeries

      public double[] getOutlierFreeSeries()
      Returns the outlier free series.
      Returns:
      a double array containing the outlier free series. The compute method must be called before invoking this method. Otherwise, an IllegalStateException exception is thrown.
    • computeForecasts

      public final void computeForecasts(int nForecast)
      Computes forecasts, associated probability limits and \(\psi\) weights for an outlier contaminated time series whose underlying outlier free series obeys a general seasonal or non-seasonal ARMA model.
      Parameters:
      nForecast - an int scalar containing the maximum lead time for forecasts. nForecast must be greater than 0. Forecast origin is the time point of the last observed value in the time series, \(\text{n}\). Forecasts are computed for lead times \(1, 2, \ldots, \text{nForecast}\), i.e. time points \(\text{n}+1, \text{n}+2,\ldots,\text{n}+\text{nForecast}\). Note that the compute method must be invoked first before invoking this method. Otherwise, the method throws an IllegalStateException exception.
    • setConfidence

      public void setConfidence(double confidence)
      Sets the confidence level for calculating confidence limit deviations returned from getDeviations.
      Parameters:
      confidence - a double scalar specifying the confidence level used in computing forecast confidence intervals. Typical choices for confidence are 0.90, 0.95, and 0.99. confidence must be greater than 0.0 and less than 1.0. Default: confidence = 0.95.
    • getDeviations

      public double[] getDeviations()
      Returns the deviations used for calculating the forecast confidence limits.
      Returns:
      a double array of length nForecast containing the deviations from each forecast for calculating forecast confidence intervals. The confidence level is specified in setConfidence. Method computeForecasts has to be invoked before this method is called. Otherwise, an IllegalStateException exception is thrown.
    • getForecast

      public double[] getForecast()
      Returns forecasts for the original outlier contaminated series.
      Returns:
      a double array of length nForecast containing the forecasts for the original series. Method computeForecasts has to be invoked before this method is called. Otherwise, an IllegalStateException exception is thrown.
    • getOutlierFreeForecast

      public double[] getOutlierFreeForecast()
      Returns forecasts for the outlier free series.
      Returns:
      a double array of length nForecast containing the forecasts for the outlier free series. Method computeForecasts has to be invoked before this method is called. Otherwise, an IllegalStateException exception is thrown.
    • getPsiWeights

      public double[] getPsiWeights()
      Returns the \(\psi\) weights of the infinite order moving average form of the model.
      Returns:
      a double array of length nForecast containing the \(\psi\) weights of the infinite order moving average form of the model for the outlier free series. Method computeForecasts must be invoked before this method is called. Otherwise, an IllegalStateException exception is thrown.
    • setDelta

      public void setDelta(double delta)
      Sets the dampening effect parameter.
      Parameters:
      delta - a double scalar, the dampening effect parameter used in the detection of a Temporary Change Outlier (TC). delta must be greater than 0 and less than 1. Default: delta = 0.7.
    • setCriticalValue

      public void setCriticalValue(double critical)
      Sets the critical value used as a threshold during outlier detection.
      Parameters:
      critical - a double scalar, the critical value used as a threshold for the statistics used in the outlier detection. critical must be greater than zero. Default: critical = 3.0.
    • setAccuracyTolerance

      public void setAccuracyTolerance(double epsilon)
      Sets the tolerance value controlling the accuracy of the parameter estimates.
      Parameters:
      epsilon - a double scalar, a positive tolerance value controlling the accuracy of parameter estimates during outlier detection. Default: epsilon = 0.001.
    • setRelativeError

      public void setRelativeError(double relativeError)
      Sets the stopping criterion for use in the nonlinear equation solver.
      Parameters:
      relativeError - a double positive scalar containing the stopping criterion for use in the nonlinear equation solver used in the least-squares algorithm.
      Default: relativeError= 1.0e-10.