public class ARMAOutlierIdentification extends Object implements Serializable, Cloneable
Consider a univariate time series \(\{Y_t\}\) that can be described by the following multiplicative seasonal ARIMA model of order \((p,0,q) \times (0,d,0)_s\): $$ Y_t - \mu = \frac{\theta(B)}{\Delta_s^d \phi(B)} a_t,\, t = 1,\ldots,n $$ Here, \(\Delta_s^d=(1-B^s)^d,\, \theta(B)=1-\theta_1B-\ldots-\theta_qB^q,\, \phi(B)=1-\phi_1B-\ldots-\phi_pB^p\). B is the lag operator, \(B^kY_t=Y_{t-k}\), \(\{a_t\}\) is a white noise process, and \(\mu\) denotes the mean of the series \(\{Y_t\}\).
Outlier detection and parameter estimation
In general, \(\{Y_t\}\) is not directly observable due to the influence of outliers.
Chen and Liu (1993) distinguish between four types of outliers: innovational outliers (IO), additive
outliers (AO), temporary changes (TC) and level shifts (LS). If an outlier occurs as the last observation
of the series, then Chen and Liu's algorithm is unable to determine the outlier's classification. In
class ARMAOutlierIdentification
, such an outlier is called a UI (unable to identify) and
is treated as an innovational outlier.
In order to take the effects of multiple outliers occurring at time points \(t_1,\ldots,t_m\) into account, Chen and Liu consider the following model: $$ Y_t^\ast - \mu = \sum_{j=1}^m\omega_jL_j(B)I_t(t_j)+\frac{\theta(B)}{\Delta_s^d\phi(B)}a_t $$ Here, \(\{Y_t^\ast\}\) is the observed outlier contaminated series, and \(\omega_j\) and \(L_j(B)\) denote the magnitude and dynamic pattern of outlier j, respectively. \(I_t(t_j)\) is an indicator function that determines the temporal course of the outlier effect, \(I_{t_j}(t_j)=1, I_t(t_j)=0\) otherwise. Note that \(L_j(B)\) operates on \(I_t\) via \(B^kI_t=I_{t-k}, \, k=0,1,\ldots\).
The last formula shows that the outlier free series \(\{Y_t\}\) can be obtained from the original series \(\{Y_t^\ast\}\) by removing all occurring outlier effects: $$ Y_t = Y_t^\ast-\sum_{j=1}^m\omega_jL_j(B)I_t(t_j) $$ The different types of outliers are characterized by different values for \(L_j(B)\):
Class ARMAOutlierIdentification
is an implementation of Chen and Liu's algorithm. It determines
the coefficients in \(\phi(B)\) and \(\theta(B)\) and the outlier effects
in the model for the observed series jointly in three stages. The magnitude of the outlier effects is determined
by least squares estimates. Outlier detection itself is realized by examination of the maximum value of the
standardized statistics of the outlier effects. For a detailed description, see Chen and Liu's original
paper (1993).
Intermediate and final estimates for the coefficients in \(\phi(B)\) and \(\theta(B)\)
are computed by the compute
methods from JMSL classes ARMA
and ARMAMaxLikelihood
.
If the roots of \(\phi(B)\) or \(\theta(B)\) lie on or within the unit circle, then
the algorithm stops with an appropriate exception. In this case, different values for p and q should be tried.
Forecasting
From the relation between original and outlier free series, $$ Y_t^\ast = Y_t+\sum_{j=1}^m\omega_jL_j(B)I_t(t_j) $$ it follows that the Box-Jenkins forecast at origin t for lead time l, \(\hat{Y}_t^\ast(l)\), can be computed as $$ \hat{Y}_t^\ast(l) = \hat{Y}_t(l)+\sum_{j=1}^m \omega_jL_j(B)I_{t+l}(t_j) ,\; l=1,\ldots,{\rm{nForecast}} $$ Therefore, computation of the forecasts for \(\{Y^\ast_t\}\) is done in two steps:
Step 1: Computation of the forecasts for the outlier free series \(\{Y_t\}\)
Since $$ \varphi(B)(Y_t-\mu) = \theta(B)a_t $$ where $$ \varphi(B):=\Delta_s^d\phi(B)=1-\varphi_1B-\ldots-\varphi_{p+sd}B^{p+sd} $$ the Box-Jenkins forecast at origin t for lead time l, \(\hat{Y}_t(l)\), can be computed recursively as $$ \hat{Y}_t(l)=(1-\sum_{j=1}^{p+sd}\varphi_j)\mu+\sum_{j=1}^{p+sd}\varphi_j\hat{Y}_t(l-j)-\sum_{j=1}^q\theta_ja_{t+l-j} $$ Here, $$ \hat{Y}_t(l-j)= \left\{ \begin{array}{ccc} Y_{t+l-j} & {\mbox{for}} & l-j \le 0 \\ \hat{Y}_t(l-j) & {\mbox{for}} & l-j \gt 0 \end{array} \right. $$ and $$ a_k = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & k \le \max\{1, p+sd\} \\ Y_k-\hat{Y}_{k-1}(1) & {\mbox{for}} & k = \max\{1, p+sd\}+1,\ldots,n \end{array} \right. $$
Step 2: Computation of the forecasts for the original series \(\{{Y_t}^*\}\) by adding the multiple outlier effects to the forecasts for \(\{Y_t\}\)
The formulas for \(L_j(B)\) for the different types of outliers are as follows:
Innovational outlier (IO)
$$ L_j(B) = \frac{\theta(B)}{\Delta_s^d\phi(B)}:=\psi(B)=\sum_{k=0}^\infty\psi_kB^k,\, \psi_0=1 $$Additive outliers (AO)
$$ L_j(B) = 1 $$Level shifts (LS)
$$ L_j(B) = \frac{1}{1-B}=\sum_{k=0}^{\infty}B^k $$Temporary changes (TC)
$$ L_j(B) = \frac{1}{1-\delta B}=\sum_{k=0}^\infty\delta^kB^k $$ Assuming the outlier occurs at time point \(t_j\), the outlier impact is therefore:Innovational outliers (IO)
$$ \omega_jL_j(B)I_t(t_j) = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & t \lt t_j \\ \omega_j \psi_k & {\mbox{for}} & t=t_j+k, \, k \ge 0 \end{array} \right. $$Additive outliers (AO)
$$ \omega_jL_j(B)I_t(t_j) = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & t \ne t_j \\ \omega_j & {\mbox{for}} & t=t_j \end{array} \right. $$Level shifts (LS)
$$ \omega_jL_j(B)I_t(t_j) = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & t \lt t_j \\ \omega_j & {\mbox{for}} & t=t_j+k, \, k \ge 0 \end{array} \right. $$Temporary changes (TC)
$$ \omega_jL_j(B)I_t(t_j) = \left\{ \begin{array}{ccc} 0 & {\mbox{for}} & t \lt t_j \\ \omega_j \delta^k & {\mbox{for}} & t=t_j+k, \, k \ge 0 \end{array} \right. $$ From these formulas, the forecasts \(\hat{Y}_t^\ast(l)\) can be computed easily. The \(100(1-\alpha)\) percent probability limits for \(Y_{t+l}^\ast\) and \(Y_{t+l}\) are given by $$ \hat{Y}_t^\ast(l)\, {\rm{(or }}\,\,\, \hat{Y}_t(l) {\rm{, resp.)}} \pm u_{\alpha/2}(1+\sum_{j=1}^{l-1}\psi_j^2)^{1/2}s_a $$ where \(u_{\alpha/2}\) is the \(100(1-\alpha/2)\) percentile of the standard normal distribution, \(s_a^2\) is an estimate of the variance \(\sigma_a^2\) of the random shocks, and the \(\psi\) weights \(\{\psi_j\}\) are the coefficients in $$ \psi(B) := \sum_{k=0}^\infty \psi_k B^k := \frac{\theta(B)}{\Delta_s^d\phi(B)}, \, \psi_0=1 \,. $$ For a detailed explanation of these concepts, see chapter 5:"Forecasting" in Box, Jenkins and Reinsel (1994).Modifier and Type | Field and Description |
---|---|
static int |
ADDITIVE
Indicates detection of an additive outlier.
|
static int |
INNOVATIONAL
Indicates detection of an innovational outlier.
|
static int |
LEVEL_SHIFT
Indicates detection of a level shift outlier.
|
static int |
TEMPORARY_CHANGE
Indicates detection of a temporary change outlier.
|
static int |
UNABLE_TO_IDENTIFY
Indicates detection of an outlier that cannnot be categorized.
|
Constructor and Description |
---|
ARMAOutlierIdentification(double[] z)
Constructor for
ARMAOutlierIdentification . |
Modifier and Type | Method and Description |
---|---|
void |
compute(int[] model)
Detects and determines outliers and simultaneously estimates the model
parameters for the given time series.
|
void |
computeForecasts(int nForecast)
Computes forecasts, associated probability limits and \(\psi\)
weights for an outlier contaminated time series whose underlying outlier free
series obeys a general seasonal or non-seasonal ARMA model.
|
double |
getAIC()
Returns Akaike's information criterion (AIC).
|
double |
getAICC()
Returns Akaike's Corrected Information Criterion (AICC).
|
double[] |
getAR()
Returns the final autoregressive parameter estimates.
|
double |
getBIC()
Returns the Bayesian Information Criterion (BIC).
|
double |
getConstant()
Returns the constant parameter estimate.
|
double[] |
getDeviations()
Returns the deviations used for calculating the
forecast confidence limits.
|
double[] |
getForecast()
Returns forecasts for the original outlier contaminated series.
|
double[] |
getMA()
Returns the final moving average parameter estimates.
|
int |
getNumberOfOutliers()
Returns the number of outliers detected.
|
double[] |
getOmegaWeights()
Returns the \(\omega\) weights for the
detected outliers.
|
double[] |
getOutlierFreeForecast()
Returns forecasts for the outlier free series.
|
double[] |
getOutlierFreeSeries()
Returns the outlier free series.
|
int[][] |
getOutlierStatistics()
Returns the outlier statistics.
|
double[] |
getPsiWeights()
Returns the \(\psi\) weights of the infinite order moving average
form of the model.
|
double[] |
getResidual()
Returns the residuals.
|
double |
getResidualStandardError()
Returns the residual standard error of the outlier free series.
|
double[] |
getTauStatistics()
Returns the t value for each detected outlier.
|
void |
setAccuracyTolerance(double epsilon)
Sets the tolerance value controlling the accuracy of the parameter estimates.
|
void |
setConfidence(double confidence)
Sets the confidence level for calculating confidence
limit deviations returned from
getDeviations . |
void |
setCriticalValue(double critical)
Sets the critical value used as a threshold during outlier detection.
|
void |
setDelta(double delta)
Sets the dampening effect parameter.
|
void |
setRelativeError(double relativeError)
Sets the stopping criterion for use in the nonlinear equation solver.
|
public static final int INNOVATIONAL
public static final int ADDITIVE
public static final int LEVEL_SHIFT
public static final int TEMPORARY_CHANGE
public static final int UNABLE_TO_IDENTIFY
public ARMAOutlierIdentification(double[] z)
ARMAOutlierIdentification
.z
- a double
array containing the observations.public final void compute(int[] model) throws ARMAMaxLikelihood.NonInvertibleException, ARMAMaxLikelihood.NonStationaryException, ZeroPolynomial.DidNotConvergeException, ARMA.MatrixSingularException, ARMA.TooManyCallsException, ARMA.IncreaseErrRelException, ARMA.NewInitialGuessException, ARMA.IllConditionedException, ARMA.TooManyITNException, ARMA.TooManyFcnEvalException, ARMA.TooManyJacobianEvalException, SingularMatrixException, ARMA.ResidualsTooLargeException
model
- an int
array of length 4 containing the
numbers p, q, s, d of the
\(\text{ARIMA}(p,0,q)\times(0,d,0)_s\)
model the outlier free series is following. It is
required that p, q and d are
non-negative and s is positive and consistent with
z.length
.ARMAMaxLikelihood.NonStationaryException
- is thrown if the
intermediate or final maximum likelihood estimates for the
time series are non-stationary.ARMAMaxLikelihood.NonInvertibleException
- is thrown if the
intermediate or final maximum likelihood estimates for the
time series are noninvertible.ARMAMaxLikelihood.InitialMAException
- is thrown if the
initial values provided for the moving average terms
are noninvertible. In this case, ARMAMaxLikelihood
terminates and does not compute the time series
estimates.ZeroPolynomial.DidNotConvergeException
- is thrown if the
algorithm computing the roots of the AR- or MA- polynomial
does not converge.ARMA.MatrixSingularException
- is thrown if the input matrix
is singular.ARMA.TooManyCallsException
- is thrown if the number of calls
to the function has exceeded the maximum number of iterations
times the number of moving average (MA) parameters + 1.ARMA.IncreaseErrRelException
- is thrown if the bound for the
relative error is too small.ARMA.NewInitialGuessException
- is thrown if the iteration has
not made good progress.ARMA.IllConditionedException
- is thrown if the problem is
ill-conditioned.ARMA.TooManyITNException
- is thrown if the maximum number of
iterations is exceeded.ARMA.TooManyFcnEvalException
- is thrown if the maximum number
of function evaluations is exceeded.ARMA.TooManyJacobianEvalException
- is thrown if the maximum number
of Jacobian evaluations is exceeded.SingularMatrixException
- is thrown if during the computation
of a small perturbation of the matrix product
\(A^TA\), it is found that A,
the matrix used in the determination of the
\(\omega\) weights, is singular.ARMA.ResidualsTooLargeException
- is thrown if the residuals
computed in one step of the Least Squares estimation of the
ARMA coefficients become too large.public double[] getMA()
double
array of length q = model[1]
containing the final moving average parameter estimates.
Note that the compute
method must be invoked first
before invoking this method. Otherwise, the method throws an
IllegalStateException
exception.public double[] getAR()
double
array of length p = model[0]
containing the final autoregressive parameter estimates.
Note that the compute
method must be invoked first
before invoking this method. Otherwise, the method throws an
IllegalStateException
exception.public double getConstant()
double
scalar containing the constant parameter
estimate. The compute
method must be invoked first
before invoking this method. Otherwise, the method throws an
IllegalStateException
exception.public double getResidualStandardError()
double
scalar containing the standard
error of the outlier free series. Note that the
compute
method must be invoked first
before invoking this method. Otherwise, an
IllegalStateException
exception is thrown.public double[] getResidual()
double
array containing the residuals for the
outlier free series at the final parameter estimation point. The
compute
method must be invoked first before
invoking this method. Otherwise, the method throws an
IllegalStateException
exception.public int getNumberOfOutliers()
int
scalar containing the
number of outliers detected. The compute
method must be invoked first before invoking this method.
Otherwise, an IllegalStateException
exception
is thrown.public int[][] getOutlierStatistics()
int
matrix of length nOutliers
by 2,
where nOutliers
is the number of detected outliers,
containing the outlier statistics. The first column contains the
time at which the outlier was observed (time ranging from 1 to
z.length
, the number of observations in the time
series) and the second column contains an identifier indicating
the type of outlier observed. Outlier types fall into one of five
categories:
Identifier | Outlier type |
INNOVATIONAL |
Innovational Outliers (IO) |
ADDITIVE |
Additive Outliers (AO) |
LEVEL_SHIFT |
Level Shift Outliers (LS) |
TEMPORARY_CHANGE |
Temporary Change Outliers (TC) |
UNABLE_TO_IDENTIFY |
Unable to Identify (UI) |
If the number of detected outliers equals zero, then an
int
array of size 0 is returned.
compute
method must be invoked first before
invoking this method. Otherwise, an IllegalStateException
exception is thrown.public double[] getTauStatistics()
double
array containing the t statistics
for each detected outlier. If the number of detected outliers
equals zero, then a vector of length 0 is returned. The
compute
method must be invoked before using this
method. Otherwise, an IllegalStateException
exception is thrown.public double[] getOmegaWeights()
double
array containing the computed
\(\omega\) weights for the detected outliers.
If the number of detected outliers equals zero, then an array
of length zero is returned. The compute
method
must be invoked before using this method. Otherwise, an
IllegalStateException
exception is thrown.public double getAIC()
double
scalar containing Akaike's information
criterion (AIC) for the outlier free series. The
compute
method must be called before invoking
this method. Otherwise, an IllegalStateException
exception is thrown.public double getAICC()
double
scalar containing Akaike's Corrected
Information Criterion (AICC) for the outlier free series.
The compute
method must be called before invoking
this method. Otherwise, an IllegalStateException
exception is thrown.public double getBIC()
double
scalar containing the Bayesian Information
Criterion (BIC) for the outlier free series. The
compute
method must be called before invoking
this method. Otherwise, an IllegalStateException
exception is thrown.public double[] getOutlierFreeSeries()
double
array containing the outlier free series.
The compute
method must be called before invoking
this method. Otherwise, an IllegalStateException
exception is thrown.public final void computeForecasts(int nForecast)
nForecast
- an int
scalar containing the maximum
lead time for forecasts. nForecast
must be greater than 0. Forecast origin is the time point
of the last observed value in the time series,
\(\text{n}\). Forecasts are
computed for lead times
\(1, 2, \ldots, \text{nForecast}\),
i.e. time points
\(\text{n}+1, \text{n}+2,\ldots,\text{n}+\text{nForecast}\).
Note that the compute
method must be invoked first
before invoking this method. Otherwise, the method
throws an IllegalStateException
exception.public void setConfidence(double confidence)
getDeviations
.confidence
- a double
scalar specifying the
confidence level used in computing forecast
confidence intervals. Typical choices for
confidence
are 0.90, 0.95, and 0.99.
confidence
must be greater than 0.0
and less than 1.0. Default: confidence = 0.95
.public double[] getDeviations()
double
array of length nForecast
containing the deviations from each forecast for calculating
forecast confidence intervals. The confidence level is specified in
setConfidence
. Method computeForecasts
has to be invoked before this method is called. Otherwise, an
IllegalStateException
exception is thrown.public double[] getForecast()
double
array of length nForecast
containing the forecasts for the original series. Method
computeForecasts
has to be invoked before this
method is called. Otherwise, an
IllegalStateException
exception is thrown.public double[] getOutlierFreeForecast()
double
array of length nForecast
containing
the forecasts for the outlier free series. Method computeForecasts
has to be invoked before this method is called. Otherwise, an
IllegalStateException
exception is thrown.public double[] getPsiWeights()
double
array of length nForecast
containing
the \(\psi\) weights of the infinite order moving average
form of the model for the outlier free series. Method
computeForecasts
must be invoked
before this method is called. Otherwise, an
IllegalStateException
exception is thrown.public void setDelta(double delta)
delta
- a double
scalar, the dampening effect parameter
used in the detection of a Temporary Change Outlier (TC).
delta
must be greater than 0 and less than
1. Default: delta = 0.7
.public void setCriticalValue(double critical)
critical
- a double
scalar, the critical value used as a
threshold for the statistics used in the outlier detection.
critical
must be greater than zero.
Default: critical = 3.0
.public void setAccuracyTolerance(double epsilon)
epsilon
- a double
scalar, a positive tolerance value
controlling the accuracy of parameter estimates during outlier
detection.
Default: epsilon = 0.001
.public void setRelativeError(double relativeError)
relativeError
- a double
positive scalar containing
the stopping criterion for use in the nonlinear
equation solver used in the least-squares
algorithm. relativeError=
1.0e-10
.Copyright © 2020 Rogue Wave Software. All rights reserved.