tsOutlierForecast

Computes forecasts, their associated probability limits and \(\psi\) weights for an outlier contaminated time series whose underlying outlier free series follows a general seasonal or nonseasonal ARMA model.

Synopsis

tsOutlierForecast (series, outlierStatistics, omega, delta, model, parameters, nPredict)

Required Arguments

float series[] (Input)
An array of length nObs by 2 containing the outlier free time series in its first column and the residuals of the series in the second column.
int outlierStatistics[] (Input)
An array of length numOutliers by 2 containing the outlier statistics from tsOutlierIdentification. If numOutliers=0, this array is ignored.
float omega[] (Input)
Array of length numOutliers containing the \(\psi\) weights for the outliers determined in tsOutlierIdentification. Ignored, if numOutliers=0.
float delta (Input)
The dynamic dampening effect parameter used in the outlier detection.
int model[] (Input)
Vector of length 4 containing the numbers p, q, s, d of the ARIMA \((p,0,q)\times(0,d,0)_s\) model the outlier free series is following.
float parameters[] (Input)
Vector of length 1+p+q containing the estimated constant, AR and MA parameters as output from tsOutlierIdentification.
int nPredict (Input)
Maximum lead time for forecasts. The forecasts are taken at origin t=nObs, the time point of the last observed value, for lead times 1,2,…, nPredict.

Return Value

An array of length nPredict by 3. The first column contains the forecasted values for the original outlier contaminated series. The second column contains the deviations from each forecast for computing confidence probability limits, and the third column contains the \(\psi\) weights of the infinite moving average form of the model.

If an error occurred, None is returned.

Optional Arguments

confidence, float (Input)

Value in the exclusive interval (0,100) used to specify the confidence percent probability limits of the forecast.Typical choices for confidence are 90.0, 95.0 and 99.0.

Default: confidence = 95.0

outFreeForecast (Output)
An array of length nPredict by 3 containing the forecasts for the original outlier free series in column 1, deviations from each forecast in column 2 and the \(\psi\) weights of the infinite moving average form of the model in column 3.

Description

Consider the following model for a given outlier contaminated univariate time series \(\left\{ Y_t^*\right\}_{t=1,\ldots,n}\):

\[Y_t^* = Y_t + \sum\nolimits_{j=1}^{m} \omega_j L_j(B) I_t\left(t_j\right).\]

For an explanation of the notation, see the Description section for tsOutlierIdentification. It follows from the formula above that the Box-Jenkins forecast at origin \(t\) for lead time \(l\), \(\hat{Y}_t^*(l)\), can be computed as:

\[\hat{Y}_{\mathrm{t}}^*(l) = \hat{Y}_{\mathrm{t}}(l) + \sum\nolimits_{\mathrm{j}=1}^{m} \omega_{\mathrm{j}} L_{\mathrm{j}}(B) I_{\mathrm{t}+1} \left(t_{\mathrm{j}}\right), \phantom{...} l = 1, \ldots, \mathrm{nPredict}.\]

Therefore, computation of the forecasts for \(\left\{ Y_t^*\right\}\) is done in two steps:

  1. Computation of the forecasts for the outlier free series \(\left\{Y_t\right\}\).
  2. Computation of the forecasts for the original series \(\left\{ Y_t^*\right\}\) by adding the multiple outlier effects to the forecasts for \(\left\{Y_t\right\}\).

Step 1 above:

Since

\[\varphi(B) \left(Y_t - \mu\right) = \theta(B) a_t,\]

where

\[\varphi(B) := \mathit{\Delta}_s^d \phi(B) = 1 - \varphi_1 B - \ldots - \varphi_{p+sd} B^{p+sd},\]

the Box-Jenkins forecast at origin \(t\) for lead time \(l\), \(\hat{Y}_t(l)\), can be computed recursively as:

\[\hat{Y}_t(l) = \left(1 - \sum\nolimits_{j=1}^{p+sd} \varphi_j\right) \mu + \sum\nolimits_{j=1}^{p+sd} \varphi_j \hat{Y}_t (l-j) - \sum\nolimits_{j=l}^{q} \theta_j a_{t+l+j}.\]

Here,

\[\begin{split}\hat{Y}_t(l-j) = \begin{cases} Y_{t+l-j} & \text{for } l - j \leq 0 \\ \hat{Y}_t(l-j) & \text{for } l - j > 0 \\ \end{cases}\end{split}\]

and

\[\begin{split}a_k = \begin{cases} 0 & \text{for } k \leq \max\{1, p+sd\} \\ Y_k - \hat{Y}_{k-1}(1) & \text{for } k = \max\{1,p+sd\} + 1, \ldots, n \end{cases}\end{split}\]

Step 2 above:

The formulas for \(L_j(B)\) for the different types of outliers are as follows:

Innovational outliers (IO) \(L_j(B)=\tfrac{\theta(B)}{\mathit{\Delta}_s^d \phi(B)} :=\psi(B)=\textstyle\sum_{k=0}^{\infty} \psi_k B^k,\psi_0=1\)
Additive outliers (AO) \(L_j(B)=1\)
Level shifts (LS) \(L_j(B)=\tfrac{1}{1-B}=\textstyle\sum_{k=0}^{\infty} B^k\)
Temporary changes (TC) \(L_j(B)=\tfrac{1}{1-\delta B}=\textstyle\sum_{k=0}^{\infty} \delta^k B^k\)

Assuming the outlier occurs at time point \(t_j\), the outlier impact is therefore:

Innovational outliers (IO)
\[\begin{split}\omega_j L_j(B) I_t\left(t_j\right) = \begin{cases} 0 & \text{for } t < t_j, \\ \omega_j \mathit{\Psi}_k & \text{for } t=t_j + k, k \geq 0, \\ \end{cases}\end{split}\]
Additive outliers (AO)
\[\begin{split}\omega_j L_j(B) I_t\left(t_j\right) = \begin{cases} 0 & \text{for } t eq t_j, \\ x \omega_j & \text{for } t=t_j, \\ \end{cases}\end{split}\]
Level shifts (LS)
\[\begin{split}\omega_j L_j(B) I_t\left(t_j\right) = \begin{cases} 0 & \text{for } t < t_j, \\ \omega_j & \text{for } t=t_j + k, k \geq 0, \\ \end{cases}\end{split}\]
Temporary changes (TC)
\[\begin{split}\omega_j L_j(B) I_t\left(t_j\right) = \begin{cases} 0 & \text{for } t < t_j, \\ \omega_j \delta^k & \text{for } t=t_j + k, k \geq 0, \\ \end{cases}\end{split}\]

From these formulas, the forecasts \(\hat{Y}_t^*(l)\) can be computed easily.

The \(100(1-\alpha)\) percent probability limits for \(Y_{t+l}^*\) and \(Y_{t+l}\) are given by

\[\hat{Y}_t(l)\left(\text{or } \hat{Y}_t(l), \text{resp.}\right) \pm u_{\alpha/2}\left(1 + \sum\nolimits_{j=1}^{l-1} \mathit{\Psi}_j^2\right)^{1/2} s_a,\]

where \(u_{\alpha/2}\) is the \(100(1-\alpha/2)\) percentile of the standard normal distribution, \(s_a^2\) is an estimate of the variance \(\sigma_a^2\) of the random shocks (returned from tsOutlierIdentification), and the \(\psi\) weights \(\left\{\psi_j\right\}\) are the coefficients in

\[\mathit{\Psi}(B) := \sum\nolimits_{k=0}^{\infty} \mathit{\Psi}_k B^k := \frac{\theta(B)}{\mathit{\Delta}_s^d \phi(B)}, \mathit{\Psi}_0 = 1.\]

For a detailed explanation of these concepts, see Chapter 5, Categorical and Discrete Data Analysis, Box, Jenkins and Reinsel (1994).

Example

This example is a realization of an ARMA(2,1) process described by the model \(Y_t-Y_{t-1}+0.24Y_{t-2}=10.0+a_t+0.5a_{t-1}\), \(\left\{ a_t \right\}\) a Gaussian white noise process.

Outliers were artificially added to the outlier free series \(\{Y_t\}_{t=1,\ldots,280}\) at time points \(t=150\) (level shift, \(\omega_1=+2.5\)) and \(t=200\) (additive outlier, \(\omega_2=+3.2\)), resulting in the outlier contaminated series \(\{Z_t\}_{t=1,\ldots,280}\). For both series, forecasts were determined for time points \(t=281,\ldots,290\) and compared with the actual values of the series.

from __future__ import print_function
from numpy import *
from pyimsl.stat.tsOutlierForecast import tsOutlierForecast
from pyimsl.stat.tsOutlierIdentification import tsOutlierIdentification
from pyimsl.stat.writeMatrix import writeMatrix

time_series = array([
    41.6699982, 41.6699982, 42.0752144, 42.6123962, 43.6161919, 42.1932831,
    43.1055450, 44.3518715, 45.3961258, 45.0790215, 41.8874397, 40.2159805,
    40.2447319, 39.6208458, 38.6873589, 37.9272423, 36.8718872, 36.8310852,
    37.4524879, 37.3440933, 37.9861374, 40.3810501, 41.3464622, 42.6495285,
    42.6096764, 40.3134537, 39.7971268, 41.5401535, 40.7160759, 41.0363541,
    41.8171883, 42.4190292, 43.0318832, 43.9968109, 44.0419617, 44.3225212,
    44.6082611, 43.2199631, 42.0419197, 41.9679718, 42.4926224, 43.2091255,
    43.2512283, 41.2301674, 40.1057358, 40.4510574, 41.5329170, 41.5678177,
    43.0090141, 42.1592140, 39.9234505, 38.8394127, 40.4319878, 40.8679352,
    41.4551926, 41.9756317, 43.9878922, 46.5736389, 45.5939293, 42.4487762,
    41.5325394, 42.8830910, 44.5771217, 45.8541985, 46.8249474, 47.5686378,
    46.6700745, 45.4120026, 43.2305107, 42.7635345, 43.7112923, 42.0768661,
    41.1835632, 40.3352280, 37.9761467, 35.9550056, 36.3212509, 36.9925880,
    37.2625008, 37.0040665, 38.5232544, 39.4119797, 41.8316803, 43.7091446,
    42.9381447, 42.1066780, 40.3771248, 38.6518707, 37.0550499, 36.9447708,
    38.1017685, 39.4727097, 39.8670387, 39.3820763, 38.2180786, 37.7543488,
    37.7265244, 38.0290642, 37.5531158, 37.4685936, 39.8233147, 42.0480766,
    42.4053535, 43.0117416, 44.1289330, 45.0393829, 45.1114540, 45.0086479,
    44.6560631, 45.0278931, 46.7830849, 48.7649765, 47.7991905, 46.5339661,
    43.3679199, 41.6420822, 41.2694893, 41.5959740, 43.5330009, 43.3643608,
    42.1471291, 42.5552788, 42.4521446, 41.7629128, 39.9476891, 38.3217010,
    40.5318718, 42.8811569, 44.4796944, 44.6887932, 43.1670265, 41.2226143,
    41.8330154, 44.3721924, 45.2697029, 44.4174194, 43.5068550, 44.9793015,
    45.0585403, 43.2746620, 40.3317070, 40.3880501, 40.2627106, 39.6230278,
    41.0305252, 40.9262009, 40.8326912, 41.7084885, 42.9038048, 45.8650513,
    46.5231590, 47.9916115, 47.8463135, 46.5921936, 45.8854408, 45.9130440,
    45.7450371, 46.2964249, 44.9394569, 45.8141251, 47.5284042, 48.5527802,
    48.3950577, 47.8753052, 45.8880005, 45.7086983, 44.6174774, 43.5567932,
    44.5891113, 43.1778679, 40.9405632, 40.6206894, 41.3330421, 42.2759552,
    42.4744949, 43.0719833, 44.2178459, 43.8956337, 44.1033440, 45.6241455,
    45.3724861, 44.9167595, 45.9180603, 46.9077835, 46.1666603, 46.6013489,
    46.6592331, 46.7291603, 47.1908340, 45.9784355, 45.1215782, 45.6791115,
    46.7379875, 47.3036957, 45.9968834, 44.4669495, 45.7734680, 44.6315041,
    42.9911766, 46.3842583, 43.7214432, 43.5276833, 41.3946495, 39.7013168,
    39.1033401, 38.5292892, 41.0096245, 43.4535828, 44.6525154, 45.5725899,
    46.2815285, 45.2766647, 45.3481712, 45.5039482, 45.6745682, 44.0144806,
    42.9305000, 43.6785469, 42.2500534, 40.0007210, 40.4477005, 41.4432716,
    42.0058670, 42.9357758, 45.6758842, 46.8809929, 46.8601494, 47.0449791,
    46.5420647, 46.8939934, 46.2963371, 43.5479164, 41.3864059, 41.4046364,
    42.3037987, 43.6223717, 45.8602371, 47.3016396, 46.8632469, 45.4651413,
    45.6275482, 44.9968376, 42.7558670, 42.0218239, 41.9883728, 42.2571678,
    44.3708687, 45.7483635, 44.8832512, 44.7945862, 44.8922577, 44.7409401,
    45.1726494, 45.5686874, 45.9946709, 47.3151054, 48.0654068, 46.4817467,
    42.8618279, 42.4550323, 42.5791168, 43.4230957, 44.7787971, 43.8317108,
    43.6481781, 42.4183960, 41.8426285, 43.3475227, 44.4749908, 46.3498306,
    47.8599319, 46.2449913, 43.6044006, 42.4563484, 41.2715340, 39.8492508,
    39.9997292, 41.4410820, 42.9388237, 42.5687332, 42.6384087, 41.7088661,
    43.9399033, 45.4284401, 44.4558411, 45.1761856, 45.3489113, 45.1892662,
    46.3754730, 45.6082802])

n_obs = 280
parameters = []
result = []
forecast = []
outfree_forecast = empty(0)
omega = []
residual = []
res_sigma = []
aic = []
delta = 0.7
series = empty((280, 2))
outlier_stat = empty(0)
num_outliers = []
n_predict = 10
model = empty(4)
forecast_table = empty((10, 4))

model[0] = 2
model[1] = 1
model[2] = 1
model[3] = 0

result = tsOutlierIdentification(model, time_series,
                                 relativeError=1.0e-5,
                                 numOutliers=num_outliers,
                                 residual=residual,
                                 outlierStatistics=outlier_stat,
                                 omegaWeights=omega,
                                 armaParam=parameters,
                                 residualSigma=res_sigma,
                                 aic=aic)

print("ARMA parameters:")
for i in range(0, int(model[0] + model[1] + 1)):
    print("%d\t\t%lf" % (i, parameters[i]))
print("\nNumber of outliers: %d" % num_outliers[0])
print("Outlier statistics:")
print("Time point\t\tOutlier type")
for i in range(0, num_outliers[0]):
    print("%d\t\t%d" % (outlier_stat[i, 0], outlier_stat[i, 1]))
print("\nRSE: %lf" % res_sigma[0])
print("AIC: %lf" % aic[0])
for i in range(0, n_obs):
    series[i, 0] = time_series[i]
    series[i, 1] = residual[i]

forecast = tsOutlierForecast(series,
                             outlier_stat, omega, delta,
                             model, parameters, n_predict,
                             outFreeForecast=outfree_forecast)

for i in range(0, n_predict):
    forecast_table[i, 0] = time_series[n_obs + i]
    forecast_table[i, 1] = forecast[i, 0]
    forecast_table[i, 2] = forecast[i, 1]
    forecast_table[i, 3] = forecast[i, 2]
writeMatrix("\t* * * Forecast Table for outlier "
            "contaminated series * * *\n\nOrig. Series"
            "\tforecast\tprob. limits\tpsi weights\n",
            forecast_table,
            writeFormat="%11.4f")

for i in range(0, n_predict):
    forecast_table[i, 0] = time_series[n_obs + i] - 2.5
    forecast_table[i, 1] = outfree_forecast[i, 0]
    forecast_table[i, 2] = outfree_forecast[i, 1]
    forecast_table[i, 3] = outfree_forecast[i, 2]
writeMatrix("\t* * * Forecast Table for outlier free "
            "series * * *\n\nOutlier free series\tforecast"
            "\tprob. limits\tpsi weights\n",
            forecast_table,
            writeFormat="%11.4f")

Output

ARMA parameters:
0		8.895747
1		0.911444
2		-0.117395
3		-0.565522

Number of outliers: 2
Outlier statistics:
Time point		Outlier type
150		2
200		1

RSE: 1.014127
AIC: 1376.105161
 
	* * * Forecast Table for outlier contaminated series * * *
 
      Orig. Series	forecast	prob. limits	psi weights

                1            2            3            4
   1      42.6384      43.6867       1.9726       1.4770
   2      41.7089      43.8661       3.5184       1.2288
   3      43.9399      44.0968       4.2726       0.9466
   4      45.4284      44.2860       4.6627       0.7185
   5      44.4558      44.4314       4.8734       0.5437
   6      45.1762      44.5417       4.9900       0.4112
   7      45.3489      44.6251       5.0555       0.3110
   8      45.1893      44.6882       5.0926       0.2352
   9      46.3755      44.7360       5.1137       0.1778
  10      45.6083      44.7721       5.1257       0.1345
 
  	* * * Forecast Table for outlier free series * * *
 
Outlier free series	forecast	prob. limits	psi weights

              1            2            3            4
 1      40.1384      41.9962       1.9726       1.4770
 2      39.2089      42.1756       3.5184       1.2288
 3      41.4399      42.4063       4.2726       0.9466
 4      42.9284      42.5955       4.6627       0.7185
 5      41.9558      42.7409       4.8734       0.5437
 6      42.6762      42.8511       4.9900       0.4112
 7      42.8489      42.9346       5.0555       0.3110
 8      42.6893      42.9977       5.0926       0.2352
 9      43.8755      43.0454       5.1137       0.1778
10      43.1083      43.0815       5.1257       0.1345