Chapter 8: Time Series and Forecasting

ts_outlier_identification

Detects and determines outliers and simultaneously estimates the model parameters in a time series whose underlying outlier free series follows a general seasonal or nonseasonal ARMA model.

Synopsis

#include <imsls.h>

float  *imsls_f_ts_outlier_identification (int n_obs, int model[],
float
w[],…,0)

The type double function is imsls_d_ts_outlier_identification.

Required Arguments

int n_obs (Input)
Number of observations in the time series.

int model[] (Input)
Vector of length 4 containing the numbers p, q, s, d of the ARIMA model the outlier free series is following.

float w[] (Input)
An array of length n_obs containing the time series.

Return Value

Pointer to an array of length n_obs containing the outlier free time series.
If an error occurred, NULL is returned.

Synopsis with Optional Arguments

#include <imsls.h>

float  *imsls_f_ts_outlier_identification (int n_obs,
int model[], float w[],
IMSLS_RETURN_USER, float x[],
IMSLS_DELTA, float delta,
IMSLS_CRITICAL, float critical,
IMSLS_EPSILON, float epsilon,
IMSLS_RELATIVE_ERROR, float relative_error,
IMSLS_RESIDUAL, float **residual,
IMSLS_RESIDUAL_USER, float residual[],
IMSLS_RESIDUAL_SIGMA, float *res_sigma,
IMSLS_NUM_OUTLIERS, int *num_outliers,
IMSLS_OUTLIER_STATISTICS, int **outlier_stat,
IMSLS_OUTLIER_STATISTICS_USER, int outlier_stat[],
IMSLS_TAU_STATISTICS, float **tau_stat,
IMSLS_TAU_STATISTICS_USER, float tau_stat[],
IMSLS_OMEGA_WEIGHTS, float **omega,
IMSLS_OMEGA_WEIGHTS_USER, float omega[],
IMSLS_ARMA_PARAM, float **parameters,
IMSLS_ARMA_PARAM_USER, float parameters[],
IMSLS_AIC, float *aic,
0)

Optional Arguments

IMSLS_RETURN_USER, float x[]  (Output)
A user supplied array of length n_obs containing the outlier free series.

IMSLS_DELTA, float delta (Input)
The dampening effect parameter used in the detection of a Temporary Change Outlier (TC), 0<delta < 1.
Default: delta = 0.7

IMSLS_CRITICAL, float critical  (Input)
Critical value used as a threshold for outlier detection, critical > 0.
Default: critical = 3.0

IMSLS_EPSILON, float epsilon  (Input)
Positive tolerance value controlling the accuracy of parameter estimates during outlier detection.
Default: epsilon = 0.001

IMSLS_RELATIVE_ERROR, float relative_error (Input)
Stopping criterion for the nonlinear equation solver used in function imsls_f_arma.
Default:  relative_error = .

IMSLS_RESIDUAL, float **residual  (Output)
Address of a pointer to an internally allocated array of length n_obs containing the residuals for the outlier free series.

IMSLS_RESIDUAL_USER, float residual[]  (Output)
Storage for array residual is provided by the user. See IMSLS_RESIDUAL.

IMSLS_RESIDUAL_SIGMA, float *res_sigma  (Output)
Residual standard error of the outlier free series.

IMSLS_NUM_OUTLIERS, int *num_outliers  (Output)
The number of outliers detected.

IMSLS_OUTLIER_STATISTICS, int **outlier_stat  (Output)
Address of a pointer to an internally allocated array of length num_outliers ´ 2 containing  outlier statistics.  The first column contains the time at which the outlier was observed (t=1,2,...,n_obs) and the second column contains an identifier indicating the type of outlier observed. 
Outlier types fall into one of five categories:

0

Innovational Outliers (IO)

1

Additive outliers (AO)

2

Level Shift Outliers (LS)

3

Temporary Change Outliers (TC)

4

Unable to Identify (UI).

Use IMSLS_NUM_OUTLIERS to obtain IMSLS_NUM_OUTLIERS, the number of detected outliers.
If num_outliers = 0, NULL is returned.

IMSLS_OUTLIER_STATISTICS_USER, int outlier_stat[]  (Output)
A user allocated array of length n_obs ´ 2 containing outlier statistics in the first num_outliers locations.  See IMSLS_OUTLIER_STATISTICS.
If num_outliers = 0, outlier_stat stays unchanged.

IMSLS_TAU_STATISTICS, float **tau_stat  (Output)
Address of a pointer to an internally allocated array of length num_outliers containing the t value for each detected outlier.
If num_outliers = 0, NULL is returned.

IMSLS_TAU_STATISTICS_USER, float tau_stat[] (Output)
A user allocated array of length n_obs containing the t value for each detected outlier in its first num_outliers locations.
If num_outliers = 0, tau_stat stays unchanged.

IMSLS_OMEGA_WEIGHTS, float **omega (Output)
Address of a pointer to an internally allocated array of length num_outliers containing the computed weights for the detected outliers.
If num_outliers = 0, NULL is returned.

IMSLS_OMEGA_WEIGHTS_USER   float omega[] (Output)
A user allocated array of length n_obs containing the computed weights for the detected outliers in its first num_outliers locations.
If num_outliers = 0, omega stays unchanged.

IMSLS_ARMA_PARAM,   float **parameters (Output)
Address of a pointer to an internally allocated array of length 1+p+q containing the estimated constant, AR and MA parameters.

IMSLS_ARMA_PARAM_USER   float parameters[] (Output)
A user allocated array of length 1+p+q containing the estimated constant, AR and MA parameters.

IMSLS_AIC, float   *aic (Output)
Akaike's information criterion (AIC).

Description

Consider a univariate time series that can be described by the following multiplicative seasonal ARIMA model of order :

Here, ,  .  is the lag operator, ,  is a white noise process, and  denotes the mean of the series .

In general,  is not directly observable due to the influence of outliers. Chen and Liu (1993) distinguish between four types of outliers: innovational outliers (IO), additive outliers (AO), temporary changes (TC)  and level shifts (LS). If an outlier occurs as the last observation of the series, then Chen and Liu's algorithm is unable to determine the outlier's classification. In imsls_f_ts_outlier_identification, such an outlier is called a UI (unable to identify) and is treated as an innovational outlier.

In order to take the effects of multiple outliers occurring at time points  into account, Chen and Liu consider the following model:

Here,  is the observed outlier contaminated series, and  and  denote the magnitude and dynamic pattern of outlier , respectively.   is an indicator function that determines the temporal course of the outlier effect, ,  otherwise. Note that  operates on  via .

The last formula shows that the outlier free series  can be obtained from the original series  by removing all occurring outlier effects:

.

The different types of outliers are charaterized by different values for:

1.   for an innovational outlier,

2.   for an additive outlier,

3.   for a level shift outlier and

4.   for a temporary change outlier.

Function imsls_f_ts_outlier_identification is an implementation of Chen and Liu's algorithm. It determines the coefficients in  and the outlier effects in the model for the observed series jointly in three stages. The magnitude of the outlier effects is determined by least squares estimates. Outlier detection itself is realized by examination of the maximum value of the standardized statistics of the outlier effects. For a detailed description, see Chen and Liu's original paper (1993).

Intermediate and final estimates for the coefficients in  and  are computed by functions imsls_f_arma and imsls_f_max_arma.  If the roots of or  lie on or within the unit circle, then the algorithm stops with an appropriate error message. In this case, different values for p and q should be tried.

Examples

Example 1

This example is based on estimates of the Canadian lynx population. Function imsls_f_ts_outlier_identification is used to fit an ARIMA(2,2,0) model of the form, , Gaussian White noise, to the given series. Function ts_outlier_identification computes parameters  and  and identifies a LS outlier at time point .

 

#include <imsls.h>

#include <stdlib.h>

#include <stdio.h>

int main(){

      float series[114]={

      0.24300E01,0.25060E01,0.27670E01,0.29400E01,0.31690E01,0.34500E01,

      0.35940E01,0.37740E01,0.36950E01,0.34110E01,0.27180E01,0.19910E01,

      0.22650E01,0.24460E01,0.26120E01,0.33590E01,0.34290E01,0.35330E01,

      0.32610E01,0.26120E01,0.21790E01,0.16530E01,0.18320E01,0.23280E01,

      0.27370E01,0.30140E01,0.33280E01,0.34040E01,0.29810E01,0.25570E01,

      0.25760E01,0.23520E01,0.25560E01,0.28640E01,0.32140E01,0.34350E01,

      0.34580E01,0.33260E01,0.28350E01,0.24760E01,0.23730E01,0.23890E01,

      0.27420E01,0.32100E01,0.35200E01,0.38280E01,0.36280E01,0.28370E01,

      0.24060E01,0.26750E01,0.25540E01,0.28940E01,0.32020E01,0.32240E01,

      0.33520E01,0.31540E01,0.28780E01,0.24760E01,0.23030E01,0.23600E01,

      0.26710E01,0.28670E01,0.33100E01,0.34490E01,0.36460E01,0.34000E01,

      0.25900E01,0.18630E01,0.15810E01,0.16900E01,0.17710E01,0.22740E01,

      0.25760E01,0.31110E01,0.36050E01,0.35430E01,0.27690E01,0.20210E01,

      0.21850E01,0.25880E01,0.28800E01,0.31150E01,0.35400E01,0.38450E01,

      0.38000E01,0.35790E01,0.32640E01,0.25380E01,0.25820E01,0.29070E01,

      0.31420E01,0.34330E01,0.35800E01,0.34900E01,0.34750E01,0.35790E01,

      0.28290E01,0.19090E01,0.19030E01,0.20330E01,0.23600E01,0.26010E01,

      0.30540E01,0.33860E01,0.35530E01,0.34680E01,0.31870E01,0.27230E01,

      0.26860E01,0.28210E01,0.30000E01,0.32010E01,0.34240E01,0.35310E01};

 

      int i;

      int model[4];

      int n_obs = 114;

      float *parameters = NULL, *result = NULL;

      float res_sigma, aic;

      int *outlier_stat = NULL;

      int num_outliers;

      model[0] = 2;

      model[1] = 0;

      model[2] = 1;

      model[3] = 2;

      result = imsls_f_ts_outlier_identification(n_obs, model, series,

            IMSLS_CRITICAL, 3.5,

            IMSLS_NUM_OUTLIERS, &num_outliers,

            IMSLS_OUTLIER_STATISTICS, &outlier_stat,

            IMSLS_ARMA_PARAM, &parameters,

            IMSLS_RESIDUAL_SIGMA, &res_sigma,

            IMSLS_AIC, &aic, 0);

 

      printf("ARMA parameters:\n");

      for (i=0; i<=model[0]+model[1]; i++)

            printf("%d\t\t%lf\n", i, parameters[i]);

      printf("\nNumber of outliers: %d\n\n", num_outliers);

      printf("Outlier statistics:\n");

      printf("Time point\tOutlier type\n");

      for (i=0; i<num_outliers; i++)

            printf("   t=%2d  \t   Type=%d\n", outlier_stat[2*i],      

            outlier_stat[2*i+1]);

      printf("\n\n");

            printf("RSE:%lf\n", res_sigma);

      printf("AIC:%lf\n", aic);

      printf("\nExtract from the series:\n\n");

      printf ("time point    original series      outlier free series\n\n");

      for (i=0; i<36; i++)

            printf ("%2d               %6.4f                   %6.4f\n",

            i+1, series[i], result[i]);

}

 

 

Output

ARMA parameters:

0               0.000000

1               0.106820

2               -0.196660

 

Number of outliers: 1

 

Outlier statistics:

Time point      Outlier type

   t=16            Type=2

 

 

RSE:0.319541

AIC:282.917603

 

Extract from the series:

 

time point    original series      outlier free series

 

 1               2.4300                   2.4300

 2               2.5060                   2.5060

 3               2.7670                   2.7670

 4               2.9400                   2.9400

 5               3.1690                   3.1690

 6               3.4500                   3.4500

 7               3.5940                   3.5940

 8               3.7740                   3.7740

 9               3.6950                   3.6950

10               3.4110                   3.4110

11               2.7180                   2.7180

12               1.9910                   1.9910

13               2.2650                   2.2650

14               2.4460                   2.4460

15               2.6120                   2.6120

16               3.3590                   2.6996

17               3.4290                   2.7696

18               3.5330                   2.8736

19               3.2610                   2.6016

20               2.6120                   1.9526

21               2.1790                   1.5196

22               1.6530                   0.9936

23               1.8320                   1.1726

24               2.3280                   1.6686

25               2.7370                   2.0776

26               3.0140                   2.3546

27               3.3280                   2.6686

28               3.4040                   2.7446

29               2.9810                   2.3216

30               2.5570                   1.8976

31               2.5760                   1.9166

32               2.3520                   1.6926

33               2.5560                   1.8966

34               2.8640                   2.2046

35               3.2140                   2.5546

36               3.4350                   2.7756

 

Example 2

This example is an artificial realization of an ARMA(1,1) process via formula  Gaussian white noise, .

An additive outlier with  was added at time point , a temporary change outlier with  was added at time point .

Link to example source

 

#include <imsls.h>

#include <stdlib.h>

#include <stdio.h>

 

int main()

 

  int i, n_obs = 300;

  float parameters_user[300], result_user[300];

  float res_sigma, aic;

  int  outlier_stat[600];

  int num_outliers;

  int outlier_stat_user[300];

  float omega_user[300];

  int model[4];

 

  float series[300]={

    50.0000000,50.2728081,50.6242599,51.0373917,51.9317627,50.3494759,

    51.6597252,52.7004929,53.5499802,53.1673279,50.2373505,49.3373871,

    49.5516472,48.6692696,47.6606636,46.8774185,45.7315445,45.6469727,

    45.9882355,45.5216560,46.0479660,48.1958656,48.6387749,49.9055367,

    49.8077278,47.7858467,47.9386749,49.7691956,48.5425873,49.1239853,

    49.8518791,50.3320694,50.9146347,51.8772049,51.8745689,52.3394470,

    52.7273712,51.4310036,50.6727448,50.8370399,51.2843437,51.8162918,

    51.6933670,49.7038231,49.0189247,49.455703,50.2718010,49.9605980,

    51.3775749,50.2285385,48.2692299,47.6495590,49.2938499,49.1924858,

    49.6449242,50.0446815,51.9972496,54.2576981,52.9835434,50.4193535,

    50.3617897,51.8276901,53.1239929,54.0682144,54.9238319,55.6877632,

    54.8896332,54.0701065,52.2754097,52.2522354,53.1248703,51.1287193,

    50.5003815,49.6504173,47.2453079,45.4555626,45.8449707,45.9765129,

    45.7682228,45.2343674,46.6496811,47.0894432,49.3368340,50.8058052,

    49.9132500,49.5893288,48.2470627,46.9779968,45.6760864,45.7070389,

    46.6158409,47.5303612,47.5630417,47.0389214,46.0352287,45.8161545,

    45.7974396,46.0015373,45.3796463,45.3461685,47.6444016,49.3327446,

    49.3810692,50.2027817,51.4567032,52.3986320,52.5819206,52.7721825,

    52.6919098,53.3274345,55.1345940,56.8962631,55.7791634,55.0616989,

    52.3551178,51.3264084,51.0968323,51.1980476,52.8001442,52.0545082,

    50.8742943,51.5150337,51.2242050,50.5033989,48.7760124,47.4179192,

    49.7319527,51.3320541,52.3918304,52.4140434,51.0845947,49.6485748,

    50.6893463,52.9840813,53.3246994,52.4568024,51.9196091,53.6683121,

    53.4555359,51.7755814,49.2915611,49.8755112,49.4546776,48.6171913,

    49.9643021,49.3766441,49.2551308,50.1021881,51.0769119,55.8328133,

    52.0212708,53.4930801,53.2147255,52.2356453,51.9648819,52.1816330,

    51.9898071,52.5623627,51.0717278,52.2431946,53.6943054,54.3752098,

    54.1492615,53.8523254,52.1093712,52.3982697,51.2405128,50.3018112,

    51.3819618,49.5479546,47.5024452,47.4447708,47.8939056,48.4070015,

    48.2440681,48.7389755,49.7309227,49.1998024,49.5798340,51.1196213,

    50.6288414,50.3971405,51.6084099,52.4564743,51.6443901,52.4080658,

    52.4643364,52.6257210,53.1604691,51.9309731,51.4137230,52.1233368,

    52.9867249,53.3180733,51.9647636,50.7947655,52.3815842,50.8353729,

    49.4136009,52.8355217,52.2234840,51.1392517,48.5245132,46.8700218,

    46.1607285,45.2324257,47.4157829,48.9989090,49.6230736,50.4352913,

    51.1652985,50.2588654,50.7820129,51.0448799,51.2880516,49.6898804,

    49.0288200,49.9338837,48.2214432,46.2103348,46.9550171,47.5595894,

    47.7176018,48.4502945,50.9816895,51.6950073,51.6973495,52.1941261,

    51.8988075,52.5617599,52.0218391,49.5236053,47.9684906,48.2445183,

    48.8275146,49.7176971,51.5649338,52.5627213,52.0182419,50.9688835,

    51.5846901,50.9486771,48.8685837,48.5600624,48.4760094,48.5348396,

    50.4187813,51.2542381,50.1872864,50.4407692,50.6222687,50.4972000,

    51.0036087,51.3367500,51.7368202,53.0463791,53.6261253,52.0728683,

    48.9740753,49.3280830,49.2733917,49.8519020,50.8562126,49.5594254,

    49.6109200,48.3785629,48.0026474,49.4874268,50.1596375,51.8059540,

    53.0288620,51.3321075,49.3114815,48.7999306,47.7201881,46.3433914,

    46.5303612,47.6294632,48.6012459,47.8567657,48.0604057,47.1352806,

    49.5724792,50.5566483,49.4182968,50.5578079,50.6883736,50.6333389,

    51.9766159,51.0595245,49.3751640,46.9667702,47.1658173,47.4411278,

    47.5360374,48.9914742,50.4747620,50.2728043,51.9117165,53.7627792};

 

 

  model[0] = 1;

  model[1] = 1;

  model[2] = 1;

  model[3] = 0;

 

  imsls_f_ts_outlier_identification(n_obs, model, series,

                            IMSLS_NUM_OUTLIERS, &num_outliers,

                            IMSLS_OUTLIER_STATISTICS_USER, outlier_stat_user,

                            IMSLS_OMEGA_WEIGHTS_USER, omega_user,

                            IMSLS_ARMA_PARAM_USER, parameters_user,

                            IMSLS_RETURN_USER, result_user,

                            IMSLS_RESIDUAL_SIGMA, &res_sigma,

                            IMSLS_AIC, &aic,

                            IMSLS_RELATIVE_ERROR, 1.0e-05,

                            0);

                           

   printf("\n");

   printf("ARMA parameters:\n");

   for (i=0; i<=model[0]+model[1]; i++)

      printf("%d\t\t%lf\n", i, parameters_user[i]);

 

   printf("\nNumber of outliers: %d\n\n", num_outliers);

   printf("Outlier statistics:\n");

   printf("Time point\tOutlier type\n");

   for (i=0; i<num_outliers; i++)

     printf("%d\t\t%d\n", outlier_stat_user[2*i], outlier_stat_user[2*i+1]);

 

  printf("\nOmega statistics:\n");

  printf("Time point\tomega\n");

  for (i=0; i<num_outliers; i++)

     printf("%d\t%18.6f\n", outlier_stat_user[2*i], omega_user[i]);

 

  printf("\n");

  printf("RSE:%lf\n", res_sigma);

  printf("AIC:%lf\n\n", aic);

 

return;

}

 

 Output

ARMA parameters:

0               10.828958

1               0.785221

2               -0.496502

 

Number of outliers: 2

 

Outlier statistics:

Time point      Outlier type

150             1

200             3

 

Omega statistics:

Time point      omega

150               4.477870

200               3.381565

 

RSE:1.007223

AIC:1417.044312


Visual Numerics, Inc.
Visual Numerics - Developers of IMSL and PV-WAVE
http://www.vni.com/
PHONE: 713.784.3131
FAX:713.781.9260