seasonalFit

Estimates the optimum seasonality parameters for a time series using an autoregressive model, AR(p), to represent the time series.

Synopsis

seasonalFit(z, maxLag, sInitial)

Required Arguments

float z[] (Input)An array of length nObs containing the time series. No missing values in the series are allowed.

int maxLag (Input)
The maximum lag allowed when fitting an AR(p) model.
int sInitial[[]] (Input)
Array of dimension nSInitial by nDifferences containing the seasonal differences to test. All values of sInitial must be greater than or equal to one.

Return Value

An array of length nObs or nObs-lost containing the optimum seasonally adjusted, autoregressive series. The first lost observations in this series are set to NaN, missing values. The seasonal adjustment is done by selecting optimum values for \(d_1,\ldots,d_m\), \(s_1,\ldots,s_m\) (m=nDifferences) and \(p\) in the AR model:

\[\phi_p(B) \left(\mathit{\Delta}_{s_1}^{d_1} \mathit{\Delta}_{s_2}^{d_2} \cdots \mathit{\Delta}_{s_m}^{d_m} Z_t - \mu\right) = a_t\]

where \(\left\{ Z_t \right\}\) is the original time series, \(B\) is the backward shift operator defined by \(B^k Z_t=Z_{t-k}\), \(k\geq 0\), \(a_t\) is Gaussian white noise with \(E \left[ a_t \right]=0\) and \(\mathit{VAR} \left[ a_t \right]=\sigma^2\), \(\phi_p(B)=1-\phi_1 B-\phi_2 B^2-\cdots-\phi_p B^p,0\leq p\leq\mathtt{maxlag}\) , \(\mathit{\Delta}_s^d=\left( 1-B^s \right)^d\), with \(s>0,d\geq 0\), and \(\mu\) is a centering parameter for the differenced series.

NOTE that \(\mathit{\Delta}_s^0=1\), the identity operator, i.e., \(\mathit{\Delta}_s^0 Y_t=Y_t\).

If an error occurred, then None is returned.

Optional Arguments

dInitial, int[] (Input)

An array of dimension nDInitial by nDifferences containing the candidate values for d[], from which the optimum is being selected. All candidate values in dInitial[] must be non-negative and nDInitial ≥ 1.

Default: dInitial an array of length nDifferences filled with ones.

setFirstToNan (Input)

or

excludeFirst (Input)

If excludeFirst is specified, the first lost values are excluded from w due to differencing. The differenced series w is of length nObs–lost. If setFirstToNan is specified, the first lost observations are set to NaN (Not a Number).

Default: setFirstToNan.

center, int (Input)

If supplied, center controls the method used to center the differenced series. If center=0 then the series is not centered. If center=1, the mean of the series is used to center the data, and if center=2, the median is used.

Default: center=1.

lost (Output)
The number of observations lost due to differencing the time series. This is also equal to the number of NaN values that appear in the first lost locations of the returned seasonally adjusted series.
bestPeriods (Output)
An array of length m=nDifferences containing the optimum values for the seasonal adjustment parameters \(s_1,s_2,\ldots,s_m\) selected from the list of candidates contained in sInitial[].
bestOrders (Output)
An array of length m=nDifferences containing the optimum values for the seasonal adjustment parameters \(d_1,d_2,\ldots,d_m\) selected from the list of candidates contained in dInitial[].
arOrder (Output)
The optimum value for the autoregressive lag.
aic (Output)
Akaike’s Information Criterion (AIC) for the optimum seasonally adjusted model.

Description

Many time series contain seasonal trends and cycles that can be modeled by first differencing the series. For example, if the correlation is strong from one period to the next, the series might be differenced by a lag of 1. Instead of fitting a model to the series \(Z_t\), the model is fitted to the transformed series: \(W_t=Z_t-Z_{t-1}\). Higher order lags or differences are warranted if the series has a cycle every 4 or 13 weeks.

Function seasonalFit does not center the original series. If center is specified with either center =1 or center =2, then the differenced series, \(W_t\), is centered before determination of minimum AIC and optimum lag. For every combination of rows in sInitial and dInitial, the series \(Z_t\) is converted to the seasonally adjusted series using the following computation

\[\begin{split}W_t(s,d) = \mathit{\Delta}_{s_1}^{d_1} \mathit{\Delta}_{s_2}^{d_2} \cdots \mathit{\Delta}_{s_m}^{d_m} Z_t = \prod_{i=1}^{m} \left(1 - B^{s_i}\right)^{d_i} Z_t = \prod_{i=1}^{m} \sum_{j=0}^{d_i} \left( \begin{array}{c} d_i \\ j \end{array} \right) (-1)^jB^{j-s_i}Z_t\end{split}\]

where \(s :=\left( s_1,\ldots,s_m \right)\), \(d :=\left( d_1,\ldots,d_m \right)\) represent specific rows of arrays sInitial and dInitial respectively, and \(m\) =nDifferences.

This transformation of the series \(Z_t\) to \(W_t (s,d)\) is accomplished using function difference. After this transformation,

\[W_t(s,d)\]

is (optionally) centered and a call is made to autoUniAr to automatically determine the optimum lag for an AR(p) representation for \(W_t (s,d)\). This procedure is repeated for every possible combination of rows of sInitial and dInitial. The series with the minimum AIC is identified as the optimum representation and returned.

Example

Consider the Airline Data (Box, Jenkins and Reinsel 1994, p. 547) consisting of the monthly total number of international airline passengers from January 1949 through December 1960. Function seasonalFit is used to compute the optimum seasonality representation of the adjusted series

\[W_t(s,d) = \mathit{\Delta}_{s_1}^{d_1} \mathit{\Delta}_{s_2}^{d_2} Z_t = \left(1 - B^{s_1}\right)^{d_1} \left(1 - B^{s_2}\right)^{d_2} Z_t,\]

where

\[s = (1,1)\]

or

\[s = (1,12)\]

and

\[d = (1,1).\]

As differenced series with minimum AIC,

\[W_t = \mathit{\Delta}_1^1 \mathit{\Delta}_{12}^1 Z_t = \left(Z_t - Z_{t-12}\right) - \left(Z_{t-1} - Z_{t-13}\right),\]

is identified.

from __future__ import print_function
from numpy import *
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.seasonalFit import seasonalFit

maxlag = 10
nobs = 144
n_differences = 2
n_s_initial = 2
nlost = empty(0)
npar = empty(0)
aic = empty(0)
s_init = ((1, 1), (1, 12))
s = empty(0)
d = empty(0)
z = dataSets(4)

difference = seasonalFit(z.flat, maxlag, s_init,
                         lost=nlost,
                         bestPeriods=s,
                         bestOrders=d,
                         aic=aic,
                         arOrder=npar)

print("nlost = %d" % nlost[0])
print("s = (%d, %d)" % (s[0], s[1]))
print("d = (%d, %d)" % (d[0], d[1]))
print("Order of optimum AR process: %d" % npar[0])
print("aic = %lf" % aic[0])
print("\ni\tz[i]\t\tdifference[i]")
for i in range(0, nobs):
    print("%d\t%f\t%f" % (i, z[i], difference[i]))

Output

nlost = 13
s = (1, 12)
d = (1, 1)
Order of optimum AR process: 1
aic = 949.780371

i	z[i]		difference[i]
0	112.000000	nan
1	118.000000	nan
2	132.000000	nan
3	129.000000	nan
4	121.000000	nan
5	135.000000	nan
6	148.000000	nan
7	148.000000	nan
8	136.000000	nan
9	119.000000	nan
10	104.000000	nan
11	118.000000	nan
12	115.000000	nan
13	126.000000	5.000000
14	141.000000	1.000000
15	135.000000	-3.000000
16	125.000000	-2.000000
17	149.000000	10.000000
18	170.000000	8.000000
19	170.000000	0.000000
20	158.000000	0.000000
21	133.000000	-8.000000
22	114.000000	-4.000000
23	140.000000	12.000000
24	145.000000	8.000000
25	150.000000	-6.000000
26	178.000000	13.000000
27	163.000000	-9.000000
28	172.000000	19.000000
29	178.000000	-18.000000
30	199.000000	0.000000
31	199.000000	0.000000
32	184.000000	-3.000000
33	162.000000	3.000000
34	146.000000	3.000000
35	166.000000	-6.000000
36	171.000000	0.000000
37	180.000000	4.000000
38	193.000000	-15.000000
39	181.000000	3.000000
40	183.000000	-7.000000
41	218.000000	29.000000
42	230.000000	-9.000000
43	242.000000	12.000000
44	209.000000	-18.000000
45	191.000000	4.000000
46	172.000000	-3.000000
47	194.000000	2.000000
48	196.000000	-3.000000
49	196.000000	-9.000000
50	236.000000	27.000000
51	235.000000	11.000000
52	229.000000	-8.000000
53	243.000000	-21.000000
54	264.000000	9.000000
55	272.000000	-4.000000
56	237.000000	-2.000000
57	211.000000	-8.000000
58	180.000000	-12.000000
59	201.000000	-1.000000
60	204.000000	1.000000
61	188.000000	-16.000000
62	235.000000	7.000000
63	227.000000	-7.000000
64	234.000000	13.000000
65	264.000000	16.000000
66	302.000000	17.000000
67	293.000000	-17.000000
68	259.000000	1.000000
69	229.000000	-4.000000
70	203.000000	5.000000
71	229.000000	5.000000
72	242.000000	10.000000
73	233.000000	7.000000
74	267.000000	-13.000000
75	269.000000	10.000000
76	270.000000	-6.000000
77	315.000000	15.000000
78	364.000000	11.000000
79	347.000000	-8.000000
80	312.000000	-1.000000
81	274.000000	-8.000000
82	237.000000	-11.000000
83	278.000000	15.000000
84	284.000000	-7.000000
85	277.000000	2.000000
86	317.000000	6.000000
87	313.000000	-6.000000
88	318.000000	4.000000
89	374.000000	11.000000
90	413.000000	-10.000000
91	405.000000	9.000000
92	355.000000	-15.000000
93	306.000000	-11.000000
94	271.000000	2.000000
95	306.000000	-6.000000
96	315.000000	3.000000
97	301.000000	-7.000000
98	356.000000	15.000000
99	348.000000	-4.000000
100	355.000000	2.000000
101	422.000000	11.000000
102	465.000000	4.000000
103	467.000000	10.000000
104	404.000000	-13.000000
105	347.000000	-8.000000
106	305.000000	-7.000000
107	336.000000	-4.000000
108	340.000000	-5.000000
109	318.000000	-8.000000
110	362.000000	-11.000000
111	348.000000	-6.000000
112	363.000000	8.000000
113	435.000000	5.000000
114	491.000000	13.000000
115	505.000000	12.000000
116	404.000000	-38.000000
117	359.000000	12.000000
118	310.000000	-7.000000
119	337.000000	-4.000000
120	360.000000	19.000000
121	342.000000	4.000000
122	406.000000	20.000000
123	396.000000	4.000000
124	420.000000	9.000000
125	472.000000	-20.000000
126	548.000000	20.000000
127	559.000000	-3.000000
128	463.000000	5.000000
129	407.000000	-11.000000
130	362.000000	4.000000
131	405.000000	16.000000
132	417.000000	-11.000000
133	391.000000	-8.000000
134	419.000000	-36.000000
135	461.000000	52.000000
136	472.000000	-13.000000
137	535.000000	11.000000
138	622.000000	11.000000
139	606.000000	-27.000000
140	508.000000	-2.000000
141	461.000000	9.000000
142	390.000000	-26.000000
143	432.000000	-1.000000