autoParm¶
Estimates structural breaks in non-stationary univariate time series.
Synopsis¶
autoParm (y, npcs)
Required Arguments¶
- float
y[]
(Input) - An array of length
nObs
containing the time series. - int
npcs
(Input/Output) - The number of requested/estimated pieces or segments of the time series.
npcs
is considered input only whenarModel
is provided.
Return Value¶
An array (arModel
) of length npcs
× 2 containing the break points
and AR orders for the derived model. If arModel
is used, the return
value is None
.
Column
Index |
Description |
---|---|
0 | Structural break points \(\tau_j\) |
1 | AR order ( \(p_j\) ) for each segment |
Optional Arguments¶
maxArOrder
, int (Input)Maximum order to consider for each AR model.
Default:
maxArOrder
= 20.method
, int (Input)Method of estimation.
method
Method Used 0 Yule –Walker 1 Least Squares 2 Burg Default:
method
= 0.modelSelectionCriterion
, int (Input)Selection criterion.
modelSelectionCriterion
Criterion Used 0 Minimum Description Length (MDL) 1 Akaike’s Information Criterion (AIC) Default:
modelSelectionCriterion
= 0.maximumLikelihood
, int (Input)Likelihood computation method.
maximumLikelihood
Computation Method 0 Exact 1 Approximate Default:
maximumLikelihood
= 0.arModel
, int[]
(Input)- A user specified array of length
npcs
× 2 containing the break points and AR orders. When this argument is used, only the AR parameters and quality of the fit are determined.
Column
Index |
Description |
---|---|
0 | Structural break points \(\tau_j\) |
1 | AR order ( \(p_j\) ) for each segment |
t_print
, int (Input)Printing option.
t_print
Action 0 No printing 1 Prints final results only 2 Prints intermediate and final results Default:
t_print
= 0.randomSeed
, int (Input)Seed of the random number generator. For the same data and parameter settings,
autoParm
will return the same results each time if a seed is set. IfrandomSeed
= 0, the system clock will be used to generate a seed. The result will be nondeterministic.Default:
randomSeed
= 0.
Note: The following input arguments are for setting up and running the embedded Genetic Algorithm. In most situations, the default values should be used for these arguments. Users may wish to change some or all for testing or research purposes. |
probDistribution
, float[]
(Input)Array of length
maxArOrder
+ 1 giving the probability distribution over the AR order variable p = 0,…,maxArOrder
. i = 0,…,maxArOrder
is used to randomly assign an AR order to breakpoint position for a given chromosome.probDistribution
[i] > = 0 and if ΣprobDistribution
is not equal to 1, the values will be normalized, i.e.,probDistribution
[i] =probDistribution
[i]/ ΣprobDistribution
.Default:
probDistribution
[i] = 1/(maxArOrder
+ 1) for all i.minObservations
, int[]
(Input)Array of length
maxArOrder
+ 1 containing minimum number of observations required for valid estimates of AR model with order p = 0, …,maxArOrder
.Default:
minObservations
[p] = 2 ×(number of parameters) + 2 = 2 × (p + 2) + 2.gaParameters
, float[]
(Input)- Array of length 4 containing parameters that control the behavior of the genetic algorithm. These values should be strictly greater than zero and less than one to avoid unexpected results.
Index |
Behavior |
---|---|
0 | Probability used to set initial break points in a
chromosome. Default: min (minObservations ) /
nObs . |
1 | Probability of crossover used to decide between a crossover and a mutation. Default: 1 – min ( |
2 | In the mutation operation, probability an AR(p) model is enforced at the current position. Default: 0.4. |
3 | In the mutation operation, probability a break point is disallowed at the current position. Default: 0.3. |
Note: gaParameters[2] and gaParameters[3] must be valid
probabilities and their sum must be between 0 and 1. 1 –
gaParameters[2] – gaParameters[3] is the probability that the
chromosome j inherits the parent’s chromosome j. |
island
, int[]
(Input)- Array of length 5 containing the migration policy parameters.
Index | Policy |
---|---|
0 | Number of islands. Default: 40. |
1 | Number of generations that pass before migration occurs. Note
that the convergence of the algorithm is revised whether
migrations take place or not (see argument Default: 5. |
2 | Number of subjects that migrate at each migration event. Default: 2. |
3 | Population size (number of chromosomes) per island. Default: 40. |
4 | Migration flag. If 1, migration is performed. Default: 1. |
maxMigrations
, int (Input)Maximum number of times that migrations may take place before the function is stopped if convergence has not occurred.
Default:
maxMigrations
= 20.stopIterations
, int (Input)Number of iterations. The function will declare convergence and stop the iterations if the criterion value (MDL/AIC) has not changed after
stopIterations
consecutive migrations. Otherwise, the algorithm will declare non-convergence and stop aftermaxMigrations
migrations have taken place. See alsomaxMigrations
andisland[1]
. Note that logically,stopIterations
<maxMigrations
.Default:
stopIterations
= 10.selectionCriterionValue
, (Output)- Final value of the selection criterion.
arFit
(Output)An array of length
npcs
×maxArOrder
containing the AR coefficient estimates φ for each segment.arFit[imaxArOrder+j]
is the j‑th coefficient for segment i where i = 0, …,npcs
- 1 and j = 0, …,arModel[i
×2 + 1].Note that the intercept is not reported.
arStats
(Output)- An array of length
npcs
× 2.
Column Index | Description |
---|---|
0 | Likelihood values for each of the fitted AR models |
1 | Residual variances for each of the fitted AR models |
Description¶
Function autoParm
estimates the structural breaks of a non-stationary
time series using, with permission from the authors, the method developed by
Davis et al (2006). autoParm
estimates a partition of the time index and
models the time series in each segment as a separate auto-regressive
(AR(p)) process. The function returns the estimated breakpoints, the
estimated AR(p) models, and supporting statistics.
For the observed time series \(Y_t,t=1,\ldots n\) the problem is to find m, the number of breaks, their locations, \(1<\tau_1<\tau_2<\cdots<\tau_m<n\), and \(p_j\), \(j=1,\ldots,m+1\), the order of the AR process in which the j‑th segment is modeled. That is, \(Y_t=X_{t,j}\) for \(\tau_{j-1}\leq t<\tau_j\) (for convenience, \(\tau_0 :=1\) and \(\tau_{m+1} :=n+1\)), where \(\{X_{t,j}\}\) is an AR process of order \(p_j\)
and \(\varepsilon_t\), the noise sequence, is independent and
identically distributed with mean 0 and variance 1. Note that a series with
m breaks will have m + 1 segments (m + 1 = npcs
).
The vector \(\left\{ m,\tau_1,\ldots,\tau_m,p_1,\ldots,p_{m+1} \right\}\)
completely specifies a piecewise AR model. To estimate this vector
autoParm
optimizes, with respect to this vector, one of two selection
criteria: the first is a Minimum Description Length (MDL) criterion, and the
second is the Akaike’s Information Criterion (AIC). The MDL is defined as
while the AIC criterion is given by
where, given a candidate value of the vector \(\left\{ m,\tau_1,\ldots,\tau_m,p_1,\ldots,p_{m+1} \right\}\), \(L\) is the likelihood of the fitted piecewise AR model evaluated at the parameter estimates,
The parameters \(\left\{ \gamma_j,\phi_{1,1},\ldots\phi _{1,p_j},\sigma_j^2 \right\}\) of the j‑th AR segment are estimated by the choice of one of three estimation methods: Yule‑Walker, Burg, or Least Squares.
For simplicity, assume the mean of each series \(\{X_{t,j}\}\) is 0 and that the errors are Gaussian. Then, the piecewise AR model has Gaussian likelihood
where \(V_j\) is the variance-covariance of the j‑th AR segment (of order \(p_j\) ) and \(Y_j\) is the vector of observations of the j‑th segment, i.e., \(Y_j=\left( *Y_{\tau_j},Y_{\tau_j+1},\ldots,Y_{\tau_{j+1}-1} \right)^T\).
To find the minimizer \(\left\{ m,\tau_j,p_j \right\}^*\) of either MDL
or AIC, autoParm
employs a Genetic Algorithm with islands, migration,
cross-over and mutations. See Davis et.al. (2006) for further details.
Remarks¶
Function autoParm
approximates locally stationary time series by
independent auto‑regressive processes. Experimental results suggest that
autoParm
gives reasonable estimates of the structural breaks of a given
time series, even if the segment series are not autoregressive. Also, based
on experimental results, MDL gives better results than AIC as a selection
criterion.
If randomSeed
is set out of range, an informational (error) message is
issued indicating that the seed will be reset to 123457. Also if
maxMigrations
migrations are reached in the genetic algorithm before the
selection criterion value converges an informational message is issued
suggesting the increase of maxMigrations
.
Example¶
The examples below illustrate different scenarios using autoParm
. The
example series used in each case is the airline demand data (Box, Jenkins
and Reinsel, 1994), which gives monthly total demand for the period January
1949 through December 1960. Each scenario sets the optional argument,
randomSeed
= 123457.
from __future__ import print_function
from numpy import *
from pyimsl.stat.autoParm import autoParm
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.difference import difference
from pyimsl.stat.writeMatrix import writeMatrix
from pyimsl.stat.free import free
n = 144
npcs = [0]
iseed = 123457
arp2 = [[0, 2], [59, 1]]
sc = []
arfit = []
arstat = []
# get data
y = dataSets(4)
y = y.reshape(n)
# Example 1: Use default values
print("Example 1: Use defaults")
arp1 = autoParm(y, npcs,
t_print=1,
randomSeed=iseed,
selectionCriterionValue=sc,
arFit=arfit,
arStats=arstat)
free(arp1)
# Example 2: differenced series set period for the difference.
# iper is in years for this data set
print("=" * 50)
print("Example 2: differenced series")
iper = [1]
# set the order for the difference.
iord = [1]
# get differenced series dx
nlost = []
sc2 = []
arfit2 = []
arstat2 = []
dx = difference(y, iper,
orders=iord,
lost=nlost)
arp_2 = autoParm(dx[nlost[0]:], npcs,
t_print=1,
randomSeed=iseed,
selectionCriterionValue=sc2,
arFit=arfit2,
arStats=arstat2)
free(arp_2)
# Example 3: original series, lower order allowed
# lower maximum AR order
print("=" * 50)
print("Example 3: original series, lower order allowed")
sc3 = []
arfit3 = []
arstat3 = []
maxarorder = 5
arp3 = autoParm(y, npcs,
maxArOrder=maxarorder,
t_print=1,
randomSeed=iseed,
selectionCriterionValue=sc3,
arFit=arfit3,
arStats=arstat3)
free(arp3)
# Example 4: differenced series, lower order allowed
print("=" * 50)
print("Example 4: differenced series, lower order allowed")
sc4 = []
arfit4 = []
arstat4 = []
maxarorder = 5
arp4 = autoParm(dx[nlost[0]:], npcs,
maxArOrder=maxarorder,
t_print=1,
randomSeed=iseed,
selectionCriterionValue=sc4,
arFit=arfit4,
arStats=arstat4)
free(arp4)
# Example 5: original series, force fit the segments
# Fit only at the break points
print("=" * 50)
print("Example 5: original series, force fit the segments")
sc5 = []
arfit5 = []
arstat5 = []
npcs = [2]
arpnull = autoParm(y, npcs,
arModel=arp2,
t_print=1,
randomSeed=iseed,
selectionCriterionValue=sc5,
arFit=arfit5,
arStats=arstat5)
Output¶
Example 1: Use defaults
==================================================
Example 2: differenced series
==================================================
Example 3: original series, lower order allowed
==================================================
Example 4: differenced series, lower order allowed
==================================================
Example 5: original series, force fit the segments
============== final results ===============
number of pieces: 2
selection criteria value: 685.899716
total time: 0.630000 conv: 1
==================== final model estimates =====================
break point order est. coeff. likelihood resid. var
arp[ 0] arp[ 1] arfit[ 0- 1] arstat[ 0] arstat[ 1]
0 2 0.90785
-0.17078
188.602 344.671
arp[ 2] arp[ 3] arfit[20-32] arstat[ 2] arstat[ 3]
43 13 1.03700
-0.07801
-0.03891
-0.03452
0.11961
-0.12852
0.01990
-0.04886
0.08090
-0.13118
0.22122
0.53863
-0.61515
486.665 691.475
============== final results ===============
number of pieces: 1
selection criteria value: 624.283501
total time: 0.600000 conv: 1
==================== final model estimates =====================
break point order est. coeff. likelihood resid. var
arp[ 0] arp[ 1] arfit[ 0-11] arstat[ 0] arstat[ 1]
0 12 -0.02842
-0.22436
-0.16846
-0.24267
-0.10573
-0.22429
-0.12126
-0.26446
-0.07087
-0.24327
-0.07136
0.57129
619.321 297.352
============== final results ===============
number of pieces: 2
selection criteria value: 705.296593
total time: 0.540000 conv: 1
==================== final model estimates =====================
break point order est. coeff. likelihood resid. var
arp[ 0] arp[ 1] arfit[ 0- 0] arstat[ 0] arstat[ 1]
0 1 0.89533
270.393 333.564
arp[ 2] arp[ 3] arfit[ 5- 6] arstat[ 2] arstat[ 3]
62 2 1.19788
-0.35922
424.270 1632.339
============== final results ===============
number of pieces: 2
selection criteria value: 698.359495
total time: 0.530000 conv: 1
==================== final model estimates =====================
break point order est. coeff. likelihood resid. var
arp[ 0] arp[ 1] arfit[ 0- 0] arstat[ 0] arstat[ 1]
0 0 -------
335.565 357.389
arp[ 2] arp[ 3] arfit[ 5- 5] arstat[ 2] arstat[ 3]
76 1 0.33310
352.175 1786.345
============== final results ===============
number of pieces: 2
selection criteria value: 712.521
==================== final model estimates =====================
break point order est. coeff. likelihood resid. var
arp[ 0] arp[ 1] arfit[ 0- 1] arstat[ 0] arstat[ 1]
0 2 1.12156
-0.24876
258.192 313.889
arp[ 2] arp[ 3] arfit[20-20] arstat[ 2] arstat[ 3]
59 1 0.88605
443.696 1937.633
Warning Errors¶
IMSLS_MAX_MIGRATIONS_EXCEEDED |
“maxMigrations ”
migrations or
“stopIterations ”
iterations were reached in the
genetic algorithm before the
selection criterion value
converged. Try increasing
“maxMigrations ”,
“stopIterations ”. |