JMSL™ Numerical Library 6.1

com.imsl.stat
Class ARMAEstimateMissing

java.lang.Object
  extended by com.imsl.stat.ARMAEstimateMissing
All Implemented Interfaces:
Serializable

public class ARMAEstimateMissing
extends Object
implements Serializable

Estimates missing values in a time series collected with equal spacing. Missing values can be replaced by these estimates prior to fitting a time series using the ARMA class.

Traditional time series analysis as described by Box, Jenkins and Reinsel (1994) requires that the observations be made at equidistant time points $t_0, t_1, \ldots, t_n$, where $t_i = t_0 + i$. When observations are missing, ARMA requires that they be replaced with suitable estimates. Class ARMAEstimateMissing offers four methods for estimating missing values: MEDIAN, CUBIC_SPLINE, AR_1, and AR_P.

Method MEDIAN estimates the missing observations in a gap by the median of the last four time series values before and the first four values after the gap. If not enough values are available before or after the gap then the number is reduced accordingly. This method is very fast and simple, but its use is limited to stationary ergodic series without outliers and level shifts.
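
The windowing rule can be illustrated with a minimal sketch. It assumes that the neighbouring values on both sides are pooled into a single median, which the description suggests but does not state outright. The class MedianGapSketch, the method gapMedian, and the index argument lastBefore are hypothetical names for illustration and are not part of the JMSL API.

```java
/** Illustrative only; this is not the JMSL implementation. */
final class MedianGapSketch {

    /**
     * Median of up to four observed values on each side of a gap.
     * z holds only the observed values; lastBefore is the index in z of the
     * last observation before the gap, so z[lastBefore + 1] is the first
     * observation after it. Pooling both sides is an assumption.
     */
    static double gapMedian(double[] z, int lastBefore) {
        int from = Math.max(0, lastBefore - 3);        // up to 4 values before
        int to = Math.min(z.length, lastBefore + 5);   // up to 4 values after (exclusive)
        double[] window = java.util.Arrays.copyOfRange(z, from, to);
        java.util.Arrays.sort(window);
        int n = window.length;
        return (n % 2 == 1) ? window[n / 2]
                            : 0.5 * (window[n / 2 - 1] + window[n / 2]);
    }

    public static void main(String[] args) {
        double[] z = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        // Gap assumed to lie between z[4] and z[5].
        System.out.println(gapMedian(z, 4));
    }
}
```

When fewer than four values exist on one side, the window simply shrinks, which mirrors the "reduced accordingly" rule in the description.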

Method CUBIC_SPLINE uses a cubic spline interpolation method to estimate missing values. Here the interpolation is again done over the last four time series values before and the first four values after the gap. The missing values are estimated by the resulting interpolant. This method gives smooth transitions across missing values.
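
As a rough illustration of interpolating across a gap, the sketch below fits a natural cubic spline (second derivative zero at both ends) through the observed neighbours and evaluates it at a missing time. The boundary condition is an assumption, since the description does not say which spline variant the library uses, and SplineGapSketch with its methods is a hypothetical helper, not JMSL code.

```java
/** Illustrative natural cubic spline; not the JMSL implementation. */
final class SplineGapSketch {

    /** Second derivatives at the knots, with natural end conditions (assumed). */
    static double[] secondDerivatives(double[] x, double[] y) {
        int n = x.length;
        double[] m = new double[n];       // m[0] and m[n-1] stay 0
        double[] cp = new double[n];      // modified upper diagonal
        double[] dp = new double[n];      // modified right-hand side
        // Forward sweep of the Thomas algorithm over the interior knots.
        for (int i = 1; i <= n - 2; i++) {
            double hPrev = x[i] - x[i - 1];
            double hNext = x[i + 1] - x[i];
            double b = 2.0 * (hPrev + hNext);
            double d = 6.0 * ((y[i + 1] - y[i]) / hNext - (y[i] - y[i - 1]) / hPrev);
            double denom = b - hPrev * cp[i - 1];
            cp[i] = hNext / denom;
            dp[i] = (d - hPrev * dp[i - 1]) / denom;
        }
        // Back substitution.
        for (int i = n - 2; i >= 1; i--) {
            m[i] = dp[i] - cp[i] * m[i + 1];
        }
        return m;
    }

    /** Evaluates the spline at t, where x[0] <= t <= x[n-1] and n >= 2. */
    static double evaluate(double[] x, double[] y, double[] m, double t) {
        int i = 0;
        while (i < x.length - 2 && t > x[i + 1]) i++;   // locate the interval
        double h = x[i + 1] - x[i];
        double a = x[i + 1] - t;
        double b = t - x[i];
        return m[i] * a * a * a / (6.0 * h) + m[i + 1] * b * b * b / (6.0 * h)
             + (y[i] / h - m[i] * h / 6.0) * a
             + (y[i + 1] / h - m[i + 1] * h / 6.0) * b;
    }

    public static void main(String[] args) {
        // Four observed times before and four after a gap at times 5 and 6.
        double[] x = {1, 2, 3, 4, 7, 8, 9, 10};
        double[] y = {1, 4, 9, 16, 49, 64, 81, 100};
        double[] m = secondDerivatives(x, y);
        System.out.println(evaluate(x, y, m, 5.0));
        System.out.println(evaluate(x, y, m, 6.0));
    }
}
```

Because the gap makes the knot spacing uneven, the sketch uses the general unequal-spacing form of the spline equations rather than the simpler unit-spacing form.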

Method AR_1 assumes that the time series before the gap can be approximated using an AR(1) process. If the last observation prior to the gap is made at time point $t_m$, then this method uses the values at $t_0, t_1, \ldots, t_m$ to compute the one-step-ahead forecast at origin $t_m$. This value is used to estimate the missing value at time point $t_{m+1}$. If the value at $t_{m+2}$ is also missing, then the values at time points $t_0, t_1, \ldots, t_{m+1}$ are used to recompute the AR(1) model and then estimate the value at $t_{m+2}$, and so on. The coefficient $\phi_1$ in the AR(1) model is computed internally by the method of least squares from class ARMA.
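
The refit-and-forecast loop can be sketched as follows. This is a simplified stand-in, not the library's routine: it centers the series on its sample mean and uses the closed-form conditional least squares estimate of $\phi_1$, whereas the class delegates the fit to ARMA. The names Ar1GapSketch, forecast, and fillGap are hypothetical.

```java
/** Illustrative AR(1) gap filling; not the JMSL implementation. */
final class Ar1GapSketch {

    /** One-step-ahead AR(1) forecast computed from the first len values of y. */
    static double forecast(double[] y, int len) {
        double mu = 0.0;
        for (int t = 0; t < len; t++) mu += y[t];
        mu /= len;                                   // sample mean (assumed centering)
        double num = 0.0, den = 0.0;
        for (int t = 1; t < len; t++) {
            num += (y[t] - mu) * (y[t - 1] - mu);
            den += (y[t - 1] - mu) * (y[t - 1] - mu);
        }
        double phi = (den == 0.0) ? 0.0 : num / den; // least squares slope
        return mu + phi * (y[len - 1] - mu);
    }

    /** Fills a gap of gapLength values, refitting after each estimate. */
    static double[] fillGap(double[] before, int gapLength) {
        double[] work = java.util.Arrays.copyOf(before, before.length + gapLength);
        for (int k = 0; k < gapLength; k++) {
            int len = before.length + k;
            work[len] = forecast(work, len);         // estimate joins the history
        }
        return java.util.Arrays.copyOfRange(work, before.length, work.length);
    }

    public static void main(String[] args) {
        double[] before = {2.0, 2.6, 2.1, 2.9, 2.4, 3.0};
        for (double v : fillGap(before, 2)) System.out.println(v);
    }
}
```

Each estimate is appended to the working history before the next fit, which matches the description of recomputing the model for the second and later missing values in a gap.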

Finally, method AR_P uses an AR(p) model to estimate missing values using a one-step-ahead forecast similar to method AR_1. First, class ARAutoUnivariate is applied to the time series values just prior to the missing values to determine the optimum $p$ from the set $\{0, 1, \ldots, \mathtt{maxlag}\}$ of possible values and to compute the parameters $\phi_1, \ldots, \phi_p$ of the resulting AR(p) model. The parameters are estimated by the least squares method based on Householder transformations as described in Kitagawa and Akaike (1978). Denoting the mean of the series $y_{t_0}, y_{t_1}, \ldots, y_{t_m}$ by $\mu$, the one-step-ahead forecast at origin $t_m$, $\hat{y}_{t_m}(1)$, can be computed by the formula

$$\hat{y}_{t_m}(1) = \mu \left(1 - \sum_{j=1}^{p} \phi_j\right) + \sum_{j=1}^{p} \phi_j \, y_{t_m+1-j}.$$

This value is used as an estimate for the missing value at $t_{m+1}$. The procedure starting with ARAutoUnivariate is then repeated for every further missing value in the gap. All four estimation methods treat gaps of missing values in increasing time order.
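
The forecast formula translates directly into code. The sketch below only evaluates the formula for a given mean and given coefficients; it does not select $p$ or estimate the parameters, which the class leaves to ARAutoUnivariate. ArPForecastSketch and oneStepForecast are hypothetical names.

```java
/** Illustrative evaluation of the AR(p) one-step-ahead forecast formula. */
final class ArPForecastSketch {

    /**
     * y holds y_{t_0}..y_{t_m}; phi holds phi_1..phi_p (phi[j-1] is phi_j);
     * mu is the series mean. Requires phi.length <= y.length.
     */
    static double oneStepForecast(double[] y, double[] phi, double mu) {
        int m = y.length - 1;
        double sumPhi = 0.0;
        double weighted = 0.0;
        for (int j = 1; j <= phi.length; j++) {
            sumPhi += phi[j - 1];
            weighted += phi[j - 1] * y[m + 1 - j];   // y at time t_m + 1 - j
        }
        return mu * (1.0 - sumPhi) + weighted;
    }

    public static void main(String[] args) {
        double[] y = {1.0, 2.0, 3.0};
        double[] phi = {0.5, 0.25};
        System.out.println(oneStepForecast(y, phi, 2.0));
    }
}
```

With p = 0 the coefficient array is empty and the forecast reduces to the mean, which is consistent with 0 being an allowed order in the candidate set.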

See Also:
Example, Serialized Form

Field Summary
static int AR_1
          Indicates that missing values should be estimated using an autoregressive time series with 1 lag.
static int AR_P
          Indicates that missing values should be estimated using an autoregressive time series with a maximum lag of maxLag.
static int CUBIC_SPLINE
          Indicates that missing values should be estimated using cubic spline interpolation.
static int LEAST_SQUARES
          Estimate autoregressive coefficients using least squares.
static int MAX_LIKELIHOOD
          Estimate autoregressive coefficients using maximum likelihood.
static int MEDIAN
          Indicates that missing values should be estimated using the median of the values just before and after the missing value gap.
static int METHOD_OF_MOMENTS
          Estimate autoregressive coefficients using method of moments.
 
Constructor Summary
ARMAEstimateMissing(int[] tpoints, double[] z)
          Constructor for ARMAEstimateMissing.
 
Method Summary
 int[] getCompleteTimes()
          Returns an int array of all time points, including values for times with missing values in z.
 double[] getCompleteTimeSeries()
          Returns a double precision vector of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints.
 double getConvergenceTolerance()
          Returns the current value of convergence tolerance used by the AR_1 and AR_P estimation methods.
 int getEstimationMethod()
          Returns the method used for estimating the final autoregressive coefficients for missing value estimation methods AR_1 and AR_P.
 int getMaxIterations()
          Returns the maximum number of estimation iterations used by missing value estimation methods AR_1 and AR_P.
 int getMaxlag()
          Returns the current value of autoregressive lags used in the AR_P estimation method.
 double getMean()
          Returns the mean value used to center the series.
 int[] getMissingTimes()
          Returns an int array of the times with missing values.
 int getMissingValueMethod()
          Returns the current missing value estimation method.
 int getNumberMissing()
          Returns the number of missing values in the original series.
 double getRelativeError()
          Returns the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.
 void setConvergenceTolerance(double convergenceTolerance)
          Sets the convergence tolerance used by the AR_1 and AR_P missing value estimation methods.
 void setEstimationMethod(int arEstimationMethod)
          Sets the method used for estimating the autoregressive coefficients for missing value estimation methods AR_1 and AR_P.
 void setMaxIterations(int maxIterations)
          Sets the maximum number of estimation iterations for missing value estimation methods AR_1 and AR_P.
 void setMaxlag(int maxlag)
          Sets the maximum number of autoregressive lags when method AR_P is selected as the missing value estimation method.
 void setMean(double mean)
          Sets the mean value used to center the series.
 void setMissingValueMethod(int method)
          Sets the current missing value estimation method to MEDIAN, CUBIC_SPLINE, AR_1, or AR_P.
 void setRelativeError(double relativeError)
          Sets the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

AR_1

public static final int AR_1
Indicates that missing values should be estimated using an autoregressive time series with 1 lag.

See Also:
Constant Field Values

AR_P

public static final int AR_P
Indicates that missing values should be estimated using an autoregressive time series with a maximum lag of maxLag. By default maxLag=10, but this can be changed using the setMaxlag method.

See Also:
Constant Field Values

CUBIC_SPLINE

public static final int CUBIC_SPLINE
Indicates that missing values should be estimated using cubic spline interpolation.

See Also:
Constant Field Values

LEAST_SQUARES

public static final int LEAST_SQUARES
Estimate autoregressive coefficients using least squares.

See Also:
Constant Field Values

MAX_LIKELIHOOD

public static final int MAX_LIKELIHOOD
Estimate autoregressive coefficients using maximum likelihood.

See Also:
Constant Field Values

MEDIAN

public static final int MEDIAN
Indicates that missing values should be estimated using the median of the values just before and after the missing value gap.

See Also:
Constant Field Values

METHOD_OF_MOMENTS

public static final int METHOD_OF_MOMENTS
Estimate autoregressive coefficients using method of moments.

See Also:
Constant Field Values
Constructor Detail

ARMAEstimateMissing

public ARMAEstimateMissing(int[] tpoints,
                           double[] z)
Constructor for ARMAEstimateMissing.

Parameters:
tpoints - an int array containing the times at which the series values were observed. The values must be strictly increasing. Times for missing values are identified as non-incremental gaps in this series. A gap of missing values in z is assumed when the difference between two consecutive values is greater than 1, i.e., $t_{i+1} - t_i > 1$. The number of missing values in the gap is this difference minus 1. The series can have multiple gaps with missing values, but any one gap can have no more than 3 missing values.
z - a double array containing the values for the time series observed at the times given in the vector tpoints.
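
The constraints on tpoints can be checked before constructing the object. The following is a hypothetical pre-check, not library code: TimePointGapSketch and countMissing are illustrative names, and throwing IllegalArgumentException is an assumption, since this page does not say how the constructor reacts to invalid input.

```java
/** Illustrative pre-check of a tpoints array; not part of JMSL. */
final class TimePointGapSketch {

    /** Counts missing values implied by tpoints and enforces the stated limits. */
    static int countMissing(int[] tpoints) {
        int missing = 0;
        for (int i = 1; i < tpoints.length; i++) {
            int diff = tpoints[i] - tpoints[i - 1];
            if (diff < 1) {
                throw new IllegalArgumentException("tpoints must be strictly increasing");
            }
            int gap = diff - 1;                     // values missing between the two times
            if (gap > 3) {
                throw new IllegalArgumentException("a gap may hold at most 3 missing values");
            }
            missing += gap;
        }
        return missing;
    }

    public static void main(String[] args) {
        int[] tpoints = {1, 2, 4, 5, 9};            // time 3 and times 6..8 are missing
        System.out.println(countMissing(tpoints));
    }
}
```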
Method Detail

getCompleteTimes

public int[] getCompleteTimes()
Returns an int array of all time points, including values for times with missing values in z.

Returns:
An int array of all times from tpoints[0] to tpoints[tpoints.length-1]. Its length is tpoints.length+nMissing, where nMissing = getNumberMissing() is the number of values missing from the original time series.
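
For orientation, the expected contents of this array can be reproduced from tpoints alone. The sketch below is a hypothetical stand-alone helper (CompleteTimesSketch.completeTimes), assuming the complete times are simply the consecutive integers from the first to the last observed time.

```java
/** Illustrative reconstruction of the complete time axis; not part of JMSL. */
final class CompleteTimesSketch {

    /** Every integer time from the first to the last observed time, inclusive. */
    static int[] completeTimes(int[] tpoints) {
        int first = tpoints[0];
        int last = tpoints[tpoints.length - 1];
        int[] all = new int[last - first + 1];
        for (int i = 0; i < all.length; i++) {
            all[i] = first + i;
        }
        return all;
    }

    public static void main(String[] args) {
        int[] tpoints = {1, 2, 4, 5, 9};
        System.out.println(java.util.Arrays.toString(completeTimes(tpoints)));
    }
}
```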

getCompleteTimeSeries

public double[] getCompleteTimeSeries()
                               throws ARMA.MatrixSingularException,
                                      ARMA.TooManyCallsException,
                                      ARMA.IncreaseErrRelException,
                                      ARMA.NewInitialGuessException,
                                      ARMA.IllConditionedException,
                                      ARMA.TooManyITNException,
                                      ARMA.TooManyFcnEvalException,
                                      ARMA.TooManyJacobianEvalException,
                                      ARMA.NoProgressException,
                                      ARAutoUnivariate.TriangularMatrixSingularException,
                                      ARMAMaxLikelihood.NonStationaryException,
                                      ARMAMaxLikelihood.NonInvertibleException,
                                      ARMAMaxLikelihood.InitialMAException
Returns a double precision vector of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints.

Returns:
A double array of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints.
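
To make the layout of the returned array concrete, the sketch below places each observed value at index tpoints[i]-tpoints[0] and marks the remaining positions with NaN. In the real method those positions hold the estimates; the NaN placeholder, the class CompleteSeriesLayoutSketch, and the method expand are illustrative assumptions only.

```java
/** Illustrative layout of the complete series; not part of JMSL. */
final class CompleteSeriesLayoutSketch {

    /** Observed values in their time slots; NaN where an estimate would go. */
    static double[] expand(int[] tpoints, double[] z) {
        int first = tpoints[0];
        int last = tpoints[tpoints.length - 1];
        double[] full = new double[last - first + 1];
        java.util.Arrays.fill(full, Double.NaN);
        for (int i = 0; i < tpoints.length; i++) {
            full[tpoints[i] - first] = z[i];
        }
        return full;
    }

    public static void main(String[] args) {
        int[] tpoints = {1, 2, 4};
        double[] z = {10.0, 20.0, 40.0};
        System.out.println(java.util.Arrays.toString(expand(tpoints, z)));
    }
}
```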
Throws:
ARAutoUnivariate.TriangularMatrixSingularException - is thrown if the input matrix to ARAutoUnivariate is singular. This can only occur with estimation method AR_P.
ARMA.MatrixSingularException - is thrown if the input matrix is singular.
ARMA.TooManyCallsException - is thrown if the number of calls to the function has exceeded the maximum number of iterations times the number of moving average (MA) parameters + 1.
ARMA.IncreaseErrRelException - is thrown if the bound for the relative error is too small.
ARMA.NewInitialGuessException - is thrown if the iteration has not made good progress.
ARMA.IllConditionedException - is thrown if the problem is ill-conditioned.
ARMA.TooManyITNException - is thrown if the maximum number of iterations is exceeded.
ARMA.TooManyFcnEvalException - is thrown if the maximum number of function evaluations is exceeded.
ARMA.TooManyJacobianEvalException - is thrown if the maximum number of Jacobian evaluations is exceeded.
ARMA.NoProgressException - is thrown when the algorithm is not making any progress. Try a new initial guess.
ARMAMaxLikelihood.NonStationaryException - is thrown if the final maximum likelihood estimates for the time series are nonstationary.
ARMAMaxLikelihood.NonInvertibleException - is thrown if the final maximum likelihood estimates for the time series are noninvertible.
ARMAMaxLikelihood.InitialMAException - is thrown if the initial values provided for the moving average terms using setMA are noninvertible. In this case, ARMAMaxLikelihood terminates and does not compute the time series estimates.

getConvergenceTolerance

public double getConvergenceTolerance()
Returns the current value of convergence tolerance used by the AR_1 and AR_P estimation methods.

Returns:
a double scalar value equal to the convergence tolerance. By default the convergence tolerance is 1.0e-09.

getEstimationMethod

public int getEstimationMethod()
Returns the method used for estimating the final autoregressive coefficients for missing value estimation methods AR_1 and AR_P.

Returns:
an int representing the estimation method used for estimating the autoregressive coefficients. 0 implies METHOD_OF_MOMENTS, 1 implies LEAST_SQUARES, 2 implies MAX_LIKELIHOOD.

getMaxIterations

public int getMaxIterations()
Returns the maximum number of estimation iterations used by missing value estimation methods AR_1 and AR_P.

Returns:
An int scalar equal to the maximum number of iterations for the maximum likelihood missing value estimation method. If this limit is exceeded during the compute method, ARMAEstimateMissing stops execution and issues an ARMAMaxLikelihood.IterationLimitExceededException.

getMaxlag

public int getMaxlag()
Returns the current value of autoregressive lags used in the AR_P estimation method.

Returns:
An int scalar value equal to the maximum number of autoregressive lags used with the AR_P missing value estimation method.

getMean

public double getMean()
Returns the mean value used to center the series.

Returns:
a double scalar used to center the series.

getMissingTimes

public int[] getMissingTimes()
Returns an int array of the times with missing values.

Returns:
An int array containing the times at which missing values were estimated. If there are no missing values a null array is returned.

getMissingValueMethod

public int getMissingValueMethod()
Returns the current missing value estimation method.

Returns:
an int representing the estimation method used for estimating the missing values in the time series. 0 implies MEDIAN, 1 implies CUBIC_SPLINE, 2 implies AR_1 and 3 implies AR_P.

getNumberMissing

public int getNumberMissing()
Returns the number of missing values in the original series.

Returns:
An int scalar value containing the number of missing values in the time series.

getRelativeError

public double getRelativeError()
Returns the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.

Returns:
a double scalar containing the stopping criterion for use in the nonlinear equation solver used in both the method of moments and least-squares algorithm.

setConvergenceTolerance

public void setConvergenceTolerance(double convergenceTolerance)
Sets the convergence tolerance used by the AR_1 and AR_P missing value estimation methods.

Parameters:
convergenceTolerance - A double scalar value. Default: convergenceTolerance = 1.0e-09

setEstimationMethod

public void setEstimationMethod(int arEstimationMethod)
Sets the method used for estimating the autoregressive coefficients for missing value estimation methods AR_1 and AR_P.

Parameters:
arEstimationMethod - An int scalar specifying the method used to estimate the autoregressive coefficients. Valid methods are METHOD_OF_MOMENTS, LEAST_SQUARES, and MAX_LIKELIHOOD. By default, arEstimationMethod=LEAST_SQUARES.

setMaxIterations

public void setMaxIterations(int maxIterations)
Sets the maximum number of estimation iterations for missing value estimation methods AR_1 and AR_P. If this limit is exceeded ARMAEstimateMissing stops execution during the compute method and issues an IterationLimitExceededException.

Parameters:
maxIterations - An int specifying the maximum number of iterations for the maximum likelihood estimation. By default, maxIterations=200.

setMaxlag

public void setMaxlag(int maxlag)
Sets the maximum number of autoregressive lags when method AR_P is selected as the missing value estimation method.

Parameters:
maxlag - An int scalar value equal to the maximum number of autoregressive lags. maxlag must be greater than z.length-5. By default maxlag=10.

setMean

public void setMean(double mean)
Sets the mean value used to center the series.

Parameters:
mean - a double scalar used to center the series. By default the median of the series is used for centering.

setMissingValueMethod

public void setMissingValueMethod(int method)
Sets the current missing value estimation method to MEDIAN, CUBIC_SPLINE, AR_1, or AR_P.

Parameters:
method - An int scalar. By default method=AR_1.

setRelativeError

public void setRelativeError(double relativeError)
Sets the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.

Parameters:
relativeError - a double scalar containing the stopping criterion for use in the nonlinear equation solver used in both the method of moments and least-squares algorithm. Default: relativeError = 2.22045e-14

Copyright © 1970-2010 Visual Numerics, Inc.
Built July 30 2010.