com.imsl.stat.ARMAEstimateMissing

All Implemented Interfaces: Serializable

public class ARMAEstimateMissing extends Object implements Serializable

Estimates missing values in a time series collected with equal spacing. Missing values can be replaced by these estimates prior to fitting a time series using the ARMA class.

Traditional time series analysis as described by Box, Jenkins and Reinsel (1994) requires that the observations be made at equidistant time points $t_0,t_1,\ldots,t_n$ where $t_i = t_0 + i$. When observations are missing, ARMA requires that they be replaced with suitable estimates. Class ARMAEstimateMissing offers four methods for estimating missing values: MEDIAN, CUBIC_SPLINE, AR_1, and AR_P.

Method MEDIAN estimates the missing observations in a gap by the median of the last four time series values before and the first four values after the gap. If not enough values are available before or after the gap then the number is reduced accordingly. This method is very fast and simple, but its use is limited to stationary ergodic series without outliers and level shifts.
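
The MEDIAN rule described above can be sketched in plain Java. This is a minimal illustration, not the library's implementation: the class name MedianGapSketch, the method medianEstimate, and its parameters are hypothetical, and averaging the two middle values for an even count is an assumption the text does not confirm.

```java
import java.util.Arrays;

/** Sketch only: median-based estimate for one gap; not the library's code. */
final class MedianGapSketch {

    /**
     * z holds the observed values only; the gap lies between z[k-1] and z[k].
     * Uses up to `window` observed values on each side of the gap.
     */
    static double medianEstimate(double[] z, int k, int window) {
        int from = Math.max(0, k - window);       // fewer values if the gap is near the start
        int to = Math.min(z.length, k + window);  // fewer values if the gap is near the end
        double[] neighbours = Arrays.copyOfRange(z, from, to);
        Arrays.sort(neighbours);
        int n = neighbours.length;
        // Assumption: an even count averages the two middle values.
        return (n % 2 == 1)
                ? neighbours[n / 2]
                : 0.5 * (neighbours[n / 2 - 1] + neighbours[n / 2]);
    }

    public static void main(String[] args) {
        double[] z = {1, 2, 3, 4, 10, 11, 12, 13};
        // Gap between the 4th and 5th observed values, window of four on each side.
        System.out.println(medianEstimate(z, 4, 4));
    }
}
```

With the sample data the eight neighbours are already sorted, so the estimate is the average of 4 and 10. The level shift across the gap shows why the method is recommended only for series without such shifts.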

Method CUBIC_SPLINE uses a cubic spline interpolation method to estimate missing values. Here the interpolation is again done over the last four time series values before and the first four values after the gap. The missing values are estimated by the resulting interpolant. This method gives smooth transitions across missing values.
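
As an illustration of interpolating across a gap, the sketch below fits a natural cubic spline through the neighbouring points and evaluates it at a missing time. The natural boundary conditions (zero second derivative at both ends) are an assumption; the text does not state which boundary conditions the library uses. The class SplineGapSketch and its methods are hypothetical names.

```java
/** Sketch only: natural cubic spline through the points around a gap. */
final class SplineGapSketch {

    /** Second derivatives of the natural cubic spline through (x[i], y[i]). */
    static double[] secondDerivatives(double[] x, double[] y) {
        int n = x.length;
        double[] y2 = new double[n];   // y2[0] = 0: natural boundary (assumption)
        double[] u = new double[n];
        // Forward sweep of the tridiagonal system.
        for (int i = 1; i < n - 1; i++) {
            double sig = (x[i] - x[i - 1]) / (x[i + 1] - x[i - 1]);
            double p = sig * y2[i - 1] + 2.0;
            y2[i] = (sig - 1.0) / p;
            double slopeDiff = (y[i + 1] - y[i]) / (x[i + 1] - x[i])
                             - (y[i] - y[i - 1]) / (x[i] - x[i - 1]);
            u[i] = (6.0 * slopeDiff / (x[i + 1] - x[i - 1]) - sig * u[i - 1]) / p;
        }
        y2[n - 1] = 0.0;               // natural boundary at the right end
        // Back substitution.
        for (int k = n - 2; k >= 0; k--) {
            y2[k] = y2[k] * y2[k + 1] + u[k];
        }
        return y2;
    }

    /** Evaluates the spline at time t, which must lie within [x[0], x[n-1]]. */
    static double evaluate(double[] x, double[] y, double[] y2, double t) {
        int lo = 0, hi = x.length - 1;
        while (hi - lo > 1) {          // bisection for the bracketing interval
            int mid = (lo + hi) >>> 1;
            if (x[mid] > t) hi = mid; else lo = mid;
        }
        double h = x[hi] - x[lo];
        double a = (x[hi] - t) / h;
        double b = (t - x[lo]) / h;
        return a * y[lo] + b * y[hi]
             + ((a * a * a - a) * y2[lo] + (b * b * b - b) * y2[hi]) * h * h / 6.0;
    }

    public static void main(String[] args) {
        // Four observed times before and four after a single missing value at t = 5.
        double[] x = {1, 2, 3, 4, 6, 7, 8, 9};
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) y[i] = 2 * x[i] + 1;
        double[] y2 = secondDerivatives(x, y);
        System.out.println(evaluate(x, y, y2, 5.0));
    }
}
```

For data lying on a straight line, as in the sample, all second derivatives vanish and the estimate reduces to linear interpolation between the two points bordering the gap.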

Method AR_1 assumes that the time series before the gap can be approximated using an AR(1) process. If the last observation prior to the gap is made at time point $t_m$, then this method uses the values at $t_0,t_1,\ldots,t_m$ to compute the one-step-ahead forecast at origin $t_m$. This value is used to estimate the missing value at time point $t_{m+1}$. If the value at $t_{m+2}$ is also missing, then the values at time points $t_0,t_1,\ldots,t_{m+1}$ are used to recompute the AR(1) model and then estimate the value at $t_{m+2}$, and so on. The coefficient $\phi_1$ in the AR(1) model is computed internally by the method of least squares from class ARMA.
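
The sequential AR(1) procedure can be sketched as follows. This is a simplified stand-in, not the library's code: it centres the series by its sample mean and uses the closed-form conditional least-squares slope, whereas the library delegates the estimation to class ARMA. The names Ar1GapSketch, forecast, and fillGap are hypothetical.

```java
import java.util.Arrays;

/** Sketch only: sequential AR(1) one-step-ahead estimates for one gap. */
final class Ar1GapSketch {

    /** One-step-ahead AR(1) forecast computed from the first n entries of y. */
    static double forecast(double[] y, int n) {
        double mu = 0.0;
        for (int t = 0; t < n; t++) mu += y[t];
        mu /= n;                                   // assumption: centre by the sample mean
        double num = 0.0, den = 0.0;
        for (int t = 1; t < n; t++) {
            num += (y[t] - mu) * (y[t - 1] - mu);
            den += (y[t - 1] - mu) * (y[t - 1] - mu);
        }
        // Least-squares estimate of phi_1; falls back to 0 for a constant series.
        double phi = (den == 0.0) ? 0.0 : num / den;
        return mu + phi * (y[n - 1] - mu);
    }

    /** Fills a gap of gapLength values following `before`, refitting after each estimate. */
    static double[] fillGap(double[] before, int gapLength) {
        double[] work = Arrays.copyOf(before, before.length + gapLength);
        for (int g = 0; g < gapLength; g++) {
            int n = before.length + g;             // observed values plus earlier estimates
            work[n] = forecast(work, n);
        }
        return Arrays.copyOfRange(work, before.length, work.length);
    }

    public static void main(String[] args) {
        double[] before = {1, -1, 1, -1};
        System.out.println(Arrays.toString(fillGap(before, 1)));
    }
}
```

Each estimate is appended to the working series before the next forecast is computed, which mirrors the refit-and-forecast loop described in the paragraph above.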

Finally, method AR_P uses an AR(p) model to estimate missing values using a one-step-ahead forecast similar to method AR_1. First, class ARAutoUnivariate is applied to the time series values just prior to the missing values to determine the optimum p from the set $\{0,1,\ldots,\texttt{maxlag}\}$ of possible values and to compute the parameters $\phi_1,\ldots,\phi_p$ of the resulting AR(p) model. The parameters are estimated by the least squares method based on Householder transformations as described in Kitagawa and Akaike (1978). Denoting the mean of the series $y_{t_0}, y_{t_1},\ldots,y_{t_m}$ by $\mu$, the one-step-ahead forecast at origin $t_m$, $\hat{y}_{t_m}(1)$, can be computed by the formula $$\hat{y}_{t_m}(1)=\mu\Bigl(1 - \sum\nolimits_{j=1}^p\phi_j\Bigr)+\sum\nolimits_{j=1}^p\phi_j\, y_{t_m+1-j}.$$

This value is used as an estimate for the missing value at $t_{m+1}$. The procedure starting with ARAutoUnivariate is then repeated for every further missing value in the gap. All four estimation methods treat gaps of missing values in increasing time order.
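
The one-step-ahead forecast formula for the AR(p) model can be evaluated directly once the coefficients are known. The sketch below takes $\phi_1,\ldots,\phi_p$ and $\mu$ as inputs and does not reproduce the order selection or parameter estimation performed by ARAutoUnivariate. The class ArPForecastSketch and its method are hypothetical names.

```java
/** Sketch only: evaluates the AR(p) one-step-ahead forecast formula. */
final class ArPForecastSketch {

    /**
     * y holds the series up to the forecast origin (y[y.length-1] is the value at t_m),
     * phi holds phi_1..phi_p in phi[0]..phi[p-1], and mu is the series mean.
     */
    static double forecast(double[] y, double[] phi, double mu) {
        int m = y.length - 1;
        double sumPhi = 0.0;
        double weighted = 0.0;
        for (int j = 1; j <= phi.length; j++) {
            sumPhi += phi[j - 1];
            weighted += phi[j - 1] * y[m + 1 - j];  // phi_j times the value j-1 steps before t_m
        }
        return mu * (1.0 - sumPhi) + weighted;
    }

    public static void main(String[] args) {
        double[] y = {8, 12, 14};
        double[] phi = {0.5, 0.25};
        // 10 * (1 - 0.75) + 0.5 * 14 + 0.25 * 12
        System.out.println(forecast(y, phi, 10.0));
    }
}
```

When p = 0 the sums are empty and the forecast is simply the mean $\mu$.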

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final int | AR_1 | Indicates that missing values should be estimated using an autoregressive time series with 1 lag. |
| static final int | AR_P | Indicates that missing values should be estimated using an autoregressive time series with a maximum lag of maxLag. |
| static final int | CUBIC_SPLINE | Indicates that missing values should be estimated using cubic spline interpolation. |
| static final int | LEAST_SQUARES | Estimate autoregressive coefficients using least squares. |
| static final int | MAX_LIKELIHOOD | Estimate autoregressive coefficients using maximum likelihood. |
| static final int | MEDIAN | Indicates that missing values should be estimated using the median of the values just before and after the missing value gap. |
| static final int | METHOD_OF_MOMENTS | Estimate autoregressive coefficients using method of moments. |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| ARMAEstimateMissing(int[] tpoints, double[] z) | Constructor for ARMAEstimateMissing. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| int[] | getCompleteTimes() | Returns an int array of all time points, including values for times with missing values in z. |
| double[] | getCompleteTimeSeries() | Returns a double precision vector of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints. |
| double | getConvergenceTolerance() | Returns the current value of convergence tolerance used by the AR_1 and AR_P estimation methods. |
| int | getEstimationMethod() | Returns the method used for estimating the final autoregressive coefficients for missing value estimation methods AR_1 and AR_P. |
| int | getMaxIterations() | Returns the maximum number of estimation iterations used by missing value estimation methods AR_1 and AR_P. |
| int | getMaxlag() | Returns the current value of autoregressive lags used in the AR_P estimation method. |
| double | getMean() | Returns the mean value used to center the series. |
| int[] | getMissingTimes() | Returns an int array of the times with missing values. |
| int | getMissingValueMethod() | Returns the current missing value estimation method. |
| int | getNumberMissing() | Returns the number of missing values in the original series. |
| double | getRelativeError() | Returns the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods. |
| void | setConvergenceTolerance(double convergenceTolerance) | Sets the convergence tolerance used by the AR_1 and AR_P missing value estimation methods. |
| void | setEstimationMethod(int arEstimationMethod) | Sets the method used for estimating the autoregressive coefficients for missing value estimation methods AR_1 and AR_P. |
| void | setMaxIterations(int maxIterations) | Sets the maximum number of estimation iterations for missing value estimation methods AR_1 and AR_P. |
| void | setMaxlag(int maxlag) | Sets the maximum number of autoregressive lags when method AR_P is selected as the missing value estimation method. |
| void | setMean(double mean) | Sets the mean value used to center the series. |
| void | setMissingValueMethod(int method) | Sets the current missing value estimation method to MEDIAN, CUBIC_SPLINE, AR_1, or AR_P. |
| void | setRelativeError(double relativeError) | Sets the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- MEDIAN
  
  public static final int MEDIAN
  
  Indicates that missing values should be estimated using the median of the values just before and after the missing value gap.
  See Also:
  
  Constant Field Values
- CUBIC_SPLINE
  
  public static final int CUBIC_SPLINE
  
  Indicates that missing values should be estimated using cubic spline interpolation.
  See Also:
  
  Constant Field Values
- AR_1
  
  public static final int AR_1
  
  Indicates that missing values should be estimated using an autoregressive time series with 1 lag.
  See Also:
  
  Constant Field Values
- AR_P
  
  public static final int AR_P
  
  Indicates that missing values should be estimated using an autoregressive time series with a maximum lag of maxLag. By default maxLag=10, but this can be changed using the setMaxlag method.
  See Also:
  
  Constant Field Values
- METHOD_OF_MOMENTS
  
  public static final int METHOD_OF_MOMENTS
  
  Estimate autoregressive coefficients using method of moments.
  See Also:
  
  Constant Field Values
- LEAST_SQUARES
  
  public static final int LEAST_SQUARES
  
  Estimate autoregressive coefficients using least squares.
  See Also:
  
  Constant Field Values
- MAX_LIKELIHOOD
  
  public static final int MAX_LIKELIHOOD
  
  Estimate autoregressive coefficients using maximum likelihood.
  See Also:
  
  Constant Field Values
Constructor Details
- ARMAEstimateMissing
  
  public ARMAEstimateMissing(int[] tpoints, double[] z)
  
  Constructor for ARMAEstimateMissing.
  
  Parameters:
  
  tpoints - an int array containing the times at which the series values were observed. The values must be strictly increasing. Times for missing values are identified as non-incremental gaps in this series. A gap of missing values in z is assumed when the difference between two consecutive values is greater than 1, i.e., $t_{i+1}-t_i>1$. The number of missing values in the gap is $t_{i+1}-t_i-1$. The series can have multiple gaps with missing values, but any one gap can have no more than 3 missing values.
  
  z - a double array containing the values for the time series observed at the times given in the vector tpoints.
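
To illustrate how gaps are encoded in tpoints, the sketch below scans the array and lists the missing times. It is a plain-Java illustration with hypothetical names (GapScanSketch, missingTimes). Throwing IllegalArgumentException for invalid input is an assumption, since the text does not say how the constructor reacts to a violation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: derives the missing time points from tpoints. */
final class GapScanSketch {

    static final int MAX_GAP = 3;  // documented limit on missing values per gap

    static int[] missingTimes(int[] tpoints) {
        List<Integer> missing = new ArrayList<>();
        for (int i = 1; i < tpoints.length; i++) {
            int diff = tpoints[i] - tpoints[i - 1];
            if (diff <= 0) {
                throw new IllegalArgumentException("tpoints must be strictly increasing");
            }
            if (diff - 1 > MAX_GAP) {  // a difference of d leaves d - 1 missing values
                throw new IllegalArgumentException("gap longer than " + MAX_GAP);
            }
            for (int t = tpoints[i - 1] + 1; t < tpoints[i]; t++) {
                missing.add(t);
            }
        }
        return missing.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] tpoints = {1, 2, 3, 6, 7, 9};
        System.out.println(Arrays.toString(missingTimes(tpoints)));
    }
}
```

For the sample input the gaps are 3 to 6 (two missing values) and 7 to 9 (one missing value). Unlike getMissingTimes, which is documented to return null when nothing is missing, this sketch returns an empty array.
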
Method Details
- getMissingValueMethod
  
  public int getMissingValueMethod()
  
  Returns the current missing value estimation method.
  
  Returns:
  
  an int representing the estimation method used for estimating the missing values in the time series. 0 implies MEDIAN, 1 implies CUBIC_SPLINE, 2 implies AR_1 and 3 implies AR_P.
- setMissingValueMethod
  
  public void setMissingValueMethod(int method)
  
  Sets the current missing value estimation method to MEDIAN, CUBIC_SPLINE, AR_1, or AR_P.
  
  Parameters:
  
  method - An int scalar. By default method=AR_1.
- setEstimationMethod
  
  public void setEstimationMethod(int arEstimationMethod)
  
  Sets the method used for estimating the autoregressive coefficients for missing value estimation methods AR_1 and AR_P.
  
  Parameters:
  
  arEstimationMethod - An int scalar specifying the method used to estimate the autoregressive coefficients. Valid methods are METHOD_OF_MOMENTS, LEAST_SQUARES, and MAX_LIKELIHOOD. By default, arEstimationMethod=LEAST_SQUARES.
- getEstimationMethod
  
  public int getEstimationMethod()
  
  Returns the method used for estimating the final autoregressive coefficients for missing value estimation methods AR_1 and AR_P.
  
  Returns:
  
  an int representing the estimation method used for estimating the autoregressive coefficients. 0 implies METHOD_OF_MOMENTS, 1 implies LEAST_SQUARES, 2 implies MAX_LIKELIHOOD.
- getMaxIterations
  
  public int getMaxIterations()
  
  Returns the maximum number of estimation iterations used by missing value estimation methods AR_1 and AR_P.
  
  Returns:
  
  An int scalar equal to the maximum number of iterations for the maximum likelihood missing value estimation method. If this limit is exceeded during the compute method, ARMAEstimateMissing stops execution and issues an ARMAMaxLikelihood.IterationLimitExceededException.
- getConvergenceTolerance
  
  public double getConvergenceTolerance()
  
  Returns the current value of convergence tolerance used by the AR_1 and AR_P estimation methods.
  
  Returns:
  
  a double scalar value equal to the convergence tolerance. By default the convergence tolerance is 1.0e-09.
- setConvergenceTolerance
  
  public void setConvergenceTolerance(double convergenceTolerance)
  
  Sets the convergence tolerance used by the AR_1 and AR_P missing value estimation methods.
  
  Parameters:
  
  convergenceTolerance - A double scalar value. Default: convergenceTolerance = 1.0e-09
- getRelativeError
  
  public double getRelativeError()
  
  Returns the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.
  
  Returns:
  
  a double scalar containing the stopping criterion for use in the nonlinear equation solver used in both the method of moments and least-squares algorithm.
- setRelativeError
  
  public void setRelativeError(double relativeError)
  
  Sets the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.
  
  Parameters:
  
  relativeError - a double scalar containing the stopping criterion for use in the nonlinear equation solver used in both the method of moments and least-squares algorithm. Default: relativeError = 2.22045e-14
- setMaxIterations
  
  public void setMaxIterations(int maxIterations)
  
  Sets the maximum number of estimation iterations for missing value estimation methods AR_1 and AR_P. If this limit is exceeded ARMAEstimateMissing stops execution during the compute method and issues an IterationLimitExceededException.
  
  Parameters:
  
  maxIterations - An int specifying the maximum number of iterations for the maximum likelihood estimation. By default, maxIterations=200.
- getMaxlag
  
  public int getMaxlag()
  
  Returns the current value of autoregressive lags used in the AR_P estimation method.
  
  Returns:
  
  An int scalar value equal to the maximum number of autoregressive lags used with the AR_P missing value estimation method.
- setMaxlag
  
  public void setMaxlag(int maxlag)
  
  Sets the maximum number of autoregressive lags when method AR_P is selected as the missing value estimation method.
  
  Parameters:
  
  maxlag - An int scalar value equal to the maximum number of autoregressive lags. maxlag must be greater than z.length-5. By default maxlag=10.
- getMean
  
  public double getMean()
  
  Returns the mean value used to center the series.
  
  Returns:
  
  a double scalar used to center the series.
- setMean
  
  public void setMean(double mean)
  
  Sets the mean value used to center the series.
  
  Parameters:
  
  mean - a double scalar used to center the series. By default the median of the series is used for centering.
- getNumberMissing
  
  public int getNumberMissing()
  
  Returns the number of missing values in the original series.
  
  Returns:
  
  An int scalar value containing the number of missing values in the time series.
- getMissingTimes
  
  public int[] getMissingTimes()
  
  Returns an int array of the times with missing values.
  
  Returns:
  
  An int array containing the times at which missing values were estimated. If there are no missing values a null array is returned.
- getCompleteTimes
  
  public int[] getCompleteTimes()
  
  Returns an int array of all time points, including values for times with missing values in z.
  
  Returns:
  
  An int array of all time points. With tpoints[0]=1, these run from 1 to tpoints.length+nMissing, where nMissing is the number of missing values in the original time series, nMissing = getNumberMissing().
- getCompleteTimeSeries
  
  public double[] getCompleteTimeSeries() throws ARMA.MatrixSingularException, ARMA.TooManyCallsException, ARMA.IncreaseErrRelException, ARMA.NewInitialGuessException, ARMA.IllConditionedException, ARMA.TooManyITNException, ARMA.TooManyFcnEvalException, ARMA.TooManyJacobianEvalException, ARMA.ResidualsTooLargeException, ARAutoUnivariate.TriangularMatrixSingularException, ARMAMaxLikelihood.NonStationaryException, ARMAMaxLikelihood.NonInvertibleException
  
  Returns a double precision vector of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints.
  
  Returns:
  
  A double array of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints.
  
  Throws:
  
  ARAutoUnivariate.TriangularMatrixSingularException - is thrown if the input matrix to ARAutoUnivariate is singular. This can only occur with estimation method AR_P.
  
  ARMA.MatrixSingularException - is thrown if the input matrix is singular.
  
  ARMA.TooManyCallsException - is thrown if the number of calls to the function has exceeded the maximum number of iterations times the number of moving average (MA) parameters + 1.
  
  ARMA.IncreaseErrRelException - is thrown if the bound for the relative error is too small.
  
  ARMA.NewInitialGuessException - is thrown if the iteration has not made good progress.
  
  ARMA.IllConditionedException - is thrown if the problem is ill-conditioned.
  
  ARMA.TooManyITNException - is thrown if the maximum number of iterations is exceeded.
  
  ARMA.TooManyFcnEvalException - is thrown if the maximum number of function evaluations is exceeded.
  
  ARMA.TooManyJacobianEvalException - is thrown if the maximum number of Jacobian evaluations is exceeded.
  
  ARMA.ResidualsTooLargeException - is thrown if the residuals computed in one step of the Least Squares estimation of the ARMA coefficients become too large.
  
  ARMAMaxLikelihood.NonStationaryException - is thrown if the final maximum likelihood estimates for the time series are non-stationary.
  
  ARMAMaxLikelihood.NonInvertibleException - is thrown if the final maximum likelihood estimates for the time series are noninvertible.
  
  ARMAMaxLikelihood.InitialMAException - is thrown if the initial values provided for the moving average terms using setMA are noninvertible. In this case, ARMAMaxLikelihood terminates and does not compute the time series estimates.
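
The layout of the arrays returned by getCompleteTimes and getCompleteTimeSeries can be illustrated without the library. The sketch below builds the full time grid and places each observed value at index tpoints[i]-tpoints[0], leaving NaN placeholders where the library would store its estimates. The class CompleteSeriesLayoutSketch and its methods are hypothetical names, and the NaN placeholders are only a device of this sketch.

```java
import java.util.Arrays;

/** Sketch only: index layout of the completed time grid and series. */
final class CompleteSeriesLayoutSketch {

    /** All time points from tpoints[0] through the last entry of tpoints. */
    static int[] completeTimes(int[] tpoints) {
        int first = tpoints[0];
        int n = tpoints[tpoints.length - 1] - first + 1;
        int[] times = new int[n];
        for (int i = 0; i < n; i++) times[i] = first + i;
        return times;
    }

    /** Observed values at their grid positions; NaN marks positions still to be estimated. */
    static double[] scatter(int[] tpoints, double[] z) {
        double[] full = new double[tpoints[tpoints.length - 1] - tpoints[0] + 1];
        Arrays.fill(full, Double.NaN);
        for (int i = 0; i < tpoints.length; i++) {
            full[tpoints[i] - tpoints[0]] = z[i];
        }
        return full;
    }

    public static void main(String[] args) {
        int[] tpoints = {1, 2, 3, 6, 7, 9};
        double[] z = {10, 11, 12, 15, 16, 18};
        System.out.println(Arrays.toString(completeTimes(tpoints)));
        System.out.println(Arrays.toString(scatter(tpoints, z)));
    }
}
```

With six observed values and three missing ones, both arrays have length 9, which matches tpoints[tpoints.length-1]-tpoints[0]+1.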
