Package com.imsl.stat

Class ARMAEstimateMissing

java.lang.Object
com.imsl.stat.ARMAEstimateMissing
All Implemented Interfaces:
Serializable

public class ARMAEstimateMissing extends Object implements Serializable
Estimates missing values in a time series collected with equal spacing. Missing values can be replaced by these estimates prior to fitting a time series using the ARMA class.

Traditional time series analysis as described by Box, Jenkins and Reinsel (1994) requires that the observations be made at equidistant time points \(t_0,t_1,\ldots,t_n\) where \(t_i = t_0 + i\). When observations are missing, ARMA requires that they be replaced with suitable estimates. Class ARMAEstimateMissing offers four methods for estimating missing values: MEDIAN, CUBIC_SPLINE, AR_1, and AR_P.

Method MEDIAN estimates the missing observations in a gap by the median of the last four time series values before and the first four values after the gap. If not enough values are available before or after the gap then the number is reduced accordingly. This method is very fast and simple, but its use is limited to stationary ergodic series without outliers and level shifts.
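The MEDIAN rule can be illustrated with a minimal standalone sketch. It is not the library's implementation: the class name MedianGapSketch and the method medianAroundGap are hypothetical, and averaging the two middle values for an even-sized pool is an assumption, since the text does not say how ties are resolved.

```java
import java.util.Arrays;

/** Hypothetical illustration of the MEDIAN rule; not part of com.imsl.stat. */
final class MedianGapSketch {

    /**
     * Median of the last `window` values before a gap and the first `window`
     * values after it. If fewer values exist on a side, all of them are used.
     */
    static double medianAroundGap(double[] before, double[] after, int window) {
        int nb = Math.min(window, before.length);
        int na = Math.min(window, after.length);
        if (nb + na == 0) {
            throw new IllegalArgumentException("no observed values around the gap");
        }
        double[] pool = new double[nb + na];
        System.arraycopy(before, before.length - nb, pool, 0, nb);
        System.arraycopy(after, 0, pool, nb, na);
        Arrays.sort(pool);
        int mid = pool.length / 2;
        // Assumption: an even-sized pool uses the average of the two middle values.
        return (pool.length % 2 == 1) ? pool[mid] : 0.5 * (pool[mid - 1] + pool[mid]);
    }

    public static void main(String[] args) {
        double[] before = {1.0, 2.0, 3.0, 4.0, 5.0}; // only the last four are used
        double[] after = {7.0, 8.0, 9.0};            // fewer than four are available
        System.out.println(medianAroundGap(before, after, 4));
    }
}
```

In this sketch every missing value in a gap receives the same estimate, which matches the description of a single median per gap.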

Method CUBIC_SPLINE uses a cubic spline interpolation method to estimate missing values. Here the interpolation is again done over the last four time series values before and the first four values after the gap. The missing values are estimated by the resulting interpolant. This method gives smooth transitions across missing values.
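As a rough illustration of spline-based gap filling, the sketch below fits a natural cubic spline through the observed neighbours of a gap and evaluates it at the missing time points. The natural end conditions are an assumption, since the text does not state which end conditions the library uses, and the names SplineGapSketch and interpolate are hypothetical.

```java
/** Hypothetical natural cubic spline sketch; not part of com.imsl.stat. */
final class SplineGapSketch {

    /** Evaluates the natural cubic spline through (x[i], y[i]) at each query point. */
    static double[] interpolate(double[] x, double[] y, double[] query) {
        int n = x.length - 1; // number of spline segments
        if (n < 1 || y.length != x.length) {
            throw new IllegalArgumentException("need at least two matching knots");
        }
        double[] h = new double[n];
        for (int i = 0; i < n; i++) h[i] = x[i + 1] - x[i];

        // Forward sweep of the tridiagonal system for the second-derivative terms.
        double[] l = new double[n + 1];
        double[] m = new double[n + 1];
        double[] z = new double[n + 1];
        l[0] = 1.0;
        for (int i = 1; i < n; i++) {
            double alpha = 3.0 / h[i] * (y[i + 1] - y[i]) - 3.0 / h[i - 1] * (y[i] - y[i - 1]);
            l[i] = 2.0 * (x[i + 1] - x[i - 1]) - h[i - 1] * m[i - 1];
            m[i] = h[i] / l[i];
            z[i] = (alpha - h[i - 1] * z[i - 1]) / l[i];
        }

        // Back substitution; c[0] = c[n] = 0 gives the natural end conditions.
        double[] b = new double[n];
        double[] c = new double[n + 1];
        double[] d = new double[n];
        for (int j = n - 1; j >= 0; j--) {
            c[j] = z[j] - m[j] * c[j + 1];
            b[j] = (y[j + 1] - y[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0;
            d[j] = (c[j + 1] - c[j]) / (3.0 * h[j]);
        }

        double[] out = new double[query.length];
        for (int k = 0; k < query.length; k++) {
            int j = 0;
            while (j < n - 1 && query[k] > x[j + 1]) j++; // locate the segment
            double dx = query[k] - x[j];
            out[k] = y[j] + dx * (b[j] + dx * (c[j] + dx * d[j]));
        }
        return out;
    }

    public static void main(String[] args) {
        // Four observations before and four after a gap at times 5 and 6.
        double[] t = {1, 2, 3, 4, 7, 8, 9, 10};
        double[] v = {1.0, 1.8, 3.1, 4.2, 6.9, 8.1, 8.8, 10.3};
        double[] est = interpolate(t, v, new double[]{5, 6});
        System.out.println(est[0] + " " + est[1]);
    }
}
```

A natural spline reproduces straight lines exactly, which gives a simple sanity check for the sketch.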

Method AR_1 assumes that the time series before the gap can be approximated using an AR(1) process. If the last observation prior to the gap is made at time point \(t_m\) then this method uses values at \(t_0,t_1,\ldots,t_m\) to compute the one-step-ahead forecast at origin \(t_m\). This value is used to estimate the missing value at time point \(t_m + 1\). If the value at \(t_m+2\) is also missing then the values at time points \(t_0,t_1,\ldots,t_m+1\) are used to recompute the AR(1) model, and then estimate the value at \(t_m+2\) and so on. The coefficient \(\phi_1\) in the AR(1) model is computed internally by the method of least squares from class ARMA.
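The recursive scheme described for AR_1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the ARMA class algorithm: the series is centered by its sample mean, \(\phi_1\) is the ordinary least squares slope of the centered series on its first lag, and the names Ar1GapSketch, phi1, and fillGap are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical AR(1) gap-filling sketch; not part of com.imsl.stat. */
final class Ar1GapSketch {

    /** Sample mean of the first n entries of y. */
    static double mean(double[] y, int n) {
        double s = 0.0;
        for (int t = 0; t < n; t++) s += y[t];
        return s / n;
    }

    /** Least squares estimate of phi_1 for the centered series y[0..n-1]. */
    static double phi1(double[] y, int n, double mu) {
        double num = 0.0;
        double den = 0.0;
        for (int t = 1; t < n; t++) {
            num += (y[t] - mu) * (y[t - 1] - mu);
            den += (y[t - 1] - mu) * (y[t - 1] - mu);
        }
        return den == 0.0 ? 0.0 : num / den; // a constant series carries no lag information
    }

    /**
     * Estimates gapLength consecutive missing values after history.
     * Each estimate is appended before the model is refitted for the next one.
     */
    static double[] fillGap(double[] history, int gapLength) {
        if (history.length == 0) {
            throw new IllegalArgumentException("history must not be empty");
        }
        double[] y = Arrays.copyOf(history, history.length + gapLength);
        int n = history.length;
        for (int k = 0; k < gapLength; k++) {
            double mu = mean(y, n);
            double phi = phi1(y, n, mu);
            y[n] = mu + phi * (y[n - 1] - mu); // one-step-ahead forecast at origin n-1
            n++;
        }
        return Arrays.copyOfRange(y, history.length, y.length);
    }

    public static void main(String[] args) {
        double[] history = {2.1, 2.4, 2.2, 2.6, 2.5, 2.8};
        System.out.println(Arrays.toString(fillGap(history, 2)));
    }
}
```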

Finally, method AR_P uses an AR(p) model to estimate missing values using a one-step-ahead forecast similar to method AR_1. First, class ARAutoUnivariate is applied to the time series values just prior to the missing values to determine the optimum p from the set \(\{0,1,\ldots,\tt{maxlag}\}\) of possible values and to compute the parameters \(\phi_1,\ldots,\phi_p\) of the resulting AR(p) model. The parameters are estimated by the least squares method based on Householder transformations as described in Kitagawa and Akaike (1978). Denoting the mean of the series \(y_{t_0}, y_{t_1},\ldots,y_{t_m}\) by \(\mu\), the one-step-ahead forecast at origin \(t_m,\,\, \hat{y}_{t_m}(1)\), can be computed by the formula $$\hat{y}_{t_m}(1)=\mu\left(1 - \sum\nolimits_{j=1}^p\phi_j\right)+\sum\nolimits_{j=1}^p\phi_j y_{t_m+1-j}\rm{.}$$

This value is used as an estimate for the missing value at \(t_{m+1}\). The procedure starting with ARAutoUnivariate is then repeated for every further missing value in the gap. All four estimation methods treat gaps of missing values in increasing time order.
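The one-step-ahead forecast formula for AR_P can be evaluated directly once \(\mu\) and \(\phi_1,\ldots,\phi_p\) are known. The sketch below assumes the coefficients are supplied by the caller and does not reproduce the order selection performed by ARAutoUnivariate. The names ArPForecastSketch and oneStepAhead are hypothetical.

```java
/** Hypothetical evaluation of the AR(p) one-step-ahead forecast; not part of com.imsl.stat. */
final class ArPForecastSketch {

    /**
     * Computes mu * (1 - sum(phi_j)) + sum(phi_j * y_{t_m + 1 - j}).
     *
     * @param y   observed values up to the forecast origin; the last entry is y_{t_m}
     * @param phi AR coefficients, with phi[j - 1] holding phi_j
     * @param mu  mean of the series
     */
    static double oneStepAhead(double[] y, double[] phi, double mu) {
        int p = phi.length;
        if (y.length < p) {
            throw new IllegalArgumentException("need at least p observations");
        }
        double phiSum = 0.0;
        double lagged = 0.0;
        for (int j = 1; j <= p; j++) {
            phiSum += phi[j - 1];
            lagged += phi[j - 1] * y[y.length - j]; // y_{t_m + 1 - j}
        }
        return mu * (1.0 - phiSum) + lagged;
    }

    public static void main(String[] args) {
        double[] y = {1.0, 2.0, 3.0};
        double[] phi = {0.5, 0.25}; // illustrative AR(2) coefficients
        System.out.println(oneStepAhead(y, phi, 2.0));
    }
}
```

With p = 0 the sums are empty and the forecast reduces to the mean \(\mu\).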

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    AR_1
    Indicates that missing values should be estimated using an autoregressive time series with 1 lag.
    static final int
    AR_P
    Indicates that missing values should be estimated using an autoregressive time series with a maximum lag of maxLag.
    static final int
    CUBIC_SPLINE
    Indicates that missing values should be estimated using cubic spline interpolation.
    static final int
    LEAST_SQUARES
    Estimate autoregressive coefficients using least squares.
    static final int
    MAX_LIKELIHOOD
    Estimate autoregressive coefficients using maximum likelihood.
    static final int
    MEDIAN
    Indicates that missing values should be estimated using the median of the values just before and after the missing value gap.
    static final int
    METHOD_OF_MOMENTS
    Estimate autoregressive coefficients using method of moments.
  • Constructor Summary

    Constructors
    Constructor
    Description
    ARMAEstimateMissing(int[] tpoints, double[] z)
    Constructor for ARMAEstimateMissing.
  • Method Summary

    Modifier and Type
    Method
    Description
    int[]
    getCompleteTimes()
    Returns an int array of all time points, including values for times with missing values in z.
    double[]
    getCompleteTimeSeries()
    Returns a double precision vector of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints.
    double
    getConvergenceTolerance()
    Returns the current value of convergence tolerance used by the AR_1 and AR_P estimation methods.
    int
    getEstimationMethod()
    Returns the method used for estimating the final autoregressive coefficients for missing value estimation methods AR_1 and AR_P.
    int
    getMaxIterations()
    Returns the maximum number of estimation iterations used by missing value estimation methods AR_1 and AR_P.
    int
    getMaxlag()
    Returns the current value of autoregressive lags used in the AR_P estimation method.
    double
    getMean()
    Returns the mean value used to center the series.
    int[]
    getMissingTimes()
    Returns an int array of the times with missing values.
    int
    getMissingValueMethod()
    Returns the current missing value estimation method.
    int
    getNumberMissing()
    Returns the number of missing values in the original series.
    double
    getRelativeError()
    Returns the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.
    void
    setConvergenceTolerance(double convergenceTolerance)
    Sets the convergence tolerance used by the AR_1 and AR_P missing value estimation methods.
    void
    setEstimationMethod(int arEstimationMethod)
    Sets the method used for estimating the autoregressive coefficients for missing value estimation methods AR_1 and AR_P.
    void
    setMaxIterations(int maxIterations)
    Sets the maximum number of estimation iterations for missing value estimation methods AR_1 and AR_P.
    void
    setMaxlag(int maxlag)
    Sets the maximum number of autoregressive lags when method AR_P is selected as the missing value estimation method.
    void
    setMean(double mean)
    Sets the mean value used to center the series.
    void
    setMissingValueMethod(int method)
    Sets the current missing value estimation method to MEDIAN, CUBIC_SPLINE, AR_1, or AR_P.
    void
    setRelativeError(double relativeError)
    Sets the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • MEDIAN

      public static final int MEDIAN
      Indicates that missing values should be estimated using the median of the values just before and after the missing value gap.
    • CUBIC_SPLINE

      public static final int CUBIC_SPLINE
      Indicates that missing values should be estimated using cubic spline interpolation.
    • AR_1

      public static final int AR_1
      Indicates that missing values should be estimated using an autoregressive time series with 1 lag.
    • AR_P

      public static final int AR_P
      Indicates that missing values should be estimated using an autoregressive time series with a maximum lag of maxLag. By default maxLag=10, but this can be changed using the setMaxlag method.
    • METHOD_OF_MOMENTS

      public static final int METHOD_OF_MOMENTS
      Estimate autoregressive coefficients using method of moments.
    • LEAST_SQUARES

      public static final int LEAST_SQUARES
      Estimate autoregressive coefficients using least squares.
    • MAX_LIKELIHOOD

      public static final int MAX_LIKELIHOOD
      Estimate autoregressive coefficients using maximum likelihood.
  • Constructor Details

    • ARMAEstimateMissing

      public ARMAEstimateMissing(int[] tpoints, double[] z)
      Constructor for ARMAEstimateMissing.
      Parameters:
      tpoints - an int array containing the times at which the series values were observed. The values must be strictly increasing. Missing values are identified by gaps in this sequence: a gap in z is assumed wherever the difference between two consecutive time points is greater than 1, i.e. \(t_{i+1}-t_i>1\). The number of missing values in the gap is this difference minus 1. The series can have multiple gaps with missing values, but any one gap can have no more than 3 missing values.
      z - a double array containing the values for the time series observed at the times given in the vector tpoints.
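To make the gap convention for tpoints concrete, the following sketch scans a time-point array and lists the missing times. It is a standalone illustration, not library code: the names GapScanSketch and missingTimes are hypothetical. It counts the missing values in a gap as the difference between consecutive time points minus 1, which is what makes the completed series have length tpoints[tpoints.length-1]-tpoints[0]+1.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical scan of tpoints for gaps; not part of com.imsl.stat. */
final class GapScanSketch {

    /** Documented limit on the number of missing values in one gap. */
    static final int MAX_GAP = 3;

    /** Returns every integer time strictly between consecutive entries of tpoints. */
    static int[] missingTimes(int[] tpoints) {
        List<Integer> missing = new ArrayList<>();
        for (int i = 1; i < tpoints.length; i++) {
            int step = tpoints[i] - tpoints[i - 1];
            if (step <= 0) {
                throw new IllegalArgumentException("tpoints must be strictly increasing");
            }
            if (step - 1 > MAX_GAP) {
                throw new IllegalArgumentException("gap has more than " + MAX_GAP + " missing values");
            }
            for (int t = tpoints[i - 1] + 1; t < tpoints[i]; t++) {
                missing.add(t);
            }
        }
        // This sketch returns an empty array when nothing is missing.
        int[] out = new int[missing.size()];
        for (int i = 0; i < out.length; i++) out[i] = missing.get(i);
        return out;
    }

    public static void main(String[] args) {
        int[] tpoints = {1, 2, 4, 5, 8}; // times 3, 6 and 7 were not observed
        System.out.println(java.util.Arrays.toString(missingTimes(tpoints)));
    }
}
```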
  • Method Details

    • getMissingValueMethod

      public int getMissingValueMethod()
      Returns the current missing value estimation method.
      Returns:
      an int representing the estimation method used for estimating the missing values in the time series. 0 implies MEDIAN, 1 implies CUBIC_SPLINE, 2 implies AR_1 and 3 implies AR_P.
    • setMissingValueMethod

      public void setMissingValueMethod(int method)
      Sets the current missing value estimation method to MEDIAN, CUBIC_SPLINE, AR_1, or AR_P.
      Parameters:
      method - An int scalar. By default method=AR_1.
    • setEstimationMethod

      public void setEstimationMethod(int arEstimationMethod)
      Sets the method used for estimating the autoregressive coefficients for missing value estimation methods AR_1 and AR_P.
      Parameters:
      arEstimationMethod - An int scalar specifying the method used to estimate the autoregressive coefficients. Valid methods are METHOD_OF_MOMENTS, LEAST_SQUARES, and MAX_LIKELIHOOD. By default, arEstimationMethod=LEAST_SQUARES.
    • getEstimationMethod

      public int getEstimationMethod()
      Returns the method used for estimating the final autoregressive coefficients for missing value estimation methods AR_1 and AR_P.
      Returns:
      an int representing the estimation method used for estimating the autoregressive coefficients. 0 implies METHOD_OF_MOMENTS, 1 implies LEAST_SQUARES, 2 implies MAX_LIKELIHOOD.
    • getMaxIterations

      public int getMaxIterations()
      Returns the maximum number of estimation iterations used by missing value estimation methods AR_1 and AR_P.
      Returns:
      An int scalar equal to the maximum number of iterations for the maximum likelihood missing value estimation method. If this limit is exceeded during the compute method, ARMAEstimateMissing stops execution and issues an ARMAMaxLikelihood.IterationLimitExceededException.
    • getConvergenceTolerance

      public double getConvergenceTolerance()
      Returns the current value of convergence tolerance used by the AR_1 and AR_P estimation methods.
      Returns:
      a double scalar value equal to the convergence tolerance. By default the convergence tolerance is 1.0e-09.
    • setConvergenceTolerance

      public void setConvergenceTolerance(double convergenceTolerance)
      Sets the convergence tolerance used by the AR_1 and AR_P missing value estimation methods.
      Parameters:
      convergenceTolerance - A double scalar value. Default: convergenceTolerance = 1.0e-09
    • getRelativeError

      public double getRelativeError()
      Returns the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.
      Returns:
      a double scalar containing the stopping criterion for use in the nonlinear equation solver used in both the method of moments and least-squares algorithm.
    • setRelativeError

      public void setRelativeError(double relativeError)
      Sets the relative error used for the METHOD_OF_MOMENTS and LEAST_SQUARES estimation methods.
      Parameters:
      relativeError - a double scalar containing the stopping criterion for use in the nonlinear equation solver used in both the method of moments and least-squares algorithm. Default: relativeError = 2.22045e-14
    • setMaxIterations

      public void setMaxIterations(int maxIterations)
      Sets the maximum number of estimation iterations for missing value estimation methods AR_1 and AR_P. If this limit is exceeded ARMAEstimateMissing stops execution during the compute method and issues an IterationLimitExceededException.
      Parameters:
      maxIterations - An int specifying the maximum number of iterations for the maximum likelihood estimation. By default, maxIterations=200.
    • getMaxlag

      public int getMaxlag()
      Returns the current value of autoregressive lags used in the AR_P estimation method.
      Returns:
      An int scalar value equal to the maximum number of autoregressive lags used with the AR_P missing value estimation method.
    • setMaxlag

      public void setMaxlag(int maxlag)
      Sets the maximum number of autoregressive lags when method AR_P is selected as the missing value estimation method.
      Parameters:
      maxlag - An int scalar value equal to the maximum number of autoregressive lags. maxlag must be greater than z.length-5. By default maxlag=10.
    • getMean

      public double getMean()
      Returns the mean value used to center the series.
      Returns:
      a double scalar used to center the series.
    • setMean

      public void setMean(double mean)
      Sets the mean value used to center the series.
      Parameters:
      mean - a double scalar used to center the series. By default the median of the series is used for centering.
    • getNumberMissing

      public int getNumberMissing()
      Returns the number of missing values in the original series.
      Returns:
      An int scalar value containing the number of missing values in the time series.
    • getMissingTimes

      public int[] getMissingTimes()
      Returns an int array of the times with missing values.
      Returns:
      An int array containing the times at which missing values were estimated. If there are no missing values a null array is returned.
    • getCompleteTimes

      public int[] getCompleteTimes()
      Returns an int array of all time points, including values for times with missing values in z.
      Returns:
      An int array of all times from tpoints[0]=1 to tpoints.length+nMissing, where nMissing = getNumberMissing() is the number of values missing from the original time series.
    • getCompleteTimeSeries

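      public double[] getCompleteTimeSeries() throws ARAutoUnivariate.TriangularMatrixSingularException, ARMA.MatrixSingularException, ARMA.TooManyCallsException, ARMA.IncreaseErrRelException, ARMA.NewInitialGuessException, ARMA.IllConditionedException, ARMA.TooManyITNException, ARMA.TooManyFcnEvalException, ARMA.TooManyJacobianEvalException, ARMA.ResidualsTooLargeException, ARMAMaxLikelihood.NonStationaryException, ARMAMaxLikelihood.NonInvertibleException, ARMAMaxLikelihood.InitialMAException
      Returns a double precision vector of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints.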
      Returns:
      A double array of length tpoints[tpoints.length-1]-tpoints[0]+1 containing the observed values in the time series z plus estimates for missing values in gaps identified in tpoints.
      Throws:
      ARAutoUnivariate.TriangularMatrixSingularException - is thrown if the input matrix to ARAutoUnivariate is singular. This can only occur with estimation method AR_P.
      ARMA.MatrixSingularException - is thrown if the input matrix is singular.
      ARMA.TooManyCallsException - is thrown if the number of calls to the function has exceeded the maximum number of iterations times the number of moving average (MA) parameters + 1.
      ARMA.IncreaseErrRelException - is thrown if the bound for the relative error is too small.
      ARMA.NewInitialGuessException - is thrown if the iteration has not made good progress.
      ARMA.IllConditionedException - is thrown if the problem is ill-conditioned.
      ARMA.TooManyITNException - is thrown if the maximum number of iterations is exceeded.
      ARMA.TooManyFcnEvalException - is thrown if the maximum number of function evaluations is exceeded.
      ARMA.TooManyJacobianEvalException - is thrown if the maximum number of Jacobian evaluations is exceeded.
      ARMA.ResidualsTooLargeException - is thrown if the residuals computed in one step of the Least Squares estimation of the ARMA coefficients become too large.
      ARMAMaxLikelihood.NonStationaryException - is thrown if the final maximum likelihood estimates for the time series are non-stationary.
      ARMAMaxLikelihood.NonInvertibleException - is thrown if the final maximum likelihood estimates for the time series are noninvertible.
      ARMAMaxLikelihood.InitialMAException - is thrown if the initial values provided for the moving average terms using setMA are noninvertible. In this case, ARMAMaxLikelihood terminates and does not compute the time series estimates.
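The layout of the completed series can be pictured with a small standalone sketch. It builds the full time grid from tpoints[0] to tpoints[tpoints.length-1] and places each observed value of z at its slot, leaving NaN where an estimate would go. The NaN placeholder and the names CompleteSeriesSketch, completeTimes, and alignObserved are assumptions made for illustration. The library fills those slots with estimates.

```java
import java.util.Arrays;

/** Hypothetical layout of the completed series; not part of com.imsl.stat. */
final class CompleteSeriesSketch {

    /** All integer times from tpoints[0] through the last entry of tpoints. */
    static int[] completeTimes(int[] tpoints) {
        int n = tpoints[tpoints.length - 1] - tpoints[0] + 1;
        int[] times = new int[n];
        for (int i = 0; i < n; i++) times[i] = tpoints[0] + i;
        return times;
    }

    /** Observed values placed on the full grid; unobserved slots hold NaN. */
    static double[] alignObserved(int[] tpoints, double[] z) {
        if (tpoints.length == 0 || tpoints.length != z.length) {
            throw new IllegalArgumentException("tpoints and z must be non-empty and of equal length");
        }
        double[] full = new double[tpoints[tpoints.length - 1] - tpoints[0] + 1];
        Arrays.fill(full, Double.NaN);
        for (int i = 0; i < tpoints.length; i++) {
            full[tpoints[i] - tpoints[0]] = z[i];
        }
        return full;
    }

    public static void main(String[] args) {
        int[] tpoints = {1, 2, 4, 5, 8};
        double[] z = {10.0, 20.0, 40.0, 50.0, 80.0};
        System.out.println(Arrays.toString(completeTimes(tpoints)));
        System.out.println(Arrays.toString(alignObserved(tpoints, z)));
    }
}
```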