Package com.imsl.stat

Class KolmogorovOneSample

java.lang.Object
com.imsl.stat.KolmogorovOneSample
All Implemented Interfaces:
Serializable

public class KolmogorovOneSample extends Object implements Serializable
The class KolmogorovOneSample performs a Kolmogorov-Smirnov goodness-of-fit test in one sample.

The hypotheses tested follow: $$ \begin{array}{ll} H_0:~ F(x) = F^{*}(x) & H_1:~F(x) \ne F^{*}(x) \\ H_0:~ F(x) \ge F^{*}(x) & H_1:~F(x) \lt F^{*}(x) \\ H_0:~ F(x) \le F^{*}(x) & H_1:~F(x) \gt F^{*}(x) \end{array} $$ where \(F\) is the cumulative distribution function (CDF) of the random variable, and the theoretical CDF, \(F^{*}\), is specified via the user-supplied function cdf.

Let \(n\) be the number of observations minus the number of missing observations. The test statistics for both one-sided alternatives, \(D_n^{+}\) and \(D_n^{-}\), and for the two-sided alternative, \(D_n\), are computed, as well as an asymptotic z-score and the p-values associated with the one-sided and two-sided hypotheses. For \(n \gt 80\), asymptotic p-values are used (see Gibbons 1971). For \(n \le 80\), exact one-sided p-values are computed according to a method given by Conover (1980, page 350). An approximate two-sided p-value is obtained as twice the one-sided p-value. The approximation is very close for one-sided p-values less than 0.10 and becomes increasingly inaccurate as the one-sided p-value grows.
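As a rough illustration of the three statistics named above, the following sketch computes them for a sample with no missing values against a continuous CDF. It is not this class's implementation: the class name KsStatisticsSketch is hypothetical, the CDF is passed as a java.util.function.DoubleUnaryOperator rather than a CdfFunction so that the example runs without the library, and the sign convention (\(D^{+}\) as the largest amount by which the empirical CDF exceeds the theoretical CDF) is the common textbook one, which is an assumption about this class.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper, not part of com.imsl.stat.
final class KsStatisticsSketch {

    /** Returns {D+, D-, D} for sample x against a continuous theoretical CDF. */
    static double[] statistics(DoubleUnaryOperator cdf, double[] x) {
        if (x.length == 0) {
            throw new IllegalArgumentException("sample must not be empty");
        }
        double[] sorted = x.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double dPlus = 0.0;
        double dMinus = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(sorted[i]);
            // The empirical CDF jumps from i/n to (i+1)/n at the i-th order statistic.
            dPlus = Math.max(dPlus, (i + 1.0) / n - f);   // empirical above theoretical
            dMinus = Math.max(dMinus, f - (double) i / n); // theoretical above empirical
        }
        return new double[] {dPlus, dMinus, Math.max(dPlus, dMinus)};
    }
}
```

For example, with the uniform CDF on [0, 1] and the sample {0.7, 0.1, 0.4}, the sketch gives \(D^{+} = 0.3\), \(D^{-} = 0.1\), and \(D = 0.3\).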

The theoretical CDF is assumed to be continuous. If the CDF is not continuous, the statistics \(D_n^{*}\) will not be computed correctly.

Estimation of parameters in the theoretical CDF from the sample data will tend to make the p-values associated with the test statistics too liberal. The empirical CDF will tend to be closer to the theoretical CDF than it should be.

No attempt is made to check that all points in the sample are in the support of the theoretical CDF. If any sample point lies outside the support of the CDF, the null hypothesis must be rejected.

  • Constructor Summary

    Constructors
    Constructor
    Description
    KolmogorovOneSample(CdfFunction cdf, double[] x)
    Constructs a one sample Kolmogorov-Smirnov goodness-of-fit test.
  • Method Summary

    Modifier and Type
    Method
    Description
    double
    getMaximumDifference()
    Returns \(D^{+}\), the maximum difference between the theoretical and empirical CDFs.
    double
    getMinimumDifference()
    Returns \(D^{-}\), the minimum difference between the theoretical and empirical CDFs.
    int
    getNumberMissing()
    Returns the number of missing values in the data.
    int
    getNumberOfTies()
    Returns the number of ties in the data.
    double
    getOneSidedPValue()
    Probability of the statistic exceeding D under the null hypothesis of equality and against the one-sided alternative.
    double
    getTestStatistic()
    Returns \(D = \max(D^{+}, D^{-})\).
    double
    getTwoSidedPValue()
    Probability of the statistic exceeding D under the null hypothesis of equality and against the two-sided alternative.
    double
    getZ()
    Returns the normalized D statistic without the continuity correction applied.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • KolmogorovOneSample

      public KolmogorovOneSample(CdfFunction cdf, double[] x)
      Constructs a one sample Kolmogorov-Smirnov goodness-of-fit test.
      Parameters:
      cdf - is the theoretical CDF, \(F^{*}(x)\). It must be non-decreasing and its value must be in [0, 1].
      x - is a double array containing the observations.
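The two requirements on cdf (non-decreasing, values in [0, 1]) can be spot-checked on a grid before the test is constructed. The sketch below is one way to do that. The class name CdfCheckSketch and the method looksLikeCdf are hypothetical, and the candidate function is represented by java.util.function.DoubleUnaryOperator rather than CdfFunction so that the example is self-contained. A grid check can only detect violations at the sampled points, so passing it does not prove the function is a valid CDF.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper, not part of com.imsl.stat.
final class CdfCheckSketch {

    /** Spot-checks a candidate CDF on an evenly spaced grid over [lo, hi]. */
    static boolean looksLikeCdf(DoubleUnaryOperator cdf, double lo, double hi, int points) {
        if (points < 2 || !(lo < hi)) {
            throw new IllegalArgumentException("need lo < hi and at least two grid points");
        }
        double previous = 0.0;
        for (int i = 0; i < points; i++) {
            double t = lo + (hi - lo) * i / (points - 1);
            double f = cdf.applyAsDouble(t);
            // Values must be probabilities.
            if (Double.isNaN(f) || f < 0.0 || f > 1.0) {
                return false;
            }
            // Values must never decrease as t increases.
            if (f < previous) {
                return false;
            }
            previous = f;
        }
        return true;
    }
}
```

A function such as t -> t <= 0 ? 0.0 : 1.0 - Math.exp(-t) (the standard exponential CDF) passes this check on [-1, 10], whereas Math::sin on [0, 7] does not.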
  • Method Details

    • getNumberOfTies

      public int getNumberOfTies()
      Returns the number of ties in the data.
      Returns:
      the number of ties in the data
    • getTestStatistic

      public double getTestStatistic()
      Returns \(D = \max(D^{+}, D^{-})\).
      Returns:
      The value D.
    • getMaximumDifference

      public double getMaximumDifference()
      Returns \(D^{+}\), the maximum difference between the theoretical and empirical CDFs.
      Returns:
      The value \(D^{+}\).
    • getMinimumDifference

      public double getMinimumDifference()
      Returns \(D^{-}\), the minimum difference between the theoretical and empirical CDFs.
      Returns:
      The value \(D^{-}\).
    • getZ

      public double getZ()
      Returns the normalized D statistic without the continuity correction applied.
      Returns:
      the value Z
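The documentation above does not state how D is normalized. A conventional choice, assumed here and not confirmed for this class, is \(z = \sqrt{n}\,D\), for which the classical large-sample one-sided tail probability is \(e^{-2z^2}\). The sketch below shows only that textbook relationship. The class and method names are hypothetical, and the values returned by getZ and by the p-value methods may be computed differently.

```java
// Hypothetical helper, not part of com.imsl.stat.
final class ZScoreSketch {

    /** Conventional normalization sqrt(n) * D (an assumption about getZ). */
    static double normalizedD(double d, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        return Math.sqrt(n) * d;
    }

    /** Classical large-sample one-sided tail probability exp(-2 z^2), capped at 1. */
    static double asymptoticOneSidedTail(double z) {
        return Math.min(1.0, Math.exp(-2.0 * z * z));
    }
}
```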
    • getOneSidedPValue

      public double getOneSidedPValue()
      Probability of the statistic exceeding D under the null hypothesis of equality and against the one-sided alternative. An exact probability is computed if the number of observations is less than or equal to 80; otherwise an approximate probability is computed.
      Returns:
      the one-sided probability.
    • getTwoSidedPValue

      public double getTwoSidedPValue()
      Probability of the statistic exceeding D under the null hypothesis of equality and against the two-sided alternative. This probability is twice the probability, \(p_1\), reported by getOneSidedPValue (or 1.0 if \(p_1 \ge 1/2\)). This approximation is nearly exact when \(p_1 \lt 0.1\).
      Returns:
      the two-sided probability.
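The doubling rule described for getTwoSidedPValue amounts to \(\min(1, 2p_1)\). A minimal sketch of that rule follows. The class and method names are hypothetical and the code only restates the rule given above; it does not call the library.

```java
// Hypothetical helper, not part of com.imsl.stat.
final class TwoSidedSketch {

    /** Doubles the one-sided p-value and caps the result at 1.0. */
    static double twoSidedFromOneSided(double p1) {
        if (Double.isNaN(p1) || p1 < 0.0 || p1 > 1.0) {
            throw new IllegalArgumentException("p1 must be a probability in [0, 1]");
        }
        return Math.min(1.0, 2.0 * p1);
    }
}
```

With \(p_1 = 0.03\) this gives 0.06, and any \(p_1 \ge 1/2\) gives 1.0.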
    • getNumberMissing

      public int getNumberMissing()
      Returns the number of missing values in the data.
      Returns:
      The number of missing values.
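The class description defines n as the number of observations minus the number of missing observations. The sketch below illustrates that bookkeeping under the assumption that a missing value is encoded as Double.NaN; the documentation above does not say how missing values are represented. The class and method names are hypothetical.

```java
// Hypothetical helper, not part of com.imsl.stat.
final class MissingSketch {

    /** Counts entries treated as missing (assumed here to be NaN). */
    static int countMissing(double[] x) {
        int missing = 0;
        for (double value : x) {
            if (Double.isNaN(value)) {
                missing++;
            }
        }
        return missing;
    }

    /** Effective sample size n: observations minus missing observations. */
    static int effectiveSampleSize(double[] x) {
        return x.length - countMissing(x);
    }
}
```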