kolmogorovOne
Performs a Kolmogorov-Smirnov one-sample test for continuous distributions.
Synopsis
kolmogorovOne (cdf, x)
Required Arguments
- float cdf(x) (Input): User-supplied function to compute the cumulative distribution function (CDF) at a given value. The form is CDF(x), where x is the value at which cdf is to be evaluated (Input) and cdf is the value of the CDF at x (Output).
- float x[] (Input): Array of size nObservations containing the observations.
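As an illustration of the callback contract described above, the following sketch defines a CDF for an exponential distribution using only the Python standard library. The names `exponential_cdf` and `RATE` are hypothetical and not part of the library. The sketch assumes that any callable taking a single float and returning a float in [0, 1] fits the required form.

```python
import math

RATE = 2.0  # hypothetical rate parameter, chosen for illustration only


def exponential_cdf(x):
    """Return F*(x) for an exponential distribution with rate RATE."""
    if x <= 0.0:
        return 0.0  # no probability mass at or below zero
    return 1.0 - math.exp(-RATE * x)
```

Such a function would then be passed as the first argument, for example kolmogorovOne(exponential_cdf, x).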
Return Value
An array of length 3 containing the asymptotic z-score Z, the one-sided p-value \(p_1\), and the two-sided p-value \(p_2\).
Optional Arguments
- differences (Output): The array containing \(D_n\), \(D_n^+\), and \(D_n^-\).
- nMissing (Output): The number of missing values is returned in nMissing.
Description
The routine kolmogorovOne performs a Kolmogorov-Smirnov goodness-of-fit
test in one sample. The hypotheses tested follow:
\[\begin{array}{ll}
H_0: F(x) = F^*(x) & H_1: F(x) \neq F^*(x) \\
H_0: F(x) \geq F^*(x) & H_1: F(x) < F^*(x) \\
H_0: F(x) \leq F^*(x) & H_1: F(x) > F^*(x)
\end{array}\]
where F is the cumulative distribution function (CDF) of the random
variable, and the theoretical CDF, \(F^*\), is specified via the
user-supplied function cdf. Let n = nObservations - nMissing.
The test statistics for both one-sided alternatives
(\(D_n^+\) = differences[1] and \(D_n^-\) = differences[2])
and the two-sided (\(D_n\) = differences[0]) alternative are computed,
as well as an asymptotic z-score (testStatistics[0], the first element
of the returned array) and p-values associated with the one-sided
(testStatistics[1]) and two-sided (testStatistics[2]) hypotheses.
For \(n>80\), asymptotic p-values are used (see Gibbons 1971).
For \(n\leq 80\), exact one-sided p-values are computed according to a
method given by Conover (1980, page 350). An approximate two-sided
p-value is obtained as twice the one-sided p-value. The approximation is
very close for one-sided p-values less than 0.10 and becomes
increasingly inaccurate as the one-sided p-value gets larger.
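To make the three difference statistics concrete, here is a minimal pure-Python sketch that computes them from the empirical CDF. The function `ks_statistics` is hypothetical and is not the library's implementation. The assignment of the two one-sided statistics follows the common textbook convention and may differ from the library's. Missing values are not handled, and the z-score is assumed to be \(\sqrt{n}\,D_n\), which agrees with the example output on this page.

```python
import math


def ks_statistics(sample, cdf):
    """Return (D, D_plus, D_minus, z) for a sample and a continuous CDF."""
    xs = sorted(sample)
    n = float(len(xs))
    d_plus = 0.0
    d_minus = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d_plus = max(d_plus, i / n - f)          # empirical CDF above F*
        d_minus = max(d_minus, f - (i - 1) / n)  # empirical CDF below F*
    d = max(d_plus, d_minus)                     # two-sided statistic
    return d, d_plus, d_minus, math.sqrt(n) * d  # assumed z = sqrt(n) * D


def uniform_cdf(x):
    """CDF of the uniform (0, 1) distribution."""
    return min(max(x, 0.0), 1.0)


# Tiny hypothetical sample, for illustration only.
d, d_plus, d_minus, z = ks_statistics([0.1, 0.4, 0.7], uniform_cdf)
print("D = %.4f, D+ = %.4f, D- = %.4f, Z = %.4f" % (d, d_plus, d_minus, z))
```

Evaluating the differences only at the sorted sample points is sufficient when the theoretical CDF is continuous, because the empirical CDF is a step function that jumps exactly at those points.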
Programming Notes
- The theoretical CDF is assumed to be continuous. If the CDF is not continuous, the statistics \(D_n^*\) will not be computed correctly.
- Estimation of parameters in the theoretical CDF from the sample data will tend to make the p-values associated with the test statistics too liberal. The empirical CDF will tend to be closer to the theoretical CDF than it should be.
- No attempt is made to check that all points in the sample are in the support of the theoretical CDF. If any sample point lies outside the support of the CDF, the null hypothesis must be rejected.
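Because the routine does not verify support, a caller may want to screen the data before running the test. The sketch below is one possible pre-check, assuming the theoretical distribution is supported on a closed interval. The helper `points_outside_support` and the sample values are hypothetical.

```python
def points_outside_support(sample, lower, upper):
    """Return the observations that fall outside the interval [lower, upper]."""
    return [x for x in sample if x < lower or x > upper]


# Hypothetical data tested against a distribution supported on [0, 1].
sample = [0.12, 0.57, 1.03, 0.88, -0.01]
offenders = points_outside_support(sample, 0.0, 1.0)
if offenders:
    print("Reject H0 without testing; outside support:", offenders)
```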
Example
In this example, a random sample of size 100 is generated via routine
randomUniform (Chapter 12, :doc:`/stat/random/index`) for
the uniform (0, 1) distribution. We want to test the null hypothesis that
the CDF is that of a normal distribution with a mean of 0.5 and a
variance equal to the uniform (0, 1) variance (1/12).
from __future__ import print_function
from numpy import *
from pyimsl.stat.kolmogorovOne import kolmogorovOne
from pyimsl.stat.normalCdf import normalCdf
from pyimsl.stat.randomSeedSet import randomSeedSet
from pyimsl.stat.randomUniform import randomUniform


def cdf(x):
    # Theoretical CDF: normal with mean 0.5 and variance 1/12
    mean = .5
    std = .2886751  # sqrt(1/12)
    z = (x - mean) / std
    return normalCdf(z)


# Draw 100 uniform (0, 1) observations with a fixed seed
nobs = 100
randomSeedSet(123457)
x = randomUniform(nobs)

# Empty lists that receive the optional outputs
nMissing = []
differences = []
statistics = kolmogorovOne(cdf, x,
                           nMissing=nMissing, differences=differences)

print("D = %8.4f" % (differences[0]))
print("D+ = %8.4f" % (differences[1]))
print("D- = %8.4f" % (differences[2]))
print("Z = %8.4f" % (statistics[0]))
print("Prob greater D one sided = %8.4f" % (statistics[1]))
print("Prob greater D two sided = %8.4f" % (statistics[2]))
print("N missing = %d" % (nMissing[0]))
Output
D = 0.1471
D+ = 0.0810
D- = 0.1471
Z = 1.4708
Prob greater D one sided = 0.0132
Prob greater D two sided = 0.0264
N missing = 0
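The printed values appear consistent with two textbook large-sample relationships: \(Z = \sqrt{n}\,D_n\), and a one-sided p-value of approximately \(e^{-2Z^2}\), with the two-sided p-value taken as twice that. The check below is only arithmetic on the output above (here n = 100 > 80, so asymptotic p-values apply). It is an assumption for illustration, not a statement of the library's internal formula.

```python
import math

n = 100
d = 0.1471   # D from the output above
z = 1.4708   # Z from the output above

z_from_d = math.sqrt(n) * d            # compare with the printed Z
p_one_sided = math.exp(-2.0 * z * z)   # compare with the printed one-sided value
p_two_sided = 2.0 * p_one_sided        # compare with the printed two-sided value

print("%.4f %.4f %.4f" % (z_from_d, p_one_sided, p_two_sided))
```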
Warning Errors
IMSLS_TIE_DETECTED | # ties were detected in the sample.
Fatal Errors
IMSLS_STOP_USER_FCN | Request from user-supplied function to stop algorithm. User flag = “#”.