Usage Notes

The functions in this chapter test for goodness of fit and for randomness. The goodness-of-fit tests are described in Conover (1980). Two of them apply to general distributions: a Kolmogorov-Smirnov test and a chi-squared test. For both, the user supplies the hypothesized cumulative distribution function.
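To illustrate what "supplying the hypothesized cumulative distribution function" means in practice, the following is a minimal, library-independent sketch in plain Python. It computes the two-sided Kolmogorov-Smirnov statistic for a sample against a caller-supplied CDF. The names `ks_statistic`, `uniform_cdf`, and `exponential_cdf` are hypothetical and are not functions of this library; the sketch also assumes a continuous hypothesized distribution and returns only the statistic, not a p-value.

```python
import math


def ks_statistic(sample, cdf):
    """Two-sided Kolmogorov-Smirnov statistic D_n of `sample` against `cdf`.

    `cdf` is any callable returning the hypothesized P(X <= x).
    (Hypothetical helper for illustration only.)
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x, so the largest
        # gap can occur just before or just after the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


def exponential_cdf(x, rate=1.0):
    """CDF of the exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0
```

A call such as `ks_statistic(data, uniform_cdf)` tests the sample against a uniform hypothesis; passing `lambda x: exponential_cdf(x, rate=2.0)` switches the hypothesis without changing the test code.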

One function (Lilliefors) tests specifically for exponential distributions, and five functions (Shapiro-Wilk, Lilliefors, Mardia, Anderson-Darling, and Cramer-von Mises) test specifically for normal distributions.

When the sample contains fewer than 5,000 observations, the Shapiro-Wilk test provides an accurate estimate of its p-value. The Lilliefors test is also popular, but its p-value estimates are accurate only between 0.01 and 0.1: values below 0.01 are always returned as 0.01, and values above 0.1 are returned as 0.5. The general version of the chi-squared test can also be applied to the normal distribution.
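The reporting rule for Lilliefors p-values is easy to misread, so here is a small sketch of it in plain Python. Both function names are hypothetical and are not part of this library; the code only restates the rule given above (below 0.01 reported as 0.01, above 0.1 reported as 0.5) and shows how a returned value might be interpreted.

```python
def reported_p_value(p):
    """Map an underlying p-value to the value that would be reported.

    (Hypothetical helper that restates the rule in the text.)
    """
    if p < 0.01:
        return 0.01  # anything smaller is reported as 0.01
    if p > 0.1:
        return 0.5   # anything larger is reported as 0.5
    return p         # only this range carries an actual estimate


def interpret_reported_p(p_reported):
    """Describe what a reported Lilliefors p-value says about the true one."""
    if p_reported <= 0.01:
        return "p <= 0.01"
    if p_reported > 0.1:
        return "p > 0.1"
    return "p is approximately {:g}".format(p_reported)
```

In other words, a returned 0.5 should be read as "greater than 0.1", not as an estimate of one half, and a returned 0.01 should be read as an upper bound.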

The tests for randomness are often used to evaluate the adequacy of pseudorandom number generators. These tests are discussed in Knuth (1981).

The Kolmogorov-Smirnov functions in this chapter compute exact probabilities for small to moderate sample sizes. The chi-squared goodness-of-fit test may be used with discrete as well as continuous distributions.
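As an illustration of the discrete case, the sketch below computes the Pearson chi-squared statistic for observed category counts against hypothesized category probabilities. It is plain Python and independent of this library; `chi_squared_statistic` is a hypothetical name, and the degrees of freedom returned assume that no distribution parameters were estimated from the data.

```python
def chi_squared_statistic(observed_counts, expected_probs):
    """Pearson chi-squared statistic and degrees of freedom.

    `observed_counts[i]` is the number of observations in category i and
    `expected_probs[i]` is its hypothesized probability.
    (Hypothetical helper for illustration only.)
    """
    if len(observed_counts) != len(expected_probs):
        raise ValueError("counts and probabilities must have equal length")
    if any(p <= 0.0 for p in expected_probs):
        raise ValueError("every hypothesized probability must be positive")
    n = sum(observed_counts)
    stat = 0.0
    for observed, p in zip(observed_counts, expected_probs):
        expected = n * p
        stat += (observed - expected) ** 2 / expected
    # k categories, no estimated parameters: k - 1 degrees of freedom.
    return stat, len(observed_counts) - 1


# Example: 60 rolls of a die tested against the fair-die hypothesis.
die_counts = [8, 12, 9, 11, 10, 10]
die_stat, die_df = chi_squared_statistic(die_counts, [1.0 / 6.0] * 6)
```

For a continuous distribution the same statistic applies once the data are binned, with each cell probability obtained as a difference of the hypothesized CDF at the cell boundaries.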

The Kolmogorov-Smirnov, chi-squared, Anderson-Darling, and Cramer-von Mises goodness-of-fit test functions allow for missing values (NaN, not a number) in the input data. The functions that test for randomness do not allow for missing values.
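The difference between the two behaviors can be sketched in plain Python. This page does not say how missing values are treated internally, so the sketch assumes they are simply excluded before a goodness-of-fit statistic is computed; that is an assumption, and both helper names are hypothetical rather than part of this library.

```python
import math


def split_missing(data):
    """Return (valid_values, number_missing) for a sequence of floats.

    ASSUMPTION: missing values (NaN) are dropped before testing.
    """
    valid = [x for x in data if not math.isnan(x)]
    return valid, len(data) - len(valid)


def require_complete(data):
    """Reject input containing NaN, as a randomness test would need to."""
    if any(math.isnan(x) for x in data):
        raise ValueError("missing values (NaN) are not allowed")
    return list(data)
```

Dropping values is harmless for a goodness-of-fit statistic, which depends only on the set of observed values. A randomness test depends on the order of the sequence, so silently removing an element would change what is being tested.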