lillieforsNormalityTest

Performs a Lilliefors test for normality.

Synopsis

lillieforsNormalityTest (x)

Required Arguments

float x[] (Input)
Array of size nObservations containing the observations.

Return Value

The p-value for the Lilliefors test for normality. Probabilities less than 0.01 are reported as 0.01, and probabilities greater than 0.10 are reported as 0.50.
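
Because the return value is clamped at both ends, it is better read as a range than as an exact probability. The following is a minimal sketch of that reading. The helper describe_p_value is hypothetical and not part of PyIMSL, and the small tolerance used in the comparisons is an assumption.

```python
def describe_p_value(p_value, tol=1e-12):
    """Translate the reported p-value into the range it stands for.

    Hypothetical helper, not part of PyIMSL. It assumes the clamped
    values are returned as 0.01 and 0.5, as stated under Return Value.
    """
    if p_value <= 0.01 + tol:
        # Computed probabilities below 0.01 are reported as 0.01.
        return "p <= 0.01"
    if p_value >= 0.5 - tol:
        # Computed probabilities above 0.10 are reported as 0.5.
        return "p > 0.10"
    # Inside (0.01, 0.10) the approximation is reported as computed.
    return "p = %.4f" % p_value


print(describe_p_value(0.5))
print(describe_p_value(0.037))
```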

Optional Arguments

maxDifference (Output)
The maximum absolute difference between the empirical and the theoretical distributions is returned in maxDifference.

Description

This function computes the Lilliefors test statistic and its p-value for a normal distribution in which both the mean and the variance are estimated from the data. The one-sample, two-sided Kolmogorov-Smirnov statistic D is computed first. The p-value is then computed using an analytic approximation given by Dallal and Wilkinson (1986). Because Dallal and Wilkinson give approximations only in the range (0.01, 0.10), if the computed probability of a greater D is less than 0.01, the Lilliefors test by convention calls for rejection and the p-value is set to 0.01. If the computed probability of a greater D is greater than 0.10, by convention the null hypothesis is not rejected and the p-value is set to 0.50. Note that because the parameters are estimated, the p-value in the Lilliefors test is not the same as in the Kolmogorov-Smirnov test.
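
To make the statistic concrete, the sketch below computes a one-sample, two-sided Kolmogorov-Smirnov distance D against a normal distribution whose mean and standard deviation are estimated from the sample. It is an illustration only, not the library's implementation: the function name lilliefors_statistic is hypothetical, the use of n - 1 in the variance estimate is an assumption, and the Dallal and Wilkinson p-value approximation is not reproduced.

```python
import math


def lilliefors_statistic(x):
    """Return the maximum absolute difference D between the empirical CDF
    of x and a normal CDF with mean and variance estimated from x.

    Hypothetical helper for illustration; not part of PyIMSL.
    """
    n = float(len(x))
    mean = sum(x) / n
    # Sample standard deviation with n - 1 in the denominator (assumption).
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1.0))
    d = 0.0
    for i, v in enumerate(sorted(x), start=1):
        # Normal CDF evaluated at the standardized observation.
        cdf = 0.5 * (1.0 + math.erf((v - mean) / (sd * math.sqrt(2.0))))
        # The empirical CDF jumps from (i - 1)/n to i/n at this point,
        # so check the gap on both sides of the jump.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


print(lilliefors_statistic([1.0, 2.0, 3.0, 4.0, 5.0]))
```

The sketch does not handle missing values, fewer than five observations, or data with no variation, all of which the library reports as errors.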

Observations from a normal distribution should not be tied. If tied observations are found, a warning message (IMSLS_TWO_OR_MORE_TIED) is issued. A general reference for the Lilliefors test is Conover (1980). The original reference for the test for normality is Lilliefors (1967).
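
If it is useful to know in advance whether the tie message will appear, the data can be checked before the call. The following is a minimal sketch using only the Python standard library. The helper tied_values is hypothetical, and it compares values for exact equality, which is assumed here to be what counts as a tie.

```python
from collections import Counter


def tied_values(x):
    """Return the sorted values that occur more than once in x.

    Hypothetical helper, not part of PyIMSL; ties are detected by
    exact equality of the values.
    """
    counts = Counter(x)
    return sorted(v for v, c in counts.items() if c > 1)


sample = [23.0, 36.0, 54.0, 61.0, 73.0, 23.0]
print(tied_values(sample))
```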

Example

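The data are the head circumference measurements for 50 male infants. The Lilliefors test fails to reject the null hypothesis of normality: the computed p-value is greater than 0.10 and is therefore reported as 0.50. Because the data contain tied observations, a warning is also issued.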

from __future__ import print_function
from pyimsl.stat.lillieforsNormalityTest import lillieforsNormalityTest

x = [23.0, 36.0, 54.0, 61.0, 73.0, 23.0,
     37.0, 54.0, 61.0, 73.0, 24.0, 40.0,
     56.0, 62.0, 74.0, 27.0, 42.0, 57.0,
     63.0, 75.0, 29.0, 43.0, 57.0, 64.0,
     77.0, 31.0, 43.0, 58.0, 65.0, 81.0,
     32.0, 44.0, 58.0, 66.0, 87.0, 33.0,
     45.0, 58.0, 68.0, 89.0, 33.0, 48.0,
     58.0, 68.0, 93.0, 35.0, 48.0, 59.0,
     70.0, 97.0]
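# Empty list that receives the maximum difference as its first element
max_diff = []

# Lilliefors test; the return value is the (clamped) p-value
p_value = lillieforsNormalityTest(x, maxDifference=max_diff)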

print("p-value = %11.4f" % (p_value))
print("Max difference = ", max_diff[0])

Output

***
*** Warning (immediate) error issued from IMSL function lL4llf :
*** Two or more elements in "x" are tied.
***
p-value =      0.5000
Max difference =  0.08107085426241684

Warning Errors

IMSLS_TWO_OR_MORE_TIED Two or more elements in “x” are tied.

Fatal Errors

IMSLS_NEED_AT_LEAST_5 All but # elements of “x” are missing. At least five non-missing observations are necessary to continue.
IMSLS_NEG_IN_EXPONENTIAL In testing the exponential distribution, an invalid element in “x” is found (“x[]” = #). Negative values are not possible in exponential distributions.
IMSLS_NO_VARIATION_INPUT There is no variation in the input data. All non-missing observations are tied.