adNormalityTest

Performs an Anderson‑Darling test for normality.

Synopsis

adNormalityTest (x)

Required Arguments

float x[] (Input)
Vector of length nObs containing the observations.

Return Value

The p‑value for the Anderson‑Darling test of normality.

Optional Arguments

stat (Output)
The Anderson‑Darling statistic.
nMissing (Output)
The number of missing observations.

Description

Given a data sample \(\{X_i, i = 1, \ldots, n\}\), where n = nObs and \(X_i\) = x[i-1], function adNormalityTest computes the Anderson-Darling (AD) normality statistic A = stat and the corresponding return value (p‑value) P = {probability that a normally distributed n-element sample would have an AD statistic > A}. If P is sufficiently small (e.g., \(P < 0.05\)), the AD test indicates that the null hypothesis that the data sample is normally distributed should be rejected. A is calculated as:

\[A = -n - \frac{1}{n} \sum_{i=1}^{n} \left[(2i-1) \ln \left(\phi \left(Y_i\right)\right) + (2n-2i+1) \ln \left(1-\phi\left(Y_i\right)\right)\right]\]

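where \(Y_i=\left( X_i-\overline{X} \right)/s\), \(\overline{X}\) and s are the sample mean and standard deviation respectively, and \(\phi\) denotes the standard normal cumulative distribution function. The sum is taken with the \(Y_i\) arranged in ascending order. P is calculated by first transforming A to an “n‑adjusted” statistic A*: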

\[A^{*} = A\left(1.0 + \frac{0.75}{n} + \frac{2.25}{n^2}\right)\]

and then calculating P in terms of A* using a parabolic approximation taken from Table 4.9 in Stephens (1986).
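The two formulas above can be sketched in plain Python. This is only an illustration of the arithmetic, not the library's implementation: the helper names normal_cdf and ad_statistic and the sample values are hypothetical, and the sketch assumes no missing observations.

```python
import math
import statistics


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ad_statistic(x):
    """Return (A, A_star) for the sample x, following the formulas above."""
    n = len(x)
    mean = statistics.mean(x)
    s = statistics.stdev(x)  # sample standard deviation (n - 1 divisor)
    # Standardize, then sort: the sum runs over the ordered sample.
    y = sorted((xi - mean) / s for xi in x)
    total = 0.0
    for i, yi in enumerate(y, start=1):
        cdf = normal_cdf(yi)
        total += ((2 * i - 1) * math.log(cdf)
                  + (2 * n - 2 * i + 1) * math.log(1.0 - cdf))
    a = -n - total / n
    # "n-adjusted" statistic used for the p-value lookup.
    a_star = a * (1.0 + 0.75 / n + 2.25 / n ** 2)
    return a, a_star


# Hypothetical sample, used only to exercise the function.
sample = [4.1, 5.3, 2.8, 6.0, 4.7, 5.1, 3.9, 4.4]
a, a_star = ad_statistic(sample)
print("A  = %.4f" % a)
print("A* = %.4f" % a_star)
```

Because the observations are standardized and sorted before the sum is formed, the result does not depend on the order, location, or scale of the input.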

Example

The following example is taken from Conover (1980, pages 364 and 195). The data consists of 50 two‑digit numbers taken from a telephone book. The AD test fails to reject the null hypothesis of normality at the 0.05 level of significance.

from __future__ import print_function
from pyimsl.stat.adNormalityTest import adNormalityTest

# 50 two-digit numbers from Conover (1980)
x = [23.0, 36.0, 54.0, 61.0, 73.0, 23.0, 37.0, 54.0, 61.0, 73.0,
     24.0, 40.0, 56.0, 62.0, 74.0, 27.0, 42.0, 57.0, 63.0, 75.0,
     29.0, 43.0, 57.0, 64.0, 77.0, 31.0, 43.0, 58.0, 65.0, 81.0,
     32.0, 44.0, 58.0, 66.0, 87.0, 33.0, 45.0, 58.0, 68.0, 89.0,
     33.0, 48.0, 58.0, 68.0, 93.0, 35.0, 48.0, 59.0, 70.0, 97.0]
# Empty lists receive the optional outputs; results are read from index 0.
adstat = []
nmiss = []

p_value = adNormalityTest(x,
                          stat=adstat,
                          nMissing=nmiss)

print("Anderson-Darling statistic = %11.4f " % adstat[0])
print("p-value = %11.4f" % p_value)
print("# missing values = %4d" % nmiss[0])

Output

Anderson-Darling statistic =      0.3339 
p-value =      0.5024
# missing values =    0
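
The reported p‑value can be cross-checked from the reported statistic alone. The sketch below assumes the piecewise exponential-of-quadratic coefficients that are widely quoted for the Stephens (1986) approximation; whether adNormalityTest uses exactly these coefficients is an assumption, and the helper name ad_p_value is hypothetical.

```python
import math


def ad_p_value(a, n):
    """Approximate p-value from the AD statistic a and sample size n.

    The coefficients are the commonly quoted ones for the Stephens (1986)
    approximation (an assumption; not confirmed as the library's values).
    """
    # "n-adjusted" statistic, as defined in the Description.
    a_star = a * (1.0 + 0.75 / n + 2.25 / n ** 2)
    if a_star < 0.2:
        return 1.0 - math.exp(-13.436 + 101.14 * a_star - 223.73 * a_star ** 2)
    if a_star < 0.34:
        return 1.0 - math.exp(-8.318 + 42.796 * a_star - 59.938 * a_star ** 2)
    if a_star < 0.6:
        return math.exp(0.9177 - 4.279 * a_star - 1.38 * a_star ** 2)
    return math.exp(1.2937 - 5.709 * a_star + 0.0186 * a_star ** 2)


# Statistic and sample size taken from the example output above.
p = ad_p_value(0.3339, 50)
print("p-value = %.4f" % p)
```

Since the input statistic is itself rounded to four decimals, the result should be close to, but not necessarily identical with, the 0.5024 printed by the example.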

Informational Errors

IMSLS_PVAL_UNDERFLOW The p‑value has fallen below the minimum value of # for which its calculation has any accuracy; ZERO is returned.

Fatal Errors

IMSLS_TOO_MANY_MISSING After removing the missing observations only 2 observations remain. The test cannot proceed.