adNormalityTest

Performs an Anderson‑Darling test for normality.

Synopsis

adNormalityTest (x)

Required Arguments

float x[] (Input): Vector of length nObs containing the observations.

Return Value

The p‑value for the Anderson‑Darling test of normality.

Optional Arguments

stat (Output): The Anderson‑Darling statistic.
nMissing (Output): The number of missing observations.

Description

Given a data sample $\{X_i, i = 1, \ldots, n\}$, where n = nObs and $X_i$ = x[i-1], function adNormalityTest computes the Anderson-Darling (AD) normality statistic A = stat and the corresponding return value (p‑value) P = {probability that a normally distributed n-element sample would have an AD statistic > A}. If P is sufficiently small (e.g., $P < 0.05$), the AD test indicates that the null hypothesis that the data sample is normally distributed should be rejected. A is calculated as:

$A = -n - \frac{1}{n} \sum_{i=1}^{n} \left[(2i-1) \ln \left(\phi \left(Y_i\right)\right) + (2n-2i+1) \ln \left(1-\phi\left(Y_i\right)\right)\right]$

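where $Y_i=\left( X_i-\overline{X} \right)/s$, $\overline{X}$ and s are the sample mean and standard deviation, respectively, and $\phi$ is the standard normal cumulative distribution function, with the $Y_i$ arranged in ascending order before summation. P is calculated by first transforming A to an “n‑adjusted” statistic $A^*$: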

$A^* = A\left(1.0 + \frac{0.75}{n} + \frac{2.25}{n^2}\right)$

and then calculating P in terms of $A^*$ using a parabolic approximation taken from Table 4.9 in Stephens (1986).
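The calculation described above can be sketched in plain Python. This is an illustrative sketch only, not the library's implementation: the function names `ad_statistic` and `ad_p_value` are hypothetical, the standard deviation is assumed to use the n - 1 divisor, and the piecewise coefficients are the commonly quoted fits attributed to Stephens (1986), which are assumed (not confirmed) to match the table the library uses.

```python
import math


def ad_statistic(x):
    """Anderson-Darling statistic A for normality (mean and s estimated)."""
    n = len(x)
    mean = sum(x) / n
    # Assumption: sample standard deviation with the n - 1 divisor.
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    # Standardize and sort, so that Y_1 <= ... <= Y_n.
    y = sorted((v - mean) / s for v in x)
    total = 0.0
    for i, yi in enumerate(y, start=1):
        # erfc keeps both tails accurate: Phi(y) = erfc(-y / sqrt(2)) / 2.
        log_cdf = math.log(0.5 * math.erfc(-yi / math.sqrt(2.0)))
        log_sf = math.log(0.5 * math.erfc(yi / math.sqrt(2.0)))
        total += (2 * i - 1) * log_cdf + (2 * n - 2 * i + 1) * log_sf
    return -n - total / n


def ad_p_value(a, n):
    """Approximate p-value from A via the n-adjusted statistic A*."""
    a_star = a * (1.0 + 0.75 / n + 2.25 / n ** 2)
    # Assumption: piecewise fits commonly attributed to Stephens (1986).
    if a_star < 0.2:
        return 1.0 - math.exp(-13.436 + 101.14 * a_star - 223.73 * a_star ** 2)
    if a_star < 0.34:
        return 1.0 - math.exp(-8.318 + 42.796 * a_star - 59.938 * a_star ** 2)
    if a_star < 0.6:
        return math.exp(0.9177 - 4.279 * a_star - 1.38 * a_star ** 2)
    return math.exp(1.2937 - 5.709 * a_star + 0.0186 * a_star ** 2)
```

Calling `ad_p_value(ad_statistic(x), len(x))` on a list of floats gives a p-value that can be compared with the value returned by adNormalityTest. Small differences are possible if the library uses different coefficients or precision.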

Example

The following example is taken from Conover (1980, pages 364 and 195). The data consists of 50 two‑digit numbers taken from a telephone book. The AD test fails to reject the null hypothesis of normality at the 0.05 level of significance.

from __future__ import print_function
from pyimsl.stat.adNormalityTest import adNormalityTest

# 50 two-digit numbers taken from a telephone book (Conover, 1980).
x = [23.0, 36.0, 54.0, 61.0, 73.0, 23.0, 37.0, 54.0, 61.0, 73.0,
     24.0, 40.0, 56.0, 62.0, 74.0, 27.0, 42.0, 57.0, 63.0, 75.0,
     29.0, 43.0, 57.0, 64.0, 77.0, 31.0, 43.0, 58.0, 65.0, 81.0,
     32.0, 44.0, 58.0, 66.0, 87.0, 33.0, 45.0, 58.0, 68.0, 89.0,
     33.0, 48.0, 58.0, 68.0, 93.0, 35.0, 48.0, 59.0, 70.0, 97.0]
# Empty lists that receive the optional outputs.
adstat = []
nmiss = []

p_value = adNormalityTest(x,
                          stat=adstat,
                          nMissing=nmiss)

print("Anderson-Darling statistic = %11.4f " % adstat[0])
print("p-value = %11.4f" % p_value)
print("# missing values = %4d" % nmiss[0])

Output

Anderson-Darling statistic =      0.3339 
p-value =      0.5024
# missing values =    0
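
As a rough consistency check, the printed statistic can be pushed through the n‑adjustment by hand. The coefficients in the second line are the commonly quoted Stephens (1986) fit for $0.2 \le A^* < 0.34$; whether the library uses exactly these values is an assumption here.

$A^* = 0.3339\left(1 + \frac{0.75}{50} + \frac{2.25}{50^2}\right) = 0.3339 \times 1.0159 \approx 0.3392$

$P \approx 1 - \exp\left(-8.318 + 42.796\,A^* - 59.938\,{A^*}^2\right) \approx 1 - \exp(-0.698) \approx 0.502$

This agrees with the printed p‑value of 0.5024 to within the rounding of the statistic.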

Informational Errors

IMSLS_PVAL_UNDERFLOW The p‑value has fallen below the minimum value of # for which its calculation has any accuracy; ZERO is returned.

Fatal Errors

IMSLS_TOO_MANY_MISSING After removing the missing observations only 2 observations remain. The test cannot proceed.