exactEnumeration

Computes exact probabilities in a two-way contingency table using the total enumeration method.

Synopsis

exactEnumeration(table)

Required Arguments

float table[[]] (Input)
Array of size nRows × nColumns containing the observed counts in the contingency table.

Return Value

The p-value for independence of rows and columns. The p-value represents the probability of a more extreme table where “extreme” is taken in the Neyman-Pearson sense. The p-value is “two-sided”.

Optional Arguments

probTable (Output)
Probability of the observed table occurring, given that the null hypothesis of independent rows and columns is true.
pValue (Output)
The p-value for independence of rows and columns. The p-value represents the probability of a more extreme table, where “extreme” is taken in the Neyman-Pearson sense. A table is more extreme if its probability (for fixed marginals) is less than or equal to probTable. The p-value is “two-sided”. It is also returned as the function value (see “Return Value”).
checkNumericalError (Output)
Sum of the probabilities of all tables with the same marginal totals. checkNumericalError should have a value of 1.0. Deviation from 1.0 indicates numerical error.

Description

Function exactEnumeration computes exact probabilities for an r × c contingency table with fixed row and column marginals (a marginal is the total count in a row or column), where r = nRows and c = nColumns. Let \(f_{ij}\) denote the count in row i and column j of a table, and let \(f_{i\bullet}\) and \(f_{\bullet j}\) denote the row and column marginals. Under the hypothesis of independence, the (conditional) probability of the observed table, given its fixed marginals, is

\[P_f = \frac{\prod\limits_{i=1}^{r} f_{i\bullet}! \prod\limits_{j=1}^{c} f_{\bullet j}!} {f_{\bullet\bullet}! \prod\limits_{i=1}^{r} \prod\limits_{j=1}^{c} f_{ij}!}\]

where \(f_{\bullet\bullet }\) is the total number of counts in the table. \(P_f\) corresponds to output argument probTable.
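As a rough illustration of the formula for \(P_f\), the following sketch evaluates it directly with factorials for a small table. The helper name table_probability is hypothetical and is not part of the library; this is not how exactEnumeration is implemented, only a direct transcription of the formula.

```python
from math import factorial, prod


def table_probability(table):
    """Conditional probability P_f of `table` given its marginals.

    Hypothetical helper: a direct transcription of the formula,
    not the library's implementation.
    """
    row_sums = [sum(row) for row in table]        # f_i. for each row i
    col_sums = [sum(col) for col in zip(*table)]  # f_.j for each column j
    total = sum(row_sums)                         # f_..
    numerator = prod(factorial(n) for n in row_sums + col_sums)
    denominator = factorial(total) * prod(
        factorial(n) for row in table for n in row
    )
    return numerator / denominator


observed = [[8, 12], [8, 2]]
p_f = table_probability(observed)
print("P_f = %.6f" % p_f)
```

Factorials grow quickly, so for large counts a log-gamma formulation (for example with math.lgamma) would be the safer choice in floating point; Python's arbitrary-precision integers make the direct form adequate for a small table like this one.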

A table X is defined as “more extreme” than the observed table, in the probabilistic sense, if the conditional probability computed for table X (for the same marginal sums) is less than or equal to the conditional probability computed for the observed table. The observed table itself therefore contributes to the p-value. Note that this definition can be considered “two-sided” in the cell counts.
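The definition can be made concrete with a small pure-Python sketch that lists every table sharing the observed marginals, sums all probabilities (the analogue of checkNumericalError), and sums those that do not exceed the observed table's probability (the analogue of pValue). All function names here are hypothetical, and exact rational arithmetic is an assumption made for clarity; the library's internal algorithm is not documented here.

```python
from fractions import Fraction
from math import factorial, prod


def table_probability(table):
    """Exact conditional probability of `table` given its marginals."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    numerator = prod(factorial(n) for n in row_sums + col_sums)
    denominator = factorial(sum(row_sums)) * prod(
        factorial(n) for row in table for n in row
    )
    return Fraction(numerator, denominator)


def rows_with_sum(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in rows_with_sum(total - first, caps[1:]):
            yield (first,) + rest


def tables_with_marginals(row_sums, col_sums):
    """Yield every non-negative integer table with the given marginals."""
    if len(row_sums) == 1:
        yield (tuple(col_sums),)  # the last row is forced by what remains
        return
    for row in rows_with_sum(row_sums[0], col_sums):
        remaining = [c - x for c, x in zip(col_sums, row)]
        for rest in tables_with_marginals(row_sums[1:], remaining):
            yield (row,) + rest


def enumerate_exact(table):
    """Return (prob_table, p_value, check) as exact fractions."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    prob_table = table_probability(table)
    p_value = check = Fraction(0)
    for candidate in tables_with_marginals(row_sums, col_sums):
        p = table_probability(candidate)
        check += p              # sums to exactly 1 over all tables
        if p <= prob_table:     # "more extreme" includes ties
            p_value += p
    return prob_table, p_value, check


prob_table, p_value, check = enumerate_exact([[8, 12], [8, 2]])
print("p-value = %9.4f" % float(p_value))
print("check   = %9.4f" % float(check))
```

Because the sketch uses Fraction, the comparison p <= prob_table is exact and the check sum is exactly 1; a floating-point implementation would need a tolerance for ties, which is why a check value is useful in practice.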

Because exactEnumeration uses total enumeration to compute the probability of a more extreme table, the amount of computer time required increases very rapidly with the size of the table. Tables with a large total count \(f_{\bullet\bullet }\) or a large value of r × c should not be analyzed using exactEnumeration. In such cases, try using exactNetwork.
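To get a feel for this growth, the sketch below counts how many tables share a given set of marginals without computing any probabilities. The function count_tables and the chosen marginals are illustrative assumptions, not library features; the count is simply the number of terms a total enumeration would have to visit.

```python
from functools import lru_cache


def remaining_after_row(total, caps):
    """Yield the column totals left after placing one row summing to `total`."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (caps[0] - total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in remaining_after_row(total - first, caps[1:]):
            yield (caps[0] - first,) + rest


@lru_cache(maxsize=None)
def count_tables(row_sums, col_sums):
    """Count non-negative integer tables with the given marginals.

    Hypothetical helper; both arguments must be tuples with equal sums.
    """
    if len(row_sums) == 1:
        return 1  # the last row is forced by the remaining column totals
    return sum(
        count_tables(row_sums[1:], rest)
        for rest in remaining_after_row(row_sums[0], col_sums)
    )


# Marginals of the 2 x 2 example table, then larger balanced tables.
for rows, cols in [
    ((20, 10), (16, 14)),
    ((10, 10, 10), (10, 10, 10)),
    ((10, 10, 10, 10), (10, 10, 10, 10)),
]:
    print(len(rows), "x", len(cols), "tables:", count_tables(rows, cols))
```

Memoizing on the remaining marginals keeps the counting itself cheap; an actual enumeration has no such shortcut, since each table's probability must be evaluated individually.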

Example

In this example, the exact conditional probability for the 2 × 2 contingency table

\[\begin{split}\begin{bmatrix} 8 & 12 \\ 8 & 2 \\ \end{bmatrix}\end{split}\]

is computed.

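from __future__ import print_function
import numpy as np
from pyimsl.stat.exactEnumeration import exactEnumeration

# Observed 2 x 2 contingency table of counts
table = np.array([[8, 12], [8, 2]])

# Two-sided exact p-value for independence of rows and columns
p = exactEnumeration(table)

print("p-value = %9.4f" % p)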

Output

p-value =    0.0577