exactEnumeration
Computes exact probabilities in a two-way contingency table using the total enumeration method.
Synopsis
exactEnumeration(table)
Required Arguments
float table[[]]
(Input) - Array of length nRows × nColumns containing the observed counts in the contingency table.
Return Value
The p-value for independence of rows and columns. The p-value represents the probability of a more extreme table where “extreme” is taken in the Neyman-Pearson sense. The p-value is “two-sided”.
Optional Arguments
probTable
(Output) - Probability of the observed table occurring, given that the null hypothesis of independent rows and columns is true.
pValue
(Output) - The p-value for independence of rows and columns. The p-value represents the probability of a more extreme table, where “extreme” is taken in the Neyman-Pearson sense. The p-value is “two-sided”. The p-value is also returned in functional form (see “Return Value”). A table is more extreme if its probability (for fixed marginals) is less than or equal to probTable.
checkNumericalError
(Output) - Sum of the probabilities of all tables with the same marginal totals. checkNumericalError should have a value of 1.0. Deviation from 1.0 indicates numerical error.
Description
Function exactEnumeration computes exact probabilities for an r × c contingency table for fixed row and column marginals (a marginal is the number of counts in a row or column), where r = nRows and c = nColumns. Let \(f_{ij}\) denote the count in row i and column j of a table, and let \(f_{i\bullet}\) and \(f_{\bullet j}\) denote the row and column marginals. Under the hypothesis of independence, the (conditional) probability of the observed table, given its fixed marginals, is
\[P_f = \frac{\left(\prod_{i=1}^{r} f_{i\bullet}!\right)\left(\prod_{j=1}^{c} f_{\bullet j}!\right)}{f_{\bullet\bullet}!\,\prod_{i=1}^{r}\prod_{j=1}^{c} f_{ij}!}\]
where \(f_{\bullet\bullet }\) is the total number of counts in the table. \(P_f\) corresponds to output argument probTable.
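As an illustration of the conditional probability \(P_f\), the following is a minimal pure-Python sketch, not the library's implementation. The helper name table_probability is hypothetical, and the use of exact integer factorials is an assumption that is only practical for small counts.

```python
from math import factorial, prod


def table_probability(table):
    """Conditional probability of `table` given its row and column marginals.

    Hypothetical helper for illustration only; it is not part of PyIMSL.
    """
    row_totals = [sum(row) for row in table]        # f_{i.}
    col_totals = [sum(col) for col in zip(*table)]  # f_{.j}
    total = sum(row_totals)                         # f_{..}
    # Product of the factorials of all marginals.
    numerator = prod(factorial(m) for m in row_totals + col_totals)
    # Factorial of the grand total times the factorials of all cell counts.
    denominator = factorial(total) * prod(
        factorial(f) for row in table for f in row
    )
    # Python divides arbitrarily large integers into a correctly rounded float.
    return numerator / denominator


print("probTable = %.5f" % table_probability([[8, 12], [8, 2]]))
```

For a 2 × 2 table this reduces to the hypergeometric probability; for the table used in the Example section it equals 5668650 / 145422675, roughly 0.039.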
A table X is “more extreme” than the observed table, in the probabilistic sense, if the conditional probability computed for X (for the same marginal sums) is less than or equal to the conditional probability computed for the observed table. Note that this definition can be considered “two-sided” in the cell counts.
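The notion of a “more extreme” table can be sketched as a brute-force enumeration over all tables that share the observed marginals. This is a hedged illustration, not how exactEnumeration is implemented internally: the names table_probability, fill_row, enumerate_tables and exact_p_value are hypothetical, and exact rational arithmetic (Fraction) is an assumption chosen so that ties are compared exactly. Following the description of pValue, a table contributes to the p-value when its probability is less than or equal to that of the observed table.

```python
from fractions import Fraction
from math import factorial, prod


def table_probability(table):
    """Exact conditional probability of `table` given its marginals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    numerator = prod(factorial(m) for m in row_totals + col_totals)
    denominator = factorial(sum(row_totals)) * prod(
        factorial(f) for row in table for f in row
    )
    return Fraction(numerator, denominator)


def fill_row(total, caps):
    """Yield every way to spread `total` counts over columns with capacities `caps`."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield [total]
        return
    for x in range(min(total, caps[0]) + 1):
        for rest in fill_row(total - x, caps[1:]):
            yield [x] + rest


def enumerate_tables(row_totals, col_totals):
    """Yield every table of non-negative counts with the given marginals.

    Assumes sum(row_totals) == sum(col_totals).
    """
    if len(row_totals) == 1:
        # The last row is forced by the remaining column capacities.
        yield [list(col_totals)]
        return
    for row in fill_row(row_totals[0], col_totals):
        remaining = [c - x for c, x in zip(col_totals, row)]
        for rest in enumerate_tables(row_totals[1:], remaining):
            yield [row] + rest


def exact_p_value(table):
    """Return (prob_table, p_value, check) as floats for the observed table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    p_observed = table_probability(table)
    p_value = Fraction(0)
    check = Fraction(0)
    for candidate in enumerate_tables(row_totals, col_totals):
        p = table_probability(candidate)
        check += p                 # should total 1 over all tables
        if p <= p_observed:        # "more extreme" includes ties
            p_value += p
    return float(p_observed), float(p_value), float(check)


prob_table, p_value, check = exact_p_value([[8, 12], [8, 2]])
print("p-value = %9.4f" % p_value)
```

For the 2 × 2 table from the Example section, this sketch gives a p-value of 0.0577, and the sum of all table probabilities is exactly 1 because the arithmetic is rational. A floating-point implementation would instead see small deviations from 1, which is what checkNumericalError reports.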
Because exactEnumeration uses total enumeration in computing the probability of a more extreme table, the amount of computer time required increases very rapidly with the size of the table. Tables with a large total count \(f_{\bullet\bullet }\) or a large value of r × c should not be analyzed using exactEnumeration. In such cases, try using exactNetwork.
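To get a rough feel for how quickly the work grows, the following sketch counts the tables that a total enumeration would have to visit for given marginals. It is an illustration under stated assumptions: count_tables is a hypothetical helper, it assumes the row and column totals have the same sum, and the count is only a proxy for running time, not a measurement of exactEnumeration itself.

```python
from functools import lru_cache


def count_tables(row_totals, col_totals):
    """Count the tables of non-negative counts with the given marginals."""

    def row_splits(total, caps):
        # Yield the column capacities left after placing one row of `total` counts.
        if len(caps) == 1:
            if total <= caps[0]:
                yield (caps[0] - total,)
            return
        for x in range(min(total, caps[0]) + 1):
            for rest in row_splits(total - x, caps[1:]):
                yield (caps[0] - x,) + rest

    @lru_cache(maxsize=None)
    def count(rows, caps):
        if len(rows) == 1:
            return 1  # the last row is forced by the remaining capacities
        return sum(count(rows[1:], left) for left in row_splits(rows[0], caps))

    return count(tuple(row_totals), tuple(col_totals))


# Marginals of the 2 x 2 table [[8, 12], [8, 2]].
print(count_tables([20, 10], [16, 14]))

# 3 x 3 tables whose marginals all equal n: the count grows quickly with n.
for n in (5, 10, 20):
    print(n, count_tables([n] * 3, [n] * 3))
```

The 2 × 2 table with row totals (20, 10) and column totals (16, 14) admits only 11 tables, whereas 3 × 3 tables with modest marginals already require many more, which is consistent with the advice to prefer exactNetwork for larger problems.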
Example
In this example, the exact conditional probability for the 2 × 2 contingency table
\[\begin{matrix} 8 & 12 \\ 8 & 2 \end{matrix}\]
is computed.
from __future__ import print_function
from numpy import array
from pyimsl.stat.exactEnumeration import exactEnumeration

# Observed 2 x 2 table of counts
table = array([[8, 12], [8, 2]])
# The return value is the two-sided p-value
p = exactEnumeration(table)
print("p-value = %9.4f" % p)
Output
p-value =    0.0577