randomHypergeometric

Generates pseudorandom numbers from a hypergeometric distribution.

Synopsis

randomHypergeometric (nRandom, n, m, l)

Required Arguments

int nRandom (Input)
Number of random numbers to generate.
int n (Input)
Number of items in the sample. Parameter n must be positive.
int m (Input)
Number of special items in the population, or lot. Parameter m must be positive.
int l (Input)
Number of items in the lot. Parameter l must be greater than both n and m.

Return Value

An integer array of length nRandom containing the random hypergeometric deviates.

Description

Function randomHypergeometric generates pseudorandom numbers from a hypergeometric distribution with parameters N, M, and L, which correspond to the arguments n, m, and l. The hypergeometric random variable X can be thought of as the number of items of a given type in a random sample of size N that is drawn without replacement from a population of size L containing M items of this type. The probability function is

\[f(x) = \frac{\binom{M}{x} \binom{L-M}{N-x}}{\binom{L}{N}}\]

for \(x=\max(0,N-L+M),\ \max(0,N-L+M)+1,\ \ldots,\ \min(N,M)\).
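
As an illustration of the probability function, the following minimal sketch evaluates it in plain Python with math.comb. The helper name hypergeom_pmf is hypothetical and is not part of PyIMSL; the parameter values are those of the Example section.

```python
from math import comb


def hypergeom_pmf(x, n, m, l):
    """P(X = x) for a sample of n items from a lot of l items, m of them special.

    Hypothetical helper for illustration only; not a PyIMSL function.
    """
    lo, hi = max(0, n - l + m), min(n, m)
    if x < lo or x > hi:
        return 0.0  # outside the support of the distribution
    return comb(m, x) * comb(l - m, n - x) / comb(l, n)


# Parameters from the Example section: n = 4, m = 12, l = 20.
support = range(max(0, 4 - 20 + 12), min(4, 12) + 1)
probs = [hypergeom_pmf(x, 4, 12, 20) for x in support]
```

The probabilities over the support sum to 1 up to rounding, which is a quick check on any chosen n, m, and l.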

If the hypergeometric probability function with parameters N, M, and L evaluated at \(N-L+M\) (or at 0 if this is negative) is greater than the machine epsilon (see machine, Chapter 15, Utilities), and less than 1.0 minus the machine epsilon, then randomHypergeometric uses the inverse CDF technique. The function recursively computes the hypergeometric probabilities, starting at \(x= \max(0,N-L+M)\) and using the ratio

\[\frac{f(X=x+1)}{f(X=x)}\]

(see Fishman 1978, p. 475).
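
Expanding the binomial coefficients in the probability function gives the ratio \((M-x)(N-x)/\left((x+1)(L-M-N+x+1)\right)\). The sketch below shows how an inverse CDF generator could use that recursion. It is an assumption-laden illustration in plain Python, not the library's implementation: the name hypergeom_inverse_cdf is hypothetical, and the machine-epsilon check described above is omitted.

```python
import random
from math import comb


def hypergeom_inverse_cdf(n, m, l, rng=random):
    """Draw one hypergeometric deviate by inverting the CDF (illustrative sketch)."""
    x = max(0, n - l + m)          # smallest value X can take
    x_max = min(n, m)              # largest value X can take
    p = comb(m, x) * comb(l - m, n - x) / comb(l, n)  # f at the starting point
    cdf = p
    u = rng.random()               # uniform deviate on [0, 1)
    while u > cdf and x < x_max:
        # f(x+1) = f(x) * (M - x)(N - x) / ((x + 1)(L - M - N + x + 1))
        p *= (m - x) * (n - x) / ((x + 1) * (l - m - n + x + 1))
        x += 1
        cdf += p
    return x


# Five deviates with the parameters of the Example section.
rng = random.Random(123457)
sample = [hypergeom_inverse_cdf(4, 12, 20, rng) for _ in range(5)]
```

Python's generator differs from the one used by randomHypergeometric, so the values will not match the Output section even with the same seed.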

If the hypergeometric probability at the starting point \(x=\max(0,N-L+M)\) is too small (at most the machine epsilon) or too close to 1.0 (at least 1.0 minus the machine epsilon), randomHypergeometric instead generates integer deviates uniformly in the interval \(\left[1,L-i\right]\) for \(i=0,1,\ldots\). At the i-th step, if the generated deviate is less than or equal to the number of special items remaining in the lot, the occurrence of one special item is tallied and the number of remaining special items is decreased by one. This process continues until the sample size or the number of special items in the lot is reached, whichever comes first. This method can be much slower than the inverse CDF technique. The timing depends on N. If N is more than half of L (which in practical examples is rarely the case), the user may wish to modify the problem, replacing N by \(L-N\), and to consider the generated deviates to be the number of special items not included in the sample.
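
The item-by-item method can be sketched in plain Python as follows. Both function names are hypothetical and the code only approximates the description above; it is not the library's implementation. The second function applies the suggested substitution of \(L-N\) for N and converts the result back by subtracting it from M.

```python
import random


def hypergeom_sequential(n, m, l, rng=random):
    """Draw one hypergeometric deviate by simulating the sample item by item."""
    special_left = m   # special items still in the lot
    count = 0          # special items tallied in the sample so far
    for i in range(n):
        if special_left == 0:
            break      # every special item is already in the sample
        # Integer deviate uniform on [1, l - i], one per item still in the lot.
        if rng.randint(1, l - i) <= special_left:
            count += 1
            special_left -= 1
    return count


def hypergeom_sequential_complement(n, m, l, rng=random):
    """Same distribution, obtained by simulating the l - n items left out."""
    not_sampled = hypergeom_sequential(l - n, m, l, rng)  # special items left out
    return m - not_sampled


rng = random.Random(123457)
direct = [hypergeom_sequential(4, 12, 20, rng) for _ in range(5)]
flipped = [hypergeom_sequential_complement(15, 12, 20, rng) for _ in range(5)]
```

The loop runs at most n times per deviate, which is why the complement form is attractive when n exceeds half of l.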

Example

In this example, randomHypergeometric generates five pseudorandom hypergeometric deviates from a hypergeometric distribution to simulate taking random samples of size 4 from a lot containing 20 items, of which 12 are defective. The resulting hypergeometric deviates represent the numbers of defectives in each of the five samples of size 4.

from numpy import *
from pyimsl.stat.randomHypergeometric import randomHypergeometric
from pyimsl.stat.randomSeedSet import randomSeedSet
from pyimsl.stat.writeMatrix import writeMatrix

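n_random = 5   # number of deviates to generate
n = 4          # sample size
m = 12         # defective (special) items in the lot
l = 20         # lot size
randomSeedSet(123457)  # fix the seed so the output is reproducible
ir = randomHypergeometric(n_random, n, m, l)
writeMatrix("Hypergeometric random deviates:", ir,
            noColLabels=True)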

Output

 
                Hypergeometric random deviates:
          4            2            3            3            3

Fatal Errors

IMSLS_LOT_SIZE_TOO_SMALL The lot size must be greater than the sample size and the number of defectives in the lot. Lot size = #. Sample size = #. Number of defectives in the lot = #.