randomSampleIndices
Generates a simple pseudorandom sample of indices.
Synopsis
randomSampleIndices (nsamp, npop)
Required Arguments
- int nsamp (Input) - Sample size desired.
- int npop (Input) - Number of items in the population.
Return Value
An array of length nsamp containing the indices of the sample.
Description
Function randomSampleIndices generates the indices of a pseudorandom
sample of size nsamp, drawn without replacement from a population of size
npop. If nsamp is greater than npop/2, the integers from 1 to npop are
considered sequentially, and each is selected with a probability conditional
on the number already selected and the number remaining to be considered. If,
when the i-th population index is considered, j items have been included in
the sample, then the index i is included with probability
(nsamp - j)/(npop + 1 - i).
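The sequential selection rule above can be sketched in a few lines of plain Python. This is only an illustration of the stated probability (nsamp - j)/(npop + 1 - i), not the library's implementation; the function name selection_sample and the use of Python's random module in place of the IMSL generator are assumptions made for the sketch.

```python
import random


def selection_sample(nsamp, npop, rng=None):
    """Return nsamp increasing indices from 1..npop, sampled without replacement.

    Hypothetical helper: it mirrors the rule in the text, not the library code.
    """
    if rng is None:
        rng = random.Random()
    sample = []
    for i in range(1, npop + 1):
        j = len(sample)            # items already included
        if j == nsamp:             # sample is complete, stop early
            break
        # Include index i with probability (nsamp - j) / (npop + 1 - i).
        # When every remaining index is needed the ratio is exactly 1.0,
        # so the sample always reaches size nsamp.
        if rng.random() < (nsamp - j) / (npop + 1 - i):
            sample.append(i)
    return sample


print(selection_sample(60, 100, random.Random(1)))
```

Because the indices are visited in order, the result comes back sorted, and every pass costs at most npop random draws, which is why this approach suits the case where nsamp is a large fraction of npop.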
If nsamp is not greater than npop/2, an O(nsamp) algorithm due to Ahrens
and Dieter (1985) is used. Of the methods discussed by Ahrens and Dieter,
the one called SG* is used in randomSampleIndices. It involves a preliminary
selection of q indices using a geometric distribution for the distances
between each index and the next one. If the preliminary sample size q is
less than nsamp, a new preliminary sample is chosen, and this is repeated
until a preliminary sample of size at least nsamp is obtained. This
preliminary sample is then thinned using the same kind of sampling as
described above for the case in which the sample size is greater than half
of the population size. Function randomSampleIndices does not store the
preliminary sample indices; instead, it restores the generator to the state
it had when the preliminary sample was selected and passes through once
again, making the final selection as the preliminary sample indices are
regenerated.
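The two-pass idea described above can be illustrated with the following simplified sketch. It is not the SG* algorithm as published: the inclusion probability p, the separate generator used for thinning, and the names geometric_gap, preliminary_indices, and sample_two_pass are all assumptions chosen for the illustration, and Python's random module stands in for the IMSL generator.

```python
import math
import random


def geometric_gap(rng, p):
    """Draw a gap >= 1 with P(gap = k) = (1 - p)**(k - 1) * p."""
    u = 1.0 - rng.random()                    # u in (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - p)) + 1


def preliminary_indices(rng, npop, p):
    """Yield increasing indices in 1..npop separated by geometric gaps."""
    i = geometric_gap(rng, p)
    while i <= npop:
        yield i
        i += geometric_gap(rng, p)


def sample_two_pass(nsamp, npop, seed=None):
    """Hypothetical sketch; assumes 0 <= nsamp <= npop / 2."""
    if nsamp == 0:
        return []
    skip_rng = random.Random(seed)                      # drives the gaps
    thin_rng = random.Random(skip_rng.getrandbits(64))  # drives the thinning
    p = 1.5 * nsamp / npop      # assumed value, not taken from the paper
    # Pass 1: count preliminary indices, retrying until q >= nsamp.
    while True:
        state = skip_rng.getstate()           # remember where this try began
        q = sum(1 for _ in preliminary_indices(skip_rng, npop, p))
        if q >= nsamp:
            break
    # Pass 2: restore the state, regenerate the same indices, thin on the fly.
    skip_rng.setstate(state)
    sample = []
    for k, i in enumerate(preliminary_indices(skip_rng, npop, p)):
        if len(sample) == nsamp:
            break
        # Same rule as before, applied to the q preliminary indices.
        if thin_rng.random() < (nsamp - len(sample)) / (q - k):
            sample.append(i)
    return sample


print(sample_two_pass(5, 100, seed=123457))
```

Restoring the saved state makes the second pass reproduce exactly the indices counted in the first, so no list of q indices ever needs to be kept in memory.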
Example
In this example, randomSampleIndices is used to generate the indices of
a pseudorandom sample of size 5 from a population of size 100.
from pyimsl.stat.randomSampleIndices import randomSampleIndices
from pyimsl.stat.randomSeedSet import randomSeedSet
from pyimsl.stat.writeMatrix import writeMatrix

nsamp = 5    # sample size
npop = 100   # population size

# Seed the generator so that the sample is reproducible
randomSeedSet(123457)
ir = randomSampleIndices(nsamp, npop)
writeMatrix("Random Sample", ir, noColLabels=True, writeFormat="%5i")
Output
Random Sample
2 22 53 61 79
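As a usage note, the following sketch shows one way the returned indices might be applied to select items from a Python list. It assumes the indices are 1-based, as the description's "integers from 1 to npop" suggests; the population list is hypothetical, and the index values are copied from the output above rather than produced by calling the library.

```python
# Indices copied from the example output; not generated here.
indices = [2, 22, 53, 61, 79]

# Hypothetical population of 100 labeled items: item-1 .. item-100.
population = [f"item-{k}" for k in range(1, 101)]

# Assumption: indices are 1-based, so subtract 1 for 0-based Python lists.
sample = [population[i - 1] for i in indices]
print(sample)
```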