randomSampleIndices

Generates a simple pseudorandom sample of indices.

Synopsis

randomSampleIndices (nsamp, npop)

Required Arguments

int nsamp (Input)
Sample size desired.
int npop (Input)
Number of items in the population.

Return Value

An array of length nsamp containing the indices of the sample.

Description

Function randomSampleIndices generates the indices of a pseudorandom sample, without replacement, of size nsamp from a population of size npop. If nsamp is greater than npop/2, the integers from 1 to npop are considered sequentially, and each is selected with a probability conditional on the number already selected and the number remaining to be considered. If, when the i-th population index is considered, j items have been included in the sample, then the index i is included with probability (nsamp - j)/(npop + 1 - i).
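The sequential selection rule above can be sketched in plain Python. This is only an illustration of the stated inclusion probability, not the library's implementation; the name selection_sample and the use of Python's random module are assumptions made for the example.

```python
import random


def selection_sample(nsamp, npop, rng=None):
    """Sequential selection sampling (hypothetical helper, for illustration).

    Considers indices 1..npop in order and includes index i with
    probability (nsamp - j) / (npop + 1 - i), where j is the number
    of indices already included.
    """
    if rng is None:
        rng = random.Random()
    sample = []
    for i in range(1, npop + 1):
        j = len(sample)
        # rng.random() is in [0, 1), so this comparison is true with
        # probability (nsamp - j) / (npop + 1 - i).
        if rng.random() * (npop + 1 - i) < nsamp - j:
            sample.append(i)
            if len(sample) == nsamp:
                break  # sample is complete; no need to look further
    return sample
```

Because the inclusion probability reaches 1 once the number of indices still needed equals the number still to be considered, the loop always returns exactly nsamp indices, in increasing order.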

If nsamp is not greater than npop/2, an O(nsamp) algorithm due to Ahrens and Dieter (1985) is used. Of the methods discussed by Ahrens and Dieter, the one called SG* is used in randomSampleIndices. It involves a preliminary selection of q indices, using a geometric distribution for the distance between each index and the next one. If the preliminary sample size q is less than nsamp, a new preliminary sample is chosen, and this is repeated until a preliminary sample of size at least nsamp is obtained. This preliminary sample is then thinned using the same kind of sequential sampling described above for the case in which the sample size is greater than half of the population size. Function randomSampleIndices does not store the preliminary sample indices. Instead, it restores the state of the generator used in selecting the preliminary sample and passes through the sequence once again, making the final selection as the preliminary sample indices are regenerated.
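The two-pass idea described above (geometric gaps, retry until the preliminary sample is large enough, then replay the generator and thin) might look roughly like the following. This is a simplified sketch and not the SG* algorithm itself: the function names, the inflation factor of 1.5 used to pick the preliminary inclusion probability, and the use of Python's random module are all assumptions made for illustration.

```python
import math
import random


def _geometric_indices(rng, p, npop):
    """Yield increasing 1-based indices whose gaps are geometric(p)."""
    log_q = math.log(1.0 - p)
    i = 0
    while True:
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        i += int(math.log(u) / log_q) + 1  # skipped items, then one hit
        if i > npop:
            return
        yield i


def two_pass_sample(nsamp, npop, rng=None):
    """Hypothetical two-pass sampler for 0 < nsamp <= npop / 2."""
    if not 0 < nsamp <= npop / 2:
        raise ValueError("sketch assumes 0 < nsamp <= npop / 2")
    if rng is None:
        rng = random.Random()
    # Assumed inflation factor (not the SG* choice); keeps p <= 0.75.
    p = 1.5 * nsamp / npop

    # Pass 1: count the preliminary sample without storing it.
    while True:
        state = rng.getstate()  # remember where this attempt started
        q = sum(1 for _ in _geometric_indices(rng, p, npop))
        if q >= nsamp:
            break

    # Pass 2: replay the same preliminary indices from the saved state
    # and thin them to nsamp by sequential selection.
    replay = random.Random()
    replay.setstate(state)
    sample = []
    for k, idx in enumerate(_geometric_indices(replay, p, npop)):
        # q - k candidates remain, including this one.
        if rng.random() * (q - k) < nsamp - len(sample):
            sample.append(idx)
            if len(sample) == nsamp:
                break
    return sample
```

In this sketch the thinning draws come from rng after the first pass has finished, so they do not reuse the stream that regenerates the preliminary indices. Only the generator state is kept between passes, which keeps memory use independent of q.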

Example

In this example, randomSampleIndices is used to generate the indices of a pseudorandom sample of size 5 from a population of size 100.

from pyimsl.stat.randomSampleIndices import randomSampleIndices
from pyimsl.stat.randomSeedSet import randomSeedSet
from pyimsl.stat.writeMatrix import writeMatrix

nsamp = 5
npop = 100
randomSeedSet(123457)
ir = randomSampleIndices(nsamp, npop)
writeMatrix("Random Sample", ir, noColLabels=True, writeFormat="%5i")

Output

 
          Random Sample
    2     22     53     61     79