random_sample

Generates a simple pseudorandom sample from a finite population.

Synopsis

#include <imsls.h>

float *imsls_f_random_sample(int nrow, int nvar, float population[], int nsamp, 0)

The type double function is imsls_d_random_sample.

Required Arguments

int nrow (Input)
Number of rows of data in population.

int nvar (Input)
Number of variables in the population and in the sample.

float population[] (Input)
nrow by nvar matrix containing the population to be sampled. If either of the optional arguments IMSLS_FIRST_CALL or IMSLS_ADDITIONAL_CALL is specified, then population contains a different part of the population on each invocation; otherwise, population contains the entire population.

int nsamp (Input)
The sample size desired.

Return Value

nsamp by nvar matrix containing the sample. To release this space, use imsls_free.

Synopsis with Optional Arguments

#include <imsls.h>

float *imsls_f_random_sample (int nrow, int nvar, float population[], int nsamp,
IMSLS_FIRST_CALL, int **index, int *npop,
IMSLS_FIRST_CALL_USER, int index[], int *npop,
IMSLS_ADDITIONAL_CALL, int *index, int *npop, float *samp,
IMSLS_POPULATION_COL_DIM, int population_col_dim,
IMSLS_RETURN_USER, float samp[],
0)

Optional Arguments

IMSLS_FIRST_CALL, int **index, int *npop (Output)
This is the first invocation with this data; additional calls to imsls_f_random_sample may be made to add to the population, using the optional argument IMSLS_ADDITIONAL_CALL. Argument index is the address of a pointer to an internally allocated array of length nsamp containing the indices of the sample in the population. Argument npop returns the number of items in the population. Use this option when the population is input a few items at a time: make the first call with IMSLS_FIRST_CALL and every subsequent call with IMSLS_ADDITIONAL_CALL. See Example 2.

IMSLS_FIRST_CALL_USER, int index[], int *npop (Output)
Storage for index is provided by the user. See IMSLS_FIRST_CALL.

IMSLS_ADDITIONAL_CALL, int *index, int *npop, float *samp (Input/Output)
This is an additional invocation of imsls_f_random_sample, and the sample is updated for the subpopulation in population. Argument index is a pointer to an array of length nsamp containing the indices of the sample in the population, as returned using optional argument IMSLS_FIRST_CALL. Argument npop, also obtained using optional argument IMSLS_FIRST_CALL, returns the number of items in the population. It is not necessary to know the number of items in the population in advance. npop is used to accumulate the population size and should not be changed between calls to imsls_f_random_sample. Argument samp is a pointer to the array of size nsamp by nvar containing the sample; it is the array returned by the call to imsls_f_random_sample made with optional argument IMSLS_FIRST_CALL. See Example 2.

IMSLS_POPULATION_COL_DIM, int population_col_dim (Input)
Column dimension of the matrix population.

Default: population_col_dim = nvar
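
To illustrate what a column dimension larger than nvar means, the sketch below assumes row-major storage (as in Example 2, where row i starts at population[2*i]) and copies the leading nvar columns of each row out of a wider array. The helper name pack_leading_columns is hypothetical and is not part of the library; it only shows the indexing, not how the library reads the data internally.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not an IMSL routine): copies the leading nvar
   columns of a row-major array whose rows are col_dim floats apart into
   a tightly packed nrow x nvar array. */
static void pack_leading_columns(const float *src, int nrow, int nvar,
                                 int col_dim, float *dst)
{
    assert(col_dim >= nvar);
    for (int i = 0; i < nrow; i++) {
        for (int j = 0; j < nvar; j++) {
            /* Element (i, j) lives at offset i * col_dim + j in src. */
            dst[i * nvar + j] = src[i * col_dim + j];
        }
    }
}
```

With a 3 by 4 array and nvar = 2, the third and fourth entries of every row are skipped, which is the situation in which a column dimension different from nvar would be supplied.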

IMSLS_RETURN_USER, float samp[] (Output)
User-supplied array of size nsamp by nvar containing the sample. This option should not be used if IMSLS_ADDITIONAL_CALL is used.

Description

Function imsls_f_random_sample generates a pseudorandom sample from a given population, without replacement, using an algorithm due to McLeod and Bellhouse (1983).

The first nsamp items in the population are included in the sample. Then, for each successive item from the population, a random item in the sample is replaced by that item from the population with probability equal to the sample size divided by the number of population items that have been encountered at that time.
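
The replacement rule above can be sketched in plain C. This is a minimal illustration under stated assumptions, not the library's implementation: reservoir_update and reservoir_demo are hypothetical names, the standard rand() stands in for the library's generator, and the toy 20 by 2 population is made up for the demonstration. The state passed between calls (index, npop, samp) is chosen to mirror the arguments of IMSLS_ADDITIONAL_CALL.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not an IMSL routine): folds nrow new rows into a
   running sample held in samp. index records the 1-based population
   position of each sampled row; *npop counts the rows seen so far. */
static void reservoir_update(const float *chunk, int nrow, int nvar,
                             int nsamp, int *index, int *npop, float *samp)
{
    assert(nsamp > 0 && nvar > 0);
    for (int r = 0; r < nrow; r++) {
        int slot;
        *npop += 1;
        if (*npop <= nsamp) {
            slot = *npop - 1;        /* the first nsamp items fill the sample */
        } else {
            /* One uniform draw on [0, npop): it falls below nsamp with
               probability nsamp / npop, and is then a uniform slot.
               rand() % n has a slight modulo bias; acceptable in a sketch. */
            slot = rand() % *npop;
            if (slot >= nsamp)
                continue;            /* this item is not selected */
        }
        index[slot] = *npop;
        for (int j = 0; j < nvar; j++)
            samp[slot * nvar + j] = chunk[r * nvar + j];
    }
}

/* Streams a made-up 20 x 2 population one row at a time and prints the
   resulting sample of 5 rows. Returns the final population count. */
static int reservoir_demo(int index[5], float samp[10])
{
    enum { NROW = 20, NVAR = 2, NSAMP = 5 };
    float population[NROW * NVAR];
    int npop = 0;

    for (int i = 0; i < NROW; i++) {
        population[NVAR * i] = (float)(i + 1);
        population[NVAR * i + 1] = (float)(10 * (i + 1));
    }
    srand(123457u);
    for (int i = 0; i < NROW; i++)
        reservoir_update(&population[NVAR * i], 1, NVAR, NSAMP,
                         index, &npop, samp);
    for (int k = 0; k < NSAMP; k++)
        printf("%2d: %g %g\n", index[k], samp[NVAR * k], samp[NVAR * k + 1]);
    return npop;
}
```

Calling reservoir_demo from a main function prints five sampled rows together with their positions in the population. Because the count is updated on every row, the population size does not have to be known before streaming begins.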

Examples

Example 1

In this example, imsls_f_random_sample is used to generate a sample of size 5 from a population stored in the matrix population.

 

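#include <imsls.h>

int main()
{
    int nrow = 176, nvar = 2, nsamp = 5;
    float *population;
    float *sample;

    /* Load built-in data set 2 (176 rows by 2 columns). */
    population = imsls_f_data_sets(2, 0);

    /* Fix the seed so the sample is reproducible. */
    imsls_random_seed_set(123457);

    /* Draw nsamp rows from the whole population in a single call. */
    sample = imsls_f_random_sample(nrow, nvar, population, nsamp, 0);

    imsls_f_write_matrix("The sample", nsamp, nvar, sample,
        IMSLS_NO_ROW_LABELS,
        IMSLS_NO_COL_LABELS,
        0);
}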

Output

 

      The sample
    1764        36
    1828        62
    1923         6
    1773        35
    1769       106

 

Example 2

Function imsls_f_random_sample is now used to generate a sample of size 5 from the same population as in the example above except the data are input to imsls_f_random_sample one observation at a time. This is the way imsls_f_random_sample may be used to sample from a large data file. Notice that the number of records need not be known in advance.

 

#include <stdio.h>
#include <imsls.h>

int main()
{
    int i, nrow = 176, nvar = 2, nsamp = 5;
    int *index, npop;
    float *population;
    float *sample;

    /* Load built-in data set 2 (176 rows by 2 columns). */
    population = imsls_f_data_sets(2, 0);

    imsls_random_seed_set(123457);

    /* First call: pass only the first observation. */
    sample = imsls_f_random_sample(1, nvar, population, nsamp,
        IMSLS_FIRST_CALL, &index, &npop,
        0);

    /* Additional calls: pass the remaining observations one at a time. */
    for (i = 1; i < nrow; i++) {
        imsls_f_random_sample(1, nvar, &population[nvar*i], nsamp,
            IMSLS_ADDITIONAL_CALL, index, &npop, sample,
            0);
    }

    printf("The population size is %d\n", npop);

    imsls_i_write_matrix("Indices of random sample", nsamp, 1, index, 0);

    imsls_f_write_matrix("The sample", nsamp, nvar, sample,
        IMSLS_NO_ROW_LABELS,
        IMSLS_NO_COL_LABELS,
        0);
}

Output

 

The population size is 176

Indices of random sample
      1   16
      2   80
      3  175
      4   25
      5   21

      The sample
    1764        36
    1828        62
    1923         6
    1773        35
    1769       106