The type double function is imsls_d_randomness_test.
Required Arguments
int n_observations (Input) Number of observations in x.
float x[] (Input) Array of size n_observations containing the data.
int n_run (Input) Length of longest run for which tabulation is desired. For optional arguments IMSLS_PAIRS, IMSLS_DSQUARE, and IMSLS_DCUBE, n_run stands for the number of equiprobable cells into which the statistics are to be tabulated.
Return Value
The probability of a larger chi‑squared statistic for testing the null hypothesis of a uniform distribution.
IMSLS_RUNS, float **runs_count, float **covariances, or
IMSLS_RUNS_USER, float runs_count[], float covariances[], or
IMSLS_PAIRS, int pairs_lag, float **pairs_count, or
IMSLS_PAIRS_USER, int pairs_lag, float pairs_count[], or
IMSLS_DSQUARE, float **dsquare_count, or
IMSLS_DSQUARE_USER, float dsquare_count[], or
IMSLS_DCUBE, float **dcube_count, or
IMSLS_DCUBE_USER, float dcube_count[],
IMSLS_RUNS_EXPECT, float **runs_expect,
IMSLS_RUNS_EXPECT_USER, float runs_expect[],
IMSLS_EXPECT, float *expect,
IMSLS_CHI_SQUARED, float *chi_squared,
IMSLS_DF, float *df,
0)
Optional Arguments
IMSLS_IDO, int ido, float intermediate_results[] (Input/Output) Process data in blocks.
int ido (Input) Processing option. The argument ido must be 1, 2, or 3. With this option, not all observations need to be memory resident, which makes it possible to handle large data sets. Blocks of rows of the data can be processed sequentially in separate invocations of imsls_f_randomness_test. Output argument values are returned only when ido = 3. (See Example 5.)
ido    Action
1      This is the first invocation with this data; additional calls will be made. The first set of n_observations observations is input in x.
2      This is an intermediate invocation. The next set of n_observations observations is input in x.
3      This is the final invocation of this function. No further invocations of imsls_f_randomness_test with ido greater than 1 should be made without first invoking imsls_f_randomness_test with ido = 1. The last set of n_observations observations is input in x.
Default: ido is not used. All the data is input at once.
float intermediate_results[] (Input/Output) User-supplied array containing results from invocations of the function. The length of intermediate_results is:
Test                           Length
Runs test (IMSLS_RUNS)         n_run
Pairs test (IMSLS_PAIRS)       n_run by n_run
d² test (IMSLS_DSQUARE)        n_run
Triplets test (IMSLS_DCUBE)    n_run by n_run by n_run
When data is processed in blocks, x can contain a different number of observations, n_observations, in each invocation.
IMSLS_RUNS, float **runs_count, float **covariances (Output) Indicates the runs test is to be performed. An array of length n_run containing the counts of the number of runs up of each length is returned in runs_count. An n_run by n_run matrix containing the variances and covariances of the counts is returned in covariances. IMSLS_RUNS is the default test; however, to return the counts and covariances, the IMSLS_RUNS argument must be used.
or
IMSLS_RUNS_USER, float runs_count[], float covariances[] (Output) Storage for runs_count and covariances is provided by the user. See IMSLS_RUNS.
or
IMSLS_PAIRS, int pairs_lag (Input), float **pairs_count (Output) Indicates the pairs test is to be performed. The lag to be used in computing the pairs statistic is stored in pairs_lag. Pairs (x[i], x[i + pairs_lag]) for i = 0, …, N − pairs_lag − 1 are tabulated, where N is the total sample size. An n_run by n_run matrix containing the count of the number of pairs in each cell is returned in pairs_count.
or
IMSLS_PAIRS_USER, int pairs_lag (Input), float pairs_count[] (Output) Storage for pairs_count is provided by the user. See IMSLS_PAIRS.
or
IMSLS_DSQUARE, float **dsquare_count (Output) Indicates the d² test is to be performed. dsquare_count is the address of a pointer to an internally allocated array of length n_run containing the tabulations for the d² test.
or
IMSLS_DSQUARE_USER, float dsquare_count[] (Output) Storage for dsquare_count is provided by the user. See IMSLS_DSQUARE.
or
IMSLS_DCUBE, float **dcube_count (Output) Indicates the triplets test is to be performed. dcube_count is the address of a pointer to an internally allocated array of length n_run by n_run by n_run containing the tabulations for the triplets test.
or
IMSLS_DCUBE_USER, float dcube_count[] (Output) Storage for dcube_count is provided by the user. See IMSLS_DCUBE.
IMSLS_RUNS_EXPECT, float **runs_expect (Output) The address of a pointer to an internally allocated array of length n_run containing the expected number of runs of each length. This option is valid only for the runs test.
IMSLS_RUNS_EXPECT_USER, float runs_expect[] (Output) Storage for runs_expect is provided by the user. See IMSLS_RUNS_EXPECT.
IMSLS_EXPECT, float *expect (Output) Expected number of counts for each cell. This argument is valid only if one of IMSLS_PAIRS, IMSLS_DSQUARE, or IMSLS_DCUBE is used. It is not valid for the runs test.
IMSLS_CHI_SQUARED, float *chi_squared (Output) Chi‑squared statistic for testing the null hypothesis of a uniform distribution.
IMSLS_DF, float *df (Output) Degrees of freedom for chi‑squared.
Description
Runs Up Test
Function imsls_f_randomness_test performs one of four different tests for randomness. Optional argument IMSLS_RUNS computes statistics for the runs up test. Runs tests are used to test for cyclical trend in sequences of random numbers. If the runs down test is desired, each observation should first be multiplied by -1 to change its sign, and IMSLS_RUNS called with the modified vector of observations.
IMSLS_RUNS first tallies the number of runs up (increasing sequences) of each desired length. For i = 1, ..., r − 1, where r = n_run, runs_count[i] contains the number of runs of length i. runs_count[n_run] contains the number of runs of length n_run or greater. As an example of how runs are counted, the sequence (1, 2, 3, 1) contains one run up of length 3 (the values 1, 2, 3) and one run up of length 1 (the final value 1).
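The counting rule can be sketched in plain C. This is an illustrative sketch, not the library's implementation: the name tally_runs_up is hypothetical, the counts are stored 0-based (counts[len - 1] holds runs of length len), and a tie between neighbors is assumed to end the current run.

```c
/* Tally runs up in x[0..n-1]; n_run must be at least 1.
 * counts[len - 1] receives the number of runs of length len for
 * len < n_run; counts[n_run - 1] receives runs of length n_run or more.
 * Assumptions of this sketch: 0-based storage, and a tie
 * (x[i] == x[i - 1]) ends the current run. */
void tally_runs_up(const float *x, int n, int n_run, int *counts)
{
    int i, len;

    for (i = 0; i < n_run; i++)
        counts[i] = 0;
    if (n <= 0)
        return;

    len = 1;                              /* the first value starts a run */
    for (i = 1; i < n; i++) {
        if (x[i] > x[i - 1]) {
            len++;                        /* the run continues upward */
        } else {
            counts[(len < n_run ? len : n_run) - 1]++;
            len = 1;                      /* a new run starts at x[i] */
        }
    }
    counts[(len < n_run ? len : n_run) - 1]++;   /* close the final run */
}
```

Applied to the sequence (1, 2, 3, 1) with n_run = 3, the sketch records one run of length 1 and one run of length 3, matching the worked example in the text.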
After tallying the number of runs up of each length, IMSLS_RUNS computes the expected values and the covariances of the counts according to methods given by Knuth (1981, pages 65-67). Let $R$ denote a vector of length n_run containing the number of runs of each length, so that the i‑th element of $R$, $r_i$, contains the count of the runs of length i. Let $\Sigma_R$ denote the covariance matrix of $R$ under the null hypothesis of randomness, and let $\mu_R$ denote the vector of expected values for $R$ under this null hypothesis. Then an approximate chi-squared statistic with n_run degrees of freedom is given as
$$\chi^2 = (R - \mu_R)^T \, \Sigma_R^{-1} \, (R - \mu_R)$$
In general, the larger the value of each element of $\mu_R$, the better the chi-squared approximation.
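The runs statistic is a quadratic form, (R − μ_R)ᵀ Σ_R⁻¹ (R − μ_R). The sketch below shows one way to evaluate such a form without inverting the matrix explicitly, by solving a linear system with Gaussian elimination. The function name runs_chi_squared and the row-major double storage are assumptions made for illustration; the library computes μ_R and Σ_R itself, and this is not its implementation.

```c
#include <stdlib.h>

static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* Evaluate (r - mu)^T * inverse(cov) * (r - mu) for an n-by-n matrix cov
 * stored row-major. Inputs are not modified. Returns -1.0 if memory is
 * exhausted or cov is singular. */
double runs_chi_squared(const double *r, const double *mu,
                        const double *cov, int n)
{
    double *a = malloc((size_t)n * n * sizeof *a);  /* working copy of cov */
    double *d = malloc((size_t)n * sizeof *d);      /* r - mu              */
    double *y = malloc((size_t)n * sizeof *y);      /* solves cov * y = d  */
    double chi2 = -1.0, t;
    int i, j, k, p;

    if (!a || !d || !y)
        goto done;
    for (i = 0; i < n * n; i++)
        a[i] = cov[i];
    for (i = 0; i < n; i++)
        d[i] = y[i] = r[i] - mu[i];

    /* Forward elimination with partial pivoting on [a | y]. */
    for (k = 0; k < n; k++) {
        p = k;
        for (i = k + 1; i < n; i++)
            if (abs_d(a[i * n + k]) > abs_d(a[p * n + k]))
                p = i;
        if (a[p * n + k] == 0.0)
            goto done;                    /* singular matrix */
        if (p != k) {
            for (j = 0; j < n; j++) {
                t = a[k * n + j];
                a[k * n + j] = a[p * n + j];
                a[p * n + j] = t;
            }
            t = y[k]; y[k] = y[p]; y[p] = t;
        }
        for (i = k + 1; i < n; i++) {
            double f = a[i * n + k] / a[k * n + k];
            for (j = k; j < n; j++)
                a[i * n + j] -= f * a[k * n + j];
            y[i] -= f * y[k];
        }
    }

    /* Back substitution: y becomes inverse(cov) * (r - mu). */
    for (i = n - 1; i >= 0; i--) {
        t = y[i];
        for (j = i + 1; j < n; j++)
            t -= a[i * n + j] * y[j];
        y[i] = t / a[i * n + i];
    }

    chi2 = 0.0;
    for (i = 0; i < n; i++)
        chi2 += d[i] * y[i];

done:
    free(a);
    free(d);
    free(y);
    return chi2;
}
```

With a diagonal covariance matrix the result reduces to the familiar sum of squared deviations divided by the variances; the off-diagonal terms account for the correlation between counts of runs of different lengths.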
Pairs Test
IMSLS_PAIRS computes the pairs test (or Good's serial test) on a hypothesized sequence of uniform (0, 1) pseudo-random numbers. The test proceeds as follows. Subsequent pairs (x[i], x[i + pairs_lag]) are tallied into a k × k matrix, where k = n_run. In this tally, element (j, m) of the matrix is incremented, with
$$j = \lfloor k X_i \rfloor + 1, \qquad m = \lfloor k X_{i+l} \rfloor + 1$$
where l = pairs_lag and ⌊Y⌋ denotes the greatest integer less than or equal to the real number Y. If l = 1, then i = 1, 3, 5, ..., n − 1. If l > 1, then i = 1, 2, 3, ..., n − l, where n is the total number of pseudo-random numbers input on the current invocation of IMSLS_PAIRS (i.e., n = n_observations).
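The tally described above can be sketched in plain C. The names tally_pairs and cell_index are hypothetical, indices are 0-based, the table is stored row-major in a flat int array, and clamping a deviate equal to 1.0 into the last cell is an assumption of the sketch rather than documented library behavior.

```c
/* Map a deviate in [0, 1) to one of k equiprobable cells (0-based).
 * The clamp for values outside [0, 1) is an assumption of this sketch. */
static int cell_index(float x, int k)
{
    int c = (int)(k * x);       /* truncation equals floor for x >= 0 */
    if (c < 0)
        c = 0;
    return c < k ? c : k - 1;
}

/* Tally pairs (x[i], x[i + lag]) into a k-by-k table stored row-major.
 * Following the description: with lag == 1 the pairs do not overlap
 * (i advances by 2); with lag > 1 every i from 0 to n - lag - 1 is used. */
void tally_pairs(const float *x, int n, int lag, int k, int *counts)
{
    int i;
    int step = (lag == 1) ? 2 : 1;

    for (i = 0; i < k * k; i++)
        counts[i] = 0;
    for (i = 0; i + lag < n; i += step)
        counts[cell_index(x[i], k) * k + cell_index(x[i + lag], k)]++;
}
```

For example, with x = (0.1, 0.6, 0.7, 0.2), lag 1, and k = 2, the two pairs (0.1, 0.6) and (0.7, 0.2) fall in cells (0, 1) and (1, 0) respectively.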
Given the tally matrix in pairs_count, chi-squared is computed as
$$\chi^2 = \sum_{i,j} \frac{(o_{ij} - e)^2}{e}$$
where $e = \sum o_{ij} / k^2$, and $o_{ij}$ is the observed count in cell (i, j) ($o_{ij}$ = pairs_count[i][j]).
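The statistic for equiprobable cells is the sum of (observed − e)² / e, with e equal to the total count divided by the number of cells. A minimal sketch follows; the function name chi_squared_equal_cells and the flat int array of counts are assumptions made for illustration.

```c
/* Chi-squared statistic for ncells equiprobable cells:
 * the sum over cells of (observed - e)^2 / e, with e = total / ncells.
 * Returns 0.0 when there are no observations at all. */
double chi_squared_equal_cells(const int *counts, int ncells)
{
    double total = 0.0, e, chi2 = 0.0;
    int i;

    for (i = 0; i < ncells; i++)
        total += counts[i];
    if (total == 0.0)
        return 0.0;

    e = total / ncells;                   /* expected count per cell */
    for (i = 0; i < ncells; i++) {
        double diff = counts[i] - e;
        chi2 += diff * diff / e;
    }
    return chi2;
}
```

For a 2 × 2 table with counts (1, 3, 2, 2), e = 2 and the statistic is (1 + 1 + 0 + 0) / 2 = 1. The same computation applies to any tally over equiprobable cells once the counts are flattened into one array.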
Because pair statistics for the trailing observations are not tallied on any call, the user should call IMSLS_PAIRS with n_observations as large as possible. For pairs_lag < 20 and n_observations = 2000, little power is lost.
d² Test
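IMSLS_DSQUARE computes the d² test for succeeding quadruples of hypothesized pseudo-random uniform (0, 1) deviates. The d² test is performed as follows. Let $X_1$, $X_2$, $X_3$, and $X_4$ denote four pseudo-random uniform deviates, and consider
$$D^2 = (X_3 - X_1)^2 + (X_4 - X_2)^2$$
The probability distribution of $D^2$ is given as
$$\Pr(D^2 \le d^2) = d^2\pi - \frac{8d^3}{3} + \frac{d^4}{2}$$
when $D^2 \le 1$, where π denotes the value of pi. If $D^2 > 1$, this probability is given as
$$\Pr(D^2 \le d^2) = \frac{1}{3} + (\pi - 2)d^2 + 4\sqrt{d^2 - 1} + \frac{8(d^2 - 1)^{3/2}}{3} - \frac{d^4}{2} - 4d^2 \operatorname{arcsec}(d)$$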
See Gruenberger and Mark (1951) for a derivation of this distribution.
For each succeeding set of 4 pseudo-random uniform numbers input in x, $d^2$ and the cumulative probability of $d^2$, $\Pr(D^2 \le d^2)$, are computed. The resulting probability is tallied into one of k = n_run equally spaced intervals.
Let n denote the number of sets of four random numbers input (n = the total number of observations/4). Then, under the null hypothesis that the numbers input are random uniform (0, 1) numbers, the expected value for each element in dsquare_count is e = n/k. An approximate chi-squared statistic is computed as
$$\chi^2 = \sum_{i} \frac{(o_i - e)^2}{e}$$
where $o_i$ = dsquare_count[i] is the observed count. Thus, $\chi^2$ has k − 1 degrees of freedom, and the null hypothesis of pseudo-random uniform (0, 1) deviates is rejected if $\chi^2$ is too large. As n increases, the chi-squared approximation becomes better. A useful rule of thumb is that e > 5 yields a good chi-squared approximation.
Triplets Test
IMSLS_DCUBE computes the triplets test on a sequence of hypothesized pseudo-random uniform (0, 1) deviates. The triplets test is computed as follows: Each set of three successive deviates, $X_1$, $X_2$, and $X_3$, is tallied into one of $m^3$ equal sized cubes, where m = n_run. Let $i = \lfloor mX_1 \rfloor + 1$, $j = \lfloor mX_2 \rfloor + 1$, and $k = \lfloor mX_3 \rfloor + 1$. For the triplet ($X_1$, $X_2$, $X_3$), dcube_count[i][j][k] is incremented.
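The triplet tally can be sketched in plain C. The names tally_triplets and cube_cell are hypothetical, indices are 0-based, and the cube is stored row-major in a flat array. The sketch assumes that triplets do not overlap, so that any one or two leftover trailing deviates are ignored.

```c
/* Map a deviate in [0, 1) to one of m equal-width cells (0-based).
 * The clamp for values outside [0, 1) is an assumption of this sketch. */
static int cube_cell(float x, int m)
{
    int c = (int)(m * x);       /* truncation equals floor for x >= 0 */
    if (c < 0)
        c = 0;
    return c < m ? c : m - 1;
}

/* Tally non-overlapping triplets (x[3t], x[3t+1], x[3t+2]) into an
 * m-by-m-by-m table stored row-major in counts.
 * Returns the number of triplets tallied. */
int tally_triplets(const float *x, int n, int m, int *counts)
{
    int t, ntrip = n / 3;

    for (t = 0; t < m * m * m; t++)
        counts[t] = 0;
    for (t = 0; t < ntrip; t++) {
        int i = cube_cell(x[3 * t], m);
        int j = cube_cell(x[3 * t + 1], m);
        int k = cube_cell(x[3 * t + 2], m);
        counts[(i * m + j) * m + k]++;
    }
    return ntrip;
}
```

Under this assumption, 2001 deviates yield 667 triplets, and with m = 3 they are spread over 27 cubes.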
Under the null hypothesis of pseudo-random uniform (0, 1) deviates, the $m^3$ cells are equally probable and each has expected value $e = n/m^3$, where n is the number of triplets tallied. An approximate chi-squared statistic is computed as
$$\chi^2 = \sum_{i,j,k} \frac{(o_{ijk} - e)^2}{e}$$
where $o_{ijk}$ = dcube_count[i][j][k].
The computed chi-squared has $m^3 - 1$ degrees of freedom, and the null hypothesis of pseudo-random uniform (0, 1) deviates is rejected if $\chi^2$ is too large.
Examples
Example 1
This example illustrates the use of the runs test on 10⁴ pseudo-random uniform deviates. Since the probability of a larger chi-squared statistic is 0.1872, there is no strong evidence to support rejection of the null hypothesis of randomness.
Example 2
This example illustrates the calculations of the IMSLS_PAIRS statistics when a random sample of size 10⁴ is used and the pairs_lag is 1. The results are not significant. IMSL function imsls_f_random_uniform (Chapter 12, Random Number Generation) is used in obtaining the pseudo-random deviates.
Example 3
In this example, 2000 observations generated via IMSL function imsls_f_random_uniform (Chapter 12, Random Number Generation) are input to IMSLS_DSQUARE in one call. In the example, the null hypothesis of a uniform distribution is not rejected.
Example 4
In this example, 2001 deviates generated by IMSL function imsls_f_random_uniform (Chapter 12, Random Number Generation) are input to IMSLS_DCUBE, and tabulated in 27 equally sized cubes. In the example, the null hypothesis is not rejected.
Example 5
This example is based on Example 1 to illustrate the use of the IMSLS_IDO optional argument. In this example, imsls_f_randomness_test is called 10 times, with 1000 pseudo-random uniform deviates each time. Since the probability of a larger chi-squared statistic is 0.1872, there is no strong evidence to support rejection of the null hypothesis of randomness.