randomMvarFromData
Generates pseudorandom numbers from a multivariate distribution determined from a given sample.
Synopsis
randomMvarFromData (nRandom, x, nn)
Required Arguments
- int nRandom (Input) - Number of random multivariate vectors to generate.
- float x[[]] (Input) - Array of size nsamp × ndim containing the given sample.
- int nn (Input) - Number of nearest neighbors of the randomly selected point in x that are used to form the output point in the result.
Return Value
An nRandom × ndim matrix containing the random multivariate vectors in its rows.
Description
Given a sample of size n (= nsamp) of observations of a k-variate
random variable, randomMvarFromData generates a pseudorandom sample with
approximately the same moments as the given sample. The sample obtained is
essentially the same as if sampling from a Gaussian kernel estimate of the
sample density. (See Thompson 1989.) Function randomMvarFromData uses
methods described by Taylor and Thompson (1986).
Assume that the (vector-valued) observations \(x_i\) are in the rows of
x. An observation, \(x_j\), is chosen randomly; its nearest m (= nn) neighbors,
\[x_{j_1}, x_{j_2}, \ldots, x_{j_m}\]
are determined; and the mean
\[\bar{x}_j\]
of those nearest neighbors is calculated. Next, a random sample \(u_1,u_2,\ldots,u_m\) is generated from a uniform distribution with lower bound
\[\frac{1}{m} - \sqrt{\frac{3(m-1)}{m^2}}\]
and upper bound
\[\frac{1}{m} + \sqrt{\frac{3(m-1)}{m^2}}\]
The random variate delivered is
\[\sum_{l=1}^{m} u_l \left(x_{j_l} - \bar{x}_j\right) + \bar{x}_j\]
The process is then repeated until nRandom such simulated variates are
generated and stored in the rows of the result.
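The steps above can be sketched in plain Python. This is an illustration only, not the pyimsl implementation: the function name taylor_thompson_sample and its rng parameter are hypothetical, neighbors are found by unscaled Euclidean distance, the selected point counts as one of its own m neighbors, and the uniform bounds \(1/m \pm \sqrt{3(m-1)/m^2}\) are taken from Taylor and Thompson (1986). All three choices are assumptions that the library may handle differently.

```python
import math
import random


def taylor_thompson_sample(x, n_random, nn, rng=None):
    """Hypothetical sketch of the resampling scheme described above.

    x is a list of equal-length rows (nsamp x ndim). Returns n_random rows.
    """
    if not 1 <= nn <= len(x):
        raise ValueError("nn must be between 1 and the number of rows in x")
    rng = rng or random.Random()
    m = nn
    ndim = len(x[0])

    # Assumed bounds of the uniform weights: 1/m -/+ sqrt(3(m-1)/m^2).
    half_width = math.sqrt(3.0 * (m - 1)) / m
    lo, hi = 1.0 / m - half_width, 1.0 / m + half_width

    result = []
    for _ in range(n_random):
        # Choose an observation x_j at random.
        xj = rng.choice(x)

        # Take its m nearest rows by squared Euclidean distance.
        # Assumption: x_j itself (distance 0) is one of the neighbors.
        neighbors = sorted(
            x, key=lambda row: sum((a - b) ** 2 for a, b in zip(row, xj))
        )[:m]

        # Mean of the neighbors, one value per coordinate.
        mean = [sum(row[d] for row in neighbors) / m for d in range(ndim)]

        # Random weights u_1, ..., u_m.
        u = [rng.uniform(lo, hi) for _ in range(m)]

        # Variate: mean plus the weighted sum of deviations from the mean.
        variate = [
            mean[d] + sum(u[l] * (neighbors[l][d] - mean[d]) for l in range(m))
            for d in range(ndim)
        ]
        result.append(variate)
    return result


# Usage on a small made-up sample (values are illustrative only).
data = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [5.0, 5.0]]
sample = taylor_thompson_sample(data, n_random=3, nn=2, rng=random.Random(42))
```

With these bounds each weight has mean 1/m and variance (m-1)/m^2, since a uniform distribution on an interval of width w has variance w^2/12. With nn = 1 both bounds equal 1, so the sketch simply returns randomly chosen rows of x unchanged.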
Example
In this example, randomMvarFromData is used to generate 5 pseudorandom
vectors of length 4 using the initial and final systolic pressure and the
initial and final diastolic pressure from Data Set A in Afifi and Azen
(1979) as the fixed sample from the population to be modeled. (Values of
these four variables are in the seventh, tenth, twenty-first, and
twenty-fourth columns of data set number nine in function
dataSets, Chapter 15, Utilities.)
from numpy import empty
from pyimsl.stat.dataSets import dataSets
from pyimsl.stat.randomMvarFromData import randomMvarFromData
from pyimsl.stat.randomSeedSet import randomSeedSet
from pyimsl.stat.writeMatrix import writeMatrix

n_random = 5
k = 4
nsamp = 113
nn = 5
x = empty((nsamp, k), dtype='double')
nrrow = []
nrcol = []

# Seed the generator so the output is reproducible.
randomSeedSet(123457)

# Load data set 9; nrrow[0] holds the number of observations.
rdata = dataSets(9, nObservations=nrrow, nVariables=nrcol)

# Copy the four blood-pressure columns (zero-based 6, 9, 20, 23) into x.
for i in range(0, nrrow[0]):
    x[i, 0] = rdata[i, 6]
    x[i, 1] = rdata[i, 9]
    x[i, 2] = rdata[i, 20]
    x[i, 3] = rdata[i, 23]

r = randomMvarFromData(n_random, x, nn)
writeMatrix("Random variates", r)
Output
           Random variates
          1        2        3        4
1     162.8     90.5    153.7    104.9
2     153.4     78.3    176.7     85.2
3      93.7     48.2    153.5     71.4
4     101.8     54.2    113.1     56.3
5      91.7     58.8     48.4     28.1