public class Random extends Random implements Serializable, Cloneable
The non-uniform distributions are generated from a uniform distribution. By
default, this class uses the uniform distribution generated by the base class
Random. If the multiplier is set in this class then a
multiplicative congruential method is used. The form of the generator is
$$x_i \equiv c\,x_{i-1}\ {\rm mod}\ (2^{31}-1)$$
Each \(x_i\) is then scaled into the unit interval (0,1). If
the multiplier, c, is a primitive root modulo
\(2^{31}-1\) (which is a prime), then the generator will
have a maximal period of \(2^{31}-2\). There are several
other considerations, however. See Knuth (1981) for a good general
discussion. Possible values for c are 16807, 397204094, and 950706376.
The selection is made by the method setMultiplier. Evidence suggests that the
performance of 950706376 is best among these three choices (Fishman and Moore
1982).
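The recurrence above can be sketched in a few lines of plain Java. This is only an illustration of the multiplicative congruential form, not the class's own code; the class name LehmerSketch and its methods are hypothetical.

```java
/** Sketch of a multiplicative congruential generator, x_i = c * x_{i-1} mod (2^31 - 1). */
final class LehmerSketch {
    static final long MODULUS = 2147483647L; // 2^31 - 1, a prime

    private final long multiplier;
    private long state;

    LehmerSketch(long multiplier, long seed) {
        if (seed <= 0 || seed >= MODULUS) {
            throw new IllegalArgumentException("seed must lie in [1, 2^31 - 2]");
        }
        this.multiplier = multiplier;
        this.state = seed;
    }

    /** Advances the state and returns the new x_i. */
    long nextState() {
        // Both factors are below 2^31, so the product fits in a long.
        state = (multiplier * state) % MODULUS;
        return state;
    }

    /** Scales the next state into the open unit interval (0,1). */
    double nextUnit() {
        return (double) nextState() / MODULUS;
    }

    public static void main(String[] args) {
        LehmerSketch g = new LehmerSketch(16807L, 1L);
        for (int i = 0; i < 3; i++) {
            System.out.println(g.nextUnit());
        }
    }
}
```

With multiplier 16807 and seed 1, the state after 10000 steps is 1043618065, a widely used check value for this generator.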
Alternatively, one can select a 32-bit or 64-bit Mersenne Twister generator
by first instantiating MersenneTwister or MersenneTwister64.
These generators have a period of \(2^{19937}-1\) and a
623-dimensional equidistribution property. See Matsumoto et al. 1998 for
details.
The generation of uniform (0,1) numbers is done by the method nextDouble.
Nonuniform random numbers are generated using a variety of transformation procedures. All of the transformations used are exact (mathematically). The most straightforward transformation is the inverse CDF technique, but it is often less efficient than others involving acceptance/rejection and mixtures. See Kennedy and Gentle (1980) for discussion of these and other techniques.
Many of the nonuniform generators use different algorithms depending on the values of the parameters of the distributions. This is particularly true of the generators for discrete distributions. Schmeiser (1983) gives an overview of techniques for generating deviates from discrete distributions.
Extensive empirical tests of some of the uniform random number generators
available in the Random class are reported by Fishman and Moore
(1982 and 1986). Results of tests on the generator using the multiplier 16807
are reported by Learmonth and Lewis (1973). If the user wishes to perform
additional tests, the routines in Chapter 17, Tests of Goodness of
Fit, may be of use. Often in Monte Carlo applications, it is appropriate
to construct an ad hoc test that is sensitive to departures that are
important in the given application. For example, in using Monte Carlo methods
to evaluate a one-dimensional integral, autocorrelations of order one may not
be harmful, but they may be disastrous in evaluating a two-dimensional
integral. Although generally the routines in this chapter for generating
random deviates from nonuniform distributions use exact methods, and, hence,
their quality depends almost solely on the quality of the underlying uniform
generator, it is often advisable to employ an ad hoc test of goodness of fit
for the transformations that are to be applied to the deviates from the
nonuniform generator.
Three methods are associated with copulas. A copula is a multivariate cumulative distribution function (CDF) whose arguments are random variables uniformly distributed on the interval [0,1], corresponding to the probabilities (variates) associated with arbitrarily distributed marginal deviates. The copula structure allows the multivariate CDF to be partitioned into the copula, which carries the information characterizing the dependence among the marginal variables, and the set of separate marginal deviates, each of which has its own distribution structure.
Two methods, nextGaussianCopula and nextStudentsTCopula, allow the user to specify a correlation
structure (in the form of a Cholesky matrix) which can be used to imprint
correlation information on a sequence of multivariate random vectors. Each
call to one of these methods returns a random vector whose elements
(variates) are each uniformly distributed on the interval [0,1] and
correlated according to a user-specified Cholesky matrix. These variate
vector sequences may then be inverted to marginal deviate sequences whose
distributions and imprinted correlations are user-specified.
Method nextGaussianCopula generates a random Gaussian copula
sequence by inverting uniform [0,1] random numbers to N(0,1) deviate
vectors, imprinting each vector with the correlation information by
multiplying it with the Cholesky matrix, and then using the N(0,1) CDF to map
the imprinted deviates back to uniform [0,1] variates.
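The three steps just described (normal deviates, Cholesky imprint, normal CDF) can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the class GaussianCopulaSketch is hypothetical, java.util.Random.nextGaussian stands in for inverting uniform deviates, the Cholesky factor is passed as a plain lower-triangular array, and the normal CDF uses the Abramowitz and Stegun erf approximation 7.1.26 (absolute error about 1.5e-7).

```java
import java.util.Random;

/** Sketch of the Gaussian copula steps; names are illustrative only. */
final class GaussianCopulaSketch {
    /** Approximate standard normal CDF (Abramowitz-Stegun 7.1.26 for erf). */
    static double normalCdf(double x) {
        double z = Math.abs(x) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * z);
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-z * z);
        return x >= 0.0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** lower is a lower-triangular Cholesky factor of the target correlation matrix. */
    static double[] nextGaussianCopula(Random rng, double[][] lower) {
        int n = lower.length;
        double[] z = new double[n];
        for (int i = 0; i < n; i++) {
            z[i] = rng.nextGaussian(); // stand-in for inverting a uniform deviate
        }
        double[] u = new double[n];
        for (int i = 0; i < n; i++) {
            double y = 0.0;
            for (int j = 0; j <= i; j++) {
                y += lower[i][j] * z[j]; // imprint the correlation
            }
            u[i] = normalCdf(y); // map the imprinted deviate back to [0,1]
        }
        return u;
    }

    public static void main(String[] args) {
        double[][] lower = {{1.0, 0.0}, {0.8, 0.6}}; // correlation 0.8
        double[] u = nextGaussianCopula(new Random(42L), lower);
        System.out.println(u[0] + " " + u[1]);
    }
}
```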
Method nextStudentsTCopula inverts a vector of uniform [0, 1]
random numbers to a N(0,1) deviate vector, imprints the vector with
correlation information by multiplying it with the Cholesky matrix,
transforms the imprinted N(0,1) vector to an imprinted Student's t
vector (where each element is Student's t distributed with
\(\nu\) degrees of freedom) by dividing each element of the
imprinted N(0,1) vector by \(\sqrt{\frac{s}{\nu}}\), where s is a random
deviate taken from a chi-squared distribution with
\(\nu\) degrees of freedom, and finally maps each element of
the resulting imprinted Student's t vector back to a uniform [0, 1]
distributed variate using the Student's t CDF.
The third copula method, canonicalCorrelation, extracts a
correlation matrix from a sequence of multivariate deviate vectors whose
component marginals are arbitrarily distributed. This is accomplished by
first extracting the empirical CDF from each of the marginal deviates and
then using this CDF to map the deviates to uniform [0,1] variates which are
then inverted to N(0,1) deviates. Each element
\(C_{ij}\) of the correlation matrix can then be extracted
by averaging the products \(z_{it} z_{jt}\) of deviates
i and j over the t-indexed sequence. The utility of method
canonicalCorrelation is that because the correlation matrix is
derived from N(0,1) deviates, the correlation is unbiased, i.e. undistorted
by the arbitrary marginal distribution structures of the original deviate
vector sequences. This is important in such financial applications as
portfolio optimization, where correlation is used to estimate and minimize
risk.
The use of these routines is illustrated with RandomEx2.java, which first
uses method nextGaussianCopula to create a correlation imprinted
sequence of random deviate vectors and then uses method
canonicalCorrelation to extract the correlation matrix from the
imprinted sequence of vectors.
Modifier and Type | Class and Description |
---|---|
static interface | Random.BaseGenerator - Base pseudorandom number. |

Constructor and Description |
---|
Random() - Constructor for the Random number generator class. |
Random(long seed) - Constructor for the Random number generator class with supplied seed. |
Random(Random.BaseGenerator baseGenerator) - Constructor for the Random number generator class with an alternate basic number generator. |

Modifier and Type | Method and Description |
---|---|
double[][] | canonicalCorrelation(double[][] deviate) - Method canonicalCorrelation generates a canonical correlation matrix from an arbitrarily distributed multivariate deviate sequence with nvar deviate variables, nseq steps in the sequence, and a Gaussian Copula dependence structure. |
static void | canonicalCorrelationSTC(double df, double[][] STCdevt, double[][] CorrMtrx) - Deprecated. |
protected int | next(int bits) - Generates the next pseudorandom number. |
double | nextBeta(double p, double q) - Generate a pseudorandom number from a beta distribution. |
int | nextBinomial(int n, double p) - Generate a pseudorandom number from a binomial distribution. |
double | nextCauchy() - Generates a pseudorandom number from a Cauchy distribution. |
double | nextChiSquared(double df) - Generates a pseudorandom number from a Chi-squared distribution. |
double | nextContinuousUniform(double a, double b) - Generate a pseudorandom number from a continuous uniform distribution. |
int | nextDiscrete(int imin, double[] probabilities) - Generate a pseudorandom number from a general discrete distribution using an alias method. |
double | nextExponential() - Generates a pseudorandom number from a standard exponential distribution. |
double | nextExponentialMix(double theta1, double theta2, double p) - Generate a pseudorandom number from a mixture of two exponential distributions. |
double | nextExtremeValue(double mu, double beta) - Generate a pseudorandom number from an extreme value distribution. |
double | nextF(double dfn, double dfd) - Generate a pseudorandom number from the F distribution. |
double | nextGamma(double a) - Generates a pseudorandom number from a standard gamma distribution. |
double[] | nextGaussianCopula(Cholesky chol) - Generate pseudorandom numbers from a Gaussian Copula distribution. |
double[] | nextGaussianCopula(int k, Cholesky chol) - Deprecated. Use Random.nextGaussianCopula(Cholesky) instead. |
double | nextGeneralizedExtremeValue(double mu, double sigma, double xi) - Generates a pseudorandom number from a generalized extreme value distribution. |
double | nextGeneralizedGaussian(double mu, double alpha, double beta) - Generates a pseudorandom number from a generalized Gaussian distribution. |
double | nextGeneralizedPareto(double mu, double sigma, double beta) - Generates a pseudorandom number from a generalized Pareto distribution. |
int | nextGeometric(double p) - Generate a pseudorandom number from a geometric distribution. |
int | nextHypergeometric(int n, int m, int l) - Generate a pseudorandom number from a hypergeometric distribution. |
int | nextLogarithmic(double a) - Generate a pseudorandom number from a logarithmic distribution. |
double | nextLogistic(double mu, double sigma) - Generates a pseudorandom number from a logistic distribution. |
double | nextLogNormal(double mean, double stdev) - Generate a pseudorandom number from a lognormal distribution. |
double[] | nextMultivariateNormal(Cholesky matrix) - Generate pseudorandom numbers from a multivariate normal distribution. |
double[] | nextMultivariateNormal(int k, Cholesky matrix) - Deprecated. Use Random.nextMultivariateNormal(Cholesky) instead. |
int | nextNegativeBinomial(double rk, double p) - Generate a pseudorandom number from a negative binomial distribution. |
double | nextNormal() - Generate a pseudorandom number from a standard normal distribution using an inverse CDF method. |
double | nextNormalAR() - Deprecated. |
int | nextPoisson(double theta) - Generate a pseudorandom number from a Poisson distribution. |
double | nextRayleigh(double sigma) - Generate a pseudorandom number from a Rayleigh distribution. |
double | nextStudentsT(double df) - Generate a pseudorandom number from a Student's t distribution. |
double[] | nextStudentsTCopula(double df, Cholesky chol) - Generate pseudorandom numbers from a Student's t Copula distribution. |
double[] | nextStudentsTCopula(int k, double df, Cholesky chol) - Deprecated. Use Random.nextStudentsTCopula(double, Cholesky) instead. |
double | nextTriangular() - Generate a pseudorandom number from a triangular distribution on the interval (0,1). |
int | nextUniformDiscrete(int k) - Generate a pseudorandom number from a discrete uniform distribution. |
double | nextVonMises(double c) - Generate a pseudorandom number from a von Mises distribution. |
double | nextWeibull(double a) - Generate a pseudorandom number from a Weibull distribution. |
double | nextZigguratNormalAR() - Generates pseudorandom numbers using the Ziggurat method. |
void | setMultiplier(int multiplier) - Sets the multiplier for a linear congruential random number generator. |
void | setSeed(long seed) - Sets the seed. |
void | skip(int n) - Resets the seed to skip ahead in the base linear congruential generator. |
public Random()
public Random(long seed)
seed - a long which represents the random number generator seed in the range of -2,147,483,647 to +2,147,483,647
public Random(Random.BaseGenerator baseGenerator)
baseGenerator - is used to override the method next.
public final void setSeed(long seed)
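public void setMultiplier(int multiplier)
Sets the multiplier for a linear congruential random number generator. If a multiplier is set, the generator defined in the base class, java.util.Random, is replaced by the generator
$$x_i \equiv c\,x_{i-1}\ {\rm mod}\ (2^{31}-1)$$
where c is the multiplier.
multiplier - an int which represents the random number generator multiplier
public void skip(int n)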
setMultiplier
.
The method skips ahead in the deviates returned by the protected method
next. The public methods use next(int) as
their source of uniform random deviates. Some methods call it more than
once. For instance, each call to nextDouble calls it twice.
n - is the number of random deviates to skip.
protected int next(int bits)
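Generates the next pseudorandom number. If an alternate base generator was set in the constructor, its next method is used. If the
multiplier is set then the multiplicative congruential
method is used. Otherwise, super.next(bits) is used.
public double nextNormal()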
Generates a pseudorandom number from a standard normal distribution using an inverse CDF method. A uniform (0,1) random deviate is generated, then the inverse of the normal distribution function is evaluated at that point using inverseNormal. This method is
slower than the acceptance/rejection technique used in
nextNormalAR to generate standard normal deviates. Deviates
from the normal distribution with mean \(x_m\) and
standard deviation \(x_{std}\) can be obtained by scaling the output from
nextNormal. To do this, first scale the output of
nextNormal by \(x_{std}\) and then add \(x_m\) to the result.
double which represents a pseudorandom number from a standard normal distribution
public double nextNormalAR()
nextNormalAR generates pseudorandom numbers from a standard
normal (Gaussian) distribution using an acceptance/rejection technique
due to Kinderman and Ramage (1976). In this method, the normal density is
represented as a mixture of densities over which a variety of
acceptance/rejection methods due to Marsaglia (1964), Marsaglia and Bray
(1964), and Marsaglia, MacLaren, and Bray (1964) are applied. This method
is faster than the inverse CDF technique used in nextNormal
to generate standard normal deviates.
Deviates from the normal distribution with mean \(x_m\)
and standard deviation \(x_{std}\) can be obtained by
scaling the output from nextNormalAR. To do this, first scale
the output of nextNormalAR by \(x_{std}\)
and then add \(x_m\) to the result.
double which represents a pseudorandom number from a standard normal distribution
public double nextZigguratNormalAR()
The nextZigguratNormalAR method cuts the density into many
small pieces. For each random number generated, an interval is chosen at
random and a random normal is generated from the chosen interval. In
this implementation, the density is cut into 256 pieces, but symmetry is
used so that only 128 pieces are needed by the computation. Following
Doornik (2005), different uniform random deviates are used to determine
which slice to use and to determine the normal deviate from the
slice.
double containing the random normal deviate.
public double nextBeta(double p, double q)
Method nextBeta generates pseudorandom numbers from a beta
distribution with parameters p and
q, both of which must be positive. The probability density
function is
$$f\left( x \right) = \frac{\Gamma \left( p + q \right)}{\Gamma \left( p \right)\Gamma \left( q \right)}x^{p - 1} \left( 1 - x \right)^{q - 1} \quad \mbox{for } 0 \le x \le 1$$
where \(\Gamma (\cdot)\) is the gamma function.
The algorithm used depends on the values of p and q. Except for the trivial cases of p = 1 or q = 1, in which the inverse CDF method is used, all of the methods use acceptance/rejection. If p and q are both less than 1, the method of Johnk (1964) is used; if either p or q is less than 1 and the other is greater than 1, the method of Atkinson (1979) is used; if both p and q are greater than 1, algorithm BB of Cheng (1978), which requires very little setup time, is used.
The value returned is less than 1.0 and greater than \(\varepsilon\), where \(\varepsilon\) is the smallest positive number such that \(1.0 - \varepsilon\) is less than 1.0.
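As an illustration of the acceptance/rejection idea for the case where p and q are both less than 1, the following is a minimal sketch of a Johnk-style generator. It is not the library's code; the class JohnkBetaSketch is hypothetical and java.util.Random supplies the uniform deviates.

```java
import java.util.Random;

/** Sketch of Johnk-style acceptance/rejection for Beta(p, q), intended for p < 1 and q < 1. */
final class JohnkBetaSketch {
    static double nextBeta(Random rng, double p, double q) {
        while (true) {
            double x = Math.pow(rng.nextDouble(), 1.0 / p);
            double y = Math.pow(rng.nextDouble(), 1.0 / q);
            double sum = x + y;
            // Accept only points with x + y <= 1; the guard avoids dividing by zero.
            if (sum > 0.0 && sum <= 1.0) {
                return x / sum;
            }
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(1L);
        System.out.println(nextBeta(rng, 0.5, 0.8));
    }
}
```

The acceptance rate falls as p and q grow, which is one reason different algorithms are preferred when either parameter exceeds 1.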
p - a double, the first beta distribution parameter, p \(\gt\) 0
q - a double, the second beta distribution parameter, q \(\gt\) 0
double, a pseudorandom number from a beta distribution
public int nextBinomial(int n, double p)
nextBinomial generates pseudorandom numbers from a binomial
distribution with parameters n and p. n and p
must be positive, and p must be less than 1. The probability
function (with \(n\) = n and \(p\) = p) is
$$f\left( x \right) = \binom{n}{x} p^x \left( 1 - p \right)^{n - x}$$
for \(x = 0, 1, 2, \ldots, n\).
The algorithm used depends on the values of n and p. If \(np \lt 10\) or if p is less than a machine epsilon, the inverse CDF technique is used; otherwise, the BTPE algorithm of Kachitvichyanukul and Schmeiser (see Kachitvichyanukul 1982) is used. This is an acceptance/rejection method using a composition of four regions. (TPE equals Triangle, Parallelogram, Exponential, left and right.)
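The inverse CDF branch can be illustrated with a short search over the cumulative probabilities. This is a sketch only, with a hypothetical class name; it takes the uniform deviate as an argument and builds each probability from the previous one with the ratio \(f(x+1)/f(x) = \frac{n-x}{x+1}\cdot\frac{p}{1-p}\).

```java
/** Sketch of the inverse CDF technique for a binomial(n, p) distribution, 0 < p < 1. */
final class BinomialInverseCdfSketch {
    /** Returns the smallest x whose cumulative probability is at least u. */
    static int inverseCdf(double u, int n, double p) {
        double ratio = p / (1.0 - p);
        double f = Math.pow(1.0 - p, n); // f(0)
        double cdf = f;
        int x = 0;
        while (u > cdf && x < n) {
            f *= ratio * (n - x) / (x + 1); // f(x + 1) from f(x)
            cdf += f;
            x++;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(inverseCdf(0.5, 2, 0.5)); // prints 1
    }
}
```

The expected number of loop iterations grows with np, which is consistent with switching to another algorithm once np is no longer small.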
n - an int, the number of Bernoulli trials.
p - a double, the probability of success on each trial, \(0 \lt p \lt 1\).
int, the pseudorandom number from a binomial distribution.
public double nextCauchy()
Generates a pseudorandom number from a Cauchy distribution. The probability density function is
$$f\left( x \right) = \frac{1}{\pi (1 + x^2 )} $$
Use of the inverse CDF technique would yield a Cauchy deviate from a
uniform (0, 1) deviate, u, as
\(\tan \left[ \pi \left( u - .5 \right) \right]\).
Rather than evaluating a tangent directly, however,
nextCauchy generates two uniform (-1, 1) deviates,
\(x_1\) and \(x_2\). These values can
be thought of as sine and cosine values. If
$$x_1^2 + x_2^2 $$
is less than or equal to 1, then \(x_1/x_2\) is delivered as the Cauchy deviate; otherwise, \(x_1\) and \(x_2\) are rejected and two new uniform (-1, 1) deviates are generated. This method is also equivalent to taking the ratio of two independent normal deviates.
Deviates from the Cauchy distribution with median t and first quartile t - s, that is, with density
$$f\left( x \right) = \frac{s}{{\pi \left[ {s^2 + \left( {x - t} \right)^2 } \right]}} $$
can be obtained by scaling the output from nextCauchy. To do
this, first scale the output from nextCauchy by
s and then add t to the result.
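The rejection step and the scaling can be sketched together. The class CauchySketch is hypothetical and java.util.Random supplies the uniform deviates; this illustrates the technique rather than reproducing the library's code.

```java
import java.util.Random;

/** Sketch of the ratio method for Cauchy deviates with rejection outside the unit disc. */
final class CauchySketch {
    /** Standard Cauchy deviate as x1 / x2 for a point (x1, x2) accepted inside the unit disc. */
    static double nextCauchy(Random rng) {
        while (true) {
            double x1 = 2.0 * rng.nextDouble() - 1.0;
            double x2 = 2.0 * rng.nextDouble() - 1.0;
            if (x1 * x1 + x2 * x2 <= 1.0 && x2 != 0.0) {
                return x1 / x2;
            }
        }
    }

    /** Cauchy deviate with median t and first quartile t - s. */
    static double nextCauchy(Random rng, double t, double s) {
        return t + s * nextCauchy(rng);
    }

    public static void main(String[] args) {
        System.out.println(nextCauchy(new Random(7L), 3.0, 2.0));
    }
}
```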
double, a pseudorandom number from a Cauchy distribution
public double nextChiSquared(double df)
nextChiSquared generates pseudorandom numbers from a
chi-squared distribution with df degrees of freedom. If
df is an even integer less than 17, the chi-squared deviate
r is generated as
$$r = -2\ln \left( \prod_{i=1}^{n} u_i \right)$$
where \(n = {\rm df}/2\) and the
\(u_i\) are independent random deviates from a uniform
(0, 1) distribution. If df is an odd integer less than 17,
the chi-squared deviate is generated in the same way, except the square
of a normal deviate is added to the expression above. If df
is greater than 16 or is not an integer, and if it is not too large to
cause overflow in the gamma random number generator, the chi-squared
deviate is generated as a special case of a gamma deviate, using
nextGamma. If overflow would occur in
nextGamma, the chi-squared deviate is generated in the
manner described above, using the logarithm of the product of uniforms,
but scaling the quantities to prevent underflow and overflow.
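For small integer degrees of freedom, the product-of-uniforms formula can be sketched as below. The class ChiSquaredSketch is hypothetical; it assumes that n is the integer part of df/2 when df is odd, and it uses java.util.Random.nextGaussian as a stand-in for the normal deviate.

```java
import java.util.Random;

/** Sketch of the product-of-uniforms chi-squared generator for a small positive integer df. */
final class ChiSquaredSketch {
    static double nextChiSquared(Random rng, int df) {
        int n = df / 2; // integer part of df / 2
        double product = 1.0;
        for (int i = 0; i < n; i++) {
            product *= 1.0 - rng.nextDouble(); // lies in (0, 1], so the log is finite
        }
        double r = -2.0 * Math.log(product);
        if (df % 2 == 1) {
            double z = rng.nextGaussian(); // stand-in for a standard normal deviate
            r += z * z;
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(nextChiSquared(new Random(3L), 5));
    }
}
```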
df - a double which specifies the number of degrees of freedom. It must be positive.
double, a pseudorandom number from a Chi-squared distribution.
public double nextGamma(double a)
Method nextGamma generates pseudorandom numbers from a gamma
distribution with shape parameter a. The cumulative distribution
function is
$$P = \frac{1}{\Gamma \left( a \right)}\int_0^x e^{-t} t^{a - 1}\,dt$$
Various computational algorithms are used depending on the value of the
shape parameter a. For the special case of
a = 0.5, squared and halved normal deviates are used; and for the
special case of a = 1.0, exponential deviates (from method
nextExponential) are used. Otherwise, if a is less
than 1.0, an acceptance-rejection method due to Ahrens, described in
Ahrens and Dieter (1974), is used; if a is greater than 1.0, a
ten-region rejection procedure developed by Schmeiser and Lal (1980) is
used.
The Erlang distribution is a standard gamma distribution with the shape
parameter having a value equal to a positive integer; hence,
nextGamma generates pseudorandom deviates from an Erlang
distribution with no modifications required.
a - a double, the shape parameter of the gamma distribution. It must be positive.
double, a pseudorandom number from a standard gamma distribution
public double nextGeneralizedGaussian(double mu, double alpha, double beta)
Generates a pseudorandom number from a generalized Gaussian distribution using
an inverse CDF method. A uniform (0,1) random deviate is
generated, then the inverse of the generalized Gaussian distribution function is
evaluated at that point using InvCdf.generalizedGaussian.
mu - a double, the location parameter
alpha - a double, the scale parameter. It must be positive.
beta - a double, the shape parameter. It must be positive.
double, a pseudorandom number from a generalized Gaussian distribution
public double nextGeneralizedExtremeValue(double mu, double sigma, double xi)
Generates a pseudorandom number from a generalized extreme value distribution using
an inverse CDF method. A uniform (0,1) random deviate is
generated, then the inverse of the generalized extreme value distribution function is
evaluated at that point using InvCdf.generalizedExtremeValue.
mu - a double, the location parameter
sigma - a double, the scale parameter. It must be positive.
xi - a double, the shape parameter
double, a pseudorandom number from a generalized extreme value distribution
public double nextGeneralizedPareto(double mu, double sigma, double beta)
Generates a pseudorandom number from a generalized Pareto distribution using
an inverse CDF method. A uniform (0,1) random deviate is
generated, then the inverse of the generalized Pareto distribution function is
evaluated at that point using InvCdf.generalizedPareto.
mu - a double, the location parameter
sigma - a double, the scale parameter. It must be positive.
beta - a double, the shape parameter. It must be positive.
double, a pseudorandom number from a generalized Pareto distribution
public int nextGeometric(double p)
nextGeometric generates pseudorandom numbers from a
geometric distribution with parameter p, where
P = p is the probability of getting a success on any trial. A
geometric deviate can be interpreted as the number of trials until the
first success (including the trial in which the first success is
obtained). The probability function is
$$f(x) = P(1 - P)^{x - 1} $$
for \(x = 1, 2, \ldots\) and \(0 \lt P \lt 1\).
The geometric distribution as defined above has mean \(1/P\).
The i-th geometric deviate is generated as the smallest integer not less than \(\log(U_i)/\log(1 - P)\), where the \(U_i\) are independent uniform (0, 1) random numbers (see Knuth, 1981).
The geometric distribution is often defined on \(0, 1, 2, \ldots\), with mean \((1 - P)/P\). Such deviates can be obtained by subtracting 1 from each returned value.
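The ceiling formula translates directly into code. The sketch below uses a hypothetical class name and takes the uniform deviate as an argument so that the mapping is easy to check by hand.

```java
/** Sketch of the geometric deviate as the smallest integer not less than log(u) / log(1 - p). */
final class GeometricSketch {
    /** u must lie in (0, 1); p is the success probability, 0 < p <= 1. */
    static int fromUniform(double u, double p) {
        if (p >= 1.0) {
            return 1; // success is certain on the first trial
        }
        int x = (int) Math.ceil(Math.log(u) / Math.log(1.0 - p));
        return Math.max(1, x); // guards the edge case where rounding yields 0
    }

    public static void main(String[] args) {
        System.out.println(fromUniform(0.3, 0.5)); // prints 2
    }
}
```

For p = 0.5, any u in (0.25, 0.5) maps to 2, which carries probability 0.25 = P(1 - P), as the probability function requires.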
p - a double, the probability of success on each trial, \(0 \lt p \le 1\)
int, a pseudorandom number from a geometric distribution
public int nextHypergeometric(int n, int m, int l)
Method nextHypergeometric generates pseudorandom numbers
from a hypergeometric distribution with parameters n,
m, and l. The hypergeometric random variable x can
be thought of as the number of items of a given type in a random sample
of size n that is drawn without replacement from a population of size
l containing m items of this type. The probability function
is
$$f\left( x \right) = \frac{\binom{m}{x}\binom{l - m}{n - x}}{\binom{l}{n}}$$
for \(x = {\rm max}(0, n - l + m), \ldots, {\rm min}(n, m)\).
If the hypergeometric probability function with parameters
n, m, and l evaluated at \(n - l + m\)
(or at 0 if this is negative) is greater than the machine epsilon, and
less than 1.0 minus the machine epsilon, then
nextHypergeometric uses the inverse CDF technique. The
method recursively computes the hypergeometric
probabilities, starting at \(x = {\rm max}(0, n - l + m)\)
and using the ratio \(f(x + 1)/f(x)\) (see Fishman 1978, page 457).
If the hypergeometric probability function is too small or
too close to 1.0, then nextHypergeometric generates integer
deviates uniformly in the interval \([1, l - i]\), for
\(i = 0, 1, \ldots\); and at the i-th step, if
the generated deviate is less than or equal to the number of special
items remaining in the lot, the occurrence of one special item is tallied
and the number of remaining special items is decreased by one. This
process continues until the sample size or the number of special items in
the lot is reached, whichever comes first. This method can be much slower
than the inverse CDF technique. The timing depends on n. If
n is more than half of l
(which in practical examples is rarely the case), the user may wish to
modify the problem, replacing n by
\(l - n\), and to consider the deviates to be the number of special
items not included in the sample.
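The second, draw-by-draw procedure can be sketched as a simple loop. The class HypergeometricUrnSketch is hypothetical and java.util.Random supplies the integer deviates; this illustrates the description above rather than the library's code.

```java
import java.util.Random;

/** Sketch of the draw-by-draw hypergeometric generator described above. */
final class HypergeometricUrnSketch {
    /** n = sample size, m = special items in the lot, l = lot size. */
    static int nextHypergeometric(Random rng, int n, int m, int l) {
        int specialLeft = m;
        int tally = 0;
        for (int i = 0; i < n && specialLeft > 0; i++) {
            int draw = 1 + rng.nextInt(l - i); // uniform on [1, l - i]
            if (draw <= specialLeft) {
                tally++;        // a special item was drawn
                specialLeft--;  // one fewer special item remains in the lot
            }
        }
        return tally;
    }

    public static void main(String[] args) {
        System.out.println(nextHypergeometric(new Random(5L), 5, 4, 20));
    }
}
```

The loop runs at most n times, which matches the remark that the timing depends on n.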
n - an int which specifies the number of items in the sample, n \(\gt\) 0
m - an int which specifies the number of special items in the population, or lot, m \(\gt\) 0
l - an int which specifies the number of items in the lot, l \(\gt\) max(n,m)
int which specifies the number of special items in a sample of size n drawn without replacement from a population of size l that contains m such special items.
public int nextLogarithmic(double a)
Method nextLogarithmic generates pseudorandom numbers from a
logarithmic distribution with parameter a. The probability
function is
$$f\left( x \right) = - \frac{{a^x }}{{x\ln \left( {1 - a} \right)}} $$
for \(x = 1, 2, 3, \ldots\), and \(0 \lt a \lt 1\).
The methods used are described by Kemp (1981) and depend on the value of a. If a is less than 0.95, Kemp's algorithm LS, which is a "chop-down" variant of an inverse CDF technique, is used. Otherwise, Kemp's algorithm LK, which gives special treatment to the highly probable values of 1 and 2, is used.
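A "chop-down" inverse CDF search subtracts each probability from the uniform deviate in turn instead of accumulating a running sum. The sketch below illustrates that idea for the logarithmic distribution, using \(f(x+1)/f(x) = a\,x/(x+1)\); it is written from the description here and should not be read as a faithful transcription of Kemp's published algorithm. The class name is hypothetical.

```java
/** Sketch of a chop-down inverse CDF search for the logarithmic distribution, 0 < a < 1. */
final class LogarithmicChopDownSketch {
    /** u must lie in (0, 1). */
    static int fromUniform(double u, double a) {
        double p = -a / Math.log(1.0 - a); // f(1)
        int x = 1;
        while (u > p && p > 0.0) {
            u -= p;                 // chop the current probability off u
            x++;
            p *= a * (x - 1) / x;   // f(x) from f(x - 1)
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(fromUniform(0.8, 0.5)); // prints 2
    }
}
```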
a - a double which specifies the parameter of the logarithmic distribution, \(0 \lt a \lt 1.0\).
int, a pseudorandom number from a logarithmic distribution.
public int nextNegativeBinomial(double rk, double p)
Method nextNegativeBinomial generates pseudorandom numbers
from a negative binomial distribution with parameters
\(\rm rk\) and \(\rm p\).
\(\rm rk\) and \(\rm p\) must be
positive and p must be less than 1. The probability function
(with \(r = \rm rk\) and \(p = \rm p\)) is
$$f\left( x \right) = \binom{r + x - 1}{x}\left( 1 - p \right)^r p^x$$
for \(x = 0, 1, 2, \ldots\).
If r is an integer, the distribution is often called the Pascal distribution and can be thought of as modeling the length of a sequence of Bernoulli trials until r successes are obtained, where p is the probability of getting a success on any trial. In this form, the random variable takes values r, r + 1, \(r + 2, \ldots\) and can be obtained from the negative binomial random variable defined above by adding r to the negative binomial variable. This latter form is also equivalent to the sum of r geometric random variables defined as taking values \(1, 2, 3, \ldots\).
If \(rp/(1 - p)\) is less than 100 and \((1 - p)^r\)
is greater than the machine epsilon, nextNegativeBinomial
uses the inverse CDF technique; otherwise, for each negative binomial
deviate, nextNegativeBinomial generates a gamma \((r, p/(1 - p))\) deviate
y and then generates a Poisson deviate with parameter y.
rk - a double which specifies the negative binomial parameter, rk \(\gt\) 0
p - a double which specifies the probability of success on each trial. It must be greater than machine precision and less than one.
int which specifies the pseudorandom number from a negative binomial distribution. If rk is an integer, the deviate can be thought of as the number of failures in a sequence of Bernoulli trials before rk successes occur.
public int nextPoisson(double theta)
Method nextPoisson generates pseudorandom numbers from a
Poisson distribution with parameter theta.
theta, which is the mean of the Poisson random variable,
must be positive. The probability function (with \(\theta = \rm theta\)) is
$$f(x) = e^{-\theta}\,\theta^{x}/x!$$
for \(x = 0, 1, 2, \ldots\)
If theta is less than 15, nextPoisson uses an
inverse CDF method; otherwise the PTPE method of Schmeiser
and Kachitvichyanukul (1981) (see also Schmeiser 1983) is used.
The PTPE method uses a composition of four regions, a
triangle, a parallelogram, and two negative exponentials. In each region
except the triangle, acceptance/rejection is used. The execution time of
the method is essentially insensitive to the mean of the Poisson.
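For small means, the inverse CDF method amounts to walking up the cumulative probabilities until they reach the uniform deviate. A minimal sketch, with a hypothetical class name and the uniform deviate passed in as an argument:

```java
/** Sketch of the inverse CDF method for a Poisson distribution with a small mean theta. */
final class PoissonInverseCdfSketch {
    /** Returns the smallest x whose cumulative probability is at least u. */
    static int inverseCdf(double u, double theta) {
        double f = Math.exp(-theta); // f(0)
        double cdf = f;
        int x = 0;
        // The f > 0 test stops the search if the tail probabilities underflow.
        while (u > cdf && f > 0.0) {
            x++;
            f *= theta / x; // f(x) from f(x - 1)
            cdf += f;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(inverseCdf(0.8, 1.0)); // prints 2
    }
}
```

The number of iterations grows roughly in proportion to theta, whereas a composition method such as PTPE has a cost that is nearly independent of the mean.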
theta - a double which specifies the mean of the Poisson distribution, theta \(\gt\) 0
int, a pseudorandom number from a Poisson distribution
public int nextUniformDiscrete(int k)
nextUniformDiscrete generates pseudorandom numbers from a
discrete uniform distribution with parameter k. The integers
\(i=1,\;\ldots,\;k\)
occur with equal probability. A random integer is generated by
multiplying
k by a uniform (0,1) random number, adding 1.0, and truncating the
result to an integer. This, of course, is equivalent to sampling with
replacement from a finite population of size k.
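The multiply, add, and truncate steps are short enough to show directly. The class name below is hypothetical, and the uniform deviate is passed in as an argument.

```java
/** Sketch of the discrete uniform transformation: truncate k * u + 1.0 to an integer. */
final class UniformDiscreteSketch {
    /** u must lie in [0, 1); the result is one of 1, ..., k. */
    static int fromUniform(double u, int k) {
        return (int) (k * u + 1.0);
    }

    public static void main(String[] args) {
        System.out.println(fromUniform(0.5, 6)); // prints 4
    }
}
```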
k - Parameter of the discrete uniform distribution. The integers \(1,\;\ldots,\;k\) occur with equal probability. Parameter k must be positive.
int, a pseudorandom number from a discrete uniform distribution
public double nextContinuousUniform(double a, double b)
The probability density function of the continuous uniform distribution is $$f(x|a,b)=\left\{\begin{array}{lll}\frac{1}{b-a}, & \mbox{for} & a\le x\le b \\ 0, & \mbox{for} & x\lt a \; \mbox{or} \; x\gt b \end{array}\right. $$ where (\( -\infty \lt a \lt b \lt \infty \)).
a - a double, the lower parameter
b - a double, the upper parameter
double, a pseudorandom number from a continuous uniform distribution
public double nextExponential()
nextExponential uses an antithetic inverse CDF technique;
that is, a uniform random deviate \(U\) is generated and the inverse
of the exponential cumulative distribution function is evaluated at
\(1.0 - U\) to yield the exponential deviate.
Deviates from the exponential distribution with mean
\(\theta\) can be generated by using
nextExponential and then multiplying the result by
\(\theta\).
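Since the inverse of the standard exponential CDF is \(F^{-1}(v) = -\ln(1 - v)\), evaluating it at \(1.0 - U\) gives \(-\ln U\). A minimal sketch of that mapping and of the scaling by \(\theta\), with a hypothetical class name and the uniform deviate passed in as an argument:

```java
/** Sketch of the antithetic inverse CDF mapping for exponential deviates. */
final class ExponentialSketch {
    /** Standard exponential deviate from u in (0, 1]: the inverse CDF evaluated at 1 - u. */
    static double fromUniform(double u) {
        return -Math.log(u);
    }

    /** Exponential deviate with mean theta. */
    static double fromUniform(double u, double theta) {
        return theta * fromUniform(u);
    }

    public static void main(String[] args) {
        System.out.println(fromUniform(0.5, 3.0));
    }
}
```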
double which specifies a pseudorandom number from a standard exponential distribution
public double nextExponentialMix(double theta1, double theta2, double p)
Generates a pseudorandom number from a mixture of two exponential distributions. The probability density function is
$$f\left( x \right) = \frac{p}{\theta_1}e^{-x/\theta_1} + \frac{1 - p}{\theta_2}e^{-x/\theta_2} \quad \mbox{for } x \gt 0$$
where \(p = \rm p\), \(\theta_1 = \rm theta1\), and \(\theta_2 = \rm theta2\).
In the case of a convex mixture, that is, the case \(0 \lt p \lt 1\),
the mixing parameter p is interpretable as a
probability; and nextExponentialMix with probability
p generates an exponential deviate with mean
\(\theta_1\), and with probability
\(1 - p\) generates an exponential with mean
\(\theta_2\). When p is greater than 1, but less
than \(\theta_1/(\theta_1 - \theta_2)\), then either an
exponential deviate with mean \(\theta_1\)
or the sum of two exponentials with means \(\theta_1\)
and \(\theta_2\) is generated. The probabilities are
\(q = p - (p - 1)\theta_1/\theta_2\) and
\(1 - q\), respectively, for the single exponential and the sum of the
two exponentials.
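Both cases can be sketched in a few lines. This is an illustration with a hypothetical class name, using java.util.Random for the uniform deviates. For the case p greater than 1, the single exponential is taken to have mean \(\theta_1\): with \(q = p - (p - 1)\theta_1/\theta_2\), that choice reproduces the coefficients \(p/\theta_1\) and \((1 - p)/\theta_2\) of the mixture density when the terms are matched.

```java
import java.util.Random;

/** Sketch of a generator for the two-exponential mixture with mixing parameter p. */
final class ExponentialMixSketch {
    /** Exponential deviate with the given mean, as -mean * ln(U) with U in (0, 1]. */
    static double exponential(Random rng, double mean) {
        return -mean * Math.log(1.0 - rng.nextDouble());
    }

    static double nextExponentialMix(Random rng, double theta1, double theta2, double p) {
        if (p <= 1.0) {
            // Convex mixture: mean theta1 with probability p, otherwise mean theta2.
            return rng.nextDouble() < p ? exponential(rng, theta1) : exponential(rng, theta2);
        }
        // Case 1 < p < theta1 / (theta1 - theta2): single exponential or a sum of two.
        double q = p - (p - 1.0) * theta1 / theta2;
        double x = exponential(rng, theta1);
        return rng.nextDouble() < q ? x : x + exponential(rng, theta2);
    }

    public static void main(String[] args) {
        System.out.println(nextExponentialMix(new Random(11L), 2.0, 1.0, 1.5));
    }
}
```

In both branches the mean of the output is \(p\theta_1 + (1 - p)\theta_2\), which gives a quick empirical check.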
theta1
- a double
which specifies the mean of the
exponential distribution that has the larger mean.theta2
- a double
which specifies the mean of the
exponential distribution that has the smaller mean. theta2
must be positive and less than or equal to theta1
.p
- a double
which specifies the mixing parameter. It
must satisfy \(0 \le p \le {\rm
{theta1/(theta1-theta2)}}\).double
, a pseudorandom number from a mixture of
the two exponential distributions.public double nextLogistic(double mu, double sigma)
Generates a pseudorandom number from a logistic distribution using
an inverse CDF method. A uniform (0,1) random deviate is
generated, then the inverse of the logistic function is
evaluated at that point using InvCdf.logistic
.
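For the usual location-scale logistic distribution, whose CDF is \(F(x) = 1/(1 + e^{-(x - \mu)/\sigma})\), the inverse CDF has the closed form \(\mu + \sigma \ln(u/(1 - u))\). The sketch below assumes that parameterization and writes the inverse out directly instead of calling InvCdf.logistic; the class name LogisticSketch is hypothetical.

```java
import java.util.Random;

final class LogisticSketch {
    static double nextLogistic(Random rng, double mu, double sigma) {
        if (sigma <= 0.0) {
            throw new IllegalArgumentException("sigma must be positive");
        }
        double u;
        do {
            u = rng.nextDouble();      // uniform on [0, 1)
        } while (u == 0.0);            // keep u strictly inside (0, 1)
        return mu + sigma * Math.log(u / (1.0 - u));
    }

    public static void main(String[] args) {
        Random rng = new Random(5L);
        for (int i = 0; i < 5; i++) {
            System.out.println(nextLogistic(rng, 1.0, 2.0));
        }
    }
}
```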
mu - a double, the location parameter
sigma - a double, the scale parameter. It must be positive.
double, a pseudorandom number from a logistic distribution
public double nextLogNormal(double mean, double stdev)
Method nextLogNormal
generates pseudorandom numbers from a
lognormal distribution with parameters mean
and
stdev
. The scale parameter in the underlying normal
distribution, stdev
, must be positive. The method is to
generate normal deviates with mean mean
and standard
deviation stdev
and then to exponentiate the normal
deviates.
With \(\mu = mean\) and \(\sigma = stdev\), the probability density function for the lognormal distribution is
$$f\left( x \right) = \frac{1}{{\sigma x\sqrt {2\pi } }}\exp \left[ { - \frac{1}{{2\sigma ^2 }}\left( {\ln x - \mu } \right)^2 } \right]\,\,for\,x > 0 $$
The mean and variance of the lognormal distribution are \(\exp(\mu + \sigma^2/2)\) and \(\exp(2\mu + 2\sigma^2) - \exp(2\mu + \sigma^2)\), respectively.
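A minimal sketch of the exponentiation step follows. It is an illustration only: LogNormalSketch is a hypothetical class, and java.util.Random.nextGaussian stands in for the normal generator used by this class. The main method compares the sample mean with \(\exp(\mu + \sigma^2/2)\).

```java
import java.util.Random;

final class LogNormalSketch {
    // Generate a normal deviate with the given mean and standard deviation,
    // then exponentiate it.
    static double nextLogNormal(Random rng, double mean, double stdev) {
        if (stdev <= 0.0) {
            throw new IllegalArgumentException("stdev must be positive");
        }
        return Math.exp(mean + stdev * rng.nextGaussian());
    }

    public static void main(String[] args) {
        Random rng = new Random(31L);
        double mu = 0.0, sigma = 0.5;
        int n = 200_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += nextLogNormal(rng, mu, sigma);
        }
        System.out.println("sample mean:      " + sum / n);
        System.out.println("theoretical mean: " + Math.exp(mu + sigma * sigma / 2.0));
    }
}
```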
mean - a double which specifies the mean of the underlying normal distribution
stdev - a double which specifies the standard deviation of the underlying normal distribution. It must be positive.
double, a pseudorandom number from a lognormal distribution
public double nextTriangular()
nextTriangular uses an inverse CDF technique.
double, a pseudorandom number from a triangular distribution on the interval (0,1)
public double nextStudentsT(double df)
nextStudentsT
generates pseudo-random numbers from a
Student's t distribution with df
degrees of freedom,
using a method suggested by Kinderman, Monahan, and Ramage (1977). The
method ("TMX" in the reference) involves a representation of the t
density as the sum of a triangular density over (-2, 2) and the
difference of this and the t density. The mixing probabilities
depend on the degrees of freedom of the t
distribution. If the triangular density is chosen, the variate is
generated as the sum of two uniforms; otherwise, an acceptance/rejection
method is used to generate a variate from the difference density.
For degrees of freedom less than 100, nextStudentsT requires approximately twice the execution time of nextNormalAR,
which generates pseudorandom normal deviates. The execution time of
nextStudentsT
increases very slowly as the degrees of
freedom increase. Since for very large degrees of freedom the normal
distribution and the t
distribution are very similar, the user may find that the difference in
the normal and the t does not warrant the additional generation
time required to use nextStudentsT
instead of
nextNormalAR
.
df - a double which specifies the number of degrees of freedom. It must be positive.
double, a pseudorandom number from a Student's t distribution
public double nextVonMises(double c)
Method nextVonMises
generates pseudorandom numbers from a
von Mises distribution with parameter c, which must be positive.
With c = C, the probability density function is
$$f\left( x \right) = \frac{1}{{2\pi I_0 \left( c \right)}}\exp \left[ {c\,\cos \left( x \right)} \right]\, for \, - \pi \lt x \lt \pi $$
where \(I_0(c)\) is the modified Bessel function of the first kind of order 0. The probability density equals 0 outside the interval \((-\pi, \pi)\).
The algorithm is an acceptance/rejection method using a wrapped Cauchy distribution as the majorizing distribution. It is due to Best and Fisher (1979).
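The acceptance/rejection loop can be sketched as below, following the commonly published form of the Best and Fisher wrapped Cauchy envelope. This is an assumption-laden illustration rather than the library's code: the class VonMisesSketch is hypothetical, java.util.Random supplies the uniforms, and no special handling of very small c is attempted.

```java
import java.util.Random;

final class VonMisesSketch {
    static double nextVonMises(Random rng, double c) {
        if (c <= 0.0) {
            throw new IllegalArgumentException("c must be positive");
        }
        // Constants of the wrapped Cauchy envelope.
        double tau = 1.0 + Math.sqrt(1.0 + 4.0 * c * c);
        double rho = (tau - Math.sqrt(2.0 * tau)) / (2.0 * c);
        double r = (1.0 + rho * rho) / (2.0 * rho);
        while (true) {
            double z = Math.cos(Math.PI * rng.nextDouble());
            double f = (1.0 + r * z) / (r + z);
            double t = c * (r - f);
            double u2 = rng.nextDouble();
            // Quick acceptance test first, then the exact logarithmic test.
            if (t * (2.0 - t) - u2 > 0.0 || Math.log(t / u2) + 1.0 - t >= 0.0) {
                double cosTheta = Math.max(-1.0, Math.min(1.0, f)); // guard rounding
                double theta = Math.acos(cosTheta);
                return rng.nextDouble() < 0.5 ? -theta : theta;     // random sign
            }
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(77L);
        for (int i = 0; i < 5; i++) {
            System.out.println(nextVonMises(rng, 1.0));
        }
    }
}
```

One way to check such a sketch is to compare the sample mean of \(\cos(x)\) with \(I_1(c)/I_0(c)\), which is about 0.4464 for c = 1.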
c - a double which specifies the parameter of the von Mises distribution, \(c \gt 7.4 \times 10^{-9}\).
double, a pseudorandom number from a von Mises distribution
public double nextWeibull(double a)
Method nextWeibull
generates pseudorandom numbers from a
Weibull distribution with shape parameter a. With \(A = a\), the probability
density function is
$$f\left( x \right) = Ax^{A - 1} e^{ - x^A } \,for\,x \ge 0 $$
nextWeibull
uses an antithetic inverse CDF technique to
generate a Weibull variate; that is, a uniform random deviate
U is generated and the inverse of the Weibull cumulative
distribution function is evaluated at \(1.0 - U\)
to yield the Weibull deviate.
Deviates from the two-parameter Weibull distribution, with shape
parameter a and scale parameter b, can be generated by
using nextWeibull
and then multiplying the result by
b.
The Rayleigh distribution with probability density function
$$ r\left( x \right) = \frac{1}{{\alpha ^2 }}x\, e^{\left( { - x^2 /2\alpha ^2 } \right)} \,\,for\,x \ge 0$$
is the same as a Weibull distribution with shape parameter a equal to 2 and scale parameter b equal to
$$\sqrt {2}\,\alpha $$
Hence, nextWeibull and simple multiplication can be used to generate Rayleigh deviates.
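The inverse CDF step, the scaling by b, and the Rayleigh special case can be sketched together. This is a hedged illustration with a hypothetical class WeibullSketch and java.util.Random as the uniform source. Matching the Rayleigh density to the two-parameter Weibull density with shape 2 gives \(b^2 = 2\alpha^2\), so the sketch uses \(b = \sqrt{2}\,\alpha\).

```java
import java.util.Random;

final class WeibullSketch {
    // Standard Weibull with shape a: inverse CDF evaluated at 1 - U.
    static double nextWeibull(Random rng, double a) {
        if (a <= 0.0) {
            throw new IllegalArgumentException("a must be positive");
        }
        double u = rng.nextDouble();
        return Math.pow(-Math.log(1.0 - u), 1.0 / a);
    }

    // Two-parameter Weibull: multiply the standard deviate by the scale b.
    static double nextWeibull(Random rng, double a, double b) {
        return b * nextWeibull(rng, a);
    }

    // Rayleigh with parameter alpha as a Weibull with shape 2 and scale sqrt(2)*alpha.
    static double nextRayleigh(Random rng, double alpha) {
        return nextWeibull(rng, 2.0, Math.sqrt(2.0) * alpha);
    }

    public static void main(String[] args) {
        Random rng = new Random(13L);
        int n = 200_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += nextRayleigh(rng, 2.0);
        }
        // The Rayleigh mean is alpha * sqrt(pi / 2).
        System.out.println("sample mean:      " + sum / n);
        System.out.println("theoretical mean: " + 2.0 * Math.sqrt(Math.PI / 2.0));
    }
}
```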
a - a double which specifies the shape parameter of the Weibull distribution, a \(\gt\) 0
double, a pseudorandom number from a Weibull distribution
public double[] nextMultivariateNormal(Cholesky matrix)
nextMultivariateNormal
generates pseudorandom numbers from a
multivariate normal distribution with mean vector consisting of all
zeroes and variance-covariance matrix whose Cholesky factor (or "square
root") is matrix
; that is, matrix
is a lower
triangular matrix such that matrix
times the transpose of
matrix
is the variance-covariance matrix. First, independent
random normal deviates with mean 0 and variance 1 are generated, and then
the matrix containing these deviates is pre-multiplied by
matrix
.
Deviates from a multivariate normal distribution with means other than
zero can be generated by using nextMultivariateNormal
and
then by adding the means to the deviates.
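The pre-multiplication by the Cholesky factor, followed by adding a mean vector, can be sketched as follows. The sketch is illustrative only: it takes the lower triangular factor as a plain double[][] rather than a Cholesky object, uses java.util.Random.nextGaussian for the N(0,1) deviates, and the class name MultivariateNormalSketch is hypothetical.

```java
import java.util.Random;

final class MultivariateNormalSketch {
    // lower is the lower triangular factor L with L * L^T equal to the covariance matrix.
    static double[] next(Random rng, double[][] lower, double[] mean) {
        int k = lower.length;
        double[] z = new double[k];
        for (int i = 0; i < k; i++) {
            z[i] = rng.nextGaussian();          // independent N(0,1) deviates
        }
        double[] x = new double[k];
        for (int i = 0; i < k; i++) {
            double s = mean[i];                 // add the mean afterwards
            for (int j = 0; j <= i; j++) {
                s += lower[i][j] * z[j];        // row i of L times z
            }
            x[i] = s;
        }
        return x;
    }

    public static void main(String[] args) {
        // Factor of the covariance matrix {{4, 2}, {2, 3}}.
        double[][] lower = {{2.0, 0.0}, {1.0, Math.sqrt(2.0)}};
        double[] mean = {1.0, -1.0};
        Random rng = new Random(1L);
        double[] x = next(rng, lower, mean);
        System.out.println(x[0] + ", " + x[1]);
    }
}
```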
matrix - is the Cholesky factorization of the variance-covariance matrix of order k.
double array which contains the pseudorandom numbers from a multivariate normal distribution
public double[] nextMultivariateNormal(int k, Cholesky matrix)
Use Random.nextMultivariateNormal(Cholesky) instead.
public double[] nextGaussianCopula(Cholesky chol)
nextGaussianCopula
generates pseudorandom numbers from a
multivariate Gaussian Copula distribution which are uniformly distributed
on the interval (0,1) representing the probabilities associated with
N(0,1) deviates imprinted with correlation information from input
Cholesky object chol
. Cholesky matrix R
is
defined as the "square root" of a user-defined correlation matrix, that
is R
is a lower triangular matrix such that R
times the transpose of R
is the correlation matrix.
First, a length k vector of independent random normal deviates
with mean 0 and variance 1 is generated, and then this deviate vector is
pre-multiplied by Cholesky matrix R
. Finally, the
Cholesky-imprinted random N(0,1) deviates are mapped to output
probabilities using the N(0,1) cumulative distribution function
(CDF).
Random deviates from arbitrary marginal distributions which are imprinted
with the correlation information contained in Cholesky matrix
R
can then be generated by inverting the output
probabilities using user-specified inverse CDF functions.
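The three steps (independent N(0,1) deviates, pre-multiplication by R, mapping through the N(0,1) CDF) can be sketched as below. Several pieces are assumptions for illustration: the factor is passed as a plain double[][], the class GaussianCopulaSketch is hypothetical, and the normal CDF uses the Abramowitz and Stegun approximation 7.1.26 to erf (absolute error around 1.5e-7) because the Java standard library has no erf.

```java
import java.util.Random;

final class GaussianCopulaSketch {
    // N(0,1) CDF via a polynomial approximation to erf (approximate, not exact).
    static double normalCdf(double x) {
        double y = Math.abs(x) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * y);
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-y * y);
        return x >= 0.0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    // lower is the lower triangular factor R of the correlation matrix.
    static double[] next(Random rng, double[][] lower) {
        int k = lower.length;
        double[] z = new double[k];
        for (int i = 0; i < k; i++) {
            z[i] = rng.nextGaussian();
        }
        double[] u = new double[k];
        for (int i = 0; i < k; i++) {
            double s = 0.0;
            for (int j = 0; j <= i; j++) {
                s += lower[i][j] * z[j];        // correlated N(0,1) deviate
            }
            u[i] = normalCdf(s);                // map to a probability
        }
        return u;
    }

    public static void main(String[] args) {
        // Factor of the correlation matrix {{1, 0.8}, {0.8, 1}}.
        double[][] lower = {{1.0, 0.0}, {0.8, 0.6}};
        Random rng = new Random(4L);
        double[] u = next(rng, lower);
        // Exponential marginals with mean 1 via the inverse CDF -ln(1 - u).
        System.out.println(-Math.log(1.0 - u[0]) + ", " + -Math.log(1.0 - u[1]));
    }
}
```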
chol - is the Cholesky object containing the Cholesky factorization of the correlation matrix of order k.
double array which contains the pseudorandom numbers from a multivariate Gaussian Copula distribution.
public double[] nextGaussianCopula(int k, Cholesky chol)
Use Random.nextGaussianCopula(Cholesky) instead.
public double[][] canonicalCorrelation(double[][] deviate)
Method canonicalCorrelation
generates a canonical
correlation matrix from an arbitrarily distributed multivariate deviate
sequence with nvar
deviate variables, nseq
steps in the sequence, and a Gaussian Copula dependence structure.
Method canonicalCorrelation
first maps each of the
j=1..nvar
input deviate sequences
deviate[k=1..nseq][j]
into a corresponding sequence of
variates, say variate[k][j]
(where variates are values of
the empirical cumulative probability function,
\(CDF(x)\), defined as the probability that random
deviate variable \(X \; \le \; x\), and where
nseq = deviate.length
and
nvar = deviate[0].length
). The variate matrix
variate[k][j]
is then mapped into Normal(0,1) distributed
deviates
\(z_{kj}\) using the method
Cdf.inverseNormal(variate[k][j])
and then the standard
covariance estimator
$$C_{ij} \;\; = \;\; \frac{1}{n_{seq}}\;\sum_{k =
1}^{n_{seq}} {z_{ki} \; z_{kj}} $$
is used to calculate the canonical correlation matrix
correlation = canonicalCorrelation(deviate)
, where
\(C_{ij}\) = correlation[i][j]
and
\(n_{seq}\) = nseq
.
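The mapping from deviates to empirical CDF values, then to N(0,1) scores, then to \(C_{ij}\) can be sketched as follows. This is a rough illustration under several assumptions that are not taken from the library: the empirical CDF is rank/(nseq + 1) so that no value maps to probability 1, the inverse normal CDF is found by bisection on an approximate CDF instead of Cdf.inverseNormal, and the class CanonicalCorrelationSketch is hypothetical.

```java
import java.util.Arrays;

final class CanonicalCorrelationSketch {
    // Approximate N(0,1) CDF (Abramowitz and Stegun 7.1.26 for erf).
    static double normalCdf(double x) {
        double y = Math.abs(x) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * y);
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-y * y);
        return x >= 0.0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    // Inverse of normalCdf by bisection on [-10, 10].
    static double inverseNormal(double p) {
        double lo = -10.0, hi = 10.0;
        for (int it = 0; it < 100; it++) {
            double mid = 0.5 * (lo + hi);
            if (normalCdf(mid) < p) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    // Number of entries of the sorted array that are <= x.
    static int countAtMost(double[] sorted, double x) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= x) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    static double[][] canonicalCorrelation(double[][] deviate) {
        int nseq = deviate.length, nvar = deviate[0].length;
        double[][] z = new double[nseq][nvar];
        for (int j = 0; j < nvar; j++) {
            double[] sorted = new double[nseq];
            for (int k = 0; k < nseq; k++) sorted[k] = deviate[k][j];
            Arrays.sort(sorted);
            for (int k = 0; k < nseq; k++) {
                double variate = countAtMost(sorted, deviate[k][j]) / (nseq + 1.0);
                z[k][j] = inverseNormal(variate);   // N(0,1) score
            }
        }
        double[][] c = new double[nvar][nvar];
        for (int i = 0; i < nvar; i++) {
            for (int j = 0; j < nvar; j++) {
                double s = 0.0;
                for (int k = 0; k < nseq; k++) s += z[k][i] * z[k][j];
                c[i][j] = s / nseq;                 // covariance estimator C_ij
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] deviate = {{0.1, 1.0}, {0.7, 9.0}, {0.4, 2.5}, {0.9, 30.0}, {0.2, 0.5}};
        System.out.println(Arrays.deepToString(canonicalCorrelation(deviate)));
    }
}
```

Because only ranks enter the calculation, any strictly increasing transformation of a column leaves the result unchanged.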
If a multivariate distribution has Gaussian marginal distributions, then the standard "empirical" correlation matrix given above is "unbiased", i.e. an accurate measure of dependence among the variables. But when the marginal distributions depart significantly from Gaussian, i.e. are skewed or flattened, then the empirical correlation may become biased. One way to remove such bias from dependence measures is to map the non-Gaussian-distributed marginal deviates to Gaussian N(0,1) deviates (by mapping the non-Gaussian marginal deviates to empirically derived marginal CDF variate values, then inverting the variates to N(0,1) deviates as described above), and calculating the standard empirical correlation matrix from these N(0,1) deviates as in the equation above. The resulting "(Gaussian) canonical correlation" matrix thereby avoids the bias that would occur if the empirical correlation matrix were extracted from the non-Gaussian marginal distributions directly.
The canonical correlation matrix may be of value in such applications as Markowitz portfolio optimization, where an unbiased measure of dependence is required to evaluate portfolio risk, defined in terms of the portfolio variance which is in turn defined in terms of the correlation among the component portfolio instruments.
The utility of the canonical correlation derives from the observation that a "copula" multivariate distribution with uniformly-distributed deviates (corresponding to the CDF probabilities associated with the marginal deviates) may be mapped to arbitrarily distributed marginals, so that an unbiased dependence estimator derived from one set of marginals (N(0,1) distributed marginals) can be used to represent the dependence associated with arbitrarily-distributed marginals. The "Gaussian Copula" (whose variate arguments are derived from N(0,1) marginal deviates) is a particularly useful structure for representing multivariate dependence.
This is demonstrated in Example 2 where method
Random.nextGaussianCopula(CholeskyMtrx)
(where
CholeskyMtrx
is a Cholesky object derived from a
user-specified covariance matrix) is used to imprint correlation
information on otherwise arbitrarily distributed and independent random
sequences. Method Random.canonicalCorrelation
is then used
to extract an unbiased correlation matrix from these imprinted deviate
sequences.
deviate - is the double nseq by nvar array of input deviate values.
public double[] nextStudentsTCopula(double df, Cholesky chol)
nextStudentsTCopula
generates pseudorandom numbers from a
multivariate Student's t Copula distribution which are uniformly
distributed on the interval (0,1) representing the probabilities
associated with Student's t deviates with df
degrees
of freedom imprinted with correlation information from the input Cholesky
object chol
. Cholesky matrix R
is defined as
the "square root" of a user-defined correlation matrix, i.e.
R
is a lower triangular matrix such that R
times the transpose of R
is the correlation matrix. First, a
length k vector of independent random normal deviates with mean 0
and variance 1 is generated, and then this deviate vector is
pre-multiplied by Cholesky matrix R
. Each of the k
elements of the resulting vector of Cholesky-imprinted random deviates is
then divided by \(\sqrt{\frac{s}{\nu}}\), where
\(\nu\) = df
and s is a random
deviate taken from a chi-squared distribution with df
degrees of freedom. Each element of the Cholesky-imprinted N(0,1) vector
is a linear combination of normally distributed random numbers and is
therefore itself normal, and the division of each element by
\(\sqrt{\frac{s}{\nu}}
\) therefore ensures that each element of the resulting vector
is Student's t distributed. Finally each element of the
Cholesky-imprinted Student's t vector is mapped to an output
probability using the Student's t cumulative distribution function
(CDF) with df
degrees of freedom.
Random deviates from arbitrary marginal distributions which are imprinted
with the correlation information contained in Cholesky matrix
R
can then be generated by inverting the output
probabilities using user-specified inverse CDF functions.
df - a double which specifies the degrees of freedom parameter.
chol - the Cholesky object containing the Cholesky factorization of the correlation matrix of order k.
double array which contains the pseudorandom numbers from a multivariate Student's t Copula distribution with df degrees of freedom.
public double[] nextStudentsTCopula(int k, double df, Cholesky chol)
Use Random.nextStudentsTCopula(double, Cholesky) instead.
public static void canonicalCorrelationSTC(double df, double[][] STCdevt, double[][] CorrMtrx)
canonicalCorrelationSTC
generates a canonical correlation
matrix from an arbitrarily distributed multivariate deviate sequence with
a Student's t Copula (STC) dependence structure.
canonicalCorrelationSTC
first uses method
Cdf.empiricalCdf(nseq, GCdevt, vart)
to map each of the
j=1..nvar
input deviate sequences
STCdevt[k=1..nseq][j]
into a corresponding sequence of
variates vart[k][j]
(where variates are values of the
empirical cumulative probability function, \(CDF(x)\),
defined as the probability that random deviate variable
\(X \; \le \; x\)). The variate matrix
vart[k][j]
is then mapped into deviates
devt[k][j]
with a Student's t (ST) distribution with
df
degrees of freedom using the method
Cdf.inverseStudentsT(vart[k][j], df)
and then the standard
covariance estimator
$$C_{ij} \;\; = \;\; \frac{\nu \; - \; 2}{\nu \; n_{seq}}\; \sum_{k = 1}^{n_{seq}} {z_{ki} \; z_{kj}} $$
is used to calculate the canonical correlation matrix CorrMtrx[i][j], where \(C_{ij}\) = CorrMtrx[i][j], \(z_{ki}\) = devt[k][i], \(\nu\) = degrees of freedom = df, and \(n_{seq}\) = nseq.
If a multivariate distribution has ST (with df
degrees of
freedom) marginal distributions, then the standard "empirical"
correlation matrix given above is "unbiased", i.e. an accurate measure of
dependence among the variables. But when the marginal distributions
depart significantly from ST, e.g. are skewed or flattened, then the
empirical correlation may become biased. One way to remove such bias from
dependence measures is to map the non-ST-distributed marginal deviates to
ST deviates (by mapping the non-ST marginal deviates to empirically
derived marginal CDF variate values and then inverting the variates to ST
\(\nu\) = df
deviates as described above)
and then calculating the standard empirical correlation matrix from these
ST deviates as in the equation above. The resulting "(ST) canonical
correlation" matrix thereby avoids the bias that would occur if the
empirical correlation matrix were extracted from the non-ST marginal
distributions directly.
The canonical correlation matrix may be of value in such applications as Markowitz portfolio optimization, where an unbiased measure of dependence is required to evaluate portfolio risk, defined in terms of the portfolio variance which is in turn defined in terms of the correlation among the component portfolio instruments.
The utility of the canonical correlation derives from the observation that a "copula" multivariate distribution with uniformly-distributed deviates (corresponding to the CDF probabilities associated with the marginal deviates) may be mapped to arbitrarily distributed marginals, so that an unbiased dependence estimator derived from one set of marginals (e.g. ST distributed marginals) can be used to represent the dependence associated with arbitrarily-distributed marginals. The "ST Copula" ("STC", whose variate arguments are derived from ST marginal deviates) is a particularly useful structure for representing multivariate dependence.
This is demonstrated in the example referenced below, where method Random.nextStudentsTCopula(df, CholeskyMtrx) (where CholeskyMtrx is a Cholesky matrix derived from a user-specified covariance matrix) is used to imprint correlation information on otherwise arbitrarily distributed and independent random sequences. Method Random.canonicalCorrelationSTC is then used to extract an unbiased correlation matrix from these imprinted deviate sequences.
df - double degrees of freedom
STCdevt - is the double 2-index (nseq by nvar) array of input deviate values
CorrMtrx - is the double 2-index (nvar by nvar) output canonical correlation array
public double nextExtremeValue(double mu, double beta)
Random numbers are generated by drawing uniform variates \(u_i\), equating them to the continuous distribution function, and then solving for \(x_i\) by first computing \(\frac{x_i - \mu}{\beta}=\log(-\log(1-u_i))\).
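Solving the relation above for \(x_i\) gives \(x_i = \mu + \beta \log(-\log(1 - u_i))\), which the following sketch applies directly. It is illustrative only: ExtremeValueSketch is a hypothetical class and java.util.Random supplies the uniforms.

```java
import java.util.Random;

final class ExtremeValueSketch {
    static double nextExtremeValue(Random rng, double mu, double beta) {
        double u;
        do {
            u = rng.nextDouble();      // uniform on [0, 1)
        } while (u == 0.0);            // u = 0 would give log(0)
        return mu + beta * Math.log(-Math.log(1.0 - u));
    }

    public static void main(String[] args) {
        Random rng = new Random(55L);
        int n = 200_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += nextExtremeValue(rng, 1.0, 2.0);
        }
        // For this form the mean is mu minus beta times the Euler-Mascheroni constant.
        System.out.println("sample mean:      " + sum / n);
        System.out.println("theoretical mean: " + (1.0 - 2.0 * 0.5772156649));
    }
}
```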
mu - a double scalar value representing the location parameter.
beta - a double scalar value representing the scale parameter.
double pseudorandom number from an extreme value distribution
public double nextF(double dfn, double dfd)
dfn - a double, the numerator degrees of freedom. It must be positive.
dfd - a double, the denominator degrees of freedom. It must be positive.
double, a pseudorandom number from an F distribution
public double nextRayleigh(double sigma)
Method nextRayleigh generates pseudorandom numbers from a Rayleigh distribution with scale parameter \(\sigma > 0\).
sigma - a double which specifies the scale parameter of the Rayleigh distribution
double, a pseudorandom number from a Rayleigh distribution
public int nextDiscrete(int imin, double[] probabilities)
Method nextDiscrete generates a pseudorandom number from a discrete distribution with probability function given in the vector probabilities; that is,
$$\Pr(X = i) = p_j$$
for \(i=i_{min},i_{min}+1,\ldots,i_{min}+n_m-1\), where \(j=i-i_{min}+1\), \(p_j=\) probabilities[j-1], \(i_{min}=\) imin, and \(n_m=\) nmass = probabilities.length is the number of mass points.
The algorithm is the alias method, due to Walker (1974), with modifications suggested by Kronmal and Peterson (1979). On the first call with a set of probabilities, the method performs an initial setup after which the number generation phase is very fast. To increase efficiency, the code skips the setup phase on subsequent calls with the same inputs.
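The setup-then-sample structure of an alias table can be sketched as below. This sketch uses Vose's formulation of the alias method, which is a stand-in chosen for brevity and not necessarily the Walker variant with the Kronmal and Peterson modifications used by this class. The class AliasSketch is hypothetical and java.util.Random supplies the uniforms.

```java
import java.util.Random;

final class AliasSketch {
    private final double[] prob;   // acceptance probability for each column
    private final int[] alias;     // alternative outcome for each column
    private final int imin;

    // Setup phase: done once per set of probabilities.
    AliasSketch(int imin, double[] probabilities) {
        int n = probabilities.length;
        this.imin = imin;
        prob = new double[n];
        alias = new int[n];
        double[] scaled = new double[n];
        int[] small = new int[n];
        int[] large = new int[n];
        int ns = 0, nl = 0;
        for (int i = 0; i < n; i++) {
            scaled[i] = probabilities[i] * n;
            if (scaled[i] < 1.0) small[ns++] = i; else large[nl++] = i;
        }
        while (ns > 0 && nl > 0) {
            int s = small[--ns];
            int l = large[--nl];
            prob[s] = scaled[s];
            alias[s] = l;                               // column s is topped up from l
            scaled[l] = (scaled[l] + scaled[s]) - 1.0;
            if (scaled[l] < 1.0) small[ns++] = l; else large[nl++] = l;
        }
        while (nl > 0) { int l = large[--nl]; prob[l] = 1.0; alias[l] = l; }
        while (ns > 0) { int s = small[--ns]; prob[s] = 1.0; alias[s] = s; }
    }

    // Generation phase: one uniform column choice and one comparison.
    int next(Random rng) {
        int col = rng.nextInt(prob.length);
        return imin + (rng.nextDouble() < prob[col] ? col : alias[col]);
    }

    public static void main(String[] args) {
        AliasSketch table = new AliasSketch(5, new double[] {0.1, 0.2, 0.3, 0.4});
        Random rng = new Random(9L);
        int[] counts = new int[4];
        for (int i = 0; i < 100_000; i++) counts[table.next(rng) - 5]++;
        for (int c : counts) System.out.println(c / 100_000.0);
    }
}
```

Building the table once and reusing it mirrors the behavior described above, where the setup cost is paid on the first call and later calls only sample.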
imin - an int which specifies the smallest value the random deviate can assume. This is the value corresponding to the probability in probabilities[0].
probabilities - a double array containing the probabilities associated with the individual mass points. The elements of probabilities must be nonnegative and must sum to 1.0. The length of probabilities must be greater than 1.
int which contains the random discrete deviate.
Copyright © 2020 Rogue Wave Software. All rights reserved.