randomMvarGaussianCopula¶
Given a Cholesky factorization of a correlation matrix, generates pseudorandom numbers from a Gaussian Copula distribution.
Synopsis¶
randomMvarGaussianCopula (chol)
Required Arguments¶
- float
chol[[]]
(Input) - Array of size
n
×n
containing the upper‑triangular Cholesky factorization of the correlation matrix of ordern
.
Return Value¶
An array of length n
containing the pseudorandom numbers from a
multivariate Gaussian Copula distribution.
Description¶
Function randomMvarGaussianCopula
generates pseudorandom numbers from a
multivariate Gaussian Copula distribution which are uniformly distributed on
the interval (0,1) representing the probabilities associated with standard
normal N(0,1) deviates imprinted with correlation information from input
upper-triangular Cholesky matrix chol
. Cholesky matrix chol
is
defined as the “square root” of a user-defined correlation matrix, that is
chol
is an upper triangular matrix such that the transpose of chol
×
chol
is the correlation matrix. First, a length n
array of
independent random normal deviates with mean 0 and variance 1 is generated,
and then this deviate array is post-multiplied by Cholesky matrix chol
.
Finally, the Cholesky-imprinted random N(0,1) deviates are mapped to output
probabilities using the N(0,1) cumulative distribution function (CDF).
Random deviates from arbitrary marginal distributions which are imprinted
with the correlation information contained in Cholesky matrix chol
can
then be generated by inverting the output probabilities using user-specified
inverse CDF functions.
Example: Using Gaussian Copulas to Imprint and Extract Correlation Information ——————————————————————————
This example uses function randomMvarGaussianCopula
to generate a
multivariate sequence gcdevt
whose marginal distributions are
user-defined and imprinted with a user-specified input correlation matrix
corrin
and then uses function canonicalCorrelation
to extract an
output canonical correlation matrix corrout
from this multivariate
random sequence.
This example illustrates two useful copula related procedures. The first procedure generates a random multivariate sequence with arbitrary user-defined marginal deviates whose dependence is specified by a user-defined correlation matrix. The second procedure is the inverse of the first: an arbitrary multivariate deviate input sequence is first mapped to a corresponding sequence of empirically derived variates, i.e., cumulative distribution function values representing the probability that each random variable has a value less than or equal to the input deviate. The variates are then inverted, using the inverse standard normal CDF function, to N(0,1) deviates; and finally, a canonical covariance matrix is extracted from the multivariate N(0,1) sequence using the standard sum of products.
This example demonstrates that function randomMvarGaussianCopula
correctly embeds the user‑defined correlation information into an arbitrary
marginal distribution sequence by extracting the canonical correlation from
these sequences and showing that they differ from the original correlation
matrix by a small relative error, which generally decreases as the number of
multivariate sequence vectors increases.
from __future__ import print_function
from numpy import *
from pyimsl.math.linSolPosdef import linSolPosdef
from pyimsl.stat.randomOption import randomOption
from pyimsl.stat.randomSeedSet import randomSeedSet
from pyimsl.stat.randomMvarGaussianCopula import randomMvarGaussianCopula
from pyimsl.stat.chiSquaredInverseCdf import chiSquaredInverseCdf
from pyimsl.stat.fInverseCdf import fInverseCdf
from pyimsl.stat.normalInverseCdf import normalInverseCdf
from pyimsl.stat.canonicalCorrelation import canonicalCorrelation
nvar = 3
lmax = 15000
df = 5.0
arg1 = 10.0
arg2 = 15.0
corrin = [[1.0, -0.9486832, 0.8164965],
[-0.9486832, 1.0, -0.6454972],
[0.8164965, -0.6454972, 1.0]]
print("Off-diagonal elements of Input Correlation Matrix:\n")
for i in range(nvar):
for j in range(i):
print(" CorrIn(%d,%d) = %10.6f" % (i, j, corrin[i][j]))
print("\nOff-diagonal elements of Output Correlation Matrices")
print("calculated from Gaussian Copula imprinted multivariate")
print("sequence:")
#
# Compute the Cholesky factorization of corrin
#
# Use IMSL function linSolPosdef to generate
# the nvar by nvar upper triangular matrix chol from
# the Cholesky decomposition R*RT of input correlation
# matrix corrin:
#
chol = []
linSolPosdef(corrin, None, factor=chol, factorOnly=True)
kmax = lmax / 100
for kk in range(1, 4):
gcdevt = zeros((int(kmax), nvar), dtype=double)
print("\n# of vectors in multivariate sequence: %7d\n\n" % kmax)
# use Congruential RN generator, with multiplier 16807
randomOption(1)
# set RN generator seed to be 123457
randomSeedSet(123457)
for k in range(int(kmax)):
#
# generate a NVAR-length random Gaussian Copula
# variate output vector gcvart which is uniformly
# distributed on the interval [0,1] and imprinted
# with correlation information from input Cholesky
# matrix chol:
gcvart = randomMvarGaussianCopula(chol)
for j in range(3):
#
# invert Gaussian Copula probabilities to
# deviates using variable-specific
# inversions: j = 0: Chi Square; j = 1: F;
# j = 2: Normal(0,1); will end up with deviate
# sequences ready for mapping to canonical
# correlation matrix:
#
if (j == 0):
# convert probs into ChiSquare(df=10) deviates
gcdevt[k, j] = chiSquaredInverseCdf(gcvart[j], arg1)
elif (j == 1):
# convert probs into F(dfn=15,dfd=10) deviates
gcdevt[k, j] = fInverseCdf(gcvart[j], arg2, arg1)
else:
# convert probs into Normal(mean=0,variance=1) deviates:
gcdevt[k, j] = normalInverseCdf(gcvart[j])
#
# extract Canonical Correlation matrix from arbitrarily
# distributed deviate sequences gcdevt (k=1..kmax, j=1..NVAR)
# which have been imprinted with corrin (i=1..NVAR, j=1..NVAR)
# above:
corrout = canonicalCorrelation(gcdevt)
for i in range(nvar):
for j in range(i):
rs00 = corrin[i][j]
rs = corrout[i][j]
relerr = abs((rs - rs00) / rs00)
print(" CorrOut(%d,%d) = %10.6f; relerr = %10.6f" %
(i, j, corrout[i][j], relerr))
kmax *= 10
Output¶
Off-diagonal elements of Input Correlation Matrix:
CorrIn(1,0) = -0.948683
CorrIn(2,0) = 0.816496
CorrIn(2,1) = -0.645497
Off-diagonal elements of Output Correlation Matrices
calculated from Gaussian Copula imprinted multivariate
sequence:
# of vectors in multivariate sequence: 150
CorrOut(1,0) = -0.940216; relerr = 0.008926
CorrOut(2,0) = 0.794511; relerr = 0.026927
CorrOut(2,1) = -0.616082; relerr = 0.045569
# of vectors in multivariate sequence: 1500
CorrOut(1,0) = -0.947443; relerr = 0.001308
CorrOut(2,0) = 0.808307; relerr = 0.010030
CorrOut(2,1) = -0.635650; relerr = 0.015256
# of vectors in multivariate sequence: 15000
CorrOut(1,0) = -0.948267; relerr = 0.000439
CorrOut(2,0) = 0.817261; relerr = 0.000936
CorrOut(2,1) = -0.646207; relerr = 0.001099