partialCovariances¶
Computes partial covariances or partial correlations from the covariance or correlation matrix.
Synopsis¶
partialCovariances (nIndependent, nDependent, x)
Required Arguments¶
- int nIndependent (Input)
  Number of “independent” variables to be used in the partial covariances/correlations. The partial covariances/correlations are the covariances/correlations between the dependent variables after removing the linear effect of the independent variables.
- int nDependent (Input)
  Number of variables for which partial covariances/correlations are desired (the number of “dependent” variables).
- float x (Input)
  The n × n covariance or correlation matrix, where n = nIndependent + nDependent. The rows/columns must be ordered such that the first nIndependent rows/columns contain the independent variables and the last nDependent rows/columns contain the dependent variables. Matrix x must be square and symmetric.
Return Value¶
A matrix of size nDependent by nDependent containing the partial covariances (the default) or partial correlations (if the keyword partialCorr is used).
Optional Arguments¶
- xIndices, int[] (Input)
  An array containing values indicating the status of each variable, as in the following table:

    indices[i]   Variable is…
    −1           not used in the analysis
     0           a dependent variable
     1           an independent variable

  By default, the first nIndependent elements of indices are equal to 1 and the last nDependent elements are equal to 0.
- partialCov (Input)

  or

- partialCorr (Input)
  By default, and if partialCov is specified, partial covariances are calculated. Partial correlations are calculated if partialCorr is specified.
- test, int df, int dfOut, float pValues (Input, Output, Output)
  Argument df is an input integer giving the number of degrees of freedom associated with the input matrix x. If the number of degrees of freedom in x varies from element to element, a conservative choice for df is the minimum degrees of freedom over all elements of x.

  Argument dfOut contains the number of degrees of freedom in the test that the partial covariances/correlations are zero. This value is usually df − nIndependent, but is greater than this value if the independent variables are computationally linearly related.

  Argument pValues is an array of size nDependent by nDependent containing the p-values for testing the null hypothesis that the associated partial covariance/correlation is zero. It is assumed that the observations from which x was computed follow a multivariate normal distribution and that each element in x has df degrees of freedom.
Description¶
Function partialCovariances computes partial covariances or partial
correlations from an input covariance or correlation matrix. If the
“independent” variables (those whose linear effect is removed in computing
the partial covariances/correlations) are linearly related to one another,
partialCovariances detects the linearity and eliminates one or more of the
independent variables from the list of independent variables. The number of
variables eliminated, if any, can be determined from argument dfOut.
Given a covariance or correlation matrix \(\Sigma\) partitioned as

\[\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\]

function partialCovariances computes the partial covariances (of the
standardized variables if \(\Sigma\) is a correlation matrix) as

\[\Sigma_{22|1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\]

If partial correlations are desired, these are computed as

\[P_{22|1} = \left[\mathrm{diag}\left(\Sigma_{22|1}\right)\right]^{-1/2} \Sigma_{22|1} \left[\mathrm{diag}\left(\Sigma_{22|1}\right)\right]^{-1/2}\]
where diag denotes the matrix containing the diagonal of its argument along its diagonal with zeros off the diagonal. If \(\Sigma_{11}\) is singular, then as many variables as required are deleted from \(\Sigma_{11}\) (and \(\Sigma_{12}\)) in order to eliminate the linear dependencies. The computations then proceed as above.
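As an illustrative check, the partitioned-matrix computation described above can be reproduced directly with NumPy. This is a sketch independent of partialCovariances (the function names partial_cov and partial_corr are ours, not part of the library):

```python
import numpy as np

def partial_cov(sigma, n_independent):
    """Partial covariances of the dependent variables given the first
    n_independent variables: Sigma_22|1 = Sigma_22 - Sigma_21 Sigma_11^-1 Sigma_12.
    """
    s = np.asarray(sigma, dtype=float)
    p = n_independent
    s11, s12 = s[:p, :p], s[:p, p:]
    s21, s22 = s[p:, :p], s[p:, p:]
    # Solve Sigma_11 @ B = Sigma_12 rather than forming an explicit inverse.
    return s22 - s21 @ np.linalg.solve(s11, s12)

def partial_corr(sigma, n_independent):
    """Rescale the partial covariances to partial correlations."""
    c = partial_cov(sigma, n_independent)
    d = np.sqrt(np.diag(c))
    return c / np.outer(d, d)
```

Note that, unlike partialCovariances, this sketch assumes \(\Sigma_{11}\) is nonsingular; the library instead deletes linearly dependent independent variables and proceeds.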
The p-value for a partial covariance tests the null hypothesis \(H_0 :
\sigma_{ij|1}=0\), where \(\sigma_{ij|1}\) is the \((i,j)\) element in
matrix \(\Sigma_{22|1}\). The p-value for a partial correlation tests
the null hypothesis \(H_0 : \rho_{ij|1}=0\), where \(\rho_{ij|1}\) is
the \((i,j)\) element in matrix \(P_{22|1}\). The p-values are
returned in pValues
. If the degrees of freedom for x
, df
, is not
known, the resulting p-values may be useful for comparison, but they should
not be used as an approximation to the actual probabilities.
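For intuition, the test described above can be sketched with the usual t-transform of a correlation coefficient. This is an assumption about the form of the test, not a statement of the library's internals: with df degrees of freedom in x and nIndependent variables removed, take dfOut = df − nIndependent and refer \(t = r\sqrt{\mathrm{dfOut}-1}/\sqrt{1-r^2}\) to a t distribution with dfOut − 1 degrees of freedom, which appears consistent with the Example 2 output below. A stdlib-only sketch (the incomplete-beta routine follows the standard continued-fraction recipe):

```python
import math

def _betacf(a, b, x, max_iter=200, eps=3e-12, fpmin=1e-300):
    # Continued fraction for the regularized incomplete beta function
    # (modified Lentz's method, as in standard numerical references).
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c, d = 1.0, 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) >= fpmin else fpmin)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = d if abs(d) >= fpmin else fpmin
            c = 1.0 + aa / c
            c = c if abs(c) >= fpmin else fpmin
            d = 1.0 / d
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _betainc(a, b, x):
    # Regularized incomplete beta function I_x(a, b).
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(ln_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def partial_corr_pvalue(r, df, n_independent):
    """Two-sided p-value for H0: partial correlation = 0 (a sketch).

    Assumes t = r * sqrt(dfOut - 1) / sqrt(1 - r^2) has a t
    distribution with dfOut - 1 degrees of freedom under H0,
    where dfOut = df - n_independent.
    """
    nu = df - n_independent - 1
    if abs(r) >= 1.0:
        return 0.0
    t = abs(r) * math.sqrt(nu) / math.sqrt(1.0 - r * r)
    # P(|T| > t) = I_x(nu/2, 1/2) with x = nu / (nu + t^2).
    return _betainc(0.5 * nu, 0.5, nu / (nu + t * t))
```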
Examples¶
Example 1¶
The following example computes partial covariances, scaled from a nine-variable correlation matrix originally given by Emmett (1949). The first three rows and columns contain the independent variables and the final six rows and columns contain the dependent variables.
from pyimsl.stat.partialCovariances import partialCovariances
from pyimsl.stat.writeMatrix import writeMatrix

x = [[6.300, 3.050, 1.933, 3.365, 1.317, 2.293, 2.586, 1.242, 4.363],
     [3.050, 5.400, 2.170, 3.346, 1.473, 2.303, 2.274, 0.750, 4.077],
     [1.933, 2.170, 3.800, 1.970, 0.798, 1.062, 1.576, 0.487, 2.673],
     [3.365, 3.346, 1.970, 8.100, 2.983, 4.828, 2.255, 0.925, 3.910],
     [1.317, 1.473, 0.798, 2.983, 2.300, 2.209, 1.039, 0.258, 1.687],
     [2.293, 2.303, 1.062, 4.828, 2.209, 4.600, 1.427, 0.768, 2.754],
     [2.586, 2.274, 1.576, 2.255, 1.039, 1.427, 3.200, 0.785, 3.309],
     [1.242, 0.750, 0.487, 0.925, 0.258, 0.768, 0.785, 1.300, 1.458],
     [4.363, 4.077, 2.673, 3.910, 1.687, 2.754, 3.309, 1.458, 7.400]]

pcov = partialCovariances(3, 6, x)
writeMatrix("Partial Covariances", pcov, writeFormat="%7.3f")
Output¶
Partial Covariances
1 2 3 4 5 6
1 0.000 0.000 0.000 -0.000 0.000 0.000
2 0.000 0.000 0.000 0.000 0.000 0.000
3 0.000 0.000 0.000 0.000 0.000 0.000
4 -0.000 0.000 0.000 5.495 1.895 3.084
5 0.000 0.000 0.000 1.895 1.841 1.476
6 0.000 0.000 0.000 3.084 1.476 3.403
Example 2¶
The following example computes partial correlations from a nine-variable correlation matrix originally given by Emmett (1949). The partial correlations between the remaining variables, after adjusting for variables 1, 3, and 9, are computed. Note in the output that the row and column labels are indices, not variable numbers. The corresponding variable numbers would be 2, 4, 5, 6, 7, and 8, respectively.
from pyimsl.stat.partialCovariances import partialCovariances
from pyimsl.stat.writeMatrix import writeMatrix

x = [[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
     [0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
     [0.395, 0.479, 1.0, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
     [0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
     [0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
     [0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
     [0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
     [0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
     [0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]

indices = [1, 0, 1, 0, 0, 0, 0, 0, 1]
test = {'df': 30}
pcorr = partialCovariances(3, 6, x,
                           partialCorr=True,
                           xIndices=indices,
                           test=test)
pval = test['pValues']
dfout = test['dfOut']

print("The degrees of freedom are", dfout)
writeMatrix("Partial Correlations", pcorr, writeFormat="%7.4f")
writeMatrix("P-Values", pval, writeFormat="%7.4f")
Output¶
The degrees of freedom are 27
Partial Correlations
1 2 3 4 5 6
1 1.0000 0.2235 0.1936 0.2113 0.1253 -0.0610
2 0.2235 1.0000 0.6054 0.7198 0.0919 0.0249
3 0.1936 0.6054 1.0000 0.5977 0.1230 -0.0766
4 0.2113 0.7198 0.5977 1.0000 0.0349 0.0856
5 0.1253 0.0919 0.1230 0.0349 1.0000 0.0622
6 -0.0610 0.0249 -0.0766 0.0856 0.0622 1.0000
P-Values
1 2 3 4 5 6
1 0.0000 0.2525 0.3232 0.2801 0.5249 0.7576
2 0.2525 0.0000 0.0006 0.0000 0.6417 0.9000
3 0.3232 0.0006 0.0000 0.0007 0.5328 0.6982
4 0.2801 0.0000 0.0007 0.0000 0.8602 0.6650
5 0.5249 0.6417 0.5328 0.8602 0.0000 0.7532
6 0.7576 0.9000 0.6982 0.6650 0.7532 0.0000
Warning Errors¶
- IMSLS_NO_HYP_TESTS
  The input matrix “x” has # degrees of freedom, and the rank of the dependent variables is #. There are not enough degrees of freedom for hypothesis testing. The elements of “pValues” are set to NaN (not a number).
Fatal Errors¶
- IMSLS_INVALID_MATRIX_1
  The input matrix “x” is incorrectly specified. A computed correlation is greater than 1 for variables # and #.
- IMSLS_INVALID_PARTIAL
  A computed partial correlation for variables # and # is greater than 1. The input matrix “x” is not positive semi-definite.