partialCovariances

Computes partial covariances or partial correlations from the covariance or correlation matrix.

Synopsis

partialCovariances (nIndependent, nDependent, x)

Required Arguments

int nIndependent (Input)
Number of “independent” variables to be used in the partial covariances/correlations. The partial covariances/correlations are the covariances/correlations between the dependent variables after removing the linear effect of the independent variables.
int nDependent (Input)
Number of variables for which partial covariances/correlations are desired (the number of “dependent” variables).
float x (Input)
The n × n covariance or correlation matrix, where n = nIndependent + nDependent. The rows/columns must be ordered such that the first nIndependent rows/columns contain the independent variables, and the last nDependent rows/columns contain the dependent variables. Matrix x must always be square and symmetric.

Return Value

Matrix of size nDependent by nDependent containing the partial covariances (the default) or partial correlations (use keyword partialCorr).

Optional Arguments

xIndices, int[] (Input)

An array containing values indicating the status of the variable as in the following table:

xIndices[i]   Variable is…
−1            not used in analysis
0             dependent variable
1             independent variable

By default, the first nIndependent elements of xIndices are equal to 1, and the last nDependent elements are equal to 0.
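
To illustrate how the status codes in the table map onto the required row/column ordering, the following NumPy sketch drops unused variables and moves the independent variables to the front. The helper name reorder_by_status and the small 4 × 4 matrix are hypothetical and only for illustration; this is not how partialCovariances is implemented internally.

```python
import numpy as np

def reorder_by_status(x, status):
    """Hypothetical helper: apply the -1 / 0 / 1 coding from the table.

    Returns the matrix with independent variables first, dependent
    variables last, and unused variables dropped.
    """
    x = np.asarray(x, dtype=float)
    status = np.asarray(status)
    independent = np.flatnonzero(status == 1)   # code 1
    dependent = np.flatnonzero(status == 0)     # code 0
    order = np.concatenate([independent, dependent])
    # np.ix_ selects the same rows and columns, keeping the matrix symmetric
    return x[np.ix_(order, order)], len(independent), len(dependent)

# Illustrative symmetric matrix (made-up values)
x = np.array([[4.0, 1.0, 2.0, 0.5],
              [1.0, 3.0, 0.3, 0.2],
              [2.0, 0.3, 5.0, 0.1],
              [0.5, 0.2, 0.1, 2.0]])

# Variable 2 is independent, variable 3 is unused, variables 1 and 4 are dependent
reordered, n_ind, n_dep = reorder_by_status(x, [0, 1, -1, 0])
```

The reordered matrix has the layout that the function expects when xIndices is not supplied.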

partialCov (Input)

or

partialCorr (Input)
By default, and if partialCov is specified, partial covariances are calculated. Partial correlations are calculated if partialCorr is specified.
test, int df, int dfOut, float pValues (Input, Output, Output)

Argument df is an input integer indicating the number of degrees of freedom associated with the input matrix x. If the number of degrees of freedom in x varies from element to element, then a conservative choice for df is the minimum degrees of freedom for all elements in x.

Argument dfOut contains the number of degrees of freedom in the test that the partial covariances/correlations are zero. This value will usually be df − nIndependent, but will be greater than this value if the independent variables are computationally linearly related.

Argument pValues is an array of size nDependent by nDependent containing the p-values for testing the null hypothesis that the associated partial covariance/correlation is zero. It is assumed that the observations from which x was computed follow a multivariate normal distribution and that each element in x has df degrees of freedom.

Description

Function partialCovariances computes partial covariances or partial correlations from an input covariance or correlation matrix. The “independent” variables are those whose linear effect is removed in computing the partial covariances/correlations. If the independent variables are linearly related to one another, partialCovariances detects the linear dependence and eliminates one or more of them from the list of independent variables. The number of variables eliminated, if any, can be determined from argument dfOut.
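
As a rough illustration of how linear dependence among the independent variables affects the degrees of freedom, the sketch below uses the numerical rank of the independent block in place of nIndependent. The helper effective_df is hypothetical, and the use of numpy.linalg.matrix_rank with its default tolerance is an assumption; the tolerance and elimination rule that partialCovariances applies are not stated here.

```python
import numpy as np

def effective_df(sigma11, df):
    """Hypothetical helper: degrees of freedom left after removing the
    linearly independent part of the independent variables."""
    sigma11 = np.asarray(sigma11, dtype=float)
    rank = np.linalg.matrix_rank(sigma11)      # assumed rank test
    eliminated = sigma11.shape[0] - rank       # redundant independent variables
    return df - rank, eliminated

# Covariance of (a, b, a + b) with unit-variance, uncorrelated a and b:
# the third variable is an exact linear combination of the first two.
sigma11 = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0],
                    [1.0, 1.0, 2.0]])

df_out, eliminated = effective_df(sigma11, df=30)
```

Here one variable is redundant, so the result is 28 rather than 30 − 3 = 27, which matches the statement that dfOut exceeds df − nIndependent when variables are eliminated.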

Given a covariance or correlation matrix Σ partitioned as

\[\begin{split}\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \\ \end{pmatrix}\end{split}\]

function partialCovariances computes the partial covariances (of the standardized variables if Σ is a correlation matrix) as

\[\Sigma_{22|1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\]

If partial correlations are desired, these are computed as

\[P_{22|1} = \left[\mathit{diag}\left(\Sigma_{22|1}\right)\right]^{-1/2} \Sigma_{22|1} \left[\mathit{diag}\left(\Sigma_{22|1}\right)\right]^{-1/2}\]

where diag denotes the matrix containing the diagonal of its argument along its diagonal with zeros off the diagonal. If \(\Sigma_{11}\) is singular, then as many variables as required are deleted from \(\Sigma_{11}\) (and \(\Sigma_{12}\)) in order to eliminate the linear dependencies. The computations then proceed as above.
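
The two formulas above can be written directly in NumPy. This is a minimal sketch that assumes \(\Sigma_{11}\) is nonsingular and that the independent variables occupy the first rows/columns; the function names partial_cov and partial_corr are hypothetical and the code does not reproduce the variable elimination that partialCovariances performs in the singular case.

```python
import numpy as np

def partial_cov(sigma, n_independent):
    """Sigma_22|1 = Sigma_22 - Sigma_21 Sigma_11^{-1} Sigma_12."""
    sigma = np.asarray(sigma, dtype=float)
    k = n_independent
    s11, s12 = sigma[:k, :k], sigma[:k, k:]
    s21, s22 = sigma[k:, :k], sigma[k:, k:]
    # Solve Sigma_11 Z = Sigma_12 instead of forming the inverse explicitly
    return s22 - s21 @ np.linalg.solve(s11, s12)

def partial_corr(sigma, n_independent):
    """P_22|1 = diag(S)^{-1/2} S diag(S)^{-1/2}, with S = Sigma_22|1."""
    s = partial_cov(sigma, n_independent)
    d = 1.0 / np.sqrt(np.diag(s))
    return s * np.outer(d, d)

# Three variables with all pairwise correlations 0.5; variable 1 is independent
sigma = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])

pcov = partial_cov(sigma, 1)
pcorr = partial_corr(sigma, 1)
```

For this small matrix the partial covariance of variables 2 and 3 is 0.5 − 0.5 · 0.5 = 0.25 and each partial variance is 1 − 0.25 = 0.75, so the partial correlation is 0.25 / 0.75 = 1/3.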

The p-value for a partial covariance tests the null hypothesis \(H_0 : \sigma_{ij|1}=0\), where \(\sigma_{ij|1}\) is the \((i,j)\) element in matrix \(\Sigma_{22|1}\). The p-value for a partial correlation tests the null hypothesis \(H_0 : \rho_{ij|1}=0\), where \(\rho_{ij|1}\) is the \((i,j)\) element in matrix \(P_{22|1}\). The p-values are returned in pValues. If the degrees of freedom for x, df, are not known, the resulting p-values may be useful for comparison, but they should not be used as an approximation to the actual probabilities.

Examples

Example 1

The following example computes partial covariances from a nine-variable covariance matrix, scaled from a correlation matrix originally given by Emmett (1949). The first three rows and columns contain the independent variables and the final six rows and columns contain the dependent variables.

from numpy import *
from pyimsl.stat.partialCovariances import partialCovariances
from pyimsl.stat.writeMatrix import writeMatrix

x = [[6.300, 3.050, 1.933, 3.365, 1.317, 2.293, 2.586, 1.242, 4.363],
     [3.050, 5.400, 2.170, 3.346, 1.473, 2.303, 2.274, 0.750, 4.077],
     [1.933, 2.170, 3.800, 1.970, 0.798, 1.062, 1.576, 0.487, 2.673],
     [3.365, 3.346, 1.970, 8.100, 2.983, 4.828, 2.255, 0.925, 3.910],
     [1.317, 1.473, 0.798, 2.983, 2.300, 2.209, 1.039, 0.258, 1.687],
     [2.293, 2.303, 1.062, 4.828, 2.209, 4.600, 1.427, 0.768, 2.754],
     [2.586, 2.274, 1.576, 2.255, 1.039, 1.427, 3.200, 0.785, 3.309],
     [1.242, 0.750, 0.487, 0.925, 0.258, 0.768, 0.785, 1.300, 1.458],
     [4.363, 4.077, 2.673, 3.910, 1.687, 2.754, 3.309, 1.458, 7.400]]

pcov = partialCovariances(3, 6, x)

writeMatrix("Partial Covariances", pcov, writeFormat="%7.3f")

Output

 
                  Partial Covariances
         1        2        3        4        5        6
1    0.000    0.000    0.000   -0.000    0.000    0.000
2    0.000    0.000    0.000    0.000    0.000    0.000
3    0.000    0.000    0.000    0.000    0.000    0.000
4   -0.000    0.000    0.000    5.495    1.895    3.084
5    0.000    0.000    0.000    1.895    1.841    1.476
6    0.000    0.000    0.000    3.084    1.476    3.403

Example 2

The following example computes partial correlations from a nine-variable correlation matrix originally given by Emmett (1949). The partial correlations between the remaining variables, after adjusting for variables 1, 3 and 9, are computed. Note in the output that the row and column labels are sequence numbers, not variable numbers. The corresponding variable numbers are 2, 4, 5, 6, 7 and 8, respectively.

from __future__ import print_function
from numpy import *
from pyimsl.stat.partialCovariances import partialCovariances
from pyimsl.stat.writeMatrix import writeMatrix

x = [[1.0, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
     [0.523, 1.0, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
     [0.395, 0.479, 1.0, .355, 0.27, 0.254, 0.452, 0.219, 0.504],
     [0.471, 0.506, 0.355, 1.0, 0.691, 0.791, 0.443, 0.285, 0.505],
     [0.346, 0.418, 0.27, 0.691, 1.0, 0.679, 0.383, 0.149, 0.409],
     [0.426, 0.462, 0.254, 0.791, 0.679, 1.0, 0.372, 0.314, 0.472],
     [0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1.0, 0.385, 0.68],
     [0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1.0, 0.47],
     [0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1.0]]
indices = [1, 0, 1, 0, 0, 0, 0, 0, 1]
test = {}
test['df'] = 30

pcorr = partialCovariances(3, 6, x,
                           partialCorr=True,
                           xIndices=indices,
                           test=test)

pval = test['pValues']
dfout = test['dfOut']
print("The degrees of freedom are", dfout)
writeMatrix("Partial Correlations", pcorr, writeFormat="%7.4f")
writeMatrix("P-Values", pval, writeFormat="%7.4f")

Output

The degrees of freedom are 27
 
                 Partial Correlations
         1        2        3        4        5        6
1   1.0000   0.2235   0.1936   0.2113   0.1253  -0.0610
2   0.2235   1.0000   0.6054   0.7198   0.0919   0.0249
3   0.1936   0.6054   1.0000   0.5977   0.1230  -0.0766
4   0.2113   0.7198   0.5977   1.0000   0.0349   0.0856
5   0.1253   0.0919   0.1230   0.0349   1.0000   0.0622
6  -0.0610   0.0249  -0.0766   0.0856   0.0622   1.0000
 
                       P-Values
         1        2        3        4        5        6
1   0.0000   0.2525   0.3232   0.2801   0.5249   0.7576
2   0.2525   0.0000   0.0006   0.0000   0.6417   0.9000
3   0.3232   0.0006   0.0000   0.0007   0.5328   0.6982
4   0.2801   0.0000   0.0007   0.0000   0.8602   0.6650
5   0.5249   0.6417   0.5328   0.8602   0.0000   0.7532
6   0.7576   0.9000   0.6982   0.6650   0.7532   0.0000

Warning Errors

IMSLS_NO_HYP_TESTS The input matrix “x” has # degrees of freedom, and the rank of the dependent variables is #. There are not enough degrees of freedom for hypothesis testing. The elements of “pValues” are set to NaN (not a number).

Fatal Errors

IMSLS_INVALID_MATRIX_1 The input matrix “x” is incorrectly specified. A computed correlation is greater than 1 for variables # and #.
IMSLS_INVALID_PARTIAL A computed partial correlation for variables # and # is greater than 1. The input matrix “x” is not positive semi-definite.