crosscorrelation

Computes the sample cross-correlation function of two stationary time series.

Synopsis

crosscorrelation (x, y, lagmax)

Required Arguments

float x[] (Input)
Array of length nObservations containing the first time series.
float y[] (Input)
Array of length nObservations containing the second time series.
int lagmax (Input)
Maximum lag of cross-covariances and cross-correlations to be computed. lagmax must be greater than or equal to 1 and less than nObservations.

Return Value

An array of length 2 × lagmax + 1 containing the cross-correlations between the time series x and y. The k-th element of this array contains the cross-correlation between x and y at lag (k - lagmax), where k = 0, 1, …, 2 × lagmax. To release this space, use free. If no solution can be computed, None is returned.
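
The index convention of the returned array can be sketched as follows. The helper names lag_of_index and index_of_lag are hypothetical and are not part of the library; they only illustrate the mapping between array position k and lag k - lagmax.

```python
def lag_of_index(k, lagmax):
    """Lag represented by element k of a result array of length 2*lagmax + 1."""
    return k - lagmax


def index_of_lag(lag, lagmax):
    """Array position that holds the value for a given lag in [-lagmax, lagmax]."""
    return lag + lagmax


lagmax = 10
# One lag per array element, running from -lagmax through +lagmax.
lags = [lag_of_index(k, lagmax) for k in range(2 * lagmax + 1)]
```

With lagmax = 10, element 0 corresponds to lag -10, element 10 to lag 0, and element 20 to lag 10.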

Optional Arguments

printLevel, int (Input)

Printing option.

printLevel Action
0 No printing is performed.
1 Prints the means and variances.
2 Prints the means, variances, and cross-covariances.
3 Prints the means, variances, cross-covariances, cross-correlations, and standard errors of cross-correlations.

Default = 0.

inputMeans, xMeanIn, yMeanIn (Input)
If specified, xMeanIn is the user input of the estimate of the mean of the time series x and yMeanIn is the user input of the estimate of the mean of the time series y.
outputMeans, xMeanOut, yMeanOut (Output)
If specified, xMeanOut is the mean of the time series x and yMeanOut is the mean of the time series y.
variances, xVariance, yVariance (Output)
If specified, xVariance is the variance of the time series x and yVariance is the variance of the time series y.
seCcf, standardErrors, seOption (Output)
An array of length 2 × lagmax + 1 containing the standard errors of the cross-correlations between the time series x and y. The method used to compute the standard errors is selected by seOption.
seOption Action
1 Compute standard errors of cross-correlations using Bartlett’s formula.
2 Compute standard errors of cross-correlations using Bartlett’s formula with the assumption of no cross-correlation.
crossCovariances (Output)
An array of length 2 × lagmax + 1 containing the cross-covariances between the time series x and y. The k-th element of this array contains the cross-covariance between x and y at lag (k - lagmax), where k = 0, 1, …, 2 × lagmax.

Description

Function crosscorrelation estimates the cross-correlation function of two jointly stationary time series given a sample of n = nObservations observations \(\{X_t\}\) and \(\{Y_t\}\) for \(t=1,2,\ldots,n\).

Let

\[\hat{\mu}_X\]

be the estimate of the mean \(\mu_X\) of the time series \(\{X_t\}\) where

\[\begin{split}\hat{\mu}_X = \begin{cases} \mu_X & \mu_X \text{ known (xMeanIn)} \\ \frac{1}{n} \sum\limits_{t=1}^{n} X_t & \mu_X \text{ unknown (xMeanOut)} \\ \end{cases}\end{split}\]

The autocovariance function of \(\{X_t\}\), \(\sigma_X(k)\), is estimated by

\[\hat{\sigma}_X(k) = \frac{1}{n} \sum_{t=1}^{n-k} \left(X_t - \hat{\mu}_X\right)\left(X_{t+k} - \hat{\mu}_X\right), \phantom{...} k = 0,1,\ldots,K\]

where K = lagmax. Note that

\[\hat{\sigma}_X(0)\]

is equivalent to the sample variance xVariance. The autocorrelation function \(\rho_X(k)\) is estimated by

\[\hat{\rho}_X(k) = \frac{\hat{\sigma}_X(k)}{\hat{\sigma}_X(0)} \phantom{...} k=0,1,\ldots,K\]

Note that

\[\hat{\rho}_X(0) \equiv 1\]

by definition. Let

\[\hat{\mu}_Y, \hat{\sigma}_Y(k), \textit{ and } \hat{\rho}_Y(k)\]

be similarly defined.
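
The estimators above can be sketched in NumPy as follows. This is an illustration of the formulas only, not the library's implementation; the function names autocovariance and autocorrelation and the optional mean argument are assumptions made for this sketch.

```python
import numpy as np


def autocovariance(x, lagmax, mean=None):
    """Sample autocovariances for lags 0..lagmax, using the 1/n divisor."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Use the supplied mean if it is known, otherwise the sample mean.
    mu = x.mean() if mean is None else float(mean)
    d = x - mu
    # For lag k, pair d[t] with d[t + k] for t = 0, ..., n - k - 1.
    return np.array([np.dot(d[:n - k], d[k:]) / n for k in range(lagmax + 1)])


def autocorrelation(x, lagmax, mean=None):
    """Sample autocorrelations: autocovariances scaled by the lag-0 value."""
    acv = autocovariance(x, lagmax, mean)
    return acv / acv[0]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
acv = autocovariance(x, 2)    # [2.0, 0.8, -0.2]
acf = autocorrelation(x, 2)   # [1.0, 0.4, -0.1]
```

Because the divisor is n rather than n - k, the lag-0 autocovariance equals the population-style variance (np.var with its default ddof=0).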

The cross-covariance function \(\sigma_{XY}(k)\) is estimated by

\[\begin{split}\begin{array}{l} \hat{\sigma}_{XY}(k) = \begin{cases} \tfrac{1}{n}\displaystyle\sum_{t=1}^{n-k}\left(X_t-\hat{\mu}_X\right)\left(Y_{t+k} - \hat{\mu}_Y\right) & k = 0, 1, \ldots, K \\ \tfrac{1}{n}\displaystyle\sum_{t=1-k}^n\left(X_t-\hat{\mu}_X\right)\left(Y_{t+k} - \hat{\mu}_Y\right) & k = -1, -2, \ldots, -K \end{cases} \end{array}\end{split}\]

The cross-correlation function \(\rho_{XY}(k)\) is estimated by

\[\hat{\rho}_{XY}(k) = \frac{\hat{\sigma}_{XY}(k)}{\left[\hat{\sigma}_X(0) \hat{\sigma}_Y(0)\right]^{1/2}} \phantom{...} k=0, \pm 1, \ldots, \pm K\]
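
A minimal NumPy sketch of the two estimators above, assuming both means are unknown and are estimated from the data. The function name cross_covariance_correlation is hypothetical; the output arrays follow the same layout as the return value, with lag k stored at index k + lagmax.

```python
import numpy as np


def cross_covariance_correlation(x, y, lagmax):
    """Return (ccv, cc) for lags -lagmax..lagmax, using the 1/n divisor."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    dx = x - x.mean()
    dy = y - y.mean()
    ccv = np.empty(2 * lagmax + 1)
    for k in range(-lagmax, lagmax + 1):
        if k >= 0:
            # Pair X_t with Y_{t+k} for t = 1, ..., n - k.
            s = np.dot(dx[:n - k], dy[k:])
        else:
            # Pair X_t with Y_{t+k} for t = 1 - k, ..., n.
            s = np.dot(dx[-k:], dy[:n + k])
        ccv[k + lagmax] = s / n
    # Scale by the square root of the product of the lag-0 autocovariances.
    scale = np.sqrt(np.mean(dx ** 2) * np.mean(dy ** 2))
    return ccv, ccv / scale


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 4.0, 3.0, 2.0, 1.0]
ccv, cc = cross_covariance_correlation(x, y, 2)
```

In this toy case y is the mirror image of x, so the lag-0 cross-correlation cc[2] is -1.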

The standard errors of the sample cross-correlations can optionally be computed through the optional argument seCcf, using the method selected by seOption. One method is based on a general asymptotic expression for the variance of the sample cross-correlation coefficient of two jointly stationary time series with independent, identically distributed normal errors given by Bartlett (1978, page 352). The theoretical formula is

\[\begin{split}\begin{aligned} \mathit{var}\left\{\hat{\rho}_{XY}(k)\right\} =& \frac{1}{n-k} \sum_{i=-\infty}^\infty \left[\rho_X(i)\rho_Y(i) + \rho_{XY}(i-k)\rho_{XY}(i+k) \right. \\ & -2\rho_{XY}(k)\left\{\rho_X(i)\rho_{XY}(i+k) + \rho_{XY}(-i)\rho_Y(i+k)\right\} \\ & \left. +\rho_{XY}^2(k)\left\{\rho_X(i) + \tfrac{1}{2}\rho_X^2(i) + \tfrac{1}{2}\rho_Y^2(i)\right\} \right] \\ \end{aligned}\end{split}\]

For computational purposes, the autocorrelations \(\rho_X(k)\) and \(\rho_Y(k)\) and the cross-correlations \(\rho_{XY}(k)\) are replaced by their corresponding estimates for \(|k|\leq K\) and are set to zero for all k such that \(|k|>K\), so the infinite sum reduces to a finite one.

A second method evaluates Bartlett’s formula under the additional assumption that the two series have no cross-correlation. The theoretical formula is

\[\mathrm{var}\left\{\hat{\rho}_{XY}(k)\right\} = \tfrac{1}{n-k} \sum_{i=-\infty}^{\infty} \rho_X(i) \rho_Y(i) \phantom{...} k \geq 0\]
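
The second method can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the sum is truncated to \(|i|\leq K\) with estimated autocorrelations, the symmetry \(\rho(-i)=\rho(i)\) of autocorrelations is used to fold the sum, and the divisor for negative lags is taken as n - |k|, which matches the symmetric standard errors in the example output. The function names are hypothetical.

```python
import numpy as np


def _acf(x, lagmax):
    """Sample autocorrelations for lags 0..lagmax (1/n divisor)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    acv = np.array([np.dot(d[:n - i], d[i:]) / n for i in range(lagmax + 1)])
    return acv / acv[0]


def se_ccf_no_cross_correlation(x, y, lagmax):
    """Standard errors for lags -lagmax..lagmax assuming no cross-correlation."""
    n = len(x)
    rho_x = _acf(x, lagmax)
    rho_y = _acf(y, lagmax)
    # Sum over i = -K..K of rho_X(i) * rho_Y(i); the i = 0 term equals 1.
    total = 1.0 + 2.0 * np.sum(rho_x[1:] * rho_y[1:])
    lags = np.arange(-lagmax, lagmax + 1)
    # Assumption: divide by n - |k| so negative lags mirror positive lags.
    return np.sqrt(total / (n - np.abs(lags)))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
se = se_ccf_no_cross_correlation(x, x, 2)
```

Since the summed term does not depend on k, the standard error under this assumption varies with the lag only through the divisor n - |k|, so it is smallest at lag 0 and grows toward the largest lags.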

For additional special cases of Bartlett’s formula, see Box and Jenkins (1976, page 377).

An important property of the cross-covariance coefficient is \(\sigma_{XY}(k) =\sigma_{YX}(-k)\) for \(k\geq 0\). This result is used in the computation of the standard error of the sample cross-correlation for lag \(k<0\). In general, the cross-covariance function is not symmetric about zero so both positive and negative lags are of interest.
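
Both points can be checked numerically with a small sketch. The helper ccv_at_lag is hypothetical and uses the 1/n estimator with sample means; the two impulse series are made-up data in which y is x delayed by two steps.

```python
import numpy as np


def ccv_at_lag(x, y, k):
    """Sample cross-covariance between x and y at a single lag k (1/n divisor)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    dx = x - x.mean()
    dy = y - y.mean()
    if k >= 0:
        return np.dot(dx[:n - k], dy[k:]) / n
    return np.dot(dx[-k:], dy[:n + k]) / n


# y repeats the impulse in x two steps later.
x = [0, 0, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 1, 0, 0, 0]

# Swapping the series and negating the lag gives the same value.
same = [np.isclose(ccv_at_lag(x, y, k), ccv_at_lag(y, x, -k)) for k in range(4)]

# The function is not symmetric about zero for a fixed ordering of the series.
forward = ccv_at_lag(x, y, 2)    # lag at which the impulses line up
backward = ccv_at_lag(x, y, -2)
```

Here forward is clearly positive while backward is slightly negative, which is why both signs of the lag are reported.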

Example

Consider the Gas Furnace Data (Box and Jenkins 1976, pages 532–533) where X is the input gas rate in cubic feet/minute and Y is the percent \(CO_2\) in the outlet gas. Function crosscorrelation is used to compute the cross-covariances and cross-correlations between time series X and Y for lags -10 through 10. In addition, the estimated standard errors of the estimated cross-correlations are computed. The standard errors are based on the additional assumption that all cross-correlations for X and Y are zero.

from __future__ import print_function
from numpy import empty
from pyimsl.stat.crosscorrelation import crosscorrelation
from pyimsl.stat.dataSets import dataSets

nobs = 296                # number of observations in each series
lagmax = 10               # compute lags -10 through 10
x = empty(nobs)
y = empty(nobs)
xymean = {}               # receives xMeanOut and yMeanOut
xyvar = {}                # receives xVariance and yVariance
secc = {'seOption': 2}    # Bartlett's formula assuming no cross-correlation
ccv = empty(0)            # receives the cross-covariances

# Load the data set and split its two columns into the series X and Y.
data = dataSets(7)
for i in range(0, nobs):
    x[i] = data[i][0]
    y[i] = data[i][1]

cc = crosscorrelation(x, y, lagmax,
                      outputMeans=xymean,
                      variances=xyvar,
                      seCcf=secc,
                      crossCovariances=ccv)

print("Mean of series X     = %g" % xymean['xMeanOut'])
print("Variance of series X = %g\n" % xyvar['xVariance'])
print("Mean of series Y     = %g" % xymean['yMeanOut'])
print("Variance of series Y = %g\n" % xyvar['yVariance'])
print("Lag            CCV           CC         SECC\n")
# Element i of each result array corresponds to lag i - lagmax.
for i in range(0, 2 * lagmax + 1):
    print("%-5d%13g%13g%13g" %
          (i - lagmax, ccv[i], cc[i], secc['standardErrors'][i]))

Output

Mean of series X     = -0.0568345
Variance of series X = 1.14694

Mean of series Y     = 53.5091
Variance of series Y = 10.2189

Lag            CCV           CC         SECC

-10      -0.404502    -0.118154     0.162754
-9       -0.508491    -0.148529      0.16247
-8       -0.614369    -0.179456     0.162188
-7       -0.705476    -0.206068     0.161907
-6       -0.776167    -0.226716     0.161627
-5       -0.831474    -0.242871     0.161349
-4       -0.891315    -0.260351     0.161073
-3       -0.980605    -0.286432     0.160798
-2        -1.12477    -0.328542     0.160524
-1        -1.34704    -0.393467     0.160252
0         -1.65853    -0.484451     0.159981
1         -2.04865    -0.598405     0.160252
2         -2.48217    -0.725033     0.160524
3         -2.88541     -0.84282     0.160798
4         -3.16536    -0.924592     0.161073
5         -3.25344     -0.95032     0.161349
6         -3.13113    -0.914593     0.161627
7         -2.83919     -0.82932     0.161907
8         -2.45302     -0.71652     0.162188
9         -2.05269    -0.599584      0.16247
10        -1.69465    -0.495004     0.162754