KENDL

Computes and tests Kendall’s rank correlation coefficient.

Required Arguments

X — Vector of length NOBS containing the observations for the first variable. (Input)

Y — Vector of length NOBS containing the observations for the second variable. (Input)

FUZZ — Value used to determine ties in X or Y. (Input)
Two observations are said to be tied if the absolute value of their difference is less than or equal to FUZZ.

STAT — Vector of length 9 containing some output statistics. (Output)
See the “Description” section for full definitions. The output statistics are:

i	STAT(i)
1	Kendall τa (assumes no ties)
2	Kendall τb (corrects for ties)
3	Ties statistic for variable X
4	Ties statistic for variable Y
5	Statistic S corresponding to Kendall’s τ
6	Exact probability of achieving a score at least as large as S. This probability is not calculated if NOBS is too large (34 on many computers) or if there are ties. In either case, STAT(6) is set to NaN (not a number).
7	The same probability as STAT(6) but using a normal approximation. (Set to NaN if NOBS is less than 8.)
8	The same probability as STAT(6) but using a continuity correction with a normal approximation. (Set to NaN if NOBS is less than 8.)
9	Index in FRQ corresponding to the frequency of the observed S statistic. STAT(9) is not computed when there are ties.

Optional Arguments

NOBS — Number of observations. (Input)
NOBS must be 3 or more.
Default: NOBS = size (X,1).

FRQ — Vector of length NOBS * (NOBS ‑ 1)/2 + 1 containing the frequencies of occurrence of the possible values of the statistic S, STAT(5), under the null hypothesis of no relationship. (Output)
FRQ is not calculated if there are ties or if NOBS is too large (34 on many computers).

FORTRAN 90 Interface

Generic: CALL KENDL (X, Y, FUZZ, STAT [, …])

Specific: The specific interface names are S_KENDL and D_KENDL.

FORTRAN 77 Interface

Single: CALL KENDL (NOBS, X, Y, FUZZ, STAT, FRQ)

Double: The double precision name is DKENDL.

Description

Routine KENDL performs Kendall’s test of the hypothesis of no correlation (independence) by calculating τa and τb (τb handles ties), the Kendall sum S, and associated probabilities. The frequencies of occurrence of S are also computed if the sample size (NOBS) is not too large.

Kendall’s (1962) method is used in computing the τ statistics. Each pair (xi, yi) is compared with every other pair (xj, yj). The Kendall S statistic is incremented if the two pairs are concordant ((xi > xj and yi > yj) or (xi < xj and yi < yj)) and decremented if the pairs are discordant ((xi > xj and yi < yj) or (xi < xj and yi > yj)). Ties (xi = xj or yi = yj) are not counted. Generally, when ties exist, τb is a better measure of correlation than is τa. The untied form of the denominator is used to calculate τa. That is,

$$\tau_a = \frac{S}{n(n-1)/2}$$

where n = NOBS. Ties enter into the denominator of τb as follows:

$$\tau_b = \frac{S}{\sqrt{(D - T_x)(D - T_y)}}$$

where D = n(n ‑ 1)/2 and

$$T_x = \sum_i \frac{t_i(t_i - 1)}{2}$$

where ti is the number of ties in the x variable with the i-th tie value. Ty is calculated in a similar manner.
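The pairwise comparison described above can be sketched in a few lines. This is an illustrative Python sketch, not the KENDL implementation: the function name kendall_taus is hypothetical, it assumes the usual definitions τa = S/D and τb = S/sqrt((D ‑ Tx)(D ‑ Ty)) with D = n(n ‑ 1)/2, and it counts Tx and Ty as the number of tied pairs, which equals the sum of ti(ti ‑ 1)/2 over tie groups when the groups are well separated relative to FUZZ.

```python
import math

def kendall_taus(x, y, fuzz=0.0):
    """Return (S, Tx, Ty, tau_a, tau_b) from an all-pairs comparison."""
    n = len(x)
    s = tx = ty = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            # Two observations are tied if they differ by at most fuzz.
            x_tied = abs(dx) <= fuzz
            y_tied = abs(dy) <= fuzz
            tx += x_tied  # tied pairs in x (assumed form of the ties statistic)
            ty += y_tied
            if x_tied or y_tied:
                continue  # ties are not counted in S
            # Concordant pairs add 1, discordant pairs subtract 1.
            s += 1 if (dx > 0) == (dy > 0) else -1
    d = n * (n - 1) / 2
    if tx == d or ty == d:
        raise ValueError("all elements of X or Y are tied; statistics undefined")
    tau_a = s / d
    tau_b = s / math.sqrt((d - tx) * (d - ty))
    return s, tx, ty, tau_a, tau_b

# Data from the Example section of this page.
x = [6, 4, 7, 3, 8, 1, 5, 2]
y = [7, 1, 5, 8, 6, 4, 2, 3]
s, tx, ty, tau_a, tau_b = kendall_taus(x, y, fuzz=0.0001)
print(s, tx, ty, round(tau_a, 4), round(tau_b, 4))  # 4 0 0 0.1429 0.1429
```

With the example data there are no ties, so τa and τb coincide at 4/28.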

For NOBS less than 34 on many machines (the limit differs on machines whose largest representable real number is different), the array FRQ is computed, provided there are no ties. FRQ contains the frequency distribution of S under the null hypothesis of independence. The probability distribution of S can be obtained directly from these frequencies by dividing each frequency by the sum of the frequencies. See routine KENDP for further discussion on the use of the FRQ array.
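To make the role of FRQ concrete, the following Python sketch builds the null frequency distribution of S by a standard counting recurrence and derives the exact upper-tail probability from it. It is an illustration only: the source does not state how KENDL computes FRQ, the function names are hypothetical, and the index mapping (S + D)/2 + 1 with D = n(n ‑ 1)/2 is an assumption that happens to agree with the Example section (S = 4, NOBS = 8, index 17).

```python
def s_frequencies(n):
    """Frequencies of S = -D, -D+2, ..., D over all n! orderings, D = n(n-1)/2."""
    freq = [1]  # one observation: a single arrangement
    for m in range(2, n + 1):
        # Adding the m-th observation can create 0..m-1 extra discordant pairs.
        new = [0] * (len(freq) + m - 1)
        for k, count in enumerate(freq):
            for extra in range(m):
                new[k + extra] += count
        freq = new
    # The distribution is symmetric, so indexing by discordant-pair count
    # or by increasing S gives the same list.
    return freq

def exact_upper_tail(s, n):
    """Return (1-based index of S in the frequency list, P(score >= S))."""
    freq = s_frequencies(n)
    d = n * (n - 1) // 2
    index = (s + d) // 2 + 1  # assumed mapping from S to a position in FRQ
    prob = sum(freq[index - 1:]) / sum(freq)
    return index, prob

index, prob = exact_upper_tail(4, 8)
print(index, round(prob, 4))  # 17 0.3598
```

Dividing by the sum of the frequencies (8! = 40320 for NOBS = 8) is what turns the counts into probabilities.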

For a two-sided test, if the appropriate probability p of achieving or exceeding S is small (less than α/2, where α is the significance level of the test) or if 1 ‑ p is small (less than α/2), then the two-sided hypothesis of no correlation can be rejected. Alternatively, for small p or 1 ‑ p, the appropriate one-sided hypothesis can be rejected.
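The two-sided decision rule in the preceding paragraph amounts to a pair of comparisons against α/2. A minimal Python sketch, with a hypothetical function name and p taken to be whichever of STAT(6), STAT(7), or STAT(8) is appropriate:

```python
def reject_two_sided(p, alpha):
    """True if the hypothesis of no correlation is rejected at level alpha.

    p is the probability of achieving or exceeding the observed S.
    """
    return p < alpha / 2 or (1 - p) < alpha / 2

# Exact probability from the Example section, tested at the 5% level.
print(reject_two_sided(0.3598, 0.05))  # False
```

For the example data the test does not reject, which agrees with the conclusion stated in the Example section.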

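For n > 7, asymptotic normal probabilities are determined using the fact that

$$z = \frac{S}{\sqrt{\operatorname{Var}(S)}}$$

is approximately standard normal for large n. Here,

$$\operatorname{Var}(S) = \frac{n(n-1)(2n+5) - \sum_x t_i(t_i-1)(2t_i+5) - \sum_y t_i(t_i-1)(2t_i+5)}{18} + \frac{\left[\sum_x t_i(t_i-1)(t_i-2)\right]\left[\sum_y t_i(t_i-1)(t_i-2)\right]}{9n(n-1)(n-2)} + \frac{\left[\sum_x t_i(t_i-1)\right]\left[\sum_y t_i(t_i-1)\right]}{2n(n-1)}$$

where ti is the number of observations in the i-th tie group for the x (or y) summation variable.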

STAT(7) contains the probability associated with the z statistic while STAT(8) contains the same probability but with the value of S reduced by 1. This reduction is for “continuity correction.” For n less than 25, these probabilities are conservative at the 1% level of significance.
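The normal approximation can be checked by hand for the example data. The Python sketch below is an illustration under stated assumptions, not the routine's code: it uses the standard tie-corrected variance of S from Kendall's treatment, takes the tie-group sizes as explicit (hypothetical) arguments x_ties and y_ties, and computes the upper-tail normal probability with math.erfc.

```python
import math

def var_s(n, x_ties=(), y_ties=()):
    """Variance of S under independence; x_ties and y_ties list tie-group sizes."""
    def tsum(groups, f):
        return sum(f(t) for t in groups)
    v = (n * (n - 1) * (2 * n + 5)
         - tsum(x_ties, lambda t: t * (t - 1) * (2 * t + 5))
         - tsum(y_ties, lambda t: t * (t - 1) * (2 * t + 5))) / 18
    v += (tsum(x_ties, lambda t: t * (t - 1) * (t - 2))
          * tsum(y_ties, lambda t: t * (t - 1) * (t - 2))) / (9 * n * (n - 1) * (n - 2))
    v += (tsum(x_ties, lambda t: t * (t - 1))
          * tsum(y_ties, lambda t: t * (t - 1))) / (2 * n * (n - 1))
    return v

def upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def normal_probabilities(s, n, x_ties=(), y_ties=()):
    """Return (uncorrected, continuity-corrected) probabilities, NaN if n < 8."""
    if n < 8:
        return math.nan, math.nan
    sd = math.sqrt(var_s(n, x_ties, y_ties))
    # The continuity correction reduces S by 1 before standardizing.
    return upper_tail(s / sd), upper_tail((s - 1) / sd)

p_normal, p_corrected = normal_probabilities(4, 8)
print(round(p_normal, 4), round(p_corrected, 4))
```

With S = 4 and NOBS = 8 and no ties, Var(S) = 8·7·21/18, and the two probabilities agree to four decimals with the STAT(7) and STAT(8) values printed in the Example section.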

Comments

1. Workspace may be explicitly provided, if desired, by use of K2NDL/DK2NDL. The reference is:

CALL K2NDL (NOBS, X, Y, FUZZ, STAT, FRQ, IWK, WK, XRNK, YRNK)

The additional arguments are as follows:

IWK — Work vector of length NOBS.

WK — Work vector of length (NOBS ‑ 1) * (NOBS ‑ 2)/2 + 1.

XRNK — Work vector of length NOBS.

YRNK — Work vector of length NOBS.

2. Informational errors

Type	Code	Description
3	4	Ties are detected in the two samples. STAT(6) is set to NaN (not a number) and FRQ is not calculated.
3	5	NOBS is less than 8 so the asymptotic normal probabilities are not determined. STAT(7) and STAT(8) are set to NaN (not a number).
3	6	NOBS is too large (34 on many computers). STAT(6) is set to NaN (not a number) and FRQ is not calculated.
4	2	All the elements of X are tied. The output statistics are not defined.
4	3	All the elements of Y are tied. The output statistics are not defined.

Example

In this example, the Kendall test is performed on a sample of size 8. The test fails to reject the null hypothesis of no correlation.

      USE KENDL_INT
      USE WRRRL_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 SPECIFICATIONS FOR PARAMETERS
      REAL       FUZZ
      PARAMETER  (FUZZ=0.0001)
      REAL       FRQ(29), STAT(9), X(8), Y(8)
      CHARACTER  CLABEL(2)*10, RLABEL(9)*10

      DATA RLABEL/'tau(a)', 'tau(b)', 'ties(X)', 'ties(Y)', &
           'S', 'Pr(S)', 'Pr(S)-n', 'Pr(S)-na', 'IFRQ'/
      DATA CLABEL/'Statistic', ' '/
      DATA X/6, 4, 7, 3, 8, 1, 5, 2/
      DATA Y/7, 1, 5, 8, 6, 4, 2, 3/
!                                 Compute the statistics and FRQ
      CALL KENDL (X, Y, FUZZ, STAT, FRQ=FRQ)
!                                 Print the results
      CALL WRRRL ('STAT', STAT, RLABEL, CLABEL, FMT='(W10.6)')
      CALL WRRRN ('FRQ', FRQ, 1, 29, 1, 0)
      END

Output

              STAT
          Statistic
tau(a)       0.1429
tau(b)       0.1429
ties(X)      0.0000
ties(Y)      0.0000
S            4.0000
Pr(S)        0.3598
Pr(S)-n      0.3103
Pr(S)-na     0.3553
IFRQ        17.0000

                              FRQ
     1       2       3       4       5       6       7       8
   1.0     7.0    27.0    76.0   174.0   343.0   602.0   961.0

     9      10      11      12      13      14      15      16
1415.0  1940.0  2493.0  3017.0  3450.0  3736.0  3836.0  3736.0

    17      18      19      20      21      22      23      24
3450.0  3017.0  2493.0  1940.0  1415.0   961.0   602.0   343.0

    25      26      27      28      29
 174.0    76.0    27.0     7.0     1.0