KRSKL

Performs a Kruskal‑Wallis test for identical population medians.

Required Arguments

NI — Vector of length NGROUP containing the number of responses for each of the NGROUP groups. (Input)

Y — Vector of length NI(1) + … + NI(NGROUP) that contains the responses for each of the NGROUP groups. (Input)
Y must be sorted by group, with the NI(1) observations in group 1 coming first, the NI(2) observations in group 2 coming second, and so on.

FUZZ — Constant used to determine ties in Y. (Input)
If (after sorting) |Y(i) − Y(i + 1)| is less than or equal to FUZZ, then a tie is counted. FUZZ must be nonnegative.

STAT — Vector of length 4 containing the Kruskal‑Wallis statistics. (Output)

 

I    STAT(I)
1    Kruskal‑Wallis H statistic.
2    Asymptotic probability of a larger H under the null hypothesis of identical population medians.
3    H corrected for ties.
4    Asymptotic probability of a larger H (corrected for ties) under the null hypothesis of identical populations.
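As a rough illustration of the FUZZ rule described for the FUZZ argument, the sketch below counts ties in a vector that is already sorted in ascending order by comparing each adjacent pair against FUZZ. The module and function names (fuzz_ties_sketch, count_ties) are hypothetical and are not part of the library; how KRSKL tallies ties internally is not stated in this document.

```fortran
module fuzz_ties_sketch
  implicit none
contains
  ! Count adjacent pairs of an ascending-sorted vector whose absolute
  ! difference is at most fuzz. Hypothetical helper, not an IMSL routine.
  integer function count_ties(ysorted, fuzz) result(nties)
    real, intent(in) :: ysorted(:)
    real, intent(in) :: fuzz
    integer :: i
    nties = 0
    do i = 1, size(ysorted) - 1
      if (abs(ysorted(i) - ysorted(i + 1)) <= fuzz) nties = nties + 1
    end do
  end function count_ties
end module fuzz_ties_sketch
```

With FUZZ = 0.0 only exactly equal neighbors are counted; a small positive FUZZ also treats values that differ by rounding noise as tied.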

Optional Arguments

NGROUP — Number of groups. (Input)
Default: NGROUP = size (NI,1).

FORTRAN 90 Interface

Generic: CALL KRSKL (NI, Y, FUZZ, STAT [, …])

Specific: The specific interface names are S_KRSKL and D_KRSKL.

FORTRAN 77 Interface

Single: CALL KRSKL (NGROUP, NI, Y, FUZZ, STAT)

Double: The double precision name is DKRSKL.

Description

The routine KRSKL generalizes the Wilcoxon two‑sample test computed by routine RNKSM to more than two populations. It computes a test statistic for testing that the population distribution functions in each of K populations are identical. Under appropriate assumptions, this is a nonparametric analogue of the one‑way analysis of variance. Since more than two samples are involved, the alternative is taken as the analogue of the usual analysis of variance alternative, namely that the populations are not identical.

The calculations proceed as follows: All observations are ranked regardless of the population to which they belong. Average ranks are used for tied observations (observations within FUZZ of each other). Missing observations (observations equal to NaN, not a number) are not included in the ranking. Let R_i denote the sum of the ranks in the i‑th population. The test statistic H is defined as:

$$H = \frac{1}{S^2}\left[\sum_{i=1}^{K}\frac{R_i^2}{n_i} - \frac{N(N+1)^2}{4}\right]$$

where N is the total of the sample sizes, n_i is the number of observations in the i‑th sample, and S^2 is computed as the (bias corrected) sample variance of the R_i.
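The ranking step and the statistic can be sketched in a few lines of standalone Fortran. This is only an illustration under stated assumptions, not the library's implementation: it takes H in Conover's form, H = (1/S^2) [ sum of R_i^2/n_i − N(N+1)^2/4 ], assumes S^2 is the sample variance of the individual average ranks for the tie-corrected value, and assumes S^2 = N(N+1)/12 for the uncorrected value. Missing values (NaN) and the all-tied case are not handled. The names kw_sketch, average_ranks, and kw_h are hypothetical.

```fortran
module kw_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Assign average ranks; sorted neighbors within fuzz of each other are tied.
  subroutine average_ranks(y, fuzz, rnk)
    real(dp), intent(in)  :: y(:), fuzz
    real(dp), intent(out) :: rnk(:)
    integer :: idx(size(y)), i, j, k, n, tmp
    n = size(y)
    idx = [(i, i = 1, n)]
    ! Insertion sort of the index vector by y (adequate for small samples).
    do i = 2, n
      tmp = idx(i)
      j = i - 1
      do while (j >= 1)
        if (y(idx(j)) <= y(tmp)) exit
        idx(j + 1) = idx(j)
        j = j - 1
      end do
      idx(j + 1) = tmp
    end do
    ! Walk each run of tied values and give every member the mean rank.
    i = 1
    do while (i <= n)
      j = i
      do while (j < n)
        if (y(idx(j + 1)) - y(idx(j)) > fuzz) exit
        j = j + 1
      end do
      do k = i, j
        rnk(idx(k)) = 0.5_dp * real(i + j, dp)
      end do
      i = j + 1
    end do
  end subroutine average_ranks

  ! h_plain assumes S^2 = N(N+1)/12; h_ties assumes S^2 is the sample
  ! variance of the assigned ranks. No guard for the all-tied case (S^2 = 0).
  subroutine kw_h(ni, y, fuzz, h_plain, h_ties)
    integer, intent(in)   :: ni(:)
    real(dp), intent(in)  :: y(:), fuzz
    real(dp), intent(out) :: h_plain, h_ties
    real(dp) :: rnk(size(y)), n, c, q, ri, s2
    integer :: g, first
    call average_ranks(y, fuzz, rnk)
    n = real(size(y), dp)
    c = n * (n + 1.0_dp)**2 / 4.0_dp
    q = 0.0_dp
    first = 1
    do g = 1, size(ni)
      ri = sum(rnk(first:first + ni(g) - 1))   ! rank sum R_i of group g
      q = q + ri**2 / real(ni(g), dp)
      first = first + ni(g)
    end do
    h_plain = (q - c) / (n * (n + 1.0_dp) / 12.0_dp)
    s2 = (sum(rnk**2) - c) / (n - 1.0_dp)
    h_ties = (q - c) / s2
  end subroutine kw_h

end module kw_sketch
```

Applied to the corn data from the Example section with FUZZ = 0.001, this sketch gives approximately 25.46 for h_plain and 25.63 for h_ties, which round to the 25.5 and 25.6 printed there.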

The null hypothesis is rejected when STAT(4) (or STAT(2)) is less than the significance level of the test. If the null hypothesis is rejected, then the procedures given in Conover (1980, page 231) may be used for multiple comparisons. The routine KRSKL computes asymptotic probabilities using the chi‑squared distribution when the number of groups is 6 or greater, and a Beta approximation (see Wallace 1959) when the number of groups is 5 or less. Tables yielding exact probabilities in small samples may be obtained from Owen (1962).
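For the chi‑squared case, the asymptotic probability is the upper tail of a chi‑squared distribution evaluated at H. The sketch below computes that tail for integer degrees of freedom from the standard finite-sum closed forms. It assumes K − 1 degrees of freedom for K groups, which is the usual choice for this test and is consistent with the wording of informational error 6, but is not stated explicitly in this document. It does not reproduce the Beta approximation used for 5 or fewer groups, and the names chi2_tail_sketch and chi2_upper are hypothetical.

```fortran
module chi2_tail_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Upper-tail probability P(X > x) for X ~ chi-squared with integer df >= 1.
  function chi2_upper(x, df) result(q)
    real(dp), intent(in) :: x
    integer, intent(in)  :: df
    real(dp) :: q, term, s
    real(dp), parameter :: pi = 3.141592653589793_dp
    integer :: j
    if (x <= 0.0_dp) then
      q = 1.0_dp
      return
    end if
    if (mod(df, 2) == 0) then
      ! Even df: exp(-x/2) * sum over j = 0 .. df/2 - 1 of (x/2)**j / j!
      term = 1.0_dp
      s = 1.0_dp
      do j = 1, df / 2 - 1
        term = term * (0.5_dp * x) / real(j, dp)
        s = s + term
      end do
      q = exp(-0.5_dp * x) * s
    else
      ! Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) *
      !         sum over j = 1 .. (df-1)/2 of x**(j-1/2) / (1*3*...*(2j-1))
      term = sqrt(x)
      s = 0.0_dp
      do j = 1, (df - 1) / 2
        if (j > 1) term = term * x / real(2 * j - 1, dp)
        s = s + term
      end do
      q = erfc(sqrt(0.5_dp * x)) + sqrt(2.0_dp / pi) * exp(-0.5_dp * x) * s
    end if
  end function chi2_upper
end module chi2_tail_sketch
```

Under these assumptions, a call such as chi2_upper(h, k - 1) with k equal to 6 or more groups gives the kind of probability that would be compared against the significance level.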

Comments

1. Workspace may be explicitly provided, if desired, by use of K2SKL/DK2SKL. The reference is:

CALL K2SKL (NGROUP, NI, Y, FUZZ, STAT, IWK, WK, YRNK)

The additional arguments are as follows:

IWK — Integer work vector of length m.

WK — Work vector of length m.

YRNK — Work vector of length m.

2. Informational errors

 

Type  Code  Description
3     4     At least one tie was detected in Y.
3     5     All elements of Y are tied. STAT is set to 1.0.
3     6     The chi‑squared degrees of freedom are less than 5, so the Beta approximation is used.

Example

The following example is taken from Conover (1980, page 231). The data represent the yields per acre of four different methods for raising corn. Since H = 25.5 and the corresponding asymptotic probability prints as 0.0000, the four methods are clearly different. The warning error is always printed when the Beta approximation is used, unless printing for warning errors is turned off. See IMSL routine ERSET in the Reference Material.

 

      USE KRSKL_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    NGROUP
      REAL       FUZZ
      PARAMETER  (FUZZ=0.001, NGROUP=4)
!
      INTEGER    NI(NGROUP), NOUT
      REAL       STAT(4), Y(34)
!                                 Group sizes, then responses in group order
      DATA NI/9, 10, 7, 8/
      DATA Y/83, 91, 94, 89, 89, 96, 91, 92, 90, 91, 90, 81, 83, 84, &
          83, 88, 91, 89, 84, 101, 100, 91, 93, 96, 95, 94, 78, 82, &
          81, 77, 79, 81, 80, 81/
!                                 Perform Kruskal-Wallis test
      CALL KRSKL (NI, Y, FUZZ, STAT)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) STAT
!
99999 FORMAT (' H (no ties) = ', F8.1, /, ' Prob (no ties) = ', &
          F11.4, /, ' H (ties) = ', F8.1, /, ' Prob (ties) ' &
          , ' = ', F11.4)
!
      END

Output

 

*** WARNING ERROR 6 from KRSKL. The chi-squared degrees of freedom are
*** less than 5, so the Beta approximation is used.
H (no ties) = 25.5
Prob (no ties) = 0.0000
H (ties) = 25.6
Prob (ties) = 0.0000