Performs a Wilcoxon rank sum test.
#include <imsls.h>
float imsls_f_wilcoxon_rank_sum (int n1_observations, float x1[], int n2_observations, float x2[], ..., 0)
The type double function is imsls_d_wilcoxon_rank_sum.
int n1_observations (Input)
Number of observations in the first sample.
float x1[] (Input)
Array of length n1_observations containing the first sample.
int n2_observations (Input)
Number of observations in the second sample.
float x2[] (Input)
Array of length n2_observations containing the second sample.
The two-sided p-value for the Wilcoxon rank sum statistic that is computed with average ranks used in the case of ties.
#include <imsls.h>
float imsls_f_wilcoxon_rank_sum (int n1_observations, float x1[], int n2_observations, float x2[],
    IMSLS_FUZZ, float fuzz,
    IMSLS_STAT, float **stat,
    IMSLS_STAT_USER, float stat[],
    0)
IMSLS_FUZZ, float fuzz (Input)
Nonnegative constant used to determine ties in computing ranks in the combined samples. A tie is declared when two observations in the combined sample are within fuzz of each other.
Default: fuzz = 100 × imsls_f_machine(4) × max {|xi1|, |xj2|}
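The default tolerance can be illustrated without the library. The following is a minimal sketch, not the library's implementation: it assumes that imsls_f_machine(4) corresponds to FLT_EPSILON, that the maximum is taken over the absolute values of all observations in both samples, and that "within fuzz" means |a − b| ≤ fuzz. The names default_fuzz and is_tie are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Largest absolute value in an array of n floats (n >= 1). */
static float max_abs(const float *x, int n)
{
    float m = 0.0f;
    assert(n >= 1);
    for (int i = 0; i < n; i++) {
        float a = fabsf(x[i]);
        if (a > m)
            m = a;
    }
    return m;
}

/* Sketch of the documented default:
 *   fuzz = 100 * eps * max{|x1[i]|, |x2[j]|}
 * ASSUMPTION: FLT_EPSILON stands in for imsls_f_machine(4). */
float default_fuzz(const float *x1, int n1, const float *x2, int n2)
{
    float m1 = max_abs(x1, n1);
    float m2 = max_abs(x2, n2);
    return 100.0f * FLT_EPSILON * (m1 > m2 ? m1 : m2);
}

/* ASSUMPTION: "within fuzz of each other" is read as |a - b| <= fuzz. */
int is_tie(float a, float b, float fuzz)
{
    return fabsf(a - b) <= fuzz;
}
```

Under these assumptions, observations of magnitude near 7 give a tolerance on the order of 1e-4, so only values that agree to about four decimal places are treated as tied.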
IMSLS_STAT, float **stat (Output)
Address of a pointer to an internally allocated array of length 10 containing the following statistics:
Row | Statistics |
0 | Wilcoxon W statistic (the sum of the ranks of the x observations) adjusted for ties in such a manner that W is as small as possible |
1 | 2 × E(W) − W, where E(W) is the expected value of W |
2 | probability of obtaining a statistic less than or equal to min{W, 2 × E(W) − W} |
3 | W statistic adjusted for ties in such a manner that W is as large as possible |
4 | 2 × E(W) − W, where E(W) is the expected value of W, adjusted for ties in such a manner that W is as large as possible |
5 | probability of obtaining a statistic less than or equal to min{W, 2 × E(W) − W}, adjusted for ties in such a manner that W is as large as possible |
6 | W statistic with average ranks used in case of ties |
7 | estimated standard error of stat[6] under the null hypothesis of no difference |
8 | standard normal score associated with stat[6] |
9 | two-sided p-value associated with stat[8] |
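The quantity 2 × E(W) − W in rows 1 and 4 is the reflection of W about its expected value. As a hedged illustration, the sketch below uses the standard null expectation of the rank sum of the first sample, E(W) = n1 (n1 + n2 + 1) / 2. The function names expected_w and mirrored_w are hypothetical and are not part of the library.

```c
#include <assert.h>

/* Expected rank sum of the first sample under the null hypothesis:
 * E(W) = n1 * (n1 + n2 + 1) / 2. */
double expected_w(int n1, int n2)
{
    assert(n1 > 0 && n2 > 0);
    return n1 * (n1 + n2 + 1) / 2.0;
}

/* Reflection of W about E(W), i.e. 2 * E(W) - W. */
double mirrored_w(double w, int n1, int n2)
{
    return 2.0 * expected_w(n1, n2) - w;
}
```

With n1 = n2 = 5, E(W) is 27.5, so W = 34 reflects to 21 and W = 35 reflects to 20.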
IMSLS_STAT_USER, float stat[] (Output)
Storage for array stat is provided by the user. See IMSLS_STAT.
Function imsls_f_wilcoxon_rank_sum performs the Wilcoxon rank sum test for identical population distribution functions. The Wilcoxon test is a linear transformation of the Mann-Whitney U test. If the difference between the two populations can be attributed solely to a difference in location, then the Wilcoxon test becomes a test of equality of the population means (or medians) and is the nonparametric equivalent of the two-sample t-test.
Function imsls_f_wilcoxon_rank_sum obtains ranks in the combined sample after first eliminating missing values from the data. The rank sum statistic is then computed as the sum of the ranks in the x1 sample. Three methods for handling ties are used. (A tie is counted when two observations are within fuzz of each other.) Method 1 uses the largest possible rank for tied observations in the smallest sample, while Method 2 uses the smallest possible rank for these observations. Thus, the range of possible rank sums is obtained.
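The range of possible rank sums described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the library's algorithm: two values are treated as tied when |a − b| ≤ fuzz, tied values are assumed to form clean groups (always true for fuzz = 0), and missing values are assumed to have been removed already. The function name rank_sum_range is hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Smallest and largest rank sums of sample x1 over all ways of ordering
 * observations that are tied between the two samples.
 * ASSUMPTION: a tie means |a - b| <= fuzz. */
void rank_sum_range(const float *x1, int n1, const float *x2, int n2,
                    float fuzz, double *w_min, double *w_max)
{
    double lo = 0.0, extra = 0.0;
    assert(n1 > 0 && n2 > 0 && fuzz >= 0.0f);
    for (int i = 0; i < n1; i++) {
        int below = 0, tied_x_before = 0, tied_y = 0;
        for (int j = 0; j < n1; j++) {
            if (j == i)
                continue;
            if (fabsf(x1[i] - x1[j]) <= fuzz) {
                if (j < i)
                    tied_x_before++;  /* ties inside x1 take consecutive ranks */
            } else if (x1[j] < x1[i]) {
                below++;
            }
        }
        for (int j = 0; j < n2; j++) {
            if (fabsf(x1[i] - x2[j]) <= fuzz)
                tied_y++;             /* x2 values this x1 value can rank above */
            else if (x2[j] < x1[i])
                below++;
        }
        lo += 1 + below + tied_x_before;  /* x1 placed before tied x2 values */
        extra += tied_y;                  /* x1 placed after tied x2 values  */
    }
    *w_min = lo;
    *w_max = lo + extra;
}
```

For the samples used in the examples on this page, with fuzz = 0, the sketch gives a smallest rank sum of 34 and a largest of 35.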
Method 3 for handling tied observations between samples uses the average rank of the tied observations. Asymptotic standard normal scores are computed for the W score (based on a variance that has been adjusted for ties) when average ranks are used (see Conover 1980, p. 217), and the probability associated with the two-sided alternative is computed.
In each of the following tests, the first line gives the hypothesis (and its alternative) under assumptions 1 to 3 below, while the second line gives the hypothesis when assumption 4 is also true. The rejection region is the same for both hypotheses and is given in terms of Method 3 for handling ties. If another method for handling ties is desired, use a different output statistic (stat[0] or stat[3]).
Test | Null Hypothesis | Alternative Hypothesis | Action |
1 | H0: Pr(x1 < x2) = 0.5 | H1: Pr(x1 < x2) ≠ 0.5 | Reject if stat[9] is less than the significance level of the test. Alternatively, reject the null hypothesis if stat[6] is too large or too small. |
  | H0: E(x1) = E(x2) | H1: E(x1) ≠ E(x2) | |
2 | H0: Pr(x1 < x2) ≤ 0.5 | H1: Pr(x1 < x2) > 0.5 | Reject if stat[6] is too small. |
  | H0: E(x1) ≥ E(x2) | H1: E(x1) < E(x2) | |
3 | H0: Pr(x1 < x2) ≥ 0.5 | H1: Pr(x1 < x2) < 0.5 | Reject if stat[6] is too large. |
  | H0: E(x1) ≤ E(x2) | H1: E(x1) > E(x2) | |
1. Arguments x1 and x2 contain random samples from their respective populations.
2. All observations are mutually independent.
3. The measurement scale is at least ordinal (i.e., an ordering less than, greater than, or equal to exists among the observations).
4. If f(x) and g(y) are the distribution functions of x and y, then g(y) = f(x + c) for some constant c (i.e., the distribution of y is, at worst, a translation of the distribution of x).
The p-value is calculated using the large-sample normal approximation. This approximate calculation is only valid when the size of one or both samples is greater than 50. For smaller samples, see the exact tables for the Wilcoxon Rank Sum Test.
The following example is taken from Conover (1980, p. 224). It involves the mixing time of two mixing machines using a total of 10 batches of a certain kind of batter, five batches for each machine. The null hypothesis is not rejected at the 5-percent level of significance. The warning error is always printed when one or more ties are detected, unless printing for warning errors is turned off. See function imsls_error_options (Chapter 15, “Utilities”).
#include <imsls.h>
#include <stdio.h>

int main(void)
{
    int n1_observations = 5;
    int n2_observations = 5;
    float x1[5] = {7.3, 6.9, 7.2, 7.8, 7.2};
    float x2[5] = {7.4, 6.8, 6.9, 6.7, 7.1};
    float p_value;

    /* Default call: returns the two-sided p-value (average ranks for ties). */
    p_value = imsls_f_wilcoxon_rank_sum(n1_observations, x1,
                                        n2_observations, x2, 0);
    printf("p-value = %11.4f\n", p_value);
    return 0;
}
*** WARNING Error IMSLS_AT_LEAST_ONE_TIE from imsls_f_wilcoxon_rank_sum.
***         At least one tie is detected between the samples.

p-value =      0.1412
The following example uses the same data as the previous example. Now, all the statistics are output in the array stat.
#include <imsls.h>

int main(void)
{
    int n1_observations = 5;
    int n2_observations = 5;
    float x1[5] = {7.3, 6.9, 7.2, 7.8, 7.2};
    float x2[5] = {7.4, 6.8, 6.9, 6.7, 7.1};
    float *stat;
    /* One label per row of stat, in the order documented for IMSLS_STAT. */
    char *labels[10] = {
        "Wilcoxon W statistic ......................",
        "2*E(W) - W ................................",
        "p-value ...................................",
        "Adjusted Wilcoxon statistic ...............",
        "Adjusted 2*E(W) - W .......................",
        "Adjusted p-value ..........................",
        "W statistics for averaged ranks............",
        "Standard error of W (averaged ranks) ......",
        "Standard normal score of W (averaged ranks)",
        "Two-sided p-value of W (averaged ranks ...."};

    /* Request all ten statistics; stat is allocated by the library. */
    imsls_f_wilcoxon_rank_sum(n1_observations, x1,
                              n2_observations, x2,
                              IMSLS_STAT, &stat,
                              0);

    imsls_f_write_matrix("statistics", 10, 1, stat,
                         IMSLS_ROW_LABELS, labels,
                         IMSLS_WRITE_FORMAT, "%7.3f",
                         0);
    return 0;
}
*** WARNING Error IMSLS_AT_LEAST_ONE_TIE from imsls_f_wilcoxon_rank_sum.
***         At least one tie is detected between the samples.

                     statistics
Wilcoxon W statistic ......................   34.000
2*E(W) - W ................................   21.000
p-value ...................................    0.110
Adjusted Wilcoxon statistic ...............   35.000
Adjusted 2*E(W) - W .......................   20.000
Adjusted p-value ..........................    0.075
W statistics for averaged ranks............   34.500
Standard error of W (averaged ranks) ......    4.758
Standard normal score of W (averaged ranks)    1.471
Two-sided p-value of W (averaged ranks ....    0.141
IMSLS_NOBSX_NOBSY_TOO_SMALL “n1_observations” = # and “n2_observations” = #. Both sample sizes, “n1_observations” and “n2_observations”, are less than 25. Significance levels should be obtained from tabled values.
IMSLS_AT_LEAST_ONE_TIE At least one tie is detected between the samples.
IMSLS_ALL_X_Y_MISSING Each element of “x1” and/or “x2” is a missing (NaN, Not a Number) value.