INCLD
Performs an includance test.
Required Arguments
X — Vector of length NOBSX containing the data for the first sample. (Input)
Y — Vector of length NOBSY containing the data for the second sample. (Input)
ILX — Index of the element in the ordered first sample to be used as the low endpoint of the range considered. (Input)
ILX must be greater than zero and less than IHX.
IHX — Index of the element in the ordered first sample to be used as the high endpoint of the range considered. (Input)
IHX must be greater than ILX and less than or equal to NOBSX.
FUZZ — Value used to determine ties. (Input)
If a second sample element is within FUZZ of the ILX or IHX order statistics in the first sample, a tie will be counted.
STAT — Vector of length 4 containing the statistics. (Output)
In the description below, (X(ILX), X(IHX)) is the interval from the ILX ordered first sample value to the IHX ordered first sample value (i.e., from the ILX to the IHX order statistics in the first sample).
I | STAT(I) |
---|---|
1 | Number of ties detected. |
2 | Number of untied elements in the second sample that are outside the interval (X(ILX), X(IHX)). |
3 | Probability of STAT(2) or more second sample elements lying outside (X(ILX), X(IHX)). |
4 | Probability of STAT(1) + STAT(2) or more elements in the second sample lying outside (X(ILX), X(IHX)). |
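As an illustration of how STAT(1) and STAT(2) relate to the data, the following sketch counts ties and untied outside elements for the data used in the Example section. It is not the library's implementation. The program name, the local variable names, the inclusive comparison against FUZZ, and the shortcut of taking X as already sorted are all assumptions made for the sketch.

```fortran
program count_outside
  implicit none
  integer, parameter :: nobsx = 12, nobsy = 15
  integer, parameter :: ilx = 1, ihx = 12
  real, parameter :: fuzz = 0.0001
  real :: x(nobsx), y(nobsy), xlo, xhi
  integer :: j, nties, nout

  ! X is assumed to be sorted already, so X(ILX) and X(IHX) are the
  ! order statistics. A general version would sort a copy of X first.
  x = (/ (real(j), j = 1, nobsx) /)
  y = (/ 0., 0., 0., 0., 0., 0., 0., 2., 2., 2., 2., 2., 2., 2., 2. /)
  xlo = x(ilx)
  xhi = x(ihx)

  nties = 0
  nout = 0
  do j = 1, nobsy
    if (abs(y(j) - xlo) <= fuzz .or. abs(y(j) - xhi) <= fuzz) then
      ! Assumed tie rule: within FUZZ of either endpoint
      nties = nties + 1
    else if (y(j) < xlo .or. y(j) > xhi) then
      ! Untied and outside the interval
      nout = nout + 1
    end if
  end do

  print '(a, i0)', 'ties    = ', nties
  print '(a, i0)', 'outside = ', nout
end program count_outside
```

For these data the sketch reports 0 ties and 7 elements outside the interval, which agrees with the first two values printed in the Output section.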
Optional Arguments
NOBSX — Number of observations in the first sample. (Input)
Default: NOBSX = size (X,1).
NOBSY — Number of observations in the second sample. (Input)
Default: NOBSY = size (Y,1).
NMISSX — Number of missing (NaN, not a number) values in X. (Output)
NMISSY — Number of missing (NaN, not a number) values in Y. (Output)
FORTRAN 90 Interface
Generic: CALL INCLD (X, Y, ILX, IHX, FUZZ, STAT [, …])
Specific: The specific interface names are S_INCLD and D_INCLD.
FORTRAN 77 Interface
Single: CALL INCLD (NOBSX, X, NOBSY, Y, ILX, IHX, FUZZ, STAT, NMISSX, NMISSY)
Double: The double precision name is DINCLD.
Description
Routine INCLD tests that an equal proportion of two populations lies between the ILX and IHX order statistics of the first sample, and that the densities are equal at the two points. Let Xil and Xih denote the two order statistics in the first sample, where l = ILX and h = IHX. Then, the probability of exactly i observations in the second sample being outside of the interval (Xil, Xih) is hypergeometric and is given by
\[ \Pr(i) = \frac{\binom{h-l+N-i-1}{N-i}\,\binom{M-h+l+i}{i}}{\binom{M+N}{N}} \]
where M is the sample size in the first sample (NOBSX - NMISSX), N is the sample size in the second sample (NOBSY - NMISSY), and
\[ \binom{r}{k} \]
denotes a binomial coefficient. The probability of b or fewer observations in the second sample being outside the interval is given by
\[ p = \sum_{i=0}^{b} \Pr(i) \]
Use of this test requires that the sample sizes, ILX, and IHX be set prior to sampling or viewing the data. Ties do not present any special problems except when they occur at the interval endpoints Xil and Xih. When this occurs for the first sample, no action is taken, but an informative warning message is issued. When a second sample observation is within FUZZ of an endpoint, a tie is counted in STAT(1) and a warning message is again issued. In this case, STAT(3) and STAT(4) can be considered as bounds for the actual probability.
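The tail probability described above can be cross-checked by hand. The sketch below is an independent calculation, not the library's code. It assumes the hypergeometric form Pr(i) = C(h-l+N-i-1, N-i) C(M-h+l+i, i) / C(M+N, N) and uses the values from the Example section (M = 12, N = 15, l = 1, h = 12, with 7 untied elements outside the interval). The names includance_tail, binom, and prob are hypothetical helpers introduced for illustration.

```fortran
program includance_tail
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 12, n = 15   ! non-missing sample sizes
  integer, parameter :: l = 1, h = 12    ! ILX and IHX
  integer, parameter :: nout = 7         ! observed untied count outside
  real(dp) :: tail
  integer :: i

  ! Sum Pr(i) over i = nout, ..., n: probability of nout or more outside
  tail = 0.0_dp
  do i = nout, n
    tail = tail + prob(i)
  end do
  print '(a, f6.3)', 'Pr(7 or more outside) = ', tail

contains

  ! Binomial coefficient C(a, b) as a real, built as a running product
  function binom(a, b) result(c)
    integer, intent(in) :: a, b
    real(dp) :: c
    integer :: k
    c = 0.0_dp
    if (b < 0 .or. b > a) return
    c = 1.0_dp
    do k = 1, b
      c = c * real(a - b + k, dp) / real(k, dp)
    end do
  end function binom

  ! Assumed hypergeometric probability of exactly i elements outside
  function prob(i) result(p)
    integer, intent(in) :: i
    real(dp) :: p
    p = binom(h - l + n - i - 1, n - i) * binom(m - h + l + i, i) &
        / binom(m + n, n)
  end function prob

end program includance_tail
```

Under these assumptions the sum is 655044/17383860, about 0.038, which agrees with the p-values printed in the Output section.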
Comments
1. Workspace may be explicitly provided, if desired, by use of I2CLD/DI2CLD. The reference is:
CALL I2CLD (NOBSX, X, NOBSY, Y, ILX, IHX, FUZZ, STAT, NMISSX, NMISSY, WK)
The additional argument is:
WK — Work vector of length NOBSX. If X is not needed, X and WK can share the same storage locations.
2. If ILX = 1 and IHX = NOBSX, INCLD tests the hypothesis that equal proportions of the two populations lie between the endpoints of the first sample.
3. If ILX = (NOBSX + 1)/4 and IHX = 3 * (NOBSX + 1)/4, the first and the third quartile estimates of the first population are being considered. The null hypothesis may be that the first and second samples are drawn from the same population.
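The quartile choice in comment 3 can be sketched as follows for a hypothetical first sample of 11 observations. The use of truncating integer division when NOBSX + 1 is not a multiple of 4 is an assumption of this sketch, as is the program name. The range check simply restates the constraints given for ILX and IHX under Required Arguments.

```fortran
program quartile_indices
  implicit none
  integer, parameter :: nobsx = 11   ! hypothetical sample size
  integer :: ilx, ihx

  ! Integer division truncates. How to round when NOBSX + 1 is not a
  ! multiple of 4 is a choice made here, not one fixed by the text.
  ilx = (nobsx + 1) / 4
  ihx = 3 * (nobsx + 1) / 4

  ! INCLD requires 0 < ILX < IHX <= NOBSX
  if (ilx < 1 .or. ilx >= ihx .or. ihx > nobsx) then
    print '(a)', 'indices out of range for INCLD'
  else
    print '(a, i0, a, i0)', 'ILX = ', ilx, ', IHX = ', ihx
  end if
end program quartile_indices
```

With NOBSX = 11 this gives ILX = 3 and IHX = 9, which could then be passed to INCLD in place of the endpoint indices used in the example.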
Example
The following example, which is an adaptation of a problem in Bradley (1968, page 234), illustrates the use of INCLD to test that equal proportions of two populations lie between the endpoints of the first sample.
USE INCLD_INT
USE WRRRL_INT
IMPLICIT NONE
INTEGER IHX, ILX, NOBSX, NOBSY
REAL FUZZ
PARAMETER (FUZZ=0.0001, IHX=12, ILX=1, NOBSX=12, NOBSY=15)
!
REAL STAT(4), X(NOBSX), Y(NOBSY)
CHARACTER CLABEL(5)*30, RLABEL(1)*4
!
DATA X/1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12/
DATA Y/0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2/
DATA RLABEL/'NONE'/
DATA CLABEL/' ', '%/Number of ties', '%/Number outside', &
'p-value%/untied', 'p-value%/both'/
! Perform includance test
CALL INCLD (X, Y, ILX, IHX, FUZZ, STAT)
! Print results
CALL WRRRL ('STAT', STAT, RLABEL, CLABEL, 1, 4, 1)
!
END
Output
                      STAT
                                p-value  p-value
Number of ties  Number outside   untied     both
         0.000           7.000    0.038    0.038