CTRAN

Performs generalized Mantel‑Haenszel tests in a stratified contingency table.

Required Arguments

NCLVAL — Vector of length NCLVAR containing, in its i‑th element, the number of levels (categories) of the i‑th classification variable. (Input)

TABLE — Vector of length NCLVAL(1) * NCLVAL(2) * … * NCLVAL(NCLVAR) containing the entries in the cells of the table to be fit. (Input)
See Comment 3 for the ordering of the elements in TABLE. For the classification variables specified in INDROW and INDCOL, a series of two‑dimensional contingency tables is obtained from the elements in TABLE. All other classification variables are stratification variables.

INDROW — Index of the classification variable to be used for the row variable in the stratified two‑dimensional table. (Input)

INDCOL — Index of the classification variable to be used for the column variable in the stratified two‑dimensional table. (Input)

ROWSCR — Vector of length NCLVAL(INDCOL) containing the score associated with each column; these scores are used when computing statistics in each row. (Input, if IROWSC = 0; output, otherwise) ROWSCR is not used if ITYPE = 1, in which case it may be dimensioned with length 1 in the calling program. If IROWSC is 3, 4, or 5, then ROWSCR contains the scores used in the last contingency table analyzed.

COLSCR — Vector of length NCLVAL(INDROW) containing the score associated with each row; these scores are used when computing statistics in each column. (Input, if ICOLSC = 0; output, otherwise) COLSCR is not used if ITYPE is not 3, in which case it may be dimensioned with length 1 in the calling program. If ICOLSC is 3, 4, or 5, then COLSCR contains the scores used in the last contingency table analyzed.

STAT — Table of size m by 3 containing the Mantel‑Haenszel statistics. (Output)
Here m is one plus the number of stratified tables, that is, m = 1 + NCLVAL(1) * NCLVAL(2) * … * NCLVAL(NCLVAR)/(NCLVAL(INDROW) * NCLVAL(INDCOL)). The first column of STAT contains the chi‑squared statistic for a test of partial association, the second column contains its degrees of freedom, and the third column contains the probability of a greater chi‑squared. The first m ‑ 1 rows of STAT contain the statistics computed for each of the stratified tables. The first row corresponds to stratification variable levels (1, 1, …, 1), the second row to levels (1, 1, …, 2), and so on, so that in row m ‑ 1 all stratification variables are at their highest levels. The last row of STAT contains the same statistics pooled over all of the stratified tables.
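The row count of STAT can be computed before STAT is declared or allocated. The following is a minimal sketch, not part of the library: the module name ctran_stat_size and the function stat_rows are hypothetical helpers introduced only for illustration.

```fortran
! Illustrative helper (hypothetical, not an IMSL routine): number of rows
! needed in STAT for a given set of classification variables.
module ctran_stat_size
  implicit none
contains
  ! Returns m = 1 + (product of all NCLVAL) / (NCLVAL(INDROW)*NCLVAL(INDCOL)),
  ! that is, one row per stratified table plus one row for the pooled test.
  pure function stat_rows(nclval, indrow, indcol) result(m)
    integer, intent(in) :: nclval(:)   ! levels of each classification variable
    integer, intent(in) :: indrow      ! index of the row variable
    integer, intent(in) :: indcol      ! index of the column variable
    integer :: m
    m = 1 + product(nclval) / (nclval(indrow) * nclval(indcol))
  end function stat_rows
end module ctran_stat_size
```

For the example at the end of this section, NCLVAL = (3, 4, 4), INDROW = 3, and INDCOL = 1 give m = 1 + 48/12 = 5, which agrees with LDSTAT = 5 in that program.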

Optional Arguments

NCLVAR — Number of classification variables. (Input)
Default: NCLVAR = size (NCLVAL,1).

ITYPE — The type of statistic to compute. (Input)
Default: ITYPE = 1.

ITYPE	Statistic
1	Generalized Mantel‑Haenszel based upon the two‑dimensional contingency tables.
2	Generalized Mantel‑Haenszel based upon the row mean score in the two‑dimensional table.
3	Generalized Mantel‑Haenszel based upon the correlation score for the two‑dimensional tables.

IROWSC — Option parameter giving the scores associated with the column index to be used when computing statistics in each row. (Input)
Default: IROWSC = 0.

IROWSC	Weights
0	User specified (or no) weights.
1	The digits 1, 2, …, NCLVAL(INDCOL).
2	Combined (over all tables) ridit‑type scores.
3	Rank scores computed separately for each table.
4	Ridit‑type scores computed separately for each table.
5	Logrank scores computed separately for each table.

IROWSC is not used if ITYPE = 1.

ICOLSC — Option parameter giving the scores associated with the row index to be used when computing statistics in each column. (Input)
Default: ICOLSC = 0.

ICOLSC	Weights
0	User specified (or no) weights.
1	The digits 1, 2, …, NCLVAL(INDROW).
2	Combined (over all tables) ridit‑type scores.
3	Rank scores computed separately for each table.
4	Ridit‑type scores computed separately for each table.
5	Logrank scores computed separately for each table.

ICOLSC is not used if ITYPE is not 3.

IPRINT — Print option. (Input)
Default: IPRINT = 0.

IPRINT	Action
0	No printing.
1	Print the contents of the STAT array.
2	Print each stratified table followed by the contents of the STAT array.

LDSTAT — Leading dimension of STAT exactly as specified in the dimension statement of the calling program. (Input)
Default: LDSTAT = size (STAT,1).

FORTRAN 90 Interface

Generic: CALL CTRAN (NCLVAL, TABLE, INDROW, INDCOL, ROWSCR, COLSCR, STAT [, …])

Specific: The specific interface names are S_CTRAN and D_CTRAN.

FORTRAN 77 Interface

Single: CALL CTRAN (NCLVAR, NCLVAL, TABLE, INDROW, INDCOL, ITYPE, IROWSC, ICOLSC, IPRINT, ROWSCR, COLSCR, STAT, LDSTAT)

Double: The double precision name is DCTRAN.

Description

Routine CTRAN computes tests of partial association (a test of homogeneity, a test on means, and a test on correlations) in stratified two‑dimensional contingency tables. The type of test computed depends upon parameter ITYPE. All tests are generalizations of the Mantel‑Haenszel stratified 2 × 2 contingency table test statistic in the sense that information is “pooled” over all tables without increasing the total degrees of freedom in the test. Like the Mantel‑Haenszel test, if all tables violate the null hypothesis in the same direction, the tests computed here are more powerful than most other tests of the same null hypothesis.

While CTRAN allows for an arbitrary number of classification variables, only three are required to describe the test statistics since all stratification variables could be (if desired) lumped into a single classification variable. Because of this, only three classification variables are discussed here. Let $f_{ijk}$ denote the frequency in cell ij of stratum k, where k = 1, …, m, i = 1, …, r, and j = 1, …, c. Then, the input data can be described as a series of contingency tables. For example, if r = c = 2, so that 2 × 2 tables are used, then we would have:

f_{111}	f_{121}		f_{112}	f_{122}	…	f_{11m}	f_{12m}
f_{211}	f_{221}		f_{212}	f_{222}	…	f_{21m}	f_{22m}

All tests are computed as follows: For each table, a test statistic vector $x_k$ with estimated covariance matrix $\hat{\Sigma}_k$ is computed. The test statistic vector $x_k$ represents the mean difference (from the null hypothesis) for the test being computed. Thus, if ITYPE = 1, $x_k$ is a vector of cell frequencies minus their expected values under the hypothesis of homogeneity, while if ITYPE = 2, $x_k$ is a vector containing the row means (based upon the row scores) for the elements in a row of a table minus the estimated mean for the table (estimated under the assumption that all means are equal). Finally, if ITYPE = 3, $x_k$ is a vector of length 1 containing an estimated correlation coefficient computed between the row and column scores.

Note that for nominal data in both the rows and columns, one would generally use ITYPE = 1 while if an ordering (and scores) make sense for each row of a table, ITYPE = 2 would be used. If an ordering (and scores) make sense for both the rows and the columns of a table, then a correlation measure (ITYPE = 3) is appropriate.

Test statistics for each table are computed as

$$X_k^2 = x_k^T \hat{\Sigma}_k^{-1} x_k$$

where $\hat{\Sigma}_k$ is the estimated covariance matrix of $x_k$. This statistic has (r ‑ 1)(c ‑ 1) degrees of freedom when ITYPE = 1, r ‑ 1 when ITYPE = 2, and 1 when ITYPE = 3. While these test statistics could be combined by summing them over all tables (yielding an $X^2$ test with m times the degrees of freedom of a single table), the Mantel‑Haenszel test combines the scores in a different way. Let

$$x = \sum_{k=1}^{m} x_k, \qquad \hat{\Sigma} = \sum_{k=1}^{m} \hat{\Sigma}_k$$

Then, an overall $X^2$ may be computed as

$$X^2 = x^T \hat{\Sigma}^{-1} x$$

This test statistic has the same degrees of freedom as the test statistic computed for a single stratum of the three‑way table and is reported in the last row of STAT. Routine CTRAN uses simplified computational methods. See Landis, Cooper, Kennedy, and Koch (1979) for details.
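The pooling idea is easiest to see in the 2 × 2 case, where the test statistic for each table is a single number. The sketch below sums the observed-minus-expected count of cell (1, 1) and its variance over all tables and forms one chi-squared value with 1 degree of freedom. It is an illustration only and not the method used inside CTRAN: the module and function names are hypothetical, and the variance is assumed to be the standard hypergeometric variance of the classical Mantel‑Haenszel test (without continuity correction), which is not stated in this document.

```fortran
! Hypothetical sketch of the classical (2 x 2) Mantel-Haenszel pooling.
! Assumption: hypergeometric variance r1*r2*c1*c2 / (n**2 * (n - 1)).
module mh_2x2_sketch
  implicit none
contains
  ! f(i,j,k) holds the frequency in row i, column j of table k (2 x 2 x m).
  ! The caller must ensure at least one table has a nonzero variance.
  pure function mh_chi2(f) result(chi2)
    real, intent(in) :: f(:,:,:)
    real :: chi2, xsum, vsum, n, r1, r2, c1, c2
    integer :: k
    xsum = 0.0
    vsum = 0.0
    do k = 1, size(f, 3)
      n = sum(f(:,:,k))
      if (n <= 1.0) cycle              ! skip tables whose total is 0 or 1
      r1 = f(1,1,k) + f(1,2,k)         ! row marginals
      r2 = n - r1
      c1 = f(1,1,k) + f(2,1,k)         ! column marginals
      c2 = n - c1
      xsum = xsum + (f(1,1,k) - r1*c1/n)            ! observed minus expected
      vsum = vsum + r1*r2*c1*c2 / (n*n*(n - 1.0))   ! variance of cell (1,1)
    end do
    chi2 = xsum**2 / vsum              ! pooled statistic, 1 degree of freedom
  end function mh_chi2
end module mh_2x2_sketch
```

Because the differences are summed before squaring, tables that deviate from the null hypothesis in the same direction reinforce one another, while the degrees of freedom stay at 1 no matter how many tables are pooled.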

Landis, Cooper, Kennedy, and Koch (1979, page 225) give the null hypothesis for a test of partial association as follows (paraphrased):

H0 : For each of the separate tables, the data in the respective rows of the table can be regarded as a successive set of simple random samples from a fixed population corresponding to the column marginal totals for the table.

All three tests above are tests of partial association.

For ITYPE = 2 and ITYPE = 3, several choices of row scores (and, for ITYPE = 3, column scores) are available for computing measures of location and association. The scores used by CTRAN for the rows are:

1. For IROWSC = 0, the user supplies the scores to be used in each row of the table.

2. For IROWSC = 1, uniform scores are used. These scores consist of the digits 1, 2, …, c where c is the number of columns in each table.

3. For IROWSC = 2, combined ridit scores are used. A combined ridit score is computed by summing the column marginals over all tables. The combined row score for the j‑th column is then computed as the sum of the initial j ‑ 1 column marginals plus half of the j‑th column marginal. The result is divided by the number of observations in all tables to yield the ridit score.

4. For IROWSC = 3, marginal rank scores are used. The j‑th marginal rank score is computed for each table from the column marginals for that table as the sum of the initial j ‑ 1 column marginals plus half the j‑th column marginal.

5. For IROWSC = 4, marginal ridit scores are used. These are computed as the marginal rank scores divided by the total frequency in the table.

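6. For IROWSC = 5, logrank scores are used. These are computed as

$$s_{jk} = 1 - \sum_{l=1}^{j} \frac{f_{+lk}}{\sum_{i=l}^{c} f_{+ik}}$$

where $s_{jk}$ is the score for column j in table k and $f_{+lk}$ is the column marginal for column l in table k.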

Column scores are computed in a similar manner.
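The rank and ridit scores described in items 3 through 5 can be sketched directly from the column marginals. The module below is a hypothetical illustration (the names ctran_scores_sketch, rank_scores, and ridit_scores are not library routines). Passing the marginals of one table gives the scores for IROWSC = 3 and 4; passing the marginals summed over all tables to ridit_scores gives the combined ridit scores of IROWSC = 2.

```fortran
! Hypothetical helpers illustrating the marginal rank and ridit scores.
module ctran_scores_sketch
  implicit none
contains
  ! Rank score for column j: sum of the first j-1 marginals
  ! plus half of the j-th marginal.
  pure function rank_scores(marg) result(s)
    real, intent(in) :: marg(:)        ! column marginals
    real :: s(size(marg))
    real :: cum
    integer :: j
    cum = 0.0
    do j = 1, size(marg)
      s(j) = cum + 0.5*marg(j)
      cum = cum + marg(j)              ! running total of marginals so far
    end do
  end function rank_scores

  ! Ridit scores: the rank scores divided by the total frequency.
  pure function ridit_scores(marg) result(s)
    real, intent(in) :: marg(:)
    real :: s(size(marg))
    s = rank_scores(marg) / sum(marg)
  end function ridit_scores
end module ctran_scores_sketch
```

For instance, the first stratified table in the example at the end of this section has column marginals 90, 40, and 18, so its marginal rank scores are 45, 110, and 139, and its marginal ridit scores are these values divided by 148.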

Comments

1. Workspace may be explicitly provided, if desired, by use of C2RAN/DC2RAN. The reference is:

CALL C2RAN (NCLVAR, NCLVAL, TABLE, INDROW, INDCOL, ITYPE, IROWSC, ICOLSC, IPRINT, ROWSCR, COLSCR, STAT, LDSTAT, IX, F, COLSUM, ROWSUM, DIFVEC, DIFSUM, COV, COVSUM, AWK, BWK)

The additional arguments are as follows:

IX — Work array of length NCLVAR.

F — Work array of length NCLVAL(INDROW) * NCLVAL(INDCOL).

COLSUM — Work array of length NCLVAL(INDCOL).

ROWSUM — Work array of length NCLVAL(INDROW).

DIFVEC — Work array. If ITYPE = 1, the length is (NCLVAL(INDROW) ‑ 1) * (NCLVAL(INDCOL) ‑ 1). For ITYPE = 2, the length is NCLVAL(INDROW). For ITYPE = 3, DIFVEC is not used and may be of length 1.

DIFSUM — Work array. If ITYPE = 1, the length is (NCLVAL(INDROW) ‑ 1) * (NCLVAL(INDCOL) ‑ 1). DIFSUM contains the sum of the tables containing the observed minus expected frequencies (excluding the last row and column of each table). For ITYPE = 2, the length is NCLVAL(INDROW). DIFSUM contains the sum of the table row mean scores minus their expected value. For ITYPE = 3, the length is 1. DIFSUM contains the sum of the table correlations between the row and column mean scores. (Output)

COV — Work array. If ITYPE = 1, the length is (NCLVAL(INDROW) ‑ 1)**2 * (NCLVAL(INDCOL) ‑ 1)**2. For ITYPE = 2, the length is NCLVAL(INDROW)**2. For ITYPE = 3, COV is not used and may be of length 1.

COVSUM — Work array. If ITYPE = 1, the length is (NCLVAL(INDROW) ‑ 1)**2 * (NCLVAL(INDCOL) ‑ 1)**2. For ITYPE = 2, the length is NCLVAL(INDROW)**2. For ITYPE = 3, the length is 1.

AWK — Work array. If ITYPE = 1, the length is (NCLVAL(INDROW) ‑ 1)**2. For ITYPE = 2, the length is NCLVAL(INDROW). For ITYPE = 3, AWK is not used and may be of length 1.

BWK — Work array. If ITYPE = 1, the length is (NCLVAL(INDCOL) ‑ 1)**2. For ITYPE = 2 or 3, BWK is not used and may be of length 1.
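The ITYPE‑dependent work array lengths listed above can be gathered in one place before calling C2RAN. The following is a sketch under the assumption that the trailing 2 in the length formulas denotes squaring; the module and subroutine names are hypothetical and not part of the library. DIFSUM and COVSUM need the same lengths as DIFVEC and COV, respectively.

```fortran
! Hypothetical helper: lengths of the ITYPE-dependent C2RAN work arrays.
module c2ran_workspace_sketch
  implicit none
contains
  pure subroutine work_lengths(r, c, itype, ldifvec, lcov, lawk, lbwk)
    integer, intent(in)  :: r       ! NCLVAL(INDROW)
    integer, intent(in)  :: c       ! NCLVAL(INDCOL)
    integer, intent(in)  :: itype   ! 1, 2, or 3
    integer, intent(out) :: ldifvec, lcov, lawk, lbwk
    select case (itype)
    case (1)
      ldifvec = (r - 1)*(c - 1)
      lcov    = (r - 1)**2 * (c - 1)**2
      lawk    = (r - 1)**2
      lbwk    = (c - 1)**2
    case (2)
      ldifvec = r
      lcov    = r**2
      lawk    = r
      lbwk    = 1                   ! BWK is not used
    case default                    ! ITYPE = 3: length 1 suffices throughout
      ldifvec = 1
      lcov    = 1
      lawk    = 1
      lbwk    = 1
    end select
  end subroutine work_lengths
end module c2ran_workspace_sketch
```

The remaining work arrays do not depend on ITYPE: IX has length NCLVAR, F has length NCLVAL(INDROW) * NCLVAL(INDCOL), COLSUM has length NCLVAL(INDCOL), and ROWSUM has length NCLVAL(INDROW).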

2. Informational errors

Type	Code	Description
3	1	All frequencies of stratified table K are zero. This table will be excluded from the Mantel‑Haenszel test statistic.
3	2	The elements of stratified table K sum to one. This table will be excluded from the Mantel‑Haenszel test statistic.
3	3	The variance of the response variable for stratified table K is zero.
3	4	The variance of either the sub‑population or the response variable is zero for stratified table K.
3	5	The label for table K exceeds the buffer limit of 72.

Here, K is an integer that is greater than or equal to one and less than or equal to the number of stratified contingency tables.

3. The cells of the vector TABLE are sequenced so that the first variable cycles from 1 to NCLVAL(1) the slowest, the second variable cycles from 1 to NCLVAL(2) the next most slowly, and so on, up to the NCLVAR‑th variable, which cycles from 1 to NCLVAL(NCLVAR) the fastest.

Example: For NCLVAR = 3, NCLVAL(1) = 2, NCLVAL(2) = 3, and NCLVAL(3) = 2 the cells of table X(I, J, K) are entered into TABLE(1) through TABLE(12) in the following order: X(1, 1, 1), X(1, 1, 2), X(1, 2, 1), X(1, 2, 2), X(1, 3, 1), X(1, 3, 2), X(2, 1, 1), X(2, 1, 2), X(2, 2, 1), X(2, 2, 2), X(2, 3, 1), X(2, 3, 2).
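Because this ordering is the reverse of Fortran's native array element order (in which the first subscript varies fastest), it can help to compute the position of a cell in TABLE explicitly. The function below is a minimal sketch; the names ctran_table_index and table_index are hypothetical and not part of the library.

```fortran
! Hypothetical helper: position in TABLE of the cell with the given levels.
module ctran_table_index
  implicit none
contains
  ! Returns the 1-based index into TABLE, with the first classification
  ! variable cycling slowest and the last cycling fastest.
  pure function table_index(nclval, levels) result(idx)
    integer, intent(in) :: nclval(:)   ! number of levels of each variable
    integer, intent(in) :: levels(:)   ! level of each variable (1-based)
    integer :: idx, i
    idx = 0
    do i = 1, size(nclval)
      idx = idx*nclval(i) + (levels(i) - 1)
    end do
    idx = idx + 1
  end function table_index
end module ctran_table_index
```

With NCLVAL = (2, 3, 2) as in the example above, X(1, 2, 1) maps to TABLE(3) and X(2, 3, 2) maps to TABLE(12), in agreement with the listed order.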

Example

In the following example, all three values of ITYPE are used in computing the partial association statistics. This is accomplished via three calls to CTRAN, with the value of ITYPE changing on each call. The example is taken from Landis, Cooper, Kennedy, and Koch (1979, page 241). Uniform scores are used for the rows and the columns, as required by the type of test. The results indicate the presence of association between the row and column variables.

      USE CTRAN_INT

      IMPLICIT   NONE
      INTEGER    ICOLSC, INDCOL, INDROW, IROWSC, LDSTAT, NCLVAR
!                                 Uniform scores (1, 2, ...) for rows and columns
      PARAMETER  (ICOLSC=1, INDCOL=1, INDROW=3, IROWSC=1, LDSTAT=5, &
                  NCLVAR=3)
!
      INTEGER    IPRINT, ITYPE, NCLVAL(NCLVAR)
      REAL       COLSCR(4), ROWSCR(3), STAT(LDSTAT,3), TABLE(48)
!
      DATA TABLE/23, 23, 20, 24, 18, 18, 13, 9, 8, 12, 11, 7, 12, 15, &
           14, 13, 7, 10, 13, 10, 6, 6, 13, 15, 6, 4, 6, 7, 9, 3, 8, &
           6, 2, 5, 5, 6, 1, 2, 2, 2, 3, 4, 2, 4, 1, 2, 3, 4/
      DATA NCLVAL/3, 4, 4/
!                                 Print the stratified tables on the first call only
      IPRINT = 2
      DO 10  ITYPE=1, 3
         CALL CTRAN (NCLVAL, TABLE, INDROW, INDCOL, &
                     ROWSCR, COLSCR, STAT, ITYPE=ITYPE, &
                     IROWSC=IROWSC, ICOLSC=ICOLSC, IPRINT=IPRINT)
         IPRINT = 1
   10 CONTINUE
      END

Output

Values for the class variables are defined to be:

      Variable     Number of Levels
       1   A               3
       2   B               4
       3   C               4
----------
Strata 1: B = 1

          C (row) by A (column)
             1        2        3
    1    23.00     7.00     2.00
    2    23.00    10.00     5.00
    3    20.00    13.00     5.00
    4    24.00    10.00     6.00
----------
Strata 2: B = 2

          C (row) by A (column)
             1        2        3
    1    18.00     6.00     1.00
    2    18.00     6.00     2.00
    3    13.00    13.00     2.00
    4     9.00    15.00     2.00
----------
Strata 3: B = 3

          C (row) by A (column)
             1        2        3
    1     8.00     6.00     3.00
    2    12.00     4.00     4.00
    3    11.00     6.00     2.00
    4     7.00     7.00     4.00
----------
Strata 4: B = 4

          C (row) by A (column)
             1        2        3
    1    12.00     9.00     1.00
    2    15.00     3.00     2.00
    3    14.00     8.00     3.00
    4    13.00     6.00     4.00

Test of independence between row and column variables
                                Degrees of
         Strata   Chi-Squared      Freedom   Probability
              1           3.4            6        0.7575
              2          10.8            6        0.0942
              3           3.1            6        0.7987
              4           5.2            6        0.5177

                                Degrees of
                  Chi-Squared      Freedom   Probability
Mantel-Haenszel          10.6            6        0.1016

Test of equality of location for rows given column scores
                                Degrees of
         Strata   Chi-Squared      Freedom   Probability
              1          2.62            3        0.4536
              2          7.34            3        0.0617
              3          1.69            3        0.6381
              4          1.68            3        0.6420

                                Degrees of
                  Chi-Squared      Freedom   Probability
Mantel-Haenszel          6.59            3       0.08618

      Row Scores
        1        2        3
    1.000    2.000    3.000

Test of correlation given row and column scores
                                Degrees of
         Strata   Chi-Squared      Freedom   Probability
              1          1.57            1        0.2105
              2          7.06            1        0.0079
              3          0.16            1        0.6927
              4          0.66            1        0.4161

                                Degrees of
                  Chi-Squared      Freedom   Probability
Mantel-Haenszel          6.34            1        0.0118

      Row Scores
        1        2        3
    1.000    2.000    3.000

      Column Scores
        1        2        3        4
    1.000    2.000    3.000    4.000