TRNBL
Computes Turnbull’s generalized Kaplan‑Meier estimates of survival probabilities in samples with interval censoring.
Required Arguments
X — NOBS by NCOL matrix containing the data. (Input)
ILT — For interval‑censored and left‑censored observations, the column number in X that contains the upper endpoint of the failure interval. (Input)
See argument ICEN. If ILT = 0, left‑censored and interval‑censored observations cannot be input.
IRT — For interval‑censored and right‑censored observations, the column number in X that contains the lower endpoint of the failure interval. (Input)
See argument ICEN. IRT must not be zero.
NINTVL — Number of failure intervals found. (Output)
SPROB — NINTVL by 4 matrix. (Output)
Col. |
Description |
1 |
Lower endpoint of the failure interval |
2 |
Upper endpoint of the failure interval |
3 |
Estimated change in the survival probability density within the failure interval |
4 |
Estimate of the survival probability for the interval |
The estimated survival probability is a constant equal to SPROB(i, 4) from SPROB (i, 2) to SPROB(i + 1, 1). The estimated survival probability is 1 prior to SPROB(1, 1). The estimated survival probability is undefined in the interval SPROB(i, 1) to SPROB(i, 2). If the NINTVL‑th interval is from SPROB(NINTVL, 1) to infinity, then SPROB(NINTVL, 2) is set to positive machine infinity.
ALGL — Optimized log‑likelihood for the input data. (Output)
Optional Arguments
NOBS — Number of observations. (Input)
Default: NOBS = size (X,1).
NCOL — Number of columns in X. (Input)
Default: NCOL = size (X,2).
LDX — Leading dimension of X exactly as specified in the dimension statement in the calling program. (Input)
Default: LDX = size (X,1).
IFRQ — Frequency option. (Input)
If IFRQ = 0, a response frequency of 1 for each observation is assumed. For positive IFRQ, column number IFRQ contains the frequency of response for each observation.
Default: IFRQ = 0.
ICEN — Censoring code option. (Input)
Default: ICEN = 0.
If ICEN = 0, a censoring code of 0 is assumed. For positive ICEN, column number ICEN contains the censoring code for each observation. Valid censoring codes are:
Code |
Meaning |
0 |
Exact failure at X(i, IRT). |
1 |
Right censored. The response is greater than X(i, IRT). |
2 |
Left censored. The response is less than or equal to X(i, ILT). |
3 |
Interval censored. The response is greater than X(i, IRT), but less than or equal to X(i, ILT). |
MAXIT — Maximum number of iterations. (Input)
Default: MAXIT = 30.
EPS — Convergence criterion. (Input)
Convergence is assumed when the relative change in the log‑likelihood from one iteration to the next is less than EPS. EPS = 0.00001 is typical.
Default: EPS = 0.00001.
IPRINT — Printing option. (Input)
IPRINT = 0 means that no printing is performed. IPRINT = 1 means that printing is performed.
Default: IPRINT = 0.
LDSPRO — Leading dimension of SPROB exactly as specified in the dimension statement in the calling program. (Input)
If LDSPRO is less than NINTVL, only the first LDSPRO intervals are returned in SPROB.
Default: LDSPRO = size (SPROB,1).
NRMISS — Number of rows of data in X that contain missing values. (Output)
Any row of X that contains missing values in the ILT, IRT, ICEN, or IFRQ columns (when the ILT, ICEN or IFRQ is positive) is omitted from the analysis.
FORTRAN 90 Interface
Generic: CALL TRNBL (X, ILT, IRT, NINTVL, SPROB, ALGL [, …])
Specific: The specific interface names are S_TRNBL and D_TRNBL.
FORTRAN 77 Interface
Single: CALL TRNBL (NOBS, NCOL, X, LDX, ILT, IRT, IFRQ, ICEN, MAXIT, EPS, IPRINT, NINTVL, SPROB, LDSPRO, ALGL, NRMISS)
Double: The double precision name is DTRNBL.
Description
Routine TRNBL computes nonparametric maximum likelihood estimates of a survival distribution based upon a random sample of data containing exact failure, right‑censored, leftcensored (interval censored with a left endpoint of zero), or interval‑censored observations. The computational method of Turnbull (1976) is used in computing the probability estimates. The model used is also discussed by Peto (1973).
Routine TRNBL begins by finding a set of regions or “failure intervals” (to distinguish them from “observation failure intervals”) on the positive real axis in which a change in the survival probability occurs. The survival probability is constant outside of these regions, and undefined within them. Each region (failure interval) is composed of a single left and a single right endpoint obtained from the left and right endpoints of the observation failure intervals (for exact failure times, the left and right endpoints are equal). The regions are defined by the fact that no observation interval endpoints are allowed within a region, except at its endpoints. Note that the endpoints of the intervals need not correspond to a single observation. Regions defined by endpoints from two distinct observations are often obtained.
Let pi, i = 1, …, NINTVL denote the change in the survival probability within the i‑th region, and let the region be denoted by ci. Let n = NOBS and suppose that the observation failure interval for observation j is denoted by Ij. The EM (expectation, maximization) algorithm of Dempster, Laird and Rubin (1977) is used to find the optimal
The algorithm is defined as follows:
For given
compute the expected contribution of the j‑th observation to the i‑th change interval as
where δij = 1 if ci ⊆ Ij and δij = 0 otherwise, and fj is the observation frequency.
For given expectations
compute the new probability estimate as
Iterate in this manner until convergence. Convergence is assumed when the relative change in the log‑likelihood
is small (less than EPS). Because the algorithm is slow to converge, 5 expectation‑maximization cycles are considered to be one iteration of the algorithm. The initial estimate for all the
is taken to be one divided by the number of regions (failure intervals).
Comments
1. Workspace may be explicitly provided, if desired, by use of T2NBL/DT2NBL. The reference is:
CALL T2NBL (NOBS, NCOL, X, LDX, ILT, IRT, IFRQ, ICEN, MAXIT, EPS, IPRINT, NINTVL, SPROB, LDSPRO, ALGL, NRMISS, WK, IPERM, INDDR, WWK, IWK)
The additional arguments are as follows:
WK — Work vector of length 7 * NOBS.
IPERM — Work vector of length NOBS.
INDDR — Work vector of length NOBS.
WWK — Work vector of length 2 * max(NOBS, 7).
IWK — Work vector of length max(NOBS, 7).
2. Informational errors
Type |
Code |
Description |
3 |
3 |
The maximum number of iterations was exceeded. Convergence is assumed. |
4 |
1 |
There are no valid observations. |
4 |
2 |
There are no finite failure intervals present in the data. |
Example
The following example contains exact failure, right-, left-, and interval‑censored observations. The 20 observations yield 15 change intervals. The last interval is from 192 to ∞, and corresponds to a right‑censored observation. When the last interval is infinite, as is the case here, the second column of SPROB contains +∞ in the NINTVL‑th position. Left‑or right‑censored observations input in X are arbitrarily assigned the value 0.0 for the non‑specified endpoint.
USE WRRRN_INT
USE TRNBL_INT
IMPLICIT NONE
INTEGER ICEN, IFRQ, ILT, IPRINT, IRT, LDSPRO, LDX, NCOL, NOBS
PARAMETER (ICEN=4, IFRQ=3, ILT=1, IPRINT=1, IRT=2, LDSPRO=20, &
LDX=20, NCOL=4, NOBS=20)
!
INTEGER NINTVL, NRMISS
REAL ALGL, SPROB(LDSPRO,4), X(LDX,NCOL)
!
DATA X/0.9, 1.9, 2.5, 3.5, 6.3, 7.1, 18., 25.1, 25.3, 30.3, 45.9, &
63.5, 70.1, 73.0, 93.0, 94.4, 96.0, 0.0, 191.4, 0.0, 0.9, &
0.0, 0.0, 0.0, 6.3, 1.9, 1.8, 25.1, 9.5, 30.3, 45.9, &
60.7, 70.1, 71.0, 74.0, 94.4, 96.0, 96.0, 191.4, 192.0, &
17*1.0, 5.0, 1.0, 1.0, 0.0, 2.0, 2.0, 2.0, 0.0, 3.0, 3.0, &
0.0, 3.0, 0.0, 0.0, 3.0, 0.0, 3.0, 3.0, 0.0, 0.0, 1.0, 0.0, &
1.0/
!
CALL WRRRN ('X', X)
!
CALL TRNBL (X, ILT, IRT, NINTVL, SPROB, ALGL, IFRQ=IFRQ, &
ICEN=ICEN, IPRINT=IPRINT)
!
END
Output
X
1 2 3 4
1 0.9 0.9 1.0 0.0
2 1.9 0.0 1.0 2.0
3 2.5 0.0 1.0 2.0
4 3.5 0.0 1.0 2.0
5 6.3 6.3 1.0 0.0
6 7.1 1.9 1.0 3.0
7 18.0 1.8 1.0 3.0
8 25.1 25.1 1.0 0.0
9 25.3 9.5 1.0 3.0
10 30.3 30.3 1.0 0.0
11 45.9 45.9 1.0 0.0
12 63.5 60.7 1.0 3.0
13 70.1 70.1 1.0 0.0
14 73.0 71.0 1.0 3.0
15 93.0 74.0 1.0 3.0
16 94.4 94.4 1.0 0.0
17 96.0 96.0 1.0 0.0
18 0.0 96.0 5.0 1.0
19 191.4 191.4 1.0 0.0
20 0.0 192.0 1.0 1.0
Iteration Log-Likelihood Relative convergence
0 -54.94 ............
1 -52.14 0.5367E-01
2 -52.09 0.8407E-03
3 -52.09 0.1372E-03
4 -52.09 0.2476E-04
5 -52.08 0.4614E-05
SPROB
Lower Upper Interval Survival
Interval Endpoint Endpoint Probability Probability
1 0.9000 0.9000 0.0972 0.9028
2 1.9000 1.9000 0.1215 0.7813
3 6.3000 6.3000 0.0729 0.7083
4 9.5000 18.0000 0.0000 0.7083
5 25.1000 25.1000 0.0833 0.6250
6 30.3000 30.3000 0.0417 0.5833
7 45.9000 45.9000 0.0417 0.5417
8 60.7000 63.5000 0.0417 0.5000
9 70.1000 70.1000 0.0417 0.4583
10 71.0000 73.0000 0.0417 0.4167
11 74.0000 93.0000 0.0417 0.3750
12 94.4000 94.4000 0.0417 0.3333
13 96.0000 96.0000 0.1111 0.2222
14 191.4000 191.4000 0.1111 0.1111
15 192.0000 Inf 0.1111 0.0000