TRNBL

Computes Turnbull’s generalized Kaplan‑Meier estimates of survival probabilities in samples with interval censoring.

Required Arguments

XNOBS by NCOL matrix containing the data. (Input)

ILT — For interval‑censored and left‑censored observations, the column number in X that contains the upper endpoint of the failure interval. (Input)
See argument ICEN. If ILT = 0, left‑censored and interval‑censored observations cannot be input.

IRT — For interval‑censored and right‑censored observations, the column number in X that contains the lower endpoint of the failure interval. (Input)
See argument ICEN. IRT must not be zero.

NINTVL — Number of failure intervals found. (Output)

SPROBNINTVL by 4 matrix. (Output)

 

Col.

Description

1

Lower endpoint of the failure interval

2

Upper endpoint of the failure interval

3

Estimated change in the survival probability density within the failure interval

4

Estimate of the survival probability for the interval

The estimated survival probability is a constant equal to SPROB(i, 4) from SPROB (i, 2) to SPROB(i + 1, 1). The estimated survival probability is 1 prior to SPROB(1, 1). The estimated survival probability is undefined in the interval SPROB(i, 1) to SPROB(i, 2). If the NINTVL‑th interval is from SPROB(NINTVL, 1) to infinity, then SPROB(NINTVL, 2) is set to positive machine infinity.

ALGL — Optimized log‑likelihood for the input data. (Output)

Optional Arguments

NOBS — Number of observations. (Input)
Default: NOBS = size (X,1).

NCOL — Number of columns in X. (Input)
Default: NCOL = size (X,2).

LDX — Leading dimension of X exactly as specified in the dimension statement in the calling program. (Input)
Default: LDX = size (X,1).

IFRQ — Frequency option. (Input)
If IFRQ = 0, a response frequency of 1 for each observation is assumed. For positive IFRQ, column number IFRQ contains the frequency of response for each observation.
Default: IFRQ = 0.

ICEN — Censoring code option. (Input)
Default: ICEN = 0.
If ICEN = 0, a censoring code of 0 is assumed. For positive ICEN, column number ICEN contains the censoring code for each observation. Valid censoring codes are:

 

Code

Meaning

0

Exact failure at X(iIRT).

1

Right censored. The response is greater than X(iIRT).

2

Left censored. The response is less than or equal to X(iILT).

3

Interval censored. The response is greater than X(iIRT), but less than or equal to X(iILT).

MAXIT — Maximum number of iterations. (Input)
Default: MAXIT = 30.

EPS — Convergence criterion. (Input)
Convergence is assumed when the relative change in the log‑likelihood from one iteration to the next is less than EPS. EPS = 0.00001 is typical.
Default: EPS = 0.00001.

IPRINT — Printing option. (Input)
IPRINT = 0 means that no printing is performed. IPRINT = 1 means that printing is performed.
Default: IPRINT = 0.

LDSPRO — Leading dimension of SPROB exactly as specified in the dimension statement in the calling program. (Input)
If LDSPRO is less than NINTVL, only the first LDSPRO intervals are returned in SPROB.
Default: LDSPRO = size (SPROB,1).

NRMISS — Number of rows of data in X that contain missing values. (Output)
Any row of X that contains missing values in the ILT, IRT, ICEN, or IFRQ columns (when the ILT, ICEN or IFRQ is positive) is omitted from the analysis.

FORTRAN 90 Interface

Generic: CALL TRNBL (X, ILT, IRT, NINTVL, SPROB, ALGL [])

Specific: The specific interface names are S_TRNBL and D_TRNBL.

FORTRAN 77 Interface

Single: CALL TRNBL (NOBS, NCOL, X, LDX, ILT, IRT, IFRQ, ICEN, MAXIT, EPS, IPRINT, NINTVL, SPROB, LDSPRO, ALGL, NRMISS)

Double: The double precision name is DTRNBL.

Description

Routine TRNBL computes nonparametric maximum likelihood estimates of a survival distribution based upon a random sample of data containing exact failure, right‑censored, leftcensored (interval censored with a left endpoint of zero), or interval‑censored observations. The computational method of Turnbull (1976) is used in computing the probability estimates. The model used is also discussed by Peto (1973).

Routine TRNBL begins by finding a set of regions or “failure intervals” (to distinguish them from “observation failure intervals”) on the positive real axis in which a change in the survival probability occurs. The survival probability is constant outside of these regions, and undefined within them. Each region (failure interval) is composed of a single left and a single right endpoint obtained from the left and right endpoints of the observation failure intervals (for exact failure times, the left and right endpoints are equal). The regions are defined by the fact that no observation interval endpoints are allowed within a region, except at its endpoints. Note that the endpoints of the intervals need not correspond to a single observation. Regions defined by endpoints from two distinct observations are often obtained.

Let pi, i = 1, NINTVL denote the change in the survival probability within the i‑th region, and let the region be denoted by ci. Let n = NOBS and suppose that the observation failure interval for observation j is denoted by Ij. The EM (expectation, maximization) algorithm of Dempster, Laird and Rubin (1977) is used to find the optimal

 

The algorithm is defined as follows:

For given

 

compute the expected contribution of the j‑th observation to the i‑th change interval as

 

where δij = 1 if ci  Ij and δij = 0 otherwise, and fj is the observation frequency.

For given expectations

 

compute the new probability estimate as

 

Iterate in this manner until convergence. Convergence is assumed when the relative change in the log‑likelihood

 

is small (less than EPS). Because the algorithm is slow to converge, 5 expectation‑maximization cycles are considered to be one iteration of the algorithm. The initial estimate for all the

 

is taken to be one divided by the number of regions (failure intervals).

Comments

1. Workspace may be explicitly provided, if desired, by use of T2NBL/DT2NBL. The reference is:

CALL T2NBL (NOBS, NCOL, X, LDX, ILT, IRT, IFRQ, ICEN, MAXIT, EPS, IPRINT, NINTVL, SPROB, LDSPRO, ALGL, NRMISS, WK, IPERM, INDDR, WWK, IWK)

The additional arguments are as follows:

WK — Work vector of length 7 * NOBS.

IPERM — Work vector of length NOBS.

INDDR — Work vector of length NOBS.

WWK — Work vector of length 2 * max(NOBS, 7).

IWK — Work vector of length max(NOBS, 7).

2. Informational errors

 

Type

Code

Description

3

3

The maximum number of iterations was exceeded. Convergence is assumed.

4

1

There are no valid observations.

4

2

There are no finite failure intervals present in the data.

Example

The following example contains exact failure, right-, left-, and interval‑censored observations. The 20 observations yield 15 change intervals. The last interval is from 192 to , and corresponds to a right‑censored observation. When the last interval is infinite, as is the case here, the second column of SPROB contains + in the NINTVL‑th position. Left‑or right‑censored observations input in X are arbitrarily assigned the value 0.0 for the non‑specified endpoint.

 

USE WRRRN_INT

USE TRNBL_INT

 

IMPLICIT NONE

INTEGER ICEN, IFRQ, ILT, IPRINT, IRT, LDSPRO, LDX, NCOL, NOBS

PARAMETER (ICEN=4, IFRQ=3, ILT=1, IPRINT=1, IRT=2, LDSPRO=20, &

LDX=20, NCOL=4, NOBS=20)

!

INTEGER NINTVL, NRMISS

REAL ALGL, SPROB(LDSPRO,4), X(LDX,NCOL)

!

DATA X/0.9, 1.9, 2.5, 3.5, 6.3, 7.1, 18., 25.1, 25.3, 30.3, 45.9, &

63.5, 70.1, 73.0, 93.0, 94.4, 96.0, 0.0, 191.4, 0.0, 0.9, &

0.0, 0.0, 0.0, 6.3, 1.9, 1.8, 25.1, 9.5, 30.3, 45.9, &

60.7, 70.1, 71.0, 74.0, 94.4, 96.0, 96.0, 191.4, 192.0, &

17*1.0, 5.0, 1.0, 1.0, 0.0, 2.0, 2.0, 2.0, 0.0, 3.0, 3.0, &

0.0, 3.0, 0.0, 0.0, 3.0, 0.0, 3.0, 3.0, 0.0, 0.0, 1.0, 0.0, &

1.0/

!

CALL WRRRN ('X', X)

!

CALL TRNBL (X, ILT, IRT, NINTVL, SPROB, ALGL, IFRQ=IFRQ, &

ICEN=ICEN, IPRINT=IPRINT)

!

END

Output

 

X

1 2 3 4

1 0.9 0.9 1.0 0.0

2 1.9 0.0 1.0 2.0

3 2.5 0.0 1.0 2.0

4 3.5 0.0 1.0 2.0

5 6.3 6.3 1.0 0.0

6 7.1 1.9 1.0 3.0

7 18.0 1.8 1.0 3.0

8 25.1 25.1 1.0 0.0

9 25.3 9.5 1.0 3.0

10 30.3 30.3 1.0 0.0

11 45.9 45.9 1.0 0.0

12 63.5 60.7 1.0 3.0

13 70.1 70.1 1.0 0.0

14 73.0 71.0 1.0 3.0

15 93.0 74.0 1.0 3.0

16 94.4 94.4 1.0 0.0

17 96.0 96.0 1.0 0.0

18 0.0 96.0 5.0 1.0

19 191.4 191.4 1.0 0.0

20 0.0 192.0 1.0 1.0

 

Iteration Log-Likelihood Relative convergence

0 -54.94 ............

1 -52.14 0.5367E-01

2 -52.09 0.8407E-03

3 -52.09 0.1372E-03

4 -52.09 0.2476E-04

5 -52.08 0.4614E-05

 

SPROB

Lower Upper Interval Survival

Interval Endpoint Endpoint Probability Probability

1 0.9000 0.9000 0.0972 0.9028

2 1.9000 1.9000 0.1215 0.7813

3 6.3000 6.3000 0.0729 0.7083

4 9.5000 18.0000 0.0000 0.7083

5 25.1000 25.1000 0.0833 0.6250

6 30.3000 30.3000 0.0417 0.5833

7 45.9000 45.9000 0.0417 0.5417

8 60.7000 63.5000 0.0417 0.5000

9 70.1000 70.1000 0.0417 0.4583

10 71.0000 73.0000 0.0417 0.4167

11 74.0000 93.0000 0.0417 0.3750

12 94.4000 94.4000 0.0417 0.3333

13 96.0000 96.0000 0.1111 0.2222

14 191.4000 191.4000 0.1111 0.1111

15 192.0000 Inf 0.1111 0.0000