HAZEZ
Performs nonparametric hazard rate estimation using kernel functions. Easy‑to‑use version of HAZRD.
Required Arguments
X — NOBS by m matrix containing the raw data, where m = 1 if ICEN = 0, and m = 2 otherwise. (Input)
IRT — Column number in X containing the times of occurrence of the events. (Input)
ST — Vector of length NOBS containing the times of occurrence of the events, sorted from smallest to largest. (Output)
Vector ST is obtained from matrix X and is used as input to routine HAZST.
JCEN — Vector of length NOBS containing the sorted censor codes. (Output)
Censor codes are sorted corresponding to the events ST(i), with censored observations preceding tied failures. Vector JCEN is obtained from the censor codes in X and is used as input to routine HAZST.
ALPHA — Optimal estimate for the parameter α. (Output)
BTA — Optimal estimate for the parameter β. (Output)
K — Optimal estimate for the parameter k. (Output)
VML — Optimal value of the criterion function. (Output)
VML is the “modified likelihood”.
H — Vector of length 5 * NOBS containing the constants needed to compute the k‑th nearest failure distance and the observation weights. (Output)
H is used as input to routine HAZST.
Optional Arguments
NOBS — Number of observations. (Input)
Default: NOBS = size (X,1).
LDX — Leading dimension of X exactly as specified in the dimension statement in the calling program. (Input)
Default: LDX = size (X,1).
ICEN — Censoring option. (Input)
If ICEN = 0, then all of the data is treated as exact data with no censoring. For ICEN > 0, column ICEN of X contains the censoring codes. A censoring code of 0 means an exact event (failure). A censoring code of 1 means that the observation was right censored at the event time.
Default: ICEN = 0.
IPRINT — Printing option. (Input)
If IPRINT = 1, the grid estimates and the optimized estimates are printed for each value of k. Otherwise, no printing is performed.
Default : IPRINT = 0.
NMISS — Number of missing (NaN, not a number) values in X. (Output)
FORTRAN 90 Interface
Generic: CALL HAZEZ (X, IRT, ST, JCEN, ALPHA, BTA, K, VML, H [, …])
Specific: The specific interface names are S_HAZEZ and D_HAZEZ.
FORTRAN 77 Interface
Single: CALL HAZEZ (NOBS, X, LDX, IRT, ICEN, IPRINT, ST, JCEN, ALPHA, BTA, K, VML, H, NMISS)
Double: The double precision name is DHAZEZ.
Description
Routine HAZEZ is an implementation of the methods discussed by Tanner and Wong (1984) for estimating the hazard rate in survival or reliability data with right censoring. It uses the biweight kernel,
and a modified likelihood to obtain data‑based estimates of the smoothing parameters α, β, and k needed in the estimation of the hazard rate. For kernel K(x), define the “smoothed” kernel
Ks(x – x(j)) as follows:
where djk is the distance to the k‑th nearest failure from x(j), and x(j)is the j‑th ordered observation (from smallest to largest). For given α and β, the hazard at point x is given by:
where N = NOBS, δi is the censor code (0 = failed, 1 = censored) for the i‑th ordered observation, and wi is the weight of the i‑th ordered observation (given by 1/(N – i + 1)). The hazard may be estimated via routine after the smoothing parameters have been obtained
Let
The likelihood is given by:
where Π denotes product. Since the likelihood, as specified, will lead to degenerate estimates, Tanner and Wong (1984) suggest the use of a modified likelihood. The modification consists of deleting the observation xi in the calculation of h(xi) and H(xi) when the likelihood term for xi is computed. For a given k, α and β can then be estimated via the usual optimization techniques.
Estimates for α and β are computed as follows. For a given β, a closed form solution is available for α. The problem is thus reduced to the estimation of β. To estimate α and β, a grid search is first performed. Experience indicates that if the initial estimate of β from this grid search is greater than exp(6), then the modified likelihood is degenerate because the hazard rate does not change with time. In this situation, β should be taken to be infinite, and an estimate of α corresponding to infinite β is computed directly. When the estimate of β from the grid search is less than exp(6) (approximately 400), a secant algorithm is used to optimize the modified likelihood. The secant algorithm is said to have converged when the change in β from one iteration to the next is less than 0.00001. Additionally, convergence is assumed when the value of β becomes greater than exp(6). This corresponds to an infinite β with a degenerate likelihood.
A grid of k‑values is used to find the optimum value of the likelihood with respect to k. The grid is determined by HAZEZ and consists of at most 10 points. The starting value in the grid is the smallest possible value of k. An increment of 2 is then used to obtain the remaining grid points.
For each grid value, the modified likelihood is optimized with respect to α and β. That grid point, which leads to the smallest likelihood, is taken to be the optimal k.
Comments
1. Informational errors
Type |
Code |
Description |
4 |
6 |
All observations are missing (NaN, not a number) values. |
4 |
7 |
There are not enough failing observations in X to continue. |
2. The grid values in the initial grid search are given as follows: Let
β* = – 8, – 4, – 2, – 1, – 0.5,0.5,1, and 2, and
For each value of β, VML is computed at the optimizing β. The maximizing β is used to initiate the iterations.
3. If the initial β* is determined from the grid search to be less than –6, then it is presumed that β is infinite, and an analytic estimate of α based upon infinite β is used. Infinite β corresponds to a flat hazard rate.
Programming Notes
1. Routine HAZST may be used to estimate the hazard on a grid of points once the optimal values for α, β and k have been found. (The user should also consider using routine HAZRD, which allows for more options than HAZEZ.)
2. Routine HAZEZ assumes that censored observations precede failed observations at tied failure/censoring times.
3. The secant algorithm iterates on the transformed parameter β* = exp(–β). This assures a positive β, and it also seems to lead to a more desirable grid search. All results returned to the user are in the original parameterization.
Example
The following example is illustrated in Tanner and Wong (1984), and the data are taken from Stablein, Carter, and Novak (1981). It involves the survival times of individuals with nonresectable gastric carcinoma. Only those individuals treated with radiation and chemotherapy are used.
USE HAZEZ_INT
USE UMACH_INT
USE WRRRN_INT
USE WRIRN_INT
IMPLICIT NONE
INTEGER ICEN, IPRINT, IRT, LDX, NOBS
PARAMETER (ICEN=2, IPRINT=1, IRT=1, LDX=45, NOBS=45)
!
INTEGER JCEN(NOBS), K, NMISS, NOUT
REAL ALPHA, BTA, H(5*NOBS), ST(NOBS), VML, X(NOBS,2)
!
DATA X/17, 42, 44, 48, 60, 72, 74, 95, 103, 108, 122, 144, 167, &
170, 183, 185, 193, 195, 197, 208, 234, 235, 254, 307, 315, &
401, 445, 464, 484, 528, 542, 567, 577, 580, 795, 855, 882, &
892, 1031, 1033, 1306, 1335, 1366, 1452, 1472, 36*0, 9*1/
!
CALL HAZEZ (X, IRT, ST, JCEN, ALPHA, BTA, K, VML, H, ICEN=ICEN, &
IPRINT=IPRINT, NMISS=NMISS)
!
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) NMISS
99999 FORMAT (/' NMISS = ', I4/)
CALL WRRRN ('ST', ST, 1, NOBS, 1)
CALL WRIRN ('JCEN', JCEN, 1, NOBS, 1)
CALL WRRRN ('H', H, NOBS, 5, NOBS)
END
Output
*** GRID SEARCH FOR K = 2 ***
ALPHA BETA VML
65.157967 2980.958008 -266.543945
32.434208 54.598148 -262.551147
17.100269 20.085537 -263.100769
11.402525 12.182494 -264.410187
7.263529 7.389056 -267.502014
4.452315 4.481689 -270.548523
2.689497 2.718282 -335.407288
1.633702 1.648721 -338.566162
0.995799 1.000000 -519.939514
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
32.219337 53.968315 -262.550781
*** GRID SEARCH FOR K = 4 ***
ALPHA BETA VML
25.596716 2980.958008 -266.471558
20.476425 54.598148 -262.893860
13.995192 20.085537 -262.792755
10.109113 12.182494 -262.573212
6.883837 7.389056 -263.030121
4.407142 4.481689 -265.238647
2.690131 2.718282 -265.606293
1.633339 1.648721 -266.822693
0.993371 1.000000 -271.831390
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
8.530729 9.683726 -262.545593
*** GRID SEARCH FOR K = 6 ***
ALPHA BETA VML
16.828691 2980.958008 -266.729248
14.840095 54.598148 -264.019409
11.215133 20.085537 -262.844360
9.013870 12.182494 -263.663391
6.557755 7.389056 -263.283752
4.330785 4.481689 -263.732697
2.691744 2.718282 -264.613800
1.633932 1.648721 -265.381866
0.990891 1.000000 -266.242767
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
12.553377 28.178671 -262.529877
*** GRID SEARCH FOR K = 8 ***
ALPHA BETA VML
11.377748 2980.958008 -266.746185
10.773529 54.598148 -265.469299
8.766835 20.085537 -262.476807
7.427887 12.182494 -263.109009
5.916299 7.389056 -264.492432
4.184323 4.481689 -264.289886
2.656351 2.718282 -264.807617
1.623750 1.648721 -265.270691
0.989442 1.000000 -264.738403
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
8.522110 18.281288 -262.438568
*** GRID SEARCH FOR K = 10 ***
ALPHA BETA VML
8.689023 2980.958008 -267.026093
8.412854 54.598148 -266.250366
7.196551 20.085537 -263.192688
6.207793 12.182494 -262.648376
5.143391 7.389056 -264.274384
3.934601 4.481689 -264.523193
2.630993 2.718282 -264.877869
1.611710 1.648721 -264.332581
0.984530 1.000000 -263.920013
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
6.483376 13.956067 -262.589661
*** GRID SEARCH FOR K = 12 ***
ALPHA BETA VML
6.669007 2980.958008 -266.778259
6.551508 54.598148 -266.347595
5.933167 20.085537 -264.141174
5.252526 12.182494 -262.516205
4.471936 7.389056 -262.691589
3.598284 4.481689 -263.914032
2.557817 2.718282 -263.390106
1.588307 1.648721 -263.879578
0.973723 1.000000 -263.361908
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
4.923379 9.819798 -262.336670
*** GRID SEARCH FOR K = 14 ***
ALPHA BETA VML
5.668086 2980.958008 -266.747559
5.595870 54.598148 -266.436584
5.195685 20.085537 -264.737946
4.685275 12.182494 -262.971497
4.044650 7.389056 -262.288147
3.335586 4.481689 -263.126434
2.496436 2.718282 -262.891663
1.585763 1.648721 -263.418976
0.969140 1.000000 -263.164032
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
4.145060 7.966486 -262.260559
*** GRID SEARCH FOR K = 16 ***
ALPHA BETA VML
4.970138 2980.958008 -266.419281
4.924928 54.598148 -266.199646
4.663393 20.085537 -264.938660
4.280633 12.182494 -263.266602
3.741570 7.389056 -262.020355
3.132969 4.481689 -262.401733
2.421248 2.718282 -262.555817
1.586469 1.648721 -262.426025
0.967658 1.000000 -263.135101
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
3.639074 6.767537 -261.987305
*** GRID SEARCH FOR K = 18 ***
ALPHA BETA VML
4.578322 2980.958008 -266.804504
4.543117 54.598148 -266.619690
4.336464 20.085537 -265.541168
4.019334 12.182494 -264.001404
3.542742 7.389056 -262.540100
2.990577 4.481689 -262.511810
2.351537 2.718282 -262.633911
1.584173 1.648721 -262.158264
0.966332 1.000000 -262.868408
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
1.695147 1.769263 -262.118530
*** GRID SEARCH FOR K = 20 ***
ALPHA BETA VML
4.053934 2980.958008 -266.525970
4.032835 54.598148 -266.401428
3.905046 20.085537 -265.648315
3.687815 12.182494 -264.401672
3.304344 7.389056 -262.665924
2.822716 4.481689 -262.080078
2.252759 2.718282 -262.445251
1.555777 1.648721 -261.772278
0.955586 1.000000 -262.617645
*** OPTIMAL PARAMETER ESTIMATES ***
ALPHA BETA VML
1.540533 1.631551 -261.771484
*** THE FINAL SOLUTION (K = 20) ***
ALPHA BETA VML
1.540533 1.631551 -261.771484
NMISS = 0
ST
1 2 3 4 5 6 7 8
17.0 42.0 44.0 48.0 60.0 72.0 74.0 95.0
9 10 11 12 13 14 15 16
103.0 108.0 122.0 144.0 167.0 170.0 183.0 185.0
17 18 19 20 21 22 23 24
193.0 195.0 197.0 208.0 234.0 235.0 254.0 307.0
25 26 27 28 29 30 31 32
315.0 401.0 445.0 464.0 484.0 528.0 542.0 567.0
33 34 35 36 37 38 39 40
577.0 580.0 795.0 855.0 882.0 892.0 1031.0 1033.0
41 42 43 44 45
1306.0 1335.0 1366.0 1452.0 1472.0
JCEN
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
41 42 43 44 45
1 1 1 1 1
H
1 2 3 4 5
1 217.0 218.0 1.0 21.0 1.0
2 192.0 193.0 1.0 21.0 0.5
3 190.0 191.0 1.0 21.0 0.3
.
.
.
43 1173.0 1181.0 17.0 45.0 0.0
44 1259.0 1267.0 17.0 45.0 0.0
45 1279.0 1287.0 17.0 45.0 0.0