survivalEstimates

Estimates survival probabilities and hazard rates for the various parametric models.

Synopsis

survivalEstimates (survivalInfo, nObservations, xpt, time, npt, delta)

Required Arguments

structure survivalInfo (Input)
A structure containing the estimated survival coefficients and other related information. See survivalGlm.
int nObservations (Input)
Number of observations for which estimates are to be calculated.
float xpt[[]] (Input)
Array xpt is an array containing the groups of covariates for which estimates are desired. The covariates must be specified exactly as in the call to survivalGlm which produced survivalInfo.
float time (Input)
Beginning of the time grid for which estimates are desired. Survival probabilities and hazard rates are computed for each covariate vector over the grid of time points time + i×delta for i = 0, 1, …, npt − 1.
int npt (Input)
Number of points on the time grid for which survival probabilities are desired.
float delta (Input)
Increment between time points on the time grid.

Return Value

An array of size npt by (2 ×nObservations + 1) containing the estimated survival probabilities for the covariate groups specified in xpt. Column 0 contains the survival time. Columns 1 and 2 contain the estimated survival probabilities and hazard rates, respectively, for the covariates in the first row of xpt. In general, the survival and hazard for row i of xpt is contained in columns 2i − 1 and 2i, respectively, for i = 1, 2, …, npt.

Optional Arguments

xbeta (Output)

An array of length nObservations containing the estimated linear response

\[w + x \hat{\beta}\]

for each row of xpt.

Description

Function survivalEstimates computes estimates of survival probabilities and hazard rates for the parametric survival/reliability models fit by function survivalGlm.

Let \(\eta=x^T \beta\) be the linear parameterization, where x is the design vector corresponding to a row of xpt (survivalEstimates generates the design vector using function regressorsForGlm), and β is a vector of parameters associated with the linear model. Let T denote the random response variable and \(S(t)\) denote the probability that \(T>t\). All models considered also allow a fixed parameter w (input in column xColFixedParameter of xpt). Use of the parameter is discussed in function survivalGlm. There also may be nuisance parameters \(\theta>0\) or \(\sigma>0\). Let Φ denote the cumulative normal distribution. The survival models available in survivalEstimates are:

Model Name \(S(t)\)
0 Exponential \(\exp\left[-t \exp(w_i+\eta)\right]\)
1 Linear hazard \(\exp\left[-\left( t+\tfrac{\theta t^2}{2} \right) \exp\left( w_i+\eta\right) \right]\)
2 Log-normal \(1-\phi\left( \tfrac{\ln(t)-\eta-w_i}{\sigma} \right)\)
3 Normal \(1-\phi\left( \tfrac{t-\eta-w_i}{\sigma} \right)\)
4 Log-logistic \(\left\{ 1+\exp\left( \tfrac{\ln(t)-\eta-w_i}{\sigma} \right) \right\}^{-1}\)
5 Logistic \(\left\{ 1+\exp\left( \tfrac{t-\eta-w_i}{\sigma} \right) \right\}^{-1}\)
6 Log least extreme value \(\exp\left\{-\exp\left( \tfrac{\ln(t)-\eta-w_i}{\sigma} \right) \right\}\)
7 Least extreme value \(\exp\left\{-\exp\left( \tfrac{t-\eta-w_i}{\sigma} \right) \right\}\)
8 Log extreme value \(1-\exp\left\{-\exp\left( \tfrac{\ln(t)-\eta-w_i}{\sigma} \right) \right\}\)
9 Extreme value \(1-\exp\left\{-\exp\left( \tfrac{t-\eta-w_i}{\sigma} \right) \right\}\)
10 Weibull \(\exp\left\{-\left[ \tfrac{t}{\exp\left( w_i+\eta\right)} \right]^\theta\right\}\)

Let λ(t) denote the hazard rate at time t. Then λ(t) and S(t) are related at

\[S(t) = \exp \left(\int_{-\infty}^{t} \lambda(s) ds\right)\]

Models 0, 1, 2, 4, 6, 8, and 10 require that \(T>0\) (in which case assume \(\lambda(s)=0\) for \(s<0\)), while the remaining models allow arbitrary values for T, \(-\infty<T<\infty\). The computations proceed in function survivalEstimates as follows:

  1. The input arguments are checked for consistency and validity.
  2. For each row of xpt, the explanatory variables are generated from the classification and variables and the covariates using function regressorsForGlm (see Chapter 2, Regression) with dummy = leaveOutLast. Given the explanatory variables x, η is computed as \(\eta=x^T \beta\), where β is input in survivalInfo.
  3. For each point requested in the time grid, the survival probabilities and hazard rates are computed.

Example

This example is a continuation of the first example given for function survivalGlm. Prior to calling survivalEstimates, survivalGlm is invoked to compute the parameter estimates (contained in the structure survivalInfo). The example is taken from Lawless (1982, p. 287) and involves the mortality of patients suffering from lung cancer.

from numpy import *
from pyimsl.stat.survivalEstimates import survivalEstimates
from pyimsl.stat.survivalGlm import survivalGlm
from pyimsl.stat.writeMatrix import writeMatrix

x = array([
    [1.0, 0.0, 7.0, 64.0, 5.0, 411.0, 0.0],
    [1.0, 0.0, 6.0, 63.0, 9.0, 126.0, 0.0],
    [1.0, 0.0, 7.0, 65.0, 11.0, 118.0, 0.0],
    [1.0, 0.0, 4.0, 69.0, 10.0, 92.0, 0.0],
    [1.0, 0.0, 4.0, 63.0, 58.0, 8.0, 0.0],
    [1.0, 0.0, 7.0, 48.0, 9.0, 25.0, 1.0],
    [1.0, 0.0, 7.0, 48.0, 11.0, 11.0, 0.0],
    [2.0, 0.0, 8.0, 63.0, 4.0, 54.0, 0.0],
    [2.0, 0.0, 6.0, 63.0, 14.0, 153.0, 0.0],
    [2.0, 0.0, 3.0, 53.0, 4.0, 16.0, 0.0],
    [2.0, 0.0, 8.0, 43.0, 12.0, 56.0, 0.0],
    [2.0, 0.0, 4.0, 55.0, 2.0, 21.0, 0.0],
    [2.0, 0.0, 6.0, 66.0, 25.0, 287.0, 0.0],
    [2.0, 0.0, 4.0, 67.0, 23.0, 10.0, 0.0],
    [3.0, 0.0, 2.0, 61.0, 19.0, 8.0, 0.0],
    [3.0, 0.0, 5.0, 63.0, 4.0, 12.0, 0.0],
    [4.0, 0.0, 5.0, 66.0, 16.0, 177.0, 0.0],
    [4.0, 0.0, 4.0, 68.0, 12.0, 12.0, 0.0],
    [4.0, 0.0, 8.0, 41.0, 12.0, 200.0, 0.0],
    [4.0, 0.0, 7.0, 53.0, 8.0, 250.0, 0.0],
    [4.0, 0.0, 6.0, 37.0, 13.0, 100.0, 0.0],
    [1.0, 1.0, 9.0, 54.0, 12.0, 999.0, 0.0],
    [1.0, 1.0, 5.0, 52.0, 8.0, 231.0, 1.0],
    [1.0, 1.0, 7.0, 50.0, 7.0, 991.0, 0.0],
    [1.0, 1.0, 2.0, 65.0, 21.0, 1.0, 0.0],
    [1.0, 1.0, 8.0, 52.0, 28.0, 201.0, 0.0],
    [1.0, 1.0, 6.0, 70.0, 13.0, 44.0, 0.0],
    [1.0, 1.0, 5.0, 40.0, 13.0, 15.0, 0.0],
    [2.0, 1.0, 7.0, 36.0, 22.0, 103.0, 1.0],
    [2.0, 1.0, 4.0, 44.0, 36.0, 2.0, 0.0],
    [2.0, 1.0, 3.0, 54.0, 9.0, 20.0, 0.0],
    [2.0, 1.0, 3.0, 59.0, 87.0, 51.0, 0.0],
    [3.0, 1.0, 4.0, 69.0, 5.0, 18.0, 0.0],
    [3.0, 1.0, 6.0, 50.0, 22.0, 90.0, 0.0],
    [3.0, 1.0, 8.0, 62.0, 4.0, 84.0, 0.0],
    [4.0, 1.0, 7.0, 68.0, 15.0, 164.0, 0.0],
    [4.0, 1.0, 3.0, 39.0, 4.0, 19.0, 0.0],
    [4.0, 1.0, 6.0, 49.0, 11.0, 43.0, 0.0],
    [4.0, 1.0, 8.0, 64.0, 10.0, 340.0, 0.0],
    [4.0, 1.0, 7.0, 67.0, 18.0, 231.0, 0.0]])
n_estimates = 2
n_class = 2
n_continuous = 3
model = 0
icen = 6
ilt = -1
irt = 5
lp_max = 40
time = 10.0
npt = 10
delta = 20.0

survival_info = []
fmt = "%12.4f"
# fmt = "%12.2f%10.4f%10.6f%10.4f%10.6f"
clabels = ["", "Time", "S1", "H1", "S2", "H2"]
n_coef = survivalGlm(n_class,
                     n_continuous,
                     model, x,
                     xColCensoring={'icen': icen, 'ilt': ilt, 'irt': irt},
                     infinityCheck=lp_max,
                     survivalInfo=survival_info)
survival_info = survival_info[0]
sprob = survivalEstimates(survival_info, n_estimates,
                          x, time, npt, delta)
writeMatrix("Survival and Hazard Estimates",
            sprob,
            noRowLabels=True,
            colLabels=clabels,
            writeFormat=fmt)

Output

 
                    Survival and Hazard Estimates
        Time            S1            H1            S2            H2
     10.0000        0.9626        0.0038        0.9370        0.0065
     30.0000        0.8921        0.0038        0.8228        0.0065
     50.0000        0.8267        0.0038        0.7224        0.0065
     70.0000        0.7661        0.0038        0.6343        0.0065
     90.0000        0.7099        0.0038        0.5570        0.0065
    110.0000        0.6579        0.0038        0.4890        0.0065
    130.0000        0.6096        0.0038        0.4294        0.0065
    150.0000        0.5650        0.0038        0.3770        0.0065
    170.0000        0.5235        0.0038        0.3311        0.0065
    190.0000        0.4852        0.0038        0.2907        0.0065

Note that the hazard rate is constant over time for the exponential model.

Warning Errors

IMSLS_CONVERGENCE_ASSUMED_1 Too many step halvings. Convergence is assumed.
IMSLS_CONVERGENCE_ASSUMED_2 Too many step iterations. Convergence is assumed.
IMSLS_NO_PREDICTED_1 estimates[0]” > 1.0. The expected value for the log logistic distribution (“model” = 4) does not exist. Predicted values will not be calculated.
IMSLS_NO_PREDICTED_2 estimates[0]” > 1.0. The expected value for the log extreme value distribution (“model” = 8) does not exist. Predicted values will not be calculated.
IMSLS_NEG_EIGENVALUE The Hessian has at least one negative eigenvalue. An upper bound on the absolute value of the minimum eigenvalue is # corresponding to variable index #.
IMSLS_INVALID_FAILURE_TIME_4 x[#][“ilt”= #]” = # and “x[#][“xResponseCol”= #]” = #. The censoring interval has length 0.0. The censoring code for this observation is being set to 0.0.

Fatal Errors

IMSLS_MAX_CLASS_TOO_SMALL The number of distinct values of the classification variables exceeds “maxClass” = #.
IMSLS_TOO_FEW_COEF initialEstInput is specified, and “nCoefInputt” = #. The model specified requires # coefficients.
IMSLS_TOO_FEW_VALID_OBS nObservations” = %(i1) and “nRowsMissing” = #. “nObservations”−”nRowsMissing” must be greater than or equal to 2 in order to estimate the coefficients.
IMSLS_SVGLM_1 For the exponential model (“model” = 0) with “nEffects” = # and no intercept, “nCoef” has been determined to equal 0. With no coefficients in the model, processing cannot continue.
IMSLS_INCREASE_LP_MAX Too many observations are to be deleted from the model. Either use a different model or increase the workspace.
IMSLS_INVALID_DATA_8 nClassValues[#]” = #. The number of distinct values for each classification variable must be greater than one.