survivalEstimates¶
Estimates survival probabilities and hazard rates for the various parametric models.
Synopsis¶
survivalEstimates (survivalInfo, nObservations, xpt, time, npt, delta)
Required Arguments¶
- structure
survivalInfo
(Input) - A structure containing the estimated survival coefficients and other related information. See survivalGlm.
- int
nObservations
(Input) - Number of observations for which estimates are to be calculated.
- float
xpt[[]]
(Input) - Array
xpt
is an array containing the groups of covariates for which estimates are desired. The covariates must be specified exactly as in the call tosurvivalGlm
which producedsurvivalInfo
. - float
time
(Input) - Beginning of the time grid for which estimates are desired. Survival
probabilities and hazard rates are computed for each covariate vector
over the grid of time points
time
+ i×delta
for i = 0, 1, …,npt
− 1. - int
npt
(Input) - Number of points on the time grid for which survival probabilities are desired.
- float
delta
(Input) - Increment between time points on the time grid.
Return Value¶
An array of size npt
by (2 ×nObservations
+ 1) containing the
estimated survival probabilities for the covariate groups specified in
xpt
. Column 0 contains the survival time. Columns 1 and 2 contain the
estimated survival probabilities and hazard rates, respectively, for the
covariates in the first row of xpt
. In general, the survival and hazard
for row i of xpt
is contained in columns 2i − 1 and 2i,
respectively, for i = 1, 2, …, npt
.
Optional Arguments¶
xbeta
(Output)An array of length
nObservations
containing the estimated linear response\[w + x \hat{\beta}\]for each row of
xpt
.
Description¶
Function survivalEstimates
computes estimates of survival probabilities
and hazard rates for the parametric survival/reliability models fit by
function survivalGlm
.
Let \(\eta=x^T \beta\) be the linear parameterization, where x is the
design vector corresponding to a row of xpt
(survivalEstimates
generates the design vector using function regressorsForGlm
), and β is a
vector of parameters associated with the linear model. Let T denote the
random response variable and \(S(t)\) denote the probability that
\(T>t\). All models considered also allow a fixed parameter w (input in
column xColFixedParameter
of xpt
). Use of the parameter is discussed
in function survivalGlm. There also may be nuisance
parameters \(\theta>0\) or \(\sigma>0\). Let Φ denote the cumulative
normal distribution. The survival models available in survivalEstimates
are:
Model | Name | \(S(t)\) |
---|---|---|
0 | Exponential | \(\exp\left[-t \exp(w_i+\eta)\right]\) |
1 | Linear hazard | \(\exp\left[-\left( t+\tfrac{\theta t^2}{2} \right) \exp\left( w_i+\eta\right) \right]\) |
2 | Log-normal | \(1-\phi\left( \tfrac{\ln(t)-\eta-w_i}{\sigma} \right)\) |
3 | Normal | \(1-\phi\left( \tfrac{t-\eta-w_i}{\sigma} \right)\) |
4 | Log-logistic | \(\left\{ 1+\exp\left( \tfrac{\ln(t)-\eta-w_i}{\sigma} \right) \right\}^{-1}\) |
5 | Logistic | \(\left\{ 1+\exp\left( \tfrac{t-\eta-w_i}{\sigma} \right) \right\}^{-1}\) |
6 | Log least extreme value | \(\exp\left\{-\exp\left( \tfrac{\ln(t)-\eta-w_i}{\sigma} \right) \right\}\) |
7 | Least extreme value | \(\exp\left\{-\exp\left( \tfrac{t-\eta-w_i}{\sigma} \right) \right\}\) |
8 | Log extreme value | \(1-\exp\left\{-\exp\left( \tfrac{\ln(t)-\eta-w_i}{\sigma} \right) \right\}\) |
9 | Extreme value | \(1-\exp\left\{-\exp\left( \tfrac{t-\eta-w_i}{\sigma} \right) \right\}\) |
10 | Weibull | \(\exp\left\{-\left[ \tfrac{t}{\exp\left( w_i+\eta\right)} \right]^\theta\right\}\) |
Let λ(t) denote the hazard rate at time t. Then λ(t) and S(t) are related at
Models 0, 1, 2, 4, 6, 8, and 10 require that \(T>0\) (in which case
assume \(\lambda(s)=0\) for \(s<0\)), while the remaining models
allow arbitrary values for T, \(-\infty<T<\infty\). The computations
proceed in function survivalEstimates
as follows:
- The input arguments are checked for consistency and validity.
- For each row of
xpt
, the explanatory variables are generated from the classification and variables and the covariates using function regressorsForGlm (see Chapter 2, Regression) withdummy
=leaveOutLast
. Given the explanatory variables x, η is computed as \(\eta=x^T \beta\), where β is input insurvivalInfo
. - For each point requested in the time grid, the survival probabilities and hazard rates are computed.
Example¶
This example is a continuation of the first example given for function
survivalGlm. Prior to calling survivalEstimates
,
survivalGlm
is invoked to compute the parameter estimates (contained in
the structure survivalInfo
). The example is taken from Lawless (1982, p.
287) and involves the mortality of patients suffering from lung cancer.
from numpy import *
from pyimsl.stat.survivalEstimates import survivalEstimates
from pyimsl.stat.survivalGlm import survivalGlm
from pyimsl.stat.writeMatrix import writeMatrix
x = array([
[1.0, 0.0, 7.0, 64.0, 5.0, 411.0, 0.0],
[1.0, 0.0, 6.0, 63.0, 9.0, 126.0, 0.0],
[1.0, 0.0, 7.0, 65.0, 11.0, 118.0, 0.0],
[1.0, 0.0, 4.0, 69.0, 10.0, 92.0, 0.0],
[1.0, 0.0, 4.0, 63.0, 58.0, 8.0, 0.0],
[1.0, 0.0, 7.0, 48.0, 9.0, 25.0, 1.0],
[1.0, 0.0, 7.0, 48.0, 11.0, 11.0, 0.0],
[2.0, 0.0, 8.0, 63.0, 4.0, 54.0, 0.0],
[2.0, 0.0, 6.0, 63.0, 14.0, 153.0, 0.0],
[2.0, 0.0, 3.0, 53.0, 4.0, 16.0, 0.0],
[2.0, 0.0, 8.0, 43.0, 12.0, 56.0, 0.0],
[2.0, 0.0, 4.0, 55.0, 2.0, 21.0, 0.0],
[2.0, 0.0, 6.0, 66.0, 25.0, 287.0, 0.0],
[2.0, 0.0, 4.0, 67.0, 23.0, 10.0, 0.0],
[3.0, 0.0, 2.0, 61.0, 19.0, 8.0, 0.0],
[3.0, 0.0, 5.0, 63.0, 4.0, 12.0, 0.0],
[4.0, 0.0, 5.0, 66.0, 16.0, 177.0, 0.0],
[4.0, 0.0, 4.0, 68.0, 12.0, 12.0, 0.0],
[4.0, 0.0, 8.0, 41.0, 12.0, 200.0, 0.0],
[4.0, 0.0, 7.0, 53.0, 8.0, 250.0, 0.0],
[4.0, 0.0, 6.0, 37.0, 13.0, 100.0, 0.0],
[1.0, 1.0, 9.0, 54.0, 12.0, 999.0, 0.0],
[1.0, 1.0, 5.0, 52.0, 8.0, 231.0, 1.0],
[1.0, 1.0, 7.0, 50.0, 7.0, 991.0, 0.0],
[1.0, 1.0, 2.0, 65.0, 21.0, 1.0, 0.0],
[1.0, 1.0, 8.0, 52.0, 28.0, 201.0, 0.0],
[1.0, 1.0, 6.0, 70.0, 13.0, 44.0, 0.0],
[1.0, 1.0, 5.0, 40.0, 13.0, 15.0, 0.0],
[2.0, 1.0, 7.0, 36.0, 22.0, 103.0, 1.0],
[2.0, 1.0, 4.0, 44.0, 36.0, 2.0, 0.0],
[2.0, 1.0, 3.0, 54.0, 9.0, 20.0, 0.0],
[2.0, 1.0, 3.0, 59.0, 87.0, 51.0, 0.0],
[3.0, 1.0, 4.0, 69.0, 5.0, 18.0, 0.0],
[3.0, 1.0, 6.0, 50.0, 22.0, 90.0, 0.0],
[3.0, 1.0, 8.0, 62.0, 4.0, 84.0, 0.0],
[4.0, 1.0, 7.0, 68.0, 15.0, 164.0, 0.0],
[4.0, 1.0, 3.0, 39.0, 4.0, 19.0, 0.0],
[4.0, 1.0, 6.0, 49.0, 11.0, 43.0, 0.0],
[4.0, 1.0, 8.0, 64.0, 10.0, 340.0, 0.0],
[4.0, 1.0, 7.0, 67.0, 18.0, 231.0, 0.0]])
n_estimates = 2
n_class = 2
n_continuous = 3
model = 0
icen = 6
ilt = -1
irt = 5
lp_max = 40
time = 10.0
npt = 10
delta = 20.0
survival_info = []
fmt = "%12.4f"
# fmt = "%12.2f%10.4f%10.6f%10.4f%10.6f"
clabels = ["", "Time", "S1", "H1", "S2", "H2"]
n_coef = survivalGlm(n_class,
n_continuous,
model, x,
xColCensoring={'icen': icen, 'ilt': ilt, 'irt': irt},
infinityCheck=lp_max,
survivalInfo=survival_info)
survival_info = survival_info[0]
sprob = survivalEstimates(survival_info, n_estimates,
x, time, npt, delta)
writeMatrix("Survival and Hazard Estimates",
sprob,
noRowLabels=True,
colLabels=clabels,
writeFormat=fmt)
Output¶
Survival and Hazard Estimates
Time S1 H1 S2 H2
10.0000 0.9626 0.0038 0.9370 0.0065
30.0000 0.8921 0.0038 0.8228 0.0065
50.0000 0.8267 0.0038 0.7224 0.0065
70.0000 0.7661 0.0038 0.6343 0.0065
90.0000 0.7099 0.0038 0.5570 0.0065
110.0000 0.6579 0.0038 0.4890 0.0065
130.0000 0.6096 0.0038 0.4294 0.0065
150.0000 0.5650 0.0038 0.3770 0.0065
170.0000 0.5235 0.0038 0.3311 0.0065
190.0000 0.4852 0.0038 0.2907 0.0065
Note that the hazard rate is constant over time for the exponential model.
Warning Errors¶
IMSLS_CONVERGENCE_ASSUMED_1 |
Too many step halvings. Convergence is assumed. |
IMSLS_CONVERGENCE_ASSUMED_2 |
Too many step iterations. Convergence is assumed. |
IMSLS_NO_PREDICTED_1 |
“estimates[0] ” > 1.0. The
expected value for the log logistic
distribution (“model ” = 4)
does not exist. Predicted values will
not be calculated. |
IMSLS_NO_PREDICTED_2 |
“estimates[0] ” > 1.0. The
expected value for the log extreme
value distribution (“model ” =
8) does not exist. Predicted values
will not be calculated. |
IMSLS_NEG_EIGENVALUE |
The Hessian has at least one negative eigenvalue. An upper bound on the absolute value of the minimum eigenvalue is # corresponding to variable index #. |
IMSLS_INVALID_FAILURE_TIME_4 |
“x [#][“ilt ”= #]” = #
and
“x [#][“xResponseCol ”=
#]” = #. The censoring interval has
length 0.0. The censoring code for
this observation is being set to 0.0. |
Fatal Errors¶
IMSLS_MAX_CLASS_TOO_SMALL |
The number of distinct values of the
classification variables exceeds
“maxClass ” = #. |
IMSLS_TOO_FEW_COEF |
initialEstInput is specified, and
“nCoefInput t” = #. The model specified
requires # coefficients. |
IMSLS_TOO_FEW_VALID_OBS |
“nObservations ” = %(i1) and
“nRowsMissing ” = #.
“nObservations ”−”nRowsMissing ”
must be greater than or equal to 2 in order to
estimate the coefficients. |
IMSLS_SVGLM_1 |
For the exponential model (“model” = 0) with
“nEffect s” = # and no intercept,
“nCoef ” has been determined to equal
0. With no coefficients in the model,
processing cannot continue. |
IMSLS_INCREASE_LP_MAX |
Too many observations are to be deleted from the model. Either use a different model or increase the workspace. |
IMSLS_INVALID_DATA_8 |
“nClassValues [#]” = #. The number of
distinct values for each classification
variable must be greater than one. |