Click or drag to resize
ProportionalHazards Class
Analyzes survival and reliability data using Cox's proportional hazards model.
Inheritance Hierarchy
SystemObject
  Imsl.StatProportionalHazards

Namespace: Imsl.Stat
Assembly: ImslCS (in ImslCS.dll) Version: 6.5.2.0
Syntax
[SerializableAttribute]
public class ProportionalHazards

The ProportionalHazards type exposes the following members.

Constructors
  NameDescription
Public methodProportionalHazards
Constructor for ProportionalHazards.
Top
Methods
  NameDescription
Public methodEquals
Determines whether the specified object is equal to the current object.
(Inherited from Object.)
Protected methodFinalize
Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection.
(Inherited from Object.)
Public methodGetCaseStatistics
Returns the case statistics for each observation.
Public methodGetClassValueCounts
Returns the number of values taken by each classification variable.
Public methodGetClassValues
Returns the class values taken by each classification variable.
Public methodGetGradient
Returns the inverse of the Hessian times the gradient vector, computed at the initial estimates.
Public methodGetHashCode
Serves as a hash function for a particular type.
(Inherited from Object.)
Public methodGetHessian
Returns the inverse of the Hessian of the negative of the log-likelihood, computed at the initial estimates.
Public methodGetInitialEstimates
Gets the initial parameter estimates.
Public methodGetLastUpdates
Gets the last parameter updates.
Public methodGetMeans
Returns the means of the design variables.
Public methodGetParameterStatistics
Returns the parameter estimates and associated statistics.
Public methodGetStratumNumbers
Returns the stratum number used for each observation.
Public methodGetType
Gets the Type of the current instance.
(Inherited from Object.)
Public methodGetVarianceCovarianceMatrix
Returns the estimated asymptotic variance-covariance matrix of the parameters.
Protected methodMemberwiseClone
Creates a shallow copy of the current Object.
(Inherited from Object.)
Public methodSetClassVarColumns
Sets the column indices of x that are the classification variables.
Public methodSetInitialEstimates
Sets the initial parameter estimates.
Public methodToString
Returns a string that represents the current object.
(Inherited from Object.)
Top
Properties
  NameDescription
Public propertyCensorColumn
The column index of x containing the optional censoring code for each observation.
Public propertyConstantColumn
The column index of x containing the constant to be added to the linear response.
Public propertyConvergenceTol
The convergence tolerance.
Public propertyFrequencyColumn
The column index of x containing the frequency of response for each observation.
Public propertyHessianOption
The boolean used to indicate whether or not to compute the Hessian and gradient at the initial estimates.
Public propertyLogger
Returns the logger object and enables logging.
Public propertyMaxClass
The upper bound used on the sum of the number of distinct values found among the classification variables in x.
Public propertyMaximumLikelihood
Returns the maximized log-likelihood.
Public propertyMaxIterations
The maximum number of iterations allowed.
Public propertyNumberOfCoefficients
Returns the number of estimated coefficients in the model.
Public propertyNumberRowsMissing
The number of rows of data in x that contain missing values in one or more columns of x.
Public propertyResponseColumn
The column index of x containing the response time for each observation.
Public propertyStratumColumn
The column index of x containing the stratum number for each observation.
Public propertyStratumRatio
The ratio at which a stratum is split into two strata.
Public propertyTiesOption
The method used for handling ties.
Top
Remarks

Class ProportionalHazards computes parameter estimates and other statistics in Proportional Hazards Generalized Linear Models. These models were first proposed by Cox (1972). Two methods for handling ties are allowed. Time-dependent covariates are not allowed. The user is referred to Cox and Oakes (1984), Kalbfleisch and Prentice (1980), Elandt-Johnson and Johnson (1980), Lee (1980), or Lawless (1982), among other texts, for a thorough discussion of the Cox proportional hazards model.

Let \lambda(t,z_i) represent the hazard rate at time t for observation number i with covariables contained as elements of row vector z_i. The basic assumption in the proportional hazards model (the proportionality assumption) is that the hazard rate can be written as a product of a time varying function \lambda_0(t), which depends only on time, and a function f(z_i), which depends only on the covariable values. The function f(z_i) used in ProportionalHazards is given as f(z_i)=
            \textup{exp}(w_i+\beta z_i) where w_i is a fixed constant assigned to the observation, and b is a vector of coefficients to be estimated. With this function one obtains a hazard rate \lambda(t, z_i)=\lambda_0(t)\textup{exp}(w_i+
            \beta z_i). The form of \lambda_0(t) is not important in proportional hazards models.

The constants w_i may be known theoretically. For example, the hazard rate may be proportional to a known length or area, and the w_i can then be determined from this known length or area. Alternatively, the w_i may be used to fix a subset of the coefficients \beta (say, \beta_1) at specified values. When w_i is used in this way, constants w_i=
            \beta_1z_{1i} are used, while the remaining coefficients in \beta are free to vary in the optimization algorithm. Constants are defined as 0.0 by default. If user-specified constants are desired, use the ConstantColumn property to specify which column contains the constant.

With this definition of \lambda(t,z_i), the usual partial (or marginal, see Kalbfleisch and Prentice (1980)) likelihood becomes

L=\prod_{i=1}^{n_d}\frac{\textup{exp}
            (w_i+\beta z_i)} {\sum_{j\in R(t_i)}^{}\textup{exp}(w_j+\beta z_j)}

where R(t_i) denotes the set of indices of observations that have not yet failed at time t_i (the risk set), t_i denotes the time of failure for the i-th observation, n_d is the total number of observations that fail. Right-censored observations (i.e., observations that are known to have survived to time t_i
            , but for which no time of failure is known) are incorporated into the likelihood through the risk set R(t_i)
            . Such observations never appear in the numerator of the likelihood. When TiesOption is set to BreslowsApproximate (the default), all observations that are censored at time t_i are not included in R(t_i), while all observations that fail at time t_i are included in R(t_i).

If it can be assumed that the dependence of the hazard rate upon the covariate values remains the same from stratum to stratum, while the time-dependent term, \lambda_0(t), may be different in different strata, then ProportionalHazards allows the incorporation of strata into the likelihood as follows. Let k index the m strata (set with StratumColumn). Then, the likelihood is given by

L_S=\prod_{k=1}^{m}\left[
            \prod_{i=1}^{n_k}\frac{\textup{exp}(w_{ki}+\beta z_{ki})}{\sum_{j\in
            R(t_{ki})}^{}\textup{exp}(w_{kj}+\beta z_{kj})} \right ]

In ProportionalHazards, the log of the likelihood is maximized with respect to the coefficients \beta. A quasi-Newton algorithm approximating the Hessian via the matrix of sums of squares and cross products of the first partial derivatives is used in the initial iterations. When the change in the log-likelihood from one iteration to the next is less than 100 times the convergence tolerance, Newton-Raphson iteration is used. If, during any iteration, the initial step does not lead to an increase in the log-likelihood, then step halving is employed to find a step that will increase the log-likelihood.

Once the maximum likelihood estimates have been computed, the algorithm computes estimates of a probability associated with each failure. Within stratum k, an estimate of the probability that the i-th observation fails at time t_i given the risk set R(t_{ki}) is given by

p_{ki}=\frac{\textup{exp}(w_{ki}+\beta
            z_{ki})}{\sum_{j\in R(t_{ki})}^{}\textup{exp}(w_{kj}+\beta z_{kj})}

A diagnostic "influence" or "leverage" statistic is computed for each noncensored observation as:

l_{ki}=-{g}'_{ki}H^{-1}_s{g}'_{ki}

where H_s is the matrix of second partial derivatives of the log-likelihood, and

{g}'_{ki}

is computed as:

{g}'_{ki}=z_{ki}-\frac{z_{ki}
            \textup{exp}(w_{ki}+\beta z_{ki})}{\sum_{j\in R(t_{ki})}^{}\textup{exp}
            (w_{kj}+\beta z_{kj})}

Influence statistics are not computed for censored observations.

A "residual" is computed for each of the input observations according to methods given in Cox and Oakes (1984, page 108). Residuals are computed as

r_{ki}=\textup{exp}(w_{ki}+\hat{\beta}
            z_{ki})\sum_{j \in R(t_{ki})}^{}\frac{d_{kj}}{\sum_{l \in R(t_{kj})}^{}
            \textup{exp}(w_{kl}+\hat{\beta} z_{kl})}

where d_{kj} is the number of tied failures in group k at time t_{kj}. Assuming that the proportional hazards assumption holds, the residuals should approximate a random sample (with censoring) from the unit exponential distribution. By subtracting the expected values, centered residuals can be obtained. (The j-th expected order statistic from the unit exponential with censoring is given as

e_j=\sum_{l \le j}^{}\frac{1}{h-l+1}

where h is the sample size, and censored observations are not included in the summation.)

An estimate of the cumulative baseline hazard within group k is given as

\hat{H}_{k0}(t_{ik})=\sum_{t_{kj}\le
            t_{ki}}^{}\frac{d_{kj}}{\sum_{l \in R(t_{kj})}^{}\textup{exp}(w_{kl}+
            \hat{\beta} z_{kl})}

The observation proportionality constant is computed as

\textup{exp}(w_{ki}+\hat{\beta}z_{ki})

Note that one can use logging to generate intermediate output for this class. Accumulated levels of detail correspond to Fine, Finer, and Finest logging levels with Fine yielding the smallest amount of information and Finest yielding the most. The levels of output yield the following:

LevelOutput
Fine Logging is enabled, but observational statistics are not printed.
Finer All output statistics are printed.
Finest Tracks progress through internal methods.

See Also

Reference

Other Resources