CategoricalGenLinModel Class
Namespace: Imsl.Stat
The CategoricalGenLinModel type exposes the following members.
Name | Description |
---|---|
CategoricalGenLinModel | Constructs a new CategoricalGenLinModel. |
Name | Description |
---|---|
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.) |
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.) |
GetHashCode | Serves as a hash function for a particular type. (Inherited from Object.) |
GetType | Gets the Type of the current instance. (Inherited from Object.) |
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.) |
SetEffects | Initializes an index vector to contain the column numbers in x associated with each effect. |
SetInitialEstimates | Sets the initial parameter estimates option. |
Solve | Returns the parameter estimates and associated statistics for a CategoricalGenLinModel object. |
ToString | Returns a string that represents the current object. (Inherited from Object.) |
Name | Description |
---|---|
CaseAnalysis | The case analysis. |
CensorColumn | The column number in x which contains the interval type for each observation. |
ClassificationVariableColumn | An index vector to contain the column numbers in x that are classification variables. |
ClassificationVariableCounts | The number of values taken by each classification variable. |
ClassificationVariableValues | The distinct values of the classification variables in ascending order. |
ConvergenceTolerance | The convergence criterion. |
CovarianceMatrix | The estimated asymptotic covariance matrix of the coefficients. |
DesignVariableMeans | The means of the design variables. |
ExtendedLikelihoodObservations | A vector indicating which observations are included in the extended likelihood. |
FixedParameterColumn | The column number in x that contains a fixed parameter for each observation that is added to the linear response prior to computing the model parameter. |
FrequencyColumn | The column number in x that contains the frequency of response for each observation. |
Hessian | The Hessian computed at the initial parameter estimates. |
InfiniteEstimateMethod | Specifies the method used for handling infinite estimates. |
LastParameterUpdates | The last parameter updates (excluding step halvings). |
LowerEndpointColumn | The column number in x that contains the lower endpoint of the observation interval for full interval and right interval observations. |
MaxIterations | The maximum number of iterations allowed. |
ModelIntercept | The intercept option. |
NRowsMissing | The number of rows of data in x that contain missing values in one or more specific columns of x. |
ObservationMax | The maximum number of observations that can be handled in the linear programming. |
OptimizedCriterion | The optimized criterion. |
OptionalDistributionParameterColumn | The column number in x that contains an optional distribution parameter for each observation. |
Parameters | Parameter estimates and associated statistics. |
Product | The inverse of the Hessian times the gradient vector computed at the input parameter estimates. |
Tolerance | The tolerance used in determining linear dependence. |
UpperBound | Defines the upper bound on the sum of the number of distinct values taken on by each classification variable. |
UpperEndpointColumn | The column number in x that contains the upper endpoint of the observation interval for full interval and left interval observations. |
Reweighted least squares is used to compute (extended) maximum likelihood estimates in some generalized linear models involving categorized data. One of several models, including probit, logistic, Poisson, logarithmic, and negative binomial models, may be fit for input point or interval observations. (In the usual case, only point observations are observed.)
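Let $\gamma_i = w_i + x_i^T\beta = w_i + \eta_i$ be the linear response, where $x_i$ is a design column vector obtained from a row of x, $\beta$ is the column vector of coefficients to be estimated, and $w_i$ is a fixed parameter that may be input in x.
The models available in CategoricalGenLinModel are:
Model Name | Parameterization | Response PDF |
---|---|---|
Model0 (Poisson) | $\lambda = N \times e^{w+\eta}$ | $f(y) = \lambda^{y} e^{-\lambda} / y!$ |
Model1 (Negative Binomial) | $\theta = \frac{e^{w+\eta}}{1+e^{w+\eta}}$ | $f(y) = \binom{S+y-1}{y}\,\theta^{S}(1-\theta)^{y}$ |
Model2 (Logarithmic) | $\theta = \frac{e^{w+\eta}}{1+e^{w+\eta}}$ | $f(y) = -\frac{(1-\theta)^{y}}{y \ln\theta}$ |
Model3 (Logistic) | $\theta = \frac{e^{w+\eta}}{1+e^{w+\eta}}$ | $f(y) = \binom{N}{y}\,\theta^{y}(1-\theta)^{N-y}$ |
Model4 (Probit) | $\theta = \Phi(w+\eta)$ | $f(y) = \binom{N}{y}\,\theta^{y}(1-\theta)^{N-y}$ |
Model5 (Log-log) | $\theta = 1 - e^{-e^{w+\eta}}$ | $f(y) = \binom{N}{y}\,\theta^{y}(1-\theta)^{N-y}$ |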
Here $\Phi$ denotes the cumulative normal distribution, N and S are known parameters specified for each observation via column OptionalDistributionParameterColumn of x, and w is an optional fixed parameter specified for each observation via column FixedParameterColumn of x. (By default N is taken to be 1 for model = 0, 3, 4 and 5 and S is taken to be 1 for model = 1. By default w is taken to be 0.) Since the log-log model (model = 5) probabilities are not symmetric with respect to 0.5, quantitatively, as well as qualitatively, different models result when the definitions of "success" and "failure" are interchanged in this distribution. In this model and all other models involving $\theta$, $\theta$ is taken to be the probability of a "success."
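The mapping from the linear response to the distribution parameter can be illustrated with a short sketch. It is written in Python for illustration only and does not call the Imsl.Stat library. The function name `response_parameter` and its arguments are hypothetical, and the formulas assume the usual forms of these links (logistic, probit via the normal CDF, and log-log as one minus a double exponential). The last lines check the asymmetry remark: for the logistic form the values at a response and at its negative sum to 1, while for the log-log form they do not.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def response_parameter(model, eta, w=0.0, n=1.0):
    """Map the linear response w + eta to the distribution parameter.

    Hypothetical helper: returns lambda for model 0 and theta for models 1 to 5.
    """
    g = w + eta
    if model == 0:                      # Poisson mean, scaled by the known N
        return n * math.exp(g)
    if model in (1, 2, 3):              # logistic form
        return 1.0 / (1.0 + math.exp(-g))
    if model == 4:                      # probit
        return normal_cdf(g)
    if model == 5:                      # log-log
        return 1.0 - math.exp(-math.exp(g))
    raise ValueError("model must be an integer from 0 to 5")

# Swapping "success" and "failure" corresponds to comparing theta(g) with 1 - theta(-g).
for m in (3, 5):
    total = response_parameter(m, 1.3) + response_parameter(m, -1.3)
    print("model", m, "theta(g) + theta(-g) =", round(total, 4))
```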
Note that each row vector in the data matrix can represent a single observation; or, through the use of column FrequencyColumn of the matrix x, each vector can represent several observations.
For interval observations, the probability of the observation is computed by summing the probability distribution function over the range of values in the observation interval. For right-interval observations, $\Pr(Y \ge y)$ is computed as a sum based upon the equality $\Pr(Y \ge y) = 1 - \Pr(Y < y)$. Derivatives are similarly computed. CategoricalGenLinModel allows three types of interval observations. In full interval observations, both the lower and the upper endpoints of the interval must be specified. For right-interval observations, only the lower endpoint need be given, while for left-interval observations, only the upper endpoint is given.
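As a rough illustration of how an interval probability becomes a finite sum, the following Python sketch handles the three interval types for a Poisson response. It is not the library's implementation. The function names are hypothetical, and treating both endpoints as inclusive is an assumption made here for the example.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability of the count y for mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def interval_probability(lam, lower=None, upper=None):
    """Probability that a Poisson count falls in an observation interval.

    lower only -> right-interval observation, Pr(Y >= lower)
    upper only -> left-interval observation,  Pr(Y <= upper)
    both       -> full interval observation,  Pr(lower <= Y <= upper)
    Endpoints are assumed inclusive (an assumption of this sketch).
    """
    if lower is None and upper is None:
        raise ValueError("at least one endpoint is required")
    if upper is None:
        # Unbounded above: use Pr(Y >= y) = 1 - Pr(Y < y), which is a finite sum.
        return 1.0 - sum(poisson_pmf(k, lam) for k in range(lower))
    start = 0 if lower is None else lower
    return sum(poisson_pmf(k, lam) for k in range(start, upper + 1))

# A right interval starting at 3 and a left interval ending at 2 cover all counts.
print(interval_probability(2.0, lower=3) + interval_probability(2.0, upper=2))
```

The number of terms grows with the width of the interval, so wide intervals make each likelihood evaluation more expensive.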
The computations proceed as follows:
For bounded interval observations, the midpoint of the interval is used for x[i,LowerEndpointColumn]. Right-interval observations are not used in obtaining initial estimates when the distribution has unbounded support (since the midpoint of the interval is not defined). When computing initial estimates, standard modifications are made to prevent illegal operations such as division by zero.
Regression estimates are obtained at this point, as well as later, by use of linear regression.
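To make the reweighted least squares idea concrete, here is a minimal Python sketch for a Poisson model with an intercept and one covariate, using point observations only. It is an illustration under stated assumptions, not the class's algorithm: the function `irls_poisson` is hypothetical, the starting value is simply the log of the overall mean rather than a regression-based initial estimate, and there is no step halving. Each pass solves the weighted normal equations, where the weight of an observation is the magnitude of the second derivative of its log-likelihood (the fitted mean, for this model).

```python
import math

def irls_poisson(xs, ys, iterations=25, tol=1e-10):
    """Fit log(mean) = b0 + b1 * x by iteratively reweighted least squares."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0   # start from the overall mean
    for _ in range(iterations):
        # Accumulate the weighted normal equations A * delta = g.
        a00 = a01 = a11 = g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            lam = math.exp(b0 + b1 * x)   # fitted mean, which is also the weight
            a00 += lam
            a01 += lam * x
            a11 += lam * x * x
            g0 += y - lam                 # first derivative of the log-likelihood
            g1 += (y - lam) * x
        det = a00 * a11 - a01 * a01
        d0 = (a11 * g0 - a01 * g1) / det  # solve the 2 x 2 system directly
        d1 = (a00 * g1 - a01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1, 2, 4, 7, 12]
b0, b1 = irls_poisson(xs, ys)
print("intercept:", b0, "slope:", b1)
```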
If InfiniteEstimateMethod is set to 0, then the methods of Clarkson and Jennrich (1991) are used to check for the existence of infinite estimates in the linear predictors $\eta_i = x_i^T\beta$.
When InfiniteEstimateMethod is set to 1, no observations are
eliminated during the iterations. In this case, when infinite estimates
occur, some (or all) of the coefficient estimates will become large, and it is likely that the Hessian will
become (numerically) singular prior to convergence.
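The behavior described here can be reproduced on a toy problem. The Python sketch below runs plain Newton iterations for a one-coefficient logistic model on perfectly separated data, where the likelihood has no finite maximizer. The function `newton_path` and the data are hypothetical, and the sketch does not reproduce the library's iteration or its handling of censored observations. The coefficient keeps growing from step to step while the (scalar) Hessian shrinks toward zero.

```python
import math

def newton_path(xs, ys, steps=8):
    """Newton iterations for theta = 1 / (1 + exp(-b * x)) with a single coefficient b.

    Returns a list of (updated coefficient, Hessian used for that update).
    """
    b = 0.0
    path = []
    for _ in range(steps):
        score = hessian = 0.0
        for x, y in zip(xs, ys):
            theta = 1.0 / (1.0 + math.exp(-b * x))
            score += (y - theta) * x                   # first derivative
            hessian += theta * (1.0 - theta) * x * x   # magnitude of second derivative
        b += score / hessian
        path.append((b, hessian))
    return path

# Every negative x is a failure and every positive x is a success (perfect separation).
for b, h in newton_path([-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1]):
    print("coefficient:", round(b, 3), "hessian:", round(h, 6))
```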
When infinite estimates for the linear predictors $\hat\eta_i = x_i^T\hat\beta$ are detected, linear regression (see Chapter 2, Regression) is used at the convergence of the algorithm to obtain unique estimates $\hat\beta$. This is accomplished by regressing the optimal $\hat\eta_i$ for the observations with finite $\eta_i$ against $x\beta$, yielding a unique $\hat\beta$ (by setting coefficients $\hat\beta$ that are linearly related to previous coefficients in the model to zero). All of the final statistics relating to $\hat\beta$ are based upon these estimates.
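Residuals are computed according to methods discussed by Pregibon (1981). Let $\ell_i(\gamma_i)$ denote the log-likelihood of the i-th observation evaluated at the linear response $\gamma_i$. Then, the standardized residual is computed as
$$r_i = \frac{\ell_i'(\hat\gamma_i)}{\sqrt{\left|\ell_i''(\hat\gamma_i)\right|}}$$
where $\hat\gamma_i$ is the value of $\gamma_i$ at the optimal estimates and the derivatives are taken with respect to $\gamma_i$.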
Following Cook and Weisberg (1982), we take the influence of the i-th observation to be
$$\ell_i'(\hat\gamma_i)^T \, \ell''(\hat\gamma)^{-1} \, \ell_i'(\hat\gamma_i)$$
This quantity is a one-step approximation to the change in the estimates when the i-th observation is deleted. Here, the partial derivatives are with respect to $\beta$.
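For a concrete case, consider a Poisson point observation with N = 1 and w = 0, so that the log-likelihood in terms of the linear response g is y*g - exp(g) - log(y!). The Python sketch below assumes the standardized residual is the first derivative of this log-likelihood divided by the square root of the magnitude of its second derivative, both taken with respect to g. Under that assumption the result is the familiar Pearson residual (y - mean) / sqrt(mean). The function name is hypothetical, and the sketch does not call the library.

```python
import math

def poisson_case_statistics(y, g):
    """First derivative, its scale, and their ratio for one Poisson point observation.

    g is the linear response, so the fitted mean is exp(g).
    """
    lam = math.exp(g)
    first = y - lam                    # d/dg of y*g - exp(g) - log(y!)
    second = -lam                      # second derivative with respect to g
    scale = math.sqrt(abs(second))     # square root of the magnitude
    return first, scale, first / scale

# Observed count 7 with fitted mean 4: numerator 3, scale 2.
print(poisson_case_statistics(7, math.log(4.0)))
```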
A second method for specifying binomial models is to use x[i,LowerEndpointColumn] to represent the number of successes in the x[i,OptionalDistributionParameterColumn] trials. In this case, x[i,FrequencyColumn] will usually be 1, but it may be greater than 1, in which case interval observations are possible.
Note that the Solve() method must be called before any property is used as a right operand (that is, before its value is read); otherwise the value is null.