CTGLM
Analyzes categorical data using logistic, Probit, Poisson, and other generalized linear models.
Required Arguments
X — NOBS by NCOL matrix containing the data. (Input)
MODEL — Model option parameter. (Input)
MODEL specifies the distribution of the response variable and the function used to model the distribution parameter. The lower bound given in the following table is the minimum possible value of the response variable:
MODEL   Distribution    Function      Lower Bound
0       Poisson         Exponential   0
1       Neg. Binomial   Logistic      0
2       Logarithmic     Logistic      1
3       Binomial        Logistic      0
4       Binomial        Probit        0
5       Binomial        Log-log       0
Let γ be the dot product of a row in the design matrix with the parameters (plus the fixed parameter, if used). Then, the functions used to model the distribution parameter are given by:
Name          Function
Exponential   exp(γ)
Logistic      exp(γ)/(1 + exp(γ))
Probit        Φ(γ) (normal cdf)
Log-log       1 − exp(−exp(γ))
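As an illustration only (this is not part of the CTGLM interface), the following standalone sketch evaluates the four functions above for a single value of γ; the probit case builds the normal cdf from the Fortran ERF intrinsic, and the log-log line uses the form given in the table.

! Sketch: evaluate the four modeling functions for a linear response G.
! The normal cdf for the probit case is formed from the ERF intrinsic.
PROGRAM LINKFN
IMPLICIT NONE
REAL :: G
G = 0.5
PRINT *, 'Exponential: ', EXP(G)
PRINT *, 'Logistic:    ', EXP(G)/(1.0 + EXP(G))
PRINT *, 'Probit:      ', 0.5*(1.0 + ERF(G/SQRT(2.0)))
PRINT *, 'Log-log:     ', 1.0 - EXP(-EXP(G))
END PROGRAM LINKFN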
NCOEF — Number of estimated coefficients in the model. (Output)
COEF — NCOEF by 4 matrix containing the parameter estimates and associated statistics. (Output, if INIT = 0; input, if INIT = 1 and MAXIT = 0; input/output, if INIT = 1 and MAXIT > 0)
Col.   Statistic
1      Coefficient estimate.
2      Estimated standard deviation of the estimated coefficient.
3      Asymptotic normal score for testing that the coefficient is zero.
4      p-value associated with the normal score in column 3.
When INIT = 1, only the first column needs to be specified on input.
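As a sketch of how columns 3 and 4 follow from columns 1 and 2 (using the coefficient and standard deviation reported for the second coefficient in the first example, and the ERF intrinsic for the normal cdf; this is not IMSL code):

! Sketch: asymptotic normal score (column 3) and two-sided p-value
! (column 4) formed from a coefficient estimate and its standard
! deviation (columns 1 and 2).
PROGRAM COEFZ
IMPLICIT NONE
REAL :: EST, SD, Z, P
EST = 34.30                    ! column 1 (from the first example's output)
SD  = 2.92                     ! column 2
Z   = EST/SD                   ! column 3
P   = 2.0*(1.0 - 0.5*(1.0 + ERF(ABS(Z)/SQRT(2.0))))   ! column 4
PRINT *, 'Z =', Z, '  p-value =', P
END PROGRAM COEFZ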
COV — NCOEF by NCOEF matrix containing the estimated asymptotic covariance matrix of the coefficients. (Output)
For MAXIT = 0, this is the Hessian computed at the initial parameter estimates.
XMEAN — Vector of length NCOEF containing the means of the design variables. (Output)
GR — Vector of length NCOEF containing the last parameter updates (excluding step halvings). (Output)
For MAXIT = 0, GR contains the inverse of the Hessian times the gradient vector, all computed at the initial parameter estimates.
Optional Arguments
NOBS — Number of observations. (Input)
Default: NOBS = size (X,1).
NCOL — Number of columns in X. (Input)
Default: NCOL = size (X,2).
LDX — Leading dimension of X exactly as specified in the dimension statement in the calling program. (Input)
Default: LDX = size (X,1).
ILT — For full‑interval and left‑interval observations, the column number in X that contains the upper endpoint of the observation interval. (Input)
See argument ICEN. If ILT = 0, left‑interval and full‑interval observations cannot be input.
Default: ILT = 0.
IRT — For full‑interval and right‑interval observations, the column number in X that contains the lower endpoint of the interval. (Input)
For point observations, X(i, IRT) contains the observation point. IRT must not be zero. See argument ICEN. In the usual case, all observations are “point” observations.
Default: IRT = size (X,2).
IFRQ — Column number in X containing the frequency of response for each observation. (Input)
If IFRQ = 0, a response frequency of 1 for each observation is assumed.
Default: IFRQ = 0.
IFIX — Column number in X containing a fixed parameter for each observation that is added to the linear response prior to computing the model parameter. (Input)
The “fixed” parameter allows one to test hypotheses about the parameters via the log-likelihoods. If IFIX = 0, the fixed parameter is assumed to be 0.
Default: IFIX = 0.
IPAR — Column number in X containing an optional distribution parameter for each observation. (Input)
Default: IPAR = 0.
If IPAR = 0, the distribution parameter is assumed to be 1. The meaning of the distributional parameter depends upon MODEL as follows:
MODEL   Meaning of X(i, IPAR)
0       The Poisson parameter is given by X(i, IPAR) * exp(γ).
1       The number of successes required in the negative binomial is given by X(i, IPAR).
2       X(i, IPAR) is not used.
3-5     The number of trials in the binomial distribution is given by X(i, IPAR).
ICEN — Column number in X containing the interval‑type for each observation. (Input)
Default: ICEN = 0.
If ICEN = 0, a code of 0 is assumed. Valid codes are
X(i, ICEN)   Censoring
0            Point observation. The response is unique and is given by X(i, IRT).
1            Right-interval. The response is greater than or equal to X(i, IRT) and less than or equal to the upper bound, if any, of the distribution.
2            Left-interval. The response is less than or equal to X(i, ILT) and greater than or equal to the lower bound of the distribution.
3            Full-interval. The response is greater than or equal to X(i, IRT), but less than or equal to X(i, ILT).
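The following sketch (hypothetical data, with ILT, IRT, and ICEN arbitrarily placed in columns 1, 2, and 3) shows one way each interval type might be encoded in a row of X:

! Sketch: hypothetical rows of X with ILT in column 1, IRT in column 2
! and ICEN in column 3, one row per interval type.
PROGRAM CODES
IMPLICIT NONE
REAL :: X(4,3)
INTEGER :: I
X(1,:) = (/ 0.0, 5.0, 0.0 /)   ! point observation: response is 5
X(2,:) = (/ 0.0, 4.0, 1.0 /)   ! right-interval: response >= 4
X(3,:) = (/ 6.0, 0.0, 2.0 /)   ! left-interval: response <= 6
X(4,:) = (/ 9.0, 4.0, 3.0 /)   ! full-interval: 4 <= response <= 9
DO I = 1, 4
   PRINT '(3F6.1)', X(I,:)
END DO
END PROGRAM CODES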
INFIN — Method to be used for handling infinite estimates. (Input)
Default: INFIN = 1.
INFIN   Method
0       Remove a right- or left-censored observation from the log-likelihood whenever the probability of the observation exceeds 0.995. At convergence, use linear programming to check that all removed observations actually have an estimated linear response that is infinite. Set IADDS(i) for observation i to 2 if the linear response is infinite. If not all removed observations have infinite linear response, recompute the estimates based upon the observations with an estimated linear response that is finite. This option is valid only for censoring codes (see ICEN) 1 and 2.
1       Iterate without checking for infinite estimates.
See the Description section for more discussion.
MAXIT — Maximum number of iterations. (Input)
MAXIT = 30 is usually sufficient. Use MAXIT = 0 to compute the Hessian, stored in COV, and the Newton step, stored in GR, at the initial estimates.
Default: MAXIT = 30.
EPS — Convergence criterion. (Input)
Convergence is assumed when the maximum relative change in any coefficient estimate is less than EPS from one iteration to the next or when the relative change in the log‑likelihood, ALGL, from one iteration to the next is less than EPS/100. If EPS is negative, EPS = 0.001 is assumed.
Default: EPS = .001.
INTCEP — Intercept option. (Input)
Default: INTCEP = 1.
INTCEP   Action
0        No intercept is in the model (unless otherwise provided for by the user).
1        Intercept is automatically included in the model.
NCLVAR — Number of classification variables. (Input)
Dummy or indicator variables are generated for classification variables using the IDUMMY = 2 option of IMSL routine GRGLM (Chapter 2, “Regression”). See Comment 3.
Default: NCLVAR = 0.
INDCL — Index vector of length NCLVAR containing the column numbers of X that are classification variables. (Input, if NCLVAR is positive; not used otherwise). If NCLVAR is 0, INDCL is not referenced and can be dimensioned of length 1 in the calling program.
NEF — Number of effects in the model. (Input)
In addition to effects involving classification variables, simple covariates and the product of simple covariates are also considered effects.
Default: NEF = 0.
NVEF — Vector of length NEF containing the number of variables associated with each effect in the model. (Input, if NEF is positive; not used otherwise).
If NEF is zero, NVEF is not used and can be dimensioned of length 1 in the calling program.
INDEF — Index vector of length NVEF(1) + NVEF(2) + … + NVEF(NEF) containing the column numbers in X associated with each effect. (Input, if NEF is positive; not used otherwise)
The first NVEF(1) elements of INDEF give the column numbers of the variables in the first effect. The next NVEF(2) elements give the column numbers for the second effect, etc. If NEF is zero, INDEF is not used and can be dimensioned of length 1 in the calling program.
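For example, a model containing two simple covariates (hypothetically in columns 1 and 2 of X) together with their product could be described by the following settings; this is only a sketch of the layout, not a complete CTGLM call:

! Sketch: three effects -- column 1, column 2, and the product of
! columns 1 and 2 -- described by NEF, NVEF and INDEF.
PROGRAM EFFECTS
IMPLICIT NONE
INTEGER, PARAMETER :: NEF = 3
INTEGER :: NVEF(NEF), INDEF(4)
DATA NVEF /1, 1, 2/        ! number of variables in each effect
DATA INDEF /1, 2, 1, 2/    ! effect 1: col 1; effect 2: col 2; effect 3: cols 1 and 2
PRINT *, 'NVEF  =', NVEF
PRINT *, 'INDEF =', INDEF
END PROGRAM EFFECTS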
INIT — Initialization option. (Input)
Default: INIT = 0.
INIT   Action
0      Unweighted linear regression is used to obtain initial estimates.
1      The NCOEF elements in the first column of COEF contain initial estimates of the parameters on input to CTGLM (requiring that the user know NCOEF prior to calling CTGLM).
IPRINT — Printing option. (Input)
Default: IPRINT = 0.
IPRINT   Action
0        No printing is performed.
1        Printing is performed, but observational statistics are not printed.
2        All output statistics are printed.
MAXCL — An upper bound on the sum of the number of distinct values taken on by each classification variable. (Input)
Default: If NCLVAR = 0, then MAXCL = 1, else MAXCL = 3 * NCLVAR.
NCLVAL — Vector of length NCLVAR containing the number of values taken by each classification variable. (Output, if NCLVAR is positive; not used otherwise)
NCLVAL(i) is the number of distinct values for the i‑th classification variable. If NCLVAR is zero, NCLVAL is not used and can be dimensioned of length 1 in the calling program.
CLVAL — Vector of length NCLVAL(1) + NCLVAL(2) + … + NCLVAL(NCLVAR) containing the distinct values of the classification variables in ascending order. (Output, if NCLVAR is positive; not used otherwise)
Since in general the length of CLVAL will not be known in advance, MAXCL (an upper bound for this length) should be used for purposes of dimensioning CLVAL. The first NCLVAL(1) elements of CLVAL contain the values for the first classification variable, the next NCLVAL(2) elements contain the values for the second classification variable, etc. If NCLVAR is zero, then CLVAL is not referenced and can be dimensioned of length 1 in the calling program.
LDCOEF — Leading dimension of COEF exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOEF = size (COEF,1).
ALGL — Value of the log‑likelihood evaluated at the final estimates. (Output)
LDCOV — Leading dimension of COV exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCOV = size (COV,1).
CASE — NOBS by 5 matrix containing the case analysis. (Output)
Col.   Statistic
1      Predicted parameter.
2      The residual.
3      The estimated standard error of the residual.
4      The estimated influence of the observation.
5      The standardized residual.
Case statistics are computed for all observations except where missing values prevent their computation.
The predicted parameter in column 1 depends upon MODEL as follows.
MODEL   Parameter
0       The predicted mean for the observation.
1-5     The probability of a success on a single trial.
LDCASE — Leading dimension of CASE exactly as specified in the dimension statement in the calling program. (Input)
IADDS — Vector of length NOBS indicating which observations are included in the extended likelihood. (Output, if MAXIT > 0; input/output, if MAXIT = 0)
Value   Status of Observation
0       Observation i is in the likelihood.
1       Observation i cannot be in the likelihood because it contains at least one missing value in X.
2       Observation i is not in the likelihood. Its estimated parameter is infinite.
For MAXIT = 0, the IADDS array must be initialized prior to calling CTGLM. In this case, some elements of IADDS may be set to 1 by CTGLM, but no check for infinite estimates is performed.
NRMISS — Number of rows of data in X that contain missing values in one or more columns ILT, IRT, IFRQ, IFIX, IPAR, ICEN, INDCL, or INDEF of X. (Output)
FORTRAN 90 Interface
Generic: CALL CTGLM (X, MODEL, NCOEF, COEF, COV, XMEAN, GR [, …])
Specific: The specific interface names are S_CTGLM and D_CTGLM.
FORTRAN 77 Interface
Single: CALL CTGLM (NOBS, NCOL, X, LDX, MODEL, ILT, IRT, IFRQ, IFIX, IPAR, ICEN, INFIN, MAXIT, EPS, INTCEP, NCLVAR, INDCL, NEF, NVEF, INDEF, INIT, IPRINT, MAXCL, NCLVAL, CLVAL, NCOEF, COEF, LDCOEF, ALGL, COV, LDCOV, XMEAN, CASE, LDCASE, GR, IADDS, NRMISS)
Double: The double precision name is DCTGLM.
Description
Routine CTGLM uses iteratively reweighted least squares to compute (extended) maximum likelihood estimates in some generalized linear models involving categorized data. One of several models, including the probit, logistic, Poisson, logarithmic, and negative binomial models, may be fit for input point or interval observations. (In the usual case, only point observations are observed.)
Let
γi = wi + xiᵀβ
be the linear response, where xi is a design column vector obtained from a row of X, β is the column vector of coefficients to be estimated, and wi is a fixed parameter that may be input in X. When some of the γi are infinite at the supremum of the likelihood, extended maximum likelihood estimates are computed. These are the finite (but nonunique) estimates that optimize the likelihood containing only the observations with finite γi. These estimates, when combined with the set of indices of the observations for which γi is infinite at the supremum of the likelihood, are called extended maximum likelihood estimates. When none of the optimal γi are infinite, extended maximum likelihood estimates are identical to maximum likelihood estimates. Extended maximum likelihood estimation is discussed in more detail by Clarkson and Jennrich (1991). In CTGLM, observations with potentially infinite γi are detected and removed from the likelihood if INFIN = 0. See below.
The models available in CTGLM are:
MODEL   Name            Parameterization                  PDF
0       Poisson         λ = N exp(γ)                      f(y) = exp(−λ) λ^y / y!
1       Neg. Binomial   θ = exp(γ)/(1 + exp(γ))           f(y) = C(y + S − 1, y) θ^S (1 − θ)^y
2       Logarithmic     θ = exp(γ)/(1 + exp(γ))           f(y) = −(1 − θ)^y / (y ln θ)
3       Logistic        θ = exp(γ)/(1 + exp(γ))           f(y) = C(N, y) θ^y (1 − θ)^(N − y)
4       Probit          θ = Φ(γ)                          f(y) = C(N, y) θ^y (1 − θ)^(N − y)
5       Log-log         θ = 1 − exp(−exp(γ))              f(y) = C(N, y) θ^y (1 − θ)^(N − y)
Here, γ is the linear response defined above (including the fixed parameter w, which is specified for each observation via column IFIX of X), Φ denotes the cumulative normal distribution, C(n, k) = n!/(k!(n − k)!) is the binomial coefficient, and N and S are known parameters specified for each observation via column IPAR of X. (If IPAR = 0, then N is taken to be 1 for MODEL = 0, 3, 4 and 5, and S is taken to be 1 for MODEL = 1. If IFIX = 0, then w is taken to be 0.) Because the log-log model (MODEL = 5) probabilities are not symmetric with respect to 0.5, quantitatively as well as qualitatively different models result when the definitions of “success” and “failure” are interchanged in this distribution. In this model and all other models involving θ, θ is taken to be the probability of a “success.”
Note that each row vector in the data matrix can represent a single observation; or, through the use of column IFRQ of the matrix X, each vector can represent several observations. Also note that classification variables and their products are easily incorporated into the models via the usual regression‑type specifications.
Computational Details
For interval observations, the probability of the observation is computed by summing the probability distribution function over the range of values in the observation interval. For right-interval observations, Pr(Y ≥ y) is computed as a sum based upon the equality Pr(Y ≥ y) = 1 − Pr(Y < y). Derivatives are computed similarly. CTGLM allows three types of interval observations. In full-interval observations, both the lower and the upper endpoints of the interval must be specified. For right-interval observations, only the lower endpoint need be given, while for left-interval observations, only the upper endpoint is given.
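As a sketch of this summation (using a Poisson distribution with a hypothetical mean, not CTGLM internals), the probability of a full-interval observation with endpoints 4 and 9 is accumulated term by term, and a right-interval probability is obtained from the complement:

! Sketch: Poisson interval probabilities by summing the pdf.
PROGRAM INTPROB
IMPLICIT NONE
REAL :: RATE, PFULL, PLT
INTEGER :: Y
RATE = 5.0                         ! hypothetical Poisson mean
PFULL = 0.0
DO Y = 4, 9                        ! Pr(4 <= Y <= 9): sum over the interval
   PFULL = PFULL + PPDF(Y, RATE)
END DO
PLT = 0.0
DO Y = 0, 3                        ! Pr(Y < 4)
   PLT = PLT + PPDF(Y, RATE)
END DO
PRINT *, 'Pr(4 <= Y <= 9) =', PFULL
PRINT *, 'Pr(Y >= 4)      =', 1.0 - PLT   ! 1 - Pr(Y < 4)
CONTAINS
REAL FUNCTION PPDF(J, R)           ! Poisson pdf: exp(-R) * R**J / J!
INTEGER, INTENT(IN) :: J
REAL, INTENT(IN) :: R
INTEGER :: K
REAL :: FACT
FACT = 1.0
DO K = 2, J
   FACT = FACT*REAL(K)
END DO
PPDF = EXP(-R)*R**J/FACT
END FUNCTION PPDF
END PROGRAM INTPROB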
The computations proceed as follows:
1. The input parameters are checked for consistency and validity.
2. Estimates of the means of the “independent” or design variables are computed. The frequency of the observation in all but the binomial distribution models is taken from column IFRQ of the data matrix X. In binomial distribution models, the frequency is taken as the product of n = X(I, IPAR) and X(I, IFRQ). In all cases, if IFRQ = 0 or IPAR = 0, these values default to 1. Means are computed as
x̄ = Σ fi xi / Σ fi
where fi denotes the frequency of the i-th observation.
3. If INIT = 0, initial estimates of the coefficients are obtained (based upon the observation intervals) as multiple regression estimates relating transformed observation probabilities to the observation design vector. For example, in the binomial distribution models, θ for point observations may be estimated as
θ̂ = X(I, IRT)/X(I, IPAR)
and, when MODEL = 3, the linear relationship is given by
ln(θ̂/(1 − θ̂)) = xᵀβ
while if MODEL = 4,
Φ⁻¹(θ̂) = xᵀβ
For bounded interval observations, the midpoint of the interval is used for X(I,IRT). Right‑interval observations are not used in obtaining initial estimates when the distribution has unbounded support (since the midpoint of the interval is not defined). When computing initial estimates, standard modifications are made to prevent illegal operations such as division by zero.
Regression estimates are obtained at this point, as well as later, by use of routine RGIVN (see Chapter 2, “Regression”.)
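A sketch of the transformation in step 3 for one binomial point observation (counts taken from the first example; the guard against 0 and 1 stands in for the “standard modifications” mentioned above): θ is estimated as the observed proportion and mapped through the logit for MODEL = 3; MODEL = 4 would use the inverse normal cdf instead.

! Sketch: transformed observation probability used to start the
! iterations for one binomial point observation.
PROGRAM INITTR
IMPLICIT NONE
REAL :: Y, N, THETA, Z
Y = 28.0                                ! X(I,IRT): number of successes
N = 56.0                                ! X(I,IPAR): number of trials
THETA = Y/N                             ! crude estimate of the success probability
THETA = MIN(MAX(THETA, 0.001), 0.999)   ! keep the transform defined
Z = LOG(THETA/(1.0 - THETA))            ! logit (MODEL = 3); regressed on the design vector
PRINT *, 'theta-hat =', THETA, '  logit =', Z
END PROGRAM INITTR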
4. Newton-Raphson iteration for the maximum likelihood estimates is implemented via iteratively reweighted least squares. Let Ψi denote the log of the probability of the i-th observation for coefficients β. In the least-squares model, the weight of the i-th observation is taken as the absolute value of the second derivative of Ψi with respect to γi (times the frequency of the observation), and the dependent variable is taken as the first derivative of Ψi with respect to γi, divided by the square root of the weight times the frequency. The Newton step Δβ is computed as the estimated regression coefficients in this least-squares model, where all derivatives are evaluated at the current estimate of γ, and βn+1 = βn − Δβ. Step halving is used when necessary to ensure a decrease in the criterion.
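A minimal sketch of one such step for a one-coefficient logistic model with Bernoulli data (hypothetical data and starting value; CTGLM itself obtains the step through routine RGIVN): the weight is the absolute second derivative of the observation log-probability, and the update is the resulting Newton step.

! Sketch: one Newton (reweighted least-squares) step for a
! one-coefficient logistic model with 0/1 responses.
PROGRAM NEWTON
IMPLICIT NONE
REAL :: X(4), Y(4), BETA, G, THETA, GRAD, HESS
INTEGER :: I
DATA X /1.0, 2.0, 3.0, 4.0/        ! single covariate (hypothetical)
DATA Y /0.0, 0.0, 1.0, 1.0/        ! 0/1 responses (hypothetical)
BETA = 0.0                         ! current estimate
GRAD = 0.0
HESS = 0.0
DO I = 1, 4
   G     = X(I)*BETA                             ! linear response
   THETA = EXP(G)/(1.0 + EXP(G))                 ! logistic function
   GRAD  = GRAD + X(I)*(Y(I) - THETA)            ! first derivative of the log-likelihood
   HESS  = HESS + X(I)*X(I)*THETA*(1.0 - THETA)  ! x**2 times the weight |second derivative|
END DO
BETA = BETA + GRAD/HESS            ! Newton update (step halving omitted)
PRINT *, 'updated beta =', BETA
END PROGRAM NEWTON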
5. Convergence is assumed when the maximum relative change in any coefficient update from one iteration to the next is less than EPS or when the relative change in the log-likelihood from one iteration to the next is less than EPS/100. Convergence is also assumed after MAXIT iterations or when step halving leads to a step size of less than 0.0001 with no increase in the log-likelihood.
6. For interval observations, the contribution to the log‑likelihood is the log of the sum of the probabilities of each possible outcome in the interval. Because the distributions are discrete, the sum may involve many terms. The user should be aware that data with wide intervals can lead to expensive (in terms of computer time) computations.
7. If requested (INFIN = 0), then the methods of Clarkson and Jennrich (1991) are used to check for the existence of infinite estimates in the ηi. As an example of a situation in which infinite estimates can occur, suppose that observation j is right censored with tj > 15 in a logistic model. If the design matrix X is such that xjm = 1 and xim = 0 for all i ≠ j, then the optimal estimate of βm occurs at βm = ∞, leading to an infinite estimate of both βm and ηj. In CTGLM, such estimates may be “computed.”
In all models fit by CTGLM, infinite estimates can only occur when the optimal estimated probability associated with the left- or right-censored observation is 1. If INFIN = 0, left- or right-censored observations that have estimated probability greater than 0.995 at some point during the iterations are excluded from the log-likelihood, and the iterations proceed with a log-likelihood based upon the remaining observations. This allows convergence of the algorithm when the maximum relative change in the estimated coefficients is small and also allows for the determination of the observations with infinite ηi.
At convergence, linear programming is used to ensure that the eliminated observations have infinite ηi. If some (or all) of the removed observations should not have been removed (because their estimated ηi’s must be finite), then the iterations are restarted with a log‑likelihood based upon the finite ηi observations. See Clarkson and Jennrich (1991) for more details.
When INFIN = 1, no observations are eliminated during the iterations. In this case, when infinite estimates occur, some (or all) of the coefficient estimates will become large, and it is likely that the Hessian will become (numerically) singular prior to convergence.
When infinite estimates for the ηi are detected, routine RGIVN (see Chapter 2, “Regression”) is used at the convergence of the algorithm to obtain unique estimates β̂. This is accomplished by regressing the optimal ηi for the observations with finite ηi against Xβ, yielding a unique β̂ (by setting coefficients β̂ that are linearly related to previous coefficients in the model to zero). All of the final statistics relating to β̂ are based upon these estimates.
8. Residuals are computed according to methods discussed by Pregibon (1981). Let li(γi) denote the log-likelihood of the i-th observation evaluated at γi. Then, the standardized residual is computed as
ri = l′i(γ̂i)/√|l″i(γ̂i)|
where γ̂i is the value of γi evaluated at the optimal β̂, and the derivatives here (and only here) are with respect to γ rather than with respect to β. The denominator of this expression is used as the “standard error of the residual” while the numerator is the “raw” residual.
Following Cook and Weisberg (1982), we take the influence of the i-th observation to be
l′i(β̂)ᵀ [l″(β̂)]⁻¹ l′i(β̂)
This quantity is a one-step approximation to the change in the estimates when the i-th observation is deleted. Here, the partial derivatives are with respect to β, and l″(β̂) denotes the Hessian of the full log-likelihood.
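A sketch of these case statistics for a single Bernoulli observation under a one-coefficient logistic model (hypothetical values; with a scalar β the matrix inverse in the influence reduces to a division):

! Sketch: Pregibon-type residual and one-step influence for one
! Bernoulli observation in a one-coefficient logistic model.
PROGRAM CASEST
IMPLICIT NONE
REAL :: X, Y, BETA, THETA, D1, D2, SE, STDRES, HESS, INFL
X    = 2.0                                ! covariate (hypothetical)
Y    = 1.0                                ! observed response
BETA = 0.4                                ! fitted coefficient (hypothetical)
THETA = EXP(X*BETA)/(1.0 + EXP(X*BETA))   ! predicted probability
D1 = Y - THETA                            ! dl/d(gamma): the raw residual
D2 = THETA*(1.0 - THETA)                  ! |d2l/d(gamma)**2|
SE = SQRT(D2)                             ! standard error of the residual
STDRES = D1/SE                            ! standardized residual
HESS = X*X*D2                             ! |d2l/d(beta)**2| (only one observation here)
INFL = (X*D1)**2/HESS                     ! one-step influence
PRINT *, 'residual =', D1, ' std. residual =', STDRES, ' influence =', INFL
END PROGRAM CASEST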
Comments
1. Workspace may be explicitly provided, if desired, by use of C2GLM/DC2GLM. The reference is:
CALL C2GLM (NOBS, NCOL, X, LDX, MODEL, ILT, IRT, IFRQ, IFIX, IPAR, ICEN, INFIN, MAXIT, EPS, INTCEP, NCLVAR, INDCL, NEF, NVEF, INDEF, INIT, IPRINT, MAXCL, NCLVAL, CLVAL, NCOEF, COEF, LDCOEF, ALGL, COV, LDCOV, XMEAN, CASE, LDCASE, GR, IADD, NRMISS, NMAX, OBS, ADDX, XD, WK, KBASIS)
The additional arguments are as follows:
NMAX — Maximum number of observations that can be handled in the linear programming. (Input)
If workspace is not explicitly provided, NMAX is set to NMAX = (n − 8)/(7 + NCOEF) in S_CTGLM and NMAX = (n − 16)/(11 + 2 * NCOEF) in D_CTGLM, where n is the maximum number of units of workspace available after allocating space for OBS. In the typical problem, no linear programming is performed so that NMAX = 1 is sufficient. NMAX = NOBS is always sufficient. Even when extended maximum likelihood estimates are computed, NMAX = 30 will usually suffice. If INFIN = 1, set NMAX = 0.
OBS — Work vector of length NCOEF + 1.
ADDX — Logical work vector of length NMAX. ADDX is not needed and can be an array of length 1 in the calling program if NMAX = 0.
XD — Work vector of length NMAX * NCOEF. XD is not needed and can be an array of length 1 in the calling program if NMAX = 0.
WK — Work vector of length 4 * NMAX. WK is not needed and can be an array of length 1 in the calling program if NMAX = 0.
KBASIS — Work vector of length 2 * NMAX. KBASIS is not needed and can be an array of length 1 in the calling program if NMAX = 0.
2. Informational errors
Type   Code   Description
3      1      There were too many iterations required. Convergence is assumed.
3      2      There were too many step halvings. Convergence is assumed.
4      3      The number of distinct values of the classification variables exceeds MAXCL.
4      4      The number of distinct values of a classification variable must be greater than one.
4      5      LDCOEF or LDCOV must be greater than or equal to NCOEF.
4      6      The number of observations to be deleted has exceeded NMAX. Rerun with a different model or increase the workspace.
3. Dummy variables are generated for the classification variables as follows: An ascending list of all distinct values of each classification variable is obtained and stored in CLVAL. Dummy variables are then generated for each but the last of these distinct values. Each dummy variable is zero unless the classification variable equals the list value corresponding to the dummy variable, in which case the dummy variable is one. See argument IDUMMY for IDUMMY = 2 in routine GRGLM in Chapter 2.
4. The “product” of a classification variable with a covariate yields dummy variables equal to the product of the covariate with each of the dummy variables associated with the classification variable.
5. The “product” of two classification variables yields dummy variables in the usual manner. Each dummy variable associated with the first classification variable multiplies each dummy variable associated with the second classification variable. The resulting dummy variables are such that the index of the second classification variable varies fastest.
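A sketch of the dummy-variable scheme of Comment 3 (hypothetical classification values 10, 20 and 30; this is not the GRGLM routine itself): a dummy column is produced for every distinct value except the last.

! Sketch: IDUMMY = 2 style dummy variables for one classification
! variable with distinct values 10, 20, 30.
PROGRAM DUMMY2
IMPLICIT NONE
REAL :: CLASS(5), CLVAL(3), D(5,2)
INTEGER :: I, J
DATA CLASS /10.0, 30.0, 20.0, 10.0, 30.0/   ! raw classification column
DATA CLVAL /10.0, 20.0, 30.0/               ! distinct values, ascending
DO I = 1, 5
   DO J = 1, 2                              ! all but the last distinct value
      IF (CLASS(I) == CLVAL(J)) THEN
         D(I,J) = 1.0
      ELSE
         D(I,J) = 0.0
      END IF
   END DO
   PRINT '(3F6.1)', CLASS(I), D(I,:)
END DO
END PROGRAM DUMMY2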
Programming Notes
1. Classification variables are specified via arguments NCLVAR and INDCL. Indicator or dummy variables are created for the classification variables using routine GRGLM (see Chapter 2, “Regression”) with IDUMMY = 2.
2. To enhance precision, “centering” of covariates is performed if INTCEP = 1 and NOBS − NRMISS > 1. In doing so, the sample means of the design variables are subtracted from each observation prior to its inclusion in the model. On convergence, the intercept, its variance, and its covariance with the remaining estimates are transformed to the uncentered estimate values.
3. Two methods for specifying a binomial distribution model are possible. In the first method, X(I, IFRQ) contains the frequency of the observation while X(I, IRT) is 0 or 1 depending upon whether the observation is a success or failure. In this case, N = X(I, IPAR) is always 1. The model is treated as repeated Bernoulli trials, and interval observations are not possible.
A second method for specifying binomial models is to use X(I, IRT) to represent the number of successes in the X(I, IPAR) trials. In this case, X(I, IFRQ) will usually be 1, but it may be greater than 1, in which case interval observations are possible.
The estimated coefficients will be identical between the two methods, but the log-likelihood will differ by a constant term determined by the binomial coefficient. Specifically, if the model is treated as binomial trials, each observation with
yi = X(I, IRT), Ni = X(I, IPAR), fi = X(I, IFRQ)
contributes
fi ln(Ni!/(yi!(Ni − yi)!))
to the log-likelihood. For Bernoulli data, this term evaluates to 0 for every observation.
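A sketch of that constant term for one observation (counts taken from the first row of the Example 1 data; the log-factorial helper below is only for illustration): the contribution is fi times the log of the binomial coefficient, and it is zero whenever Ni = 1.

! Sketch: binomial-coefficient term f * ln( N!/(y!(N-y)!) ) contributed
! to the log-likelihood by one observation.
PROGRAM BINTERM
IMPLICIT NONE
INTEGER :: N, Y
REAL :: F, TERM
N = 59                      ! X(I,IPAR): number of trials
Y = 6                       ! X(I,IRT):  number of successes
F = 1.0                     ! X(I,IFRQ): frequency
TERM = F*(LOGFAC(N) - LOGFAC(Y) - LOGFAC(N - Y))
PRINT *, 'log binomial-coefficient term =', TERM
CONTAINS
REAL FUNCTION LOGFAC(K)     ! ln(K!)
INTEGER, INTENT(IN) :: K
INTEGER :: J
LOGFAC = 0.0
DO J = 2, K
   LOGFAC = LOGFAC + LOG(REAL(J))
END DO
END FUNCTION LOGFAC
END PROGRAM BINTERM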
Examples
Example 1
The first example is from Prentice (1976) and involves the mortality of beetles after exposure to various concentrations of carbon disulphide. Both a logit and a probit fit are produced for the linear model
μ + βx
The data is given as:
Covariate (x)     N     y
1.690            59     6
1.724            60    13
1.755            62    18
1.784            56    28
1.811            63    52
1.836            59    53
1.861            62    61
1.883            60    60
 
USE UMACH_INT
USE CTGLM_INT
 
IMPLICIT NONE
INTEGER IPAR, IRT, LDCASE, LDCOEF, LDCOV, LDX, NCOL, &
NEF, NOBS, NOUT, MAXCL
REAL EPS
PARAMETER (EPS=0.0001, IPAR=2, IRT=3, LDCASE=8, LDCOEF=2, LDCOV=2, &
LDX=8, MAXCL=1, NCOL=3, NEF=1, NOBS=8)
!
INTEGER INDCL(MAXCL), INDEF(1), IPRINT, MODEL, NCLVAL(1), &
NCOEF, NRMISS, NVEF(1)
REAL ALGL, CASE(LDCASE,5), CLVAL(1), COEF(LDCOEF,4), &
COV(LDCOV,4), GR(2), X(LDX,NCOL), XMEAN(2)
!
DATA NVEF/1/, INDEF/1/
DATA X/1.690, 1.724, 1.755, 1.784, 1.811, 1.836, 1.861, 1.883, &
59, 60, 62, 56, 63, 59, 62, 60, 6, 13, 18, 28, 52, 53, 61, &
60/
!
IPRINT = 2
CALL UMACH(2, NOUT)
DO 10 MODEL=3, 4
WRITE(NOUT, *) 'MODEL=', MODEL
CALL CTGLM (X, MODEL, NCOEF, COEF, COV, XMEAN, GR, IRT=IRT, &
IPAR=IPAR, EPS=EPS, NEF=NEF, NVEF=NVEF, INDEF=INDEF, IPRINT=IPRINT)
IPRINT = 1
10 CONTINUE
!
END
Output
 
Model = 3
 
Initial Estimates
1 2
-63.27 35.84
 
Method Iteration Step size Maximum scaled Log
coef. update likelihood
Q-N 0 -20.31
Q-N 1 1.0000 0.1387 -19.25
N-R 2 1.0000 0.6112E-01 -18.89
N-R 3 1.0000 0.7221E-01 -18.78
N-R 4 1.0000 0.6362E-03 -18.78
N-R 5 1.0000 0.3044E-06 -18.78
 
Log-likelihood -18.77818
 
Coefficient Statistics
Standard Asymptotic Asymptotic
Coefficient Error Z-statistic P-value
1 -60.76 5.19 -11.66 0.00
2 34.30 2.92 11.76 0.00
 
Asymptotic Coefficient Covariance
1 2
1 0.2691E+02 -0.1512E+02
2 0.8505E+01
 
Case Analysis
Residual Standardized
Predicted Residual Std. Error Leverage Residual
1 0.058 2.593 1.792 0.267 1.448
2 0.164 3.139 2.871 0.347 1.093
3 0.363 -4.498 3.786 0.311 -1.188
4 0.606 -5.952 3.656 0.232 -1.628
5 0.795 1.890 3.202 0.269 0.590
6 0.902 -0.195 2.288 0.238 -0.085
7 0.956 1.743 1.619 0.198 1.077
8 0.979 1.278 1.119 0.138 1.143
 
Last Coefficient Update
1 2
1.104E-07 -2.295E-07
 
Covariate Means
1.793
 
 
Observation Codes
1 2 3 4 5 6 7 8
0 0 0 0 0 0 0 0
 
Number of Missing Values 0
 
Model = 4
 
Log-likelihood -18.23232
 
Coefficient Statistics
Standard Asymptotic Asymptotic
Coefficient Error Z-statistic P-value
1 -34.94 2.64 -13.23 0.00
2 19.74 1.49 13.29 0.00
Note that the probit model yields a slightly smaller absolute log‑likelihood and, thus, is preferred. For this data, a model based upon the log‑log transformation function is even better. See Prentice (1976) for details.
Example 2
As a second example, the following data illustrate the Poisson model when all types of interval data are present. The example also illustrates the use of classification variables and the detection of potentially infinite estimates (which turn out here to be finite). These potential estimates lead to the two iteration summaries. The input data is
ILT   IRT   ICEN   Class 1   Class 2
 0     5     0        1         0
 9     4     3        0         0
 0     4     1        0         0
 9     0     2        1         1
 0     1     0        0         1
A linear model
μ + β1x1 + β2x2
is fit, where x1 = 1 if the Class 1 variable is 0 and x1 = 0 otherwise; the x2 variable is defined similarly in terms of the Class 2 variable.
 
USE CTGLM_INT
 
IMPLICIT NONE
INTEGER ICEN, IFIX, ILT, INFIN, IPAR, IPRINT, IRT, &
LDCASE, LDCOEF, LDCOV, LDX, MAXCL, MAXIT, MODEL, &
NCLVAR, NCOL, NEF, NOBS
REAL EPS
PARAMETER (ICEN=3, ILT=1, INFIN=0, IPAR=2, IPRINT=2, &
IRT=2, LDCASE=5, LDCOEF=4, LDCOV=4, LDX=5, MAXCL=4, &
MODEL=0, NCLVAR=2, NCOL=5, NEF=2, NOBS=5)
!
INTEGER IADD(NOBS), INDCL(NCLVAR), INDEF(2), NCLVAL(MAXCL), &
NCOEF, NRMISS, NVEF(NEF)
REAL ALGL, CASE(LDCASE,5), CLVAL(4), COEF(LDCOEF,4), &
COV(LDCOV,4), GR(5), X(LDX,NCOL), XMEAN(3)
!
DATA INDCL/4, 5/, NVEF/1, 1/, INDEF/4, 5/
DATA X/0, 9, 0, 9, 0, 5, 4, 4, 0, 1, 0, 3, 1, 2, 0, 1, 0, 0, 1, &
0, 0, 0, 0, 1, 1/
!
CALL CTGLM (X, MODEL, NCOEF, COEF, COV, XMEAN, GR, ILT=ILT, &
IRT=IRT, IPAR=IPAR, ICEN=ICEN, INFIN=INFIN, &
NCLVAR=NCLVAR, NCLVAL=NCLVAL, CLVAL=CLVAL, INDCL=INDCL, &
NEF=NEF, NVEF=NVEF, INDEF=INDEF, IPRINT=IPRINT, MAXCL=MAXCL)
!
END
Output
 
Initial Estimates
1 2 3
0.2469 0.4463 -0.0645
 
Method Iteration Step size Maximum scaled Log
coef. update likelihood
Q-N 0 -3.529
Q-N 1 0.2500 5.168 -3.262
N-R 2 0.0625 183.4 -3.134
N-R 3 1.0000 0.7438 -3.006
N-R 4 1.0000 0.2108 -3.005
N-R 5 1.0000 0.5559E-02 -3.005
 
Method Iteration Step size Maximum scaled Log
coef. update likelihood
Q-N 0 -3.529
Q-N 1 0.2500 5.168 -3.262
N-R 2 0.0625 183.4 -3.217
N-R 3 1.0000 1.128 -3.116
N-R 4 1.0000 0.1673 -3.115
N-R 5 1.0000 0.4418E-02 -3.115
 
Log-likelihood -3.114638
 
Coefficient Statistics
Standard Asymptotic Asymptotic
Coefficient Error Z-statistic P-value
1 -0.549 1.171 -0.517 0.605
2 0.549 0.610 0.900 0.368
3 0.549 1.083 0.507 0.612
 
Asymptotic Coefficient Covariance
1 2 3
1 0.1372E+01 -0.3719E+00 -0.1172E+01
2 0.3719E+00 0.1719E+00
3 0.1172E+01
 
Case Analysis
Residual Standardized
Predicted Residual Std. Error Leverage Residual
1 5.000 0.000 2.236 1.000 0.000
2 6.925 -0.412 2.108 0.764 -0.196
3 6.925 0.412 1.173 0.236 0.351
4 0.000 0.000 0.000 0.000 NaN
5 1.000 0.000 1.000 1.000 0.000
 
Last Coefficient Update
1 2 3
-2.924E-07 -1.131E-08 7.075E-07
 
Covariate Means
1 2
0.6000 0.6000
 
Distinct Values For Each Class Variable
Variable 1: 0. 1.0
Variable 2: 0. 1.0
 
Observation Codes
1 2 3 4 5
0 0 0 0 0
 
Number of Missing Values 0