logisticRegression

Fit a binomial or multinomial logistic regression model using iteratively re-weighted least squares.

Synopsis

logisticRegression (nIndependent, nClasses, x, y)

Required Arguments

int nIndependent (Input)
The number of independent variables.
int nClasses (Input)
The number of discrete outcomes, or classes.
float x[] (Input)
An array of length nObservations × nIndependent containing the values of the independent variables corresponding to the responses in y.
float y[] (Input)
An array of length nObservations × nClasses containing the binomial (nClasses = 2) or multinomial (nClasses > 2) counts per class. In an alternate format, y is an array of length nObservations × (nClasses − 1) containing the counts for all but one class; the missing class is treated as the reference class. The optional argument groupCounts specifies this format for y. In a third format, y is an array of length nObservations containing the class IDs. See optional argument groups.
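
To make the three layouts concrete, the sketch below encodes one hypothetical data set (3 observations, nClasses = 3) in each format; all values are illustrative only:

# Default format: counts for every class, length nObservations * nClasses.
y_default = [5, 2, 3,
             1, 6, 3,
             4, 4, 2]

# groupCounts format: counts for all but the reference class,
# length nObservations * (nClasses - 1); the class-3 counts are implied
# by the per-observation totals supplied in frequencies.
y_group_counts = [5, 2,
                  1, 6,
                  4, 4]
frequencies = [10, 10, 10]

# groups format: one class ID per observation, length nObservations.
y_groups = [1, 3, 2]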

Return Value

An array of length nCoefficients × nClasses containing the estimated coefficients. The function fits a full model, where nCoefficients = 1 + nIndependent. The optional arguments noIntercept, xIndices, and xInteractions may be used to specify different models. Note that the last column (column nClasses) represents the reference class and is set to all zeros.

Optional Arguments

groupCounts (Input)

or

groups (Input)

These optional arguments specify the format of the input array y. If groupCounts is present, y is of length nObservations × (nClasses ‑ 1), and contains counts for all but one of the classes for each observation. The missing class is treated as the reference class.

If groups is present, the input array y is of length nObservations, and y[i] contains the group or class number to which the i-th observation belongs. In this case, frequencies[i] is set to 1 for all observations.

Default: y is of length nObservations × nClasses and contains counts for all of the classes.

columnWise (Input)

If present, the input arrays are column-oriented. That is, contiguous elements in x are values of the same independent variable (column), with a new variable beginning at each multiple of nObservations. See the layout sketch below.

Default: Input arrays are row-oriented.
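
For example, a hypothetical layout with nObservations = 4 and nIndependent = 2:

import numpy as np

# Row-oriented (default): each observation's values are contiguous.
x_row_wise = np.array([[1.0, 10.0],
                       [2.0, 20.0],
                       [3.0, 30.0],
                       [4.0, 40.0]])

# Column-oriented (columnWise): each variable's values are contiguous,
# as in Examples 2 and 3 below.
x_column_wise = x_row_wise.T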

frequencies, int[] (Input)

An array of length nObservations containing the number of replications or trials for each observation. This argument is required if groupCounts is specified and any element of y is greater than 1.

Default: frequencies[i] = 1.

referenceClass, int (Input)

Number specifying which class or outcome category to use as the reference class. See the Description section for details.

Note that the last column of coefficients always represents the reference class, so when referenceClass < nClasses, columns referenceClass and nClasses are swapped in the output; that is, the coefficients for class nClasses are returned in column referenceClass of coefficients. For example, if referenceClass = 1 and nClasses = 3, the first column of coefficients contains the coefficients for class 3 (nClasses), the second column contains the coefficients for class 2, and the third column contains all zeros for the reference class.

Default: referenceClass = nClasses

noIntercept (Input)

If present, the model will not include an intercept term.

Default: The intercept term is included.

xIndices, int[] (Input)

An array of length nXin containing the column indices of x corresponding to the independent variables to include in the logistic regression model. For example, suppose there are five independent variables x0, x1, …, x4. To fit a model that includes only x2 and x3, set xIndices = [2, 3]. See the sketch following xInteractions below.

Default: All nIndependent variables are included.

xInteractions, int[] (Input)

An array of length nXinteract × 2 containing pairs of column indices of x that define the interaction terms in the model. The two indices within each pair must be distinct. For example, suppose there are two independent variables x0 and x1. To fit a model that includes their interaction term, x0x1, set xInteractions = [0, 1], as in the sketch below.

Default: No interaction terms are included.
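
A hypothetical call sketch illustrating this option and the preceding xIndices option; the data values are made up, and convergence on such a tiny data set is not guaranteed:

from pyimsl.stat.logisticRegression import logisticRegression

# Hypothetical data: 6 observations, 5 independent variables, and a
# binary response in the groups format.
x = [[0.5, 1.2, 3.1, 0.7, 2.2],
     [1.5, 0.2, 2.9, 1.7, 0.4],
     [0.1, 2.2, 1.1, 0.3, 1.9],
     [2.5, 1.0, 0.6, 2.7, 0.8],
     [0.9, 0.4, 2.4, 1.1, 3.0],
     [1.8, 2.8, 0.2, 0.9, 1.3]]
y = [1, 2, 1, 2, 1, 2]

# Intercept plus x2 and x3 only:
coefs = logisticRegression(5, 2, x, y, groups=True, xIndices=[2, 3])

# All five variables plus the x0*x1 interaction term:
coefs = logisticRegression(5, 2, x, y, groups=True, xInteractions=[0, 1])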

tolerance, float (Input)

Convergence criterion. Iteration stops when the normed difference between successive estimates is less than tolerance or when maxIter iterations are reached.

Default: tolerance = 100.00 × machine(4)

maxIter, int (Input)

The maximum number of iterations.

Default: maxIter = 20

initInput, int (Input)

initInput must be 0 or 1. If initInput = 1, initial values for the coefficient estimates are provided in the user array coefficients. If initInput = 0, initial values are computed by the function.

Default: initInput = 0

prevResults, structure (Input)

A structure containing information about a previous logistic regression fit. The model is combined with the fit to the new data or to nextResults, if provided.

nextResults, structure (Input/Output)

A structure. If present and None, the structure is internally allocated and on output contains the model information. If present and not None, its contents are combined with the fit to the new data or to prevResults, if provided. The combined results are returned in nextResults.

coefficients, float[] (Input/Output)

Storage for the coefficient array of length nCoefficients × nClasses, provided by the user. When initInput = 1, coefficients should contain the desired initial values of the estimates.

lrstat, float (Output)

The value of the likelihood ratio test statistic.

Description

Function logisticRegression fits a logistic regression model for discrete dependent variables with two or more mutually exclusive outcomes or classes. For a binary response y, the objective is to model the conditional probability of success, \(\pi_1(x)=\Pr[y=1 \mid x]\), where \(x=(x_1,x_2,\ldots,x_p)'\) is a realization of p independent variables. Logistic regression models the conditional probability, \(\pi_1(x)\), using the cdf of the logistic distribution. In particular,

\[\pi_1(x) = \frac{\exp\left(\eta_1\right)}{1 + \exp\left(\eta_1\right)} \equiv \frac{1}{1 + \exp\left(-\eta_1\right)}\]

where

\[\eta_1 = \beta_{01} + x^T \beta_{1}\]

and

\[\beta_{01}, \beta_1 = \left(\beta_{11}, \beta_{12}, \ldots, \beta_{1p}\right)'\]

are unknown coefficients that are to be estimated.

Solving for the linear component \(\eta_1\) results in the log-odds or logit transformation of \(\pi_1(x)\):

\[\mathrm{logit}\left(\pi_1(x)\right) = \log \frac{\pi_1(x)}{1 - \pi_1(x)} = \eta_1\]
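
As a quick numeric check of these two identities, here is a short sketch; the coefficient values are taken from Example 1 below, and the log dosage x = 1.8 is chosen from the observed range:

from math import exp, log

beta01, beta11 = -60.76, 34.30        # Example 1 estimates (illustrative)
x = 1.8
eta1 = beta01 + beta11 * x
pi1 = exp(eta1) / (1.0 + exp(eta1))   # equals 1 / (1 + exp(-eta1))
print(pi1, log(pi1 / (1.0 - pi1)))    # the logit recovers eta1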

Given a set of N observations \((y_i,x_i)\), where \(y_i\) follows a binomial \((n,\pi)\) distribution with parameters \(n=1\) and \(\pi=\pi_1(x_i)\), the likelihood and log-likelihood are, respectively,

\[\begin{split}\begin{array}{l} L = \displaystyle\prod_{i=1}^{N} \pi \left( x_i \right)^{y_i} \left( 1 - \pi \left( x_i \right) \right)^{1 - y_i} \\ l = \displaystyle\sum_{i=1}^{N} \left[ y_i \log \left( \frac{\pi \left(x_i\right)}{1 - \pi \left(x_i\right)} \right) + \log \left( 1 - \pi \left(x_i\right) \right) \right] \end{array}\end{split}\]

The log-likelihood in terms of the parameters, \(\{\beta_{01},\beta_1\}\), is therefore

\[l\left(\beta_{01},\beta_1\right) = \sum_{i=1}^{N} \left[ y_i \eta_{i1} - \log\left(1 + \exp\left(\eta_{i1}\right)\right) \right]\]

where

\[\eta_{i1} = \beta_{01} + x_i^T \beta_{1}\]

With a binary outcome, only one probability needs to be modeled. The second probability can be obtained from the constraint, \(\pi_1(x)+\pi_2(x)=1\). If each \(y_i\) is the number of successes in \(n_i\) independent trials, the log-likelihood becomes

\[l = \sum_{i=1}^{N} \left[ y_i \log \left(\frac{\pi\left(x_i\right)}{1 - \pi\left(x_i\right)}\right) + n_i \log \left(1 - \pi\left(x_i\right)\right) \right]\]

or

\[l\left(\beta_{01}, \beta_1\right) = \sum_{i=1}^{N} \left[ y_i \eta_{i1} - n_i \log \left(1 + \exp\left(\eta_{i1}\right)\right) \right]\]

See the optional argument frequencies to specify \(n_i > 1\).
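
For instance, the grouped-data log-likelihood above can be evaluated directly. The sketch below does so for the beetle data of Example 1 at the coefficient estimates reported there (illustrative only):

import numpy as np

x = np.array([1.690, 1.724, 1.755, 1.784, 1.811, 1.836, 1.861, 1.883])
y = np.array([6, 13, 18, 28, 52, 53, 61, 60])    # deaths (successes)
n = np.array([59, 60, 62, 56, 63, 59, 62, 60])   # beetles exposed (trials)

def loglike(beta01, beta11):
    eta = beta01 + beta11 * x
    # l = sum_i [ y_i * eta_i - n_i * log(1 + exp(eta_i)) ]
    return np.sum(y * eta - n * np.log1p(np.exp(eta)))

print(loglike(-60.76, 34.30))   # evaluated at the Example 1 estimates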

To test the significance of the model, the log-likelihood of the fitted model is compared to that of an intercept-only model. In particular, \(G=-2 (l(\beta_{01})-l(\beta_{01},\beta_1))\) is a likelihood-ratio test statistic, and under the null hypothesis, \(H_0 : \beta_{11}=\beta_{12}=\ldots=\beta_{1p}=0\), G is distributed as chi-squared with p degrees of freedom. A significant result suggests that at least one parameter in the model is non-zero. See Hosmer and Lemeshow (2000) for further discussion.
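
A hedged sketch of this test, mirroring the computation in Example 2 below; the log-likelihood values here are hypothetical:

from pyimsl.stat.chiSquaredCdf import chiSquaredCdf

# Hypothetical log-likelihoods of the intercept-only (l0) and fitted (l1) models.
l0, l1 = -120.5, -62.9
G = -2.0 * (l0 - l1)                  # likelihood-ratio test statistic
p = 3                                 # number of independent variables (illustrative)
pvalue = 1.0 - chiSquaredCdf(G, p)    # small p-value => reject H0
print(G, pvalue)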

In the multinomial case, the response vector is \(y_i=(y_{i1},y_{i2},\ldots,y_{iK})'\), where \(y_{ik}=1\) when the i-th observation belongs to class k and \(y_{ik}=0\) otherwise. Furthermore, because the outcomes are mutually exclusive,

\[\sum_{k=1}^{K} y_{ik} = 1\]

and \(\pi_1(x)+\pi_2(x)+\cdots+\pi_K(x)=1\). The last class K serves as the baseline or reference class in the sense that it is not modeled directly but is found from

\[\pi_K(x) = 1 - \sum_{k=1}^{K-1} \pi_k(x)\]

If there are multiple trials, \(n_i>1\), then the constraint on the responses is

\[\sum_{k=1}^{K} y_{ik} = n_i\]

The log‑likelihood in the multinomial case becomes

\[l\left(\beta_{0l},\beta_l\right)_{l=1}^{K} = \sum_{i=1}^{N} \left[\sum_{l=1}^{K} y_{il} \eta_{il} - \log \left(\sum_{l=1}^{K} \exp\left(\eta_{il}\right)\right)\right]\]

or

\[l\left(\beta_{0l},\beta_{l}\right)_{l=1}^{K} = \sum_{i=1}^{N} \left[ \sum_{l=1}^{K} y_{il} \left( \beta_{0l} + x_i^T \beta_l \right) - \log \left( \sum_{l=1}^{K} \exp \left( \beta_{0l} + x_i^T \beta_l \right) \right) \right]\]

The constraint

\[\sum_{k=1}^{K} \pi_{ik} = 1\]

is handled by setting \(\eta_{iK}=0\) for the K‑th class, and then the log‑likelihood is

\[l\left(\beta_{0l},\beta_l\right)_{l=1}^{K-1} = \sum_{i=1}^{N} \left[ \sum_{l=1}^{K-1} y_{il} \left(\beta_{0l} + x_i^T \beta_l\right) - \log \left( 1 + \sum_{l=1}^{K-1} \exp \left(\beta_{0l} + x_i^T \beta_l\right) \right) \right]\]

Note that for the multinomial case, the log‑odds (or logit) is

\[\log \frac{\pi_{il}}{\pi_{iK}} = \beta_{0l} + x_i^T \beta_l, \quad l=1, \ldots, K-1\]

Each of these logits involves the odds of being in class l versus class K, the reference class. Maximum likelihood estimates can be obtained by solving the score equation for each parameter:

\[\begin{split}\begin{aligned} \frac{\partial l\left(\beta_{0l},\beta_l\right)}{\partial \beta_{jl}} &= \sum_{i=1}^{N} \left[ x_{ij}y_{il} - \frac{x_{ij}\exp\left(\eta_{il}\right)}{1 + \displaystyle\sum_{m=1}^{K-1}\exp\left(\eta_{im}\right)} \right] \\ &= \sum_{i=1}^{N}\left[x_{ij}\left(y_{il} - \pi_{il}\left(x_i\right)\right)\right] \\ &= 0 \end{aligned}\end{split}\]

To solve the score equations, the function employs a method known as iteratively re-weighted least squares (IRLS). In this case IRLS is equivalent to the Newton-Raphson algorithm (Hastie et al., 2009; Thisted, 1988).

Consider the full vector of parameters

\[\beta = \left(\beta_{01}, \beta_{11}, \ldots, \beta_{p1}, \beta_{02}, \beta_{12}, \ldots, \beta_{p2}, \ldots, \beta_{0,K-1}, \beta_{1,K-1}, \ldots, \beta_{p,K-1}\right)'\]

the Newton-Raphson iteration is

\[\beta^{n+1} = \beta^{n} - \mathrm{H}^{-1}\left(\beta^n\right)G\left(\beta^n\right)\]

where H denotes the Hessian matrix, i.e., the matrix of second partial derivatives defined by

\[\begin{split}\begin{aligned} \frac{\partial^2 l\left(\beta_{0l},\beta_l\right)}{\partial\beta_{kl}\,\partial\beta_{jl}} &= -\sum_{i=1}^{N}\left[ \frac{ x_{ij}x_{ik}\exp\left(\eta_{il}\right) \displaystyle\sum_{m=1}^{K} \exp\left(\eta_{im}\right) - x_{ij}x_{ik}\exp\left(\eta_{il}\right)\exp\left(\eta_{il}\right) }{ \left(\displaystyle\sum_{m=1}^{K}\exp\left(\eta_{im}\right)\right)^2 } \right] \\ &= -\sum_{i=1}^{N}\left[x_{ij}x_{ik}\,\pi_{il}\left(1-\pi_{il}\right)\right] \end{aligned}\end{split}\]

and

\[\frac{\partial^2 l\left(\beta_{0l},\beta_l\right)}{\partial\beta_{kv}\,\partial\beta_{jl}} = -\sum_{i=1}^{N}\left[\frac{-x_{ij}x_{ik}\exp\left(\eta_{il}\right)\exp\left(\eta_{iv}\right)}{\left(\displaystyle\sum_{m=1}^{K} \exp\left(\eta_{im}\right)\right)^2}\right] = \sum_{i=1}^{N}\left[x_{ij}x_{ik}\,\pi_{il}\pi_{iv}\right], \quad v \neq l\]

and G denotes the gradient vector, the vector of first partial derivatives,

\[\frac{\partial l\left(\beta_{0l},\beta_l\right)}{\partial \beta_{jl}}\]

Both the gradient and the Hessian are evaluated at the most recent estimate of the parameters, \(\beta^n\). The iteration continues until convergence or until the maximum number of iterations is reached. Following the theory of maximum likelihood estimation (Kendall and Stuart, 1979), standard errors are obtained from the inverse of the observed information matrix, \((-H)^{-1}\), evaluated at the final estimates.
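
For concreteness, here is a minimal Newton-Raphson sketch for the binary case (K = 2), written directly from the gradient and Hessian above. It illustrates the iteration only and is not the function's implementation; the data in the usage lines are synthetic:

import numpy as np

def newton_raphson_logistic(X, y, tol=1e-8, max_iter=20):
    """Newton-Raphson (IRLS) for binary logistic regression, as sketched above."""
    N, p = X.shape
    Xd = np.column_stack([np.ones(N), X])           # prepend the intercept column
    beta = np.zeros(p + 1)
    for _ in range(max_iter):
        eta = Xd @ beta
        pi = 1.0 / (1.0 + np.exp(-eta))             # pi_1(x_i)
        G = Xd.T @ (y - pi)                         # gradient (score) vector
        H = -(Xd.T * (pi * (1.0 - pi))) @ Xd        # Hessian: -sum x x' pi (1 - pi)
        step = np.linalg.solve(H, G)                # H^{-1} G
        beta = beta - step                          # beta^{n+1} = beta^n - H^{-1} G
        if np.linalg.norm(step) < tol:
            break
    stderrs = np.sqrt(np.diag(np.linalg.inv(-H)))   # from (-H)^{-1}
    return beta, stderrs

# Illustrative usage on synthetic Bernoulli data:
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_eta = 0.5 + X @ np.array([1.0, -2.0])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-true_eta))).astype(float)
beta, se = newton_raphson_logistic(X, y)
print(beta, se)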

When the nextResults option is specified, the function combines estimates of the same model from separate fits using the method presented in Xi, Lin, and Chen (2008). To illustrate, let \(\beta_1\) and \(\beta_2\) be the MLEs from separate fits to two different sets of data, and let \(H_1\) and \(H_2\) be the associated Hessian matrices. Then the combined estimate,

\[\beta = \left(\mathrm{H}_1 + \mathrm{H}_2\right)^{-1} \left(\mathrm{H}_1\beta_1 + \mathrm{H}_2\beta_2\right)\]

approximates the MLE of the combined data set. The model structure nextResults contains the combined estimates as well as other elements; see Table 5.1 below.

Table 5.1 — The Data Structure

Parameter    Data Type    Description
nObs         int          Total number of observations. If the model structure has been updated three times, first with 100 observations, next with 50, and third with 50, then nObs = 200.
n_updates    int          Total number of times the model structure has been updated. In the above scenario, n_updates = 3.
n_coefs      int          Number of coefficients in the model. This parameter must be the same for each model update.
loglike      float        Value of the log-likelihood at the current coefficient estimates (used in Example 2 below).
coefs        float[]      An array of length n_coefs × nClasses containing the coefficients.
meany        float[]      An array of length nClasses containing the overall means for each class variable.
stderrs      float[]      An array of length n_coefs × (nClasses − 1) containing the estimated standard errors for the estimated coefficients.
grad         float[]      An array of length n_coefs × (nClasses − 1) containing the estimated gradient at the coefficient estimates.
hess         float[]      An array of length (n_coefs × (nClasses − 1)) × (n_coefs × (nClasses − 1)) containing the estimated Hessian matrix at the coefficient estimates.
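
To make the combining rule concrete, here is a small numeric sketch with hypothetical Hessians and coefficient vectors; it checks the formula only and says nothing about the function's internals:

import numpy as np

# Hypothetical Hessians and MLEs from two separate fits of a 2-parameter model.
H1 = np.array([[-40.0, -5.0], [-5.0, -30.0]])
H2 = np.array([[-35.0, -4.0], [-4.0, -25.0]])
b1 = np.array([0.8, -1.1])
b2 = np.array([0.6, -0.9])

# beta = (H1 + H2)^{-1} (H1 b1 + H2 b2)
beta = np.linalg.solve(H1 + H2, H1 @ b1 + H2 @ b2)
print(beta)   # a Hessian-weighted average lying between b1 and b2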

Remarks

Iteration stops when the estimates converge within tolerance, when the maximum number of iterations is reached, or when the gradient converges to within tolerance of zero, whichever occurs first. When the gradient converges before the coefficient estimates converge, a condition known as complete or quasi-complete separation may be present in the data. Separation means that one or more of the independent variables perfectly predicts the response. When separation is detected, the function stops the iteration, issues a warning, and returns the current values of the model estimates; some of the coefficient estimates and standard errors may not be reliable. Furthermore, overflow may occur before the gradient converges; in that case the program issues a fatal error.
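
For instance, the following hypothetical data set is completely separated (x perfectly predicts the class), so a call like this sketch would be expected to stop early with the warning described above:

from pyimsl.stat.logisticRegression import logisticRegression

# Hypothetical, completely separated data: every x below 3 is in class 1
# and every x above 3 is in class 2.
x = [1.0, 2.0, 2.5, 3.5, 4.0, 5.0]
y = [1, 1, 1, 2, 2, 2]

coefs = logisticRegression(1, 2, x, y, groups=True)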

Examples

Example 1

The first example is from Prentice (1976) and involves the mortality of beetles after five hours' exposure to eight different concentrations of carbon disulphide. The table below lists the number of beetles exposed (N) at each concentration of carbon disulphide (x, given as log dosage) and the number of resulting deaths (y):

Log Dosage Number of Beetles Exposed Number of Deaths
1.690 59 6
1.724 60 13
1.755 62 18
1.784 56 28
1.811 63 52
1.836 59 53
1.861 62 61
1.883 60 60

The number of deaths at each concentration level is the binomial response (nClasses = 2), and the log dosage is the single independent variable. Note that this example illustrates the groupCounts format for y and the optional argument frequencies.

from numpy import *
from pyimsl.stat.logisticRegression import logisticRegression
from pyimsl.stat.writeMatrix import writeMatrix

x1 = [1.69, 1.724, 1.755, 1.784, 1.811, 1.836, 1.861, 1.883]
y1 = [6, 13, 18, 28, 52, 53, 61, 60]
freqs = [59, 60, 62, 56, 63, 59, 62, 60]
nclasses = 2
nIndependent = 1

coefs = logisticRegression(nIndependent, nclasses, x1, y1,
                           groupCounts=True,
                           frequencies=freqs)

labels = ['1', '2']
tmp = coefs.reshape(4)

writeMatrix("Coefficient Estimates", tmp[0:2],
            transpose=True,
            noColLabels=True,
            rowLabels=labels)

Output

 
Coefficient Estimates
   1       -60.76
   2        34.30

Example 2

In this example the response is a multinomial random variable with 4 outcome classes. The 3 independent variables represent 2 categorical variables and 1 continuous variable. A subset of 2 independent variables along with the intercept defines the logistic regression model. A test of significance is performed.

from __future__ import print_function
from numpy import *
from pyimsl.stat.logisticRegression import logisticRegression
from pyimsl.stat.chiSquaredCdf import chiSquaredCdf
from pyimsl.stat.writeMatrix import writeMatrix
n_classes = 4
n_observations = 50
n_independent = 3
n_coefs = 3
model = []
xindices = [0, 1]
lrstat = []

x = array([[3, 2, 2, 1, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 2, 3, 2, 1, 3, 2,
            2, 1, 2, 1, 3, 2, 1, 2, 1, 2, 3, 2, 1, 2, 1, 1, 2, 3, 1, 2,
            1, 1, 1, 3, 1, 3, 2, 3, 3, 1],
           [25.92869, 51.63245, 25.78432, 39.37948, 24.65058, 45.20084,
            52.6796, 44.28342, 40.63523, 51.76094, 26.30368, 20.70230,
            38.74273, 19.47333, 26.42211, 37.05986, 51.67043, 42.40156,
            33.90027, 35.43282, 44.30369, 46.72387, 46.99262, 36.05923,
            36.83197, 61.66257, 25.67714, 39.08567, 48.84341, 39.43391,
            24.73522, 50.55251, 31.34263, 27.15795, 31.72685, 25.00408,
            26.35457, 38.12343, 49.9403, 42.45779, 38.80948, 43.22799,
            41.87624, 48.0782, 43.23673, 39.41294, 23.93346,
            42.8413, 30.40669, 37.77389],
           [1, 2, 1, 1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2, 1, 2, 1,
            1, 1, 1, 1, 2, 2, 1, 2, 2, 1, 1, 2, 2, 2, 1, 1, 2, 1, 1, 2,
            2, 2, 1, 1, 2, 1, 1, 2, 1, 1]])

y = [1, 2, 3, 4, 3, 3, 4, 4, 4, 4, 2, 1, 4, 1, 1, 1, 4, 4, 3, 1,
     2, 3, 3, 4, 2, 3, 4, 1, 2, 4, 3, 4, 4, 1, 3, 4, 4, 2, 3, 4,
     2, 2, 4, 3, 1, 4, 3, 4, 2, 3]


coefs = logisticRegression(n_independent, n_classes, x, y,
                           groups=True,
                           columnWise=True,
                           xIndices=xindices,
                           lrstat=lrstat,
                           nextResults=model)


dof = n_coefs * (n_classes - 1) - (n_classes - 1)
model_pval = 1.0 - chiSquaredCdf(lrstat[0], dof)

rowLabels = [str(i) for i in range(1, 10)]

print("")
tmp = coefs.reshape(16)
writeMatrix("Coefficients", tmp[0:9],
            transpose=True,
            noColLabels=True,
            rowLabels=rowLabels)

print("")
writeMatrix("Std Errs", model[0].stderrs[0:9],
            transpose=True,
            noColLabels=True,
            rowLabels=rowLabels)

print("")
print("Log-likelihood: %5.2f" % model[0].loglike)
print("LR test statistic: %5.2f" % lrstat[0])
print("%d deg. freedom, p-value: %5.4f" % (dof, model_pval))

Output

Log-likelihood: -62.92
LR test statistic:  7.69
6 deg. freedom, p-value: 0.2621
 
 Coefficients
 1      2.294
 2      0.408
 3     -0.111
 4     -1.159
 5      0.245
 6     -0.002
 7     -0.064
 8      0.178
 9     -0.017
 
   Std Errs
 1      2.259
 2      0.548
 3      0.051
 4      2.122
 5      0.500
 6      0.044
 7      1.862
 8      0.442
 9      0.039

Example 3

Example 3 uses the same data as in Example 2 and an additional set of 50 observations using the same data generating process. The model structure includes all 3 independent variables and an intercept, and a single model fit is approximated from two separate model fits. Example 3 also includes a fit on the full data set for comparison purposes.

from __future__ import print_function
from numpy import *
from pyimsl.stat.logisticRegression import logisticRegression
from pyimsl.stat.chiSquaredCdf import chiSquaredCdf
from pyimsl.stat.writeMatrix import writeMatrix

x1 = [[3, 2, 2, 1, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 2, 3, 2, 1, 3, 2, 2,
       1, 2, 1, 3, 2, 1, 2, 1, 2, 3, 2, 1, 2, 1, 1, 2, 3, 1, 2, 1, 1,
       1, 3, 1, 3, 2, 3, 3, 1],
      [25.92869, 51.63245, 25.78432, 39.37948, 24.65058, 45.20084,
       52.6796, 44.28342, 40.63523, 51.76094, 26.30368, 20.70230,
       38.74273, 19.47333, 26.42211, 37.05986, 51.67043, 42.40156,
       33.90027, 35.43282, 44.30369, 46.72387, 46.99262, 36.05923,
       36.83197, 61.66257, 25.67714, 39.08567, 48.84341, 39.34391,
       24.73522, 50.55251, 31.34263, 27.15795, 31.72685, 25.00408,
       26.35457, 38.12343, 49.9403, 42.45779, 38.80948, 43.22799,
       41.87624, 48.0782, 43.23673, 39.41294, 23.93346,
       42.8413, 30.40669, 37.77389],
      [1, 2, 1, 1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2, 1, 2, 1, 1,
       1, 1, 1, 2, 2, 1, 2, 2, 1, 1, 2, 2, 2, 1, 1, 2, 1, 1, 2, 2, 2,
       1, 1, 2, 1, 1, 2, 1, 1]]

x2 = [[1, 1, 3, 3, 2, 3, 3, 3, 2, 1, 1, 1, 1, 3, 3, 2, 2, 3, 3, 2, 3,
       2, 1, 3, 3, 2, 2, 3, 3, 2, 1, 2, 1, 2, 3, 3, 1, 1, 2, 2, 3, 1,
       1, 2, 2, 1, 1, 2, 3, 1],
      [35.66064, 26.68771, 23.11251, 58.14765, 44.95038, 42.45634,
       34.97379, 53.54269, 32.57257, 46.91201, 30.93306, 51.63743,
       34.67712, 53.84584, 14.97474, 44.4485, 47.10448, 43.96467,
       55.55741, 36.63123, 32.35164, 55.75668, 36.83637, 46.7913,
       44.24153, 49.94011, 41.91916, 24.78584, 50.79019, 39.97886,
       34.42149, 41.93271, 28.59433, 38.47255, 32.11676, 37.19347,
       52.89337, 34.64874, 48.61935, 33.99104, 38.32489, 35.53967,
       29.59645, 21.14665, 51.11257, 34.20155, 44.40374, 49.67626,
       58.35377, 28.03744],
      [1, 1, 2, 1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2, 2,
       1, 2, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 2, 1, 2, 2,
       1, 1, 2, 1, 1, 2, 1, 1]]
y1 = [1, 2, 3, 4, 3, 3, 4, 4, 4, 4, 2, 1, 4, 1, 1, 1, 4, 4, 3, 1, 2,
      3, 3, 4, 2, 3, 4, 1, 2, 4, 3, 4, 4, 1, 3, 4, 4, 2, 3, 4, 2, 2,
      4, 3, 1, 4, 3, 4, 2, 3]

y2 = [1, 4, 1, 4, 1, 1, 3, 1, 2, 4, 3, 1, 3, 2, 4, 4, 4, 2, 3, 2, 1,
      4, 4, 4, 4, 3, 1, 1, 3, 1, 4, 2, 4, 2, 1, 2, 3, 1, 1, 4, 1, 2,
      4, 3, 4, 2, 4, 3, 2, 4]

n_classes = 4
n_observations = 50
n_independent = 3
n_coefs = 4
model1 = []
model12 = []
model3 = []

# first call with x1, y1
coefs = logisticRegression(n_independent,
                           n_classes, x1, y1,
                           groups=True,
                           columnWise=True,
                           nextResults=model1)

rowLabels = [str(i) for i in range(1, 13)]
writeMatrix("First Model Coefficients:", model1[0].coefs[0:12],
            transpose=True,
            noColLabels=True,
            rowLabels=rowLabels)

print("")
writeMatrix("First Model Standard Errors:", model1[0].stderrs[0:12],
            transpose=True,
            noColLabels=True,
            rowLabels=rowLabels)

# second call with x2,y2

coefs = logisticRegression(n_independent,
                           n_classes, x2, y2,
                           groups=True,
                           columnWise=True,
                           prevResults=model1[0],
                           nextResults=model12)
print("")
writeMatrix("Combined Model Coefficients:", model12[0].coefs[0:12],
            transpose=True,
            noColLabels=True,
            rowLabels=rowLabels)

writeMatrix("Combined Model Standard Errors:", model12[0].stderrs[0:12],
            transpose=True,
            noColLabels=True,
            rowLabels=rowLabels)

# combine the data (build new lists so x1 and y1 are not modified in place)
y3 = y1 + y2
x3 = [x1[0] + x2[0], x1[1] + x2[1], x1[2] + x2[2]]

coefs = logisticRegression(n_independent,
                           n_classes, x3, y3,
                           groups=True,
                           columnWise=True,
                           nextResults=model3)

writeMatrix("Full Data Model Coefficients:", model3[0].coefs[0:12],
            transpose=True,
            noColLabels=True,
            rowLabels=rowLabels)

writeMatrix("Full Data Standard Errors:", model3[0].stderrs[0:12],
            transpose=True,
            noColLabels=True,
            rowLabels=rowLabels)

Output

 
First Model Coefficients:
  1      1.691
  2      0.350
  3     -0.137
  4      1.057
  5     -1.254
  6      0.242
  7     -0.004
  8      0.115
  9      1.032
 10      0.278
 11      0.016
 12     -1.954
 
First Model Standard Errors:
  1      2.389
  2      0.565
  3      0.061
  4      1.025
  5      2.197
  6      0.509
  7      0.047
  8      0.885
  9      2.007
 10      0.461
 11      0.043
 12      0.958
 
Combined Model Coefficients:
  1     -1.169
  2      0.649
  3     -0.038
  4      0.608
  5     -1.935
  6      0.435
  7      0.002
  8      0.215
  9     -0.193
 10      0.282
 11      0.002
 12     -0.630
 
Combined Model Standard Errors:
  1      1.489
  2      0.359
  3      0.029
  4      0.588
  5      1.523
  6      0.358
  7      0.030
  8      0.584
  9      1.461
 10      0.344
 11      0.030
 12      0.596
 
Full Data Model Coefficients:
  1     -1.009
  2      0.640
  3     -0.051
  4      0.764
  5     -2.008
  6      0.436
  7      0.003
  8      0.263
  9     -0.413
 10      0.299
 11      0.004
 12     -0.593
 
Full Data Standard Errors:
  1      1.466
  2      0.350
  3      0.029
  4      0.579
  5      1.520
  6      0.357
  7      0.029
  8      0.581
  9      1.389
 10      0.336
 11      0.028
 12      0.577

Warning Errors

IMSLS_NO_CONV_SEP Convergence did not occur in # iterations. “tolerance” = #, the error between estimates = #, and the gradient has norm = #. Adjust “tolerance” or “maxIter”, or there may be a separation problem in the data.
IMSLS_EMPTY_INT_RESULTS Intermediate results given to the function are empty, but they are expected to be non-empty in this scenario.

Fatal Errors

IMSLS_NO_CONV_OVERFLOW The linear predictor = # is too large and will lead to overflow when exponentiated. The algorithm fails to converge.