LogisticRegression (JMSL Numerical Library (jmsl) 2022.1.0 API)

java.lang.Object
- com.imsl.datamining.PredictiveModel
- - com.imsl.datamining.LogisticRegression

All Implemented Interfaces:

Serializable, Cloneable
```
public class LogisticRegression
extends PredictiveModel
implements Serializable, Cloneable
```
Performs binomial or multinomial logistic regression.
This function fits a logistic regression model for discrete dependent variables with two or more mutually exclusive outcomes or classes. For a binary response $y$, the objective is to model the conditional probability of success, $\pi_1 (x) = Pr[y = 1∣ x]$, where $x = (x_1, x_2, …, x_p)$ is a realization of $p$ independent variables. Logistic regression models the conditional probability $\pi_1 (x)$ using the cdf of the logistic distribution. In particular, $$ \pi_1(x) = \frac{\exp(\eta_1)}{1+\exp(\eta_1)} $$ where the linear predictor $$\eta_1 = \beta_{01} + x_1\beta_{11} + x_2\beta_{21} + \cdots + x_p\beta_{p1} $$ The coefficients $\{\beta_{01},\beta_{11},\ldots,\beta_{p1}\}$ are unknown and are estimated using the training data.

Solving for the linear component $\eta_1$ results in the log-odds or logit transformation of $\pi_1 (x)$: $$ \text{logit}(\pi_1(x)) = \text{log}\frac{\pi_1(x)}{1-\pi_1(x)} = \eta_1 $$ Given a set of $N$ observations, $\{y_i, x_i\}$, where $y_i$ follows a binomial $(n, \pi) $ distribution with parameters $n = 1$ and $\pi = \pi_1 (x_i)$, the likelihood and log-likelihood are, respectively, $$ L = \prod_{i=1}^N{{\pi_1(x_i)}^{y_i}{(1-\pi_1(x_i))}^{1-y_i}}$$ $$ \mathcal{L} = \log(L) = \sum_{i=1}^{N} \left[y_i \text{logit}(\pi_1 (x_i))+ \log(1-\pi_1 (x_i))\right]$$ The log-likelihood in terms of the $\eta_1$ and the unknown parameters $\{\beta_{01},\beta_1\}$ is therefore $$ \mathcal{L}(\beta_{01},\beta_1) = \sum_{i=1}^{N} \left[y_i \eta_1(x_i) - \log(1+\exp(\eta_1(x_i)))\right]$$ where $\eta_1(x_i) = \beta_{01} + x_i^{T}\beta_{1}$. With a binary outcome, only one probability needs to be modeled. The second probability can be obtained from the constraint, $\pi_1(x) + \pi_2(x) = 1$. If each $y_i$ is the number of successes in $n_i \gt 1$ independent trials, the log-likelihood becomes $$ \mathcal{L} = \sum_{i=1}^{N} \left[ y_i \text{logit}(\pi_1 (x_i))+ n_i\log(1-\pi_1 (x_i)) \right]$$ or $$ \mathcal{L}(\beta_{01},\beta_1) = \sum_{i=1}^N \left[y_i \eta_1(x_i) - n_i\log(1+\exp(\eta_1(x_i)))\right]$$ The parameters are estimated using the method of maximum likelihood. To test the significance of the model, the log-likelihood of the fitted model is compared to that of an intercept‑only model. In particular, $G = -2(\mathcal{L}(\beta_{01}) - \mathcal{L}(\beta_{01},\beta_1))$ is a likelihood-ratio test statistic and under the null hypothesis, $H_0 : \beta_{11} = \beta_{12} = \ldots = \beta_{1p} = 0$, $G$ is distributed as chi-squared with $p - 1$ degrees of freedom. A significant result suggests that at least one parameter in the model is non‑zero. See Hosmer and Lemeshow (2000) for further discussion.

In the multinomial case, the response vector is $y_i = (y_{i1}, y_{i2},\ldots, y_{iK})^T$, where $y_{ik} = 1$ when the $i$-th observation belongs to class $k$ and $y_{ik} = 0$, otherwise. Furthermore, because the outcomes are mutually exclusive, $\sum_{k=1}^{K} y_{ik} = 1$ and $\pi_1(x) + \pi_2(x) + \cdots + \pi_K(x) = 1$. Without loss of generality the last class $K$ serves as the baseline or reference class in the sense that it is not modeled directly but found from $\pi_K(x) = 1 - \sum_{k=1}^{K-1} \pi_k(x) $. Note that if there are multiple trials $n_i > 1$, the constraint on the responses becomes $\sum_{k=1}^{K} y_{ik} = n_i$.

The log-likelihood in the multinomial case is $$\mathcal{L}(\beta_{0l}, \beta_{l})_{l=1}^{K} = \sum_{i=1}^N \left[ \sum_{l=1}^{K} y_{il}\eta_{il} - \log \left ( \sum_{l=1}^{K} \exp(\eta_{il})\right ) \right ] $$ where $ \eta_{il} = \beta_{0l} + x_i^T\beta_{l} $.

With $\pi_l(x_i) = \frac{\exp(\eta_{il})}{\sum_{k=1}^{K}\exp(\eta_{ik})}$, the constraint $\sum_{l=1}^{K} \pi_l(x_i) = 1 $ is handled by setting $\eta_{iK}=0$. Then the log-likelihood is $$\mathcal{L}(\beta_{0l}, \beta_{l})_{l=1}^{K-1} = \sum_{i=1}^N \left[ \sum_{l=1}^{K-1} y_{il}\eta_{il} - \log \left ( \sum_{l=1}^{K-1} \exp(\eta_{il})\right ) \right ] $$ Note that the logit for class $k$ is $$\log(\frac{\pi_{ik}}{\pi_{iK}}) = \eta_{ik} $$ which is the log-odds of class $k$ versus the reference class $K$.

Maximum likelihood estimates are found by solving the score equation for each parameter $$\begin{array}{l} \frac{\partial{\mathcal{L}(\beta_{0l},\beta_{l})}}{\partial{\beta_{jl}}} & = \sum_{i=1}^N \left [x_{ij}y_{il}-\frac{x_{ij}\exp(\eta_{il})}{1+\sum_{k=1}^{K-1}\exp(\eta_{ik})} \right]\\ & = \sum_{i=1}^N x_{ij}(y_{il} - \pi_{il}) \\ & = 0 \end{array} $$ To solve the score equations, the function employs a method known as iteratively re-weighted least squares or IRLS. In this case the IRLS is equivalent to the Newton-Raphson algorithm (Hastie, et. al., 2009, Thisted, 1988).

Consider the full vector of parameters $$ \beta = (\beta_{01},\beta_{11}, \beta_{21},\ldots,\beta_{p1},\beta_{02},\ldots,\beta_{p2},\ldots, \beta_{0K-1},\beta_{1K-1},\ldots,\beta_{pK-1})^T$$ The Newton-Raphson step at iteration $n+1$ is $$\beta^{n+1} = \beta^n - \matrix{H}^{-1}(\beta^n)\matrix{G}(\beta^n)$$ where $\matrix{H}$ is the Hessian matrix, the matrix of the second partial derivatives defined by $$ \begin{array}{l} \frac{\partial^2{\mathcal{L}(\beta_{0l},\beta_{l})}}{\partial{\beta_{kl}}\partial{\beta_{jl}}} & = -\sum_{i=1}^N \left [ \frac{x_{ij}x_{ik}\exp(\eta_{il})\sum_{m=1}^K \exp(\eta_{im}) - x_{ij}x_{ik}\exp(\eta_{il})\exp(\eta_{il})}{\left ( \sum_{m=1}^K \exp(\eta_{im})\right )^2} \right ]\\ & = -\sum_{i=1}^N x_{ij}x_{ik} \pi_{il}(1-\pi_{il}) \end{array} $$ for $l=1,2,\ldots, K$, $k,j = 1,2,\ldots,p$, and $$ \begin{array}{l} \frac{\partial^2{\mathcal{L}(\beta_{0l},\beta_{l})}}{\partial{\beta_{k\nu}}\partial{\beta_{jl}}} & = -\frac{1}{N}\sum_{i=1}^N \left [ -\frac{x_{ij}x_{ik}\exp(\eta_{il})\exp(\eta_{i\nu})}{\left ( \sum_{m=1}^K \exp(\eta_{im})\right )^2} \right ]\\ & = \frac{1}{N}\sum_{i=1}^N x_{ij}x_{ik} \pi_{il}\pi_{i\nu} \end{array} $$ where $l \ne \nu =1,2,\ldots,K $, and $k,j = 1,2,\ldots,p$. The $\matrix{G}$ denotes the the gradient, the vector of the first partial derivatives of the log-likelihood $$\frac{\partial{\mathcal{L}(\beta_{0l},\beta_{l})}}{\partial{\beta_{jl}}}. $$

Both the gradient and the Hessian are evaluated at the most recent estimate of the parameters, $\beta^n$. The iteration continues until convergence or until maximum iterations are reached. Following the theory of maximum likelihood estimation (Kendall and Stuart, 1979), standard errors are obtained from Fisher’s information matrix $ \matrix{I}= -\matrix{H}^{-1} $ evaluated at the final estimates.

See Also:

Example 1, Example 2, Example 3, Serialized Form

Nested Class Summary
- Nested classes/interfaces inherited from class com.imsl.datamining.PredictiveModel
  PredictiveModel.CloneNotSupportedException, PredictiveModel.PredictiveModelException, PredictiveModel.StateChangeException, PredictiveModel.SumOfProbabilitiesNotOneException, PredictiveModel.VariableType

Constructor Summary

Constructors
Constructor and Description
`LogisticRegression(double[][] x, double[][] y, PredictiveModel.VariableType[] predictorVarType, PredictiveModel.VariableType responseVarType)` Constructs a logistic regression predictive model.
`LogisticRegression(double[][] xy, int responseColumnIndex, PredictiveModel.VariableType[] varType)` Constructs a logistic regression predictive model.
`LogisticRegression(LogisticRegression lrModel)` Constructs a copy of the input `LogisticRegression` predictive model.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`LogisticRegression`	`clone()` Clones a `LogisticRegression` predictive model.
`void`	`fitModel()` Fits the logistic regression predictive model.
`double[]`	`getClassMeans()` Returns the class means.
`double[]`	`getCoefficients()` Returns the current values of the coefficient estimates.
`double[][]`	`getConfidenceIntervals()` Returns the confidence intervals for the predictions.
`double`	`getConfidenceLevel()` Returns the confidence level for calculating the confidence limits for the predictions.
`double[]`	`getGradient()` Returns the gradient.
`double[][]`	`getHessian()` Returns the Hessian matrix.
`double`	`getInitialLoglikelihood()` Returns the log-likelihood of the model evaluated at the starting coefficients.
`double`	`getLoglikelihood()` Returns the log-likelihood evaluated at the estimated coefficients.
`double`	`getLogLikelihoodRatio()` Returns the log-likelihood ratio statistic.
`int`	`getNumberOfCoefficients()` Returns the number of coefficients (per class).
`double[]`	`getStandardErrors()` Returns the standard errors of the estimated coefficients.
`int[]`	`getXInteractionsIndex()` Returns the array containing the indices specifying interaction terms.
`boolean`	`isIncludeIntercept()` Returns the current setting of the flag to include or not include an intercept in the model.
`double[]`	`predict()` Returns the fitted values on the training data.
`double[]`	`predict(double[][] testData)` Returns the predicted values on the test data.
`double[]`	`predict(double[][] testData, double[] testDataWeights)` Returns predicted values on the test data using the given weights.
`void`	`setConfidenceLevel(double confidence)` Sets the confidence level for calculating the confidence limits for the predictions.
`void`	`setIncludeIntercept(boolean includeIntercept)` Sets the flag to include or not include an intercept in the model.
`void`	`setInteractionIndex(int[] xInteractions)` Sets the column indices for pairwise interactions to include in the model.
`void`	`setStartingCoefficients(double[] startCoefs)` Sets the starting values for the coefficients.
`void`	`setTolerance(double tolerance)` Sets the error tolerance criteria for convergence.

Methods inherited from class com.imsl.datamining.PredictiveModel
getClassCounts, getClassErrors, getClassErrors, getClassLabels, getClassProbabilities, getCostMatrix, getMaxNumberOfCategories, getMaxNumberOfIterations, getNumberOfClasses, getNumberOfColumns, getNumberOfMissing, getNumberOfPredictors, getNumberOfRows, getNumberOfUniquePredictorValues, getPredictorIndexes, getPredictorTypes, getPrintLevel, getPriorProbabilities, getRandomObject, getResponseColumnIndex, getResponseVariableAverage, getResponseVariableMostFrequentClass, getResponseVariableType, getTotalWeight, getVariableType, getWeights, getXY, isConstantSeries, isMustFitModel, isUserFixedNClasses, setClassCounts, setClassLabels, setClassProbabilities, setConfiguration, setCostMatrix, setMaxNumberOfCategories, setMaxNumberOfIterations, setMustFitModel, setNumberOfClasses, setPredictorIndex, setPredictorTypes, setPrintLevel, setPriorProbabilities, setRandomObject, setResponseColumnIndex, setTrainingData, setVariableType, setWeights

Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - LogisticRegression
```
public LogisticRegression(double[][] xy,
                          int responseColumnIndex,
                          PredictiveModel.VariableType[] varType)
```
    Constructs a logistic regression predictive model.
    
    Parameters:
    
    xy - a double matrix containing the training data
    
    responseColumnIndex - an int, the column index for the response variable
    
    varType - a PredictiveModel.VariableType array containing the type of each variable
  - LogisticRegression
```
public LogisticRegression(double[][] x,
                          double[][] y,
                          PredictiveModel.VariableType[] predictorVarType,
                          PredictiveModel.VariableType responseVarType)
```
    Constructs a logistic regression predictive model.
    
    Parameters:
    
    x - a double matrix containing the training data for the predictor variables
    
    y - a double matrix containing the training data for the response variable
    
    predictorVarType - a PredictiveModel.VariableType array containing the type of each variable
    
    responseVarType - a PredictiveModel.VariableType, the type of the response variable
  - LogisticRegression
```
public LogisticRegression(LogisticRegression lrModel)
```
    Constructs a copy of the input LogisticRegression predictive model.
    
    Parameters:
    
    lrModel - a LogisticRegression predictive model
- Method Detail
  - clone
```
public LogisticRegression clone()
```
    Clones a LogisticRegression predictive model.
    
    Specified by:
    
    clone in class PredictiveModel
    
    Returns:
    
    a clone of the LogisticRegression predictive model
  - fitModel
```
public void fitModel()
              throws PredictiveModel.PredictiveModelException
```
    Fits the logistic regression predictive model.
    
    Overrides:
    
    fitModel in class PredictiveModel
    
    Throws:
    
    PredictiveModel.PredictiveModelException
  - setConfidenceLevel
```
public void setConfidenceLevel(double confidence)
```
    Sets the confidence level for calculating the confidence limits for the predictions.
    
    Parameters:
    
    confidence - a double value in the interval $(0,1)$
  - getConfidenceLevel
```
public double getConfidenceLevel()
```
    Returns the confidence level for calculating the confidence limits for the predictions.
    
    Returns:
    
    a double, the confidence level
  - predict
```
public double[] predict()
                 throws PredictiveModel.PredictiveModelException
```
    Returns the fitted values on the training data.
    The fitted value at index i*nClasses+k is the expected number of outcomes in class k for training example (observation) i. If the number of trials (see PredictiveModel.getWeights()) is equal to 1, the fitted value is a probability.
    
    Specified by:
    
    predict in class PredictiveModel
    
    Returns:
    
    a double array of length numberOfObservations*numberOfClasses containing the fitted values
    
    Throws:
    
    PredictiveModel.PredictiveModelException
  - predict
```
public double[] predict(double[][] testData)
                 throws PredictiveModel.PredictiveModelException
```
    Returns the predicted values on the test data.
    The predicted value at index i*nClasses+k is the expected number of outcomes in class k for test example (observation) i. If the number of trials (see PredictiveModel.getWeights()) is equal to 1, the predicted value is a probability.
    
    Specified by:
    
    predict in class PredictiveModel
    
    Parameters:
    
    testData - double matrix containing data to be predicted. testData must have the same number of columns in the same arrangement as the training data x or xy.
    
    Returns:
    
    a double array containing the predicted values.
    
    Throws:
    
    PredictiveModel.PredictiveModelException - is thrown when an exception occurs in the common PredictiveModel methods. Implementing or overriding methods from this class may require throwing exceptions. Exceptions thrown from these methods will necessarily extend the PredictiveModelException.
  - predict
```
public double[] predict(double[][] testData,
                        double[] testDataWeights)
                 throws PredictiveModel.PredictiveModelException
```
    Returns predicted values on the test data using the given weights.
    The predicted value at index i*nClasses+k is the expected number of outcomes in class k for the test example (observation) i. If the number of trials or frequency provided in testDataWeights is equal to 1, the predicted value is a probability.
    
    Specified by:
    
    predict in class PredictiveModel
    
    Parameters:
    
    testData - a double matrix containing the test data predictors
    
    testDataWeights - a double array containing the test data weights. For the binomial/multinomial response the weights are typically the frequencies, or number of trials.
    
    Returns:
    
    a double array containing predicted outcomes
    
    Throws:
    
    PredictiveModel.PredictiveModelException
  - getClassMeans
```
public double[] getClassMeans()
```
    Returns the class means.
    Run the method LogisticRegression.fitModel() to populate the class means.
    
    Returns:
    
    a double array containing the class means
  - getCoefficients
```
public double[] getCoefficients()
```
    Returns the current values of the coefficient estimates.
    Run the method LogisticRegression.fitModel() to populate the coefficient estimates.
    
    Returns:
    
    a double array containing the coefficient estimates
  - getGradient
```
public double[] getGradient()
```
    Returns the gradient.
    Run the method LogisticRegression.fitModel() to populate the gradient.
    
    Returns:
    
    a double array containing the final gradient
  - getConfidenceIntervals
```
public double[][] getConfidenceIntervals()
```
    Returns the confidence intervals for the predictions.
    Run one of the prediction methods (e.g., LogisticRegression.predict()) to populate the confidence intervals. The lower limit and upper limit are given for each observation in the training data or test data, according to which prediction method is used.
    
    Returns:
    
    a double matrix containing the lower and upper confidence limits
  - getHessian
```
public double[][] getHessian()
```
    Returns the Hessian matrix.
    Run the method LogisticRegression.fitModel() to populate the Hessian.
    
    Returns:
    
    a double matrix containing the Hessian matrix
  - getInitialLoglikelihood
```
public double getInitialLoglikelihood()
```
    Returns the log-likelihood of the model evaluated at the starting coefficients.
    This is the log-likelihood at the default starting values coefficients[i]=0 or the starting coefficients set using LogisticRegression.setStartingCoefficients(double[]). Run the method LogisticRegression.fitModel() to populate the initial log-likelihood.
    
    Returns:
    
    a double, the initial log-likelihood
  - getLoglikelihood
```
public double getLoglikelihood()
```
    Returns the log-likelihood evaluated at the estimated coefficients.
    Run the method LogisticRegression.fitModel() to populate the log-likelihood.
    
    Returns:
    
    a double, the log-likelihood of the fitted or trained model
  - getLogLikelihoodRatio
```
public double getLogLikelihoodRatio()
```
    Returns the log-likelihood ratio statistic.
    Run the method LogisticRegression.fitModel() to populate the likelihood ratio statistic.
    
    Returns:
    
    a double, the value of the log-likelihood ratio statistic
  - getStandardErrors
```
public double[] getStandardErrors()
```
    Returns the standard errors of the estimated coefficients.
    Run the method LogisticRegression.fitModel() to populate the standard errors.
    
    Returns:
    
    a double array containing the standard errors
  - setStartingCoefficients
```
public void setStartingCoefficients(double[] startCoefs)
```
    Sets the starting values for the coefficients.
    
    Parameters:
    
    startCoefs - a double array containing starting values
  - setTolerance
```
public void setTolerance(double tolerance)
```
    Sets the error tolerance criteria for convergence.
    
    Parameters:
    
    tolerance - a double, the error tolerance criteria for convergence
  - setIncludeIntercept
```
public void setIncludeIntercept(boolean includeIntercept)
```
    Sets the flag to include or not include an intercept in the model.
    The default is true, the model will include an intercept.
    
    Parameters:
    
    includeIntercept - a boolean, indicating whether or not to include an intercept
  - setInteractionIndex
```
public void setInteractionIndex(int[] xInteractions)
```
    Sets the column indices for pairwise interactions to include in the model.
    For each desired interaction pair, i,j, set xIntIndices[2*k]=i and xIntIndices[2*k+1]=j where k=0,1,.. and i,j are column indices contained in PredictiveModel.getPredictorIndexes(). Setting i==j defines a squared term. The default is that no interactions are included.
    
    Parameters:
    
    xInteractions - an int array containing the column indices of desired interactions
  - isIncludeIntercept
```
public boolean isIncludeIntercept()
```
    Returns the current setting of the flag to include or not include an intercept in the model.
    
    Returns:
    
    a boolean, the indicator of whether or not the model includes an intercept
  - getXInteractionsIndex
```
public int[] getXInteractionsIndex()
```
    Returns the array containing the indices specifying interaction terms.
    
    Returns:
    
    an int array containing the index pairs for interaction terms
  - getNumberOfCoefficients
```
public int getNumberOfCoefficients()
```
    Returns the number of coefficients (per class).
    
    Returns:
    
    an int, the number of coefficients

Class LogisticRegression

Nested Class Summary

Nested classes/interfaces inherited from class com.imsl.datamining.PredictiveModel

Constructor Summary

Method Summary

Methods inherited from class com.imsl.datamining.PredictiveModel

Methods inherited from class java.lang.Object

Constructor Detail

LogisticRegression

LogisticRegression

LogisticRegression

Method Detail

clone

fitModel

setConfidenceLevel

getConfidenceLevel

predict

predict

predict

getClassMeans

getCoefficients

getGradient

getConfidenceIntervals

getHessian

getInitialLoglikelihood

getLoglikelihood

getLogLikelihoodRatio

getStandardErrors

setStartingCoefficients

setTolerance

setIncludeIntercept

setInteractionIndex

isIncludeIntercept

getXInteractionsIndex

getNumberOfCoefficients