public class StepwiseRegression extends Object implements Serializable, Cloneable
Class StepwiseRegression
builds a multiple linear regression
model using forward selection, backward selection, or forward stepwise (with
a backward glance) selection.
Levels of priority can be assigned to the candidate independent variables
using the setLevels(int[])
method. All variables with a priority level of
1 must enter the model before variables with a priority level of 2.
Similarly, variables with a level of 2 must enter before variables with a
level of 3, etc. Variables also can be forced into the model (setForce(int)
). Note that specifying "force" without also specifying the levels
will result in all variables being forced into the model.
Typically, the intercept is forced into all models and is not a candidate variable. In this case, a sum-of-squares and crossproducts matrix for the independent and dependent variables corrected for the mean is required. Other possibilities are as follows:
1. The intercept is not in the model. A raw (uncorrected) sum-of-squares and crossproducts matrix for the independent and dependent variables is required as input in cov. Argument nObservations must be set to one greater than the number of observations.
2. An intercept is a candidate variable. A raw (uncorrected) sum-of-squares and crossproducts matrix for the constant regressor (= 1), independent, and dependent variables is required for cov. In this case, cov contains one additional row and column corresponding to the constant regressor. This row/column contains the sum-of-squares and crossproducts of the constant regressor with the independent and dependent variables. The remaining elements in cov are the same as in the previous case. Argument nObservations must be set to one greater than the number of observations.

The stepwise regression algorithm is due to Efroymson (1960).
StepwiseRegression
uses sweeps of the covariance matrix (input
in cov
, if the covariance matrix is specified, or generated
internally) to move variables in and out of the model (Hemmerle 1967,
Chapter 3). The SWEEP operator discussed in Goodnight (1979) is used. A
description of the stepwise algorithm is also given by Kennedy and Gentle
(1980, pp. 335-340). The advantage of stepwise model building over all
possible regression (SelectionRegression
) is that it is less
demanding computationally when the number of candidate independent variables
is very large. However, there is no guarantee that the model selected will
be the best model (highest R^2) for any subset size of
independent variables.
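The sweep step described above can be illustrated with a minimal, library-independent sketch. The class and method names (SweepSketch, correctedSscp, sweep) are hypothetical and are not part of the StepwiseRegression API, and the sign convention shown is one common form of the SWEEP operator, assumed here for illustration; the library's internal convention may differ. The sketch builds a mean-corrected sum-of-squares and crossproducts matrix with the dependent variable last, then sweeps on the predictor pivots so that the last column holds the regression coefficients and the lower-right element holds the error sum of squares.

```java
final class SweepSketch {
    /** Mean-corrected SSCP matrix of [x | y]; the last row/column belongs to y. */
    static double[][] correctedSscp(double[][] x, double[] y) {
        int n = x.length, p = x[0].length;
        double[] mean = new double[p + 1];
        for (int r = 0; r < n; r++) {
            for (int j = 0; j < p; j++) mean[j] += x[r][j] / n;
            mean[p] += y[r] / n;
        }
        double[][] s = new double[p + 1][p + 1];
        for (int r = 0; r < n; r++) {
            double[] d = new double[p + 1];          // deviations from the means
            for (int j = 0; j < p; j++) d[j] = x[r][j] - mean[j];
            d[p] = y[r] - mean[p];
            for (int i = 0; i <= p; i++)
                for (int j = 0; j <= p; j++) s[i][j] += d[i] * d[j];
        }
        return s;
    }

    /** Sweeps the square matrix a in place on pivot k (assumed convention). */
    static void sweep(double[][] a, int k) {
        int m = a.length;
        double d = a[k][k];
        for (int j = 0; j < m; j++) a[k][j] /= d;    // scale the pivot row
        for (int i = 0; i < m; i++) {
            if (i == k) continue;
            double b = a[i][k];
            for (int j = 0; j < m; j++) a[i][j] -= b * a[k][j];
            a[i][k] = -b / d;                        // pivot column entry
        }
        a[k][k] = 1.0 / d;
    }

    public static void main(String[] args) {
        // Illustrative data generated from y = 1 + 2*x1 + 3*x2 (no noise).
        double[][] x = {{1, 2}, {2, 1}, {3, 4}, {4, 3}, {5, 6}};
        double[] y = {9, 8, 19, 18, 29};
        double[][] s = correctedSscp(x, y);
        sweep(s, 0);                                 // bring x1 into the model
        sweep(s, 1);                                 // bring x2 into the model
        System.out.println("b1=" + s[0][2] + " b2=" + s[1][2] + " SSE=" + s[2][2]);
    }
}
```

Because the data are an exact linear function of the predictors, the swept coefficients come out close to 2 and 3 and the error sum of squares close to zero, up to rounding.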
Modifier and Type  Class and Description 

class 
StepwiseRegression.CoefficientTTests
CoefficientTTests contains statistics related to the
Student's t-test for each regression coefficient. 
static class 
StepwiseRegression.CyclingIsOccurringException
Cycling is occurring.

static class 
StepwiseRegression.NoVariablesEnteredException
No variables can enter the model.

Modifier and Type  Field and Description 

static int 
BACKWARD_REGRESSION
Indicates backward regression.

static int 
FORWARD_REGRESSION
Indicates forward regression.

static int 
STEPWISE_REGRESSION
Indicates stepwise regression.

Constructor and Description 

StepwiseRegression(double[][] x,
double[] y)
Creates a new instance of
StepwiseRegression . 
StepwiseRegression(double[][] x,
double[] y,
double[] weights)
Creates a new instance of weighted
StepwiseRegression . 
StepwiseRegression(double[][] x,
double[] y,
double[] weights,
double[] frequencies)
Creates a new instance of weighted
StepwiseRegression
using observation frequencies. 
StepwiseRegression(double[][] cov,
int nObservations)
Creates a new instance of
StepwiseRegression from a
user-supplied variance-covariance matrix. 
Modifier and Type  Method and Description 

void 
compute()
Builds the multiple linear regression models using forward selection,
backward selection, or stepwise selection.

ANOVA 
getANOVA()
Get an analysis of variance table and related statistics.

StepwiseRegression.CoefficientTTests 
getCoefficientTTests()
Returns the Student's t-test statistics for the regression
coefficients.

double[] 
getCoefficientVIF()
Returns the variance inflation factors for the final model in this
invocation.

double[][] 
getCovariancesSwept()
Returns the results after
cov has been swept for the
columns corresponding to the variables in the model. 
double[] 
getHistory()
Returns the stepwise regression history for the independent variables.

double 
getIntercept()
Returns the intercept.

double[] 
getSwept()
Returns an array containing information indicating whether or not a
particular variable is in the model.

void 
setForce(int force)
Forces independent variables into the model based on their levels
assigned by
setLevels(int[]). 
void 
setLevels(int[] levels)
Sets the levels of priority for variables entering and leaving the
regression.

void 
setMeans(double[] means)
Sets the means of the variables.

void 
setMethod(int method)
Specifies the selection method: forward, backward, or
stepwise regression.

void 
setPValueIn(double pValueIn)
Defines the largest p-value for variables entering the model.

void 
setPValueOut(double pValueOut)
Defines the smallest p-value for removing variables.

void 
setTolerance(double tolerance)
The tolerance used to detect linear dependence among the independent
variables.

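public static final int BACKWARD_REGRESSION
Indicates backward regression. A variable is removed from the model if its p-value exceeds pValueOut. During initialization, all candidate independent variables enter the model.
public static final int FORWARD_REGRESSION
Indicates forward regression. A variable is added to the model if its p-value is less than pValueIn. During initialization, only forced variables enter the model.
public static final int STEPWISE_REGRESSION
Indicates stepwise regression.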
public StepwiseRegression(double[][] x, double[] y) throws com.imsl.stat.Covariances.TooManyObsDeletedException, com.imsl.stat.Covariances.MoreObsDelThanEnteredException, com.imsl.stat.Covariances.DiffObsDeletedException
Creates a new instance of StepwiseRegression.
Parameters:
x - a double matrix of nObs by nVars, where nObs is the number of observations and nVars is the number of independent variables.
y - a double array containing the observations of the dependent variable.
Throws:
com.imsl.stat.Covariances.TooManyObsDeletedException - is thrown if more observations have been deleted than were originally entered, i.e. the sum of frequencies has become negative.
com.imsl.stat.Covariances.MoreObsDelThanEnteredException - is thrown if more observations are being deleted from the variance-covariance matrix than were originally entered. The corresponding row, column of the incidence matrix is less than zero.
com.imsl.stat.Covariances.DiffObsDeletedException - is thrown if different observations are being deleted than were originally entered.
public StepwiseRegression(double[][] x, double[] y, double[] weights) throws Covariances.NonnegativeWeightException, com.imsl.stat.Covariances.TooManyObsDeletedException, com.imsl.stat.Covariances.MoreObsDelThanEnteredException, com.imsl.stat.Covariances.DiffObsDeletedException
Creates a new instance of weighted StepwiseRegression.
Parameters:
x - a double matrix of nObs by nVars, where nObs is the number of observations and nVars is the number of independent variables.
y - a double array containing the observations of the dependent variable.
weights - a double array containing the weight for each observation of x.
Throws:
Covariances.NonnegativeWeightException - is thrown if the weights are negative.
com.imsl.stat.Covariances.TooManyObsDeletedException - is thrown if more observations have been deleted than were originally entered, i.e. the sum of frequencies has become negative.
com.imsl.stat.Covariances.MoreObsDelThanEnteredException - is thrown if more observations are being deleted from the variance-covariance matrix than were originally entered. The corresponding row, column of the incidence matrix is less than zero.
com.imsl.stat.Covariances.DiffObsDeletedException - is thrown if different observations are being deleted than were originally entered.
public StepwiseRegression(double[][] x, double[] y, double[] weights, double[] frequencies) throws Covariances.NonnegativeFreqException, Covariances.NonnegativeWeightException, com.imsl.stat.Covariances.TooManyObsDeletedException, com.imsl.stat.Covariances.MoreObsDelThanEnteredException, com.imsl.stat.Covariances.DiffObsDeletedException
Creates a new instance of weighted StepwiseRegression using observation frequencies.
Parameters:
x - a double matrix of nObs by nVars, where nObs is the number of observations and nVars is the number of independent variables.
y - a double array containing the observations of the dependent variable.
weights - a double array containing the weight for each observation of x.
frequencies - a double array containing the frequency for each row of x.
Throws:
Covariances.NonnegativeFreqException - is thrown if the frequencies are negative.
Covariances.NonnegativeWeightException - is thrown if the weights are negative.
com.imsl.stat.Covariances.TooManyObsDeletedException - is thrown if more observations have been deleted than were originally entered, i.e. the sum of frequencies has become negative.
com.imsl.stat.Covariances.MoreObsDelThanEnteredException - is thrown if more observations are being deleted from the variance-covariance matrix than were originally entered. The corresponding row, column of the incidence matrix is less than zero.
com.imsl.stat.Covariances.DiffObsDeletedException - is thrown if different observations are being deleted than were originally entered.
public StepwiseRegression(double[][] cov, int nObservations)
Creates a new instance of StepwiseRegression from a user-supplied variance-covariance matrix.
Parameters:
cov - a double matrix containing a variance-covariance or sum-of-squares and crossproducts matrix, in which the last column must correspond to the dependent variable. cov can be computed using the Covariances class.
nObservations - an int containing the number of observations associated with cov.
public void compute() throws StepwiseRegression.NoVariablesEnteredException, StepwiseRegression.CyclingIsOccurringException
Builds the multiple linear regression models using forward selection, backward selection, or stepwise selection.
Throws:
StepwiseRegression.NoVariablesEnteredException - is thrown if no variables entered the model. All elements of the ANOVA table are set to NaN.
StepwiseRegression.CyclingIsOccurringException - is thrown if cycling occurs.
public ANOVA getANOVA() throws StepwiseRegression.NoVariablesEnteredException, StepwiseRegression.CyclingIsOccurringException
Gets an analysis of variance table and related statistics.
Returns:
an ANOVA table and related statistics.
Throws:
StepwiseRegression.NoVariablesEnteredException
StepwiseRegression.CyclingIsOccurringException
public StepwiseRegression.CoefficientTTests getCoefficientTTests() throws StepwiseRegression.NoVariablesEnteredException, StepwiseRegression.CyclingIsOccurringException
Returns the Student's t-test statistics for the regression coefficients.
Returns:
a StepwiseRegression.CoefficientTTests object containing statistics relating to the regression coefficients.
Throws:
StepwiseRegression.NoVariablesEnteredException
StepwiseRegression.CyclingIsOccurringException
public double[] getCoefficientVIF() throws StepwiseRegression.NoVariablesEnteredException, StepwiseRegression.CyclingIsOccurringException
Returns the variance inflation factors for the final model in this invocation. The elements are in the same order as the independent variables in x (or, if the covariance matrix is specified, the elements are in the same order as the variables in cov). Each element corresponding to a variable not in the model contains statistics for a model which includes the variables of the final model and the variable corresponding to the element in question.
The square of the multiple correlation coefficient for the ith regressor after all others can be obtained from the ith element of the returned array by the following formula:
$$R_i^2 = 1 - \frac{1}{\mathrm{VIF}_i}$$
Returns:
a double array containing the variance inflation factors for the final model in this invocation.
Throws:
StepwiseRegression.NoVariablesEnteredException
StepwiseRegression.CyclingIsOccurringException
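The relationship between a variance inflation factor and the squared multiple correlation coefficient, R_i^2 = 1 - 1/VIF_i, can be applied to the returned array as in the following minimal sketch. The class and method names are hypothetical helpers, not part of the library, and the input is assumed to be an array such as the one returned by getCoefficientVIF().

```java
final class VifSketch {
    /** Converts variance inflation factors to squared multiple correlations. */
    static double[] rSquaredFromVif(double[] vif) {
        double[] r2 = new double[vif.length];
        for (int i = 0; i < vif.length; i++) {
            r2[i] = 1.0 - 1.0 / vif[i];   // R_i^2 = 1 - 1/VIF_i
        }
        return r2;
    }
}
```

A factor of 1 corresponds to a regressor that is uncorrelated with the others (R_i^2 = 0), and larger factors indicate stronger collinearity.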
public double[][] getCovariancesSwept() throws StepwiseRegression.NoVariablesEnteredException, StepwiseRegression.CyclingIsOccurringException
Returns the results after cov has been swept for the columns corresponding to the variables in the model.
Returns:
a double matrix containing the results after cov has been swept on the columns corresponding to the variables in the model. The estimated variance-covariance matrix of the estimated regression coefficients in the final model can be obtained by extracting the rows and columns corresponding to the independent variables in the final model and multiplying the elements of this matrix by the error mean square.
Throws:
StepwiseRegression.NoVariablesEnteredException
StepwiseRegression.CyclingIsOccurringException
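The extraction of the coefficient variance-covariance matrix from the swept matrix can be sketched as follows. This is an illustrative helper with hypothetical names, not library code. It assumes the swept matrix has the dependent variable in its last row and column, that the indicator array follows the +1/-1 convention of getSwept(), and that the error mean square has been obtained separately (for example from the ANOVA table).

```java
final class CoefCovarianceSketch {
    /**
     * Extracts the rows and columns of the swept matrix that belong to
     * variables in the model and scales them by the error mean square.
     */
    static double[][] coefficientCovariance(double[][] swept, double[] inModel,
                                            double errorMeanSquare) {
        int nVars = swept.length - 1;      // last row/column: dependent variable
        int count = 0;
        for (int i = 0; i < nVars; i++) if (inModel[i] > 0) count++;
        int[] idx = new int[count];
        for (int i = 0, c = 0; i < nVars; i++) if (inModel[i] > 0) idx[c++] = i;
        double[][] v = new double[count][count];
        for (int r = 0; r < count; r++)
            for (int c = 0; c < count; c++)
                v[r][c] = swept[idx[r]][idx[c]] * errorMeanSquare;
        return v;
    }
}
```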
public double[] getHistory() throws StepwiseRegression.NoVariablesEnteredException, StepwiseRegression.CyclingIsOccurringException
Returns the stepwise regression history for the independent variables.
Returns:
a double array containing the recent history of the independent variables. The last element corresponds to the dependent variable.

history[i]  Status of ith Variable
0.0         This variable has never been added to the model.
0.5         This variable was added into the model during initialization.
k > 0.0     This variable was added to the model during the kth step.
k < 0.0     This variable was deleted from the model during the |k|th step.

Throws:
StepwiseRegression.NoVariablesEnteredException
StepwiseRegression.CyclingIsOccurringException
See Also:
setLevels(int[])
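The history codes (0.0 for never added, 0.5 for added during initialization, a positive value k for added at step k, a negative value for deleted at step |k|) can be decoded as in this small sketch. The helper name and the returned strings are hypothetical and chosen only for illustration.

```java
final class HistorySketch {
    /** Translates one element of the history array into a readable status. */
    static String describe(double h) {
        if (h == 0.0) return "never added";
        if (h == 0.5) return "added during initialization";
        int step = (int) Math.abs(h);      // step number is the magnitude
        return h > 0.0 ? "added at step " + step : "deleted at step " + step;
    }
}
```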
public double getIntercept() throws StepwiseRegression.NoVariablesEnteredException, StepwiseRegression.CyclingIsOccurringException
Returns the intercept. The intercept is computed as
$$\hat{\beta}_0 = \bar{y} - \sum_{i=1}^{k} \hat{\beta}_i \bar{x}_i$$
where $\bar{y}$ is the mean of y, $\hat{\beta}_i$ are the coefficients, and $\bar{x}_i$ are the mean values for each of the k independent variables in the final model. If the covariance matrix is used for input, use method setMeans to specify the means of the variables. If x and y are used for input, the means are computed internally and do not need to be specified.
Returns:
a double containing the intercept.
Throws:
StepwiseRegression.NoVariablesEnteredException
StepwiseRegression.CyclingIsOccurringException
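The intercept is the mean of the dependent variable minus the sum of each coefficient times the mean of its variable. A minimal sketch of that arithmetic follows; the helper is hypothetical and assumes the coefficient and mean arrays cover only the variables in the final model, in the same order.

```java
final class InterceptSketch {
    /** Computes yMean minus the sum of coef[i] * xMeans[i]. */
    static double intercept(double yMean, double[] coef, double[] xMeans) {
        double b0 = yMean;
        for (int i = 0; i < coef.length; i++) {
            b0 -= coef[i] * xMeans[i];
        }
        return b0;
    }
}
```

For example, with a dependent-variable mean of 16.6, coefficients 2 and 3, and predictor means 3 and 3.2, the result is 16.6 - 6 - 9.6, which is 1 up to rounding.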
public double[] getSwept() throws StepwiseRegression.NoVariablesEnteredException, StepwiseRegression.CyclingIsOccurringException
Returns an array containing information indicating whether or not a particular variable is in the model.
Returns:
a double array with information to indicate the independent variables in the model. The last element corresponds to the dependent variable. A +1 in the ith position indicates that the variable is in the selected model. A -1 indicates that the variable is not in the selected model.
Throws:
StepwiseRegression.NoVariablesEnteredException
StepwiseRegression.CyclingIsOccurringException
See Also:
setLevels(int[])
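An indicator array of this kind, with a positive entry for a variable in the model, a negative entry otherwise, and the last element reserved for the dependent variable, can be turned into a list of selected column indices. The following sketch uses a hypothetical helper name and assumes exactly that layout.

```java
final class SweptSketch {
    /** Returns the indices of independent variables flagged as in the model. */
    static int[] selectedIndices(double[] swept) {
        int nVars = swept.length - 1;      // skip the dependent variable
        int count = 0;
        for (int i = 0; i < nVars; i++) if (swept[i] > 0) count++;
        int[] selected = new int[count];
        for (int i = 0, c = 0; i < nVars; i++) {
            if (swept[i] > 0) selected[c++] = i;
        }
        return selected;
    }
}
```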
public void setForce(int force)
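Forces independent variables into the model based on their levels assigned by setLevels(int[]).
Parameters:
force - an int specifying the upper bound on the variables forced into the model. Variables with levels 1, 2, ..., force are forced into the model as independent variables.
See Also:
setLevels(int[])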
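The rule that variables with levels 1 through force are forced into the model can be expressed as a simple predicate over the levels array. The sketch below is a hypothetical helper, not library code; it also shows why leaving every independent variable at level 1 while setting a force value of 1 or more forces all of them in.

```java
final class ForceSketch {
    /** Marks each variable whose level lies in the range 1..force. */
    static boolean[] forced(int[] levels, int force) {
        boolean[] result = new boolean[levels.length];
        for (int i = 0; i < levels.length; i++) {
            result[i] = levels[i] >= 1 && levels[i] <= force;
        }
        return result;
    }
}
```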
public void setLevels(int[] levels)
Sets the levels of priority for variables entering and leaving the regression. Argument levels[i]=0 means the ith variable never enters the model. Argument levels[i]=-1 means the ith variable is the dependent variable. The last element in levels must correspond to the dependent variable, except when the variance-covariance or sum-of-squares and crossproducts matrix is supplied.
Parameters:
levels - an int array containing the levels of entry into the model for each variable. Default: 1, 1, ..., 1, -1 where -1 corresponds to the dependent variable.
See Also:
setForce(int)
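One simplified reading of the priority rule (all variables at a lower positive level must enter before any variable at a higher level, a level of 0 never enters, and -1 marks the dependent variable) is sketched below. The helper names are hypothetical, and the sketch ignores the p-value tests that the actual algorithm also applies.

```java
final class LevelsSketch {
    /** Builds the default levels: 1 for each independent variable, -1 last. */
    static int[] defaultLevels(int nVars) {
        int[] levels = new int[nVars + 1];
        for (int i = 0; i < nVars; i++) levels[i] = 1;
        levels[nVars] = -1;                // dependent variable
        return levels;
    }

    /** Flags variables not yet in the model that share the lowest open level. */
    static boolean[] eligibleToEnter(int[] levels, boolean[] inModel) {
        int lowest = Integer.MAX_VALUE;
        for (int i = 0; i < levels.length; i++) {
            if (levels[i] > 0 && !inModel[i]) lowest = Math.min(lowest, levels[i]);
        }
        boolean[] eligible = new boolean[levels.length];
        for (int i = 0; i < levels.length; i++) {
            eligible[i] = levels[i] > 0 && !inModel[i] && levels[i] == lowest;
        }
        return eligible;
    }
}
```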
public void setMeans(double[] means)
Sets the means of the variables. The means are required if getIntercept() is requested. Otherwise, they are not used.
Parameters:
means - a double array of length nVars+1, where nVars is the number of independent variables. means[0] through means[nVars-1] are the means of the independent variables and means[nVars] is the mean of the dependent variable.
See Also:
getIntercept()
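An array in this layout, with nVars independent-variable means followed by the dependent-variable mean, could be assembled from raw data as in the following sketch. The helper is hypothetical, computes unweighted means only, and would need adjusting if weights or frequencies were used to build the covariance matrix.

```java
final class MeansSketch {
    /** Returns {mean(x[.][0]), ..., mean(x[.][nVars-1]), mean(y)}. */
    static double[] means(double[][] x, double[] y) {
        int n = x.length, nVars = x[0].length;
        double[] m = new double[nVars + 1];
        for (int r = 0; r < n; r++) {
            for (int j = 0; j < nVars; j++) m[j] += x[r][j];
            m[nVars] += y[r];
        }
        for (int j = 0; j <= nVars; j++) m[j] /= n;
        return m;
    }
}
```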
public void setMethod(int method)
Specifies the selection method: forward, backward, or stepwise regression.
Parameters:
method - an int value between -1 and 1 specifying the stepwise selection method. Fields FORWARD_REGRESSION, BACKWARD_REGRESSION, and STEPWISE_REGRESSION should be used. Default: STEPWISE_REGRESSION.
See Also:
FORWARD_REGRESSION, BACKWARD_REGRESSION, STEPWISE_REGRESSION
public void setPValueIn(double pValueIn)
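Defines the largest p-value for variables entering the model. Variables with p-values less than pValueIn may enter the model. Backward regression does not use this value.
Parameters:
pValueIn - a double containing the largest p-value for variables entering the model. Default: pValueIn = 0.05.
public void setPValueOut(double pValueOut)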
Defines the smallest p-value for removing variables. Variables with p-values greater than pValueOut may leave the model. pValueOut must be greater than or equal to pValueIn. A common choice for pValueOut is 2*pValueIn. Forward regression does not use this value.
Parameters:
pValueOut - a double containing the smallest p-value for removing variables from the model. Default: pValueOut = 0.10.
public void setTolerance(double tolerance)
The tolerance used to detect linear dependence among the independent variables.
Parameters:
tolerance - a double containing the tolerance used for detecting linear dependence. Default: tolerance = 2.2204460492503e-16.
Copyright © 1970-2015 Rogue Wave Software
Built October 13 2015.