RegressorsForGLM Class

Generates regressors for a general linear model.

Inheritance Hierarchy

System.Object
Imsl.Stat.RegressorsForGLM

Namespace: Imsl.Stat
Assembly: ImslCS (in ImslCS.dll) Version: 6.5.2.0

Syntax

C#

[SerializableAttribute]
public class RegressorsForGLM

Visual Basic

<SerializableAttribute>
Public Class RegressorsForGLM

Visual C++

[SerializableAttribute]
public ref class RegressorsForGLM

F#

[<SerializableAttribute>]
type RegressorsForGLM =  class end

The RegressorsForGLM type exposes the following members.

Constructors

	Name	Description
	RegressorsForGLM(Double[,], Int32)	Constructor where the class columns are the first columns.
	RegressorsForGLM(Double[,],Int32[])	Constructor with an explicit set of class column indices.

Methods

	Name	Description
	Equals	Determines whether the specified object is equal to the current object. (Inherited from Object.)
	Finalize	Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
	GetEffects	Returns the effects.
	GetEffectsColumns	Returns a mapping of effects to regressor columns.
	GetHashCode	Serves as a hash function for a particular type. (Inherited from Object.)
	GetRegressors	Returns the regressor array.
	GetType	Gets the Type of the current instance. (Inherited from Object.)
	MemberwiseClone	Creates a shallow copy of the current Object. (Inherited from Object.)
	SetEffects	Sets the effects.
	ToString	Returns a string that represents the current object. (Inherited from Object.)

Properties

	Name	Description
	DummyMethod	The dummy method.
	ModelOrder	The order of the model.
	NumberOfMissingRows	Returns the number of rows in the regressors matrix containing NaN (not a number).
	NumberOfRegressors	Returns the number of regressors.

Remarks

Class RegressorsForGLM generates regressors for a general linear model from a data matrix. The data matrix can contain classification variables as well as continuous variables. Regressors for effects composed solely of continuous variables are generated as powers and crossproducts. Consider a data matrix containing continuous variables in Columns 3 and 4. The effect indices (3, 3) generate a regressor whose i-th value is the square of the i-th value in Column 3. The effect indices (3, 4) generate a regressor whose i-th value is the product of the i-th value in Column 3 and the i-th value in Column 4.
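
As a rough illustration of the continuous-variable case, the following Python sketch forms a regressor as the row-by-row product of the columns named by the effect indices. It is not the ImslCS implementation and does not call the library; the function name `continuous_regressor` and the sample data are hypothetical.

```python
from math import prod

def continuous_regressor(x, effect):
    """Multiply, row by row, the columns of x listed in effect.

    x is a list of rows; effect is a list of zero-based column indices,
    all assumed to refer to continuous variables. A repeated index
    raises that column to a power.
    """
    return [prod(row[j] for j in effect) for row in x]

# Hypothetical data: columns 3 and 4 hold the continuous variables.
x = [
    [1, 1, 0, 2.0, 3.0],
    [1, 2, 0, 4.0, 0.5],
]

squares = continuous_regressor(x, [3, 3])   # square of column 3
crossed = continuous_regressor(x, [3, 4])   # column 3 times column 4
```

With this data, `squares` holds the squared values of Column 3 and `crossed` holds the products of Columns 3 and 4.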

Regressors for an effect (source of variation) composed of a single classification variable are generated using indicator variables. Let the classification variable A take on values $a_1, a_2, \ldots, a_n$. From this classification variable, RegressorsForGLM creates n indicator variables. For $k = 1, 2, \ldots, n$, we have

$I_k = \left\{ \begin{array}{rl} 1 & \mbox{if } A = a_k \\ 0 & \mbox{otherwise} \end{array} \right.$

For each classification variable, another set of variables is created from the indicator variables. These new variables are called dummy variables. Dummy variables are generated from the indicator variables in one of three manners:

The dummies are the n indicator variables.
The dummies are the first n - 1 indicator variables.
The dummies are defined in terms of the indicator variables so that for balanced data, the usual summation restrictions are imposed on the regression coefficients.

In particular, for dummy method All, the dummy variables are $A_k = I_k \: (k = 1, 2, \ldots, n)$. For dummy method LeaveOutLast, the dummy variables are $A_k = I_k \: (k = 1, 2, \ldots, n - 1)$. For dummy method SumToZero, the dummy variables are $A_k = I_k - I_n \: (k = 1, 2, \ldots, n - 1)$. The regressors generated for an effect composed of a single classification variable are the associated dummy variables.
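
The three dummy methods can be sketched in a few lines of Python. This is only an illustration of the formulas above, not the library's code: the function `dummies`, its string-valued `method` argument, and the assumption that the caller supplies the levels in the order $a_1, \ldots, a_n$ are all hypothetical.

```python
def dummies(values, levels, method="LeaveOutLast"):
    """Return one row of dummy variables per observation.

    values: observed values of the classification variable A
    levels: the distinct values a_1, ..., a_n, in the order used for I_1..I_n
    method: "All", "LeaveOutLast", or "SumToZero"
    """
    rows = []
    for a in values:
        ind = [1 if a == level else 0 for level in levels]   # I_1, ..., I_n
        if method == "All":
            rows.append(ind)                                  # A_k = I_k, k = 1..n
        elif method == "LeaveOutLast":
            rows.append(ind[:-1])                             # A_k = I_k, k = 1..n-1
        elif method == "SumToZero":
            rows.append([i - ind[-1] for i in ind[:-1]])      # A_k = I_k - I_n
        else:
            raise ValueError(f"unknown dummy method: {method}")
    return rows

obs = [1, 2, 3, 2]          # hypothetical observations of A
levels = [1, 2, 3]          # a_1, a_2, a_3
all_rows = dummies(obs, levels, "All")
lol_rows = dummies(obs, levels, "LeaveOutLast")
stz_rows = dummies(obs, levels, "SumToZero")
```

Note that under SumToZero an observation at the last level produces -1 in every dummy, which is what imposes the summation restriction for balanced data.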

Let $m_j$ be the number of dummies generated for the j-th classification variable. Suppose there are two classification variables A and B with dummies

$A_1, A_2, \ldots, A_{m_1}$

and

$B_1, B_2, \ldots, B_{m_2}$

The regressors generated for an effect composed of two classification variables A and B are

$\begin{array}{rl} A \otimes B = & (A_1, A_2, \ldots, A_{m_1}) \otimes (B_1, B_2, \ldots, B_{m_2}) \\ = & (A_1 B_1, A_1 B_2, \ldots, A_1 B_{m_2}, A_2 B_1, A_2 B_2, \ldots, \\ & A_2 B_{m_2}, \ldots, A_{m_1} B_1, A_{m_1} B_2, \ldots, A_{m_1} B_{m_2}) \end{array}$

More generally, the regressors generated for an effect composed of several classification variables and several continuous variables are given by the Kronecker products of variables, where the order of the variables is specified in SetEffects. Consider a data matrix containing classification variables in Columns 0 and 1 and continuous variables in Columns 2 and 3. Label these four columns A, B, $X_1$, and $X_2$. The regressors generated by the effect indices (0, 1, 2, 2, 3) are $A \otimes B \otimes X_1 X_1 X_2$.
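
For a single observation, the Kronecker-product construction can be sketched as follows. The helper names `kronecker` and `effect_row` and the sample values are hypothetical, and the sketch makes no claim about how the library computes the product internally.

```python
def kronecker(u, v):
    """Kronecker product of two row vectors: the last factor varies fastest."""
    return [a * b for a in u for b in v]

def effect_row(factors):
    """Combine the per-variable values of one observation into one effect.

    factors is a list of lists: the dummy values of each classification
    variable, and a one-element list for each continuous variable, in the
    order in which the variables appear in the effect.
    """
    out = [1]
    for f in factors:
        out = kronecker(out, f)
    return out

# One observation with hypothetical dummies A = (A_1, A_2), B = (B_1, B_2, B_3)
A = [0, 1]
B = [0, 0, 1]
x1, x2 = 2.0, 5.0

ab = effect_row([A, B])                       # A (x) B
full = effect_row([A, B, [x1], [x1], [x2]])   # A (x) B (x) X_1 X_1 X_2
```

Because each continuous variable contributes a single value, it only scales the regressors produced by the classification variables; the number of regressors for the effect is the product of the numbers of dummies.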

Let the data matrix $\mathtt{x} = (A, B, X_1)$, where A and B are classification variables and $X_1$ is a continuous variable. The model containing the effects A, B, AB, $X_1$, $A X_1$, $B X_1$, and $A B X_1$ is specified by setting nClassVariables = 2 in the constructor and calling SetEffects(effects), with int[][] effects = { new int[] {0}, new int[] {1}, new int[] {0, 1}, new int[] {2}, new int[] {0, 2}, new int[] {1, 2}, new int[] {0, 1, 2} };

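For this model, suppose that variable A has two levels, $A_1$ and $A_2$, and that variable B has three levels, $B_1$, $B_2$, and $B_3$. In the table below, these symbols also denote the corresponding indicator variables. For each DummyMethod option, the regressors in their order of appearance in regressors are given below.

DummyMethod	Regressors
All	$A_1$, $A_2$, $B_1$, $B_2$, $B_3$, $A_1 B_1$, $A_1 B_2$, $A_1 B_3$, $A_2 B_1$, $A_2 B_2$, $A_2 B_3$, $X_1$, $A_1 X_1$, $A_2 X_1$, $B_1 X_1$, $B_2 X_1$, $B_3 X_1$, $A_1 B_1 X_1$, $A_1 B_2 X_1$, $A_1 B_3 X_1$, $A_2 B_1 X_1$, $A_2 B_2 X_1$, $A_2 B_3 X_1$
LeaveOutLast	$A_1$, $B_1$, $B_2$, $A_1 B_1$, $A_1 B_2$, $X_1$, $A_1 X_1$, $B_1 X_1$, $B_2 X_1$, $A_1 B_1 X_1$, $A_1 B_2 X_1$
SumToZero	$A_1 - A_2$, $B_1 - B_3$, $B_2 - B_3$, $(A_1 - A_2)(B_1 - B_3)$, $(A_1 - A_2)(B_2 - B_3)$, $X_1$, $(A_1 - A_2) X_1$, $(B_1 - B_3) X_1$, $(B_2 - B_3) X_1$, $(A_1 - A_2)(B_1 - B_3) X_1$, $(A_1 - A_2)(B_2 - B_3) X_1$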

Within a group of regressors corresponding to an interaction effect, the indicator variables composing the regressors vary most rapidly for the last classification variable, next most rapidly for the next to last classification variable, etc.
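
The ordering rule can be checked with a small label generator. The sketch below is purely illustrative: `dummy_labels`, `regressor_labels`, and the `columns` mapping are hypothetical names, and the labels are symbolic strings rather than values computed by the library. It uses the example model above, in which A has two levels, B has three levels, and the third column is continuous.

```python
from itertools import product

def dummy_labels(name, n_levels, method):
    """Labels for the dummies of one classification variable."""
    if method == "All":
        return [f"{name}{k}" for k in range(1, n_levels + 1)]
    if method == "LeaveOutLast":
        return [f"{name}{k}" for k in range(1, n_levels)]
    if method == "SumToZero":
        return [f"({name}{k}-{name}{n_levels})" for k in range(1, n_levels)]
    raise ValueError(f"unknown dummy method: {method}")

def regressor_labels(effects, columns, method):
    """List regressor labels effect by effect.

    columns maps a column index to (name, n_levels); n_levels is None
    for a continuous variable.
    """
    labels = []
    for effect in effects:
        parts = []
        for j in effect:
            name, n_levels = columns[j]
            parts.append([name] if n_levels is None
                         else dummy_labels(name, n_levels, method))
        # itertools.product varies the last factor fastest, matching the
        # ordering described for interaction effects.
        labels.extend("".join(combo) for combo in product(*parts))
    return labels

columns = {0: ("A", 2), 1: ("B", 3), 2: ("X1", None)}
effects = [[0], [1], [0, 1], [2], [0, 2], [1, 2], [0, 1, 2]]
leave_out_last = regressor_labels(effects, columns, "LeaveOutLast")
```

Calling `regressor_labels` with "All" or "SumToZero" lists the labels for the other two dummy methods in the same effect order.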

By default, RegressorsForGLM internally generates values for effects which correspond to a first order model with nEffects = nContinuousVariables + nClassVariables, where nContinuousVariables is the number of continuous variables and nClassVariables is the number of classification variables. The variables then are used to create the regressor variables. The effects are ordered such that the first effect corresponds to the first column of x, the second effect corresponds to the second column of x, etc. A second order model corresponding to the columns (variables) of x is generated if ModelOrder = 2 is used.

The effects array for a first or second order model can be obtained by first setting ModelOrder and then calling GetEffects. This array can then be modified and used as the argument to SetEffects. For a model that is almost linear or quadratic, this may be easier than creating the effects array from scratch.

For a second order model, there are

$\mathtt{nEffects} = \mathtt{nClassVariables} + 2 \, \mathtt{nContinuousVariables} + \frac{\mathtt{nVar} (\mathtt{nVar} - 1)}{2}$

effects, where nVar = nClassVariables+nContinuousVariables. The first nVar effects correspond to the columns of x, such that the first effect corresponds to the first column of x, the second effect corresponds to the second column of x, ..., the nVar-th effect corresponds to the nVar-th column of x (i.e. x[nVar-1]). The next nContinuousVariables effects correspond to squares of the continuous variables. The last $\mathtt{nVar} (\mathtt{nVar} - 1) / 2$ effects correspond to the two-variable interactions.

Let the data matrix $\mathtt{x} = (A, B, X_1)$, where A and B are classification variables and $X_1$ is a continuous variable. The effects generated, in order of appearance, are
$A,\: B,\: X_1,\: X_1^2,\: A B,\: A X_1,\: B X_1$
Let the data matrix $\mathtt{x} = (A, X_1, X_2)$, where A is a classification variable and $X_1$ and $X_2$ are continuous variables. The effects generated, in order of appearance, are
$A,\: X_1,\: X_2,\: X_1^2,\: X_2^2,\: A X_1,\: A X_2,\: X_1 X_2$
Let the data matrix $\mathtt{x} = (X_1, A, X_2)$, where A is a classification variable and $X_1$ and $X_2$ are continuous variables. The effects generated, in order of appearance, are
$X_1,\: A,\: X_2,\: X_1^2,\: X_2^2,\: X_1 A,\: X_1 X_2,\: A X_2$
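
The second order ordering in these examples can be reproduced with a short Python sketch. The function `second_order_effects` and its `is_class` argument are hypothetical, and whether GetEffects returns exactly this index encoding is an assumption rather than something stated here.

```python
from itertools import combinations

def second_order_effects(is_class):
    """Effects array for a second order model.

    is_class[j] is True when column j of x is a classification variable.
    Order: one effect per column, then squares of the continuous columns,
    then all two-variable interactions.
    """
    n_var = len(is_class)
    effects = [[j] for j in range(n_var)]
    effects += [[j, j] for j in range(n_var) if not is_class[j]]
    effects += [list(pair) for pair in combinations(range(n_var), 2)]
    return effects

# x = (A, B, X1): A, B, X1, X1^2, AB, A X1, B X1
abx = second_order_effects([True, True, False])
# x = (X1, A, X2): X1, A, X2, X1^2, X2^2, X1 A, X1 X2, A X2
xax = second_order_effects([False, True, False])
```

The lengths of the two results are 7 and 8: one effect per column, one square per continuous column, and nVar(nVar - 1)/2 interactions.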

Higher-order and more complicated models can be specified using SetEffects.

Reference

Imsl.Stat Namespace

Other Resources

Example 1

Example 2