Chapter 2: Regression

hypothesis_test

Performs tests for a multivariate general linear hypothesis $H\beta U = G$ given the hypothesis sums of squares and crossproducts matrix $S_H$.

Synopsis

#include <imsls.h>

float imsls_f_hypothesis_test (Imsls_f_regression *regression_info, float dfh, float *scph, ..., 0)

The type double function is imsls_d_hypothesis_test.

Required Arguments

Imsls_f_regression *regression_info   (Input)
Pointer to a structure of type Imsls_f_regression containing information about the regression fit. See function imsls_f_regression.

float dfh   (Input)
Degrees of freedom for the sums of squares and crossproducts matrix.

float *scph   (Input)
Array of size nu by nu containing $S_H$, the sums of squares and crossproducts attributable to the hypothesis.

Return Value

The p-value corresponding to Wilks’ lambda test.

Synopsis with Optional Arguments

#include <imsls.h>

float imsls_f_hypothesis_test (Imsls_f_regression *regression_info, float dfh, float *scph,
IMSLS_U, int nu, float u[],
IMSLS_WILK_LAMBDA, float *value, float *p_value,
IMSLS_ROY_MAX_ROOT, float *value, float *p_value,
IMSLS_HOTELLING_TRACE, float *value, float *p_value,
IMSLS_PILLAI_TRACE, float *value, float *p_value,
0)

Optional Arguments

IMSLS_U, int nu, float u[]   (Input)
Argument nu is the number of linear combinations of the dependent variables to be considered. The value nu must be greater than 0 and less than or equal to n_dependent. Argument u contains the n_dependent by nu matrix $U$ for the test $H_p \beta U = G_p$.
Default: nu = n_dependent and u is the identity matrix

IMSLS_WILK_LAMBDA, float *value, float *p_value   (Output)
Wilks’ lambda and p-value.

IMSLS_ROY_MAX_ROOT, float *value, float *p_value   (Output)
Roy’s maximum root criterion and p-value.

IMSLS_HOTELLING_TRACE, float *value, float *p_value   (Output)
Hotelling’s trace and p-value.

IMSLS_PILLAI_TRACE, float *value, float *p_value   (Output)
Pillai’s trace and p-value.

Description

Function imsls_f_hypothesis_test computes test statistics and p-values for the general linear hypothesis $H\beta U = G$ for the multivariate general linear model.

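The hypothesis sum of squares and crossproducts matrix input in scph is

$$S_H = (H\hat{\beta}U - G)^T (C^T D C)^{-1} (H\hat{\beta}U - G)$$

where $C$ is a solution to $R^T C = H$ and where $D$ is a diagonal matrix with diagonal elements

$$d_{ii} = \begin{cases} 1 & \text{if } r_{ii} > 0 \\ 0 & \text{otherwise} \end{cases}$$

For a detailed discussion, see “Linear Dependence and the R Matrix.”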

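The error sum of squares and crossproducts matrix for the model $Y = X\beta + \varepsilon$ is

$$(Y - X\hat{\beta})^T (Y - X\hat{\beta})$$

which is input in regression_info. The error sum of squares and crossproducts matrix for the hypothesis $H\beta U = G$ computed by imsls_f_hypothesis_test is

$$S_E = U^T (Y - X\hat{\beta})^T (Y - X\hat{\beta})\, U$$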
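The reduction of the full error matrix to the nu by nu matrix used by the test can be sketched in plain C. This is only an illustration of the product $U^T S U$ under the assumption of row-major storage; the function name project_sscp is hypothetical and is not part of the IMSL library, which performs this step internally.

```c
#include <assert.h>

/* Illustrative helper (not an IMSL routine): out = U^T * S * U.
 * Assumes row-major storage:
 *   s   : n_dep by n_dep error sums of squares and crossproducts
 *   u   : n_dep by nu matrix of linear combinations
 *   out : nu by nu result, allocated by the caller
 */
void project_sscp(int n_dep, int nu, const double *s, const double *u,
                  double *out)
{
    assert(n_dep > 0 && nu > 0 && nu <= n_dep);
    for (int a = 0; a < nu; ++a) {
        for (int b = 0; b < nu; ++b) {
            double sum = 0.0;
            for (int i = 0; i < n_dep; ++i)
                for (int j = 0; j < n_dep; ++j)
                    sum += u[i * nu + a] * s[i * n_dep + j] * u[j * nu + b];
            out[a * nu + b] = sum;
        }
    }
}
```

With u equal to the identity matrix the input matrix is returned unchanged, which corresponds to the default configuration of IMSLS_U. With a single column such as (1, -1), the result is the 1 by 1 error sum of squares of the difference of the two dependent variables.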

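Let $p$ equal the order of the matrices $S_E$ and $S_H$, i.e.,

$$p = \begin{cases} \text{nu} & \text{if IMSLS\_U is specified} \\ \text{n\_dependent} & \text{otherwise} \end{cases}$$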

Let $q$ (stored in dfh) be the degrees of freedom for the hypothesis. Let $\nu$ (input in regression_info) be the degrees of freedom for error. Function imsls_f_hypothesis_test computes three test statistics based on the eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, p$) of the generalized eigenvalue problem $S_H x = \lambda S_E x$. These test statistics are as follows:

 

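Wilks’ lambda

$$\Lambda = \frac{\det(S_E)}{\det(S_H + S_E)} = \prod_{i=1}^{p} \frac{1}{1 + \lambda_i}$$

The associated p-value is based on an approximation discussed by Rao (1973, p. 556). The statistic

$$F = \frac{ms - pq/2 + 1}{pq} \cdot \frac{1 - \Lambda^{1/s}}{\Lambda^{1/s}}$$

has an approximate F distribution with $pq$ and $ms - pq/2 + 1$ numerator and denominator degrees of freedom, respectively, where

$$s = \begin{cases} \sqrt{\dfrac{p^2 q^2 - 4}{p^2 + q^2 - 5}} & \text{if } p^2 + q^2 - 5 \neq 0 \\ 1 & \text{otherwise} \end{cases}$$

and

$$m = \nu - \frac{p - q + 1}{2}$$

The F test is exact if $\min(p, q) \le 2$ (Kshirsagar, 1972, Theorem 4, pp. 299-300).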

Roy’s maximum root

$$c = \max_i \lambda_i$$

where $c$ is output as value. The p-value is based on the approximation

$$F = \frac{\nu + q - s}{s}\, c$$

where $s = \max(p, q)$. This statistic has an approximate F distribution with $s$ and $\nu + q - s$ numerator and denominator degrees of freedom, respectively. The F test is exact if $s = 1$; the p-value is also exact. In general, the value output in p_value is a lower bound on the actual p-value.
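The following C sketch illustrates the computation, assuming the approximation F = ((nu + q - s)/s) c with s = max(p, q). The function names roy_max_root and roy_f are hypothetical and are not library routines.

```c
#include <assert.h>

/* Largest eigenvalue of the generalized eigenvalue problem. */
double roy_max_root(const double *lambda, int p)
{
    assert(p > 0);
    double c = lambda[0];
    for (int i = 1; i < p; ++i)
        if (lambda[i] > c)
            c = lambda[i];
    return c;
}

/* F approximation for Roy's maximum root (assumed form).
 * Returns the F statistic and stores its degrees of freedom. */
double roy_f(double c, int p, int q, double nu, double *df1, double *df2)
{
    double s = (p > q) ? (double)p : (double)q;
    double num_df = s;
    double den_df = nu + q - s;
    assert(den_df > 0.0);
    *df1 = num_df;
    *df2 = den_df;
    return (den_df / num_df) * c;
}
```

Because the resulting p-value is only a lower bound when s > 1, a small p-value from this approximation is weaker evidence than the same p-value from an exact test.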

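Hotelling’s trace

$$U = \operatorname{tr}(S_E^{-1} S_H) = \sum_{i=1}^{p} \lambda_i$$

$U$ is output as value. The p-value is based on the approximation of McKeon (1974) that supersedes the approximation of Hughes and Saw (1972). McKeon’s approximation is also discussed by Seber (1984, p. 39). For

$$b = 4 + \frac{pq + 2}{a - 1}, \qquad a = \frac{(\nu + q - p - 1)(\nu - 1)}{(\nu - p - 3)(\nu - p)}$$

the p-value is based on the result that

$$F = \frac{b\,(\nu - p - 1)}{(b - 2)\,pq}\, U$$

has an approximate F distribution with $pq$ and $b$ degrees of freedom. The test is exact if $\min(p, q) = 1$. For $\nu \le p + 1$, the approximation is not valid, and p_value is set to NaN.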
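A C sketch of McKeon's approximation follows. It assumes the commonly stated form a = (nu + q - p - 1)(nu - 1) / ((nu - p - 3)(nu - p)), b = 4 + (pq + 2)/(a - 1), and F = b(nu - p - 1) U / ((b - 2) pq). These formulas are an assumption of the sketch and should be checked against McKeon (1974) or Seber (1984). The function names hotelling_trace and mckeon_f are hypothetical.

```c
#include <assert.h>

/* Hotelling's trace as the sum of the eigenvalues lambda_i. */
double hotelling_trace(const double *lambda, int p)
{
    double u = 0.0;
    for (int i = 0; i < p; ++i)
        u += lambda[i];
    return u;
}

/* McKeon's F approximation (assumed form, see lead-in).
 * Returns 0 on success and -1 when nu <= p + 1, where the
 * approximation is not valid. */
int mckeon_f(double u, int p, int q, double nu,
             double *f, double *df1, double *df2)
{
    assert(p > 0 && q > 0);
    if (nu <= p + 1.0)
        return -1;

    double pq = (double)p * q;
    double num = (nu + q - p - 1.0) * (nu - 1.0);   /* numerator of a   */
    double den = (nu - p - 3.0) * (nu - p);         /* denominator of a */

    /* b = 4 + (pq + 2)/(a - 1) with a = num/den, rearranged so that
     * nu = p + 3 (den = 0) needs no division by zero. */
    double b = 4.0 + (pq + 2.0) * den / (num - den);

    *df1 = pq;
    *df2 = b;
    *f = b * (nu - p - 1.0) / ((b - 2.0) * pq) * u;
    return 0;
}
```

For q = 1 the expression for b reduces to nu - p + 1 and F to (nu - p + 1) U / p, the exact F form of Hotelling's T-squared, which is consistent with the statement that the test is exact when min(p, q) = 1.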

These three test statistics are valid when $S_E$ is positive definite. A necessary condition for $S_E$ to be positive definite is $\nu \ge p$. If $S_E$ is not positive definite, a warning error message is issued, and both value and p_value are set to NaN.

Because the requirement $\nu \ge p$ can be a serious drawback, imsls_f_hypothesis_test computes a fourth test statistic based on eigenvalues $\theta_i$ ($i = 1, 2, \ldots, p$) of the generalized eigenvalue problem $S_H w = \theta (S_H + S_E) w$. This test statistic requires a less restrictive assumption: $S_H + S_E$ is positive definite. A necessary condition for $S_H + S_E$ to be positive definite is $\nu + q \ge p$. If $S_E$ is positive definite, imsls_f_hypothesis_test avoids the computation of the generalized eigenvalue problem from scratch. In this case, the eigenvalues $\theta_i$ are obtained from $\lambda_i$ by

$$\theta_i = \frac{\lambda_i}{1 + \lambda_i}$$

The fourth test statistic is as follows:

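Pillai’s trace

$$V = \operatorname{tr}\left[S_H (S_H + S_E)^{-1}\right] = \sum_{i=1}^{p} \theta_i$$

$V$ is output as value. The p-value is based on an approximation discussed by Pillai (1985). The statistic

$$F = \frac{2n + s + 1}{2m + s + 1} \cdot \frac{V}{s - V}$$

has an approximate F distribution with $s(2m + s + 1)$ and $s(2n + s + 1)$ numerator and denominator degrees of freedom, respectively, where

$$s = \min(p, q)$$

$$m = \tfrac{1}{2}(|p - q| - 1)$$

$$n = \tfrac{1}{2}(\nu - p - 1)$$

The F test is exact if $\min(p, q) = 1$.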
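The computation can be sketched in C as follows, assuming that the eigenvalues lambda_i are available (that is, S_E is positive definite) so that theta_i = lambda_i / (1 + lambda_i), and assuming the standard F approximation F = ((2n + s + 1)/(2m + s + 1)) V / (s - V). The function names pillai_trace and pillai_f are hypothetical and are not library routines.

```c
#include <assert.h>
#include <math.h>

/* Pillai's trace from the eigenvalues lambda_i, using
 * theta_i = lambda_i / (1 + lambda_i). */
double pillai_trace(const double *lambda, int p)
{
    double v = 0.0;
    for (int i = 0; i < p; ++i)
        v += lambda[i] / (1.0 + lambda[i]);
    return v;
}

/* F approximation for Pillai's trace (assumed standard form).
 * Returns the F statistic and stores its degrees of freedom. */
double pillai_f(double v, int p, int q, double nu,
                double *df1, double *df2)
{
    double s = (p < q) ? (double)p : (double)q;
    double m = 0.5 * (fabs((double)(p - q)) - 1.0);
    double n = 0.5 * (nu - p - 1.0);
    assert(v >= 0.0 && v < s);

    *df1 = s * (2.0 * m + s + 1.0);
    *df2 = s * (2.0 * n + s + 1.0);
    return (2.0 * n + s + 1.0) / (2.0 * m + s + 1.0) * v / (s - v);
}
```

For the data of the examples in this section (p = 2, q = 1, and assuming nu = 5), the eigenvalue 316.600861 gives V of about 0.996851 and again an F statistic of about 633.2 on 2 and 4 degrees of freedom, matching the other three statistics as expected when min(p, q) = 1.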

Examples

Example 1

The data for this example are from Maindonald (1984, pp. 203-204). A multivariate regression model containing two dependent variables and three independent variables is fit using function imsls_f_regression, and the results are stored in the structure regression_info. The sum of squares and crossproducts matrix, scph, is then computed by a call to imsls_f_hypothesis_scph for the test that the third independent variable is in the model (determined by the specification of h). Finally, function imsls_f_hypothesis_test is called to compute the p-value for the test statistic (Wilks’ lambda).

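#include <stdio.h>
#include <imsls.h>

int main(void)
{
    Imsls_f_regression *info;
    float   *coefficients, *scph;
    float   dfh, p_value;
    /* Independent variables: 9 observations by 3 variables */
    float   x[]     = { 7.0, 5.0, 6.0,
                        2.0,-1.0, 6.0,
                        7.0, 3.0, 5.0,
                       -3.0, 1.0, 4.0,
                        2.0,-1.0, 0.0,
                        2.0, 1.0, 7.0,
                       -3.0,-1.0, 3.0,
                        2.0, 1.0, 1.0,
                        2.0, 1.0, 4.0 };
    /* Dependent variables: 9 observations by 2 variables */
    float   y[]     = { 7.0, 1.0,
                       -5.0, 4.0,
                        6.0, 10.0,
                        5.0, 5.0,
                        5.0, -2.0,
                       -2.0, 4.0,
                        0.0, -6.0,
                        8.0, 2.0,
                        3.0, 0.0 };
    int     n_observations = 9;
    int     n_independent = 3;
    int     n_dependent = 2;
    int     nh = 1;
    /* Hypothesis matrix H: selects the third independent variable */
    float   h[]     = { 0, 0, 0, 1 };

    /* Fit the multivariate regression and keep the fit information */
    coefficients = imsls_f_regression(n_observations, n_independent,
        x, y,
        IMSLS_N_DEPENDENT, n_dependent,
        IMSLS_REGRESSION_INFO, &info,
        0);

    /* Hypothesis sums of squares and crossproducts; dfh is returned */
    scph = imsls_f_hypothesis_scph(info, nh, h, &dfh, 0);

    /* p-value of the Wilks' lambda test */
    p_value = imsls_f_hypothesis_test(info, dfh, scph, 0);

    printf("P-value = %10.6f\n", p_value);

    return 0;
}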

Output

P-value =   0.000010

Example 2

This example is the same as the first example, but more statistics are computed. Also, the U matrix, u, is explicitly specified as the identity matrix (which is the default configuration of U).

#include <stdio.h>
#include <imsls.h>

int main(void)
{
    Imsls_f_regression *info;
    float   *coefficients, *scph;
    float   dfh, p_value;
    /* Independent variables: 9 observations by 3 variables */
    float   x[]     = { 7.0, 5.0, 6.0,
                        2.0,-1.0, 6.0,
                        7.0, 3.0, 5.0,
                       -3.0, 1.0, 4.0,
                        2.0,-1.0, 0.0,
                        2.0, 1.0, 7.0,
                       -3.0,-1.0, 3.0,
                        2.0, 1.0, 1.0,
                        2.0, 1.0, 4.0 };
    /* Dependent variables: 9 observations by 2 variables */
    float   y[]     = { 7.0, 1.0,
                       -5.0, 4.0,
                        6.0, 10.0,
                        5.0, 5.0,
                        5.0, -2.0,
                       -2.0, 4.0,
                        0.0, -6.0,
                        8.0, 2.0,
                        3.0, 0.0 };
    int     n_observations = 9;
    int     n_independent = 3;
    int     n_dependent = 2;
    int     nh = 1;
    /* Hypothesis matrix H: selects the third independent variable */
    float   h[]     = { 0, 0, 0, 1 };
    int     nu = 2;
    /* U matrix (n_dependent by nu): identity, same as the default */
    float   u[4]    = { 1, 0, 0, 1 };
    float   v1, v2, v3, v4, p1, p2, p3, p4;

    /* Fit the multivariate regression and keep the fit information */
    coefficients = imsls_f_regression(n_observations, n_independent,
        x, y,
        IMSLS_N_DEPENDENT, n_dependent,
        IMSLS_REGRESSION_INFO, &info,
        0);

    /* Hypothesis sums of squares and crossproducts; dfh is returned */
    scph = imsls_f_hypothesis_scph(info, nh, h, &dfh, 0);

    /* Request all four test statistics and their p-values */
    p_value = imsls_f_hypothesis_test(info, dfh, scph,
        IMSLS_U, nu, u,
        IMSLS_WILK_LAMBDA, &v1, &p1,
        IMSLS_ROY_MAX_ROOT, &v2, &p2,
        IMSLS_HOTELLING_TRACE, &v3, &p3,
        IMSLS_PILLAI_TRACE, &v4, &p4,
        0);

    printf("Wilk      value = %10.6f   p-value = %10.6f\n", v1, p1);
    printf("Roy       value = %10.6f   p-value = %10.6f\n", v2, p2);
    printf("Hotelling value = %10.6f   p-value = %10.6f\n", v3, p3);
    printf("Pillai    value = %10.6f   p-value = %10.6f\n", v4, p4);

    return 0;
}

Output

Wilk      value =   0.003149   p-value =   0.000010
Roy       value = 316.600861   p-value =   0.000010
Hotelling value = 316.600861   p-value =   0.000010
Pillai    value =   0.996851   p-value =   0.000010

Warning Errors

IMSLS_SINGULAR_1                                   “u”*“scpe”*“u” is singular. Only Pillai’s trace can be computed. Other statistics are set to NaN.

Fatal Errors

IMSLS_NO_STAT_1                                     “scpe” + “scph” is singular. No tests can be computed.

IMSLS_NO_STAT_2                                     No statistics can be computed. Iterations for eigenvalues for the generalized eigenvalue problem “scph”*x = (lambda)*(“scph”+“scpe”)*x failed to converge.

IMSLS_NO_STAT_3                                     No statistics can be computed. Iterations for eigenvalues for the generalized eigenvalue problem “scph”*x = (lambda)*(“scph”+“u”*“scpe”*“u”)*x failed to converge.

IMSLS_SINGULAR_2                                   “u”*“scpe”*“u” + “scph” is singular. No tests can be computed.

IMSLS_SINGULAR_TRI_MATRIX              The input triangular matrix is singular. The index of the first zero diagonal element is equal to #.


Visual Numerics, Inc.
Visual Numerics - Developers of IMSL and PV-WAVE
http://www.vni.com/