Computes predicted values, confidence intervals, and diagnostics after fitting a regression model.
#include <imsls.h>
float *imsls_f_regression_prediction (Imsls_f_regression *regression_info, int n_predict, float x[], ..., 0)
The type double function is imsls_d_regression_prediction.
Imsls_f_regression *regression_info (Input)
Pointer to a structure of type Imsls_f_regression containing information about the regression fit. See imsls_f_regression.
int n_predict (Input)
Number of rows in x.
float x[] (Input)
Array of size n_predict by the number of independent variables containing the combinations of independent variables in each row for which calculations are to be performed.
Pointer to an internally allocated array of length n_predict containing the predicted values.
#include <imsls.h>
float *imsls_f_regression_prediction (Imsls_f_regression *regression_info, int n_predict, float x[],
IMSLS_X_COL_DIM, int x_col_dim,
IMSLS_Y_COL_DIM, int y_col_dim,
IMSLS_INDEX_REGRESSION, int idep,
IMSLS_X_INDICES, int indind[], int inddep[], int ifrq, int iwt,
IMSLS_WEIGHTS, float weights[],
IMSLS_CONFIDENCE, float confidence,
IMSLS_SCHEFFE_CI, float **lower_limit, float **upper_limit,
IMSLS_SCHEFFE_CI_USER, float lower_limit[], float upper_limit[],
IMSLS_POINTWISE_CI_POP_MEAN, float **lower_limit, float **upper_limit,
IMSLS_POINTWISE_CI_POP_MEAN_USER, float lower_limit[], float upper_limit[],
IMSLS_POINTWISE_CI_NEW_SAMPLE, float **lower_limit, float **upper_limit,
IMSLS_POINTWISE_CI_NEW_SAMPLE_USER, float lower_limit[], float upper_limit[],
IMSLS_LEVERAGE, float **leverage,
IMSLS_LEVERAGE_USER, float leverage[],
IMSLS_RETURN_USER, float y_hat[],
IMSLS_Y, float y[],
IMSLS_RESIDUAL, float **residual,
IMSLS_RESIDUAL_USER, float residual[],
IMSLS_STANDARDIZED_RESIDUAL, float **standardized_residual,
IMSLS_STANDARDIZED_RESIDUAL_USER, float standardized_residual[],
IMSLS_DELETED_RESIDUAL, float **deleted_residual,
IMSLS_DELETED_RESIDUAL_USER, float deleted_residual[],
IMSLS_COOKSD, float **cooksd,
IMSLS_COOKSD_USER, float cooksd[],
IMSLS_DFFITS, float **dffits,
IMSLS_DFFITS_USER, float dffits[],
0)
IMSLS_X_COL_DIM, int x_col_dim
(Input)
Number of columns in x.
Default: x_col_dim is equal to the number of independent variables, which is input from the structure regression_info.
IMSLS_Y_COL_DIM, int y_col_dim
(Input)
Number of columns in y.
Default: y_col_dim = 1
IMSLS_INDEX_REGRESSION, int idep
(Input)
Given a multivariate regression fit, this option specifies the regression for which statistics are computed.
Default: idep = 0
IMSLS_X_INDICES, int indind[], int inddep[], int ifrq, int iwt
(Input)
This argument allows an alternative method for data specification. Data (independent, dependent, frequencies, and weights) are all stored in the data matrix x. Argument y and keyword IMSLS_WEIGHTS are ignored.
Each of the four arguments contains indices indicating column numbers of x in which particular types of data are stored. Columns are numbered 0, …, x_col_dim − 1.
Parameter indind contains the indices of the independent variables.
Parameter inddep contains the indices of the dependent variables. If there is to be no dependent variable, this must be indicated by setting the first element of the vector to −1.
Parameters ifrq and iwt contain the column numbers of x in which the frequencies and weights, respectively, are stored. Set ifrq = −1 if there will be no column for frequencies. Set iwt = −1 if there will be no column for weights. Weights are rounded to the nearest integer. Negative weights are not allowed.
Note that frequencies are not referenced by function imsls_f_regression_prediction; ifrq is included here only for the sake of keyword consistency.
Finally, note that IMSLS_X_INDICES and IMSLS_Y are mutually exclusive keywords and may not be specified in the same call to imsls_f_regression_prediction.
IMSLS_WEIGHTS, float weights[]
(Input)
Array of length n_predict containing
the weight for each row of x. The computed
prediction interval uses SSE/(DFE*weights[i])
for the estimated variance of a future response.
Default: weights[] = 1
IMSLS_CONFIDENCE, float confidence (Input)
Confidence level, in percent, for both two-sided interval estimates on the mean and two-sided prediction intervals. Argument confidence must be in the range [0.0, 100.0). For one-sided intervals with confidence level onecl, where 50.0 ≤ onecl < 100.0, set confidence = 100.0 − 2.0 * (100.0 − onecl).
Default: confidence = 95.0
IMSLS_SCHEFFE_CI, float **lower_limit, float
**upper_limit (Output)
Array lower_limit is the
address of a pointer to an internally allocated array of length n_predict containing
the lower confidence limits of Scheffé confidence intervals corresponding
to the rows of x. Array upper_limit is the
address of a pointer to an internally allocated array of length n_predict containing
the upper confidence limits of Scheffé confidence intervals corresponding to the
rows of x.
IMSLS_SCHEFFE_CI_USER, float lower_limit[], float
upper_limit[] (Output)
Storage for arrays lower_limit and upper_limit is
provided by the user. See IMSLS_SCHEFFE_CI.
IMSLS_POINTWISE_CI_POP_MEAN, float **lower_limit, float **upper_limit (Output)
Argument lower_limit is the address of a pointer to an internally allocated array of length n_predict containing the lower limits of the two-sided confidence intervals for the population means, corresponding to the rows of x. Argument upper_limit is the address of a pointer to an internally allocated array of length n_predict containing the upper limits of the same intervals, corresponding to the rows of x.
IMSLS_POINTWISE_CI_POP_MEAN_USER, float lower_limit[],
float upper_limit[]
(Output)
Storage for arrays lower_limit and upper_limit is
provided by the user. See IMSLS_POINTWISE_CI_POP_MEAN.
IMSLS_POINTWISE_CI_NEW_SAMPLE, float **lower_limit, float **upper_limit (Output)
Argument lower_limit is the address of a pointer to an internally allocated array of length n_predict containing the lower limits of the two-sided prediction intervals, corresponding to the rows of x. Argument upper_limit is the address of a pointer to an internally allocated array of length n_predict containing the upper limits of the same intervals, corresponding to the rows of x.
IMSLS_POINTWISE_CI_NEW_SAMPLE_USER, float lower_limit[],
float upper_limit[]
(Output)
Storage for arrays lower_limit and upper_limit is
provided by the user. See IMSLS_POINTWISE_CI_NEW_SAMPLE.
IMSLS_LEVERAGE, float
**leverage (Output)
Address of a pointer to an internally
allocated array of length n_predict containing
the leverages.
IMSLS_LEVERAGE_USER, float
leverage[] (Output)
Storage for array leverage is provided
by the user. See IMSLS_LEVERAGE.
IMSLS_RETURN_USER, float y_hat[]
(Output)
Storage for array y_hat is provided by
the user. The length n_predict array
contains the predicted values.
IMSLS_Y, float y[]
(Input)
Array of length n_predict containing
the observed responses.
Note: IMSLS_Y (or IMSLS_X_INDICES) must be specified if any of the following optional arguments are specified.
IMSLS_RESIDUAL, float
**residual (Output)
Address of a pointer to an internally
allocated array of length n_predict containing
the residuals.
IMSLS_RESIDUAL_USER, float
residual[] (Output)
Storage for array residual is provided
by the user. See IMSLS_RESIDUAL.
IMSLS_STANDARDIZED_RESIDUAL, float
**standardized_residual (Output)
Address of a pointer to
an internally allocated array of length n_predict containing
the standardized residuals.
IMSLS_STANDARDIZED_RESIDUAL_USER,
float standardized_residual[]
(Output)
Storage for array standardized_residual
is provided by the user. See IMSLS_STANDARDIZED_RESIDUAL.
IMSLS_DELETED_RESIDUAL, float
**deleted_residual (Output)
Address of a pointer to an
internally allocated array of length n_predict containing
the deleted residuals.
IMSLS_DELETED_RESIDUAL_USER, float
deleted_residual[] (Output)
Storage for array deleted_residual is
provided by the user. See IMSLS_DELETED_RESIDUAL.
IMSLS_COOKSD, float **cooksd
(Output)
Address of a pointer to an internally allocated array of length
n_predict
containing the Cook's D statistics.
IMSLS_COOKSD_USER, float cooksd[]
(Output)
Storage for array cooksd is provided by the user. See IMSLS_COOKSD.
IMSLS_DFFITS, float **dffits
(Output)
Address of a pointer to an internally allocated array of
length n_predict
containing the DFFITS statistics.
IMSLS_DFFITS_USER, float dffits[]
(Output)
Storage for array dffits is provided by
the user. See IMSLS_DFFITS.
The general linear model used by function imsls_f_regression_prediction is
$$y = X\beta + \varepsilon$$
where y is the n × 1 vector of responses, X is the n × p matrix of regressors, β is the p × 1 vector of regression coefficients, and ɛ is the n × 1 vector of errors whose elements are independently normally distributed with mean 0 and variance
$$\sigma^2 / w_i$$
From a general linear model fit using the $w_i$'s as the weights, function imsls_f_regression_prediction computes confidence intervals and statistics for the individual cases that constitute the data set. Let $x_i$ be a column vector containing the elements of the i-th row of X. Let $W = \mathrm{diag}(w_1, w_2, \ldots, w_n)$. The leverage is defined as
$$h_i = \left[ x_i^T (X^T W X)^{-} x_i \right] w_i$$
where the superscript − denotes a generalized inverse. Put $D = \mathrm{diag}(d_1, d_2, \ldots, d_p)$ with $d_j = 1$ if the j-th diagonal element of R is positive and 0 otherwise. The leverage is computed as $h_i = (a^T D a)\, w_i$, where a is a solution to $R^T a = x_i$.
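The estimated variance of
$$\hat{y}_i = x_i^T \hat{\beta}$$
is given by the following:
$$\frac{h_i s^2}{w_i}$$
where
$$s^2 = \frac{\mathrm{SSE}}{\mathrm{DFE}}$$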
The computation of the remaining case statistics follows directly from their definitions. See Diagnostics for Individual Cases for the definitions of the case diagnostics.
Informational errors can occur if the input matrix x is not consistent with the information from the fit (contained in regression_info), or if excess rounding has occurred. The warning error IMSLS_NONESTIMABLE arises when x contains a row not in the space spanned by the rows of R. An examination of the model that was fitted and the x for which diagnostics are to be computed is required in order to ensure that only linear combinations of the regression coefficients that can be estimated from the fitted model are specified in x. For further details, see the discussion of estimable functions given in Maindonald (1984, pp. 166−168) and Searle (1971, pp. 180−188).
Often predicted values and confidence intervals are desired for combinations of settings of the independent variables not used in computing the regression fit. This can be accomplished by defining a new data matrix. Since the information about the model fit is input in regression_info, it is not necessary to send in the data set used for the original calculation of the fit, i.e., only variable combinations for which predictions are desired need be entered in x.
#include <imsls.h>
int main()
{
#define INTERCEPT 1
#define N_INDEPENDENT 4
#define N_OBSERVATIONS 13
#define N_COEFFICIENTS (INTERCEPT + N_INDEPENDENT)
#define N_DEPENDENT 1
float *y_hat, *coefficients;
Imsls_f_regression *regression_info;
float x[][N_INDEPENDENT] = {
7.0, 26.0, 6.0, 60.0,
1.0, 29.0, 15.0, 52.0,
11.0, 56.0, 8.0, 20.0,
11.0, 31.0, 8.0, 47.0,
7.0, 52.0, 6.0, 33.0,
11.0, 55.0, 9.0, 22.0,
3.0, 71.0, 17.0, 6.0,
1.0, 31.0, 22.0, 44.0,
2.0, 54.0, 18.0, 22.0,
21.0, 47.0, 4.0, 26.0,
1.0, 40.0, 23.0, 34.0,
11.0, 66.0, 9.0, 12.0,
10.0, 68.0, 8.0, 12.0};
float y[] = {78.5, 74.3, 104.3, 87.6, 95.9, 109.2,
102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4};
/* Fit the regression model */
coefficients = imsls_f_regression(N_OBSERVATIONS, N_INDEPENDENT,
(float *)x, y,
IMSLS_REGRESSION_INFO, &regression_info,
0);
/* Generate case statistics */
y_hat = imsls_f_regression_prediction(regression_info,
N_OBSERVATIONS, (float*)x, 0);
/* Print results */
imsls_f_write_matrix("Predicted Responses", 1, N_OBSERVATIONS,
y_hat, 0);
}
Predicted Responses
1 2 3 4 5 6
78.5 72.8 106.0 89.3 95.6 105.3
7 8 9 10 11 12
104.1 75.7 91.7 115.6 81.8 112.3
13
111.7
#include <imsls.h>
int main()
{
#define INTERCEPT 1
#define N_INDEPENDENT 4
#define N_OBSERVATIONS 13
#define N_COEFFICIENTS (INTERCEPT + N_INDEPENDENT)
#define N_DEPENDENT 1
float *y_hat, *leverage, *residual, *standardized_residual,
*deleted_residual, *dffits, *cooksd, *mean_lower_limit,
*mean_upper_limit, *new_sample_lower_limit,
*new_sample_upper_limit, *scheffe_lower_limit,
*scheffe_upper_limit, *coefficients;
Imsls_f_regression *regression_info;
float x[][N_INDEPENDENT] = {
7.0, 26.0, 6.0, 60.0,
1.0, 29.0, 15.0, 52.0,
11.0, 56.0, 8.0, 20.0,
11.0, 31.0, 8.0, 47.0,
7.0, 52.0, 6.0, 33.0,
11.0, 55.0, 9.0, 22.0,
3.0, 71.0, 17.0, 6.0,
1.0, 31.0, 22.0, 44.0,
2.0, 54.0, 18.0, 22.0,
21.0, 47.0, 4.0, 26.0,
1.0, 40.0, 23.0, 34.0,
11.0, 66.0, 9.0, 12.0,
10.0, 68.0, 8.0, 12.0};
float y[] = {78.5, 74.3, 104.3, 87.6, 95.9, 109.2,
102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4};
/* Fit the regression model */
coefficients = imsls_f_regression(N_OBSERVATIONS, N_INDEPENDENT,
(float *)x, y,
IMSLS_REGRESSION_INFO, &regression_info,
0);
/* Generate the case statistics */
y_hat = imsls_f_regression_prediction(regression_info,
N_OBSERVATIONS, (float*)x,
IMSLS_Y, y,
IMSLS_LEVERAGE, &leverage,
IMSLS_RESIDUAL, &residual,
IMSLS_STANDARDIZED_RESIDUAL, &standardized_residual,
IMSLS_DELETED_RESIDUAL, &deleted_residual,
IMSLS_COOKSD, &cooksd,
IMSLS_DFFITS, &dffits,
IMSLS_POINTWISE_CI_POP_MEAN, &mean_lower_limit,
&mean_upper_limit,
IMSLS_POINTWISE_CI_NEW_SAMPLE, &new_sample_lower_limit,
&new_sample_upper_limit,
IMSLS_SCHEFFE_CI, &scheffe_lower_limit,
&scheffe_upper_limit,
0);
/* Print results */
imsls_f_write_matrix("Predicted Responses", 1, N_OBSERVATIONS,
y_hat, 0);
imsls_f_write_matrix("Residuals", 1, N_OBSERVATIONS, residual, 0);
imsls_f_write_matrix("Standardized Residuals", 1, N_OBSERVATIONS,
standardized_residual, 0);
imsls_f_write_matrix("Leverages", 1, N_OBSERVATIONS, leverage, 0);
imsls_f_write_matrix("Deleted Residuals", 1, N_OBSERVATIONS,
deleted_residual, 0);
imsls_f_write_matrix("Cooks D", 1, N_OBSERVATIONS, cooksd, 0);
imsls_f_write_matrix("DFFITS", 1, N_OBSERVATIONS, dffits, 0);
imsls_f_write_matrix("Scheffe Lower Limit", 1, N_OBSERVATIONS,
scheffe_lower_limit, 0);
imsls_f_write_matrix("Scheffe Upper Limit", 1, N_OBSERVATIONS,
scheffe_upper_limit, 0);
imsls_f_write_matrix("Population Mean Lower Limit", 1,
N_OBSERVATIONS, mean_lower_limit, 0);
imsls_f_write_matrix("Population Mean Upper Limit", 1,
N_OBSERVATIONS, mean_upper_limit, 0);
imsls_f_write_matrix("New Sample Lower Limit", 1, N_OBSERVATIONS,
new_sample_lower_limit, 0);
imsls_f_write_matrix("New Sample Upper Limit", 1, N_OBSERVATIONS,
new_sample_upper_limit, 0);
}
Predicted Responses
1 2 3 4 5 6
78.5 72.8 106.0 89.3 95.6 105.3
7 8 9 10 11 12
104.1 75.7 91.7 115.6 81.8 112.3
13
111.7
Residuals
1 2 3 4 5 6
0.005 1.511 -1.671 -1.727 0.251 3.925
7 8 9 10 11 12
-1.449 -3.175 1.378 0.282 1.991 0.973
13
-2.294
Standardized Residuals
1 2 3 4 5 6
0.003 0.757 -1.050 -0.841 0.128 1.715
7 8 9 10 11 12
-0.744 -1.688 0.671 0.210 1.074 0.463
13
-1.124
Leverages
1 2 3 4 5 6
0.5503 0.3332 0.5769 0.2952 0.3576 0.1242
7 8 9 10 11 12
0.3671 0.4085 0.2943 0.7004 0.4255 0.2630
13
0.3037
Deleted Residuals
1 2 3 4 5 6
0.003 0.735 -1.058 -0.824 0.120 2.017
7 8 9 10 11 12
-0.722 -1.967 0.646 0.197 1.086 0.439
13
-1.146
Cooks D
1 2 3 4 5 6
0.0000 0.0572 0.3009 0.0593 0.0018 0.0834
7 8 9 10 11 12
0.0643 0.3935 0.0375 0.0207 0.1708 0.0153
13
0.1102
DFFITS
1 2 3 4 5 6
0.003 0.519 -1.236 -0.533 0.089 0.759
7 8 9 10 11 12
-0.550 -1.635 0.417 0.302 0.935 0.262
13
-0.757
Scheffe Lower Limit
1 2 3 4 5 6
70.7 66.7 98.0 83.6 89.4 101.6
7 8 9 10 11 12
97.8 69.0 86.0 106.8 75.0 106.9
13
105.9
Scheffe Upper Limit
1 2 3 4 5 6
86.3 78.9 113.9 95.0 101.9 109.0
7 8 9 10 11 12
110.5 82.4 97.4 124.4 88.7 117.7
13
117.5
Population Mean Lower Limit
1 2 3 4 5 6
74.3 69.5 101.7 86.3 92.3 103.3
7 8 9 10 11 12
100.7 72.1 88.7 110.9 78.1 109.4
13
108.6
Population Mean Upper Limit
1 2 3 4 5 6
82.7 76.0 110.3 92.4 99.0 107.3
7 8 9 10 11 12
107.6 79.3 94.8 120.3 85.5 115.2
13
114.8
New Sample Lower Limit
1 2 3 4 5 6
71.5 66.3 98.9 82.9 89.1 99.3
7 8 9 10 11 12
97.6 69.0 85.3 108.3 75.1 106.0
13
105.3
New Sample Upper Limit
1 2 3 4 5 6
85.5 79.3 113.1 95.7 102.2 111.3
7 8 9 10 11 12
110.7 82.4 98.1 123.0 88.5 118.7
13
118.1
IMSLS_NONESTIMABLE Within the preset tolerance, the linear combination of regression coefficients is nonestimable.
IMSLS_LEVERAGE_GT_1 A leverage (= #) much greater than 1.0 is computed. It is set to 1.0.
IMSLS_DEL_MSE_LT_0 A deleted residual mean square (= #) much less than 0 is computed. It is set to 0.
IMSLS_NONNEG_WEIGHT_REQUEST_2 The weight for row # was #. Weights must be nonnegative.