polyPrediction¶
Computes predicted values, confidence intervals, and diagnostics after fitting a polynomial regression model.
Synopsis¶
polyPrediction (polyInfo, x)
Required Arguments¶
- structure
polyInfo
(Input) - A structure. See function polyRegression.
- float
x[]
(Input) - Array of length
nPredict
containing the values of the independent variable for which calculations are to be performed.
Return Value¶
An array of length nPredict
containing the predicted values.
Optional Arguments¶
confidence
, float (Input)Confidence level for both two-sided interval estimates on the mean and for two-sided prediction intervals in percent. Argument
confidence
must be in the range [0.0, 100.0). For one-sided intervals with confidence levelonecl
, where 50.0 ≤onecl
< 100.0, setconfidence
= 100.0 – 2.0\*
(100.0 −onecl
).Default:
confidence
= 95.0.weights
, float[]
(Input)Array of length
nPredict
containing the weight for each row ofx
. The computed prediction interval uses SSE/(DFE*weights
[i
]) for the estimated variance of a future response, where SSE is sum of squares error and DFE is degrees of freedom error.Default:
weights[]
= 1.scheffeCi
,lowerLimit
,upperLimit
(Output)- Array
lowerLimit
is an array of lengthnPredict
containing the lower confidence limits of Scheffé confidence intervals corresponding to the rows ofx
. ArrayupperLimit
is an array of lengthnPredict
containing the upper confidence limits of Scheffé confidence intervals corresponding to the rows ofx
. pointwiseCiPopMean
,lowerLimit
,upperLimit
(Output)- Array
lowerLimit
is an array of lengthnPredict
containing the lower confidence limits of the confidence intervals for two-sided interval estimates of the means, corresponding to the rows ofx
. ArrayupperLimit
is an array of lengthnPredict
containing the upper confidence limits of the confidence intervals for two-sided interval estimates of the means, corresponding to the rows ofx
. pointwiseCiNewSample
,lowerLimit
,upperLimit
(Output)- Array
lowerLimit
is an array of lengthnPredict
containing the lower confidence limits of the confidence intervals for two-sided prediction intervals, corresponding to the rows ofx
. ArrayupperLimit
is an array of lengthnPredict
containing the upper confidence limits of the confidence intervals for two-sided prediction intervals, corresponding to the rows ofx
. leverage
(Output)- An array of length
nPredict
containing the leverages. y
float[]
(Input)- Array of length
nPredict
containing the observed responses.
y must be specified if any of the following optional arguments
are specified. |
residual
(Output)- An array of length
nPredict
containing the residuals. standardizedResidual
(Output)- An array of length
nPredict
containing the standardized residuals. deletedResidual
(Output)- An array of length
nPredict
containing the deleted residuals. cooksd
(Output)- An array of length
nPredict
containing the Cook’s D statistics. dffits
(Output)- An array of length
nPredict
containing the DFFITS statistics, where DFFITS is the change in the predicted value of a point in the absence of the empirical value.
Description¶
Function polyPrediction
assumes a polynomial model
where the observed values of the \(y_i\)’s constitute the response, the \(x_i\)’s are the settings of the independent variable, the \(\beta_j\)’s are the regression coefficients and the \(\varepsilon_i\)’s are the errors that are independently distributed normal with mean 0 and the following variance:
Given the results of a polynomial regression, fitted using orthogonal
polynomials and weights \(w_i\), function polyPrediction
produces
predicted values, residuals, confidence intervals, prediction intervals, and
diagnostics for outliers and in influential cases.
Often, a predicted value and confidence interval are desired for a setting
of the independent variable not used in computing the regression fit. This
is accomplished by simply using a different x
matrix when calling
polyPrediction
than was used for the fit (function polyRegression
).
See Example 1.
Results from function polyRegression
, which produces the fit using
orthogonal polynomials, are used for input by the structure polyInfo
.
The fitted model from polyRegression
is
where the \(z_i\)’s are settings of the independent variable x scaled to the interval [−2, 2] and the \(p_j(z)\)’s are the orthogonal polynomials. The \(X^TX\) matrix for this model is a diagonal matrix with elements \(d_j\). The case statistics are easily computed from this model and are equal to those from the original polynomial model with \(\beta_j\)’s as the regression coefficients.
The leverage is computed as follows:
The estimated variance of
is given by the following:
The computation of the remainder of the case statistics follows easily from the definitions. See Diagnostics for Individual Cases for the definition of the case diagnostics.
Often, predicted values and confidence intervals are desired for
combinations of settings of the independent variables not used in computing
the regression fit. This can be accomplished by defining a new data matrix.
Since the information about the model fit is input in polyInfo
, it is
not necessary to send in the data set used for the original calculation of
the fit, i.e., only variable combinations for which predictions are desired
need be entered in x
.
Examples¶
Example 1¶
A polynomial model is fit to the data discussed by Neter and Wasserman (1974, pp. 279–285). The data set contains the response variable y measuring coffee sales (in hundred gallons) and the number of self-service dispensers. Responses for 14 similar cafeterias are in the data set.
from __future__ import print_function
from numpy import *
from pyimsl.stat.polyPrediction import polyPrediction
from pyimsl.stat.polyRegression import polyRegression
from pyimsl.stat.writeMatrix import writeMatrix
x = array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 4.0,
4.0, 5.0, 5.0, 6.0, 6.0, 7.0, 7.0])
y = array([508.1, 498.4, 568.2, 577.3, 651.7, 657.0, 755.3,
758.9, 787.6, 792.1, 841.4, 831.8, 854.7, 871.4])
x2 = array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
degree = 2
poly_info = []
n_predict = 8
# Generate the polynomial regression fit
coefficients = polyRegression(x, y, degree, polyRegressionInfo=poly_info)
# Compute predicted values
y_hat = polyPrediction(poly_info[0], x2)
# Print predicted values
writeMatrix("Predicted Values", y_hat)
Output¶
Predicted Values
1 2 3 4 5 6
503.3 578.3 645.4 704.4 755.6 798.8
7 8
834.1 861.4
Example 2¶
Predicted values, confidence intervals, and diagnostics are computed for the data set described in the first example.
from __future__ import print_function
from numpy import *
from pyimsl.stat.polyPrediction import polyPrediction
from pyimsl.stat.polyRegression import polyRegression
from pyimsl.stat.writeMatrix import writeMatrix
x = array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 4.0,
4.0, 5.0, 5.0, 6.0, 6.0, 7.0, 7.0])
y = array([508.1, 498.4, 568.2, 577.3, 651.7, 657.0, 755.3,
758.9, 787.6, 792.1, 841.4, 831.8, 854.7, 871.4])
degree = 2
poly_info = []
n_predict = 8
# Generate the polynomial regression fit
coefficients = polyRegression(x, y, degree, polyRegressionInfo=poly_info)
# Compute predicted values and case statistics
pointwise_ci_pop_mean = {}
pointwise_ci_new_sample = {}
s_residual = []
d_residual = []
leverage = []
cooksd = []
dffits = []
scheffe_ci = {}
y_hat = polyPrediction(poly_info[0], x,
pointwiseCiPopMean=pointwise_ci_pop_mean,
pointwiseCiNewSample=pointwise_ci_new_sample,
y=y,
standardizedResidual=s_residual,
deletedResidual=d_residual,
leverage=leverage,
cooksd=cooksd,
dffits=dffits,
scheffeCi=scheffe_ci)
# Print results
writeMatrix("Predicted Values", y_hat, writeFormat="%10.1f")
ssl = scheffe_ci["lowerLimit"]
writeMatrix("Lower Scheffe CI", ssl, writeFormat="%10.1f")
ssu = scheffe_ci["upperLimit"]
writeMatrix("Upper Scheffe CI", ssu, writeFormat="%10.1f")
pcil = pointwise_ci_pop_mean["lowerLimit"]
writeMatrix("Lower CI", pcil, writeFormat="%10.1f")
pciu = pointwise_ci_pop_mean["upperLimit"]
writeMatrix("Upper CI", pciu, writeFormat="%10.1f")
pil = pointwise_ci_new_sample["lowerLimit"]
writeMatrix("Lower PI", pil, writeFormat="%10.1f")
piu = pointwise_ci_new_sample["upperLimit"]
writeMatrix("Upper PI", piu, writeFormat="%10.1f")
writeMatrix("Standardized Residual", s_residual, writeFormat="%10.1f")
writeMatrix("Deleted Residual", d_residual, writeFormat="%10.1f")
writeMatrix("Leverage", leverage, writeFormat="%10.1f")
writeMatrix("Cooks Distance", cooksd, writeFormat="%10.1f")
writeMatrix("DFFITS", dffits, writeFormat="%10.1f")
Output¶
Predicted Values
1 2 3 4 5 6
503.3 503.3 578.3 578.3 645.4 645.4
7 8 9 10 11 12
755.6 755.6 798.8 798.8 834.1 834.1
13 14
861.4 861.4
Lower Scheffe CI
1 2 3 4 5 6
489.8 489.8 569.5 569.5 636.5 636.5
7 8 9 10 11 12
745.7 745.7 790.2 790.2 825.5 825.5
13 14
847.7 847.7
Upper Scheffe CI
1 2 3 4 5 6
516.9 516.9 587.1 587.1 654.2 654.2
7 8 9 10 11 12
765.5 765.5 807.4 807.4 842.7 842.7
13 14
875.1 875.1
Lower CI
1 2 3 4 5 6
492.8 492.8 571.4 571.4 638.4 638.4
7 8 9 10 11 12
747.9 747.9 792.1 792.1 827.4 827.4
13 14
850.7 850.7
Upper CI
1 2 3 4 5 6
513.9 513.9 585.2 585.2 652.3 652.3
7 8 9 10 11 12
763.3 763.3 805.5 805.5 840.8 840.8
13 14
872.1 872.1
Lower PI
1 2 3 4 5 6
482.7 482.7 559.3 559.3 626.3 626.3
7 8 9 10 11 12
736.3 736.3 779.9 779.9 815.2 815.2
13 14
840.7 840.7
Upper PI
1 2 3 4 5 6
524.0 524.0 597.3 597.3 664.4 664.4
7 8 9 10 11 12
774.9 774.9 817.7 817.7 853.0 853.0
13 14
882.1 882.1
Standardized Residual
1 2 3 4 5 6
0.7 -0.8 -1.4 -0.1 0.9 1.6
7 8 9 10 11 12
-0.0 0.5 -1.5 -0.9 1.0 -0.3
13 14
-1.0 1.6
Deleted Residual
1 2 3 4 5 6
0.7 -0.8 -1.4 -0.1 0.8 1.7
7 8 9 10 11 12
-0.0 0.4 -1.6 -0.9 1.0 -0.3
13 14
-1.1 1.7
Leverage
1 2 3 4 5 6
0.4 0.4 0.2 0.2 0.2 0.2
7 8 9 10 11 12
0.2 0.2 0.1 0.1 0.1 0.1
13 14
0.4 0.4
Cooks Distance
1 2 3 4 5 6
0.1 0.1 0.1 0.0 0.0 0.1
7 8 9 10 11 12
0.0 0.0 0.1 0.0 0.1 0.0
13 14
0.2 0.5
DFFITS
1 2 3 4 5 6
0.5 -0.6 -0.6 -0.1 0.4 0.7
7 8 9 10 11 12
-0.0 0.2 -0.7 -0.4 0.4 -0.1
13 14
-0.8 1.3
Warning Errors¶
IMSLS_LEVERAGE_GT_1 |
A leverage (= #) much greater than one is computed. It is set to 1.0. |
IMSLS_DEL_MSE_LT_0 |
A deleted residual mean square (= #) much less than zero is computed. It is set to zero. |
Fatal Errors¶
IMSLS_NEG_WEIGHT |
“weights[#]” = #. Weights must be nonnegative. |