Analyzes a balanced factorial design with fixed effects.
#include <imsls.h>
float imsls_f_anova_factorial (int n_subscripts, int n_levels[], float y[], ..., 0)
The type double function is imsls_d_anova_factorial.
int n_subscripts (Input)
Number of subscripts: the number of factors in the model + 1 (for the error term).
int n_levels[] (Input)
Array of length n_subscripts. The first n_subscripts − 1 elements contain the number of levels for each of the factors; n_levels[n_subscripts − 1] is the number of observations per cell.
float y[] (Input)
Array of length n_levels[0] * n_levels[1] * … * n_levels[n_subscripts − 1] containing the responses. Argument y must not contain NaN for any of its elements; i.e., missing values are not allowed.
The function returns the p-value for the overall F test.
#include <imsls.h>
float imsls_f_anova_factorial (int n_subscripts, int n_levels[], float y[],
    IMSLS_MODEL_ORDER, int model_order,
    IMSLS_PURE_ERROR, or
    IMSLS_POOL_INTERACTIONS,
    IMSLS_ANOVA_TABLE, float **anova_table,
    IMSLS_ANOVA_TABLE_USER, float anova_table[],
    IMSLS_TEST_EFFECTS, float **test_effects,
    IMSLS_TEST_EFFECTS_USER, float test_effects[],
    IMSLS_MEANS, float **means,
    IMSLS_MEANS_USER, float means[],
    0)
IMSLS_MODEL_ORDER, int model_order (Input)
Number of factors to be included in the highest-way interaction in the model. Argument model_order must be in the interval [1, n_subscripts − 1]. For example, a model_order of 1 indicates that a main effect model will be analyzed, and a model_order of 2 indicates that two-way interactions will be included in the model.
Default: model_order = n_subscripts − 1
IMSLS_PURE_ERROR, or IMSLS_POOL_INTERACTIONS (Input)
IMSLS_PURE_ERROR, the default option, indicates factor n_subscripts is error. Its main effect and all its interaction effects are pooled into the error with the other (model_order + 1)-way and higher-way interactions. IMSLS_POOL_INTERACTIONS indicates factor n_subscripts is not error. Only (model_order + 1)-way and higher-way interactions are included in the error.
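The effect of these two options on the error term can be illustrated by counting degrees of freedom. The following is a minimal sketch, not part of the IMSL library: the helper names model_df and error_df are hypothetical, and the sketch assumes the usual fixed-effects result that an effect has the product of (levels − 1) degrees of freedom over the factors it involves.

```c
#include <assert.h>

/* Hypothetical helper (not an IMSL routine): total degrees of freedom of
   all effects that involve between 1 and model_order of the n factors.
   Each subset of factors is represented by a bit mask; the degrees of
   freedom of an effect are the product of (levels - 1) over its factors. */
long model_df(int n, const int levels[], int model_order)
{
    long total = 0;
    unsigned mask;

    assert(n >= 1 && n < 31 && model_order >= 1 && model_order <= n);
    for (mask = 1; mask < (1u << n); mask++) {
        int i, size = 0;
        long df = 1;

        for (i = 0; i < n; i++) {
            if (mask & (1u << i)) {
                size++;
                df *= levels[i] - 1;
            }
        }
        if (size <= model_order)   /* higher-way effects go to error */
            total += df;
    }
    return total;
}

/* Error degrees of freedom: what remains of the n_obs - 1 total
   (corrected) degrees of freedom once the model effects are removed. */
long error_df(long n_obs, long df_model)
{
    return n_obs - 1 - df_model;
}
```

For a 3 × 2 layout with 10 responses per cell under IMSLS_PURE_ERROR (n = 2 factors, model_order = 2), model_df gives 5 and error_df(60, 5) gives 54. For a 3 × 3 × 3 layout with one response per cell under IMSLS_POOL_INTERACTIONS (n = 3, model_order = 2), the values are 18 and 8. Both pairs agree with the degrees of freedom printed in the examples on this page.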
IMSLS_ANOVA_TABLE, float **anova_table (Output)
Address of a pointer to an internally allocated array of size 15 containing the analysis of variance table. The analysis of variance statistics are given as follows:
Element | Analysis of Variance Statistics
0 | degrees of freedom for the model
1 | degrees of freedom for error
2 | total (corrected) degrees of freedom
3 | sum of squares for the model
4 | sum of squares for error
5 | total (corrected) sum of squares
6 | model mean square
7 | error mean square
8 | overall F-statistic
9 | p-value
10 | R² (in percent)
11 | adjusted R² (in percent)
12 | estimate of the standard deviation
13 | overall mean of y
14 | coefficient of variation (in percent)
IMSLS_ANOVA_TABLE_USER, float anova_table[] (Output)
Storage for array anova_table is provided by the user. See IMSLS_ANOVA_TABLE.
IMSLS_TEST_EFFECTS, float **test_effects (Output)
Address of a pointer to an internally allocated NEF × 4 array containing statistics relating to the sums of squares for the effects in the model. Here,
$$\mathrm{NEF} = \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{\min(n,\ \mathrm{model\_order})}$$
where n is given by n_subscripts if IMSLS_POOL_INTERACTIONS is specified; otherwise, n_subscripts − 1.
Suppose the factors are A, B, C, and error. With model_order = 3, rows 0 through NEF − 1 would correspond to A, B, C, AB, AC, BC, and ABC, respectively. The columns of test_effects are as follows:
Column | Description
0 | degrees of freedom
1 | sum of squares
2 | F-statistic
3 | p-value
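To size a user-supplied test_effects array, the number of rows NEF is needed. The sketch below assumes NEF is the number of ways to choose between 1 and min(n, model_order) of the n factors, which agrees with the A, B, C, AB, AC, BC, ABC listing above (7 effects for n = 3 and model_order = 3). The helper names binomial and count_effects are hypothetical and are not IMSL routines.

```c
#include <assert.h>

/* Binomial coefficient C(n, k), built up multiplicatively so that every
   intermediate value is itself an exact integer. */
long binomial(int n, int k)
{
    long c = 1;
    int i;

    assert(n >= 0 && k >= 0 && k <= n);
    for (i = 1; i <= k; i++)
        c = c * (n - k + i) / i;
    return c;
}

/* Hypothetical helper: number of effects (rows of test_effects) for n
   factors when interactions of up to model_order factors are included. */
long count_effects(int n, int model_order)
{
    long nef = 0;
    int k;

    assert(n >= 1 && model_order >= 1);
    for (k = 1; k <= model_order && k <= n; k++)
        nef += binomial(n, k);
    return nef;
}
```

A user-supplied array for IMSLS_TEST_EFFECTS_USER would then need 4 * count_effects(n, model_order) elements; for example, count_effects(2, 2) is 3 and count_effects(3, 2) is 6, matching the row counts passed to imsls_f_write_matrix in the examples on this page.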
IMSLS_TEST_EFFECTS_USER, float test_effects[] (Output)
Storage for array test_effects is provided by the user. See IMSLS_TEST_EFFECTS.
IMSLS_MEANS, float **means (Output)
Address of a pointer to an internally allocated array of length (n_levels[0] + 1) × (n_levels[1] + 1) × … × (n_levels[n − 1] + 1) containing the subgroup means.
See argument IMSLS_TEST_EFFECTS for a definition of n. If the factors are A, B, C, and error, the ordering of the means is grand mean, A means, B means, C means, AB means, AC means, BC means, and ABC means.
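The length formula can be read as a count of every subgroup: expanding the product (a + 1)(b + 1) gives 1 + a + b + ab, that is, one grand mean, a means for A, b means for B, and ab cell means. A minimal sketch of the computation follows; the helper name means_length is hypothetical and is not an IMSL routine.

```c
#include <assert.h>

/* Hypothetical helper: length of the means array for n factors,
   (n_levels[0] + 1) * (n_levels[1] + 1) * ... * (n_levels[n - 1] + 1). */
long means_length(int n, const int n_levels[])
{
    long len = 1;
    int i;

    assert(n >= 1);
    for (i = 0; i < n; i++) {
        assert(n_levels[i] >= 1);
        len *= n_levels[i] + 1;
    }
    return len;
}
```

For two factors with 3 and 2 levels the result is 12, which is the number of rows printed for the subgroup means in the second example on this page.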
IMSLS_MEANS_USER, float means[] (Output)
Storage for array means is provided by the user. See IMSLS_MEANS.
Function imsls_f_anova_factorial performs an analysis for an n-way classification design with balanced data. For balanced data, there must be an equal number of responses in each cell of the n-way layout. The effects are assumed to be fixed effects. The model is an extension of the two-way model to include n factors. The interactions (two-way, three-way, up to n-way) can be included in the model, or some of the higher-way interactions can be pooled into error. The argument model_order specifies the number of factors to be included in the highest-way interaction. For example, if three-way and higher-way interactions are to be pooled into error, set model_order = 2. (By default, model_order = n_subscripts − 1 with the last subscript being the error subscript.) Argument IMSLS_PURE_ERROR indicates there are repeated responses within the n-way cell; IMSLS_POOL_INTERACTIONS indicates otherwise.
Function imsls_f_anova_factorial requires the responses as input into a single vector y in lexicographical order, so that the response subscript associated with the first factor varies least rapidly, followed by the subscript associated with the second factor, and so forth. Hemmerle (1967, Chapter 5) discusses the computational method.
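The lexicographical ordering described above is the same as row-major array order, with the last subscript (the replicate within a cell, under IMSLS_PURE_ERROR) varying most rapidly. The following sketch maps a set of zero-based subscripts to a position in y; the helper name response_index is hypothetical and is not an IMSL routine.

```c
#include <assert.h>

/* Hypothetical helper: zero-based position in y of the response whose
   zero-based subscripts are sub[0..n_subscripts-1]. The first subscript
   varies least rapidly and the last varies most rapidly. */
long response_index(int n_subscripts, const int n_levels[], const int sub[])
{
    long idx = 0;
    int i;

    assert(n_subscripts >= 1);
    for (i = 0; i < n_subscripts; i++) {
        assert(sub[i] >= 0 && sub[i] < n_levels[i]);
        idx = idx * n_levels[i] + sub[i];   /* Horner-style accumulation */
    }
    return idx;
}
```

With n_levels = {3, 2, 10}, the subscripts (1, 0, 0) map to position 20, so the first response for the second level of the first factor and the first level of the second factor must be stored in y[20].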
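A two-way analysis of variance is performed with balanced data discussed by Snedecor and Cochran (1967, Table 12.5.1, p. 347). The responses are the weight gains (in grams) of rats that were fed diets varying in the source (A) and level (B) of protein. The model is
$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}, \qquad i = 1, 2, 3;\; j = 1, 2;\; k = 1, 2, \ldots, 10$$
where
$$\sum_{i=1}^{3} \alpha_i = 0, \qquad \sum_{j=1}^{2} \beta_j = 0, \qquad \sum_{i=1}^{3} \gamma_{ij} = 0 \ \text{for } j = 1, 2, \qquad \sum_{j=1}^{2} \gamma_{ij} = 0 \ \text{for } i = 1, 2, 3.$$
The responses in each cell of the two-way layout are given in the following table: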
                  | Protein Source (A)
Protein Level (B) | Beef | Cereal | Pork
High | 73, 102, 118, 104, 81, 107, 100, 87, 117, 111 | 98, 74, 56, 111, 95, 88, 82, 77, 86, 92 | 94, 79, 96, 98, 102, 102, 108, 91, 120, 105
Low | 90, 76, 90, 64, 86, 51, 72, 90, 95, 78 | 107, 95, 97, 80, 98, 74, 74, 67, 89, 58 | 49, 82, 73, 86, 81, 97, 106, 70, 61, 82
#include <stdio.h>
#include <imsls.h>

int main(void)
{
    int n_subscripts = 3;
    int n_levels[3] = {3, 2, 10};
    float p_value;
    /* Responses in lexicographical order: 10 per cell, with protein
       level (B) varying within protein source (A). */
    float y[60] = {
        73.0, 102.0, 118.0, 104.0, 81.0, 107.0, 100.0, 87.0, 117.0, 111.0,
        90.0, 76.0, 90.0, 64.0, 86.0, 51.0, 72.0, 90.0, 95.0, 78.0,
        98.0, 74.0, 56.0, 111.0, 95.0, 88.0, 82.0, 77.0, 86.0, 92.0,
        107.0, 95.0, 97.0, 80.0, 98.0, 74.0, 74.0, 67.0, 89.0, 58.0,
        94.0, 79.0, 96.0, 98.0, 102.0, 102.0, 108.0, 91.0, 120.0, 105.0,
        49.0, 82.0, 73.0, 86.0, 81.0, 97.0, 106.0, 70.0, 61.0, 82.0
    };

    p_value = imsls_f_anova_factorial(n_subscripts, n_levels, y, 0);
    printf("P-value = %10.6f\n", p_value);
    return 0;
}
P-value =   0.002299
In this example, the same model and data are fit as in the initial example, but optional arguments are used for a more complete analysis.
#include <stdio.h>
#include <imsls.h>

int main(void)
{
    int n_subscripts = 3;
    int n_levels[3] = {3, 2, 10};
    float p_value;
    float *test_effects, *means, *anova_table;
    /* Responses in lexicographical order: 10 per cell, with protein
       level (B) varying within protein source (A). */
    float y[60] = {
        73.0, 102.0, 118.0, 104.0, 81.0, 107.0, 100.0, 87.0, 117.0, 111.0,
        90.0, 76.0, 90.0, 64.0, 86.0, 51.0, 72.0, 90.0, 95.0, 78.0,
        98.0, 74.0, 56.0, 111.0, 95.0, 88.0, 82.0, 77.0, 86.0, 92.0,
        107.0, 95.0, 97.0, 80.0, 98.0, 74.0, 74.0, 67.0, 89.0, 58.0,
        94.0, 79.0, 96.0, 98.0, 102.0, 102.0, 108.0, 91.0, 120.0, 105.0,
        49.0, 82.0, 73.0, 86.0, 81.0, 97.0, 106.0, 70.0, 61.0, 82.0
    };
    /* Row labels for the 15 elements of anova_table */
    char *labels[] = {
        "degrees of freedom for the model",
        "degrees of freedom for error",
        "total (corrected) degrees of freedom",
        "sum of squares for the model",
        "sum of squares for error",
        "total (corrected) sum of squares",
        "model mean square",
        "error mean square",
        "F-statistic",
        "p-value",
        "R-squared (in percent)",
        "Adjusted R-squared (in percent)",
        "est. standard deviation of the model error",
        "overall mean of y",
        "coefficient of variation (in percent)"
    };
    char *test_row_labels[] = {"A", "B", "A*B"};
    /* Note: column 2 of test_effects holds the F-statistic (see
       IMSLS_TEST_EFFECTS), although it is labeled "Mean Square" here. */
    char *test_col_labels[] = {
        "Source", "DF", "Sum of\nSquares",
        "Mean\nSquare", "Prob. of\nLarger F"
    };
    char *mean_row_labels[] = {
        "grand mean",
        "A1", "A2", "A3",
        "B1", "B2",
        "A1*B1", "A1*B2", "A2*B1", "A2*B2", "A3*B1", "A3*B2"
    };

    /* Perform analysis */
    p_value = imsls_f_anova_factorial(n_subscripts, n_levels, y,
        IMSLS_ANOVA_TABLE, &anova_table,
        IMSLS_TEST_EFFECTS, &test_effects,
        IMSLS_MEANS, &means,
        0);
    printf("P-value = %10.6f\n", p_value);

    /* Print results */
    imsls_f_write_matrix("* * * Analysis of Variance * * *\n", 15, 1,
        anova_table,
        IMSLS_ROW_LABELS, labels,
        IMSLS_WRITE_FORMAT, "%11.4f",
        0);
    imsls_f_write_matrix("* * * Variation Due to the Model * * *", 3, 4,
        test_effects,
        IMSLS_ROW_LABELS, test_row_labels,
        IMSLS_COL_LABELS, test_col_labels,
        IMSLS_WRITE_FORMAT, "%11.4f",
        0);
    imsls_f_write_matrix("* * * Subgroup Means * * *", 12, 1,
        means,
        IMSLS_ROW_LABELS, mean_row_labels,
        IMSLS_WRITE_FORMAT, "%11.4f",
        0);
    return 0;
}
P-value =   0.002299

          * * * Analysis of Variance * * *

degrees of freedom for the model                  5.0000
degrees of freedom for error                     54.0000
total (corrected) degrees of freedom             59.0000
sum of squares for the model                   4612.9346
sum of squares for error                      11585.9990
total (corrected) sum of squares              16198.9336
model mean square                               922.5869
error mean square                               214.5555
F-statistic                                       4.3000
p-value                                           0.0023
R-squared (in percent)                           28.4768
Adjusted R-squared (in percent)                  21.8543
est. standard deviation of the model error       14.6477
overall mean of y                                87.8667
coefficient of variation (in percent)            16.6704

        * * * Variation Due to the Model * * *
Source           DF       Sum of         Mean     Prob. of
                         Squares       Square     Larger F
A            2.0000     266.5330       0.6211       0.5411
B            1.0000    3168.2678      14.7667       0.0003
A*B          2.0000    1178.1337       2.7455       0.0732

* * * Subgroup Means * * *
grand mean      87.8667
A1              89.6000
A2              84.9000
A3              89.1000
B1              95.1333
B2              80.6000
A1*B1          100.0000
A1*B2           79.2000
A2*B1           85.9000
A2*B2           83.9000
A3*B1           99.5000
A3*B2           78.7000
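The sums of squares in this output can be cross-checked by hand with the textbook decomposition for a balanced two-way layout with replication. The sketch below is an independent illustration and not the method used by the library (the text above cites Hemmerle for that); the function name two_way_ss and the fixed dimensions are assumptions chosen to fit this example's data.

```c
#include <assert.h>

#define NA 3    /* levels of factor A (protein source) */
#define NB 2    /* levels of factor B (protein level)  */
#define NR 10   /* responses per cell                  */

/* Illustrative cross-check (not an IMSL routine). y is ordered as it is
   passed to the library: replicates within B within A. On return,
   ss[0..3] hold SS(A), SS(B), SS(A*B), and SS(error). */
void two_way_ss(const float y[NA * NB * NR], double ss[4])
{
    double cell[NA][NB], a[NA] = {0}, b[NB] = {0}, grand = 0.0;
    int i, j, k;

    assert(y != 0 && ss != 0);

    /* Cell means, then marginal and grand means (valid because the
       design is balanced, so means of cell means are the true means). */
    for (i = 0; i < NA; i++)
        for (j = 0; j < NB; j++) {
            cell[i][j] = 0.0;
            for (k = 0; k < NR; k++)
                cell[i][j] += y[(i * NB + j) * NR + k];
            cell[i][j] /= NR;
            a[i] += cell[i][j] / NB;
            b[j] += cell[i][j] / NA;
            grand += cell[i][j] / (NA * NB);
        }

    ss[0] = ss[1] = ss[2] = ss[3] = 0.0;
    for (i = 0; i < NA; i++)
        ss[0] += NB * NR * (a[i] - grand) * (a[i] - grand);
    for (j = 0; j < NB; j++)
        ss[1] += NA * NR * (b[j] - grand) * (b[j] - grand);
    for (i = 0; i < NA; i++)
        for (j = 0; j < NB; j++) {
            double g = cell[i][j] - a[i] - b[j] + grand;  /* interaction */

            ss[2] += NR * g * g;
            for (k = 0; k < NR; k++) {
                double e = y[(i * NB + j) * NR + k] - cell[i][j];

                ss[3] += e * e;                           /* pure error  */
            }
        }
}
```

Called with the 60 responses from this example, the function gives sums of squares of approximately 266.53 for A, 3168.27 for B, 1178.13 for A*B, and 11586.00 for error, in agreement with the printed table up to single-precision rounding.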
This example performs a three-way analysis of variance using data discussed by Peter W.M. John (1971, pp. 91−92). The responses are weights (in grams) of roots of carrots grown with varying amounts of applied nitrogen (A), potassium (B), and phosphorus (C). Each cell of the three-way layout has one response. Note that the ABC interaction sum of squares, which is 186, is given incorrectly by Peter W.M. John (1971, Table 5.2). The three-way layout is given in the following table:
   | A0    |        |       | A1    |        |        | A2    |        |
   | B0    | B1     | B2    | B0    | B1     | B2     | B0    | B1     | B2
C0 | 88.76 | 91.41  | 97.85 | 94.83 | 100.49 | 99.75  | 99.90 | 100.23 | 104.51
C1 | 87.45 | 98.27  | 95.85 | 84.57 | 97.20  | 112.30 | 92.98 | 107.77 | 110.94
C2 | 86.01 | 104.20 | 90.09 | 81.06 | 120.80 | 108.77 | 94.72 | 118.39 | 102.87
#include <stdio.h>
#include <imsls.h>

int main(void)
{
    int n_subscripts = 3;
    int n_levels[3] = {3, 3, 3};
    float p_value;
    float *test_effects, *anova_table;
    /* One response per cell; C varies most rapidly, then B, then A. */
    float y[27] = {
        88.76, 87.45, 86.01, 91.41, 98.27, 104.2, 97.85, 95.85, 90.09,
        94.83, 84.57, 81.06, 100.49, 97.2, 120.8, 99.75, 112.3, 108.77,
        99.9, 92.98, 94.72, 100.23, 107.77, 118.39, 104.51, 110.94, 102.87
    };
    /* Row labels for the 15 elements of anova_table */
    char *labels[] = {
        "degrees of freedom for the model",
        "degrees of freedom for error",
        "total (corrected) degrees of freedom",
        "sum of squares for the model",
        "sum of squares for error",
        "total (corrected) sum of squares",
        "model mean square",
        "error mean square",
        "F-statistic",
        "p-value",
        "R-squared (in percent)",
        "Adjusted R-squared (in percent)",
        "est. standard deviation of the model error",
        "overall mean of y",
        "coefficient of variation (in percent)"
    };
    char *test_row_labels[] = {"A", "B", "C", "A*B", "A*C", "B*C"};
    /* Note: column 2 of test_effects holds the F-statistic (see
       IMSLS_TEST_EFFECTS), although it is labeled "Mean Square" here. */
    char *test_col_labels[] = {
        "Source", "DF", "Sum of\nSquares",
        "Mean\nSquare", "Prob. of\nLarger F"
    };

    /* Perform analysis; the three-way interaction is pooled into error */
    p_value = imsls_f_anova_factorial(n_subscripts, n_levels, y,
        IMSLS_ANOVA_TABLE, &anova_table,
        IMSLS_TEST_EFFECTS, &test_effects,
        IMSLS_POOL_INTERACTIONS,
        0);

    /* Print results */
    printf("P-value = %10.6f\n", p_value);
    imsls_f_write_matrix("* * * Analysis of Variance * * *\n", 15, 1,
        anova_table,
        IMSLS_ROW_LABELS, labels,
        IMSLS_WRITE_FORMAT, "%11.4f",
        0);
    imsls_f_write_matrix("* * * Variation Due to the Model * * *", 6, 4,
        test_effects,
        IMSLS_ROW_LABELS, test_row_labels,
        IMSLS_COL_LABELS, test_col_labels,
        IMSLS_WRITE_FORMAT, "%11.4f",
        0);
    return 0;
}
P-value =   0.008299

          * * * Analysis of Variance * * *

degrees of freedom for the model                 18.0000
degrees of freedom for error                      8.0000
total (corrected) degrees of freedom             26.0000
sum of squares for the model                   2395.7290
sum of squares for error                        185.7763
total (corrected) sum of squares               2581.5054
model mean square                               133.0961
error mean square                                23.2220
F-statistic                                       5.7315
p-value                                           0.0083
R-squared (in percent)                           92.8036
Adjusted R-squared (in percent)                  76.6116
est. standard deviation of the model error        4.8189
overall mean of y                                98.9619
coefficient of variation (in percent)             4.8695

        * * * Variation Due to the Model * * *
Source           DF       Sum of         Mean     Prob. of
                         Squares       Square     Larger F
A            2.0000     488.3678      10.5152       0.0058
B            2.0000    1090.6559      23.4832       0.0004
C            2.0000      49.1484       1.0582       0.3911
A*B          4.0000     142.5856       1.5350       0.2804
A*C          4.0000      32.3474       0.3482       0.8383
B*C          4.0000     592.6240       6.3800       0.0131