rcbdFactorial¶
Analyzes data from balanced and unbalanced randomized complete-block
experiments. Unlike anovaFactorial, function
rcbdFactorial
allows for missing data and one or more locations.
Synopsis¶
rcbdFactorial (nLocations, nFactors, nLevels, model, y)
Required Arguments¶
- int nLocations (Input) - Number of locations. nLocations must be one or greater.
- int nFactors (Input) - Number of factors in the model.
- int nLevels[] (Input) - Array of length nFactors+1. The elements nLevels[0] through nLevels[nFactors-1] contain the number of levels for each factor. The last element, nLevels[nFactors], contains the number of blocks at a location. There must be at least two blocks and two levels for each factor, i.e., nLevels[i] ≥ 2 for i = 0, 1, …, nFactors.
- int model[] (Input) - An nObs by (nFactors+2) array identifying the location, block and factor levels associated with each observation in y. The first column must contain the location identifier and the second column must contain the block identifier for the observation associated with that row. The remaining columns, columns 3 through nFactors+2, should contain the factor level identifiers in the same order used in nLevels. If nLocations = 1, the first column is still required, but its contents are ignored.
- float y[] (Input) - An array of length nObs containing the experimental observations and any missing values. Missing values are indicated by placing a NaN (not a number) in y. The NaN value can be set using the function machine(6).
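As an illustration of the argument layout described above, the following sketch builds a `model` array for a hypothetical single-location 3x2x2 design with two blocks and marks two observations as missing. It uses only NumPy and does not call `rcbdFactorial`. The helper name `build_model` is hypothetical, the row order it produces is just one possible choice, and `numpy.nan` is assumed here as a stand-in for the NaN value that `machine(6)` provides.

```python
import itertools

import numpy as np


def build_model(n_locations, n_levels):
    """Return an nObs by (nFactors + 2) integer array of identifiers.

    n_levels follows the nLevels convention: the level count of each
    factor first, the number of blocks per location last.
    Identifiers start at 1. (Hypothetical helper, not a library call.)
    """
    *factor_levels, n_blocks = n_levels
    rows = []
    for loc in range(1, n_locations + 1):
        for block in range(1, n_blocks + 1):
            level_ranges = (range(1, k + 1) for k in factor_levels)
            for combo in itertools.product(*level_ranges):
                # Column 1: location, column 2: block, then factor levels.
                rows.append([loc, block, *combo])
    return np.array(rows, dtype=int)


n_levels = [3, 2, 2, 2]          # three factors (3, 2, 2 levels), 2 blocks
model = build_model(1, n_levels)

# Placeholder responses; two of them are flagged as missing with NaN.
y = np.ones(len(model))
y[[13, 14]] = np.nan
n_missing = int(np.isnan(y).sum())
```

Each row of `model` carries its own location, block and factor identifiers, so `len(model)` must equal the length of `y`.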
Return Value¶
A two dimensional, nAnova
by 6 array containing the ANOVA table, where:
and m=
nFactors
.
Each row in this array contains values for one of the effects in the ANOVA
table. The first value in each row, \(\text{anovaTable}_{i,0}\) =
anovaTable[i
*6]
, is the source identifier which identifies the
type of effect associated with values in that row. The remaining values in a
row contain the ANOVA table values using the following convention:
j | \(\texttt{anovaTable}_{i,j} = \texttt{anovaTable}[\texttt{i}*6+\texttt{j}]\) |
---|---|
0 | Source Identifier (values described below) |
1 | Degrees of freedom |
2 | Sum of squares |
3 | Mean squares |
4 | F-statistic |
5 | p-value for this F-statistic |
The values for the mean squares, F-statistic and p-value are set to NaN for the residual and corrected total effects.
Note that the p-value for the F-statistic is returned as 0.0 when the value is so small that all significant digits have been lost.
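The flat indexing convention `anovaTable[i*6 + j]` can be illustrated with a small stand-in array. The numbers below are made up for illustration and are not output of `rcbdFactorial`, and the column names in `COLUMNS` are hypothetical labels chosen for this sketch. Whether the Python wrapper hands back a flat or a two-dimensional array is not assumed here, so both views are shown.

```python
import numpy as np

N_COLS = 6
# Hypothetical short names for the six columns listed in the table above.
COLUMNS = ["source_id", "df", "ss", "ms", "f", "p"]

# Made-up values laid out row by row, six values per effect.
flat = np.array([
    -1.0, 1.0, 0.5, 0.5, np.nan, np.nan,
    -1.0, 2.0, 8.0, 4.0, 4.0, 0.05,
    -2.0, 9.0, 9.0, 1.0, np.nan, np.nan,
])

# Two-dimensional view of the same data: one row per effect.
table = flat.reshape(-1, N_COLS)


def entry(flat_table, i, j):
    """Value for effect row i and column j using the i*6 + j convention."""
    return flat_table[i * N_COLS + j]


# Named access to a single row.
row1 = dict(zip(COLUMNS, table[1]))
```

Here `entry(flat, 1, 2)` and `table[1, 2]` refer to the same value, the sum of squares of the second row.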
The Source Identifiers in the first column of
\(\text{anovaTable}_{i,j}\) are the only negative values in
anovaTable[]
. The absolute value of the source identifier is equal to
the order of the effect in that row. Main effects, for example, have a
source identifier of –1. Two-way interactions use a source identifier of –2,
three-way interactions use –3, and so on.
Source Identifier | ANOVA Source |
---|---|
-1 | Main Effects † |
-2 | Two-Way Interactions ‡ |
-3 | Three-Way Interactions ‡ |
. | . |
. | . |
. | . |
-nFactors | (nFactors)-way Interactions ‡ |
-nFactors-1 | Error Term for Factors and Interactions |
-nFactors-2 | Residual * |
-nFactors-3 | Corrected Total |
Note: The Effects Error Term is equal to the Residual effect if nLocations = 1.
† The number of main effects is equal to nFactors+2 if nLocations > 1, and nFactors+1 if nLocations = 1. The
first two rows, anovaTable[0] through anovaTable[11], are used to
represent the location and block effects if nLocations > 1. If
nLocations = 1, then anovaTable[0] through anovaTable[5] contain
the block effects.
‡ The number of interaction effects for the n-th-way interactions is equal to \(\binom{\mathrm{nFactors}}{n}\), the number of ways of choosing n of the nFactors factors.
The order of these terms is in ascending order by treatment subscript. The interactions for factor 1 appear first, followed by factor 2, factor 3, and so on.
* The residual term is only produced when there is replication within blocks.
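The row counts implied by the footnotes above can be tallied with a few lines of Python. This is a sketch only: `effect_counts` is a hypothetical helper, and it covers just the main-effect and interaction rows, not the error, residual and total rows.

```python
from math import comb


def effect_counts(n_factors, n_locations):
    """Map each source identifier to the number of rows that carry it.

    Only main effects (-1) and interactions (-2 ... -nFactors) are counted.
    """
    counts = {}
    # Main effects: the treatment factors plus blocks, and additionally
    # locations when there is more than one location.
    counts[-1] = n_factors + (2 if n_locations > 1 else 1)
    # n-way interactions: choose n of the nFactors treatment factors.
    for order in range(2, n_factors + 1):
        counts[-order] = comb(n_factors, order)
    return counts


single_site = effect_counts(n_factors=3, n_locations=1)
multi_site = effect_counts(n_factors=3, n_locations=2)
```

For a three-factor experiment at one location this gives four rows with identifier -1, three with -2 and one with -3.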
Optional Arguments¶
- nMissing (Output) - Number of missing values, if any, found in y. Missing values are denoted with a NaN (Not a Number) value.
- cv (Output) - Coefficient of Variation computed by:
- grandMean (Output) - Mean of all the data across every location.
- float factorMeans (Output) - An array of length nLevels[0]+nLevels[1]+…+nLevels[nFactors-1] containing the factor means.
- float factorStdErrors (Output) - An nFactors by 2 array containing factor standard errors and their associated degrees of freedom. The first column contains the standard errors for comparing two factor means and the second its associated degrees of freedom.
- float twoWayMeans (Output) - A one-dimensional array containing the two-way means for all two by two combinations of the factors. The total length of this array when nFactors > 1 is equal to \[\sum_{i=0}^{f} \sum_{j=i+1}^{f+1} \mathrm{nLevels}[i] \times \mathrm{nLevels}[j]\] where ƒ = nFactors-2. If nFactors = 1, None is returned. If nFactors > 1, the means are first produced for all combinations of the first two factors, followed by all combinations of the remaining factors using the subscript order suggested by the above formula. For example, if the experiment is a 2x2x2 factorial, the 12 two-way means would appear in the following order: \(A_1 B_1\), \(A_1 B_2\), \(A_2 B_1\), \(A_2 B_2\), \(A_1 C_1\), \(A_1 C_2\), \(A_2 C_1\), \(A_2 C_2\), \(B_1 C_1\), \(B_1 C_2\), \(B_2 C_1\), and \(B_2 C_2\).
- twoWayStdErrors (Output) - An nTwoWay by 2 array containing the two-way standard errors and their associated degrees of freedom, where \[\mathrm{nTwoWay} = \binom{\mathrm{nFactors}}{2}\] The first column contains the standard errors for comparing two 2-way interaction means and the second its associated degrees of freedom. The ordering of the rows in this array is similar to that used in twoWayMeans. For example, if nFactors = 4, then nTwoWay = 6 with the order AB, AC, AD, BC, BD, CD.
- treatmentMeans (Output) - An array of size nLevels[0] × nLevels[1] × … × nLevels[nFactors-1] containing the treatment means. The means are organized in ascending order by the value of the factor identifier. For example, if the experiment is a 2x2x2 factorial, the 8 means would appear in the following order: \(A_1 B_1 C_1\), \(A_1 B_1 C_2\), \(A_1 B_2 C_1\), \(A_1 B_2 C_2\), \(A_2 B_1 C_1\), \(A_2 B_1 C_2\), \(A_2 B_2 C_1\), and \(A_2 B_2 C_2\).
- treatmentStdError (Output) - An array of length 2 containing the standard error for comparing treatments, based upon the average number of replicates per treatment, and its associated degrees of freedom.
- anovaRowLabels (Output) - An array containing the labels for each of the nAnova rows of the returned ANOVA table. The label for the i-th row of the ANOVA table can be printed with print(anovaRowLabels[i]).
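The orderings described for `twoWayMeans`, `twoWayStdErrors` and `treatmentMeans` can be reproduced with `itertools`, which may help when attaching labels to the returned arrays. This is a sketch under the ordering rules stated above; the helper names and the letter labels are hypothetical and the code does not call the library.

```python
from itertools import combinations, product


def two_way_labels(factor_levels, names):
    """Labels for two-way means; factor_levels excludes the block count."""
    labels = []
    # Factor pairs in the order (0,1), (0,2), ..., (1,2), ...
    for i, j in combinations(range(len(factor_levels)), 2):
        levels_i = range(1, factor_levels[i] + 1)
        levels_j = range(1, factor_levels[j] + 1)
        # Within a pair, the second factor's level varies fastest.
        for a, b in product(levels_i, levels_j):
            labels.append(f"{names[i]}{a} {names[j]}{b}")
    return labels


def treatment_labels(factor_levels, names):
    """Labels for treatment means; the last factor varies fastest."""
    level_ranges = (range(1, k + 1) for k in factor_levels)
    return [
        " ".join(f"{name}{level}" for name, level in zip(names, combo))
        for combo in product(*level_ranges)
    ]


two_way = two_way_labels([2, 2, 2], "ABC")       # 12 labels
treatments = treatment_labels([2, 2, 2], "ABC")  # 8 labels
# Row order of the two-way standard errors for four factors.
pairs = ["".join(p) for p in combinations("ABCD", 2)]
```

For the 2x2x2 case the length of `two_way` is 12, matching the double sum given for `twoWayMeans`.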
Description¶
The function rcbdFactorial
is capable of analyzing randomized complete
block factorial experiments replicated in different locations. Missing
observations are estimated using the Yates method. Locations, if used, and
blocks are treated as random factors. All treatment factors are regarded as
fixed effects in the analysis. If nLocations
> 1, then blocks are
treated as nested within locations and the number of blocks used at each
location must be the same.
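The idea behind the Yates method can be sketched for the simplest case: one missing observation in a single-factor randomized complete-block layout. The classical estimate is \(\hat{y} = (tT + rB - G) / ((t-1)(r-1))\), where t is the number of treatments, r the number of blocks, T the total of the remaining observations for the affected treatment, B the total of the remaining observations in the affected block, and G the total of all remaining observations. How `rcbdFactorial` handles several missing values or a factorial treatment structure internally is not described here, so the function below is an illustrative assumption, not the library's implementation.

```python
def yates_single_missing(table, i0, j0):
    """Estimate the single missing cell (i0, j0) of a treatments-by-blocks table.

    table[i][j] is the observation for treatment i in block j; the value
    stored at (i0, j0) is ignored.
    """
    t = len(table)       # number of treatments
    r = len(table[0])    # number of blocks
    # Treatment total, block total and grand total without the missing cell.
    treatment_total = sum(v for j, v in enumerate(table[i0]) if j != j0)
    block_total = sum(table[i][j0] for i in range(t) if i != i0)
    grand_total = sum(
        v
        for i, row in enumerate(table)
        for j, v in enumerate(row)
        if (i, j) != (i0, j0)
    )
    return (t * treatment_total + r * block_total - grand_total) / ((t - 1) * (r - 1))


# Purely additive toy data: value = treatment effect + block effect.
toy = [
    [2.0, 3.0],
    [3.0, float("nan")],   # true value would be 4.0
]
estimate = yates_single_missing(toy, 1, 1)
```

For data that are exactly additive in treatment and block effects, as in the toy table, the estimate reproduces the removed value.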
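If nLocations = 1, then the residual mean square is used as the error
mean square in calculating the F-tests for all other effects. That is,
\[F_{\mathit{effect}} = \frac{MS_{\mathit{effect}}}{MS_{\mathit{residual}}}\]
when nLocations = 1.
In this case, the residual mean square is calculated by pooling all interactions between treatments and blocks. For example, if treatments are formed from two factors, A and B, then
\[MS_{\mathit{residual}} = \frac{SS_{A \times \mathit{Blocks}} + SS_{B \times \mathit{Blocks}} + SS_{A \times B \times \mathit{Blocks}}}{df_{A \times \mathit{Blocks}} + df_{B \times \mathit{Blocks}} + df_{A \times B \times \mathit{Blocks}}}\]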
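When nLocations = 1, \(MS_{\mathit{residual}}\) is also used to
calculate the standard errors between means. For example, in a two factor
experiment:
\[SE_A = \sqrt{\frac{2\,MS_{\mathit{residual}}}{N_A}}, \quad SE_B = \sqrt{\frac{2\,MS_{\mathit{residual}}}{N_B}}, \quad SE_{AB} = \sqrt{\frac{2\,MS_{\mathit{residual}}}{N_{AB}}}\]
where \(N_A\), \(N_B\) and \(N_{AB}\) are the number of observations for each level of the effects A, B and their interaction, respectively.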
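If nLocations > 1, then the error mean square is used as the denominator
of the F-test for effects:
\[F_{\mathit{effect}} = \frac{MS_{\mathit{effect}}}{MS_{\mathit{error}}}\]
The error mean square in this calculation is obtained by pooling all
interactions between each factor and locations. For example, if nLocations > 1 and nFactors = 2, then:
\[MS_{\mathit{error}} = \frac{SS_{A \times \mathit{Locations}} + SS_{B \times \mathit{Locations}} + SS_{A \times B \times \mathit{Locations}}}{df_{A \times \mathit{Locations}} + df_{B \times \mathit{Locations}} + df_{A \times B \times \mathit{Locations}}}\]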
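In this case, nLocations > 1, the standard errors for means are
calculated using the error mean square:
\[SE_A = \sqrt{\frac{2\,MS_{\mathit{error}}}{N_A}}, \quad SE_B = \sqrt{\frac{2\,MS_{\mathit{error}}}{N_B}}, \quad SE_{AB} = \sqrt{\frac{2\,MS_{\mathit{error}}}{N_{AB}}}\]
where \(N_A\), \(N_B\) and \(N_{AB}\) are the number of observations for each level of A, B and their interaction.
The F-test for differences between locations is calculated using the mean squares for blocks within locations:
\[F_{\mathit{Locations}} = \frac{MS_{\mathit{Locations}}}{MS_{\mathit{Blocks\ within\ Locations}}}\]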
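The pooling described in this section amounts to adding sums of squares and adding degrees of freedom before dividing. The sketch below works through that arithmetic with made-up numbers. All values and names are hypothetical, and the standard-error line uses the textbook formula for the difference between two means, which is an assumption about the form used rather than something confirmed by the library.

```python
from math import sqrt


def pooled_mean_square(terms):
    """Pool (sum_of_squares, degrees_of_freedom) pairs into one mean square."""
    ss_total = sum(ss for ss, _ in terms)
    df_total = sum(df for _, df in terms)
    return ss_total / df_total, df_total


# Hypothetical A x Blocks, B x Blocks and A x B x Blocks terms.
interaction_terms = [(4.0, 2), (3.0, 1), (5.0, 2)]
ms_pooled, df_pooled = pooled_mean_square(interaction_terms)  # 12 / 5

# F-statistic for a hypothetical effect A with mean square 7.2.
ms_a = 7.2
f_a = ms_a / ms_pooled

# Assumed textbook standard error for comparing two means of A,
# each based on n_a observations.
n_a = 8
se_a = sqrt(2.0 * ms_pooled / n_a)
```

The same arithmetic applies whether the pooled terms are the treatment-by-block interactions (one location) or the factor-by-location interactions (several locations).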
Example¶
This example is based upon data from an agricultural trial conducted by DOW Agrosciences. This is a three factor, 3x2x2, experiment replicated in two blocks at one location. For illustration, two observations are set to NaN to simulate missing observations.
from numpy import *
from pyimsl.stat.page import page, SET_PAGE_WIDTH
from pyimsl.stat.machine import machine
from pyimsl.stat.rcbdFactorial import rcbdFactorial
from pyimsl.stat.writeMatrix import writeMatrix
n_obs = 24
n_locations = 1
n_factors = 3
n_levels = [3, 2, 2, 2]
model = [[1, 1, 1, 1, 1], [1, 2, 1, 1, 1],
[1, 1, 1, 1, 2], [1, 2, 1, 1, 2],
[1, 1, 1, 2, 1], [1, 2, 1, 2, 1],
[1, 1, 1, 2, 2], [1, 2, 1, 2, 2],
[1, 1, 2, 1, 1], [1, 2, 2, 1, 1],
[1, 1, 2, 1, 2], [1, 2, 2, 1, 2],
[1, 1, 2, 2, 1], [1, 2, 2, 2, 1],
[1, 1, 2, 2, 2], [1, 2, 2, 2, 2],
[1, 1, 3, 1, 1], [1, 2, 3, 1, 1],
[1, 1, 3, 1, 2], [1, 2, 3, 1, 2],
[1, 1, 3, 2, 1], [1, 2, 3, 2, 1],
[1, 1, 3, 2, 2], [1, 2, 3, 2, 2]]
y = [4.42725419998168950, 2.98526261840015650,
2.12795543670654300, 4.36357164382934570,
2.55254390835762020, 2.78596709668636320,
1.21479606628417970, 2.68143519759178160,
2.47588264942169190, 4.69543695449829100,
5.01306104660034180, 3.01919978857040410,
4.73502767086029050, 0.00000000000000000,
0.00000000000000000, 5.05780076980590820,
5.01421167794615030, 3.61517095565795900,
4.11972457170486450, 4.71947982907295230,
6.51671624183654790, 4.22036057710647580,
4.73365202546119690, 4.68545144796371460]
page_width = 132
col_labels = [" ", "ID", "df", "SS", "MS", "F-Test", "P-Value"]
# Set missing observations
y[13] = machine(6)
y[14] = machine(6)
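# Empty list passed as the anovaRowLabels output argument;
# it is used below as the row labels of the printed table
aov_labels = []
anova_table = rcbdFactorial(n_locations, n_factors,
                            n_levels, model, y, anovaRowLabels=aov_labels)
# Set the output page width before printing the table
page(SET_PAGE_WIDTH, page_width)
writeMatrix(" *** ANALYSIS OF VARIANCE TABLE ***",
            anova_table, rowLabels=aov_labels,
            colLabels=col_labels,
            writeFormat="%3.0f%3.0f%8.2f%7.2f%7.2f%7.3f")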
Output¶
*** ANALYSIS OF VARIANCE TABLE ***
ID df SS MS F-Test P-Value
Blocks -1 1 0.01 0.01 ....... .......
[1] -1 2 14.73 7.37 5.15 0.032
[2] -1 1 0.24 0.24 0.17 0.692
[3] -1 1 0.15 0.15 0.10 0.756
[1]x[2] -2 2 5.79 2.89 2.02 0.188
[1]x[3] -2 2 1.02 0.51 0.36 0.709
[2]x[3] -2 1 0.20 0.20 0.14 0.719
[1]x[2]x[3] -3 2 0.13 0.06 0.05 0.956
Error -4 9 12.88 1.43 ....... .......
Total -6 21 35.15 ....... ....... .......
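As a consistency check on the printed table, the F-statistics can be recomputed from the df and SS columns using the single-location rule that each effect is tested against the error mean square. The sketch below copies the rounded values from the output above, so agreement is only expected to within rounding. It does not call the library.

```python
# (df, SS) pairs copied from the printed table; values are rounded.
rows = {
    "Blocks": (1, 0.01),
    "[1]": (2, 14.73),
    "[2]": (1, 0.24),
    "[3]": (1, 0.15),
    "[1]x[2]": (2, 5.79),
    "[1]x[3]": (2, 1.02),
    "[2]x[3]": (1, 0.20),
    "[1]x[2]x[3]": (2, 0.13),
    "Error": (9, 12.88),
}
# F-Test column as printed.
printed_f = {
    "[1]": 5.15, "[2]": 0.17, "[3]": 0.10,
    "[1]x[2]": 2.02, "[1]x[3]": 0.36, "[2]x[3]": 0.14,
    "[1]x[2]x[3]": 0.05,
}

error_df, error_ss = rows["Error"]
ms_error = error_ss / error_df

# F = (SS_effect / df_effect) / MS_error for every tested effect.
f_stats = {
    name: (rows[name][1] / rows[name][0]) / ms_error
    for name in printed_f
}

# The listed rows should add up to the Total row (df 21, SS 35.15).
df_sum = sum(df for df, _ in rows.values())
ss_sum = sum(ss for _, ss in rows.values())

for name, f_value in f_stats.items():
    print(f"{name:12s} recomputed F = {f_value:.2f}  printed F = {printed_f[name]:.2f}")
```

The total of 21 degrees of freedom also matches 24 observations minus the 2 missing values minus 1.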