splitSplitPlot
Analyzes data from split-split-plot experiments. The whole-plots can be
assigned to experimental units using either a completely randomized or
randomized complete block design. Function splitSplitPlot also analyzes
split‑split‑plot experiments replicated at several locations.
Synopsis
splitSplitPlot (nLocations, nWhole, nSplit, nSub, rep, whole, split, sub, y)
Required Arguments
- int nLocations (Input) - Number of locations. nLocations must be one or greater. If nLocations > 1, the optional array locations[] must be included as input. See optional argument locations.
- int nWhole (Input) - Number of levels associated with the whole-plot factor. nWhole must be greater than one.
- int nSplit (Input) - Number of levels associated with the split-plot factor. nSplit must be greater than one.
- int nSub (Input) - Number of levels associated with the sub-plot factor. nSub must be greater than one.
- int rep[] (Input) - An array of length n, where n is the total number of observations, containing the block, or replicate, identifiers for each observation in y. Different locations can have different numbers of blocks or replicates. Each block or replicate at a single location must be assigned a different identifier, but different locations can have the same assignments.
- int whole[] (Input) - An array of length n containing the whole-plot identifiers for each observation in y. Each level of the whole-plot factor must be assigned a different integer. splitSplitPlot verifies that the number of unique whole-plot identifiers is equal to nWhole.
- int split[] (Input) - An array of length n containing the split-plot identifiers for each observation in y. Each level of the split-plot factor must be assigned a different integer. splitSplitPlot verifies that the number of unique split-plot identifiers is equal to nSplit.
- int sub[] (Input) - An array of length n containing the sub-plot identifiers for each observation in y. Each level of the sub-plot factor must be assigned a different integer. splitSplitPlot verifies that the number of unique sub-plot identifiers is equal to nSub.
- float y[] (Input) - An array of length n containing the experimental observations and any missing values. Missing values cannot be omitted. They are included by placing a NaN (not a number) in y. The NaN value can be set using the function machine(6). At a single location, only one missing value per whole-plot is allowed. The location, whole-plot, split-plot and sub-plot for each observation in y are identified by the corresponding values in the arguments locations, whole, split and sub.
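The missing-value convention for y can be illustrated with a small sketch. It uses only the Python standard library rather than machine(6) to produce the NaN, the helper name count_missing_by_whole and the sample data are hypothetical, and it reads "one missing value per whole-plot" as one per whole-plot identifier at a single location, which is an assumption about the intended rule.

```python
import math
from collections import Counter


def count_missing_by_whole(whole, y):
    """Count NaN entries in y for each whole-plot identifier (hypothetical helper)."""
    if len(whole) != len(y):
        raise ValueError("whole and y must have the same length")
    return Counter(w for w, value in zip(whole, y) if math.isnan(value))


# Illustrative data for one location and one block:
# 2 whole-plots x 2 split-plots x 2 sub-plots, with one lost observation.
whole = [1, 1, 1, 1, 2, 2, 2, 2]
y = [30.0, 40.0, float("nan"), 38.2, 41.8, 52.2, 54.8, 58.2]

missing = count_missing_by_whole(whole, y)
n_missing = sum(missing.values())          # analogous to the nMissing output
within_limit = all(count <= 1 for count in missing.values())
print(n_missing, dict(missing), within_limit)
```

The observation stays in y at its original position, so the identifier arrays keep the same length n as the data.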
Return Value
A two dimensional, 20 by 6 array containing the ANOVA table. Each row in this array contains values for one of the effects in the ANOVA table. The first value in each row, \(\texttt{anovaTable}_{i,0} = \texttt{anovaTable}[i*6]\), identifies the source for the effect associated with values in that row. The remaining values in a row contain the ANOVA table values using the following convention:
j | \(\texttt{anovaTable}_{i,j} = \texttt{anovaTable}[\texttt{i}*6+\texttt{j}]\) |
---|---|
0 | Source Identifier (values described below) |
1 | Degrees of freedom |
2 | Sum of squares |
3 | Mean squares |
4 | F-statistic |
5 | p-value for this F-statistic |
The Source Identifiers in the first column of \(\text{anovaTable}_{i,j}\)
are the only negative values in anovaTable[]. Note that the p-value for the
F-statistic is returned as 0.0 when the value is so small that all
significant digits have been lost. Assignments of identifiers to ANOVA
sources use the following coding:
Source Identifier | ANOVA Source |
---|---|
-1 | LOCATION† |
-2 | BLOCK WITHIN LOCATION‡ |
-3 | WHOLE-PLOT |
-4 | LOCATION × WHOLE-PLOT† |
-5 | WHOLE-PLOT ERROR |
-6 | SPLIT-PLOT |
-7 | LOCATION × SPLIT-PLOT† |
-8 | WHOLE-PLOT × SPLIT-PLOT |
-9 | LOCATION × WHOLE-PLOT × SPLIT-PLOT† |
-10 | SPLIT-PLOT ERROR* |
-11 | SUB-PLOT |
-12 | LOCATION × SUB-PLOT† |
-13 | WHOLE-PLOT × SUB-PLOT |
-14 | LOCATION × WHOLE-PLOT × SUB-PLOT† |
-15 | SPLIT-PLOT × SUB-PLOT |
-16 | LOCATION × SPLIT-PLOT × SUB-PLOT† |
-17 | WHOLE-PLOT × SPLIT-PLOT × SUB-PLOT |
-18 | LOCATION × WHOLE-PLOT × SPLIT-PLOT × SUB-PLOT† |
-19 | SUB-PLOT ERROR |
-20 | CORRECTED TOTAL |
NOTES:
† If nLocations=1, sources involving location are set to missing (NaN).
‡ If crd is set, entries for blocks within location are set to missing, and their sum of squares and degrees of freedom are pooled into the whole-plot error.
* Split-plot error component calculation varies depending upon nLocations. See Description below for details.
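As a rough illustration of the row layout and source identifiers described above, the sketch below looks up a row by its identifier in a flat, row-major table and detects rows that are set to missing. The helper names find_row and is_missing_row are hypothetical, and the three sample rows are copied from the Output section of this page rather than produced by a call to splitSplitPlot. If the table is held as a two-dimensional array instead, the same lookup applies row by row.

```python
import math

N_COLS = 6  # id, degrees of freedom, sum of squares, mean squares, F, p-value


def find_row(anova_flat, source_id):
    """Return the 6 values of the row whose first entry equals source_id."""
    for start in range(0, len(anova_flat), N_COLS):
        row = anova_flat[start:start + N_COLS]
        if row[0] == source_id:
            return row
    raise KeyError(source_id)


def is_missing_row(row):
    """True when every value after the identifier is NaN."""
    return all(math.isnan(value) for value in row[1:])


nan = float("nan")
# Three rows only, for brevity; a full table has 20 rows.
sample_table = [
    -1, nan, nan, nan, nan, nan,            # LOCATION (nLocations = 1)
    -3, 1, 858.01, 858.01, 40.37, 0.024,    # WHOLE-PLOT
    -5, 2, 42.51, 21.26, 0.86, 0.490,       # WHOLE-PLOT ERROR
]

whole_plot = find_row(sample_table, -3)
print("Whole-plot F:", whole_plot[4])
print("Location row missing:", is_missing_row(find_row(sample_table, -1)))
```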
Optional Arguments
- locations, int[] (Input) - An array of length n containing the location identifiers for each observation in y. Unique integers must be assigned to each location in the study. This argument is required when nLocations > 1.
- rcbd (Input) or crd (Input) - Whole-plot randomization characteristic: rcbd implies that whole-plots are assigned to whole-plot experimental units using a randomized complete block design. crd implies that whole-plots are completely randomized to whole-plot experimental units. Default: rcbd.
- nMissing (Output) - Number of missing values, if any, found in y. Missing values are denoted with a NaN (Not a Number) value.
- cv (Output) - An array of length 3 containing the whole-plot, split-plot and sub-plot coefficients of variation. cv[0] contains the whole-plot C.V., cv[1] contains the split-plot C.V., and cv[2] contains the sub-plot C.V.
- grandMean (Output) - Mean of all the data across every location.
- wholePlotMeans (Output) - An array of length nWhole containing the whole-plot means.
- splitPlotMeans (Output) - An array of length nSplit containing the split-plot means.
- subPlotMeans (Output) - An array of length nSub containing the sub-plot means.
- wholeSplitPlotMeans (Output) - A 2-dimensional array of size nWhole by nSplit containing the whole-plot by split-plot means.
- wholeSubPlotMeans (Output) - A 2-dimensional array of size nWhole by nSub containing the whole-plot by sub-plot means.
- splitSubPlotMeans (Output) - A 2-dimensional array of size nSplit by nSub containing the split-plot by sub-plot means.
- treatmentMeans (Output) - An array of size (nWhole × nSplit × nSub) containing the treatment means. For \(i>0\), \(j>0\) and \(k>0\), \(\text{treatmentMeans}_{i,j,k}\) = treatmentMeans[(i-1)*nSplit*nSub + (j-1)*nSub + k-1] contains the mean of the observations, averaged over all locations, blocks and replicates, for the k‑th sub-plot within the j‑th split-plot within the i‑th whole-plot.
- stdErrors (Output) - An array of length 8 containing four standard errors and their associated degrees of freedom. The standard errors are in the first four elements, and their associated degrees of freedom are reported in stdErrors[4] through stdErrors[7].
Element | Standard Error for Comparisons Between Two | Degrees of Freedom |
---|---|---|
stdErrors[0] | Whole-Plot Means | stdErrors[4] |
stdErrors[1] | Split-Plot Means | stdErrors[5] |
stdErrors[2] | Sub-Plot Means | stdErrors[6] |
stdErrors[3] | Treatment Means (same whole-plot, split-plot and sub-plot) | stdErrors[7] |
- nBlocks (Output) - An array of length nLocations containing the number of blocks, or replicates, at each location.
- locationAnovaTable (Output) - A 3-dimensional array of size nLocations by 20 by 6 containing the anova tables associated with each location. For each location, the 20 by 6 dimensional array corresponds to the anova table for that location. For example, locationAnovaTable[(i‑1)×120 + (j‑1)×6 + (k‑1)] contains the value in the k‑th column and j‑th row of the returned anova-table for the i‑th location.
- anovaRowLabels (Output) - An array containing the labels for each of the nAnova rows of the returned ANOVA table. The label for the i‑th row of the ANOVA table can be printed with print(anovaRowLabels[i]).
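The flat-index formulas quoted for treatmentMeans and locationAnovaTable can be checked with a short sketch. The function names treatment_index and location_anova_index are hypothetical helpers written for this illustration; they only reproduce the arithmetic given above for 1-based levels i, j and k and do not call the library.

```python
def treatment_index(i, j, k, n_split, n_sub):
    """Flat offset of the (i, j, k) treatment mean, with 1-based i, j, k."""
    return (i - 1) * n_split * n_sub + (j - 1) * n_sub + (k - 1)


def location_anova_index(i, j, k, n_rows=20, n_cols=6):
    """Flat offset of row j, column k in the table for location i (1-based)."""
    return (i - 1) * n_rows * n_cols + (j - 1) * n_cols + (k - 1)


# With nWhole=4, nSplit=3, nSub=2 the offsets enumerate 0..23 in order.
n_whole, n_split, n_sub = 4, 3, 2
offsets = [treatment_index(i, j, k, n_split, n_sub)
           for i in range(1, n_whole + 1)
           for j in range(1, n_split + 1)
           for k in range(1, n_sub + 1)]
print(offsets == list(range(n_whole * n_split * n_sub)))
print(location_anova_index(2, 1, 1))  # first entry of the second location's table
```

When the outputs are available as multi-dimensional arrays, as in the Example below where treatment_means[i][j][k] is indexed directly, this arithmetic is not needed.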
Description
Function splitSplitPlot
is capable of analyzing a wide variety of
split-split-plot experiments.
Split-split-plot experimental designs can vary in the assignment of
whole-plot factors to experimental units. In some cases, this assignment is
completely random. For example, in a drug study the experimental unit might
be the subject receiving a treatment. The whole-plot factor, possibly
different treatments, could be assigned in one of two ways. Each subject
could receive only one treatment or each could receive all treatments over
an appropriate period of time. If each subject received only a single
randomly selected treatment, then this design constitutes a completely
randomized design for the whole-plot factor, and the optional input argument
crd
must be set.
On the other hand, if each subject receives every treatment in random order,
then the subject is a blocking factor, and this sampling scheme constitutes
a randomized complete block design. In this case, it is necessary to assume
that there are no carry-over effects from one treatment to another. This
sampling scheme, rcbd, is the default setting.
This randomization choice occurs often in agricultural field trials. A trial
designed to test different fertilizers and different seed lots can be
conducted in one of two ways. The whole-plot factor, fertilizer, can be
applied to different fields, or each can be applied to sub-divisions of
these fields. In either case, a field, or a sub-division of a field, is the
whole-plot experimental unit. In the first case, in which only one randomly
selected fertilizer is applied to each field, the whole-plot factor is not
blocked. This scheme is called a completely randomized design, and the
optional input argument crd must be set. However, if fertilizers are
applied to sub-divisions within a field, then the whole-plot factor is
blocked within fields, and this assignment is referred to as a randomized
complete block design. By default, splitSplitPlot assumes that levels of
the whole-plot factor are randomly assigned within blocks, i.e., rcbd is
the default setting for randomizing whole-plots.
The essential distinction between split-plot and split-split-plot experiments is the presence of a third factor that is blocked, or nested, within each level of the whole-plot and split-plot factors. This third factor is referred to as the sub-plot factor.
Whole Plot Factor | | | |
---|---|---|---|
A2 | A1 | A4 | A3 |
A2B1 | A1B3 | A4B1 | A3B2 |
A2B3 | A1B1 | A4B3 | A3B1 |
A2B2 | A1B2 | A4B2 | A3B2 |
Whole Plot Factor A | | | |
---|---|---|---|
A2 | A1 | A4 | A3 |
A2B3C2 A2B3C1 | A1B2C1 A1B2C2 | A4B1C2 A4B1C1 | A3B3C2 A3B3C1 |
A2B1C1 A2B1C2 | A1B1C1 A1B1C2 | A4B3C2 A4B3C1 | A3B2C2 A3B2C1 |
A2B2C2 A2B2C1 | A1B3C1 A1B3C2 | A4B2C1 A4B2C2 | A3B1C2 A3B1C1 |
Contrast the split-split-plot experiment with the same experiment run using a strip-split-plot design (see Table 4.26). In a strip-split-plot experiment, factor B is applied in strips across factor A, whereas in a split-split-plot experiment, factor B is randomly assigned within each level of factor A. In a strip-split-plot experiment, the level of factor B is constant across a row; in a split-split-plot experiment, the levels of factor B change as you go across a row, reflecting the fact that factor B is randomized within each level of factor A.
 | | Factor A Strip Plots | | | |
---|---|---|---|---|---|
 | | A2 | A1 | A4 | A3 |
Factor B Strip Plots | B3 | A2B3C2 A2B3C1 | A1B3C1 A1B3C2 | A4B3C2 A4B3C1 | A3B3C2 A3B3C1 |
 | B1 | A2B1C1 A2B1C2 | A1B1C1 A1B1C2 | A4B1C2 A4B1C1 | A3B1C2 A3B1C1 |
 | B2 | A2B2C2 A2B2C1 | A1B2C1 A1B2C2 | A4B2C1 A4B2C2 | A3B2C2 A3B2C1 |
In some studies, a split-split-plot experiment is replicated at several
locations. Function splitSplitPlot
can analyze these, even when the
number of blocks or replicates at each location is different. If only a
single replicate or block is used at each location, then location should be
treated as a blocking factor, with nLocations
set equal to one. If
nLocations
=1, it is assumed that the experiment was conducted at a
single location with more than one block or replicate at that location. In
this case, all entries in the anova table associated with location will
contain missing values.
However, if nLocations
>1, it is assumed the experiment was repeated at
multiple locations, with replication or blocking occurring at each location.
Although the number of blocks, or replicates, at each location can be
different, the number of levels for whole-plot and split-plot factors,
nWhole
and nSplit
, must be the same at each location. The locations
associated with each of the observations in y
are specified in the
argument locations[]
, which is a required input argument when
nLocations
>1.
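To make the multi-location bookkeeping concrete, the following sketch counts the distinct block identifiers at each location from the locations[] and rep[] arrays, which is the quantity the nBlocks output is described as reporting. The helper blocks_per_location and the sample identifiers are hypothetical; this is not the library's implementation.

```python
from collections import defaultdict


def blocks_per_location(locations, rep):
    """Map each location identifier to its number of distinct block identifiers."""
    if len(locations) != len(rep):
        raise ValueError("locations and rep must have the same length")
    seen = defaultdict(set)
    for loc, block in zip(locations, rep):
        seen[loc].add(block)
    return {loc: len(blocks) for loc, blocks in sorted(seen.items())}


# Hypothetical layout: location 1 has 2 blocks, location 2 has 3 blocks.
# Block identifiers may repeat across locations but not within one.
locations = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
rep = [1, 1, 2, 2, 1, 1, 2, 2, 3, 3]

counts = blocks_per_location(locations, rep)
print(counts)
```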
By default, locations are assumed to be random effects. Tests involving whole-plots use the interaction between whole-plots and locations as the error term for testing whether there are statistically significant differences among whole-plot factor levels. This assumes that the interaction of whole-plots and locations is not statistically significant. A test of this assumption uses the pooled whole-plot error. If the interaction between location and whole-plots, split-plots or sub-plot is statistically significant, then the nature of that interaction should be explored since it impacts the interpretation of the significance of the treatment factors.
When nLocations > 1 and locations are assumed to be random effects, tests
involving split-plots do not use the split-plot errors pooled across
locations. Instead, the error term for split-plots is the interaction between
locations and split-plots. The split-plot by whole-plot interaction is tested
against the location by split-plot by whole-plot interaction.
Suppose, for example, that a researcher wanted to conduct an agricultural experiment comparing the effectiveness of 4 fertilizers with 3 rates of application and 2 seed lots. One replicate of the experiment is conducted at each of the 3 farms. That is, only a single field at each location is assigned to this experiment.
Each field is divided into 4 whole-plots and the fertilizers are randomly assigned to each of the 4 whole-plots. Each whole-plot is then further sub-divided into 3 split-plots which are each randomly assigned one of the three fertilizer application rates. Finally, each of these sub-divisions assigned a particular fertilizer and application rate is sub-divided into 2 plots and randomly assigned one of the two seed lots.
In this case, each farm is a blocking factor, fertilizers are whole-plots,
fertilizer application rates are split-plots, and seed lots are
sub-plots. The input array rep would contain integers from 1 to the
number of farms, with nWhole=4, nSplit=3 and nSub=2.
However, if each farm allocated more than a single field for this study,
then each farm would be treated as a different location, with nLocations
set equal to the number of farms, and fields might be treated as a blocking
factor. The array rep would contain integers from 1 to the number of fields
used in a farm, and locations[] would contain integers from 1 to the
number of farms.
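For the single-field-per-farm case described above, the identifier arrays could be laid out as in the sketch below. The helper build_identifiers is hypothetical and only generates one observation per block, whole-plot, split-plot and sub-plot combination in a fixed order. It describes how the data are labeled for input, not how treatments were randomized in the field.

```python
from itertools import product


def build_identifiers(n_blocks, n_whole, n_split, n_sub):
    """Return rep, whole, split and sub lists for a complete layout."""
    rep, whole, split, sub = [], [], [], []
    levels = product(range(1, n_blocks + 1), range(1, n_whole + 1),
                     range(1, n_split + 1), range(1, n_sub + 1))
    for r, w, s, c in levels:
        rep.append(r)
        whole.append(w)
        split.append(s)
        sub.append(c)
    return rep, whole, split, sub


# 3 farms used as blocks, 4 fertilizers, 3 application rates, 2 seed lots.
rep, whole, split, sub = build_identifiers(3, 4, 3, 2)
n = len(rep)  # total number of observations expected in y
print(n, sorted(set(rep)), sorted(set(whole)), sorted(set(split)), sorted(set(sub)))
```

If each farm instead contributed several fields, the same helper could be run once per farm with n_blocks set to that farm's number of fields, with a parallel locations list holding the farm identifier for every observation.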
In summary, splitSplitPlot can analyze 3x2=6 different experimental
situations, depending upon the settings of:
- Locations (none, fixed or random): specified by setting nLocations, locations[] and locFixed or locRandom.
- Whole-plot sampling (CRD or RCBD): specified by setting crd or rcbd.
The default condition depends upon the value for nLocations
. If
nLocations
>1, locations are assumed to be a random effect. Assignment
of experimental units to whole-plots is assumed to use a RCBD design and
whole-plots, split-plots and sub-plots are all assumed to be fixed effects.
Example
This example uses data from a split‑split‑plot design consisting of two whole-plots, two split‑plots and two sub‑plots.
from __future__ import print_function
import sys
from numpy import array, sqrt
from pyimsl.stat.page import page, SET_PAGE_WIDTH
from pyimsl.stat.multipleComparisons import multipleComparisons
from pyimsl.stat.splitSplitPlot import splitSplitPlot
from pyimsl.stat.writeMatrix import writeMatrix
col_labels = [" ", "\nID", "\nDF", "\nSSQ",
              "Mean\nsquares", "\nF", "\np-value"]
page_width = 132
n = 24 # Total number of observations
n_locations = 1 # Number of locations
n_whole = 2 # Number of whole-plots/location
n_split = 2 # Number of split-plots/location
n_sub = 2  # Number of sub-plots/location
rep = [1, 1, 1, 1, 1, 1, 1, 1,
       2, 2, 2, 2, 2, 2, 2, 2,
       3, 3, 3, 3, 3, 3, 3, 3]
whole = [1, 1, 1, 1, 2, 2, 2, 2,
         1, 1, 1, 1, 2, 2, 2, 2,
         1, 1, 1, 1, 2, 2, 2, 2]
split = [1, 1, 2, 2, 1, 1, 2, 2,
         1, 1, 2, 2, 1, 1, 2, 2,
         1, 1, 2, 2, 1, 1, 2, 2]
sub = [1, 2, 1, 2, 1, 2, 1, 2,
       1, 2, 1, 2, 1, 2, 1, 2,
       1, 2, 1, 2, 1, 2, 1, 2]
y = [30.0, 40.0, 38.9, 38.2, 41.8, 52.2, 54.8, 58.2,
     20.5, 26.9, 21.4, 25.1, 26.4, 36.7, 28.9, 35.9,
     21.0, 25.4, 24.0, 23.3, 34.4, 41.0, 33.0, 34.9]
grand_mean = []
cv = []
treatment_means = []
whole_plot_means = []
split_plot_means = []
sub_plot_means = []
std_err = []
equal_means = []
aov_row_labels = []
aov = splitSplitPlot(n_locations, n_whole, n_split, n_sub,
                     rep, whole, split, sub, y,
                     grandMean=grand_mean, cv=cv,
                     treatmentMeans=treatment_means,
                     wholePlotMeans=whole_plot_means,
                     splitPlotMeans=split_plot_means,
                     subPlotMeans=sub_plot_means,
                     stdErrors=std_err,
                     anovaRowLabels=aov_row_labels)
# Output results
page(SET_PAGE_WIDTH, page_width)
# Print ANOVA table
writeMatrix(" *** ANALYSIS OF VARIANCE TABLE ***",
aov, writeFormat="%3.0f%3.0f%8.2f%7.2f%7.2f%7.3f",
rowLabels=aov_row_labels,
colLabels=col_labels)
# Print the various means
print("\nGrand mean: %7.3f" % grand_mean[0])
print("Coefficient of Variation ***")
print(" Whole-Plot: %7.3f" % cv[0])
print(" Split-Plot: %7.3f" % cv[1])
print(" Sub-Plot : %7.3f" % cv[2])
print("\n*************************************************************")
print("Treatment Means: ")
for i in range(n_whole):
    for j in range(n_split):
        for k in range(n_sub):
            sys.stdout.write(" treatment[%d][%d][%d] %f \n" % (
                i, j, k, treatment_means[i][j][k]))
sys.stdout.write("\n Standard Error for Comparing Two Treatment Means: %f \n (df=%f)\n" % (
    std_err[3], std_err[7]))
# Flatten the treatment means for multipleComparisons.
# Note: dtype='int' truncates each mean to an integer.
tma = array(treatment_means, dtype='int')
equal_means = multipleComparisons(tma.flat,
                                  int(std_err[7]),
                                  std_err[3] / sqrt(2), lsd=True, alpha=.05)
print("\n LSD for Treatment Means (alpha=0.05)")
writeMatrix(" Size of Groups of Means", equal_means, writeFormat="%5i")
# Whole-plot Means
print("\n*************************************************************")
writeMatrix("Whole-plot Means", whole_plot_means, column=True)
sys.stdout.write("\nStandard Error for Comparing Two Whole-Plot Means: %f \n(df=%f)\n" %
(std_err[0], std_err[4]))
equal_means = multipleComparisons(whole_plot_means,
int(std_err[4]), std_err[0] / sqrt(2),
lsd=True, alpha=.05)
print("\nLSD for Whole-Plot Means (alpha=0.05)")
writeMatrix("Size of Groups of Means", equal_means)
# Split-plot Means
print("\n*************************************************************")
writeMatrix("Split-plot Means", split_plot_means, column=True)
sys.stdout.write("\nStandard Error for Comparing Two Split-Plot Means: %f \n(df=%f)\n" %
(std_err[1], std_err[5]))
equal_means = multipleComparisons(split_plot_means,
int(std_err[5]), std_err[1] / sqrt(2),
lsd=True, alpha=.05)
print("\nLSD for Split-Plot Means (alpha=0.05)")
writeMatrix("Size of Groups of Means", equal_means)
# Sub-plot Means
print("\n*************************************************************")
writeMatrix("Sub-plot Means", sub_plot_means, column=True)
sys.stdout.write("\nStandard Error for Comparing Two Sub-Plot Means: %f \n(df=%f)\n" %
(std_err[2], std_err[6]))
# Use the sub-plot standard error (std_err[2]) with its degrees of freedom.
equal_means = multipleComparisons(sub_plot_means,
                                  int(std_err[6]), std_err[2] / sqrt(2),
                                  lsd=True, alpha=.05)
print("\nLSD for Sub-Plot Means (alpha=0.05)")
writeMatrix("Size of Groups of Means", equal_means)
Output
Grand mean: 33.871
Coefficient of Variation ***
Whole-Plot: 13.612
Split-Plot: 14.712
Sub-Plot : 5.329
*************************************************************
Treatment Means:
treatment[0][0][0] 23.833333
treatment[0][0][1] 30.766667
treatment[0][1][0] 28.100000
treatment[0][1][1] 28.866667
treatment[1][0][0] 34.200000
treatment[1][0][1] 43.300000
treatment[1][1][0] 38.900000
treatment[1][1][1] 43.000000
Standard Error for Comparing Two Treatment Means: 1.473846
(df=8.000000)
LSD for Treatment Means (alpha=0.05)
*************************************************************
Standard Error for Comparing Two Whole-Plot Means: 2.661792
(df=2.000000)
LSD for Whole-Plot Means (alpha=0.05)
*************************************************************
Standard Error for Comparing Two Split-Plot Means: 2.876944
(df=4.000000)
LSD for Split-Plot Means (alpha=0.05)
*************************************************************
Standard Error for Comparing Two Sub-Plot Means: 1.473846
(df=8.000000)
LSD for Sub-Plot Means (alpha=0.05)
*** ANALYSIS OF VARIANCE TABLE ***
Mean
ID DF SSQ squares F p-value
Location -1 ... ........ ....... ....... .......
Blocks Within Location -2 2 1310.28 655.14 30.82 0.031
Whole-Plot -3 1 858.01 858.01 40.37 0.024
Location x Whole-Plot -4 ... ........ ....... ....... .......
Whole-Plot Error -5 2 42.51 21.26 0.86 0.490
Split-Plot -6 1 17.17 17.17 0.69 0.452
Location x Split-Plot -7 ... ........ ....... ....... .......
Whole-Plot x Split-Plot -8 1 1.55 1.55 0.06 0.815
Location x Whole-Plot x -9 ... ........ ....... ....... .......
Split-Plot
Split-Plot Error -10 4 99.32 24.83 7.62 0.008
Sub-Plot -11 1 163.80 163.80 50.27 0.000
Location x Sub-Plot -12 ... ........ ....... ....... .......
Whole-Plot x Sub-Plot -13 1 11.34 11.34 3.48 0.099
Location x Whole-Plot x Sub-Plot -14 ... ........ ....... ....... .......
Split-plot x Sub-Plot -15 1 46.76 46.76 14.35 0.005
Location x Split-Plot x Sub-Plot -16 ... ........ ....... ....... .......
Whole_plot x Split-Plot -17 1 0.51 0.51 0.16 0.703
x Sub-Plot
Location x Whole-Plot x -18 ... ........ ....... ....... .......
Split-Plot x Sub-Plot
Sub-Plot Error -19 8 26.07 3.26 ....... .......
Corrected Total -20 23 2577.33 ....... ....... .......
Size of Groups of Means
1 2 3 4 5 6 7
0 3 0 0 0 0 2
Whole-plot Means
1 27.89
2 39.85
Size of Groups of Means
0
Split-plot Means
1 33.02
2 34.72
Size of Groups of Means
2
Sub-plot Means
1 31.26
2 36.48
Size of Groups of Means
2
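As a rough cross-check of the printed results, the sketch below recomputes two F-statistics and the three coefficients of variation from the sums of squares, degrees of freedom and grand mean shown in the output above. Two things are assumptions, not statements from this page: that with nLocations=1 the whole-plot and sub-plot effects are tested against their own error rows, and that the C.V. is the usual 100 × sqrt(error mean square) / grand mean. The helper names are hypothetical.

```python
import math


def mean_square(sum_of_squares, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


def cv_percent(error_mean_square, grand_mean):
    """Coefficient of variation in percent (assumed definition)."""
    return 100.0 * math.sqrt(error_mean_square) / grand_mean


grand_mean = 33.871  # from the printed output

# (sum of squares, degrees of freedom) copied from the printed ANOVA table.
ms_whole = mean_square(858.01, 1)        # Whole-Plot
ms_whole_err = mean_square(42.51, 2)     # Whole-Plot Error
ms_split_err = mean_square(99.32, 4)     # Split-Plot Error
ms_sub = mean_square(163.80, 1)          # Sub-Plot
ms_sub_err = mean_square(26.07, 8)       # Sub-Plot Error

f_whole = ms_whole / ms_whole_err
f_sub = ms_sub / ms_sub_err
cvs = [cv_percent(ms, grand_mean)
       for ms in (ms_whole_err, ms_split_err, ms_sub_err)]

print("F whole-plot: %.2f  F sub-plot: %.2f" % (f_whole, f_sub))
print("C.V. whole/split/sub: %.2f %.2f %.2f" % tuple(cvs))
```

Because the inputs are the rounded values from the table, the recomputed figures can differ from the printed ones in the last displayed digit.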