latinSquare

Analyzes data from latin-square experiments. Function latinSquare also analyzes latin-square experiments replicated at several locations.

Synopsis

latinSquare (nLocations, nTreatments, row, col, treatment, y)

Required Arguments

int nLocations (Input)
Number of locations. nLocations must be one or greater. If nLocations>1 then the optional array locations[] must be included as input to latinSquare.
int nTreatments (Input)
Number of treatments. nTreatments must be greater than one. In addition the number of rows and columns must be equal to nTreatments.
int row[] (Input)
An array of length n containing the row identifiers for each observation in y. Each row must be assigned values from 1 to nTreatments. latinSquare verifies that the number of unique row identifiers is equal to nTreatments.
int col[] (Input)
An array of length n containing the column identifiers for each observation in y. Each column must be assigned values from 1 to nTreatments. latinSquare verifies that the number of unique column identifiers is equal to nTreatments.
int treatment[] (Input)
An array of length n containing the treatment identifiers for each observation in y. Each treatment must be assigned values from 1 to nTreatments. latinSquare verifies that the number of unique treatment identifiers is equal to nTreatments.
float y[] (Input)
An array of length n containing the experimental observations and any missing values. Missing values cannot be omitted. They are indicated by placing a NaN (not a number) in y. The NaN value can be set using the function machine(6). The location, row, column, and treatment number for each observation in y are identified by the corresponding values in the arguments locations, row, col, and treatment.
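
As an illustration of how the parallel input arrays line up, the following sketch flattens a 4 by 4 layout into row, col, treatment, and y. The design and responses are taken from the Example section; the missing cell is hypothetical, and using float("nan") in place of the value returned by machine(6) is an assumption that has not been checked against PyIMSL.

```python
import math

# Treatment applied in each (row, column) cell, from the Example section.
design = [[2, 4, 3, 1],
          [3, 1, 2, 4],
          [1, 3, 4, 2],
          [4, 2, 1, 3]]
# Responses in the same layout; None marks a hypothetical lost observation.
response = [[1.640, 1.290, 1.425, 1.345],
            [1.475, 1.185, 1.665, 1.180],
            [1.167, None, 1.400, 1.290],
            [1.565, 1.290, 1.655, 0.660]]

row, col, treatment, y = [], [], [], []
for i, (design_row, response_row) in enumerate(zip(design, response), start=1):
    for j, (trt, value) in enumerate(zip(design_row, response_row), start=1):
        row.append(i)          # 1-based row identifier
        col.append(j)          # 1-based column identifier
        treatment.append(trt)  # 1-based treatment identifier
        # A missing observation keeps its slot and is stored as NaN.
        y.append(float("nan") if value is None else value)

n_missing = sum(math.isnan(v) for v in y)
print(len(y), n_missing)  # prints: 16 1
```

All four arrays keep length n = 16, so the missing observation stays identified by its row, column, and treatment.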

Return Value

A two dimensional, 7 by 6 array containing the ANOVA table. Each row in this array contains values for one of the effects in the ANOVA table. The first value in each row, \(\texttt{anovaTable}_{i,0} = \texttt{anovaTable}[i*6]\), identifies the source for the effect associated with values in that row. The remaining values in a row contain the ANOVA table values using the following convention:

j \(\texttt{anovaTable}_{i,j} = \texttt{anovaTable}[\texttt{i}*6+\texttt{j}]\)
0 Source Identifier (values described below)
1 Degrees of freedom
2 Sum of squares
3 Mean squares
4 F-statistic
5 p-value for this F-statistic
Note that the p‑value for the F-statistic is returned as 0.0 when the value is so small that all significant digits have been lost.

The Source Identifiers in the first column of \(\text{anovaTable}_{i,j}\) are the only negative values in anovaTable[]. Assignments of identifiers to ANOVA sources use the following coding:

Source Identifier ANOVA Source
-1 LOCATIONS
-2 ROWS
-3 COLUMNS
-4 TREATMENTS
-5 LOCATIONS × TREATMENTS
-6 ERROR WITHIN LOCATIONS
-7 CORRECTED TOTAL

Note: If nLocations=1, the rows of the table involving location are set to missing (NaN).
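
The layout described above can be decoded with a few lines of NumPy. This is a minimal sketch, not part of PyIMSL: the helper decode_anova and the dictionary names are hypothetical, and the sample values are copied from the Output section of this page. The reshape call accepts either a flat array of 42 values or an array that is already 7 by 6.

```python
import numpy as np

# Source identifiers from the coding table above.
SOURCES = {-1: "LOCATIONS", -2: "ROWS", -3: "COLUMNS", -4: "TREATMENTS",
           -5: "LOCATIONS x TREATMENTS", -6: "ERROR WITHIN LOCATIONS",
           -7: "CORRECTED TOTAL"}
# Meaning of columns j = 1..5 in each row.
COLUMNS = ("df", "ss", "ms", "f", "p")


def decode_anova(anova_table):
    """Map each source name to a dict of its ANOVA values (hypothetical helper)."""
    table = np.asarray(anova_table, dtype=float).reshape(7, 6)
    return {SOURCES[int(r[0])]: dict(zip(COLUMNS, r[1:])) for r in table}


nan = float("nan")
# Flat 7*6 table for a single location (values from the Output section).
flat = [-1, nan, nan, nan, nan, nan,
        -2, 3, 0.185, 0.062, 2.064, 0.207,
        -3, 3, 0.589, 0.196, 6.579, 0.025,
        -4, 3, 0.352, 0.117, 3.927, 0.073,
        -5, nan, nan, nan, nan, nan,
        -6, 6, 0.179, 0.030, nan, nan,
        -7, 15, 1.305, nan, nan, nan]

decoded = decode_anova(flat)
print(decoded["TREATMENTS"]["f"])  # same value as flat[3 * 6 + 4]
```

Keying the result by source name avoids depending on row order and makes the NaN rows for a single-location analysis easy to skip.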

Optional Arguments

int locations[] (Input)
An array of length n containing the location identifiers for each observation in y. Unique integers must be assigned to each location in the study. This argument is required when nLocations>1.
nMissing (Output)
Number of missing values, if any, found in y. Missing values are denoted with a NaN (Not a Number) value.
cv (Output)
The coefficient of variation computed by using the within location standard deviation.
grandMean (Output)
Mean of all the data across every location.
treatmentMeans (Output)
An array of size nTreatments containing the treatment means.
stdErrors (Output)
An array of length 2 containing the standard error and associated degrees of freedom for comparing two treatment means. stdErrors[0] contains the standard error and its degrees of freedom are returned in stdErrors[1].
locationAnovaTable (Output)
A 3-dimensional array of size nLocations by 7 by 6 containing the ANOVA tables associated with each location. For each location, the 7 by 6 array corresponds to the ANOVA table for that location. For example, locationAnovaTable[(i-1)×42 + (j-1)×6 + (k-1)] contains the value in the k-th column and j-th row of the ANOVA table for the i-th location.
anovaRowLabels (Output)
An array containing the labels for each of the 7 rows of the returned ANOVA table. The label for the i-th row of the ANOVA table can be printed with print(anovaRowLabels[i]).
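
For locationAnovaTable, the offset of a value in a flattened nLocations by 7 by 6 array can be sketched as follows. The helper name is hypothetical, and the code assumes row-major storage with 42 = 7 × 6 values per location; it checks the arithmetic against NumPy's own reshape rather than against PyIMSL output.

```python
import numpy as np


def location_anova_index(i, j, k):
    """Flat 0-based offset for 1-based location i, table row j, table column k."""
    return (i - 1) * 42 + (j - 1) * 6 + (k - 1)


n_locations = 3
# Stand-in data: each entry holds its own flat offset.
flat = np.arange(n_locations * 7 * 6)
cube = flat.reshape(n_locations, 7, 6)

# Row 4 is TREATMENTS and column 5 is the F-statistic, here for location 2.
offset = location_anova_index(2, 4, 5)
print(offset, cube[1, 3, 4])  # prints: 64 64
```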

Description

The function latinSquare analyzes latin-square experiments, possibly replicated at multiple locations. Latin-square experiments block treatments using two factors: rows and columns. The number of levels associated with rows and columns must equal the number of treatments. Treatments are blocked by rows and columns in a balanced arrangement so that every row contains exactly one replicate of every treatment. The same balance is required for every column; see Table 4.16. Notice that the four treatments, T1, T2, T3, and T4, appear exactly once in every column and every row.

Table 4.16 — Latin-Square Experiment with Four Treatments
              Columns
          C1  C2  C3  C4
Rows  R1  T1  T2  T3  T4
      R2  T2  T3  T4  T1
      R3  T3  T4  T1  T2
      R4  T4  T1  T2  T3
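
The balance requirement can be checked programmatically. The sketch below is illustrative only; both function names are hypothetical. It builds a cyclic square, which for t = 4 reproduces the arrangement in Table 4.16, and tests that each treatment occurs once per row and once per column.

```python
def cyclic_latin_square(t):
    """Return a t-by-t square whose row i is 1..t shifted left by i positions."""
    return [[(i + j) % t + 1 for j in range(t)] for i in range(t)]


def is_latin_square(square):
    """True if every row and every column contains each of 1..t exactly once."""
    t = len(square)
    expected = set(range(1, t + 1))
    rows_ok = all(len(r) == t and set(r) == expected for r in square)
    cols_ok = all(set(c) == expected for c in zip(*square))
    return rows_ok and cols_ok


square = cyclic_latin_square(4)
for r in square:
    print(" ".join("T%d" % trt for trt in r))
print(is_latin_square(square))  # prints: True
```

In practice the rows, columns, and treatment labels of such a square are usually randomized before the experiment is run; the cyclic form is only the simplest valid arrangement.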

A necessary assumption in latin-square experiments is that there are no interactions between treatments and the row and column blocking factors. For data collected at a single location, the ANOVA table for a latin-square experiment is usually organized into five rows, as shown in Table 4.17.

Table 4.17 — The ANOVA Table for a Latin-Square Experiment at one Location
Source DF Sum of Squares Mean Squares
ROWS \(t-1\) \(\mathrm{SSR}=t \sum_{i=1}^{t} \left( \overline{y}_{i.}-\overline{y}_{..}\right)^2\) MSR
COLUMNS \(t-1\) \(\mathrm{SSC}=t \sum_{j=1}^{t} \left( \overline{y}_{.j}-\overline{y}_{..}\right)^2\) MSC
TREATMENTS \(t-1\) \(\mathrm{SST}=t \sum_{k=1}^{t} \left( \overline{y}_{k}-\overline{y}_{..}\right)^2\) MST
ERROR \((t-1)(t-2)\) SSE=SSTot-SSR-SSC-SST MSE
TOTAL \(t^2-1\) \(\mathrm{SSTot}=\sum_{i=1}^{t} \sum_{j=1}^{t} \left( y_{ij}-\overline{y}_{..} \right)^2\)
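
The sums of squares in Table 4.17 can be reproduced directly with NumPy. The following is a sketch of the textbook formulas for complete data (no missing values), using the data from the Example section; it is not a description of how latinSquare computes its results internally, and the helper factor_ss is hypothetical.

```python
import numpy as np

# Data from the Example section (one location, t = 4).
col = np.array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4])
row = np.array([3, 2, 4, 1, 1, 4, 2, 3, 2, 3, 1, 4, 4, 1, 3, 2])
treatment = np.array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4])
y = np.array([1.167, 1.185, 1.655, 1.345, 1.64, 1.29, 1.665, 1.29,
              1.475, 0.71, 1.425, 0.66, 1.565, 1.29, 1.4, 1.18])
t = 4
grand_mean = y.mean()


def factor_ss(labels):
    """t times the sum of squared deviations of level means from the grand mean."""
    means = np.array([y[labels == level].mean() for level in range(1, t + 1)])
    return t * np.sum((means - grand_mean) ** 2)


ssr = factor_ss(row)          # ROWS
ssc = factor_ss(col)          # COLUMNS
sst = factor_ss(treatment)    # TREATMENTS
ss_total = np.sum((y - grand_mean) ** 2)
sse = ss_total - ssr - ssc - sst          # ERROR, by difference
mse = sse / ((t - 1) * (t - 2))
f_treatment = (sst / (t - 1)) / mse

print("SSR=%.3f SSC=%.3f SST=%.3f SSE=%.3f SSTot=%.3f F=%.3f"
      % (ssr, ssc, sst, sse, ss_total, f_treatment))
```

The printed values agree, to three decimals, with the ANOVA table shown in the Output section.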

The statistical model used to represent data from a single location is:

\[y_{ij(k)} = \mu + \rho_i + \gamma_j + \tau_{k(ij)} + \varepsilon_{ij(k)}\]

where \(y_{ij(k)}\) is the observation for the k-th treatment in the i-th row and j-th column of the latin square, and \(\tau_{k(ij)}\) is the effect associated with the k-th treatment. \(\rho_i\) and \(\gamma_j\) are the i-th row and j-th column effects, respectively, and \(\varepsilon_{ij(k)}\) is the noise associated with this observation.

If multiple locations are involved, latinSquare assumes that treatments are crossed with locations, but that row and column effects are nested within locations, as shown in Table 4.18. The statistical model used to represent these data is:

\[y_{lij(k)} = \mu + \alpha_l + \rho_{i(l)} + \gamma_{j(l)} + \tau_{k(ij)} + \alpha\tau_{lk(ij)} + \varepsilon_{lij(k)}\]

where

\[\tau_{k(ij)}\]

is the effect associated with the kth treatment, and

\[\alpha\tau_{lk(ij)}\]

is the interaction effect between location l and treatment k.

Table 4.18 — The ANOVA Table for a Latin-Square Experiment at Multiple Locations
SOURCE DF Sum of Squares Mean Squares
LOCATIONS \(r-1\) \(SSL=t^2 \sum_{l=1}^{r} \left( \overline{y}_{l..}-\overline{y}_{\ldots} \right)^2\) MSL
ROWS \(r(t-1)\) \(SSR=t \sum_{l=1}^{r} \sum_{i=1}^{t} \left( \overline{y}_{li.}-\overline{y}_{l..} \right)^2\) MSR
COLUMNS \(r(t-1)\) \(SSC=t \sum_{l=1}^{r} \sum_{j=1}^{t} \left( \overline{y}_{l.j}-\overline{y}_{l..} \right)^2\) MSC
TREATMENTS \(t-1\) \(SST=r\cdot t \sum_{k=1}^{t} \left( \overline{y}_k-\overline{y}_{\ldots} \right)^2\) MST
LOCATIONS X TREATMENTS \((r-1)(t-1)\) SSLT by difference MSLT
ERROR \(r(t-1)(t-2)\) \(SSE=\sum_{l=1}^{r} SSE_l\) MSE
TOTAL \(r\cdot t^2-1\) \(\mathit{SSTot}=\sum_{l=1}^{r} \sum_{i=1}^{t} \sum_{j=1}^{t} \left( y_{lij}-\overline{y}_{\ldots} \right)^2\)
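
A quick consistency check on the degrees of freedom in Table 4.18 is sketched below. It assumes the error degrees of freedom are pooled over locations, r(t-1)(t-2), which follows from SSE being the sum of the per-location SSE_l, each with (t-1)(t-2) degrees of freedom. The function name is hypothetical.

```python
def latin_square_df(r, t):
    """Degrees of freedom for a t-by-t latin square replicated at r locations."""
    df = {
        "LOCATIONS": r - 1,
        "ROWS": r * (t - 1),                  # rows nested within locations
        "COLUMNS": r * (t - 1),               # columns nested within locations
        "TREATMENTS": t - 1,
        "LOCATIONS x TREATMENTS": (r - 1) * (t - 1),
        "ERROR": r * (t - 1) * (t - 2),       # pooled over locations (assumed)
    }
    total = r * t * t - 1
    # The component degrees of freedom must add up to the corrected total.
    assert sum(df.values()) == total
    df["TOTAL"] = total
    return df


print(latin_square_df(1, 4))  # single location: ERROR has 6 df, TOTAL has 15
print(latin_square_df(3, 4))  # three locations: ERROR has 18 df, TOTAL has 47
```

With r = 1 the LOCATIONS and LOCATIONS x TREATMENTS entries are zero, and the remaining entries reduce to those of Table 4.17.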

Example

This example uses four treatments organized into a latin square. It also uses the function l_print_lsd(), which is defined in the first example for lattice().

from __future__ import print_function
import sys
from numpy import sqrt
from l_print_lsd import l_print_lsd
from pyimsl.stat.page import page, SET_PAGE_WIDTH
from pyimsl.stat.latinSquare import latinSquare
from pyimsl.stat.multipleComparisons import multipleComparisons
from pyimsl.stat.writeMatrix import writeMatrix

col_labels = [" ", "\nID", "\nDF", "\nSSQ  ",
              "Mean  \nsquares", "\nF-Test", "\np-Value"]
alpha = 0.05
page_width = 132
n = 16            # Total number of observations
n_locations = 1   # Number of locations
n_treatments = 4  # Number of rows, cols, and treatments
n_aov_rows = 7    # Number of rows in latin-sq anova table
col = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
row = [3, 2, 4, 1, 1, 4, 2, 3, 2, 3, 1, 4, 4, 1, 3, 2]
treatment = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
y = [1.167, 1.185, 1.655, 1.345, 1.64, 1.29, 1.665, 1.29,
     1.475, 0.71, 1.425, 0.66, 1.565, 1.29, 1.4, 1.18]
print("\n*** Experimental Design ***")
print("===============================")
print("| COL  |  1  |  2  |  3  |  4  |")
print("===============================")
print("|ROW 1 |  2  |  4  |  3  |  1  |")
print("===============================")
print("|ROW 2 |  3  |  1  |  2  |  4  |")
print("===============================")
print("|ROW 3 |  1  |  3  |  4  |  2  |")
print("===============================")
print("|ROW 4 |  4  |  2  |  1  |  3  |")
print("===============================")
grand_mean = []
cv = []
treatment_means = []
std_err = []
anova_row_labels = []

aov = latinSquare(n_locations, n_treatments, row, col,
                  treatment, y, grandMean=grand_mean,
                  cv=cv, treatmentMeans=treatment_means,
                  stdErrors=std_err,
                  anovaRowLabels=anova_row_labels)

# Output results
page(SET_PAGE_WIDTH, page_width)
writeMatrix("\n   *** ANALYSIS OF VARIANCE TABLE ***",
            aov, writeFormat="%3.0f%3.0f%8.3f%8.3f%8.3f%8.3f",
            rowLabels=anova_row_labels,
            colLabels=col_labels)
print("\nGrand Mean: %7.3f" % grand_mean[0])
print("\nCoefficient of Variation: %7.3f\n" % cv[0])
print("Treatment means: ")
for i in range(n_treatments):
    print("treatment[%2d]              %7.4f" % (i + 1, treatment_means[i]))
df = int(std_err[1])
print("\nStandard Error for Comparing Two Treatment Means: %f \n(df=%d)" %
      (std_err[0], df))
equal_means = multipleComparisons(treatment_means, df,
                                  std_err[0] / sqrt(2.0), lsd=True, alpha=alpha)
l_print_lsd(n_treatments, equal_means, treatment_means)

Output

*** Experimental Design ***
===============================
| COL  |  1  |  2  |  3  |  4  |
===============================
|ROW 1 |  2  |  4  |  3  |  1  |
===============================
|ROW 2 |  3  |  1  |  2  |  4  |
===============================
|ROW 3 |  1  |  3  |  4  |  2  |
===============================
|ROW 4 |  4  |  2  |  1  |  3  |
===============================

Grand Mean:   1.309

Coefficient of Variation:  13.204

Treatment means: 
treatment[ 1]               1.3380
treatment[ 2]               1.4712
treatment[ 3]               1.0675
treatment[ 4]               1.3587

Standard Error for Comparing Two Treatment Means: 0.122201 
(df=6)
[group] 	  Mean 		LSD Grouping
  [3] 		1.067500	  *
  [1] 		1.338000	  *	  *
  [4] 		1.358750	  *	  *
  [2] 		1.471250		  *
 
 
                     *** ANALYSIS OF VARIANCE TABLE ***
                                               Mean                      
                          ID   DF     SSQ     squares    F-Test   p-Value
Locations .............   -1  ...  ........  ........  ........  ........
Rows  .................   -2    3     0.185     0.062     2.064     0.207
Columns ...............   -3    3     0.589     0.196     6.579     0.025
Treatments ............   -4    3     0.352     0.117     3.927     0.073
Locations x Treatments    -5  ...  ........  ........  ........  ........
Error within Locations    -6    6     0.179     0.030  ........  ........
Corrected Total .......   -7   15     1.305  ........  ........  ........
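
The summary statistics in the output above can be cross-checked by hand. The sketch below assumes the usual conventions, namely CV = 100 × sqrt(MSE) / grand mean and a standard error of sqrt(2 × MSE / t) for the difference between two treatment means; these formulas reproduce the printed values but are not confirmed here as the library's internal definitions. The helper factor_ss is hypothetical.

```python
import numpy as np

# Data from the example above (one location, t = 4, no missing values).
col = np.array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4])
row = np.array([3, 2, 4, 1, 1, 4, 2, 3, 2, 3, 1, 4, 4, 1, 3, 2])
treatment = np.array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4])
y = np.array([1.167, 1.185, 1.655, 1.345, 1.64, 1.29, 1.665, 1.29,
              1.475, 0.71, 1.425, 0.66, 1.565, 1.29, 1.4, 1.18])
t = 4
grand_mean = y.mean()


def factor_ss(labels):
    """Sum of squares for one blocking or treatment factor."""
    means = np.array([y[labels == level].mean() for level in range(1, t + 1)])
    return t * np.sum((means - grand_mean) ** 2)


sse = (np.sum((y - grand_mean) ** 2)
       - factor_ss(row) - factor_ss(col) - factor_ss(treatment))
error_df = (t - 1) * (t - 2)
mse = sse / error_df

cv = 100.0 * np.sqrt(mse) / grand_mean   # coefficient of variation, in percent
se_diff = np.sqrt(2.0 * mse / t)         # std. error of a difference of two means
se_mean = se_diff / np.sqrt(2.0)         # the value passed to multipleComparisons

print("CV = %.3f, SE(diff) = %.6f, df = %d" % (cv, se_diff, error_df))
```

Dividing the standard error by sqrt(2), as the example does before calling multipleComparisons, gives the standard error of a single treatment mean, sqrt(MSE / t).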