Package com.imsl.stat

Class PooledCovariances

java.lang.Object
com.imsl.stat.PooledCovariances
All Implemented Interfaces:
Serializable, Cloneable

public class PooledCovariances extends Object implements Serializable, Cloneable
Computes a pooled variance-covariance matrix from one or more sets of observations.

Class PooledCovariances computes the pooled variance-covariance matrix from one or more matrices of observations. The within-groups means are also computed. Listwise deletion of missing values is assumed so that all observations used are complete; for any row of x, if any element of the observation is missing (with a value of Double.NaN), the row is not used. This class should be used whenever the user suspects that the data has been sampled from populations with different means but identical variance-covariance matrices. If these assumptions cannot be made, a different variance-covariance matrix should be estimated within each group.

Group observation totals, \(T_i\) for i = 1, ..., g, where g is the number of groups, are computed as: $$T_i=\sum\limits_{j}w_{ij}f_{ij}x_{ij}$$

where \(w_{ij}\) is the observation weight, \(x_{ij}\) is the j-th observation in the i-th group, and \(f_{ij}\) is the observation frequency.

Modified Givens rotations are used in computing the Cholesky decomposition of the pooled sums of squares and crossproducts matrix (Golub and Van Loan 1983).

The group means and the pooled sample covariance matrix S are computed from intermediate results. These quantities are defined by

$$ \bar{x}_i=\frac{{T_i}}{{\sum\limits_{j}w_{ij}f_{ij}}} $$ $$ S=\frac{1}{\sum\limits_{ij}f_{ij}-g}\sum\limits_{ij}w_{ij}f_{ij} \big(x_{ij}-\bar{x}_i\big)\big(x_{ij}-\bar{x}_i\big)^T $$
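
The definitions above can be illustrated with a short, self-contained sketch in plain Java. It is not the library's implementation: it forms the sums directly rather than through Givens rotations, and the class name PooledCovarianceSketch and its method names are hypothetical. It also assumes that every group contains at least one observation and that no values are missing.

```java
import java.util.Arrays;

/** Plain-Java sketch of the group means and pooled covariance formulas; not the library's code. */
final class PooledCovarianceSketch {

    /** Returns xbar_i = T_i / sum_j(w_ij * f_ij) for groups numbered 1..nGroups. */
    static double[][] groupMeans(double[][] x, int[] groups, double[] f, double[] w, int nGroups) {
        int nVar = x[0].length;
        double[][] means = new double[nGroups][nVar];
        double[] sumWf = new double[nGroups];
        for (int r = 0; r < x.length; r++) {
            int g = groups[r] - 1;              // group numbers are 1-based
            double wf = w[r] * f[r];
            sumWf[g] += wf;
            for (int k = 0; k < nVar; k++) {
                means[g][k] += wf * x[r][k];    // accumulates the group total T_i
            }
        }
        for (int g = 0; g < nGroups; g++) {
            for (int k = 0; k < nVar; k++) {
                means[g][k] /= sumWf[g];        // assumes every group is non-empty
            }
        }
        return means;
    }

    /** Returns S, dividing the weighted crossproducts by (sum of frequencies - nGroups). */
    static double[][] pooledCovariance(double[][] x, int[] groups, double[] f, double[] w, int nGroups) {
        int nVar = x[0].length;
        double[][] means = groupMeans(x, groups, f, w, nGroups);
        double[][] s = new double[nVar][nVar];
        double sumF = 0.0;
        for (int r = 0; r < x.length; r++) {
            double[] m = means[groups[r] - 1];
            double wf = w[r] * f[r];
            sumF += f[r];
            for (int a = 0; a < nVar; a++) {
                for (int b = 0; b < nVar; b++) {
                    s[a][b] += wf * (x[r][a] - m[a]) * (x[r][b] - m[b]);
                }
            }
        }
        double denom = sumF - nGroups;
        for (int a = 0; a < nVar; a++) {
            for (int b = 0; b < nVar; b++) {
                s[a][b] /= denom;
            }
        }
        return s;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}, {5, 6}, {10, 3}, {12, 1}};
        int[] groups = {1, 1, 1, 2, 2};
        double[] ones = {1, 1, 1, 1, 1};        // unit frequencies and weights
        System.out.println(Arrays.deepToString(groupMeans(x, groups, ones, ones, 2)));
        System.out.println(Arrays.deepToString(pooledCovariance(x, groups, ones, ones, 2)));
    }
}
```

For the sample data in main, the group means are (3, 4) and (11, 2), the pooled crossproducts matrix has 10 on the diagonal and 6 off the diagonal, and the divisor is 5 - 2 = 3.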
  • Constructor Summary

    Constructors
    Constructor
    Description
    PooledCovariances(int nGroups)
    Constructor for PooledCovariances.
  • Method Summary

    Modifier and Type
    Method
    Description
    int[]
    getGroupCounts()
    Returns the number of observations in each group.
    double[][]
    getMeans()
    Returns the means of each group.
    int
    getNumberOfGroups()
    Returns the number of groups used in the analysis.
    int
    getNumberOfMissingRows()
    Returns the total number of observations that contain missing values (Double.NaN or groups[i] == 0).
    int
    getNumberOfVariables()
    Returns the number of variables used in the analysis.
    double[][]
    getPooledCovariances()
    Computes and returns the pooled covariances.
    double[]
    getSumOfWeights()
    Returns the sum of the weights times the frequencies in the groups.
    int
    getTotalNumberOfObservations()
    Returns the total number of observations used in the analysis.
    double[][]
    getU()
    Returns the lower triangular matrix U for the pooled sample crossproducts matrix.
    void
    update(double[][] x)
    Updates the pooled covariances with new observations from one group.
    void
    update(double[][] x, int[] groups)
    Updates the pooled covariances with new group observations.
    void
    update(double[][] x, int[] groups, double[] frequencies, double weight)
    Updates the pooled covariances with new group observations, frequencies and a scalar weight.
    void
    update(double[][] x, int[] groups, double[] frequencies, double[] weights)
    Updates the pooled covariances with new group observations, frequencies and weights.
    void
    update(double[][] x, int[] groups, double frequency, double weight)
    Updates the pooled covariances with new group observations and a scalar frequency and weight.
    void
    update(double[][] x, int[] groups, double frequency, double[] weights)
    Updates the pooled covariances with new group observations, a scalar frequency and weights.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • PooledCovariances

      public PooledCovariances(int nGroups)
      Constructor for PooledCovariances.
      Parameters:
      nGroups - an int, the number of groups in the data. The groups are numbered 1, 2,..., nGroups.
  • Method Details

    • update

      public void update(double[][] x)
      Updates the pooled covariances with new observations from one group.
      Parameters:
      x - a double matrix containing the observed data. Each row of x contains one observation consisting of x[0].length variables. If x[i][j] has value Double.NaN, then row i of the observations will be skipped and counted as missing.

      The number of observation variables is determined in the first call to any of the update methods. In all subsequent update calls, the number of observation variables must be the same.

      This method assumes that all observations belong to group 1 and have frequencies and weights of 1.0.

    • update

      public void update(double[][] x, int[] groups)
      Updates the pooled covariances with new group observations.
      Parameters:
      x - a double matrix containing the observed data. Each row of x contains one observation consisting of x[0].length variables. If x[i][j] has value Double.NaN, then row i of the observations will be skipped and counted as missing.

      The number of observation variables is determined in the first call to any of the update methods. In all subsequent update calls, the number of observation variables must be the same.

      This method assumes that all observations have frequencies and weights of 1.0.

      groups - an int array containing the group number of the observations in x. Group numbers must be numbered 1, 2,..., nGroups. If groups[i] == 0, the row of observations will be skipped and counted as missing. For groups[i] < 0 or groups[i] > nGroups, a warning will be issued indicating that the row of observations will be skipped (not marked as missing).
    • update

      public void update(double[][] x, int[] groups, double[] frequencies, double[] weights)
      Updates the pooled covariances with new group observations, frequencies and weights.
      Parameters:
      x - a double matrix containing the observed data. Each row of x contains one observation consisting of x[0].length variables. If x[i][j] has value Double.NaN, then row i of the observations will be skipped and counted as missing.

      The number of observation variables is determined in the first call to any of the update methods. In all subsequent update calls, the number of observation variables must be the same.

      groups - an int array containing the group number of the observations in x. Group numbers must be numbered 1, 2,..., nGroups. If groups[i] == 0, the row of observations will be skipped and counted as missing. For groups[i] < 0 or groups[i] > nGroups, a warning will be issued indicating that the row of observations will be skipped (not marked as missing).
      frequencies - a double array of size x.length containing the frequency for each observation. Each value must be positive. Any Double.NaN value results in that observation being skipped and marked missing.
      weights - a double array of size x.length containing the weight for each observation. Each value must be positive. Any Double.NaN value results in that observation being skipped and marked missing.
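
The row-screening rules described for x, groups, frequencies and weights can be summarized in a small plain-Java sketch. It is illustrative only and not the library's code: the class RowScreeningSketch, the Status enum and the method names are hypothetical, the order in which the checks are applied is an assumption, and the handling of non-positive frequencies or weights is left out because it is not described here.

```java
/** Plain-Java sketch of the documented row-screening rules; not the library's code. */
final class RowScreeningSketch {

    enum Status { USED, MISSING, SKIPPED }

    /** Classifies one observation. The order of the checks is an assumption. */
    static Status classify(double[] row, int group, double frequency, double weight, int nGroups) {
        if (group < 0 || group > nGroups) {
            return Status.SKIPPED;              // out of range: skipped, not counted as missing
        }
        if (group == 0) {
            return Status.MISSING;              // group 0 marks a missing observation
        }
        if (Double.isNaN(frequency) || Double.isNaN(weight)) {
            return Status.MISSING;
        }
        for (double v : row) {
            if (Double.isNaN(v)) {
                return Status.MISSING;          // listwise deletion: one NaN drops the row
            }
        }
        return Status.USED;
    }

    /** Counts the rows that would be reported as missing under the rules above. */
    static int countMissing(double[][] x, int[] groups, double[] f, double[] w, int nGroups) {
        int missing = 0;
        for (int r = 0; r < x.length; r++) {
            if (classify(x[r], groups[r], f[r], w[r], nGroups) == Status.MISSING) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0, 2.0}, {Double.NaN, 2.0}, {3.0, 4.0}, {5.0, 6.0}};
        int[] groups = {1, 2, 0, 7};
        double[] ones = {1.0, 1.0, 1.0, 1.0};
        System.out.println(countMissing(x, groups, ones, ones, 2));
    }
}
```

In the sample data, the second row is missing because of the NaN, the third because its group number is 0, and the fourth is skipped without being counted as missing because group 7 is out of range for nGroups = 2.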
    • update

      public void update(double[][] x, int[] groups, double frequency, double[] weights)
      Updates the pooled covariances with new group observations, a scalar frequency and weights.
      Parameters:
      x - a double matrix containing the observed data. Each row of x contains one observation consisting of x[0].length variables. If x[i][j] has value Double.NaN, then row i of the observations will be skipped and counted as missing.

      The number of observation variables is determined in the first call to any of the update methods. In all subsequent update calls, the number of observation variables must be the same.

      groups - an int array containing the group number of the observations in x. Group numbers must be numbered 1, 2,..., nGroups. If groups[i] == 0, the row of observations will be skipped and counted as missing. For groups[i] < 0 or groups[i] > nGroups, a warning will be issued indicating that the row of observations will be skipped (not marked as missing).
      frequency - a positive double, the frequency applied to every observation
      weights - a double array of size x.length containing the weight for each observation. Each value must be positive. Any Double.NaN value results in that observation being skipped and marked missing.
    • update

      public void update(double[][] x, int[] groups, double[] frequencies, double weight)
      Updates the pooled covariances with new group observations, frequencies and a scalar weight.
      Parameters:
      x - a double matrix containing the observed data. Each row of x contains one observation consisting of x[0].length variables. If x[i][j] has value Double.NaN, then row i of the observations will be skipped and counted as missing.

      The number of observation variables is determined in the first call to any of the update methods. In all subsequent update calls, the number of observation variables must be the same.

      groups - an int array containing the group number of the observations in x. Group numbers must be numbered 1, 2,..., nGroups. If groups[i] == 0, the row of observations will be skipped and counted as missing. For groups[i] < 0 or groups[i] > nGroups, a warning will be issued indicating that the row of observations will be skipped (not marked as missing).
      frequencies - a double array of size x.length containing the frequency for each observation. Each value must be positive. Any Double.NaN value results in that observation being skipped and marked missing.
      weight - a positive double, the weight applied to every observation
    • update

      public void update(double[][] x, int[] groups, double frequency, double weight)
      Updates the pooled covariances with new group observations and a scalar frequency and weight.
      Parameters:
      x - a double matrix containing the observed data. Each row of x contains one observation consisting of x[0].length variables. If x[i][j] has value Double.NaN, then row i of the observations will be skipped and counted as missing.

      The number of observation variables is determined in the first call to any of the update methods. In all subsequent update calls, the number of observation variables must be the same.

      groups - an int array containing the group number of the observations in x. Group numbers must be numbered 1, 2,..., nGroups. If groups[i] == 0, the row of observations will be skipped and counted as missing. For groups[i] < 0 or groups[i] > nGroups, a warning will be issued indicating that the row of observations will be skipped (not marked as missing).
      frequency - a positive double, the frequency applied to every observation
      weight - a positive double, the weight applied to every observation
    • getPooledCovariances

      public double[][] getPooledCovariances()
      Computes and returns the pooled covariances.

      Note that one of the update methods must be invoked before this method; otherwise, this method throws an IllegalStateException.

      Returns:
      a square double matrix of order nVar, the number of observation variables, containing the pooled covariances
    • getGroupCounts

      public int[] getGroupCounts()
      Returns the number of observations in each group.
      Returns:
      an int array of length nGroups containing the number of observations in each group
    • getSumOfWeights

      public double[] getSumOfWeights()
      Returns the sum of the weights times the frequencies in the groups.
      Returns:
      a double array of length nGroups containing the sum of the weights times the frequencies in the groups
    • getMeans

      public double[][] getMeans()
      Returns the means of each group.

      Note that one of the update methods must be invoked before this method; otherwise, this method throws an IllegalStateException.

      Returns:
      a double matrix with nGroups rows. The i-th row contains the group i variable means.
    • getU

      public double[][] getU()
      Returns the lower triangular matrix U for the pooled sample crossproducts matrix. U is computed from the pooled sample covariance matrix, S, as \(S=U^TU\).

      Note that one of the update methods must be invoked before this method; otherwise, this method throws an IllegalStateException.

      Returns:
      a square double matrix of order nVar, the number of observation variables, containing U
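
The relation \(S=U^TU\) can be checked with a short plain-Java sketch. It uses a textbook Cholesky factorization rather than the modified Givens rotations used by this class, and the class CholeskyFactorSketch and its method names are hypothetical. The sketch builds an upper triangular factor, the conventional layout for a factorization written as \(U^TU\); whether the array returned by getU is laid out the same way is not confirmed here.

```java
import java.util.Arrays;

/** Plain-Java sketch of a triangular factor U with S = U^T U; not the library's code. */
final class CholeskyFactorSketch {

    /** Returns an upper triangular U such that U^T U equals the symmetric positive definite s. */
    static double[][] upperCholesky(double[][] s) {
        int n = s.length;
        double[][] u = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double sum = s[i][j];
                for (int k = 0; k < i; k++) {
                    sum -= u[k][i] * u[k][j];   // remove contributions of earlier rows
                }
                if (i == j) {
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException("matrix is not positive definite");
                    }
                    u[i][i] = Math.sqrt(sum);
                } else {
                    u[i][j] = sum / u[i][i];
                }
            }
        }
        return u;
    }

    /** Returns U^T U, used to check the factorization. */
    static double[][] transposeTimes(double[][] u) {
        int n = u.length;
        double[][] p = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    p[i][j] += u[k][i] * u[k][j];
                }
            }
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] s = {{4.0, 2.0}, {2.0, 3.0}};
        double[][] u = upperCholesky(s);
        System.out.println(Arrays.deepToString(u));
        System.out.println(Arrays.deepToString(transposeTimes(u)));
    }
}
```

For the 2 by 2 matrix in main, the first row of U is (2, 1) and the remaining diagonal entry is the square root of 2, so the product \(U^TU\) reproduces S up to rounding.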
    • getNumberOfMissingRows

      public int getNumberOfMissingRows()
      Returns the total number of observations that contain missing values (Double.NaN or groups[i] == 0).
      Returns:
      an int containing the total number of observations with missing values
    • getTotalNumberOfObservations

      public int getTotalNumberOfObservations()
      Returns the total number of observations used in the analysis.
      Returns:
      an int, the total number of observations from all update invocations
    • getNumberOfVariables

      public int getNumberOfVariables()
      Returns the number of variables used in the analysis.
      Returns:
      an int, the number of variables
    • getNumberOfGroups

      public int getNumberOfGroups()
      Returns the number of groups used in the analysis.
      Returns:
      an int, the number of groups