com.imsl.stat.NormTwoSample

All Implemented Interfaces: Serializable, Cloneable

public class NormTwoSample extends Object implements Serializable, Cloneable

Computes statistics for mean and variance inferences using samples from two normal populations.

Class NormTwoSample computes statistics for making inferences about the means and variances of two normal populations, using independent samples in x and y. Missing values, that is, values equal to NaN (not a number), are excluded from the computations. For inferences concerning parameters of a single normal population, see class NormOneSample.

Let $\mu_1$ and $\sigma _1^2$ be the mean and variance of the first population, and let $\mu_2$ and $\sigma _2^2$ be the corresponding quantities of the second population. The methods in this class support tests for the difference in means $\mu_1-\mu_2$, for equality of variances, and for the common variance (assuming the variances are equal).

The sample means and variances are as follows:

$$\bar x_1 = \left( \sum x_{1i} \right) / n_1, \qquad \bar x_2 = \left( \sum x_{2i} \right) / n_2$$

and

$$s_1^2 = \sum {\left( {x_{1i} - \bar x_1 } \right)}^2 /\left( {n_1 - 1} \right), \,\,\,\,\,\,\,\,\,\,\,s_2^2 = \sum {\left( {x_{2i} - {\bar x}_2} \right)}^2 /\left( {n_2 - 1} \right)$$
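
As an illustration of the definitions above, the following minimal sketch computes a sample mean and an unbiased sample variance while skipping NaN entries. The class name SampleMoments and its methods are hypothetical helpers written for this example; they are not part of the library, and the library's internal computation may differ.

```java
// Hypothetical helper, not part of com.imsl.stat: sample mean and variance
// with missing values (NaN) excluded, following the formulas above.
final class SampleMoments {

    /** Mean of the non-NaN entries; NaN if no valid entries remain. */
    static double mean(double[] data) {
        double sum = 0.0;
        int n = 0;
        for (double v : data) {
            if (!Double.isNaN(v)) {
                sum += v;
                n++;
            }
        }
        return n > 0 ? sum / n : Double.NaN;
    }

    /** Unbiased sample variance (divisor n - 1) of the non-NaN entries. */
    static double variance(double[] data) {
        double m = mean(data);
        double sumSq = 0.0;
        int n = 0;
        for (double v : data) {
            if (!Double.isNaN(v)) {
                double d = v - m;
                sumSq += d * d;
                n++;
            }
        }
        return n > 1 ? sumSq / (n - 1) : Double.NaN;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, Double.NaN, 3.0, 4.0};
        System.out.println("mean = " + mean(x) + ", variance = " + variance(x));
    }
}
```

The two-pass form is used here for readability; it mirrors the formulas directly rather than aiming for streaming efficiency.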

Inferences about the Means

The test that the difference in means equals a certain value, for example, $\mu_0$, depends on whether or not the variances of the two populations can be considered equal. If the variances are equal and $\mu_0=0$, the test is the two-sample t-test, which is equivalent to an analysis-of-variance test. The pooled variance for the difference-in-means test is as follows:

$$s^2 = \frac{{\left( {n_1 - 1} \right)s_1^2 + \left( {n_2 - 1} \right)s_2^2 }} {{n_1 + n_2 - 2}}$$

The t statistic is as follows:

$$t = \frac{{\bar x_1 - \bar x_2 - \mu _0}} {s\sqrt {{\left( {1/n_1 } \right)} + \left( {1/n_2 } \right)}}$$
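
The equal-variance computation can be sketched from summary statistics as follows. This is an illustrative sketch only: the class PooledTTest and its method names are assumptions made for this example, and it weights the sample variances $s_1^2$ and $s_2^2$ by their degrees of freedom. It does not compute p-values or confidence limits, which require the t distribution.

```java
// Hypothetical helper, not part of com.imsl.stat: pooled variance and the
// equal-variance two-sample t statistic computed from summary statistics.
final class PooledTTest {

    /** Pooled variance: degrees-of-freedom-weighted average of the sample variances. */
    static double pooledVariance(int n1, double var1, int n2, double var2) {
        return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2);
    }

    /** t statistic for H0: mu1 - mu2 = mu0, assuming equal population variances. */
    static double tStatistic(double mean1, double mean2, double mu0,
                             double pooledVar, int n1, int n2) {
        double scale = Math.sqrt(pooledVar) * Math.sqrt(1.0 / n1 + 1.0 / n2);
        return (mean1 - mean2 - mu0) / scale;
    }

    public static void main(String[] args) {
        // Summary statistics for x = {1, 2, 3, 4} and y = {2, 4, 6, 8, 10}.
        int n1 = 4, n2 = 5;
        double s2 = pooledVariance(n1, 5.0 / 3.0, n2, 10.0);
        double t = tStatistic(2.5, 6.0, 0.0, s2, n1, n2);
        System.out.println("pooled variance = " + s2);
        System.out.println("t = " + t + " on " + (n1 + n2 - 2) + " degrees of freedom");
    }
}
```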

Under the equal-variance assumption, a confidence interval for the difference in means is obtained by first setting the unequal variances flag to false with the setUnequalVariances method and then calling the getLowerCIDiff and getUpperCIDiff methods.

If the population variances are not equal, the ordinary t statistic does not have a t distribution, and several approximate tests for the equality of means have been proposed. (See, for example, Anderson and Bancroft 1952, and Kendall and Stuart 1979.) One of the earliest tests devised for this situation is the Fisher-Behrens test, based on Fisher's concept of fiducial probability. When unequal variances are specified, the getTTest, getLowerCIDiff and getUpperCIDiff methods use Satterthwaite's procedure, as suggested by H.F. Smith and modified by F.E. Satterthwaite (Anderson and Bancroft 1952, p. 83). Call setUnequalVariances with the value true to obtain results assuming unequal variances.

The test statistic is

$$t' = \left( {\bar x_1 - \bar x_2 - \mu _0 } \right)/s_d$$

where

$$s_d = \sqrt {\left( {s_1^2 /n_1 } \right) + \left( {s_2^2 /n_2 } \right)}$$

Under the null hypothesis of $\mu_1 - \mu_2 = \mu_0$, this quantity has an approximate t distribution with degrees of freedom df, given by the following equation:

$${\rm{df}} = \frac{{s_d^4 }}{{\frac{{\left( {s_1^2 /n_1 } \right)^2 }}{{n_1 - 1}} + \frac{{\left( {s_2^2 /n_2 } \right)^2 }}{{n_2 - 1}}}}$$
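
The unequal-variance statistic and its approximate degrees of freedom can be sketched directly from the two equations above. The class SatterthwaiteSketch and its method names are hypothetical and serve only to illustrate the arithmetic; the library's own implementation is not shown here.

```java
// Hypothetical helper, not part of com.imsl.stat: the t' statistic and the
// Satterthwaite approximate degrees of freedom from summary statistics.
final class SatterthwaiteSketch {

    /** Standard error s_d of the difference in means without pooling. */
    static double standardError(int n1, double var1, int n2, double var2) {
        return Math.sqrt(var1 / n1 + var2 / n2);
    }

    /** t' statistic for H0: mu1 - mu2 = mu0. */
    static double tPrime(double mean1, double mean2, double mu0,
                         int n1, double var1, int n2, double var2) {
        return (mean1 - mean2 - mu0) / standardError(n1, var1, n2, var2);
    }

    /** Approximate degrees of freedom; generally not an integer. */
    static double degreesOfFreedom(int n1, double var1, int n2, double var2) {
        double a = var1 / n1;
        double b = var2 / n2;
        double sd4 = (a + b) * (a + b); // s_d^4
        return sd4 / (a * a / (n1 - 1) + b * b / (n2 - 1));
    }

    public static void main(String[] args) {
        // Summary statistics for x = {1, 2, 3, 4} and y = {2, 4, 6, 8, 10}.
        double t = tPrime(2.5, 6.0, 0.0, 4, 5.0 / 3.0, 5, 10.0);
        double df = degreesOfFreedom(4, 5.0 / 3.0, 5, 10.0);
        System.out.println("t' = " + t + ", df = " + df);
    }
}
```

Because the resulting degrees of freedom are generally fractional, a method reporting them would naturally return a double rather than an int.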

Inferences about Variances

The F statistic for testing the equality of variances is given by $F = s_{\max }^2 /s_{\min }^2$, where $s_{\max}^2$ is the larger of $s_1^2$ and $s_2^2$ and $s_{\min}^2$ is the smaller. If the variances are equal, this quantity has an F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.
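
A minimal sketch of the variance-ratio statistic follows. The class VarianceRatio is a hypothetical name used only for this example, and the sketch assumes both sample variances are strictly positive.

```java
// Hypothetical helper, not part of com.imsl.stat: the ratio of the larger
// sample variance to the smaller, so the result is always at least 1.
final class VarianceRatio {

    /** F = max(var1, var2) / min(var1, var2); assumes both variances are positive. */
    static double fStatistic(double var1, double var2) {
        if (!(var1 > 0.0) || !(var2 > 0.0)) {
            throw new IllegalArgumentException("sample variances must be positive");
        }
        return Math.max(var1, var2) / Math.min(var1, var2);
    }

    public static void main(String[] args) {
        // Sample variances of x = {1, 2, 3, 4} and y = {2, 4, 6, 8, 10}.
        System.out.println("F = " + fStatistic(5.0 / 3.0, 10.0));
    }
}
```

Since the larger variance is always placed in the numerator, the statistic does not depend on which sample is labelled first.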

Note: it is generally not recommended that the results of the F test be used to decide whether to use the regular t-test or the modified $t'$ on a single set of data. The modified $t'$ (Satterthwaite's procedure) is the more conservative approach to use if there is doubt about the equality of the variances.

Constructor Summary

Constructors

Constructor

Description

NormTwoSample(double[] x, double[] y)

Constructor.
Method Summary

Modifier and Type

Method

Description

void

downdateX(double[] x)

Removes the observations in x from the first sample.

void

downdateY(double[] y)

Removes the observations in y from the second sample.

double

getChiSquaredTest()

Returns the test statistic associated with the chi-squared test for the (assumed) common variance.

int

getChiSquaredTestDF()

Returns the degrees of freedom associated with the chi-squared test for the common variance.

double

getChiSquaredTestP()

Returns the probability of a larger value than the chi-squared statistic associated with the test for the common variance, assuming the null hypothesis is true (i.e., the p-value for the test).

double

getDiffMean()

Returns the difference in sample means.

double

getFTest()

Returns the F statistic value calculated in an F-test for equality of variances.

int

getFTestDFdenominator()

Returns the denominator degrees of freedom of the F test for equality of variances.

int

getFTestDFnumerator()

Returns the numerator degrees of freedom in the $F$-test for equality of variances.

double

getFTestP()

Returns the probability of a larger (in absolute value) F statistic value, assuming equal variances (i.e., the p-value for the test).

double

getLowerCICommonVariance()

Returns the lower $100 \times$ confidenceVariance percent confidence limit for the common variance.

double

getLowerCIDiff()

Returns the lower confidence limit for the difference, $\mu_x - \mu_y$, for equal or unequal variances depending on the value set by setUnequalVariances.

double

getLowerCIRatioVariance()

Returns the approximate lower confidence limit in an interval estimate for the ratio of variances, $\sigma_1^2/\sigma_2^2$.

double

getMeanX()

Returns the mean of the first sample.

double

getMeanY()

Returns the mean of the second sample.

double

getPooledVariance()

Returns the pooled variance for the two samples.

double

getStdDevX()

Returns the standard deviation of the first sample.

double

getStdDevY()

Returns the standard deviation of the second sample.

double

getTTest()

Returns the test statistic for Satterthwaite's approximation.

double

getTTestDF()

Returns the degrees of freedom for Satterthwaite's approximation.

double

getTTestP()

Returns the approximate probability of observing a larger value of the t-statistic given the null hypothesis is true (i.e., the p-value for the test).

double

getUpperCICommonVariance()

Returns the upper $100 \times$ confidenceVariance percent confidence limit for the common variance.

double

getUpperCIDiff()

Returns the upper confidence limit for the difference, $\mu_x - \mu_y$, for equal or unequal variances depending on the value set by setUnequalVariances.

double

getUpperCIRatioVariance()

Returns the approximate upper confidence limit in a confidence interval for the ratio of variances, $\sigma_1^2/\sigma_2^2$.

void

setChiSquaredTestNull(double varianceHypothesisValue)

Sets the null hypothesis value for the chi-squared test.

void

setConfidenceMean(double confidenceMean)

Sets the confidence level for a two-sided confidence interval for the difference in means, $\mu_x - \mu_y$.

void

setConfidenceVariance(double confidenceVariance)

Sets the confidence level for a two-sided interval estimate for the common variance and for the ratios of variances.

void

setTTestNull(double meanHypothesis)

Sets the null hypothesis value for the t-test on the difference in means.

void

setUnequalVariances(boolean uneqVar)

Specifies whether to return statistics based on equal or unequal variances.

void

update(double[] x, double[] y)

Concatenates the data in x and y with the samples provided in the constructor.

void

updateX(double[] x)

Concatenates the data in x with the first sample.

void

updateY(double[] y)

Concatenates the data in y with the second sample.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- NormTwoSample
  
  public NormTwoSample(double[] x, double[] y)
  
  Constructor.
  
  Parameters:
  
  x - a double array containing the first sample
  
  y - a double array containing the second sample
Method Details
- getDiffMean
  
  public double getDiffMean()
  
  Returns the difference in sample means.
  
  Returns:
  
  a double, the difference in sample means
- getMeanX
  
  public double getMeanX()
  
  Returns the mean of the first sample.
  
  Returns:
  
  a double, the mean of the first sample
- getMeanY
  
  public double getMeanY()
  
  Returns the mean of the second sample.
  
  Returns:
  
  a double, the mean of the second sample
- setConfidenceMean
  
  public void setConfidenceMean(double confidenceMean)
  
  Sets the confidence level for a two-sided confidence interval for the difference in means, $\mu_x - \mu_y$. The argument confidenceMean must be between $0.0$ and $1.0$; common choices are $0.90$, $0.95$ and $0.99$. Note: to use getUpperCIDiff() (getLowerCIDiff()) as a $C$% upper (lower) one-sided confidence limit, set confidenceMean $= 1 - 2(1 - C/100)$. For example, a one-sided 95% limit corresponds to confidenceMean = 0.90. Default: confidenceMean = 0.95
  
  Parameters:
  
  confidenceMean - double, the desired confidence level of the mean
- getUpperCIDiff
  
  public double getUpperCIDiff()
  
  Returns the upper confidence limit for the difference, $\mu_x - \mu_y$, for equal or unequal variances depending on the value set by setUnequalVariances.
  
  Returns:
  
  a double containing the upper confidence limit for the difference in means of the two populations.
- getLowerCIDiff
  
  public double getLowerCIDiff()
  
  Returns the lower confidence limit for the difference, $\mu_x - \mu_y$, for equal or unequal variances depending on the value set by setUnequalVariances.
  
  Returns:
  
  a double containing the lower confidence limit for the difference in means of the two populations.
- setUnequalVariances
  
  public void setUnequalVariances(boolean uneqVar)
  
  Specifies whether to return statistics based on equal or unequal variances. The default is to return statistics for equal variances. If uneqVar is true, statistics for unequal variances are returned.
  
  Parameters:
  
  uneqVar - a boolean containing a true or false value. A value of true will cause results for unequal variances to be returned. A value of false will cause results for equal variances to be returned.
- getTTestDF
  
  public double getTTestDF()
  
  Returns the degrees of freedom for Satterthwaite's approximation. The value returned is based on the assumption of equal or unequal variances, depending on the value set by setUnequalVariances.
  
  Returns:
  
  a double containing the degrees of freedom for the t-test.
- getTTest
  
  public double getTTest()
  
  Returns the test statistic for Satterthwaite's approximation. The value returned is based on the assumption of equal or unequal variances, according to the value set by setUnequalVariances.
  
  Returns:
  
  a double containing the test statistic for the t-test.
- getTTestP
  
  public double getTTestP()
  
  Returns the approximate probability of observing a larger value of the t-statistic given the null hypothesis is true (i.e., the p-value for the test). The value returned is based on the assumption of equal or unequal variances, according to the value set by setUnequalVariances.
  
  Returns:
  
  a double, the p-value for the test
- setTTestNull
  
  public void setTTestNull(double meanHypothesis)
  
  Sets the null hypothesis value for the t-test on the difference in means. The default is meanHypothesis = 0.0.
  
  Parameters:
  
  meanHypothesis - double containing the hypothesis value.
- getPooledVariance
  
  public double getPooledVariance()
  
  Returns the pooled variance for the two samples.
  
  Returns:
  
  a double, the pooled variance for the two samples
- setConfidenceVariance
  
  public void setConfidenceVariance(double confidenceVariance)
  
  Sets the confidence level for a two-sided interval estimate for the common variance and for the ratio of variances. Under the assumption of equal variances, the pooled variance for the two samples is used to obtain a two-sided $100 \times$ confidenceVariance percent confidence interval for the common variance, with lower limit returned by getLowerCICommonVariance and upper limit returned by getUpperCICommonVariance. Without the assumption of equal variances (see setUnequalVariances), the ratio of the variances is of interest. A two-sided $100 \times$ confidenceVariance percent confidence interval for the ratio of the variances, $\sigma_1^2/\sigma_2^2$, is given by getLowerCIRatioVariance and getUpperCIRatioVariance. The confidence intervals are symmetric in probability. The argument confidenceVariance must be between 0.0 and 1.0 and is often 0.90, 0.95 or 0.99. The default is 0.95.
  
  Parameters:
  
  confidenceVariance - a double containing the confidence level of the variance
- getLowerCICommonVariance
  
  public double getLowerCICommonVariance()
  
  Returns the lower $100 \times$ confidenceVariance percent confidence limit for the common variance.
  
  Returns:
  
  a double, the lower confidence limit for the common variance
- getUpperCICommonVariance
  
  public double getUpperCICommonVariance()
  
  Returns the upper $100 \times$ confidenceVariance percent confidence limit for the common variance.
  
  Returns:
  
  a double, the upper confidence limit for the common variance
- getChiSquaredTestDF
  
  public int getChiSquaredTestDF()
  
  Returns the degrees of freedom associated with the chi-squared test for the common variance. The chi-squared test is a test of the hypothesis $\omega^2 = \omega_0^2$ where $\omega_0^2$ is the null hypothesis value as described in setChiSquaredTestNull.
  
  Returns:
  
  an int, the degrees of freedom for the chi-squared test
- getChiSquaredTest
  
  public double getChiSquaredTest()
  
  Returns the test statistic associated with the chi-squared test for the (assumed) common variance. The chi-squared test is a test of the hypothesis $\omega^2 = \omega_0^2$ where $\omega^2$ is the assumed common variance between the two populations and $\omega_0^2$ is the null hypothesis value as described in setChiSquaredTestNull.
  
  Returns:
  
  a double, the test statistic for the chi-squared test
- getChiSquaredTestP
  
  public double getChiSquaredTestP()
  
  Returns the probability of a larger value than the chi-squared statistic associated with the test for the common variance, assuming the null hypothesis is true (i.e., the p-value for the test).
  
  Returns:
  
  a double, the p-value for the chi-squared test
- setChiSquaredTestNull
  
  public void setChiSquaredTestNull(double varianceHypothesisValue)
  
  Sets the null hypothesis value for the chi-squared test. The default is 1.0.
  
  Parameters:
  
  varianceHypothesisValue - a double, the null hypothesis value for the chi-squared test
- getStdDevX
  
  public double getStdDevX()
  
  Returns the standard deviation of the first sample.
  
  Returns:
  
  a double, the standard deviation of the first sample
- getStdDevY
  
  public double getStdDevY()
  
  Returns the standard deviation of the second sample.
  
  Returns:
  
  a double, the standard deviation of the second sample
- getLowerCIRatioVariance
  
  public double getLowerCIRatioVariance()
  
  Returns the approximate lower confidence limit in an interval estimate for the ratio of variances, $\sigma_1^2/\sigma_2^2$.
  
  Returns:
  
  a double, the lower limit
- getUpperCIRatioVariance
  
  public double getUpperCIRatioVariance()
  
  Returns the approximate upper confidence limit in a confidence interval for the ratio of variances, $\sigma_1^2/\sigma_2^2$.
  
  Returns:
  
  a double, the upper limit
- getFTestDFnumerator
  
  public int getFTestDFnumerator()
  
  Returns the numerator degrees of freedom in the $F$-test for equality of variances.
  
  Returns:
  
  an int, the numerator degrees of freedom
- getFTestDFdenominator
  
  public int getFTestDFdenominator()
  
  Returns the denominator degrees of freedom of the F test for equality of variances.
  
  Returns:
  
  an int, the denominator degrees of freedom
- getFTest
  
  public double getFTest()
  
  Returns the F statistic value calculated in an F-test for equality of variances.
  
  Returns:
  
  a double, the value of the test statistic
- getFTestP
  
  public double getFTestP()
  
  Returns the probability of a larger (in absolute value) F statistic value, assuming equal variances (i.e., the p-value for the test).
  
  Returns:
  
  a double, the probability of a larger F statistic
- updateX
  
  public void updateX(double[] x)
  
  Concatenates the data in x with the first sample.
  This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
  
  Parameters:
  
  x - a double array containing new data for the first sample
- downdateX
  
  public void downdateX(double[] x)
  
  Removes the observations in x from the first sample.
  
  Parameters:
  
  x - a double array containing the values to remove from the first sample
- updateY
  
  public void updateY(double[] y)
  
  Concatenates the data in y with the second sample.
  This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
  
  Parameters:
  
  y - a double array containing new data for the second sample
- downdateY
  
  public void downdateY(double[] y)
  
  Removes the observations in y from the second sample.
  
  Parameters:
  
  y - a double array containing the values to remove from the second sample
- update
  
  public void update(double[] x, double[] y)
  
  Concatenates the data in x and y with the samples provided in the constructor.
  This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
  
  Parameters:
  
  x - a double array containing updates to the first sample
  
  y - a double array containing updates to the second sample
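
The update and downdate methods described above imply that summary statistics can be maintained incrementally without holding all observations in memory. One way to do this, shown below as a sketch, is Welford's running-moment recurrence together with its algebraic inverse for removals. Whether the library uses this approach is not stated in this documentation; the class RunningSample and its methods are hypothetical, and skipping NaN values on removal is an assumption made for symmetry with the missing-value rule.

```java
// Hypothetical helper, not part of com.imsl.stat: running count, mean and
// sum of squared deviations that supports adding and removing observations.
final class RunningSample {
    private long n;
    private double mean;
    private double m2; // sum of squared deviations from the current mean

    /** Adds observations one at a time (Welford's recurrence); NaN is skipped. */
    void update(double[] values) {
        for (double v : values) {
            if (Double.isNaN(v)) continue;
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
        }
    }

    /** Removes previously added observations by inverting the recurrence. */
    void downdate(double[] values) {
        for (double v : values) {
            if (Double.isNaN(v)) continue;
            if (n == 0) throw new IllegalStateException("no observations left to remove");
            if (n == 1) { n = 0; mean = 0.0; m2 = 0.0; continue; }
            double meanWithout = (n * mean - v) / (n - 1);
            m2 -= (v - meanWithout) * (v - mean);
            mean = meanWithout;
            n--;
        }
    }

    long getCount() { return n; }
    double getMean() { return n > 0 ? mean : Double.NaN; }
    double getVariance() { return n > 1 ? m2 / (n - 1) : Double.NaN; }

    public static void main(String[] args) {
        RunningSample s = new RunningSample();
        s.update(new double[] {1.0, 2.0, 3.0, 4.0});
        s.update(new double[] {5.0, 6.0});
        s.downdate(new double[] {5.0, 6.0});
        System.out.println("n = " + s.getCount() + ", mean = " + s.getMean()
                + ", variance = " + s.getVariance());
    }
}
```

Removal by inverting the recurrence is exact in real arithmetic but can lose precision in floating point when the removed values dominate the spread of the sample, so results after many downdates should be treated with some care.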
