NormTwoSample (JMSL Numerical Library (jmsl) 2021.0.0 API)

java.lang.Object
- com.imsl.stat.NormTwoSample

All Implemented Interfaces:

Serializable, Cloneable
```
public class NormTwoSample
extends Object
implements Serializable, Cloneable
```
Computes statistics for mean and variance inferences using samples from two normal populations.
Class NormTwoSample computes statistics for making inferences about the means and variances of two normal populations, using independent samples in x and y. Missing values, that is, values equal to NaN (not a number), are excluded from the computations. For inferences concerning parameters of a single normal population, see class NormOneSample.

Let $\mu_1$ and $\sigma _1^2$ be the mean and variance of the first population, and let $\mu_2$ and $\sigma _2^2$ be the corresponding quantities of the second population. The methods in this class support tests for the difference in means $\mu_1-\mu_2$ , for equality of variances, and for the common variance (assuming the variances are equal).

The sample means and variances are as follows:

$\bar x_1 = \left( \sum x_{1i} \right) / n_1, \qquad \bar x_2 = \left( \sum x_{2i} \right) / n_2$

and

$s_1^2 = \sum \left( x_{1i} - \bar x_1 \right)^2 / \left( n_1 - 1 \right), \qquad s_2^2 = \sum \left( x_{2i} - \bar x_2 \right)^2 / \left( n_2 - 1 \right)$
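
The two definitions above, together with the rule that NaN entries are excluded, can be sketched in plain Java. This is an illustration only and not the library's implementation; the class name `SampleMoments` and its methods are hypothetical.

```java
// Sketch only: sample mean and unbiased variance that skip NaN entries.
// SampleMoments is a hypothetical helper, not part of JMSL.
final class SampleMoments {
    /** Mean of the non-NaN entries of data. */
    static double mean(double[] data) {
        double sum = 0.0;
        int n = 0;
        for (double v : data) {
            if (!Double.isNaN(v)) { // missing values are excluded
                sum += v;
                n++;
            }
        }
        return sum / n;
    }

    /** Variance with divisor n - 1, computed over the non-NaN entries. */
    static double variance(double[] data) {
        double m = mean(data);
        double sumSq = 0.0;
        int n = 0;
        for (double v : data) {
            if (!Double.isNaN(v)) {
                sumSq += (v - m) * (v - m);
                n++;
            }
        }
        return sumSq / (n - 1);
    }

    public static void main(String[] args) {
        double[] x = {1.0, Double.NaN, 2.0, 3.0, 4.0, 5.0};
        System.out.println("mean = " + mean(x) + ", variance = " + variance(x));
    }
}
```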

Inferences about the Means

The test that the difference in means equals a certain value, for example, $\mu_0$ , depends on whether or not the variances of the two populations can be considered equal. If the variances are equal and $\mu_0=0$ , the test is the two-sample t-test, which is equivalent to an analysis-of-variance test. The pooled variance for the difference-in-means test is as follows:

$s^2 = \frac{{\left( {n_1 - 1} \right)s_1^2 + \left( {n_2 - 1} \right)s_2^2 }} {{n_1 + n_2 - 2}}$

The t statistic is as follows:

$t = \frac{{\bar x_1 - \bar x_2 - \mu _0}} {s\sqrt {{\left( {1/n_1 } \right)} + \left( {1/n_2 } \right)}}$

To obtain the confidence interval for the difference in means under the equal-variance assumption, first set the unequal variances flag to false by calling the setUnequalVariances method (false is the default). The confidence limits are then returned by the getLowerCIDiff and getUpperCIDiff methods.
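
As a worked illustration of the pooled variance and the equal-variance t statistic, the following sketch evaluates both from summary statistics. It assumes the textbook pooled formula, with the squared sample standard deviations in the numerator. The class `PooledTSketch` and its method names are hypothetical, and the code is independent of the library.

```java
// Sketch only: pooled-variance two-sample t statistic from summary statistics.
// PooledTSketch is a hypothetical helper, not part of JMSL.
final class PooledTSketch {
    /** Pooled variance: ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2). */
    static double pooledVariance(int n1, double var1, int n2, double var2) {
        return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2);
    }

    /** t = (mean1 - mean2 - mu0) / (s * sqrt(1/n1 + 1/n2)), with s^2 the pooled variance. */
    static double tStatistic(double mean1, double mean2, double mu0,
                             int n1, int n2, double pooledVar) {
        double se = Math.sqrt(pooledVar) * Math.sqrt(1.0 / n1 + 1.0 / n2);
        return (mean1 - mean2 - mu0) / se;
    }

    public static void main(String[] args) {
        // Summary statistics of x = {1, 2, 3, 4, 5} and y = {2, 4, 6, 8}.
        int n1 = 5, n2 = 4;
        double mean1 = 3.0, mean2 = 5.0;
        double var1 = 2.5, var2 = 20.0 / 3.0;
        double s2 = pooledVariance(n1, var1, n2, var2);
        double t = tStatistic(mean1, mean2, 0.0, n1, n2, s2);
        System.out.println("pooled variance = " + s2 + ", t = " + t
                + ", df = " + (n1 + n2 - 2));
    }
}
```

Under the null hypothesis and equal variances, the statistic is referred to a t distribution with $n_1 + n_2 - 2$ degrees of freedom.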

If the population variances are not equal, the ordinary t statistic does not have a t distribution, and several approximate tests for the equality of means have been proposed. (See, for example, Anderson and Bancroft 1952, and Kendall and Stuart 1979.) One of the earliest tests devised for this situation is the Fisher-Behrens test, based on Fisher's concept of fiducial probability. When unequal variances are specified, the getTTest, getLowerCIDiff and getUpperCIDiff methods use Satterthwaite's procedure, as suggested by H.F. Smith and modified by F.E. Satterthwaite (Anderson and Bancroft 1952, p. 83). Call setUnequalVariances with the argument true to obtain results assuming unequal variances.

The test statistic is

$t' = \left( {\bar x_1 - \bar x_2 - \mu _0 } \right)/s_d$

where

$s_d = \sqrt {\left( {s_1^2 /n_1 } \right) + \left( {s_2^2 /n_2 } \right)}$

Under the null hypothesis of $\mu_1- \mu_2= \mu_0$ , this quantity has an approximate t distribution with degrees of freedom df, given by the following equation:

${\rm{df}} = \frac{{s_d^4 }}{{\frac{{\left( {s_1^2 /n_1 } \right)^2 }}{{n_1 - 1}} + \frac{{\left( {s_2^2 /n_2 } \right)^2 }}{{n_2 - 1}}}}$
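
The statistic $t'$ and the approximate degrees of freedom can be evaluated directly from the sample variances and sizes. The sketch below follows the three formulas above term by term. The class `SatterthwaiteSketch` and its methods are hypothetical names, and the code does not call the library.

```java
// Sketch only: Satterthwaite's approximation from summary statistics.
// SatterthwaiteSketch is a hypothetical helper, not part of JMSL.
final class SatterthwaiteSketch {
    /** s_d = sqrt(var1/n1 + var2/n2). */
    static double standardError(double var1, int n1, double var2, int n2) {
        return Math.sqrt(var1 / n1 + var2 / n2);
    }

    /** t' = (mean1 - mean2 - mu0) / s_d. */
    static double tPrime(double mean1, double mean2, double mu0, double sd) {
        return (mean1 - mean2 - mu0) / sd;
    }

    /** df = s_d^4 / ((var1/n1)^2/(n1 - 1) + (var2/n2)^2/(n2 - 1)). */
    static double degreesOfFreedom(double var1, int n1, double var2, int n2) {
        double a = var1 / n1;
        double b = var2 / n2;
        double sd2 = a + b; // s_d squared
        return (sd2 * sd2) / (a * a / (n1 - 1) + b * b / (n2 - 1));
    }

    public static void main(String[] args) {
        // Summary statistics of x = {1, 2, 3, 4, 5} and y = {2, 4, 6, 8}.
        int n1 = 5, n2 = 4;
        double var1 = 2.5, var2 = 20.0 / 3.0;
        double sd = standardError(var1, n1, var2, n2);
        System.out.println("t' = " + tPrime(3.0, 5.0, 0.0, sd)
                + ", df = " + degreesOfFreedom(var1, n1, var2, n2));
    }
}
```

The resulting df is generally not an integer, which is consistent with getTTestDF returning a double.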

Inferences about Variances

The F statistic for testing the equality of variances is given by $F = s_{\max }^2 /s_{\min }^2$ , where $s_{\max}^2$ is the larger of $s_1^2$ and $s_2^2$ and $s_{\min}^2$ is the smaller. If the variances are equal, this quantity has an F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.

Note: it is generally not recommended that the results of the F test be used to decide whether to use the regular t-test or the modified $t'$ on a single set of data. The modified $t'$ (Satterthwaite's procedure) is the more conservative approach to use if there is doubt about the equality of the variances.
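
A minimal sketch of the F statistic as defined above, which always places the larger sample variance in the numerator so that the ratio is at least 1. The class `VarianceRatioSketch` is a hypothetical name, and the code is independent of the library.

```java
// Sketch only: F = s_max^2 / s_min^2 for two sample variances.
// VarianceRatioSketch is a hypothetical helper, not part of JMSL.
final class VarianceRatioSketch {
    /** Ratio of the larger sample variance to the smaller one. */
    static double fStatistic(double var1, double var2) {
        return Math.max(var1, var2) / Math.min(var1, var2);
    }

    public static void main(String[] args) {
        // Sample variances of x = {1, 2, 3, 4, 5} and y = {2, 4, 6, 8}.
        System.out.println("F = " + fStatistic(2.5, 20.0 / 3.0));
    }
}
```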

See Also:

Example 1, Example 2, Serialized Form

Constructor Summary

Constructors
Constructor and Description

`NormTwoSample(double[] x, double[] y)` Constructor.

Method Summary

Modifier and Type	Method and Description
`void`	`downdateX(double[] x)` Removes the observations in `x` from the first sample.
`void`	`downdateY(double[] y)` Removes the observations in `y` from the second sample.
`double`	`getChiSquaredTest()` Returns the test statistic associated with the chi-squared test for the (assumed) common variance.
`int`	`getChiSquaredTestDF()` Returns the degrees of freedom associated with the chi-squared test for the common variance.
`double`	`getChiSquaredTestP()` Returns the probability of a larger value than the chi-squared statistic associated with the test for the common variance, assuming the null hypothesis is true (i.e., the p-value for the test).
`double`	`getDiffMean()` Returns the difference in sample means.
`double`	`getFTest()` Returns the F statistic value calculated in an F-test for equality of variances.
`int`	`getFTestDFdenominator()` Returns the denominator degrees of freedom of the F test for equality of variances.
`int`	`getFTestDFnumerator()` Returns the numerator degrees of freedom in the $F$ -test for equality of variances.
`double`	`getFTestP()` Returns the probability of a larger (in absolute value) F statistic value, assuming equal variances (i.e., the p-value for the test).
`double`	`getLowerCICommonVariance()` Returns the lower `confidenceVariance` $\times 100\%$ confidence limit for the common variance.
`double`	`getLowerCIDiff()` Returns the lower confidence limit for the difference, $\mu_x - \mu_y$ for equal or unequal variances depending on the value set by setUnequalVariances.
`double`	`getLowerCIRatioVariance()` Returns the approximate lower confidence limit in an interval estimate for the ratio of variances, $\sigma_1^2/\sigma_2^2$ .
`double`	`getMeanX()` Returns the mean of the first sample.
`double`	`getMeanY()` Returns the mean of the second sample.
`double`	`getPooledVariance()` Returns the pooled variance for the two samples.
`double`	`getStdDevX()` Returns the standard deviation of the first sample.
`double`	`getStdDevY()` Returns the standard deviation of the second sample.
`double`	`getTTest()` Returns the test statistic for Satterthwaite's approximation.
`double`	`getTTestDF()` Returns the degrees of freedom for Satterthwaite's approximation.
`double`	`getTTestP()` Returns the approximate probability of observing a larger value of the t-statistic given the null hypothesis is true (i.e., the p-value for the test).
`double`	`getUpperCICommonVariance()` Returns the upper `confidenceVariance` $\times 100\%$ confidence limit for the common variance.
`double`	`getUpperCIDiff()` Returns the upper confidence limit for the difference, $\mu_x - \mu_y$ for equal or unequal variances depending on the value set by `setUnequalVariances`.
`double`	`getUpperCIRatioVariance()` Returns the approximate upper confidence limit in a confidence interval for the ratio of variances, $\sigma_1^2/\sigma_2^2$ .
`void`	`setChiSquaredTestNull(double varianceHypothesisValue)` Sets the null hypothesis value for the chi-squared test.
`void`	`setConfidenceMean(double confidenceMean)` Sets the confidence level (a value between 0.0 and 1.0) for a two-sided confidence interval for the difference in means, $\mu_x - \mu_y$ .
`void`	`setConfidenceVariance(double confidenceVariance)` Sets the confidence level for a two-sided interval estimate for the common variance and for the ratios of variances.
`void`	`setTTestNull(double meanHypothesis)` Sets the null hypothesis value for the t-test for the difference in means.
`void`	`setUnequalVariances(boolean uneqVar)` Specifies whether to return statistics based on equal or unequal variances.
`void`	`update(double[] x, double[] y)` Concatenates the data in `x` and `y` with the samples provided in the constructor.
`void`	`updateX(double[] x)` Concatenates the data in `x` with the first sample.
`void`	`updateY(double[] y)` Concatenates the data in `y` with the second sample.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - NormTwoSample
```
public NormTwoSample(double[] x,
                     double[] y)
```
    Constructor.
    
    Parameters:
    
    x - a double array containing the first sample
    
    y - a double array containing the second sample
- Method Detail
  - getDiffMean
```
public double getDiffMean()
```
    Returns the difference in sample means.
    
    Returns:
    
    a double, the difference in sample means
  - getMeanX
```
public double getMeanX()
```
    Returns the mean of the first sample.
    
    Returns:
    
    a double, the mean of the first sample
  - getMeanY
```
public double getMeanY()
```
    Returns the mean of the second sample.
    
    Returns:
    
    a double, the mean of the second sample
  - setConfidenceMean
```
public void setConfidenceMean(double confidenceMean)
```
    Sets the confidence level for a two-sided confidence interval for the difference in means, $\mu_x - \mu_y$ . The argument confidenceMean must be between $0.0$ and $1.0$ ; common choices are $0.90$ , $0.95$ , or $0.99$ . Note: To use NormTwoSample.getUpperCIDiff() (NormTwoSample.getLowerCIDiff()) as a $C$ % upper (lower) one-sided confidence limit, set confidenceMean = $1 - 2(1 - C/100)$ . Default: confidenceMean = 0.95
    
    Parameters:
    
    confidenceMean - double, the desired confidence level of the mean
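
    The one-sided conversion can be checked with a line of arithmetic. The sketch below assumes that $C$ is given in percent and that confidenceMean is a fraction between 0.0 and 1.0. A two-sided interval at level $1 - 2(1 - C/100)$ leaves $1 - C/100$ in each tail, so each of its limits is a one-sided $C$ % limit. The class `OneSidedLevelSketch` is a hypothetical name.

```java
// Sketch only: two-sided level whose limits serve as one-sided C% limits.
// OneSidedLevelSketch is a hypothetical helper, not part of JMSL.
final class OneSidedLevelSketch {
    /** Converts a one-sided level in percent (above 50) to a two-sided fraction. */
    static double twoSidedLevel(double oneSidedPercent) {
        return 1.0 - 2.0 * (1.0 - oneSidedPercent / 100.0);
    }

    public static void main(String[] args) {
        // A 95% one-sided limit corresponds to a two-sided level of about 0.90.
        System.out.println(twoSidedLevel(95.0));
    }
}
```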
  - getUpperCIDiff
```
public double getUpperCIDiff()
```
    Returns the upper confidence limit for the difference, $\mu_x - \mu_y$ , for equal or unequal variances depending on the value set by setUnequalVariances.
    
    Returns:
    
    a double containing the upper confidence limit for the difference in means of the two populations.
  - getLowerCIDiff
```
public double getLowerCIDiff()
```
    Returns the lower confidence limit for the difference, $\mu_x - \mu_y$ , for equal or unequal variances depending on the value set by setUnequalVariances.
    
    Returns:
    
    a double containing the lower confidence limit for the difference in means of the two populations.
  - setUnequalVariances
```
public void setUnequalVariances(boolean uneqVar)
```
    Specifies whether to return statistics based on equal or unequal variances. The default is to return statistics for equal variances. If uneqVar is true, statistics for unequal variances are returned.
    
    Parameters:
    
    uneqVar - a boolean containing a true or false value. A value of true will cause results for unequal variances to be returned. A value of false will cause results for equal variances to be returned.
  - getTTestDF
```
public double getTTestDF()
```
    Returns the degrees of freedom for Satterthwaite's approximation. The value returned is based on the assumption of equal or unequal variances, depending on the value set by setUnequalVariances.
    
    Returns:
    
    a double containing the degrees of freedom for the t-test.
  - getTTest
```
public double getTTest()
```
    Returns the test statistic for Satterthwaite's approximation. The value returned is based on the assumption of equal or unequal variances, according to the value set by setUnequalVariances.
    
    Returns:
    
    a double containing the test statistic for the t-test.
  - getTTestP
```
public double getTTestP()
```
    Returns the approximate probability of observing a larger value of the t-statistic given the null hypothesis is true (i.e., the p-value for the test). The value returned is based on the assumption of equal or unequal variances, according to the value set by setUnequalVariances.
    
    Returns:
    
    a double, the p-value for the test
  - setTTestNull
```
public void setTTestNull(double meanHypothesis)
```
    Sets the null hypothesis value for the t-test for the difference in means. The default is meanHypothesis = 0.0.
    
    Parameters:
    
    meanHypothesis - double containing the hypothesis value.
  - getPooledVariance
```
public double getPooledVariance()
```
    Returns the pooled variance for the two samples.
    
    Returns:
    
    a double, the pooled variance for the two samples
  - setConfidenceVariance
```
public void setConfidenceVariance(double confidenceVariance)
```
    Sets the confidence level for a two-sided interval estimate for the common variance and for the ratio of variances. Under the assumption of equal variances, the pooled variance for the two samples is used to obtain a confidenceVariance $\times 100\%$ two-sided confidence interval for the common variance, with the lower limit returned by getLowerCICommonVariance and the upper limit returned by getUpperCICommonVariance. Without the assumption of equal variances (see setUnequalVariances), the ratio of the variances is of interest. A two-sided confidenceVariance $\times 100\%$ confidence interval for the ratio of the variances $\sigma_1^2/\sigma_2^2$ is given by getLowerCIRatioVariance and getUpperCIRatioVariance. The confidence intervals are symmetric in probability. The argument confidenceVariance must be between 0.0 and 1.0 and is often 0.90, 0.95, or 0.99. The default is 0.95.
    
    Parameters:
    
    confidenceVariance - a double containing the confidence level of the variance
  - getLowerCICommonVariance
```
public double getLowerCICommonVariance()
```
    Returns the lower confidenceVariance $\times 100\%$ confidence limit for the common variance.
    
    Returns:
    
    a double, the lower confidence limit for the common variance
  - getUpperCICommonVariance
```
public double getUpperCICommonVariance()
```
    Returns the upper confidenceVariance $\times 100\%$ confidence limit for the common variance.
    
    Returns:
    
    a double, the upper confidence limit for the common variance
  - getChiSquaredTestDF
```
public int getChiSquaredTestDF()
```
    Returns the degrees of freedom associated with the chi-squared test for the common variance. The chi-squared test is a test of the hypothesis $\omega^2 = \omega_0^2$ where $\omega_0^2$ is the null hypothesis value as described in setChiSquaredTestNull.
    
    Returns:
    
    an int, the degrees of freedom for the chi-squared test
  - getChiSquaredTest
```
public double getChiSquaredTest()
```
    Returns the test statistic associated with the chi-squared test for the (assumed) common variance. The chi-squared test is a test of the hypothesis $\omega^2 = \omega_0^2$ where $\omega^2$ is the assumed common variance between the two populations and $\omega_0^2$ is the null hypothesis value as described in setChiSquaredTestNull.
    
    Returns:
    
    a double, the test statistic for the chi-squared test
  - getChiSquaredTestP
```
public double getChiSquaredTestP()
```
    Returns the probability of a larger value than the chi-squared statistic associated with the test for the common variance, assuming the null hypothesis is true (i.e., the p-value for the test).
    
    Returns:
    
    a double, the p-value for the chi-squared test
  - setChiSquaredTestNull
```
public void setChiSquaredTestNull(double varianceHypothesisValue)
```
    Sets the null hypothesis value for the chi-squared test. The default is 1.0.
    
    Parameters:
    
    varianceHypothesisValue - a double, the null hypothesis value for the chi-squared test
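
    The page does not state the formula for the chi-squared statistic. The sketch below assumes the textbook form for a common variance: the pooled variance multiplied by its degrees of freedom, $n_1 + n_2 - 2$ , and divided by the null hypothesis value. The library may compute it differently. The class `CommonVarianceChiSquaredSketch` is a hypothetical name.

```java
// Sketch only: textbook chi-squared statistic for a hypothesized common variance.
// ASSUMPTION: statistic = (n1 + n2 - 2) * pooledVariance / nullVariance.
// CommonVarianceChiSquaredSketch is a hypothetical helper, not part of JMSL.
final class CommonVarianceChiSquaredSketch {
    /** Degrees of freedom of the pooled variance. */
    static int degreesOfFreedom(int n1, int n2) {
        return n1 + n2 - 2;
    }

    /** Chi-squared statistic under the assumed textbook formula. */
    static double statistic(int n1, int n2, double pooledVar, double nullVariance) {
        return degreesOfFreedom(n1, n2) * pooledVar / nullVariance;
    }

    public static void main(String[] args) {
        // Pooled variance 30/7 from samples of sizes 5 and 4; null value 1.0.
        System.out.println("chi-squared = " + statistic(5, 4, 30.0 / 7.0, 1.0)
                + ", df = " + degreesOfFreedom(5, 4));
    }
}
```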
  - getStdDevX
```
public double getStdDevX()
```
    Returns the standard deviation of the first sample.
    
    Returns:
    
    a double, the standard deviation of the first sample
  - getStdDevY
```
public double getStdDevY()
```
    Returns the standard deviation of the second sample.
    
    Returns:
    
    a double, the standard deviation of the second sample
  - getLowerCIRatioVariance
```
public double getLowerCIRatioVariance()
```
    Returns the approximate lower confidence limit in an interval estimate for the ratio of variances, $\sigma_1^2/\sigma_2^2$ .
    
    Returns:
    
    a double, the lower limit
  - getUpperCIRatioVariance
```
public double getUpperCIRatioVariance()
```
    Returns the approximate upper confidence limit in a confidence interval for the ratio of variances, $\sigma_1^2/\sigma_2^2$ .
    
    Returns:
    
    a double, the upper limit
  - getFTestDFnumerator
```
public int getFTestDFnumerator()
```
    Returns the numerator degrees of freedom in the $F$ -test for equality of variances.
    
    Returns:
    
    an int, the numerator degrees of freedom
  - getFTestDFdenominator
```
public int getFTestDFdenominator()
```
    Returns the denominator degrees of freedom of the F test for equality of variances.
    
    Returns:
    
    an int, the denominator degrees of freedom
  - getFTest
```
public double getFTest()
```
    Returns the F statistic value calculated in an F-test for equality of variances.
    
    Returns:
    
    a double, the value of the test statistic
  - getFTestP
```
public double getFTestP()
```
    Returns the probability of a larger (in absolute value) F statistic value, assuming equal variances (i.e., the p-value for the test).
    
    Returns:
    
    a double, the probability of a larger F statistic
  - updateX
```
public void updateX(double[] x)
```
    Concatenates the data in x with the first sample.
    This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
    
    Parameters:
    
    x - a double array containing new data for the first sample
  - downdateX
```
public void downdateX(double[] x)
```
    Removes the observations in x from the first sample.
    
    Parameters:
    
    x - a double array containing the values to remove from the first sample
  - updateY
```
public void updateY(double[] y)
```
    Concatenates the data in y with the second sample.
    This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
    
    Parameters:
    
    y - a double array containing new data for the second sample
  - downdateY
```
public void downdateY(double[] y)
```
    Removes the observations in y from the second sample.
    
    Parameters:
    
    y - a double array containing the values to remove from the second sample
  - update
```
public void update(double[] x,
                   double[] y)
```
    Concatenates the data in x and y with the samples provided in the constructor.
    This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
    
    Parameters:
    
    x - a double array containing updates to the first sample
    
    y - a double array containing updates to the second sample
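
    The update and downdate methods imply that the summary statistics can be maintained without keeping all of the data in memory. One way to do this for a single sample is Welford's running mean and sum of squared deviations, together with its algebraic reverse for removal. The sketch below shows that technique only. It is not the library's implementation, and the class `RunningSampleSketch` and its methods are hypothetical.

```java
// Sketch only: streaming mean/variance with add (update) and remove (downdate).
// Uses Welford's recurrence; RunningSampleSketch is a hypothetical helper, not JMSL code.
final class RunningSampleSketch {
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // sum of squared deviations from the current mean

    /** Folds new observations into the running summary, skipping NaN. */
    void update(double[] data) {
        for (double v : data) {
            if (Double.isNaN(v)) continue; // missing value
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
        }
    }

    /** Removes observations that were previously added, skipping NaN. */
    void downdate(double[] data) {
        for (double v : data) {
            if (Double.isNaN(v)) continue;
            if (n <= 1) { // removing the last observation empties the sample
                n = 0;
                mean = 0.0;
                m2 = 0.0;
                continue;
            }
            double oldMean = (n * mean - v) / (n - 1);
            m2 -= (v - oldMean) * (v - mean);
            mean = oldMean;
            n--;
        }
    }

    long count() { return n; }

    double mean() { return mean; }

    /** Unbiased variance (divisor n - 1). */
    double variance() { return m2 / (n - 1); }

    public static void main(String[] args) {
        RunningSampleSketch s = new RunningSampleSketch();
        s.update(new double[] {1.0, 2.0, 3.0});
        s.update(new double[] {4.0, 5.0});
        System.out.println("n = " + s.count() + ", mean = " + s.mean()
                + ", variance = " + s.variance());
    }
}
```

    Removing values that were never added leaves the summary meaningless, so a caller of such a scheme would need to track which observations are in the sample.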
