Class NormTwoSample
- All Implemented Interfaces:
Serializable, Cloneable
Class NormTwoSample computes statistics for making
inferences about the means and variances of two normal populations, using
independent samples in x and y. Missing values,
that is, values equal to NaN (not a number), are excluded from the
computations. For inferences concerning parameters of a single normal
population, see class NormOneSample.
Let \(\mu_1\) and \(\sigma _1^2\) be the mean and variance of the first population, and let \(\mu_2\) and \(\sigma _2^2\) be the corresponding quantities of the second population. The methods in this class support tests for the difference in means \(\mu_1-\mu_2\), for equality of variances, and for the common variance (assuming the variances are equal).
The sample means and variances are as follows:
$$\bar x_1 = \frac{1}{n_1}\sum_{i} x_{1i}, \qquad \bar x_2 = \frac{1}{n_2}\sum_{i} x_{2i}$$
and
$$s_1^2 = \sum {\left( {x_{1i} - \bar x_1 } \right)}^2 /\left( {n_1 - 1} \right), \,\,\,\,\,\,\,\,\,\,\,s_2^2 = \sum {\left( {x_{2i} - {\bar x}_2} \right)}^2 /\left( {n_2 - 1} \right)$$
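The sample mean and variance defined above, with NaN entries skipped as missing values, can be sketched in plain Java as follows. This is only an illustration of the formulas; the class name `SampleMoments` and its methods are hypothetical and are not part of NormTwoSample.

```java
// Illustrative sketch only: SampleMoments is a hypothetical helper, not library code.
final class SampleMoments {

    /** Mean of the non-NaN entries; returns NaN when there are none. */
    static double mean(double[] data) {
        double sum = 0.0;
        int n = 0;
        for (double v : data) {
            if (!Double.isNaN(v)) {   // NaN is treated as a missing value
                sum += v;
                n++;
            }
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    /** Sample variance with divisor n - 1, computed over the non-NaN entries. */
    static double variance(double[] data) {
        double m = mean(data);
        double sumSq = 0.0;
        int n = 0;
        for (double v : data) {
            if (!Double.isNaN(v)) {
                double d = v - m;
                sumSq += d * d;
                n++;
            }
        }
        return n < 2 ? Double.NaN : sumSq / (n - 1);
    }

    public static void main(String[] args) {
        double[] x = {1.0, Double.NaN, 2.0, 3.0, 4.0, 5.0};
        System.out.println("mean     = " + mean(x));
        System.out.println("variance = " + variance(x));
    }
}
```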
Inferences about the Means
The test that the difference in means equals a certain value, for example, \(\mu_0\), depends on whether or not the variances of the two populations can be considered equal. If the variances are equal and \(\mu_0=0\), the test is the two-sample t-test, which is equivalent to an analysis-of-variance test. The pooled variance for the difference-in-means test is as follows:
$$s^2 = \frac{{\left( {n_1 - 1} \right)s_1^2 + \left( {n_2 - 1} \right)s_2^2 }} {{n_1 + n_2 - 2}}$$
The t statistic is as follows:
$$t = \frac{{\bar x_1 - \bar x_2 - \mu _0}} {s\sqrt {{\left( {1/n_1 } \right)} + \left( {1/n_2 } \right)}}$$
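As a worked illustration of the pooled variance and the equal-variance t statistic, the following sketch evaluates both from summary statistics. The class `PooledTSketch` and its parameter names are assumptions made for this example; the sketch does not claim to reproduce how the library computes its results.

```java
// Illustrative sketch only: PooledTSketch is hypothetical, not part of NormTwoSample.
final class PooledTSketch {

    /** Pooled variance s^2 = ((n1-1) s1^2 + (n2-1) s2^2) / (n1 + n2 - 2). */
    static double pooledVariance(int n1, double var1, int n2, double var2) {
        return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2);
    }

    /** t = (mean1 - mean2 - mu0) / (s * sqrt(1/n1 + 1/n2)), with s the pooled standard deviation. */
    static double tStatistic(int n1, double mean1, double var1,
                             int n2, double mean2, double var2, double mu0) {
        double s = Math.sqrt(pooledVariance(n1, var1, n2, var2));
        return (mean1 - mean2 - mu0) / (s * Math.sqrt(1.0 / n1 + 1.0 / n2));
    }

    public static void main(String[] args) {
        // Summary statistics for x = {1,2,3,4,5} and y = {2,4,6,8}.
        int n1 = 5, n2 = 4;
        double mean1 = 3.0, var1 = 2.5, mean2 = 5.0, var2 = 20.0 / 3.0;
        System.out.println("pooled variance = " + pooledVariance(n1, var1, n2, var2));
        System.out.println("t statistic     = " + tStatistic(n1, mean1, var1, n2, mean2, var2, 0.0));
    }
}
```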
To obtain the confidence interval for the difference in means under the
equal-variance assumption, set the unequal variances flag to false by calling
setUnequalVariances(false) (equal variances is the default). The confidence
limits are then returned by the getLowerCIDiff and
getUpperCIDiff methods.
If the population variances are not equal, the ordinary t
statistic does not have a t distribution and several approximate
tests for the equality of means have been proposed. (See, for example,
Anderson and Bancroft 1952, and Kendall and Stuart 1979.) One of the
earliest tests devised for this situation is the Fisher-Behrens test, based
on Fisher's concept of fiducial probability.
When unequal variances are specified, the
getTTest, getLowerCIDiff and getUpperCIDiff
methods use Satterthwaite's procedure, as suggested by H.F. Smith and
modified by F.E. Satterthwaite (Anderson and Bancroft 1952, p. 83).
Call setUnequalVariances(true) to obtain results that assume
unequal variances.
The test statistic is
$$t' = \left( {\bar x_1 - \bar x_2 - \mu _0 } \right)/s_d$$
where
$$s_d = \sqrt {\left( {s_1^2 /n_1 } \right) + \left( {s_2^2 /n_2 } \right)}$$
Under the null hypothesis \(\mu_1 - \mu_2 = \mu_0\), this
quantity has an approximate t distribution with degrees of freedom
df, given by the following equation:
$${\rm{df}} = \frac{{s_d^4 }}{{\frac{{\left( {s_1^2 /n_1 } \right)^2 }}{{n_1 - 1}} + \frac{{\left( {s_2^2 /n_2 } \right)^2 }}{{n_2 - 1}}}}$$
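The unequal-variance statistic \(t'\) and the approximate degrees of freedom can be evaluated directly from the two formulas above. The sketch below is a minimal illustration under that assumption; `SatterthwaiteSketch` and its method names are hypothetical and are not the library's API.

```java
// Illustrative sketch only: SatterthwaiteSketch is hypothetical, not part of NormTwoSample.
final class SatterthwaiteSketch {

    /** s_d = sqrt(s1^2/n1 + s2^2/n2). */
    static double standardError(int n1, double var1, int n2, double var2) {
        return Math.sqrt(var1 / n1 + var2 / n2);
    }

    /** t' = (mean1 - mean2 - mu0) / s_d. */
    static double tPrime(int n1, double mean1, double var1,
                         int n2, double mean2, double var2, double mu0) {
        return (mean1 - mean2 - mu0) / standardError(n1, var1, n2, var2);
    }

    /** df = s_d^4 / ( (s1^2/n1)^2/(n1-1) + (s2^2/n2)^2/(n2-1) ). */
    static double degreesOfFreedom(int n1, double var1, int n2, double var2) {
        double a = var1 / n1;
        double b = var2 / n2;
        double sd2 = a + b;                       // s_d squared
        return (sd2 * sd2) / (a * a / (n1 - 1) + b * b / (n2 - 1));
    }

    public static void main(String[] args) {
        // Summary statistics for x = {1,2,3,4,5} and y = {2,4,6,8}.
        int n1 = 5, n2 = 4;
        double mean1 = 3.0, var1 = 2.5, mean2 = 5.0, var2 = 20.0 / 3.0;
        System.out.println("t' = " + tPrime(n1, mean1, var1, n2, mean2, var2, 0.0));
        System.out.println("df = " + degreesOfFreedom(n1, var1, n2, var2));
    }
}
```

Note that df is generally not an integer, which is consistent with getTTestDF returning a double.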
Inferences about Variances
The F statistic for testing the equality of variances is given by \(F = s_{\max }^2 /s_{\min }^2\), where \(s_{\max}^2\) is the larger of \(s_1^2\) and \(s_2^2\) and \(s_{\min}^2\) is the smaller. If the variances are equal, this quantity has an F distribution with \(n_1 - 1\) and \(n_2 - 1\) degrees of freedom.
Note: it is generally not recommended that the results of the F test be used to decide whether to use the regular t-test or the modified \(t'\) on a single set of data. The modified \(t'\) (Satterthwaite's procedure) is the more conservative approach to use if there is doubt about the equality of the variances.
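The variance-ratio statistic described in this section is simple enough to state in a few lines. The following sketch only forms the ratio of the larger to the smaller sample variance; `VarianceRatioSketch` is a hypothetical name, and the sketch makes no claim about how the library assigns numerator and denominator degrees of freedom.

```java
// Illustrative sketch only: VarianceRatioSketch is hypothetical, not part of NormTwoSample.
final class VarianceRatioSketch {

    /** F = s_max^2 / s_min^2, so the result is always at least 1. */
    static double fStatistic(double var1, double var2) {
        if (!(var1 > 0.0) || !(var2 > 0.0)) {
            throw new IllegalArgumentException("sample variances must be positive");
        }
        return Math.max(var1, var2) / Math.min(var1, var2);
    }

    public static void main(String[] args) {
        // Sample variances of x = {1,2,3,4,5} and y = {2,4,6,8}.
        System.out.println("F = " + fStatistic(2.5, 20.0 / 3.0));
    }
}
```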
Constructor Summary
Constructor | Description
NormTwoSample(double[] x, double[] y) | Constructor.
Method Summary
Modifier and Type | Method | Description
void | downdateX(double[] x) | Removes the observations in x from the first sample.
void | downdateY(double[] y) | Removes the observations in y from the second sample.
double | getChiSquaredTest() | Returns the test statistic associated with the chi-squared test for the (assumed) common variance.
int | getChiSquaredTestDF() | Returns the degrees of freedom associated with the chi-squared test for the common variance.
double | getChiSquaredTestP() | Returns the probability of a larger value than the chi-squared statistic associated with the test for the common variance, assuming the null hypothesis is true (i.e., the p-value for the test).
double | getDiffMean() | Returns the difference in sample means.
double | getFTest() | Returns the F statistic value calculated in an F-test for equality of variances.
int | getFTestDFdenominator() | Returns the denominator degrees of freedom of the F-test for equality of variances.
int | getFTestDFnumerator() | Returns the numerator degrees of freedom in the F-test for equality of variances.
double | getFTestP() | Returns the probability of a larger (in absolute value) F statistic value, assuming equal variances (i.e., the p-value for the test).
double | getLowerCICommonVariance() | Returns the lower 100*confidenceVariance percent confidence limit for the common variance.
double | getLowerCIDiff() | Returns the lower confidence limit for the difference, \(\mu_x - \mu_y\), for equal or unequal variances depending on the value set by setUnequalVariances.
double | getLowerCIRatioVariance() | Returns the approximate lower confidence limit in an interval estimate for the ratio of variances, \(\sigma_1^2/\sigma_2^2\).
double | getMeanX() | Returns the mean of the first sample.
double | getMeanY() | Returns the mean of the second sample.
double | getPooledVariance() | Returns the pooled variance for the two samples.
double | getStdDevX() | Returns the standard deviation of the first sample.
double | getStdDevY() | Returns the standard deviation of the second sample.
double | getTTest() | Returns the test statistic for Satterthwaite's approximation.
double | getTTestDF() | Returns the degrees of freedom for Satterthwaite's approximation.
double | getTTestP() | Returns the approximate probability of observing a larger value of the t-statistic given the null hypothesis is true (i.e., the p-value for the test).
double | getUpperCICommonVariance() | Returns the upper 100*confidenceVariance percent confidence limit for the common variance.
double | getUpperCIDiff() | Returns the upper confidence limit for the difference, \(\mu_x - \mu_y\), for equal or unequal variances depending on the value set by setUnequalVariances.
double | getUpperCIRatioVariance() | Returns the approximate upper confidence limit in a confidence interval for the ratio of variances, \(\sigma_1^2/\sigma_2^2\).
void | setChiSquaredTestNull(double varianceHypothesisValue) | Sets the null hypothesis value for the chi-squared test.
void | setConfidenceMean(double confidenceMean) | Sets the confidence level for a two-sided confidence interval for the difference in means, \(\mu_x - \mu_y\).
void | setConfidenceVariance(double confidenceVariance) | Sets the confidence level for a two-sided interval estimate for the common variance and for the ratio of variances.
void | setTTestNull(double meanHypothesis) | Sets the null hypothesis value for the t-test for the mean.
void | setUnequalVariances(boolean uneqVar) | Specifies whether to return statistics based on equal or unequal variances.
void | update(double[] x, double[] y) | Concatenates the data in x and y with the samples provided in the constructor.
void | updateX(double[] x) | Concatenates the data in x with the first sample.
void | updateY(double[] y) | Concatenates the data in y with the second sample.
-
Constructor Details
-
NormTwoSample
public NormTwoSample(double[] x, double[] y)
Constructor.
- Parameters:
x - a double array containing the first sample
y - a double array containing the second sample
-
-
Method Details
-
getDiffMean
public double getDiffMean()
Returns the difference in sample means.
- Returns:
a double, the difference in sample means
-
getMeanX
public double getMeanX()
Returns the mean of the first sample.
- Returns:
a double, the mean of the first sample
-
getMeanY
public double getMeanY()
Returns the mean of the second sample.
- Returns:
a double, the mean of the second sample
-
setConfidenceMean
public void setConfidenceMean(double confidenceMean)
Sets the confidence level for a two-sided confidence interval for the difference in means, \(\mu_x - \mu_y\). The argument confidenceMean is a fraction and must be between \(0.0\) and \(1.0\); common choices are \(0.90\), \(0.95\) or \(0.99\).
Note: To use getUpperCIDiff() (getLowerCIDiff()) as a \(C\)% upper (lower) one-sided confidence limit, set confidenceMean = \(1 - 2(1 - C/100)\).
Default: confidenceMean = 0.95
- Parameters:
confidenceMean - a double, the desired confidence level for the difference in means
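The one-sided note for setConfidenceMean rests on the fact that a two-sided interval at level \(1-\alpha\) leaves \(\alpha/2\) in each tail, so a one-sided \(C\)% limit corresponds to a two-sided level of \(1 - 2(1 - C/100)\). The sketch below computes that value; `OneSidedLevelSketch` is a hypothetical helper, and this reading of the note is an assumption rather than a statement of the library's internals.

```java
// Illustrative sketch only: OneSidedLevelSketch is hypothetical, not part of NormTwoSample.
final class OneSidedLevelSketch {

    /**
     * Converts a one-sided confidence level given in percent (for example 95)
     * into the two-sided level, as a fraction, whose upper or lower limit
     * coincides with that one-sided limit.
     */
    static double twoSidedLevel(double oneSidedPercent) {
        if (!(oneSidedPercent > 50.0 && oneSidedPercent < 100.0)) {
            throw new IllegalArgumentException("one-sided level must be in (50, 100) percent");
        }
        return 1.0 - 2.0 * (1.0 - oneSidedPercent / 100.0);
    }

    public static void main(String[] args) {
        // A 95% one-sided limit corresponds to a two-sided level of about 0.90.
        System.out.println("confidenceMean = " + twoSidedLevel(95.0));
    }
}
```

The resulting value would then be passed to setConfidenceMean before reading the limit of interest.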
-
getUpperCIDiff
public double getUpperCIDiff()
Returns the upper confidence limit for the difference, \(\mu_x - \mu_y\), for equal or unequal variances depending on the value set by setUnequalVariances.
- Returns:
a double containing the upper confidence limit for the difference in means of the two populations
-
getLowerCIDiff
public double getLowerCIDiff()
Returns the lower confidence limit for the difference, \(\mu_x - \mu_y\), for equal or unequal variances depending on the value set by setUnequalVariances.
- Returns:
a double containing the lower confidence limit for the difference in means of the two populations
-
setUnequalVariances
public void setUnequalVariances(boolean uneqVar)
Specifies whether to return statistics based on equal or unequal variances. The default is to return statistics for equal variances. If uneqVar is true, statistics for unequal variances are returned.
- Parameters:
uneqVar - a boolean. A value of true causes results for unequal variances to be returned; a value of false causes results for equal variances to be returned.
-
getTTestDF
public double getTTestDF()
Returns the degrees of freedom for Satterthwaite's approximation. The value returned is based on the assumption of equal or unequal variances, depending on the value set by setUnequalVariances.
- Returns:
a double containing the degrees of freedom for the t-test
-
getTTest
public double getTTest()
Returns the test statistic for Satterthwaite's approximation. The value returned is based on the assumption of equal or unequal variances, according to the value set by setUnequalVariances.
- Returns:
a double containing the test statistic for the t-test
-
getTTestP
public double getTTestP()
Returns the approximate probability of observing a larger value of the t-statistic given the null hypothesis is true (i.e., the p-value for the test). The value returned is based on the assumption of equal or unequal variances, according to the value set by setUnequalVariances.
- Returns:
a double, the p-value for the test
-
setTTestNull
public void setTTestNull(double meanHypothesis)
Sets the null hypothesis value for the t-test for the mean. The default is meanHypothesis = 0.0.
- Parameters:
meanHypothesis - a double containing the hypothesis value
-
getPooledVariance
public double getPooledVariance()
Returns the pooled variance for the two samples.
- Returns:
a double, the pooled variance for the two samples
-
setConfidenceVariance
public void setConfidenceVariance(double confidenceVariance)
Sets the confidence level for a two-sided interval estimate for the common variance and for the ratio of variances. Under the assumption of equal variances, the pooled variance for the two samples is used to obtain a 100*confidenceVariance percent two-sided confidence interval for the common variance, with lower limit returned by getLowerCICommonVariance and upper limit returned by getUpperCICommonVariance. Without the assumption of equal variances (see setUnequalVariances), the ratio of the variances is of interest. A two-sided 100*confidenceVariance percent confidence interval for the ratio of variances \(\sigma_1^2/\sigma_2^2\) is given by getLowerCIRatioVariance and getUpperCIRatioVariance. The confidence intervals are symmetric in probability. The argument confidenceVariance must be between 0.0 and 1.0 and is often 0.90, 0.95 or 0.99. The default is 0.95.
- Parameters:
confidenceVariance - a double containing the confidence level for the variance
-
getLowerCICommonVariance
public double getLowerCICommonVariance()
Returns the lower 100*confidenceVariance percent confidence limit for the common variance.
- Returns:
a double, the lower confidence limit for the common variance
-
getUpperCICommonVariance
public double getUpperCICommonVariance()
Returns the upper 100*confidenceVariance percent confidence limit for the common variance.
- Returns:
a double, the upper confidence limit for the common variance
-
getChiSquaredTestDF
public int getChiSquaredTestDF()
Returns the degrees of freedom associated with the chi-squared test for the common variance. The chi-squared test is a test of the hypothesis \(\omega^2 = \omega_0^2\), where \(\omega^2\) is the assumed common variance of the two populations and \(\omega_0^2\) is the null hypothesis value as described in setChiSquaredTestNull.
- Returns:
an int, the degrees of freedom for the chi-squared test
-
getChiSquaredTest
public double getChiSquaredTest()
Returns the test statistic associated with the chi-squared test for the (assumed) common variance. The chi-squared test is a test of the hypothesis \(\omega^2 = \omega_0^2\), where \(\omega^2\) is the assumed common variance of the two populations and \(\omega_0^2\) is the null hypothesis value as described in setChiSquaredTestNull.
- Returns:
a double, the test statistic for the chi-squared test
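The formula for the chi-squared statistic is not given on this page. A common textbook form for testing a common variance against \(\omega_0^2\) is \((n_1 + n_2 - 2)s^2/\omega_0^2\) with \(n_1 + n_2 - 2\) degrees of freedom, where \(s^2\) is the pooled variance. The sketch below evaluates that form; it is an assumption that the class uses the same statistic, and `CommonVarianceChiSquaredSketch` is a hypothetical name.

```java
// Illustrative sketch only: assumes the textbook statistic (n1 + n2 - 2) * s^2 / omega0^2.
// CommonVarianceChiSquaredSketch is hypothetical, not part of NormTwoSample.
final class CommonVarianceChiSquaredSketch {

    /** Degrees of freedom of the pooled variance estimate. */
    static int degreesOfFreedom(int n1, int n2) {
        return n1 + n2 - 2;
    }

    /** Chi-squared statistic for H0: common variance equals nullVariance. */
    static double statistic(int n1, double var1, int n2, double var2, double nullVariance) {
        if (!(nullVariance > 0.0)) {
            throw new IllegalArgumentException("null hypothesis variance must be positive");
        }
        int df = degreesOfFreedom(n1, n2);
        double pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / df;
        return df * pooled / nullVariance;
    }

    public static void main(String[] args) {
        // Summary statistics for x = {1,2,3,4,5} and y = {2,4,6,8}, null value 1.0.
        System.out.println("df         = " + degreesOfFreedom(5, 4));
        System.out.println("chi-square = " + statistic(5, 2.5, 4, 20.0 / 3.0, 1.0));
    }
}
```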
-
getChiSquaredTestP
public double getChiSquaredTestP()
Returns the probability of a larger value than the chi-squared statistic associated with the test for the common variance, assuming the null hypothesis is true (i.e., the p-value for the test).
- Returns:
a double, the p-value for the chi-squared test
-
setChiSquaredTestNull
public void setChiSquaredTestNull(double varianceHypothesisValue)
Sets the null hypothesis value for the chi-squared test. The default is 1.0.
- Parameters:
varianceHypothesisValue - a double, the null hypothesis value for the chi-squared test
-
getStdDevX
public double getStdDevX()
Returns the standard deviation of the first sample.
- Returns:
a double, the standard deviation of the first sample
-
getStdDevY
public double getStdDevY()
Returns the standard deviation of the second sample.
- Returns:
a double, the standard deviation of the second sample
-
getLowerCIRatioVariance
public double getLowerCIRatioVariance()
Returns the approximate lower confidence limit in an interval estimate for the ratio of variances, \(\sigma_1^2/\sigma_2^2\).
- Returns:
a double, the lower limit
-
getUpperCIRatioVariance
public double getUpperCIRatioVariance()
Returns the approximate upper confidence limit in a confidence interval for the ratio of variances, \(\sigma_1^2/\sigma_2^2\).
- Returns:
a double, the upper limit
-
getFTestDFnumerator
public int getFTestDFnumerator()
Returns the numerator degrees of freedom in the F-test for equality of variances.
- Returns:
an int, the numerator degrees of freedom
-
getFTestDFdenominator
public int getFTestDFdenominator()
Returns the denominator degrees of freedom of the F-test for equality of variances.
- Returns:
an int, the denominator degrees of freedom
-
getFTest
public double getFTest()
Returns the F statistic value calculated in an F-test for equality of variances.
- Returns:
a double, the value of the test statistic
-
getFTestP
public double getFTestP()
Returns the probability of a larger (in absolute value) F statistic value, assuming equal variances (i.e., the p-value for the test).
- Returns:
a double, the probability of a larger F statistic
-
updateX
public void updateX(double[] x)
Concatenates the data in x with the first sample. This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
- Parameters:
x - a double array containing new data for the first sample
-
downdateX
public void downdateX(double[] x)
Removes the observations in x from the first sample.
- Parameters:
x - a double array containing the values to remove from the first sample
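Updating and downdating a sample in batches does not require keeping the raw data in memory; a running count, mean, and sum of squared deviations are enough to recover the sample mean and variance. The sketch below illustrates one way to do this with Welford-style recurrences. It is an assumption-laden illustration, not a description of how updateX and downdateX are implemented, and `RunningSample` is a hypothetical class.

```java
// Illustrative sketch only: RunningSample is hypothetical, not part of NormTwoSample.
final class RunningSample {
    private long n;        // number of non-missing observations
    private double mean;   // current sample mean
    private double m2;     // sum of squared deviations from the current mean

    /** Adds observations, skipping NaN as missing values. */
    void update(double[] values) {
        for (double v : values) {
            if (Double.isNaN(v)) continue;
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
        }
    }

    /** Removes previously added observations by reversing the update recurrence. */
    void downdate(double[] values) {
        for (double v : values) {
            if (Double.isNaN(v)) continue;
            if (n == 0) throw new IllegalStateException("no observations left to remove");
            if (n == 1) { n = 0; mean = 0.0; m2 = 0.0; continue; }
            double meanWithout = (n * mean - v) / (n - 1);
            m2 -= (v - mean) * (v - meanWithout);
            mean = meanWithout;
            n--;
        }
    }

    long count() { return n; }
    double mean() { return n == 0 ? Double.NaN : mean; }
    double variance() { return n < 2 ? Double.NaN : m2 / (n - 1); }

    public static void main(String[] args) {
        RunningSample s = new RunningSample();
        s.update(new double[] {1.0, 2.0, 3.0});
        s.update(new double[] {4.0, 5.0});      // second batch arrives later
        System.out.println("after update:   mean = " + s.mean() + ", variance = " + s.variance());
        s.downdate(new double[] {4.0, 5.0});    // remove the second batch again
        System.out.println("after downdate: mean = " + s.mean() + ", variance = " + s.variance());
    }
}
```

Removing values that were never added gives meaningless results in this sketch, so a caller would need to track which observations are in the sample.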
-
updateY
public void updateY(double[] y)
Concatenates the data in y with the second sample. This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
- Parameters:
y - a double array containing new data for the second sample
-
downdateY
public void downdateY(double[] y)
Removes the observations in y from the second sample.
- Parameters:
y - a double array containing the values to remove from the second sample
-
update
public void update(double[] x, double[] y)
Concatenates the data in x and y with the samples provided in the constructor. This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when all of the data is not available at one time or location.
- Parameters:
x - a double array containing updates to the first sample
y - a double array containing updates to the second sample
-