public class NormTwoSample extends Object implements Serializable, Cloneable
Class NormTwoSample computes statistics for making inferences about the means and variances of two normal populations, using independent samples in x and y. Missing values, that is, values equal to NaN (not a number), are excluded from the computations. For inferences concerning parameters of a single normal population, see class NormOneSample.
Let \(\mu_1\) and \(\sigma _1^2\) be the mean and variance of the first population, and let \(\mu_2\) and \(\sigma _2^2\) be the corresponding quantities of the second population. The methods in this class support tests for the difference in means \(\mu_1-\mu_2\), for equality of variances, and for the common variance (assuming the variances are equal).
The sample means and variances are as follows:
$$\bar x_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} x_{1i}, \qquad \bar x_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} x_{2i}$$
and
$$s_1^2 = \frac{\sum_{i=1}^{n_1} \left( x_{1i} - \bar x_1 \right)^2}{n_1 - 1}, \qquad s_2^2 = \frac{\sum_{i=1}^{n_2} \left( x_{2i} - \bar x_2 \right)^2}{n_2 - 1}$$
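As a rough illustration of the sample mean and variance formulas, together with the exclusion of NaN values described earlier, the following self-contained sketch computes both quantities for one sample. The class and method names (SampleMoments, mean, variance) are hypothetical and are not part of NormTwoSample. Returning NaN for samples that are too small is an assumption of this sketch, not documented behavior of the class.

```java
final class SampleMoments {
    /** Mean of the non-NaN entries; NaN if there are none (sketch assumption). */
    static double mean(double[] x) {
        double sum = 0.0;
        int n = 0;
        for (double v : x) {
            if (!Double.isNaN(v)) {   // missing values are skipped
                sum += v;
                n++;
            }
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    /** Sample variance with divisor n - 1, computed over the non-NaN entries. */
    static double variance(double[] x) {
        double m = mean(x);
        double sumSq = 0.0;
        int n = 0;
        for (double v : x) {
            if (!Double.isNaN(v)) {
                double d = v - m;
                sumSq += d * d;
                n++;
            }
        }
        return n < 2 ? Double.NaN : sumSq / (n - 1);
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, Double.NaN, 3.0, 4.0, 5.0};
        System.out.println(mean(x));     // prints 3.0
        System.out.println(variance(x)); // prints 2.5
    }
}
```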
Inferences about the Means
The test that the difference in means equals a certain value, for example, \(\mu_0\), depends on whether or not the variances of the two populations can be considered equal. If the variances are equal and \(\mu_0=0\), the test is the two-sample t-test, which is equivalent to an analysis-of-variance test. The pooled variance for the difference-in-means test is as follows:
$$s^2 = \frac{{\left( {n_1 - 1} \right)s_1^2 + \left( {n_2 - 1} \right)s_2^2 }} {{n_1 + n_2 - 2}}$$
The t statistic is as follows:
$$t = \frac{{\bar x_1 - \bar x_2 - \mu _0}} {s\sqrt {{\left( {1/n_1 } \right)} + \left( {1/n_2 } \right)}}$$
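The pooled variance and the equal-variance t statistic can be worked through numerically. The sketch below is a hand computation from summary statistics, not a call into NormTwoSample; the class PooledTSketch and its methods are hypothetical names. The degrees of freedom \(n_1 + n_2 - 2\) printed at the end are the standard value for the pooled two-sample t-test.

```java
final class PooledTSketch {
    /** Pooled variance s^2, weighting each sample variance by its degrees of freedom. */
    static double pooledVariance(int n1, double var1, int n2, double var2) {
        return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2);
    }

    /** Equal-variance t statistic for the null hypothesis mu1 - mu2 = mu0. */
    static double tStatistic(double mean1, double mean2, double mu0,
                             int n1, double var1, int n2, double var2) {
        double s = Math.sqrt(pooledVariance(n1, var1, n2, var2));
        return (mean1 - mean2 - mu0) / (s * Math.sqrt(1.0 / n1 + 1.0 / n2));
    }

    public static void main(String[] args) {
        // Summary statistics for x = {1, 2, 3, 4, 5} and y = {2, 4, 6, 8}.
        int n1 = 5, n2 = 4;
        double mean1 = 3.0, var1 = 2.5;
        double mean2 = 5.0, var2 = 20.0 / 3.0;

        System.out.println("pooled variance = " + pooledVariance(n1, var1, n2, var2));
        System.out.println("t = " + tStatistic(mean1, mean2, 0.0, n1, var1, n2, var2));
        System.out.println("df = " + (n1 + n2 - 2));
    }
}
```

For these inputs the pooled variance is 30/7 (about 4.29) and t is about -1.44 on 7 degrees of freedom.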
The confidence interval for the difference in means under the equal-variance assumption can be obtained by first setting the unequal variances flag to false with the setUnequalVariances method, and then calling the getLowerCIDiff and getUpperCIDiff methods.
If the population variances are not equal, the ordinary t statistic does not have a t distribution, and several approximate tests for the equality of means have been proposed. (See, for example, Anderson and Bancroft 1952, and Kendall and Stuart 1979.) One of the earliest tests devised for this situation is the Fisher-Behrens test, based on Fisher's concept of fiducial probability.
The procedure used in the getTTest, getLowerCIDiff and getUpperCIDiff methods when unequal variances are specified is Satterthwaite's procedure, as suggested by H.F. Smith and modified by F.E. Satterthwaite (Anderson and Bancroft 1952, p. 83). Call setUnequalVariances with the value true to obtain results assuming unequal variances.
The test statistic is
$$t' = \left( {\bar x_1 - \bar x_2 - \mu _0 } \right)/s_d$$
where
$$s_d = \sqrt {\left( {s_1^2 /n_1 } \right) + \left( {s_2^2 /n_2 } \right)}$$
Under the null hypothesis \(\mu_1 - \mu_2 = \mu_0\), this quantity has an approximate t distribution with degrees of freedom df, given by the following equation:
$${\rm{df}} = \frac{{s_d^4 }}{{\frac{{\left( {s_1^2 /n_1 } \right)^2 }}{{n_1 - 1}} + \frac{{\left( {s_2^2 /n_2 } \right)^2 }}{{n_2 - 1}}}}$$
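The following sketch evaluates \(t'\), \(s_d\) and the approximate degrees of freedom directly from the formulas above. It is a hand computation for illustration; SatterthwaiteSketch and its method names are hypothetical and do not describe how NormTwoSample is implemented.

```java
final class SatterthwaiteSketch {
    /** Standard error s_d of the difference in means, without pooling. */
    static double stdErr(int n1, double var1, int n2, double var2) {
        return Math.sqrt(var1 / n1 + var2 / n2);
    }

    /** Unequal-variance statistic t' for the null hypothesis mu1 - mu2 = mu0. */
    static double tPrime(double mean1, double mean2, double mu0,
                         int n1, double var1, int n2, double var2) {
        return (mean1 - mean2 - mu0) / stdErr(n1, var1, n2, var2);
    }

    /** Approximate degrees of freedom; generally not an integer. */
    static double degreesOfFreedom(int n1, double var1, int n2, double var2) {
        double a = var1 / n1;
        double b = var2 / n2;
        double sd4 = (a + b) * (a + b);   // s_d^4
        return sd4 / (a * a / (n1 - 1) + b * b / (n2 - 1));
    }

    public static void main(String[] args) {
        // Summary statistics for x = {1, 2, 3, 4, 5} and y = {2, 4, 6, 8}.
        int n1 = 5, n2 = 4;
        double mean1 = 3.0, var1 = 2.5;
        double mean2 = 5.0, var2 = 20.0 / 3.0;

        System.out.println("t' = " + tPrime(mean1, mean2, 0.0, n1, var1, n2, var2));
        System.out.println("df = " + degreesOfFreedom(n1, var1, n2, var2));
    }
}
```

For these inputs \(t'\) is about -1.36 with roughly 4.75 degrees of freedom. When both samples have the same size n and the same variance, the formula reduces to \(2(n-1)\), the pooled value.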
Inferences about Variances
The F statistic for testing the equality of variances is given by \(F = s_{\max }^2 /s_{\min }^2\), where \(s_{\max}^2\) is the larger and \(s_{\min}^2\) the smaller of \(s_1^2\) and \(s_2^2\). If the variances are equal, this quantity has an F distribution with \(n_1 - 1\) and \(n_2 - 1\) degrees of freedom.
Note: it is generally not recommended that the results of the F test be used to decide whether to use the regular t-test or the modified \(t'\) on a single set of data. The modified \(t'\) (Satterthwaite's procedure) is the more conservative approach to use if there is doubt about the equality of the variances.
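The variance-ratio statistic is simple to compute by hand. In the sketch below, VarianceRatioSketch and its methods are hypothetical names. Pairing the numerator degrees of freedom with whichever sample has the larger variance is the usual textbook convention and is an assumption here; this page does not state how getFTestDFnumerator and getFTestDFdenominator assign them.

```java
final class VarianceRatioSketch {
    /** F = larger sample variance divided by the smaller one, so F >= 1. */
    static double fStatistic(double var1, double var2) {
        return Math.max(var1, var2) / Math.min(var1, var2);
    }

    /** Assumption: numerator df belong to the sample with the larger variance. */
    static int numeratorDf(int n1, double var1, int n2, double var2) {
        return (var1 >= var2 ? n1 : n2) - 1;
    }

    /** Assumption: denominator df belong to the sample with the smaller variance. */
    static int denominatorDf(int n1, double var1, int n2, double var2) {
        return (var1 >= var2 ? n2 : n1) - 1;
    }

    public static void main(String[] args) {
        // Sample variances for x = {1, 2, 3, 4, 5} and y = {2, 4, 6, 8}.
        int n1 = 5, n2 = 4;
        double var1 = 2.5, var2 = 20.0 / 3.0;

        System.out.println("F = " + fStatistic(var1, var2));
        System.out.println("numerator df = " + numeratorDf(n1, var1, n2, var2));
        System.out.println("denominator df = " + denominatorDf(n1, var1, n2, var2));
    }
}
```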
Constructor and Description |
---|
NormTwoSample(double[] x, double[] y) Constructor. |
Modifier and Type | Method and Description |
---|---|
void | downdateX(double[] x) Removes the observations in x from the first sample. |
void | downdateY(double[] y) Removes the observations in y from the second sample. |
double | getChiSquaredTest() Returns the test statistic associated with the chi-squared test for the (assumed) common variance. |
int | getChiSquaredTestDF() Returns the degrees of freedom associated with the chi-squared test for the common variance. |
double | getChiSquaredTestP() Returns the probability of a larger value than the chi-squared statistic associated with the test for the common variance, assuming the null hypothesis is true (i.e., the p-value for the test). |
double | getDiffMean() Returns the difference in sample means. |
double | getFTest() Returns the F statistic value calculated in an F-test for equality of variances. |
int | getFTestDFdenominator() Returns the denominator degrees of freedom of the F-test for equality of variances. |
int | getFTestDFnumerator() Returns the numerator degrees of freedom of the F-test for equality of variances. |
double | getFTestP() Returns the probability of a larger (in absolute value) F statistic value, assuming equal variances (i.e., the p-value for the test). |
double | getLowerCICommonVariance() Returns the lower confidenceVariance \(\times 100\%\) confidence limit for the common variance. |
double | getLowerCIDiff() Returns the lower confidence limit for the difference, \(\mu_x - \mu_y\), for equal or unequal variances depending on the value set by setUnequalVariances. |
double | getLowerCIRatioVariance() Returns the approximate lower confidence limit in an interval estimate for the ratio of variances, \(\sigma_1^2/\sigma_2^2\). |
double | getMeanX() Returns the mean of the first sample. |
double | getMeanY() Returns the mean of the second sample. |
double | getPooledVariance() Returns the pooled variance for the two samples. |
double | getStdDevX() Returns the standard deviation of the first sample. |
double | getStdDevY() Returns the standard deviation of the second sample. |
double | getTTest() Returns the test statistic for Satterthwaite's approximation. |
double | getTTestDF() Returns the degrees of freedom for Satterthwaite's approximation. |
double | getTTestP() Returns the approximate probability of observing a larger value of the t-statistic given the null hypothesis is true (i.e., the p-value for the test). |
double | getUpperCICommonVariance() Returns the upper confidenceVariance \(\times 100\%\) confidence limit for the common variance. |
double | getUpperCIDiff() Returns the upper confidence limit for the difference, \(\mu_x - \mu_y\), for equal or unequal variances depending on the value set by setUnequalVariances. |
double | getUpperCIRatioVariance() Returns the approximate upper confidence limit in a confidence interval for the ratio of variances, \(\sigma_1^2/\sigma_2^2\). |
void | setChiSquaredTestNull(double varianceHypothesisValue) Sets the null hypothesis value for the chi-squared test. |
void | setConfidenceMean(double confidenceMean) Sets the confidence level (a value between 0.0 and 1.0) for a two-sided confidence interval for the difference in means, \(\mu_x - \mu_y\). |
void | setConfidenceVariance(double confidenceVariance) Sets the confidence level for a two-sided interval estimate for the common variance and for the ratios of variances. |
void | setTTestNull(double meanHypothesis) Sets the null hypothesis value for the t-test for the difference in means. |
void | setUnequalVariances(boolean uneqVar) Specifies whether to return statistics based on equal or unequal variances. |
void | update(double[] x, double[] y) Concatenates the data in x and y with the samples provided in the constructor. |
void | updateX(double[] x) Concatenates the data in x with the first sample. |
void | updateY(double[] y) Concatenates the data in y with the second sample. |
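One way to picture the update and downdate methods listed in the table is as maintenance of running summary statistics, so that observations can be added or removed without keeping the whole sample in memory. The sketch below uses Welford's online algorithm and its reverse step. It is an illustration under that assumption only: RunningSample and its members are hypothetical, and this page does not say how NormTwoSample stores or updates its data.

```java
final class RunningSample {
    private long n;        // number of non-NaN observations currently included
    private double mean;   // running mean
    private double m2;     // running sum of squared deviations from the mean

    /** Adds observations one at a time (Welford's update); NaN is skipped. */
    void update(double[] values) {
        for (double v : values) {
            if (Double.isNaN(v)) continue;
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
        }
    }

    /** Removes previously added observations by reversing Welford's update. */
    void downdate(double[] values) {
        for (double v : values) {
            if (Double.isNaN(v) || n == 0) continue;
            if (n == 1) {              // removing the last value resets the state
                n = 0;
                mean = 0.0;
                m2 = 0.0;
                continue;
            }
            double oldMean = (n * mean - v) / (n - 1);
            m2 -= (v - mean) * (v - oldMean);
            mean = oldMean;
            n--;
        }
    }

    double getMean() { return n == 0 ? Double.NaN : mean; }

    double getVariance() { return n < 2 ? Double.NaN : m2 / (n - 1); }

    public static void main(String[] args) {
        RunningSample s = new RunningSample();
        s.update(new double[] {1, 2, 3});
        s.update(new double[] {4, 5, 10});   // a second chunk arrives later
        s.downdate(new double[] {10});       // remove one observation again
        System.out.println(s.getMean() + " " + s.getVariance());
    }
}
```

After the three calls in main, the state matches the sample {1, 2, 3, 4, 5} up to rounding: a mean of 3 and a variance of 2.5.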
public NormTwoSample(double[] x, double[] y)
Constructor.
Parameters:
x - a double array containing the first sample
y - a double array containing the second sample

public double getDiffMean()
Returns the difference in sample means.
Returns:
a double, the difference in sample means

public double getMeanX()
Returns the mean of the first sample.
Returns:
a double, the mean of the first sample

public double getMeanY()
Returns the mean of the second sample.
Returns:
a double, the mean of the second sample

public void setConfidenceMean(double confidenceMean)
Sets the confidence level for a two-sided confidence interval for the difference in means, \(\mu_x - \mu_y\).
Argument confidenceMean must be between \(0.0\) and \(1.0\); common choices are \(0.90\), \(0.95\) or \(0.99\).
Note: To use NormTwoSample.getUpperCIDiff() (NormTwoSample.getLowerCIDiff()) as a \(C\)% upper (lower) one-sided confidence limit, set confidenceMean = \(1 - 2(1 - C/100)\). For example, a 95% one-sided limit corresponds to confidenceMean = 0.90.
Default: confidenceMean = 0.95
Parameters:
confidenceMean - a double, the desired confidence level of the mean

public double getUpperCIDiff()
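Returns the upper confidence limit for the difference, \(\mu_x - \mu_y\), for equal or unequal variances depending on the value set by setUnequalVariances.
Returns:
a double containing the upper confidence limit for the difference in means of the two populations

public double getLowerCIDiff()
Returns the lower confidence limit for the difference, \(\mu_x - \mu_y\), for equal or unequal variances depending on the value set by setUnequalVariances.
Returns:
a double containing the lower confidence limit for the difference in means of the two populations

public void setUnequalVariances(boolean uneqVar)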
Specifies whether to return statistics based on equal or unequal variances. If uneqVar is true, statistics for unequal variances are returned.
Parameters:
uneqVar - a boolean containing a true or false value. A value of true will cause results for unequal variances to be returned. A value of false will cause results for equal variances to be returned.

public double getTTestDF()
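Returns the degrees of freedom for the t-test, for equal variances or for unequal variances (Satterthwaite's approximation) depending on the value set by setUnequalVariances.
Returns:
a double containing the degrees of freedom for the t-test

public double getTTest()
Returns the test statistic for the t-test, for equal variances or for unequal variances (Satterthwaite's approximation) depending on the value set by setUnequalVariances.
Returns:
a double containing the test statistic for the t-test

public double getTTestP()
Returns the approximate probability of observing a larger value of the t-statistic given the null hypothesis is true (i.e., the p-value for the test). Use setUnequalVariances to select results for equal or unequal variances.
Returns:
a double, the p-value for the test

public void setTTestNull(double meanHypothesis)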
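Sets the null hypothesis value for the t-test for the difference in means. meanHypothesis = 0.0 by default.
Parameters:
meanHypothesis - a double containing the hypothesis value

public double getPooledVariance()
Returns the pooled variance for the two samples.
Returns:
a double, the pooled variance for the two samples

public void setConfidenceVariance(double confidenceVariance)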
Sets the confidence level for a two-sided interval estimate for the common variance and for the ratios of variances.
The level determines a two-sided confidenceVariance \(\times 100\%\) confidence interval for the common variance, with lower limit returned by getLowerCICommonVariance and upper limit returned by getUpperCICommonVariance.
Without the assumption of equal variances (see setUnequalVariances), the ratio of the variances is of interest. A two-sided confidenceVariance \(\times 100\%\) confidence interval for the ratio of the variances \(\sigma_1^2/\sigma_2^2\) is given by getLowerCIRatioVariance and getUpperCIRatioVariance. See setUnequalVariances and getUpperCIRatioVariance. The confidence intervals are symmetric in probability.
Argument confidenceVariance must be between 0.0 and 1.0 and is often 0.90, 0.95 or 0.99. The default is 0.95.
Parameters:
confidenceVariance - a double containing the confidence level of the variance

public double getLowerCICommonVariance()
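Returns the lower confidenceVariance \(\times 100\%\) confidence limit for the common variance.
Returns:
a double, the lower confidence limit for the common variance

public double getUpperCICommonVariance()
Returns the upper confidenceVariance \(\times 100\%\) confidence limit for the common variance.
Returns:
a double, the upper confidence limit for the common variance

public int getChiSquaredTestDF()
Returns the degrees of freedom associated with the chi-squared test for the common variance. See setChiSquaredTestNull.
Returns:
an int, the degrees of freedom for the chi-squared test

public double getChiSquaredTest()
Returns the test statistic associated with the chi-squared test for the (assumed) common variance. See setChiSquaredTestNull.
Returns:
a double, the test statistic for the chi-squared test

public double getChiSquaredTestP()
Returns the probability of a larger value than the chi-squared statistic associated with the test for the common variance, assuming the null hypothesis is true (i.e., the p-value for the test).
Returns:
a double, the p-value for the chi-squared test

public void setChiSquaredTestNull(double varianceHypothesisValue)
Sets the null hypothesis value for the chi-squared test.
Parameters:
varianceHypothesisValue - a double, the null hypothesis value for the chi-squared test

public double getStdDevX()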
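Returns the standard deviation of the first sample.
Returns:
a double, the standard deviation of the first sample

public double getStdDevY()
Returns the standard deviation of the second sample.
Returns:
a double, the standard deviation of the second sample

public double getLowerCIRatioVariance()
Returns the approximate lower confidence limit in an interval estimate for the ratio of variances, \(\sigma_1^2/\sigma_2^2\).
Returns:
a double, the lower limit

public double getUpperCIRatioVariance()
Returns the approximate upper confidence limit in a confidence interval for the ratio of variances, \(\sigma_1^2/\sigma_2^2\).
Returns:
a double, the upper limit

public int getFTestDFnumerator()
Returns the numerator degrees of freedom of the F-test for equality of variances.
Returns:
an int, the numerator degrees of freedom

public int getFTestDFdenominator()
Returns the denominator degrees of freedom of the F-test for equality of variances.
Returns:
an int, the denominator degrees of freedom

public double getFTest()
Returns the F statistic value calculated in an F-test for equality of variances.
Returns:
a double, the value of the test statistic

public double getFTestP()
Returns the probability of a larger (in absolute value) F statistic value, assuming equal variances (i.e., the p-value for the test).
Returns:
a double, the probability of a larger F statistic

public void updateX(double[] x)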
Concatenates the data in x with the first sample.
This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when not all of the data is available at one time or location.
Parameters:
x - a double array containing new data for the first sample

public void downdateX(double[] x)
Removes the observations in x from the first sample.
Parameters:
x - a double array containing the values to remove from the first sample

public void updateY(double[] y)
Concatenates the data in y with the second sample.
This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when not all of the data is available at one time or location.
Parameters:
y - a double array containing new data for the second sample

public void downdateY(double[] y)
Removes the observations in y from the second sample.
Parameters:
y - a double array containing the values to remove from the second sample

public void update(double[] x, double[] y)
Concatenates the data in x and y with the samples provided in the constructor.
This method updates the test results to include a new subset of the data. This is useful when the data is too large to fit into memory or when not all of the data is available at one time or location.
Parameters:
x - a double array containing updates to the first sample
y - a double array containing updates to the second sample

Copyright © 2020 Rogue Wave Software. All rights reserved.