com.imsl.stat.WelchsTTest

All Implemented Interfaces: Serializable, Cloneable

public class WelchsTTest extends Object implements Serializable, Cloneable

Performs Welch's t-test for testing the difference in means between two normal populations with unequal variances.

Let $\mu_x$ and $\sigma _x^2$ be the mean and variance of the first population, and let $\mu_y$ and $\sigma_y^2$ be the corresponding quantities of the second population. The methods in this class support tests and confidence intervals for the difference in means $\mu_x-\mu_y$.

For some real constant $c$, a hypothesis test for the difference may be expressed as one of the following:

$$ H_0: \mu_x - \mu_y = c \,\,\,\,\,\mbox{vs.}\,\,\,\,\, H_1: \mu_x - \mu_y \ne c$$

$$ H_0: \mu_x - \mu_y \le c \,\,\,\,\,\mbox{vs.}\,\,\,\,\, H_1: \mu_x - \mu_y > c$$

$$ H_0: \mu_x - \mu_y \ge c \,\,\,\,\,\mbox{vs.}\,\,\,\,\, H_1: \mu_x - \mu_y < c$$

where $H_0$ is the null hypothesis and $H_1$ is the alternative (or alternate) hypothesis. The first test is a two-sided test, because the rejection region defined by $H_1$ is two-sided, while the other tests are one-sided.

Conventionally, the null hypothesis is assumed to be true and the alternative hypothesis represents an experimental conjecture. If there is sufficient evidence in the sample data, the null hypothesis $H_0$ is rejected in favor of $H_1$; insufficient evidence results in a failure to reject the null hypothesis. The decision to reject or fail to reject is based on probabilities calculated using the test statistic's distribution under the null hypothesis (the null distribution).
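
The way the three alternatives select a tail of the null distribution can be sketched as follows. This is a minimal illustration and not the library's implementation: the class, the enum, and its constant names are hypothetical, and the value cdfAtT, the probability $P(T \le t)$ under the null distribution, is assumed to be supplied by the caller.

```java
/** Illustrative sketch only; all names here are hypothetical. */
final class TailProbabilitySketch {

    /** Stand-in for the three forms of H1; not the library's enum. */
    enum Alternative { TWO_SIDED, GREATER, LESS }

    /**
     * Maps a null-distribution CDF value to a p-value.
     *
     * @param cdfAtT P(T <= t) for the observed statistic t (caller-supplied)
     * @param alt    the form of the alternative hypothesis
     */
    static double pValue(double cdfAtT, Alternative alt) {
        switch (alt) {
            case GREATER:
                return 1.0 - cdfAtT;                          // H1: mu_x - mu_y > c
            case LESS:
                return cdfAtT;                                // H1: mu_x - mu_y < c
            default:
                return 2.0 * Math.min(cdfAtT, 1.0 - cdfAtT);  // H1: mu_x - mu_y != c
        }
    }

    public static void main(String[] args) {
        for (Alternative a : Alternative.values()) {
            System.out.println(a + ": p = " + pValue(0.975, a));
        }
    }
}
```

A small p-value is evidence against $H_0$ in the direction stated by $H_1$.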

For Welch's t-test, the two samples are assumed to come from two independent normal distributions with unequal variances ($\sigma_x^2 \ne \sigma_y^2$) and means that satisfy the null hypothesis. When the population variances are not equal, the ordinary t statistic does not have a t distribution, and several approximate tests for the equality of means have been proposed. (See, for example, Anderson and Bancroft 1952, and Kendall and Stuart 1979.)

Welch's test statistic is given by

$$t' = \left( {\bar x - \bar y - c } \right)/s_d$$

where $\bar{x}$ and $\bar{y}$ are the sample means, $s^2_x$ and $s^2_y$ are the (unbiased) sample variances, $n_x$ and $n_y$ are the sample sizes, and

$$s_d = \sqrt {\left( {s_x^2 /n_x } \right) + \left( {s_y^2 /n_y } \right)}$$
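
As a worked illustration of the two formulas above, the following sketch computes $s_d$ and $t'$ directly from two arrays. It is not the library's code: the class and method names are hypothetical, and the straightforward two-pass variance is an assumption chosen for readability.

```java
/** Illustrative sketch only; all names here are hypothetical. */
final class WelchStatisticSketch {

    static double mean(double[] a) {
        double sum = 0.0;
        for (double v : a) {
            sum += v;
        }
        return sum / a.length;
    }

    /** Unbiased sample variance (divisor n - 1). */
    static double variance(double[] a) {
        double m = mean(a);
        double ss = 0.0;
        for (double v : a) {
            ss += (v - m) * (v - m);
        }
        return ss / (a.length - 1);
    }

    /** s_d = sqrt(s_x^2 / n_x + s_y^2 / n_y). */
    static double stdErrDiff(double[] x, double[] y) {
        return Math.sqrt(variance(x) / x.length + variance(y) / y.length);
    }

    /** t' = (xbar - ybar - c) / s_d. */
    static double tStatistic(double[] x, double[] y, double c) {
        return (mean(x) - mean(y) - c) / stdErrDiff(x, y);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {2, 4, 6, 8};
        System.out.println("t' = " + tStatistic(x, y, 0.0));
    }
}
```

Each sample needs at least two observations for the unbiased variance to be defined.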

Under the null hypothesis of $\mu_x- \mu_y= c$, this quantity has an approximate t-distribution with degrees of freedom df, given by the following equation (known as the Welch-Satterthwaite approximation):

$${\rm{df}} = \frac{{s_d^4 }}{{\frac{{\left( {s_x^2 /n_x } \right)^2 }}{{n_x - 1}} + \frac{{\left( {s_y^2 /n_y } \right)^2 }}{{n_y - 1}}}}$$
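
The approximation above translates directly into code. The sketch below is a minimal illustration with hypothetical names; it takes the sample variances and sizes as inputs and makes no claim about how the library evaluates the formula.

```java
/** Illustrative sketch only; all names here are hypothetical. */
final class WelchDfSketch {

    /**
     * Welch-Satterthwaite degrees of freedom.
     *
     * @param varX unbiased sample variance of the first sample
     * @param nX   size of the first sample (at least 2)
     * @param varY unbiased sample variance of the second sample
     * @param nY   size of the second sample (at least 2)
     */
    static double degreesOfFreedom(double varX, int nX, double varY, int nY) {
        double a = varX / nX;   // s_x^2 / n_x
        double b = varY / nY;   // s_y^2 / n_y
        double sd2 = a + b;     // s_d^2, so sd2 * sd2 is s_d^4
        return (sd2 * sd2) / (a * a / (nX - 1) + b * b / (nY - 1));
    }

    public static void main(String[] args) {
        System.out.println("df = " + degreesOfFreedom(2.5, 5, 20.0 / 3.0, 4));
    }
}
```

The result is generally not an integer. It always lies between $\min(n_x - 1, n_y - 1)$ and $n_x + n_y - 2$, and it equals $2(n - 1)$ when both samples have the same size $n$ and the same sample variance.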

Probabilities based on this distribution form the basis of the test and the confidence intervals for the mean difference. For two-sample tests when the variances are assumed equal and for tests of the common variance or for the ratio of variances, see the class NormTwoSample.
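
The interval arithmetic behind the confidence limits can be sketched as follows, assuming the usual form $(\bar x - \bar y) \pm t_{crit}\, s_d$. This is an illustration rather than the library's implementation: the class and method names are hypothetical, and the critical value tCrit (the $(1 + \text{confidenceMean})/2$ quantile of the t distribution with df degrees of freedom) is assumed to be supplied by the caller, since the Java standard library provides no t-distribution quantile function. The helper twoSidedLevel shows the standard conversion from a one-sided $C$% limit to the equivalent two-sided level, as used in the note for setConfidenceMean.

```java
/** Illustrative sketch only; all names here are hypothetical. */
final class WelchIntervalSketch {

    /**
     * Two-sided confidence limits for mu_x - mu_y.
     *
     * @param diffMean   difference in sample means, xbar - ybar
     * @param stdErrDiff s_d, the standard error of the difference
     * @param tCrit      caller-supplied t quantile for the chosen level and df
     * @return array {lower, upper}
     */
    static double[] limits(double diffMean, double stdErrDiff, double tCrit) {
        double halfWidth = tCrit * stdErrDiff;
        return new double[] {diffMean - halfWidth, diffMean + halfWidth};
    }

    /**
     * Two-sided level (as a fraction) whose upper or lower limit serves as a
     * one-sided limit at the given percentage, e.g. 95 for a 95% one-sided limit.
     */
    static double twoSidedLevel(double oneSidedPercent) {
        return (100.0 - 2.0 * (100.0 - oneSidedPercent)) / 100.0;
    }

    public static void main(String[] args) {
        double[] ci = limits(-2.0, 1.5, 2.0);
        System.out.println("[" + ci[0] + ", " + ci[1] + "]");
        System.out.println("two-sided level for a one-sided 95% limit: " + twoSidedLevel(95.0));
    }
}
```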


Nested Class Summary

Nested Classes

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static enum | WelchsTTest.Hypothesis | The form of the alternate hypothesis. |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| WelchsTTest(double[] x, double[] y) | Constructor for the class. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | downdateX(double[] x) | Removes the observations in x from the first sample. |
| void | downdateY(double[] y) | Removes the observations in y from the second sample. |
| double | getDiffMean() | Returns the difference in sample means. |
| double | getLowerCIDiff() | Returns the (approximate) lower confidenceMean*100% confidence limit for the difference in population means, $\mu_x - \mu_y$. |
| double | getMeanX() | Returns the mean of the first sample. |
| double | getMeanY() | Returns the mean of the second sample. |
| double | getStdDevX() | Returns the standard deviation of the first sample. |
| double | getStdDevY() | Returns the standard deviation of the second sample. |
| double | getTTest() | Returns the calculated test statistic for Welch's t-test. |
| double | getTTestDF() | Returns the degrees of freedom used in the test. |
| double | getTTestP() | Returns the approximate probability of observing a more extreme value of the t-statistic given the null hypothesis is true (i.e., the approximate p-value of the test). |
| double | getUpperCIDiff() | Returns the (approximate) upper confidenceMean*100% confidence limit for the difference in population means, $\mu_x - \mu_y$. |
| void | setConfidenceMean(double confidenceMean) | Sets the confidence level for a two-sided confidence interval for the difference in population means, $\mu_x - \mu_y$. |
| void | setHypothesis(WelchsTTest.Hypothesis hypothesis) | Sets the direction of the null/alternative test. |
| void | setTTestNull(double meanHypothesis) | Sets the null hypothesis value. |
| void | update(double[] x, double[] y) | Concatenates the data in x and y with the samples provided in the constructor. |
| void | updateX(double[] x) | Concatenates the data in x with the first sample. |
| void | updateY(double[] y) | Concatenates the data in y with the second sample. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- WelchsTTest
  
  public WelchsTTest(double[] x, double[] y)
  
  Constructs a WelchsTTest object from the two samples x and y.
  
  Parameters:
  
  x - a double array containing data for the first sample
  
  y - a double array containing data for the second sample
Method Details
- getDiffMean
  
  public double getDiffMean()
  
  Returns the difference in sample means.
  
  Returns:
  
  a double, the difference in sample means
- setHypothesis
  
  public void setHypothesis(WelchsTTest.Hypothesis hypothesis)
  
  Sets the direction of the null/alternative test.
  Default: hypothesis = Hypothesis.TWO_SIDED
  
  Parameters:
  
  hypothesis - a Hypothesis enum, specifying the type of test
- setTTestNull
  
  public void setTTestNull(double meanHypothesis)
  
  Sets the null hypothesis value. The null hypothesis value is the value $c$ in $H_0:\mu_x - \mu_y = c $ and the other forms of the hypothesis.
  Default: $ c = 0.0 $
  
  Parameters:
  
  meanHypothesis - double, the hypothesis value
- setConfidenceMean
  
  public void setConfidenceMean(double confidenceMean)
  
  Sets the confidence level for a two-sided confidence interval for the difference in population means, $\mu_x - \mu_y$.
  
  The argument confidenceMean must be between $0.0$ and $1.0$. Common choices are $0.90$, $0.95$, and $0.99$.
  
  Note: To use getUpperCIDiff() (getLowerCIDiff()) as a $C$% upper (lower) one-sided confidence limit, set confidenceMean = $(100-2(100-C))/100$. For example, a one-sided 95% limit corresponds to confidenceMean = 0.90.
  
  Default: confidenceMean = 0.95
  
  Parameters:
  
  confidenceMean - double, the desired confidence level
- getUpperCIDiff
  
  public double getUpperCIDiff()
  
  Returns the (approximate) upper confidenceMean*100% confidence limit for the difference in population means, $\mu_x - \mu_y$.
  
  Returns:
  
  a double, the upper confidence limit for the difference in means
- getLowerCIDiff
  
  public double getLowerCIDiff()
  
  Returns the (approximate) lower confidenceMean*100% confidence limit for the difference in population means, $\mu_x - \mu_y$.
  
  Returns:
  
  a double, the lower confidence limit for the difference in means
- getTTestP
  
  public double getTTestP()
  
  Returns the approximate probability of observing a more extreme value of the t-statistic given the null hypothesis is true (i.e., the approximate p-value of the test).
  
  Returns:
  
  a double, the approximate p-value for the test
- getTTestDF
  
  public double getTTestDF()
  
  Returns the degrees of freedom used in the test, that is, the value obtained from the Welch-Satterthwaite approximation.
  
  Returns:
  
  a double, the degrees of freedom used in the test
- getTTest
  
  public double getTTest()
  
  Returns the calculated test statistic for Welch's t-test.
  
  Returns:
  
  a double, the test statistic
- getMeanX
  
  public double getMeanX()
  
  Returns the mean of the first sample.
  
  Returns:
  
  a double, the mean of the first sample
- getMeanY
  
  public double getMeanY()
  
  Returns the mean of the second sample.
  
  Returns:
  
  a double, the mean of the second sample
- getStdDevX
  
  public double getStdDevX()
  
  Returns the standard deviation of the first sample.
  
  Returns:
  
  a double, the standard deviation of the first sample
- getStdDevY
  
  public double getStdDevY()
  
  Returns the standard deviation of the second sample.
  
  Returns:
  
  a double, the standard deviation of the second sample
- downdateY
  
  public void downdateY(double[] y)
  
  Removes the observations in y from the second sample.
  
  Parameters:
  
  y - a double array containing the values to remove from the second sample
- downdateX
  
  public void downdateX(double[] x)
  
  Removes the observations in x from the first sample.
  
  Parameters:
  
  x - a double array containing the values to remove from the first sample
- updateY
  
  public void updateY(double[] y)
  
  Concatenates the data in y with the second sample.
  This method updates the test results to include a new subset of the data. This is useful when the data set is too large to fit into memory or when not all of the data is available at one time or location.
  
  Parameters:
  
  y - a double array containing new data for the second sample
- updateX
  
  public void updateX(double[] x)
  
  Concatenates the data in x with the first sample.
  This method updates the test results to include a new subset of the data. This is useful when the data set is too large to fit into memory or when not all of the data is available at one time or location.
  
  Parameters:
  
  x - a double array containing new data for the first sample
- update
  
  public void update(double[] x, double[] y)
  
  Concatenates the data in x and y with the samples provided in the constructor.
  This method updates the test results to include a new subset of the data. This is useful when the data set is too large to fit into memory or when not all of the data is available at one time or location.
  
  Parameters:
  
  x - a double array containing updates to the first sample
  
  y - a double array containing updates to the second sample
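
The update and downdate methods described above imply that the sample summaries can be maintained without holding every observation in memory. How the library does this is not stated here. As one possible approach, the sketch below keeps a running count, mean, and sum of squared deviations for a single sample, using Welford's recurrence for additions and its algebraic inverse for removals. The class and all of its member names are hypothetical.

```java
/** Illustrative sketch only; not the library's implementation. */
final class RunningMomentsSketch {
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0;   // sum of squared deviations from the current mean

    /** Adds observations one at a time (Welford's recurrence). */
    void update(double[] data) {
        for (double v : data) {
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
        }
    }

    /** Removes previously added observations by inverting the recurrence. */
    void downdate(double[] data) {
        for (double v : data) {
            if (n <= 1) {          // removing the last observation empties the sample
                n = 0;
                mean = 0.0;
                m2 = 0.0;
                continue;
            }
            double oldMean = (n * mean - v) / (n - 1);
            m2 -= (v - mean) * (v - oldMean);
            mean = oldMean;
            n--;
        }
    }

    long count() { return n; }

    double mean() { return mean; }

    /** Unbiased sample variance; requires at least two observations. */
    double variance() { return m2 / (n - 1); }

    public static void main(String[] args) {
        RunningMomentsSketch s = new RunningMomentsSketch();
        s.update(new double[] {1, 2, 3});
        s.update(new double[] {4, 5});
        System.out.println(s.count() + " " + s.mean() + " " + s.variance());
        s.downdate(new double[] {4, 5});
        System.out.println(s.count() + " " + s.mean() + " " + s.variance());
    }
}
```

Two such accumulators, one per sample, supply everything needed for $t'$, $s_d$, and df. Removal by subtraction can lose precision when the removed values dominate the remaining ones, so a sketch like this suits moderate amounts of downdating.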
