Two-Sample T-Test
Suppose that a company has recently updated their website to make it more colorful and inviting. The company wants to know whether the new design is resulting in visitors staying on the site for a longer period of time. A sample of 100 visitors who saw the old design spent an average of 25 minutes on the site. A second sample of 100 visitors who saw the new version spent an average of 28 minutes on the site. Did the average time spent per visitor vary across groups? Or is this difference attributable to random chance?
One way of testing whether this difference is significant is by using a 2 Sample T-Test. A 2 Sample T-Test compares two sets of numerical data.
The null hypothesis of a 2 Sample T-Test is that the two observed samples come from populations with the same mean. In the example above, this means: if we could observe all site visitors in two alternate universes (one where they see each version of the site), the average visiting times in these universes would be equal.
The alternative hypothesis could be: The two observed samples come from populations with different means. In the example above, this would mean that the average visiting times in our two alternate universes are actually different, hence why we observed a difference in our samples.
We can use SciPy’s ttest_ind function to perform a 2 Sample T-Test. It takes the two samples as inputs and returns the t-statistic and a p-value, which we can use to assess the probability of an observed difference happening by chance if the null hypothesis were true. For more information about p-values, refer to the earlier exercise on univariate t-tests.
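A minimal sketch of this test in SciPy, applied to the website example above. The visit-time arrays here are synthetic data generated only for illustration (the original example reports only the group means, not the raw data):

```python
import numpy as np
from scipy.stats import ttest_ind

# Synthetic visit times (in minutes), built to resemble the example:
# 100 visitors per design, means near 25 and 28.
rng = np.random.default_rng(0)
old_design = rng.normal(loc=25, scale=8, size=100)
new_design = rng.normal(loc=28, scale=8, size=100)

# ttest_ind returns the t-statistic and the two-sided p-value.
tstat, pval = ttest_ind(old_design, new_design)
print(f"t-statistic: {tstat:.3f}")
print(f"p-value:     {pval:.4f}")

# If pval is below our significance threshold (commonly 0.05), we
# reject the null hypothesis that both samples come from populations
# with the same mean.
```

Note that `ttest_ind` assumes equal population variances by default; passing `equal_var=False` runs Welch’s t-test instead, which is often safer when the two groups may differ in spread.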
Suppose that we own a chain of stores that sell ants, called VeryAnts. There are three different locations: A, B, and C. We want to know if the average ant sales over the past year are significantly different between the three locations.
At first, it seems that we could perform t-tests between each pair of stores.
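This naive pairwise approach can be sketched as follows; the daily sales figures for stores A, B, and C are synthetic, generated only for illustration:

```python
import numpy as np
from scipy.stats import ttest_ind

# Synthetic daily ant sales for one year at each of the three stores.
rng = np.random.default_rng(1)
store_a = rng.normal(loc=58, scale=12, size=365)
store_b = rng.normal(loc=60, scale=12, size=365)
store_c = rng.normal(loc=62, scale=12, size=365)

# Run a separate 2 Sample T-Test for every pair of stores.
pairs = {
    "A vs B": (store_a, store_b),
    "A vs C": (store_a, store_c),
    "B vs C": (store_b, store_c),
}
results = {}
for name, (x, y) in pairs.items():
    tstat, pval = ttest_ind(x, y)
    results[name] = pval
    print(f"{name}: t = {tstat:.3f}, p = {pval:.4f}")
```

As the next paragraphs explain, running three tests like this inflates the overall chance of a false positive beyond the per-test threshold.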
Recall that the significance threshold is the probability that we incorrectly reject a true null hypothesis on each t-test. The more t-tests we perform, the more likely we are to get a false positive, a Type I error.
For a significance threshold of 0.05, if the null hypothesis is true, then the probability of correctly failing to reject the null is 1 − 0.05 = 0.95. When we run another t-test where the null is true, the probability of correctly failing to reject the null on both of those tests is 0.95 × 0.95, or 0.9025. That means our probability of making at least one Type I error is now 1 − 0.9025 = 0.0975, or close to 10%! This error probability only gets bigger the more t-tests we do.
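The arithmetic above generalizes: with k independent tests at threshold 0.05, the probability of at least one false positive is 1 − 0.95^k. A short sketch of how quickly this grows:

```python
# Family-wise error rate for k independent tests at threshold 0.05:
# P(at least one Type I error) = 1 - (1 - alpha) ** k
alpha = 0.05
for num_tests in [1, 2, 3, 6, 10]:
    prob_no_error = (1 - alpha) ** num_tests
    prob_at_least_one_error = 1 - prob_no_error
    print(f"{num_tests:2d} tests -> P(at least one Type I error) = "
          f"{prob_at_least_one_error:.4f}")
```

With two tests this gives 0.0975, the "close to 10%" figure above; with three pairwise tests between stores A, B, and C, it is already about 0.14.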