T-Test in Python

A T-test is a statistical hypothesis test that checks whether a difference between means (between two groups, or between one group and a reference value) is statistically significant. This article describes the main types of T-test, their assumptions, and how to run them in Python with SciPy.

Types of T-tests

1. One-Sample T-test:

Compares the mean of one group to a known value or a theoretical population mean.
Example: Test whether a class's average score differs from 75.

2. Two-Sample (Independent) T-test:

Compares the means of two independent groups by testing whether they differ.
Example: Test whether there is a difference between the test scores of two different classes.

3. Paired T-test:

Compares the means of two related groups, such as measurements taken on the same subjects before and after an intervention.
Example: Comparing participants' weights before and after a diet program.

Assumptions of T-tests

Data are continuous and approximately normally distributed.
Observations are independent.
Homogeneity of variance: Variance of the groups must be approximately equal (for two-sample T-tests).

Steps to Perform a T-test

1. Define hypotheses:

Null Hypothesis (H₀): Means are equal (no significant difference).
Alternative Hypothesis (Hₐ): Means are not equal (significant difference).

2. Set the significance level (α):

Commonly α=0.05

3. Calculate T-statistic and p-value.

4. Interpret results:

Reject H₀ if p≤α.
Fail to reject H₀ if p>α.
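
The four steps above can be sketched by hand for the one-sample case, using t = (x̄ − μ₀) / (s / √n) with n − 1 degrees of freedom. This is a minimal illustration rather than a replacement for the SciPy functions used later; the helper name one_sample_t and the scores list are hypothetical and not part of the original article.

```python
import math
from statistics import mean, stdev

from scipy.stats import t as t_dist


def one_sample_t(sample, mu0):
    """Return (t statistic, two-sided p-value) for H0: population mean == mu0."""
    n = len(sample)
    # Standard error of the mean, from the sample standard deviation (n - 1 divisor)
    se = stdev(sample) / math.sqrt(n)
    t_stat = (mean(sample) - mu0) / se
    # Two-sided p-value from the t distribution with n - 1 degrees of freedom
    p_value = 2 * t_dist.sf(abs(t_stat), df=n - 1)
    return t_stat, p_value


# Step 1: H0 says the mean score is 75 (hypothetical data)
scores = [72, 78, 75, 80, 77, 74]
# Step 2: significance level
alpha = 0.05
# Step 3: T-statistic and p-value
t_stat, p_value = one_sample_t(scores, mu0=75)
# Step 4: decision rule
decision = "Reject H0" if p_value <= alpha else "Fail to reject H0"
print(f"t = {t_stat:.2f}, p = {p_value:.4f} -> {decision}")
```

The same numbers should come out of scipy.stats.ttest_1samp(scores, 75), which is the shorter route in practice.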

Python Implementation

1. One-Sample T-Test

This test checks if the mean of a dataset is significantly different from a known value (e.g., population mean).

from scipy.stats import ttest_1samp

# Example data: Test scores
data = [85, 90, 88, 92, 87, 89, 84, 91]
population_mean = 88

# Perform one-sample T-test
t_stat, p_value = ttest_1samp(data, population_mean)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("Reject the null hypothesis: The sample mean is significantly different from the population mean.")
else:
    print("Fail to reject the null hypothesis: No significant difference between the sample mean and the population mean.")

Output:

T-statistic: 0.25
P-value: 0.8089
Fail to reject the null hypothesis: No significant difference between the sample mean and the population mean.

2. Two-Sample (Independent) T-Test

This test checks if the means of two independent groups are significantly different.

from scipy.stats import ttest_ind

# Example data: Test scores of two classes
class_A = [85, 90, 88, 92, 87]
class_B = [78, 82, 80, 84, 79]

# Perform two-sample T-test
t_stat, p_value = ttest_ind(class_A, class_B)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("Reject the null hypothesis: The two groups have significantly different means.")
else:
    print("Fail to reject the null hypothesis: No significant difference between the means of the two groups.")

Output:

T-statistic: 4.82
P-value: 0.0013
Reject the null hypothesis: The two groups have significantly different means.
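
To make clearer what ttest_ind does with its default equal_var=True, the pooled-variance statistic can be written out by hand and compared with SciPy's result. This is a sketch for illustration only; the variable names (pooled_var, t_manual, and so on) are chosen here and do not come from the original article.

```python
import math
from statistics import mean, variance

from scipy.stats import t as t_dist, ttest_ind

class_A = [85, 90, 88, 92, 87]
class_B = [78, 82, 80, 84, 79]

n_a, n_b = len(class_A), len(class_B)
df = n_a + n_b - 2

# Pooled variance: the two sample variances weighted by their degrees of freedom
pooled_var = ((n_a - 1) * variance(class_A) + (n_b - 1) * variance(class_B)) / df
# Standard error of the difference between the two means
se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_manual = (mean(class_A) - mean(class_B)) / se
p_manual = 2 * t_dist.sf(abs(t_manual), df)  # two-sided p-value

t_scipy, p_scipy = ttest_ind(class_A, class_B)  # equal_var=True by default

print(f"manual: t = {t_manual:.2f}, p = {p_manual:.4f}")
print(f"scipy:  t = {t_scipy:.2f}, p = {p_scipy:.4f}")
```

Both lines should print the same values, since the hand calculation follows the same pooled-variance formula.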

3. Paired T-Test

This test compares the means of two related groups, such as measurements before and after treatment.

from scipy.stats import ttest_rel

# Example data: Scores before and after treatment
before = [85, 88, 86, 90, 87]
after = [89, 91, 88, 94, 90]

# Perform paired T-test
t_stat, p_value = ttest_rel(before, after)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("Reject the null hypothesis: There is a significant difference between the paired samples.")
else:
    print("Fail to reject the null hypothesis: No significant difference between the paired samples.")

Output:

T-statistic: -8.55
P-value: 0.0010
Reject the null hypothesis: There is a significant difference between the paired samples.
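
A paired T-test is equivalent to a one-sample T-test on the pairwise differences, tested against zero. The short check below illustrates this with the same before and after scores; the use of NumPy arrays and the name diff are choices made for this sketch.

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

before = np.array([85, 88, 86, 90, 87])
after = np.array([89, 91, 88, 94, 90])

# Per-subject change; ttest_rel(before, after) works on before - after
diff = before - after

t_paired, p_paired = ttest_rel(before, after)
t_diff, p_diff = ttest_1samp(diff, 0)  # H0: mean difference is zero

print("Same T-statistic:", np.isclose(t_paired, t_diff))
print("Same p-value:", np.isclose(p_paired, p_diff))
```

The sign of the T-statistic depends on the order of the arguments: swapping before and after flips the sign but leaves the two-sided p-value unchanged.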

How to Check Assumptions?

1. Normality Test:

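Use the Shapiro-Wilk test to check whether the data are consistent with a normal distribution. A large p-value does not prove normality; it only means the test found no evidence against it.

from scipy.stats import shapiro

# Same test scores as in the one-sample example
data = [85, 90, 88, 92, 87, 89, 84, 91]

# Shapiro-Wilk: H₀ is that the sample comes from a normal distribution
stat, p = shapiro(data)
if p > 0.05:
    print("No evidence against normality (fail to reject H₀).")
else:
    print("Data deviate significantly from a normal distribution.")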
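
Which values go into the normality check depends on the test. For a two-sample T-test, each group is usually checked separately; for a paired T-test, the assumption applies to the pairwise differences rather than to the raw columns. The sketch below reuses the example data from the earlier sections; the samples and p_values dictionaries are illustrative names, not part of the original article.

```python
from scipy.stats import shapiro

class_A = [85, 90, 88, 92, 87]
class_B = [78, 82, 80, 84, 79]
before = [85, 88, 86, 90, 87]
after = [89, 91, 88, 94, 90]

samples = {
    "class_A": class_A,                      # two-sample test: check each group
    "class_B": class_B,
    "paired differences": [b - a for b, a in zip(before, after)],  # paired test
}

p_values = {}
for name, values in samples.items():
    _, p = shapiro(values)
    p_values[name] = p
    verdict = "no evidence against normality" if p > 0.05 else "normality is doubtful"
    print(f"{name}: p = {p:.3f} ({verdict})")
```

With only five observations per sample, the Shapiro-Wilk test has little power to detect non-normality, so a histogram or Q-Q plot is a useful complement.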

2. Equal Variance Test (for Two-Sample T-Test):

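Use Levene’s test to check whether the group variances are equal.

from scipy.stats import levene

# Same class scores as in the two-sample example
class_A = [85, 90, 88, 92, 87]
class_B = [78, 82, 80, 84, 79]

# Levene's test: H₀ is that the groups have equal variances
stat, p = levene(class_A, class_B)
if p > 0.05:
    print("No evidence of unequal variances (fail to reject H₀).")
else:
    print("Variances differ significantly.")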

Key Points

A p-value at or below the chosen significance level (commonly 0.05) leads to rejecting the null hypothesis; the result is then called statistically significant.
Check the assumptions (normality and, for two-sample tests, equal variances) before performing a T-test.
For unequal variances in a two-sample T-test, set equal_var=False in ttest_ind, which runs Welch's T-test instead of the pooled-variance test.
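
As a rough illustration of the last point, the sketch below runs ttest_ind twice on the same data, once with the default pooled variance and once with equal_var=False. The two groups are hypothetical values chosen so that their spreads and sizes differ clearly; they are not taken from the original article.

```python
from scipy.stats import ttest_ind

# Hypothetical measurements: group_1 is tightly clustered, group_2 is widely spread
group_1 = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2, 12.3, 11.7]
group_2 = [10.5, 14.2, 9.8, 15.1]

# Student's T-test: assumes both groups share one variance
t_pooled, p_pooled = ttest_ind(group_1, group_2)
# Welch's T-test: estimates each group's variance separately
t_welch, p_welch = ttest_ind(group_1, group_2, equal_var=False)

print(f"Pooled: t = {t_pooled:.2f}, p = {p_pooled:.4f}")
print(f"Welch:  t = {t_welch:.2f}, p = {p_welch:.4f}")
```

When the group sizes are equal, both versions give the same T-statistic and differ only in their degrees of freedom; with unequal sizes, as here, the statistic itself changes as well.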

Summary of Outputs:

Test Type	T-Statistic	P-value	Conclusion
One-Sample T-Test	0.25	0.8089	Fail to reject H₀
Two-Sample T-Test	4.82	0.0013	Reject H₀: Means are different
Paired T-Test	-8.55	0.0010	Reject H₀: Significant difference
