Hi, everyone!

This is the second part of my article on a common problem in statistics: multiple comparisons.

In the first part, we covered the core terminology of the problem and the most common solutions. In this article, we will walk through a practical implementation in Python and interpret the results.

Let's get started!

Practical Implementation

First of all, let's make sure all the necessary libraries are installed:

pip install numpy statsmodels

Bonferroni correction

# Import the multiple-testing utilities
from statsmodels.stats.multitest import multipletests

# Imagine these are your p-values from testing various hypotheses
p_values = [0.005, 0.0335, 0.098543, 0.00123]  # Let's say we did 4 tests

# Apply the Bonferroni correction
# (alpha is the Type I error rate; see the previous article)
bonf_rejected, bonf_corrected, _, _ = multipletests(p_values, alpha=0.05, method='bonferroni')

print("Bonferroni Approach")
print(f"Rejected: {bonf_rejected}")
print(f"Adjusted p-values: {bonf_corrected}\n")

Let's break down what we get after applying the Bonferroni correction to our p-values: each p-value is multiplied by the number of tests (here, 4) and capped at 1, and a hypothesis is rejected only if the adjusted value is still below alpha = 0.05.

Bonferroni Approach
Rejected: [ True False False  True]
Adjusted p-values: [0.02     0.134    0.394172 0.00492 ]
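If you want to check the arithmetic behind these numbers, here is a minimal sketch that reproduces the Bonferroni adjustment by hand (the variable names are my own, not part of statsmodels):

import numpy as np

p_values = np.array([0.005, 0.0335, 0.098543, 0.00123])
m = len(p_values)  # number of tests

# Bonferroni: multiply each p-value by the number of tests, capped at 1
manual_adjusted = np.minimum(p_values * m, 1.0)
manual_rejected = manual_adjusted <= 0.05

print(manual_adjusted)  # [0.02     0.134    0.394172 0.00492 ]
print(manual_rejected)  # [ True False False  True]

Only the first and fourth hypotheses survive: their adjusted p-values (0.02 and 0.00492) stay below alpha = 0.05, while the other two are pushed above it by the correction.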

Benjamini-Hochberg correction

# Benjamini-Hochberg correction for the brave
from statsmodels.stats.multitest import multipletests

# The same p-values from testing various hypotheses
p_values = [0.005, 0.0335, 0.098543, 0.00123]  # Let's say we did 4 tests

# Apply the Benjamini-Hochberg (FDR) correction
bh_rejected, bh_corrected, _, _ = multipletests(p_values, alpha=0.05, method='fdr_bh')

print("Benjamini-Hochberg Approach")
print(f"Rejected: {bh_rejected}")
print(f"Adjusted p-values: {bh_corrected}")

Let's break down what we get after applying the Benjamini-Hochberg correction: the p-values are ranked from smallest to largest, each is compared against a rank-scaled threshold, and the resulting adjusted values control the false discovery rate rather than the familywise error rate. Notice that BH rejects three hypotheses where Bonferroni rejected only two.

Benjamini-Hochberg Approach
Rejected: [ True  True False  True]
Adjusted p-values: [0.01       0.04466667 0.098543   0.00492   ]
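Under the hood, Benjamini-Hochberg is a step-up procedure: rank the p-values from smallest to largest and find the largest rank i whose p-value sits below (i/m) * alpha. Here is a minimal sketch of that logic (my own illustrative code, not the statsmodels internals verbatim):

import numpy as np

p_values = np.array([0.005, 0.0335, 0.098543, 0.00123])
m = len(p_values)
alpha = 0.05

order = np.argsort(p_values)           # indices that sort the p-values
ranked = p_values[order]               # [0.00123 0.005 0.0335 0.098543]
ranks = np.arange(1, m + 1)

# Step-up rule: find the largest rank i with p_(i) <= (i / m) * alpha
below = ranked <= (ranks / m) * alpha  # [ True  True  True False]
k = below.nonzero()[0].max() + 1 if below.any() else 0

# Reject the k smallest p-values, mapped back to the original order
rejected = np.empty(m, dtype=bool)
rejected[order] = ranks <= k
print(rejected)                        # [ True  True False  True]

# Adjusted p-values: p * m / rank, made monotone from the largest down
adj_sorted = np.minimum.accumulate((ranked * m / ranks)[::-1])[::-1]
adjusted = np.empty(m)
adjusted[order] = adj_sorted
print(adjusted)                        # [0.01 0.04466667 0.098543 0.00492]

With k = 3, the three smallest p-values are rejected, which is exactly why BH admits the 0.0335 result that Bonferroni turned away.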

Interpretation of the Results in Celebrity Terms

Think of Bonferroni as the strict bouncer at an exclusive celebrity party: only guests with impeccable credentials (the tiniest p-values) get past the rope, so almost no impostors slip in, but a few genuine stars get turned away. Benjamini-Hochberg is the more relaxed host who admits anyone who looks plausible for their place in line, accepting that a small share of the guest list may turn out to be crashers. This metaphor highlights the inherent trade-off between sensitivity and specificity in statistical corrections, and the importance of choosing the approach that fits the context of your research or, in our playful analogy, the type of party you are attending.

Wrapping It Up: The Takeaway

Testing a lot of hypotheses is like walking through a minefield of potential false positives. But with the right tools (thanks, Python!) and strategies (hello, Bonferroni and Benjamini-Hochberg), you can keep the process under control and your research sound. Remember, it's all about balancing risk and reward: whether you are being extra cautious or hunting for big discoveries, properly handling multiple tests will make your results more trustworthy.

Happy data hunting!