John Tukey's Innovation: How Do Family-Wise Error Rates Affect Statistics?

In statistics, the family-wise error rate (FWER) refers to the probability of one or more false discoveries (Type I errors) occurring in multiple hypothesis tests. This is a key concept for researchers who wish to reduce error rates when performing multiple tests.

John Tukey introduced the concept of the family-wise error rate in 1953 as the probability of making a Type I error among a specified group, or "family," of tests.

The family-wise error rate belongs to a broader group of error-rate concepts, one of which is defined at the level of the experiment. Ryan proposed the experiment-wise error rate in 1959, which represents the probability of a Type I error occurring in a given experiment. The experiment-wise error rate can be thought of as a family-wise error rate whose family consists of all the tests conducted within that experiment.

In statistics, the word "family" has several definitions. Hochberg and Tamhane (1987) define a "family" as "any collection of inferences for which it is meaningful to take into account some combined measure of error." This definition emphasizes the simultaneous correctness of a set of inferences and the need to account for selection effects in statistical analysis.

Hypothesis Results
H1 ...
H2 ...

When conducting multiple hypothesis tests, several outcomes may occur. For example, assuming there are m hypotheses, the number of true null hypotheses and the number of false positives (true null hypotheses that are wrongly rejected) will affect the final statistical conclusion.

The core of family-wise error rate control is to limit the probability of making at least one Type I error across the whole family of tests.
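To see why this probability needs controlling, the following minimal sketch may help. It assumes that the m tests are independent and that every null hypothesis is true, which is an illustrative assumption and not something the article states. Under that assumption the chance of at least one Type I error is 1 - (1 - alpha)^m. The function name `uncorrected_fwer` is hypothetical.

```python
def uncorrected_fwer(alpha: float, m: int) -> float:
    """Probability of at least one Type I error across m independent tests,
    each run at level alpha, when every null hypothesis is true.

    Assumption (not from the article): the tests are mutually independent.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    # Show how quickly the uncorrected rate grows with the number of tests.
    for m in (1, 5, 20):
        print(m, round(uncorrected_fwer(0.05, m), 4))
```

With alpha = 0.05, the rate is 0.05 for a single test but grows to roughly 0.23 for five tests and roughly 0.64 for twenty, which is the inflation that the procedures discussed in this article are designed to prevent.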

There are several traditional methods for controlling the family-wise error rate. The most well-known include:

  • Bonferroni procedure
  • Šidák procedure
  • Tukey's procedure
  • Holm's step-down procedure
  • Hochberg’s step-up procedure

Take the Bonferroni procedure as an example. It is a very simple method that controls the family-wise error rate by dividing the overall significance level by the total number of tests and using the result as the threshold for each individual test.
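The Bonferroni rule can be sketched in a few lines. This is an illustrative sketch, not a reference implementation: the function name `bonferroni_reject` and the example p-values are hypothetical, and 0.05 is used only as a conventional default level.

```python
from typing import List, Sequence


def bonferroni_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return one flag per hypothesis: True if it is rejected.

    Each p-value is compared with alpha / m, where m is the number of tests.
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m  # per-test significance level
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    # Hypothetical p-values from four tests; the per-test threshold is 0.0125.
    print(bonferroni_reject([0.001, 0.012, 0.03, 0.2]))
```

In this example only the two smallest p-values fall at or below 0.05 / 4 = 0.0125, so only those two hypotheses are rejected.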

Holm's step-down procedure is uniformly more powerful than the Bonferroni procedure, and it still controls the family-wise error rate across all of the hypotheses being tested.
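A minimal sketch of the step-down idea follows, assuming the usual formulation in which the sorted p-values are compared with alpha / m, alpha / (m - 1), and so on, stopping at the first failure. The function name `holm_reject` and the example p-values are hypothetical.

```python
from typing import List, Sequence


def holm_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Step-down procedure: return one rejection flag per hypothesis,
    in the same order as the input p-values."""
    m = len(p_values)
    # Indices of the p-values, from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank is 0 for the smallest p-value
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # first failure: this and all larger p-values are retained
    return reject


if __name__ == "__main__":
    # Hypothetical p-values, deliberately given in unsorted order.
    print(holm_reject([0.02, 0.2, 0.001, 0.015]))
```

For these four p-values the successive thresholds are 0.0125, about 0.0167, 0.025 and 0.05, so the three smallest p-values are rejected. A single Bonferroni threshold of 0.05 / 4 = 0.0125 would reject only the smallest one, which illustrates the gain in power.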

When testing hypotheses, statisticians also need to consider dependencies between tests. Traditional methods such as Bonferroni and Holm control the family-wise error rate under any dependence structure among the tests, which makes them relatively conservative.

However, the conservative nature of these methods also means that they can lose power when the tests follow a particular dependency structure, for example when they are positively correlated. In some cases, resampling strategies such as bootstrap and permutation methods can exploit that structure to control the error rate while improving detection performance.

Compared with False Discovery Rate (FDR) control, family-wise error rate control provides more stringent protection against Type I errors, at the cost of lower power to detect real effects.
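The difference in stringency can be illustrated with a small comparison. The sketch below sets a Bonferroni rule (family-wise control) beside the Benjamini-Hochberg step-up rule, a widely used FDR procedure that this article does not itself name; it is brought in here only as an example and is usually stated for independent or positively dependent tests. All function names and p-values are hypothetical.

```python
from typing import List, Sequence


def bonferroni_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Family-wise control: compare every p-value with alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values] if m else []


def benjamini_hochberg_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """FDR control (illustrative): reject the k smallest p-values, where k is
    the largest 1-based rank with p_(k) <= (k / m) * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0  # 0 means that nothing is rejected
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


if __name__ == "__main__":
    p = [0.001, 0.015, 0.02, 0.03, 0.2]  # hypothetical p-values
    print(sum(bonferroni_reject(p)), sum(benjamini_hochberg_reject(p)))
```

On these five p-values the Bonferroni rule rejects one hypothesis while the FDR rule rejects four. The stricter family-wise guarantee is paid for with fewer discoveries.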

It is worth noting that each method has its own strengths and weaknesses in controlling error rates. It is crucial to choose an appropriate control strategy based on the background of the research and the characteristics of the hypotheses. Furthermore, controlling the family-wise error rate is often part of a wider effort to reduce uncertainty and decision-making risk, which is essential in scientific research.

In the long term, balancing control of error rates against the ability to detect real effects will continue to be a challenge in statistical research. In this context, John Tukey’s innovation deserves our reflection: how will its influence on data science continue to evolve?
