Benjamini-Hochberg Correction Calculator
Control the false discovery rate across many hypothesis tests. Paste your p-values, choose an FDR level, and instantly see which findings remain significant after Benjamini-Hochberg correction.
Separate the p-values with commas, spaces, tabs, or new lines. Each value must be between 0 and 1.
What a Benjamini-Hochberg correction calculator does
A Benjamini-Hochberg correction calculator helps researchers evaluate many statistical tests at once without being overwhelmed by false positives. When you run one hypothesis test at a 0.05 significance level, the chance of a false positive is held at 5 percent when the null hypothesis is true and the test's assumptions hold. But when you run 20, 100, or 10,000 tests, the probability that at least one result appears significant just by chance rises sharply. This is the multiple testing problem. The Benjamini-Hochberg procedure, often abbreviated BH, addresses it by controlling the false discovery rate, or FDR, rather than the stricter family-wise error rate controlled by methods like Bonferroni.
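To make the size of the problem concrete, the short Python sketch below computes the chance of at least one false positive when every null hypothesis is true. It assumes the tests are independent, which is a simplification, and the function name `familywise_error_rate` is only illustrative.

```python
def familywise_error_rate(m: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across m tests.

    Assumes all m null hypotheses are true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** m


for m in (1, 20, 100, 10_000):
    rate = familywise_error_rate(m)
    print(f"m = {m:>6}: P(at least one false positive) = {rate:.4f}")
```

Under these assumptions the probability is 5 percent for a single test, roughly 64 percent for 20 tests, and above 99 percent for 100 tests.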
In practical terms, the BH method sorts p-values from smallest to largest and compares each one to an increasing threshold defined by rank. The threshold for the ith smallest p-value among m tests is (i / m) × alpha. The procedure finds the largest p-value that is at or below its own threshold and declares that p-value, and all smaller ones, significant, even if some of the smaller ones exceed their own thresholds. A Benjamini-Hochberg correction calculator automates these steps and usually also reports BH adjusted p-values, sometimes called q-values in casual usage, although exact terminology can vary by software package.
Why this matters: In genomics, proteomics, neuroimaging, survey analytics, A/B testing, and any high-dimensional analysis, failing to correct for multiple comparisons can produce a long list of exciting but unreliable findings. BH offers a balance between discovery and caution.
Why false discovery rate control is often preferred
Traditional methods like the Bonferroni correction control the probability of making even one false rejection across the full family of tests. That is extremely conservative when thousands of tests are examined. If you test 1,000 hypotheses and use Bonferroni at an overall alpha of 0.05, the per-test threshold becomes 0.00005. That can be appropriate in safety-critical settings, but in exploratory biology, psychology, economics, and machine learning research, it may be too strict and may hide meaningful signals.
Benjamini-Hochberg instead controls the expected proportion of false discoveries among the rejected hypotheses. For example, with FDR set to 0.05, the procedure aims to keep the expected proportion of false positives among all declared significant results at or below 5 percent, under its assumptions. That distinction is powerful. It means researchers can identify more signals while still maintaining a disciplined approach to error control.
| Number of tests | Nominal alpha | Bonferroni cutoff | BH cutoff for rank 1 | BH cutoff for rank 10 | BH cutoff for rank 100 |
|---|---|---|---|---|---|
| 20 | 0.05 | 0.00250 | 0.00250 | 0.02500 | Not applicable |
| 100 | 0.05 | 0.00050 | 0.00050 | 0.00500 | 0.05000 |
| 1,000 | 0.05 | 0.00005 | 0.00005 | 0.00050 | 0.00500 |
The table shows why BH is attractive. For the first-ranked p-value, BH and Bonferroni start identically. But as rank increases, BH allows a more realistic threshold for declaring discoveries. That is exactly the point: to reward stronger overall evidence while managing the expected fraction of false findings.
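The cutoffs in the table can be recomputed with a few lines of Python. This is an illustrative sketch; the helper names `bonferroni_cutoff` and `bh_cutoff` are assumptions made for this example and do not come from any particular library.

```python
def bonferroni_cutoff(m: int, alpha: float = 0.05) -> float:
    """Per-test threshold under Bonferroni: alpha divided by the number of tests."""
    return alpha / m


def bh_cutoff(rank: int, m: int, alpha: float = 0.05) -> float:
    """BH critical value for the p-value at a given rank: (rank / m) * alpha."""
    if not 1 <= rank <= m:
        raise ValueError("rank must be between 1 and m")
    return rank / m * alpha


for m in (20, 100, 1000):
    # Ranks beyond m do not exist, so they are shown as "n/a".
    cells = [f"{bh_cutoff(r, m):.5f}" if r <= m else "n/a" for r in (1, 10, 100)]
    print(m, f"{bonferroni_cutoff(m):.5f}", *cells)
```

The BH cutoff at rank 1 always equals the Bonferroni cutoff, and the BH cutoff at rank m always equals alpha itself.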
How the Benjamini-Hochberg procedure works step by step
- Collect all p-values from the family of tests you want to evaluate together.
- Sort the p-values from smallest to largest.
- Assign ranks from 1 to m, where m is the total number of tests.
- Choose your FDR level, commonly 0.05 or 0.10.
- Compute the BH critical value for each rank using (i / m) × alpha.
- Find the largest rank where the sorted p-value is less than or equal to its BH critical value.
- Reject that hypothesis and all hypotheses with smaller p-values.
- Optionally compute BH adjusted p-values by applying the monotone step-up adjustment.
Suppose you have 10 p-values and set alpha to 0.05. The BH thresholds would be 0.005, 0.010, 0.015, 0.020, 0.025, and so on up to 0.050 for rank 10. If your sorted p-values are 0.001, 0.004, 0.009, 0.020, 0.031, 0.045, 0.080, 0.120, 0.210, and 0.350, then the first four are at or below their corresponding thresholds, while the fifth and every later one are above theirs. Rank 4 is therefore the largest passing rank, and the result is four significant discoveries. That is the same example loaded into the calculator above.
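The steps above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: the function name `bh_reject` and the small tolerance used for exact ties (such as 0.020 against a threshold of 0.020) are assumptions made here to avoid floating-point surprises.

```python
import math


def bh_reject(p_values, alpha=0.05):
    """Return one boolean per input p-value: True where BH rejects the null."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda idx: p_values[idx])

    # Step-up search: remember the largest rank whose p-value is at or
    # below its critical value (rank / m) * alpha.
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        critical = rank / m * alpha
        p = p_values[idx]
        # isclose is an assumed safeguard so exact ties count as passing.
        if p <= critical or math.isclose(p, critical, rel_tol=1e-9):
            largest_passing_rank = rank

    # Reject that hypothesis and every hypothesis with a smaller p-value.
    reject = [False] * m
    for idx in order[:largest_passing_rank]:
        reject[idx] = True
    return reject


example = [0.001, 0.004, 0.009, 0.020, 0.031,
           0.045, 0.080, 0.120, 0.210, 0.350]
decisions = bh_reject(example, alpha=0.05)
print(sum(decisions), "discoveries")
```

Because the search keeps the largest passing rank rather than stopping at the first failure, a p-value can be rejected even when it misses its own threshold. For instance, with the two p-values 0.04 and 0.03, the smaller one exceeds its threshold of 0.025, but the larger one meets its threshold of 0.05, so both are rejected.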
Adjusted p-values in Benjamini-Hochberg
Many analysts prefer adjusted p-values because they are easy to compare against a chosen FDR level. The BH adjusted p-value for each test starts from the sorted p-value multiplied by m / i. A monotonicity correction then moves from the largest rank back to the smallest, replacing each value with the minimum of itself and the adjusted values at all higher ranks, and values are capped at 1. This ensures adjusted p-values do not decrease as rank increases. If an adjusted p-value is less than or equal to your selected alpha, that test is significant under the BH procedure.
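A minimal sketch of this adjustment in Python is shown below. The function name `bh_adjust` is hypothetical, and capping the result at 1 is an assumption that matches common practice rather than something specific to this calculator.

```python
def bh_adjust(p_values):
    """Return BH adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda idx: p_values[idx])

    adjusted = [0.0] * m
    running_min = 1.0  # assumed cap: adjusted p-values never exceed 1
    # Walk from the largest rank down to rank 1 so each adjusted value is
    # the minimum of its own p * m / rank and all higher-rank values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


example = [0.001, 0.004, 0.009, 0.020, 0.031,
           0.045, 0.080, 0.120, 0.210, 0.350]
print([round(value, 4) for value in bh_adjust(example)])
```

For these ten p-values the adjusted values come out to approximately 0.01, 0.02, 0.03, 0.05, 0.062, 0.075, 0.114, 0.15, 0.233, and 0.35. The monotonicity step matters for inputs such as 0.01, 0.04, 0.03, where the raw products would be 0.03, 0.04, and 0.045 by rank and the backward pass lowers the rank 2 value to 0.04.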
Worked example with real numbers
Consider the following 10 p-values from a hypothetical screening experiment: 0.001, 0.004, 0.009, 0.020, 0.031, 0.045, 0.080, 0.120, 0.210, and 0.350. At FDR 0.05, the BH thresholds by rank are 0.005, 0.010, 0.015, 0.020, 0.025, 0.030, 0.035, 0.040, 0.045, and 0.050.
| Rank | Sorted p-value | BH threshold at FDR 0.05 | Passes threshold? |
|---|---|---|---|
| 1 | 0.001 | 0.005 | Yes |
| 2 | 0.004 | 0.010 | Yes |
| 3 | 0.009 | 0.015 | Yes |
| 4 | 0.020 | 0.020 | Yes |
| 5 | 0.031 | 0.025 | No |
| 6 | 0.045 | 0.030 | No |
| 7 | 0.080 | 0.035 | No |
| 8 | 0.120 | 0.040 | No |
| 9 | 0.210 | 0.045 | No |
| 10 | 0.350 | 0.050 | No |
Because rank 4 is the largest rank that still meets the criterion, the first four hypotheses are declared significant. This example demonstrates a common BH outcome: some moderate p-values remain significant when there is a strong cluster of small p-values, whereas a strict Bonferroni threshold of 0.005 would keep only the first two discoveries.
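The comparison with Bonferroni can be checked with a short Python sketch. The function names are illustrative, and the tiny tolerance added to the BH comparison is an assumption so that the exact tie at rank 4 is treated as passing.

```python
def count_bonferroni(p_values, alpha=0.05):
    """Number of p-values at or below the Bonferroni threshold alpha / m."""
    m = len(p_values)
    return sum(p <= alpha / m for p in p_values)


def count_bh(p_values, alpha=0.05):
    """Number of BH discoveries: the largest rank with p <= (rank / m) * alpha."""
    m = len(p_values)
    ranked = sorted(p_values)
    # Comparing p * m with rank * alpha is the same test, rearranged;
    # the 1e-12 slack is an assumed guard for exact ties.
    passing = [rank for rank, p in enumerate(ranked, start=1)
               if p * m <= rank * alpha + 1e-12]
    return max(passing, default=0)


example = [0.001, 0.004, 0.009, 0.020, 0.031,
           0.045, 0.080, 0.120, 0.210, 0.350]
print("Bonferroni:", count_bonferroni(example))
print("BH:", count_bh(example))
```

On the worked example this reports two discoveries for Bonferroni and four for BH, in line with the table above.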
When to use a Benjamini-Hochberg correction calculator
- Genomics and transcriptomics: microarray and RNA-seq studies often involve testing thousands of genes simultaneously.
- Proteomics and metabolomics: broad molecular panels can produce massive multiple comparison burdens.
- Neuroimaging: voxel-wise or region-wise analyses involve a very large number of related tests.
- A/B testing at scale: large digital experimentation programs may evaluate many metrics at once.
- Survey and social science research: subgroup and item-level analyses often multiply the number of inferential tests.
- Machine learning feature screening: univariate statistical screening across many candidate predictors can benefit from FDR control.
Benjamini-Hochberg versus Bonferroni
BH and Bonferroni are not competitors in every situation. They answer different scientific questions. Bonferroni is well suited to cases where even a single false positive is very costly, as in confirmatory analyses, safety assessments, some regulatory settings, or studies that trigger expensive follow-up experiments. Benjamini-Hochberg is often preferred when the main goal is to discover promising signals while still maintaining a disciplined and transparent false positive framework.
Practical tradeoffs
- Power: BH usually has more power than Bonferroni.
- Error metric: Bonferroni controls family-wise error rate, while BH controls false discovery rate.
- Interpretation: BH lets you tolerate a controlled expected fraction of false discoveries among rejected tests.
- Use cases: BH is common in exploratory high-dimensional research; Bonferroni is common in confirmatory settings.
Important assumptions and limitations
The classic BH procedure is guaranteed to control FDR under independence and under certain forms of positive dependence among tests, formally known as positive regression dependence on a subset. In many practical applications, it performs reasonably well even outside ideal assumptions, but analysts should be cautious. Highly complex dependence structures can affect error control. There are related procedures, such as Benjamini-Yekutieli, that are more conservative and remain valid under arbitrary dependence.
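To show how much more conservative Benjamini-Yekutieli is, the sketch below computes its critical values. It relies on the standard form of that procedure, which divides each BH critical value by the harmonic sum 1 + 1/2 + ... + 1/m. The function names are hypothetical, and this calculator is not assumed to implement the procedure.

```python
def by_thresholds(m, alpha=0.05):
    """Benjamini-Yekutieli critical values: (rank / m) * alpha / c(m)."""
    # c(m) is the harmonic sum 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    return [rank / m * alpha / c_m for rank in range(1, m + 1)]


def count_by(p_values, alpha=0.05):
    """Number of discoveries under the Benjamini-Yekutieli step-up rule."""
    ranked = sorted(p_values)
    thresholds = by_thresholds(len(ranked), alpha)
    passing = [rank
               for rank, (p, t) in enumerate(zip(ranked, thresholds), start=1)
               if p <= t]
    return max(passing, default=0)


example = [0.001, 0.004, 0.009, 0.020, 0.031,
           0.045, 0.080, 0.120, 0.210, 0.350]
print("BY discoveries:", count_by(example))
```

With ten tests the harmonic sum is about 2.93, so every critical value shrinks by roughly that factor. On the ten example p-values used elsewhere on this page, only the smallest one stays at or below its Benjamini-Yekutieli threshold.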
Another key issue is defining the family of tests correctly. If you only correct a subset of hypotheses after looking at the data, you may underestimate the true multiplicity burden. The family should reflect the complete set of inferential opportunities relevant to the claim. In addition, p-values themselves must be valid. A sophisticated correction cannot rescue a flawed model, poor randomization, selective reporting, or a violated test assumption.
How to interpret your calculator output
This calculator returns four pieces of information that are especially useful. First, it counts the total number of tests entered. Second, it reports the selected FDR alpha. Third, it identifies how many hypotheses are significant after correction. Fourth, it lists adjusted p-values and a decision for each test. If the adjusted p-value for a given test is less than or equal to your alpha, then that result is significant under BH.
The chart adds another layer of intuition. It plots sorted p-values and the BH critical line. Find the last point, reading from left to right, that lies on or below the critical line: that p-value and every p-value ranked before it are discoveries, even if some earlier points sit above the line. This visualization is especially helpful when many p-values cluster near the threshold because it shows whether significance is driven by a clear block of strong signals or by marginal values.
Best practices for using BH in real research
- Predefine the family of tests whenever possible.
- Choose the FDR level based on the scientific context, not on the observed output.
- Report both raw and adjusted p-values for transparency.
- Describe the exact correction used, such as Benjamini-Hochberg at FDR 0.05.
- Complement significance with effect sizes and confidence intervals.
- Be clear about whether results are exploratory or confirmatory.
Authoritative references and further reading
For readers who want formal statistical background and applied guidance, the following sources are useful:
- Purdue University copy of the original Benjamini and Hochberg 1995 paper
- National Institutes of Health article on false discovery rate concepts and applications
- Penn State STAT 555 explanation of the Benjamini-Hochberg procedure
Final takeaway
A Benjamini-Hochberg correction calculator is a practical tool for modern data analysis. It lets you manage large sets of simultaneous hypothesis tests without falling into the trap of either excessive false positives or excessive conservatism. If your work involves many parallel comparisons, BH is often the right middle ground. Use it thoughtfully, define your test family clearly, and interpret your corrected results alongside effect sizes, study design quality, and subject-matter knowledge.