Bayesian Sample Size Calculator
Estimate the minimum sample size needed for a binary outcome study using a Bayesian credible interval approach. This calculator combines your prior belief with an expected event rate, then finds the smallest sample that achieves your target posterior margin of error at the selected credible level.
How this calculator works
For a binary endpoint, the posterior under a Beta prior and Binomial likelihood is also Beta. This page uses the expected proportion to approximate the future posterior variance at each candidate sample size, then selects the smallest n where the posterior half-width is at or below your target margin.
Posterior precision vs. sample size
The chart shows how the estimated posterior margin of error shrinks as your sample increases. The horizontal target line marks your desired precision threshold.
Expert guide to using a Bayesian sample size calculator
A Bayesian sample size calculator helps researchers, product analysts, healthcare teams, and survey designers answer a practical question before data collection starts: how many observations are needed to make a decision with the desired level of certainty? In classical statistics, sample size planning often revolves around power, alpha, and a null hypothesis. In a Bayesian design, the focus shifts to posterior precision, decision thresholds, predictive probability, and the role of prior knowledge. That makes Bayesian planning especially useful when you already have historical information, pilot data, expert judgment, or operational constraints that should influence the study design.
This calculator is built for one of the most common Bayesian planning problems: estimating a binary proportion. Examples include the conversion rate of a landing page, the response rate to an outreach campaign, the prevalence of a condition in a target population, the probability of device failure, or the adverse event rate in a clinical setting. If the outcome can be counted as success or failure, yes or no, event or no event, then a Beta-Binomial model is often a natural planning choice.
In this setup, your prior belief about the true proportion is represented by a Beta distribution with parameters alpha and beta. Once data are collected, the posterior distribution remains Beta, which makes interpretation straightforward. The posterior combines two sources of information: your prior and the observed sample. Because the model is mathematically convenient and intuitively clear, it is widely used for practical planning in medicine, online experimentation, quality control, and social science.
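The conjugate update described above can be sketched in a few lines. This is a minimal illustration, not the calculator's own code: the function names `update_beta` and `beta_mean` are hypothetical, and the counts (12 events in 100 observations) are made-up example data.

```python
def update_beta(prior_alpha, prior_beta, successes, failures):
    """Conjugate update: Beta prior plus Binomial data gives a Beta posterior."""
    return prior_alpha + successes, prior_beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 12 events in 100 observations, uniform Beta(1, 1) prior.
post_alpha, post_beta = update_beta(1.0, 1.0, successes=12, failures=88)
posterior_mean = beta_mean(post_alpha, post_beta)

print(f"Posterior: Beta({post_alpha:g}, {post_beta:g}), mean = {posterior_mean:.4f}")
```

The posterior mean sits slightly above the raw rate of 0.12 because the uniform prior pulls the estimate a little toward 0.50.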
What the calculator actually computes
The calculator uses an expected future proportion to approximate the posterior distribution you are likely to obtain at a candidate sample size. For each possible sample size n, it computes the expected posterior Beta parameters:
- Posterior alpha = prior alpha + n × expected proportion
- Posterior beta = prior beta + n × (1 − expected proportion)
From those values, the calculator computes the posterior variance, alpha × beta / ((alpha + beta)² × (alpha + beta + 1)), with alpha and beta taken as the posterior parameters. It then approximates the posterior half-width, or margin of error, as the normal critical value for the selected credible level multiplied by the posterior standard deviation. It searches for the smallest sample size where the posterior half-width is less than or equal to your target. In simple terms, it finds the minimum n needed so your posterior interval is tight enough for the level of precision you want.
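The search described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the calculator's actual source: the function names, the default uniform Beta(1, 1) prior, the `max_n` cap, and the simple linear search are all choices made here for clarity.

```python
import math
from statistics import NormalDist


def posterior_half_width(n, expected_p, prior_alpha, prior_beta, level):
    """Approximate half-width of the credible interval at sample size n."""
    # Expected posterior parameters for a candidate sample size.
    alpha = prior_alpha + n * expected_p
    beta = prior_beta + n * (1.0 - expected_p)
    total = alpha + beta
    # Variance of a Beta(alpha, beta) distribution.
    variance = alpha * beta / (total ** 2 * (total + 1.0))
    # Two-sided normal critical value, about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * math.sqrt(variance)


def min_sample_size(expected_p, margin, prior_alpha=1.0, prior_beta=1.0,
                    level=0.95, max_n=1_000_000):
    """Smallest n whose approximate half-width is at or below the margin."""
    for n in range(max_n + 1):
        width = posterior_half_width(n, expected_p, prior_alpha, prior_beta, level)
        if width <= margin:
            return n
    raise ValueError("target margin not reached within max_n")


print(min_sample_size(expected_p=0.50, margin=0.05))
```

A linear scan is slow for very tight margins but keeps the logic transparent and always returns the first n that meets the target, even when a prior that disagrees with the expected proportion makes the half-width non-monotone for small n.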
A key benefit of Bayesian planning is that the prior can reduce the required sample size when there is credible previous evidence. A weak or neutral prior leaves the data to do most of the work. A stronger prior contributes more information, which can narrow the posterior interval even before many new observations are collected. As a rule of thumb, a Beta(alpha, beta) prior carries about as much weight as alpha + beta earlier observations.
How to interpret the main inputs
- Expected proportion: This is your best planning estimate for the event rate. If you expect a 12% adverse event rate, enter 0.12. If you have no better forecast, 0.50 is the conservative choice because the variance of a binary outcome, p(1 − p), peaks at 0.50.
- Target margin of error: This is the half-width of the posterior credible interval. For example, a margin of error of 0.05 means you want the posterior interval to extend roughly plus or minus 5 percentage points around the posterior mean.
- Prior alpha and beta: These define the prior distribution. Beta(1,1) is uniform, Beta(0.5,0.5) is Jeffreys, and Beta(5,5) is a more concentrated skeptical prior centered at 0.50.
- Credible level: A 95% credible interval is the most common default. A 99% interval is wider and therefore usually requires a larger sample.
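To see how the credible level enters the calculation, the normal critical value can be derived from the level using only the Python standard library. This is a sketch: `critical_value` is a hypothetical helper name, and it assumes the two-sided normal approximation rather than exact Beta quantiles.

```python
from statistics import NormalDist


def critical_value(level):
    """Two-sided standard normal critical value for a credible level in (0, 1)."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    # A central interval of mass `level` leaves (1 - level) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + level / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_value(level):.3f}")
```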
Why Bayesian sample size planning is different from frequentist planning
Frequentist sample size calculations typically target power for rejecting a null hypothesis. Bayesian sample size planning often targets a posterior quantity instead: credible interval width, posterior probability of superiority, expected loss, or predictive probability of success. This difference matters because many real decisions are not fundamentally about whether a p-value crosses a threshold. They are about whether the estimate is precise enough to act on, whether the probability of benefit is high enough, or whether uncertainty has been reduced to an acceptable level.
Suppose a health system wants to estimate the prevalence of hypertension in a local program population. It may care less about rejecting a null prevalence and more about obtaining a posterior interval narrow enough to support staffing decisions, medication inventory planning, and outreach budgeting. A Bayesian precision-based sample size is often more aligned with that objective than a standard power calculation.
Reference rates and how they affect sample size
Real-world planning often starts with benchmark prevalence or event rates from trusted public sources. The table below shows several public health examples along with the implications for precision planning. The rates are illustrative planning anchors based on widely reported U.S. surveillance figures and should be updated for your exact population before final study design.
| Indicator | Example rate | Public source | Approximate n for a ±0.05 margin at the 95% credible level (weak prior) |
|---|---|---|---|
| Adult cigarette smoking prevalence | 11.5% | CDC adult smoking estimates | About 157 |
| Diagnosed diabetes among U.S. adults | About 11.6% | CDC diabetes statistics | About 158 |
| Hypertension prevalence among U.S. adults | About 48.1% | CDC heart disease and blood pressure data | About 381 |
The required sample size is much larger when the expected proportion is close to 0.50 because the variance of a binary outcome, p(1 − p), peaks there. Rare or very common events generally need fewer observations to achieve the same absolute margin of error. That is one of the simplest and most important planning insights from both Bayesian and frequentist proportion estimation.
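A short sketch makes the effect of the expected proportion concrete by comparing p(1 − p) with its peak value at 0.50. The helper name `bernoulli_variance` and the list of example proportions are chosen here for illustration.

```python
def bernoulli_variance(p):
    """Variance of a single binary outcome with event probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return p * (1.0 - p)


peak = bernoulli_variance(0.50)
for p in (0.05, 0.115, 0.25, 0.481, 0.50):
    variance = bernoulli_variance(p)
    print(f"p = {p:.3f}: variance = {variance:.4f} ({variance / peak:.0%} of the peak)")
```

Because the required n is roughly proportional to this variance, a rate of 11.5% needs only about 41% of the observations that a rate near 50% does for the same absolute margin.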
Credible level and precision trade-offs
Your choice of credible level directly influences required sample size. A higher credible level produces a wider interval from the same data, so more data are needed to reach the same margin. The following comparison summarizes the practical trade-off if everything else is held constant and the expected proportion is 0.50 under a weak prior.
| Credible level | Normal critical value used in approximation | Approximate n for target margin ±0.05 | Relative increase vs. 90% |
|---|---|---|---|
| 90% | 1.645 | About 268 | Baseline |
| 95% | 1.960 | About 382 | About 43% higher |
| 99% | 2.576 | About 661 | About 147% higher |
This is why it is useful to define precision in operational terms before locking in the highest possible interval level. If your decision can be made comfortably with a 90% credible interval, requiring 99% may create a costly and unnecessary expansion in sample size.
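The trade-off can be reproduced with a closed-form shortcut that holds in one special case: an expected proportion of exactly 0.50 with a symmetric Beta(a, a) prior, where the posterior variance simplifies to 1 / (4 × (n + 2a + 1)). The sketch below assumes the weak prior is a uniform Beta(1, 1) and uses the rounded critical values from the table; the function name is hypothetical, and results can differ by a few observations from rounded published figures depending on the prior and the rounding rule.

```python
import math


def n_at_half(z, margin, prior_a=1.0):
    """Smallest n for expected proportion 0.50 and a symmetric Beta(a, a) prior."""
    # Solve z / (2 * sqrt(n + 2a + 1)) <= margin for n, then round up.
    return max(0, math.ceil((z / (2.0 * margin)) ** 2 - (2.0 * prior_a + 1.0)))


critical_values = {"90%": 1.645, "95%": 1.960, "99%": 2.576}
sizes = {label: n_at_half(z, margin=0.05) for label, z in critical_values.items()}

baseline = sizes["90%"]
for label, n in sizes.items():
    print(f"{label}: n = {n} ({n / baseline - 1.0:+.0%} vs. 90%)")
```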
Choosing a prior responsibly
One of the biggest strengths of Bayesian methods is also one of the most misunderstood: the prior. A prior is not a shortcut for bias. It is a formal way to represent information that already exists before the current sample is observed. That information might come from earlier trials, quality logs, registry data, observational studies, or subject-matter expertise. In many regulated or high-stakes settings, documenting the rationale for the prior is essential.
- Uniform prior Beta(1,1): often used when you want a mild, neutral starting point.
- Jeffreys prior Beta(0.5,0.5): popular because of good invariance properties. It carries less total weight than the uniform prior and places less mass at the center, with more near 0 and 1.
- Skeptical prior Beta(5,5): useful when you want to resist extreme conclusions without strong data.
- Optimistic prior Beta(8,2): reflects a prior mean of 0.80 and more concentrated prior belief.
If your prior is strong and well justified, it can reduce the new sample needed to achieve a given level of posterior precision. If the prior is weak or disputed, a more neutral choice is usually safer. For publication, internal governance, or regulatory review, include both the main prior and a sensitivity analysis under at least one alternative prior.
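A prior sensitivity check of the kind recommended above might look like the following sketch. The priors are the four listed in this section; everything else (the function name `required_n`, the expected proportion of 0.50, the ±0.05 margin, and the 95% level) is an assumption chosen for illustration.

```python
import math
from statistics import NormalDist


def required_n(expected_p, margin, prior_alpha, prior_beta, level=0.95):
    """Smallest n meeting the margin under the normal approximation."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    n = 0
    while True:
        a = prior_alpha + n * expected_p
        b = prior_beta + n * (1.0 - expected_p)
        sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1.0)))
        if z * sd <= margin:
            return n
        n += 1


priors = {
    "Uniform Beta(1,1)": (1.0, 1.0),
    "Jeffreys Beta(0.5,0.5)": (0.5, 0.5),
    "Skeptical Beta(5,5)": (5.0, 5.0),
    "Optimistic Beta(8,2)": (8.0, 2.0),
}
results = {name: required_n(0.50, 0.05, a, b) for name, (a, b) in priors.items()}

for name, n in results.items():
    print(f"{name}: n = {n}")
```

With weak priors the differences are only a handful of observations. A prior has to carry far more weight than Beta(5,5) before it changes the planning number materially, which is exactly when its justification needs the most scrutiny.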
Where Bayesian sample size calculators are most useful
- Clinical pilot studies estimating response or adverse event rates
- A/B tests and conversion optimization for websites and apps
- Manufacturing quality checks for defect proportions
- Public health prevalence surveys and screening studies
- Reliability studies where a binary pass or fail endpoint matters
- Education research measuring uptake, completion, or yes/no outcomes
Common mistakes to avoid
- Using an unrealistic expected proportion: If your expected rate is too high or too low, your projected sample size may be misleading. When uncertain, run multiple scenarios.
- Confusing absolute and relative precision: A margin of error of 0.05 means 5 percentage points, not 5% of the proportion.
- Ignoring missing data or attrition: If you expect 10% nonresponse, inflate the recruited sample accordingly, for example by dividing the required n by 0.90.
- Overstating the prior: A strong prior can dominate a small new sample. Only use it when the prior evidence is defensible and relevant.
- Assuming one scenario covers all stakeholders: Decision-makers may want a range of assumptions, not a single number.
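The attrition adjustment mentioned in the list above is simple arithmetic. A minimal sketch, assuming attrition is independent of the outcome; the helper name `inflate_for_attrition` is hypothetical, and the 381 figure is taken from the reference table as an example input.

```python
import math


def inflate_for_attrition(required_n, attrition_rate):
    """Number to recruit so that about required_n complete observations remain."""
    if not 0.0 <= attrition_rate < 1.0:
        raise ValueError("attrition_rate must be in [0, 1)")
    # The small tolerance guards against floating-point noise before rounding up.
    return math.ceil(required_n / (1.0 - attrition_rate) - 1e-9)


recruit = inflate_for_attrition(381, 0.10)
print(recruit)
```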
How to build a robust planning workflow
A practical Bayesian sample size workflow usually follows five steps. First, define the decision the study is supposed to support. Second, identify the parameter of interest, such as a conversion rate or prevalence. Third, select plausible prior distributions based on available evidence. Fourth, choose a precision target or a decision threshold that reflects the real action point. Fifth, test sensitivity by varying the prior, the expected proportion, and the credible level.
That final step matters more than many teams realize. In real projects, the required sample is not a single immutable truth. It is conditional on assumptions. A high-quality planning document often includes best-case, base-case, and conservative scenarios so stakeholders understand how robust the design is.
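One way to produce such a scenario table is sketched below. The three scenarios and their values are hypothetical placeholders, not recommendations, and the `Scenario` class and `required_n` function are names introduced here; the calculation uses the same normal approximation to the Beta posterior described earlier.

```python
import math
from dataclasses import dataclass
from statistics import NormalDist


@dataclass
class Scenario:
    name: str
    expected_p: float
    level: float
    prior_alpha: float = 1.0  # uniform prior unless overridden
    prior_beta: float = 1.0


def required_n(scenario, margin):
    """Smallest n whose approximate posterior half-width meets the margin."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    z = NormalDist().inv_cdf(0.5 + scenario.level / 2.0)
    n = 0
    while True:
        a = scenario.prior_alpha + n * scenario.expected_p
        b = scenario.prior_beta + n * (1.0 - scenario.expected_p)
        if z * math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1.0))) <= margin:
            return n
        n += 1


# Hypothetical planning scenarios for a target margin of 0.05.
scenarios = [
    Scenario("best case", expected_p=0.30, level=0.90),
    Scenario("base case", expected_p=0.40, level=0.95),
    Scenario("conservative", expected_p=0.50, level=0.99),
]
plan = {s.name: required_n(s, margin=0.05) for s in scenarios}

for name, n in plan.items():
    print(f"{name}: n = {n}")
```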
Example interpretation
Imagine you are planning a patient survey and expect that 48% of respondents will report a specific care coordination issue. You choose a weak prior, a 95% credible level, and a desired margin of error of plus or minus 0.05. The calculator will return a required sample close to the familiar figure of about 380 observations. If you tighten the target margin to plus or minus 0.03, the required sample rises to roughly 1,060, nearly three times as many, because the required n grows approximately with the inverse square of the margin. That is not a flaw in Bayesian methods. It reflects the nonlinear cost of demanding greater precision.
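The arithmetic behind this example can be written out with a rough approximation. It assumes the prior is weak enough that the posterior mean stays close to the expected proportion p. The symbols are introduced here for illustration: z is the normal critical value, m the target margin, and α₀, β₀ the prior parameters.

```latex
m \approx z \sqrt{\frac{p(1-p)}{n + \alpha_0 + \beta_0 + 1}}
\quad\Longrightarrow\quad
n \approx \frac{z^2\, p(1-p)}{m^2} - (\alpha_0 + \beta_0 + 1)

% p = 0.48, z = 1.96, uniform prior (\alpha_0 = \beta_0 = 1):
n_{0.05} \approx \frac{1.96^2 \times 0.2496}{0.05^2} - 3 \approx 381,
\qquad
n_{0.03} \approx \frac{1.96^2 \times 0.2496}{0.03^2} - 3 \approx 1062
```

The ratio of the two leading terms is (0.05 / 0.03)² ≈ 2.78, so cutting the margin by 40% nearly triples the sample.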
Authoritative sources for planning assumptions and statistical context
For public health and biomedical planning assumptions, review current estimates from CDC.gov, guidance and evidence resources from the National Institutes of Health, and statistical education materials from Penn State’s online statistics resources. If your study is regulatory or clinical, pair this calculator with protocol-specific advice from your statistical analysis plan and institutional review requirements.
Final takeaway
A Bayesian sample size calculator is most valuable when your real goal is estimation quality and decision readiness rather than a narrow hypothesis test. By incorporating prior knowledge and focusing on posterior precision, Bayesian design can produce more realistic and more transparent planning numbers. Use this calculator as a first-pass planning tool for binary outcomes, then refine the design with scenario analysis, attrition adjustments, and domain-specific review. The most defensible sample size is rarely just the output of a formula. It is the output of a formula plus context, prior evidence, and a clear statement of how much uncertainty is acceptable for the decision you need to make.