A Priori Sample Size Calculator

Estimate the minimum sample size needed before data collection. Choose a common hypothesis test, set your alpha and power targets, and calculate a defensible sample size for means or proportions with a publication-ready rationale.

Study Design Inputs

Test type

Select the design that best matches your planned primary analysis.

Hypothesis type

Two-sided testing is standard in most confirmatory research.

Alpha level

Common value: 0.05.

Power

Common values: 0.80 or 0.90.

Expected mean difference

Enter the smallest clinically or practically meaningful difference.

Standard deviation

Use a pilot study, literature estimate, or validated benchmark.

Group 1 proportion

Example: current event rate in the control group.

Group 2 proportion

Example: expected event rate in the intervention group.

Null proportion

The benchmark, historical, or regulatory reference proportion.

Expected true proportion

The true proportion you expect under the alternative hypothesis.

Expected dropout or nonresponse rate

Enter a percent. The calculator inflates the final target accordingly.

Allocation ratio for two-sample tests

This version assumes equal allocation for two-group designs.

Tip: For mean-based studies, the standardized effect size is the expected difference divided by the standard deviation. For example, 5 divided by 12 gives about 0.42, which falls between the conventional benchmarks for a small effect (0.2) and a medium effect (0.5).

How to Use an A Priori Sample Size Calculator Correctly

An a priori sample size calculator helps researchers determine how many observations, participants, records, or experimental units they should plan to collect before a study begins. The phrase a priori means that the sample size decision is made in advance, based on explicit statistical assumptions rather than after the fact. This is one of the most important planning steps in quantitative research because too small a sample can leave a study underpowered, while an unnecessarily large sample wastes time, money, and operational resources and places an avoidable burden on participants.

At a practical level, this calculator combines four core ingredients: the significance threshold, desired statistical power, expected variability or event rate, and the minimum effect you want to be able to detect. If those assumptions are realistic, your calculated sample size provides a defensible target for study planning, ethics submissions, grant proposals, dissertations, and protocol documents.

What an a priori calculation is designed to answer

The central question is simple: How large should my sample be so I have a strong chance of detecting a meaningful effect if that effect truly exists? The answer depends on your hypothesis test. A study comparing two means uses different assumptions from a study comparing two proportions, but the planning logic is the same. When researchers skip this step, they often end up with inconclusive findings that are difficult to interpret. A non-significant result from an underpowered study does not necessarily mean there is no effect. It may simply mean the study was not large enough.

In plain language: sample size planning is a balance problem. Smaller effects, stricter alpha levels, and higher power targets all require larger samples. Larger effects and lower variability reduce the required sample size.

The four assumptions that drive sample size

Alpha level: This is the probability of a Type I error, often set at 0.05. In a two-sided design, alpha is split across both tails of the sampling distribution.
Power: This is the probability of detecting the target effect when it truly exists. Typical targets are 80% or 90%.
Effect size: This is the smallest difference that matters scientifically, clinically, commercially, or operationally.
Variance or event rates: Mean-based tests require a standard deviation estimate. Proportion-based tests require expected proportions.

When any one of these assumptions changes, the required sample size changes too. For example, at a two-sided alpha of 0.05, moving from 80% power to 90% power increases the sample requirement by roughly one third. Likewise, detecting a subtle difference is much harder than detecting a large one: under the normal approximation the required sample size scales with the inverse square of the target effect, so halving the detectable difference roughly quadruples the sample recommendation.
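As a rough illustration of how alpha and power interact, the sketch below computes the factor (z_alpha + z_power) squared, which multiplies the variance-to-effect ratio in the usual normal-approximation formulas. The function name `z_multiplier` is invented for this example and is not taken from the calculator itself.

```python
from statistics import NormalDist


def z_multiplier(alpha: float, power: float, two_sided: bool = True) -> float:
    """Return (z_alpha + z_power)^2, the factor that scales the required n."""
    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)   # critical value for the test
    z_power = NormalDist().inv_cdf(power)      # quantile for the power target
    return (z_alpha + z_power) ** 2


baseline = z_multiplier(0.05, 0.80)
for alpha, power in [(0.05, 0.80), (0.05, 0.90), (0.01, 0.80)]:
    m = z_multiplier(alpha, power)
    print(f"alpha={alpha}, power={power}: multiplier={m:.2f} "
          f"({m / baseline:.2f}x the 0.05/0.80 baseline)")
```

Under these assumptions, tightening a two-sided alpha from 0.05 to 0.01 at 80% power raises the requirement by about half, with the effect size and variability held fixed.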

Common formulas behind an a priori sample size calculator

Although a calculator automates the math, it is useful to know what it is doing. For a one-sample mean, the classic normal-approximation formula takes the sum of the critical z-values for alpha and power, multiplies it by the ratio of the standard deviation to the mean difference of interest, and squares the result. For a two-sample mean with equal group sizes, the same structure is used, but the variance term is doubled, so each group needs twice the one-sample figure. For two proportions, the formula uses both the pooled proportion (for the null hypothesis) and the separate group proportions (for the alternative). Because the result is rarely a whole number, it is rounded up. These formulas are standard approximations used in planning and are widely accepted for first-pass study design.
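Written out, the usual textbook forms of these approximations look roughly as follows. The notation is introduced here for illustration and does not come from the calculator: Δ is the target mean difference, σ the standard deviation, p_1 and p_2 the two group proportions, p̄ = (p_1 + p_2)/2 the pooled proportion under equal allocation, and z_{1-α/2} and z_{1-β} the critical values for a two-sided alpha and for power (replace z_{1-α/2} with z_{1-α} for a one-sided test). The calculator's exact implementation is not shown on this page, so treat this as a sketch of the standard forms.

```latex
% One-sample mean
n = \left( \frac{(z_{1-\alpha/2} + z_{1-\beta})\,\sigma}{\Delta} \right)^{2}

% Two-sample mean, equal allocation (n per group)
n = 2 \left( \frac{(z_{1-\alpha/2} + z_{1-\beta})\,\sigma}{\Delta} \right)^{2}

% Two proportions, equal allocation (n per group)
n = \frac{\left( z_{1-\alpha/2}\sqrt{2\,\bar{p}\,(1-\bar{p})}
        + z_{1-\beta}\sqrt{p_1(1-p_1) + p_2(1-p_2)} \right)^{2}}
         {(p_1 - p_2)^{2}}
```

In each case n is rounded up to the next whole number before any dropout inflation is applied.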

Reference values commonly used in planning

Most protocols use a limited set of standard alpha and power combinations. The table below shows the normal critical values that drive many a priori calculations. These are not arbitrary numbers; they are quantiles of the standard normal distribution: for a two-sided alpha, the quantile at 1 - alpha/2, and for power, the quantile at the power level itself.

Planning parameter	Common setting	Critical z-value	Interpretation
Two-sided alpha	0.05	1.96	Most common significance threshold in biomedical, behavioral, and social science research.
Two-sided alpha	0.01	2.576	More conservative threshold, often used when false positives are especially costly.
Power	0.80	0.842	Traditional minimum acceptable power in many applied studies.
Power	0.90	1.282	Higher assurance of detection, common in pivotal or high-stakes work.
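The critical values in the table can be reproduced with Python's standard library, which is a handy check when a protocol uses a less common alpha or power. This is a minimal sketch; the helper names `z_for_alpha` and `z_for_power` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_for_alpha(alpha: float, two_sided: bool = True) -> float:
    """Critical z-value for a significance level."""
    tail = alpha / 2 if two_sided else alpha
    return std_normal.inv_cdf(1 - tail)


def z_for_power(power: float) -> float:
    """z-value corresponding to a power target."""
    return std_normal.inv_cdf(power)


print(f"two-sided alpha 0.05: {z_for_alpha(0.05):.3f}")
print(f"two-sided alpha 0.01: {z_for_alpha(0.01):.3f}")
print(f"power 0.80:           {z_for_power(0.80):.3f}")
print(f"power 0.90:           {z_for_power(0.90):.3f}")
```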

Why effect size matters more than many beginners expect

New researchers often focus on alpha and power but underestimate the effect size decision. In reality, effect size is usually the most influential assumption. If you choose a target difference that is unrealistically large, the calculator may produce an attractive but misleadingly small sample size. If you choose a difference that is too tiny to matter in practice, you may end up with a sample requirement that is operationally impossible. The best target effect is usually the minimum meaningful effect, not the largest effect you hope to observe.

For mean outcomes, that means deciding how many units of change would actually justify a conclusion or intervention. For proportions, it means defining the smallest absolute increase or decrease in the event rate that would matter. Literature reviews, pilot studies, prior registries, or subject-matter consensus are ideal sources for these assumptions.
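To see how strongly the target effect drives the result, the sketch below tabulates the per-group sample size for a two-group comparison of means at several candidate differences. It assumes a two-sided alpha of 0.05, 80% power, equal allocation, and a standard deviation of 12; the function name and the candidate differences are illustrative choices, not values from the calculator.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(diff: float, sd: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group n for a two-sided, equal-allocation comparison of two means."""
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha / 2) + z(power)
    # Normal-approximation formula, rounded up to a whole participant.
    return math.ceil(2 * (z_sum * sd / diff) ** 2)


for diff in (10, 5, 2.5):
    print(f"difference = {diff}: n per group = {n_per_group_two_means(diff, sd=12)}")
# prints 23, 91, and 362 per group
```

Each halving of the target difference roughly quadruples the requirement, which is why an over-optimistic effect size produces a deceptively small study.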

Examples of planning scenarios

Clinical trial: You want to detect a 5-point change in a symptom score with standard deviation 12, alpha 0.05, and power 0.80.
Survey quality improvement: You want to show that satisfaction improved from 50% to 65% after a new process was introduced.
Education research: You want enough students to detect a moderate difference in test performance between two teaching methods.
Public health evaluation: You want to compare uptake rates across two interventions and need a sample large enough to detect a realistic absolute difference.
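The first two scenarios above can be worked through with the standard normal-approximation formulas. Several details are assumptions made for this sketch, because the scenarios do not state them: the clinical trial is treated as a two-group comparison with equal allocation, the survey is treated as two independent samples (before and after) analyzed at alpha 0.05 and power 0.80, and all tests are two-sided. The function names are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist().inv_cdf  # standard normal quantile function


def n_two_means(diff, sd, alpha=0.05, power=0.80):
    """Per-group n for comparing two means (two-sided, equal allocation)."""
    z_sum = Z(1 - alpha / 2) + Z(power)
    return math.ceil(2 * (z_sum * sd / diff) ** 2)


def n_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group n for comparing two proportions (two-sided, equal allocation)."""
    p_bar = (p1 + p2) / 2  # pooled proportion under the null hypothesis
    null_term = Z(1 - alpha / 2) * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = Z(power) * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil((null_term + alt_term) ** 2 / (p1 - p2) ** 2)


print("Clinical trial, 5-point change, SD 12:", n_two_means(5, 12), "per group")
print("Survey, 50% to 65%:", n_two_proportions(0.50, 0.65), "per group")
# prints 91 per group and 170 per group
```

These are analytic minimums before any allowance for dropout or nonresponse.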

Illustrative sample size benchmarks

The next table gives planning benchmarks for estimating a single proportion at 95% confidence, using the standard formula n = z² × p(1 - p) / E² with z = 1.96, maximum variability at p = 0.50, and margin of error E. Results are rounded up to the next whole number. These values are widely used in survey planning and show how precision drives sample size. They are helpful context when you discuss why tighter precision requires more observations.

Margin of error	Confidence level	Assumed proportion	Approximate required n
±10 percentage points	95%	0.50	97
±5 percentage points	95%	0.50	385
±3 percentage points	95%	0.50	1,068
±2 percentage points	95%	0.50	2,401
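The benchmarks for a single proportion follow from n = z² × p(1 - p) / E², where E is the margin of error. The sketch below is one way to compute them; the function name `n_for_margin` is made up for this example. It always rounds up, so its results can be one higher than commonly quoted figures that round to the nearest integer (for example 96 or 1,067).

```python
import math
from statistics import NormalDist


def n_for_margin(margin: float, confidence: float = 0.95, p: float = 0.50) -> int:
    """Sample size to estimate one proportion within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # p = 0.50 maximizes p * (1 - p), giving the most conservative n.
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


for margin in (0.10, 0.05, 0.03, 0.02):
    print(f"±{margin:.0%}: n = {n_for_margin(margin)}")
# prints 97, 385, 1068, and 2401
```

Halving the margin of error roughly quadruples the required sample, the same inverse-square pattern seen with effect sizes in power calculations.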

How dropout inflation should be handled

Most real studies do not retain every planned participant or complete every measurement. That is why a good a priori sample size calculation should include a final inflation step for dropout, attrition, nonresponse, unusable records, or missing outcome data. If the analytic minimum is 200 and you expect 10% attrition, divide by 0.90, which yields 222.22, then round up to 223. This protects your final analyzable sample from shrinking below the power target.
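The inflation step can be sketched in a few lines. The function name `inflate_for_dropout` is hypothetical, and the dropout rate is entered as a percent to match the calculator's input field.

```python
import math


def inflate_for_dropout(analytic_n: int, dropout_pct: float) -> int:
    """Recruitment target that still leaves analytic_n after expected dropout."""
    if not 0 <= dropout_pct < 100:
        raise ValueError("dropout_pct must be at least 0 and below 100")
    # Divide by the retained fraction, then round up to a whole participant.
    return math.ceil(analytic_n * 100 / (100 - dropout_pct))


print(inflate_for_dropout(200, 10))  # 223
```

Dividing by the retained fraction is deliberate. Multiplying 200 by 1.10 would give 220, and losing 10% of 220 leaves only 198 analyzable participants, which is below the target of 200.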

It is usually better to justify dropout with actual prior evidence rather than a generic guess. Look at historical completion rates from your setting, population, recruitment channel, or instrument. In regulated or grant-funded work, reviewers often want to see the source of this assumption.

Two-sided versus one-sided hypotheses

A two-sided test asks whether the effect differs in either direction and is the standard choice in most confirmatory studies. A one-sided test asks only whether the effect differs in the expected direction and therefore requires a smaller sample than a two-sided test with the same alpha and power. However, one-sided testing should only be used when the opposite direction would be irrelevant scientifically and would not change the interpretation of the results. If there is any realistic possibility that the effect could reverse, a two-sided design is usually the safer and more credible option.
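The size of the saving from a one-sided test can be checked directly. The sketch below uses the one-sample mean formula with a difference of 5 and a standard deviation of 12 as illustrative inputs; the function name is hypothetical, and the only thing that changes between the two calls is whether alpha is split across two tails.

```python
import math
from statistics import NormalDist


def n_one_sample_mean(diff: float, sd: float, alpha: float = 0.05,
                      power: float = 0.80, two_sided: bool = True) -> int:
    """Normal-approximation n for a one-sample test of a mean."""
    z = NormalDist().inv_cdf
    # A two-sided test puts alpha/2 in each tail; a one-sided test puts all of alpha in one.
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    return math.ceil(((z_alpha + z(power)) * sd / diff) ** 2)


print("two-sided:", n_one_sample_mean(5, 12, two_sided=True))   # 46
print("one-sided:", n_one_sample_mean(5, 12, two_sided=False))  # 36
```

The reduction is real but modest, which is one more reason not to pick a one-sided hypothesis purely to shrink the sample.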

What this calculator does well

Provides quick planning estimates for common tests before data collection begins.
Supports means and proportions, including one-sample and two-sample designs.
Applies standard normal-approximation formulas used in many protocol drafts and early-stage feasibility studies.
Inflates final recruitment targets for dropout or nonresponse.
Visualizes how required sample size changes as the detectable effect gets smaller or larger.

What this calculator does not replace

No online calculator can replace a full statistical analysis plan for every design. If your study includes clustering, repeated measures, unequal allocation, survival endpoints, multivariable modeling, interim analyses, multiplicity adjustments, finite population corrections, or noninferiority margins, a more advanced method may be required. In those situations, you should work with a statistician or use validated software tailored to that design.

Best practices for writing up your sample size section

A strong methods section does more than report a number. It explains where the number came from. A complete write-up usually includes the primary endpoint, planned test, alpha level, desired power, effect size rationale, variance or event-rate assumptions, allocation ratio, and attrition inflation. Here is a model structure:

Identify the primary outcome and comparison.
State the statistical test and whether it is one-sided or two-sided.
Report alpha and target power.
Justify the effect size and variability assumptions with prior literature or pilot data.
Provide the resulting analytic sample size and the inflated recruitment target after expected dropout.

Frequent mistakes that lead to poor planning

Using a hoped-for effect instead of the smallest meaningful effect.
Borrowing a standard deviation from a population that is not comparable.
Forgetting to inflate for attrition or unusable data.
Planning the sample size around a single outcome when the study has multiple primary outcomes, without adjusting for multiplicity.
Choosing a one-sided hypothesis only to make the required sample smaller.
Confusing statistical significance with practical importance.

Final takeaway

An a priori sample size calculator is not just a convenience tool. It is part of rigorous study design. It helps ensure that your project is neither too small to answer the question nor larger than necessary. The most credible calculations are grounded in realistic assumptions, transparent reporting, and a clear definition of what counts as a meaningful effect. Use the calculator above as an efficient first-pass planning tool, then refine your assumptions with pilot data, published evidence, and design-specific statistical advice when the study stakes are high.