Calculating Effect Size for a Continuous Variable
Use this calculator to estimate standardized mean differences between two groups. It computes Cohen’s d, Hedges’ g, or Glass’s delta from group means, standard deviations, and sample sizes, then visualizes the practical magnitude of the difference.
Tip: Cohen’s d uses the pooled standard deviation, Hedges’ g applies a small sample correction to d, and Glass’s delta standardizes by the control group’s standard deviation, which this calculator treats as Group 2.
Expert Guide to Calculating Effect Size for a Continuous Variable
Calculating effect size for a continuous variable is one of the most important skills in applied statistics, research design, clinical evaluation, and evidence-based decision-making. While a p-value can tell you whether a difference is statistically detectable, it does not tell you how large or meaningful that difference is in practical terms. Effect size fills that gap. For continuous outcomes such as blood pressure, test scores, weight, time to completion, pain rating, cholesterol level, or income, effect size quantifies the magnitude of the difference between groups using a standardized scale.
In the simplest two-group setting, effect size is often reported as a standardized mean difference. This means the difference between the two sample means is divided by some measure of variability, usually a standard deviation. Standardization matters because it lets you compare results across studies that may use different measurement units. A treatment that improves exam scores by 6 points and a therapy that reduces systolic blood pressure by 8 mmHg operate on different scales, but both can be compared through effect size.
The three most common measures used for a continuous variable are Cohen’s d, Hedges’ g, and Glass’s delta. They are closely related, but each is best suited to slightly different contexts. The calculator above allows you to compute all three from summary data. That is useful when you have means, standard deviations, and sample sizes but not the raw dataset.
Why effect size matters more than significance alone
A very large study can produce a statistically significant result even when the real-world difference is tiny. Conversely, a smaller study may fail to reach significance despite a potentially meaningful outcome. Effect size helps solve this interpretive problem. It gives readers, reviewers, clinicians, and analysts a direct sense of the magnitude of the observed difference. This is why reporting standards in many fields recommend including both p-values and effect sizes whenever continuous variables are analyzed.
A practical rule is simple: statistical significance addresses whether the observed difference is larger than chance variation alone would plausibly produce, while effect size addresses how large that difference is.
The core formulas used for continuous variables
Suppose Group 1 has mean M1, standard deviation SD1, and sample size n1. Group 2 has mean M2, standard deviation SD2, and sample size n2. The raw mean difference is:
Mean difference = M1 – M2
To standardize it, you divide by a standard deviation. For Cohen’s d in independent samples, the denominator is the pooled standard deviation:
SDpooled = sqrt(((n1 – 1)SD1² + (n2 – 1)SD2²) / (n1 + n2 – 2))
Then:
Cohen’s d = (M1 – M2) / SDpooled
Hedges’ g adjusts Cohen’s d to reduce small-sample bias:
Hedges’ g = J × d, where J ≈ 1 – 3 / (4(n1 + n2) – 9) is the usual approximation to the exact correction factor
Glass’s delta is useful when one group’s variability is the preferred reference, often the control group:
Glass’s delta = (M1 – M2) / SD2
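The formulas above translate almost line for line into code. The following is a minimal sketch rather than the calculator's actual implementation; the function names (`pooled_sd`, `cohens_d`, `hedges_g`, `glass_delta`) and the input values in the usage lines are illustrative assumptions.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation for two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Mean difference divided by the pooled standard deviation."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d multiplied by the small-sample correction factor J."""
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    return j * cohens_d(m1, sd1, n1, m2, sd2, n2)


def glass_delta(m1, m2, sd2):
    """Mean difference divided by Group 2's (control) standard deviation."""
    return (m1 - m2) / sd2


# Hypothetical inputs: equal SDs of 2, so the pooled SD is also 2 and d is 1.
d = cohens_d(10, 2, 20, 8, 2, 20)
g = hedges_g(10, 2, 20, 8, 2, 20)
delta = glass_delta(10, 8, 2)
print(f"d = {d:.3f}, g = {g:.3f}, delta = {delta:.3f}")
```

Because the two standard deviations are equal in this toy input, d and Glass’s delta coincide, and g is slightly smaller than d because J is below 1.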
How to choose between Cohen’s d, Hedges’ g, and Glass’s delta
- Cohen’s d is widely used for comparing two independent groups when standard deviations are reasonably similar.
- Hedges’ g is typically preferred in published research and meta-analysis because it corrects the slight upward bias in Cohen’s d when sample sizes are small.
- Glass’s delta is most useful when the treatment may affect variability itself, making pooled standard deviation less appropriate. In that case, the control group’s standard deviation can provide a more stable reference.
Step-by-step example with real numbers
Imagine a training program designed to improve standardized test performance. Group 1 is the intervention group and Group 2 is the comparison group. Suppose the intervention group has a mean score of 78.4 with standard deviation 10.5 and sample size 45. The comparison group has mean 72.1 with standard deviation 9.8 and sample size 43. These are the default values in the calculator.
- Compute the mean difference: 78.4 – 72.1 = 6.3.
- Compute the pooled standard deviation from the two variances and sample sizes: sqrt((44 × 10.5² + 42 × 9.8²) / 86) ≈ 10.16.
- Divide 6.3 by the pooled standard deviation to obtain Cohen’s d: 6.3 / 10.16 ≈ 0.62.
- If you want Hedges’ g, apply the correction factor to d: J = 1 – 3 / (4 × 88 – 9) ≈ 0.991, so g ≈ 0.61.
- If you want Glass’s delta, divide 6.3 by Group 2’s standard deviation of 9.8, which gives about 0.64.
With these inputs, Cohen’s d is about 0.62, Hedges’ g about 0.61, and Glass’s delta about 0.64, all in the medium range by conventional benchmarks. That conveys a much richer story than simply saying the groups differed: the average Group 1 score sits roughly six-tenths of a standard deviation above the average Group 2 score.
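The arithmetic for this example can be checked with a few lines of Python. This is a sketch using only the summary statistics stated above; the variable names are illustrative.

```python
import math

# Summary statistics from the training-program example
m1, sd1, n1 = 78.4, 10.5, 45   # Group 1 (intervention)
m2, sd2, n2 = 72.1, 9.8, 43    # Group 2 (comparison)

diff = m1 - m2
sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
d = diff / sd_pooled                 # Cohen's d
j = 1 - 3 / (4 * (n1 + n2) - 9)      # small-sample correction factor
g = j * d                            # Hedges' g
delta = diff / sd2                   # Glass's delta

print(f"mean difference = {diff:.1f}")   # 6.3
print(f"pooled SD = {sd_pooled:.2f}")    # 10.16
print(f"Cohen's d = {d:.2f}")            # 0.62
print(f"Hedges' g = {g:.2f}")            # 0.61
print(f"Glass's delta = {delta:.2f}")    # 0.64
```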
| Scenario | Group 1 Mean | Group 2 Mean | Group 1 SD | Group 2 SD | n1 | n2 | Approx. Cohen’s d |
|---|---|---|---|---|---|---|---|
| Training test scores | 78.4 | 72.1 | 10.5 | 9.8 | 45 | 43 | 0.62 |
| Exercise and systolic BP | 124.0 | 131.5 | 12.0 | 13.2 | 60 | 58 | -0.60 |
| Reading intervention score | 88.2 | 81.4 | 14.1 | 13.8 | 32 | 30 | 0.49 |
How to interpret the magnitude
A familiar set of benchmarks for Cohen’s d is 0.2 for a small effect, 0.5 for a medium effect, and 0.8 for a large effect. These are rough heuristics, not universal laws. In some fields, a d of 0.2 may be important, especially in public health where small average changes can matter across large populations. In other domains, such as laboratory settings with highly controlled outcomes, a d of 0.2 may be considered minor.
The sign of the effect size matters too. A positive effect size means Group 1’s mean is larger than Group 2’s mean. A negative effect size means Group 1’s mean is lower. Whether lower is better depends on the variable. For pain score or blood pressure, a negative effect size may actually signal improvement.
| Absolute Effect Size | Common Interpretation | Typical Practical Reading |
|---|---|---|
| 0.00 to 0.19 | Trivial to very small | Difference exists but may be difficult to notice in practice |
| 0.20 to 0.49 | Small | Modest shift, potentially important depending on cost and context |
| 0.50 to 0.79 | Medium | Clear practical difference in many applied settings |
| 0.80 or higher | Large | Substantial separation between groups |
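If you want to attach these labels programmatically, the table maps onto a simple threshold function. The sketch below assumes the cutoffs shown in the table; the function name `magnitude_label` is hypothetical, and the labels remain rough heuristics rather than field-specific judgments.

```python
def magnitude_label(effect_size):
    """Map an effect size to the benchmark label from the table (sign ignored)."""
    size = abs(effect_size)
    if size < 0.20:
        return "trivial to very small"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


for value in (0.10, -0.35, 0.62, -0.95):
    print(f"{value:+.2f} -> {magnitude_label(value)}")
```

The function uses the absolute value, so direction has to be reported separately alongside the label.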
Common mistakes when calculating effect size for a continuous variable
- Using standard error instead of standard deviation. The denominator in these formulas should usually be a standard deviation, not a standard error.
- Ignoring study design. Paired or repeated-measures data require different methods than independent groups.
- Combining highly unequal variances without thought. If the intervention changes variability a lot, Glass’s delta may be more defensible than Cohen’s d.
- Reporting only the magnitude and not the direction. The sign can carry important substantive meaning.
- Overrelying on generic cutoffs. Interpret the effect in relation to your field, measurement reliability, and consequences of change.
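The first mistake in this list is easy to avoid when a source reports the standard error of the mean instead of the standard deviation: because SE = SD / sqrt(n), the standard deviation can be recovered as SD = SE × sqrt(n). A minimal sketch, with a hypothetical helper name and made-up input values:

```python
import math


def sd_from_se(se, n):
    """Recover a group's standard deviation from its standard error of the mean."""
    return se * math.sqrt(n)


# Hypothetical report: SE of the mean = 1.5 for a group of 36 participants
sd = sd_from_se(1.5, 36)
print(sd)  # 9.0
```

This conversion applies to the standard error of a single group's mean, not to the standard error of a difference between groups.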
Independent groups versus paired data
The calculator on this page is designed for two independent groups summarized by means, standard deviations, and sample sizes. If you are comparing pre-test and post-test values in the same participants, or matched pairs such as twins or case-control matches, you should not use the independent-groups pooled standard deviation blindly. Paired designs require accounting for within-subject correlation. In that setting, researchers may report a standardized mean change or another repeated-measures effect size that uses the standard deviation of change scores or other design-specific formulas.
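As one illustration of a paired-design alternative, the sketch below computes a standardized mean change by dividing the mean of the change scores by their standard deviation (often written d_z). This is only one of several conventions for paired data, it needs the raw paired observations rather than group summaries, and the function name and sample values are hypothetical.

```python
import statistics


def paired_dz(before, after):
    """Mean change divided by the sample SD of the change scores."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    changes = [a - b for b, a in zip(before, after)]
    return statistics.mean(changes) / statistics.stdev(changes)


# Hypothetical pre-test and post-test scores for the same four participants
pre = [10, 12, 14, 16]
post = [12, 15, 15, 18]
dz = paired_dz(pre, post)
print(f"d_z = {dz:.2f}")
```

Because the change-score standard deviation depends on the within-subject correlation, d_z is not directly comparable to an independent-groups Cohen’s d.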
How effect size supports meta-analysis
One major reason standardized effect sizes are so popular is that they make it possible to synthesize findings across many studies. Meta-analysts frequently convert outcomes to Hedges’ g because it is slightly less biased in small samples. Once each study is expressed on a common scale, effect sizes can be weighted by precision and combined into an overall estimate. This is a core reason why good primary studies report means, standard deviations, sample sizes, and effect sizes clearly.
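To make "weighted by precision" concrete, here is a sketch of a fixed-effect, inverse-variance pooled estimate. It assumes the common large-sample approximation for the sampling variance of a standardized mean difference, (n1 + n2) / (n1 × n2) + g² / (2(n1 + n2)); the study values and function names are hypothetical, and real meta-analyses often use random-effects models and dedicated software instead.

```python
import math


def variance_of_g(g, n1, n2):
    """Approximate sampling variance of a standardized mean difference (assumed formula)."""
    return (n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2))


def fixed_effect_pool(studies):
    """Inverse-variance weighted mean of effect sizes and its standard error.

    `studies` is a list of (g, n1, n2) tuples.
    """
    weights = [1 / variance_of_g(g, n1, n2) for g, n1, n2 in studies]
    total_weight = sum(weights)
    pooled = sum(w * g for w, (g, _, _) in zip(weights, studies)) / total_weight
    return pooled, math.sqrt(1 / total_weight)


# Hypothetical studies: (Hedges' g, n1, n2)
studies = [(0.61, 45, 43), (0.30, 120, 118), (0.48, 32, 30)]
pooled_g, pooled_se = fixed_effect_pool(studies)
print(f"pooled g = {pooled_g:.2f} (SE = {pooled_se:.2f})")
```

Larger studies have smaller variances and therefore larger weights, so the pooled estimate is pulled toward the most precise study.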
Confidence intervals and uncertainty
A single effect size estimate is not the whole story. It is also good practice to report a confidence interval around the effect size. A confidence interval reflects the uncertainty due to sampling variation. Narrow intervals indicate more precise estimates, while wide intervals suggest the true effect could plausibly vary over a broader range. Although this calculator focuses on point estimates and visual interpretation, researchers preparing formal reports should consider adding interval estimates using software designed for inferential analysis.
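For a rough sense of the uncertainty, an approximate interval can be built from a large-sample standard error. The sketch below assumes the normal approximation SE ≈ sqrt((n1 + n2) / (n1 × n2) + d² / (2(n1 + n2))); exact intervals based on the noncentral t distribution can differ, especially in small samples, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_ci_for_d(d, n1, n2, level=0.95):
    """Normal-approximation confidence interval for a standardized mean difference."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return d - z * se, d + z * se


lower, upper = approx_ci_for_d(0.62, 45, 43)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

For d = 0.62 with group sizes of 45 and 43, this gives an interval of roughly 0.19 to 1.05, which shows how wide the plausible range remains even for a medium-sized point estimate.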
When a continuous variable is not normally distributed
Standardized mean differences can still be informative when distributions are somewhat non-normal, but heavy skewness, strong outliers, or floor and ceiling effects can complicate interpretation. In those circumstances, the mean and standard deviation may not summarize the variable well. Analysts may prefer transformations, robust methods, nonparametric approaches, or alternative effect size measures based on ranks. The right choice depends on both the research question and the empirical shape of the data.
Practical reporting template
Here is a concise way to report a result from a continuous outcome comparison: “The intervention group scored higher than the comparison group on the final assessment (M = 78.4, SD = 10.5, n = 45 vs. M = 72.1, SD = 9.8, n = 43), corresponding to a moderate standardized mean difference, Cohen’s d = 0.62.” If the samples are small or the result is intended for meta-analysis, you might report Hedges’ g instead (about 0.61 for these data).
Bottom line
Calculating effect size for a continuous variable helps you move from “Is there a difference?” to “How large is the difference?” For independent groups, Cohen’s d is a natural first choice, Hedges’ g is often preferable for publication-quality reporting, and Glass’s delta is especially useful when the control group’s standard deviation should anchor the comparison. The calculator above turns summary statistics into an interpretable standardized metric and chart, making it easier to communicate magnitude, direction, and practical significance in a way that a p-value alone never can.