Estimate required sample size for statistical significance and power. Use it for quick planning, scenario testing, and documenting assumptions before you recruit participants.
Calculator explanation (what it does and when to use it)
Sample-size planning is where an experiment becomes feasible (or not). If your study is too small, you risk a false negative (missing a real effect). If it is too large,
you may waste time, money, and participant effort. This calculator provides a practical estimate of the number of participants you should recruit based on common power-analysis inputs.
The tool is designed for early-stage planning and “what-if” comparisons. It is not a substitute for a full statistical analysis plan, but it helps you document a defensible starting point
and understand how sensitive your required n is to effect size, variability, and design choices.
How to use the sample size calculator
- Select a Type of Statistical Test that matches your design (two-sample, paired, ANOVA, correlation/regression, or proportions).
- Choose Hypothesis Type (two-tailed is typical unless you have a strong directional justification).
- Set Significance Level (α) (commonly 0.05; smaller values reduce false positives but increase required sample size).
- Set Statistical Power (1 − β) (commonly 0.80; higher power reduces false negatives but increases required sample size).
- Pick an Expected Effect Size (Cohen’s d) or choose Custom value if you have a specific estimate.
- Enter a Baseline/Control Group Value and Expected Standard Deviation to help interpret the effect in real units.
- Enter Dropout, Compliance, and optional Budget assumptions to see feasibility impacts.
- Click Calculate Sample Size to view results and recommendations.
Key concepts and assumptions
- Effect size (Cohen’s d): standardized difference between groups. Rough guide: 0.2 small, 0.5 medium, 0.8 large.
- Standard deviation (SD): expected variability of your outcome. Larger SD increases required sample size.
- Alpha (α): Type I error rate (false positive). Lower α increases required sample size.
- Power (1 − β): probability of detecting a true effect. Higher power increases required sample size.
- Dropout and compliance: the calculator inflates recruitment targets to account for attrition and protocol non-adherence.
Formulas used (high-level)
For continuous outcomes (two-sample t-test style approximation), the calculator uses a simplified relationship of the form:
n ≈ 2 × ((zα + zβ) / d)² per group
where zα is the critical value for the chosen significance level (for a two-tailed test, the z at 1 − α/2), zβ is the value corresponding to the chosen power, and d is Cohen's d. Because d is already standardized, SD does not appear in this formula; if you work with a raw mean difference Δ instead, substitute d = Δ / SD.
It then applies small adjustments for multi-group designs and specific study types (paired designs are treated as more efficient; correlation/regression uses a different scaling).
For proportions, it uses a basic two-proportion approximation based on baseline proportion and the implied difference.
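The two approximations above can be sketched in a few lines of Python (illustrative only: the function names are ours, the proportion formula uses the unpooled-variance variant, and the calculator's internal code may differ):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(alpha: float, power: float, d: float) -> int:
    """Two-sample continuous outcome: participants per group for Cohen's d."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_b = NormalDist().inv_cdf(power)          # quantile for 1 - beta
    return ceil(2 * ((z_a + z_b) / d) ** 2)

def n_per_group_props(alpha: float, power: float, p1: float, p2: float) -> int:
    """Two-proportion approximation: baseline p1 vs expected p2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)   # unpooled variance term
    return ceil((z_a + z_b) ** 2 * variance / (p1 - p2) ** 2)
```

For example, `n_per_group(0.05, 0.80, 0.5)` returns 63, and `n_per_group_props(0.05, 0.80, 0.50, 0.65)` returns 167 per group for a 15-point lift from a 50% baseline.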
Worked example (quick sanity check)
Suppose you plan a two-group study with α = 0.05 (two-tailed), power = 0.80, SD = 15, and expected effect size d = 0.50.
A d of 0.50 corresponds to an absolute difference of about 0.50 × 15 = 7.5 units on your measurement scale.
If that difference is meaningful, the numbers work out favorably: with these inputs, the normal-approximation formula gives about 63 participants per group (roughly 126 total), and dedicated power software using the t distribution reports 64 per group. Samples of that size are realistic for many lab or single-site studies.
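You can reproduce this sanity check directly (normal approximation; exact software using the t distribution gives 64 per group):

```python
from math import ceil
from statistics import NormalDist

z_a = NormalDist().inv_cdf(0.975)       # alpha = 0.05, two-tailed
z_b = NormalDist().inv_cdf(0.80)        # power = 0.80
n = ceil(2 * ((z_a + z_b) / 0.5) ** 2)  # d = 0.50
print(n)  # 63 per group before any dropout adjustment
```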
Interpreting the results
The results panel reports (1) the estimated sample size per group, (2) total sample across groups, and (3) an adjusted recruitment target after dropout and compliance.
Use the Minimum Detectable Effect as a reality check: if the detectable effect is larger than what you consider meaningful, you may need a larger sample,
a more precise measurement (lower SD), or a more efficient design.
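Inverting the same normal approximation gives a minimum detectable effect for a fixed per-group n (a sketch, not necessarily the tool's internal formula):

```python
from math import sqrt
from statistics import NormalDist

def mde(alpha: float, power: float, n_per_group: int) -> float:
    """Smallest Cohen's d detectable with n participants per group."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return (z_a + z_b) * sqrt(2 / n_per_group)
```

With α = 0.05, power = 0.80, and 63 per group, `mde(0.05, 0.80, 63)` returns about 0.50, matching the worked example.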
Limitations
This calculator uses approximations and does not model every design nuance (e.g., unequal allocation, clustering, repeated measures correlation structures, multiple comparisons,
or non-normal outcomes). For confirmatory studies, regulatory submissions, or high-stakes decisions, validate the plan with a statistician and/or dedicated power-analysis software.
Background: statistical power, effect size, and sample size
Power analysis connects four ideas: the false-positive rate you accept (α), the false-negative rate you can tolerate (β), the effect size you want to detect, and the sample size you need.
If you hold three constant, the fourth is determined. In practice, researchers often choose α = 0.05 and power = 0.80, then explore how sample size changes under different effect-size assumptions.
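A quick sensitivity sweep makes this concrete. Holding α = 0.05 and power = 0.80 fixed, the per-group n under the normal approximation varies sharply with the assumed effect size (illustrative code, not the calculator's implementation):

```python
from math import ceil
from statistics import NormalDist

# Sum of the two-tailed critical value and the power quantile
z = NormalDist().inv_cdf(0.975) + NormalDist().inv_cdf(0.80)

for d in (0.2, 0.5, 0.8):            # small, medium, large (Cohen's benchmarks)
    print(d, ceil(2 * (z / d) ** 2))  # per-group n: 393, 63, 25
```

Halving the assumed effect size roughly quadruples the required sample, which is why the effect-size input deserves the most scrutiny.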
Effect size in plain language
Cohen’s d is a standardized difference: it expresses the expected mean difference in units of standard deviations. That makes it comparable across measures.
For example, if SD = 15 and d = 0.5, the implied mean difference is 7.5 units. If 7.5 units would not matter in your domain, then a study powered only for d = 0.5 may not answer your real question.
Type I vs. Type II errors
A Type I error (α) is a false positive: concluding there is an effect when there is not. A Type II error (β) is a false negative: failing to detect a real effect.
Many fields treat false positives as more costly, but the right balance depends on context (exploratory vs. confirmatory research, safety-critical outcomes, and ethical constraints).
Practical guidance for better inputs
- Use prior evidence: meta-analyses, similar studies, or pilot data are better than guesses.
- Be conservative when uncertain: smaller effect sizes and larger SDs usually produce safer (larger) sample estimates.
- Plan for attrition: dropout and non-compliance are common; recruitment targets should reflect reality.
- Define a primary outcome: multiple outcomes and multiple comparisons often require stricter thresholds or larger samples.
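Planning for attrition can be sketched as a simple inflation of the analyzable n (the rates here are hypothetical, and the exact adjustment the calculator applies may differ):

```python
from math import ceil

def recruitment_target(n_required: int, dropout: float, compliance: float) -> int:
    """Inflate the analyzable per-group n to a recruitment target."""
    return ceil(n_required / ((1 - dropout) * compliance))

# 63 analyzable per group, assuming 15% dropout and 90% compliance
print(recruitment_target(63, 0.15, 0.90))  # 83 recruited per group
```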
Reminder about scope
This page provides a fast, transparent estimate for planning. If your design includes clustering (schools, clinics), repeated measures, unequal group sizes, interim analyses,
or complex endpoints, you should use a specialized power tool and document the full model in your protocol.