Exact Testing Without Large-Sample Approximations

Fisher's exact test supplies an exact probability calculation for $2 \times 2$ contingency tables, in contrast to the more common $\chi^{2}$ test which relies on an approximation valid only for sufficiently large expected counts. When samples are small or cell counts sparse, the chi-square statistic can yield misleading p-values. R.A. Fisher circumvented this problem in 1935 by deriving an exact expression based on the hypergeometric distribution. The test considers all possible tables with fixed margins and calculates the probability of obtaining one as extreme as or more extreme than the observed configuration.

The computation rests on treating the row and column totals as fixed. Suppose we observe the layout shown in the table below, where the rows might correspond to treatment and control groups and the columns to success and failure outcomes. Writing $a$ , $b$ , $c$ , and $d$ for the four cells, the margins are $r_{1} = a + b$ , $r_{2} = c + d$ , $c_{1} = a + c$ , and $c_{2} = b + d$ , and the grand total is $n = r_{1} + r_{2}$ . Under the null hypothesis that the treatment and outcome are independent, the count $X$ in the upper-left cell follows a hypergeometric distribution with population size $n$ , $c_{1}$ successes in the population, and $r_{1}$ draws.

Generic Contingency Table
	Outcome 1	Outcome 2	Row Totals
Group 1	`a`	`b`	`a+b`
Group 2	`c`	`d`	`c+d`
Column Totals	`a+c`	`b+d`	`n`

In this representation the probability of observing the particular table above, conditional on the margins, equals $\frac{r_{1}! r_{2}! c_{1}! c_{2}!}{a! b! c! d! n!}$ . This expression arises by counting the arrangements of the $n$ observations that produce the specified table and dividing by the number of all arrangements consistent with the given margins. Because factorials of even modestly large integers explode in magnitude, the calculation in the script employs logarithms of factorials to avoid overflow, exponentiating only at the end to recover the probability.
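The log-space calculation described above can be sketched as follows. This is a minimal illustration, not the calculator's actual source: the function names `logFactorial` and `tableLogProbability` are hypothetical, and the cell arguments follow the $a$, $b$, $c$, $d$ layout of the generic table.

```javascript
// ln(k!) computed as a running sum of logarithms, so nothing overflows.
function logFactorial(k) {
  let total = 0;
  for (let i = 2; i <= k; i++) {
    total += Math.log(i);
  }
  return total;
}

// Log of the conditional probability of the table [[a, b], [c, d]]
// given its margins: ln(r1! r2! c1! c2!) - ln(a! b! c! d! n!).
function tableLogProbability(a, b, c, d) {
  const r1 = a + b;
  const r2 = c + d;
  const c1 = a + c;
  const c2 = b + d;
  const n = r1 + r2;
  return (
    logFactorial(r1) + logFactorial(r2) + logFactorial(c1) + logFactorial(c2) -
    logFactorial(a) - logFactorial(b) - logFactorial(c) - logFactorial(d) -
    logFactorial(n)
  );
}

// Exponentiate only at the very end to recover the probability.
const probability = Math.exp(tableLogProbability(3, 7, 1, 9));
console.log(probability.toFixed(4)); // 0.2477
```

Working with sums of logarithms keeps every intermediate value small; the direct factorial ratio would already exceed the range of a double for totals a little above 170.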

To perform a two-tailed test, we compute the probability of the observed table and then enumerate all tables with the same margins, summing the probabilities of those whose probability is less than or equal to the observed value. This criterion counts every table that is no more probable than the observed one, so extreme outcomes in either direction contribute to the p-value. Alternative definitions of a two-tailed Fisher test exist, such as doubling the one-tailed p-value, but this method is common in statistical software and textbooks. The calculator reports the exact p-value without invoking normal or chi-square approximations.
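A minimal sketch of this two-tailed rule is shown below. The names `fisherTwoTailed`, `logFactorial`, and `tableLogProbability` are hypothetical and not taken from the calculator's script. The small relative tolerance in the comparison is an added assumption: tables that are mathematically tied with the observed one can differ by rounding error in floating point, and without the tolerance they might be dropped from the sum.

```javascript
function logFactorial(k) {
  let total = 0;
  for (let i = 2; i <= k; i++) total += Math.log(i);
  return total;
}

function tableLogProbability(a, b, c, d) {
  const r1 = a + b, r2 = c + d, c1 = a + c, c2 = b + d;
  return (
    logFactorial(r1) + logFactorial(r2) + logFactorial(c1) + logFactorial(c2) -
    logFactorial(a) - logFactorial(b) - logFactorial(c) - logFactorial(d) -
    logFactorial(r1 + r2)
  );
}

// Two-tailed p-value: sum the probabilities of all tables with the same
// margins whose probability does not exceed that of the observed table.
function fisherTwoTailed(a, b, c, d) {
  const r1 = a + b, r2 = c + d, c1 = a + c;
  const lowest = Math.max(0, c1 - r2);
  const highest = Math.min(r1, c1);
  const observed = Math.exp(tableLogProbability(a, b, c, d));
  const relativeTolerance = 1e-7; // assumption: guards against rounding in ties

  let pValue = 0;
  for (let x = lowest; x <= highest; x++) {
    const p = Math.exp(tableLogProbability(x, r1 - x, c1 - x, r2 - c1 + x));
    if (p <= observed * (1 + relativeTolerance)) {
      pValue += p;
    }
  }
  return Math.min(1, pValue); // clamp accumulated rounding error
}

console.log(fisherTwoTailed(3, 7, 1, 9).toFixed(4)); // 0.5820
```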

Worked Numerical Example

Consider a small clinical trial in which ten patients receive a new therapy and ten receive a placebo. Suppose three treated patients and one control patient experience symptom relief. The table therefore has $3$ and $7$ in the first row, and $1$ and $9$ in the second row. Plugging these counts into the formula above gives the probability of observing that table under the null hypothesis of no treatment effect. But to determine the p-value we must also include every other table with the same margins whose probability is equal to or lower than that of the observed table. Those tables include more lopsided outcomes, such as four successes in the treated group and none in the control group, as well as the mirror-image outcomes that favor the control group. Summing their probabilities yields the exact p-value.

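In the example the observed configuration has probability $1200/4845 \approx 0.2477$ . With the margins fixed, the number of treated successes can range from $0$ to $4$ , and three other tables are no more probable than the observed one: four treated successes (probability $\approx 0.0433$ ), one treated success (the mirror image of the observed table, probability $\approx 0.2477$ ), and zero treated successes (probability $\approx 0.0433$ ). Only the balanced table with two successes in each group, with probability $\approx 0.4180$ , is more probable. Therefore the two-tailed p-value is the sum $0.2477 + 0.0433 + 0.2477 + 0.0433 \approx 0.582$ . Although the treatment seems to have helped more patients, the p-value shows that such a difference could easily arise by chance when the sample size is this small. Larger studies are needed to draw firm conclusions.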
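As a check on the arithmetic, the full conditional distribution for these margins ( $r_{1} = r_{2} = 10$ , $c_{1} = 4$ , $n = 20$ ) can be written out using the binomial-coefficient form of the hypergeometric probability, which is equivalent to the factorial formula given earlier. Here $X$ denotes the number of successes in the treated group.

$$P(X = x) = \frac{\binom{10}{x}\binom{10}{4 - x}}{\binom{20}{4}}, \qquad \binom{20}{4} = 4845, \qquad x = 0, 1, \dots, 4$$

$$P(X = 0) = \tfrac{210}{4845} \approx 0.0433, \quad P(X = 1) = \tfrac{1200}{4845} \approx 0.2477, \quad P(X = 2) = \tfrac{2025}{4845} \approx 0.4180, \quad P(X = 3) = \tfrac{1200}{4845} \approx 0.2477, \quad P(X = 4) = \tfrac{210}{4845} \approx 0.0433$$

$$p_{\text{two-tailed}} = \sum_{x \,:\, P(X = x) \le P(X = 3)} P(X = x) = \frac{210 + 1200 + 1200 + 210}{4845} = \frac{2820}{4845} \approx 0.582$$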

Historical Context

Fisher developed this method while working at the Rothamsted Experimental Station in England, a hub for agricultural research. His 1935 monograph "The Design of Experiments" introduced randomization and many other foundational ideas in statistics. The exact test arose from analyzing experiments with very small numbers of observations where existing approximate techniques failed. One famous anecdote describes a colleague claiming she could tell whether milk or tea was poured first into a cup. Fisher designed an eight‑cup test to evaluate her claim, leading to the formalization of the hypergeometric model. The logic of fixed margins and exact probability enumeration traces back to this playful experiment.

While conceived for agriculture and tea tasting, Fisher's exact test now permeates disciplines from medicine to ecology. Clinical trials with rare outcomes, genetics studies with small sample sizes, and laboratory experiments with limited resources all benefit from a method that remains valid regardless of cell counts. Software packages in R, Python, and other languages include built‑in implementations. This calculator offers the same capability in a lightweight, browser‑based form requiring no internet connection once loaded.

Algorithmic Details

The script uses a straightforward enumeration algorithm. After reading the four inputs it determines the possible range of the upper-left cell count $x$ consistent with the margins. Specifically, $x$ can vary between $\max (0, c_{1} - r_{2})$ and $\min (r_{1}, c_{1})$ . For each feasible $x$ it constructs the full table, computes its hypergeometric probability, and compares it to the probability of the observed table. Probabilities smaller than or equal to the observed one are accumulated into the p-value. Because all computations occur in log space until the final exponentiation, the method remains stable even for moderately large totals such as 100 or 200.
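The range calculation and table reconstruction can be illustrated with a short sketch. The function name `feasibleTables` is hypothetical; it only lists the tables and leaves the probability calculation aside. Once $x$ is chosen, the other three cells are forced by the margins.

```javascript
// List every 2x2 table that shares its margins with [[a, b], [c, d]].
function feasibleTables(a, b, c, d) {
  const r1 = a + b; // first row total
  const r2 = c + d; // second row total
  const c1 = a + c; // first column total

  const lowest = Math.max(0, c1 - r2); // row 2 cannot hold more than r2
  const highest = Math.min(r1, c1);    // x cannot exceed either margin

  const tables = [];
  for (let x = lowest; x <= highest; x++) {
    // The remaining three cells are determined by x and the margins.
    tables.push([
      [x, r1 - x],
      [c1 - x, r2 - c1 + x],
    ]);
  }
  return tables;
}

console.log(feasibleTables(3, 7, 1, 9).length); // 5
```

For the worked example the loop visits $x = 0$ through $x = 4$ , so five tables are compared against the observed one.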

The function `logFact` iteratively sums logarithms using JavaScript's `Math.log`. While this is less efficient than using a precomputed table or Lanczos approximation, it suffices for the small values typically entered by users. The algorithm terminates quickly: the number of tables enumerated is proportional to the range of $x$ , and each table needs only a handful of `logFact` calls whose cost grows linearly with the counts involved. For most realistic inputs the calculation completes instantly on modern devices.
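For comparison, the precomputed-table alternative mentioned above might look like the sketch below. It is not what the calculator does; the name `buildLogFactorialTable` and the limit of 200 are assumptions chosen for illustration.

```javascript
// Build a lookup table where table[k] = ln(k!), for k = 0 .. maxN.
// Each entry reuses the previous one, so the whole table costs O(maxN).
function buildLogFactorialTable(maxN) {
  const table = new Float64Array(maxN + 1); // table[0] = ln(0!) = 0
  for (let k = 1; k <= maxN; k++) {
    table[k] = table[k - 1] + Math.log(k);
  }
  return table;
}

// After the one-time build, every lookup is a constant-time array read.
const logFactTable = buildLogFactorialTable(200);
console.log(Math.exp(logFactTable[5])); // close to 120, i.e. 5!
```

The table trades a small amount of memory for constant-time lookups, which only matters when many tables are enumerated or the totals are large.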

Interpretation and Practical Advice

A small p-value from Fisher's exact test indicates that under the assumption of independence, the observed table or one more extreme would be unlikely. Many practitioners use a threshold such as $0.05$ to determine "statistical significance." However, p-values should not be interpreted mechanically. Consider the study design, potential biases, and the magnitude of the observed effect. The exact test does not measure effect size; it only gauges compatibility with the null hypothesis. To describe the association strength, analysts often compute the odds ratio $\frac{a/b}{c/d} = \frac{ad}{bc}$ or its logarithm, along with confidence intervals.
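One common way to attach an interval to the odds ratio $\frac{ad}{bc}$ is the Wald (logit) interval, sketched below. This is an assumption added for illustration rather than something the calculator computes: the interval is itself a large-sample approximation with standard error $\sqrt{1/a + 1/b + 1/c + 1/d}$ on the log scale, it is undefined when any cell is zero, and exact conditional intervals are a different technique. The function name `oddsRatioWithInterval` is hypothetical.

```javascript
// Sample odds ratio ad/(bc) with an approximate Wald interval on the log scale.
// z = 1.96 corresponds to a nominal 95% interval.
function oddsRatioWithInterval(a, b, c, d, z = 1.96) {
  if (a === 0 || b === 0 || c === 0 || d === 0) {
    throw new RangeError("Wald interval is undefined when a cell is zero");
  }
  const logOddsRatio = Math.log((a * d) / (b * c));
  const standardError = Math.sqrt(1 / a + 1 / b + 1 / c + 1 / d);
  return {
    oddsRatio: Math.exp(logOddsRatio),
    lower: Math.exp(logOddsRatio - z * standardError),
    upper: Math.exp(logOddsRatio + z * standardError),
  };
}

const result = oddsRatioWithInterval(3, 7, 1, 9);
console.log(result.oddsRatio.toFixed(3)); // 3.857
```

With counts as small as those in the worked example, the interval is very wide and spans 1, which is consistent with the large exact p-value.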

Another important nuance involves the choice between one‑ and two‑tailed tests. If a research question predicts a specific direction of association—for instance, that the treatment can only increase the success rate—a one‑tailed test may be appropriate. The algorithm in this calculator uses the two‑tailed approach by default because it is more conservative and widely applicable. Researchers should match the tail specification to their hypotheses and justify the choice in reporting results.
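A one-tailed variant can be sketched by summing over only one side of the observed count. The function name `fisherOneTailed` and the `direction` argument are hypothetical; here "greater" is taken to mean an upper-left count at least as large as the observed one, which corresponds to more successes in the first group.

```javascript
function logFactorial(k) {
  let total = 0;
  for (let i = 2; i <= k; i++) total += Math.log(i);
  return total;
}

function tableLogProbability(a, b, c, d) {
  const r1 = a + b, r2 = c + d, c1 = a + c, c2 = b + d;
  return (
    logFactorial(r1) + logFactorial(r2) + logFactorial(c1) + logFactorial(c2) -
    logFactorial(a) - logFactorial(b) - logFactorial(c) - logFactorial(d) -
    logFactorial(r1 + r2)
  );
}

// One-tailed p-value: P(X >= a) for "greater", P(X <= a) for "less",
// where X is the upper-left cell under fixed margins.
function fisherOneTailed(a, b, c, d, direction = "greater") {
  const r1 = a + b, r2 = c + d, c1 = a + c;
  const lowest = Math.max(0, c1 - r2);
  const highest = Math.min(r1, c1);
  const from = direction === "greater" ? a : lowest;
  const to = direction === "greater" ? highest : a;

  let pValue = 0;
  for (let x = from; x <= to; x++) {
    pValue += Math.exp(tableLogProbability(x, r1 - x, c1 - x, r2 - c1 + x));
  }
  return Math.min(1, pValue); // clamp accumulated rounding error
}

console.log(fisherOneTailed(3, 7, 1, 9, "greater").toFixed(4)); // 0.2910
```

No tie tolerance is needed here because the tail is defined by the cell count rather than by comparing probabilities.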

Beyond 2x2 Tables

Fisher's reasoning extends in principle to larger tables, but the computational burden escalates rapidly as the number of cells grows. For a $2 \times 3$ table, the enumeration involves summing over many more configurations, and for a general $r \times c$ table with many cells or large counts, full enumeration quickly becomes impractical. Specialized approaches are used in such settings: network algorithms compute the exact p-value without visiting every table individually, while Monte Carlo methods, including Markov chain Monte Carlo, approximate it by sampling tables with the observed margins. Nonetheless, the $2 \times 2$ case remains the most common, especially in clinical research, making a dedicated calculator valuable.
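To give a flavor of the sampling idea, the sketch below estimates the p-value for an $r \times c$ table by plain permutation Monte Carlo. This is neither the network algorithm nor a Markov chain method, and it is not part of the calculator; the function name `monteCarloExactTest`, the default of 10000 iterations, and the add-one correction are all assumptions. Shuffling the column labels of the individual observations against their row labels produces random tables with the observed margins.

```javascript
// Monte Carlo estimate of the exact-test p-value for an r x c table.
// `table` is an array of rows, each an array of non-negative integer counts.
function monteCarloExactTest(table, iterations = 10000, random = Math.random) {
  const cols = table[0].length;

  // Expand the table into one (row, column) label pair per observation.
  const rowLabels = [];
  const colLabels = [];
  table.forEach((row, i) => {
    row.forEach((count, j) => {
      for (let k = 0; k < count; k++) {
        rowLabels.push(i);
        colLabels.push(j);
      }
    });
  });
  const n = rowLabels.length;

  // Cumulative log-factorials: logFact[k] = ln(k!).
  const logFact = new Float64Array(n + 1);
  for (let k = 1; k <= n; k++) logFact[k] = logFact[k - 1] + Math.log(k);

  // With margins fixed, a table's probability is proportional to
  // 1 / (product of cell factorials), so a larger sum of ln(cell!)
  // means a less probable table.
  const sumLogCellFactorials = (cells) =>
    cells.reduce((sum, count) => sum + logFact[count], 0);

  const observed = sumLogCellFactorials(table.flat());
  const cells = new Array(table.length * cols);
  let atLeastAsExtreme = 0;

  for (let it = 0; it < iterations; it++) {
    // Fisher-Yates shuffle of column labels; both margins stay fixed.
    for (let i = n - 1; i > 0; i--) {
      const j = Math.floor(random() * (i + 1));
      [colLabels[i], colLabels[j]] = [colLabels[j], colLabels[i]];
    }
    cells.fill(0);
    for (let i = 0; i < n; i++) cells[rowLabels[i] * cols + colLabels[i]]++;
    if (sumLogCellFactorials(cells) >= observed - 1e-9) atLeastAsExtreme++;
  }
  // Add-one correction so the estimate is never exactly zero.
  return (atLeastAsExtreme + 1) / (iterations + 1);
}

console.log(monteCarloExactTest([[3, 7], [1, 9]], 20000)); // varies; near 0.58
```

Applied to the $2 \times 2$ worked example, the estimate should fall close to the exact two-tailed value, which offers a simple sanity check before using the same function on larger tables.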

Fisher's exact test holds an esteemed place in the history of statistics, illustrating how clever probability reasoning can circumvent limitations of asymptotic approximations. Its enduring popularity stems from its simplicity and reliability. Whether you are assessing the efficacy of a new drug, checking for a gender bias in small hiring datasets, or performing classroom demonstrations, this tool provides a transparent and precise method for evaluating independence in small samples.
