The negative binomial distribution models the number of failures that occur before a specified number of successes is achieved in a sequence of independent trials. Imagine repeatedly flipping a coin until you get three heads. The count of tails observed before that third head is a negative binomial random variable. Because the distribution tallies failures until a set number of successes is reached, it tends to be right-skewed, especially when the success probability is low. The negative binomial is closely related to the geometric distribution, which is the special case where only one success is required.
Mathematically, the probability of observing exactly \(k\) failures before the \(r\)-th success is given by

\[ P(X = k) = \binom{k+r-1}{k} p^r (1-p)^k, \]
where \(p\) is the probability of success in each trial. The binomial coefficient \(\binom{k+r-1}{k}\) counts the number of ways to arrange the failures and successes, while the probability terms account for the chance of each arrangement. The mean of the distribution is

\[ \mu = \frac{r(1-p)}{p}, \]

and the variance is

\[ \sigma^2 = \frac{r(1-p)}{p^2}. \]

These relationships show how variability increases dramatically as the success probability decreases.

The negative binomial distribution arises in many real-world settings. Epidemiologists use it to model the number of people one patient might infect before recovery. Reliability engineers describe the number of component failures before a system breaks. Marketing analysts look at how many prospects decline before making a set number of sales. Whenever events occur independently with a constant success chance and the process stops after a target number of successes, a negative binomial model is a natural fit. The distribution’s long tail captures the possibility of observing many failures, which the regular binomial distribution does not express as directly.
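The pmf, mean, and variance above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the calculator's actual source; the function names are placeholders.

```javascript
// k = failures observed, r = target successes, p = per-trial success probability.
function binomCoeff(n, k) {
  // Iterative product form avoids computing large factorials directly.
  let c = 1;
  for (let i = 0; i < k; i++) {
    c = (c * (n - i)) / (i + 1);
  }
  return c;
}

function negBinomPmf(k, r, p) {
  // P(X = k) = C(k+r-1, k) * p^r * (1-p)^k
  return binomCoeff(k + r - 1, k) * Math.pow(p, r) * Math.pow(1 - p, k);
}

const negBinomMean = (r, p) => (r * (1 - p)) / p;
const negBinomVariance = (r, p) => (r * (1 - p)) / (p * p);

// Flipping a fair coin until the third head: probability of exactly 2 tails first.
console.log(negBinomPmf(2, 3, 0.5)); // 0.1875
```

With a fair coin the chance of seeing exactly two tails before the third head is \(\binom{4}{2}(0.5)^3(0.5)^2 = 0.1875\), matching the output.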
The shape of the distribution depends heavily on the parameters. A high success probability produces a sharply peaked distribution near zero failures. A low probability leads to a long tail where many failures might accumulate. The table below shows how the mean number of failures grows as the success probability decreases for a fixed \(r=3\).
| Success Probability | Mean Failures | Variance |
|---|---|---|
| 0.9 | 0.33 | 0.37 |
| 0.7 | 1.29 | 1.84 |
| 0.5 | 3.00 | 6.00 |
| 0.3 | 7.00 | 23.33 |
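The table rows follow directly from the mean and variance formulas. A short sketch regenerating them for \(r=3\):

```javascript
// Regenerate the table values for r = 3; results match to two decimals.
const r = 3;
for (const p of [0.9, 0.7, 0.5, 0.3]) {
  const mean = (r * (1 - p)) / p;
  const variance = (r * (1 - p)) / (p * p);
  console.log(`p=${p}: mean=${mean.toFixed(2)}, variance=${variance.toFixed(2)}`);
}
```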
Enter the desired number of successes \(r\), the success probability \(p\) (as a fraction between 0 and 1), and the observed failure count \(k\). Press Compute to calculate four key quantities: the probability mass at exactly \(k\), the cumulative probability up to and including \(k\), and the distribution’s mean and variance. The calculator performs a straightforward summation for the cumulative value, so very large \(k\) may take a moment. If your browser supports clipboard operations, you can copy the text result after calculation for later use.
This tool is useful for anyone dealing with count data where events have to succeed a certain number of times. For instance, a factory might record how many defective items appear before the tenth non-defective one. A call center might track how many rejections an agent receives before three positive responses. Because the negative binomial models the failure count, it is especially handy when events are rare. The ordinary binomial distribution would instead measure the number of successes in a fixed number of trials, which is not always convenient if the trial count varies or could be extremely large.
To compute the binomial coefficient in the equation above, the script uses a basic iterative factorial routine. This approach is adequate for small integer parameters—typical in most applications—and avoids any external libraries. The probability terms are multiplied carefully to reduce rounding error. For the cumulative distribution, the calculator simply sums the individual probabilities from zero up to the requested failure count. While direct closed-form expressions exist, the summation technique is easier to implement and sufficient for moderate parameter ranges.
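The summation approach described above can be sketched as follows. The helper names are illustrative, not the calculator's actual source, and the pmf helpers are repeated so the snippet runs on its own.

```javascript
function binomCoeff(n, k) {
  let c = 1;
  for (let i = 0; i < k; i++) c = (c * (n - i)) / (i + 1);
  return c;
}

function negBinomPmf(k, r, p) {
  return binomCoeff(k + r - 1, k) * Math.pow(p, r) * Math.pow(1 - p, k);
}

// Cumulative probability P(X <= k): a direct sum of pmf terms from 0 to k,
// as the text describes, rather than a closed-form expression.
function negBinomCdf(k, r, p) {
  let total = 0;
  for (let i = 0; i <= k; i++) total += negBinomPmf(i, r, p);
  return total;
}

console.log(negBinomCdf(2, 3, 0.5)); // 0.5
```

The loop cost grows linearly in \(k\), which is why very large failure counts make the calculator pause briefly.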
The negative binomial is sometimes described with the roles of failures and successes reversed, particularly in early literature. In that formulation the distribution represents the number of successes before a fixed number of failures. Although the algebra looks different, the probabilities are equivalent after adjusting the parameters. This calculator sticks with the more common convention where \(r\) is the target successes and the random variable counts failures.
The negative binomial assumes independent, identical trials and a constant success probability. In many practical cases these assumptions hold only approximately. For example, if success probability changes over time or trials influence one another, the distribution may not fit perfectly. However, it often serves as a reasonable approximation even when conditions are not ideal. The ability to model over-dispersion—variability that exceeds the mean—is particularly valuable in statistics. Compared with the Poisson distribution, the negative binomial allows the variance to be larger than the mean.
When the success probability is very high or the number of successes \(r\) is small, the distribution becomes narrow. In these cases the geometric distribution (\(r=1\)) may suffice. Conversely, with low success probability and higher \(r\), a very long tail arises and the mean can be quite large. Always examine whether your sample data align with the theoretical mean and variance before relying heavily on the distribution.
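The geometric special case is easy to verify numerically: with \(r=1\) the negative binomial pmf collapses to \(p(1-p)^k\). A quick check, again with illustrative helper names:

```javascript
function binomCoeff(n, k) {
  let c = 1;
  for (let i = 0; i < k; i++) c = (c * (n - i)) / (i + 1);
  return c;
}

function negBinomPmf(k, r, p) {
  return binomCoeff(k + r - 1, k) * Math.pow(p, r) * Math.pow(1 - p, k);
}

// Geometric distribution: failures before the first success.
const geometricPmf = (k, p) => p * Math.pow(1 - p, k);

// The two agree for every k when r = 1.
for (let k = 0; k < 5; k++) {
  console.log(Math.abs(negBinomPmf(k, 1, 0.4) - geometricPmf(k, 0.4)) < 1e-15);
}
```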
The negative binomial distribution is a versatile model for discrete counts where a process continues until a predetermined number of successes occur. Whether you are testing reliability, analyzing infection spread, or studying marketing conversions, it can provide valuable insight into how many failures you might expect along the way. This calculator makes it easy to explore the distribution without specialized statistical software, offering quick access to probabilities and summary measures. By adjusting the parameters you can see just how sensitive the results are to changes in success chance or required successes, deepening your understanding of this fundamental statistical tool.