Kullback–Leibler Divergence Calculator
Enter P and Q.

Understanding KL Divergence

Kullback–Leibler (KL) divergence measures how one probability distribution diverges from a second, reference distribution. Given discrete distributions P and Q over the same set, the KL divergence from P to Q is defined as

D(P‖Q) = Σᵢ P(i) ln(P(i) / Q(i))

Intuitively, the formula quantifies the extra information required to encode events sampled from P if we use a code optimized for Q. When P and Q are identical, the divergence is zero. As the distributions diverge, the value grows, reflecting the inefficiency of using Q to approximate P.
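The definition above translates almost directly into code. The sketch below is illustrative only (the helper name `kl_divergence` is not part of this page's script); it skips terms where P(i) = 0, matching the zero-probability convention described later.

```python
import math

def kl_divergence(p, q):
    """Discrete KL divergence D(P||Q) in nats: sum of P(i) * ln(P(i)/Q(i))."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical distributions give zero divergence.
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0
```

Note the result is in nats because the natural logarithm is used; dividing by ln(2) would convert it to bits.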

Common Applications

KL divergence appears throughout machine learning and information theory. In variational inference, models minimize KL divergence to find a simpler approximate distribution that mimics a complicated posterior. In reinforcement learning, policies are often constrained by a maximum KL divergence from prior policies to ensure stable updates. The concept also helps compare language models, evaluate generative networks, and track training progress in classification problems.

Because KL divergence is asymmetric, D(P‖Q) generally differs from D(Q‖P). This asymmetry underscores its interpretation as a measure of relative entropy: the expected extra message length when samples from P are encoded with a code built for Q.
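The asymmetry is easy to check numerically. A small illustration, with distributions chosen here purely as an example:

```python
import math

p, q = [0.9, 0.1], [0.5, 0.5]

# D(P||Q) and D(Q||P) computed from the same pair of distributions.
d_pq = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
d_qp = sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))

print(round(d_pq, 3), round(d_qp, 3))  # 0.368 0.511 -- the two directions differ
```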

Using the Calculator

Enter probabilities for P and Q separated by commas. Each list should contain the same number of values and sum to 1. The script normalizes them if necessary. When you press the compute button, it iterates through the arrays, sums P(i) ln(P(i) / Q(i)), and displays the result. Probabilities equal to zero contribute nothing because x ln(x) tends to zero as x approaches zero.
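The calculator's steps (normalize the inputs, then accumulate the sum, skipping zero-probability terms) can be sketched as follows. This is a hypothetical reimplementation, not the page's actual script:

```python
import math

def compute_kl(p_values, q_values):
    """Sketch of the calculator's procedure: normalize both lists, then
    accumulate P(i) * ln(P(i)/Q(i)) over the entries."""
    sp, sq = sum(p_values), sum(q_values)
    p = [x / sp for x in p_values]  # normalize so each list sums to 1
    q = [x / sq for x in q_values]
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # x ln(x) -> 0 as x -> 0, so these terms add nothing
            total += pi * math.log(pi / qi)
    return total

# Unnormalized inputs are rescaled: [3, 2] becomes {0.6, 0.4}.
print(round(compute_kl([3, 2], [1, 1]), 4))  # 0.0201
```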

Example

Suppose P assigns probabilities {0.6,0.4} while Q assigns {0.5,0.5}. Plugging these into the formula yields

D(P‖Q) = 0.6 ln(1.2) + 0.4 ln(0.8) ≈ 0.02
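The arithmetic can be verified in a couple of lines:

```python
import math

# Worked example: P = {0.6, 0.4}, Q = {0.5, 0.5}.
d = 0.6 * math.log(0.6 / 0.5) + 0.4 * math.log(0.4 / 0.5)
print(round(d, 4))  # 0.0201
```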

Small values indicate the distributions are close, while large values highlight stark differences.

Perspectives

By experimenting with the inputs, you can see how skewing probability mass increases the divergence. Extreme mismatches quickly produce large values. This sensitivity to improbable events is a hallmark of KL divergence and influences its use in robust statistics. In practice, KL divergence informs algorithms ranging from expectation-maximization to reinforcement learning policy updates, showcasing its broad relevance.

Related Calculators

Wien's Displacement Law Calculator - Peak Blackbody Wavelength

Determine the wavelength of maximum emission for a blackbody at any temperature using Wien's displacement law.


Seismic Wave Travel Time Calculator - Estimate Arrival Times

Compute the travel time of seismic P or S waves given distance and wave velocity. Useful for earthquake analysis and geophysical surveys.


Photon Energy Calculator - Light Quanta

Compute photon energy from wavelength or frequency using Planck's relation. Learn how light's quantum nature connects energy to electromagnetic waves.
