This page helps you compare two discrete probability distributions (probability vectors) P and Q defined over the same set of outcomes (categories). You enter the probabilities as comma-separated lists (for example, 0.6, 0.4). The calculator then reports common information-theoretic divergence measures, including the Kullback–Leibler (KL) divergence, cross-entropy, and the Jensen–Shannon divergence (JSD).
You can also choose the log base (natural log gives results in nats; base 2 gives results in bits).
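For concreteness, here is a minimal Python sketch of the kind of input handling described above; parse_distribution is a hypothetical helper for illustration, not the calculator's actual code:

```python
def parse_distribution(text, tol=1e-6):
    # Hypothetical helper: turn a string like "0.6, 0.4" into a
    # validated probability vector.
    probs = [float(x) for x in text.split(",")]
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > tol:
        raise ValueError("probabilities must sum to 1")
    return probs

p = parse_distribution("0.6, 0.4")
q = parse_distribution("0.5, 0.5")
```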
Assume P and Q are discrete distributions over outcomes i = 1, …, n, with P(i) ≥ 0, Q(i) ≥ 0, and ∑ P(i) = ∑ Q(i) = 1.
The Kullback–Leibler divergence from P to Q is:

D_KL(P‖Q) = ∑ P(i) log(P(i)/Q(i))
If you choose ln, then log is the natural logarithm and the unit is nats. If you choose log2, the unit is bits.
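As a sanity check, here is a minimal Python sketch of this formula (kl_divergence is an illustrative name, not the calculator's API); terms with P(i) = 0 are skipped, per the convention discussed at the end of this page:

```python
import math

def kl_divergence(p, q, base=math.e):
    # D_KL(P||Q) = sum_i P(i) * log(P(i) / Q(i))
    # base=math.e gives nats; base=2 gives bits.
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)

print(kl_divergence([0.7, 0.3], [0.5, 0.5]))          # ≈ 0.0823 nats
print(kl_divergence([0.7, 0.3], [0.5, 0.5], base=2))  # ≈ 0.1187 bits
```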
The cross-entropy of Q relative to P is:
H(P, Q) = - ∑ P(i) log Q(i)
It relates to KL divergence via:
H(P, Q) = H(P) + D_KL(P‖Q), where H(P) = -∑ P(i) log P(i) is the entropy of P.
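This identity is easy to verify numerically. The following sketch (function names are illustrative) computes both sides in nats:

```python
import math

def entropy(p):
    # H(P) = -sum_i P(i) log P(i); 0 log 0 is treated as 0
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    # H(P, Q) = -sum_i P(i) log Q(i)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p, q = [0.6, 0.4], [0.5, 0.5]
print(cross_entropy(p, q))               # ≈ 0.6931 nats
print(entropy(p) + kl_divergence(p, q))  # same value: H(P) + D_KL(P||Q)
```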
JSD is a symmetric, smoothed divergence based on the mixture M = (P + Q)/2:
JSD(P, Q) = 1/2 · D_KL(P‖M) + 1/2 · D_KL(Q‖M)
With base-2 logs, JSD is bounded between 0 and 1 bit for discrete distributions.
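A short Python sketch of this definition (illustrative, using base-2 logs so the 1-bit bound applies):

```python
import math

def kl_divergence(p, q, log=math.log2):
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    # M = (P + Q) / 2; JSD = 1/2 KL(P||M) + 1/2 KL(Q||M)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

print(jsd([1.0, 0.0], [0.0, 1.0]))  # 1.0 bit: disjoint distributions hit the bound
print(jsd([0.6, 0.4], [0.5, 0.5]))  # ≈ 0.0073 bits: P and Q are close
```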
D_KL(P‖Q) is infinite when P assigns probability to outcomes that Q considers impossible. Intuitively, it measures the expected extra coding cost when the data come from P but you code using a model Q.

Let:

P = 0.6, 0.4
Q = 0.5, 0.5

Using natural logs:
D_KL(P‖Q) = 0.6·ln(0.6/0.5) + 0.4·ln(0.4/0.5)
= 0.6·ln(1.2) + 0.4·ln(0.8) ≈ 0.6·0.1823 + 0.4·(-0.2231) ≈ 0.0201 nats
This is small, indicating P and Q are close.
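You can reproduce this arithmetic in a couple of lines of Python:

```python
import math

p, q = [0.6, 0.4], [0.5, 0.5]
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
print(f"D_KL(P||Q) = {kl:.4f} nats")  # ≈ 0.0201
```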
| Metric | Discrete formula | Symmetric? | Range / behavior | Notes |
|---|---|---|---|---|
| KL(P‖Q) | ∑ P(i) log(P(i)/Q(i)) | No | ≥ 0; can be ∞ | Undefined/infinite if Q(i)=0 where P(i)>0 |
| KL(Q‖P) | ∑ Q(i) log(Q(i)/P(i)) | No | ≥ 0; can be ∞ | Highlights different failure modes than KL(P‖Q) |
| Cross-entropy H(P,Q) | −∑ P(i) log Q(i) | No | ≥ H(P); can be ∞ | Common in classification/log-loss settings |
| JSD(P,Q) | ½·KL(P‖M)+½·KL(Q‖M), M=(P+Q)/2 | Yes | Finite; bounded (≤ 1 bit with log2) | More stable and interpretable for “distance-like” comparison |
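To make the asymmetry in the first two rows concrete, this sketch compares both directions of KL for a more skewed pair:

```python
import math

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p, q = [0.9, 0.1], [0.5, 0.5]
print(kl_divergence(p, q))  # ≈ 0.3681 nats
print(kl_divergence(q, p))  # ≈ 0.5108 nats: KL(P||Q) != KL(Q||P)
```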
- The i-th entry of P must correspond to the same outcome as the i-th entry of Q. If you reorder one list, the divergence changes.
- Outcomes with P(i) = 0 contribute 0 to KL (by the limit behavior 0 · log 0 → 0), so they do not cause problems by themselves.
- If Q(i) = 0 while P(i) > 0, then KL(P‖Q) diverges to infinity (because you are assigning zero probability to an event that occurs under P). This is not a bug; it reflects an impossible event under Q. The sketch below demonstrates both zero cases.
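A minimal Python sketch of these zero-probability conventions (the function name is illustrative):

```python
import math

def kl_divergence(p, q):
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # 0 * log(0 / q) contributes 0 by the limit convention
        if qi == 0:
            return math.inf   # P puts mass on an event Q calls impossible
        total += pi * math.log(pi / qi)
    return total

print(kl_divergence([0.5, 0.5], [1.0, 0.0]))  # inf: Q(2) = 0 while P(2) > 0
print(kl_divergence([1.0, 0.0], [0.5, 0.5]))  # ≈ 0.6931: P(2) = 0 is harmless
```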