Non-negativity of the KL Divergence

by benmoran

It is straightforward to show that the KL divergence is never negative using Jensen’s inequality and the concavity of the \log function.

Jensen's inequality states that \mathbb{E}[f(x)] \geq f(\mathbb{E}[x]) whenever f is convex.

Setting f(x)=-\log(x), which is convex because \log is concave, and taking the expectation under p of the ratio q(x)/p(x) gives

\begin{aligned} D_{KL}(p\Vert q) & = \mathbb{E}_p\left[-\log \frac{q(x)}{p(x)}\right] \\ & \geq -\log \mathbb{E}_p\left[\frac{q(x)}{p(x)}\right] \\ & = -\log \int p(x) \frac{q(x)}{p(x)} dx \\ & = -\log \int q(x) dx = -\log 1 = 0 \\ \end{aligned}
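
As a quick numerical sanity check of the bound, here is a minimal sketch for discrete distributions, where the integral becomes a sum. The helper names kl_divergence and random_distribution are made up for this illustration, and the random trials only spot-check the inequality; they prove nothing.

```python
import math
import random


def kl_divergence(p, q):
    """D_KL(p || q) = sum_x p(x) * log(p(x) / q(x)) for discrete p, q.

    Assumes q(x) > 0 wherever p(x) > 0; terms with p(x) = 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def random_distribution(n, rng):
    """Draw n strictly positive weights and normalize them to sum to 1."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)  # fixed seed so the check is repeatable
values = []
for _ in range(1000):
    p = random_distribution(5, rng)
    q = random_distribution(5, rng)
    values.append(kl_divergence(p, q))

# Every sampled divergence should be non-negative (up to rounding error).
print(min(values) >= -1e-12)

# The bound is attained when the two distributions coincide.
print(kl_divergence(p, p))
```

The second print shows the equality case: when p = q the ratio is 1 everywhere, so every term in the sum vanishes.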

Furthermore, the KL divergence is just one member of a more general family, the Csiszár f-divergences. These have the form

D_{f}(p\Vert q) = \int q(x) f\left(\frac{p(x)}{q(x)}\right) dx

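for some convex function f. The same argument applies here: Jensen's inequality gives D_{f}(p\Vert q) = \mathbb{E}_q\left[f\left(\frac{p(x)}{q(x)}\right)\right] \geq f\left(\mathbb{E}_q\left[\frac{p(x)}{q(x)}\right]\right) = f(1). The lower bound is therefore f(1) rather than 0, so non-negativity follows only for choices of f with f(1) \geq 0 (the usual convention is f(1) = 0).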
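
To make the role of f(1) concrete, the sketch below evaluates the discrete analogue of D_f for three convex generators. The function f_divergence, the generator names, and the example distributions are all illustrative choices, not taken from any library. With f(t) = t \log t the formula reduces to D_{KL}(p\Vert q), with f(t) = -\log t it gives D_{KL}(q\Vert p), and with the shifted generator f(t) = t^2 - 2 (convex, but f(1) = -1) the result can go negative while still respecting the bound f(1).

```python
import math


def f_divergence(p, q, f):
    """D_f(p || q) = sum_x q(x) * f(p(x) / q(x)) for strictly positive discrete p, q."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


def kl_generator(t):
    """f(t) = t log t: convex with f(1) = 0; yields D_KL(p || q)."""
    return t * math.log(t)


def reverse_kl_generator(t):
    """f(t) = -log t: convex with f(1) = 0; yields D_KL(q || p)."""
    return -math.log(t)


def shifted_generator(t):
    """f(t) = t^2 - 2: convex, but f(1) = -1, so the bound is -1, not 0."""
    return t * t - 2.0


# Arbitrary example distributions on three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

for f in (kl_generator, reverse_kl_generator, shifted_generator):
    value = f_divergence(p, q, f)
    # Jensen's inequality guarantees value >= f(1) in every case.
    print(f.__name__, value, "lower bound:", f(1.0))
```

Only the first two generators give a quantity that is guaranteed non-negative; the third satisfies the Jensen bound but is negative for these inputs.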