Non-negativity of the KL Divergence
by benmoran
It is straightforward to show that the KL divergence is never negative using Jensen’s inequality and the concavity of the logarithm.
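Jensen implies that $E[f(X)] \ge f(E[X])$ when $f$ is convex.
Setting $f = -\log$, which is convex because $\log$ is concave, gives
$$\mathrm{KL}(p \,\|\, q) = E_p\left[-\log \frac{q(x)}{p(x)}\right] \ge -\log E_p\left[\frac{q(x)}{p(x)}\right] = -\log \int q(x)\,dx = -\log 1 = 0.$$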
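As a quick numerical sanity check, here is a minimal Python sketch for discrete distributions, where $\mathrm{KL}(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$. The helper names `kl_divergence` and `random_distribution` are made up for this illustration, and the random seed is arbitrary.

```python
import math
import random


def kl_divergence(p, q):
    """KL(p || q) = sum_x p(x) * log(p(x) / q(x)) for discrete distributions."""
    # Terms with p(x) == 0 contribute 0 by the usual convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def random_distribution(n, rng):
    """Draw n strictly positive weights and normalize them to sum to 1."""
    weights = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)  # arbitrary seed, only for reproducibility
p = random_distribution(5, rng)
q = random_distribution(5, rng)

# Right-hand side of Jensen's bound: -log E_p[q/p] = -log(sum_x q(x)) = -log 1.
jensen_bound = -math.log(sum(pi * (qi / pi) for pi, qi in zip(p, q)))

print("KL(p || q)   =", kl_divergence(p, q))
print("Jensen bound =", jensen_bound)
```

The printed divergence should sit at or above the bound, which is zero up to floating-point rounding, and `kl_divergence(p, p)` should come out as exactly zero.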
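Furthermore, the KL divergence is just one member of a more general family, the Csiszár $f$-divergences. These have the form
$$D_f(p \,\|\, q) = E_q\left[f\left(\frac{p(x)}{q(x)}\right)\right]$$
for some convex function $f$. The same argument applies here, since $E_q[p(x)/q(x)] = 1$ (noting that the lower bound is now $f(1)$, so this will only translate into non-negativity for particular choices of $f$, such as those normalized so that $f(1) = 0$).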
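The bound for the general family can also be checked numerically. The sketch below assumes the convention $D_f(p \,\|\, q) = \sum_x q(x)\, f(p(x)/q(x))$ for discrete distributions with $q(x) > 0$. The function name `f_divergence`, the example distributions, and the `shifted` generator are all made up for illustration. `shifted` is deliberately chosen so that $f(1) = -1$, which shows how the lower bound can drop below zero.

```python
import math


def f_divergence(p, q, f):
    """D_f(p || q) = sum_x q(x) * f(p(x) / q(x)), assuming q(x) > 0 everywhere."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


# Convex generators; Jensen gives D_f(p || q) >= f(1) for each of them.
generators = {
    "kl": lambda t: t * math.log(t) if t > 0 else 0.0,  # f(1) = 0
    "reverse_kl": lambda t: -math.log(t),               # f(1) = 0
    "total_variation": lambda t: 0.5 * abs(t - 1.0),    # f(1) = 0
    "shifted": lambda t: (t - 1.0) ** 2 - 1.0,          # f(1) = -1
}

p = [0.1, 0.2, 0.7]
q = [0.3, 0.3, 0.4]

for name, f in generators.items():
    print(name, "D_f =", f_divergence(p, q, f), "lower bound f(1) =", f(1.0))
```

Each divergence should be at least its own $f(1)$. With the `shifted` generator, `f_divergence(p, p, f)` equals $f(1) = -1$, so that choice of $f$ does not give a non-negative divergence.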