The KL divergence between the joint and the product of the marginals. For a joint distribution $p(x, y)$ with marginals $p(x)$ and $p(y)$, the mutual information is $I(X; Y) = D_{\mathrm{KL}}\big(p(x, y) \,\|\, p(x)\,p(y)\big) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}$.

The mutual information is a measure of the dependence between $X$ and $Y$: if you know one, how much do you know about the other? (The MI is symmetric: $I(X; Y) = I(Y; X)$.) $I(X; Y) = 0$ iff $X$ and $Y$ are independent.
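As a concrete check of the definition, here is a minimal NumPy sketch (the helper name `mutual_information` is illustrative, not from any particular library) that computes the MI of a discrete joint distribution and verifies both extremes: an independent joint gives zero, and a perfectly correlated joint gives the full entropy of one variable.

```python
import numpy as np

def mutual_information(p_xy):
    """Mutual information (in nats) of a discrete joint distribution p_xy."""
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal p(x), shape (n, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal p(y), shape (1, m)
    mask = p_xy > 0                          # use the 0 * log 0 = 0 convention
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask])))

# Independent joint, p(x, y) = p(x) p(y): the log ratio is 0 everywhere, so MI = 0.
p_indep = np.outer([0.3, 0.7], [0.5, 0.5])
print(mutual_information(p_indep))  # 0.0

# Perfectly correlated fair coins (X = Y): knowing X determines Y, so MI = log 2.
p_corr = np.array([[0.5, 0.0], [0.0, 0.5]])
print(mutual_information(p_corr))   # ~0.693 (= log 2 nats)
```

Symmetry follows because the formula treats $x$ and $y$ identically: transposing `p_xy` leaves the result unchanged.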