Density estimation in a parametric class of models. That is, we observe $X_1, \dots, X_n \sim p_\theta$ for some $\theta \in \Theta$, where $\Theta$ is finite dimensional. We want to infer $\theta$, either with a point estimate or a distribution. There are various approaches.

The most common and straightforward is the maximum likelihood estimator (MLE): find the parameters that maximize the likelihood or, equivalently, the log-likelihood. This is a frequentist approach. One can also use regularized MLE, in which we add a penalty term to the objective to avoid overfitting.
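As a concrete case, here is a minimal sketch of the MLE for a Gaussian family, where the maximizer has a closed form. The Gaussian model, the simulated data, and the function names are all assumptions made for illustration. The penalized variant assumes unit variance and a quadratic penalty `lam * mu**2`, which is just one possible choice of regularizer.

```python
import numpy as np

def gaussian_log_likelihood(x, mu, sigma):
    """Log-likelihood of i.i.d. N(mu, sigma^2) observations."""
    n = x.size
    return (-0.5 * n * np.log(2 * np.pi * sigma**2)
            - np.sum((x - mu) ** 2) / (2 * sigma**2))

def gaussian_mle(x):
    """Closed-form MLE: sample mean and the 1/n (biased) standard deviation."""
    mu_hat = x.mean()
    sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))
    return mu_hat, sigma_hat

def penalized_mean(x, lam):
    """Maximizer of -0.5 * sum((x - mu)^2) - lam * mu^2 (unit variance assumed)."""
    return x.sum() / (x.size + 2 * lam)

# Simulated data; the true parameters are known only because we chose them.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=500)

mu_hat, sigma_hat = gaussian_mle(x)
print("MLE:", mu_hat, sigma_hat)
print("penalized mean (lam=10):", penalized_mean(x, lam=10.0))
```

The penalty shrinks the estimated mean toward zero, and setting `lam=0` recovers the plain sample mean.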

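Another frequentist approach is the method of moments: choose the parameters so that the model’s moments match the empirical moments of the data. This is often simpler to compute than the MLE, but typically less statistically efficient. It has nonetheless seen success in estimating mixtures of Gaussians.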
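To make the moment-matching idea concrete, here is a small sketch for a Gamma family rather than a Gaussian mixture, chosen only because its first two moments invert by hand. With shape $k$ and scale $s$, the mean is $ks$ and the variance is $ks^2$, so $s = \text{var}/\text{mean}$ and $k = \text{mean}^2/\text{var}$. The function name and the simulated data are assumptions.

```python
import numpy as np

def gamma_method_of_moments(x):
    """Match the sample mean and variance to those of a Gamma(shape, scale)."""
    m = x.mean()
    v = x.var()
    scale = v / m        # from var / mean = scale
    shape = m**2 / v     # from mean^2 / var = shape
    return shape, scale

# Simulated data with known parameters, for illustration only.
rng = np.random.default_rng(0)
x = rng.gamma(shape=3.0, scale=2.0, size=5000)

shape_hat, scale_hat = gamma_method_of_moments(x)
print("shape:", shape_hat, "scale:", scale_hat)
```

No optimization is needed here, whereas the Gamma MLE for the shape parameter has no closed form and requires a numerical solve.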

A final frequentist approach is deep density estimation. Many might consider this a solution to nonparametric density estimation but I think that’s wrong; we choose the architecture of the network first, which gives rise to a set of possible functions $f_w$, where $w$ represents a particular set of weights in the given architecture. The set of all weight configurations is parametric (it’s high dimensional, but still finite dimensional).
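To see that the weight space really is finite dimensional, one can count the parameters of a fixed architecture. The sketch below does this for a fully connected network; the layer sizes are a hypothetical example, not taken from any particular model.

```python
def count_mlp_parameters(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# Hypothetical architecture: 1 input, two hidden layers of 64 units, 1 output.
layer_sizes = [1, 64, 64, 1]
n_params = count_mlp_parameters(layer_sizes)
print(n_params)  # 4353
```

So for this architecture $w \in \mathbb{R}^{4353}$: a large parameter space, but one whose dimension is fixed before seeing any data.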

There is also a natural Bayesian approach, which is to place a prior $\pi(\theta)$ over $\theta$, and then compute the posterior $\pi(\theta \mid X_1, \dots, X_n)$. If we want a point estimate $\hat\theta$, we can compute it as some function of the posterior. Eg it’s natural to take the posterior mean: $\hat\theta = \mathbb{E}[\theta \mid X_1, \dots, X_n]$. As usual, the Bayesian approach allows for very natural uncertainty quantification, since we have a distribution over the parameter space.
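A simple case where the posterior is available in closed form is a Gaussian likelihood with known standard deviation and a Gaussian prior on the mean. The sketch below assumes exactly that conjugate setup; the prior values, the data, and the function name are illustrative assumptions. It returns the posterior mean as the point estimate and builds an approximate 95% credible interval from the posterior standard deviation.

```python
import numpy as np

def normal_posterior(x, sigma, prior_mean, prior_std):
    """Posterior over the mean of N(theta, sigma^2) data under a N(prior_mean, prior_std^2) prior."""
    prior_precision = 1.0 / prior_std**2
    data_precision = x.size / sigma**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + x.sum() / sigma**2)
    return post_mean, np.sqrt(post_var)

# Simulated data; sigma is treated as known in this sketch.
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=50)

post_mean, post_std = normal_posterior(x, sigma=2.0, prior_mean=0.0, prior_std=1.0)
credible_interval = (post_mean - 1.96 * post_std, post_mean + 1.96 * post_std)
print("posterior mean:", post_mean)
print("approx. 95% credible interval:", credible_interval)
```

The posterior mean lands between the prior mean and the sample mean, weighted by their precisions, and the interval comes directly from the posterior rather than from a separate sampling argument.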

Unlike nonparametric density estimation, we must specify the parametric family of densities to use, which is of course challenging. Once we have a fitted model, we might consider running a goodness-of-fit test to see how well it actually fits the data, or whether we need to consider a different family of densities.
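One way such a check might look is a Kolmogorov-Smirnov test via `scipy.stats.kstest`. In the sketch below the data are simulated from an exponential distribution, and two candidate families (Gaussian and exponential) are compared. Fitting on one half and testing on the other is a choice made here so that the test is not run on the same points used to estimate the parameters. The resulting p-values should still be read as a rough diagnostic, not an exact calibration.

```python
import numpy as np
from scipy import stats

# Simulated data that is, by construction, not Gaussian.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=2000)
fit_half, held_out = x[:1000], x[1000:]

# Candidate 1: Gaussian, with parameters estimated on the first half.
mu_hat, sigma_hat = fit_half.mean(), fit_half.std()
gauss_test = stats.kstest(held_out, "norm", args=(mu_hat, sigma_hat))

# Candidate 2: exponential; the MLE of the scale is the sample mean.
scale_hat = fit_half.mean()
expon_test = stats.kstest(held_out, "expon", args=(0.0, scale_hat))  # (loc, scale)

print("Gaussian:    D =", gauss_test.statistic, " p =", gauss_test.pvalue)
print("Exponential: D =", expon_test.statistic, " p =", expon_test.pvalue)
```

A very small p-value for the Gaussian candidate suggests switching families, which is the situation described above.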