Dan Dennett once said of Darwin’s theory of evolution that it was the best idea that anyone has ever had. You could say the same about maximum likelihood estimation (MLE) in the realm of statistical inference. It’s simple, elegant, and sometimes optimal.
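Given a parametric model with density $p(x;\theta)$ (parametric versus nonparametric statistics) and data $x$, we solve

$$\hat{\theta} = \operatorname*{arg\,max}_{\theta} \; p(x;\theta).$$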
So, given the data, we just optimize over the parameters that could have generated that data. Badaboom-badabing.
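To make the "just optimize over the parameters" step concrete, here is a minimal sketch, assuming i.i.d. data from an exponential distribution with unknown rate. The model choice, the function names (`neg_log_likelihood`, `golden_section_minimize`, `fit_exponential_mle`) and the search bounds are all illustrative assumptions, not anything prescribed above. It minimizes the negative log-likelihood numerically and compares the result with the closed-form answer, which for this model is one over the sample mean.

```python
import math
import random


def neg_log_likelihood(rate, data):
    """Negative log-likelihood of i.i.d. Exponential(rate) observations."""
    n = len(data)
    return -(n * math.log(rate) - rate * sum(data))


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


def fit_exponential_mle(data, lo=1e-6, hi=100.0):
    """Numerically maximize the likelihood over the rate (bounds are assumed)."""
    return golden_section_minimize(lambda r: neg_log_likelihood(r, data), lo, hi)


# Simulate data from a known rate so we can see how close the estimate lands.
rng = random.Random(0)
true_rate = 2.5
data = [rng.expovariate(true_rate) for _ in range(5000)]

rate_numeric = fit_exponential_mle(data)
rate_closed_form = len(data) / sum(data)  # analytic MLE: 1 / sample mean

print(f"numeric MLE:     {rate_numeric:.4f}")
print(f"closed-form MLE: {rate_closed_form:.4f}")
```

Working with the log-likelihood rather than the likelihood itself is the usual trick: the maximizer is the same, but sums of logs are far better behaved numerically than products of many small densities. The golden-section search only works here because the exponential negative log-likelihood is convex in the rate; a multi-parameter or non-convex model would need a general-purpose optimizer instead.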
todo guarantees etc.