Intuitively, the Fisher information is the expected curvature of the log-likelihood around a parameter value. The more curvature, the “easier” the parameter is to estimate: a sharply peaked likelihood pins the parameter down, while a flat one does not.
Formally, the Fisher information is the expected outer product of the score function, which is the gradient of the log-likelihood. If $s_n(\theta) = \nabla_\theta \log p(x_1, \dots, x_n \mid \theta)$ is the score based on $n$ observations, then the Fisher information is:

$$I_n(\theta) = \mathbb{E}\left[ s_n(\theta)\, s_n(\theta)^\top \right].$$
Since $s_n(\theta)$ is a $d$-dimensional vector (if $\theta$ is $d$-dimensional), the Fisher information is a $d \times d$ matrix. For iid data, the Fisher information satisfies $I_n(\theta) = n\, I_1(\theta)$, which is extremely handy.
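As a sanity check of the additivity property, here is a minimal Monte Carlo sketch using a Bernoulli model as a hypothetical example (the parameter values and sample sizes are arbitrary choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.3            # hypothetical true parameter
n = 5              # iid observations per sample
num_samples = 200_000

# Score of the Bernoulli(p) log-likelihood for n iid observations:
#   d/dp sum_i [x_i log p + (1 - x_i) log(1 - p)] = sum_i (x_i - p) / (p (1 - p))
x = rng.binomial(1, p, size=(num_samples, n))
score = (x - p).sum(axis=1) / (p * (1 - p))

# Monte Carlo estimate of I_n(p) = E[score^2] (scalar parameter, so the
# "outer product" is just a square).
I_n = (score ** 2).mean()

# Closed-form single-observation Fisher information for Bernoulli(p).
I_1 = 1 / (p * (1 - p))

print(I_n, n * I_1)   # the two values should be close
```

The estimate of $I_n$ lands near $n \cdot I_1 \approx 23.8$, illustrating $I_n(\theta) = n\, I_1(\theta)$.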
An equivalent way to define it is

$$I_n(\theta) = -\,\mathbb{E}\left[ \nabla^2_\theta \log p(x_1, \dots, x_n \mid \theta) \right],$$

i.e. the expected Hessian of the negative log-likelihood.