Intuitively, the Fisher information is the expected curvature of the log-likelihood around a parameter: the more curvature, the "easier" the parameter is to estimate.

Formally, the Fisher information is the expected outer product of the score function, which is the gradient of the log-likelihood. If $s(\theta) = \nabla_\theta \log p(x_{1:n} \mid \theta)$ is the score based on $n$ observations, then the Fisher information is:

$$
I_n(\theta) = \mathbb{E}\left[ s(\theta)\, s(\theta)^\top \right].
$$
Since $s(\theta)$ is a $d$-dimensional vector (if $\theta$ is $d$-dimensional), the Fisher information is a $d \times d$ matrix. For iid data, the Fisher information satisfies $I_n(\theta) = n\, I_1(\theta)$, which is extremely handy.
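As a sanity check, both properties can be verified by Monte Carlo for a simple one-parameter model. The sketch below (my own example, not from the text) uses Bernoulli($\theta$) data, whose per-observation score is $x/\theta - (1-x)/(1-\theta)$ and whose Fisher information is known in closed form to be $1/(\theta(1-\theta))$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, trials = 0.3, 5, 200_000

# Draw `trials` datasets of n iid Bernoulli(theta) observations.
x = rng.binomial(1, theta, size=(trials, n))

# Per-observation score: d/dtheta log p(x|theta) = x/theta - (1-x)/(1-theta)
scores = x / theta - (1 - x) / (1 - theta)

I1 = np.mean(scores[:, 0] ** 2)        # E[s^2] for a single observation
In = np.mean(scores.sum(axis=1) ** 2)  # E[s^2] for the full dataset's score

print(I1, 1 / (theta * (1 - theta)))   # both approximately 4.76
print(In / I1)                         # approximately n = 5
```

The first line confirms the outer-product definition against the closed form; the ratio in the second line confirms the iid additivity $I_n(\theta) = n\, I_1(\theta)$.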

An equivalent way to define it is

$$
I_n(\theta) = -\mathbb{E}\left[ \nabla_\theta^2 \log p(x_{1:n} \mid \theta) \right],
$$

i.e. the expected Hessian of the negative log-likelihood.
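The equivalence of the two definitions can also be checked numerically. This sketch (my own example, under a Poisson($\lambda$) model, where both derivatives of the log-likelihood are easy to write down) compares the expected squared score against the negated expected second derivative; both should equal the known value $1/\lambda$:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, trials = 2.5, 200_000
x = rng.poisson(lam, size=trials)

# Poisson(lam): log p(x|lam) = x*log(lam) - lam - log(x!)
score = x / lam - 1    # first derivative of the log-likelihood
hess = -x / lam**2     # second derivative of the log-likelihood

outer = np.mean(score**2)   # E[s^2]  (outer-product definition)
neg_hess = -np.mean(hess)   # -E[d^2/dlam^2 log p]  (Hessian definition)

print(outer, neg_hess, 1 / lam)  # all approximately 0.4
```

Both estimates agree with $1/\lambda$ up to Monte Carlo noise, illustrating that, under the usual regularity conditions, the score's second moment and the expected negative Hessian coincide.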