Intuitively, the Fisher information is the expected curvature of the log-likelihood around a parameter: the more curvature, the "easier" the parameter is to estimate.

Formally, the Fisher information is the expected outer product of the score function, which is the gradient of the log-likelihood. If $s(\theta) = \nabla_\theta \log p(x_{1:n} \mid \theta)$ is the score based on $n$ observations, then the Fisher information is:

$$
I_n(\theta) = \mathbb{E}\left[ s(\theta)\, s(\theta)^\top \right].
$$
Since $s(\theta)$ is a $d$-dimensional vector (if $\theta$ is $d$-dimensional), the Fisher information is a $d \times d$ matrix. For iid data, the Fisher information satisfies $I_n(\theta) = n\, I_1(\theta)$, which is extremely handy.
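As a sanity check, both properties can be verified by Monte Carlo for a simple one-parameter model. The sketch below (my own example, not from the text) uses Bernoulli($\theta$) data, whose per-observation score is $x/\theta - (1-x)/(1-\theta)$ and whose Fisher information is known in closed form to be $1/(\theta(1-\theta))$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, trials = 0.3, 5, 200_000

# Draw `trials` datasets of n iid Bernoulli(theta) observations.
x = rng.binomial(1, theta, size=(trials, n))

# Per-observation score: d/dtheta log p(x|theta) = x/theta - (1-x)/(1-theta)
scores = x / theta - (1 - x) / (1 - theta)

I1 = np.mean(scores[:, 0] ** 2)        # E[s^2] for a single observation
In = np.mean(scores.sum(axis=1) ** 2)  # E[s^2] for the full dataset's score

print(I1, 1 / (theta * (1 - theta)))   # both approximately 4.76
print(In / I1)                         # approximately n = 5
```

The first line confirms the outer-product definition against the closed form; the ratio in the second line confirms the iid additivity $I_n(\theta) = n\, I_1(\theta)$.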

An equivalent way to define it is

$$
I_n(\theta) = -\mathbb{E}\left[ \nabla_\theta^2 \log p(x_{1:n} \mid \theta) \right],
$$

i.e. the expected Hessian of the negative log-likelihood.
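The equivalence of the two definitions can also be checked numerically. This sketch (my own example, under a Poisson($\lambda$) model, where both derivatives of the log-likelihood are easy to write down) compares the expected squared score against the negated expected second derivative; both should equal the known value $1/\lambda$:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, trials = 2.5, 200_000
x = rng.poisson(lam, size=trials)

# Poisson(lam): log p(x|lam) = x*log(lam) - lam - log(x!)
score = x / lam - 1    # first derivative of the log-likelihood
hess = -x / lam**2     # second derivative of the log-likelihood

outer = np.mean(score**2)   # E[s^2]  (outer-product definition)
neg_hess = -np.mean(hess)   # -E[d^2/dlam^2 log p]  (Hessian definition)

print(outer, neg_hess, 1 / lam)  # all approximately 0.4
```

Both estimates agree with $1/\lambda$ up to Monte Carlo noise, illustrating that, under the usual regularity conditions, the score's second moment and the expected negative Hessian coincide.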