- Source: Local asymptotic normality
In statistics, local asymptotic normality is a property of a sequence of statistical models that allows the sequence to be asymptotically approximated by a normal location model after an appropriate rescaling of the parameter. An important example in which local asymptotic normality holds is i.i.d. sampling from a regular parametric model.
The notion of local asymptotic normality was introduced by Le Cam (1960) and is fundamental in the treatment of estimator and test efficiency.
Definition
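A sequence of parametric statistical models $\{P_{n,\theta} : \theta \in \Theta\}$ is said to be locally asymptotically normal (LAN) at $\theta$ if there exist matrices $r_n$ and $I_\theta$ and a sequence of random vectors $\Delta_{n,\theta}$ that converges in distribution to $N(0, I_\theta)$ under $P_{n,\theta}$ such that, for every converging sequence $h_n \to h$,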
$$\ln \frac{dP_{n,\theta + r_n^{-1} h_n}}{dP_{n,\theta}} = h' \Delta_{n,\theta} - \frac{1}{2} h' I_\theta\, h + o_{P_{n,\theta}}(1),$$
where the derivative is a Radon–Nikodym derivative, a formalised version of the likelihood ratio, and $o_{P_{n,\theta}}(1)$ denotes a term that converges to zero in probability under $P_{n,\theta}$ (little-o in probability notation). In other words, the local log-likelihood ratio must converge in distribution to a normal random variable whose mean equals minus one half of its variance:
$$\ln \frac{dP_{n,\theta + r_n^{-1} h_n}}{dP_{n,\theta}} \ \xrightarrow{d}\ \mathcal{N}\Big({-\tfrac{1}{2}} h' I_\theta\, h,\ h' I_\theta\, h\Big).$$
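The link between the mean and the variance can be motivated by a short calculation. This is a heuristic sketch rather than a proof, and the symbols $Z$, $\mu$ and $\sigma^2$ are introduced here only for illustration. A likelihood ratio has expectation at most one under $P_{n,\theta}$, and exactly one when the two measures are mutually absolutely continuous. If the limit of the log-likelihood ratio is $Z \sim N(\mu, \sigma^2)$, the Gaussian moment generating function gives

$$\mathrm{E}\big[e^{Z}\big] = \exp\!\Big(\mu + \tfrac{1}{2}\sigma^2\Big), \qquad\text{so}\qquad \mathrm{E}\big[e^{Z}\big] = 1 \iff \mu = -\tfrac{1}{2}\sigma^2 .$$

With $\sigma^2 = h' I_\theta\, h$ this yields the mean $-\tfrac{1}{2} h' I_\theta\, h$ appearing in the limit.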
The sequences of distributions $P_{n,\theta + r_n^{-1} h_n}$ and $P_{n,\theta}$ are contiguous.
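As a check on the definition, consider the normal location model itself, chosen here as an illustrative assumption: $X_1, \ldots, X_n$ iid $N(\theta, 1)$ with scalar $\theta$ and $r_n = \sqrt{n}$. For a fixed $h$ the expansion holds exactly, with no remainder term:

$$\ln \frac{dP_{n,\theta + h/\sqrt{n}}}{dP_{n,\theta}} = \sum_{i=1}^{n} \Big[ \tfrac{1}{2}(x_i - \theta)^2 - \tfrac{1}{2}\big(x_i - \theta - h/\sqrt{n}\big)^2 \Big] = h \cdot \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (x_i - \theta) \;-\; \frac{1}{2} h^2 .$$

Here $\Delta_{n,\theta} = n^{-1/2} \sum_{i=1}^{n} (x_i - \theta)$ is exactly $N(0, 1)$ under $P_{n,\theta}$ and $I_\theta = 1$. Replacing $h$ by a sequence $h_n \to h$ adds the term $(h_n - h)\Delta_{n,\theta} - \tfrac{1}{2}(h_n^2 - h^2)$, which is $o_{P_{n,\theta}}(1)$.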
Example
The most straightforward example of a LAN model is an iid model whose likelihood is twice continuously differentiable. Suppose $\{X_1, X_2, \ldots, X_n\}$ is an iid sample, where each $X_i$ has density function $f(x, \theta)$. The likelihood function of the model is equal to
$$p_{n,\theta}(x_1, \ldots, x_n;\, \theta) = \prod_{i=1}^{n} f(x_i, \theta).$$
If f is twice continuously differentiable in θ, then
$$\begin{aligned}
\ln p_{n,\theta + \delta\theta} &\approx \ln p_{n,\theta} + \delta\theta' \frac{\partial \ln p_{n,\theta}}{\partial \theta} + \frac{1}{2}\, \delta\theta' \frac{\partial^2 \ln p_{n,\theta}}{\partial \theta\, \partial \theta'}\, \delta\theta \\
&= \ln p_{n,\theta} + \delta\theta' \sum_{i=1}^{n} \frac{\partial \ln f(x_i, \theta)}{\partial \theta} + \frac{1}{2}\, \delta\theta' \bigg[ \sum_{i=1}^{n} \frac{\partial^2 \ln f(x_i, \theta)}{\partial \theta\, \partial \theta'} \bigg] \delta\theta .
\end{aligned}$$
Plugging in $\delta\theta = h/\sqrt{n}$ gives
$$\ln \frac{p_{n,\theta + h/\sqrt{n}}}{p_{n,\theta}} = h' \Bigg( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial \ln f(x_i, \theta)}{\partial \theta} \Bigg) \;-\; \frac{1}{2}\, h' \Bigg( \frac{1}{n} \sum_{i=1}^{n} -\frac{\partial^2 \ln f(x_i, \theta)}{\partial \theta\, \partial \theta'} \Bigg) h \;+\; o_p(1).$$
By the central limit theorem, the expression in the first parentheses converges in distribution to a normal random vector $\Delta_\theta \sim N(0, I_\theta)$, whereas by the law of large numbers the expression in the second parentheses converges in probability to $I_\theta$, the Fisher information matrix:
$$I_\theta = \mathrm{E}\bigg[ -\frac{\partial^2 \ln f(X_i, \theta)}{\partial \theta\, \partial \theta'} \bigg] = \mathrm{E}\bigg[ \bigg( \frac{\partial \ln f(X_i, \theta)}{\partial \theta} \bigg) \bigg( \frac{\partial \ln f(X_i, \theta)}{\partial \theta} \bigg)' \, \bigg].$$
Thus the definition of local asymptotic normality is satisfied with rescaling $r_n = \sqrt{n}$ (times the identity matrix), and the parametric model with iid observations and a twice continuously differentiable likelihood has the LAN property.
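The limit can also be checked numerically. The following is a minimal simulation sketch, not part of the original derivation: it assumes an exponential model with rate $\theta$, that is $f(x, \theta) = \theta e^{-\theta x}$, for which the Fisher information is $1/\theta^2$. The function name `simulate_log_lr` and the values of `theta`, `h`, `n` and `reps` are illustrative choices. If the LAN approximation is accurate, the simulated log-likelihood ratios should have mean close to $-\tfrac{1}{2} h^2 I_\theta$ and variance close to $h^2 I_\theta$.

```python
import numpy as np


def simulate_log_lr(theta, h, n, reps, seed=0):
    """Simulate ln(p_{n, theta + h/sqrt(n)} / p_{n, theta}) under P_{n, theta}.

    Assumed model (illustrative): X_i iid exponential with rate theta,
    so ln f(x, theta) = ln(theta) - theta * x.
    """
    rng = np.random.default_rng(seed)
    # NumPy parameterises the exponential by its scale, which is 1 / rate.
    x = rng.exponential(scale=1.0 / theta, size=(reps, n))
    theta_local = theta + h / np.sqrt(n)
    # Difference of log-likelihoods: n ln(theta_local / theta) minus
    # (theta_local - theta) times the sum of the observations.
    return n * np.log(theta_local / theta) - (theta_local - theta) * x.sum(axis=1)


theta, h = 2.0, 1.0
log_lr = simulate_log_lr(theta, h, n=400, reps=10_000)

fisher = 1.0 / theta**2  # Fisher information of the exponential rate model
print("simulated mean:    ", log_lr.mean(), " LAN limit:", -0.5 * h**2 * fisher)
print("simulated variance:", log_lr.var(), " LAN limit:", h**2 * fisher)
```

Increasing `n` should bring the simulated mean closer to the limit, since the finite-sample discrepancy comes from higher-order terms of the Taylor expansion.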
See also
Asymptotic distribution