Source: Asymptotic equipartition property

In information theory, the asymptotic equipartition property (AEP) is a general property of the output samples of a stochastic source. It is fundamental to the concept of the typical set used in theories of data compression.
Roughly speaking, the theorem states that although a random process may produce many different sequences of outcomes, the sequence actually produced most probably belongs to a loosely defined set of outcomes that all have approximately the same probability of being realized. (This is a consequence of the law of large numbers and ergodic theory.) Although some individual outcomes have a higher probability than any outcome in this set, the sheer number of outcomes in the set almost guarantees that the realized outcome comes from it. One way to understand the property intuitively is through Cramér's large deviation theorem, which states that the probability of a large deviation from the mean decays exponentially with the number of samples. Such results are studied in large deviations theory; intuitively, it is the large deviations that would violate equipartition, but these are unlikely.
In the field of pseudorandom number generation, a candidate generator of undetermined quality whose output sequence lies too far outside the typical set by some statistical criteria is rejected as insufficiently random. Thus, although the typical set is loosely defined, practical notions arise concerning sufficient typicality.
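The typical-set intuition can be checked exactly for a small example. The following is a minimal sketch, assuming an i.i.d. Bernoulli source with a hypothetical bias p = 0.2 and tolerance eps = 0.1 (neither value comes from the article). It groups sequences by their number of ones, so no enumeration of all 2^n sequences is needed. The function names are illustrative only.

```python
import math

def entropy_bits(p: float) -> float:
    """Binary entropy H(p) in bits, for 0 < p < 1."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def typical_set_stats(n: int, p: float, eps: float):
    """Exact size and probability mass of the eps-typical set for n
    i.i.d. Bernoulli(p) symbols.

    Every sequence with k ones has probability p**k * (1-p)**(n-k),
    so sequences are grouped by k and counted with binomial coefficients.
    """
    h = entropy_bits(p)
    size, mass = 0, 0.0
    for k in range(n + 1):
        log2_prob = k * math.log2(p) + (n - k) * math.log2(1 - p)
        if abs(-log2_prob / n - h) < eps:      # typicality condition
            count = math.comb(n, k)            # sequences with k ones
            size += count
            # work in the log domain to avoid overflow for large n
            mass += 2.0 ** (math.log2(count) + log2_prob)
    return size, mass

if __name__ == "__main__":
    p, eps = 0.2, 0.1                          # assumed example values
    for n in (50, 200, 1000):
        size, mass = typical_set_stats(n, p, eps)
        print(n, f"mass={mass:.4f}",
              f"log2(size)/n={math.log2(size) / n:.4f}",
              f"H={entropy_bits(p):.4f}")
```

In this example the all-zeros sequence is the single most probable outcome, yet it fails the typicality condition, while the mass of the typical set grows toward 1 as n increases.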

Definition

Discrete-time i.i.d. sources

Discrete-time finite-valued stationary ergodic sources

Non-stationary discrete-time source producing independent symbols

for all i, for some M > 0, the following holds (AEP):

$$\lim_{n\to\infty} \Pr\left[\,\left| -\frac{1}{n}\log p(X_1, X_2, \ldots, X_n) - \overline{H}_n(X) \right| < \varepsilon \right] = 1 \qquad \forall \varepsilon > 0$$

where

$$\overline{H}_n(X) = \frac{1}{n} H(X_1, X_2, \ldots, X_n)$$
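The convergence of the sample information rate to the averaged entropy can be illustrated numerically. This is a minimal sketch, assuming independent Bernoulli symbols whose bias q_i changes with time according to a hypothetical schedule confined to [0.1, 0.9) (so that the per-symbol log-probabilities have uniformly bounded variance). The schedule and function names are illustrative and do not come from the article.

```python
import math
import random

def bernoulli_entropy(q: float) -> float:
    """Entropy in nats of a Bernoulli(q) symbol, for 0 < q < 1."""
    return -q * math.log(q) - (1 - q) * math.log(1 - q)

def sample_information_gap(qs, rng):
    """Draw one independent symbol X_i ~ Bernoulli(qs[i]) per entry and
    return -(1/n) log p(X_1..X_n) minus the averaged entropy H-bar_n."""
    n = len(qs)
    neg_log_p = 0.0
    for q in qs:
        x = rng.random() < q                     # sample X_i
        neg_log_p -= math.log(q if x else 1 - q)
    # independence makes the joint entropy the sum of marginal entropies
    h_bar = sum(bernoulli_entropy(q) for q in qs) / n
    return neg_log_p / n - h_bar

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 10_000, 100_000):
        # hypothetical time-varying statistics, kept away from 0 and 1
        qs = [0.1 + 0.8 * ((i * 0.6180339887) % 1.0) for i in range(n)]
        print(n, abs(sample_information_gap(qs, rng)))
```

The printed gap should shrink as n grows, which is what the limit statement asserts in probability.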

Applications

The asymptotic equipartition property for a non-stationary discrete-time independent process leads (among other results) to the source coding theorem for non-stationary sources with independent output symbols and to the noisy-channel coding theorem for non-stationary memoryless channels.

Measure-theoretic form

$T$ is a measure-preserving map on the probability space $\Omega$ with probability measure $\mu$. If $P$ is a finite or countable partition of $\Omega$, then its entropy is

$$H(P) := -\sum_{p\in P} \mu(p)\ln\mu(p)$$

with the convention that $0\ln 0 = 0$. We only consider partitions with finite entropy: $H(P) < \infty$.
If $P$ is a finite or countable partition of $\Omega$, then we construct a sequence of partitions by iterating the map:

$$P^{(n)} := P \vee T^{-1}P \vee \dots \vee T^{-(n-1)}P$$

where $P \vee Q$ is the least upper bound partition, that is, the least refined partition that refines both $P$ and $Q$:

$$P \vee Q := \{p \cap q : p \in P, q \in Q\}$$
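The partition entropy, the join, and the iterated partition can be computed directly on a finite probability space. This is a minimal sketch, assuming a hypothetical example in which the space is {0, ..., 7} with uniform measure and T is the rotation x -> x + 1 (mod 8). Partitions are represented as sets of frozensets, and all function names are illustrative.

```python
import math
from itertools import product

def entropy(partition, mu):
    """H(P) = -sum over cells of mu(p) ln mu(p), in nats (0 ln 0 = 0)."""
    total = 0.0
    for cell in partition:
        m = sum(mu[x] for x in cell)
        if m > 0:
            total -= m * math.log(m)
    return total

def join(p, q):
    """P v Q: all non-empty intersections of a cell of P with a cell of Q."""
    return {a & b for a, b in product(p, q) if a & b}

def preimage(partition, t, omega):
    """T^{-1}P: the cells {x : T(x) in p} for p in P, dropping empty ones."""
    cells = {frozenset(x for x in omega if t(x) in cell) for cell in partition}
    return {c for c in cells if c}

def iterated_join(partition, t, omega, n):
    """P^(n) = P v T^{-1}P v ... v T^{-(n-1)}P."""
    result, current = set(partition), set(partition)
    for _ in range(n - 1):
        current = preimage(current, t, omega)   # next pulled-back partition
        result = join(result, current)
    return result

if __name__ == "__main__":
    omega = range(8)                            # assumed toy space
    mu = {x: 1 / 8 for x in omega}              # uniform measure
    rotate = lambda x: (x + 1) % 8              # measure-preserving map
    P = {frozenset(range(4)), frozenset(range(4, 8))}
    for n in (1, 2, 3, 4, 5):
        Pn = iterated_join(P, rotate, omega, n)
        print(n, len(Pn), entropy(Pn, mu) / n)
```

On a finite space the iterated partition eventually separates every point, so the per-step entropy printed in the last column decays toward zero. Non-trivial limits require an infinite space.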

Write $P(x)$ for the set in $P$ that contains $x$. So, for example, $P^{(n)}(x)$ is the $n$-letter initial segment of the $(P,T)$-name of $x$.
Write $I_P(x)$ for the information (in units of nats) about $x$ that we can recover if we know which element of the partition $P$ contains $x$:

$$I_P(x) := -\ln\mu(P(x))$$

Similarly, the conditional information of partition $P$, conditional on partition $Q$, about $x$, is

$$I_{P|Q}(x) := -\ln\frac{\mu\big((P\vee Q)(x)\big)}{\mu\big(Q(x)\big)}$$

$h_T(P)$ is the Kolmogorov–Sinai entropy

$$h_T(P) := \lim_n \frac{1}{n} H(P^{(n)}) = \lim_n E_{x\sim\mu}\left[\frac{1}{n} I_{P^{(n)}}(x)\right]$$

In other words, by definition, we have convergence in expectation. The SMB (Shannon–McMillan–Breiman) theorem states that when $T$ is ergodic, $\frac{1}{n} I_{P^{(n)}}(x)$ converges to $h_T(P)$ in $L^1$.
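As an illustration of the ergodic case, consider the shift map on sequences generated by a stationary two-state Markov chain, with the partition given by the current symbol. The sample information rate along one path can then be compared with the chain's entropy rate. This is a minimal sketch; the transition matrix below is a hypothetical example, and the names are illustrative.

```python
import math
import random

A = [[0.9, 0.1], [0.4, 0.6]]   # assumed transition matrix (rows sum to 1)
PI = [0.8, 0.2]                # its stationary distribution: PI A = PI

def entropy_rate() -> float:
    """Entropy rate in nats: -sum_i PI[i] sum_j A[i][j] ln A[i][j]."""
    return -sum(PI[i] * A[i][j] * math.log(A[i][j])
                for i in range(2) for j in range(2))

def sample_information_rate(n: int, rng) -> float:
    """-(1/n) ln mu(x_1..x_n) along one sampled path of length n,
    started from the stationary distribution."""
    state = 0 if rng.random() < PI[0] else 1
    neg_log = -math.log(PI[state])
    for _ in range(n - 1):
        nxt = 0 if rng.random() < A[state][0] else 1
        neg_log -= math.log(A[state][nxt])   # add -ln of transition prob
        state = nxt
    return neg_log / n

if __name__ == "__main__":
    rng = random.Random(0)
    print("entropy rate:", entropy_rate())
    for n in (100, 10_000, 100_000):
        print(n, sample_information_rate(n, rng))
```

Because this chain is irreducible and aperiodic, the associated shift is ergodic, and the sampled rates should settle near the single constant printed on the first line.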

If $T$ is not necessarily ergodic, then the underlying probability space would be split up into multiple subsets, each invariant under $T$. In this case, we still have $L^1$ convergence to some function, but that function is no longer a constant function.
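A simple non-ergodic example is a source that first picks one of two coins at random and then flips only that coin forever. This is a minimal sketch, assuming hypothetical biases 0.1 and 0.5 with equal weights. It computes the information rate under the mixture measure and shows that the limit depends on which coin was chosen. The names are illustrative.

```python
import math
import random

def binary_entropy(q: float) -> float:
    """Entropy in nats of a Bernoulli(q) symbol, for 0 < q < 1."""
    return -q * math.log(q) - (1 - q) * math.log(1 - q)

def mixture_information_rate(n, qs, weights, rng):
    """Choose one bias from qs once (the choice is never revisited, which
    is what makes the source non-ergodic), flip n times, and return
    (chosen index, -(1/n) ln mu(x_1..x_n)) under the mixture measure mu."""
    idx = rng.choices(range(len(qs)), weights=weights)[0]
    k = sum(rng.random() < qs[idx] for _ in range(n))   # number of ones
    # log of each component's contribution to the mixture probability
    logs = [math.log(w) + k * math.log(q) + (n - k) * math.log(1 - q)
            for q, w in zip(qs, weights)]
    m = max(logs)
    log_mu = m + math.log(sum(math.exp(v - m) for v in logs))  # log-sum-exp
    return idx, -log_mu / n

if __name__ == "__main__":
    qs, weights = [0.1, 0.5], [0.5, 0.5]      # assumed example values
    for seed in range(6):
        idx, rate = mixture_information_rate(20_000, qs, weights,
                                             random.Random(seed))
        print(idx, f"rate={rate:.4f}", f"H(q)={binary_entropy(qs[idx]):.4f}")
```

Each run converges to the entropy of the coin that was selected, so the limit is a function of the sample point that takes two different values, one on each invariant subset.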

When $T$ is ergodic, the $\sigma$-algebra $\mathcal{I}$ of $T$-invariant sets is trivial, and so the function

$$x \mapsto E\left[\lim_n I_{P|\vee_{k=1}^{n} T^{-k}P} \,\Big|\, \mathcal{I}\right]$$

simplifies into the constant function $x \mapsto E\left[\lim_n I_{P|\vee_{k=1}^{n} T^{-k}P}\right]$, which by definition equals $\lim_n H(P|\vee_{k=1}^{n} T^{-k}P)$, which equals $h_T(P)$ by a proposition.

Continuous-time stationary ergodic sources

Discrete-time functions can be interpolated to continuous-time functions. If such an interpolation $f$ is measurable, we may define the continuous-time stationary process accordingly as $\tilde{X} := f \circ X$. If the asymptotic equipartition property holds for the discrete-time process, as in the i.i.d. or finite-valued stationary ergodic cases shown above, it automatically holds for the continuous-time stationary process derived from it by some measurable interpolation, that is,

$$-\frac{1}{n}\log p(\tilde{X}_0^{\tau}) \to H(X)$$

where $n$ is the number of degrees of freedom in time $\tau$. $nH(X)/\tau$ and $H(X)$ are the entropy per unit time and per degree of freedom respectively, as defined by Shannon.
An important class of such continuous-time stationary processes is the bandlimited stationary ergodic process with the sample space being a subset of the continuous $\mathcal{L}_2$ functions. The asymptotic equipartition property holds if the process is white, in which case the time samples are i.i.d., or if there exists $T > 1/(2W)$, where $W$ is the nominal bandwidth, such that the $T$-spaced time samples take values in a finite set, in which case we have the discrete-time finite-valued stationary ergodic process.
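As a worked illustration of the two entropy rates, suppose (as an assumption not stated in the text) that a process of bandwidth $W$ observed for time $\tau$ is represented by samples taken at the Nyquist rate, so that it has $n = 2W\tau$ degrees of freedom. The relation between entropy per degree of freedom and entropy per unit time then reads:

```latex
% Assumption: Nyquist-rate sampling, n = 2 W \tau degrees of freedom
\begin{align*}
  n &= 2W\tau, \\
  \frac{n\,H(X)}{\tau} &= 2W\,H(X), \\
  -\frac{1}{\tau}\log p(\tilde{X}_0^{\tau})
    &= \frac{n}{\tau}\left(-\frac{1}{n}\log p(\tilde{X}_0^{\tau})\right)
    \to 2W\,H(X).
\end{align*}
```

Under this assumption the asymptotic equipartition property can be restated per unit time simply by rescaling with the constant factor $n/\tau = 2W$.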
Any time-invariant operation also preserves the asymptotic equipartition property, stationarity and ergodicity, and we may easily turn a stationary process into a non-stationary one without losing the asymptotic equipartition property by nulling out a finite number of time samples in the process.

References

Journal articles

Claude E. Shannon. "A Mathematical Theory of Communication". Bell System Technical Journal, July/October 1948.
Sergio Verdú and Te Sun Han. "The Role of the Asymptotic Equipartition Property in Noiseless Source Coding". IEEE Transactions on Information Theory, 43(3): 847–857, 1997.

Textbooks

Cover, Thomas M.; Thomas, Joy A. (1991). Elements of Information Theory (first ed.). Hoboken, New Jersey: Wiley. ISBN 978-0-471-24195-9.
MacKay, David J.C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press. ISBN 0-521-64298-1.
