- Source: Giry monad
In mathematics, the Giry monad is a construction that assigns to a measurable space a space of probability measures over it, equipped with a canonical sigma-algebra. It is one of the main examples of a probability monad.
It is implicitly used in probability theory whenever one considers probability measures which depend measurably on a parameter (giving rise to Markov kernels), or when one has probability measures over probability measures (such as in de Finetti's theorem).
Like many iterable constructions, it has the category-theoretic structure of a monad, on the category of measurable spaces.
Construction
The Giry monad, like every monad, consists of three structures:
- A functorial assignment, which in this case assigns to a measurable space $X$ a space of probability measures $PX$ over it;
- A natural map $\delta : X \to PX$ called the unit, which in this case assigns to each element of a space the Dirac measure over it;
- A natural map $\mathcal{E} : PPX \to PX$ called the multiplication, which in this case assigns to each probability measure over probability measures its expected value.
The space of probability measures
Let $(X,\mathcal{F})$ be a measurable space.
Denote by $PX$ the set of probability measures over $(X,\mathcal{F})$.
We equip the set $PX$ with a sigma-algebra as follows. First of all, for every measurable set $A\in\mathcal{F}$, define the map $\varepsilon_A : PX \to \mathbb{R}$ by $p \longmapsto p(A)$.
We then define the sigma-algebra $\mathcal{PF}$ on $PX$ to be the smallest sigma-algebra which makes the maps $\varepsilon_A$ measurable, for all $A\in\mathcal{F}$ (where $\mathbb{R}$ is assumed equipped with the Borel sigma-algebra).
Equivalently, $\mathcal{PF}$ can be defined as the smallest sigma-algebra on $PX$ which makes the maps
$$p \longmapsto \int_X f\,dp$$
measurable for all bounded measurable $f : X \to \mathbb{R}$.
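For a finitely supported probability measure, both families of maps above reduce to finite sums, which makes them easy to experiment with. The following is a minimal sketch under that assumption. The dictionary representation and the names `evaluate` and `integrate` are illustrative choices, not part of the construction itself.

```python
from fractions import Fraction

# A finitely supported probability measure on X, stored as {point: mass}.
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def evaluate(A, p):
    """epsilon_A(p) = p(A): the total mass that p assigns to the set A."""
    return sum(mass for x, mass in p.items() if x in A)

def integrate(f, p):
    """Integral of f against p, which is a finite sum in this setting."""
    return sum(f(x) * mass for x, mass in p.items())

A = {"a", "c"}
indicator_A = lambda x: 1 if x in A else 0

print(evaluate(A, p))             # 2/3
print(integrate(indicator_A, p))  # 2/3
```

The evaluation map at $A$ is the integration map for the indicator function of $A$, which is one direction of the equivalence between the two descriptions of the sigma-algebra.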
The assignment $(X,\mathcal{F}) \mapsto (PX,\mathcal{PF})$ is part of an endofunctor on the category of measurable spaces, usually denoted again by $P$. Its action on morphisms, i.e. on measurable maps, is via the pushforward of measures.
Namely, given a measurable map $f : (X,\mathcal{F}) \to (Y,\mathcal{G})$, one assigns to $f$ the map $f_* : (PX,\mathcal{PF}) \to (PY,\mathcal{PG})$ defined by
$$f_*p\,(B) = p(f^{-1}(B))$$
for all $p\in PX$ and all measurable sets $B\in\mathcal{G}$.
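The pushforward formula can be checked directly on a finitely supported measure. This is a sketch under that assumption; `pushforward`, `measure_of` and the example map `parity` are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(f, p):
    """Return f_* p for a finitely supported p given as {point: mass}."""
    q = defaultdict(Fraction)
    for x, mass in p.items():
        q[f(x)] += mass  # the mass sitting at x is moved to f(x)
    return dict(q)

def measure_of(B, p):
    """p(B) for a finitely supported p."""
    return sum(mass for x, mass in p.items() if x in B)

p = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 2)}
parity = lambda n: n % 2  # a map from X = {1, 2, 3} to Y = {0, 1}

q = pushforward(parity, p)
B = {1}
preimage_B = {x for x in p if parity(x) in B}  # f^{-1}(B) = {1, 3}

print(measure_of(B, q))           # 3/4
print(measure_of(preimage_B, p))  # 3/4
```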
The Dirac delta map
Given a measurable space $(X,\mathcal{F})$, the map $\delta : (X,\mathcal{F}) \to (PX,\mathcal{PF})$ maps an element $x\in X$ to the Dirac measure $\delta_x\in PX$, defined on measurable subsets $A\in\mathcal{F}$ by
$$\delta_x(A) = 1_A(x) = \begin{cases}1 & \text{if } x\in A,\\ 0 & \text{if } x\notin A.\end{cases}$$
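As an illustration, the Dirac measure and its naturality with respect to pushforwards can be sketched for finitely supported measures. The representation as `{point: mass}` dictionaries and the helper names are assumptions made for this example only.

```python
from fractions import Fraction

def dirac(x):
    """The Dirac measure at x, stored as {point: mass}."""
    return {x: Fraction(1)}

def measure_of(A, p):
    """p(A) for a finitely supported p."""
    return sum(mass for x, mass in p.items() if x in A)

def pushforward(f, p):
    """f_* p for a finitely supported p."""
    q = {}
    for x, mass in p.items():
        q[f(x)] = q.get(f(x), Fraction(0)) + mass
    return q

A = {"a", "b"}
print(measure_of(A, dirac("a")))  # 1
print(measure_of(A, dirac("z")))  # 0

# Naturality of delta: pushing delta_x forward along f gives delta_{f(x)}.
f = len
print(pushforward(f, dirac("abc")) == dirac(3))  # True
```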
The expectation map
Let $\mu\in PPX$, i.e. a probability measure over the probability measures over $(X,\mathcal{F})$. We define the probability measure $\mathcal{E}\mu\in PX$ by
$$\mathcal{E}\mu(A) = \int_{PX} p(A)\,\mu(dp)$$
for all measurable $A\in\mathcal{F}$.
This gives a measurable, natural map $\mathcal{E} : (PPX,\mathcal{PPF}) \to (PX,\mathcal{PF})$.
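When $\mu$ is supported on finitely many measures, each of which is itself finitely supported, the integral defining $\mathcal{E}\mu$ becomes a weighted sum. The sketch below assumes that setting and represents $\mu$ as a list of `(weight, measure)` pairs. That encoding, and the two example coins, are illustrative choices.

```python
from collections import defaultdict
from fractions import Fraction

def expectation(mu):
    """E(mu) for a finitely supported mu, given as (weight, measure) pairs."""
    out = defaultdict(Fraction)
    for weight, p in mu:
        for x, mass in p.items():
            out[x] += weight * mass
    return dict(out)

def measure_of(A, p):
    """p(A) for a finitely supported p."""
    return sum(mass for x, mass in p.items() if x in A)

fair = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
biased = {"H": Fraction(9, 10), "T": Fraction(1, 10)}
mu = [(Fraction(1, 2), fair), (Fraction(1, 2), biased)]

q = expectation(mu)
A = {"H"}
lhs = measure_of(A, q)                          # (E mu)(A)
rhs = sum(w * measure_of(A, p) for w, p in mu)  # integral of p(A) against mu
print(lhs, rhs)  # 7/10 7/10
```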
Example: mixture distributions
A mixture distribution, or more generally a compound distribution, can be seen as an application of the map $\mathcal{E}$.
Let's see this for the case of a finite mixture. Let $p_1,\dots,p_n$ be probability measures on $(X,\mathcal{F})$, and consider the probability measure $q$ given by the mixture
$$q(A) = \sum_{i=1}^{n} w_i\,p_i(A)$$
for all measurable $A\in\mathcal{F}$, for some weights $w_i\geq 0$ satisfying $w_1+\dots+w_n=1$.
We can view the mixture $q$ as the average $q=\mathcal{E}\mu$, where the measure on measures $\mu\in PPX$, which in this case is discrete, is given by
$$\mu = \sum_{i=1}^{n} w_i\,\delta_{p_i}.$$
More generally, the map $\mathcal{E} : PPX \to PX$ can be seen as the most general, non-parametric way to form arbitrary mixture or compound distributions.
The triple $(P,\delta,\mathcal{E})$ is called the Giry monad.
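Being a monad means in particular that the unit laws hold: taking the expectation of a point mass at $p$, or of the pushforward of $p$ along $\delta$, returns $p$. The sketch below checks both laws for one finitely supported measure. The list-of-pairs encoding of elements of $PPX$ and the function names are assumptions for illustration, and the associativity law is not checked here.

```python
from collections import defaultdict
from fractions import Fraction

def expectation(mu):
    """E(mu) for a finitely supported mu, given as (weight, measure) pairs."""
    out = defaultdict(Fraction)
    for weight, p in mu:
        for x, mass in p.items():
            out[x] += weight * mass
    return dict(out)

def dirac_on_measures(p):
    """The point mass at p, an element of PPX."""
    return [(Fraction(1), p)]

def map_dirac(p):
    """The pushforward of p along delta, an element of PPX."""
    return [(mass, {x: Fraction(1)}) for x, mass in p.items()]

p = {"a": Fraction(1, 3), "b": Fraction(2, 3)}
print(expectation(dirac_on_measures(p)) == p)  # True
print(expectation(map_dirac(p)) == p)          # True
```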
Relationship with Markov kernels
One of the properties of the sigma-algebra $\mathcal{PF}$ is that given measurable spaces $(X,\mathcal{F})$ and $(Y,\mathcal{G})$, we have a bijective correspondence between measurable functions $(X,\mathcal{F}) \to (PY,\mathcal{PG})$ and Markov kernels $(X,\mathcal{F}) \to (Y,\mathcal{G})$. This allows one to view a Markov kernel, equivalently, as a measurably parametrized probability measure.
In more detail, given a measurable function $f : (X,\mathcal{F}) \to (PY,\mathcal{PG})$, one can obtain the Markov kernel $f^\flat : (X,\mathcal{F}) \to (Y,\mathcal{G})$ as follows,
$$f^\flat(B|x) = f(x)(B)$$
for every $x\in X$ and every measurable $B\in\mathcal{G}$ (note that $f(x)\in PY$ is a probability measure).
Conversely, given a Markov kernel $k : (X,\mathcal{F}) \to (Y,\mathcal{G})$, one can form the measurable function $k^\sharp : (X,\mathcal{F}) \to (PY,\mathcal{PG})$ mapping $x\in X$ to the probability measure $k^\sharp(x)\in PY$ defined by
$$k^\sharp(x)(B) = k(B|x)$$
for every measurable $B\in\mathcal{G}$.
The two assignments are mutually inverse.
From the point of view of category theory, we can interpret this correspondence as an adjunction
$$\mathrm{Hom}_{\mathrm{Meas}}(X,PY) \cong \mathrm{Hom}_{\mathrm{Stoch}}(X,Y)$$
between the category of measurable spaces and the category of Markov kernels. In particular, the category of Markov kernels can be seen as the Kleisli category of the Giry monad.
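On a finite state space, a map into the space of probability measures is a table of rows, and Kleisli composition amounts to multiplying transition matrices. The following sketch assumes finite supports throughout. The two-state `kernel` and the names `flat` and `kleisli_compose` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from fractions import Fraction

def kernel(x):
    """A map X -> PX on X = {"sunny", "rainy"}: one distribution per state."""
    rows = {
        "sunny": {"sunny": Fraction(3, 4), "rainy": Fraction(1, 4)},
        "rainy": {"sunny": Fraction(1, 2), "rainy": Fraction(1, 2)},
    }
    return rows[x]

def flat(f):
    """Read a map f into PY as a Markov kernel: (B, x) -> f(x)(B)."""
    return lambda B, x: sum(m for y, m in f(x).items() if y in B)

def kleisli_compose(g, f):
    """Kleisli composite of f then g: x -> sum over y of f(x)({y}) * g(y)."""
    def composite(x):
        out = defaultdict(Fraction)
        for y, mass_y in f(x).items():
            for z, mass_z in g(y).items():
                out[z] += mass_y * mass_z
        return dict(out)
    return composite

two_step = kleisli_compose(kernel, kernel)
print(flat(kernel)({"rainy"}, "sunny"))  # 1/4
print(two_step("sunny")["sunny"])        # 11/16
```

Composing the kernel with itself gives the two-step transition probabilities, which is the finite case of the Chapman-Kolmogorov equation.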
Product distributions
Given measurable spaces $(X,\mathcal{F})$ and $(Y,\mathcal{G})$, one can form the measurable space $(X,\mathcal{F})\times(Y,\mathcal{G}) = (X\times Y,\mathcal{F}\times\mathcal{G})$ with the product sigma-algebra, which is the product in the category of measurable spaces.
Given probability measures $p\in PX$ and $q\in PY$, one can form the product measure $p\otimes q$ on $(X\times Y,\mathcal{F}\times\mathcal{G})$. This gives a natural, measurable map
$$(PX,\mathcal{PF})\times(PY,\mathcal{PG}) \to \big(P(X\times Y),\mathcal{P(F\times G)}\big)$$
usually denoted by $\nabla$ or by $\otimes$.
The map $\nabla : PX\times PY \to P(X\times Y)$ is in general not an isomorphism, since there are probability measures on $X\times Y$ which are not product distributions, for example in case of correlation.
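A small finite example makes the failure of surjectivity concrete. If a joint measure were of the form $p\otimes q$, then $p$ and $q$ would have to be its marginals, so it suffices to compare a joint measure with the product of its own marginals. The sketch below assumes finitely supported measures, and the helper names are illustrative.

```python
from fractions import Fraction

def product(p, q):
    """The product measure of p and q on pairs (x, y)."""
    return {(x, y): px * qy for x, px in p.items() for y, qy in q.items()}

def marginals(r):
    """The two marginals of a joint measure r on pairs (x, y)."""
    pX, pY = {}, {}
    for (x, y), mass in r.items():
        pX[x] = pX.get(x, Fraction(0)) + mass
        pY[y] = pY.get(y, Fraction(0)) + mass
    return pX, pY

half = Fraction(1, 2)
coin = {0: half, 1: half}

independent = product(coin, coin)          # two independent fair coins
correlated = {(0, 0): half, (1, 1): half}  # two perfectly correlated coins

pX, pY = marginals(correlated)
print(pX == coin and pY == coin)                        # True
print(product(pX, pY) == correlated)                    # False
print(product(*marginals(independent)) == independent)  # True
```

Both joint measures have the same fair-coin marginals, yet only the independent one is recovered from them.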
However, the maps $\nabla : PX\times PY \to P(X\times Y)$ and the isomorphism $1\cong P1$ make the Giry monad a monoidal monad, and so in particular a commutative strong monad.
Further properties
If a measurable space $(X,\mathcal{F})$ is standard Borel, so is $(PX,\mathcal{PF})$. Therefore the Giry monad restricts to the full subcategory of standard Borel spaces.
The algebras for the Giry monad include compact convex subsets of Euclidean spaces, as well as the extended positive real line $[0,\infty]$, with the algebra structure map given by taking expected values. For example, for $[0,\infty]$, the structure map $e : P[0,\infty] \to [0,\infty]$ is given by
$$p \longmapsto \int_{[0,\infty)} x\,p(dx)$$
whenever $p$ is supported on $[0,\infty)$ and has finite expected value, and $e(p)=\infty$ otherwise.
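For finitely supported measures on $[0,\infty]$, the structure map and the two algebra laws can be checked by hand. The sketch below assumes that setting: `e` is a finite-support stand-in for the structure map, `expectation` plays the role of $\mathcal{E}$, and an element of $P P[0,\infty]$ is encoded as a list of `(weight, measure)` pairs, all of which are illustrative choices.

```python
import math
from fractions import Fraction

def e(p):
    """Structure map for a finitely supported measure on [0, inf]."""
    if any(x == math.inf and mass > 0 for x, mass in p.items()):
        return math.inf  # p is not supported on [0, inf)
    return sum(x * mass for x, mass in p.items())

def expectation(mu):
    """E(mu) for a finitely supported mu, given as (weight, measure) pairs."""
    out = {}
    for weight, p in mu:
        for x, mass in p.items():
            out[x] = out.get(x, Fraction(0)) + weight * mass
    return out

p1 = {0: Fraction(1, 2), 2: Fraction(1, 2)}
p2 = {3: Fraction(1)}
mu = [(Fraction(1, 3), p1), (Fraction(2, 3), p2)]

print(e({5: Fraction(1)}))           # 5    (unit law: e of a Dirac measure)
print(e(expectation(mu)))            # 7/3  (average first, then take e)
print(sum(w * e(p) for w, p in mu))  # 7/3  (take e of each p, then average)
print(e({1: Fraction(1, 2), math.inf: Fraction(1, 2)}))  # inf
```

The agreement of the second and third values is an instance of the law of total expectation.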
See also
Mixture distribution
Compound distribution
de Finetti theorem
Measurable space
Markov kernel
Monad (category theory)
Monad (functional programming)
Category of measurable spaces
Category of Markov kernels
Categorical probability
Further reading
Monads of probability, measures, and valuations, in nLab.
https://ncatlab.org/nlab/show/Giry+monad
External links
What is a probability monad?, video tutorial.