Log sum inequality
The log sum inequality is used for proving theorems in information theory.
Statement
Let $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ be nonnegative numbers. Denote the sum of all $a_i$ by $a$ and the sum of all $b_i$ by $b$. The log sum inequality states that
$$\sum_{i=1}^{n} a_i \log \frac{a_i}{b_i} \geq a \log \frac{a}{b},$$
with equality if and only if the ratios $\frac{a_i}{b_i}$ are equal for all $i$, in other words $a_i = c\,b_i$ for some constant $c$ and all $i$.
(Take $a_i \log \frac{a_i}{b_i}$ to be $0$ if $a_i = 0$, and $\infty$ if $a_i > 0$ and $b_i = 0$. These are the limiting values obtained as the relevant number tends to $0$.)
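As a quick numerical sanity check of the statement, the following Python sketch evaluates both sides of the inequality under the conventions above (the helper names are illustrative, not from any library):

```python
import math

def log_sum_lhs(a, b):
    """Left-hand side: sum of a_i * log(a_i / b_i), using the conventions above."""
    total = 0.0
    for ai, bi in zip(a, b):
        if ai == 0:
            continue          # a_i log(a_i / b_i) is taken to be 0 when a_i = 0
        if bi == 0:
            return math.inf   # a_i > 0 and b_i = 0 gives +infinity
        total += ai * math.log(ai / bi)
    return total

def log_sum_rhs(a, b):
    """Right-hand side: a * log(a / b) with a = sum(a_i) and b = sum(b_i)."""
    return sum(a) * math.log(sum(a) / sum(b))

a = [0.5, 1.0, 2.5]
b = [1.0, 1.0, 1.0]
assert log_sum_lhs(a, b) >= log_sum_rhs(a, b)   # strict here, since the ratios differ
```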
Proof
Notice that after setting $f(x) = x \log x$ we have
$$\begin{aligned}
\sum_{i=1}^{n} a_i \log \frac{a_i}{b_i} &= \sum_{i=1}^{n} b_i f\left(\frac{a_i}{b_i}\right) = b \sum_{i=1}^{n} \frac{b_i}{b} f\left(\frac{a_i}{b_i}\right) \\
&\geq b f\left(\sum_{i=1}^{n} \frac{b_i}{b} \cdot \frac{a_i}{b_i}\right) = b f\left(\frac{1}{b} \sum_{i=1}^{n} a_i\right) = b f\left(\frac{a}{b}\right) \\
&= a \log \frac{a}{b},
\end{aligned}$$
where the inequality follows from Jensen's inequality since $\frac{b_i}{b} \geq 0$, $\sum_{i=1}^{n} \frac{b_i}{b} = 1$, and $f$ is convex.
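To illustrate the equality case numerically, here is a small Python check (a sketch, with arbitrary example values) in which each $a_i$ is the same fixed multiple of $b_i$:

```python
import math

# Equality case of the log sum inequality: a_i = c * b_i for every i.
c = 0.7
b = [0.2, 1.3, 3.0]
a = [c * bi for bi in b]

lhs = sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))
rhs = sum(a) * math.log(sum(a) / sum(b))
assert abs(lhs - rhs) < 1e-12   # both sides reduce to a * log(c)
```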
Generalizations
The inequality remains valid for $n = \infty$ provided that $a < \infty$ and $b < \infty$.
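A minimal illustration of the infinite case uses two convergent geometric series; the sketch below truncates at an arbitrary depth $N$, where the discarded tails are negligible:

```python
import math

# a_i = 2**-i and b_i = 3**-i for i = 1, 2, ..., so a = 1 and b = 1/2 (both finite).
N = 60   # arbitrary truncation depth for this illustration
a = [2.0 ** -i for i in range(1, N + 1)]
b = [3.0 ** -i for i in range(1, N + 1)]

lhs = sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))
rhs = sum(a) * math.log(sum(a) / sum(b))
assert lhs >= rhs   # lhs ~ 2*log(3/2) ~ 0.811, rhs ~ log(2) ~ 0.693
```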
The proof above holds for any function $g$ such that $f(x) = x g(x)$ is convex, which includes all continuous non-decreasing functions $g$. Generalizations to non-decreasing functions other than the logarithm are given in Csiszár & Shields (2004).
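For instance, taking $g(x) = x$ makes $f(x) = x g(x) = x^2$, which is convex, and the generalized inequality $\sum_{i=1}^{n} a_i \, g(a_i/b_i) \geq a \, g(a/b)$ becomes $\sum_{i=1}^{n} a_i^2 / b_i \geq a^2 / b$, a Cauchy-Schwarz-type bound. A short Python check with arbitrary example values:

```python
# Generalized inequality with g(x) = x, so f(x) = x**2 is convex.
a = [0.5, 1.0, 2.5]
b = [1.5, 0.5, 1.0]

g = lambda x: x
lhs = sum(ai * g(ai / bi) for ai, bi in zip(a, b))   # sum of a_i**2 / b_i
rhs = sum(a) * g(sum(a) / sum(b))                    # a**2 / b
assert lhs >= rhs
```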
Another generalization is due to Dannan, Neff and Thiel, who showed that if $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$ are positive real numbers with $a_1 + a_2 + \cdots + a_n = a$ and $b_1 + b_2 + \cdots + b_n = b$, and $k \geq 0$, then

$$\sum_{i=1}^{n} a_i \log\left(\frac{a_i}{b_i} + k\right) \geq a \log\left(\frac{a}{b} + k\right).$$
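A numerical spot check of this generalization for one choice of $k$ (all values arbitrary):

```python
import math

# Dannan-Neff-Thiel generalization; k = 0 recovers the ordinary log sum inequality.
a = [0.5, 1.0, 2.5]
b = [1.5, 0.5, 1.0]
k = 2.0

lhs = sum(ai * math.log(ai / bi + k) for ai, bi in zip(a, b))
rhs = sum(a) * math.log(sum(a) / sum(b) + k)
assert lhs >= rhs
```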
Applications
The log sum inequality can be used to prove inequalities in information theory. Gibbs' inequality states that the Kullback-Leibler divergence is non-negative, and equal to zero precisely if its arguments are equal. One proof uses the log sum inequality.
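A minimal Python sketch of this application (the helper name `kl_divergence` is illustrative): the non-negativity it checks is exactly Gibbs' inequality, and it follows from the log sum inequality with $a = b = 1$:

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum of p_i * log(p_i / q_i), nonnegative by the log sum inequality."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]
assert kl_divergence(p, q) >= 0   # Gibbs' inequality
assert kl_divergence(p, p) == 0   # zero exactly when the arguments are equal
```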
The inequality can also be used to prove the convexity of the Kullback-Leibler divergence in its pair of arguments.
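Joint convexity says that the divergence of a mixture of argument pairs is at most the corresponding mixture of divergences; the sketch below (helper names illustrative, values arbitrary) checks one instance:

```python
import math

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mix(x, y, lam):
    return [lam * xi + (1 - lam) * yi for xi, yi in zip(x, y)]

p1, q1 = [0.2, 0.8], [0.6, 0.4]
p2, q2 = [0.7, 0.3], [0.3, 0.7]
lam = 0.4

# D(lam*p1 + (1-lam)*p2 || lam*q1 + (1-lam)*q2) <= lam*D(p1||q1) + (1-lam)*D(p2||q2)
assert kl(mix(p1, p2, lam), mix(q1, q2, lam)) <= lam * kl(p1, q1) + (1 - lam) * kl(p2, q2)
```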
References
Cover, Thomas M.; Thomas, Joy A. (1991). Elements of Information Theory. Hoboken, New Jersey: Wiley. ISBN 978-0-471-24195-9.
Csiszár, I.; Shields, P. (2004). "Information Theory and Statistics: A Tutorial" (PDF). Foundations and Trends in Communications and Information Theory. 1 (4): 417–528. doi:10.1561/0100000004. Retrieved 2009-06-14.
Han, T. S.; Kobayashi, K. (2001). Mathematics of Information and Coding. American Mathematical Society. ISBN 0-8218-0534-7.
Information Theory course materials, Utah State University. Retrieved 2009-06-14.
MacKay, David J.C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press. ISBN 0-521-64298-1.