- Source: Proximal operator
In mathematical optimization, the proximal operator is an operator associated with a proper, lower semi-continuous convex function
f
{\displaystyle f}
from a Hilbert space
X
{\displaystyle {\mathcal {X}}}
to
[
−
∞
,
+
∞
]
{\displaystyle [-\infty ,+\infty ]}
, and is defined by:
prox
f
(
v
)
=
arg
min
x
∈
X
(
f
(
x
)
+
1
2
‖
x
−
v
‖
X
2
)
.
{\displaystyle \operatorname {prox} _{f}(v)=\arg \min _{x\in {\mathcal {X}}}\left(f(x)+{\frac {1}{2}}\|x-v\|_{\mathcal {X}}^{2}\right).}
For any function in this class, the minimizer of the right-hand side above is unique, hence making the proximal operator well-defined. The proximal operator is used in proximal gradient methods, which is frequently used in optimization algorithms associated with non-differentiable optimization problems such as total variation denoising.
Properties
The
prox
{\displaystyle {\text{prox}}}
of a proper, lower semi-continuous convex function
f
{\displaystyle f}
enjoys several useful properties for optimization.
Fixed points of
prox
f
{\displaystyle {\text{prox}}_{f}}
are minimizers of
f
{\displaystyle f}
:
{
x
∈
X
|
prox
f
x
=
x
}
=
arg
min
f
{\displaystyle \{x\in {\mathcal {X}}\ |\ {\text{prox}}_{f}x=x\}=\arg \min f}
.
Global convergence to a minimizer is defined as follows: If
arg
min
f
≠
∅
{\displaystyle \arg \min f\neq \varnothing }
, then for any initial point
x
0
∈
X
{\displaystyle x_{0}\in {\mathcal {X}}}
, the recursion
(
∀
n
∈
N
)
x
n
+
1
=
prox
f
x
n
{\displaystyle (\forall n\in \mathbb {N} )\quad x_{n+1}={\text{prox}}_{f}x_{n}}
yields convergence
x
n
→
x
∈
arg
min
f
{\displaystyle x_{n}\to x\in \arg \min f}
as
n
→
+
∞
{\displaystyle n\to +\infty }
. This convergence may be weak if
X
{\displaystyle {\mathcal {X}}}
is infinite dimensional.
The proximal operator can be seen as a generalization of the projection operator. Indeed, in the specific case where
f
{\displaystyle f}
is the 0-
∞
{\displaystyle \infty }
characteristic function
ι
C
{\displaystyle \iota _{C}}
of a nonempty, closed, convex set
C
{\displaystyle C}
we have that
prox
ι
C
(
x
)
=
argmin
y
{
1
2
‖
x
−
y
‖
2
2
if
y
∈
C
+
∞
if
y
∉
C
=
argmin
y
∈
C
1
2
‖
x
−
y
‖
2
2
{\displaystyle {\begin{aligned}\operatorname {prox} _{\iota _{C}}(x)&=\operatorname {argmin} \limits _{y}{\begin{cases}{\frac {1}{2}}\left\|x-y\right\|_{2}^{2}&{\text{if }}y\in C\\+\infty &{\text{if }}y\notin C\end{cases}}\\&=\operatorname {argmin} \limits _{y\in C}{\frac {1}{2}}\left\|x-y\right\|_{2}^{2}\end{aligned}}}
showing that the proximity operator is indeed a generalisation of the projection operator.
A function is firmly non-expansive if
(
∀
(
x
,
y
)
∈
X
2
)
‖
prox
f
x
−
prox
f
y
‖
2
≤
⟨
x
−
y
,
prox
f
x
−
prox
f
y
⟩
{\displaystyle (\forall (x,y)\in {\mathcal {X}}^{2})\quad \|{\text{prox}}_{f}x-{\text{prox}}_{f}y\|^{2}\leq \langle x-y\ ,{\text{prox}}_{f}x-{\text{prox}}_{f}y\rangle }
.
The proximal operator of a function is related to the gradient of the Moreau envelope
M
λ
f
{\displaystyle M_{\lambda f}}
of a function
λ
f
{\displaystyle \lambda f}
by the following identity:
∇
M
λ
f
(
x
)
=
1
λ
(
x
−
p
r
o
x
λ
f
(
x
)
)
{\displaystyle \nabla M_{\lambda f}(x)={\frac {1}{\lambda }}(x-\mathrm {prox} _{\lambda f}(x))}
.
The proximity operator of
f
{\displaystyle f}
is characterized by inclusion
p
=
prox
f
(
x
)
⇔
x
−
p
∈
∂
f
(
p
)
{\displaystyle p=\operatorname {prox} _{f}(x)\Leftrightarrow x-p\in \partial f(p)}
, where
∂
f
{\displaystyle \partial f}
is the subdifferential of
f
{\displaystyle f}
, given by
∂
f
(
x
)
=
{
u
∈
R
N
∣
∀
y
∈
R
N
,
(
y
−
x
)
T
u
+
f
(
x
)
≤
f
(
y
)
}
{\displaystyle \partial f(x)=\{u\in \mathbb {R} ^{N}\mid \forall y\in \mathbb {R} ^{N},(y-x)^{\mathrm {T} }u+f(x)\leq f(y)\}}
In particular, If
f
{\displaystyle f}
is differentiable then the above equation reduces to
p
=
prox
f
(
x
)
⇔
x
−
p
=
∇
f
(
p
)
{\displaystyle p=\operatorname {prox} _{f}(x)\Leftrightarrow x-p=\nabla f(p)}
.
Notes
References
See also
Proximal gradient method
External links
The Proximity Operator repository: a collection of proximity operators implemented in Matlab and Python.
ProximalOperators.jl: a Julia package implementing proximal operators.
ODL: a Python library for inverse problems that utilizes proximal operators.