• Source: Proximal operator
  • In mathematical optimization, the proximal operator is an operator associated with a proper, lower semi-continuous convex function



    f


    {\displaystyle f}

    from a Hilbert space





    X




    {\displaystyle {\mathcal {X}}}


    to



    [


    ,
    +

    ]


    {\displaystyle [-\infty ,+\infty ]}

    , and is defined by:





    prox

    f



    (
    v
    )
    =
    arg


    min

    x



    X





    (

    f
    (
    x
    )
    +


    1
    2



    x

    v




    X



    2



    )

    .


    {\displaystyle \operatorname {prox} _{f}(v)=\arg \min _{x\in {\mathcal {X}}}\left(f(x)+{\frac {1}{2}}\|x-v\|_{\mathcal {X}}^{2}\right).}


    For any function in this class, the minimizer of the right-hand side above is unique, hence making the proximal operator well-defined. The proximal operator is used in proximal gradient methods, which is frequently used in optimization algorithms associated with non-differentiable optimization problems such as total variation denoising.


    Properties


    The




    prox



    {\displaystyle {\text{prox}}}

    of a proper, lower semi-continuous convex function



    f


    {\displaystyle f}

    enjoys several useful properties for optimization.

    Fixed points of





    prox


    f




    {\displaystyle {\text{prox}}_{f}}

    are minimizers of



    f


    {\displaystyle f}

    :



    {
    x



    X




    |




    prox


    f


    x
    =
    x
    }
    =
    arg

    min
    f


    {\displaystyle \{x\in {\mathcal {X}}\ |\ {\text{prox}}_{f}x=x\}=\arg \min f}

    .
    Global convergence to a minimizer is defined as follows: If



    arg

    min
    f




    {\displaystyle \arg \min f\neq \varnothing }

    , then for any initial point




    x

    0





    X




    {\displaystyle x_{0}\in {\mathcal {X}}}

    , the recursion



    (

    n


    N

    )


    x

    n
    +
    1


    =


    prox


    f



    x

    n




    {\displaystyle (\forall n\in \mathbb {N} )\quad x_{n+1}={\text{prox}}_{f}x_{n}}

    yields convergence




    x

    n



    x

    arg

    min
    f


    {\displaystyle x_{n}\to x\in \arg \min f}

    as



    n

    +



    {\displaystyle n\to +\infty }

    . This convergence may be weak if





    X




    {\displaystyle {\mathcal {X}}}

    is infinite dimensional.
    The proximal operator can be seen as a generalization of the projection operator. Indeed, in the specific case where



    f


    {\displaystyle f}

    is the 0-






    {\displaystyle \infty }

    characteristic function




    ι

    C




    {\displaystyle \iota _{C}}

    of a nonempty, closed, convex set



    C


    {\displaystyle C}

    we have that









    prox


    ι

    C





    (
    x
    )



    =

    argmin

    y





    {





    1
    2






    x

    y




    2


    2





    if

    y

    C




    +




    if

    y

    C












    =

    argmin

    y

    C





    1
    2






    x

    y




    2


    2








    {\displaystyle {\begin{aligned}\operatorname {prox} _{\iota _{C}}(x)&=\operatorname {argmin} \limits _{y}{\begin{cases}{\frac {1}{2}}\left\|x-y\right\|_{2}^{2}&{\text{if }}y\in C\\+\infty &{\text{if }}y\notin C\end{cases}}\\&=\operatorname {argmin} \limits _{y\in C}{\frac {1}{2}}\left\|x-y\right\|_{2}^{2}\end{aligned}}}


    showing that the proximity operator is indeed a generalisation of the projection operator.
    A function is firmly non-expansive if



    (

    (
    x
    ,
    y
    )




    X



    2


    )




    prox


    f


    x



    prox


    f


    y



    2




    x

    y

    ,


    prox


    f


    x



    prox


    f


    y



    {\displaystyle (\forall (x,y)\in {\mathcal {X}}^{2})\quad \|{\text{prox}}_{f}x-{\text{prox}}_{f}y\|^{2}\leq \langle x-y\ ,{\text{prox}}_{f}x-{\text{prox}}_{f}y\rangle }

    .
    The proximal operator of a function is related to the gradient of the Moreau envelope




    M

    λ
    f




    {\displaystyle M_{\lambda f}}

    of a function



    λ
    f


    {\displaystyle \lambda f}

    by the following identity:





    M

    λ
    f


    (
    x
    )
    =


    1
    λ


    (
    x



    p
    r
    o
    x


    λ
    f


    (
    x
    )
    )


    {\displaystyle \nabla M_{\lambda f}(x)={\frac {1}{\lambda }}(x-\mathrm {prox} _{\lambda f}(x))}

    .
    The proximity operator of



    f


    {\displaystyle f}

    is characterized by inclusion



    p
    =

    prox

    f



    (
    x
    )

    x

    p


    f
    (
    p
    )


    {\displaystyle p=\operatorname {prox} _{f}(x)\Leftrightarrow x-p\in \partial f(p)}

    , where




    f


    {\displaystyle \partial f}

    is the subdifferential of



    f


    {\displaystyle f}

    , given by





    f
    (
    x
    )
    =
    {
    u



    R


    N




    y



    R


    N


    ,
    (
    y

    x

    )


    T



    u
    +
    f
    (
    x
    )

    f
    (
    y
    )
    }


    {\displaystyle \partial f(x)=\{u\in \mathbb {R} ^{N}\mid \forall y\in \mathbb {R} ^{N},(y-x)^{\mathrm {T} }u+f(x)\leq f(y)\}}

    In particular, If



    f


    {\displaystyle f}

    is differentiable then the above equation reduces to



    p
    =

    prox

    f



    (
    x
    )

    x

    p
    =

    f
    (
    p
    )


    {\displaystyle p=\operatorname {prox} _{f}(x)\Leftrightarrow x-p=\nabla f(p)}

    .


    Notes




    References




    See also


    Proximal gradient method


    External links


    The Proximity Operator repository: a collection of proximity operators implemented in Matlab and Python.
    ProximalOperators.jl: a Julia package implementing proximal operators.
    ODL: a Python library for inverse problems that utilizes proximal operators.

Kata Kunci Pencarian: