8. Inexact proximal methods

8.1. Accelerated inexact forward backward

PEPit.examples.inexact_proximal_methods.wc_accelerated_inexact_forward_backward(L, zeta, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the composite convex minimization problem,

\[F_\star \triangleq \min_x \left\{F(x) \equiv f(x) + g(x) \right\},\]

where \(f\) is \(L\)-smooth convex, and \(g\) is closed, proper, and convex. We further assume that one can readily evaluate the gradient of \(f\) and that one has access to an inexact version of the proximal operator of \(g\) (whose level of accuracy is controlled by some parameter \(\zeta\in (0,1)\)).

This code computes a worst-case guarantee for an accelerated inexact forward backward (AIFB) method (a.k.a. the inexact accelerated proximal gradient method). That is, it computes the smallest possible \(\tau(n, L, \zeta)\) such that the guarantee

\[F(x_n) - F(x_\star) \leqslant \tau(n, L, \zeta) \|x_0 - x_\star\|^2,\]

is valid, where \(x_n\) is the output of the AIFB, and where \(x_\star\) is a minimizer of \(F\).

In short, for given values of \(n\), \(L\) and \(\zeta\), \(\tau(n, L, \zeta)\) is computed as the worst-case value of \(F(x_n) - F(x_\star)\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: Let \(t\in\{0,1,\ldots,n\}\). The method is presented in, e.g., [1, Algorithm 3.1]. For simplicity, we instantiate [1, Algorithm 3.1] using simple values for its parameters and for the problem setting (in the notation of [1]: \(A_0\triangleq 0\), \(\mu=0\), \(\xi_t \triangleq0\), \(\sigma_t\triangleq 0\), \(\lambda_t \triangleq\gamma\triangleq\tfrac{1}{L}\), \(\zeta_t\triangleq\zeta\), \(\eta \triangleq (1-\zeta^2) \gamma\)), and without backtracking, arriving at:

\begin{eqnarray} A_{t+1} && = A_t + \frac{\eta+\sqrt{\eta^2+4\eta A_t}}{2},\\ y_{t} && = x_t + \frac{A_{t+1}-A_t}{A_{t+1}} (z_t-x_t),\\ (x_{t+1},v_{t+1}) && \approx_{\varepsilon_t} \left(\mathrm{prox}_{\gamma g}\left(y_t-\gamma \nabla f(y_t)\right),\, \mathrm{prox}_{ g^*/\gamma}\left(\frac{y_t-\gamma \nabla f(y_t)}{\gamma}\right)\right),\\ && \text{with } \varepsilon_t = \frac{\zeta^2\gamma^2}{2}\|v_{t+1}+\nabla f(y_t) \|^2,\\ z_{t+1} && = z_t-(A_{t+1}-A_t)\left(v_{t+1}+\nabla f(y_t)\right),\\ \end{eqnarray}

where \(\{\varepsilon_t\}_{t\geqslant 0}\) is some sequence of accuracy parameters (whose values are fixed within the algorithm as it runs), and \(\{A_t\}_{t\geqslant 0}\) is some scalar sequence of parameters for the method (typical of accelerated methods).

The line with “\(\approx_{\varepsilon}\)” can be described as the pair \((x_{t+1},v_{t+1})\) satisfying an accuracy requirement provided by [1, Definition 2.3]. More precisely (but without providing any intuition), it requires the existence of some \(w_{t+1}\) such that \(v_{t+1} \in \partial g(w_{t+1})\) and for which the accuracy requirement

\[\gamma^2 \| x_{t+1} - y_t + \gamma v_{t+1} \|^2 + \gamma \left(g(x_{t+1}) - g(w_{t+1}) - \langle v_{t+1}, x_{t+1} - w_{t+1}\rangle\right) \leqslant \varepsilon_t,\]

is valid.
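The growth of the scalar sequence \(A_t\) is what drives the accelerated rate. As an illustrative sketch (not part of PEPit; the parameter values \(L=1.3\), \(\zeta=0.45\) are borrowed from the example below), the recursion for \(A_t\) can be simulated directly; a standard induction shows that \(A_t \geqslant \eta t^2/4\):

```python
from math import sqrt

def A_sequence(n, L=1.3, zeta=0.45):
    """Simulate the scalar sequence A_t of the method above,
    with gamma = 1/L and eta = (1 - zeta^2) * gamma."""
    eta = (1 - zeta ** 2) / L
    A = [0.0]  # A_0 = 0
    for _ in range(n):
        A.append(A[-1] + (eta + sqrt(eta ** 2 + 4 * eta * A[-1])) / 2)
    return A

A = A_sequence(11)  # A_t grows like eta * t^2 / 4, hence the O(1/n^2) rate
```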

Theoretical guarantee: A theoretical upper bound is obtained in [1, Corollary 3.5]:

\[F(x_n)-F_\star\leqslant \frac{2L \|x_0-x_\star\|^2}{(1-\zeta^2)n^2}.\]
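For the parameter values used in the example below (\(L=1.3\), \(\zeta=0.45\), \(n=11\)), this bound can be evaluated directly and matches the theoretical guarantee the example reports:

```python
# Evaluate the bound 2 L ||x_0 - x_*||^2 / ((1 - zeta^2) n^2) from [1, Corollary 3.5]
# for the parameter values of the example below.
L, zeta, n = 1.3, 0.45, 11
theoretical_tau = 2 * L / ((1 - zeta ** 2) * n ** 2)
print(theoretical_tau)  # ~0.0269437
```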

References: The method and theoretical result can be found in [1, Section 3].

[1] M. Barre, A. Taylor, F. Bach (2021). A note on approximate accelerated forward-backward methods with absolute and relative errors, and possibly strongly convex objectives. arXiv:2106.15536v2.

Parameters:
  • L (float) – smoothness parameter.

  • zeta (float) – relative approximation parameter in (0,1).

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose output at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value.

  • theoretical_tau (float) – theoretical value.

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_inexact_forward_backward(L=1.3, zeta=.45, n=11, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 59x59
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 2 function(s)
                        Function 1 : Adding 156 scalar constraint(s) ...
                        Function 1 : 156 scalar constraint(s) added
                        Function 2 : Adding 506 scalar constraint(s) ...
                        Function 2 : 506 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 1 function(s)
                        Function 1 : Adding 22 scalar constraint(s) ...
                        Function 1 : 22 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.018734101450651804
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 6.169436183689734e-09
                All the primal scalar constraints are verified up to an error of 2.3055138501440475e-08
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 3.9318808809398555e-09
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.038962372016442e-06
(PEPit) Final upper bound (dual): 0.018734107018754872 and lower bound (primal example): 0.018734101450651804
(PEPit) Duality gap: absolute: 5.5681030688981e-09 and relative: 2.9721751446501176e-07
*** Example file: worst-case performance of an inexact accelerated forward backward method ***
        PEPit guarantee:         F(x_n)-F_* <= 0.0187341 ||x_0 - x_*||^2
        Theoretical guarantee:   F(x_n)-F_* <= 0.0269437 ||x_0 - x_*||^2

8.2. Partially inexact Douglas Rachford splitting

PEPit.examples.inexact_proximal_methods.wc_partially_inexact_douglas_rachford_splitting(mu, L, n, gamma, sigma, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the composite strongly convex minimization problem,

\[F_\star \triangleq \min_x \left\{ F(x) \equiv f(x) + g(x) \right\}\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex, and \(g\) is closed, proper, and convex. We denote by \(x_\star = \arg\min_x F(x)\) the minimizer of \(F\). The (exact) proximal operator of \(g\), and an approximate version of the proximal operator of \(f\), are assumed to be available.

This code computes a worst-case guarantee for a partially inexact Douglas-Rachford Splitting (DRS). That is, it computes the smallest possible \(\tau(n,L,\mu,\sigma,\gamma)\) such that the guarantee

\[\|z_{n} - z_\star\|^2 \leqslant \tau(n,L,\mu,\sigma,\gamma) \|z_0 - z_\star\|^2\]

is valid, where \(z_n\) is the output of the DRS (initiated at \(z_0\)), \(z_\star\) is its fixed point, \(\gamma\) is a step-size, and \(\sigma\) is the level of inaccuracy.

Algorithm: The partially inexact Douglas-Rachford splitting under consideration is described by

\begin{eqnarray} x_{t} && \approx_{\sigma} \arg\min_x \left\{ \gamma f(x)+\frac{1}{2} \|x-z_t\|^2 \right\},\\ y_{t} && = \arg\min_y \left\{ \gamma g(y)+\frac{1}{2} \|y-(x_t-\gamma \nabla f(x_t))\|^2 \right\},\\ z_{t+1} && = z_t + y_t - x_t. \end{eqnarray}

More precisely, the notation “\(\approx_{\sigma}\)” corresponds to requiring the existence of some \(e_{t}\) such that

\begin{eqnarray} x_{t} && = z_t - \gamma (\nabla f(x_t) - e_t),\\ y_{t} && = \arg\min_y \left\{ \gamma g(y)+\frac{1}{2} \|y-(x_t-\gamma \nabla f(x_t))\|^2 \right\},\\ && \text{with } \|e_t\|^2 \leqslant \frac{\sigma^2}{\gamma^2}\|y_{t} - z_t + \gamma \nabla f(x_t) \|^2,\\ z_{t+1} && = z_t + y_t - x_t. \end{eqnarray}
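As a quick sanity check (not part of PEPit), the scheme above can be run in the exact case \(\sigma=0\) (so \(e_t=0\) and the \(f\)-step is a true proximal step) on a hypothetical one-dimensional problem with closed-form proxes: \(f(x)=\tfrac{1}{2}(x-1)^2\) and \(g(x)=|x|\), for which the minimizer of \(f+g\) is \(x_\star=0\):

```python
def drs_exact(z0, gamma=1.4, n=200):
    """Partially inexact DRS with sigma = 0 (exact proxes) on
    f(x) = (x - 1)^2 / 2, g(x) = |x|; toy illustration only."""
    z = z0
    for _ in range(n):
        # x_t = prox_{gamma f}(z_t)  (exact, since e_t = 0 when sigma = 0)
        x = (z + gamma) / (1 + gamma)
        grad_f = x - 1.0
        # y_t = prox_{gamma g}(x_t - gamma * grad f(x_t)): soft-thresholding
        u = x - gamma * grad_f
        y = max(abs(u) - gamma, 0.0) * (1.0 if u >= 0 else -1.0)
        z = z + y - x
    return z, x, y

z, x, y = drs_exact(3.0)  # x and y both converge to x_* = 0
```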

Theoretical guarantee: The following tight theoretical bound is due to [2, Theorem 5.1]:

\[\|z_{n} - z_\star\|^2 \leqslant \max\left(\frac{1 - \sigma + \gamma \mu \sigma}{1 - \sigma + \gamma \mu}, \frac{\sigma + (1 - \sigma) \gamma L}{1 + (1 - \sigma) \gamma L}\right)^{2n} \|z_0 - z_\star\|^2.\]
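For the parameter values of the example below (\(\mu=0.1\), \(L=5\), \(\gamma=1.4\), \(\sigma=0.2\), \(n=5\)), the contraction factor and the resulting bound can be evaluated directly:

```python
# Evaluate the contraction factor of [2, Theorem 5.1] for the
# parameter values of the example below.
mu, L, gamma, sigma, n = 0.1, 5.0, 1.4, 0.2, 5
rho = max((1 - sigma + gamma * mu * sigma) / (1 - sigma + gamma * mu),
          (sigma + (1 - sigma) * gamma * L) / (1 + (1 - sigma) * gamma * L))
theoretical_tau = rho ** (2 * n)
print(theoretical_tau)  # ~0.281206
```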

References: The method is from [1]; its PEP formulation and worst-case analysis are from [2] (see [2, Section 4.4] for more details).

[1] J. Eckstein and W. Yao (2018). Relative-error approximate versions of Douglas–Rachford splitting and special cases of the ADMM. Mathematical Programming, 170(2), 417-444.

[2] M. Barre, A. Taylor, F. Bach (2020). Principled analyses and design of first-order methods with inexact proximal operators. arXiv:2006.06041v2.

Parameters:
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • n (int) – number of iterations.

  • gamma (float) – the step-size.

  • sigma (float) – noise parameter.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose output at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_partially_inexact_douglas_rachford_splitting(mu=.1, L=5, n=5, gamma=1.4, sigma=.2, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 18x18
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 2 function(s)
                        Function 1 : Adding 30 scalar constraint(s) ...
                        Function 1 : 30 scalar constraint(s) added
                        Function 2 : Adding 30 scalar constraint(s) ...
                        Function 2 : 30 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 1 function(s)
                        Function 1 : Adding 10 scalar constraint(s) ...
                        Function 1 : 10 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.2812061652921267
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 7.538692070583704e-10
                All the primal scalar constraints are verified up to an error of 2.1234433933425834e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.755838881401962e-07
(PEPit) Final upper bound (dual): 0.2812061650995206 and lower bound (primal example): 0.2812061652921267
(PEPit) Duality gap: absolute: -1.9260609773752435e-10 and relative: -6.849284315563937e-10
*** Example file: worst-case performance of the partially inexact Douglas Rachford splitting ***
        PEPit guarantee:         ||z_n - z_*||^2 <= 0.281206 ||z_0 - z_*||^2
        Theoretical guarantee:   ||z_n - z_*||^2 <= 0.281206 ||z_0 - z_*||^2

8.3. Relatively inexact proximal point

PEPit.examples.inexact_proximal_methods.wc_relatively_inexact_proximal_point_algorithm(n, gamma, sigma, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the (possibly non-smooth) convex minimization problem,

\[f_\star \triangleq \min_x f(x)\]

where \(f\) is closed, convex, and proper. We denote by \(x_\star\) some optimal point of \(f\) (hence \(0\in\partial f(x_\star)\)). We further assume that one has access to an inexact version of the proximal operator of \(f\), whose level of accuracy is controlled by some parameter \(\sigma\geqslant 0\).

This code computes a worst-case guarantee for an inexact proximal point method. That is, it computes the smallest possible \(\tau(n, \gamma, \sigma)\) such that the guarantee

\[f(x_n) - f(x_\star) \leqslant \tau(n, \gamma, \sigma) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the method, \(\gamma\) is some step-size, and \(\sigma\) is the level of accuracy of the approximate proximal point oracle.

Algorithm: The approximate proximal point method under consideration is described by

\[x_{t+1} \approx_{\sigma} \arg\min_x \left\{ \gamma f(x)+\frac{1}{2} \|x-x_t\|^2 \right\},\]

where the notation “\(\approx_{\sigma}\)” corresponds to requiring the existence of vectors \(s_{t+1}\in\partial f(x_{t+1})\) and \(e_{t+1}\) such that

\[x_{t+1} = x_t - \gamma s_{t+1} + e_{t+1} \quad \quad \text{with }\|e_{t+1}\|^2 \leqslant \sigma^2\|x_{t+1} - x_t\|^2.\]

We note that \(\sigma=0\) forces \(e_{t+1}=0\), in which case the update reduces to a standard proximal step with step-size \(\gamma\).
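In the exact case \(\sigma=0\), the method can be sketched on the hypothetical example \(f(x)=|x|\), whose proximal operator is soft-thresholding: each exact step moves \(x_t\) toward \(0\) by \(\gamma\), and to exactly \(0\) once \(|x_t|\leqslant\gamma\):

```python
def exact_prox_point(x0, gamma=10.0, n=8):
    """Proximal point method with sigma = 0 on f(x) = |x|;
    prox_{gamma f}(z) = sign(z) * max(|z| - gamma, 0)."""
    x = x0
    for _ in range(n):
        x = max(abs(x) - gamma, 0.0) * (1.0 if x > 0 else -1.0)
    return x

exact_prox_point(25.0)  # 25 -> 15 -> 5 -> 0 -> ... -> 0
```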

Theoretical guarantee: The following (empirical) upper bound is provided in [1, Section 3.5.1],

\[f(x_n) - f(x_\star) \leqslant \frac{1 + \sigma}{4 \gamma n^{\sqrt{1 - \sigma^2}}}\|x_0 - x_\star\|^2.\]
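For the example's parameters (\(n=8\), \(\gamma=10\), \(\sigma=0.65\)), this evaluates to roughly \(0.00849\), the theoretical value reported below:

```python
from math import sqrt

# Evaluate the empirical bound (1 + sigma) / (4 gamma n^sqrt(1 - sigma^2))
# from [1, Section 3.5.1] for the parameter values of the example below.
n, gamma, sigma = 8, 10.0, 0.65
theoretical_tau = (1 + sigma) / (4 * gamma * n ** sqrt(1 - sigma ** 2))
print(theoretical_tau)  # ~0.00849444
```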

References: The precise formulation is presented in [1, Section 3.5.1].

[1] M. Barre, A. Taylor, F. Bach (2020). Principled analyses and design of first-order methods with inexact proximal operators. arXiv:2006.06041v2.

Parameters:
  • n (int) – number of iterations.

  • gamma (float) – the step-size.

  • sigma (float) – accuracy parameter of the proximal point computation.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose output at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_relatively_inexact_proximal_point_algorithm(n=8, gamma=10, sigma=.65, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 18x18
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 72 scalar constraint(s) ...
                        Function 1 : 72 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 1 function(s)
                        Function 1 : Adding 16 scalar constraint(s) ...
                        Function 1 : 16 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.007678482388821737
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 9.339211172115614e-09
                All the primal scalar constraints are verified up to an error of 1.0034810280098137e-07
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.7290964880672109e-07
(PEPit) Final upper bound (dual): 0.007678489787312847 and lower bound (primal example): 0.007678482388821737
(PEPit) Duality gap: absolute: 7.398491109686378e-09 and relative: 9.635355966248009e-07
*** Example file: worst-case performance of an inexact proximal point method in distance in function values ***
        PEPit guarantee:         f(x_n) - f(x_*) <= 0.00767849 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n) - f(x_*) <= 0.00849444 ||x_0 - x_*||^2