10. Low-dimensional worst-case scenarios

10.1. Inexact gradient

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_inexact_gradient(L, mu, epsilon, n, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for an inexact gradient method and looks for a low-dimensional worst-case example nearly achieving this worst-case guarantee using \(10\) iterations of the logdet heuristic.

That is, it computes the smallest possible \(\tau(n,L,\mu,\varepsilon)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n,L,\mu,\varepsilon) (f(x_0) - f_\star)\]

is valid, where \(x_n\) is the output of gradient descent with an inexact descent direction, and where \(x_\star\) is the minimizer of \(f\). Then, it looks for a low-dimensional example nearly achieving this performance.

The inexact descent direction is assumed to satisfy a relative inaccuracy described by (with \(0 \leqslant \varepsilon \leqslant 1\))

\[\|\nabla f(x_t) - d_t\| \leqslant \varepsilon \|\nabla f(x_t)\|,\]

where \(\nabla f(x_t)\) is the true gradient, and \(d_t\) is the approximate descent direction that is used.

Algorithm:

The inexact gradient descent under consideration can be written as

\[x_{t+1} = x_t - \frac{2}{L_{\varepsilon} + \mu_{\varepsilon}} d_t\]

where \(d_t\) is the inexact search direction, \(L_{\varepsilon} = (1 + \varepsilon)L\) and \(\mu_{\varepsilon} = (1-\varepsilon) \mu\).

Theoretical guarantee:

A tight worst-case guarantee obtained in [1, Theorem 5.3] or [2, Remark 1.6] is

\[f(x_n) - f_\star \leqslant \left(\frac{L_{\varepsilon} - \mu_{\varepsilon}}{L_{\varepsilon} + \mu_{\varepsilon}}\right)^{2n}(f(x_0) - f_\star ),\]

with \(L_{\varepsilon} = (1 + \varepsilon)L\) and \(\mu_{\varepsilon} = (1-\varepsilon) \mu\). This guarantee is achieved on one-dimensional quadratic functions.
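As a quick sanity check, the guarantee can be tested numerically. The sketch below (not the PEPit code) runs the method on an illustrative one-dimensional quadratic, with one particular admissible inexact direction \(d_t = (1+\varepsilon)\nabla f(x_t)\); this instance is an assumption for illustration and is not the worst case, so the bound should hold with slack.

```python
import numpy as np

# Sketch, assuming f(x) = mu/2 * x**2 (mu-strongly convex and L-smooth
# since mu <= L) and the admissible inexact direction
# d_t = (1 + epsilon) * f'(x_t), which satisfies
# |f'(x_t) - d_t| = epsilon * |f'(x_t)|.
L, mu, epsilon, n = 1.0, 0.1, 0.1, 6
L_eps, mu_eps = (1 + epsilon) * L, (1 - epsilon) * mu
gamma = 2 / (L_eps + mu_eps)

f = lambda x: mu / 2 * x**2
grad = lambda x: mu * x

x = 1.0
f0 = f(x)
for _ in range(n):
    d = (1 + epsilon) * grad(x)   # one admissible inexact direction
    x = x - gamma * d

rate = ((L_eps - mu_eps) / (L_eps + mu_eps)) ** (2 * n)
assert f(x) <= rate * f0          # the guarantee holds on this instance
```

With these parameters the theoretical factor is about 0.1397, matching the value reported by the PEPit example below.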

References: The detailed analyses can be found in [1, 2]. The logdet heuristic is presented in [3].

[1] E. De Klerk, F. Glineur, A. Taylor (2020). Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation. SIAM Journal on Optimization, 30(3), 2053-2082.

[2] O. Gannot (2021). A frequency-domain analysis of inexact gradient methods. Mathematical Programming (to appear).

[3] M. Fazel, H. Hindi, S. Boyd (2003). Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. American Control Conference (ACC).

Parameters
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • epsilon (float) – level of inaccuracy.

  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_inexact_gradient(L=1, mu=0.1, epsilon=0.1, n=6, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 15x15
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 62 scalar constraint(s) ...
                 function 1 : 62 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.13989778793516514
(PEPit) Postprocessing: 2 eigenvalue(s) > 1.7005395180119392e-05 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.1398878008962302
(PEPit) Postprocessing: 2 eigenvalue(s) > 5.283608596989854e-06 after 1 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.13988778337004493
(PEPit) Postprocessing: 2 eigenvalue(s) > 5.335098252373141e-06 after 2 dimension reduction step(s)
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.1398927512487368
(PEPit) Postprocessing: 2 eigenvalue(s) > 1.2372028101610534e-05 after 3 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.13988824650439619
(PEPit) Postprocessing: 2 eigenvalue(s) > 2.006867894032787e-05 after 4 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.13988779568391294
(PEPit) Postprocessing: 2 eigenvalue(s) > 5.416953129163531e-06 after 5 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.1398889451757595
(PEPit) Postprocessing: 2 eigenvalue(s) > 3.983502472713177e-05 after 6 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.13988780180833413
(PEPit) Postprocessing: 2 eigenvalue(s) > 5.4785759855262395e-06 after 7 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.13988778218159367
(PEPit) Postprocessing: 2 eigenvalue(s) > 5.360843247635456e-06 after 8 dimension reduction step(s)
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.13988478099895965
(PEPit) Postprocessing: 2 eigenvalue(s) > 9.59529914206238e-06 after 9 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.13988791535665998
(PEPit) Postprocessing: 2 eigenvalue(s) > 9.339529753603287e-06 after 10 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.13988791535665998
(PEPit) Postprocessing: 2 eigenvalue(s) > 9.339529753603287e-06 after dimension reduction
*** Example file: worst-case performance of inexact gradient ***
        PEPit example:           f(x_n)-f_* == 0.139888 (f(x_0)-f_*)
        Theoretical guarantee:   f(x_n)-f_* <= 0.139731 (f(x_0)-f_*)

10.2. Non-convex gradient descent

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_gradient_descent(L, gamma, n, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth.

This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\), and looks for a low-dimensional worst-case example nearly achieving this worst-case guarantee. That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee

\[\min_{t\leqslant n} \|\nabla f(x_t)\|^2 \leqslant \tau(n, L, \gamma) (f(x_0) - f(x_n))\]

is valid, where \(x_n\) is the n-th iterate of the gradient method with fixed step-size. Then, it looks for a low-dimensional example nearly achieving this performance.

Algorithm: Gradient descent is described as follows, for \(t \in \{ 0, \dots, n-1\}\),

\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]

where \(\gamma\) is the step-size.

Theoretical guarantee: When \(\gamma \leqslant \frac{1}{L}\), an empirically tight theoretical worst-case guarantee is

\[\min_{t\leqslant n} \|\nabla f(x_t)\|^2 \leqslant \frac{4}{3}\frac{L}{n} (f(x_0) - f(x_n)),\]

see discussions in [1, page 190] and [2].
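The guarantee can be checked on a concrete smooth nonconvex function. The sketch below (not the PEPit code) uses the illustrative choice \(f(x) = \log(1+x^2)\), whose second derivative is bounded by \(2\) in absolute value, so \(L = 2\); the starting point is an arbitrary assumption.

```python
import numpy as np

# Sketch: gradient descent with gamma = 1/L on the L-smooth (nonconvex)
# function f(x) = log(1 + x**2); |f''(x)| = |2(1 - x**2)| / (1 + x**2)**2
# is maximized at x = 0 where it equals 2, so L = 2.
L, n = 2.0, 5
gamma = 1 / L
f = lambda x: np.log1p(x**2)
grad = lambda x: 2 * x / (1 + x**2)

x = 3.0                      # illustrative starting point
xs = [x]
for _ in range(n):
    x = x - gamma * grad(x)
    xs.append(x)

min_sq_grad = min(grad(xi)**2 for xi in xs)
bound = (4 / 3) * (L / n) * (f(xs[0]) - f(xs[-1]))
assert min_sq_grad <= bound  # min_t ||f'(x_t)||^2 <= (4/3)(L/n)(f(x_0)-f(x_n))
```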

References:

[1] Taylor, A. B. (2017). Convex interpolation and performance estimation of first-order methods for convex optimization. PhD Thesis, UCLouvain.

[2] H. Abbaszadehpeivasti, E. de Klerk, M. Zamani (2021). The exact worst-case convergence rate of the gradient method with fixed step lengths for L-smooth functions. Optimization Letters, 16(6), 1649-1661.

Parameters
  • L (float) – the smoothness parameter.

  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value.

  • theoretical_tau (float) – theoretical value.

Example

>>> L = 1
>>> gamma = 1 / L
>>> pepit_tau, theoretical_tau = wc_gradient_descent(L=L, gamma=gamma, n=5, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 7x7
(PEPit) Setting up the problem: performance measure is minimum of 6 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 30 scalar constraint(s) ...
                 function 1 : 30 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.2666769474847614
(PEPit) Postprocessing: 6 eigenvalue(s) > 0.0 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.266667996380269
(PEPit) Postprocessing: 2 eigenvalue(s) > 1.0527850440294492e-05 after 1 dimension reduction step(s)
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.2666668138016744
(PEPit) Postprocessing: 2 eigenvalue(s) > 2.510763274714993e-07 after 2 dimension reduction step(s)
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.2666668138016744
(PEPit) Postprocessing: 2 eigenvalue(s) > 2.510763274714993e-07 after dimension reduction
*** Example file: worst-case performance of gradient descent with fixed step-size ***
        PEPit example:           min_i ||f'(x_i)||^2 == 0.266667 (f(x_0)-f_*)
        Theoretical guarantee:   min_i ||f'(x_i)||^2 <= 0.266667 (f(x_0)-f_*)

10.3. Optimized gradient

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_optimized_gradient(L, n, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for the optimized gradient method (OGM), and applies the trace heuristic for trying to find a low-dimensional worst-case example on which this guarantee is nearly achieved. That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of OGM and where \(x_\star\) is a minimizer of \(f\). Then, it applies the trace heuristic, which yields a one-dimensional function on which the guarantee is nearly achieved.

Algorithm: The optimized gradient method is described by

\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t)\\ y_{t+1} & = & x_{t+1} + \frac{\theta_{t}-1}{\theta_{t+1}}(x_{t+1}-x_t)+\frac{\theta_{t}}{\theta_{t+1}}(x_{t+1}-y_t), \end{eqnarray}

with

\begin{eqnarray} \theta_0 & = & 1 \\ \theta_t & = & \frac{1 + \sqrt{4 \theta_{t-1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \theta_n & = & \frac{1 + \sqrt{8 \theta_{n-1}^2 + 1}}{2}. \end{eqnarray}

Theoretical guarantee: The tight theoretical guarantee can be found in [2, Theorem 2]:

\[f(x_n)-f_\star \leqslant \frac{L\|x_0-x_\star\|^2}{2\theta_n^2}.\]
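The recursion and the bound can be exercised numerically. The sketch below (not the PEPit code) implements the updates above on an illustrative smooth convex quadratic; the diagonal matrix and starting point are assumptions chosen for the demonstration.

```python
import numpy as np

# Sketch of OGM on f(x) = 0.5 * x' diag(d) x with 0 < d_i <= L, so f is
# L-smooth and convex, minimized at x_* = 0 with f_* = 0.
L, n = 3.0, 4
d = np.array([3.0, 1.0, 0.3])            # illustrative curvatures
f = lambda x: 0.5 * x @ (d * x)
grad = lambda x: d * x

x0 = np.array([1.0, -2.0, 0.5])          # illustrative start
x, y, theta = x0, x0, 1.0
for t in range(n):
    x_new = y - grad(y) / L
    if t + 1 < n:
        theta_new = (1 + np.sqrt(4 * theta**2 + 1)) / 2
    else:                                # last step uses the 8-factor rule
        theta_new = (1 + np.sqrt(8 * theta**2 + 1)) / 2
    y = (x_new + (theta - 1) / theta_new * (x_new - x)
               + theta / theta_new * (x_new - y))
    x, theta = x_new, theta_new

bound = L * (x0 @ x0) / (2 * theta**2)   # L ||x_0 - x_*||^2 / (2 theta_n^2)
assert f(y) <= bound
```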

References: The OGM was developed in [1,2]. Low-dimensional worst-case functions for OGM were obtained in [3, 4].

[1] Y. Drori, M. Teboulle (2014). Performance of first-order methods for smooth convex minimization: a novel approach. Mathematical Programming 145(1–2), 451–482.

[2] D. Kim, J. Fessler (2016). Optimized first-order methods for smooth convex minimization. Mathematical Programming 159.1-2: 81-107.

[3] A. Taylor, J. Hendrickx, F. Glineur (2017). Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Mathematical Programming, 161(1-2), 307-345.

[4] D. Kim, J. Fessler (2017). On the convergence analysis of the optimized gradient method. Journal of Optimization Theory and Applications, 172(1), 187-205.

Parameters
  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_optimized_gradient(L=3, n=4, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 7x7
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 30 scalar constraint(s) ...
                 function 1 : 30 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.07675218017587908
(PEPit) Postprocessing: 5 eigenvalue(s) > 0.00012110342786525262 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.0767421794376856
(PEPit) Postprocessing: 1 eigenvalue(s) > 5.187978263167338e-09 after dimension reduction
*** Example file: worst-case performance of optimized gradient method ***
        PEPit example:           f(y_n)-f_* == 0.0767422 ||x_0 - x_*||^2
        Theoretical guarantee:   f(y_n)-f_* <= 0.0767518 ||x_0 - x_*||^2

10.4. Frank Wolfe

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_frank_wolfe(L, D, n, verbose=1)[source]

Consider the composite convex minimization problem

\[F_\star \triangleq \min_x \{F(x) \equiv f_1(x) + f_2(x)\},\]

where \(f_1\) is \(L\)-smooth and convex and where \(f_2\) is a convex indicator function on \(\mathcal{D}\) of diameter at most \(D\).

This code computes a worst-case guarantee for the conditional gradient method, also known as the Frank-Wolfe method, and looks for a low-dimensional worst-case example nearly achieving this worst-case guarantee using \(12\) iterations of the logdet heuristic. That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[F(x_n) - F(x_\star) \leqslant \tau(n, L) D^2,\]

is valid, where \(x_n\) is the output of the conditional gradient method, and where \(x_\star\) is a minimizer of \(F\). In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(F(x_n) - F(x_\star)\) when \(D \leqslant 1\). Then, it looks for a low-dimensional example nearly achieving this performance.

Algorithm:

This method was first presented in [1]. A more recent version can be found in, e.g., [2, Algorithm 1]. For \(t \in \{0, \dots, n-1\}\),

\[\begin{split}\begin{eqnarray} y_t & = & \arg\min_{s \in \mathcal{D}} \langle s \mid \nabla f_1(x_t) \rangle, \\ x_{t+1} & = & \frac{t}{t + 2} x_t + \frac{2}{t + 2} y_t. \end{eqnarray}\end{split}\]

Theoretical guarantee:

An upper bound obtained in [2, Theorem 1] is

\[F(x_n) - F(x_\star) \leqslant \frac{2L D^2}{n+2}.\]
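The bound can be checked on a simple instance. The sketch below (not the PEPit code) takes \(\mathcal{D}\) to be a Euclidean ball of diameter \(D\), over which the linear minimization oracle has closed form, and an illustrative 1-smooth quadratic \(f_1\); both choices are assumptions for the demonstration.

```python
import numpy as np

# Sketch: Frank-Wolfe over the ball {x : ||x|| <= D/2} (diameter D) with
# the 1-smooth objective f1(x) = 0.5 * ||x - c||^2; c lies inside the
# ball, so the minimizer is x_* = c and F_* = 0.
L, D, n = 1.0, 1.0, 10
c = np.array([0.3, -0.2])                 # ||c|| < D/2, illustrative
f = lambda x: 0.5 * np.dot(x - c, x - c)
grad = lambda x: x - c

x = np.array([D / 2, 0.0])                # feasible start
for t in range(n):
    g = grad(x)
    s = -(D / 2) * g / np.linalg.norm(g)  # linear minimization oracle on the ball
    x = t / (t + 2) * x + 2 / (t + 2) * s

assert f(x) <= 2 * L * D**2 / (n + 2)     # [2, Theorem 1]
```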

References: The algorithm is presented in, among others, [1, 2]. The logdet heuristic is presented in [3].

[1] M. Frank, P. Wolfe (1956). An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1-2), 95-110.

[2] M. Jaggi (2013). Revisiting Frank-Wolfe: Projection-free sparse convex optimization. In 30th International Conference on Machine Learning (ICML).

[3] M. Fazel, H. Hindi, S. Boyd (2003). Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. American Control Conference (ACC).

Parameters
  • L (float) – the smoothness parameter.

  • D (float) – diameter of \(f_2\).

  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value.

  • theoretical_tau (float) – theoretical value.

Example

>>> pepit_tau, theoretical_tau = wc_frank_wolfe(L=1, D=1, n=10, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 26x26
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (0 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 2 function(s)
                 function 1 : Adding 132 scalar constraint(s) ...
                 function 1 : 132 scalar constraint(s) added
                 function 2 : Adding 325 scalar constraint(s) ...
                 function 2 : 325 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.07830185202143693
(PEPit) Postprocessing: 12 eigenvalue(s) > 0.0006226631118848632 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.07828372031738319
(PEPit) Postprocessing: 11 eigenvalue(s) > 4.365697148503946e-06 after 1 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.07826632525166947
(PEPit) Postprocessing: 11 eigenvalue(s) > 1.2665145818615854e-05 after 2 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.07824094610510846
(PEPit) Postprocessing: 11 eigenvalue(s) > 2.4505278932874855e-05 after 3 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.07820114036570962
(PEPit) Postprocessing: 11 eigenvalue(s) > 4.164155031005524e-05 after 4 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.07823286027467699
(PEPit) Postprocessing: 10 eigenvalue(s) > 9.73301991908838e-05 after 5 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.07823446697003811
(PEPit) Postprocessing: 10 eigenvalue(s) > 0.00011791962010861412 after 6 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.07823446697003811
(PEPit) Postprocessing: 10 eigenvalue(s) > 0.00011791962010861412 after dimension reduction
*** Example file: worst-case performance of the Conditional Gradient (Frank-Wolfe) in function value ***
        PEPit example:           f(x_n)-f_* == 0.0782345 ||x0 - xs||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.166667 ||x0 - xs||^2

10.5. Proximal point

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_proximal_point(alpha, n, verbose=1)[source]

Consider the monotone inclusion problem

\[\mathrm{Find}\, x:\, 0\in Ax,\]

where \(A\) is maximally monotone. We denote by \(J_A = (I + A)^{-1}\) the resolvent of \(A\).

This code computes a worst-case guarantee for the proximal point method, and looks for a low-dimensional worst-case example nearly achieving this worst-case guarantee using the trace heuristic.

That is, it computes the smallest possible \(\tau(n, \alpha)\) such that the guarantee

\[\|x_n - x_{n-1}\|^2 \leqslant \tau(n, \alpha) \|x_0 - x_\star\|^2,\]

is valid, where \(x_\star\) is such that \(0 \in Ax_\star\). Then, it looks for a low-dimensional example nearly achieving this performance.

Algorithm: The proximal point algorithm for monotone inclusions is described as follows, for \(t \in \{ 0, \dots, n-1\}\),

\[x_{t+1} = J_{\alpha A}(x_t),\]

where \(\alpha\) is a step-size.

Theoretical guarantee: A tight theoretical guarantee can be found in [1, section 4].

\[\|x_n - x_{n-1}\|^2 \leqslant \frac{\left(1 - \frac{1}{n}\right)^{n - 1}}{n} \|x_0 - x_\star\|^2.\]
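For a concrete check, the sketch below (not the PEPit code) applies the proximal point method to an illustrative maximally monotone linear operator, the plane skew-symmetric map \(A = \begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}\), whose unique zero is \(x_\star = 0\); for a linear operator the resolvent is a matrix inverse.

```python
import numpy as np

# Sketch: proximal point method on the skew-symmetric (hence monotone)
# linear operator A; J_{alpha A} = (I + alpha A)^{-1} is a linear map.
alpha, n = 2.2, 11
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
J = np.linalg.inv(np.eye(2) + alpha * A)   # resolvent of alpha * A

x0 = np.array([1.0, 0.0])                  # illustrative start
x = x0
for _ in range(n):
    x_prev = x
    x = J @ x

lhs = np.dot(x - x_prev, x - x_prev)
rhs = (1 - 1 / n)**(n - 1) / n * np.dot(x0, x0)   # [1, Section 4]
assert lhs <= rhs
```

With \(n = 11\) the right-hand-side factor is \((10/11)^{10}/11 \approx 0.03505\), the theoretical value printed in the example below.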

Reference:

[1] G. Gu, J. Yang (2020). Tight sublinear convergence rate of the proximal point algorithm for maximal monotone inclusion problem. SIAM Journal on Optimization, 30(3), 1905-1921.

Parameters
  • alpha (float) – the step-size.

  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value.

  • theoretical_tau (float) – theoretical value.

Example

>>> pepit_tau, theoretical_tau = wc_proximal_point(alpha=2.2, n=11, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 13x13
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 132 scalar constraint(s) ...
                 function 1 : 132 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.03504735907840766
(PEPit) Postprocessing: 2 eigenvalue(s) > 1.885183851963194e-06 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.03503739338571882
(PEPit) Postprocessing: 2 eigenvalue(s) > 1.9044504527414672e-06 after dimension reduction
*** Example file: worst-case performance of the Proximal Point Method***
        PEPit example:           ||x(n) - x(n-1)||^2 == 0.0350374 ||x0 - xs||^2
        Theoretical guarantee:   ||x(n) - x(n-1)||^2 <= 0.0350494 ||x0 - xs||^2

10.6. Halpern iteration

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_halpern_iteration(n, verbose=1)[source]

Consider the fixed point problem

\[\mathrm{Find}\, x:\, x = Ax,\]

where \(A\) is a non-expansive operator, that is, a \(1\)-Lipschitz operator.

This code computes a worst-case guarantee for the Halpern Iteration, and looks for a low-dimensional worst-case example nearly achieving this worst-case guarantee. That is, it computes the smallest possible \(\tau(n)\) such that the guarantee

\[\|x_n - Ax_n\|^2 \leqslant \tau(n) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the Halpern iteration, and \(x_\star\) the fixed point of \(A\).

In short, for a given value of \(n\), \(\tau(n)\) is computed as the worst-case value of \(\|x_n - Ax_n\|^2\) when \(\|x_0 - x_\star\|^2 \leqslant 1\). Then, it looks for a low-dimensional example nearly achieving this performance.

Algorithm: The Halpern iteration can be written as

\[x_{t+1} = \frac{1}{t + 2} x_0 + \left(1 - \frac{1}{t + 2}\right) Ax_t.\]

Theoretical guarantee: A tight worst-case guarantee for Halpern iteration can be found in [1, Theorem 2.1]:

\[\|x_n - Ax_n\|^2 \leqslant \left(\frac{2}{n+1}\right)^2 \|x_0 - x_\star\|^2.\]
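The bound can be exercised on a simple non-expansive operator. The sketch below (not the PEPit code) uses a plane rotation, an isometry (hence \(1\)-Lipschitz) whose only fixed point is \(x_\star = 0\); the rotation angle and starting point are illustrative assumptions.

```python
import numpy as np

# Sketch: Halpern iteration for the non-expansive operator A given by a
# plane rotation; its unique fixed point is x_* = 0.
n = 10
angle = np.pi / 3
A = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])

x0 = np.array([1.0, 0.0])
x = x0
for t in range(n):
    x = x0 / (t + 2) + (1 - 1 / (t + 2)) * (A @ x)

res = x - A @ x
assert np.dot(res, res) <= (2 / (n + 1))**2 * np.dot(x0, x0)   # [1, Theorem 2.1]
```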

References: The detailed approach and tight bound are available in [1]. The logdet heuristic is presented in [2].

[1] F. Lieder (2021). On the convergence rate of the Halpern-iteration. Optimization Letters, 15(2), 405-418.

[2] M. Fazel, H. Hindi, S. Boyd (2003). Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. American Control Conference (ACC).

Parameters
  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_halpern_iteration(n=10, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 13x13
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 132 scalar constraint(s) ...
                 function 1 : 132 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.033076981475854986
(PEPit) Postprocessing: 11 eigenvalue(s) > 2.538373915093237e-06 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.03306531836320572
(PEPit) Postprocessing: 2 eigenvalue(s) > 0.00010453609338097841 after 1 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.0330736415198303
(PEPit) Postprocessing: 2 eigenvalue(s) > 4.3812352924839906e-05 after 2 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.03307313275765859
(PEPit) Postprocessing: 2 eigenvalue(s) > 4.715648695840045e-05 after 3 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.03307313275765859
(PEPit) Postprocessing: 2 eigenvalue(s) > 4.715648695840045e-05 after dimension reduction
*** Example file: worst-case performance of Halpern Iterations ***
        PEPit example:           ||xN - AxN||^2 == 0.0330731 ||x0 - x_*||^2
        Theoretical guarantee:   ||xN - AxN||^2 <= 0.0330579 ||x0 - x_*||^2

10.7. Alternate projections

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_alternate_projections(n, verbose=1)[source]

Consider the convex feasibility problem:

\[\mathrm{Find}\, x\in Q_1\cap Q_2\]

where \(Q_1\) and \(Q_2\) are two closed convex sets.

This code computes a worst-case guarantee for the alternate projection method, and looks for a low-dimensional worst-case example nearly achieving this worst-case guarantee. That is, it computes the smallest possible \(\tau(n)\) such that the guarantee

\[\|\mathrm{Proj}_{Q_1}(x_n)-\mathrm{Proj}_{Q_2}(x_n)\|^2 \leqslant \tau(n) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the alternate projection method, and \(x_\star\in Q_1\cap Q_2\) is a solution to the convex feasibility problem.

In short, for a given value of \(n\), \(\tau(n)\) is computed as the worst-case value of \(\|\mathrm{Proj}_{Q_1}(x_n)-\mathrm{Proj}_{Q_2}(x_n)\|^2\) when \(\|x_0 - x_\star\|^2 \leqslant 1\). Then, it looks for a low-dimensional example nearly achieving this performance.

Algorithm: The alternate projection method can be written as

\[\begin{split}\begin{eqnarray} y_{t+1} & = & \mathrm{Proj}_{Q_1}(x_t), \\ x_{t+1} & = & \mathrm{Proj}_{Q_2}(y_{t+1}). \end{eqnarray}\end{split}\]
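The method is easy to run when both projections have closed form. The sketch below (not the PEPit code) uses an illustrative pair of sets, a half-plane and the closed unit disk, whose intersection is nonempty; the sets and starting point are assumptions for the demonstration.

```python
import numpy as np

# Sketch: alternate projections between Q1 = {x : x[0] <= 0} (a closed
# half-plane) and Q2 = the closed unit disk; both projections are in
# closed form, and the gap between the two projections shrinks.
proj1 = lambda x: np.array([min(x[0], 0.0), x[1]])
proj2 = lambda x: x / max(np.linalg.norm(x), 1.0)

x = np.array([2.0, 2.0])
for _ in range(10):
    y = proj1(x)
    x = proj2(y)

gap = np.linalg.norm(proj1(x) - proj2(x))
assert gap < 1e-9   # this easy instance reaches the intersection quickly
```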

References: The first results on this method are due to [1]. Its formulation within the PEP framework is due to [2].

[1] J. Von Neumann (1949). On rings of operators. Reduction theory. Annals of Mathematics, pp. 401–485.

[2] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283–1313.

Parameters
  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (None) – no theoretical value.

Example

>>> pepit_tau, theoretical_tau = wc_alternate_projections(n=10, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 24x24
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 2 function(s)
                 function 1 : Adding 144 scalar constraint(s) ...
                 function 1 : 144 scalar constraint(s) added
                 function 2 : Adding 121 scalar constraint(s) ...
                 function 2 : 121 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.018858674370385117
(PEPit) Postprocessing: 2 eigenvalue(s) > 0.0003128757392530764 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.018851597249912744
(PEPit) Postprocessing: 2 eigenvalue(s) > 7.314172662475898e-06 after 1 dimension reduction step(s)
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.018851597249912744
(PEPit) Postprocessing: 2 eigenvalue(s) > 7.314172662475898e-06 after dimension reduction
*** Example file: worst-case performance of the alternate projection method ***
        PEPit example:   ||Proj_Q1 (xn) - Proj_Q2 (xn)||^2 == 0.0188516 ||x0 - x_*||^2

10.8. Averaged projections

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_averaged_projections(n, verbose=1)[source]

Consider the convex feasibility problem:

\[\mathrm{Find}\, x\in Q_1\cap Q_2\]

where \(Q_1\) and \(Q_2\) are two closed convex sets.

This code computes a worst-case guarantee for the averaged projection method, and looks for a low-dimensional worst-case example nearly achieving this worst-case guarantee. That is, it computes the smallest possible \(\tau(n)\) such that the guarantee

\[\|\mathrm{Proj}_{Q_1}(x_n)-\mathrm{Proj}_{Q_2}(x_n)\|^2 \leqslant \tau(n) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the averaged projection method, and \(x_\star\in Q_1\cap Q_2\) is a solution to the convex feasibility problem.

In short, for a given value of \(n\), \(\tau(n)\) is computed as the worst-case value of \(\|\mathrm{Proj}_{Q_1}(x_n)-\mathrm{Proj}_{Q_2}(x_n)\|^2\) when \(\|x_0 - x_\star\|^2 \leqslant 1\). Then, it looks for a low-dimensional example nearly achieving this performance.

Algorithm: The averaged projection method can be written as

\[\begin{eqnarray} x_{t+1} & = & \frac{1}{2} \left(\mathrm{Proj}_{Q_1}(x_t) + \mathrm{Proj}_{Q_2}(x_t)\right). \end{eqnarray}\]
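The update above can be run on the same style of illustrative sets as in the previous subsection (a half-plane and the closed unit disk, both with closed-form projections); this sketch is not the PEPit code, and the sets and starting point are assumptions.

```python
import numpy as np

# Sketch: averaged projections with Q1 = {x : x[0] <= 0} and Q2 = the
# closed unit disk; the iterates approach the intersection, so the gap
# between the two projections decreases.
proj1 = lambda x: np.array([min(x[0], 0.0), x[1]])
proj2 = lambda x: x / max(np.linalg.norm(x), 1.0)

x = np.array([2.0, 2.0])
gap0 = np.linalg.norm(proj1(x) - proj2(x))
for _ in range(50):
    x = 0.5 * (proj1(x) + proj2(x))

gap = np.linalg.norm(proj1(x) - proj2(x))
assert gap < gap0   # the measure of infeasibility has shrunk
```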
Parameters
  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (None) – no theoretical value.

Example

>>> pepit_tau, theoretical_tau = wc_averaged_projections(n=10, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 25x25
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 2 function(s)
                 function 1 : Adding 144 scalar constraint(s) ...
                 function 1 : 144 scalar constraint(s) added
                 function 2 : Adding 144 scalar constraint(s) ...
                 function 2 : 144 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.06845454756941292
(PEPit) Postprocessing: 2 eigenvalue(s) > 0.00014022393949281894 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.06844459892544441
(PEPit) Postprocessing: 2 eigenvalue(s) > 7.442958512820225e-07 after 1 dimension reduction step(s)
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.06844459892544441
(PEPit) Postprocessing: 2 eigenvalue(s) > 7.442958512820225e-07 after dimension reduction
*** Example file: worst-case performance of the averaged projection method ***
        PEPit example:   ||Proj_Q1 (xn) - Proj_Q2 (xn)||^2 == 0.0684446 ||x0 - x_*||^2

10.9. Dykstra

PEPit.examples.low_dimensional_worst_cases_scenarios.wc_dykstra(n, verbose=1)[source]

Consider the convex feasibility problem:

\[\mathrm{Find}\, x\in Q_1\cap Q_2\]

where \(Q_1\) and \(Q_2\) are two closed convex sets.

This code computes a worst-case guarantee for the Dykstra projection method, and looks for a low-dimensional worst-case example nearly achieving this worst-case guarantee. That is, it computes the smallest possible \(\tau(n)\) such that the guarantee

\[\|\mathrm{Proj}_{Q_1}(x_n)-\mathrm{Proj}_{Q_2}(x_n)\|^2 \leqslant \tau(n) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the Dykstra projection method, and \(x_\star\in Q_1\cap Q_2\) is a solution to the convex feasibility problem.

In short, for a given value of \(n\), \(\tau(n)\) is computed as the worst-case value of \(\|\mathrm{Proj}_{Q_1}(x_n)-\mathrm{Proj}_{Q_2}(x_n)\|^2\) when \(\|x_0 - x_\star\|^2 \leqslant 1\). Then, it looks for a low-dimensional worst-case example nearly achieving this performance.

Algorithm: The Dykstra projection method can be written as

\[\begin{split}\begin{eqnarray} y_{t} & = & \mathrm{Proj}_{Q_1}(x_t+p_t), \\ p_{t+1} & = & x_t + p_t - y_t,\\ x_{t+1} & = & \mathrm{Proj}_{Q_2}(y_t+q_t),\\ q_{t+1} & = & y_t + q_t - x_{t+1}. \end{eqnarray}\end{split}\]
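The four-step recursion above can be sketched directly in NumPy. The snippet below is not part of PEPit; the halfspace sets and helper names are illustrative assumptions, and the correction terms \(p_t, q_t\) are initialized at zero, which is the usual starting choice:

```python
import numpy as np

def proj_halfspace(a, b):
    """Return the Euclidean projection operator onto {x : <a, x> <= b}."""
    a = np.asarray(a, dtype=float)
    def proj(x):
        violation = a @ x - b
        return x - max(violation, 0.0) * a / (a @ a)
    return proj

def dykstra(x0, proj1, proj2, n):
    """Run n iterations of Dykstra's projection method:
       y_t     = Proj_{Q1}(x_t + p_t)
       p_{t+1} = x_t + p_t - y_t
       x_{t+1} = Proj_{Q2}(y_t + q_t)
       q_{t+1} = y_t + q_t - x_{t+1}
    with correction terms p_0 = q_0 = 0 (assumed initialization)."""
    x = np.asarray(x0, dtype=float)
    p = np.zeros_like(x)
    q = np.zeros_like(x)
    for _ in range(n):
        y = proj1(x + p)
        p = x + p - y
        x_next = proj2(y + q)
        q = y + q - x_next
        x = x_next
    return x

# Hypothetical sets: Q1 = {x : x_1 <= 0}, Q2 = {x : x_2 <= 0},
# whose intersection is the third quadrant.
proj1 = proj_halfspace([1.0, 0.0], 0.0)
proj2 = proj_halfspace([0.0, 1.0], 0.0)
x = dykstra([2.0, 3.0], proj1, proj2, n=20)
```

Unlike the averaged projection method, Dykstra's corrections steer the iterates toward the projection of \(x_0\) onto \(Q_1 \cap Q_2\); here that projection is the origin, which this instance reaches after two iterations.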

References: This method is due to [1].

[1] J.P. Boyle, R.L. Dykstra (1986). A method for finding projections onto the intersection of convex sets in Hilbert spaces. Lecture Notes in Statistics. Vol. 37. pp. 28–47.

Parameters
  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (None) – no theoretical value.

Example

>>> pepit_tau, theoretical_tau = wc_dykstra(n=10, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 24x24
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 2 function(s)
                 function 1 : Adding 144 scalar constraint(s) ...
                 function 1 : 144 scalar constraint(s) added
                 function 2 : Adding 121 scalar constraint(s) ...
                 function 2 : 121 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal_inaccurate (solver: SCS); optimal value: 0.020649148184166164
(PEPit) Postprocessing: 3 eigenvalue(s) > 0.003245910668057083 before dimension reduction
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.02124334648210737
(PEPit) Postprocessing: 3 eigenvalue(s) > 0.002134191248999246 after 1 dimension reduction step(s)
(PEPit) Solver status: optimal_inaccurate (solver: SCS); objective value: 0.02124334648210737
(PEPit) Postprocessing: 3 eigenvalue(s) > 0.002134191248999246 after dimension reduction
*** Example file: worst-case performance of the Dykstra projection method ***
        PEPit example:   ||Proj_Q1 (xn) - Proj_Q2 (xn)||^2 == 0.0212433 ||x0 - x_*||^2