1. Unconstrained convex minimization

1.1. Gradient descent

PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \gamma) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\), \(L\), and \(\gamma\), \(\tau(n, L, \gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: Gradient descent is described by

\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]

where \(\gamma\) is a step-size.

Theoretical guarantee: When \(\gamma \leqslant \frac{1}{L}\), the tight theoretical guarantee can be found in [1, Theorem 3.1]:

\[f(x_n)-f_\star \leqslant \frac{L}{4nL\gamma+2} \|x_0-x_\star\|^2,\]

which is tight on some Huber loss functions.
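As a quick sanity check (not part of the PEPit API), the factor \(\frac{L}{4nL\gamma+2}\) can be evaluated directly; with the parameters of the example further down (\(L=3\), \(\gamma=1/L\), \(n=4\)) it reproduces the value reported there.

>>> L, n = 3, 4
>>> gamma = 1 / L
>>> theoretical = L / (4 * n * L * gamma + 2)  # factor of [1, Theorem 3.1]
>>> round(theoretical, 6)  # matches the 0.166667 reported in the example below
0.166667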

References:

[1] Y. Drori, M. Teboulle (2014). Performance of first-order methods for smooth convex minimization: a novel approach. Mathematical Programming 145(1–2), 451–482.

Parameters:
  • L (float) – the smoothness parameter.

  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> L = 3
>>> pepit_tau, theoretical_tau = wc_gradient_descent(L=L, gamma=1 / L, n=4, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 7x7
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 30 scalar constraint(s) ...
                        Function 1 : 30 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.1666666649793712
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 6.325029587061441e-10
                All the primal scalar constraints are verified up to an error of 6.633613956752438e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 7.0696173743789816e-09
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 7.547098305159261e-08
(PEPit) Final upper bound (dual): 0.16666667331941884 and lower bound (primal example): 0.1666666649793712
(PEPit) Duality gap: absolute: 8.340047652488636e-09 and relative: 5.004028642152831e-08
*** Example file: worst-case performance of gradient descent with fixed step-sizes ***
        PEPit guarantee:         f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2

1.2. Subgradient method

PEPit.examples.unconstrained_convex_minimization.wc_subgradient_method(M, n, gamma, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is convex and \(M\)-Lipschitz (hence possibly non-smooth).

This code computes a worst-case guarantee for the subgradient method. That is, it computes the smallest possible \(\tau(n, M, \gamma)\) such that the guarantee

\[\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star \leqslant \tau(n, M, \gamma)\]

is valid, where \(x_t\) are the iterates of the subgradient method after \(t\leqslant n\) steps, where \(x_\star\) is a minimizer of \(f\), and when \(\|x_0-x_\star\|\leqslant 1\).

In short, for given values of \(M\), the step-size \(\gamma\) and the number of iterations \(n\), \(\tau(n, M, \gamma)\) is computed as the worst-case value of \(\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star\) when \(\|x_0-x_\star\| \leqslant 1\).

Algorithm: For \(t\in \{0, \dots, n-1 \}\)

\begin{eqnarray} g_{t} & \in & \partial f(x_t) \\ x_{t+1} & = & x_t - \gamma g_t \end{eqnarray}

Theoretical guarantee: The tight bound is obtained in [1, Section 3.2.3] and [2, Eq (2)]

\[\min_{0 \leqslant t \leqslant n} f(x_t)- f(x_\star) \leqslant \frac{M}{\sqrt{n+1}}\|x_0-x_\star\|,\]

and tightness follows from the lower complexity bound for this class of problems, e.g., [3, Appendix A].
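As a quick numerical check (not part of the PEPit API), the bound \(\frac{M}{\sqrt{n+1}}\) can be evaluated for the parameters of the example further down (\(M=2\), \(n=6\), \(\|x_0-x_\star\|\leqslant 1\)).

>>> from math import sqrt
>>> M, n = 2, 6
>>> theoretical = M / sqrt(n + 1)  # bound of [1, Section 3.2.3] and [2, Eq (2)]
>>> round(theoretical, 6)  # matches the 0.755929 reported in the example below
0.755929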

References: Classical references on this topic include [1, 2].

[1] Y. Nesterov (2003). Introductory lectures on convex optimization: A basic course. Springer Science & Business Media.

[2] S. Boyd, L. Xiao, A. Mutapcic (2003). Subgradient Methods (lecture notes).

[3] Y. Drori, M. Teboulle (2016). An optimal variant of Kelley’s cutting-plane method. Mathematical Programming, 160(1), 321-351.

Parameters:
  • M (float) – the Lipschitz parameter.

  • n (int) – the number of iterations.

  • gamma (float) – step-size.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> from math import sqrt
>>> M = 2
>>> n = 6
>>> gamma = 1 / (M * sqrt(n + 1))
>>> pepit_tau, theoretical_tau = wc_subgradient_method(M=M, n=n, gamma=gamma, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 9x9
(PEPit) Setting up the problem: performance measure is the minimum of 7 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 64 scalar constraint(s) ...
                        Function 1 : 64 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.7559287513714278
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite
                All the primal scalar constraints are verified
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 1.0475429120359347e-09
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 8.251581484359109e-08
(PEPit) Final upper bound (dual): 0.7559287543574007 and lower bound (primal example): 0.7559287513714278
(PEPit) Duality gap: absolute: 2.9859729133718815e-09 and relative: 3.950071892297578e-09
*** Example file: worst-case performance of subgradient method ***
        PEPit guarantee:         min_(0 \leq t \leq n) f(x_i) - f_* <= 0.755929 ||x_0 - x_*||
        Theoretical guarantee:   min_(0 \leq t \leq n) f(x_i) - f_* <= 0.755929 ||x_0 - x_*||

1.3. Subgradient method under restricted secant inequality and error bound

PEPit.examples.unconstrained_convex_minimization.wc_subgradient_method_rsi_eb(mu, L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) verifies the “lower” restricted secant inequality (\(\mu-\text{RSI}^-\)) and the “upper” error bound (\(L-\text{EB}^+\)) [1].

This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, \mu, L, \gamma)\) such that the guarantee

\[\| x_n - x_\star \|^2 \leqslant \tau(n, \mu, L, \gamma) \| x_0 - x_\star \|^2\]

is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\), \(\mu\), \(L\), and \(\gamma\), \(\tau(n, \mu, L, \gamma)\) is computed as the worst-case value of \(\| x_n - x_\star \|^2\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: Sub-gradient descent is described by

\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]

where \(\gamma\) is a step-size.

Theoretical guarantee: The tight theoretical guarantee can be found in [1, Prop 1] (upper bound) and [1, Theorem 2] (lower bound):

\[\| x_n - x_\star \|^2 \leqslant (1 - 2\gamma\mu + L^2 \gamma^2)^n \|x_0-x_\star\|^2.\]
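As an illustrative check (not part of the PEPit API), the contraction factor \((1 - 2\gamma\mu + L^2 \gamma^2)^n\) can be evaluated for the parameters of the example further down (\(\mu=0.1\), \(L=1\), \(\gamma=\mu/L^2\), \(n=4\)).

>>> mu, L, n = 0.1, 1, 4
>>> gamma = mu / L ** 2
>>> theoretical = (1 - 2 * gamma * mu + L ** 2 * gamma ** 2) ** n
>>> round(theoretical, 6)  # matches the 0.960596 reported in the example below
0.960596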

References: Definition and convergence guarantees can be found in [1].

[1] C. Guille-Escuret, B. Goujaud, A. Ibrahim, I. Mitliagkas (2022). Gradient Descent Is Optimal Under Lower Restricted Secant Inequality And Upper Error Bound.

Parameters:
  • mu (float) – the rsi parameter

  • L (float) – the eb parameter.

  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> mu = .1
>>> L = 1
>>> pepit_tau, theoretical_tau = wc_subgradient_method_rsi_eb(mu=mu, L=L, gamma=mu / L ** 2, n=4, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 6x6
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 8 scalar constraint(s) ...
                        Function 1 : 8 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.9605960099986828
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite
                All the primal scalar constraints are verified
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.9007807847382036e-12
(PEPit) Final upper bound (dual): 0.960596009998838 and lower bound (primal example): 0.9605960099986828
(PEPit) Duality gap: absolute: 1.5520917884259688e-13 and relative: 1.6157591456455218e-13
*** Example file: worst-case performance of gradient descent with fixed step-sizes ***
        PEPit guarantee:         ||x_n - x_*||^2 <= 0.960596 ||x_0 - x_*||^2
        Theoretical guarantee:   ||x_n - x_*||^2 <= 0.960596 ||x_0 - x_*||^2

1.5. Conjugate gradient

PEPit.examples.unconstrained_convex_minimization.wc_conjugate_gradient(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for the conjugate gradient (CG) method (with exact span searches). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0-x_\star\|^2\]

is valid, where \(x_n\) is the output of the conjugate gradient method, and where \(x_\star\) is a minimizer of \(f\). In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0-x_\star\|^2 \leqslant 1\).

Algorithm:

\[x_{t+1} = x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i)\]

with

\[(\gamma_i)_{i \leqslant t} = \arg\min_{(\gamma_i)_{i \leqslant t}} f \left(x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i) \right)\]

Theoretical guarantee:

The tight guarantee obtained in [1] is

\[f(x_n) - f_\star \leqslant \frac{L}{2 \theta_n^2}\|x_0-x_\star\|^2,\]

where

\begin{eqnarray} \theta_0 & = & 1 \\ \theta_t & = & \frac{1 + \sqrt{4 \theta_{t-1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \theta_n & = & \frac{1 + \sqrt{8 \theta_{n-1}^2 + 1}}{2}, \end{eqnarray}

and tightness follows from [2, Theorem 3].
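The sequence \((\theta_t)_t\) is easy to evaluate numerically. The sketch below (the helper name is ours, not part of the PEPit API) transcribes the recursion above and evaluates \(\frac{L}{2\theta_n^2}\) for the parameters of the example further down.

>>> from math import sqrt
>>> def cg_theoretical_tau(L, n):  # hypothetical helper, simple transcription of the recursion above
...     theta = 1.  # theta_0
...     for _ in range(1, n):
...         theta = (1 + sqrt(4 * theta ** 2 + 1)) / 2  # theta_1, ..., theta_{n-1}
...     theta_n = (1 + sqrt(8 * theta ** 2 + 1)) / 2
...     return L / (2 * theta_n ** 2)
>>> round(cg_theoretical_tau(L=1, n=2), 7)  # matches the 0.0618942 reported in the example below
0.0618942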

References: The detailed approach (based on convex relaxations) is available in [1, Corollary 6].

[1] Y. Drori and A. Taylor (2020). Efficient first-order methods for convex minimization: a constructive approach. Mathematical Programming 184 (1), 183-220.

[2] Y. Drori (2017). The exact information-based complexity of smooth convex minimization. Journal of Complexity, 39, 1-16.

Parameters:
  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_conjugate_gradient(L=1, n=2, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 7x7
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 12 scalar constraint(s) ...
                        Function 1 : 12 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 1 function(s)
                        Function 1 : Adding 6 scalar constraint(s) ...
                        Function 1 : 6 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.061894196487033516
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 4.09403336792362e-09
                All the primal scalar constraints are verified up to an error of 7.438128096087793e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 8.571121581640089e-08
(PEPit) Final upper bound (dual): 0.06189420236648946 and lower bound (primal example): 0.061894196487033516
(PEPit) Duality gap: absolute: 5.8794559429364845e-09 and relative: 9.49920392644276e-08
*** Example file: worst-case performance of conjugate gradient method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.0618942 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.0618942 ||x_0 - x_*||^2

1.6. Heavy Ball momentum

PEPit.examples.unconstrained_convex_minimization.wc_heavy_ball_momentum(mu, L, alpha, beta, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for the Heavy-ball (HB) method, aka Polyak momentum method. That is, it computes the smallest possible \(\tau(n, L, \mu, \alpha, \beta)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu, \alpha, \beta) (f(x_0) - f_\star)\]

is valid, where \(x_n\) is the output of the Heavy-ball (HB) method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu, \alpha, \beta)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star \leqslant 1\).

Algorithm:

\[x_{t+1} = x_t - \alpha \nabla f(x_t) + \beta (x_t-x_{t-1})\]

with

\[\alpha \in (0, \frac{1}{L}]\]

and

\[\beta = \sqrt{(1 - \alpha \mu)(1 - L \alpha)}\]

Theoretical guarantee:

The upper bound obtained in [2, Theorem 4] is

\[f(x_n) - f_\star \leqslant (1 - \alpha \mu)^n (f(x_0) - f_\star).\]
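As a quick check (not part of the PEPit API), the factor \((1 - \alpha \mu)^n\) can be evaluated for the parameters used in the example further down (\(\mu=0.1\), \(L=1\), \(\alpha=\frac{1}{2L}\), \(n=2\)).

>>> mu, L, n = 0.1, 1., 2
>>> alpha = 1 / (2 * L)
>>> theoretical = (1 - alpha * mu) ** n  # factor of [2, Theorem 4]
>>> round(theoretical, 4)  # matches the 0.9025 reported in the example below
0.9025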

References: This method was first introduced in [1, Section 2], and the convergence upper bound was proven in [2, Theorem 4].

[1] B.T. Polyak (1964). Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics.

[2] E. Ghadimi, H. R. Feyzmahdavian, M. Johansson (2015). Global convergence of the Heavy-ball method for convex optimization. European Control Conference (ECC).

Parameters:
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • alpha (float) – parameter of the scheme.

  • beta (float) – parameter of the scheme such that \(0<\beta<1\) and \(0<\alpha<2(1+\beta)\).

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> from math import sqrt
>>> mu = 0.1
>>> L = 1.
>>> alpha = 1 / (2 * L)  # alpha \in (0, 1 / L]
>>> beta = sqrt((1 - alpha * mu) * (1 - L * alpha))
>>> pepit_tau, theoretical_tau = wc_heavy_ball_momentum(mu=mu, L=L, alpha=alpha, beta=beta, n=2, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 5x5
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 12 scalar constraint(s) ...
                        Function 1 : 12 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.7534930184723507
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 8.447704542447025e-09
                All the primal scalar constraints are verified up to an error of 2.3525640133886805e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.1216875770062036e-07
(PEPit) Final upper bound (dual): 0.7534930169804492 and lower bound (primal example): 0.7534930184723507
(PEPit) Duality gap: absolute: -1.4919014912351258e-09 and relative: -1.979980510316926e-09
*** Example file: worst-case performance of the Heavy-Ball method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.753493 (f(x_0) - f(x_*))
        Theoretical guarantee:   f(x_n)-f_* <= 0.9025 (f(x_0) - f(x_*))

1.7. Accelerated gradient for convex objective

PEPit.examples.unconstrained_convex_minimization.wc_accelerated_gradient_convex(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex (\(\mu\) is possibly 0).

This code computes a worst-case guarantee for an accelerated gradient method, a.k.a. fast gradient method. That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the accelerated gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: The accelerated gradient method of this example is provided by

\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \\ y_{t+1} & = & x_{t+1} + \frac{t-1}{t+2} (x_{t+1} - x_t). \end{eqnarray}

Theoretical guarantee: When \(\mu=0\), an empirically tight guarantee can be found in [1, Table 1]:

\[f(x_n)-f_\star \leqslant \frac{2L\|x_0-x_\star\|^2}{n^2 + 5 n + 6},\]

where tightness is obtained on some Huber loss functions.
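As a quick check (not part of the PEPit API), the factor \(\frac{2L}{n^2+5n+6}\) can be evaluated for the parameters of the example further down (\(L=1\), \(n=1\)).

>>> L, n = 1, 1
>>> theoretical = 2 * L / (n ** 2 + 5 * n + 6)  # factor of [1, Table 1]
>>> round(theoretical, 6)  # matches the 0.166667 reported in the example below
0.166667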

References:

[1] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283–1313.

Parameters:
  • mu (float) – the strong convexity parameter

  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_convex(mu=0, L=1, n=1, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 4x4
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 6 scalar constraint(s) ...
                        Function 1 : 6 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.16666666115099577
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 4.820889929895971e-09
                All the primal scalar constraints are verified up to an error of 3.620050065267222e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.1009588480346295e-08
(PEPit) Final upper bound (dual): 0.16666666498584737 and lower bound (primal example): 0.16666666115099577
(PEPit) Duality gap: absolute: 3.834851602935174e-09 and relative: 2.300911037907513e-08
*** Example file: worst-case performance of accelerated gradient method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2

1.8. Accelerated gradient for strongly convex objective

PEPit.examples.unconstrained_convex_minimization.wc_accelerated_gradient_strongly_convex(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for an accelerated gradient method, a.k.a. fast gradient method. That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \left(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2\right),\]

is valid, where \(x_n\) is the output of the accelerated gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: For \(t \in \{0, \dots, n-1\}\),

\begin{eqnarray} y_t & = & x_t + \frac{\sqrt{L} - \sqrt{\mu}}{\sqrt{L} + \sqrt{\mu}}(x_t - x_{t-1}) \\ x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \end{eqnarray}

with \(x_{-1}:= x_0\).

Theoretical guarantee:

The following upper bound can be found in [1, Corollary 4.15]:

\[f(x_n)-f_\star \leqslant \left(1 - \sqrt{\frac{\mu}{L}}\right)^n \left(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2\right).\]
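As a quick check (not part of the PEPit API), the factor \(\left(1 - \sqrt{\mu/L}\right)^n\) can be evaluated for the parameters of the example further down (\(\mu=0.1\), \(L=1\), \(n=2\)).

>>> from math import sqrt
>>> mu, L, n = 0.1, 1, 2
>>> theoretical = (1 - sqrt(mu / L)) ** n  # factor of [1, Corollary 4.15]
>>> round(theoretical, 6)  # matches the 0.467544 reported in the example below
0.467544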

References:

[1] A. d’Aspremont, D. Scieur, A. Taylor (2021). Acceleration Methods. Foundations and Trends in Optimization: Vol. 5, No. 1-2.

Parameters:
  • mu (float) – the strong convexity parameter

  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_strongly_convex(mu=0.1, L=1, n=2, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 5x5
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 12 scalar constraint(s) ...
                        Function 1 : 12 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.34760222631660587
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 1.9078112028836573e-09
                All the primal scalar constraints are verified up to an error of 1.3491461073322775e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.191295818299753e-08
(PEPit) Final upper bound (dual): 0.3476022268314509 and lower bound (primal example): 0.34760222631660587
(PEPit) Duality gap: absolute: 5.148450554770534e-10 and relative: 1.4811327905826417e-09
*** Example file: worst-case performance of the accelerated gradient method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.347602 (f(x_0) - f(x_*) + mu/2*||x_0 - x_*||**2)
        Theoretical guarantee:   f(x_n)-f_* <= 0.467544 (f(x_0) - f(x_*) + mu/2*||x_0 - x_*||**2)

1.9. Optimized gradient

PEPit.examples.unconstrained_convex_minimization.wc_optimized_gradient(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for optimized gradient method (OGM). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of OGM and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: The optimized gradient method is described by

\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t)\\ y_{t+1} & = & x_{t+1} + \frac{\theta_{t}-1}{\theta_{t+1}}(x_{t+1}-x_t)+\frac{\theta_{t}}{\theta_{t+1}}(x_{t+1}-y_t), \end{eqnarray}

with

\begin{eqnarray} \theta_0 & = & 1 \\ \theta_t & = & \frac{1 + \sqrt{4 \theta_{t-1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \theta_n & = & \frac{1 + \sqrt{8 \theta_{n-1}^2 + 1}}{2}. \end{eqnarray}

Theoretical guarantee: The tight theoretical guarantee can be found in [2, Theorem 2]:

\[f(x_n)-f_\star \leqslant \frac{L\|x_0-x_\star\|^2}{2\theta_n^2},\]

where tightness follows from [3, Theorem 3].
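The sequence \((\theta_t)_t\) is easy to evaluate numerically. The sketch below (the helper name is ours, not part of the PEPit API) transcribes the recursion above and evaluates \(\frac{L}{2\theta_n^2}\) for the parameters of the example further down.

>>> from math import sqrt
>>> def ogm_theoretical_tau(L, n):  # hypothetical helper, simple transcription of the recursion above
...     theta = 1.  # theta_0
...     for _ in range(1, n):
...         theta = (1 + sqrt(4 * theta ** 2 + 1)) / 2  # theta_1, ..., theta_{n-1}
...     theta_n = (1 + sqrt(8 * theta ** 2 + 1)) / 2
...     return L / (2 * theta_n ** 2)
>>> round(ogm_theoretical_tau(L=3, n=4), 7)  # matches the 0.0767518 reported in the example below
0.0767518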

References: The optimized gradient method was developed in [1, 2]; the corresponding lower bound was first obtained in [3].

[1] Y. Drori, M. Teboulle (2014). Performance of first-order methods for smooth convex minimization: a novel approach. Mathematical Programming 145(1–2), 451–482.

[2] D. Kim, J. Fessler (2016). Optimized first-order methods for smooth convex minimization. Mathematical Programming 159.1-2: 81-107.

[3] Y. Drori (2017). The exact information-based complexity of smooth convex minimization. Journal of Complexity, 39, 1-16.

Parameters:
  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_optimized_gradient(L=3, n=4, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 7x7
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 30 scalar constraint(s) ...
                        Function 1 : 30 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.0767518265733206
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 4.2100544930481964e-09
                All the primal scalar constraints are verified up to an error of 4.178030258567e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 2.3578267940913163e-09
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.653093053290753e-08
(PEPit) Final upper bound (dual): 0.0767518302587488 and lower bound (primal example): 0.0767518265733206
(PEPit) Duality gap: absolute: 3.6854281987297455e-09 and relative: 4.801746568479483e-08
*** Example file: worst-case performance of optimized gradient method ***
        PEPit guarantee:         f(y_n)-f_* <= 0.0767518 ||x_0 - x_*||^2
        Theoretical guarantee:   f(y_n)-f_* <= 0.0767518 ||x_0 - x_*||^2

1.10. Optimized gradient for gradient

PEPit.examples.unconstrained_convex_minimization.wc_optimized_gradient_for_gradient(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for optimized gradient method for gradient (OGM-G). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[\|\nabla f(x_n)\|^2 \leqslant \tau(n, L) (f(x_0) - f_\star)\]

is valid, where \(x_n\) is the output of OGM-G and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(\|\nabla f(x_n)\|^2\) when \(f(x_0)-f_\star \leqslant 1\).

Algorithm: For \(t\in\{0,1,\ldots,n-1\}\), the optimized gradient method for gradient [1, Section 6.3] is described by

\begin{eqnarray} y_{t+1} & = & x_t - \frac{1}{L} \nabla f(x_t),\\ x_{t+1} & = & y_{t+1} + \frac{(\tilde{\theta}_t-1)(2\tilde{\theta}_{t+1}-1)}{\tilde{\theta}_t(2\tilde{\theta}_t-1)}(y_{t+1}-y_t)+\frac{2\tilde{\theta}_{t+1}-1}{2\tilde{\theta}_t-1}(y_{t+1}-x_t), \end{eqnarray}

with

\begin{eqnarray} \tilde{\theta}_n & = & 1 \\ \tilde{\theta}_t & = & \frac{1 + \sqrt{4 \tilde{\theta}_{t+1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \tilde{\theta}_0 & = & \frac{1 + \sqrt{8 \tilde{\theta}_{1}^2 + 1}}{2}. \end{eqnarray}

Theoretical guarantee: The tight worst-case guarantee can be found in [1, Theorem 6.1]:

\[\|\nabla f(x_n)\|^2 \leqslant \frac{2L(f(x_0)-f_\star)}{\tilde{\theta}_0^2},\]

where tightness is achieved on Huber losses, see [1, Section 6.4].
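The backward sequence \((\tilde{\theta}_t)_t\) is easy to evaluate numerically. The sketch below (the helper name is ours, not part of the PEPit API) transcribes the recursion above and evaluates \(\frac{2L}{\tilde{\theta}_0^2}\) for the parameters of the example further down.

>>> from math import sqrt
>>> def ogmg_theoretical_tau(L, n):  # hypothetical helper, simple transcription of the recursion above
...     theta = 1.  # tilde_theta_n
...     for _ in range(n - 1, 0, -1):
...         theta = (1 + sqrt(4 * theta ** 2 + 1)) / 2  # tilde_theta_{n-1}, ..., tilde_theta_1
...     theta_0 = (1 + sqrt(8 * theta ** 2 + 1)) / 2
...     return 2 * L / theta_0 ** 2
>>> round(ogmg_theoretical_tau(L=3, n=4), 6)  # matches the 0.307007 reported in the example below
0.307007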

References: The optimized gradient method for gradient was developed in [1].

[1] D. Kim, J. Fessler (2021). Optimizing the efficiency of first-order methods for decreasing the gradient of smooth convex functions. Journal of optimization theory and applications, 188(1), 192-219.

Parameters:
  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_optimized_gradient_for_gradient(L=3, n=4, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 7x7
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 30 scalar constraint(s) ...
                        Function 1 : 30 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.307007304609862
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite
                All the primal scalar constraints are verified up to an error of 1.851107461758872e-11
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 1.897293067292418e-11
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.0436663734975135e-10
(PEPit) Final upper bound (dual): 0.30700730461344283 and lower bound (primal example): 0.307007304609862
(PEPit) Duality gap: absolute: 3.5808578324747486e-12 and relative: 1.1663754505858493e-11
*** Example file: worst-case performance of optimized gradient method for gradient ***
        PEP-it guarantee:        ||f'(x_n)||^2 <= 0.307007 (f(x_0) - f_*)
        Theoretical guarantee:   ||f'(x_n)||^2 <= 0.307007 (f(x_0) - f_*)

1.11. Robust momentum

PEPit.examples.unconstrained_convex_minimization.wc_robust_momentum(mu, L, lam, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly-convex.

This code computes a worst-case guarantee for the robust momentum method (RMM). That is, it computes the smallest possible \(\tau(n, \mu, L, \lambda)\) such that the guarantee

\[v(x_{n+1}) \leqslant \tau(n, \mu, L, \lambda) v(x_{n}),\]

is valid, where \(x_n\) is the \(n^{\mathrm{th}}\) iterate of the RMM, and \(x_\star\) is a minimizer of \(f\). The function \(v(\cdot)\) is a well-chosen Lyapunov function, defined as follows:

\begin{eqnarray} v(x_t) & = & l\|z_t - x_\star\|^2 + q_t, \\ q_t & = & (L - \mu) \left(f(x_t) - f_\star - \frac{\mu}{2}\|y_t - x_\star\|^2 - \frac{1}{2}\|\nabla f(y_t) - \mu (y_t - x_\star)\|^2 \right), \end{eqnarray}

with \(\kappa = \frac{L}{\mu}\), \(\rho = \lambda (1 - \frac{1}{\kappa}) + (1 - \lambda) \left(1 - \frac{1}{\sqrt{\kappa}}\right)\), and \(l = \mu^2 \frac{\kappa - \kappa \rho^2 - 1}{2 \rho (1 - \rho)}\).

Algorithm:

For \(t \in \{0, \dots, n-1\}\),

\begin{eqnarray} x_{t+1} & = & x_{t} + \beta (x_t - x_{t-1}) - \alpha \nabla f(y_t), \\ y_{t+1} & = & y_{t} + \gamma (x_t - x_{t-1}), \end{eqnarray}

with \(x_{-1}, x_0 \in \mathbb{R}^d\), and with parameters \(\alpha = \frac{\kappa (1 - \rho^2)(1 + \rho)}{L}\), \(\beta = \frac{\kappa \rho^3}{\kappa - 1}\), \(\gamma = \frac{\rho^2}{(\kappa - 1)(1 - \rho)^2(1 + \rho)}\).

Theoretical guarantee:

A convergence guarantee (empirically tight) is obtained in [1, Theorem 1],

\[v(x_{n+1}) \leqslant \rho^2 v(x_n),\]

with \(\rho = \lambda (1 - \frac{1}{\kappa}) + (1 - \lambda) \left(1 - \frac{1}{\sqrt{\kappa}}\right)\).
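As a quick check (not part of the PEPit API), \(\rho^2\) can be evaluated for the parameters of the example further down (\(\mu=0.1\), \(L=1\), \(\lambda=0.2\), with \(\kappa = L/\mu\)).

>>> from math import sqrt
>>> mu, L, lam = 0.1, 1, 0.2
>>> kappa = L / mu
>>> rho = lam * (1 - 1 / kappa) + (1 - lam) * (1 - 1 / sqrt(kappa))
>>> round(rho ** 2, 6)  # matches the 0.528555 reported in the example below
0.528555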

References:

[1] S. Cyrus, B. Hu, B. Van Scoy, L. Lessard (2018). A robust accelerated optimization algorithm for strongly convex functions. American Control Conference (ACC).

Parameters:
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • lam (float) – scheme parameter; \(\lambda=1\) corresponds to gradient descent, and \(\lambda=0\) to the Triple Momentum Method.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_robust_momentum(mu=0.1, L=1, lam=0.2, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 5x5
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 6 scalar constraint(s) ...
                        Function 1 : 6 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.5285548454743232
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite
                All the primal scalar constraints are verified
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 6.797521527290181e-08
(PEPit) Final upper bound (dual): 0.5285548474610776 and lower bound (primal example): 0.5285548454743232
(PEPit) Duality gap: absolute: 1.9867544276408466e-09 and relative: 3.758842520605294e-09
*** Example file: worst-case performance of the Robust Momentum Method ***
        PEPit guarantee:         v(x_(n+1)) <= 0.528555 v(x_n)
        Theoretical guarantee:   v(x_(n+1)) <= 0.528555 v(x_n)

1.12. Triple momentum

PEPit.examples.unconstrained_convex_minimization.wc_triple_momentum(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for triple momentum method (TMM). That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the TMM, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm:

For \(t \in \{ 1, \dots, n\}\)

\begin{eqnarray} \xi_{t+1} & = & (1 + \beta) \xi_{t} - \beta \xi_{t-1} - \alpha \nabla f(y_t) \\ y_{t} & = & (1+\gamma ) \xi_{t} -\gamma \xi_{t-1} \\ x_{t} & = & (1 + \delta) \xi_{t} - \delta \xi_{t-1} \end{eqnarray}

with

\begin{eqnarray} \kappa & = & \frac{L}{\mu}, \quad \rho = 1 - \frac{1}{\sqrt{\kappa}} \\ (\alpha, \beta, \gamma, \delta) & = & \left(\frac{1+\rho}{L}, \frac{\rho^2}{2-\rho}, \frac{\rho^2}{(1+\rho)(2-\rho)}, \frac{\rho^2}{1-\rho^2}\right) \end{eqnarray}

and

\begin{eqnarray} \xi_{0} = x_0 \\ \xi_{1} = x_0 \\ y = x_0 \end{eqnarray}
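The parameters \((\alpha, \beta, \gamma, \delta)\) depend only on \(\mu\) and \(L\). The lines below (an illustrative sketch, not part of the PEPit API) simply transcribe the formulas above for the values used in the example further down.

>>> from math import sqrt
>>> mu, L = 0.1, 1.
>>> kappa = L / mu
>>> rho = 1 - 1 / sqrt(kappa)
>>> alpha = (1 + rho) / L
>>> beta = rho ** 2 / (2 - rho)
>>> gamma = rho ** 2 / ((1 + rho) * (2 - rho))
>>> delta = rho ** 2 / (1 - rho ** 2)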

Theoretical guarantee: A theoretical upper (empirically tight) bound can be found in [1, Theorem 1, eq. 4]:

\[f(x_n)-f_\star \leqslant \frac{\rho^{2(n+1)} L \kappa}{2}\|x_0 - x_\star\|^2.\]

References: The triple momentum method was discovered and analyzed in [1].

[1] Van Scoy, B., Freeman, R. A., Lynch, K. M. (2018). The fastest known globally convergent first-order method for minimizing strongly convex functions. IEEE Control Systems Letters, 2(1), 49-54.

Parameters:
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_triple_momentum(mu=0.1, L=1., n=4, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 7x7
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 30 scalar constraint(s) ...
                        Function 1 : 30 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.23892507617696113
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 1.2421790162612716e-08
                All the primal scalar constraints are verified up to an error of 2.3083153937765444e-08
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 2.128560722969591e-09
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 9.890478893964213e-08
(PEPit) Final upper bound (dual): 0.23892508270020568 and lower bound (primal example): 0.23892507617696113
(PEPit) Duality gap: absolute: 6.523244555634022e-09 and relative: 2.7302469292936613e-08
*** Example file: worst-case performance of the Triple Momentum Method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.238925 ||x_0-x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.238925 ||x_0-x_*||^2

1.13. Information theoretic exact method

PEPit.examples.unconstrained_convex_minimization.wc_information_theoretic(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex (\(\mu\) is possibly 0).

This code computes a worst-case guarantee for the information theoretic exact method (ITEM). That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[\|z_n - x_\star\|^2 \leqslant \tau(n, L, \mu) \|z_0 - x_\star\|^2\]

is valid, where \(z_n\) is the output of the ITEM, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(\|z_n - x_\star\|^2\) when \(\|z_0 - x_\star\|^2 \leqslant 1\).

Algorithm: For \(t\in\{0,1,\ldots,n-1\}\), the information theoretic exact method of this example is provided by

\begin{eqnarray} y_{t} & = & (1-\beta_t) z_t + \beta_t x_t \\ x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \\ z_{t+1} & = & \left(1-q\delta_t\right) z_t+q\delta_t y_t-\frac{\delta_t}{L}\nabla f(y_t), \end{eqnarray}

with \(y_{-1}=x_0=z_0\), \(q=\frac{\mu}{L}\) (inverse condition ratio), and the scalar sequences:

\begin{eqnarray} A_{t+1} & = & \frac{(1+q)A_t+2\left(1+\sqrt{(1+A_t)(1+qA_t)}\right)}{(1-q)^2},\\ \beta_{t+1} & = & \frac{A_t}{(1-q)A_{t+1}},\\ \delta_{t+1} & = & \frac{1}{2}\frac{(1-q)^2A_{t+1}-(1+q)A_t}{1+q+q A_t}, \end{eqnarray}

with \(A_0=0\).

Theoretical guarantee: A tight worst-case guarantee can be found in [1, Theorem 3]:

\[\|z_n - x_\star\|^2 \leqslant \frac{1}{1+q A_n} \|z_0-x_\star\|^2,\]

where tightness is obtained on some quadratic loss functions (see [1, Lemma 2]).
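The sequence \((A_t)_t\) is easy to evaluate numerically. The sketch below (the helper name is ours, not part of the PEPit API) transcribes the recursion above and evaluates the factor \(\frac{1}{1+qA_n}\) for the parameters of the example further down.

>>> from math import sqrt
>>> def item_theoretical_tau(mu, L, n):  # hypothetical helper, simple transcription of the recursion above
...     q = mu / L
...     A = 0.  # A_0
...     for _ in range(n):
...         A = ((1 + q) * A + 2 * (1 + sqrt((1 + A) * (1 + q * A)))) / (1 - q) ** 2
...     return 1 / (1 + q * A)
>>> round(item_theoretical_tau(mu=.001, L=1, n=15), 4)  # consistent with the 0.756605 reported below
0.7566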

References:

[1] A. Taylor, Y. Drori (2022). An optimal gradient method for smooth strongly convex minimization. Mathematical Programming.

Parameters:
  • mu (float) – the strong convexity parameter.

  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_information_theoretic(mu=.001, L=1, n=15, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 17x17
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 240 scalar constraint(s) ...
                        Function 1 : 240 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.7566088333863785
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite
                All the primal scalar constraints are verified up to an error of 2.6490565304427935e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 5.0900580550698854e-05
(PEPit) Final upper bound (dual): 0.7566088219545419 and lower bound (primal example): 0.7566088333863785
(PEPit) Duality gap: absolute: -1.1431836588471356e-08 and relative: -1.5109308911059783e-08
*** Example file: worst-case performance of the information theoretic exact method ***
        PEP-it guarantee:        ||z_n - x_* ||^2 <= 0.756609 ||z_0 - x_*||^2
        Theoretical guarantee:   ||z_n - x_* ||^2 <= 0.756605 ||z_0 - x_*||^2

1.14. Cyclic coordinate descent

PEPit.examples.unconstrained_convex_minimization.wc_cyclic_coordinate_descent(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth by blocks (with \(d\) blocks) and convex.

This code computes a worst-case guarantee for cyclic coordinate descent with fixed step-sizes \(1/L_i\). That is, it computes the smallest possible \(\tau(n, d, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, d, L) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of cyclic coordinate descent with fixed step-sizes \(1/L_i\), and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\), \(L\), and \(d\), \(\tau(n, d, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: Cyclic coordinate descent is described by

\[x_{t+1} = x_t - \frac{1}{L_{i_t}} \nabla_{i_t} f(x_t),\]

where \(L_{i_t}\) is the Lipschitz constant of the block \(i_t\), and where \(i_t\) follows a prescribed ordering.
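As an illustration of the update rule (an illustrative sketch, not part of the PEPit API), the lines below run cyclic coordinate steps with step-sizes \(1/L_i\) on the separable quadratic \(f(x) = \sum_i \frac{L_i}{2} x_i^2\), whose minimizer is \(x_\star = 0\); each block is a single coordinate and the blocks are visited in the fixed order \(0, 1, 2, 0, 1, 2, \ldots\)

>>> L = [1., 2., 10.]                       # block-wise smoothness constants L_i
>>> x = [2., -1., 0.5]                      # starting point
>>> for t in range(9):                      # three full passes over the d = 3 blocks
...     i = t % 3                           # prescribed cyclic ordering: i_t = t mod d
...     x[i] = x[i] - (L[i] * x[i]) / L[i]  # coordinate step; here nabla_i f(x) = L_i x_i
>>> x
[0.0, 0.0, 0.0]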

References:

[1] Z. Shi, R. Liu (2016). Better worst-case complexity analysis of the block coordinate descent method for large scale machine learning. In 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA).

Parameters:
  • L (list) – list of floats, smoothness parameters (for each block).

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (None) – no theoretical value is provided for this example.

Example

>>> L = [1., 2., 10.]
>>> pepit_tau, theoretical_tau = wc_cyclic_coordinate_descent(L=L, n=9, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 34x34
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 330 scalar constraint(s) ...
                        Function 1 : 330 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Setting up the problem: 1 partition(s) added
                        Partition 1 with 3 blocks: Adding 363 scalar constraint(s)...
                        Partition 1 with 3 blocks: 363 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 1.4892758367502887
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 7.745171522956692e-09
                All the primal scalar constraints are verified up to an error of 7.4478840872416185e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 7.835133702255824e-09
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.797651214794727e-07
(PEPit) Final upper bound (dual): 1.4892758368167314 and lower bound (primal example): 1.4892758367502887
(PEPit) Duality gap: absolute: 6.644262917632204e-11 and relative: 4.461405169998919e-11
*** Example file: worst-case performance of cyclic coordinate descent with fixed step-sizes ***
        PEPit guarantee:         f(x_n)-f_* <= 1.48928 ||x_0 - x_*||^2

1.15. Proximal point

PEPit.examples.unconstrained_convex_minimization.wc_proximal_point(gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is closed, proper, and convex (and potentially non-smooth).

This code computes a worst-case guarantee for the proximal point method with step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n,\gamma)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, \gamma) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the proximal point method, and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\) and \(\gamma\), \(\tau(n,\gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm:

The proximal point method is described by

\[x_{t+1} = \arg\min_x \left\{f(x)+\frac{1}{2\gamma}\|x-x_t\|^2 \right\},\]

where \(\gamma\) is a step-size.

Theoretical guarantee:

The tight theoretical guarantee can be found in [1, Theorem 4.1]:

\[f(x_n)-f_\star \leqslant \frac{\|x_0-x_\star\|^2}{4\gamma n},\]

where tightness is obtained on, e.g., one-dimensional linear problems on the positive orthant.
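As a quick check (not part of the PEPit API), the factor \(\frac{1}{4\gamma n}\) can be evaluated for the parameters of the example further down (\(\gamma=3\), \(n=4\)).

>>> gamma, n = 3, 4
>>> theoretical = 1 / (4 * gamma * n)  # factor of [1, Theorem 4.1]
>>> round(theoretical, 7)  # matches the 0.0208333 reported in the example below
0.0208333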

References:

[1] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283–1313.

Parameters:
  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_proximal_point(gamma=3, n=4, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 6x6
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 20 scalar constraint(s) ...
                        Function 1 : 20 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.020833335685730252
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 3.626659005644299e-09
                All the primal scalar constraints are verified up to an error of 1.1386158081487519e-08
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.0464297827437203e-08
(PEPit) Final upper bound (dual): 0.020833337068527292 and lower bound (primal example): 0.020833335685730252
(PEPit) Duality gap: absolute: 1.382797040067052e-09 and relative: 6.637425042856655e-08
*** Example file: worst-case performance of proximal point method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.0208333 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.0208333 ||x_0 - x_*||^2

1.16. Accelerated proximal point

PEPit.examples.unconstrained_convex_minimization.wc_accelerated_proximal_point(A0, gammas, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is convex and possibly non-smooth.

This code computes a worst-case guarantee for an accelerated proximal point method, also known as the fast proximal point method (FPP). That is, it computes the smallest possible \(\tau(n, A_0,\vec{\gamma})\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, A_0, \vec{\gamma}) \left(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2\right)\]

is valid, where \(x_n\) is the output of FPP (with step-size \(\gamma_t\) at step \(t\in \{0, \dots, n-1\}\)), where \(x_\star\) is a minimizer of \(f\), and where \(A_0\) is a positive number.

In short, for given values of \(n\), \(A_0\) and \(\vec{\gamma}\), \(\tau(n, A_0, \vec{\gamma})\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2 \leqslant 1\), for the following method.

Algorithm: For \(t\in \{0, \dots, n-1\}\):

\begin{eqnarray} y_{t+1} & = & (1-\alpha_{t} ) x_{t} + \alpha_{t} v_t \\ x_{t+1} & = & \arg\min_x \left\{f(x)+\frac{1}{2\gamma_t}\|x-y_{t+1}\|^2 \right\}, \\ v_{t+1} & = & v_t + \frac{1}{\alpha_{t}} (x_{t+1}-y_{t+1}) \end{eqnarray}

with

\begin{eqnarray} \alpha_{t} & = & \frac{\sqrt{(A_t \gamma_t)^2 + 4 A_t \gamma_t} - A_t \gamma_t}{2} \\ A_{t+1} & = & (1 - \alpha_{t}) A_t \end{eqnarray}

and \(v_0=x_0\).
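
The coefficient recursion above can be computed independently of the prox steps. Below is a small illustrative sketch (not part of PEPit; the helper names are hypothetical) of the \((\alpha_t, A_t)\) sequences together with the bound of [1, Theorem 2.3]; for the parameters of the example further below, it reproduces the printed theoretical value (about 0.0511881).

    import math

    def fpp_coefficients(A0, gammas):
        """Sketch: the (alpha_t, A_t) sequences of the fast proximal point method."""
        A, alphas, As = A0, [], [A0]
        for gamma in gammas:
            alpha = (math.sqrt((A * gamma) ** 2 + 4 * A * gamma) - A * gamma) / 2
            A = (1 - alpha) * A
            alphas.append(alpha)
            As.append(A)
        return alphas, As

    def fpp_theoretical_bound(A0, gammas):
        """Upper bound of [1, Theorem 2.3]: 4 / (A0 * (sum_t sqrt(gamma_t))^2)."""
        return 4 / (A0 * sum(math.sqrt(g) for g in gammas) ** 2)

    print(fpp_theoretical_bound(A0=5, gammas=[(i + 1) / 1.1 for i in range(3)]))  # ~0.0511881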

Theoretical guarantee: A theoretical upper bound can be found in [1, Theorem 2.3.]:

\[f(x_n)-f_\star \leqslant \frac{4}{A_0 (\sum_{t=0}^{n-1} \sqrt{\gamma_t})^2}\left(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2 \right).\]

References: The accelerated proximal point method was first obtained and analyzed in [1].

[1] O. Güler (1992). New proximal point algorithms for convex minimization. SIAM Journal on Optimization, 2(4):649–664.

Parameters:
  • A0 (float) – initial value for parameter A_0.

  • gammas (list) – sequence of step-sizes.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_proximal_point(A0=5, gammas=[(i + 1) / 1.1 for i in range(3)], n=3, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 6x6
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 20 scalar constraint(s) ...
                        Function 1 : 20 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.015931148923290624
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 3.713119626105772e-10
                All the primal scalar constraints are verified up to an error of 1.4460231649235378e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 2.0490523713620816e-10
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.9451787884841313e-09
(PEPit) Final upper bound (dual): 0.015931149263944334 and lower bound (primal example): 0.015931148923290624
(PEPit) Duality gap: absolute: 3.4065371010139067e-10 and relative: 2.138287148915985e-08
*** Example file: worst-case performance of fast proximal point method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.0159311 (f(x_0) - f_* + A/2* ||x_0 - x_*||^2)
        Theoretical guarantee:   f(x_n)-f_* <= 0.0511881 (f(x_0) - f_* + A/2* ||x_0 - x_*||^2)

1.17. Inexact gradient descent

PEPit.examples.unconstrained_convex_minimization.wc_inexact_gradient_descent(L, mu, epsilon, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for the inexact gradient method. That is, it computes the smallest possible \(\tau(n, L, \mu, \varepsilon)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu, \varepsilon) (f(x_0) - f_\star)\]

is valid, where \(x_n\) is the output of the inexact gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\), \(\mu\) and \(\varepsilon\), \(\tau(n, L, \mu, \varepsilon)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star \leqslant 1\).

Algorithm:

\[x_{t+1} = x_t - \gamma d_t\]

with

\[\|d_t - \nabla f(x_t)\| \leqslant \varepsilon \|\nabla f(x_t)\|\]

and

\[\gamma = \frac{2}{L_{\varepsilon} + \mu_{\varepsilon}}\]

where \(L_{\varepsilon} = (1 + \varepsilon) L\) and \(\mu_{\varepsilon} = (1 - \varepsilon) \mu\).
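
As a rough illustration (not PEPit code), one admissible inexact oracle simply perturbs the true gradient by a vector of norm at most \(\varepsilon \|\nabla f(x_t)\|\). In the hypothetical sketch below, grad_f is assumed to be a Python callable returning the exact gradient.

    import numpy as np

    def inexact_gradient_step(x, grad_f, L, mu, epsilon, rng=None):
        """Sketch of one step of the inexact gradient method described above."""
        rng = np.random.default_rng(0) if rng is None else rng
        x = np.asarray(x, dtype=float)
        L_eps, mu_eps = (1 + epsilon) * L, (1 - epsilon) * mu
        gamma = 2 / (L_eps + mu_eps)
        g = grad_f(x)
        # any d with ||d - g|| <= epsilon * ||g|| is admissible; here, a random such direction
        e = rng.standard_normal(x.shape)
        e *= epsilon * np.linalg.norm(g) / max(np.linalg.norm(e), 1e-16)
        return x - gamma * (g + e)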

Theoretical guarantee:

The tight worst-case guarantee obtained in [1, Theorem 5.3] or [2, Remark 1.6] is

\[f(x_n) - f_\star \leqslant \left(\frac{L_{\varepsilon}-\mu_{\varepsilon}}{L_{\varepsilon}+\mu_{\varepsilon}}\right)^{2n}(f(x_0) - f_\star),\]

where tightness is achieved on simple quadratic functions.

References: The detailed analyses can be found in [1, 2].

[1] E. De Klerk, F. Glineur, A. Taylor (2020). Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation. SIAM Journal on Optimization, 30(3), 2053-2082.

[2] O. Gannot (2021). A frequency-domain analysis of inexact gradient methods. Mathematical Programming.

Parameters:
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • epsilon (float) – level of inaccuracy.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_inexact_gradient_descent(L=1, mu=.1, epsilon=.1, n=2, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 7x7
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 12 scalar constraint(s) ...
                        Function 1 : 12 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 1 function(s)
                        Function 1 : Adding 2 scalar constraint(s) ...
                        Function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.5189167048760179
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 3.328901122905122e-09
                All the primal scalar constraints are verified up to an error of 9.223752428511034e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.0409575365469605e-07
(PEPit) Final upper bound (dual): 0.5189166992915334 and lower bound (primal example): 0.5189167048760179
(PEPit) Duality gap: absolute: -5.584484541465429e-09 and relative: -1.0761813001953176e-08
*** Example file: worst-case performance of inexact gradient method in distance in function values ***
        PEPit guarantee:         f(x_n)-f_* <= 0.518917 (f(x_0)-f_*)
        Theoretical guarantee:   f(x_n)-f_* <= 0.518917 (f(x_0)-f_*)

1.19. Inexact accelerated gradient

PEPit.examples.unconstrained_convex_minimization.wc_inexact_accelerated_gradient(L, epsilon, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for an accelerated gradient method using inexact first-order information. That is, it computes the smallest possible \(\tau(n, L, \varepsilon)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \varepsilon) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of inexact accelerated gradient descent and where \(x_\star\) is a minimizer of \(f\).

The inexact descent direction is assumed to satisfy a relative inaccuracy described by (with \(0\leqslant \varepsilon \leqslant 1\))

\[\|\nabla f(y_t) - d_t\| \leqslant \varepsilon \|\nabla f(y_t)\|,\]

where \(\nabla f(y_t)\) is the true gradient at \(y_t\) and \(d_t\) is the approximate descent direction that is used.

Algorithm: The inexact accelerated gradient method of this example is provided by

\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} d_t\\ y_{t+1} & = & x_{t+1} + \frac{t-1}{t+2} (x_{t+1} - x_t). \end{eqnarray}

Theoretical guarantee: When \(\varepsilon=0\), a tight empirical guarantee can be found in [1, Table 1]:

\[f(x_n)-f_\star \leqslant \frac{2L\|x_0-x_\star\|^2}{n^2 + 5 n + 6},\]

which is achieved on some Huber loss functions (when \(\varepsilon=0\)).
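
For reference, the hypothetical helper below (not part of PEPit) evaluates this \(\varepsilon=0\) bound; with L=1 and n=5 it reproduces the theoretical value printed in the example below (about 0.0357143).

    def accelerated_gradient_exact_bound(L, n):
        """Worst-case bound of [1, Table 1] for the exact method (epsilon = 0)."""
        return 2 * L / (n ** 2 + 5 * n + 6)

    print(accelerated_gradient_exact_bound(L=1, n=5))  # ~0.0357143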

References:

[1] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283–1313.

Parameters:
  • L (float) – smoothness parameter.

  • epsilon (float) – level of inaccuracy

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_inexact_accelerated_gradient(L=1, epsilon=0.1, n=5, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 13x13
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 42 scalar constraint(s) ...
                        Function 1 : 42 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 1 function(s)
                        Function 1 : Adding 5 scalar constraint(s) ...
                        Function 1 : 5 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.039388047868526704
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 9.664109261839382e-09
                All the primal scalar constraints are verified up to an error of 2.7598597017106097e-08
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.4089929790920282e-07
(PEPit) Final upper bound (dual): 0.03938804252543366 and lower bound (primal example): 0.039388047868526704
(PEPit) Duality gap: absolute: -5.3430930443965075e-09 and relative: -1.3565264930699812e-07
*** Example file: worst-case performance of inexact accelerated gradient method ***
        PEPit guarantee:                         f(x_n)-f_* <= 0.039388 (f(x_0)-f_*)
        Theoretical guarantee for epsilon = 0 :  f(x_n)-f_* <= 0.0357143 (f(x_0)-f_*)

1.20. Epsilon-subgradient method

PEPit.examples.unconstrained_convex_minimization.wc_epsilon_subgradient_method(M, n, gamma, eps, R, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is closed, proper, and convex (and possibly non-smooth).

This code computes a worst-case guarantee for the \(\varepsilon\)-subgradient method. That is, it computes the smallest possible \(\tau(n, M, \gamma, \varepsilon, R)\) such that the guarantee

\[\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star \leqslant \tau(n, M, \gamma, \varepsilon, R)\]

is valid, where \(x_t\) are the iterates of the \(\varepsilon\)-subgradient method after \(t\leqslant n\) steps, where \(x_\star\) is a minimizer of \(f\), where \(M\) is an upper bound on the norm of all \(\varepsilon\)-subgradients encountered, and when \(\|x_0-x_\star\|\leqslant R\).

In short, for given values of \(M\), of the accuracy \(\varepsilon\), of the step-size \(\gamma\), of the initial distance \(R\), and of the number of iterations \(n\), \(\tau(n, M, \gamma, \varepsilon, R)\) is computed as the worst-case value of \(\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star\).

Algorithm: For \(t\in \{0, \dots, n-1 \}\)

\begin{eqnarray} g_{t} & \in & \partial_{\varepsilon} f(x_t) \\ x_{t+1} & = & x_t - \gamma g_t \end{eqnarray}

Theoretical guarantee: An upper bound is obtained in [1, Lemma 2]:

\[\min_{0 \leqslant t \leqslant n} f(x_t)- f(x_\star) \leqslant \frac{R^2+2(n+1)\gamma\varepsilon+(n+1) \gamma^2 M^2}{2(n+1) \gamma}.\]
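
The bound is easy to evaluate numerically. The hypothetical helper below (not part of PEPit) reproduces, for the parameters of the example that follows, the printed theoretical value (about 1.04491).

    from math import sqrt

    def eps_subgradient_upper_bound(M, n, gamma, eps, R):
        """Upper bound of [1, Lemma 2] on min_t f(x_t) - f_star."""
        return (R ** 2 + 2 * (n + 1) * gamma * eps + (n + 1) * gamma ** 2 * M ** 2) / (2 * (n + 1) * gamma)

    print(eps_subgradient_upper_bound(M=2, n=6, gamma=1 / sqrt(7), eps=.1, R=1))  # ~1.04491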

References:

[1] R.D. Millán, M.P. Machado (2019). Inexact proximal epsilon-subgradient methods for composite convex optimization problems. Journal of Global Optimization 75.4 (2019): 1029-1060.

Parameters:
  • M (float) – the bound on norms of epsilon-subgradients.

  • n (int) – the number of iterations.

  • gamma (float) – step-size.

  • eps (float) – the bound on the value of epsilon (inaccuracy).

  • R (float) – the bound on initial distance to an optimal solution.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> M, n, eps, R = 2, 6, .1, 1
>>> gamma = 1 / sqrt(n + 1)
>>> pepit_tau, theoretical_tau = wc_epsilon_subgradient_method(M=M, n=n, gamma=gamma, eps=eps, R=R, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 21x21
(PEPit) Setting up the problem: performance measure is the minimum of 7 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (14 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 182 scalar constraint(s) ...
                        Function 1 : 182 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 1 function(s)
                        Function 1 : Adding 6 scalar constraint(s) ...
                        Function 1 : 6 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 1.0191560420875132
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite
                All the primal scalar constraints are verified up to an error of 9.992007221626409e-16
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 8.140035658377668e-09
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.4537099231463084e-07
(PEPit) Final upper bound (dual): 1.019156044756485 and lower bound (primal example): 1.0191560420875132
(PEPit) Duality gap: absolute: 2.668971710306778e-09 and relative: 2.6188057570065385e-09
*** Example file: worst-case performance of the epsilon-subgradient method ***
        PEPit guarantee:         min_(0 <= t <= n) f(x_i) - f_* <= 1.01916
        Theoretical guarantee:   min_(0 <= t <= n) f(x_i) - f_* <= 1.04491

1.21. Gradient descent for quadratically upper bounded convex objective

PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_qg_convex(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is quadratically upper bounded (\(\text{QG}^+\) [1]), i.e. \(\forall x, f(x) - f_\star \leqslant \frac{L}{2} \|x-x_\star\|^2\), and convex.

This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \gamma) \| x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\), \(L\), and \(\gamma\), \(\tau(n, L, \gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(||x_0 - x_\star||^2 \leqslant 1\).

Algorithm: Gradient descent is described by

\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]

where \(\gamma\) is a step-size.

Theoretical guarantee: When \(\gamma < \frac{1}{L}\), the tight theoretical guarantee can be found in [1, Theorem 2.2]:

\[f(x_n)-f_\star \leqslant \frac{L}{2}\max\left(\frac{1}{2n L \gamma + 1}, L \gamma\right) \|x_0-x_\star\|^2.\]
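
As a quick numerical check, the hypothetical helper below (not part of PEPit) evaluates this bound; with L=1, gamma=0.2, and n=4 it reproduces the theoretical value printed in the example below (about 0.192308).

    def gd_qg_convex_bound(L, gamma, n):
        """Worst-case bound of [1, Theorem 2.2] for gradient descent on QG+ convex functions."""
        return L / 2 * max(1 / (2 * n * L * gamma + 1), L * gamma)

    print(gd_qg_convex_bound(L=1, gamma=.2, n=4))  # ~0.192308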

References:

The detailed approach is available in [1, Theorem 2.2].

[1] B. Goujaud, A. Taylor, A. Dieuleveut (2022). Optimal first-order methods for convex functions with a quadratic upper bound.

Parameters:
  • L (float) – the quadratic growth parameter.

  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> L = 1
>>> pepit_tau, theoretical_tau = wc_gradient_descent_qg_convex(L=L, gamma=.2 / L, n=4, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 7x7
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 35 scalar constraint(s) ...
                        Function 1 : 35 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.19230769057979225
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite
                All the primal scalar constraints are verified up to an error of 6.487419074163725e-10
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 9.613757534126084e-10
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 9.03711530175072e-09
(PEPit) Final upper bound (dual): 0.1923076915693468 and lower bound (primal example): 0.19230769057979225
(PEPit) Duality gap: absolute: 9.895545494131852e-10 and relative: 5.145683703182944e-09
*** Example file: worst-case performance of gradient descent with fixed step-sizes ***
        PEPit guarantee:         f(x_n)-f_* <= 0.192308 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.192308 ||x_0 - x_*||^2

1.22. Gradient descent with decreasing step sizes for quadratically upper bounded convex objective

PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_qg_convex_decreasing(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is quadratically upper bounded (\(\text{QG}^+\) [1]), i.e. \(\forall x, f(x) - f_\star \leqslant \frac{L}{2} \|x-x_\star\|^2\), and convex.

This code computes a worst-case guarantee for gradient descent with decreasing step-sizes. That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L) \| x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of gradient descent with decreasing step-sizes, and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(||x_0 - x_\star||^2 \leqslant 1\).

Algorithm: Gradient descent with decreasing step sizes is described by

\[x_{t+1} = x_t - \gamma_t \nabla f(x_t)\]

with

\[\gamma_t = \frac{1}{L u_{t+1}}\]

where the sequence \(u\) is defined by

\begin{eqnarray} u_0 & = & 1 \\ u_{t} & = & \frac{u_{t-1}}{2} + \sqrt{\left(\frac{u_{t-1}}{2}\right)^2 + 2}, \quad \text{for } t \geq 1 \end{eqnarray}

Theoretical guarantee: The tight theoretical guarantee is conjectured in [1, Conjecture A.3]:

\[f(x_n)-f_\star \leqslant \frac{L}{2 u_n} \|x_0-x_\star\|^2.\]

Notes:

We verify that \(u_t \sim 2\sqrt{t}\). The step sizes as well as the function values of the iterates decrease as \(O\left( \frac{1}{\sqrt{t}} \right)\).
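
The sequence \(u\) and the conjectured bound are straightforward to compute. The sketch below (hypothetical helper, not part of PEPit) reproduces, for L=1 and n=6, the value printed in the example below (about 0.105547).

    from math import sqrt

    def u_sequence(n):
        """Sketch: the sequence u_t defining the decreasing step-sizes gamma_t = 1 / (L * u_{t+1})."""
        u = [1.0]
        for _ in range(n):
            u.append(u[-1] / 2 + sqrt((u[-1] / 2) ** 2 + 2))
        return u

    L, n = 1, 6
    u = u_sequence(n)
    gammas = [1 / (L * u[t + 1]) for t in range(n)]
    print(L / (2 * u[n]))  # conjectured bound of [1, Conjecture A.3], ~0.105547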

References:

The detailed approach is available in [1, Appendix A.3].

[1] B. Goujaud, A. Taylor, A. Dieuleveut (2022). Optimal first-order methods for convex functions with a quadratic upper bound.

Parameters:
  • L (float) – the quadratic growth parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_gradient_descent_qg_convex_decreasing(L=1, n=6, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 9x9
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 63 scalar constraint(s) ...
                        Function 1 : 63 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.10554738683923168
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 1.3052503007966848e-10
                All the primal scalar constraints are verified up to an error of 6.537865110400887e-10
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 4.781039101724198e-10
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 8.076454220612452e-09
(PEPit) Final upper bound (dual): 0.10554738755645543 and lower bound (primal example): 0.10554738683923168
(PEPit) Duality gap: absolute: 7.172237526109626e-10 and relative: 6.7952772123428116e-09
*** Example file: worst-case performance of gradient descent with fixed step-sizes ***
        PEPit guarantee:         f(x_n)-f_* <= 0.105547 ||x_0 - x_*||^2
        Theoretical conjecture:  f(x_n)-f_* <= 0.105547 ||x_0 - x_*||^2

1.23. Conjugate gradient for quadratically upper bounded convex objective

PEPit.examples.unconstrained_convex_minimization.wc_conjugate_gradient_qg_convex(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is quadratically upper bounded (\(\text{QG}^+\) [2]), i.e. \(\forall x, f(x) - f_\star \leqslant \frac{L}{2} \|x-x_\star\|^2\), and convex.

This code computes a worst-case guarantee for the conjugate gradient (CG) method (with exact span searches). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0-x_\star\|^2\]

is valid, where \(x_n\) is the output of the conjugate gradient method, and where \(x_\star\) is a minimizer of \(f\). In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0-x_\star\|^2 \leqslant 1\).

Algorithm:

\[x_{t+1} = x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i)\]

with

\[(\gamma_i)_{i \leqslant t} = \arg\min_{(\gamma_i)_{i \leqslant t}} f \left(x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i) \right)\]
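
For intuition, the span search admits a closed form on quadratic instances (a particular case of \(\text{QG}^+\) convex functions). The snippet below is an illustrative sketch on \(f(x) = \frac{1}{2} x^T Q x\), not PEPit code; the helper name and the random test matrix are assumptions.

    import numpy as np

    def conjugate_gradient_span_search(Q, x0, n):
        """Sketch of CG with exact span searches on the quadratic f(x) = 0.5 * x^T Q x."""
        x = np.asarray(x0, dtype=float)
        grads = []
        for _ in range(n):
            grads.append(Q @ x)                 # gradient of f at the current iterate
            G = np.column_stack(grads)          # search space spanned by all past gradients
            # exact span search: the minimizer of f(x - G c) solves (G^T Q G) c = G^T grad_f(x)
            c = np.linalg.lstsq(G.T @ Q @ G, G.T @ grads[-1], rcond=None)[0]
            x = x - G @ c
        return x

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    Q = A @ A.T                                 # a positive semi-definite test matrix
    print(conjugate_gradient_span_search(Q, x0=rng.standard_normal(5), n=3))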

Theoretical guarantee:

The tight guarantee obtained in [2, Theorem 2.3] (lower) and [2, Theorem 2.4] (upper) is

\[f(x_n) - f_\star \leqslant \frac{L}{2 (n + 1)} \|x_0-x_\star\|^2.\]

References: The detailed approach (based on convex relaxations) is available in [1, Corollary 6], and the result is provided in [2, Theorem 2.4].

[1] Y. Drori and A. Taylor (2020). Efficient first-order methods for convex minimization: a constructive approach. Mathematical Programming 184 (1), 183-220.

[2] B. Goujaud, A. Taylor, A. Dieuleveut (2022). Optimal first-order methods for convex functions with a quadratic upper bound.

Parameters:
  • L (float) – the quadratic growth parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_conjugate_gradient_qg_convex(L=1, n=12, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 27x27
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 195 scalar constraint(s) ...
                        Function 1 : 195 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 1 function(s)
                        Function 1 : Adding 156 scalar constraint(s) ...
                        Function 1 : 156 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.038461537777152104
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 9.102635033370141e-10
                All the primal scalar constraints are verified up to an error of 6.985771551504261e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 6.487940056145134e-09
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.3524476901692115e-07
(PEPit) Final upper bound (dual): 0.038461545907145095 and lower bound (primal example): 0.038461537777152104
(PEPit) Duality gap: absolute: 8.129992991323665e-09 and relative: 2.113798215357174e-07
*** Example file: worst-case performance of conjugate gradient method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.0384615 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.0384615 ||x_0 - x_*||^2

1.24. Heavy Ball momentum for quadratically upper bounded convex objective

PEPit.examples.unconstrained_convex_minimization.wc_heavy_ball_momentum_qg_convex(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is quadratically upper bounded (\(\text{QG}^+\) [2]), i.e. \(\forall x, f(x) - f_\star \leqslant \frac{L}{2} \|x-x_\star\|^2\), and convex.

This code computes a worst-case guarantee for the Heavy-ball (HB) method, aka Polyak momentum method. That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the Heavy-ball (HB) method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm:

This method is described in [1]

\[x_{t+1} = x_t - \alpha_t \nabla f(x_t) + \beta_t (x_t-x_{t-1})\]

with

\[\alpha_t = \frac{1}{L} \frac{1}{t+2}\]

and

\[\beta_t = \frac{t}{t+2}\]
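
A minimal sketch of this iteration (illustrative only, not PEPit code) is given below; grad_f is assumed to be a Python callable returning \(\nabla f\).

    import numpy as np

    def heavy_ball_qg_convex(grad_f, x0, L, n):
        """Sketch of the Heavy-ball iteration above with alpha_t = 1/(L(t+2)) and beta_t = t/(t+2)."""
        x_prev = x = np.asarray(x0, dtype=float)
        for t in range(n):
            alpha, beta = 1 / (L * (t + 2)), t / (t + 2)
            x_prev, x = x, x - alpha * grad_f(x) + beta * (x - x_prev)
        return x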

Theoretical guarantee:

The tight guarantee obtained in [2, Theorem 2.3] (lower) and [2, Theorem 2.4] (upper) is

\[f(x_n) - f_\star \leqslant \frac{L}{2}\frac{1}{n+1} \|x_0 - x_\star\|^2.\]

References: This method was first introduced in [1, Section 3], and the tight convergence bound was proven in [2, Theorem 2.3] (lower) and [2, Theorem 2.4] (upper).

[1] E. Ghadimi, H. R. Feyzmahdavian, M. Johansson (2015). Global convergence of the Heavy-ball method for convex optimization. European Control Conference (ECC).

[2] B. Goujaud, A. Taylor, A. Dieuleveut (2022). Optimal first-order methods for convex functions with a quadratic upper bound.

Parameters:
  • L (float) – the quadratic growth parameter.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_heavy_ball_momentum_qg_convex(L=1, n=5, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 8x8
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 48 scalar constraint(s) ...
                        Function 1 : 48 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.08333333312648067
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite
                All the primal scalar constraints are verified up to an error of 1.0360445487633818e-10
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 9.384283773139436e-11
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.3121924686696765e-09
(PEPit) Final upper bound (dual): 0.08333333333812117 and lower bound (primal example): 0.08333333312648067
(PEPit) Duality gap: absolute: 2.1164049679445185e-10 and relative: 2.539685967837512e-09
*** Example file: worst-case performance of the Heavy-Ball method ***
        PEPit guarantee:         f(x_n)-f_* <= 0.0833333 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.0833333 ||x_0 - x_*||^2

1.25. Gradient descent for smooth strongly convex quadratic objective

PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_quadratics(mu, L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f=\frac{1}{2} x^T Q x\) is \(L\)-smooth and \(\mu\)-strongly convex (i.e. \(\mu I \preceq Q \preceq LI\)).

This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, \mu, L, \gamma)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, \mu, L, \gamma) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\), \(\mu\), \(L\), and \(\gamma\), \(\tau(n, \mu, L, \gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: Gradient descent is described by

\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]

where \(\gamma\) is a step-size.

Theoretical guarantee:

When \(\gamma \leqslant \frac{2}{L}\) and \(0 \leqslant \mu \leqslant L\), the tight theoretical guarantee is conjectured in [1, Equation (4.17)]:

\[f(x_n)-f_\star \leqslant \frac{L}{2} \max\left\{\alpha(1-\alpha L\gamma)^{2n}, (1-L\gamma)^{2n} \right\} \|x_0-x_\star\|^2,\]

where \(\alpha = \mathrm{proj}_{[\frac{\mu}{L},1]} \left(\frac{1}{L\gamma (2n+1)}\right)\).
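
The hypothetical helper below (not part of PEPit) evaluates this conjectured bound, including the projection defining \(\alpha\); for the parameters of the example below it reproduces the printed theoretical value (about 0.0649574).

    def gd_quadratics_bound(mu, L, gamma, n):
        """Conjectured worst-case bound of [1, Equation (4.17)] for gradient descent on quadratics."""
        alpha = min(max(1 / (L * gamma * (2 * n + 1)), mu / L), 1)  # projection onto [mu/L, 1]
        return L / 2 * max(alpha * (1 - alpha * L * gamma) ** (2 * n), (1 - L * gamma) ** (2 * n))

    print(gd_quadratics_bound(mu=0.3, L=3., gamma=1 / 3., n=4))  # ~0.0649574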

References:

Parameters:
  • mu (float) – the strong convexity parameter.

  • L (float) – the smoothness parameter.

  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> L = 3.
>>> mu = 0.3
>>> pepit_tau, theoretical_tau = wc_gradient_descent_quadratics(mu=mu, L=L, gamma=1 / L, n=4, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 7x7
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 21 scalar constraint(s) ...
                        Function 1 : 21 scalar constraint(s) added
                        Function 1 : Adding 1 lmi constraint(s) ...
                 Size of PSD matrix 1: 6x6
                        Function 1 : 1 lmi constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.0649573836866313
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 6.909122375079227e-10
                All required PSD matrices are indeed positive semi-definite up to an error of 8.456976505538092e-09
                All the primal scalar constraints are verified up to an error of 8.495850689627105e-10
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual matrices to lmi are positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.5432361579460934
(PEPit) Final upper bound (dual): 0.06495738812047232 and lower bound (primal example): 0.0649573836866313
(PEPit) Duality gap: absolute: 4.4338410165600806e-09 and relative: 6.825769088776581e-08
*** Example file: worst-case performance of gradient descent on quadratics with fixed step-sizes ***
        PEPit guarantee:         f(x_n)-f_* <= 0.0649574 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.0649574 ||x_0 - x_*||^2

1.26. Gradient descent for smooth strongly convex objective with linear mapping

PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_lc(mug, Lg, typeM, muM, LM, gamma, n, verbose=1)[source]

Consider the convex minimization problem

\[g_\star \triangleq \min_x g(Mx),\]

where \(g\) is an \(L_g\)-smooth, \(\mu_g\)-strongly convex function and \(M\) is a general, symmetric, or skew-symmetric matrix with \(\mu_M \leqslant \|M\| \leqslant L_M\).

Note

For general and skew-symmetric matrices, \(\mu_M\) must be set to 0.

This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\) on a function \(F\) defined as the composition of a function \(G\) and a linear operator \(M\). That is, it computes the smallest possible \(\tau(n, \mu_g, L_g, \mu_M, L_M, \gamma)\) such that the guarantee

\[g(Mx_n) - g_\star \leqslant \tau(n, \mu_g, L_g, \mu_M, L_M, \gamma) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of gradient descent run on \(F\) with fixed step-size \(\gamma\), where \(x_\star\) is a minimizer of \(F(x) = g(Mx)\), and where \(g_\star = g(M x_\star)\).

In short, for given values of \(n\), \(\mu_g\), \(L_g\), \(\mu_M\), \(L_M\), and \(\gamma\), \(\tau(n, \mu_g, L_g, \mu_M, L_M, \gamma)\) is computed as the worst-case value of \(g(Mx_n)-g_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: Gradient descent on such a function is described by

\[x_{t+1} = x_t - \gamma M^T \nabla g(Mx_t),\]

where \(\gamma\) is a step-size.
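
A minimal sketch of this iteration (illustrative only, not PEPit code) is given below; grad_g is assumed to be a Python callable returning \(\nabla g\), and M a NumPy array.

    import numpy as np

    def gradient_descent_lc(grad_g, M, x0, gamma, n):
        """Sketch: gradient descent on F(x) = g(M x), i.e. x_{t+1} = x_t - gamma * M^T grad_g(M x_t)."""
        x = np.asarray(x0, dtype=float)
        for _ in range(n):
            x = x - gamma * M.T @ grad_g(M @ x)
        return x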

Theoretical guarantee: When \(\gamma \leqslant \frac{2}{L}\), \(0 \leqslant \mu_g \leqslant L_g\), and \(0 \leqslant \mu_M \leqslant L_M\), the following tight theoretical guarantee is conjectured in [1, Conjecture 4.2], and the associated lower theoretical guarantee is stated in [1, Conjecture 4.3]:

\[g(Mx_n)-g_\star \leqslant \frac{L}{2} \max\left\{\frac{\kappa_g {M^*}^2}{\kappa_g -1 + (1-\kappa_g {M^*}^2 L \gamma )^{-2n}}, (1-L\gamma)^{2n} \right\} \|x_0-x_\star\|^2,\]

where \(L = L_g L_M^2\), \(\kappa_g = \frac{\mu_g}{L_g}\), \(\kappa_M = \frac{\mu_M}{L_M}\), \(M^* = \mathrm{proj}_{[\kappa_M,1]} \left(\sqrt{\frac{h_0}{L\gamma}}\right)\) for \(h_0\) solution of

\[(1-\kappa_g)(1-\kappa_g h_0)^{2n+1} = 1 - (2n+1)\kappa_g h_0.\]

References:

[1] N. Bousselmi, J. Hendrickx, F. Glineur (2023). Interpolation Conditions for Linear Operators and applications to Performance Estimation Problems. arXiv preprint

Parameters:
  • mug (float) – the strong convexity parameter of \(g(y)\).

  • Lg (float) – the smoothness parameter of \(g(y)\).

  • typeM (string) – type of matrix \(M\) (“gen”, “sym” or “skew”).

  • muM (float) – lower bound on \(\|M\|\) (if typeM != “sym”, then muM must be set to zero).

  • LM (float) – upper bound on \(\|M\|\).

  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns:
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> Lg = 3.; mug = 0.3
>>> typeM = "sym"; LM = 1.; muM = 0.
>>> L = Lg*LM**2
>>> pepit_tau, theoretical_tau = wc_gradient_descent_lc(mug=mug, Lg=Lg, typeM=typeM, muM=muM, LM=LM, gamma=1 / L, n=3, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 16x16
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (2 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 3 function(s)
                        Function 1 : Adding 20 scalar constraint(s) ...
                        Function 1 : 20 scalar constraint(s) added
                        Function 2 : Adding 20 scalar constraint(s) ...
                        Function 2 : 20 scalar constraint(s) added
                        Function 2 : Adding 2 lmi constraint(s) ...
                 Size of PSD matrix 1: 5x5
                 Size of PSD matrix 2: 4x4
                        Function 2 : 2 lmi constraint(s) added
                        Function 3 : Adding 0 scalar constraint(s) ...
                        Function 3 : 0 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.16380641240039548
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 8.261931885644884e-09
                All required PSD matrices are indeed positive semi-definite
                All the primal scalar constraints are verified up to an error of 1.7219835096598865e-08
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual matrices to lmi are positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 5.282963298309745e-07
(PEPit) Final upper bound (dual): 0.1638061803039686 and lower bound (primal example): 0.16380641240039548
(PEPit) Duality gap: absolute: -2.320964268831549e-07 and relative: -1.4168946348439444e-06
*** Example file: worst-case performance of gradient descent on g(Mx) with fixed step-sizes ***
        PEPit guarantee:         f(x_n)-f_* <= 0.163806 ||x_0 - x_*||^2
        Theoretical guarantee:   f(x_n)-f_* <= 0.16379 ||x_0 - x_*||^2