1. Unconstrained convex minimization

1.1. Gradient descent

PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent(L, gamma, n, verbose=True)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \gamma) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\), \(L\), and \(\gamma\), \(\tau(n, L, \gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: Gradient descent is described by

\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]

where \(\gamma\) is a step-size.
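
For illustration only (this is not part of the PEPit example below), a minimal NumPy sketch of this iteration on an assumed \(L\)-smooth convex quadratic; the matrix A, starting point, and numerical values are illustrative choices:

import numpy as np

# Gradient descent x_{t+1} = x_t - gamma * grad f(x_t), sketched on an
# illustrative L-smooth convex quadratic f(x) = 0.5 * x^T A x.
L, n = 3.0, 4
gamma = 1 / L                        # fixed step-size gamma <= 1/L
A = np.diag([L, 1.0, 0.1])           # eigenvalues in (0, L]: f is convex and L-smooth
grad_f = lambda x: A @ x             # gradient of f
x = np.array([1.0, 1.0, 1.0])        # x_0
for _ in range(n):
    x = x - gamma * grad_f(x)        # fixed step-size gradient step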

Theoretical guarantee: When \(\gamma \leqslant \frac{1}{L}\), the tight theoretical guarantee can be found in [1, Theorem 1]:

\[f(x_n)-f_\star \leqslant \frac{L\|x_0-x_\star\|^2}{4nL\gamma+2},\]

which is tight on some Huber loss functions.

References:

[1] Y. Drori, M. Teboulle (2014). Performance of first-order methods for smooth convex minimization: a novel approach. Mathematical Programming 145(1–2), 451–482.

Parameters
  • L (float) – the smoothness parameter.

  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> L = 3
>>> pepit_tau, theoretical_tau = wc_gradient_descent(L=L, gamma=1 / L, n=4, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 7x7
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 30 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: MOSEK); optimal value: 0.16666666497937685
*** Example file: worst-case performance of gradient descent with fixed step-sizes ***
    PEPit guarantee:             f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2
    Theoretical guarantee:       f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2

1.2. Subgradient method

PEPit.examples.unconstrained_convex_minimization.wc_subgradient_method(M, n, gamma, verbose=True)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is convex and \(M\)-Lipschitz; the problem is therefore possibly non-smooth.

This code computes a worst-case guarantee for the subgradient method. That is, it computes the smallest possible \(\tau(n, M, \gamma)\) such that the guarantee

\[\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star \leqslant \tau(n, M, \gamma) \|x_0 - x_\star\|\]

is valid, where \(x_t\) is the output of the subgradient method after \(t\leqslant n\) steps, and where \(x_\star\) is the minimizer of \(f\).

In short, for given values of \(M\), the step-size \(\gamma\) and the number of iterations \(n\), \(\tau(n, M, \gamma)\) is computed as the worst-case value of \(\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star\) when \(\|x_0-x_\star\| \leqslant 1\).

Algorithm: For \(t\in \{0, \dots, n-1 \}\)

\begin{eqnarray} g_{t} & \in & \partial f(x_t) \\ x_{t+1} & = & x_t - \gamma g_t \end{eqnarray}
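
As a standalone illustration (not the PEPit code), a minimal NumPy sketch of this scheme on an assumed \(M\)-Lipschitz convex function \(f(x) = M\|x\|_2\); the function, starting point, and step-size are illustrative choices:

import numpy as np

# Subgradient method on the illustrative non-smooth convex function
# f(x) = M * ||x||_2, which is M-Lipschitz; the best function value is tracked.
M, n = 2.0, 6
gamma = 1 / (M * np.sqrt(n + 1))
f = lambda x: M * np.linalg.norm(x)
subgrad_f = lambda x: M * x / max(np.linalg.norm(x), 1e-12)   # valid subgradient (0 at x = 0)

x = np.array([1.0, -0.5])                 # x_0
fvals = [f(x)]                            # f(x_0)
for _ in range(n):
    x = x - gamma * subgrad_f(x)          # x_{t+1} = x_t - gamma * g_t
    fvals.append(f(x))
best_f = min(fvals)                       # min_{0 <= t <= n} f(x_t)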

Theoretical guarantee: The tight bound is obtained in [1, Section 3.2.3] and [2, Eq (2)]

\[\min_{0 \leqslant t \leqslant n} f(x_t)- f(x_\star) \leqslant \frac{M}{\sqrt{n+1}}\|x_0-x_\star\|,\]

and tightness follows from the lower complexity bound for this class of problems, e.g., [3, Appendix A].

References: Classical references on this topic include [1, 2].

[1] Y. Nesterov (2003). Introductory lectures on convex optimization: A basic course. Springer Science & Business Media.

[2] S. Boyd, L. Xiao, A. Mutapcic (2003). Subgradient Methods (lecture notes).

[3] Y. Drori, M. Teboulle (2016). An optimal variant of Kelley’s cutting-plane method. Mathematical Programming, 160(1), 321-351.

Parameters
  • M (float) – the Lipschitz parameter.

  • n (int) – the number of iterations.

  • gamma (float) – step-size.

  • verbose (bool, optional) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> M = 2
>>> n = 6
>>> gamma = 1 / (M * sqrt(n + 1))
>>> pepit_tau, theoretical_tau = wc_subgradient_method(M=M, n=n, gamma=gamma, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 9x9
(PEPit) Setting up the problem: performance measure is minimum of 7 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 64 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.7559825331741553
*** Example file: worst-case performance of subgradient method ***
    PEPit guarantee:             min_(0 \leq t \leq n) f(x_i) - f_* <= 0.755983 ||x_0 - x_*||
    Theoretical guarantee:       min_(0 \leq t \leq n) f(x_i) - f_* <= 0.755929 ||x_0 - x_*||

1.4. Conjugate gradient

PEPit.examples.unconstrained_convex_minimization.wc_conjugate_gradient(L, n, verbose=True)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for the conjugate gradient (CG) method (with exact span searches). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0-x_\star\|^2\]

is valid, where \(x_n\) is the output of the conjugate gradient method, and where \(x_\star\) is a minimizer of \(f\). In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0-x_\star\|^2 \leqslant 1\).

Algorithm:

\[x_{t+1} = x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i)\]

with

\[(\gamma_i)_{i \leqslant t} = \arg\min_{(\gamma_i)_{i \leqslant t}} f \left(x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i) \right)\]
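
For illustration only (not the PEPit code), a minimal sketch of these exact span searches on an assumed \(L\)-smooth convex quadratic, where the inner minimization over the coefficients \((\gamma_i)_{i \leqslant t}\) is carried out numerically with scipy; all numerical choices are illustrative:

import numpy as np
from scipy.optimize import minimize

# Conjugate gradient with exact span searches, sketched on an illustrative
# L-smooth convex quadratic f(x) = 0.5 * x^T A x.
L, n = 1.0, 2
A = np.diag([L, 0.5, 0.1])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x

x = np.array([1.0, 1.0, 1.0])              # x_0
grads = []                                 # gradients at x_0, ..., x_t
for _ in range(n):
    grads.append(grad_f(x))
    G = np.array(grads)                    # (t+1) x d matrix of past gradients
    span_obj = lambda g: f(x - g @ G)      # f(x_t - sum_i gamma_i grad f(x_i))
    gammas = minimize(span_obj, np.zeros(len(grads))).x   # exact span search (numerically)
    x = x - gammas @ G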

Theoretical guarantee:

The tight guarantee obtained in [1] is

\[f(x_n) - f_\star \leqslant \frac{L}{2 \theta_n^2}\|x_0-x_\star\|^2,\]

where

\begin{eqnarray} \theta_0 & = & 1 \\ \theta_t & = & \frac{1 + \sqrt{4 \theta_{t-1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \theta_n & = & \frac{1 + \sqrt{8 \theta_{n-1}^2 + 1}}{2}, \end{eqnarray}

and tightness follows from [2, Theorem 3].

References: The detailed approach (based on convex relaxations) is available in [1, Corollary 6].

[1] Y. Drori and A. Taylor (2020). Efficient first-order methods for convex minimization: a constructive approach. Mathematical Programming 184 (1), 183-220.

[2] Y. Drori (2017). The exact information-based complexity of smooth convex minimization. Journal of Complexity, 39, 1-16.

Parameters
  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_conjugate_gradient(L=1, n=2, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 7x7
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 18 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.061893515427809735
*** Example file: worst-case performance of conjugate gradient method ***
    PEPit guarantee:             f(x_n)-f_* <= 0.0618935 ||x_0 - x_*||^2
    Theoretical guarantee:       f(x_n)-f_* <= 0.0618942 ||x_0 - x_*||^2

1.5. Heavy Ball momentum

PEPit.examples.unconstrained_convex_minimization.wc_heavy_ball_momentum(mu, L, alpha, beta, n, verbose=True)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for the Heavy-ball (HB) method, a.k.a. the Polyak momentum method. That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) (f(x_0) - f_\star)\]

is valid, where \(x_n\) is the output of the Heavy-ball (HB) method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star \leqslant 1\).

Algorithm:

\[x_{t+1} = x_t - \alpha \nabla f(x_t) + \beta (x_t-x_{t-1})\]

with

\[\alpha \in (0, \frac{1}{L}]\]

and

\[\beta = \sqrt{(1 - \alpha \mu)(1 - L \alpha)}\]
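
For illustration only (not the PEPit code), a minimal NumPy sketch of the Heavy-ball iteration with these parameter choices on an assumed strongly convex quadratic; all numerical values are illustrative:

import numpy as np

# Heavy-ball iteration x_{t+1} = x_t - alpha * grad f(x_t) + beta * (x_t - x_{t-1})
# on an illustrative mu-strongly convex, L-smooth quadratic.
mu, L, n = 0.1, 1.0, 2
alpha = 1 / (2 * L)                                    # alpha in (0, 1/L]
beta = np.sqrt((1 - alpha * mu) * (1 - L * alpha))     # momentum parameter
A = np.diag([L, 0.5, mu])                              # eigenvalues in [mu, L]
grad_f = lambda x: A @ x
x_prev = x = np.array([1.0, 1.0, 1.0])                 # x_{-1} = x_0
for _ in range(n):
    x, x_prev = x - alpha * grad_f(x) + beta * (x - x_prev), x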

Theoretical guarantee:

The upper bound obtained in [2, Theorem 4] is

\[f(x_n) - f_\star \leqslant (1 - \alpha \mu)^n (f(x_0) - f_\star).\]

References: This method was first introduced in [1, Section 2], and the convergence upper bound was proven in [2, Theorem 4].

[1] B.T. Polyak (1964). Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics.

[2] E. Ghadimi, H. R. Feyzmahdavian, M. Johansson (2015). Global convergence of the Heavy-ball method for convex optimization. European Control Conference (ECC).

Parameters
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • alpha (float) – parameter of the scheme.

  • beta (float) – parameter of the scheme such that \(0<\beta<1\) and \(0<\alpha<2(1+\beta)\).

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> mu = 0.1
>>> L = 1.
>>> alpha = 1 / (2 * L)  # alpha \in [0, 1 / L]
>>> beta = sqrt((1 - alpha * mu) * (1 - L * alpha))
>>> pepit_tau, theoretical_tau = wc_heavy_ball_momentum(mu=mu, L=L, alpha=alpha, beta=beta, n=2, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 5x5
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 12 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.8145062493468549
*** Example file: worst-case performance of the Heavy-Ball method ***
    PEPit guarantee:             f(x_n)-f_* <= 0.753492 (f(x_0) -  f(x_*))
    Theoretical guarantee:       f(x_n)-f_* <= 0.9025 (f(x_0) -  f(x_*))

1.6. Accelerated gradient for convex objective

PEPit.examples.unconstrained_convex_minimization.wc_accelerated_gradient_convex(mu, L, n, verbose=True)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex (\(\mu\) is possibly 0).

This code computes a worst-case guarantee for the accelerated gradient method, a.k.a. fast gradient method. That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the accelerated gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: The accelerated gradient method of this example is provided by

\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \\ y_{t+1} & = & x_{t+1} + \frac{t-1}{t+2} (x_{t+1} - x_t). \end{eqnarray}
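
For illustration only (not the PEPit code), a minimal NumPy sketch of this accelerated iteration on an assumed \(L\)-smooth convex quadratic, assuming the standard initialization \(y_0 = x_0\); all numerical values are illustrative:

import numpy as np

# Accelerated (fast) gradient method on an illustrative L-smooth convex quadratic.
L, n = 1.0, 5
A = np.diag([L, 0.3, 0.0])                 # eigenvalues in [0, L]: convex, L-smooth
grad_f = lambda y: A @ y
x = y = np.array([1.0, 1.0, 1.0])          # assumed y_0 = x_0
for t in range(n):
    x_next = y - grad_f(y) / L                         # gradient step from y_t
    y = x_next + (t - 1) / (t + 2) * (x_next - x)      # momentum extrapolation
    x = x_next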

Theoretical guarantee: When \(\mu=0\), a tight empirical guarantee can be found in [1, Table 1]:

\[f(x_n)-f_\star \leqslant \frac{2L\|x_0-x_\star\|^2}{n^2 + 5 n + 6},\]

where tightness is obtained on some Huber loss functions.

References:

[1] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283–1313.

Parameters
  • mu (float) – the strong convexity parameter.

  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_convex(mu=0, L=1, n=1, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 4x4
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 6 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.16666666668209376
*** Example file: worst-case performance of accelerated gradient method ***
    PEPit guarantee:             f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2
    Theoretical guarantee:       f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2

1.7. Accelerated gradient for strongly convex objective

PEPit.examples.unconstrained_convex_minimization.wc_accelerated_gradient_strongly_convex(mu, L, n, verbose=True)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for the accelerated gradient method, a.k.a. fast gradient method. That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \left(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2\right),\]

is valid, where \(x_n\) is the output of the accelerated gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: For \(t \in \{0, \dots, n-1\}\),

\begin{eqnarray} y_t & = & x_t + \frac{\sqrt{L} - \sqrt{\mu}}{\sqrt{L} + \sqrt{\mu}}(x_t - x_{t-1}) \\ x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \end{eqnarray}

with \(x_{-1}:= x_0\).
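
For illustration only (not the PEPit code), a minimal NumPy sketch of this constant-momentum scheme on an assumed strongly convex quadratic; all numerical values are illustrative:

import numpy as np

# Accelerated gradient method with constant momentum
# (sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu)), on an illustrative quadratic.
mu, L, n = 0.1, 1.0, 2
momentum = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))
A = np.diag([L, 0.5, mu])                  # mu-strongly convex, L-smooth
grad_f = lambda y: A @ y
x_prev = x = np.array([1.0, 1.0, 1.0])     # x_{-1} = x_0
for _ in range(n):
    y = x + momentum * (x - x_prev)        # extrapolation y_t
    x_prev, x = x, y - grad_f(y) / L       # gradient step from y_t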

Theoretical guarantee:

The following upper bound can be found in [1, Corollary 4.15]:

\[f(x_n)-f_\star \leqslant \left(1 - \sqrt{\frac{\mu}{L}}\right)^n \left(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2\right).\]

References:

[1] A. d’Aspremont, D. Scieur, A. Taylor (2021). Acceleration Methods. Foundations and Trends in Optimization: Vol. 5, No. 1-2.

Parameters
  • mu (float) – the strong convexity parameter.

  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_strongly_convex(mu=0.1, L=1, n=2, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 5x5
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 12 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.34758587217463155
*** Example file: worst-case performance of the accelerated gradient method ***
    PEPit guarantee:             f(x_n)-f_*  <= 0.347586 (f(x_0) -  f(x_*) +  mu/2*|| x_0 - x_* ||**2)
    Theoretical guarantee:       f(x_n)-f_*  <= 0.467544 (f(x_0) -  f(x_*) +  mu/2*|| x_0 - x_* ||**2)

1.8. Optimized gradient

PEPit.examples.unconstrained_convex_minimization.wc_optimized_gradient(L, n, verbose=True)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for optimized gradient method (OGM). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of OGM and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm: The optimized gradient method is described by

\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t)\\ y_{t+1} & = & x_{t+1} + \frac{\theta_{t}-1}{\theta_{t+1}}(x_{t+1}-x_t)+\frac{\theta_{t}}{\theta_{t+1}}(x_{t+1}-y_t), \end{eqnarray}

with

\begin{eqnarray} \theta_0 & = & 1 \\ \theta_t & = & \frac{1 + \sqrt{4 \theta_{t-1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \theta_n & = & \frac{1 + \sqrt{8 \theta_{n-1}^2 + 1}}{2}. \end{eqnarray}
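
For illustration only (not the PEPit code), a minimal NumPy sketch of the \(\theta_t\) sequence and the OGM iteration on an assumed \(L\)-smooth convex quadratic, assuming the standard initialization \(y_0 = x_0\); all numerical values are illustrative:

import numpy as np

# theta sequence of OGM: theta_0 = 1, intermediate steps use the "4" recursion,
# the last step uses the "8" recursion.
L, n = 3.0, 4
theta = [1.0]
for t in range(1, n + 1):
    c = 4.0 if t < n else 8.0
    theta.append((1 + np.sqrt(c * theta[-1] ** 2 + 1)) / 2)

# OGM iteration on an illustrative L-smooth convex quadratic.
A = np.diag([L, 1.0, 0.1])
grad_f = lambda y: A @ y
x = y = np.array([1.0, 1.0, 1.0])          # assumed y_0 = x_0
for t in range(n):
    x_next = y - grad_f(y) / L
    y = (x_next + (theta[t] - 1) / theta[t + 1] * (x_next - x)
         + theta[t] / theta[t + 1] * (x_next - y))
    x = x_next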

Theoretical guarantee: The tight theoretical guarantee can be found in [2, Theorem 2]:

\[f(x_n)-f_\star \leqslant \frac{L\|x_0-x_\star\|^2}{2\theta_n^2},\]

where tightness follows from [3, Theorem 3].

References: The optimized gradient method was developed in [1, 2]; the corresponding lower bound was first obtained in [3].

[1] Y. Drori, M. Teboulle (2014). Performance of first-order methods for smooth convex minimization: a novel approach. Mathematical Programming 145(1–2), 451–482.

[2] D. Kim, J. Fessler (2016). Optimized first-order methods for smooth convex minimization. Mathematical Programming 159.1-2: 81-107.

[3] Y. Drori (2017). The exact information-based complexity of smooth convex minimization. Journal of Complexity, 39, 1-16.

Parameters
  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_optimized_gradient(L=3, n=4, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 7x7
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 30 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: MOSEK); optimal value: 0.07675182659831646
*** Example file: worst-case performance of optimized gradient method ***
        PEPit guarantee:       f(y_n)-f_* <= 0.0767518 || x_0 - x_* ||^2
        Theoretical guarantee:  f(y_n)-f_* <= 0.0767518 || x_0 - x_* ||^2

1.9. Optimized gradient for gradient

PEPit.examples.unconstrained_convex_minimization.wc_optimized_gradient_for_gradient(L, n, verbose=True)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for optimized gradient method for gradient (OGM-G). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee

\[\|\nabla f(x_n)\|^2 \leqslant \tau(n, L) (f(x_0) - f_\star)\]

is valid, where \(x_n\) is the output of OGM-G and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(\|\nabla f(x_n)\|^2\) when \(f(x_0)-f_\star \leqslant 1\).

Algorithm: For \(t\in\{0,1,\ldots,n-1\}\), the optimized gradient method for gradient [1, Section 6.3] is described by

\begin{eqnarray} y_{t+1} & = & x_t - \frac{1}{L} \nabla f(x_t),\\ x_{t+1} & = & y_{t+1} + \frac{(\tilde{\theta}_t-1)(2\tilde{\theta}_{t+1}-1)}{\tilde{\theta}_t(2\tilde{\theta}_t-1)}(y_{t+1}-y_t)+\frac{2\tilde{\theta}_{t+1}-1}{2\tilde{\theta}_t-1}(y_{t+1}-x_t), \end{eqnarray}

with

\begin{eqnarray} \tilde{\theta}_n & = & 1 \\ \tilde{\theta}_t & = & \frac{1 + \sqrt{4 \tilde{\theta}_{t+1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \tilde{\theta}_0 & = & \frac{1 + \sqrt{8 \tilde{\theta}_{1}^2 + 1}}{2}. \end{eqnarray}
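
For illustration only (not the PEPit code), a minimal NumPy sketch of the backward \(\tilde{\theta}_t\) sequence and the OGM-G iteration on an assumed \(L\)-smooth convex quadratic, assuming the initialization \(y_0 = x_0\); all numerical values are illustrative:

import numpy as np

# Backward tilde-theta sequence: tilde_theta_n = 1, the recursion runs from
# t = n-1 down to t = 1, and t = 0 uses the "8" recursion.
L, n = 3.0, 4
tilde_theta = np.ones(n + 1)
for t in range(n - 1, 0, -1):
    tilde_theta[t] = (1 + np.sqrt(4 * tilde_theta[t + 1] ** 2 + 1)) / 2
tilde_theta[0] = (1 + np.sqrt(8 * tilde_theta[1] ** 2 + 1)) / 2

# OGM-G iteration on an illustrative L-smooth convex quadratic.
A = np.diag([L, 1.0, 0.1])
grad_f = lambda x: A @ x
x = np.array([1.0, 1.0, 1.0])
y = x.copy()                               # assumed y_0 = x_0
for t in range(n):
    y_next = x - grad_f(x) / L
    th, th_next = tilde_theta[t], tilde_theta[t + 1]
    x = (y_next + (th - 1) * (2 * th_next - 1) / (th * (2 * th - 1)) * (y_next - y)
         + (2 * th_next - 1) / (2 * th - 1) * (y_next - x))
    y = y_next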

Theoretical guarantee: The tight worst-case guarantee can be found in [1, Theorem 6.1]:

\[\|\nabla f(x_n)\|^2 \leqslant \frac{2L(f(x_0)-f_\star)}{\tilde{\theta}_0^2},\]

where tightness is achieved on Huber losses, see [1, Section 6.4].

References: The optimized gradient method for gradient was developed in [1].

[1] D. Kim, J. Fessler (2021). Optimizing the efficiency of first-order methods for decreasing the gradient of smooth convex functions. Journal of optimization theory and applications, 188(1), 192-219.

Parameters
  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_optimized_gradient_for_gradient(L=3, n=4, verbose=True)
(PEP-it) Setting up the problem: size of the main PSD matrix: 7x7
(PEP-it) Setting up the problem: performance measure is minimum of 1 element(s)
(PEP-it) Setting up the problem: initial conditions (1 constraint(s) added)
(PEP-it) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 30 constraint(s) added
(PEP-it) Compiling SDP
(PEP-it) Calling SDP solver
(PEP-it) Solver status: optimal (solver: MOSEK); optimal value: 0.30700730460985826
*** Example file: worst-case performance of optimized gradient method for gradient ***
        PEP-it guarantee:       ||f'(x_n)|| ^ 2 <= 0.307007 (f(x_0) - f_*)
        Theoretical guarantee:  ||f'(x_n)|| ^ 2 <= 0.307007 (f(x_0) - f_*)

1.10. Robust momentum

PEPit.examples.unconstrained_convex_minimization.wc_robust_momentum(mu, L, lam, verbose=True)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly-convex.

This code computes a worst-case guarantee for the robust momentum method (RMM). That is, it computes the smallest possible \(\tau(n, \mu, L, \lambda)\) such that the guarantee

\[v(x_{n+1}) \leqslant \tau(n, \mu, L, \lambda) v(x_{n}),\]

is valid, where \(x_n\) is the \(n^{\mathrm{th}}\) iterate of the RMM, and \(x_\star\) is a minimizer of \(f\). The function \(v(\cdot)\) is a well-chosen Lyapunov function, defined as follows,

\begin{eqnarray} v(x_t) & = & l\|z_t - x_\star\|^2 + q_t, \\ q_t & = & (L - \mu) \left(f(x_t) - f_\star - \frac{\mu}{2}\|y_t - x_\star\|^2 - \frac{1}{2}\|\nabla f(y_t) - \mu (y_t - x_\star)\|^2 \right), \end{eqnarray}

with \(\kappa = \frac{L}{\mu}\), \(\rho = \lambda (1 - \frac{1}{\kappa}) + (1 - \lambda) \left(1 - \frac{1}{\sqrt{\kappa}}\right)\), and \(l = \mu^2 \frac{\kappa - \kappa \rho^2 - 1}{2 \rho (1 - \rho)}\).

Algorithm:

For \(t \in \{0, \dots, n-1\}\),

\begin{eqnarray} x_{t+1} & = & x_{t} + \beta (x_t - x_{t-1}) - \alpha \nabla f(y_t), \\ y_{t+1} & = & y_{t} + \gamma (x_t - x_{t-1}), \end{eqnarray}

with \(x_{-1}, x_0 \in \mathrm{R}^d\), and with parameters \(\alpha = \frac{\kappa (1 - \rho^2)(1 + \rho)}{L}\), \(\beta = \frac{\kappa \rho^3}{\kappa - 1}\), \(\gamma = \frac{\rho^2}{(\kappa - 1)(1 - \rho)^2(1 + \rho)}\).

Theoretical guarantee:

A convergence guarantee (empirically tight) is obtained in [1, Theorem 1],

\[v(x_{n+1}) \leqslant \rho^2 v(x_n),\]

with \(\rho = \lambda (1 - \frac{1}{\kappa}) + (1 - \lambda) \left(1 - \frac{1}{\sqrt{\kappa}}\right)\).
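
As a quick numerical sanity check (not part of PEPit), the rate \(\rho^2\) can be evaluated directly; the values of \(\mu\), \(L\) and \(\lambda\) below are taken from the example further down:

import numpy as np

# Rate rho^2 of the robust momentum method for mu = 0.1, L = 1, lambda = 0.2.
mu, L, lam = 0.1, 1.0, 0.2
kappa = L / mu
rho = lam * (1 - 1 / kappa) + (1 - lam) * (1 - 1 / np.sqrt(kappa))
print(rho ** 2)    # approximately 0.5286, matching the worst-case value reported below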

References:

[1] S. Cyrus, B. Hu, B. Van Scoy, L. Lessard (2018). A robust accelerated optimization algorithm for strongly convex functions. American Control Conference (ACC).

Parameters
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • lam (float) – if \(\lambda=1\), the method reduces to gradient descent; if \(\lambda=0\), it reduces to the Triple Momentum Method.

  • verbose (bool, optional) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Examples

>>> pepit_tau, theoretical_tau = wc_robust_momentum(0.1, 1, 0.2, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 5x5
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 6 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.5285548355257013
*** Example file: worst-case performance of the Robust Momentum Method ***
    PEPit guarantee:             v(x_(n+1)) <= 0.528555 v(x_n)
    Theoretical guarantee:       v(x_(n+1)) <= 0.528555 v(x_n)

1.11. Triple momentum

PEPit.examples.unconstrained_convex_minimization.wc_triple_momentum(mu, L, n, verbose=True)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for triple momentum method (TMM). That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the TMM, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm:

For \(t \in \{ 1, \dots, n\}\)

\begin{eqnarray} \xi_{t+1} & = & (1 + \beta) \xi_{t} - \beta \xi_{t-1} - \alpha \nabla f(y_t) \\ y_{t} & = & (1+\gamma) \xi_{t} - \gamma \xi_{t-1} \\ x_{t} & = & (1 + \delta) \xi_{t} - \delta \xi_{t-1} \end{eqnarray}

with

\begin{eqnarray} \kappa & = & \frac{L}{\mu}, \quad \rho = 1 - \frac{1}{\sqrt{\kappa}} \\ (\alpha, \beta, \gamma, \delta) & = & \left(\frac{1+\rho}{L}, \frac{\rho^2}{2-\rho}, \frac{\rho^2}{(1+\rho)(2-\rho)}, \frac{\rho^2}{1-\rho^2}\right) \end{eqnarray}

and

\begin{eqnarray} \xi_{0} & = & x_0, \\ \xi_{1} & = & x_0, \\ y & = & x_0. \end{eqnarray}
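
For illustration only (not the PEPit code), a minimal NumPy sketch of the TMM iteration with these parameters on an assumed strongly convex quadratic; all numerical values are illustrative:

import numpy as np

# Triple momentum method on an illustrative mu-strongly convex, L-smooth quadratic.
mu, L, n = 0.1, 1.0, 4
kappa = L / mu
rho = 1 - 1 / np.sqrt(kappa)
alpha = (1 + rho) / L
beta = rho ** 2 / (2 - rho)
gamma = rho ** 2 / ((1 + rho) * (2 - rho))
delta = rho ** 2 / (1 - rho ** 2)

A = np.diag([L, 0.5, mu])
grad_f = lambda y: A @ y
xi_prev = xi = np.array([1.0, 1.0, 1.0])    # xi_0 = xi_1 = x_0
for _ in range(n):
    y = (1 + gamma) * xi - gamma * xi_prev
    xi_prev, xi = xi, (1 + beta) * xi - beta * xi_prev - alpha * grad_f(y)
x_out = (1 + delta) * xi - delta * xi_prev  # output iterate x_n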

Theoretical guarantee: A theoretical upper (empirically tight) bound can be found in [1, Theorem 1, eq. 4]:

\[f(x_n)-f_\star \leqslant \frac{\rho^{2(n+1)} L \kappa}{2}\|x_0 - x_\star\|^2.\]

References: The triple momentum method was discovered and analyzed in [1].

[1] Van Scoy, B., Freeman, R. A., Lynch, K. M. (2018), The fastest known globally convergent first-order method for minimizing strongly convex functions. IEEE Control Systems Letters, 2(1), 49-54.

Parameters
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_triple_momentum(mu=0.1, L=1., n=4, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 7x7
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 30 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.23893532450841679
*** Example file: worst-case performance of the Triple Momentum Method ***
    PEPit guarantee:             f(x_n)-f_* <= 0.238935 ||x_0-x_*||^2
    Theoretical guarantee:       f(x_n)-f_* <= 0.238925 ||x_0-x_*||^2

1.12. Information theoretic exact method

PEPit.examples.unconstrained_convex_minimization.wc_information_theoretic(mu, L, n, verbose=True)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex (\(\mu\) is possibly 0).

This code computes a worst-case guarantee for the information theoretic exact method (ITEM). That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee

\[\|z_n - x_\star\|^2 \leqslant \tau(n, L, \mu) \|z_0 - x_\star\|^2\]

is valid, where \(z_n\) is the output of the ITEM, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(\|z_n - x_\star\|^2\) when \(\|z_0 - x_\star\|^2 \leqslant 1\).

Algorithm: For \(t\in\{0,1,\ldots,n-1\}\), the information theoretic exact method of this example is provided by

\begin{eqnarray} y_{t} & = & (1-\beta_t) z_t + \beta_t x_t \\ x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \\ z_{t+1} & = & \left(1-q\delta_t\right) z_t+q\delta_t y_t-\frac{\delta_t}{L}\nabla f(y_t), \end{eqnarray}

with \(y_{-1}=x_0=z_0\), \(q=\frac{\mu}{L}\) (inverse condition ratio), and the scalar sequences:

\begin{eqnarray} A_{t+1} & = & \frac{(1+q)A_t+2\left(1+\sqrt{(1+A_t)(1+qA_t)}\right)}{(1-q)^2},\\ \beta_{t+1} & = & \frac{A_t}{(1-q)A_{t+1}},\\ \delta_{t+1} & = & \frac{1}{2}\frac{(1-q)^2A_{t+1}-(1+q)A_t}{1+q+q A_t}, \end{eqnarray}

with \(A_0=0\).

Theoretical guarantee: A tight worst-case guarantee can be found in [1, Theorem 3]:

\[\|z_n - x_\star\|^2 \leqslant \frac{1}{1+q A_n} \|z_0-x_\star\|^2,\]

where tightness is obtained on some quadratic loss functions (see [1, Lemma 2]).
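
As a quick numerical sanity check (not part of PEPit), the sequence \(A_t\) and the rate \(\frac{1}{1+qA_n}\) can be evaluated directly; the values of \(\mu\), \(L\) and \(n\) below are taken from the example further down:

import numpy as np

# Rate 1 / (1 + q * A_n) of ITEM for mu = 0.001, L = 1, n = 15.
mu, L, n = 0.001, 1.0, 15
q = mu / L
A = 0.0                                    # A_0 = 0
for _ in range(n):
    A = ((1 + q) * A + 2 * (1 + np.sqrt((1 + A) * (1 + q * A)))) / (1 - q) ** 2
print(1 / (1 + q * A))                     # approximately 0.7566, matching the value reported below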

References:

[1] A. Taylor, Y. Drori (2021). An optimal gradient method for smooth strongly convex minimization. arXiv 2101.09741v2.

Parameters
  • mu (float) – the strong convexity parameter.

  • L (float) – the smoothness parameter.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_information_theoretic(mu=.001, L=1, n=15, verbose=True)
(PEP-it) Setting up the problem: size of the main PSD matrix: 17x17
(PEP-it) Setting up the problem: performance measure is minimum of 1 element(s)
(PEP-it) Setting up the problem: initial conditions (1 constraint(s) added)
(PEP-it) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 240 constraint(s) added
(PEP-it) Compiling SDP
(PEP-it) Calling SDP solver
(PEP-it) Solver status: optimal (solver: MOSEK); optimal value: 0.7566088333863754
*** Example file: worst-case performance of the information theoretic exact method ***
        PEP-it guarantee:       ||z_n - x_* ||^2 <= 0.756609 ||z_0 - x_*||^2
        Theoretical guarantee:  ||z_n - x_* ||^2 <= 0.756605 ||z_0 - x_*||^2

1.13. Proximal point

PEPit.examples.unconstrained_convex_minimization.wc_proximal_point(gamma, n, verbose=True)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is closed, proper, and convex (and potentially non-smooth).

This code computes a worst-case guarantee for the proximal point method with step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n,\gamma)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, \gamma) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of the proximal point method, and where \(x_\star\) is a minimizer of \(f\).

In short, for given values of \(n\) and \(\gamma\), \(\tau(n,\gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).

Algorithm:

The proximal point method is described by

\[x_{t+1} = \arg\min_x \left\{f(x)+\frac{1}{2\gamma}\|x-x_t\|^2 \right\},\]

where \(\gamma\) is a step-size.
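
For illustration only (not the PEPit code), a minimal NumPy sketch of this iteration on an assumed non-smooth convex function \(f(x) = \|x\|_1\), whose proximal operator is soft-thresholding; all numerical values are illustrative:

import numpy as np

# Proximal point method on the illustrative non-smooth convex function f(x) = ||x||_1.
gamma, n = 3.0, 4
prox_f = lambda v, g: np.sign(v) * np.maximum(np.abs(v) - g, 0.0)   # prox of g * ||.||_1
x = np.array([1.0, -0.5, 2.0])             # x_0
for _ in range(n):
    x = prox_f(x, gamma)    # x_{t+1} = argmin_x { f(x) + ||x - x_t||^2 / (2 gamma) }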

Theoretical guarantee:

The tight theoretical guarantee can be found in [1, Theorem 4.1]:

\[f(x_n)-f_\star \leqslant \frac{\|x_0-x_\star\|^2}{4\gamma n},\]

where tightness is obtained on, e.g., one-dimensional linear problems on the positive orthant.

References:

[1] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283–1313.

Parameters
  • gamma (float) – step-size.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_proximal_point(gamma=3, n=4, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 6x6
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 20 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: MOSEK); optimal value: 0.020833335685727362
*** Example file: worst-case performance of proximal point method ***
    PEPit guarantee:           f(x_n)-f_* <= 0.0208333 ||x_0 - x_*||^2
    Theoretical guarantee:      f(x_n)-f_* <= 0.0208333 ||x_0 - x_*||^2

1.14. Accelerated proximal point

PEPit.examples.unconstrained_convex_minimization.wc_accelerated_proximal_point(A0, gammas, n, verbose=True)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is convex and possibly non-smooth.

This code computes a worst-case guarantee for the accelerated proximal point method, a.k.a. fast proximal point method (FPP). That is, it computes the smallest possible \(\tau(n, A_0,\vec{\gamma})\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, A_0, \vec{\gamma}) \left(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2\right)\]

is valid, where \(x_n\) is the output of FPP (with step-size \(\gamma_t\) at step \(t\in \{0, \dots, n-1\}\)) and where \(x_\star\) is a minimizer of \(f\) and \(A_0\) is a positive number.

In short, for given values of \(n\), \(A_0\) and \(\vec{\gamma}\), \(\tau(n, A_0, \vec{\gamma})\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2 \leqslant 1\), for the following method.

Algorithm: For \(t\in \{0, \dots, n-1\}\):

\begin{eqnarray} y_{t+1} & = & (1-\alpha_{t} ) x_{t} + \alpha_{t} v_t \\ x_{t+1} & = & \arg\min_x \left\{f(x)+\frac{1}{2\gamma_t}\|x-y_{t+1}\|^2 \right\}, \\ v_{t+1} & = & v_t + \frac{1}{\alpha_{t}} (x_{t+1}-y_{t+1}) \end{eqnarray}

with

\begin{eqnarray} \alpha_{t} & = & \frac{\sqrt{(A_t \gamma_t)^2 + 4 A_t \gamma_t} - A_t \gamma_t}{2} \\ A_{t+1} & = & (1 - \alpha_{t}) A_t \end{eqnarray}

and \(v_0=x_0\).
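
For illustration only (not the PEPit code), a minimal NumPy sketch of this scheme on an assumed non-smooth convex function \(f(x) = \|x\|_1\) (soft-thresholding prox); all numerical values are illustrative:

import numpy as np

# Accelerated (fast) proximal point method on the illustrative function f(x) = ||x||_1.
A0, n = 5.0, 3
gammas = [(i + 1) / 1.1 for i in range(n)]                          # step-sizes gamma_t
prox_f = lambda v, g: np.sign(v) * np.maximum(np.abs(v) - g, 0.0)   # prox of g * ||.||_1

x = v = np.array([1.0, -0.5, 2.0])         # v_0 = x_0
A = A0
for t in range(n):
    g = gammas[t]
    alpha = (np.sqrt((A * g) ** 2 + 4 * A * g) - A * g) / 2
    y = (1 - alpha) * x + alpha * v        # extrapolated point y_{t+1}
    x_next = prox_f(y, g)                  # proximal step from y_{t+1}
    v = v + (x_next - y) / alpha
    x, A = x_next, (1 - alpha) * A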

Theoretical guarantee: A theoretical upper bound can be found in [1, Theorem 2.3.]:

\[f(x_n)-f_\star \leqslant \frac{4}{A_0 (\sum_{t=0}^{n-1} \sqrt{\gamma_t})^2}\left(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2 \right).\]

References: The accelerated proximal point method was first proposed and analyzed in [1].

[1] O. Güler (1992). New proximal point algorithms for convex minimization. SIAM Journal on Optimization, 2(4):649–664.

Parameters
  • A0 (float) – initial value for parameter A_0.

  • gammas (list) – sequence of step-sizes.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_proximal_point(A0=5, gammas=[(i + 1) / 1.1 for i in range(3)], n=3, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 6x6
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
          function 1 : 20 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.01593113594082973
*** Example file: worst-case performance of optimized gradient method ***
    PEPit guarantee:       f(x_n)-f_* <= 0.0159311  (f(x_0) - f_* + A0/2* ||x_0 - x_*||^2)
    Theoretical guarantee:  f(x_n)-f_* <= 0.0511881  (f(x_0) - f_* + A0/2* ||x_0 - x_*||^2)

1.15. Inexact gradient descent

PEPit.examples.unconstrained_convex_minimization.wc_inexact_gradient_descent(L, mu, epsilon, n, verbose=True)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.

This code computes a worst-case guarantee for the inexact gradient method. That is, it computes the smallest possible \(\tau(n, L, \mu, \varepsilon)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu, \varepsilon) (f(x_0) - f_\star)\]

is valid, where \(x_n\) is the output of the inexact gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\), \(\mu\) and \(\varepsilon\), \(\tau(n, L, \mu, \varepsilon)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star \leqslant 1\).

Algorithm:

\[x_{t+1} = x_t - \gamma d_t\]

with

\[\|d_t - \nabla f(x_t)\| \leqslant \varepsilon \|\nabla f(x_t)\|\]

and

\[\gamma = \frac{2}{L_{\varepsilon} + \mu_{\varepsilon}}\]

where \(L_{\varepsilon} = (1 + \varepsilon) L\) and \(\mu_{\varepsilon} = (1 - \varepsilon) \mu\).
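
For illustration only (not the PEPit code), a minimal NumPy sketch of this inexact scheme on an assumed strongly convex quadratic, where the direction \(d_t\) is the true gradient corrupted by a relative error of magnitude exactly \(\varepsilon\); all numerical values are illustrative:

import numpy as np

# Inexact gradient descent with relative error ||d_t - grad f(x_t)|| <= eps ||grad f(x_t)||.
L, mu, epsilon, n = 1.0, 0.1, 0.1, 2
L_eps, mu_eps = (1 + epsilon) * L, (1 - epsilon) * mu
gamma = 2 / (L_eps + mu_eps)

A = np.diag([L, 0.5, mu])                  # illustrative mu-strongly convex, L-smooth quadratic
grad_f = lambda x: A @ x
rng = np.random.default_rng(0)
x = np.array([1.0, 1.0, 1.0])
for _ in range(n):
    g = grad_f(x)
    noise = rng.standard_normal(g.shape)
    d = g + epsilon * np.linalg.norm(g) * noise / np.linalg.norm(noise)  # ||d - g|| = eps ||g||
    x = x - gamma * d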

Theoretical guarantee:

The tight worst-case guarantee obtained in [1, Theorem 5.3] or [2, Remark 1.6] is

\[f(x_n) - f_\star \leqslant \left(\frac{L_{\varepsilon}-\mu_{\varepsilon}}{L_{\varepsilon}+\mu_{\varepsilon}}\right)^{2n}(f(x_0) - f_\star),\]

where tightness is achieved on simple quadratic functions.

References: The detailed analyses can be found in [1, 2].

[1] E. De Klerk, F. Glineur, A. Taylor (2020). Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation. SIAM Journal on Optimization, 30(3), 2053-2082.

[2] O. Gannot (2021). A frequency-domain analysis of inexact gradient methods. Mathematical Programming (to appear).

Parameters
  • L (float) – the smoothness parameter.

  • mu (float) – the strong convexity parameter.

  • epsilon (float) – level of inaccuracy.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_inexact_gradient_descent(L=1, mu=.1, epsilon=.1, n=2, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 8x8
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 15 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.5188661397616067
*** Example file: worst-case performance of inexact gradient method in distance in function values ***
    PEPit guarantee:             f(x_n)-f_* <= 0.518866 (f(x_0)-f_*)
    Theoretical guarantee:       f(x_n)-f_* <= 0.518917 (f(x_0)-f_*)

1.17. Inexact accelerated gradient

PEPit.examples.unconstrained_convex_minimization.wc_inexact_accelerated_gradient(L, epsilon, n, verbose=True)[source]

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code computes a worst-case guarantee for an accelerated gradient method using inexact first-order information. That is, it computes the smallest possible \(\tau(n, L, \varepsilon)\) such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \varepsilon) \|x_0 - x_\star\|^2\]

is valid, where \(x_n\) is the output of inexact accelerated gradient descent and where \(x_\star\) is a minimizer of \(f\).

The inexact descent direction is assumed to satisfy a relative inaccuracy described by (with \(0\leqslant \varepsilon \leqslant 1\))

\[\|\nabla f(y_t) - d_t\| \leqslant \varepsilon \|\nabla f(y_t)\|,\]

where \(\nabla f(y_t)\) is the true gradient at \(y_t\) and \(d_t\) is the approximate descent direction that is used.

Algorithm: The inexact accelerated gradient method of this example is provided by

\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} d_t\\ y_{t+1} & = & x_{t+1} + \frac{t-1}{t+2} (x_{t+1} - x_t). \end{eqnarray}
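
For illustration only (not the PEPit code), a minimal NumPy sketch of this inexact accelerated scheme on an assumed \(L\)-smooth convex quadratic, assuming \(y_0 = x_0\); all numerical values are illustrative:

import numpy as np

# Inexact accelerated gradient method with ||d_t - grad f(y_t)|| <= eps ||grad f(y_t)||.
L, epsilon, n = 1.0, 0.1, 5
A = np.diag([L, 0.3, 0.0])                 # illustrative L-smooth convex quadratic
grad_f = lambda y: A @ y
rng = np.random.default_rng(0)
x = y = np.array([1.0, 1.0, 1.0])          # assumed y_0 = x_0
for t in range(n):
    g = grad_f(y)
    noise = rng.standard_normal(g.shape)
    d = g + epsilon * np.linalg.norm(g) * noise / np.linalg.norm(noise)  # inexact direction
    x_next = y - d / L
    y = x_next + (t - 1) / (t + 2) * (x_next - x)
    x = x_next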

Theoretical guarantee: When \(\varepsilon=0\), a tight empirical guarantee can be found in [1, Table 1]:

\[f(x_n)-f_\star \leqslant \frac{2L\|x_0-x_\star\|^2}{n^2 + 5 n + 6},\]

which is achieved on some Huber loss functions (when \(\varepsilon=0\)).

References:

[1] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283–1313.

Parameters
  • L (float) – smoothness parameter.

  • epsilon (float) – level of inaccuracy.

  • n (int) – number of iterations.

  • verbose (bool) – if True, print conclusion.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_inexact_accelerated_gradient(L=1, epsilon=0.1, n=5, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 13x13
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 47 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.03944038534724904
*** Example file: worst-case performance of inexact accelerated gradient method ***
    PEPit guarantee:                             f(x_n)-f_* <= 0.0394404 (f(x_0)-f_*)
    Theoretical guarantee for epsilon = 0 :      f(x_n)-f_* <= 0.0357143 (f(x_0)-f_*)