10. Low-dimensional worst-case scenarios
10.1. Inexact gradient
- PEPit.examples.low_dimensional_worst_cases_scenarios.wc_inexact_gradient(L, mu, epsilon, n, verbose=True)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]
where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.
This code computes a worst-case guarantee for an inexact gradient method. That is, it computes the smallest possible \(\tau(n,L,\mu,\varepsilon)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu, \varepsilon) (f(x_0) - f_\star)\]
is valid, where \(x_n\) is the output of gradient descent with an inexact descent direction, and where \(x_\star\) is the minimizer of \(f\).
The inexact descent direction is assumed to satisfy a relative inaccuracy described by (with \(0 \leqslant \varepsilon \leqslant 1\))
\[\|\nabla f(x_t) - d_t\| \leqslant \varepsilon \|\nabla f(x_t)\|,\]
where \(\nabla f(x_t)\) is the true gradient, and \(d_t\) is the approximate descent direction that is used.
Algorithm:
The inexact gradient descent under consideration can be written as
\[x_{t+1} = x_t - \frac{2}{L_{\varepsilon} + \mu_{\varepsilon}} d_t,\]
where \(d_t\) is the inexact search direction, \(L_{\varepsilon} = (1 + \varepsilon) L\), and \(\mu_{\varepsilon} = (1 - \varepsilon) \mu\).
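As an illustration, the following minimal sketch (not part of PEPit; the quadratic objective, the starting point, and the particular choice of \(d_t\) are assumptions made for this example) runs the update above on the one-dimensional quadratic \(f(x) = \frac{L}{2} x^2\), with \(d_t = (1 + \varepsilon) \nabla f(x_t)\), a direction that satisfies the relative-inaccuracy condition with equality:

    # Illustrative sketch (hypothetical setup, not PEPit code): inexact gradient
    # descent on f(x) = L/2 * x^2, which is L-smooth and mu-strongly convex
    # whenever mu <= L.
    L, mu, epsilon, n = 1.0, 0.1, 0.1, 2
    L_eps, mu_eps = (1 + epsilon) * L, (1 - epsilon) * mu
    gamma = 2 / (L_eps + mu_eps)     # step size of the update above

    f = lambda x: 0.5 * L * x ** 2
    grad = lambda x: L * x

    x0 = 1.0
    x = x0
    for t in range(n):
        d = (1 + epsilon) * grad(x)  # admissible: ||d - grad f(x)|| = epsilon ||grad f(x)||
        x -= gamma * d
    print(f(x) / f(x0))              # ~0.518917 for these parameters

On this instance each step contracts \(|x_t|\) by exactly \((L_{\varepsilon} - \mu_{\varepsilon}) / (L_{\varepsilon} + \mu_{\varepsilon})\), so the printed ratio attains the worst-case factor given below.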
Theoretical guarantee:
A tight worst-case guarantee obtained in [1, Theorem 5.3] or [2, Remark 1.6] is
\[f(x_n) - f_\star \leqslant \left(\frac{L_{\varepsilon} - \mu_{\varepsilon}}{L_{\varepsilon} + \mu_{\varepsilon}}\right)^{2n} (f(x_0) - f_\star),\]
with \(L_{\varepsilon} = (1 + \varepsilon) L\) and \(\mu_{\varepsilon} = (1 - \varepsilon) \mu\). This guarantee is achieved on one-dimensional quadratic functions.
References: The detailed analyses can be found in [1, 2].
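For concreteness, the factor can be evaluated directly (inexact_gradient_tau is a hypothetical helper name, not part of PEPit); with the parameters of the example below it reproduces the theoretical value reported there:

    def inexact_gradient_tau(L, mu, epsilon, n):
        # Worst-case factor ((L_eps - mu_eps) / (L_eps + mu_eps))**(2 n)
        # from the guarantee above.
        L_eps, mu_eps = (1 + epsilon) * L, (1 - epsilon) * mu
        return ((L_eps - mu_eps) / (L_eps + mu_eps)) ** (2 * n)

    print(inexact_gradient_tau(L=1, mu=0.1, epsilon=0.1, n=2))  # ~0.518917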
- Parameters
L (float) – the smoothness parameter.
mu (float) – the strong convexity parameter.
epsilon (float) – level of relative inaccuracy.
n (int) – number of iterations.
verbose (bool) – if True, print conclusion.
- Returns
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_inexact_gradient(L=1, mu=0.1, epsilon=0.1, n=2, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 8x8
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 15 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.5188661397616067
(PEPit) Postprocessing: applying trace heuristic. Currently 4 eigenvalue(s) > 1e-05 before resolve.
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.5188511226295036
(PEPit) Postprocessing: 3 eigenvalue(s) > 1e-05 after trace heuristic
*** Example file: worst-case performance of inexact gradient ***
        PEPit guarantee:         f(x_n)-f_* <= 0.518851 (f(x_0)-f_*)
        Theoretical guarantee:   f(x_n)-f_* <= 0.518917 (f(x_0)-f_*)
10.2. Optimized gradient
- PEPit.examples.low_dimensional_worst_cases_scenarios.wc_optimized_gradient(L, n, verbose=True)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]
where \(f\) is \(L\)-smooth and convex.
This code computes a worst-case guarantee for the optimized gradient method (OGM), and applies the trace heuristic to try to find a low-dimensional worst-case example on which this guarantee is achieved. That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0 - x_\star\|^2\]
is valid, where \(x_n\) is the output of OGM and where \(x_\star\) is a minimizer of \(f\). Then, the trace heuristic is applied, which yields a one-dimensional function on which the guarantee is achieved.
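The trace heuristic itself is generic: once the performance-estimation SDP has been solved, the problem is re-solved with the worst-case value pinned (up to a tolerance) and the trace of the Gram matrix as the new objective; a small trace promotes few nonzero eigenvalues, hence a low-dimensional worst-case example. Here is a minimal CVXPY sketch on a toy SDP (illustrative assumptions throughout; this is not PEPit's internal formulation):

    import cvxpy as cp
    import numpy as np

    # Toy SDP standing in for a performance-estimation problem (assumption:
    # random symmetric cost, unit bound on the diagonal of the Gram matrix G).
    rng = np.random.default_rng(0)
    d = 6
    C = rng.standard_normal((d, d))
    C = C + C.T

    G = cp.Variable((d, d), PSD=True)
    constraints = [cp.diag(G) <= 1]
    worst_case = cp.Problem(cp.Maximize(cp.trace(C @ G)), constraints)
    worst_case.solve()
    tau = worst_case.value

    # Trace heuristic: keep the worst-case value (up to a tolerance) and
    # minimize trace(G); the re-solved G typically has (much) lower rank.
    low_dim = cp.Problem(cp.Minimize(cp.trace(G)),
                         constraints + [cp.trace(C @ G) >= tau - 1e-6])
    low_dim.solve()
    rank = int((np.linalg.eigvalsh(G.value) > 1e-5).sum())
    print(tau, rank)

This mirrors the "Postprocessing: applying trace heuristic" lines in the solver logs below, where the number of significant eigenvalues drops after the resolve.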
Algorithm:
The optimized gradient method is described by
\begin{eqnarray}
    x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t), \\
    y_{t+1} & = & x_{t+1} + \frac{\theta_t - 1}{\theta_{t+1}} (x_{t+1} - x_t) + \frac{\theta_t}{\theta_{t+1}} (x_{t+1} - y_t),
\end{eqnarray}
with
\begin{eqnarray}
    \theta_0 & = & 1, \\
    \theta_t & = & \frac{1 + \sqrt{4 \theta_{t-1}^2 + 1}}{2}, \quad \forall t \in \{1, \dots, n-1\}, \\
    \theta_n & = & \frac{1 + \sqrt{8 \theta_{n-1}^2 + 1}}{2}.
\end{eqnarray}
Theoretical guarantee:
The tight theoretical guarantee can be found in [2, Theorem 2]:
\[f(x_n) - f_\star \leqslant \frac{L \|x_0 - x_\star\|^2}{2 \theta_n^2}.\]
References: The OGM was developed in [1, 2]. Low-dimensional worst-case functions for OGM were obtained in [3, 4].
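As an illustration, the following standalone sketch (hypothetical setup, not PEPit code; the random quadratic and seed are assumptions made for this example) implements the recursion above and checks the bound; for \(L = 3\) and \(n = 4\), \(L / (2 \theta_n^2) \approx 0.0767518\), matching the example below:

    import numpy as np

    def ogm(grad, L, x0, n):
        # theta_0 = 1, the 4-factor recursion for t = 1, ..., n-1,
        # and the 8-factor rule for the final theta_n.
        theta = [1.0]
        for _ in range(1, n):
            theta.append((1 + np.sqrt(4 * theta[-1] ** 2 + 1)) / 2)
        theta.append((1 + np.sqrt(8 * theta[-1] ** 2 + 1)) / 2)

        x = y = x0
        for t in range(n):
            x_next = y - grad(y) / L
            y = x_next + (theta[t] - 1) / theta[t + 1] * (x_next - x) \
                       + theta[t] / theta[t + 1] * (x_next - y)
            x = x_next
        return y, theta[-1]

    # Sanity check on a random L-smooth convex quadratic f(x) = x^T A x / 2,
    # whose minimum f_* = 0 is attained at x_* = 0.
    rng = np.random.default_rng(0)
    L, n, d = 3.0, 4, 10
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    A = Q @ np.diag(L * rng.uniform(size=d)) @ Q.T  # eigenvalues in [0, L)
    x0 = rng.standard_normal(d)

    y_n, theta_n = ogm(lambda z: A @ z, L, x0, n)
    f = lambda z: 0.5 * z @ A @ z
    bound = L * np.linalg.norm(x0) ** 2 / (2 * theta_n ** 2)
    print(f(y_n), bound)  # bound = 0.0767518... * ||x0 - x_*||^2 for L=3, n=4
    assert f(y_n) <= bound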
- Parameters
L (float) – the smoothness parameter.
n (int) – number of iterations.
verbose (bool) – if True, print conclusion.
- Returns
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_optimized_gradient(L=3, n=4, verbose=True)
(PEPit) Setting up the problem: size of the main PSD matrix: 7x7
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: initial conditions (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
         function 1 : 30 constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 0.07675218017587908
(PEPit) Postprocessing: applying trace heuristic. Currently 6 eigenvalue(s) > 1e-05 before resolve.
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); objective value: 0.0767421794376856
(PEPit) Postprocessing: 1 eigenvalue(s) > 1e-05 after trace heuristic
*** Example file: worst-case performance of optimized gradient method ***
        PEPit guarantee:         f(y_n)-f_* <= 0.0767422 ||x_0 - x_*||^2
        Theoretical guarantee:   f(y_n)-f_* <= 0.0767518 ||x_0 - x_*||^2