1 Unconstrained convex minimization
1.1 Gradient descent
- PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and convex.
This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \gamma) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\), \(L\), and \(\gamma\), \(\tau(n, L, \gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: Gradient descent is described by
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]where \(\gamma\) is a step-size.
Theoretical guarantee: When \(\gamma \leqslant \frac{1}{L}\), the tight theoretical guarantee can be found in [1, Theorem 3.1]:
\[f(x_n)-f_\star \leqslant \frac{L}{4nL\gamma+2} \|x_0-x_\star\|^2,\]which is tight on some Huber loss functions.
References:
- Parameters:
L (float) – the smoothness parameter.
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> L = 3 >>> pepit_tau, theoretical_tau = wc_gradient_descent(L=L, gamma=1 / L, n=4, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 30 scalar constraint(s) ... Function 1 : 30 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.1666666649793714 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 6.325029629397552e-10 All the primal scalar constraints are verified up to an error of 6.63361402267193e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 7.0696174975682125e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 7.547097551421911e-08 (PEPit) Final upper bound (dual): 0.1666666733194192 and lower bound (primal example): 0.1666666649793714 (PEPit) Duality gap: absolute: 8.340047791266514e-09 and relative: 5.004028725419552e-08 *** Example file: worst-case performance of gradient descent with fixed step-sizes *** PEPit guarantee: f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2
1.2 Subgradient method
- PEPit.examples.unconstrained_convex_minimization.wc_subgradient_method(M, n, gamma, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is convex and \(M\)-Lipschitz. This problem is a (possibly non-smooth) minimization problem.
This code computes a worst-case guarantee for the subgradient method. That is, it computes the smallest possible \(\tau(n, M, \gamma)\) such that the guarantee
\[\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star \leqslant \tau(n, M, \gamma)\]is valid, where \(x_t\) are the iterates of the subgradient method after \(t\leqslant n\) steps, where \(x_\star\) is a minimizer of \(f\), and when \(\|x_0-x_\star\|\leqslant 1\).
In short, for given values of \(M\), the step-size \(\gamma\) and the number of iterations \(n\), \(\tau(n, M, \gamma)\) is computed as the worst-case value of \(\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star\) when \(\|x_0-x_\star\| \leqslant 1\).
Algorithm: For \(t\in \{0, \dots, n-1 \}\)
\begin{eqnarray} g_{t} & \in & \partial f(x_t) \\ x_{t+1} & = & x_t - \gamma g_t \end{eqnarray}Theoretical guarantee: The tight bound is obtained in [1, Section 3.2.3] and [2, Eq (2)]
\[\min_{0 \leqslant t \leqslant n} f(x_t)- f(x_\star) \leqslant \frac{M}{\sqrt{n+1}}\|x_0-x_\star\|,\]and tightness follows from the lower complexity bound for this class of problems, e.g., [3, Appendix A].
References: Classical references on this topic include [1, 2].
[2] S. Boyd, L. Xiao, A. Mutapcic (2003). Subgradient Methods (lecture notes).
- Parameters:
M (float) – the Lipschitz parameter.
n (int) – the number of iterations.
gamma (float) – step-size.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> M = 2 >>> n = 6 >>> gamma = 1 / (M * sqrt(n + 1)) >>> pepit_tau, theoretical_tau = wc_subgradient_method(M=M, n=n, gamma=gamma, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 9x9 (PEPit) Setting up the problem: performance measure is the minimum of 7 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 64 scalar constraint(s) ... Function 1 : 64 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.7559287513714107 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 1.0475405061169397e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 8.251377031711948e-08 (PEPit) Final upper bound (dual): 0.7559287543573815 and lower bound (primal example): 0.7559287513714107 (PEPit) Duality gap: absolute: 2.9859708039481347e-09 and relative: 3.950069101791627e-09 *** Example file: worst-case performance of subgradient method *** PEPit guarantee: min_(0 \leq t \leq n) f(x_i) - f_* <= 0.755929 ||x_0 - x_*|| Theoretical guarantee: min_(0 \leq t \leq n) f(x_i) - f_* <= 0.755929 ||x_0 - x_*||
1.3 Subgradient method under restricted secant inequality and error bound
- PEPit.examples.unconstrained_convex_minimization.wc_subgradient_method_rsi_eb(mu, L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) verifies the “lower” restricted secant inequality (\(\mu-\text{RSI}^-\)) and the “upper” error bound (\(L-\text{EB}^+\)) [1].
This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, \mu, L, \gamma)\) such that the guarantee
\[\| x_n - x_\star \|^2 \leqslant \tau(n, \mu, L, \gamma) \| x_0 - x_\star \|^2\]is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\), \(L\), and \(\gamma\), \(\tau(n, \mu, L, \gamma)\) is computed as the worst-case value of \(\| x_n - x_\star \|^2\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: Sub-gradient descent is described by
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]where \(\gamma\) is a step-size.
Theoretical guarantee: The tight theoretical guarantee can be found in [1, Prop 1] (upper bound) and [1, Theorem 2] (lower bound):
\[\| x_n - x_\star \|^2 \leqslant (1 - 2\gamma\mu + L^2 \gamma^2)^n \|x_0-x_\star\|^2.\]References: Definition and convergence guarantees can be found in [1].
- Parameters:
mu (float) – the rsi parameter
L (float) – the eb parameter.
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> mu = .1 >>> L = 1 >>> pepit_tau, theoretical_tau = wc_subgradient_method_rsi_eb(mu=mu, L=L, gamma=mu / L ** 2, n=4, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 6x6 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 8 scalar constraint(s) ... Function 1 : 8 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.9605960099986828 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite up to an error of 6.555981041081373e-24 All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.8993396496580087e-12 (PEPit) Final upper bound (dual): 0.9605960099988382 and lower bound (primal example): 0.9605960099986828 (PEPit) Duality gap: absolute: 1.5543122344752192e-13 and relative: 1.6180706751814954e-13 *** Example file: worst-case performance of gradient descent with fixed step-sizes *** PEPit guarantee: f(x_n)-f_* <= 0.960596 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.960596 ||x_0 - x_*||^2
1.4 Gradient descent with exact line search
- PEPit.examples.unconstrained_convex_minimization.wc_gradient_exact_line_search(L, mu, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.
This code computes a worst-case guarantee for the gradient descent (GD) with exact linesearch (ELS). That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) (f(x_0) - f_\star)\]is valid, where \(x_n\) is the output of the GD with ELS, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star \leqslant 1\).
Algorithm: GD with ELS can be written as
\[x_{t+1} = x_t - \gamma_t \nabla f(x_t)\]with \(\gamma_t = \arg\min_{\gamma} f \left( x_t - \gamma \nabla f(x_t) \right)\).
Theoretical guarantee: The tight worst-case guarantee for GD with ELS, obtained in [1, Theorem 1.2], is
\[f(x_n) - f_\star \leqslant \left(\frac{L-\mu}{L+\mu}\right)^{2n} (f(x_0) - f_\star).\]References: The detailed approach (based on convex relaxations) is available in [1], along with theoretical bound.
- Parameters:
L (float) – the smoothness parameter.
mu (float) – the strong convexity parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_gradient_exact_line_search(L=1, mu=.1, n=2, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 12 scalar constraint(s) ... Function 1 : 12 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 1 function(s) Function 1 : Adding 4 scalar constraint(s) ... Function 1 : 4 scalar constraint(s) added (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.44812496858878037 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 1.5179264703173475e-08 All the primal scalar constraints are verified up to an error of 1.861932416580281e-08 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.389540571773174e-07 (PEPit) Final upper bound (dual): 0.4481249692104364 and lower bound (primal example): 0.44812496858878037 (PEPit) Duality gap: absolute: 6.216560044514097e-10 and relative: 1.3872380430153385e-09 *** Example file: worst-case performance of gradient descent with exact linesearch (ELS) *** PEPit guarantee: f(x_n)-f_* <= 0.448125 (f(x_0)-f_*) Theoretical guarantee: f(x_n)-f_* <= 0.448125 (f(x_0)-f_*)
1.5 Conjugate gradient
- PEPit.examples.unconstrained_convex_minimization.wc_conjugate_gradient(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and convex.
This code computes a worst-case guarantee for the conjugate gradient (CG) method (with exact span searches). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0-x_\star\|^2\]is valid, where \(x_n\) is the output of the conjugate gradient method, and where \(x_\star\) is a minimizer of \(f\). In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0-x_\star\|^2 \leqslant 1\).
Algorithm:
\[x_{t+1} = x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i)\]with
\[(\gamma_i)_{i \leqslant t} = \arg\min_{(\gamma_i)_{i \leqslant t}} f \left(x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i) \right)\]Theoretical guarantee:
The tight guarantee obtained in [1] is
\[f(x_n) - f_\star \leqslant\frac{L}{2 \theta_n^2}\|x_0-x_\star\|^2.\]where
\begin{eqnarray} \theta_0 & = & 1 \\ \theta_t & = & \frac{1 + \sqrt{4 \theta_{t-1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \theta_n & = & \frac{1 + \sqrt{8 \theta_{n-1}^2 + 1}}{2}, \end{eqnarray}and tightness follows from [2, Theorem 3].
References: The detailed approach (based on convex relaxations) is available in [1, Corollary 6].
- Parameters:
L (float) – the smoothness parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_conjugate_gradient(L=1, n=2, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 12 scalar constraint(s) ... Function 1 : 12 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 1 function(s) Function 1 : Adding 6 scalar constraint(s) ... Function 1 : 6 scalar constraint(s) added (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.06189419648705603 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 4.094038231215726e-09 All the primal scalar constraints are verified up to an error of 7.438133589957041e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 8.571101560090838e-08 (PEPit) Final upper bound (dual): 0.06189420236651673 and lower bound (primal example): 0.06189419648705603 (PEPit) Duality gap: absolute: 5.879460696078809e-09 and relative: 9.49921160590296e-08 *** Example file: worst-case performance of conjugate gradient method *** PEPit guarantee: f(x_n)-f_* <= 0.0618942 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.0618942 ||x_0 - x_*||^2
1.6 Heavy Ball momentum
- PEPit.examples.unconstrained_convex_minimization.wc_heavy_ball_momentum(mu, L, alpha, beta, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.
This code computes a worst-case guarantee for the Heavy-ball (HB) method, aka Polyak momentum method. That is, it computes the smallest possible \(\tau(n, L, \mu, \alpha, \beta)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu, \alpha, \beta) (f(x_0) - f_\star)\]is valid, where \(x_n\) is the output of the Heavy-ball (HB) method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu, \alpha, \beta)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star \leqslant 1\).
Algorithm:
\[x_{t+1} = x_t - \alpha \nabla f(x_t) + \beta (x_t-x_{t-1})\]with
\[\alpha \in (0, \frac{1}{L}]\]and
\[\beta = \sqrt{(1 - \alpha \mu)(1 - L \alpha)}\]Theoretical guarantee:
The upper guarantee obtained in [2, Theorem 4] is
\[f(x_n) - f_\star \leqslant (1 - \alpha \mu)^n (f(x_0) - f_\star).\]References: This methods was first introduce in [1, Section 2], and convergence upper bound was proven in [2, Theorem 4].
- Parameters:
L (float) – the smoothness parameter.
mu (float) – the strong convexity parameter.
alpha (float) – parameter of the scheme.
beta (float) – parameter of the scheme such that \(0<\beta<1\) and \(0<\alpha<2(1+\beta)\).
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> mu = 0.1 >>> L = 1. >>> alpha = 1 / (2 * L) # alpha \in [0, 1 / L] >>> beta = sqrt((1 - alpha * mu) * (1 - L * alpha)) >>> pepit_tau, theoretical_tau = wc_heavy_ball_momentum(mu=mu, L=L, alpha=alpha, beta=beta, n=2, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 5x5 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 12 scalar constraint(s) ... Function 1 : 12 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.7534930184722486 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 8.447704549549468e-09 All the primal scalar constraints are verified up to an error of 2.3525573294991275e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.1217055700431244e-07 (PEPit) Final upper bound (dual): 0.753493016980345 and lower bound (primal example): 0.7534930184722486 (PEPit) Duality gap: absolute: -1.4919036006588726e-09 and relative: -1.9799833098437924e-09 *** Example file: worst-case performance of the Heavy-Ball method *** PEPit guarantee: f(x_n)-f_* <= 0.753493 (f(x_0) - f(x_*)) Theoretical guarantee: f(x_n)-f_* <= 0.9025 (f(x_0) - f(x_*))
1.7 Accelerated gradient for convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_accelerated_gradient_convex(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex (\(\mu\) is possibly 0).
This code computes a worst-case guarantee for an accelerated gradient method, a.k.a. fast gradient method [1]. That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of the accelerated gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: Initialize \(\lambda_1=1\), \(y_1=x_0\). One iteration of accelerated gradient method is described by
\[\begin{split}\begin{eqnarray} \text{Set: }\lambda_{t+1} & = & \frac{1 + \sqrt{4\lambda_t^2 + 1}}{2} \\ x_{t} & = & y_t - \frac{1}{L} \nabla f(y_t),\\ y_{t+1} & = & x_{t} + \frac{\lambda_t-1}{\lambda_{t+1}} (x_t-x_{t-1}). \end{eqnarray}\end{split}\]Theoretical guarantee: The following worst-case guarantee can be found in e.g., [2, Theorem 4.4]:
\[f(x_n)-f_\star \leqslant \frac{L}{2}\frac{\|x_0-x_\star\|^2}{\lambda_n^2}.\]References:
- Parameters:
mu (float) – the strong convexity parameter
L (float) – the smoothness parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_convex(mu=0, L=1, n=1, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 4x4 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 6 scalar constraint(s) ... Function 1 : 6 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.16666666115098375 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 4.82087966328108e-09 All the primal scalar constraints are verified up to an error of 3.6200406144937247e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.101096412994053e-08 (PEPit) Final upper bound (dual): 0.16666666498582347 and lower bound (primal example): 0.16666666115098375 (PEPit) Duality gap: absolute: 3.834839723548811e-09 and relative: 2.3009039102756247e-08 *** Example file: worst-case performance of accelerated gradient method *** PEPit guarantee: f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.5 ||x_0 - x_*||^2
1.8 Simplified accelerated gradient for convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_accelerated_gradient_convex_simplified(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex (\(\mu\) is possibly 0).
This code computes a worst-case guarantee for an accelerated gradient method, a.k.a. fast gradient method with a set of classical slightly simplified sets of coefficients compared to the original [1]. That is, the code computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of the accelerated gradient method below, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: The accelerated gradient method of this example is provided by
\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \\ y_{t+1} & = & x_{t+1} + \frac{t-1}{t+2} (x_{t+1} - x_t). \end{eqnarray}Theoretical guarantee: When \(\mu=0\), a tight empirical guarantee can be found in [2, Table 1]:
\[f(x_n)-f_\star \leqslant \frac{2L\|x_0-x_\star\|^2}{n^2 + 5 n + 6},\]where tightness is obtained on some Huber loss functions.
References:
- Parameters:
mu (float) – the strong convexity parameter
L (float) – the smoothness parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_convex_simplified(mu=0, L=1, n=1, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 4x4 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 6 scalar constraint(s) ... Function 1 : 6 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.16666666115098375 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 4.82087966328108e-09 All the primal scalar constraints are verified up to an error of 3.6200406144937247e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.101096412994053e-08 (PEPit) Final upper bound (dual): 0.16666666498582347 and lower bound (primal example): 0.16666666115098375 (PEPit) Duality gap: absolute: 3.834839723548811e-09 and relative: 2.3009039102756247e-08 *** Example file: worst-case performance of accelerated gradient method *** PEPit guarantee: f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.166667 ||x_0 - x_*||^2
1.9 Accelerated gradient for strongly convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_accelerated_gradient_strongly_convex(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.
This code computes a worst-case guarantee for an accelerated gradient method, a.k.a fast gradient method. That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \left(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2\right),\]is valid, where \(x_n\) is the output of the accelerated gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: For \(t \in \{0, \dots, n-1\}\),
\begin{eqnarray} y_t & = & x_t + \frac{\sqrt{L} - \sqrt{\mu}}{\sqrt{L} + \sqrt{\mu}}(x_t - x_{t-1}) \\ x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \end{eqnarray}with \(x_{-1}:= x_0\).
Theoretical guarantee:
The following upper guarantee can be found in [1, Corollary 4.15]:
\[f(x_n)-f_\star \leqslant \left(1 - \sqrt{\frac{\mu}{L}}\right)^n \left(f(x_0) - f(x_\star) + \frac{\mu}{2}\|x_0 - x_\star\|^2\right).\]References:
- Parameters:
mu (float) – the strong convexity parameter
L (float) – the smoothness parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_strongly_convex(mu=0.1, L=1, n=2, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 5x5 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 12 scalar constraint(s) ... Function 1 : 12 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.3476022263165957 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 1.9077689453863704e-09 All the primal scalar constraints are verified up to an error of 1.349111329596031e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.1911053831538528e-08 (PEPit) Final upper bound (dual): 0.347602226831412 and lower bound (primal example): 0.3476022263165957 (PEPit) Duality gap: absolute: 5.148163007007156e-10 and relative: 1.4810500673601024e-09 *** Example file: worst-case performance of the accelerated gradient method *** PEPit guarantee: f(x_n)-f_* <= 0.347602 (f(x_0) - f(x_*) + mu/2*||x_0 - x_*||**2) Theoretical guarantee: f(x_n)-f_* <= 0.467544 (f(x_0) - f(x_*) + mu/2*||x_0 - x_*||**2)
1.10 Optimized gradient
- PEPit.examples.unconstrained_convex_minimization.wc_optimized_gradient(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and convex.
This code computes a worst-case guarantee for optimized gradient method (OGM). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of OGM and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: The optimized gradient method is described by
\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t)\\ y_{t+1} & = & x_{t+1} + \frac{\theta_{t}-1}{\theta_{t+1}}(x_{t+1}-x_t)+\frac{\theta_{t}}{\theta_{t+1}}(x_{t+1}-y_t), \end{eqnarray}with
\begin{eqnarray} \theta_0 & = & 1 \\ \theta_t & = & \frac{1 + \sqrt{4 \theta_{t-1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \theta_n & = & \frac{1 + \sqrt{8 \theta_{n-1}^2 + 1}}{2}. \end{eqnarray}Theoretical guarantee: The tight theoretical guarantee can be found in [2, Theorem 2]:
\[f(x_n)-f_\star \leqslant \frac{L\|x_0-x_\star\|^2}{2\theta_n^2},\]where tightness follows from [3, Theorem 3].
References: The optimized gradient method was developed in [1, 2]; the corresponding lower bound was first obtained in [3].
- Parameters:
L (float) – the smoothness parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_optimized_gradient(L=3, n=4, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 30 scalar constraint(s) ... Function 1 : 30 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.07675182658909867 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 4.2418929147154685e-09 All the primal scalar constraints are verified up to an error of 4.209824447376498e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 2.3554960855935635e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.7023258253444443e-08 (PEPit) Final upper bound (dual): 0.07675183030228215 and lower bound (primal example): 0.07675182658909867 (PEPit) Duality gap: absolute: 3.7131834829118304e-09 and relative: 4.8379089435758487e-08 *** Example file: worst-case performance of optimized gradient method *** PEPit guarantee: f(y_n)-f_* <= 0.0767518 ||x_0 - x_*||^2 Theoretical guarantee: f(y_n)-f_* <= 0.0767518 ||x_0 - x_*||^2
1.11 Optimized gradient for gradient
- PEPit.examples.unconstrained_convex_minimization.wc_optimized_gradient_for_gradient(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and convex.
This code computes a worst-case guarantee for optimized gradient method for gradient (OGM-G). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee
\[\|\nabla f(x_n)\|^2 \leqslant \tau(n, L) (f(x_0) - f_\star)\]is valid, where \(x_n\) is the output of OGM-G and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(\|\nabla f(x_n)\|^2\) when \(f(x_0)-f_\star \leqslant 1\).
Algorithm: For \(t\in\{0,1,\ldots,n-1\}\), the optimized gradient method for gradient [1, Section 6.3] is described by
\begin{eqnarray} y_{t+1} & = & x_t - \frac{1}{L} \nabla f(x_t),\\ x_{t+1} & = & y_{t+1} + \frac{(\tilde{\theta}_t-1)(2\tilde{\theta}_{t+1}-1)}{\tilde{\theta}_t(2\tilde{\theta}_t-1)}(y_{t+1}-y_t)+\frac{2\tilde{\theta}_{t+1}-1}{2\tilde{\theta}_t-1}(y_{t+1}-x_t), \end{eqnarray}with
\begin{eqnarray} \tilde{\theta}_n & = & 1 \\ \tilde{\theta}_t & = & \frac{1 + \sqrt{4 \tilde{\theta}_{t+1}^2 + 1}}{2}, \forall t \in [|1, n-1|] \\ \tilde{\theta}_0 & = & \frac{1 + \sqrt{8 \tilde{\theta}_{1}^2 + 1}}{2}. \end{eqnarray}Theoretical guarantee: The tight worst-case guarantee can be found in [1, Theorem 6.1]:
\[\|\nabla f(x_n)\|^2 \leqslant \frac{2L(f(x_0)-f_\star)}{\tilde{\theta}_0^2},\]where tightness is achieved on Huber losses, see [1, Section 6.4].
References: The optimized gradient method for gradient was developed in [1].
- Parameters:
L (float) – the smoothness parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_optimized_gradient_for_gradient(L=3, n=4, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 30 scalar constraint(s) ... Function 1 : 30 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.3070073046098571 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 1.851027664478977e-11 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 1.8972117293793012e-11 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.0435452497813057e-10 (PEPit) Final upper bound (dual): 0.30700730461343734 and lower bound (primal example): 0.3070073046098571 (PEPit) Duality gap: absolute: 3.580247209811205e-12 and relative: 1.1661765554278782e-11 *** Example file: worst-case performance of optimized gradient method for gradient *** PEP-it guarantee: ||f'(x_n)||^2 <= 0.307007 (f(x_0) - f_*) Theoretical guarantee: ||f'(x_n)||^2 <= 0.307007 (f(x_0) - f_*)
1.12 Robust momentum
- PEPit.examples.unconstrained_convex_minimization.wc_robust_momentum(mu, L, lam, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly-convex.
This code computes a worst-case guarantee for the robust momentum method (RMM). That is, it computes the smallest possible \(\tau(n, \mu, L, \lambda)\) such that the guarantee
\[v(x_{n+1}) \leqslant \tau(n, \mu, L, \lambda) v(x_{n}),\]is valid, where \(x_n\) is the \(n^{\mathrm{th}}\) iterate of the RMM, and \(x_\star\) is a minimizer of \(f\). The function \(v(.)\) is a well-chosen Lyapunov defined as follows,
\begin{eqnarray} v(x_t) & = & l\|z_t - x_\star\|^2 + q_t, \\ q_t & = & (L - \mu) \left(f(x_t) - f_\star - \frac{\mu}{2}\|y_t - x_\star\|^2 - \frac{1}{2}\|\nabla f(y_t) - \mu (y_t - x_\star)\|^2 \right), \end{eqnarray}with \(\kappa = \frac{\mu}{L}\), \(\rho = \lambda (1 - \frac{1}{\kappa}) + (1 - \lambda) \left(1 - \frac{1}{\sqrt{\kappa}}\right)\), and \(l = \mu^2 \frac{\kappa - \kappa \rho^2 - 1}{2 \rho (1 - \rho)}\).
Algorithm:
For \(t \in \{0, \dots, n-1\}\),
\begin{eqnarray} x_{t+1} & = & x_{t} + \beta (x_t - x_{t-1}) - \alpha \nabla f(y_t), \\ y_{t+1} & = & y_{t} + \gamma (x_t - x_{t-1}), \end{eqnarray}with \(x_{-1}, x_0 \in \mathrm{R}^d\), and with parameters \(\alpha = \frac{\kappa (1 - \rho^2)(1 + \rho)}{L}\), \(\beta = \frac{\kappa \rho^3}{\kappa - 1}\), \(\gamma = \frac{\rho^2}{(\kappa - 1)(1 - \rho)^2(1 + \rho)}\).
Theoretical guarantee:
A convergence guarantee (empirically tight) is obtained in [1, Theorem 1],
\[v(x_{n+1}) \leqslant \rho^2 v(x_n),\]with \(\rho = \lambda (1 - \frac{1}{\kappa}) + (1 - \lambda) \left(1 - \frac{1}{\sqrt{\kappa}}\right)\).
References:
- Parameters:
L (float) – the smoothness parameter.
mu (float) – the strong convexity parameter.
lam (float) – if \(\lambda=1\) it is the gradient descent, if \(\lambda=0\), it is the Triple Momentum Method.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Examples
>>> pepit_tau, theoretical_tau = wc_robust_momentum(mu=0.1, L=1, lam=0.2, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 5x5 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 6 scalar constraint(s) ... Function 1 : 6 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.5285548454804174 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 6.778431681160021e-08 (PEPit) Final upper bound (dual): 0.5285548474761074 and lower bound (primal example): 0.5285548454804174 (PEPit) Duality gap: absolute: 1.9956899466322398e-09 and relative: 3.7757480868769724e-09 *** Example file: worst-case performance of the Robust Momentum Method *** PEPit guarantee: v(x_(n+1)) <= 0.528555 v(x_n) Theoretical guarantee: v(x_(n+1)) <= 0.528555 v(x_n)
1.13 Triple momentum
- PEPit.examples.unconstrained_convex_minimization.wc_triple_momentum(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.
This code computes a worst-case guarantee for triple momentum method (TMM). That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of the TMM, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm:
For \(t \in \{ 1, \dots, n\}\)
\begin{eqnarray} \xi_{t+1} &&= (1 + \beta) \xi_{t} - \beta \xi_{t-1} - \alpha \nabla f(y_t) \\ y_{t} &&= (1+\gamma ) \xi_{t} -\gamma \xi_{t-1} \\ x_{t} && = (1 + \delta) \xi_{t} - \delta \xi_{t-1} \end{eqnarray}with
\begin{eqnarray} \kappa &&= \frac{L}{\mu} , \quad \rho = 1- \frac{1}{\sqrt{\kappa}}\\ (\alpha, \beta, \gamma,\delta) && = \left(\frac{1+\rho}{L}, \frac{\rho^2}{2-\rho}, \frac{\rho^2}{(1+\rho)(2-\rho)}, \frac{\rho^2}{1-\rho^2}\right) \end{eqnarray}and
\begin{eqnarray} \xi_{0} = x_0 \\ \xi_{1} = x_0 \\ y = x_0 \end{eqnarray}Theoretical guarantee: A theoretical upper (empirically tight) bound can be found in [1, Theorem 1, eq. 4]:
\[f(x_n)-f_\star \leqslant \frac{\rho^{2(n+1)} L \kappa}{2}\|x_0 - x_\star\|^2.\]References: The triple momentum method was discovered and analyzed in [1].
- Parameters:
L (float) – the smoothness parameter.
mu (float) – the strong convexity parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_triple_momentum(mu=0.1, L=1., n=4, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 30 scalar constraint(s) ... Function 1 : 30 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.23892507610108352 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 1.2413384091344622e-08 All the primal scalar constraints are verified up to an error of 2.3063312517808757e-08 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 2.1341759240263788e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 9.962063006166431e-08 (PEPit) Final upper bound (dual): 0.23892508261738885 and lower bound (primal example): 0.23892507610108352 (PEPit) Duality gap: absolute: 6.516305328663208e-09 and relative: 2.7273425774305555e-08 *** Example file: worst-case performance of the Triple Momentum Method *** PEPit guarantee: f(x_n)-f_* <= 0.238925 ||x_0-x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.238925 ||x_0-x_*||^2
1.14 Information theoretic exact method
- PEPit.examples.unconstrained_convex_minimization.wc_information_theoretic(mu, L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex (\(\mu\) is possibly 0).
This code computes a worst-case guarantee for the information theoretic exact method (ITEM). That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee
\[\|z_n - x_\star\|^2 \leqslant \tau(n, L, \mu) \|z_0 - x_\star\|^2\]is valid, where \(z_n\) is the output of the ITEM, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(\|z_n - x_\star\|^2\) when \(\|z_0 - x_\star\|^2 \leqslant 1\).
Algorithm: For \(t\in\{0,1,\ldots,n-1\}\), the information theoretic exact method of this example is provided by
\begin{eqnarray} y_{t} & = & (1-\beta_t) z_t + \beta_t x_t \\ x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \\ z_{t+1} & = & \left(1-q\delta_t\right) z_t+q\delta_t y_t-\frac{\delta_t}{L}\nabla f(y_t), \end{eqnarray}with \(y_{-1}=x_0=z_0\), \(q=\frac{\mu}{L}\) (inverse condition ratio), and the scalar sequences:
\begin{eqnarray} A_{t+1} & = & \frac{(1+q)A_t+2\left(1+\sqrt{(1+A_t)(1+qA_t)}\right)}{(1-q)^2},\\ \beta_{t+1} & = & \frac{A_t}{(1-q)A_{t+1}},\\ \delta_{t+1} & = & \frac{1}{2}\frac{(1-q)^2A_{t+1}-(1+q)A_t}{1+q+q A_t}, \end{eqnarray}with \(A_0=0\).
Theoretical guarantee: A tight worst-case guarantee can be found in [1, Theorem 3]:
\[\|z_n - x_\star\|^2 \leqslant \frac{1}{1+q A_n} \|z_0-x_\star\|^2,\]where tightness is obtained on some quadratic loss functions (see [1, Lemma 2]).
References:
- Parameters:
mu (float) – the strong convexity parameter.
L (float) – the smoothness parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_information_theoretic(mu=.001, L=1, n=15, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 17x17 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 240 scalar constraint(s) ... Function 1 : 240 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.7566088333863759 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 2.649056531701061e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 5.090066388573269e-05 (PEPit) Final upper bound (dual): 0.7566088219545406 and lower bound (primal example): 0.7566088333863759 (PEPit) Duality gap: absolute: -1.1431835256203726e-08 and relative: -1.5109307150219134e-08 *** Example file: worst-case performance of the information theoretic exact method *** PEP-it guarantee: ||z_n - x_* ||^2 <= 0.756609 ||z_0 - x_*||^2 Theoretical guarantee: ||z_n - x_* ||^2 <= 0.756605 ||z_0 - x_*||^2
1.15 Cyclic coordinate descent
- PEPit.examples.unconstrained_convex_minimization.wc_cyclic_coordinate_descent(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth by blocks (with \(d\) blocks) and convex.
This code computes a worst-case guarantee for cyclic coordinate descent with fixed step-sizes \(1/L_i\). That is, it computes the smallest possible \(\tau(n, d, L)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, d, L) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of cyclic coordinate descent with fixed step-sizes \(1/L_i\), and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\), \(L\), and \(d\), \(\tau(n, d, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: Cyclic coordinate descent is described by
\[x_{t+1} = x_t - \frac{1}{L_{i_t}} \nabla_{i_t} f(x_t),\]where \(L_{i_t}\) is the Lipschitz constant of the block \(i_t\), and where \(i_t\) follows a prescribed ordering.
References:
- Parameters:
L (list) – list of floats, smoothness parameters (for each block).
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – None
Example
>>> L = [1., 2., 10.] >>> pepit_tau, theoretical_tau = wc_cyclic_coordinate_descent(L=L, n=9, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 34x34 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 330 scalar constraint(s) ... Function 1 : 330 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Setting up the problem: 1 partition(s) added Partition 1 with 3 blocks: Adding 363 scalar constraint(s)... Partition 1 with 3 blocks: 363 scalar constraint(s) added (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 1.489275837101339 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 7.746787768677001e-09 All the primal scalar constraints are verified up to an error of 7.456650408244059e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 7.838891370416837e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.804651942291628e-07 (PEPit) Final upper bound (dual): 1.4892758371735533 and lower bound (primal example): 1.489275837101339 (PEPit) Duality gap: absolute: 7.221423459213838e-11 and relative: 4.8489495896672164e-11 *** Example file: worst-case performance of cyclic coordinate descent with fixed step-sizes *** PEPit guarantee: f(x_n)-f_* <= 1.48928 ||x_0 - x_*||^2
1.16 Proximal point
- PEPit.examples.unconstrained_convex_minimization.wc_proximal_point(gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is closed, proper, and convex (and potentially non-smooth).
This code computes a worst-case guarantee for the proximal point method with step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n,\gamma)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, \gamma) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of the proximal point method, and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\) and \(\gamma\), \(\tau(n,\gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm:
The proximal point method is described by
\[x_{t+1} = \arg\min_x \left\{f(x)+\frac{1}{2\gamma}\|x-x_t\|^2 \right\},\]where \(\gamma\) is a step-size.
Theoretical guarantee:
The tight theoretical guarantee can be found in [1, Theorem 4.1]:
\[f(x_n)-f_\star \leqslant \frac{\|x_0-x_\star\|^2}{4\gamma n},\]where tightness is obtained on, e.g., one-dimensional linear problems on the positive orthant.
References:
- Parameters:
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_proximal_point(gamma=3, n=4, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 6x6 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 20 scalar constraint(s) ... Function 1 : 20 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.02083333568572521 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 3.626657775567276e-09 All the primal scalar constraints are verified up to an error of 1.13861532216597e-08 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.0463811740572e-08 (PEPit) Final upper bound (dual): 0.020833337068521363 and lower bound (primal example): 0.02083333568572521 (PEPit) Duality gap: absolute: 1.3827961518886323e-09 and relative: 6.637420779602328e-08 *** Example file: worst-case performance of proximal point method *** PEPit guarantee: f(x_n)-f_* <= 0.0208333 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.0208333 ||x_0 - x_*||^2
1.17 Accelerated proximal point
- PEPit.examples.unconstrained_convex_minimization.wc_accelerated_proximal_point(A0, gammas, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is convex and possibly non-smooth.
This code computes a worst-case guarantee an accelerated proximal point method, aka fast proximal point method (FPP). That is, it computes the smallest possible \(\tau(n, A_0,\vec{\gamma})\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, A_0, \vec{\gamma}) \left(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2\right)\]is valid, where \(x_n\) is the output of FPP (with step-size \(\gamma_t\) at step \(t\in \{0, \dots, n-1\}\)) and where \(x_\star\) is a minimizer of \(f\) and \(A_0\) is a positive number.
In short, for given values of \(n\), \(A_0\) and \(\vec{\gamma}\), \(\tau(n)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2 \leqslant 1\), for the following method.
Algorithm: For \(t\in \{0, \dots, n-1\}\):
\begin{eqnarray} y_{t+1} & = & (1-\alpha_{t} ) x_{t} + \alpha_{t} v_t \\ x_{t+1} & = & \arg\min_x \left\{f(x)+\frac{1}{2\gamma_t}\|x-y_{t+1}\|^2 \right\}, \\ v_{t+1} & = & v_t + \frac{1}{\alpha_{t}} (x_{t+1}-y_{t+1}) \end{eqnarray}with
\begin{eqnarray} \alpha_{t} & = & \frac{\sqrt{(A_t \gamma_t)^2 + 4 A_t \gamma_t} - A_t \gamma_t}{2} \\ A_{t+1} & = & (1 - \alpha_{t}) A_t \end{eqnarray}and \(v_0=x_0\).
Theoretical guarantee: A theoretical upper bound can be found in [1, Theorem 2.3.]:
\[f(x_n)-f_\star \leqslant \frac{4}{A_0 (\sum_{t=0}^{n-1} \sqrt{\gamma_t})^2}\left(f(x_0) - f_\star + \frac{A_0}{2} \|x_0 - x_\star\|^2 \right).\]References: The accelerated proximal point was first obtained and analyzed in [1].
- Parameters:
A0 (float) – initial value for parameter A_0.
gammas (list) – sequence of step-sizes.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_accelerated_proximal_point(A0=5, gammas=[(i + 1) / 1.1 for i in range(3)], n=3, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 6x6 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 20 scalar constraint(s) ... Function 1 : 20 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.01593114892329127 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 3.7131214110874085e-10 All the primal scalar constraints are verified up to an error of 1.4460238423330551e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 2.0490532342962685e-10 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.945112502095782e-09 (PEPit) Final upper bound (dual): 0.01593114926394514 and lower bound (primal example): 0.01593114892329127 (PEPit) Duality gap: absolute: 3.4065386969595046e-10 and relative: 2.138288150692735e-08 *** Example file: worst-case performance of fast proximal point method *** PEPit guarantee: f(x_n)-f_* <= 0.0159311 (f(x_0) - f_* + A/2* ||x_0 - x_*||^2) Theoretical guarantee: f(x_n)-f_* <= 0.0511881 (f(x_0) - f_* + A/2* ||x_0 - x_*||^2)
1.18 Inexact gradient descent
- PEPit.examples.unconstrained_convex_minimization.wc_inexact_gradient_descent(L, mu, epsilon, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.
This code computes a worst-case guarantee for the inexact gradient method. That is, it computes the smallest possible \(\tau(n, L, \mu, \varepsilon)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu, \varepsilon) (f(x_0) - f_\star)\]is valid, where \(x_n\) is the output of the inexact gradient method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\), \(L\), \(\mu\) and \(\varepsilon\), \(\tau(n, L, \mu, \varepsilon)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(f(x_0) - f_\star \leqslant 1\).
Algorithm:
\[x_{t+1} = x_t - \gamma d_t\]with
\[\|d_t - \nabla f(x_t)\| \leqslant \varepsilon \|\nabla f(x_t)\|\]and
\[\gamma = \frac{2}{L_{\varepsilon} + \mu_{\varepsilon}}\]where \(L_{\varepsilon} = (1 + \varepsilon) L\) and \(\mu_{\varepsilon} = (1 - \varepsilon) \mu\).
Theoretical guarantee:
The tight worst-case guarantee obtained in [1, Theorem 5.3] or [2, Remark 1.6] is
\[f(x_n) - f_\star \leqslant \left(\frac{L_{\varepsilon}-\mu_{\varepsilon}}{L_{\varepsilon}+\mu_{\varepsilon}}\right)^{2n}(f(x_0) - f_\star),\]where tightness is achieved on simple quadratic functions.
References: The detailed analyses can be found in [1, 2].
- Parameters:
L (float) – the smoothness parameter.
mu (float) – the strong convexity parameter.
epsilon (float) – level of inaccuracy.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_inexact_gradient_descent(L=1, mu=.1, epsilon=.1, n=2, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 12 scalar constraint(s) ... Function 1 : 12 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 1 function(s) Function 1 : Adding 2 scalar constraint(s) ... Function 1 : 2 scalar constraint(s) added (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.5189167024522467 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 3.510283035307193e-09 All the primal scalar constraints are verified up to an error of 9.704962500300951e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.049446752352651e-07 (PEPit) Final upper bound (dual): 0.5189166963250256 and lower bound (primal example): 0.5189167024522467 (PEPit) Duality gap: absolute: -6.127221174878628e-09 and relative: -1.1807716240242786e-08 *** Example file: worst-case performance of inexact gradient method in distance in function values *** PEPit guarantee: f(x_n)-f_* <= 0.518917 (f(x_0)-f_*) Theoretical guarantee: f(x_n)-f_* <= 0.518917 (f(x_0)-f_*)
1.19 Inexact gradient descent with exact line search
- PEPit.examples.unconstrained_convex_minimization.wc_inexact_gradient_exact_line_search(L, mu, epsilon, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\)-strongly convex.
This code computes a worst-case guarantee for an inexact gradient method with exact linesearch (ELS). That is, it computes the smallest possible \(\tau(n, L, \mu, \varepsilon)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \mu, \varepsilon) ( f(x_0) - f_\star )\]is valid, where \(x_n\) is the output of the gradient descent with an inexact descent direction and an exact linesearch, and where \(x_\star\) is the minimizer of \(f\).
The inexact descent direction \(d\) is assumed to satisfy a relative inaccuracy described by (with \(0 \leqslant \varepsilon < 1\))
\[\|\nabla f(x_t) - d_t\| \leqslant \varepsilon \|\nabla f(x_t)\|,\]where \(\nabla f(x_t)\) is the true gradient, and \(d_t\) is the approximate descent direction that is used.
Algorithm:
For \(t \in \{0, \dots, n-1\}\),
\begin{eqnarray} \gamma_t & = & \arg\min_{\gamma \in R^d} f(x_t- \gamma d_t), \\ x_{t+1} & = & x_t - \gamma_t d_t. \end{eqnarray}Theoretical guarantees:
The tight guarantee obtained in [1, Theorem 5.1] is
\[f(x_n) - f_\star\leqslant \left(\frac{L_{\varepsilon} - \mu_{\varepsilon}}{L_{\varepsilon} + \mu_{\varepsilon}}\right)^{2n}( f(x_0) - f_\star ),\]with \(L_{\varepsilon} = (1 + \varepsilon) L\) and \(\mu_{\varepsilon} = (1 - \varepsilon) \mu\). Tightness is achieved on simple quadratic functions.
References: The detailed approach (based on convex relaxations) is available in [1],
- Parameters:
L (float) – the smoothness parameter.
mu (float) – the strong convexity parameter.
epsilon (float) – level of inaccuracy.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_inexact_gradient_exact_line_search(L=1, mu=0.1, epsilon=0.1, n=2, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 9x9 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 12 scalar constraint(s) ... Function 1 : 12 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 1 function(s) Function 1 : Adding 6 scalar constraint(s) ... Function 1 : 6 scalar constraint(s) added (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.5189166579925197 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 7.855711098683355e-09 All the primal scalar constraints are verified up to an error of 2.1750143269771982e-08 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 8.345007454521682e-08 (PEPit) Final upper bound (dual): 0.5189166453888967 and lower bound (primal example): 0.5189166579925197 (PEPit) Duality gap: absolute: -1.2603623034124212e-08 and relative: -2.428833771280839e-08 *** Example file: worst-case performance of inexact gradient descent with exact linesearch *** PEPit guarantee: f(x_n)-f_* <= 0.518917 (f(x_0)-f_*) Theoretical guarantee: f(x_n)-f_* <= 0.518917 (f(x_0)-f_*)
1.20 Inexact accelerated gradient
- PEPit.examples.unconstrained_convex_minimization.wc_inexact_accelerated_gradient(L, epsilon, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and convex.
This code computes a worst-case guarantee for an accelerated gradient method using inexact first-order information. That is, it computes the smallest possible \(\tau(n, L, \varepsilon)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \varepsilon) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of inexact accelerated gradient descent and where \(x_\star\) is a minimizer of \(f\).
The inexact descent direction is assumed to satisfy a relative inaccuracy described by (with \(0\leqslant \varepsilon \leqslant 1\))
\[\|\nabla f(y_t) - d_t\| \leqslant \varepsilon \|\nabla f(y_t)\|,\]where \(\nabla f(y_t)\) is the true gradient at \(y_t\) and \(d_t\) is the approximate descent direction that is used.
Algorithm: The inexact accelerated gradient method of this example is provided by
\begin{eqnarray} x_{t+1} & = & y_t - \frac{1}{L} d_t\\ y_{k+1} & = & x_{t+1} + \frac{t-1}{t+2} (x_{t+1} - x_t). \end{eqnarray}Theoretical guarantee: When \(\varepsilon=0\), a tight empirical guarantee can be found in [1, Table 1]:
\[f(x_n)-f_\star \leqslant \frac{2L\|x_0-x_\star\|^2}{n^2 + 5 n + 6},\]which is achieved on some Huber loss functions (when \(\varepsilon=0\)).
References:
- Parameters:
L (float) – smoothness parameter.
epsilon (float) – level of inaccuracy
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_inexact_accelerated_gradient(L=1, epsilon=0.1, n=5, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 13x13 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 42 scalar constraint(s) ... Function 1 : 42 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 1 function(s) Function 1 : Adding 5 scalar constraint(s) ... Function 1 : 5 scalar constraint(s) added (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.03938804786885548 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 9.664115266791685e-09 All the primal scalar constraints are verified up to an error of 2.759861429581928e-08 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.409073847097795e-07 (PEPit) Final upper bound (dual): 0.03938804252576436 and lower bound (primal example): 0.03938804786885548 (PEPit) Duality gap: absolute: -5.343091122322896e-09 and relative: -1.3565260050746843e-07 *** Example file: worst-case performance of inexact accelerated gradient method *** PEPit guarantee: f(x_n)-f_* <= 0.039388 (f(x_0)-f_*) Theoretical guarantee for epsilon = 0 : f(x_n)-f_* <= 0.0357143 (f(x_0)-f_*)
1.21 Epsilon-subgradient method
- PEPit.examples.unconstrained_convex_minimization.wc_epsilon_subgradient_method(M, n, gamma, eps, R, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is closed, convex, and proper. This problem is a (possibly non-smooth) minimization problem.
This code computes a worst-case guarantee for the \(\varepsilon\) -subgradient method. That is, it computes the smallest possible \(\tau(n, M, \gamma, \varepsilon, R)\) such that the guarantee
\[\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star \leqslant \tau(n, M, \gamma, \varepsilon, R)\]is valid, where \(x_t\) are the iterates of the \(\varepsilon\) -subgradient method after \(t\leqslant n\) steps, where \(x_\star\) is a minimizer of \(f\), where \(M\) is an upper bound on the norm of all \(\varepsilon\)-subgradients encountered, and when \(\|x_0-x_\star\|\leqslant R\).
In short, for given values of \(M\), of the accuracy \(\varepsilon\), of the step-size \(\gamma\), of the initial distance \(R\), and of the number of iterations \(n\), \(\tau(n, M, \gamma, \varepsilon, R)\) is computed as the worst-case value of \(\min_{0 \leqslant t \leqslant n} f(x_t) - f_\star\).
Algorithm: For \(t\in \{0, \dots, n-1 \}\)
\begin{eqnarray} g_{t} & \in & \partial_{\varepsilon} f(x_t) \\ x_{t+1} & = & x_t - \gamma g_t \end{eqnarray}Theoretical guarantee: An upper bound is obtained in [1, Lemma 2]:
\[\min_{0 \leqslant t \leqslant n} f(x_t)- f(x_\star) \leqslant \frac{R^2+2(n+1)\gamma\varepsilon+(n+1) \gamma^2 M^2}{2(n+1) \gamma}.\]References:
- Parameters:
M (float) – the bound on norms of epsilon-subgradients.
n (int) – the number of iterations.
gamma (float) – step-size.
eps (float) – the bound on the value of epsilon (inaccuracy).
R (float) – the bound on initial distance to an optimal solution.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> M, n, eps, R = 2, 6, .1, 1 >>> gamma = 1 / sqrt(n + 1) >>> pepit_tau, theoretical_tau = wc_epsilon_subgradient_method(M=M, n=n, gamma=gamma, eps=eps, R=R, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 21x21 (PEPit) Setting up the problem: performance measure is the minimum of 7 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (14 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 182 scalar constraint(s) ... Function 1 : 182 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 1 function(s) Function 1 : Adding 6 scalar constraint(s) ... Function 1 : 6 scalar constraint(s) added (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 1.0191560420875105 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 8.881784197001252e-16 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 8.140035838944204e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.453710021145397e-07 (PEPit) Final upper bound (dual): 1.0191560447564818 and lower bound (primal example): 1.0191560420875105 (PEPit) Duality gap: absolute: 2.668971266217568e-09 and relative: 2.61880532126443e-09 *** Example file: worst-case performance of the epsilon-subgradient method *** PEPit guarantee: min_(0 <= t <= n) f(x_i) - f_* <= 1.01916 Theoretical guarantee: min_(0 <= t <= n) f(x_i) - f_* <= 1.04491
1.22 Gradient descent for quadratically upper bounded convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_qg_convex(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is quadratically upper bounded (\(\text{QG}^+\) [1]), i.e. \(\forall x, f(x) - f_\star \leqslant \frac{L}{2} \|x-x_\star\|^2\), and convex.
This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L, \gamma) \| x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\), \(L\), and \(\gamma\), \(\tau(n, L, \gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(||x_0 - x_\star||^2 \leqslant 1\).
Algorithm: Gradient descent is described by
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]where \(\gamma\) is a step-size.
Theoretical guarantee: When \(\gamma < \frac{1}{L}\), the lower theoretical guarantee can be found in [1, Theorem 2.2]:
\[f(x_n)-f_\star \leqslant \frac{L}{2}\max\left(\frac{1}{2n L \gamma + 1}, L \gamma\right) \|x_0-x_\star\|^2.\]References:
The detailed approach is available in [1, Theorem 2.2].
- Parameters:
L (float) – the quadratic growth parameter.
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> L = 1 >>> pepit_tau, theoretical_tau = wc_gradient_descent_qg_convex(L=L, gamma=.2 / L, n=4, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 35 scalar constraint(s) ... Function 1 : 35 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.19230769057989414 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 6.488017484373998e-10 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 9.614444025786438e-10 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 9.03742597164331e-09 (PEPit) Final upper bound (dual): 0.19230769156951819 and lower bound (primal example): 0.19230769057989414 (PEPit) Duality gap: absolute: 9.896240493745267e-10 and relative: 5.146045102982441e-09 *** Example file: worst-case performance of gradient descent with fixed step-sizes *** PEPit guarantee: f(x_n)-f_* <= 0.192308 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.192308 ||x_0 - x_*||^2
1.23 Gradient descent with decreasing step sizes for quadratically upper bounded convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_qg_convex_decreasing(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is quadratically upper bounded (\(\text{QG}^+\) [1]), i.e. \(\forall x, f(x) - f_\star \leqslant \frac{L}{2} \|x-x_\star\|^2\), and convex.
This code computes a worst-case guarantee for gradient descent with decreasing step-sizes. That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L) \| x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of gradient descent with decreasing step-sizes, and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(||x_0 - x_\star||^2 \leqslant 1\).
Algorithm: Gradient descent with decreasing step sizes is described by
\[x_{t+1} = x_t - \gamma_t \nabla f(x_t)\]with
\[\gamma_t = \frac{1}{L u_{t+1}}\]where the sequence \(u\) is defined by
\begin{eqnarray} u_0 & = & 1 \\ u_{t} & = & \frac{u_{t-1}}{2} + \sqrt{\left(\frac{u_{t-1}}{2}\right)^2 + 2}, \quad \mathrm{for } t \geq 1 \end{eqnarray}Theoretical guarantee: The tight theoretical guarantee is conjectured in [1, Conjecture A.3]:
\[f(x_n)-f_\star \leqslant \frac{L}{2 u_t} \|x_0-x_\star\|^2.\]Notes:
We verify that \(u_t \sim 2\sqrt{t}\). The step sizes as well as the function values of the iterates decrease as \(O\left( \frac{1}{\sqrt{t}} \right)\).
References:
The detailed approach is available in [1, Appendix A.3].
- Parameters:
L (float) – the quadratic growth parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_gradient_descent_qg_convex_decreasing(L=1, n=6, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 9x9 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 63 scalar constraint(s) ... Function 1 : 63 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.10554738683915078 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 1.3035997525822776e-10 All the primal scalar constraints are verified up to an error of 6.534398439006495e-10 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 4.780906839863336e-10 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 8.07763970309369e-09 (PEPit) Final upper bound (dual): 0.10554738755607095 and lower bound (primal example): 0.10554738683915078 (PEPit) Duality gap: absolute: 7.169201621248789e-10 and relative: 6.7924008693595726e-09 *** Example file: worst-case performance of gradient descent with fixed step-sizes *** PEPit guarantee: f(x_n)-f_* <= 0.105547 ||x_0 - x_*||^2 Theoretical conjecture: f(x_n)-f_* <= 0.105547 ||x_0 - x_*||^2
1.24 Conjugate gradient for quadratically upper bounded convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_conjugate_gradient_qg_convex(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is quadratically upper bounded (\(\text{QG}^+\) [2]), i.e. \(\forall x, f(x) - f_\star \leqslant \frac{L}{2} \|x-x_\star\|^2\), and convex.
This code computes a worst-case guarantee for the conjugate gradient (CG) method (with exact span searches). That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0-x_\star\|^2\]is valid, where \(x_n\) is the output of the conjugate gradient method, and where \(x_\star\) is a minimizer of \(f\). In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0-x_\star\|^2 \leqslant 1\).
Algorithm:
\[x_{t+1} = x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i)\]with
\[(\gamma_i)_{i \leqslant t} = \arg\min_{(\gamma_i)_{i \leqslant t}} f \left(x_t - \sum_{i=0}^t \gamma_i \nabla f(x_i) \right)\]Theoretical guarantee:
The tight guarantee obtained in [2, Theorem 2.3] (lower) and [2, Theorem 2.4] (upper) is
\[f(x_n) - f_\star \leqslant \frac{L}{2 (n + 1)} \|x_0-x_\star\|^2.\]References: The detailed approach (based on convex relaxations) is available in [1, Corollary 6], and the result provided in [2, Theorem 2.4].
- Parameters:
L (float) – the quadratic growth parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_conjugate_gradient_qg_convex(L=1, n=12, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 27x27 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 195 scalar constraint(s) ... Function 1 : 195 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 1 function(s) Function 1 : Adding 156 scalar constraint(s) ... Function 1 : 156 scalar constraint(s) added (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.038461537777153165 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 9.102633063571334e-10 All the primal scalar constraints are verified up to an error of 6.98577099726011e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 6.4879396218187255e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.352447618124301e-07 (PEPit) Final upper bound (dual): 0.03846154590714546 and lower bound (primal example): 0.038461537777153165 (PEPit) Duality gap: absolute: 8.129992297434274e-09 and relative: 2.113798034945871e-07 *** Example file: worst-case performance of conjugate gradient method *** PEPit guarantee: f(x_n)-f_* <= 0.0384615 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.0384615 ||x_0 - x_*||^2
1.25 Heavy Ball momentum for quadratically upper bounded convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_heavy_ball_momentum_qg_convex(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is quadratically upper bounded (\(\text{QG}^+\) [2]), i.e. \(\forall x, f(x) - f_\star \leqslant \frac{L}{2} \|x-x_\star\|^2\), and convex.
This code computes a worst-case guarantee for the Heavy-ball (HB) method, aka Polyak momentum method. That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of the Heavy-ball (HB) method, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(n\) and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm:
This method is described in [1]
\[x_{t+1} = x_t - \alpha_t \nabla f(x_t) + \beta_t (x_t-x_{t-1})\]with
\[\alpha_t = \frac{1}{L} \frac{1}{t+2}\]and
\[\beta_t = \frac{t}{t+2}\]Theoretical guarantee:
The tight guarantee obtained in [2, Theorem 2.3] (lower) and [2, Theorem 2.4] (upper) is
\[f(x_n) - f_\star \leqslant \frac{L}{2}\frac{1}{n+1} \|x_0 - x_\star\|^2.\]References: This methods was first introduce in [1, section 3], and convergence tight bound was proven in [2, Theorem 2.3] (lower) and [2, Theorem 2.4] (upper).
- Parameters:
L (float) – the quadratic growth parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_heavy_ball_momentum_qg_convex(L=1, n=5, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 8x8 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 48 scalar constraint(s) ... Function 1 : 48 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.08333333312677232 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 1.0357759788748311e-10 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 9.410870890813632e-11 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.3134912789184716e-09 (PEPit) Final upper bound (dual): 0.08333333333831308 and lower bound (primal example): 0.08333333312677232 (PEPit) Duality gap: absolute: 2.1154075713347709e-10 and relative: 2.5384890918939593e-09 *** Example file: worst-case performance of the Heavy-Ball method *** PEPit guarantee: f(x_n)-f_* <= 0.0833333 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.0833333 ||x_0 - x_*||^2
1.26 Gradient descent for smooth strongly convex quadratic objective
- PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_quadratics(mu, L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f=\frac{1}{2} x^T Q x\) is \(L\)-smooth and \(\mu\)-strongly convex (i.e. \(\mu I \preceq Q \preceq LI\)).
This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, \mu, L, \gamma)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, \mu, L, \gamma) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of gradient descent with fixed step-size \(\gamma\), and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\), \(\mu\), \(L\), and \(\gamma\), \(\tau(n, L, \gamma)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: Gradient descent is described by
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]where \(\gamma\) is a step-size.
- Theoretical guarantee:
When \(\gamma \leqslant \frac{2}{L}\) and \(0 \leqslant \mu \leqslant L\), the tight theoretical conjecture can be found in [1, Equation (4.17)]:
\[f(x_n)-f_\star \leqslant \frac{L}{2} \max\left\{\alpha(1-\alpha L\gamma)^{2n}, (1-L\gamma)^{2n} \right\} \|x_0-x_\star\|^2,\]
where \(\alpha = \mathrm{proj}_{[\frac{\mu}{L},1]} \left(\frac{1}{L\gamma (2n+1)}\right)\).
References:
- Parameters:
mu (float) – the strong convexity parameter.
L (float) – the smoothness parameter.
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
Level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + CVXPY details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> L = 3. >>> mu = 0.3 >>> pepit_tau, theoretical_tau = wc_gradient_descent_quadratics(mu=mu, L=L, gamma=1 / L, n=4, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 21 scalar constraint(s) ... Function 1 : 21 scalar constraint(s) added Function 1 : Adding 1 lmi constraint(s) ... Size of PSD matrix 1: 6x6 Function 1 : 1 lmi constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.0649573837234624 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 7.166296416653088e-10 All required PSD matrices are indeed positive semi-definite up to an error of 8.5005680002544e-09 All the primal scalar constraints are verified up to an error of 8.619633617978906e-10 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual matrices to lmi are positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.543004576738361 (PEPit) Final upper bound (dual): 0.06495738819334852 and lower bound (primal example): 0.0649573837234624 (PEPit) Duality gap: absolute: 4.4698861279002244e-09 and relative: 6.881259483801711e-08 *** Example file: worst-case performance of gradient descent on quadratics with fixed step-sizes *** PEPit guarantee: f(x_n)-f_* <= 0.0649574 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.0649574 ||x_0 - x_*||^2
1.27 Gradient descent for smooth strongly convex objective with linear mapping
- PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_lc(mug, Lg, typeM, muM, LM, gamma, n, verbose=1)[source]
Consider the convex minimization problem
\[g_\star \triangleq \min_x g(Mx),\]where \(g\) is an \(L_g\)-smooth, mu_g-strongly convex function and \(M\) is a general, symmetric or skew-symmetric matrix with \(\mu_M \leqslant \|M\| \leqslant L_M\).
Note
For general and skew-symmetric matrices, \(\mu_M\) must be set to 0.
This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\) on a function \(F\) defined as the composition of a function \(G\) and a linear operator \(M\). That is, it computes the smallest possible \(\tau(n, \mu_g, L_g, \mu_M, L_M, \gamma)\) such that the guarantee
\[g(Mx_n) - g_\star \leqslant \tau(n, \mu_g, L_g, \mu_M, L_M, \gamma) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of gradient descent run on \(F\) with fixed step-size \(\gamma\), where \(x_\star\) is a minimizer of \(F(x) = g(Mx)\), and where \(g_\star = g(M x_\star)\).
In short, for given values of \(n\), \(\mu_g\), \(L_g\), \(\mu_M\), \(L_M\), and \(\gamma\), \(\tau(n, \mu_g, L_g, \mu_M, L_M, \gamma)\) is computed as the worst-case value of \(g(Mx_n)-g_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: Gradient descent on such a function is described by
\[x_{t+1} = x_t - \gamma M^T \nabla g(Mx_t),\]where \(\gamma\) is a step-size.
Theoretical guarantee: When \(\gamma \leqslant \frac{2}{L}\), \(0 \leqslant \mu_g \leqslant L_g\), and \(0 \leqslant \mu_M \leqslant L_M\), the following tight theoretical guarantee is conjectured in [1, Conjecture 4.2], and the associated lower theoretical guarantee is stated in [1, Conjecture 4.3]:
\[g(Mx_n)-g_\star \leqslant \frac{L}{2} \max\left\{\frac{\kappa_g {M^*}^2}{\kappa_g -1 + (1-\kappa_g {M^*}^2 L \gamma )^{-2n}}, (1-L\gamma)^{2n} \right\} \|x_0-x_\star\|^2,\]where \(L = L_g L_M^2\), \(\kappa_g = \frac{\mu_g}{L_g}\), \(\kappa_M = \frac{\mu_M}{L_M}\), \(M^* = \mathrm{proj}_{[\kappa_M,1]} \left(\sqrt{\frac{h_0}{L\gamma}}\right)\) for \(h_0\) solution of
\[(1-\kappa_g)(1-\kappa_g h_0)^{2n+1} = 1 - (2n+1)\kappa_g h_0.\]References:
- Parameters:
mug (float) – the strong convexity parameter of \(g(y)\).
Lg (float) – the smoothness parameter of \(g(y)\).
typeM (string) – type of matrix \(M\) (“gen”, “sym” or “skew”).
muM (float) – lower bound on \(\|M\|\) (if typeM != “sym”, then muM must be set to zero).
LM (float) – upper bound on \(\|M\|\).
gamma (float) – step-size.
n (int) – number of iterations.
verbose (int) –
Level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + CVXPY details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> Lg = 3.; mug = 0.3 >>> typeM = "gen"; LM = 1.; muM = 0.1 >>> L = Lg*LM**2 >>> pepit_tau, theoretical_tau = wc_gradient_descent_lc(mug = mug, Lg=Lg, typeM=typeM, muM = muM, LM=LM, gamma=1 / L, n=3, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 16x16 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (2 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 3 function(s) Function 1 : Adding 20 scalar constraint(s) ... Function 1 : 20 scalar constraint(s) added Function 2 : Adding 20 scalar constraint(s) ... Function 2 : 20 scalar constraint(s) added Function 2 : Adding 2 lmi constraint(s) ... Size of PSD matrix 1: 5x5 Size of PSD matrix 2: 4x4 Function 2 : 2 lmi constraint(s) added Function 3 : Adding 0 scalar constraint(s) ... Function 3 : 0 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.16380409246384472 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 6.115640539592991e-09 All required PSD matrices are indeed positive semi-definite All the primal scalar constraints are verified up to an error of 1.3721933136925677e-08 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual matrices to lmi are positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 6.541301934937408e-07 (PEPit) Final upper bound (dual): 0.16380386618153198 and lower bound (primal example): 0.16380409246384472 (PEPit) Duality gap: absolute: -2.2628231274857136e-07 and relative: -1.3814203866641304e-06 *** Example file: worst-case performance of gradient descent on g(Mx) with fixed step-sizes *** PEPit guarantee: f(x_n)-f_* <= 0.163804 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.16379 ||x_0 - x_*||^2
1.28 Gradient descent with silver step-size for convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_silver_stepsize_convex(L, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and convex.
This code computes a worst-case guarantee for \(n\) steps of the gradient descent method tuned according to the silver stepsize schedule. That is, it computes the smallest possible \(\tau(n, L)\) such that the guarantee
\[f(x_n) - f_\star \leqslant \tau(n, L) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of gradient descent using the silver step-sizes, and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\), and \(L\), \(\tau(n, L)\) is computed as the worst-case value of \(f(x_n)-f_\star\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: Gradient descent is described by
\[x_{t+1} = x_t - \gamma_t \nabla f(x_t),\]where \(\gamma_t\) is a step-size of the \(t^{th}\) step of the silver step-size schedule described in [1].
Theoretical guarantee: The theoretical guarantee for the convergence rate of the silver stepsize can be found in [1, Theorem 1.1]:
\[f(x_n)-f_\star \leqslant \frac{L}{1 + \sqrt{4(1 + \sqrt{2})^{2k}-3}} \|x_0-x_\star\|^2,\]where \(k\) is such that \(n = 2^k - 1\).
References:
- Parameters:
L (float) – the smoothness parameter.
n (int) – number of iterations (will be reset to the largest power of 2 minus 1 smaller than the provided value).
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
Level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + CVXPY details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_gradient_descent_silver_stepsize_convex(L=10, n=7, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 10x10 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 72 scalar constraint(s) ... Function 1 : 72 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.18421541423068957 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite up to an error of 2.974454346986409e-10 All the primal scalar constraints are verified up to an error of 2.3268307107471298e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 2.421946677534727e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.1629132685347117e-08 (PEPit) Final upper bound (dual): 0.18421541696330856 and lower bound (primal example): 0.18421541423068957 (PEPit) Duality gap: absolute: 2.732618992196123e-09 and relative: 1.4833823779665445e-08 *** Example file: worst-case performance of gradient descent with silver step-sizes *** PEPit guarantee: f(x_n)-f_* <= 0.184215 ||x_0 - x_*||^2 Theoretical guarantee: f(x_n)-f_* <= 0.343775 ||x_0 - x_*||^2
1.29 Gradient descent with silver step-size for strongly convex objective
- PEPit.examples.unconstrained_convex_minimization.wc_gradient_descent_silver_stepsize_strongly_convex(L, mu, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the strongly convex minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and \(\mu\) strongly-convex.
This code computes a worst-case guarantee for \(n\) steps of the gradient descent method tuned according to the silver stepsize schedule. That is, it computes the smallest possible \(\tau(n, L, \mu)\) such that the guarantee
\[\|x_n - x_\star\|^2 \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]is valid, where \(x_n\) is the output of gradient descent using the silver stepsizes, and where \(x_\star\) is a minimizer of \(f\).
In short, for given values of \(n\), \(L\) and \(\mu\), \(\tau(n, L, \mu)\) is computed as the worst-case value of \(\|x_n - x_\star\|^2\) when \(\|x_0 - x_\star\|^2 \leqslant 1\).
Algorithm: Gradient descent is described by
\[x_{t+1} = x_t - \gamma_t \nabla f(x_t),\]where \(\gamma_t\) is a step-size of the \(t^{th}\) step of the silver step-size schedule described in [1].
Theoretical guarantee: The theoretical guarantee for the convergence rate of the silver stepsize can be found in [1, Theorem 4.1]: Let \(n^\star = 2^{\lfloor log_\rho(L/(3\mu)) \rfloor}\).
When \(n \leq n^\star\), the guarantee is given by
\[\|x_n - x_\star\|^2 \leqslant e^{-\frac{n^{\log_2(1 + \sqrt{2})}}{L/\mu}} \|x_0-x_\star\|^2,\]When \(n > n^\star\) the guarantee is given by
\[\|x_n - x_\star\|^2 \leqslant e^{-\frac{n}{n^*} \frac{(n^*)^{\log_2(\rho)}}{L/\mu}} \|x_0-x_\star\|^2\]References:
- Parameters:
L (float) – the smoothness parameter.
mu (float) – the strong convexity parameter.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
Level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + CVXPY details.
- Returns:
pepit_tau (float) – worst-case value
theoretical_tau (float) – theoretical value
Example
>>> pepit_tau, theoretical_tau = wc_gradient_descent_silver_stepsize_strongly_convex(L=3.2, mu=.1, n=8, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 10x10 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 72 scalar constraint(s) ... Function 1 : 72 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.2214496833206523 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 1.726732905965589e-14 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 7.059821779868726e-15 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.8519554936513068e-12 (PEPit) Final upper bound (dual): 0.22144968332064066 and lower bound (primal example): 0.2214496833206523 (PEPit) Duality gap: absolute: -1.1629586182948515e-14 and relative: -5.2515704735098826e-14 *** Example file: worst-case performance of gradient descent with silver step-sizes *** PEPit guarantee: ||x_n - x_*||^2 <= 0.22145 ||x_0 - x_*||^2 Theoretical guarantee: ||x_n - x_*||^2 <= 0.22145 ||x_0 - x_*||^2