7. Potential functions

7.1. Gradient descent Lyapunov 1

PEPit.examples.potential_functions.wc_gradient_descent_lyapunov_1(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code verifies a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it verifies that the Lyapunov (or potential/energy) function

\[V_n \triangleq n (f(x_n) - f_\star) + \frac{L}{2} \|x_n - x_\star\|^2\]

is decreasing along all trajectories and for all smooth convex functions \(f\) (i.e., in the worst case):

\[V_{n+1} \leqslant V_n,\]

where \(x_{n+1}\) is obtained by a gradient step from \(x_{n}\) with fixed step-size \(\gamma=\frac{1}{L}\).

Algorithm: One iteration of gradient descent is described by

\[x_{n+1} = x_n - \gamma \nabla f(x_n),\]

where \(\gamma\) is a step-size.

Theoretical guarantee: The theoretical guarantee can be found in, e.g., [1, Theorem 3.3]:

\[V_{n+1} - V_n \leqslant 0,\]

when \(\gamma=\frac{1}{L}\).
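
The PEP below certifies this decrease for every \(L\)-smooth convex \(f\). As a purely illustrative complement (not part of this example), the decrease can also be observed numerically on a sample quadratic; the quadratic, dimension, random seed, and starting point in the sketch below are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5))
    A = B.T @ B                        # positive semi-definite Hessian, so f is convex
    L = np.linalg.eigvalsh(A).max()    # smoothness constant of f
    gamma = 1 / L

    def f(x):                          # f(x) = 0.5 x^T A x, minimized at x_star = 0 with f_star = 0
        return 0.5 * x @ A @ x

    def grad(x):
        return A @ x

    x = rng.standard_normal(5)
    V_prev = np.inf
    for n in range(20):
        # V_n = n (f(x_n) - f_star) + L/2 ||x_n - x_star||^2, with x_star = 0 and f_star = 0
        V = n * f(x) + L / 2 * x @ x
        assert V <= V_prev + 1e-9, "potential increased"
        V_prev = V
        x = x - gamma * grad(x)        # gradient step with step-size gamma = 1/L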

References: The detailed potential function can be found in [1] and the SDP approach in [2].

[1] N. Bansal, A. Gupta (2019). Potential-function proofs for gradient methods. Theory of Computing, 15(1), 1-32.

[2] A. Taylor, F. Bach (2019). Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. Conference on Learning Theory (COLT).

Parameters:
  • L (float) – the smoothness parameter.

  • gamma (float) – the step-size.

  • n (int) – current iteration number.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value.

  • theoretical_tau (float) – theoretical value.

Examples

>>> L = 1
>>> pepit_tau, theoretical_tau = wc_gradient_descent_lyapunov_1(L=L, gamma=1 / L, n=10, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 4x4
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (0 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 6 scalar constraint(s) ...
                        Function 1 : 6 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 2.458069122242756e-09
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 1.1620786701078888e-09
                All the primal scalar constraints are verified up to an error of 2.157699467921729e-09
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 1.2239204636309262e-08
(PEPit) Final upper bound (dual): 0.0 and lower bound (primal example): 2.458069122242756e-09
(PEPit) Duality gap: absolute: -2.458069122242756e-09 and relative: -1.0
*** Example file: worst-case performance of gradient descent with fixed step-size for a given Lyapunov function***
        PEPit guarantee:        V_(n+1) - V_(n) <= 0.0
        Theoretical guarantee:  V_(n+1) - V_(n) <= 0.0

7.2. Gradient descent Lyapunov 2

PEPit.examples.potential_functions.wc_gradient_descent_lyapunov_2(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code verifies a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it verifies that the Lyapunov (or potential/energy) function

\[V_n \triangleq (2n + 1) L \left(f(x_n) - f_\star\right) + n(n+2) \|\nabla f(x_n)\|^2 + L^2 \|x_n - x_\star\|^2\]

is decreasing along all trajectories and for all smooth convex functions \(f\) (i.e., in the worst case):

\[V_{n+1} \leqslant V_n,\]

where \(x_{n+1}\) is obtained by a gradient step from \(x_{n}\) with fixed step-size \(\gamma=\frac{1}{L}\).

Algorithm: One iteration of gradient descent is described by

\[x_{n+1} = x_n - \gamma \nabla f(x_n),\]

where \(\gamma\) is a step-size.

Theoretical guarantee: The theoretical guarantee can be found in [1, Theorem 3]:

\[V_{n+1} - V_n \leqslant 0,\]

when \(\gamma=\frac{1}{L}\).
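
The same kind of illustrative numerical check as in Section 7.1 applies to this second potential; again, the sample quadratic, dimension, and starting point below are arbitrary choices and this is not the PEP verification itself.

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5))
    A = B.T @ B                        # f(x) = 0.5 x^T A x is L-smooth and convex
    L = np.linalg.eigvalsh(A).max()
    gamma = 1 / L

    x = rng.standard_normal(5)
    V_prev = np.inf
    for n in range(20):
        g = A @ x                      # gradient of f at x_n
        fx = 0.5 * x @ A @ x           # f(x_n) - f_star, with f_star = 0
        # V_n = (2n+1) L (f(x_n) - f_star) + n(n+2) ||grad f(x_n)||^2 + L^2 ||x_n - x_star||^2
        V = (2 * n + 1) * L * fx + n * (n + 2) * g @ g + L**2 * x @ x
        assert V <= V_prev + 1e-9, "potential increased"
        V_prev = V
        x = x - gamma * g              # gradient step with step-size gamma = 1/L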

References: The detailed potential function and SDP approach can be found in [1].

[1] A. Taylor, F. Bach (2019). Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. Conference on Learning Theory (COLT).

Parameters:
  • L (float) – the smoothness parameter.

  • gamma (float) – the step-size.

  • n (int) – current iteration number.

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value.

  • theoretical_tau (float) – theoretical value.

Examples

>>> L = 1
>>> pepit_tau, theoretical_tau = wc_gradient_descent_lyapunov_2(L=L, gamma=1 / L, n=10, wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 4x4
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (0 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 6 scalar constraint(s) ...
                        Function 1 : 6 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 4.129020680920803e-09
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 2.3114964936226548e-11
                All the primal scalar constraints are verified up to an error of 2.3376856006507296e-11
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative up to an error of 4.907374995917855e-13
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 7.181842880899949e-09
(PEPit) Final upper bound (dual): 0.0 and lower bound (primal example): 4.129020680920803e-09
(PEPit) Duality gap: absolute: -4.129020680920803e-09 and relative: -1.0
*** Example file: worst-case performance of gradient descent with fixed step size for a given Lyapunov function***
        PEPit guarantee:        V_(n+1) - V_(n) <= 0.0
        Theoretical guarantee:  V_(n+1) - V_(n) <= 0.0

7.3. Accelerated gradient method

PEPit.examples.potential_functions.wc_accelerated_gradient_method(L, gamma, lam, wrapper='cvxpy', solver=None, verbose=1)

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(L\)-smooth and convex.

This code verifies a worst-case guarantee for an accelerated gradient method. That is, it verifies that the Lyapunov (or potential/energy) function

\[V_n \triangleq \lambda_n^2 (f(x_n) - f_\star) + \frac{L}{2} \|z_n - x_\star\|^2\]

is decreasing along all trajectories and for all smooth convex functions \(f\) (i.e., in the worst case):

\[V_{n+1} \leqslant V_n,\]

where \(x_{n+1}\), \(z_{n+1}\), and \(\lambda_{n+1}\) are obtained from one iteration of the accelerated gradient method below, from some arbitrary \(x_{n}\), \(z_{n}\), and \(\lambda_{n}\).

Algorithm: One iteration of the accelerated gradient method is described by

\[\begin{split}\begin{eqnarray}
\text{Set: } \lambda_{n+1} & = & \frac{1}{2} \left(1 + \sqrt{4\lambda_n^2 + 1}\right), \quad \tau_n = \frac{1}{\lambda_{n+1}}, \quad \eta_n = \frac{\lambda_{n+1}^2 - \lambda_{n}^2}{L}, \\
y_n & = & (1 - \tau_n) x_n + \tau_n z_n, \\
z_{n+1} & = & z_n - \eta_n \nabla f(y_n), \\
x_{n+1} & = & y_n - \gamma \nabla f(y_n).
\end{eqnarray}\end{split}\]

Theoretical guarantee: The following worst-case guarantee can be found in, e.g., [2, Theorem 5.3]:

\[V_{n+1} - V_n \leqslant 0,\]

when \(\gamma=\frac{1}{L}\).
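
As with the previous examples, the decrease of this potential can also be illustrated numerically on a sample quadratic, starting from arbitrary \(x_n\), \(z_n\), and \(\lambda_n\). The quadratic, dimension, and random vectors below are illustration choices only; the initial value \(\lambda = 10\) matches the example call further down. This sketch is not the PEP verification itself.

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5))
    A = B.T @ B                        # f(x) = 0.5 x^T A x is L-smooth and convex
    L = np.linalg.eigvalsh(A).max()
    gamma = 1 / L

    def f(x):                          # f_star = 0, attained at x_star = 0
        return 0.5 * x @ A @ x

    def grad(x):
        return A @ x

    x, z = rng.standard_normal(5), rng.standard_normal(5)   # arbitrary x_n and z_n
    lam = 10.0                                               # arbitrary lambda_n (as in the example call)
    V_prev = np.inf
    for _ in range(20):
        # V_n = lambda_n^2 (f(x_n) - f_star) + L/2 ||z_n - x_star||^2
        V = lam**2 * f(x) + L / 2 * z @ z
        assert V <= V_prev + 1e-9, "potential increased"
        V_prev = V
        lam_next = (1 + np.sqrt(4 * lam**2 + 1)) / 2
        tau, eta = 1 / lam_next, (lam_next**2 - lam**2) / L
        y = (1 - tau) * x + tau * z
        z = z - eta * grad(y)
        x = y - gamma * grad(y)        # gradient step from y_n with step-size gamma = 1/L
        lam = lam_next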

References: The potential can be found in the historical reference [1], and in more recent works, e.g., [2, 3].

[1] Y. Nesterov (1983). A method for solving the convex programming problem with convergence rate \(O(1/k^2)\). Dokl. Akad. Nauk SSSR (Vol. 269, pp. 543-547). http://www.mathnet.ru/links/9bcb158ed2df3d8db3532aafd551967d/dan46009.pdf

[2] N. Bansal, A. Gupta (2019). Potential-function proofs for gradient methods. Theory of Computing, 15(1), 1-32.

[3] A. d’Aspremont, D. Scieur, A. Taylor (2021). Acceleration Methods. Foundations and Trends in Optimization: Vol. 5, No. 1-2.

Parameters:
  • L (float) – the smoothness parameter.

  • gamma (float) – the step-size.

  • lam (float) – the initial value for sequence \((\lambda_t)_t\).

  • wrapper (str) – the name of the wrapper to be used.

  • solver (str) – the name of the solver the wrapper should use.

  • verbose (int) –

    level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + solver details.

Returns:
  • pepit_tau (float) – worst-case value.

  • theoretical_tau (float) – theoretical value.

Examples

>>> L = 1
>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_method(L=L, gamma=1 / L, lam=10., wrapper="cvxpy", solver=None, verbose=1)
(PEPit) Setting up the problem: size of the Gram matrix: 6x6
(PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (0 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                        Function 1 : Adding 12 scalar constraint(s) ...
                        Function 1 : 12 scalar constraint(s) added
(PEPit) Setting up the problem: additional constraints for 0 function(s)
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 7.94632223606942e-09
(PEPit) Primal feasibility check:
                The solver found a Gram matrix that is positive semi-definite up to an error of 1.9439082173289184e-10
                All the primal scalar constraints are verified up to an error of 2.9403765314447956e-10
(PEPit) Dual feasibility check:
                The solver found a residual matrix that is positive semi-definite
                All the dual scalar values associated with inequality constraints are nonnegative
(PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.223563804320635e-08
(PEPit) Final upper bound (dual): 0.0 and lower bound (primal example): 7.94632223606942e-09
(PEPit) Duality gap: absolute: -7.94632223606942e-09 and relative: -1.0
*** Example file: worst-case performance of accelerated gradient method for a given Lyapunov function***
        PEPit guarantee:         V_(n+1) - V_n <= 0.0
        Theoretical guarantee:   V_(n+1) - V_n <= 0.0