11. Continuous-time models

11.1. Gradient flow for strongly convex functions

PEPit.examples.continuous_time_models.wc_gradient_flow_strongly_convex(mu, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(\mu\)-strongly convex.

This code computes a worst-case guarantee for a gradient flow. That is, it computes the largest possible \(\tau(\mu)\) such that the guarantee

\[\frac{d}{dt}\mathcal{V}(X_t) \leqslant -\tau(\mu)\mathcal{V}(X_t) ,\]

is valid, where \(\mathcal{V}(X_t) = f(X_t) - f(x_\star)\), \(X_t\) is the output of the gradient flow, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(\mu\), \(\tau(\mu)\) is computed as the worst-case value of the derivative of \(f(X_t)-f_\star\) when \(f(X_t) - f(x_\star)\leqslant 1\).

Algorithm: For \(t \geqslant 0\),

\[\frac{d}{dt}X_t = -\nabla f(X_t),\]

with some initialization \(X_{0}\triangleq x_0\).
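
Along this flow, the chain rule gives

\[\frac{d}{dt}\mathcal{V}(X_t) = \left\langle \nabla f(X_t), \frac{d}{dt}X_t \right\rangle = -\|\nabla f(X_t)\|^2,\]

which is the quantity whose worst case is computed under the normalization \(f(X_t) - f(x_\star) \leqslant 1\).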

Theoretical guarantee:

The following tight guarantee can be found in [1, Proposition 11]:

\[\frac{d}{dt}\mathcal{V}(X_t) \leqslant -2\mu\mathcal{V}(X_t).\]
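
Integrating this differential inequality (Grönwall's inequality) yields the exponential convergence rate

\[f(X_t) - f_\star \leqslant e^{-2\mu t}\left(f(x_0) - f_\star\right).\]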

The detailed approach using PEPs is available in [2, Theorem 2.1].
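
As an illustration, here is a minimal sketch of how this worst case could be posed with PEPit's generic primitives; it is only an assumption-based outline, not necessarily how the packaged example file is written.

from PEPit import PEP
from PEPit.functions import StronglyConvexFunction

# Illustrative sketch only (not the packaged example file): worst case of
# d/dt [f(X_t) - f_*] = -||grad f(X_t)||^2 subject to f(X_t) - f_* <= 1.
mu = 0.1
problem = PEP()
func = problem.declare_function(StronglyConvexFunction, mu=mu)  # a mu-strongly convex function
xs = func.stationary_point()                                    # minimizer x_*
fs = func(xs)                                                   # optimal value f_*
x = problem.set_initial_point()                                 # the point X_t
gx, fx = func.oracle(x)                                         # gradient and value of f at X_t
problem.set_initial_condition(fx - fs <= 1)                     # normalization f(X_t) - f_* <= 1
problem.set_performance_metric(- gx ** 2)                       # d/dt V(X_t) along the flow
pepit_tau = problem.solve(verbose=1)                            # expected close to -2 * mu = -0.2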

References:

[1] D. Scieur, V. Roulet, F. Bach and A. D’Aspremont (2017). Integration methods and accelerated optimization algorithms. In Advances in Neural Information Processing Systems (NIPS).

[2] C. Moucer, A. Taylor, F. Bach (2022). A systematic approach to Lyapunov analyses of continuous-time models in convex optimization.

Parameters
  • mu (float) – the strong convexity parameter

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_gradient_flow_strongly_convex(mu=0.1, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 3x3
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 2 scalar constraint(s) ...
                 function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: -0.20000000011533495
*** Example file: worst-case performance of the gradient flow ***
        PEPit guarantee:         d/dt[f(X_t)-f_*] <= -0.2 (f(X_t) - f(x_*))
        Theoretical guarantee:   d/dt[f(X_t)-f_*] <= -0.2 (f(X_t) - f(x_*))

11.2. Gradient flow for convex functions

PEPit.examples.continuous_time_models.wc_gradient_flow_convex(t, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is convex.

This code computes a worst-case guarantee for a gradient flow. That is, it verifies that the inequality

\[\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0,\]

is valid, where \(\mathcal{V}(X_t, t) = t(f(X_t) - f(x_\star)) + \frac{1}{2} \|X_t - x_\star\|^2\), \(X_t\) is the output of the gradient flow, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(t\), it verifies \(\frac{d}{dt}\mathcal{V}(X_t, t)\leqslant 0\).

Algorithm: For \(t \geqslant 0\),

\[\frac{d}{dt}X_t = -\nabla f(X_t),\]

with some initialization \(X_{0}\triangleq x_0\).
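
Along this flow, the chain rule gives

\[\frac{d}{dt}\mathcal{V}(X_t, t) = f(X_t) - f(x_\star) + t\left\langle \nabla f(X_t), \frac{d}{dt}X_t \right\rangle + \left\langle X_t - x_\star, \frac{d}{dt}X_t \right\rangle = f(X_t) - f(x_\star) - t\|\nabla f(X_t)\|^2 - \left\langle \nabla f(X_t), X_t - x_\star \right\rangle,\]

which is nonpositive since convexity of \(f\) gives \(f(X_t) - f(x_\star) \leqslant \left\langle \nabla f(X_t), X_t - x_\star \right\rangle\).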

Theoretical guarantee:

The following tight guarantee can be found in [1, p. 7]:

\[\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0.\]

After integrating between \(0\) and \(T\),

\[f(X_T) - f_\star \leqslant \frac{1}{2T}\|x_0 - x_\star\|^2.\]
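
Indeed, since \(\mathcal{V}\) is nonincreasing along the flow, \(T\left(f(X_T) - f_\star\right) \leqslant \mathcal{V}(X_T, T) \leqslant \mathcal{V}(x_0, 0) = \frac{1}{2}\|x_0 - x_\star\|^2\).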

The detailed approach using PEPs is available in [2, Theorem 2.3].
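
As an illustration, here is a minimal sketch of how this verification could be posed with PEPit's generic primitives; it is only an assumption-based outline, not necessarily how the packaged example file is written. It maximizes the expansion of \(\frac{d}{dt}\mathcal{V}(X_t, t)\) given above over convex functions and checks that the optimal value is numerically nonpositive.

from PEPit import PEP
from PEPit.functions import ConvexFunction

# Illustrative sketch only (not the packaged example file): maximize
# d/dt V(X_t, t) = (f(X_t) - f_*) - t ||grad f(X_t)||^2 - <grad f(X_t), X_t - x_*>
# over convex functions; the optimal value should be (numerically) <= 0.
t = 2.5
problem = PEP()
func = problem.declare_function(ConvexFunction)      # a convex function
xs = func.stationary_point()                         # minimizer x_*
fs = func(xs)                                        # optimal value f_*
x = problem.set_initial_point()                      # the point X_t at time t
gx, fx = func.oracle(x)                              # gradient and value of f at X_t
dV_dt = (fx - fs) - t * gx ** 2 - gx * (x - xs)      # d/dt V(X_t, t) along the flow
problem.set_performance_metric(dV_dt)
pepit_tau = problem.solve(verbose=1)                 # expected to be numerically close to 0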

References:

[1] W. Su, S. Boyd, E. J. Candès (2016). A differential equation for modeling Nesterov’s accelerated gradient method: Theory and insights. In the Journal of Machine Learning Research (JMLR).

[2] C. Moucer, A. Taylor, F. Bach (2022). A systematic approach to Lyapunov analyses of continuous-time models in convex optimization.

Parameters
  • t (float) – time at which the derivative of the Lyapunov function is evaluated

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_gradient_flow_convex(t=2.5, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 3x3
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (0 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 2 scalar constraint(s) ...
                 function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 1.910532459863401e-18
*** Example file: worst-case performance of the gradient flow ***
        PEPit guarantee:         d/dt V(X_t) <= 1.91053e-18
        Theoretical guarantee:   d/dt V(X_t) <= 0.0

11.3. Accelerated gradient flow for strongly convex functions

PEPit.examples.continuous_time_models.wc_accelerated_gradient_flow_strongly_convex(mu, psd=True, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_{x\in\mathbb{R}^d} f(x),\]

where \(f\) is \(\mu\)-strongly convex.

This code computes a worst-case guarantee for an accelerated gradient flow. That is, it computes the largest possible \(\tau(\mu)\) such that the guarantee

\[\frac{d}{dt}\mathcal{V}_{P}(X_t) \leqslant -\tau(\mu)\mathcal{V}_P(X_t) ,\]

is valid with

\[\mathcal{V}_{P}(X_t) = f(X_t) - f(x_\star) + (X_t - x_\star, \frac{d}{dt}X_t)^T(P \otimes I_d)(X_t - x_\star, \frac{d}{dt}X_t) ,\]

where \(I_d\) is the identity matrix, \(X_t\) is the output of an accelerated gradient flow, and where \(x_\star\) is the minimizer of \(f\).

In short, for given values of \(\mu\), \(\tau(\mu)\) is computed as the worst-case value of the derivative of \(\mathcal{V}_P(X_t)\) when \(\mathcal{V}_P(X_t) \leqslant 1\).

Algorithm: For \(t \geqslant 0\),

\[\frac{d^2}{dt^2}X_t + 2\sqrt{\mu}\frac{d}{dt}X_t + \nabla f(X_t) = 0,\]

with some initialization \(X_{0}\triangleq x_0\).
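
Writing \(z_t \triangleq (X_t - x_\star, \frac{d}{dt}X_t)\), the chain rule combined with this ODE gives

\[\frac{d}{dt}\mathcal{V}_P(X_t) = \left\langle \nabla f(X_t), \frac{d}{dt}X_t \right\rangle + 2\, z_t^T (P \otimes I_d)\, \frac{d}{dt}z_t, \qquad \frac{d}{dt}z_t = \left(\frac{d}{dt}X_t,\; -2\sqrt{\mu}\,\frac{d}{dt}X_t - \nabla f(X_t)\right),\]

which is the quantity whose worst case is computed under the normalization \(\mathcal{V}_P(X_t) \leqslant 1\).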

Theoretical guarantee:

For \(P = \frac{1}{2}\begin{pmatrix} \mu & \sqrt{\mu} \\ \sqrt{\mu} & 1\end{pmatrix}\), for which \(\mathcal{V}_{P} \geqslant 0\), the following tight guarantee can be found in [1, Appendix B] and [2, Theorem 4.3]:

\[\frac{d}{dt}\mathcal{V}_P(X_t) \leqslant -\sqrt{\mu}\mathcal{V}_P(X_t).\]

For \(P = \begin{pmatrix} \frac{4}{9}\mu & \frac{4}{3}\sqrt{\mu} \\ \frac{4}{3}\sqrt{\mu} & \frac{1}{2}\end{pmatrix}\), for which \(\mathcal{V}_{P}(X_t) \geqslant 0\) along the trajectory, the following tight guarantee can be found in [3, Corollary 2.5],

\[\frac{d}{dt}\mathcal{V}_P(X_t) \leqslant -\frac{4}{3}\sqrt{\mu}\mathcal{V}_P(X_t).\]
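
In both cases, integrating the differential inequality (Grönwall's inequality) yields exponential decay of the Lyapunov function, namely \(\mathcal{V}_P(X_t) \leqslant e^{-\sqrt{\mu}\, t}\,\mathcal{V}_P(X_0)\) for the first choice of \(P\), and \(\mathcal{V}_P(X_t) \leqslant e^{-\frac{4}{3}\sqrt{\mu}\, t}\,\mathcal{V}_P(X_0)\) for the second.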

References:

[1] A. C. Wilson, B. Recht, M. I. Jordan (2021). A Lyapunov analysis of accelerated methods in optimization. In the Journal of Machine Learning Research (JMLR), 22(113):1−34.

[2] J. M. Sanz-Serna and K. C. Zygalakis (2021). The connections between Lyapunov functions for some optimization algorithms and differential equations. In SIAM Journal on Numerical Analysis, 59, pp. 1542-1565.

[3] C. Moucer, A. Taylor, F. Bach (2022). A systematic approach to Lyapunov analyses of continuous-time models in convex optimization.

Parameters
  • mu (float) – the strong convexity parameter

  • psd (boolean) – option for positivity of \(P\) in the Lyapunov function \(\mathcal{V}_{P}\)

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_flow_strongly_convex(mu=0.1, psd=True, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 4x4
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 2 scalar constraint(s) ...
                 function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: -0.31622776602929215
*** Example file: worst-case performance of an accelerated gradient flow ***
        PEPit guarantee:         d/dt V(X_t,t) <= -0.316228 V(X_t,t)
        Theoretical guarantee:   d/dt V(X_t,t) <= -0.316228 V(X_t,t)

11.4. Accelerated gradient flow for convex functions

PEPit.examples.continuous_time_models.wc_accelerated_gradient_flow_convex(t, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is convex.

This code computes a worst-case guarantee for an accelerated gradient flow. That is, it verifies that the inequality

\[\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0 ,\]

is valid, where \(\mathcal{V}(X_t, t) = t^2(f(X_t) - f(x_\star)) + 2 \|(X_t - x_\star) + \frac{t}{2}\frac{d}{dt}X_t \|^2\), \(X_t\) is the output of an accelerated gradient flow, and where \(x_\star\) is the minimizer of \(f\).

In short, for given values of \(t\), it verifies \(\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0\).

Algorithm: For \(t \geqslant 0\),

\[\frac{d^2}{dt^2}X_t + \frac{3}{t}\frac{d}{dt}X_t + \nabla f(X_t) = 0,\]

with some initialization \(X_{0}\triangleq x_0\).

Theoretical guarantee:

The following tight guarantee can be verified in [1, Section 2]:

\[\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0.\]
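
Indeed, combining the chain rule with the ODE above, the terms involving \(\frac{d}{dt}X_t\) cancel and

\[\frac{d}{dt}\mathcal{V}(X_t, t) = 2t\left(f(X_t) - f(x_\star) - \left\langle \nabla f(X_t), X_t - x_\star \right\rangle\right) \leqslant 0,\]

by convexity of \(f\).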

After integrating between \(0\) and \(T\),

\[f(X_T) - f_\star \leqslant \frac{2}{T^2}\|x_0 - x_\star\|^2.\]
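
This follows since \(\mathcal{V}\) is nonincreasing along the flow, so \(T^2\left(f(X_T) - f_\star\right) \leqslant \mathcal{V}(X_T, T) \leqslant \mathcal{V}(x_0, 0) = 2\|x_0 - x_\star\|^2\).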

The detailed approach using PEPs is available in [2, Theorem 2.6].

References:

[1] W. Su, S. Boyd, E. J. Candès (2016). A differential equation for modeling Nesterov’s accelerated gradient method: Theory and insights. In the Journal of Machine Learning Research (JMLR).

[2] C. Moucer, A. Taylor, F. Bach (2022). A systematic approach to Lyapunov analyses of continuous-time models in convex optimization.

Parameters
  • t (float) – time at which the derivative of the Lyapunov function is evaluated

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_flow_convex(t=3.4, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 4x4
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (0 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 2 scalar constraint(s) ...
                 function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: -1.2008648627755779e-18
*** Example file: worst-case performance of an accelerated gradient flow ***
        PEPit guarantee:         d/dt V(X_t,t) <= -1.20086e-18
        Theoretical guarantee:   d/dt V(X_t,t) <= 0.0