11. Continuous-time models

11.1. Gradient flow for strongly convex functions

PEPit.examples.continuous_time_models.wc_gradient_flow_strongly_convex(mu, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is \(\mu\)-strongly convex.

This code computes a worst-case guarantee for a gradient flow. That is, it computes the largest possible \(\tau(\mu)\) such that the guarantee

\[\frac{d}{dt}\mathcal{V}(X_t) \leqslant -\tau(\mu)\mathcal{V}(X_t) ,\]

is valid, where \(\mathcal{V}(X_t) = f(X_t) - f(x_\star)\), \(X_t\) is the output of the gradient flow, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(\mu\), \(\tau(\mu)\) is computed as the worst-case value of the derivative of \(f(X_t)-f_\star\) when \(f(X_t) - f(x_\star)\leqslant 1\).

Algorithm: For \(t \geqslant 0\),

\[\frac{d}{dt}X_t = -\nabla f(X_t),\]

with some initialization \(X_{0}\triangleq x_0\).
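
Along this flow, the chain rule gives

\[\frac{d}{dt}\mathcal{V}(X_t) = \left\langle \nabla f(X_t), \frac{d}{dt}X_t \right\rangle = -\|\nabla f(X_t)\|^2,\]

which is the quantity whose worst case is computed under the normalization \(f(X_t) - f(x_\star) \leqslant 1\).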

Theoretical guarantee:

The following tight guarantee can be found in [1, Proposition 11]:

\[\frac{d}{dt}\mathcal{V}(X_t) \leqslant -2\mu\mathcal{V}(X_t).\]
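
Integrating this differential inequality (Grönwall's inequality) yields the exponential convergence rate

\[f(X_t) - f_\star \leqslant e^{-2\mu t}\left(f(x_0) - f_\star\right).\]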

The detailed approach using PEPs is available in [2, Theorem 2.1].
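
As an illustration, here is a minimal sketch of how this worst case could be posed with PEPit's generic primitives; it is only an assumption-based outline, not necessarily how the packaged example file is written.

from PEPit import PEP
from PEPit.functions import StronglyConvexFunction

# Illustrative sketch only (not the packaged example file): worst case of
# d/dt [f(X_t) - f_*] = -||grad f(X_t)||^2 subject to f(X_t) - f_* <= 1.
mu = 0.1
problem = PEP()
func = problem.declare_function(StronglyConvexFunction, mu=mu)  # a mu-strongly convex function
xs = func.stationary_point()                                    # minimizer x_*
fs = func(xs)                                                   # optimal value f_*
x = problem.set_initial_point()                                 # the point X_t
gx, fx = func.oracle(x)                                         # gradient and value of f at X_t
problem.set_initial_condition(fx - fs <= 1)                     # normalization f(X_t) - f_* <= 1
problem.set_performance_metric(- gx ** 2)                       # d/dt V(X_t) along the flow
pepit_tau = problem.solve(verbose=1)                            # expected close to -2 * mu = -0.2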

References:

[1] D. Scieur, V. Roulet, F. Bach and A. D’Aspremont (2017). Integration methods and accelerated optimization algorithms. In Advances in Neural Information Processing Systems (NIPS).

[2] C. Moucer, A. Taylor, F. Bach (2022). A systematic approach to Lyapunov analyses of continuous-time models in convex optimization.

Parameters
  • mu (float) – the strong convexity parameter

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_gradient_flow_strongly_convex(mu=0.1, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 3x3
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 2 scalar constraint(s) ...
                 function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: -0.20000000011533495
*** Example file: worst-case performance of the gradient flow ***
        PEPit guarantee:         d/dt[f(X_t)-f_*] <= -0.2 (f(X_t) - f(x_*))
        Theoretical guarantee:   d/dt[f(X_t)-f_*] <= -0.2 (f(X_t) - f(x_*))

11.2. Gradient flow for convex functions

PEPit.examples.continuous_time_models.wc_gradient_flow_convex(t, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is convex.

This code computes a worst-case guarantee for a gradient flow. That is, it verifies that the inequality

\[\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0,\]

is valid, where \(\mathcal{V}(X_t, t) = t(f(X_t) - f(x_\star)) + \frac{1}{2} \|X_t - x_\star\|^2\), \(X_t\) is the output of the gradient flow, and where \(x_\star\) is the minimizer of \(f\). In short, for given values of \(t\), it verifies \(\frac{d}{dt}\mathcal{V}(X_t, t)\leqslant 0\).

Algorithm: For \(t \geqslant 0\),

\[\frac{d}{dt}X_t = -\nabla f(X_t),\]

with some initialization \(X_{0}\triangleq x_0\).
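
Along this flow, the chain rule gives

\[\frac{d}{dt}\mathcal{V}(X_t, t) = f(X_t) - f(x_\star) + t\left\langle \nabla f(X_t), \frac{d}{dt}X_t \right\rangle + \left\langle X_t - x_\star, \frac{d}{dt}X_t \right\rangle = f(X_t) - f(x_\star) - t\|\nabla f(X_t)\|^2 - \left\langle \nabla f(X_t), X_t - x_\star \right\rangle,\]

which is nonpositive since convexity of \(f\) gives \(f(X_t) - f(x_\star) \leqslant \left\langle \nabla f(X_t), X_t - x_\star \right\rangle\).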

Theoretical guarantee:

The following tight guarantee can be found in [1, p. 7]:

\[\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0.\]

After integrating between \(0\) and \(T\),

\[f(X_T) - f_\star \leqslant \frac{1}{2T}\|x_0 - x_\star\|^2.\]
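
Indeed, since \(\mathcal{V}\) is nonincreasing along the flow, \(T\left(f(X_T) - f_\star\right) \leqslant \mathcal{V}(X_T, T) \leqslant \mathcal{V}(x_0, 0) = \frac{1}{2}\|x_0 - x_\star\|^2\).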

The detailed approach using PEPs is available in [2, Theorem 2.3].
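
As an illustration, here is a minimal sketch of how this verification could be posed with PEPit's generic primitives; it is only an assumption-based outline, not necessarily how the packaged example file is written. It maximizes the expansion of \(\frac{d}{dt}\mathcal{V}(X_t, t)\) given above over convex functions and checks that the optimal value is numerically nonpositive.

from PEPit import PEP
from PEPit.functions import ConvexFunction

# Illustrative sketch only (not the packaged example file): maximize
# d/dt V(X_t, t) = (f(X_t) - f_*) - t ||grad f(X_t)||^2 - <grad f(X_t), X_t - x_*>
# over convex functions; the optimal value should be (numerically) <= 0.
t = 2.5
problem = PEP()
func = problem.declare_function(ConvexFunction)      # a convex function
xs = func.stationary_point()                         # minimizer x_*
fs = func(xs)                                        # optimal value f_*
x = problem.set_initial_point()                      # the point X_t at time t
gx, fx = func.oracle(x)                              # gradient and value of f at X_t
dV_dt = (fx - fs) - t * gx ** 2 - gx * (x - xs)      # d/dt V(X_t, t) along the flow
problem.set_performance_metric(dV_dt)
pepit_tau = problem.solve(verbose=1)                 # expected to be numerically close to 0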

References:

[1] W. Su, S. Boyd, E. J. Candès (2016). A differential equation for modeling Nesterov’s accelerated gradient method: Theory and insights. In the Journal of Machine Learning Research (JMLR).

[2] C. Moucer, A. Taylor, F. Bach (2022). A systematic approach to Lyapunov analyses of continuous-time models in convex optimization.

Parameters
  • t (float) – time at which the derivative of the Lyapunov function is evaluated

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_gradient_flow_convex(t=2.5, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 3x3
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (0 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 2 scalar constraint(s) ...
                 function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: 1.910532459863401e-18
*** Example file: worst-case performance of the gradient flow ***
        PEPit guarantee:         d/dt V(X_t) <= 1.91053e-18
        Theoretical guarantee:   d/dt V(X_t) <= 0.0

11.3. Accelerated gradient flow for strongly convex functions

PEPit.examples.continuous_time_models.wc_accelerated_gradient_flow_strongly_convex(mu, psd=True, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_{x\in\mathbb{R}^d} f(x),\]

where \(f\) is \(\mu\)-strongly convex.

This code computes a worst-case guarantee for an accelerated gradient flow. That is, it computes the largest possible \(\tau(\mu)\) such that the guarantee

\[\frac{d}{dt}\mathcal{V}_{P}(X_t) \leqslant -\tau(\mu)\mathcal{V}_P(X_t) ,\]

is valid with

\[\mathcal{V}_{P}(X_t) = f(X_t) - f(x_\star) + (X_t - x_\star, \frac{d}{dt}X_t)^T(P \otimes I_d)(X_t - x_\star, \frac{d}{dt}X_t) ,\]

where \(I_d\) is the identity matrix, \(X_t\) is the output of an accelerated gradient flow, and where \(x_\star\) is the minimizer of \(f\).

In short, for given values of \(\mu\), \(\tau(\mu)\) is computed as the worst-case value of the derivative of \(\mathcal{V}_P(X_t)\) when \(\mathcal{V}_P(X_t) \leqslant 1\).

Algorithm: For \(t \geqslant 0\),

\[\frac{d^2}{dt^2}X_t + 2\sqrt{\mu}\frac{d}{dt}X_t + \nabla f(X_t) = 0,\]

with some initialization \(X_{0}\triangleq x_0\).
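
Writing \(z_t \triangleq (X_t - x_\star, \frac{d}{dt}X_t)\), the chain rule combined with this ODE gives

\[\frac{d}{dt}\mathcal{V}_P(X_t) = \left\langle \nabla f(X_t), \frac{d}{dt}X_t \right\rangle + 2\, z_t^T (P \otimes I_d)\, \frac{d}{dt}z_t, \qquad \frac{d}{dt}z_t = \left(\frac{d}{dt}X_t,\; -2\sqrt{\mu}\,\frac{d}{dt}X_t - \nabla f(X_t)\right),\]

which is the quantity whose worst case is computed under the normalization \(\mathcal{V}_P(X_t) \leqslant 1\).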

Theoretical guarantee:

For \(P = \frac{1}{2}\begin{pmatrix} \mu & \sqrt{\mu} \\ \sqrt{\mu} & 1\end{pmatrix}\), for which \(\mathcal{V}_{P} \geqslant 0\), the following tight guarantee can be found in [1, Appendix B] and [2, Theorem 4.3]:

\[\frac{d}{dt}\mathcal{V}_P(X_t) \leqslant -\sqrt{\mu}\mathcal{V}_P(X_t).\]

For \(P = \begin{pmatrix} \frac{4}{9}\mu & \frac{4}{3}\sqrt{\mu} \\ \frac{4}{3}\sqrt{\mu} & \frac{1}{2}\end{pmatrix}\), for which \(\mathcal{V}_{P}(X_t) \geqslant 0\) along the trajectory, the following tight guarantee can be found in [3, Corollary 2.5],

\[\frac{d}{dt}\mathcal{V}_P(X_t) \leqslant -\frac{4}{3}\sqrt{\mu}\mathcal{V}_P(X_t).\]
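
In both cases, integrating the differential inequality (Grönwall's inequality) yields exponential decay of the Lyapunov function, namely \(\mathcal{V}_P(X_t) \leqslant e^{-\sqrt{\mu}\, t}\,\mathcal{V}_P(X_0)\) for the first choice of \(P\), and \(\mathcal{V}_P(X_t) \leqslant e^{-\frac{4}{3}\sqrt{\mu}\, t}\,\mathcal{V}_P(X_0)\) for the second.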

References:

[1] A. C. Wilson, B. Recht, M. I. Jordan (2021). A Lyapunov analysis of accelerated methods in optimization. In the Journal of Machine Learning Research (JMLR), 22(113):1−34.

[2] J. M. Sanz-Serna and K. C. Zygalakis (2021). The connections between Lyapunov functions for some optimization algorithms and differential equations. In SIAM Journal on Numerical Analysis, 59, pp. 1542-1565.

[3] C. Moucer, A. Taylor, F. Bach (2022). A systematic approach to Lyapunov analyses of continuous-time models in convex optimization.

Parameters
  • mu (float) – the strong convexity parameter

  • psd (boolean) – option for positivity of \(P\) in the Lyapunov function \(\mathcal{V}_{P}\)

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_flow_strongly_convex(mu=0.1, psd=True, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 4x4
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 2 scalar constraint(s) ...
                 function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: -0.31622776602929215
*** Example file: worst-case performance of an accelerated gradient flow ***
        PEPit guarantee:         d/dt V(X_t,t) <= -0.316228 V(X_t,t)
        Theoretical guarantee:   d/dt V(X_t,t) <= -0.316228 V(X_t,t)

11.4. Accelerated gradient flow for convex functions

PEPit.examples.continuous_time_models.wc_accelerated_gradient_flow_convex(t, verbose=1)[source]

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where \(f\) is convex.

This code computes a worst-case guarantee for an accelerated gradient flow. That is, it verifies that the inequality

\[\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0 ,\]

is valid, where \(\mathcal{V}(X_t, t) = t^2(f(X_t) - f(x_\star)) + 2 \|(X_t - x_\star) + \frac{t}{2}\frac{d}{dt}X_t \|^2\), \(X_t\) is the output of an accelerated gradient flow, and where \(x_\star\) is the minimizer of \(f\).

In short, for given values of \(t\), it verifies \(\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0\).

Algorithm: For \(t \geqslant 0\),

\[\frac{d^2}{dt^2}X_t + \frac{3}{t}\frac{d}{dt}X_t + \nabla f(X_t) = 0,\]

with some initialization \(X_{0}\triangleq x_0\).

Theoretical guarantee:

The following tight guarantee can be verified in [1, Section 2]:

\[\frac{d}{dt}\mathcal{V}(X_t, t) \leqslant 0.\]
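
Indeed, combining the chain rule with the ODE above, the terms involving \(\frac{d}{dt}X_t\) cancel and

\[\frac{d}{dt}\mathcal{V}(X_t, t) = 2t\left(f(X_t) - f(x_\star) - \left\langle \nabla f(X_t), X_t - x_\star \right\rangle\right) \leqslant 0,\]

by convexity of \(f\).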

After integrating between \(0\) and \(T\),

\[f(X_T) - f_\star \leqslant \frac{2}{T^2}\|x_0 - x_\star\|^2.\]
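
This follows since \(\mathcal{V}\) is nonincreasing along the flow, so \(T^2\left(f(X_T) - f_\star\right) \leqslant \mathcal{V}(X_T, T) \leqslant \mathcal{V}(x_0, 0) = 2\|x_0 - x_\star\|^2\).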

The detailed approach using PEPs is available in [2, Theorem 2.6].

References:

[1] W. Su, S. Boyd, E. J. Candès (2016). A differential equation for modeling Nesterov’s accelerated gradient method: Theory and insights. In the Journal of Machine Learning Research (JMLR).

[2] C. Moucer, A. Taylor, F. Bach (2022). A systematic approach to Lyapunov analyses of continuous-time models in convex optimization.

Parameters
  • t (float) – time at which the derivative of the Lyapunov function is evaluated

  • verbose (int) –

    Level of information details to print.

    • -1: No verbose at all.

    • 0: This example’s output.

    • 1: This example’s output + PEPit information.

    • 2: This example’s output + PEPit information + CVXPY details.

Returns
  • pepit_tau (float) – worst-case value

  • theoretical_tau (float) – theoretical value

Example

>>> pepit_tau, theoretical_tau = wc_accelerated_gradient_flow_convex(t=3.4, verbose=1)
(PEPit) Setting up the problem: size of the main PSD matrix: 4x4
(PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
(PEPit) Setting up the problem: Adding initial conditions and general constraints ...
(PEPit) Setting up the problem: initial conditions and general constraints (0 constraint(s) added)
(PEPit) Setting up the problem: interpolation conditions for 1 function(s)
                 function 1 : Adding 2 scalar constraint(s) ...
                 function 1 : 2 scalar constraint(s) added
(PEPit) Compiling SDP
(PEPit) Calling SDP solver
(PEPit) Solver status: optimal (solver: SCS); optimal value: -1.2008648627755779e-18
*** Example file: worst-case performance of an accelerated gradient flow ***
        PEPit guarantee:         d/dt V(X_t,t) <= -1.20086e-18
        Theoretical guarantee:   d/dt V(X_t,t) <= 0.0