Source code for PEPit.examples.stochastic_and_randomized_convex_minimization.point_saga

import numpy as np

from PEPit import PEP
from PEPit.functions import SmoothStronglyConvexFunction
from PEPit.primitive_steps import proximal_step


[docs]def wc_point_saga(L, mu, n, verbose=1):
    """
    Consider the finite sum minimization problem

    .. math:: F^\\star \\triangleq \\min_x \\left\\{F(x) \\equiv \\frac{1}{n} \\sum_{i=1}^n f_i(x)\\right\\},

    where :math:`f_1, \\dots, f_n` are :math:`L`-smooth and :math:`\\mu`-strongly convex, and with proximal operator
    readily available.

    This code computes a tight (one-step) worst-case guarantee using a Lyapunov function for **Point SAGA** [1].
    The Lyapunov (or energy) function at a point :math:`x` is given in [1, Theorem 5]:

    .. math:: V(x) = \\frac{1}{L \\mu}\\frac{1}{n} \\sum_{i \\leq n} \\|\\nabla f_i(x) - \\nabla f_i(x_\\star)\\|^2 + \\|x - x^\\star\\|^2,

    where :math:`x^\\star` denotes the minimizer of :math:`F`. The code computes the smallest possible
    :math:`\\tau(n, L, \\mu)` such that the guarantee (in expectation):

    .. math:: \\mathbb{E}\\left[V\\left(x^{(1)}\\right)\\right] \\leqslant \\tau(n, L, \\mu) V\\left(x^{(0)}\\right),

    is valid (note that we use the notation :math:`x^{(0)},x^{(1)}` to denote two consecutive iterates for convenience; as the
    bound is valid for all :math:`x^{(0)}`, it is also valid for any pair of consecutive iterates of the algorithm).

    In short, for given values of :math:`n`, :math:`L`, and :math:`\\mu`,
    :math:`\\tau(n, L, \\mu)` is computed as the worst-case value of :math:`\\mathbb{E}\\left[V\\left(x^{(1)}\\right)\\right]` when :math:`V\\left(x^{(0)}\\right) \\leqslant 1`.

    **Algorithm**:
    Point SAGA is described by

    .. math::
        \\begin{eqnarray}
            \\text{Set }\\gamma & = & \\frac{\\sqrt{(n - 1)^2 + 4n\\frac{L}{\\mu}}}{2Ln} - \\frac{\\left(1 - \\frac{1}{n}\\right)}{2L} \\\\
            \\text{Pick random }j & \\sim & \\mathcal{U}\\left([|1, n|]\\right) \\\\
            z^{(t)} & = & x_t + \\gamma \\left(g_j^{(t)} - \\frac{1}{n} \\sum_{i\\leq n}g_i^{(t)} \\right), \\\\
            x^{(t+1)} & = & \\mathrm{prox}_{\\gamma f_j}(z^{(t)})\\triangleq \\arg\\min_x\\left\\{ \\gamma f_j(x)+\\frac{1}{2} \\|x-z^{(t)}\\|^2 \\right\\}, \\\\
            g_j^{(t+1)} & = & \\frac{1}{\\gamma}(z^{(t)} - x^{(t+1)}).
        \\end{eqnarray}

    **Theoretical guarantee**: A theoretical **upper** bound is given in [1, Theorem 5].

    .. math:: \\mathbb{E}\\left[V\\left(x^{(t+1)}\\right)\\right] \\leqslant \\frac{1}{1 + \\mu\\gamma} V\\left(x^{(t)}\\right)

    **References**:

    `[1] A. Defazio (2016). A simple practical accelerated method for finite sums.
    Advances in Neural Information Processing Systems (NIPS), 29, 676-684.
    <https://proceedings.neurips.cc/paper/2016/file/4f6ffe13a5d75b2d6a3923922b3922e5-Paper.pdf>`_

    Args:
        L (float): the smoothness parameter.
        mu (float): the strong convexity parameter.
        n (int): number of functions.
        verbose (int): Level of information details to print.
                        
                        - -1: No verbose at all.
                        - 0: This example's output.
                        - 1: This example's output + PEPit information.
                        - 2: This example's output + PEPit information + CVXPY details.

    Returns:
        pepit_tau (float): worst-case value
        theoretical_tau (float): theoretical value

    Example:
        >>> pepit_tau, theoretical_tau = wc_point_saga(L=1, mu=.01, n=10, verbose=1)
        (PEPit) Setting up the problem: size of the main PSD matrix: 31x31
        (PEPit) Setting up the problem: performance measure is minimum of 1 element(s)
        (PEPit) Setting up the problem: Adding initial conditions and general constraints ...
        (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
        (PEPit) Setting up the problem: interpolation conditions for 10 function(s)
                         function 1 : Adding 2 scalar constraint(s) ...
                         function 1 : 2 scalar constraint(s) added
                         function 2 : Adding 2 scalar constraint(s) ...
                         function 2 : 2 scalar constraint(s) added
                         function 3 : Adding 2 scalar constraint(s) ...
                         function 3 : 2 scalar constraint(s) added
                         function 4 : Adding 2 scalar constraint(s) ...
                         function 4 : 2 scalar constraint(s) added
                         function 5 : Adding 2 scalar constraint(s) ...
                         function 5 : 2 scalar constraint(s) added
                         function 6 : Adding 2 scalar constraint(s) ...
                         function 6 : 2 scalar constraint(s) added
                         function 7 : Adding 2 scalar constraint(s) ...
                         function 7 : 2 scalar constraint(s) added
                         function 8 : Adding 2 scalar constraint(s) ...
                         function 8 : 2 scalar constraint(s) added
                         function 9 : Adding 2 scalar constraint(s) ...
                         function 9 : 2 scalar constraint(s) added
                         function 10 : Adding 2 scalar constraint(s) ...
                         function 10 : 2 scalar constraint(s) added
        (PEPit) Compiling SDP
        (PEPit) Calling SDP solver
        (PEPit) Solver status: optimal (solver: SCS); optimal value: 0.9714053941143999
        *** Example file: worst-case performance of Point SAGA for a given Lyapunov function ***
                PEPit guarantee:         E[V(x^(1))] <= 0.971405 V(x^(0))
                Theoretical guarantee:   E[V(x^(1))] <= 0.973292 V(x^(0))

    """

    # Instantiate PEP
    problem = PEP()

    # Declare a sum of strongly convex functions
    fn = [problem.declare_function(SmoothStronglyConvexFunction, L=L, mu=mu) for _ in range(n)]
    func = np.mean(fn)

    # Start by defining its unique optimal point xs = x_*
    xs = func.stationary_point()

    # Then define the initial values
    phi = [problem.set_initial_point() for _ in range(n)]
    x0 = problem.set_initial_point()

    # Parameters of the scheme and of the Lyapunov function
    gamma = np.sqrt((n - 1) ** 2 + 4 * n * L / mu) / 2 / L / n - (1 - 1 / n) / 2 / L
    c = 1 / (mu * L)

    # Compute the initial value of the Lyapunov function
    init_lyapunov = (xs - x0) ** 2
    gs = [fn[i].gradient(xs) for i in range(n)]
    for i in range(n):
        init_lyapunov = init_lyapunov + c / n * (gs[i] - phi[i]) ** 2

    # Set the initial constraint as the Lyapunov bounded by 1
    problem.set_initial_condition(init_lyapunov <= 1.)

    # Compute the expected value of the Lyapunov function after one iteration
    # (so: expectation over n possible scenarios:  one for each element fi in the function).
    final_lyapunov_avg = (xs - xs) ** 2
    for i in range(n):
        w = x0 + gamma * phi[i]
        for j in range(n):
            w = w - gamma / n * phi[j]
        x1, gx1, _ = proximal_step(w, fn[i], gamma)
        final_lyapunov = (xs - x1) ** 2
        for j in range(n):
            if i != j:
                final_lyapunov = final_lyapunov + c / n * (phi[j] - gs[j]) ** 2
            else:
                final_lyapunov = final_lyapunov + c / n * (gs[j] - gx1) ** 2
        final_lyapunov_avg = final_lyapunov_avg + final_lyapunov / n

    # Set the performance metric to the distance average to optimal point
    problem.set_performance_metric(final_lyapunov_avg)

    # Solve the PEP
    pepit_verbose = max(verbose, 0)
    pepit_tau = problem.solve(verbose=pepit_verbose)

    # Compute theoretical guarantee (for comparison) : the bound is given in theorem 5 of [1]
    kappa = mu * gamma / (1 + mu * gamma)
    theoretical_tau = (1 - kappa)

    # Print conclusion if required
    if verbose != -1:
        print('*** Example file: worst-case performance of Point SAGA for a given Lyapunov function ***')
        print('\tPEPit guarantee:\t E[V(x^(1))] <= {:.6} V(x^(0))'.format(pepit_tau))
        print('\tTheoretical guarantee:\t E[V(x^(1))] <= {:.6} V(x^(0))'.format(theoretical_tau))

    # Return the worst-case guarantee of the evaluated method (and the reference theoretical value)
    return pepit_tau, theoretical_tau


if __name__ == "__main__":
    pepit_tau, theoretical_tau = wc_point_saga(L=1, mu=.01, n=10, verbose=1)