3 Non-convex optimization
3.1 Gradient Descent
- PEPit.examples.nonconvex_optimization.wc_gradient_descent(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth.
This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee
\[\min_{t\leqslant n} \|\nabla f(x_t)\|^2 \leqslant \tau(n, L, \gamma) (f(x_0) - f(x_n))\]is valid, where \(x_n\) is the n-th iterates obtained with the gradient method with fixed step-size.
Algorithm: Gradient descent is described as follows, for \(t \in \{ 0, \dots, n-1\}\),
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]where \(\gamma\) is a step-size.
Theoretical guarantee: When \(\gamma \leqslant \frac{1}{L}\), an empirically tight theoretical worst-case guarantee is
\[\min_{t\leqslant n} \|\nabla f(x_t)\|^2 \leqslant \frac{4}{3}\frac{L}{n} (f(x_0) - f(x_n)),\]see discussions in [1, page 190] and [2].
References:
- Parameters:
L (float) – the smoothness parameter.
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value.
theoretical_tau (float) – theoretical value.
Example
>>> L = 1 >>> gamma = 1 / L >>> pepit_tau, theoretical_tau = wc_gradient_descent(L=L, gamma=gamma, n=5, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 7x7 (PEPit) Setting up the problem: performance measure is the minimum of 6 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 30 scalar constraint(s) ... Function 1 : 30 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.2666666655116666 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 4.5045553432130066e-10 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 4.49157760320874e-09 (PEPit) Final upper bound (dual): 0.2666666657156721 and lower bound (primal example): 0.2666666655116666 (PEPit) Duality gap: absolute: 2.0400547917631684e-10 and relative: 7.650205502246835e-10 *** Example file: worst-case performance of gradient descent with fixed step-size *** PEPit guarantee: min_i ||f'(x_i)||^2 <= 0.266667 (f(x_0)-f_*) Theoretical guarantee: min_i ||f'(x_i)||^2 <= 0.266667 (f(x_0)-f_*)
3.2 No Lips 1
- PEPit.examples.nonconvex_optimization.wc_no_lips_1(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the constrainted non-convex minimization problem
\[F_\star \triangleq \min_x \{F(x) \equiv f_1(x)+f_2(x) \}\]where \(f_2\) is a closed convex indicator function and \(f_1\) is possibly non-convex and \(L\)-smooth relatively to \(h\), and where \(h\) is closed proper and convex.
This code computes a worst-case guarantee for the NoLips method. That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee
\[\min_{0 \leqslant t \leqslant n-1} D_h(x_{t+1}; x_t) \leqslant \tau(n, L, \gamma) (F(x_0) - F(x_n))\]is valid, where \(x_n\) is the output of the NoLips method, and where \(D_h\) is the Bregman distance generated by \(h\):
\[D_h(x; y) \triangleq h(x) - h(y) - \nabla h (y)^T(x - y).\]In short, for given values of \(n\), \(L\), and \(\gamma\), \(\tau(n, L, \gamma)\) is computed as the worst-case value of \(\min_{0 \leqslant t \leqslant n-1}D_h(x_{t+1}; x_t)\) when \(F(x_0) - F(x_n) \leqslant 1\).
Algorithms: This method (also known as Bregman Gradient, or Mirror descent) can be found in, e.g., [1, Section 3]. For \(t \in \{0, \dots, n-1\}\),
\[x_{t+1} = \arg\min_{u \in R^d} \nabla f(x_t)^T(u - x_t) + \frac{1}{\gamma} D_h(u; x_t).\]Theoretical guarantees: The tight theoretical upper bound is obtained in [1, Proposition 4.1]
\[\min_{0 \leqslant t \leqslant n-1} D_h(x_{t+1}; x_t) \leqslant \frac{\gamma}{n(1 - L\gamma)}(F(x_0) - F(x_n))\]References: The detailed setup and results are availaible in [1]. The PEP approach for studying such settings is presented in [2].
DISCLAIMER: This example requires some experience with PEPit and PEPs (see Section 4 in [2]).
- Parameters:
L (float) – relative-smoothness parameter.
gamma (float) – step-size (equal to 1/(2*L) for guarantee).
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value.
theoretical_tau (float) – theoretical value.
Example
>>> L = 1 >>> gamma = 1 / (2 * L) >>> pepit_tau, theoretical_tau = wc_no_lips_1(L=L, gamma=gamma, n=5, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 20x20 (PEPit) Setting up the problem: performance measure is the minimum of 5 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 3 function(s) Function 1 : Adding 30 scalar constraint(s) ... Function 1 : 30 scalar constraint(s) added Function 2 : Adding 30 scalar constraint(s) ... Function 2 : 30 scalar constraint(s) added Function 3 : Adding 49 scalar constraint(s) ... Function 3 : 49 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.19999999999153723 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 2.416178368491728e-12 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 2.5759179508275314e-12 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 4.8348778296212814e-11 (PEPit) Final upper bound (dual): 0.19999999999319756 and lower bound (primal example): 0.19999999999153723 (PEPit) Duality gap: absolute: 1.6603385333269216e-12 and relative: 8.301692666985885e-12 *** Example file: worst-case performance of the NoLips in Bregman divergence *** PEPit guarantee: min_t Dh(x_(t+1), x_(t)) <= 0.2 (F(x_0) - F(x_n)) Theoretical guarantee : min_t Dh(x_(t+1), x_(t)) <= 0.2 (F(x_0) - F(x_n))
3.3 No Lips 2
- PEPit.examples.nonconvex_optimization.wc_no_lips_2(L, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the constrainted composite convex minimization problem
\[F_\star \triangleq \min_x \{F(x) \equiv f_1(x)+f_2(x) \}\]where \(f_2\) is a closed convex indicator function and \(f_1\) is possibly non-convex, \(L\)-smooth relatively to \(h\), and \(h\) is closed proper and convex.
This code computes a worst-case guarantee for the NoLips method. That is, it computes the smallest possible \(\tau(n,L,\gamma)\) such that the guarantee
\[\min_{0 \leqslant t \leqslant n-1} D_h(x_t;x_{t+1}) \leqslant \tau(n, L, \gamma) (F(x_0) - F(x_n))\]is valid, where \(x_n\) is the output of the NoLips method, and where \(D_h\) is the Bregman distance generated by \(h\):
\[D_h(x; y) \triangleq h(x) - h(y) - \nabla h (y)^T(x - y).\]In short, for given values of \(n\), \(L\), and \(\gamma\), \(\tau(n, L, \gamma)\) is computed as the worst-case value of \(\min_{0 \leqslant t \leqslant n-1}D_h(x_t;x_{t+1})\) when \(F(x_0) - F(x_n) \leqslant 1\).
Algorithms: This method (also known as Bregman Gradient, or Mirror descent) can be found in, e.g., [1, Section 3]. For \(t \in \{0, \dots, n-1\}\),
\[x_{t+1} = \arg\min_{u \in R^d} \nabla f(x_t)^T(u - x_t) + \frac{1}{\gamma} D_h(u; x_t).\]Theoretical guarantees: An empirically tight worst-case guarantee is
\[\min_{0 \leqslant t \leqslant n-1}D_h(x_t;x_{t+1}) \leqslant \frac{\gamma}{n}(F(x_0) - F(x_n)).\]References: The detailed setup is presented in [1]. The PEP approach for studying such settings is presented in [2].
DISCLAIMER: This example requires some experience with PEPit and PEPs (see Section 4 in [2]).
- Parameters:
L (float) – relative-smoothness parameter.
gamma (float) – step-size (equal to \(\frac{1}{2L}\) for guarantee).
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value.
theoretical_tau (float) – theoretical value.
Example
>>> L = 1 >>> gamma = 1 / L >>> pepit_tau, theoretical_tau = wc_no_lips_2(L=L, gamma=gamma, n=3, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 14x14 (PEPit) Setting up the problem: performance measure is the minimum of 3 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 3 function(s) Function 1 : Adding 12 scalar constraint(s) ... Function 1 : 12 scalar constraint(s) added Function 2 : Adding 12 scalar constraint(s) ... Function 2 : 12 scalar constraint(s) added Function 3 : Adding 25 scalar constraint(s) ... Function 3 : 25 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.3333333333330492 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 4.18554080283684e-14 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 4.1967204284666985e-14 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 5.917219186068013e-13 (PEPit) Final upper bound (dual): 0.3333333333330736 and lower bound (primal example): 0.3333333333330492 (PEPit) Duality gap: absolute: 2.4369395390522186e-14 and relative: 7.310818617162887e-14 *** Example file: worst-case performance of the NoLips_2 in Bregman distance *** PEPit guarantee: min_t Dh(x_(t-1), x_(t)) <= 0.333333 (F(x_0) - F(x_n)) Theoretical guarantee: min_t Dh(x_(t-1), x_(t)) <= 0.333333 (F(x_0) - F(x_n))
3.4 Gradient descent on smooth function satisfying quadratic Lojasiewicz inequality (naive version)
- PEPit.examples.nonconvex_optimization.wc_gradient_descent_quadratic_lojasiewicz_naive(L, mu, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and satisfies a quadratic Lojasiewicz inequality:
\[f(x)-f_\star \leqslant \frac{1}{2\mu}\|\nabla f(x) \|^2,\]details can be found in [1,2,3].
This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee
\[f(x_n)-f_\star \leqslant \tau(n, L, \mu, \gamma) (f(x_0) - f(x_\star))\]is valid, where \(x_n\) is the n-th iterates obtained with the gradient method with fixed step-size.
Algorithm: Gradient descent is described as follows, for \(t \in \{ 0, \dots, n-1\}\),
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]where \(\gamma\) is a step-size and.
Theoretical guarantee: We compare with the guarantees from [4, Theorem 3].
- References:
- Parameters:
L (float) – the smoothness parameter.
mu (float) – Lojasiewicz parameter.
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value.
theoretical_tau (float) – theoretical value.
Example
>>> L, mu, gamma, n = 1, .2, 1, 1 >>> pepit_tau, theoretical_tau = wc_gradient_descent_quadratic_lojasiewicz_naive(L=L, gamma=gamma, n=n, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 4x4 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 10 scalar constraint(s) ... Function 1 : 10 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.7272727088286305 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 6.45387237307569e-09 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.656794643018536e-08 (PEPit) Final upper bound (dual): 0.7272727115834279 and lower bound (primal example): 0.7272727088286305 (PEPit) Duality gap: absolute: 2.754797390203123e-09 and relative: 3.7878465075914795e-09 *** Example file: worst-case performance of gradient descent with fixed step-size *** *** (smooth problem satisfying a Lojasiewicz inequality; cheap naive version) *** PEPit guarantee: f(x_1) - f(x_*) <= 0.727273 (f(x_0)-f_*) Theoretical guarantee: f(x_1) - f(x_*) <= 0.727273 (f(x_0)-f_*)
3.5 Gradient descent on smooth function satisfying quadratic Lojasiewicz inequality (intermediate)
- PEPit.examples.nonconvex_optimization.wc_gradient_descent_quadratic_lojasiewicz_intermediate(L, mu, gamma, n, alpha, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and satisfies a quadratic Lojasiewicz inequality:
\[f(x)-f_\star \leqslant \frac{1}{2\mu}\|\nabla f(x) \|^2,\]details can be found in [1,2,3]. The example here relies on the
SmoothQuadraticLojasiewiczFunctionCheapdescription of smooth Lojasiewicz functions (based on [5, Proposition 3.2]).This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee
\[f(x_n)-f_\star \leqslant \tau(n, L, \mu, \gamma) (f(x_0) - f(x_\star))\]is valid, where \(x_n\) is the n-th iterates obtained with the gradient method with fixed step-size.
Algorithm: Gradient descent is described as follows, for \(t \in \{ 0, \dots, n-1\}\),
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]where \(\gamma\) is a step-size and.
Theoretical guarantee: We compare with the guarantees from [4, Theorem 3].
- References:
- Parameters:
L (float) – the smoothness parameter.
mu (float) – Lojasiewicz parameter.
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value.
theoretical_tau (float) – theoretical value.
Example
>>> L, mu, gamma, n = 1, .2, 1, 1 >>> alpha = (2*mu/(2*L+mu)) >>> pepit_tau, theoretical_tau = wc_gradient_descent_quadratic_lojasiewicz_intermediate(L=L, gamma=gamma, n=1, alpha=alpha, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 4x4 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 16 scalar constraint(s) ... Function 1 : 16 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.727272727239017 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 9.529710354172494e-12 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 5.521136597015314e-11 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 3.812156896706273e-11 (PEPit) Final upper bound (dual): 0.7272727272394729 and lower bound (primal example): 0.727272727239017 (PEPit) Duality gap: absolute: 4.558575739110893e-13 and relative: 6.268041641568012e-13 *** Example file: worst-case performance of gradient descent with fixed step-size *** *** (smooth problem satisfying a Lojasiewicz inequality; intermediate version) *** PEPit guarantee: f(x_1) - f(x_*) <= 0.727273 (f(x_0)-f_*) Theoretical guarantee: f(x_1) - f(x_*) <= 0.727273 (f(x_0)-f_*)
3.6 Gradient descent on smooth function satisfying quadratic Lojasiewicz inequality (expensive version)
- PEPit.examples.nonconvex_optimization.wc_gradient_descent_quadratic_lojasiewicz_expensive(L, mu, gamma, n, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]where \(f\) is \(L\)-smooth and satisfies a quadratic Lojasiewicz inequality:
\[f(x)-f_\star \leqslant \frac{1}{2\mu}\|\nabla f(x) \|^2,\]details can be found in [1,2,3]. The example here relies on the
SmoothQuadraticLojasiewiczFunctionExpensivedescription of smooth Lojasiewicz functions (based on [5, Proposition 3.4]).This code computes a worst-case guarantee for gradient descent with fixed step-size \(\gamma\). That is, it computes the smallest possible \(\tau(n, L, \gamma)\) such that the guarantee
\[f(x_n)-f_\star \leqslant \tau(n, L, \mu, \gamma) (f(x_0) - f(x_\star))\]is valid, where \(x_n\) is the n-th iterates obtained with the gradient method with fixed step-size.
Algorithm: Gradient descent is described as follows, for \(t \in \{ 0, \dots, n-1\}\),
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]where \(\gamma\) is a step-size and.
Theoretical guarantee: We compare with the guarantees from [4, Theorem 3].
- References:
- Parameters:
L (float) – the smoothness parameter.
mu (float) – Lojasiewicz parameter.
gamma (float) – step-size.
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value.
theoretical_tau (float) – theoretical value.
Example
>>> L, mu, gamma, n = 1, .2, 1, 1 >>> pepit_tau, theoretical_tau = wc_gradient_descent_quadratic_lojasiewicz_expensive(L=L, gamma=gamma, n=n, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 4x4 (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 1 function(s) Function 1 : Adding 4 scalar constraint(s) ... Function 1 : 4 scalar constraint(s) added Function 1 : Adding 6 lmi constraint(s) ... Size of PSD matrix 1: 4x4 Size of PSD matrix 2: 4x4 Size of PSD matrix 3: 4x4 Size of PSD matrix 4: 4x4 Size of PSD matrix 5: 4x4 Size of PSD matrix 6: 4x4 Function 1 : 6 lmi constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.6832669556328734 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All required PSD matrices are indeed positive semi-definite up to an error of 1.0099203333404037e-09 All the primal scalar constraints are verified (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual matrices to lmi are positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 5.671954340368105e-10 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.0306640495891322e-08 (PEPit) Final upper bound (dual): 0.6832669563172779 and lower bound (primal example): 0.6832669556328734 (PEPit) Duality gap: absolute: 6.844044220244427e-10 and relative: 1.0016647466735981e-09 *** Example file: worst-case performance of gradient descent with fixed step-size *** *** (smooth problem satisfying a Lojasiewicz inequality; expensive version) *** PEPit guarantee: f(x_1) - f(x_*) <= 0.683267 (f(x_0)-f_*) Theoretical guarantee: f(x_1) - f(x_*) <= 0.727273 (f(x_0)-f_*)
3.7 Difference-of-convex algorithm (DCA)
- PEPit.examples.nonconvex_optimization.wc_difference_of_convex_algorithm(mu1, mu2, L1, L2, n, alpha=0, wrapper='cvxpy', solver=None, verbose=1)[source]
Consider the minimization problem
\[F_\star \triangleq \min_x f_1(x)-f_2(x),\]where \(f_1\) and \(f_2\) are convex functions, respectively \(L_1\)-smooth and \(\mu_1\)-strongly convex and \(L_2\)-smooth and \(\mu_2\)-strongly convex.
This code computes a worst-case guarantee for DCA (difference-of-convex algorithm, also known as the convex-concave procedure). That is, it computes the smallest possible \(\tau(n, \mu_1, L_1,\mu_2, L_2)\) such that the guarantee
\[\min_{t\leqslant n} \|\nabla f_1(x_t)-\nabla f_2(x_t)\|^2 \leqslant \tau(n, \mu_1, L_1,\mu_2, L_2) (f_1(x_0)-f_2(x_0)-F_\star)\]is valid, where \(x_n\) is the n-th iterates obtained with DCA.
Algorithm: DCA is described as follows, for \(t \in \{ 0, \dots, n-1\}\),
\[x_{t+1} \in \mathrm{argmin}_x\,\{ f_1(x) - \langle \nabla f_2(x_t), x\rangle\},\]Theoretical guarantee: The results are compared with [1, Theorem 3]; a more complete picture can be obtained from [2], also by possibly allowing for non-convex functions \(f_1\) and \(f_2\) (i.e., possibly negative values for \(\mu_1\), \(\mu_2\)).
References:
- Parameters:
mu1 (float) – strong convexity parameter for f1.
mu2 (float) – strong convexity parameter for f2.
L1 (float) – smoothness parameter for f1.
L2 (float) – smoothness parameter for f2.
alpha (float) – boosting parameter (defaulted to 0).
n (int) – number of iterations.
wrapper (str) – the name of the wrapper to be used.
solver (str) – the name of the solver the wrapper should use.
verbose (int) –
level of information details to print.
-1: No verbose at all.
0: This example’s output.
1: This example’s output + PEPit information.
2: This example’s output + PEPit information + solver details.
- Returns:
pepit_tau (float) – worst-case value.
theoretical_tau (float) – reference theoretical value [1, Theorem 3].
Example
>>> L1, L2, mu1, mu2 = 2., 5., .15, .1 >>> pepit_tau, theory = wc_difference_of_convex_algorithm(mu1=mu1, mu2=mu2, L1=L1, L2=L2, n=5, alpha=0, wrapper="cvxpy", solver=None, verbose=1) (PEPit) Setting up the problem: size of the Gram matrix: 15x15 (PEPit) Setting up the problem: performance measure is the minimum of 6 element(s) (PEPit) Setting up the problem: Adding initial conditions and general constraints ... (PEPit) Setting up the problem: initial conditions and general constraints (7 constraint(s) added) (PEPit) Setting up the problem: interpolation conditions for 2 function(s) Function 1 : Adding 42 scalar constraint(s) ... Function 1 : 42 scalar constraint(s) added Function 2 : Adding 42 scalar constraint(s) ... Function 2 : 42 scalar constraint(s) added (PEPit) Setting up the problem: additional constraints for 0 function(s) (PEPit) Compiling SDP (PEPit) Calling SDP solver (PEPit) Solver status: prosta.prim_and_dual_feas (wrapper:mosek, solver: MOSEK); optimal value: 0.49113062165416893 (PEPit) Primal feasibility check: The solver found a Gram matrix that is positive semi-definite All the primal scalar constraints are verified up to an error of 8.798134450149764e-09 (PEPit) Dual feasibility check: The solver found a residual matrix that is positive semi-definite All the dual scalar values associated with inequality constraints are nonnegative up to an error of 1.0048489946103415e-08 (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.102726387109924e-07 (PEPit) Final upper bound (dual): 0.4911306282045956 and lower bound (primal example): 0.49113062165416893 (PEPit) Duality gap: absolute: 6.550426645546281e-09 and relative: 1.3337442946407816e-08 *** Example file: worst-case performance of DCA *** PEPit guarantee: min_i ||f'(x_i)||^2 <= 0.491131 (f(x_0)-f_*) Theoretical guarantee: min_i ||f'(x_i)||^2 <= 0.491131 (f(x_0)-f_*)