Primitive steps
Inexact gradient step
- PEPit.primitive_steps.inexact_gradient_step(x0, f, gamma, epsilon, notion='absolute')[source]
This routine performs a step \(x \leftarrow x_0 - \gamma d_{x_0}\), where \(d_{x_0}\) is close to the gradient of \(f\) at \(x_0\) in the following sense:
\[\begin{split}\|d_{x_0} - \nabla f(x_0)\|^2 \leqslant \left\{ \begin{eqnarray} & \varepsilon^2 & \text{if notion is set to 'absolute'}, \\ & \varepsilon^2 \|\nabla f(x_0)\|^2 & \text{if notion is set to 'relative'}. \end{eqnarray} \right.\end{split}\]This relative approximation is used in at least 3 PEPit examples, in particular in 2 unconstrained convex minimization examples: an inexact gradient descent, and an inexact accelerated gradient.
- Parameters
- Returns
x (Point) – the output point.
dx0 (Point) – the approximate (sub)gradient of f at x0.
fx0 (Expression) – the value of the function f at x0.
- Raises
ValueError – if notion is not set in [‘absolute’, ‘relative’].
Note
When \(\gamma\) is set to 0, this routine returns \(x_0\), \(d_{x_0}\), and \(f_{x_0}\). It is used as is in the unconstrained convex minimization example called “inexact gradient exact line search”, only to access the direction \(d_{x_0}\) close to the gradient \(g_{x_0}\).
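The relative inexactness model above can be illustrated numerically. The following is a plain-Python sketch (not a call to the PEPit API, which manipulates symbolic Point and Expression objects): it takes a step along a direction sitting on the boundary of the relative accuracy model for \(f(x) = \tfrac{1}{2}x^2\) in one dimension.

```python
# Sketch of the relative model ||d - grad f(x0)||^2 <= eps^2 * ||grad f(x0)||^2
# on f(x) = 0.5 * x^2 (so grad f(x) = x), in one dimension.

def inexact_gradient_step_1d(x0, grad, gamma, epsilon):
    """Step along the worst admissible direction d = (1 + epsilon) * grad(x0)."""
    g = grad(x0)
    d = (1 + epsilon) * g          # |d - g| = epsilon * |g|: boundary of the model
    return x0 - gamma * d, d

grad = lambda x: x                 # gradient of f(x) = 0.5 * x^2
x0, gamma, epsilon = 2.0, 0.5, 0.1
x1, d = inexact_gradient_step_1d(x0, grad, gamma, epsilon)

# The direction satisfies the relative accuracy requirement (with equality here).
assert abs(d - grad(x0)) <= epsilon * abs(grad(x0)) + 1e-12
```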
Exact line-search step
- PEPit.primitive_steps.exact_linesearch_step(x0, f, directions)[source]
This routine outputs some \(x\) by mimicking an exact line/span search in the specified directions. It is used, for instance, in
PEPit.examples.unconstrained_convex_minimization.wc_gradient_exact_line_search
and in PEPit.examples.unconstrained_convex_minimization.wc_conjugate_gradient.
The routine aims at mimicking the operation:
\begin{eqnarray} x & = & x_0 - \sum_{i=1}^{T} \gamma_i d_i,\\ \text{with } \overrightarrow{\gamma} & = & \arg\min_\overrightarrow{\gamma} f\left(x_0 - \sum_{i=1}^{T} \gamma_i d_i\right), \end{eqnarray}where \(T\) denotes the number of directions \(d_i\). This operation can equivalently be described in terms of the following conditions:
\begin{eqnarray} x - x_0 & \in & \text{span}\left\{d_1,\ldots,d_T\right\}, \\ \nabla f(x) & \perp & \text{span}\left\{d_1,\ldots,d_T\right\}. \end{eqnarray}In this routine, we instead constrain \(x_{t}\) and \(\nabla f(x_{t})\) to satisfy
\begin{eqnarray} \forall i=1,\ldots,T: & \left< \nabla f(x);\, d_i \right> & = & 0,\\ \text{and } & \left< \nabla f(x);\, x - x_0 \right> & = & 0, \end{eqnarray}which is a relaxation of the true line/span search conditions.
Note
The last condition is automatically implied by the previous ones.
Warning
One can notice this routine does not completely encode the fact that \(x - x_0\) must be a linear combination of the provided directions (i.e., this routine performs a relaxation). Therefore, if this routine is included in a PEP, the obtained value might be an upper bound on the true worst-case value.
Although not always tight, this relaxation is often observed to deliver pretty accurate results (in particular, it automatically produces tight results under some specific conditions, see, e.g., [1]). Two such examples are provided in the conjugate gradient and gradient with exact line search example files.
References
- Parameters
- Returns
x (Point) – such that all vectors in directions are orthogonal to the (sub)gradient of f at x.
gx (Point) – a (sub)gradient of f at x.
fx (Expression) – the function f evaluated at x.
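The orthogonality conditions imposed by this routine can be checked on a quadratic, where the exact line search has a closed form. The following is a numerical sketch (not the PEPit routine): for \(f(x) = \tfrac{1}{2} x^\top A x\), minimizing along direction \(d\) gives \(\gamma = \langle d, A x_0\rangle / \langle d, A d\rangle\), and the output satisfies \(\langle \nabla f(x), d\rangle = 0\), the first constraint above.

```python
# Exact line search on f(x) = 0.5 * x^T A x along a single direction d:
# the optimal step size is gamma = <d, A x0> / <d, A d>, and the result
# satisfies the orthogonality condition <grad f(x), d> = 0.

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def exact_line_search_quadratic(A, x0, d):
    g0 = matvec(A, x0)                     # gradient of f at x0
    gamma = dot(d, g0) / dot(d, matvec(A, d))
    return [xi - gamma * di for xi, di in zip(x0, d)]

A = [[2.0, 0.0], [0.0, 1.0]]
x0 = [1.0, 1.0]
d = matvec(A, x0)                          # steepest-descent direction at x0
x = exact_line_search_quadratic(A, x0, d)

# grad f(x) is orthogonal to the search direction, as required.
assert abs(dot(matvec(A, x), d)) < 1e-12
```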
Proximal step
- PEPit.primitive_steps.proximal_step(x0, f, gamma)[source]
This routine performs a proximal step of step-size gamma, starting from x0, and on function f. That is, it performs:
\begin{eqnarray} x \triangleq \text{prox}_{\gamma f}(x_0) & \triangleq & \arg\min_x \left\{ \gamma f(x) + \frac{1}{2} \|x - x_0\|^2 \right\}, \\ & \Updownarrow & \\ 0 & = & \gamma g_x + x - x_0 \text{ for some } g_x\in\partial f(x),\\ & \Updownarrow & \\ x & = & x_0 - \gamma g_x \text{ for some } g_x\in\partial f(x). \end{eqnarray}
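The optimality condition \(x = x_0 - \gamma g_x\) with \(g_x \in \partial f(x)\) can be verified in closed form on \(f(x) = |x|\), whose proximal operator is soft-thresholding. This is an illustrative one-dimensional sketch, not the PEPit routine:

```python
# prox of gamma * |.| at x0 is the soft-thresholding operator; the output
# satisfies x = x0 - gamma * g for some subgradient g of |.| at x.

def prox_abs(x0, gamma):
    """prox_{gamma |.|}(x0), i.e., soft-thresholding of x0 at level gamma."""
    if x0 > gamma:
        return x0 - gamma
    if x0 < -gamma:
        return x0 + gamma
    return 0.0

x0, gamma = 1.5, 1.0
x = prox_abs(x0, gamma)          # 0.5
g = (x0 - x) / gamma             # recover the subgradient from x = x0 - gamma * g
assert -1.0 <= g <= 1.0          # g lies in the subdifferential of |.| (here g = 1)
```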
Inexact proximal step
- PEPit.primitive_steps.inexact_proximal_step(x0, f, gamma, opt='PD_gapII')[source]
This routine encodes an inexact proximal operation with step size \(\gamma\). That is, it outputs a tuple \((x, g\in \partial f(x), f(x), w, v\in\partial f(w), f(w), \varepsilon)\) whose entries are described as follows.
First, \(x\) is an approximation to the proximal point of \(x_0\) on function \(f\):
\[x \approx \mathrm{prox}_{\gamma f}(x_0)\triangleq\arg\min_x \left\{ \gamma f(x) + \frac{1}{2}\|x-x_0\|^2\right\},\]where the meaning of \(\approx\) depends on the option “opt” and is explained below. The notions of inaccuracy implemented within this routine are specified using primal and dual proximal problems, denoted by
\begin{eqnarray} &\Phi^{(p)}_{\gamma f}(x; x_0) \triangleq \gamma f(x) + \frac{1}{2}\|x-x_0\|^2,\\ &\Phi^{(d)}_{\gamma f}(v; x_0) \triangleq -\gamma f^*(v)-\frac{1}{2}\|x_0-\gamma v\|^2 + \frac{1}{2}\|x_0\|^2,\\ \end{eqnarray}where \(\Phi^{(p)}_{\gamma f}(x;x_0)\) and \(\Phi^{(d)}_{\gamma f}(v;x_0)\) respectively denote the primal and the dual proximal problems, and where \(f^*\) is the Fenchel conjugate of \(f\). The options below encode different meanings of “\(\approx\)” by specifying accuracy requirements on primal-dual pairs:
\[(x,v) \approx_{\varepsilon} \left(\mathrm{prox}_{\gamma f}(x_0),\,\mathrm{prox}_{f^*/\gamma}(x_0/\gamma)\right),\]where \(\approx_{\varepsilon}\) corresponds to requiring the primal-dual pair \((x,v)\) to satisfy some primal-dual accuracy requirement:
\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(v;x_0) \leqslant \varepsilon,\]where \(\varepsilon\geqslant 0\) is the error magnitude, which is returned to the user so that it can be further constrained to be bounded by some other value.
Relation to the exact proximal operation: In the exact case (no error in the computation, \(\varepsilon=0\)), \(v\) corresponds to the solution of the dual proximal problem and one can write
\[x = x_0-\gamma g,\]with \(g=v=\mathrm{prox}_{f^*/\gamma}(x_0/\gamma)\in\partial f(x)\), and \(x=w\).
Reformulation of the primal-dual gap: Compared with the exact proximal computation, the inexact case under consideration here can be described as performing
\[x = x_0-\gamma v + e,\]where \(v\) is an \(\epsilon\)-subgradient of \(f\) at \(x\) (notation \(v\in\partial_{\epsilon} f(x)\)) and \(e\) is some additional computation error. Those elements allow for a common convenient reformulation of the primal-dual gap, written in terms of the magnitudes of \(\epsilon\) and of \(e\):
\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(v;x_0) = \frac{1}{2} \|e\|^2 + \gamma \epsilon.\]Options: The following options are available (a list of such choices is presented in [4]; we provide a reference for each of those choices below).
‘PD_gapI’ : the constraint imposed on the output is the vanilla primal-dual gap (see, e.g., [2]):
\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(v;x_0) \leqslant \varepsilon.\]
This approximation requirement is used in one PEPit example: an accelerated inexact forward-backward method.
‘PD_gapII’ : the constraint is stronger than the vanilla primal-dual gap, as more structure is imposed (see, e.g., [1,5]):
\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(g;x_0) \leqslant \varepsilon,\]
where we imposed that \(v\triangleq g\in\partial f(x)\) and \(w\triangleq x\). This approximation requirement is used in two PEPit examples: in a relatively inexact proximal point algorithm and in a partially inexact Douglas-Rachford splitting.
‘PD_gapIII’ : the constraint is stronger than the vanilla primal-dual gap, as more structure is imposed (see, e.g., [3]):
\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(\tfrac{x_0 - x}{\gamma};x_0) \leqslant \varepsilon,\]
where we imposed that \(v \triangleq \frac{x_0 - x}{\gamma}\).
References
- Parameters
- Returns
x (Point) – the approximated proximal point.
gx (Point) – a (sub)gradient of f at x (subgradient used in evaluating the accuracy criterion).
fx (Expression) – f evaluated at x.
w (Point) – a point w such that v (see next output) is a subgradient of f at w.
v (Point) – the approximated proximal point of the dual problem, (sub)gradient of f evaluated at w.
fw (Expression) – f evaluated at w.
eps_var (Expression) – value of the primal-dual gap (which can be further bounded by the user).
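The primal-dual gap identity \(\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(v;x_0) = \frac{1}{2}\|e\|^2 + \gamma\epsilon\) stated above can be checked numerically. The following one-dimensional sketch uses \(f(x) = \tfrac{1}{2}x^2\) (so \(f^*(v) = \tfrac{1}{2}v^2\) and \(\nabla f(x) = x\)), for which the smallest \(\epsilon\) with \(v \in \partial_\epsilon f(x)\) is \(\tfrac{1}{2}(x-v)^2\). This is an illustration of the identity, not the PEPit routine:

```python
# Check of the gap identity Phi_p(x; x0) - Phi_d(v; x0) = 0.5 * e^2 + gamma * eps
# for f(x) = 0.5 * x^2, with x = x0 - gamma * v + e and eps = 0.5 * (x - v)^2.

def primal(x, x0, gamma):
    return gamma * 0.5 * x ** 2 + 0.5 * (x - x0) ** 2

def dual(v, x0, gamma):
    return -gamma * 0.5 * v ** 2 - 0.5 * (x0 - gamma * v) ** 2 + 0.5 * x0 ** 2

x0, gamma = 1.0, 1.0
v, e = 0.4, 0.1                      # inexact dual point and computation error
x = x0 - gamma * v + e               # x = 0.7
eps = 0.5 * (x - v) ** 2             # smallest eps with v in the eps-subdifferential at x

gap = primal(x, x0, gamma) - dual(v, x0, gamma)
assert abs(gap - (0.5 * e ** 2 + gamma * eps)) < 1e-12
```

In the exact case (\(v = x = x_0/(1+\gamma)\), \(e = 0\), \(\epsilon = 0\)) the gap vanishes, recovering the relation to the exact proximal operation described above.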
Bregman gradient step
- PEPit.primitive_steps.bregman_gradient_step(gx0, sx0, mirror_map, gamma)[source]
This routine outputs \(x\) by performing a mirror step of step-size \(\gamma\). That is, denoting \(f\) the function to be minimized and \(h\) the mirror map, it performs
\[x = \arg\min_x \left[ f(x_0) + \left< \nabla f(x_0);\, x - x_0 \right> + \frac{1}{\gamma} D_h(x; x_0) \right],\]where \(D_h(x; x_0)\) denotes the Bregman divergence of \(h\) on \(x\) with respect to \(x_0\).
\[D_h(x; x_0) \triangleq h(x) - h(x_0) - \left< \nabla h(x_0);\, x - x_0 \right>.\]
Warning
The mirror map \(h\) is assumed differentiable.
By differentiating the previous objective function, one can observe that
\[\nabla h(x) = \nabla h(x_0) - \gamma \nabla f(x_0).\]
- Parameters
- Returns
x (Point) – new iterate \(\textbf{x} \triangleq x\).
sx (Point) – \(h\)’s gradient on new iterate \(x\) \(\textbf{sx} \triangleq \nabla h(x)\).
hx (Expression) – \(h\)’s value on new iterate \(\textbf{hx} \triangleq h(x)\).
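The defining relation \(\nabla h(x) = \nabla h(x_0) - \gamma \nabla f(x_0)\) has a well-known closed form for the (negative) entropy mirror map \(h(x) = \sum_i x_i \log x_i\) on the positive orthant, where \(\nabla h(x)_i = \log x_i + 1\): the update becomes the multiplicative step \(x_i = x_{0,i}\, e^{-\gamma g_i}\). A numerical sketch (not the PEPit routine, which works with symbolic points):

```python
import math

# Mirror step grad h(x) = grad h(x0) - gamma * g for the entropy mirror map
# h(x) = sum_i x_i * log(x_i), i.e., the closed form x_i = x0_i * exp(-gamma * g_i).

def bregman_gradient_step_entropy(x0, g, gamma):
    return [xi * math.exp(-gamma * gi) for xi, gi in zip(x0, g)]

x0 = [0.5, 0.5]
g = [1.0, -1.0]                     # gradient of f at x0 (any fixed vector works here)
gamma = 0.1
x = bregman_gradient_step_entropy(x0, g, gamma)

# The defining relation grad h(x) = grad h(x0) - gamma * g holds coordinate-wise.
for xi, x0i, gi in zip(x, x0, g):
    assert abs((math.log(xi) + 1) - (math.log(x0i) + 1 - gamma * gi)) < 1e-12
```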
Bregman proximal step
- PEPit.primitive_steps.bregman_proximal_step(sx0, mirror_map, min_function, gamma)[source]
This routine outputs \(x\) by performing a proximal mirror step of step-size \(\gamma\). That is, denoting \(f\) the function to be minimized and \(h\) the mirror map, it performs
\[x = \arg\min_x \left[ f(x) + \frac{1}{\gamma} D_h(x; x_0) \right],\]where \(D_h(x; x_0)\) denotes the Bregman divergence of \(h\) on \(x\) with respect to \(x_0\).
\[D_h(x; x_0) \triangleq h(x) - h(x_0) - \left< \nabla h(x_0);\, x - x_0 \right>.\]
Warning
The mirror map \(h\) is assumed differentiable.
By differentiating the previous objective function, one can observe that
\[\nabla h(x) = \nabla h(x_0) - \gamma \nabla f(x).\]
- Parameters
- Returns
x (Point) – new iterate \(\textbf{x} \triangleq x\).
sx (Point) – \(h\)’s gradient on new iterate \(x\) \(\textbf{sx} \triangleq \nabla h(x)\).
hx (Expression) – \(h\)’s value on new iterate \(\textbf{hx} \triangleq h(x)\).
gx (Point) – \(f\)’s gradient on new iterate \(x\) \(\textbf{gx} \triangleq \nabla f(x)\).
fx (Expression) – \(f\)’s value on new iterate \(\textbf{fx} \triangleq f(x)\).
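Note that, unlike the Bregman gradient step, the relation \(\nabla h(x) = \nabla h(x_0) - \gamma \nabla f(x)\) is implicit: the gradient of \(f\) is evaluated at the new point \(x\). With the Euclidean mirror map \(h(x) = \tfrac{1}{2}\|x\|^2\) (so \(\nabla h(x) = x\)), it reduces to the classical proximal step \(x = x_0 - \gamma \nabla f(x)\). The following sketch solves this implicit equation for \(f(x) = \tfrac{1}{2}x^2\), where it has the closed form \(x = x_0/(1+\gamma)\):

```python
# Bregman proximal step with h(x) = 0.5 * x^2 and f(x) = 0.5 * x^2: the implicit
# relation x = x0 - gamma * grad f(x) = x0 - gamma * x solves to x = x0 / (1 + gamma).

def bregman_proximal_step_quadratic(x0, gamma):
    """Solve x = x0 - gamma * x, the implicit step for f(x) = 0.5 * x^2."""
    return x0 / (1.0 + gamma)

x0, gamma = 2.0, 1.0
x = bregman_proximal_step_quadratic(x0, gamma)     # 1.0

# The implicit optimality condition grad h(x) = grad h(x0) - gamma * grad f(x) holds.
assert abs(x - (x0 - gamma * x)) < 1e-12
```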
Linear optimization step
- PEPit.primitive_steps.linear_optimization_step(dir, ind)[source]
This routine outputs the result of a minimization problem with linear objective (whose direction is provided by dir) on the domain of the (closed convex) indicator function ind. That is, it outputs a solution to
\[\arg\min_{\text{ind}(x)=0} \left< \text{dir};\, x \right>.\]One can notice that \(x\) is a solution of this problem if and only if
\[- \text{dir} \in \partial \text{ind}(x).\]
- Parameters
dir (Point) – direction of optimization
ind (ConvexIndicatorFunction) – convex indicator function
- Returns
x (Point) – the optimal point.
gx (Point) – the (sub)gradient of ind on x.
fx (Expression) – the function value of ind on x.
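A concrete instance of such a linear minimization oracle is the box \([-1,1]^n\), whose indicator function is closed and convex: the minimizer of \(\left<\text{dir}; x\right>\) picks \(x_i = -\mathrm{sign}(\text{dir}_i)\) in each coordinate, and the optimality condition \(-\text{dir} \in \partial\,\text{ind}(x)\) (the normal cone at \(x\)) holds coordinate-wise. An illustrative sketch, not the PEPit routine:

```python
# Linear minimization oracle over the box [-1, 1]^n:
# argmin_{x in box} <dir, x> has coordinates x_i = -sign(dir_i).

def lmo_box(direction):
    return [-1.0 if di > 0 else 1.0 for di in direction]

direction = [0.5, -2.0, 1.0]
x = lmo_box(direction)              # [-1.0, 1.0, -1.0]

# Verify optimality coordinate-wise against both box endpoints.
assert all(di * xi <= di * s for di, xi in zip(direction, x) for s in (-1.0, 1.0))
```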