Block-decomposition and accelerated gradient methods for large-scale convex optimization
Ortiz Diaz, Camilo
MetadataShow full item record
In this thesis, we develop block-decomposition (BD) methods and variants of accelerated *9gradient methods for large-scale conic programming and convex optimization, respectively. The BD methods, discussed in the first two parts of this thesis, are inexact versions of proximal-point methods applied to two-block-structured inclusion problems. The adaptive accelerated methods, presented in the last part of this thesis, can be viewed as new variants of Nesterov's optimal method. In an effort to improve their practical performance, these methods incorporate important speed-up refinements motivated by theoretical iteration-complexity bounds and our observations from extensive numerical experiments. We provide several benchmarks on various important problem classes to demonstrate the efficiency of the proposed methods compared to the most competitive ones proposed earlier in the literature. In the first part of this thesis, we consider exact BD first-order methods for solving conic semidefinite programming (SDP) problems and the more general problem that minimizes the sum of a convex differentiable function with Lipschitz continuous gradient, and two other proper closed convex (possibly, nonsmooth) functions. More specifically, these problems are reformulated as two-block monotone inclusion problems and exact BD methods, namely the ones that solve both proximal subproblems exactly, are used to solve them. In addition to being able to solve standard form conic SDP problems, the latter approach is also able to directly solve specially structured non-standard form conic programming problems without the need to add additional variables and/or constraints to bring them into standard form. Several ingredients are introduced to speed-up the BD methods in their pure form such as: adaptive (aggressive) choices of stepsizes for performing the extragradient step; and dynamic updates of scaled inner products to balance the blocks. Finally, computational results on several classes of SDPs are presented showing that the exact BD methods outperform the three most competitive codes for solving large-scale conic semidefinite programming. In the second part of this thesis, we present an inexact BD first-order method for solving standard form conic SDP problems which avoids computations of exact projections onto the manifold defined by the affine constraints and, as a result, is able to handle extra large-scale SDP instances. In this BD method, while the proximal subproblem corresponding to the first block is solved exactly, the one corresponding to the second block is solved inexactly in order to avoid finding the exact solution of a linear system corresponding to the manifolds consisting of both the primal and dual affine feasibility constraints. Our implementation uses the conjugate gradient method applied to a reduced positive definite dual linear system to obtain inexact solutions of the latter augmented primal-dual linear system. In addition, the inexact BD method incorporates a new dynamic scaling scheme that uses two scaling factors to balance three inclusions comprising the optimality conditions of the conic SDP. Finally, we present computational results showing the efficiency of our method for solving various extra large SDP instances, several of which cannot be solved by other existing methods, including some with at least two million constraints and/or fifty million non-zero coefficients in the affine constraints. In the last part of this thesis, we consider an adaptive accelerated gradient method for a general class of convex optimization problems. More specifically, we present a new accelerated variant of Nesterov's optimal method in which certain acceleration parameters are adaptively (and aggressively) chosen so as to: preserve the theoretical iteration-complexity of the original method; and substantially improve its practical performance in comparison to the other existing variants. Computational results are presented to demonstrate that the proposed adaptive accelerated method performs quite well compared to other variants proposed earlier in the literature.