contents 1. convex optimization models: an overview . . . . . . p. 1 1.1. lagrangeduality .......... .......... p.2 1.1.1. separable problems – decomposition . . . . . . . . . p. 7 1.1.2. partitioning .................... p.9 1.2. fenchel duality and conic programming . . . . . . . . . . p. 10 1.2.1. linearconicproblems . . . . . . . . . . . . . . . p.15 1.2.2. second order cone programming . . . . . . . . . . . p. 17 1.2.3. semide.nite programming . . . . . . . . . . . . . . p. 22 1.3. additivecostproblems . . . . . . . . . . . . . . . . . p.25 1.4. largenumberofconstraints . . . . . . . . . . . . . . . p.34 1.5. exactpenalty functions . . . . . . . . . . . . . . . . p.39 1.6. notes,sources,andexercises . . . . . . . . . . . . . . p.47 2. optimization algorithms: an overview . . . . . . . . p. 53 2.1. iterativedescentalgorithms . . . . . . . . . . . . . . . p.55 2.1.1. di.erentiable cost function descent – unconstrained . . . . problems ..................... p.58 2.1.2. constrained problems – feasible direction methods . . . p. 71 2.1.3. nondi.erentiable problems – subgradient methods . . . p. 78 2.1.4. alternative descent methods . . . . . . . . . . . . . p. 80 2.1.5. incrementalalgorithms . . . . . . . . . . . . . . . p.83 2.1.6. distributed asynchronous iterative algorithms . . . . p. 104 2.2. approximationmethods . . . . . . . . . . . . . . . p.106 2.2.1. polyhedral approximation . . . . . . . . . . . . . p. 107 2.2.2. penalty, augmented lagrangian, and interior . . . . . . . pointmethods .................. p.108 2.2.3. proximal algorithm, bundle methods, and . . . . . . . . . tikhonovregularization . . . . . . . . . . . . . . p.110 2.2.4. alternating direction method of multipliers . . . . . p. 111 2.2.5. smoothing of nondi.erentiable problems . . . . . . p. 113 2.3. notes,sources,andexercises . . . . . . . . . . . . . p.119 3. subgradientmethods . . . . . . . . . . . . . . . p.135 3.1. subgradients of convex real-valued functions . . . . . . p. 136 iv contents 3.1.1. characterization of the subdi.erential . . . . . . . . p. 146 3.2. convergence analysis of subgradient methods . . . . . . p. 148 3.3. .-subgradientmethods ................ p.162 3.3.1. connection with incremental subgradient methods . . p. 166 3.4. notes,sources,andexercises . . . . . . . . . . . . . . p.167 4. polyhedral approximation methods . . . . . . . . . p. 181 4.1. outer linearization – cutting plane methods . . . . . . p. 182 4.2. inner linearization – simplicial decomposition . . . . . . p. 188 4.3. duality of outer and inner linearization . . . . . . . . . p. 194 4.4. generalized polyhedral approximation . . . . . . . . . p. 196 4.5. generalized simplicial decomposition . . . . . . . . . . p. 209 4.5.1. di.erentiablecostcase . . . . . . . . . . . . . . p.213 4.5.2. nondi.erentiable cost and side constraints . . . . . p. 213 4.6. polyhedral approximation for conic programming . . . . p. 217 4.7. notes,sources,andexercises . . . . . . . . . . . . . . p.228 5. proximalalgorithms . . . . . . . . . . . . . . . p.233 5.1. basic theory of proximal algorithms . . . . . . . . . . p. 234 5.1.1. convergence ................... p.235 5.1.2. rateofconvergence. . . . . . . . . . . . . . . . p.239 5.1.3. gradient interpretation . . . . . . . . . . . . . . p. 246 5.1.4. fixed point interpretation, overrelaxation, . . . . . . . . . andgeneralization ................ p.248 5.2. dualproximalalgorithms . . . . . . . . . . . . . . . p.256 5.2.1. augmented lagrangian methods . . . . . . . . . . p. 259 5.3. proximal algorithms with linearization . . . . . . . . . p. 268 5.3.1. proximal cutting plane methods . . . . . . . . . . p. 270 5.3.2. bundlemethods ................. p.272 5.3.3. proximal inner linearization methods . . . . . . . . p. 276 5.4. alternating direction methods of multipliers . . . . . . . p. 280 5.4.1. applications in machine learning . . . . . . . . . . p. 286 5.4.2. admm applied to separable problems . . . . . . . p. 289 5.5. notes,sources,andexercises . . . . . . . . . . . . . . p.293 6. additional algorithmic topics . . . . . . . . . . . p. 301 6.1. gradientprojectionmethods . . . . . . . . . . . . . . p.302 6.2. gradient projection with extrapolation . . . . . . . . . p. 322 6.2.1. an algorithm with optimal iteration complexity . . . p. 323 6.2.2. nondi.erentiable cost – smoothing . . . . . . . . . p. 326 6.3. proximalgradientmethods . . . . . . . . . . . . . . p.330 6.4. incremental subgradient proximal methods . . . . . . . p. 340 6.4.1. convergence for methods with cyclic order . . . . . p. 344 contents 6.4.2. convergence for methods with randomized order . . p. 353 6.4.3. application in specially structured problems . . . . . p. 361 6.4.4. incremental constraint projection methods . . . . . p. 365 6.5. coordinatedescentmethods . . . . . . . . . . . . . . p.369 6.5.1. variants of coordinate descent . . . . . . . . . . . p. 373 6.5.2. distributed asynchronous coordinate descent . . . . p. 376 6.6. generalized proximal methods . . . . . . . . . . . . . p. 382 6.7. .-descent and extended monotropic programming . . . . p. 396 6.7.1. .-subgradients .................. p.397 6.7.2. .-descentmethod........ ......... p.400 6.7.3. extended monotropic programming duality . . . . . p. 406 6.7.4. special cases of strong duality . . . . . . . . . . . p. 408 6.8. interiorpointmethods . . . . . . . . . . . . . . . . p.412 6.8.1. primal-dual methods for linear programming . . . . p. 416 6.8.2. interior point methods for conic programming . . . . p. 423 6.8.3. central cutting plane methods . . . . . . . . . . . p. 425 6.9. notes,sources,andexercises . . . . . . . . . . . . . . p.426 appendix a: mathematical background . . . . . . . . p. 443 a.1. linearalgebra ........... ......... p.445 a.2. topologicalproperties . . . . . . . . . . . . . . . . p.450 a.3. derivatives ..................... p.456 a.4. convergencetheorems . . . . . . . . . . . . . . . . p.458 appendix b: convex optimization theory: a summary . p. 467 b.1. basic concepts of convex analysis . . . . . . . . . . . p. 467 b.2. basic concepts of polyhedral convexity . . . . . . . . . p. 489 b.3. basic concepts of convex optimization . . . . . . . . . p. 494 b.4. geometric duality framework . . . . . . . . . . . . . p. 498 b.5. duality andoptimization . . . . . . . . . . . . . . . p.505 references .............. ......... p.519 index ................. ......... p.557