## Approximation Algorithms for Mixed Integer Non-Linear Optimization Problems

##### Abstract

Mixed integer non-linear optimization (MINLO) problems are usually NP-hard. Although obtaining feasible solutions is relatively easy via heuristic or local search methods, it remains challenging to certify the quality (i.e., the gap to the optimal value) of a given feasible solution in a tractable fashion, even under mild assumptions. In this thesis, we propose efficient mixed integer linear programming (MILP)-based algorithms for finding feasible solutions, and for proving the quality of these solutions, for three widely applied MINLO problems.
In Chapter 1, we study the sparse principal component analysis (SPCA) problem. SPCA is a dimensionality reduction tool in statistics. Compared with classical principal component analysis (PCA), SPCA enhances interpretability by incorporating an additional sparsity constraint on the feature weights (factor loadings). However, unlike PCA, solving the SPCA problem to optimality is NP-hard. Most conventional methods for SPCA are heuristics that provide no guarantees on solution quality, such as certificates of optimality via associated \emph{dual bounds}. We present a convex integer programming (IP) framework that derives dual bounds based on the $\ell_1$-relaxation of SPCA, and we establish a theoretical worst-case guarantee on the dual bounds it provides. Numerical results empirically demonstrate that our convex IP framework outperforms existing SPCA methods in both the accuracy and the efficiency of finding dual bounds. Moreover, the dual bounds obtained in computation are significantly better than the worst-case theoretical guarantees.
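For context (not stated explicitly in this abstract), the standard single-component SPCA problem for a covariance matrix $A \in \mathbb{S}^n_+$ and a sparsity level $k$ can be written as:

```latex
\max_{x \in \mathbb{R}^n} \; x^\top A x
\quad \text{s.t.} \quad \|x\|_2 = 1, \;\; \|x\|_0 \le k,
```

where $\|x\|_0$ counts the nonzero entries of $x$. The $\ell_1$-relaxation referenced above follows the standard device of replacing the combinatorial constraint $\|x\|_0 \le k$ with the convex surrogate $\|x\|_1 \le \sqrt{k}$, which is valid for any unit vector with at most $k$ nonzeros by the Cauchy-Schwarz inequality.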
Chapter 2 focuses on solving a non-trivial generalization of SPCA -- the (row) sparse principal component analysis (rsPCA) problem. Solving rsPCA amounts to finding the top $r$ leading principal components of a covariance matrix such that all of these principal components share the same support set, of cardinality at most $k$. In this chapter, we propose: (a) a convex integer programming relaxation of rsPCA that yields upper (dual) bounds for rsPCA, and (b) a new local search algorithm for finding primal feasible solutions for rsPCA. We also show that, in the worst case, the dual bounds provided by the convex IP are within an affine function of the optimal value. We demonstrate the effectiveness of our techniques on large-scale covariance matrices.
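A common way to state the rsPCA problem described above (one formulation consistent with the abstract, not a quotation from the thesis) is as a trace maximization over matrices with orthonormal columns and a shared row support:

```latex
\max_{W \in \mathbb{R}^{n \times r}} \; \operatorname{tr}\!\left(W^\top A W\right)
\quad \text{s.t.} \quad W^\top W = I_r, \;\;
\bigl|\{\, i : W_{i\cdot} \neq 0 \,\}\bigr| \le k,
```

where $A$ is the covariance matrix and the second constraint forces all $r$ components (the columns of $W$) to be supported on the same set of at most $k$ coordinates. Setting $r = 1$ recovers the SPCA problem of Chapter 1.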
In Chapter 3, we consider a fundamental training problem: finding the best-fitting ReLU with respect to square loss -- also called ``ReLU regression'' in machine learning. We begin by proving the NP-hardness of ReLU regression. We then present an approximation algorithm whose running time is $\mathcal{O}(n^k)$, where $n$ is the number of samples and $k$ is a predefined integer constant serving as an algorithm parameter. We analyze the performance of this algorithm under two regimes and show that: (1) given an arbitrary set of training samples, the algorithm guarantees an $(n/k)$-approximation for the ReLU regression problem -- to the best of our knowledge, this is the first approximation-ratio guarantee for arbitrary data; in particular, in the ideal case (i.e., when the training error is zero), the approximation algorithm attains the globally optimal solution; and (2) given training samples with Gaussian noise, the same algorithm achieves a much better asymptotic approximation ratio that is independent of the number of samples $n$. Extensive numerical studies show that our approximation algorithm can outperform the classical gradient descent algorithm on ReLU regression. Moreover, the numerical results suggest that the proposed approximation algorithm can provide a good initialization for gradient descent and significantly improve its performance.
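To make the ReLU regression objective concrete, the sketch below (an illustrative aid, not code from the thesis; the function name and tiny dataset are invented for this example) evaluates the square loss $\sum_i (\max(0, w^\top x_i) - y_i)^2$ of a single ReLU unit:

```python
import numpy as np

def relu_regression_loss(w, X, y):
    """Square loss of a single ReLU unit: sum_i (max(0, w.x_i) - y_i)^2."""
    preds = np.maximum(0.0, X @ w)  # ReLU applied to each inner product w.x_i
    return float(np.sum((preds - y) ** 2))

# Tiny 1-D example: with w = [1], the ReLU fits this data exactly,
# so the training error is zero (the "ideal case" in the abstract).
X = np.array([[1.0], [2.0], [-1.0]])
y = np.array([1.0, 2.0, 0.0])
print(relu_regression_loss(np.array([1.0]), X, y))  # 0.0
print(relu_regression_loss(np.array([0.5]), X, y))  # 1.25
```

Note that the loss is non-convex in $w$ because of the $\max(0, \cdot)$ nonlinearity, which is why plain gradient descent can stall and why a guaranteed approximation algorithm (or a good initialization for gradient descent) is valuable.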