The Joy of PCA
Principal Component Analysis (PCA) is the most widely used technique for analyzing high-dimensional or large data. For typical applications (nearest neighbor, clustering, learning), it is not hard to build examples on which PCA "fails." Yet it remains popular and successful across a variety of data-rich areas. In this talk, we focus on two algorithmic problems where the performance of PCA is provably near-optimal and no other method is known to have similar guarantees. The problems we consider are (a) the classical statistical problem of unraveling a sample from a mixture of k unknown Gaussians and (b) the classic learning theory problem of learning an intersection of k halfspaces. During the talk, we will encounter recent extensions of PCA that are noise-resistant, affine-invariant and nonviolent.