Show simple item record

dc.contributor.authorWilmes, John
dc.date.accessioned2017-11-08T17:47:36Z
dc.date.available2017-11-08T17:47:36Z
dc.date.issued2017-10-30
dc.identifier.urihttp://hdl.handle.net/1853/58904
dc.descriptionPresented as part of the ARC 11 lecture on October 30, 2017 at 10:00 a.m. in the Klaus Advanced Computing Building, room 1116.en_US
dc.descriptionJohn Wilmes is a postdoc at the Algorithms and Randomness Center at Georgia Tech. Previously, I completed my Ph.D. at the University of Chicago under the supervision of László Babai. His research includes machine learning, the theory of computing, and discrete math.en_US
dc.descriptionRuntime: 30:01 minutesen_US
dc.description.abstractThe empirical successes of neural networks currently lack rigorous theoretical explanation. A first step might be to show that data generated by neural networks with a single hidden layer, smooth activation functions and benign input distributions can be learned efficiently. We demonstrate a surprisingly general lower bound. For inputs drawn from any logconcave distribution, we construct a family of functions that is hard to learn in the sense that any statistical query algorithm (including all known variants of stochastic gradient descent with any loss function, for any model architecture) needs an exponential number of queries even using tolerance inversely proportional to the input dimensionality. Moreover, this hard family of functions is realizable as neural networks using any activation function from a large family (including sigmoid and ReLU) with a small (sublinear in dimension) number of units in the single hidden layer, and whose output is a sum gate. Joint work with Le Song, Santosh Vempala, and Bo Xie.en_US
dc.format.extent30:01 minutes
dc.language.isoen_USen_US
dc.relation.ispartofseriesARC11en_US
dc.subjectMachine learningen_US
dc.subjectNeural networksen_US
dc.subjectRegressionen_US
dc.subjectStatistical query complexityen_US
dc.titleThe Complexity of Learning Neural Networksen_US
dc.typeLectureen_US
dc.typeVideoen_US
dc.contributor.corporatenameGeorgia Institute of Technology. Algorithms, Randomness and Complexity Centeren_US


Files in this item

This item appears in the following Collection(s)

  • ARC Talks and Events [84]
    Distinguished lectures, colloquia, seminars and speakers of interest to the ARC community

Show simple item record