
dc.contributor.author: Schapire, Robert
dc.date.accessioned: 2017-11-08T20:48:48Z
dc.date.available: 2017-11-08T20:48:48Z
dc.date.issued: 2017-10-30
dc.identifier.uri: http://hdl.handle.net/1853/58911
dc.description: Presented as part of the ARC11 lecture on October 30, 2017 at 10:00 a.m. in the Klaus Advanced Computing Building, Room 1116.
dc.description: Robert Schapire is a Principal Researcher at Microsoft Research in New York City. His main research interest is in theoretical and applied machine learning, with particular focus on boosting, online learning, game theory, and maximum entropy.
dc.description: Runtime: 63:55 minutes
dc.description.abstract: We consider how to learn through experience to make intelligent decisions. In the generic setting, called the contextual bandits problem, the learner must repeatedly decide which action to take in response to an observed context, and is then permitted to observe the received reward, but only for the chosen action. The goal is to learn to behave nearly as well as the best policy (or decision rule) in some possibly very large and rich space of candidate policies. This talk will describe progress on developing general methods for this problem and some of its variants.
dc.format.extent: 63:55 minutes
dc.language.iso: en_US
dc.relation.ispartofseries: ARC11
dc.subject: Contextual bandits
dc.subject: Machine learning
dc.title: The Contextual Bandits Problem: Techniques for Learning to Make High-Reward Decisions
dc.type: Lecture
dc.type: Video
dc.contributor.corporatename: Georgia Institute of Technology. Algorithms, Randomness and Complexity Center
dc.contributor.corporatename: Microsoft Research
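
The abstract above describes the contextual bandit interaction protocol: the learner observes a context, chooses an action, and sees the reward for that action alone. The following Python sketch is a rough illustration of that loop only; it uses a standard epsilon-greedy baseline, not the methods presented in the talk, and the context space, reward table, and exploration rate are invented for the example.

# Minimal, hypothetical sketch of the contextual bandit protocol with an
# epsilon-greedy learner. Illustration only; not the algorithm from the talk.

import random

NUM_ACTIONS = 3
CONTEXTS = ["morning", "evening"]  # toy context space (assumption)
EPSILON = 0.1                      # exploration rate (assumption)

# Expected rewards, unknown to the learner; used only to simulate feedback.
TRUE_REWARD = {
    ("morning", 0): 0.2, ("morning", 1): 0.8, ("morning", 2): 0.4,
    ("evening", 0): 0.7, ("evening", 1): 0.1, ("evening", 2): 0.5,
}

def main():
    rng = random.Random(0)
    # Running empirical mean reward for each (context, action) pair.
    counts = {key: 0 for key in TRUE_REWARD}
    value = {key: 0.0 for key in TRUE_REWARD}

    total_reward = 0.0
    rounds = 10_000
    for _ in range(rounds):
        context = rng.choice(CONTEXTS)      # context is revealed
        if rng.random() < EPSILON:          # explore: uniform random action
            action = rng.randrange(NUM_ACTIONS)
        else:                               # exploit: best current estimate
            action = max(range(NUM_ACTIONS), key=lambda a: value[(context, a)])
        # Bandit feedback: reward is observed only for the chosen action.
        reward = 1.0 if rng.random() < TRUE_REWARD[(context, action)] else 0.0
        total_reward += reward
        key = (context, action)
        counts[key] += 1
        value[key] += (reward - value[key]) / counts[key]  # incremental mean

    # Benchmark: the best fixed policy picks the best action per context.
    best = sum(max(TRUE_REWARD[(c, a)] for a in range(NUM_ACTIONS))
               for c in CONTEXTS) / len(CONTEXTS)
    print(f"average reward: {total_reward / rounds:.3f} (best policy ~{best:.3f})")

if __name__ == "__main__":
    main()

Because feedback arrives only for the chosen action, the learner must trade off exploration against exploitation; the talk's goal of competing with the best policy in a large policy class requires more sophisticated techniques than this per-context tabulation.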


This item appears in the following Collection(s)

  • ARC Talks and Events [68]
    Distinguished lectures, colloquia, seminars and speakers of interest to the ARC community
