Show simple item record

dc.contributor.author: Chintalapudi, Sahit
dc.date.accessioned: 2020-11-09T16:59:02Z
dc.date.available: 2020-11-09T16:59:02Z
dc.date.created: 2019-12
dc.date.submitted: December 2019
dc.identifier.uri: http://hdl.handle.net/1853/63845
dc.description.abstract: Existing Model Predictive Control methods rely on finite-horizon trajectories sampled from the environment. Such methods are limited by the length of these samples, because the robot cannot plan for scenarios beyond the time horizon. Simply extending the horizon of the sampled trajectories is not feasible: a longer horizon requires more sampled trajectories to maintain controller performance. On robots such as the AutoRally platform, which operate in real time with limited computational power, increasing the number of sampled trajectories is computationally intractable. This work improves the long-term planning capabilities of autonomous systems by augmenting the cost estimates of sampled trajectories with a learned value of the terminal state. This learned value approximates the expected cost incurred from the terminal state under the car's current control policy, over an arbitrary time horizon, without requiring an increase in the number of samples. We show that this improves the lap times of the AutoRally platform.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.publisher: Georgia Institute of Technology
dc.subject: Model-Predictive Control
dc.subject: Reinforcement Learning
dc.subject: Autonomous Driving
dc.title: Improving Model-Predictive Control with Value Function Approximation
dc.type: Undergraduate Research Option Thesis
dc.description.degree: Undergraduate
dc.contributor.department: Computer Science
thesis.degree.level: Undergraduate
dc.contributor.committeeMember: Boots, Byron
dc.contributor.committeeMember: Tsiotras, Panagiotis
dc.date.updated: 2020-11-09T16:59:02Z
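
The augmentation described in the abstract can be sketched in a few lines. Below is a minimal, self-contained illustration of an MPPI-style sampling controller whose finite-horizon trajectory costs are augmented with a learned terminal value. The dynamics, cost, and value functions here (simple_dynamics, running_cost, terminal_value) and the parameters (horizon, n_samples, lam, noise_std) are hypothetical stand-ins for this sketch, not the thesis's AutoRally implementation.

```python
# Sketch: augmenting finite-horizon sampled-trajectory costs with a
# learned terminal value, in the style of MPPI. All functions below are
# illustrative stand-ins, not the AutoRally code.
import numpy as np

def simple_dynamics(state, control, dt=0.05):
    # Toy point-mass dynamics: state = [position, velocity].
    pos, vel = state
    return np.array([pos + vel * dt, vel + control * dt])

def running_cost(state, control):
    # Penalize distance from the origin and control effort.
    return state[0] ** 2 + 0.1 * control ** 2

def terminal_value(state):
    # Stand-in for the learned value function V(s): a fixed quadratic
    # guess at the cost-to-go. In the thesis this role is played by a
    # model trained to predict expected cost under the current policy.
    return 10.0 * (state[0] ** 2 + state[1] ** 2)

def mppi_step(state, horizon=20, n_samples=512, lam=1.0, noise_std=1.0):
    # Sample control sequences and roll each one out for `horizon` steps.
    controls = np.random.randn(n_samples, horizon) * noise_std
    costs = np.zeros(n_samples)
    for k in range(n_samples):
        s = state.copy()
        for t in range(horizon):
            costs[k] += running_cost(s, controls[k, t])
            s = simple_dynamics(s, controls[k, t])
        # Key step from the abstract: add the learned value of the
        # terminal state, so the finite-horizon cost also reflects what
        # happens beyond the sampled trajectory.
        costs[k] += terminal_value(s)
    # Softmin-weighted average of the sampled control sequences (MPPI).
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    return weights @ controls  # weighted sequence; execute its first action

if __name__ == "__main__":
    u_seq = mppi_step(np.array([2.0, 0.0]))
    print("first control to apply:", u_seq[0])
```

Because terminal_value approximates the expected cost from the terminal state under the current policy, the controller can account for an effectively longer horizon without drawing any additional trajectory samples, which is the computational saving the abstract emphasizes.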

