Efficient trajectory and policy optimization using dynamics models
Data-driven approaches hold the promise of creating the next wave of robots that can perform diverse tasks and adapt to unstructured environments. However, gathering data from physical systems is often a labor-intensive, time-consuming, and even dangerous process. This data scarcity motivates us to design algorithms that benefit from prior knowledge without relying too heavily on domain knowledge. One general and compact form of prior knowledge is a dynamics model, which summarizes what we know about the robot from its mechanical design and, through system identification, from prior interactions with it. Unfortunately, using dynamics models to their full potential is often not straightforward: (1) they are computationally expensive, and (2) they can even be harmful if model errors are not taken into account. In this thesis, we address these two issues by focusing on a central problem in robotics: trajectory and policy optimization. We develop new algorithmic and theoretical foundations for (1) computationally efficient trajectory optimization and (2) unbiased, sample-efficient policy optimization. Our research increases the practicality of continuous-time linear dynamics models and Gaussian process dynamics models in real-time incremental trajectory optimization, and it accelerates policy optimization by using dynamics models for prediction and as control variates while avoiding performance bias due to model errors. We evaluate our approaches on a series of robot estimation, planning, and control tasks that involve both simulated data and real robotic systems.