Leveraging Value-awareness for Online and Offline Model-based Reinforcement Learning
Model-based Reinforcement Learning (RL) lies at the intersection of planning and learning for sequential decision making. Value-awareness in model learning has recently emerged as a way to build task or reward information into the model learning objective, so that the model can exploit the specifics of a task. Although it has been shown in theory to be superior to maximum likelihood estimation in the context of (online) model-based RL, value-awareness has remained impractical for most non-trivial tasks. This thesis aims to bridge the gap between theory and practice by applying the principle of value-awareness to two settings: online RL and offline RL.

First, within online RL, this thesis revisits value-aware model learning from the perspective of minimizing performance difference, deriving a novel value-aware model learning objective that directly upper-bounds the performance difference. The thesis then investigates and remedies the issue of stale value estimates, which has so far held back the practicality of value-aware model learning. With the proposed remedy, the thesis demonstrates performance improvements over maximum-likelihood-based baselines and existing value-aware objectives on several continuous control tasks, and also makes existing value-aware objectives performant.

In the offline RL context, this thesis takes a step back from model learning and applies value-awareness to data augmentation. When applied to model-based offline RL algorithms, this augmentation makes it possible to leverage unseen states with low epistemic uncertainty that were previously unreachable within the assumptions and limitations of model-based offline RL. Value-aware state augmentations are found to yield better performance on offline RL benchmarks than existing baselines and non-value-aware alternatives.
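To make the contrast between maximum likelihood and value-awareness concrete, the following is a minimal sketch on a toy tabular problem. It uses the generic form of a value-aware model loss (the squared gap between the value of the observed next state and the model's expected next-state value); it is not the novel objective derived in the thesis. The function names, the three-state setup, and the numbers are all assumptions chosen for illustration.

```python
import numpy as np

def mle_loss(p_hat, next_states):
    """Average negative log-likelihood of the observed next states.

    p_hat: (N, S) array, model's next-state distribution for each transition.
    next_states: (N,) array of observed next-state indices.
    """
    probs = p_hat[np.arange(len(next_states)), next_states]
    return float(-np.mean(np.log(probs)))

def value_aware_loss(p_hat, next_states, values):
    """Generic value-aware loss (illustrative, not the thesis objective).

    Penalizes the squared difference between the value of the observed
    next state and the expected next-state value under the model.
    """
    predicted = p_hat @ values       # expected value under the model, shape (N,)
    observed = values[next_states]   # value of the state actually reached
    return float(np.mean((observed - predicted) ** 2))

# Hypothetical toy problem: 3 states, where states 1 and 2 have equal value.
values = np.array([0.0, 1.0, 1.0])
next_states = np.array([1, 1])       # both transitions actually land in state 1

# model_a puts most mass on state 2 (wrong state, same value as state 1).
model_a = np.array([[0.1, 0.1, 0.8],
                    [0.1, 0.1, 0.8]])
# model_b puts most mass on state 1 (the correct state).
model_b = np.array([[0.1, 0.8, 0.1],
                    [0.1, 0.8, 0.1]])

for name, model in [("model_a", model_a), ("model_b", model_b)]:
    print(name,
          "MLE:", round(mle_loss(model, next_states), 4),
          "value-aware:", round(value_aware_loss(model, next_states, values), 4))
```

In this sketch the two models receive the same value-aware loss, because confusing two states of equal value does not change the predicted value, while the maximum likelihood loss penalizes model_a heavily. This is one way to read the idea that a value-aware objective spends model capacity only on distinctions that matter for the task.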
Showing items related by title, author, creator and subject.
Inghilleri, Niccolo (Georgia Institute of Technology, 2021-05-05): This study aims to assess the impact on skill development of a hands-on experimentation and learning device within the undergraduate aerospace control analysis curriculum at Georgia Institute of Technology. The Transportable ...
Mehta, Nishant A. (Georgia Institute of Technology, 2013-05-15): Given the "right" representation, learning is easy. This thesis studies representation learning and meta-learning, with a special focus on sparse representations. Meta-learning is fundamental to machine learning, and it ...
Berlind, Christopher (Georgia Institute of Technology, 2015-07-22): Traditional supervised machine learning algorithms are expected to have access to a large corpus of labeled examples, but the massive amount of data available in the modern world has made unlabeled data much easier to ...