Learning to Compose Skills
MetadataShow full item record
We present a differentiable framework capable of learning a wide variety of compositions of simple policies that we call skills. By recursively composing skills with themselves, we can create hierarchies that display complex behavior. Skill networks are trained to generate skill-state embeddings that are provided as inputs to a trainable composition function, which in turn outputs a policy for the overall task. Our experiments on an environment consisting of multiple collect and evade tasks show that this architecture is able to quickly build complex skills from simpler ones. Furthermore, the learned composition function displays some transfer to unseen combinations of skills, allowing for zero-shot generalizations. We also design experiments for learning both skills as well as optimal compositions of those skills from scratch. This allows for the agent to more intelligently search the space of possible skills rather than having a human hand-design the composition functions, which may be limited in scope.