Co-evolution of shaping rewards and meta-parameters in reinforcement learning

Please use this identifier to cite or link to this item: http://hdl.handle.net/1853/38251

Title: Co-evolution of shaping rewards and meta-parameters in reinforcement learning
Author: Elfwing, Stefan ; Uchibe, Eiji ; Doya, Kenji ; Christensen, Henrik I.
Abstract: In this article, we explore an evolutionary approach to the optimization of potential-based shaping rewards and meta-parameters in reinforcement learning. Reward shaping is a frequently used approach to improving the learning performance of reinforcement learning, with regard to both initial performance and convergence speed. Shaping rewards provide additional knowledge to the agent in the form of richer reward signals, which guide learning toward high-reward states. Reinforcement learning depends critically on a few meta-parameters that modulate the learning updates or the exploration of the environment, such as the learning rate α, the discount factor of future rewards γ, and the temperature τ that controls the trade-off between exploration and exploitation in softmax action selection. We validate the proposed approach in simulation using the mountain-car task. We also transfer shaping rewards and meta-parameters, evolutionarily obtained in simulation, to hardware, using a robotic foraging task.
Description: Digital Object Identifier: 10.1177/1059712308092835
Type: Article
URI: http://hdl.handle.net/1853/38251
ISSN: 1059-7123
Citation: Elfwing, S., Uchibe, E., Doya, K., and Christensen, H. I. Co-evolution of shaping rewards and meta-parameters in reinforcement learning. Adaptive Behavior 16, 8 (Dec 2008), 400-412.
Date: 2008-12
Contributor: Georgia Institute of Technology. College of Computing ; Okinawa Institute of Science and Technology. Neural Computation Unit ; Georgia Institute of Technology. Center for Robotics and Intelligent Machines ; Kungl. Tekniska Högskolan. Centrum för Autonoma System
Publisher: Georgia Institute of Technology ; Sage ; International Society for Adaptive Behavior
Subject: Shaping rewards ; Reinforcement learning
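
Illustration: The quantities named in the abstract can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation. It assumes the standard potential-based form F(s, s') = γΦ(s') − Φ(s) for the shaping term, Boltzmann (softmax) action selection with temperature τ, and a Q-learning-style update chosen here only for illustration; the record does not state which learning algorithm the article uses. All function names are hypothetical.

```python
import math


def shaping_reward(phi_s, phi_next, gamma):
    """Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s)."""
    return gamma * phi_next - phi_s


def softmax_probs(q_values, tau):
    """Softmax (Boltzmann) action probabilities with temperature tau.

    High tau gives near-uniform exploration; low tau approaches greedy choice.
    The maximum is subtracted before exponentiating for numerical stability.
    """
    m = max(q_values)
    exps = [math.exp((q - m) / tau) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def q_update(q, reward, shaping, q_next_max, alpha, gamma):
    """One illustrative value update on the shaped reward (assumed Q-learning form).

    alpha is the learning rate and gamma the discount factor of future rewards.
    """
    return q + alpha * (reward + shaping + gamma * q_next_max - q)
```

In this sketch α, γ, and τ are plain arguments, so an outer evolutionary loop of the kind the abstract describes could treat them, together with the potential values Φ, as the genome to be optimized.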

All materials in SMARTech are protected under U.S. Copyright Law and all rights are reserved, unless otherwise specifically indicated on or in the materials.

Files in this item

File: parham_fuchs_OR11.pdf
Size: 358.9Kb
Format: PDF
