Efficient Continuous-Time Reinforcement Learning with Adaptive State Graphs
G. Neumann, M. Pfeiffer, and W. Maass
We present a new reinforcement learning approach for deterministic continuous
control problems in environments with unknown, arbitrary reward functions.
The difficulty of finding solution trajectories for such problems can be
reduced by incorporating limited prior knowledge of the approximate local
system dynamics. The presented algorithm builds an adaptive state graph of
sample points within the continuous state space. The nodes of the graph are
generated by an efficient principled exploration scheme that directs the
agent towards promising regions, while maintaining good online performance.
Global solution trajectories are formed as combinations of local controllers
that connect nodes of the graph, thereby naturally allowing continuous
actions and continuous time steps. We demonstrate our approach on various
movement planning tasks in continuous domains.
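The core idea of the paper can be illustrated with a minimal sketch: nodes are sampled states, an edge exists wherever a local controller can connect two nodes, edge costs reflect transition times derived from an approximate local dynamics model, and a global trajectory is a least-cost path through the graph. All function names, the straight-line dynamics model, and the `max_range` connectivity threshold below are illustrative assumptions, not the paper's actual algorithm.

```python
import heapq
import math

def transition_time(u, v, speed=1.0):
    # Assumed local dynamics model: straight-line motion at constant speed,
    # so the local controller's cost is simply travel time.
    return math.dist(u, v) / speed

def build_graph(nodes, max_range=1.5):
    # Connect each pair of sampled states that a local controller is
    # assumed able to cover (hypothetical reachability threshold).
    graph = {u: [] for u in nodes}
    for u in nodes:
        for v in nodes:
            if u != v and math.dist(u, v) <= max_range:
                graph[u].append((v, transition_time(u, v)))
    return graph

def shortest_trajectory(graph, start, goal):
    # Dijkstra search: a global trajectory is a chain of local controllers
    # minimizing total transition time.
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, t in graph[node]:
            if nxt not in visited:
                heapq.heappush(queue, (cost + t, nxt, path + [nxt]))
    return math.inf, []

# Toy 2-D example with five sampled states.
nodes = [(0, 0), (1, 0), (2, 0), (1, 1), (2, 1)]
g = build_graph(nodes)
cost, path = shortest_trajectory(g, (0, 0), (2, 1))
```

The sketch omits the paper's central contributions, namely the principled exploration scheme that decides where to place new nodes and the online refinement of the graph, and shows only how local controllers compose into a global solution trajectory.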
Reference: G. Neumann, M. Pfeiffer, and W. Maass.
Efficient continuous-time reinforcement learning with adaptive state graphs.
In Proceedings of the 18th European Conference on Machine Learning (ECML)
and the 11th European Conference on Principles and Practice of Knowledge
Discovery in Databases (PKDD), Warsaw, Poland, 2007. Springer, Berlin.