D-CIS Publication Database


Type of publication: Inproceedings
Entered by: LB
Title Policy Search with Cross-Entropy Optimization of Basis Functions
Bibtex cite ID Busoniu-adprl09
Booktitle Proceedings of the 2009 International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009)
Year published 2009
Month April
Pages 153-160
Location 30 March - 2 April 2009, Nashville, USA
Keywords Markov decision process,direct policy search,adaptive basis functions,cross-entropy optimization
Abstract This paper introduces a novel algorithm for approximate policy search in continuous-state discrete-action Markov decision processes (MDPs). Previous policy search approaches have typically used ad-hoc parameterizations developed for specific MDPs. In contrast, the novel algorithm employs a flexible policy parameterization, suitable for solving general discrete-action MDPs. The algorithm looks for the best closed-loop policy that can be represented using a given number of basis functions, where a discrete action is assigned to each basis function. The locations and shapes of the basis functions are optimized, together with the action assignments. This allows a large class of policies to be represented. The optimization is carried out with the cross-entropy method and evaluates the policies by their empirical return from a representative set of initial states. We report simulation experiments in which the algorithm reliably obtains good policies with only a small number of basis functions, albeit at sizable computational costs.
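The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the one-dimensional toy MDP (dynamics x' = clip(x + 0.1 u), reward -x'^2), the Gaussian form of the basis functions, the rule that the most active basis function dictates the action, the Gaussian and categorical sampling distributions, and all sample sizes and smoothing constants. The function names `policy`, `empirical_return`, and `ce_policy_search` are hypothetical.

```python
import numpy as np

ACTIONS = np.array([-1.0, 1.0])        # discrete action set (toy assumption)
X0 = np.array([-0.8, -0.4, 0.4, 0.8])  # representative initial states (assumed)

def policy(x, centers, widths, act_idx):
    """Pick the action assigned to the most active Gaussian basis function."""
    activation = np.exp(-((x - centers) / widths) ** 2)
    return ACTIONS[act_idx[np.argmax(activation)]]

def empirical_return(centers, widths, act_idx, horizon=20, gamma=0.95):
    """Average discounted return over X0 on the toy dynamics."""
    total = 0.0
    for x in X0:
        for k in range(horizon):
            u = policy(x, centers, widths, act_idx)
            x = np.clip(x + 0.1 * u, -1.0, 1.0)
            total += gamma ** k * (-x ** 2)
    return total / len(X0)

def ce_policy_search(n_bf=2, n_samples=50, n_elite=5, n_iter=15, seed=0):
    """Cross-entropy search over basis-function centers, widths, and actions."""
    rng = np.random.default_rng(seed)
    n_act = len(ACTIONS)
    mu_c, sd_c = np.zeros(n_bf), np.full(n_bf, 0.5)      # Gaussian over centers
    mu_b, sd_b = np.full(n_bf, 0.5), np.full(n_bf, 0.2)  # Gaussian over widths
    p_act = np.full((n_bf, n_act), 1.0 / n_act)          # categorical over actions
    best_score, best_params = -np.inf, None
    for _ in range(n_iter):
        # Sample candidate policies from the current distributions.
        C = rng.normal(mu_c, sd_c, size=(n_samples, n_bf))
        B = np.abs(rng.normal(mu_b, sd_b, size=(n_samples, n_bf))) + 1e-3
        A = np.stack([rng.choice(n_act, size=n_samples, p=p_act[i])
                      for i in range(n_bf)], axis=1)
        scores = np.array([empirical_return(C[s], B[s], A[s])
                           for s in range(n_samples)])
        # Keep the best-scoring samples and remember the best policy seen so far.
        elite = np.argsort(scores)[-n_elite:]
        top = elite[-1]
        if scores[top] > best_score:
            best_score = scores[top]
            best_params = (C[top].copy(), B[top].copy(), A[top].copy())
        # Refit the sampling distributions to the elite samples.
        mu_c, sd_c = C[elite].mean(axis=0), C[elite].std(axis=0) + 1e-3
        mu_b, sd_b = B[elite].mean(axis=0), B[elite].std(axis=0) + 1e-3
        for i in range(n_bf):
            counts = np.bincount(A[elite, i], minlength=n_act)
            # Mix with a uniform distribution so no action probability hits zero.
            p_act[i] = 0.9 * counts / n_elite + 0.1 / n_act
    return best_score, best_params
```

Calling `ce_policy_search()` returns the best empirical return found and the corresponding centers, widths, and action indices. Each candidate policy is simulated from every initial state, so the cost grows with the number of samples, iterations, initial states, and the horizon, which is consistent with the computational cost the abstract mentions.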
Authors Busoniu, Lucian; Ernst, Damien; De Schutter, Bart; Babuška, Robert