Abstract: This paper proposes a reinforcement learning architecture containing multiple "experts", each of which specializes in a different region of the overall state space. The central idea is that the experts use qualitatively different (but sufficiently Markov) state representations, each of which captures different information about the true underlying world state and is therefore suited to a different part of the state space. The experts themselves learn when to switch to another state representation (that is, to another expert) by means of switching actions. Value functions can be learned using standard reinforcement learning algorithms. A small proof-of-principle experiment and a larger, more realistic experiment illustrate the validity of this approach.
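
To make the switching idea concrete, the following is a minimal sketch and not the implementation used in this paper. It runs tabular Q-learning over the pair (active expert, that expert's observation) and adds one switching action per expert to the ordinary actions. The toy corridor, the `CORRECT` move pattern, the two representations in `observe`, and all hyperparameters are hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict

N = 8                                 # toy corridor: cells 0..7 (assumption)
CORRECT = [1, 0, 1, 1, 0, 1, 0, 1]    # hypothetical move that advances from each cell
N_MOVES, N_EXPERTS = 2, 2
N_ACTIONS = N_MOVES + N_EXPERTS       # actions 2 and 3 switch to expert 0 and 1

def observe(expert, s):
    """Expert 0 resolves cells 0..3, expert 1 resolves cells 4..7; elsewhere -1."""
    informative = (s < N // 2) if expert == 0 else (s >= N // 2)
    return s if informative else -1

def step(s, expert, a):
    """Return (next_state, next_expert, reward, done)."""
    if a >= N_MOVES:                  # switching action: world state is unchanged
        return s, a - N_MOVES, 0.0, False
    if a == CORRECT[s]:               # the correct move advances one cell
        s += 1                        # a wrong move leaves the agent in place
    done = s == N                     # episode ends after leaving the last cell
    return s, expert, (1.0 if done else 0.0), done

def greedy(values, rng=None):
    """Index of the largest value; ties broken randomly if rng is given."""
    best = max(values)
    ties = [a for a, v in enumerate(values) if v == best]
    return rng.choice(ties) if rng else ties[0]

def train(episodes=2000, alpha=0.2, gamma=0.9, eps=0.2, max_steps=200, seed=0):
    """Standard epsilon-greedy Q-learning on (expert, observation) keys."""
    rng = random.Random(seed)
    Q = defaultdict(lambda: [0.0] * N_ACTIONS)
    for _ in range(episodes):
        s, expert = 0, 0
        for _ in range(max_steps):
            key = (expert, observe(expert, s))
            if rng.random() < eps:
                a = rng.randrange(N_ACTIONS)
            else:
                a = greedy(Q[key], rng)
            s2, expert2, r, done = step(s, expert, a)
            target = r if done else r + gamma * max(Q[(expert2, observe(expert2, s2))])
            Q[key][a] += alpha * (target - Q[key][a])
            s, expert = s2, expert2
            if done:
                break
    return Q

def greedy_rollout(Q, max_steps=50):
    """Follow the greedy policy; return the number of steps to the goal, or None."""
    s, expert = 0, 0
    for t in range(1, max_steps + 1):
        a = greedy(Q[(expert, observe(expert, s))])
        s, expert, _, done = step(s, expert, a)
        if done:
            return t
    return None
```

Calling `greedy_rollout(train())` follows the learned policy from cell 0 with expert 0 active. In this toy task, expert 0 cannot tell cells 4 to 7 apart, and those cells require different moves, so a deterministic greedy policy can only finish by taking at least one switching action. Whether learning succeeds with these untuned settings is not guaranteed.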