Abstract: In realistic multiagent systems, learning on the basis of complete state information is not feasible. We introduce adaptive state focus Q-learning, a class of methods derived from Q-learning that start learning with only the state information that is strictly necessary for a single agent to perform the task, and that monitor the convergence of learning. If lack of convergence is detected, the learner dynamically expands its state space to incorporate more state information (e.g., the states of other agents). Learning is faster and requires fewer resources than when the complete state is considered from the start, while still being able to handle situations where agents interfere in pursuing their goals. We illustrate our approach by instantiating a simple version of such a method, and by showing that it outperforms learning with full state information without being hindered by the deficiencies of learning on the basis of a single agent's state.
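The abstract leaves the convergence monitor and the expansion rule unspecified; the sketch below is only one way such a learner could be organized, using a moving average of TD-error magnitudes as an assumed convergence test. All class, method, and parameter names are illustrative and not taken from the paper.

    import random
    from collections import defaultdict

    class AdaptiveStateFocusQLearner:
        # Illustrative sketch only: the paper does not publish this code, and the
        # convergence test (a window of TD-error magnitudes) is an assumption.
        def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1,
                     window=500, tolerance=1e-3):
            self.actions = actions
            self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
            self.window, self.tolerance = window, tolerance
            self.q = defaultdict(float)      # Q-values keyed by (state_key, action)
            self.joint = False               # start by focusing on the agent's own state
            self.deltas = []                 # recent TD-error magnitudes

        def _key(self, own, others):
            # Ignore the other agents' states until expansion is triggered.
            return (own, tuple(others)) if self.joint else own

        def act(self, own, others):
            s = self._key(own, others)
            if random.random() < self.epsilon:
                return random.choice(self.actions)
            return max(self.actions, key=lambda a: self.q[(s, a)])

        def update(self, own, others, action, reward, next_own, next_others):
            s, s2 = self._key(own, others), self._key(next_own, next_others)
            target = reward + self.gamma * max(self.q[(s2, a)] for a in self.actions)
            delta = target - self.q[(s, action)]
            self.q[(s, action)] += self.alpha * delta
            # Convergence monitor: persistently large TD errors are taken as a sign
            # that other agents interfere, so the state space is expanded.
            self.deltas.append(abs(delta))
            if not self.joint and len(self.deltas) >= self.window:
                if sum(self.deltas[-self.window:]) / self.window > self.tolerance:
                    self.joint = True
                    self.q.clear()           # relearn on the expanded state space

Once expansion is triggered, the learner simply keys its Q-table on the joint state instead of the local state; any warm-starting of the expanded table from the local-state values is a design choice the abstract does not settle.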
Abstract: We consider multiagent systems with stochastic non-linear dynamics in continuous space-time. We focus on systems of agents that aim to visit a number of given target locations at given points in time at minimal control cost. The online optimization of which agent has to visit which target requires the solution of the Hamilton-Jacobi-Bellman (HJB) equation, which is a non-linear partial differential equation (PDE). Under some conditions, the log-transform can be applied to turn the HJB equation into a linear PDE. We then show that the optimal solution of the multiagent scheduling problem can be expressed in closed form as a sum of single-schedule solutions.
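For context, the log-transform referred to here is the standard one from linearly solvable stochastic optimal control (as in path-integral control); the symbols below (drift $b$, noise covariance $\nu$, control cost matrix $R$, scale $\lambda$) are conventional assumptions and are not defined in the abstract. The HJB equation
\[
-\partial_t J = \min_u \Big[ V(x,t) + \tfrac12 u^\top R\, u + \big(b(x,t)+u\big)^\top \nabla_x J + \tfrac12 \operatorname{Tr}\big(\nu\, \nabla_x^2 J\big) \Big],
\qquad u^* = -R^{-1}\nabla_x J,
\]
becomes, under the condition $\nu = \lambda R^{-1}$ and the substitution $J = -\lambda \log \psi$, the linear PDE
\[
\partial_t \psi = \Big( \frac{V}{\lambda} - b^\top \nabla_x - \tfrac12 \operatorname{Tr}\big(\nu\, \nabla_x^2\big) \Big)\psi .
\]
Because this equation is linear, its solutions superpose, which is what allows the optimal multiagent scheduling solution to be composed from single-schedule solutions.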
Abstract: Research on the organization of Multiagent Systems (M.A.S.) has shown that by adapting its organization, a M.A.S. is better able to operate in dynamic environments. In this paper we describe an experiment with a M.A.S. in which the capability to reorganize is integrated into the agents' coordination mechanism. In the RoboCupRescue simulator we have implemented a M.A.S. in which work can be coordinated according to three different coordination styles: direct supervision, and standardization of skills with and without a reorganization extension. An experiment shows the effects of unknown workload distribution and incomplete information on the performance of the three styles. Results show significant interaction effects between workload distribution and coordination mechanism, and between completeness of information and coordination mechanism. Furthermore, results show that standardization of skills with reorganization performs better and is more robust to heterogeneous workload distribution and incomplete information.
Abstract: This paper introduces the MultiAgent Decision Process software toolbox, an open source C++ library for decision-theoretic planning under uncertainty in multiagent systems. It provides support for several multiagent models, such as POSGs, Dec-POMDPs and MMDPs. The toolbox aims to reduce development time for planning algorithms and to provide a benchmarking platform by supplying a number of commonly used problem descriptions. It features a parser for a text-based file format for discrete Dec-POMDPs, shared functionality for planning algorithms, as well as implementations of several Dec-POMDP planners. We describe the design goals and architecture of the toolbox, and provide an overview of its functionality, illustrated by some usage examples. Finally, we report on current and future work.