Abstract: Multi-agent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. The complexity of many tasks arising in these domains makes them difficult to solve with preprogrammed agent behaviors. The agents must instead discover a solution on their own, using learning. A significant part of the research on multi-agent learning concerns reinforcement learning techniques. This paper provides a comprehensive survey of multi-agent reinforcement learning (MARL). A central issue in the field is the formal statement of the multi-agent learning goal. Different viewpoints on this issue have led to the proposal of many different goals, among which two focal points can be distinguished: stability of the agents' learning dynamics, and adaptation to the changing behavior of the other agents. The MARL algorithms described in the literature aim - either explicitly or implicitly - at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together with the specific issues that arise in each category. Additionally, the benefits and challenges of MARL are described along with some of the problem domains where MARL techniques have been applied. Finally, an outlook for the field is provided.
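The tension between stability and adaptation described above is easiest to see in the simplest MARL setting the survey covers: independent learners in a repeated game. The sketch below is an illustrative assumption, not taken from the surveyed work (payoffs, parameter values and names are invented here): two epsilon-greedy Q-learners share a reward in a 2x2 coordination game, each adapting only to its own reward signal.

```python
# A minimal sketch (assumed example, not from the surveyed paper): two
# independent Q-learners in a repeated 2x2 coordination game with a shared
# reward. Each agent updates only its own action values.
import random

PAYOFF = [[10, 0],   # both agents receive PAYOFF[a0][a1]
          [0, 10]]   # coordinating on the same action pays off

def run(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[agent][action]
    for _ in range(episodes):
        acts = []
        for i in range(2):
            if rng.random() < eps:            # epsilon-greedy exploration
                acts.append(rng.randrange(2))
            else:                             # greedy w.r.t. own Q-values
                acts.append(0 if q[i][0] >= q[i][1] else 1)
        r = PAYOFF[acts[0]][acts[1]]          # shared reward
        for i in range(2):                    # independent Q-learning updates
            q[i][acts[i]] += alpha * (r - q[i][acts[i]])
    return q

q = run()
```

Because the only rewarded outcomes are coordinated ones, both learners typically settle on the same action, illustrating how adaptation to the other agent's behavior emerges even without explicit coordination.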
Abstract: Agents in an organization need to coordinate their actions in order to reach the organizational goals. This research describes the relation between types of coordination and the autonomy of actors. In an experimental setting we show that there is not one best way to coordinate in all situations. The dynamics and complexity of, for example, crisis situations require a crisis management organization to work with dynamic types of coordination. In order to reach dynamic coordination we provide the actors with adjustable autonomy. Actors should be able to make decisions at different levels of autonomy and reason about the required level. We propose a way to implement this in a multi-agent system. The agent is provided with reasoning rules with which it can control the external influences on its decision-making.
Abstract: This report presents an overview of the state-of-the-art methods and models for planning for teams of embodied agents. Due to the nature of the real world, this means we focus on multi-agent planning in stochastic, partially observable systems. In particular we focus on decentralized partially observable Markov decision processes (Dec-POMDPs), partially observable stochastic games (POSGs) and related models. Regarding such models, we review complexity results and recently proposed methods for finding (approximate) solutions.
Abstract: This paper describes the design, implementation, visualizations, results and lessons learned of a novel real-world socio-technical research system for the purpose of rescheduling train drivers in the event of disruptions. The research system is structured according to the Actor-Agent paradigm: here agents assist in rescheduling tasks of train drivers. Coordination between agents is based on a team formation process in which possible rescheduling alternatives can be evaluated, based on constraints and preferences of involved human train drivers and dispatchers. The research system is the result of cooperation on decentralised multi-agent crew rescheduling between Netherlands Railways (NS) and the D-CIS Lab. The implementation is realized using the Cougaar framework and includes actual timetable and rolling stock schedule data and driver duty data.
Abstract: One of the first requirements for building multi-agent systems with complex and dynamic structures is to have agents that are able to operate in such organizations. Being able to adopt different organizational roles is one of the key requirements for an agent in order to have this ability. Another requirement for the agent we intend to build is the ability to operate in a dynamic environment. This means the agent has to be able to construct a plan for the task it is about to perform and while performing the task, the agent has to be able to evaluate whether the plan is still valid. When changes in the environment have caused the plan to become invalid, the agent needs to be able to generate a new and valid plan for the task. The agent architecture that is described in this document is a step towards an agent that meets these requirements of operating in a dynamic organization and dynamic environment.
Abstract: This deliverable is a state of the art on automated negotiation techniques for multi-agent systems. Negotiation has been a subject of research for many years in a variety of domains, such as social choice, economics, rhetoric, game theory, multi-criteria decision making, and knowledge management, but automated negotiation in a multi-agent environment is less than ten years old. We therefore present the main techniques used in multi-agent systems to enable automated negotiation, such as voting, bargaining, auctions and contracting. Negotiation is used to solve logistics and crisis management problems, among others, more efficiently when at least two parties are involved in the decision process. Most of the current negotiation approaches only deal with mono-dimensional negotiation problems, i.e. problems that concern a single negotiation dimension, such as allocating a single task to a co-worker. Recently, however, some studies have been conducted on the topic of combined negotiations, in which the negotiation process concerns multiple interrelated objectives, a more realistic model of what occurs in real-world applications. We therefore propose to study the problem of multi-criteria and combined negotiation in multi-agent systems by first establishing a state of the art on negotiation techniques for multi-agent systems.
Abstract: In this paper, two different methods for information fusion are compared with respect to communication cost: the lambda-pi and the junction-tree approaches to probability computation in Bayesian networks. The analysis is done within the scope of large distributed networks of computing nodes. The result of this comparison enables us to make a statement about the most appropriate method for reasoning in distributed Bayesian networks. Each node in the network is considered an intelligent agent in a multi-agent system.
Abstract: Multi-agent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, etc. Learning approaches to multi-agent control, many of them based on reinforcement learning (RL), are investigated in complex domains such as teams of mobile robots. However, the application of decentralized RL to low-level control tasks is not as intensively studied. In this paper, we investigate centralized and decentralized RL, emphasizing the challenges and potential advantages of the latter. These are then illustrated on an example: learning to control a two-link rigid manipulator. Some open issues and future research directions in decentralized RL are outlined.
Abstract: Simulations of crisis scenarios have the potential to increase insight into the organizational structures needed as crises escalate. Real-life simulations involving personnel and figurants are expensive and time-consuming. Multi-agent system models allow for cost-effective simulations of changing organizational structures, enabling analysis of the implications for enactment during crisis escalation with respect to roles and communication structures. This paper presents both an organization-based model for crisis management that supports simulation of the dynamics of crisis management and a proof-of-concept implementation.
Abstract: This paper describes the sample implementation of a distributed goal-oriented reasoning engine for multi-agent systems. The paper summarizes some of the design and programming issues that we addressed in providing the initial prototype with customization and self-configuration facilities.
Abstract: French coastguard missions have become increasingly varied, implying new challenges such as the reduction of the decision cycle and the expansion of available information. This creates new needs for enhanced decision support. An efficient situation awareness system has to quickly detect and identify suspicious boats. The efficiency of such a system relies on reliable sensor fusion, since a coastguard uses sensors to achieve his mission. We present an innovative approach based on multi-agent negotiation to fuse classifiers, benefiting from the efficiency of existing classification tools and from the flexibility and reliability of a multi-agent system to exploit distributed data across dispersed sources. We developed a first prototype using a basic negotiation protocol in order to validate the feasibility and the interest of our approach. The results obtained are promising and encourage us to continue in this direction.
Abstract: Agents that are acting in a multi-agent system can be influenced by other agents via interactions. At the same time agents are expected to be autonomous; they have control over their internal state and their behavior. Our claim is that autonomous agents should be aware of the external influences they are sensitive to. This paper describes different types of inter-agent influences and investigates their role in the reasoning process of an agent. We propose a way to represent the experienced influences in the mental model of an agent. By doing so we provide the awareness of external influences and guarantee the autonomy requirement of the agent.
Abstract: Ant Colony Optimization (ACO) has proven to be a very powerful optimization heuristic for combinatorial optimization problems. While being very successful for various NP-complete optimization problems, ACO is not trivially applicable to control problems. In this paper a novel ACO algorithm is introduced for the automated design of optimal control policies for continuous-state dynamic systems. The so-called Fuzzy ACO algorithm integrates the multi-agent optimization heuristic of ACO with a fuzzy partitioning of the state space of the system. A simulated control problem is presented to demonstrate the functioning of the proposed algorithm.
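As a rough illustration of the pheromone mechanism that ACO-based methods such as the one above build on, the sketch below applies plain ACO to a toy bit-string problem. The fuzzy state-space partitioning of the actual Fuzzy ACO algorithm is not reproduced here, and all names and parameter values are assumptions for illustration.

```python
# A minimal sketch of the core ACO pheromone mechanism (illustrative toy
# problem: maximize the number of 1-bits). Not the Fuzzy ACO algorithm itself.
import random

def aco_onemax(n_bits=10, n_ants=20, iters=50, rho=0.1, seed=1):
    rng = random.Random(seed)
    tau = [[1.0, 1.0] for _ in range(n_bits)]     # pheromone per bit value
    best, best_fit = None, -1
    for _ in range(iters):
        for _ in range(n_ants):
            # each ant samples a solution proportionally to pheromone levels
            sol = [0 if rng.random() < t[0] / (t[0] + t[1]) else 1
                   for t in tau]
            fit = sum(sol)
            if fit > best_fit:
                best, best_fit = sol, fit
        # evaporate pheromone, then deposit along the best-so-far solution
        for i in range(n_bits):
            tau[i][0] *= (1 - rho)
            tau[i][1] *= (1 - rho)
            tau[i][best[i]] += rho * best_fit / n_bits
    return best, best_fit

best, fit = aco_onemax()
```

The evaporation/deposit cycle is the "multi-agent optimization heuristic" the abstract refers to; Fuzzy ACO replaces the discrete solution components with fuzzy partitions of a continuous state space.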
Abstract: In this article we consider the issue of optimal control in collaborative multi-agent systems with stochastic dynamics. The agents have a joint task in which they have to reach a number of target states. The dynamics of the agents contains additive control and additive noise, and the autonomous part factorizes over the agents. Full observation of the global state is assumed. The goal is to minimize the accumulated joint cost, which consists of integrated instantaneous costs and a joint end cost. The joint end cost expresses the joint task of the agents. The instantaneous costs are quadratic in the control and factorize over the agents. The optimal control is given as a weighted linear combination of single-agent to single-target controls. The single-agent to single-target controls are expressed in terms of diffusion processes. These controls, when not closed form expressions, are formulated in terms of path integrals, which are calculated approximately by Metropolis-Hastings sampling. The weights in the control are interpreted as marginals of a joint distribution over agent to target assignments. The structure of the latter is represented by a graphical model, and the marginals are obtained by graphical model inference. Exact inference of the graphical model will break down in large systems, and so approximate inference methods are needed. We use naive mean field approximation and belief propagation to approximate the optimal control in systems with linear dynamics. We compare the approximate inference methods with the exact solution, and we show that they can accurately compute the optimal control. Finally, we demonstrate the control method in multi-agent systems with nonlinear dynamics consisting of up to 80 agents that have to reach an equal number of target states.
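The central structural claim above, that the optimal joint control is a weighted linear combination of single-agent to single-target controls with assignment marginals as weights, can be written compactly. The notation below is an assumption for illustration, not taken verbatim from the paper:

```latex
% Optimal control for agent i as a combination of single-agent
% to single-target controls (notation assumed for illustration):
\[
  u_i^*(x,t) \;=\; \sum_{s} w_s(x,t)\, u_{i \to s}(x,t),
  \qquad
  w_s(x,t) \;=\; p(s \mid x, t),
\]
% where s ranges over agent-to-target assignments, and the weights
% w_s are marginals of the joint assignment distribution, computed
% (exactly or approximately) by graphical-model inference.
```

In this reading, the control problem reduces to inference: computing the marginals $w_s$ is exactly the graphical-model inference step that the abstract approximates with naive mean field and belief propagation.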
Abstract: Multi-agent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. Many tasks arising in these domains require that the agents learn behaviors online. A significant part of the research on multi-agent learning concerns reinforcement learning techniques. However, due to different viewpoints on central issues, such as the formal statement of the learning goal, a large number of different methods and approaches have been introduced. In this paper we aim to present an integrated survey of the field. First, the issue of the multi-agent learning goal is discussed, after which a representative selection of algorithms is reviewed. Finally, open issues are identified and future research directions are outlined.
Abstract: Results from disaster research suggest that methods for coordination between individual emergency responders and organizations should recognize the independence and autonomy of these actors. These actor features are key factors in effective adaptation and improvisation of the response to emergency situations, which are inherently uncertain. Autonomy and adaptability are also well-known aspects of a multi-agent system (MAS). In this paper we present two MAS strategies that can effectively handle aircraft deicing incidents. These strategies help prevent and reduce airplane delays at deicing stations due to, e.g., changing weather conditions or incidents at the station, where aircraft agents acting on behalf of aircraft pilots or companies by adopting pre-made plans would only create havoc. Here, each agent uses its own decision mechanism to deliberate about the uncertainty in the problem domain and the preferences (or priorities) of the agents. Taking both these issues into account, each proposed MAS strategy outperforms a naive first-come, first-served coordination strategy. The simulation results help pilots and companies take decisions with respect to the scheduling of aircraft for deicing when unexpected incidents occur: they provide insights into the impacts and means for robust selection of incident-specific strategies on, e.g., deicing station delays of (individual) aircraft.
Abstract: We study optimal control in large stochastic multi-agent systems in continuous space and time. We consider multi-agent systems where agents have independent dynamics with additive noise and control. The goal is to minimize the joint cost, which consists of a state dependent term and a term quadratic in the control. The system is described by a mathematical model, and an explicit solution is given. We focus on large systems where agents have to distribute themselves over a number of targets with minimal cost. In such a setting the optimal control problem is equivalent to a graphical model inference problem. Exact inference will be intractable, and we use the mean field approximation to compute accurate approximations of the optimal controls. We conclude that near to optimal control in large stochastic multi-agent systems is possible with this approach.
Abstract: Negotiation is used to solve logistics and crisis management problems, among others, more efficiently when at least two parties are involved in the decision process. Most of the current negotiation approaches only deal with mono-dimensional negotiation problems, i.e. problems that concern a single negotiation dimension, such as allocating a single task to a co-worker. Recently, however, some studies have been conducted on the topic of combined negotiations, in which the negotiation process concerns multiple interrelated objectives, a more realistic model of what occurs in real-world applications. We therefore propose to study the problem of multi-criteria and combined negotiation in multi-agent systems by designing a new protocol to solve some of the existing negotiation problems.
This deliverable follows a previous one, a report titled “Cooperation-based Multilateral Multi-issue Negotiation for Crisis Management” (Hemaissia et al., 2006). The negotiation protocol proposed here is suited for multiple agents with complex preferences, taking into account, at the same time, multiple interdependent issues and recommendations made by the agents to improve a proposal.
Abstract: The Dec-POMDP is a model for multi-agent planning under uncertainty that has received increasing attention in recent years. In this work we propose a new heuristic, QBG, that can be used in various algorithms for Dec-POMDPs, and describe its differences from and similarities with QMDP and QPOMDP. An experimental evaluation shows that, at the price of some computation, QBG gives a consistently tighter upper bound to the maximum value obtainable.
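For context, the QMDP baseline mentioned above scores a belief state by averaging the Q-values of the fully observable underlying MDP, which yields an upper bound because it ignores future partial observability. The sketch below is a minimal illustration of that idea; the tiny two-state MDP is purely an assumed example, not from the paper.

```python
# A minimal sketch of the QMDP heuristic: solve the underlying MDP by value
# iteration, then bound a belief's value by averaging MDP Q-values under the
# belief. The 2-state, 2-action MDP below is an illustrative assumption.

def value_iteration(P, R, gamma=0.9, iters=200):
    """P[a][s][t]: transition probabilities, R[a][s]: rewards.
    Returns Q[s][a] for the fully observable MDP."""
    n_s, n_a = len(R[0]), len(R)
    V = [0.0] * n_s
    for _ in range(iters):
        Q = [[R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        V = [max(row) for row in Q]
    return Q

def qmdp(belief, Q):
    """QMDP upper bound on a belief's value: max_a sum_s b(s) Q(s, a)."""
    n_a = len(Q[0])
    return max(sum(b * Q[s][a] for s, b in enumerate(belief))
               for a in range(n_a))

# Two absorbing states; action 0 pays 1 in state 0, action 1 pays 1 in state 1.
P = [[[1.0, 0.0], [0.0, 1.0]]] * 2   # both actions keep the current state
R = [[1.0, 0.0], [0.0, 1.0]]
Q = value_iteration(P, R)
v = qmdp([0.5, 0.5], Q)              # bound for the uniform belief
```

QPOMDP and QBG refine this bound by accounting for (joint) observations, at the extra computational price the abstract mentions.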
Abstract: Passenger railway operations are based on an extensive planning process for generating the timetable, the rolling stock circulation, and the crew duties for train drivers and conductors. In particular, crew scheduling is a complex process.
After the planning process has been completed, the plans are carried out in the real-time operations. Preferably, the plans are carried out as scheduled. However, in case of delays of trains or large disruptions of the railway system, the timetable, the rolling stock circulation and the crew duties may not be feasible anymore and must be rescheduled.
This paper presents a method based on multi-agent techniques to solve the train driver rescheduling problem in case of a large disruption. It assumes that the timetable and the rolling stock have been rescheduled already based on an incident scenario. In the crew rescheduling model, each train driver is represented by a driver-agent. A driver-agent whose duty has become infeasible by the disruption starts a recursive task exchange process with the other driver-agents in order to solve this infeasibility. The task exchange process is supported by a route-analyzer-agent, which determines whether a proposed task exchange is feasible, conditionally feasible, or not feasible. The task exchange process is guided by several cost parameters, and the aim is to find a feasible set of duties at minimal total cost.
The train driver rescheduling method was tested on several realistic disruption instances of Netherlands Railways (NS), the main operator of passenger trains in the Netherlands. In general the rescheduling method finds an appropriate set of rescheduled duties in a short amount of time. This research was carried out in close cooperation by NS and the D-CIS Lab.
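The recursive task-exchange process described above can be sketched roughly as follows. The feasibility rule, driver names and task labels are illustrative assumptions; the real process is guided by cost parameters and a route-analyzer-agent, which this sketch omits.

```python
# An illustrative sketch (names and data assumed, not from the paper) of the
# recursive task-exchange idea: an uncovered task is offered to other drivers,
# who may in turn push out one of their own tasks, recursively, with rollback
# when a branch of the exchange fails.

def can_take(duty, task):
    """Stand-in feasibility check: a duty may hold at most 3 tasks."""
    return len(duty) < 3

def exchange(task, duties, visited=None):
    """Try to place `task` on some duty, recursively evicting if needed.
    Returns True if a feasible placement was found."""
    visited = visited or set()
    # First, try drivers that can take the task directly.
    for driver, duty in duties.items():
        if driver not in visited and can_take(duty, task):
            duty.append(task)
            return True
    # Otherwise, ask a driver to give up one task and re-place it recursively.
    for driver, duty in duties.items():
        if driver in visited or not duty:
            continue
        evicted = duty.pop()                 # candidate task to push out
        duty.append(task)
        if exchange(evicted, duties, visited | {driver}):
            return True
        duty.pop()                           # roll back this attempt
        duty.append(evicted)
    return False

duties = {"driver_a": ["t1", "t2", "t3"], "driver_b": ["t4", "t5"]}
ok = exchange("t6", duties)                  # "t6" is the uncovered task
```

The `visited` set prevents cycles in the exchange chain; in the paper's method, the choice among feasible exchanges is additionally steered by cost parameters so that the final set of duties has minimal total cost.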
Abstract: Multi-agent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, etc. Although the individual agents can be programmed in advance, many tasks require that they learn behaviors online. A significant part of the research on multi-agent learning concerns reinforcement learning techniques. This paper gives a survey of multi-agent reinforcement learning, starting with a review of the different viewpoints on the learning goal, which is a central issue in the field. Two generic goals are distinguished: stability of the learning dynamics, and adaptation to the other agents' dynamic behavior. The focus on one of these goals, or a combination of both, leads to a categorization of the methods and approaches in the field. The challenges and benefits of multi-agent reinforcement learning are outlined along with open issues and future research directions.
Abstract: Thanks to advances in both computer science and engineering, the divide between robotics and multi-agent systems is shrinking. Robots are capable of performing an ever wider range of tasks, and there is an increasing need for solutions to high-level problems such as multi-agent coordination. In this paper we examine the problem of finding a robust exploration strategy for a team of mobile robots that takes into account communication limitations. We propose four performance metrics to evaluate and compare existing multi-robot exploration algorithms, and present a role-based approach in which robots either act as explorers or as relays. The result is a complete exploration of the environment in which information is efficiently returned to a central command centre, which is particularly applicable to the domain of rescue robotics.
Abstract: Multi-agent systems are increasingly being deployed in settings where they share a common real-world environment. It therefore becomes necessary to include sensory information in the reasoning and coordination of agents. The Active Sensor Web (ASW) aims to deliver coherent models and components to this end. The architecture and some exemplary applications of the ASW are described in this paper.
Abstract: Recently, a theory for stochastic optimal control in non-linear dynamical systems in continuous space-time has been developed (Kappen, 2005). We apply this theory to collaborative multi-agent systems. The agents evolve according to a given non-linear dynamics with additive Wiener noise. Each agent can control its own dynamics. The goal is to minimize the accumulated joint cost, which consists of a state dependent term and a term that is quadratic in the control. We focus on systems of non-interacting agents that have to distribute themselves optimally over a number of targets, given a set of end-costs for the different possible agent-target combinations. We show that the optimal control is the combinatorial sum of independent single-agent single-target optimal controls weighted by a factor proportional to the end-costs of the different combinations. Thus, multi-agent control is related to a standard graphical model inference problem. The additional computational cost compared to single-agent control is exponential in the tree-width of the graph specifying the combinatorial sum times the number of targets. We illustrate the result by simulations of systems with up to 42 agents.
Abstract: Research on multi-agent systems frequently involves experiments with agents, including situations where humans engage in interactions with agents. Consequently, the field of experimental (human) sciences becomes more and more relevant. This paper clarifies how things can and often do go wrong in distributed AI experiments. We show the flaws in methodological design in existing literature (both with and without humans) and work out an example involving human test-subjects to introduce the fundamental issues of experimental design. Furthermore, we provide researchers with an approach to improve their experimental design. We wish to stimulate researchers to conduct better experiments – which will benefit us all.
Abstract: Research in the area of Multi-Agent System (MAS) organization has shown that the ability of a MAS to adapt its organizational structure can be beneficial when coping with dynamics and uncertainty in the MAS's environment. Different types of reorganization exist, such as changing relations and interaction patterns between agents, changing agent roles and changing the coordination style in the MAS. In this paper we propose a framework for agent Coordination and Reorganization (AgentCoRe) that incorporates each of these aspects of reorganization. We describe both the declarative and the procedural knowledge an agent uses to decompose and assign tasks, and to reorganize. The RoboCupRescue simulation environment is used to demonstrate how AgentCoRe is used to build a MAS that is capable of reorganizing itself by changing relations, interaction patterns and agent roles.
Abstract: Crew rescheduling in response to disruptions is a difficult problem, due to the additional (social) constraints imposed on a human workforce. In the real-world domain of train driver rescheduling in the Netherlands, an actor-agent based approach is taken to (a) support human dispatchers and (b) accommodate individual train drivers’ preferences. This paper outlines the task-exchange team-configuration process, including the role of the various rescheduling constraints. The rescheduling approach is designed for operation in a real-world environment: to this end, a number of heuristics are discussed that are currently being examined to optimize the solution finding process with respect to three dimensions: performance, quality and clarity. The heuristics have been implemented in a research system, supporting the full driver-agent population, working on real-world data. This effort is an ongoing study on novel multi-agent approaches to crew rescheduling, and is the result of cooperation between Netherlands Railways and the D-CIS Lab.