Abstract: Computer-Assisted Instruction (CAI) systems enable fully automated simulator-based training. Traditionally, a CAI system does not enable a true dialogue between the learner and the virtual instructor. Most frequently, the system acts like a human expert and authoritatively provides feedback and ways to improve task performance. In this conference paper, we describe an educational agent that enables a dialogue between the learner and the agent. The agent is called the companion agent. It acts like a virtual co-learner, for example by deliberating about new operational measures after a situation change. The agent operates on the same authority level as the learner and is therefore less threatening than a traditional virtual instructor. We believe companion agents are particularly useful in modern, constructive learning situations where learners can take control of their own learning process. Potential applications of companion agents lie within the civil domain (for example, a civil tunnel operator during tunnel surveillance training) and the military domain (for example, embedded training in tactical surveillance).
This paper was selected as one of the Continuing Education Unit (CEU) papers for the 2007 Interservice/Industry Training, Simulation and Education Conference (I/ITSEC). The I/ITSEC board states that only those papers that demonstrate exceptional innovation, research, experimentation, and documentation in an area of new technology are selected for CEU credit.
Abstract: Multi-agent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. The complexity of many tasks arising in these domains makes them difficult to solve with preprogrammed agent behaviors. The agents must instead discover a solution on their own, using learning. A significant part of the research on multi-agent learning concerns reinforcement learning techniques. This paper provides a comprehensive survey of multi-agent reinforcement learning (MARL). A central issue in the field is the formal statement of the multi-agent learning goal. Different viewpoints on this issue have led to the proposal of many different goals, among which two focal points can be distinguished: stability of the agents' learning dynamics, and adaptation to the changing behavior of the other agents. The MARL algorithms described in the literature aim - either explicitly or implicitly - at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together with the specific issues that arise in each category. Additionally, the benefits and challenges of MARL are described along with some of the problem domains where MARL techniques have been applied. Finally, an outlook for the field is provided.
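To make the stability concern concrete, the following is a minimal sketch (our own illustration, not an algorithm from the survey) of independent Q-learning in a repeated two-player coordination game. Each agent treats the other as part of the environment, so each agent's learning target moves as the other learns; this is exactly why stability of the learning dynamics is a central MARL goal. All names and parameter values here are illustrative assumptions.

```python
import random

# Payoff matrix of a simple coordination game: both agents are rewarded
# only when they pick the same action.
REWARD = {(0, 0): 1.0, (1, 1): 1.0, (0, 1): 0.0, (1, 0): 0.0}

def train(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q1, q2 = [0.0, 0.0], [0.0, 0.0]  # one Q-value per action, per agent

    def pick(q):
        if rng.random() < epsilon:                # explore
            return rng.randrange(2)
        return max(range(2), key=lambda a: q[a])  # exploit

    for _ in range(episodes):
        a1, a2 = pick(q1), pick(q2)
        r = REWARD[(a1, a2)]
        # Stateless Q-update: Q(a) <- Q(a) + alpha * (r - Q(a)).
        # Each agent's update depends on the other's current policy,
        # which itself is changing: a moving target.
        q1[a1] += alpha * (r - q1[a1])
        q2[a2] += alpha * (r - q2[a2])
    return q1, q2
```

After training, the two agents typically settle on the same action, and each agent's Q-value for that action approaches the reward it receives given the other's (mostly greedy) policy.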
Abstract: This research report discusses human group characteristics as a stepping stone to study human-agent team characteristics and dynamics. A human-agent team, or so-called actor-agent team (AAT), is a group of humans and agents who interact in a coherent and coordinated way towards a common goal. The concept of AATs relates to actor-agent communities (AACs), as AACs are groups of humans and artificial systems (socio-technical information systems) that intimately work together to achieve a common goal (i.e. solve a problem) (Iacob et al., 2009).
AATs are envisioned to increase human performance in (among others) the safety and security domains, emergency management, and traffic control. However, the concept of AATs brings many challenges. Besides the realisation of agents as team members and the realisation of real-world AATs, the interaction between agents and humans is a challenge. If agents are to become (task-performing) group members, team membership requires much from agents regarding human-agent interaction. How should agents be designed to become team members in an AAT? How can humans best interact with agents? When should humans trust an agent, or rely on it?
This document discusses human group characteristics to draw implications for AAT dynamics. It is a follow-up of Gouman et al. (2008), in which stages of team development, group membership and cohesion, subgroups, norms, roles, status, and leadership were discussed. The current report first addresses communication and decision making, after which team performance and implications for AATs are discussed.
Abstract: Multi-agent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, etc. Learning approaches to multi-agent control, many of them based on reinforcement learning (RL), are investigated in complex domains such as teams of mobile robots. However, the application of decentralized RL to low-level control tasks is not as intensively studied. In this paper, we investigate centralized and decentralized RL, emphasizing the challenges and potential advantages of the latter. These are then illustrated on an example: learning to control a two-link rigid manipulator. Some open issues and future research directions in decentralized RL are outlined.
Abstract: The abstraction-sophistication analysis has been developed as an extension of the abstraction hierarchy to aid the design of effective human-automation interaction for vehicle control systems. The new analysis framework is applied to the mini UAV system being developed at the D-CIS lab and TU Delft.
Abstract: Automation is often accused of adding to the complexity of a system and unnecessarily increasing the operator’s workload and the potential for human error. An approach is needed that guides designers to make the right design choices. Cognitive Systems Engineering (CSE) is a promising approach. However, this field is still young, and tangible examples of automation design with an explicit CSE approach do not exist. This paper describes how the design of the Total Energy Control System (TECS), which originated in the late 1970s, can be regarded as an example avant la lettre. TECS is an automated flight control system designed to solve many of the issues that classical autopilot and auto-throttle systems have. Since TECS has been designed, implemented, and evaluated, it can teach valuable lessons on how Work Domain Analysis (WDA) can guide the design of automated systems as the first phase of a CSE approach. The application of WDA to TECS is exemplified using the abstraction hierarchy and the abstraction decomposition space.
Abstract: Ant Colony Optimization (ACO) has proven to be a very powerful optimization heuristic for Combinatorial Optimization Problems. While being very successful for various NP-complete optimization problems, ACO is not trivially applicable to control problems. In this paper, a novel ACO algorithm is introduced for the automated design of optimal control policies for continuous-state dynamic systems. The so-called Fuzzy ACO algorithm integrates the multi-agent optimization heuristic of ACO with a fuzzy partitioning of the state space of the system. A simulated control problem is presented to demonstrate the functioning of the proposed algorithm.
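The two ingredients that the abstract says are integrated - a fuzzy partition of a continuous state space and pheromone-based action selection - can be sketched as follows. This is our own illustrative toy (the function names, membership shapes, and pheromone layout are assumptions, not the paper's algorithm): each fuzzy set holds a pheromone level per candidate action, and the applied control blends the per-set choices by the state's membership degrees.

```python
import random

CENTERS = [-1.0, 0.0, 1.0]   # cores of triangular membership functions
ACTIONS = [-1.0, 1.0]        # candidate control actions per fuzzy set

def memberships(x, centers=CENTERS, width=1.0):
    """Triangular membership degrees of state x in each fuzzy set,
    normalized so the degrees sum to one."""
    mu = [max(0.0, 1.0 - abs(x - c) / width) for c in centers]
    s = sum(mu) or 1.0   # guard against x outside all supports
    return [m / s for m in mu]

def select_control(x, pheromone, rng):
    """Pick one action per fuzzy set with probability proportional to
    its pheromone level, then blend the picked actions weighted by the
    state's membership degrees."""
    mu = memberships(x)
    u = 0.0
    for i, m in enumerate(mu):
        total = sum(pheromone[i])
        probs = [t / total for t in pheromone[i]]
        a = rng.choices(range(len(ACTIONS)), weights=probs)[0]
        u += m * ACTIONS[a]
    return u

# Uniform initial pheromone: one level per (fuzzy set, action) pair.
# In an ACO loop these levels would be reinforced by well-performing ants.
pheromone = [[1.0, 1.0] for _ in CENTERS]
```

In a full algorithm, the pheromone table would be updated from the performance of simulated trajectories; the sketch only shows how a continuous state maps to a pheromone-driven control choice.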
Abstract: In this article we consider the issue of optimal control in collaborative multi-agent systems with stochastic dynamics. The agents have a joint task in which they have to reach a number of target states. The dynamics of the agents contains additive control and additive noise, and the autonomous part factorizes over the agents. Full observation of the global state is assumed. The goal is to minimize the accumulated joint cost, which consists of integrated instantaneous costs and a joint end cost. The joint end cost expresses the joint task of the agents. The instantaneous costs are quadratic in the control and factorize over the agents. The optimal control is given as a weighted linear combination of single-agent to single-target controls. The single-agent to single-target controls are expressed in terms of diffusion processes. These controls, when not closed form expressions, are formulated in terms of path integrals, which are calculated approximately by Metropolis-Hastings sampling. The weights in the control are interpreted as marginals of a joint distribution over agent to target assignments. The structure of the latter is represented by a graphical model, and the marginals are obtained by graphical model inference. Exact inference of the graphical model will break down in large systems, and so approximate inference methods are needed. We use naive mean field approximation and belief propagation to approximate the optimal control in systems with linear dynamics. We compare the approximate inference methods with the exact solution, and we show that they can accurately compute the optimal control. Finally, we demonstrate the control method in multi-agent systems with nonlinear dynamics consisting of up to 80 agents that have to reach an equal number of target states.
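The weighted-combination structure described in this abstract can be written compactly as follows; the notation is ours, chosen to match the description, and is not necessarily the paper's.

```latex
% u_i^*        : optimal control of agent i
% u_{is}       : control steering agent i alone to target s
% p(s_i = s | x, t) : marginal probability that agent i is assigned target s
\[
  u_i^*(x, t) \;=\; \sum_{s} p(s_i = s \mid x, t)\, u_{is}(x, t)
\]
% The assignment distribution p is represented by a graphical model over
% agent-to-target assignments; its marginals are computed exactly for small
% systems and by naive mean field or belief propagation for large ones.
```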
Abstract: Autonomous agents are believed to have control over their internal state and over their behaviour. For that reason, an agent should control how and by whom it is being influenced. We introduce a reasoning component for BDI-agents that deals with the control over external influences, and we propose heuristics using local knowledge to process incoming stimuli. One of those heuristics is based on information relevance with respect to the agent's current plans and goals. We have developed a way to determine the relevance of information in BDI-agents using magic sets from database research as a basis. The method presented shows a new application of magic sets by applying the theory in agent systems.
Abstract: Advances in network technologies enable distributed systems, operating in complex physical environments, to co-ordinate their activities over larger areas within shorter time intervals. Some envisioned application domains for such systems are defence, crisis management, traffic management, and public safety. In these systems, humans and machines will, in close interaction, be adaptive to a changing environment. Various architecture models are proposed for such Networked Adaptive Interactive Hybrid Systems (NAIHS) from different research areas, such as (networked) sensor fusion, command and control, artificial intelligence, robotics, and human-machine interaction. In this paper, an architecture model is proposed that seeks to combine their merits. The NAIHS model focuses on the ‘hybrid mind’, which is layered in several dimensions defining specific functional components and their interactions. Subsequently, the interaction between the human and the artificial part of the system is discussed.
Abstract: Choice of an incorrect representation for the design of automation can dramatically increase system complexity. Principles from Cognitive Systems Engineering (CSE), which can be used to identify good representations of the way the ‘world works’, provide a good starting point for automation design.
This paper argues that by choosing the right model for automation design, the added complexity can be limited. But what is the right model for automation? The model of the environment, or ecology, is preferred over the mental models that human operators have developed through interacting with the system. Technology has altered the work environment of the human operator and may have induced too complex or too simplified mental models. A too complex mental model brings too high a cognitive load, while a too simplified mental model will not be sufficient in all situations. Using the ecology as the basis for the model of automation, the complexity of the automation is constrained to that of the actual environment, with a minimum share of automation-induced complexity.
To illustrate this, we consider the design of a conventional autopilot and one based on total energy control, and discuss the mental model pilots have of energy control. Energy control is fundamental to the physics of flight. It is part of the environment, and thus of the ecology, for pilots, and a proper understanding of energy control helps the pilot deal with unanticipated events such as the mountain wave condition.
Abstract: Multi-agent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. Many tasks arising in these domains require that the agents learn behaviors online. A significant part of the research on multi-agent learning concerns reinforcement learning techniques. However, due to different viewpoints on central issues, such as the formal statement of the learning goal, a large number of different methods and approaches have been introduced. In this paper we aim to present an integrated survey of the field. First, the issue of the multi-agent learning goal is discussed, after which a representative selection of algorithms is reviewed. Finally, open issues are identified and future research directions are outlined.
Abstract: We study optimal control in large stochastic multi-agent systems in continuous space and time. We consider multi-agent systems where agents have independent dynamics with additive noise and control. The goal is to minimize the joint cost, which consists of a state dependent term and a term quadratic in the control. The system is described by a mathematical model, and an explicit solution is given. We focus on large systems where agents have to distribute themselves over a number of targets with minimal cost. In such a setting the optimal control problem is equivalent to a graphical model inference problem. Exact inference will be intractable, and we use the mean field approximation to compute accurate approximations of the optimal controls. We conclude that near to optimal control in large stochastic multi-agent systems is possible with this approach.
Abstract: We consider multiagent systems with stochastic non-linear dynamics in continuous space-time. We focus on systems of agents that aim to visit a number of given target locations at given points in time at minimal control cost. The online optimization of which agent has to visit which target requires the solution of the Hamilton-Jacobi-Bellman (HJB) equation, which is a non-linear partial differential equation (PDE). Under some conditions, the log-transform can be applied to turn the HJB equation into a linear PDE. We then show that the optimal solution in the multiagent scheduling problem can be expressed in closed form as a sum of single schedule solutions.
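The log-transform mentioned in this abstract can be sketched in the standard form of path-integral control (Kappen, 2005); the symbols below are generic and not necessarily the paper's.

```latex
% For dynamics dx = f(x,t)\,dt + u\,dt + \sigma\,dW, state cost V(x,t),
% quadratic control cost \tfrac{1}{2} u^\top R u, and the linearizing
% condition \sigma\sigma^\top = \lambda R^{-1}, the substitution
\[
  J(x,t) \;=\; -\lambda \log \psi(x,t)
\]
% turns the nonlinear HJB equation for the optimal cost-to-go J into the
% linear PDE
\[
  -\partial_t \psi \;=\;
  \Bigl( -\frac{V}{\lambda}
         + f^\top \nabla
         + \tfrac{1}{2}\,\mathrm{Tr}\bigl(\sigma\sigma^\top \nabla^2\bigr)
  \Bigr)\,\psi ,
\]
% solved backward in time from the end condition
% \psi(x,T) = \exp\bigl(-\phi(x)/\lambda\bigr), with \phi the end cost.
% Linearity of this PDE is what allows the multiagent solution to be
% written as a sum of single-schedule solutions.
```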
Abstract: Reinforcement learning (RL) comprises an array of techniques that learn a control policy so as to maximize a reward signal. When applied to the control of elevator systems, RL has the potential of finding better control policies than classical heuristic, suboptimal policies. On the other hand, elevator systems offer an interesting benchmark application for the study of RL. In this paper, RL is applied to a single-elevator system. The mathematical model of the elevator system is described in detail, making the system easy to re-implement and re-use. An experimental comparison is made between the performance of the Q-value iteration and Q-learning RL algorithms when applied to the elevator system.
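The two algorithms compared in this abstract can be contrasted on a toy problem. The sketch below is our own minimal illustration (not the paper's elevator model): Q-value iteration sweeps a known two-state model, while Q-learning estimates the same Q-values from sampled transitions.

```python
import random

# A tiny MDP: in state 0, action 1 moves to the terminal state 1 with
# reward 1; action 0 stays in state 0 with reward 0.
GAMMA = 0.9

def q_value_iteration(sweeps=100):
    """Model-based: repeatedly apply the Bellman optimality backup."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(sweeps):
        q[0][0] = 0.0 + GAMMA * max(q[0])  # stay in state 0
        q[0][1] = 1.0                      # jump to terminal state
    return q

def q_learning(episodes=2000, alpha=0.1, epsilon=0.2, seed=0):
    """Model-free: epsilon-greedy sampling plus the Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(episodes):
        s = 0
        while s == 0:
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[s][i])
            s2, r = (1, 1.0) if a == 1 else (0, 0.0)
            target = r + GAMMA * (0.0 if s2 == 1 else max(q[s2]))
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q
```

Q-value iteration converges to the exact values Q(0,1) = 1 and Q(0,0) = 0.9, and Q-learning approaches the same values from samples, which mirrors the kind of comparison the paper performs on the elevator system.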
Abstract: Multi-agent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, etc. Although the individual agents can be programmed in advance, many tasks require that they learn behaviors online. A significant part of the research on multi-agent learning concerns reinforcement learning techniques. This paper gives a survey of multi-agent reinforcement learning, starting with a review of the different viewpoints on the learning goal, which is a central issue in the field. Two generic goals are distinguished: stability of the learning dynamics, and adaptation to the other agents' dynamic behavior. The focus on one of these goals, or a combination of both, leads to a categorization of the methods and approaches in the field. The challenges and benefits of multi-agent reinforcement learning are outlined along with open issues and future research directions.
Abstract: A large class of nonlinear systems can be well approximated by Takagi-Sugeno fuzzy models, for which methods and algorithms have been developed to analyze their stability and to design observers and controllers. However, results obtained for Takagi-Sugeno fuzzy models are in general not directly applicable to the original nonlinear system. In this paper, we investigate what conclusions can be drawn when an observer-based controller is designed for an approximate model and then applied to the original nonlinear system. In particular, we consider the case where the scheduling vector depends on the states that have to be estimated, and the estimated scheduling vector is used in the membership functions of the observer. The results are illustrated throughout the paper using simulation examples.
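For reference, the Takagi-Sugeno structure this abstract refers to can be written in its common form; the notation below is generic, not taken from the paper.

```latex
% The nonlinear dynamics are approximated as a convex blend of m local
% linear models, scheduled by a vector z:
\[
  \dot{x} \;=\; \sum_{i=1}^{m} w_i(z)\,\bigl(A_i x + B_i u\bigr),
  \qquad w_i(z) \ge 0, \quad \sum_{i=1}^{m} w_i(z) = 1 .
\]
% The case studied in the paper is the one where z depends on unmeasured
% states, so the observer must evaluate the membership functions w_i at
% an estimated scheduling vector \hat{z} rather than the true z.
```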
Abstract: Recently, a theory for stochastic optimal control in non-linear dynamical systems in continuous space-time has been developed (Kappen, 2005). We apply this theory to collaborative multi-agent systems. The agents evolve according to a given non-linear dynamics with additive Wiener noise. Each agent can control its own dynamics. The goal is to minimize the accumulated joint cost, which consists of a state dependent term and a term that is quadratic in the control. We focus on systems of non-interacting agents that have to distribute themselves optimally over a number of targets, given a set of end-costs for the different possible agent-target combinations. We show that the optimal control is the combinatorial sum of independent single-agent single-target optimal controls, weighted by a factor proportional to the end-costs of the different combinations. Thus, multi-agent control is related to a standard graphical model inference problem. The additional computational cost compared to single-agent control is exponential in the tree-width of the graph specifying the combinatorial sum times the number of targets. We illustrate the result by simulations of systems with up to 42 agents.