Abstract: We address the role of noise and the issue of efficient computation in stochastic optimal control problems. We consider a class of nonlinear control problems that can be formulated as a path integral and where the noise plays the role of temperature. The path integral displays symmetry breaking, and there exists a critical noise value that separates regimes where optimal control yields qualitatively different solutions. The path integral can be computed efficiently by Monte Carlo integration or by a Laplace approximation, and can therefore be used to solve high-dimensional stochastic control problems.
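As a rough illustration of the Monte Carlo route mentioned above, the path integral can be estimated by sampling uncontrolled noisy rollouts of the dynamics. The 1-D drift, the end-cost-only objective, and all parameter names below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def psi_estimate(x0, drift, end_cost, T=1.0, dt=0.01, lam=1.0,
                 n_samples=1000, sigma=1.0, seed=0):
    """Monte Carlo estimate of psi(x0) = E[exp(-S/lam)], where S here is
    just the end cost of uncontrolled noisy rollouts of a 1-D diffusion
    dx = drift(x) dt + sigma dW. A toy sketch, not the paper's setup."""
    rng = np.random.default_rng(seed)
    x = np.full(n_samples, float(x0))
    for _ in range(int(T / dt)):
        # Euler-Maruyama step of the uncontrolled stochastic dynamics.
        x += drift(x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_samples)
    return float(np.mean(np.exp(-end_cost(x) / lam)))
```

The optimal cost-to-go then follows as J(x0) = -lam * log(psi(x0)); a Laplace approximation would instead expand around the single lowest-cost trajectory rather than averaging over samples.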
Abstract: This report presents an overview of the state-of-the-art methods and models for planning for teams of embodied agents. Due to the nature of the real world, this means we focus on multi-agent planning in stochastic, partially observable systems. In particular we focus on decentralized partially observable Markov decision processes (Dec-POMDPs), partially observable stochastic games (POSGs) and related models. Regarding such models, we review complexity results and recently proposed methods for finding (approximate) solutions.
Abstract: Collaboration between humans (actors) and artificial entities (agents) is a potential source of performance gains. Agents, as complementary artificial intelligent entities, can relieve actors of certain activities while increasing collective effectiveness. This paper describes our approach for experimentation with actors, agents and their interaction. This approach is based on a principled combination of existing empirical research methods and is illustrated by a small experiment which assesses the performance of a specific actor-agent team in comparison with an actor-only team in an incident management context. The REsearch and Simulation toolKit (RESK) is instrumental for controlled and repeatable experimentation. The indicative findings show that the approach is viable and forms a basis for further data collection and comparative experiments. The approach supports applied actor-agent research in demonstrating its (dis)advantages compared with actor-only solutions.
Abstract: Data corpora are an important part of any audio-visual research. However, the time and effort needed to build a good dataset are considerable. We therefore argue that researchers should follow general guidelines when building a corpus, guaranteeing that the resulting datasets have common properties. This gives the opportunity to compare the results of different approaches from different research groups even without sharing the same data corpus. In this paper we formulate a set of guidelines that should always be taken into account when developing an audio-visual data corpus for bi-modal speech recognition. In the process we compare samples from existing datasets and propose solutions to the drawbacks these datasets suffer from. Finally, we give a complete list of the properties of some of the best-known data corpora.
Abstract: We present scenarios, issues, requirements, experience and solutions from a central government organisation supporting local emergency management organisations.
The presentation covers the experience collected in the DIADEM project by the Danish Emergency Management Organisation, with a focus on using ARGOS for response to chemical incidents.
Abstract: Reinforcement learning (RL) is a widely used paradigm for learning control. Computing exact RL solutions is generally only possible when process states and control actions take values in a small discrete set. In practice, approximate algorithms are necessary. In this paper, we propose an approximate, model-based Q-iteration algorithm that relies on a fuzzy partition of the state space and a discretization of the action space. Using assumptions on the continuity of the dynamics and of the reward function, we show that the resulting algorithm is consistent, i.e., that the optimal solution is obtained asymptotically as the approximation accuracy increases. An experimental study indicates that a continuous reward function is also important for a predictable improvement in performance as the approximation accuracy increases.
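A minimal sketch of what such a fuzzy Q-iteration can look like, assuming triangular membership functions on a 1-D state space, a finite action set, and a deterministic model. The toy problem and all names are illustrative, not the paper's implementation:

```python
import numpy as np

def triangular_memberships(x, centers):
    """Membership degrees of scalar state x in triangular fuzzy sets
    centered at the sorted `centers`; the degrees sum to 1."""
    mu = np.zeros(len(centers))
    if x <= centers[0]:
        mu[0] = 1.0
    elif x >= centers[-1]:
        mu[-1] = 1.0
    else:
        i = np.searchsorted(centers, x) - 1
        w = (x - centers[i]) / (centers[i + 1] - centers[i])
        mu[i], mu[i + 1] = 1.0 - w, w
    return mu

def fuzzy_q_iteration(centers, actions, step, reward, gamma=0.95, iters=200):
    """Model-based fuzzy Q-iteration: theta[i, j] approximates Q at fuzzy
    center i for discrete action j; `step(x, u)` is a deterministic model."""
    theta = np.zeros((len(centers), len(actions)))
    for _ in range(iters):
        new = np.empty_like(theta)
        for i, c in enumerate(centers):
            for j, u in enumerate(actions):
                x_next = step(c, u)
                mu = triangular_memberships(x_next, centers)
                # Interpolated value of the successor state under theta.
                new[i, j] = reward(c, u) + gamma * np.max(mu @ theta)
        theta = new
    return theta
```

On a toy bounded integrator with reward -x**2, the greedy action recovered at the outer fuzzy centers pushes the state toward the origin, as one would expect.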
Abstract: An often heard lament is that agent technology is not used by ‘the industry’. In this column we argue that agent researchers need to look at themselves, and show, by demonstration, how agent technology can be used. Industry is, in general, interested in technology which provides something ‘better’: in terms of solutions to problems, in terms of cost, maintenance, deployment, development, etc. As such, more emphasis and rewards need to be placed on building demonstrations of agent technology, including performance metrics and comparisons with alternative technologies.
Abstract: Reinforcement learning (RL) is a learning control paradigm that provides well-understood algorithms with good convergence and consistency properties. Unfortunately, these algorithms require that process states and control actions take only discrete values. Approximate solutions using fuzzy representations have been proposed in the literature for the case when the states and possibly the actions are continuous. However, the link between these mainly heuristic solutions and the larger body of work on approximate RL, including convergence results, has not been made explicit. In this paper, we propose a fuzzy approximation structure for the Q-value iteration algorithm, and show that the resulting algorithm is convergent. The proof is based on an extension of previous results in approximate RL. We then propose a modified, serial version of the algorithm that is guaranteed to converge at least as fast as the original algorithm. An illustrative simulation example is also provided.
Abstract: Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, planning optimally is difficult because solutions map local observation histories to actions, and the number of such histories grows exponentially in the planning horizon. In this work, we identify a criterion that allows for lossless clustering of observation histories: i.e., we prove that when two histories satisfy the criterion, they have the same optimal value and thus can be treated as one. We show how this result can be exploited in optimal policy search and demonstrate empirically that it can provide a speed-up of multiple orders of magnitude, allowing the optimal solution of significantly larger problems. We also perform an empirical analysis of the generality of our clustering method, which suggests that it may also be useful in other (approximate) Dec-POMDP solution methods.
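The clustering idea can be sketched as grouping an agent's histories by the distribution they induce. The `induced_distribution` function below is a placeholder assumption (the paper's actual criterion involves the joint distribution over states and other agents' histories), so this is only a shape of the computation, not its substance:

```python
from collections import defaultdict

def cluster_histories(histories, induced_distribution, tol=1e-9):
    """Group observation histories that induce (numerically) identical
    distributions, so each group can be treated as a single history.
    `induced_distribution(h)` is assumed to return a dict mapping
    outcomes to probabilities; the real criterion is paper-specific."""
    buckets = defaultdict(list)
    for h in histories:
        dist = induced_distribution(h)
        # Round probabilities so floating-point noise does not split clusters.
        key = tuple(sorted((k, round(p / tol) * tol) for k, p in dist.items()))
        buckets[key].append(h)
    return list(buckets.values())
```

Since equivalent histories share one optimal value, the policy search only needs to enumerate actions per cluster rather than per history, which is where the reported speed-up comes from.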
Abstract: We consider multiagent systems with stochastic non-linear dynamics in continuous space-time. We focus on systems of agents that aim to visit a number of given target locations at given points in time at minimal control cost. The online optimization of which agent has to visit which target requires the solution of the Hamilton-Jacobi-Bellman (HJB) equation, which is a non-linear partial differential equation (PDE). Under some conditions, the log-transform can be applied to turn the HJB equation into a linear PDE. We then show that the optimal solution in the multiagent scheduling problem can be expressed in closed form as a sum of single schedule solutions.
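For concreteness, the log-transform step can be sketched as follows. The notation (control-affine dynamics, quadratic control cost, and the condition $\Sigma = \lambda R^{-1}$ linking noise and control cost) is the standard path-integral-control setting and is assumed here rather than taken from the abstract:

```latex
% HJB equation for cost-to-go J(x,t) with dynamics dx = (f + Gu)\,dt + noise:
\begin{align}
  -\partial_t J &= \min_u \Big[ V + \tfrac{1}{2} u^\top R u
      + (f + G u)^\top \nabla J
      + \tfrac{1}{2}\operatorname{Tr}\!\big(G \Sigma G^\top \nabla^2 J\big) \Big].
\end{align}
% Substituting J = -\lambda \log \psi and requiring \Sigma = \lambda R^{-1}
% cancels the quadratic terms in \nabla\psi, leaving a linear PDE in \psi:
\begin{align}
  \partial_t \psi = \frac{V}{\lambda}\,\psi - f^\top \nabla \psi
      - \tfrac{1}{2}\operatorname{Tr}\!\big(G \Sigma G^\top \nabla^2 \psi\big).
\end{align}
```

Because the transformed PDE is linear, superposition applies, which is what makes it possible to express the multiagent solution as a sum of single-schedule solutions.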
Abstract: Thanks to advances in both computer science and engineering, the divide between robotics and multi-agent systems is shrinking. Robots are capable of performing an ever wider range of tasks, and there is an increasing need for solutions to high-level problems such as multi-agent coordination. In this paper we examine the problem of finding a robust exploration strategy for a team of mobile robots that takes into account communication limitations. We propose four performance metrics to evaluate and compare existing multi-robot exploration algorithms, and present a role-based approach in which robots act either as explorers or as relays. The result is a complete exploration of the environment in which information is efficiently returned to a central command centre, which is particularly applicable to the domain of rescue robotics.
Abstract: In highly dynamic scheduling, change events occur long before the scheduling horizon is reached. The scheduler is driven by change. The complexity of these problems cannot be described in terms of NP-hardness, as no exact optimal solution exists for a dynamic problem. Nevertheless, search space explosions are common. In addition, search time is very limited, since the environment requires immediate solutions. Each problem solving approach will therefore employ heuristics in some sense. The heuristics may be in the algorithm (e.g., metaheuristics), in narrowing down the search space, or in enhancing the fitness functions (e.g., by preferring robust schedules).
Evolutionary algorithms (EAs) are frequently applied as metaheuristics, both in static and dynamic environments. They have innate robust tendencies, being able to adapt to changing environments. However, we argue they are of limited use in highly dynamic problems. As revised schedules preferably stay close to the original, there is no need for the broadly sampled neighbourhoods that EAs provide. Furthermore, schedules in dynamic environments have great internal coherence, which an EA is prone to break. Lastly, in such problems, the time to perform EA computations is simply not available. The optimisation approach needed in such environments often involves load balancing, which can be performed by exchanging tasks between machines or queues. Therefore, more straightforward element-swapping heuristics such as k-opt search will prove more valuable.
Elevator dispatching was used as a case study in this research. It is a typical example of a highly dynamic scheduling problem. Experiments using real-world passenger data show that EAs are outperformed by k-opt search, even when search time is unlimited.
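A minimal sketch of the kind of element-swapping search the abstract contrasts with EAs, here as a greedy single-task-move improvement over queues. The queue representation and makespan-style cost function are illustrative assumptions, not the study's elevator model:

```python
def swap_search(queues, cost, max_passes=50):
    """Greedy k-opt-style improvement: repeatedly try moving one task
    between queues, keeping any move that lowers the assignment cost.
    `queues` is a list of task lists; `cost(queues)` scores the whole
    assignment (both are illustrative assumptions)."""
    best = cost(queues)
    improved, passes = True, 0
    while improved and passes < max_passes:
        improved = False
        passes += 1
        for a in range(len(queues)):
            for b in range(len(queues)):
                if a == b:
                    continue
                i = 0
                while i < len(queues[a]):
                    task = queues[a].pop(i)
                    queues[b].append(task)
                    c = cost(queues)
                    if c < best:
                        best, improved = c, True  # keep the improving move
                    else:
                        queues[b].pop()           # undo the move
                        queues[a].insert(i, task)
                        i += 1
    return queues, best
```

Because each kept move strictly decreases the cost and revised schedules stay close to the original, this neighbourhood is small and fast to search, which matches the argument for preferring it over EA-style broad sampling in highly dynamic settings.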
Abstract: Urban Search and Rescue is a growing area of robotic research. The RoboCup Federation has recognized this, and has created the new Virtual Robots competition to complement its existing physical robot and agent competitions. In order to successfully compete in this competition, teams need to field multi-robot solutions that cooperatively explore and map an environment while searching for victims. This paper presents the results of the first annual RoboCup Rescue Virtual competition. It provides details on the metrics used to judge the contestants as well as summaries of the algorithms used by the top four teams. This allows readers to compare and contrast these effective approaches. Furthermore, the simulation engine itself is examined and real-world validation results on the engine and algorithms are offered.