Abstract: We consider a stochastic nonlinear dynamical process with annihilation of particles. This process can be viewed as the continuous-time version of the extended
Kalman filter/smoother. It also plays an important role in stochastic optimal control theory. We derive a Gaussian approximation for this process. With the use
of the path integral formalism we derive Euler-Lagrange equations for the mode.
Furthermore, we derive a linear noise approximation to estimate the size of the
fluctuations around the mode, and estimates of the partition function, based on the
mode and Gaussian corrections. Numerical experiments confirm the validity of
the approximation method. In addition, they show that the Gaussian correction
provides a significant improvement of the estimate of the partition function.
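
As a toy illustration of how a Gaussian correction can improve a mode-based estimate of a partition function, the sketch below applies the Laplace approximation to a one-dimensional integral Z = ∫ exp(-S(x)/λ) dx. It is not the process studied in the paper: the action `S`, the noise level `lam`, and the helper `laplace_log_z` are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def laplace_log_z(S, dS, d2S, lam, x_init=0.0, n_newton=50):
    """Laplace estimate of log Z for Z = integral of exp(-S(x)/lam) dx.

    Returns the mode, the mode-only estimate -S(x*)/lam, and the estimate
    that adds the Gaussian correction 0.5*log(2*pi*lam / S''(x*)).
    """
    x = float(x_init)
    for _ in range(n_newton):          # Newton iteration on S'(x) = 0
        x -= dS(x) / d2S(x)
    log_z_mode = -S(x) / lam
    log_z_gauss = log_z_mode + 0.5 * np.log(2.0 * np.pi * lam / d2S(x))
    return x, log_z_mode, log_z_gauss

# Hypothetical anharmonic action, used only for illustration.
def S(x):
    return 0.5 * x**2 + 0.25 * x**4

def dS(x):
    return x + x**3

def d2S(x):
    return 1.0 + 3.0 * x**2

lam = 0.1
mode, log_z_mode, log_z_gauss = laplace_log_z(S, dS, d2S, lam, x_init=0.3)

# Brute-force reference value from a fine Riemann sum.
grid = np.linspace(-5.0, 5.0, 200_001)
log_z_ref = np.log(np.sum(np.exp(-S(grid) / lam)) * (grid[1] - grid[0]))

print(f"mode only: {log_z_mode:.4f}  with Gaussian correction: {log_z_gauss:.4f}"
      f"  reference: {log_z_ref:.4f}")
```

In this toy setting the corrected estimate lies noticeably closer to the reference than the mode-only estimate, which mirrors, in a much simpler problem, the kind of improvement the abstract reports.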

Abstract: We address the role of noise and the issue of efficient computation in stochastic optimal control
problems. We consider a class of nonlinear control problems that can be formulated as a path integral and
where the noise plays the role of temperature. The path integral displays symmetry breaking and there
exists a critical noise value that separates regimes where optimal control yields qualitatively different
solutions. The path integral can be computed efficiently by Monte Carlo integration or by a Laplace
approximation, and can therefore be used to solve high-dimensional stochastic control problems.
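
A minimal sketch of the Monte Carlo route, under assumptions that the abstract does not spell out: one-dimensional dynamics dx = u dt + dξ with noise variance ν dt, control cost (R/2) u², an optional state cost, and an end cost. In that setting the cost-to-go can be written as J = -λ log E[exp(-S/λ)] with λ = νR, where the expectation is over uncontrolled trajectories and S is the accumulated state and end cost. The function name `mc_cost_to_go` and all parameter values are hypothetical.

```python
import numpy as np

def mc_cost_to_go(x0, horizon, nu, R, end_cost, path_cost=None,
                  n_paths=100_000, n_steps=50, seed=0):
    """Estimate J(x0) = -lam * log E[exp(-S/lam)] by sampling uncontrolled paths.

    Assumed model: dx = u dt + dxi with <dxi^2> = nu dt and control cost
    (R/2) u^2, so that lam = nu * R plays the role of a temperature.
    """
    rng = np.random.default_rng(seed)
    lam = nu * R
    dt = horizon / n_steps
    x = np.full(n_paths, float(x0))
    action = np.zeros(n_paths)
    for _ in range(n_steps):
        if path_cost is not None:
            action += path_cost(x) * dt                            # state cost along the path
        x += np.sqrt(nu * dt) * rng.standard_normal(n_paths)       # uncontrolled diffusion step
    action += end_cost(x)                                          # end cost
    psi = np.mean(np.exp(-action / lam))
    return -lam * np.log(psi)

# Check against a case with a closed form: quadratic end cost, no state cost.
alpha, nu, R, horizon, x0 = 1.0, 1.0, 1.0, 1.0, 0.5
J_mc = mc_cost_to_go(x0, horizon, nu, R, lambda x: 0.5 * alpha * x**2)

lam = nu * R
sigma2 = nu * horizon
J_exact = (0.5 * lam * np.log(1.0 + alpha * sigma2 / lam)
           + 0.5 * alpha * lam * x0**2 / (lam + alpha * sigma2))
print(J_mc, J_exact)
```

The quadratic end cost is used here only because the Gaussian integral can be done by hand, which gives a reference value. Replacing it with an end cost that has two separated minima is one way to explore how the solution changes with the noise level ν.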

Abstract: Control theory is a mathematical description of how to act
optimally to gain future rewards. In this paper we discuss
a class of non-linear stochastic control problems that can be
efficiently solved using a path integral. In this control formalism, the central concept of cost-to-go or value function
becomes a free energy and methods and concepts from statistical physics can be readily applied, such as Monte Carlo
sampling or the Laplace approximation. When applied to a
receding horizon problem in a stationary environment, the
solution resembles the one obtained by traditional reinforcement learning with discounted reward. It is shown that this
solution can be computed more efficiently than in the discounted reward framework. As shown in previous work, the
approach is easily generalized to time-dependent tasks and
is therefore of great relevance for modeling real-time interactions between agents.

Abstract: Control theory is a mathematical description of how to act optimally to gain future rewards. In this paper I give an introduction to deterministic and stochastic control theory and an overview of the possible application of control theory to the modeling of animal behavior and learning. I discuss a class of non-linear stochastic control problems that can be efficiently solved using a path integral or by Monte Carlo sampling. In this control formalism, the central concept of cost-to-go becomes a free energy and methods and concepts from statistical physics can be readily applied.

Abstract: In this article we consider the issue of optimal control in collaborative multi-agent systems with stochastic dynamics. The agents have a joint task in which they have to reach a number of target states. The dynamics of the agents contains additive control and additive noise, and the autonomous part factorizes over the agents. Full observation of the global state is assumed. The goal is to minimize the accumulated joint cost, which consists of integrated instantaneous costs and a joint end cost. The joint end cost expresses the joint task of the agents. The instantaneous costs are quadratic in the control and factorize over the agents. The optimal control is given as a weighted linear combination of single-agent to single-target controls. The single-agent to single-target controls are expressed in terms of diffusion processes. These controls, when not available in closed form, are formulated in terms of path integrals, which are calculated approximately by Metropolis-Hastings sampling. The weights in the control are interpreted as marginals of a joint distribution over agent-to-target assignments. The structure of the latter is represented by a graphical model, and the marginals are obtained by graphical model inference. Exact inference of the graphical model will break down in large systems, and so approximate inference methods are needed. We use naive mean field approximation and belief propagation to approximate the optimal control in systems with linear dynamics. We compare the approximate inference methods with the exact solution, and we show that they can accurately compute the optimal control. Finally, we demonstrate the control method in multi-agent systems with nonlinear dynamics consisting of up to 80 agents that have to reach an equal number of target states.

Abstract: Optimal control theory is a mathematical description of how to act optimally
to gain future rewards. In this paper I give an introduction to
deterministic and stochastic control theory, partial observability,
learning and the combined problem of inference and control. Subsequently, I
discuss a new class of non-linear stochastic
control problems for which the Bellman equation becomes linear in the
control and that can be efficiently solved using a path integral.
In this control formalism the central concept of cost-to-go becomes a
free energy and methods and concepts from probabilistic graphical
models and statistical physics can be readily applied. I illustrate the
theory with a number of examples.

Abstract: This paper considers linear-quadratic control of a non-linear dynamical system subject to arbitrary cost. I show that for this class of stochastic control problems the non-linear Hamilton–Jacobi–Bellman equation can be transformed into a linear equation. The transformation is similar to the transformation used to relate the classical Hamilton–Jacobi equation to the Schrödinger equation. As a result of the linearity, the usual backward computation can be replaced by a forward diffusion process that can be computed by stochastic integration or by the evaluation of a path integral. It is shown how in the deterministic limit the Pontryagin minimum principle formalism is recovered. The significance of the path integral approach is that it forms the basis for a number of efficient computational methods, such as Monte Carlo sampling, the Laplace approximation and the variational approximation. We show the effectiveness of the first two methods in a number of examples. Examples are given that show the qualitative difference between stochastic and deterministic control and the occurrence of symmetry breaking as a function of the noise.
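
The linearizing transformation can be sketched in one dimension. The specific form below is an assumption, since the abstract does not state the model: dynamics dx = (f(x,t) + u) dt + dξ with noise variance ν dt, control cost (R/2) u², state cost V(x,t), and end cost φ(x). The symbols ψ and λ are introduced here for illustration.

```latex
% Assumed model: dx = (f + u) dt + d\xi,  <d\xi^2> = \nu dt,
% cost C = < \phi(x(T)) + \int_t^T ( (R/2) u^2 + V ) ds >.
\begin{align}
  -\partial_t J &= \min_u \Big( \tfrac{R}{2} u^2 + V + (f + u)\,\partial_x J
                   + \tfrac{1}{2}\nu\,\partial_x^2 J \Big),
  \qquad u^* = -\tfrac{1}{R}\,\partial_x J, \\
  -\partial_t J &= -\tfrac{1}{2R}\,(\partial_x J)^2 + V + f\,\partial_x J
                   + \tfrac{1}{2}\nu\,\partial_x^2 J .
\end{align}
% Log transform J = -\lambda \log\psi:
%   \partial_x J   = -\lambda\,\psi'/\psi,
%   \partial_x^2 J = -\lambda\,\psi''/\psi + \lambda\,(\psi'/\psi)^2 .
% The quadratic terms -\tfrac{\lambda^2}{2R}(\psi'/\psi)^2 and
% +\tfrac{\nu\lambda}{2}(\psi'/\psi)^2 cancel when \lambda = \nu R, leaving
\begin{align}
  \partial_t \psi &= \Big( \tfrac{V}{\lambda} - f\,\partial_x
                     - \tfrac{1}{2}\nu\,\partial_x^2 \Big)\psi ,
  \qquad \psi(x,T) = \exp\!\big(-\phi(x)/\lambda\big).
\end{align}
```

Under these assumptions the equation for ψ is linear, and the choice λ = νR is what makes the noise level act like a temperature in the free-energy reading of J = -λ log ψ.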