PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Efficient planning for factored infinite-horizon DEC-POMDPs
Joni Pajarinen and Jaakko Peltonen
In: IJCAI-2011, The Twenty-second International Joint Conference on Artificial Intelligence, 16-22 Jul 2011, Barcelona, Spain.


Decentralized partially observable Markov decision processes (DEC-POMDPs) are used to plan policies for multiple agents that must maximize a joint reward function but do not communicate with each other. The agents act under uncertainty about each other and the environment. This planning task arises in the optimization of wireless networks, and in other scenarios where communication between agents is restricted by cost or physical limits. DEC-POMDPs are a promising solution, but optimizing policies quickly becomes computationally intractable as problem size grows. Factored DEC-POMDPs allow large problems to be described in compact form, but have the same worst-case complexity as non-factored DEC-POMDPs. We propose an efficient optimization algorithm for large factored infinite-horizon DEC-POMDPs. We reformulate expectation-maximization (EM) based policy optimization into a new form whose complexity can be kept tractable by factored approximations. Our method performs well, and it can solve problems with more agents and larger state spaces than state-of-the-art DEC-POMDP methods. We give results for factored infinite-horizon DEC-POMDP problems with up to 10 agents.
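The core idea behind EM-based policy optimization — treating reward as a likelihood and alternately reweighting each agent's stochastic policy while the other agents are held fixed — can be illustrated on a toy problem. The sketch below is not the paper's factored algorithm: it uses a hypothetical two-agent, single-shot joint reward matrix with rewards scaled to [0, 1], and applies a reward-weighted multiplicative update to each agent's policy in turn. Each such update cannot decrease the joint expected reward, so the iteration climbs to a local optimum.

```python
import numpy as np

# Hypothetical joint reward for two agents with 3 actions each,
# scaled to [0, 1] so reward can be read as a success likelihood.
# (Illustrative only; not a benchmark from the paper.)
R = np.array([[0.2, 0.9, 0.1],
              [0.4, 0.3, 0.8],
              [0.6, 0.1, 0.5]])

pi1 = np.full(3, 1 / 3)  # agent 1's stochastic policy over its own actions
pi2 = np.full(3, 1 / 3)  # agent 2's policy; agents never communicate

def expected_reward(pi1, pi2, R):
    """Joint expected reward under independent policies."""
    return pi1 @ R @ pi2

for _ in range(200):
    # Agent 1: reweight each action by its expected reward under
    # agent 2's current policy, then renormalize (EM-style update).
    pi1 = pi1 * (R @ pi2)
    pi1 /= pi1.sum()
    # Agent 2: the same update, holding agent 1 fixed.
    pi2 = pi2 * (R.T @ pi1)
    pi2 /= pi2.sum()

# The iteration settles on the joint action (1, 2) with reward 0.8 —
# a local optimum, not the global best entry 0.9.
print(expected_reward(pi1, pi2, R))
```

The run converging to 0.8 rather than the global optimum 0.9 illustrates a known property of EM planning: it monotonically improves expected reward but can stop at local optima, which is the price paid for updates that stay tractable per agent.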

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Computational, Information-Theoretic Learning with Statistics
Learning/Statistics & Optimisation
Theory & Algorithms
ID Code: 9073
Deposited By: Jaakko Peltonen
Deposited On: 21 February 2012