PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Algorithms for Infinitely Many-Armed Bandits
Yizao Wang, Rémi Munos and Jean-Yves Audibert
In: Neural Information Processing Systems, Vancouver (2008).


We consider multi-armed bandit problems where the number of arms is larger than the possible number of experiments. We make a stochastic assumption on the mean-reward of a newly selected arm which characterizes its probability of being a near-optimal arm. Our assumption is weaker than in previous works. We describe algorithms based on upper-confidence-bounds applied to a restricted set of randomly selected arms and provide upper-bounds on the resulting expected regret. We also derive a lower-bound which matches (up to a logarithmic factor) the upper-bound in some cases.
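
As a rough illustration of the idea of running an upper-confidence-bound strategy on a restricted, randomly selected set of arms, here is a minimal Python sketch. It is not the authors' algorithm: it swaps in the standard UCB1 index, assumes Bernoulli rewards, and leaves the subset size as a free parameter. The names ucb_on_random_subset, k_arms and draw_arm_mean are hypothetical, as is the uniform reservoir used in the usage lines.

```python
import math
import random


def ucb_on_random_subset(n_rounds, k_arms, draw_arm_mean, rng):
    """Run UCB1 on k_arms arms drawn at random from an arm reservoir.

    draw_arm_mean(rng) is an assumed stand-in for selecting a new arm:
    it returns the mean reward of a freshly sampled Bernoulli arm.
    """
    # Restrict play to a random subset of the (conceptually infinite) arm set.
    means = [draw_arm_mean(rng) for _ in range(k_arms)]
    counts = [0] * k_arms   # number of pulls per arm
    sums = [0.0] * k_arms   # cumulative reward per arm
    total_reward = 0.0

    def index(i, t):
        # Standard UCB1 index: empirical mean plus exploration bonus.
        return sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])

    for t in range(1, n_rounds + 1):
        if t <= k_arms:
            arm = t - 1  # initialisation: pull each selected arm once
        else:
            arm = max(range(k_arms), key=lambda i: index(i, t))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward

    return means, counts, total_reward


# Usage: 2000 rounds on 20 arms whose means are drawn uniformly from [0, 1].
rng = random.Random(0)
means, counts, total = ucb_on_random_subset(2000, 20, lambda r: r.random(), rng)
```

The sketch fixes the subset before play begins; how large the subset should be relative to the number of rounds is exactly the kind of question the regret bounds in the paper address, and it is not settled by this example.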

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
ID Code: 5140
Deposited By: Rémi Munos
Deposited On: 24 March 2009