PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

EPrints submitted by András Antos

Number of EPrints submitted by this user: 13

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
András Antos, Csaba Szepesvari and Rémi Munos
Machine Learning Volume 71, Number 1, pp. 89-129, 2008. ISSN 1573-0565

Fitted Q-iteration in continuous action-space MDPs
András Antos, Rémi Munos and Csaba Szepesvari
In: Advances in Neural Information Processing Systems (2008) MIT Press, Cambridge, MA, USA, pp. 9-16.

Value-iteration based fitted policy iteration: learning with a single trajectory
András Antos, Csaba Szepesvari and Rémi Munos
In: 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL), 1-5 April 2007, Honolulu, Hawaii, USA.

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
András Antos, Csaba Szepesvari and Rémi Munos
In: The Nineteenth Annual Conference on Learning Theory, COLT 2006, Proceedings Lecture Notes in Computer Science/Lecture Notes in Artificial Intelligence, 4005. (2006) Springer-Verlag, Berlin, Heidelberg, Germany, pp. 574-588. ISBN 978-3-540-35294-5

Active learning in multi-armed bandits
András Antos, Varun Grover and Csaba Szepesvari
In: 19th International Conference on Algorithmic Learning Theory, ALT 2008, Proceedings Lecture Notes in Computer Science/Lecture Notes in Artificial Intelligence, 5254. (2008) Springer-Verlag, Berlin, Heidelberg, Germany, pp. 287-302. ISBN 978-3-540-87986-2

Active learning with heteroscedastic noise
András Antos, Varun Grover and Csaba Szepesvari
Theoretical Computer Science Volume 411, Number 29-30, pp. 2712-2728, 2010. ISSN 0304-3975

On codecell convexity of optimal multiresolution scalar quantizers for continuous sources
András Antos
IEEE Transactions on Information Theory Volume 58, Number 2, pp. 1147-1157, 2012. ISSN 0018-9448

Online Markov decision processes under bandit feedback
Gergely Neu, Andras Gyorgy, Csaba Szepesvari and András Antos
In: Twenty-Fourth Annual Conference on Neural Information Processing Systems 2010, 6-9 Dec 2010, Vancouver, B.C., Canada.

Toward a classification of finite partial-monitoring games
András Antos, Gabor Bartok, Dávid Pál and Csaba Szepesvari
Theoretical Computer Science Volume 473, pp. 77-99, 2013. ISSN 0304-3975

Online Markov decision processes under bandit feedback
Gergely Neu, Andras Gyorgy, Csaba Szepesvari and András Antos
IEEE Transactions on Automatic Control 2010. ISSN 0018-9286

Minimax strategy for stratified sampling for Monte Carlo
Alexandra Carpentier, Rémi Munos and András Antos
Journal of Machine Learning Research 2012. ISSN 1533-7928

Non-trivial two-action partial-monitoring games with sublinear regrets are bandits
András Antos, Gabor Bartok and Csaba Szepesvari
(2011) arXiv.

Forced-exploration based algorithms for playing in stochastic linear bandits
Yasin Abbasi-Yadkori, András Antos and Csaba Szepesvari
In: Joint ICML, UAI, COLT workshop: On-line Learning with Limited Feedback, 19 June 2009, Montreal, QC, Canada.