EPrints submitted by András Antos
Number of EPrints submitted by this user: 13
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
András Antos, Csaba Szepesvari and Rémi Munos
Machine Learning, Volume 71, Number 1, 2008.
ISSN 1573-0565
Fitted Q-iteration in continuous action-space MDPs
András Antos, Rémi Munos and Csaba Szepesvari
In: Advances in Neural Information Processing Systems (2008). MIT Press, Cambridge, MA, USA.
Value-iteration based fitted policy iteration: learning with a single trajectory
András Antos, Csaba Szepesvari and Rémi Munos
In: 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL), 1-5 April 2007, Honolulu, Hawaii, USA.
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
András Antos, Csaba Szepesvari and Rémi Munos
In: The Nineteenth Annual Conference on Learning Theory, COLT 2006, Proceedings. Lecture Notes in Computer Science/Lecture Notes in Artificial Intelligence, 4005 (2006). Springer-Verlag, Berlin, Heidelberg, Germany.
ISBN 978-3-540-35294-5
Active learning in multi-armed bandits
András Antos, Varun Grover and Csaba Szepesvari
In: 19th International Conference on Algorithmic Learning Theory, ALT 2008, Proceedings. Lecture Notes in Computer Science/Lecture Notes in Artificial Intelligence, 5254 (2008). Springer-Verlag, Berlin, Heidelberg, Germany.
ISBN 978-3-540-87986-2
Active learning with heteroscedastic noise
András Antos, Varun Grover and Csaba Szepesvari
Theoretical Computer Science, Volume 411, Number 29-30, 2010.
ISSN 0304-3975
On codecell convexity of optimal multiresolution scalar quantizers for continuous sources
András Antos
IEEE Transactions on Information Theory, Volume 58, Number 2, 2012.
ISSN 0018-9448
Online Markov decision processes under bandit feedback
Gergely Neu, Andras Gyorgy, Csaba Szepesvari and András Antos
In: Twenty-Fourth Annual Conference on Neural Information Processing Systems 2010, 6-9 Dec 2010, Vancouver, B.C., Canada.
Toward a classification of finite partial-monitoring games
András Antos, Gabor Bartok, Dávid Pál and Csaba Szepesvari
Theoretical Computer Science, Volume 473, 2013.
ISSN 0304-3975
Online Markov decision processes under bandit feedback
Gergely Neu, Andras Gyorgy, Csaba Szepesvari and András Antos
IEEE Transactions on Automatic Control, 2010.
ISSN 0018-9286
Minimax strategy for stratified sampling for Monte Carlo
Alexandra Carpentier, Rémi Munos and András Antos
Journal of Machine Learning Research, 2012.
ISSN 1533-7928
Non-trivial two-action partial-monitoring games with sublinear regrets are bandits
András Antos, Gabor Bartok and Csaba Szepesvari
arXiv (2011).
Forced-exploration based algorithms for playing in stochastic linear bandits
Yasin Abbasi-Yadkori, András Antos and Csaba Szepesvari
In: Joint ICML, UAI, COLT workshop: On-line Learning with Limited Feedback, 19 June 2009, Montreal, QC, Canada.