|
Online Clustering of Processes
Azadeh Khaleghi, Daniil Ryabko, Jeremie Mary and Philippe Preux
Conference or Workshop Item
Item not available online.
(April 2012)
|
 |
Autonomous Exploration For Navigating In MDPs
Shiau Hong Lim and Peter Auer
Article
Item availability restricted.
(01 January 2012)
|
 |
Online solution of the average cost Kullback-Leibler optimization problem
Joris Bierkens and Bert Kappen
Conference or Workshop Item
Item availability restricted.
(16 December 2011)
|
 |
Conditional Anomaly Detection with Soft Harmonic Functions
Michal Valko, Branislav Kveton, Hamed Valizadegan, Gregory Cooper and Milos Hauskrecht
Conference or Workshop Item
(12 December 2011)
|
 |
PAC-Bayesian Analysis of Contextual Bandits
Yevgeny Seldin, Peter Auer, François Laviolette, John Shawe-Taylor and Ronald Ortner
Conference or Workshop Item
(December 2011)
|
|
Adaptive Aggregation for Reinforcement Learning in Average Reward Markov Decision Processes
Ronald Ortner
Article
Item not available online.
(September 2011)
|
|
Classification-based Policy Iteration with a Critic
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh and Bruno Scherrer
Conference or Workshop Item
Item not available online.
(2011)
|
 |
Finite time analysis of stratified sampling for Monte Carlo
Alexandra Carpentier and Rémi Munos
Article
(2011)
|
 |
Finite-sample analysis of Lasso-TD
Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos and Matthew Hoffman
Article
(2011)
|
 |
Finite-Sample Analysis of Lasso-TD
Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos and Matthew Hoffman
Conference or Workshop Item
(2011)
|
 |
Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
Odalric Maillard, Rémi Munos and Gilles Stoltz
Article
(2011)
|
 |
Multi-Bandit Best Arm Identification
Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric and Sebastien Bubeck
Conference or Workshop Item
(2011)
|
 |
Noisy Search with Comparative Feedback
Shiau Hong Lim and Peter Auer
Conference or Workshop Item
(2011)
|
|
On the relation between realizable and non-realizable cases of the sequence prediction problem
Daniil Ryabko
Article
Item not available online.
(2011)
|
 |
Optimistic optimization of a deterministic function without the knowledge of its smoothness
Rémi Munos
Article
(2011)
|
 |
Optimistic planning for sparsely stochastic systems
Lucian Busoniu, Rémi Munos, Bart De Schutter and Robert Babuska
Article
(2011)
|
|
Regularized Least Squares Temporal Difference learning with nested L2 and L1 penalization
Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh and Remi Munos
Conference or Workshop Item
Item not available online.
(2011)
|
 |
Selecting the state-representation in reinforcement learning
Odalric Maillard, Rémi Munos and Daniil Ryabko
Article
(2011)
|
 |
Sparse recovery with Brownian sensing
Alexandra Carpentier, Odalric Maillard and Rémi Munos
Article
(2011)
|
 |
Speedy Q-Learning
Mohammad Gheshlaghi Azar, Remi Munos, Mohammad Ghavamzadeh and Hilbert Kappen
Conference or Workshop Item
(2011)
|
 |
Towards the best history-dependent strategy
Odalric Maillard and Rémi Munos
Article
(2011)
|
|
Upper confidence bounds algorithms for active learning in multi-armed bandits
Alexandra Carpentier, Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos and Peter Auer
Article
Item not available online.
(2011)
|
 |
Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Remi Munos and Peter Auer
Conference or Workshop Item
(2011)
|