 |
Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments
Yevgeny Seldin, Csaba Szepesvári, Peter Auer and Yasin Abbasi-Yadkori
Article
(12 January 2013)
|
|
Dynamic Policy Programming
Mohammad Azar and Bert Kappen
Article
Item not available online.
(2013)
|
|
Minimax PAC Bounds on the Sample Complexity of Reinforcement Learning with a Generative Model
Mohammad Azar, Remi Munos and Bert Kappen
Article
Item not available online.
(2013)
|
|
Optimistic Planning for Belief-Augmented Markov Decision Processes
Raphael Fonteneau, Lucian Busoniu and Rémi Munos
Conference or Workshop Item
Item not available online.
(2013)
|
|
Stochastic Simultaneous Optimistic Optimization
Michal Valko, Alexandra Carpentier and Rémi Munos
Conference or Workshop Item
Item not available online.
(2013)
|
 |
Superevolution: Sex as Gibbs Sampling
Chris Watkins and Yvonne Buttkewitz
Conference or Workshop Item
Item availability restricted.
(2013)
|
|
Toward Optimal Stratification for Stratified Monte-Carlo Integration
Alexandra Carpentier and Rémi Munos
Conference or Workshop Item
Item not available online.
(2013)
|
 |
Symbolic Dynamic Programming for Continuous State and Observation POMDPs
Zahra Zamani and Scott Sanner
Conference or Workshop Item
(03 December 2012)
|
|
Hierarchical Optimistic Region Selection driven by Curiosity.
Odalric-Ambrym Maillard
Article
Item not available online.
(December 2012)
|
|
Online allocation and homogeneous partitioning for piecewise constant mean-approximation.
Alexandra Carpentier and Odalric-Ambrym Maillard
Article
Item not available online.
(December 2012)
|
|
PAC-Bayesian Inequalities for Martingales
Yevgeny Seldin, François Laviolette, Nicolò Cesa-Bianchi, John Shawe-Taylor and Peter Auer
Article
Item not available online.
(December 2012)
|
|
Competing with an Infinite Set of Models in Reinforcement Learning
Phuong Nguyen, Odalric-Ambrym Maillard, Daniil Ryabko and Ronald Ortner
Conference or Workshop Item
Item not available online.
(November 2012)
|
 |
Optimal regret bounds for selecting the state representation in reinforcement learning.
Odalric-Ambrym Maillard, Phuong Nguyen, Ronald Ortner and Daniil Ryabko
Article
(01 October 2012)
|
 |
Score-based Bayesian Skill Learning
Shengbo Guo, Scott Sanner, Thore Graepel and Wray Buntine
Conference or Workshop Item
(24 September 2012)
|
 |
Reassignment-Based Strategy-Proof Mechanisms for Interdependent Task Allocation
Ayman Ghoneim
Conference or Workshop Item
(03 September 2012)
|
 |
Semi-Supervised Apprenticeship Learning
Michal Valko, Mohammad Ghavamzadeh and Alessandro Lazaric
Article
(September 2012)
|
 |
Symbolic Dynamic Programming for Continuous State and Action MDPs
Zahra Zamani and Scott Sanner
Conference or Workshop Item
(22 July 2012)
|
 |
Symbolic Variable Elimination for Discrete and Continuous Graphical Models
Scott Sanner and Ehsan Abbasnejad
Conference or Workshop Item
(22 July 2012)
|
 |
A Survey of the Seventh International Planning Competition
Amanda Coles, Andrew Coles, Angel Garcia Olaya, Sergio Jimenez, Carlos Linares Lopez, Scott Sanner and Sungwook Yoon
Article
(01 June 2012)
|
 |
PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits
Yevgeny Seldin, Nicolò Cesa-Bianchi, Peter Auer, François Laviolette and John Shawe-Taylor
Article
(02 May 2012)
|
|
Online Clustering of Processes
Azadeh Khaleghi, Daniil Ryabko, Jeremie Mary and Philippe Preux
Conference or Workshop Item
Item not available online.
(April 2012)
|
 |
Autonomous Exploration For Navigating In MDPs
Shiau Hong Lim and Peter Auer
Conference or Workshop Item
Item availability restricted.
(14 February 2012)
|
|
Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
Ronald Ortner and Daniil Ryabko
Conference or Workshop Item
Item not available online.
(2012)
|
|
A Dantzig Selector Approach to Temporal Difference Learning
Matthieu Geist, Bruno Scherrer, Alessandro Lazaric and Mohammad Ghavamzadeh
Conference or Workshop Item
Item not available online.
(2012)
|
|
A Truthful Learning Mechanism for Multi-Slot Sponsored Search Auctions with Externalities (Extended Abstract)
Nicola Gatti, Alessandro Lazaric and Francesco Trovo
Conference or Workshop Item
Item not available online.
(2012)
|
|
Adaptive stratified sampling for monte-carlo integration of differentiable functions
Alexandra Carpentier and Rémi Munos
Conference or Workshop Item
Item not available online.
(2012)
|
|
Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
Victor Gabillon, Mohammad Ghavamzadeh and Alessandro Lazaric
Conference or Workshop Item
Item not available online.
(2012)
|
|
Conservative and Greedy Approaches to Classification-based Policy Iteration
Mohammad Ghavamzadeh and Alessandro Lazaric
Conference or Workshop Item
Item not available online.
(2012)
|
 |
Dynamic policy programming
Mohammad Azar and Bert Kappen
Article
(2012)
|
|
Explicit solution of relative entropy weighted control
Joris Bierkens and Bert Kappen
Article
Item not available online.
(2012)
|
|
Learning with stochastic inputs and adversarial outputs
Alessandro Lazaric and Rémi Munos
Article
Item not available online.
(2012)
|
|
Linear regression with random projections
Odalric-Ambrym Maillard and Rémi Munos
Article
Item not available online.
(2012)
|
|
Locating Changes in Highly Dependent Data with Unknown Number of Change Points
Azadeh Khaleghi and Daniil Ryabko
Conference or Workshop Item
Item not available online.
(2012)
|
|
Minimax number of strata for online stratified sampling given noisy samples
Alexandra Carpentier and Rémi Munos
Conference or Workshop Item
Item not available online.
(2012)
|
|
Minimax PAC-bounds on the sample complexity of reinforcement learning with a generative model
Mohammad Gheshlaghi Azar, Rémi Munos and Hilbert Kappen
Article
Item not available online.
(2012)
|
 |
On the Sample Complexity of Reinforcement Learning with a Generative Model
Mohammad Azar, Remi Munos and Bert Kappen
Conference or Workshop Item
(2012)
|
|
Online solution of the average cost Kullback-Leibler optimization problem
Joris Bierkens, Vicenc Gomez and Bert Kappen
Article
Item not available online.
(2012)
|
|
Reducing statistical time-series problems to binary classification
Daniil Ryabko and Jérémie Mary
Conference or Workshop Item
Item not available online.
(2012)
|
|
Regret Bounds for Restless Markov Bandits
Ronald Ortner, Daniil Ryabko, Remi Munos and Peter Auer
Conference or Workshop Item
Item not available online.
(2012)
|
|
Risk Averse Multi-Arm Bandits
Amir Sani, Alessandro Lazaric and Rémi Munos
Conference or Workshop Item
Item not available online.
(2012)
|
|
Speedy Q-Learning: A Computationally Efficient Reinforcement Learning Algorithm with a Near Optimal Rate of Convergence
Mohammad Azar, Remi Munos, M. Ghavamzadeh and Bert Kappen
Article
Item not available online.
(2012)
|
|
Testing composite hypotheses about discrete ergodic processes
Daniil Ryabko
Article
Item not available online.
(2012)
|
|
Thompson sampling: an asymptotically optimal finite time analysis
Emilie Kaufmann, Nathaniel Korda and Rémi Munos
Conference or Workshop Item
Item not available online.
(2012)
|
 |
Online solution of the average cost Kullback-Leibler optimization problem
Joris Bierkens and Bert Kappen
Conference or Workshop Item
Item availability restricted.
(16 December 2011)
|
 |
Conditional Anomaly Detection with Soft Harmonic Functions
Michal Valko, Branislav Kveton, Hamed Valizadegan, Gregory Cooper and Milos Hauskrecht
Conference or Workshop Item
(12 December 2011)
|
 |
PAC-Bayesian Analysis of Contextual Bandits
Yevgeny Seldin, Peter Auer, François Laviolette, John Shawe-Taylor and Ronald Ortner
Conference or Workshop Item
(December 2011)
|
|
Adaptive Aggregation for Reinforcement Learning in Average Reward Markov Decision Processes
Ronald Ortner
Article
Item not available online.
(September 2011)
|
|
Classification-based Policy Iteration with a Critic
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh and Bruno Scherrer
Conference or Workshop Item
Item not available online.
(2011)
|
 |
Finite time analysis of stratified sampling for monte carlo
Alexandra Carpentier and Rémi Munos
Article
(2011)
|
 |
Finite-sample analysis of Lasso-TD
Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos and Matthew Hoffman
Article
(2011)
|
 |
Finite-Sample Analysis of Lasso-TD
Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos and Matthew Hoffman
Conference or Workshop Item
(2011)
|
 |
Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
Odalric Maillard, Rémi Munos and Gilles Stoltz
Article
(2011)
|
 |
Multi-Bandit Best Arm Identification
Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric and Sebastien Bubeck
Conference or Workshop Item
(2011)
|
 |
Noisy Search with Comparative Feedback
Shiau Hong Lim and Peter Auer
Conference or Workshop Item
(2011)
|
|
On the relation between realizable and non-realizable cases of the sequence prediction problem.
Daniil Ryabko
Article
Item not available online.
(2011)
|
 |
Optimistic optimization of deterministic functions without the knowledge of its smoothness
Rémi Munos
Article
(2011)
|
 |
Optimistic planning for sparsely stochastic systems
Lucian Busoniu, Rémi Munos, Bart De Schutter and Robert Babuska
Article
(2011)
|
|
Regularized Least Squares Temporal Difference learning with nested L2 and L1 penalization
Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh and Remi Munos
Conference or Workshop Item
Item not available online.
(2011)
|
 |
Selecting the state-representation in reinforcement learning
Odalric Maillard, Rémi Munos and Daniil Ryabko
Article
(2011)
|
 |
Sparse recovery with brownian sensing
Alexandra Carpentier, Odalric Maillard and Rémi Munos
Article
(2011)
|
 |
Speedy Q-Learning
Mohammad Gheshlaghi Azar, Remi Munos, Mohammad Ghavamzadeh and Hilbert Kappen
Conference or Workshop Item
(2011)
|
 |
Towards the best history-dependent strategy
Odalric Maillard and Rémi Munos
Article
(2011)
|
|
Upper confidence bounds algorithms for active learning in multi-armed bandits
Alexandra Carpentier, Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos and Peter Auer
Article
Item not available online.
(2011)
|
 |
Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Remi Munos and Peter Auer
Conference or Workshop Item
(2011)
|