PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

EPrints submitted by Mohammad Ghavamzadeh

Number of EPrints submitted by this user: 21

Natural Actor-Critic Algorithms
Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh and Mark Lee
Automatica Volume 45, Number 11, pp. 2471-2482, 2009.

Natural Actor-Critic Algorithms
Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh and Mark Lee
(2009) Technical Report. Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada.

Hierarchical Hybrid Reinforcement Learning Algorithms
Mohammad Ghavamzadeh
In: Workshop on Bridging the Gap between High-level Discrete Representations and Low-level Continuous Behaviors, Robotics: Science and Systems Conference (RSS-2009), 28 June 2009, Seattle, WA, USA.

Robot Learning with Regularized Reinforcement Learning
Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvari and Shie Mannor
In: Regression in Robotics—Approaches and Applications, Robotics: Science and Systems Conference (RSS-2009), 28 June 2009, Seattle, WA, USA.

Bayesian Actor Critic: A Bayesian Model for Value Function Approximation and Policy Learning
Mohammad Ghavamzadeh and Yaakov Engel
In: Workshop on Regression in Robotics—Approaches and Applications, Robotics: Science and Systems Conference (RSS-2009), 28 June 2009, Seattle, WA, USA.

Regularization in Reinforcement Learning
Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvari and Shie Mannor
In: Multidisciplinary Symposium on Reinforcement Learning (MSRL-2009), 18 June 2009, Montreal, QC, Canada.

Regularized Fitted Q-iteration for Planning in Continuous-Space Markovian Decision Problems
Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvari and Shie Mannor
In: Proceedings of the 2009 American Control Conference (ACC-2009) (2009) IEEE. ISBN 142444523X

Finite-Sample Analysis of LSTD
Alessandro Lazaric, Mohammad Ghavamzadeh and Rémi Munos
In: Twenty-Seventh International Conference on Machine Learning (ICML-2010), 21-24 June 2010, Haifa, Israel.

Analysis of a Classification-based Policy Iteration Algorithm
Alessandro Lazaric, Mohammad Ghavamzadeh and Rémi Munos
In: Twenty-Seventh International Conference on Machine Learning (ICML-2010), 21-24 June 2010, Haifa, Israel.

Bayesian Multi-Task Reinforcement Learning
Alessandro Lazaric and Mohammad Ghavamzadeh
In: Twenty-Seventh International Conference on Machine Learning (ICML-2010), 21-24 June 2010, Haifa, Israel.

Finite-Sample Analysis of Bellman Residual Minimization
Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric and Mohammad Ghavamzadeh
In: Second Asian Conference on Machine Learning (ACML-2010), 8-10 Nov 2010, Tokyo, Japan.

LSTD with Random Projections
Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric-Ambrym Maillard and Rémi Munos
In: Twenty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2010), 6-9 Dec 2010, Vancouver, Canada.

Rollout Allocation Strategies for Classification-based Policy Iteration
Victor Gabillon, Alessandro Lazaric and Mohammad Ghavamzadeh
In: Workshop on Reinforcement Learning and Search in Very Large Spaces, Twenty-Seventh International Conference on Machine Learning (ICML-2010), 25 June 2010, Haifa, Israel.

LSPI with Random Projections
Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric-Ambrym Maillard and Rémi Munos
(2010) Technical Report. INRIA, France.

Finite-Sample Analysis of Least-Squares Policy Iteration
Alessandro Lazaric, Mohammad Ghavamzadeh and Rémi Munos
(2010) Technical Report. INRIA, France.

Classification-based Policy Iteration with a Critic
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh and Bruno Scherrer
In: Twenty-Eighth International Conference on Machine Learning (ICML-2011), Seattle, Washington, USA (2011).

Finite-Sample Analysis of Lasso-TD
Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos and Matthew Hoffman
In: Twenty-Eighth International Conference on Machine Learning (ICML-2011), Seattle, Washington, USA (2011).

Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos and Peter Auer
In: Twenty-Second International Conference on Algorithmic Learning Theory (ALT-2011), Espoo, Finland (2011).

Multi-Bandit Best Arm Identification
Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric and Sébastien Bubeck
In: Twenty-Fifth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2011), Granada, Spain (2011).

Speedy Q-Learning
Mohammad Gheshlaghi Azar, Rémi Munos, Mohammad Ghavamzadeh and Hilbert Kappen
In: Twenty-Fifth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2011), Granada, Spain (2011).

Regularized Least Squares Temporal Difference learning with nested L2 and L1 penalization
Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh and Rémi Munos
In: Ninth European Workshop on Reinforcement Learning (EWRL-2011), Athens, Greece (2011).