PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Adaptive play in Texas Hold’em Poker.
Raphael Maitrepierre, Jérémie Mary and Rémi Munos
In: European Conference on Artificial Intelligence (2008).

Abstract

We present a Texas Hold’em poker player for limit heads-up games. Our bot is designed to adapt automatically to the opponent’s strategy and is not based on Nash equilibrium computation. The main idea is to design a bot that builds beliefs about its opponent’s hand. A forest of game trees is generated according to those beliefs, and the solutions of the trees are combined to make the best decision. The beliefs are updated during the game according to several methods, each of which corresponds to a basic strategy. We then use an exploration-exploitation bandit algorithm, namely UCB (Upper Confidence Bound), to select a strategy to follow. This results in a global play that takes into account the opponent’s strategy and that turns out to be rather unpredictable. Indeed, if a given strategy is exploited by an opponent, the UCB algorithm will detect it using change point detection and will choose another one. The initial resulting program, called Brennus, participated in the AAAI’07 Computer Poker Competition in both the online and equilibrium competitions and ranked eighth out of seventeen competitors.
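
The strategy-selection step mentioned in the abstract can be illustrated with a minimal sketch of the standard UCB1 rule. This is not the authors' implementation. The class name `UCB1`, the helper `simulate`, the win probabilities, and the use of 0/1 rewards are all assumptions made for illustration. In particular, the change point detection described in the abstract is omitted, and real poker winnings would have to be rescaled to [0, 1] for this form of the confidence bound.

```python
import math
import random


class UCB1:
    """Standard UCB1 over a fixed set of arms (here: hypothetical strategies)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # number of times each arm was played
        self.means = [0.0] * n_arms   # running mean reward of each arm

    def select(self):
        # Play every arm once before relying on the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        # Score = empirical mean + exploration bonus that shrinks with plays.
        scores = [
            mean + math.sqrt(2.0 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean; reward assumed in [0, 1].
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(win_probs, rounds, seed=0):
    """Toy stand-in for play: strategy i wins a hand with probability win_probs[i]."""
    rng = random.Random(seed)
    bandit = UCB1(len(win_probs))
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts


if __name__ == "__main__":
    # Made-up win rates for three hypothetical basic strategies.
    print(simulate([0.1, 0.5, 0.9], rounds=5000))
```

In this toy setting the bandit concentrates its plays on the strategy with the highest win rate while still occasionally trying the others. Handling an opponent who starts exploiting the chosen strategy would require a non-stationary variant, which this sketch does not attempt.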

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 5142
Deposited By: Rémi Munos
Deposited On: 24 March 2009