Exploration exploitation in Go: UCT for Monte-Carlo Go
Sylvain Gelly and Yizao Wang
In: On-line Trading of Exploration and Exploitation, 8 December 2006, Whistler, BC, Canada.
The UCB1 algorithm for the multi-armed bandit problem has already been extended to UCT, an algorithm that works for minimax tree search. We have developed a Monte-Carlo program, MoGo, which is the first computer Go program to use UCT. We explain our modifications of UCT for the Go application, including efficient memory management, parametrization, ordering of non-visited nodes, and parallelization. MoGo is now a top-level computer Go program on the 9 × 9 Go board.
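
For reference, the UCB1 selection rule that UCT applies at each tree node can be sketched as below. This is a minimal illustration of the standard UCB1 formula (empirical mean plus an exploration bonus), not MoGo's implementation; the function name `ucb1_select`, its arguments, and the default exploration constant `c` are assumptions made for this example.

```python
import math


def ucb1_select(counts, value_sums, c=math.sqrt(2)):
    """Return the index of the arm (move) chosen by UCB1.

    counts[i]     -- number of times arm i has been played (hypothetical name)
    value_sums[i] -- sum of rewards observed for arm i (hypothetical name)
    c             -- exploration constant; sqrt(2) gives the classic UCB1 bonus
    """
    # Play every arm once before applying the formula, so that no count is zero.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    total = sum(counts)

    def score(arm):
        mean = value_sums[arm] / counts[arm]
        bonus = c * math.sqrt(math.log(total) / counts[arm])
        return mean + bonus

    # Pick the arm with the highest upper confidence bound.
    return max(range(len(counts)), key=score)
```

In a UCT-style search, a rule of this kind would be called at each node with the visit counts and accumulated simulation results of its children. The abstract's "ordering of non-visited nodes" concerns the first branch above, which in this sketch simply takes unvisited arms in index order.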