Hendrik Baier | Researchain

Publication

Featured research published by Hendrik Baier.

Computational Intelligence and Games | 2013

Monte-Carlo Tree Search and minimax hybrids

Hendrik Baier; Mark H. M. Winands

Monte-Carlo Tree Search is a sampling-based search algorithm that has been successfully applied to a variety of games. Monte-Carlo rollouts allow it to take distant consequences of moves into account, giving it a strategic advantage in many domains over traditional depth-limited minimax search with alpha-beta pruning. However, MCTS builds a highly selective tree and can therefore miss crucial moves and fall into traps in tactical situations. Full-width minimax search does not suffer from this weakness. This paper proposes MCTS-minimax hybrids that employ shallow minimax searches within the MCTS framework. The three proposed approaches use minimax in the selection/expansion phase, the rollout phase, and the backpropagation phase of MCTS. Without requiring domain knowledge in the form of evaluation functions, these hybrid algorithms are a first step at combining the strategic strength of MCTS and the tactical strength of minimax. We investigate their effectiveness in the test domains of Connect-4 and Breakthrough.
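As a rough illustration of using a shallow minimax search in the rollout phase without an evaluation function, the sketch below runs a depth-limited search that only reports proven wins and losses, and falls back to a random move otherwise. The toy game (take 1 to 3 stones, whoever takes the last stone wins) and all function names are assumptions made for this example; it is not the authors' implementation.

```python
import random


def legal_moves(stones):
    """Hypothetical toy game: a move removes 1, 2 or 3 stones."""
    return list(range(1, min(3, stones) + 1))


def proven_value(stones, depth):
    """Shallow negamax without an evaluation function.

    Returns +1 if the player to move has a proven win within `depth` plies,
    -1 for a proven loss, and 0 if nothing is proven at this depth.
    """
    if stones == 0:
        return -1  # the opponent took the last stone, so the mover has lost
    if depth == 0:
        return 0  # search horizon reached, outcome unknown
    best = -1
    for take in legal_moves(stones):
        best = max(best, -proven_value(stones - take, depth - 1))
        if best == 1:
            break  # a proven win cannot be improved on
    return best


def rollout_move(stones, depth, rng):
    """Pick a rollout move: proven win first, then unproven moves, at random."""
    moves = legal_moves(stones)
    values = {m: -proven_value(stones - m, depth - 1) for m in moves}
    wins = [m for m in moves if values[m] == 1]
    if wins:
        return wins[0]
    safe = [m for m in moves if values[m] == 0]
    return rng.choice(safe or moves)  # if every move loses, pick any


rng = random.Random(0)
print(rollout_move(3, 2, rng))  # takes all three stones and wins
print(rollout_move(5, 2, rng))  # avoids leaving 2 or 3 stones to the opponent
```

With a two-ply search, the rollout policy plays the immediate win from three stones and, from five stones, avoids the two moves that hand the opponent an immediate win.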

European Conference on Artificial Intelligence | 2012

Nested Monte-Carlo Tree Search for online planning in large MDPs

Hendrik Baier; Mark H. M. Winands

Monte-Carlo Tree Search (MCTS) is state of the art for online planning in large MDPs. It is a best-first, sample-based search algorithm in which every state in the search tree is evaluated by the average outcome of Monte-Carlo rollouts from that state. These rollouts are typically random or directed by a simple, domain-dependent heuristic. We propose Nested Monte-Carlo Tree Search (NMCTS), in which MCTS itself is recursively used to provide a rollout policy for higher-level searches. In three large-scale MDPs, SameGame, Clickomania and Bubble Breaker, we show that NMCTS is significantly more effective than regular MCTS at equal time controls, both using random and heuristic rollouts at the base level. Experiments also suggest superior performance to Nested Monte-Carlo Search (NMCS) in some domains.

IEEE Transactions on Computational Intelligence and AI in Games | 2015

MCTS-Minimax Hybrids

Hendrik Baier; Mark H. M. Winands

Monte Carlo tree search (MCTS) is a sampling-based search algorithm that is state of the art in a variety of games. In many domains, its Monte Carlo rollouts of entire games give it a strategic advantage over traditional depth-limited minimax search with αβ pruning. These rollouts can often detect long-term consequences of moves, freeing the programmer from having to capture these consequences in a heuristic evaluation function. But due to its highly selective tree, MCTS runs a higher risk than full-width minimax search of missing individual moves and falling into traps in tactical situations. This paper proposes MCTS-minimax hybrids that integrate shallow minimax searches into the MCTS framework. Three approaches are outlined, using minimax in the selection/expansion phase, the rollout phase, and the backpropagation phase of MCTS. Without assuming domain knowledge in the form of evaluation functions, these hybrid algorithms are a first step towards combining the strategic strength of MCTS and the tactical strength of minimax. We investigate their effectiveness in the test domains of Connect-4, Breakthrough, Othello, and Catch the Lion, and relate this performance to the tacticality of the domains.

Computer Games | 2014

Monte-Carlo Tree Search and minimax hybrids with heuristic evaluation functions

Hendrik Baier; Mark H. M. Winands

Monte-Carlo Tree Search (MCTS) has been found to play suboptimally in some tactical domains due to its highly selective search, focusing only on the most promising moves. In order to combine the strategic strength of MCTS and the tactical strength of minimax, MCTS-minimax hybrids have been introduced, embedding shallow minimax searches into the MCTS framework. Their results have been promising even without making use of domain knowledge such as heuristic evaluation functions. This paper continues this line of research for the case where evaluation functions are available. Three different approaches are considered, employing minimax with an evaluation function in the rollout phase of MCTS, as a replacement for the rollout phase, and as a node prior to bias move selection. The latter two approaches are newly proposed. The MCTS-minimax hybrids are tested and compared to their counterparts using evaluation functions without minimax in the domains of Othello, Breakthrough, and Catch the Lion. Results show that introducing minimax search is effective for heuristic node priors in Othello and Catch the Lion. The MCTS-minimax hybrids are also found to work well in combination with each other. For their basic implementation in this investigative study, the effective branching factor of a domain is identified as a limiting factor of the hybrids' performance.

Advances in Computer Games | 2011

Time Management for Monte-Carlo Tree Search in Go

Hendrik Baier; Mark H. M. Winands

The dominant approach for programs playing the game of Go is nowadays Monte-Carlo Tree Search (MCTS). While MCTS allows for fine-grained time control, little has been published on time management for MCTS programs under tournament conditions. This paper investigates the effects that various time-management strategies have on the playing strength in Go. We consider strategies taken from the literature as well as newly proposed and improved ones. We investigate both semi-dynamic strategies that decide about time allocation for each search before it is started, and dynamic strategies that influence the duration of each move search while it is already running. In our experiments, two domain-independent enhanced strategies, EARLY-C and CLOSE-N, are tested; each of them provides a significant improvement over the state of the art.
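The difference between the two families of strategies can be sketched as follows. Both rules are generic illustrations chosen for this example, not the EARLY-C or CLOSE-N strategies from the paper, whose details are not given in the abstract. The function names are hypothetical, and the early-stop rule assumes that the move with the most visits is the one finally played.

```python
def semi_dynamic_budget(remaining_time, expected_moves_left):
    """Semi-dynamic: fix the budget for this move before the search starts.

    Here the remaining clock time is simply spread evenly over the moves
    the game is expected to still last (an assumed, very basic rule).
    """
    return remaining_time / max(1, expected_moves_left)


def can_stop_early(visit_counts, simulations_left):
    """Dynamic: checked while the search is running.

    Stop if the runner-up could not overtake the most-visited move even if
    it received every remaining simulation.
    """
    ranked = sorted(visit_counts, reverse=True)
    if len(ranked) < 2:
        return True  # only one candidate move, nothing left to decide
    return ranked[0] - ranked[1] > simulations_left


print(semi_dynamic_budget(60.0, 30))
print(can_stop_early([500, 100, 50], 300))
print(can_stop_early([500, 300], 300))
```

The first function runs once per move; the second would be called periodically inside the search loop, so that time saved on clear-cut moves remains on the clock for later ones.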

Computational Intelligence and Games | 2012

Beam Monte-Carlo Tree Search

Hendrik Baier; Mark H. M. Winands

Monte-Carlo Tree Search (MCTS) is a state-of-the-art stochastic search algorithm that has successfully been applied to various multi- and one-player games (puzzles). Beam search is a search method that only expands a limited number of promising nodes per tree level, thus restricting the space complexity of the underlying search algorithm to linear in the tree depth. This paper presents Beam Monte-Carlo Tree Search (BMCTS), combining the ideas of MCTS and beam search. Like MCTS, BMCTS builds a search tree using Monte-Carlo simulations as state evaluations. When a predetermined number of simulations has traversed the nodes of a given tree depth, these nodes are sorted by their estimated value, and only a fixed number of them is selected for further exploration. In our experiments with the puzzles SameGame, Clickomania and Bubble Breaker, BMCTS significantly outperforms MCTS at equal time controls. We show that the improvement is equivalent to an up to four-fold increase in computing time for MCTS.
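The pruning step described in the abstract can be sketched as below: once enough simulations have passed through a tree level, its nodes are ranked by estimated value and only the best few stay open for further search. The `Node` class, the function names, and the two parameters `simulation_limit` and `beam_width` are assumptions made for this illustration, not the paper's code.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Minimal tree node with the statistics MCTS keeps per state."""
    name: str
    visits: int = 0
    total_reward: float = 0.0
    pruned: bool = False

    @property
    def value(self):
        """Average simulation result, used as the node's estimated value."""
        return self.total_reward / self.visits if self.visits else 0.0


def prune_level(nodes, beam_width):
    """Keep the `beam_width` best nodes of one tree level, mark the rest pruned."""
    ranked = sorted(nodes, key=lambda n: n.value, reverse=True)
    for node in ranked[beam_width:]:
        node.pruned = True
    return ranked[:beam_width]


def maybe_prune(nodes, simulations_through_level, simulation_limit, beam_width):
    """Prune a level only once enough simulations have traversed it."""
    if simulations_through_level < simulation_limit:
        return [n for n in nodes if not n.pruned]
    return prune_level(nodes, beam_width)


level = [Node("a", 10, 7.0), Node("b", 10, 9.0), Node("c", 10, 2.0)]
survivors = maybe_prune(level, simulations_through_level=30,
                        simulation_limit=30, beam_width=2)
print([n.name for n in survivors])
```

Because at most `beam_width` nodes per level remain open, the number of nodes that can still be expanded grows linearly with tree depth rather than exponentially.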

Journal of Artificial Intelligence Research | 2018

MCTS-Minimax Hybrids with State Evaluations

Hendrik Baier; Mark H. M. Winands

Monte-Carlo Tree Search (MCTS) has been found to show weaker play than minimax-based search in some tactical game domains. This is partly due to its highly selective search and averaging value backups, which make it susceptible to traps. In order to combine the strategic strength of MCTS and the tactical strength of minimax, MCTS-minimax hybrids have been introduced, embedding shallow minimax searches into the MCTS framework. Their results have been promising even without making use of domain knowledge such as heuristic evaluation functions. This article continues this line of research for the case where evaluation functions are available. Three different approaches are considered, employing minimax with an evaluation function in the rollout phase of MCTS, as a replacement for the rollout phase, and as a node prior to bias move selection. The latter two approaches are newly proposed. Furthermore, all three hybrids are enhanced with the help of move ordering and k-best pruning for minimax. Results show that the use of enhanced minimax for computing node priors results in the strongest MCTS-minimax hybrid investigated in the three test domains of Othello, Breakthrough, and Catch the Lion. This hybrid, called MCTS-IP-M-k, also outperforms enhanced minimax as a standalone player in Breakthrough, demonstrating that at least in this domain, MCTS and minimax can be combined to an algorithm stronger than its parts. Using enhanced minimax for computing node priors is therefore a promising new technique for integrating domain knowledge into an MCTS framework.
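A small worked example may help show why averaging backups can walk into a trap. The numbers are invented for illustration, and the example assumes the opponent's replies are sampled uniformly; an actual MCTS gradually shifts samples toward the strongest reply, so the effect shrinks with more search time.

```python
def average_backup(reply_values):
    """Value of our move as the mean over the opponent's replies."""
    return sum(reply_values) / len(reply_values)


def minimax_backup(reply_values):
    """Value of our move if the opponent picks the reply that is worst for us."""
    return min(reply_values)


# Outcomes from our point of view (1.0 = win, 0.0 = loss) for each opponent reply.
# Move "A" is a trap: three replies lose for the opponent, but one refutes us.
moves = {
    "A": [1.0, 1.0, 1.0, 0.0],
    "B": [0.5, 0.5, 0.5, 0.5],
}

best_by_average = max(moves, key=lambda m: average_backup(moves[m]))
best_by_minimax = max(moves, key=lambda m: minimax_backup(moves[m]))

print(best_by_average)  # "A": the average of 0.75 hides the single refutation
print(best_by_minimax)  # "B": a guaranteed 0.5 beats a forced loss
```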

IEEE Transactions on Computational Intelligence and AI in Games | 2016

Time Management for Monte Carlo Tree Search

Hendrik Baier; Mark H. M. Winands

Monte Carlo Tree Search (MCTS) is a popular approach for tree search in a variety of games. While MCTS allows for fine-grained time control, not much has been published on time management for MCTS programs under tournament conditions. This paper first investigates the effects of various time-management strategies on playing strength in the challenging game of Go. A number of domain-independent strategies are then tested in the domains Connect-4, Breakthrough, Othello, and Catch the Lion. We consider strategies taken from the literature as well as newly proposed and improved ones. Strategies include both semi-dynamic strategies that decide about time allocation for each search before it is started, and dynamic strategies that influence the duration of each move search while it is already running. Furthermore, we analyze the effects of time management strategies on the distribution of time over the moves of an average game, allowing us to partly explain their performance. In the experiments, the domain-independent strategy STOP provides a significant improvement over the state of the art in Go, and is the most effective time management strategy tested in all five domains.

Archive | 2011