Martin C. Smith
Queen Mary University of London
Publication
Featured research published by Martin C. Smith.
annual conference on computers | 1998
Donald F. Beal; Martin C. Smith
This paper describes first results from the application of Temporal Difference learning [1] to shogi. We report on experiments to determine whether sensible values for shogi pieces can be obtained in the same manner as for western chess pieces [2]. The learning is obtained entirely from randomised self-play, without access to any form of expert knowledge. The piece values are used in a simple search program that chooses shogi moves from a shallow lookahead, using piece values to evaluate the leaves, with a random tie-break at the top level. Temporal difference learning is used to adjust the piece values over the course of a series of games. The method is successful in learning values that perform well in matches against hand-crafted values.
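As a rough illustration of how temporal difference learning can adjust piece values over one game, the sketch below applies a standard TD(lambda) update to a linear material evaluation. It is not taken from the paper: the sigmoid squashing, the function name `td_lambda_update`, the step size `alpha`, the decay `lam`, and the feature encoding (material balance per piece type) are all assumptions chosen for the example.

```python
import math


def sigmoid(v):
    """Squash a raw evaluation into a win probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def td_lambda_update(weights, feature_seq, outcome, alpha=0.05, lam=0.9):
    """Return updated piece values after one game (hypothetical helper).

    weights     -- current value per piece type
    feature_seq -- one feature vector per position, e.g. material balance
                   (own count minus opponent count) per piece type
    outcome     -- 1.0 for a win, 0.0 for a loss, from the same point of view
    """
    n = len(weights)
    # Prediction for each position: squashed weighted material balance.
    preds = [sigmoid(sum(w * x for w, x in zip(weights, f))) for f in feature_seq]
    # Each prediction is trained towards the next one; the last towards the result.
    targets = preds[1:] + [outcome]

    trace = [0.0] * n     # decaying sum of past prediction gradients
    delta_w = [0.0] * n   # accumulated weight change for this game
    for p, target, f in zip(preds, targets, feature_seq):
        grad = [p * (1.0 - p) * x for x in f]   # gradient of sigmoid(w . x)
        trace = [lam * e + g for e, g in zip(trace, grad)]
        td_error = target - p
        delta_w = [d + alpha * td_error * e for d, e in zip(delta_w, trace)]
    return [w + d for w, d in zip(weights, delta_w)]
```

Calling `td_lambda_update([0.0, 0.0], [[1, 0], [1, 0]], 1.0)` raises the value of the first piece type, since the winning side held an extra piece of that type throughout, and leaves the second unchanged.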
Theoretical Computer Science | 2001
Donald F. Beal; Martin C. Smith
This paper describes the application of temporal difference (TD) learning to minimax searches in general, and presents results from shogi. TD learning is used to adjust the weights for evaluation features over the course of a series of games, starting from arbitrary initial values. For some games, obtaining weights accurate enough for high-performance play requires the TD learning phase to make use of minimax searches. A theoretical description of TD applied to minimax search is given, and we discuss how the theoretical characteristics of the method interact with practical considerations. These include the depth of search appropriate for successful learning and the use of self-play to enable the algorithm to be independent of human knowledge. We then report on experiments that obtained values for use in shogi-playing programs. Unlike chess, shogi has no generally agreed standardized set of values for pieces, so there is more need for machine learning. We compare our machine-learnt values, obtained without any human knowledge input, with hand-crafted values. TD learning was successful in obtaining values that performed well in matches against hand-crafted values.
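One common way to combine TD learning with minimax search, assumed here rather than drawn from the paper, is to let the prediction for a position be the evaluation of the leaf that the search settles on, so that the weight update uses that leaf's features. The sketch below runs plain minimax on a toy tree and returns both the backed-up value and the chosen leaf. The tree encoding (lists for internal nodes, tuples for leaf feature vectors) and the names `evaluate` and `minimax_leaf` are hypothetical.

```python
def evaluate(weights, features):
    """Linear evaluation: weighted sum of the leaf's features."""
    return sum(w * x for w, x in zip(weights, features))


def minimax_leaf(node, weights, maximizing=True):
    """Return (value, leaf_features) for the line minimax selects.

    A tuple is a leaf holding a feature vector; a list is an internal
    node holding its children. Sides alternate at each level.
    """
    if isinstance(node, tuple):
        return evaluate(weights, node), node
    results = [minimax_leaf(child, weights, not maximizing) for child in node]
    pick = max if maximizing else min
    # Compare on the backed-up value only; ties keep the first child found.
    return pick(results, key=lambda r: r[0])


# Toy two-ply tree: the root maximizes, the replies minimize.
weights = [1.0, 5.0]
tree = [[(1, 0), (0, 1)], [(2, 0), (3, 0)]]
value, leaf = minimax_leaf(tree, weights)
```

The returned `leaf` is the feature vector a TD update would use for the root position under this assumption, and deeper searches simply supply leaves from further down the tree.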
ICGA Journal | 1997
Donald F. Beal; Martin C. Smith
Information Sciences | 2000
Donald F. Beal; Martin C. Smith
ICGA Journal | 1995
Donald F. Beal; Martin C. Smith
International Joint Conference on Artificial Intelligence | 1999
Donald F. Beal; Martin C. Smith
ICGA Journal | 1999
Donald F. Beal; Martin C. Smith
ICGA Journal | 1994
Donald F. Beal; Martin C. Smith
ICGA Journal | 1996
Donald F. Beal; Martin C. Smith
Lecture Notes in Computer Science | 1999
Donald F. Beal; Martin C. Smith