Publication


Featured research published by Martin C. Smith.


annual conference on computers | 1998

First Results from Using Temporal Difference Learning in Shogi

Donald F. Beal; Martin C. Smith

This paper describes first results from the application of temporal difference learning [1] to shogi. We report on experiments to determine whether sensible values for shogi pieces can be obtained in the same manner as for western chess pieces [2]. The learning is obtained entirely from randomised self-play, without access to any form of expert knowledge. The piece values are used in a simple search program that chooses shogi moves from a shallow lookahead, using piece values to evaluate the leaves, with a random tie-break at the top level. Temporal difference learning is used to adjust the piece values over the course of a series of games. The method is successful in learning values that perform well in matches against hand-crafted values.
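The move-selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the function names, the dictionary representation of a position's material balance, and the example values are all hypothetical, and a one-ply lookahead stands in for the "shallow lookahead" the abstract mentions.

```python
import random

def material_score(balance, piece_values):
    """Weighted material balance: sum of value * (own count - opponent count)."""
    return sum(piece_values[piece] * diff for piece, diff in balance.items())

def choose_move(candidates, piece_values, rng=random):
    """Pick the move leading to the best material score; break ties at random.

    `candidates` maps a move label to the material balance of the position
    reached after that move (an assumed, simplified representation).
    """
    scored = {move: material_score(balance, piece_values)
              for move, balance in candidates.items()}
    best = max(scored.values())
    best_moves = [move for move, score in scored.items() if score == best]
    return rng.choice(best_moves)  # random tie-break at the top level

# Illustrative values only; these are not the learned values from the paper.
example_values = {"pawn": 1.0, "rook": 5.0}
example_candidates = {
    "capture_pawn": {"pawn": 1, "rook": 0},
    "capture_rook": {"pawn": 0, "rook": 1},
}
chosen = choose_move(example_candidates, example_values)
```

The random tie-break matters in self-play: without it, two copies of a deterministic program would repeat the same game, giving the learner no variety of positions.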


Theoretical Computer Science | 2001

Temporal difference learning applied to game playing and the results of application to shogi

Donald F. Beal; Martin C. Smith

This paper describes the application of temporal difference (TD) learning to minimax searches in general, and presents results from shogi. TD learning is used to adjust the weights for evaluation features over the course of a series of games, starting from arbitrary initial values. For some games, obtaining weights accurate enough for high-performance play will require the TD learning phase to make use of minimax searches. A theoretical description of TD applied to minimax search is given, and we discuss how the theoretical characteristics of the method interact with practical considerations. These include the depth of search appropriate for successful learning and the use of self-play to enable the algorithm to be independent of human knowledge. We then report on experiments that obtained values for use in shogi-playing programs. Unlike chess, shogi has no generally agreed standardized set of values for pieces, so there is more need for machine learning. We compare our machine-learnt values, obtained without any human knowledge input, with hand-crafted values. TD learning was successful in obtaining values that performed well in matches against hand-crafted values.
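The weight adjustment this abstract refers to can be sketched with the general TD(lambda) rule. This is an illustration under stated assumptions, not the paper's implementation: it assumes a linear evaluation squashed by a sigmoid into a win-probability prediction, and the function name, the `alpha` and `lam` defaults, and the feature representation are hypothetical. In a search-based learner the feature vectors would come from the positions returned by minimax search, a step omitted here.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def td_lambda_update(weights, feature_seq, outcome, alpha=0.1, lam=0.7):
    """Return new weights after one game, using a TD(lambda) update.

    feature_seq: one feature vector per position, in game order.
    outcome: final result in [0, 1] (1 = win for the side being evaluated).
    """
    n = len(weights)
    # Predictions for every position under the current weights.
    preds = [sigmoid(sum(w * x for w, x in zip(weights, feats)))
             for feats in feature_seq]
    # Each prediction is trained towards its successor; the last one
    # is trained towards the actual game result.
    targets = preds[1:] + [outcome]
    trace = [0.0] * n   # eligibility trace: decayed sum of past gradients
    delta = [0.0] * n   # accumulated weight change for this game
    for feats, pred, target in zip(feature_seq, preds, targets):
        grad = [pred * (1.0 - pred) * x for x in feats]  # d(pred)/d(weight)
        trace = [lam * e + g for e, g in zip(trace, grad)]
        td_error = target - pred
        for i in range(n):
            delta[i] += alpha * td_error * trace[i]
    return [w + d for w, d in zip(weights, delta)]
```

Calling `td_lambda_update` once per game, and feeding the returned weights into the next game, gives the "over the course of a series of games" behaviour the abstract describes.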


ICGA Journal | 1997

Learning Piece Values Using Temporal Differences

Donald F. Beal; Martin C. Smith


Information Sciences | 2000

Temporal difference learning for heuristic search and game playing

Donald F. Beal; Martin C. Smith


ICGA Journal | 1995

Quantification of Search-Extension Benefits

Donald F. Beal; Martin C. Smith


International Joint Conference on Artificial Intelligence | 1999

Temporal Coherence and Prediction Decay in TD Learning

Donald F. Beal; Martin C. Smith


ICGA Journal | 1999

Learning Piece-square Values using Temporal Differences

Donald F. Beal; Martin C. Smith


ICGA Journal | 1994

Random Evaluations in Chess.

Donald F. Beal; Martin C. Smith


ICGA Journal | 1996

Multiple Probes of Transposition Tables

Donald F. Beal; Martin C. Smith


Lecture Notes in Computer Science | 1999

First results from using temporal difference learning in shogi

Donald F. Beal; Martin C. Smith

Collaboration


Dive into Martin C. Smith's collaborations.

Top Co-Authors

Donald F. Beal

Queen Mary University of London
