
Publications


Featured research published by Thore Graepel.


Nature | 2016

Mastering the game of Go with deep neural networks and tree search

David Silver; Aja Huang; Chris J. Maddison; Arthur Guez; Laurent Sifre; George van den Driessche; Julian Schrittwieser; Ioannis Antonoglou; Veda Panneershelvam; Marc Lanctot; Sander Dieleman; Dominik Grewe; John Nham; Nal Kalchbrenner; Ilya Sutskever; Timothy P. Lillicrap; Madeleine Leach; Koray Kavukcuoglu; Thore Graepel; Demis Hassabis

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
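
The abstract gives no implementation detail; as a rough illustration of how a policy prior and a running value estimate can be combined inside a tree search, the sketch below implements a generic PUCT-style selection rule in Python. The Node fields and the constant c_puct are illustrative assumptions, not AlphaGo's actual code.

import math
from dataclasses import dataclass, field

@dataclass
class Node:
    prior: float                # move probability suggested by a policy network
    visits: int = 0
    value_sum: float = 0.0      # sum of value estimates backed up through this node
    children: dict = field(default_factory=dict)

def select_move(node, c_puct=1.5):
    # Choose the child maximizing Q + U: Q is the mean backed-up value, U is an
    # exploration bonus proportional to the prior and decaying with visit count.
    total = sum(c.visits for c in node.children.values())
    def score(c):
        q = c.value_sum / c.visits if c.visits else 0.0
        return q + c_puct * c.prior * math.sqrt(total + 1) / (1 + c.visits)
    return max(node.children, key=lambda move: score(node.children[move]))

# Toy tree with two candidate moves; the higher-prior, unvisited move is chosen first.
root = Node(prior=1.0, children={"D4": Node(prior=0.6), "Q16": Node(prior=0.4)})
print(select_move(root))        # -> D4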


Proceedings of the National Academy of Sciences of the United States of America | 2013

Private traits and attributes are predictable from digital records of human behavior

Michal Kosinski; David Stillwell; Thore Graepel

We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. The proposed model uses dimensionality reduction for preprocessing the Likes data, which are then entered into logistic/linear regression to predict individual psychodemographic profiles from Likes. The model correctly discriminates between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and between Democrat and Republican in 85% of cases. For the personality trait “Openness,” prediction accuracy is close to the test–retest accuracy of a standard personality test. We give examples of associations between attributes and Likes and discuss implications for online personalization and privacy.
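
As a rough sketch of the preprocessing-plus-regression pipeline the abstract describes, the Python snippet below reduces a sparse user-by-Like matrix with truncated SVD and feeds the components into logistic regression. The matrix sizes, the component count, and the use of scikit-learn are illustrative assumptions, not the study's actual setup.

import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for the user-by-Like matrix (rows: users, columns: Likes, 1 = Liked).
likes = sparse_random(500, 2000, density=0.01, format="csr",
                      random_state=0, data_rvs=lambda n: np.ones(n))
labels = np.random.default_rng(0).integers(0, 2, size=500)  # a binary attribute to predict

# Dimensionality reduction of the Likes data followed by logistic regression.
model = make_pipeline(TruncatedSVD(n_components=50, random_state=0),
                      LogisticRegression(max_iter=1000))
model.fit(likes, labels)
print(model.predict(likes[:5]))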


Nature | 2017

Mastering the game of Go without human knowledge

David Silver; Julian Schrittwieser; Karen Simonyan; Ioannis Antonoglou; Aja Huang; Arthur Guez; Thomas Hubert; Lucas Baker; Matthew Lai; Adrian Bolton; Yutian Chen; Timothy P. Lillicrap; Fan Hui; Laurent Sifre; George van den Driessche; Thore Graepel; Demis Hassabis

A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo’s own move selections and also the winner of AlphaGo’s games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
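
The abstract states that a single network is trained to predict both AlphaGo's move selections and the winner of its games. One plausible way to combine these targets into a joint training objective is sketched below; the exact form of the loss, the regularization constant c, and the function name are assumptions, not quoted from the paper.

import numpy as np

def policy_value_loss(p, v, pi, z, theta, c=1e-4):
    # Squared error on the predicted winner, cross-entropy between the network's
    # move probabilities p and the search-derived probabilities pi, plus L2
    # regularization of the parameters theta. A sketch, not the paper's loss.
    value_term = (z - v) ** 2
    policy_term = -np.sum(pi * np.log(p + 1e-12))
    return value_term + policy_term + c * np.sum(theta ** 2)

# Toy example with three legal moves.
p = np.array([0.5, 0.3, 0.2]); pi = np.array([0.7, 0.2, 0.1])
print(policy_value_loss(p, v=0.1, pi=pi, z=1.0, theta=np.ones(4)))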


Journal of Machine Learning Research | 2001

Bayes point machines

Ralf Herbrich; Thore Graepel; Colin Campbell

Kernel-classifiers comprise a powerful class of non-linear decision functions for binary classification. The support vector machine is an example of a learning algorithm for kernel classifiers that singles out the consistent classifier with the largest margin, i.e. minimal real-valued output on the training sample, within the set of consistent hypotheses, the so-called version space. We suggest the Bayes point machine as a well-founded improvement which approximates the Bayes-optimal decision by the centre of mass of version space. We present two algorithms to stochastically approximate the centre of mass of version space: a billiard sampling algorithm and a sampling algorithm based on the well known perceptron algorithm. It is shown how both algorithms can be extended to allow for soft-boundaries in order to admit training errors. Experimentally, we find that - for the zero training error case - Bayes point machines consistently outperform support vector machines on both surrogate data and real-world benchmark data sets. In the soft-boundary/soft-margin case, the improvement over support vector machines is shown to be reduced. Finally, we demonstrate that the real-valued output of single Bayes points on novel test points is a valid confidence measure and leads to a steady decrease in generalisation error when used as a rejection criterion.
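
To make the "centre of mass of version space" idea concrete, here is a deliberately naive Python illustration for a linear kernel and separable data: it rejection-samples random unit weight vectors, keeps those consistent with every training label, and averages them. The paper's billiard and perceptron-based samplers are far more efficient; everything below is an assumption made for illustration.

import numpy as np

def bayes_point(X, y, n_samples=20000, seed=0):
    # Draw random directions on the unit sphere, keep those that classify all
    # training points correctly (the version space), and return their mean
    # direction as an approximation to the Bayes point.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((n_samples, X.shape[1]))
    w /= np.linalg.norm(w, axis=1, keepdims=True)
    consistent = np.all((X @ w.T) * y[:, None] > 0, axis=0)
    centre = w[consistent].mean(axis=0)
    return centre / np.linalg.norm(centre)

# Tiny separable toy problem.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
print(np.sign(X @ bayes_point(X, y)))   # reproduces the training labels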


Neurocomputing | 1998

Self-organizing maps: Generalizations and new optimization techniques

Thore Graepel; Matthias Burger; Klaus Obermayer

We offer three algorithms for the generation of topographic mappings to the practitioner of unsupervised data analysis. The algorithms are each based on the minimization of a cost function which is performed using an EM algorithm and deterministic annealing. The soft topographic vector quantization algorithm (STVQ) – like the original self-organizing map (SOM) – provides a tool for the creation of self-organizing maps of Euclidean data. Its optimization scheme, however, offers an alternative to the heuristic stepwise shrinking of the neighborhood width in the SOM and makes it possible to use a fixed neighborhood function solely to encode desired neighborhood relations between nodes. The kernel-based soft topographic mapping (STMK) is a generalization of STVQ and introduces new distance measures in data space, based on kernel functions. Using the new distance measures corresponds to performing the STVQ in a high-dimensional feature space, which is related to data space by a nonlinear mapping. This preprocessing can reveal structure in the data which may go unnoticed if the STVQ is performed in the standard Euclidean space. The soft topographic mapping for proximity data (STMP) is another generalization of STVQ that enables the user to generate topographic maps for data which are given in terms of pairwise proximities. It thus offers a flexible alternative to multidimensional scaling methods and opens up a new range of applications for SOMs. Both STMK and STMP share the robust optimization properties of STVQ due to the application of deterministic annealing. In our contribution we discuss the algorithms together with their implementation and provide detailed pseudo-code and explanations.
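
As a minimal illustration of the STVQ idea, soft neighborhood-coupled assignments optimized while an inverse temperature is annealed, here is a Python sketch for a one-dimensional chain of nodes. The node layout, neighborhood function, and annealing schedule are illustrative assumptions, not the pseudo-code given in the paper.

import numpy as np

def stvq(X, n_nodes=10, sigma=1.0, n_steps=30, seed=0):
    # Soft topographic vector quantization on a chain of nodes: assignment
    # probabilities P(s|x) come from a neighborhood-coupled distortion, and the
    # codebook is updated as neighborhood-weighted means while beta increases.
    rng = np.random.default_rng(seed)
    nodes = np.arange(n_nodes)
    H = np.exp(-0.5 * (nodes[:, None] - nodes[None, :]) ** 2 / sigma ** 2)
    H /= H.sum(axis=1, keepdims=True)                      # neighborhood function h(s, r)
    W = X[rng.choice(len(X), n_nodes, replace=False)]      # initial codebook
    for beta in np.geomspace(0.1, 10.0, n_steps):          # deterministic annealing
        D = ((X[:, None, :] - W[None, :, :]) ** 2).sum(-1)   # squared distances to codebook
        E = 0.5 * D @ H.T                                   # coupled distortion e_s(x)
        P = np.exp(-beta * (E - E.min(axis=1, keepdims=True)))
        P /= P.sum(axis=1, keepdims=True)                   # soft assignments P(s | x)
        G = P @ H                                           # responsibility of node r for x
        W = (G.T @ X) / G.sum(axis=0)[:, None]              # weighted-mean codebook update
    return W

X = np.random.default_rng(1).normal(size=(200, 2))
print(stvq(X).shape)    # (10, 2): codebook vectors ordered along the chain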


International Conference on Information Security and Cryptology | 2012

ML confidential: machine learning on encrypted data

Thore Graepel; Kristin E. Lauter; Michael Naehrig

We demonstrate that, by using a recently proposed leveled homomorphic encryption scheme, it is possible to delegate the execution of a machine learning algorithm to a computing service while retaining confidentiality of the training and test data. Since the computational complexity of the homomorphic encryption scheme depends primarily on the number of levels of multiplications to be carried out on the encrypted data, we define a new class of machine learning algorithms in which the algorithm's predictions, viewed as functions of the input data, can be expressed as polynomials of bounded degree. We propose confidential algorithms for binary classification based on polynomial approximations to least-squares solutions obtained by a small number of gradient descent steps. We present experimental validation of the confidential machine learning pipeline and discuss the trade-offs regarding computational complexity, prediction accuracy and cryptographic security.
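
The structural point, that a classifier obtained from a fixed, small number of gradient-descent steps on a squared loss has predictions expressible as low-degree polynomials of the data, can be illustrated in the clear, without any encryption library. The step count, learning rate, and data below are assumptions for illustration only.

import numpy as np

def few_step_linear_classifier(X, y, steps=3, lr=0.1):
    # With a fixed number of gradient steps, the learned weights (and therefore
    # the predictions) are polynomials of bounded degree in the training and test
    # data, which bounds the multiplicative depth a leveled scheme must support.
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(X)   # gradient of 0.5 * mean squared error
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]))
w = few_step_linear_classifier(X, y)
print(np.mean(np.sign(X @ w) == y))         # training accuracy of the sketch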


Neural Computation | 1999

A stochastic self-organizing map for proximity data

Thore Graepel; Klaus Obermayer

We derive an efficient algorithm for topographic mapping of proximity data (TMP), which can be seen as an extension of Kohonen's self-organizing map to arbitrary distance measures. The TMP cost function is derived in a Bayesian framework of folded Markov chains for the description of autoencoders. It incorporates the data by a dissimilarity matrix and the topographic neighborhood by a matrix of transition probabilities. From the principle of maximum entropy, a nonfactorizing Gibbs distribution is obtained, which is approximated in a mean-field fashion. This allows for maximum likelihood estimation using an expectation-maximization algorithm. In analogy to the transition from topographic vector quantization to the self-organizing map, we suggest an approximation to TMP that is computationally more efficient. In order to prevent convergence to local minima, an annealing scheme in the temperature parameter is introduced, for which the critical temperature of the first phase transition is calculated in terms of the dissimilarity matrix and the matrix of transition probabilities. Numerical results demonstrate the working of the algorithm and confirm the analytical results. Finally, the algorithm is used to generate a connection map of areas of the cat's cerebral cortex.


Machine Learning | 2014

Manifestations of user personality in website choice and behaviour on online social networks

Michal Kosinski; Pushmeet Kohli; David Stillwell; Thore Graepel

Individual differences in personality affect users’ online activities as much as they do in the offline world. This work, based on a sample of over a third of a million users, examines how users’ behaviour in the online environment, captured by their website choices and Facebook profile features, relates to their personality, as measured by the standard Five Factor Model personality questionnaire. Results show that there are psychologically meaningful links between users’ personalities, their website preferences and Facebook profile features. We show how website audiences differ in terms of their personality, present the relationships between personality and Facebook profile features, and show how an individual’s personality can be predicted from Facebook profile features. We conclude that predicting a user’s personality profile can be applied to personalize content, optimize search results, and improve online advertising.


International Conference on Machine Learning | 2006

Bayesian pattern ranking for move prediction in the game of Go

David H. Stern; Ralf Herbrich; Thore Graepel

We investigate the problem of learning to predict moves in the board game of Go from game records of expert players. In particular, we obtain a probability distribution over legal moves for professional play in a given position. This distribution has numerous applications in computer Go, including serving as an efficient stand-alone Go player. It would also be effective as a move selector and move sorter for game tree search and as a training tool for Go players. Our method has two major components: a) a pattern extraction scheme for efficiently harvesting patterns of given size and shape from expert game records and b) a Bayesian learning algorithm (in two variants) that learns a distribution over the values of a move given a board position based on the local pattern context. The system is trained on 181,000 expert games and shows excellent prediction performance as indicated by its ability to perfectly predict the moves made by professional Go players in 34% of test positions.
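
As a simplified stand-in for the paper's Bayesian ranking model, the Python sketch below learns one score per local pattern by maximum likelihood, so that a softmax over the legal moves' pattern scores concentrates on the expert's move. Representing each legal move by a single pattern id and fitting by gradient ascent are illustrative assumptions, not the method of the paper.

import numpy as np

def fit_pattern_scores(positions, n_patterns, epochs=50, lr=0.5):
    # `positions` is a list of (pattern ids of the legal moves, index of the
    # expert move). Gradient ascent on the log-likelihood of the expert move
    # under a softmax over the legal moves' pattern scores.
    theta = np.zeros(n_patterns)
    for _ in range(epochs):
        for pattern_ids, expert_idx in positions:
            scores = theta[pattern_ids]
            p = np.exp(scores - scores.max())
            p /= p.sum()
            grad = -p
            grad[expert_idx] += 1.0                 # indicator minus softmax
            np.add.at(theta, pattern_ids, lr * grad)
    return theta

# Toy data: in both positions the expert plays the move carrying pattern 2.
positions = [(np.array([0, 1, 2]), 2), (np.array([2, 3]), 0)]
print(fit_pattern_scores(positions, n_patterns=4).argmax())   # -> 2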


International Symposium on Neural Networks | 2000

Gaussian process regression: active data selection and test point rejection

Sambu Seo; Marko Wallat; Thore Graepel; Klaus Obermayer

We consider active data selection and test point rejection strategies for Gaussian process regression based on the variance of the posterior over target values. Gaussian process regression is viewed as transductive regression that provides target distributions for given points rather than selecting an explicit regression function. Since not only the posterior mean but also the posterior variance are easily calculated, we use this additional information to two ends: active data selection is performed by either querying at points of high estimated posterior variance or at points that minimize the estimated posterior variance averaged over the input distribution of interest or (in a transductive manner) averaged over the test set. Test point rejection is performed using the estimated posterior variance as a confidence measure. We find that, for both a two-dimensional toy problem and a real-world benchmark problem, the variance is a reasonable criterion for both active data selection and test point rejection.
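
A minimal Python sketch of the variance criterion, assuming a zero-mean Gaussian process with an RBF kernel and a fixed noise level (all illustrative choices): compute the posterior variance at candidate inputs and query where it is largest.

import numpy as np

def rbf(A, B, length_scale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior_variance(X_train, X_cand, noise=1e-2, length_scale=1.0):
    # Posterior variance of a zero-mean GP at the candidate points; the training
    # targets are not needed, since the variance depends only on the inputs.
    K = rbf(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = rbf(X_cand, X_train, length_scale)
    solve = np.linalg.solve(K, K_s.T)
    return rbf(X_cand, X_cand, length_scale).diagonal() - np.einsum("ij,ji->i", K_s, solve)

# Active data selection: query the candidate with the largest posterior variance.
rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(10, 1))
X_cand = np.linspace(-3, 3, 200)[:, None]
var = gp_posterior_variance(X_train, X_cand)
print(X_cand[var.argmax()])   # next input to label under the variance criterion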

Collaboration


Dive into Thore Graepel's collaborations.

Top Co-Authors

Karl Tuyls, University of Liverpool
Joel Z. Leibo, Massachusetts Institute of Technology