Publication


Featured research published by Paul Lu.


Science | 2007

Checkers Is Solved

Jonathan Schaeffer; Neil Burch; Yngvi Björnsson; Akihiro Kishimoto; Martin Müller; Robert Lake; Paul Lu; Steve Sutphen

The game of checkers has roughly 500 billion billion possible positions (5 × 10^20). The task of solving the game, determining the final result in a game with no mistakes made by either player, is daunting. Since 1989, almost continuously, dozens of computers have been working on solving checkers, applying state-of-the-art artificial intelligence techniques to the proving process. This paper announces that checkers is now solved: Perfect play by both sides leads to a draw. This is the most challenging popular game to be solved to date, roughly one million times as complex as Connect Four. Artificial intelligence technology has been used to generate strong heuristic-based game-playing programs, such as Deep Blue for chess. Solving a game takes this to the next level by replacing the heuristics with perfection.
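
To make "solved" concrete: a solved game has an exact game-theoretic value established by exhaustive analysis rather than a heuristic estimate. Below is a minimal Python sketch that solves a toy subtraction game by memoized search; it illustrates only the concept, not the paper's actual proof procedure, which combined endgame databases with forward proof-tree search.

# Solve a toy game exactly: on each turn a player removes 1-3 stones,
# and a player with no move loses. "Solving" means computing the true
# win/loss value of every position, with no heuristic involved.
from functools import lru_cache

@lru_cache(maxsize=None)
def value(stones):
    """+1 if the side to move wins with perfect play, -1 if it loses."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    if not moves:
        return -1  # no legal move: the side to move loses
    # Perfect play: choose the move that leaves the opponent worst off.
    return max(-value(stones - take) for take in moves)

if __name__ == "__main__":
    for n in range(9):
        print(n, "win" if value(n) == 1 else "loss")  # multiples of 4 lose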


Nucleic Acids Research | 2005

BASys: a web server for automated bacterial genome annotation

Gary H. Van Domselaar; Paul Stothard; Savita Shrivastava; Joseph A. Cruz; Anchi Guo; Xiaoli Dong; Paul Lu; Duane Szafron; Russell Greiner; David S. Wishart

BASys (Bacterial Annotation System) is a web server that supports automated, in-depth annotation of bacterial genomic (chromosomal and plasmid) sequences. It accepts raw DNA sequence data and an optional list of gene identification information and provides extensive textual annotation and hyperlinked image output. BASys uses >30 programs to determine ∼60 annotation subfields for each gene, including gene/protein name, GO function, COG function, possible paralogues and orthologues, molecular weight, isoelectric point, operon structure, subcellular localization, signal peptides, transmembrane regions, secondary structure, 3D structure, reactions and pathways. The depth and detail of a BASys annotation matches or exceeds that found in a standard SwissProt entry. BASys also generates colorful, clickable and fully zoomable maps of each query chromosome to permit rapid navigation and detailed visual analysis of all resulting gene annotations. The textual annotations and images that are provided by BASys can be generated in ∼24 h for an average bacterial chromosome (5 Mb). BASys annotations may be viewed and downloaded anonymously or through a password protected access system. The BASys server and databases can also be downloaded and run locally. BASys is accessible at .
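
A toy sketch of the fan-out structure such a pipeline implies may help: each gene is passed through many independent annotators whose outputs are merged into per-gene subfields. The annotator functions below are invented stand-ins, not BASys's actual >30 programs.

# Hypothetical annotators; each returns the subfields it is responsible for.
def molecular_weight(seq):
    return {"molecular_weight_da": 110 * len(seq)}  # crude average-residue estimate

def subcellular_localization(seq):
    return {"subcellular_localization": "cytoplasm"}  # placeholder prediction

ANNOTATORS = [molecular_weight, subcellular_localization]

def annotate(genes):
    """genes: dict of gene name -> protein sequence."""
    report = {}
    for name, seq in genes.items():
        fields = {}
        for annotator in ANNOTATORS:
            fields.update(annotator(seq))  # each tool fills in its own subfields
        report[name] = fields
    return report

if __name__ == "__main__":
    print(annotate({"geneA": "MKT", "geneB": "MKTAYIAK"}))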


Artificial Intelligence | 1992

A world championship caliber checkers program

Jonathan Schaeffer; Joseph C. Culberson; Norman Treloar; Brent Knight; Paul Lu; Duane Szafron

The checkers program Chinook has won the right to play a 40-game match for the World Checkers Championship against Dr. Marion Tinsley. This was earned by placing second, after Dr. Tinsley, at the 1990 U.S. National Open, the biennial event used to determine a challenger for the Championship. This is the first time a program has earned the right to contest a human World Championship. In an exhibition match played in December 1990, Tinsley narrowly defeated Chinook 7.5-6.5. This paper describes the program, the research problems encountered, and our solutions. Many of the techniques used for computer chess are directly applicable to computer checkers. However, the problems of building a world championship caliber program force us to address some issues that have, to date, been largely ignored by the computer chess community.
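
The shared core alluded to in the last two sentences is heuristic game-tree search. As a reference point, here is a textbook negamax alpha-beta sketch in Python; the move-generation and evaluation hooks are placeholders, and Chinook's actual search added many enhancements beyond this skeleton.

# Textbook alpha-beta (negamax form). `moves`, `apply_move`, and
# `evaluate` are caller-supplied hooks; `evaluate` scores a position
# from the perspective of the side to move.
def alphabeta(pos, depth, alpha, beta, moves, apply_move, evaluate):
    legal = moves(pos)
    if depth == 0 or not legal:
        return evaluate(pos)
    for m in legal:
        score = -alphabeta(apply_move(pos, m), depth - 1,
                           -beta, -alpha, moves, apply_move, evaluate)
        if score >= beta:
            return beta  # cutoff: the opponent will avoid this line
        alpha = max(alpha, score)
    return alpha

if __name__ == "__main__":
    # Toy demo: positions are integers, a move subtracts 1 or 2, and the
    # (arbitrary) evaluation favors even numbers for the side to move.
    print(alphabeta(5, 3, float("-inf"), float("inf"),
                    lambda p: [1, 2] if p > 0 else [],
                    lambda p, m: p - m,
                    lambda p: 1 if p % 2 == 0 else -1))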


International Conference on Data Mining | 2001

Fast parallel association rule mining without candidacy generation

Osmar R. Zaïane; Mohammad El-Hajj; Paul Lu

In this paper we introduce MLFPT (multiple local frequent pattern tree), a new parallel algorithm for mining frequent patterns based on FP-growth. It uses only two full I/O scans of the database, eliminates the need for generating candidate items, and distributes the work fairly among processors. We have devised partitioning strategies at different stages of the mining process to achieve near-optimal balancing between processors. We have successfully tested our algorithm on datasets larger than 50 million transactions.
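
A highly simplified sketch of the two-scan skeleton may clarify the claim: the first scan counts item supports globally; the second partitions transactions among workers, each of which builds a local structure over its share. Plain counters stand in for the local frequent pattern trees here, and the final merge step is illustrative rather than the paper's implementation.

from collections import Counter

def mine(transactions, min_support, num_workers=2):
    # Scan 1: one full pass over the database to find frequent items.
    support = Counter(item for t in transactions for item in t)
    frequent = {i for i, c in support.items() if c >= min_support}

    # Scan 2: distribute transactions among workers; each filters out
    # infrequent items, orders the rest by descending global support,
    # and inserts the result into its local structure.
    by_support = lambda item: -support[item]
    local = [Counter() for _ in range(num_workers)]
    for idx, t in enumerate(transactions):
        filtered = tuple(sorted((i for i in t if i in frequent), key=by_support))
        if filtered:
            local[idx % num_workers][filtered] += 1  # round-robin partitioning

    # Real MLFPT mines each local tree in parallel; merging raw counts
    # here simply shows that no candidate generation step was needed.
    merged = Counter()
    for lc in local:
        merged.update(lc)
    return frequent, merged

if __name__ == "__main__":
    db = [("a", "b", "c"), ("b", "c"), ("a", "c"), ("a", "d")]
    print(mine(db, min_support=2))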


AI Magazine | 1996

CHINOOK: The World Man-Machine Checkers Champion

Jonathan Schaeffer; Robert Lake; Paul Lu; Martin Bryant

In 1992, the seemingly unbeatable World Checkers Champion Marion Tinsley defended his title against the computer program CHINOOK. After an intense, tightly contested match, Tinsley fought back from behind to win the match by scoring four wins to CHINOOK's two, with 33 draws. This match was the first time in history that a human world champion defended his title against a computer. This article reports on the progress of the checkers (8 × 8 draughts) program CHINOOK since 1992. Two years of research and development on the program culminated in a rematch with Tinsley in August 1994. In that match, after six games (all draws), Tinsley withdrew and relinquished the world championship title to CHINOOK, citing health concerns. CHINOOK has since defended its title in two subsequent matches. It is the first time in history that a computer has won a human world championship.


Parallel Computing | 1993

On the versatility of parallel sorting by regular sampling

Xiaobo Li; Paul Lu; Jonathan Schaeffer; John Shillington; Pok Sze Wong; Hanmao Shi

Parallel sorting algorithms have already been proposed for a variety of multiple instruction stream, multiple data stream (MIMD) architectures. These algorithms often exploit the strengths of the particular machine to achieve high performance. In many cases, however, the existing algorithms cannot achieve comparable performance on other architectures. Parallel Sorting by Regular Sampling (PSRS) is an algorithm that is suitable for a diverse range of MIMD architectures. It has good load balancing properties, modest communication needs and good memory locality of reference. If there are no duplicate keys, PSRS guarantees to balance the work among the processors within a factor of two of optimal in theory, regardless of the data value distribution, and within a few percent of optimal in practice. This paper presents new theoretical and empirical results for PSRS. The theoretical analysis of PSRS is extended to include a lower bound and a tighter upper bound on the work done by a processor. The effect of duplicate keys is analyzed and shown, in practice, not to be a concern. In addition, the issues of oversampling and undersampling the data are introduced and analyzed. Empirically, PSRS has been implemented on four diverse MIMD architectures and a network of workstations. On all of the machines, for both random and application-generated data sets, the algorithm achieves good results. PSRS is not necessarily the best parallel sorting algorithm for any specific machine, but it will achieve good performance on a wide spectrum of machines before any strengths of the architecture are exploited.
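
The four phases of PSRS are easy to state, and a short sequential simulation shows them end to end (a list of sorted blocks stands in for the p processors; real implementations run the per-processor steps concurrently):

from bisect import bisect_right
from heapq import merge

def psrs(data, p):
    n = len(data)
    # Phase 1: each "processor" sorts its block and takes p regular samples.
    blocks = [sorted(data[i * n // p:(i + 1) * n // p]) for i in range(p)]
    samples = [b[len(b) * k // p] for b in blocks for k in range(p) if b]
    # Phase 2: sort the p*p samples and choose p-1 pivots, again by
    # regular sampling.
    samples.sort()
    pivots = [samples[len(samples) * k // p] for k in range(1, p)]
    # Phase 3: each processor splits its sorted block at the pivots.
    def split(b):
        cuts = [0] + [bisect_right(b, piv) for piv in pivots] + [len(b)]
        return [b[cuts[i]:cuts[i + 1]] for i in range(p)]
    parts = [split(b) for b in blocks]
    # Phase 4: processor i merges the i-th partition from every block.
    return [x for i in range(p)
            for x in merge(*(parts[j][i] for j in range(p)))]

if __name__ == "__main__":
    import random
    data = [random.randrange(1000) for _ in range(40)]
    assert psrs(data, p=4) == sorted(data)
    print("sorted correctly")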


Advances in Computer Games | 1993

Solving Large Retrograde Analysis Problems Using a Network of Workstations

Robert Lake; Jonathan Schaeffer; Paul Lu

Chess endgame databases, while of considerable theoretical interest, have yet to make a significant impact in tournament chess. In the game of checkers, however, endgame databases have played a pivotal role in the success of our World Championship challenger program Chinook. Consequently, we are interested in building databases consisting of hundreds of billions of positions. Since database positions arise frequently in Chinook's search trees, the databases must be accessible in real time, unlike in chess. This paper discusses techniques for building large endgame databases using a network of workstations, and how this data can be organized for use in a real-time search. Although checkers is used to illustrate many of the ideas, the techniques and tools developed are also applicable to chess.
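
The core idea, retrograde analysis, works backward: terminal positions are labeled first, and values propagate to predecessors until everything reachable is classified. Below is a minimal sketch over an explicit position graph, which stands in for enumerating checkers positions by piece configuration.

from collections import deque

def retrograde(successors):
    """successors: dict mapping each position to its successor positions."""
    preds = {p: [] for p in successors}
    for p, succs in successors.items():
        for s in succs:
            preds[s].append(p)
    value = {}  # position -> "loss" or "win" for the side to move
    unresolved = {p: len(s) for p, s in successors.items()}
    queue = deque(p for p, s in successors.items() if not s)
    for p in queue:
        value[p] = "loss"  # no legal move: the side to move loses
    while queue:
        p = queue.popleft()
        for q in preds[p]:
            if q in value:
                continue
            if value[p] == "loss":
                value[q] = "win"  # q can move to a position lost for the opponent
                queue.append(q)
            else:
                unresolved[q] -= 1
                if unresolved[q] == 0:
                    value[q] = "loss"  # every move from q reaches a won position
                    queue.append(q)
    return value  # positions never labeled are draws

if __name__ == "__main__":
    graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(retrograde(graph))  # {3: 'loss', 1: 'win', 2: 'win', 0: 'loss'}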


Job Scheduling Strategies for Parallel Processing | 2002

Practical Heterogeneous Placeholder Scheduling in Overlay Metacomputers: Early Experiences

Christopher Pinchak; Paul Lu; Mark Goldenberg

A practical problem faced by users of high-performance computers is: How can I automatically load balance my jobs across different batch queues, which are in different administrative domains, if there is no existing grid infrastructure? It is common to have user accounts for a number of individual high-performance systems (e.g., departmental, university, regional) that are administered by different groups. Without an administration-deployed grid infrastructure, one can still create a purely user-level aggregation of individual computing systems. The Trellis Project is developing the techniques and tools to take advantage of a user-level overlay metacomputer. Because placeholder scheduling does not require superuser permissions to set up or configure, it is well suited to overlay metacomputers. This paper contributes to the practical side of grid and metacomputing by empirically demonstrating that placeholder scheduling can work across different administrative domains, across different local schedulers (i.e., PBS and Sun Grid Engine), and across different programming models (i.e., Pthreads, MPI, and sequential). We also describe a new metaqueue system to manage jobs with explicit workflow dependencies.
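
In outline, a placeholder is an ordinary batch job that fetches its real work only once it starts running. Here is a bare-bones Python sketch under assumed conventions; the metaqueue host, the pop command, and resubmission via PBS's qsub are all hypothetical placeholders, not the Trellis implementation.

import subprocess

METAQUEUE_HOST = "metaq.example.org"      # hypothetical central host
POP_COMMAND = "metaqueue-pop --user plu"  # hypothetical "next job" command

def run_placeholder():
    # Ask the user-level metaqueue for the next command to run. An empty
    # reply means no work is pending, so the placeholder simply exits.
    reply = subprocess.run(
        ["ssh", METAQUEUE_HOST, POP_COMMAND],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    if not reply:
        return
    # Execute the fetched command inside this local batch allocation.
    subprocess.run(reply, shell=True, check=True)
    # Resubmit this script so the queue keeps draining; no superuser
    # permissions are needed at any point.
    subprocess.run(["qsub", __file__], check=False)

if __name__ == "__main__":
    run_placeholder()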


Bioinformatics | 2008

Improving subcellular localization prediction using text classification and the gene ontology

Alona Fyshe; Yifeng Liu; Duane Szafron; Russell Greiner; Paul Lu

MOTIVATION: Each protein performs its functions within specific locations in a cell. This subcellular location is important for understanding protein function and for facilitating its purification. There are now many computational techniques for predicting location based on sequence analysis and database information from homologs. A few recent techniques use text from biological abstracts: our goal is to improve the prediction accuracy of such text-based techniques. We identify three techniques for improving text-based prediction: a rule for removing ambiguous abstracts, a mechanism for using synonyms from the Gene Ontology (GO), and a mechanism for using the GO hierarchy to generalize terms. We show that these three techniques can significantly improve the accuracy of protein subcellular location predictors that use text extracted from PubMed abstracts whose references are recorded in Swiss-Prot.
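
The hierarchy-generalization idea can be shown in a few lines: each term found in an abstract is expanded with its ancestors, so specific terms also contribute evidence for their more general parents. The tiny is-a map below is illustrative, not the real Gene Ontology.

GO_PARENTS = {  # hypothetical fragment of the GO is-a hierarchy
    "mitochondrial matrix": ["mitochondrion"],
    "mitochondrion": ["cytoplasm"],
    "cytoplasm": [],
}

def generalize(term, parents=GO_PARENTS):
    """Return the term together with all of its ancestors."""
    out, seen, stack = [], set(), [term]
    while stack:
        t = stack.pop()
        if t in seen or t not in parents:
            continue
        seen.add(t)
        out.append(t)
        stack.extend(parents[t])
    return out

if __name__ == "__main__":
    # One mention yields three features, letting the classifier share
    # evidence across levels of specificity.
    print(generalize("mitochondrial matrix"))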


Advances in Computer Games | 2004

Building the Checkers 10-Piece Endgame Databases

Jonathan Schaeffer; Yngvi Björnsson; Neil Burch; Robert Lake; Paul Lu; Steve Sutphen

In 1993, the Chinook team completed the computation of the 2- through 8-piece checkers endgame databases, consisting of roughly 444 billion positions. Until recently, nobody had attempted to extend this work. In November 2001, we began an effort to compute the 9- and 10-piece databases. By June 2003, the entire 9-piece database and the 5-piece-versus-5-piece portion of the 10-piece database were completed. The result is a 13-trillion-position database, compressed into 148 GB of data organized for real-time decompression. This represents the largest endgame database initiative yet attempted. The results obtained from these computations are being used to aid an attempt to weakly solve the game. This paper describes our experiences building large endgame databases.
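
Real-time access to a compressed database of this size implies a block-oriented layout: values are compressed in fixed-size blocks behind an index, so a lookup decompresses one block rather than the whole file. A minimal sketch of that layout, with zlib standing in for the custom compression scheme:

import zlib

BLOCK = 4096  # positions per block (illustrative size)

def build(values):
    """values: bytes, one win/loss/draw code per position."""
    return [zlib.compress(values[i:i + BLOCK])
            for i in range(0, len(values), BLOCK)]

def lookup(blocks, pos):
    # Decompress only the single block that holds this position.
    data = zlib.decompress(blocks[pos // BLOCK])
    return data[pos % BLOCK]

if __name__ == "__main__":
    vals = bytes((i * 7) % 3 for i in range(10_000))  # fake W/L/D codes
    db = build(vals)
    assert lookup(db, 8191) == vals[8191]
    print("lookup ok")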

Collaboration


Dive into Paul Lu's collaborations.

Top Co-Authors

Alona Fyshe

Carnegie Mellon University
