Tom Barclay
Microsoft
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Tom Barclay.
international conference on management of data | 1994
Chris Nyberg; Tom Barclay; Zarka Cvetanovic; Jim Gray; David B. Lomet
A new sort algorithm, called AlphaSort, demonstrates that commodity processors and disks can handle commercial batch workloads. Using Alpha AXP processors, commodity memory, and arrays of SCSI disks, AlphaSort runs the industry-standard sort benchmark in seven seconds. This beats the best published record on a 32-cpu 32-disk Hypercube by 8:1. On another benchmark, AlphaSort sorted more than a gigabyte in a minute. AlphaSort is a cache-sensitive memory-intensive sort algorithm. It uses file striping to get high disk bandwidth. It uses QuickSort to generate runs and uses replacement-selection to merge the runs. It uses shared memory multiprocessors to break the sort into subsort chores. Because startup times are becoming a significant part of the total time, we propose two new benchmarks: (1) Minutesort: how much can you sort in a minute, and (2) DollarSort: how much can you sort for a dollar.
international conference on management of data | 2000
Tom Barclay; Jim Gray; Donald R. Slutz
Microsoft® TerraServer stores aerial, satellite, and topographic images of the earth in a SQL database available via the Internet. It is the worlds largest online atlas, combining eight terabytes of image data from the United States Geological Survey (USGS) and SPIN-2. Internet browsers provide intuitive spatial and text interfaces to the data. Users need no special hardware, software, or knowledge to locate and browse imagery. This paper describes how terabytes of “Internet unfriendly” geo-spatial images were scrubbed and edited into hundreds of millions of “Internet friendly” image tiles and loaded into a SQL data warehouse. All meta-data and imagery are stored in the SQL database. TerraServer demonstrates that general-purpose relational database technology can manage large scale image repositories, and shows that web browsers can be a good geo-spatial image presentation system.
very large data bases | 1995
Chris Nyberg; Tom Barclay; Zarka Cvetanovic; Jim Gray; David B. Lomet
A new sort algorithm, called AlphaSort, demonstrates that commodity processors and disks can handle commercial batch workloads. Using commodity processors, memory, and arrays of SCSI disks, AlphaSort runs the industrystandard sort benchmark in seven seconds. This beats the best published record on a 32-CPU 32-disk Hypercube by 8:1. On another benchmark, AlphaSort sorted more than a gigabyte in one minute. AlphaSort is a cache-sensitive, memoryintensive sort algorithm. We argue that modern architectures require algorithm designers to re-examine their use of the memory hierarchy. AlphaSort uses clustered data structures to get good cache locality, file striping to get high disk bandwidth, QuickSort to generate runs, and replacement-selection to merge the runs. It uses shared memory multiprocessors to break the sort into subsort chores. Because startup times are becoming a significant part of the total time, we propose two new benchmarks: (1) MinuteSort: how much can you sort in one minute, and (2) PennySort: how much can you sort for one penny.
international conference on management of data | 1994
Tom Barclay; Robert Barnes; Jim Gray; Prakash Sundaresan
This paper describes a parallel database load prototype for Digitals Rdb database product. The prototype takes a dataflow approach to database parallelism. It includes an explorer that discovers and records the cluster configuration in a database, a client CUI interface that gathers the load job description from the user and from the Rdb catalogs, and an optimizer that picks the best parallel execution plan and records it in a web data structure. The web describes the data operators, the dataflow rivers among them, the binding of operators to processes, processes to processors, and files to discs and tapes. This paper describes the optimizers cost-based hierarchical optimization strategy in some detail. The prototype executes the webs plan by spawning a web manager process at each node of the cluster. The managers create the local executor processes, and orchestrate startup, phasing, checkpoint, and shutdown. The execution processes perform one or more operators. Data flows among the operators are via memory-to-memory streams within a node, and via web-manager multiplexed tcp/ip streams among nodes. The design of the transaction and checkpoint/restart mechanisms are also described. Preliminary measurements indicate that this design will give excellent scaleups.
international conference on management of data | 2000
Tom Barclay; Donald R. Slutz; Jim Gray
arXiv: Digital Libraries | 2002
Tom Barclay; Jim Gray; Eric Strand; Steve Ekblad; Jeffrey Richter
arXiv: Networking and Internet Architecture | 2002
Jim Gray; Wyman Chong; Tom Barclay; Alexander S. Szalay; Jan Vandenberg
Archive | 2002
Tom Barclay; Jim Gray; Elizabeth B. Strand; Steve Ekblad; Jeffrey Richter
Archive | 1998
Tom Barclay
IEEE Internet Computing | 2006
Tom Barclay; Jim Gray; Steve Ekblad; Eric Strand; Jeffrey Richter