Is this you? Create Your Porfile

Stephen Tu

University of California, Berkeley

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Stephen Tu is active.

Explore More

Publication

Featured researches published by Stephen Tu.

symposium on operating systems principles | 2013

Speedy transactions in multicore in-memory databases

Stephen Tu; Wenting Zheng; Eddie Kohler; Barbara Liskov; Samuel Madden

Silo is a new in-memory database that achieves excellent performance and scalability on modern multicore machines. Silo was designed from the ground up to use system memory and caches efficiently. For instance, it avoids all centralized contention points, including that of centralized transaction ID assignment. Silos key contribution is a commit protocol based on optimistic concurrency control that provides serializability while avoiding all shared-memory writes for records that were only read. Though this might seem to complicate the enforcement of a serial order, correct logging and recovery is provided by linking periodically-updated epochs with the commit protocol. Silo provides the same guarantees as any serializable database without unnecessary scalability bottlenecks or much additional latency. Silo achieves almost 700,000 transactions per second on a standard TPC-C workload mix on a 32-core machine, as well as near-linear scalability. Considered per core, this is several times higher than previously reported results.

very large data bases | 2013

Processing analytical queries over encrypted data

Stephen Tu; M. Frans Kaashoek; Samuel Madden; Nickolai Zeldovich

MONOMI is a system for securely executing analytical workloads over sensitive data on an untrusted database server. MONOMI works by encrypting the entire database and running queries over the encrypted data. MONOMI introduces split client/server query execution, which can execute arbitrarily complex queries over encrypted data, as well as several techniques that improve performance for such workloads, including per-row precomputation, space-efficient encryption, grouped homomorphic addition, and pre-filtering. Since these optimizations are good for some queries but not others, MONOMI introduces a designer for choosing an efficient physical design at the server for a given workload, and a planner to choose an efficient execution plan for a given query at runtime. A prototype of MONOMI running on top of Postgres can execute most of the queries from the TPC-H benchmark with a median overhead of only 1.24× (ranging from 1.03×to 2.33×) compared to an un-encrypted Postgres database where a compromised server would reveal all data.

very large data bases | 2013

Anti-caching: a new approach to database management system architecture

Justin DeBrabant; Andrew Pavlo; Stephen Tu; Michael Stonebraker; Stanley B. Zdonik

The traditional wisdom for building disk-based relational database management systems (DBMS) is to organize data in heavily-encoded blocks stored on disk, with a main memory block cache. In order to improve performance given high disk latency, these systems use a multi-threaded architecture with dynamic record-level locking that allows multiple transactions to access the database at the same time. Previous research has shown that this results in substantial overhead for on-line transaction processing (OLTP) applications [15]. The next generation DBMSs seek to overcome these limitations with architecture based on main memory resident data. To overcome the restriction that all data fit in main memory, we propose a new technique, called anti-caching, where cold data is moved to disk in a transactionally-safe manner as the database grows in size. Because data initially resides in memory, an anti-caching architecture reverses the traditional storage hierarchy of disk-based systems. Main memory is now the primary storage device. We implemented a prototype of our anti-caching proposal in a high-performance, main memory OLTP DBMS and performed a series of experiments across a range of database sizes, workload skews, and read/write mixes. We compared its performance with an open-source, disk-based DBMS optionally fronted by a distributed main memory cache. Our results show that for higher skewed workloads the anti-caching architecture has a performance advantage over either of the other architectures tested of up to 9× for a data size 8× larger than memory.

conference on object-oriented programming systems, languages, and applications | 2012

The HipHop compiler for PHP

Haiping Zhao; Iain Andrew Russell Proctor; Minghui Yang; Xin Qi; Mark Williams; Qi Gao; Guilherme de Lima Ottoni; Andrew John Paroski; Scott MacVicar; Jason Owen Evans; Stephen Tu

Scripting languages are widely used to quickly accomplish a variety of tasks because of the high productivity they enable. Among other reasons, this increased productivity results from a combination of extensive libraries, fast development cycle, dynamic typing, and polymorphism. The dynamic features of scripting languages are traditionally associated with interpreters, which is the approach used to implement most scripting languages. Although easy to implement, interpreters are generally slow, which makes scripting languages prohibitive for implementing large, CPU-intensive applications. This efficiency problem is particularly important for PHP given that it is the most commonly used language for server-side web development. This paper presents the design, implementation, and an evaluation of the HipHop compiler for PHP. HipHop goes against the standard practice and implements a very dynamic language through static compilation. After describing the most challenging PHP features to support through static compilation, this paper presents HipHops design and techniques that support almost all PHP features. We then present a thorough evaluation of HipHop running both standard benchmarks and the Facebook web site. Overall, our experiments demonstrate that HipHop is about 5.5x faster than standard, interpreted PHP engines. As a result, HipHop has reduced the number of servers needed to run Facebook and other web sites by a factor between 4 and 6, thus drastically cutting operating costs.

network and distributed system security symposium | 2015