Tongping Liu

University of Texas at San Antonio

Publication

Featured research published by Tongping Liu.

conference on object-oriented programming systems, languages, and applications | 2009

Grace: safe multithreaded programming for C/C++

Emery D. Berger; Ting Yang; Tongping Liu; Gene Novark

The shift from single to multiple core architectures means that programmers must write concurrent, multithreaded programs in order to increase application performance. Unfortunately, multithreaded applications are susceptible to numerous errors, including deadlocks, race conditions, atomicity violations, and order violations. These errors are notoriously difficult for programmers to debug. This paper presents Grace, a software-only runtime system that eliminates concurrency errors for a class of multithreaded programs: those based on fork-join parallelism. By turning threads into processes, leveraging virtual memory protection, and imposing a sequential commit protocol, Grace provides programmers with the appearance of deterministic, sequential execution, while taking advantage of available processing cores to run code concurrently and efficiently. Experimental results demonstrate Grace's effectiveness: with modest code changes across a suite of computationally-intensive benchmarks (1-16 lines), Grace can achieve high scalability and performance while preventing concurrency errors.
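The sequential commit idea in the abstract above can be illustrated with a small Python model. This is a sketch under stated assumptions, not Grace's implementation: shared memory is a plain dict, each "thread" is a function that runs against a snapshot and returns its writes, and the conflict check that re-runs a task whose reads went stale is an assumption of this sketch. The names `TrackingView`, `run_fork_join`, `increment_x`, and `scale_x_into_y` are hypothetical.

```python
class TrackingView:
    """Read-only view of memory that records which keys a task read."""

    def __init__(self, data: dict):
        self._data = data
        self.reads: set = set()

    def __getitem__(self, key):
        self.reads.add(key)
        return self._data[key]


def run_fork_join(shared: dict, tasks: list) -> dict:
    """Run tasks speculatively on a snapshot, then commit in spawn order."""
    snapshot = dict(shared)
    speculative = []
    for task in tasks:  # conceptually concurrent, each in isolation
        view = TrackingView(snapshot)
        speculative.append((task, view, task(view)))
    for task, view, writes in speculative:  # sequential commit phase
        # Assumption of this sketch: a task whose reads were overwritten
        # by an earlier commit is re-executed on the committed state.
        stale = any(shared.get(k) != snapshot.get(k) for k in view.reads)
        if stale:
            writes = task(TrackingView(shared))
        shared.update(writes)
    return shared


def increment_x(mem):
    return {"x": mem["x"] + 1}


def scale_x_into_y(mem):
    return {"y": mem["x"] * 10}


final = run_fork_join({"x": 1, "y": 0}, [increment_x, scale_x_into_y])
```

In this model the final state equals what running the two tasks one after the other would produce, which is the "appearance of sequential execution" the abstract describes.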

symposium on operating systems principles | 2011

Dthreads: efficient deterministic multithreading

Tongping Liu; Charlie Curtsinger; Emery D. Berger

Multithreaded programming is notoriously difficult to get right. A key problem is non-determinism, which complicates debugging, testing, and reproducing errors. One way to simplify multithreaded programming is to enforce deterministic execution, but current deterministic systems for C/C++ are incomplete or impractical. These systems require program modification, do not ensure determinism in the presence of data races, do not work with general-purpose multithreaded programs, or run up to 8.4× slower than pthreads. This paper presents Dthreads, an efficient deterministic multithreading system for unmodified C/C++ applications that replaces the pthreads library. Dthreads enforces determinism in the face of data races and deadlocks. Dthreads works by exploding multithreaded applications into multiple processes, with private, copy-on-write mappings to shared memory. It uses standard virtual memory protection to track writes, and deterministically orders updates by each thread. By separating updates from different threads, Dthreads has the additional benefit of eliminating false sharing. Experimental results show that Dthreads substantially outperforms a state-of-the-art deterministic runtime system, and for a majority of the benchmarks evaluated here, matches and occasionally exceeds the performance of pthreads.
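The isolate-then-commit scheme described above can be modeled in a few lines of Python. This is a minimal sketch, not the Dthreads code: real systems work on memory pages and process boundaries, while here a page is a small `bytearray`, each thread's private mapping is a copy, and keeping an unmodified "twin" copy to compute a byte-level diff is an assumption of this sketch. `PAGE_SIZE`, `diff_against_twin`, `commit_in_order`, and `run_epoch` are hypothetical names.

```python
PAGE_SIZE = 16  # tiny page so the example is easy to follow (assumption)


def diff_against_twin(twin: bytes, working: bytes) -> dict:
    """Return {offset: new_byte} for every byte the thread changed."""
    return {i: b for i, (a, b) in enumerate(zip(twin, working)) if a != b}


def commit_in_order(shared: bytearray, thread_pages: list) -> None:
    """Apply each thread's diff to shared memory in a fixed thread order."""
    for twin, working in thread_pages:
        for offset, value in diff_against_twin(twin, working).items():
            shared[offset] = value


def run_epoch(shared: bytearray, writes_per_thread: list) -> bytearray:
    pages = []
    for writes in writes_per_thread:
        twin = bytes(shared)       # snapshot taken at the start of the epoch
        working = bytearray(twin)  # private copy that the thread writes to
        for offset, value in writes.items():
            working[offset] = value
        pages.append((twin, bytes(working)))
    commit_in_order(shared, pages)
    return shared


# Thread 0 writes bytes 0 and 3; thread 1 writes bytes 3 and 8.
# Both write byte 3, which would be a data race under ordinary threads.
result = run_epoch(bytearray(PAGE_SIZE), [{0: 1, 3: 7}, {3: 9, 8: 5}])
```

Because diffs are applied in a fixed order, the racy byte always ends up with the value from the last thread in that order, regardless of how the threads were scheduled.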

conference on object-oriented programming systems, languages, and applications | 2011

SHERIFF: precise detection and automatic mitigation of false sharing

Tongping Liu; Emery D. Berger

False sharing is an insidious problem for multithreaded programs running on multicore processors, where it can silently degrade performance and scalability. Previous tools for detecting false sharing are severely limited: they cannot distinguish false sharing from true sharing, have high false positive rates, and provide limited assistance to help programmers locate and resolve false sharing. This paper presents two tools that attack the problem of false sharing: Sheriff-Detect and Sheriff-Protect. Both tools leverage a framework we introduce here called Sheriff. Sheriff breaks out threads into separate processes, and exposes an API that allows programs to perform per-thread memory isolation and tracking on a per-page basis. We believe Sheriff is of independent interest. Sheriff-Detect finds instances of false sharing by comparing updates within the same cache lines by different threads, and uses sampling to rank them by performance impact. Sheriff-Detect is precise (no false positives), runs with low overhead (on average, 20%), and is accurate, pinpointing the exact objects involved in false sharing. We present a case study demonstrating Sheriff-Detect's effectiveness at locating false sharing in a variety of benchmarks. Rewriting a program to fix false sharing can be infeasible when source is unavailable, or undesirable when padding objects would unacceptably increase memory consumption or further worsen runtime performance. Sheriff-Protect mitigates false sharing by adaptively isolating shared updates from different threads into separate physical addresses, effectively eliminating most of the performance impact of false sharing. We show that Sheriff-Protect can improve performance for programs with catastrophic false sharing by up to 9×, without programmer intervention.
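The distinction between false and true sharing that the abstract relies on can be shown with a short Python sketch. It is illustrative only and does not reproduce Sheriff-Detect: it assumes a 64-byte cache line, takes an explicit list of `(thread_id, address)` write records, and labels a line "true" as soon as any single address has more than one writer. `CACHE_LINE` and `classify_lines` are hypothetical names.

```python
from collections import defaultdict

CACHE_LINE = 64  # bytes; a common line size, assumed for this sketch


def classify_lines(writes):
    """writes: iterable of (thread_id, address).

    Returns {cache_line_index: "false" | "true"} for lines written
    by at least two threads.
    """
    by_line = defaultdict(lambda: defaultdict(set))  # line -> addr -> threads
    for tid, addr in writes:
        by_line[addr // CACHE_LINE][addr].add(tid)

    report = {}
    for line, addrs in by_line.items():
        writers = set().union(*addrs.values())
        if len(writers) < 2:
            continue  # a single writer cannot cause sharing of either kind
        truly_shared = any(len(threads) > 1 for threads in addrs.values())
        report[line] = "true" if truly_shared else "false"
    return report


writes = [(0, 0), (1, 8), (0, 128), (1, 128), (0, 256)]
report = classify_lines(writes)
```

Here line 0 is written by two threads at different addresses (false sharing), line 2 has one address written by both threads (true sharing), and line 4 has a single writer and is not reported.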

acm sigplan symposium on principles and practice of parallel programming | 2014

PREDATOR: predictive false sharing detection

Tongping Liu; Chen Tian; Ziang Hu; Emery D. Berger

False sharing is a notorious problem for multithreaded applications that can drastically degrade both performance and scalability. Existing approaches can precisely identify the sources of false sharing, but only report false sharing actually observed during execution; they do not generalize across executions. Because false sharing is extremely sensitive to object layout, these detectors can easily miss false sharing problems that can arise due to slight differences in memory allocation order or object placement decisions by the compiler. In addition, they cannot predict the impact of false sharing on hardware with different cache line sizes. This paper presents PREDATOR, a predictive software-based false sharing detector. PREDATOR generalizes from a single execution to precisely predict false sharing that is latent in the current execution. PREDATOR tracks accesses within a range that could lead to false sharing given different object placement. It also tracks accesses within virtual cache lines, contiguous memory ranges that span actual hardware cache lines, to predict sharing on hardware platforms with larger cache line sizes. For each, it reports the exact program location of predicted false sharing problems, ranked by their projected impact on performance. We evaluate PREDATOR across a range of benchmarks and actual applications. PREDATOR identifies problems undetectable with previous tools, including two previously-unknown false sharing problems, with no false positives. PREDATOR is able to immediately locate false sharing problems in MySQL and the Boost library that had eluded detection for years.
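The prediction idea above reduces to a simple question: would two addresses that sit in different cache lines in this run land in the same line under a different placement or a larger line size? The Python sketch below answers that for a pair of addresses. It is a simplified model, not PREDATOR's algorithm; the 64-byte default, the doubling of the line size, and the names `same_line` and `predict` are assumptions made for illustration.

```python
def same_line(addr_a: int, addr_b: int, line_size: int, shift: int = 0) -> bool:
    """True if both addresses fall in one cache line when the data is
    placed `shift` bytes later than it was in the observed run."""
    return (addr_a + shift) // line_size == (addr_b + shift) // line_size


def predict(addr_a: int, addr_b: int, line_size: int = 64) -> dict:
    return {
        # What a detector limited to the current execution would see.
        "observed": same_line(addr_a, addr_b, line_size),
        # A virtual cache line spanning two hardware lines.
        "larger_line": same_line(addr_a, addr_b, 2 * line_size),
        # Any alternative starting offset within one line.
        "other_placement": any(
            same_line(addr_a, addr_b, line_size, shift)
            for shift in range(line_size)
        ),
    }


# Thread 0 writes address 56 and thread 1 writes address 72.
near = predict(56, 72)
far = predict(0, 200)
```

The pair (56, 72) straddles a line boundary in the observed run, so no false sharing is seen, yet both alternative scenarios put the two addresses in the same line. The pair (0, 200) is too far apart for any of the three checks to fire.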

international conference on software engineering | 2016

DoubleTake: fast and precise error detection via evidence-based dynamic analysis

Tongping Liu; Charlie Curtsinger; Emery D. Berger

Programs written in unsafe languages like C and C++ often suffer from errors like buffer overflows, dangling pointers, and memory leaks. Dynamic analysis tools like Valgrind can detect these errors, but their overhead, primarily due to the cost of instrumenting every memory read and write, makes them too heavyweight for use in deployed applications and makes testing with them painfully slow. The result is that much deployed software remains susceptible to these bugs, which are notoriously difficult to track down. This paper presents evidence-based dynamic analysis, an approach that enables these analyses while imposing minimal overhead (under 5%), making it practical for the first time to perform these analyses in deployed settings. The key insight of evidence-based dynamic analysis is that for a class of errors, it is possible to ensure that evidence that they happened at some point in the past remains for later detection. Evidence-based dynamic analysis allows execution to proceed at nearly full speed until the end of an epoch (e.g., a heavyweight system call). It then examines program state to check for evidence that an error occurred at some time during that epoch. If so, it rolls back execution and re-executes the code with instrumentation activated to pinpoint the error. We present DoubleTake, a prototype evidence-based dynamic analysis framework. DoubleTake is practical and easy to deploy, requiring neither custom hardware, compiler, nor operating system support. We demonstrate DoubleTake’s generality and efficiency by building dynamic analyses that find buffer overflows, memory use-after-free errors, and memory leaks. Our evaluation shows that DoubleTake is efficient, imposing under 5% overhead on average, making it the fastest such system to date. It is also precise: DoubleTake pinpoints the location of these errors to the exact line and memory addresses where they occur, providing valuable debugging information to programmers.
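The run-fast, check-later, replay-on-evidence loop described above can be sketched in Python for the buffer overflow case. This is a toy model under explicit assumptions, not DoubleTake itself: the heap is a `bytearray`, the "evidence" is a sentinel byte (a canary) placed after each object, and the program is a list of labeled writes. `CANARY`, `Heap`, `run_epoch`, and `double_take` are hypothetical names.

```python
CANARY = 0x7E  # sentinel byte value chosen for this sketch (assumption)


class Heap:
    def __init__(self, size: int):
        self.mem = bytearray(size)
        self.canaries = []  # offsets of the canary bytes
        self.top = 0

    def alloc(self, n: int) -> int:
        base = self.top
        self.mem[base + n] = CANARY  # one canary byte right after the object
        self.canaries.append(base + n)
        self.top = base + n + 1
        return base

    def corrupted(self) -> list:
        return [c for c in self.canaries if self.mem[c] != CANARY]


def run_epoch(heap: Heap, program: list, instrumented: bool = False):
    """program: list of (label, offset, value) writes."""
    for label, offset, value in program:
        heap.mem[offset] = value
        if instrumented and heap.corrupted():
            return label  # slow path: evidence is checked after every write
    return None


def double_take(program: list):
    heap = Heap(32)
    heap.alloc(4)  # object A at offsets 0..3, canary at 4
    heap.alloc(4)  # object B at offsets 5..8, canary at 9
    snapshot = bytes(heap.mem)  # state saved at the start of the epoch
    run_epoch(heap, program)    # fast path: no per-write checks
    if not heap.corrupted():
        return None             # no evidence of an overflow in this epoch
    heap.mem[:] = snapshot      # roll back to the epoch start
    return run_epoch(heap, program, instrumented=True)


buggy = [("line 10", 0, 1), ("line 11", 4, 2), ("line 12", 5, 3)]
clean = [("line 10", 0, 1), ("line 12", 5, 3)]
culprit = double_take(buggy)
```

The fast path pays only for one canary scan per epoch; the per-write check runs only in the replay, which is why the common case stays cheap. A write that happens to store the canary value itself would go unnoticed in this toy model.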

european conference on computer systems | 2017

SyncPerf: Categorizing, Detecting, and Diagnosing Synchronization Performance Bugs

Mohammad Mejbah ul Alam; Tongping Liu; Guangming Zeng; Abdullah Muzahid

Despite their obvious importance, performance issues related to synchronization primitives still lack adequate attention. No literature extensively investigates the categories, root causes, and fixing strategies of such performance issues. Existing work primarily focuses on one type of problem while ignoring other important categories. Moreover, it leaves the burden of identifying root causes to programmers. This paper first conducts an extensive study of categories, root causes, and fixing strategies of performance issues related to explicit synchronization primitives. Based on this study, we develop two tools to identify root causes of a range of performance issues. Compared with existing work, our proposal, SyncPerf, has three unique advantages. First, SyncPerf's detection is very lightweight, with 2.3% performance overhead on average. Second, SyncPerf integrates information based on callsites, lock variables, and types of threads. Such integration helps identify more latent problems. Last but not least, when multiple root causes generate the same behavior, SyncPerf provides a second analysis tool that collects detailed accesses inside critical sections and helps identify possible root causes. SyncPerf discovers many unknown but significant synchronization performance issues. Fixing them provides a performance gain anywhere from 2.5% to 42%. Low overhead, better coverage, and informative reports make SyncPerf an effective tool to find synchronization performance bugs in the production environment.

symposium on code generation and optimization | 2016

Cheetah: detecting false sharing efficiently and effectively

Tongping Liu; Xu Liu

False sharing is a notorious performance problem that may occur in multithreaded programs running on ubiquitous multicore hardware. It can degrade performance by up to an order of magnitude, significantly hurting scalability. Identifying false sharing in complex programs is challenging. Existing tools either incur significant performance overhead or do not provide adequate information to guide code optimization. To address these problems, we develop Cheetah, a profiler that detects false sharing both efficiently and effectively. Cheetah leverages the lightweight hardware performance monitoring units (PMUs) that are available in most modern CPU architectures to sample memory accesses. Cheetah develops the first approach to quantify the optimization potential of false sharing instances without actual fixes, based on the latency information collected by PMUs. Cheetah precisely reports false sharing and provides insightful optimization guidance for programmers, while adding less than 7% runtime overhead on average. Cheetah is ready for real deployment.

technical symposium on computer science education | 2018

A User Space-based Project for Practicing Core Memory Management Concepts

Sam Silvestro; Timothy T. Yuen; Corey Crosser; Dakai Zhu; Turgay Korkmaz; Tongping Liu

This paper presents the design and evaluation of a novel project designed to facilitate the learning of memory management concepts and interactions between different components. This project removes the complexity of a full or specific operating system by implementing memory management inside the user space. Evaluation results show that the mean exam scores improved by about 29% to 34%. On average, the total code size is less than 300 lines and time spent working on this project is under 17 hours. Therefore, this project is beneficial in helping students learn memory management while maintaining a reasonable project workload.

computer and communications security | 2017

FreeGuard: A Faster Secure Heap Allocator

Sam Silvestro; Hongyu Liu; Corey Crosser; Zhiqiang Lin; Tongping Liu

In spite of years of improvements to software security, heap-related attacks still remain a severe threat. One reason is that many existing memory allocators fall short in a variety of aspects. For instance, performance-oriented allocators are designed with very limited countermeasures against attacks, but secure allocators generally suffer from significant performance overhead, e.g., running up to 10x slower. This paper, therefore, introduces FreeGuard, a secure memory allocator that prevents or reduces a wide range of heap-related security attacks, such as heap overflows, heap over-reads, use-after-frees, as well as double and invalid frees. FreeGuard has similar performance to the default Linux allocator, with less than 2% overhead on average, but provides significant improvement to security guarantees.

operating systems design and implementation | 2008