Publication


Featured research published by Kaushik Veeraraghavan.


International Conference on Mobile Systems, Applications, and Services (MobiSys) | 2008

Virtualized in-cloud security services for mobile devices

Jon Oberheide; Kaushik Veeraraghavan; Evan Cooke; Jason Flinn; Farnam Jahanian

Modern mobile devices continue to approach the capabilities and extensibility of standard desktop PCs. Unfortunately, these devices are also beginning to face many of the same security threats as desktops. Currently, mobile security solutions mirror the traditional desktop model in which they run detection services on the device. This approach is complex and resource intensive in both computation and power. This paper proposes a new model whereby mobile antivirus functionality is moved to an off-device network service employing multiple virtualized malware detection engines. Our argument is that it is possible to spend bandwidth resources to significantly reduce on-device CPU, memory, and power resources. We demonstrate how our in-cloud model enhances mobile security and reduces on-device software complexity, while allowing for new services such as platform-specific behavioral analysis engines. Our benchmarks on Nokia's N800 and N95 mobile devices show that our mobile agent consumes an order of magnitude less CPU and memory while also consuming less power in common scenarios compared to existing on-device antivirus software.
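
To make the division of labor concrete, here is a minimal sketch of the in-cloud model the abstract describes: a thin on-device agent ships content off-device, and the network service fans out to several detection engines and caches verdicts by content hash. The engines, the cache, and all names are illustrative assumptions, not code from the paper.

# Illustrative sketch of the in-cloud detection model described above.
# The engines and checks are invented stand-ins, not the paper's code.
import hashlib

def engine_a(data):
    """Stand-in signature engine: flags a known byte pattern."""
    return b"EICAR" in data

def engine_b(data):
    """Stand-in heuristic engine: flags an executable-looking header."""
    return data[:2] == b"MZ"  # DOS/PE magic bytes

ENGINES = [engine_a, engine_b]
CACHE = {}  # verdicts keyed by content hash, shared across all clients

def in_cloud_scan(data):
    """Runs on the network service: fan out to every engine, cache by hash."""
    digest = hashlib.sha256(data).hexdigest()
    if digest not in CACHE:
        CACHE[digest] = any(engine(data) for engine in ENGINES)
    return CACHE[digest]

def mobile_agent_check(data):
    """Thin on-device agent: spend bandwidth shipping the sample (or its
    hash) off-device instead of local CPU, memory, and power."""
    return in_cloud_scan(data)  # stands in for an RPC to the service

print(mobile_agent_check(b"harmless text"))    # False
print(mobile_agent_check(b"EICAR test body"))  # True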


Architectural Support for Programming Languages and Operating Systems (ASPLOS) | 2011

DoublePlay: parallelizing sequential logging and replay

Kaushik Veeraraghavan; Dongyoon Lee; Benjamin Wester; Jessica Ouyang; Peter M. Chen; Jason Flinn; Satish Narayanasamy

Deterministic replay systems record and reproduce the execution of a hardware or software system. In contrast to replaying execution on uniprocessors, deterministic replay on multiprocessors is very challenging to implement efficiently because of the need to reproduce the order of, or the values read by, shared memory operations performed by multiple threads. In this paper, we present DoublePlay, a new way to efficiently guarantee replay on commodity multiprocessors. Our key insight is that one can use the simpler and faster mechanisms of single-processor record and replay, yet still achieve the scalability offered by multiple cores, by using an additional execution to parallelize the record and replay of an application. DoublePlay timeslices multiple threads on a single processor, then runs multiple time intervals (epochs) of the program concurrently on separate processors. This strategy, which we call uniparallelism, makes logging much easier because each epoch runs on a single processor (so threads in an epoch never simultaneously access the same memory) and different epochs operate on different copies of the memory. Thus, rather than logging the order of shared-memory accesses, we need only log the order in which threads in an epoch are timesliced on the processor. DoublePlay runs an additional execution of the program on multiple processors to generate checkpoints so that epochs run in parallel. We evaluate DoublePlay on a variety of client, server, and scientific parallel benchmarks; with spare cores, DoublePlay reduces logging overhead to an average of 15% with two worker threads and 28% with four threads.
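
A toy illustration of the logging half of uniparallelism may help: once an epoch's threads are timesliced on a single processor, the only thing the log must capture is the order of the timeslices. The generator-based "threads" below are an assumption made for the sketch, not DoublePlay's kernel mechanism.

# Sketch of uniparallel logging: threads of one epoch run one at a time,
# so we log only which thread ran in each timeslice, not every
# shared-memory access. Generator "threads" are illustrative stand-ins.
def worker(name, counter, steps):
    for _ in range(steps):
        counter[0] += 1  # shared-memory access; safe, nothing runs concurrently
        yield name       # end of this timeslice

def run_epoch_logged(thread_bodies):
    """Run one epoch's threads round-robin on a single logical CPU,
    recording only the timeslice order."""
    schedule_log, counter = [], [0]
    live = [body(f"T{i}", counter, 3) for i, body in enumerate(thread_bodies)]
    while live:
        for gen in list(live):
            try:
                schedule_log.append(next(gen))
            except StopIteration:
                live.remove(gen)
    return schedule_log, counter[0]

log, result = run_epoch_logged([worker, worker])
print(log)     # ['T0', 'T1', 'T0', 'T1', 'T0', 'T1'] -- enough to replay
print(result)  # 6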


Architectural Support for Programming Languages and Operating Systems (ASPLOS) | 2010

Respec: efficient online multiprocessor replay via speculation and external determinism

Dongyoon Lee; Benjamin Wester; Kaushik Veeraraghavan; Satish Narayanasamy; Peter M. Chen; Jason Flinn

Deterministic replay systems record and reproduce the execution of a hardware or software system. While it is well known how to replay uniprocessor systems, replaying shared memory multiprocessor systems at low overhead on commodity hardware is still an open problem. This paper presents Respec, a new way to support deterministic replay of shared memory multithreaded programs on commodity multiprocessor hardware. Respec targets online replay in which the recorded and replayed processes execute concurrently. Respec uses two strategies to reduce overhead while still ensuring correctness: speculative logging and externally deterministic replay. Speculative logging optimistically logs less information about shared memory dependencies than is needed to guarantee deterministic replay, then recovers and retries if the replayed process diverges from the recorded process. Externally deterministic replay relaxes the degree to which the two executions must match by requiring only their system output and final program states match. We show that the combination of these two techniques results in low recording and replay overhead for the common case of data-race-free execution intervals and still ensures correct replay for execution intervals that have data races. We modified the Linux kernel to implement our techniques. Our software system adds on average about 18% overhead to the execution time for recording and replaying programs with two threads and 55% overhead for programs with four threads.
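
The interplay of speculative logging and externally deterministic replay can be pictured as a record/compare/retry loop: a replayed interval is accepted if its system output and final state match the recording, and only persistent divergence forces a rollback to heavier logging. The simulated data race and every name below are invented for illustration.

# Sketch of externally deterministic replay: the replica must match only
# the recorded system output and final state, not every internal step.
# The simulated race and retry policy are illustrative, not Respec's code.
import random

def run_interval(seed, race_rate=0.0):
    """Stand-in for one execution interval: (system_output, final_state).
    With probability race_rate, a simulated data race perturbs the state."""
    rng = random.Random(seed)
    state = sum(rng.randrange(10) for _ in range(5))
    if random.random() < race_rate:  # nondeterminism the sparse log misses
        state += 1
    return [f"write({state})"], state

def speculative_replay(seed, race_rate=0.2, max_retries=5):
    rec_out, rec_state = run_interval(seed)  # the recorded execution
    for attempt in range(1, max_retries + 1):
        rep_out, rep_state = run_interval(seed, race_rate)
        # External determinism: compare only output and final state.
        if (rep_out, rep_state) == (rec_out, rec_state):
            return f"replay matched on attempt {attempt}"
    return "diverged repeatedly: roll back, log this interval precisely"

print(speculative_replay(seed=7))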


Symposium on Operating Systems Principles (SOSP) | 2011

Detecting and surviving data races using complementary schedules

Kaushik Veeraraghavan; Peter M. Chen; Jason Flinn; Satish Narayanasamy

Data races are a common source of errors in multithreaded programs. In this paper, we show how to protect a program from data race errors at runtime by executing multiple replicas of the program with complementary thread schedules. Complementary schedules are a set of replica thread schedules crafted to ensure that replicas diverge only if a data race occurs and to make it very likely that harmful data races cause divergences. Our system, called Frost, uses complementary schedules to cause at least one replica to avoid the order of racing instructions that leads to incorrect program execution for most harmful data races. Frost introduces outcome-based race detection, which detects data races by comparing the state of replicas executing complementary schedules. We show that this method is substantially faster than existing dynamic race detectors for unmanaged code. To help programs survive bugs in production, Frost also diagnoses the data race bug and selects an appropriate recovery strategy, such as choosing a replica that is likely to be correct or executing more replicas to gather additional information. Frost controls the thread schedules of replicas by running all threads of a replica non-preemptively on a single core. To scale the program to multiple cores, Frost runs a third replica in parallel to generate checkpoints of the program's likely future states; these checkpoints let Frost divide program execution into multiple epochs, which it then runs in parallel. We evaluate Frost using 11 real data race bugs in desktop and server applications. Frost both detects and survives all of these data races. Since Frost runs three replicas, its utilization cost is 3x. However, if there are spare cores to absorb this increased utilization, Frost adds only 3-12% overhead to application runtime.
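
A minimal sketch of complementary schedules under a toy race: two replicas run the same threads non-preemptively in opposite orders, so a harmful data race surfaces as divergent final states. The two-thread race below is fabricated for illustration and is not Frost's implementation.

# Sketch of outcome-based race detection: run replicas whose
# non-preemptive thread orders are complementary and compare states.
def replica(thread_order):
    """Each 'thread' runs to completion on one core (non-preemptive)."""
    state = {"x": 0, "y": 0}
    def t1(): state["x"] = 1           # racy write to x
    def t2(): state["y"] = state["x"]  # racy read of x
    threads = {"t1": t1, "t2": t2}
    for name in thread_order:
        threads[name]()
    return state

a = replica(["t1", "t2"])  # one schedule
b = replica(["t2", "t1"])  # the complementary (reversed) schedule
if a != b:
    print(f"divergence {a} vs {b}: data race detected; recover by "
          "picking the likely-correct replica or running more replicas")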


ACM Transactions on Computer Systems | 2012

DoublePlay: Parallelizing Sequential Logging and Replay

Kaushik Veeraraghavan; Dongyoon Lee; Benjamin Wester; Jessica Ouyang; Peter M. Chen; Jason Flinn; Satish Narayanasamy

Deterministic replay systems record and reproduce the execution of a hardware or software system. In contrast to replaying execution on uniprocessors, deterministic replay on multiprocessors is very challenging to implement efficiently because of the need to reproduce the order of, or the values read by, shared memory operations performed by multiple threads. In this paper, we present DoublePlay, a new way to efficiently guarantee replay on commodity multiprocessors. Our key insight is that one can use the simpler and faster mechanisms of single-processor record and replay, yet still achieve the scalability offered by multiple cores, by using an additional execution to parallelize the record and replay of an application. DoublePlay timeslices multiple threads on a single processor, then runs multiple time intervals (epochs) of the program concurrently on separate processors. This strategy, which we call uniparallelism, makes logging much easier because each epoch runs on a single processor (so threads in an epoch never simultaneously access the same memory) and different epochs operate on different copies of the memory. Thus, rather than logging the order of shared-memory accesses, we need only log the order in which threads in an epoch are timesliced on the processor. DoublePlay runs an additional execution of the program on multiple processors to generate checkpoints so that epochs run in parallel. We evaluate DoublePlay on a variety of client, server, and scientific parallel benchmarks; with spare cores, DoublePlay reduces logging overhead to an average of 15% with two worker threads and 28% with four threads.
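
Complementing the logging sketch after the ASPLOS 2011 DoublePlay entry above, here is a toy view of the epoch-parallel side: a fast multi-core execution exists only to emit checkpoints at epoch boundaries, from which the timesliced epochs are recorded concurrently and cross-checked. The single-integer state model is an invented stand-in.

# Sketch of epoch parallelism: checkpoints from a fast run let each
# uniparallel epoch start immediately and be recorded concurrently.
from concurrent.futures import ThreadPoolExecutor

def fast_parallel_run(n_epochs, step):
    """Stand-in for the multi-core execution whose only job is to emit
    the program state at each epoch boundary."""
    state, checkpoints = 0, []
    for _ in range(n_epochs):
        checkpoints.append(state)
        state += step
    return checkpoints

def record_epoch(start_state, step):
    """Timesliced single-CPU recording of one epoch (cf. the earlier
    sketch); trivially returns its end state here."""
    return start_state + step

checkpoints = fast_parallel_run(n_epochs=4, step=10)
with ThreadPoolExecutor() as pool:  # all epochs recorded in parallel
    ends = list(pool.map(lambda s: record_epoch(s, 10), checkpoints))
# Each epoch's end state must match the next epoch's checkpoint.
assert ends[:-1] == checkpoints[1:], "divergence: re-record from checkpoint"
print(checkpoints, ends)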


Symposium on Operating Systems Principles (SOSP) | 2015

Existential consistency: measuring and understanding consistency at Facebook

Haonan Lu; Kaushik Veeraraghavan; Philippe Vincent Ajoux; Jim Hunt; Yee Jiun Song; Wendy Tobagus; Sanjeev Kumar; Wyatt Lloyd

Replicated storage for large Web services faces a trade-off between stronger forms of consistency and higher performance properties. Stronger consistency prevents anomalies, i.e., unexpected behavior visible to users, and reduces programming complexity. There is much recent work on improving the performance properties of systems with stronger consistency, yet the flip-side of this trade-off remains elusively hard to quantify. To the best of our knowledge, no prior work does so for a large, production Web service. We use measurement and analysis of requests to Facebook's TAO system to quantify how often anomalies happen in practice, i.e., when results returned by eventually consistent TAO differ from what is allowed by stronger consistency models. For instance, our analysis shows that 0.0004% of reads to vertices would return different results in a linearizable system. This in turn gives insight into the benefits of stronger consistency; 0.0004% of reads are potential anomalies that a linearizable system would prevent. We directly study local consistency models (i.e., those we can analyze using requests to a sample of objects) and use the relationships between models to infer bounds on the others. We also describe a practical consistency monitoring system that tracks φ-consistency, a new consistency metric ideally suited for health monitoring. In addition, we give insight into the increased programming complexity of weaker consistency by discussing bugs our monitoring uncovered, and anti-patterns we teach developers to avoid.
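
The φ-consistency idea the paper proposes for health monitoring can be approximated in a few lines: issue the same sampled reads to every replica and report the fraction on which all replicas agree. The replica contents below are fabricated, and this is a sketch of the metric, not Facebook's monitoring system.

# Sketch of a phi-consistency style health check: a read is counted as
# consistent when every replica returns the same value. Data is invented.
replicas = [
    {"v1": "a", "v2": "b", "v3": "c"},  # replica 1
    {"v1": "a", "v2": "b", "v3": "c"},  # replica 2
    {"v1": "a", "v2": "B", "v3": "c"},  # replica 3 lags on v2
]

def phi_consistency(replicas, keys):
    agree = sum(
        1 for k in keys
        if len({r[k] for r in replicas}) == 1  # all replicas agree on k
    )
    return agree / len(keys)

print(phi_consistency(replicas, ["v1", "v2", "v3"]))  # 0.666...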


ACM Transactions on Storage | 2010

quFiles: The right file at the right time

Kaushik Veeraraghavan; Jason Flinn; Edmund B. Nightingale; Brian D. Noble

A quFile is a unifying abstraction that simplifies data management by encapsulating different physical representations of the same logical data. Similar to a quBit (quantum bit), the particular representation of the logical data displayed by a quFile is not determined until the moment it is needed. The representation returned by a quFile is specified by a data-specific policy that can take context into account such as the application requesting the data, the device on which data is accessed, screen size, and battery status. We demonstrate the generality of the quFile abstraction by using it to implement six case studies: resource management, copy-on-write versioning, data redaction, resource-aware directories, application-aware adaptation, and platform-specific encoding. Most quFile policies were expressed using less than one hundred lines of code. Our experimental results show that, with caching and other performance optimizations, quFiles add less than 1% overhead to application-level file system benchmarks.
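
A minimal sketch of the abstraction, assuming a single policy callback: one logical name holds several physical representations, and the policy inspects the access context only at open time. The context fields and the video policy are invented examples, not the paper's API.

# Sketch of a quFile: the representation of the logical data is not
# chosen until access time, by a data-specific policy. Names invented.
class QuFile:
    def __init__(self, representations, policy):
        self.representations = representations  # name -> physical data
        self.policy = policy                    # context -> chosen name

    def open(self, context):
        return self.representations[self.policy(context)]

def video_policy(ctx):
    """Example policy: small screens or low battery get the low-res
    transcoding; everyone else gets the original."""
    if ctx["screen_inches"] < 5 or ctx["battery_pct"] < 20:
        return "240p"
    return "1080p"

movie = QuFile({"1080p": "<full-res bytes>", "240p": "<low-res bytes>"},
               video_policy)
print(movie.open({"screen_inches": 4, "battery_pct": 80}))   # low-res
print(movie.open({"screen_inches": 15, "battery_pct": 90}))  # full-res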


International Conference on Mobile Systems, Applications, and Services (MobiSys) | 2009

Fidelity-aware replication for mobile devices

Kaushik Veeraraghavan; Venugopalan Ramasubramanian; Thomas L. Rodeheffer; Douglas B. Terry; Ted Wobber

Mobile devices often store data in reduced resolutions or custom formats in order to accommodate resource constraints and tailor-made software. The Polyjuz framework enables sharing and synchronization of data across a collection of personal devices that use formats of different fidelity. Layered transparently between the application and an off-the-shelf replication platform, Polyjuz bridges the isolated worlds of different data formats. With Polyjuz, data items created or updated on high-fidelity devices, such as laptops and desktops, are automatically replicated onto low-fidelity, mobile devices. Similarly, data items updated on low-fidelity devices are reintegrated with their high-fidelity counterparts when possible. Polyjuz performs these fidelity reductions and reintegrations as devices exchange data in a peer-to-peer manner, ultimately extending the eventual-consistency guarantee of the underlying replication platform to the multifidelity universe. In this paper, we present the design and implementation of Polyjuz and demonstrate its benefits for fidelity-aware contacts management and picture sharing applications.
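
A toy sketch of the fidelity-reduction and reintegration steps described above, using a contacts record: fields the low-fidelity device cannot store are dropped on the way down and preserved on the way back. The record layout and merge rule are assumptions for illustration, not Polyjuz's logic.

# Sketch of fidelity-aware replication: down-convert when replicating to
# a low-fidelity device, reintegrate its edits afterwards. Names invented.
def reduce_fidelity(contact):
    """Down-convert for a device that stores only name and phone."""
    return {k: contact[k] for k in ("name", "phone")}

def reintegrate(high, low_edit):
    """Merge a low-fidelity edit back into the high-fidelity record,
    keeping fields the low-fidelity device could not represent."""
    merged = dict(high)
    merged.update(low_edit)
    return merged

desktop = {"name": "Ada", "phone": "555-0100",
           "photo": "<jpeg>", "notes": "<long notes>"}
phone_copy = reduce_fidelity(desktop)  # replicated to the phone
phone_copy["phone"] = "555-0199"       # edited on the low-fidelity device
print(reintegrate(desktop, phone_copy))  # photo and notes survive the merge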


Workshop on Mobile Computing Systems and Applications (HotMobile) | 2008

quFiles: a unifying abstraction for mobile data management

Kaushik Veeraraghavan; Edmund B. Nightingale; Jason Flinn; Brian D. Noble

We introduce a unifying file system abstraction, called a quFile, that provides a new mechanism for implementing mobile data management policies. quFiles allow arbitrary data types to be bundled together without confusing the user. Similar to a quBit (quantum bit), the particular data displayed by a quFile is not determined until the moment it is observed. A quFile displays the appropriate data type and version depending upon an application-specific policy that can take any information into account, such as the platform, external devices, context, connectivity, or battery power. We first describe what the quFile mechanism provides to applications and mobile devices. We then discuss how quFiles can benefit mobile data management in the areas of resource management, extensibility, and data consistency and availability.
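
Where the sketch after the ACM Transactions on Storage quFiles entry above showed representation selection, the emphasis here is that the version, too, is resolved only at the moment of observation; a connectivity-aware policy makes that concrete. The policy, platforms, and versions below are invented for illustration.

# Sketch of deferred version resolution: the quFile policy may consult
# platform and connectivity, returning a stale cached copy only when
# the device is offline. All names are invented stand-ins.
def contact_policy(ctx):
    if ctx["platform"] == "mobile" and not ctx["online"]:
        return "cached_vcard"  # offline: the possibly stale local copy
    return "live_vcard"        # connected: the authoritative copy

versions = {"live_vcard": "v42 (server)", "cached_vcard": "v40 (local)"}
for ctx in ({"platform": "mobile", "online": False},
            {"platform": "desktop", "online": True}):
    choice = contact_policy(ctx)
    print(ctx["platform"], "->", choice, versions[choice])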

Collaboration


Dive into Kaushik Veeraraghavan's collaborations.

Top Co-Authors

Jason Flinn

University of Michigan
