Kc Sivaramakrishnan | Researchain

Publication

Featured researches published by Kc Sivaramakrishnan.

international conference on functional programming | 2009

Partial memoization of concurrency and communication

Lukasz Ziarek; Kc Sivaramakrishnan; Suresh Jagannathan

Memoization is a well-known optimization technique used to eliminate redundant calls for pure functions. If a call to a function f with argument v yields result r, a subsequent call to f with v can be immediately reduced to r without the need to re-evaluate f's body. Understanding memoization in the presence of concurrency and communication is significantly more challenging. For example, if f communicates with other threads, it is not sufficient to simply record its input/output behavior; we must also track inter-thread dependencies induced by these communication actions. Subsequent calls to f can be elided only if we can identify an interleaving of actions from these call-sites that leads to states in which these dependencies are satisfied. Similar issues arise if f spawns additional threads. In this paper, we consider the memoization problem for a higher-order concurrent language whose threads may communicate through synchronous message-based communication. To avoid the need to perform unbounded state space search that may be necessary to determine if all communication dependencies manifest in an earlier call can be satisfied in a later one, we introduce a weaker notion of memoization called partial memoization that gives implementations the freedom to avoid performing some part, if not all, of a previously memoized call. To validate the effectiveness of our ideas, we consider the benefits of memoization for reducing the overhead of recomputation for streaming, server-based, and transactional applications executed on a multi-core machine. We show that on a variety of workloads, memoization can lead to substantial performance improvements without incurring high memory costs.
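As a point of reference for the abstract above, the sequential case it starts from (a call to f with argument v yielding r, and later calls with v reduced to r) can be sketched in a few lines of Python. This is only an illustration of ordinary memoization for a pure function, not the partial memoization technique of the paper; the names `memoize`, `square`, and `calls` are hypothetical, and the `calls` counter exists purely to make the elided call observable.

```python
from typing import Callable, Dict


def memoize(f: Callable[[int], int]) -> Callable[[int], int]:
    """Cache results of a pure one-argument function, keyed by its argument."""
    cache: Dict[int, int] = {}

    def wrapper(v: int) -> int:
        if v not in cache:
            cache[v] = f(v)      # first call with v: evaluate f's body
        return cache[v]          # later calls with v: reuse the recorded result

    return wrapper


calls = 0  # instrumentation only: counts how often the body really runs


def square(v: int) -> int:
    global calls
    calls += 1
    return v * v


memo_square = memoize(square)
first = memo_square(12)   # evaluates the body
second = memo_square(12)  # served from the cache; the body is not re-run
```

If `square` also sent or received messages, replaying only the cached result would skip those communication actions, which is the gap the abstract describes.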

international conference on coordination models and languages | 2010

Efficient session type guided distributed interaction

Kc Sivaramakrishnan; Karthik Nagaraj; Lukasz Ziarek; Patrick Eugster

Recently, there has been much interest in multi-party session types (MPSTs) as a means of rigorously specifying protocols for interaction among multiple distributed participants. By capturing distributed interaction as a series of typed interactions, MPSTs allow for the static verification of compliance of corresponding distributed object programs. We observe that explicit control flow information manifested by MPSTs also opens intriguing avenues for performance enhancements. In this paper, we present a session type assisted performance enhancement framework for distributed object interaction in Java. Experimental evaluation within our distributed runtime infrastructure illustrates the costs and benefits of our composable enhancement strategies.
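To make "a series of typed interactions" concrete, here is a minimal Python sketch that represents a protocol as an ordered list of (sender, receiver, payload type) steps and checks a recorded message trace against it. It is an assumption-laden illustration: the paper concerns static verification of Java programs, whereas this is a runtime check over a trace, and the names `PROTOCOL`, `complies`, and the buyer/seller roles are hypothetical.

```python
from typing import List, Tuple

Interaction = Tuple[str, str, type]   # (sender, receiver, expected payload type)
Message = Tuple[str, str, object]     # (sender, receiver, payload)

# Hypothetical three-step protocol between two participants.
PROTOCOL: List[Interaction] = [
    ("buyer", "seller", str),    # buyer sends a title
    ("seller", "buyer", int),    # seller replies with a price
    ("buyer", "seller", bool),   # buyer accepts or rejects
]


def complies(trace: List[Message], protocol: List[Interaction]) -> bool:
    """Return True if the trace follows the protocol step by step."""
    if len(trace) != len(protocol):
        return False
    return all(
        sender == p_sender and receiver == p_receiver and type(payload) is p_type
        for (sender, receiver, payload), (p_sender, p_receiver, p_type)
        in zip(trace, protocol)
    )


good = complies(
    [("buyer", "seller", "book"), ("seller", "buyer", 30), ("buyer", "seller", True)],
    PROTOCOL,
)
bad = complies(
    [("buyer", "seller", "book"), ("buyer", "seller", True), ("seller", "buyer", 30)],
    PROTOCOL,
)
```

Because the protocol fixes the order and direction of every message in advance, a runtime that has it available knows what communication comes next, which is the kind of control flow information the abstract proposes to exploit.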

programming language design and implementation | 2011

Composable asynchronous events

Lukasz Ziarek; Kc Sivaramakrishnan; Suresh Jagannathan

Although asynchronous communication is an important feature of many concurrent systems, building composable abstractions that leverage asynchrony is challenging. This is because an asynchronous operation necessarily involves two distinct threads of control -- the thread that initiates the operation, and the thread that discharges it. Existing attempts to marry composability with asynchrony either entail sacrificing performance (by limiting the degree of asynchrony permitted), or modularity (by forcing natural abstraction boundaries to be broken). In this paper, we present the design and rationale for asynchronous events, an abstraction that enables composable construction of complex asynchronous protocols without sacrificing the benefits of abstraction or performance. Asynchronous events are realized in the context of Concurrent ML's first-class event abstraction. We discuss the definition of a number of useful asynchronous abstractions that can be built on top of asynchronous events (e.g., composable callbacks) and provide a detailed case study of how asynchronous events can be used to substantially improve the modularity and performance of an I/O-intensive highly concurrent server application.

workshop on declarative aspects of multicore programming | 2010

Lightweight asynchrony using parasitic threads

Kc Sivaramakrishnan; Lukasz Ziarek; Raghavendra Prasad; Suresh Jagannathan

Message-passing is an attractive thread coordination mechanism because it cleanly delineates points in an execution when threads communicate, and unifies synchronization and communication: a sender is allowed to proceed only when a receiver willing to accept the data being sent is available and vice versa. To enable greater performance, however, asynchronous or non-blocking extensions are usually provided that allow senders and receivers to proceed even if a matching partner is unavailable. Lightweight threads with synchronous message-passing can be used to encapsulate asynchronous message-passing operations, although such implementations have greater thread management costs that can negatively impact scalability and performance. This paper introduces parasitic threads, a novel mechanism for expressing asynchronous computation, that combines the efficiency of a non-declarative solution with the ease of use provided by languages with first-class channels and lightweight threads. A parasitic thread is a lightweight data structure that encapsulates an asynchronous computation using the resources provided by a host thread. Parasitic threads need not execute cooperatively, impose no restrictions on the computations they encapsulate, or the communication actions they perform, and impose no additional burden on thread scheduling mechanisms. We describe an implementation of parasitic threads in MLton, a whole-program optimizing compiler and runtime for Standard ML. Benchmark results indicate parasitic threads enable construction of scalable and efficient message-passing parallel programs.

international symposium on memory management | 2012

Eliminating read barriers through procrastination and cleanliness

Kc Sivaramakrishnan; Lukasz Ziarek; Suresh Jagannathan

Managed languages typically use read barriers to interpret forwarding pointers introduced to keep track of copied objects. For example, in a multicore environment with thread-local heaps and a global, shared heap, an object initially allocated on a local heap may be copied to a shared heap if it becomes the source of a store operation whose target location resides on the shared heap. As part of the copy operation, a forwarding pointer may be established in the original object to point to the copied object. This level of indirection avoids the need to update all of the references to the object that has been copied. In this paper, we consider the design of a managed runtime that eliminates read barriers. Our design is premised on the availability of a sufficient degree of concurrency to stall operations that would otherwise necessitate the copy. Stalled actions are deferred until the next local collection, avoiding exposing forwarding pointers to the mutator. In certain important cases, procrastination is unnecessary -- lightweight runtime techniques can sometimes be used to allow objects to be eagerly copied when their set of incoming references is known, or when it can be determined that having multiple copies would not violate program semantics. We evaluate our techniques on 3 platforms: a 16-core AMD64 machine, a 48-core Intel SCC, and an 864-core Azul Vega 3. Experimental results over a range of parallel benchmarks indicate that our approach leads to notable performance gains (20 - 32% on average) without incurring any additional complexity.
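The forwarding-pointer mechanism described in the abstract above can be illustrated with a small Python model. This is a toy sketch under stated assumptions, not the MultiMLton runtime: `Obj`, `copy_to_shared`, and `read_barrier` are hypothetical names, and heaps are modelled as plain lists. It shows why a reader holding a stale reference needs a read barrier once an object has been copied.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Obj:
    value: int
    forward: Optional["Obj"] = None  # forwarding pointer, set when the object is copied


def copy_to_shared(obj: Obj, shared_heap: List[Obj]) -> Obj:
    """Copy obj to the shared heap once and leave a forwarding pointer behind."""
    if obj.forward is None:
        copy = Obj(obj.value)
        shared_heap.append(copy)
        obj.forward = copy
    return obj.forward


def read_barrier(obj: Obj) -> Obj:
    """Follow forwarding pointers so the caller sees the current copy."""
    while obj.forward is not None:
        obj = obj.forward
    return obj


shared_heap: List[Obj] = []
local = Obj(41)
stale_ref = local                       # a reference taken before the copy
shared = copy_to_shared(local, shared_heap)
shared.value = 42                       # later writes go to the shared copy

seen_without_barrier = stale_ref.value                # reads the outdated original
seen_with_barrier = read_barrier(stale_ref).value     # reads the shared copy
```

The check inside `read_barrier` runs on every read in such a design; the abstract's proposal is to avoid exposing forwarding pointers to the mutator at all, so that this check becomes unnecessary.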

Journal of Functional Programming | 2014

MultiMLton: A multicore-aware runtime for Standard ML

Kc Sivaramakrishnan; Lukasz Ziarek; Suresh Jagannathan

MultiMLton is an extension of the MLton compiler and runtime system that targets scalable, multicore architectures. It provides specific support for ACML, a derivative of Concurrent ML that allows for the construction of composable asynchronous events. To effectively manage asynchrony, we require the runtime to efficiently handle potentially large numbers of lightweight, short-lived threads, many of which are created specifically to deal with the implicit concurrency introduced by asynchronous events. Scalability demands also dictate that the runtime minimize global coordination. MultiMLton therefore implements a split-heap memory manager that allows mutators and collectors running on different cores to operate mostly independently. More significantly, MultiMLton exploits the premise that there is a surfeit of available concurrency in ACML programs to realize a new collector design that completely eliminates the need for read barriers, a source of significant overhead in other managed runtimes. These two symbiotic features - a thread design specifically tailored to support asynchronous communication, and a memory manager that exploits lightweight concurrency to greatly reduce barrier overheads - are MultiMLton's key novelties. In this article, we describe the rationale, design, and implementation of these features, and provide experimental results over a range of parallel benchmarks and different multicore architectures, including an 864-core Azul Vega 3 and a 48-core non-coherent Intel SCC (Single-chip Cloud Computer), that justify our design decisions.

Science of Computer Programming | 2013

Efficient sessions

Kc Sivaramakrishnan; Mohammad Qudeisat; Lukasz Ziarek; Karthik Nagaraj; Patrick Eugster

Recently, there has been much interest in multi-party session types (MPSTs) as a means of rigorously specifying protocols for interaction among multiple distributed participants. By capturing distributed interaction as a series of typed interactions, MPSTs allow for the static verification of compliance of corresponding distributed object programs. We observe that explicit control flow information manifested by MPSTs opens intriguing avenues for performance improvements. In this paper, we present a session type guided performance enhancement framework for distributed object interaction in Java. Our framework combines control flow information from MPSTs with data flow information obtained from corresponding programs. Detailed experimental evaluation of our distributed runtime infrastructure in both Emulab and Amazon's Elastic Compute Cloud (EC2) illustrates the benefits of our composable enhancement strategies.

practical aspects of declarative languages | 2014

Rx-CML: A Prescription for Safely Relaxing Synchrony

Kc Sivaramakrishnan; Lukasz Ziarek; Suresh Jagannathan

A functional programming discipline, combined with abstractions like Concurrent ML (CML)'s first-class synchronous events, offers an attractive programming model for concurrency. In high-latency distributed environments, like the cloud, however, the high communication latencies incurred by synchronous communication can compromise performance. While switching to an explicitly asynchronous communication model may reclaim some of these costs, program structure and understanding also become more complex. To ease the challenge of migrating concurrent applications to distributed cloud environments, we have built an extension of the MultiMLton compiler and runtime that implements CML communication asynchronously, but guarantees that the resulting execution is faithful to the synchronous semantics of CML. We formalize the conditions under which this equivalence holds, and present an implementation that builds a decentralized dependence graph whose structure can be used to check the integrity of an execution with respect to this equivalence. We integrate a notion of speculation to allow ill-formed executions to be rolled back and re-executed, replacing offending asynchronous actions with safe synchronous ones. Several realistic case studies deployed on the Amazon EC2 cloud infrastructure demonstrate the utility of our approach.

practical aspects of declarative languages | 2014