Shashidhar Mysore

University of California, Santa Barbara

Publication

Featured research published by Shashidhar Mysore.

architectural support for programming languages and operating systems | 2009

Complete information flow tracking from the gates up

Mohit Tiwari; Hassan M. G. Wassel; Bita Mazloom; Shashidhar Mysore; Frederic T. Chong; Timothy Sherwood

For many mission-critical tasks, tight guarantees on the flow of information are desirable, for example, when handling important cryptographic keys or sensitive financial data. We present a novel architecture capable of tracking all information flow within the machine, including all explicit data transfers and all implicit flows (those subtly devious flows caused by not performing conditional operations). While the problem is impossible to solve in the general case, we have created a machine that avoids the general-purpose programmability that leads to this impossibility result, yet is still programmable enough to handle a variety of critical operations such as public-key encryption and authentication. Through the application of our novel gate-level information flow tracking method, we show how all flows of information can be precisely tracked. From this foundation, we then describe how a class of architectures can be constructed, from the gates up, to completely capture all information flows and we measure the impact of doing so on the hardware implementation, the ISA, and the programmer.
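The gate-level tracking idea in this abstract can be illustrated with a single AND gate. The following is a minimal sketch, not the authors' implementation: it assumes 1-bit values, one taint bit per wire, and hypothetical function names (`glift_and`, `naive_and`). The point it shows is that a precise tracker marks the output tainted only when a tainted input can actually change it, whereas a conservative tracker simply ORs the input taints.

```python
def glift_and(a, a_t, b, b_t):
    """Precise taint tracking for a 2-input AND gate (illustrative sketch).

    a, b     : 1-bit input values (0 or 1)
    a_t, b_t : 1-bit taint labels (1 = tainted/untrusted)
    Returns (out, out_t).
    """
    out = a & b
    # A tainted input affects the output only if the other input is 1,
    # or if the other input is itself tainted (and so could become 1).
    out_t = (a_t & b) | (b_t & a) | (a_t & b_t)
    return out, out_t


def naive_and(a, a_t, b, b_t):
    """Conservative tracking: any tainted input taints the output."""
    return a & b, a_t | b_t


if __name__ == "__main__":
    # An untainted 0 on input a forces the output to 0 regardless of b,
    # so the precise tracker reports an untainted output.
    print(glift_and(0, 0, 1, 1))
    print(naive_and(0, 0, 1, 1))
```

The difference matters when gates are composed: the conservative rule accumulates taint monotonically, while the precise rule lets an untainted controlling input (here, a 0 on an AND) block the flow.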

international conference on vlsi design | 2008

Exploring the Processor and ISA Design for Wireless Sensor Network Applications

Shashidhar Mysore; Banit Agrawal; Frederic T. Chong; Timothy Sherwood

Power consumption, physical size, and architecture design of sensor node processors have been the focus of sensor network research in the architecture community. At the foundation of this research is the hardware-level design, which determines the boundaries for achievable utility and performance. Architecture design and evaluation, however, cannot be accomplished independent of the applications and software that run on these sensor nodes. On one hand, some researchers have proposed architectures that can cater to a variety of application classes while trading off some performance improvements. On the other hand, a set of application-specific architectures have been proposed which perform certain operations extremely well but are not versatile enough to run a variety of applications. This paper provides a design space exploration and optimization platform to characterize the processor and ISA design tailored for a particular application or a class of applications. We collect a wide variety of sensor network applications to create a comprehensive benchmark suite called WiSeNBench. We then present a careful profiling of these benchmark applications using an ARM simulator to identify some of the key characteristic behaviors. This also opens up an avenue for a possible re-look at the classes of applications that could be supported on next-generation sensor networks and efficient architectural designs to enable these applications.

international symposium on microarchitecture | 2008

A small cache of large ranges: Hardware methods for efficiently searching, storing, and updating big dataflow tags

Mohit Tiwari; Banit Agrawal; Shashidhar Mysore; Jonathan Valamehr; Timothy Sherwood

Dynamically tracking the flow of data within a microprocessor creates many new opportunities to detect and track malicious or erroneous behavior, but these schemes all rely on the ability to associate tags with all of virtual or physical memory. If one wishes to store large 32-bit tags, multiple tags per data element, or tags at the granularity of bytes rather than words, then directly storing one tag on chip to cover one byte or word (in a cache or otherwise) can be an expensive proposition. We show that dataflow tags in fact naturally exhibit a very high degree of spatial-value locality, an observation we can exploit by storing metadata on ranges of addresses (which cover a non-aligned contiguous span of memory) rather than on individual elements. In fact, a small 128-entry on-chip range cache (with area equivalent to 4 KB of SRAM) hits more than 98% of the time on average. The key to this approach is our proposed method by which ranges of tags are kept in cache in an optimally RLE-compressed form, queried at high speed, swapped in and out with secondary memory storage, and (most important for dataflow tracking) rapidly stitched together into the largest possible ranges as new tags are written on every store, all the while correctly handling the cases of unaligned and overlapping ranges. We examine the effectiveness of this approach by simulating its use in definedness tracking (covering both the stack and the heap), in tracking network-derived dataflow through a multi-language web application, and through a synthesizable prototype implementation.
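The "stitching" behavior described in this abstract, where a newly written tag is merged with neighboring ranges that carry the same tag, can be sketched in software. This is a simplified illustration under stated assumptions, not the hardware design from the paper: the class name `RangeTags`, the half-open `(start, end, tag)` representation, and the linear scans are all hypothetical choices made for readability, and there is no capacity limit or swapping to secondary storage.

```python
class RangeTags:
    """Disjoint, sorted, half-open address ranges, each carrying one tag."""

    def __init__(self):
        self.ranges = []  # list of (start, end, tag), sorted by start

    def write(self, start, end, tag):
        """Tag addresses [start, end), overwriting and re-merging ranges."""
        if start >= end:
            return
        kept = []
        for s, e, t in self.ranges:
            if e <= start or s >= end:
                kept.append((s, e, t))          # no overlap, keep whole
            else:
                if s < start:
                    kept.append((s, start, t))  # left remainder survives
                if e > end:
                    kept.append((end, e, t))    # right remainder survives
        kept.append((start, end, tag))
        kept.sort(key=lambda r: r[0])

        # Stitch adjacent ranges with equal tags into the largest range.
        merged = []
        for s, e, t in kept:
            if merged and merged[-1][1] == s and merged[-1][2] == t:
                merged[-1] = (merged[-1][0], e, t)
            else:
                merged.append((s, e, t))
        self.ranges = merged

    def lookup(self, addr):
        """Return the tag covering addr, or None if the address is untagged."""
        for s, e, t in self.ranges:
            if s <= addr < e:
                return t
        return None
```

For example, writing tag 1 to `[0, 4)` and then to `[4, 8)` leaves a single entry covering `[0, 8)`, which is why a small number of entries can cover a large tagged region when tags show spatial-value locality.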

symposium on code generation and optimization | 2006

Profiling over Adaptive Ranges

Shashidhar Mysore; Banit Agrawal; Timothy Sherwood; Nisheeth Shrivastava; Subhash Suri

Modern computer systems are called on to deal with billions of events every second, whether they are instructions executed, memory locations accessed, or packets forwarded. This presents a serious challenge to those who seek to quantify, analyze, or optimize such systems, because important trends and behaviors may easily be lost in a sea of data. We present range adaptive profiling (RAP) as a new and general-purpose profiling method capable of hierarchically classifying streams of data efficiently in hardware. Through the use of RAP, events in an input stream are dynamically classified into increasingly precise categories based on the frequency with which they occur. The more important a class, or range of events, the more precisely it is quantified. Despite the dynamic nature of our technique, we build upon tight theoretic bounds covering both worst-case error as well as the required memory. In the limit, it is known that error and the memory bounds can be independent of the stream size, and grow only linearly with the level of precision desired. Significantly, we expose the critical constants in these algorithms and through careful engineering, algorithm re-design, and use of heuristics, we show how a high-performance profile system can be implemented for range adaptive profiling. RAP can be used on various profiles such as PCs, load values, and memory addresses, and has a broad range of uses, from hot-region profiling to quantifying cache miss value locality. We propose two methods of implementation, one in software and the other with specialized hardware, and we show that with just 8 KB of memory, range profiles can be gathered with an average accuracy of 98%.
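The core idea of classifying events into increasingly precise ranges based on frequency can be sketched as a tree that splits a range once it becomes hot. This is a deliberately simplified illustration, not the RAP algorithm as published: it assumes a fixed split threshold (`split_threshold` is a hypothetical parameter), binary splits, and no merging of cold ranges, so it does not carry the error and memory bounds mentioned in the abstract.

```python
class RangeNode:
    """One node of a range tree covering the half-open interval [lo, hi)."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.count = 0        # events counted at this node before it split
        self.children = []    # empty for a leaf, two nodes after a split

    def add(self, value, split_threshold):
        """Count one event, descending to the most precise range for value."""
        node = self
        while node.children:
            mid = node.children[0].hi
            node = node.children[0] if value < mid else node.children[1]
        node.count += 1
        # A hot range is refined so later events are counted more precisely.
        if node.count > split_threshold and node.hi - node.lo > 1:
            mid = (node.lo + node.hi) // 2
            node.children = [RangeNode(node.lo, mid), RangeNode(mid, node.hi)]

    def leaves(self):
        """Return (lo, hi, count) for every leaf range, left to right."""
        if not self.children:
            return [(self.lo, self.hi, self.count)]
        return [leaf for child in self.children for leaf in child.leaves()]

    def total(self):
        """Total events seen in this subtree, including pre-split counts."""
        return self.count + sum(child.total() for child in self.children)
```

Feeding the same value repeatedly narrows the range around it while rarely seen regions stay coarse, which mirrors the abstract's statement that the more important a range is, the more precisely it is quantified.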

international symposium on microarchitecture | 2010

Gate-Level Information-Flow Tracking for Secure Architectures

Mohit Tiwari; Xun Li; Hassan M. G. Wassel; Bita Mazloom; Shashidhar Mysore; Frederic T. Chong; Timothy Sherwood

This article describes a new method for constructing and analyzing architectures that can track all information flows within a processor, including explicit, implicit, and timing flows. The key to this approach is a novel gate-level information-flow-tracking method that provides a way to create complex logical structures with well-defined information-flow properties.

international conference on parallel architectures and compilation techniques | 2009

Quantifying the Potential of Program Analysis Peripherals

Mohit Tiwari; Shashidhar Mysore; Timothy Sherwood

Tools such as multi-threaded data race detectors, memory bounds checkers, dynamic type analyzers, data flight recorders, and various performance profilers are becoming increasingly vital aids to software developers. Rather than performing all the instrumentation and analysis on the main processor, we exploit the increasingly high-throughput board-level interconnect available on many systems to offload analysis to an off-chip accelerator. We characterize the potential of such a system to both accelerate existing software development tools and enable a new class of heavyweight tools. There are many non-trivial technical issues in taking such an approach that may not appear in simulation, and to flush them out we have developed a prototype system that maps a DMA-based analysis engine, sitting on a PCI-mounted FPGA, into the Valgrind instrumentation framework. With our novel instrumentation methods, we demonstrate that program analysis speedups of 29% to 440% could be achieved today with strictly off-the-shelf components on some of the state-of-the-art tools, and we carefully quantify the bottlenecks to illuminate several new opportunities for further architectural innovation.

ACM Transactions on Architecture and Code Optimization | 2008

Formulating and implementing profiling over adaptive ranges

Shashidhar Mysore; Banit Agrawal; Rodolfo Neuber; Timothy Sherwood; Nisheeth Shrivastava; Subhash Suri

Modern computer systems are called on to deal with billions of events every second, whether they are executed instructions, accessed memory locations, or forwarded packets. This presents a serious challenge to those who seek to quantify, analyze, or optimize such systems, because important trends and behaviors may easily be lost in a sea of data. We present range-adaptive profiling (RAP) as a new and general-purpose profiling method capable of hierarchically and efficiently classifying streams of data in hardware. Through the use of RAP, events in an input stream are dynamically classified into increasingly precise categories, based on the frequency with which they occur. The more important a class, or range of events, the more precisely it is quantified. Despite the dynamic nature of our technique, we build upon tight theoretic bounds covering both worst-case error and the required memory. In the limit, it is known that error and the memory bounds can be independent of the stream size and grow only linearly with the level of precision desired. Significantly, we expose the critical constants in these algorithms and through careful engineering, algorithm redesign, and use of heuristics, we show how a high-performance profile system can be implemented for range-adaptive profiling. RAP can be used on various profiles, such as PCs, load values, and memory addresses, and has a broad range of uses, from hot-region profiling to quantifying cache miss value locality. We propose two methods of implementation of RAP, one in software and the other with specialized hardware, for which we also describe our prototype FPGA implementation. We show that with just 8 KB of memory, range profiles can be gathered with an average accuracy of 98%.

ACM Transactions on Architecture and Code Optimization | 2012

Dataflow Tomography: Information Flow Tracking For Understanding and Visualizing Full Systems

Bita Mazloom; Shashidhar Mysore; Mohit Tiwari; Banit Agrawal; Timothy Sherwood

It is not uncommon for modern systems to be composed of a variety of interacting services, running across multiple machines in such a way that most developers do not really understand the whole system. As abstraction is layered atop abstraction, developers gain the ability to compose systems of extraordinary complexity with relative ease. However, many software properties, especially those that cut across abstraction layers, become very difficult to understand in such compositions. The communication patterns involved, the privacy of critical data, and the provenance of information can be difficult to find and understand, even with access to all of the source code. The goal of Dataflow Tomography is to use the inherent information flow of such systems to help visualize the interactions between complex and interwoven components across multiple layers of abstraction. In the same way that the injection of short-lived radioactive isotopes helps doctors trace problems in the cardiovascular system, the use of “data tagging” can help developers slice through the extraneous layers of software and pinpoint those portions of the system interacting with the data of interest. To demonstrate the feasibility of this approach, we have developed a prototype system in which tags are tracked both through the machine and in between machines over the network, and from which novel visualizations of the whole system can be derived. We describe the system-level challenges in creating a working system tomography tool and we qualitatively evaluate our system by examining several example real-world scenarios.

architectural support for programming languages and operating systems | 2008