Atanas Rountev | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Atanas Rountev is active.

Explore More

Publication

Featured researches published by Atanas Rountev.

ACM Transactions on Software Engineering and Methodology | 2005

Parameterized object sensitivity for points-to analysis for Java

Ana Milanova; Atanas Rountev; Barbara G. Ryder

The goal of points-to analysis for Java is to determine the set of objects pointed to by a reference variable or a reference object field. We present object sensitivity, a new form of context sensitivity for flow-insensitive points-to analysis for Java. The key idea of our approach is to analyze a method separately for each of the object names that represent run-time objects on which this method may be invoked. To ensure flexibility and practicality, we propose a parameterization framework that allows analysis designers to control the tradeoffs between cost and precision in the object-sensitive analysis.Side-effect analysis determines the memory locations that may be modified by the execution of a program statement. Def-use analysis identifies pairs of statements that set the value of a memory location and subsequently use that value. The information computed by such analyses has a wide variety of uses in compilers and software tools. This work proposes new versions of these analyses that are based on object-sensitive points-to analysis.We have implemented two instantiations of our parameterized object-sensitive points-to analysis. On a set of 23 Java programs, our experiments show that these analyses have comparable cost to a context-insensitive points-to analysis for Java which is based on Andersens analysis for C. Our results also show that object sensitivity significantly improves the precision of side-effect analysis and call graph construction, compared to (1) context-insensitive analysis, and (2) context-sensitive points-to analysis that models context using the invoking call site. These experiments demonstrate that object-sensitive analyses can achieve substantial precision improvement, while at the same time remaining efficient and practical.

conference on object-oriented programming systems, languages, and applications | 2001

Points-to analysis for Java using annotated constraints

Atanas Rountev; Ana Milanova; Barbara G. Ryder

The goal of point-to analysis for Java is to determine the set of objects pointed by a reference variable or a reference object field. This information has a wide variety of client applications in optimizing compilers and software engineering tools. In this paper we present a point-to analysis for Java based on Andersens point-to analysis for C [5]. We implement the analysis by using a constraint-based approach which employs annotated inclusion constraints. Constraint annotations allow us model precisely and efficiently the semantics of virtual calls and the flow of values through object fields. By solving systems of annotated inclusion constraints, we have been albe to perform practical and precies points-to analysis for Java

international symposium on software testing and analysis | 2002

Parameterized object sensitivity for points-to and side-effect analyses for Java

Ana Milanova; Atanas Rountev; Barbara G. Ryder

The goal of points-to analysis for Java is to determine the set of objects pointed to by a reference variable or a reference objet field. Improving the precision of practical points-to analysis is important because points-to information has a wide variety of client applications in optimizing compilers and software engineering tools. In this paper we present object sensitivity, a new form of context sensitivity for flow-insensitive points-to analysis for Java. The key idea of our approach is to analyze a method separately for each of the objects on which this method is invoked. To ensure flexibility and practicality, we propose a parameterization framework that allows analysis designers to control the tradeoffs between cost and precision in the object-sensitive analysis.Side-effect analysis determines the memory locations that may be modified by the execution of a program statement. This information is needed for various compiler optimizations and software engineering tools. We present a new form of side-effect analysis for Java which is based on object-sensitive points-to analysis.We have implemented one instantiation of our parameterized object-sensitive points-to analysis. We compare this instantiation with a context-insensitive points-to analysis for Java which is based on Andersens analysis for C [4]. On a set of 23 Java programs, our experiments show that the two analyses have comparable cost. In some cases the object-sensitive analysis is actually faster than the context-insensitive analysis. Our results also show that object sensitivity significantly improves the precision of side-effect analysis, call graph construction, and virtual call resolution. These experiments demonstrate that object-sensitive analyses can achieve significantly better precision than context-insensitive ones, while at the same time remaining efficient and practical.

programming language design and implementation | 2007

Effective automatic parallelization of stencil computations

Sriram Krishnamoorthy; Muthu Manikandan Baskaran; Uday Bondhugula; J. Ramanujam; Atanas Rountev; P. Sadayappan

Performance optimization of stencil computations has been widely studied in the literature, since they occur in many computationally intensive scientific and engineering applications. Compiler frameworks have also been developed that can transform sequential stencil codes for optimization of data locality and parallelism. However, loop skewing is typically required in order to tile stencil codes along the time dimension, resulting in load imbalance in pipelined parallel execution of the tiles. In this paper, we develop an approach for automatic parallelization of stencil codes, that explicitly addresses the issue of load-balanced execution of tiles. Experimental results are provided that demonstrate the effectiveness of the approach.

compiler construction | 2008

Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model

Uday Bondhugula; Muthu Manikandan Baskaran; Sriram Krishnamoorthy; J. Ramanujam; Atanas Rountev; P. Sadayappan

The polyhedral model provides powerful abstractions to optimize loop nests with regular accesses. Affine transformations in this model capture a complex sequence of execution-reordering loop transformations that can improve performance by parallelization as well as locality enhancement. Although a significant body of research has addressed affine scheduling and partitioning, the problem of automaticallyfinding good affine transforms forcommunication-optimized coarsegrained parallelization together with locality optimization for the general case of arbitrarily-nested loop sequences remains a challenging problem. We propose an automatic transformation framework to optimize arbitrarilynested loop sequences with affine dependences for parallelism and locality simultaneously. The approach finds good tiling hyperplanes by embedding a powerful and versatile cost function into an Integer Linear Programming formulation. These tiling hyperplanes are used for communication-minimized coarse-grained parallelization as well as for locality optimization. The approach enables the minimization of inter-tile communication volume in the processor space, and minimization of reuse distances for local execution at each node. Programs requiring one-dimensional versusmulti-dimensional time schedules (with scheduling-based approaches) are all handled with the same algorithm. Synchronization-free parallelism, permutable loops or pipelined parallelismat various levels can be detected. Preliminary studies of the framework show promising results.

acm sigplan symposium on principles and practice of parallel programming | 2008

Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories

Muthu Manikandan Baskaran; Uday Bondhugula; Sriram Krishnamoorthy; J. Ramanujam; Atanas Rountev; P. Sadayappan

Several parallel architectures such as GPUs and the Cell processor have fast explicitly managed on-chip memories, in addition to slow off-chip memory. They also have very high computational power with multiple levels of parallelism. A significant challenge in programming these architectures is to effectively exploit the parallelism available in the architecture and manage the fast memories to maximize performance. In this paper we develop an approach to effective automatic data management for on-chip memories, including creation of buffers in on-chip (local) memories for holding portions of data accessed in a computational block, automatic determination of array access functions of local buffer references, and generation of code that moves data between slow off-chip memory and fast local memories. We also address the problem of mapping computation in regular programs to multi-level parallel architectures using a multi-level tiling approach, and study the impact of on-chip memory availability on the selection of tile sizes at various levels. Experimental results on a GPU demonstrate the effectiveness of the proposed approach.

programming language design and implementation | 2000

Off-line variable substitution for scaling points-to analysis

Atanas Rountev; Satish Chandra

Most compiler optimizations and software productivity tools rely oninformation about the effects of pointer dereferences in a program.The purpose of points-to analysis is to compute this informationsafely, and as accurately as is practical. Unfortunately, accuratepoints-to information is difficult to obtain for large programs,because the time and space requirements of the analysis becomeprohibitive. We consider the problem of scaling flow- and context-insensitivepoints-to analysis to large programs, perhaps containing hundreds ofthousands of lines of code. Our approach is based on a variable substitution transformation, which is performed off-line, i.e.,before a standard points-to analysis is performed. The general idea ofvariable substitution is that a set of variables in a program can be replaced by a single representative variable, thereby reducing the input size of the problem. Our main contribution is a linear-time algorithm which finds a particular variable substitution that maintains the precision of the standard analysis, and is also very effective in reducing the size of the problem. We report our experience in performing points-to analysis on largeC programs, including some industrial-sized ones. Experiments show thatour algorithm can reduce the cost of Andersens points-to analysis substantially: on average, it reduced the running time by 53% and the memory cost by 59%, relative to an efficient baseline implementation of the analysis.

programming language design and implementation | 2010

Finding low-utility data structures

Guoqing Xu; Nick Mitchell; Matthew Arnold; Atanas Rountev; Edith Schonberg; Gary Sevitsky

Many opportunities for easy, big-win, program optimizations are missed by compilers. This is especially true in highly layered Java applications. Often at the heart of these missed optimization opportunities lie computations that, with great expense, produce data values that have little impact on the programs final output. Constructing a new date formatter to format every date, or populating a large set full of expensively constructed structures only to check its size: these involve costs that are out of line with the benefits gained. This disparity between the formation costs and accrued benefits of data structures is at the heart of much runtime bloat. We introduce a run-time analysis to discover these low-utility data structures. The analysis employs dynamic thin slicing, which naturally associates costs with value flows rather than raw data flows. It constructs a model of the incremental, hop-to-hop, costs and benefits of each data structure. The analysis then identifies suspicious structures based on imbalances of its incremental costs and benefits. To decrease the memory requirements of slicing, we introduce abstract dynamic thin slicing, which performs thin slicing over bounded abstract domains. We have modified the IBM J9 commercial JVM to implement this approach. We demonstrate two client analyses: one that finds objects that are expensive to construct but are not necessary for the forward execution, and second that pinpoints ultimately-dead values. We have successfully applied them to large-scale and long-running Java applications. We show that these analyses are effective at detecting operations that have unbalanced costs and benefits.

programming language design and implementation | 2009

Go with the flow: profiling copies to find runtime bloat

Guoqing Xu; Matthew Arnold; Nick Mitchell; Atanas Rountev; Gary Sevitsky

Many large-scale Java applications suffer from runtime bloat. They execute large volumes of methods, and create many temporary objects, all to execute relatively simple operations. There are large opportunities for performance optimizations in these applications, but most are being missed by existing optimization and tooling technology. While JIT optimizations struggle for a few percent, performance experts analyze deployed applications and regularly find gains of 2x or more. Finding such big gains is difficult, for both humans and compilers, because of the diffuse nature of runtime bloat. Time is spread thinly across calling contexts, making it difficult to judge how to improve performance. Bloat results from a pile-up of seemingly harmless decisions. Each adds temporary objects and method calls, and often copies values between those temporary objects. While data copies are not the entirety of bloat, we have observed that they are excellent indicators of regions of excessive activity. By optimizing copies, one is likely to remove the objects that carry copied values, and the method calls that allocate and populate them. We introduce copy profiling, a technique that summarizes runtime activity in terms of chains of data copies. A flat copy profile counts copies by method. We show how flat profiles alone can be helpful. In many cases, diagnosing a problem requires data flow context. Tracking and making sense of raw copy chains does not scale, so we introduce a summarizing abstraction called the copy graph. We implement three clients analyses that, using the copy graph, expose common patterns of bloat, such as finding hot copy chains and discovering temporary data structures. We demonstrate, with examples from a large-scale commercial application and several benchmarks, that copy profiling can be used by a programmer to quickly find opportunities for large performance gains.

international conference on software engineering | 2015

Static control-flow analysis of user-driven callbacks in Android applications

Shengqian Yang; Dacong Yan; Haowei Wu; Yan Wang; Atanas Rountev

Android software presents many challenges for static program analysis. In this work we focus on the fundamental problem of static control-flow analysis. Traditional analyses cannot be directly applied to Android because the applications are framework-based and event-driven. We consider user-event-driven components and the related sequences of callbacks from the Android framework to the application code, both for lifecycle callbacks and for event handler callbacks. We propose a program representation that captures such callback sequences. This representation is built using context-sensitive static analysis of callback methods. The analysis performs graph reachability by traversing context-compatible interprocedural control-flow paths and identifying statements that may trigger callbacks, as well as paths that avoid such statements. We also develop a client analysis that builds a static model of the applications GUI. Experimental evaluation shows that this context-sensitive approach leads to substantial precision improvements, while having practical cost.

Explore More