Pamela Bhattacharya | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Pamela Bhattacharya is active.

Explore More

Publication

Featured researches published by Pamela Bhattacharya.

international conference on software engineering | 2012

Graph-based analysis and prediction for software evolution

Pamela Bhattacharya; Marios Iliofotou; Iulian Neamtiu; Michalis Faloutsos

We exploit recent advances in analysis of graph topology to better understand software evolution, and to construct predictors that facilitate software development and maintenance. Managing an evolving, collaborative software system is a complex and expensive process, which still cannot ensure software reliability. Emerging techniques in graph mining have revolutionized the modeling of many complex systems and processes. We show how we can use a graph-based characterization of a software system to capture its evolution and facilitate development, by helping us estimate bug severity, prioritize refactoring efforts, and predict defect-prone releases. Our work consists of three main thrusts. First, we construct graphs that capture software structure at two different levels: (a) the product, i.e., source code and module level, and (b) the process, i.e., developer collaboration level. We identify a set of graph metrics that capture interesting properties of these graphs. Second, we study the evolution of eleven open source programs, including Firefox, Eclipse, MySQL, over the lifespan of the programs, typically a decade or more. Third, we show how our graph metrics can be used to construct predictors for bug severity, high-maintenance software parts, and failure-prone releases. Our work strongly suggests that using graph topology analysis concepts can open many actionable avenues in software engineering research and practice.

international conference on software maintenance | 2010

Fine-grained incremental learning and multi-feature tossing graphs to improve bug triaging

Pamela Bhattacharya; Iulian Neamtiu

Software bugs are inevitable and bug fixing is a difficult, expensive, and lengthy process. One of the primary reasons why bug fixing takes so long is the difficulty of accurately assigning a bug to the most competent developer for that bug kind or bug class. Assigning a bug to a potential developer, also known as bug triaging, is a labor-intensive, time-consuming and fault-prone process if done manually. Moreover, bugs frequently get reassigned to multiple developers before they are resolved, a process known as bug tossing. Researchers have proposed automated techniques to facilitate bug triaging and reduce bug tossing using machine learning-based prediction and tossing graphs. While these techniques achieve good prediction accuracy for triaging and reduce tossing paths, they are vulnerable to several issues: outdated training sets, inactive developers, and imprecise, single-attribute tossing graphs. In this paper we improve triaging accuracy and reduce tossing path lengths by employing several techniques such as refined classification using additional attributes and intra-fold updates during training, a precise ranking function for recommending potential tossees in tossing graphs, and multi-feature tossing graphs. We validate our approach on two large software projects, Mozilla and Eclipse, covering 856,259 bug reports and 21 cumulative years of development. We demonstrate that our techniques can achieve up to 83.62% prediction accuracy in bug triaging. Moreover, we reduce tossing path lengths to 1.5–2 tosses for most bugs, which represents a reduction of up to 86.31% compared to original tossing paths. Our improvements have the potential to significantly reduce the bug fixing effort, especially in the context of sizable projects with large numbers of testers and developers.

mining software repositories | 2011

Bug-fix time prediction models: can we do better?

Pamela Bhattacharya; Iulian Neamtiu

Predicting bug-fix time is useful in several areas of software evolution, such as predicting software quality or coordinating development effort during bug triaging. Prior work has proposed bug-fix time prediction models that use various bug report attributes (e.g., number of developers who participated in fixing the bug, bug severity, number of patches, bug-openers reputation) for estimating the time it will take to fix a newly-reported bug. In this paper we take a step towards constructing more accurate and more general bug-fix time prediction models by showing how existing models fail to validate on large projects widely-used in bug studies. In particular, we used multivariate and univariate regression testing to test the prediction significance of existing models on 512,474 bug reports from five open source projects: Eclipse, Chrome and three products from the Mozilla project (Firefox, Seamonkey and Thunderbird). The results of our regression testing indicate that the predictive power of existing models is between 30% and 49% and that there is a need for more independent variables (attributes) when constructing a prediction model. Additionally, we found that, unlike in prior recent studies on commercial software, in the projects we examined there is no correlation between bug-fix likelihood, bug-openers reputation and the time it takes to fix a bug. These findings indicate three open research problems: (1) assessing whether prioritizing bugs using bug-openers reputation is beneficial, (2) identifying attributes which are effective in predicting bug-fix time, and (3) constructing bug-fix time prediction models which can be validated on multiple projects.

Journal of Systems and Software | 2012

Automated, highly-accurate, bug assignment using machine learning and tossing graphs

Pamela Bhattacharya; Iulian Neamtiu; Christian R. Shelton

Empirical studies indicate that automating the bug assignment process has the potential to significantly reduce software evolution effort and costs. Prior work has used machine learning techniques to automate bug assignment but has employed a narrow band of tools which can be ineffective in large, long-lived software projects. To redress this situation, in this paper we employ a comprehensive set of machine learning tools and a probabilistic graph-based model (bug tossing graphs) that lead to highly-accurate predictions, and lay the foundation for the next generation of machine learning-based bug assignment. Our work is the first to examine the impact of multiple machine learning dimensions (classifiers, attributes, and training history) along with bug tossing graphs on prediction accuracy in bug assignment. We validate our approach on Mozilla and Eclipse, covering 856,259 bug reports and 21 cumulative years of development. We demonstrate that our techniques can achieve up to 86.09% prediction accuracy in bug assignment and significantly reduce tossing path lengths. We show that for our data sets the Naive Bayes classifier coupled with product-component features, tossing graphs and incremental learning performs best. Next, we perform an ablative analysis by unilaterally varying classifiers, features, and learning model to show their relative importance of on bug assignment accuracy. Finally, we propose optimization techniques that achieve high prediction accuracy while reducing training and prediction time.

conference on software maintenance and reengineering | 2013

An Empirical Analysis of Bug Reports and Bug Fixing in Open Source Android Apps

Pamela Bhattacharya; Liudmila Ulanova; Iulian Neamtiu; Sai Charan Koduru

Smartphone platforms and applications (apps) have gained tremendous popularity recently. Due to the novelty of the smartphone platform and tools, and the low barrier to entry for app distribution, apps are prone to errors, which affects user experience and requires frequent bug fixes. An essential step towards correcting this situation is understanding the nature of the bugs and bug-fixing processes associated with smartphone platforms and apps. However, prior empirical bug studies have focused mostly on desktop and server applications. Therefore, in this paper, we perform an in-depth empirical study on bugs in the Google Android smartphone platform and 24 widely-used open-source Android apps from diverse categories such as communication, tools, and media. Our analysis has three main thrusts. First, we define several metrics to understand the quality of bug reports and analyze the bug-fix process, including developer involvement. Second, we show how differences in bug life-cycles can affect the bug-fix process. Third, as Android devices carry significant amounts of security-sensitive information, we perform a study of Android security bugs. We found that, although contributor activity in these projects is generally high, developer involvement decreases in some projects, similarly, while bug-report quality is high, bug triaging is still a problem. Finally, we observe that in Android apps, security bug reports are of higher quality but get fixed slower than non-security bugs. We believe that the findings of our study could potentially benefit both developers and users of Android apps.

international conference on software engineering | 2011

Assessing programming language impact on development and maintenance: a study on c and c++

Pamela Bhattacharya; Iulian Neamtiu

Billions of dollars are spent every year for building and maintaining software. To reduce these costs we must identify the key factors that lead to better software and more productive development. One such key factor, and the focus of our paper, is the choice of programming language. Existing studies that analyze the impact of choice of programming language suffer from several deficiencies with respect to methodology and the applications they consider. For example, they consider applications built by different teams in different languages, hence fail to control for developer competence, or they consider small-sized, infrequently-used, short-lived projects. We propose a novel methodology which controls for development process and developer competence, and quantifies how the choice of programming language impacts software quality and developer productivity. We conduct a study and statistical analysis on a set of long-lived, widely-used, open source projects - Firefox, Blender, VLC, and MySQL. The key novelties of our study are: (1) we only consider projects which have considerable portions of development in two languages, C and C++, and (2) a majority of developers in these projects contribute to both C and C++ code bases. We found that using C++ instead of C results in improved software quality and reduced maintenance effort, and that code bases are shifting from C to C++. Our methodology lays a solid foundation for future studies on comparative advantages of particular programming languages.

mining software repositories | 2012

Mining challenge 2012: the Android platform

Emad Shihab; Yasutaka Kamei; Pamela Bhattacharya

The MSR Challenge offers researchers and practitioners in the area of Mining Software Repositories a common data set and asks them to put their mining tools and approaches on a dare. This year, the challenge is on the Android platform. We provided the change and bug report data for the Android platform asked researchers to uncover interesting findings related to the Android platform. In this paper, we describe the role of the MSR Challenge, highlight the data provided and summarize the papers accepted for inclusion in this years challenge.

Proceedings of the 2010 Workshop on Analysis and Programming Languages for Web Applications and Cloud Applications | 2010

Dynamic updates for web and cloud applications

Pamela Bhattacharya; Iulian Neamtiu

The center of mass for newly-released applications is shifting from traditional, desktop or server programs, toward web and cloud computing applications. This shift is favorable to end-users, but puts additional burden on application developers and service providers. In particular, the newly emerging development methodologies, based on dynamic languages and multi-tier setups, complicate tasks such as verification and require end-to-end, rather than program-local guarantees. Moreover, service providers need to provide continuous service while accommodating the fast evolution pace characteristic of web and cloud applications. A promising approach for providing uninterrupted service while keeping applications up-to-date is to permit dynamic software updates, i.e., applying dynamic patches to running programs. In this paper we focus on safe dynamic updates for web and cloud applications; we point out difficulties associated with dynamic updates for these applications, present some of our preliminary results, and lay out directions for future work.

international conference on software engineering | 2011

Using software evolution history to facilitate development and maintenance

Pamela Bhattacharya

Much research in software engineering have been focused on improving software quality and automating the maintenance process to reduce software costs and mitigating complications associated with the evolution process. Despite all these efforts, there are still high cost and effort associated with software bugs and software maintenance, software still continues to be unreliable, and software bugs can wreak havoc on software producers and consumers alike. My dissertation aims to advance the state-of-art in software evolution research by designing tools that can measure and predict software quality and to create integrated frameworks that helps in improving software maintenance and research that involves mining software repositories.

Proceedings of the 3rd International Workshop on Search-Driven Development: Users, Infrastructure, Tools, and Evaluation | 2011

A prolog-based framework for search, integration and empirical analysis on software evolution data

Pamela Bhattacharya; Iulian Neamtiu

Software projects use different repositories for storing project and evolution information such as source code, bugs and patches. An integrated system that combines these multiple repositories and can answer a broad range of queries regarding the projects evolution history would be beneficial to both software developers and researchers. For example, the list of source code changes or the list of developers associated with a bug fix are frequent queries for both developers and researchers. Integrating and gathering this information is a tedious, cumbersome, error-prone process when done manually, especially for large projects. Previous approaches to this problem use frameworks that limit the user to a set of pre-defined query templates, or use query languages with limited power. In this paper, we argue the need for a framework built with recursively enumerable languages, that can answer temporal queries, and supports negation and recursion. As a first step toward such a framework, we present a Prolog-based system that we built, along with an evaluation of real-world integrated data from the Firefox project. Our system allows for elegant and concise, yet powerful queries, and can be used by developers and researchers for frequent development and empirical analysis tasks.

Explore More