Ripon K. Saha | Researchain

Publication

Featured research published by Ripon K. Saha.

automated software engineering | 2013

Improving bug localization using structured information retrieval

Ripon K. Saha; Matthew Lease; Sarfraz Khurshid; Dewayne E. Perry

Locating bugs is important, difficult, and expensive, particularly for large-scale systems. To address this, natural language information retrieval techniques are increasingly being used to suggest potential faulty source files given bug reports. While these techniques are very scalable, in practice their effectiveness remains low in accurately localizing bugs to a small number of files. Our key insight is that structured information retrieval based on code constructs, such as class and method names, enables more accurate bug localization. We present BLUiR, which embodies this insight, requires only the source code and bug reports, and takes advantage of bug similarity data if available. We build BLUiR on a proven, open source IR toolkit that anyone can use. Our work provides a thorough grounding of IR-based bug localization research in fundamental IR theoretical and empirical knowledge and practice. We evaluate BLUiR on four open source projects with approximately 3,400 bugs. Results show that BLUiR matches or outperforms a current state-of-the-art tool across applications considered, even when BLUiR does not use bug similarity data used by the other tool.
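The idea of scoring code constructs separately can be illustrated with a minimal sketch. This is not BLUiR itself: the field names (`class`, `method`, `body`), the TF-IDF weighting, and the function names are assumptions made for illustration, whereas the paper builds on an existing IR toolkit. Each source file is split into fields, the bug report is scored against each field on its own, and the per-field scores are summed.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text on non-letters and camelCase boundaries, then lowercase."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        tokens += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
    return [t.lower() for t in tokens]


def rank_files(bug_report, files):
    """Rank files by the sum of per-field TF-IDF scores (hypothetical scheme).

    `files` maps a file name to a dict of field name -> field text.
    """
    query = set(tokenize(bug_report))
    field_names = sorted({f for doc in files.values() for f in doc})
    scores = dict.fromkeys(files, 0.0)
    n = len(files)
    for field in field_names:
        # Tokenize this field for every file and compute field-local IDF.
        tokenized = {name: tokenize(doc.get(field, "")) for name, doc in files.items()}
        df = Counter(t for toks in tokenized.values() for t in set(toks))
        idf = {t: math.log(1 + n / df[t]) for t in df}
        for name, toks in tokenized.items():
            tf = Counter(toks)
            scores[name] += sum(tf[t] * idf.get(t, 0.0) for t in query)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, invented for this example.
files = {
    "ConfigParser.java": {
        "class": "ConfigParser",
        "method": "parseFile readValue",
        "body": "reads configuration values from a file",
    },
    "HttpClient.java": {
        "class": "HttpClient",
        "method": "sendRequest openConnection",
        "body": "sends a request and may fail to parse the response",
    },
}
ranking = rank_files("Parser crashes when the config file is empty", files)
```

Because class and method names are scored as their own fields, a match on `ConfigParser` is not diluted by the length of the file body, which is the intuition the abstract describes.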

acm symposium on applied computing | 2012

Comparative stability of cloned and non-cloned code: an empirical study

Manishankar Mondal; Chanchal K. Roy; Md. Saidur Rahman; Ripon K. Saha; Jens Krinke; Kevin A. Schneider

Code cloning is a controversial software engineering practice due to contradictory claims regarding its effect on software maintenance. Code stability is a recently introduced measurement technique that has been used to determine the impact of code cloning by quantifying the changeability of a code region. Although most of the existing stability analysis studies agree that cloned code is more stable than non-cloned code, the studies have two major flaws: (i) each study only considered a single stability measurement (e.g., lines of code changed, frequency of change, age of change); and (ii) only a small number of subject systems were analyzed and these were of limited variety. In this paper, we present a comprehensive empirical study on code stability using three different stability measuring methods. We use a recently introduced hybrid clone detection tool, NiCad, to detect the clones and analyze their stability in four dimensions: by clone type, by measuring method, by programming language, and by system size and age. Our four-dimensional investigation on 12 diverse subject systems written in three programming languages considering three clone types reveals that: (i) Type-1 and Type-2 clones are unstable, but Type-3 clones are not; (ii) clones in Java and C systems are not as stable as clones in C# systems; (iii) a system's development strategy might play a key role in defining its comparative code stability scenario; and (iv) cloned and non-cloned regions of a subject system do not follow a consistent change pattern.

source code analysis and manipulation | 2010

Evaluating Code Clone Genealogies at Release Level: An Empirical Study

Ripon K. Saha; Muhammad Asaduzzaman; Minhaz F. Zibran; Chanchal K. Roy; Kevin A. Schneider

Code clone genealogies show how clone groups evolve with the evolution of the associated software system, and thus could provide important insights on the maintenance implications of clones. In this paper, we provide an in-depth empirical study for evaluating clone genealogies in evolving open source systems at the release level. We develop a clone genealogy extractor, examine 17 open source C, Java, C++ and C# systems of diverse varieties, and study different dimensions of how clone groups evolve with the evolution of the software systems. Our study shows that the majority of the clone groups in the clone genealogies either propagate without any syntactic changes or change consistently in subsequent releases, and that many of the genealogies remain alive during the evolution. These findings seem to be consistent with the findings of a previous study that clones may not be as detrimental to software maintenance as is believed (at least by many of us), and that instead of aggressively refactoring clones, we should possibly focus on tracking and managing clones during the evolution of software systems.

international conference on software maintenance | 2011

An automatic framework for extracting and classifying near-miss clone genealogies

Ripon K. Saha; Chanchal K. Roy; Kevin A. Schneider

Extracting code clone genealogies across multiple versions of a program and classifying them according to their change patterns underlies the study of code clone evolution. While there are a few studies in the area, the approaches do not handle near-miss clones well and the associated tools are often computationally expensive. To address these limitations, we present a framework for automatically extracting both exact and near-miss clone genealogies across multiple versions of a program and for identifying their change patterns using a few key similarity factors. We have developed a prototype clone genealogy extractor, applied it to three open source projects including the Linux Kernel, and evaluated its accuracy in terms of precision and recall. Our experience shows that the prototype is scalable, adaptable to different clone detection tools, and can automatically identify evolution patterns of both exact and near-miss clones by constructing their genealogies.

international conference on engineering of complex computer systems | 2011

Analyzing and Forecasting Near-Miss Clones in Evolving Software: An Empirical Study

Minhaz F. Zibran; Ripon K. Saha; Muhammad Asaduzzaman; Chanchal K. Roy

The effort required to develop and maintain large, complex software is believed to depend on the amount of duplicated code fragments (code clones) present in code bases. For example, clones need to be carefully and consistently maintained and/or refactored to prevent accidental error propagation. Thus it is important to understand the proportion and evolution of clones in evolving software systems for purposes such as cost estimation. This paper presents a study on the evolution of near-miss clones at release level in medium to large open source software systems of different types (operating systems, database systems, editors, etc.) written in three different programming languages, namely C, C#, and Java. Using a hybrid clone detector, NiCad, we detected both exact and near-miss clones at different levels of similarity. Applying statistical methods, we investigated, from different dimensions, the evolution of both exact and near-miss clones, and also forecasted the amount of clones in future releases of the software systems. Our study offers significant insights into the existence and evolution of code clones and their relationships with programming language or paradigm and program size.

international conference on software engineering | 2015

An information retrieval approach for regression test prioritization based on program changes

Ripon K. Saha; Lingming Zhang; Sarfraz Khurshid; Dewayne E. Perry

Regression testing is widely used in practice for validating program changes. However, running large regression suites can be costly. Researchers have developed several techniques for prioritizing tests such that the higher-priority tests have a higher likelihood of finding bugs. A vast majority of these techniques are based on dynamic analysis, which can be precise but can also have significant overhead (e.g., for program instrumentation and test-coverage collection). We introduce a new approach, REPiR, to address the problem of regression test prioritization by reducing it to a standard Information Retrieval problem such that the differences between two program versions form the query and the tests constitute the document collection. REPiR does not require any dynamic profiling or static program analysis. As an enabling technology we leverage the open-source IR toolkit Indri. An empirical evaluation using eight open-source Java projects shows that REPiR is computationally efficient and performs better than existing (dynamic or static) techniques for the majority of subject systems.
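The reduction described above, where the program differences are the query and the tests are the documents, can be sketched in a few lines. This is an illustrative stand-in rather than REPiR: it uses plain term-frequency cosine similarity instead of the Indri toolkit, and the function names and toy inputs are hypothetical.

```python
import math
import re
from collections import Counter


def changed_lines(unified_diff):
    """Keep only added/removed lines of a unified diff, dropping file headers."""
    kept = []
    for line in unified_diff.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header lines, not code changes
        if line.startswith(("+", "-")):
            kept.append(line[1:])
    return "\n".join(kept)


def terms(text):
    """Count lowercased identifier-like terms in a piece of text."""
    return Counter(t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def prioritize(unified_diff, tests):
    """Order test names by textual similarity to the changed lines (query)."""
    query = terms(changed_lines(unified_diff))
    return sorted(tests, key=lambda name: cosine(query, terms(tests[name])), reverse=True)


# Toy inputs, invented for this example.
diff = """--- a/Cart.java
+++ b/Cart.java
@@ -10,3 +10,3 @@
 public int total() {
-    return sum(prices);
+    return sum(prices) - discount;
 }
"""
tests = {
    "testLogin": "login(user, password); assertTrue(session.active)",
    "testCartTotal": "cart.add(item); assertEquals(90, cart.total() - discount)",
}
order = prioritize(diff, tests)
```

No test is executed or instrumented to produce the ordering; only the text of the diff and of the tests is used, which is why the approach avoids the overhead of coverage collection.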

mining software repositories | 2013

Understanding the evolution of Type-3 clones: An exploratory study

Ripon K. Saha; Chanchal K. Roy; Kevin A. Schneider; Dewayne E. Perry

Understanding the evolution of clones is important both for understanding the maintenance implications of clones and for building a robust clone management system. To this end, researchers have already conducted a number of studies to analyze the evolution of clones, mostly focusing on Type-1 and Type-2 clones. However, although there are a significant number of Type-3 clones in software systems, we know little about how they actually evolve. In this paper, we perform an exploratory study on the evolution of Type-1, Type-2, and Type-3 clones in six open source software systems written in two different programming languages and compare the result with a previous study to better understand the evolution of Type-3 clones. Our results show that although Type-3 clones are more likely to change inconsistently, the absolute number of consistently changed Type-3 clone classes is higher than that of Type-1 and Type-2. Type-3 clone classes also have a lifespan similar to that of Type-1 and Type-2 clones. In addition, a considerable number of Type-1 and Type-2 clones convert into Type-3 clones during evolution. Therefore, it is important to manage Type-3 clones properly to limit their negative impact. However, various automated clone management techniques such as notifying developers about clone changes or linked editing should be chosen carefully due to the inconsistent nature of Type-3 clones.

international conference on program comprehension | 2011

An Empirical Study of the Impacts of Clones in Software Maintenance

Manishankar Mondal; Md. Saidur Rahman; Ripon K. Saha; Chanchal K. Roy; Jens Krinke; Kevin A. Schneider

Whether clones are beneficial or harmful to software maintenance is the subject of a long-running debate. Some researchers argue that clones lead to additional changes during the maintenance phase and thus increase the overall maintenance effort. Moreover, they note that inconsistent changes to clones may introduce faults during evolution. On the other hand, other researchers argue that cloned code exhibits more stability than non-cloned code. Studies resulting in such contradictory outcomes may be a consequence of using different methodologies, using different clone detection tools, defining different impact assessment metrics, and evaluating different subject systems. In order to understand the conflicting results from the studies, we plan to conduct a comprehensive empirical study using a common framework incorporating nine existing methods that yielded mostly contradictory findings. Our research strategy involves implementing each of these methods using four clone detection tools and evaluating the methods on more than fifteen subject systems of different languages and of a diverse nature. We believe that our study will help eliminate tool and study biases to resolve conflicts regarding the impacts of clones on software maintenance.

conference on software maintenance and reengineering | 2014

An empirical study of long lived bugs

Ripon K. Saha; Sarfraz Khurshid; Dewayne E. Perry

Bug fixing is a crucial part of software development and maintenance. A large number of bugs often indicates poor software quality, since buggy behavior not only causes failures that may be costly but also has a detrimental effect on the user's overall experience with the software product. The impact of long lived bugs can be even more critical, since experiencing the same bug version after version can be particularly frustrating for users. While there are many studies that investigate factors affecting bug fixing time for entire bug repositories, to the best of our knowledge, none of these studies investigates the extent of and reasons for long lived bugs. In this paper, we analyzed long lived bugs from five different perspectives: their proportion, severity, assignment, reasons, as well as the nature of fixes. Our study on four open-source projects shows that there are a considerable number of long lived bugs in each system and over 90% of them adversely affect the user's experience. The reasons for these long lived bugs are diverse, including long assignment time, not understanding their importance in advance, etc. However, many bug fixes were delayed without any specific reason. Our analysis of bug fixing changes further shows that many long lived bugs can be fixed quickly through careful prioritization. We believe our results will help both developers and researchers to better understand factors behind delays, improve the overall bug fixing process, and investigate analytical approaches for prioritizing bugs based on bug severity as well as expected bug fixing effort.

foundations of software engineering | 2013

Toward understanding the causes of unanswered questions in software information sites: a case study of Stack Overflow

Ripon K. Saha; Avigit K. Saha; Dewayne E. Perry

Stack Overflow is a highly successful question-answering website in the programming community, which not only provides quick solutions to programmers' questions but is also considered a large repository of valuable software engineering knowledge. However, despite having a very engaged and active user community, Stack Overflow currently has more than 300K unanswered questions. In this paper, we perform an initial investigation to understand why these questions remain unanswered by applying a combination of statistical and data mining techniques. Our preliminary results indicate that although there are some topics that were never answered, most questions remained unanswered because they apparently are of little interest to the user community.
