Dave W. Binkley | Researchain

Publication

Featured research published by Dave W. Binkley.

Empirical Software Engineering | 2013

The impact of identifier style on effort and comprehension

Dave W. Binkley; Marcia H. Davis; Dawn J. Lawrie; Jonathan I. Maletic; Christopher H. Morrell; Bonita Sharif

A family of studies investigating the impact of program identifier style on human comprehension is presented. Two popular identifier styles are examined, namely camel case and underscore. The underlying hypothesis is that identifier style affects the speed and accuracy of comprehending source code. To investigate this hypothesis, five studies were designed and conducted. The first study, which investigates how well humans read identifiers in the two different styles, focuses on low-level readability issues. The remaining four studies build on the first to focus on the semantic implications of identifier style. The studies involve 150 participants with varied demographics from two different universities. A range of experimental methods is used in the studies, including timed testing, read aloud, and eye tracking. These methods produce a broad set of measurements, and appropriate statistical methods, such as regression models and Generalized Linear Mixed Models (GLMMs), are applied to analyze the results. While unexpected, the results demonstrate that the tasks of reading and comprehending source code are fundamentally different from those of reading and comprehending natural language. Furthermore, as the task becomes similar to reading prose, the results become similar to work on reading natural language text. For more “source focused” tasks, experienced software developers appear to be less affected by identifier style; however, beginners benefit from the use of camel casing with respect to accuracy and effort.

Empirical Software Engineering | 2015

Are test smells really harmful? An empirical study

Gabriele Bavota; Abdallah Qusef; Andrea De Lucia; Dave W. Binkley

Bad code smells have been defined as indicators of potential problems in source code. Techniques to identify and mitigate bad code smells have been proposed and studied. Recently, bad test code smells (test smells for short) have been put forward as a kind of bad code smell specific to tests such as unit tests. What has been missing is empirical investigation into the prevalence and impact of bad test code smells. Two studies aimed at providing this missing empirical data are presented. The first study finds that there is a high diffusion of test smells in both open source and industrial software systems, with 86% of JUnit tests exhibiting at least one test smell and six tests having six distinct test smells. The second study provides evidence that test smells have a strong negative impact on program comprehension and maintenance. Highlights from this second study include the finding that comprehension is 30% better in the absence of test smells.

Journal of Systems and Software | 2014

Recovering test-to-code traceability using slicing and textual analysis

Abdallah Qusef; Gabriele Bavota; Andrea De Lucia; Dave W. Binkley

Test suites are a valuable source of up-to-date documentation as developers continuously modify them to reflect changes in the production code and preserve an effective regression suite. While maintaining traceability links between unit tests and the classes under test can be useful to selectively retest code after a change, the value of having traceability links goes far beyond this potential savings. One key use is to help developers better comprehend the dependencies between tests and classes and help maintain consistency during refactoring. Despite its importance, test-to-code traceability is not common in software development and, when needed, traceability information has to be recovered during software development and evolution. We propose an advanced approach, named SCOTCH+ (Source code and COncept based Test to Code traceability Hunter), to support the developer during the identification of links between unit tests and tested classes. Given a test class, represented by a JUnit class, the approach first exploits dynamic slicing to identify a set of candidate tested classes. Then, external and internal textual information associated with the classes retrieved by slicing is analyzed to refine this set of classes and identify the final set of candidate tested classes. The external information is derived from the analysis of the class name, while internal information is derived from identifiers and comments. The approach is evaluated on five software systems. The results indicate that the accuracy of the proposed approach far exceeds the leading techniques found in the literature.

International Conference on Software Maintenance | 2001

An implementation of and experiment with semantic differencing

Dave W. Binkley; R. Capellini; L.R. Raszewski; C. Smith

Software maintainers face a wide range of difficult tasks including impact analysis and regression testing. Understanding semantic relationships, such as the semantic cohesiveness in a program or the semantic differences between two programs, can help a maintainer address these problems. However, semantic analysis is a difficult problem. For example, few semantic differencing algorithms and even fewer implementations exist. The first semantic differencing implementation for the C language is presented and studied. A large collection of semantic differences for 10 programs is computed. The average size reduction was 37.70%. The study presented illustrates the practicality of semantic differencing. Finally, the application of semantic differencing in the area of program testing and impact analysis is considered.

International Conference on Software Maintenance | 2012

Vocabulary normalization improves IR-based concept location

Dave W. Binkley; Dawn J. Lawrie; Christopher Uehlinger

Tool support is crucial to modern software development, evolution, and maintenance. Early tools reused the static analysis performed by the compiler. These were followed by dynamic analysis tools and more recently tools that exploit natural language. This latter class has the advantage that it can incorporate not only the code, but artifacts from all phases of software construction and its subsequent evolution. Unfortunately, the natural language found in source code often uses a vocabulary different from that used in other software artifacts and thus increases the vocabulary mismatch problem. This problem exists because many natural-language tools imported from Information Retrieval (IR) and Natural Language Processing (NLP) implicitly assume the use of a single natural language vocabulary. Vocabulary normalization, which goes well beyond simple identifier splitting, brings the vocabulary of the source into line with other artifacts. Consequently, it is expected to improve the performance of existing and future IR and NLP based tools. As a case study, an experiment with an LSI-based feature locator is replicated. Normalization universally improves performance. For the tersest queries, this improvement is over 180% (p < 0.0001).
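For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of "simple identifier splitting" only. It is not the paper's normalization technique, which the abstract says goes well beyond this step; the function name `split_identifier` and the regular expression are illustrative assumptions.

```python
import re


def split_identifier(identifier: str) -> list[str]:
    """Split an identifier on underscores and camel-case boundaries.

    Hypothetical helper for illustration; it performs splitting only and
    does no abbreviation expansion or other vocabulary normalization.
    """
    words = []
    for part in re.split(r"_+", identifier):
        # Alternatives, in order: an acronym run not followed by a lowercase
        # letter, an optionally capitalized lowercase word, or a digit run.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [word.lower() for word in words]


print(split_identifier("parseHTTPResponse"))
print(split_identifier("max_file_size"))
```

Splitting leaves abbreviations and acronyms untouched, which is one way the vocabulary of source code can still differ from that of other artifacts after this step.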

Mining Software Repositories | 2016

Improving change recommendation using aggregated association rules

Thomas Rolfsnes; Leon Moonen; Stefano Di Alesio; Razieh Behjati; Dave W. Binkley

Past research has proposed association rule mining as a means to uncover the evolutionary coupling from a system’s change history. These couplings have various applications, such as improving system decomposition and recommending related changes during development. The strength of the coupling can be characterized using a variety of interestingness measures. Existing recommendation engines typically use only the rule with the highest interestingness value in situations where more than one rule applies. In contrast, we argue that multiple applicable rules indicate increased evidence, and hypothesize that the aggregation of such rules can be exploited to provide more accurate recommendations. To investigate this hypothesis we conduct an empirical study on the change histories of two large industrial systems and four large open source systems. As aggregators we adopt three cumulative gain functions from information retrieval. The experiments evaluate the three using 39 different rule interestingness measures. The results show that aggregation provides a significant impact on most measures’ values and, furthermore, leads to a significant improvement in the resulting recommendation.
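The contrast between keeping only the strongest rule and aggregating all applicable rules can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the abstract does not name its three aggregators, so plain cumulative gain and a log2-discounted cumulative gain are used here as stand-ins, and the rule values and file names are invented.

```python
import math
from collections import defaultdict


def rank(rules, score):
    """Rank consequents by `score` applied to the values of their rules.

    `rules` is a list of (consequent, interestingness) pairs for the rules
    that apply to the current change (hypothetical input format).
    """
    grouped = defaultdict(list)
    for consequent, value in rules:
        grouped[consequent].append(value)
    scored = {c: score(values) for c, values in grouped.items()}
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda c: (-scored[c], c))


def cumulative_gain(values):
    """Sum the evidence from every applicable rule."""
    return sum(values)


def discounted_cumulative_gain(values):
    """Sum the evidence, discounting weaker rules by log2 of their rank."""
    ordered = sorted(values, reverse=True)
    return sum(v / math.log2(i + 2) for i, v in enumerate(ordered))


# Invented example: one strong rule for b.c, three weaker rules for c.c.
applicable = [("b.c", 0.6), ("c.c", 0.5), ("c.c", 0.4), ("c.c", 0.3)]

print(rank(applicable, max))                         # strongest rule only
print(rank(applicable, cumulative_gain))             # aggregated evidence
print(rank(applicable, discounted_cumulative_gain))  # discounted aggregation
```

In this invented example the maximum-only strategy ranks `b.c` first, while both aggregators rank `c.c` first because three rules jointly support it.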

IEEE International Conference on Software Analysis, Evolution, and Reengineering | 2016

Generalizing the Analysis of Evolutionary Coupling for Software Change Impact Analysis

Thomas Rolfsnes; Stefano Di Alesio; Razieh Behjati; Leon Moonen; Dave W. Binkley

Software change impact analysis aims to find artifacts potentially affected by a change. Typical approaches apply language-specific static or dynamic dependence analysis, and are thus restricted to homogeneous systems. This restriction is a major drawback given today’s increasingly heterogeneous software. Evolutionary coupling has been proposed as a language-agnostic alternative that mines relations between source-code entities from the system’s change history. Unfortunately, existing evolutionary-coupling-based techniques fall short. For example, using Singular Value Decomposition (SVD) quickly becomes computationally expensive. An efficient alternative applies targeted association rule mining, but the most widely known approach (ROSE) has restricted applicability: experiments on two large industrial systems, and four large open source systems, show that ROSE can only identify dependencies about 25% of the time. To overcome this limitation, we introduce TARMAQ, a new algorithm for mining evolutionary coupling. Empirically evaluated on the same six systems, TARMAQ performs consistently better than ROSE and SVD, is applicable 100% of the time, and runs orders of magnitude faster than SVD. We conclude that the proposed algorithm is a significant step forward towards achieving robust change impact analysis for heterogeneous systems.
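To make the idea of mining evolutionary coupling from a change history concrete, here is a generic co-change sketch using support and confidence. It is neither ROSE nor TARMAQ, whose details the abstract does not give; the function name, the `min_support` threshold, and the example history are all assumptions.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(transactions, min_support=2):
    """Return {(a, b): (support, confidence)} for rules "a changed => b changed".

    `transactions` is a list of commits, each a list of changed file names
    (hypothetical format). Only single-antecedent rules are mined.
    """
    item_count = Counter()
    pair_count = Counter()
    for transaction in transactions:
        files = set(transaction)
        item_count.update(files)
        # Ordered pairs, so a => b and b => a are counted separately.
        pair_count.update(permutations(sorted(files), 2))
    rules = {}
    for (a, b), support in pair_count.items():
        if support >= min_support:
            rules[(a, b)] = (support, support / item_count[a])
    return rules


# Invented history mixing file types, since the approach ignores language.
history = [
    ["parser.c", "parser.h"],
    ["parser.c", "parser.h", "build.xml"],
    ["parser.c", "lexer.c"],
    ["schema.sql", "build.xml"],
]
rules = mine_cochange_rules(history)
for (a, b), (support, confidence) in sorted(rules.items()):
    print(f"{a} => {b}: support={support}, confidence={confidence:.2f}")
```

Because the rules are derived only from which files changed together, nothing in the sketch depends on the programming language of the files involved.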

Source Code Analysis and Manipulation | 2016

Exploring the Effects of History Length and Age on Mining Software Change Impact

Leon Moonen; Stefano Di Alesio; Thomas Rolfsnes; Dave W. Binkley

The goal of Software Change Impact Analysis is to identify artifacts (typically source-code files) potentially affected by a change. Recently, there has been increased interest in mining software change impact based on evolutionary coupling. A particularly promising approach uses association rule mining to uncover potentially affected artifacts from patterns in the system’s change history. Two main considerations when using this approach are the history length, the number of transactions from the change history used to identify the impact of a change, and history age, the number of transactions that have occurred since patterns were last mined from the history. Although history length and age can significantly affect the quality of mining results, few guidelines exist on how to best select appropriate values for these two parameters. In this paper, we empirically investigate the effects of history length and age on the quality of change impact analysis using mined evolutionary couplings. Specifically, we report on a series of systematic experiments involving the change histories of two large industrial systems and 17 large open source systems. In these experiments, we vary the length and age of the history used to mine software change impact, and assess how this affects precision and applicability. Results from the study are used to derive practical guidelines for choosing history length and age when applying association rule mining to conduct software change impact analysis.
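The two parameters defined above can be read as selecting a window over the change history. The sketch below is one possible interpretation, assuming the history is a list of transactions ordered oldest first; the function name `mining_window` is hypothetical and does not come from the paper.

```python
def mining_window(history, length, age):
    """Select the transactions a rule miner would see.

    `history` is ordered oldest first. `age` is the number of most recent
    transactions that occurred after the last mining run, so they are
    skipped; `length` is how many transactions before that point are used.
    """
    if length < 0 or age < 0:
        raise ValueError("length and age must be non-negative")
    end = len(history) - age
    if end <= 0:
        return []  # the model predates the entire recorded history
    start = max(0, end - length)
    return history[start:end]


# Ten transactions, numbered 0 (oldest) to 9 (newest), as stand-ins.
history = list(range(10))
print(mining_window(history, length=4, age=3))
print(mining_window(history, length=3, age=0))
```

With `age=0` the window ends at the newest transaction; increasing `age` shifts the window back in time, and increasing `length` extends it toward older transactions.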

International Conference on Software Maintenance | 2013

Which Feature Location Technique is Better?

Emily Hill; Alberto Bacchelli; Dave W. Binkley; Bogdan Dit; Dawn J. Lawrie

Feature location is a fundamental step in software evolution tasks such as debugging, understanding, and reuse. Numerous automated and semi-automated feature location techniques (FLTs) have been proposed, but the question remains: How do we objectively determine which FLT is most effective? Existing evaluations frequently use bug fix data, which includes the location of the fix, but not what other code needs to be understood to make the fix. Existing evaluation measures such as precision, recall, effectiveness, mean average precision (MAP), and mean reciprocal rank (MRR) do not distinguish an FLT that ranks these related elements above completely irrelevant ones from one that does not. We propose an alternative measure of relevance based on the likelihood of a developer finding the bug fix locations from a ranked list of results. Our initial evaluation shows that by modeling user behavior, our proposed evaluation methodology can compare and evaluate FLTs fairly.

Empirical Software Engineering | 2018

Aggregating Association Rules to Improve Change Recommendation

Thomas Rolfsnes; Leon Moonen; Stefano Di Alesio; Razieh Behjati; Dave W. Binkley

As the complexity of software systems grows, it becomes increasingly difficult for developers to be aware of all the dependencies that exist between artifacts (e.g., files or methods) of a system. Change recommendation has been proposed as a technique to overcome this problem, as it suggests to a developer relevant source-code artifacts related to her changes. Association rule mining has shown promise in deriving such recommendations by uncovering relevant patterns in the system’s change history. The strength of the mined association rules is captured using a variety of interestingness measures. However, state-of-the-art recommendation engines typically use only the rule with the highest interestingness value when more than one rule applies. In contrast, we argue that when multiple rules apply, this indicates collective evidence, and aggregating those rules (and their evidence) will lead to more accurate change recommendation. To investigate this hypothesis we conduct a large empirical study of 15 open source software systems and two systems from our industry partners. We evaluate association rule aggregation using four variants of the change history for each system studied, enabling us to compare two different levels of granularity in two different scenarios. Furthermore, we study 40 interestingness measures using the rules produced by two different mining algorithms. The results show that (1) between 13 and 90% of change recommendations can be improved by rule aggregation, (2) rule aggregation almost always improves change recommendation for both algorithms and all measures, and (3) fine-grained histories benefit more from rule aggregation.
