Michele Tufano | Researchain

Publication

Featured researches published by Michele Tufano.

international conference on software engineering | 2015

When and why your code starts to smell bad

Michele Tufano; Fabio Palomba; Gabriele Bavota; Massimiliano Di Penta; Andrea De Lucia; Denys Poshyvanyk

Technical debt is a metaphor introduced by Cunningham to indicate “not quite right code which we postpone making it right”. One noticeable symptom of technical debt is represented by code smells, defined as symptoms of poor design and implementation choices. Previous studies showed the negative impact of code smells on the comprehensibility and maintainability of code. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced, what their survivability is, and how they are removed by developers. To empirically corroborate such anecdotal evidence, we conducted a large empirical study over the change history of 200 open source projects. This study required the development of a strategy to identify smell-introducing commits, the mining of over half a million commits, and the manual analysis and classification of over 10K of them. Our findings mostly contradict common wisdom, showing that most of the smell instances are introduced when an artifact is created and not as a result of its evolution. At the same time, 80 percent of smells survive in the system. Also, among the 20 percent of removed instances, only 9 percent are removed as a direct consequence of refactoring operations.

automated software engineering | 2016

Deep learning code fragments for code clone detection

Martin White; Michele Tufano; Christopher Vendome; Denys Poshyvanyk

Code clone detection is an important problem for software maintenance and evolution. Many approaches consider either structure or identifiers, but none of the existing detection techniques model both sources of information. These techniques also depend on generic, handcrafted features to represent code fragments. We introduce learning-based detection techniques where everything for representing terms and fragments in source code is mined from the repository. Our code analysis supports a framework, which relies on deep learning, for automatically linking patterns mined at the lexical level with patterns mined at the syntactic level. We evaluated our novel learning-based approach for code clone detection with respect to feasibility from the point of view of software maintainers. We sampled and manually evaluated 398 file- and 480 method-level pairs across eight real-world Java systems; 93% of the file- and method-level samples were evaluated to be true positives. Among the true positives, we found pairs mapping to all four clone types. We compared our approach to a traditional structure-oriented technique and found that our learning-based approach detected clones that were either undetected or suboptimally reported by the prominent tool Deckard. Our results affirm that our learning-based approach is suitable for clone detection and a tenable technique for researchers.

Journal of Software: Evolution and Process | 2017

There and back again: Can you compile that snapshot?

Michele Tufano; Fabio Palomba; Gabriele Bavota; Massimiliano Di Penta; Andrea De Lucia; Denys Poshyvanyk

A broken snapshot represents a snapshot from a project's change history that cannot be compiled. Broken snapshots can have significant implications for researchers, as they could hinder any analysis of the past project history that requires code to be compiled. Noticeably, while some broken snapshots may be observable in change history repositories (e.g., no longer available dependencies), some of them may not necessarily happen during the actual development. In this paper, we systematically study the compilability of broken snapshots in 219,395 snapshots belonging to 100 Java projects from the Apache Software Foundation, all relying on Maven as an automated build tool. We investigated broken snapshots from two different perspectives: (1) how frequently they happen and (2) likely causes behind them. The empirical results indicate that broken snapshots occur in most (96%) of the projects we studied and that they are mainly due to problems related to the resolution of dependencies. On average, only 38% of the change history of the analyzed systems is currently successfully compilable.

mining software repositories | 2015

Landfill: an open dataset of code smells with public evaluation

Fabio Palomba; Dario Di Nucci; Michele Tufano; Gabriele Bavota; Denys Poshyvanyk; Andrea De Lucia

Code smells are symptoms of poor design and implementation choices that may hinder code comprehension and possibly increase change- and fault-proneness of source code. Several techniques have been proposed in the literature for detecting code smells. These techniques are generally evaluated by comparing their accuracy on a set of detected candidate code smells against a manually-produced oracle. Unfortunately, such comprehensive sets of annotated code smells are not available in the literature, with only a few exceptions. In this paper we contribute (i) a dataset of 243 instances of five types of code smells identified from 20 open source software projects, (ii) a systematic procedure for validating code smell datasets, (iii) LANDFILL, a Web-based platform for sharing code smell datasets, and (iv) a set of APIs for programmatically accessing LANDFILL's contents. Anyone can contribute to Landfill by (i) improving existing datasets (e.g., adding missing instances of code smells, flagging possibly incorrectly classified instances), and (ii) sharing and posting new datasets. Landfill is available at www.sesa.unisa.it/landfill/, while the video demonstrating its features in action is available at http://www.sesa.unisa.it/tools/landfill.jsp.

foundations of software engineering | 2017

Enabling mutation testing for Android apps

Mario Linares-Vásquez; Gabriele Bavota; Michele Tufano; Kevin Moran; Massimiliano Di Penta; Christopher Vendome; Carlos Bernal-Cárdenas; Denys Poshyvanyk

Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires “traditional” operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262 types of Android faults grouped in 14 categories by manually analyzing 2,023 software artifacts from different sources (e.g., bug reports, commits). Then, we identified a set of 38 mutation operators, and implemented an infrastructure to automatically seed mutations in Android apps with 35 of the identified operators. The taxonomy and the proposed operators have been evaluated in terms of stillborn/trivial mutants generated as compared to well-known mutation tools, and their capacity to represent real faults in Android apps.

Journal of Systems and Software | 2017

How developers micro-optimize Android apps

Mario Linares-Vásquez; Christopher Vendome; Michele Tufano; Denys Poshyvanyk

Optimizing mobile apps early on in the development cycle is supposed to be a key strategy for obtaining higher user rankings, more downloads, and higher retention. In fact, mobile platform designers publish specific guidelines and tools aimed at optimizing apps. However, little research has been done with respect to identifying and understanding actual optimization practices performed by developers. In this paper, we present the results of three empirical studies aimed at investigating practices of Android developers towards improving the performance of their apps by means of micro-optimizations. We mined change histories of 3513 apps to identify the most frequent micro-optimization opportunities in 297K+ snapshots and to understand if (and when) developers implement these optimizations. Then, we performed an in-depth analysis into whether implementing micro-optimizations can help reduce memory/CPU usage. Finally, we conducted a survey with 389 open-source developers to understand how they use micro-optimizations to improve the performance of Android apps. Surprisingly, our results indicate that developers rarely implement micro-optimizations. Also, the impact of the analyzed micro-optimizations on CPU/memory consumption is negligible in most of the cases. Finally, the results from the survey shed some light on why this happens as well as on which practices developers rely upon.

international conference on software engineering | 2018

MDroid+: a mutation testing framework for Android

Kevin Moran; Michele Tufano; Carlos Bernal-Cárdenas; Mario Linares-Vásquez; Gabriele Bavota; Christopher Vendome; Massimiliano Di Penta; Denys Poshyvanyk

Mutation testing has shown great promise in assessing the effectiveness of test suites while exhibiting additional applications to test-case generation, selection, and prioritization. Traditional mutation testing typically utilizes a set of simple, language-specific source code transformations, called operators, to introduce faults. However, empirical studies have shown that for mutation testing to be most effective, these simple operators must be augmented with operators specific to the domain of the software under test. One challenging software domain for the application of mutation testing is that of mobile apps. While mobile devices and accompanying apps have become a mainstay of modern computing, the frameworks and patterns utilized in their development make testing and verification particularly difficult. As a step toward helping to measure and ensure the effectiveness of mobile testing practices, we introduce MDroid+, an automated framework for mutation testing of Android apps. MDroid+ includes 38 mutation operators from ten empirically derived types of Android faults and has been applied to generate over 8,000 mutants for more than 50 apps.

mining software repositories | 2018

Deep learning similarities from different representations of source code

Michele Tufano; Cody Watson; Gabriele Bavota; Massimiliano Di Penta; Martin White; Denys Poshyvanyk

Assessing the similarity between code components plays a pivotal role in a number of Software Engineering (SE) tasks, such as clone detection, impact analysis, and refactoring. Code similarity is generally measured by relying on manually defined or hand-crafted features, e.g., by analyzing the overlap among identifiers or comparing the Abstract Syntax Trees of two code components. These features represent a best guess at what SE researchers can utilize to exploit and reliably assess code similarity for a given task. Recent work has shown, when using a stream of identifiers to represent the code, that Deep Learning (DL) can effectively replace manual feature engineering for the task of clone detection. However, source code can be represented at different levels of abstraction: identifiers, Abstract Syntax Trees, Control Flow Graphs, and Bytecode. We conjecture that each code representation can provide a different, yet orthogonal view of the same code fragment, thus enabling a more reliable detection of similarities in code. In this paper, we demonstrate how SE tasks can benefit from a DL-based approach, which can automatically learn code similarities from different representations.

international conference on software engineering | 2015

Extract package refactoring in ARIES

Fabio Palomba; Michele Tufano; Gabriele Bavota; Andrian Marcus; Denys Poshyvanyk; Andrea De Lucia

Software evolution often leads to the degradation of software design quality. In Object-Oriented (OO) systems, this often results in packages that are hard to understand and maintain, as they group together heterogeneous classes with unrelated responsibilities. In such cases, state-of-the-art re-modularization tools solve the problem by proposing a new organization of the existing classes into packages. However, as indicated by recent empirical studies, such approaches require changing thousands of lines of code to implement the new recommended modularization. In this demo, we present the implementation of an Extract Package refactoring approach in ARIES (Automated Refactoring In EclipSe), a tool supporting refactoring operations in Eclipse. Unlike state-of-the-art approaches, ARIES automatically identifies and removes single low-cohesive packages from software systems, which represent localized design flaws in the package organization, with the aim of incrementally improving the overall quality of the software modularization.

international conference on program comprehension | 2018

Towards just-in-time refactoring recommenders

Jevgenija Pantiuchina; Gabriele Bavota; Michele Tufano; Denys Poshyvanyk

Empirical studies have provided ample evidence that low code quality is generally associated with lower maintainability. For this reason, tools have been developed to automatically detect design flaws (e.g., code smells). However, these tools are not able to prevent the introduction of design flaws. This means that the code has to experience a quality decay (with a consequent increase of maintenance/evolution costs) before state-of-the-art tools can be applied to identify and refactor the design flaws. Our goal is to develop a new generation of refactoring recommenders aimed at preventing, via refactoring operations, the introduction of design flaws rather than fixing them once they already affect the system. We refer to such a novel perspective on software refactoring as just-in-time refactoring. In this paper, we make a first step towards this direction, presenting an approach able to predict which classes will be affected in the future by code smells.
