Annibale Panichella | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Annibale Panichella is active.

Explore More

Publication

Featured researches published by Annibale Panichella.

IEEE Transactions on Software Engineering | 2018

Automated Test Case Generation as a Many-Objective Optimisation Problem with Dynamic Selection of the Targets

Annibale Panichella; Fitsum Meshesha Kifetew; Paolo Tonella

The test case generation is intrinsically a multi-objective problem, since the goal is covering multiple test targets (e.g., branches). Existing search-based approaches either consider one target at a time or aggregate all targets into a single fitness function (whole-suite approach). Multi and many-objective optimisation algorithms (MOAs) have never been applied to this problem, because existing algorithms do not scale to the number of coverage objectives that are typically found in real-world software. In addition, the final goal for MOAs is to find alternative trade-off solutions in the objective space, while in test generation the interesting solutions are only those test cases covering one or more uncovered targets. In this paper, we present Dynamic Many-Objective Sorting Algorithm (DynaMOSA), a novel many-objective solver specifically designed to address the test case generation problem in the context of coverage testing. DynaMOSA extends our previous many-objective technique Many-Objective Sorting Algorithm (MOSA) with dynamic selection of the coverage targets based on the control dependency hierarchy. Such extension makes the approach more effective and efficient in case of limited search budget. We carried out an empirical study on 346 Java classes using three coverage criteria (i.e., statement, branch, and strong mutation coverage) to assess the performance of DynaMOSA with respect to the whole-suite approach (WS), its archive-based variant (WSA) and MOSA. The results show that DynaMOSA outperforms WSA in 28 percent of the classes for branch coverage (+8 percent more coverage on average) and in 27 percent of the classes for mutation coverage (+11 percent more killed mutants on average). It outperforms WS in 51 percent of the classes for statement coverage, leading to +11 percent more coverage on average. Moreover, DynaMOSA outperforms its predecessor MOSA for all the three coverage criteria in 19 percent of the classes with +8 percent more code coverage on average.

ieee international conference on software analysis evolution and reengineering | 2017

Software-based energy profiling of Android apps: Simple, efficient and reliable?

Dario Di Nucci; Fabio Palomba; Antonio Prota; Annibale Panichella; Andy Zaidman; Andrea De Lucia

Modeling the power profile of mobile applications is a crucial activity to identify the causes behind energy leaks. To this aim, researchers have proposed hardware-based tools as well as model-based and software-based techniques to approximate the actual energy profile. However, all these solutions present their own advantages and disadvantages. Hardware-based tools are highly precise, but at the same time their use is bound to the acquisition of costly hardware components. Model-based tools require the calibration of parameters needed to correctly create a model on a specific hardware device. Software-based approaches do not need any hardware components, but they rely on battery measurements and, thus, they are hardware-assisted. These tools are cheaper and easier to use than hardware-based tools, but they are believed to be less precise. In this paper, we take a deeper look at the pros and cons of software-based solutions investigating to what extent their measurements depart from hardware-based solutions. To this aim, we propose a software-based tool named PETRA that we compare with the hardware-based MONSOON toolkit on 54 Android apps. The results show that PETRA performs similarly to MONSOON despite not using any sophisticated hardware components. In fact, in all the apps the mean relative error with respect to MONSOON is lower than 0.05. Moreover, for 95% of the analyzed methods the estimation error is within 5% of the actual values measured using the hardware-based toolkit.

2017 IEEE/ACM 10th International Workshop on Search-Based Software Testing (SBST) | 2017

Java unit testing tool competition: fifth round

Urko Rueda Molina; Fitsum Meshesha Kifetew; Annibale Panichella

We report on the advances in this sixth edition of the JUnit tool competitions. This year the contest introduces new benchmarks to assess the performance of JUnit testing tools on different types of real-world software projects. Following on the statistical analyses from the past contest work, we have extended it with the combined tools performance aiming to beat the human made tests. Overall, the 6th competition evaluates four automated JUnit testing tools taking as baseline human written test cases for the selected benchmark projects. The paper details the modifications performed to the methodology and provides full results of the competition.

international conference on software engineering | 2017

A guided genetic algorithm for automated crash reproduction

Mozhan Soltani; Annibale Panichella; Arie van Deursen

To reduce the effort developers have to make for crash debugging, researchers have proposed several solutions for automatic failure reproduction. Recent advances proposed the use of symbolic execution, mutation analysis, and directed model checking as underling techniques for post-failure analysis of crash stack traces. However, existing approaches still cannot reproduce many real-world crashes due to such limitations as environment dependencies, path explosion, and time complexity. To address these challenges, we present EvoCrash, a post-failure approach which uses a novel Guided Genetic Algorithm (GGA) to cope with the large search space characterizing real-world software programs. Our empirical study on three open-source systems shows that EvoCrash can replicate 41 (82%) of real-world crashes, 34 (89%) of which are useful reproductions for debugging purposes, outperforming the state-of-the-art in crash replication.

international conference on software engineering | 2017

PETrA: a software-based tool for estimating the energy profile of Android applications

Dario Di Nucci; Fabio Palomba; Antonio Prota; Annibale Panichella; Andy Zaidman; Andrea De Lucia

Energy efficiency is a vital characteristic of any mobile application, and indeed is becoming an important factor for user satisfaction. For this reason, in recent years several approaches and tools for measuring the energy consumption of mobile devices have been proposed. Hardware-based solutions are highly precise, but at the same time they require costly hardware toolkits. Model-based techniques require a possibly difficult calibration of the parameters needed to correctly create a model on a specific hardware device. Finally, software-based solutions are easier to use, but they are possibly less precise than hardware-based solution. In this demo, we present PETrA, a novel software-based tool for measuring the energy consumption of Android apps. With respect to other tools, PETrA is compatible with all the smartphones with Android 5.0 or higher, not requiring any device specific energy profile. We also provide evidence that our tool is able to perform similarly to hardware-based solutions.

ieee international conference on software analysis evolution and reengineering | 2017

Lightweight detection of Android-specific code smells: The aDoctor project

Fabio Palomba; Dario Di Nucci; Annibale Panichella; Andy Zaidman; Andrea De Lucia

Code smells are symptoms of poor design solutions applied by programmers during the development of software systems. While the research community devoted a lot of effort to studying and devising approaches for detecting the traditional code smells defined by Fowler, little knowledge and support is available for an emerging category of Mobile app code smells. Recently, Reimann et al. proposed a new catalogue of Android-specific code smells that may be a threat for the maintainability and the efficiency of Android applications. However, current tools working in the context of Mobile apps provide limited support and, more importantly, are not available for developers interested in monitoring the quality of their apps. To overcome these limitations, we propose a fully automated tool, coined ADOCTOR, able to identify 15 Android-specific code smells from the catalogue by Reimann et al. An empirical study conducted on the source code of 18 Android applications reveals that the proposed tool reaches, on average, 98% of precision and 98% of recall. We made ADOCTOR publicly available.

IEEE Transactions on Software Engineering | 2017

Automatic Generation of Tests to Exploit XML Injection Vulnerabilities in Web Applications

Sadeeq Jan; Annibale Panichella; Andrea Arcuri; Lionel C. Briand

Modern enterprise systems can be composed of many web services (e.g., SOAP and RESTful). Users of such systems might not have direct access to those services, and rather interact with them through a single-entry point which provides a GUI (e.g., a web page or a mobile app). Although the interactions with such entry point might be secure, a hacker could trick such systems to send malicious inputs to those internal web services. A typical example is XML injection targeting SOAP communications. Previous work has shown that it is possible to automatically generate such kind of attacks using search-based techniques. In this paper, we improve upon previous results by providing more efficient techniques to generate such attacks. In particular, we investigate four different algorithms and two different fitness functions. A large empirical study, involving also two industrial systems, shows that our technique is effective at automatically generating XML injection attacks.

symposium on search based software engineering | 2017

LIPS vs MOSA: A Replicated Empirical Study on Automated Test Case Generation

Annibale Panichella; Fitsum Meshesha Kifetew; Paolo Tonella

Replication is a fundamental pillar in the construction of scientific knowledge. Test data generation for procedural programs can be tackled using a single-target or a many-objective approach. The proponents of LIPS, a novel single-target test generator, conducted a preliminary empirical study to compare their approach with MOSA, an alternative many-objective test generator. However, their empirical investigation suffers from several external and internal validity threats, does not consider complex programs with many branches and does not include any qualitative analysis to interpret the results. In this paper, we report the results of a replication of the original study designed to address its major limitations and threats to validity. The new findings draw a completely different picture on the pros and cons of single-target vs many-objective approaches to test case generation.

international conference on software engineering | 2018

Search-based test data generation for SQL queries

Jeroen Castelein; Mauricio Finavaro Aniche; Mozhan Soltani; Annibale Panichella; Arie van Deursen

Database-centric systems strongly rely on SQL queries to manage and manipulate their data. These SQL commands can range from very simple selections to queries that involve several tables, subqueries, and grouping operations. And, as with any important piece of code, developers should properly test SQL queries. In order to completely test a SQL query, developers need to create test data that exercise all possible coverage targets in a query, e.g., JOINs and WHERE predicates. And indeed, this task can be challenging and time-consuming for complex queries. Previous studies have modeled the problem of generating test data as a constraint satisfaction problem and, with the help of SAT solvers, generate the required data. However, such approaches have strong limitations, such as partial support for queries with JOINs, subqueries, and strings (which are commonly used in SQL queries). In this paper, we model test data generation for SQL queries as a search-based problem. Then, we devise and evaluate three different approaches based on random search, biased random search, and genetic algorithms (GAs). The GA, in particular, uses a fitness function based on information extracted from the physical query plan of a database engine as search guidance. We then evaluate each approach in 2,135 queries extracted from three open source software and one industrial software system. Our results show that GA is able to completely cover 98.6% of all queries in the dataset, requiring only a few seconds per query. Moreover, it does not suffer from the limitations affecting state-of-the art techniques.

international conference on software engineering | 2018

The scent of a smell: an extensive comparison between textual and structural smells

Fabio Palomba; Annibale Panichella; Andy Zaidman; Andrea De Lucia

Code smells are symptoms of poor design or implementation choices that have a negative effect on several aspects of software maintenance and evolution, such as program comprehension or change-and fault-proneness. This is why researchers have spent a lot of effort on devising methods that help developers to automatically detect them in source code. Almost all the techniques presented in literature are based on the analysis of structural properties extracted from source code, although alternative sources of information (e.g., textual analysis) for code smell detection have also been recently investigated. Nevertheless, some studies have indicated that code smells detected by existing tools based on the analysis of structural properties are generally ignored (and thus not refactored) by the developers. In this paper, we aim at understanding whether code smells detected using textual analysis are perceived and refactored by developers in the same or different way than code smells detected through structural analysis. To this aim, we set up two different experiments. We have first carried out a software repository mining study to analyze how developers act on textually or structurally detected code smells. Subsequently, we have conducted a user study with industrial developers and quality experts in order to qualitatively analyze how they perceive code smells identified using the two different sources of information. Results indicate that textually detected code smells are easier to identify and for this reason they are considered easier to refactor with respect to code smells detected using structural properties. On the other hand, the latter are often perceived as more severe, but more difficult to exactly identify and remove.

Explore More