Andrea Arcuri | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Andrea Arcuri is active.

Explore More

Publication

Featured researches published by Andrea Arcuri.

automated software engineering | 2015

Do Automatically Generated Unit Tests Find Real Faults? An Empirical Study of Effectiveness and Challenges (T)

Sina Shamshiri; René Just; José Miguel Rojas; Gordon Fraser; Phil McMinn; Andrea Arcuri

Rather than tediously writing unit tests manually, tools can be used to generate them automatically - sometimes even resulting in higher code coverage than manual testing. But how good are these tests at actually finding faults? To answer this question, we applied three state-of-the-art unit test generation tools for Java (Randoop, EvoSuite, and Agitar) to the 357 real faults in the Defects4J dataset and investigated how well the generated test suites perform at detecting these faults. Although the automatically generated test suites detected 55.7% of the faults overall, only 19.9% of all the individual test suites detected a fault. By studying the effectiveness and problems of the individual tools and the tests they generate, we derive insights to support the development of automated unit test generators that achieve a higher fault detection rate. These insights include 1) improving the obtained code coverage so that faulty statements are executed in the first instance, 2) improving the propagation of faulty program states to an observable output, coupled with the generation of more sensitive assertions, and 3) improving the simulation of the execution environment to detect faults that are dependent on external factors such as date and time.

Empirical Software Engineering | 2017

A detailed investigation of the effectiveness of whole test suite generation

José Miguel Rojas; Mattia Vivanti; Andrea Arcuri; Gordon Fraser

A common application of search-based software testing is to generate test cases for all goals defined by a coverage criterion (e.g., lines, branches, mutants). Rather than generating one test case at a time for each of these goals individually, whole test suite generation optimizes entire test suites towards satisfying all goals at the same time. There is evidence that the overall coverage achieved with this approach is superior to that of targeting individual coverage goals. Nevertheless, there remains some uncertainty on (a) whether the results generalize beyond branch coverage, (b) whether the whole test suite approach might be inferior to a more focused search for some particular coverage goals, and (c) whether generating whole test suites could be optimized by only targeting coverage goals not already covered. In this paper, we perform an in-depth analysis to study these questions. An empirical study on 100 Java classes using three different coverage criteria reveals that indeed there are some testing goals that are only covered by the traditional approach, although their number is only very small in comparison with those which are exclusively covered by the whole test suite approach. We find that keeping an archive of already covered goals along with the tests covering them and focusing the search on uncovered goals overcomes this small drawback on larger classes, leading to an improved overall effectiveness of whole test suite generation.

2017 IEEE/ACM 10th International Workshop on Search-Based Software Testing (SBST) | 2015

EvoSuite at the SBST 2015 tool competition

Gordon Fraser; Andrea Arcuri

EvoSuite is a search-based tool that automatically generates unit tests for Java code. This paper summarizes the results and experiences of EvoSuites participation at the fourth unit testing competition at SBST 2016, where EvoSuite achieved the highest overall score.

international symposium on software testing and analysis | 2015

Automated unit test generation during software development: a controlled experiment and think-aloud observations

José Miguel Rojas; Gordon Fraser; Andrea Arcuri

Automated unit test generation tools can produce tests that are superior to manually written ones in terms of code coverage, but are these tests helpful to developers while they are writing code? A developer would first need to know when and how to apply such a tool, and would then need to understand the resulting tests in order to provide test oracles and to diagnose and fix any faults that the tests reveal. Considering all this, does automatically generating unit tests provide any benefit over simply writing unit tests manually? We empirically investigated the effects of using an automated unit test generation tool (EvoSuite) during development. A controlled experiment with 41 students shows that using EvoSuite leads to an average branch coverage increase of +13%, and 36% less time is spent on testing compared to writing unit tests manually. However, there is no clear effect on the quality of the implementations, as it depends on how the test generation tool and the generated tests are used. In-depth analysis, using five think-aloud observations with professional programmers, confirms the necessity to increase the usability of automated unit test generation tools, to integrate them better during software development, and to educate software developers on how to best use those tools.

symposium on search based software engineering | 2015

Combining Multiple Coverage Criteria in Search-Based Unit Test Generation

José Miguel Rojas; José Campos; Mattia Vivanti; Gordon Fraser; Andrea Arcuri

Automated test generation techniques typically aim at maximising coverage of well-established structural criteria such as statement or branch coverage. In practice, generating tests only for one specific criterion may not be sufficient when testing object oriented classes, as standard structural coverage criteria do not fully capture the properties developers may desire of their unit test suites. For example, covering a large number of statements could be easily achieved by just calling the main method of a class; yet, a good unit test suite would consist of smaller unit tests invoking individual methods, and checking return values and states with test assertions. There are several different properties that test suites should exhibit, and a search-based test generator could easily be extended with additional fitness functions to capture these properties.

Software Testing, Verification & Reliability | 2016

Seeding strategies in search-based unit test generation

José Miguel Rojas; Gordon Fraser; Andrea Arcuri

Search‐based techniques have been applied successfully to the task of generating unit tests for object‐oriented software. However, as for any meta‐heuristic search, the efficiency heavily depends on many factors; seeding, which refers to the use of previous related knowledge to help solve the testing problem at hand, is one such factor that may strongly influence this efficiency. This paper investigates different seeding strategies for unit test generation, in particular seeding of numerical and string constants derived statically and dynamically, seeding of type information and seeding of previously generated tests. To understand the effects of these seeding strategies, the results of a large empirical analysis carried out on a large collection of open‐source projects from the SF110 corpus and the Apache Commons repository are reported. These experiments show with strong statistical confidence that, even for a testing tool already able to achieve high coverage, the use of appropriate seeding strategies can further improve performance.

foundations of software engineering | 2015

Generating TCP/UDP network data for automated unit test generation

Andrea Arcuri; Gordon Fraser; Juan Pablo Galeotti

Although automated unit test generation techniques can in principle generate test suites that achieve high code coverage, in practice this is often inhibited by the dependence of the code under test on external resources. In particular, a common problem in modern programming languages is posed by code that involves networking (e.g., opening a TCP listening port). In order to generate tests for such code, we describe an approach where we mock (simulate) the networking interfaces of the Java standard library, such that a search-based test generator can treat the network as part of the test input space. This not only has the benefit that it overcomes many limitations of testing networking code (e.g., different tests binding to the same local ports, and deterministic resolution of hostnames and ephemeral ports), it also substantially increases code coverage. An evaluation on 23,886 classes from 110 open source projects, totalling more than 6.6 million lines of Java code, reveals that network access happens in 2,642 classes (11%). Our implementation of the proposed technique as part of the EVOSUITE testing tool addresses the networking code contained in 1,672 (63%) of these classes, and leads to an increase of the average line coverage from 29.1% to 50.8%. On a manual selection of 42 Java classes heavily depending on networking, line coverage with EVOSUITE more than doubled with the use of network mocking, increasing from 31.8% to 76.6%.

Empirical Software Engineering | 2016

Improving the performance of OCL constraint solving with novel heuristics for logical operations: a search-based approach

Shaukat Ali; Muhammad Zohaib Z. Iqbal; Maham Khalid; Andrea Arcuri

A common practice to specify constraints on the Unified Modeling Language (UML) models is using the Object Constraint Language (OCL). Such constraints serve various purposes, ranging from simply providing precise meaning to the models to supporting complex verification and validation activities. In many applications, these constraints have to be solved to obtain values satisfying the constraints, for example, in the case of model-based testing (MBT) to generate test data for the purpose of generating executable test cases. In our previous work, we proposed novel heuristics for various OCL constructs to efficiently solve them using search algorithms. These heuristics are enhanced in this paper to further improve the performance of OCL constraint solving. We performed an empirical evaluation comprising of three case studies using three search algorithms: Alternating Variable Method (AVM), (1 + 1) Evolutionary Algorithm (EA), and a Genetic Algorithm (GA) and in addition Random Search (RS) was used as a comparison baseline. In the first case study, we evaluated each heuristics using carefully designed artificial problems. In the second case study, we evaluated the heuristics on various constraints of Cisco’s Video Conferencing Systems defined to support MBT. Finally, the third case study is about EU-Rent Car Rental specification and is obtained from the literature. The results of the empirical evaluation showed that (1 + 1) EA and AVM with the improved heuristics significantly outperform the rest of the algorithms.

symposium on search based software engineering | 2017

An Empirical Evaluation of Evolutionary Algorithms for Test Suite Generation

José Campos; Yan Ge; Gordon Fraser; Marcello Eler; Andrea Arcuri

Evolutionary algorithms have been shown to be effective at generating unit test suites optimised for code coverage. While many aspects of these algorithms have been evaluated in detail (e.g., test length and different kinds of techniques aimed at improving performance, like seeding), the influence of the specific algorithms has to date seen less attention in the literature. As it is theoretically impossible to design an algorithm that is best on all possible problems, a common approach in software engineering problems is to first try a Genetic Algorithm, and only afterwards try to refine it or compare it with other algorithms to see if any of them is more suited for the addressed problem. This is particularly important in test generation, since recent work suggests that random search may in practice be equally effective, whereas the reformulation as a many-objective problem seems to be more effective. To shed light on the influence of the search algorithms, we empirically evaluate six different algorithms on a selection of non-trivial open source classes. Our study shows that the use of a test archive makes evolutionary algorithms clearly better than random testing, and it confirms that the many-objective search is the most effective.

international conference on software testing verification and validation | 2017

Private API Access and Functional Mocking in Automated Unit Test Generation

Andrea Arcuri; Gordon Fraser; René Just

Not all object oriented code is easily testable: Dependency objects might be difficult or even impossible to instantiate, and object-oriented encapsulation makes testing potentially simple code difficult if it cannot easily be accessed. When this happens, then developers can resort to mock objects that simulate the complex dependencies, or circumvent object-oriented encapsulation and access private APIs directly through the use of, for example, Java reflection. Can automated unit test generation benefit from these techniques as well? In this paper we investigate this question by extending the EvoSuite unit test generation tool with the ability to directly access private APIs and to create mock objects using the popular Mockito framework. However, care needs to be taken that this does not impact the usefulness of the generated tests: For example, a test accessing a private field could later fail if that field is renamed, even if that renaming is part of a semantics-preserving refactoring. Such a failure would not be revealing a true regression bug, but is a false positive, which wastes the developers time for investigating and fixing the test. Our experiments on the SF110 and Defects4J benchmarks confirm the anticipated improvements in terms of code coverage and bug finding, but also confirm the existence of false positives. However, by ensuring the test generator only uses mocking and reflection if there is no other way to reach some part of the code, their number remains small.

Explore More