René Just
University of Washington
Publications
Featured research published by René Just.
Foundations of Software Engineering | 2014
René Just; Darioush Jalali; Laura Inozemtseva; Michael D. Ernst; Reid Holmes; Gordon Fraser
A good test suite is one that detects real faults. Because the set of faults in a program is usually unknowable, this definition is not useful to practitioners who are creating test suites, nor to researchers who are creating and evaluating tools that generate test suites. In place of real faults, testing research often uses mutants, which are artificial faults -- each one a simple syntactic variation -- that are systematically seeded throughout the program under test. Mutation analysis is appealing because large numbers of mutants can be automatically generated and used to compensate for low quantities or the absence of known real faults. Unfortunately, there is little experimental evidence to support the use of mutants as a replacement for real faults. This paper investigates whether mutants are indeed a valid substitute for real faults, i.e., whether a test suite’s ability to detect mutants is correlated with its ability to detect real faults that developers have fixed. Unlike prior studies, these investigations also explicitly consider the conflating effects of code coverage on the mutant detection rate. Our experiments used 357 real faults in 5 open-source applications that comprise a total of 321,000 lines of code. Furthermore, our experiments used both developer-written and automatically generated test suites. The results show a statistically significant correlation between mutant detection and real fault detection, independently of code coverage. The results also give concrete suggestions on how to improve mutation analysis and reveal some inherent limitations.
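As a concrete illustration of what such a mutant looks like, here is a minimal, hypothetical Java sketch (class and method names are invented, not taken from the studied subjects): a single syntactic change that only a boundary-value test distinguishes from the original.

```java
// Hypothetical sketch: a mutant is a single syntactic change to the program
// under test. Only a test that exercises the changed behavior (here the
// boundary value 18) detects ("kills") the mutant.
public class MutantExample {
    // Original program under test.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Mutant: relational operator replacement, >= becomes >.
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    public static void main(String[] args) {
        System.out.println(isAdult(21) + " vs " + isAdultMutant(21)); // true vs true: not detected
        System.out.println(isAdult(18) + " vs " + isAdultMutant(18)); // true vs false: detected
    }
}
```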
International Symposium on Software Testing and Analysis | 2014
René Just; Darioush Jalali; Michael D. Ernst
Empirical studies in software testing research may not be comparable, reproducible, or characteristic of practice. One reason is that real bugs are too infrequently used in software testing research. Extracting and reproducing real bugs is challenging, and as a result hand-seeded faults or mutants are commonly used as a substitute. This paper presents Defects4J, a database and extensible framework providing real bugs to enable reproducible studies in software testing research. The initial version of Defects4J contains 357 real bugs from 5 real-world open source programs. Each real bug is accompanied by a comprehensive test suite that can expose (demonstrate) that bug. Defects4J is extensible and builds on top of each program’s version control system. Once a program is configured in Defects4J, new bugs can be added to the database with little or no effort. Defects4J features a framework to easily access faulty and fixed program versions and corresponding test suites. This framework also provides a high-level interface to common tasks in software testing research, making it easy to conduct and reproduce empirical studies. Defects4J is publicly available at http://defects4j.org.
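The notion of a bug-exposing (triggering) test can be sketched as follows; the buggy and fixed methods are invented for illustration and do not come from any Defects4J subject program.

```java
// Hypothetical sketch of a bug-exposing ("triggering") test: it fails on the
// buggy program version and passes on the fixed one. The average() methods
// are invented and are not taken from any Defects4J subject program.
public class TriggeringTestSketch {
    // Buggy version: dividing each operand first loses up to 1 for odd inputs.
    static int averageBuggy(int a, int b) {
        return a / 2 + b / 2;
    }

    // Fixed version.
    static int averageFixed(int a, int b) {
        return (a + b) / 2;
    }

    public static void main(String[] args) {
        int expected = 3;
        // The triggering input: both arguments odd.
        System.out.println("buggy: " + averageBuggy(3, 3) + " (expected " + expected + ")"); // 2: test fails
        System.out.println("fixed: " + averageFixed(3, 3) + " (expected " + expected + ")"); // 3: test passes
    }
}
```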
Automated Software Engineering | 2011
René Just; Franz Schweiggert; Gregory M. Kapfhammer
Mutation analysis is an effective, yet often time-consuming and difficult-to-use method for the evaluation of testing strategies. In response to these and other challenges, this paper presents MAJOR, a fault seeding and mutation analysis tool that is integrated into the Java Standard Edition compiler as a non-invasive enhancement for use in any Java-based development environment. MAJOR reduces the mutant generation time and enables efficient mutation analysis. It has already been successfully applied to large applications with up to 373,000 lines of code and 406,000 mutants. Moreover, MAJOR's domain-specific language for specifying and adapting mutation operators also makes it extensible. Due to its ease-of-use, efficiency, and extensibility, MAJOR is an ideal platform for the study and application of mutation analysis.
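One way to picture a compiler-integrated, non-invasive mutator is the "conditional mutation" idea sketched below: all mutants are compiled into the program once and a run-time identifier selects the active one. This is a hypothetical illustration of the general technique, not MAJOR's actual generated code or configuration language.

```java
// Hypothetical sketch of "conditional mutation": all mutants of an expression
// are embedded in one compiled program and a mutant identifier activates one
// of them at run time, so the program is compiled only once.
public class ConditionalMutationSketch {
    static int activeMutant = 0; // 0 selects the original program

    static double fare(int age) {
        if (activeMutant == 1) return (age > 65) ? 1.0 : 2.0;  // mutant 1: >= replaced by >
        if (activeMutant == 2) return (age >= 65) ? 2.0 : 2.0; // mutant 2: literal 1.0 replaced by 2.0
        return (age >= 65) ? 1.0 : 2.0;                        // original code
    }

    public static void main(String[] args) {
        for (int m = 0; m <= 2; m++) {
            activeMutant = m;
            System.out.println("mutant " + m + ": fare(65) = " + fare(65));
        }
    }
}
```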
International Symposium on Software Testing and Analysis | 2014
René Just
Mutation analysis seeds artificial faults (mutants) into a program and evaluates testing techniques by measuring how well they detect those mutants. Mutation analysis is well-established in software engineering research but hardly used in practice due to inherent scalability problems and the lack of proper tool support. In response to those challenges, this paper presents Major, a framework for mutation analysis and fault seeding. Major provides a compiler-integrated mutator and a mutation analyzer for JUnit tests. Major implements a large set of optimizations to enable efficient and scalable mutation analysis of large software systems. It has already been applied to programs with more than 200,000 lines of code and 150,000 mutants. Moreover, Major features its own domain-specific language and is designed to be highly configurable to support fundamental research in software engineering. Due to its efficiency and flexibility, the Major mutation framework is suitable for the application of mutation analysis in research and practice. It is publicly available at http://mutation-testing.org.
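The analysis step itself can be pictured as the loop sketched below: execute the test cases against each mutant and report the fraction of mutants detected (the mutation score). Mutants and tests are modeled as plain data here for illustration; this is not Major's actual interface.

```java
// Hypothetical sketch of the analysis loop: run the tests against every mutant
// of add(a, b) = a + b and report the mutation score.
import java.util.List;
import java.util.function.IntBinaryOperator;

public class MutationScoreSketch {
    public static void main(String[] args) {
        // Mutants: simple syntactic variations of the expression a + b.
        List<IntBinaryOperator> mutants = List.of(
                (a, b) -> a - b,             // AOR mutant: + replaced by -, killed
                (a, b) -> a * b,             // AOR mutant: + replaced by *, killed
                (a, b) -> a + Math.abs(b));  // survives: no test uses a negative b

        // Each test is {a, b, expected result}.
        int[][] tests = { {2, 3, 5}, {1, 1, 2} };

        int killed = 0;
        for (IntBinaryOperator mutant : mutants) {
            for (int[] t : tests) {
                if (mutant.applyAsInt(t[0], t[1]) != t[2]) { killed++; break; }
            }
        }
        System.out.println("mutation score: " + killed + "/" + mutants.size());
    }
}
```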
Computer and Communications Security | 2014
Michael D. Ernst; René Just; Suzanne Millstein; Werner Dietl; Stuart Pernsteiner; Franziska Roesner; Karl Koscher; Paulo Barros; Ravi Bhoraskar; Seungyeop Han; Paul Vines; Edward X. Wu
Current app stores distribute some malware to unsuspecting users, even though the app approval process may be costly and time-consuming. High-integrity app stores must provide stronger guarantees that their apps are not malicious. We propose a verification model for use in such app stores to guarantee that the apps are free of malicious information flows. In our model, the software vendor and the app store auditor collaborate -- each does tasks that are easy for her/him, reducing overall verification cost. The software vendor provides a behavioral specification of information flow (at a finer granularity than used by current app stores) and source code annotated with information-flow type qualifiers. A flow-sensitive, context-sensitive information-flow type system checks the information flow type qualifiers in the source code and proves that only information flows in the specification can occur at run time. The app store auditor uses the vendor-provided source code to manually verify declassifications. We have implemented the information-flow type system for Android apps written in Java, and we evaluated both its effectiveness at detecting information-flow violations and its usability in practice. In an adversarial Red Team evaluation, we analyzed 72 apps (576,000 LOC) for malware. The 57 Trojans among these had been written specifically to defeat a malware analysis such as ours. Nonetheless, our information-flow type system was effective: it detected 96% of malware whose malicious behavior was related to information flow and 82% of all malware. In addition to the adversarial evaluation, we evaluated the practicality of using the collaborative model. The programmer annotation burden is low: 6 annotations per 100 LOC. Every sound analysis requires a human to review potential false alarms, and in our experiments, this took 30 minutes per 1,000 LOC for an auditor unfamiliar with the app.
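The flavor of vendor-written information-flow annotations can be sketched as follows. The annotation and permission names (@Source, @Sink, LOCATION, DISPLAY, INTERNET) are placeholders declared locally for this example; they stand in for the qualifiers of an information-flow type checker and are not the paper's actual type system or qualifier hierarchy.

```java
// Hypothetical sketch of information-flow type qualifiers on source code.
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

public class FlowSketch {
    @Target(ElementType.TYPE_USE) @interface Source { String value(); }
    @Target(ElementType.TYPE_USE) @interface Sink   { String value(); }

    // Declared specification: location data may be displayed, but not sent out.
    static @Source("LOCATION") String readLocation() {
        return "47.6,-122.3"; // stand-in for a sensor read
    }

    static void display(@Sink("DISPLAY") String s)        { System.out.println(s); }
    static void sendToNetwork(@Sink("INTERNET") String s) { /* network write elided */ }

    public static void main(String[] args) {
        String location = readLocation();
        display(location);           // permitted by the declared information flows
        // sendToNetwork(location);  // a flow-sensitive checker would reject this call
    }
}
```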
Automated Software Engineering | 2015
Sina Shamshiri; René Just; José Miguel Rojas; Gordon Fraser; Phil McMinn; Andrea Arcuri
Rather than tediously writing unit tests manually, tools can be used to generate them automatically - sometimes even resulting in higher code coverage than manual testing. But how good are these tests at actually finding faults? To answer this question, we applied three state-of-the-art unit test generation tools for Java (Randoop, EvoSuite, and Agitar) to the 357 real faults in the Defects4J dataset and investigated how well the generated test suites perform at detecting these faults. Although the automatically generated test suites detected 55.7% of the faults overall, only 19.9% of all the individual test suites detected a fault. By studying the effectiveness and problems of the individual tools and the tests they generate, we derive insights to support the development of automated unit test generators that achieve a higher fault detection rate. These insights include 1) improving the obtained code coverage so that faulty statements are executed in the first instance, 2) improving the propagation of faulty program states to an observable output, coupled with the generation of more sensitive assertions, and 3) improving the simulation of the execution environment to detect faults that are dependent on external factors such as date and time.
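Insight 2) can be illustrated with a small sketch: an assertion that is too weak lets a faulty program state go unnoticed, while a more sensitive assertion propagates it to a test failure. The faulty method is invented for illustration and is unrelated to the output of Randoop, EvoSuite, or Agitar.

```java
// Hypothetical sketch: weak versus sensitive assertions over a faulty method.
// Run with "java -ea" so that assertions are enabled.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class AssertionSensitivitySketch {
    // Faulty "sorted copy": the TreeSet silently drops duplicate elements.
    static List<Integer> sorted(List<Integer> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> result = sorted(Arrays.asList(3, 1, 3));

        // Weak, generated-style assertion: passes despite the fault.
        assert result != null && !result.isEmpty();

        // Sensitive assertion: fails, exposing the dropped duplicate.
        assert result.equals(Arrays.asList(1, 3, 3)) : "expected [1, 3, 3] but was " + result;
    }
}
```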
International Conference on Software Testing, Verification and Validation | 2012
René Just; Gregory M. Kapfhammer; Franz Schweiggert
Mutation analysis is an unbiased and powerful method for assessing input values and test oracles. However, in comparison to other techniques, such as those that rely on code coverage, it is a computationally expensive and time-consuming method, especially for large software systems. This high cost is due, in part, to the fact that many mutation operators generate redundant mutants that may both misrepresent the mutation score and increase the runtime of the mutation analysis process. After showing how the conditional operator replacement (COR) mutation operator can be defined in a non-redundant manner, this paper uses four real-world programs, ranging in size from 3,000 to nearly 40,000 lines of code, to show the prevalence of redundant mutants. Focusing on the conditional operator replacement (COR) and relational operator replacement (ROR) mutation operators that create 41% of all mutants in the chosen programs, the case study reveals that the removal of redundant mutants reduces the runtime of mutation analysis by up to 34%. Additional empirical results show that redundant mutants can lead to a mutation score that is misleadingly overestimated by as much as 10%. Overall, this paper convincingly demonstrates that it is possible to improve the effectiveness and efficiency of a mutation analysis system by identifying and removing redundant mutants.
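Redundancy among ROR mutants can be seen at the level of a single expression, as in the sketch below for a < b: any input with a < b that detects the "false" mutant also detects the "a > b" mutant, so the latter adds no detection power. The stated sufficient subset {<=, !=, false} is the commonly cited expression-level result and should be read as an assumption here; detection in a whole program additionally requires the difference to propagate to an observable output.

```java
// Hypothetical sketch of redundancy among ROR mutants of "a < b".
public class RorRedundancySketch {
    static boolean original(int a, int b)    { return a < b; }

    // Commonly cited non-redundant replacements for "<":
    static boolean mutantLE(int a, int b)    { return a <= b; } // differs only when a == b
    static boolean mutantNE(int a, int b)    { return a != b; } // differs only when a > b
    static boolean mutantFalse(int a, int b) { return false; }  // differs only when a < b

    // Redundant replacement: subsumed by mutantFalse at the expression level.
    static boolean mutantGT(int a, int b)    { return a > b; }

    public static void main(String[] args) {
        int a = 1, b = 2; // a < b
        System.out.println("original:     " + original(a, b));    // true
        System.out.println("false mutant: " + mutantFalse(a, b)); // false -> killed
        System.out.println("> mutant:     " + mutantGT(a, b));    // false -> killed by the same input
    }
}
```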
International Symposium on Software Testing and Analysis | 2014
René Just; Michael D. Ernst; Gordon Fraser
Mutation analysis evaluates a testing technique by measuring how well it detects seeded faults (mutants). Mutation analysis is hampered by inherent scalability problems — a test suite is executed for each of a large number of mutants. Despite numerous optimizations presented in the literature, this scalability issue remains, and this is one of the reasons why mutation analysis is hardly used in practice. Whereas most previous optimizations attempted to statically reduce the number of executions or their computational overhead, this paper exploits information available only at run time to further reduce the number of executions. First, state infection conditions can reveal — with a single test execution of the unmutated program — which mutants would lead to a different state, thus avoiding unnecessary test executions. Second, determining whether an infected execution state propagates can further reduce the number of executions. Mutants that are embedded in compound expressions may infect the state locally without affecting the outcome of the compound expression. Third, those mutants that do infect the state can be partitioned based on the resulting infected state — if two mutants lead to the same infected state, only one needs to be executed as the result of the other can be inferred. We have implemented these optimizations in the Major mutation framework and empirically evaluated them on 14 open source programs. The optimizations reduced the mutation analysis time by 40% on average.
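The first optimization, checking state infection from a single run of the unmutated program, can be sketched as follows; the operand values and mutants are invented for illustration and this is not Major's implementation.

```java
// Hypothetical sketch of a state infection check: operand values recorded
// during one execution of the unmutated program reveal which mutants of an
// expression would even produce a different (infected) state, so tests need
// not be executed against the others.
public class InfectionSketch {
    public static void main(String[] args) {
        int a = 4, b = 4;              // operand values observed on the original run
        boolean original = a < b;      // value of the original expression: false

        boolean mutantLE = a <= b;     // mutant: < replaced by <=  -> true
        boolean mutantNE = a != b;     // mutant: < replaced by !=  -> false

        // A mutant whose value equals the original value here cannot infect the
        // state for this execution, so the test need not be run against it.
        System.out.println("<= mutant infects state: " + (mutantLE != original)); // true
        System.out.println("!= mutant infects state: " + (mutantNE != original)); // false
    }
}
```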
International Conference on Software Engineering | 2017
Spencer Pearson; José Creissac Campos; René Just; Gordon Fraser; Michael D. Ernst; Deric Pang; Benjamin Keller
Most fault localization techniques take as input a faulty program, and produce as output a ranked list of suspicious code locations at which the program may be defective. When researchers propose a new fault localization technique, they typically evaluate it on programs with known faults. The technique is scored based on where in its output list the defective code appears. This enables the comparison of multiple fault localization techniques to determine which one is better. Previous research has evaluated fault localization techniques using artificial faults, generated either by mutation tools or manually. In other words, previous research has determined which fault localization techniques are best at finding artificial faults. However, it is not known which fault localization techniques are best at finding real faults. It is not obvious that the answer is the same, given previous work showing that artificial faults have both similarities to and differences from real faults. We performed a replication study to evaluate 10 claims in the literature that compared fault localization techniques (from the spectrum-based and mutation-based families). We used 2995 artificial faults in 6 real-world programs. Our results support 7 of the previous claims as statistically significant, but only 3 as having non-negligible effect sizes. Then, we evaluated the same 10 claims, using 310 real faults from the 6 programs. Every previous result was refuted or was statistically and practically insignificant. Our experiments show that artificial faults are not useful for predicting which fault localization techniques perform best on real faults. In light of these results, we identified a design space that includes many previously studied fault localization techniques as well as hundreds of new techniques. We experimentally determined which factors in the design space are most important, using an overall set of 395 real faults. Then, we extended this design space with new techniques. Several of our novel techniques outperform all existing techniques, notably in terms of ranking defective code in the top-5 or top-10 reports.
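As a sketch of the spectrum-based family evaluated here, the following computes suspiciousness scores with the well-known Ochiai formula from pass/fail coverage counts; the counts are invented and the snippet is not the paper's tooling.

```java
// Hypothetical sketch of a spectrum-based fault localization score: each code
// location is ranked by a suspiciousness value computed from coverage counts,
// here with the Ochiai formula.
public class OchiaiSketch {
    // failed: failing tests that execute the location
    // passed: passing tests that execute the location
    // totalFailed: failing tests in the whole suite
    static double ochiai(int failed, int passed, int totalFailed) {
        double denominator = Math.sqrt((double) totalFailed * (failed + passed));
        return denominator == 0 ? 0.0 : failed / denominator;
    }

    public static void main(String[] args) {
        int totalFailed = 2;
        System.out.printf("location A: %.3f%n", ochiai(2, 1, totalFailed)); // covered by all failing tests
        System.out.printf("location B: %.3f%n", ochiai(1, 5, totalFailed)); // mostly covered by passing tests
        System.out.printf("location C: %.3f%n", ochiai(0, 4, totalFailed)); // never covered by a failing test
    }
}
```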
International Symposium on Software Reliability Engineering | 2012
René Just; Gregory M. Kapfhammer; Franz Schweiggert
Mutation analysis is a powerful and unbiased technique to assess the quality of input values and test oracles. However, its application domain is still limited due to the fact that it is a time consuming and computationally expensive method, especially when used with large and complex software systems. Addressing these challenges, this paper makes several contributions to significantly improve the efficiency of mutation analysis. First, it investigates the decrease in generated mutants by applying a reduced, yet sufficient, set of mutants for replacing conditional (COR) and relational (ROR) operators. The analysis of ten real-world applications, with 400,000 lines of code and more than 550,000 generated mutants in total, reveals a reduction in the number of mutants created of up to 37% and more than 25% on average. Yet, since the isolated use of non-redundant mutation operators does not ensure that mutation analysis is efficient and scalable, this paper also presents and experimentally evaluates an optimized workflow that exploits the redundancies and runtime differences of test cases to reorder and split the corresponding test suite. Using the same ten open-source applications, an empirical study convincingly demonstrates that the combination of non-redundant operators and prioritization leveraging information about the runtime and mutation coverage of tests reduces the total cost of mutation analysis further by as much as 65%.
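The prioritization idea can be sketched as follows: for a given mutant, run the covering tests in ascending order of recorded runtime so that cheap tests get the first chance to kill it. The concrete scheduling heuristic and the test data are assumptions for illustration, not the paper's exact workflow.

```java
// Hypothetical sketch of runtime-aware test prioritization for one mutant.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PrioritizationSketch {
    record TestCase(String name, long runtimeMillis) { }

    public static void main(String[] args) {
        // Tests covering a particular mutant, with runtimes from a prior run.
        List<TestCase> covering = new ArrayList<>(List.of(
                new TestCase("LargeIntegrationTest", 900),
                new TestCase("BoundaryTest", 12),
                new TestCase("SmokeTest", 45)));

        // Schedule the cheapest covering tests first.
        covering.sort(Comparator.comparingLong(TestCase::runtimeMillis));
        covering.forEach(t -> System.out.println(t.name() + " (" + t.runtimeMillis() + " ms)"));
    }
}
```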