Matt Staats | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Matt Staats is active.

Explore More

Publication

Featured researches published by Matt Staats.

formal methods | 2008

Requirements Coverage as an Adequacy Measure for Conformance Testing

Ajitha Rajan; Michael W. Whalen; Matt Staats; Mats Per Erik Heimdahl

Conformance testing in model-based development refers to the testing activity that verifies whether the code generated (manually or automatically) from the model is behaviorally equivalent to the model. Presently the adequacy of conformance testing is inferred by measuring structural coverage achieved over the model. We hypothesize that adequacy metrics for conformance testing should consider structural coverage over the requirementseither in place of or in addition to structural coverage over the model. Measuring structural coverage over the requirements gives a notion of how well the conformance tests exercise the required behavior of the system. We conducted an experiment to investigate the hypothesis stating structural coverage over formal requirements is more effective than structural coverage over the model as an adequacy measure for conformance testing. We found that the hypothesis was rejected at 5% statistical significance on three of the four case examples in our experiment. Nevertheless, we found that the tests providing requirements coverage found several faults that remained undetected by tests providing model coverage. We thus formed a second hypothesis stating that complementing model coverage with requirements coverage will prove more effective as an adequacy measure than solely using model coverage for conformance testing. In our experiment, we found test suites providing both requirements coverage and model coverage to be more effective at finding faults than test suites providing model coverage alone, at 5% statistical significance. Based on our results, we believe existing adequacy measures for conformance testing that only consider model coverage can be strengthened by combining them with rigorous requirements coverage metrics.

international symposium on software testing and analysis | 2010

Parallel symbolic execution for structural test generation

Matt Staats; Corina S. Pǎsǎreanu

Symbolic execution is a popular technique for automatically generating test cases achieving high structural coverage. Symbolic execution suffers from scalability issues since the number of symbolic paths that need to be explored is very large (or even infinite) for most realistic programs. To address this problem, we propose a technique, Simple Static Partitioning, for parallelizing symbolic execution. The technique uses a set of pre-conditions to partition the symbolic execution tree, allowing us to effectively distribute symbolic execution and decrease the time needed to explore the symbolic execution tree. The proposed technique requires little communication between parallel instances and is designed to work with a variety of architectures, ranging from fast multi-core machines to cloud or grid computing environments. We implement our technique in the Java PathFinder verification tool-set and evaluate it on six case studies with respect to the performance improvement when exploring a finite symbolic execution tree and performing automatic test generation. We demonstrate speedup in both the analysis time over finite symbolic execution trees and in the time required to generate tests relative to sequential execution, with a maximum analysis time speedup of 90x observed using 128 workers and a maximum test generation speedup of 70x observed using 64 workers.

IEEE Transactions on Software Engineering | 2015

The Risks of Coverage-Directed Test Case Generation

Matt Staats; Michael W. Whalen; Mats Per Erik Heimdahl

A number of structural coverage criteria have been proposed to measure the adequacy of testing efforts. In the avionics and other critical systems domains, test suites satisfying structural coverage criteria are mandated by standards. With the advent of powerful automated test generation tools, it is tempting to simply generate test inputs to satisfy these structural coverage criteria. However, while techniques to produce coverage-providing tests are well established, the effectiveness of such approaches in terms of fault detection ability has not been adequately studied. In this work, we evaluate the effectiveness of test suites generated to satisfy four coverage criteria through counterexample-based test generation and a random generation approach-where tests are randomly generated until coverage is achieved-contrasted against purely random test suites of equal size. Our results yield three key conclusions. First, coverage criteria satisfaction alone can be a poor indication of fault finding effectiveness, with inconsistent results between the seven case examples (and random test suites of equal size often providing similar-or even higher-levels of fault finding). Second, the use of structural coverage as a supplement-rather than a target-for test generation can have a positive impact, with random test suites reduced to a coverage-providing subset detecting up to 13.5 percent more faults than test suites generated specifically to achieve coverage. Finally, Observable MC/DC, a criterion designed to account for program structure and the selection of the test oracle, can-in part-address the failings of traditional structural coverage criteria, allowing for the generation of test suites achieving higher levels of fault detection than random test suites of equal size. These observations point to risks inherent in the increase in test automation in critical systems, and the need for more research in how coverage criteria, test generation approaches, the test oracle used, and system structure jointly influence test effectiveness.

ACM Transactions on Software Engineering and Methodology | 2015

Does Automated Unit Test Generation Really Help Software Testers? A Controlled Empirical Study

Gordon Fraser; Matt Staats; Phil McMinn; Andrea Arcuri; Frank Padberg

Work on automated test generation has produced several tools capable of generating test data which achieves high structural coverage over a program. In the absence of a specification, developers are expected to manually construct or verify the test oracle for each test input. Nevertheless, it is assumed that these generated tests ease the task of testing for the developer, as testing is reduced to checking the results of tests. While this assumption has persisted for decades, there has been no conclusive evidence to date confirming it. However, the limited adoption in industry indicates this assumption may not be correct, and calls into question the practical value of test generation tools. To investigate this issue, we performed two controlled experiments comparing a total of 97 subjects split between writing tests manually and writing tests with the aid of an automated unit test generation tool, EvoSuite. We found that, on one hand, tool support leads to clear improvements in commonly applied quality metrics such as code coverage (up to 300% increase). However, on the other hand, there was no measurable improvement in the number of bugs actually found by developers. Our results not only cast some doubt on how the research community evaluates test generation tools, but also point to improvements and future work necessary before automated test generation tools will be widely adopted by practitioners.

international conference on software engineering | 2011

Better testing through oracle selection (NIER track)

Matt Staats; Michael W. Whalen; Mats Per Erik Heimdahl

In software testing, the test oracle determines if the application under test has performed an execution correctly. In current testing practice and research, significant effort and thought is placed on selecting test inputs, with the selection of test oracles largely neglected. Here, we argue that improvements to the testing process can be made by considering the problem of oracle selection. In particular, we argue that selecting the test oracle and test inputs together to complement one another may yield improvements testing effectiveness. We illustrate this using an example and present selected results from an ongoing study demonstrating the relationship between test suite selection, oracle selection, and fault finding.

ieee/aiaa digital avionics systems conference | 2008

On MC/DC and implementation structure: An empirical study

Mats Per Erik Heimdahl; Michael W. Whalen; Ajitha Rajan; Matt Staats

In civil avionics, obtaining DO-178B certification for highly critical airborne software requires that the adequacy of the code testing effort be measured using a structural coverage criterion known as Modified Condition and Decision Coverage (MC/DC). We hypothesized that the effectiveness of the MC/DC metric is highly sensitive to the structure of the implementation and can therefore be problematic as a test adequacy criterion. We tested this hypothesis by evaluating the fault-finding ability of MC/DC-adequate test suites on five industrial systems (flight guidance and display management). For each system, we created two versions of the implementations-implementations with and without expression folding (i.e., inlining).

Proceedings of the 7th International Workshop on Search-Based Software Testing | 2014

Moving the goalposts: coverage satisfaction is not enough

Matt Staats; Michael W. Whalen; Mats Per Erik Heimdahl

Structural coverage criteria have been proposed to measure the adequacy of testing efforts. Indeed, in some domains—e.g., critical systems areas—structural coverage criteria must be satisfied to achieve certification. The advent of powerful search-based test generation tools has given us the ability to generate test inputs to satisfy these structural coverage criteria. While tempting, recent empirical evidence indicates these tools should be used with caution, as merely achieving high structural coverage is not necessarily indicative of high fault detection ability. In this report, we review some of these findings, and offer recommendations on how the strengths of search-based test generation methods can alleviate these issues.

privacy enhancing technologies | 2008

Breaking and Provably Fixing Minx

Erik Shimshock; Matt Staats; Nicholas Hopper

In 2004, Danezis and Laurie proposed Minx, an encryption protocol and packet format for relay-based anonymity schemes, such as mix networks and onion routing, with simplicity as a primary design goal. Danezis and Laurie argued informally about the security properties of Minx but left open the problem of proving its security. In this paper, we show that there cannot be such a proof by showing that an active global adversary can decrypt Minx messages in polynomial time. To mitigate this attack, we also prove secure a very simple modification of the Minx protocol.

IEEE Transactions on Software Engineering | 2015

Automated Oracle Data Selection Support

Matt Staats; Michael W. Whalen; Mats Per Erik Heimdahl

The choice of test oracle-the artifact that determines whether an application under test executes correctly-can significantly impact the effectiveness of the testing process. However, despite the prevalence of tools that support test input selection, little work exists for supporting oracle creation. We propose a method of supporting test oracle creation that automatically selects the oracle data-the set of variables monitored during testing-for expected value test oracles. This approach is based on the use of mutation analysis to rank variables in terms of fault-finding effectiveness, thus automating the selection of the oracle data. Experimental results obtained by employing our method over six industrial systems (while varying test input types and the number of generated mutants) indicate that our method-when paired with test inputs generated either at random or to satisfy specific structural coverage criteria-may be a cost-effective approach for producing small, effective oracle data sets, with fault finding improvements over current industrial best practice of up to 1,435 percent observed (with typical improvements of up to 50 percent).

international conference on software engineering | 2015

A flexible and non-intrusive approach for computing complex structural coverage metrics

Michael W. Whalen; Suzette Person; Neha Rungta; Matt Staats; Daniela Grijincu

Software analysis tools and techniques often leverage structural code coverage information to reason about the dynamic behavior of software. Existing techniques instrument the code with the required structural obligations and then monitor the execution of the compiled code to report coverage. Instrumentation based approaches often incur considerable runtime overhead for complex structural coverage metrics such as Modified Condition/Decision (MC/DC). Code instrumentation, in general, has to be approached with great care to ensure it does not modify the behavior of the original code. Furthermore, instrumented code cannot be used in conjunction with other analyses that reason about the structure and semantics of the code under test. In this work, we introduce a non-intrusive preprocessing approach for computing structural coverage information. It uses a static partial evaluation of the decisions in the source code and a source-to-bytecode mapping to generate the information necessary to efficiently track structural coverage metrics during execution. Our technique is flexible; the results of the preprocessing can be used by a variety of coverage-driven software analysis tasks, including automated analyses that are not possible for instrumented code. Experimental results in the context of symbolic execution show the efficiency and flexibility of our non- intrusive approach for computing code coverage information.

Explore More