Milos Gligoric | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Milos Gligoric is active.

Explore More

Publication

Featured researches published by Milos Gligoric.

international conference on software engineering | 2010

Test generation through programming in UDITA

Milos Gligoric; Tihomir Gvero; Vilas Jagannath; Sarfraz Khurshid; Viktor Kuncak; Darko Marinov

We present an approach for describing tests using non-deterministic test generation programs. To write such programs, we introduce UDITA, a Java-based language with non-deterministic choice operators and an interface for generating linked structures. We also describe new algorithms that generate concrete tests by efficiently exploring the space of all executions of non-deterministic UDITA programs. We implemented our approach and incorporated it into the official, publicly available repository of Java PathFinder (JPF), a popular tool for verifying Java programs. We evaluate our technique by generating tests for data structures, refactoring engines, and JPF itself. Our experiments show that test generation using UDITA is faster and leads to test descriptions that are easier to write than in previous frameworks. Moreover, the novel execution mechanism of UDITA is essential for making test generation feasible. Using UDITA, we have discovered a number of bugs in Eclipse, NetBeans, Sun javac, and JPF.

international symposium on software testing and analysis | 2013

Comparing non-adequate test suites using coverage criteria

Milos Gligoric; Alex Groce; Chaoqiang Zhang; Rohan Sharma; Mohammad Amin Alipour; Darko Marinov

A fundamental question in software testing research is how to compare test suites, often as a means for comparing test-generation techniques. Researchers frequently compare test suites by measuring their coverage. A coverage criterion C provides a set of test requirements and measures how many requirements a given suite satisfies. A suite that satisfies 100% of the (feasible) requirements is C-adequate. Previous rigorous evaluations of coverage criteria mostly focused on such adequate test suites: given criteria C and C′, are C-adequate suites (on average) more effective than C′-adequate suites? However, in many realistic cases producing adequate suites is impractical or even impossible. We present the first extensive study that evaluates coverage criteria for the common case of non-adequate test suites: given criteria C and C′, which one is better to use to compare test suites? Namely, if suites T1, T2 . . . Tn have coverage values c1, c2 . . . cn for C and c′1, c′2 . . . c′n for C′, is it better to compare suites based on c1, c2 . . . cn or based on c′1, c′ 2 . . . c′n? We evaluate a large set of plausible criteria, including statement and branch coverage, as well as stronger criteria used in recent studies. Two criteria perform best: branch coverage and an intra-procedural acyclic path coverage.

fundamental approaches to software engineering | 2011

Testing container classes: random or systematic?

Rohan Sharma; Milos Gligoric; Andrea Arcuri; Gordon Fraser; Darko Marinov

Container classes such as lists, sets, or maps are elementary data structures common to many programming languages. Since they are a part of standard libraries, they are important to test, which led to research on advanced testing techniques targeting such containers and research on comparing testing techniques using such containers. However, these techniques have not been thoroughly compared to simpler techniques such as random testing. We present the results of a larger case study in which we compare random testing with shape abstraction, a systematic technique that showed the best results in a previous study. Our experiments show that random testing is about as effective as shape abstraction for testing these containers, which raises the question whether containers are well suited as a benchmark for comparing advanced testing techniques.

foundations of software engineering | 2011

Improved multithreaded unit testing

Vilas Jagannath; Milos Gligoric; Dongyun Jin; Qingzhou Luo; Grigore Rosu; Darko Marinov

Multithreaded code is notoriously hard to develop and test. A multithreaded test exercises the code under test with two or more threads. Each test execution follows some schedule/interleaving of the multiple threads, and different schedules can give different results. Developers often want to enforce a particular schedule for test execution, and to do so, they use time delays (Thread.sleep in Java). Unfortunately, this approach can produce false positives or negatives, and can result in unnecessarily long testing time. This paper presents IMUnit, a novel approach to specifying and executing schedules for multithreaded tests. We introduce a new language that allows explicit specification of schedules as orderings on events encountered during test execution. We present a tool that automatically instruments the code to control test execution to follow the specified schedule, and a tool that helps developers migrate their legacy, sleep-based tests into event-based tests in IMUnit. The migration tool uses novel techniques for inferring events and schedules from the executions of sleep-based tests. We describe our experience in migrating over 200 tests. The inference techniques have high precision and recall of over 75%, and IMUnit reduces testing time compared to sleep-based tests on average 3.39x.

international symposium on software testing and analysis | 2015

Practical regression test selection with dynamic file dependencies

Milos Gligoric; Lamyaa Eloussi; Darko Marinov

Regression testing is important but can be time-intensive. One approach to speed it up is regression test selection (RTS), which runs only a subset of tests. RTS was proposed over three decades ago but has not been widely adopted in practice. Meanwhile, testing frameworks, such as JUnit, are widely adopted and well integrated with many popular build systems. Hence, integrating RTS in a testing framework already used by many projects would increase the likelihood that RTS is adopted. We propose a new, lightweight RTS technique, called Ekstazi, that can integrate well with testing frameworks. Ekstazi tracks dynamic dependencies of tests on files, and unlike most prior RTS techniques, Ekstazi requires no integration with version-control systems. We implemented Ekstazi for Java and JUnit, and evaluated it on 615 revisions of 32 open-source projects (totaling almost 5M LOC) with shorter- and longer-running test suites. The results show that Ekstazi reduced the end-to-end testing time 32% on average, and 54% for longer-running test suites, compared to executing all tests. Ekstazi also has lower end-to-end time than the existing techniques, despite the fact that it selects more tests. Ekstazi has been adopted by several popular open source projects, including Apache Camel, Apache Commons Math, and Apache CXF.

foundations of software engineering | 2014

Balancing trade-offs in test-suite reduction

August Shi; Alex Gyori; Milos Gligoric; Andrey Zaytsev; Darko Marinov

Regression testing is an important activity but can get expensive for large test suites. Test-suite reduction speeds up regression testing by identifying and removing redundant tests based on a given set of requirements. Traditional research on test-suite reduction is rather diverse but most commonly shares three properties: (1) requirements are defined by a coverage criterion such as statement coverage; (2) the reduced test suite has to satisfy all the requirements as the original test suite; and (3) the quality of the reduced test suites is measured on the software version on which the reduction is performed. These properties make it hard for test engineers to decide how to use reduced test suites. We address all three properties of traditional test-suite reduction: (1) we evaluate test-suite reduction with requirements defined by killed mutants; (2) we evaluate inadequate reduction that does not require reduced test suites to satisfy all the requirements; and (3) we propose evolution-aware metrics that evaluate the quality of the reduced test suites across multiple software versions. Our evaluations allow a more thorough exploration of trade-offs in test-suite reduction, and our evolution-aware metrics show how the quality of reduced test suites can change after the version where the reduction is performed. We compare the trade-offs among various reductions on 18 projects with a total of 261,235 tests over 3,590 commits and a cumulative history spanning 35 years of development. Our results help test engineers make a more informed decision about balancing size, coverage, and fault-detection loss of reduced test suites.

automated software engineering | 2013

Operator-based and random mutant selection: better together

Lingming Zhang; Milos Gligoric; Darko Marinov; Sarfraz Khurshid

Mutation testing is a powerful methodology for evaluating the quality of a test suite. However, the methodology is also very costly, as the test suite may have to be executed for each mutant. Selective mutation testing is a well-studied technique to reduce this cost by selecting a subset of all mutants, which would otherwise have to be considered in their entirety. Two common approaches are operator-based mutant selection, which only generates mutants using a subset of mutation operators, and random mutant selection, which selects a subset of mutants generated using all mutation operators. While each of the two approaches provides some reduction in the number of mutants to execute, applying either of the two to medium-sized, real-world programs can still generate a huge number of mutants, which makes their execution too expensive. This paper presents eight random sampling strategies defined on top of operator-based mutant selection, and empirically validates that operator-based selection and random selection can be applied in tandem to further reduce the cost of mutation testing. The experimental results show that even sampling only 5% of mutants generated by operator-based selection can still provide precise mutation testing results, while reducing the average mutation testing time to 6.54% (i.e., on average less than 5 minutes for this study).

international conference on software testing, verification, and validation | 2010

MuTMuT: Efficient Exploration for Mutation Testing of Multithreaded Code

Milos Gligoric; Vilas Jagannath; Darko Marinov

Mutation testing is a method for measuring the quality of test suites. Given a system under test and a test suite, mutations are systematically inserted into the system, and the test suite is executed to determine which mutants it detects. A major cost of mutation testing is the time required to execute the test suite on all the mutants. This cost is even greater when the system under test is multithreaded: not only are test cases from the test suite executed on many mutants, but also each test case is executed for multiple possible thread schedules. We introduce a general framework that can reduce the time for mutation testing of multithreaded code. We present four techniques within the general framework and implement two of them in a tool called MuTMuT. We evaluate MuTMuT on eight multithreaded programs. The results show that MuTMuT reduces the time for mutation testing, substantially over a straightforward mutant execution and up to 77% with the advanced technique over the basic technique.

international symposium on software testing and analysis | 2013

Selective mutation testing for concurrent code

Milos Gligoric; Lingming Zhang; Cristiano Pereira; Gilles Pokam

Concurrent code is becoming increasingly important with the advent of multi-cores, but testing concurrent code is challenging. Researchers are developing new testing techniques and test suites for concurrent code, but evaluating these techniques and test suites often uses a small number of real or manually seeded bugs. Mutation testing allows creating a large number of buggy programs to evaluate test suites. However, performing mutation testing is expensive even for sequential code, and the cost is higher for concurrent code where each test has to be executed for many (possibly all) thread schedules. The most widely used technique to speed up mutation testing is selective mutation, which reduces the number of mutants by applying only a subset of mutation operators such that test suites that kill all mutants generated by this subset also kill (almost) all mutants generated by all mutation operators. To date, selective mutation has been used only for sequential mutation operators. This paper explores selective mutation for concurrent mutation operators. Our results identify several sets of concurrent mutation operators that can effectively reduce the number of mutants, show that operator-based selection is slightly better than random mutant selection, and show that sequential and concurrent mutation operators are independent, demonstrating the importance of studying concurrent mutation operators.

european conference on object oriented programming | 2013

Systematic testing of refactoring engines on real software projects

Milos Gligoric; Farnaz Behrang; Yilong Li; Jeffrey Overbey; Munawar Hafiz; Darko Marinov

Testing refactoring engines is a challenging problem that has gained recent attention in research. Several techniques were proposed to automate generation of programs used as test inputs and to help developers in inspecting test failures. However, these techniques can require substantial effort for writing test generators or finding unique bugs, and do not provide an estimate of how reliable refactoring engines are for refactoring tasks on real software projects. This paper evaluates an end-to-end approach for testing refactoring engines and estimating their reliability by (1) systematically applying refactorings at a large number of places in well-known, open-source projects and collecting failures during refactoring or while trying to compile the refactored projects, (2) clustering failures into a small, manageable number of failure groups, and (3) inspecting failures to identify non-duplicate bugs. By using this approach on the Eclipse refactoring engines for Java and C, we already found and reported 77 new bugs for Java and 43 for C. Despite the seemingly large numbers of bugs, we found these refactoring engines to be relatively reliable, with only 1.4% of refactoring tasks failing for Java and 7.5% for C.

Explore More