Toward Speeding up Mutation Analysis by Memoizing Expensive Methods
Ali Ghanbari, Andrian Marcus
University of Texas at Dallas, TX 75080, USA
{ali.ghanbari,amarcus}@utdallas.edu

Abstract—Mutation analysis has many applications, such as assessing the quality of test cases, fault localization, test input generation, security analysis, etc. Such applications involve running the test suite against a large number of program mutants, leading to poor scalability. Much research has been aimed at speeding up this process, focusing on reducing the number of mutants, the number of executed tests, or the execution time of the mutants. This paper presents a novel approach, named MeMu, for reducing the execution time of the mutants by memoizing the most expensive methods in the system. Memoization is an optimization technique that allows bypassing the execution of expensive methods when repeated inputs are detected. MeMu can be used in conjunction with existing acceleration techniques. We implemented MeMu on top of PITest, a well-known JVM bytecode-level mutation analysis system, and obtained, on average, an 18.15% speed-up over PITest in the execution time of the mutants for 12 real-world programs. These promising results, and the fact that MeMu could also be used for other applications that involve repeated execution of tests (e.g., automatic program repair and regression testing), strongly support future research for improving its efficiency.
Index Terms—Memoization, Mutation Analysis, Test Case, JVM
I. INTRODUCTION
Mutation analysis/testing [1] is a program analysis technique that involves generating a pool of program variants, called mutants, by systematically mutating program elements (e.g., replacing an arithmetic operator with another) and running the test suite against the mutants. Mutation analysis has been mainly used for assessing test adequacy by computing a mutation score, which indicates how good a test suite is at detecting bugs [1]–[3]. In addition, mutation analysis has been used for other purposes, such as fault localization [4]–[6], automatic program repair [7]–[11], test generation [12]–[14] and prioritization [15], program verification [16], [17], etc.
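To make the arithmetic-operator-replacement example above concrete, the following sketch (with hypothetical class and method names) shows an original method next to one of its mutants; a test suite "kills" this mutant only if some test observes a different outcome on it.

```java
// Original method under test (hypothetical example).
class Account {
    static int fee(int balance, int rate) {
        return balance * rate / 100;   // original: multiplication
    }
}

// A mutant produced by replacing one arithmetic operator.
// A test that distinguishes the two behaviors kills this mutant.
class AccountMutant {
    static int fee(int balance, int rate) {
        return balance + rate / 100;   // mutated: '*' replaced by '+'
    }
}
```

For instance, a test asserting `fee(200, 10) == 20` kills the mutant, since the mutant returns 200 for the same inputs.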
Despite its success in some practical use cases [18], [19], mutation analysis suffers from poor scalability. One main reason is that the generated mutants must be tested against the test suite, and usually a large number of mutants are generated, making the process lengthy. Much research has been devoted to reducing the cost of mutation analysis [20], [21], focusing primarily on: (1) reducing the number of generated [22]–[24] or executed mutants [25]–[27]; (2) reducing the number of tests [28]–[30] or reordering them [31]; (3) reducing mutant execution time [32]–[35].

In this paper, we focus on reducing mutant execution time. The common aspect of most approaches in this category is that they focus, one way or another, on the mutated code. We contend that the execution time during mutation analysis can also be reduced by reducing the execution time of unmutated code. Such a speed-up technique complements existing acceleration techniques, as they are orthogonal to each other. Specifically, we focus on reducing the execution time of (unmutated) expensive methods, i.e., those that have a longer execution time relative to other methods. Given that mutation analysis requires many repeated test executions, and a mutation involves small (usually single-point) changes to the program, we expect that unmutated expensive methods are executed frequently. The more frequently these methods are executed, the bigger the time savings will be. In support of our idea, an empirical study (see §II) revealed that the execution of the top 20% most expensive methods accounts for 43.21% of the mutant testing execution time (on average).

This paper investigates the use of memoization [36] for speeding up the execution of expensive methods in the context of mutation analysis.
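In its classic form, memoization wraps a pure function with a cache keyed on its inputs, so repeated inputs bypass the expensive body entirely. A minimal generic sketch (hypothetical example, not MeMu's actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

class Memoized {
    private static final Map<Integer, Long> cache = new HashMap<>();
    static int calls = 0;  // counts actual executions of the body

    // Memoized function: on a cache hit the stored result is returned
    // and the (expensive) body is never re-executed.
    static long slowSquare(int n) {
        Long hit = cache.get(n);
        if (hit != null) {
            return hit;                // cache hit: no recomputation
        }
        calls++;
        long result = (long) n * n;    // stands in for an expensive computation
        cache.put(n, result);          // record the input-output pair
        return result;
    }
}
```

Calling `slowSquare(12)` twice executes the body only once; the second call is a table look-up.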
Memoization is an optimization technique that stores the results of expensive function calls and returns the cached result when the same inputs occur again. It has been successfully used for speeding up recursive functions [36], [37], optimizing functional programs [38], [39], and eliminating performance bottlenecks [40], [41]. We introduce and evaluate a technique, named MeMu (Memoized Mutation analysis), for reducing the execution time of expensive methods during mutation analysis via memoization. Specifically, after identifying the expensive methods, MeMu records a snapshot of the state of the unmutated program at the entry and exit point(s) of those methods, in the form of input-output pairs, and stores them in a memo-table. When testing the mutants, upon the invocation of an expensive method, MeMu does a light-weight table look-up to check if a given input has already been recorded in the memo-table. If a match for the given input is found, then it updates the system state with the pre-recorded state, without executing the expensive method. Otherwise, if the input is not in the memo-table (i.e., a cache miss occurs), the method is executed.

MeMu is independent of mutant generation and test case selection/reordering, and it is meant to be used in conjunction with any existing mutation analysis tool. We implemented and evaluated an instance of MeMu, built on top of the PITest mutation analysis system [42]. As such, the MeMu prototype is usable with JVM-based programming languages. We used MeMu for analyzing the tests of 12 real-world programs, resulting in an 18.15% speed-up over PITest, on average (min. -0.66%, max. 51.77%), for mutant testing.

For any mutation analysis optimization technique, speed-up comes with two main challenges: (1) limiting the overhead costs; and (2) maintaining the true value of the mutation score. Our work highlights challenges and solutions in achieving these goals.
For example, memoizing all the expensive methods results in significant runtime overhead, caused by loading and deserializing large memo-table databases and by a large number of cache misses. We also found that memoizing non-deterministic methods adversely affects the mutation score. We introduce a novel technique, which we call provisional memoization (see §III), to reduce the size of the memo-table databases and the number of cache misses. Provisional memoization also identifies certain non-memoizable methods, e.g., those that involve non-determinism.

We argue that the use of memoization (with provisional memoization) for speeding up mutation analysis is promising, and we anticipate that future research will further reduce the overhead. Such research is worth pursuing, as MeMu could also be used for speeding up other automated software quality assurance techniques (e.g., [7], [43], [44]) that also rely on the repeated execution of a test suite on the program.

II. MOTIVATIONAL EMPIRICAL STUDY
We conducted an empirical study to understand how much time is spent on the repeated execution of the expensive methods during mutation testing. The premise of our memoization approach is that the execution of the most expensive methods amounts to a significant percentage of the total mutant execution time. We used PITest [18], a state-of-the-art JVM bytecode-based mutation analysis system. It offers 29 mutation operators (including commonly-used ones [2]) and performs on-the-fly mutant generation and testing via ASM [45] and the Java instrumentation API [46], mitigating the compilation and test isolation overhead.

As subjects, we selected 12 real-world programs (see Table I), which are widely used in mutation analysis research [25], [47], [48]. Table I lists the programs, the revisions that we used, and their sizes (number of tests and methods).

We measured the time it took to generate and test (execute) the mutants. To measure mutant execution time, we calculated the difference between the time before and after executing the mutant. By subtracting it from the total mutation analysis time, we obtain an approximation of the other activities (e.g., mutant generation) performed by PITest. To calculate the execution time of individual methods during mutant execution, we instrumented the mutants and injected before and after advice, using ASM, to calculate the difference between the time at the entry and exit point(s) of the methods. We used a Dell workstation with a 3.70 GHz CPU and 126 GB of RAM, running Ubuntu 18.04.4 LTS. All time measurements are in seconds and are the average of two executions, rounded to the nearest integer.

Table I reports the execution time of the top 20% most expensive methods, the execution time of all the methods, and the total time needed by PITest to perform the mutant testing and other activities such as mutant generation, mutation score computation, etc.

Fig. 1: The memoization component of MeMu (processes: call graph analysis, determinacy analysis, profiler, dependency analysis, side-effect analysis, and the memoizer; produced artifacts: call graph, deterministic methods, expensive deterministic methods, instance and static field accesses, dependency relation, and the memo-tables database; inputs: source code, test cases, and the user expensiveness criterion; the resulting information is exported to the client component). Processes are represented as double-lined rectangles and information produced/consumed by processes as dashed rounded rectangles. Each process uses the program source code and tests as input.
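The per-method timing measurement used in the study can be sketched, conceptually, as entry/exit probes of the following shape (hypothetical `MethodTimer` name; in the actual setup the probes are injected as ASM before/after advice at the bytecode level):

```java
import java.util.HashMap;
import java.util.Map;

class MethodTimer {
    // Accumulated wall-clock time per method, in nanoseconds.
    static final Map<String, Long> elapsed = new HashMap<>();

    // Equivalent of the injected "before" and "after" advice:
    // record the time difference between the method's entry
    // and exit point(s), summed over all invocations.
    static long timed(String methodName, Runnable body) {
        long start = System.nanoTime();
        body.run();
        long delta = System.nanoTime() - start;
        elapsed.merge(methodName, delta, Long::sum);
        return delta;
    }
}
```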
We observe that PITest spends, on average, 69.97% of its time on testing mutants. More importantly, executing the top 20% most expensive methods in the programs accounted, on average, for 43.21% of the mutant test execution time (min. 10.11%, max. 78.06%). These findings imply that reducing the execution time of a relatively small fraction of the methods (i.e., the 20% most expensive ones) may lead to a significant reduction in the overall mutation analysis time. They serve as motivation for our approach to focus on method-level optimization for reducing the execution time.

III. MEMOIZED MUTATION ANALYSIS FRAMEWORK
MeMu is designed as a framework with two main components: the memoization component and the client component. The memoization component (see Fig. 1) is responsible for identifying and memoizing the expensive methods, and passing this information to the client component. The client component can be an existing mutation analysis tool that is modified to intercept the execution of the expensive methods identified by the memoization component, so as to check whether or not it can reuse the already computed results instead of re-executing the method. We implemented the client component for mutation testing by modifying PITest [42].

We describe the data used and produced by the processes in the memoization component (see Fig. 1) and the client component. In short, to memoize a method, MeMu records a snapshot of the state of the unmutated program at the entry and exit point(s) of the expensive methods, in the form of input-output pairs, and stores them in a memo-table. The collection of memo-tables, i.e., the memo-tables database, is then passed to the client component, which uses it to bypass the execution of methods when a "cache hit" occurs during mutant execution.

It is impractical to memoize all methods, as doing so leads to large overhead, and in many cases the execution time of a method may actually be shorter than a look-up in the memo-table. Hence, as discussed before, we focus on memoizing only the expensive methods. Given a program text comprised of the source code and a test suite, the framework first needs to determine which methods to memoize.

A. Which methods to memoize?
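Conceptually, the selection this subsection describes reduces to filtering and ranking profiled methods. A sketch with hypothetical names (the threshold and limit parameters it uses are explained below):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ExpensivenessFilter {
    // Pick at most `limit` methods whose measured execution time
    // exceeds `thresholdMs`, most expensive first.
    static List<String> select(Map<String, Double> timeMsByMethod,
                               double thresholdMs, int limit) {
        return timeMsByMethod.entrySet().stream()
                .filter(e -> e.getValue() > thresholdMs)
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```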
In order to avoid memoizing non-expensive methods, MeMu uses two user-provided parameters: a threshold, τ, and a limit, ℓ, which define the expensiveness criterion. It attempts to memoize the ℓ most expensive methods with execution time longer than τ milliseconds. However, not all of these expensive methods are memoizable.

The framework applies a call graph analysis to obtain the call graph. In our implementation, we used the WALA program analysis infrastructure [49] to construct a 0-CFA call graph, but, of course, other tools may also be used. The call graph is then used by additional analyses to determine which of the expensive methods should be memoized.

First, the dependency analysis determines the reflexive, transitive closure of the call graph, which is also sent to the client. The resulting dependency relations are used for identifying methods that should not be memoized: if the intercepted method (i.e., the one to be memoized) depends on a mutated/modified method, or itself undergoes a mutation/modification, then the method shall not be memoized.

Second, the determinacy analysis identifies the methods that depend on time and/or random number generators, or return values computed in such a manner. We refer to these methods as likely non-deterministic methods. MeMu does not memoize these methods, as they might result in a large number of cache misses (due to the way their input/output is obtained) or change the semantics of programs. The set of methods that are not likely non-deterministic (i.e., the deterministic methods) is used in the next process.

Finally, the profiler instruments the system to measure the execution time of the likely deterministic methods and determines the expensive methods that will be memoized. It also records the coverage information of each test case, used for excluding unnecessary test cases, for faster memoization.

B. Memoization
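At its core, a memo-table maps entry-point state snapshots to recorded exit-point state. An abstract sketch (a hypothetical simplification; MeMu records the actual heap state via reflection):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One memo-table per memoized method. The key is a snapshot of the
// observable state at the method's entry point (arguments plus the
// instance/static fields the method may access); the value holds the
// recorded return value and exit-point field values.
class MemoTable {
    private final Map<List<Object>, List<Object>> table = new HashMap<>();

    // Memoization phase: record an input-output pair from the unmutated run.
    void record(List<Object> entryState, List<Object> exitState) {
        table.put(entryState, exitState);
    }

    // Mutant-testing phase: a non-null result is a cache hit, letting the
    // client restore the recorded state instead of executing the method;
    // null is a cache miss, so the method body runs normally.
    List<Object> lookup(List<Object> entryState) {
        return table.get(entryState);
    }
}
```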
Recording all variables in a memo-table may lead to very large tables. So, before constructing the memo-tables, MeMu filters out fields that are untouched. The side-effect analysis determines which method may access which static/instance fields, either directly or by calling another method. To keep the size of the memo-tables small and to optimize table look-up within the client, the framework only uses the accessed fields.

The Memoizer constructs a minimal memo-tables database for the methods that are deemed memoizable in the previous steps (i.e., the expensive, deterministic methods). This is done by applying two filtering steps. First, MeMu determines which methods will not result in failures when memoized. This is achieved via provisional memoization, which tentatively memoizes methods and excludes non-memoizable ones. Specifically, we consider a memoization attempt on a method as failed if memoizing the method results in (new) failed tests. In this way, we can single out non-memoizable methods. Second, before passing the memo-tables database to the client component, MeMu removes the methods incurring cache misses when they are tested against covering tests. This is done by post-processing the database using the execution information obtained during provisional memoization.

C. Client Component
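The light-weight check that this subsection describes, injected at the start of every memoizable method, can be sketched as follows (hypothetical signature; the real check is woven into the mutant's bytecode):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class MemoizedInvocation {
    // Guard injected at the entry of a memoizable method: returns the
    // recorded exit state on a cache hit, or null when the method body
    // must run normally (the method is mutated, depends on the mutated
    // method, or its entry state was never recorded).
    static List<Object> tryBypass(String method,
                                  String mutatedMethod,
                                  Set<String> dependenciesOfMethod,
                                  Map<List<Object>, List<Object>> memoTable,
                                  List<Object> entryState) {
        if (method.equals(mutatedMethod) || dependenciesOfMethod.contains(mutatedMethod)) {
            return null;  // no memoization when the mutation could change behavior
        }
        return memoTable.get(entryState);  // hit: restore this state and return
    }
}
```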
The client component for MeMu is constructed by modifying PITest such that it loads the memo-tables database in each mutant testing process that PITest forks, and we instrument the mutant code such that the memoizable methods do a light-weight check before proceeding to run their bodies. The methods check whether they are mutated or depend on some mutated method; if that is the case, no memoization takes place. Otherwise, they do a light-weight table look-up based on the state of the system at their entry points and update the system state if such a state has occurred previously. The method then immediately returns without executing its body.

IV. EMPIRICAL EVALUATION AND DISCUSSION
We conducted an empirical study to assess whether MeMu obtains any speed-up in mutant execution time compared to PITest. We used the same subjects as in the motivational study described in §II. We set τ and ℓ to 1 ms and 20% of the number of methods of each subject program, respectively.

TABLE I: Result of applying PITest and MeMu on 12 systems (commons-codec, commons-math, commons-cli, commons-csv, closure-compiler, commons-io, commons-fileupload, jfreechart, commons-imaging, commons-lang, joda-time, and commons-geometry-euclidean). MT = mutant testing time, Score = mutation score, MM = memoized methods.

The right-hand side of Table I summarizes the information about MeMu's execution. Comparing PITest's and MeMu's mutant testing times (i.e., the two MT (s) columns in Table I), in 10 out of 12 cases MeMu completes the execution faster. Excluding jfreechart, MeMu results in an 18.15% speed-up (on average; minimum -0.66%, maximum 51.77%) over PITest.

We analyzed the two cases where MeMu did not obtain a speed-up: jfreechart and joda-time. For the jfreechart system, our implementation of MeMu fails to completely restore the system state, so provisional memoization fails to memoize any methods; we therefore did not perform memoized mutation testing for that subject, hence the "N/A" values in the table. The reason is that jfreechart uses graphics libraries that involve system states which are inaccessible through the Java reflection API [50] used by our framework. We expect MeMu to have the same problem with other similar systems. However, this is not a shortcoming of the idea, but rather a consequence of our engineering choice to use reflection, and it will be addressed in future work.

Thanks to the provisional memoization algorithm, we have been able to exclude non-memoizable methods (see the MM column in Table I). The subject programs have up to 3,521 methods (20% of which is 704), yet MeMu memoizes no more than 1% of the methods (min. 1, max. 50) and still achieves a considerable amount of time saving.

Provisional memoization also kept the number of cache misses low for the memoized methods. However, the algorithm is not perfect; for the joda-time system, MeMu is slower than PITest, because the memoized methods do not result in any cache hits during the mutant executions.

Finally, we believe that memoization also resulted in lossless mutation testing: for the subject programs with constant mutation scores between runs, the mutation score before and after memoization did not change.

V. RELATED WORK
Conventionally, approaches for reducing mutation analysis costs are classified into three major categories [20], [51]: (1) do fewer approaches strive to generate/test as few mutants as possible with minimal adverse effect on the mutation score [24]–[28]; (2) do faster approaches are meant to generate and run mutants as fast as possible (without any concern about the mutation score) [22], [23], [29]–[31], [52], [53]; (3) do smarter approaches are intended to distribute the workload of testing mutants onto several machines or several cores of a single machine [18], [54], [55], or to factor out shared state between mutant executions and avoid re-executing it [32]–[35].

MeMu fits in the third category, as it applies a semantics-preserving program optimization method (i.e., memoization in this case) to the unmutated parts of the mutants to avoid re-executing (expensive) methods for which the state of the system at the entry and exit point(s) does not change from one execution to another. We discuss here the works in this category, which we consider most related to MeMu.

Split-stream [32], [34] and its modern variants are intended to avoid repeated execution of the part of the code that is shared between mutants. Mutations targeting the same program element result in many mutants that share the same code before the mutation impact point. Executing this portion of the mutants (provided that the program is deterministic) will always produce the same output. Split-stream runs these portions only once and forks a different process for each mutant after the mutation point of impact to test individual mutants. The modern incarnation of split-stream [33] attempts to reuse shared program states even after the mutation point of impact.

Just et al. [35] propose three runtime optimizations that result in a 40% speed-up of their MAJOR mutation analysis system [48]: (1) if a mutation does not result in a program state change immediately after the mutation point, the corresponding mutant is marked as survived, i.e., not killed, and the test execution is terminated; (2) even if a mutation infects the system state in an expression, when the change does not propagate to the subsequent statements, the corresponding mutant is marked as survived and the test execution is terminated; (3) mutants that infect the state of the system in the same way are executed only once.

Since MeMu optimizes the execution of unmutated code, and the memoization does not influence the effect of the mutation, it can complement existing cost reduction techniques. The information collection processes can be parallelized with the pre-processing done by such complementary techniques, in a non-interfering manner, to further speed up the mutation analysis process.

VI. CONCLUSIONS AND FUTURE WORK
The new idea put forward in this paper is speeding up mutation analysis by automatically memoizing expensive methods. Our optimization is orthogonal to state-of-the-art cost reduction techniques for mutation analysis and can be used together with them to further speed up the process. An empirical study using the state-of-the-art, JVM-based mutation analysis tool PITest [18] and 12 real-world programs revealed that 43.21% (avg.) of the mutant execution time is spent on executing the top 20% most expensive methods. This finding supports the intuition behind the memoization-based approach for speeding up mutant execution. An additional empirical study showed that memoizing a small subset of these expensive methods (1% of all methods) leads to an average 18.15% speed-up during mutant testing. We uncovered two specific issues important for successful memoization: identifying non-memoizable methods and minimizing the number of cache misses during testing. Provisional memoization shows promise in tackling these issues. Future work will focus on more light-weight techniques, based on statistical models, which may be less costly than provisional memoization. The other analyses used during memoization can be optimized through parallelization. We contend that other software quality assurance methods that rely on repeated execution of the code, such as automatic program repair and regression testing, can also benefit from the memoization idea. Hence, the potential advantages largely exceed those reported here, supporting future work that will further optimize the memoization approach.

DATA AVAILABILITY
Data are available at https://bit.ly/3omErsz.
ACKNOWLEDGMENTS

This research was partially supported by the NSF grants CCF-1910976 and CCF-1955837.

REFERENCES

[1] R. A. DeMillo, R. J. Lipton, and F. G. Sayward, "Hints on test data selection: Help for the practicing programmer," IEEE Computer, pp. 34–41, 1978.
[2] P. Ammann and J. Offutt, Introduction to software testing. Cambridge University Press, 2016.
[3] W. Visser, "What makes killing a mutant hard," in ASE, 2016, pp. 39–44.
[4] W. E. Wong, R. Gao, Y. Li, R. Abreu, and F. Wotawa, "A survey on software fault localization," TSE, pp. 707–740, 2016.
[5] M. Papadakis and Y. Le Traon, "Metallaxis-fl: mutation-based fault localization," Software Testing, Verification and Reliability, vol. 25, no. 5-7, pp. 605–628, 2015.
[6] ——, "Using mutants to locate 'unknown' faults." IEEE, 2012, pp. 691–700.
[7] C. Le Goues, M. Pradel, and A. Roychoudhury, "Automated program repair," CACM, 2019.
[8] V. Debroy and W. E. Wong, "Using mutation to automatically suggest fixes for faulty programs," in ICST, 2010, pp. 65–74.
[9] A. Arcuri, "Evolutionary repair of faulty software," ASC, pp. 3494–3514, 2011.
[10] A. Ghanbari, S. Benton, and L. Zhang, "Practical program repair via bytecode mutation," in ISSTA, 2019, pp. 19–30.
[11] X.-B. D. Le, D.-H. Chu, D. Lo, C. Le Goues, and W. Visser, "S3: syntax- and semantic-guided repair synthesis via programming by examples," in FSE, 2017, pp. 593–604.
[12] G. Fraser and A. Arcuri, "Achieving scalable mutation-based generation of whole test suites," ESE, pp. 783–812, 2015.
[13] F. C. M. Souza, M. Papadakis, Y. Le Traon, and M. E. Delamaro, "Strong mutation-based test data generation using hill climbing," in IWSBST, 2016, pp. 45–54.
[14] R. A. DeMillo, A. J. Offutt et al., "Constraint-based automatic test data generation," TSE, pp. 900–910, 1991.
[15] D. Shin, S. Yoo, M. Papadakis, and D.-H. Bae, "Empirical evaluation of mutation-based test case prioritization techniques," STVR, p. e1695, 2019.
[16] J. P. Galeotti, C. A. Furia, E. May, G. Fraser, and A. Zeller, "Inferring loop invariants by mutation, dynamic analysis, and static checking," TSE, pp. 1019–1037, 2015.
[17] A. Groce, I. Ahmed, C. Jensen, and P. E. McKenney, "How verified is my code? falsification-driven verification (t)," in ASE, 2015, pp. 737–748.
[18] H. Coles, T. Laurent, C. Henard, M. Papadakis, and A. Ventresque, "Pit: a practical mutation testing tool for java," in Proceedings of the 25th International Symposium on Software Testing and Analysis, 2016, pp. 449–452.
[19] I. Ahmed, C. Jensen, A. Groce, and P. E. McKenney, "Applying mutation analysis on kernel test suites: an experience report," in ICSTW, 2017, pp. 110–115.
[20] A. V. Pizzoleto, F. C. Ferrari, J. Offutt, L. Fernandes, and M. Ribeiro, "A systematic literature review of techniques and metrics to reduce the cost of mutation testing," JSS, vol. 157, p. 110388, 2019.
[21] Y. Jia and M. Harman, "An analysis and survey of the development of mutation testing," TSE, pp. 649–678, 2010.
[22] R. H. Untch, A. J. Offutt, and M. J. Harrold, "Mutation analysis using mutant schemata," in ISSTA, 1993, pp. 139–148.
[23] P. R. Mateo and M. P. Usaola, "Mutant execution cost reduction: Through music (mutant schema improved with extra code)," in ICST, 2012, pp. 664–672.
[24] W. E. Wong and A. P. Mathur, "Reducing the cost of mutation testing: An empirical study," JSS, pp. 185–196, 1995.
[25] J. Zhang, Z. Wang, L. Zhang, D. Hao, L. Zang, S. Cheng, and L. Zhang, "Predictive mutation testing," in ISSTA, 2016, pp. 342–353.
[26] D. Mao, L. Chen, and L. Zhang, "An extensive study on cross-project predictive mutation testing," in ICST, 2019, pp. 160–171.
[27] X. Devroey, G. Perrouin, M. Papadakis, A. Legay, P.-Y. Schobbens, and P. Heymans, "Automata language equivalence vs. simulations for model-based mutant equivalence: An empirical evaluation," in ICST, 2017, pp. 424–429.
[28] L. Chen and L. Zhang, "Speeding up mutation testing via regression test selection: An extensive study," in ICST, 2018, pp. 58–69.
[29] L. Zhang, D. Marinov, L. Zhang, and S. Khurshid, "Regression mutation testing," in ISSTA, 2012, pp. 331–341.
[30] M. Gligoric, V. Jagannath, and D. Marinov, "Mutmut: Efficient exploration for mutation testing of multithreaded code," in ICST, 2010, pp. 55–64.
[31] L. Zhang, D. Marinov, and S. Khurshid, "Faster mutation testing inspired by test prioritization and reduction," in ISSTA, 2013, pp. 235–245.
[32] K. N. King and A. J. Offutt, "A fortran language system for mutation-based software testing," SPE, pp. 685–718, 1991.
[33] B. Wang, Y. Xiong, Y. Shi, L. Zhang, and D. Hao, "Faster mutation analysis via equivalence modulo states," in ISSTA, 2017, pp. 295–306.
[34] S. Tokumoto, H. Yoshida, K. Sakamoto, and S. Honiden, "Muvm: Higher order mutation analysis virtual machine for c," in ICST, 2016, pp. 320–329.
[35] R. Just, M. D. Ernst, and G. Fraser, "Efficient mutation analysis by propagating and partitioning infected execution states," in ISSTA, 2014, pp. 315–326.
[36] D. Michie, "'Memo' functions and machine learning," Nature, pp. 19–22, 1968.
[37] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to algorithms. MIT Press, 2009.
[38] H. Xu, C. J. Pickett, and C. Verbrugge, "Dynamic purity analysis for java programs," in PASTE, 2007, pp. 75–82.
[39] A. Heydon, R. Levin, and Y. Yu, "Caching function calls using precise dependencies," in PLDI, 2000, pp. 311–320.
[40] L. Della Toffola, M. Pradel, and T. R. Gross, "Performance problems you can fix: A dynamic analysis of memoization opportunities," OOPSLA, pp. 607–622, 2015.
[41] P. J. Guo and D. Engler, "Using automatic persistent memoization to facilitate data analysis scripting," in ISSTA, 2011, pp. 287–297.
[42] M. Delahaye and L. Du Bousquet, "Selecting a software engineering tool: lessons learnt from mutation analysis," SPE, pp. 875–891, 2015.
[43] L. Baresi, P. L. Lanzi, and M. Miraz, "Testful: an evolutionary test approach for java," in ICST, 2010, pp. 185–194.
[44] M. Z. Gligoric, "Regression test selection: Theory and practice," Ph.D. dissertation, University of Illinois at Urbana-Champaign, 2015.
[45] E. Bruneton, R. Lenglet, and T. Coupaye, "Asm: a code manipulation tool to implement adaptable systems," AECS, 2002.
[46] O. Corporation, "Java Instrumentation API," 2004, accessed: 10/20. [Online]. Available: https://bit.ly/3czmzFV
[47] D. Schuler, V. Dallmeier, and A. Zeller, "Efficient mutation testing by checking invariant violations," in ISSTA, 2009, pp. 69–80.
[48] R. Just, F. Schweiggert, and G. M. Kapfhammer, "Major: An efficient and extensible tool for mutation analysis in a java compiler," in ASE, 2011, pp. 612–615.
[49] J. Dolby, S. J. Fink, and M. Sridharan, "Tj watson libraries for analysis (wala)," 2015. [Online]. Available: https://bit.ly/3jWm8Jn
[50] O. Corporation, "Trail: The Reflection API," 2020, accessed: 10/20. [Online]. Available: https://bit.ly/37niPHJ
[51] A. J. Offutt and R. H. Untch, "Mutation 2000: Uniting the orthogonal," in Mutation testing for the new century. Springer, 2001, pp. 34–44.
[52] W. E. Howden, "Weak mutation testing and completeness of test sets," TSE, pp. 371–379, 1982.
[53] V. H. Durelli, J. Offutt, and M. E. Delamaro, "Toward harnessing high-level language virtual machines for further speeding up weak mutation testing," in ICST, 2012, pp. 681–690.
[54] R. Gopinath, C. Jensen, and A. Groce, "Topsy-turvy: a smarter and faster parallelization of mutation analysis," in ICSE-Companion, 2016, pp. 740–743.
[55] N. Li, M. West, A. Escalona, and V. H. Durelli, "Mutation testing in practice using ruby," in