JCoffee: Using Compiler Feedback to Make Partial Code Snippets Compilable
arXiv [cs.PL]
Piyush Gupta
Dept. of Computer Sc. and Engg., IIIT-Delhi
New Delhi, India
Nikita Mehrotra
Dept. of Computer Sc. and Engg., IIIT-Delhi
New Delhi, India
Rahul Purandare
Dept. of Computer Sc. and Engg., IIIT-Delhi
New Delhi, India
Abstract: Static program analysis tools are often required to work with only a small part of a program's source code, either due to the unavailability of the entire program or the lack of need to analyze the complete code. This makes it challenging to use static analysis tools that require a complete and typed intermediate representation (IR). We present JCoffee, a tool that leverages compiler feedback to convert partial Java programs into their compilable counterparts by simulating the presence of missing surrounding code. It works with any well-typed code snippet (class, function, or even an unenclosed group of statements) while making minimal changes to the input code fragment. A demo of the tool is available here: https://youtu.be/O4h2g n2Qls
Index Terms: Partial programs, Compiler feedback, Static analysis, Type inference, Java
I. INTRODUCTION & MOTIVATION
Static analysis is an indispensable tool to model a program’s structure and analyze its behavior. Most static analysis tools work on an Intermediate Representation (IR) of the code, which a compiler can build trivially when the complete program is at its disposal. However, when only a subset of the code is available, static analysis tools refuse to operate on it, despite its syntactic correctness, because of ambiguities and references to undeclared constructs.

We define a partial program (PP) as a non-empty subset of an otherwise complete program (CP), which is unlikely to have any syntactic errors. A PP is incomplete because it may contain references to classes, methods, and variables whose declarations lie in CP but not in PP. Figure 1 illustrates one such code snippet. Given only the code for a Java class Bar, we call it a partial program because the type of variable lbl, the parameter type of function doB(...), and the return type of function doA() cannot be determined from the snippet alone.

    class Bar {
      void main() {
        Foo foo = new Foo();
        foo.lbl = "Hello world";
        foo.doB(foo.doA());
      }
    }

Fig. 1: A partial program (PP).

The ability to analyze a partial program is advantageous in many scenarios where either the entire program is unavailable, or it is interesting to analyze only a recently modified snippet of the entire program, such as:
• encountering missing/deprecated dependencies during maintenance of an old project,
• analyzing code snippets from forum threads and documentation files before incorporating them in a codebase,
• independently checking bug fixes before automatic patching, and
• analyzing changes to code in web repositories to understand the evolution of source code.

In order to analyze partial programs, it is essential to overcome two challenges:
i. Resolving syntactic uncertainty that arises due to missing declarations. For example, Foo.baz() may denote either a call to a static function baz() of a class Foo, or a virtual method call to a member function baz() of an object Foo of some class, and
ii. Assigning correct data types for local variables, method return values, and class-level fields to enable IR generation.

While complete and sound disambiguation of a partial program is an undecidable problem, it works in our favor that software engineering techniques can often trade some guarantees on correctness for increased precision [1]. Some of the inferences we make during disambiguation may not be totally sound, but they pose little to no threat to the validity of the use case at hand.

The contributions of this paper are:
• A novel approach for partial program analysis that infers missing code declarations based on the compiler’s feedback during compilation in an iterative process. We focus on making little to no changes to the input snippet, and instead add other class/method definitions around it to complete the missing pieces.
• Our implementation, Java COmpiler For FEEdback (JCoffee), which realizes our approach.
• An evaluation of our tool on 9133 partial code snippets. JCoffee successfully disambiguated over 90% of the inputs, with an overall accuracy of 93.3%.

Our approach offers several benefits over existing methods. First, we do not need to propagate inferred information across errors since subsequent compile cycles automatically incorporate it. Second, we support a much broader use case as we do not need to scrape the web or maintain a repository of existing method signatures to match undeclared methods in input code. Lastly, as opposed to a simplified intermediate representation, we build the complete bytecode satisfying the strictest type guarantees of the compiler.

II. ABOUT JCOFFEE
JCoffee is based on the simple premise that Java compiler error messages are very verbose, and that much of the context needed to fix a given error is already present in its description.

Consider, for example, the PP in Figure 1. The errors generated during its compilation can be fixed by creating the class Foo, its member variable lbl, and functions doA() and doB() with the right signatures. Some of these errors and the corresponding fixed CP are shown in Figure 2. The CP can now be used to generate a class file, an intermediate representation (IR), a program dependence graph (PDG), or any other representation required for static analysis.

    error: cannot find symbol
      symbol:   class Foo
      location: class Bar
      symbol:   variable lbl
      location: variable foo of type Foo
      symbol:   method doA()
      location: variable foo of type Foo

(a) Simplified error log.

    class Bar {
      void main() {
        Foo foo = new Foo();
        foo.lbl = "Hello world";
        foo.doB(foo.doA());
      }
    }
    // Code added by JCoffee begins
    class Foo {
      public String lbl;
      public UNKNOWN doA() {return null;}
      public void doB(UNKNOWN obj) {}
    }
    class UNKNOWN { }

(b) Fixed complete program. class UNKNOWN instantiates all objects whose type cannot be resolved.

Fig. 2: Errors and their fixes for a partial program.

Interestingly, for each error, the compiler points out the exact location of the cause, the keywords it was expecting, and the keywords it found. Moreover, the errors are reported hierarchically: the fine-grained errors are detected only after the higher-level inconsistencies have been resolved.

Thus, in a use case such as ours, where the syntactic correctness of the code snippet is assured, it is reasonable to make decisions based solely on the suggestions put forth by the compiler.

Our only assumption apart from a well-typed input code is the availability of standard Java types, such as java.lang.* or java.io.*, during compile time. We view this as a reasonable assumption because they ship with standard compilers and are necessary to run Java programs.

The various errors fixed by JCoffee can be classified as:
• Identifier errors
  – cannot find symbol: The most common error, it is thrown whenever the compiler cannot find the declaration of an identifier (such as a package, interface, class, method, constructor, or variable). JCoffee declares a new identifier with the corresponding missing signature.
  – array required but . . . found: This is thrown when there is an attempt to index a variable that has not been declared as an array. JCoffee infers this and modifies the declared type of the variable.
• Computation errors
  – incorrect method, . . . cannot be applied to . . . : This error results from an incompatibility between a method’s call and its declaration. JCoffee adapts the declaration to match the call signature.
  – operator . . . cannot be applied to . . . / incompatible types / inconvertible types: Some operators are only defined for specific types, and JCoffee uses these errors to infer the type of one of the participating operands whenever possible.
  – invalid method declaration; return type required: Every method must have a return type or void specified. JCoffee uses this fact to infer that the ‘method’ here is in fact a constructor of the enclosing class.
• Access to static entities
  – non-static variable/method . . . cannot be referenced from a static context: The ‘static’ modifier states that a variable/method is associated with a class, not individual objects. This helps JCoffee infer that the corresponding variable/method is to be labeled as ‘static’.
• Miscellaneous errors
  – for-each not applicable to expression type: ‘for-each’ is one of the many ways to iterate over an iterable in Java. This error indicates that the expression is an iterable, most likely an array. Repeated occurrences of this error in the same code fragment indicate the multidimensionality of the array.
  – exception . . . is never thrown in body of corresponding try statement: This error occurs when the code snippet is missing the exception declaration, and JCoffee erroneously made a class by the same name on encountering a missing symbol error earlier. This error is then used to declare an exception instead.

III. ARCHITECTURE
At its core, JCoffee attempts to fix all the errors pointed out by the compiler in a deterministic manner, by simulating an environment of missing dependencies around the given code snippet. The tool consists of two modules, namely the Error Fixing Module (EFM) and the Intermediate Representation Generator (IRG). It also comprises the Driver Engine (DE) to integrate the two modules. Although the ideas presented are quite generic and may be implemented in any way suitable to the use case, we have implemented DE, EFM and IRG in Python. We now discuss each component in detail.

a) Error Fixing Module (EFM):
Given a code snippet and a list of compiler-generated error messages, EFM fixes each error by adding class/method/variable declarations or inferring and updating the missing return type/declared type for methods/variables. Once all errors have been handled, EFM returns the modified code.
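As an illustration only, the fixing step can be sketched in a few lines of Python. The sketch below handles a single error kind (a missing class reported as cannot find symbol) and pairs it with a retry loop in the spirit of Algorithm 1. The names fix_error, efm, drive, and fake_compile are hypothetical, and fake_compile is a toy stand-in for javac that only notices undeclared classes used with new; none of this is taken from the JCoffee sources.

```python
import re
from typing import Callable, List, Tuple

# Matches the "symbol: class Foo" line of a "cannot find symbol" report.
MISSING_CLASS = re.compile(r"symbol:\s+class\s+(\w+)")

# A compiler is modelled as: source text -> (error count, error messages).
Compiler = Callable[[str], Tuple[int, List[str]]]


def fix_error(code: str, error_msg: str) -> str:
    """Append an empty stub class when the error names a missing class.

    Only this one error kind is handled; other messages leave the code as is.
    """
    match = MISSING_CLASS.search(error_msg)
    if match is None:
        return code
    name = match.group(1)
    if re.search(r"\bclass\s+" + re.escape(name) + r"\b", code):
        return code  # already declared, e.g. by an earlier fix in this pass
    return code + "\nclass " + name + " { }\n"


def efm(code: str, error_msgs: List[str]) -> str:
    """Apply fix_error for every message, threading the modified code through."""
    mod_code = code
    for err in error_msgs:
        mod_code = fix_error(mod_code, err)
    return mod_code


def drive(code: str, compile_fn: Compiler, max_steps: int = 10) -> Tuple[bool, str]:
    """Compile, fix, and retry for at most max_steps passes."""
    for _ in range(max_steps):
        n_err, msgs = compile_fn(code)
        if n_err == 0:
            return True, code
        code = efm(code, msgs)
    return False, code


def fake_compile(code: str) -> Tuple[int, List[str]]:
    """Toy stand-in for javac: reports each `new X(` whose class X is undeclared."""
    declared = set(re.findall(r"\bclass\s+(\w+)", code))
    used = set(re.findall(r"\bnew\s+(\w+)\s*\(", code))
    msgs = [
        "error: cannot find symbol\n  symbol:   class " + name
        for name in sorted(used - declared)
    ]
    return len(msgs), msgs


if __name__ == "__main__":
    snippet = "class Bar { void main() { Foo foo = new Foo(); } }"
    ok, fixed = drive(snippet, fake_compile)
    print(ok)
    print(fixed)
```

Passing the compiler in as a function keeps the loop testable without a JDK; a real driver would presumably wrap a javac invocation behind the same interface and add one fix_error branch per error kind.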
Algorithm 1 JCoffee

    function DE(code, maxSteps)
        code ← preprocess(code)
        for ctr ← 1 to maxSteps do
            nErr, errorMsgs ← compile(code)
            if nErr = 0 then
                IRG(code)
                return true
            end if
            code ← EFM(code, errorMsgs)
        end for
        return false
    end function

    function EFM(code, errorMsgs)
        modCode ← code
        for err in errorMsgs do
            modCode ← fixError(modCode, err)
        end for
        return modCode
    end function

    function IRG(code)
        outFile ← code
        compile(outFile)    ⊲ Used for static analysis
    end function

b) Intermediate Representation Generator (IRG):
Once all the compiler-generated errors have been fixed, IRG generates bytecode, that is, the .class files. The bytecode can be used as is, or further intermediate representations may be generated depending on the individual use case.

c) Driver Engine (DE):
The first step is to pre-process the input. If the input code snippet is not already a class (or a set of classes), it is encapsulated into a placeholder method and a class structure. DE then invokes javac, the Java compiler, on the input to get a list of error messages, which are sent to EFM. A successful compilation invokes IRG. Otherwise, DE loops for a fixed number of iterations, with each EFM output becoming the input for the next invocation.

We outline the pseudocode for each of the components in Algorithm 1. The simple and modular design of JCoffee makes it easy to extend it to support a wider range of errors, keeping it compatible with future Java versions.

IV. EVALUATION
We evaluated our implementation of JCoffee to answer the following research questions:
(RQ1): How effective is JCoffee in disambiguating code snippets?
(RQ2): What is the accuracy of JCoffee in preserving the intended semantics of code during disambiguation?
(RQ3): What is the time complexity and overhead of the iterative approach used in JCoffee?
A. Dataset & Environment
We used 9133 partial code snippets from the BigCloneBench [2] dataset released by Svajlenko et al. Ranging from a few statements (3 lines) to functions as large as 920 lines of code, the snippets contained code for frequently used functionalities from open source Java projects. As almost all test samples referenced undeclared objects and functions, fewer than 1% of the snippets were initially compilable.

The experiments were conducted on a server having OpenJDK Java 8, running Ubuntu 19.04 with 12 Intel Xeon E4 CPU cores and 31 GB of memory.
B. RQ1: Effectiveness
JCoffee’s primary goal is to completely resolve all compiler errors to generate bytecode for the given partial code. With some samples producing as many as 291 errors, we had a success rate of just over 90%: JCoffee completely disambiguated 8220 samples out of the 9133 input snippets when allowed to run for up to 10 compiler iterations per input. Further, 94% of the total samples were reduced to 2 or fewer compiler errors, which may then be fixed manually.

Upon further inspection of snippets that JCoffee failed to fix, we found the following recurring constructs:
• inner classes
• Java generics
• classes/functions as parameters
• lambda expressions

We consider this to be a limitation of the current implementation, and plan to fix this in future versions.

C. RQ2: Accuracy
In some cases, the fix for a snippet does not completely capture the intended logic of the PP.

Consider, for example, the statement

    dst.transfer(src, 0, src.size());

Here, src and dst are objects of class Channel. While the PP contains their declaration, it does not contain the definition of class Channel. So, JCoffee sets the third parameter type of transfer() and the return type of size() as UNKNOWN. While it is trivial for a programmer to guess that size() likely returns an int, JCoffee does not yet use this extra information available in variable and function names.

To evaluate this impact, we randomly selected 30 test snippets and manually fixed them, considering language constructs wherever possible. We compared this gold standard with the fixes generated by JCoffee for the same partial programs. Our results show that 93.3% of the type inferences were identical, and in merely 6.7% of the cases, a human evaluator could make more precise assumptions.

Hence, one extension of our approach can be to extract linguistic information from identifier names wherever possible and benefit from it.
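One way to picture the extension suggested above is a small table of naming conventions that maps a method name to a likely Java return type. The Python sketch below is purely hypothetical: the rules, the names NAME_RULES and guess_return_type, and the fallback to UNKNOWN are assumptions made for illustration, and JCoffee as evaluated here does not use identifier names.

```python
import re

# Hypothetical naming rules based on common Java conventions.
# These are illustrative guesses, not rules implemented by JCoffee.
NAME_RULES = [
    (re.compile(r"^(is|has|can)[A-Z]"), "boolean"),
    (re.compile(r"^(size|length|count|indexOf|hashCode)$"), "int"),
    (re.compile(r"^(toString|getName)$"), "String"),
]


def guess_return_type(method_name: str, default: str = "UNKNOWN") -> str:
    """Return the first matching guess, or the placeholder type if none applies."""
    for pattern, java_type in NAME_RULES:
        if pattern.search(method_name):
            return java_type
    return default


if __name__ == "__main__":
    for name in ("size", "isEmpty", "transfer"):
        print(name, guess_return_type(name))
```

Under these assumed rules, size() from the example above would be guessed as int, while transfer() would still fall back to the placeholder type. Any such guess remains a heuristic that the compiler feedback loop would have to confirm or reject.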
D. RQ3: Time complexity & overhead
The complexity of the algorithm is bounded by $n_s$, the number of statements in the source file, $n_e$, the average number of errors detected per pass, and $p$, the number of compile passes required to disambiguate the code completely. The time complexity can thus be expressed as $O(p \cdot (n_s + n_e))$.

Out of the fixed files, 91.3% required up to five compile iterations, while 99.5% were fixed within seven iterations. On average, it took 1.64 s to fix a code snippet. We view this as an acceptable cost for complete disambiguation.

V. RELATED WORK
Several works parse incomplete code to support static analyses. To the best of our knowledge, no other research project aims to generate compilable code for Java or directly leverages compiler feedback.

PPA, introduced by Dagenais and Hendren [1], is the closest in terms of research objectives. It predicts unknown bindings for partial programs by performing partial type inference and uses heuristics to resolve ambiguities. However, it is compatible with only Java 1.4, and is now deprecated.

Melo et al. [3] develop PsycheC, with a goal similar to ours, but for the C language. They build a lattice structure for various standard C types to bind variables to them during constraint generation for type inference.

Zhong and Wang [4] propose GRAPA, which locates previously released code archives and extracts resolved types and method signatures to generate System Dependency Graphs [5] (SDG) for partial programs. While suitable for a niche use case of publicly released applications, this approach fails for partial programs in general.

PARSEWeb [6], PRIME [7], and SemDiff [8] work similarly by recommending method calls with matching signatures from arbitrary frameworks or mining specifications for extracted API calls from the code snippet. Approaches such as [9] only try to resolve missing dependencies given explicit import statements, which are rarely available in partial programs.

Further, many Integrated Development Environments (IDEs) such as [10], [11] generate a typed abstract syntax tree for incomplete code snippets. However, they throw errors when they encounter constructs with missing declarations.

VI. CONCLUSIONS AND FUTURE WORK
We implemented JCoffee, a tool that makes use of the detailed error information provided by the compiler to generate bytecode for partial programs by simulating the presence of undeclared constructs. Based on our evaluations on thousands of open source partial code snippets, JCoffee completely removes errors from over 90% of input files, with the modifications being identical to the gold standard in 93.3% of the cases.

In the future, we plan to support complex Java 8 mechanisms and add capabilities to extract information from object names for more precise type inference. Also, since our approach is general, it can be adapted to other programming languages with descriptive errors, such as Rust and Elm.

Finally, the implementation of JCoffee, benchmarks, and outputs are available at https://github.com/piyush69/JCoffee.

ACKNOWLEDGMENT
This work is partly supported by Infosys Center for Artificial Intelligence at IIIT-Delhi, Department of Science and Technology (DST) (India), Science and Engineering Research Board (SERB), the Confederation of Indian Industry (CII), and Nucleus Software Exports Ltd.

REFERENCES
[1] B. Dagenais and L. Hendren, “Enabling static analysis for partial Java programs,” in Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications, 2008, pp. 313–328.
[2] J. Svajlenko, J. F. Islam, I. Keivanloo, C. K. Roy, and M. M. Mia, “Towards a big data curated benchmark of inter-project code clones,” IEEE, 2014, pp. 476–480.
[3] L. T. Melo, R. G. Ribeiro, M. R. de Araújo, and F. M. Q. Pereira, “Inference of static semantics for incomplete C programs,” Proceedings of the ACM on Programming Languages, vol. 2, no. POPL, pp. 1–28, 2017.
[4] H. Zhong and X. Wang, “Boosting complete-code tool for partial program,” IEEE, 2017, pp. 671–681.
[5] J. Ferrante, K. J. Ottenstein, and J. D. Warren, “The program dependence graph and its use in optimization,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 9, no. 3, pp. 319–349, 1987.
[6] S. Thummalapenta and T. Xie, “Parseweb: a programmer assistant for reusing open source code on the web,” in Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering, 2007, pp. 204–213.
[7] A. Mishne, S. Shoham, and E. Yahav, “Typestate-based semantic code search over partial programs,” in Proceedings of the ACM international conference on Object oriented programming systems languages and applications, 2012, pp. 997–1016.
[8] B. Dagenais and M. P. Robillard, “Recommending adaptive changes for framework evolution,” ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 20, no. 4, pp. 1–35, 2011.
[9] J. Ossher, S. Bajracharya, and C. Lopes, “Automated dependency resolution for open source software,” in 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010)