Chen Fu | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Chen Fu is active.

Explore More

Publication

Featured researches published by Chen Fu.

international conference on software engineering | 2011

Portfolio: finding relevant functions and their usage

Collin McMillan; Mark Grechanik; Denys Poshyvanyk; Qing Xie; Chen Fu

Different studies show that programmers are more interested in finding definitions of functions and their uses than variables, statements, or arbitrary code fragments [30, 29, 31]. Therefore, programmers require support in finding relevant functions and determining how those functions are used. Unfortunately, existing code search engines do not provide enough of this support to developers, thus reducing the effectiveness of code reuse. We provide this support to programmers in a code search system called Portfolio that retrieves and visualizes relevant functions and their usages. We have built Portfolio using a combination of models that address surfing behavior of programmer and sharing Related concepts among functions. We conducted an experiment with 49 professional programmers to compare Portfolio to Google Code Search and Koders using a standard methodology. The results show with strong statistical significance that users find more relevant functions with higher precision with Portfolio than with Google Code Search and Koders.

international conference on software engineering | 2010

A search engine for finding highly relevant applications

Mark Grechanik; Chen Fu; Qing Xie; Collin McMillan; Denys Poshyvanyk; Chad M. Cumby

A fundamental problem of finding applications that are highly relevant to development tasks is the mismatch between the high-level intent reflected in the descriptions of these tasks and low-level implementation details of applications. To reduce this mismatch we created an approach called Exemplar (EXEcutable exaMPLes ARchive) for finding highly relevant software projects from large archives of applications. After a programmer enters a natural-language query that contains high-level concepts (e.g., MIME, data sets), Exemplar uses information retrieval and program analysis techniques to retrieve applications that implement these concepts. Our case study with 39 professional Java programmers shows that Exemplar is more effective than Sourceforge in helping programmers to quickly find highly relevant applications.

international conference on software engineering | 2009

Maintaining and evolving GUI-directed test scripts

Mark Grechanik; Qing Xie; Chen Fu

Since manual black-box testing of GUI-based APplications (GAPs) is tedious and laborious, test engineers create test scripts to automate the testing process. These test scripts interact with GAPs by performing actions on their GUI objects. An extra effort that test engineers put in writing test scripts is paid off when these scripts are run repeatedly. Unfortunately, releasing new versions of GAPs with modified GUIs breaks their corresponding test scripts thereby obliterating benefits of test automation. We offer a novel approach for maintaining and evolving test scripts so that they can test new versions of their respective GAPs. We built a tool to implement our approach, and we conducted a case study with forty five professional programmers and test engineers to evaluate this tool. The results show with strong statistical significance that users find more failures and report fewer false positives (p ≪ 0.02) in test scripts with our tool than with a flagship industry product and a baseline manual approach. Our tool is lightweight and it takes less than eight seconds to analyze approximately 1KLOC of test scripts.

empirical software engineering and measurement | 2010

An empirical investigation into a large-scale Java open source code repository

Mark Grechanik; Collin McMillan; Luca DeFerrari; Marco Comi; Stefano Crespi; Denys Poshyvanyk; Chen Fu; Qing Xie; Carlo Ghezzi

Getting insight into different aspects of source code artifacts is increasingly important -- yet there is little empirical research using large bodies of source code, and subsequently there are not much statistically significant evidence of common patterns and facts of how programmers write source code. We pose 32 research questions, explain rationale behind them, and obtain facts from 2,080 randomly chosen Java applications from Sourceforge. Among these facts we find that most methods have one or zero arguments or they do not return any values, few methods are overridden, most inheritance hierarchies have the depth of one, close to 50% of classes are not explicitly inherited from any classes, and the number of methods is strongly correlated with the number of fields in a class.

international conference on software engineering | 2012

Automatically finding performance problems with feedback-directed learning software testing

Mark Grechanik; Chen Fu; Qing Xie

A goal of performance testing is to find situations when applications unexpectedly exhibit worsened characteristics for certain combinations of input values. A fundamental question of performance testing is how to select a manageable subset of the input data faster to find performance problems in applications automatically. We offer a novel solution for finding performance problems in applications automatically using black-box software testing. Our solution is an adaptive, feedback-directed learning testing system that learns rules from execution traces of applications and then uses these rules to select test input data automatically for these applications to find more performance problems when compared with exploratory random testing. We have implemented our solution and applied it to a medium-size application at a major insurance company and to an open-source application. Performance problems were found automatically and confirmed by experienced testers and developers.

IEEE Transactions on Software Engineering | 2012

Exemplar: A Source Code Search Engine for Finding Highly Relevant Applications

Collin McMillan; Mark Grechanik; Denys Poshyvanyk; Chen Fu; Qing Xie

A fundamental problem of finding software applications that are highly relevant to development tasks is the mismatch between the high-level intent reflected in the descriptions of these tasks and low-level implementation details of applications. To reduce this mismatch we created an approach called EXEcutable exaMPLes ARchive (Exemplar) for finding highly relevant software projects from large archives of applications. After a programmer enters a natural-language query that contains high-level concepts (e.g., MIME, datasets), Exemplar retrieves applications that implement these concepts. Exemplar ranks applications in three ways. First, we consider the descriptions of applications. Second, we examine the Application Programming Interface (API) calls used by applications. Third, we analyze the dataflow among those API calls. We performed two case studies (with professional and student developers) to evaluate how these three rankings contribute to the quality of the search results from Exemplar. The results of our studies show that the combined ranking of application descriptions and API documents yields the most-relevant search results. We released Exemplar and our case study data to the public.

international symposium on software reliability engineering | 2010

Is Data Privacy Always Good for Software Testing

Mark Grechanik; Christoph Csallner; Chen Fu; Qing Xie

Database-centric applications (DCAs) are common in enterprise computing, and they use nontrivial databases. Testing of DCAs is increasingly outsourced to test centers in order to achieve lower cost and higher quality. When releasing proprietary DCAs, its databases should also be made available to test engineers, so that they can test using real data. Testing with real data is important, since fake data lacks many of the intricate semantic connections among the original data elements. However, different data privacy laws prevent organizations from sharing these data with test centers because databases contain sensitive information. Currently, testing is performed with fake data that often leads to worse code coverage and fewer uncovered bugs, thereby reducing the quality of DCAs and obliterating benefits of test outsourcing. We show that a popular data anonymization algorithm called k-anonymity seriously degrades test coverage of DCAs. We propose an approach that uses program analysis to guide selective application of k-anonymity. This approach helps protect sensitive data in databases while retaining testing efficacy. Our results show that for small values of k = 7, test coverage drops to less than 30% from the original coverage of more than 70%, thus making it difficult to achieve good quality when testing DCAs while applying data privacy.

formal methods | 2013

Portfolio: Searching for relevant functions and their usages in millions of lines of code

Collin McMillan; Denys Poshyvanyk; Mark Grechanik; Qing Xie; Chen Fu

Different studies show that programmers are more interested in finding definitions of functions and their uses than variables, statements, or ordinary code fragments. Therefore, developers require support in finding relevant functions and determining how these functions are used. Unfortunately, existing code search engines do not provide enough of this support to developers, thus reducing the effectiveness of code reuse. We provide this support to programmers in a code search system called Portfolio that retrieves and visualizes relevant functions and their usages. We have built Portfolio using a combination of models that address surfing behavior of programmers and sharing related concepts among functions. We conducted two experiments: first, an experiment with 49 C/C++ programmers to compare Portfolio to Google Code Search and Koders using a standard methodology for evaluating information-retrieval-based engines; and second, an experiment with 19 Java programmers to compare Portfolio to Koders. The results show with strong statistical significance that users find more relevant functions with higher precision with Portfolio than with Google Code Search and Koders. We also show that by using PageRank, Portfolio is able to rank returned relevant functions more efficiently.

foundations of software engineering | 2012

CarFast: achieving higher statement coverage faster

Sangmin Park; B. M. Mainul Hossain; Ishtiaque Hussain; Christoph Csallner; Mark Grechanik; Kunal Taneja; Chen Fu; Qing Xie

Test coverage is an important metric of software quality, since it indicates thoroughness of testing. In industry, test coverage is often measured as statement coverage. A fundamental problem of software testing is how to achieve higher statement coverage faster, and it is a difficult problem since it requires testers to cleverly find input data that can steer execution sooner toward sections of application code that contain more statements. We created a novel fully automatic approach for aChieving higher stAtement coveRage FASTer (CarFast), which we implemented and evaluated on twelve generated Java applications whose sizes range from 300 LOC to one million LOC. We compared CarFast with several popular test case generation techniques, including pure random, adaptive random, and Directed Automated Random Testing (DART). Our results indicate with strong statistical significance that when execution time is measured in terms of the number of runs of the application on different input test data, CarFast outperforms the evaluated competitive approaches on most subject applications.

international conference on software testing, verification and validation workshops | 2009

Creating GUI Testing Tools Using Accessibility Technologies

Mark Grechanik; Qing Xie; Chen Fu

Since manual black-box testing of GUI-based APplications(GAPs) is tedious and laborious, and existing toolsdo not fully address different aspects of the testing process, test engineers create custom testing tools to automate the testing process. These tools interact with GAPs by performing actions on their GUI objects. An extra effort that test engineers put in writing test tools is paid off when these tools are run repeatedly on different GAPs. Unfortunately, creating custom GUI testing tools is a laborious and intellectually intensive process, during which test engineers use platform-specific libraries and techniques. As a result, these tools are expensive, difficult to maintain and evolve, and they often run only on specific platforms.We offer a universal approach for creating custom testingGUI tools. This approach is lightweight, portable, nonintrusive, universal, and cheap, and it combines a nonstandard use of accessibility technologies for accessing and controlling GAPs in a uniform way with a visualization mechanism that enables test personnel to interact with GUI objects by performing point-and-click, drag-and-drop operations on GAPs. We describe how we used this approach to create various GUI testing tools, delve into technical features of accessibility technologies, and review our experience with this approach.

Explore More