Jaechang Nam
Hong Kong University of Science and Technology
Publication
Featured research published by Jaechang Nam.
International Conference on Software Engineering | 2013
Dongsun Kim; Jaechang Nam; Jaewoo Song; Sunghun Kim
Patch generation is an essential software maintenance task because most software systems inevitably have bugs that need to be fixed. Unfortunately, human resources are often insufficient to fix all reported and known bugs. To address this issue, several automated patch generation techniques have been proposed. In particular, a genetic-programming-based patch generation technique, GenProg, proposed by Weimer et al., has shown promising results. However, these techniques can generate nonsensical patches due to the randomness of their mutation operations. To address this limitation, we propose a novel patch generation approach, Pattern-based Automatic program Repair (Par), using fix patterns learned from existing human-written patches. We manually inspected more than 60,000 human-written patches and found there are several common fix patterns. Our approach leverages these fix patterns to generate program patches automatically. We experimentally evaluated Par on 119 real bugs. In addition, a user study involving 89 students and 164 developers confirmed that patches generated by our approach are more acceptable than those generated by GenProg. Par successfully generated patches for 27 out of 119 bugs, while GenProg was successful for only 16 bugs.
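To make the fix-pattern idea concrete, below is a minimal Python sketch of instantiating one such template (a null-pointer check) at a faulty statement. The template string and pattern name are illustrative assumptions, not Par's actual implementation, which edits program ASTs rather than strings.

```python
# Sketch: instantiating a fix template at a statement blamed by fault
# localization. Template format and pattern name are illustrative
# assumptions; Par itself rewrites ASTs, not source strings.
FIX_TEMPLATES = {
    # Guard a statement that dereferences `var` with a null check.
    "null_pointer_checker": "if ({var} != null) {{ {stmt} }}",
}

def apply_null_check(stmt: str, var: str) -> str:
    """Produce a candidate patch by wrapping `stmt` in a null check."""
    return FIX_TEMPLATES["null_pointer_checker"].format(var=var, stmt=stmt)

# Candidate patch for a suspicious statement:
print(apply_null_check("name = user.getName();", "user"))
# if (user != null) { name = user.getName(); }
```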
International Conference on Software Engineering | 2013
Jaechang Nam; Sinno Jialin Pan; Sunghun Kim
Many software defect prediction approaches have been proposed and most are effective in within-project prediction settings. However, for new projects or projects with limited training data, it is desirable to learn a prediction model by using sufficient training data from existing source projects and then apply the model to some target projects (cross-project defect prediction). Unfortunately, the performance of cross-project defect prediction is generally poor, largely because of feature distribution differences between the source and target projects. In this paper, we apply a state-of-the-art transfer learning approach, Transfer Component Analysis (TCA), to make feature distributions in source and target projects similar. In addition, we propose a novel transfer defect learning approach, TCA+, by extending TCA. Our experimental results for eight open-source projects show that TCA+ significantly improves cross-project prediction performance.
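A simplified illustration of the idea on synthetic data, assuming z-score normalization (one of the normalization options TCA+ selects among); it shows only the step that makes source and target feature distributions comparable, not TCA's kernel-based component analysis itself.

```python
# Sketch of the normalization step behind TCA+: rescale source and target
# metrics so their distributions look alike before a model trained on the
# source project is applied to the target project. This omits TCA proper
# and TCA+'s rule for choosing a normalization; synthetic data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_src = rng.normal(5.0, 2.0, size=(200, 10))   # source project metrics
y_src = (X_src[:, 0] > 5.0).astype(int)        # toy defect labels
X_tgt = rng.normal(50.0, 9.0, size=(100, 10))  # target project, shifted scale

def zscore(X):
    """Zero mean, unit variance per metric column."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

model = LogisticRegression(max_iter=1000).fit(zscore(X_src), y_src)
pred = model.predict(zscore(X_tgt))            # cross-project predictions
print(pred[:10])
```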
Foundations of Software Engineering | 2015
Jaechang Nam; Sunghun Kim
Many recent studies have documented the success of cross-project defect prediction (CPDP) in predicting defects for new projects that lack defect data, using prediction models built from other projects. However, most studies share the same limitation: they require homogeneous data; i.e., different projects must describe themselves using the same metrics. This paper presents methods for heterogeneous defect prediction (HDP) that match up different metrics in different projects. Metric matching for HDP requires a “large enough” sample of distributions in the source and target projects, which raises the question of how large is “large enough” for effective heterogeneous defect prediction. This paper shows, empirically and theoretically, that “large enough” may be very small indeed. For example, using a mathematical model of defect prediction, we identify categories of data sets where as few as 50 instances are enough to build a defect prediction model. Our conclusion is that, even when projects use different metric sets, it is possible to quickly transfer lessons learned about defect prediction.
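A hedged sketch of the metric-matching step, assuming a Kolmogorov-Smirnov-based similarity measure, synthetic metric values, and a greedy pairing in place of the paper's full matching procedure.

```python
# Sketch: pair each target-project metric with the source-project metric
# whose value distribution looks most similar, via the two-sample
# Kolmogorov-Smirnov test. Metric names, data, cutoff, and the greedy
# pairing are illustrative simplifications.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
source = {"LOC": rng.exponential(100, 300),
          "fan_in": rng.poisson(4, 300).astype(float)}
target = {"num_stmts": rng.exponential(90, 250),
          "callers": rng.poisson(5, 250).astype(float)}

matches = {}
for t_name, t_vals in target.items():
    # A higher KS p-value means more similar distributions.
    best = max(source, key=lambda s: ks_2samp(source[s], t_vals).pvalue)
    if ks_2samp(source[best], t_vals).pvalue > 0.05:  # keep similar pairs only
        matches[t_name] = best

print(matches)  # e.g. {'num_stmts': 'LOC', 'callers': 'fan_in'}
```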
Automated Software Engineering | 2015
Jaechang Nam; Sunghun Kim
Defect prediction on new projects or projects with limited historical data is an interesting problem in software engineering. This is largely because it is difficult to collect defect information to label a dataset for training a prediction model. Cross-project defect prediction (CPDP) has tried to address this problem by reusing prediction models built by other projects that have enough historical data. However, CPDP does not always build a strong prediction model because of the different distributions among datasets. Approaches for defect prediction on unlabeled datasets have also tried to address the problem by adopting unsupervised learning, but these have one major limitation: the need for manual effort. In this study, we propose novel approaches, CLA and CLAMI, that show the potential for defect prediction on unlabeled datasets in an automated manner, without the need for manual effort. The key idea of the CLA and CLAMI approaches is to label an unlabeled dataset by using the magnitude of metric values. In our empirical study on seven open-source projects, the CLAMI approach led to promising prediction performance (average f-measure of 0.636 and AUC of 0.723), comparable to that of defect prediction based on supervised learning.
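A minimal sketch of the CLA labeling idea on synthetic data: count, per instance, how many of its metric values exceed that metric's median, and label the higher group as buggy. The split point used here is a simplification, and CLAMI's additional metric and instance selection steps are omitted.

```python
# Sketch of CLA-style labeling: instances whose metrics are high relative to
# each metric's median get labeled buggy. Synthetic data; the mean-based
# split point is a simplification of the clustering described in the paper.
import numpy as np

rng = np.random.default_rng(2)
X = rng.lognormal(mean=1.0, sigma=0.8, size=(20, 6))  # 20 modules, 6 metrics

medians = np.median(X, axis=0)
k = (X > medians).sum(axis=1)          # count of higher-than-median metrics

threshold = k.mean()                   # simplified split point
labels = (k > threshold).astype(int)   # 1 = labeled buggy, 0 = clean
print(list(zip(k.tolist(), labels.tolist())))
```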
IEEE Transactions on Software Engineering | 2018
Jaechang Nam; Wei Fu; Sunghun Kim; Tim Menzies; Lin Tan
Many recent studies have documented the success of cross-project defect prediction (CPDP) in predicting defects for new projects that lack defect data, using prediction models built from other projects. However, most studies share the same limitation: they require homogeneous data; i.e., different projects must describe themselves using the same metrics. This paper presents methods for heterogeneous defect prediction (HDP) that match up different metrics in different projects. Metric matching for HDP requires a “large enough” sample of distributions in the source and target projects, which raises the question of how large is “large enough” for effective heterogeneous defect prediction. This paper shows, empirically and theoretically, that “large enough” may be very small indeed. For example, using a mathematical model of defect prediction, we identify categories of data sets where as few as 50 instances are enough to build a defect prediction model. Our conclusion is that, even when projects use different metric sets, it is possible to quickly transfer lessons learned about defect prediction.
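As a toy illustration of the “large enough may be very small” claim, the sketch below trains a defect predictor on only 50 instances and compares it against one trained on ten times as much data. This is synthetic data, not the paper's mathematical model.

```python
# Toy check of the claim that ~50 instances can suffice for a usable defect
# prediction model. Entirely synthetic; illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_test, y_test = X[500:], y[500:]      # held-out half
for n in (50, 500):
    clf = RandomForestClassifier(random_state=0).fit(X[:n], y[:n])
    auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
    print(f"trained on {n:3d} instances: AUC = {auc:.2f}")
```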
Foundations of Software Engineering | 2015
Mi Jung Kim; Jaechang Nam; Jaehyuk Yeon; Soonhwang Choi; Sunghun Kim
Quality assurance for common APIs is important since the reliability of APIs affects the quality of the other systems using them. Testing is a common practice to ensure the quality of APIs, but it is a challenging and laborious task, especially for industrial projects. Due to the large number of APIs, tight time constraints, and limited resources, it is hard to write enough test cases for all APIs. To address these challenges, we present a novel technique, REMI, that predicts high-risk APIs, i.e., those likely to produce bugs. REMI allows developers to write more test cases for the high-risk APIs. We evaluate REMI on a real-world industrial project, Tizen-wearable, and apply REMI to the API development process at Samsung Electronics. Our evaluation results show that REMI predicts bug-prone APIs with reasonable accuracy (0.681 f-measure on average). The results also show that applying REMI to the Tizen-wearable development process increases the number of bugs detected and reduces the resources required for executing test cases.
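A hedged sketch of the REMI idea: rank APIs by a classifier's predicted bug risk so that test-writing effort goes to the riskiest APIs first. The metric names, model choice, and data below are illustrative assumptions, not the actual pipeline used at Samsung.

```python
# Sketch: train a classifier on API-level metrics from past releases, then
# rank new APIs by predicted risk. Metrics, labels, and model are
# hypothetical stand-ins for whatever REMI actually uses.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
# columns: [LOC, #parameters, #past changes] per API (hypothetical metrics)
X_hist = rng.integers(1, 200, size=(300, 3)).astype(float)
y_hist = (X_hist[:, 2] > 100).astype(int)      # toy "had a bug" labels

clf = RandomForestClassifier(random_state=0).fit(X_hist, y_hist)

new_apis = ["alarm_create", "sensor_read", "ui_render"]  # hypothetical names
X_new = rng.integers(1, 200, size=(3, 3)).astype(float)
risk = clf.predict_proba(X_new)[:, 1]
for name, r in sorted(zip(new_apis, risk), key=lambda t: -t[1]):
    print(f"{name}: risk {r:.2f}")     # riskiest APIs get tested first
```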
International Conference on Software Testing, Verification and Validation Workshops | 2011
Jaechang Nam; David Schuler; Andreas Zeller
During mutation testing, artificial defects are inserted into a program in order to measure the quality of a test suite and to provide means for improvement. These defects are generated using predefined mutation operators, inspired by faults that programmers tend to make. As the type of faults varies between different programmers and projects, mutation testing might be improved by learning from past defects: does a sample of mutations similar to past defects help to develop better tests than a randomly chosen sample of mutations? In this paper, we present the first approach that uses software repository mining techniques to calibrate mutation testing to the defect history of a project. Furthermore, we provide an implementation and evaluation of calibrated mutation testing for the Jaxen project. However, initial results indicate that calibrated mutation testing cannot outperform random selection strategies.
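A minimal sketch of the calibration idea: weight mutation operators by how often the corresponding fault class appears in the project's past fixes, then sample mutations from that distribution instead of uniformly. The operator names and counts are illustrative.

```python
# Sketch of calibrated mutation testing: sample mutation operators in
# proportion to how often their fault class appeared in past fixes mined
# from the version history. Names and counts are made up for illustration.
import random

past_fix_counts = {            # fault classes mined from the repo history
    "negate_condition": 40,    # e.g. fixes that flipped a boolean condition
    "off_by_one": 25,
    "wrong_operator": 20,
    "remove_call": 5,
}

total = sum(past_fix_counts.values())
operators = list(past_fix_counts)
weights = [past_fix_counts[op] / total for op in operators]

random.seed(0)
sample = random.choices(operators, weights=weights, k=10)  # calibrated sample
print(sample)
```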
International Conference on Software Engineering | 2018
Jaechang Nam; Song Wang; Yuan Xi; Lin Tan
One of the challenging issues with existing static analysis tools is their high false alarm rate. To address the false alarm issue, we design bug detection rules by learning from a large number of real bugs in open-source projects on GitHub. Specifically, we build a framework that learns and refines bug detection rules to produce fewer false positives. Based on the framework, we implemented ten patterns, six of which are new to existing tools. To evaluate the framework, we implemented a static analysis tool, FeeFin, based on the framework with the ten bug detection rules, and applied the tool to 1,800 open-source projects on GitHub. Developers confirmed 57 bugs detected by FeeFin as true positives, and 44 of these bugs were actually fixed.
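A hedged sketch of what a learned bug-detection rule can look like: a pattern encoded as an AST check. The pattern shown (an expression compared with itself) is a generic example rather than necessarily one of FeeFin's ten patterns, and Python's ast module stands in for FeeFin's own analysis.

```python
# Sketch of an AST-based detection rule in the FeeFin style: flag
# comparisons whose two operands are structurally identical (e.g. x == x),
# a shape of bug that pattern-based detectors commonly target. Generic
# example; not a claim about FeeFin's actual rule set.
import ast

def same_operand_comparisons(source: str):
    """Flag comparisons whose left and right operands are identical."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Compare) and len(node.comparators) == 1:
            if ast.dump(node.left) == ast.dump(node.comparators[0]):
                findings.append((node.lineno, ast.unparse(node)))
    return findings

code = "if user.id == user.id:\n    grant_access()\n"
print(same_operand_comparisons(code))  # [(1, 'user.id == user.id')]
```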
Foundations of Software Engineering | 2011
Taek Lee; Jaechang Nam; Donggyun Han; Sunghun Kim; Hoh Peter In