IEEE Transactions on Reliability | 2021
Effort-Aware Just-in-Time Bug Prediction for Mobile Apps Via Cross-Triplet Deep Feature Embedding
Abstract
Just-in-time (JIT) bug prediction is an effective quality assurance activity that identifies whether a code commit will introduce bugs into a mobile app, aiming to provide prompt feedback to practitioners so that risky commits can be prioritized for review. Since collecting sufficient labeled bug data is not always feasible for some mobile apps, one possible approach is to leverage cross-app models, which are trained on labeled data from one app and applied to another. In this work, we propose a new cross-triplet deep feature embedding method, called CDFE, for the cross-app JIT bug prediction task. The CDFE method incorporates a state-of-the-art cross-triplet loss function into a deep neural network to learn a high-level feature representation for the cross-app data. This loss function is adapted to the cross-app feature learning task and aims to learn a new feature space that shortens the distance between commit instances with the same label and enlarges the distance between commit instances with different labels. In addition, this loss function assigns higher weights to losses caused by cross-app instance pairs than to those caused by intra-app instance pairs, aiming to narrow the discrepancy of cross-app bug data. We evaluate our CDFE method on a benchmark bug dataset from 19 mobile apps with two effort-aware indicators. The experimental results on 342 cross-app pairs show that our proposed CDFE method performs better than 14 baseline methods.
Manuscript received June 19, 2020; revised September 13, 2020 and January 25, 2021; accepted February 28, 2021.
This work was supported in part by the National Key Research and Development Project under Grant 2018YFB2101200, in part by the National Natural Science Foundation of China under Grant 62002034, in part by the Fundamental Research Funds for the Central Universities under Grants 2020CDCGRJ072 and 2020CDJQY-A021, in part by China Postdoctoral Science Foundation under Grant 2020M673137, in part by the Natural Science Foundation of Chongqing in China under Grant cstc2020jcyj-bshX0114, in part by the Science and Technology Development Fund of Macau under Grant 0047/2020/A1, in part by Faculty Research Grant Projects of MUST under Grant FRG-20-008-FI, and in part by the European Commission under Grant 825040 RADON. Associate Editor: Z. Jin. (Corresponding authors: Chunlei Fu; Meng Yan.)
Zhou Xu, Chunlei Fu, Meng Yan, and Xiaohong Zhang are with the Key Laboratory of Dependable Service Computing in Cyber Physical Society (Chongqing University), Ministry of Education, China, and School of Big Data and Software Engineering, Chongqing University, Chongqing 401331, China.
Kunsong Zhao and Zhiwen Xie are with the School of Computer Science, Wuhan University, Wuhan 430072, China.
Tao Zhang is with the Faculty of Information Technology, Macau University of Science and Technology, Macau 999078, China.
Gemma Catolino is with the Jheronimus Academy of Data Science, Tilburg University, Tilburg 90153, The Netherlands.
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TR.2021.3066170.
Digital Object Identifier 10.1109/TR.2021.3066170
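To make the idea in the abstract concrete, the following is a minimal sketch of a pair-weighted triplet-style loss. It is not the formulation used by CDFE, which the abstract does not specify. It only illustrates the two stated goals: pull same-label commits together and push different-label commits apart, with larger weights on pairs that span two apps. The function names, the hinge form, the margin of 1.0, and the cross-app weight of 2.0 are all assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pair_weight(app_a, app_b, cross_weight=2.0, intra_weight=1.0):
    """Hypothetical weighting: a pair spanning two apps counts more."""
    return cross_weight if app_a != app_b else intra_weight


def weighted_triplet_loss(anchor, positive, negative,
                          margin=1.0, cross_weight=2.0):
    """Illustrative loss for one triplet of commits.

    Each commit is a tuple (embedding, app_id). `positive` is assumed to
    share the anchor's label and `negative` to have the other label.
    """
    (xa, app_a), (xp, app_p), (xn, app_n) = anchor, positive, negative
    # Same-label pair: any remaining distance is penalized.
    pull_loss = euclidean(xa, xp)
    # Different-label pair: penalized only when closer than the margin.
    push_loss = max(0.0, margin - euclidean(xa, xn))
    # Cross-app pairs receive the larger weight (assumed value).
    return (pair_weight(app_a, app_p, cross_weight) * pull_loss
            + pair_weight(app_a, app_n, cross_weight) * push_loss)


if __name__ == "__main__":
    anchor = ([0.0, 0.0], "app_A")
    same_label_other_app = ([3.0, 4.0], "app_B")
    other_label_same_app = ([0.0, 0.5], "app_A")
    print(weighted_triplet_loss(anchor, same_label_other_app,
                                other_label_same_app))
```

In this sketch the same-label pair comes from two different apps, so its distance is weighted twice as heavily as it would be for two commits from the same app. In a real model the embeddings would come from the neural network and the loss would be averaged over many sampled triplets.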