Lingfeng Bao | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Lingfeng Bao is active.

Explore More

Publication

Featured researches published by Lingfeng Bao.

Empirical Software Engineering | 2017

What do developers search for on the web

Xin Xia; Lingfeng Bao; David Lo; Pavneet Singh Kochhar; Ahmed E. Hassan; Zhenchang Xing

Developers commonly make use of a web search engine such as Google to locate online resources to improve their productivity. A better understanding of what developers search for could help us understand their behaviors and the problems that they meet during the software development process. Unfortunately, we have a limited understanding of what developers frequently search for and of the search tasks that they often find challenging. To address this gap, we collected search queries from 60 developers, surveyed 235 software engineers from more than 21 countries across five continents. In particular, we asked our survey participants to rate the frequency and difficulty of 34 search tasks which are grouped along the following seven dimensions: general search, debugging and bug fixing, programming, third party code reuse, tools, database, and testing. We find that searching for explanations for unknown terminologies, explanations for exceptions/error messages (e.g., HTTP 404), reusable code snippets, solutions to common programming bugs, and suitable third-party libraries/services are the most frequent search tasks that developers perform, while searching for solutions to performance bugs, solutions to multi-threading bugs, public datasets to test newly developed algorithms or systems, reusable code snippets, best industrial practices, database optimization solutions, solutions to security bugs, and solutions to software configuration bugs are the most difficult search tasks that developers consider. Our study sheds light as to why practitioners often perform some of these tasks and why they find some of them to be challenging. We also discuss the implications of our findings to future research in several research areas, e.g., code search engines, domain-specific search engines, and automated generation and refinement of search queries.

international symposium on software reliability engineering | 2016

Combining Word Embedding with Information Retrieval to Recommend Similar Bug Reports

Xinli Yang; David Lo; Xin Xia; Lingfeng Bao; Jianling Sun

Similar bugs are bugs that require handling of many common code files. Developers can often fix similar bugs with a shorter time and a higher quality since they can focus on fewer code files. Therefore, similar bug recommendation is a meaningful task which can improve development efficiency. Rocha et al. propose the first similar bug recommendation system named NextBug. Although NextBug performs better than a start-of-the-art duplicated bug detection technique REP, its performance is not optimal and thus more work is needed to improve its effectiveness. Technically, it is also rather simple as it relies only upon a standard information retrieval technique, i.e., cosine similarity. In the paper, we propose a novel approach to recommend similar bugs. The approach combines a traditional information retrieval technique and a word embedding technique, and takes bug titles and descriptions as well as bug product and component information into consideration. To evaluate the approach, we use datasets from two popular open-source projects, i.e., Eclipse and Mozilla, each of which contains bug reports whose bug ids range from [1,400000]. The results show that our approach improves the performance of NextBug statistically significantly and substantially for both projects.

Empirical Software Engineering | 2017

Extracting and analyzing time-series HCI data from screen-captured task videos

Lingfeng Bao; Jing Li; Zhenchang Xing; Xinyu Wang; Xin Xia; Bo Zhou

Recent years have witnessed the increasing emphasis on human aspects in software engineering research and practices. Our survey of existing studies on human aspects in software engineering shows that screen-captured videos have been widely used to record developers’ behavior and study software engineering practices. The screen-captured videos provide direct information about which software tools the developers interact with and which content they access or generate during the task. Such Human-Computer Interaction (HCI) data can help researchers and practitioners understand and improve software engineering practices from human perspective. However, extracting time-series HCI data from screen-captured task videos requires manual transcribing and coding of videos, which is tedious and error-prone. In this paper we report a formative study to understand the challenges in manually transcribing screen-captured videos into time-series HCI data. We then present a computer-vision based video scraping technique to automatically extract time-series HCI data from screen-captured videos. We also present a case study of our scvRipper tool that implements the video scraping technique using 29-hours of task videos of 20 developers in two development tasks. The case study not only evaluates the runtime performance and robustness of the tool, but also performs a detailed quantitative analysis of the tool’s ability to extract time-series HCI data from screen-captured task videos. We also study the developer’s micro-level behavior patterns in software development from the quantitative analysis.

mining software repositories | 2016

How android app developers manage power consumption?: an empirical study by mining power management commits

Lingfeng Bao; David Lo; Xin Xia; Xinyu Wang; Cong Tian

As Android platform becomes more and more popular, a large amount of Android applications have been developed. When developers design and implement Android applications, power consumption management is an important factor to consider since it affects the usability of the applications. Thus, it is important to help developers adopt proper strategies to manage power consumption. Interestingly, today, there is a large number of Android application repositories made publicly available in sites such as GitHub. These repositories can be mined to help crystalize common power management activities that developers do. These in turn can be used to help other developers to perform similar tasks to improve their own Android applications.In this paper, we present an empirical study of power management commits in Android applications. Our study extends that of Moura et al. who perform an empirical studyon energy aware commits; however they do not focus on Android applications and only a few of the commits that they study come from Android applications. Android applications are often different from other applications (e.g., those running on a server) due to the issue of limited battery life and the use of specialized APIs. As subjects of our empirical study, we obtain a list of open source Android applications from F-Droid and crawl their commits from Github. We get 468 power management commits after we filter the commits using a set of keywords and by performing manual analysis. These 468 power management commits are from 154 different Android applications and belong to 15 different application categories. Furthermore, we use open card sort to categorize these power management commits and we obtain 6 groups which correspond to different power management activities. Our study also reveals that for different kinds of Android application (e.g., Games, Connectivity, Navigation, etc.), the dominant power management activities differ.For example, the percentageof power management commits belonging to Power Adaptation activity is larger for Navigation applications than those belonging to other categories.

Science in China Series F: Information Sciences | 2017

Automated Android application permission recommendation

Lingfeng Bao; David Lo; Xin Xia; Shanping Li

The number of Android applications has increased rapidly as Android is becoming the dominant platform in the smartphone market. Security and privacy are key factors for an Android application to be successful. Android provides a permission mechanism to ensure security and privacy. This permission mechanism requires that developers declare the sensitive resources required by their applications. On installation or during runtime, users are required to agree with the permission request. However, in practice, there are numerous popular permission misuses, despite Android introducing official documents stating how to use these permissions properly. Some data mining techniques (e.g., association rule mining) have been proposed to help better recommend permissions required by an API. In this paper, based on popular techniques used to build recommendation systems, we propose two novel approaches to improve the effectiveness of the prior work. The first approach utilizes a collaborative filtering technique, which is inspired by the intuition that apps that have similar features — inferred from their APIs — usually share similar permissions. The second approach recommends permissions based on a text mining technique that uses a naive Bayes multinomial classification algorithm to build a prediction model by analyzing descriptions of apps. To evaluate these two approaches, we use 936 Android apps from F-Droid, which is a repository of free and open source Android applications. We find that our proposed approaches yield a significant improvement in terms of precision, recall, F1-score, and MAP of the top-k results over the baseline approach.

international conference on software engineering | 2015

scvRipper: video scraping tool for modeling developers' behavior using interaction data

Lingfeng Bao; Jing Li; Zhenchang Xing; Xinyu Wang; Bo Zhou

Screen-capture tool can record a users interaction with software and application content as a stream of screenshots which is usually stored in certain video format. Researchers have used screen-captured videos to study the programming activities that the developers carry out. In these studies, screen-captured videos had to be manually transcribed to extract software usage and application content data for the study purpose. This paper presents a computer-vision based video scraping tool (called scvRipper) that can automatically transcribe a screen-captured video into time-series interaction data according to the analysts need. This tool can address the increasing need for automatic behavioral data collection methods in the studies of human aspects of software engineering.

international conference on software maintenance | 2017

Personality and Project Success: Insights from a Large-Scale Study with Professionals

Xin Xia; David Lo; Lingfeng Bao; Abhishek Sharma; Shanping Li

A software project is typically completed as a result of a collective effort done by individuals of different personalities. Personality reflects differences among people in behaviour patterns, communication, cognition and emotion. It often impacts relationships and collaborative work, and software engineering teamwork is no exception. Some personalities are more likely to click while others to clash. A number of studies have investigated the relationship between personality and collaborative work success. However, most of them are done in a laboratory setting, do not involve professionals, or consider non software engineering tasks. Additionally, they only answer a limited set of questions, and many other questions remain open.To enrich the existing body of work, we study professionals working on real software projects, answering a new set of research questions that assess linkages between project manager personality and team personality composition and project success. In particular, our study investigates 28 recently completed software projects, which contain a total of 346 professionals, in 2 large IT companies. We asked project members to do a DISC (Dominance, Influence, Steadiness, and Compliant) personality test, and correlated the test outcomes with project success scores measured in six different dimensions. The scores were given by managers of three office as part of their regular day-to-day work. Our results show that project teams with dominant managers, along with those with more influential members and less dominant members, have higher success scores. This work provides new insights to construct a personality matching strategy that can contribute to building an effective project team.

2016 International Conference on Software Analysis, Testing and Evolution (SATE) | 2016

What Permissions Should This Android App Request

Lingfeng Bao; David Lo; Xin Xia; Shanping Li

As Android is one of the most popular open source mobile platforms, ensuring security and privacy of Android applications is very important. Android provides a permission mechanism which requires developers to declare sensitive resources their applications need, and users need to agree with this request when they install (for Android API level 22 or lower) or run (for Android API level 23) these applications. Although Android provides very good official documents to explain how to properly use permissions, unfortunately misuses even for the most popular permissions have been reported. Recently, Karim et al. propose an association rule mining based approach to better infer permissions that an API needs. In this work, to improve the effectiveness of the prior work, we propose an approach which is based on collaborative filtering technique, one of popular techniques used to build recommendation systems. Our approach is designed based on the intuition that apps that have similar features - inferred from the APIs that they use - usually share similar permissions. We evaluate the proposed approaches on 936 Android apps from F-Droid, which is a repository of free and open source Android applications. The experimental results show that our proposed approaches achieve significant improvement in terms of the precision, recall, F1-score and MAP of the top-k results over Karim et al.s approach.

Empirical Software Engineering | 2018

APIReal: an API recognition and linking approach for online developer forums

Deheng Ye; Lingfeng Bao; Zhenchang Xing; Shang-Wei Lin

When discussing programming issues on social platforms (e.g, Stack Overflow, Twitter), developers often mention APIs in natural language texts. Extracting API mentions from natural language texts serves as the prerequisite to effective indexing and searching for API-related information in software engineering social content. The task of extracting API mentions from natural language texts involves two steps: 1) distinguishing API mentions from other English words (i.e., API recognition), 2) disambiguating a recognized API mention to its unique fully qualified name (i.e., API linking). Software engineering social content lacks consistent API mentions and sentence writing format. As a result, API recognition and linking have to deal with the inherent ambiguity of API mentions in informal text, for example, due to the ambiguity between the API sense of a common word and the normal sense of the word (e.g., append, apply and merge), the simple name of an API can map to several APIs of the same library or of different libraries, or different writing forms of an API should be linked to the same API. In this paper, we propose a semi-supervised machine learning approach that exploits name synonyms and rich semantic context of API mentions for API recognition in informal text. Based on the results of our API recognition approach, we further propose an API linking approach leveraging a set of domain-specific heuristics, including mention-mention similarity, scope filtering, and mention-entry similarity, to determine which API in the knowledge base a recognized API actually refers to. To evaluate our API recognition approach, we use 1205 API mentions of three libraries (Pandas, Numpy, and Matplotlib) from Stack Overflow text. We also evaluate our API linking approach with 120 recognized API mentions of these three libraries.

international conference on software maintenance | 2016