Zhifei Chen | Researchain

Publication

Featured researches published by Zhifei Chen.

computer software and applications conference | 2014

Dynamic Slicing of Python Programs

Zhifei Chen; Lin Chen; Yuming Zhou; Zhaogui Xu; William C. Chu; Baowen Xu

Python is widely used for web programming and GUI development. Due to the dynamic features of Python, Python programs may contain a wide variety of errors. Dynamic slicing extracts the statements of a program that affect the variables in a slicing criterion for a particular input. Dynamic slicing of Python programs is essential for program debugging and fault localization. In this paper, we propose a dynamic slicing approach for Python programs that combines static analysis and dynamic tracing of Python bytecode. It precisely handles the dynamic features of Python, such as dynamic typing of variables, heavy usage of first-class objects, and dynamic modifications of classes and instances. Finally, we evaluate our approach on several Python programs. Experimental results show that the whole dynamic slicing process for each subject program takes at most about 13 seconds on average and incurs at most 7.58 MB of memory overhead. Furthermore, the average slice ratio of Python source code ranges from 9.26% to 59.42%. These results suggest that our dynamic slicing approach is both effective and efficient. To the best of our knowledge, this is the first dynamic slicing approach for Python programs.
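To make the idea of a dynamic slice concrete, here is a minimal sketch, not the paper's algorithm. It assumes an execution trace has already been recorded as a list of (statement id, defined variables, used variables) entries; the trace format and the name `dynamic_slice` are hypothetical, and only data dependences are followed (control dependences and bytecode tracing are left out).

```python
from typing import List, Set, Tuple

# One entry per executed statement, in execution order:
# (statement id, variables defined, variables used).
# This trace is written by hand for illustration; it is not tracer output.
TraceEntry = Tuple[int, Set[str], Set[str]]


def dynamic_slice(trace: List[TraceEntry], criterion: Set[str]) -> Set[int]:
    """Backward slice over one execution trace, data dependences only."""
    relevant = set(criterion)  # variables whose definitions are still needed
    sliced: Set[int] = set()
    for stmt, defs, uses in reversed(trace):
        if defs & relevant:
            sliced.add(stmt)
            relevant -= defs   # this definition satisfies the need
            relevant |= uses   # now the values it read are needed
    return sliced


# Hypothetical program being traced:
#   1: a = 1
#   2: b = 2
#   3: c = a + 1
#   4: d = b * 2
trace = [
    (1, {"a"}, set()),
    (2, {"b"}, set()),
    (3, {"c"}, {"a"}),
    (4, {"d"}, {"b"}),
]

print(sorted(dynamic_slice(trace, {"c"})))  # [1, 3]
```

Slicing on `c` keeps statements 1 and 3 and drops 2 and 4, which never influenced `c` in this run.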

international conference on quality software | 2013

Static Slicing for Python First-Class Objects

Zhaogui Xu; Ju Qian; Lin Chen; Zhifei Chen; Baowen Xu

Program slicing is an important program analysis technique that is now used in many fields of software engineering. However, most existing program slicing methods focus on static programming languages such as C/C++ and Java, and methods for dynamic languages like Python are rarely seen. Python, a typical dynamic object-oriented language, is increasingly widely used. In Python, everything is a first-class object, including functions, classes, methods, and modules. Existing slicing methods cannot handle these first-class objects. Therefore, this paper proposes a static slicing method for Python first-class objects. By adding all the definitions of first-class objects into the dependence model and uniformly constructing the program dependence graphs for all the functions, classes, methods, and modules, this method can effectively solve the slicing problems caused by arbitrary definitions and uses of first-class objects in Python.
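The following short example illustrates what "first-class" means here and why it complicates slicing. It is an illustration only, with hypothetical names (`square`, `Scaler`, `ops`): functions, bound methods, and classes are stored in a dictionary and rebound like any other value, so the statement that ultimately defines `result` depends on which object `f` refers to at that point.

```python
import math


def square(x):
    return x * x


class Scaler:
    def __init__(self, k):
        self.k = k

    def apply(self, x):
        return self.k * x


# Functions, bound methods, and classes are ordinary values.
ops = {
    "sq": square,
    "sqrt": math.sqrt,
    "scale": Scaler(3).apply,
    "cls": Scaler,
}

f = ops["sq"]     # f refers to the function square
f = ops["scale"]  # f is rebound to a bound method of a Scaler instance
result = f(4)     # depends on Scaler.apply and on k = 3, not on square

print(result)  # 12
```

A slice for `result` therefore has to include the definitions of `Scaler`, its `apply` method, and the `ops` entry that was selected, which is the kind of dependence a first-class-object-aware model needs to record.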

international conference on software engineering | 2015

An empirical study on the impact of Python dynamic features on change-proneness

Beibei Wang; Lin Chen; Wanwangying Ma; Zhifei Chen; Baowen Xu

The dynamic features of programming languages are useful constructs that bring developers convenience and flexibility, but they are also perceived to lead to difficulties in software maintenance. Figuring out whether the use of dynamic features affects maintenance is significant for both researchers and practitioners, yet little work has been done to investigate it. In this paper, we conduct an empirical study to explore whether program source code files using dynamic features are more change-prone and whether particular categories of dynamic features are more correlated with change-proneness than others. To this end, we statically analyze historical data from 4 to 7 years of the development of seven open-source systems. We employ the Fisher and Mann-Whitney hypothesis tests, along with a logistic regression model, to answer three research questions. The results show that: (1) files with dynamic features are more change-prone, (2) files with a higher number of dynamic features are more change-prone, and (3) Introspection is more correlated with change-proneness than the other three categories in most systems. This work can inform researchers who study how and why dynamic features are used. For practitioners, we suggest being wary of files with dynamic features because they are more likely to be the subject of maintenance effort.
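As a rough sketch of what statically counting dynamic feature usage in a source file can look like, the snippet below walks a Python AST and counts calls to a handful of built-ins. The set `DYNAMIC_BUILTINS` and the function `count_dynamic_calls` are assumptions made for illustration; they are not the feature categories or the tooling used in the study.

```python
import ast
from collections import Counter

# Illustrative selection of built-ins often associated with dynamic behavior.
# This is an assumption, not the taxonomy from the paper.
DYNAMIC_BUILTINS = {
    "getattr", "setattr", "hasattr", "delattr",
    "isinstance", "eval", "exec",
}


def count_dynamic_calls(source: str) -> Counter:
    """Count direct calls to the selected built-ins in one source file."""
    counts: Counter = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DYNAMIC_BUILTINS:
                counts[node.func.id] += 1
    return counts


sample = '''
def area(shape):
    if isinstance(shape, dict):
        return shape["w"] * shape["h"]
    if hasattr(shape, "area"):
        return getattr(shape, "area")()
    return eval("0")
'''

counts = count_dynamic_calls(sample)
print(sum(counts.values()))  # 4
```

Per-file counts like these could then be compared against change history, which is the kind of relationship the study tests statistically.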

international conference on software maintenance | 2016

An Empirical Study on the Characteristics of Python Fine-Grained Source Code Change Types

Wei Lin; Zhifei Chen; Wanwangying Ma; Lin Chen; Lei Xu; Baowen Xu

Software changes throughout its whole life cycle. Therefore, identification of source code changes becomes a key issue in software evolution analysis. However, little current change analysis research focuses on dynamic language software. In this paper, we pay attention to the fine-grained source code changes of Python software. We implement an automatic tool named PyCT to extract 77 kinds of fine-grained source code change types from commit history information. We conduct an empirical study on ten popular Python projects from five domains, with 132294 commits, to investigate the characteristics of dynamic software source code changes. Analyzing the source code changes in four aspects, we distill 11 findings, which are summarized into two insights on software evolution: change prediction and fault code fix. In addition, we provide direct evidence on how developers use and change dynamic features. Our results provide useful guidance and insights for improving the understanding of source code evolution of dynamic language software.

Science in China Series F: Information Sciences | 2016

Empirical analysis of network measures for predicting high severity software faults

Lin Chen; Wanwangying Ma; Yuming Zhou; Lei Xu; Ziyuan Wang; Zhifei Chen; Baowen Xu

Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high severity fault-proneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the associations between each network measure and fault-proneness at a high severity level. We built multivariate prediction models to examine their explanatory ability for fault-proneness, and evaluated their predictive effectiveness compared to code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high severity fault-proneness; (2) network measures generally have comparable explanatory abilities and predictive powers to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high severity fault-proneness prediction.

web information system and application conference | 2014

Hybrid Information Flow Analysis for Python Bytecode

Zhifei Chen; Lin Chen; Baowen Xu

Python is widely used to create and manage complex, database-driven websites. However, due to dynamic features such as dynamic typing of variables, Python programs pose a serious security risk to web applications. Most security vulnerabilities result from unsafe input data reaching security-sensitive operations. To address this problem, information flow analysis for Python programs is proposed to ensure that such flows do not occur. Information flow can capture the fact that a particular value affects another value in the program. In this paper, we present a novel approach for analyzing information flow in Python bytecode, which is a low-level language and is more widely distributed. Our approach performs a hybrid of static and dynamic control/data flow analysis. Static analysis is used to study implicit flow, while dynamic analysis efficiently tracks execution information and determines definition-use pairs. To the best of our knowledge, this is the first such approach for Python bytecode.
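The notion that "a particular value affects another value" can be sketched as taint propagation over recorded assignments. The snippet below is a simplified illustration under stated assumptions: the trace format and the name `propagate_taint` are hypothetical, it works on variable names rather than bytecode, and it covers explicit (data) flows only, not the implicit flows that the abstract assigns to static analysis.

```python
from typing import Iterable, Set, Tuple

# One entry per executed assignment, in execution order:
# (variable defined, variables it was computed from).
Assignment = Tuple[str, Set[str]]


def propagate_taint(trace: Iterable[Assignment], tainted_inputs: Set[str]) -> Set[str]:
    """Return the variables holding tainted data after the trace has run."""
    tainted = set(tainted_inputs)
    for target, sources in trace:
        if sources & tainted:
            tainted.add(target)       # explicit flow from a tainted value
        else:
            tainted.discard(target)   # overwritten with untainted data
    return tainted


# Hypothetical run: `request` is untrusted input, `query` reaches a sensitive sink.
trace = [
    ("name", {"request"}),
    ("greeting", {"name"}),
    ("limit", set()),
    ("query", {"greeting", "limit"}),
    ("name", set()),                  # name is later reset to a constant
]

tainted = propagate_taint(trace, {"request"})
print("query" in tainted)  # True
print("name" in tainted)   # False
```

Here `query` is flagged because it was derived, through `greeting`, from the untrusted `request`, while `name` is cleared once it is overwritten with a constant.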

2016 International Conference on Software Analysis, Testing and Evolution (SATE) | 2016

Detecting Code Smells in Python Programs

Zhifei Chen; Lin Chen; Wanwangying Ma; Baowen Xu

As a traditional dynamic language, Python is increasingly used in various software engineering tasks. However, due to its flexibility and dynamism, Python is a particularly challenging language to write code in and maintain. Consequently, Python programs contain code smells which indicate potential comprehension and maintenance problems. With the aim of supporting refactoring strategies to enhance maintainability, this paper describes how to detect code smells in Python programs. We introduce 11 Python smells and describe the detection strategy. We also implement a smell detection tool named Pysmell and use it to identify code smells in five real-world Python systems. The results show that Pysmell can detect 285 code smell instances in total with an average precision of 97.7%. They reveal that Large Class and Large Method are the most prevalent smells. Our experiment also suggests that Python programs may suffer from further code smells.

2016 Third International Conference on Trustworthy Systems and their Applications (TSA) | 2016

Tracking Down Dynamic Feature Code Changes against Python Software Evolution

Zhifei Chen; Wanwangying Ma; Wei Lin; Lin Chen; Baowen Xu

Python, a typical dynamic programming language, is increasingly used in many application domains. Dynamic features in Python allow developers to change the code at runtime. Some dynamic features such as dynamic type checking play an active part in maintenance activities, so dynamic feature code is often changed to cater to software evolution. The aim of this paper is to explore and validate the characteristics of feature changes in Python. We collected change occurrences in 85 open-source projects and discovered the relationship between feature changes and bug-fix activities. Furthermore, we examined 358 change occurrences to explore the causes and behaviors of feature changes. The results show that: (1) dynamic features are increasingly used and the code is changeable; (2) for most dynamic features, feature code is more likely to be changed in bug-fix activities than in non-bugfix activities; (3) dynamic feature code plays both positive and negative roles in maintenance activities. Our results provide useful guidance and insights for improving automatic program repair and refactoring tools.

Science in China Series F: Information Sciences | 2018

A study on the changes of dynamic feature code when fixing bugs: towards the benefits and costs of Python dynamic features

Zhifei Chen; Wanwangying Ma; Wei Lin; Lin Chen; Yanhui Li; Baowen Xu

Dynamic features in programming languages support the modification of the execution status at runtime, which is often considered helpful in rapid development and prototyping. However, it was also reported that some dynamic feature code tends to be change-prone or error-prone. We present the first study that analyzes the changes of dynamic feature code and the roles of dynamic features in bug-fix activities for the Python language. We used an AST-based differencing tool to capture fine-grained source code changes from 17926 bug-fix commits in 17 Python projects. Using this data, we conducted an empirical study on the changes of dynamic feature code when fixing bugs in Python. First, we investigated the characteristics of dynamic feature code changes by comparing the changes between dynamic feature code and non-dynamic feature code when fixing bugs, and by comparing dynamic feature changes between bug-fix and non-bugfix activities. Second, we explored 226 bug-fix commits to investigate the motivation and behaviors of dynamic feature changes when fixing bugs. The study results reveal that (1) the changes of dynamic feature code are significantly related to bug-fix activities rather than non-bugfix activities; (2) compared with non-dynamic feature code, dynamic feature code is inserted or updated more frequently when fixing bugs; (3) developers often insert dynamic feature code as type checks or attribute checks to fix type errors and attribute errors; (4) the misuse of dynamic features introduces bugs in dynamic feature code, and the bugs are often fixed by adding a check or adding exception handling. As a benefit of this paper, we gain insights into the manner in which developers and researchers handle the changes of dynamic feature code when fixing bugs.
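Finding (3) describes a pattern that is easy to picture with a small example. The functions below are hypothetical and are not taken from the studied projects; they show a type check with `isinstance` guarding against a `TypeError`, and an attribute check with `hasattr` guarding against an `AttributeError`.

```python
def safe_len(value):
    # Type check: len() raises TypeError for objects without a length,
    # such as integers or None, so only sized built-in containers are measured.
    if isinstance(value, (str, bytes, list, tuple, dict, set)):
        return len(value)
    return 0


class Guest:
    """A hypothetical user type that has no `nickname` attribute."""


class Member:
    def __init__(self, nickname):
        self.nickname = nickname


def display_name(user):
    # Attribute check: accessing user.nickname on a Guest would raise
    # AttributeError, so fall back to a default label instead.
    if hasattr(user, "nickname"):
        return user.nickname
    return "anonymous"


print(safe_len([1, 2, 3]), safe_len(None))  # 3 0
print(display_name(Member("zc")), display_name(Guest()))  # zc anonymous
```

In both cases the fix itself is dynamic feature code, which matches the observation that such code is frequently inserted during bug fixing.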

Information & Software Technology | 2018

Understanding metric-based detectable smells in Python software: A comparative study

Zhifei Chen; Lin Chen; Wanwangying Ma; Xiaoyu Zhou; Yuming Zhou; Baowen Xu

Context: Code smells are supposed to cause potential comprehension and maintenance problems in software development. Although code smells are studied in many languages, e.g. Java and C#, there is a lack of technique or tool support addressing code smells in Python.

Objective: Due to the great differences between Python and static languages, the goal of this study is to define and detect code smells in Python programs and to explore the effects of Python smells on software maintainability.

Method: In this paper, we introduced ten code smells and established a metric-based detection method with three different filtering strategies to specify metric thresholds (Experience-Based Strategy, Statistics-Based Strategy, and Tuning Machine Strategy). Then, we performed a comparative study to investigate how three detection strategies perform in detecting Python smells and how these smells affect software maintainability with different detection strategies. This study utilized a corpus of 106 Python projects with most stars on GitHub.

Results: The results showed that: (1) the metric-based detection approach performs well in detecting Python smells and Tuning Machine Strategy achieves the best accuracy; (2) the three detection strategies discover some different smell occurrences, and Long Parameter List and Long Method are more prevalent than other smells; (3) several kinds of code smells are more significantly related to changes or faults in Python modules.

Conclusion: These findings reveal the key features of Python smells and also provide a guideline for the choice of detection strategy in detecting and analyzing Python smells.
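A metric-based detector of the kind described above can be sketched in a few lines with the standard `ast` module. This is a minimal sketch, not the study's tool: the thresholds `MAX_PARAMS` and `MAX_LINES` are arbitrary placeholder values (choosing them is exactly what the three strategies address), and `detect_smells` is a hypothetical name. It covers only Long Parameter List and Long Method.

```python
import ast
from typing import List, Tuple

# Placeholder thresholds chosen for illustration only.
MAX_PARAMS = 5
MAX_LINES = 50


def detect_smells(source: str,
                  max_params: int = MAX_PARAMS,
                  max_lines: int = MAX_LINES) -> List[Tuple[str, str]]:
    """Return (function name, smell name) pairs for one source file."""
    findings: List[Tuple[str, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # Metric 1: number of declared parameters, including *args/**kwargs.
            n_params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            n_params += (args.vararg is not None) + (args.kwarg is not None)
            # Metric 2: number of source lines spanned by the definition.
            n_lines = node.end_lineno - node.lineno + 1
            if n_params > max_params:
                findings.append((node.name, "Long Parameter List"))
            if n_lines > max_lines:
                findings.append((node.name, "Long Method"))
    return findings


sample = (
    "def f(a, b, c, d, e, f, g):\n"
    "    return a\n"
    "\n"
    "def g(x):\n"
    "    return x\n"
)

print(detect_smells(sample))  # [('f', 'Long Parameter List')]
```

Passing different `max_params` and `max_lines` values to the same function is one simple way to compare how threshold choices change the set of reported smells.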
