Beijun Shen | Researchain

Publication

Featured researches published by Beijun Shen.

international conference on quality software | 2012

Compressed C4.5 Models for Software Defect Prediction

Jun Wang; Beijun Shen; Yuting Chen

Defects in any software system must be handled properly, and the number of defects directly reflects software quality. In recent years, researchers have applied data mining and machine learning methods to software defect prediction. However, adopting machine learning models directly, as these studies do, may not be precise enough; optimizing the models used for defect prediction can improve prediction accuracy. In this paper, targeting the characteristics of metrics mined from open source software, we propose three new defect prediction models based on the C4.5 model. The new models introduce Spearman's rank correlation coefficient into the basis for choosing the root node of the decision tree, which improves the models' defect prediction. To verify the effectiveness of the improved models, we designed an experimental scheme. In the experiment, we compared the prediction accuracy of the existing and improved models. The results showed that the improved models reduced the size of the decision tree by 49.91% on average and increased prediction accuracy by 4.58% and 4.87% on the two modules used in the experiment.
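
The abstract does not say exactly how Spearman's rank correlation coefficient enters root-node selection. As a minimal sketch of one plausible reading, the Python below ranks candidate metrics by the absolute value of their Spearman correlation with the defect count and picks the strongest one as the root attribute. The function names (`average_ranks`, `spearman`, `pick_root_metric`) and the sample data are assumptions for illustration, not the authors' implementation.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector carries no rank information
    return cov / (vx * vy) ** 0.5


def pick_root_metric(metrics: Dict[str, Sequence[float]],
                     defects: Sequence[float]) -> str:
    """Hypothetical root choice: the metric with the largest |rho| to defects."""
    return max(metrics, key=lambda name: abs(spearman(metrics[name], defects)))


# Illustrative data only: two metrics measured on four modules.
metrics = {"loc": [10, 20, 30, 40], "noise": [3, 1, 4, 2]}
defects = [0, 1, 2, 5]
root = pick_root_metric(metrics, defects)
```

In a full C4.5 variant this score would presumably be combined with, or used alongside, the usual gain-ratio criterion; the sketch isolates only the correlation step.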

international conference on software engineering | 2010

From isolated tenancy hosted application to multi-tenancy: Toward a systematic migration method for web application

Xuesong Zhang; Beijun Shen; Xucheng Tang; Wei Chen

Software as a Service (SaaS) attracts small and medium enterprises with its low investment, flexibility, and ease of management. Migrating an isolated-tenancy hosted web application to a SaaS application reuses legacy software assets and cuts redevelopment cost and risk. Since multi-tenancy is a prime characteristic of SaaS, migrating to multi-tenancy is a prerequisite step for migrating to SaaS. However, it is a hard task, complicated by the lack of appropriate migration approaches and tools. In this paper, a systematic method is proposed to migrate and evolve isolated-tenancy hosted applications into multi-tenant-enabled applications in terms of data model, access control, and tenant management, taking into account both business needs and technical contents. An experiment has been conducted to tune the approach and evaluate the applicability and performance impact of our migration method.

computer science and software engineering | 2008

A Case Study of Software Process Improvement in a Chinese Small Company

Beijun Shen; Tong Ruan

This paper presents a software process improvement (SPI) project at a small software company in China that aims to move its quality system from an ISO 9000 basis to a CMMI basis. One of the main challenges is how to combine flexibility and control without impeding a small company's innovative nature. To that end, many SPI practices were implemented, chiefly process modeling, process automation, and process measurement. The experience with these SPI initiatives offered several lessons about how small companies can manage SPI more successfully.

software engineering and knowledge engineering | 2015

Building a Large-scale Software Programming Taxonomy from Stackoverflow

Jiangang Zhu; Beijun Shen; Xuyang Cai; Haofen Wang

Taxonomies are becoming indispensable to a growing number of software engineering applications, such as software repository mining and defect prediction. However, existing related taxonomies are always manually constructed; they are small and of limited depth. To show the full potential of taxonomies in software engineering applications, we present in this paper the first large-scale software programming taxonomy, which is more comprehensive than any existing one. It contains 38,205 concepts and 68,098 subsumption relations. Instead of learning from an open domain, we focus on taxonomy construction from Stackoverflow, one of the largest QA websites about software programming. We propose a machine-learning-based method with novel features to create a taxonomy that captures the hierarchical semantic structure of tags in Stackoverflow. The method executes iteratively to find as many relations as possible. Experimental results show that our approach achieves much better accuracy than the baselines. Compared with software programming taxonomies extracted from general-purpose taxonomies such as WikiTaxonomy, Yago Taxonomy, and Schema.org, our taxonomy has the widest coverage of concepts, contains the largest number of subsumption relations, and reaches the deepest semantic hierarchy.

Keywords: Taxonomy Construction, Stackoverflow, Software Engineering

empirical software engineering and measurement | 2015

Code Bad Smell Detection through Evolutionary Data Mining

Shizhe Fu; Beijun Shen

Code bad smells have a severe impact on software quality. Numerous studies show that ignoring code bad smells can lead to the failure of a software system. Thus, the detection of bad smells has drawn the attention of many researchers and practitioners. Quite a few approaches have been proposed to detect code bad smells, and most are based solely on structural information extracted from source code. However, we have observed that some code bad smells have an evolutionary property, and thus propose a novel approach that detects three code bad smells by mining software evolutionary data: duplicated code, shotgun surgery, and divergent change. It exploits association rules mined from the change history of software systems, upon which we define heuristic algorithms to detect the three bad smells. The experimental results on five open source projects demonstrate that the proposed approach achieves higher precision, recall, and F-measure.
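
To make the idea of mining association rules from change history concrete, here is a minimal Python sketch. It treats each commit as a set of changed files, counts how often pairs of files change together, and emits rules `lhs -> rhs` that meet a support and confidence threshold. The thresholds, the function names, and the fan-out heuristic for shotgun surgery are assumptions for illustration; the abstract does not specify the paper's actual heuristics.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Sequence, Tuple

Rule = Tuple[str, str, int, float]  # (lhs, rhs, support, confidence)


def mine_cochange_rules(commits: Iterable[Sequence[str]],
                        min_support: int = 2,
                        min_confidence: float = 0.6) -> List[Rule]:
    """Mine pairwise co-change rules from a list of commits."""
    file_count: Counter = Counter()
    pair_count: Counter = Counter()
    for changed in commits:
        files = sorted(set(changed))  # ignore duplicates within one commit
        file_count.update(files)
        pair_count.update(combinations(files, 2))

    rules: List[Rule] = []
    for (a, b), together in pair_count.items():
        if together < min_support:
            continue
        # Each pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = together / file_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, together, confidence))
    return rules


def shotgun_surgery_candidates(rules: Iterable[Rule],
                               min_fanout: int = 2) -> List[str]:
    """Hypothetical heuristic: files whose change implies changes in
    at least `min_fanout` other files."""
    fanout = Counter(lhs for lhs, _rhs, _support, _conf in rules)
    return sorted(f for f, n in fanout.items() if n >= min_fanout)


# Illustrative history only: file names are placeholders.
history = [["A", "B"], ["A", "B", "C"], ["A", "B"], ["C"]]
rules = mine_cochange_rules(history)
```

With this toy history, `A` and `B` always change together, so both directions of the rule pass the thresholds, while the pairs involving `C` fall below the support threshold.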

wri world congress on software engineering | 2009

Model-Driven Reengineering of Database

Hanzhe Wang; Beijun Shen; Cheng Chen

Much work has applied the model-driven approach to software development for specific business domains. This research mostly shows how to transform business domain models into software applications under different paradigms, rather than how to transform specific software artifacts independently of business domain factors, such as the database, the common infrastructure of today's software systems. The latter kind of work can contribute more to general software development than to specific business domains. In this paper, we present an MDA-based approach to database reengineering and build a framework on top of existing frameworks (EMF, Operational-QVT).

asia-pacific software engineering conference | 2015

TBIL: A Tagging-Based Approach to Identity Linkage Across Software Communities

Wenkai Mo; Beijun Shen; Yuting Chen; Jiangang Zhu

Nowadays, developers can be involved in several software developer communities, such as StackOverflow and Github. Meanwhile, accounts from different communities are usually poorly connected. Linking these accounts, which is called identity linkage, is a prerequisite for many interesting studies, such as investigating the activities of one developer in two or more communities. Many studies have been performed on social networks, but very few of them can be adapted to software communities, because the user information provided in these communities differs greatly from that in social networks. We tackle the problem by introducing TBIL, a novel tagging-based approach to identity linkage among software communities. The essential idea of this approach is to employ developers' skills (measured by tags), usernames, and topics of concern as hints, and to use a decision-tree-based algorithm and a heuristic greedy matching algorithm to link user identities. We measure the effectiveness of TBIL on two well-known software communities, StackOverflow and Github. The results show that our method is feasible and practical for linking developer identities. In particular, the F-score of our method is 0.15 higher than that of previous identity linkage methods in software communities.
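
The greedy matching step can be illustrated with a short Python sketch. Note the substitution: instead of the decision-tree-based scoring the abstract mentions, this example scores account pairs with plain Jaccard similarity over skill tags, then links the highest-scoring pairs first so that each account is used at most once. The account names, tags, threshold, and function names are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity of two tag collections (0.0 when both are empty)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def greedy_link(left: Dict[str, Set[str]],
                right: Dict[str, Set[str]],
                threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """Greedily link accounts across two communities, best score first.

    Each account appears in at most one link; pairs scoring below
    `threshold` are never linked.
    """
    scored = sorted(
        ((jaccard(lt, rt), l, r)
         for l, lt in left.items()
         for r, rt in right.items()),
        key=lambda t: (-t[0], t[1], t[2]),  # highest score, then stable by name
    )
    used_left: Set[str] = set()
    used_right: Set[str] = set()
    links: List[Tuple[str, str, float]] = []
    for score, l, r in scored:
        if score < threshold:
            break  # remaining pairs score even lower
        if l in used_left or r in used_right:
            continue
        links.append((l, r, score))
        used_left.add(l)
        used_right.add(r)
    return links


# Illustrative accounts only.
so_users = {"so_alice": {"python", "pandas", "numpy"},
            "so_bob": {"java", "spring"}}
gh_users = {"gh_alice": {"python", "pandas"},
            "gh_bob": {"java", "spring", "maven"},
            "gh_carol": {"rust"}}
links = greedy_link(so_users, gh_users)
```

Greedy matching is not guaranteed to maximize the total similarity across all links, but it is simple and fast, which is presumably why a heuristic of this kind is attractive for large communities.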

computer software and applications conference | 2014

A Scenario-Based Approach to Predicting Software Defects Using Compressed C4.5 Model

Biwen Li; Beijun Shen; Jun Wang; Yuting Chen; Tao Zhang; Jinshuang Wang

Defect prediction approaches use software metrics and fault data to learn which software properties are associated with which kinds of software faults in programs. One trend in existing techniques is to predict software defects in a program construct (file, class, method, and so on) rather than in a specific functional scenario, although the latter is important for assessing software quality and tracking defects in software functionalities. However, it remains a challenge to determine how a functional scenario is derived and how a defect prediction technique should be applied to a scenario. In this paper, we propose a scenario-based approach to defect prediction using the compressed C4.5 model. The essential idea of this approach is to use a k-medoids algorithm to cluster functions, derive functional scenarios from the clusters, and then use the C4.5 model to predict faults in the scenarios. We have also conducted an experiment to evaluate the scenario-based approach and compare it with a file-based prediction approach. The experimental results show that the scenario-based approach performs well, reducing the size of the decision tree by 52.65% on average while also slightly increasing accuracy.
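
For readers unfamiliar with k-medoids, the clustering step can be sketched in Python as follows. This is a simple alternating variant (assign each point to its nearest medoid, then pick the member of each cluster that minimizes total distance to the others), run on a precomputed distance matrix. How the paper measures distance between functions is not stated in the abstract, so the one-dimensional toy distances and the deterministic "first k points" initialization are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def assign(dist: Matrix, medoids: List[int]) -> Dict[int, List[int]]:
    """Assign every point index to its nearest medoid."""
    clusters: Dict[int, List[int]] = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(dist: Matrix, k: int,
              max_iter: int = 100) -> Tuple[List[int], Dict[int, List[int]]]:
    """Alternating k-medoids on a precomputed distance matrix.

    Starts from the first k points as medoids (an assumption made to keep
    the example deterministic) and stops when the medoids no longer change.
    """
    medoids = list(range(k))
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # New medoid of a cluster: the member with the smallest total
        # distance to all other members; an empty cluster keeps its medoid.
        updated = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            if members else m
            for m, members in clusters.items()
        )
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)


# Illustrative data only: six "functions" placed on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
medoids, clusters = k_medoids(dist, k=2)
```

Unlike k-means, the cluster centers here are always actual data points, which makes k-medoids usable with any pairwise distance, including distances that are not derived from coordinates.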

international conference on computer science and service system | 2011

On building knowledge cloud

Dehua Ju; Beijun Shen

In this paper, we extend the “data in the cloud” approach to the “knowledge in the cloud” paradigm. The knowledge cloud model represents an ideal solution for on-demand KaaS on the cloud platform. As explained in this paper, the design framework for a public knowledge service platform previously proposed by the authors can be smoothly migrated into the cloud environment. The knowledge cloud is a high-usability knowledge source for all knowledge workers and, moreover, a collaborative base for knowledge co-creation.

ICSP'07 Proceedings of the 2007 international conference on Software process | 2007

On the measurement of agility in software process

Beijun Shen; Dehua Ju

The agile software process may become one of the most rational development patterns in the global economic environment, helping software enterprises respond rapidly to the market. This paper proposes a method to measure agility in software process using goal-driven techniques and the balanced scorecard. Using this method, we design a set of representative agility metrics for measuring agility in software process. We also perform a case study of the proposed agility measurement.
