
Featured research published by Martin J. Shepperd.

IEEE Transactions on Software Engineering | 1997

Estimating software project effort using analogies

Martin J. Shepperd; Chris Schofield

Accurate project effort prediction is an important goal for the software engineering community. To date most work has focused upon building algorithmic models of effort, for example COCOMO. These can be calibrated to local environments. We describe an alternative approach to estimation based upon the use of analogies. The underlying principle is to characterize projects in terms of features (for example, the number of interfaces, the development method or the size of the functional requirements document). Completed projects are stored and then the problem becomes one of finding the most similar projects to the one for which a prediction is required. Similarity is defined as Euclidean distance in n-dimensional space where n is the number of project features. Each dimension is standardized so all dimensions have equal weight. The known effort values of the nearest neighbors to the new project are then used as the basis for the prediction. The process is automated using a PC-based tool known as ANGEL. The method is validated on nine different industrial datasets (a total of 275 projects) and in all cases analogy outperforms algorithmic models based upon stepwise regression. From this work we argue that estimation by analogy is a viable technique that, at the very least, can be used by project managers to complement current estimation techniques.
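
The procedure outlined in this abstract (standardize each feature, measure Euclidean distance, and base the prediction on the nearest completed projects) can be illustrated with a minimal sketch. This is not the ANGEL tool itself. The function names, the min-max form of standardization, the choice of `k`, and the use of an unweighted mean of the neighbours' efforts are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def make_scaler(rows):
    """Return a function that min-max scales a feature vector.

    Assumption: "standardized" is taken to mean min-max scaling to [0, 1]
    over the completed projects, so every feature carries equal weight.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # Guard against zero range (a constant feature) to avoid division by zero.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]

    def scale(row):
        return [(v - lo) / span for v, lo, span in zip(row, lows, spans)]

    return scale


def estimate_by_analogy(features, efforts, target, k=2):
    """Estimate effort for `target` from its k nearest completed projects.

    Assumption: the prediction is the unweighted mean of the neighbours'
    known efforts; other adaptation rules are possible.
    """
    scale = make_scaler(features)
    scaled_target = scale(target)
    # Rank completed projects by Euclidean distance in standardized space.
    ranked = sorted(
        (math.dist(scale(row), scaled_target), effort)
        for row, effort in zip(features, efforts)
    )
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)


# Hypothetical completed projects: [number of interfaces, size of requirements document]
completed = [[10, 200], [12, 220], [40, 900], [45, 1000]]
known_efforts = [100.0, 120.0, 480.0, 520.0]

estimate = estimate_by_analogy(completed, known_efforts, [11, 210], k=2)
```

Here the two closest analogies to the new project are the two small projects, so the estimate is the mean of their efforts. The project data are invented purely to exercise the code.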

IEEE Transactions on Software Engineering | 2007

A Systematic Review of Software Development Cost Estimation Studies

Magne Jørgensen; Martin J. Shepperd

This paper aims to provide a basis for the improvement of software-estimation research through a systematic review of previous work. The review identifies 304 software cost estimation papers in 76 journals and classifies the papers according to research topic, estimation approach, research approach, study context and data set. A Web-based library of these cost estimation papers is provided to ease the identification of relevant estimation research results. The review results combined with other knowledge provide support for recommendations for future software cost estimation research, including: 1) increase the breadth of the search for relevant studies, 2) search manually for relevant papers within a carefully selected set of journals when completeness is essential, 3) conduct more studies on estimation methods commonly used by the software industry, and 4) increase the awareness of how properties of the data sets impact the results when evaluating estimation methods.

International Conference on Software Engineering | 1996

Effort estimation using analogy

Martin J. Shepperd; Chris Schofield; Barbara A. Kitchenham

The staff resources or effort required for a software project are notoriously difficult to estimate in advance. To date most work has focused upon algorithmic cost models such as COCOMO and Function Points. These can suffer from the disadvantage of the need to calibrate the model to each individual measurement environment coupled with very variable accuracy levels even after calibration. An alternative approach is to use analogy for estimation. We demonstrate that this method has considerable promise in that we show it to outperform traditional algorithmic methods for six different datasets. A disadvantage of estimation by analogy is that it requires a considerable amount of computation. The paper describes an automated environment known as ANGEL that supports the collection, storage and identification of the most analogous projects in order to estimate the effort for a new project. ANGEL is based upon the minimisation of Euclidean distance in n-dimensional space. The software is flexible and can deal with differing datasets both in terms of the number of observations (projects) and in the variables collected. Our analogy approach is evaluated with six distinct datasets drawn from a range of different environments and is found to outperform other methods. It is widely accepted that effective software effort estimation demands more than one technique. We have shown that estimating by analogy is a candidate technique and that, with the aid of an automated environment, it is an eminently practical technique.

IEEE Transactions on Software Engineering | 2000

An empirical investigation of an object-oriented software system

Michelle Cartwright; Martin J. Shepperd

The paper describes an empirical investigation into an industrial object oriented (OO) system comprised of 133000 lines of C++. The system was a subsystem of a telecommunications product and was developed using the Shlaer-Mellor method (S. Shlaer and S.J. Mellor, 1988; 1992). From this study, we found that there was little use of OO constructs such as inheritance, and therefore polymorphism. It was also found that there was a significant difference in the defect densities between those classes that participated in inheritance structures and those that did not, with the former being approximately three times more defect-prone. We were able to construct useful prediction systems for size and number of defects based upon simple counts such as the number of states and events per class. Although these prediction systems are only likely to have local significance, there is a more general principle that software developers can consider building their own local prediction systems. Moreover, we believe this is possible, even in the absence of the suites of metrics that have been advocated by researchers into OO technology. As a consequence, measurement technology may be accessible to a wider group of potential users.

Journal of Systems and Software | 2000

An investigation of machine learning based prediction systems

Carolyn Mair; Gada F. Kadoda; Martin Lefley; Keith Phalp; Chris Schofield; Martin J. Shepperd; Steve Webster

Traditionally, researchers have used either off-the-shelf models such as COCOMO, or developed local models using statistical techniques such as stepwise regression, to obtain software effort estimates. More recently, attention has turned to a variety of machine learning methods such as artificial neural networks (ANNs), case-based reasoning (CBR) and rule induction (RI). This paper outlines some comparative research into the use of these three machine learning methods to build software effort prediction systems. We briefly describe each method and then apply the techniques to a dataset of 81 software projects derived from a Canadian software house in the late 1980s. We compare the prediction systems in terms of three factors: accuracy, explanatory value and configurability. We show that ANN methods have superior accuracy and that RI methods are least accurate. However, this view is somewhat counteracted by problems with explanatory value and configurability. For example, we found that considerable effort was required to configure the ANN and that this compared very unfavourably with the other techniques, particularly CBR and least squares regression (LSR). We suggest that further work be carried out, both to further explore interaction between the end user and the prediction system, and also to facilitate configuration, particularly of ANNs.

IEEE Transactions on Software Engineering | 2011

A General Software Defect-Proneness Prediction Framework

Qinbao Song; Zihan Jia; Martin J. Shepperd; Shi Ying; Jin Liu

BACKGROUND - Predicting defect-prone software components is an economically important activity and so has received a good deal of attention. However, making sense of the many, and sometimes seemingly inconsistent, results is difficult. OBJECTIVE - We propose and evaluate a general framework for software defect prediction that supports 1) unbiased and 2) comprehensive comparison between competing prediction systems. METHOD - The framework is comprised of 1) scheme evaluation and 2) defect prediction components. The scheme evaluation analyzes the prediction performance of competing learning schemes for given historical data sets. The defect predictor builds models according to the evaluated learning scheme and predicts software defects with new data according to the constructed model. In order to demonstrate the performance of the proposed framework, we use both simulation and publicly available software defect data sets. RESULTS - The results show that we should choose different learning schemes for different data sets (i.e., no scheme dominates), that small details in how evaluations are conducted can completely reverse findings, and last, that our proposed framework is more effective and less prone to bias than previous approaches. CONCLUSIONS - Failure to properly or fully evaluate a learning scheme can be misleading; however, these problems may be overcome by our proposed framework.

IEEE Transactions on Software Engineering | 2006

Software defect association mining and defect correction effort prediction

Qinbao Song; Martin J. Shepperd; Michelle Cartwright; Carolyn Mair

Much current software defect prediction work focuses on the number of defects remaining in a software system. In this paper, we present association rule mining based methods to predict defect associations and defect correction effort. This is to help developers detect software defects and assist project managers in allocating testing resources more effectively. We applied the proposed methods to the SEL defect data consisting of more than 200 projects over more than 15 years. The results show that, for defect association prediction, the accuracy is very high and the false-negative rate is very low. Likewise, for the defect correction effort prediction, the accuracy for both defect isolation effort prediction and defect correction effort prediction is also high. We compared the defect correction effort prediction method with other types of methods - PART, C4.5, and Naive Bayes - and show that accuracy has been improved by at least 23 percent. We also evaluated the impact of support and confidence levels on prediction accuracy, false-negative rate, false-positive rate, and the number of rules. We found that higher support and confidence levels may not result in higher prediction accuracy, and a sufficient number of rules is a precondition for high prediction accuracy.
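
The support and confidence levels mentioned in this abstract are the standard thresholds used to filter association rules. As a rough illustration of those two measures only (not the authors' mining method), the sketch below computes them for a single rule. The defect-type labels and the sample records are hypothetical and are not drawn from the SEL data.

```python
def support(records, itemset):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(record) for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`.

    Standard definition: support(antecedent and consequent) / support(antecedent).
    Returns 0.0 when the antecedent never occurs.
    """
    base = support(records, antecedent)
    if base == 0:
        return 0.0
    return support(records, set(antecedent) | set(consequent)) / base


# Hypothetical records: the set of defect types observed together in one project.
records = [
    {"interface", "data"},
    {"interface", "data", "logic"},
    {"interface"},
    {"logic"},
]

rule_support = support(records, {"interface", "data"})
rule_confidence = confidence(records, {"interface"}, {"data"})
```

Raising either threshold prunes more candidate rules, which is consistent with the abstract's observation that stricter levels do not necessarily give higher accuracy when too few rules remain.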

IEEE Transactions on Software Engineering | 2013

Data Quality: Some Comments on the NASA Software Defect Datasets

Martin J. Shepperd; Qinbao Song; Zhongbin Sun; Carolyn Mair

Background--Self-evidently empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and using the same rather than similar versions of datasets. In recent years, there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. The publicly available NASA datasets have been extensively used as part of this research. Objective--This short note investigates the extent to which published analyses based on the NASA defect datasets are meaningful and comparable. Method--We analyze the five studies published in the IEEE Transactions on Software Engineering since 2007 that have utilized these datasets and compare the two versions of the datasets currently in use. Results--We find important differences between the two versions of the datasets, implausible values in one dataset and generally insufficient detail documented on dataset preprocessing. Conclusions--It is recommended that researchers 1) indicate the provenance of the datasets they use, 2) report any preprocessing in sufficient detail to enable meaningful replication, and 3) invest effort in understanding the data prior to applying machine learners.

IEEE Transactions on Software Engineering | 1995

Comments on "A metrics suite for object oriented design"

Neville Churcher; Martin J. Shepperd; Shyam R. Chidamber; Chris F. Kemerer

A suite of object oriented software metrics has recently been proposed by S.R. Chidamber and C.F. Kemerer (see ibid., vol. 20, p. 476-94, 1994). While the authors have taken care to ensure their metrics have a sound measurement theoretical basis, we argue that it is premature to begin applying such metrics while there remains uncertainty about the precise definitions of many of the quantities to be observed and their impact upon subsequent indirect metrics. In particular, we show some of the ambiguities associated with the seemingly simple concept of the number of methods per class. The usefulness of the proposed metrics, and others, would be greatly enhanced if clearer guidance concerning their application to specific languages were to be provided. Such empirical considerations are as important as the theoretical issues raised by the authors.

IEEE Transactions on Software Engineering | 2014

Researcher Bias: The Use of Machine Learning in Software Defect Prediction

Martin J. Shepperd; David Bowes; Tracy Hall

Background. The ability to predict defect-prone software components would be valuable. Consequently, there have been many empirical studies to evaluate the performance of different techniques endeavouring to accomplish this effectively. However, no one technique dominates and so designing a reliable defect prediction model remains problematic. Objective. We seek to make sense of the many conflicting experimental results and understand which factors have the largest effect on predictive performance. Method. We conduct a meta-analysis of all relevant, high quality primary studies of defect prediction to determine what factors influence predictive performance. This is based on 42 primary studies that satisfy our inclusion criteria that collectively report 600 sets of empirical prediction results. By reverse engineering a common response variable we build a random effects ANOVA model to examine the relative contribution of four model building factors (classifier, data set, input metrics and researcher group) to model prediction performance. Results. Surprisingly we find that the choice of classifier has little impact upon performance (1.3 percent) and in contrast the major (31 percent) explanatory factor is the researcher group. It matters more who does the work than what is done. Conclusion. To overcome this high level of researcher bias, defect prediction researchers should (i) conduct blind analysis, (ii) improve reporting protocols and (iii) conduct more intergroup studies in order to alleviate expertise issues. Lastly, research is required to determine whether this bias is prevalent in other application domains.
