Carolyn Mair
Southampton Solent University
Publication
Featured research published by Carolyn Mair.
Journal of Systems and Software | 2000
Carolyn Mair; Gada F. Kadoda; Martin Lefley; Keith Phalp; Chris Schofield; Martin J. Shepperd; Steve Webster
Traditionally, researchers have used either off-the-shelf models such as COCOMO, or developed local models using statistical techniques such as stepwise regression, to obtain software effort estimates. More recently, attention has turned to a variety of machine learning methods such as artificial neural networks (ANNs), case-based reasoning (CBR) and rule induction (RI). This paper outlines some comparative research into the use of these three machine learning methods to build software effort prediction systems. We briefly describe each method and then apply the techniques to a dataset of 81 software projects derived from a Canadian software house in the late 1980s. We compare the prediction systems in terms of three factors: accuracy, explanatory value and configurability. We show that ANN methods have superior accuracy and that RI methods are least accurate. However, this view is somewhat counteracted by problems with explanatory value and configurability. For example, we found that considerable effort was required to configure the ANN and that this compared very unfavourably with the other techniques, particularly CBR and least squares regression (LSR). We suggest that further work be carried out, both to further explore interaction between the end-user and the prediction system, and also to facilitate configuration, particularly of ANNs.
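To illustrate the analogy-based idea behind one of the techniques compared above, the following is a minimal sketch of case-based reasoning for effort estimation: predict a new project's effort from its k most similar past projects. It is not the prediction system built in the paper; the features, project values and choice of k are invented for illustration.

```python
# Minimal sketch of case-based reasoning (analogy-based) effort estimation:
# predict a new project's effort as the mean effort of its k nearest
# neighbours among past projects. Feature names and values are illustrative,
# not taken from the dataset used in the paper.
import numpy as np

def cbr_estimate(past_features, past_effort, new_features, k=3):
    """Estimate effort for one new project from its k most similar past projects."""
    # Min-max normalise features so no single attribute dominates the distance.
    lo, hi = past_features.min(axis=0), past_features.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    past_norm = (past_features - lo) / span
    new_norm = (new_features - lo) / span

    # Euclidean distance to every past project, then average the k closest efforts.
    dists = np.linalg.norm(past_norm - new_norm, axis=1)
    nearest = np.argsort(dists)[:k]
    return past_effort[nearest].mean()

# Toy data: columns might be size (function points), team size, duration (months).
past_features = np.array([[120, 5, 6], [300, 9, 12], [80, 3, 4], [210, 7, 9]], float)
past_effort = np.array([14.0, 40.0, 8.0, 26.0])   # person-months
print(cbr_estimate(past_features, past_effort, np.array([150, 6, 7], float)))
```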
IEEE Transactions on Software Engineering | 2006
Qinbao Song; Martin J. Shepperd; Michelle Cartwright; Carolyn Mair
Much current software defect prediction work focuses on the number of defects remaining in a software system. In this paper, we present association rule mining-based methods to predict defect associations and defect correction effort, to help developers detect software defects and assist project managers in allocating testing resources more effectively. We applied the proposed methods to the SEL defect data, which consist of more than 200 projects spanning more than 15 years. The results show that, for defect association prediction, the accuracy is very high and the false-negative rate is very low. Likewise, the accuracy of both defect isolation effort prediction and defect correction effort prediction is also high. We compared the defect correction effort prediction method with other types of methods - PART, C4.5, and Naive Bayes - and show that accuracy is improved by at least 23 percent. We also evaluated the impact of support and confidence levels on prediction accuracy, false-negative rate, false-positive rate, and the number of rules. We found that higher support and confidence levels may not result in higher prediction accuracy, and that a sufficient number of rules is a precondition for high prediction accuracy.
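As an illustration of the rule-mining idea, the following sketch derives simple defect-association rules filtered by support and confidence thresholds. It is not the paper's method or the SEL data; the defect categories, transactions and thresholds are invented.

```python
# Illustrative sketch of mining association rules over defect types and filtering
# them by support and confidence. The defect categories and "transactions" below
# are made up; the paper's SEL data and exact algorithm are not reproduced.
from itertools import combinations
from collections import Counter

# Each transaction lists the defect types found together in one module/fix.
transactions = [
    {"interface", "logic"},
    {"interface", "logic", "data"},
    {"logic", "data"},
    {"interface", "data"},
    {"interface", "logic"},
]

min_support, min_confidence = 0.4, 0.7
n = len(transactions)

# Occurrence counts for single defect types and for pairs.
item_counts = Counter(item for t in transactions for item in t)
pair_counts = Counter(pair for t in transactions for pair in combinations(sorted(t), 2))

# Emit rules antecedent -> consequent that clear both thresholds.
for (a, b), count in pair_counts.items():
    support = count / n
    if support < min_support:
        continue
    for lhs, rhs in ((a, b), (b, a)):
        confidence = count / item_counts[lhs]
        if confidence >= min_confidence:
            print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```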
IEEE Transactions on Software Engineering | 2013
Martin J. Shepperd; Qinbao Song; Zhongbin Sun; Carolyn Mair
Background--Self-evidently, empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and on using the same, rather than merely similar, versions of datasets. In recent years, there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. The publicly available NASA datasets have been extensively used as part of this research. Objective--This short note investigates the extent to which published analyses based on the NASA defect datasets are meaningful and comparable. Method--We analyze the five studies published in the IEEE Transactions on Software Engineering since 2007 that have utilized these datasets and compare the two versions of the datasets currently in use. Results--We find important differences between the two versions of the datasets, implausible values in one dataset and generally insufficient detail documented on dataset preprocessing. Conclusions--It is recommended that researchers 1) indicate the provenance of the datasets they use, 2) report any preprocessing in sufficient detail to enable meaningful replication, and 3) invest effort in understanding the data prior to applying machine learners.
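The kind of dataset scrutiny the note recommends can be sketched as a few sanity checks that compare two versions of a defect dataset and flag implausible values. The file paths and column names below are hypothetical, not the NASA MDP schema.

```python
# Minimal sketch of pre-analysis sanity checks on two versions of a defect
# dataset: compare shapes and columns, count duplicates, and flag implausible
# values. File names and column names (loc, defects) are hypothetical.
import pandas as pd

v1 = pd.read_csv("defect_data_version_a.csv")   # hypothetical paths
v2 = pd.read_csv("defect_data_version_b.csv")

print("rows/cols v1:", v1.shape, " v2:", v2.shape)
print("columns only in v1:", set(v1.columns) - set(v2.columns))
print("duplicate rows v1:", v1.duplicated().sum(), " v2:", v2.duplicated().sum())

# Implausible values: non-positive size, or a module with zero lines but defects.
for name, df in (("v1", v1), ("v2", v2)):
    bad_size = (df["loc"] <= 0).sum()
    zero_loc_defects = ((df["loc"] == 0) & (df["defects"] > 0)).sum()
    print(f"{name}: non-positive loc = {bad_size}, zero-loc modules with defects = {zero_loc_defects}")
```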
ieee international software metrics symposium | 2005
Qinbao Song; Martin Shepperd; Carolyn Mair
The inherent uncertainty of the software development process presents particular challenges for software effort prediction. We need to systematically address missing data values, feature subset selection and the continuous evolution of predictions as the project unfolds, all in the context of data starvation and noisy data. In this paper, however, we focus particularly on feature subset selection and effort prediction at an early stage of a project. We propose a novel approach using grey relational analysis (GRA) from grey system theory (GST), a recently developed systems engineering theory based on the uncertainty of small samples. In this work we address some of the theoretical challenges in applying GRA to feature subset selection and effort prediction, and then evaluate our approach on five publicly available industrial data sets, using stepwise regression as a benchmark. The results are very encouraging in the sense of being comparable to or better than those of other machine learning techniques, and thus indicate that the method has considerable potential.
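For readers unfamiliar with grey relational analysis, the following is a minimal sketch of how grey relational grades could be computed to rank candidate features against effort. It follows the standard textbook formulation with distinguishing coefficient rho = 0.5; the project data are invented and this is not the paper's exact procedure.

```python
# Minimal sketch of grey relational analysis (GRA) used to rank features by how
# closely they track the target (effort). rho = 0.5 is the common textbook
# choice for the distinguishing coefficient; the project data are invented.
import numpy as np

def grey_relational_grades(features, target, rho=0.5):
    """Return one grey relational grade per feature column."""
    # Min-max normalise the target and each feature onto [0, 1].
    def norm(x):
        lo, hi = x.min(axis=0), x.max(axis=0)
        return (x - lo) / np.where(hi > lo, hi - lo, 1.0)

    x0 = norm(target)                 # reference sequence
    xi = norm(features)               # comparison sequences, one per column
    delta = np.abs(xi - x0[:, None])  # deviation sequences
    dmin, dmax = delta.min(), delta.max()
    coeff = (dmin + rho * dmax) / (delta + rho * dmax)  # grey relational coefficients
    return coeff.mean(axis=0)         # grade = mean coefficient per feature

# Toy data: 6 projects, 3 candidate features, effort (person-months) as target.
features = np.array([[100, 4, 2], [250, 8, 3], [80, 3, 1],
                     [300, 9, 4], [150, 5, 2], [200, 7, 3]], float)
effort = np.array([12.0, 35.0, 9.0, 44.0, 18.0, 28.0])
grades = grey_relational_grades(features, effort)
print("feature ranking (best first):", np.argsort(-grades))
```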
model driven engineering languages and systems | 2005
Carolyn Mair; Martin J. Shepperd; Magne Jørgensen
OBJECTIVE - to build up a picture of the nature and type of data sets being used to develop and evaluate different software project effort prediction systems. We believe this to be important since there is a growing body of published work that seeks to assess different prediction approaches. METHOD - we performed an exhaustive search, from 1980 onwards, of three software engineering journals for research papers that used project data sets to compare cost prediction systems. RESULTS - this identified a total of 50 papers that used, one or more times, a total of 71 unique project data sets. We observed that some of the better known and easily accessible data sets were used repeatedly, making them potentially disproportionately influential. Such data sets also tend to be amongst the oldest, with potential problems of obsolescence. We also note that only about 60% of all data sets are in the public domain. Finally, extracting relevant information from research papers has been time consuming due to different styles of presentation and levels of contextual information. CONCLUSIONS - first, the community needs to consider the quality and appropriateness of the data set being utilised; not all data sets are equal. Second, we need to assess the way results are presented in order to facilitate meta-analysis, and to consider whether a standard protocol would be appropriate.
Journal of Further and Higher Education | 2012
Carolyn Mair
There exists broad agreement on the value of reflective practice for personal and professional development. However, many students in higher education (HE) struggle with the concept of reflection, so they do not engage well with the process, and its full value is seldom realised. An online resource was developed to facilitate and structure the recording, storage and retrieval of reflections with the focus on facilitating reflective writing, developing metacognitive awareness and, ultimately, enhancing learning. Ten undergraduate students completed a semi-structured questionnaire prior to participating in a focus group designed to elicit a common understanding of reflective practice. They maintained reflective practice online for 6 weeks and participated in post-study individual interviews. Findings provide evidence for the positive acceptance, efficiency and effectiveness of the intervention. Using a structured approach to online reflective practice is empowering and ultimately enhances undergraduate learning through the development of metacognition.
hawaii international conference on system sciences | 2012
Carolyn Mair; Miriam Martincova; Martin J. Shepperd
BACKGROUND - whilst substantial effort has been invested in developing and evaluating knowledge-based techniques for project prediction, little is known about the interaction between them and expert users. OBJECTIVE - the aim is to explore the interaction of cognitive processes and personality of software project managers undertaking tool-supported estimation tasks such as effort and cost prediction. METHOD - we conducted personality profiling and observational studies using think-aloud protocols with five senior project managers using a case-based reasoning (CBR) tool to predict effort for real projects. RESULTS - we found pronounced differences between the participants in terms of individual differences, cognitive behaviour and estimation outcomes, although there was a general tendency for over-optimism and over-confidence. CONCLUSIONS - in order to improve task effectiveness in the workplace we need to understand the cognitive behaviour of software professionals in addition to conducting machine learning research.
Journal of Applied Research in Higher Education | 2016
Lalage Sanders; Carolyn Mair; Rachael James
Purpose – The purpose of this paper is to evaluate the use of two psychometric measures as predictors of end-of-year outcome for first-year university students. Design/methodology/approach – New undergraduates (n=537) were recruited in two contrasting universities, one arts-based and one science-based, in different cities in the UK. At the start of the academic year, new undergraduates across 30 programmes in the two institutions were invited to complete a survey comprising two psychometric measures: the Academic Behavioural Confidence scale and the Performance Expectation Ladder. Outcome data were collected from the examining boards the following summer, distinguishing those who were able to progress to the next year of study without further assessment from those who were not. Findings – Two of the four Confidence subscales, Attendance and Studying, showed significantly lower scores among students who were unable to progress the following June compared with those who could (p < 0.003). The Ladder data showed the less...
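As an illustration of the kind of group comparison reported in the findings, the sketch below applies an independent-samples (Welch's) t-test to one Confidence subscale for students who progressed versus those who did not. The scores are invented; this is not the study's actual analysis or data.

```python
# Illustrative sketch: compare an Academic Behavioural Confidence subscale
# (e.g. Studying) between students who progressed and those who did not,
# using Welch's independent-samples t-test. Scores below are invented.
from scipy import stats
import numpy as np

progressed = np.array([19, 21, 18, 22, 20, 23, 19, 21], float)       # subscale scores
not_progressed = np.array([15, 17, 14, 16, 18, 15], float)

t, p = stats.ttest_ind(progressed, not_progressed, equal_var=False)  # Welch's t-test
print(f"t = {t:.2f}, p = {p:.4f}")
```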
international symposium on empirical software engineering | 2005
Carolyn Mair; Martin J. Shepperd
workshop on emerging trends in software metrics | 2011
Carolyn Mair; Martin J. Shepperd