Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Michael Franklin Bosu is active.

Publication


Featured research published by Michael Franklin Bosu.


Evaluation and Assessment in Software Engineering | 2013

Data quality in empirical software engineering: a targeted review

Michael Franklin Bosu; Stephen G. MacDonell

Context: The utility of prediction models in empirical software engineering (ESE) is heavily reliant on the quality of the data used in building those models. Several data quality challenges such as noise, incompleteness, outliers and duplicate data points may be relevant in this regard. Objective: We investigate the reporting of three potentially influential elements of data quality in ESE studies: data collection, data pre-processing, and the identification of data quality issues. This enables us to establish how researchers view the topic of data quality and the mechanisms that are being used to address it. Greater awareness of data quality should inform both the sound conduct of ESE research and the robust practice of ESE data collection and processing. Method: We performed a targeted literature review of empirical software engineering studies covering the period January 2007 to September 2012. A total of 221 relevant studies met our inclusion criteria and were characterized in terms of their consideration and treatment of data quality. Results: We obtained useful insights as to how the ESE community considers these three elements of data quality. Only 23 of these 221 studies reported on all three elements of data quality considered in this paper. Conclusion: The reporting of data collection procedures is not documented consistently in ESE studies. It will be useful if data collection challenges are reported in order to improve our understanding of why there are problems with software engineering data sets and the models developed from them. More generally, data quality should be given far greater attention by the community. The improvement of data sets through enhanced data collection, pre-processing and quality assessment should lead to more reliable prediction models, thus improving the practice of software engineering.


Australian Software Engineering Conference | 2013

A Taxonomy of Data Quality Challenges in Empirical Software Engineering

Michael Franklin Bosu; Stephen G. MacDonell

Reliable empirical models such as those used in software effort estimation or defect prediction are inherently dependent on the data from which they are built. As demands for process and product improvement continue to grow, the quality of the data used in measurement and prediction systems warrants increasingly close scrutiny. In this paper we propose a taxonomy of data quality challenges in empirical software engineering, based on an extensive review of prior research. We consider current assessment techniques for each quality issue and proposed mechanisms to address these issues, where available. Our taxonomy classifies data quality issues into three broad areas: first, characteristics of data that mean they are not fit for modeling, second, data set characteristics that lead to concerns about the suitability of applying a given model to another data set, and third, factors that prevent or limit data accessibility and trust. We identify this latter area as of particular need in terms of further research.
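The taxonomy's three broad areas lend themselves to a simple data-structure sketch, shown below in Python. The issue names listed under each area are assumptions chosen for illustration from the abstract and common ESE data quality discussions, not the paper's exact sub-categories.

```python
# Illustrative sketch of the taxonomy's three broad areas as a Python mapping.
# The specific issues under each area are assumed examples, not the paper's
# exact sub-categories.
data_quality_taxonomy = {
    "not_fit_for_modeling": [          # data characteristics that make data unsuitable for modeling
        "noise", "outliers", "incompleteness", "duplicate data points",
    ],
    "model_transferability_concerns": [  # dataset characteristics that limit applying a model elsewhere
        "heterogeneity", "timeliness",
    ],
    "accessibility_and_trust": [       # factors that prevent or limit data accessibility and trust
        "commercial sensitivity", "provenance",
    ],
}

def classify_issue(issue: str) -> str:
    """Return the broad taxonomy area that lists a given data quality issue."""
    for area, issues in data_quality_taxonomy.items():
        if issue in issues:
            return area
    return "unclassified"

print(classify_issue("noise"))  # -> not_fit_for_modeling
```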


IEEE International Conference on Software Quality, Reliability and Security (QRS) | 2017

Investigating the Significance of Bellwether Effect to Improve Software Effort Estimation

Solomon Mensah; Jacky Keung; Stephen G. MacDonell; Michael Franklin Bosu; Kwabena Ebo Bennin

The Bellwether effect refers to the existence of exemplary projects (called the Bellwether) within a historical dataset that can be used for improved prediction performance. Recent studies have shown an implicit assumption of using recently completed projects (referred to as a moving window) for improved prediction accuracy. In this paper, we investigate the Bellwether effect on software effort estimation accuracy using moving windows. The existence of the Bellwether was empirically proven based on six postulations. We apply statistical stratification and Markov chain methodology to select the Bellwether moving window. The resulting Bellwether moving window is used to predict the software effort of a new project. Empirical results show that the Bellwether effect exists in chronological datasets, with a set of exemplary and recently completed projects representing the Bellwether moving window. Results from this study show that the use of the Bellwether moving window with a Gaussian weighting function significantly improves prediction accuracy.
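The abstract does not give the weighting formula, so the following is only a minimal sketch of how a Gaussian weighting function might be applied to a moving window of recently completed projects, assuming a simple weighted-mean productivity estimator. The function names, the value of `sigma`, and the estimator itself are assumptions, not the paper's actual Bellwether selection procedure.

```python
import numpy as np

def gaussian_weights(ages_in_months: np.ndarray, sigma: float = 6.0) -> np.ndarray:
    """Weight recently completed projects more heavily; `sigma` controls decay (assumed value)."""
    return np.exp(-(ages_in_months ** 2) / (2.0 * sigma ** 2))

def weighted_effort_estimate(window_effort: np.ndarray,
                             window_size: np.ndarray,
                             window_ages: np.ndarray,
                             new_project_size: float) -> float:
    """Estimate effort for a new project from a (Bellwether) moving window.

    Uses a Gaussian-weighted mean of productivity (effort per unit size) over the
    window -- an illustrative stand-in for the paper's estimation procedure.
    """
    w = gaussian_weights(window_ages)
    productivity = window_effort / window_size            # effort per size unit
    weighted_productivity = np.average(productivity, weights=w)
    return weighted_productivity * new_project_size

# Toy usage: three recently completed projects (effort in person-hours, size in function points).
effort = np.array([1200.0, 950.0, 1500.0])
size = np.array([100.0, 80.0, 120.0])
ages = np.array([2.0, 5.0, 9.0])                          # months since completion
print(weighted_effort_estimate(effort, size, ages, new_project_size=110.0))
```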


Software Engineering and Knowledge Engineering | 2016

Multi-Objective Optimization for Software Testing Effort Estimation

Solomon Mensah; Jacky Keung; Kwabena Ebo Bennin; Michael Franklin Bosu

Software Testing Effort (STE), which contributes about 25-40% of the total development effort, plays a significant role in software development. In addressing the issues faced by companies in finding relevant datasets for STE estimation modeling prior to development, cross-company modeling could be leveraged. This study aims at assessing the effectiveness of cross-company (CC) and within-company (WC) projects in STE estimation. A robust multi-objective Mixed-Integer Linear Programming (MILP) optimization framework for the selection of CC and WC projects was constructed, and STE was estimated using Deep Neural Networks. Results from our study indicate that the application of the MILP framework yielded similar results for both WC and CC modeling. The modeling framework will serve as a foundation to assist in STE estimation prior to the development of a new software project.
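The abstract names a multi-objective MILP framework for selecting CC and WC projects but does not give the formulation. The sketch below substitutes a brute-force 0/1 selection over a tiny candidate pool that minimises a weighted sum of two assumed objectives (dissimilarity to the target project and the fraction of the pool selected), purely to illustrate the selection step; a real MILP solver would be used for pools of realistic size, and the paper's Deep Neural Network estimation stage is omitted.

```python
from itertools import combinations
import numpy as np

def select_projects(candidates: np.ndarray, target: np.ndarray,
                    max_k: int = 5, alpha: float = 0.7):
    """Pick a subset of candidate projects (rows of feature vectors) for training.

    Minimises a weighted sum of two illustrative objectives:
      (1) mean Euclidean distance of the selected projects to the target project, and
      (2) the fraction of the pool selected (to keep the training set small).
    Brute force is only viable here because the toy pool is tiny.
    """
    n = len(candidates)
    best_score, best_subset = float("inf"), ()
    for k in range(1, max_k + 1):
        for subset in combinations(range(n), k):
            dists = np.linalg.norm(candidates[list(subset)] - target, axis=1)
            score = alpha * dists.mean() + (1 - alpha) * (k / n)
            if score < best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score

# Toy usage: 6 candidate projects described by (size, team size) features.
pool = np.array([[100, 5], [80, 4], [300, 12], [120, 6], [90, 5], [250, 10]], dtype=float)
new_project = np.array([110, 5], dtype=float)
print(select_projects(pool, new_project))
```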


International Conference on Software Engineering | 2017

A Stratification and Sampling Model for Bellwether Moving Window

Solomon Mensah; Jacky Keung; Michael Franklin Bosu; Kwabena Ebo Bennin; Patrick Kwaku Kudjo

An effective method for finding the relevant number (window size) and the elapsed time (window age) of recently completed projects has proven elusive in software effort estimation. Although these two parameters significantly affect prediction accuracy, there is no effective method to stratify and sample chronological projects to improve the prediction performance of software effort estimation models. Exemplary projects (the Bellwether) representing the training set have been empirically validated to improve prediction accuracy in the domain of software defect prediction. However, the concept of the Bellwether and its effect have not been empirically proven in software effort estimation as a method of selecting exemplary/relevant projects with a defined window size and age. In view of this, we introduce a novel method for selecting relevant and recently completed projects, referred to as the Bellwether moving window, for improving software effort prediction accuracy. We first sort and cluster a pool of N projects and apply statistical stratification based on Markov chain modeling to select the Bellwether moving window. We evaluate the proposed approach using the baseline Automatically Transformed Linear Model on the ISBSG dataset. Results show that (1) the Bellwether effect exists in software effort estimation datasets, and (2) the Bellwether moving window, with a window size of 82 to 84 projects and a window age of 1.5 to 2 years, resulted in improved prediction accuracy compared with the traditional approach.
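The abstract describes sorting and clustering a pool of projects and applying statistical stratification based on Markov chain modeling. The sketch below is one illustrative reading of that idea, not the paper's procedure: projects are sorted chronologically, stratified by effort quantiles, an empirical transition matrix between the strata of consecutive projects is estimated, and the stratum with the largest stationary probability is returned. The window-size/age search and the ATLM evaluation are not reproduced.

```python
import numpy as np

def bellwether_stratum(efforts_chronological: np.ndarray, n_strata: int = 3) -> int:
    """Return the index of the stratum with the largest stationary probability.

    Illustrative interpretation only: stratify chronologically ordered efforts into
    quantile-based strata, estimate the empirical Markov transition matrix between
    strata of consecutive projects, and find its stationary distribution by power
    iteration.
    """
    # Assign each project to a quantile-based stratum (0 .. n_strata-1).
    edges = np.quantile(efforts_chronological, np.linspace(0, 1, n_strata + 1)[1:-1])
    strata = np.searchsorted(edges, efforts_chronological)

    # Empirical transition counts between strata of consecutive projects.
    trans = np.full((n_strata, n_strata), 1e-9)   # tiny prior avoids all-zero rows
    for a, b in zip(strata[:-1], strata[1:]):
        trans[a, b] += 1.0
    trans /= trans.sum(axis=1, keepdims=True)

    # Stationary distribution via power iteration (pi = pi @ P at convergence).
    pi = np.full(n_strata, 1.0 / n_strata)
    for _ in range(1000):
        pi = pi @ trans
    return int(np.argmax(pi))

# Toy usage with synthetic, chronologically ordered effort values.
rng = np.random.default_rng(0)
efforts = rng.gamma(shape=2.0, scale=500.0, size=120)
print(bellwether_stratum(efforts))
```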


Information & Software Technology | 2018

Duplex output software effort estimation model with self-guided interpretation

Solomon Mensah; Jacky Keung; Michael Franklin Bosu; Kwabena Ebo Bennin

Context: Software effort estimation (SEE) plays a key role in predicting the effort needed to complete software development tasks. However, conclusion instability across learners has affected the implementation of SEE models. This instability can be attributed to the lack of an effort classification benchmark that software researchers and practitioners can use to facilitate and interpret prediction results. Objective: To ameliorate the conclusion instability challenge by introducing a classification and self-guided interpretation scheme for SEE. Method: We first used the density quantile function to discretise the effort recorded in 14 datasets into three classes (high, low and moderate) and built regression models for these datasets. The result of each regression model was an effort estimate, termed output 1, which was then classified into an effort class, termed output 2. We refer to the models generated in this study as duplex output models because they return two outputs. The duplex output models, trained with leave-one-out cross validation and evaluated with MAE, BMMRE and adjusted R², can be used to predict both the software effort and the class of the software effort estimate. Robust statistical tests (Welch's t-test and the Kruskal-Wallis H-test) were used to examine statistically significant differences in the models' prediction performance. Results: We observed the following: (1) the duplex output models not only predicted the effort estimates, they also offered a guide to interpreting the effort expended; (2) incorporating a genetic search algorithm into the duplex output model allowed the sampling of relevant features for improved prediction accuracy; and (3) ElasticNet, a hybrid regression, provided superior prediction accuracy over ATLM, the state-of-the-art baseline regression. Conclusion: The results show that the duplex output model provides a self-guided benchmark for interpreting estimated software effort. ElasticNet can also serve as a baseline model for SEE.
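A minimal sketch of the duplex output idea, using NumPy only: effort values are discretised into three classes with quantile cut-points (a stand-in for the paper's density quantile function), a least-squares regression produces the numeric estimate (output 1), and the class of that estimate is looked up against the cut-points (output 2). The features, cut-points and plain least-squares learner are assumptions for illustration; the paper evaluates several learners, including ElasticNet, against the ATLM baseline.

```python
import numpy as np

def fit_duplex_model(X: np.ndarray, y: np.ndarray):
    """Fit a least-squares regression and derive quantile-based effort class boundaries."""
    cuts = np.quantile(y, [1 / 3, 2 / 3])            # low/moderate/high boundaries (assumed scheme)
    X1 = np.column_stack([np.ones(len(X)), X])       # add intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef, cuts

def predict_duplex(coef: np.ndarray, cuts: np.ndarray, x_new: np.ndarray):
    """Return (output 1: effort estimate, output 2: effort class) for one project."""
    estimate = float(np.dot(np.concatenate(([1.0], x_new)), coef))
    labels = ["low", "moderate", "high"]
    effort_class = labels[int(np.searchsorted(cuts, estimate))]
    return estimate, effort_class

# Toy usage: effort modelled from (size, team size) features on synthetic data.
rng = np.random.default_rng(1)
X = rng.uniform([50, 2], [400, 15], size=(60, 2))
y = 8.0 * X[:, 0] + 30.0 * X[:, 1] + rng.normal(0, 100, 60)
coef, cuts = fit_duplex_model(X, y)
print(predict_duplex(coef, cuts, np.array([120.0, 6.0])))
```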


DIGITAL HEALTH | 2017

Is knee pain information on YouTube videos perceived to be helpful? An analysis of user comments and implications for dissemination on social media

Sarah Meldrum; Bastin Tr Savarimuthu; Sherlock A. Licorish; Amjed Tahir; Michael Franklin Bosu; Prasath Jayakaran

Objective: There is little research that characterises knee pain related information disseminated via social media. However, variation in the content and quality of such sources could compromise optimal patient care. This study explored the nature of the comments on YouTube videos related to non-specific knee pain, to determine their helpfulness to users. Methods: A systematic search identified 900 videos related to knee pain on the YouTube database. A total of 3537 comments from 58 videos were included in the study. A categorisation scheme was developed and 1000 randomly selected comments were analysed according to this scheme. Results: The most common category was users providing personal information or describing a personal situation (19%), followed by appreciation or acknowledgement of others' inputs (17%) and asking questions (15%). Of the questions, 33% were related to seeking help in relation to a specific situation. Over 10% of the comments contained negativity or disagreement, while 4.4% of comments reported an intention to pursue an action based on the information presented in the video and/or from user comments. Conclusion: It was observed that individuals commenting on YouTube videos on knee pain were most often soliciting advice and information specific to their condition. The analysis of comments from the most commented videos using a keyword-based search approach suggests that YouTube videos can be used for disseminating general advice on knee pain.
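The abstract mentions a categorisation scheme and a keyword-based search approach for analysing comments. The sketch below shows one way such a keyword-driven classifier could look: the category names mirror those reported in the abstract, but the keyword lists are assumptions for illustration and do not reproduce the study's coding scheme.

```python
# Minimal keyword-based comment categoriser; keyword lists are illustrative assumptions.
CATEGORY_KEYWORDS = {
    "personal_information": ["my knee", "i have", "i had", "years ago", "my injury"],
    "appreciation": ["thank", "thanks", "helpful", "great video"],
    "question": ["?", "how do", "should i", "can i", "what if"],
    "negativity_or_disagreement": ["wrong", "useless", "disagree", "waste of time"],
    "intended_action": ["i will try", "going to try", "will start doing"],
}

def categorise_comment(comment: str) -> list:
    """Return all categories whose keywords appear in the comment (case-insensitive)."""
    text = comment.lower()
    matches = [cat for cat, keywords in CATEGORY_KEYWORDS.items()
               if any(kw in text for kw in keywords)]
    return matches or ["other"]

comments = [
    "Thanks, this was really helpful for my knee!",
    "Should I do these exercises every day?",
    "I will try the stretches tonight.",
]
for c in comments:
    print(c, "->", categorise_comment(c))
```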


24th Australasian Software Engineering Conference | 2015

On Satisfying the Android OS Community: User Feedback Still Central to Developers' Portfolios

Sherlock A. Licorish; Amjed Tahir; Michael Franklin Bosu; Stephen G. MacDonell


International Workshop on Quantitative Approaches to Software Quality | 2016

Rework effort estimation of self-admitted technical debt

Solomon Mensah; Jacky Keung; Michael Franklin Bosu; Kwabena Ebo Bennin


Archive | 2018

Revisiting the Conclusion Instability Issue in Software Effort Estimation

Michael Franklin Bosu; Solomon Mensah; Kwabena Ebo Bennin; Diab Abuaiadah

Collaboration


Dive into Michael Franklin Bosu's collaborations.

Top Co-Authors

Kwabena Ebo Bennin
City University of Hong Kong

Solomon Mensah
City University of Hong Kong

Jacky Keung
City University of Hong Kong