
Publication


Featured research published by Solomon Mensah.


empirical software engineering and measurement | 2017

The significant effects of data sampling approaches on software defect prioritization and classification

Kwabena Ebo Bennin; Jacky Keung; Akito Monden; Passakorn Phannachitta; Solomon Mensah

Context: Recent studies have shown that the performance of defect prediction models can be affected when data sampling approaches are applied to imbalanced training data for building defect prediction models. However, the magnitude (degree and power) of the effect of these sampling methods on the classification and prioritization performances of defect prediction models is still unknown. Goal: To investigate the statistical and practical significance of using resampled data for constructing defect prediction models. Method: We examine the practical effects of six data sampling methods on the performances of five defect prediction models. The prediction performances of the models trained on default datasets (no sampling method) are compared with those of the models trained on resampled datasets (application of sampling methods). To decide whether the performance changes are significant or not, robust statistical tests are performed and effect sizes computed. Twenty releases of ten open source projects extracted from the PROMISE repository are considered and evaluated using the AUC, pd, pf and G-mean performance measures. Results: There are statistically significant differences and practical effects on classification performance (pd, pf and G-mean) between models trained on resampled datasets and those trained on the default datasets. However, sampling methods have no statistically or practically significant effect on defect prioritization performance (AUC), with small or negligible effect sizes obtained for the models trained on the resampled datasets. Conclusions: Existing sampling methods can properly set the threshold between buggy and clean samples, but they cannot improve the prediction of defect-proneness itself. Sampling methods are highly recommended for defect classification purposes when all faulty modules are to be considered for testing.
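As an illustration of the sampling-and-evaluation pipeline this abstract describes, the sketch below balances a defect dataset by random oversampling and computes the pd, pf and G-mean measures from a confusion matrix. The function names are our own illustrative stand-ins, not the authors' code.

```python
import random

def random_oversample(features, labels, seed=0):
    """Duplicate minority (defective, label 1) instances until the classes
    balance -- the simplest of the sampling approaches the study compares."""
    rng = random.Random(seed)
    minority = [f for f, y in zip(features, labels) if y == 1]
    majority = [f for f, y in zip(features, labels) if y == 0]
    while len(minority) < len(majority):
        minority.append(rng.choice(minority))
    return majority + minority, [0] * len(majority) + [1] * len(minority)

def g_mean(tp, fn, fp, tn):
    """G-mean = sqrt(pd * (1 - pf)), balancing detection against false alarms."""
    pd = tp / (tp + fn)   # probability of detection (recall)
    pf = fp / (fp + tn)   # probability of false alarm
    return (pd * (1 - pf)) ** 0.5
```

A classifier trained on the oversampled data would then be scored with `g_mean` alongside AUC, pd and pf.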


2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) | 2017

Investigating the Significance of Bellwether Effect to Improve Software Effort Estimation

Solomon Mensah; Jacky Keung; Stephen G. MacDonell; Michael Franklin Bosu; Kwabena Ebo Bennin

The Bellwether effect refers to the existence of exemplary projects (called the Bellwether) within a historical dataset that can be used for improved prediction performance. Recent studies have shown an implicit assumption of using recently completed projects (referred to as a moving window) for improved prediction accuracy. In this paper, we investigate the Bellwether effect on software effort estimation accuracy using moving windows. The existence of the Bellwether was empirically proven based on six postulations. We apply statistical stratification and Markov chain methodology to select the Bellwether moving window. The resulting Bellwether moving window is used to predict the software effort of a new project. Empirical results show that the Bellwether effect exists in chronological datasets, with a set of exemplary and recently completed projects representing the Bellwether moving window. Results from this study show that the use of the Bellwether moving window with a Gaussian weighting function significantly improves the prediction accuracy.
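A minimal sketch of the moving-window idea with Gaussian recency weighting: effort for a new project is estimated from the recency-weighted mean productivity of the most recent projects in the window. This is our illustrative simplification under assumed (size, effort) project records, not the authors' implementation.

```python
import math

def gaussian_window_estimate(history, new_size, window=5, sigma=1.0):
    """Estimate effort for a project of `new_size` from the most recent
    `window` projects, weighting each by a Gaussian on its recency.
    `history` is a chronologically ordered list of (size, effort) pairs."""
    recent = history[-window:]
    # Newest project gets weight 1; older projects decay with a Gaussian.
    weights = [math.exp(-((len(recent) - 1 - i) ** 2) / (2 * sigma ** 2))
               for i in range(len(recent))]
    productivity = (sum(w * e / s for w, (s, e) in zip(weights, recent))
                    / sum(weights))
    return productivity * new_size
```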


software engineering and knowledge engineering | 2016

Multi-Objective Optimization for Software Testing Effort Estimation

Solomon Mensah; Jacky Keung; Kwabena Ebo Bennin; Michael Franklin Bosu

Software Testing Effort (STE), which contributes about 25-40% of the total development effort, plays a significant role in software development. In addressing the issues faced by companies in finding relevant datasets for STE estimation modeling prior to development, cross-company modeling could be leveraged. This study aims at assessing the effectiveness of cross-company (CC) and within-company (WC) projects in STE estimation. A robust multi-objective Mixed-Integer Linear Programming (MILP) optimization framework for the selection of CC and WC projects was constructed, and STE was estimated using Deep Neural Networks. Results from our study indicate that the application of the MILP framework yielded similar results for both WC and CC modeling. The modeling framework will serve as a foundation to assist in STE estimation prior to the development of a new software project.
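To give a concrete feel for project selection as an optimization problem, the sketch below picks the subset of candidate (size, effort) projects whose mean productivity best matches a target project. It brute-forces the binary inclusion variables that a MILP solver would handle at scale; this is a hypothetical stand-in for the paper's framework, not its objective function.

```python
from itertools import product

def select_projects(candidates, target_size, target_effort):
    """Brute-force the binary inclusion vector over candidate (size, effort)
    projects, minimizing the absolute error between the subset's mean
    productivity applied to the target size and the target effort."""
    best, best_err = None, float("inf")
    for mask in product([0, 1], repeat=len(candidates)):
        chosen = [p for keep, p in zip(mask, candidates) if keep]
        if not chosen:
            continue
        productivity = sum(e / s for s, e in chosen) / len(chosen)
        err = abs(productivity * target_size - target_effort)
        if err < best_err:
            best, best_err = chosen, err
    return best
```

In practice the same binary variables and objective would go to an actual MILP solver rather than exhaustive search.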


Journal of Systems and Software | 2018

On the value of a prioritization scheme for resolving Self-admitted technical debt

Solomon Mensah; Jacky Keung; Jeffery Svajlenko; Kwabena Ebo Bennin; Qing Mi

Programmers tend to leave incomplete, temporary workarounds and buggy code that require rework in software development, and such a pitfall is referred to as Self-admitted Technical Debt (SATD). Previous studies have shown that SATD negatively affects software projects and incurs high maintenance overheads. In this study, we introduce a prioritization scheme comprising mainly of identification, examination and rework effort estimation of prioritized tasks in order to make a final decision prior to software release. Using the proposed prioritization scheme, we perform an exploratory analysis on four open source projects to investigate how SATD can be minimized. Four prominent causes of SATD are identified, namely code smells (23.2%), complicated and complex tasks (22.0%), inadequate code testing (21.2%) and unexpected code performance (17.4%). Results show that, among all the types of SATD, design debts on average are highly prone to software bugs across the four projects analysed. Our findings show that a rework effort of approximately 10 to 25 commented LOC per SATD source file is needed to address the highly prioritized SATD ("vital few") tasks. The proposed prioritization scheme is a novel technique that will aid in decision making prior to software release in an attempt to minimize high maintenance overheads.
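SATD is admitted in source comments, so the identification step of such a scheme typically scans comments for debt-admitting phrases. The sketch below flags Python-style comments matching a few common keywords; the keyword list is illustrative, not the paper's exact pattern set.

```python
import re

# Debt-admitting phrases commonly used to flag SATD in comments
# (illustrative list, not the paper's exact patterns).
SATD_PATTERN = re.compile(r"\b(todo|fixme|hack|workaround|temporary)\b", re.I)

def find_satd_comments(source_lines):
    """Return (line_number, comment) pairs whose comments admit debt."""
    hits = []
    for n, line in enumerate(source_lines, start=1):
        if "#" in line:
            comment = line.split("#", 1)[1]
            if SATD_PATTERN.search(comment):
                hits.append((n, comment.strip()))
    return hits
```

The flagged comments would then feed the examination and rework-effort estimation steps of the prioritization scheme.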


international conference on software engineering | 2017

A Stratification and Sampling Model for Bellwether Moving Window.

Solomon Mensah; Jacky Keung; Michael Franklin Bosu; Kwabena Ebo Bennin; Patrick Kwaku Kudjo

An effective method for finding the relevant number (window size) and the elapsed time (window age) of recently completed projects has proven elusive in software effort estimation. Although these two parameters significantly affect the prediction accuracy, there is no effective method to stratify and sample chronological projects to improve the prediction performance of software effort estimation models. Exemplary projects (Bellwether) representing the training set have been empirically validated to improve the prediction accuracy in the domain of software defect prediction. However, the concept of the Bellwether and its effect have not been empirically proven in software effort estimation as a method of selecting exemplary/relevant projects with defined window size and age. In view of this, we introduce a novel method for selecting relevant and recently completed projects, referred to as the Bellwether moving window, for improving software effort prediction accuracy. We first sort and cluster a pool of N projects and apply statistical stratification based on Markov chain modeling to select the Bellwether moving window. We evaluate the proposed approach using the baseline Automatically Transformed Linear Model on the ISBSG dataset. Results show that (1) the Bellwether effect exists in software effort estimation datasets, and (2) the Bellwether moving window with a window size of 82 to 84 projects and a window age of 1.5 to 2 years resulted in improved prediction accuracy compared with the traditional approach.


international conference on software engineering | 2018

MAHAKIL: diversity based oversampling approach to alleviate the class imbalance issue in software defect prediction

Kwabena Ebo Bennin; Jacky Keung; Passakorn Phannachitta; Akito Monden; Solomon Mensah

Highly imbalanced data typically make accurate predictions difficult. Unfortunately, software defect datasets tend to have fewer defective modules than non-defective modules. Synthetic oversampling approaches address this concern by creating new minority defective modules to balance the class distribution before a model is trained. Notwithstanding the successes achieved by these approaches, they mostly result in over-generalization (high rates of false alarms) and generate near-duplicated data instances (less diverse data). In this study, we introduce MAHAKIL, a novel and efficient synthetic oversampling approach for software defect datasets that is based on the chromosomal theory of inheritance. Exploiting this theory, MAHAKIL interprets two distinct sub-classes as parents and generates a new instance that inherits different traits from each parent and contributes to the diversity within the data distribution. We extensively compare MAHAKIL with SMOTE, Borderline-SMOTE, ADASYN, Random Oversampling and the no-sampling approach using 20 releases of defect datasets from the PROMISE repository and five prediction models. Our experiments indicate that MAHAKIL improves the prediction performance for all the models and achieves better and more significant pf values than the other oversampling approaches, based on Brunner's statistical significance test and Cliff's effect sizes. Therefore, MAHAKIL is strongly recommended as an efficient alternative for defect prediction models built on highly imbalanced datasets.
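The parent-pairing idea can be sketched as follows: rank the defective instances by their distance from the class centre, split the ranked list into two halves (the "parents"), and synthesize each child as the feature-wise average of a pair drawn from opposite halves. MAHAKIL ranks by Mahalanobis distance; this sketch substitutes Euclidean distance to the mean purely for brevity, so it is an approximation of the idea, not the published algorithm.

```python
import numpy as np

def mahakil_style_oversample(minority, n_new):
    """Generate `n_new` synthetic minority instances by averaging pairs of
    'parents' drawn from the near and far halves of a distance ranking."""
    X = np.asarray(minority, dtype=float)
    # Rank instances by distance from the class centre (Euclidean stand-in
    # for MAHAKIL's Mahalanobis distance).
    order = np.argsort(np.linalg.norm(X - X.mean(axis=0), axis=1))
    half = len(order) // 2
    near, far = X[order[:half]], X[order[half:]]
    # Each child inherits (here: averages) traits from one parent per half.
    children = [(near[i % len(near)] + far[i % len(far)]) / 2.0
                for i in range(n_new)]
    return np.vstack(children)
```

Because each child interpolates between two distinct, distance-separated parents, the synthetic instances spread through the minority region instead of clustering as near-duplicates.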


evaluation and assessment in software engineering | 2018

An Inception Architecture-Based Model for Improving Code Readability Classification

Qing Mi; Jacky Keung; Yan Xiao; Solomon Mensah; Xiupei Mei

The process of classifying a piece of source code into a Readable or Unreadable class is referred to as Code Readability Classification. To build accurate classification models, existing studies focus on handcrafting features from different aspects that intuitively seem to correlate with code readability, and then exploring various machine learning algorithms based on the newly proposed features. In contrast, our work opens up a new way to tackle the problem by using the technique of deep learning. Specifically, we propose IncepCRM, a novel model based on the Inception architecture that can learn multi-scale features automatically from source code with little manual intervention. We apply the information of human annotators as the auxiliary input for training IncepCRM and empirically verify the performance of IncepCRM on three publicly available datasets. The results show that: 1) Annotator information is beneficial for model performance, as confirmed by robust statistical tests (i.e., the Brunner-Munzel test and Cliff's delta); 2) IncepCRM can achieve an improved accuracy against previously reported models across all datasets. The findings of our study confirm the feasibility and effectiveness of deep learning for code readability classification.


Information & Software Technology | 2018

Duplex output software effort estimation model with self-guided interpretation

Solomon Mensah; Jacky Keung; Michael Franklin Bosu; Kwabena Ebo Bennin

Context: Software effort estimation (SEE) plays a key role in predicting the effort needed to complete software development tasks. However, conclusion instability across learners has affected the implementation of SEE models. This instability can be attributed to the lack of an effort classification benchmark that software researchers and practitioners can use to facilitate and interpret prediction results. Objective: To ameliorate the conclusion instability challenge by introducing a classification and self-guided interpretation scheme for SEE. Method: We first used the density quantile function to discretise the effort recorded in 14 datasets into three classes (high, low and moderate) and built regression models for these datasets. The results of the regression models were an effort estimate, termed output 1, which was then classified into an effort class, termed output 2. We refer to the models generated in this study as duplex output models as they return two outputs. The introduced duplex output models, trained with leave-one-out cross validation and evaluated with MAE, BMMRE and adjusted R², can be used to predict both the software effort and the class of the software effort estimate. Robust statistical tests (Welch's t-test and the Kruskal-Wallis H-test) were used to examine statistically significant differences in the models' prediction performances. Results: We observed the following: (1) the duplex output models not only predicted the effort estimates, they also offered a guide to interpreting the effort expended; (2) incorporating the genetic search algorithm into the duplex output model allowed the sampling of relevant features for improved prediction accuracy; and (3) ElasticNet, a hybrid regression, provided superior prediction accuracy over ATLM, the state-of-the-art baseline regression. Conclusion: The results show that the duplex output model provides a self-guided benchmark for interpreting estimated software effort. ElasticNet can also serve as a baseline model for SEE.
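The duplex idea of returning both an estimate and its class can be sketched as follows, with historical effort values discretised at their tertiles into low/moderate/high. Using `statistics.quantiles` is our simplified stand-in for the paper's density quantile discretisation, and the regression model is left abstract.

```python
import statistics

def effort_class(effort, history):
    """Map a predicted effort onto low/moderate/high using the tertile
    boundaries of historical effort values (simplified stand-in for the
    density quantile discretisation in the abstract)."""
    lo, hi = statistics.quantiles(history, n=3)
    if effort <= lo:
        return "low"
    if effort <= hi:
        return "moderate"
    return "high"

def duplex_estimate(model, features, history):
    """Return the two outputs: the numeric estimate and its class label."""
    estimate = model(features)                       # output 1: effort value
    return estimate, effort_class(estimate, history)  # output 2: effort class
```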


Information & Software Technology | 2018

Improving code readability classification using convolutional neural networks

Qing Mi; Jacky Keung; Yan Xiao; Solomon Mensah; Yujin Gao

Context: Code readability classification (which refers to the classification of a piece of source code as either readable or unreadable) has attracted increasing attention in academia and industry. To construct accurate classification models, previous studies depended mainly upon handcrafted features. However, the manual feature engineering process is usually labor-intensive and can capture only partial information about the source code, which is likely to limit model performance. Objective: To improve code readability classification, we propose the use of Convolutional Neural Networks (ConvNets). Method: We first introduce a representation strategy (with different granularities) to transform source code into integer matrices as the input to ConvNets. We then propose DeepCRM, a deep learning-based model for code readability classification. DeepCRM consists of three separate ConvNets with identical architectures that are trained on data preprocessed in different ways. We evaluate our approach against five state-of-the-art code readability models. Results: The experimental results show that DeepCRM can outperform previous approaches, with improvements in accuracy ranging from 2.4% to 17.2%. Conclusions: By eliminating the need for manual feature engineering, DeepCRM provides improved performance, confirming the efficacy of deep learning techniques in the task of code readability classification.
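The representation step can be sketched as a character-granularity encoding: one matrix row per source line, one column per character, with the character code as the integer and zero padding elsewhere. The dimensions and the character-level granularity here are illustrative assumptions; the paper explores several granularities.

```python
def encode_source(code, rows=10, cols=40):
    """Encode source code as a fixed-size integer matrix suitable as
    ConvNet input: row = line, column = character, value = character code,
    truncated or zero-padded to `rows` x `cols`."""
    matrix = [[0] * cols for _ in range(rows)]
    for r, line in enumerate(code.splitlines()[:rows]):
        for c, ch in enumerate(line[:cols]):
            matrix[r][c] = ord(ch)
    return matrix
```

A fixed-size matrix like this lets convolutional filters scan local character patterns (indentation, spacing, identifier shapes) regardless of snippet length.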


IEEE Transactions on Software Engineering | 2018

MAHAKIL: Diversity Based Oversampling Approach to Alleviate the Class Imbalance Issue in Software Defect Prediction

Kwabena Ebo Bennin; Jacky Keung; Passakorn Phannachitta; Akito Monden; Solomon Mensah

Collaboration


Dive into Solomon Mensah's collaborations with his top co-authors:

Jacky Keung (City University of Hong Kong)
Kwabena Ebo Bennin (City University of Hong Kong)
Michael Franklin Bosu (Waikato Institute of Technology)
Qing Mi (City University of Hong Kong)
Akito Monden (Nara Institute of Science and Technology)
Passakorn Phannachitta (Nara Institute of Science and Technology)
Yan Xiao (City University of Hong Kong)
Yujin Gao (Beijing Institute of Technology)
Arif Ali Khan (City University of Hong Kong)