Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Kwabena Ebo Bennin is active.

Publications


Featured research published by Kwabena Ebo Bennin.


IEEE International Conference on Software Quality, Reliability and Security (QRS) | 2016

Empirical Evaluation of Cross-Release Effort-Aware Defect Prediction Models

Kwabena Ebo Bennin; Koji Toda; Yasutaka Kamei; Jacky Keung; Akito Monden; Naoyasu Ubayashi

To prioritize quality assurance efforts, various fault prediction models have been proposed. However, the best performing fault prediction model is unknown due to three major drawbacks: (1) comparison of only a few fault prediction models on a small number of data sets, (2) use of evaluation measures that ignore testing effort, and (3) use of n-fold cross-validation instead of the more practical cross-release validation. To address these concerns, we conducted a cross-release evaluation of 11 fault density prediction models using data sets collected from 2 releases of 25 open source software projects, with an effort-aware performance measure known as Norm(Popt). Our results show that, whilst M5 and K* had the best performances, they were greatly influenced by the percentage of faulty modules present and the size of the data set. Using Norm(Popt) produced an overall average performance of more than 50% across all the selected models, clearly indicating the importance of considering testing effort when building fault-prone prediction models.
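The abstract does not reproduce the formula, but Norm(Popt) is commonly defined over an effort-based cumulative lift chart. Below is a minimal Python sketch of that style of measure, assuming the usual formulation: modules are ranked, the cumulative percentage of bugs found is plotted against the cumulative percentage of effort (e.g., LOC), and the area under the model's curve is normalized between the worst and optimal orderings. The study's exact definition may differ in detail.

```python
import numpy as np

def norm_p_opt(pred_density, actual_bugs, effort):
    """Sketch of an Alberg-style effort-aware measure: rank modules, integrate
    the cumulative %-of-bugs vs. %-of-effort curve, and normalize the area
    between the worst and the optimal orderings."""
    pred_density = np.asarray(pred_density, float)
    actual_bugs = np.asarray(actual_bugs, float)
    effort = np.asarray(effort, float)

    def area(order):
        x = np.concatenate(([0.0], np.cumsum(effort[order]) / effort.sum()))
        y = np.concatenate(([0.0], np.cumsum(actual_bugs[order]) / actual_bugs.sum()))
        return np.trapz(y, x)

    model = area(np.argsort(-pred_density))            # ranking by predicted fault density
    optimal = area(np.argsort(-actual_bugs / effort))  # best achievable ranking
    worst = area(np.argsort(actual_bugs / effort))     # worst achievable ranking
    return (model - worst) / (optimal - worst)
```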


Computer Software and Applications Conference | 2016

A Methodology to Automate the Selection of Design Patterns

Shahid Hussain; Jacky Keung; Arif Ali Khan; Kwabena Ebo Bennin

Background: Over the last two decades, numerous software design patterns have been introduced and cataloged on the basis of developers' interests and skills. Motivation: In the software design phase, inexperienced designers are mostly concerned with how to select an appropriate design pattern from a catalog of relevant patterns in order to solve a design problem. Existing automated design pattern selection methodologies are limited by the need for formal specifications of design patterns or for a sample size large enough to make the learning process effective. Method: To address this concern, we propose a three-step methodology to automate the selection of a design pattern for a design problem. The steps of the methodology are text preprocessing, use of an unsupervised learning technique (Fuzzy c-Means) as a core function to quantitatively determine the resemblance of different objects, and selection of the most appropriate pattern for a design problem. We evaluate our methodology with two samples, the Gang-of-Four (GoF) design pattern collection and a spoiled-pattern collection, and three object-oriented design problems. Moreover, we used the Fuzzy Silhouette test, the Kappa (k) test, cosine similarity and the argmax function to measure the effectiveness of our methodology. Results: In the case of the GoF pattern collection, we validated the reliability of the Fuzzy c-Means (FCM) results using a classification decision tree, and observed promising results compared to other automation techniques. Conclusion: From the comparison results, we observed 11%, 4% and 18% improvements in the performance of the proposed technique compared to the supervised learning techniques Support Vector Machine, Naïve Bayes and C4.5, respectively.
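As a rough illustration of the core step, the sketch below clusters TF-IDF vectors of pattern descriptions together with a design problem using a hand-rolled Fuzzy c-Means, then picks the pattern via argmax over memberships. The pattern descriptions and the design problem are hypothetical placeholders, and the paper's actual preprocessing, cluster count and validation tests (Fuzzy Silhouette, Kappa, cosine similarity) are not reproduced.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def fuzzy_c_means(X, c, m=2.0, iters=100, tol=1e-5, seed=0):
    """Minimal Fuzzy c-Means: returns cluster centers and the (n x c)
    membership matrix U, where U[k, i] is object k's degree in cluster i."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]         # fuzzy-weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1))
        U_new = inv / inv.sum(axis=1, keepdims=True)           # membership update
        if np.abs(U_new - U).max() < tol:
            return centers, U_new
        U = U_new
    return centers, U

# Hypothetical corpus: short pattern descriptions plus one design problem.
pattern_descriptions = [
    "define a family of algorithms and make them interchangeable",  # Strategy
    "provide a surrogate or placeholder to control access",         # Proxy
    "notify dependents automatically when an object changes state", # Observer
]
design_problem = "objects must be updated when another object changes state"

docs = pattern_descriptions + [design_problem]
X = TfidfVectorizer(stop_words="english").fit_transform(docs).toarray()
_, U = fuzzy_c_means(X, c=2)
problem_cluster = U[-1].argmax()                   # cluster the design problem falls into
best = U[:-1, problem_cluster].argmax()            # most-resembling pattern (argmax)
print(pattern_descriptions[best])
```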


Australian Software Engineering Conference | 2015

Performance Evaluation of Ensemble Methods For Software Fault Prediction: An Experiment

Shahid Hussain; Jacky Keung; Arif Ali Khan; Kwabena Ebo Bennin

In object-oriented software development, a plethora of studies have presented applications of machine learning algorithms for fault prediction. Furthermore, it has been empirically validated that an ensemble method can improve classification performance compared to a single classifier. However, due to the inherent differences among machine learning and data mining approaches, the classification performance of ensemble methods varies. In this study, we investigated and evaluated the performance of different ensemble methods against one another and against their base-level classifiers in predicting fault-prone classes. We used three ensemble methods (AdaBoostM1, Vote and StackingC) with five base-level classifiers (NaiveBayes, Logistic, J48, VotedPerceptron and SMO) in the Weka tool. To evaluate the performance of the ensemble methods, we retrieved twelve datasets of open source projects from the PROMISE repository. In this experiment, we used k-fold (k=10) cross-validation and ROC analysis for validation, and recall, precision, accuracy and F-measure to evaluate the performance of the ensemble methods and base-level classifiers. We observed significant performance improvements when applying ensemble methods compared to their base-level classifiers, and among the ensemble methods StackingC outperformed the others for software fault prediction.
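For readers outside the Weka ecosystem, a minimal scikit-learn analogue of this setup might look like the sketch below, with a synthetic dataset standing in for a PROMISE project and rough stand-ins for J48 and SMO; this is not the paper's exact configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a PROMISE defect dataset (metric features, buggy/clean label).
X, y = make_classification(n_samples=500, weights=[0.8], random_state=0)

base = [("nb", GaussianNB()),                       # ~ Weka NaiveBayes
        ("lr", LogisticRegression(max_iter=1000)),  # ~ Weka Logistic
        ("dt", DecisionTreeClassifier()),           # rough stand-in for J48
        ("svm", SVC(probability=True))]             # rough stand-in for SMO

ensembles = {
    "AdaBoostM1": AdaBoostClassifier(),  # boosting over decision stumps by default
    "Vote": VotingClassifier(base, voting="soft"),
    "StackingC": StackingClassifier(base, final_estimator=LogisticRegression(max_iter=1000)),
}

for name, clf in ensembles.items():
    auc = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")  # 10-fold CV, ROC analysis
    print(f"{name}: mean AUC = {auc.mean():.3f}")
```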


Empirical Software Engineering and Measurement | 2017

The significant effects of data sampling approaches on software defect prioritization and classification

Kwabena Ebo Bennin; Jacky Keung; Akito Monden; Passakorn Phannachitta; Solomon Mensah

Context: Recent studies have shown that the performance of defect prediction models can be affected when data sampling approaches are applied to imbalanced training data. However, the magnitude (degree and power) of the effect of these sampling methods on the classification and prioritization performance of defect prediction models is still unknown. Goal: To investigate the statistical and practical significance of using resampled data for constructing defect prediction models. Method: We examine the practical effects of six data sampling methods on the performance of five defect prediction models. The prediction performance of models trained on default datasets (no sampling method) is compared with that of models trained on resampled datasets (application of sampling methods). To decide whether the performance changes are significant, robust statistical tests are performed and effect sizes computed. Twenty releases of ten open source projects extracted from the PROMISE repository are considered and evaluated using the AUC, pd, pf and G-mean performance measures. Results: There are statistically significant differences and practical effects on classification performance (pd, pf and G-mean) between models trained on resampled datasets and those trained on default datasets. However, sampling methods have no statistically or practically significant effect on defect prioritization performance (AUC), with small or no effect sizes obtained from the models trained on the resampled datasets. Conclusions: Existing sampling methods can properly set the threshold between buggy and clean samples, but they cannot improve the prediction of defect-proneness itself. Sampling methods are highly recommended for defect classification purposes when all faulty modules are to be considered for testing.
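A minimal sketch of the study's core comparison, using imbalanced-learn's SMOTE as one representative sampling method and a synthetic dataset in place of a PROMISE release; pd, pf and G-mean are computed from the confusion matrix following the paper's terminology.

```python
from imblearn.metrics import geometric_mean_score
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced stand-in for a defect dataset (~10% "buggy" modules).
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

variants = {"default": (X_tr, y_tr),
            "SMOTE": SMOTE(random_state=0).fit_resample(X_tr, y_tr)}

for label, (Xs, ys) in variants.items():
    clf = RandomForestClassifier(random_state=0).fit(Xs, ys)
    pred, prob = clf.predict(X_te), clf.predict_proba(X_te)[:, 1]
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    pd_, pf = tp / (tp + fn), fp / (fp + tn)  # pd = recall, pf = false-alarm rate
    print(f"{label}: AUC={roc_auc_score(y_te, prob):.3f} pd={pd_:.3f} "
          f"pf={pf:.3f} G-mean={geometric_mean_score(y_te, pred):.3f}")
```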


IEEE International Conference on Software Quality, Reliability and Security Companion | 2016

Detection of Fault-Prone Classes Using Logistic Regression Based Object-Oriented Metrics Thresholds

Shahid Hussain; Jacky Keung; Arif Ali Khan; Kwabena Ebo Bennin

Background: In a plethora of studies, object-oriented metrics have been empirically validated to assess design properties and quantify high-level quality attributes such as fault-proneness, at either the method or the class granularity level of software. Motivation: A precise value of an object-oriented metric can serve as an indicator for developers to make informed decisions regarding the detection of design flaws and the classification of fault-prone classes. Method: Bender used an approach in the domain of epidemiology to derive threshold values for risk factors. In our study, we follow Bender's approach and propose a model to derive thresholds for a set of software design metrics via non-linear functions described through logistic regression coefficients. Subsequently, we perform four types of analysis and three experiments to evaluate and compare the effectiveness of the derived thresholds for the classification of fault-prone classes. We use the precision, recall, F-measure and classification accuracy performance measures to assess the effectiveness of the derived metric thresholds. Results: We compare the derived threshold values of the DIT, CA, LCOM and NPM metrics with their existing data-distribution-based threshold values, and observed a significant increase in the classification accuracy of fault-prone classes, for example DIT (27%), Ca (2%), NPM (2%) and LCOM (15%) for the Ant-1.5 project. Conclusion: The analysis results suggest that the proposed model can be applied to derive thresholds for other object-oriented metrics, whether or not they exhibit a heavy-tailed distribution; however, the derived thresholds cannot be generalized across all systems due to variation in data characteristics.
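Bender's approach is usually operationalized as the Value of an Acceptable Risk Level (VARL): fit a univariate logistic regression logit(p) = alpha + beta*x on one metric and solve for the metric value at which the predicted fault probability equals a chosen acceptable risk p0. A minimal sketch, assuming this standard formulation rather than the paper's exact model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def varl_threshold(metric_values, faulty, p0=0.05):
    """Value of an Acceptable Risk Level (VARL), following Bender's approach:
    fit logit(p) = alpha + beta * x on a single metric, then solve for the
    metric value x at which the predicted fault probability equals p0."""
    X = np.asarray(metric_values, float).reshape(-1, 1)
    model = LogisticRegression().fit(X, faulty)
    alpha, beta = model.intercept_[0], model.coef_[0][0]
    return (np.log(p0 / (1 - p0)) - alpha) / beta

# Hypothetical usage: threshold = varl_threshold(dit_values, is_faulty, p0=0.05);
# classes whose DIT exceeds the threshold would be flagged as fault-prone.
```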


Computer Software and Applications Conference | 2016

Investigating the Effects of Balanced Training and Testing Datasets on Effort-Aware Fault Prediction Models

Kwabena Ebo Bennin; Jacky Keung; Akito Monden; Yasutaka Kamei; Naoyasu Ubayashi

To prioritize software quality assurance efforts, fault prediction models have been proposed to distinguish faulty modules from clean modules. The performance of such models is often biased due to the skewness or class imbalance of the datasets considered. To improve the prediction performance of these models, sampling techniques have been employed to rebalance the distribution of fault-prone and non-fault-prone modules. The effects of these techniques have been evaluated in terms of accuracy, geometric mean and F1-measure in previous studies; however, these measures do not consider the effort needed to fix faults. To empirically investigate the effect of sampling techniques on the performance of software fault prediction models in a more realistic setting, this study employs Norm(Popt), an effort-aware measure that considers the testing effort. We performed two sets of experiments aimed at (1) assessing the effects of sampling techniques on effort-aware models and finding the appropriate class distribution for training datasets, and (2) investigating the role of balanced training and testing datasets on the performance of predictive models. Of the four sampling techniques applied, the over-sampling techniques outperformed the under-sampling techniques, with Random Over-sampling performing best with respect to the Norm(Popt) evaluation measure. Also, the performance of all the prediction models improved when sampling techniques were applied at rates of 20-30% on the training datasets, implying that a strictly balanced dataset (50% faulty modules and 50% clean modules) does not result in the best performance for effort-aware models. Our results also indicate that the performance of effort-aware models depends significantly on the proportions of the two classes in the testing dataset. Models trained on moderately balanced datasets are more likely to withstand fluctuations in performance as the class distribution in the testing data varies.
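A hedged sketch of the practical takeaway: rather than rebalancing to 50/50, resample the training data to a target faulty-module ratio inside the 20-30% range. The snippet uses imbalanced-learn's RandomOverSampler (the best-performing technique above) on a synthetic stand-in dataset; the 25% target is an illustrative choice, not the paper's prescription.

```python
from collections import Counter
from imblearn.over_sampling import RandomOverSampler
from sklearn.datasets import make_classification

# Synthetic stand-in training set with ~10% faulty modules.
X_train, y_train = make_classification(n_samples=500, weights=[0.9], random_state=0)

# Oversample so faulty modules make up ~25% of the training data
# (i.e., a 1:3 minority-to-majority ratio) instead of a strict 50/50 split.
ros = RandomOverSampler(sampling_strategy=0.25 / 0.75, random_state=0)
X_res, y_res = ros.fit_resample(X_train, y_train)
print(Counter(y_res))  # roughly a 1:3 faulty-to-clean ratio
```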


IEEE International Conference on Software Quality, Reliability and Security (QRS) | 2017

Investigating the Significance of Bellwether Effect to Improve Software Effort Estimation

Solomon Mensah; Jacky Keung; Stephen G. MacDonell; Michael Franklin Bosu; Kwabena Ebo Bennin

The Bellwether effect refers to the existence of exemplary projects (called the Bellwether) within a historical dataset that can be used to improve prediction performance. Recent studies have implicitly assumed that using recently completed projects (referred to as a moving window) improves prediction accuracy. In this paper, we investigate the Bellwether effect on software effort estimation accuracy using moving windows. The existence of the Bellwether was empirically proven based on six postulations. We apply statistical stratification and a Markov chain methodology to select the Bellwether moving window, which is then used to predict the software effort of a new project. Empirical results show that the Bellwether effect exists in chronological datasets, with a set of exemplary and recently completed projects representing the Bellwether moving window. Results from this study show that the use of the Bellwether moving window with a Gaussian weighting function significantly improves prediction accuracy.
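The full stratification and Markov chain selection procedure is beyond a short example, but the Gaussian weighting ingredient can be illustrated: weight the projects in the selected moving window by recency with a Gaussian kernel and combine their efforts. This is a loose sketch under assumed inputs, not the paper's estimator.

```python
import numpy as np

def gaussian_window_estimate(ages, efforts, sigma=1.0):
    """Weight projects in a (Bellwether) moving window by recency with a
    Gaussian kernel and return the weighted mean effort as the prediction.
    `ages` are project ages relative to the new project (0 = most recent)."""
    w = np.exp(-0.5 * (np.asarray(ages, float) / sigma) ** 2)
    return np.average(efforts, weights=w)

# Hypothetical usage: three window projects, most recent weighted highest.
print(gaussian_window_estimate(ages=[0, 1, 2], efforts=[120.0, 150.0, 400.0]))
```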


Software Engineering and Knowledge Engineering | 2016

Multi-Objective Optimization for Software Testing Effort Estimation

Solomon Mensah; Jacky Keung; Kwabena Ebo Bennin; Michael Franklin Bosu

Software Testing Effort (STE), which contributes about 25-40% of the total development effort, plays a significant role in software development. To address the difficulty companies face in finding relevant datasets for STE estimation modeling prior to development, cross-company modeling can be leveraged. This study assesses the effectiveness of cross-company (CC) and within-company (WC) projects in STE estimation. A robust multi-objective Mixed-Integer Linear Programming (MILP) optimization framework for the selection of CC and WC projects was constructed, and STE was estimated using Deep Neural Networks. Results indicate that the application of the MILP framework yields similar results for both WC and CC modeling. The modeling framework will serve as a foundation to assist in STE estimation prior to the development of a new software project.
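The paper's multi-objective MILP formulation is not reproduced here; as a single-objective simplification, a binary project-selection MILP in PuLP might look like the sketch below, with a hypothetical objective (minimizing feature-space distance to the new project's context) and a fixed selection budget. A real formulation would add further objectives and constraints.

```python
import numpy as np
import pulp

# Hypothetical setup: distance of each candidate CC/WC project's feature
# vector to the new project's context; select exactly k relevant projects.
rng = np.random.default_rng(0)
dist = rng.random(30)
k = 10
x = [pulp.LpVariable(f"x{i}", cat=pulp.LpBinary) for i in range(len(dist))]

prob = pulp.LpProblem("project_selection", pulp.LpMinimize)
prob += pulp.lpSum(d * xi for d, xi in zip(dist, x))  # total dissimilarity
prob += pulp.lpSum(x) == k                            # selection budget
prob.solve(pulp.PULP_CBC_CMD(msg=False))

selected = [i for i, xi in enumerate(x) if xi.value() == 1]
print(selected)  # indices of projects to feed the downstream estimator
```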


International Conference on Evaluation of Novel Approaches to Software Engineering | 2015

Effects of Geographical, Socio-cultural and Temporal Distances on Communication in Global Software Development during Requirements Change Management: A Pilot Study

Arif Ali Khan; Jacky Keung; Shahid Hussain; Kwabena Ebo Bennin

The trend of software development is changing rapidly: most software development organizations are trying to globalize their activities throughout the world, a phenomenon called Global Software Development (GSD). The main reason behind software globalization is its various benefits; alongside these benefits, however, software organizations face various challenges. One such challenge is communication, which becomes more complicated during the Requirements Change Management (RCM) process due to three factors: geographical, socio-cultural and temporal distances. This paper presents a framework that shows the effect of these factors on communication during the RCM process in GSD. Communication is the core function of collaboration, allowing information to be exchanged between team members. A pilot study was conducted in three GSD organizations, using a quantitative research method to collect data. The findings from the survey data show that these three factors have a strong negative impact on the communication process in GSD.


Journal of Systems and Software | 2018

On the value of a prioritization scheme for resolving Self-admitted technical debt

Solomon Mensah; Jacky Keung; Jeffery Svajlenko; Kwabena Ebo Bennin; Qing Mi

Programmers tend to leave incomplete, temporary workarounds and buggy code that requires rework in software development; such pitfalls are referred to as Self-admitted Technical Debt (SATD). Previous studies have shown that SATD negatively affects software projects and incurs high maintenance overheads. In this study, we introduce a prioritization scheme comprising identification, examination and rework-effort estimation of prioritized tasks in order to make a final decision prior to software release. Using the proposed prioritization scheme, we perform an exploratory analysis on four open source projects to investigate how SATD can be minimized. Four prominent causes of SATD are identified, namely code smells (23.2%), complicated and complex tasks (22.0%), inadequate code testing (21.2%) and unexpected code performance (17.4%). Results show that, among all the types of SATD, design debts are on average the most prone to software bugs across the four projects analysed. Our findings show that a rework effort of approximately 10 to 25 commented LOC per SATD source file is needed to address the highly prioritized SATD ("vital few") tasks. The proposed prioritization scheme is a novel technique that will aid decision making prior to software release in an attempt to minimize high maintenance overheads.
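The identification step is typically keyword-based matching over source comments. A minimal sketch with hypothetical patterns (the paper's exact indicators and its rework-effort estimation are not reproduced):

```python
import re

# Hypothetical keyword patterns commonly used to flag self-admitted
# technical debt in source comments; the study's exact patterns may differ.
SATD_PATTERN = re.compile(r"\b(todo|fixme|hack|workaround|temporary|xxx)\b", re.I)

def satd_comments(source_lines):
    """Return (line_number, comment) pairs whose comments admit technical debt."""
    hits = []
    for n, line in enumerate(source_lines, start=1):
        if "#" in line or "//" in line:
            sep = "#" if "#" in line else "//"
            comment = line.split(sep, 1)[1]
            if SATD_PATTERN.search(comment):
                hits.append((n, comment.strip()))
    return hits

# Hypothetical usage on one file's lines:
print(satd_comments(["x = 1  # TODO: temporary hack until the API is fixed"]))
```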

Collaboration


Dive into Kwabena Ebo Bennin's collaboration.

Top Co-Authors

Jacky Keung | City University of Hong Kong
Solomon Mensah | City University of Hong Kong
Akito Monden | Nara Institute of Science and Technology
Arif Ali Khan | City University of Hong Kong
Shahid Hussain | City University of Hong Kong
Michael Franklin Bosu | Waikato Institute of Technology
Qing Mi | City University of Hong Kong
Passakorn Phannachitta | Nara Institute of Science and Technology
Yan Xiao | City University of Hong Kong