Cagatay Catal | Researchain

Publication

Featured research published by Cagatay Catal.

Expert Systems With Applications | 2009

A systematic review of software fault prediction studies

Cagatay Catal; Banu Diri

This paper provides a systematic review of previous software fault prediction studies with a specific focus on metrics, methods, and datasets. The review covers 74 software fault prediction papers in 11 journals and several conference proceedings. According to the review results, the usage percentage of public datasets has increased significantly and the usage percentage of machine learning algorithms has increased slightly since 2005. In addition, method-level metrics are still the most dominant metrics in the fault prediction research area, and machine learning algorithms are still the most popular methods for fault prediction. Researchers working in the software fault prediction area should continue to use public datasets and machine learning algorithms to build better fault predictors. The usage percentage of class-level metrics is below acceptable levels, and these metrics should be used much more than they are now in order to predict faults earlier, in the design phase of the software life cycle.

Applied Soft Computing | 2015

On the use of ensemble of classifiers for accelerometer-based activity recognition

Cagatay Catal; Selin Tufekci; Elif Pirmit; Guner Kocabag

Proposed activity recognition approach. We propose and validate a novel activity recognition model. We examine the power of the ensemble of classifiers approach experimentally. The model uses J48, Logistic Regression, and MLP. The proposed recognition model is superior to the MLP-based recognition model suggested in a previous study. We suggest that researchers focus on the ensemble of classifiers approach for activity recognition.

Activity recognition aims to detect physical activities such as walking, sitting, and jogging performed by humans. With the widespread adoption and usage of mobile devices in daily life, several advanced applications of activity recognition have been implemented and distributed all over the world. In this study, we explored the power of the ensemble of classifiers approach for accelerometer-based activity recognition and built a novel activity prediction model based on machine learning classifiers. Our approach uses the J48 decision tree, Multi-Layer Perceptron (MLP), and Logistic Regression techniques and combines these classifiers with the average of probabilities combination rule. A publicly available activity recognition dataset known as WISDM (Wireless Sensor Data Mining), which includes information from thirty-six users, was used during the experiments. According to the experimental results, our model provides better performance than the MLP-based recognition approach suggested in a previous study. These results strongly suggest that researchers apply the ensemble of classifiers approach to the activity recognition problem.
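The average of probabilities combination rule mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the per-class probabilities below are hypothetical, and the three dictionaries merely stand in for the outputs of a J48 tree, an MLP, and a Logistic Regression model on one accelerometer sample.

```python
from typing import Dict, List

ProbabilityVector = Dict[str, float]


def average_of_probabilities(outputs: List[ProbabilityVector]) -> ProbabilityVector:
    """Average the per-class probabilities reported by several classifiers."""
    if not outputs:
        raise ValueError("at least one classifier output is required")
    labels = outputs[0].keys()
    return {
        label: sum(output[label] for output in outputs) / len(outputs)
        for label in labels
    }


def predict(outputs: List[ProbabilityVector]) -> str:
    """Return the class with the highest averaged probability."""
    averaged = average_of_probabilities(outputs)
    return max(averaged, key=averaged.get)


# Hypothetical classifier outputs for a single sample (invented for illustration).
j48 = {"walking": 1.0, "sitting": 0.0, "jogging": 0.0}
mlp = {"walking": 0.2, "sitting": 0.1, "jogging": 0.7}
logistic = {"walking": 0.3, "sitting": 0.1, "jogging": 0.6}

print(predict([j48, mlp, logistic]))
```

With these invented numbers the averaged rule selects "walking", even though two of the three classifiers individually prefer "jogging". This shows how averaging probabilities can differ from simple majority voting: one very confident classifier can outweigh two moderately confident ones.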

Product-Focused Software Process Improvement | 2008

A Fault Prediction Model with Limited Fault Data to Improve Test Process

Cagatay Catal; Banu Diri

Software fault prediction models are used to identify fault-prone software modules and produce reliable software. The performance of a software fault prediction model is correlated with the available software metrics and fault data. On some occasions, fault data may be available for only a few software modules, and prediction models that use only labeled data therefore cannot provide accurate results. Semi-supervised learning approaches, which benefit from both unlabeled and labeled data, may be applied in this case. In this paper, we propose an artificial immune system based semi-supervised learning approach. The proposed approach uses a recent semi-supervised algorithm called YATSI (Yet Another Two Stage Idea), and AIRS (Artificial Immune Recognition Systems) is applied in the first stage of YATSI. In addition, AIRS, the RF (Random Forests) classifier, AIRS-based YATSI, and RF-based YATSI are benchmarked. Experimental results showed that while the YATSI algorithm improved the performance of AIRS, it diminished the performance of RF for unbalanced datasets. Furthermore, the performance of AIRS-based YATSI is comparable with that of RF, which is the best machine learning classifier according to some studies.

Applied Soft Computing | 2017

A sentiment classification model based on multiple classifiers

Cagatay Catal; Mehmet Nangir

With the widespread usage of social networks, forums, and blogs, customer reviews have emerged as a critical factor in customers’ purchase decisions. Since the beginning of the 2000s, researchers have focused on these reviews to automatically categorize them into polarity levels such as positive, negative, and neutral. This research problem is known as sentiment classification. The objective of this study is to investigate the potential benefit of the multiple classifier systems concept for the Turkish sentiment classification problem and to propose a novel classification technique. The Vote algorithm has been used in conjunction with three classifiers, namely Naive Bayes, Support Vector Machine (SVM), and Bagging. The parameters of the SVM were optimized when it was used as an individual classifier. Experimental results showed that multiple classifier systems increase the performance of individual classifiers on Turkish sentiment classification datasets and that meta classifiers contribute to the power of these multiple classifier systems. The proposed approach achieved better performance than Naive Bayes, which was reported to be the best individual classifier for these datasets, and Support Vector Machines. Multiple classifier systems (MCS) are a good approach for sentiment classification, and parameter optimization of individual classifiers must be taken into account while developing MCS-based prediction systems.

Archive | 2010

Metrics-Driven Software Quality Prediction Without Prior Fault Data

Cagatay Catal; Ugur Sevim; Banu Diri

Software quality assessment models are quantitative analytical models that are more reliable compared to qualitative models based on personal judgment. These assessment models are classified into two groups: generalized and product-specific models. Measurement-driven predictive models, a subgroup of product-specific models, assume that there is a predictive relationship between software measurements and quality. In recent years, greater attention in quality assessment models has been devoted to measurement-driven predictive models and the field of software fault prediction modeling has become established within the product-specific model category. Most of the software fault prediction studies focused on developing fault predictors by using previous fault data. However, there are cases when previous fault data are not available. In this study, we propose a novel software fault prediction approach that can be used in the absence of fault data. This fully automated technique does not require an expert during the prediction process and it does not require identifying the number of clusters before the clustering phase, as required by the K-means clustering method. Software metrics thresholds are used to remove the need for an expert. Our technique first applies the X-means clustering method to cluster modules and identifies the best cluster number. After this step, the mean vector of each cluster is checked against the metrics thresholds vector. A cluster is predicted as fault-prone if at least one metric of the mean vector is higher than the threshold value of that metric. Three datasets, collected from a Turkish white-goods manufacturer developing embedded controller software, have been used during experimental studies. Experiments revealed that unsupervised software fault prediction can be automated fully and effective results can be achieved by using the X-means clustering method and software metrics thresholds.
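The cluster labelling step described in this abstract (compare each cluster's mean vector against a metrics thresholds vector, and flag the cluster if at least one metric exceeds its threshold) can be sketched as follows. This is only an illustration under stated assumptions: the clusters are taken as already produced by a clustering step such as X-means, which is not implemented here, and the metric choice (lines of code, cyclomatic complexity), the threshold values, and all function names are hypothetical.

```python
from typing import Dict, List, Sequence


def mean_vector(modules: Sequence[Sequence[float]]) -> List[float]:
    """Compute the per-metric mean over all modules in one cluster."""
    count = len(modules)
    return [sum(column) / count for column in zip(*modules)]


def is_fault_prone(cluster_mean: Sequence[float], thresholds: Sequence[float]) -> bool:
    """A cluster is fault-prone if at least one mean metric exceeds its threshold."""
    return any(value > limit for value, limit in zip(cluster_mean, thresholds))


def label_clusters(
    clusters: Dict[int, List[List[float]]], thresholds: Sequence[float]
) -> Dict[int, bool]:
    """Map each cluster id to True (fault-prone) or False (not fault-prone)."""
    return {
        cluster_id: is_fault_prone(mean_vector(modules), thresholds)
        for cluster_id, modules in clusters.items()
    }


# Hypothetical data: each module is [lines_of_code, cyclomatic_complexity].
clusters = {
    0: [[20, 3], [30, 5]],    # mean vector: [25.0, 4.0]
    1: [[120, 8], [80, 14]],  # mean vector: [100.0, 11.0]
}
thresholds = [65, 10]  # assumed threshold values, not taken from the paper

labels = label_clusters(clusters, thresholds)
print(labels)
```

Every module inherits the label of its cluster, so no expert has to inspect individual clusters; the thresholds vector takes that role.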

International Conference on Information and Software Technologies | 2012

The Ten Best Practices for Test Case Prioritization

Cagatay Catal

In this study, we investigated test case prioritization approaches that are used to execute regression testing in a cost-effective manner. We discuss the critical issues and best practices that a software company should focus on before and after implementing test case prioritization techniques. Due to the increasing complexity of today’s software-intensive systems, the number of test cases in a software development project increases to support an effective validation and verification process, while the time allocated to execute the regression tests decreases because of marketing pressures. For this reason, it is crucial to plan and set up test case prioritization infrastructures properly in software companies to improve the software testing process. Ten best practices for successful test case prioritization are introduced and explained in this study.

Proceedings of the 2nd International Workshop on Evidential Assessment of Software Technologies | 2012

On the application of genetic algorithms for test case prioritization: a systematic literature review

Cagatay Catal

We conducted a Systematic Literature Review (SLR) to investigate the effectiveness of genetic algorithms for test case prioritization. The search string retrieved 120 test case prioritization papers, but after reading them in full, we found that a genetic algorithm was used in only seven primary studies. One of these papers does not provide any experimental data. Based on the results of the remaining six papers, we conclude that there is evidence that genetic algorithm-based techniques are effective for test case prioritization and that the field is still open for further research.

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery | 2012

Software mining and fault prediction

Cagatay Catal

Mining software repositories (MSRs) such as source control repositories, bug repositories, deployment logs, and code repositories provides useful patterns for practitioners. Instead of using these repositories only for record-keeping, we need to transform them into active repositories that can guide the decision processes inside the company. By mining these repositories with several data mining algorithms, effective software fault prediction models can be built and error-prone modules can be detected prior to the testing phase. We discuss numerous real-world challenges in building accurate fault prediction models and present some solutions to these challenges.

ACM Sigsoft Software Engineering Notes | 2013

Teaching evidence-based software engineering to master students: a single lecture within a course or an entire semester-long course?

Cagatay Catal

In this paper, we summarize our perspective on teaching evidence-based software engineering (EBSE) to master's students. This semester, we taught the subject as a single lecture within a master's course called Software Architecture instead of as an entire semester-long course called EBSE. Each student delivered a systematic mapping study report related to software architecture at the end of the semester, and these project reports showed that this teaching approach is quite useful for master's students even though the teaching activity is very short.

Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2017

Development of a Software Vulnerability Prediction Web Service Based on Artificial Neural Networks

Cagatay Catal; Akhan Akbulut; Ecem Ekenoglu; Meltem Alemdaroglu

Detecting vulnerable components of a web application is an important activity for allocating verification resources effectively. Studies so far have proposed several vulnerability prediction models based on private and public datasets. In this study, we aimed to design and implement a software vulnerability prediction web service hosted on the Azure cloud computing platform. We investigated several machine learning techniques available in the Azure Machine Learning Studio environment and observed that the best overall performance on three datasets is achieved when the Multi-Layer Perceptron method is applied. Software metrics values are received from a web form and sent to the vulnerability prediction web service. The prediction result is then computed and shown on the web form to notify the testing expert. Training models were built on datasets that include vulnerability data from the Drupal, Moodle, and PHPMyAdmin projects. Experimental results showed that Artificial Neural Networks are a good alternative for building a vulnerability prediction model and that building a web service for vulnerability prediction is a good approach for complex systems.
