Sotiris B. Kotsiantis
University of Patras
Publication
Featured research published by Sotiris B. Kotsiantis.
Artificial Intelligence Review | 2006
Sotiris B. Kotsiantis; Ioannis D. Zaharakis; Panayiotis E. Pintelas
Supervised classification is one of the tasks most frequently carried out by so-called Intelligent Systems. Thus, a large number of techniques have been developed based on Artificial Intelligence (logic-based techniques, perceptron-based techniques) and Statistics (Bayesian networks, instance-based techniques). The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various classification algorithms and the recent attempts at improving classification accuracy: ensembles of classifiers.
Artificial Intelligence Review | 2013
Sotiris B. Kotsiantis
Decision tree techniques have been widely used to build classification models, as such models closely resemble human reasoning and are easy to understand. This paper describes basic decision tree issues and current research points. Of course, a single article cannot be a complete review of all decision tree algorithms (also known as classification tree induction algorithms); yet we hope that the references cited will cover the major theoretical issues, guiding the researcher towards interesting research directions and suggesting possible bias combinations that have yet to be explored.
international conference on knowledge-based and intelligent information and engineering systems | 2003
Sotiris B. Kotsiantis; Christos Pierrakeas; Panayiotis E. Pintelas
Student dropout occurs quite often in universities providing distance education. The scope of this research is to study whether machine learning techniques can be useful in dealing with this problem. Subsequently, an attempt was made to identify the most appropriate learning algorithm for the prediction of students’ dropout. A number of experiments have taken place with data provided by the ‘informatics’ course of the Hellenic Open University, and a quite interesting conclusion is that the Naive Bayes algorithm can be successfully used. A prototype web-based support tool, which can automatically recognize students with a high probability of dropout, has been constructed by implementing this algorithm.
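The core of the approach described above is a Naive Bayes classifier trained on student attributes. The following is a minimal sketch of a categorical Naive Bayes learner in plain Python; the features and data are invented for illustration and are not the Hellenic Open University dataset used in the paper.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Train a categorical Naive Bayes model: class counts plus
    per-(feature, class) value counts."""
    priors = Counter(labels)
    cond = defaultdict(Counter)            # keyed by (feature index, class)
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            cond[(i, y)][v] += 1
    return priors, cond

def predict(model, row):
    """Return the class with the highest posterior under the
    independence assumption, with add-one (Laplace) smoothing."""
    priors, cond = model
    total = sum(priors.values())
    best, best_p = None, -1.0
    for y, c in priors.items():
        p = c / total
        for i, v in enumerate(row):
            counts = cond[(i, y)]
            # +2 in the denominator assumes two possible values per feature
            p *= (counts[v] + 1) / (sum(counts.values()) + 2)
        if p > best_p:
            best, best_p = y, p
    return best

# Hypothetical features: (submitted first assignment?, session attendance)
rows = [("yes", "high"), ("yes", "high"), ("no", "low"),
        ("no", "low"), ("yes", "low"), ("no", "high")]
labels = ["stay", "stay", "drop", "drop", "stay", "drop"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("no", "low")))   # a non-submitting, low-attendance student
```

In a support tool such as the one described, the predicted label (or the posterior probability itself) would flag students at risk of dropping out.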
Artificial Intelligence Review | 2012
Sotiris B. Kotsiantis
The use of machine learning techniques for educational purposes (or educational data mining) is an emerging field aimed at developing methods of exploring data from computational educational settings and discovering meaningful patterns. The stored data (virtual courses, e-learning log files, demographic and academic data of students, admissions/registration info, and so on) can be useful for machine learning algorithms. In this article, we cite the most recent articles that use machine learning techniques for educational purposes and we present a case study for predicting students’ marks. Students’ key demographic characteristics and their marks in a small number of written assignments can constitute the training set for a regression method in order to predict the student’s performance. Finally, a prototype version of a software support tool for tutors has been constructed.
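The case study above trains a regression method on early assignment marks to predict final performance. A minimal sketch, assuming a single predictor (the average mark on the first assignments) and ordinary least squares; the numbers are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical training set: average mark on early assignments -> final mark
assign = [5.0, 6.5, 7.0, 8.5, 9.0]
final  = [4.8, 6.0, 7.2, 8.1, 9.4]
slope, intercept = fit_line(assign, final)

# Predicted final mark for a student averaging 7.5 on the assignments
print(round(slope * 7.5 + intercept, 1))
```

A real tool would of course add the demographic features mentioned in the abstract as extra predictors (multiple regression) rather than a single averaged mark.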
Artificial Intelligence Review | 2011
Sotiris B. Kotsiantis
Bagging, boosting, rotation forest and random subspace methods are well-known re-sampling ensemble methods that generate and combine a diversity of learners using the same learning algorithm for the base classifiers. Boosting and rotation forest algorithms are considered stronger than bagging and random subspace methods on noise-free data. However, there are strong empirical indications that bagging and random subspace methods are much more robust than boosting and rotation forest in noisy settings. For this reason, in this work we built an ensemble of bagging, boosting, rotation forest and random subspace ensembles, with six sub-classifiers in each, and used a voting methodology for the final prediction. We performed a comparison with simple bagging, boosting, rotation forest and random subspace ensembles with 25 sub-classifiers, as well as with other well-known combining methods, on standard benchmark datasets, and the proposed technique had better accuracy in most cases.
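The final combination step described above is a plurality vote over the predictions of the four sub-ensembles. A minimal sketch of that voting step; the class labels are invented for illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine class predictions from several ensembles by plurality vote;
    ties are broken in favour of the earliest-seen label."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outputs of the bagging, boosting, rotation forest and
# random subspace sub-ensembles for a single instance
votes = ["spam", "ham", "spam", "spam"]
print(majority_vote(votes))   # -> "spam"
```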
artificial intelligence methodology systems applications | 2004
Sotiris B. Kotsiantis; Panayiotis E. Pintelas
The simple Bayes algorithm is based on the assumption that every feature is independent of the rest of the features, given the state of the class feature. The fact that this independence assumption is clearly almost always wrong has led to a general rejection of the crude independence model in favor of more complicated alternatives, at least by researchers knowledgeable about theoretical issues. In this study, we attempted to increase the prediction accuracy of the simple Bayes model. Because the concept of combining classifiers has been proposed as a new direction for improving the performance of individual classifiers, we made use of AdaBoost, with the difference that in each iteration of AdaBoost we used a discretization method and removed redundant features using a filter feature selection method. Finally, we performed a large-scale comparison with other attempts to improve the accuracy of the simple Bayes algorithm, as well as with other state-of-the-art algorithms and ensembles, on 26 standard benchmark datasets, and we achieved better accuracy in most cases, with less training time as well.
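The abstract does not name the discretization method used inside each boosting iteration, so as an illustration only, here is a sketch of one common choice, equal-width binning, which maps a numeric feature to bin indices so that a categorical Naive Bayes model can handle it.

```python
def equal_width_bins(values, k=3):
    """Discretise a numeric feature into k equal-width bins,
    returning the bin index (0..k-1) of each value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0          # guard against a constant feature
    # clamp to k-1 so the maximum value falls in the last bin
    return [min(int((v - lo) / width), k - 1) for v in values]

print(equal_width_bins([1.0, 2.0, 4.0, 8.0, 9.0], k=3))
```

Other discretizers (e.g. entropy-based supervised binning) are also widely used with Naive Bayes; equal-width binning is just the simplest to show.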
mediterranean conference on control and automation | 2007
M. Karagiannopoulos; D. S. Anyfantis; Sotiris B. Kotsiantis; Panayiotis E. Pintelas
Many real-world data sets exhibit skewed class distributions in which almost all cases are allotted to one class and far fewer cases to a smaller, usually more interesting class. A classifier induced from an imbalanced data set typically has a low error rate for the majority class and an unacceptable error rate for the minority class. This paper first provides a systematic study of the various methodologies that have tried to handle this problem. It then presents an experimental study of these methodologies together with a proposed local cost-sensitive technique, and concludes that such a framework can be a more effective solution to the problem. Our method seems to allow improved identification of difficult small classes in predictive analysis, while keeping the classification ability of the other classes at an acceptable level.
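The general mechanism behind any cost-sensitive technique like the one proposed above is to predict the class that minimises expected misclassification cost rather than the most probable class. A minimal sketch of that decision rule, with an invented cost matrix that penalises missing the rare class ten times more than a false alarm.

```python
def min_cost_class(probs, cost):
    """Pick the class index with minimum expected misclassification cost.
    probs[j]  : estimated probability that the true class is j
    cost[i][j]: cost of predicting i when the true class is j"""
    n = len(probs)
    expected = [sum(cost[i][j] * probs[j] for j in range(n)) for i in range(n)]
    return expected.index(min(expected))

# Class 1 is the rare, interesting class: missing it costs 10x a false alarm.
cost = [[0, 10],
        [1, 0]]
print(min_cost_class([0.8, 0.2], cost))
```

Even though the minority class has only probability 0.2 here, its expected cost (0.8) is lower than the majority class's (2.0), so the rule predicts the minority class.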
hellenic conference on artificial intelligence | 2004
Sotiris B. Kotsiantis; Panagiotis Pintelas
A class of problems between classification and regression, learning to predict ordinal classes, has not received much attention so far, even though many real-world problems fall into that category. Given ordered classes, one is not only interested in maximizing the classification accuracy, but also in minimizing the distances between the actual and the predicted classes. This paper provides a systematic study of the various methodologies that have tried to handle this problem and presents an experimental study of these methodologies together with a cost-sensitive technique that uses fixed and unequal misclassification costs between classes. It concludes that this technique can be a more robust solution to the problem because it minimizes the distances between the actual and the predicted classes, without harming, and in fact slightly improving, the classification accuracy.
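One natural instance of fixed, unequal misclassification costs for ordered classes, used here purely as an illustration, is to set the cost of predicting class i when the truth is j to the distance |i - j|, and then predict the class with minimum expected distance.

```python
def ordinal_predict(probs):
    """Choose the ordinal class minimising the expected absolute distance
    |predicted - actual|, i.e. with cost matrix cost[i][j] = |i - j|."""
    n = len(probs)
    expected = [sum(abs(i - j) * probs[j] for j in range(n)) for i in range(n)]
    return expected.index(min(expected))

# Hypothetical probabilities over five ordered grades (fail .. excellent)
print(ordinal_predict([0.05, 0.10, 0.20, 0.40, 0.25]))
```

Minimising expected absolute distance selects the median of the class distribution, which is why such a rule can reduce the actual-to-predicted distance without hurting plain accuracy much.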
artificial intelligence applications and innovations | 2007
D. Anyfantis; M. Karagiannopoulos; Sotiris B. Kotsiantis; Panayiotis E. Pintelas
Many real-world datasets exhibit skewed class distributions in which almost all instances are allotted to one class and far fewer instances to a smaller, but more interesting, class. A classifier induced from an imbalanced dataset has a low error rate for the majority class and an undesirable error rate for the minority class. Many research efforts have been made to deal with class noise, but none of them was designed for imbalanced datasets. This paper provides a study of the various methodologies that have tried to handle imbalanced datasets and examines their robustness in the presence of class noise.
International Journal of Intelligent Systems Technologies and Applications | 2007
Sotiris B. Kotsiantis
Credit risk analysis is an important topic in financial risk management. Owing to recent financial crises, credit risk analysis has been the major focus of the financial and banking industry. An accurate estimation of credit risk could be transformed into a more efficient use of economic capital. To this end, a number of experiments have been conducted using representative learning algorithms, which were tested on two publicly available credit datasets. The decision of which particular method to choose is a complicated problem. A good alternative to choosing only one method is to create a hybrid forecasting system incorporating a number of possible solution methods as components (an ensemble of classifiers). For this purpose, we have implemented a hybrid decision support system that combines the representative algorithms using a selective voting methodology and achieves better performance than any of the examined simple and ensemble methods.