Publication


Featured research published by Ioannis Kavakiotis.


Computational and Structural Biotechnology Journal | 2017

Machine Learning and Data Mining Methods in Diabetes Research

Ioannis Kavakiotis; O. Tsave; Athanasios Salifoglou; Nicos Maglaveras; Ioannis P. Vlahavas; Ioanna Chouvarda

The remarkable advances in biotechnology and health sciences have led to a significant production of data, such as high-throughput genetic data and clinical information generated from large Electronic Health Records (EHRs). To this end, the application of machine learning and data mining methods in biosciences is presently, more than ever before, vital and indispensable in efforts to intelligently transform all available information into valuable knowledge. Diabetes mellitus (DM) is defined as a group of metabolic disorders exerting significant pressure on human health worldwide. Extensive research in all aspects of diabetes (diagnosis, etiopathophysiology, therapy, etc.) has led to the generation of huge amounts of data. The aim of the present study is to conduct a systematic review of the applications of machine learning, data mining techniques and tools in the field of diabetes research with respect to a) Prediction and Diagnosis, b) Diabetic Complications, c) Genetic Background and Environment, and d) Health Care and Management, with the first category appearing to be the most popular. A wide range of machine learning algorithms were employed. In general, 85% of those used were characterized by supervised learning approaches and 15% by unsupervised ones, more specifically association rules. Support vector machines (SVM) arise as the most successful and widely used algorithm. Concerning the type of data, clinical datasets were mainly used. The applications in the selected articles demonstrate the usefulness of extracting valuable knowledge leading to new hypotheses targeting deeper understanding and further investigation in DM.


Journal of Heredity | 2015

TRES: Identification of Discriminatory and Informative SNPs from Population Genomic Data

Ioannis Kavakiotis; Alexandros Triantafyllidis; Despoina Ntelidou; Panoraia Alexandri; Hendrik-Jan Megens; R.P.M.A. Crooijmans; M.A.M. Groenen; Grigorios Tsoumakas; Ioannis P. Vlahavas

The advent of high-throughput genomic technologies is enabling analyses on thousands or even millions of single-nucleotide polymorphisms (SNPs). At the same time, the selection of a minimum number of SNPs with the maximum information content is becoming increasingly problematic. Available locus ranking programs have been accused of providing upwardly biased results (concerning the predicted accuracy of the chosen set of markers for population assignment), cannot handle high-dimensional datasets, and some of them are computationally intensive. The toolbox for ranking and evaluation of SNPs (TRES) is a collection of algorithms built into user-friendly and computationally efficient software that can manipulate and analyze datasets even on the order of millions of genotypes in a matter of seconds. It offers a variety of established methods for evaluating and ranking SNPs on user-defined groups of populations and produces a set of a predefined number of top-ranked loci. Moreover, dataset manipulation algorithms enable users to convert datasets between different file formats, split the initial datasets into train and test sets, and finally create datasets containing only the SNPs chosen by the SNP selection analysis for later evaluation in dedicated software such as GENECLASS. This application can help biologists select loci with maximum power for optimization of cost-effective panels with applications related to e.g. species identification, wildlife management, and forensic problems. TRES is available for all operating systems at http://mlkd.csd.auth.gr/bio/tres.
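The ranking step described in this abstract can be illustrated with a small sketch. The abstract does not name the ranking methods TRES implements, so the example below assumes one well-known measure, the absolute allele frequency difference (delta) between two populations. The function names, the 0/1/2 genotype encoding and the toy data are hypothetical and are not taken from TRES.

```python
from typing import Dict, List


def allele_frequency(genotypes: List[int]) -> float:
    """Alternate-allele frequency; each genotype is 0, 1 or 2 copies (diploid)."""
    return sum(genotypes) / (2 * len(genotypes))


def rank_snps_by_delta(
    pop_a: Dict[str, List[int]],
    pop_b: Dict[str, List[int]],
    top_k: int,
) -> List[str]:
    """Return the top_k SNP ids with the largest allele frequency difference."""
    deltas = {
        snp: abs(allele_frequency(pop_a[snp]) - allele_frequency(pop_b[snp]))
        for snp in pop_a
    }
    # Sort by descending delta; ties are broken alphabetically for determinism.
    ranked = sorted(deltas, key=lambda snp: (-deltas[snp], snp))
    return ranked[:top_k]


# Toy data (assumed): four individuals per population, three SNPs.
pop_a = {"snp1": [0, 0, 0, 1], "snp2": [1, 1, 2, 0], "snp3": [2, 2, 2, 2]}
pop_b = {"snp1": [2, 2, 1, 2], "snp2": [1, 1, 1, 1], "snp3": [2, 2, 1, 2]}

top_loci = rank_snps_by_delta(pop_a, pop_b, top_k=2)
print(top_loci)
```

In this toy input snp1 differs most between the two groups, so it ranks first; snp2 has identical frequencies in both groups and is dropped.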


IEEE Transactions on Molecular, Biological, and Multi-Scale Communications | 2017

Fick’s Law Model Revisited: A New Approach to Modeling Multiple Sources Message Dissemination in Bacterial Communication Nanosystems

Konstantinos Kantelis; Georgios I. Papadimitriou; Petros Nicopolitidis; Ioannis Kavakiotis; O. Tsave; Athanasios Salifoglou

As advances in nanotechnology continue their ascending course, new areas of application for nanoscale communication open up, involving biological systems. Such systems have peculiarities that must be taken into consideration, when trying to study new communication paradigms based on micro-biological communication systems. In this paper, an innovative mathematical model is employed, in an effort to study message dissemination through bacterial communication in the form of delivery of information within a simple, biologically inspired, communication system consisting of bacteria, yet representative of the characteristics of biological nanocommunication environments. Stimulus spreading is being investigated within the realm of message dissemination in electromagnetic networks, for single and multiple infection sources using macro scale simulation techniques, with the help of a simulation tool, which was developed based on a commercial simulation suite. The observed results indicate that the mathematical model predictions are in strong agreement with the simulations described by Fick’s laws of diffusion and well-approximated through the Fisher–Kolmogorov–Petrovsky–Piscunov (FKPP) equation, enabling use of conventional simulation systems for fast biological nanosystem property investigation.
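For readers unfamiliar with the FKPP equation mentioned above, a minimal numerical sketch may help. In one dimension the equation reads du/dt = D d2u/dx2 + r u (1 - u), i.e. Fickian diffusion plus logistic growth. The code below integrates it with an explicit finite-difference scheme for one or several sources. The grid size, time step, coefficients and zero-flux boundaries are all assumptions chosen for illustration; this is not the model or the simulation tool used in the paper.

```python
from typing import List, Sequence


def simulate_fkpp(
    sources: Sequence[int],
    n_cells: int = 101,
    dx: float = 1.0,
    dt: float = 0.1,
    diffusion: float = 1.0,
    growth: float = 1.0,
    steps: int = 200,
) -> List[float]:
    """Explicit Euler integration of du/dt = D u_xx + r u (1 - u) on a 1-D grid."""
    if diffusion * dt / dx**2 > 0.5:
        raise ValueError("explicit scheme unstable: need diffusion*dt/dx^2 <= 0.5")
    u = [0.0] * n_cells
    for position in sources:
        u[position] = 1.0  # each source starts fully "informed"
    for _ in range(steps):
        nxt = u[:]
        for i in range(n_cells):
            # Mirror the neighbour at the edges (zero-flux boundary, assumed).
            left = u[i - 1] if i > 0 else u[i + 1]
            right = u[i + 1] if i < n_cells - 1 else u[i - 1]
            laplacian = (left - 2.0 * u[i] + right) / dx**2
            nxt[i] = u[i] + dt * (diffusion * laplacian + growth * u[i] * (1.0 - u[i]))
        u = nxt
    return u


single = simulate_fkpp(sources=[50])      # one source in the middle
double = simulate_fkpp(sources=[30, 70])  # two sources
print(round(sum(single), 1), round(sum(double), 1))
```

With two sources the fronts start from two places at once, so a larger share of the domain is reached in the same simulated time than with a single source.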


Hellenic Conference on Artificial Intelligence | 2016

Ensemble Feature Selection using Rank Aggregation Methods for Population Genomic Data

Ioannis Kavakiotis; Alexandros Triantafyllidis; Grigorios Tsoumakas; Ioannis P. Vlahavas

Single Nucleotide Polymorphisms (SNPs) constitute important genetic markers with numerous medical and biological applications of high scientific and economic interest. SNP datasets are typically high dimensional, containing up to a million features. Reasons originating from both biology and machine learning call for feature selection, which is mainly performed after feature evaluation. In this paper we present methods for SNP evaluation and eventually selection, based on combining results obtained from established genetic marker evaluation methods originating from the field of population genetics. To achieve this we have formulated the feature selection task as a rank aggregation problem, which is a classical problem in social choice and voting theory.
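To make the idea of rank aggregation concrete, here is a minimal sketch using the Borda count, a classical voting rule. The abstract does not state which aggregation methods the paper uses, so Borda is an assumption chosen for illustration, and the SNP names and the three input rankings are made up.

```python
from typing import Dict, List


def borda_aggregate(rankings: List[List[str]]) -> List[str]:
    """Combine several rankings of the same items into one consensus ranking.

    Each ranking awards n-1 points to its first item, n-2 to the second,
    and so on down to 0 for the last; items are ordered by total points.
    """
    scores: Dict[str, int] = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    # Descending total score; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical rankings produced by three different marker evaluation methods.
rankings = [
    ["snp1", "snp2", "snp3", "snp4"],
    ["snp2", "snp1", "snp4", "snp3"],
    ["snp1", "snp3", "snp2", "snp4"],
]

consensus = borda_aggregate(rankings)
print(consensus)
```

A feature selection step would then keep the first k entries of the consensus ranking.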


Archive | 2018

Adipose Tissue as a Biomarker in Data Mining Predictive Models of Metabolic Pathophysiologies

O. Tsave; Ioannis Kavakiotis; Ioannis P. Vlahavas; Athanasios Salifoglou

It is well known that the metabolic syndrome emerges as one of the major public health issues worldwide. In diabetes and other metabolism related diseases, further complexity is added in diagnosis and prognosis due to the presence of metabolic syndrome, including obesity. Obesity, which is defined as an excess of body fat, can be described as an underlying risk factor of almost any of the aforementioned metabolic related pathologies. Moreover, a very likely potential link between such pathologies and obesity is the adipose tissue, which functions as an endocrine organ. Since obesity serves as general key to metabolism related disorders and complications, the adipose tissue can be a useful tool in predicting such pathologies. In the present mini review work, several representative studies are discussed with respect to the effectiveness of adipose tissue as a valuable biomarker along with other factors taken into consideration with data mining approaches. Taken together, adipose tissue can be used in data mining as a predictive tool in diabetes, mortality, cardiometabolic risk and other metabolism related pathologies.


Nano Communication Networks | 2017

Using bacterial concentration as means of dissipating information through chemotaxis

Stathis B. Mavridopoulos; Petros Nicopolitidis; O. Tsave; Ioannis Kavakiotis; Athanasios Salifoglou

Utilizing bacteria in communications has emerged as a promising solution to delineating peculiarities at the nanoscale level. Usually, bacteria are used as carriers of molecules, which are exchanged in order to dissipate information. This work proposes a system whereby bacterial concentration is used to transfer information. Chemotaxis plays a central role in the scheme. The investigation targets the examination and comparison of two different methodologies, where either the server uses chemorepellent or the clients use chemoattractant substances to bring about the chemotaxis effect. Performance of the proposed topologies was investigated through simulation. In the simulated experiments performed, random messages were encoded in the bacterial concentration using a simple on–off keying modulation, which the receivers then decode to recover the initial message. The experiments demonstrate the differences between the two strategies under various topologies, show the superior performance achieved in the case of chemoattractant clients, and highlight the influence of the parameters of distance, number of sensors and number of bacteria per pulse on the received signal amplitude and achievable bit error rate.
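The on–off keying scheme mentioned above can be sketched in a few lines: a 1 bit is sent as a pulse of bacteria, a 0 bit as no pulse, and the receiver thresholds the measured concentration. The channel below (a fixed attenuation plus optional Gaussian noise) is a deliberately crude assumption standing in for the chemotaxis simulation of the paper, and all names, parameter values and the threshold are hypothetical.

```python
import random
from typing import List


def ook_encode(bits: List[int], bacteria_per_pulse: int = 100) -> List[float]:
    """Map each bit to a transmitted bacterial pulse size (0 means no pulse)."""
    return [float(bacteria_per_pulse) if bit else 0.0 for bit in bits]


def channel(
    pulses: List[float],
    attenuation: float = 0.5,
    noise_std: float = 0.0,
    seed: int = 0,
) -> List[float]:
    """Toy channel (assumed): scale each pulse and add Gaussian noise."""
    rng = random.Random(seed)
    return [max(0.0, attenuation * p + rng.gauss(0.0, noise_std)) for p in pulses]


def ook_decode(received: List[float], threshold: float) -> List[int]:
    """Decide 1 when the received concentration exceeds the threshold."""
    return [1 if level > threshold else 0 for level in received]


def bit_error_rate(sent: List[int], decoded: List[int]) -> float:
    """Fraction of positions where the decoded bit differs from the sent bit."""
    errors = sum(1 for s, d in zip(sent, decoded) if s != d)
    return errors / len(sent)


message = [1, 0, 1, 1, 0, 0, 1, 0]
received = channel(ook_encode(message), attenuation=0.5, noise_std=0.0)
decoded = ook_decode(received, threshold=25.0)
print(decoded, bit_error_rate(message, decoded))
```

Raising `noise_std` or lowering `bacteria_per_pulse` in this toy setup increases the bit error rate, which mirrors the kind of parameter dependence the abstract reports.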


Computers in Biology and Medicine | 2017

FIFS: A data mining method for informative marker selection in high dimensional population genomic data

Ioannis Kavakiotis; Patroklos Samaras; Alexandros Triantafyllidis; Ioannis P. Vlahavas

BACKGROUND AND OBJECTIVE: Single Nucleotide Polymorphisms (SNPs) are, nowadays, becoming the marker of choice for biological analyses involving a wide range of applications with great medical, biological, economic and environmental interest. Classification tasks, i.e. the assignment of individuals to groups of origin based on their (multi-locus) genotypes, are performed in many fields such as forensic investigations, discrimination between wild and/or farmed populations and others. These tasks should be performed with a small number of loci, for computational as well as biological reasons. Thus, feature selection should precede classification tasks, especially for SNP datasets, where the number of features can amount to hundreds of thousands or millions. METHODS: In this paper, we present a novel data mining approach, called FIFS (Frequent Item Feature Selection), based on the use of frequent items for selection of the most informative markers from population genomic data. It is a modular method, consisting of two main components. The first one identifies the most frequent and unique genotypes for each sampled population. The second one selects the most appropriate among them, in order to create the informative SNP subsets to be returned. RESULTS: The proposed method (FIFS) was tested on a real dataset, which comprised a comprehensive coverage of pig breed types present in Britain. This dataset consisted of 446 individuals divided into 14 sub-populations, genotyped at 59,436 SNPs. Our method outperforms the state-of-the-art and baseline methods in every case. More specifically, our method surpassed the assignment accuracy threshold of 95% needing only half the number of SNPs selected by other methods (FIFS: 28 SNPs, Delta: 70 SNPs, Pairwise FST: 70 SNPs, In: 100 SNPs). CONCLUSION: Our approach successfully deals with the problem of informative marker selection in high dimensional genomic datasets. It offers better results compared to existing approaches and can aid biologists in selecting the most informative markers with maximum discrimination power for optimization of cost-effective panels with applications related to e.g. species identification, wildlife management, and forensics.
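The two components described in the abstract (find frequent, population-specific genotypes, then select among them) can be loosely illustrated as follows. This is not the published FIFS algorithm: the scoring rule (frequency of a population's most common genotype minus its highest frequency in any other population), the function names and the toy breeds are all assumptions made for this sketch.

```python
from collections import Counter
from typing import Dict, List, Tuple

Genotypes = Dict[str, List[str]]  # SNP id -> genotype call per individual


def most_frequent_genotype(calls: List[str]) -> Tuple[str, float]:
    """Return the most common genotype and its relative frequency."""
    genotype, count = Counter(calls).most_common(1)[0]
    return genotype, count / len(calls)


def score_snps(populations: Dict[str, Genotypes]) -> Dict[str, float]:
    """Score each SNP by how frequent AND population-specific its top genotype is."""
    snps = list(next(iter(populations.values())).keys())
    scores: Dict[str, float] = {}
    for snp in snps:
        best = 0.0
        for name, data in populations.items():
            genotype, freq_in = most_frequent_genotype(data[snp])
            # Highest frequency of the same genotype in any other population.
            freq_out = max(
                (
                    other[snp].count(genotype) / len(other[snp])
                    for other_name, other in populations.items()
                    if other_name != name
                ),
                default=0.0,
            )
            best = max(best, freq_in - freq_out)
        scores[snp] = best
    return scores


def select_top(scores: Dict[str, float], k: int) -> List[str]:
    """Keep the k highest-scoring SNPs (ties broken alphabetically)."""
    return sorted(scores, key=lambda snp: (-scores[snp], snp))[:k]


# Toy data (assumed): two breeds, four individuals each, two SNPs.
populations = {
    "breed_a": {"snp1": ["AA", "AA", "AA", "AG"], "snp2": ["CC", "CT", "CC", "CT"]},
    "breed_b": {"snp1": ["GG", "GG", "AG", "GG"], "snp2": ["CC", "CT", "CT", "CC"]},
}

scores = score_snps(populations)
print(select_top(scores, k=1))
```

Here snp1 separates the two toy breeds well, while snp2 has the same genotype distribution in both and scores zero.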


Clinical Cancer Research | 2017

Chronic Lymphocytic Leukemia with Mutated IGHV4-34 Receptors: Shared and Distinct Immunogenetic Features and Clinical Outcomes.

Aliki Xochelli; Panagiotis Baliakas; Ioannis Kavakiotis; Andreas Agathangelidis; Lesley Ann Sutton; Eva Minga; Stavroula Ntoufa; Eugen Tausch; Xiao Jie Yan; Tait D. Shanafelt; Karla Plevová; Myriam Boudjogra; Davide Rossi; Zadie Davis; Alba Navarro; Yorick Sandberg; Fie Juhl Vojdeman; Lydia Scarfò; Niki Stavroyianni; Andrey Sudarikov; Silvio Veronese; Tatiana Tzenou; Teodora Karan-Djurasevic; Mark A. Catherwood; Dirk Kienle; Maria Chatzouli; Monica Facco; Jasmin Bahlo; Christiane Pott; Lone Bredo Pedersen

Purpose: We sought to investigate whether B cell receptor immunoglobulin (BcR IG) stereotypy is associated with particular clinicobiological features among chronic lymphocytic leukemia (CLL) patients expressing mutated BcR IG (M-CLL) encoded by the IGHV4-34 gene, and also ascertain whether these associations could refine prognostication. Experimental Design: In a series of 19,907 CLL cases with available immunogenetic information, we identified 339 IGHV4-34–expressing cases assigned to one of the four largest stereotyped M-CLL subsets, namely subsets #4, #16, #29 and #201, and investigated in detail their clinicobiological characteristics and disease outcomes. Results: We identified shared and subset-specific patterns of somatic hypermutation (SHM) among patients assigned to these subsets. The greatest similarity was observed between subsets #4 and #16, both including IgG-switched cases (IgG-CLL). In contrast, the least similarity was detected between subsets #16 and #201, the latter concerning IgM/D-expressing CLL. Significant differences between subsets also involved disease stage at diagnosis and the presence of specific genomic aberrations. IgG subsets #4 and #16 emerged as particularly indolent with a significantly (P < 0.05) longer time-to-first-treatment (TTFT; median TTFT: not yet reached) compared with the IgM/D subsets #29 and #201 (median TTFT: 11 and 12 years, respectively). Conclusions: Our findings support the notion that BcR IG stereotypy further refines prognostication in CLL, superseding the immunogenetic distinction based solely on SHM load. In addition, the observed distinct genetic aberration landscapes and clinical heterogeneity suggest that not all M-CLL cases are equal, prompting further research into the underlying biological background with the ultimate aim of tailored patient management. Clin Cancer Res; 23(17); 5292–301. ©2017 AACR.


BMC Bioinformatics | 2016

Integrating multiple immunogenetic data sources for feature extraction and mining somatic hypermutation patterns: the case of “towards analysis” in chronic lymphocytic leukaemia

Ioannis Kavakiotis; Aliki Xochelli; Andreas Agathangelidis; Grigorios Tsoumakas; Nicos Maglaveras; Kostas Stamatopoulos; Anastasia Hadzidimitriou; Ioannis P. Vlahavas; Ioanna Chouvarda

Background: Somatic Hypermutation (SHM) refers to the introduction of mutations within rearranged V(D)J genes, a process that increases the diversity of Immunoglobulins (IGs). The analysis of SHM has offered critical insight into the physiology and pathology of B cells, leading to strong prognostication markers for clinical outcome in chronic lymphocytic leukaemia (CLL), the most frequent adult B-cell malignancy. In this paper we present a methodology for integrating multiple immunogenetic and clinicobiological data sources in order to extract features and create high quality datasets for SHM analysis in IG receptors of CLL patients. This dataset is used as the basis for a higher level integration procedure, inspired by social choice theory. This is applied in the Towards Analysis, our attempt to investigate the potential ontogenetic transformation of genes belonging to specific stereotyped CLL subsets towards other genes or gene families, through SHM. Results: The data integration process, followed by feature extraction, resulted in the generation of a dataset containing information about mutations occurring through SHM. The Towards Analysis, performed on the integrated dataset by applying voting techniques, revealed the distinct behaviour of subset #201 compared to other subsets, as regards SHM-related movements among gene clans, both in allele-conserved and non-conserved gene areas. With respect to movement between genes, a high percentage of movement towards pseudogenes was found in all CLL subsets. Conclusions: This data integration and feature extraction process can set the basis for exploratory analysis or a fully automated computational data mining approach on many as yet unanswered, clinically relevant biological questions.


Hellenic Conference on Artificial Intelligence | 2014

Feature Evaluation Metrics for Population Genomic Data

Ioannis Kavakiotis; Alexandros Triantafyllidis; Grigorios Tsoumakas; Ioannis P. Vlahavas

Single Nucleotide Polymorphisms (SNPs) are nowadays considered one of the most important classes of genetic markers, with a wide range of applications of both scientific and economic interest. Although advances in biotechnology have made the production of genome-wide SNP datasets feasible, the cost of production is still high. The transformation of the initial dataset into a smaller one with the same genetic information is a crucial task, and it is performed through feature selection. Biologists evaluate features using methods originating from the field of population genetics. Although several studies have compared the existing biological methods, there is a lack of comparison between methods originating from biology and those originating from machine learning. In this study we present some early results which suggest that biological methods perform slightly better than machine learning methods.

Collaboration


Dive into Ioannis Kavakiotis's collaborations.

Top Co-Authors

Ioannis P. Vlahavas, Aristotle University of Thessaloniki
Andreas Agathangelidis, Vita-Salute San Raffaele University
Grigorios Tsoumakas, Aristotle University of Thessaloniki
Ioanna Chouvarda, Aristotle University of Thessaloniki
Alexandros Triantafyllidis, Aristotle University of Thessaloniki
Athanasios Salifoglou, Aristotle University of Thessaloniki
George Tzanis, Aristotle University of Thessaloniki
O. Tsave, Aristotle University of Thessaloniki
Nicos Maglaveras, Aristotle University of Thessaloniki