Publication


Featured research published by George Tzanis.


Knowledge-Based Systems | 2008

An empirical study on sea water quality prediction

Evaggelos V. Hatzikos; Grigorios Tsoumakas; George Tzanis; Nick Bassiliades; Ioannis P. Vlahavas

This paper studies the problem of predicting future values for a number of water quality variables, based on measurements from underwater sensors. It performs both exploratory and automatic analysis of the collected data with a variety of linear and nonlinear modeling methods. The paper investigates issues such as the ability to predict future values for a varying number of days ahead and the effect of including values from a varying number of past days. Experimental results provide interesting insights into the predictability of the target variables and the performance of the different learning algorithms.
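As a rough illustration of the two knobs the abstract mentions (how many past days are used as input and how many days ahead the target lies), the sketch below turns a daily series into training pairs. The function name `make_lagged_examples`, its parameters, and the readings are hypothetical and are not taken from the paper.

```python
def make_lagged_examples(series, n_lags, horizon):
    """Turn a daily series into (features, target) pairs.

    features: the n_lags most recent values up to and including day t
    target:   the value observed at day t + horizon
    """
    examples = []
    for t in range(n_lags - 1, len(series) - horizon):
        features = series[t - n_lags + 1 : t + 1]
        target = series[t + horizon]
        examples.append((features, target))
    return examples


# Hypothetical daily sensor readings (illustrative values only).
readings = [14.1, 14.3, 14.0, 13.8, 13.9, 14.2, 14.5]

# Use 3 past days to predict the value 2 days ahead.
examples = make_lagged_examples(readings, n_lags=3, horizon=2)
for features, target in examples:
    print(features, "->", target)
```

Varying `n_lags` and `horizon` and retraining a model on the resulting pairs is one simple way to study both effects; any regression method could consume these pairs.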


International Journal of Data Warehousing and Mining | 2007

Mining for Mutually Exclusive Items in Transaction Databases

George Tzanis; Christos Berberidis

Association rule mining is a popular task that involves the discovery of co-occurrences of items in transaction databases. Several extensions of the traditional association rule mining model have been proposed so far; however, the problem of mining for mutually exclusive items has not been directly tackled yet. Such information could be useful in various cases (e.g., when the expression of a gene excludes the expression of another), or it can be used as a serious hint in order to reveal inherent taxonomical information. In this article, we address the problem of mining pairs of items, such that the presence of one excludes the other. First, we provide a concise review of the literature, then we define this problem, we propose a probability-based evaluation metric, and finally a mining algorithm that we test on transaction data.
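To make the idea of mutually exclusive items concrete, here is a minimal sketch that scores item pairs by how far their observed co-occurrence falls below what independence would predict. The score, the `min_support` filter, and the basket data are assumptions chosen for illustration; this is not the evaluation metric or the algorithm proposed in the article.

```python
from itertools import combinations


def exclusion_scores(transactions, min_support=0.2):
    """Score item pairs by how strongly they avoid each other.

    score = 1 - P(a, b) / (P(a) * P(b))
    A score of 1.0 means the two items never co-occur; a score near 0
    means they co-occur about as often as independence would predict.
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted({item for t in sets for item in t})
    support = {i: sum(i in t for t in sets) / n for i in items}

    scores = {}
    for a, b in combinations(items, 2):
        # Skip rare items: two rare items seldom co-occur by chance alone.
        if support[a] < min_support or support[b] < min_support:
            continue
        joint = sum(a in t and b in t for t in sets) / n
        scores[(a, b)] = 1.0 - joint / (support[a] * support[b])
    return scores


# Hypothetical transactions (illustrative only).
baskets = [
    ["tea", "milk"],
    ["coffee", "milk"],
    ["tea", "sugar"],
    ["coffee", "sugar"],
    ["tea", "milk"],
    ["coffee", "milk"],
]
scores = exclusion_scores(baskets)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

In this toy data, tea and coffee are both frequent yet never appear together, so the pair gets the maximum score, whereas coffee and milk co-occur at the rate independence predicts.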


Panhellenic Conference on Informatics | 2005

Improving the accuracy of classifiers for the prediction of translation initiation sites in genomic sequences

George Tzanis; Christos Berberidis; Anastasia Alexandridou; Ioannis P. Vlahavas

The prediction of the Translation Initiation Site (TIS) in a genomic sequence is an important issue in biological research. Although several methods have been proposed to deal with this problem, there is great potential for improving their accuracy. For various reasons, including noise in the data as well as biological reasons, TIS prediction is still an open problem and definitely not a trivial task. In this paper we follow a three-step approach in order to increase TIS prediction accuracy. In the first step, we use a feature generation algorithm we developed. In the second step, all the candidate features, including some new ones generated by our algorithm, are ranked according to their impact on the accuracy of the prediction. Finally, in the third step, a classification model is built using a number of the top-ranked features. We experiment with various feature sets, feature selection methods and classification algorithms, compare with alternative methods, draw important conclusions and propose improved models with respect to prediction accuracy.


Hellenic Conference on Artificial Intelligence | 2006

Prediction of Translation Initiation Sites Using Classifier Selection

George Tzanis; Ioannis P. Vlahavas

The prediction of the translation initiation site (TIS) in a genomic sequence is an important issue in biological research. Several methods have been proposed to deal with it. However, it is still an open problem. In this paper we follow an approach consisting of a number of steps in order to increase TIS prediction accuracy. First, all the sequences are scanned and the candidate TISs are detected. These sites are grouped according to the length of the sequence upstream and downstream of them, and a number of features are generated for each one. The features are evaluated among the instances of every group and a number of the top-ranked ones are selected for building a classifier. A new instance is assigned to a group and is classified by the corresponding classifier. We experiment with various feature sets and classification algorithms, compare with alternative methods and draw important conclusions.
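The first two steps described above (scanning for candidate sites, then grouping them by upstream and downstream length) can be sketched as follows. The sketch assumes candidates are ATG codons; the function names, the `threshold` value, and the example sequence are hypothetical, and the paper's actual grouping scheme may differ.

```python
def candidate_sites(sequence):
    """Return (position, upstream_len, downstream_len) for every ATG."""
    seq = sequence.upper()
    sites = []
    pos = seq.find("ATG")
    while pos != -1:
        upstream = pos                      # bases before the codon
        downstream = len(seq) - (pos + 3)   # bases after the codon
        sites.append((pos, upstream, downstream))
        pos = seq.find("ATG", pos + 1)
    return sites


def group_key(upstream, downstream, threshold=10):
    """Assign a site to a group; the threshold is an illustrative assumption."""
    return (
        "long" if upstream >= threshold else "short",
        "long" if downstream >= threshold else "short",
    )


# Hypothetical sequence containing two ATG codons.
sequence = "GCCACCATGGCGTTAATGCCC"
sites = candidate_sites(sequence)
groups = {pos: group_key(up, down) for pos, up, down in sites}
print(sites)
print(groups)
```

Each group would then get its own feature ranking and classifier, and a new candidate would be routed to the classifier of the group returned by `group_key`.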


International Conference of the IEEE Engineering in Medicine and Biology Society | 2007

MANTIS: A Data Mining Methodology for Effective Translation Initiation Site Prediction

George Tzanis; Christos Berberidis; Ioannis P. Vlahavas

The prediction of the translation initiation site in a genomic sequence with the highest possible accuracy is an important problem that still has to be investigated by the research community. Current approaches perform quite well; however, there is still room for a more general framework for researchers who want to follow an effective and reliable methodology. We developed a prediction methodology that combines ad hoc as well as discovered knowledge in order to significantly and reliably increase the achieved accuracy. Our methodology is modular and consists of three major decision components: a consensus component, a coding region classification component and a novel ATG location-based component that allows for the utilization of the advantages of the popular ribosome scanning model while overcoming its limitations. All three of them are combined into a meta-classification system, using stacked generalization, in a highly effective prediction framework. We performed extensive comparative experiments on four different datasets, showing that the increase in terms of accuracy and adjusted accuracy is not only statistically significant, but also the highest reported.


International Journal of Knowledge Discovery in Bioinformatics | 2014

Biological and Medical Big Data Mining

George Tzanis

This paper discusses the concept of big data mining in the domain of biology and medicine. Biological and medical data are increasing at very rapid rates, which in many cases outpace even Moore's law. This is the result of recent technological development, as well as the exploratory attitude of human beings, which prompts scientists to answer more questions by conducting more experiments. Representative examples are the advances in sequencing and medical imaging technologies. Challenges posed by this data deluge, and the emerging opportunities of their efficient management and analysis, are also part of the discussion. The major emphasis is given to the most common biological and medical data mining applications.


International Conference on Biological and Medical Data Analysis | 2006

A novel data mining approach for the accurate prediction of translation initiation sites

George Tzanis; Christos Berberidis; Ioannis P. Vlahavas

In an mRNA sequence, the prediction of the exact codon where the process of translation starts (Translation Initiation Site – TIS) is a particularly important problem. So far it has been tackled by several researchers who apply various statistical and machine learning techniques, achieving high accuracy levels, often over 90%. In this paper we propose a machine learning approach that can further improve the prediction accuracy. First, we provide a concise review of the literature in this field. Then we propose a novel feature set. We perform extensive experiments on a publicly available, real-world dataset for various vertebrate organisms using a variety of novel features and classification setups. We evaluate our results and compare them with a reference study and show that our approach, which involves new features and a combination of the Ribosome Scanning Model with a meta-classifier, shows higher accuracy in most cases.


Computers in Biology and Medicine | 2012

StackTIS: A stacked generalization approach for effective prediction of translation initiation sites

George Tzanis; Christos Berberidis; Ioannis P. Vlahavas

The prediction of the translation initiation site in an mRNA or cDNA sequence is an essential step in gene prediction and an open research problem in bioinformatics. Although recent approaches perform well, more effective and reliable methodologies are solicited. We developed an adaptable data mining method, called StackTIS, which is modular and consists of three prediction components that are combined into a meta-classification system, using stacked generalization, in a highly effective framework. We performed extensive experiments on sequences of two diverse eukaryotic organisms (Homo sapiens and Oryza sativa), indicating that StackTIS achieves statistically significant improvement in performance.


International Conference on Tools with Artificial Intelligence | 2007

Accurate Classification of SAGE Data Based on Frequent Patterns of Gene Expression

George Tzanis; Ioannis P. Vlahavas

In this paper we present a method for accurately classifying SAGE (serial analysis of gene expression) data. The high dimensionality of the data, namely the large number of features, in combination with the small number of samples poses a great challenge and demands more accurate and robust algorithms for classification. The prediction accuracy of the approaches proposed so far is moderate. In our approach we exploit the associations among the expressions of genes in order to construct more accurate classifiers. To validate the effectiveness of our approach we experimented with two real datasets using numerous feature selection and classification algorithms. The results have shown that our approach significantly improves the classification accuracy, which reaches 99%.


Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications | 2005

Mining for Contiguous Frequent Itemsets in Transaction Databases

Christos Berberidis; George Tzanis; Ioannis P. Vlahavas

Mining a transaction database for association rules is a particularly popular data mining task, which involves the search for frequent co-occurrences among items. One of the problems often encountered is the large number of weak rules extracted. Item taxonomies, when available, can be used to reduce them to a more usable volume. In this paper we introduce a new data mining paradigm, which involves the discovery of contiguous frequent itemsets. We formulate the problem of mining contiguous frequent itemsets in a transaction database and we present a level-wise algorithm for finding these itemsets. Contiguous frequent itemsets may contain important knowledge about the dataset that cannot be exposed by the use of classic association rule mining approaches. This knowledge may well include serious hints for the generation of a taxonomy for all or part of the items.

Collaboration


Dive into George Tzanis's collaborations.

Top Co-Authors

Ioannis P. Vlahavas
Aristotle University of Thessaloniki

Christos Berberidis
Aristotle University of Thessaloniki

Ioannis Kavakiotis
Aristotle University of Thessaloniki

Anastasia Alexandridou
Aristotle University of Thessaloniki

Grigorios Tsoumakas
Aristotle University of Thessaloniki

Nick Bassiliades
Aristotle University of Thessaloniki