Publication


Featured research published by Antonino Fiannaca.


BMC Bioinformatics | 2013

Alignment-free analysis of barcode sequences by means of compression-based methods

Massimo La Rosa; Antonino Fiannaca; Riccardo Rizzo; Alfonso Urso

Background: The key idea of the DNA barcode initiative is to identify, for each group of species belonging to different kingdoms of life, a short DNA sequence that can act as a true taxon barcode. A DNA barcode represents a valuable type of information that can be integrated with ecological, genetic, and morphological data in order to obtain a more consistent taxonomy. Recent studies have shown that, for the animal kingdom, the mitochondrial gene cytochrome c oxidase I (COI), about 650 bp long, can be used as a barcode sequence for identification and taxonomic purposes. In the present work we aim to introduce an alignment-free approach to the taxonomic analysis of barcode sequences. Our approach is based on two compression-based versions of the non-computable Universal Similarity Metric (USM) class of distances. Our purpose is to justify the use of USM for the analysis of short DNA barcode sequences as well, showing that USM can correctly extract taxonomic information from this kind of sequence.

Results: We downloaded 30 datasets of barcode sequences belonging to different animal species from the Barcode of Life Data System (BOLD) database. We built phylogenetic trees for every dataset, according to compression-based and classic evolutionary methods, and compared them in terms of topology preservation. In the experimental tests, the similarity between evolutionary and compression-based trees was between 80% and 100% for most of the datasets (94%). Moreover, we carried out experimental tests using simulated barcode datasets composed of 100, 150, 200 and 500 sequences, with each simulation replicated 25-fold. In this case, mean similarity scores between evolutionary and compression-based trees range between 83% and 99% for all simulated datasets.

Conclusions: In the present work we aim to introduce an alignment-free approach to the taxonomic analysis of barcode sequences. Our approach is based on two compression-based versions of the non-computable Universal Similarity Metric (USM) class of distances. In this way we demonstrate the reliability of compression-based methods even for the analysis of short barcode sequences. Compression-based methods, with their strong theoretical assumptions, may then represent a valid alignment-free and parameter-free approach for barcode studies.
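
As an illustration of how a compression-based distance can stand in for the non-computable USM, the following is a minimal sketch of the Normalized Compression Distance (NCD), a widely used approximation. It is an assumption that NCD corresponds to one of the two variants used in the paper, and zlib is used here only as a convenient stand-in compressor; the function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance between two sequences.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. zlib is an assumed stand-in compressor.
    """
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(seqs):
    """Symmetric matrix of pairwise NCD values with a zero diagonal."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = ncd(seqs[i], seqs[j])
    return d
```

A matrix produced this way could then be passed to any distance-based tree-building method, which is the kind of comparison the abstract describes.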


BMC Bioinformatics | 2015

Probabilistic topic modeling for the analysis and classification of genomic sequences

Massimo La Rosa; Antonino Fiannaca; Riccardo Rizzo; Alfonso Urso

Background: Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, which represent a well-defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mer representation and text mining techniques.

Methods: The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find, in a document corpus, the topics (recurrent themes) that characterize classes of documents. This technique, applied to DNA sequences that play the role of documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences.

Results and conclusions: We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches results very similar to those of the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM on ultra-short sequences, and it exhibits a smooth decrease in performance, at every taxonomic level, as the sequence length is decreased.
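
The idea of treating DNA sequences as documents made of k-mer "words" can be sketched as below. This is only the preprocessing step under stated assumptions: overlapping k-mers, a sorted vocabulary, and hypothetical function names. The resulting count matrix is the kind of input a topic-modeling implementation such as LDA would expect; the LDA step itself is not shown.

```python
from collections import Counter


def kmers(seq: str, k: int):
    """Return all overlapping k-mers of a sequence, in order."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def document_term_matrix(seqs, k: int):
    """Build a k-mer vocabulary and a per-sequence count matrix.

    Each sequence plays the role of a document and each k-mer the role
    of a word (an assumed, simplified reading of the abstract).
    """
    counts = [Counter(kmers(s, k)) for s in seqs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[w] for w in vocab] for c in counts]
    return vocab, matrix


vocab, matrix = document_term_matrix(["ACGT", "CGTA"], k=2)
print(vocab)   # ['AC', 'CG', 'GT', 'TA']
print(matrix)  # [[1, 1, 1, 0], [0, 1, 1, 1]]
```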


BMC Bioinformatics | 2015

Analysis of miRNA expression profiles in breast cancer using biclustering

Antonino Fiannaca; Massimo La Rosa; Laura La Paglia; Riccardo Rizzo; Alfonso Urso

Background: MicroRNAs (miRNAs) are key regulators of multiple cellular functions, owing to their crucial role in different physiological processes. MiRNAs are differentially expressed in specific tissues, during specific cell states, or in different diseases such as tumours. RNA sequencing (RNA-seq) is a Next Generation Sequencing (NGS) method for the analysis of differential gene expression. Using machine learning algorithms, it is possible to improve the interpretation of the functional significance of miRNAs in the analysis of RNA-seq data. Furthermore, we tried to identify patterns of deregulated miRNAs in human breast cancer (BC), in order to contribute to the understanding of this type of cancer at the molecular level.

Results: We adopted a biclustering approach, using the Iterative Signature Algorithm (ISA), in order to evaluate miRNA deregulation in the context of miRNA abundance and tissue heterogeneity. These are important elements for identifying miRNAs that would be useful as prognostic and diagnostic markers. On a real-world breast cancer dataset, the evaluation of miRNA differential expression in tumours versus healthy tissues revealed 12 different miRNA clusters, associated with specific groups of patients. The identified miRNAs were deregulated in breast tumours compared to healthy controls. Our approach showed an association among specific sub-classes of tumour samples having the same immunohistochemical and/or histological features. Biclusters have been validated by means of two online repositories, the MetaMirClust database and the UCSC Genome Browser, and by using another biclustering algorithm.

Conclusions: The results obtained with the biclustering algorithm aim first of all to contribute to the differential expression analysis in a cohort of BC patients, and secondly to support the potential role that these non-coding RNA molecules could play in clinical practice, in terms of prognosis, tumour evolution and treatment response.


Neural Computing and Applications | 2013

Simulated annealing technique for fast learning of SOM networks

Antonino Fiannaca; Giuseppe Di Fatta; Riccardo Rizzo; Alfonso Urso; Salvatore Gaglio

The Self-Organizing Map (SOM) is a popular unsupervised neural network able to provide effective clustering and data visualization for multidimensional input datasets. In this paper, we present an application of the simulated annealing procedure to the SOM learning algorithm, with the aim of obtaining fast learning and better performance in terms of quantization error. The proposed learning algorithm is called Fast Learning Self-Organized Map, and it preserves the simplicity of the basic learning algorithm of the standard SOM. The proposed learning algorithm also improves the quality of the resulting maps by providing better clustering quality and topology preservation of the input multidimensional data. Several experiments are used to compare the proposed approach with the original algorithm and with some of its modifications and speed-up techniques.
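
For readers unfamiliar with the baseline, here is a minimal sketch of standard SOM training together with the quantization error used as the performance measure. It does not reproduce the simulated-annealing procedure of the paper, which the abstract does not specify; the linear decay schedules, the Gaussian neighbourhood, and all function names are assumptions.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], x))


def quantization_error(weights, data):
    """Mean distance between each input and its best matching unit."""
    total = sum(math.sqrt(sq_dist(weights[best_matching_unit(weights, x)], x))
                for x in data)
    return total / len(data)


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Standard online SOM on a rows x cols grid (assumed schedules)."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [list(rng.choice(data)) for _ in grid]  # init from samples
    sigma0 = sigma0 or max(rows, cols) / 2
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over data
            frac = step / total_steps
            lr = lr0 * (1 - frac)               # linearly decaying rate
            sigma = sigma0 * (1 - frac) + 0.01  # shrinking neighbourhood
            br, bc = grid[best_matching_unit(weights, x)]
            for i, (r, c) in enumerate(grid):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2 * sigma * sigma))
                weights[i] = [w + lr * h * (v - w)
                              for w, v in zip(weights[i], x)]
            step += 1
    return weights
```

Calling `quantization_error(train_som(data, 5, 5), data)` gives the figure that a faster learning scheme would aim to reach in fewer iterations.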


International Conference on Artificial Neural Networks | 2007

Improved SOM learning using simulated annealing

Antonino Fiannaca; Giuseppe Di Fatta; Salvatore Gaglio; Riccardo Rizzo; Alfonso Urso

The Self-Organizing Map (SOM) algorithm has been extensively used for analysis and classification problems. For this kind of problem, datasets are becoming larger and larger, and it is necessary to speed up SOM learning. In this paper we present an application of the Simulated Annealing (SA) procedure to the SOM learning algorithm. The goal of the algorithm is to obtain fast learning and better performance in terms of matching of input data and regularity of the obtained map. An advantage of the proposed technique is that it preserves the simplicity of the basic algorithm. Several tests, carried out on different large datasets, demonstrate the effectiveness of the proposed algorithm in comparison with the original SOM and with some of its modifications introduced to speed up the learning.


International Conference on Engineering Applications of Neural Networks | 2013

Analysis of DNA Barcode Sequences Using Neural Gas and Spectral Representation

Antonino Fiannaca; Massimo La Rosa; Riccardo Rizzo; Alfonso Urso

In this paper we present an application of the neural gas network to the classification of DNA barcode sequences. The proposed method is based on the identification of distinctive words, extracted from the spectral representation of DNA sequences. In particular, we calculated the “signatures” that are characteristic of the DNA sequence at different taxonomic levels. In order to demonstrate the efficacy of the proposed method, we tested it on 10 real barcode datasets belonging to different animal species, provided by the online resource Barcode of Life Database (BOLD).
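
The core of a neural gas network is its rank-based adaptation rule, sketched below for a single input vector. This is the generic textbook update, not the classification pipeline of the paper; the parameter names `eps` (step size) and `lam` (neighbourhood range) and the function name are assumptions.

```python
import math


def neural_gas_step(units, x, eps, lam):
    """Apply one neural gas update for input x and return the new units.

    Units are ranked by distance to x; the unit at rank r moves toward x
    with strength eps * exp(-r / lam), so closer units move more.
    """
    order = sorted(
        range(len(units)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(units[i], x)),
    )
    updated = [list(u) for u in units]
    for rank, i in enumerate(order):
        h = math.exp(-rank / lam)
        updated[i] = [w + eps * h * (v - w) for w, v in zip(units[i], x)]
    return updated
```

In a sequence-classification setting, `x` would be the numeric vector obtained from the spectral (k-mer frequency) representation of a sequence.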


Computational Intelligence in Bioinformatics and Computational Biology | 2012

An ontology design methodology for Knowledge-Based systems with application to bioinformatics

Antonino Fiannaca; Massimo La Rosa; Salvatore Gaglio

Ontologies are formal knowledge representation models. Knowledge organization is a fundamental requirement for developing Knowledge-Based systems. In this paper we present the Data-Problem-Solver (DPS) approach, a new ontological paradigm that allows the knowledge designer to model and represent a Knowledge Base (KB) for expert systems. Our approach clearly distinguishes among the knowledge about a problem to solve (answering the “what to do” question), the solver method used to solve it (answering the “how to do” question) and the type of input data required (answering the “what I need” question). The main purpose of the proposed paradigm is to facilitate the generalization of the application domain and the modularity and expandability of the represented knowledge. The proposed DPS ontological approach is applied to the modelling of the knowledge about a bioinformatics application scenario: the extraction of protein complexes from a protein-protein interaction network.


Computational Intelligence Methods for Bioinformatics and Biostatistics | 2015

A Deep Learning Approach to DNA Sequence Classification

Riccardo Rizzo; Antonino Fiannaca; Massimo La Rosa; Alfonso Urso

Deep learning neural networks are capable of extracting significant features from raw data and of using these features for classification tasks. In this work we present a deep learning neural network for DNA sequence classification based on spectral sequence representation. The framework is tested on a dataset of 16S genes, and its performance, in terms of accuracy and F1 score, is compared to that of the General Regression Neural Network, already tested on a similar problem, as well as naive Bayes, random forest and support vector machine classifiers. The obtained results demonstrate that the deep learning approach outperformed all the other classifiers when classifying small sequence fragments 500 bp long.


Computational Intelligence Methods for Bioinformatics and Biostatistics | 2013

Genomic Sequence Classification Using Probabilistic Topic Modeling

Massimo La Rosa; Antonino Fiannaca; Riccardo Rizzo; Alfonso Urso

Taxonomic classification of genomic sequences is usually based on evolutionary distance obtained by alignment. In this work we introduce a novel alignment-free classification approach based on probabilistic topic modeling. Using a k-mer (small fragments of length k) decomposition of DNA sequences and the Latent Dirichlet Allocation algorithm, we built a classifier for 16S rRNA bacterial gene sequences. We tested our method with a tenfold cross-validation procedure on a bacteria dataset of 3000 elements belonging to the most numerous bacterial phyla: Actinobacteria, Firmicutes and Proteobacteria. Experiments were carried out using complete and 400 bp long 16S sequences, in order to test the robustness of the proposed methodology. Our results, in terms of precision scores and for different numbers of topics, range from 100% at class level to 77% at genus level, for both full and 400 bp length, considering k-mers of length 8. These results demonstrate the effectiveness of the proposed approach.


Computational Intelligence Methods for Bioinformatics and Biostatistics | 2012

A Study of Compression–Based Methods for the Analysis of Barcode Sequences

Massimo La Rosa; Antonino Fiannaca; Riccardo Rizzo; Alfonso Urso

This paper introduces a new methodology for the analysis of barcode sequences. Barcode DNA is a very short nucleotide sequence, corresponding for the animal kingdom to the mitochondrial gene cytochrome c oxidase subunit 1, that acts as a unique element for identification and taxonomic purposes. Traditional barcode analysis uses well-established bioinformatics techniques such as sequence alignment, computation of evolutionary distances and phylogenetic trees. The proposed alignment-free approach consists of using two different compression-based approximations of the Universal Similarity Metric in order to compute dissimilarity matrices among barcode sequences of 20 datasets belonging to different species. From these matrices, phylogenetic trees are computed and compared, in terms of topology and branch length, with trees built from evolutionary distance. The results show high similarity values between compression-based and evolutionary-based trees, allowing us to consider the former methodology worth employing for the study of barcode sequences.

Collaboration


Dive into Antonino Fiannaca's collaboration.

Top Co-Authors

Riccardo Rizzo
National Research Council

Alfonso Urso
National Research Council

Laura La Paglia
National Research Council

Antonio Messina
National Research Council

Angelo Bonanno
National Research Council