Nicola Cannata | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Nicola Cannata is active.

Explore More

Publication

Featured researches published by Nicola Cannata.

PLOS Computational Biology | 2005

Time to organize the bioinformatics resourceome.

Nicola Cannata; Emanuela Merelli; Russ B. Altman

T he field of bioinformatics has blossomed in the last ten years, and as a result, there is a large and increasing number of researchers generating computational tools for solving problems relevant to biology. Because the number of artifacts has increased greatly, it is impossible for many bioinformatics researchers to track tools, databases, and methods in the field—or even perhaps within their own specialty area. More critically, however, biologist users and scientists approaching the field do not have a comprehensive index of bioinformatics algorithms, databases, and literature annotated with information about their context and appropriate use. We suggest that the full set of bioinformatics resources—the ‘‘resourceome’’—should be explicitly characterized and organized. A hierarchical and machine-understandable organization of the field, along with rich cross-links (an ontology!) would be a useful start. It is likely that a distributed development approach would be required so that those with focused expertise can classify resources in their area, while providing the metadata that would allow easier access to useful existing resources. The growth of bioinformatics can be quantified in many ways. The Intelligent Systems for Molecular Biology Meeting began in 1993, and numerous other meetings have been established. The International Society for Computational Biology (ISCB) was formed in 1995, and recent membership numbers have reached 2,000. The field has gone from having one or two journals to having more than a dozen—if one considers ‘‘-omics’’ (i.e., subjects relating to high-throughput functional genomics, where computation plays a central role) and the emerging field of systems biology. Because bioinformatics has a strong element of engineering, the creation and maintenance of tools provide value only insofar as they are used. These tools may be databases that hold biological data, or they may be algorithms that act on this data to draw inferences. Access to these artifacts is currently uneven. Of course, the published literature is the archival resting place for the initial description of these innovations, but it only contains a snapshot of most tools early in their lifetime. The literature does not use any standard classification system to describe tools, so the sensitivity of searches for specific functions is not generally high. Indeed, the bibliome itself is idiosyncratically organized, and finding the right article is often like searching for a needle in a haystack [2]. Finally, the published literature does not contain reliable references to the location and to the availability of most bioinformatics resources [3,4]. One could also argue that Google (http:// www.google.com) provides adequate access to tools based on keyword searching [5]. However, the lack of standard terms makes sensitive and specific searches difficult. In addition, most search hits confound papers, Web sites, tools, departments, and people in a manner that makes extracting useful information very difficult. Recognizing this limitation, there have been some grassroots attempts to organize the bioinformatics resourceome. Among the most famous are the ‘‘archaeological’’ Pedro’s List—a list of computer tools for molecular biologists (http://www.public.iastate.edu/;pedro/ research_tools.html)—and the Expasy Life Sciences Directory, formerly known as the Amos’s WWW links page (http://www.expasy.org/links.html). The Bioinformatics Links Directory (http://www.bioinformatics.ubc.ca/resources/ links_directory/) today contains more than 700 curated links to bioinformatics resources, organized into eleven main categories, including all the databases and Web servers yearly listed in the dedicated Nucleic Acids Research special issues [6]. The National Center for Biotechnology Institute has tried to make access to its suite of tools transparent, with moderate success. Many Web sites can be found listing ‘‘useful sites,’’ especially concerning special interest or limited topics (e.g., microarrays, text mining, and gene regulation). But all of these efforts are limited by the difficulty in maintaining currency and by the lack of a uniformly recognized classification scheme. Yet our colleagues in bioinformatics and biology are constantly asking about the availability of tools or databases with certain characteristics. The lack of a useful index, thus, routinely costs time and opportunities. In addition, there is no ‘‘peer-review’’ system for bioinformatics tools so that the most useful ones can be highlighted by happy users. A secure and reliable system for rating (similar to that used by Amazon.com, for example) would also be an important prerequisite. An ‘‘ontology’’ is a specification of a conceptual space, often used by computer programs. The field of ontology

Bioinformatics | 2002

Simplifying amino acid alphabets by means of a branch and bound algorithm and substitution matrices

Nicola Cannata; Stefano Toppo; Chiara Romualdi; Giorgio Valle

MOTIVATION Protein and DNA are generally represented by sequences of letters. In a number of circumstances simplified alphabets (where one or more letters would be represented by the same symbol) have proved their potential utility in several fields of bioinformatics including searching for patterns occurring at an unexpected rate, studying protein folding and finding consensus sequences in multiple alignments. The main issue addressed in this paper is the possibility of finding a general approach that would allow an exhaustive analysis of all the possible simplified alphabets, using substitution matrices like PAM and BLOSUM as a measure for scoring. RESULTS The computational approach presented in this paper has led to a computer program called AlphaSimp (Alphabet Simplifier) that can perform an exhaustive analysis of the possible simplified amino acid alphabets, using a branch and bound algorithm together with standard or user-defined substitution matrices. The program returns a ranked list of the highest-scoring simplified alphabets. When the extent of the simplification is limited and the simplified alphabets are maintained above ten symbols the program is able to complete the analysis in minutes or even seconds on a personal computer. However, the performance becomes worse, taking up to several hours, for highly simplified alphabets. AVAILABILITY AlphaSimp and other accessory programs are available at http://bioinformatics.cribi.unipd.it/alphasimp

Bioinformatics | 2005

RAP: a new computer program for de novo identification of repeated sequences in whole genomes

Davide Campagna; Chiara Romualdi; Nicola Vitulo; Micky Del Favero; Matej Lexa; Nicola Cannata; Giorgio Valle

MOTIVATION DNA repeats are a common feature of most genomic sequences. Their de novo identification is still difficult despite being a crucial step in genomic analysis and oligonucleotides design. Several efficient algorithms based on word counting are available, but too short words decrease specificity while long words decrease sensitivity, particularly in degenerated repeats. RESULTS The Repeat Analysis Program (RAP) is based on a new word-counting algorithm optimized for high resolution repeat identification using gapped words. Many different overlapping gapped words can be counted at the same genomic position, thus producing a better signal than the single ungapped word. This results in better specificity both in terms of low-frequency detection, being able to identify sequences repeated only once, and highly divergent detection, producing a generally high score in most intron sequences. AVAILABILITY The program is freely available for non-profit organizations, upon request to the authors. CONTACT [email protected] SUPPLEMENTARY INFORMATION The program has been tested on the Caenorhabditis elegans genome using word lengths of 12, 14 and 16 bases. The full analysis has been implemented in the UCSC Genome Browser and is accessible at http://genome.cribi.unipd.it.

Transactions on Computational Systems Biology | 2005

An agent-oriented conceptual framework for systems biology

Nicola Cannata; Flavio Corradini; Emanuela Merelli; Andrea Omicini; Alessandro Ricci

Recently, a collective effort from multiple research areas has been made to understand biological systems at the system level. On the one hand, for instance, researchers working on systems biology aim at understanding how living systems routinely perform complex tasks. On the other hand, bioscientists involved in pharmacogenomics strive to study how an individuals genetic inheritance affects the bodys response to drugs. Among the many things, research in the above disciplines requires the ability to simulate particular biological systems as cells, organs, organisms and communities. When observed according to the perspective of system simulation, biological systems are complex ones, and consist of a set of components interacting with each other and with an external (dynamic) environment. In this work, we propose an alternative way to specify and model complex systems based on behavioral modelling. We consider a biological system as a set of active computational components interacting in a dynamic and often unpredictable environment. Then, we propose a conceptual framework for engineering computational systems simulating the behaviour of biological systems, and modelling them in terms of agents and agent societies.

BMC Bioinformatics | 2008

A Semantic Web for bioinformatics: goals, tools, systems, applications

Nicola Cannata; Michael Schröder; Roberto Marangoni; Paolo Romano

Network Tools and Applications in Biology (NETTAB) [1] is a series of workshops focused on the most promising and innovative Information and Communication Technologies (ICT) tools and to their usefulness in Bioinformatics. These workshops aim at introducing participants to innovative network standards and technologies that are being applied to the biology field. To this end, each year a special emphasis is given to a focus theme. Workshops also include special sessions devoted both to the general theme of the series of workshops, i.e. “Network Tools and Applications in Biology”, and to further topics selected by local organizers. Biological data integration issues were already discussed in previous editions of this series of workshops, including topics such as “CORBA and XML: towards a bioinformatics integrated network environment” (NETTAB 2001) [2], “Agents in Bioinformatics” (NETTAB 2002) [3], “Workflows management: new abilities for the biological information overflow” (NETTAB 2005) [4] and “Distributed Applications, Web Services, Tools and GRID Infrastructures for Bioinformatics” (NETTAB 2006) [5,6]. The Seventh NETTAB workshop was held at the Computer Science Department of the University of Pisa, on June 12-15, 2007, having “A Semantic Web for Bioinformatics: Goals, Tools, Systems, Applications” as focus theme. Adjunct themes were “Algorithms in bioinformatics” and “Formal Methods for Systems Biology”. This BMC Bioinformatics Supplement includes the best papers and posters – representing all the themes - from works presented at the workshop.

Yeast | 2001

Characterization of 16 novel human genes showing high similarity to yeast sequences

Fabio Stanchi; Emanuela Bertocco; Stefano Toppo; Rosario Dioguardi; Barbara Simionati; Nicola Cannata; Rosanna Zimbello; Gerolamo Lanfranchi; Giorgio Valle

The entire set of open reading frames (ORFs) of Saccharomyces cerevisiae has been used to perform systematic similarity searches against nucleic acid and protein databases: with the aim of identifying interesting homologies between yeast and mammalian genes. Many similarities were detected: mostly with known genes. However: several yeast ORFs were only found to match human partial sequence tags: indicating the presence of human transcripts still uncharacterized that have a homologous counterpart in yeast. About 30 such transcripts were further studied and named HUSSY (human sequence similar to yeast). The 16 most interesting are presented in this paper along with their sequencing and mapping data. As expected: most of these genes seem to be involved in basic metabolic and cellular functions (lipoic acid biosynthesis: ribulose‐5‐phosphate‐3‐epimerase: glycosyl transferase: β‐transducin: serine‐threonine‐kinase: ABC proteins: cation transporters). Genes related to RNA maturation were also found (homologues to DIM1: ROK1‐RNA‐elicase and NFS1). Furthermore: five novel human genes were detected (HUSSY‐03: HUSSY‐22: HUSSY‐23: HUSSY‐27: HUSSY‐29) that appear to be homologous to yeast genes whose function is still undetermined. More information on this work can be obtained at the website http://grup.bio.unipd.it/hussy Copyright

Automated Experimentation | 2010

Combining ontologies and workflows to design formal protocols for biological laboratories

Alessandro Maccagnan; Mauro Riva; Erika Feltrin; Barbara Simionati; Tullio Vardanega; Giorgio Valle; Nicola Cannata

BackgroundLaboratory protocols in life sciences tend to be written in natural language, with negative consequences on repeatability, distribution and automation of scientific experiments. Formalization of knowledge is becoming popular in science. In the case of laboratory protocols two levels of formalization are needed: one for the entities and individuals operations involved in protocols and another one for the procedures, which can be manually or automatically executed. This study aims to combine ontologies and workflows for protocol formalization.ResultsA laboratory domain specific ontology and the COW (Combining Ontologies with Workflows) software tool were developed to formalize workflows built on ontologies. A method was specifically set up to support the design of structured protocols for biological laboratory experiments. The workflows were enhanced with ontological concepts taken from the developed domain specific ontology.The experimental protocols represented as workflows are saved in two linked files using two standard interchange languages (i.e. XPDL for workflows and OWL for ontologies). A distribution package of COW including installation procedure, ontology and workflow examples, is freely available from http://www.bmr-genomics.it/farm/cow.ConclusionsUsing COW, a laboratory protocol may be directly defined by wet-lab scientists without writing code, which will keep the resulting protocols specifications clear and easy to read and maintain.

International Journal of Modelling, Identification and Control | 2008

Multiagent modelling and simulation of carbohydrate oxidation in cell

Nicola Cannata; Flavio Corradini; Emanuela Merelli

A cell consists of a large number of components interacting in a dynamic environment. The complexity of interactions among cell components makes design of cell simulation a challenging task. Multiagent systems can be considered a suitable framework for modelling and engineering complex systems organised as autonomous interactive components. The multiagent paradigm proved natural and useful not only in building simulations, but also for devising an engineering methodology. To evaluate the proposed approach, we constructed a multiagent model of cell components involved in the metabolic pathways of Carbohydrate Oxidation (CO).

International Journal of Environmental Research and Public Health | 2015

SaDA: From Sampling to Data Analysis—An Extensible Open Source Infrastructure for Rapid, Robust and Automated Management and Analysis of Modern Ecological High-Throughput Microarray Data

Kumar Saurabh Singh; Dominique Thual; Roberto Spurio; Nicola Cannata

One of the most crucial characteristics of day-to-day laboratory information management is the collection, storage and retrieval of information about research subjects and environmental or biomedical samples. An efficient link between sample data and experimental results is absolutely important for the successful outcome of a collaborative project. Currently available software solutions are largely limited to large scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such LIMS indeed can bring laboratory information management to a higher level, but most of the times this requires a sufficient investment of money, time and technical efforts. There is a clear need for a light weighted open source system which can easily be managed on local servers and handled by individual researchers. Here we present a software named SaDA for storing, retrieving and analyzing data originated from microorganism monitoring experiments. SaDA is fully integrated in the management of environmental samples, oligonucleotide sequences, microarray data and the subsequent downstream analysis procedures. It is simple and generic software, and can be extended and customized for various environmental and biomedical studies.

Bioinformatics | 2003

TRAIT (TRAnscript Integrated Table): a knowledgebase of human skeletal muscle transcripts

Stefano Toppo; Nicola Cannata; Paolo Fontana; Chiara Romualdi; Paolo Laveder; Emanuela Bertocco; Gerolamo Lanfranchi; Giorgio Valle

TRAIT is a knowledgebase integrating information on transcripts with related data from genome, proteins, ortholog genes and diseases. It was initially built as a system to manage an EST-based gene discovery project on human skeletal muscle, which yielded over 4500 independent sequence clusters. Transcripts are annotated using automatic as well as manual procedures, linking known transcripts to public databases and unknown transcripts to tables of predicted features. Data are stored in a MySQL database. Complex queries are automatically built by means of a user-friendly web interface that allows the concurrent selection of many fields such as ontology, expression level, map position and protein domains. The results are parsed by the system and returned in a ranked order, in respect to the number of satisfied criteria.

Explore More