Paola Bonizzoni | Researchain

Publication

Featured research published by Paola Bonizzoni.

Journal of Computer Science and Technology | 2003

The Haplotyping problem: an overview of computational models and solutions

Paola Bonizzoni; Gianluca Della Vedova; Riccardo Dondi; Jing Li

The investigation of genetic differences among humans has given evidence that mutations in DNA sequences are responsible for some genetic diseases. The most common mutation is the one that involves only a single nucleotide of the DNA sequence, which is called a single nucleotide polymorphism (SNP). As a consequence, computing a complete map of all SNPs occurring in human populations is one of the primary goals of recent studies in human genomics. The construction of such a map requires determining the DNA sequences that form all chromosomes. In diploid organisms like humans, each chromosome consists of two sequences called haplotypes. Distinguishing the information contained in both haplotypes when analyzing chromosome sequences poses several new computational issues, which collectively form an emerging topic of Computational Biology known as Haplotyping. This paper is a comprehensive study of some new combinatorial approaches proposed in this research area, and it mainly focuses on the formulations and algorithmic solutions of some basic biological problems. Three statistical approaches are briefly discussed at the end of the paper.

Theoretical Computer Science | 2001

The complexity of multiple sequence alignment with SP-score that is a metric

Paola Bonizzoni; Gianluca Della Vedova

This paper analyzes the computational complexity of computing the optimal alignment of a set of sequences under the sum of all pairs (SP) score scheme. We solve an open question by showing that the problem is NP-complete in the very restricted case in which the sequences are over a binary alphabet and the score is a metric. This result establishes the intractability of multiple sequence alignment under a score function of mathematical interest, which has indeed received much attention in biological sequence comparison.
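To make the objective concrete, here is a minimal sketch of how the sum-of-pairs score of a given alignment can be computed. The names `sp_score` and `unit_metric` are illustrative, and the discrete 0/1 distance on the symbols `0`, `1` and the gap `-` is an assumed example of a metric, not necessarily the one used in the paper.

```python
from itertools import combinations


def unit_metric(a, b):
    """Discrete metric (assumed for illustration): 0 if equal, 1 otherwise."""
    return 0 if a == b else 1


def sp_score(alignment, dist):
    """Sum, over all columns and all pairs of rows, of dist(symbol, symbol)."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += dist(a, b)
    return total


# Three binary sequences aligned with gaps ('-').
rows = ["01-1", "0111", "0-11"]
print(sp_score(rows, unit_metric))
```

The sketch only evaluates one alignment. The hardness result concerns finding the alignment that minimizes this score.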

Theoretical Computer Science | 2005

Reconciling a gene tree to a species tree under the duplication cost model

Paola Bonizzoni; Gianluca Della Vedova; Riccardo Dondi

The general problem of reconciling the information from evolutionary trees representing the relationships between distinct gene families is of great importance in bioinformatics and has been popularized among computer science researchers by Ma et al. [From gene trees to species trees, SIAM J. Comput. 30(3) (2000) 729-752], where the authors pose the intriguing question of whether a certain definition of minimum tree that reconciles a gene tree and a species tree is correct. We answer this question affirmatively; moreover, we show an efficient algorithm for computing such minimum-leaf reconciliation trees and prove the uniqueness of such trees. We then tackle some different versions of the biological problem by showing that the exemplar problem, arising from the exemplar analysis of multigene genomes, is NP-hard even when the number of copies of a given label is at most two. Finally, we introduce two novel formulations for the problem of recombining evolutionary trees, extending the gene duplication problem studied in [Ma et al., From gene trees to species trees, SIAM J. Comput. 30(3) (2000) 729-752; M. Fellows et al., On the multiple gene duplication problem, in: Proc. Ninth Internat. Symp. on Algorithms and Computation (ISAAC98), 1998; R. Page, Maps between trees and cladistic analysis of historical associations among genes, Systematic Biology 43 (1994) 58-77; R.M. Page, J. Cotton, Vertebrate phylogenomics: reconciled trees and gene duplications, in: Proc. Pacific Symp. on Biocomputing 2002 (PSB2002), 2002, pp. 536-547; R. Guigo et al., Reconstruction of ancient molecular phylogeny, Mol. Phy. and Evol. 6(2) (1996) 189-213], and we give an exact algorithm (via dynamic programming) for one of these formulations.
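As background for the duplication cost model, the following sketch counts duplications with the widely used least-common-ancestor (LCA) mapping: a gene tree node is counted as a duplication when it maps to the same species tree node as one of its children. This is an illustration of that textbook convention, not the paper's algorithm for minimum-leaf reconciliation trees. The nested-tuple tree encoding and all function names are assumptions made here.

```python
def clusters(tree):
    """Leaf sets of all nodes of a tree given as nested tuples of leaf names."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            cluster = frozenset([node])
        else:
            cluster = frozenset().union(*(walk(child) for child in node))
        found.add(cluster)
        return cluster

    walk(tree)
    return found


def lca_cluster(species_clusters, leaf_set):
    """Smallest species tree cluster containing leaf_set (its LCA)."""
    return min((c for c in species_clusters if leaf_set <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Number of gene tree nodes mapped to the same species node as a child."""
    species_clusters = clusters(species_tree)
    duplications = 0

    def walk(node):
        nonlocal duplications
        if isinstance(node, str):  # a gene leaf, labeled by its species
            return lca_cluster(species_clusters, frozenset([node]))
        child_maps = [walk(child) for child in node]
        here = lca_cluster(species_clusters, frozenset().union(*child_maps))
        if here in child_maps:
            duplications += 1
        return here

    walk(gene_tree)
    return duplications


species = (("a", "b"), "c")
genes = (("a", "b"), ("a", "c"))  # two copies of the gene in species a
print(count_duplications(genes, species))
```

In the example, only the root of the gene tree is counted, because it and its right child both map to the root of the species tree.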

Bioinformatics | 2008

ASPicDB: A database resource for alternative splicing analysis

Tiziana Castrignanò; Mattia D'Antonio; Anna Anselmo; Danilo Carrabino; A. D'Onorio De Meo; Anna Maria D'Erchia; Flavio Licciulli; Marina Mangiulli; Flavio Mignone; Giulio Pavesi; Ernesto Picardi; Alberto Riva; Raffaella Rizzi; Paola Bonizzoni

MOTIVATION: Alternative splicing has recently emerged as a key mechanism responsible for the expansion of transcriptome and proteome complexity in human and other organisms. Although several online resources devoted to alternative splicing analysis are available, they may suffer from limitations, related both to the computational methodologies adopted and to the extent of the annotations they provide, that prevent the full exploitation of the available data. Furthermore, current resources provide limited query and download facilities. RESULTS: ASPicDB is a database designed to provide access to reliable annotations of the alternative splicing pattern of human genes and to the functional annotation of predicted splicing isoforms. Splice-site detection and full-length transcript modeling have been carried out by a genome-wide application of the ASPic algorithm, based on the multiple alignments of gene-related transcripts (typically a Unigene cluster) to the genomic sequence, a strategy that greatly improves prediction accuracy compared to methods based on independent and progressive alignments. Enhanced query and download facilities for annotations and sequences allow users to select and extract specific sets of data related to genes, transcripts and introns fulfilling a combination of user-defined criteria. Several tabular and graphical views of the results are presented, providing a comprehensive assessment of the functional implication of alternative splicing in the gene set under investigation. ASPicDB, which is updated monthly, also includes information on tissue-specific splicing patterns of normal and cancer cells, based on available EST sequences and their library source annotation. AVAILABILITY: www.caspur.it/ASPicDB

BMC Bioinformatics | 2005

ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences

Paola Bonizzoni; Raffaella Rizzi

Background: Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems, hence the need to develop novel strategies. Results: We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid limitations of splice site prediction (mainly, over-predictions) due to independent single EST alignments to the genomic sequence, our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment, and it adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Conclusion: Extensive benchmarking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.

International Workshop on Combinatorial Algorithms | 2010

Parameterized complexity of k-anonymity: hardness and tractability

Paola Bonizzoni; Gianluca Della Vedova; Riccardo Dondi; Yuri Pirola

The problem of publishing personal data without giving up privacy is becoming increasingly important. A precise formalization that has recently been proposed is k-anonymity, where the rows of a table are partitioned into clusters of size at least k and all rows in a cluster become the same tuple after the suppression of some entries. The natural optimization problem, where the goal is to minimize the number of suppressed entries, is hard even when the stored values are over a binary alphabet or the table consists of a bounded number of columns. In this paper we study how the complexity of the problem is influenced by different parameters. First we show that the problem is W[1]-hard when parameterized by the value of the solution (and k). Then we exhibit a fixed-parameter algorithm when the problem is parameterized by the number of columns and the number of different values in any column. Finally, we prove that k-anonymity is still APX-hard even when restricting to instances with 3 columns and k=3.
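The objective described above can be illustrated by evaluating a given partition. In the sketch below, a column is suppressed in every row of a cluster whenever the rows of that cluster disagree on it, so that all rows of the cluster become identical. The function name `suppression_cost` and the list-of-tuples table encoding are assumptions made for this example.

```python
def suppression_cost(table, partition, k):
    """Number of suppressed entries needed to make each cluster uniform.

    table: list of equal-length tuples (the rows).
    partition: list of clusters, each a list of row indices.
    """
    covered = sorted(i for cluster in partition for i in cluster)
    if covered != list(range(len(table))):
        raise ValueError("partition must cover every row exactly once")
    cost = 0
    for cluster in partition:
        if len(cluster) < k:
            raise ValueError("every cluster must contain at least k rows")
        rows = [table[i] for i in cluster]
        # Columns on which the rows of this cluster disagree.
        mixed_columns = sum(1 for col in zip(*rows) if len(set(col)) > 1)
        cost += mixed_columns * len(cluster)
    return cost


table = [(0, 1, 1), (0, 1, 0), (1, 0, 0), (1, 0, 0)]
print(suppression_cost(table, [[0, 1], [2, 3]], k=2))
print(suppression_cost(table, [[0, 1, 2, 3]], k=2))
```

Comparing the two calls shows why the choice of partition matters: on this table, grouping similar rows suppresses far fewer entries than putting all rows in one cluster.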

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2007

Exemplar Longest Common Subsequence

Paola Bonizzoni; Gianluca Della Vedova; Riccardo Dondi; Guillaume Fertin; Raffaella Rizzi; Stéphane Vialette

In this paper, we investigate the computational and approximation complexity of the Exemplar Longest Common Subsequence (ELCS) of a set of sequences (ELCS problem), a generalization of the Longest Common Subsequence problem, where the input sequences are over the union of two disjoint sets of symbols, a set of mandatory symbols and a set of optional symbols. We show that different versions of the problem are APX-hard even for instances with two sequences. Moreover, we show that the related problem of determining the existence of a feasible solution of the ELCS of two sequences is NP-hard. On the positive side, we first present an efficient algorithm for the ELCS problem over instances of two sequences where each mandatory symbol can appear in total at most three times in the sequences. Furthermore, we present two fixed-parameter algorithms for the ELCS problem over instances of two sequences where the parameter is the number of mandatory symbols.
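The feasibility question mentioned above can be made concrete with a small checker. The abstract refers to several versions of the problem without spelling them out, so the sketch below assumes one possible constraint: every mandatory symbol must occur exactly once in the candidate, and optional symbols are unrestricted. The function names are illustrative.

```python
from collections import Counter


def is_subsequence(s, t):
    """True if s can be obtained from t by deleting symbols."""
    remaining = iter(t)
    return all(symbol in remaining for symbol in s)


def is_feasible_exemplar(candidate, sequences, mandatory):
    """Check the assumed constraint: common subsequence of all inputs that
    contains each mandatory symbol exactly once."""
    counts = Counter(candidate)
    if any(counts[m] != 1 for m in mandatory):
        return False
    return all(is_subsequence(candidate, seq) for seq in sequences)


# Uppercase letters play the role of mandatory symbols in this example.
sequences = ["AxByA", "xABy"]
mandatory = {"A", "B"}
for candidate in ["AB", "ABy", "xAB", "xy"]:
    print(candidate, is_feasible_exemplar(candidate, sequences, mandatory))
```

Checking a candidate is easy. The hardness results concern deciding whether any feasible candidate exists and finding a longest one.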

Nucleic Acids Research | 2006

ASPIC: a web resource for alternative splicing prediction and transcript isoforms characterization

Tiziana Castrignanò; Raffaella Rizzi; Ivano Giuseppe Talamo; Paolo D'Onorio De Meo; Anna Anselmo; Paola Bonizzoni

Alternative splicing (AS) is now emerging as a major mechanism contributing to the expansion of the transcriptome and proteome complexity of multicellular organisms. The fact that a single gene locus may give rise to multiple mRNAs and protein isoforms, showing both major and subtle structural variations, is an exceptionally versatile tool in the optimization of the coding capacity of the eukaryotic genome. The huge and continuously increasing number of genome and transcript sequences provides an essential information source for the computational detection of the AS patterns of genes. However, much of this information is not optimally or comprehensively used in gene annotation by current genome annotation pipelines. We present here a web resource implementing the ASPIC algorithm, which we developed previously, for the investigation of AS of user-submitted genes, based on comparative analysis of available transcript and genome data from a variety of species. The ASPIC web resource provides graphical and tabular views of the splicing patterns of all full-length mRNA isoforms compatible with the detected splice sites of genes under investigation, as well as relevant structural and functional annotation. The ASPIC web resource, available at http://www.caspur.it/ASPIC/, is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility.

ACM Symposium on Applied Computing | 2001

An approximation algorithm for the shortest common supersequence problem: an experimental analysis

Paolo Barone; Paola Bonizzoni; Gianluca Della Vedova; Giancarlo Mauri

In this paper an approximation algorithm, called ReduceExpand, for the Shortest Common Supersequence (SCS) problem is presented and its behavior is studied experimentally. While the guaranteed approximation ratio of ReduceExpand matches that of the best known algorithm, our algorithm clearly outperforms that algorithm with respect to the length of the approximate solution.
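The abstract does not describe ReduceExpand itself, so the following sketch only illustrates the problem being approximated. It checks whether a string is a common supersequence of a set of sequences and computes the exact SCS length for the special case of two sequences with the textbook dynamic program. All names are illustrative.

```python
def is_subsequence(s, t):
    """True if s can be obtained from t by deleting symbols."""
    remaining = iter(t)
    return all(symbol in remaining for symbol in s)


def is_common_supersequence(candidate, sequences):
    """True if every input sequence is a subsequence of candidate."""
    return all(is_subsequence(seq, candidate) for seq in sequences)


def scs_length_two(a, b):
    """Length of a shortest common supersequence of two sequences."""
    # dp[i][j] = SCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = min(dp[i - 1][j], dp[i][j - 1]) + 1
    return dp[len(a)][len(b)]


print(is_common_supersequence("cabac", ["abac", "cab"]))
print(scs_length_two("abac", "cab"))
```

The two-sequence case is solvable exactly in polynomial time. Approximation algorithms become relevant when the number of input sequences is not fixed.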

Discrete Applied Mathematics | 2001

Experimenting an approximation algorithm for the LCS

Paola Bonizzoni; Gianluca Della Vedova; Giancarlo Mauri

The problem of finding the longest common subsequence (lcs) of a given set of sequences over an alphabet Σ occurs in many interesting contexts, such as data compression and molecular biology, in order to measure the “similarity degree” among biological sequences. Since the problem is NP-complete in its decision version (i.e. does there exist an lcs of length at least k, for a given k?) even over a fixed alphabet, polynomial algorithms which give approximate solutions have been proposed. Among them, Long Run (LR) is the only one with guaranteed constant performance ratio. In this paper, we give a new approximation algorithm for the longest common subsequence problem: the Expansion Algorithm (EA). First of all, we prove that the solution found by the Expansion Algorithm is always at least as good as the one found by LR. Then we report the results of an experimentation with two different groups of instances, which show that EA clearly outperforms Long Run in practice.
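For context on the baseline, Long Run is usually described in the literature as returning the longest common subsequence that consists of a single repeated symbol. The sketch below follows that description as an assumption. The abstract does not define LR, and the Expansion Algorithm is not reproduced here. The function name `long_run` is illustrative.

```python
from collections import Counter


def long_run(sequences):
    """Longest string of one repeated symbol that is a common subsequence.

    A run of symbol x with length m is a common subsequence exactly when
    every input sequence contains x at least m times.
    """
    counts = [Counter(seq) for seq in sequences]
    alphabet = set().union(*(set(seq) for seq in sequences))
    best_symbol, best_length = "", 0
    for symbol in sorted(alphabet):  # sorted for a deterministic tie-break
        run = min(c[symbol] for c in counts)
        if run > best_length:
            best_symbol, best_length = symbol, run
    return best_symbol * best_length


print(long_run(["abab", "baba", "aabb"]))
```

Under this description the output is at least 1/|Σ| times the length of an optimal solution, since some symbol must account for at least that fraction of any common subsequence. This is consistent with a constant ratio over a fixed alphabet.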
