Publication


Featured research published by Marco Previtali.


Journal of Computational Biology | 2016

LSG: An External-Memory Tool to Compute String Graphs for Next-Generation Sequencing Data Assembly

Paola Bonizzoni; Gianluca Della Vedova; Yuri Pirola; Marco Previtali; Raffaella Rizzi

The large amount of short read data that has to be assembled in future applications, such as in metagenomics or cancer genomics, strongly motivates the investigation of disk-based approaches to index next-generation sequencing (NGS) data. Positive results in this direction stimulate the investigation of efficient external memory algorithms for de novo assembly from NGS data. Our article is also motivated by the open problem of designing a space-efficient algorithm to compute a string graph using an indexing procedure based on the Burrows-Wheeler transform (BWT). We have developed a disk-based algorithm for computing string graphs in external memory: the light string graph (LSG). LSG relies on a new representation of the FM-index that allows it to use an amount of main memory that is independent of the size of the data set. Moreover, we have developed a pipeline for genome assembly from NGS data that integrates LSG with the assembly step of SGA (Simpson and Durbin, 2012), a state-of-the-art string graph-based assembler, and uses BEETL for indexing the input data. LSG is open source software and is available online. We have analyzed our implementation on an 875-million read whole-genome dataset, on which LSG has built the string graph using only 1GB of main memory (reducing the memory occupation by a factor of 50 with respect to SGA), while requiring slightly more than twice the time of SGA. The analysis of the entire pipeline shows an important decrease in memory usage, with only a moderate increase in the running time.


String Processing and Information Retrieval | 2016

Fully Dynamic de Bruijn Graphs

Djamal Belazzougui; Travis Gagie; Veli Mäkinen; Marco Previtali

We present a space- and time-efficient fully dynamic implementation of de Bruijn graphs, which can also support fixed-length jumbled pattern matching.


Latin American Symposium on Theoretical Informatics | 2016

Bidirectional Variable-Order de Bruijn Graphs

Djamal Belazzougui; Travis Gagie; Veli Mäkinen; Marco Previtali; Simon J. Puglisi

Implementing de Bruijn graphs compactly is an important problem because of their role in genome assembly. There are currently two main approaches, one using Bloom filters and the other using a kind of Burrows-Wheeler Transform on the edge labels of the graph. The second representation is more elegant and can even handle many graph-orders at once, but it does not cleanly support traversing edges backwards or inserting new nodes or edges. In this paper we resolve the first of these issues and partially address the second.
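As a rough illustration of the object being represented, a de Bruijn graph can be sketched with plain dictionaries. This is not the Bloom-filter or BWT-based representation the abstract refers to; the function name `build_de_bruijn`, the example reads, and the convention of using (k-1)-mers as nodes and k-mers as edges are assumptions made for this sketch.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph whose nodes are (k-1)-mers and edges are k-mers.

    Returns two maps: successors (forward edges) and predecessors
    (backward edges), so the graph can be walked in both directions.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            u, v = kmer[:-1], kmer[1:]  # prefix node -> suffix node
            succ[u].add(v)
            pred[v].add(u)
    return succ, pred


# Hypothetical toy input, chosen only for illustration.
succ, pred = build_de_bruijn(["ACGTAC", "GTACG"], k=3)
print(sorted(succ["AC"]))  # forward neighbours of node "AC"
print(sorted(pred["AC"]))  # backward neighbours of node "AC"
```

Storing an explicit predecessor map makes backward traversal trivial here, but it doubles the memory used; avoiding that kind of overhead is the motivation for the compact representations the abstract discusses.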


Workshop on Algorithms in Bioinformatics | 2014

Constructing String Graphs in External Memory

Paola Bonizzoni; Gianluca Della Vedova; Yuri Pirola; Marco Previtali; Raffaella Rizzi

In this paper we present an efficient external memory algorithm to compute the string graph from a collection of reads, which is a fundamental data representation used for sequence assembly.


International Conference on Algorithms for Computational Biology | 2017

Mapping RNA-seq Data to a Transcript Graph via Approximate Pattern Matching to a Hypertext

Stefano Beretta; Paola Bonizzoni; Luca Denti; Marco Previtali; Raffaella Rizzi

Graphs are the most suited data structure to summarize the transcript isoforms produced by a gene. Such graphs may be modeled by the notion of hypertext, that is a graph where nodes are texts representing the exons of the gene and edges connect consecutive exons of a transcript. Mapping reads obtained by deep transcriptome sequencing to such graphs is crucial to compare reads with an annotation of transcript isoforms and to infer novel events due to alternative splicing at the exonic level.


International Symposium on Bioinformatics Research and Applications | 2016

FSG: Fast String Graph Construction for De Novo Assembly of Reads Data

Paola Bonizzoni; Gianluca Della Vedova; Yuri Pirola; Marco Previtali; Raffaella Rizzi

The string graph for a collection of next-generation reads is a lossless data representation that is fundamental for de novo assemblers based on the overlap-layout-consensus paradigm. In this paper, we explore a novel approach to compute the string graph, based on the FM-index and Burrows-Wheeler Transform. We describe a simple algorithm that uses only the FM-index representation of the collection of reads to construct the string graph, without accessing the input reads. Our algorithm has been integrated into the SGA assembler as a standalone module to construct the string graph. The new integrated assembler has been assessed on a standard benchmark, showing that FSG is significantly faster than SGA while maintaining a moderate use of main memory, and showing practical advantages in running FSG on multiple threads.
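The abstract does not spell out the FSG algorithm itself, so the following is only a minimal sketch of the standard backward search that FM-index based methods build on, shown on a single text rather than a collection of reads. The helper names `bwt_of` and `count_occurrences` are hypothetical, and the BWT is built by naive suffix sorting, which is practical only for tiny inputs.

```python
from collections import Counter


def bwt_of(text):
    """Return the BWT of text + '$' using naive suffix sorting."""
    text += "$"  # sentinel, assumed smaller than every other character
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    return "".join(text[i - 1] for i in sa)


def count_occurrences(bwt, pattern):
    """Count occurrences of pattern in the original text via backward search."""
    # first[c] = number of characters in the text strictly smaller than c
    counts = Counter(bwt)
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]

    def rank(c, i):
        # Occurrences of c in bwt[:i]; a real FM-index answers this in
        # (near) constant time with sampled counts instead of scanning.
        return bwt[:i].count(c)

    lo, hi = 0, len(bwt)  # current suffix-array interval [lo, hi)
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + rank(c, lo)
        hi = first[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


bwt = bwt_of("ACGTACGA")
print(count_occurrences(bwt, "ACG"))
```

Note that the search reads only the BWT and the character counts, never the text itself, which is the property that lets an index-only approach avoid accessing the input reads.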


Research in Computational Molecular Biology | 2018

ASGAL: Aligning RNA-Seq Data to a Splicing Graph to Detect Novel Alternative Splicing Events

Luca Denti; Raffaella Rizzi; Stefano Beretta; Gianluca Della Vedova; Marco Previtali; Paola Bonizzoni

Background: While the reconstruction of transcripts from a sample of RNA-Seq data is a computationally expensive and complicated task, the detection of splicing events from RNA-Seq data and a gene annotation is computationally feasible. The latter task, which is adequate for many transcriptome analyses, is usually achieved by aligning the reads to a reference genome, followed by comparing the alignments with a gene annotation, often implicitly represented by a graph: the splicing graph. Results: We present ASGAL (Alternative Splicing Graph ALigner): a tool for mapping RNA-Seq data to the splicing graph, with the main goal of detecting novel alternative splicing events. ASGAL receives in input the annotated transcripts of a gene and an RNA-Seq sample, and it computes (1) the spliced alignments of each read, and (2) a list of novel events with respect to the gene annotation. Conclusions: An experimental analysis shows that, by aligning reads directly to the splicing graph, ASGAL better predicts alternative splicing events when compared to tools requiring spliced alignments of the RNA-Seq data to a reference genome. To the best of our knowledge, ASGAL is the first tool that detects novel alternative splicing events by directly aligning reads to a splicing graph. Availability: Source code, documentation, and data are available for download at http://asgal.algolab.eu.


Conference on Computability in Europe | 2018

Divide and Conquer Computation of the Multi-string BWT and LCP Array

Paola Bonizzoni; Gianluca Della Vedova; Serena Nicosia; Yuri Pirola; Marco Previtali; Raffaella Rizzi

Indexing huge collections of strings, such as those produced by the widespread sequencing technologies, heavily relies on multi-string generalizations of the Burrows-Wheeler Transform (BWT) and the Longest Common Prefix (LCP) array, since efficiently computing both is an essential ingredient of several algorithms on a collection of strings.
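To make the two structures concrete, here is a naive in-memory sketch that computes the multi-string BWT and LCP array by sorting all suffixes explicitly. It is not the divide-and-conquer method named in the title, which the abstract does not detail. The function name `multi_string_bwt_lcp` is hypothetical, and the sketch assumes each string gets its own end marker, with markers ordered by string index and not counted as matching in the LCP.

```python
def multi_string_bwt_lcp(strings):
    """Return (bwt, lcp) for a collection of strings, by explicit suffix sorting.

    Each string is terminated by '$'. Ties between identical suffixes are
    broken by string index, which mimics distinct sentinels $_0 < $_1 < ...
    """
    entries = []
    for idx, s in enumerate(strings):
        t = s + "$"
        for pos in range(len(t)):
            entries.append((t[pos:], idx, pos))
    entries.sort()

    # BWT symbol: the character preceding each suffix in its own string
    # (the sentinel for suffixes that start at position 0).
    bwt = "".join(strings[idx][pos - 1] if pos > 0 else "$"
                  for _, idx, pos in entries)

    # LCP between consecutive suffixes; sentinels never count as a match.
    lcp = [0]
    for (a, _, _), (b, _, _) in zip(entries, entries[1:]):
        n = 0
        while a[n] != "$" and a[n] == b[n]:
            n += 1
        lcp.append(n)
    return bwt, lcp


# Hypothetical toy collection, chosen only for illustration.
bwt, lcp = multi_string_bwt_lcp(["ACA", "CA"])
print(bwt)
print(lcp)
```

Materializing every suffix takes space quadratic in the string lengths, so this approach is usable only on toy inputs; that limitation is why dedicated algorithms for large collections are needed.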


Archive | 2016

Graph Theory and Definitions

Stefano Beretta; Luca Denti; Marco Previtali

Graphs are a mathematical structure composed of a set of elements and a set of connections between them. Due to their intuitive representation, graphs are widely employed in many different fields and, in particular, in systems biology and bioinformatics. In fact, they are used to model biological networks in several case studies, but also to represent data structures in different algorithmic procedures for solving bioinformatic problems. One of the key points of this widespread adoption is also related to the strong mathematical formulation behind them.


Archive | 2017

Computing the BWT and LCP Array of a Set of Strings in External Memory

Paola Bonizzoni; Gianluca Della Vedova; Yuri Pirola; Marco Previtali; Raffaella Rizzi

Collaboration


Dive into Marco Previtali's collaborations.

Top Co-Authors

Yuri Pirola

University of Milano-Bicocca

Luca Denti

University of Milano-Bicocca

Serena Nicosia

University of Milano-Bicocca

Djamal Belazzougui

Helsinki Institute for Information Technology

Travis Gagie

Diego Portales University
