Stefano Beretta | Researchain

Publication

Featured research published by Stefano Beretta.

Journal of Computational Biology | 2014

Modeling alternative splicing variants from RNA-Seq data with isoform graphs.

Stefano Beretta; Paola Bonizzoni; Gianluca Della Vedova; Yuri Pirola; Raffaella Rizzi

Next-generation sequencing (NGS) technologies need new methodologies for alternative splicing (AS) analysis. Current computational methods for AS analysis from NGS data are mainly based on aligning short reads against a reference genome, while methods that do not need a reference genome are mostly underdeveloped. In this context, the main tools developed for NGS data focus on de novo transcriptome assembly (Grabherr et al., 2011; Schulz et al., 2012). Although these tools are extensively applied in biological investigations, and their results often reveal intrinsic shortcomings, a theoretical investigation of the inherent computational limits of transcriptome analysis from NGS data, when a reference genome is unknown or highly unreliable, is still missing. Moreover, we still lack methods for computing the gene structures due to AS events under the above assumptions, a problem that we start to tackle with this article. More precisely, based on the notion of isoform graph (Lacroix et al., 2008), we define a compact representation of gene structures, called splicing graph, and investigate the computational problem of building a splicing graph that is (i) compatible with NGS data and (ii) isomorphic to the isoform graph. We characterize when there is only one representative splicing graph compatible with input data, and we propose an efficient algorithmic approach to compute this graph.

Bioinformatics | 2014

Further Steps in TANGO: improved taxonomic assignment in metagenomics

Daniel Alonso-Alemany; Aurélien Barré; Stefano Beretta; Paola Bonizzoni; Macha Nikolski; Gabriel Valiente

MOTIVATION: TANGO is one of the most accurate tools for the taxonomic assignment of sequence reads. However, because of the differences in the taxonomy structures, performing a taxonomic assignment on different reference taxonomies will produce divergent results. RESULTS: We have improved the TANGO pipeline to be able to perform the taxonomic assignment of a metagenomic sample using alternative reference taxonomies, coming from different sources. We highlight the novel pre-processing step, necessary to accomplish this task, and describe the improvements in the assignment process. We present the new TANGO pipeline in detail and, finally, we show its performance on four real metagenomic datasets and also on synthetic datasets. AVAILABILITY: The new version of TANGO, including implementation improvements and novel developments to perform the assignment on different reference taxonomies, is freely available at http://sourceforge.net/projects/taxoassignment/.

Operations Research Letters | 2013

A hybrid genetic algorithm for the repetition free longest common subsequence problem

Mauro Castelli; Stefano Beretta; Leonardo Vanneschi

Computing the longest common subsequence of two sequences is one of the most studied algorithmic problems. In this work we focus on a particular variant of the problem, called repetition free longest common subsequence (RF-LCS), which has been proved to be NP-hard. We propose a hybrid genetic algorithm, which combines standard genetic algorithms and estimation of distribution algorithms, to solve this problem. An experimental comparison with some well-known approximation algorithms shows the suitability of the proposed technique.
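To make the RF-LCS problem statement concrete, here is a minimal sketch, assuming the usual definition: the longest sequence that is a subsequence of both inputs and in which every symbol occurs at most once. It is an exhaustive search for tiny inputs only, not the hybrid genetic algorithm of the paper, and the function names are hypothetical.

```python
from itertools import combinations


def is_subsequence(candidate, sequence):
    """Return True if `candidate` can be read left to right inside `sequence`."""
    remaining = iter(sequence)
    # `symbol in remaining` consumes the iterator up to the first match,
    # so the order of symbols is respected.
    return all(symbol in remaining for symbol in candidate)


def rf_lcs_bruteforce(a, b):
    """Exhaustive RF-LCS of two strings (exponential time, illustration only)."""
    for length in range(min(len(a), len(b)), 0, -1):
        for positions in combinations(range(len(a)), length):
            candidate = [a[i] for i in positions]
            repetition_free = len(set(candidate)) == length
            if repetition_free and is_subsequence(candidate, b):
                return "".join(candidate)
    return ""


if __name__ == "__main__":
    # The plain LCS of these two strings is "abab" (length 4);
    # forbidding repeated symbols limits the answer to length 2.
    print(rf_lcs_bruteforce("abab", "abab"))
```

The exponential running time of this search is consistent with the NP-hardness mentioned above, which is why heuristic approaches such as genetic algorithms are of interest.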

Nature Communications | 2017

HIV-1-mediated insertional activation of STAT5B and BACH2 trigger viral reservoir in T regulatory cells

Daniela Cesana; Francesca R. Santoni de Sio; Laura Rudilosso; Pierangela Gallina; Andrea Calabria; Stefano Beretta; Ivan Merelli; Elena Bruzzesi; Laura Passerini; Silvia Nozza; Elisa Vicenzi; Guido Poli; Silvia Gregori; Giuseppe Tambussi; Eugenio Montini

HIV-1 insertions targeting BACH2 or MKL2 are enriched and persist for decades in hematopoietic cells from patients under combination antiretroviral therapy. However, it is unclear how these insertions provide such selective advantage to infected cell clones. Here, we show that in 30/87 (34%) patients under combination antiretroviral therapy, BACH2 and STAT5B are activated by insertions triggering the formation of mRNAs that contain viral sequences fused by splicing to their first protein-coding exon. These chimeric mRNAs, predicted to express full-length proteins, are enriched in T regulatory and T central memory cells, but not in other T lymphocyte subsets or monocytes. Overexpression of BACH2 or STAT5B in primary T regulatory cells increases their proliferation and survival without compromising their function. Hence, we provide evidence that HIV-1-mediated insertional activation of BACH2 and STAT5B favors the persistence of a viral reservoir in T regulatory cells in patients under combination antiretroviral therapy. HIV insertions in hematopoietic cells are enriched in BACH2 or MKL2 genes, but the selective advantages conferred are unknown. Here, the authors show that BACH2 and additionally STAT5B are activated by viral insertions, generating chimeric mRNAs specifically enriched in T regulatory cells favoring their persistence.

Theoretical Computer Science | 2016

Parameterized tractability of the maximum-duo preservation string mapping problem

Stefano Beretta; Mauro Castelli; Riccardo Dondi

In this paper we investigate the parameterized complexity of the Maximum-Duo Preservation String Mapping Problem, the complement of the Minimum Common String Partition Problem. We show that this problem is fixed-parameter tractable when parameterized by the number k of conserved duos, by first giving a parameterized algorithm based on the color-coding technique and then presenting a reduction to a kernel of size O(k^6).
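The objective being maximized can be illustrated with a small sketch, assuming the standard formulation: given two strings that are permutations of each other, a mapping sends each position of the first string to a distinct position of the second carrying the same character, and a duo (two consecutive positions) is preserved when its two positions are mapped to consecutive positions in the same order. The code below is a brute-force illustration with hypothetical function names, not the color-coding algorithm or the kernelization of the paper.

```python
from itertools import permutations


def preserved_duos(mapping):
    """Count positions i whose duo (i, i+1) is mapped to consecutive positions."""
    return sum(
        1 for i in range(len(mapping) - 1) if mapping[i + 1] == mapping[i] + 1
    )


def max_duo_preservation_bruteforce(a, b):
    """Maximum number of preserved duos over all valid mappings (tiny inputs only)."""
    if sorted(a) != sorted(b):
        raise ValueError("the two strings must be permutations of each other")
    best = 0
    for mapping in permutations(range(len(b))):
        # A mapping is valid only if it matches equal characters.
        if all(a[i] == b[j] for i, j in enumerate(mapping)):
            best = max(best, preserved_duos(mapping))
    return best


if __name__ == "__main__":
    # "abcab" splits into the blocks "abc" + "ab", which reappear in "ababc";
    # two blocks over five characters preserve 5 - 2 = 3 duos.
    print(max_duo_preservation_bruteforce("abcab", "ababc"))
```

The comment in the usage example reflects the complementarity with Minimum Common String Partition: a partition into p common blocks of strings of length n preserves n - p duos.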

Conference on Computability in Europe | 2014

Gene tree correction by leaf removal and modification: Tractability and approximability

Stefano Beretta; Riccardo Dondi

The reconciliation of a gene tree and a species tree is a well-known method to understand the evolution of a gene family in order to identify which evolutionary events (speciations, duplications and losses) occurred during gene evolution. Since reconciliation is usually affected by errors in the gene trees, they have to be preprocessed before the reconciliation process. One method to tackle this problem aims to correct a gene tree by removing the minimum number of leaves (Minimum Leaf Removal). In this paper we show that Minimum Leaf Removal is not approximable within factor b log m, where m is the number of leaves of the species tree and b > 0 is a constant. Furthermore, we introduce a new variant of the problem, where the goal is the correction of a gene tree with the minimum number of leaf modifications. We show that this problem, unlike the removal version, is W[1]-hard when parameterized by the number of leaf modifications.

Parallel, Distributed and Network-Based Processing | 2017

Low-Power Architectures for miRNA-Target Genome Wide Analysis

Stefano Beretta; Lucia Morganti; Elena Corni; Andrea Ferraro; Daniele Cesini; Daniele D'Agostino; Luciano Milanesi; Ivan Merelli

In molecular biology, the interaction mechanisms between microRNAs (miRNAs) and their messenger RNA (mRNA) targets are poorly understood. This is the reason why many miRNA-target prediction methods are available, but their results are often inconsistent. Much effort focuses on the quality of the sequence match between miRNA and target rather than on the role of the mRNA secondary structure in which the target is embedded. Nonetheless, it is known that mRNA secondary structures contribute to target recognition, because there is an energetic cost to freeing base-pairing interactions within the mRNA to make the target accessible for miRNA binding. This approach is implemented by PITA (Probability of Interaction by Target Accessibility), a computationally intensive tool that is able to provide accurate results even when little is known about the conservation of the miRNA. In this paper we propose a new implementation of PITA, called lPITA, able to exploit coarse-grained parallelism on low-power architectures to reduce both execution time and power consumption.

International Conference on Algorithms for Computational Biology | 2017

Mapping RNA-seq Data to a Transcript Graph via Approximate Pattern Matching to a Hypertext

Stefano Beretta; Paola Bonizzoni; Luca Denti; Marco Previtali; Raffaella Rizzi

Graphs are the best-suited data structure to summarize the transcript isoforms produced by a gene. Such graphs may be modeled by the notion of hypertext, that is, a graph where nodes are texts representing the exons of the gene and edges connect consecutive exons of a transcript. Mapping reads obtained by deep transcriptome sequencing to such graphs is crucial to compare reads with an annotation of transcript isoforms and to infer novel events due to alternative splicing at the exonic level.
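The hypertext model described above can be sketched as a small data structure: a dictionary of exon texts plus a list of directed edges between consecutive exons. The example below is only an illustration under simplifying assumptions: it uses exact matching (the paper addresses approximate matching), it assumes the graph is acyclic, it enumerates paths exhaustively, and all names and sequences are hypothetical.

```python
def spell_paths(exons, edges):
    """Yield (path, text) for every path of an acyclic exon graph."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)

    def extend(path, text):
        yield path, text
        for nxt in successors.get(path[-1], []):
            yield from extend(path + [nxt], text + exons[nxt])

    for node in exons:
        yield from extend([node], exons[node])


def map_read_exact(read, exons, edges):
    """Return (path, offset) pairs where `read` occurs exactly along a path.

    Only minimal paths are reported: the occurrence must start inside the
    first exon of the path and end inside the last one.
    """
    hits = []
    for path, text in spell_paths(exons, edges):
        first_len = len(exons[path[0]])
        last_start = len(text) - len(exons[path[-1]])
        start = text.find(read)
        while start != -1:
            end = start + len(read)
            if start < first_len and end > last_start:
                hits.append((tuple(path), start))
            start = text.find(read, start + 1)
    return hits


if __name__ == "__main__":
    # Toy gene with three exons; the edge e1 -> e3 models an exon-skipping isoform.
    exons = {"e1": "ACGT", "e2": "GGA", "e3": "TTC"}
    edges = [("e1", "e2"), ("e2", "e3"), ("e1", "e3")]
    print(map_read_exact("GTTT", exons, edges))
```

A read that spans the junction between e1 and e3 is found only on the path that skips e2, which is the kind of evidence for alternative splicing at the exonic level that the abstract refers to.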

Parallel, Distributed and Network-Based Processing | 2016

A Machine Learning Approach for the Integration of miRNA-Target Predictions

Stefano Beretta; Mauro Castelli; Yuliana Martínez; Luis Muñoz; Sara Silva; Leonardo Trujillo; Luciano Milanesi; Ivan Merelli

Although several computational methods have been developed for predicting interactions between miRNAs and target genes, there are substantial differences in the achieved results. For this reason, machine learning approaches are widely used for integrating the predictions obtained from different tools. In this work we adopt a method, called M3GP, which relies on a genetic programming approach, to classify results from three tools: miRanda, TargetScan, and RNAhybrid. This algorithm is highly parallelizable and its adoption provides great advantages when handling problems involving big datasets, since it is independent of the implementation and of the architecture on which it is executed. More precisely, we apply this technique to the classification of the achieved miRNA target predictions and we compare its results with those obtained with other classifiers.

Journal of Discrete Algorithms | 2015

Correcting gene tree by removal and modification

Stefano Beretta; Mauro Castelli; Riccardo Dondi

Gene tree correction with respect to a given species tree is a problem that has been recently proposed in order to better understand the evolution of gene families. One of the combinatorial methods proposed to tackle this problem aims to correct a gene tree by removing the minimum number of leaves/labels (Minimum Leaf Removal and Minimum Label Removal, respectively). The two problems have been shown to be APX-hard, and fixed-parameter tractable when parameterized by the number of leaves/labels removed. In this paper, we focus on the approximation complexity of these two problems and we show that they are not approximable within factor b log m, where m is the number of leaves of the species tree and b > 0 is a constant. Furthermore, we introduce and study two new variants of the problem, where the goal is the correction of a gene tree with the minimum number of leaf/label modifications (Minimum Leaf Modification and Minimum Label Modification, respectively). We show that the two modification problems, unlike the removal versions, are unlikely to be fixed-parameter tractable. More precisely, we prove that the Minimum Leaf Modification problem is W[1]-hard when parameterized by the number of leaf modifications, and that the Minimum Label Modification problem is W[2]-hard when parameterized by the number of label modifications.