
Maria Federico

University of Modena and Reggio Emilia

Publication

Featured research published by Maria Federico.

conference on web accessibility | 2012

Enhancing learning accessibility through fully automatic captioning

Maria Federico; Marco Furini

The simple act of listening or of taking notes while attending a lesson may represent an insuperable burden for millions of people with some form of disability (e.g., hearing impaired, dyslexic and ESL students). In this paper, we propose an architecture that aims at automatically creating captions for video lessons by exploiting advances in speech recognition technologies. Our approach couples the usage of off-the-shelf ASR (Automatic Speech Recognition) software with a novel caption alignment mechanism that smartly introduces unique audio markups into the audio stream before giving it to the ASR and transforms the plain transcript produced by the ASR into a timecoded transcript.

Theoretical Computer Science | 2009

Suffix tree characterization of maximal motifs in biological sequences

Maria Federico; Nadia Pisanti

Finding motifs in biological sequences is one of the most intriguing problems for string algorithm designers due to, on the one hand, the numerous applications of this problem in molecular biology and, on the other hand, the challenging aspects of the computational problem. Indeed, when dealing with biological sequences it is necessary to work with approximations (that is, to identify fragments that are not necessarily identical, but just similar, according to a given similarity notion), and this complicates the problem. Existing algorithms run in time linear with respect to the input size. Nevertheless, the output size can be very large due to the approximation (namely exponential in the approximation degree). This often makes the output unreadable, as well as slowing down the inference itself. A high degree of redundancy has been detected in the set of motifs that satisfy traditional requirements, even for exact motifs. Moreover, it has been observed many times that only a subset of these motifs, namely the maximal motifs, could be enough to provide the information of all of them. In this paper, we aim at removing such redundancy. We extend some notions of maximality already defined for exact motifs to the case of approximate motifs with Hamming distance, and we give a characterization of maximal motifs on the suffix tree. Given that this data structure is used by a whole class of motif extraction tools, we show how these tools can be modified to include the maximality requirement without changing the asymptotical complexity.
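As an illustration of the approximate matching discussed in this abstract, the following Python sketch finds the occurrences of a motif within a given Hamming distance by brute force, and counts how many strings lie within that distance of a motif. It does not use a suffix tree and is not the authors' method; the function names and the default DNA alphabet size of 4 are assumptions made for the example.

```python
from math import comb


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def approx_occurrences(motif, seq, d):
    """Start positions where seq matches motif with at most d mismatches."""
    k = len(motif)
    return [i for i in range(len(seq) - k + 1)
            if hamming(motif, seq[i:i + k]) <= d]


def neighborhood_size(k, d, alphabet=4):
    """Count strings of length k within Hamming distance d of a fixed string."""
    return sum(comb(k, i) * (alphabet - 1) ** i for i in range(d + 1))


print(approx_occurrences("ACGT", "ACGTACCTAAAA", 1))  # [0, 4]
print(neighborhood_size(8, 2))                        # 277
```

The neighborhood count grows quickly with the number of allowed mismatches, which is one way to see why the set of approximate motifs can become very large.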

Multimedia Tools and Applications | 2014

An automatic caption alignment mechanism for off-the-shelf speech recognition technologies

Maria Federico; Marco Furini

With a growing number of online videos, many producers feel the need to use video captions in order to expand content accessibility, and they face two main issues: production and alignment of the textual transcript. Both activities are expensive, either because of the human labor involved or because of the dedicated software required. In this paper, we focus on caption alignment and we propose a novel, automatic, simple and low-cost mechanism that does not require human transcriptions or special dedicated software to align captions. Our mechanism uses a unique audio markup and intelligently introduces copies of it into the audio stream before giving it to an off-the-shelf automatic speech recognition (ASR) application; then it transforms the plain transcript produced by the ASR application into a timecoded transcript, which allows video players to know when to display every single caption while playing out the video. The experimental evaluation shows that our proposal is effective in producing timecoded transcripts and therefore it can be helpful to expand video content accessibility.
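The transformation from a plain transcript to a timecoded one can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: it assumes the ASR renders each inserted audio markup as a known token (here the hypothetical word "MARK"), that the insertion times are known and expressed on the original video timeline, and that the text before the first markup starts at time 0.

```python
def timecode_transcript(transcript, marker, marker_times):
    """Split a plain transcript on the marker token and attach start times.

    marker_times[i] is the (assumed known) time in seconds at which the
    i-th markup was inserted into the audio stream.
    """
    segments = [s.strip() for s in transcript.split(marker)]
    # The text before the first markup is assumed to start at 0.0 seconds.
    starts = [0.0] + list(marker_times)
    captions = []
    for start, text in zip(starts, segments):
        if text:  # skip empty segments, e.g. two adjacent markups
            captions.append((start, text))
    return captions


plain = "hello world MARK this is a lesson MARK thanks"
for start, text in timecode_transcript(plain, "MARK", [4.0, 9.5]):
    print(f"{start:6.1f}s  {text}")
```

A video player could then display each caption from its start time until the start time of the next one.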

Algorithms for Molecular Biology | 2012

Direct vs 2-stage approaches to structured motif finding.

Maria Federico; Mauro Leoncini; Manuela Montangero; Paolo Valente

Background: The notion of DNA motif is a mathematical abstraction used to model regions of the DNA (known as Transcription Factor Binding Sites, or TFBSs) that are bound by a given Transcription Factor to regulate gene expression or repression. In turn, DNA structured motifs are a mathematical counterpart that models sets of TFBSs that work in concert in the gene regulation processes of higher eukaryotic organisms. Typically, a structured motif is composed of an ordered set of isolated (or simple) motifs, separated by a variable, but somewhat constrained number of “irrelevant” base-pairs. Discovering structured motifs in a set of DNA sequences is a computationally hard problem that has been addressed by a number of authors using either a direct approach, or via the preliminary identification and successive combination of simple motifs. Results: We describe a computational tool, named SISMA, for the de-novo discovery of structured motifs in a set of DNA sequences. SISMA is an exact, enumerative algorithm, meaning that it finds all the motifs conforming to the specifications. It does so in two stages: first it discovers all the possible component simple motifs, then combines them in a way that respects the given constraints. We developed SISMA mainly with the aim of understanding the potential benefits of such a 2-stage approach w.r.t. direct methods. In fact, no 2-stage software was available for the general problem of structured motif discovery, but only a few tools that solved restricted versions of the problem. We evaluated SISMA against other published tools on a comprehensive benchmark made of both synthetic and real biological datasets. In a significant number of cases, SISMA outperformed the competitors, exhibiting a good performance also in most of the cases in which it was inferior. Conclusions: A reflection on the results obtained led us to conclude that a 2-stage approach can be implemented with many advantages over direct approaches. Some of these have to do with greater modularity, ease of parallelization, and the possibility to perform adaptive searches of structured motifs. As another consideration, we noted that most hard instances for SISMA were easy to detect in advance. In these cases one may initially opt for a direct method; or, as a viable alternative in most laboratories, one could run both direct and 2-stage tools in parallel, halting the computations when the first halts.
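The 2-stage idea described in this abstract can be illustrated with a small Python sketch: first locate the occurrences of each simple motif, then combine them under a gap constraint. This is not SISMA; it handles only two exact simple motifs in a single sequence, and the function names and the `min_gap`/`max_gap` parameters are assumptions made for the example.

```python
def find_exact(motif, seq):
    """Stage 1: all start positions of exact occurrences of motif in seq."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)  # allow overlapping occurrences
    return positions


def structured_occurrences(seq, first, second, min_gap, max_gap):
    """Stage 2: pair occurrences whose spacer length lies in [min_gap, max_gap]."""
    first_pos = find_exact(first, seq)
    second_pos = find_exact(second, seq)
    pairs = []
    for p in first_pos:
        end = p + len(first)  # first position after the first simple motif
        for q in second_pos:
            if min_gap <= q - end <= max_gap:
                pairs.append((p, q))
    return pairs


seq = "TTGACA" + "C" * 17 + "TATAAT"
print(structured_occurrences(seq, "TTGACA", "TATAAT", 15, 19))  # [(0, 23)]
```

Because the two stages are separate, the first stage could be run once and its results reused with different gap constraints, which hints at the modularity mentioned in the conclusions.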

computer games | 2011

MovieRemix: having fun playing with videos

Nicola Maria Dusi; Maria Federico; Marco Furini

The process of producing new creative videos by editing, combining, and organizing pre-existing material (e.g., video shots) is a popular phenomenon in the current web scenario. Known as remix or video remix, the produced video may have new and different meanings with respect to the source material. Unfortunately, when managing audiovisual objects, the technological aspect can be a burden for many creative users. Motivated by the large success of the gaming market, we propose a novel game and an architecture to make the remix process a pleasant and stimulating gaming experience. MovieRemix allows people to act like a movie director, but instead of dealing with cast and cameras, the player has to create a remixed video starting from a given screenplay and from video shots retrieved from the provided catalog. MovieRemix is neither a simple video editing tool nor a simple game: it is a challenging environment that stimulates creativity. To tempt users to play the game, players can access different levels of screenplay (original, outline, derived) and can also challenge other players. Computational and storage issues are kept at the server side, whereas the client device just needs to have the capability of playing streaming videos.

bioinformatics research and development | 2008

Suffix Tree Characterization of Maximal Motifs in Biological Sequences

Maria Federico; Nadia Pisanti

Finding motifs in biological sequences is one of the most intriguing problems for string algorithm designers, as it is necessary to deal with approximations and this complicates the problem. Existing algorithms run in time linear in the input size. Nevertheless, the output size can be very large due to the approximation. This often makes the output unreadable, as well as slowing down the inference itself. Since only a subset of the motifs, i.e. the maximal motifs, could be enough to give the information of all of them, in this paper we aim at removing such redundancy. We define notions of maximality that we characterize in the suffix tree data structure. Given that this data structure is used by a whole class of motif extraction tools, we show how these tools can be modified to include the maximality requirement on the fly without changing the asymptotic complexity.

Discrete Applied Mathematics | 2014

Rime: Repeat identification

Maria Federico; Pierre Peterlongo; Nadia Pisanti; Marie-France Sagot

We present an algorithm for detecting long similar fragments occurring at least twice in a set of biological sequences. The problem becomes computationally challenging when the frequency of a repeat is allowed to increase and when a non-negligible number of insertions, deletions and substitutions are allowed. We introduce in this paper an algorithm, Rime (for Repeat Identification: long, Multiple, and with Edits), that performs this task and manages instances whose size and combination of parameters cannot be handled by other currently existing methods. This is achieved by using a filter as a preprocessing step, and by then exploiting the information gathered by the filter in the following actual repeat inference step. To the best of our knowledge, Rime is the first algorithm that can accurately deal with very long repeats (up to a few thousand), occurring possibly several times, and with a rate of differences (substitutions and indels) allowed among copies of the same repeat of 10-15% or even more.

acs ieee international conference on computer systems and applications | 2010

An optimized filter for finding multiple repeats in DNA sequences

Maria Federico; Pierre Peterlongo; Nadia Pisanti

This paper presents new optimizations designed to improve a state-of-the-art algorithm for filtering sequences as a preprocessing step to the task of finding multiple repeats allowing a given pairwise edit distance between pairs of occurrences. The target application is to find possibly long repeats having two or more occurrences, such that each pair of occurrences may show substitutions, insertions or deletions in up to 10 to 15% of their size. Assimilated to multiple alignment, exact detection of multiple repeats is an NP-hard problem. For increasing computation speed while avoiding the use of heuristics, one may use filters that quickly remove large parts of input that do not contain searched repeats. We describe at theoretical level some optimizations that can be applied to the tool that is currently the state-of-the-art for this filtering task. Finally, we exhibit some experiments in which the optimized tool outperforms its original version.
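The pairwise criterion mentioned in this abstract (edit distance bounded by a fraction of the occurrence size) can be made concrete with a short Python sketch. It uses the standard dynamic-programming edit distance and is not the filter described in the paper; in particular, taking "size" to be the length of the longer fragment is an assumption made for the example, as are the function names.

```python
def edit_distance(a, b):
    """Levenshtein distance: substitutions, insertions and deletions cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def within_rate(a, b, rate):
    """True if the edit distance is at most rate times the longer length."""
    return edit_distance(a, b) <= rate * max(len(a), len(b))


print(edit_distance("ACGTACGTAC", "ACGTTCGTAC"))      # 1
print(within_rate("ACGTACGTAC", "ACGTTCGTAC", 0.15))  # True
```

Checking every pair of candidate fragments this way takes quadratic time per pair, which is why a fast filter that discards most of the input beforehand is attractive.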

Proceedings of the 1st ACM workshop on Breaking frontiers of computational biology | 2009

An efficient algorithm for planted structured motif extraction

Maria Federico; Paolo Valente; Mauro Leoncini; Manuela Montangero; Roberto Cavicchioli

In this paper we present an algorithm for the problem of planted structured motif extraction from a set of sequences. This problem is strictly related to the structured motif extraction problem, which has many important applications in molecular biology. We propose an algorithm that uses a simple two-stage approach: first it extracts simple motifs, then the simple motifs are combined in order to extract structured motifs. We compare our algorithm with existing algorithms whose code is available, and which are based on more complex approaches. Our experiments show that, even if in general the problem is NP-hard, our algorithm is able to handle complex instances of the problem in a reasonable amount of time.

international conference on information technology | 2011

A high performing tool for residue solvent accessibility prediction

Lorenzo Palmieri; Maria Federico; Mauro Leoncini; Manuela Montangero

Much effort has been spent in recent years on bridging the gap between the huge number of sequenced proteins and the relatively few solved structures. Relative Solvent Accessibility (RSA) prediction of residues in protein complexes is a key step towards secondary structure and protein-protein interaction site prediction. With very different approaches, a number of software tools for RSA prediction have been produced throughout the last twenty years. Here, we present a binary classifier which implements a new method mainly based on sequence homology and implemented by means of look-up tables. The tool exploits residue similarity in solvent exposure pattern of neighboring context in similar protein chains, using BLAST search and DSSP structure. A two-state classification with 89.5% accuracy and 0.79 correlation coefficient against the real data is achieved on a widely used dataset.
