Guilherme P. Telles | Researchain

Publication

Featured researches published by Guilherme P. Telles.

BMC Bioinformatics | 2015

InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams

Henry Heberle; Gabriela Vaz Meirelles; Felipe Rodrigues da Silva; Guilherme P. Telles; Rosane Minghim

Background: Set comparisons permeate a large number of data analysis workflows, in particular workflows in biological sciences. Venn diagrams are frequently employed for such analysis but current tools are limited. Results: We have developed InteractiVenn, a more flexible tool for interacting with Venn diagrams including up to six sets. It offers a clean interface for Venn diagram construction and enables analysis of set unions while preserving the shape of the diagram. Set unions are useful to reveal differences and similarities among sets and may be guided in our tool by a tree or by a list of set unions. The tool also allows obtaining subsets’ elements, saving and loading sets for further analyses, and exporting the diagram in vector and image formats. InteractiVenn has been used to analyze two biological datasets, but it may serve set analysis in a broad range of domains. Conclusions: InteractiVenn allows set unions in Venn diagrams to be explored thoroughly, by consequence extending the ability to analyze combinations of sets with additional observations, yielded by novel interactions between joined sets. InteractiVenn is freely available online at: www.interactivenn.net.
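
The set operations behind such a diagram can be sketched in a few lines. The following is only an illustration, not InteractiVenn's implementation: the function names `venn_regions` and `merge_sets`, the merged label `A+B`, and the sample data are all assumptions made for this example, and nothing here draws a diagram.

```python
def venn_regions(sets):
    """Map each combination of set names to the elements that belong
    to exactly those sets and to no others (one Venn region each)."""
    regions = {}
    universe = set().union(*sets.values())
    for element in universe:
        members = frozenset(name for name, s in sets.items() if element in s)
        regions.setdefault(members, set()).add(element)
    return regions


def merge_sets(sets, names, merged_name):
    """Return a new collection in which the named sets are replaced
    by a single set holding their union."""
    merged = {k: v for k, v in sets.items() if k not in names}
    merged[merged_name] = set().union(*(sets[n] for n in names))
    return merged


# Hypothetical sample data.
sets = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}
regions = venn_regions(sets)

# Treat A and B as one set and recompute the regions.
merged_regions = venn_regions(merge_sets(sets, ["A", "B"], "A+B"))
```

Comparing `regions` with `merged_regions` shows which elements stop being distinguished once two sets are joined, which is the kind of observation the abstract attributes to exploring set unions.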

PLOS ONE | 2013

Metagenomic Analysis of a Tropical Composting Operation at the São Paulo Zoo Park Reveals Diversity of Biomass Degradation Functions and Organisms

Layla Farage Martins; Luciana Principal Antunes; Renata C. Pascon; Júlio Cezar de Oliveira; Luciano Antonio Digiampietri; Deibs Barbosa; Bruno Malveira Peixoto; Marcelo A. Vallim; Cristina Viana-Niero; Éric Hainer Ostroski; Guilherme P. Telles; Zanoni Dias; João Batista da Cruz; Luiz Juliano; Sergio Verjovski-Almeida; Aline M. da Silva; João C. Setubal

Composting operations are a rich source for prospection of biomass degradation enzymes. We have analyzed the microbiomes of two composting samples collected in a facility inside the São Paulo Zoo Park, in Brazil. All organic waste produced in the park is processed in this facility, at a rate of four tons/day. Total DNA was extracted and sequenced with Roche/454 technology, generating about 3 million reads per sample. To our knowledge this work is the first report of a composting whole-microbial community using high-throughput sequencing and analysis. The phylogenetic profiles of the two microbiomes analyzed are quite different, with a clear dominance of members of the Lactobacillus genus in one of them. We found a general agreement of the distribution of functional categories in the Zoo compost metagenomes compared with seven selected public metagenomes of biomass deconstruction environments, indicating the potential for different bacterial communities to provide alternative mechanisms for the same functional purposes. Our results indicate that biomass degradation in this composting process, including deconstruction of recalcitrant lignocellulose, is fully performed by bacterial enzymes, most likely by members of the Clostridiales and Actinomycetales orders.

Discrete Applied Mathematics | 1998

On the consecutive ones property

João Meidanis; Oscar Porto; Guilherme P. Telles

A binary matrix has the Consecutive Ones Property (C1P) when there is a permutation of its rows that leaves the 1s consecutive in every column. We study the recognition problem for these matrices, giving a structure, PQR trees, that generalizes the PQ trees of Booth and Lueker (1976). This new structure is capable not only of recording all valid permutations when the matrix has the C1P, but also of pointing out possible obstructions when the property does not hold. We recast the problem using collections of sets, developing a new theory for it. This problem appears naturally in several applications in molecular biology, for instance, in the construction of physical maps from hybridization data.
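
To make the definition concrete, here is a brute-force check of the C1P that tries every row permutation. It is a sketch for tiny matrices only and is unrelated to PQ or PQR trees, which solve the recognition problem efficiently; the function names and the two sample matrices are assumptions made for this example.

```python
from itertools import permutations


def column_is_consecutive(column):
    """True if all 1s in the column form a single contiguous run."""
    ones = [i for i, v in enumerate(column) if v == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def find_c1p_row_order(matrix):
    """Return a row order that makes the 1s consecutive in every
    column, or None if no such order exists (exponential time)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    for order in permutations(range(n_rows)):
        if all(
            column_is_consecutive([matrix[r][c] for r in order])
            for c in range(n_cols)
        ):
            return list(order)
    return None


# Has the C1P: moving the last row between the first two works.
has_c1p = [[1, 0], [0, 1], [1, 1]]

# Lacks the C1P: each column pairs two of the three rows, and in any
# linear order of three rows the two end rows are not adjacent.
no_c1p = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]

order = find_c1p_row_order(has_c1p)
obstruction = find_c1p_row_order(no_c1p)
```

The second matrix is a small instance of the kind of obstruction the abstract mentions; the brute-force search can only report that no order exists, not why.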

Genetics and Molecular Biology | 2001

Trimming and clustering sugarcane ESTs

Guilherme P. Telles; Felipe Rodrigues da Silva

The original clustering procedure adopted in the Sugarcane Expressed Sequence Tag project (SUCEST) had many problems, for instance too many clusters, the presence of ribosomal sequences, etc. We therefore redesigned the clustering procedure entirely, including a much more careful initial trimming of the reads. In this paper the new trimming and clustering strategies are described in detail and we give the new official figures for the project, 237,954 expressed sequence tags and 43,141 clusters.

Computer Graphics Forum | 2012

Semantic Wordification of Document Collections

Fernando Vieira Paulovich; Franklina Maria Bragion Toledo; Guilherme P. Telles; Rosane Minghim; Luis Gustavo Nonato

Word clouds have become one of the most widely accepted visual resources for document analysis and visualization, motivating the development of several methods for building layouts of keywords extracted from textual data. Existing methods are effective at conveying content, but are not capable of preserving semantic relationships among keywords while still linking the word cloud to the underlying document groups that generated them. Such a representation is highly desirable for exploratory analysis of document collections. In this paper we present a novel approach to build document clouds, named ProjCloud, that aims at solving both semantic layout and linking with document sets. ProjCloud generates a semantically consistent layout from a set of documents. Through a multidimensional projection, it is possible to visualize the neighborhood relationship between highly related documents and their corresponding word clouds simultaneously. Additionally, we propose a new algorithm for building word clouds inside polygons, which employs spectral sorting to maintain the semantic relationship among words. The effectiveness and flexibility of our methodology are confirmed when comparisons are made to existing methods. The technique automatically constructs projection-based layouts, which the user may choose to examine in the form of point clouds or the corresponding word clouds, allowing a high degree of control over the exploratory process.

Visual Analytics Science and Technology | 2007

Point Placement by Phylogenetic Trees and its Application to Visual Analysis of Document Collections

Ana M. Cuadros; Fernando Vieira Paulovich; Rosane Minghim; Guilherme P. Telles

The task of building effective representations to visualize and explore collections with a moderate to large number of documents is hard. It depends on the evaluation of some distance measure among texts and also on the representation of such relationships in bi-dimensional spaces. In this paper we introduce an alternative approach for building visual maps of documents based on their content similarity, through reconstruction of phylogenetic trees. The tree is capable of representing relationships that allow the user to quickly recover information detected by the similarity metric. For a variety of text collections of different natures we show that we can achieve improved exploration capability and clearer visualization of relationships amongst documents.

IEEE Transactions on Visualization and Computer Graphics | 2011

Improved Similarity Trees and their Application to Visual Data Classification

José Gustavo de Paiva; Laura Florian; Helio Pedrini; Guilherme P. Telles; Rosane Minghim

An alternative form to multidimensional projections for the visual analysis of data represented in multidimensional spaces is the deployment of similarity trees, such as Neighbor Joining trees. They organize data objects on the visual plane emphasizing their levels of similarity with high capability of detecting and separating groups and subgroups of objects. Besides this similarity-based hierarchical data organization, some of their advantages include the ability to decrease point clutter; high precision; and a consistent view of the data set during focusing, offering a very intuitive way to view the general structure of the data set as well as to drill down to groups and subgroups of interest. Disadvantages of similarity trees based on neighbor joining strategies include their computational cost and the presence of virtual nodes that utilize too much of the visual space. This paper presents a highly improved version of the similarity tree technique. The improvements in the technique are given by two procedures. The first is a strategy that replaces virtual nodes by promoting real leaf nodes to their place, saving large portions of space in the display and maintaining the expressiveness and precision of the technique. The second improvement is an implementation that significantly accelerates the algorithm, impacting its use for larger data sets. We also illustrate the applicability of the technique in visual data mining, showing its advantages to support visual classification of data sets, with special attention to the case of image classification. We demonstrate the capabilities of the tree for analysis and iterative manipulation and employ those capabilities to support evolving to a satisfactory data organization and classification.
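
For readers unfamiliar with Neighbor Joining, the sketch below shows only the standard pair-selection step of the classic algorithm, which picks the pair (i, j) minimizing Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k). It does not implement the improvements described in the abstract (promotion of leaf nodes, accelerated construction); the function name `nj_pick_pair` and the small distance matrix are assumptions made for this example.

```python
def nj_pick_pair(dist):
    """Return (q, i, j) for the pair with the smallest Q value under
    the standard Neighbor Joining criterion, or None if there are
    fewer than two objects. dist is a symmetric matrix with a zero
    diagonal."""
    n = len(dist)
    totals = [sum(row) for row in dist]  # sum of distances per object
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    return best


# Hypothetical distances among five objects.
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
q, i, j = nj_pick_pair(dist)
```

In the full algorithm the selected pair is joined under a new internal node, the distance matrix is reduced, and the step repeats. Those internal nodes correspond to the virtual nodes that the abstract says take up too much visual space.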

Combinatorial Pattern Matching | 2013

External Memory Generalized Suffix and LCP Arrays Construction

Felipe Alves da Louza; Guilherme P. Telles; Cristina Dutra de Aguiar Ciferri

A suffix array is a data structure that, together with the LCP array, allows solving many string processing problems in a very efficient fashion. In this article we introduce eGSA, the first external memory algorithm to construct both generalized suffix and LCP arrays for sets of strings. Our algorithm relies on a combination of buffers, induced sorting and a heap. Performance tests with real DNA sequence sets of size up to 8.5 GB showed that eGSA can indeed be applied to sets of large sequences with efficient running time on a low-cost machine. Compared to eSAIS, the algorithm whose purpose most closely resembles that of eGSA, eGSA reduced the time spent to construct the arrays by a factor of 2.5−4.8.
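
As an illustration of the two data structures involved, the following sketch builds a generalized suffix array and its LCP array naively and entirely in memory. It is not eGSA: it uses no external memory, buffers, induced sorting or heap, and it takes quadratic time or worse. The function names, the tie-breaking rule for identical suffixes, and the sample strings are assumptions made for this example.

```python
def generalized_suffix_array(strings):
    """Sort all suffixes of all strings. Each entry is a pair
    (string index, offset); identical suffixes are ordered by
    string index (an assumption of this sketch)."""
    suffixes = [(i, j) for i, s in enumerate(strings) for j in range(len(s))]
    suffixes.sort(key=lambda p: (strings[p[0]][p[1]:], p[0]))
    return suffixes


def lcp_array(strings, gsa):
    """lcp[k] is the length of the longest common prefix of the
    suffixes at positions k - 1 and k of gsa; lcp[0] is 0."""
    lcp = [0] * len(gsa)
    for k in range(1, len(gsa)):
        a = strings[gsa[k - 1][0]][gsa[k - 1][1]:]
        b = strings[gsa[k][0]][gsa[k][1]:]
        n = 0
        while n < min(len(a), len(b)) and a[n] == b[n]:
            n += 1
        lcp[k] = n
    return lcp


strings = ["banana", "ana"]
gsa = generalized_suffix_array(strings)
lcp = lcp_array(strings, gsa)
```

Here `gsa` starts with the two occurrences of the suffix "a" and ends with "nana", and `lcp` records how many leading characters each suffix shares with its predecessor in that order.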

Electronic Notes in Discrete Mathematics | 2005

Building PQR trees in almost-linear time

Guilherme P. Telles; João Meidanis

In 1976, Booth and Lueker invented the PQ trees as a compact way of storing and manipulating all the permutations on n elements that keep consecutive the elements in certain given sets C_1, C_2, ..., C_m. Such permutations are called valid. This problem finds applications in DNA physical mapping, interval graph recognition, logic circuit optimization and data retrieval, among others. PQ trees construction time is linear on the size of the input sets. In 1995, Meidanis and Munuera created the PQR trees, a natural generalization of PQ trees. The difference between them is that PQR trees exist for every set collection, even when there are no valid permutations. The R nodes encapsulate subsets where the consecutive ones property fails. In this note we present an almost-linear time algorithm to build a PQR tree for an arbitrary set collection.

Keywords: Trees, analysis of algorithms

1 Introduction

Given a collection of m subsets C_1, C_2, ..., C_m of a set U of n elements, the consecutive ones problem consists in answering whether there is a valid permutation of the elements in U, that is, a permutation that keeps the elements of each C

Journal of Discrete Algorithms | 2016

An improved algorithm for the all-pairs suffix-prefix problem

William Hideki Azana Tustumi; Simon Gog; Guilherme P. Telles; Felipe Alves da Louza

Finding all longest suffix-prefix matches for a collection of strings is known as the all-pairs suffix-prefix match problem and its main application is de novo genome assembly. This problem is well studied in stringology and was solved optimally in 1992 by Gusfield et al. [8] using suffix trees. In 2010, Ohlebusch and Gog [13] proposed an alternative solution based on enhanced suffix arrays which also has optimal time complexity but is faster in practice. In this article, we present another optimal algorithm based on enhanced suffix arrays which further improves the practical running time. Our new solution solves the problem locally for each string, scanning the enhanced suffix array backwards to avoid processing suffixes that are not suffix-prefix matching candidates. In an empirical evaluation we observed that the new algorithm is over two times faster and more space-efficient than the method proposed by Ohlebusch and Gog. When compared to Readjoiner [5], a good practical solution, our algorithm is faster for small overlap lengths, in pace with theoretical optimality.
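
The problem statement itself can be illustrated with a naive solution that compares every ordered pair of strings directly. This is not the enhanced-suffix-array algorithm of the paper and is far from optimal; the function names, the choice to report 0 on the diagonal, and the sample reads are assumptions made for this example.

```python
def longest_overlap(a, b):
    """Length of the longest suffix of a that equals a prefix of b."""
    for length in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def all_pairs_suffix_prefix(strings):
    """Return a matrix m where m[i][j] is the longest overlap between
    a suffix of strings[i] and a prefix of strings[j]. Entries with
    i == j are set to 0 by convention in this sketch."""
    n = len(strings)
    return [
        [0 if i == j else longest_overlap(strings[i], strings[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical short reads.
reads = ["ACGT", "GTCA", "CAAC"]
overlaps = all_pairs_suffix_prefix(reads)
```

In an assembly setting, a large entry `overlaps[i][j]` suggests that read j may follow read i in the reconstructed sequence.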
