Zenilton Kleber Gonçalves do Patrocínio
Pontifícia Universidade Católica de Minas Gerais
Publication
Featured research published by Zenilton Kleber Gonçalves do Patrocínio.
Pattern Recognition Letters | 2014
Kleber Jacques Ferreira de Souza; Arnaldo de Albuquerque Araújo; Zenilton Kleber Gonçalves do Patrocínio; Silvio Jamil Ferzoli Guimarães
Hierarchical video segmentation provides a region-oriented scale-space, i.e., a set of video segmentations at different detail levels in which the segmentations at finer levels are nested with respect to those at coarser levels. In this work, hierarchical video segmentation is transformed into a graph partitioning problem in which each part corresponds to one supervoxel of the video. We present a new methodology for hierarchical video segmentation that computes a hierarchy of partitions by reweighting the original graph using a simple dissimilarity measure, from which a not-too-coarse segmentation can be easily inferred. We also provide an extensive comparative analysis, with quantitative assessments of the accuracy, ease of use, and temporal coherence of our methods: p-HOScale, cp-HOScale and 2cp-HOScale. According to the experiments, the hierarchy inferred by our methods produces good quantitative results when applied to video segmentation. Moreover, unlike the other tested methods, the space and time costs of our methods are not influenced by the number of supervoxels to be computed.
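The nesting property described in this abstract can be illustrated with a minimal sketch. It is not the HOScale reweighting itself: it assumes a plain single-linkage hierarchy in which a partition at a given scale is the set of connected components formed by edges whose weight does not exceed that scale. The function names and the toy graph are hypothetical.

```python
def components_at_scale(num_nodes, edges, scale):
    """Label each node with its component when only edges of weight <= scale are kept.

    Assumption: a simple threshold hierarchy, used only to illustrate nesting;
    it is not the reweighting scheme of the paper.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, weight in edges:
        if weight <= scale:
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_v] = root_u
    return [find(i) for i in range(num_nodes)]


def is_refinement(fine, coarse):
    """True if every region of `fine` lies inside a single region of `coarse`."""
    seen = {}
    for f, c in zip(fine, coarse):
        if seen.setdefault(f, c) != c:
            return False
    return True


# Toy graph: four nodes in a chain, edges given as (u, v, dissimilarity).
edges = [(0, 1, 1.0), (1, 2, 5.0), (2, 3, 1.5)]
fine = components_at_scale(4, edges, scale=2.0)    # two regions: {0, 1} and {2, 3}
coarse = components_at_scale(4, edges, scale=6.0)  # a single region
```

Raising the scale only ever merges regions, which is why the partition at the finer scale is nested inside the partition at the coarser one.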
Neurocomputing | 2016
Luciana dos Santos Belo; Carlos Caetano; Zenilton Kleber Gonçalves do Patrocínio; Silvio Jamil Ferzoli Guimarães
Video summarization is a simplification of video content for compacting the video information. The video summarization problem can be transformed into a clustering problem, in which some frames are selected to saliently represent the video content. In this work, we use a graph-based hierarchical clustering method for computing a video summary. In fact, the proposed approach, called HSUMM, adopts a hierarchical clustering method to generate a weight map from the frame similarity graph in which the clusters (or connected components of the graph) can easily be inferred. Moreover, the use of this strategy allows the application of a similarity measure between clusters during graph partition, instead of considering only the similarity between isolated frames. We also provide a unified framework for video summarization based on minimum spanning tree and weight maps in which HSUMM could be seen as an instance that uses a minimum spanning tree of frames and a weight map based on hierarchical observation scales computed over that tree. Furthermore, a new evaluation measure that assesses the diversity of opinions among users when they produce a summary for the same video, called Covering, is also proposed. During tests, different strategies for the identification of summary size and for the selection of keyframes were analyzed. Experimental results provide quantitative and qualitative comparison between the new approach and other popular algorithms from the literature, showing that the new algorithm is robust. Concerning quality measures, HSUMM outperforms the compared methods regardless of the visual feature used in terms of F-measure.
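As a rough illustration of summarization by graph-based clustering over a minimum spanning tree, the sketch below builds the tree over per-frame feature vectors, removes the heaviest tree edges to obtain clusters, and picks the medoid of each cluster as a keyframe. This is a simplified stand-in, not HSUMM: the weight map based on hierarchical observation scales is replaced by plain edge-weight cutting, and all names and the toy features are hypothetical.

```python
import math


def minimum_spanning_tree(features):
    """Prim's algorithm on the complete graph of frames; returns (weight, u, v) edges."""
    n = len(features)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost for each frame
    link = [-1] * n         # tree node that provides that connection
    best[0] = 0.0
    tree_edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if link[u] >= 0:
            tree_edges.append((best[u], link[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(features[u], features[v])
                if d < best[v]:
                    best[v], link[v] = d, u
    return tree_edges


def select_keyframes(features, num_keyframes):
    """Cut the heaviest MST edges to form clusters, then return one medoid per cluster."""
    n = len(features)
    edges = sorted(minimum_spanning_tree(features))
    kept = edges[:max(0, n - num_keyframes)]  # drops the (num_keyframes - 1) heaviest edges

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, u, v in kept:
        parent[find(v)] = find(u)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    keyframes = []
    for members in clusters.values():
        # Medoid: the frame with the smallest total distance to its cluster.
        medoid = min(members, key=lambda i: sum(math.dist(features[i], features[j])
                                                for j in members))
        keyframes.append(medoid)
    return sorted(keyframes)


# Toy one-dimensional frame features forming three visually distinct groups.
frames = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,), (9.0,)]
summary = select_keyframes(frames, num_keyframes=3)
```

In this toy case the summary contains the middle frame of each group. A real system would use visual descriptors instead of scalar features and would need a strategy for choosing the summary size, which the abstract says was studied separately.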
Brazilian Symposium on Computer Graphics and Image Processing | 2013
Kleber Jacques Ferreira de Souza; Arnaldo de Albuquerque Araújo; Zenilton Kleber Gonçalves do Patrocínio; Jean Cousty; Laurent Najman; Yukiko Kenmochi; Silvio Jamil Ferzoli Guimarães
Hierarchical video segmentation provides a region-oriented scale-space, i.e., a set of video segmentations at different detail levels in which the segmentations at finer levels are nested with respect to those at coarser levels. Hierarchical methods have the interesting property of preserving spatial and neighboring information among segmented regions. Here, we transform hierarchical video segmentation into a graph partitioning problem in which each part corresponds to one region of the video. We propose a new methodology for hierarchical video segmentation that computes a hierarchy of partitions by reweighting the original graph, from which a segmentation can be easily inferred. Temporal coherence is obtained from color information alone, instead of more complex features. We provide an extensive comparative analysis, with both quantitative and qualitative assessments of the efficiency, ease of use, and temporal coherence of our methods. According to our experiments, the hierarchy inferred by our two methods, p-HOScale and cp-HOScale, produces good quantitative and qualitative results when applied to video segmentation. Moreover, unlike the other tested methods, our methods are not influenced by the number of supervoxels to be computed, as shown in the experimental analysis, and have a low space cost.
Multimedia Signal Processing | 2009
Silvio Jamil Ferzoli Guimarães; Zenilton Kleber Gonçalves do Patrocínio; Kleber Jacques Ferreira de Souza; Hugo Bastos de Paula
This paper addresses gradual transition detection, which is part of the video segmentation problem and consists of identifying the boundary between consecutive shots. We propose an approach that defines and uses a new dissimilarity measure based on the size of the maximum cardinality matching computed on a bipartite graph with respect to a specified window. The experiments used a video dataset covering a variety of video genres and containing more than 500 gradual transitions. With a much simpler classification approach, our method achieves more than 90% recall with almost 80% precision, which is similar to the best results found.
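A matching-based dissimilarity of this kind can be sketched as follows. The abstract does not say how the bipartite graph is built, so the construction here is an assumption: frames in the first half of the window form one side, frames in the second half form the other, and an edge links two frames whose scalar features differ by at most a tolerance. The function names, the tolerance, and the scalar features are all hypothetical.

```python
def max_matching_size(adjacency, num_right):
    """Size of a maximum cardinality matching, found with augmenting paths."""
    match_right = [-1] * num_right  # left vertex currently matched to each right vertex

    def try_augment(u, visited):
        for v in adjacency[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it is free, or if its current partner can be re-matched.
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in range(len(adjacency)))


def window_dissimilarity(features, center, half_window, tolerance):
    """1 - (matching size / half_window) for the window around `center`.

    Assumed graph: frames before `center` on the left, frames from `center` on
    the right, with an edge when two frame features differ by <= tolerance.
    """
    left = features[center - half_window:center]
    right = features[center:center + half_window]
    adjacency = [[j for j, r in enumerate(right) if abs(l - r) <= tolerance]
                 for l in left]
    return 1.0 - max_matching_size(adjacency, len(right)) / half_window


stable = window_dissimilarity([10, 10, 10, 10], center=2, half_window=2, tolerance=5)
abrupt = window_dissimilarity([10, 10, 200, 200], center=2, half_window=2, tolerance=5)
partial = window_dissimilarity([10, 50, 50, 90], center=2, half_window=2, tolerance=5)
```

Under these assumptions a window inside a single shot yields a value near 0, a window straddling an abrupt change yields 1, and a slowly changing window falls in between, which is the behaviour a gradual-transition detector would then classify.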
Content-Based Multimedia Indexing | 2017
Nam Le; Hervé Bredin; Gabriel Sargent; Miquel India; Paula Lopez-Otero; Claude Barras; Camille Guinaudeau; Guillaume Gravier; Gabriel Barbosa da Fonseca; Izabela Lyon Freire; Zenilton Kleber Gonçalves do Patrocínio; Silvio Jamil Ferzoli Guimarães; Gerard Martí; Josep Ramon Morros; Javier Hernando; Laura Docio-Fernandez; Carmen García-Mateo; Sylvain Meignier; Jean-Marc Odobez
The rapid growth of multimedia databases and the human interest in their peers make indices representing the location and identity of people in audio-visual documents essential for searching archives. Person discovery in the absence of prior identity knowledge requires accurate association of audio-visual cues and detected names. To this end, we present 3 different strategies to approach this problem: clustering-based naming, verification-based naming, and graph-based naming. Each of these strategies utilizes different recent advances in unsupervised face / speech representation, verification, and optimization. To have a better understanding of the approaches, this paper also provides a quantitative and qualitative comparative study of these approaches using the associated corpus of the Person Discovery challenge at MediaEval 2016. From the results of our experiments, we can observe the pros and cons of each approach, thus paving the way for future promising research directions.
ACM Symposium on Applied Computing | 2016
Henrique Batista da Silva; Raquel Almeida; Gabriel Barbosa da Fonseca; Carlos Caetano; Dario Vieira; Zenilton Kleber Gonçalves do Patrocínio; Arnaldo de Albuquerque Araújo; Silvio Jamil Ferzoli Guimarães
The number of applications using unstructured data, such as videos, has increased, and research on multimedia retrieval has attracted great attention. The need to efficiently index and retrieve this kind of data is of great concern, because common search approaches based on keywords are not adequate for large video databases. Similarity search is a content-based approach that has been used successfully in retrieval systems. Accordingly, a major challenge is to provide an accurate and compact video representation that achieves good performance with a fast answer in this type of search. In this work, we propose a compact video representation using Min-Hash and the k-nearest GIST descriptors. Furthermore, we present the first use of the BossaNova Video Descriptor (BNVD) for video similarity search. Both compact video representations achieved more than 88% mean average precision on video similarity search. The experimental results indicate the high efficiency of the proposed representations in the video retrieval task.
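For readers unfamiliar with Min-Hash, the generic technique is sketched below. It assumes each video has already been reduced to a non-empty set of integer identifiers (for example, indices of quantized descriptors); how the paper derives such sets from the k-nearest GIST descriptors is not stated here, so that step is left out. The hash family, the parameter names, and the toy sets are illustrative assumptions.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2**31 - 1; item identifiers must be smaller


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions of the form h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(items, params):
    """Compact signature: the minimum hash value of the set under each function.

    `items` must be a non-empty collection of integers in [0, PRIME).
    """
    return [min((a * x + b) % PRIME for x in items) for a, b in params]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree, an estimate of Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


params = make_hash_params(num_hashes=128)
video_a = set(range(0, 100))
video_b = set(range(50, 150))    # shares half of its items with video_a
video_c = set(range(1000, 1100)) # shares nothing with video_a

sig_a = minhash_signature(video_a, params)
sig_b = minhash_signature(video_b, params)
sig_c = minhash_signature(video_c, params)
similarity_ab = estimated_jaccard(sig_a, sig_b)
similarity_ac = estimated_jaccard(sig_a, sig_c)
```

The signature length is fixed by the number of hash functions rather than by the size of the set, which is what makes the representation compact and comparison fast.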
International Conference on Tools with Artificial Intelligence | 2014
Luciana dos Santos Belo; Carlos Caetano; Zenilton Kleber Gonçalves do Patrocínio; Silvio Jamil Ferzoli Guimarães
Video summarization is a simplification of video content for compacting the video information. The video summarization problem can be transformed into a clustering problem, in which some frames are selected to saliently represent the video content. In this work, we use a hierarchical graph-based clustering method for computing a video summary. The proposed approach, called Summary, adopts a hierarchical clustering method to generate a weight map from the frame similarity graph in which the clusters (or connected components of the graph) can easily be inferred. Moreover, this strategy allows a similarity measure between clusters to be applied during graph partition, instead of considering only the similarity between isolated frames. Furthermore, a new evaluation measure, called Covering, that assesses the diversity of opinions among user summaries is also proposed. Experimental results provide a quantitative and qualitative comparison between the new approach and other popular algorithms from the literature, showing that the new algorithm is robust and efficient. Concerning quality measures, Summary outperforms the compared methods in terms of F-measure regardless of the visual feature used.
International Symposium on Neural Networks | 2017
Raquel Almeida; Zenilton Kleber Gonçalves do Patrocínio; Silvio Jamil Ferzoli Guimarães
In the human action classification task, a video must be classified into a pre-determined class. To cope with this problem, we propose a mid-level representation in which information about quantization errors is embedded together with the aggregated data on low-level features. The main contributions of this article are twofold: (i) assembly of low-level features (dense trajectories) by a mid-level representation enriched with information about distances between descriptors and codewords; and (ii) a survey of the most common protocols for human action classification methods when applied to three different datasets. Regarding classification protocols, we experimented with training and testing classification (called split), leave-one-out cross-validation (LOOCV) and leave-one-group-out cross-validation (25-fold CV). Experimental results demonstrate that our strategy either improves the classification rates with respect to the state of the art, achieving 98% on the KTH dataset, or is competitive, with 90% on UCF-11, when compared with methods with no feature learning.
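The idea of enriching an aggregated representation with quantization errors can be illustrated with a small sketch. The exact encoding used in the paper is not given in this abstract, so the layout below is an assumption: a normalized bag-of-words histogram followed by the mean distance between each codeword and the descriptors assigned to it. The function name, codebook, and descriptors are hypothetical.

```python
import math


def encode_with_quantization_error(descriptors, codebook):
    """Return [histogram..., mean_error...] for a set of low-level descriptors.

    Assumed layout: the first len(codebook) values are the fraction of
    descriptors assigned to each codeword; the remaining values are the mean
    descriptor-to-codeword distance per codeword (0.0 for unused codewords).
    """
    k = len(codebook)
    counts = [0] * k
    error_sums = [0.0] * k
    for descriptor in descriptors:
        distances = [math.dist(descriptor, codeword) for codeword in codebook]
        nearest = min(range(k), key=distances.__getitem__)  # hard assignment
        counts[nearest] += 1
        error_sums[nearest] += distances[nearest]

    total = len(descriptors)
    histogram = [count / total for count in counts]
    mean_errors = [err / count if count else 0.0
                   for err, count in zip(error_sums, counts)]
    return histogram + mean_errors


codebook = [(0.0, 0.0), (10.0, 0.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (10.0, 2.0)]
representation = encode_with_quantization_error(descriptors, codebook)
```

A plain histogram would treat a descriptor sitting exactly on a codeword and one far from it identically; the extra error terms keep that distinction available to the classifier.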
Brazilian Symposium on Computer Graphics and Image Processing | 2015
Kleber Jacques Ferreira de Souza; Arnaldo de Albuquerque Araújo; Silvio Jamil Ferzoli Guimarães; Zenilton Kleber Gonçalves do Patrocínio; Matthieu Cord
In this paper, we present an approach to streaming graph-based hierarchical video segmentation by simple label propagation. We transform streaming video segmentation into a graph partitioning problem in which each part corresponds to one region of the video. Furthermore, we apply a simple method for merging the segmentations of two consecutive blocks to achieve temporal coherence. Spatial-temporal coherence is obtained from color information alone, instead of more complex features. We provide an extensive comparative analysis between our method and methods from the literature, showing the efficiency, ease of use, and temporal coherence of ours. According to the experiments, our method produces good results when applied to video segmentation, while having low space and time costs compared to other methods.
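One simple way to merge the segmentations of two consecutive blocks is sketched below. The abstract does not describe the merging rule, so this is an assumed scheme: the blocks share one overlapping frame, each region of the new block inherits the label of the previous-block region it overlaps most on that frame, and regions with no overlap receive fresh labels. Frames are flattened lists of region labels, and every name here is hypothetical.

```python
from collections import Counter


def propagate_labels(prev_overlap, next_overlap, next_block, first_free_label):
    """Relabel `next_block` so its regions continue the labels of the previous block.

    prev_overlap / next_overlap: labels of the shared frame as segmented in the
    previous and in the new block (same pixel order).
    first_free_label: smallest label not used by any earlier block.
    """
    # For each new label, count which old labels it coincides with on the shared frame.
    votes = {}
    for old_label, new_label in zip(prev_overlap, next_overlap):
        votes.setdefault(new_label, Counter())[old_label] += 1
    mapping = {new: counter.most_common(1)[0][0] for new, counter in votes.items()}

    next_fresh = first_free_label
    relabeled = []
    for frame in next_block:
        out = []
        for label in frame:
            if label not in mapping:
                # Region that appears only after the shared frame: give it a new label.
                mapping[label] = next_fresh
                next_fresh += 1
            out.append(mapping[label])
        relabeled.append(out)
    return relabeled


# Shared frame: regions 7 and 8 in the previous block, 0 and 1 in the new block.
previous_view = [7, 7, 8, 8]
new_view = [0, 0, 1, 1]
new_block = [[0, 0, 1, 1], [0, 1, 1, 2]]
merged = propagate_labels(previous_view, new_view, new_block, first_free_label=9)
```

Because only the shared frame and a label map are kept between blocks, memory use does not grow with the length of the video, which matches the streaming setting described above.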
International Symposium on Multimedia | 2011
Ângelo Magno de Jesus; Silvio Jamil Ferzoli Guimarães; Zenilton Kleber Gonçalves do Patrocínio
Video text extraction is the process of identifying text embedded in video, usually over a complex background. This paper proposes a new approach to this problem that considers image regularization and temporal information. The former helps to decrease the number of gray values in order to simplify the image content, and the latter takes advantage of video text persistence in order to identify video segments while ignoring text changes. According to our experiments, the proposed method presents better results than other methods. Moreover, we propose a post-processing step for improving the text results obtained by the Otsu method.
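The Otsu method mentioned at the end of this abstract selects a global threshold that maximizes the between-class variance of the gray-level histogram. A minimal pure-Python sketch follows; it covers only the standard thresholding step, not the regularization, temporal analysis, or post-processing proposed in the paper, and the function names and toy pixel values are illustrative.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance.

    Pixels with value <= t form one class and pixels with value > t the other.
    `pixels` is a flat, non-empty sequence of integers in [0, levels).
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    weight_low = 0   # number of pixels with value <= t
    sum_low = 0      # sum of their gray levels
    best_threshold, best_variance = 0, -1.0
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue            # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break               # all pixels are already in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold


def binarize(pixels, threshold):
    """Map each pixel to 1 if it is brighter than the threshold, else 0."""
    return [1 if value > threshold else 0 for value in pixels]


# Toy image: dark background (10) and bright text (200).
image = [10] * 50 + [200] * 50
threshold = otsu_threshold(image)
mask = binarize(image, threshold)
```

On real video frames the histogram is rarely this cleanly bimodal, which is one reason a post-processing step after thresholding can help.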
Collaboration
Silvio Jamil Ferzoli Guimarães
Pontifícia Universidade Católica de Minas Gerais
Kleber Jacques Ferreira de Souza
Pontifícia Universidade Católica de Minas Gerais