Publication


Featured research published by Sandra Eliza Fontes de Avila.


Pattern Recognition Letters | 2011

VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method

Sandra Eliza Fontes de Avila; Ana Paula Brandão Lopes; Antonio da Luz; Arnaldo de Albuquerque Araújo

The fast evolution of digital video has brought many new multimedia applications and, as a consequence, has increased the amount of research into new technologies that aim at improving the effectiveness and efficiency of video acquisition, archiving, cataloging and indexing, as well as increasing the usability of stored videos. Among possible research areas, video summarization is an important topic that potentially enables faster browsing of large video collections and also more efficient content indexing and access. Essentially, this research area consists of automatically generating a short summary of a video, which can be either a static or a dynamic summary. In this paper, we present VSUMM, a methodology for the production of static video summaries. The method is based on color feature extraction from video frames and the k-means clustering algorithm. As an additional contribution, we also develop a novel approach for the evaluation of static video summaries. In this evaluation methodology, video summaries are manually created by users; these user-created summaries are then compared both to those of our approach and to those of several techniques from the literature. Experimental results show - with a confidence level of 98% - that the proposed solution produced static video summaries of superior quality relative to the approaches to which it was compared.
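
The abstract names the two ingredients of the method: color features per frame and k-means clustering. A minimal sketch of that idea in Python follows; the frame sampling rate, the 16-bin hue histogram, and the number of clusters are illustrative assumptions, not the paper's exact choices.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def summarize(video_path, n_keyframes=5, sample_rate=30):
    """Static video summary in the spirit of VSUMM: cluster frame
    color histograms with k-means and keep one frame per cluster."""
    cap = cv2.VideoCapture(video_path)
    frames, feats = [], []
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_rate == 0:  # sample roughly one frame per second at 30 fps
            hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
            # 16-bin hue histogram as a simple color feature (an assumption)
            hist = cv2.calcHist([hsv], [0], None, [16], [0, 180]).flatten()
            feats.append(hist / (hist.sum() + 1e-8))
            frames.append(frame)
        idx += 1
    cap.release()

    feats = np.array(feats)
    km = KMeans(n_clusters=n_keyframes, n_init=10).fit(feats)
    summary = []
    for c in range(n_keyframes):
        # keyframe = sampled frame closest to the cluster centroid
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(feats[members] - km.cluster_centers_[c], axis=1)
        summary.append(frames[members[np.argmin(dists)]])
    return summary
```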


Computer Vision and Image Understanding | 2013

Pooling in image representation: The visual codeword point of view

Sandra Eliza Fontes de Avila; Nicolas Thome; Matthieu Cord; Eduardo Valle; Arnaldo de Albuquerque Araújo

In this work, we propose BossaNova, a novel representation for content-based concept detection in images and videos, which enriches the Bag-of-Words model. Relying on the quantization of highly discriminant local descriptors by a codebook, and on the aggregation of those quantized descriptors into a single pooled feature vector, the Bag-of-Words model has emerged as the most promising approach for concept detection on visual documents. BossaNova enhances that representation by keeping a histogram of distances between the descriptors found in the image and those in the codebook, thus preserving important information about the distribution of the local descriptors around each codeword. Contrary to other approaches found in the literature, the non-parametric histogram representation is compact and simple to compute. BossaNova compares well with the state of the art on several standard datasets: MIRFLICKR, ImageCLEF 2011, PASCAL VOC 2007 and 15-Scenes, even without using complex combinations of different local descriptors. It also complements the cutting-edge Fisher Vector descriptors well, showing even better results when employed in combination with them. BossaNova also shows good results in the challenging real-world application of pornography detection.
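
The core BossaNova idea, keeping a per-codeword histogram of descriptor-to-codeword distances rather than a single count, can be sketched as follows. The codebook, number of bins, and distance range are assumed inputs; the published method also applies normalization and appends a BoW term, which this toy version omits.

```python
import numpy as np

def bossanova_encode(descriptors, codebook, n_bins=4, r_min=0.0, r_max=1.0):
    """Toy BossaNova-style encoding: for each codeword, histogram the
    distances of the image's descriptors into n_bins, instead of keeping
    a single Bag-of-Words count. Returns one concatenated vector."""
    K = codebook.shape[0]
    enc = np.zeros((K, n_bins))
    # pairwise Euclidean distances, shape (n_descriptors, K)
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    edges = np.linspace(r_min, r_max, n_bins + 1)
    for k in range(K):
        # histogram of distances to codeword k; descriptors outside
        # the (r_min, r_max) range fall out of all bins
        enc[k], _ = np.histogram(d[:, k], bins=edges)
    return enc.flatten()
```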


Brazilian Symposium on Computer Graphics and Image Processing | 2009

Nude Detection in Video Using Bag-of-Visual-Features

Ana Paula Brandão Lopes; Sandra Eliza Fontes de Avila; Anderson N. A. Peixoto; Rodrigo Silva Oliveira; Marcelo de Miranda Coelho; Arnaldo de Albuquerque Araújo

The ability to filter improper content from multimedia sources based on visual content has important applications, since text-based filters are clearly insufficient against erroneous and/or malicious associations between text and actual content. In this paper, we investigate a method for the detection of nudity in videos based on a Bag-of-Visual-Features representation for frames and an associated voting scheme. Bag-of-Visual-Features (BoVF) approaches have been successfully applied to object recognition and scene classification, showing robustness to occlusion and also to the several kinds of variations that normally plague object detection methods. To the best of our knowledge, only two proposals in the literature use BoVF for nude detection in still images, and no other attempt has been made to apply BoVF to videos. Nevertheless, the results of our experiments show that this approach is indeed able to provide good recognition rates for nudity, even at the frame level and with a relatively low sampling ratio. Also, the proposed voting scheme significantly enhances the recognition rates for video segments, achieving, in the best case, a correct classification rate of 93.2% with a sampling ratio of 1/15 frames. Finally, a visual analysis of some particular cases indicates possible sources of misclassification.
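
The voting scheme the abstract mentions reduces, in its simplest form, to majority voting over the frame-level labels of a segment; a minimal sketch, where the 0.5 threshold is an assumption:

```python
import numpy as np

def segment_label(frame_preds, threshold=0.5):
    """Majority voting: flag a video segment as nude if the fraction of
    its sampled frames classified as nude exceeds the threshold."""
    frame_preds = np.asarray(frame_preds)  # 1 = nude, 0 = non-nude
    return int(frame_preds.mean() > threshold)

# e.g., 7 of 10 sampled frames flagged -> the segment is flagged
print(segment_label([1, 1, 0, 1, 1, 0, 1, 1, 0, 1]))  # prints 1
```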


Web Science | 2014

The impact of visual attributes on online image diffusion

Luam C. Totti; Felipe Almeida Costa; Sandra Eliza Fontes de Avila; Eduardo Valle; Wagner Meira; Virgílio A. F. Almeida

Little is known about how visual content affects popularity on social networks, despite images being now ubiquitous on the Web and currently accounting for a considerable fraction of all shared content. Existing work on image sharing focuses mainly on non-visual attributes. In this work we take a complementary approach and investigate resharing from a mainly visual perspective. Two sets of visual features are proposed, encoding both aesthetic properties (brightness, contrast, sharpness, etc.) and semantic content (concepts represented by the images). We collected data from a large image-sharing service (Pinterest) and evaluated the predictive power of different features on popularity (number of reshares). We found that visual properties have low predictive power compared to that of social cues. However, after factoring out social influence, visual features show considerable predictive power, especially for images with higher exposure, with over 3:1 accuracy odds when classifying highly exposed images as very popular or unpopular.
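
The aesthetic attributes listed (brightness, contrast, sharpness) have standard low-level proxies; one plausible way to compute them with OpenCV, which may differ from the paper's exact definitions:

```python
import cv2
import numpy as np

def aesthetic_features(image_path):
    """Simple proxies for the aesthetic attributes named in the abstract:
    brightness = mean luminance, contrast = luminance std-dev,
    sharpness = variance of the Laplacian (a common focus measure)."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE).astype(np.float64)
    return {
        "brightness": gray.mean(),
        "contrast": gray.std(),
        "sharpness": cv2.Laplacian(gray, cv2.CV_64F).var(),
    }
```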


Neurocomputing | 2017

Video pornography detection through deep learning techniques and motion information

Mauricio Perez; Sandra Eliza Fontes de Avila; Daniel de Carvalho Moreira; Daniel Moraes; Vanessa Testoni; Eduardo Valle; Siome Goldenstein; Anderson Rocha

Recent literature has explored automated pornography detection - a bold move to replace humans in the tedious task of moderating online content. Unfortunately, on scenes with high skin exposure, such as people sunbathing and wrestling, the state of the art can raise many false alarms. This paper is based on the premise that incorporating motion information into the models can alleviate the problem of equating skin exposure with pornographic content, and it raises the bar on automated pornography detection with the use of motion information and deep learning architectures. Deep learning, especially in the form of Convolutional Neural Networks, has achieved striking results in computer vision, but its potential for pornography detection is yet to be fully explored through the use of motion information. We propose novel ways of combining static (picture) and dynamic (motion) information using optical flow and MPEG motion vectors. We show that both methods provide equivalent accuracies, but that MPEG motion vectors allow a more efficient implementation. The best proposed method yields a classification accuracy of 97.9% - an error reduction of 64.4% when compared to the state of the art - on a dataset of 800 challenging test cases. Finally, we present and discuss results on a larger, more challenging dataset.
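
Of the two motion sources the paper compares, optical flow is the easier one to sketch: the snippet below extracts Farneback dense flow between consecutive frames as input for a hypothetical motion stream. The parameters and frame budget are assumptions; reading MPEG motion vectors instead requires bitstream-level access not shown here.

```python
import cv2
import numpy as np

def flow_fields(video_path, max_pairs=16):
    """Dense optical flow between consecutive frames, packed as 2-channel
    displacement fields that a CNN 'motion stream' could consume."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        return None
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    flows = []
    while len(flows) < max_pairs:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Farneback dense flow: (H, W, 2) displacement field per frame pair
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)
        prev_gray = gray
    cap.release()
    return np.stack(flows) if flows else None
```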


Neurocomputing | 2016

A mid-level video representation based on binary descriptors: A case study for pornography detection

Carlos Caetano; Sandra Eliza Fontes de Avila; William Robson Schwartz; Silvio Jamil Ferzoli Guimarães; Arnaldo de Albuquerque Araújo

With the growing amount of inappropriate content on the Internet, such as pornography, arises the need to detect and filter such material. The reason is that such content is often prohibited in certain environments (e.g., schools and workplaces) or for certain publics (e.g., children). In recent years, many works have focused mainly on detecting pornographic images and videos based on visual content, particularly on the detection of skin color. Although these approaches provide good results, they generally have the disadvantage of a high false positive rate, since not all images with large areas of skin exposure are necessarily pornographic, such as people wearing swimsuits or images related to sports. Local-feature-based approaches with Bag-of-Words (BoW) models have been successfully applied to visual recognition tasks in the context of pornography detection. Even though existing methods provide promising results, they use local feature descriptors that require high computational processing time and yield high-dimensional vectors. In this work, we propose an approach for pornography detection based on local binary feature extraction and the BossaNova image representation, a BoW model extension that preserves the visual information more richly. Moreover, we propose two approaches for video description based on the combination of mid-level representations, namely the BossaNova Video Descriptor (BNVD) and the BoW Video Descriptor (BoW-VD). The proposed techniques are promising, achieving an accuracy of 92.40%, thus reducing the classification error by 16% over the current state-of-the-art local-features approach on the Pornography dataset.
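
A minimal sketch of the low- and mid-level steps described here: extract binary local descriptors (ORB, as one example) and pool them into a Bag-of-Words histogram by nearest-codeword assignment under Hamming distance. The codebook is assumed to be given, e.g., learned beforehand from training descriptors.

```python
import cv2
import numpy as np

def bow_histogram(image_path, codebook):
    """BoW over binary descriptors, assigned by Hamming distance.
    codebook: (K, 32) uint8 array of ORB-like binary codewords."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(nfeatures=500)
    _, desc = orb.detectAndCompute(img, None)  # (n, 32) uint8 or None
    hist = np.zeros(len(codebook))
    if desc is None:
        return hist
    for d in desc:
        # Hamming distance = popcount of XOR against every codeword
        xor = np.bitwise_xor(codebook, d)
        dists = np.unpackbits(xor, axis=1).sum(axis=1)
        hist[np.argmin(dists)] += 1
    return hist / hist.sum()
```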


ACM Symposium on Applied Computing | 2014

Representing local binary descriptors with BossaNova for visual recognition

Carlos Caetano; Sandra Eliza Fontes de Avila; Silvio Jamil Ferzoli Guimarães; Arnaldo de Albuquerque Araújo

Binary descriptors have recently become very popular in visual recognition tasks. This popularity is largely due to their low complexity and to their presenting performance similar to that of non-binary descriptors, such as SIFT. In the literature, many researchers have applied binary descriptors in conjunction with mid-level representations (e.g., Bag-of-Words). However, although these works have demonstrated promising results, their main shortcomings are the use of a simple mid-level representation and of binary descriptors that lack rotation and scale invariance. To address these problems, we propose to evaluate state-of-the-art binary descriptors, namely BRIEF, ORB, BRISK and FREAK, in a recent mid-level representation, namely BossaNova, which enriches the Bag-of-Words model while preserving the binary descriptor information. Our experiments, carried out on the challenging PASCAL VOC 2007 dataset, revealed outstanding performance. Our approach also shows good results in the challenging real-world application of pornography detection.
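
For reference, the four descriptors named above can all be instantiated in OpenCV; ORB and BRISK ship with the core library, while BRIEF and FREAK live in the opencv-contrib xfeatures2d module (availability depends on the build).

```python
import cv2

img = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)

# ORB and BRISK ship with core OpenCV and bundle their own detectors
for name, extractor in [("ORB", cv2.ORB_create()),
                        ("BRISK", cv2.BRISK_create())]:
    kps, desc = extractor.detectAndCompute(img, None)
    print(name, None if desc is None else desc.shape)

# BRIEF and FREAK are descriptor-only and live in opencv-contrib
# (cv2.xfeatures2d); they need an external keypoint detector, e.g.:
#   star = cv2.xfeatures2d.StarDetector_create()
#   brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()
#   kps = star.detect(img, None)
#   kps, desc = brief.compute(img, kps)
```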


International Symposium on Biomedical Imaging | 2017

Knowledge transfer for melanoma screening with deep learning

Afonso Menegola; Michel Fornaciali; Ramon Pires; Flávia Vasques Bittencourt; Sandra Eliza Fontes de Avila; Eduardo Valle

Knowledge transfer impacts the performance of deep learning — the state of the art for image classification tasks, including automated melanoma screening. Deep learning's greed for large amounts of training data poses a challenge for medical tasks, which we can alleviate by recycling knowledge from models trained on different tasks, in a scheme called transfer learning. Although much of the best art on automated melanoma screening employs some form of transfer learning, a systematic evaluation was missing. Here we investigate the presence of transfer, the task from which the transfer is sourced, and the application of fine-tuning (i.e., retraining of the deep learning model after transfer). We also test the impact of picking deeper (and more expensive) models. Our results favor deeper models, pretrained on ImageNet, with fine-tuning, reaching an AUC of 80.7% and 84.5% on the two skin-lesion datasets evaluated.
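
The transfer-learning setup described (pretraining on ImageNet, then fine-tuning on skin-lesion data) can be sketched in a few lines of recent torchvision; the ResNet-50 backbone and the binary head are illustrative assumptions, not the architectures evaluated in the paper.

```python
import torch.nn as nn
from torchvision import models

def melanoma_model(freeze_backbone=False):
    """Transfer-learning sketch: start from an ImageNet-pretrained CNN
    and adapt it for binary melanoma screening."""
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    if freeze_backbone:
        # "transfer without fine-tuning": keep pretrained features fixed
        for p in model.parameters():
            p.requires_grad = False
    # replace the 1000-way ImageNet head with a 2-way melanoma/benign head
    model.fc = nn.Linear(model.fc.in_features, 2)
    return model
```

Training the returned model with all parameters trainable corresponds to the "with fine-tuning" condition the paper found favorable; freezing the backbone corresponds to transfer without fine-tuning.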


Forensic Science International | 2016

Pornography classification: The hidden clues in video space-time

Daniel de Carvalho Moreira; Sandra Eliza Fontes de Avila; Mauricio Perez; Daniel Moraes; Vanessa Testoni; Eduardo Valle; Siome Goldenstein; Anderson Rocha

As web technologies and social networks become part of the general public's life, the problem of automatically detecting pornography is on every parent's mind - nobody feels completely safe when their children go online. In this paper, we focus on video pornography classification, a hard problem in which traditional methods often employ still-image techniques - labeling frames individually prior to a global decision. Frame-based approaches, however, ignore significant information conveyed by motion. Here, we introduce a space-temporal interest point detector and descriptor called Temporal Robust Features (TRoF). TRoF was custom-tailored for efficient (low processing time and memory footprint) and effective (high classification accuracy and low false negative rate) motion description, particularly suited to the task at hand. We aggregate the local information extracted by TRoF into a mid-level representation using Fisher Vectors, the state-of-the-art model of Bags of Visual Words (BoVW). We evaluate our original strategy, contrasting it both to commercial pornography detection solutions and to BoVW solutions based upon other space-temporal features from the scientific literature. The performance is assessed using the Pornography-2k dataset, a new and challenging pornographic benchmark comprising 2000 web videos and 140 hours of footage. The dataset is also a contribution of this work and is very assorted, including both professional and amateur content, and it depicts several genres of pornography, from cartoon to live action, with diverse behavior and ethnicity. The best approach, based on a dense application of TRoF, yields a classification error reduction of almost 79% when compared to the best commercial classifier. A sparse description relying on the TRoF detector is also noteworthy for yielding a classification error reduction of over 69% with a 19× smaller memory footprint than the dense solution, while also lending itself to real-time implementations.
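
The mid-level step aggregates local TRoF descriptors with Fisher Vectors; below is a simplified sketch (gradients with respect to the GMM means only, plus the usual power and L2 normalization), assuming a diagonal-covariance GMM fitted beforehand on training descriptors.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm):
    """Simplified Fisher Vector: soft-assign descriptors to the components
    of a GMM (fitted with covariance_type='diag') and accumulate the
    normalized gradients with respect to the component means."""
    X = np.atleast_2d(descriptors)          # (n, d) local descriptors
    q = gmm.predict_proba(X)                # (n, K) soft assignments
    mu = gmm.means_                         # (K, d)
    sigma = np.sqrt(gmm.covariances_)       # (K, d) for a diagonal GMM
    n, K = X.shape[0], gmm.n_components
    fv = np.zeros((K, X.shape[1]))
    for k in range(K):
        # mean-gradient term of the Fisher Vector for component k
        fv[k] = (q[:, k:k + 1] * (X - mu[k]) / sigma[k]).sum(axis=0)
        fv[k] /= n * np.sqrt(gmm.weights_[k])
    fv = fv.flatten()
    # power- and L2-normalization, standard for the improved FV
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-8)
```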


Brazilian Symposium on Computer Graphics and Image Processing | 2014

Statistical Learning Approach for Robust Melanoma Screening

Michel Fornaciali; Sandra Eliza Fontes de Avila; Micael Carvalho; Eduardo Valle

According to the American Cancer Society, one person dies of melanoma every 57 minutes, although it is the most curable type of cancer if detected early. Thus, computer-aided diagnosis for melanoma screening has been a topic of active research. Much of the existing art is based on the Bag-of-Visual-Words (BoVW) model, combined with color and texture descriptors. However, recent advances in the BoVW model, as well as the importance of the many different factors affecting it, were yet to be explored, motivating our work. We show that a new approach for melanoma screening, based upon the state-of-the-art BossaNova descriptors, shows very promising results, reaching an AUC of up to 93.7%. An important contribution of this work is an evaluation of the factors that affect the performance of the two-layered BoVW model. Our results show that the low-level layer has a major impact on the accuracy of the model, but that the codebook size of the mid-level layer is also important. These results may guide future work on melanoma screening.
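
The mid-level factor the abstract singles out, codebook size, can be probed with a simple sweep; every concrete choice below (k-means codebooks, an RBF SVM, 5-fold cross-validated AUC) is an illustrative assumption rather than the paper's protocol.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def sweep_codebook_sizes(train_descs, image_descs, labels,
                         sizes=(64, 128, 256, 512)):
    """Probe the effect of mid-level codebook size on screening AUC.
    train_descs: (N, d) pooled training descriptors;
    image_descs: list of per-image (n_i, d) descriptor arrays."""
    for k in sizes:
        km = KMeans(n_clusters=k, n_init=10).fit(train_descs)
        # one normalized BoW histogram per image
        X = np.array([np.bincount(km.predict(d), minlength=k) / len(d)
                      for d in image_descs])
        auc = cross_val_score(SVC(kernel="rbf"), X, labels,
                              cv=5, scoring="roc_auc").mean()
        print(f"codebook size {k}: AUC = {auc:.3f}")
```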

Collaboration


Dive into Sandra Eliza Fontes de Avila's collaborations.

Top Co-Authors

Eduardo Valle | State University of Campinas
Anderson Rocha | State University of Campinas
Arnaldo de Albuquerque Araújo | Universidade Federal de Minas Gerais
Mauricio Perez | State University of Campinas
Daniel Moraes | State University of Campinas
Michel Fornaciali | State University of Campinas
Siome Goldenstein | State University of Campinas
Afonso Menegola | State University of Campinas