
Publication


Featured research published by Rogério Schmidt Feris.


Computer Vision and Pattern Recognition | 2011

Image ranking and retrieval based on multi-attribute queries

Behjat Siddiquie; Rogério Schmidt Feris; Larry S. Davis

We propose a novel approach for ranking and retrieval of images based on multi-attribute queries. Existing image retrieval methods train separate classifiers for each word and heuristically combine their outputs for retrieving multiword queries. Moreover, these approaches also ignore the interdependencies among the query terms. In contrast, we propose a principled approach for multi-attribute retrieval which explicitly models the correlations that are present between the attributes. Given a multi-attribute query, we also utilize other attributes in the vocabulary which are not present in the query, for ranking/retrieval. Furthermore, we integrate ranking and retrieval within the same formulation, by posing them as structured prediction problems. Extensive experimental evaluation on the Labeled Faces in the Wild (LFW), FaceTracer and PASCAL VOC datasets shows that our approach significantly outperforms several state-of-the-art ranking and retrieval methods.
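
The correlation-aware scoring idea can be sketched as a toy ranking function. The attribute names, confidences, and pairwise weights below are illustrative, and a simple unary-plus-pairwise score stands in for the paper's structured prediction formulation:

```python
from itertools import combinations

def score_image(attr_scores, query, correlations):
    """Score one image against a multi-attribute query.

    attr_scores  -- dict: attribute -> classifier confidence for this image
    query        -- set of attributes requested by the user
    correlations -- dict: frozenset({a, b}) -> learned pairwise weight
    """
    # Naive retrieval would stop at the sum of per-attribute confidences.
    unary = sum(attr_scores.get(a, 0.0) for a in query)
    # A pairwise term rewards attribute pairs that co-occur in training
    # data, loosely mimicking the modeling of attribute interdependencies.
    pairwise = 0.0
    for a, b in combinations(sorted(query), 2):
        w = correlations.get(frozenset((a, b)), 0.0)
        pairwise += w * attr_scores.get(a, 0.0) * attr_scores.get(b, 0.0)
    return unary + pairwise

def rank_images(images, query, correlations):
    """images: list of (image_id, attr_scores); returns best match first."""
    return sorted(images, key=lambda item: -score_image(item[1], query, correlations))
```

For the query {"smiling", "glasses"}, an image with moderate confidence on both attributes can outrank one with a single very confident attribute, because the correlation term rewards the joint match.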


International Conference on Computer Graphics and Interactive Techniques | 2004

Non-photorealistic camera: depth edge detection and stylized rendering using multi-flash imaging

Ramesh Raskar; Kar-Han Tan; Rogério Schmidt Feris; Jingyi Yu; Matthew Turk

We present a non-photorealistic rendering approach to capture and convey shape features of real-world scenes. We use a camera with multiple flashes that are strategically positioned to cast shadows along depth discontinuities in the scene. The projective-geometric relationship of the camera-flash setup is then exploited to detect depth discontinuities and distinguish them from intensity edges due to material discontinuities. We introduce depiction methods that utilize the detected edge features to generate stylized static and animated images. We can highlight the detected features, suppress unnecessary details or combine features from multiple images. The resulting images more clearly convey the 3D structure of the imaged scenes. We take a very different approach to capturing geometric features of a scene than traditional approaches that require reconstructing a 3D model. This results in a method that is both surprisingly simple and computationally efficient. The entire hardware/software setup can conceivably be packaged into a self-contained device no larger than existing digital cameras.
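
The shadow-based cue can be illustrated on tiny synthetic grayscale images. The ratio threshold is an illustrative choice, and this toy version only flags shadow pixels cast near depth discontinuities; the paper additionally traverses each ratio image along the flash direction to localize the edge itself:

```python
def depth_edge_candidates(flash_images, ratio_thresh=0.7):
    """Flag candidate depth-edge (shadow) pixels from multi-flash images.

    flash_images -- list of grayscale images (nested lists), one per flash.
    A shadow cast by one flash darkens a pixel in that image but not in the
    per-pixel maximum over all flashes, so a low ratio I_k / I_max marks it.
    """
    h, w = len(flash_images[0]), len(flash_images[0][0])
    # Per-pixel maximum approximates a shadow-free image.
    max_img = [[max(img[y][x] for img in flash_images) for x in range(w)]
               for y in range(h)]
    mask = [[False] * w for _ in range(h)]
    for img in flash_images:
        for y in range(h):
            for x in range(w):
                if max_img[y][x] > 0 and img[y][x] / max_img[y][x] < ratio_thresh:
                    mask[y][x] = True
    return mask
```

On a one-row example where the left flash shadows one pixel and the right flash another, only those two pixels are flagged; textured but unshadowed pixels keep a ratio near 1 and are ignored, which is what separates depth edges from material edges.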


Image and Vision Computing | 2006

Manifold based analysis of facial expression

Ya Chang; Changbo Hu; Rogério Schmidt Feris; Matthew Turk

We propose a novel approach for modeling, tracking and recognizing facial expressions. Our method works on a low dimensional expression manifold, which is obtained by Isomap embedding. In this space, facial contour features are first clustered, using a mixture model. Then, expression dynamics are learned for tracking and classification. We use ICondensation to track facial features in the embedded space, while recognizing facial expressions in a cooperative manner, within a common probabilistic framework. The image observation likelihood is derived from a variation of the Active Shape Model (ASM) algorithm. For each cluster in the low-dimensional space, a specific ASM model is learned, thus avoiding incorrect matching due to non-linear image variations. Preliminary experimental results show that our probabilistic facial expression model on manifold significantly improves facial deformation tracking and expression recognition.
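
The embedding step can be shown with a minimal from-scratch Isomap (k-nearest-neighbor graph, geodesic distances, classical MDS). The parameters are illustrative, and this sketch ignores the clustering, ICondensation tracking, and ASM stages:

```python
import numpy as np

def isomap(X, n_neighbors=2, n_components=1):
    """Minimal Isomap: kNN graph -> geodesics -> classical MDS."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Euclidean distance matrix.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # Keep only edges to the k nearest neighbors (symmetrized).
    G = np.full((n, n), np.inf)
    for i in range(n):
        nbrs = np.argsort(D[i])[1:n_neighbors + 1]
        G[i, nbrs] = D[i, nbrs]
        G[nbrs, i] = D[i, nbrs]
    np.fill_diagonal(G, 0.0)
    # Geodesic distances via Floyd-Warshall on the graph.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    # Classical MDS: double-center the squared geodesics, take top eigenpairs.
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (G ** 2) @ H
    w, v = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:n_components]
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

For points sampled along a curve, the one-dimensional embedding recovers their ordering along the curve, which is the low-dimensional "expression manifold" coordinate the tracker then operates in.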


European Conference on Computer Vision | 2016

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

Zhaowei Cai; Quanfu Fan; Rogério Schmidt Feris; Nuno Vasconcelos

A unified deep neural network, denoted the multi-scale CNN (MS-CNN), is proposed for fast multi-scale object detection. The MS-CNN consists of a proposal sub-network and a detection sub-network. In the proposal sub-network, detection is performed at multiple output layers, so that receptive fields match objects of different scales. These complementary scale-specific detectors are combined to produce a strong multi-scale object detector. The unified network is learned end-to-end, by optimizing a multi-task loss. Feature upsampling by deconvolution is also explored, as an alternative to input upsampling, to reduce the memory and computation costs. State-of-the-art object detection performance, at up to 15 fps, is reported on datasets, such as KITTI and Caltech, containing a substantial number of small objects.
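
The scale-matching idea, where each output layer handles objects whose size fits its receptive field, can be sketched as assigning boxes to the branch with the closest anchor size. The anchor heights below are illustrative, not the MS-CNN's actual configuration:

```python
def assign_to_scale(box_height, layer_anchor_heights):
    """Index of the detection branch whose anchor height best fits the box."""
    return min(range(len(layer_anchor_heights)),
               key=lambda i: abs(layer_anchor_heights[i] - box_height))

def group_by_scale(box_heights, layer_anchor_heights):
    """Group ground-truth boxes by the output layer that should detect them."""
    groups = {i: [] for i in range(len(layer_anchor_heights))}
    for h in box_heights:
        groups[assign_to_scale(h, layer_anchor_heights)].append(h)
    return groups
```

Small boxes land on the early, high-resolution layers and large boxes on the deep layers, which is why the complementary branches together cover the full range of object scales.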


Workshop on Applications of Computer Vision | 2009

Attribute-based people search in surveillance environments

Daniel A. Vaquero; Rogério Schmidt Feris; Duan Tran; Lisa M. Brown; Arun Hampapur; Matthew Turk

We propose a novel framework for searching for people in surveillance environments. Rather than relying on face recognition technology, which is known to be sensitive to typical surveillance conditions such as lighting changes, face pose variation, and low-resolution imagery, we approach the problem in a different way: we search for people based on a parsing of human parts and their attributes, including facial hair, eyewear, clothing color, etc. These attributes can be extracted using detectors learned from large amounts of training data. A complete system that implements our framework is presented. At the interface, the user can specify a set of personal characteristics, and the system then retrieves events that match the provided description. For example, a possible query is “show me the bald people who entered a given building last Saturday wearing a red shirt and sunglasses.” This capability is useful in several applications, such as finding suspects or missing people. To evaluate the performance of our approach, we present extensive experiments on a set of images collected from the Internet, on infrared imagery, and on two-and-a-half months of video from a real surveillance environment. We are not aware of any similar surveillance system capable of automatically finding people in video based on their fine-grained body parts and attributes.
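
The query side of such a system can be sketched as attribute matching over detector outputs. The event ids, attribute names, and confidence threshold are illustrative; a real system would calibrate and weight the individual detectors:

```python
def search_people(events, query, min_conf=0.5):
    """Retrieve events matching every queried attribute, best match first.

    events -- list of (event_id, {attribute: detector confidence})
    query  -- list of required attributes, e.g. from a user's description
    """
    hits = []
    for event_id, attrs in events:
        confs = [attrs.get(a, 0.0) for a in query]
        # Every queried attribute must be detected with enough confidence.
        if all(c >= min_conf for c in confs):
            hits.append((sum(confs), event_id))  # simple combined score
    return [eid for _, eid in sorted(hits, reverse=True)]
```

A query like ["bald", "red_shirt", "sunglasses"] then plays the role of the paper's example description, returning only the events whose part-and-attribute parse matches all three terms.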


Systems, Man, and Cybernetics | 2011

Robust Detection of Abandoned and Removed Objects in Complex Surveillance Videos

Yingli Tian; Rogério Schmidt Feris; Haowei Liu; Arun Hampapur; Ming-Ting Sun

Tracking-based approaches for abandoned object detection often become unreliable in complex surveillance videos due to occlusions, lighting changes, and other factors. We present a new framework to robustly and efficiently detect abandoned and removed objects based on background subtraction (BGS) and foreground analysis, complemented by tracking to reduce false positives. In our system, the background is modeled by three Gaussian mixtures. In order to handle complex situations, several improvements are implemented for shadow removal, quick lighting-change adaptation, fragment reduction, and keeping a stable update rate for video streams with different frame rates. Then, the same Gaussian mixture models used for BGS are employed to detect static foreground regions without extra computation cost. Furthermore, the types of the static regions (abandoned or removed) are determined by using a method that exploits context information about the foreground masks, which significantly outperforms previous edge-based techniques. Based on the type of the static regions and user-defined parameters (e.g., object size and abandoned time), a matching method is proposed to detect abandoned and removed objects. A person-detection process is also integrated to distinguish static objects from stationary people. The robustness and efficiency of the proposed method are tested on IBM Smart Surveillance Solutions for public safety applications in big cities and evaluated on several public databases, such as the Image Library for Intelligent Detection Systems (i-LIDS) and IEEE Performance Evaluation of Tracking and Surveillance Workshop (PETS) 2006 datasets. The tests and evaluations demonstrate that our method is efficient enough to run in real time, while being robust to quick lighting changes and occlusions in complex environments.
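
The static-region idea, where foreground that persists long enough becomes an abandoned/removed candidate, can be sketched with a per-pixel persistence counter. This toy version uses simple differencing against a fixed background in place of the paper's Gaussian mixture models, and the thresholds are illustrative:

```python
def detect_static_pixels(frames, background, diff_thresh=20, static_frames=3):
    """Flag pixels that stay foreground for `static_frames` consecutive frames.

    frames     -- sequence of frames, each a list of pixel intensities
    background -- background model, one intensity per pixel
    """
    n = len(background)
    age = [0] * n          # consecutive frames each pixel has been foreground
    static = [False] * n   # abandoned/removed candidates
    for frame in frames:
        for i in range(n):
            if abs(frame[i] - background[i]) > diff_thresh:
                age[i] += 1
                if age[i] >= static_frames:
                    static[i] = True
            else:
                age[i] = 0  # transient motion resets the counter
    return static
```

A pixel occupied by a dropped bag stays foreground frame after frame and is flagged, while a person walking through trips the differencing only briefly and is ignored; deciding abandoned vs. removed would then use the context of the foreground mask, as the paper describes.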


Computer Vision and Pattern Recognition | 2004

Manifold Based Analysis of Facial Expression

Changbo Hu; Ya Chang; Rogério Schmidt Feris; Matthew Turk

We propose a novel approach for modeling, tracking and recognizing facial expressions. Our method works on a low dimensional expression manifold, which is obtained by Isomap embedding. In this space, facial contour features are first clustered, using a mixture model. Then, expression dynamics are learned for tracking and classification. We use ICondensation to track facial features in the embedded space, while recognizing facial expressions in a cooperative manner, within a common probabilistic framework. The image observation likelihood is derived from a variation of the Active Shape Model (ASM) algorithm. For each cluster in the low-dimensional space, a specific ASM model is learned, thus avoiding incorrect matching due to non-linear image variations. Preliminary experimental results show that our probabilistic facial expression model on manifold significantly improves facial deformation tracking and expression recognition.


IEEE International Conference on Automatic Face and Gesture Recognition | 2002

Hierarchical wavelet networks for facial feature localization

Rogério Schmidt Feris; Jim Gemmell; Kentaro Toyama; Volker Krüger

We present a technique for facial feature localization using a two-level hierarchical wavelet network. The first level wavelet network is used for face matching, and yields an affine transformation used for a rough approximation of feature locations. Second level wavelet networks for each feature are then used to fine-tune the feature locations. Construction of a training database containing hierarchical wavelet networks of many faces allows features to be detected in most faces. Experiments show that facial feature localization benefits significantly from the hierarchical approach. Results compare favorably with existing techniques for feature localization.
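
The two-level coarse-to-fine search can be illustrated in one dimension, with sum-of-squared-differences template matching standing in for wavelet-network matching. The signal, templates, and offsets below are illustrative:

```python
def best_shift(signal, template, lo, hi):
    """Shift in [lo, hi) minimizing sum of squared differences."""
    def ssd(s):
        return sum((signal[s + i] - t) ** 2 for i, t in enumerate(template))
    return min(range(lo, hi), key=ssd)

def localize(signal, face_template, feature_templates, feature_offsets, radius=1):
    """Two-level search mirroring the hierarchical structure:
    level 1 matches the whole face for a rough alignment, level 2
    refines each feature in a small window around its predicted spot."""
    n = len(signal)
    coarse = best_shift(signal, face_template, 0, n - len(face_template) + 1)
    found = {}
    for name, tmpl in feature_templates.items():
        guess = coarse + feature_offsets[name]
        lo = max(0, guess - radius)
        hi = min(n - len(tmpl) + 1, guess + radius + 1)
        found[name] = best_shift(signal, tmpl, lo, hi)
    return found
```

Restricting each feature's search to a window around the coarse prediction is what makes the hierarchical approach both faster and less prone to spurious matches than searching the whole image per feature.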


Mexican International Conference on Artificial Intelligence | 2000

Detection and Tracking of Facial Features in Video Sequences

Rogério Schmidt Feris; Teofilo de Campos; Roberto Marcondes Cesar Junior

This work presents a real-time system for detection and tracking of facial features in video sequences. Such a system may be used in visual communication applications, such as teleconferencing, virtual reality, intelligent interfaces, human-machine interaction, and surveillance. We have used a statistical skin-color model to segment face-candidate regions in the image. The presence or absence of a face in each region is verified by means of an eye detector, based on an efficient template matching scheme. Once a face is detected, the pupils, nostrils and lip corners are located, and these facial features are tracked in the image sequence in real time.
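
The skin-color segmentation step can be sketched with a fixed rule in normalized rg chromaticity space. The thresholds below are illustrative, not the paper's learned statistical model:

```python
def is_skin(r, g, b):
    """Toy skin classifier in normalized rg chromaticity space.

    Normalizing by overall brightness (r+g+b) gives some robustness to
    illumination changes; the threshold box is a hand-picked illustration
    of where skin tones tend to cluster, not a trained model.
    """
    s = r + g + b
    if s == 0:
        return False
    rn, gn = r / s, g / s
    return 0.35 < rn < 0.55 and 0.25 < gn < 0.37

def skin_mask(image):
    """image: nested lists of (r, g, b) tuples -> boolean mask."""
    return [[is_skin(*px) for px in row] for row in image]
```

Connected regions of the resulting mask would then be the face-candidate regions that the eye detector verifies.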


Computer Vision and Pattern Recognition | 2015

Deep domain adaptation for describing people based on fine-grained clothing attributes

Qiang Chen; Junshi Huang; Rogério Schmidt Feris; Lisa M. Brown; Jian Dong; Shuicheng Yan

We address the problem of describing people based on fine-grained clothing attributes. This is an important problem for many practical applications, such as identifying target suspects or finding missing people based on detailed clothing descriptions in surveillance videos or consumer photos. We approach this problem by first mining clothing images with fine-grained attribute labels from online shopping stores. A large-scale dataset is built with about one million images and fine-detailed attribute sub-categories, such as various shades of color (e.g., watermelon red, rosy red, purplish red), clothing types (e.g., down jacket, denim jacket), and patterns (e.g., thin horizontal stripes, houndstooth). As these images are taken in ideal pose/lighting/background conditions, it is unreliable to directly use them as training data for attribute prediction in the domain of unconstrained images captured, for example, by mobile phones or surveillance cameras. In order to bridge this gap, we propose a novel double-path deep domain adaptation network to model the data from the two domains jointly. Several alignment cost layers placed in between the two columns ensure the consistency of the two domain features and the feasibility to predict unseen attribute categories in one of the domains. Finally, to achieve a working system with automatic human body alignment, we trained an enhanced RCNN-based detector to localize human bodies in images. Our extensive experimental evaluation demonstrates the effectiveness of the proposed approach for describing people based on fine-grained clothing attributes.
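
One common form of alignment cost between two domains' features is maximum mean discrepancy (MMD). The sketch below uses an RBF-kernel MMD as an illustrative stand-in; the paper's exact alignment layers may differ:

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Squared maximum mean discrepancy between feature sets X and Y.

    X, Y  -- arrays of shape (n_samples, n_features), e.g. activations
             from the shop-image column and the unconstrained-image column.
    A small value means the two domains' feature distributions match,
    which is what an alignment cost drives the network toward.
    """
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```

Minimizing such a term between corresponding layers of the two columns pulls shop-domain and street-domain features together, so attribute classifiers trained on the clean shop images transfer to the unconstrained domain.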

Collaboration


Dive into Rogério Schmidt Feris's collaborations.

Top Co-Authors


Matthew Turk

University of California


Yingli Tian

City University of New York


Ming-Ting Sun

University of Washington


Ramesh Raskar

Massachusetts Institute of Technology
