N. da Vitoria Lobo
University of Central Florida
Publications
Featured research published by N. da Vitoria Lobo.
IEEE Transactions on Intelligent Transportation Systems | 2003
Paul Smith; Mubarak Shah; N. da Vitoria Lobo
This paper presents a system for analyzing human driver visual attention. The system relies on estimation of global motion and color statistics to robustly track a person's head and facial features. The system is fully automatic: it can initialize itself and reinitialize when necessary. It classifies rotation in all viewing directions, detects eye/mouth occlusion, detects eye blinking and eye closure, and recovers the three-dimensional gaze of the eyes. In addition, the system is able to track through occlusion caused by eye blinking, eye closure, and large mouth movement, as well as occlusion due to rotation. Even when the face is fully occluded due to rotation, the system does not break down. Further, the system is able to track through yawning, which is a large local mouth motion. Finally, results are presented, and future work on how this system can be used for more advanced driver visual attention monitoring is discussed.
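As a rough illustration of the color-statistics component described above (not the authors' exact method), here is a minimal Python sketch: a skin-color lookup table trained offline scores each pixel, and the head is localized as the centroid of high-scoring pixels. The bin count and threshold are assumptions.

import numpy as np

BINS = 32  # quantization per RGB channel (assumed value)

def train_skin_histogram(skin_pixels):
    """skin_pixels: (N, 3) uint8 RGB samples labeled as skin."""
    hist, _ = np.histogramdd(skin_pixels, bins=(BINS,) * 3,
                             range=((0, 256),) * 3)
    return hist / hist.sum()  # normalize to a probability table

def skin_likelihood(frame, hist):
    """frame: (H, W, 3) uint8 image -> (H, W) skin probability map."""
    idx = (frame // (256 // BINS)).reshape(-1, 3)
    probs = hist[idx[:, 0], idx[:, 1], idx[:, 2]]
    return probs.reshape(frame.shape[:2])

def head_centroid(frame, hist, thresh=1e-4):
    """Centroid of skin-colored pixels as a crude head estimate."""
    mask = skin_likelihood(frame, hist) > thresh
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None  # would trigger reinitialization in a full system
    return xs.mean(), ys.mean()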
Computer Vision and Pattern Recognition | 1994
Young Ho Kwon; N. da Vitoria Lobo
The ability to classify age from a facial image had not previously been pursued in computer vision. This research addresses the limited task of classifying a facial image into one of three age groups: baby, young adult, and senior adult. This is the first reported work to classify age, and to successfully extract and use natural wrinkles. We present a theory and practical computations for visual age classification from facial images, based on cranio-facial changes in feature-position ratios and on skin wrinkle analysis.
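A minimal sketch of the two-stage rule this suggests: a cranio-facial ratio separates babies from adults, and a wrinkle score separates seniors from young adults. The feature coordinates, ratio definition, and thresholds below are hypothetical, not the paper's calibrated values.

import numpy as np

def eye_chin_ratio(left_eye, right_eye, nose, chin):
    """Illustrative ratio: eye-to-nose distance over eye-to-chin distance.
    In infants the eyes sit lower on the face, so this ratio is larger."""
    eye_mid = (np.asarray(left_eye) + np.asarray(right_eye)) / 2.0
    return (np.linalg.norm(eye_mid - np.asarray(nose)) /
            np.linalg.norm(eye_mid - np.asarray(chin)))

def classify_age(ratio, wrinkle_score,
                 baby_ratio_thresh=0.40, senior_wrinkle_thresh=0.5):
    """Ratios separate babies from adults; wrinkle analysis then
    separates young adults from seniors. Thresholds are assumed."""
    if ratio > baby_ratio_thresh:
        return "baby"
    if wrinkle_score > senior_wrinkle_thresh:
        return "senior adult"
    return "young adult"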
International Conference on Pattern Recognition | 2000
Paul Smith; Mubarak Shah; N. da Vitoria Lobo
We describe a system for analyzing human driver alertness. It relies on optical flow and color predicates to robustly track a person's head and facial features. Our system classifies rotation in all viewing directions, detects eye/mouth occlusion, detects eye blinking, and recovers the 3D gaze of the eyes. We show results and discuss how this system can be used for monitoring driver alertness.
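For the optical-flow side of such a tracker, a minimal sketch using OpenCV's pyramidal Lucas-Kanade follows; the window size, pyramid depth, and seeding parameters are illustrative defaults, not the paper's settings.

import cv2
import numpy as np

def track_features(prev_gray, next_gray, points):
    """points: (N, 1, 2) float32 feature locations in prev_gray."""
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, points, None,
        winSize=(15, 15), maxLevel=2)
    good = status.ravel() == 1
    return points[good], next_pts[good]

# Typical use: seed with corners inside the detected face region, then
# re-track frame to frame, reinitializing when too few points survive.
# seed = cv2.goodFeaturesToTrack(prev_gray, maxCorners=50,
#                                qualityLevel=0.01, minDistance=5)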
Workshop on Applications of Computer Vision | 2000
W. Phillips; Mubarak Shah; N. da Vitoria Lobo
This paper presents an automatic system for fire detection in video sequences. Many previous methods detect fire, but all except two use spectroscopy or particle sensors. The two that use visual information suffer from an inability to cope with a moving camera or a moving scene: one cannot work on general data, such as movie sequences, and the other is too simplistic and unrestrictive in determining what is considered fire, so that it can be used reliably only in aircraft dry bays. Our system uses color and motion information computed from video sequences to locate fire. It first uses an approach based on a Gaussian-smoothed color histogram to determine the fire-colored pixels, and then uses the temporal variation of those pixels to determine which are actually fire. Unlike the two previous vision-based methods for fire detection, our method is applicable to more areas because of its insensitivity to camera motion. Two specific applications not possible with previous algorithms are the recognition of fire in the presence of global camera motion or scene motion and the recognition of fire in movies, for possible use in an automatic rating system. We show that our method works in a variety of conditions, and that it can automatically determine when it has insufficient information.
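A minimal sketch of the two-stage idea: a Gaussian-smoothed color histogram, trained on labeled fire pixels, flags fire-colored pixels, and temporal variation then filters out static fire-colored regions. The bin count, smoothing sigma, and thresholds are assumptions, not the paper's values.

import numpy as np
from scipy.ndimage import gaussian_filter

BINS = 32

def train_fire_histogram(fire_pixels, sigma=1.0):
    """fire_pixels: (N, 3) uint8 RGB samples labeled as fire."""
    hist, _ = np.histogramdd(fire_pixels, bins=(BINS,) * 3,
                             range=((0, 256),) * 3)
    hist = gaussian_filter(hist, sigma)   # smooth to generalize nearby colors
    return hist / hist.max()

def fire_mask(frames, hist, color_thresh=0.05, var_thresh=20.0):
    """frames: (T, H, W, 3) uint8 clip -> (H, W) boolean fire mask."""
    last = frames[-1]
    idx = (last // (256 // BINS)).reshape(-1, 3)
    colored = (hist[idx[:, 0], idx[:, 1], idx[:, 2]]
               .reshape(last.shape[:2]) > color_thresh)
    # Fire flickers: also require high temporal variance in intensity.
    intensity = frames.mean(axis=3)       # (T, H, W)
    flicker = intensity.std(axis=0) > var_thresh
    return colored & flicker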
International Conference on Pattern Recognition | 2002
Ankur Datta; Mubarak Shah; N. da Vitoria Lobo
We address the problem of detecting human violence in video, such as fist fighting, kicking, and hitting with objects. To detect violence we rely on motion trajectory information and on orientation information about a person's limbs. We define an Acceleration Measure Vector (AMV) composed of the direction and magnitude of motion, and we define jerk to be the temporal derivative of the AMV. We present results from several data sequences involving multiple types of violent activities.
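A minimal sketch of the AMV idea on a 2D trajectory follows. Here the AMV pairs motion magnitude with motion direction per frame and jerk is its temporal derivative; the exact construction and the violence threshold are assumptions for illustration.

import numpy as np

def amv(trajectory):
    """trajectory: (T, 2) tracked positions -> (T-1, 2) rows of
    (magnitude, direction) for each frame-to-frame motion."""
    v = np.diff(trajectory, axis=0)
    mag = np.linalg.norm(v, axis=1)
    ang = np.arctan2(v[:, 1], v[:, 0])
    return np.stack([mag, ang], axis=1)

def jerk(trajectory):
    """Temporal derivative of the AMV."""
    return np.diff(amv(trajectory), axis=0)

def looks_violent(trajectory, jerk_thresh=25.0):
    """Flag trajectories with abrupt changes in motion magnitude
    (e.g., a punch landing). Threshold is hypothetical."""
    return bool(np.any(np.abs(jerk(trajectory)[:, 0]) > jerk_thresh))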
Computer Vision and Pattern Recognition | 1996
R.G. Uhl; N. da Vitoria Lobo
We present a theory and practical computations for automatically matching a police artist sketch to a set of true photographs. We locate facial features in both the sketch and the set of photograph images. The sketch is then photometrically standardized to facilitate comparison with a photo, and both the sketch and the photos are geometrically standardized. Finally, eigenanalysis is employed for matching. Results using real police sketches and arrest photos are presented.
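A minimal sketch of eigenanalysis-based matching: build an eigenspace from the standardized photos, project the standardized sketch into it, and rank photos by subspace distance. The standardization steps are assumed to have been done upstream, and the subspace dimension is an assumption.

import numpy as np

def build_eigenspace(photos, k=20):
    """photos: (N, D) rows of flattened, standardized images."""
    mean = photos.mean(axis=0)
    centered = photos - mean
    # Right singular vectors of the centered data give the top-k eigenfaces.
    _u, _s, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def rank_matches(sketch, photos, mean, eigenfaces):
    """Return photo indices sorted by subspace distance to the sketch."""
    project = lambda x: eigenfaces @ (x - mean)
    s = project(sketch)
    dists = np.linalg.norm(np.stack([project(p) for p in photos]) - s, axis=1)
    return np.argsort(dists)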
IEEE International Conference on Automatic Face and Gesture Recognition | 2000
Andrew Wu; Mubarak Shah; N. da Vitoria Lobo
We present a method for tracking the 3D position of a finger, using a single camera placed several meters away from the user. After skin detection, we use motion to identify the gesticulating arm. The finger point is found by analyzing the arm's outline. To derive a 3D trajectory, we first track the 2D positions of the user's elbow and shoulder. Given that a human's upper arm and lower arm have consistent lengths, we observe that the possible locations of the finger and elbow form two spheres with constant radii. From the previously tracked body points, we can reconstruct these spheres, computing the 3D positions of the elbow and finger. These steps are fully automated and do not require human intervention. The system presented can be used as a visualization tool, or as a user input interface in cases where the user would rather not be constrained by the camera system.
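A minimal sketch of the sphere constraint: with the shoulder's 3D position and a fixed upper-arm length, the elbow lies on a sphere, so intersecting the camera ray through the elbow's 2D image point with that sphere recovers its 3D position (the same step then applies to the finger, using the elbow as center). A pinhole camera with known intrinsics K is assumed; this is an illustration, not the paper's exact formulation.

import numpy as np

def ray_sphere(ray_dir, center, radius):
    """Intersect the ray t*d (from the camera origin) with a sphere.
    Returns the nearer positive t, or None if the ray misses.
    Note the inherent two-intersection ambiguity."""
    d = ray_dir / np.linalg.norm(ray_dir)
    b = -2.0 * d @ center
    c = center @ center - radius ** 2
    disc = b * b - 4.0 * c
    if disc < 0:
        return None
    t = (-b - np.sqrt(disc)) / 2.0
    return t if t > 0 else (-b + np.sqrt(disc)) / 2.0

def elbow_3d(elbow_px, K, shoulder_3d, upper_arm_len):
    """elbow_px: (u, v) image point; K: 3x3 camera intrinsics."""
    ray = np.linalg.inv(K) @ np.array([elbow_px[0], elbow_px[1], 1.0])
    t = ray_sphere(ray, shoulder_3d, upper_arm_len)
    return None if t is None else t * ray / np.linalg.norm(ray)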
International Conference on Computer Vision | 2005
Paul Smith; N. da Vitoria Lobo; Mubarak Shah
This paper contributes a new boosting paradigm for detecting events in video. Previous boosting paradigms in vision focus on single-frame detection and do not scale to video events. New concepts are therefore needed to address questions such as determining whether an event has occurred, localizing the event, handling the same action performed at different speeds, incorporating previous classifier responses into the current decision, and using the temporal consistency of the data to aid detection and recognition. The proposed method can improve weak classifiers by allowing them to use previous history in evaluating the current frame. A learning mechanism built into the boosting paradigm is also given, which allows event-level decisions to be made. This contrasts with previous work in boosting, which uses limited higher-level temporal reasoning and essentially makes object detection decisions at the frame level. Our approach makes extensive use of the temporal continuity of video at the classifier and detector levels. We also introduce a relevant set of activity features. Features are evaluated at multiple zoom levels to improve detection. We show results for a system that is able to recognize 11 actions.
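A minimal sketch of the core temporal idea: a weak classifier may smooth its raw per-frame response over recent history before voting, and the ensemble's frame-level scores are in turn aggregated over a sliding window to make an event-level call. The window sizes and the aggregation rule below are assumptions, not the paper's learned mechanism.

import numpy as np

def temporal_weak_response(raw_responses, t, history=5):
    """Average a weak classifier's responses over the last `history` frames,
    letting past evidence influence the current frame's vote."""
    lo = max(0, t - history + 1)
    return np.mean(raw_responses[lo:t + 1])

def event_decision(frame_scores, window=15, frac=0.6):
    """Declare an event if enough frames in a sliding window fire;
    returns (detected, start_frame) to localize the event."""
    scores = np.asarray(frame_scores)
    for start in range(0, len(scores) - window + 1):
        if np.mean(scores[start:start + window] > 0) >= frac:
            return True, start
    return False, None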
Canadian Conference on Computer and Robot Vision | 2005
D. Batz; N. da Vitoria Lobo; Mubarak Shah
We propose a computer vision system to assist a human user in monitoring their medication habits. This task must be accomplished without knowledge of any pill locations, as pills are too small to track with a static camera and are usually occluded. At the core of this process is a mixture of low-level, high-level, and heuristic techniques, such as skin segmentation, face detection, template matching, and a novel approach to hand localization and occlusion handling. We discuss the approach taken toward this goal, along with the results of our testing phase.
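As a small illustration of the template-matching component, here is a sketch using OpenCV's normalized cross-correlation to locate, say, a medicine bottle in a frame; the template source and acceptance threshold are assumptions.

import cv2

def locate(frame_gray, template_gray, thresh=0.7):
    """Return the best-match top-left corner, or None if the match
    score falls below the (assumed) acceptance threshold."""
    scores = cv2.matchTemplate(frame_gray, template_gray,
                               cv2.TM_CCOEFF_NORMED)
    _min_val, max_val, _min_loc, max_loc = cv2.minMaxLoc(scores)
    return max_loc if max_val >= thresh else None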
Computer Vision and Pattern Recognition | 2004
Paul Smith; Mubarak Shah; N. da Vitoria Lobo
To facilitate activity recognition, analysis of the scene at multiple levels of detail is necessary. Prerequisites for our activity recognition are tracking objects across frames and establishing a consistent labeling of objects across cameras. This paper makes several innovative uses of the epipolar constraint in the context of activity recognition. We first demonstrate how we track heads and hands using epipolar geometry. Next we show how the detected objects are labeled consistently across cameras and zooms by employing epipolar, spatial, trajectory, and appearance properties. Finally we show how our method, utilizing the multiple levels of detail, is able to answer activity recognition questions that are difficult to answer with a single level of detail.
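A minimal sketch of the epipolar check underlying cross-camera labeling: a point x in camera 1 and a point x' in camera 2 can view the same 3D object only if x' lies near the epipolar line F x, where F is the fundamental matrix (assumed precomputed). The greedy pairing and distance threshold are illustrative simplifications of the paper's combined criteria.

import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance from homogeneous point x2 (camera 2) to the epipolar
    line of homogeneous point x1 (camera 1)."""
    line = F @ x1                         # epipolar line in image 2
    return abs(x2 @ line) / np.hypot(line[0], line[1])

def match_across_cameras(F, pts1, pts2, thresh=3.0):
    """Greedy labeling: pair each camera-1 detection with the closest
    camera-2 detection that satisfies the epipolar constraint."""
    pairs = []
    for i, x1 in enumerate(pts1):
        d = [epipolar_distance(F, x1, x2) for x2 in pts2]
        j = int(np.argmin(d))
        if d[j] < thresh:
            pairs.append((i, j))
    return pairs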