Publications


Featured research published by Ismail Haritaoglu.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2000

W⁴: real-time surveillance of people and their activities

Ismail Haritaoglu; David Harwood; Larry S. Davis

W⁴ is a real-time visual surveillance system for detecting and tracking multiple people and monitoring their activities in an outdoor environment. It operates on monocular gray-scale video imagery, or on video imagery from an infrared camera. W⁴ employs a combination of shape analysis and tracking to locate people and their parts (head, hands, feet, torso) and to create models of people's appearance so that they can be tracked through interactions such as occlusions. It can determine whether a foreground region contains multiple people, segment the region into its constituent people, and track them. W⁴ can also determine whether people are carrying objects, segment those objects from their silhouettes, and construct appearance models for them so they can be identified in subsequent frames. W⁴ can recognize events between people and objects, such as depositing an object, exchanging bags, or removing an object. It runs at 25 Hz on 320×240 images on a 400 MHz dual-Pentium II PC.
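The silhouette-based part localization mentioned above can be illustrated with a toy sketch. This is a hypothetical simplification, not the paper's actual algorithm: it takes the column with the strongest vertical projection as the body axis and the topmost silhouette pixel in that column as the head.

```python
import numpy as np

def head_position(silhouette):
    """Estimate a head location from a binary silhouette mask.

    Hypothetical simplification of silhouette projection analysis:
    take the column with the largest vertical projection as the
    body axis, and the topmost foreground pixel in that column as
    the head.
    """
    proj = silhouette.sum(axis=0)              # vertical projection histogram
    axis_col = int(np.argmax(proj))            # dominant body-axis column
    rows = np.nonzero(silhouette[:, axis_col])[0]
    if rows.size == 0:
        return None                            # empty silhouette
    return (int(rows[0]), axis_col)            # (row, col) head estimate

# Toy upright "person": a 2-pixel-wide blob in a 6x5 grid
sil = np.zeros((6, 5), dtype=np.uint8)
sil[1:6, 2] = 1
sil[2:6, 3] = 1
print(head_position(sil))  # -> (1, 2)
```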


Computer Vision and Image Understanding | 2001

Backpack: detection of people carrying objects using silhouettes

Ismail Haritaoglu; Ross Cutler; David Harwood; Larry S. Davis

We describe a video-rate surveillance algorithm for determining whether people are carrying objects or moving unencumbered from a stationary camera. The contribution of the paper is the shape analysis algorithm that both determines whether a person is carrying an object and segments the object from the person so that it can be tracked, e.g., during an exchange of objects between two people. As the object is segmented, an appearance model of the object is constructed. The method combines periodic motion estimation with static symmetry analysis of the silhouettes of a person in each frame of the sequence. Experimental results demonstrate the robustness and real-time performance of the proposed algorithm.
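The static symmetry cue can be sketched as follows. This is a minimal illustration of the idea only (the published method also uses periodic motion cues): reflect the silhouette about the vertical axis through its centroid and flag foreground pixels whose mirror image is background as candidate carried-object pixels.

```python
import numpy as np

def asymmetric_pixels(silhouette):
    """Flag silhouette pixels that break left/right symmetry.

    Minimal sketch of static symmetry analysis: a foreground pixel
    is asymmetric if its mirror about the vertical centroid axis
    falls on background (or outside the image).
    """
    ys, xs = np.nonzero(silhouette)
    cx = xs.mean()                              # centroid column
    out = np.zeros_like(silhouette)
    w = silhouette.shape[1]
    for y, x in zip(ys, xs):
        mx = int(round(2 * cx - x))             # mirrored column
        if mx < 0 or mx >= w or silhouette[y, mx] == 0:
            out[y, x] = 1                       # symmetry-breaking pixel
    return out

# A perfectly symmetric silhouette yields no asymmetric pixels
sym = np.zeros((2, 5), dtype=np.uint8)
sym[0, 1] = sym[0, 3] = 1
print(asymmetric_pixels(sym).sum())  # -> 0
```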


Computer Vision and Pattern Recognition | 2001

Detection and tracking of shopping groups in stores

Ismail Haritaoglu; Myron Flickner

We describe a monocular real-time computer vision system that identifies shopping groups by detecting and tracking multiple people as they wait in a checkout line or at a service counter. Our system segments each frame into foreground regions that contain multiple people. Foreground regions are further segmented into individuals using a temporal segmentation of foreground and motion cues. Once a person is detected, an appearance model based on color and edge density, in conjunction with a mean-shift tracker, is used to recover the person's trajectory. People are grouped together as a shopping group by analyzing interbody distances. The system also monitors the cashier's activities to determine when shopping transactions start and end. Experimental results demonstrate the robustness and real-time performance of the algorithm.
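The interbody-distance grouping step can be sketched with a simple transitive clustering. The threshold and the chaining rule below are assumptions for illustration, not the paper's exact criterion:

```python
def shopping_groups(positions, max_dist=1.0):
    """Cluster tracked people into groups by interbody distance.

    Illustrative sketch: two people belong to the same group if a
    chain of pairwise distances below `max_dist` connects them
    (union-find over position pairs).
    """
    n = len(positions)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = positions[i], positions[j]
            if ((xi - xj) ** 2 + (yi - yj) ** 2) ** 0.5 <= max_dist:
                parent[find(i)] = find(j)   # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Two nearby people form one group; a distant third stays alone
print(shopping_groups([(0, 0), (0.5, 0), (5, 5)]))  # -> [[0, 1], [2]]
```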


International Conference on Pattern Recognition | 2000

A fast background scene modeling and maintenance for outdoor surveillance

Ismail Haritaoglu; David Harwood; Larry S. Davis

We describe fast background scene modeling and maintenance techniques for a real-time visual surveillance system that tracks people in an outdoor environment. It operates on monocular gray-scale video imagery or on video imagery from an infrared camera. The system learns and models the background scene statistically to detect foreground objects, even when the background is not completely stationary (e.g., motion of tree branches), using shape and motion cues. A background maintenance model is also proposed to prevent false positives, such as illumination changes (the sun being blocked by clouds, causing changes in brightness), and false negatives, such as physical changes (e.g., detecting a person as he gets out of a parked car). Experimental results demonstrate the robustness and real-time performance of the algorithm.
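A minimal sketch of a per-pixel statistical background model in this spirit (details simplified; the thresholding rule and the factor `k` below are assumptions): each pixel records its minimum and maximum observed intensity over a training sequence, plus the largest interframe difference, which acts as a per-pixel noise threshold.

```python
import numpy as np

def train_background(frames):
    """Learn a per-pixel background model from a training sequence.

    Simplified sketch: keep each pixel's min and max intensity and
    its largest interframe difference (a per-pixel threshold).
    """
    stack = np.stack(frames).astype(np.int32)
    mn, mx = stack.min(axis=0), stack.max(axis=0)
    diff = np.abs(np.diff(stack, axis=0)).max(axis=0)
    return mn, mx, diff

def foreground(frame, model, k=2):
    """A pixel is foreground if it lies outside [min, max] by more
    than k times its interframe-difference threshold."""
    mn, mx, diff = model
    f = frame.astype(np.int32)
    return ((f < mn - k * diff) | (f > mx + k * diff)).astype(np.uint8)

# Train on two flat frames, then test a frame with one bright pixel
frames = [np.full((2, 2), 10, dtype=np.uint8),
          np.full((2, 2), 12, dtype=np.uint8)]
model = train_background(frames)
scene = np.full((2, 2), 11, dtype=np.uint8)
scene[0, 0] = 120                          # a "person" pixel
print(foreground(scene, model))            # -> [[1 0], [0 0]]
```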


Computer Vision and Pattern Recognition | 2001

Scene text extraction and translation for handheld devices

Ismail Haritaoglu

We describe a scene text extraction system for handheld devices that provides enhanced information perception services to the user. It uses a color camera attached to a personal digital assistant as an input device to capture scene images from the real world, and it employs image enhancement and segmentation methods to extract written information from the scene, convert it to text, and show it to the user, so that he/she can see both the real world and the information together. We implemented a prototype application, an automatic sign/text translator for foreign travelers, which lets users view text and signs that were originally written in a foreign language in the scene, translated into their own language.
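The segmentation stage could be illustrated by a classic binarization step. Otsu thresholding below is a hypothetical stand-in for the paper's unspecified enhancement and segmentation methods, not the system's actual pipeline:

```python
import numpy as np

def otsu_threshold(gray):
    """Pick a global threshold separating text from background.

    Classic Otsu method (a stand-in, not the paper's algorithm):
    choose the threshold maximizing between-class variance of the
    intensity histogram.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum = np.cumsum(hist)                      # cumulative pixel counts
    cum_mean = np.cumsum(hist * np.arange(256))
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = cum[t - 1] / total                # weight of dark class
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = cum_mean[t - 1] / cum[t - 1]      # mean of dark class
        m1 = (cum_mean[-1] - cum_mean[t - 1]) / (total - cum[t - 1])
        var = w0 * w1 * (m0 - m1) ** 2         # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Dark "text" pixels (10) against a bright background (200)
g = np.array([[10, 10, 200, 200]], dtype=np.uint8)
print(otsu_threshold(g))  # a threshold between the two clusters
```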


Ubiquitous Computing | 2001

InfoScope: Link from Real World to Digital Information Space

Ismail Haritaoglu

We describe an information augmentation system (InfoScope) and applications that integrate a handheld device with a color camera to provide enhanced information perception services to users. InfoScope uses the color camera as an input device to capture scene images from the real world, applies computer vision techniques to extract information from those images, converts it into text in the digital information space, and augments it back onto the original scene location. The user can see both the real world and the information together on the handheld device's display. We have implemented two applications. The first is automatic sign/text translation for foreign travelers: users can invoke InfoScope whenever they want to see text or signs, originally written in a foreign language and extracted from the scene images automatically by computer vision techniques, rendered in their own language. The second is Information Augmentation in the City, where a user can see information associated with a building or place overlaid onto real scene images on the PDA's display.


International Conference on Pattern Recognition | 2000

An appearance-based body model for multiple people tracking

Ismail Haritaoglu; David Harwood; Larry S. Davis

We describe an appearance-based human body model for tracking multiple people when they interact with each other, causing significant occlusion among them, or when they re-enter the scene. The proposed model allows real-time surveillance systems to understand who is who after multiple-person interactions, partial or total occlusions, or when a person reappears in the scene. It combines grayscale textural appearance and expected shape information in a 2D dynamic template. Experimental results demonstrate the robustness and real-time performance of the proposed model.
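A rough sketch of how texture and expected shape might be combined in such a template (the combination rule below is an assumption, not the paper's formulation): compare grayscale appearance only where the shape mask expects the body to be.

```python
import numpy as np

def template_similarity(template, shape_mask, patch):
    """Score a candidate image patch against a 2D dynamic template.

    Hedged sketch: grayscale appearance is compared only inside the
    expected-shape mask, using normalized mean absolute difference
    (1.0 means a perfect match).
    """
    m = shape_mask.astype(bool)
    if not m.any():
        return 0.0                             # no expected body pixels
    diff = np.abs(template[m].astype(float) - patch[m].astype(float))
    return 1.0 - diff.mean() / 255.0

# A patch identical to the template scores a perfect 1.0
t = np.full((2, 2), 100, dtype=np.uint8)
m = np.ones((2, 2), dtype=np.uint8)
print(template_similarity(t, m, t))  # -> 1.0
```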


IEEE Workshop on Motion and Video Computing | 2002

Ghost3D: detecting body posture and parts using stereo

Ismail Haritaoglu; David Beymer; Myron Flickner

The paper describes how to detect human posture and upper body parts using overhead narrow-baseline stereo cameras. This information is extracted to understand retail customer behavior while shopping. We propose an approach to detect body posture without using an explicit 3D human model. The proposed method is based on a 3D silhouette, a silhouette-ghost of a person, that is constructed from a 2D silhouette. The 2D silhouette is detected by color and disparity background subtraction. Once the silhouette-ghost is generated, the head and shoulder regions are identified using the human body topological structure which constrains the relative position of each body part. A shape histogram, the distribution of relative positions of points on a 3D silhouette, is introduced to estimate the posture and body parts.
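The shape histogram can be sketched as a binned distribution of point positions relative to a reference center. The bin layout below (radial distance × azimuth angle) is an assumption for illustration, not necessarily the paper's exact parameterization:

```python
import numpy as np

def shape_histogram(points, center, n_r=4, n_theta=8, r_max=2.0):
    """Histogram of 3D silhouette points relative to a center.

    Loose sketch: each point is binned by its distance from the
    center and its azimuth angle, giving a normalized distribution
    usable as a posture descriptor.
    """
    p = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    r = np.linalg.norm(p, axis=1)                       # radial distance
    theta = np.arctan2(p[:, 1], p[:, 0])                # azimuth in [-pi, pi]
    r_bin = np.minimum((r / r_max * n_r).astype(int), n_r - 1)
    t_bin = ((theta + np.pi) / (2 * np.pi) * n_theta).astype(int) % n_theta
    hist = np.zeros((n_r, n_theta))
    for rb, tb in zip(r_bin, t_bin):
        hist[rb, tb] += 1
    return hist / len(points)                           # normalized

# Two unit-distance points in different directions split the mass
h = shape_histogram([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], (0.0, 0.0, 0.0))
print(h.sum())  # -> 1.0
```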


Workshop on Applications of Computer Vision | 2002

Attentive billboards: towards video-based customer behavior understanding

Ismail Haritaoglu; Myron Flickner

We describe a real-time computer vision system and algorithms that extract customer behavior information by detecting and tracking multiple people as they wait and watch advertisements on a billboard or a new-product promotion at a stand. Our system segments each frame into foreground regions that contain multiple people. Foreground regions are further segmented into individuals using a temporal segmentation of foreground and motion cues and global shape constraints on 2D silhouettes. 2D dynamic appearance templates are used to track people. The system can provide online customer information, such as the number of people currently watching the billboard and their gender, and offline customer data, such as how long each person looked at the billboard. Experimental results demonstrate the robustness and real-time performance of the algorithm.


Computer Vision and Image Understanding | 2004

Introduction: special issue on event detection in video

Tanveer Fathima Syeda-Mahmood; Ismail Haritaoglu; Thomas S. Huang

It is our pleasure to welcome you to this special issue of Computer Vision and Image Understanding on event detection in video. The initial call for papers was sent in September 2001, and we are glad that we are finally able to bring this issue to you. The papers in this issue were derived from their original submissions at the first IEEE Event Detection in Video Workshop held in Vancouver, British Columbia, as part of ICCV 2001. Since then, three other Event Workshops have been held, and this topic is now beginning to be part of mainstream computer vision conferences. The fundamental issues surrounding the detection, recognition, and understanding of events continue to be an active topic of research by several academic and industrial researchers across the world. The analysis of events is important in a variety of applications including surveillance, vision-based human–computer interaction, and content-based retrieval. Several challenges exist with regard to the detection and recognition of events. First, a good definition of what constitutes an event itself is lacking. Both salient changes and the states surrounding such changes are often termed events in time-varying data. The time scale for an event can vary over a large range. For example, a man running in an otherwise static scene can constitute an event. The Gulf War is also an event that lasted over a much longer time period. Because of the long duration of this event, it could be regarded as a state during its occurrence. Second, understanding events seems to involve the detection and recognition of objects, actions, and their evolving interrelationships. Moreover, events are often multimodal, requiring the gathering of evidence from information available in multiple media sources such as video and audio.
Even with the best techniques for visual or audio scene analysis, event detection using individual cues will continue to exhibit poor robustness in the foreseeable future, as a result of high detection errors. Further, the localization of events through multimodal fusion will continue to be difficult due to conflicting indications given by the individual cues. The purpose of this special issue was to highlight the state-of-the-art research in this emerging field. We solicited original papers that addressed a range of issues in event detection and recognition in digital video including:
