
Publication


Featured research published by Michael S. Ryoo.


international conference on computer vision | 2009

Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities

Michael S. Ryoo; Jake K. Aggarwal

Human activity recognition is a challenging task, especially when the background is unknown or changing, and when scale or illumination differs across videos. Approaches utilizing spatio-temporal local features have proved that they are able to cope with such difficulties, but they have mainly focused on classifying short videos of simple periodic actions. In this paper, we present a new activity recognition methodology that overcomes the limitations of previous approaches using local features. We introduce a novel matching method, the spatio-temporal relationship match, which is designed to measure structural similarity between sets of features extracted from two videos. Our match hierarchically considers spatio-temporal relationships among feature points, thereby enabling detection and localization of complex non-periodic activities. In contrast to previous approaches that ‘classify’ videos, our approach is designed to ‘detect and localize’ all occurring activities in continuous videos where multiple actors and pedestrians are present. We implement and test our methodology on a newly introduced dataset containing videos of multiple interacting persons and individual pedestrians. The results confirm that our system is able to recognize complex non-periodic activities (e.g. ‘push’ and ‘hug’) from sets of spatio-temporal features even when multiple activities are present in the scene.
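
The matching idea lends itself to a compact illustration. Below is a minimal, hypothetical sketch (not the authors' implementation): each video is reduced to a set of (codeword, time, x, y) features, pairwise temporal relations between features are histogrammed, and two videos are compared by histogram intersection. The feature extraction, spatial relations, and hierarchical search of the actual paper are omitted.

```python
from collections import Counter
from itertools import combinations

def temporal_relation(t1, t2, tol=2):
    """Coarse temporal relation between two feature occurrence times."""
    if abs(t1 - t2) <= tol:
        return "near"
    return "before" if t1 < t2 else "after"

def relation_histogram(features):
    """Histogram of (codeword_i, codeword_j, relation) triples,
    where each feature is a (codeword, t, x, y) tuple."""
    hist = Counter()
    for (w1, t1, _, _), (w2, t2, _, _) in combinations(features, 2):
        hist[(w1, w2, temporal_relation(t1, t2))] += 1
    return hist

def match_score(feats_a, feats_b):
    """Histogram-intersection similarity of the two relation histograms."""
    ha, hb = relation_histogram(feats_a), relation_histogram(feats_b)
    return sum(min(ha[k], hb[k]) for k in ha.keys() & hb.keys())

# Toy usage: (codeword, t, x, y) features from two short clips.
clip1 = [(0, 1, 5, 5), (1, 4, 6, 5), (2, 9, 7, 6)]
clip2 = [(0, 2, 1, 1), (1, 5, 2, 1), (2, 11, 3, 2)]
print(match_score(clip1, clip2))
```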


international conference on computer vision | 2011

Human activity prediction: Early recognition of ongoing activities from streaming videos

Michael S. Ryoo

In this paper, we present a novel approach to human activity prediction. Human activity prediction is a probabilistic process of inferring ongoing activities from videos containing only the onsets (i.e. the beginning part) of the activities. The goal is to enable early recognition of unfinished activities, as opposed to the after-the-fact classification of completed activities. Activity prediction methodologies are particularly necessary for surveillance systems that are required to prevent crimes and dangerous activities from occurring. We probabilistically formulate the activity prediction problem and introduce new methodologies designed for the prediction. We represent an activity as an integral histogram of spatio-temporal features, efficiently modeling how feature distributions change over time. A new recognition methodology named dynamic bag-of-words is developed, which considers the sequential nature of human activities while maintaining the advantages of the bag-of-words approach in handling noisy observations. Our experiments confirm that our approach reliably recognizes ongoing activities from streaming videos with high accuracy.
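
As a rough sketch of the integral-histogram idea (a simplification, not the paper's code): cumulative per-frame word counts let the bag-of-words of any video prefix [0, t] be read off in constant time and compared against the same-length prefix of a training activity. The segment-wise matching of the full dynamic bag-of-words algorithm is reduced here to a single prefix comparison, and all data below is invented.

```python
import numpy as np

def integral_histogram(frame_words, vocab_size):
    """frame_words[t] lists the visual-word ids seen in frame t.
    Returns H where H[t] is the word histogram of frames 0..t."""
    H = np.zeros((len(frame_words), vocab_size))
    for t, words in enumerate(frame_words):
        if t:
            H[t] = H[t - 1]        # carry forward the cumulative counts
        for w in words:
            H[t, w] += 1
    return H

def prefix_similarity(H_train, H_obs, t):
    """Histogram intersection between the two prefixes [0, t]."""
    return np.minimum(H_train[t], H_obs[t]).sum()

# Toy usage: which activity better explains 3 observed frames?
vocab = 4
train_push = integral_histogram([[0], [0, 1], [1], [2]], vocab)
train_hug  = integral_histogram([[3], [3], [2, 3], [2]], vocab)
observed   = integral_histogram([[0], [1], [1]], vocab)
t = 2  # frames seen so far
print("push:", prefix_similarity(train_push, observed, t),
      "hug:",  prefix_similarity(train_hug, observed, t))
```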


computer vision and pattern recognition | 2006

Recognition of Composite Human Activities through Context-Free Grammar Based Representation

Michael S. Ryoo; Jake K. Aggarwal

This paper describes a general methodology for automated recognition of complex human activities. The methodology uses a context-free grammar (CFG) based representation scheme to represent composite actions and interactions. The CFG-based representation enables us to formally define complex human activities based on simple actions or movements. Human activities are classified into three categories: atomic action, composite action, and interaction. Our system is not only able to represent complex human activities formally, but also able to recognize represented actions and interactions with high accuracy. Image sequences are processed to extract poses and gestures. Based on gestures, the system detects actions and interactions occurring in a sequence of image frames. Our results show that the system is able to represent composite actions and interactions naturally. The system was tested to represent and recognize eight types of interactions: approach, depart, point, shake-hands, hug, punch, kick, and push. The experiments show that the system can recognize sequences of represented composite actions and interactions with a high recognition rate.
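
To make the representation concrete, here is a hypothetical, heavily simplified CFG in the spirit of the paper: composite interactions are productions over atomic gestures, and recognition asks whether an observed gesture sequence can be derived from an activity symbol. The production rules and gesture names are invented for illustration; the paper's temporal constraints and real parsing machinery are omitted.

```python
# Each composite activity maps to a list of productions, where a
# production is a sequence of simpler symbols (here, atomic gestures).
GRAMMAR = {
    "push":        [["approach", "stretch_arm", "contact"]],
    "shake_hands": [["approach", "stretch_arm", "hold", "withdraw"]],
}

def _match(syms, rest):
    """Can the symbol list `syms` derive exactly the sequence `rest`?"""
    if not syms:
        return not rest
    head, *tail = syms
    # Try every way of splitting `rest` between head and the remainder.
    return any(derives(head, rest[:i]) and _match(tail, rest[i:])
               for i in range(1, len(rest) + 1))

def derives(symbol, seq):
    """True if `symbol` can derive exactly the observed gesture sequence."""
    if symbol not in GRAMMAR:                # terminal: an atomic gesture
        return len(seq) == 1 and seq[0] == symbol
    return any(_match(prod, seq) for prod in GRAMMAR[symbol])

observed = ["approach", "stretch_arm", "contact"]
print([a for a in GRAMMAR if derives(a, observed)])  # -> ['push']
```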


International Journal of Computer Vision | 2009

Semantic Representation and Recognition of Continued and Recursive Human Activities

Michael S. Ryoo; Jake K. Aggarwal

This paper describes a methodology for automated recognition of complex human activities. The paper proposes a general framework that reliably recognizes high-level human actions and human-human interactions. Ours is a description-based approach, which enables a user to encode the structure of a high-level human activity as a formal representation. Recognition of human activities is done by semantically matching constructed representations with actual observations. The methodology uses a context-free grammar (CFG) based representation scheme as a formal syntax for representing composite activities. Our CFG-based representation enables us to define complex human activities based on simpler activities or movements. Our system takes advantage of both statistical recognition techniques from computer vision and knowledge representation concepts from traditional artificial intelligence. At the low level of the system, image sequences are processed to extract poses and gestures. Based on the recognition of gestures, the high level of the system hierarchically recognizes composite actions and interactions occurring in a sequence of image frames. The concept of hallucinations and a probabilistic semantic-level recognition algorithm are introduced to cope with imperfect lower layers. As a result, the system recognizes human activities including ‘fighting’ and ‘assault’, high-level activities that previous systems had difficulty recognizing. The experimental results show that our system reliably recognizes sequences of complex human activities with a high recognition rate.
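
The description-based matching can be caricatured with time-interval predicates: a high-level activity is encoded as temporal constraints over the intervals of its sub-events, and recognition checks whether detected sub-events satisfy them. The predicate vocabulary and the made-up ‘assault’ description below are illustrative only; the paper's hallucination mechanism and probabilistic layer are not shown.

```python
def before(a, b):
    """Interval a ends before interval b starts."""
    return a[1] < b[0]

def during(a, b):
    """Interval a lies inside interval b (another predicate in the vocabulary)."""
    return b[0] <= a[0] and a[1] <= b[1]

def matches_assault(detections):
    """detections: dict sub-event name -> (start, end) interval.
    Invented description: a punch followed by the victim falling."""
    if not {"punch", "fall_down"} <= detections.keys():
        return False   # a required sub-event was never detected
    return before(detections["punch"], detections["fall_down"])

print(matches_assault({"punch": (3, 5), "fall_down": (6, 9)}))  # True
```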


international conference on pattern recognition | 2010

An overview of contest on semantic description of human activities (SDHA) 2010

Michael S. Ryoo; Chia-Chih Chen; Jake K. Aggarwal; Amit K. Roy-Chowdhury

This paper summarizes the results of the 1st Contest on Semantic Description of Human Activities (SDHA), held in conjunction with ICPR 2010. SDHA 2010 consists of three types of challenges: the High-level Human Interaction Recognition Challenge, the Aerial View Activity Classification Challenge, and the Wide-Area Activity Search and Recognition Challenge. The challenges are designed to encourage participants to test existing methodologies and develop new approaches for complex human activity recognition scenarios in realistic environments. We introduce three new public datasets through these challenges, and discuss the results of the state-of-the-art activity recognition systems designed and implemented by the contestants. A methodology using spatio-temporal voting [19] successfully classified segmented videos in the UT-Interaction datasets, but had difficulty correctly localizing activities in continuous videos. Both the method using local features [10] and the HMM-based method [18] recognized actions from low-resolution videos (i.e. the UT-Tower dataset) successfully. We compare their results in this paper.


advanced video and signal based surveillance | 2007

Detection of abandoned objects in crowded environments

Medha Bhargava; Chia-Chih Chen; Michael S. Ryoo; Jake K. Aggarwal

With concerns about terrorism and global security on the rise, it has become vital to have in place efficient threat detection systems that can detect and recognize potentially dangerous situations, and alert the authorities to take appropriate action. Of particular significance is the case of unattended objects in mass transit areas. This paper describes a general framework that recognizes the event of someone leaving a piece of baggage unattended in forbidden areas. Our approach involves the recognition of four sub-events that characterize the activity of interest. When an unaccompanied bag is detected, the system analyzes its history to determine its most likely owner(s), where the owner is defined as the person who brought the bag into the scene before leaving it unattended. Through subsequent frames, the system keeps a lookout for the owner, whose presence in or disappearance from the scene defines the status of the bag, and decides the appropriate course of action. The system was successfully tested on the i-LIDS dataset.
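
The sub-event logic reads naturally as a small state machine. The sketch below is a hypothetical reconstruction, assuming person/bag detection and tracking are provided by earlier stages; the frame structure and threshold are invented for illustration.

```python
ALARM_AFTER = 30  # frames the owner may be absent (hypothetical value)

def monitor_bag(frames, bag_id):
    """frames[t] is a dict with 'bags' mapping bag id -> owner track id
    and 'people' giving the set of visible person ids at frame t.
    Returns the frame index at which to raise an alarm, or None."""
    owner, absent = None, 0
    for t, frame in enumerate(frames):
        if bag_id not in frame["bags"]:
            return None                    # bag picked up again: no threat
        if owner is None:
            owner = frame["bags"][bag_id]  # traced-back likely owner
        if owner in frame["people"]:
            absent = 0                     # owner still present: reset
        else:
            absent += 1
            if absent >= ALARM_AFTER:
                return t                   # owner gone too long: alert
    return None
```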


computer vision and pattern recognition | 2008

Observe-and-explain: A new approach for multiple hypotheses tracking of humans and objects

Michael S. Ryoo; Jake K. Aggarwal

This paper presents a novel approach for tracking humans and objects under severe occlusion. We introduce a new paradigm for multiple hypotheses tracking, observe-and-explain, as opposed to the previous paradigm of hypothesize-and-test. Our approach efficiently enumerates multiple possibilities of tracking by generating several likely ‘explanations’ after concatenating a sufficient amount of observations. The computational advantages of our approach over the previous paradigm under severe occlusions are presented. The tracking system is implemented and tested using the i-LIDS dataset, which consists of videos of humans and objects moving in a London subway station. The experimental results show that our new approach is able to track humans and objects accurately and reliably even when they are completely occluded, illustrating its advantage over previous approaches.
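
A toy contrast of the two paradigms: where hypothesize-and-test would branch at every frame of an occlusion, observe-and-explain records the observations and enumerates the few consistent ‘explanations’ only once the targets reappear. Everything below (the names and the scoring function) is a hypothetical simplification.

```python
from itertools import permutations

def explanations(ids_before, blobs_after, score):
    """Rank identity assignments of reappearing blobs to the targets
    that entered the occlusion, by total appearance score."""
    cands = []
    for perm in permutations(ids_before, len(blobs_after)):
        s = sum(score(i, b) for i, b in zip(perm, blobs_after))
        cands.append((s, dict(zip(blobs_after, perm))))
    return sorted(cands, key=lambda c: -c[0])

# Toy usage: an appearance model decides who came out where.
score = lambda i, b: 1.0 if (i, b) in {("A", "left"), ("B", "right")} else 0.2
print(explanations(["A", "B"], ["left", "right"], score)[0])
# -> (2.0, {'left': 'A', 'right': 'B'})
```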


machine vision applications | 2009

Detection of object abandonment using temporal logic

Medha Bhargava; Chia-Chih Chen; Michael S. Ryoo; Jake K. Aggarwal

This paper describes a novel framework for a smart threat detection system that uses computer vision to capture, exploit and interpret the temporal flow of events related to the abandonment of an object. Our approach uses contextual information along with an analysis of the causal progression of events to decide whether or not an alarm should be raised. When an unattended object is detected, the system traces it back in time to determine and record who its most likely owner(s) may be. Through subsequent frames, the system searches the scene for the owner and issues an alert if no match is found for the owner over a given period of time. Our algorithm has been successfully tested on two benchmark datasets (PETS 2006 Benchmark Data, 2006; i-LIDS Dataset for AVSS, 2007), and yielded results that are substantially more accurate than similar systems developed by other academic and industrial research groups.
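
In temporal-logic form, the rule might read: raise an alarm when drop(owner, bag) occurs at time t and the owner is absent throughout [t, t + W]. The sketch below evaluates such a rule offline, assuming event detections are given; the predicate names and window are invented.

```python
W = 50  # frames the owner may be absent before an alarm (invented)

def always_absent(owner, presence, t1, t2):
    """presence[t] is the set of person ids visible at frame t."""
    return all(owner not in presence[t]
               for t in range(t1, min(t2, len(presence))))

def abandonment_alarms(drop_events, presence):
    """drop_events: list of (t, owner_id, bag_id) detections."""
    return [(bag, t) for t, owner, bag in drop_events
            if always_absent(owner, presence, t, t + W)]

# Toy usage: the owner drops a bag at frame 10 and never returns.
presence = [{"p1"}] * 10 + [set()] * 60
print(abandonment_alarms([(10, "p1", "bag7")], presence))  # -> [('bag7', 10)]
```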


advanced video and signal based surveillance | 2007

Real-time detection of illegally parked vehicles using 1-D transformation

Jong Taek Lee; Michael S. Ryoo; Matthew Riley; Jake K. Aggarwal

With decreasing costs of high-quality surveillance systems, human activity detection and tracking has become increasingly practical. Accordingly, automated systems have been designed for numerous detection tasks, but the task of detecting illegally parked vehicles has been left largely to the human operators of surveillance systems. We propose a methodology for detecting this event in real-time by applying a novel image projection that reduces the dimensionality of the image data and thus reduces the computational complexity of the segmentation and tracking processes. After event detection, we invert the transformation to recover the original appearance of the vehicle and to allow for further processing that may require the two-dimensional data. The proposed algorithm is able to successfully recognize illegally parked vehicles in real-time in the i-LIDS bag and vehicle detection challenge datasets.
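
A hedged sketch of the projection idea: pixels of a no-parking lane are averaged across the lane's width, turning each frame into a 1-D intensity profile in which a stationary vehicle appears as a persistent foreground segment. The actual method handles arbitrarily shaped lanes via its learned transformation; this toy assumes a straight horizontal lane and invents all thresholds.

```python
import numpy as np

def lane_profile(frame, y0, y1):
    """Average a grayscale frame over lane rows y0..y1 -> 1-D signal."""
    return frame[y0:y1].mean(axis=0)

def static_columns(profiles, bg, thresh=25, min_frames=100):
    """Profile columns that stay foreground for min_frames straight."""
    fg = np.abs(np.stack(profiles) - bg) > thresh    # (T, W) boolean
    run = np.zeros(fg.shape[1], dtype=int)
    hits = set()
    for row in fg:
        run = np.where(row, run + 1, 0)              # consecutive-hit count
        hits |= set(np.flatnonzero(run >= min_frames).tolist())
    return sorted(hits)  # suspected parked-vehicle positions along the lane

# Toy usage on precomputed 1-D profiles: a "vehicle" occupies
# columns 3-4 from frame 10 onward and is flagged as static.
bg = np.zeros(8)
profiles = [np.zeros(8) for _ in range(120)]
for p in profiles[10:]:
    p[3:5] = 200
print(static_columns(profiles, bg))  # -> [3, 4]
```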


international conference on pattern recognition | 2006

Semantic Understanding of Continued and Recursive Human Activities

Michael S. Ryoo; Jake K. Aggarwal

This paper presents a methodology for semantic understanding of complex and continued human activities. A context-free grammar (CFG) based representation scheme developed earlier is extended to construct descriptions of continued and recursive human activities. The new system recognizes recursively described high-level interactions such as fighting and greeting. The system understands activities by detecting the time intervals that satisfy their semantic descriptions.

Collaboration


Dive into Michael S. Ryoo's collaborations.

Top Co-Authors

Jake K. Aggarwal, University of Texas at Austin
Chia-Chih Chen, University of Texas at Austin
Ilaria Gori, University of Texas at Austin
Jong Taek Lee, University of Texas at Austin
Medha Bhargava, University of Texas at Austin
Jong T. Lee, University of Texas at Austin
Kristen Grauman, University of Texas at Austin
Larry H. Matthies, California Institute of Technology
Lu Xia, University of Texas at Austin