Omar U. Florez | Researchain

Publication

Featured researches published by Omar U. Florez.

conference on information and knowledge management | 2010

Data aspects in a relational database

Curtis E. Dyreson; Omar U. Florez

Data has cross-cutting concerns such as versioning, privacy, and reliability. In this paper we sketch support for such concerns by adapting the aspect-oriented programming (AOP) paradigm to data. Our goal, shared by AOP, is to re-engineer applications to support cross-cutting concerns without directly modifying the application's data or queries. We propose modeling a cross-cutting data concern as a data aspect. A data aspect weaves metadata around an application's data and queries, imbuing them with additional semantics for constraint and query processing.

Proceedings of the 5th Ph.D. workshop on Information and knowledge | 2012

Is that scene dangerous?: transferring knowledge over a video stream

Omar U. Florez; Curtis E. Dyreson

Activity mining in traffic scenes aims to automatically explain the complex interactions among moving objects recorded with a surveillance camera. Traditional machine learning algorithms generate a model and validate it with manually labeled data, which is a time-consuming and expensive task. A common issue is that these models often become outdated when external conditions change during later recording, such as a dynamic background, illumination, or different weather conditions. Those changes practically impose a new domain that often makes the original model inaccurate for clustering and classification tasks. If we directly apply a statistical model trained in one domain to another over the same stream, the performance of the algorithm will notably decrease due to distinct activity representations and different marginal and conditional distributions. We approach this problem in two stages: 1) we present mature results on a hierarchical Bayesian model designed to represent every video scene as a multinomial distribution over topics; 2) we present early-stage evidence of an algorithm to transfer knowledge across two instances of the hierarchical model described in the previous stage. A concrete example of the first stage consists of a simple (but efficient) algorithm to incrementally generate association rules to explain current traffic scenes as co-occurrence relationships between topics. This approach is especially useful when we do not have any labels in a target domain but have some labeled information (which frames contain dangerous scenes?) in a source domain, by far the most frequent case in real surveillance systems. The algorithm clusters domain-dependent activities in the latent space and bridges them across domains via domain-independent activities. Our experiments show that our method is able to successfully compete with SVM to perform generalization when the temporal gap between source and target domain is large.

data warehousing and olap | 2011

Building a display of missing information in a data sieve

Curtis E. Dyreson; Omar U. Florez

A data sieve filters a data stream to harvest data of interest and summarizes the harvested data in a multidimensional database (MDB). To build the data sieve, a designer supplies a list of filters. Each filter consists of a filter unit and category for each dimension. The filter unit specifies a pattern (a regular expression) to match as the data stream is filtered. The filter category is the system of measurement in which occurrences of that pattern are counted or otherwise aggregated. Since filtering discards some of the data, incomplete regions within the MDB are created. The missing data complicates querying. While a query on the filtered data can be automatically analysed to determine if sufficient information has been filtered to satisfy it, a better query construction strategy is to prevent users from formulating unsatisfiable queries. To aid users in formulating only satisfiable queries, the GUI for a data sieve needs to color or otherwise display regions of complete, partially complete, and missing data. As a user constructs a query, choosing categories and units, the displayed incomplete regions shift and change, curtailing future choices. For instance, if a user selects a spatial unit of Australia, the display for a temporal category of days may need to be colored as incomplete since no filters would satisfy both selections. We describe an algorithm that uses bit strings to create and maintain the display of incomplete information in a data sieve in real-time.
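The bit-string idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm, only one plausible reading of it: every filter gets one bit, every selectable choice keeps a mask of the filters that contain it, and a choice is shown as incomplete when no filter is consistent with both it and the selections made so far. The class name `SieveDisplay`, its methods, and the tuple encoding of choices are all hypothetical.

```python
class SieveDisplay:
    """Hypothetical sketch: track satisfiable choices with integer bit strings."""

    def __init__(self, filters):
        # filters: list of sets of hashable choices, e.g.
        # {("space", "Australia"), ("time", "years")}.
        # Bit i of a mask stands for filter i.
        self.masks = {}
        for i, choices in enumerate(filters):
            for choice in choices:
                self.masks[choice] = self.masks.get(choice, 0) | (1 << i)
        # Before any selection, every filter is still a candidate.
        self.current = (1 << len(filters)) - 1

    def select(self, choice):
        """Record a user selection by intersecting with its filter mask."""
        self.current &= self.masks.get(choice, 0)

    def is_available(self, choice):
        """True if at least one remaining filter also contains this choice."""
        return (self.current & self.masks.get(choice, 0)) != 0


filters = [
    {("space", "Australia"), ("time", "years")},
    {("space", "Europe"), ("time", "days")},
]
display = SieveDisplay(filters)
display.select(("space", "Australia"))
available_days = display.is_available(("time", "days"))    # no filter has both
available_years = display.is_available(("time", "years"))  # filter 0 has both
```

Each selection costs one bitwise AND, which is why a scheme like this can keep a display current while the user is still building the query.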

multimedia information retrieval | 2010

Sublinear querying of realistic timeseries and its application to human motion

Omar U. Florez; Alexander Ocsa; Curtis E. Dyreson

This paper introduces a novel hashing algorithm for large timeseries databases, which can improve the querying of human motion. Timeseries that represent human motion come from many sources, in particular, videos and motion capture systems. Motion-related timeseries have features which are not commonly present in traditional types of vector data and that create additional indexing challenges: high and variable dimensionality, no Euclidean distance without normalization, and a metric space not fully defined. New techniques are needed to index motion-related timeseries. The algorithm that we present in this paper generalizes the dot product operator to hash timeseries of variable dimensionality without assuming constant dimensionality or requiring dimensionality normalization, unlike other approaches. By avoiding normalization, our hashing algorithm preserves more timeseries information and improves retrieval accuracy, and by hashing achieves sublinear computation time for most searches. Additionally, we show how to further improve the hashing by partitioning the search space using timeseries within the index. This paper also reports the results of experiments that show that the algorithm performs well in the querying of real human motion datasets.

acm symposium on applied computing | 2009

Discovery of time series in video data through distribution of spatiotemporal gradients

Omar U. Florez; SeungJin Lim

We propose a novel algorithm to extract time series from video to characterize the type of motion embedded in the video. Our method relies on describing the motion exposed in a video as a collection of spatiotemporal gradients. Each gradient models high variation in the respective region of the video both in space and time with respect to its spatiotemporal neighborhood. Rather than obtaining a coarse sampling of the motion by taking one event per frame, we obtain a continuous function by considering all the events that fall in the short-time slicing window of time length equal to the value of the temporal variance. The result is a composed time series that represents the motion in the video independent of rotation and scale. As an empirical demonstration of the viability of our method, we are able to cluster human motions contained in 114 videos into hand-based motions and foot-based motions with the precision of 86.0% and 75.9% respectively.

aspect-oriented software development | 2013

Supporting data aspects in Pig Latin

Curtis E. Dyreson; Omar U. Florez; Akshay Thakre; Vishal Sharma

In this paper we apply the aspect-oriented programming (AOP) paradigm to Pig Latin, a dataflow language for cloud computing, used primarily for the analysis of massive data sets. Missing from Pig Latin is support for cross-cutting data concerns. Data, like code, has cross-cutting concerns such as versioning, privacy, and reliability. AOP techniques can be used to weave metadata around Pig data. The metadata imbues the data with additional semantics that must be observed in the evaluation of Pig Latin programs. In this paper we show how to modify Pig Latin to process data woven together with metadata. The data weaver is a layer that maps a Pig Latin program to an augmented Pig Latin program using Pig Latin templates or patterns. We also show how to model additional levels of advice, i.e., meta-metadata.

database and expert systems applications | 2008

HRG: A Graph Structure for Fast Similarity Search in Metric Spaces

Omar U. Florez; SeungJin Lim

Indexing is the most effective technique to speed up queries in databases. While traditional indexing approaches are used for exact search, a query object may not always be identical to an existing data object in similarity search. This paper proposes a new dynamic data structure called Hyperspherical Region Graph (HRG) to efficiently index a large volume of data objects as a graph for similarity search in metric spaces. HRG encodes the given dataset in a smaller number of vertices than the known graph index, Incremental-RNG, while providing flexible traversal without incurring backtracking as observed in tree-based indices. An empirical analysis performed on search time shows that HRG outperforms Incremental-RNG in both cases. HRG, however, outperforms tree-based indices in range search only when the data dimensionality is not so high.

conference on information and knowledge management | 2011

Scalable similarity search of timeseries with variable dimensionality

Omar U. Florez; Curtis E. Dyreson

Timeseries can be similar in shape but differ in length. For example, the sound waves produced by the same word spoken twice have roughly the same shape, but one may be shorter in duration. Stream data mining, approximate querying of image and video databases, data compression, and near duplicate detection are applications that need to be able to classify or cluster such timeseries, and to search for and rank timeseries that are similar to a chosen timeseries. We demonstrate software for clustering and performing similarity search in databases of timeseries data, where the timeseries have high and variable dimensionality. Our demonstration uses Timeseries Sensitive Hashing (TSH)[3] to index the timeseries. TSH adapts Locality Sensitive Hashing (LSH), which is an approximate algorithm to index data points in a d-dimensional space under some (e.g., Euclidean) distance function. TSH, unlike LSH, can index points that do not have the same dimensionality. As examples of the potential of TSH, the demonstration will index and classify timeseries from an image database and timeseries describing human motion extracted from a video stream and a motion capture system.
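For context, the fixed-dimensionality baseline that TSH adapts can be sketched as follows. This shows only standard Euclidean LSH with random projections, h(v) = floor((a · v + b) / w); it does not reproduce TSH, whose generalization to variable dimensionality is not described in the abstract. The function names and the parameters `width` and `num_hashes` are illustrative assumptions.

```python
import math
import random


def make_hash(dim, width, rng):
    """Build h(v) = floor((a . v + b) / width), a Gaussian and b uniform."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)

    def h(v):
        # Plain LSH needs every point to have the same number of dimensions;
        # this is the restriction the abstract says TSH removes.
        if len(v) != dim:
            raise ValueError("plain LSH requires fixed dimensionality")
        return math.floor((sum(x * y for x, y in zip(a, v)) + b) / width)

    return h


def build_index(points, dim, width=4.0, num_hashes=3, seed=0):
    """Bucket point indices by the tuple of their hash values."""
    rng = random.Random(seed)
    hashes = [make_hash(dim, width, rng) for _ in range(num_hashes)]
    table = {}
    for idx, p in enumerate(points):
        table.setdefault(tuple(h(p) for h in hashes), []).append(idx)
    return hashes, table


def candidates(query, hashes, table):
    """Return indices of points that share the query's bucket."""
    return table.get(tuple(h(query) for h in hashes), [])


points = [[0.0, 0.0], [0.0, 0.0], [100.0, 100.0]]
hashes, table = build_index(points, dim=2)
near = candidates([0.0, 0.0], hashes, table)
```

Only the points in the query's bucket are compared exactly, which is where the speed-up comes from; nearby points may still be missed, so practical systems usually query several independent tables.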

conference on information and knowledge management | 2010

Mining rules to explain activities in videos

Omar U. Florez; Curtis E. Dyreson

We present a novel approach to mining dependency rules that explain the scenes present during a video sequence. The approach first characterizes activities based on their most important events. Next, an HMM-based approach finds the mixture components that best describe the clustering dependencies between events and activities in video data. The dependencies among activities are taken as association patterns with temporal precedence and analyzed using their co-occurrence relationships in time windows. This technique is meant to understand the multiple actions taken in a video or to predict future occurrences of certain activities.
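The last step, treating dependencies as precedence patterns counted over time windows, can be sketched roughly as below. It assumes activities have already been labeled per time step (the HMM-based stage is not shown), and the confidence measure, the function name `precedence_rules`, and its parameters are illustrative assumptions rather than the paper's definitions.

```python
from collections import Counter


def precedence_rules(labels, window, min_confidence):
    """Score rules a -> b by how often a precedes b inside a sliding window."""
    support = Counter()  # number of windows that contain activity a
    ordered = Counter()  # number of windows where a occurs before b
    for start in range(len(labels) - window + 1):
        win = labels[start:start + window]
        first_seen = {}
        for pos, a in enumerate(win):
            first_seen.setdefault(a, pos)  # keep the earliest position
        last_seen = {a: pos for pos, a in enumerate(win)}  # latest position
        for a, first_pos in first_seen.items():
            support[a] += 1
            for b, last_pos in last_seen.items():
                if a != b and first_pos < last_pos:
                    ordered[(a, b)] += 1
    rules = {}
    for (a, b), count in ordered.items():
        confidence = count / support[a]
        if confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


labels = ["stop", "cross", "stop", "cross"]
rules = precedence_rules(labels, window=2, min_confidence=0.5)
```

In this toy sequence, "stop" precedes "cross" in two of the three windows that contain "stop", so only that rule passes the 0.5 threshold.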

international conference on acoustics, speech, and signal processing | 2009

MOBHRG: Fast k-nearest-neighbor search by overlap reduction of hyperspherical regions

Omar U. Florez; Xiaojun Qi; Alexander Ocsa

We propose a minimum overlap based hyperspherical region graph indexing structure to achieve fast similarity-based queries for both low and high dimensional datasets. Specifically, we reduce the region overlaps in the graph construction phase by incrementally dividing each saturated hyperspherical region and removing the longest edge of a minimum spanning tree representation of the internal objects. This overlap reduction scheme creates more separated regions, so fewer regions as potential paths are traversed when a query is issued. We also introduce a k-nearest-neighbor search scheme by automatically deciding the search radius to return the required number of nearest neighbors. Our extensive experimental results show the effectiveness of the proposed indexing structure compared with other tree and graph based indexing structures.
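The splitting step described above, cutting the longest edge of a minimum spanning tree over a region's objects, might look like the following sketch. It assumes Euclidean points and uses Prim's algorithm; the function name `split_region` is hypothetical, and the saturation test, the hyperspherical bounds, and the graph bookkeeping from the paper are not modeled.

```python
import math


def split_region(points):
    """Split points into two groups by removing the longest MST edge."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to split")

    # Prim's algorithm: grow the tree from point 0, one nearest point at a time.
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []              # (length, u, v) for every MST edge
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u

    # Drop the longest edge and rebuild adjacency from the remaining ones.
    longest = max(edges)
    adjacency = {i: [] for i in range(n)}
    for edge in edges:
        if edge is not longest:
            _, a, b = edge
            adjacency[a].append(b)
            adjacency[b].append(a)

    # Everything reachable from point 0 forms one group; the rest the other.
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return sorted(seen), [i for i in range(n) if i not in seen]


points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
left, right = split_region(points)
```

Cutting the longest edge separates the two groups that are farthest apart in the tree, which is consistent with the abstract's goal of producing regions that overlap less.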