Publication


Featured research published by Octavian Udrea.


International Conference on Management of Data | 2011

Apples and oranges: a comparison of RDF benchmarks and real RDF datasets

Songyun Duan; Anastasios Kementsietsidis; Kavitha Srinivas; Octavian Udrea

The widespread adoption of the Resource Description Framework (RDF) for the representation of both open web and enterprise data is the driving force behind the increasing research interest in RDF data management. As RDF data management systems proliferate, so do benchmarks to test the scalability and performance of these systems under data and workloads with various characteristics. In this paper, we compare data generated with existing RDF benchmarks and data found in widely used real RDF datasets. The results of our comparison illustrate that existing benchmark data have little in common with real data. Therefore any conclusions drawn from existing benchmark tests might not actually translate to expected behaviours in real settings. In terms of the comparison itself, we show that simple primitive data metrics are inadequate to expose the fundamental differences between real and benchmark data. We make two contributions in this paper: (1) To address the limitations of the primitive metrics, we introduce intuitive and novel metrics that can indeed highlight the key differences between distinct datasets; (2) To address the limitations of existing benchmarks, we introduce a new benchmark generator with the following novel characteristics: (a) the generator can use any (real or synthetic) dataset and convert it into a benchmark dataset; (b) the generator can generate data that mimic the characteristics of real datasets with user-specified data properties. On the technical side, we formulate the benchmark generation problem as an integer programming problem whose solution provides us with the desired benchmark datasets. To our knowledge, this is the first methodological study of RDF benchmarks, as well as the first attempt at generating RDF benchmarks in a principled way.
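As a loose, hypothetical illustration of the kind of dataset metric the paper argues for, the sketch below scores how uniformly instances of each RDF type set the properties seen for that type; the function name, data, and scoring formula are invented for illustration and are not the paper's definitions or its integer-programming formulation.

```python
# Toy coverage-style metric (not the paper's actual metric): score how uniformly
# instances of each RDF type set the properties seen for that type. Synthetic
# benchmark data tends to score near 1.0; real-world data is usually much lower.
from collections import defaultdict

def type_coverage(triples, type_of):
    """triples: iterable of (subject, predicate, object);
    type_of: dict mapping subject -> type name."""
    props_by_type = defaultdict(set)      # all properties ever used by a type
    props_by_subject = defaultdict(set)   # properties set on each subject
    for s, p, _ in triples:
        props_by_subject[s].add(p)
        if s in type_of:
            props_by_type[type_of[s]].add(p)
    scores = {}
    for t, props in props_by_type.items():
        subjects = [s for s, ty in type_of.items() if ty == t]
        if not subjects or not props:
            continue
        covered = sum(len(props_by_subject[s] & props) for s in subjects)
        scores[t] = covered / (len(props) * len(subjects))
    return scores

triples = [("p1", "name", "Ann"), ("p1", "email", "a@x.org"),
           ("p2", "name", "Bob")]                    # p2 has no email
print(type_coverage(triples, {"p1": "Person", "p2": "Person"}))  # {'Person': 0.75}
```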


International Conference on Management of Data | 2007

Leveraging data and structure in ontology integration

Octavian Udrea; Lise Getoor; Renée J. Miller

There is a great deal of research on ontology integration which makes use of rich logical constraints to reason about the structural and logical alignment of ontologies. There is also considerable work on matching data instances from heterogeneous schemas or ontologies. However, little work exploits the fact that ontologies include both data and structure. We aim to close this gap by presenting a new algorithm (ILIADS) that tightly integrates both data matching and logical reasoning to achieve better matching of ontologies. We evaluate our algorithm on a set of 30 pairs of OWL Lite ontologies with the schema and data matchings found by human reviewers. We compare against two systems: the ontology matching tool FCA-merge [28] and the schema matching tool COMA++ [1]. ILIADS shows an average improvement of 25% in quality over FCA-merge and an 11% improvement in recall over COMA++.
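As a rough, hypothetical illustration of the intuition behind combining instance data with structure when matching ontologies, the sketch below blends lexical similarity of class names with overlap of their instance sets; it is not the ILIADS algorithm, and the weights, threshold, and example ontologies are arbitrary.

```python
# Much-simplified illustration of combining lexical similarity with instance
# overlap when matching classes from two ontologies. ILIADS itself interleaves
# this kind of evidence with logical reasoning; everything below is made up.
from difflib import SequenceMatcher

def match_classes(classes_a, classes_b, threshold=0.5):
    """classes_*: dict mapping class name -> set of instance identifiers."""
    matches = []
    for ca, inst_a in classes_a.items():
        for cb, inst_b in classes_b.items():
            lexical = SequenceMatcher(None, ca.lower(), cb.lower()).ratio()
            union = inst_a | inst_b
            overlap = len(inst_a & inst_b) / len(union) if union else 0.0
            score = 0.5 * lexical + 0.5 * overlap
            if score >= threshold:
                matches.append((ca, cb, round(score, 2)))
    return matches

a = {"Author": {"udrea", "getoor"}, "Paper": {"iliads07"}}
b = {"Writer": {"udrea", "miller"}, "Article": {"iliads07"}}
print(match_classes(a, b, threshold=0.4))
```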


International Conference on Management of Data | 2013

Building an efficient RDF store over a relational database

Mihaela A. Bornea; Julian Dolby; Anastasios Kementsietsidis; Kavitha Srinivas; Patrick Dantressangle; Octavian Udrea; Bishwaranjan Bhattacharjee

Efficient storage and querying of RDF data is of increasing importance, due to the increased popularity and widespread acceptance of RDF on the web and in the enterprise. In this paper, we describe a novel storage and query mechanism for RDF which works on top of existing relational representations. Reliance on relational representations of RDF means that one can take advantage of 35+ years of research on efficient storage and querying, industrial-strength transaction support, locking, security, etc. However, there are significant challenges in storing RDF in relational form, which include data sparsity and schema variability. We describe novel mechanisms to shred RDF into relational tables, and novel query translation techniques to maximize the advantages of this shredded representation. We show that these mechanisms result in consistently good performance across multiple RDF benchmarks, even when compared with current state-of-the-art stores. This work provides the basis for RDF support in DB2 v.10.1.
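A toy sketch of an entity-oriented, hashed-column style of shredding is shown below; the actual layout in the paper, its handling of multi-valued predicates and collisions, and the SPARQL-to-SQL translation are considerably more involved, and the column count and data here are invented.

```python
# Toy sketch of entity-oriented, hashed-column shredding of RDF triples into a
# fixed-width relational row: each subject becomes one row, and each predicate
# is hashed to one of N (predicate, value) column pairs. Collisions fall back
# to a "spill" list here; a real store handles them far more carefully.
N_COLS = 4

def shred(triples):
    rows = {}   # subject -> list of N_COLS (predicate, value) slots
    spill = []  # triples whose slot was already taken (collision)
    for s, p, o in triples:
        row = rows.setdefault(s, [None] * N_COLS)
        slot = hash(p) % N_COLS
        if row[slot] is None:
            row[slot] = (p, o)
        else:
            spill.append((s, p, o))
    return rows, spill

def lookup(rows, spill, s, p):
    row = rows.get(s)
    if row:
        entry = row[hash(p) % N_COLS]
        if entry and entry[0] == p:
            return entry[1]
    return next((o for s2, p2, o in spill if (s2, p2) == (s, p)), None)

rows, spill = shred([("db2", "vendor", "IBM"), ("db2", "version", "10.1")])
print(lookup(rows, spill, "db2", "vendor"))   # IBM
```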


IEEE Transactions on Multimedia | 2008

A Constrained Probabilistic Petri Net Framework for Human Activity Detection in Video

Massimiliano Albanese; Rama Chellappa; Vincenzo Moscato; Antonio Picariello; V. S. Subrahmanian; Pavan K. Turaga; Octavian Udrea

Recognition of human activities in restricted settings such as airports, parking lots and banks is of significant interest in security and automated surveillance systems. In such settings, data is usually in the form of surveillance videos with wide variation in quality and granularity. Interpretation and identification of human activities requires an activity model that a) is rich enough to handle complex multi-agent interactions, b) is robust to uncertainty in low-level processing, and c) can handle ambiguities in the unfolding of activities. We present a computational framework for human activity representation based on Petri nets. We propose an extension, Probabilistic Petri Nets (PPN), and show how this model is well suited to address each of the above requirements in a wide variety of settings. We then focus on answering two types of questions: (i) what are the minimal sub-videos in which a given activity is identified with a probability above a certain threshold, and (ii) for a given video, which activity from a given set occurred with the highest probability? We provide the PPN-MPS algorithm for the first problem, as well as two different algorithms (naive PPN-MPA and PPN-MPA) to solve the second. Our experimental results on a dataset consisting of bank surveillance videos and an unconstrained TSA tarmac surveillance dataset show that our algorithms are both fast and provide high quality results.
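As a heavily simplified, hypothetical sketch of the probabilistic-transition idea for a purely sequential activity (the paper's PPNs use full place/transition/token semantics and handle concurrent, multi-agent activities), one might track a token's probability like this; the activity, event labels, and probabilities are invented.

```python
# Heavily simplified sketch for a chain-shaped "Petri net": each transition
# fires on an observed low-level event with an attached probability, and the
# token's probability is multiplied as it advances through the activity.
def track_activity(transitions, observations, threshold=0.3):
    """transitions: ordered list of (event_label, firing_probability);
    observations: sequence of detected event labels."""
    stage, prob = 0, 1.0
    for obs in observations:
        if stage < len(transitions) and obs == transitions[stage][0]:
            prob *= transitions[stage][1]
            stage += 1
    completed = stage == len(transitions) and prob >= threshold
    return completed, prob

atm_skimming = [("approach_atm", 0.9), ("attach_device", 0.6), ("leave", 0.9)]
print(track_activity(atm_skimming, ["approach_atm", "attach_device", "leave"]))
# (True, 0.486)
```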


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2010

PADS: A Probabilistic Activity Detection Framework for Video Data

Massimiliano Albanese; Rama Chellappa; Naresh P. Cuntoor; Vincenzo Moscato; Antonio Picariello; V. S. Subrahmanian; Octavian Udrea

There is now a growing need to identify various kinds of activities that occur in videos. In this paper, we first present a logical language called Probabilistic Activity Description Language (PADL) in which users can specify activities of interest. We then develop a probabilistic framework which assigns to any subvideo of a given video sequence a probability that the subvideo contains the given activity, and we finally develop two fast algorithms to detect activities within this framework. OffPad finds all minimal segments of a video that contain a given activity with a probability exceeding a given threshold. In contrast, the OnPad algorithm examines a video during playout (rather than afterwards as OffPad does) and computes the probability that a given activity is occurring (even if the activity is only partially complete). Our prototype Probabilistic Activity Detection System (PADS) implements the framework and the two algorithms, building on top of existing image processing algorithms. We have conducted detailed experiments and compared our approach to four different approaches presented in the literature. We show that, for complex activity definitions, our approach outperforms all the other approaches.
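A toy sketch of the OffPad-style question, finding minimal frame windows in which an ordered activity is matched with probability above a threshold, might look like the following; the event names, probabilities, and brute-force search are illustrative only, and the real algorithm works on full PADL definitions far more efficiently.

```python
# Toy sketch: find minimal frame windows in which an ordered activity
# (e1, ..., ek) is matched with probability above a threshold, given per-frame
# detection probabilities for each primitive event. A small dynamic program
# scores each window; brute force over windows keeps only the minimal ones.
def window_probability(frames, activity, start, end):
    best = [1.0] + [0.0] * len(activity)   # best[j] = best prob of matching e1..ej
    for f in range(start, end + 1):
        for j in range(len(activity), 0, -1):
            p = frames[f].get(activity[j - 1], 0.0)
            best[j] = max(best[j], best[j - 1] * p)
    return best[-1]

def minimal_segments(frames, activity, threshold):
    hits = [(s, e) for s in range(len(frames)) for e in range(s, len(frames))
            if window_probability(frames, activity, s, e) >= threshold]
    # keep only windows that contain no smaller qualifying window
    return [(s, e) for (s, e) in hits
            if not any((s2, e2) != (s, e) and s <= s2 and e2 <= e for (s2, e2) in hits)]

frames = [{"enter": 0.9}, {"loiter": 0.7}, {"exit": 0.8}, {"exit": 0.4}]
print(minimal_segments(frames, ["enter", "loiter", "exit"], threshold=0.4))  # [(0, 2)]
```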


Information & Computation | 2008

Rule-based static analysis of network protocol implementations

Octavian Udrea; Cristian Lumezanu; Jeffrey S. Foster

Today's software systems communicate over the Internet using standard protocols that have been heavily scrutinized, providing some assurance of resistance to malicious attacks and general robustness. However, the software that implements those protocols may still contain mistakes, and an incorrect implementation could lead to vulnerabilities even in the most well-understood protocol. The goal of this work is to close this gap by introducing a new technique for checking that a C implementation of a protocol matches its description in an RFC or similar standards document. We present a static (compile-time) source code analysis tool called Pistachio that checks C code against a rule-based specification of its behavior. Rules describe what should happen during each round of communication, and can be used to enforce constraints on ordering of operations and on data values. Our analysis is not guaranteed sound due to some heuristic approximations it makes, but has a low false negative rate in practice when compared to known bug reports. We have applied Pistachio to implementations of SSH and RCP, and our system was able to find many bugs, including security vulnerabilities, that we confirmed by hand and checked against each project's bug database.
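As a vastly simplified, hypothetical illustration of a rule-based ordering check (Pistachio itself is a static analysis over C source and tracks data values symbolically), consider checking "A must happen before B" rules against an operation sequence that a real tool would derive from the code's control-flow paths; the rule and operation names below are invented.

```python
# Vastly simplified illustration of a rule-based ordering check. A "rule" here
# is just "operation A must happen before operation B on every path", checked
# against one operation sequence standing in for a control-flow path.
def check_ordering(rules, operations):
    """rules: list of (before, after) pairs; operations: observed op names."""
    violations = []
    for before, after in rules:
        try:
            if operations.index(after) < operations.index(before):
                violations.append((before, after))
        except ValueError:
            pass  # one of the operations never occurs on this path
    return violations

ssh_rules = [("exchange_keys", "send_encrypted"), ("verify_mac", "deliver_payload")]
path = ["send_encrypted", "exchange_keys", "verify_mac", "deliver_payload"]
print(check_ordering(ssh_rules, path))  # [('exchange_keys', 'send_encrypted')]
```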


ESWC'06: Proceedings of the 3rd European Conference on the Semantic Web: Research and Applications | 2006

Annotated RDF

Octavian Udrea; Diego Reforgiato Recupero; V. S. Subrahmanian

There are numerous extensions of RDF that support temporal reasoning, reasoning about pedigree, reasoning about uncertainty, and so on. In this paper, we present Annotated RDF (or aRDF for short) in which RDF triples are annotated by members of a partially ordered set (with bottom element) that can be selected in any way desired by the user. We present a formal declarative semantics (model theory) for annotated RDF and develop algorithms to check consistency of aRDF theories and to answer queries to aRDF theories. We show that annotated RDF captures versions of all the forms of reasoning mentioned above within a single unified framework. We develop a prototype aRDF implementation and show that our algorithms work very fast indeed – in fact, in just a matter of seconds for theories with over 100,000 nodes.
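A minimal sketch of the aRDF idea, assuming a tiny confidence chain as the partially ordered annotation set, is given below; the paper's semantics and query algorithms work over arbitrary posets with a bottom element, and the data and combination rule here are illustrative only.

```python
# Toy sketch: triples carry annotations drawn from a partially ordered set with
# a bottom element. Here the poset is a tiny confidence chain; a derived answer
# is annotated with the greatest lower bound of the triples supporting it, so
# it is never "more certain" than its support.
ORDER = {"none": 0, "low": 1, "high": 2}      # bottom element is "none"

def meet(a, b):
    """Greatest lower bound in the confidence chain."""
    return a if ORDER[a] <= ORDER[b] else b

annotated = [
    ("udrea", "worksWith", "subrahmanian", "high"),
    ("subrahmanian", "worksWith", "chellappa", "low"),
]

def collaboration_annotation(x, z, triples):
    """Annotation under which x worksWith someone who worksWith z."""
    best = "none"
    for s1, p1, o1, a1 in triples:
        for s2, p2, o2, a2 in triples:
            if p1 == p2 == "worksWith" and s1 == x and o1 == s2 and o2 == z:
                step = meet(a1, a2)   # a derivation is only as certain as its weakest triple
                if ORDER[step] > ORDER[best]:
                    best = step
    return best

print(collaboration_annotation("udrea", "chellappa", annotated))  # low
```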


Conference on Information and Knowledge Management | 2009

Mashup-based information retrieval for domain experts

Anand Ranganathan; Anton V. Riabov; Octavian Udrea

In this paper, we tackle the problem of helping domain experts to construct, parameterize and deploy mashups of data and code. We view a mashup as a data processing flow, that describes how data is obtained from one or more sources, processed by one or more components, and finally sent to one or more sinks. Our approach allows specifying patterns of flows, in a language called Cascade. The patterns cover different possible variations of the flows, including variations in the structure of the flow, the components in the flow and the possible parameterizations of these components. We present a tool that makes use of this knowledge of flow patterns and associated metadata to allow domain experts to explore the space of possible flows described in the pattern. The tool uses an AI planning approach to automatically build a flow, belonging to the flow pattern, from a high-level goal, specified as a set of tags. We describe examples from the financial services domain to show the use of flow patterns in allowing domain experts to construct a large variety of mashups rapidly.
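As a toy, hypothetical illustration of composing a flow from a tag-based goal (not the actual Cascade language or its AI planner; the component names and tags are invented), a forward-chaining search over component input/output tags might look like this:

```python
# Toy illustration: each component declares the tags it needs and the tags it
# adds; a simple forward-chaining search assembles a flow that reaches the
# goal tags, standing in for the planner's composition of a flow pattern.
components = {
    "StockFeed": (set(),                          {"stock_prices"}),
    "NewsFeed":  (set(),                          {"news"}),
    "Sentiment": ({"news"},                       {"sentiment"}),
    "Correlate": ({"stock_prices", "sentiment"},  {"correlation_report"}),
}

def plan_flow(goal_tags):
    have, flow = set(), []
    changed = True
    while changed and not goal_tags <= have:
        changed = False
        for name, (needs, adds) in components.items():
            if name not in flow and needs <= have and not adds <= have:
                flow.append(name)
                have |= adds
                changed = True
    return flow if goal_tags <= have else None

print(plan_flow({"correlation_report"}))
# ['StockFeed', 'NewsFeed', 'Sentiment', 'Correlate']
```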


International Conference on Move to Meaningful Internet Systems | 2005

Probabilistic ontologies and relational databases

Octavian Udrea; Deng Yu; Edward Hung; V. S. Subrahmanian

The relational algebra and calculus do not take the semantics of terms into account when answering queries. As a consequence, not all tuples that should be returned in response to a query are always returned, leading to low recall. In this paper, we propose the novel notion of a constrained probabilistic ontology (CPO). We develop the concept of a CPO-enhanced relation in which each attribute of a relation has an associated CPO. These CPOs describe relationships between terms occurring in the domain of that attribute. We show that the relational algebra can be extended to handle CPO-enhanced relations. This allows queries to yield sets of tuples, each of which has a probability of being correct.
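A toy sketch of ontology-aware selection in the spirit of the paper is shown below, assuming a single made-up ontology for one attribute; the actual CPO formalism attaches an ontology to each attribute and extends the full relational algebra, and the probabilities here are invented.

```python
# Toy sketch: a query term also matches related terms from a probabilistic
# ontology, and each returned tuple carries the probability that it really
# satisfies the query, improving recall over exact-match selection.
ontology = {
    # queried term -> {term stored in the relation: probability it qualifies}
    "vehicle": {"vehicle": 1.0, "car": 0.95, "truck": 0.9, "bicycle": 0.4},
}

relation = [
    {"id": 1, "category": "car",     "owner": "ann"},
    {"id": 2, "category": "bicycle", "owner": "bob"},
    {"id": 3, "category": "boat",    "owner": "carol"},
]

def select_with_ontology(rows, attribute, term):
    related = ontology.get(term, {term: 1.0})
    return [(row, related[row[attribute]])
            for row in rows if row[attribute] in related]

for row, prob in select_with_ontology(relation, "category", "vehicle"):
    print(row["id"], prob)   # 1 0.95  /  2 0.4  (the boat is not returned)
```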


Scalable Uncertainty Management | 2007

Aggregates in Generalized Temporally Indeterminate Databases

Octavian Udrea; Zoran Majkic; V. S. Subrahmanian

Dyreson and Snodgrass, as well as Dekhtyar et al., have provided a probabilistic model (and compelling example applications) for why there may be temporal indeterminacy in databases. In this paper, we first propose a formal model for aggregate computation in such databases when there is uncertainty not just in the temporal attribute, but also in the ordinary (non-temporal) attributes. We identify two types of aggregates, event-correlated aggregates and non-event-correlated aggregates, and provide efficient algorithms for both. We prove that our algorithms are correct, and we present experimental results showing that the algorithms work well in practice.
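As a much-simplified, hypothetical illustration of aggregation under temporal indeterminacy, the sketch below computes an expected SUM per day when each tuple's timestamp is a small probability distribution; the paper's model also covers uncertainty in non-temporal attributes and distinguishes event-correlated aggregates, and the data here is invented.

```python
# Toy sketch of one flavour of aggregate over temporally indeterminate data:
# each tuple's timestamp is a probability distribution over days, and we
# compute the expected SUM per day using linearity of expectation.
from collections import defaultdict

sales = [
    # (amount, {possible_day: probability})
    (100.0, {"mon": 0.7, "tue": 0.3}),
    (50.0,  {"tue": 1.0}),
    (80.0,  {"mon": 0.5, "wed": 0.5}),
]

def expected_sum_per_day(tuples):
    totals = defaultdict(float)
    for amount, when in tuples:
        for day, prob in when.items():
            totals[day] += amount * prob
    return dict(totals)

print(expected_sum_per_day(sales))
# {'mon': 110.0, 'tue': 80.0, 'wed': 40.0}
```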
