Zhe Wu | Researchain

Publication

Featured research published by Zhe Wu.

international conference on data engineering | 2008

Implementing an Inference Engine for RDFS/OWL Constructs and User-Defined Rules in Oracle

Zhe Wu; George Eadon; Souripriya Das; Eugene Inseok Chong; Vladimir Kolovski; Melliyal Annamalai; Jagannathan Srinivasan

Inference engines are an integral part of semantic data stores. In this paper, we describe our experience of implementing a scalable inference engine for the Oracle semantic data store. This inference engine computes production rule based entailment of one or more RDFS/OWL encoded semantic data models. The inference engine capabilities include (i) inferencing based on semantics of RDFS/OWL constructs and user-defined rules, (ii) computing ancillary information (namely, semantic distance and proof) for inferred triples, and (iii) validation of semantic data models based on RDFS/OWL semantics. A unique aspect of our approach is that the inference engine is implemented entirely as a database application on top of Oracle database. The paper describes the inferencing requirements, challenges in supporting a sufficiently expressive set of RDFS/OWL constructs, and techniques adopted to build a scalable inference engine. A performance study conducted using both native and synthesized semantic datasets demonstrates the effectiveness of our approach.
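As a rough illustration of what rule-based entailment over RDFS constructs means, the sketch below applies two standard RDFS rules (rdfs11, transitivity of rdfs:subClassOf, and rdfs9, type inheritance) until no new triples appear. It is a naive in-memory fixed-point loop written for this listing, not the database-resident engine the paper describes; the function name `rdfs_closure` and the `ex:` identifiers are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Return the input triples plus everything derivable by rdfs9 and rdfs11.

    Naive forward chaining: re-apply both rules until a fixed point is reached.
    Illustrative only; a real engine would use indexes or SQL joins instead.
    """
    closure = set(triples)
    while True:
        new = set()
        sub = [(s, o) for (s, p, o) in closure if p == SUBCLASS]
        for c, d in sub:
            # rdfs11: (c subClassOf d), (d subClassOf e) => (c subClassOf e)
            for d2, e in sub:
                if d == d2:
                    new.add((c, SUBCLASS, e))
            # rdfs9: (x type c), (c subClassOf d) => (x type d)
            for x, p, c2 in closure:
                if p == RDF_TYPE and c2 == c:
                    new.add((x, RDF_TYPE, d))
        if new <= closure:
            # Nothing new was derived, so the closure is complete.
            return closure
        closure |= new


if __name__ == "__main__":
    asserted = {
        ("ex:Student", SUBCLASS, "ex:Person"),
        ("ex:Person", SUBCLASS, "ex:Agent"),
        ("ex:alice", RDF_TYPE, "ex:Student"),
    }
    for triple in sorted(rdfs_closure(asserted) - asserted):
        print(triple)
```

The inferred triples here are exactly those not in the asserted set; keeping the two sets separate mirrors the distinction between asserted and inferred triples that semantic stores typically maintain.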

international semantic web conference | 2010

Optimizing enterprise-scale OWL 2 RL reasoning in a relational database system

Vladimir Kolovski; Zhe Wu; George Eadon

OWL 2 RL was standardized as a less expressive but scalable subset of OWL 2 that allows a forward-chaining implementation. However, building an enterprise-scale forward-chaining based inference engine that can 1) take advantage of modern multi-core computer architectures, and 2) efficiently update inference for additions remains a challenge. In this paper, we present an OWL 2 RL inference engine implemented inside the Oracle database system, using novel techniques for parallel processing that can readily scale on multicore machines and clusters. Additionally, we have added support for efficient incremental maintenance of the inferred graph after triple additions. Finally, to handle the increasing number of owl:sameAs relationships present in Semantic Web datasets, we have provided a hybrid in-memory/disk based approach to efficiently compute compact equivalence closures. We have done extensive testing to evaluate these new techniques; the test results demonstrate that our inference engine is capable of performing efficient inference over ontologies with billions of triples using a modest hardware configuration.
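To see why a compact equivalence closure matters, note that fully materializing owl:sameAs over a clique of n equivalent resources yields on the order of n² triples, whereas mapping each resource to one representative needs only n entries. The sketch below does this with a plain in-memory union-find structure. This is a swapped-in textbook technique for illustration, not the hybrid in-memory/disk approach of the paper; the function name `same_as_representatives` and the `ex:` identifiers are hypothetical.

```python
def same_as_representatives(pairs):
    """Map every resource in owl:sameAs pairs to one canonical representative.

    Uses union-find with path compression. The representative of each
    equivalence class is its lexicographically smallest member, which keeps
    the result deterministic regardless of input order.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path directly at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller identifier as the root of the merged class.
            if rb < ra:
                ra, rb = rb, ra
            parent[rb] = ra

    return {x: find(x) for x in parent}


if __name__ == "__main__":
    same_as = [("ex:b", "ex:a"), ("ex:c", "ex:b"), ("ex:d", "ex:e")]
    for resource, rep in sorted(same_as_representatives(same_as).items()):
        print(resource, "->", rep)
```

With such a mapping, other triples can be rewritten to use representatives only, so equivalent resources do not multiply the inferred graph.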

DBLP Bibliography (http://dblp.uni-trier.de/) | 2015

RDFox: A Highly-Scalable RDF Store.

Yavor Nenov; Robert Piro; Boris Motik; Ian Horrocks; Zhe Wu; Jay Banerjee

We present RDFox—a main-memory, scalable, centralised RDF store that supports materialisation-based parallel datalog reasoning and SPARQL query answering. RDFox uses novel and highly-efficient parallel reasoning algorithms for the computation and incremental update of datalog materialisations with efficient handling of owl:sameAs. In this system description paper, we present an overview of the system architecture and highlight the main ideas behind our indexing data structures and our novel reasoning algorithms. In addition, we evaluate RDFox on a high-end SPARC T5-8 server with 128 physical cores and 4TB of RAM. Our results show that RDFox can effectively exploit such a machine, achieving speedups of up to 87 times, storage of up to 9.2 billion triples, memory usage as low as 36.9 bytes per triple, importation rates of up to 1 million triples per second, and reasoning rates of up to 6.1 million triples per second.

international world wide web conferences | 2013

Making the most of your triple store: query answering in OWL 2 using an RL reasoner

Yujiao Zhou; Bernardo Cuenca Grau; Ian Horrocks; Zhe Wu; Jay Banerjee

Triple stores implementing the RL profile of OWL 2 are becoming increasingly popular. In contrast to unrestricted OWL 2, the RL profile is known to enjoy favourable computational properties for query answering, and state-of-the-art RL reasoners such as OWLim and Oracle's native inference engine of Oracle Spatial and Graph have proved extremely successful in industry-scale applications. The expressive restrictions imposed by OWL 2 RL may, however, be problematical for some applications. In this paper, we propose novel techniques that allow us (in many cases) to compute exact query answers using an off-the-shelf RL reasoner, even when the ontology is outside the RL profile. Furthermore, in the cases where exact query answers cannot be computed, we can still compute both lower and upper bounds on the exact answers. These bounds allow us to estimate the degree of incompleteness of the RL reasoner on the given query, and to optimise the computation of exact answers using a fully-fledged OWL 2 reasoner. A preliminary evaluation using the RDF Semantic Graph feature in Oracle Database has shown very promising results with respect to both scalability and tightness of the bounds.

First International Workshop on Graph Data Management Experiences and Systems | 2013

Graph analysis: do we have to reinvent the wheel?

Adam Welc; Raghavan Raman; Zhe Wu; Sungpack Hong; Hassan Chafi; Jay Banerjee

The problem of efficiently analyzing graphs of various shapes and sizes has recently been enjoying an increased level of attention in both academia and industry. This trend prompted the creation of specialized graph databases that have been rapidly gaining popularity of late. In this paper we argue that there exist alternatives to graph databases, providing competitive or superior performance, that do not require replacement of the entire existing storage infrastructure by the companies wishing to deploy them.

international conference on data engineering | 2008

A Scalable Scheme for Bulk Loading Large RDF Graphs into Oracle

Souripriya Das; Eugene Inseok Chong; Zhe Wu; Melliyal Annamalai; Jagannathan Srinivasan

The growth of RDF data makes it imperative that an efficient mechanism for bulk-loading RDF graphs be supported. Thus, the paper proposes a bulk-load scheme that allows fast loading of arbitrarily large RDF graphs into a database. Specifically, three modes of load are supported: i) loading into an empty RDF graph, ii) appending to a non-empty RDF graph, and iii) concurrent loads into multiple graphs. The bulk-load scheme is implemented as part of Oracle database semantic technologies and the performance experiments conducted with a variety of RDF graphs (from UniProt and synthesized data of Lehigh University Benchmark) demonstrate the scalability of the approach. The paper outlines the challenges involved in bulk-loading of large RDF graphs, describes the bulk-load scheme, discusses its implementation, and presents a performance study.

Proceedings of Workshop on GRAph Data management Experiences and Systems | 2014

PGX.ISO: Parallel and Efficient In-Memory Engine for Subgraph Isomorphism

Raghavan Raman; Oskar van Rest; Sungpack Hong; Zhe Wu; Hassan Chafi; Jay Banerjee

Subgraph isomorphism, or finding matching patterns in a graph, is a classic graph problem that has many practical use cases. There are even commercialized solutions for this problem such as RDF databases with their support for SPARQL queries. In this paper, we present an efficient, parallel in-memory solution to this problem. Our solution exploits efficient data representations as well as algorithmic extensions, both tailored for parallel, in-memory processing. Moreover, when processing RDF data, we reduce the problem size by converting certain nodes and edges into properties. We also propose a new graph query language where such a conversion can be encoded. Our evaluation shows that our solution can achieve a significant performance boost over an existing secondary storage based RDF database.

international conference on data engineering | 2010

Visualizing large-scale RDF data using Subsets, Summaries, and Sampling in Oracle

Seema Sundara; Medha Atre; Vladimir Kolovski; Souripriya Das; Zhe Wu; Eugene Inseok Chong; Jagannathan Srinivasan

The paper addresses the problem of visualizing large-scale RDF data via a 3-S approach, namely, by using 1) Subsets: to present only relevant data for visualization; both static and dynamic subsets can be specified, 2) Summaries: to capture the essence of RDF data being viewed; summarized data can be expanded on demand thereby allowing users to create hybrid (summary-detail) fisheye views of RDF data, and 3) Sampling: to further optimize visualization of large-scale data where a representative sample suffices. The visualization scheme works with both asserted and inferred triples (generated using RDF(S) and OWL semantics). This scheme is implemented in Oracle by developing a plug-in for the Cytoscape graph visualization tool, which uses functions defined in an Oracle PL/SQL package, to provide fast and optimized access to Oracle Semantic Store containing RDF data. Interactive visualization of a synthesized RDF data set (LUBM 1 million triples), two native RDF datasets (Wikipedia 47 million triples and UniProt 700 million triples), and an OWL ontology (eClassOwl with a large class hierarchy including over 25,000 OWL classes, 5,000 properties, and 400,000 class-properties) demonstrates the effectiveness of our visualization scheme.

international semantic web conference | 2015