Byron Choi | Researchain

Publication

Featured research published by Byron Choi.

very large data bases | 2003

Implementing XQuery 1.0: the Galax experience

Mary F. Fernández; Jérôme Siméon; Byron Choi; Amélie Marian; Gargi Sur

Galax is a light-weight, portable, open-source implementation of XQuery 1.0. Started in December 2000 as a small prototype designed to test the XQuery static type system, Galax has now become a solid implementation, aiming at full conformance with the family of XQuery 1.0 specifications. Because of its completeness and open architecture, Galax also turns out to be a very convenient platform for researchers interested in experimenting with XQuery optimization. We demonstrate the Galax system as well as its most advanced features, including support for XPath 2.0, XML Schema and static type-checking. We also present some of our first experiments with optimization. Notably, we demonstrate query rewriting capabilities in the Galax compiler, and the ability to run queries on documents up to a Gigabyte without the need for preindexing. Although early versions of Galax have been shown in industrial conferences over the last two years, this is the first time it is demonstrated in the database community.

international conference on data engineering | 2011

Processing private queries over untrusted data cloud through privacy homomorphism

Haibo Hu; Jianliang Xu; Chushi Ren; Byron Choi

Query processing that preserves both the data privacy of the owner and the query privacy of the client is a new research problem. It is becoming increasingly important as cloud computing drives more businesses to outsource their data and querying services. However, most existing studies, including those on data outsourcing, address data privacy and query privacy separately and cannot be applied to this problem. In this paper, we propose a holistic and efficient solution that comprises a secure traversal framework and an encryption scheme based on privacy homomorphism. The framework scales to large datasets by leveraging an index-based approach. Based on this framework, we devise secure protocols for processing typical queries such as k-nearest-neighbor (kNN) queries on an R-tree index. Moreover, several optimization techniques are presented to improve the efficiency of the query processing protocols. Our solution is verified by both theoretical analysis and a performance study.
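To make the idea of a privacy homomorphism concrete, here is a minimal sketch of an additively homomorphic scheme. It uses textbook Paillier encryption as a stand-in, which is an assumption for illustration and not the encryption scheme proposed in the paper; the tiny primes and the function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`) are likewise hypothetical and unsuitable for real use.

```python
import math
import random

def keygen(p=61, q=53):
    """Toy Paillier key generation (tiny primes, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)

n, priv = keygen()
total = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
print(decrypt(n, priv, total))
```

An untrusted server holding only ciphertexts can combine them this way without learning the values, which is the general property that homomorphism-based query protocols rely on.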

international conference on management of data | 2004

Incremental evaluation of schema-directed XML publishing

Philip Bohannon; Byron Choi; Wenfei Fan

When large XML documents published from a database are maintained externally, it is inefficient to repeatedly recompute them when the database is updated. Vastly preferable is incremental update, as is common for views stored in a data warehouse. However, to support schema-directed publishing, there may be no simple query that defines the mapping from the database to the external document. To meet the need for efficient incremental update, this paper studies two approaches for incremental evaluation of ATGs [4], a formalism for schema-directed XML publishing. The reduction approach seeks to push as much work as possible to the underlying DBMS. It is based on a relational encoding of XML trees and a nontrivial translation of ATGs to SQL 99 queries with recursion. However, a weakness of this approach is that it relies on high-end DBMS features rather than the lowest common denominator. In contrast, the bud-cut approach pushes only simple queries to the DBMS and performs the bulk of the work in middleware. It capitalizes on the tree structure of XML views to minimize unnecessary recomputations and leverages optimization techniques developed for XML publishing. While implementation of the reduction approach is not yet within the reach of commercial DBMS, we have implemented the bud-cut approach and experimentally evaluated its performance compared to recomputation.
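The reduction approach mentioned above rests on two ingredients: a relational encoding of XML trees and recursive SQL. The following is a minimal sketch of those ingredients only, assuming a simple parent-pointer encoding in a hypothetical `node(id, parent, tag)` table and SQLite's `WITH RECURSIVE`; it does not reproduce the paper's ATG translation.

```python
import sqlite3

# Hypothetical edge encoding of an XML tree: one row per element,
# with a pointer to its parent (NULL for the root).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [
        (1, None, "db"),
        (2, 1, "course"),
        (3, 2, "prereq"),
        (4, 3, "course"),
        (5, 1, "course"),
    ],
)

def subtree(conn, root_id):
    """Return (id, tag, depth) for every node in the subtree rooted at root_id."""
    return conn.execute(
        """
        WITH RECURSIVE sub(id, tag, depth) AS (
            SELECT id, tag, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.tag, s.depth + 1
            FROM node AS n JOIN sub AS s ON n.parent = s.id
        )
        SELECT id, tag, depth FROM sub ORDER BY id
        """,
        (root_id,),
    ).fetchall()

print(subtree(conn, 2))
```

Recomputing only the subtree affected by an update, rather than the whole document, is the kind of work such a recursive query lets the DBMS do on its own.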

symposium on cloud computing | 2012

Improving large graph processing on partitioned graphs in the cloud

Rishan Chen; Mao Yang; Xuetian Weng; Byron Choi; Bingsheng He; Xiaoming Li

As the study of large graphs over hundreds of gigabytes becomes increasingly popular for various data-intensive applications in cloud computing, developing large graph processing systems has become a hot and fruitful research area. Many existing systems support a vertex-oriented execution model and allow users to develop custom logic on vertices. However, the inherently random access pattern of vertex-oriented computation generates a significant amount of network traffic. While graph partitioning is known to be effective in reducing network traffic in graph processing, little attention has been given to how graph partitioning can be effectively integrated into large graph processing in the cloud environment. In this paper, we develop a novel graph partitioning framework to improve the network performance of graph partitioning itself, partitioned graph storage and vertex-oriented graph processing. All optimizations are specifically designed for the cloud network environment. In experiments, we develop a system prototype following Pregel (the latest vertex-oriented graph engine by Google), and extend it with our graph partitioning framework. We conduct the experiments with a real-world social network and synthetic graphs over 100GB each in a local cluster and on Amazon EC2. Our experimental results demonstrate the efficiency of our graph partitioning framework, and the effectiveness of network-performance-aware optimizations on the large graph processing engine.
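The link between partitioning and network traffic in a vertex-oriented model can be illustrated with a small single-machine simulation. This is a sketch under stated assumptions: the function `min_label_propagation`, the superstep loop, and the message counter are hypothetical and only mimic the Pregel style; they are not the system described in the paper.

```python
def min_label_propagation(edges, partition):
    """Vertex-centric connected components by min-label propagation.

    Returns (labels, cross) where cross counts messages whose sender and
    receiver live in different partitions (a proxy for network traffic).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    labels = {v: v for v in neighbors}
    active = set(neighbors)
    cross = 0
    while active:  # one loop iteration corresponds to one superstep
        inbox = {}
        for u in active:
            for v in neighbors[u]:
                inbox.setdefault(v, []).append(labels[u])
                if partition[u] != partition[v]:
                    cross += 1
        active = set()
        for v, msgs in inbox.items():
            best = min(msgs)
            if best < labels[v]:
                labels[v] = best
                active.add(v)  # only changed vertices send next round
    return labels, cross

edges = [(1, 2), (2, 3), (4, 5)]
aligned = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}    # partitions follow components
scattered = {1: 0, 2: 1, 3: 0, 4: 0, 5: 1}  # neighbors split across machines
print(min_label_propagation(edges, aligned)[1])
print(min_label_propagation(edges, scattered)[1])
```

Both runs compute the same labels, but only the scattered assignment sends messages across partition boundaries.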

international conference on data engineering | 2008

Multiple Materialized View Selection for XPath Query Rewriting

Nan Tang; Jeffrey Xu Yu; M.T. Ozsu; Byron Choi; Kam-Fai Wong

We study the problem of answering XPATH queries using multiple materialized views. Despite the efforts on answering queries using a single materialized view, answering queries using multiple views remains relatively new. We address two important aspects of this problem: multiple-view selection and equivalent multiple-view rewriting. With regard to the first problem, we propose an NFA-based approach (called VFILTER) to filter views that cannot be used to answer a given query. We then present the criterion for multiple view/query answerability. Based on the output of VFILTER, we further propose a heuristic method to identify a minimal view set that can answer a given query. For the problem of multiple-view rewriting, we first refine the materialized fragments of each selected view (for example, by pushing selections), and then join the refined fragments utilizing an encoding scheme. Finally, we extract the result of the query from the materialized fragments of a single view. Experiments show the efficiency of our approach.

database and expert systems applications | 2003

On the Optimality of Holistic Algorithms for Twig Queries

Byron Choi; Malika Mahoui; Derick Wood

Streaming XML documents has many emerging applications. However, in this paper, we show that the restrictions imposed by data streaming are too restrictive for processing twig queries – the core operation for XML query processing. The previously proposed algorithm TwigStack is an optimal algorithm for processing twig queries with only descendant edges over streams of nodes. The cause of the suboptimality of the TwigStack algorithm is the structural recursion appearing in XML documents. We show that without relaxing the data streaming model, it is not possible to develop an optimal holistic algorithm for twig queries. Also, the computation of twig queries is not memory bounded. This motivates us to study two variations of the data streaming model: (1) offline sorting is allowed and the algorithm is allowed to select the correct nodes to be streamed and (2) multiple scans on the data streams are allowed. We show the lower bounds of the two variations.

conference on information and knowledge management | 2011

PCMLogging: reducing transaction logging overhead with PCM

Shen Gao; Jianliang Xu; Bingsheng He; Byron Choi; Haibo Hu

Phase-change memory (PCM), as one of the most promising next-generation memory technologies, offers various attractive properties such as non-volatility, bit-alterability, and low idle energy consumption. In this paper, we present PCMLogging, a novel logging scheme that exploits PCM devices for both data buffering and transaction logging in disk-based databases. Unlike the traditional approach, where buffered updates and transaction logs are completely separated, the new logging scheme integrates them. Our preliminary experiments show that PCMLogging improves disk I/O performance by up to 40% in comparison with a basic buffering and logging scheme.

international world wide web conferences | 2008

On incremental maintenance of 2-hop labeling of graphs

Ramadhana Bramandia; Byron Choi; Wee Keong Ng

Recent interest in XML, the Semantic Web, and Web ontology, among other topics, has sparked a renewed interest in graph-structured databases. A fundamental query on graphs is the reachability test of nodes. Recently, 2-hop labeling has been proposed to index large collections of XML and/or graphs for efficient reachability tests. However, there has been little work on updates of 2-hop labeling. This is compounded by the fact that Web data changes over time. In response, this paper studies the incremental maintenance of 2-hop labeling. We identify the main reason for the inefficiency of updates of existing 2-hop labels. We propose two updatable 2-hop labelings, hybrids of 2-hop labeling, and their incremental maintenance algorithms. The proposed 2-hop labeling is derived from graph connectivity, as opposed to set cover, which is used by all previous work. Our experimental evaluation illustrates the space efficiency and update performance of various kinds of 2-hop labeling. The main conclusion is that there is a natural way to trade some index size for update performance in 2-hop labeling.
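For readers unfamiliar with 2-hop labeling, the reachability test it enables can be sketched as follows. This is a deliberately naive construction that treats every node as a center, so the labels are complete but far from minimal; it is an assumption for illustration and is neither the set-cover construction of earlier work nor the updatable labelings proposed in the paper. The function names are hypothetical.

```python
from collections import defaultdict

def descendants(graph, start):
    """Return every node reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def build_two_hop_labels(graph):
    """Build L_out and L_in using every node as a center (not minimal)."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    l_out = defaultdict(set)  # centers that the node can reach
    l_in = defaultdict(set)   # centers that can reach the node
    for center in nodes:
        for v in descendants(graph, center):
            l_out[center].add(v)
            l_in[v].add(center)
    return l_out, l_in

def reachable(l_out, l_in, u, v):
    """u reaches v iff some center lies in both L_out(u) and L_in(v)."""
    return bool(l_out[u] & l_in[v])

graph = {"a": ["b"], "b": ["c"], "d": ["b"]}
l_out, l_in = build_two_hop_labels(graph)
print(reachable(l_out, l_in, "a", "c"))
print(reachable(l_out, l_in, "c", "a"))
```

A query is answered with one set intersection instead of a graph traversal; the cost is that inserting or deleting an edge can invalidate many labels, which is the maintenance problem the abstract addresses.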

IEEE Transactions on Knowledge and Data Engineering | 2010

Incremental Maintenance of 2-Hop Labeling of Large Graphs

Ramadhana Bramandia; Byron Choi; Wee Keong Ng

Recent interest in XML, the Semantic Web, and Web ontology, among other topics, has sparked a renewed interest in graph-structured databases. A fundamental query on graphs is the reachability test of nodes. Recently, 2-hop labeling has been proposed to index a large collection of XML and/or graphs for efficient reachability tests. However, there has been little work on updates of 2-hop labeling. This is compounded by the fact that data may often change over time. In response, this paper studies incremental maintenance of 2-hop labeling. We identify the main reason for the inefficiency of updates of existing 2-hop labels. We propose three updatable 2-hop labelings, hybrids of 2-hop labeling, and their incremental maintenance algorithms. The proposed 2-hop labeling is derived from graph connectivity, as opposed to set cover, which is used by most previous work. Our experimental evaluation illustrates the space efficiency and update performance of various kinds of 2-hop labelings. Our results show that our incremental maintenance algorithm can be two orders of magnitude faster than previous methods and the size of our 2-hop labeling can be comparable to existing 2-hop labeling. We conclude that there is a natural way to trade some index size for update performance in 2-hop labeling.

Journal of Computer Science and Technology | 2008

Updating recursive XML views of relations

Byron Choi; Gao Cong; Wenfei Fan; Stratis D. Viglas

This paper investigates the view update problem for XML views published from relational data. We consider XML views defined in terms of mappings directed by possibly recursive DTDs compressed into DAGs and stored in relations. We provide new techniques to efficiently support XML view updates specified in terms of XPath expressions with recursion and complex filters. The interaction between XPath recursion and DAG compression of XML views makes the analysis of the XML view update problem rather intriguing. Furthermore, many issues are still open even for relational view updates, and need to be explored. In response, on the XML side, we revise the notion of side effects and update semantics based on the semantics of XML views, and present efficient algorithms to translate XML updates to relational view updates. On the relational side, we propose a mild condition on SPJ views, and show that under this condition the analysis of deletions on relational views becomes PTIME while the insertion analysis is NP-complete. We develop an efficient algorithm to process relational view deletions, and a heuristic algorithm to handle view insertions. Finally, we present an experimental study to verify the effectiveness of our techniques.
