Publication


Featured research published by Chengkai Li.


International Conference on Management of Data | 2006

Supporting ad-hoc ranking aggregates

Chengkai Li; Kevin Chen Chuan Chang; Ihab F. Ilyas

This paper presents a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries, which provide the k groups with the highest aggregates as results. Essential support of such queries is lacking in current systems, which process the queries in a naïve materialize-group-sort scheme that can be prohibitively inefficient. Our framework is based on three fundamental principles. The Upper-Bound Principle dictates the requirements of early pruning, and the Group-Ranking and Tuple-Ranking Principles dictate group-ordering and tuple-ordering requirements. They together guide the query processor toward a provably optimal tuple schedule for aggregate query processing. We propose a new execution framework to apply the principles and requirements. We address the challenges in realizing the framework and implementing new query operators, enabling efficient group-aware and rank-aware query plans. The experimental study validates our framework by demonstrating orders of magnitude performance improvement in the new query plans, compared with the traditional plans.
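
To make the baseline concrete, the naive materialize-group-sort scheme that the abstract criticizes can be sketched as follows. This is only an illustration: the row layout, the SUM aggregate, and the function name `naive_topk_groups` are assumptions, and the paper's rank-aware operators and pruning principles are not reproduced here.

```python
import heapq
from collections import defaultdict

def naive_topk_groups(rows, k):
    """Baseline: aggregate every group in full, then pick the k best groups."""
    totals = defaultdict(float)
    for group, score in rows:      # scan and group every tuple
        totals[group] += score     # SUM aggregate per group (assumed)
    # Only after all groups are fully aggregated are the top k selected.
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])

# Hypothetical (group, score) tuples.
rows = [("a", 3.0), ("b", 5.0), ("a", 4.0), ("c", 1.0), ("b", 1.0)]
top2 = naive_topk_groups(rows, 2)
print(top2)
```

Every tuple of every group is read before any result is produced, which is the cost that early pruning is meant to avoid.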


International World Wide Web Conferences | 2010

Facetedpedia: dynamic generation of query-dependent faceted interfaces for wikipedia

Chengkai Li; Ning Yan; Senjuti Basu Roy; Lekhendro Lisham; Gautam Das

This paper proposes Facetedpedia, a faceted retrieval system for information discovery and exploration in Wikipedia. Given the set of Wikipedia articles resulting from a keyword query, Facetedpedia generates a faceted interface for navigating the result articles. Compared with other faceted retrieval systems, Facetedpedia is fully automatic and dynamic in both facet generation and hierarchy construction, and the facets are based on the rich semantic information from Wikipedia. The essence of our approach is to build upon the collaborative vocabulary in Wikipedia, more specifically the intensive internal structures (hyperlinks) and folksonomy (category system). Given the sheer size and complexity of this corpus, the space of possible choices of faceted interfaces is prohibitively large. We propose metrics for ranking individual facet hierarchies by users' navigational cost, and metrics for ranking interfaces (each with k facets) by both their average pairwise similarities and average navigational costs. We thus develop faceted interface discovery algorithms that optimize the ranking metrics. Our experimental evaluation and user study verify the effectiveness of the system.


International Conference on Data Engineering | 2014

GQBE: Querying knowledge graphs by example entity tuples

Nandish Jayaram; Mahesh Gupta; Arijit Khan; Chengkai Li; Xifeng Yan; Ramez Elmasri

We present GQBE, a system that provides a simple and intuitive mechanism for querying large knowledge graphs. Answers to tasks such as “list university professors who have designed some programming languages and also won an award in Computer Science” are best found in knowledge graphs that record entities and their relationships. Real-world knowledge graphs are difficult to use due to their sheer size and complexity and the challenging task of writing complex structured graph queries. Toward better usability of query systems over knowledge graphs, GQBE allows users to query knowledge graphs by example entity tuples without writing complex queries. In this demo we present: 1) a detailed description of the various features and user-friendly GUI of GQBE, 2) a brief description of the system architecture, and 3) a demonstration scenario that we intend to show the audience.


Extending Database Technology | 2012

An optimization framework for map-reduce queries

Leonidas Fegaras; Chengkai Li; Upa Gupta

We present an effective optimization framework for general SQL-like map-reduce queries, which is based on a novel query algebra and uses a small number of higher-order physical operators that are directly implementable on existing map-reduce systems, such as Hadoop. Although our framework is applicable to any SQL-like map-reduce query language, we focus on a powerful query language, called MRQL. Current map-reduce query languages, such as HiveQL and PigLatin, enable users to plug custom map-reduce scripts into queries for those jobs that cannot be declaratively coded in the query language, which may result in suboptimal, error-prone, and hard-to-maintain code. In contrast to these languages, MRQL is expressive enough to capture most of these computations in declarative form and at the same time is amenable to optimization. We describe an optimization framework that maps the algebraic forms derived from the MRQL queries to efficient workflows of map-reduce operations that consist of our physical plan operators. We also describe many algebraic optimizations, such as fusing cascading map-reduce jobs into one job and synthesizing a combine function from the reduce function of a map-reduce job. Finally, we report on a prototype system implementation and we show some performance results of evaluating MRQL queries on a small cluster of computers.
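
One optimization named in the abstract, deriving a combine function from the reduce function, can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions: it is plain Python rather than MRQL or Hadoop, the helper names (`map_words`, `reduce_sum`, `run_job`) are hypothetical, and reusing the reducer as the combiner is only valid here because summation is associative and commutative.

```python
from collections import defaultdict

def map_words(line):
    # Map step: emit (word, 1) for each word in the line.
    return [(word, 1) for word in line.split()]

def reduce_sum(values):
    # Reduce step: associative and commutative, so it can double as a combiner.
    return sum(values)

def run_job(partitions, combiner=None):
    """Simulate one map-reduce job; return the result and the number of shuffled values."""
    shuffled = defaultdict(list)
    shipped = 0
    for part in partitions:
        local = defaultdict(list)
        for line in part:
            for key, value in map_words(line):
                local[key].append(value)
        for key, values in local.items():
            # With a combiner, each mapper pre-aggregates before the shuffle.
            out = [combiner(values)] if combiner else values
            shipped += len(out)
            shuffled[key].extend(out)
    result = {key: reduce_sum(values) for key, values in shuffled.items()}
    return result, shipped

partitions = [["a b a", "a"], ["b b"]]   # hypothetical input splits
plain, shipped_plain = run_job(partitions)
combined, shipped_combined = run_job(partitions, combiner=reduce_sum)
print(plain == combined, shipped_plain, shipped_combined)
```

Both runs produce the same counts, but the combined run ships fewer intermediate values across the simulated shuffle.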


IEEE Transactions on Knowledge and Data Engineering | 2012

One Size Does Not Fit All: Toward User- and Query-Dependent Ranking for Web Databases

Aditya Telang; Chengkai Li; Sharma Chakravarthy

With the emergence of the deep web, searching web databases in domains such as vehicles, real estate, etc., has become a routine task. One of the problems in this context is ranking the results of a user query. Earlier approaches for addressing this problem have used frequencies of database values, query logs, and user profiles. A common thread in most of these approaches is that ranking is done in a user- and/or query-independent manner. This paper proposes a novel query- and user-dependent approach for ranking query results in web databases. We present a ranking model, based on two complementary notions of user and query similarity, to derive a ranking function for a given user query. This function is acquired from a sparse workload comprising several such ranking functions derived for various user-query pairs. The model is based on the intuition that similar users display comparable ranking preferences over the results of similar queries. We define these similarities formally in alternative ways and discuss their effectiveness analytically and experimentally over two distinct web databases.


International Workshop on Challenges in Web Information Retrieval and Integration | 2005

Query Routing: Finding Ways in the Maze of the DeepWeb

Govind Kabra; Chengkai Li; Kevin Chen Chuan Chang

This paper presents a source selection system, based on an attribute co-occurrence framework, for ranking and selecting Deep Web sources that provide information relevant to users' requirements. Given the huge number of heterogeneous Deep Web data sources, end users may not know which sources can satisfy their information needs. Selecting and ranking sources by relevance to user requirements is challenging. Our system finds appropriate sources for such users by allowing them to input just an imprecise initial query. As a key insight, we observe that the semantics of and relationships between Deep Web sources are self-revealing through their query interfaces and, in essence, through the co-occurrences between attributes. Based on this insight, we design a co-occurrence-based attribute graph that captures the relevance of attributes and use it to rank sources in order of relevance to the user's requirement. Further, we present an iterative algorithm that realizes our model. Our preliminary evaluation on real-world sources demonstrates the effectiveness of our approach.
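
The idea of ranking sources through attribute co-occurrence can be sketched roughly as below. This is not the paper's model or its iterative algorithm: the source names, attribute sets, scoring formula, and the 0.1 weight are all illustrative assumptions chosen to keep the example small.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sources, each described by the attributes on its query interface.
sources = {
    "cars_a": {"make", "model", "price"},
    "cars_b": {"make", "model", "year"},
    "books": {"title", "author", "price"},
}

# Edge weights of the attribute graph: how often two attributes share an interface.
cooc = Counter()
for attrs in sources.values():
    for a, b in combinations(sorted(attrs), 2):
        cooc[(a, b)] += 1
        cooc[(b, a)] += 1

def score(source_attrs, query_attrs):
    # Direct matches count fully; co-occurring attributes add a small bonus.
    direct = len(source_attrs & query_attrs)
    related = sum(cooc[(q, a)] for q in query_attrs for a in source_attrs if a != q)
    return direct + 0.1 * related   # 0.1 is an arbitrary illustrative weight

def rank_sources(query_attrs):
    return sorted(sources, key=lambda s: score(sources[s], query_attrs), reverse=True)

ranking = rank_sources({"make"})
print(ranking)
```

Even with an imprecise one-attribute query, the two car sources rank above the book source because their attributes co-occur with the query attribute.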


IEEE Transactions on Knowledge and Data Engineering | 2015

Querying Knowledge Graphs by Example Entity Tuples

Nandish Jayaram; Arijit Khan; Chengkai Li; Xifeng Yan; Ramez Elmasri

We witness an unprecedented proliferation of knowledge graphs that record millions of entities and their relationships. While knowledge graphs are structure-flexible and content-rich, they are difficult to use. The challenge lies in the gap between their overwhelming complexity and the limited database knowledge of non-professional users. If writing structured queries over “simple” tables is difficult, complex graphs are only harder to query. As an initial step toward improving the usability of knowledge graphs, we propose to query such data by example entity tuples, without requiring users to form complex graph queries. Our system, Graph Query By Example (GQBE), automatically discovers a weighted hidden maximum query graph based on input query tuples, to capture a user’s query intent. It then efficiently finds and ranks the top approximate matching answer graphs and answer tuples. We conducted experiments and user studies on the large Freebase and DBpedia datasets and observed appealing accuracy and efficiency. Our system provides a complementary approach to the existing keyword-based methods, facilitating user-friendly graph querying. To the best of our knowledge, there was no such proposal in the past in the context of graphs.


International Conference on Management of Data | 2003

Composing XSL transformations with XML publishing views

Chengkai Li; Philip Bohannon; P. P. S. Narayan

While the XML Stylesheet Language for Transformations (XSLT) was not designed as a query language, it is well-suited for many query-like operations on XML documents including selecting and restructuring data. Further, it actively fulfills the role of an XML query language in modern applications and is widely supported by application platform software. However, the use of database techniques to optimize and execute XSLT has only recently received attention in the research community. In this paper, we focus on the case where XSL transformations are to be run on XML documents defined as views of relational databases. For a subset of XSLT, we present an algorithm to compose a transformation with an XML view, eliminating the need for the XSLT execution. We then describe how to extend this algorithm to handle several additional features of XSLT, including a proposed approach for handling recursion.


International Workshop on Testing Database Systems | 2010

Dynamic symbolic database application testing

Chengkai Li; Christoph Csallner


IEEE Transactions on Knowledge and Data Engineering | 2014

On Skyline Groups

Nan Zhang; Chengkai Li; Naeemul Hassan; Sundaresan Rajasekaran; Gautam Das

Collaboration


Dive into Chengkai Li's collaborations.

Top Co-Authors

Naeemul Hassan
University of Texas at Arlington

Gensheng Zhang
University of Texas at Arlington

Ning Yan
University of Texas at Arlington

Ramez Elmasri
University of Texas at Arlington

Aditya Telang
University of Texas at Arlington

Afroza Sultana
University of Texas at Arlington

Gautam Das
University of Texas at Arlington