Tok Wang Ling

National University of Singapore

Publication

Featured research published by Tok Wang Ling.

international conference on management of data | 2005

On boosting holism in XML twig pattern matching using structural indexing techniques

Ting Chen; Jiaheng Lu; Tok Wang Ling

Searching for all occurrences of a twig pattern in an XML document is an important operation in XML query processing. Recently, a holistic method, TwigStack [2], has been proposed. The method avoids generating large intermediate results that do not contribute to the final answer, and it is CPU and I/O optimal when twig patterns have only ancestor-descendant relationships. Another important direction in XML query processing is to build structural indexes [3][8][13][15] over XML documents to avoid unnecessary scanning of source documents. We regard XML structural indexing as a technique to partition XML documents and call it a streaming scheme in our paper. In this paper we develop a method to perform holistic twig pattern matching on XML documents partitioned using various streaming schemes. Our method avoids unnecessary scanning of irrelevant portions of XML documents. More importantly, depending on the streaming scheme used, it can process a large class of twig patterns consisting of both ancestor-descendant and parent-child relationships and avoid generating redundant intermediate results. Our experiments demonstrate the applicability and the performance advantages of our approach.

Archive | 2003

Conceptual Modeling – ER 2003

Il-Yeol Song; Stephen W. Liddle; Tok Wang Ling; Peter Scheuermann

The Semantic Web and the Web service paradigm are currently the most important trends on the way to the next generation of the Web. They promise new opportunities for content and service provision, enabling manifold and flexible new applications and improved support for individual and cooperative tasks. The use of the Web service paradigm in the development of Web applications that typically couple application databases with user dialogs is quite obvious. The development of Web applications that can be operated effectively in the Semantic Web context (Semantic Web Applications), however, imposes some challenges. Two main challenges towards extended (conceptual) modeling support are addressed in this talk: 1. In the Semantic Web, Web applications move from a purely human user community towards a mixed user community consisting of humans as well as of software agents; this results in new requirements towards models for Web applications’ user interfaces; 2. Automatic interpretation of content, one of the main building blocks of the Semantic Web, is based on interlinking local models with globally defined interpretation schemes like vocabularies and ontologies; this has to be reflected by the conceptual application domain models of Semantic Web Applications. Conceptual Modeling for Web applications, thus, has to be revisited in the context of the new Web trends looking for adequate Semantic Web Application Models. In Web applications, dialog-oriented (in most cases form-based) user interface models are state of the art for the interaction with users. The requirement of representing interaction with humans as well as with software agents is best met by a user interface model that describes the dialogs with the system on a conceptual level that can be dynamically translated into a (user) interface language adequate for the respective “user” (human or agent).
The upcoming Web standard XForms for the next generation of form-based user interfaces is a good example of such a conceptual user interface model. For the linking of globally defined concepts with local domain model concepts, one of the most popular models in the context of the Semantic Web is provided by the Resource Description Framework (RDF). The systematic integration of Uniform Resource Identifiers (URIs) into the model facilitates references to vocabularies and ontologies defined, e.g., as RDF Schema or OWL ontology. However, for using RDF in Web applications, a coupling between these “semantic” data models and the more traditional data models underlying the application data is necessary. I.-Y. Song et al. (Eds.): ER 2003, LNCS 2813, pp. 1–2, 2003.

international conference on data engineering | 2009

Effective XML Keyword Search with Relevance Oriented Ranking

Zhifeng Bao; Tok Wang Ling; Bo Chen; Jiaheng Lu

Inspired by the great success of information retrieval (IR) style keyword search on the web, keyword search on XML has emerged recently. The difference between text databases and XML databases results in three new challenges: (1) Identify the user's search intention, i.e., identify the XML node types that the user wants to search for and search via. (2) Resolve keyword ambiguity problems: a keyword can appear as both a tag name and a text value of some node; a keyword can appear as the text values of different XML node types and carry different meanings. (3) As the search results are sub-trees of the XML document, a new scoring function is needed to estimate their relevance to a given query. However, existing methods cannot resolve these challenges and thus return low result quality in terms of query relevance. In this paper, we propose an IR-style approach which utilizes the statistics of the underlying XML data to address these challenges. We first propose specific guidelines that a search engine should meet in both search intention identification and relevance oriented ranking for search results. Then, based on these guidelines, we design novel formulae to identify the search for nodes and search via nodes of a query, and present a novel XML TF*IDF ranking strategy to rank the individual matches of all possible search intentions. Lastly, the proposed techniques are implemented in an XML keyword search engine called XReal, and extensive experiments show the effectiveness of our approach.

Archive | 2004

Conceptual Modeling – ER 2004

Paolo Atzeni; Wesley W. Chu; Hongjun Lu; Shuigeng Zhou; Tok Wang Ling

The envisioned Semantic Web aims to provide richly annotated and explicitly structured Web pages in XML, RDF, or description logics, based upon underlying ontologies and thesauri. Ideally, this should enable a wealth of query processing and semantic reasoning capabilities using XQuery and logical inference engines. However, we believe that the diversity and uncertainty of terminologies and schema-like annotations will make precise querying on a Web scale extremely elusive if not hopeless, and the same argument holds for large-scale dynamic federations of Deep Web sources. Therefore, ontology-based reasoning and querying needs to be enhanced by statistical means, leading to relevance-ranked lists as query results. This paper presents steps towards such a “statistically semantic” Web and outlines technical challenges. We discuss how statistically quantified ontological relations can be exploited in XML retrieval, how statistics can help in making Web-scale search efficient, and how statistical information extracted from users’ query logs and click streams can be leveraged for better search result ranking. We believe these are decisive issues for improving the quality of next-generation search engines for intranets, digital libraries, and the Web, and they are crucial also for peer-to-peer collaborative Web search.

1 The Challenge of “Semantic” Information Search

The age of information explosion poses tremendous challenges regarding the intelligent organization of data and the effective search of relevant information in business and industry (e.g., market analyses, logistic chains), society (e.g., health care), and virtually all sciences that are more and more data-driven (e.g., gene expression data analyses and other areas of bioinformatics). The problems arise in intranets of large organizations, in federations of digital libraries and other information sources, and in the most humongous and amorphous of all data collections, the World Wide Web and its underlying numerous databases that reside behind portal pages. The Web bears the potential of being the world’s largest encyclopedia and knowledge base, but we are very far from being able to exploit this potential. Database-system and search-engine technologies provide support for organizing and querying information; but all too often they require excessive manual preprocessing, such as designing a schema and cleaning raw data or manually classifying documents into a taxonomy for a good Web portal, or manual postprocessing such as browsing through large result lists with too many irrelevant items or surfing in the vicinity of promising but not truly satisfactory approximate matches. The following are a few example queries where current Web and intranet search engines fall short or where data

P. Atzeni et al. (Eds.): ER 2004, LNCS 3288, pp. 3–17, 2004.

conference on information and knowledge management | 2004

Efficient processing of XML twig patterns with parent child edges: a look-ahead approach

Jiaheng Lu; Ting Chen; Tok Wang Ling

With the growing importance of semi-structured data in information exchange, much research has been done to provide an effective mechanism to match a twig query in an XML database. A number of algorithms have been proposed recently to process a twig query holistically. Those algorithms are quite efficient for queries with only ancestor-descendant edges. But for queries with mixed ancestor-descendant and parent-child edges, the previous approaches may still produce large intermediate results, even when the input and output sizes are more manageable. To overcome this limitation, in this paper, we propose a novel holistic twig join algorithm, namely TwigStackList. Our main technique is to look ahead and read some elements in the input data streams and cache a limited number of them in lists in main memory. The number of elements in any list is bounded by the length of the longest path in the XML document. We show that TwigStackList is I/O optimal for queries with only ancestor-descendant relationships below branching nodes. Further, even when queries contain parent-child relationships below branching nodes, the set of intermediate results in TwigStackList is guaranteed to be a subset of that in previous algorithms. We complement these results with experiments on a range of real and synthetic data, which show the significant superiority of TwigStackList over previous algorithms for queries with parent-child relationships.
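
The abstract above distinguishes ancestor-descendant edges from parent-child edges but does not say how the two are tested. A minimal sketch, assuming the (start, end, level) region encoding that is common in the twig-join literature, may help illustrate the difference. The names `Region`, `label_tree`, `is_ancestor`, and `is_parent` are hypothetical, and this is not the TwigStackList algorithm itself.

```python
from itertools import count
from typing import NamedTuple


class Region(NamedTuple):
    """Assumed region label: document-order start/end positions plus depth."""
    start: int
    end: int
    level: int


def label_tree(node, level=1, clock=None, out=None):
    """Assign a Region to every node of a (tag, children) tuple tree.

    Returns a list of (tag, Region) pairs in document (pre-) order.
    """
    clock = clock if clock is not None else count(1)
    out = out if out is not None else []
    tag, children = node
    start = next(clock)
    slot = len(out)
    out.append(None)  # reserve the pre-order slot; end is known only later
    for child in children:
        label_tree(child, level + 1, clock, out)
    out[slot] = (tag, Region(start, next(clock), level))
    return out


def is_ancestor(a: Region, d: Region) -> bool:
    """a is an ancestor of d when a's interval strictly contains d's."""
    return a.start < d.start and d.end < a.end


def is_parent(p: Region, c: Region) -> bool:
    """A parent is an ancestor exactly one level above the child."""
    return is_ancestor(p, c) and c.level == p.level + 1
```

Under this assumed encoding, the parent-child test is the ancestor-descendant test plus a level check, so a parent-child edge is satisfied by fewer element pairs than the corresponding ancestor-descendant edge.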

knowledge discovery and data mining | 2000

IntelliClean: a knowledge-based intelligent data cleaner

Mong Li Lee; Tok Wang Ling; Wai Lup Low

Existing data cleaning methods work on the basis of computing the degree of similarity between nearby records in a sorted database. High recall is achieved by accepting records with low degrees of similarity as duplicates, at the cost of lower precision. High precision is achieved analogously at the cost of lower recall. This is the recall-precision dilemma. In this paper, we propose a generic knowledge-based framework for effective data cleaning that implements existing cleaning strategies and more. We develop a new method to compute transitive closure under uncertainty which handles the merging of groups of inexact duplicate records. Experimental results show that this framework can identify duplicates and anomalies with high recall and precision.

extending database technology | 2002

Designing Functional Dependencies for XML

Mong Li Lee; Tok Wang Ling; Wai Lup Low

Functional dependencies are an integral part of database theory and they form the basis for normalizing relational tables up to BCNF. With the increasing relevance of the data-centric aspects of XML, it is pertinent to study functional dependencies in the context of XML, which will form the basis for further studies into XML keys and normalization. In this work, we investigate the design of functional dependencies in XML databases. We propose FDXML, a notation and DTD for representing functional dependencies in XML. We observe that many databases are hierarchical in nature and the corresponding nested XML data may inevitably contain redundancy. We develop a model based on FDXML to estimate the amount of data replication in XML data. We show how functional dependencies in XML can be verified with a single pass through the XML data, and present supporting experimental results. A platform-independent framework is also drawn up to demonstrate how the techniques proposed in this work can enrich the semantics of XML.
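
The single-pass verification mentioned above can be illustrated on flattened records. The sketch below is an assumption for illustration only: it does not use the FDXML notation, and the function name `fd_violations` and the dictionary-based record layout are hypothetical. It checks a dependency X → Y by remembering the first Y value seen for each X value.

```python
def fd_violations(records, lhs, rhs):
    """Check the functional dependency lhs -> rhs in one pass.

    records: iterable of dicts (assumed flattened from the XML data).
    lhs, rhs: lists of attribute names.
    Returns a list of (lhs_value, first_rhs_value, conflicting_rhs_value).
    """
    seen = {}          # lhs value -> first rhs value observed
    violations = []
    for rec in records:
        key = tuple(rec[a] for a in lhs)
        val = tuple(rec[a] for a in rhs)
        first = seen.setdefault(key, val)
        if first != val:
            violations.append((key, first, val))
    return violations
```

Each record is read once, and memory grows with the number of distinct left-hand-side values, not with the number of records.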

Information Systems | 2001

A knowledge-based approach for duplicate elimination in data cleaning

Wai Lup Low; Mong Li Lee; Tok Wang Ling

Existing duplicate elimination methods for data cleaning work on the basis of computing the degree of similarity between nearby records in a sorted database. High recall can be achieved by accepting records with low degrees of similarity as duplicates, at the cost of lower precision. High precision can be achieved analogously at the cost of lower recall. This is the recall–precision dilemma. We develop a generic knowledge-based framework for effective data cleaning that can implement any existing data cleaning strategies and more. We propose a new method for computing transitive closure under uncertainty for dealing with the merging of groups of inexact duplicate records and explain why small changes to window sizes have little effect on the results of the sorted neighborhood method. An experimental study with two real-world datasets shows that this approach can accurately identify duplicates and anomalies with high recall and precision, thus effectively resolving the recall–precision dilemma.
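
As a rough illustration of the baseline that the abstract builds on, the sketch below sorts the records, compares each record only with its neighbors inside a sliding window, and merges matches with a plain transitive closure (union-find). It is a sketch under stated assumptions, not the paper's framework: the similarity measure (`difflib.SequenceMatcher`), the threshold, and the function names are all hypothetical, and the closure here ignores the uncertainty that the paper's method accounts for.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Assumed similarity measure: difflib's ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def sorted_neighborhood(records, key=lambda r: r, window=3, threshold=0.8):
    """Group likely duplicates among string records.

    Each record is compared with the next (window - 1) records in sorted
    order; pairs at or above the threshold are merged transitively.
    Returns only the groups that contain more than one record.
    """
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            if similarity(records[i], records[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(records[i])
    return [g for g in groups.values() if len(g) > 1]
```

Lowering `threshold` raises recall and lowers precision, which is the dilemma the abstract describes. With `window=2`, two records that are never compared directly can still end up in one group through a shared neighbor, which is the effect of the transitive closure.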

Archive | 1998

Conceptual Modeling – ER ’98

Tok Wang Ling; Sudha Ram; Mong Li Lee

The software industry in Japan grew extraordinarily only in the field of custom software, and fell after the collapse of the “bubble economy” in 1991. In Japan, the field of packaged software is still at an early stage of development. Why did this happen? On the other hand, Japan surpassed the U.S.A. in the game software field, and became No. 1 in the world. Why is this? Can Japanese packaged software survive in the future? Or will Western packaged software made by Microsoft, SAP, etc. conquer the Japanese market? I will state my opinion based on my own experience in the software industry during the past 30 years.

Archive | 2011

Conceptual Modeling – ER 2011

Manfred A. Jeusfeld; Lois M. L. Delcambre; Tok Wang Ling

This book constitutes the refereed proceedings of the 30th International Conference on Conceptual Modeling, ER 2011, held in Brussels, Belgium, in October/November 2011. The 25 revised full papers presented together with 14 short papers and three keynotes were carefully reviewed and selected from 157 submissions. The papers are organized in topical sections on modeling goals and compliance; human and socio-technical factors; ontologies; data model theory; model development and maintainability; user interfaces and software classification; evolution, propagation and refinement; UML and requirements modeling; views, queries and search; requirements and business intelligence; MDA and ontology-based modeling; process modeling; and panels.
