Erwin Leonardi
Nanyang Technological University
Publication
Featured research published by Erwin Leonardi.
data and knowledge engineering | 2007
Erwin Leonardi; Tran T. Hoai; Sourav S. Bhowmick; Sanjay Kumar Madria
The DTD of a set of XML documents may change for many reasons, such as changes in real-world events, changes in user requirements, and mistakes in the initial design. In this paper, we present a novel algorithm called DTD-Diff to detect changes to the DTD that defines the structure of a set of XML documents. Such a change detection tool can be useful in several ways, such as maintenance of XML documents, incremental maintenance of relational schemas for storing XML data, and XML schema integration. We compare DTD-Diff with existing XML change detection approaches and show that converting a DTD to XML Schema (XSD), which is in XML document format, and detecting the changes using existing XML change detection algorithms is not a feasible option. Our experimental results show that DTD-Diff is 5-325 times faster than X-Diff when the latter detects the changes to the XSD files. Compared to XyDiff, DTD-Diff is up to 38 times faster. We also study the result quality of the detected deltas.
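To make the idea of a DTD delta concrete, here is a minimal sketch in Python. It is not the DTD-Diff algorithm, whose details the abstract does not give. It assumes a DTD can be reduced to its `<!ELEMENT ...>` declarations and reports element declarations that were inserted, deleted, or given a different content model. The names `element_decls` and `diff_dtd` are hypothetical.

```python
import re

# Assumption: only <!ELEMENT name content-model> declarations are compared;
# attributes, entities, and comments are ignored in this sketch.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([^\s>]+)\s+([^>]+?)\s*>")


def element_decls(dtd_text):
    """Map each declared element name to its whitespace-normalised content model."""
    return {
        name: " ".join(model.split())
        for name, model in ELEMENT_DECL.findall(dtd_text)
    }


def diff_dtd(old_dtd, new_dtd):
    """Return element names that were deleted, inserted, or updated."""
    old_decls = element_decls(old_dtd)
    new_decls = element_decls(new_dtd)
    shared = old_decls.keys() & new_decls.keys()
    return {
        "deleted": sorted(old_decls.keys() - new_decls.keys()),
        "inserted": sorted(new_decls.keys() - old_decls.keys()),
        "updated": sorted(n for n in shared if old_decls[n] != new_decls[n]),
    }


if __name__ == "__main__":
    old = "<!ELEMENT book (title, author)> <!ELEMENT title (#PCDATA)> <!ELEMENT author (#PCDATA)>"
    new = "<!ELEMENT book (title, author+, year)> <!ELEMENT title (#PCDATA)> <!ELEMENT year (#PCDATA)>"
    print(diff_dtd(old, new))
```

A textual comparison like this cannot recognise two content models that are written differently but are equivalent, which is one reason a dedicated algorithm is of interest.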
database systems for advanced applications | 2005
Erwin Leonardi; Sourav S. Bhowmick; Sanjay Kumar Madria
Previous work on change detection for XML documents is not suitable for detecting changes to large XML documents, because it requires a lot of memory to keep both versions of the documents in memory. In this paper, we take a more conservative yet novel approach of using traditional relational database engines to detect changes to large unordered XML documents. We elaborate on how we detect changes to unordered XML documents using a relational database. To this end, we have implemented a prototype system called Xandy that converts XML documents into relational tuples and detects the changes from these tuples using SQL queries. Our experimental results show that the relational approach scales better than published algorithms such as X-Diff. The result quality of our approach is comparable to that of X-Diff.
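The general idea of shredding XML into tuples and comparing versions with SQL can be sketched as follows. This is an illustration only, not Xandy's actual schema or queries, which the abstract does not describe. It assumes a single hypothetical table `leaf(version, path, value)` in an in-memory SQLite database and treats the document as unordered by comparing sets of (path, value) pairs.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, version, xml_text):
    """Store every leaf element as a (version, path, value) tuple."""
    root = ET.fromstring(xml_text)

    def walk(node, path):
        children = list(node)
        if not children:
            conn.execute(
                "INSERT INTO leaf VALUES (?, ?, ?)",
                (version, path, (node.text or "").strip()),
            )
        for child in children:
            walk(child, path + "/" + child.tag)

    walk(root, "/" + root.tag)


def detect_changes(old_xml, new_xml):
    """Return (deleted, inserted) leaf tuples between two document versions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE leaf (version INTEGER, path TEXT, value TEXT)")
    shred(conn, 1, old_xml)
    shred(conn, 2, new_xml)
    # EXCEPT is a set difference, so sibling order is ignored; repeated
    # identical leaves collapse into one row (a simplification of this sketch).
    query = """
        SELECT path, value FROM leaf WHERE version = ?
        EXCEPT
        SELECT path, value FROM leaf WHERE version = ?
        ORDER BY path, value
    """
    deleted = conn.execute(query, (1, 2)).fetchall()
    inserted = conn.execute(query, (2, 1)).fetchall()
    conn.close()
    return deleted, inserted


if __name__ == "__main__":
    v1 = "<lib><book><title>A</title><year>2004</year></book></lib>"
    v2 = "<lib><book><year>2005</year><title>A</title><isbn>1</isbn></book></lib>"
    print(detect_changes(v1, v2))
```

Because the comparison runs inside the database engine, neither version has to be held in application memory as a tree once it has been shredded, which is the scalability argument the abstract makes.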
data and knowledge engineering | 2006
Erwin Leonardi; Sourav S. Bhowmick
Previous work on change detection for XML documents is not suitable for detecting changes to large XML documents, because it requires a lot of memory to keep both versions of the documents in memory. In this article, we take a more conservative yet novel approach of using traditional relational database engines to detect changes to large ordered XML documents. To this end, we have implemented a prototype system called XANDY that converts XML documents into relational tuples and detects the changes from these tuples using SQL queries. Our experimental results show that the relational approach scales better than published algorithms such as X-Diff. In some cases, its efficiency and result quality are comparable to those of X-Diff. Our experimental results also show that, generally, XANDY has better result quality than XyDiff.
conference on information and knowledge management | 2005
Erwin Leonardi; Sourav S. Bhowmick
Several relational approaches have been proposed to detect changes to XML documents using relational databases. These approaches store the XML documents in the relational database and issue SQL queries (whenever appropriate) to detect the changes. All of these relational approaches use a schema-oblivious XML storage strategy for detecting the changes. However, there is growing evidence that schema-conscious storage approaches perform significantly better than schema-oblivious approaches as far as XML query processing is concerned. In this paper, we study a relational unordered XML change detection technique (called HELIOS) that uses a schema-conscious approach (Shared-Inlining) as the underlying storage strategy. HELIOS is up to 52 times faster than X-Diff [7] for large datasets (more than 1000 nodes). It is also up to 6.7 times faster than XANDY [4]. The result quality of deltas detected by HELIOS is comparable to that of deltas detected by XANDY.
database systems for advanced applications | 2006
Erwin Leonardi; Tran T. Hoai; Sourav S. Bhowmick; Sanjay Kumar Madria
The DTD of a set of XML documents may change for many reasons, such as changes in real-world events, changes in user requirements, and mistakes in the initial design. In this paper, we present a novel algorithm called DTD-DIFF to detect changes to the DTD that defines the structure of a set of XML documents. Such a change detection tool can be useful in several ways, such as maintenance of XML documents, incremental maintenance of relational schemas for storing XML data, and XML schema integration. We compare DTD-DIFF with existing XML change detection approaches and show that converting a DTD to XML Schema (XSD), which is in XML document format, and detecting the changes using existing XML change detection algorithms is not a feasible option. Our experimental results show that DTD-DIFF is 5-325 times faster than X-Diff when the latter detects the changes to the XSD files. We also study the result quality of the detected deltas.
database and expert systems applications | 2004
Erwin Leonardi; Sourav S. Bhowmick; T. S. Dharma; Sanjay Kumar Madria
Previous work on change detection for XML focused on detecting changes to text files using ordered and unordered tree models. These approaches are not suitable for detecting changes to large XML documents, because they require a lot of memory to keep both versions of the documents in memory. In this paper, we take a more conservative yet novel approach of using traditional relational database engines to detect content changes in large ordered XML data. First, we store the XML documents in an RDBMS. Then, we detect the changes using a set of SQL queries. Experimental results show that our approach has better scalability, better performance, and comparable result quality compared to state-of-the-art approaches.
international conference on management of data | 2007
Erwin Leonardi; Sourav S. Bhowmick
Recently, a number of main-memory algorithms for detecting changes to XML data have been proposed. These approaches are not suitable for detecting changes to large XML documents, because they require a lot of memory to keep both versions of the documents in memory. We have developed a novel XML change detection system, called XANADUE, that uses traditional relational database engines to detect changes to large XML data. In this approach, we store the XML documents in the relational database and issue SQL queries (whenever appropriate) to detect the changes. This demonstration will showcase the functionality of our system and the effectiveness of XML change detection in a relational environment.
conference on information and knowledge management | 2007
Sourav S. Bhowmick; Erwin Leonardi; Hongmei Sun
Recent studies have shown that native twig join algorithms and tree-aware relational frameworks significantly outperform tree-unaware approaches in evaluating structural relationships in XML twig queries. In this paper, we present an efficient strategy to evaluate highly selective twig queries containing only parent-child relationships in a tree-unaware relational environment. Our scheme is built on top of our SUCXENT++ system. We show that by exploiting the encoding scheme of SUCXENT++, we can devise an efficient strategy for evaluating such twig queries. Extensive performance studies on various data sets and queries show that our approach performs better than a representative tree-unaware approach (GLOBAL-ORDER) and a state-of-the-art native twig join algorithm (TJFAST) on all benchmark queries, with the highest observed gain factors being 243 and 95, respectively. Additionally, our approach significantly reduces the performance gap between tree-aware and tree-unaware approaches and even outperforms a tree-aware approach (MONETDB/XQUERY) for certain highly selective twig queries. We also report our insights into the plan choices a relational optimizer makes during twig query evaluation by visually characterizing its behavior over the relational selectivity space.
database systems for advanced applications | 2010
Erwin Leonardi; Sourav S. Bhowmick; Mizuho Iwaihara
Achieving data security over cooperating web services is becoming a reality, but existing XML access control architectures do not consider this kind of federated service computing. In this paper, we consider a federated access control model in which the Data Provider and the Policy Enforcers are separated into different organizations. The Data Provider is responsible for evaluating the criticality of requested XML documents based on the co-occurrence of security objects, and for issuing security clearances. The Policy Enforcers enforce access control rules reflecting their organization-specific policies. A user's query is sent to the Data Provider, and she needs to obtain permission from the Policy Enforcer in her organization to read the results of her query. The Data Provider evaluates the query and also evaluates the criticality of the query, where the evaluation of sensitivity is carried out using clearance rules. In this setting, we present a novel approach, called the diff approach, to evaluate security clearance at the Data Provider. Our technique is built on top of a relational framework and utilizes pre-evaluated clearances by taking the differences (or deltas) between query results.
conference on information and knowledge management | 2009
Sourav S. Bhowmick; Curtis E. Dyreson; Erwin Leonardi; Zhifeng Ng
XML query languages use directional path expressions to locate data in an XML data collection. These expressions are tightly coupled to the structure of the data collection and can fail when evaluated on the same data in a different structure. This paper extends path expressions with a new non-directional axis called the rank-distance axis. Given a context node and two positive integers α and β, the rank-distance axis returns those nodes that are ranked between α and β in terms of closeness to the context node in any direction. This paper shows how to evaluate the rank-distance axis in a tree-unaware XML database. A tree-unaware implementation does not invade the database kernel to support XML queries; instead, it uses an existing RDBMS such as Microsoft SQL Server as a back-end and provides a front-end layer to translate XML queries to SQL. This paper presents an overview of an algorithm that translates queries with a rank-distance axis to SQL.
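The semantics of a rank-distance axis can be illustrated with a small in-memory sketch in Python. It is not the SQL translation the paper describes, and it rests on two assumptions that the abstract does not confirm: distance is the number of tree edges between two nodes regardless of direction, and nodes at the same distance share the same rank. The function name `rank_distance` is hypothetical.

```python
from collections import deque
import xml.etree.ElementTree as ET


def rank_distance(root, context, alpha, beta):
    """Return nodes whose closeness rank to `context` lies in [alpha, beta]."""
    # Treat the tree as an undirected graph so that parents, children,
    # and siblings are all reachable from the context node.
    neighbours = {}
    for parent in root.iter():
        for child in parent:
            neighbours.setdefault(parent, []).append(child)
            neighbours.setdefault(child, []).append(parent)

    # Breadth-first search gives the edge distance to every node.
    distance = {context: 0}
    queue = deque([context])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    # Assumption: equal distances share a rank (dense ranking, closest = 1).
    levels = sorted({d for d in distance.values() if d > 0})
    rank = {d: i + 1 for i, d in enumerate(levels)}

    # Results are returned in document order.
    return [
        node
        for node in root.iter()
        if node is not context and alpha <= rank[distance[node]] <= beta
    ]


if __name__ == "__main__":
    doc = ET.fromstring("<book><title/><author><name/></author><publisher/></book>")
    name = doc.find("author/name")
    print([n.tag for n in rank_distance(doc, name, 1, 2)])
```

Because the search ignores edge direction, the same call keeps working if the data is restructured, for example if `name` becomes the parent of `author`, which is the robustness the abstract attributes to a non-directional axis.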