Weimin He

University of Wisconsin–Stevens Point

Publication

Featured research published by Weimin He.

Databases, Information Systems, and Peer-to-Peer Computing | 2005

XML query routing in structured P2P systems

Leonidas Fegaras; Weimin He; Gautam Das; David Levine

This paper addresses the problem of data placement, indexing, and querying large XML data repositories distributed over an existing P2P service infrastructure. Our architecture scales gracefully to the network and data sizes, is fully distributed, fault tolerant and self-organizing, and handles complex queries efficiently, even those queries that use full-text search. Our framework for indexing distributed XML data is based on both meta-data information and textual content. We introduce a novel data synopsis structure to summarize text that correlates textual with positional information and increases query routing precision. Our processing framework maps an XML query with full-text search into a distributed program that migrates from peer to peer, collecting relevant document locations along the way. In addition, we introduce methods to handle network updates, such as node arrivals, departures, and failures. Finally, we report on a prototype implementation, which is used to validate the accuracy of our data synopses and to analyze the various costs involved in indexing XML data and answering queries.

British National Conference on Databases | 2007

Indexing and searching XML documents based on content and structure synopses

Weimin He; Leonidas Fegaras; David Levine

We present a novel framework for indexing and searching schema-less XML documents based on concise summaries of their structural and textual content. Our search query language is XPath extended with full-text search. We introduce two novel data synopsis structures that correlate textual with positional information in an XML document and improve query precision. In addition, we present a two-phase containment filtering algorithm based on these synopses that improves the searching process. Our experimental evaluation shows that our data synopses indexing scheme outperforms the standard XML indexing scheme based on inverted lists; that query evaluation based on our data synopses is more accurate than related approximate approaches that do not consider positional information; and that our two-phase containment filtering algorithm is more efficient than a single-phase brute-force algorithm.

International Conference on Computational and Information Sciences | 2013

Visual Evaluation of XPath Queries

Weimin He; Teng Lv; Matthew Meis; Ping Yan

Over the past decade, due to its simplicity and flexibility, Extensible Markup Language (XML) has rapidly gained popularity as a universal data format for data exchange and integration on the web. In this paper, we present a novel framework to evaluate a variety of XPath queries in a very user-friendly manner. We developed a prototype system named VXPath, which is a visual XPath query evaluator that allows the user to evaluate an XPath query by clicking the nodes in an expanding tree instead of typing the whole XPath query by hand. Our system supports various XPath axes, including child, descendant, self, parent, ancestor, following-sibling, and preceding-sibling, as well as predicates. To handle very large XML documents, instead of loading the whole XML document into memory, we extract a concise data synopsis, termed a structural summary, from the original XML document to avoid the loading overhead. We evaluated our system over data from XMark and DBLP, and our system can handle large XML documents up to gigabytes in size.

Web-Age Information Management | 2012

Uncertain XML Functional Dependencies Based on Tree Tuple Models

Teng Lv; Weimin He; Ping Yan

With the increase of uncertain data in many new applications, such as sensor networks, data integration, and web extraction, uncertainty in both relational databases and XML datasets has attracted increasing research interest in recent years. As functional dependencies are critical to schema design in relational databases and XML datasets, it is also important to study functional dependencies and their applications in uncertain XML datasets. This paper proposes three new kinds of functional dependencies for uncertain XML datasets based on the tree tuple model. We also give a set of sound and complete inference rules and two applications: testing whether an uncertain XML dataset satisfies a given functional dependency, and finding a closed set for a given uncertain XML dataset.

International Conference on Computational and Information Sciences | 2012

VXPath: A Visual XPath Query Evaluator

Weimin He; Teng Lv; Matthew Meis; Ping Yan

With the popularity of Extensible Markup Language (XML) as the new standard of data exchange on the web, a large number of data sources are represented or encoded in XML format. Therefore, querying and searching XML data on the web has attracted much attention in the database literature. In this research work, we developed a prototype software system named VXPath to facilitate the efficient evaluation of XPath queries over XML data. VXPath is a visual XPath query evaluator that allows the user to evaluate an XPath query by clicking the nodes in an expanding tree instead of typing the whole XPath query by hand. Our system supports the most common XPath axes, such as child and descendant, as well as predicates. To handle very large XML documents, instead of loading the whole XML document into memory, we extract a concise data synopsis, termed a structural summary, from the original XML document to avoid the loading overhead. We evaluated our system over data from XMark and DBLP, and our system can handle large XML documents up to gigabytes in size.
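As a rough illustration of the structural-summary idea, the sketch below streams an XML document and records each distinct root-to-node label path with its occurrence count, without building the full tree in memory. This is not the VXPath implementation: the function name `structural_summary` and the choice of label paths as the summary format are assumptions made for illustration only.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def structural_summary(source):
    """Map each distinct label path (e.g. /dblp/article/title) to its count.

    Hypothetical helper: the summary format is an assumption, not the
    structure used by VXPath.
    """
    summary = Counter()
    path = []  # stack of tag names from the root to the current element
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            path.append(elem.tag)
            summary["/" + "/".join(path)] += 1
        else:
            path.pop()
            elem.clear()  # drop the element's text and children once processed
    return summary


doc = (
    b"<dblp>"
    b"<article><author>A</author><author>B</author><title>T</title></article>"
    b"<article><title>U</title></article>"
    b"</dblp>"
)
summary = structural_summary(io.BytesIO(doc))
for label_path, count in sorted(summary.items()):
    print(label_path, count)
```

A visual query tool could expand its node tree from such a summary, since the summary lists every label path that occurs in the document while being far smaller than the document itself.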

International Conference on Computer Science and Information Technology | 2010

Bloom filter-based keyword search over XML data in structured Peer-to-Peer systems

Weimin He; Teng Lv

With the popularity of Extensible Markup Language (XML) as the new standard of data exchange on the web, a large number of data sources are represented or encoded in XML format. Therefore, querying and searching XML data on the web has attracted much attention in the database literature. The emerging Peer-to-Peer (P2P) computing model has fueled autonomous data sharing over the Internet in a more flexible fashion. As a result, XML data retrieval in P2P systems has become attractive to both the research and industrial communities. In this paper, we propose a Bloom filter-based keyword search framework for XML data retrieval in structured P2P systems. We designed an efficient Bloom filter-based XML indexing scheme and developed an effective keyword search algorithm over our Bloom filter-encoded XML indexes. The experimental results demonstrate that our novel indexing scheme is much more efficient than the traditional full-indexing scheme, that our keyword search algorithm is efficient in terms of query response time, and that our system is scalable in terms of data size and network size.
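To make the Bloom filter idea concrete, here is a minimal sketch of keyword filtering with one Bloom filter per document. It is a generic illustration rather than the indexing scheme from the paper: the class `BloomFilter`, the helpers `build_index` and `candidate_docs`, the filter size, and the use of SHA-256 for hashing are all assumptions. A Bloom filter never produces false negatives but may produce false positives, so the result is a candidate set that still needs verification against the actual documents.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter (illustrative parameters, not from the paper)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, word):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits |= 1 << pos

    def might_contain(self, word):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(word))


def build_index(docs):
    """Build one Bloom filter per document from its whitespace-separated words."""
    index = {}
    for doc_id, text in docs.items():
        bloom = BloomFilter()
        for word in text.lower().split():
            bloom.add(word)
        index[doc_id] = bloom
    return index


def candidate_docs(index, keywords):
    """Return ids of documents whose filters may contain every keyword."""
    return sorted(
        doc_id
        for doc_id, bloom in index.items()
        if all(bloom.might_contain(k.lower()) for k in keywords)
    )


docs = {
    "d1": "xml query routing in structured systems",
    "d2": "bloom filter keyword search over xml data",
}
index = build_index(docs)
print(candidate_docs(index, ["bloom", "keyword"]))
```

In a P2P setting, the compact filters rather than full keyword lists would be what peers exchange or store, which is where the space savings over a full index come from.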

International Journal of Database Theory and Application | 2016

On Uncertain Probabilistic Data Modeling

Teng Lv; Ping Yan; Weimin He

Uncertainty in data arises for various reasons, including the data itself, data mapping, and data policy. Regarding the data itself, data from a sensor network, the Internet of Things, or Radio Frequency Identification is often inaccurate and uncertain because of device or environmental factors. Regarding data mapping, data integrated from various heterogeneous data sources is commonly uncertain because of uncertain data mapping, data inconsistency, missing data, and dirty data. Regarding data policy, data is modified or hidden to comply with an organization's data privacy and data confidentiality policies. However, traditional deterministic data management mainly deals with data that is precise and certain, and cannot process uncertain data. Modeling uncertain data is a foundation for other data processing technologies, such as indexing, querying, searching, mapping, integrating, and mining data. Probabilistic data models of relational databases, XML data, and graph data are widely used in many applications and areas today, such as the World Wide Web, the semantic web, sensor networks, the Internet of Things, mobile ad hoc networks, social networks, traffic networks, biological networks, genome databases, and medical records. This paper presents a survey of different probabilistic models of uncertain data in relational databases, XML data, and graph data, respectively. The advantages and disadvantages of each kind of probabilistic model are analyzed and compared. Open topics in modeling uncertain probabilistic data, such as semantic and computational aspects, are discussed. Criteria for modeling uncertain data, such as expressive power, complexity, efficiency, and extensibility, are also proposed.

Archive | 2014

Functional Dependencies and Lossless Decompositions of Uncertain XML Datasets

Ping Yan; Teng Lv; Weimin He; Xiuzhen Wang

With the increase in uncertain data in many new applications, such as sensor networks, data integration, and web extraction, uncertainty in both relational databases and Extensible Markup Language (XML) datasets has attracted considerable research interest in recent years. As functional dependencies (FDs) are critical to schema design in relational databases and XML datasets, it is also important to study FDs and their applications in lossless decompositions of uncertain XML datasets. This paper first proposes three new kinds of FDs for uncertain XML datasets based on the tree-tuple model, and then gives three lossless decomposition methods, one for each of the proposed FDs, to decompose an XML dataset losslessly.

International Conference on Applications of Digital Information and Web Technologies | 2011

Extending vector space model for XML ranking

Weimin He; Teng Lv

There has been increasing interest in recent years in querying and ranking XML documents. In this paper, we present a new framework for querying and ranking schema-less XML documents based on concise summaries of their structural and textual content. We introduce a novel data synopsis structure to summarize the textual content of an XML document for efficient indexing. More importantly, we extend the traditional vector space model to effectively rank XML documents over the proposed data synopses. We conduct extensive experiments over XML benchmark data to demonstrate the advantages of the indexing scheme and the effectiveness of our ranking scheme. We also compare our framework with Lucene to demonstrate that our extended TF*IDF scoring function is effective.
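For readers unfamiliar with the baseline being extended, the following sketch shows the classic vector space model with TF*IDF weights and cosine similarity over plain text. It is the traditional model only, not the extended scoring function from the paper, and it ignores XML structure entirely. The function names (`tf_idf_vectors`, `cosine`, `rank`) and the smoothed IDF formula `log(N / df) + 1` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (per-document TF*IDF vectors, IDF table) for a dict of texts."""
    n = len(docs)
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))  # count each term once per document
    # Assumed IDF variant: log(N / df) + 1, so shared terms keep some weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = {}
    for d, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[d] = {t: tf[t] * idf[t] for t in tf}
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(docs, query):
    """Rank documents by cosine similarity to the query, best first."""
    vectors, idf = tf_idf_vectors(docs)
    q_tf = Counter(query.lower().split())
    q_vec = {t: q_tf[t] * idf[t] for t in q_tf if t in idf}
    scores = {d: cosine(q_vec, vec) for d, vec in vectors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


docs = {
    "a": "xml ranking with data synopses",
    "b": "neural networks with delays",
}
ranking = rank(docs, "xml ranking")
print(ranking)
```

An XML-aware extension would typically weight terms by where they occur in the document structure, which this flat-text baseline cannot express.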

International Conference on Applications of Digital Information and Web Technologies | 2011

Exponential synchronization of Cohen-Grossberg neural networks with diffusion terms and delays

Teng Lv; Weimin He; Ping Yan

In this paper, the exponential synchronization of a class of Cohen-Grossberg neural networks with delays and diffusion terms is discussed. By using the Lyapunov functional method and the Sobolev inequality, some sufficient conditions are given to ensure the exponential synchronization of the drive-response cellular neural networks with delays and diffusion terms.