Biao Qin
Renmin University of China
Publication
Featured research published by Biao Qin.
Knowledge Based Systems | 2011
Biao Qin; Yuni Xia; Shan Wang; Xiaoyong Du
Data uncertainty can be caused by numerous factors such as measurement precision limitations, network latency, data staleness and sampling errors. When mining knowledge from emerging applications such as sensor networks or location-based services, data uncertainty should be handled carefully to avoid erroneous results. In this paper, we apply probabilistic and statistical theory to uncertain data and develop a novel method to calculate the conditional probabilities in Bayes' theorem. Based on that, we propose a novel Bayesian classification algorithm for uncertain data. The experimental results show that the proposed method classifies uncertain data with potentially higher accuracy than the Naive Bayesian approach. It also performs more stably than the existing extended Naive Bayesian method.
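The abstract does not give the formulas, so the following is only a minimal sketch of one common way to handle uncertain categorical attributes in a naive Bayes classifier. It assumes each attribute value is a probability distribution over categories and estimates conditional probabilities from expected counts. The names `train` and `classify`, the data layout, and the absence of smoothing are all assumptions made for illustration, not the paper's method.

```python
from collections import defaultdict

def train(rows, labels):
    """rows: list of tuples; each attribute is a dict value -> probability."""
    class_count = defaultdict(float)
    mass = defaultdict(float)  # (label, attribute index, value) -> expected count
    for row, label in zip(rows, labels):
        class_count[label] += 1.0
        for j, dist in enumerate(row):
            for value, p in dist.items():
                mass[(label, j, value)] += p
    return dict(class_count), dict(mass)

def classify(model, row):
    """Return the label with the highest (unnormalised) posterior score."""
    class_count, mass = model
    total = sum(class_count.values())
    best_label, best_score = None, -1.0
    for label, n in class_count.items():
        score = n / total  # class prior
        for j, dist in enumerate(row):
            # Expected likelihood of the uncertain test value under this class.
            score *= sum(p * mass.get((label, j, v), 0.0) / n
                         for v, p in dist.items())
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: one uncertain attribute, two classes.
rows = [({"hot": 0.8, "cold": 0.2},),
        ({"hot": 0.9, "cold": 0.1},),
        ({"hot": 0.1, "cold": 0.9},)]
labels = ["on", "on", "off"]
model = train(rows, labels)
```

With this toy model, a test tuple that is probably "hot" is assigned to "on", and one that is certainly "cold" is assigned to "off".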
Information Sciences | 2008
Biao Qin; Shan Wang; Xiaoyong Du; Qiming Chen; Qiuyue Wang
Peer knowledge management systems (PKMSs) offer a flexible architecture for decentralized knowledge sharing. In PKMSs, the knowledge sharing and evolution processes are based on peer ontologies. Finding an effective and efficient query rewriting algorithm for regular expression queries is vital for knowledge sharing between peers in PKMSs; our solution is characterized by graph-based query rewriting. Based on the graphs for both axioms and mappings, we design a novel algorithm, the regular expression rewriting algorithm, to rewrite regular expression queries along semantic paths. The simulation results show that our algorithm performs better than Mork's reformulation algorithms [P. Mork, Peer architectures for knowledge sharing, PhD thesis, University of Washington, 2005], and that it is more effective than the naive rewriting algorithm.
IEEE Transactions on Knowledge and Data Engineering | 2014
Biao Qin; Shan Wang; Xiaofang Zhou; Xiaoyong Du
This paper investigates the problem of efficiently computing responsibility for lineages of conjunctive queries with inequalities on databases. We classify the lineages of a class of queries with inequalities, called IQ queries, into path and composite lineages. We first compile path lineages into lineage graphs and transform lineage graphs into matrices. Then we reduce the problem of computing responsibility for path lineages to the shortest path problem, which can be solved by dynamic programming in PTIME. We further prove that composite lineages can be decomposed into path lineages for computing responsibility. Thus, our first main result shows that computing responsibility for lineages of IQ queries is in PTIME. We generalize the previous results on the dichotomy of responsibility analysis for lineages of conjunctive queries with equalities to the case where inequalities are present. After decomposing composite lineages into path lineages, the data population needed for computing responsibility decreases by more than one order of magnitude. Thus, our algorithm can efficiently compute responsibility for composite lineages. In order to compute responsibility for lineages in general, we introduce a greedy algorithm based on a reduction to the set cover problem. Finally, we demonstrate the benefits of the proposed algorithms with extensive experimental results.
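The abstract does not spell out how lineages are reduced to set cover, so the sketch below shows only the standard greedy heuristic for set cover itself, which is presumably the kind of procedure the reduction targets. The function name `greedy_set_cover` and the toy input are illustrative assumptions.

```python
def greedy_set_cover(universe, subsets):
    """Greedy approximation: repeatedly pick the subset covering most new elements.

    universe: iterable of elements to cover.
    subsets:  dict mapping a name to a set of elements.
    Returns the list of chosen subset names, in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("the given subsets cannot cover the universe")
        chosen.append(best)
        uncovered -= gained
    return chosen

# Hypothetical toy instance.
subsets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
cover = greedy_set_cover({1, 2, 3, 4, 5}, subsets)
```

On this instance the heuristic picks "A" first (three new elements) and then "D" (two new elements). In general the greedy choice gives an approximate cover, not necessarily a minimum one.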
Web-Age Information Management | 2013
Xiongpai Qin; Biao Qin; Xiaoyong Du; Shan Wang
In recent years MapReduce has risen to be the de facto tool for big data processing. MapReduce is a disruptive innovation. It has changed the landscape of the database market, the landscape of technologies, and the balance of influence in the field. This article reflects on the popularity of the technique and offers some observations on its position in a unified big data platform.
Semantics, Knowledge and Grid | 2005
Biao Qin; Shan Wang; Xiaoyong Du
Materialized views are a useful tool for caching data in data warehousing, data replication, data visualization, etc. However, view maintenance in peer data management systems (PDMSs) has received very little attention. In this paper, we propose a strategy to effectively maintain views in PDMSs. First, we present a hybrid peer architecture, in which peers prefer peer-to-peer architecture to super-peer based architecture. Then, we extend the definition of view and propose the peer view, local view and global view according to the requirements of applications. The maintenance of a global view thus becomes the maintenance of all related local views if join operations are confined to each local PDMS. Furthermore, we extend Mork's rules governing the use of updategrams and boosters and provide a push-based algorithm for view maintenance. Finally, we conduct extensive simulation experiments in our SPDMS. The simulation results show that the proposed view maintenance strategy performs better than Mork's.
Information Sciences | 2011
Biao Qin; Shan Wang
In this paper, we prove that a query plan is safe in tuple-independent probabilistic databases if and only if every one of its answer tuples is tree structured in probabilistic graphical models. We classify hierarchical queries into core and non-core hierarchical queries and show that the existing methods can only generate safe plans for core hierarchical queries. Inspired by the bucket elimination framework, we give the necessary and sufficient conditions for the answer relation of every candidate sub-query to be used as a base relation. Finally, the proposed algorithm generates safe plans for extensional query evaluation on non-boolean hierarchical queries and invokes the SPROUT algorithm [24] for intensional query evaluation on boolean queries. A case study on the TPC-H benchmark reveals that the safe plans of Q7 and Q8 can be evaluated efficiently. Furthermore, extensive experiments show that safe plans generated by the proposed algorithm scale well.
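As a minimal illustration of extensional evaluation with a safe plan, the sketch below evaluates the textbook hierarchical query q() :- R(x), S(x, y) over tuple-independent tables. It is not the paper's plan-generation algorithm. The helper names, the table layout (lists of `(tuple, probability)` pairs) and the toy data are assumptions.

```python
from collections import defaultdict
from math import prod

def independent_project(rows, key):
    """Group rows by key(tuple); combine independent events as 1 - prod(1 - p)."""
    groups = defaultdict(list)
    for tup, p in rows:
        groups[key(tup)].append(p)
    return [(k, 1.0 - prod(1.0 - p for p in ps)) for k, ps in groups.items()]

def independent_join(left, right):
    """Join two lists of (key, probability) on equal keys; multiply probabilities."""
    right_p = dict(right)
    return [(k, p * right_p[k]) for k, p in left if k in right_p]

def query_probability(R, S):
    """Safe plan for q() :- R(x), S(x, y): project y out of S, join on x, project x."""
    s_by_x = independent_project(S, key=lambda t: t[0])
    r_by_x = [(t[0], p) for t, p in R]  # R(x) is already keyed by x
    joined = independent_join(r_by_x, s_by_x)
    result = independent_project(joined, key=lambda t: ())
    return result[0][1] if result else 0.0

# Hypothetical toy tables: (tuple, marginal probability).
R = [(("a",), 0.5), (("b",), 0.4)]
S = [(("a", 1), 0.5), (("a", 2), 0.5), (("b", 1), 0.25)]
answer = query_probability(R, S)
```

For this data the plan yields approximately 0.4375. The plan is safe because each projection only merges events that are independent at that point in the plan.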
Web-Age Information Management | 2006
Biao Qin; Shan Wang; Xiaoyong Du
The problem of sharing data in peer-to-peer environments has received considerable attention in recent years. However, knowledge sharing in peer architectures has received very little attention. This paper proposes a framework for query reformulation in peer architectures. We first consider a mapping language based on a particular description logic that includes class connectors. Then a set of rules is proposed for building graphs. Because the axioms in a knowledge base have different properties, our graph generation algorithm classifies the generated graphs into four sets (Ugraph, Bgraph, Cgraph and Dgraph). Furthermore, based on the properties of the unification nodes, our algorithms can reformulate each kind of atom in a special way. Finally, we conduct extensive simulation experiments, and the results show that the proposed method performs better than Mork's [8].
Asia-Pacific Web Conference | 2006
Biao Qin; Shan Wang; Xiaoyong Du
The problem of sharing data in peer data management systems (PDMSs) has received considerable attention in recent years. However, update management in PDMSs has received very little attention. This paper proposes a strategy to maintain views in our SPDMS. Based on applications, this paper extends the definition of view and proposes the peer view, local view and global view. The maintenance of a global view thus becomes the maintenance of all related local views if join operations are confined to each local PDMS. Furthermore, this paper proposes an ECA rule for definition consistency maintenance and a push-based strategy for data consistency maintenance. Finally, we conduct extensive simulation experiments in our SPDMS. The simulation results show that the proposed strategy performs better than Mork’s.
Database Systems for Advanced Applications | 2015
Biao Qin
Probabilistic data management has recently drawn much attention from the database research community. This paper investigates safe plans of queries on block independent disjoint (BID) probabilistic databases. This problem is fundamental to evaluating queries whose time complexity is PTIME. We first introduce two new probabilistic table models, the correlated table and the correlated block table, and a hybrid project, which executes a disjoint project and then performs an independent project in one atomic operation on BID tables. After that, we propose an algorithm to find safe plans for queries on BID probabilistic databases. Finally, we present experimental results showing that the proposed algorithm can find safe plans for more queries than the state of the art, and that the safe plans it generates are efficient and scale well.
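The abstract describes the hybrid project only in words, so the sketch below is one possible reading of it rather than the paper's operator. It assumes that alternatives within a block are mutually exclusive, so their probabilities add (disjoint project), and that different blocks are independent, so they combine as 1 - prod(1 - p) (independent project). The name `hybrid_project` and the row layout `(block_id, tuple, probability)` are assumptions.

```python
from collections import defaultdict

def hybrid_project(rows, out_key):
    """Project a BID table onto out_key(tuple) in a single pass over the rows.

    rows: list of (block_id, tuple, probability); alternatives within a block
    are assumed mutually exclusive, and blocks are assumed independent.
    """
    # Disjoint step: within one block, probabilities of alternatives add up.
    per_block = defaultdict(float)
    for block, tup, p in rows:
        per_block[(out_key(tup), block)] += p
    # Independent step: across blocks, combine as 1 - prod(1 - p).
    miss = defaultdict(lambda: 1.0)
    for (value, _block), p in per_block.items():
        miss[value] *= 1.0 - p
    return {value: 1.0 - m for value, m in miss.items()}

# Hypothetical sensor table: block = sensor id, tuple = (room, temperature).
rows = [
    ("s1", ("kitchen", 20), 0.25),
    ("s1", ("kitchen", 21), 0.25),
    ("s1", ("hall", 20), 0.5),
    ("s2", ("kitchen", 20), 0.5),
]
by_room = hybrid_project(rows, out_key=lambda t: t[0])
```

Here the two "kitchen" alternatives of sensor s1 sum to 0.5, which is then combined with sensor s2's 0.5, giving 0.75 for "kitchen" and 0.5 for "hall".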
Database Systems for Advanced Applications | 2013
Biao Qin; Shan Wang; Xiaoyong Du
Provenance information describes the origins and the history of data in its life cycle. Responsibility captures the notion of degree of causality and tells us which facts are the most influential in the lineage. Since responsibility cannot be computed by a relational query, the analysis of lineage becomes an essential tool to compute responsibility of tuples in the query results. We extend the definitions of causality and responsibility of a tuple t for the answer r to those of a set of tuples for the answer r, and Co-Trees to P-Trees for read-once functions. By using P-Trees, we develop an efficient algorithm to compute responsibilities of tuples in read-once formulas, and a novel algorithm to find top-k responsibility tuples in read-once functions. Finally, experimental evaluation on TPC-H data shows substantial efficiency improvement when compared to the state of the art.
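To make the notion of responsibility concrete, the sketch below is a brute-force computation over a small monotone lineage in disjunctive normal form. It uses the usual definition in which a tuple's responsibility is 1 / (1 + |Γ|) for a smallest contingency set Γ. It runs in exponential time and is not the P-Tree algorithm from the paper. The function names and the lineage encoding (a list of sets of tuple identifiers) are assumptions.

```python
from itertools import combinations

def holds(lineage, present):
    """A DNF lineage is true if some clause has all of its tuples present."""
    return any(clause <= present for clause in lineage)

def responsibility(lineage, t):
    """Return 1 / (1 + size of a smallest contingency set), or 0.0 if t is no cause."""
    variables = set().union(*lineage)
    others = sorted(variables - {t})
    for size in range(len(others) + 1):
        for gamma in combinations(others, size):
            remaining = variables - set(gamma)
            # After removing gamma the answer must still hold,
            # and removing t as well must make it fail.
            if holds(lineage, remaining) and not holds(lineage, remaining - {t}):
                return 1.0 / (1 + size)
    return 0.0

# Hypothetical lineage: (r1 AND s1) OR (r2 AND s2).
lineage = [{"r1", "s1"}, {"r2", "s2"}]
```

For this lineage, `responsibility(lineage, "r1")` is 0.5: one tuple from the second clause must be removed before r1 becomes decisive. With the single-clause lineage `[{"r1", "s1"}]`, it is 1.0.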