Publication


Featured research published by Xiu Susie Fang.


Conference on Information and Knowledge Management | 2015

An Integrated Bayesian Approach for Effective Multi-Truth Discovery

Xianzhi Wang; Quan Z. Sheng; Xiu Susie Fang; Lina Yao; Xiaofei Xu; Xue Li

Truth-finding is the fundamental technique for corroborating reports from multiple sources in both data integration and collective intelligence applications. Traditional truth-finding methods assume a single true value for each data item and therefore cannot deal with multiple true values (i.e., the multi-truth-finding problem). So far, the existing approaches handle the multi-truth-finding problem in the same way as the single-truth-finding problem. Unfortunately, the multi-truth-finding problem has its unique features, such as the involvement of sets of values in claims, different implications of inter-value mutual exclusion, and larger source profiles. Considering these features could provide new opportunities for obtaining more accurate truth-finding results. Based on this insight, we propose an integrated Bayesian approach to the multi-truth-finding problem that takes these features into account. To improve the truth-finding efficiency, we reformulate the multi-truth-finding problem model based on the mappings between sources and (sets of) values. New mutual exclusion relations are defined to reflect the possible co-existence of multiple true values. A finer-grained copy detection method is also proposed to deal with sources with large profiles. The experimental results on three real-world datasets show the effectiveness of our approach.


Conference on Information and Knowledge Management | 2015

Approximate Truth Discovery via Problem Scale Reduction

Xianzhi Wang; Quan Z. Sheng; Xiu Susie Fang; Xue Li; Xiaofei Xu; Lina Yao

Many real-world applications rely on multiple data sources to provide information on their items of interest. Due to noise and uncertainty in the data, the information that different sources provide on a specific item may conflict. To make reliable decisions based on these data, it is important to identify the trustworthy information by resolving these conflicts, i.e., the truth discovery problem. Current solutions to this problem detect the veracity of each value jointly with the reliability of each source for each data item. In this way, the efficiency of truth discovery is strictly confined by the problem scale, which in turn prevents truth discovery algorithms from being applied on a large scale. To address this issue, we propose an approximate truth discovery approach, which divides sources and values into groups according to a user-specified approximation criterion. The groups are then used for efficient inter-value influence computation to improve the accuracy. Our approach is applicable to most existing truth discovery algorithms. Experiments on real-world datasets show that our approach improves the efficiency compared to existing algorithms while achieving similar or even better accuracy. The scalability is further demonstrated by experiments on large synthetic datasets.
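To make the joint estimation of value veracity and source reliability concrete, here is a minimal sketch of a generic iterative truth discovery loop. It is not the approximate, group-based method of this paper: the weighted-vote update, the function name discover_truths, and the toy claims are all assumptions chosen for illustration.

```python
from collections import defaultdict


def discover_truths(claims, iterations=10):
    """Generic weighted-voting truth discovery (illustrative assumption).

    claims: dict mapping (source, item) -> claimed value.
    Returns (truths, reliability).
    """
    sources = {source for source, _ in claims}
    reliability = {s: 1.0 for s in sources}  # start by trusting every source equally
    truths = {}
    for _ in range(iterations):
        # Step 1: score each candidate value by the summed reliability of its supporters.
        scores = defaultdict(lambda: defaultdict(float))
        for (source, item), value in claims.items():
            scores[item][value] += reliability[source]
        truths = {item: max(vals, key=vals.get) for item, vals in scores.items()}
        # Step 2: re-estimate each source as the fraction of its claims that match.
        hits = defaultdict(int)
        total = defaultdict(int)
        for (source, item), value in claims.items():
            total[source] += 1
            hits[source] += int(truths[item] == value)
        reliability = {s: hits[s] / total[s] for s in sources}
    return truths, reliability


# Hypothetical toy data: three sources, three items, two conflicts.
example_claims = {
    ("s1", "a"): 1, ("s1", "b"): 2, ("s1", "c"): 3,
    ("s2", "a"): 1, ("s2", "b"): 2, ("s2", "c"): 4,
    ("s3", "a"): 9, ("s3", "b"): 2, ("s3", "c"): 4,
}
truths, reliability = discover_truths(example_claims)
```

Every iteration touches every claim, which illustrates why the cost of such loops grows with the number of sources and values, the scale problem the abstract describes.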


Web Information Systems Engineering | 2015

Classifying Perspectives on Twitter: Immediate Observation, Affection, and Speculation

Yihong Zhang; Claudia Szabo; Quan Z. Sheng; Xiu Susie Fang

Popular micro-blogging services such as Twitter enable users to effortlessly publish observations and thoughts about ongoing events. Such social sensing generates a very large pool of rich and up-to-date information. However, the large volume and fast rate of posting make it very challenging to read through the posts and find useful information in relevant tweets. In this paper, we propose an automated tweet classification approach that distinguishes three perspectives from which a Twitter user may compose messages, namely Immediate Observation, Affection, and Speculation. Using tweets about the Ukraine Crisis in 2014, our experimental results show that, with the right choice of features and classifiers, we can generally obtain very satisfying results, with classification precision in many cases higher than 0.8. We show that the classification results can be used in event time and location detection, public sentiment analysis, and early rumor detection.


International World Wide Web Conferences | 2017

Truth Discovery from Conflicting Multi-Valued Objects

Xiu Susie Fang

Truth discovery is a fundamental research topic, which aims at identifying the true value(s) of objects of interest given conflicting multi-sourced data. Although considerable research effort has been devoted to this topic, two significant issues remain unsolved: i) the single-valued assumption, i.e., current methods assume only one true value for each object, while in reality objects with multiple true values widely exist; ii) sparse ground truth, i.e., current works evaluate and compare existing truth discovery methods based on datasets with limited ground truth. Therefore, the empirical studies might be biased and cannot legitimately validate the existing methods. In this PhD project, we propose a full-fledged graph-based model, SmartMTD (Smart Multi-valued Truth Discovery), which incorporates four important implications to conduct truth discovery for multi-valued objects. Two graphs are constructed and further used to derive two aspects of source reliability via random walk computations. We also present a general approach, which utilizes Markov chain models with Bayesian inference, for comparing the existing truth discovery methods and validating our approach without ground truth. Initial empirical studies on two real-world datasets show the effectiveness of SmartMTD.


Australasian Database Conference | 2015

Ontology Augmentation via Attribute Extraction from Multiple Types of Sources

Xiu Susie Fang; Xianzhi Wang; Quan Z. Sheng

A comprehensive ontology can ease the discovery, maintenance and popularization of knowledge in many domains. As a means to enhance existing ontologies, attribute extraction has attracted tremendous research attention. However, most existing attribute extraction techniques focus on exploring a single type of source, such as structured (e.g., relational databases), semi-structured (e.g., Extensible Markup Language (XML)) or unstructured sources (e.g., Web texts, images), which leads to the poor coverage of knowledge bases (KBs). This paper presents a framework for ontology augmentation by extracting attributes from four types of sources, namely existing KBs, query stream, Web texts, and Document Object Model (DOM) trees. In particular, we use query stream and two major KBs, DBpedia and Freebase, to seed the attribute extraction from Web texts and DOM trees. We focus especially on exploring the extraction technique for DOM trees, which is rarely studied in previous works. Algorithms and a series of filters are developed. Experiments show the capability of our approach in augmenting existing KB ontology.


World Wide Web | 2018

SNAF: Observation filtering and location inference for event monitoring on Twitter

Yihong Zhang; Claudia Szabo; Quan Z. Sheng; Xiu Susie Fang

Twitter has recently emerged as a popular microblogging service that has 284 million monthly active users around the world. Some of the 500 million tweets posted on Twitter every day are personal observations of the immediate environment. If provided with time and location information, these observations can be seen as sensory readings for monitoring and localizing objects and events of interest. Location information on Twitter, however, is scarce, with less than 1% of tweets having associated GPS coordinates. Current research on Twitter location inference mostly focuses on city-level or coarser inference, and cannot provide accurate results for fine-grained locations. We propose an event monitoring system for Twitter that emphasizes local events, called SNAF (Sense and Focus). The system filters personal observations posted on Twitter and infers the location of each report. Our extensive experiments with real Twitter data show that the proposed observation filtering approach achieves about 22% improvement over existing filtering techniques, and that our location inference approach can increase the location accuracy by up to 36% within the 3km error range. By aggregating the observation reports with location information, our prototype event monitoring system can detect real-world events, in many cases earlier than news reports.


International World Wide Web Conferences | 2017

Value Veracity Estimation for Multi-Truth Objects via a Graph-Based Approach

Xiu Susie Fang; Quan Z. Sheng; Xianzhi Wang; Anne H. H. Ngu

A fundamental issue with current truth discovery methods is that they generally assume only one true value for each object, while in reality objects may have multiple true values. In this work, we propose a graph-based approach, called SmartMTD, to relax this assumption in truth discovery. SmartMTD models two types of source relations with additional quantification to precisely estimate source reliability and to detect malicious agreement among sources for multi-truth discovery. In particular, two graphs are constructed based on the modeled source relations, which are further used to derive two aspects of source reliability via random walk computation.
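As a rough illustration of deriving node scores from a graph by random walk, the sketch below runs a damped power iteration (PageRank-style) over a small weighted graph. It is not SmartMTD: the graph construction, the damping factor, the function name random_walk_scores, and the toy agreement weights are all assumptions made for illustration.

```python
def random_walk_scores(weights, damping=0.85, iterations=50):
    """Stationary scores of a damped random walk (illustrative assumption).

    weights[u][v] is a non-negative edge weight from node u to node v;
    every node must appear as a key of weights.
    """
    nodes = list(weights)
    n = len(nodes)
    scores = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # With probability (1 - damping) the walker jumps to a uniformly random node.
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            out = sum(weights[u].values())
            if out == 0:
                # Dangling node: spread its score uniformly over all nodes.
                for v in nodes:
                    nxt[v] += damping * scores[u] / n
            else:
                # Otherwise follow outgoing edges in proportion to their weight.
                for v, w in weights[u].items():
                    nxt[v] += damping * scores[u] * w / out
        scores = nxt
    return scores


# Hypothetical agreement graph: an edge weight counts the values two sources share.
agreement = {
    "s1": {"s2": 3.0, "s3": 1.0},
    "s2": {"s1": 3.0, "s3": 1.0},
    "s3": {"s1": 1.0, "s2": 1.0},
}
source_scores = random_walk_scores(agreement)
```

In this toy graph, sources that agree more strongly with others end up with higher scores, which is one simple reading of how a random walk can turn source relations into a reliability estimate.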


PSU Research Review | 2017

GrandBase: generating actionable knowledge from Big Data

Xiu Susie Fang; Quan Z. Sheng; Xianzhi Wang; Anne H. H. Ngu; Yihong Zhang

Purpose: This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase.
Design/methodology/approach: In particular, this study extracts new predicates from four types of data sources, namely, Web texts, Document Object Model (DOM) trees, existing KBs and query stream to augment the ontology of the existing KB (i.e. Freebase). In addition, a graph-based approach to conduct better truth discovery for multi-valued predicates is also proposed.
Findings: Empirical studies demonstrate the effectiveness of the approaches presented in this study and the potential of GrandBase. The future research directions regarding GrandBase construction and extension have also been discussed.
Originality/value: To revolutionize our modern society by using the wisdom of Big Data, numerous KBs have been constructed to feed the massive knowledge-driven applications with Resource Description Framework triples. The important challenges for KB construction include extracting information from large-scale, possibly conflicting and different-structured data sources (i.e. the knowledge extraction problem) and reconciling the conflicts that reside in the sources (i.e. the truth discovery problem). Tremendous research efforts have been devoted to both problems. However, the existing KBs are far from being comprehensive and accurate: first, existing knowledge extraction systems retrieve data from limited types of Web sources; second, existing truth discovery approaches commonly assume each predicate has only one true value. In this paper, the focus is on the problem of generating actionable knowledge from Big Data. A system is proposed, which consists of two phases, namely, knowledge extraction and truth discovery, to construct a broader KB, called GrandBase.


Advanced Data Mining and Applications | 2016

An Ensemble Approach for Better Truth Discovery

Xiu Susie Fang; Quan Z. Sheng; Xianzhi Wang

Truth discovery is a hot research topic in the Big Data era, with the goal of identifying true values from the conflicting data provided by multiple sources on the same data items. Many methods have been proposed to tackle this issue. However, none of the existing methods is a clear winner that consistently outperforms the others, due to the varied characteristics of different methods. In addition, in some cases, an improved method may not even beat its original version as a result of the bias introduced by limited ground truths or different features of the applied datasets. To realize an approach that achieves better and more robust overall performance, we propose to fully leverage the advantages of existing methods by extracting truth from the prediction results of these existing truth discovery methods. In particular, we first distinguish between the single-truth and multi-truth discovery problems and formally define the ensemble truth discovery problem. Then, we analyze the feasibility of the ensemble approach, and derive two models, i.e., serial model and parallel model, to implement the approach, and to further tackle the above two types of truth discovery problems. Extensive experiments over three large real-world datasets and various synthetic datasets demonstrate the effectiveness of our approach.
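The idea of extracting truth from the predictions of several existing methods can be sketched with plain majority voting over their outputs. This is only one simple combination rule, assumed here for illustration; it is not the serial or parallel model defined in the paper, and the function name ensemble_vote and the toy predictions are hypothetical.

```python
from collections import Counter


def ensemble_vote(predictions):
    """Combine per-method predictions by majority vote (illustrative assumption).

    predictions: list of dicts, one per method, mapping item -> predicted value.
    """
    items = set().union(*(p.keys() for p in predictions))
    combined = {}
    for item in items:
        # Count only the methods that produced a prediction for this item.
        votes = Counter(p[item] for p in predictions if item in p)
        combined[item] = votes.most_common(1)[0][0]
    return combined


# Hypothetical outputs of three truth discovery methods on two data items.
method_outputs = [
    {"a": 1, "b": 2},
    {"a": 1, "b": 3},
    {"a": 5, "b": 3},
]
ensemble_truths = ensemble_vote(method_outputs)
```

This sketch covers the single-truth case only; a multi-truth variant would need to vote on sets of values rather than on one value per item.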


Conference on Information and Knowledge Management | 2016

Truth Discovery via Exploiting Implications from Multi-Source Data

Xianzhi Wang; Quan Z. Sheng; Lina Yao; Xue Li; Xiu Susie Fang; Xiaofei Xu; Boualem Benatallah

Collaboration


Dive into Xiu Susie Fang's collaborations.

Top Co-Authors

Xianzhi Wang
University of New South Wales

Xiaofei Xu
Harbin Institute of Technology

Lina Yao
University of New South Wales

Xue Li
University of Queensland

Boualem Benatallah
University of New South Wales