Yao-Chung Fan
National Chung Hsing University
Publication
Featured research published by Yao-Chung Fan.
International Parallel and Distributed Processing Symposium | 2008
Yao-Chung Fan; Arbee L. P. Chen
Sensor networks have received considerable attention in recent years and are often employed in applications where data are difficult or expensive to collect. In these applications, in addition to individual sensor readings, statistical aggregates such as Min and Count over the readings of a group of sensor nodes are often needed. To conserve the resources of sensor nodes, in-network strategies are adopted to process the aggregates. One primitive in-network aggregation strategy is tree-based aggregation, where the aggregates are computed along a spanning tree over a sensor network. However, a shortcoming of tree-based aggregation is that it is not robust against communication failures, which are common in sensor networks. One solution to this shortcoming is to enable multi-path routing, by which each node broadcasts its reading or a partial aggregate to multiple neighbors. However, multi-path routing based aggregation typically suffers from the problem of overcounting sensor readings. In this study, we propose using linear counting sketches for multi-path routing based in-network aggregation. We claim that the use of linear counting sketches makes our approach considerably more accurate than previous approaches using the same sketch space. Our approach also enjoys low variance in aggregate accuracy and low overhead in both computation and sketch space. Through extensive experiments with real-world and synthetic data, we demonstrate the efficiency and effectiveness of using linear counting sketches as a solution for in-network aggregation.
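The duplicate-insensitive counting described in this abstract can be illustrated with a minimal Python sketch. It assumes the textbook linear counting estimator (a bitmap of m bits, with the count estimated as -m ln(fraction of zero bits)); the function names, the SHA-256 hash, and m = 1024 are illustrative assumptions, not the authors' implementation.

```python
import hashlib
import math

M = 1024  # assumed bitmap size in bits (illustrative choice)

def make_sketch(items, m=M):
    """Build a linear counting bitmap (stored as a Python int) from item IDs."""
    bits = 0
    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        pos = int.from_bytes(digest[:8], "big") % m  # hash the item to one bit
        bits |= 1 << pos
    return bits

def merge(a, b):
    """Combine two partial sketches; bitwise OR ignores duplicated items."""
    return a | b

def estimate(bits, m=M):
    """Estimate the number of distinct items from the fraction of zero bits."""
    zeros = m - bin(bits).count("1")
    if zeros == 0:
        raise ValueError("sketch is saturated; use a larger m")
    return -m * math.log(zeros / m)

# Two partial aggregates whose node sets overlap, as happens when readings
# travel over multiple paths (hypothetical node IDs 0..199).
left = make_sketch(range(0, 150))
right = make_sketch(range(100, 200))
print(round(estimate(merge(left, right))))
```

Because a node ID always sets the same bit, a reading that arrives over several paths changes nothing after its first arrival, so the merged sketch equals the sketch of the union of node IDs.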
IEEE Transactions on Parallel and Distributed Systems | 2010
Yao-Chung Fan; Arbee L. P. Chen
Sensor networks have received considerable attention in recent years and are often employed in applications where data are difficult or expensive to collect. In these applications, in addition to individual sensor readings, statistical aggregates such as Min and Count over the readings of a group of sensor nodes are often needed. To conserve the resources of sensor nodes, in-network strategies are adopted to process the aggregates. One primitive in-network aggregation strategy is tree-based aggregation, where the aggregates are computed from the leaves to the root of a spanning tree over a sensor network. However, a shortcoming of tree-based aggregation is that it is not robust against communication failures, which are common in sensor networks. One solution to this shortcoming is to enable multipath routing, by which each node broadcasts its reading or a partial aggregate to multiple neighbors. However, multipath routing-based aggregation typically suffers from the problem of overcounting sensor readings. In this study, we propose two schemes based on the linear counting technique to deal with the overcounting problem. The two schemes allocate space for the linear counting technique statically and dynamically, respectively. Both schemes provide the same accuracy guarantee but involve different communication costs. Through extensive experiments with real-world and synthetic data, we demonstrate the efficiency and effectiveness of using these two schemes as solutions for processing aggregates in a sensor network. The experiments also show that the scheme that allocates space dynamically often outperforms the static one in terms of energy conservation, since it requires less space to satisfy an accuracy constraint.
IEEE Transactions on Knowledge and Data Engineering | 2012
Yao-Chung Fan; Arbee L. P. Chen
Sensor networks have received considerable attention in recent years and are employed in many applications. In these applications, statistical aggregates such as Sum over the readings of a group of sensor nodes are often needed. One challenge in computing sensor data aggregates comes from communication failures, which are common in sensor networks. To enhance the robustness of the aggregate computation, multipath-based aggregation is often used. However, multipath-based aggregation suffers from the problem of overcounting sensor readings. Approaches using multipath-based aggregation therefore need to incorporate techniques that avoid overcounting sensor readings. In this paper, we present a novel technique named scalable counting for efficiently avoiding the overcounting problem. We focus on providing an (ε, δ) accuracy guarantee for computing an aggregate, which ensures that the error in computing the aggregate is within a factor of ε with probability (1 - δ). Our schemes using the scalable counting technique efficiently compute the aggregates under a given accuracy guarantee. We provide theoretical analyses that show the advantages of the scalable counting technique over previously proposed techniques. Furthermore, extensive experiments are conducted to validate the theoretical results and demonstrate the advantages of using the scalable counting technique for sensor data aggregation.
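One common way to write the (ε, δ) guarantee mentioned in this abstract is shown below. It assumes that "within a factor of ε" refers to relative error; the symbols Y (the true aggregate) and \hat{Y} (the computed aggregate) are introduced here for illustration and do not appear in the abstract.

```latex
\Pr\left[\, \lvert \hat{Y} - Y \rvert \le \varepsilon \, Y \,\right] \ge 1 - \delta
```

Under this reading, ε = 0.1 and δ = 0.05 would mean that the computed aggregate is within 10% of the true value with probability at least 95%.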
Conference on Information and Knowledge Management | 2014
Chih-Wei Chang; Yao-Chung Fan; Kuo-Chen Wu; Arbee L. P. Chen
In recent years, smart devices have become a ubiquitous medium supporting various forms of functionality and are widely adopted by ordinary users. One distinguishing feature of smart devices is the ability to determine the physical location of a device, and numerous applications based on user location information have been proposed. While this potential has been foreseen, location-based services fundamentally lack an effective and scalable mechanism to bridge the gap between machine-observed locations and human-understandable places. In this study, we address this fundamental problem. Differing from existing solutions, we start from a novel perspective: we cast the place semantic understanding problem as a classification problem and employ machine learning techniques to automatically infer the types of places. The key observation is that human behaviors are not random; e.g., people visit restaurants around noon, go to work in the daytime, and stay at home at night. Thus, by properly selecting features, a mechanism for automatically inferring place type semantics can be achieved. This paper summarizes our treatment and findings on leveraging human behavior patterns to infer the type of a place. Experiments using month-long trace logs from recruited participants are conducted, and the results demonstrate the effectiveness of the proposed method.
IEEE Transactions on Knowledge and Data Engineering | 2016
Yao-Chung Fan; Yu-Chi Chen; Kuan-Chieh Tung; Kuo-Chen Wu; Arbee L. P. Chen
Nowadays, mobile devices have become a ubiquitous medium supporting various forms of functionality and are widely adopted by the general public. In this study, we investigate using Wi-Fi logs from a mobile device to discover user preferences. The core idea is twofold. First, every Wi-Fi access point has a network name, normally a human-readable string, called an SSID (Service Set Identifier). Since SSIDs often carry semantics, we can infer from them the place where the user stayed. Second, a Wi-Fi log is produced when the user is near a Wi-Fi access point. A high frequency of a consecutively observed SSID implies a long stay at a place. To the best of our knowledge, our work is the first attempt to understand users from Wi-Fi logs collected from mobile devices. However, Wi-Fi logs contain various information types and are noisy. Assessing the information types, eliminating irrelevant information, and cleaning up the noise within partially informative SSIDs are therefore key to profiling user preferences over Wi-Fi logs. In this paper, we propose a data cleaning and information enrichment framework for enabling user preference understanding through collected Wi-Fi logs, and introduce a data cleaning framework for cleaning, correcting, and refining Wi-Fi logs. In addition, a comprehensive experiment with data collected from users is conducted to verify the effectiveness of the proposed techniques for cleaning noisy Wi-Fi data for user preference profiling. The experimental results demonstrate the effectiveness of the proposed framework for profiling user preferences through Wi-Fi logs.
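The observation that a consecutively observed SSID implies a longer stay can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log is taken to be a time-ordered list of SSID strings, and the function names, the example SSIDs, and the `min_run` threshold are hypothetical rather than part of the paper's framework.

```python
from itertools import groupby

def consecutive_runs(ssids):
    """Collapse a time-ordered SSID log into (ssid, run_length) pairs."""
    return [(ssid, sum(1 for _ in group)) for ssid, group in groupby(ssids)]

def long_stays(ssids, min_run=3):
    """Keep only runs long enough to suggest the user stayed at a place."""
    return [(ssid, n) for ssid, n in consecutive_runs(ssids) if n >= min_run]

# Hypothetical log: four consecutive sightings of one SSID, then brief ones.
log = ["CoffeeShop_WiFi"] * 4 + ["Free_Airport"] + ["CoffeeShop_WiFi"] * 2
print(long_stays(log))
```

Counting run lengths rather than total occurrences keeps a place that was briefly passed many times from looking like a long stay.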
International Conference on Parallel and Distributed Systems | 2013
Yao-Chung Fan; Wei Hong Lee; Cheng Teng Iam; Gia Hao Syu
With the popularity of mobile devices, numerous mobile applications have been and will continue to be developed for various interesting usage scenarios. Riding this trend, the research community has recently envisioned a novel information retrieval and information-sharing platform, which views users who carry mobile devices and are willing to accept crowdsourcing tasks as crowd sensors. Building on this idea, a set of crowd sensor applications has emerged. Among these applications, geospatial information systems based on crowd sensors show significant potential beyond traditional ones by providing real-time geospatial information. In these applications, user positioning is of great importance. However, existing positioning techniques have their own disadvantages. In this paper, we study using pervasive Wi-Fi access points as position indicators. The major challenge in using Wi-Fi access points is that there is no mechanism for mapping observed Wi-Fi signals to human-defined places. To this end, our idea is to employ a crowdsourcing model in which mobile participants perform place name annotations to bridge the gap between signals and human-defined places. We propose schemes for effectively enabling crowdsourcing-based place name annotation, and conduct real trials with recruited participants to study the effectiveness of the proposed schemes. The experimental results demonstrate the effectiveness of the proposed schemes over existing solutions.
Autonomic and Trusted Computing | 2012
Yao-Chung Fan; Xingjie Liu; Wang-Chien Lee; Arbee L. P. Chen
Growing concerns about urgent environmental and economic issues, such as global warming and rising energy costs, have motivated research on various green computing technologies. For example, Non-Intrusive Appliance Load Monitor (NIALM) techniques, aiming at energy monitoring, load forecasting, and improved control of residential electrical appliances, have been developed; they monitor one electrical circuit that contains a number of electrical appliances without using separate sub-meters. By employing pattern recognition algorithms, the NIALM techniques estimate the consumption of individual appliances. While the basic ideas behind the NIALM techniques are valid, existing proposals suffer from poor estimation accuracy. In this paper, we model the process of load separation in NIALM as a time series disaggregation problem. Aiming at achieving high estimation accuracy and alleviating excessive computation, we develop a time-series disaggregation algorithm which incorporates two novel techniques, namely, DE-pruning and monotonic enumeration, for search space pruning. A comprehensive set of experiments is conducted to validate our proposals and to evaluate the effectiveness and efficiency of the proposed methods. The results show that our proposal is effective and efficient.
Innovative Mobile and Internet Services in Ubiquitous Computing | 2018
Vorakit Vorakitphan; Fang-Yie Leu; Yao-Chung Fan
In recent years, social networking platforms have served as a new medium for news sharing and information diffusion and have become a part of our daily life. As a result, social media advertising budgets have expanded explosively worldwide over the past few years. Due to the huge commercial interest, clickbait is commonly observed: attractive headlines and sensationalized textual descriptions are used to bait users into visiting websites. Clickbait mainly exploits users' curiosity gap, using interesting headlines to entice readers to click an accompanying link to articles that often have poor content. Clickbait is bothersome to both social media users and platform owners. In this paper, we propose an approach called Ontology-based LSTM Model (OLSTM) to detect clickbait. Compared with existing solutions for clickbait detection, our approach is characterized by the following three components: a word embedding model, Recurrent Neural Networks (RNN), and word ontology information. The observation is that preserving semantic relationships is an important factor in detecting clickbait. Therefore, we propose to capture semantic relationships between words with word embedding models. In addition, we adopt RNNs as our classification models to take word order in a sentence into account. Furthermore, we consider word ontology relations as another feature set for clickbait classification, as clickbait often uses words with generalized concepts to induce curiosity. We conduct experiments with real data from Twitter and news websites to validate the effectiveness of the proposed approach; the results demonstrate that the proposed method improves clickbait detection accuracy from 80% to 90% compared with existing solutions.
International Conference on Data Engineering | 2016
Yao-Chung Fan; Yu-Chi Chen; Kuan-Chieh Tung; Kuo-Chen Wu; Arbee L. P. Chen
Nowadays, mobile devices have become a ubiquitous medium supporting various forms of functionality and are widely adopted by the general public. In this study, we investigate using Wi-Fi logs from a mobile device to discover user preferences. The core idea is twofold. First, every Wi-Fi access point has a network name, normally a human-readable string, called an SSID (Service Set Identifier). Since SSIDs often carry semantics, we can infer from them the place where the user stayed. Second, a Wi-Fi log is produced when the user is near a Wi-Fi access point. A high frequency of a consecutively observed SSID implies a long stay at a place. To the best of our knowledge, our work is the first attempt to understand users from Wi-Fi logs collected from mobile devices. However, Wi-Fi logs contain various information types and are noisy. Assessing the information types, eliminating irrelevant information, and cleaning up the noise within partially informative SSIDs are therefore key to profiling user preferences over Wi-Fi logs. In this paper, we propose a data cleaning and information enrichment framework for enabling user preference understanding through collected Wi-Fi logs, and introduce a data cleaning framework for cleaning, correcting, and refining Wi-Fi logs. In addition, a comprehensive experiment with data collected from users is conducted to verify the effectiveness of the proposed techniques for cleaning noisy Wi-Fi data for user preference profiling. The experimental results demonstrate the effectiveness of the proposed framework for profiling user preferences through Wi-Fi logs.
ACM Symposium on Applied Computing | 2014
Yao-Chung Fan; Huan Chen
Social networking service platforms have achieved great success in recent years. Analyzing the social network data from these platforms presents new opportunities for various applications. Among these applications, social influence analysis has gained great attention, as it provides business value by helping companies determine which potential customers to market to. However, as social networks become increasingly large, scalability is quickly becoming the major challenge for conducting social influence analysis in large-scale social networks. To address this, the common practice is to adopt a parallel processing model. However, from initial experimentation, we find that the traffic load between nodes is very high and becomes a bottleneck for the analysis. In this paper, we present a novel approximation framework which significantly reduces the amount of data traffic for processing social influence analysis. The proposed framework exhibits high efficiency and ensures a tunable (ε, δ) accuracy constraint, which guarantees that the error in the reported result is within a factor of ε with probability (1 - δ). In addition, we conduct a comprehensive performance evaluation to validate and evaluate the proposed techniques. The experimental results clearly show the superiority of the proposed framework.