Hongan Wang | Researchain

Publication

Featured research published by Hongan Wang.

fuzzy systems and knowledge discovery | 2008

Efficient Clustering-Based Outlier Detection Algorithm for Dynamic Data Stream

Manzoor Elahi; Kun Li; Wasif Nisar; Xinjie Lv; Hongan Wang

Anomaly detection is an important and active research problem in many fields, with numerous applications. Most existing methods are based on distance measures, but on data streams these methods are not very efficient from a computational point of view. Most existing work on outlier detection in data streams declares a point an outlier or inlier as soon as it arrives, because memory is limited compared with the huge data stream. Declaring an outlier as it arrives can often lead to a wrong decision because of the dynamic nature of the incoming data. In this paper we introduce a clustering-based approach that divides the stream into chunks and clusters each chunk using k-means with a fixed number of clusters. Instead of keeping only the summary information, as is common when clustering data streams, we keep the candidate outliers and the mean value of every cluster for the next fixed number of stream chunks, to make sure that the detected candidate outliers are real outliers. By using the mean values of the clusters of the previous chunk together with the mean values of the current chunk, we make a better decision about the outlierness of data stream objects. Several experiments on different datasets confirm that our technique finds better outliers at lower computational cost than other existing distance-based approaches to outlier detection in data streams.
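
The chunk-and-recheck idea in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: it works on 1-D values, and the names `kmeans_1d` and `detect_outliers`, the `radius` and `min_size` thresholds, and the `patience` parameter (how many later chunks a candidate must survive) are all hypothetical.

```python
def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's k-means on 1-D values with evenly spaced initial centers.

    Returns a list of (center, members) pairs.
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return list(zip(centers, groups))


def detect_outliers(chunks, k=3, radius=5.0, min_size=3, patience=2):
    """Confirm a candidate outlier only after `patience` later chunks (assumed rule)."""
    candidates = []  # (value, chunks still to survive)
    confirmed = []
    for chunk in chunks:
        clusters = kmeans_1d(chunk, k)
        # Only clusters with enough members count as evidence of normal data.
        dense_means = [c for c, members in clusters if len(members) >= min_size]

        # Re-check earlier candidates against the current chunk's cluster means.
        pending = []
        for value, left in candidates:
            if any(abs(value - m) <= radius for m in dense_means):
                continue  # a dense cluster now covers it: treat as inlier
            if left == 1:
                confirmed.append(value)
            else:
                pending.append((value, left - 1))
        candidates = pending

        # Points not covered by any dense cluster become new candidates.
        for v in chunk:
            if all(abs(v - m) > radius for m in dense_means):
                candidates.append((v, patience))
    return confirmed
```

In this sketch a value that looks isolated in one chunk is dropped from the candidate list if a dense cluster forms around it in a later chunk, which is the situation the abstract describes as the dynamic nature of the stream.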

business intelligence for the real-time enterprises | 2008

QoS-Aware Publish-Subscribe Service for Real-Time Data Acquisition

Xinjie Lu; Xin Li; Tian Yang; Zaifei Liao; Wei Liu; Hongan Wang

Many complex distributed real-time applications need complicated processing and sharing of an extensive amount of data under critical timing constraints. In this paper, we present a comprehensive overview of the Data Distribution Service (DDS) standard and describe its QoS features for developing real-time applications. An overview of an active real-time database (ARTDB) named Agilor is also provided. To express QoS policies efficiently in Agilor, a Real-time ECA (RECA) rule model is presented, based on the common ECA rule model. We then propose a novel QoS-aware Real-Time Publish-Subscribe (QRTPS) service, compatible with DDS, for distributed real-time data acquisition. Furthermore, QRTPS is implemented on Agilor using its objects and RECA rules. To illustrate the benefits of QRTPS for real-time data acquisition, an example application is presented.

computer science and information engineering | 2009

Detection of Local Outlier over Dynamic Data Streams Using Efficient Partitioning Method

Manzoor Elahi; Kun Li; Wasif Nisar; Xinjie Lv; Hongan Wang

Outlier detection is the process of detecting data objects that are grossly different from or inconsistent with the remaining set of data. Important applications in the field of data mining include fraud detection, customer behavior analysis, and intrusion detection. There are a number of good algorithms for detecting outliers when the entire data set is available and the algorithms can make more than a single pass to achieve the required results. Among the existing methods, LOF (Local Outlier Factor), a density-based method, is very efficient in detecting all forms of outliers. The LOF algorithm cannot be applied directly to data streams, because the large number of nearest-neighbor searches and the LOF and lrd (local reachability density) computations make it highly inefficient in that setting. In this paper we propose a cluster-based partitioning algorithm that divides the stream into a safe region and candidate regions. In the second phase, we apply the LOF algorithm to these partitions separately, with a slight enhancement to the LOF computation over the candidate regions, to obtain accurate results for finding the most outstanding outliers. Several experiments on different datasets confirm that our technique finds better outliers at lower computational cost than direct LOF or other enhancements proposed for LOF.
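
For reference, the standard batch LOF computation that this abstract builds on can be sketched as below. It follows the usual definitions (k-distance, reachability distance, lrd) and is not the partitioning method proposed in the paper. The function name `lof_scores` is illustrative, and the sketch assumes distinct points and `k` smaller than the number of points.

```python
import math


def lof_scores(points, k):
    """Return the Local Outlier Factor of every point (standard definition).

    Assumes the points are distinct and k < len(points).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_dist = []      # distance from each point to its k-th nearest neighbor
    neighbors = []   # indices within k-distance (ties included)
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_dist.append(kd)
        neighbors.append([j for d, j in others if d <= kd])

    def reach_dist(i, j):
        # Reachability distance of point i with respect to point j.
        return max(k_dist[j], dist[i][j])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        mean_reach = sum(reach_dist(i, j) for j in neighbors[i]) / len(neighbors[i])
        lrd.append(1.0 / mean_reach)

    # LOF: average ratio of the neighbors' density to the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]
```

Scores near 1 indicate a point whose density matches its neighbors; scores well above 1 indicate a local outlier. Every point needs a neighbor search and density values for its neighbors, which is the cost the abstract identifies as the obstacle on streams.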

fuzzy systems and knowledge discovery | 2007

Adaptive Load Management over Real-Time Data Streams

Xin Li; Li Ma; Kun Li; Kun Wang; Hongan Wang

Streaming applications require long-running query services against data streams. Existing data stream management systems (DSMSs) are poor at processing long-running queries with timing constraints. To address this problem, we present a real-time DSMS that can support real-time query services in unpredictable environments. In this system, long-running queries over data streams are divided into two classes: periodic and continuous queries. A mixed query model is introduced to characterize these two kinds of real-time queries. Furthermore, an adaptive load management (ALM) strategy based on dynamic execution time prediction is proposed to distribute processor time among all query instances. The objective of the ALM strategy is to provide a certain guarantee on the deadline miss ratio of periodic queries and to reduce that of continuous queries, while maximizing overall query quality. A series of experiments confirms that the ALM strategy is effective in improving query quality and managing workload fluctuations.

acm symposium on applied computing | 2009

Real-time scheduling for continuous queries with deadlines

Li Ma; Xin Li; Yongyan Wang; Hongan Wang

Many stream-based applications have real-time performance requirements for continuous queries over time-varying data streams. In order to address this challenge, a real-time continuous query model is presented to handle multiple queries with timing constraints. In this model, the execution of one tuple passing through an operator path is modeled as a real-time task instance. A fine-grained scheduling strategy named OP-EDF is proposed for real-time scheduling, which schedules the operator path with the earliest deadline of the waiting tuples at any time slot. The experimental results show that the proposed continuous query model and scheduling algorithm are effective in real-time query processing for data streams with bursty arrival rates.
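
The earliest-deadline rule described here can be illustrated with a small simulation. This is a sketch under simplifying assumptions rather than the OP-EDF implementation: each operator path holds its waiting tuples in arrival order, deadlines within one path are non-decreasing, and processing one tuple takes exactly one time slot. The name `op_edf_schedule` and the log format are hypothetical.

```python
from collections import deque


def op_edf_schedule(paths, slots):
    """Simulate earliest-deadline-first selection among operator paths.

    paths: dict mapping a path name to a deque of tuple deadlines in
           arrival order (the deques are consumed by this function).
    Returns a list of (slot, path, deadline, met_deadline) entries.
    """
    log = []
    for t in range(slots):
        # The head of each non-empty queue carries that path's earliest deadline.
        ready = [(queue[0], name) for name, queue in paths.items() if queue]
        if not ready:
            break
        _, name = min(ready)  # path whose waiting tuple is most urgent
        deadline = paths[name].popleft()
        # The tuple finishes at the end of slot t, that is at time t + 1.
        log.append((t, name, deadline, t + 1 <= deadline))
    return log
```

With two paths whose waiting tuples have deadlines `[3, 6]` and `[2, 4]`, the scheduler alternates between them, starting with the path holding the deadline-2 tuple, and all four tuples finish on time.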

international symposium on parallel and distributed processing and applications | 2008

A Novel QoS-Enable Real-Time Publish-Subscribe Service

Xinjie Lu; Tian Yang; Zaifei Liao; Xin Li; Yongyan Wang; Wei Liu; Hongan Wang

Complex distributed real-time applications require complicated processing and sharing of an extensive amount of data under critical timing constraints. In this paper, we present a comprehensive overview of the Data Distribution Service (DDS) standard and describe its QoS (Quality of Service) features for developing real-time applications. Real-time ECA (RECA) rules are introduced to describe QoS policies efficiently in an active real-time database (ARTDB) named Agilor. We then propose a novel QoS-Enable Real-Time Publish-Subscribe (QERTPS) service, compatible with DDS, for distributed real-time data acquisition. QERTPS can support several different QoS levels for various applications at the same time. Furthermore, QERTPS is implemented with object models and RECA rules in Agilor. To illustrate the benefits of QERTPS for real-time data acquisition, an example application is presented. Experimental evaluation shows that the proposed service delivers stable and timely service across different QoS levels.

fuzzy systems and knowledge discovery | 2008

Mining Recent Frequent Itemsets in Data Streams

Kun Li; Yongyan Wang; Manzoor Ellahi; Hongan Wang

Mining frequent itemsets in data streams has been a hot research topic in recent years. Because data streams are continuous, high-speed, and unbounded, traditional algorithms for static datasets are not suitable for mining them. In this paper we present the bounded frequent itemsets stream (BFI-stream) algorithm, which uses a prefix-tree-based structure, called the BFI-tree, to maintain all accurate frequent itemsets from sliding windows over data streams. By monitoring the boundary between frequent and infrequent itemsets, it restricts the update process to a small part of the tree. Mining all frequent itemsets with accurate frequencies then only requires traversing the tree. The algorithm is time-efficient even when the user-specified minimum support threshold is small. Experiments compare its time and space usage with MFI-TransSW, which also returns all accurate frequent itemsets from sliding windows. The results show that BFI-stream outperforms MFI-TransSW in both time and space most of the time, especially when the minimum support is small.
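
To make the sliding-window setting concrete, here is a deliberately naive baseline that keeps exact counts for every small itemset in the window. It is not the BFI-tree: it counts infrequent itemsets too, which is exactly the overhead that boundary-monitoring structures aim to avoid. The class name `SlidingWindowItemsets` and the `max_len` cap on itemset size are assumptions made for the sketch.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowItemsets:
    """Exact itemset counts over the most recent `window_size` transactions."""

    def __init__(self, window_size, max_len=3):
        self.window = deque()
        self.window_size = window_size
        self.max_len = max_len  # cap on itemset size to bound enumeration
        self.counts = Counter()

    def _subsets(self, transaction):
        items = sorted(set(transaction))
        for r in range(1, min(self.max_len, len(items)) + 1):
            yield from combinations(items, r)

    def add(self, transaction):
        # Count the new transaction, then expire the oldest one if needed.
        self.window.append(transaction)
        self.counts.update(self._subsets(transaction))
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(self._subsets(expired))

    def frequent(self, min_support):
        """Itemsets whose count is at least min_support * current window length."""
        threshold = min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c > 0 and c >= threshold}
```

Each arriving transaction touches every subset it contains, so the update cost grows quickly with transaction length. Restricting updates to a small part of a prefix tree, as the abstract describes, avoids that.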

fuzzy systems and knowledge discovery | 2008

An Approach to Handle Overload in Real-Time Data Stream Management System

Li Ma; Xin Li; Yongyan Wang; Hongan Wang

Load shedding is a challenging issue in data stream management systems (DSMSs). When data stream rates exceed system capacity, the overloaded DSMS fails to process all of its input data and cannot keep up with the rate of data arrival. In a time-critical environment in particular, queries must be completed not merely promptly but within specific deadlines. Existing strategies handle large, fluctuating overloads under deadlines poorly. In this paper, an Effective Deadline-Aware Random Load Shedding algorithm (named RLS-EDA) is proposed to handle real-time system overload effectively. The RLS-EDA algorithm makes full use of system idle time by buffering dropped tuples, which then have the opportunity to be executed when the workload subsides. Experimental results show that our algorithm reduces the average deadline miss ratio and increases system throughput during periods of large workload fluctuations.
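
The buffering idea in this abstract can be sketched as a toy simulation. It is an illustration under assumed rules, not RLS-EDA itself: time advances in ticks, the system processes at most `capacity` tuples per tick, the excess is chosen at random and buffered, and buffered tuples are processed in later idle capacity only if their deadline has not passed. The function name `run_with_shedding` and the input format are hypothetical.

```python
import random
from collections import deque


def run_with_shedding(arrivals, capacity, seed=0):
    """Simulate random load shedding with a buffer for dropped tuples.

    arrivals: one list per tick, holding the absolute deadline (a tick
              number) of each tuple arriving in that tick. Deadlines are
              assumed to be no earlier than the arrival tick.
    Returns (on_time, missed) tuple counts.
    """
    rng = random.Random(seed)
    buffer = deque()
    on_time = missed = 0
    for tick, batch in enumerate(arrivals):
        batch = list(batch)
        rng.shuffle(batch)  # random choice of which tuples get shed
        processed, shed = batch[:capacity], batch[capacity:]
        on_time += len(processed)
        buffer.extend(shed)  # keep shed tuples instead of discarding them

        # Use idle capacity for buffered tuples that can still meet their deadline.
        spare = capacity - len(processed)
        while spare > 0 and buffer:
            deadline = buffer.popleft()
            if deadline >= tick:
                on_time += 1
                spare -= 1
            else:
                missed += 1  # expired while waiting in the buffer
    missed += len(buffer)  # anything still buffered at the end was never run
    return on_time, missed
```

For a burst of four tuples with deadline 2 arriving in tick 0 and a capacity of 2, the two shed tuples are picked up in the idle tick that follows, so none misses its deadline. Without the buffer, both would be lost.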

computational intelligence and data mining | 2009

Maintaining only frequent itemsets to mine approximate frequent itemsets over online data streams

Yongyan Wang; Kun Li; Hongan Wang

Mining frequent itemsets over online data streams, where new data arrive and old data are removed at high speed, is computationally challenging. Existing approximate mining algorithms suffer from explosive computational complexity when the error parameter ε, which controls the mining accuracy, is decreased. We propose a new approximate mining algorithm, called the AFI algorithm, which uses an approximate frequent itemset tree (AFI-tree) to mine approximate frequent itemsets over online data streams. The AFI-tree, based on a prefix tree, maintains only frequent itemsets, so the number of nodes in the tree is very small. All the infrequent child nodes of any frequent node are pruned, and the maximal support of the pruned nodes is estimated in order to detect new frequent itemsets. To guarantee the mining accuracy, when the estimated maximal support of the pruned nodes is slightly more than the minimum support, their supports are re-computed and the frequent nodes among them are inserted into the AFI-tree. Experimental results show that the AFI algorithm consumes much less memory than existing algorithms and runs much faster than them in most cases.

fuzzy systems and knowledge discovery | 2005

Scheduling design of controllers with fuzzy deadline

Hong Jin; Hongan Wang; Henry (Hui) Wang; Danli Wang

Because some timing constraints of a controller task may not be as determinate as a real-time system engineer assumes, scheduling such a task with uncertain attributes usually cannot be handled simply by the classic methods used in real-time systems. The model of a controller task with a fuzzy deadline and its scheduling are studied. The concept of dedication and a largest-dedication-first scheduling policy are proposed. Simulation shows that the scheduling of controller tasks with fuzzy deadlines can be implemented using the proposed method, while the control performance cost is guaranteed.
