Publication


Featured research published by Hong Cao.


IEEE Electrical Insulation Magazine | 2015

An overview of state-of-the-art partial discharge analysis techniques for condition monitoring

Min Wu; Hong Cao; Jianneng Cao; Hai-Long Nguyen; João Bártolo Gomes; Shonali Krishnaswamy

As one step toward the future smart grid, condition monitoring is important to facilitate the reliability of grid asset operation and to save on maintenance cost [1]. Most failures of the power grid are caused by electrical insulation failure, and a key indicator of such electrical failure is the occurrence of partial discharge (PD). Therefore, one focus of condition monitoring is to detect PD, especially in the early stages, to prevent a serious power failure or outage.


IEEE Transactions on Knowledge and Data Engineering | 2013

Integrated Oversampling for Imbalanced Time Series Classification

Hong Cao; Xiaoli Li; David Yew-Kwong Woon; See-Kiong Ng

This paper proposes a novel Integrated Oversampling (INOS) method that can handle highly imbalanced time series classification. We introduce an enhanced structure preserving oversampling (ESPO) technique and synergistically combine it with interpolation-based oversampling. ESPO is used to generate a large percentage of the synthetic minority samples based on a multivariate Gaussian distribution, by estimating the covariance structure of the minority-class samples and by regularizing the unreliable eigen spectrum. To protect the key original minority samples, we use an interpolation-based technique to generate a small percentage of the synthetic population. By preserving the main covariance structure and intelligently creating protective variances in the trivial eigen dimensions, ESPO effectively expands the synthetic samples into the void area in the data space without being too closely tied to existing minority-class samples. This also addresses a key challenge in applying oversampling to imbalanced time series classification, i.e., maintaining the correlation between consecutive values by preserving the main covariance structure. Extensive experiments based on seven public time series data sets demonstrate that our INOS approach, used with support vector machines (SVM), achieved better performance than existing oversampling methods as well as state-of-the-art methods in time series classification.
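The interpolation-based part of the abstract above can be illustrated with a minimal sketch. It assumes a generic SMOTE-style scheme (interpolating between a minority sample and one of its nearest minority neighbours); the abstract does not specify the authors' exact procedure, and the function name `interpolate_oversample` and its parameters are hypothetical. ESPO itself is not covered here.

```python
import random


def interpolate_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by linear interpolation.

    Assumption: a generic SMOTE-style scheme, not the paper's exact method.
    Each sample is a fixed-length time series given as a list of floats.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)

    def sq_dist(a, b):
        # Squared Euclidean distance between two equal-length series.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen base sample.
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: sq_dist(base, s),
        )[:k]
        partner = rng.choice(neighbours)
        t = rng.random()  # position on the segment between base and partner
        synthetic.append([x + t * (y - x) for x, y in zip(base, partner)])
    return synthetic
```

Because every synthetic sample lies on a segment between two existing minority samples, this kind of oversampling stays close to the original data, which is consistent with the abstract's use of it to protect the key original minority samples.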


Ubiquitous Computing | 2012

An integrated framework for human activity classification

Hong Cao; Minh Nhut Nguyen; Clifton Phua; Shonali Krishnaswamy; Xiaoli Li

This paper presents an integrated framework to enable using standard non-sequential machine learning tools for accurate multi-modal activity recognition. We develop a novel framework that contains simple pre- and post-classification strategies to improve the overall performance. Before classification, we correct class imbalance in the learning data using structure preserving oversampling (SPO); after classification, we leverage the sequential nature of sensory data by smoothing the predicted label sequence and by classifier fusion. Through evaluation on recent publicly available activity datasets comprising a large amount of multi-dimensional sensory data, we demonstrate that our proposed strategies are effective in improving classification performance over common techniques such as One Nearest Neighbor (1NN) and Support Vector Machines (SVM). Our framework also shows better performance than sequential probabilistic models, such as Conditional Random Field (CRF) and Hidden Markov Model (HMM), and when these models are used as meta-learners.
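Smoothing of the predicted label sequence can be sketched as a sliding-window majority vote. This is only one plausible reading: the abstract does not say which smoothing rule the authors use, and the function name `smooth_labels` and the `window` parameter are hypothetical.

```python
from collections import Counter


def smooth_labels(labels, window=5):
    """Replace each predicted label by the majority label in a centred window.

    Assumption: majority voting is used here for illustration only.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i, current in enumerate(labels):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        top = max(counts.values())
        # On a tie, keep the original prediction rather than flip it.
        if counts[current] == top:
            smoothed.append(current)
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


predicted = ["walk", "walk", "sit", "walk", "walk", "sit", "sit", "sit"]
print(smooth_labels(predicted, window=3))
# ['walk', 'walk', 'walk', 'walk', 'walk', 'sit', 'sit', 'sit']
```

The isolated "sit" at position 2 is overruled by its neighbours, which is how a non-sequential classifier can still benefit from the temporal continuity of activities.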


International Conference on Data Mining | 2011

SPO: Structure Preserving Oversampling for Imbalanced Time Series Classification

Hong Cao; Xiaoli Li; Yew-Kwong Woon; See-Kiong Ng

This paper presents a novel structure preserving oversampling (SPO) technique for classifying imbalanced time series data. SPO generates synthetic minority samples based on a multivariate Gaussian distribution by estimating the covariance structure of the minority class and regularizing the unreliable eigen spectrum. By preserving the main covariance structure and intelligently creating protective variances in the trivial eigen feature dimensions, the synthetic samples expand effectively into the void area in the data space without being too closely tied to existing minority-class samples. Extensive experiments based on several public time series datasets demonstrate that our proposed SPO in conjunction with support vector machines can achieve better performance than existing oversampling methods and state-of-the-art methods in time series classification.
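The general idea of sampling from a Gaussian with a regularized eigen spectrum can be sketched with NumPy. This is an illustration under stated assumptions, not the authors' algorithm: the abstract does not give the regularization rule, so the sketch simply raises every eigenvalue beyond the first `n_reliable` to the smallest reliable one. The function name `gaussian_oversample` and the `n_reliable` parameter are hypothetical.

```python
import numpy as np


def gaussian_oversample(minority, n_new, n_reliable, seed=0):
    """Draw synthetic samples from a Gaussian fitted to the minority class.

    Assumption: the trailing ("unreliable") eigenvalues are replaced by the
    smallest reliable eigenvalue. This is a placeholder rule for illustration.
    """
    X = np.asarray(minority, dtype=float)  # shape (n_samples, series_length)
    if X.ndim != 2 or X.shape[1] < 2:
        raise ValueError("expected a 2-D array with at least two time steps")
    if not 1 <= n_reliable <= X.shape[1]:
        raise ValueError("n_reliable must be between 1 and the series length")
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)

    # Eigen-decomposition of the symmetric covariance, largest eigenvalue first.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Keep the leading spectrum, lift the trailing part to a common variance.
    regularized = eigvals.copy()
    regularized[n_reliable:] = eigvals[n_reliable - 1]

    # Sample in the eigenbasis, then rotate back to the original axes.
    z = rng.standard_normal((n_new, X.shape[1]))
    return mean + (z * np.sqrt(regularized)) @ eigvecs.T
```

Lifting the trailing eigenvalues gives the synthetic samples some variance in directions where the few minority samples show almost none, which is one way to read the "protective variances in the trivial eigen feature dimensions" described above.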


International Conference on Big Data and Smart Computing | 2015

Mobile user verification/identification using statistical mobility profile

Miao Lin; Hong Cao; Vincent W. Zheng; Kevin Chen Chuan Chang; Shonali Krishnaswamy

Recent studies show that ubiquitous smartphone data, e.g., the universal cell tower IDs, WiFi access points, etc., can be used to effectively recover individuals' mobility. However, recording and releasing the data containing such information without anonymization can hurt individuals' location privacy. Therefore, many anonymization methods have been used to sanitize these datasets before they are shared with the research community. In this paper, we demonstrate the idea of statistical mobile user profiling and identification based on anonymized datasets. Our insight is that the mobility patterns inferred from different individuals' data are identifiable by using the statistical profiles constructed from the patterns. Experimental results show that the proposed method achieves a promising identification accuracy of 96% on average based on data from two randomly chosen users, which makes our framework feasible for the application of inferring fraudulent usage of smartphones. Also, extensive experiments are conducted on the more challenging cases, showing a 59.5% identification accuracy for a total of 50 users based on 636 weekly data segments and a 56.1% accuracy for a total of 63 users based on 786 weekly data segments for two separate datasets. As the first work of its kind, our result suggests a good possibility of developing location-based services or applications on the ubiquitous location-anonymized datasets.


International Conference of the IEEE Engineering in Medicine and Biology Society | 2015

Modeling perceived stress via HRV and accelerometer sensor streams

Min Wu; Hong Cao; Hai-Long Nguyen; Karl Surmacz; Caroline Hargrove

Discovering and modeling human stress patterns is a key step towards achieving automatic stress monitoring, stress management and a healthy lifestyle. As various wearable sensors become popular, it becomes possible for individuals to acquire their own relevant sensory data and to automatically assess their stress level on the go. Previous studies for stress analysis were conducted in controlled laboratory and clinical settings. These studies are not suitable for stress monitoring in one's daily life, as various physical activities may affect the physiological signals. In this paper, we address this issue by integrating two modalities of sensors, i.e., HRV sensors and accelerometers, to monitor the perceived stress levels in daily life. We gathered both the heart and the motion data from 8 participants continuously for about 2 weeks. We then extracted features from both sensory data and compared the existing machine learning methods for learning personalized models to interpret the perceived stress levels. Experimental results showed that a Bagging classifier with feature selection is able to achieve a prediction accuracy of 85.7%, indicating that our stress monitoring on a daily basis is fairly practical.


IEEE Conference on Prognostics and Health Management | 2015

Active learning for accurate analysis of streaming partial discharge data

Hai-Long Nguyen; João Bártolo Gomes; Min Wu; Hong Cao; Jianneng Cao; Shonali Krishnaswamy

Partial discharge (PD) is a phenomenon of electric discharge typically caused by the damaged or aged insulation of high voltage equipment in power grids, such as transformers, switch gears, and cable terminals. In the context of Prognostics and Health Management (PHM), detection and monitoring of PD are important to ensure the reliability of electrical assets and to avoid catastrophic failures. Machine learning techniques have been successfully applied to discover features and patterns that correspond to different types of partial discharges [9], [11]. Recently, PD monitoring systems have been deployed for assessing the health condition of this equipment continuously, so that maintenance requires less human effort and causes fewer interruptions to the operation. However, such systems require labeled data to build data models for PD detection and classification. Labeled data is expensive to obtain since it requires manual input from domain experts. Minimizing the labeling cost is thus an important issue to solve. To the best of our knowledge, this issue has not been properly addressed in this domain. This paper proposes an active learning (AL) approach for accurate analysis of streaming PD data that aims to train an accurate PD classification model with minimum cost through selecting the most informative instances for the human experts to label. Experimental results show that our method is able to achieve a high classification accuracy of 86.9% with only a small labeling budget of 1%.
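Selecting informative instances from a stream under a labeling budget can be sketched as follows. The sketch assumes a generic margin-based uncertainty rule and a running budget expressed as a fraction of the instances seen so far; the abstract does not describe the authors' actual selection criterion, and the names `margin`, `select_queries`, `budget` and `margin_threshold` are hypothetical. Retraining the classifier after each new label is omitted.

```python
def margin(probs):
    """Gap between the two most likely classes; small means uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def select_queries(prob_stream, budget=0.01, margin_threshold=0.2):
    """Return stream indices whose labels should be requested from an expert.

    Assumptions: prob_stream yields the current model's class probabilities
    for each arriving instance, and at most a `budget` fraction of the
    instances seen so far may be labeled.
    """
    queried = []
    for seen, probs in enumerate(prob_stream, start=1):
        within_budget = len(queried) + 1 <= budget * seen
        if within_budget and margin(probs) < margin_threshold:
            queried.append(seen - 1)
    return queried
```

With `budget=0.01`, at most one instance in every hundred seen is sent for labeling, which matches the scale of the 1% labeling budget reported above.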


Ubiquitous Computing | 2012

An integrated framework for human activity recognition

Hong Cao; Minh Nhut Nguyen; Clifton Phua; Shonali Krishnaswamy; Xiaoli Li

This poster presents an integrated framework to enable using standard non-sequential machine learning tools for accurate multi-modal activity recognition. Our framework contains simple pre- and post-classification strategies for improved performance: class-imbalance correction on the learning data using structure preserving oversampling before classification, and leveraging the sequential nature of sensory data by smoothing the predicted label sequence and by classifier fusion after classification. Through evaluation on recent publicly available OPPORTUNITY activity datasets comprising a large amount of multi-dimensional, continuous-valued sensory data, we show that our proposed strategies are effective in improving the performance over common techniques such as One Nearest Neighbor (1NN) and Support Vector Machines (SVM). Our framework also shows better performance than sequential probabilistic models, such as Conditional Random Field (CRF) and Hidden Markov Models (HMM), and when these models are used as meta-learners.


International Conference on Data Mining | 2015

IntelligShop: Enabling Intelligent Shopping in Malls through Location-Based Augmented Reality

Aditi Adhikari; Vincent W. Zheng; Hong Cao; Miao Lin; Yuan Fang; Kevin Chen Chuan Chang

Shopping experience is important for both citizens and tourists. We present IntelligShop, a novel location-based augmented reality application that supports an intelligent shopping experience in malls. As the key functionality, IntelligShop provides an augmented reality interface: people can simply use ubiquitous smartphones to face mall retailers, and IntelligShop will automatically recognize the retailers and fetch their online reviews from various sources (including blogs, forums and publicly accessible social media) to display on the phones. Technically, IntelligShop addresses two challenging data mining problems: robust feature learning to support heterogeneous smartphones in localization, and learning to query for automatically gathering the retailer content from the Web for augmented reality. We demonstrate the system's effectiveness via a test bed established in a real mall in Singapore.


International Conference on Artificial Intelligence | 2015

Mobility profiling for user verification with anonymized location data

Miao Lin; Hong Cao; Vincent W. Zheng; Kevin Chen Chuan Chang; Shonali Krishnaswamy
