Kezhi Mao

Nanyang Technological University

Publication

Featured research published by Kezhi Mao.

Sensors | 2017

Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks

Rui Zhao; Ruqiang Yan; Jinjiang Wang; Kezhi Mao

In modern manufacturing systems and industries, increasing research effort has gone into developing effective machine health monitoring systems. Among the various machine health monitoring approaches, data-driven methods are gaining popularity thanks to advances in sensing and data analytics. However, because sensory data are noisy, vary in length and are irregularly sampled, this kind of sequential data cannot be fed directly into classification and regression models. Previous work has therefore focused on feature extraction/fusion methods that require expensive human labor and high-quality expert knowledge. With the development in the last few years of deep learning methods, which redefine representation learning from raw data, a deep neural network structure named Convolutional Bi-directional Long Short-Term Memory networks (CBLSTM) has been designed here to handle raw sensory data. CBLSTM first uses a CNN to extract robust and informative local features from the sequential input. Then, a bi-directional LSTM is introduced to encode temporal information. Long Short-Term Memory networks (LSTMs) are able to capture long-term dependencies and model sequential data, and the bi-directional structure enables the capture of past and future contexts. Stacked fully-connected layers and a linear regression layer are built on top of the bi-directional LSTMs to predict the target value. Here, a real-life tool wear test is introduced, and our proposed CBLSTM is able to predict the actual tool wear based on raw sensory data. The experimental results show that our model is able to outperform several state-of-the-art baseline methods.
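As a rough illustration of the first stage described in the abstract above (convolutional extraction of local features from a raw sequence), the sketch below slides a small kernel over a toy signal, applies a ReLU, and max-pools the result. It is not the authors' implementation: the function names, the hand-picked kernel, and the toy signal are all assumptions made for this example, and the bi-directional LSTM and regression layers are omitted.

```python
def conv1d(sequence, kernel):
    """Valid 1-D convolution as used in deep learning (cross-correlation, no padding)."""
    k = len(kernel)
    return [
        sum(sequence[i + j] * kernel[j] for j in range(k))
        for i in range(len(sequence) - k + 1)
    ]


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Toy raw sensor signal (illustrative values, not real data).
raw = [0, 0, 1, 3, 1, 0, 0, 2, 0, 0]

# Hand-picked kernel that responds to rising edges; in a real CNN the
# kernel weights would be learned from data.
edge_kernel = [-1, 0, 1]

# Convolution followed by ReLU, then pooling to get a shorter feature sequence.
activations = [max(0, v) for v in conv1d(raw, edge_kernel)]
pooled = max_pool(activations, size=2)
```

In a full model of this kind, the pooled sequence (one feature vector per time step, across many learned kernels) would be passed on to the recurrent layers.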

IEEE Transactions on Industrial Electronics | 2018

Machine Health Monitoring Using Local Feature-Based Gated Recurrent Unit Networks

Rui Zhao; Dongzhe Wang; Ruqiang Yan; Kezhi Mao; Fei Shen; Jinjiang Wang

In modern industries, machine health monitoring systems (MHMS) have been applied widely with the goal of realizing predictive maintenance, including failure tracking, downtime reduction, and asset preservation. In the era of big machinery data, data-driven MHMS have achieved remarkable results in the detection of faults after the occurrence of certain failures (diagnosis) and the prediction of future working conditions and remaining useful life (prognosis). The numerical representation of raw sensory data is the keystone of various successful MHMS. Conventional methods are labor-intensive, as they usually depend on handcrafted features, which require expert knowledge. Inspired by the success of deep learning methods that redefine representation learning from raw data, we propose local feature-based gated recurrent unit (LFGRU) networks. It is a hybrid approach that combines handcrafted feature design with automatic feature learning for machine health monitoring. First, features are extracted from windows of the input time series. Then, an enhanced bidirectional GRU network is designed and applied to the generated sequence of local features to learn the representation. Finally, a supervised learning layer is trained to predict the machine condition. Experiments on three machine health monitoring tasks (tool wear prediction, gearbox fault diagnosis, and incipient bearing fault detection) verify the effectiveness and generalization of the proposed LFGRU.
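The first step in the abstract above, extracting features from windows of the input time series, can be sketched as follows. This is a minimal illustration only: the abstract does not say which statistics or window settings the authors use, so the choice of mean, RMS, and peak, the non-overlapping windows, and the name `window_features` are all assumptions.

```python
import math


def window_features(signal, window_size):
    """Split `signal` into non-overlapping windows and compute simple
    handcrafted statistics (mean, RMS, peak) for each full window.
    A trailing partial window is dropped."""
    features = []
    for start in range(0, len(signal) - window_size + 1, window_size):
        window = signal[start:start + window_size]
        mean = sum(window) / window_size
        rms = math.sqrt(sum(x * x for x in window) / window_size)
        peak = max(abs(x) for x in window)
        features.append((mean, rms, peak))
    return features


# Toy signal (illustrative values); the last sample does not fill a window.
signal = [0.0, 1.0, 0.0, -1.0, 2.0, 2.0, 2.0, 2.0, 5.0]
local_features = window_features(signal, window_size=4)
```

The result is a shorter sequence with one feature tuple per window, which is the kind of input a recurrent network such as a bidirectional GRU would then consume.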

IEEE Transactions on Instrumentation and Measurement | 2016

A New Probabilistic Kernel Factor Analysis for Multisensory Data Fusion: Application to Tool Condition Monitoring

Jinjiang Wang; Junyao Xie; Rui Zhao; Kezhi Mao; Laibin Zhang

The features extracted from multisensory measurements can be used to characterize machinery conditions. However, the nonlinearity and uncertainty present in the machinery degradation process pose challenges for feature selection and fusion in machinery condition monitoring. To alleviate these issues, this paper presents a new probabilistic nonlinear feature selection and fusion method, named probabilistic kernel factor analysis (PKFA). First, the mathematical structure of the PKFA is formulated by incorporating kernel techniques into conventional factor analysis (FA). Next, a PKFA-based machining tool condition monitoring model with support vector regression is presented. The effectiveness of the scheme is experimentally verified on a machining tool testbed. The experimental results show that the proposed PKFA method provides more accurate tool condition prediction than using all initially extracted features or other feature selection techniques (e.g., kernel principal component analysis and conventional FA), and thus confirm its utility as an effective tool for machining tool condition assessment.

IEEE Transactions on Affective Computing | 2017

Cyberbullying Detection Based on Semantic-Enhanced Marginalized Denoising Auto-Encoder

Rui Zhao; Kezhi Mao

As a side effect of increasingly popular social media, cyberbullying has emerged as a serious problem afflicting children, adolescents and young adults. Machine learning techniques make automatic detection of bullying messages in social media possible, and this could help to construct a healthy and safe social media environment. In this research area, one critical issue is learning a robust and discriminative numerical representation of text messages. In this paper, we propose a new representation learning method to tackle this problem. Our method, named semantic-enhanced marginalized denoising auto-encoder (smSDA), is developed via a semantic extension of the popular deep learning model stacked denoising autoencoder (SDA). The semantic extension consists of semantic dropout noise and sparsity constraints, where the semantic dropout noise is designed based on domain knowledge and the word embedding technique. Our proposed method is able to exploit the hidden feature structure of bullying information and learn a robust and discriminative representation of text. Comprehensive experiments on two public cyberbullying corpora (Twitter and MySpace) are conducted, and the results show that our proposed approaches outperform other baseline text representation learning methods.

IEEE Transactions on Audio, Speech, and Language Processing | 2017

Topic-Aware Deep Compositional Models for Sentence Classification

Rui Zhao; Kezhi Mao

In recent years, deep compositional models have emerged as a popular technique for representation learning of sentences in computational linguistics and natural language processing. These models normally train various forms of neural networks on top of pretrained word embeddings using a task-specific corpus. However, most of these works neglect the multisense nature of words in the pretrained word embeddings. In this paper, we introduce topic models to enrich the word embeddings with the multiple senses of words. The integration of the topic model with various semantic compositional processes leads to topic-aware convolutional neural networks and topic-aware long short-term memory networks. Unlike previous multisense word embedding models that assign multiple independent and sense-specific embeddings to each word, our proposed models are lightweight and have flexible frameworks that regard word sense as the composition of two parts: a general sense derived from a large corpus and a topic-specific sense derived from a task-specific corpus. In addition, our proposed models focus on semantic composition instead of word understanding. With the help of topic models, we can integrate the topic-specific sense at the word level before the composition and at the sentence level after the composition. Comprehensive experiments on five public sentence classification datasets are conducted, and the results show that our proposed topic-aware deep compositional models produce competitive or better performance than other text representation learning methods.

IEEE Transactions on Industrial Electronics | 2017

Building Occupancy Estimation with Environmental Sensors via CDBLSTM

Zhenghua Chen; Rui Zhao; Qingchang Zhu; Mustafa K. Masood; Yeng Chai Soh; Kezhi Mao

Buildings consume a large amount of energy; hence, the issue of building energy efficiency has attracted a great deal of attention in recent years. A key factor in achieving this objective is occupancy information, which directly impacts energy-related building control systems. In this paper, we leverage environmental sensors, which are nonintrusive and cost-effective, for building occupancy estimation. Our result relies on feature engineering and learning. Conventional feature engineering requires one to manually extract relevant features without a clear guideline. This blind feature extraction is labor intensive and may miss some significant implicit features. To address this issue, we propose a convolutional deep bidirectional long short-term memory (CDBLSTM) approach that contains a convolutional network and a deep structure to automatically learn significant features from the sensory data without human intervention. Moreover, the long short-term memory networks are able to capture temporal dependencies in the data, and the bidirectional structure can take the past and future contexts into consideration for the final identification of occupancy. We have conducted real experiments to evaluate the performance of our proposed CDBLSTM approach. Instead of estimating the exact number of occupants, we attempt to identify the range of occupants, i.e., zero, low, medium, and high, which is adequate for most building control systems. The experimental results indicate the effectiveness of our proposed approach compared with the state-of-the-art methods.

IEEE Transactions on Audio, Speech, and Language Processing | 2017

Task Independent Fine Tuning for Word Embeddings

Xuefeng Yang; Kezhi Mao

Representation learning of words, also known as word embedding, is based on the distributional hypothesis that words with similar semantic meanings have similar contexts. The selection of the context window naturally has an influence on the word vectors learned. However, it is found that the word vectors are often very sensitive to the defined context window, and unfortunately there is no unified optimal context window for all words. One impact of this issue is that, under a predefined context window, the semantic meanings of some words may not be well represented by the learned vectors. To alleviate the problem and improve word embeddings, we propose a task-independent fine-tuning framework in this paper. The main idea of the task-independent fine-tuning is to integrate multiple word embeddings and lexical semantic resources to fine-tune a target word embedding. The effectiveness of the proposed framework is tested on the tasks of semantic similarity prediction, analogical reasoning, and sentence completion. Experimental results on six word embeddings and eight datasets show that the proposed fine-tuning framework could significantly improve word embeddings.

Archive | 2016

Partially Connected ELM for Fast and Effective Scene Classification

Dongzhe Wang; Rui Zhao; Kezhi Mao

Scene classification is often solved as a machine learning problem, where a classifier is first learned from training data, and class labels are then assigned to unlabelled testing data based on the outputs of the classifier. Generally, image descriptors are represented in high-dimensional space, where classifiers such as the support vector machine (SVM) show good performance. However, SVM classifiers demand high computational power during model training. The extreme learning machine (ELM), whose synaptic weight matrix from the input layer to the hidden layer is randomly generated, has demonstrated superior computational efficiency. However, the weights thus generated may not yield enough discriminative power for hidden layer nodes. Our recent study shows that the random mapping from the input layer to the hidden layer in ELM can be replaced by semi-random projection (SRP) to achieve a good balance between computational complexity and discriminative power of the hidden nodes. The application of SRP to ELM yields the so-called partially connected ELM (PC-ELM) algorithm. In this study, we apply PC-ELM to multi-class scene classification. Experimental results show that PC-ELM outperforms ELM in high-dimensional feature space at the cost of slightly higher computational complexity.
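For readers unfamiliar with the baseline, the sketch below shows a plain ELM as characterized in the abstract above: the input-to-hidden weights are drawn at random and never trained, and only the output weights are fitted, here by least squares via a pseudo-inverse. It does not implement the semi-random projection of PC-ELM, which the abstract does not specify. The function names, the `tanh` activation, the hidden-layer size, and the synthetic two-class data are assumptions for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def train_elm(X, Y, n_hidden, rng):
    """Fit a basic ELM: random fixed hidden layer, least-squares output layer."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input-to-hidden weights (not trained)
    b = rng.normal(size=n_hidden)                # random hidden biases (not trained)
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    beta = np.linalg.pinv(H) @ Y                 # output weights via Moore-Penrose pseudo-inverse
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Return raw output scores; take argmax over columns for class labels."""
    return np.tanh(X @ W + b) @ beta


# Synthetic, linearly separable two-class data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)
Y = np.eye(2)[labels]  # one-hot targets

W, b, beta = train_elm(X, Y, n_hidden=50, rng=rng)
pred = predict_elm(X, W, b, beta).argmax(axis=1)
train_accuracy = (pred == labels).mean()
```

Because training reduces to one matrix pseudo-inverse, there is no iterative optimization, which is the source of the computational efficiency the abstract refers to.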

International Conference on Information and Communication Security | 2015

Regularized training of compositional distributional semantic models

Xuefeng Yang; Kezhi Mao; Rui Zhao

Compositional distributional semantic models (cDSMs) aim to use numerical vectors to represent the meaning of complex language expressions. cDSMs are usually trained using a single training target, either from the basic DSM or a pseudo gold standard. In this paper, a new regularized training approach that integrates multiple training targets is proposed to improve semantic composition models. The experimental results show that the proposed training algorithm can effectively enhance compositional distributional semantic models.

International Conference on Information and Communication Security | 2015

Distributional sentence representation by expert knowledge for causal relation identification

Xuefeng Yang; Kezhi Mao; Rui Zhao

Extracting causal relations from natural sentences is an important issue in knowledge discovery. Because this is a typical high-level semantic problem with limited data, most systems employ only handcrafted features from various lexical semantic resources, since these can yield very robust features to support classification. However, human-summarized knowledge is limited, and there is more information in unlabeled corpora. To employ the features learned from unlabeled corpora, the authors propose a distributional sentence representation that makes the distributional word representation applicable to high-level semantic meaning problems. Experiments show that the added features contain complementary knowledge for causal relation expressions and may improve the performance of the relation extraction system.