A comprehensive review on convolutional neural network in machine fault diagnosis
Jinyang Jiao a , Ming Zhao a , Jing Lin b,* , Kaixuan Liang a a State Key Laboratory for Manufacturing Systems Engineering, School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China b School of Reliability and Systems Engineering, Beihang University, Beijing 100083, China
Abstract
With the rapid development of the manufacturing industry, machine fault diagnosis has become increasingly significant for ensuring safe equipment operation and production. Consequently, a wide variety of approaches have been explored and developed in past years, among which intelligent algorithms have developed particularly rapidly. The convolutional neural network, as a typical representative of intelligent diagnostic models, has been extensively studied and applied in the recent five years, and a large amount of literature has been published in academic journals and conference proceedings. However, there has not been a systematic review that covers these studies and offers prospects for further research. To fill this gap, this work attempts to review and summarize the development of Convolutional Network based Fault Diagnosis (CNFD) approaches comprehensively. Generally, a typical CNFD framework is composed of three steps, namely data collection, model construction, and feature learning and decision making, and this paper is organized along this stream. Firstly, the data collection process is described and several popular datasets are introduced. Then, the fundamental theory from the basic convolutional neural network to its variants is elaborated. After that, the applications of CNFD are reviewed in terms of three mainstream directions, i.e. classification, prediction and transfer diagnosis. Finally, conclusions and prospects are presented to point out the characteristics of current development, the challenges faced and future trends. It is expected that this work will provide convenience and inspire further exploration for researchers in this field.
Keywords:
Convolutional neural network, machine fault diagnosis, classification, prediction, transfer learning.
* Corresponding author. E-mail address: [email protected]
1. Introduction
Powered by the integrated innovation of intelligent manufacturing, industrial big data and Industry 4.0, modern industry is experiencing a new revolution from traditional manufacturing to intelligent industry [1, 2]. Mechanical equipment, which plays one of the most significant roles in this revolution, is evolving to continuously promote production and improve economic benefit. Unfortunately, various faults will inevitably emerge during the continuous operation of machines. Once a fault appears, it can cause unscheduled downtime, economic loss, and even catastrophic accidents and casualties [3, 4]. However, the big data generated by modern industry also affords an unprecedented opportunity to obtain an in-depth understanding of machine condition. Therefore, it is vital to seize this opportunity and advance diagnostic methods for accurate judgment of, and timely response to, machine degradation and failure.
Over the years, a variety of approaches have been developed for machine fault diagnosis through the wisdom and efforts of researchers and engineers. Briefly, existing approaches can be roughly divided into four categories according to the development process, i.e. physical model-based methods, signal processing-based methods, machine learning-based methods and their hybrids [5-7]. Physical model-based methods usually require a thorough understanding of the mechanisms of the machine, so it is difficult to build accurate physical models for modern complex mechanical equipment, especially in dynamic and noisy working environments. In addition, most physical models are inflexible and inefficient since they cannot be updated with real monitoring data. In contrast, signal processing-based approaches aim to explore advanced signal de-noising and filtering technologies to emphasize fault characteristic information.
However, they usually require related equipment knowledge for characteristic frequency calculation; moreover, a solid fault representation theory and mathematical basis are also prerequisites of such methods. Another family, machine learning-based methods, as a typical representative of data-driven approaches, has been active and successful with the development of modern industry, especially since the advent of deep learning [8]. Although classical machine learning models, such as the support vector machine (SVM) and k-nearest neighbor, have achieved remarkable progress over the past years, some drawbacks remain when facing higher industrial requirements [9]. For example, i) these methods generally need features to be extracted and selected manually, which is limiting in complex big data analysis; in addition, it is difficult to effectively mine high-dimensional features due to their shallow structure; ii) feature mining and decision making are designed separately, so the unsynchronized optimization consumes considerable time and restricts performance; iii) with the growing diversity of sensors and complexity of machines, as well as the increasing volume of data with increased dimensions and dynamics, it is difficult to obtain satisfactory diagnostic results with traditional algorithms.
Deep learning, as the hottest branch of machine learning, has witnessed proliferation and prosperity in various fields, including image identification, speech processing and so on [10]. This is due not only to intrinsic factors, such as powerful capabilities of data processing, feature learning and architecture innovation, but also to several external factors that cannot be ignored, including i) the explosive increase of industrial big data; ii) breakthroughs in hardware, such as the graphics processing unit; iii) stimulation from multifarious competitive task requirements. Naturally, deep learning has also raised a wave of intelligent fault diagnosis over the past five years.
The popular deep learning based diagnostic models include the deep auto-encoder [11], deep belief network [12], recurrent neural network [13], and convolutional neural network (CNN) [14]. Among them, the convolutional network [15] has become the leading architecture and achieved state-of-the-art performance in many benchmarks [10]. Similarly, fault diagnosis approaches using convolutional networks have developed most rapidly, and a large body of research has been published. Given the popularity of Convolutional Network based Fault Diagnosis (CNFD), a systematic review and summary is necessary to sort out current work and offer prospects for further research.
The CNFD framework can generally be summarized in three steps as shown in Fig. 1, namely data collection, model construction, and feature learning and decision making. In the first step, large amounts of monitoring data are collected and prepared from the mechanical equipment of interest. Next, convolutional network models are designed and constructed depending on the task requirements. Finally, hierarchical and high-dimensional features are adaptively learned to characterize the machinery condition. Meanwhile, the decision, such as fault classification or remaining useful life (RUL) prediction, is made based on the extracted features. Several merits are clearly revealed by this framework: i) it is able to exploit in-depth and intrinsic characteristics adaptively while alleviating the requirements for human labor and expert knowledge; ii) the model can flexibly update itself according to real-time monitoring data to meet more practical diagnostic requirements; iii) the framework integrates feature extraction and decision making and constructs an end-to-end intelligent diagnostic model.
Prior to our work, several excellent review articles on machine fault diagnosis have been published. For instance, Liu et al.
[16] summarized five artificial intelligence algorithms for fault diagnosis of rotating machinery. However, their work mainly focused on traditional machine learning models, and the review of deep learning based methods is insufficient, especially for convolutional networks. Zhao et al. [17] reviewed several deep learning models and their applications to machine health monitoring. Although the convolutional network was also described in their work, it was treated equally with other models and the review of CNN was not sufficiently comprehensive. Hoang et al. [18] presented a survey on deep learning based bearing fault diagnosis, in which the literature only referred to bearing applications. Meanwhile, the summary of CNN based methods is also incomplete. Furthermore, recent emerging studies that integrate convolutional neural networks with transfer learning technologies are not mentioned in these papers, although these methods have gradually attracted attention since they are suitable for more practical industrial scenarios. With this in mind, this paper intends to review fault diagnosis algorithms that leverage convolutional networks more comprehensively and to provide a reference for those who want to understand and promote the development of CNN technologies for fault diagnosis of machinery.
The rest of this paper is organized as follows. Following the workflow of Fig. 1, the data collection process and several popular public datasets are described in Section 2. After that, the concept and theory of CNN and its variants are introduced in Section 3. In Section 4, the applications of convolutional networks to machine fault diagnosis are comprehensively reviewed. In Section 5, some conclusions are drawn based on the above review. Finally, prospects are summarized in Section 6.
Fig. 1. The framework of general CNFD method. Data collection: collect mechanical data for fault diagnosis, such as vibration, current, acoustic emission and built-in encoder signals. Model construction: design and construct convolutional networks, such as basic one-dimensional or two-dimensional CNNs or their variants (ResNet, GAN, dense network, etc.). Feature learning and decision making: the convolutional network adaptively learns features and makes decisions, trained by gradient descent; common decisions include fault classification and machine health degradation prediction.
2. Data preparation
As shown in Fig. 1, high-quality data is the premise and foundation for successfully training convolutional neural networks. In brief, acquiring mechanical data involves two steps: sensor selection and layout, and data sampling and storage. With the development of sensor technology, various sensors have been applied to mechanical condition monitoring, such as vibration, current and built-in encoder sensors [19]. Based on these sensors, comprehensive monitoring data can be captured for machine fault diagnosis. Among them, vibration analysis has become the most popular monitoring approach and has developed most rapidly over the past years. Although vibration-based approaches have made impressive progress, there are still some restrictions on collecting vibration data in industrial practice. For instance, vibration data are often contaminated by transmission-path interference and environmental noise, so the signal-to-noise ratio of the data is low. In addition, vibration measurements are insensitive at low frequencies, so they are not well suited to the condition monitoring of low-speed machinery. Furthermore, vibration sensors sometimes cannot be installed at all in high-temperature, high-pressure or enclosed working environments. These drawbacks can be circumvented by using other sensors; for example, infrared imaging provides a non-contact measurement method, and built-in encoder signals have a better signal-to-noise ratio and low-frequency response. Therefore, it is important to comprehensively consider multiple factors, such as equipment type, working environment, monitoring object and operating condition, when selecting well-suited sensors. Next, the sensor layout is also an important consideration, since a proper location captures much more health information and reduces the influence of the transmission path and interference.
Following this step, data sampling is carried out using the data acquisition system, and the data are then stored on hard disks or a cloud platform for further analysis and use. Although the process of data collection is clear and intuitive, it remains difficult to acquire high-quality data in real industrial scenarios [20]. For example, i) fault data are harder to acquire than healthy data, since a machine is usually not allowed to run in a faulty condition; ii) obtaining life-cycle data is time-consuming, expensive and sometimes prohibitive, since a machine generally runs for a long time from health to failure. Fortunately, a few institutions have published datasets for public study and application. Therefore, several public datasets are introduced in the following subsections, with the aim of offering a guideline for researchers and engineers who intend to select these data for the evaluation of their approaches.
Fig. 2. Experimental platform of CWRU.
The Case Western Reserve University (CWRU) [21] bearing dataset has become one of the most popular datasets for machine fault diagnosis since it was made public. The experimental rig is shown in Fig. 2 and consists of an electric motor, a torque transducer/encoder, a dynamometer, and control electronics. Single-point motor bearing faults seeded by electro-discharge machining were tested on this platform, including inner race, outer race and ball faults. The faults have different sizes, i.e. 7 mils, 14 mils, 21 mils, 28 mils, and 40 mils (1 mil = 0.001 inches). Accelerometers attached to the drive end and fan end of the motor housing were used to collect vibration data at sampling frequencies of 12 kHz and 48 kHz. In addition, there are four operating conditions in this dataset, namely 0 hp/1797 rpm, 1 hp/1772 rpm, 2 hp/1750 rpm, and 3 hp/1730 rpm.
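As a rough illustration of how the sampling frequencies and shaft speeds above translate into sample counts, the following sketch computes the number of samples per shaft revolution and cuts a signal into fixed-length windows. It is a minimal sketch and not part of the CWRU distribution: the names `CWRU_SPEEDS_RPM`, `samples_per_revolution` and `segment` are hypothetical, and the window length and step are free choices of the user.

```python
# Operating conditions quoted in the text: motor load (hp) -> shaft speed (rpm).
CWRU_SPEEDS_RPM = {0: 1797, 1: 1772, 2: 1750, 3: 1730}


def samples_per_revolution(sampling_rate_hz, speed_rpm):
    """Number of samples recorded during one shaft revolution."""
    revolutions_per_second = speed_rpm / 60.0
    return sampling_rate_hz / revolutions_per_second


def segment(signal, window, step):
    """Split a 1-D sequence into windows of `window` samples, `step` apart.

    Hypothetical helper: trailing samples that do not fill a window are dropped.
    """
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]


if __name__ == "__main__":
    for load_hp, rpm in CWRU_SPEEDS_RPM.items():
        for fs in (12_000, 48_000):
            spr = samples_per_revolution(fs, rpm)
            print(f"{load_hp} hp, {fs} Hz: {spr:.1f} samples per revolution")
```

At 12 kHz and 1797 rpm one revolution spans roughly 401 samples, so a window of one or two thousand samples covers several revolutions; whether that is enough for a given model remains an empirical question.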
1) The faults in this dataset were seeded by electro-discharge machining and differ to some extent from naturally developed faults in real industrial scenarios. Therefore, when this dataset is used to construct a multi-fault classification task, many faults are easy to detect and the resulting high accuracy may inspire undue confidence.
2) This dataset was collected at different sensor positions and can thus be used to study the generalization capability of a model across sensor data.
3) The same fault condition is available in different sizes. Thus this dataset can be applied to the transfer diagnosis scenario where the model is trained with one fault size and tested with other fault sizes.
4) This dataset includes four different working conditions. Therefore, it is suitable for transfer fault diagnosis studies in which the training data and the test data come from different operating conditions.
Fig. 3. Experimental rig of PHM 2009 (labeled components: input shaft, idler shaft and output shaft with their bearings; gears 32T or 16T, 96T or 48T, 48T or 24T, 80T or 40T; tachometer and two accelerometer channels).
This dataset was shared by the IEEE International Conference on Prognostics and Health Management (PHM) 2009 [22]. The schematic of the experimental rig is shown in Fig. 3, where the tested industrial gearbox contains three shafts, four gears and six bearings. Two types of gears, i.e. spur gears and spiral cut (helical) gears, were used in the experimental tests. The spur gear dataset contains eight health conditions and the helical gear dataset contains six health conditions, as described in Table 1 and Table 2. In this experiment, the vibration data were sampled at 66.67 kHz by two accelerometers mounted on the input and output shaft retaining plates. Meanwhile, the tachometer signal was collected at 10 pulses per revolution. Each data file therefore contains three columns: the first two columns are vibration data and the third column is tachometer data. Moreover, the experimental operating conditions involve five shaft speeds, i.e. 30 Hz, 35 Hz, 40 Hz, 45 Hz and 50 Hz, and two loads, i.e. high and low.
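Since the third column of each file stores a tachometer signal with 10 pulses per revolution, the shaft speed can be recovered by counting pulse edges. The sketch below is an illustration under stated assumptions: the tachometer is treated as a two-level signal with a user-chosen threshold, a synthetic pulse train stands in for a real data file, and the function name `shaft_speed_hz` is hypothetical.

```python
import numpy as np

FS_HZ = 66.67e3       # sampling frequency quoted in the text
PULSES_PER_REV = 10   # tachometer pulses per shaft revolution


def shaft_speed_hz(tach, fs=FS_HZ, pulses_per_rev=PULSES_PER_REV, threshold=0.5):
    """Estimate the mean shaft speed (Hz) from rising edges of a tachometer signal."""
    high = np.asarray(tach) > threshold
    # Index of every sample where the signal goes from low to high.
    rising = np.flatnonzero(~high[:-1] & high[1:]) + 1
    if rising.size < 2:
        raise ValueError("need at least two tachometer pulses")
    # (number of pulse intervals) / (elapsed time between first and last edge)
    pulse_rate = (rising.size - 1) * fs / (rising[-1] - rising[0])
    return pulse_rate / pulses_per_rev


# Synthetic stand-in for the tachometer column (assumption, not real data):
# a shaft turning at 30 Hz produces 300 pulses per second.
t = np.arange(int(FS_HZ)) / FS_HZ
synthetic_tach = (np.sin(2 * np.pi * 300.0 * t) > 0).astype(float)
estimated_speed = shaft_speed_hz(synthetic_tach)
```

With real files, the threshold would have to be chosen from the actual signal levels of the tachometer channel, which are not specified here.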
1) This dataset was collected from a generic industrial gearbox, in which the spur gear configuration contains eight different health conditions and the helical gear configuration contains six. Thus it can be used to construct multi-class diagnosis scenarios.
2) Two accelerometers sample vibration data synchronously at different positions; therefore, the dataset is applicable to research on dual-sensor information fusion or transfer diagnosis between sensors.
3) There are multiple hybrid faults of gears, bearings and shafts in this dataset, so it is a typical case for the study of hybrid fault diagnosis.
4) This dataset includes multiple working conditions. Therefore, it can be applied to transfer diagnosis research under different speeds and loads.
Table 1. Description of spur gear health conditions in PHM 2009 dataset.
Case    Gear                Bearing                                   Shaft
        32T  96T  48T  80T  IS:IS  ID:IS  OS:IS  IS:OS  ID:OS  OS:OS  Input  Output
Spur 1  G    G    G    G    G      G      G      G      G      G      G      G
Spur 2  C    G    E    G    G      G      G      G      G      G      G      G
Spur 3  G    G    E    G    G      G      G      G      G      G      G      G
Spur 4  G    G    E    Br   B      G      G      G      G      G      G      G
Spur 5  C    G    E    Br   In     B      O      G      G      G      G      G
Spur 6  G    G    G    Br   In     B      O      G      G      G      Im     G
Spur 7  G    G    G    G    In     G      G      G      G      G      G      KS
Spur 8  G    G    G    G    G      B      O      G      G      G      Im     G
IS: Input Shaft; ID: Idler Shaft; OS: Output Shaft; :IS: Input Side; :OS: Output Side; G: Good; C: Chipped; E: Eccentric; Br: Broken; B: Ball; In: Inner; O: Outer; Im: Imbalance; KS: Keyway Sheared.
Table 2. Description of helical gear health conditions in PHM 2009 dataset.
Case    Gear                Bearing                                   Shaft
        16T  48T  24T  40T  IS:IS  ID:IS  OS:IS  IS:OS  ID:OS  OS:OS  Input  Output
Hel 1   G    G    G    G    G      G      G      G      G      G      G      G
Hel 2   G    G    C    G    G      G      G      G      G      G      G      G
Hel 3   G    G    Br   G    G      G      G      Co     In     G      BS     G
Hel 4   G    G    G    G    G      G      G      Co     B      G      Im     G
Hel 5   G    G    Br   G    G      G      G      G      In     G      G      G
Hel 6   G    G    G    Br   In     B      O      G      G      G      BS     G
Hel: Helical; Co: Combination; BS: Bent Shaft.
Fig. 4. Paderborn test rig for condition monitoring of rolling bearings (labeled components: electric motor, torque measurement shaft, bearing test module, flywheel, load motor).
The Paderborn dataset [23] was obtained using the test bench shown in Fig. 4, which is composed of an electric motor, a torque measurement shaft, a rolling bearing test module, a flywheel and a load motor. Bearings in different states were installed in the bearing test module to acquire experimental data. In total, experiments with 26 faulty bearings and 6 healthy bearings were performed; the faulty bearings comprise 12 with artificial damage, as shown in Table 3 and Fig. 5 (a), and 14 with real damage, as shown in Table 4 and Fig. 5 (b). The motor current and the vibration signal of the bearing housing were measured synchronously at a sampling rate of 64 kHz. Moreover, the test rig was operated under four different operating conditions, as shown in Table 5, by changing the rotational speed, load torque and radial force.
Table 3. Description of test bearing with artificial damage.
Artificial damage (one column per test bearing)
BC   OR   OR   OR   OR   OR   OR   OR   IR   IR   IR   IR   IR
ED   1    2    1    2    1    2    2    1    1    1    2    2
DM   EDM  EE   EE   EE   D    D    D    EDM  EE   EE   EE   EE
BC: Bearing Component; ED: Extent of Damage; DM: Damage Method; OR: Outer Ring; IR: Inner Ring; EDM: Electric Discharge Machining; D: Drilling; EE: Manual Electric Engraving.
Fig. 5. Several examples of bearing damages. (a) artificial: sharp trench by EDM, drilling, pitting by electric engraver; (b) real: indentation of outer ring, pitting of inner ring.
Table 4. Description of test bearing with real damage caused by accelerated lifetime test.
Real damage (one column per test bearing)
D   FP  PI  FP  FP  PI  FP       FP       PI       FP  FP  FP  FP  FP  FP
BC  OR  OR  OR  OR  OR  IR(+OR)  IR(+OR)  IR+OR    IR  IR  IR  IR  IR  IR
Co  S   S   R   S   R   M        M        M        M   M   S   R   S   S
A   nr  nr  r   nr  r   r        nr       r        nr  nr  nr  r   nr  nr
ED  1   1   2   1   1   2        3        1        1   1   3   1   2   1
CD  SP  SP  SP  SP  Di  SP       Di       Di       SP  SP  SP  SP  SP  SP
D: Damage (main mode and symptom); BC: Bearing Component; Co: Combination; A: Arrangement; ED: Extent of Damage; CD: Characteristic of Damage; FP: Fatigue: Pitting; PI: Plastic deformation: Indentations; OR: Outer Ring; IR: Inner Ring; S: Single Damage; R: Repetitive Damage; M: Multiple Damage; nr: no repetition; r: random; SP: Single Point; Di: Distributed.
Table 5. Four operating conditions.
No.                     0     1     2     3
Rotational speed (rpm)  1500  900   1500  1500
Load torque (Nm)        0.7   0.7   0.1   0.7
Radial force (N)        1000  1000  1000  400
1) This dataset contains different damage states and can thus be used for multi-fault classification studies. Besides, it is more comprehensive than the CWRU bearing data since it includes both artificial and real bearing damages.
2) In this dataset, the motor current and vibration signals were sampled synchronously to collect bearing health information. Therefore, on the one hand, the motor current signal or the vibration signal can be used independently for bearing diagnosis and comparison studies; on the other hand, diagnosis based on multi-sensor information fusion can be studied using this dataset.
3) This dataset can be used for transfer diagnosis across fault formation modes, that is, the model is trained using the artificial fault data and tested using the real fault data.
4) There are four operating conditions with different speeds, load torques and radial forces. Thus this dataset is suitable for transfer diagnosis studies under different working conditions.
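The cross-condition transfer scenarios mentioned in the last remark can be enumerated directly from Table 5. The following is a small sketch with hypothetical names (`OPERATING_CONDITIONS`, `transfer_tasks`, `changed_factors`); it lists every ordered source-to-target pair and reports which operating parameters differ between the two conditions.

```python
from itertools import permutations

# Operating conditions as listed in Table 5 of the text.
OPERATING_CONDITIONS = {
    0: {"speed_rpm": 1500, "torque_nm": 0.7, "radial_force_n": 1000},
    1: {"speed_rpm": 900, "torque_nm": 0.7, "radial_force_n": 1000},
    2: {"speed_rpm": 1500, "torque_nm": 0.1, "radial_force_n": 1000},
    3: {"speed_rpm": 1500, "torque_nm": 0.7, "radial_force_n": 400},
}


def transfer_tasks(conditions=OPERATING_CONDITIONS):
    """All ordered (source, target) pairs of distinct operating conditions."""
    return list(permutations(sorted(conditions), 2))


def changed_factors(source, target, conditions=OPERATING_CONDITIONS):
    """Names of the operating parameters that differ between two conditions."""
    return [name for name, value in conditions[source].items()
            if conditions[target][name] != value]


if __name__ == "__main__":
    for src, tgt in transfer_tasks():
        print(f"{src} -> {tgt}: changes {changed_factors(src, tgt)}")
```

Four conditions give twelve ordered pairs. Conditions 1, 2 and 3 each differ from condition 0 in exactly one parameter, so tasks involving condition 0 isolate the effect of speed, torque or radial force.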
Fig. 6. The IMS bearing test rig (labeled components: accelerometers, radial load, thermocouples, bearings 1 to 4).
This bearing dataset was established by the Center for Intelligent Maintenance Systems (IMS) of the University of Cincinnati [24], and the test rig is presented in Fig. 6. In this test rig, four Rexnord ZA-2115 double row bearings were installed on the shaft for testing. The rotation speed was kept constant at 2000 rpm by an AC motor coupled to the shaft via rub belts. Besides, a radial load of 6000 lbs was applied to the shaft and bearings by a spring mechanism. The vibration data were acquired by accelerometers attached to the bearing housings at a sampling rate of 20 kHz. In total, there are three experiments, each of which is a test-to-failure run as shown in Table 6. At the end of the first experiment, an inner race defect occurred in bearing 3 and a roller element defect in bearing 4. In the second and third experiments, an outer race failure finally occurred in bearing 1 and bearing 3, respectively.
Table 6. Description of bearing conditions in three experiments.
Experiment  Bearing 1  Bearing 2  Bearing 3  Bearing 4
1           UD         UD         IRD        RED
2           ORD        UD         UD         UD
3           UD         UD         ORD        UD
UD: undamaged; IRD: inner race damage; RED: roller element damage; ORD: outer race damage.
1) This dataset contains four different health conditions, i.e. healthy, roller fault, outer race fault and inner race fault, which can be used to study bearing fault classification.
2) Each experiment is a run-to-failure test, so researchers can employ this dataset to study bearing RUL prediction.
3) The bearings in this experiment experienced an "increase-decrease-increase" degradation trend, in which the "decrease" is attributed to the self-healing nature of the damage. As a result, selecting data from this period increases the difficulty of fault diagnosis.
4) There is only one operating condition in this dataset, which limits the diversity of the data. In addition, the lifetimes of the individual units differ markedly, which increases the difficulty of RUL prediction.
Fig. 7. The diagram of the simulated engine (labels: fan, LPC, HPC, combustor, HPT, LPT, nozzle, N1, N2).
The C-MAPSS dataset [25] was generated with the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) to simulate turbofan engine degradation. The diagram of the simulated engine is shown in Fig. 7. The system has 14 inputs, i.e. the fuel flow and a set of 13 health parameters, which allow the user to simulate the effects of faults and deterioration in any of the engine's five rotating components. There are 58 outputs, including various sensor responses and operability margins, of which a total of 21 sensor variables were used to measure the health state of the engine. In total, five subsets are included in this dataset, as listed in Table 7, and each trajectory has a specific initial wear level and degradation process.
Table 7. Five subsets included in the turbofan engine degradation dataset.
No.  Train trajectories  Test trajectories  Conditions       Fault modes
1    218                 218                --               --
2    100                 100                One (sea level)  HPC degradation
3    260                 259                Six              HPC degradation
4    100                 100                One (sea level)  HPC and fan degradation
5    248                 249                Six              HPC and fan degradation
1) This dataset was generated by simulation software and thus differs to some extent from experimental data or real industrial scenarios.
2) There are sufficient training samples in this dataset to train complex convolutional networks for prediction.
3) This dataset contains 21 different observed features, such as temperature, pressure and speed, which means that it can be applied to the study of multi-sensor information fusion.
4) There are different operating conditions for the same fault mode. Thus, it can be used to evaluate the generalization capability of a model.
Fig. 8. The high-speed CNC milling machine cutters (labeled components: accelerometers, AE sensor, cutter, workpiece, dynamometer).
This dataset was shared in the 2010 PHM Society Conference Data Challenge [26], which focused on RUL estimation of high-speed CNC milling machine cutters. The experimental rig is shown in Fig. 8 and the operation parameters are listed in Table 8. A Kistler quartz 3-component platform dynamometer was mounted between the workpiece and the machining table to measure the cutting forces; three Kistler accelerometers were mounted on the workpiece to measure the machine tool vibrations of the cutting process in the X, Y and Z directions, respectively; and a Kistler acoustic emission sensor was mounted on the workpiece to monitor the high-frequency stress wave generated by the cutting process. In total, the dataset contains data from six individual cutters.
Table 8. The description of operation parameters.
Description                      Value
Running speed of the spindle     10400 rpm
Feed rate in the x direction     1555 mm/min
Depth of cut in the y direction  0.125 mm
Depth of cut in the z direction  0.2 mm
Sampling frequency               50 kHz
1) This dataset was collected in a dry milling environment and therefore differs to some extent from real milling processes. However, since real milling data are quite difficult to acquire due to cost or commercial competition, this dataset is still a good choice for the study of RUL prediction of milling machine cutters.
2) Each data file is composed of three-dimensional cutting force data, three-dimensional vibration data and an acoustic emission signal. Therefore, it can be used to explore single-sensor or multi-sensor fusion based prediction scenarios.
3) Although this dataset contains data from six cutters, only three cutters are labeled. Thus the amount of data may be insufficient for building complex diagnostic networks.
4) The experiment was performed under a single milling operating condition, which limits the diversity of the data and restricts the construction of cross-validation prediction scenarios.
Fig. 9. PRONOSTIA experimental platform (labeled components: NI DAQ card, pressure regulator, cylinder pressure, force sensor, bearing tested, accelerometers, AC motor, speed sensor, speed reducer, torquemeter, coupling, thermocouple).
The FEMTO dataset [27] was acquired from the PRONOSTIA experimental platform designed by the Franche-Comté Electronics, Mechanics, Thermal Processing, Optics-Sciences and Technologies institute (FEMTO), with the aim of providing experimental data that characterize the degradation of bearings. This dataset was also used for the prognostic challenge of the IEEE International Conference on PHM 2012. An overview of the test rig is presented in Fig. 9; it is composed of a rotating part (an asynchronous motor with a gearbox and two shafts), a degradation generation part (a radial force applied to the tested bearing) and a measurement part (sensors). All bearings are healthy and not seeded with any defects at the beginning of the test. Two types of sensors, i.e. a thermocouple and accelerometers (horizontal and vertical), were used to collect the temperature and vibration signals of the tested bearing, with sampling frequencies of 10 Hz for temperature and 25.6 kHz for vibration. The bearing life is considered terminated when the amplitude of the vibration signal exceeds 20 g, in order to avoid propagation of damage to the whole test bed. In summary, this dataset contains 17 run-to-failure records of bearings.
1) This is a full-life bearing dataset with real damages, so it is a good choice for the study of bearing RUL prediction. However, the training set is small while the spread of the lifetimes of the bearings is wide, which makes RUL estimation more challenging on this dataset.
2) Although vibration signals are collected in two directions, the vertical signals provide less useful information than the horizontal ones for tracking bearing degradation according to the related literature [28, 29].
3) This dataset provides a natural degradation process, since the bearings are healthy and not seeded with any defects at the beginning of the tests. However, it presents no prior information about the properties of the damages.
4) The degradation and fault patterns differ between bearings even under the same operating condition due to various factors, which increases the difficulty of bearing RUL estimation.
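The end-of-life rule described above (vibration amplitude exceeding 20 g) suggests a simple way to derive RUL labels from a run-to-failure record. The following is a minimal sketch under assumptions that are not part of the dataset description: each row of `snapshots_g` is taken to be one vibration recording in units of g, the RUL is counted in recordings rather than in seconds, and a linearly decreasing label is used. The function names are hypothetical.

```python
import numpy as np

LIMIT_G = 20.0  # end-of-life amplitude threshold quoted in the text


def end_of_life_index(snapshots_g, limit_g=LIMIT_G):
    """Index of the first recording whose peak |amplitude| exceeds the limit.

    Returns None if the limit is never exceeded.
    """
    peaks = np.max(np.abs(np.asarray(snapshots_g, dtype=float)), axis=1)
    over = np.flatnonzero(peaks > limit_g)
    return int(over[0]) if over.size else None


def rul_labels(eol_index, normalize=False):
    """Linearly decreasing RUL for recordings 0..eol_index (0 at end of life)."""
    rul = np.arange(eol_index, -1, -1, dtype=float)
    if normalize and eol_index > 0:
        rul /= eol_index  # scale to the range [0, 1]
    return rul


# Toy record (assumption, not real data): four recordings of two samples each.
toy_record = [[1.0, -2.0], [5.0, 3.0], [-21.0, 4.0], [30.0, 1.0]]
toy_eol = end_of_life_index(toy_record)   # third recording (index 2)
toy_rul = rul_labels(toy_eol)             # 2, 1, 0 recordings remaining
```

A linear label is only one common choice; given the wide spread of lifetimes noted in the first remark, normalized or piecewise labels may be preferable in practice.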
Table 9. The summary of different datasets.
Name       Monitoring object        Multi-sensors  Classification  Prediction  Transfer diagnosis
CWRU       Motor bearing            √              √               ×           √
PHM 09     Gearbox, bearing, shaft  √              √               ×           √
Paderborn  Bearing                  √              √               ×           √
IMS        Bearing                  ×              √               √           ×
C-MAPSS    Turbofan engine          √              ×               √           √
PHM 10     Milling machine cutters  √              ×               √           ×
FEMTO      Bearing                  √              ×               √           √
In this section, data collection is first introduced, from sensor selection and layout to data sampling and storage. For the convenience of relevant scholars, seven popular public datasets are then summarized, and a concise overview is given in Table 9. The second column of the table shows the main monitoring object of each dataset, and the third column indicates whether the dataset contains multiple sensing information. The remaining columns illustrate and compare the application scenarios, including classification, RUL prediction, and transfer diagnosis. It is noteworthy that the datasets introduced in this work are just several popular ones, and other public datasets are available for use and reference, such as the MFPT fault dataset [30], the XJTU-SY bearing dataset [31] and the University of Connecticut gear fault dataset [32]. In addition, the international conferences on PHM organized by the PHM Society or the IEEE Reliability Society often provide valuable datasets for researchers and engineers.
3. Convolutional neural network and its variants
The convolutional neural network, as the leading deep learning model, has become a milestone technique and achieved state-of-the-art performance in various computer vision and pattern recognition tasks. Similarly, the convolutional neural network has also excelled in the field of machine fault diagnosis. For completeness, the basic theory of CNN is presented before its applications are reviewed, which aims to provide understanding and preparation for researchers, engineers and even beginners who intend to apply convolutional networks to fault diagnosis.
Taking a one-dimensional mechanical signal as an example, a basic convolutional neural network is displayed in Fig. 10. It contains one input layer, multiple convolution-pooling and fully-connected layers, and one output layer. Moreover, two popular operations, batch normalization and dropout, are also embedded in this structure to help improve model performance. In the following subsections, each operation is introduced in turn.

Fig. 10. An example of convolutional neural network. Conv: Convolution; BN: Batch Normalization; ReLU: Rectified Linear Unit activation function; FC: Fully-connected layer.
The convolutional kernel (filter) plays a central role in the convolutional layer and embodies two key ideas, i.e. sparse connection and shared weights. Each kernel is connected to a local patch in the feature maps of the previous layer, and its weights remain unchanged as the kernel slides over these maps. In general, one convolutional layer contains multiple kernels in order to learn comprehensive feature representations. Mathematically, let $\mathbf{x} \in \mathbb{R}^{d}$ be the $d$-dimensional mechanical data; the $i$-th convolutional feature in the $j$-th map can be described as:

$$c_i^j = (\mathbf{w}^j)^{\top} \mathbf{x}_{i:i+h-1} + b^j \qquad (1)$$

where $\mathbf{w}^j \in \mathbb{R}^{h}$ represents the $j$-th filter, which is used to code the input $\mathbf{x}$ and generate the $j$-th feature map $\mathbf{c}^j = [c_1^j, c_2^j, \ldots, c_{d-h+1}^j]$; $b^j$ denotes the bias term.

After the convolutional operation, a non-linear activation function, such as the sigmoid function, the tanh function, or the rectified linear unit (ReLU) [33], as shown in Fig. 11, is usually employed to achieve the feature transformation. At present, ReLU is widely used in CNN since it not only computes much faster than the sigmoid and tanh functions but also alleviates the vanishing gradient issue. However, a potential disadvantage of the ReLU unit is that it has zero gradient whenever the unit is not active. Units that are not active initially may therefore never become active, because gradient-based optimization will not adjust their weights. To alleviate this problem, more advanced activation functions have been proposed, such as leaky ReLU [34].

Fig. 11. Three nonlinear activation functions. (a) sigmoid, $f(c) = 1/(1+\exp(-c))$; (b) tanh, $f(c) = (e^{c} - e^{-c})/(e^{c} + e^{-c})$; (c) ReLU, $f(c) = \max(c, 0)$.
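As a minimal sketch of the convolutional feature in Eq. (1) and of the activation functions in Fig. 11, the following pure-Python example slides one filter over a toy signal. The signal, the filter values, the bias, and the leaky ReLU slope of 0.01 are illustrative assumptions, not values taken from any of the reviewed studies. As in most CNN implementations, the filter is applied without flipping.

```python
import math

def conv1d(x, w, b=0.0):
    """Feature map of Eq. (1): c_i = w . x[i:i+h] + b, for every valid position i."""
    h = len(w)
    return [sum(wk * xk for wk, xk in zip(w, x[i:i + h])) + b
            for i in range(len(x) - h + 1)]

def sigmoid(c):
    return 1.0 / (1.0 + math.exp(-c))

def relu(c):
    return max(c, 0.0)

def leaky_relu(c, slope=0.01):
    # slope is an assumed value; it keeps a small gradient for inactive units
    return c if c > 0 else slope * c

x = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]  # toy signal, d = 8
w = [1.0, 0.0, -1.0]                             # toy filter, h = 3
c = conv1d(x, w, b=0.5)                          # feature map of length d - h + 1 = 6
a = [relu(v) for v in c]                         # negative features are set to zero
```

The same weights `w` are reused at every position, which is the shared-weight idea described above, and each feature depends on only `h` neighbouring samples, which is the sparse connection.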
The pooling layer, as another important operation in CNN, aims to reduce the dimension of the features and make them more robust. The common pooling operations are max pooling and average pooling, which differ in whether the maximum or the average value in the pooling region is taken [35]. Taking max pooling as an example, the mathematical description is given as follows:

$$po_k^j = \max\{ c^j_{k:k+p-1} \} \qquad (2)$$

where $c^j_{k:k+p-1}$ represents the input and $p$ is the pool size; $po_k^j$ denotes the maximum value in the corresponding pooling region.

Batch normalization (BN) [36] has become a popular technique to alleviate internal covariate shift and promote network training. Mathematically, given the $d$-dimensional input $\mathbf{x} = \{x^{(1)}, \ldots, x^{(d)}\}$, the operation of BN is described as follows:

$$\hat{x}^{(k)} = \frac{x^{(k)} - \mathrm{E}[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}, \qquad h^{(k)} = \gamma^{(k)} \hat{x}^{(k)} + \beta^{(k)} \qquad (3)$$

where $x^{(k)}$ and $h^{(k)}$ represent the $k$-th activation input and output, respectively; $\mathrm{E}[\cdot]$ and $\mathrm{Var}[\cdot]$ denote the expectation and variance; $\gamma^{(k)}$ and $\beta^{(k)}$ stand for the parameters to be learned.

Dropout [37] is a technique that prevents overfitting and provides a way of approximately combining different networks. The key operation of dropout is to randomly drop neuron units (along with their connections) of the network during training, as shown in Fig. 12. Specifically, a unit is present with probability $p$ at training time, whereas at test time the unit is always present and its weights are multiplied by $p$. After adopting dropout
Fig. 12. The example of dropout operation.
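The three operations described above (max pooling, batch normalization, and dropout) can be sketched in a few lines of pure Python. This is an illustrative sketch under stated assumptions: pooling regions are taken as non-overlapping, a small constant `eps` is added to the variance for numerical safety, and the test-time rule of dropout is applied by scaling the unit outputs by `p`, which has the same effect as scaling the outgoing weights. All input values are toy numbers.

```python
import math
import random

def max_pool1d(c, p):
    """Max pooling: maximum over non-overlapping regions of size p (assumed stride p)."""
    return [max(c[k:k + p]) for k in range(0, len(c) - p + 1, p)]

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization of one activation over a mini-batch; eps is an assumption."""
    mean = sum(batch) / len(batch)
    var = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in batch]

def dropout(units, p, training, rng=random):
    """Keep each unit with probability p during training; scale by p at test time."""
    if training:
        return [u if rng.random() < p else 0.0 for u in units]
    return [p * u for u in units]

pooled = max_pool1d([0.5, 2.5, 0.5, 0.0, 0.5, 2.5], p=2)   # [2.5, 0.5, 2.5]
normed = batch_norm([1.0, 2.0, 3.0, 4.0])                   # zero mean, near-unit variance
dropped = dropout([1.0] * 10, p=0.5, training=True, rng=random.Random(0))
scaled = dropout([2.0, 4.0], p=0.5, training=False)         # [1.0, 2.0]
```

Many current frameworks instead rescale the kept units by 1/p during training, so that no scaling is needed at test time; the version above follows the description given in this section.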
After stacking multiple convolution-pooling modules, fully-connected layers are usually employed to process the features further. The calculation of a fully-connected layer is the same as that of the traditional perceptron and can be described as:

$$\mathbf{fc}^{l} = \sigma(\mathbf{w}^{l}\, \mathbf{fc}^{l-1} + \mathbf{b}^{l}) \qquad (4)$$

where $\mathbf{fc}^{l}$ represents the output features of the $l$-th fully-connected layer; $\mathbf{w}^{l}$ and $\mathbf{b}^{l}$ stand for the connection weight and bias, respectively; $\sigma(\cdot)$ denotes the non-linear activation function.

After feature extraction, a decision layer usually follows to produce the final results. There are two typical outputs in fault diagnosis problems: one is a classification label and the other is a single variable, as in RUL prediction. The Softmax function has become one of the most popular choices for the classification task owing to its convenience and effectiveness. Suppose the input data $\mathbf{x}$ belongs to one of $N_c$ classes; then the Softmax output that estimates the category probability of $\mathbf{x}$ can be calculated as:

$$\mathrm{Softmax}(\mathbf{x}) = \begin{bmatrix} p(y = 1 \mid \mathbf{x}) \\ p(y = 2 \mid \mathbf{x}) \\ \vdots \\ p(y = N_c \mid \mathbf{x}) \end{bmatrix} = \frac{1}{\sum_{j=1}^{N_c} e^{\theta_j^{\top} \mathbf{x}}} \begin{bmatrix} e^{\theta_1^{\top} \mathbf{x}} \\ e^{\theta_2^{\top} \mathbf{x}} \\ \vdots \\ e^{\theta_{N_c}^{\top} \mathbf{x}} \end{bmatrix} \qquad (5)$$

where $\boldsymbol{\theta} = [\theta_1, \theta_2, \ldots, \theta_{N_c}]^{\top}$ stands for the parameters. Note that every item of $\mathrm{Softmax}(\mathbf{x})$ is positive and the items sum to 1.

In addition to the original convolutional network, improved variants have also been developed and applied to the field of fault diagnosis for better performance. Therefore, several common variants are introduced briefly in this subsection, including the residual network, the densely connected convolutional network, and the generative adversarial convolutional network.

Fig. 13. The residual learning connection, where weight represents convolution; BN stands for batch normalization; and ReLU is rectified linear unit.
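Before turning to these variants, the decision stage described by Eqs. (4) and (5) can be made concrete with a small pure-Python sketch. The weights, biases, input features, and the choice of ReLU as the activation are hypothetical values chosen for illustration. The maximum score is subtracted before exponentiation, a common numerical safeguard that leaves the Softmax result unchanged.

```python
import math

def fully_connected(w, b, fc_prev, act=lambda v: max(v, 0.0)):
    """Fully-connected layer: activation of (w . fc_prev + b); ReLU is assumed here."""
    return [act(sum(wi * fi for wi, fi in zip(row, fc_prev)) + bi)
            for row, bi in zip(w, b)]

def softmax(scores):
    """Softmax over class scores theta_j . x, returning positive values that sum to 1."""
    m = max(scores)                       # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical two-unit fully-connected layer applied to a two-dimensional feature.
features = fully_connected([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.1], [2.0, 1.0])

# Hypothetical parameters theta for N_c = 3 health states.
theta = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [sum(t * f for t, f in zip(row, features)) for row in theta]
probs = softmax(scores)
predicted_class = probs.index(max(probs))   # class with the highest probability
```

For a single-variable output such as RUL prediction, the Softmax step would be dropped and the last layer would return one value instead of class probabilities.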
Blindly stacking more convolutional layers in a regular CNN leads to performance degradation or to the vanishing/exploding gradient problem. To address these problems, a novel convolutional model named residual network (ResNet) [38] was proposed and has become a typical representative of deep networks owing to its noticeable improvements. A ResNet is generally composed of multiple residual learning blocks, each of which contains convolutional layers, BN layers, and activation layers, as shown in Fig. 13. From this figure, it can be seen that the output of the residual block is calculated as $\mathbf{y} = f(\mathbf{x}, \mathbf{w}) + \mathbf{x}$, where $f$ denotes the residual mapping to be learned and $\mathbf{w}$ represents the parameters. The operation $f + \mathbf{x}$ is carried out by a shortcut connection with element-wise addition. Note that a projection is performed by the shortcut connection to match the input and output when they have different dimensions.
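The shortcut connection can be illustrated with a simplified residual block in pure Python. This sketch makes several assumptions: the input is a single-channel 1-D signal, the two convolutions use zero padding so that the length is preserved, the BN layers of Fig. 13 are omitted for brevity, and all weights are toy values.

```python
def conv1d_same(x, w):
    """Zero-padded 1-D convolution that keeps the input length (odd kernel size)."""
    pad = len(w) // 2
    xp = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(wk * xp[i + k] for k, wk in enumerate(w)) for i in range(len(x))]

def relu(v):
    return [max(e, 0.0) for e in v]

def residual_block(x, w1, w2):
    """y = ReLU(f(x, w) + x), where f is conv -> ReLU -> conv (BN omitted here)."""
    f = conv1d_same(relu(conv1d_same(x, w1)), w2)    # residual mapping f(x, w)
    return relu([fi + xi for fi, xi in zip(f, x)])   # shortcut: element-wise addition

x = [1.0, 2.0, 3.0, 4.0]
zero = [0.0, 0.0, 0.0]
delta = [0.0, 1.0, 0.0]

y_identity = residual_block(x, zero, zero)    # f = 0, so the block passes x through
y_doubled = residual_block(x, delta, delta)   # f = x, so the block returns 2x
```

The first call shows why residual learning eases optimization: when the learned mapping `f` is zero, the block reduces to an identity mapping rather than destroying the input. The sketch requires matching input and output lengths; the projection shortcut mentioned above for mismatched dimensions is not implemented.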
Fig. 14. The dense connection block.
In addition to ResNet, another popular deep structure named densely connected convolutional network (DenseNet) [39] has also attracted increasing attention owing to its excellent performance. The DenseNet is composed of multiple dense connection blocks, in which the current layer receives the feature maps from all previous layers, as shown in Fig. 14. More specifically, given the input $\mathbf{x}_0$, the feature calculation of the $l$-th layer can be described as $\mathbf{x}_l = F_l([\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_{l-1}])$, where $[\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_{l-1}]$ denotes the concatenation of the feature maps generated in layers 0, ..., $l-1$; $F_l$ represents a composite function of three consecutive operations, i.e. BN, ReLU, and convolution.
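The dense connectivity pattern can be sketched as follows. The function `toy_layer` is a hypothetical stand-in for the composite function of BN, ReLU, and convolution: it applies ReLU and then averages over all incoming feature maps, which is enough to show how every layer consumes the concatenation of all earlier outputs. Each layer adds a single new feature map, which is also an assumption of this sketch.

```python
def toy_layer(channels):
    """Stand-in for the composite layer function: ReLU, then average over all maps."""
    n = len(channels)
    return [sum(max(ch[i], 0.0) for ch in channels) / n
            for i in range(len(channels[0]))]

def dense_block(x0, num_layers):
    """Each new map is computed from the concatenation of all previous maps."""
    features = [x0]                             # concatenation starts with the input
    for _ in range(num_layers):
        features.append(toy_layer(features))    # new map sees every earlier map
    return features

maps = dense_block([1.0, -2.0, 3.0], num_layers=3)
num_maps = len(maps)                            # input plus one map per layer
```

Unlike the residual block, which adds the shortcut to the layer output, the dense block keeps earlier feature maps unchanged and lets the number of maps grow with depth.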
Fig. 15. The framework of GAN.
Generative adversarial convolutional network (GAN) [40] has become an important research hotspot with promising performance on data generation. The GAN usually contains two models, i.e. a generative model $G$ and a discriminative model $D$, which are pitted against each other to find a Nash equilibrium. As shown in Fig. 15, $G$ is trained to learn the distribution of real data and generate samples from noise, while $D$ is trained to distinguish real samples from pseudo samples by assigning a high output probability to real samples and a low probability to generated samples. In other words, the optimization of adversarial learning is a minimax game as follows:

$$\min_{G} \max_{D} V(D, G) = \mathrm{E}_{\mathbf{x} \sim p_{\mathrm{data}}(\mathbf{x})}[\log D(\mathbf{x})] + \mathrm{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}[\log(1 - D(G(\mathbf{z})))] \qquad (6)$$

where $p_{\mathrm{data}}(\mathbf{x})$ represents the true data distribution; $p_{\mathbf{z}}(\mathbf{z})$ is the distribution of the random noise $\mathbf{z}$.

In this section, the basic CNN and several of its variants were introduced to help readers better understand the working mechanism of convolutional networks. Unfortunately, there are no specific guidelines for architecture selection and design, so researchers need to optimize networks according to their own task requirements. For a fair result, comparison approaches should be built on the same network architecture. Furthermore, most advanced networks were developed for the characteristics of image data; engineers are therefore strongly encouraged to explore novel convolutional networks that fit the characteristics of mechanical data, which is promising for better fault diagnosis.
4. Applications on CNFD
In this section, applications of CNFD are systematically reviewed and summarized, covering journal and conference papers published in the recent three years. In particular, we introduce this literature according to the following three aspects: fault classification, health prediction, and transfer diagnosis.
Fault classification is the earliest and most widely studied sub-field in CNFD, inspired directly by image classification. In this section, the applications to fault classification are systematically reviewed. To make the narrative more organized, we arrange the literature according to the structural characteristics of the convolutional network and categorize it into three aspects, i.e. two-dimension (2-D) convolutional network based classification, one-dimension (1-D) convolutional network based classification, and fault classification based on convolutional network variants.

Initially, the convolutional networks used for machine fault diagnosis kept the original 2-D structure, imitating image processing. Since mechanical data is a 1-D time series in almost all cases, the main idea in this case is to convert the 1-D data into a 2-D form. Therefore, we first categorize the various signal conversion methods and then summarize the related applications.

Data matrix transformation
Fig. 16. Simple illustration of data matrix transformation.
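The arrangement illustrated in Fig. 16, in which raw 1-D samples are placed into a 2-D matrix, can be sketched in pure Python as follows. Two generic layouts are shown: a row-by-row reshape and a Hankel arrangement, which also appears among the studies reviewed in this subsection. The function names and the toy signal are hypothetical, and the sketch is not a reproduction of any particular author's preprocessing.

```python
def to_matrix(signal, rows, cols):
    """Arrange the first rows*cols samples row by row into a rows x cols matrix."""
    if len(signal) < rows * cols:
        raise ValueError("signal is too short for the requested matrix size")
    return [list(signal[r * cols:(r + 1) * cols]) for r in range(rows)]

def hankel(signal, rows):
    """Hankel matrix: entry (i, j) equals signal[i + j], so anti-diagonals are constant."""
    cols = len(signal) - rows + 1
    return [list(signal[i:i + cols]) for i in range(rows)]

sig = list(range(12))          # toy signal: sample index used as sample value
m = to_matrix(sig, 3, 4)       # rows hold consecutive, non-overlapping segments
h = hankel(sig[:6], 3)         # rows hold overlapping segments shifted by one sample
```

The reshape keeps every sample exactly once, whereas the Hankel layout repeats samples across rows and therefore produces more rows from the same amount of data.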
Data matrix transformation refers to directly arranging raw mechanical data into a 2-D format as the model input, as shown in Fig. 16. Guo et al. [41] transformed vibration data into a 2-D matrix and then proposed a hierarchical CNN with an adaptive learning rate for fault pattern recognition and fault size evaluation. Wang et al. [42] arranged the raw vibration signal into a 2-D input and introduced an adaptive CNN for bearing fault diagnosis, in which the particle swarm optimization method was added to determine the main parameters of the CNN. Shao et al. [43] proposed a bearing fault diagnosis method based on the deep convolutional belief network and the compressed sensing technique. Similarly, they [44] also utilized the auto-encoder to compress data for fault diagnosis of electric locomotive bearings. Wang et al. [45] constructed mechanical data into a Hankel matrix and proposed a CNN based hidden Markov model for bearing fault classification. Gong et al. [46] first integrated the temporal and spatial multichannel raw signals to construct the model input; a 2-D convolutional network was then designed for feature learning, and SVM was used for fault classification. Jing et al. [47] presented an adaptive multi-sensor data fusion based CNN method for planetary gearbox fault diagnosis, which aimed to optimize a combination of different fusion levels to satisfy the requirements of different diagnosis tasks. Chen et al. [48] fused horizontal and vertical vibration data into a 2-D matrix and presented a deep CNN for health state identification of planetary gearboxes. Han et al. [49] presented a diagnostic framework for complex systems by combining the spatiotemporal pattern network with a convolutional network, in which the former was used for spatiotemporal feature learning and the latter for condition classification. Yang et al. [50] used hierarchical symbolic analysis to process the original signal and then built a three-layer convolutional network for fault diagnosis of rotating machinery. Liu et al. [51] presented a dislocated time series CNN for fault classification, in which a dislocated layer was introduced to construct the 2-D input data. Yang et al. [52] transformed multi-source vibration signals to construct a 2-D matrix and then proposed a CNN based method for fault diagnosis of reciprocating compressors.

Image transformation
Fig. 17. Simple illustration from mechanical data to image.
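The conversion illustrated in Fig. 17 can be sketched as a min-max normalization of a signal segment to 8-bit gray levels followed by a reshape into a square image. This is only one simple way to build such an image; the function name, the image size, and the sinusoidal test signal are assumptions made for illustration.

```python
import math

def to_gray_image(signal, side):
    """Normalize side*side samples to gray levels 0..255 and reshape to side x side."""
    seg = list(signal[:side * side])
    if len(seg) < side * side:
        raise ValueError("signal is too short for the requested image size")
    lo, hi = min(seg), max(seg)
    span = (hi - lo) or 1.0            # avoid division by zero for a flat segment
    pixels = [int(round(255 * (v - lo) / span)) for v in seg]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]

# Toy signal: 5 cycles of a sine wave over 64 samples (a hypothetical example).
sig = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
img = to_gray_image(sig, side=8)       # 8 x 8 grayscale image
flat = [p for row in img for p in row]
```

Because the normalization is computed per segment, the absolute amplitude of the signal is discarded; whether that is acceptable depends on the diagnosis task.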
Image transformation means converting 1-D mechanical signals into images in pixel format, as shown in Fig. 17. Xia et al. [53] presented a fault diagnosis method for rotating machinery based on multi-sensor fusion and CNN. In this method, raw vibration signals from sensors at different locations were aligned into 2-D images as the input. Hoang et al. [54] converted raw vibration signals into vibration images and then constructed a simple two-layer convolutional model for rolling bearing fault classification. Zhang et al. [55] proposed an equitable sliding stride segmentation approach to expand the data volume while converting the data into images. A hybrid model based on the convolutional network and bi-gate recurrent unit was then constructed for feature learning and classification. Hoang et al. [56] converted motor current signals into gray images and presented a decision level fusion based CNN for bearing fault identification. Wang et al. [57] used multi-sensor data fusion to construct image data and then designed a four-layer convolutional network for fault classification. Hu et al. [58] used the compressed sensing technology to reduce the data size while retaining as much information as possible and transformed the data into image pixels. After that, an improved multi-scale convolutional network was constructed for fault recognition of machinery. Wang et al. [59] converted multi-sensor vibration signals into RGB color images to refine features and enlarge the differences between different types of fault signals. Then, an improved LeNet-5 was designed for fault diagnosis. In [60], raw mechanical signals were transformed into a square matrix through non-overlapping cutting and normalization. Then a modified LeNet-5 was designed for feature learning and fault classification.

Time or frequency domain transformation

Another type of conversion is to use time or frequency domain statistics as the input information of the convolutional network. Chen et al. 
[61] calculated statistical measures of vibration signals from the time and frequency domains as the model input and then applied one-layer convolutional network for fault identification of bearings and gears. Janssens et al. [62] utilized the discrete Fourier transform to process the accelerometer signals and presented a simple convolutional network for bearing condition recognition. Bhadane et al. [63] used the statistical features extracted from vibration data as the model input and developed a 2-D CNN for bearing fault classification. Lu et al. [64] proposed a convolutional network based health state classification method for rolling bearing. In this method, the time and frequency domain features of vibration data were extracted to build the input matrix. Li et al. [65] used the root mean square maps from the spectrum of two vibration data as the input and presented a CNN with an improved Dempster-Shafer evidence theory for bearing fault diagnosis. Tra et al. [66] utilized the spectral energy maps of the acoustic emission signals as the model input. Then a CNN with the stochastic diagonal Levenberg-Marquardt algorithm was proposed for incipient bearing fault diagnosis under variable operating speeds. Prosvirin et al. [67] transformed 1-D acoustic emission signals into 2-D kurtogram images and then utilized the CNN for feature extraction and bearing fault classification. Tian et al. [68] integrated features from time and frequency domains as the input and developed a deep CNN with an immunity algorithm for rolling bearing fault diagnosis. Tra et al. [69] used the energy distribution maps of acoustic emission spectra to train a convolutional network for fault diagnosis under variable speed conditions. Li et al. [70] constructed feature images from multi central frequencies as well as vibration frequency spectrum and then used CNN to process these images for gear fault identification. Yao et al. 
[71] integrated features of time and frequency domains from multi-channel acoustic signals as input data and established a convolutional network for gear fault diagnosis. Kien et al. [72] visualized the spectra of vibration signals as grayscale images and then proposed a deep convolutional network to process these images for crack detection of gears.

Wavelet transform

The wavelet transform (WT) preprocessing method converts mechanical time series into a 2-D time-frequency representation as the network input. Ding et al. [73] used the wavelet packet energy image as the input of a deep convolutional network and presented an energy-fluctuated multiscale feature mining approach for spindle bearing fault diagnosis. Gao et al. [74] used the complex Morlet wavelet to acquire 2-D time-frequency maps from vibration signals. Then a CNN was designed for rolling bearing fault diagnosis. Guo et al. [75] employed the continuous WT to decompose vibration signals into scalograms according to the rotating speed. Then a Pythagorean spatial pyramid pooling based convolutional network was presented for bearing fault diagnosis. Xu et al. [76] utilized the WT to convert vibration signals into 2-D grayscale images. Then LeNet-5 was built to learn multi-level features and random forest classifiers were used for bearing fault classification. Islam et al. [77] employed the discrete wavelet packet transform to process acoustic emission (AE) signals and proposed a convolutional diagnostic model for bearing fault classification. Sun et al. [78] used the dual-tree complex WT to acquire multiscale features to train the CNN for gear fault recognition. Cabrera et al. [79] proposed a convolutional diagnostic network pre-trained by the stacked convolutional auto-encoder for fault severity assessment of helical gearboxes, in which the time-frequency features acquired by the WT were used as the model input. Han et al. 
[80] proposed a dynamic ensemble convolutional neural network for gear fault diagnosis. In their method, wavelet packet transform was employed to construct multi-level wavelet coefficients matrices for representing the nonstationary vibration signals. Then multiple paralleled CNNs with shared parameters and a dynamic ensemble layer were designed for feature extraction and fault classification. Grezmak et al. [81] presented an explainable deep CNN with layer-wise relevance propagation for gearbox fault diagnosis, in which the time-frequency images from continuous WT were used as the model input. Liang et al. [82] adopted the WT to extract time-frequency information from vibration signals to train a CNN for compound fault diagnosis of gearbox. Guo et al. [83] used the continuous WT to convert the original mechanical signals as the input of the convolutional network for rotor fault diagnosis. Shao et al. [84] converted the vibration and current signals into the time-frequency representation by the WT. Then a deep CNN was designed to predict induction motor conditions. Hsueh et al. [85] used the empirical WT to process the current signals into grayscale images and then trained a convolutional network for induction motor fault classification. Chen et al. [86] employed the continuous WT to process raw vibration signals and then designed a convolutional model with a square pooling architecture for feature extraction. Finally, extreme learning machine was used for fault classification. Cao et al. [87] firstly adopted the dual tree wavelet to decompose the machine spindle vibration signals. Then the reconstructed sub-signal sequences from different scales and their Hilbert envelope demodulation spectra were stacked to train convolutional neural network for tool wear state identification. 
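Several of the studies above feed wavelet or wavelet packet features to a CNN. As a hedged illustration of what such features look like, the sketch below computes sub-band energies of a wavelet packet tree using the Haar wavelet. The Haar wavelet, the function names, and the test signals are assumptions chosen for brevity; the reviewed studies use other wavelets and their own feature layouts.

```python
import math

def haar_split(node):
    """One Haar step: low-pass (approximation) and high-pass (detail) halves."""
    s = math.sqrt(2.0)
    approx = [(node[i] + node[i + 1]) / s for i in range(0, len(node) - 1, 2)]
    detail = [(node[i] - node[i + 1]) / s for i in range(0, len(node) - 1, 2)]
    return approx, detail

def wavelet_packet_energy(signal, levels):
    """Energy of each of the 2**levels terminal nodes of a Haar wavelet packet tree.

    The signal length is assumed to be divisible by 2**levels. Nodes are returned
    in the order produced by the splits, not sorted by frequency.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_split(node)]
    return [sum(v * v for v in node) for node in nodes]

# Toy signal: a slow sine plus a fast alternating component (hypothetical values).
sig = [math.sin(2 * math.pi * n / 8) + 0.5 * (-1) ** n for n in range(16)]
energies = wavelet_packet_energy(sig, levels=2)     # four sub-band energies
flat = wavelet_packet_energy([1.0] * 8, levels=3)   # constant signal
```

Because each Haar step is orthonormal, the sub-band energies sum to the energy of the original signal, and a constant signal places all of its energy in the first (lowest-frequency) node.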
Short Time Fourier Transform

Analogously to the wavelet transform, the Short Time Fourier Transform (STFT), as another common time-frequency analysis approach, has also been used for data preprocessing in CNFD. Verstraete et al. [88] utilized the STFT to generate image representations of raw vibration signals and then constructed a CNN for bearing classification. Pandhare et al. [89] employed the time-frequency features obtained by STFT to train a convolutional network model for bearing fault diagnosis. Xin et al. [90] used STFT to calculate time-frequency (TF) features and then employed the sparse auto-encoder and convolutional network for feature extraction. Finally, a Softmax classifier was used to obtain the classification results. Yu et al. [91] transformed the phonetic signals into spectrograms using STFT and then used VGG 16 for wind turbine fault diagnosis. In [92], Wen et al. presented a snapshot ensemble convolutional network for fault diagnosis of pumps and bearings; this method can find the proper range of the learning rate when facing a new dataset. In [93], Wang et al. proposed a convolutional neural network based motor fault diagnosis method, in which STFT was used to preprocess raw signals to acquire time-frequency maps.

Other preprocessing technologies

Li et al. [94] used the S-transform to process original data into a time-frequency coefficient matrix. Then a CNN was designed for feature learning and fault classification. Wen et al. [95] proposed a convolutional network based two-level hierarchical diagnosis network, in which the S-transform was used to preprocess data. Jeong et al. [96] employed shaft orbit shape images as monitoring information to train a convolutional neural network for fault diagnosis. Waziralilah et al. [97] used the Gabor transform to process the raw vibration signal and then presented a CNN for bearing fault diagnosis. Zhao et al. [98] used the Hilbert transform and synchrosqueezing transform to calculate the TF representations of the vibration signals. 
Then these features were used to train a convolutional model for bearing fault classification. Janssens et al. [99] employed VGG to process infrared thermal video for condition detection of the machine. Jia et al. [100] presented a convolutional network based fault detection model by processing infrared thermography images. Li et al. [101] proposed a rotating machinery condition monitoring method, in which the CNN was designed to process infrared thermal images for feature extraction and fault classification. Chen et al. [102] developed a CNN based degradation state identification approach for planetary gears, in which the singular spectrum of raw data was used to train the model. Wang et al. [103] employed singular value decomposition based on phase space reconstruction to analyze the bearing vibration signal and then utilized a CNN to process the obtained features for bearing fault diagnosis. Zhu et al. [104] proposed a symmetrized dot pattern to transform vibration signals into 2-D images and then trained a convolutional network for fault diagnosis. Li et al. [105] used the K-singular value decomposition to enhance the resolution of time-frequency features obtained by the Wigner-Ville Distribution and then built a CNN for planetary gearbox fault classification. Udmale et al. [106] utilized the kurtogram of raw signals to train a CNN for bearing fault diagnosis. In [107], Senanayaka proposed a gearbox fault diagnosis method based on multiple classifiers and data fusion. Specifically, the vibration spectrum was used as the input of a multilayer perceptron, while the features from STFT and CWT were used as the input of a CNN. Finally, the naïve Bayes combiner was employed to integrate the results of the two classifiers.

In this subsection, the applications of 2-D CNN for fault classification were reviewed systematically. 
A clear and intuitive summary is displayed in Table 10, which aims to help readers quickly locate these studies by signal transform approach or analysis object.

Table 10. Summary of the applications of 2-D CNN for fault classification.
Transform method | References | Object
Data matrix | [41] [42] [43] [44] [45] [46] | Bearing
Data matrix | [47] [48] / [49] / [50] | Gear / bearing, wind turbine / bearing, pump
Data matrix | [51] / [52] | Motor / compressor
Images | [53] [54] [55] [56] | Bearing
Images | [57] [58] [59] / [60] | Bearing, gear / bearing, pump
Time or frequency domain transform | [62] [63] [64] [65] [66] [67] [68] [69] / [61] / [70] [71] [72] | Bearing / bearing, gear / gear
Wavelet transform | [73] [74] [75] [76] [77] | Bearing
Wavelet transform | [78] [79] [80] [81] / [82] | Gear / gear, rotor, bearing
Wavelet transform | [83] [84] / [85] / [86] / [87] | Rotor / motor / bearing, gear / tool
Short time Fourier transform | [88] [89] / [90] [91] | Bearing / bearing, gear
Short time Fourier transform | [92] / [93] | Bearing, pump / motor
Other | [94] [95] [97] [98] [99] [100] [103] [106] / [101] [102] [105] | Bearing / gear
Other | [96] / [104] / [107] | Rotor / bearing, rotor / bearing, rotor, gear
In addition to 2-D convolutional network based fault classification, a more direct strategy is to construct a 1-D convolutional diagnostic model that processes the original time-series data. In this subsection, the applications of 1-D convolutional networks for fault classification are reviewed according to raw sensor data types, such as vibration and AE data.

Vibration data

Vibration data has been the most common source of information in machine fault diagnosis, including convolutional network based intelligent diagnosis, due to its legibility and intuitiveness. Eren [108] used the raw vibration signal as the input to train a 1-D CNN for IMS bearing fault detection. Pan et al. [109] combined the CNN with the long short term memory (LSTM) network and proposed an improved bearing fault diagnosis method. Inspired by the second generation wavelet transform, Pan et al. [110] improved the convolutional network and proposed a LiftingNet to process raw mechanical data for fault classification. Qian et al. [111] constructed an adaptive overlapping CNN for bearing fault diagnosis, in which the raw vibration signals were used to train the model. Jia et al. [112] proposed a normalized CNN for imbalanced fault classification of machinery. In this method, a neuron activation maximization algorithm was presented to help understand the feature learning process of the network. Eren et al. [113] developed a compact adaptive 1-D CNN for real-time bearing fault diagnosis. Ma et al. [114] integrated the residual convolutional network, deep belief network, and deep auto-encoder and proposed an ensemble deep learning method for fault diagnosis of rotor bearing systems. Wang et al. [115] proposed a multi-scale learning network with 1-D and 2-D convolution channels to learn the local correlation of adjacent and nonadjacent intervals in vibration signals for bearing fault diagnosis. Huang et al. 
[116] added a multi-scale cascade layer at the front of conventional convolutional network and proposed a CNN approach with multi-scale information for bearing fault diagnosis. Qiao et al. [117] used raw vibration signals as the input and proposed an adaptive weighted multiscale CNN for bearing fault diagnosis under variable operating conditions. Abdeljaber et al. [118] utilized the compact CNN to present an online condition monitoring method for fault detection and severity identification of bearings. Huang et al. [119] utilized raw vibration signals as the model input and proposed a deep decoupling CNN for intelligent compound fault diagnosis. Liu et al. [120] combined the denoising convolutional autoencoder with CNN to develop an anti-noise fault diagnosis method. Han et al. [121] proposed an enhanced convolutional network with enlarged receptive fields for planetary gearbox fault diagnosis. Considering the inherent multiscale characteristics of vibration signals, Jiang et al. [122] developed a multi-scale CNN for wind turbine gearbox fault diagnosis. In [123], Sun et al. firstly used back-propagation based neural network to learn the local filters. Then these local filters were used to build the feed-forward convolutional neural network for feature learning. Finally, the learned features were fed into SVM classifier for induction motor fault classification. Yuan et al. [124] applied multi-sourced heterogeneous monitoring data as the input and presented a multi-mode CNN based method for rotor system fault diagnosis. Afrasiabi et al. [125] proposed an accelerated CNN for bearing fault diagnosis of induction motors, in which the pruning connection and weight sharing technique were used to compress model without loss of accuracy. Chen et al. [126] utilized 1-D CNN to learn features from raw vibration signals and then fed these features into a bidirectional LSTM network for wear state identification of tool. 
In addition to directly using the raw vibration data, some scholars adopted features extracted from vibration data to train 1-D convolutional neural networks for fault diagnosis. Xie et al. [127] used the convolutional network to learn features from the frequency spectrum of vibration signals and then integrated these features with the energy entropy of empirical mode decomposition and time domain features for final fault classification. Sadoughi et al. [128] used spectral kurtosis and envelope spectrum analysis to process raw mechanical data and proposed a physics-based CNN for fault diagnosis of rotating machinery. To maintain the diagnosis performance in noisy environments and under different working loads, Zhang et al. [129] developed a CNN with training interference based diagnosis method. In [130], Dong et al. used a 1-D CNN and a 2-D CNN to extract features from the frequency spectrum and the STFT spectrum of vibration signals, respectively, for rolling bearing degradation. Jing et al. [131] developed a CNN based method to extract features from the frequency spectrum of vibration signals for gearbox fault diagnosis. In [132], Ma et al. used the coefficients of wavelet packet decomposition as the model input and proposed a lightweight CNN for bearing fault diagnosis.

Other data

Besides vibration data, other mechanical data are also used in 1-D convolutional network based classification, including current, AE, and built-in encoder data. For instance, Ince et al. [133] directly used raw current signals to train a 1-D CNN for real-time motor condition monitoring; the results showed the effectiveness and superiority of their approach. Besides, they proposed a real-time broken rotor bar fault detection model based on a shallow 1-D convolutional neural network [134]. Khan et al. [135] used the motor current as the input and developed an analytical model for inter-turn fault diagnosis by combining the 1-D CNN and LSTM network. Kao et al. 
[136] applied a 1-D CNN to the fault diagnosis of magnet synchronous motors over a wide speed range by using current data as the model input. In [137], the motor vibration signal and the stator current signal were first segmented by analysis windows of varying lengths for the joint representation. Then, the CNN and LSTM network were designed to automatically learn discriminative features and achieve motor fault diagnosis. In addition to current data, AE data analysis is also a common monitoring approach for fault diagnosis. Li et al. [138] used a convolutional network and a gate recurrent unit to extract features from AE and vibration data, respectively. Then the learned features were concatenated for gear pitting fault diagnosis. In [139], Appana et al. utilized the CNN to process the envelope spectra of AE signals to achieve bearing fault diagnosis under varying rotating speeds. In light of the drawbacks of external sensors, Jiao et al. [14] proposed a built-in encoder information based CNN for intelligent fault diagnosis. In this method, multivariate encoder information was constructed by information fusion to capture comprehensive mechanical health states, and then the convolutional network was designed for adaptive feature learning and condition classification.

In this subsection, a comprehensive review of the applications of 1-D convolutional networks was presented according to the types of sensor data. Based on the above literature, a concise summary is displayed in Table 11.
Table 11. Summary of the applications of 1-D CNN for fault classification.
Signal type | References | Object
Vibration data | [108] [109] [110] [111] [112] [113] [114] [115] [116] [117] [118] | Bearing
Vibration data | [119] [120] [121] | Bearing, gear
Vibration data | [122] / [123] [124] [125] / [126] | Wind turbine gearbox / motor / tool
Vibration data | [127] [128] [129] [130] [132] / [131] | Bearing / gear
Other data | [133] [134] [135] [137] / [136] [139] / [138] [14] | Motor / bearing / gear
As introduced in Section 3.2, many variants of CNN have also been studied and applied in the field of fault diagnosis. Thus, in this subsection, we review these publications according to the different network variants.
Applications of ResNet to fault classification
Zhao et al. [140] employed a series of wavelet packet coefficients as the model input and proposed a deep residual network with dynamically weighted wavelet coefficients for planetary gearbox fault diagnosis under serious noise environment. The comparison results showed higher accuracies than other deep learning approaches. Furthermore, they [141] combined the WT with ResNet and proposed a multiple wavelet coefficients fusion based deep residual network for planetary gearbox fault diagnosis, which aimed to learn more easily distinguished features from the input data. Li et al. [142] proposed a deep residual learning network for fault diagnosis, in which data augmentation techniques were presented to artificially create additional valid samples for model training. The results showed that their method can achieve high diagnosis accuracy with a small original training dataset. Zhang et al. [143] used raw vibration signals as the model input to train a deep ResNet for bearing fault diagnosis, and the results showed its superiority over the traditional CNN model. Peng et al. [144] presented a deeper 1-D CNN with residual learning for fault diagnosis of wheelset bearings in high-speed trains. Ma et al. [145] proposed a deep residual convolutional network based on a separable convolution and concatenated ReLU lightweight convolution for bearing fault diagnosis, in which the coefficients of wavelet packet transform were selected as the network input. Zhuang et al. [146] proposed a stacked residual dilated CNN for bearing fault diagnosis by combining the dilated convolution, the input gate structure of LSTM and the residual network. Su et al.
[147] utilized raw time sequences as the input and presented a residual-squeeze net for fault diagnosis of high-speed train bogies. Ma et al. [148] proposed a fault diagnosis method for planetary gearboxes under nonstationary running conditions using ResNet with demodulated time-frequency features. Considering the non-stationary conditions of machines, Liu et al. [149] proposed a multi-scale kernel based residual convolutional network for motor fault diagnosis. The results showed its superiority over state-of-the-art methods. From the above review, it can be seen that the ResNet diagnostic model is promising where more comprehensive feature extraction and higher diagnosis accuracy are required in modern industry, especially for complicated mechanical equipment or industrial environments.
Applications of GAN to fault classification
Owing to their excellent data generation characteristics, GANs have been gradually applied to the field of fault diagnosis, especially for diagnostic scenarios with imbalanced datasets. Cao et al. [150] first transformed the time-domain signals into image data. Then a GAN was designed for rolling bearing fault classification.
The results illustrated the potential of GAN for fault diagnosis with small samples. Xie et al. [151] developed a GAN to generate the samples of minority classes for bearing fault diagnosis, which aimed to address the issue of data imbalance. Shao et al. [152] proposed an auxiliary classifier GAN based diagnostic framework to generate synthesized data and achieve induction motor fault diagnosis. Afrasiabi et al. [153] combined GAN with temporal CNN and proposed a wind turbine fault diagnosis method, in which the former was used as the feature extractor and the latter was used as the fault classifier. Li et al. [154] proposed an enhanced GAN for fault diagnosis of rotating machinery with imbalanced data. In their method, a 2-D convolutional network was used to build the generator and discriminator, which aimed to produce samples for the small classes to balance the dataset. Suh et al. [155] employed the nested scatter plot method to transform raw vibration signals into 2-D images, and then a 2-D CNN was designed for bearing fault classification. In addition, a GAN was embedded in this framework to generate fault images for the data imbalance issue. To address the lack of labeled fault data, Guo et al. [156] proposed a multi-label 1-D GAN for fault diagnosis, in which the auxiliary classifier GAN was used to generate realistic damage data, and then the generated data and real data were both used to train the fault classifier. The experimental results showed that the proposed method can improve the diagnosis accuracy from 95% to 98% when the model was trained with the generated data.
Applications of other variants to fault classification
Jiao et al. [9] employed the built-in encoder and external vibration signals as parallel inputs and presented a deep coupled dense convolutional network for intelligent fault diagnosis. The results verified its superiority over the traditional convolutional network. Li et al.
[157] presented an improved inception network to process raw vibration signals for gear pitting fault diagnosis. In [158], Chen et al. proposed a deep inception net with atrous convolution to bridge the gap between artificial and real damage for bearing fault diagnosis. Zhu et al. [159] employed the STFT to convert signals into 2-D graphs as the input and proposed a capsule network with an inception block and a regression branch for bearing fault diagnosis. Chen et al. [160] proposed a deep capsule network with stochastic delta rule for rolling bearing fault diagnosis, in which raw vibration signals were used as the model input.
This subsection reviews the classification applications of convolutional neural networks in machine fault diagnosis, and the results reveal the powerful feature learning and fault identification capabilities of CNN. Although these methods have achieved certain success, some practical problems still cannot be ignored. For instance, the success of convolutional neural networks relies on large-scale datasets with a tremendous amount of labeled samples. However, in many practical situations, a large number of labeled samples are inaccessible, especially for fault data. Besides, most of the above approaches assume that the training data and test data follow the same distribution; however, this assumption does not hold in real industry. Consequently, it is necessary to solve these realistic problems and advance convolutional diagnostic approaches for promising deployment in modern intelligent industry.
Different from fault classification, the purpose of health prediction is to track the degraded state of machinery, even if no apparent failure has occurred. This branch is vitally important in the field of machine fault diagnosis, since it allows maintenance personnel to make early judgments and decisions to avoid losses and injuries. Therefore, the applications to health prediction are reviewed and summarized according to the application object in this section.
Rolling bearings are widely used and play an important role in modern machinery. The deterioration or failure of bearings will lead to machine breakdown and even disaster. Therefore, numerous studies have been conducted to assess and predict the health condition of bearings. Yoo et al. [161] used the continuous WT to obtain time-frequency images for health indicator (HI) construction. Then a CNN was designed to process these images for bearing RUL prediction. Belmiloud et al. [162] used wavelet packet decomposition to extract features as the model input and then presented a deep CNN based method for adaptive HI construction. Hinchi et al. [163] proposed a bearing RUL estimation method, in which the convolutional layer and LSTM layer were integrated to learn features from raw sensor data. Guo et al. [164] proposed a method for HI construction, in which the trend burr was considered, and the results showed that their method is more effective than other methods in terms of trendability, monotonicity and scale similarity. Ren et al. [165] presented a spectrum-principal-energy-vector algorithm to obtain the eigenvector as the network input to train the CNN for bearing RUL prediction. She et al. [166] proposed a multi-channel CNN with an exponentially decaying learning rate to construct a wear indicator and evaluate the health of rolling bearings, in which the original multi-channel signals were used as the input. Mao et al. [167] employed the CNN to learn features from the marginal spectrum of the Hilbert-Huang transform. After that, an LSTM network was constructed for RUL prediction of bearings. Li et al. [168] used the STFT to process raw vibration signals to obtain the time-frequency domain information. Then a deep CNN was built to extract multi-scale features for RUL estimation. Zhu et al. [169] used the WT to acquire the time-frequency representation and then trained a multi-scale CNN to learn global and local features for RUL estimation.
The results showed enhanced prediction accuracy compared to traditional data-driven and CNN based methods. Wang et al. [170] converted 1-D signals into 2-D images to train a CNN for RUL prediction, in which the maximum correlation entropy with regular terms was employed as the loss function for better performance compared to the mean square error. Zhang et al. [171] proposed a deep multilayer perceptron convolutional network for HI construction, in which the outlier region correction method was introduced to detect and remove outliers and enhance the interpretability of the HI. Yang et al. [172] utilized raw mechanical signals to train a double-CNN model for RUL prediction, in which the first CNN was used to identify the incipient fault point and the second CNN was applied for RUL prediction. Considering that uncertainty is critical for health prognostics, Peng et al. [173] introduced a Bayesian multi-scale convolutional network based prognostic method, which showed more accurate performance than point estimates. Yao et al. [174] combined empirical mode decomposition with ensemble CNNs for bearing RUL estimation, in which the former can reveal the nonstationary property of degradation data and help the CNN to obtain a more accurate prediction. Wang et al. [175] integrated the CNN and LSTM network to process time-series data and calculate an unsupervised H-statistic for bearing performance degradation assessment. Liu et al. [176] presented a joint-loss CNN for bearing fault recognition and RUL prediction in parallel, which can capture common features between different related tasks and improve the generalization capability. Wang et al. [177] proposed a deep separable convolutional network for RUL prediction of machinery, in which the data from different sensors were used to train a separable convolutional building block with a residual connection for feature learning.
They further proposed a recurrent convolutional network for RUL prediction [178], in which recurrent convolutional layers were designed to model the temporal dependencies and variational inference was utilized to quantify the uncertainty of prediction results.
Benefitting from the public C-MAPSS dataset, many researchers validated their proposed prognosis approaches on the turbofan engine. Babu et al. [179] constructed a 2-D data matrix from multi-variate time series to train a CNN with two convolutional layers and two fully connected layers for RUL estimation. Li et al. [180] adopted the time window approach to process the multi-variate temporal data and then developed a deep CNN for feature extraction and RUL estimation. Wen et al. [181] presented a deep residual CNN for RUL estimation, in which the k-fold ensemble method was adopted to enhance the prediction performance. Li et al. [182] presented a directed acyclic graph network for RUL prediction by combining the CNN and LSTM network. The comparative results showed that the proposed method had better prediction accuracy. Al-Dulaimi et al. [183] proposed a hybrid deep network framework for RUL estimation, in which the LSTM network and CNN were arranged in parallel for feature learning and then a multilayer fully connected network was designed for feature fusion and decision making. Ruiz-Tagle Palazuelos et al. [184] introduced a capsule neural network for degradation estimation of the turbofan engine, and the results showed its superiority over traditional CNN based methods. Kong et al. [185] adopted polynomial regression to construct the HI from raw data and then designed a hybrid deep model based on the CNN and LSTM network for RUL prediction.
In addition to the bearing and turbofan engine, some scholars also developed convolutional network based prediction approaches for other applications, such as CNC machines. Zhao et al.
[186] proposed a convolutional bi-directional long short term memory network for CNC machining tool health monitoring, which combined the advantages of the CNN and LSTM network to obtain accurate predictions. Qiao et al. [187] proposed a hybrid deep learning framework for gearbox fault diagnosis and tool wear prediction, in which multiple convolutional and LSTM layers were first designed to extract local spatiotemporal features and then a holistic convolution-LSTM layer was designed to extract holistic spatiotemporal features. Aghazadeh et al. [188] employed the CNN to establish a deep learning algorithm for tool wear estimation, in which wavelet transform and spectral subtraction algorithms were designed to intensify the effect of tool wear and reduce the effect of cutting parameters. Huang et al. [189] utilized features from the time domain, frequency domain and time-frequency domain of multi-sensor signals as health information and proposed a deep CNN based method for tool wear prediction. In [190], Fu et al. combined the CNN and LSTM network to establish the logical relationship of observed variables for condition monitoring of wind turbine gearbox bearings. Kong et al. [191] presented a health monitoring method for wind turbines based on SCADA data. In their approach, the CNN and gated recurrent units were integrated to learn spatial and temporal features, and then the exponential weighted moving average control chart was designed for condition recognition. Luo et al. [192] employed the dual-tree complex wavelet to obtain multiscale characteristics as the input. Then an enhanced convolutional LSTM network was designed for damage monitoring of the automotive suspension component. Li et al. [193] developed a scalable degradation assessment approach for bandsaw machines by proposing a dual-phase modeling method. In this approach, a physics informed model is first established to generate the HI to monitor the wear condition using the vibration and acoustic signals.
Then a deep CNN based surrogate model is designed to replace the physics informed model by using alternative low-cost sensor data.
In this section, applications of convolutional networks to health prediction are systematically reviewed according to the application object. The summary reveals that many researchers have successfully developed convolutional network based prediction approaches to address the weak generality, flexibility and intelligence of previous physical and mathematical models. However, it should be noted that lifetime data is difficult to obtain in practical industry. In other words, there is usually insufficient data to train a complete life prediction model; hence, more attention should be paid to how to build a model with experimental or simulation data and then make the model generalizable to practical industrial applications.
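The time window preprocessing mentioned above for multi-variate run-to-failure records (e.g. in [180]) can be illustrated with a minimal sketch. The function name `sliding_windows`, the toy record and the labelling rule (RUL counted as the number of cycles remaining after the last cycle of each window) are assumptions made here for illustration and are not taken from the reviewed papers.

```python
def sliding_windows(series, window, stride=1):
    """Cut one run-to-failure record into fixed-length windows with RUL labels.

    series: list of per-cycle sensor vectors, ordered in time and ending at failure.
    Returns a list of (window_data, rul) pairs, where rul is the number of cycles
    left after the last cycle of the window (assumed labelling rule).
    """
    samples = []
    last_index = len(series) - 1
    for start in range(0, len(series) - window + 1, stride):
        end = start + window  # exclusive end index of the window
        samples.append((series[start:end], last_index - (end - 1)))
    return samples


# Toy record: 6 cycles, 2 hypothetical sensor channels per cycle.
record = [[float(t), 0.1 * t] for t in range(6)]
pairs = sliding_windows(record, window=3)
for window_data, rul in pairs:
    print(len(window_data), rul)
```

Each pair can then serve as one training sample (a 2-D window as the network input, the RUL as the regression target); the window length and stride would need to be tuned to the dataset at hand.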
Although CNN based fault classification and health prediction of machinery have acquired certain achievements, most of the above approaches rely on the assumption that training data and test data have the same data distribution. In practical industrial scenarios, data distribution differences are inevitable due to natural wear of equipment, changes in operating conditions, interference from the environment and humans, and so on. Consequently, the performance of most of the above models will seriously degrade when the data distributions of the training set (source domain) and the test set (target domain) are different. An immediate solution is to retrain or build a new model; however, a large amount of labeled data is necessary in this case. In many task scenarios, sufficient labeled instances are either difficult to collect or prohibitively expensive to label. Therefore, it is quite necessary to explore how to apply the models previously established on a related domain to new diagnostic scenarios. In light of these issues, transfer learning or domain adaptation technologies have been introduced to machine fault diagnosis, especially in combination with deep convolutional networks. In this section, a summary of the applications of convolutional networks to transfer diagnosis is introduced in detail. Before starting the literature review, three common strategies are first introduced, including parameters transfer, moment matching strategies, and adversarial domain adaptation.
Fig. 18. Illustration of parameters transfer.
Parameters transfer, also called pre-trained model based transfer, means that part of the parameters of the network trained in the source domain are fixed and transferred, while the remaining parameters are fine-tuned using labeled data in the target domain. An intuitive illustration is shown in Fig. 18; the idea comes from the fact that the features of the earlier layers are general and transferable while the features of the last few layers are task-specific. Therefore, the model can be adapted to a new target domain by using a few labeled target data to fine-tune the parameters of the task-specific layers. Some researchers have applied this technology to achieve transfer diagnosis of machinery in recent years. Cao et al. [194] proposed a deep CNN based transfer learning approach for gearbox fault diagnosis. The first part of their method was constructed from a part of a pre-trained network and the second part consisted of fully connected layers retrained with gear data. Hasan et al. [195] used the frequency spectrum as the input and presented a parameters transfer based 1-D convolutional network for bearing fault diagnosis under variable working conditions. Hemmer et al. [196] employed a CNN pre-trained on the ImageNet dataset to learn features from WT images of vibration and AE signals. Then a sparse autoencoder-based SVM was designed to process these features for bearing fault classification. Zhong et al. [197] proposed a transfer learning framework for gas turbine fault diagnosis, in which the CNN trained on a large-scale annotated normal dataset was transferred to the fault diagnosis task with limited fault data for feature learning, and then an SVM was designed as the new classifier for fault classification. Wen et al. [198] converted the raw time-domain signals to RGB images to fine-tune a pre-trained ResNet-50 for fault diagnosis.
Furthermore, they utilized negative correlation learning to retrain several fully connected layers and the Softmax classifier of the pre-trained ResNet-50 for fault classification [199]. Han et al. [200] presented a transfer learning framework for fault diagnosis of unseen machine conditions, in which the CNN trained on large datasets was transferred to new tasks with proper fine-tuning. In addition, they designed three transfer learning strategies to investigate the feature transferability at different network levels. Shao et al. [201] employed the WT to convert raw signals into images and developed a deep transfer framework for machine fault diagnosis, in which the labeled time-frequency images were used to fine-tune the higher layers of a convolutional network pre-trained on the ImageNet dataset. Ma et al. [202] introduced the frequency slice wavelet transform to convert the raw vibration signals into 2-D time-frequency images and then fine-tuned a pre-trained AlexNet model for bearing fault diagnosis. Hasan et al. [203] used the acoustic spectral images of AE signals to reflect the mechanical health state and then proposed a pre-trained CNN based parameters transfer learning approach for bearing fault diagnosis under variable speed conditions. Chen et al. [204] employed a parameters transfer based method for fault diagnosis of rotary machinery, in which a wide kernel 1-D CNN was designed for learning transferable features, and the results showed the effectiveness of the proposed method.
In this subsection, applications of parameters transfer based fault diagnosis are reviewed. Although this transfer strategy is easy to understand and operate, a troublesome issue remains: labeled data in the target domain is necessary. Therefore, these transfer learning approaches will encounter obstacles when labeled data is unavailable in practical industrial applications.
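The fix-and-fine-tune idea of parameters transfer can be sketched with a deliberately tiny model. Everything below is an illustrative assumption rather than any reviewed method: a single frozen `tanh` unit stands in for the transferred lower layers, a logistic unit stands in for the task-specific head, and the "pre-trained" values and target samples are invented toy numbers.

```python
import math


def features(x, frozen):
    # Transferred lower layer: its parameters come from the source domain and stay fixed.
    return math.tanh(frozen["w"] * x + frozen["b"])


def predict(x, frozen, head):
    # Task-specific head: a logistic unit on top of the frozen feature.
    z = head["w"] * features(x, frozen) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(data, frozen, head):
    # Average binary cross-entropy over labeled samples.
    total = 0.0
    for x, y in data:
        p = predict(x, frozen, head)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(data)


def fine_tune(target_data, frozen, head, lr=0.5, epochs=200):
    """Gradient descent on the head only; the frozen layer is never updated."""
    head = dict(head)  # work on a copy so the source head is kept
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in target_data:
            err = predict(x, frozen, head) - y  # derivative of the loss w.r.t. z
            grad_w += err * features(x, frozen)
            grad_b += err
        head["w"] -= lr * grad_w / len(target_data)
        head["b"] -= lr * grad_b / len(target_data)
    return head


frozen = {"w": 1.5, "b": 0.0}       # hypothetical pre-trained parameters
source_head = {"w": 0.1, "b": 0.0}  # hypothetical head before fine-tuning
target = [(-1.0, 0), (-0.5, 0), (0.6, 1), (1.2, 1)]  # few labeled target samples

loss_before = mean_loss(target, frozen, source_head)
tuned_head = fine_tune(target, frozen, source_head)
loss_after = mean_loss(target, frozen, tuned_head)
print(loss_before, loss_after)
```

In a real CNN the same pattern applies at a larger scale: the early convolutional layers are excluded from the optimizer, and only the last layers receive gradient updates from the labeled target data.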
Discrepancy measure based methods generally work by minimizing a certain distance between the hidden activations of the convolutional network in different domains, in which the key is to find an efficient discrepancy metric function, such as Maximum Mean Discrepancy (MMD) and correlation alignment. In this subsection, the definitions of MMD and correlation alignment are first introduced to help understand this transfer learning manner, and then the applications of discrepancy measure based transfer diagnosis are systematically reviewed.
Maximum mean discrepancy [205, 206] measures the distribution divergence by the mean embedding of two distributions in the reproducing kernel Hilbert space $\mathcal{H}$. Specifically, given the datasets $X_s = \{x_i^s\}$ and $X_t = \{x_i^t\}$ drawn from two domains with different distributions $P$ and $Q$, the MMD can be calculated as:
$$\mathrm{MMD}_{\mathcal{H}}(P, Q) = \sup_{\phi \in \mathcal{H}} \left( \mathbb{E}_{x^s \sim P}[\phi(x^s)] - \mathbb{E}_{x^t \sim Q}[\phi(x^t)] \right) \tag{7}$$
where $\phi$ represents the feature map; $P = Q$ if and only if $\mathrm{MMD}_{\mathcal{H}}(P, Q) = 0$. In practical application, the MMD is calculated as the empirical estimation based on the kernel mean embedding:
$$\mathrm{MMD}^2(X_s, X_t) = \frac{1}{n_s^2} \sum_{i=1}^{n_s} \sum_{j=1}^{n_s} k(x_i^s, x_j^s) + \frac{1}{n_t^2} \sum_{i=1}^{n_t} \sum_{j=1}^{n_t} k(x_i^t, x_j^t) - \frac{2}{n_s n_t} \sum_{i=1}^{n_s} \sum_{j=1}^{n_t} k(x_i^s, x_j^t) \tag{8}$$
where $k(\cdot, \cdot)$ represents the characteristic kernel; $n_s$ and $n_t$ are the numbers of source samples and target samples.
Correlation alignment [207] is defined as a constraint function to measure the data distribution difference based on the second-order statistics. Mathematically, consider the source feature matrix $D_s \in \mathbb{R}^{n_s \times d}$ and the target feature matrix $D_t \in \mathbb{R}^{n_t \times d}$, where the number of rows is the number of samples and the number of columns is the feature dimension. The covariance matrices of the two feature matrices can be calculated as:
$$C_s = \frac{1}{n_s - 1} \left( D_s^{\mathrm{T}} D_s - \frac{1}{n_s} (\mathbf{1}^{\mathrm{T}} D_s)^{\mathrm{T}} (\mathbf{1}^{\mathrm{T}} D_s) \right), \qquad C_t = \frac{1}{n_t - 1} \left( D_t^{\mathrm{T}} D_t - \frac{1}{n_t} (\mathbf{1}^{\mathrm{T}} D_t)^{\mathrm{T}} (\mathbf{1}^{\mathrm{T}} D_t) \right) \tag{9}$$
where $C_s$ and $C_t$ stand for the covariance matrices of the two domains, respectively; $\mathbf{1}$ is a column vector with all elements equal to 1; $\mathrm{T}$ stands for the transposition.
Based on the two covariance matrices, the correlation alignment is defined as follows:
$$d_c = \left\| C_s - C_t \right\|_F^2 \tag{10}$$
where $\| \cdot \|_F^2$ represents the squared matrix Frobenius norm.
In the community of fault diagnosis, Zhang et al. [208] proposed a domain adaptation based CNN for fault diagnosis under varying working conditions, in which the MMD was used to minimize the domain divergence. Li et al. [209] presented a CNN based rolling bearing fault diagnosis method under noisy and changing working conditions. In their approach, a feature clustering method was introduced to minimize the intra-class difference and maximize the inter-class difference. Meanwhile, the MMD was adopted to reduce the domain divergence. Furthermore, they proposed a multi-layer domain adaptation approach for bearing fault diagnosis [210], in which the multi-kernel maximum mean discrepancy was employed as the metric function to reduce distribution differences between different domains. In [211], Xiao et al. presented a domain adaptation based motor fault diagnosis method, in which the CNN was adopted to extract multi-level features from raw vibration signals and the MMD was incorporated to reduce the feature distribution differences of multiple layers. Yang et al. [212] combined the CNN with MMD to introduce a transfer learning network for fault diagnosis from laboratory bearings to locomotive bearings. Han et al. [213] extended the marginal distribution adaptation to the joint distribution adaptation and proposed a deep transfer network for fault diagnosis. Xu et al. [214] proposed a convolutional transfer feature discrimination network for unbalanced fault diagnosis under variable rotational speeds, in which the MMD was used to reduce the distribution differences of high-dimensional features. Zhu et al.
[215] converted raw vibration data into gray pixel images as the network input and proposed a multi-Gaussian kernels MMD based deep transfer learning approach for rolling bearing fault diagnosis under different operating conditions. To address the cross-domain diagnosis problem with insufficient target samples, Li et al. [216] used the MMD based generative convolutional networks to generate fake target fault samples and proposed a domain adaptation based approach for bearing fault diagnosis. In [217], a renewable fusion method was proposed for fault diagnosis under variable speed conditions and unbalanced samples, in which the second order statistics were used to reduce feature distribution differences and the contrastive loss function was employed to promote the similar features between different speeds.
Fig. 19. Illustration of adversarial domain adaptation.
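The two discrepancy measures defined in Eqs. (8)-(10) can be made concrete with a short numerical sketch. The version below is written under stated assumptions: a Gaussian kernel with bandwidth `sigma` is used as the characteristic kernel, the squared (biased) form of the MMD estimate is computed, and the function names and toy feature sets are hypothetical.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    # Gaussian (RBF) kernel, assumed here as the characteristic kernel k.
    sq_dist = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd2(xs, xt, sigma=1.0):
    """Biased empirical estimate of the squared MMD between two sample sets."""
    ns, nt = len(xs), len(xt)
    k_ss = sum(gaussian_kernel(a, b, sigma) for a in xs for b in xs) / ns ** 2
    k_tt = sum(gaussian_kernel(a, b, sigma) for a in xt for b in xt) / nt ** 2
    k_st = sum(gaussian_kernel(a, b, sigma) for a in xs for b in xt) / (ns * nt)
    return k_ss + k_tt - 2.0 * k_st


def covariance(d):
    """Covariance of a feature matrix whose rows are samples (form of Eq. (9))."""
    n, dim = len(d), len(d[0])
    col_sum = [sum(row[j] for row in d) for j in range(dim)]
    return [
        [
            (sum(row[i] * row[j] for row in d) - col_sum[i] * col_sum[j] / n) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


def coral(ds, dt):
    """Squared Frobenius norm of the covariance difference (form of Eq. (10))."""
    cs, ct = covariance(ds), covariance(dt)
    dim = len(cs)
    return sum((cs[i][j] - ct[i][j]) ** 2 for i in range(dim) for j in range(dim))


source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
shifted = [[2.0, 2.0], [3.0, 2.0], [2.0, 3.0], [3.0, 3.0]]  # same shape, moved mean
scaled = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]   # same layout, doubled spread

print(mmd2(source, source), mmd2(source, shifted))
print(coral(source, shifted), coral(source, scaled))
```

The toy sets highlight a difference between the two measures: a pure mean shift changes the MMD estimate but leaves the covariance-based distance at zero, whereas a change of spread is picked up by the correlation alignment term.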
In addition to the above two moment matching algorithms, another transfer learning strategy, named adversarial domain adaptation [218], is attracting increasing attention. Unlike the discrepancy measure based method, adversarial domain adaptation constructs a two-player minimax game to supervise the network in learning transferable features. Specifically, an adversarial domain adaptation network is usually composed of a feature extractor $F$, a label classifier $C$ and a domain discriminator $D$ as shown in Fig. 19, in which the discriminator is trained to distinguish whether the features are from the source domain or the target domain while the feature extractor tries to learn features that fool the discriminator. Meanwhile, in the training process, the classifier is trained to minimize the classification error of the source data. Given the source domain $\mathcal{D}_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$ with $n_s$ samples and the target domain $\mathcal{D}_t = \{x_i^t\}_{i=1}^{n_t}$ with $n_t$ samples, the overall objective of the adversarial domain adaptation network is described as follows:
$$L = \frac{1}{n_s} \sum_{x_i \in \mathcal{D}_s} L_y\big(C(F(x_i)), y_i\big) - \frac{\lambda}{n_s + n_t} \sum_{x_i \in \mathcal{D}_s \cup \mathcal{D}_t} L_d\big(D(F(x_i)), d_i\big) \tag{11}$$
where $L_y$ represents the classification loss function; $L_d$ denotes the domain identification loss function; $y_i$ and $d_i$ stand for the category label and domain label, respectively; $\lambda$ is the trade-off parameter.
Inspired by adversarial domain adaptation, Han et al. [219] introduced a deep adversarial convolutional network for machine fault diagnosis, and the results showed that the proposed method was superior to the conventional convolutional networks. Guo et al. [220] integrated moment matching with adversarial learning strategies to develop a deep convolutional transfer learning network for fault diagnosis of machines, in which the training and test datasets were acquired from different machines. Zhang et al. [221] proposed a Wasserstein distance guided multi-adversarial convolutional network for fault diagnosis under different operating conditions.
The experimental results showed better performance than MMD based methods. To improve the Wasserstein distance-based adversarial approach, Wang et al. [222] presented a triplet loss guided adversarial domain adaptation network for bearing fault diagnosis, and the results showed better performance. Xie et al. [223] proposed a transfer learning approach for fault diagnosis using the cycle-consistent GAN, in which the GAN was designed to generate new samples for unknown conditions to pre-train a classifier. From the perspective of decision boundaries, Jiao et al. [224] developed an unsupervised adversarial adaptation network to achieve cross-domain fault diagnosis using two task classifiers without the domain discriminator. Furthermore, they presented a domain adaptation network based on classifier inconsistency for addressing a more realistic problem [225], i.e. partial transfer diagnosis, in which the source domain and target domain have different numbers of classes.
In addition to the above three popular transfer diagnosis methods, several ingenious technologies for transfer diagnosis have also been introduced. For example, Zhang et al. [226] used raw vibration signals as the model input and presented a CNN based diagnosis method, in which wide convolutional kernels and adaptive batch normalization were adopted for the domain adaptation capability. Duan et al. [227] proposed an auxiliary model based domain adaptation method for reciprocating compressor diagnosis under different operating conditions, in which a pre-trained CNN was used for feature learning and a marginalized stacked auto-encoder was used to eliminate the data distribution difference. Hasan et al. [228] introduced a discrete orthonormal Stockwell transform for data preprocessing. Then a deep convolutional network was proposed for bearing fault diagnosis under variable rotational speeds. Xiao et al.
[229] proposed a transfer learning model for fault diagnosis by integrating the modified TrAdaBoost algorithm with the convolutional network.
In this section, applications of transfer diagnosis are methodically reviewed and a concise summary is listed in Table 12. The first column represents the different transfer strategies and the third column denotes the specific methodology for achieving transfer diagnosis. The last column stands for the application scenarios, where “image → machine” represents that the pre-trained model is from the field of image processing; “operating conditions” and “machines” denote the transfer between different conditions or different machines, respectively.
Table 12. Summary of applications on transfer diagnosis.
Transfer strategy | References | Methodology | Scenarios
Parameters transfer | [194] [196] [198] [199] [201] [202] | Pre-train by image data | image → machine
Parameters transfer | [195] [197] [200] [203] | Pre-train by mechanical data | operating conditions
Parameters transfer | [204] | Pre-train by mechanical data | operating conditions and machines
Discrepancy measure | [208] [209] [210] [211] [213] [214] [215] [216] | MMD | operating conditions
Discrepancy measure | [212] | MMD | machines
Discrepancy measure | [217] | Correlation alignment | operating conditions
Adversarial transfer | [219] | Adversarial discriminator | operating conditions
Adversarial transfer | [220] | Adversarial discriminator and MMD | machines
Adversarial transfer | [221] [222] | Adversarial Wasserstein | operating conditions
Adversarial transfer | [223] | GAN | operating conditions
Adversarial transfer | [224] [225] | Classifier discrepancy | operating conditions
Other | [226] [227] [228] [229] | / | operating conditions
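To make the adversarial objective of Eq. (11) more tangible, the following sketch evaluates it for given network outputs. It assumes the usual form of the objective (source classification loss minus $\lambda$ times the domain identification loss averaged over both domains), cross-entropy for both losses, and the convention that the discriminator outputs the probability of a sample belonging to the target domain; all names and numbers are hypothetical.

```python
import math


def class_loss(probs, label):
    # Cross-entropy of the label classifier output for one source sample.
    return -math.log(probs[label])


def domain_loss(p_target, d):
    # Binary cross-entropy of the discriminator; d = 1 for target, 0 for source (assumed).
    return -math.log(p_target) if d == 1 else -math.log(1.0 - p_target)


def adversarial_objective(class_probs, labels, domain_probs, domain_labels, lam=1.0):
    """Classification loss on source data minus lam times the domain loss on all data."""
    n_s = len(labels)
    n_all = len(domain_labels)  # n_s + n_t
    l_y = sum(class_loss(p, y) for p, y in zip(class_probs, labels)) / n_s
    l_d = sum(domain_loss(p, d) for p, d in zip(domain_probs, domain_labels)) / n_all
    return l_y - lam * l_d


class_probs = [[0.9, 0.1], [0.2, 0.8]]  # classifier outputs for two source samples
labels = [0, 1]
domain_labels = [0, 0, 1, 1]            # two source samples, then two target samples

confident = [0.1, 0.2, 0.9, 0.8]        # discriminator separates the domains well
fooled = [0.5, 0.5, 0.5, 0.5]           # discriminator cannot tell the domains apart

obj_confident = adversarial_objective(class_probs, labels, confident, domain_labels)
obj_fooled = adversarial_objective(class_probs, labels, fooled, domain_labels)
print(obj_confident, obj_fooled)
```

With the classification term held fixed, the objective is lower when the discriminator is fooled, which is the direction in which the feature extractor is pushed, while the discriminator itself is trained to reduce its domain loss.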
5. Conclusions
Fig. 20. The pie chart of publications related to the convolutional network based monitoring diagnosis method. Segment labels with publication counts: Classification: Data matrix (12), Image (8), Statistics (12), WT (15), STFT (6), Other (14), Vibration (25), Other (8); Variants: ResNet (10), GAN (7), Other (5); Health Prediction: Bearing (18), Engine (7), Other (8); Transfer Diagnosis: Parameters (11), Metric (10), Adversarial (7), Other (4).
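As a quick sanity check on the segment counts of Fig. 20, the share of each research direction can be recomputed as below. The grouping of segments is an assumption made here (the network variants are counted under classification), and the dictionary keys are illustrative names only.

```python
# Publication counts read from the segment labels of Fig. 20 (grouping assumed).
counts = {
    "classification": [12, 8, 12, 15, 6, 14, 25, 8, 10, 7, 5],
    "health prediction": [18, 7, 8],
    "transfer diagnosis": [11, 10, 7, 4],
}

total = sum(sum(values) for values in counts.values())
shares = {name: 100.0 * sum(values) / total for name, values in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

Under this grouping, the counts add up to 187 publications, with classification slightly above 65% and health prediction at about 17.6%.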
In the previous sections, the published literature on CNFD has been systematically reviewed. The overall pie chart of these summaries is displayed in Fig. 20. It should be pointed out that the literature in this field is abundant and new research keeps emerging, so it is inevitable that some papers are missing from the current review. In addition, non-English publications are not considered in this work because of limited language proficiency. Nevertheless, some observations and conclusions based on the current literature review are provided in this section.
(1) From Fig. 20, it can be found that more than 65% of the publications focus on the fault classification task, and most transfer diagnosis approaches are also oriented to health condition classification. Mechanical health prognostic applications account for only about 17.6% of the total. This may be because fault classification is easier and more intuitive to implement, whereas health prognostics usually requires additional steps such as HI construction and health stage division. However, fault classification mainly focuses on various failure conditions, which are the final states of a machine. Before reaching these conditions, the equipment usually undergoes a degradation process, during which proactive action should be taken instead of waiting for the final fault. Thus health prediction, i.e. degradation monitoring or RUL prediction, deserves more attention in future studies.
(2) Almost all models mentioned in the above literature are trained and tested in experimental or simulated scenarios, so they may be unsuitable for direct application in real industry, since laboratory data and industrial data usually differ. In addition, in some studies the training data and test data even come from the same set of experiments, in which case excellent results can be produced simply owing to the data similarity. This may lead researchers to overestimate the network capability. Consequently, it is important to train more powerful models using reasonable experimental data and diverse real industrial data.
(3) It is known that the parameter set plays a decisive role in the performance of a deep convolutional network. In the above literature, the design and selection of network parameters (including architecture and hyper-parameters) are mainly determined subjectively by the authors, and no specific standard or rule for selecting appropriate parameters has been formed. Although a single set of parameters cannot be ideal for all tasks, studying the relation between parameters and mechanical signal characteristics, or developing parameter selection techniques, is still promising and significant.
(4) In some applications, especially those using 2-D convolutional networks, data transformation or signal processing is necessary. This increases the complexity of the overall framework and reduces both the efficiency and the level of intelligence. In contrast, applications based on raw signals avoid the need for domain knowledge and allow an end-to-end diagnostic framework to be constructed. However, the noise and interference present in raw data may disturb model convergence and even lead the model astray. Therefore, it is suggested to view the merits and demerits of deep convolutional networks and advanced signal processing algorithms objectively, and to integrate them organically for better performance.
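Observation (2) above warns that results are inflated when training and test data come from the same experiment. One simple precaution, sketched below under stated assumptions, is to split the data by experimental run rather than by individual signal segment, so that segments cut from one run never appear on both sides. The tuple layout `(run_id, segment, label)` and the function name `split_by_run` are hypothetical and serve only as an illustration.

```python
import random


def split_by_run(samples, test_fraction=0.25, seed=0):
    """Split samples so that no experimental run appears in both sets.

    `samples` is a list of (run_id, segment, label) tuples (assumed layout).
    """
    run_ids = sorted({run_id for run_id, _, _ in samples})
    if len(run_ids) < 2:
        raise ValueError("need at least two runs to hold one out")
    rng = random.Random(seed)
    rng.shuffle(run_ids)
    # Hold out at least one run, but always keep at least one for training.
    n_test = min(len(run_ids) - 1, max(1, round(len(run_ids) * test_fraction)))
    test_runs = set(run_ids[:n_test])
    train = [s for s in samples if s[0] not in test_runs]
    test = [s for s in samples if s[0] in test_runs]
    return train, test


# Toy data: 8 runs with 10 segments each; the segments are placeholders.
samples = [(run, f"segment_{run}_{k}", run % 4)
           for run in range(8) for k in range(10)]
train, test = split_by_run(samples)
train_runs = {s[0] for s in train}
test_runs = {s[0] for s in test}
print(len(train), len(test), train_runs & test_runs)
```

A split of this kind does not remove the gap between laboratory and industrial data, but it at least prevents near-duplicate segments of one run from leaking between the training and test sets.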
6. Prospects
Although the published literature has achieved great advancements in machine fault diagnosis, there are still several aspects that need to be further explored and investigated. Therefore, in this section, we share some prospects with the readers, researchers and engineers who aim to promote the development of this field.
(1) More theoretical investigation is necessary to address the “black box” issue of CNFD approaches. Although convolutional networks have been widely applied to fault diagnosis, in-depth theoretical research is still very rare. For example, the relation between the weights and the mechanical features, and the meaning of the learned features, have not been reasonably explained. In addition, the “black box” issue makes companies and factories doubt the capabilities of these methods and refuse to apply them to real scenarios. Therefore, it is urgent to lift the veil of CNFD methods in both academia and industry.
(2) How to identify unseen damage types or fault conditions? The literature on CNFD applications generally focuses only on identifying faults that exist in the training set. This means that a model can be used only when the categories of the test dataset are included in the training set. However, constructing an all-encompassing training set is expensive or even impossible. Moreover, unfamiliar faults will inevitably occur in real scenarios as the equipment itself and its working environment change. As a result, how to build models that can distinguish unseen damages or faults is still an open question.
(3) There is a need to detect early damage from the viewpoint of quantitative analysis. Most CNFD applications only focus on how to identify different health categories; however, there are no obvious fault types in the early degradation stage. In particular, in high-precision and vital industrial applications it is unreasonable to postpone diagnosis until large or significant failures occur. Moreover, decisions should be made according to the level of damage obtained by quantitative analysis. Therefore, efforts to explore the detection and quantitative analysis of early weak damage should be encouraged.
(4) How to train the model using non-stationary data for diagnosis or prognosis under variable operating conditions? In previous studies on variable operating conditions, the convolutional models are usually trained with data from steady operation or with time-invariant features extracted by signal processing techniques. As a result, the former only covers transfer from one stationary condition to another, and the latter needs certain expert knowledge. Moreover, collecting stationary data is difficult or even impossible in a real, continuously non-stationary operating environment. Consequently, how to use deep convolutional networks to process non-stationary data directly and achieve reliable diagnosis is also an urgent problem.
(5) How to speed up CNFD algorithms to meet real-time diagnostic requirements? A frequently mentioned drawback of CNFD approaches is that they consume more training time than classical shallow algorithms, which makes them unsatisfactory for tasks that require real-time or quick responses. Therefore, novel techniques for accelerating and improving CNFD algorithms deserve more attention in future research.
(6) How to establish CNFD models that meet the requirements of an equipment fleet? According to the above literature, existing methods are mostly employed for the diagnosis and prognosis of a single machine. However, operating machines in clusters is becoming an increasing trend in the era of rapid manufacturing and production. Therefore, it is more significant to study CNFD models with powerful generalization capability that can be freely applied to other similar machines.
(7) How to utilize the opportunity of industrial big data to improve the performance of CNFD methods? Sufficient data is the premise and foundation for achieving excellent performance with deep networks. In the reviewed literature, the choice of data quantity depends heavily on subjective factors or is limited by the experimental conditions. As a result, many models are not optimized to their best performance owing to the illusion created by simple or limited data. Therefore, how to seize the chance of industrial big data and utilize its characteristics, such as diversity and heterogeneity, to develop more robust and reliable models will be another promising topic for future research.
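Prospect (2) above asks how a model can avoid forcing an unseen fault into one of its known classes. A very simple baseline, sketched below, rejects a prediction as "unknown" whenever the largest softmax probability falls below a threshold. This is an assumption-laden illustration rather than a method taken from the reviewed literature: the class names, the logits and the threshold of 0.8 are hypothetical, and `predict_with_rejection` is an illustrative helper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_rejection(logits, class_names, threshold=0.8):
    """Return (label, confidence); label is 'unknown' if confidence is low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return class_names[best], probs[best]


# Illustrative class names and classifier outputs (not from any dataset).
class_names = ["normal", "inner race fault", "outer race fault"]
confident = predict_with_rejection([6.0, 0.5, 0.2], class_names)
ambiguous = predict_with_rejection([1.1, 1.0, 0.9], class_names)
print(confident[0], ambiguous[0])
```

Deep networks can still assign high confidence to inputs far from the training data, so a threshold of this kind is only a starting point and does not settle the open question raised above.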
Acknowledgments
This research was supported by the National Natural Science Foundation of China (Grant Nos. 51421004 and 91860205), the Defense Industrial Technology Development Program (Grant No. JCKY2018601C013), the Fundamental Research Funds for the Central Universities (Grant No. xzy022019022) and the China Scholarship Council.
References [1] J. Wan, S. Tang, D. Li, S. Wang, C. Liu, H. Abbas, A.V. Vasilakos, A manufacturing big data solution for active preventive maintenance, IEEE Trans. Ind. Inform., 13 (4) (2017) 2039-2047. [2] Y. Lei, F. Jia, J. Lin, S. Xing, S.X. Ding, An Intelligent Fault Diagnosis Method Using Unsupervised Feature Learning Towards Mechanical Big Data, IEEE Trans. Ind. Electron., 63 (5) (2016) 3137-3147. [3] J. Jiao, M. Zhao, J. Lin, K. Liang, Hierarchical discriminating sparse coding for weak fault feature extraction of rolling bearings, Reliab. Eng. Syst. Saf., 184 (2019) 41-54. [4] M. Zhao, X. Jia, A novel strategy for signal denoising using reweighted SVD and its applications to weak fault feature enhancement of rotating machinery, Mech. Syst. Signal Pr., 94 (2017) 129-147. [5] Y. Lei, J. Lin, M.J. Zuo, Z. He, Condition monitoring and fault diagnosis of planetary gearboxes: A review, Measurement, 48 (2014) 292-305. [6] M. Yu, D. Wang, M. Luo, Model-Based Prognosis for Hybrid Systems With Mode-Dependent Degradation Behaviors, IEEE Trans. Ind. Electron., 61 (1) (2014) 546-554. [7] S. Haidong, C. Junsheng, J. Hongkai, Y. Yu, W. Zhantao, Enhanced deep gated recurrent unit and complex wavelet packet energy moment entropy for early fault prognosis of bearing, Knowl-Based Syst., (2019) 9 [52] H. Yang, J. Zhang, L. Chen, H. Zhang, S. Liu, Fault Diagnosis of Reciprocating Compressor Based on Convolutional Neural Networks with Multisource Raw Vibration Signals, Math. Probl. Eng., 2019 (2019). [53] M. Xia, T. Li, L. Xu, L. Liu, C.W. de Silva, Fault Diagnosis for Rotating Machinery Using Multiple Sensors and Convolutional Neural Networks, IEEE/ASME Transactions on Mechatronics, 23 (1) (2018) 101-110. [54] D. Hoang, H. Kang, Rolling element bearing fault diagnosis using convolutional neural network and vibration image, Cogn Syst Res, 53 (2019) 42-50. [55] W. Zhang, D. Yang, H. Wang, X. Huang, M. 
Gidlund, CarNet: A Dual Correlation Method for Health Perception of Rotating Machinery, IEEE Sens. J., 19 (16) (2019) 7095-7106. [56] D.T. Hoang, H.J. Kang, A Motor Current Signal Based Bearing Fault Diagnosis Using Deep Learning And Information Fusion, IEEE Trans. Instrum. Meas., 1. [57] H. Wang, S. Li, L. Song, L. Cui, A novel convolutional neural network based fault recognition method via image fusion of multi-vibration-signals, Comput. Ind., 105 (2019) 182-190. [58] Z. Hu, Y. Wang, M. Ge, J. Liu, Data-driven Fault Diagnosis Method based on Compressed Sensing and Improved Multi-scale Network, IEEE Trans. Ind. Electron., 1. [59] H. Wang, S. Li, L. Song, L. Cui, P. Wang, An Enhanced Intelligent Diagnosis Method Based on Multi-Sensor Image Fusion via Improved Deep Learning Network, IEEE Trans. Instrum. Meas.. [60] L. Wen, X. Li, L. Gao, Y. Zhang, A New Convolutional Neural Network-Based Data-Driven Fault Diagnosis Method, IEEE Trans. Ind. Electron., 65 (7) (2018) 5990-5998. [61] Z. Chen, C. Li, R.E.V. Sanchez, Gearbox fault identification and classification with convolutional neural networks, Shock Vib., 2015 (2015). [62] O. Janssens, V. Slavkovikj, B. Vervisch, K. Stockman, M. Loccufier, S. Verstockt, R. Van de Walle, S. Van Hoecke, Convolutional neural network based fault detection for rotating machinery, J. Sound Vib., 377 (2016) 331-345. [63] M. Bhadane, K.I. Ramachandran. Bearing fault identification and classification with convolutional neural network. In: Editor edito. Pub Place: IEEE; 2017. p. 1-5. [64] C. Lu, Z. Wang, B. Zhou, Intelligent fault diagnosis of rolling bearing using hierarchical convolutional network based health state classification, Adv. Eng. Inform., 32 (2017) 139-151. [65] S. Li, G. Liu, X. Tang, J. Lu, J. Hu, An ensemble deep convolutional neural network model with improved DS evidence fusion for bearing fault diagnosis, Sensors-Basel, 17 (8) (2017) 1729. [66] V. Tra, J. Kim, S.A. Khan, J. 
Kim, Bearing fault diagnosis under variable speed using convolutional neural networks and the stochastic diagonal Levenberg-Marquardt algorithm, Sensors-Basel, 17 (12) (2017) 2834. [67] A. Prosvirin, J. Kim, J. Kim, Bearing Fault Diagnosis Based on Convolutional Neural Networks with Kurtogram Representation of Acoustic Emission Signals, Advances in Computer Science and Ubiquitous Computing, Springer, 2017, pp. 21-26. [68] Y. Tian, X. Liu, A deep adaptive learning method for rolling bearing fault diagnosis using immunity, Tsinghua Sci. Technol., 24 (6) (2019) 750-762. [69] V. Tra, S.A. Khan, J. Kim, Diagnosis of bearing defects under variable speed conditions using energy distribution maps of acoustic emission spectra and convolutional neural networks, The Journal of the Acoustical Society of America, 144 (4) (2018) L322-L327. [70] Y. Li, G. Cheng, Y. Pang, M. Kuai, Planetary gear fault diagnosis via feature image extraction based on multi central frequencies and vibration signal frequency Spectrum, Sensors-Basel, 18 (6) (2018) 1735. [71] Y. Yao, H. Wang, S. Li, Z. Liu, G. Gui, Y. Dan, J. Hu, End-to-end convolutional neural network model for gear fault diagnosis based on sound signals, Applied Sciences, 8 (9) (2018) 1584. [72] B.H. Kien, D. Iba, Y. Ishii, Y. Tsutsui, N. Miura, T. Iizuka, A. Masuda, A. Sone, I. Moriwaki, Crack detection of plastic gears using a convolutional neural network pre-learned from images of meshing vibration data with transfer learning, Forschung im Ingenieurwesen, 83 (3) (2019) 645-653. [73] X. Ding, Q. He, Energy-fluctuated multiscale feature learning with deep convnet for intelligent spindle bearing fault diagnosis, IEEE Trans. Instrum. Meas., 66 (8) (2017) 1926-1935. [74] D. Gao, Y. Zhu, X. Wang, K. Yan, J. Hong, A Fault Diagnosis Method of Rolling Bearing Based on Complex Morlet CWT and CNN, IEEE, 2018, pp. 1101-1105. [75] S. Guo, T. Yang, W. Gao, C. Zhang, Y.
Zhang, An intelligent fault diagnosis method for bearings with variable rotating speed based on Pythagorean spatial pyramid pooling CNN, Sensors-Basel, 18 (11) (2018) 3857. [76] G. Xu, M. Liu, Z. Jiang, D. Söffker, W. Shen, Bearing fault diagnosis method based on deep convolutional neural network and random forest ensemble learning, Sensors-Basel, 19 (5) (2019) 1088. [77] M.M. Islam, J. Kim, Automated bearing fault diagnosis scheme using 2D representation of wavelet packet transform and deep convolutional neural network, Comput. Ind., 106 (2019) 142-153. [78] W. Sun, B. Yao, N. Zeng, B. Chen, Y. He, X. Cao, W. He, An intelligent gear fault diagnosis methodology using a complex wavelet enhanced convolutional neural network, Materials, 10 (7) (2017) 790. [79] D. Cabrera, F. Sancho, C. Li, M. Cerrada, R.E.V. Sánchez, F. Pacheco, J.E.V. de Oliveira, Automatic feature extraction of time-series applied to fault severity assessment of helical gearbox in stationary and non-stationary speed operation, Appl. Soft Comput., 58 (2017) 53-64. [80] Y. Han, B. Tang, L. Deng, Multi-level wavelet packet fusion in dynamic ensemble convolutional neural network for fault diagnosis, Measurement, 127 (2018) 246-255. [81] J. Grezmak, P. Wang, C. Sun, R.X. Gao, Explainable Convolutional Neural Network for Gearbox Fault Diagnosis, Procedia CIRP, 80 (2019) 476-481. [82] P. Liang, C. Deng, J. Wu, Z. Yang, J. Zhu, Z. Zhang, Compound Fault Diagnosis of Gearboxes via Multi-label Convolutional Neural Network and Wavelet Transform, Comput. Ind., 113 (2019) 103132. [83] S. Guo, T. Yang, W. Gao, C. Zhang, A novel fault diagnosis method for rotating machinery based on a convolutional neural network, Sensors-Basel, 18 (5) (2018) 1429. [84] S. Shao, R. Yan, Y. Lu, P. Wang, R. Gao, DCNN-based Multi-signal Induction Motor Fault Diagnosis, IEEE Trans. Instrum. Meas., 1. [85] Y. Hsueh, V.R. Ittangihal, W. Wu, H. Chang, C.
Kuo, Fault Diagnosis System for Induction Motors by CNN Using Empirical Wavelet Transform, Symmetry, 11 (10) (2019) 1212. [86] Z. Chen, K. Gryllias, W. Li, Mechanical fault diagnosis using Convolutional Neural Networks and Extreme Learning Machine, Mech. Syst. Signal Pr., 133 (2019) 106272. [87] X. Cao, B. Chen, B. Yao, W. He, Combining translation-invariant wavelet frames and convolutional neural network for intelligent tool wear state identification, Comput. Ind., 106 (2019) 71-84. [88] D. Verstraete, A.E.S. Ferrada, E.L.O.P. Droguett, V. Meruane, M. Modarres, Deep learning enabled fault diagnosis using time-frequency image analysis of rolling element bearings, Shock Vib., 2017 (2017). [89] V. Pandhare, J. Singh, J. Lee, Convolutional Neural Network Based Rolling-Element Bearing Fault Diagnosis for Naturally Occurring and Progressing Defects Using Time-Frequency Domain Features, IEEE, 2019, pp. 320-326. [90] Y. Xin, S. Li, C. Cheng, J. Wang, An intelligent fault diagnosis method of rotating machinery based on deep neural networks and time-frequency analysis, Journal of Vibroengineering, 20 (6) (2018) 2321-2335. [91] W. Yu, S. Huang, W. Xiao, Fault Diagnosis Based on an Approach Combining a Spectrogram and a Convolutional Neural Network with Application to a Wind Turbine System, Energies, 11 (10) (2018) 2561. [92] L. Wen, L. Gao, X. Li, A New Snapshot Ensemble Convolutional Neural Network for Fault Diagnosis, IEEE Access, 7 (2019) 32037-32047. [93] L. Wang, X. Zhao, J. Wu, Y. Xie, Y. Zhang, Motor fault diagnosis based on short-time Fourier transform and convolutional neural network, Chinese Journal of Mechanical Engineering, 30 (6) (2017) 1357-1368. [94] G. Li, C. Deng, J. Wu, X. Xu, X. Shao, Y. Wang, Sensor Data-Driven Bearing Fault Diagnosis Based on Deep Convolutional Neural Networks and S-Transform, Sensors-Basel, 19 (12) (2019) 2750. [95] L. Wen, X. Li, L. Gao, A New Two-Level Hierarchical Diagnosis Network Based on Convolutional Neural Network, IEEE Trans.
Instrum. Meas., (2019). [96] H. Jeong, S. Park, S. Woo, S. Lee, Rotating machinery diagnostics using deep learning on orbit plot images, Procedia Manufacturing, 5 (2016) 1107-1118. [97] N.F. Waziralilah, A. Abu, M.H. Lim, L.K. Quen, A. Elfakarany, Bearing fault diagnosis employing Gabor and augmented architecture of convolutional neural network, Journal of Mechanical Engineering and Sciences, 13 (3) (2019) 5689-5702. [98] D. Zhao, T. Wang, F. Chu, Deep convolutional neural network based planet bearing fault classification, Comput. Ind., 107 (2019) 59-66. [99] O. Janssens, R. Van de Walle, M. Loccufier, S. Van Hoecke, Deep Learning for Infrared Thermal Image Based Machine Health Monitoring, IEEE/ASME Transactions on Mechatronics, 23 (1) (2018) 151-159. [100] Z. Jia, Z. Liu, C. Vong, M. Pecht, A Rotating Machinery Fault Diagnosis Method Based on Feature Learning of Thermal Images, IEEE Access, 7 (2019) 12348-12359. [101] Y. Li, J.X. Gu, D. Zhen, M. Xu, A. Ball, An Evaluation of Gearbox Condition Monitoring Using Infrared Thermal Images Applied with Convolutional Neural Networks, Sensors-Basel, 19 (9) (2019) 2205. [102] X. Chen, L. Peng, G. Cheng, C. Luo, Research on Degradation State Recognition of Planetary Gear Based on Multiscale Information Dimension of SSD and CNN, Complexity, 2019 (2019). [103] F. Wang, G. Deng, L. Ma, X. Liu, H. Li, Convolutional Neural Network Based on Spiral Arrangement of Features and Its Application in Bearing Fault Diagnosis, IEEE Access, 7 (2019) 64092-64100. [104] X. Zhu, D. Hou, P. Zhou, Z. Han, Y. Yuan, W. Zhou, Q. Yin, Rotor fault diagnosis using a convolutional neural network with symmetrized dot pattern images, Measurement, 138 (2019) 526-535. [105] H. Li, Q. Zhang, X. Qin, Y. Sun, K-SVD based WVD enhancement algorithm for planetary gearbox fault diagnosis under a CNN framework, Meas. Sci. Technol., (2019). [106] S.S. Udmale, S.S. Patil, V.M. Phalle, S.K. 
Singh, A bearing vibration data analysis based on spectral kurtosis and ConvNet, Soft Comput., 23 (19) (2019) 9341-9359. [107] J.S.L. Senanayaka, H. Van Khang, K.G. Robbersmyr, Multiple Classifiers and Data Fusion for Robust Diagnosis of Gearbox Mixed Faults, IEEE Trans. Ind Inform, 15 (8) (2019) 4569-4579. [108] L. Eren, Bearing fault detection by one-dimensional convolutional neural networks, Math. Probl. Eng., 2017 (2017). [109] H. Pan, X. He, S. Tang, F. Meng, An Improved Bearing Fault Diagnosis Method using One-Dimensional CNN and LSTM, Strojniski Vestnik/Journal of Mechanical Engineering, 64 (2018). [110] J. Pan, Y. Zi, J. Chen, Z. Zhou, B. Wang, LiftingNet: A Novel Deep Learning Network With Layerwise Feature Learning From Noisy Mechanical Data for Fault Classification, IEEE Trans. Ind. Electron., 65 (6) (2018) 4973-4982. [111] W. Qian, S. Li, J. Wang, Z. An, X. Jiang, An intelligent fault diagnosis framework for raw vibration signals: adaptive overlapping convolutional neural network, Meas. Sci. Technol., 29 (9) (2018) 95009. [112] F. Jia, Y. Lei, N. Lu, S. Xing, Deep normalized convolutional neural network for imbalanced fault classification of machinery and its understanding via visualization, Mech. Syst. Signal Pr., 110 (2018) 349-367. [113] L. Eren, T. Ince, S. Kiranyaz, A Generic Intelligent Bearing Fault Diagnosis System Using Compact Adaptive
1D CNN Classifier, Journal of Signal Processing Systems, 91 (2) (2019) 179-189. [114] S. Ma, F. Chu, Ensemble deep learning-based fault diagnosis of rotor bearing systems, Comput. Ind., 105 (2019) 143-152. [115] D. Wang, Q. Guo, Y. Song, S. Gao, Y. Li, Application of Multiscale Learning Neural Network Based on CNN in Bearing Fault Diagnosis, Journal of Signal Processing Systems, 91 (10) (2019) 1205-1217. [116] W. Huang, J. Cheng, Y. Yang, G. Guo, An improved deep convolutional neural network with multi-scale information for bearing fault diagnosis, Neurocomputing, 359 (2019) 77-92. [117] H. Qiao, T. Wang, P. Wang, L. Zhang, M. Xu, An Adaptive Weighted Multiscale Convolutional Neural Network for Rotating Machinery Fault Diagnosis Under Variable Operating Conditions, Ieee Access, 7 (2019) 118954-118964. [118] O. Abdeljaber, S. Sassi, O. Avci, S. Kiranyaz, A.A. Ibrahim, M. Gabbouj, Fault Detection and Severity Identification of Ball Bearings by Online Condition Monitoring, IEEE Trans. Ind. Electron., 66 (10) (2019) 8136-8147. [119] R. Huang, Y. Liao, S. Zhang, W. Li, Deep decoupling convolutional neural network for intelligent compound fault diagnosis, IEEE Access, 7 (2018) 1848-1858. [120] X. Liu, Q. Zhou, J. Zhao, H. Shen, X. Xiong, Fault Diagnosis of Rotating Machinery under Noisy Environment Conditions Based on a 1-D Convolutional Autoencoder and 1-D Convolutional Neural Network, Sensors-Basel, 19 (4) (2019) 972. [121] Y. Han, B. Tang, L. Deng, An enhanced convolutional neural network with enlarged receptive fields for fault diagnosis of planetary gearboxes, Comput. Ind., 107 (2019) 50-58. [122] G. Jiang, H. He, J. Yan, P. Xie, Multiscale Convolutional Neural Networks for Fault Diagnosis of Wind Turbine Gearbox, IEEE Trans. Ind. Electron., 66 (4) (2019) 3196-3207. [123] W. Sun, R. Zhao, R. Yan, S. Shao, X. Chen, Convolutional discriminative feature learning for induction motor fault diagnosis, IEEE Trans. Ind Inform, 13 (3) (2017) 1350-1359. [124] Z. Yuan, L. 
Zhang, L. Duan, A novel fusion diagnosis method for rotor system fault based on deep learning and multi-sourced heterogeneous monitoring data, Meas. Sci. Technol., 29 (11) (2018) 115005. [125] S. Afrasiabi, M. Afrasiabi, B. Parang, M. Mohammadi, Real-Time Bearing Fault Diagnosis of Induction Motors with Accelerated Deep Learning Approach, IEEE, 2019, pp. 155-159. [126] Chen, Xie, Yuan, Huang, Li, Research on a Real-Time Monitoring Method for the Wear State of a Tool Based on a Convolutional Bidirectional LSTM Model, Symmetry, 11 (10) (2019) 1233. [127] Y. Xie, T. Zhang, Fault diagnosis for rotating machinery based on convolutional neural network and empirical mode decomposition, Shock Vib., 2017 (2017). [128] M. Sadoughi, C. Hu, Physics-Based Convolutional Neural Network for Fault Diagnosis of Rolling Element Bearings, IEEE Sens. J., 19 (11) (2019) 4181-4192. [129] W. Zhang, C. Li, G. Peng, Y. Chen, Z. Zhang, A deep convolutional neural network with new training methods for bearing fault diagnosis under noisy environment and different working load, Mech. Syst. Signal Pr., 100 (2018) 439-453. [130] S. Dong, G. Wen, Z. Zhang, Y. Yuan, J. Luo, Rolling Bearing Incipient Degradation Monitoring and Performance Assessment Based on Signal Component Tracking, IEEE Access, 7 (2019) 45983-45993. [131] L. Jing, M. Zhao, P. Li, X. Xu, A convolutional neural network based feature learning and fault diagnosis method for the condition monitoring of gearbox, Measurement, 111 (2017) 1-10. [132] S. Ma, W. Cai, W. Liu, Z. Shang, G. Liu, A Lighted Deep Convolutional Neural Network Based Fault Diagnosis of Rotating Machinery, Sensors-Basel, 19 (10) (2019) 2381. [133] T. Ince, S. Kiranyaz, L. Eren, M. Askar, M. Gabbouj, Real-time motor fault detection by 1-D convolutional neural networks, IEEE Trans. Ind. Electron., 63 (11) (2016) 7067-7075. [134] T.
Ince, Real-time broken rotor bar fault detection and classification by shallow 1D convolutional neural networks, Electr. Eng., (2019). [135] T. Khan, P. Alekhya, J. Seshadrinath. Incipient Inter-turn Fault Diagnosis in Induction motors using CNN and LSTM based Methods. In: Editor edito. Pub Place: IEEE; 2018. p. 1-6. [136] I. Kao, W. Wang, Y. Lai, J. Perng, Analysis of Permanent Magnet Synchronous Motor Fault Diagnosis Based on Learning, IEEE Trans. Instrum. Meas., 68 (2) (2019) 310-324. [137] J. Wang, P. Fu, L. Zhang, R.X. Gao, R. Zhao, Multi-level information fusion for induction motor fault diagnosis, IEEE/ASME Transactions on Mechatronics, 1. [138] X. Li, J. Li, Y. Qu, D. He, Gear Pitting Fault Diagnosis Using Integrated CNN and GRU Network with Both Vibration and Acoustic Emission Signals, Applied Sciences, 9 (4) (2019) 768. [139] D.K. Appana, A. Prosvirin, J. Kim, Reliable fault diagnosis of bearings with varying rotational speeds using envelope spectrum and convolution neural networks, Soft Comput., 22 (20) (2018) 6719-6729. [140] M. Zhao, M. Kang, B. Tang, M. Pecht, Deep Residual Networks With Dynamically Weighted Wavelet Coefficients for Fault Diagnosis of Planetary Gearboxes, IEEE Trans. Ind. Electron., 65 (5) (2018) 4290-4300. [141] M. Zhao, M. Kang, B. Tang, M. Pecht, Multiple Wavelet Coefficients Fusion in Deep Residual Networks for Fault Diagnosis, IEEE Trans. Ind. Electron., 66 (6) (2019) 4696-4706. [142] X. Li, W. Zhang, Q. Ding, J. Sun, Intelligent rotating machinery fault diagnosis based on deep learning using data augmentation, J. Intell. Manuf., (2018) 1-20. [143] W. Zhang, X. Li, Q. Ding, Deep residual learning-based fault diagnosis method for rotating machinery, Isa T., (2018). [144] D. Peng, Z. Liu, H. Wang, Y. Qin, L. Jia, A Novel Deeper One-Dimensional CNN With Residual Learning for Fault Diagnosis of Wheelset Bearings in High-Speed Trains, IEEE Access, 7 (2019) 10278-10293. [145] S. Ma, W. Liu, W. Cai, Z. Shang, G. 
Liu, Lightweight Deep Residual CNN for Fault Diagnosis of Rotating Machinery Based on Depthwise Separable Convolutions, IEEE Access, 7 (2019) 57023-57036. [146] Z. Zhuang, H. Lv, J. Xu, Z. Huang, W. Qin, A Deep Learning Method for Bearing Fault Diagnosis through Stacked Residual Dilated Convolutions, Applied Sciences, 9 (9) (2019) 1823. [147] L. Su, L. Ma, N. Qin, D. Huang, A.H. Kemp, Fault Diagnosis of High-Speed Train Bogie by Residual-Squeeze Net, IEEE Trans. Ind Inform, 15 (7) (2019) 3856-3863. [148] S. Ma, F. Chu, Q. Han, Deep residual learning with demodulated time-frequency features for fault diagnosis of planetary gearbox under nonstationary running conditions, Mech. Syst. Signal Pr., 127 (2019) 190-201. [149] R. Liu, F. Wang, B. Yang, S.J. Qin, Multi-scale Kernel based Residual Convolutional Neural Network for Motor Fault Diagnosis Under Non-stationary Conditions, IEEE Trans. Ind Inform, (2019). [150] S. Cao, L. Wen, X. Li, L. Gao, Application of Generative Adversarial Networks for Intelligent Fault Diagnosis, IEEE, 2018, pp. 711-715. [151] Y. Xie, T. Zhang, Imbalanced Learning for Fault Diagnosis Problem of Rotating Machinery Based on Generative Adversarial Networks, 2018, pp. 6017-6022. [152] S. Shao, P. Wang, R. Yan, Generative adversarial networks for data augmentation in machine fault diagnosis, Comput. Ind., 106 (2019) 85-93. [153] S. Afrasiabi, M. Afrasiabi, B. Parang, M. Mohammadi, M.M. Arefi, M. Rastegar, Wind Turbine Fault Diagnosis with Generative-Temporal Convolutional Neural Network, IEEE, 2019, pp. 1-5. [154] Q. Li, L. Chen, C. Shen, B. Yang, Z. Zhu, Enhanced generative adversarial networks for fault diagnosis of rotating machinery with imbalanced data, Meas. Sci. Technol., 30 (11) (2019) 115005. [155] S. Suh, H. Lee, J. Jo, P. Lukowicz, Y.O. Lee, Generative Oversampling Method for Imbalanced Data on Bearing Fault Detection and Diagnosis, Applied Sciences, 9 (4) (2019) 746. [156] Q.
Guo, Y. Li, Y. Song, D. Wang, W. Chen, Intelligent Fault Diagnosis Method Based on Full 1D Convolutional Generative Adversarial Network, IEEE Trans. Ind Inform, (2019). [157] X. Li, X. Li, Y. Qu, D. He, Gear pitting level diagnosis using vibration signals with an improved inception structure, Vibroengineering PROCEDIA, 20 (2018) 70-75. [158] Y. Chen, G. Peng, C. Xie, W. Zhang, C. Li, S. Liu, ACDIN: Bridging the gap between artificial and real bearing damages for bearing fault diagnosis, Neurocomputing, 294 (2018) 61-71. [159] Z. Zhu, G. Peng, Y. Chen, H. Gao, A convolutional neural network based on a capsule network with strong generalization for bearing fault diagnosis, Neurocomputing, 323 (2019) 62-75. [160] T. Chen, Z. Wang, X. Yang, K. Jiang, A deep capsule neural network with stochastic delta rule for bearing fault diagnosis on raw vibration signals, Measurement, 148 (2019) 106857. [161] Y. Yoo, J. Baek, A novel image feature for the remaining useful lifetime prediction of bearings based on continuous wavelet transform and convolutional neural network, Applied Sciences, 8 (7) (2018) 1102. [162] D. Belmiloud, T. Benkedjouh, M. Lachi, A. Laggoun, J.P. Dron, Deep convolutional neural networks for Bearings failure predictionand temperature correlation, Journal of Vibroengineering, 20 (8) (2018) 2878-2891. [163] A.Z. Hinchi, M. Tkiouat, Rolling element bearing remaining useful life estimation based on a convolutional long-short-term memory network, Procedia Computer Science, 127 (2018) 123-132. [164] L. Guo, Y. Lei, N. Li, T. Yan, N. Li, Machinery health indicator construction based on convolutional neural networks considering trend burr, Neurocomputing, 292 (2018) 142-150. [165] L. Ren, Y. Sun, H. Wang, L. Zhang, Prediction of bearing remaining useful life with deep convolution neural network, IEEE Access, 6 (2018) 13041-13049. [166] D. She, M. 
Jia, Wear indicator construction of rolling bearings based on multi-channel deep convolutional neural network with exponentially decaying learning rate, Measurement, 135 (2019) 368-375. [167] W. Mao, J. He, J. Tang, Y. Li, Predicting remaining useful life of rolling bearings based on deep feature representation and long short-term memory neural network, Adv Mech Eng, 10 (12) (2018) 754328416. [168] X. Li, W. Zhang, Q. Ding, Deep learning-based remaining useful life estimation of bearings using multi-scale feature extraction, Reliab. Eng. Syst. Saf., 182 (2019) 208-218. [169] J. Zhu, N. Chen, W. Peng, Estimation of Bearing Remaining Useful Life Based on Multiscale Convolutional Neural Network, IEEE Trans. Ind. Electron., 66 (4) (2019) 3208-3216. [170] Q. Wang, B. Zhao, H. Ma, J. Chang, G. Mao, A method for rapidly evaluating reliability and predicting remaining useful life using two-dimensional convolutional neural network with signal conversion, J. Mech Sci Technol, (2019) 1-11. [171] D. Zhang, E. Stewart, J. Ye, M. Entezami, C. Roberts, Roller Bearing Degradation Assessment Based on a Deep MLP Convolution Neural Network Considering Outlier Regions, IEEE Trans. Instrum. Meas., (2019). [172] B. Yang, R. Liu, E. Zio, Remaining Useful Life Prediction Based on a Double-Convolutional Neural Network Architecture, IEEE Trans. Ind. Electron., 66 (12) (2019) 9521-9530. [173] W. Peng, Z. Ye, N. Chen, Bayesian Deep Learning based Health Prognostics Towards Prognostics Uncertainty, IEEE Trans. Ind. Electron., (2019). [174] Q. Yao, T. Yang, Z. Liu, Z. Zheng, Remaining Useful Life Estimation by Empirical Mode Decomposition and Ensemble Deep Convolution Neural Networks, IEEE, 2019, pp. 1-6. [175] Z. Wang, H. Ma, H. Chen, B. Yan, X. Chu, Performance degradation assessment of rolling bearing based on convolutional neural network and deep long-short term memory network, Int. J. Prod. Res., (2019) 1-13. [176] R. Liu, B. Yang, A.G.
Hauptmann, Simultaneous Bearing Fault Recognition and Remaining Useful Life Prediction Using Joint Loss Convolutional Neural Network, IEEE Trans. Ind Inform, (2019). [177] B. Wang, Y. Lei, N. Li, T. Yan, Deep separable convolutional network for remaining useful life prediction of machinery, Mech. Syst. Signal Pr., 134 (2019) 106330. [178] B. Wang, Y. Lei, T. Yan, N. Li, L. Guo, Recurrent convolutional neural network: A new framework for remaining useful life prediction of machinery, Neurocomputing, (2019). [179] G.S. Babu, P. Zhao, X. Li, Deep convolutional neural network based regression approach for estimation of remaining useful life, Springer, 2016, pp. 214-228. [180] X. Li, Q. Ding, J. Sun, Remaining useful life estimation in prognostics using deep convolution neural networks, Reliab. Eng. Syst. Saf., 172 (2018) 1-11. [181] L. Wen, Y. Dong, L. Gao, A new ensemble residual convolutional neural network for remaining useful life estimation, Math. Biosci. Eng, 16 (2019) 862-880. [182] J. Li, X. Li, D. He, A Directed Acyclic Graph Network Combined With CNN and LSTM for Remaining Useful Life Prediction, IEEE Access, 7 (2019) 75464-75475. [183] A. Al-Dulaimi, S. Zabihi, A. Asif, A. Mohammadi, A multimodal and hybrid deep neural network model for Remaining Useful Life estimation, Comput. Ind., 108 (2019) 186-196. [184] A. Ruiz-Tagle Palazuelos, E.L. Droguett, R. Pascual, A novel deep capsule neural network for remaining useful life estimation, Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability, (2019) 1748006X-1986654X. [185] Z. Kong, Y. Cui, Z. Xia, H. Lv, Convolution and Long Short-Term Memory Hybrid Deep Neural Networks for Remaining Useful Life Prognostics, Applied Sciences, 9 (19) (2019) 4156. [186] R. Zhao, R. Yan, J. Wang, K. Mao, Learning to monitor machine health with convolutional bi-directional LSTM networks, Sensors-Basel, 17 (2) (2017) 273. [187] H. Qiao, T. Wang, P. Wang, S.
Qiao, L. Zhang, A Time-Distributed Spatiotemporal Feature Learning Method for Machine Health Monitoring with Multi-Sensor Time Series, Sensors-Basel, 18 (9) (2018) 2932. [188] F. Aghazadeh, A. Tahan, M. Thomas, Tool condition monitoring using spectral subtraction and convolutional neural networks in milling process, The International Journal of Advanced Manufacturing Technology, 98 (9-12) (2018) 3217-3227. [189] Z. Huang, J. Zhu, J. Lei, X. Li, F. Tian, Tool wear predicting based on multi-domain feature fusion by deep convolutional neural network in milling operations, J. Intell. Manuf., (2019). [190] J. Fu, J. Chu, P. Guo, Z. Chen, Condition Monitoring of Wind Turbine Gearbox Bearing Based on Deep Learning Model, IEEE Access, 7 (2019) 57078-57087. [191] Z. Kong, B. Tang, L. Deng, W. Liu, Y. Han, Condition monitoring of wind turbines based on spatio-temporal fusion of SCADA data by convolutional neural networks and gated recurrent units, Renew. Energ., 146 (2020) 760-768. [192] H. Luo, M. Huang, Z. Zhou, A dual-tree complex wavelet enhanced convolutional LSTM neural network for structural health monitoring of automotive suspension, Measurement, 137 (2019) 14-27. [193] P. Li, X. Jia, J. Feng, F. Zhu, M. Miller, L. Chen, J. Lee, A novel scalable method for machine degradation assessment using deep convolutional neural network, Measurement, (2019) 107106. [194] P. Cao, S. Zhang, J. Tang, Preprocessing-free gear fault diagnosis using small datasets with deep convolutional neural network-based transfer learning, IEEE Access, 6 (2018) 26241-26253. [195] M.J. Hasan, M. Sohaib, J. Kim, 1D CNN-Based Transfer Learning Model for Bearing Fault Diagnosis under Variable Working Conditions, Springer, 2018, pp. 13-23. [196] M. Hemmer, H. Van Khang, K. Robbersmyr, T. Waag, T. Meyer, Fault Classification of Axial and Radial
Roller Bearings Using Transfer Learning through a Pretrained Convolutional Neural Network, Designs, 2 (4) (2018) 56. [197] S. Zhong, S. Fu, L. Lin, A novel gas turbine fault diagnosis method based on transfer learning with CNN, Measurement, 137 (2019) 435-453. [198] L. Wen, X. Li, L. Gao, A transfer convolutional neural network for fault diagnosis based on ResNet-50, Neural Computing and Applications, 1-14. [199] L. Wen, L. Gao, Y. Dong, Z. Zhu, A negative correlation ensemble transfer learning method for fault diagnosis based on convolutional neural network, Math Biosci Eng, 16 (5) (2019) 3311-3330. [200] T. Han, C. Liu, W. Yang, D. Jiang, Learning transferable features in deep convolutional neural networks for diagnosing unseen machine conditions, ISA Trans., (2019). [201] S. Shao, S. McAleer, R. Yan, P. Baldi, Highly Accurate Machine Fault Diagnosis Using Deep Transfer Learning, IEEE Trans. Ind Inform, 15 (4) (2019) 2446-2455. [202] P. Ma, H. Zhang, W. Fan, C. Wang, G. Wen, X. Zhang, A novel bearing fault diagnosis method based on 2D image representation and transfer learning-convolutional neural network, Meas. Sci. Technol., 30 (5) (2019) 55402. [203] M.J. Hasan, M.M. Islam, J. Kim, Acoustic spectral imaging and transfer learning for reliable bearing fault diagnosis under variable speed conditions, Measurement, 138 (2019) 620-631. [204] Z. Chen, K. Gryllias, W. Li, Intelligent Fault Diagnosis for Rotary Machinery Using Transferable Convolutional Neural Network, IEEE Trans. Ind Inform, (2019). [205] K.M. Borgwardt, A. Gretton, M.J. Rasch, H. Kriegel, B. Schölkopf, A.J. Smola, Integrating structured biological data by kernel maximum mean discrepancy, Bioinformatics, 22 (14) (2006) e49-e57. [206] X. Jia, M. Zhao, Y. Di, Q. Yang, J. Lee, Assessment of Data Suitability for Machine Prognosis Using Maximum Mean Discrepancy, IEEE Trans. Ind. Electron., 65 (7) (2018) 5872-5881. [207] B. Sun, K. Saenko. Deep coral: Correlation alignment for deep domain adaptation.
Springer, 2016, pp. 443-450. [208] B. Zhang, W. Li, X. Li, S. Ng, Intelligent fault diagnosis under varying working conditions based on domain adaptive convolutional neural networks, IEEE Access, 6 (2018) 66367-66384. [209] X. Li, W. Zhang, Q. Ding, A robust intelligent fault diagnosis method for rolling element bearings based on deep distance metric learning, Neurocomputing, 310 (2018) 77-95. [210] X. Li, W. Zhang, Q. Ding, J. Sun, Multi-Layer domain adaptation method for rolling bearing fault diagnosis, Signal Process., 157 (2019) 180-197. [211] D. Xiao, Y. Huang, L. Zhao, C. Qin, H. Shi, C. Liu, Domain Adaptive Motor Fault Diagnosis Using Deep Transfer Learning, IEEE Access, 7 (2019) 80937-80949. [212] B. Yang, Y. Lei, F. Jia, S. Xing, An intelligent fault diagnosis approach based on transfer learning from laboratory bearings to locomotive bearings, Mech. Syst. Signal Pr., 122 (2019) 692-706. [213] T. Han, C. Liu, W. Yang, D. Jiang, Deep transfer network with joint distribution adaptation: a new intelligent fault diagnosis framework for industry application, ISA Trans., (2019). [214] K. Xu, S. Li, J. Wang, Z. An, W. Qian, H. Ma, A novel convolutional transfer feature discrimination network for imbalanced fault diagnosis under variable rotational speed, Meas. Sci. Technol., (2019). [215] J. Zhu, N. Chen, C. Shen, A New Deep Transfer Learning Method for Bearing Fault Diagnosis under Different Working Conditions, IEEE Sens. J., (2019). [216] X. Li, W. Zhang, Q. Ding, Cross-Domain Fault Diagnosis of Rolling Element Bearings Using Deep Generative Neural Networks, IEEE Trans. Ind. Electron., 66 (7) (2019) 5525-5534. [217] K. Xu, S. Li, X. Jiang, Z. An, J. Wang, T. Yu, A renewable fusion fault diagnosis network for the variable speed conditions under unbalanced samples, Neurocomputing, (2019). [218] Y. Ganin, V. Lempitsky, Unsupervised domain adaptation by backpropagation, arXiv preprint arXiv:1409.7495, (2014). [219] T. Han, C. Liu, W.
Yang, D. Jiang, A novel adversarial learning framework in deep convolutional neural network for intelligent diagnosis of mechanical faults, Knowl-Based Syst., 165 (2019) 474-487. [220] L. Guo, Y. Lei, S. Xing, T. Yan, N. Li, Deep Convolutional Transfer Learning Network: A New Method for Intelligent Fault Diagnosis of Machines With Unlabeled Data, IEEE Trans. Ind. Electron., 66 (9) (2019) 7316-7325. [221] M. Zhang, D. Wang, W. Lu, J. Yang, Z. Li, B. Liang, A Deep Transfer Model With Wasserstein Distance Guided Multi-Adversarial Networks for Bearing Fault Diagnosis Under Different Working Conditions, IEEE Access, 7 (2019) 65303-65318. [222] X. Wang, F. Liu, Triplet Loss Guided Adversarial Domain Adaptation for Bearing Fault Diagnosis, Sensors-Basel, 20 (1) (2020) 320. [223] Y. Xie, T. Zhang, A Transfer Learning Strategy for Rotation Machinery Fault Diagnosis based on Cycle-Consistent Generative Adversarial Networks, IEEE, 2018, pp. 1309-1313. [224] J. Jiao, M. Zhao, J. Lin, Unsupervised Adversarial Adaptation Network for Intelligent Fault Diagnosis, IEEE Trans. Ind. Electron., (2019). [225] J. Jiao, M. Zhao, J. Lin, C. Ding, Classifier Inconsistency based Domain Adaptation Network for Partial Transfer Intelligent Diagnosis, IEEE Trans. Ind Inform, 1. [226] W. Zhang, G. Peng, C. Li, Y. Chen, Z. Zhang, A new deep learning model for fault diagnosis with good anti-noise and domain adaptation ability on raw vibration signals, Sensors-Basel, 17 (2) (2017) 425. [227] L. Duan, X. Wang, M. Xie, Z. Yuan, J. Wang, Auxiliary-model-based domain adaptation for reciprocating compressor diagnosis under variable conditions, Journal of Intelligent & Fuzzy Systems, 34 (6) (2018) 3595-3604. [228] M.J. Hasan, J. Kim, Bearing Fault Diagnosis under Variable Rotational Speeds Using Stockwell Transform-Based Vibration Imaging and Transfer Learning, Applied Sciences, 8 (12) (2018) 2357. [229] D. Xiao, Y. Huang, C. Qin, Z. Liu, Y. Li, C.
Liu, Transfer learning with convolutional neural networks for small sample size problem in machinery fault diagnosis, Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, (2019) 62159741.