On the topology effects in wireless sensor networks based prognostics and health management
Ahmad Farhat, Abdallah Makhoul, Christophe Guyeux, Rami Tawil, Ali Jaber, Abbas Hijazi
Ahmad Farhat, Abdallah Makhoul, and Christophe Guyeux
FEMTO-ST Laboratory, DISC department, University of Franche-Comté, Rue Engel-Gros, 90016 Belfort, France
Rami Tawil, Ali Jaber, and Abbas Hijazi
Department of Computer Science, Lebanese University, Beirut, Lebanon
Abstract: In this work, we consider the use of wireless sensor networks (WSNs) to monitor an area of interest in order to diagnose its state in real time. Each sensor node forwards information about relevant features towards the sink, where the data is processed. Energy conservation is nevertheless a key issue in the design of such networks: once a sensor exhausts its resources, it drops out of the network, which leads to broken links and data loss. It is therefore important to keep the network running for as long as possible by preserving the energy held by the nodes. Indeed, maintaining the quality of service (QoS) of a wireless sensor network over a long period is essential to ensure accurate data and, in turn, an accurate diagnosis of the area. Moreover, packet transmission is the phase that consumes the most energy compared to the other activities in the network. The network topology therefore has an important impact on energy efficiency, and thus on data and diagnosis accuracy. In this paper, we study and compare four network topologies (distributed, hierarchical, centralized, and decentralized) and show their impact on the resulting diagnostic estimates. We use six diagnostic algorithms to evaluate prognostics and health management as the WSN topology varies.
I. INTRODUCTION
Due to the increasing demand for reliability and quality of service, modern industrial plants are becoming ever more complex. As a result, the costs of failure and system downtime keep rising. Monitoring these areas is therefore essential to evaluate their health and diagnose them at any time, and then to plan maintenance activities that avoid disastrous failures. Prognostics and Health Management (PHM) is a process that allows an advanced system to automatically test the area, diagnose it, isolate the failure, and try to predict the Remaining Useful Life (RUL) of the area before failure takes place [20]. Maintenance can then be scheduled and a shutdown of the area prevented. It is worth mentioning that if the prediction model and the provided measurements are not accurate, there is a high chance that the maintenance activity will not be done on time.
Health assessment and diagnostics of the area, followed by the prediction of the RUL, require online measurements of the operating conditions of the area of interest. This information is usually gathered through a number of sensor nodes. In this study, we consider the case where sensors communicate their information within a Wireless Sensor Network (WSN). WSNs differ from traditional computer networks in that they are composed of a large number of sensor nodes with very limited and non-renewable energy. Most of the time, they are deployed to capture the occurrence of possible events in hostile and inaccessible areas [21]. A classical assumption in PHM is that monitoring data is available and complete, which is not always true.
Due to the nature of communication in such a network and to the characteristics of its devices, a WSN is at risk of failure, which affects the accuracy and completeness of the captured data, and consequently PHM. Therefore, one of our objectives is to maintain the Quality of Service (QoS) of the WSN for as long as possible to ensure the accuracy of the data from the monitored area. If this issue is not taken into consideration while building a PHM process over a WSN, the resulting diagnostics or prognostics may not be reliable.
Among the many important factors and parameters of a WSN (lifetime, security, data aggregation, packet transfer, density, etc.), the topology has an important impact on the accuracy of data and hence on PHM. The variability of network topologies due to node failures, the introduction of additional nodes, and variations in sensor location requires the underlying network structures and operations to be adaptable. In addition, in order to save energy, sensor nodes may be activated or deactivated by scheduling mechanisms so as to keep coverage as dense as possible and achieve fault tolerance. The diagnostic processes must thus be compatible with these strategies, and with a coverage whose quality changes over time.
In this paper, we study WSN topologies and their relation with prognostics and health management. We focus on the impact of topologies on the accuracy of the data captured by the wireless sensor network, and on its consequences for the diagnosis of the status of the monitored area. Our objective is to show that usual diagnostic processes, which perform well on classical data provided by a well-deployed wired network of sensors, may suffer a dramatic decrease in performance when data are obtained via a WSN, due to the diversity and variation of topologies.
To do so, we use six machine learning algorithms to diagnose the state of the area: Support Vector Machines (SVM), Naive Bayes (NB), Random Forests (RF), Gradient Tree Boosting (GTB), Tree-Based Feature Selection (TBFS), and Nearest Neighbors (NN). In addition, we study the four types of topology most used in WSNs: distributed, hierarchical, centralized, and decentralized.
The remainder of this article is structured as follows. Section II presents an overview of WSN topologies. In Section III, we detail the links that can be established between PHM and WSN topologies. We simulate and describe four different WSN topologies to show their impact on diagnostics, and the results of these simulations are given in Section IV. The article ends with a conclusion section, where the contribution is summarized and intended future work is outlined.
II. TOPOLOGIES IN WIRELESS SENSOR NETWORKS
In wireless sensor networks, connectivity is established via radio transmission between sensors. For two sensors to be able to communicate, they must be within some critical range of each other, as transmission capability is finite. A network is connected if any node can communicate with any other node, possibly using intermediate nodes as relays. The variability of network topologies (connectivity) due to node failures, the introduction of additional nodes, and variations in sensor location requires the underlying network structures and operations to be adaptable. Since sensors may be spread in an arbitrary manner, a fundamental issue that arises in sensor networks, in addition to connectivity, is coverage. In order to ensure connectivity and data accuracy in addition to coverage, WSNs use redundant coverage, where multiple sensor nodes cover the same physical location. Coverage may therefore vary across the network. One way to save energy in the network relies on scheduling mechanisms, whose objective is to activate or deactivate redundant nodes while keeping coverage as dense as possible and ensuring connectivity.
Another way to save energy in sensor networks is to reduce the amount of data collected and transmitted via the network. Data gathering in WSNs can be either periodic or event-driven [7]. In periodic applications [12], [13], data is gathered periodically, while in event-driven applications gathering depends on the occurrence of some events. In both cases, the goal of aggregation is to reduce energy dissipation by holding packets for as long as possible in intermediate nodes; the packets are then combined and forwarded in the network. Clearly, a decrease in energy consumption leads to an increase in the overall delay, and vice versa.
A reliable solution would aim at finding an acceptable trade-off between energy consumption and delay in WSNs [6], [10].
WSNs can be either heterogeneous or homogeneous [11]. In the latter, all nodes have the same role and characteristics. In the former, nodes have different roles: some nodes simply sense and forward information while others aggregate data, manage their area, perform computations, etc. Consequently, some of the nodes can be equipped with more energy, a longer radio range, etc. Several WSN topologies have been used in existing monitoring applications, but all of them revolve around four types (or models) of topology: distributed, hierarchical, centralized, or decentralized.
• Distributed topology: in distributed topologies, the network is not managed by a central node (or a region of it). The network consists of a collection of nodes having equal roles, so no aspect of hierarchy is considered. No prior infrastructure is imposed before the network starts running; each node discovers its surrounding area and decides which node (or nodes) to communicate with. This decision usually relies on the radio range and the transfer distance. Distributed topologies make the network easy to maintain: if a node fails, its neighbors, within their sensing range, establish new links with other nodes, and the network continues to work normally.
• Hierarchical topology: sensor nodes can be organized in several levels, forming a hierarchical topology (or tree topology). The top level is represented by the root, and there is no level above it. Sensor nodes in two adjacent levels are connected in an end-to-end manner. The hierarchical model can be seen as three different layers: (1) the core layer (the root), which is enhanced for availability and performance, (2) the distribution layer, which implements policies and forwards messages, and (3) the access layer (the leaf nodes), which represents the access point to the network. Scalability is the main advantage of hierarchical WSNs. The network is more manageable, and the task of isolating and detecting faults is simplified by the presence of different levels.
• Centralized topology: it is one of the easiest topologies to design and implement (also called star topology). All the sensor nodes have a simple task, which is sensing new information and forwarding it to a central node where all the data processing is carried out. One of the major problems of this topology is that it presents a single point of failure: the whole network becomes paralyzed if a problem occurs at the central node, since data packets can be neither forwarded nor processed when a new event is detected.
• Decentralized topology: decentralized topologies are considered a combination of the distributed and centralized topologies. The network is divided into regions (or clusters), each locally managed by a central node called the Cluster Head (CH). This topology offers a reasonable compromise between energy consumption and Quality of Service (QoS). It reduces congestion, and the network no longer has a single point of failure.
III. WSN TOPOLOGIES IMPACT ON PROGNOSTICS AND HEALTH MANAGEMENT
Maintenance is an important activity in industry, performed either to restore a machine or component or to prevent it from breaking down. It aims at increasing system availability and readiness and at enhancing safety. Different strategies have evolved through time to bring maintenance to its current state: condition-based and predictive maintenance. This evolution was driven by the increasing demand for reliability in industry. PHM is a tool to predict the Remaining Useful Life (RUL) of engineering assets and is the key process of condition-based and predictive maintenance. Nowadays, industrial machines are required to avoid shutdowns while offering safety and reliability [18]. Research in the PHM field has therefore received a great deal of attention. Prognostic models are developed in an attempt to predict the RUL of machinery (or of a monitored area) before failure takes place. If the prediction model and the provided measurements are not accurate, the maintenance activity may be performed either too soon or too late.
Fig. 1: The process of PHM.
Condition-Based Maintenance (CBM) was proposed and developed in the early nineties [8], and it is based on real-time observations. It is an online approach that assesses a machine's health through condition measurements. Like any maintenance strategy, CBM aims to increase system reliability and availability while reducing maintenance costs. Its particular benefits include avoiding unnecessary maintenance tasks and costs, as well as not interrupting normal machine operations [8]. CBM decreases the number of maintenance operations and reduces the influence of human error. A newer strategy, Predictive Maintenance (PM), has recently emerged. It predicts the future health of the system and defines the needed maintenance activities accordingly, based on the current condition. Shifting from traditional maintenance strategies to CBM and PM requires extra tasks.
These tasks encompass data analysis and modeling, system surveillance, and decision-making support. This scientific approach is called PHM, and it is the core activity of CBM and PM. The steps of PHM are data acquisition, data processing, health assessment, diagnostics, prognostics, and decision-making support [9], as described in Figure 1.
The aim of diagnostics is to specify and quantify an actual failure, while the aim of prognostics is to anticipate failures. Prognostics estimate the RUL by considering past events in addition to the machine's current state and operating conditions [9]. This estimation is made by studying the evolution of continuous measurements of the parameters that need to be tracked over time to assess the machine's state. These parameters can be temperature, humidity, vibration, pressure, and so on. A fixed threshold is set for each monitored parameter. Once this threshold is reached, an alarm is raised, indicating that a symptom of system deterioration has been detected. A diagnosis of the state of the system is then made, and the RUL is computed with an associated confidence limit. The uncertainty of RUL predictions has two sources: the threshold value of the monitored parameter and the RUL prediction itself. The necessary prerequisites for reliable prognostics are proposed in [15].
Reliability is necessary in industry (and in monitored areas in general). In past years, research in prognostics has produced a variety of tools and techniques that allow plants to survey their systems, anticipate failures, and schedule maintenance activities. WSNs are mainly designed for surveillance purposes. They can be deployed in many fields such as the military, automotive, agriculture, medicine, and so on [11]. Recently, industry has given a great deal of attention to WSN applications.
These sensor networks are used to monitor machinery for maintenance scheduling. The sensors deployed to survey the system or component provide the data used to assess its health, diagnose it, and estimate the RUL. However, inaccurate data makes any prediction based on it irrelevant. WSN topologies have an important impact on the accuracy of data, and therefore on PHM. Topologies thus need to be studied and chosen before the network starts running. The aim of this study is to reveal the impact of WSN topologies on the accuracy of the data captured from the monitored area, and therefore on PHM. The accuracy of the data is related to the topology through several important factors and parameters of the WSN. Lifetime is one of the most important of these factors, because the costs related to energy consumption vary with the topology (data aggregation, packet transfer distance, frequency, etc.). Security is another important factor related to topologies, because the role and characteristics of nodes differ according to the topology (some nodes simply sense and forward information while others aggregate data, manage their area, perform computations, etc.). Several other factors play an important role in the accuracy of data in WSNs as applications (topologies) vary, such as density, node batteries, data aggregation, etc. It is worth noting that data aggregation is important for increasing the lifetime of the network, as mentioned before, but on the other hand it reduces data accuracy, so the error rate of the diagnosis is closely related to the aggregation method. Since good predictions rely on real data, the first step is clearly to ensure a reliable source of information.
IV. NUMERICAL STUDY
A. Experimental protocol
1) WSN simulation:
In this paper, in order to show the impact of WSN topologies on PHM, we used three types of sensing fields: temperature, pressure, and humidity. We therefore considered a network of sensor nodes, sensing respectively the levels of temperature ( sensors), pressure ( ), and humidity ( sensors). Each sensor node has a battery of u (u is the battery unit), and captures specific data depending on the operating age t. We consider that no correlation is introduced between the different features:
• Under normal conditions, temperature sensors follow a Gaussian law of parameters (20 × (1 + 0. t), ); in case of a malfunction of the area in the range of the sensor, these parameters are mapped to (350, ). Finally, these sensors return the value when they break down.
• The pressure sensors produce data following a Gaussian law of parameters (5 × (1 + 0. t), . ) when they are sensing a well-functioning area. The parameters change to (20, . ) in case of area failure in the location where the sensor is placed, whereas the pressure sensors return when they are broken down.
• The Gaussian parameters are (52. × (1 + 0. t), . ) when both the area and the humidity sensors are in normal conditions. These parameters are set to (80, ) in case of area failure in the range of the sensor, whereas malfunctioning humidity sensors produce the value .
Each sensor follows a Poisson process (P_p) of parameter (200 × (1 − . t) + 0. , ) to determine whether a breakdown occurs in the location where the sensor is placed. All of these sensors then execute Algorithm 1.
Algorithm 1 Sensor algorithm
  if P_p < then
    the area and the sensors are in normal conditions
  else
    if ≤ P_p < then
      the area is in failure (in the range of this sensor)
    else
      the sensor is broken down
    end if
  end if
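As an illustration, the decision rule of Algorithm 1 and the generation of one reading can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the simulator used in this study: the names `State`, `classify_state`, and `sample_reading` are hypothetical, the two thresholds (180 and 195) and the standard deviations are chosen only for illustration, and the second Gaussian parameter is assumed to be a standard deviation. Only the temperature means 20 and 350 are taken from the text. The drawn value `p` is supplied by the caller.

```python
import random
from enum import Enum


class State(Enum):
    NORMAL = "normal"          # area and sensor both in normal conditions
    AREA_FAILURE = "failure"   # area failing within the sensor's range
    SENSOR_BROKEN = "broken"   # the sensor itself has broken down


def classify_state(p, normal_below, failure_below):
    """Apply the two-threshold rule of Algorithm 1 to a drawn value p."""
    if p < normal_below:
        return State.NORMAL
    if p < failure_below:      # normal_below <= p < failure_below
        return State.AREA_FAILURE
    return State.SENSOR_BROKEN


def sample_reading(state, rng, normal_params, failure_params, broken_value):
    """Return one reading; *_params are (mean, std) pairs of a Gaussian."""
    if state is State.SENSOR_BROKEN:
        return broken_value
    mean, std = normal_params if state is State.NORMAL else failure_params
    return rng.gauss(mean, std)


# Hypothetical thresholds and standard deviations, for illustration only.
rng = random.Random(0)
state = classify_state(p=150, normal_below=180, failure_below=195)
reading = sample_reading(state, rng, (20.0, 1.0), (350.0, 20.0), broken_value=0.0)
```

Keeping the state decision separate from the sampling step makes it possible to reuse the same rule for the three sensor categories by changing only the Gaussian parameters.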
Each category of sensors has its own constant threshold, depending on the abnormality of the sensed data. If the data captured by a sensor of a given category exceeds the threshold, a symptom of system deterioration has been detected. A diagnostic study then aims at specifying and quantifying an actual failure (whether the area has failed or not). In this work, we used six algorithms for diagnosis, which are presented in Section IV-A2. We consider the following threshold values: degrees for temperature, bars for pressure, and percent for humidity.
The deployment strategy of sensors (manual or random) [17], the adjustment of the coverage range of sensors [23], and the density of the WSN [1] have an important impact on the accuracy of the data captured by the WSN and used to diagnose the state of the area. In order to study the impact of WSN topologies on diagnostics, we consider the following in our simulation:
• Most of the time, the area to be monitored is hazardous and hard to access because of its geography, as when monitoring forests, oceans, military zones, etc. Therefore, in this study, we used random deployment for area monitoring.
• We suppose that the region to be monitored is a rectangle of area A = L × W, where L and W are the length and width of the monitored region, respectively. The area of the coverage range of a sensor is mostly related to the area of the region to be monitored. We therefore consider the area of the coverage range to be a fixed fraction of the total area of the region. Consequently, the coverage radius is R = 1/ × √(A/π).
• We suppose that the density of sensors in the monitored area is constant ( sensors), and that the area is fully covered by these sensors at time t = 0 (when the WSN starts working).
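The relation between the coverage radius and the monitored area follows from equating the disc area πR² to the chosen fraction of A, which gives R = √(fraction × A/π). A minimal Python sketch of this computation is given below; the function name `coverage_radius`, the region size (100 × 50), and the fraction 0.04 are hypothetical values used only for illustration.

```python
import math


def coverage_radius(length, width, fraction):
    """Radius of a disc whose area is `fraction` of the L x W rectangle."""
    area = length * width                      # A = L x W
    return math.sqrt(fraction * area / math.pi)


# Hypothetical region and fraction, for illustration only.
radius = coverage_radius(100.0, 50.0, fraction=0.04)
```

With a fraction of 0.04 (that is, 1/25), the radius reduces to (1/5) × √(A/π), which has the same form as the expression for R given above.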
2) Machine learning algorithms:
Research in PHM is very broad, and authors working in this domain use several algorithms to diagnose the state of a system. In the literature, these are called machine learning algorithms. In machine learning, classification refers to identifying the class to which a new observation belongs, on the basis of a training set and of quantifiable observations known as properties. Machine learning starts from a detailed study of the system, from which an algorithm is built. Such an algorithm operates by building a model from example inputs, so that it can then diagnose, or take decisions on, new data.
We have chosen six machine learning algorithms (to diagnose the system) that have been used by several authors in the literature to evaluate PHM. Our study focuses on evaluating these six diagnostic algorithms as the topology of the WSN varies. These algorithms are: Support Vector Machine (SVM) [5], Naive Bayes (NB) [16], [22], Random Forests (RF) [2], [4], Gradient Tree Boosting (GTB) [3], Tree-Based Feature Selection (TBFS) [19], and Nearest Neighbors (NN) [14].
Finally, we need a large and reliable data set to train these algorithms, so that we can later diagnose the system (area monitoring) from the new data captured by the WSN. To this end, we take data consisting of N lines, each line being composed of T temperature data, P pressure data, and H humidity data. All of these data are generated in the way described in Section IV-A1 (the same type of data that will be captured by the WSN during area monitoring).
B. Simulation results
In order to illustrate the impact of topologies on the quality of data and on the diagnosis of the state of the monitored area, we simulated four different topologies: decentralized, distributed, hierarchical, and centralized.
1) Decentralized topology:
In order to study the impact of decentralized topology on the diagnosis of the state of the area, we consider that the nodes are grouped into clusters. Each cluster is managed by a leader called cluster head (CH) or aggregator, which is equipped with batteries of u. The sensors capture data from the area and send it to the CH; the latter aggregates the data and sends it to another CH or to the sink. In this study, we consider that data aggregation at the CH happens as follows:
$$\frac{1}{S} \sum_{i=0}^{S-1} D_i^c \qquad (1)$$
where D is the data sent from the sensor to the aggregator, c is the type of sensor (temperature, pressure, or humidity), and S is the number of data aggregated each time (for example, every data of a certain type that are sent to the CH from sensors undergo aggregation).
Fig. 2: Scenario of decentralized topology. (a) Sensors network at time t = 0. (b) Sensors network at time t = x.
The scenario of this topology is shown in Figure 2: the deployment of sensors is random, and the assignment of sensors to CHs follows the K-means clustering method. Each sensor sends data to its CH. The latter, after aggregating these data, sends the result to the closest CH, on the condition that this CH is closer to the sink. If no CH meets this requirement, it sends the result directly to the sink, as shown in Figure 2a. After time t = x, CHs and sensors may become inactive for several reasons, most importantly energy consumption or activity scheduling. If a CH becomes inactive, the sensors of its cluster join the closest other clusters. In addition, CHs communicating with this inactive CH change their routes to the closest active CH.
Note that the black circles are the active sensors, the white circles are the inactive sensors, the black hexagons are the active CHs, the white hexagons are the inactive CHs, and the crossed circle is the sink.
In this study, we supposed that the area is fully covered by the sensors after they have been randomly deployed. As mentioned before, the topology may be dynamic: in the long term, sensors or CHs will die (because of energy consumption) or break down (due to various causes such as the operating age).
Fig. 3: Error rate in diagnostics if the topology is decentralized with the variation of the time.
Figure 3 shows the variation of the error rate of the six considered algorithms with time t (operating age) when the topology is decentralized. Each point in this figure is an average of the error rates of a given algorithm over simulations (for a given t). As shown in the figure, from t = 0 to t = 60 (0 ≤ t ≤ 60), each algorithm has a specific error interval (in %): [24, ] for SVM, [14, ] for NB, [16, ] for RF, [8, ] for GTB, [12, ] for TBFS, and [22, ] for NN. After that (t > 60), the error rate of each algorithm increases significantly beyond these intervals, reaching at t = 70: 44% for SVM, 24% for NB, 22% for RF, 36% for GTB, 23% for TBFS, and 40% for NN. This shows that at this time the sensors and CHs of the WSN are dying or breaking down, which leads to uncovered places in the area (coverage holes) and therefore to incomplete data for diagnostics. Beyond t = 60, the error rate of the algorithms keeps increasing with time, reaching at t = 100: 91% for SVM, 89% for NB, 88% for RF, 90% for GTB, 89% for TBFS, and 90% for NN (approximately the whole network is inactive). Note that the error rate in this simulation (decentralized topology) is related to the method of data aggregation.
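The aggregation rule of Eq. (1) and the CH-to-CH forwarding rule of the decentralized scenario can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names `aggregate` and `next_hop` are hypothetical, an incomplete final batch is simply dropped, and the routing condition is interpreted as "the closest active CH that is nearer to the sink than the sender".

```python
import math


def aggregate(readings, batch_size):
    """Average consecutive batches of `batch_size` readings, as in Eq. (1)."""
    full = len(readings) - len(readings) % batch_size  # drop incomplete tail
    return [
        sum(readings[i:i + batch_size]) / batch_size
        for i in range(0, full, batch_size)
    ]


def next_hop(current, active_heads, sink):
    """Pick the closest active CH that is nearer to the sink, else the sink."""
    own = math.dist(current, sink)
    closer = [
        h for h in active_heads
        if h != current and math.dist(h, sink) < own
    ]
    if not closer:
        return sink                      # no better CH: send directly
    return min(closer, key=lambda h: math.dist(current, h))
```

For example, `aggregate([20, 22, 350, 24], 2)` yields `[21.0, 187.0]`: the single abnormal temperature reading is diluted by the average, which is consistent with the observation above that the error rate depends on the aggregation method.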
2) Distributed topology:
The scenario of distributed topology is shown in Figure 4, where all sensor nodes in the network have the same role and importance, i.e., there is no aggregation role, no clusters, and no CHs. Data packets are forwarded in a hop-by-hop manner. Each sensor is able to discover its neighbors within a radio range of R_c (R_c is the coverage range). We assume that every node can access information about its neighbors, including their locations. Each node chooses a neighbor to communicate with, namely the one closest to the sink within the sender's radio range. If the sensor itself is the closest to the sink, it sends its data directly to the sink.
As in the previous scenario, after a certain time t = x the sensors may become inactive, and the routes change according to the neighbors closest to the sink, as shown in Figure 4b. Note that the black, white, and crossed circles represent the active sensors, the inactive sensors, and the sink, respectively.
Fig. 4: Scenario of distributed topology. (a) Sensors network at time t = 0. (b) Sensors network at time t = x.
Fig. 5: Error rate in diagnostics if the topology is distributed with the variation of the time.
Figure 5 presents the variation of the error rate of the six considered algorithms with time t (operating age) in the case of distributed topology. Each point in this figure is an average of the error rates of a given algorithm over simulations (for a given t). As shown in the figure, from t = 0 to t = 40 (0 ≤ t ≤ 40), each algorithm has a specific error interval (in %): [14, ] for SVM, [7, ] for NB, [7, ] for RF, [2, ] for GTB, [4, ] for TBFS, and [12, ] for NN. After that (t > 40), the error rate of each algorithm increases significantly beyond these intervals, reaching at t = 50: 40% for SVM, 20% for NB, 16% for RF, 30% for GTB, 18% for TBFS, and 36% for NN. This shows that at this time the sensors of the WSN are dying or breaking down, which leads to uncovered places in the area (coverage holes) and therefore to incomplete data for diagnostics. Beyond t = 40, the error rate of the algorithms keeps increasing with time, reaching at t = 90: 90% for SVM, 89% for NB, 88% for RF, 90% for GTB, 89% for TBFS, and 91% for NN (approximately the whole network is inactive).
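The hop-by-hop forwarding rule of the distributed scenario can be sketched in Python as follows. This is a minimal sketch, not the simulator used in this study: the function name `choose_next_hop` and its parameters are hypothetical, nodes are represented by 2D coordinates, and returning `None` stands for "send directly to the sink".

```python
import math


def choose_next_hop(sender, nodes, active, sink, radio_range):
    """Return the index of the next hop, or None to send directly to the sink.

    The next hop is the active neighbor within `radio_range` of the sender
    that is closest to the sink, provided it is closer than the sender itself.
    """
    best, best_dist = None, math.dist(nodes[sender], sink)
    for i, position in enumerate(nodes):
        if i == sender or not active[i]:
            continue
        if math.dist(nodes[sender], position) > radio_range:
            continue                     # outside the sender's radio range R_c
        d = math.dist(position, sink)
        if d < best_dist:
            best, best_dist = i, d
    return best
```

Because the choice is recomputed from the current `active` flags, a route adapts as soon as a neighbor becomes inactive, which mirrors the change of routes between Figures 4a and 4b.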
3) Hierarchical topology:
As mentioned in Section II, sensor nodes can be organized in several levels, forming a hierarchical topology. The sensor nodes are organized in a tree hierarchy from the sink (the root of the tree) down to the sensor nodes having no descendants (leaf nodes). In order to study the impact of hierarchical topology on diagnosis, and to compare this topology to the others, we took a WSN composed of sensors (leaf nodes), each sensor having u of battery. These sensors form the access layer of this topology (third layer). We considered nodes playing the role of the second layer of the topology (the distribution layer), which implements policies and forwards messages. These nodes are responsible for building the links from the leaf nodes towards the sink (core layer). Each of these nodes has u of battery supply; by contrast, in the decentralized topology, CHs are given an extra supply, so their batteries last longer and more data is available for diagnostics.
Fig. 6: Scenario of hierarchical topology. (a) Sensors network at time t = 0. (b) Sensors network at time t = x.
The scenario of this topology is shown in Figure 6. After a certain time t = x, the sensors in the access layer or distribution layer may become inactive for several reasons, most importantly energy consumption. If a parent node (in the distribution layer) becomes inactive, its children can no longer communicate with the other nodes of the network. In this case, in order to keep connectivity, each such sensor communicates with the closest active node in the distribution layer, as shown in Figure 6b.
Fig. 7: Error rate in diagnostics if the topology is hierarchical with the variation of the time.
Figure 7 shows the variation of the error rate of the six considered algorithms with time t, under the same conditions as the previous studies, in the case where the topology is hierarchical.
Each point in this figure is an average of the error rates of a given algorithm over simulations (for a given t). As shown in the figure, from t = 0 to t = 20 (0 ≤ t ≤ 20), each algorithm has a specific error interval (in %): [15, ] for SVM, [8, ] for NB, [8, ] for RF, [4, ] for GTB, [4, ] for TBFS, and [12, ] for NN. These intervals are approximately the same as for the distributed topology (while the whole network is active), because in these two topologies there is no data aggregation, unlike in the decentralized topology. After that (t > 20), the error rate of each algorithm increases significantly beyond these intervals, reaching at t = 30: 40% for SVM, 20% for NB, 16% for RF, 30% for GTB, 18% for TBFS, and 36% for NN. This shows that at this time the sensors of the WSN (in the access or distribution layer) are dying or breaking down, which leads to uncovered places in the area (coverage holes) and therefore to incomplete data for diagnostics. Beyond t = 20, the error rate of the algorithms keeps increasing with time, reaching at t = 100: 91% for SVM, 90% for NB, 89% for RF, 90% for GTB, 89% for TBFS, and 90% for NN (approximately the whole network is inactive).
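The reconnection rule of the hierarchical scenario, in which each leaf sensor communicates with the closest active node of the distribution layer, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `reattach` is hypothetical, nodes are represented by 2D coordinates, and a leaf with no active parent left is mapped to `None`.

```python
import math


def reattach(children, parents, parent_active):
    """Map each child index to the index of its closest active parent.

    Children with no active parent left are mapped to None.
    """
    alive = [j for j, ok in enumerate(parent_active) if ok]
    assignment = {}
    for i, child in enumerate(children):
        if alive:
            assignment[i] = min(
                alive, key=lambda j: math.dist(child, parents[j])
            )
        else:
            assignment[i] = None         # the leaf is cut off from the sink
    return assignment
```

Calling the function again after each change of `parent_active` reproduces the transition from Figure 6a to Figure 6b; leaves mapped to `None` correspond to data that no longer reaches the sink.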
4) Centralized topology:
In the centralized topology, all the sensor nodes have the simple task of sensing new information and forwarding it to a central node where all the data processing is done, as shown in Figure 8a. In this topology, we can notice that after t = x, the nodes that exhaust their energy first are the farthest from the sink. This is due to the long distance of packet transfer, as shown in Figure 8b. The black and white circles are the active and inactive sensors respectively, and the crossed circle is the sink.

(a) Sensors network at time t = 0. (b) Sensors network at time t = x.
Fig. 8: Scenario of centralized topology.

Figure 9 shows the variation of the error rate for the six considered algorithms under the same conditions as in the previous studies. Each point in this figure is an average of the error rates of a given algorithm on simulations (for a certain t). As shown in the figure, at t = 0 (when the WSN starts working) each algorithm has a specific error rate (in %) as follows: 18% for SVM, 10% for NB, 12% for RF, for GTB, for TBFS, and 16% for NN. As the network operates, the sensors located farthest from the sink start dying first, because they consume more energy than the others due to the long distance of packet transfer. As a result, at t = 10 the error rate has increased noticeably, to 52% with SVM, 35% with NB, 28% with RF, 44% with GTB, 32% with TBFS, and 48% with NN. This indicates that the data has become incomplete for diagnostics, because the regions far from the sink are no longer covered by sensors. Then, as the WSN goes beyond t = 10 (t > 10), the error rate of the algorithms keeps increasing with time, reaching at t = 90: 91% for SVM, 89% for NB, 88% for RF, 90% for GTB, 89% for TBFS, and 90% for NN (approximately the whole network is inactive).

Fig. 9: Error rate in diagnostics if the topology is centralized with the variation of the time.
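To illustrate why the nodes farthest from the sink are depleted first in the centralized topology, the sketch below uses a first-order radio energy model in which the transmission cost grows with the square of the distance. This model and all numeric constants (E_ELEC, EPS_AMP, battery capacity, packet size, distances) are assumptions chosen for illustration; the paper does not state which energy model or values its simulations use.

```python
# Assumed first-order radio model: E_tx = E_ELEC * k + EPS_AMP * k * d^2
E_ELEC = 50e-9     # J/bit spent by the transmitter electronics (assumed)
EPS_AMP = 100e-12  # J/bit/m^2 spent by the amplifier (assumed)

def tx_energy(bits, distance_m):
    """Energy in joules to send `bits` over `distance_m` metres."""
    return E_ELEC * bits + EPS_AMP * bits * distance_m ** 2

def packets_until_depleted(battery_j, bits, distance_m):
    """Number of whole packets a node can send before its battery is empty."""
    return int(battery_j // tx_energy(bits, distance_m))

# Hypothetical sensors that all send directly to the sink (centralized case).
distances_to_sink = {"near": 10.0, "mid": 50.0, "far": 100.0}
lifetimes = {
    name: packets_until_depleted(battery_j=0.5, bits=2000, distance_m=d)
    for name, d in distances_to_sink.items()
}
# Nodes sorted from first to last to die.
death_order = sorted(lifetimes, key=lifetimes.get)
print(lifetimes)
print(death_order)
```

Under these assumptions the node at 100 m can send far fewer packets than the node at 10 m with the same battery, so coverage is lost first at the edge of the monitored area, consistent with the behaviour described for Figure 8b.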
5) Discussion:
In this section we explain and compare the results obtained in our study, focusing on several topology-related issues or parameters that have a great impact on diagnostics (PHM). The variation of the error rate of the algorithms with time in Figure 5 (distributed topology) differs noticeably from that in Figure 3 (decentralized topology). In Figure 5 the sensors start dying or breaking down after t = 40 (t > 40), and the whole network becomes inactive at t = 90, while in Figure 3 the sensors or CHs start dying or breaking down after t = 60 (t > 60), and the whole network becomes inactive at t = 100. Moreover, in Figure 3, between t = 0 and t = 60 (the whole network is active), the error rate is larger than the one in Figure 5 between t = 0 and t = 40 (where the whole network is also active). From this comparison we conclude that the lifetime of a network with decentralized topology is greater than with distributed topology, because data aggregation reduces the number of packet transfers and therefore the overall energy consumption in the network. However, the error rate of diagnosis depends strongly on the data aggregation method (if the topology is decentralized), because data aggregation always reduces the accuracy of the data used for diagnosis, as these two figures show for the period where the whole network is active.

Figure 7 (hierarchical topology) shows a different picture: the error rate of the algorithms varies with time in a significantly different way from what is shown in Figures 3 and 5 (decentralized and distributed topologies). In Figure 7 the sensors start dying or breaking down after t = 20 (t > 20), while in the previous studies the sensors became inactive later, as shown in the figures (variation of error rate with time). Based on this, we conclude that the lifetime of the network with hierarchical topology is smaller than with decentralized or distributed topology (the network lifetime being defined as the time until the first node dies). Furthermore, the whole network with hierarchical topology became inactive at t = 100, like the network with decentralized topology, while with distributed topology the whole network became inactive at t = 90. Based on these results, we conclude that it is important to divide the WSN in area monitoring into regions that are locally managed by a central node (or parent node).

If the network lifetime is defined as the time until the first node dies, then, relying on the change of the curves in these figures, we conclude that the lifetime of the network with centralized topology is smaller than with decentralized, distributed, or hierarchical topology. Moreover, the whole network with hierarchical and decentralized topology became inactive at t = 100, while with distributed and centralized topology the whole network became inactive at t = 90. Based on these four results, we confirm what we mentioned before about the importance of dividing the WSN in area monitoring into regions that are locally managed by a parent node (the network remains active for a longer time, so the sink continues to receive information from the monitored area for a longer time). Based on this work, we were able to observe the importance and impact of each type of WSN topology on diagnostics as the operating age of the WSN increases, and to focus on several issues related to these types of topologies.

V. CONCLUSION AND FUTURE WORK
WSNs provide a new way of distributed data collection and wireless transmission for PHM, to diagnose the state of an area and to be informed whether or not it is in failure. Topology is an important factor in achieving QoS in WSN applications. In this paper, we explained the relation between WSN topologies and their impact on PHM and on the diagnostics of the area. We studied four different types of WSN topology: distributed, hierarchical, centralized, and decentralized. We focused on several issues related to these types of topologies. We studied the lifetime of each type, and concluded that the lifetime of the decentralized topology is longer than that of the others. Therefore, we can say that this type of topology is the best for PHM reliability, because complete data from the area is available for a longer time. We also conclude that dividing the WSN in area monitoring into regions that are locally managed by a parent node (as in the decentralized and hierarchical topologies) is very important, because the network then remains active for a longer time. As future work, we plan to study other factors such as density, data aggregation, coverage and scheduling, communication, etc., and their impact on PHM.
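The two notions of lifetime used in the discussion (time until the first node dies, and time until the whole network is inactive) can be compared side by side with a short sketch. The values below are the approximate times quoted in the text of Section IV; for the centralized topology the text only states that the error rate has already risen markedly by t = 10, so 10 is used here as an assumed upper bound on the onset of node deaths. The variable and function names are hypothetical.

```python
# Approximate onset of node deaths, as read from the discussion.
# The centralized value is an assumed upper bound (see lead-in).
first_death = {
    "decentralized": 60,
    "distributed": 40,
    "hierarchical": 20,
    "centralized": 10,
}
# Time at which approximately the whole network is inactive.
all_inactive = {
    "decentralized": 100,
    "distributed": 90,
    "hierarchical": 100,
    "centralized": 90,
}

def rank(times):
    """Topologies sorted from longest to shortest lifetime."""
    return sorted(times, key=times.get, reverse=True)

by_first_death = rank(first_death)
longest_total = max(all_inactive.values())
longest_lived = sorted(t for t, v in all_inactive.items() if v == longest_total)

print("ranking by first node death:", by_first_death)
print("topologies active the longest:", longest_lived)
```

Read this way, the decentralized topology ranks first under the first-node-death definition, and the two topologies with locally managed regions (decentralized and hierarchical) stay at least partly active the longest.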
This work is partially funded by the Labex ACTION program (contract ANR-11-LABX-01-01).

REFERENCES
[1] Sachin Adlakha and Mani Srivastava. Critical density thresholds for coverage in wireless sensor networks. In Wireless Communications and Networking, 2003. WCNC 2003. 2003 IEEE, volume 3, pages 1615–1620. IEEE, 2003.
[2] Leo Breiman. Random forests. Machine Learning, 45:5–32, 2001.
[3] Peter Bühlmann and Torsten Hothorn. Boosting algorithms: Regularization, prediction and model fitting. Statistical Science, pages 477–505, 2007.
[4] Wiem Elghazel, Kamal Medjaher, Noureddine Zerhouni, Jacques Bahi, Ahmad Farhat, Christophe Guyeux, and Mourad Hakem. Random forests for industrial device functioning diagnostics using wireless sensor networks. In Aerospace Conference, 2015 IEEE, pages 1–9. IEEE, 2015.
[5] D Galar, U Kumar, J Lee, and W Zhao. Remaining useful life estimation using time trajectory tracking and support vector machines. In Journal of Physics: Conference Series, volume 364, page 012063. IOP Publishing, 2012.
[6] Hassan Harb, Abdallah Makhoul, and Raphaël Couturier. An enhanced K-means and ANOVA-based clustering approach for similarity aggregation in underwater wireless sensor networks. IEEE Sensors Journal, 15(10):5483–5493, 2015.
[7] Hassan Harb, Abdallah Makhoul, Ramy Tawil, and Ali Jaber. Energy-efficient data aggregation and transfer in periodic sensor networks. IET Wireless Sensor Systems, 4(4):149–158, 2014.
[8] Aiwina Heng, Sheng Zhang, Andy CC Tan, and Joseph Mathew. Rotating machinery prognostics: State of the art, challenges and opportunities. Mechanical Systems and Signal Processing, 23(3):724–739, 2009.
[9] Andrew KS Jardine, Daming Lin, and Dragan Banjevic. A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing, 20(7):1483–1510, 2006.
[10] Soonmok Kwon, Jae Hoon Ko, Jeongkyu Kim, and Cheeha Kim. Dynamic timeout for data aggregation in wireless sensor networks. Computer Networks, 55:650–664, 2011.
[11] Zhijun Li and Guang Gong. Survey on security in wireless sensor. Journal of the Korea Institute of Information Security and Cryptology, 18(6B):233–248, 2008.
[12] Abdallah Makhoul, Hassan Harb, and David Laiymani. Residual energy-based adaptive data collection approach for periodic sensor networks. Ad Hoc Networks, 35:149–160, 2015.
[13] Abdallah Makhoul, David Laiymani, Hassan Harb, and Jacques Bahi. An adaptive scheme for data collection and aggregation in periodic sensor networks. International Journal of Sensor Networks, 18(1/2):62–74, 2015.
[14] Ronald E McRoberts. Estimating forest attribute parameters for small areas using nearest neighbors techniques. Forest Ecology and Management, 272:3–12, 2012.
[15] ISO Condition Monitoring. Diagnostics of machines-prognostics part 1: General guidelines. ISO13381-1: 2004 (e). vol. ISO/IEC Directives Part 2, IO f. S, page 14.
[16] Selina SY Ng, Yinjiao Xing, and Kwok L Tsui. A naive Bayes model for robust remaining useful life prediction of lithium-ion battery. Applied Energy, 118:114–123, 2014.
[17] AK Patil and AJ Patil. Issues of connectivity and coverage in wireless sensor networks.
[18] Ying Peng, Ming Dong, and Ming Jian Zuo. Current status of machine prognostics in condition-based maintenance: a review. The International Journal of Advanced Manufacturing Technology, 50(1-4):297–313, 2010.
[19] V Sugumaran, V Muralidharan, and KI Ramachandran. Feature selection using decision tree and classification through proximal support vector machine for fault diagnostics of roller bearing. Mechanical Systems and Signal Processing, 21(2):930–942, 2007.
[20] Bo Sun, Rui Kang, and Jin-song Xie. Research and application of the prognostic and health management system. Systems Engineering and Electronics, 10:041, 2007.
[21] Jennifer Yick, Biswanath Mukherjee, and Dipak Ghosal. Wireless sensor network survey.