Rethinking System Health Management
Edward Balaban
Intelligent Systems Division, NASA Ames Research Center, Moffett Field, CA 94035
Stephen B. Johnson
Dependable System Technologies, LLC / Jacobs ESSCA Group at NASA Marshall Space Flight Center
Mykel J. Kochenderfer
Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305
Abstract
Health management of complex dynamic systems has traditionally evolved separately from automated control, planning, and scheduling (generally referred to in the paper as decision making). A goal of Integrated System Health Management has been to enable coordination between system health management and decision making, although successful practical implementations have remained limited. This paper proposes that, rather than being treated as connected, yet distinct entities, system health management and decision making should be unified in their formulations. Enabled by advances in modeling and computing, we argue that the unified approach will increase a system's operational effectiveness and may also lead to a lower overall system complexity. We overview the prevalent system health management methodology and illustrate its limitations through numerical examples. We then describe the proposed unification approach and show how it accommodates the typical system health management concepts.
System Health Management (SHM) has evolved from simple red-line alarms and human-initiated responses to a discipline that often includes sophisticated modeling and automated fault recovery recommendations (Aaseng 2001). The main goal of modern SHM has been defined as the preservation of a system's ability to function as intended (Rasmussen 2008; Johnson and Day 2011). While in this paper we apply the term SHM to the operational phase of a system's lifespan, in other contexts the term may also encompass design-time considerations (Johnson and Day 2011). The actual achievement of a system's operational objectives, on the other hand, is under the purview of the fields of control, planning, and scheduling. In this paper, we will generally refer to all of the processes aimed at accomplishing operational objectives as decision making (DM), while using the more specialized terms where necessary.

Historically, the field of SHM has developed separately from DM. The typical SHM functions (monitoring, fault detection, diagnosis, mitigation, and recovery) were originally handled by human operators, and similarly for DM (Aaseng 2001; Ogata 2010). As simple automated control was being introduced for DM, automated fault monitors and alarms reduced operator workload on the SHM side. Gradually, more sophisticated control techniques were developed for DM (Ogata 2010), while automated emergency responses started handling some of the off-nominal system health conditions (Rasmussen 2008). Capable automated planning and scheduling tools eventually became available for performing more strategic DM (Ghallab, Nau, and Traverso 2016). On the SHM side, automated fault diagnosis was added, in some cases coupled with failure prediction (i.e., prognostic) algorithms (Lu and Saeks 1979), as well as with recovery procedures (Avizienis 1976). Still, the two sides largely remained separated.
When the concept of Integrated System Health Management became formalized, most interpretations of integrated encompassed some interactions between DM and SHM, although practical implementations of such interactions have been limited (Figueroa and Walker 2018). This paper makes the following claims:

1. For actively controlled systems, prognostics is not meaningful; for uncontrolled systems prognostics may only be meaningful under a specific set of conditions;
2. DM should be unified with SHM for optimality and there is a path for doing so;
3. Automated emergency response should be done separately from unified DM/SHM, in order to provide performance guarantees and dissimilar redundancy.

For the second claim, we intend to show a path towards DM/SHM unification that builds on the latest developments in state space modeling, planning, and control. While in the past limitations in computing hardware and algorithms would have made unified DM/SHM difficult, we believe that the advances of recent years make it an attainable goal. We also believe that this unified approach is applicable to a broad spectrum of DM, from traditional deterministic planners to complex algorithms computing long-horizon action policies in the presence of uncertainty. We use the term unification to emphasize the idea of DM and SHM being done within the same framework, rather than the two being integrated as separate subsystems exchanging information.

Some initial progress towards DM/SHM unification can be found in the earlier work by Balaban and Alonso (2013), Balaban et al. (2018), and Johnson and Day (2010; 2011). System health information was also incorporated into DM by others (Bethke, How, and Vian 2008; Ure et al. 2013; Agha-mohammadi et al. 2014).
This paper aims to introduce a systematic view on such integration, discuss its benefits, and illustrate how current SHM concepts map into the proposed approach without a loss of functionality or generality. The paper first defines the categories of systems that are of interest to this study (Section 2) and then overviews the prevailing approach to SHM (Section 3). The first claim (on prognostics) is discussed in Section 4. Section 5 discusses the second claim, concerning DM/SHM integration and its benefits, as well as the rationale for the third claim (that DM/SHM should be separate from automated emergency response). Section 6 concludes.

In the discussion to follow, we consider both uncontrolled and controlled systems. For our purposes, the uncontrolled systems category includes not only those system types for which control is not available or required, but also those operating on predefined control sequences, such as industrial robots performing the same sets of operations over extended periods of time. Also included are system types that can be considered uncontrolled within some time interval of interest (decision horizon). Systems in this category may have degradation modes that affect the system's performance within its expected useful life span. The rate of degradation is influenced by internal (e.g., chemical decomposition) and external factors (e.g., temperature of the operating environment). In addition to the aforementioned industrial robots, examples of uncontrolled system types include bridges, buildings, electronic components, and certain types of rotating machinery, such as electrical power generators. The controlled systems category covers all other system types, including dynamically controlled systems operating in uncertain environments, where the current state cannot be fully observed and non-determinism is present in control action outcomes.
Degradation processes are influenced not only by the same kinds of internal and external factors as for the uncontrolled systems, but also by the control actions. Most of the discussion to follow is applicable to both categories, although controlled systems would, naturally, benefit more from active DM/SHM. In describing the systems, we adopt the notation from the field of decision making under uncertainty (Kaelbling, Littman, and Cassandra 1998).

A system state s can be a scalar or a vector belonging to a state space S. A system action a initiates state transitions, with A denoting the space of all available actions (A may be state-dependent). A transition model T(s, a, s′) describes the probability of transitioning to a particular state s′ ∈ S as a result of taking action a from state s. A reward model R(s, a) describes a positive reward obtained or a negative cost incurred as a result of taking action a from state s. Terminal states form a subset S_T ⊂ S. Terminal states S_T may include both failure and goal states. Transitions from a terminal state are only allowed back to itself.

If there is state uncertainty, belief states (also referred to as beliefs) are used instead of regular states. A belief b is a probability distribution over S, with B denoting the space of all beliefs. Observations (e.g., sensor readings) can help with belief estimation and updating. Like a state, an observation can be a vector quantity. An observation model O(s′, a, o) describes the probability of getting an observation o upon transition to state s′ as a result of action a.

The general function of decision making is to select actions. While in some systems we are only concerned with selecting a single a_t at a given time t, decision making problems often involve selecting a sequence of actions. In a strictly deterministic system, an entire sequence of actions (a plan) can be selected ahead of time.
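The notation above can be made concrete with a minimal sketch. The two-state system, probabilities, and rewards below are hypothetical, chosen only to illustrate the T, R, and terminal-state conventions:

```python
import random

# Hypothetical two-state system: "ok" and "failed" (a terminal state).
S = ["ok", "failed"]
A = ["run", "repair"]

# Transition model T(s, a, s'): probability of reaching s' from s under a.
T = {
    ("ok", "run"):        {"ok": 0.9, "failed": 0.1},
    ("ok", "repair"):     {"ok": 1.0},
    ("failed", "run"):    {"failed": 1.0},   # terminal: transitions only to itself
    ("failed", "repair"): {"failed": 1.0},
}

# Reward model R(s, a): reward obtained (or cost incurred) by taking a in s.
R = {("ok", "run"): 10.0, ("ok", "repair"): -2.0,
     ("failed", "run"): 0.0, ("failed", "repair"): 0.0}

def step(s, a, rng):
    """Sample a successor state s' ~ T(s, a, .) and return (s', reward)."""
    successors = T[(s, a)]
    s_next = rng.choices(list(successors), weights=list(successors.values()))[0]
    return s_next, R[(s, a)]

rng = random.Random(0)
s_next, reward = step("ok", "run", rng)
```

Here `step` samples from T the way a simulator would; planning algorithms instead use T and R directly.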
In systems with action outcome uncertainty, however, a fixed plan can quickly become obsolete. Instead, a policy π(s) : S → A needs to be selected that prescribes which action should be taken in any state. If a policy is selected that, in expectation, optimizes a desired metric (e.g., maximizes cumulative reward), it is referred to as an optimal policy and denoted π*.

Throughout the paper, we use a robotic exploration rover operating on the surface of the Moon as a running example of a complex controlled system. The rover is solar-powered and stores electrical energy in a rechargeable battery.

A typical contemporary SHM integration approach is shown in Figure 1. A DM subsystem generates an action a_{t,DM}, aimed at achieving the operational objectives. The plant executes a_{t,DM} and an observation o_t is generated and relayed to both DM and SHM. DM computes a_{t+1,DM} on the basis of o_t, while SHM analyzes o_t for indications of faults (defined here as system states considered to be off-nominal) and, if any are detected, issues a mitigation or recovery command a_{t+1,SHM} either directly to the plant or as a recommendation to the DM subsystem (Valasek 2012).
Figure 1: A typical system architecture with SHM

Figure 2 goes into more detail on the SHM subsystem. The fault detection module corresponds to the traditional red-line monitors detecting threshold-crossing events of sensor values, represented on the diagram by a Boolean fault detection function F. If a fault is detected (F(o_t) = true), fault isolation and diagnosis (or identification) is performed, generating a vector of fault descriptors f_t (Daigle and Roychoudhury 2010). Each fault descriptor typically identifies a component, its fault mode, and its fault parameters (Daigle and Roychoudhury 2010). There are diagnostic systems that also include an estimated fault probability in the descriptor (Narasimhan and Brownston 2007). If the uncertainty of the diagnostic results is deemed too high (e.g., f_t consists of only low-probability elements), uncertainty management is sometimes performed in order to obtain a better estimate of the current system condition (Lopez and Sarigul-Klijn 2010). Some recent SHM implementations then pass f_t to a prognostic module (Roychoudhury and Daigle 2011).

Figure 2: A typical contemporary SHM architecture (fault detection, diagnosis/isolation, prognosis, mitigation/recovery)

In the SHM context, the intended goal of the prognostic module is to predict, at time t_p (here t_p = t), whether and when faults will lead to system failure (defined as inability to perform its assigned function) within the window [t_p, t_p + H] of a prediction horizon H (the terms prediction horizon and decision horizon are equivalent for our purposes). In prognostics literature, the time of failure is commonly synonymous with the term end of [useful] life (EOL). Equivalently, the goal of prognostics can be defined as predicting the system's remaining useful life (RUL). In Figure 2, the prognostic prediction is represented as a probability distribution p(EOL | f_t) for EOL ∈ [t_p, t_p + H].
Uncertainty management is sometimes also prescribed following prognostic analysis, meant to improve the prediction if the confidence in it is insufficiently high (Wang, Youn, and Hu 2012). Note that if a prognostic module is part of an SHM sequence, the term Prognostics and Health Management (PHM) is used by some instead of SHM in order to emphasize the role prognostics is playing in managing the system's lifecycle.

Finally, p(EOL | f_t) and f_t are passed to the fault mitigation and recovery component to select an action a_{t+1,SHM} from the action set A_SHM, in order to mitigate or recover from faults in f_t. As part of this process, operational constraints may be set for those faulty components that cannot be restored to nominal health. If functional redundancy exists for such components, their further use may be avoided.

The overall limitations of the current SHM approach are discussed in Section 5, where an approach that unifies DM and SHM is then proposed. The next section, however, focuses on the prognostic component and discusses why it is not meaningful for actively controlled systems and is challenging to implement in a useful manner for uncontrolled systems.

A general definition of prognostics is that of a process predicting the time of occurrence of an event E (Daigle, Sankararaman, and Kulkarni 2015). Using notation from Section 2, if φ_E : S → B (where B ≜ {0, 1}) is an event threshold function, then t_E ≜ inf{t ∈ [t_p, t_p + H] : φ_E(s_t) = 1} is the nearest predicted time of E (s_t is the state at time t). If the state evolution trajectory is non-deterministic, then p(t_E | s_{t_p}) is computed instead. If states cannot be directly observed, p(t_E | o_{t_p}) is computed. As defined, prognostics is only meaningful in a specific set of circumstances, and next we use two examples to illustrate why.
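The event-time definition above can be sketched as a Monte Carlo estimate: sample state trajectories forward from t_p and record the first threshold crossing within the prediction horizon. The degradation model below is a hypothetical stand-in, not one from the paper:

```python
import random

def sample_t_E(s0, step_model, phi_E, t_p, H, n_samples, seed=0):
    """Empirically estimate p(t_E | s_{t_p}): simulate trajectories from s0 and
    record the first time within [t_p, t_p + H] at which phi_E(s_t) = 1."""
    rng = random.Random(seed)
    hits = []
    for _ in range(n_samples):
        s, t = s0, t_p
        while t <= t_p + H:
            if phi_E(s):          # event threshold crossed
                hits.append(t)
                break
            s = step_model(s, rng)
            t += 1
    return hits  # empirical samples of t_E (may be empty if E is never reached)

# Hypothetical degradation model: health drops 0.05 per step, occasionally 0.10.
decay = lambda s, rng: s - (0.10 if rng.random() < 0.2 else 0.05)
exhausted = lambda s: s <= 0.0
samples = sample_t_E(1.0, decay, exhausted, t_p=0, H=30, n_samples=1000)
```

The histogram of `samples` is the empirical analogue of p(t_E | s_{t_p}) for a non-deterministic trajectory.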
There are two main desirable, interrelated attributes for a prognostic system: (1) low uncertainty in EOL estimation, so that a decision about mitigation or recovery actions can be made with confidence, and (2) the ability to make a prediction far enough in advance for the actions to be successfully executed. For uncontrolled systems, this means that prognostics is primarily useful for systems with long lifetimes, low process uncertainty, or both. To illustrate why this is the case, we start with a simple uncontrolled system example:

Figure 3: Uncontrolled system prognostics (Example 1): health over time under the deterministic and stochastic models
Example 1.
At time t = 0, the health state of a system is s = 1 (states are scalar). According to a deterministic model, the nominal health degradation rate is constant at ṡ_n = 0.05/Δt, where Δt is the prediction time step, selected as the minimum time interval within which a change in the system's health is expected to be detectable. A stochastic model predicts the probability of the nominal degradation rate ṡ_n within any time step as p_n = 0.8 and the probability of a higher degradation rate (ṡ_h = ṡ_n + ε/Δt) as p_h = 0.2. Assume ε = 0.05.

The objective for both models is to predict EOL, i.e., the smallest t for which s ≤ 0. For this example, the prediction uncertainty is defined as σ(t_p) = |E[EOL_d(t_p)] − E[EOL_s(t_p)]|, i.e., the absolute difference between the expected EOL values computed by the two models at a prediction time t_p. A requirement is set on the maximum EOL prediction uncertainty as σ_max = 1Δt.

For this example, let us assume that the health state is fully observable and define ρ_p = s_p / s_0, the fraction of full health remaining at t_p. In Figure 3, a prediction is shown to be made at t_p = t_0, with s_p = s_0, ρ_p = 1, and H = 20Δt. Since

  E[EOL_d(t_p)] = ρ_p / ṡ_n

and

  E[EOL_s(t_p)] = ρ_p / (p_n ṡ_n + p_h ṡ_h) = ρ_p / ((1 − p_h) ṡ_n + p_h (ṡ_n + ε/Δt)) = ρ_p / (ṡ_n + p_h ε/Δt),

then

  σ = |ρ_p / ṡ_n − ρ_p / (ṡ_n + p_h ε/Δt)| = ρ_p |1/0.05 − 1/(0.05 + p_h ε)| Δt.   (1)

In the last equation, we substitute the value for the nominal degradation rate ṡ_n in order to focus on the effects of the degradation rate uncertainty. As can be seen in Figure 3, EOL is reached by both models within the prediction horizon. However, from Equation 1, σ ≈ 3.3Δt > σ_max. For the requirement on σ to be satisfied, either ρ_p (health fraction remaining), p_h (the probability of deviations from the nominal degradation rate), ε (the magnitude of deviations), or some combination of them needs to be reduced. If p_h and ε are kept the same, with ρ_p = 0.25 we can get σ ≈ 0.8Δt. However, t_p now needs to be ≈ 15Δt, with only 5Δt left until failure (RUL). RUL corresponds to the time available to either replace/repair the uncontrolled system or initiate an emergency response. For a quickly degrading system with Δt = 1 s, RUL would be only 5 s, which is likely enough time for an emergency response, but not for repair or replacement. In practice, in uncontrolled systems where handling a fast-developing degradation process is important, estimating p(EOL) is unlikely to bring tangible benefits. For instance, if pressure starts building quickly in a fuel tank of an ascending crewed rocket, the launch abort system (emergency response) is likely to be activated by the exceedance of a predefined pressure limit or pressure increase rate (i.e., functions of fault detection). Computing whether the tank breach will occur a few seconds sooner or later will not materially influence that response.
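The uncertainty gap in Example 1 can be checked in a few lines. The nominal rate ṡ_n = 0.05/Δt follows Equation 1; p_h = 0.2 and ε = 0.05 are the values assumed in this sketch:

```python
# Expected EOL (in units of Delta t) under the deterministic and stochastic
# degradation models of Example 1. p_h and eps are assumed values.
s_dot_n = 0.05   # nominal degradation rate per time step (from Equation 1)
p_h     = 0.2    # assumed probability of the higher degradation rate
eps     = 0.05   # assumed extra degradation per step at the higher rate
rho_p   = 1.0    # fraction of full health remaining at t_p

eol_det   = rho_p / s_dot_n                 # deterministic model: 20 steps
eol_stoch = rho_p / (s_dot_n + p_h * eps)   # stochastic model: ~16.7 steps
sigma     = abs(eol_det - eol_stoch)        # ~3.3 steps, above sigma_max = 1

# Shrinking rho_p shrinks sigma proportionally, at the cost of a late t_p:
sigma_late = 0.25 * abs(1 / s_dot_n - 1 / (s_dot_n + p_h * eps))
```

The last line shows the trade described in the example: waiting until only a quarter of the health remains brings σ under the requirement, but leaves little remaining useful life in which to act.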
For uncontrolled systems with mid-range degradation rates (minutes, hours, days), extrapolation of the degradation function may be able to serve as part of a caution and warning mechanism.

What follows is that if degradation uncertainty is relatively high or varies significantly over time, a short prediction horizon (compared to the overall system lifetime) may be necessary to limit the uncertainty propagation and result in a usable σ. In this case, systems with longer lifetimes are more suitable for applying prognostics. For example, if a bridge failure can be predicted years in advance with an accuracy of ±1 year, that can still be a useful prediction. However, while many uncontrolled systems can be classified as systems with long lifetimes, there are a number of fundamental practical difficulties in performing effective prognostics for them. One of the primary issues stems directly from the typically long (often decades) lifetimes. In order to establish trust in the degradation models, they need to be adequately tested using long-term observations from "normal use" degradation or observations from properly formulated accelerated testing. With useful (from the statistical point of view) "normal use" degradation data sets being rare for many long-life system types (Heng et al. 2009), accelerated degradation efforts are common. If an accelerated degradation regime is proposed, however, what needs to be clearly demonstrated is that:

1. The regime can be used as a substitute for real-life degradation processes. For instance, while Rigamonti et al. (2016) and Celaya et al. (2011) use thermal and electrical overstress to quickly degrade electrical capacitors and predict the time of their failure (by using an empirical equation), their work does not extend to making a connection to real-life degradation, which takes place at lower temperatures and voltage/current levels. Similar issues are highlighted by Dao et al. (2006) for composite materials, where mechanical, thermal, and chemical processes result in complex interactions during aging.

2. There is a known mapping from the accelerated timeline to the unaccelerated timeline. Oh et al. (2015) note in an overview of condition monitoring and prognostics of insulated gate bipolar transistors that while numerous fatigue models have been constructed that predict cycles-to-failure under repetitive cycle loading, they are not designed to predict RUL under realistic usage conditions. Although use of various fatigue analysis models, e.g., Paris, Gomez, and Anderson (1961), has been proposed for estimating RUL on the basis of stress cycles, their accuracy has proven difficult to confirm.

There are subfields of prognostics where accelerated aging regimes may be viable, such as in aircraft metallic structures or rotating machinery (where mechanical degradation factors could be assumed dominant). However, the issue of high uncertainty of degradation trajectories still arises, even under the same test conditions (Virkler, Hillberry, and Goel 1979; Meng 1994). Finite element modeling may help alleviate degradation trajectory uncertainty in specific cases, albeit at a significant computational cost (Heng et al. 2009).

Some of the other challenges with effective prognostics for uncontrolled systems include the accuracy of estimating the actual state of health, the effects of fault interactions, and the effects of system maintenance actions (Heng et al. 2009). If these challenges are successfully overcome and the failure mechanisms of a component are understood well enough to develop useful degradation models, a different question then arises: should the design or usage of the component be changed to mitigate these mechanisms? While in some cases this may not be feasible, in others it may be the simplest and most reliable way of improving safety and maintenance requirements (Bathias and Pineau 2013). A redesign or change in usage would, on the other hand, make the degradation models obsolete. The next tier of degradation modes would then need to be analyzed and modeled, possibly followed by another redesign.
Thus analysis intended for the development of degradation (prognostics) models instead becomes part of the design cycle.

For those uncontrolled systems that are, in fact, suitable for health management based on prognostics, the action space is typically limited to: (a) no action, (b) replacement, or (c) repair to a nominal operating condition. Even so, we still propose that predictive analysis for these systems needs to be driven by decision making requirements. For instance, if domain knowledge informs that variability in system behavior beyond some health index value h_min is too great for choosing actions with sufficiently high confidence, then EOL can be redefined as h_min, and system dynamics beyond h_min need to be neither modeled nor computed during prediction, potentially freeing computing resources for estimating system behavior up to h_min with more accuracy.

As soon as dynamic control is introduced into the system and uncertainty is taken into account, prognostics, as defined above, and the PHM version of the process in Figure 2 are no longer meaningful, for two key reasons. First, not having the knowledge, at t_p, of the future system actions, a PHM algorithm will either (a) have to rely on some precomputed plan to obtain a_{t_p+1:H} for its predictive analysis (a plan that can quickly become obsolete due to outcome uncertainty) or (b) have to use a random policy (which can, for instance, result in less optimal actions being considered as probable as the more optimal ones). A random policy is also likely to result in a greater state uncertainty throughout the [t_p, t_p + H] interval. Second, the oft-proposed strategy of rerunning prognostic analysis after an action, so that new information can be taken into account (Tang et al. 2011), may not help. Once a suboptimal execution branch has been committed to, it may remain suboptimal regardless of future decisions. The following example provides an illustration of these issues:

Example 2.
A rover needs to traverse an area with no sunlight, going around a large crater from waypoint wp1 to the closest suitable recharge location at wp5. The battery charge at wp1 is 1100 Wh.

There are three possible levels of terrain difficulty: difficult (requiring 600 Wh per drive segment), moderate (300 Wh per segment), and easy (200 Wh per segment). All drive segments are the same length. Probabilities of terrain types in different regions are shown in Figure 4. The rover can go to the left, wp1 → wp2 → wp5, or to the right, wp1 → wp3 → wp5 (left and right are relative to the diagram). If going to the right, it can decide to detour around a smaller crater, wp3 → wp4 → wp5 (easy terrain with p = 1.0), instead of going directly wp3 → wp5.

Figure 4: PHM vs. DM for a controlled system (Example 2). Left region: 600 Wh per segment (p = 0.4) or 300 Wh (p = 0.6); right region: 600 Wh (p = 0.5) or 300 Wh (p = 0.5); detour region: 200 Wh (p = 1.0).

Figure 5: Execution scenarios in Example 2

A decision support PHM algorithm, running a sufficiently large number of simulations, would consider two possible execution scenarios along the left route: (L1) e_total = 1200 Wh, p = 0.4 and (L2) e_total = 600 Wh, p = 0.6 (e_total is the total energy consumed in a scenario). The expected energy consumption along the left route can then be computed as E[e_total,L] = 1200 Wh · 0.4 + 600 Wh · 0.6 = 840 Wh.

The algorithm would then consider four possible execution scenarios along the right route (assuming uniform random choice of action at wp3): (R1) e_total = 1200 Wh, p = 0.25; (R2) e_total = 600 Wh, p = 0.25; (R3) e_total = 1000 Wh, p = 0.25; and (R4) e_total = 700 Wh, p = 0.25. Then E[e_total,R] = (1200 + 600 + 1000 + 700) · 0.25 Wh = 875 Wh. With E[e_total,L] < E[e_total,R], the PHM algorithm commits to the left path. Note that in many cases prognostics algorithms generate even less information to support action selection than what was done here (typically p(EOL) only).

A DM algorithm capable of sequential reasoning under uncertainty (Kochenderfer 2015) would compute E[e_total,L] = 840 Wh in the same manner as the PHM algorithm, as there are no actions needing to be selected along the left route after wp1. On the right side, however, the DM algorithm can make an informed choice at wp3, based on observations made along wp1 → wp3. This implies having only two possible execution scenarios: (R1) if the terrain is observed as difficult, the detour through wp4 is taken, and (R2) if the terrain is observed as moderate, the rover goes directly to wp5. For R1, e_total = 1000 Wh, p = 0.5. For R2, e_total = 600 Wh, p = 0.5. The expected energy use is thus E[e_total,R] = (1000 + 600) · 0.5 = 800 Wh. With E[e_total,L] > E[e_total,R], the algorithm chooses the right path.

Now let us assume that the true terrain condition both on the left and the right sides of the crater is difficult. The left path (wp1 → wp2 → wp5) will require 1200 Wh to traverse, therefore a rover relying on the PHM algorithm will fall 100 Wh short and will not reach wp5. A rover relying on the DM algorithm will expend only 1000 Wh (scenario R1), arriving at wp5 with 100 Wh in reserve.

It may be suggested that the issues with the PHM approach could be eliminated if access to a precomputed operational policy π_DM is provided. However, even if such a policy was available, that would still be insufficient. If, at time t, p(EOL) is computed using π_DM, and a_{t+1,SHM} is then taken on the basis of p(EOL), p(EOL) could immediately become invalid unless T(s_t, a_{t+1,SHM}, s_{t+1}) = T(s_t, a_{t+1,DM}, s_{t+1}).

This section discusses the benefits of a unified DM/SHM approach and outlines some of the key implementation details.
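Before moving on, the expected-energy comparison from Example 2 can be reproduced directly from its scenario probabilities and per-route energies:

```python
# Scenario energies (Wh) and terrain probabilities from Example 2.
p_difficult_left, p_difficult_right = 0.4, 0.5

# Left route: both segments difficult (1200 Wh) or both moderate (600 Wh).
E_left = 1200 * p_difficult_left + 600 * (1 - p_difficult_left)        # 840 Wh

# PHM: a uniform random direct/detour choice at the fork yields four
# scenarios, each with probability 0.25: 1200, 600, 1000, and 700 Wh.
E_right_phm = (1200 + 600 + 1000 + 700) * 0.25                         # 875 Wh

# DM: observe the first segment; detour if difficult, go direct if moderate.
E_right_dm = 1000 * p_difficult_right + 600 * (1 - p_difficult_right)  # 800 Wh

phm_choice = "left" if E_left < E_right_phm else "right"
dm_choice = "left" if E_left < E_right_dm else "right"
```

Because the PHM-style analysis cannot model the informed choice at the fork, it overestimates the right route's cost and commits to the left path, while the DM analysis prefers the right path.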
The next example illustrates how unification can be helpful in balancing system health needs and operational objectives:

Figure 6: Separated vs. unified DM/SHM (Example 3). The rover drives 4 h from wp1 (Zone 1) to wp2 (Zone 2), performs the science activities at wp2 (2 or 4 h), and drives 4 h from wp2 to wp3; recharging at wp1 takes 0 or 4 h.
Example 3.
The rover starts at wp1 in Zone 1 (Figure 6) with 500 Wh in the battery (out of the 1500 Wh capacity). The solar panels can charge the battery at a rate of 250 W. Sunlight in Zone 1 will last for another 12 hours.

Two actions are available in the initial state s_0 at wp1 (Figure 7): skip charging (a_1 = +0 Wh) and charge to full (a_2 = +1000 Wh). In Figure 7, the unitless numbers are Watt-hours of energy going in or out of the battery. Time (in hours) at each state is denoted as '[t] h'.

Figure 7: Execution scenarios in Example 3

The rover needs to perform a 2-hour stationary science activity at wp2 and be able to arrive at wp3, the next recharge point. The prior probability of the activity at wp2 needing to be redone (a 2-hour delay) is 0.5. Science payload power consumption is 200 W, resulting in a net-positive (250 W − 200 W = +50 W) power flow. The driving times from wp1 to wp2 and from wp2 to wp3 are 4 hours, with the average drive train power consumption of 300 W, resulting in a net-negative (250 W − 300 W = −50 W) power flow. If operating without sunlight, a 150 W heater needs to be used to keep batteries and electronics warm, thus resulting in a net-negative power flow of −300 W − 150 W = −450 W for driving and −200 W − 150 W = −350 W for stationary science activities.

According to the general SHM policy of restoring health (battery charge, in this case) to nominal, the action chosen at wp1 is π_shm(s_0) = a_2 and the battery is recharged to full. After the science activity at wp2 is completed, an assessment is made that it needs to be repeated. The 2-hour delay means that the entirety of the wp2 → wp3 segment needs to be done without sunlight, resulting in a complete battery depletion before wp3 is reached, with a deficit of 300 Wh.

In computing a unified policy, however, where SHM actions are considered in the context of the overall mission, all four scenarios depicted in Figure 7 would play a role. The expected amount of battery charge remaining if a_1 is chosen would be Q = 0.5 · 200 + 0.5 · 300 = 250. For a_2: Q = 0.5 · 400 + 0.5 · (−300) = 50. Action a_1 (no recharge) would be chosen and, with the two-hour delay at wp2, the rover would arrive at wp3 with 300 Wh still remaining.

The example illustrates the first benefit of unifying DM and SHM: the ability to naturally take operational objectives and constraints into account when making a system health recovery decision. The next example illustrates how DM, on the other hand, can benefit from unified action spaces and access to health-related models:
Example 4.
The rover is traveling from wp1 to wp2 (flat terrain) when a decision is made at point A to make a detour and attempt data collection at a scientifically valuable wp3, requiring a 6-hour climb up a steep hill (Figure 8). The 1-hour data collection activity at wp3 must be completed before the impending loss of illumination there.

Figure 8: Unified action spaces (Example 4)

After completing the 2-hour climb up to point B (1/3 of the way up), it is observed that the internal temperature of one of the drive motors has risen to T_m = 60 °C from the nominal 20 °C. At T_m = 80 °C there is a significant risk of permanent damage and failure of the motor.

If the SHM system on the rover consists of the traditional fault detection, diagnosis, and mitigation/recovery components only, it may diagnose the fault to be increased friction, mark the component (motor) as faulty, and set a constraint on terrain types to either flat or downhill. It would then invoke the recommended action for this fault from A_SHM: stop and cool down (until T_m = 20 °C).

If a prognostic component is present, it may predict that at the current temperature increase rate, the RUL of the motor is 1 hour (with 4 hours of climb remaining to reach wp3). The same mitigation action (stop and cool down) would be initiated and the same constraint on the current to the motor (and, thus, on the incline angle) may be set. After T_m returns to nominal (which happens to take 1 hour), control is returned to DM. Based on the new constraints and the motor marked as faulty, DM would command the rover to abort the detour, return to A, and resume the drive to wp2.

If, however, DM had stop and cool down as part of its action space (A_DM) and updated the state variables for the affected motor with the newly computed heat-up and cool down rates, an operational policy could be computed that optimizes the duration of driving and cool down intervals and allows the rover to reach wp3 in time. For instance, if the rover drives for two hours, then stops for an hour to cool down the motor, it can still complete the remaining climb to wp3 within the available time.
With the science activity taking 1 hour, there would still be an hour in reserve before the loss of sunlight at wp3.

In proposing the unified DM/SHM approach, we rely heavily on utility theory (Fishburn 1970). The following, in our opinion, are the key ingredients for a successful unification: (1) a state-based system modeling framework and (2) a utility (value) function capturing the operational preferences for the system. A utility function (denoted as U) captures, numerically, our preferences over the space of possible outcomes. For instance, we may assign a higher utility value to a rover state where a desired scientific location has been reached. Utility can be defined for system states or for state-action pairs. If, in some system state s, the outcome of an action a (a particular state s') is not guaranteed, then the expected utility of the (s, a) tuple is

$$U(s, a) = \sum_{s'} T(s, a, s')\, U(s'). \qquad (2)$$

In a strictly deterministic system, where a plan $a_{0:H}$ up to a horizon H can be provided ahead of time, the expected utility of s relative to the plan is

$$U_{a_{0:H}}(s) = \sum_{i=0}^{H} R(s_i, a_i). \qquad (3)$$

For systems with action outcome uncertainty, the expected utility associated with executing a policy $\pi$ for t steps from state s can be computed recursively as

$$U^{\pi}_{t}(s) = R(s, \pi(s)) + \gamma \sum_{s'} T(s, \pi(s), s')\, U^{\pi}_{t-1}(s'), \qquad (4)$$

where $\gamma \in [0, 1]$ is the discount factor that is sometimes used to bias towards earlier rewards, particularly in infinite-horizon problems ($t \to \infty$). The optimal utility for a state can then also be computed recursively using

$$U^{*}_{t}(s) = \max_{a \in A} \left( R(s, a) + \gamma \sum_{s'} T(s, a, s')\, U^{*}_{t-1}(s') \right), \qquad (5)$$

which for $t \to \infty$ becomes the Bellman equation (Bellman 1957).
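The optimal-utility recursion in equation (5) is straightforward to implement for finite state and action spaces. Below is a minimal value-iteration sketch over a three-state toy chain; the transition model, rewards, and discount are illustrative assumptions, not from the paper:

```python
# Value iteration per equation (5) on a toy 3-state MDP.
# State 2 is an absorbing terminal (goal) state.
GAMMA = 0.95
STATES = [0, 1, 2]
ACTIONS = ['go', 'stay']

def T(s, a, s2):
    """Transition probability T(s, a, s') -- assumed toy model."""
    if s == 2:                              # terminal: absorbing
        return 1.0 if s2 == s else 0.0
    if a == 'go':                           # 'go' advances with prob 0.8
        return 0.8 if s2 == s + 1 else (0.2 if s2 == s else 0.0)
    return 1.0 if s2 == s else 0.0          # 'stay' holds position

def R(s, a):
    """Reward for taking action a in state s -- assumed toy values."""
    if s == 2:
        return 0.0                          # no further reward at terminal
    return 10.0 if (s == 1 and a == 'go') else -1.0

def value_iteration(n_iters=100):
    U = {s: 0.0 for s in STATES}            # U*_0
    for _ in range(n_iters):                # each pass applies equation (5)
        U = {s: max(R(s, a) + GAMMA * sum(T(s, a, s2) * U[s2]
                                          for s2 in STATES)
                    for a in ACTIONS)
             for s in STATES}
    return U

U_star = value_iteration()
```

An optimal policy then falls out by taking, in each state, the action that attains the maximum in the bracketed expression.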
Knowing the optimal utility function, we can derive an optimal policy:

$$\pi^{*}(s) = \arg\max_{a \in A} \left( R(s, a) + \gamma \sum_{s'} T(s, a, s')\, U^{*}(s') \right). \qquad (6)$$

In problems with state uncertainty, beliefs $b \in B$ can take the place of states in equations 4–6 (Kaelbling, Littman, and Cassandra 1998). The Markov property is assumed, meaning that T(s, a, s') does not depend on the sequence of transitions that led to s (Kemeny and Snell 1983).

Now that the foundational concepts have been described, we will focus on those most relevant to the proposed unification: states and the reward function R(s, a). States can be vector quantities (Section 2). For real-world problems, the relevant elements of the operating environment are sometimes included in the state vector, either explicitly or implicitly (Ragi and Chong 2014). For instance, the rover state vector would certainly need to include the rover's x and y coordinates, but may also include time t. These three elements allow us to indirectly access other information about the environment, e.g., solar illumination, ambient temperature, communications coverage, and terrain properties.

Similarly, health-related elements can be included in the same state vector. For the rover, the battery charge would likely be in it already for operational purposes. Adding battery temperature, however, would allow for better reasoning about the state of battery health when combined with information on ambient temperature, terrain, and recharge current. Thus, including even a few health-related elements in the state vector may have a multiplicative effect on the amount of information available. The resulting size of the state vector may also end up being smaller than the sum of separately maintained DM and SHM state vectors, as redundant elements are eliminated.
The reward function R(s, a) encodes the costs and the rewards of being in a particular state or taking a particular action in a state, and can also be thought of as a local utility function. For many realistic problems with multi-variable state representations, the reward function needs to combine costs or rewards pertinent to state components. Several approaches have been proposed (Keeney and Raiffa 1993), with additive decomposition being an effective option in many cases. However it is implemented, the key property of the function is that, by mapping multiple variables to a single number, it allows us to compute U(s) or U(s, a) and translate a potentially complex DM formulation into an abstract utility maximization problem.

As an example of such mapping, consider two different rover states: s1, where the rate of battery discharge is higher than nominal due to a parasitic load in the electrical system, and s2, where the rover is healthy but is traversing difficult terrain, thus also leading to a higher rate of battery discharge. The utilities U(s1) and U(s2) may end up being approximately equal, however, given the similar impact of the two issues on future rewards. Policies computed for s1, s2, and their follow-on states may be similar as well and, for instance, could result in more frequent recharge stops.

One notable consequence of health-related components being integrated into a common state vector is that, from the computational point of view, the concepts of fault and failure become somewhat superfluous. If subsets S_fault ⊂ S or S_failure ⊂ S are defined for the system, the framework described above will not do anything differently for them. The only essential subset of S is S_T (the terminal states). Failure states may be part of S_T if they result in termination of system operations; however, goal (success) states are members of S_T as well. The only difference between them is in their U(s) values.
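A minimal sketch of such an additively decomposed reward, combining operational and health-related state components, is shown below. The field names, weights, and values are hypothetical illustrations, not quantities from the paper:

```python
# Additively decomposed reward over operational and health components.
# All fields, weights, and values here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class RoverState:
    distance_to_goal: float   # km remaining (operational component)
    battery_charge: float     # fraction in [0, 1]
    motor_temp: float         # deg C (health-related component)

W_PROGRESS, W_ENERGY, W_THERMAL = 1.0, 5.0, 0.1

def reward(s: RoverState, a: str) -> float:
    """R(s, a): each term scores one state component; their sum maps
    the multi-variable state to a single number."""
    r_progress = -W_PROGRESS * s.distance_to_goal
    r_energy = -W_ENERGY * (1.0 - s.battery_charge)
    r_thermal = -W_THERMAL * max(0.0, s.motor_temp - 20.0)  # above nominal
    r_action = -0.5 if a == 'drive' else 0.0                # actuation cost
    return r_progress + r_energy + r_thermal + r_action

healthy = RoverState(distance_to_goal=2.0, battery_charge=0.9, motor_temp=20.0)
hot = RoverState(distance_to_goal=2.0, battery_charge=0.9, motor_temp=60.0)
```

Note that an overheating motor simply shows up as a lower reward; the solver needs no separate notion of "fault" to prefer actions that avoid such states.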
As long as a component fault or a failure does not lead to a transition to a terminal state, actions that maximize the expected value of that state will be selected (which, as it happens, implements the "fail operational" philosophy). We will refer to this unified approach as health-aware decision making (HADM). The rest of the major SHM concepts are incorporated into the new approach as follows. Fault detection and diagnostics are subsumed in belief estimation and updating, although these operations are, of course, used for nominal belief states as well.
Uncertainty management can now be purposefully incorporated into the decision making process by either augmenting A with information gathering actions (Spaan, Veiga, and Lima 2015), evaluated in the same context as other types of actions, or by continuing to improve U(s) or U(s, a) estimates until a desired level of confidence in them is reached (Browne and Powley 2012). For actively controlled systems, predictive simulations are simply an integral part of the U(s) calculation, where T(s, a, s') serves as a one-step "prognostic" function (with degradation models, if any). Whereas prognostic algorithms applied to controlled systems are limited in their predictive ability due to the lack of knowledge about future actions, here the U(s) calculation process is an exploration of possible execution scenarios, thus combining s' or b' estimation with sequential action selection.

Figure 9: The main loop of health-aware decision making

The overall HADM operational loop (assuming state and outcome uncertainty) can be seen in Figure 9. Once the initial belief b0 is estimated at time t0, either an offline policy is referenced or an online policy is computed to determine a* (the best action). The action is executed by the plant, transitioning it to a new (hidden) state and generating an observation o. The observation is then used to update b (typically through some form of Bayesian updating), and the process repeats until a terminal state is believed to be reached.

A detailed discussion of the actual algorithms that can implement the proposed approach is left for future publications; however, examples of DM algorithms that can deal with state or outcome uncertainty are provided by Kochenderfer (2015).
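One pass of this loop, with a discrete Bayesian belief update, can be sketched as follows. The two-state model, observation probabilities, initial belief, and fixed action are illustrative placeholders:

```python
# One iteration of the HADM loop (Figure 9) on a toy two-state problem.
# T, O, the initial belief, and the chosen action are all assumptions.
STATES = ['nominal', 'degraded']

def T(s, a, s2):
    """P(s' | s, a): driving can induce motor degradation (assumed)."""
    if a == 'drive' and s == 'nominal':
        return {'nominal': 0.9, 'degraded': 0.1}[s2]
    return 1.0 if s2 == s else 0.0

def O(o, s2):
    """P(o | s'): a degraded motor usually reads hot (assumed)."""
    p_high = 0.8 if s2 == 'degraded' else 0.1
    return p_high if o == 'temp_high' else 1.0 - p_high

def belief_update(b, a, o):
    """Bayesian filter: b'(s') is proportional to
    O(o | s') * sum_s T(s, a, s') * b(s)."""
    b2 = {s2: O(o, s2) * sum(T(s, a, s2) * b[s] for s in STATES)
          for s2 in STATES}
    z = sum(b2.values())                    # normalizing constant
    return {s2: p / z for s2, p in b2.items()}

b = {'nominal': 0.95, 'degraded': 0.05}     # initial belief b0
a_star = 'drive'                            # action chosen by the policy
o = 'temp_high'                             # observation from the plant
b = belief_update(b, a_star, o)             # updated belief; loop repeats
```

After one hot temperature reading, the belief mass shifts toward the degraded state, and the next action is selected against this updated belief rather than against a separate diagnoser's verdict.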
In systems where neither state nor outcome uncertainty is a factor, a variety of traditional state space planning algorithms (Ghallab, Nau, and Traverso 2016) would not need any modification to produce plans for spaces of state vectors that include health-related components.

For realistic complex systems operating in the presence of state and outcome uncertainty, S, B, and O are likely to be infinitely large (with |A| ≪ ∞, although still potentially large). The problem of finding exact optimal policies in such cases is PSPACE-complete (Papadimitriou and Tsitsiklis 1987). Approximate solution methods typically work online, constructing π for the current belief b based on beliefs reachable from b within the decision horizon (Browne and Powley 2012). They also typically perform targeted sampling from S/B, A, and O, thus optimality guarantees can be harder to provide. We, therefore, propose the following:

• That system emergency response (SER) be defined as an automated or semi-automated process that is invoked to maximize the likelihood of preserving the system's integrity, regardless of the effect on operational goals (i.e., a different objective from HADM).

• That the emergency response policy π_SER be computed separately from the HADM policy.

In the space domain, an example of SER is putting a spacecraft in a safe mode until the emergency is resolved (Rasmussen 2008). In aviation, it could be executing recommendations of a collision avoidance system (Kochenderfer 2015). Introduction of a separate SER system would likely require the introduction of S_SER, an additional subset of S which defines for which states π_SER is invoked instead of π_HADM. Once again, however, S_SER will not necessarily contain only system fault and failure states. For instance, states where environmental conditions warrant emergency response (e.g., solar activity interrupting teleoperation of the rover) would be included as well.
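The proposed split can be expressed as a simple dispatch: if the (believed) state falls in S_SER, the separately verified emergency policy overrides HADM. Everything below (the membership test, the state encoding, and both policies) is a hypothetical sketch:

```python
# Sketch of dispatching between the HADM policy and a separately
# computed emergency-response policy. The S_SER membership test and
# both policies are placeholders for illustration.

def in_S_SER(state):
    """Emergency region: includes fault/failure states AND hazardous
    environment states (e.g., solar activity interrupting teleoperation)."""
    return (state.get('motor_temp', 0.0) >= 80.0
            or state.get('solar_storm', False))

def pi_SER(state):
    return 'safe_mode'        # e.g., spacecraft safing; verified offline

def pi_HADM(state):
    return 'drive'            # stand-in for the online HADM policy

def select_action(state):
    """pi_SER takes precedence whenever the state lies in S_SER."""
    return pi_SER(state) if in_S_SER(state) else pi_HADM(state)
```

Because pi_SER ignores operational goals and covers a narrow region of the state space, it is also a natural candidate for offline computation and verification.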
Since the scope of the SER problem is likely to be much narrower than that of HADM, it opens up the possibility of computing, verifying, and validating π_SER offline (Kochenderfer 2015).

As sensors, computing capabilities, and DM algorithms improve further, the fraction of the system's state space that is under the purview of SER will decrease. Still, we foresee the need for a "safety net" SER for safety-critical functions and thus advocate SER independence. SER would cover the situations where the primary HADM system may not be able to produce a suitable solution in the desired amount of time, serving as an equivalent of human reflexive responses triggered by important stimuli, versus the more deliberative cognitive functionality of the brain. It could also provide dissimilar redundancy for critical regions of S, essentially implementing the Swiss Cheese Model (Reason 1990), which advocates multiple different layers of protection for important functions.

A counter-argument to the unified approach can be made that it increases the computational complexity by operating on larger
S/B, A, and O spaces. While this is indeed a concern, algorithms have been developed in recent years that can effectively accommodate much of the complexity, even for problems with state and outcome uncertainty, by approximating optimal policies online (Silver and Veness 2010; Browne and Powley 2012). Secondly, some of the complexity is moved out of HADM into an independent SER component, making the former problem easier. Finally, as noted above, unifying the SHM and DM elements may also result in a reduction of overlapping model variables.

This paper reexamines the prevalent approach to performing system health management and makes the case for why keeping system health management functions separate from decision making functions can be inefficient and/or ineffective. We also present the case for why prognostics is only meaningful in a limited set of circumstances and, even then, needs to be driven by decision making requirements.

We then explain the rationale for unifying (not just integrating) system health management with decision making and outline an approach for accomplishing that. We also propose keeping emergency response functionality separate to guarantee timely response, provide dissimilar redundancy, and allow for offline computation, validation, and verification of emergency response policies.

We believe that the proposed unification approach will improve the performance of both decision making and system health management, while potentially simplifying informational architectures, reducing sensor suite sizes, and combining modeling efforts. The approach is scalable with respect to system complexity and the types of uncertainties present. It also puts a variety of existing and emerging computational methods at the disposal of system designers.

References
Aaseng, G. B. 2001. Blueprint for an integrated vehicle health management system. In Digital Avionics Systems Conference.
Agha-mohammadi, A.; Ure, N. K.; How, J. P.; and Vian, J. 2014. Health aware stochastic planning for persistent package delivery missions using quadrotors. In IEEE/RSJ International Conference on Intelligent Robots and Systems.
Avizienis, A. 1976. Fault-tolerant systems. IEEE Transactions on Computers.
Annual Conference of the Prognostics and Health Management Society.
Balaban, E.; Arnon, T.; Shirley, M. H.; Brisson, S. F.; and Gao, A. 2018. A system health aware POMDP framework for planetary rover traverse evaluation and refinement. In AIAA SciTech Forum.
Bathias, C., and Pineau, A. 2013. Fatigue of Materials and Structures: Fundamentals. Wiley.
Bellman, R. 1957. A Markovian decision process. Journal of Mathematics and Mechanics (6):679–684.
Bethke, B.; How, J. P.; and Vian, J. 2008. Group health management of UAV teams with applications to persistent surveillance. In American Control Conference.
Browne, C., and Powley, E. 2012. A survey of Monte Carlo Tree Search methods. IEEE Transactions on Computational Intelligence and AI in Games.
Annual Conference of the Prognostics and Health Management Society.
Daigle, M. J., and Roychoudhury, I. 2010. Qualitative event-based diagnosis: Case study on the Second International Diagnostic Competition. In International Workshop on the Principles of Diagnosis.
Daigle, M. J.; Sankararaman, S.; and Kulkarni, C. S. 2015. Stochastic prediction of remaining driving time and distance for a planetary rover. In IEEE Aerospace Conference.
Dao, B.; Hodgkin, J.; Krstina, J.; Mardel, J.; and Tian, W. 2006. Accelerated aging versus realistic aging in aerospace composite materials. Journal of Applied Polymer Science.
AIAA SciTech Forum.
Fishburn, P. C. 1970. Utility Theory for Decision Making. Research Analysis Corporation.
Ghallab, M.; Nau, D.; and Traverso, P. 2016. Automated Planning and Acting. Cambridge University Press.
Heng, A.; Zhang, S.; Tan, A. C. C.; and Mathew, J. 2009. Rotating machinery prognostics: State of the art, challenges and opportunities. Mechanical Systems and Signal Processing.
AIAA Infotech@Aerospace Conference.
Johnson, S. B., and Day, J. C. 2011. System health management theory and design strategies. In AIAA Infotech@Aerospace Conference.
Kaelbling, L. P.; Littman, M. L.; and Cassandra, A. R. 1998. Planning and acting in partially observable stochastic domains. Artificial Intelligence.
Keeney, R. L., and Raiffa, H. 1993. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. Cambridge University Press.
Kemeny, J., and Snell, J. 1983. Finite Markov Chains. Springer.
Kochenderfer, M. J. 2015. Decision Making Under Uncertainty: Theory and Application. MIT Press.
Lopez, I., and Sarigul-Klijn, N. 2010. A review of uncertainty in flight vehicle structural damage monitoring, diagnosis and control: Challenges and opportunities. Progress in Aerospace Sciences.
IEEE Transactions on Systems, Man, and Cybernetics.
Wear Modeling: Evaluation and Categorization of Wear Models. Ph.D. Dissertation, University of Michigan.
Narasimhan, S., and Brownston, L. 2007. HyDE: a general framework for stochastic and hybrid model-based diagnosis. In International Workshop on the Principles of Diagnosis.
Ogata, K. 2010. Modern Control Engineering. Pearson.
Oh, H.; Han, B.; McCluskey, P.; Han, C.; and Youn, B. D. 2015. Physics-of-failure, condition monitoring, and prognostics of insulated gate bipolar transistor modules: A review. IEEE Transactions on Power Electronics.
Papadimitriou, C. H., and Tsitsiklis, J. N. 1987. The complexity of Markov decision processes. Mathematics of Operations Research.
The Trend in Engineering.
IEEE Transactions on Aerospace and Electronic Systems.
Rasmussen, R. D. 2008. GN&C fault protection fundamentals. In AAS Guidance and Control Conference.
Reason, J. 1990. Human Error. Cambridge University Press.
Rigamonti, M.; Baraldi, P.; Zio, E.; Astigarraga, D.; and Galarza, A. 2016. Particle filter-based prognostics for an electrolytic capacitor working in variable operating conditions. IEEE Transactions on Power Electronics.
International Workshop on the Principles of Diagnosis.
Silver, D., and Veness, J. 2010. Monte-Carlo planning in large POMDPs. In Advances in Neural Information Processing Systems.
Spaan, M. T. J.; Veiga, T. S.; and Lima, P. U. 2015. Decision-theoretic planning under uncertainty with information rewards for active cooperative perception. Autonomous Agents and Multi-Agent Systems.
Annual Conference of the Prognostics and Health Management Society.
Ure, N. K.; Chowdhary, G.; How, J. P.; Vavrina, M. A.; and Vian, J. 2013. Health aware planning under uncertainty for UAV missions with heterogeneous teams. In European Control Conference.
Valasek, J. 2012. Advances in Intelligent and Autonomous Aerospace Systems. AIAA.
Virkler, D. A.; Hillberry, B.; and Goel, P. K. 1979. The statistical nature of fatigue crack propagation. Journal of Engineering Materials and Technology.