A probabilistic risk-based decision framework for structural health monitoring
Aidan J. Hughes, Robert J. Barthorpe, N. Dervilis, Charles R. Farrar, Keith Worden
A.J. Hughes a,∗, R.J. Barthorpe a, N. Dervilis a, C.R. Farrar b, K. Worden a
a Dynamics Research Group, Department of Mechanical Engineering, University of Sheffield, Sheffield S1 3JD, UK
b Engineering Institute, MS T-001, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Abstract
Obtaining the ability to make informed decisions regarding the operation and maintenance of structures provides a major incentive for the implementation of structural health monitoring (SHM) systems. Probabilistic risk assessment (PRA) is an established methodology that allows engineers to make risk-informed decisions regarding the design and operation of safety-critical and high-value assets in industries such as nuclear and aerospace. The current paper aims to formulate a risk-based decision framework for structural health monitoring that combines elements of PRA with the existing SHM paradigm. As an apt tool for reasoning and decision-making under uncertainty, probabilistic graphical models serve as the foundation of the framework. The framework involves modelling failure modes of structures as Bayesian network representations of fault trees and then assigning costs or utilities to the failure events. The fault trees allow information to pass from probabilistic classifiers to influence diagram representations of decision processes, whilst also providing nodes within the graphical model that may be queried to obtain marginal probability distributions over local damage states within a structure. Optimal courses of action for structures are selected by determining the strategies that maximise expected utility. The risk-based framework is demonstrated on a realistic truss-like structure and supported by experimental data. Finally, a discussion of the risk-based approach is made and further challenges pertaining to decision-making processes in the context of SHM are identified.
Keywords: structural health monitoring, probabilistic risk assessment, probabilistic graphical models, decision-making
1. Introduction
The field of engineering known as Structural Health Monitoring (SHM) concerns the development and implementation of data acquisition and processing systems for the purpose of damage detection in aerospace, civil or mechanical infrastructure [1]. A prime motivation for the use of SHM systems is to acquire the ability to make informed decisions regarding the operation and management of structures so as to improve safety and/or reduce costs. In the context of SHM, an agent tasked with making decisions is required to specify action policies that are robust to uncertainties that arise as a result of having imperfect information regarding the damage state of a structure. In addition, it may be desirable to manage a population of structures simultaneously, further complicating the decision problem. The task of decision-making for SHM is complex and highly involved and therefore demands a thorough and systematic approach.

For many years,
Probabilistic Risk Assessment (PRA) has been used in industries such as aerospace and nuclear for making decisions regarding the design and operation of safety-critical assets, such as nuclear power plants [2] and reusable space vehicles, like the space shuttle [3].

∗ Corresponding author
Email address: [email protected] (A.J. Hughes)
Preprint submitted to Mechanical Systems and Signal Processing, January 6, 2021

PRA provides a rigorous and structured methodology for identifying possible adverse events associated with the operation of a system and subsequently quantifying the respective probabilities of occurrence and the severity of the consequences.

Thus far, the majority of research in the field of SHM has been focussed on the identification, localisation and classification of damage. There have been fewer attempts to address decision-making processes and to incorporate risk into SHM problems. A static probabilistic risk-based approach to SHM was applied to a simulated truss structure in [4], although no attempt was made to forecast structural degradation. Flynn and Todd successfully applied a Bayes risk approach to the decision problem of sensor placement for an SHM system on square, gusset and T-shaped plates in [5]. The approach considered the risk of false positives and false negatives of damage identification in discrete regions of the plates. Cost-informed decision-making for miter gates was demonstrated in [6]; this involved using a Bayesian neural network trained on a finite element model to infer damage and forecast performance using a transition matrix. An approach proposed in [7] facilitates cost-efficient reliability-based maintenance. As the latter is a reliability-based approach rather than a risk-based approach, the costs of failure events and maintenance are not explicitly modelled. Hence, whilst the maintenance strategies developed may be cost-efficient for given safety parameters, they are not necessarily cost-optimal. There has been some research into the risk-based operation and maintenance of structures and components. Nielsen details a risk-based approach that is utilised for the operation and maintenance of off-shore wind turbines in [8], using probabilistic graphical modelling.
Similarly, Hovgaard and Brincker provide a case study demonstrating a risk-based approach to the monitoring and maintenance of a finite element model of a wind turbine tower experiencing circumferential cracking in [9]. In [10], a continuous-state partially-observable Markov decision process (POMDP) was demonstrated on artificial data for maintenance planning on a deteriorating bridge. Dynamic Bayesian networks are also employed in [11] for the diagnosis and prognosis of the structural health of an aircraft wing; this includes the probabilistic temporal modelling and prediction of crack growth.

The current paper aims to address the lack of a generalised framework for conducting risk-based monitoring of structures at the full-system scale by augmenting the current SHM paradigm with practices employed in probabilistic risk assessment, thereby facilitating the decision-making processes that motivate the implementation of SHM systems. An overview is given of the current paradigms for conducting PRA and SHM as outlined in the literature. Also provided is background theory regarding the key technologies required for mapping PRA onto SHM; namely, probabilistic graphical models in the form of Bayesian networks and influence diagrams. A notation is established before the augmented risk-based paradigm for SHM is detailed. Finally, a discussion around the framework is made and further challenges in the SHM decision-process are identified.
2. Structural Health Monitoring
Structural health monitoring involves implementing damage identification strategies to determine the health state of a structure throughout its operational lifetime. Statistical pattern recognition (SPR) offers a natural approach to SHM as it allows the uncertainties inherent in engineering problems to be dealt with robustly. It is for this reason that the SPR approach has been the focus of much research over the past three decades. The established SPR paradigm for an SHM system is composed of four procedures [1]:

1. Operational evaluation.
2. Data acquisition.
3. Feature selection.
4. Statistical modelling for feature discrimination.

Operational evaluation seeks to answer several questions concerning the implementation of an SHM system, specifically:
• What is the justification (safety and/or economic) for implementing an SHM system?
• How is damage defined for the system and what are the critical damage states?
• What are the environmental and operational conditions that the monitoring system is required to perform under?
• How does the operational environment limit data acquisition?

For an SHM system to be successfully developed and implemented, a substantial amount of information must be collected during the operational evaluation process, and significant effort may be necessary to obtain an adequate amount that is quantified in sufficient detail. Examples of required information include: monetary cost and reliability of the proposed SHM system; pertinent damage states of the structure and the thresholds at which they can be deemed to have occurred; and the temperature and load variations experienced by the structure during operation.
The data acquisition process is informed by the operational evaluation. The process aims to finalise the types, number and locations of sensors to be used in the SHM system. The data acquisition, storage and transmittal hardware must also be selected. The process is constrained by both economic restrictions and the limitations enforced by the expected environmental conditions; for this reason, the data acquisition process is context dependent and relies heavily on the information gathered during the operational evaluation stage.
Once the data have been acquired, a set of features must be constructed that indicate whether or not there is damage present in the structure. This procedure often involves processing the data acquired from the structure; common practices include domain transformation, dimensionality reduction, and normalisation [1].
Statistical models must be developed to exploit the discrepancy between features that indicate differing damage states. The degree of knowledge regarding the damage state obtained from an SHM system is highly dependent on the statistical model employed and can be evaluated in terms of Rytter's hierarchy [12]:

1. Is there damage in the system?
2. Where is the damage located?
3. What type of damage is present?
4. How severe is the damage?
5. How much useful life remains?

Whilst Rytter's hierarchy, in itself, does not lead to decisions being made, as SHM systems progress up the hierarchy, the information they yield becomes increasingly useful to agents tasked with deciding upon a course of action for a structure. Within the field of decision theory, cost and utility are metrics used ubiquitously for the comparison of courses of action and their consequences. By combining the cost/utility of a given consequence with the respective likelihood, one can arrive at the notion of risk.

3. Probabilistic Risk Assessment
Probabilistic risk assessment (PRA) is a method that is widely used for evaluating risks and making decisions associated with the design and management of safety-critical systems and high-value assets. In the context of PRA, risk is characterised by the likelihood of an adverse event occurring and the severity of the consequences of the event. The likelihood of occurrence for uncertain adverse events is quantified through probabilistic event sequence and system modelling. The consequences and expected costs/gains are compared and evaluated by finding an appropriate utility metric; obvious examples include financial cost and loss of human life, however, in many applications these are overly simplistic [13]. Probabilistic risk assessment is applied in a range of industries including nuclear, aerospace and chemical process. Whilst the exact methodology used for conducting PRA differs between industries, they generally adhere to the key steps as outlined by the US Nuclear Regulatory Commission (USNRC) and the International Atomic Energy Agency (IAEA) [2]:

1. Initial information collection.
2. Event-tree development.
3. System modelling.
4. Reliability modelling.
5. Failure sequence quantification.
6. Consequence analysis.

Further detail is provided for each step in the following subsections.
Figure 1: An example event tree for a parachute system to prevent fall injuries [14]. The initiating event (a jump from the plane) is followed by the top events (main chute and reserve chute); each branch terminates either in the adverse event (injury) or in system success (injury prevented).

Figure 2: Fault tree diagram representations of the AND-gate (left) and OR-gate (right).
Information regarding the design and operation of the structure in question is collated. Details such as component specifications, loading and environmental conditions are considered. The information gathered at this stage is used to inform the subsequent steps. Given the large quantity of information required for conducting PRA, an important factor to be considered at this stage is the method by which the necessary information is represented, stored and managed. A common practice is to utilise a database [13]. There is a clear analogy here with the operational evaluation stage for SHM.
(Top event: reserve chute fails; intermediate events: chute not deployed, auto-activation fails; basic events: ripcord breaks, altimeter malfunctions, dead battery, chute tangled.)
Figure 3: An example fault tree for a reserve parachute system [14].
Event trees outline potential accident sequences: combinations of initiating events and the subsequent system failures or successes that may result in an adverse consequence. The sequences of system failures and successes are known as top events. The system failures identified in the event-tree development stage are subsequently modelled as fault trees. An example event tree for a system designed to prevent injury following a jump from a plane is shown in Figure 1.
Fault trees are used in PRA to facilitate the quantification of system failure probabilities. The development of fault trees involves expressing the causal relationships between component failures and subsystem failures using Boolean logic gates. The level of detail captured in the fault tree (the level of components which are incorporated) is determined by the component level for which meaningful reliability data can be obtained. Failures of components belonging to the most fundamental level incorporated in the fault tree are known as basic events and are represented in a fault tree diagram as circles. Intermediate and top events are defined as combinations of other intermediate and basic events through Boolean logic gates such as the AND-gate and OR-gate. The fault tree diagram notation for the AND-gate and OR-gate is shown in Figure 2. An example fault tree for the deployment of a reserve parachute is shown in Figure 3.
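To make the gate logic above concrete, the following is a minimal sketch of fault tree quantification, assuming statistically independent basic events; the event names follow the reserve-parachute example of Figure 3 as reconstructed here, and all failure probabilities are assumed values for illustration only.

```python
# Sketch: quantifying a fault tree's top-event probability, assuming
# statistically independent basic events. Probabilities are illustrative.

def and_gate(*probs):
    # AND-gate: the output event occurs only if all inputs occur
    p = 1.0
    for q in probs:
        p *= q
    return p

def or_gate(*probs):
    # OR-gate: at least one input occurs (complement of none occurring)
    p_none = 1.0
    for q in probs:
        p_none *= 1.0 - q
    return 1.0 - p_none

# Basic-event failure probabilities (assumed values)
p_ripcord_breaks = 0.01
p_altimeter_fault = 0.005
p_dead_battery = 0.02
p_chute_tangled = 0.001

# Intermediate events, following the Figure 3 structure as reconstructed
p_auto_activation_fails = or_gate(p_altimeter_fault, p_dead_battery)
p_chute_not_deployed = and_gate(p_ripcord_breaks, p_auto_activation_fails)

# Top event: reserve chute fails
p_top = or_gate(p_chute_not_deployed, p_chute_tangled)
print(round(p_top, 6))
```

Because the gates compose, the same two functions quantify a fault tree of any depth, provided the independence assumption holds for the basic events.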
Information regarding the reliability of system components and the frequency of initiating events is necessary to quantify the probabilities of top events; this is typically gleaned from data and by applying appropriate reliability models.

3.5. Failure sequence quantification
By assigning failure rates to the components in the fault trees, the probability of top events may be computed. Propagating the probabilities of the initiating events and top events through the event tree allows the probability of each possible outcome in the event tree to be calculated.
With the probability of an adverse event quantified, a cost/utility metric must be chosen so that the risk associated with the failure sequence may be assessed. The risk assessment may then be used to inform design decisions, such as increasing safety by introducing additional redundancies in the system, or optimising cost by removing components that do not cause the risk to fall below an acceptable threshold. The risk assessment may also be used to inform risk-based inspection for a system that is in operation.
4. Probabilistic Graphical Models
Probabilistic graphical models (PGMs) are a powerful framework for reasoning and decision-making under uncertainty, a core problem in SHM. Probabilistic graphical models are representations of joint probability distributions, in which nodes denote a set of random variables and edges connecting nodes imply dependency between variables. The probabilistic graphical model representation provides benefits over a flat (non-graphical) representation [15]:

• They provide a compact and intuitive representation of complex probability distributions which makes them easier to understand, communicate and learn.
• They facilitate efficient computation by exploiting local independence structures.

A PGM over a set of N variables X may be specified by a set of M local functions f(Y_i), where Y_i is some subset of X, and a graph G comprised of nodes/vertices V and edges E. The joint probability distribution represented by the graph is obtained by:

P(X_1, X_2, ..., X_N) = K ∏_{i=1}^{M} f(Y_i)    (1)

where K is a normalisation factor ensuring the probabilities sum to unity.

There are two classes of problem associated with PGMs: inference and learning. Inference is concerned with obtaining the marginal or conditional probabilities of a subset of variables Z given any other subset Y, i.e. P(Z | Y). Learning is concerned with obtaining the graph structure and parameters given a complete, or incomplete, set of observed data values for X, i.e. G, f(Y_i) | X. The remainder of the current paper will be primarily concerned with inference problems and their application in a risk-informed SHM framework.

Bayesian networks (BNs) are a form of PGM. Specifically, they are directed acyclic graphs (DAGs) in which nodes represent random variables and edges connecting nodes represent conditional dependencies between variables.
For discrete random variables, the local functions that describe the conditional probability distributions (CPDs) between variables are conditional probability tables (CPTs); in the case of continuous random variables, they are conditional probability density functions (CPDFs).
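Equation (1) can be sketched directly for a small discrete example. The factors below are assumed for illustration and happen to be proper CPTs (so the normalisation sum is unity), but the code computes K's reciprocal explicitly, as the equation requires for arbitrary local functions.

```python
import itertools

# Sketch of Equation (1): a joint distribution over binary variables
# X = (X1, X2, X3) built from local factors f(Y_i), with normalisation
# ensuring the product sums to unity. Factor values are assumed.

factors = [
    (("X1",), {(0,): 0.8, (1,): 0.2}),                 # f1(X1)
    (("X1", "X2"), {(0, 0): 0.9, (0, 1): 0.1,
                    (1, 0): 0.3, (1, 1): 0.7}),        # f2(X1, X2)
    (("X2", "X3"), {(0, 0): 0.6, (0, 1): 0.4,
                    (1, 0): 0.2, (1, 1): 0.8}),        # f3(X2, X3)
]
variables = ["X1", "X2", "X3"]

def unnormalised(assignment):
    # Product of the local factors evaluated at a full assignment
    p = 1.0
    for scope, table in factors:
        p *= table[tuple(assignment[v] for v in scope)]
    return p

# Partition sum (the reciprocal of the normalisation factor K)
states = list(itertools.product([0, 1], repeat=len(variables)))
Z = sum(unnormalised(dict(zip(variables, s))) for s in states)

def joint(assignment):
    return unnormalised(assignment) / Z

# Marginal query P(X3 = 1) by summing out X1 and X2
p_x3 = sum(joint(dict(zip(variables, s))) for s in states if s[2] == 1)
print(round(p_x3, 4))
```

Enumeration like this scales exponentially with the number of variables, which is precisely why the efficient inference algorithms mentioned below are needed for realistic networks.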
Figure 4: An example Bayesian network. X is a parent of Y and an ancestor of Z; Z is said to be the child of Y and a descendant of X. Node X is independent of other nodes and so is specified by the unconditional distribution P(X). Observed variables are shaded grey.

Given observations on a subset of nodes in a BN, inference algorithms can be applied to obtain posterior distributions over the unobserved random variables. In some cases, analytical solutions of posterior distributions may be found by using exact inference methods. To solve the inference problem using direct computation from the joint probability distribution, the computational complexity increases exponentially with the size of the graph and quickly becomes intractable. Fortunately, algorithms have been developed that allow efficient computation [16].

Bayesian networks can be augmented to represent decision processes by incorporating nodes for decision variables and utility functions; these augmented networks are known as influence diagrams [17]. Decision nodes are denoted by squares and utility nodes are denoted by rhombi, as shown in Figure 5. Edges connecting random variables to utility nodes denote that the utility function is dependent on the state of that variable. Similarly, edges connecting decision nodes to utility nodes denote that the utility function has a dependence on the action decided upon.
Edges connecting random variables or decision nodes to other decision nodes denote order; that is to say, the random variable or decision is observed prior to the decision being made. Such edges are referred to as informational links as they do not imply a functional dependence, but rather that the information regarding the state of the variable is required for the decision to be made.

The influence diagram shown in Figure 5 can be interpreted as a binary decision process regarding whether to go out for a walk or stay in and watch TV under uncertainty in the future weather condition W_c given an observed weather forecast W_f. The nodes W_f and W_c can be considered as binary random variables representing the weather forecast and actual weather condition, respectively, with possible states domain(W_f) = domain(W_c) = {bad, good}, and the weather forecast is dependent on the weather condition. The possible actions can be summarised as domain(D) = {TV, walk}. The utility U achieved is then dependent on both the weather condition experienced and the decided action.
Figure 5: An example influence diagram representing the decision of whether to go outside or stay in underuncertainty in the future weather condition given an observed forecast.
In general, a policy δ is a mapping from all possible observations to possible actions. The problem of inference in influence diagrams is to determine an optimal strategy Δ* = {δ*_1, ..., δ*_n}, given a set of observations on random variables, where δ*_i is the policy for the i-th decision to be made in a strategy Δ* that yields the maximum expected utility (MEU). The expected utility is a function of probability and utility and, by this definition, is equivalent to risk.
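The Figure 5 decision can be solved by simple enumeration, giving a policy δ*(W_f) for each possible forecast. The paper specifies only the diagram's structure, so all numerical values below (the weather prior, the forecast model and the utility table) are assumed for illustration.

```python
# Sketch: solving the Figure 5 influence diagram by enumeration.
# All probabilities and utilities are assumed, illustrative values.

P_wc = {"good": 0.6, "bad": 0.4}                      # prior on weather W_c
P_wf_given_wc = {                                     # forecast model P(W_f | W_c)
    "good": {"good": 0.8, "bad": 0.2},
    "bad":  {"good": 0.3, "bad": 0.7},
}
U = {("good", "walk"): 100, ("bad", "walk"): -50,     # U(W_c, D)
     ("good", "TV"): 40,    ("bad", "TV"): 60}

def posterior_wc(wf):
    # P(W_c | W_f) via Bayes' rule
    joint = {wc: P_wc[wc] * P_wf_given_wc[wc][wf] for wc in P_wc}
    z = sum(joint.values())
    return {wc: p / z for wc, p in joint.items()}

def optimal_policy():
    # For each forecast, pick the action maximising expected utility
    policy, meu = {}, {}
    for wf in ("good", "bad"):
        post = posterior_wc(wf)
        eu = {d: sum(post[wc] * U[(wc, d)] for wc in post)
              for d in ("walk", "TV")}
        policy[wf] = max(eu, key=eu.get)
        meu[wf] = eu[policy[wf]]
    return policy, meu

policy, meu = optimal_policy()
print(policy)
```

With these assumed numbers the agent walks on a good forecast and stays in on a bad one; changing the utility table changes the policy, which is the sense in which utility selection governs the decision-maker's behaviour.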
5. Definitions
To establish a framework for mapping PRA onto SHM, some fundamental concepts will first be defined. In addition, a notation will be established for describing structures that can be expressed as hierarchical graphs.

One must start with a physical structure or system of interest S. It is assumed that S may be defined in terms of constituent units: components c, joints j and substructures s. Components and joints are considered irreducible base units of S, whereas substructures are compound units and may be comprised of joints, components and other substructures.

Figure 6: A hierarchical graphical representation of a generic structure S. Superscripts denote the level in the hierarchy and subscripts index each type of constituent unit in a given level. Dotted edges imply an arbitrary structuring between levels.

Figure 6 depicts a graphical representation of a hierarchical structure that may be considered without loss of generality. Nodes represent the global structure and its constituent units, and edges represent the dependence of a (sub)structure on its constituent units. At the top, or level 1, of the hierarchy is the global structure, with the hierarchy level denoted in the superscript. It can be seen that the global structure S is comprised of two substructures s^2_1, s^2_2 and a joint j^2_1, i.e. S = {s^2_1, j^2_1, s^2_2}. These units form the second level of the hierarchy. s^2_1 and s^2_2 may in turn be expanded to yield S = {{s^3_1, j^3_1, c^3_1}, j^2_1, {c^3_2, j^3_2, s^3_2}}. Progressing down the hierarchy levels, one can continue to expand the substructures into constituent units until the L-th level of the hierarchy, which is comprised solely of base units. By taking the expansion of S into its constituent base units and discarding the repeated units arising from substructures that share components, one obtains a list of the base units that form a given structure S.
Within each level of the hierarchy, units are numbered via a subscript from 1 to N^i_u, where N^i_u is the number of constituent units of type u in the i-th level of the hierarchy. The notation u^i_n, where i is an integer from 1 to L and n is an integer from 1 to N^i_u, provides a unique identifier for each unit within a structure.

It is assumed that there exists a set of features ν, observable from S, that are produced according to a generative latent-state model with latent state H*(t), where H*(t) is the true health state of S and may be expressed in terms of the true health states of the constituent components and joints, h*_{c^i_n}(t) and h*_{j^i_n}(t), respectively, i.e. H*(t) = {h*_{c^L_1}(t), ..., h*_{c^L_{N^L_c}}(t), h*_{j^L_1}(t), ..., h*_{j^L_{N^L_j}}(t)}. The structure S also has a predicted time-dependent health-state vector H(t) = {h_{c^L_1}(t), ..., h_{c^L_{N^L_c}}(t), h_{j^L_1}(t), ..., h_{j^L_{N^L_j}}(t)}. Health-state vectors can be constructed from any subset of components and joints.

For the structure/system S, there must exist a set of failure modes of interest F = {F_1, ..., F_{N_F}} whereby S ceases to be fit for purpose. It is assumed that a given failure mode is dependent on the health states of a subset of components, joints and substructures for which a health-state vector can be constructed. In addition, each failure state has an associated utility U_{F_n}.

Finally, for the structure S, there also exists a set of decisions d = {d_1, ..., d_{N_d}} which affect H*, each having an associated utility U_{d_i}. In addition, there will exist some set of environmental conditions e = {e_1(t), ..., e_{n_e}(t)} that will alter the distribution of ν.
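The hierarchical decomposition and the u^i_n identifiers lend themselves to a simple nested data structure. The sketch below is illustrative only: the particular units and level/index labels are assumed, loosely mirroring the generic structure of Figure 6, and the expansion discards repeated base units as described above.

```python
# Sketch: a hierarchical structure S with units labelled ("type", level,
# index) to mirror the u^i_n notation. Unit labels are assumed examples.

def substructure(kind, level, index, units):
    # Compound unit: a dict holding its identifier and constituent units
    return {"id": (kind, level, index), "units": units}

# Base units (components c and joints j) are plain tuples
s31 = substructure("s", 3, 1, [("c", 4, 1), ("j", 4, 1)])
s32 = substructure("s", 3, 2, [("c", 4, 2), ("j", 4, 2)])

S = substructure("S", 1, 1, [
    substructure("s", 2, 1, [s31, ("j", 3, 1), ("c", 3, 1)]),
    ("j", 2, 1),
    substructure("s", 2, 2, [("c", 3, 2), ("j", 3, 2), s32]),
])

def base_units(unit, seen=None):
    # Expand down the hierarchy, discarding repeated base units that
    # arise from substructures sharing components
    if seen is None:
        seen = set()
    if isinstance(unit, tuple):
        seen.add(unit)
    else:
        for u in unit["units"]:
            base_units(u, seen)
    return seen

print(sorted(base_units(S)))
```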
6. Mapping PRA onto SHM
Upon examination, it becomes apparent that there are both differences and similarities between the paradigms for SHM and PRA that can be examined to determine which aspects of PRA will be useful for SHM. Whilst it is clear that both SHM and PRA are utilised for the purpose of making decisions in the face of uncertainty, PRA is conducted offline for a system experiencing a set of anticipated initiating events. In contrast, the decision processes for which SHM is implemented are online and require continual predictions of the damage state of the structure. It is for this reason that the event-tree development and failure sequence quantification stages in PRA are less applicable to SHM.

Both paradigms begin with collating information regarding the structure and defining the context in which decisions are to be made. In fact, the first three stages of the PRA paradigm involve expressing the structure and context in a logical way which facilitates the quantification of risk and the decision-making process. It is in this formal expression of the structure that the decision-making process in the SHM paradigm stands to benefit. An overview of the risk-based SHM paradigm is as follows:

1. Operational evaluation.
2. Failure-mode modelling.
3. Decision modelling.
4. Data acquisition.
5. Feature selection.
6. Statistical modelling for feature discrimination.
With the aim of justifying the use and defining the context of a risk-based SHM system, the operational evaluation stage seeks to answer many of the same questions as in the standard paradigm. However, some questions require an approach that facilitates the failure-mode modelling and decision-modelling stages.

Foremost, information regarding the components c, joints j, substructures s, and the dependencies between them is required.

When identifying the critical damage states of the structure S, one should aim to identify the failure modes of interest F. Critical components, joints and substructures/subsystems that contribute to F should also be identified at this stage. The predicted damage states of these components h should be defined. The damage states of the critical substructures/subsystems H should be defined as a vector in terms of h.

For each failure mode in F, potential decisions d should be identified and the ways in which the actions influence the structure or the likelihood of failure modes occurring should be determined. Utility values U_F and U_d for all F and all d, respectively, should be quantified. The selection of utility values will determine the behaviour of the decision-making agent, and is analogous to setting a decision threshold in a standard SHM paradigm.

Environmental influences e should also be identified. It should also be decided whether the SHM system is to evaluate the health of the structure at static, independent instances in time, or predict future health states, thereby requiring a model forecasting the degradation of the structure.

For large, complex structures it may be beneficial to borrow the data management techniques used in PRA, such as databases, to organise the information obtained during the operational evaluation stage. This will allow for a rigorous and structured approach to the information collection and allow for the identification of aspects of the SHM system that require further specification or more information.
Having a formal information structure will also expedite the subsequent failure-mode modelling step, which requires detailed knowledge of the physical structure.

For each of the failure modes of interest in F, one should proceed to construct a fault tree, such as that shown in Figure 7, based upon the health states of the relevant components, joints and substructures/subsystems. It should be noted here that, in many cases, the exact nature of the failure modes will be unknown and so a best estimate based on engineering judgement may be used.

Figure 7: A fault tree of a (single) failure mode F, where the superscript denotes the hierarchy level and the subscript is an identifier.

Figure 8: A Bayesian network of a failure mode in F.

Fault trees offer a rigorous and consistent structure for expressing the failure modes; however, as statements in Boolean logic they are limited in their flexibility. In the context of SHM, it is desirable to represent the components in a fault tree as having multiple damage states, and it is for this reason that one should map the constructed fault trees into Bayesian networks. Bobbio et al. outline a convenient mapping from fault trees into Bayesian networks in [18], whilst also highlighting the additional flexibility that is granted by doing so. Additionally, Bayesian networks are used to represent structural failures in [19].

In the example shown in Figure 8, the component health states, substructure health states and failure event are represented as random variables, where the substructure health states are conditioned on the component health states and the failure event is conditioned on the substructure health states.
The random variables are defined using a conditional probability distribution (CPD), which may be discrete or continuous.

A node representing the health-state vector of the critical components and joints H should be included in the fault tree, as this latent state will be predicted during the statistical modelling process. To define the vector H within the Bayesian network, the conditional dependences between H and the nodes representing the local health states of the components and joints are expressed as a binary logic table.

One function of the failure-mode Bayesian network is to allow the flow of information from the statistical model to the decision, whilst parsing the information in a way that facilitates the defining of the failure events F. The network also allows the computation of marginal distributions for the probability of failure in each component, joint, or substructure, allowing for damage localisation.

Modelling the decision process involves augmenting the Bayesian network developed in the failure-modelling stage with nodes for each decision in d and for utilities U_F and U_d to produce an influence diagram. Decision nodes in which the actions alter the probability of a state or event should modify the CPDs accordingly. Utility nodes are constrained to be leaf nodes and should be dependent on the appropriate failure events or decisions.

For static problems, it may be convenient to model the decision process in a separate influence diagram which receives information regarding failure probabilities from the fault tree. This is because it is implicit that the decision is made after observations are made; if one attempts to solve a network in which a decision is made that yields a state that is inconsistent with the observed state, a conflict arises.

The data acquisition process should not differ from that in the standard SHM paradigm.
Here, there is a subtlety in that the data acquisition system should be designed so as to optimise the decision-making rather than damage identification.
The feature selection process should not differ from that in the standard SHM paradigm. Again, there is the subtlety that the features should be selected so as to optimise the decision-making.
The purpose of the statistical model is to predict the critical health states H given the selected feature set ν. As aforementioned, it is assumed that ν is produced through a generative latent-state model, with latent state H. Probabilistic classifiers that output a probability distribution over all possible states of H, such as Gaussian mixture models (GMMs), are compatible. Probabilistic classifiers are instrumental in building robustness to the uncertainty surrounding the true health state of S into the decision process. Ideally, the chosen statistical model will be capable of consistently identifying the actual health state under all identified operating and environmental conditions e, or at least appropriately reflect the uncertainty caused by varying conditions in the prediction.

Finally, if a model describing the degradation of S (i.e. a transition model for H) is required for forecasting failure events in the time-dependent case, the CPDs defining P(H_t | H_{t-1}, d) should be specified accordingly.

Figure 9: A two-dimensional four-bay truss comprised of 20 members, 8 of which are removable and denoted by a dashed line. Loads are applied at points L, and a preload is applied at point P. Load positions are shown as blue dots. The bays are numbered left to right from 1 to 4.
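A transition CPD P(h_t | h_{t-1}, d) for a single component can be sketched as a pair of action-dependent transition matrices and propagated forward to forecast degradation. The health states, action names and all transition probabilities below are assumed for illustration; the paper prescribes only that such CPDs be specified.

```python
# Sketch: a discrete degradation model for one component's health state
# h ∈ {0: undamaged, 1: damaged, 2: failed}, with the transition CPD
# P(h_t | h_{t-1}, d) depending on the action d. Values are assumed.

# Transition matrices: rows index h_{t-1}, columns index h_t
T = {
    "do_nothing": [
        [0.90, 0.09, 0.01],   # undamaged state degrades slowly
        [0.00, 0.80, 0.20],   # damage cannot self-heal
        [0.00, 0.00, 1.00],   # failure is absorbing without repair
    ],
    "maintain": [
        [0.99, 0.01, 0.00],   # maintenance arrests degradation
        [0.95, 0.05, 0.00],   # and repairs most damage
        [0.90, 0.10, 0.00],   # replacement restores a failed component
    ],
}

def step(belief, action):
    # Propagate a belief over h_{t-1} one time-step forward under action d
    matrix = T[action]
    return [sum(belief[i] * matrix[i][j] for i in range(3))
            for j in range(3)]

belief = [1.0, 0.0, 0.0]      # initially known to be undamaged
for _ in range(5):
    belief = step(belief, "do_nothing")
print([round(b, 4) for b in belief])
```

Forecasting the failure-mode probability then amounts to pushing such per-component beliefs through the failure-mode Bayesian network at each future time-step.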
7. Case study: Four-bay truss
To demonstrate the probabilistic graphical model formulation of a risk-based approach to an SHM problem, it was applied to a four-bay truss identical to that used in [20] and shown in Figure 9. For clarity, the example will be limited to a single failure mode and a single binary decision. The failure of joints will also be ignored.

The truss was composed of 20 aluminium members, each with length 250 mm and cross-sectional area 177 mm²; the overall length of the structure was 1 m and the height 0.25 m. The members were pinned together with steel bolts in lubricated holes. The truss was subjected to a preload of 5 kg at point P and additional consecutive loads of 10, 20 and 30 kg at each of the 8 points L in turn. For each of the 24 load cases, microstrains were measured at the midpoints of the 12 horizontal and vertical members. This process was performed 8 times in total, once with each cross-member removed.

In addition to the experimental data, a finite element simulation of the truss was developed, whereby removal of a cross-member was emulated by assigning it a Young's modulus of 1 MPa. As well as the 10, 20 and 30 kg loads used in the experiment, the truss was simulated with loads of 5, 15 and 25 kg. Furthermore, strains were obtained for the truss under each load case in its undamaged condition, i.e. with all cross-members intact.

In order to construct a risk-based decision framework for the truss, one must first define it formally. As it was elected to ignore joints, the global truss structure T can be defined as four substructures, one for each bay, i.e. T = {b_1, b_2, b_3, b_4}. As only the failure of cross-members was considered, each bay can in turn be defined as two components; for example, b_1 = {m_9, m_10}.
Consequently, the global structure may be represented as T = {m_9, m_10, m_11, m_12, m_13, m_14, m_15, m_16}.

A single failure mode F_T of the truss was considered: the full or partial collapse of the structure. This failure mode corresponds to the event where the truss is no longer able to support the load/preload; hence, F_T occurs when both cross-members in a single bay fail.

In an attempt to minimise the occurrence of the failure mode of interest F_T, a single binary decision d was identified: a choice between 'do nothing' and 'perform maintenance'. In addition, utilities were assigned to the failure event and the decidable actions in a manner which may reflect the relative costs associated with failure and maintenance in real-world engineering applications. The utilities assigned to the failure and decision are shown in Tables 1 and 2, respectively.

For the purposes of demonstration, it is assumed that the load on the truss will be uncertain, varying in discrete time within the interval [0, w_max], where w_max is defined in subsection 7.3. Furthermore, it is assumed that the location of the load also varies in discrete time and that, in the limit of infinite time-slices, each of the 8 locations is visited an equal number of times.

Table 1: A table showing the entries of the utility function U(F_T), where F_T = 0 and F_T = 1 denote the truss being operational and failed, respectively.

F_T | U(F_T)
0   | 15
1   | −

Table 2: A table showing the entries of the utility function U(d), where d = 0 and d = 1 denote the 'do nothing' and 'perform maintenance' actions, respectively.

d | U(d)
0 | 0
1 | −

The failure mode F_T of the truss can be represented as the fault tree shown in Figure 10, where the failure of a bay b_i is defined as the AND-gate of its two cross-member failures, and the failure of the truss F_T is defined as the OR-gate of the bay failures.

Figure 10: A fault tree of failure mode F_T for a four-bay truss.
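The AND/OR-gate logic of the fault tree in Figure 10 can be sketched directly in code: given cross-member failure probabilities (the values below are purely illustrative assumptions, not estimates from the case study), the marginal probability of the failure event F_T follows by enumerating all 2^8 health-state vectors.

```python
from itertools import product

# Hypothetical cross-member failure probabilities for m9..m16 (illustrative).
P_MEMBER_FAIL = [0.05, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02]

def truss_fails(h):
    """Deterministic fault-tree logic: a bay fails if both of its
    cross-members fail (AND-gate); the truss fails if any bay fails (OR-gate)."""
    bays = [h[2 * i] and h[2 * i + 1] for i in range(4)]
    return any(bays)

def p_truss_failure():
    """Marginal P(F_T = 1), obtained by enumerating all 2^8 health-state
    vectors and weighting the deterministic gate output by the probability
    of each vector (members assumed independent here)."""
    total = 0.0
    for h in product((0, 1), repeat=8):
        weight = 1.0
        for p, hi in zip(P_MEMBER_FAIL, h):
            weight *= p if hi else 1.0 - p
        total += weight * truss_fails(h)
    return total

print(round(p_truss_failure(), 6))
```

Because the gate CPDs are deterministic binary logic tables, the enumeration agrees with the closed form 1 − (1 − p₁p₂)(1 − p₃p₄)(1 − p₅p₆)(1 − p₇p₈); the Bayesian network machinery becomes valuable once the member states are uncertain and correlated through the classifier output.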
The failure mode F_T occurs if at least one bay b_i fails; a bay b_i fails if both of its cross-members fail. To map the fault tree for the failure event F_T into a probabilistic graphical model, h_b and h_m will be used to denote the random variables representing the local binary health states of the bays and cross-members, respectively, where 0 corresponds to intact and 1 corresponds to failed. Additionally, H will be used to denote the random variable vector for the health state of the global structure, where H = {h_m9, h_m10, h_m11, h_m12, h_m13, h_m14, h_m15, h_m16}. For conciseness, the vector H will, on occasion, be summarised by H, the decimal representation of the 8-bit binary number specified by the vector. Finally, the notation F_T will be retained to represent the random variable corresponding to the failure event. The Bayesian network corresponding to the fault tree shown in Figure 10 is shown in Figure 11. The conditional probability distributions specifying P(F_T | h_b1, h_b2, h_b3, h_b4) (or P(F_T | h_b) for brevity) and P(h_b1 | h_m9, h_m10) (or P(h_b1 | h_m,b1) for brevity) are shown in Tables 3 and 4, respectively.

The purpose of health-state transition modelling is to develop the conditional probability distribution P(H_t+1 | H_t, d) that predicts the future health state of the truss forward in time, given the current health state and the decided action.

Figure 11: A Bayesian network representation of failure mode F_T for a four-bay truss.

Table 3: A table showing the entries of the conditional probability distribution P(F_T | h_b1, h_b2, h_b3, h_b4), where h_bi = 0 and h_bi = 1 denote a bay being intact and failed, respectively, and F_T = 0 and F_T = 1 denote the truss being operational and failed, respectively.

h_b1 h_b2 h_b3 h_b4 | P(F_T = 1 | h_b) | P(F_T = 0 | h_b)
0    0    0    0    | 0                | 1
1    0    0    0    | 1                | 0
0    1    0    0    | 1                | 0
1    1    0    0    | 1                | 0
0    0    1    0    | 1                | 0
...  ...  ...  ...  | ...              | ...
0    1    1    1    | 1                | 0
1    1    1    1    | 1                | 0

For the purpose of this demonstration, it was decided that the 'perform maintenance' action simply returns the structure to its undamaged state, i.e. with no cross-members failed, with probability 1, independent of H_t.

With regard to the 'do nothing' action, it was first assumed that the truss would not spontaneously transition from a more advanced damage state to a lesser one; that is to say, cross-members would not self-repair in the absence of intervening maintenance, or, without maintenance, the health state of the structure monotonically degrades as a function of time.

The assumed loading range [0, w_max] was discretised into 100 evenly-spaced increments; combined with the 8 possible load locations, this resulted in 800 unique considered load cases L_c. For a given load case L_c, a transition in health state was defined as H_t+1 = H_t + δH_t→t+1, where δH_t→t+1 is an 8-bit binary vector whose i-th entry is equal to 1 if the yield stress of aluminium (300 MPa) is exceeded in member m_i+8 when the truss is simulated in health state H_t subject to load case L_c, and equal to 0 otherwise. The conditional probability of transitioning from H_t to H_t+1 under load case L_c, P(H_t+1 | H_t, L_c), was assigned unity if δH_t→t+1 = H_t+1 − H_t for load case L_c, and assigned zero otherwise. The full transition matrix P(H_t+1 | H_t) was then populated, where the entries are given by

P(H_t+1 | H_t) = (1 / N_Lc) Σ_{Lc=1}^{N_Lc} P(H_t+1 | H_t, L_c)    (2)

where N_Lc is the total number of load cases considered; here, N_Lc = 800.

For illustrative purposes, the maximum load w_max was determined by asserting P(H_t+1 ≠ 0 | H_t = 0) = 0.005; the value of w_max that satisfied this condition was found to be approximately 6900 kg. This is obviously somewhat arbitrary, and in practice the maximum load for a structure may be estimated during the operational evaluation stage.

The transition model developed provides a means of forecasting future health states of the truss; a heatmap representation of the transition matrix is shown in Figure 12.

Table 4: A table showing the entries of the conditional probability distribution P(h_b1 | h_m9, h_m10), where h_m = 0 and h_m = 1 denote a member being intact and failed, respectively, and h_b = 0 and h_b = 1 denote a bay being intact and failed, respectively.

h_m9 h_m10 | P(h_b1 = 1 | h_m,b1) | P(h_b1 = 0 | h_m,b1)
0    0     | 0                    | 1
1    0     | 0                    | 1
0    1     | 0                    | 1
1    1     | 1                    | 0

Figure 12: A heatmap showing the conditional probability distribution transition matrix P(H_t+1 | H_t).

The purpose of the statistical classifier is to obtain a probability distribution over the current health state given a set of observed features, i.e. P(H_t | ν). In the current case study, the damage-indicative features are the strains measured from the horizontal and vertical members of the truss. The classifier selected for the current case study was comprised of two components: a detector and a localiser, corresponding to the first two stages of Rytter's hierarchy. Whilst generative models may better reflect the causality of the problem at hand, where it is assumed that the features are generated as a result of the latent state and P(H_t | ν) may be computed via Bayes' theorem, discriminative classifiers that directly learn a mapping from the feature space to the label space are also applicable in the risk-based decision framework.

A Gaussian novelty detector was implemented to determine the probability that the structure is currently in its undamaged state, P(H_t = 0 | ν), and, as its complement, the probability that the structure is currently in a damaged state, P(H_t ≠ 0 | ν).
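Returning to the transition model, the averaging over load cases in equation (2) can be sketched on a toy two-member structure: each load case induces a deterministic health-state transition, and these are averaged with equal weight 1/N_Lc. The damage-increment function below is a hypothetical stand-in for the finite element yield check described above.

```python
from itertools import product

N_STATES = 4  # toy 2-member structure: H in {0, 1, 2, 3} (decimal of a 2-bit vector)

def delta(H, load_case):
    """Hypothetical deterministic damage increment delta_H for one load case.

    Bit i of the result is 1 if member i yields under this load case given
    the current state H (a stand-in for the finite element stress check).
    Assumed here: case 0 damages nothing; case 1 fails member 1 (bit 0);
    case 2 fails member 2 (bit 1) only if member 1 has already failed.
    """
    if load_case == 1 and not (H & 1):
        return 1
    if load_case == 2 and (H & 1) and not (H & 2):
        return 2
    return 0

def transition_matrix(n_cases=3):
    """P(H_t+1 | H_t): each load case gives a deterministic transition, and,
    as in equation (2), these are averaged with equal weight 1/N_Lc."""
    P = [[0.0] * N_STATES for _ in range(N_STATES)]
    for H, lc in product(range(N_STATES), range(n_cases)):
        P[H][H + delta(H, lc)] += 1.0 / n_cases
    return P

P = transition_matrix()
print(P[0])  # from the undamaged state: stay, or lose member 1
```

Each row of the resulting matrix sums to one, and no probability mass ever flows from a more damaged state to a less damaged one, consistent with the monotonic-degradation assumption made for the 'do nothing' action.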
The first two principal components of the simulated strain data for the undamaged structure are compared to those from the damaged structure in Figure 13.

Figure 13: A comparison of the distributions of the first two principal components of the strain data obtained from the finite element model of the undamaged and damaged truss.

Inspection of Figure 13 reveals that it is possible to discriminate between the undamaged and damaged finite element simulation data using only the first principal component of the strains; hence, the novelty detector uses the first principal component as the discriminative feature. As such, this principal component projection was learned from the training data, and the detector was formed by computing the mean µ and standard deviation σ of the univariate distribution of the first principal component. If the first principal component of an incoming set of strains were to lie within the range µ ± 3σ, it was asserted that the observed strains came from the structure in its undamaged condition with confidence 0.997, i.e. P(H_t = 0 | ν) = 0.997. For observations lying outside of the µ ± 3σ confidence interval, P(H_t = 0 | ν) was given by the probability mass in the tail of the Gaussian probability density function parametrised by µ and σ.

The function of the localiser component of the statistical classifier is to distribute P(H_t ≠ 0 | ν) over the remaining 255 health states corresponding to the various combinations of cross-member failures. For consistency with [20], the classifier selected for this purpose was an artificial neural network (ANN) with an input node for each of the twelve strain measurements, an output node for each cross-member, and three hidden layers with twelve, twelve and eight nodes, respectively. The activation function used was the hyperbolic tangent function.
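The novelty-detection rule described above can be sketched as follows; the training sample is hypothetical, and the two-sided tail mass is one possible reading of 'the probability mass in the tail'.

```python
import math

def fit_detector(train):
    """Fit the mean and standard deviation of the (already projected)
    training feature for the undamaged condition."""
    n = len(train)
    mu = sum(train) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in train) / n)
    return mu, sigma

def p_undamaged(x, mu, sigma):
    """P(H = 0 | nu) under the rule described above: 0.997 inside the
    mu +/- 3 sigma band, Gaussian tail mass outside it."""
    z = abs(x - mu) / sigma
    if z <= 3.0:
        return 0.997
    # two-sided tail mass of the standard normal beyond |z|
    return math.erfc(z / math.sqrt(2.0))

mu, sigma = fit_detector([9.8, 10.1, 10.0, 9.9, 10.2])  # hypothetical data
print(p_undamaged(10.05, mu, sigma), p_undamaged(25.0, mu, sigma) < 0.01)
```

An observation inside the band is assigned the fixed 0.997 confidence, while an extreme outlier is assigned a probability of being undamaged that decays rapidly towards zero; the complement, P(H ≠ 0 | ν), is what the localiser subsequently distributes over the damaged states.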
As an identical classifier to that used in [20] was implemented, only the eight single-failure health states, H_t ∈ {1, 2, 4, 8, 16, 32, 64, 128}, are considered by the localiser in this case study.

In accordance with [20], a training dataset was constructed from the finite element data for the loads of 10, 20 and 30 kg, and a validation dataset from the finite element data for the loads of 5, 15 and 25 kg. In both instances, 100 repetitions of the datasets were produced and superimposed with a noise pattern of 1 microstrain RMS. The optimal weights were computed using the scaled conjugate gradient (SCG) back-propagation algorithm [21], and were evaluated and selected based on the classification performance of the network on the validation dataset. Although not inherently probabilistic, a pseudo-probabilistic interpretation of the activations of the output nodes was acquired through the use of a softmax function.

The failure model, transition models, classifier, decisions and utilities can be combined to form a partially-observable Markov decision process, represented by the limited memory influence diagram (LIMID) shown in Figure 14. Figure 14 shows the decision process for two decisions over three time-slices. The edge connecting ν_t+0 to d_t+0 implies that the features are observed prior to the decision being made. Similarly, the edge connecting d_t+0 to d_t+1 implies that d_t+0 is decided before d_t+1.

It should be noted that the model shown in Figure 14 assumes a generative model for the features ν; for discriminative classifiers, the direction of the conditioning edge connecting nodes H_t+0 and ν_t+0 would be reversed.
Figure 14: An influence diagram representing the partially observable Markov decision process for determining the utility-optimal maintenance strategy for the cross-members of a four-bay truss, given observations of strains made from the horizontal and vertical members. Observed variables are shaded grey. The fault tree failure models for the latter time steps have been represented as the nodes F′_t for compactness.
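The expected-utility comparison that such an influence diagram encodes can be sketched for a single decision on a toy three-state problem: for each action, the current belief over health states is propagated through the transition model, and the failure and action utilities are summed. All utilities, transition probabilities and beliefs below are illustrative assumptions, not the values used in the case study.

```python
# Hypothetical toy problem: three health states, where state 2 implies the
# failure event F_T. All numbers below are illustrative only.
U_FAIL = {0: 0.0, 1: -100.0}       # U(F_T): operational vs failed
U_ACTION = {0: 0.0, 1: -20.0}      # U(d): 'do nothing' vs 'perform maintenance'
P_FAIL_GIVEN_H = {0: 0.0, 1: 0.0, 2: 1.0}  # F_T deterministic given H

# P(H_t+1 | H_t, d): maintenance (d=1) returns the structure to state 0 with
# probability 1; doing nothing (d=0) lets it degrade monotonically.
TRANS = {
    0: {0: [0.90, 0.08, 0.02], 1: [0.80, 0.15, 0.05], 2: [0.0, 0.0, 1.0]},
    1: {h: [1.0, 0.0, 0.0] for h in range(3)},
}

def expected_utility(belief, d):
    """Expected utility of action d for a belief P(H_t) over health states."""
    eu = U_ACTION[d]
    for h, p_h in enumerate(belief):
        for h_next, p_t in enumerate(TRANS[d][h]):
            p_f = P_FAIL_GIVEN_H[h_next]
            eu += p_h * p_t * (p_f * U_FAIL[1] + (1 - p_f) * U_FAIL[0])
    return eu

def decide(belief):
    """Select the action maximising expected utility."""
    return max((0, 1), key=lambda d: expected_utility(belief, d))

print(decide([0.9, 0.1, 0.0]), decide([0.2, 0.5, 0.3]))
```

With a belief concentrated on the undamaged state, the expected cost of failure is small and 'do nothing' wins; once enough posterior mass sits on the damaged states, the maintenance cost becomes the cheaper gamble. Solving the full LIMID extends this comparison over multiple coupled time-slices.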
8. Results
When applied to the experimental strain data from the damaged truss, the univariate Gaussian novelty detector was able to correctly identify 175 out of the 192 observations as novel with respect to the simulated undamaged data, thereby yielding an overall accuracy of 91.1%, as shown in the confusion matrix in Figure 15.

Figure 15: A confusion matrix detailing the classification performance of the novelty detector on the experimental data.

The 8.9% misclassification can be elucidated by examining the distribution of the experimental data as projected through the principal component mapping learned from the finite element simulation data, shown in Figure 16. Separability between the projected damaged and undamaged datasets is absent where values of the first principal component are approximately 3.9.
When applied to the experimental strain data, the neural network localiser has an overall classification accuracy of 59.9%. The full confusion matrix is shown in Figure 17. Whilst 60% classification accuracy may be considered low, for an 8-class problem the neural network provides a significant improvement over simply guessing, which would yield an accuracy of 12.5%. The imperfect classifier has been deliberately chosen here as a possible source of uncertainty.

The misclassification error of 40.1% can be explained by considering the physics of the problem at hand. The selected damage-sensitive features were the strains in the horizontal and vertical members of the truss; however, significant changes in the strains are only expected of members in the load path between the end fixture and the applied mass. Therefore, the features are largely insensitive to damage when the mass is closer to the fixture than the damaged cross-member, and any differences can be attributed to the strains induced by the preload. In 72 of the 192 (37.5%) cases, the damaged cross-member is not in the load path.
The decision algorithm was tested on a dataset comprised of the experimental strains for the truss in its damaged conditions and, due to the lack of experimental data, finite element simulation data for the truss in its undamaged condition. Equal proportions of the undamaged and damaged data were used: 192 sets of strains from each.

Figure 16: A comparison of the distributions of the first two principal components of the strain data obtained from the finite element model of the undamaged truss and the strain data from the experiment performed on the damaged truss, mapped through the projection learned from the simulated training data.

Figure 17: A confusion matrix detailing the classification performance of the artificial neural network localiser on the experimental data.

Figure 18: A confusion matrix detailing the 'accuracy' of the decision algorithm for both decisions in the three time-slice problem, for a fixed failure-event cost U(F_T = 1).

The decision algorithm used for testing was similar to that shown in Figure 14, except using the discriminative pseudo-probabilistic ANN classifier rather than a generative model. The graphical model was implemented in MATLAB using the Bayes Net Toolbox [22] and solved using the junction tree algorithm for influence diagrams described in [23]. Utilities were as shown in Tables 1 and 2. Three failure events in consecutive time-steps were considered, in conjunction with two 'do nothing'/'perform maintenance' decisions in the first two time-steps, with a single observation made during the first time-step.

The performance of the decision process was evaluated with a metric similar to that of a classifier's overall accuracy. Whereas a classifier's accuracy is a comparison between the predicted outputs and the target outputs, this 'decision accuracy' is a comparison between the actions decided given the outputs of the classifier and the optimal actions decided given perfect information of the health state, i.e.
the utility-optimal decisions when the targets of the classifier are provided in place of P(H | ν). The target and output classes '0' and '1' for Figures 18, 19 and 20 correspond to the 'do nothing' and 'perform maintenance' actions, respectively.

Figure 18 shows the performance of the decision algorithm across all 768 decisions associated with the test dataset. It can be seen that an overall 'accuracy' of 93.2% was achieved, meaning that the optimal decision given perfect information of the health state was selected in 716 of the cases when the statistical classifier was used to infer the health state. In 40 of the cases, the 'perform maintenance' action was selected unnecessarily; this is a result of the uncertainty in the health state triggering a more conservative action to be taken. This form of error is akin to a 'false positive' or type I error. In 12 of the cases, the 'do nothing' action was deemed to be optimal whereas, had perfect information of the structure's health state been available, the optimal decision would, in fact, have been 'perform maintenance'. This form of error is akin to a 'false negative' or type II error.

The severity/significance of type I and type II errors is dependent on the context of the SHM system. For example, for an offshore wind structure, erroneously sending inspection/maintenance engineers has a higher cost relative to the failure event, whereas for a bridge the cost of inspection/maintenance is relatively lower with respect to the cost of failure. In this risk-based decision framework, the costs are explicitly modelled and may be used to inform a preferential selection of classifier with regard to type I and type II errors.

Figure 19: A confusion matrix detailing the 'accuracy' of the decision algorithm for the first decision in the three time-slice problem, for a fixed failure-event cost U(F_T = 1).

Figures 19 and 20 provide a breakdown of the 'decision accuracy' shown in Figure 18 for the first and second decisions, respectively.
Figure 19 shows that, for the first decision, the algorithm was able to select the correct action in 332 of the 384 cases, yielding an overall 'decision accuracy' of 86.5%. Additionally, it can be seen that in 40 cases the 'perform maintenance' action was selected incorrectly, and in 12 cases the 'do nothing' action was selected incorrectly. Comparing Figures 18 and 19 reveals that all type I and type II errors occur during the first decision. Logically, Figure 20 shows that the algorithm has a 'decision accuracy' of 100% with respect to the second decision. Moreover, Figure 20 shows that all optimal decisions during the second time-slice are 'do nothing'. This result can be explained by considering the two possible first decisions. If, during the first time step, the algorithm has decided that maintenance is warranted, then, under the assumed transition model P(H_t+1 | H_t+0, d = 1), the structure is guaranteed to be in its undamaged health state in the second time step, in which case further maintenance is unwarranted given P(H_t+2 | H_t+1, d = 0). If, instead, the 'do nothing' action was selected, then, for the failure cost considered, the expected utility of maintenance remained lower than that of doing nothing in the second time-slice also.

Figure 20: A confusion matrix detailing the 'accuracy' of the decision algorithm for the second decision in the three time-slice problem, for a fixed failure-event cost U(F_T = 1).

Figure 22: A comparison of the 'accuracy' of the decision process as a function of the cost of the failure event when the health state is inferred using the statistical classifier and when a uniform distribution over health states is assumed. Failure-event cost is defined as −U(F_T = 1).

It can be seen from Figure 21 that, for a given failure-event cost, the time until maintenance increases with the cost of maintenance. It should also be noted that, logically, if the cost of maintenance exceeds the cost of failure, the structure will be allowed to operate until failure.

To investigate the influence of the cost of failure upon the overall 'accuracy' of the decision algorithm, the decision process used to produce Figure 18 was repeated for varying U(F_T = 1).
It should be noted that the utility of the 'perform maintenance' action was fixed at the value given in Table 2. Additionally, for each value of U(F_T = 1), the decision algorithm was also executed assuming a uniform distribution over the health states targeted by the classifier, rather than the distribution as predicted by the classifier. Figure 22 shows how the 'decision accuracy' of each algorithm varies with the cost of the failure event.

Figure 22 shows that the decision algorithms using the classifier and using the uniform-distribution assumption both achieve perfect 'accuracy' for certain values of the failure-event cost. In the range U(F_T = 1) ≤ −450, the accuracy is constant. This is likely because, due to the high cost of failure, given perfect knowledge of the health state, the optimal decisions for all damage cases other than undamaged are 'perform maintenance'. For the algorithm utilising the classifier, misclassification of the damage location does not influence the decided action in this cost range. As previously mentioned, the uniform assumption is ignorant of the health state, and the decided actions are likewise invariant in this cost range.

For U(F_T = 1) ≥ −200, an increase in accuracy is seen for the algorithm assuming a uniform distribution over the health states. This is a result of the health-state-invariant decision being 'do nothing'; in this failure-cost range, this assumption is able to correctly decide actions for the undamaged cases and the less severe damage locations. The 'decision accuracy' of the algorithm employing the classifier fluctuates in this range; this may be due to the decided actions being sensitive to the distribution of uncertainties over the health states.
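The 'decision accuracy' metric used throughout these results can be sketched as follows; the threshold-style decision rule and the posteriors below are hypothetical stand-ins for the full influence-diagram solve, used only to show how decisions under the classifier are compared against decisions under perfect information.

```python
def decide(posterior, cost_ratio=0.25):
    """'Perform maintenance' (1) when the probability of any damaged state
    exceeds the maintenance/failure cost ratio; otherwise 'do nothing' (0).
    This threshold form is a hypothetical stand-in for the expected-utility
    solve of the influence diagram."""
    p_damage = sum(posterior[1:])
    return int(p_damage > cost_ratio)

def decision_accuracy(posteriors, true_states):
    """Fraction of cases where the action decided from the classifier output
    matches the action decided under perfect knowledge of the health state
    (a one-hot distribution on the true state)."""
    agree = 0
    for post, h in zip(posteriors, true_states):
        one_hot = [0.0] * len(post)
        one_hot[h] = 1.0
        agree += decide(post) == decide(one_hot)
    return agree / len(posteriors)

# Hypothetical posteriors over {undamaged, damaged} and the true states.
posteriors = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.6, 0.4]]
true_states = [0, 0, 1, 1]
print(decision_accuracy(posteriors, true_states))
```

In this toy run, the second case is a type I error in the decision sense: the posterior's uncertainty triggers maintenance that perfect information would have shown to be unnecessary. Note that a misclassified health state only degrades 'decision accuracy' when it changes the decided action, which is why this metric can exceed the classifier's own accuracy.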
9. Discussion
The framework described and demonstrated in the current paper provides an approach to risk-based decision-making in the context of SHM. Decision-making is facilitated through the inclusion of aspects of PRA, such as fault tree modelling and risk, thereby allowing for the comparison of actions and the identification of a strategy that maximises expected utility.

The PRA paradigm currently practised in industries such as aerospace and nuclear provides a basis for the formalisation of the operational evaluation procedure. Organising the information specifying the structure and monitoring system in a database will assist with ensuring that all the necessary information required for subsequent stages is acquired, and it will also provide a structured method for the retrieval of applicable information at each stage.

The fault tree development process of PRA provides the key novelty of this risk-based approach to SHM. Firstly, it facilitates the definition of key failure modes of interest and provides a structured method for identifying pertinent components whose health states should be targeted by a statistical classifier. The size of the decision space for any given structure in the context of SHM is vast, and poses an intimidating problem to begin addressing. By targeting selected failure modes of interest for a structure and modelling them as fault trees, the scope of the decision-maker may be limited, thereby making the problem more approachable; additional failure modes may subsequently be incorporated as an SHM system is further developed/expanded. Mapping the fault trees into Bayesian networks enables the framework to retain information regarding the uncertainties in the health states, thereby allowing robustness in the decision-making. Moreover, intermediate nodes within the Bayesian network representation of a fault tree may be queried, yielding marginal distributions that provide information about the probability of damage within components and/or substructures.
This information may be utilised to guide inspection and maintenance engineers to specific locations, potentially saving time and reducing the cost of the actions, particularly for larger structures.

Whilst the framework presented addresses some of the problems surrounding the SHM decision process, there remain a number of challenges. One challenge, which has been widely acknowledged in the SHM community, is that data from the damage states of interest for a structure are seldom available prior to the implementation of an SHM system. This poses an issue in the development of the classifiers on which the decision process is highly dependent, and a choice must be made regarding the approach to the statistical modelling. One option is to take a model-driven approach [24] that utilises outputs from physics-based models of the structure in its damage states of interest to learn a classifier in a supervised manner prior to implementation of the SHM system. Subsequently, the classifier can be continuously updated and validated with data obtained during the monitoring campaign. Alternatively, a semi-supervised approach can be taken, in which a clustering algorithm is applied to the data acquired throughout the monitoring campaign. Clusters are attributed damage-state labels through the incorporation of labelled data into the clustering algorithm [25]; damage-state labels for data points may be obtained through inspection of the structure [26].

The results presented previously show that the performance of a probabilistic risk-based decision algorithm is dependent on the available information regarding the health state of the structure. As demonstrated, gains with regard to utility can be made in the absence of high-accuracy classifiers, provided uncertainty is accounted for.
This provides motivation for moving towards the use of truly probabilistic classifiers in the context of SHM, be they discriminative (such as relevance vector machines (RVMs) [27]) or generative (such as GMMs [28]).

In addition to being dependent on the statistical classifier used, the optimality of decisions is highly contingent on the appropriateness of the transition model used; if the degradation of the structure is not accurately modelled, erroneous actions may be taken. Facing a similar issue to the statistical modelling process, oftentimes, data describing the transitions between the health states of interest are not held a priori. Again, one is faced with the choice of taking a model-driven approach involving the simulation of the degradation, or a data-driven approach that utilises data obtained during the monitoring campaign. The development of data-driven transition models, or the validation of model-driven transition models, is an awkward problem. Because interventions are performed on the structure during operation, information on transitions between health states is regularly censored, meaning that the quantities of data spanning all state transitions of interest, required for developing/validating transition models, may never be acquired; this is particularly troublesome for one-off/bespoke structures. This problem is left as an open research question.

To ensure the desired performance of the decision algorithm, a vital stage in the risk-based framework is to assign utilities/costs to failure events and actions. Currently, within the literature there is no formal approach to how these values should be elicited, nor is there a consensus on how the risk preferences of an SHM decision-maker should be specified: should an agent be risk-averse, risk-neutral, or risk-seeking?
The issue at hand is of both a technical and an ethical nature, and whilst it will not be discussed in further detail in the current paper, it is highlighted here to stimulate the conversations required for progress in the area of risk-informed decision-making for SHM.

In summary, a probabilistic risk-based framework for structural health monitoring was presented. Borrowing practices frequently used in probabilistic risk assessment, such as the use of fault trees to model system failures, the framework facilitates robust decision-making under uncertainty and provides advancements in the utility-optimal operation and maintenance of structures.
Acknowledgements
The authors would like to acknowledge the support of the UK EPSRC via the Programme Grant EP/R006768/1. KW would also like to acknowledge support via the EPSRC Established Career Fellowship EP/R003625/1. The authors would like to thank Dr Mark Bateman of EDF Energy for providing valuable discussions.
Conflict of interest
The authors declare that they have no conflict of interest.
References

[1] C.R. Farrar, K. Worden, Structural Health Monitoring: A Machine Learning Perspective, John Wiley & Sons, Ltd, United Kingdom, 2013.
[2] US Nuclear Regulatory Commission, PRA Procedure Guide, US Nuclear Regulatory Commission.
[3] J.R. Fragola, G. Maggio, Space shuttle operational risk assessment, AIP Conference Proceedings 361 (1996) 719–720. doi:10.1063/1.49935.
[4] doi:10.12783/shm2019/32376.
[5] E.B. Flynn, M.D. Todd, A Bayesian approach to optimal sensor placement for structural health monitoring with application to active sensing, Mechanical Systems and Signal Processing 24 (4) (2010) 891–903. doi:10.1016/j.ymssp.2009.09.003.
[6] M. Vega, M. Todd, A variational Bayesian neural network for structural health monitoring and cost-informed decision-making in miter gates, Structural Health Monitoring. doi:10.1177/1475921720904543.
[7] M. Gobbato, J.B. Kosmatka, J.P. Conte, A recursive Bayesian approach for fatigue damage prognosis: an experimental validation at the reliability component level, Mechanical Systems and Signal Processing 45 (2) (2014) 448–467.
[8] J.S. Nielsen, Risk-Based Operation and Maintenance of Offshore Wind Turbines, Ph.D. thesis, Aalborg University (2013). doi:10.13052/rp-9788793102521.
[9] M.K. Hovgaard, R. Brincker, Limited memory influence diagrams for structural damage detection decision-making, Journal of Civil Structural Health Monitoring 6 (2) (2016) 205–215. doi:10.1007/s13349-016-0153-z.
[10] R. Schöbi, E.N. Chatzi, Maintenance planning using continuous-state partially observable Markov decision processes and non-linear action models, Structure and Infrastructure Engineering 12 (8) (2016) 977–994. doi:10.1080/15732479.2015.1076485.
[11] C. Li, S. Mahadevan, Y. Ling, S. Choze, L. Wang, Dynamic Bayesian network for aircraft wing health monitoring digital twin, AIAA Journal 55 (2017) 1–12. doi:10.2514/1.J055201.
[12] A. Rytter, Vibration Based Inspection of Civil Engineering Structures, Ph.D. thesis, Aalborg University (1993).
[13] T. Bedford, R. Cooke, Probabilistic Risk Analysis: Foundations and Methods, Cambridge University Press, Cambridge, United Kingdom, 2001.
[14] US Nuclear Regulatory Commission, Probabilistic Risk Assessment (PRA) (2018).
[15] L.E. Sucar, Probabilistic Graphical Models: Principles and Applications, Springer, London, 2015. doi:10.1007/978-1-4471-6699-3.
[16] J. Pearl, Fusion, propagation and structuring in belief networks, Artificial Intelligence 29 (3) (1986) 241–288.
[17] U.B. Kjaerulff, A.L. Madsen, Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis, Springer, New York, 2008. doi:10.1007/978-0-387-74101-7.
[18] A. Bobbio, L. Portinale, M. Minichino, E. Ciancamerla, Improving the analysis of dependable systems by mapping fault trees into Bayesian networks, Reliability Engineering and System Safety 71 (3) (2001) 249–260. doi:10.1016/S0951-8320(00)00077-6.
[19] S. Mahadevan, R. Zhang, N. Smith, Bayesian networks for system reliability reassessment, Structural Safety 23 (3) (2001) 231–251.
[20] K. Worden, A.D. Ball, G.R. Tomlinson, Fault location in a framework structure using neural networks, Smart Materials and Structures 2 (3) (1993) 189–200.
[21] M.F. Møller, A scaled conjugate gradient algorithm for fast supervised learning, Neural Networks 6 (4) (1993) 525–533. doi:10.1016/S0893-6080(05)80056-5.
[22] K.P. Murphy, The Bayes Net Toolbox for Matlab.
[23] F. Jensen, F.V. Jensen, S.L. Dittmer, From influence diagrams to junction trees, Uncertainty Proceedings (1994) 367–373. doi:10.1016/B978-1-55860-332-5.50051-1.