MaxSAT Evaluation 2020 -- Benchmark: Identifying Maximum Probability Minimal Cut Sets in Fault Trees
Martín Barrère and Chris Hankin
Institute for Security Science and Technology, Imperial College London, UK
{m.barrere, c.hankin}@imperial.ac.uk

Abstract -- This paper presents a MaxSAT benchmark focused on the identification of Maximum Probability Minimal Cut Sets (MPMCSs) in fault trees. We address the MPMCS problem by transforming the input fault tree into a weighted logical formula that is then used to build and solve a Weighted Partial MaxSAT problem. The benchmark includes 80 cases with fault trees of different size and composition as well as the optimal cost and solution for each case.

Index Terms -- MaxSAT, Benchmark, Fault trees, Fault Tree Analysis, Reliability, Cyber-Physical Security, Dependability.
I. PROBLEM OVERVIEW
Fault Tree Analysis (FTA) is an analytical tool aimed at modelling and evaluating how complex systems may fail. FTA is widely used as a risk assessment tool in safety and reliability engineering for a broad range of industries including aerospace, power plants, nuclear plants, and other high-hazard fields [1]. Essentially, a fault tree is a directed acyclic graph (DAG) which involves a set of basic events (e.g. component failures) that are combined using logic operators (e.g. AND and OR gates) to model how these events may lead to an undesired state of the system, normally represented at the root of the tree (top level event).

Our work is focused on a novel measure for FTA in the form of a hybrid analysis technique that involves quantitative and qualitative aspects of fault trees. From a qualitative perspective, we focus on Minimal Cut Sets (MCSs). An MCS is a minimal combination of events that together cause the top level event. As such, MCSs are fundamental for structural analysis. The problem is that, in large scenarios, computing all MCSs might be very expensive and there might be hundreds of MCSs, which makes it hard to handle them and to prioritise which MCSs should be addressed first. In that context, the goal of this work is to identify the MCS with maximum probability. We call this problem the MPMCS. This is an NP-complete problem and we use a MaxSAT-based approach to address it.

II. SIMPLE EXAMPLE
The fault tree shown in Fig. 1 illustrates the different combinations of events that may lead to the failure of a hypothetical Fire Protection System (FPS) based on [2]. The FPS can fail if either the fire detection system or the fire suppression mechanism fails. In turn, the detection system can fail if both sensors fail simultaneously (events $x_1$ and $x_2$), while the suppression mechanism may fail if there is no water ($x_3$), the sprinkler nozzles are blocked ($x_4$), or the triggering system does not work. The latter can fail if neither of its operation modes (automatic ($x_5$) or remotely operated) works properly. The remote control can fail if the communications channel fails ($x_6$) or the channel is not available due to a cyber attack, e.g. a DDoS attack ($x_7$). Each basic event has an associated value that indicates its probability of occurrence $p(x_i)$.

Fig. 1. Fault tree of a cyber-physical fire protection system (simplified)

This work has been supported by the European Union's Horizon 2020 research and innovation programme under grant No 739551 (KIOS CoE). To appear in Proceedings of the MaxSAT Evaluation 2020 (MSE'20), https://maxsat-evaluations.github.io/2020/.
A fault tree $F$ can be represented as a Boolean equation $f(t)$ that expresses the different ways in which the top event $t$ can be satisfied [3]. In our example, $f(t)$ is as follows:

$$f(t) = (x_1 \wedge x_2) \vee (x_3 \vee x_4 \vee (x_5 \wedge (x_6 \vee x_7)))$$

The objective is to find the minimal set of logical variables that makes the equation $f(t)$ true and whose joint probability is maximal among all minimal sets. In our example, the MPMCS is { x , x } with a joint probability of . .

III. MAXSAT FORMULATION STRATEGY
Given a fault tree and its logical formulation $f(t)$, we carry out a series of steps to compute the MPMCS as follows.
1. Logical transformation.
Since we are interested in minimising the number of satisfied clauses, which is opposed to what MaxSAT does (maximisation), we flip all logic gates but keep all events in their positive form. In our example, we obtain:

$$g(t) = (x_1 \vee x_2) \wedge (x_3 \wedge x_4 \wedge (x_5 \vee (x_6 \wedge x_7)))$$

Then, the objective is to satisfy $\neg g(t)$, where the falsified variables will indicate the minimum set of events that must simultaneously occur to trigger the top level event. A more detailed explanation of this transformation can be found in [4]. We then use the Tseitin transformation to produce in polynomial time an equisatisfiable CNF formula [5].
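As a small illustration of this step, the sketch below checks by brute force that satisfying $\neg g(t)$ and reading off the falsified variables yields exactly the cut sets of $f(t)$ for the seven-event example, and then filters the minimal ones. It is not the MPMCS4FTA implementation: the numbering of the events as 1 to 7 in order of appearance and the helper names `cut_sets` and `minimal` are assumptions made for this example, and exhaustive enumeration is only feasible for very small trees.

```python
from itertools import product

N_EVENTS = 7  # basic events of the example, numbered 1..7 (assumed ordering)

def f(x):
    """Top-event formula f(t); x[i] is True when event i+1 occurs."""
    x1, x2, x3, x4, x5, x6, x7 = x
    return (x1 and x2) or (x3 or x4 or (x5 and (x6 or x7)))

def g(x):
    """Flipped formula g(t): every AND gate becomes OR and vice versa."""
    x1, x2, x3, x4, x5, x6, x7 = x
    return (x1 or x2) and (x3 and x4 and (x5 or (x6 and x7)))

def cut_sets():
    """Falsified variables of every assignment that satisfies not g(t)."""
    result = []
    for assignment in product([False, True], repeat=N_EVENTS):
        if not g(assignment):
            result.append(frozenset(i + 1 for i, v in enumerate(assignment) if not v))
    return result

def minimal(sets):
    """Keep only the sets that have no proper subset in the collection."""
    return sorted((s for s in sets if not any(o < s for o in sets)), key=sorted)

# De Morgan duality: not g(x) equals f evaluated on the negated inputs.
assert all((not g(a)) == f(tuple(not v for v in a))
           for a in product([False, True], repeat=N_EVENTS))

mcs = minimal(cut_sets())
print([sorted(s) for s in mcs])
```

Each printed list is one minimal cut set, given as event indices. The MPMCS is the one among them with the highest joint probability.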
2. MaxSAT weights.
Because MaxSAT is additive in nature and the MPMCS problem involves the multiplication of decision variables, we transform the probabilities into a negative log-space so the multiplication becomes a sum; maximising the joint probability is then equivalent to minimising the sum of the $-\log p(x_i)$ values. In addition, many SAT solvers only support integer weights, so we perform a second transformation by right shifting (multiplying by 10) every value until the smallest value is covered with an acceptable level of precision. For example, 0.001 and 0.00007 would become 100 and 7 (right shift 5 times). Additional variables introduced by the Tseitin transformation have weight 0. We then specify the problem as a Partial Weighted MaxSAT instance by assigning the transformed probability values as a penalty score for each decision variable.
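The weight transformation can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's code: the natural logarithm and rounding to the nearest integer are assumed, the function names `shift_to_int` and `neg_log_weights` are hypothetical, and the two probabilities in the second part are made-up values rather than those of Fig. 1.

```python
import math

def shift_to_int(values, shifts):
    """'Right shift' each value by multiplying by 10**shifts, then round to an integer."""
    return [round(v * 10 ** shifts) for v in values]

def neg_log_weights(probabilities, shifts):
    """Integer soft-clause weights: -ln(p) scaled by 10**shifts (assumed base e)."""
    return shift_to_int([-math.log(p) for p in probabilities], shifts)

# The example from the text: five shifts applied to 0.001 and 0.00007.
shifted = shift_to_int([0.001, 0.00007], 5)
print(shifted)

# Hypothetical probabilities of a two-event cut set (not taken from the paper).
probs = [0.5, 0.25]
weights = neg_log_weights(probs, 6)

# The sum of the integer weights maps back to the product of the probabilities.
joint = math.exp(-sum(weights) / 10 ** 6)
print(weights, joint)
```

The number of shifts fixes the precision: with six shifts, any difference smaller than $10^{-6}$ in log-space is lost, so the recovered joint probability is only approximately equal to the exact product.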
3. Parallel SAT-solving architecture.
Since different SAT solvers normally use different resolution techniques, some of them are very good at some instances and not that good at others. To address this issue, we run multiple SAT solvers in parallel and pick the solution of the solver that finishes first. We have experimentally observed that the combination of different solvers provides good results in terms of performance and scalability. Once the solution has been found, we translate the transformed values back into their stochastic domain and output the MCS with maximum probability.

IV. FAULT TREE GENERATION
The benchmark presented in this paper relies on our open source tool MPMCS4FTA [6]. We have used MPMCS4FTA to generate and analyse synthetic pseudo-random fault trees of different size and composition. We use AND/OR graphs as the underlying structure to represent fault trees. The benchmark presented in [7] also considers AND/OR graphs as a means to represent operational dependencies between components in industrial control systems [8]. However, the instances presented in this paper differ in that: 1) they are restricted to directed acyclic graphs (DAGs), 2) only the basic events represented at the leaves of the fault tree involve a probability of failure, and 3) leaves can have more than one parent in order to relax the definition of strict logical trees.

We control the size and composition of a random fault tree of size $n$ according to a configuration $R = (R_{AT}, R_{AND}, R_{OR})$. $R_{AT} \in [0, 1]$ indicates the proportion of atomic nodes (basic events) with respect to size $n$ (e.g. . means ), whereas $R_{AND}$ and $R_{OR}$ indicate the proportion of AND and OR nodes respectively. To create a fault tree of size $n$, we first create two lists: $L = \{l_1, \ldots, l_m\}$ and $A = \{a_1, \ldots, a_s\}$. $L$ is a random sequence of AND and OR nodes with the specified proportions for each operator, where $m = n \cdot (R_{AND} + R_{OR})$. $A$ is a list of atomic nodes where $s = n \cdot R_{AT}$, thus $n = m + s$. In addition, each atomic node has a random probability of failure $p(a_i) \in [0, 1]$.

To ensure connectivity, we first create the root node $t$ and connect $l_1$ to $t$ ($l_1 \to t$). Then, for each logic node $l_i$ in the sequence $L$, we randomly choose $k$ nodes $l_j$ ahead (thus $j > i$) and create $k$ edges ($l_j \to l_i$) in the tree. When the remaining nodes in $L$ are not enough to cover $k$ nodes, we use random atomic nodes from $A$. At this point, we also make sure that $l_i$ points to at least one previous node in the sequence $L$. If that is not the case, we choose a random node $l_h$ (with $h < i$) and create an edge ($l_i \to l_h$).

Once the sequence $L$ has been processed, we traverse the list $A$ and connect each atomic node $a_i$ as follows. First, we draw a random value $k'$ between 1 and $k$. Then, we add random edges ($a_i \to l_j$) from $a_i$ to logic nodes $l_j$ until we cover $k'$ connections.

V. BENCHMARK DESCRIPTION
Our dataset includes 80 cases in total and can be obtained at [6]. It contains fault trees with four different sizes: 2500, 5000, 7500, and 10000 nodes (20 cases each). For each tree size, we consider two different graph configurations, $R_1 = (0.8, 0.1, 0.1)$ and $R_2 = (0.6, 0.2, 0.2)$, which determine the composition of the fault trees (10 cases each). Table I shows the identifiers of the cases within each one of these categories.

TABLE I. BENCHMARK CASES AND CONFIGURATIONS (columns: $R_1 = (0.8, 0.1, 0.1)$ and $R_2 = (0.6, 0.2, 0.2)$)
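To make the link between a configuration $R$ and the resulting instance concrete, the generation procedure of Section IV can be sketched as below. This is a simplified reading of that procedure, not the MPMCS4FTA generator: the function name `generate_fault_tree`, the default `k=3`, the node labels, and the use of `round` for the node counts are assumptions made for this example, and it assumes there are at least `k` atomic nodes.

```python
import random

def generate_fault_tree(n, r_at, r_and, r_or, k=3, seed=None):
    """Random AND/OR DAG; every edge points from a child to its parent."""
    rng = random.Random(seed)
    m = round(n * (r_and + r_or))            # number of logic nodes
    s = round(n * r_at)                      # number of atomic nodes
    n_and = round(n * r_and)
    gate_types = ["AND"] * n_and + ["OR"] * (m - n_and)
    rng.shuffle(gate_types)                  # random sequence L
    logic = [f"l{i}" for i in range(1, m + 1)]
    atoms = [f"a{i}" for i in range(1, s + 1)]
    kind = dict(zip(logic, gate_types))
    prob = {a: rng.random() for a in atoms}  # random failure probabilities
    parents = {node: set() for node in logic + atoms}
    parents[logic[0]].add("t")               # l1 -> t
    for i, gate in enumerate(logic):
        later = logic[i + 1:]
        children = rng.sample(later, min(k, len(later)))
        # Not enough logic nodes ahead: fill up with random atomic nodes.
        children += rng.sample(atoms, k - len(children))
        for child in children:
            parents[child].add(gate)
        # Make sure the gate itself points to an earlier node.
        if not parents[gate]:
            parents[gate].add(rng.choice(logic[:i]))
    for atom in atoms:
        k_prime = rng.randint(1, k)
        missing = k_prime - len(parents[atom])
        if missing > 0:
            free = [g for g in logic if g not in parents[atom]]
            parents[atom].update(rng.sample(free, min(missing, len(free))))
    edges = sorted((c, p) for c, ps in parents.items() for p in ps)
    return {"root": "t", "kind": kind, "prob": prob, "edges": edges}

# One instance with the smallest benchmark size and the first configuration.
tree = generate_fault_tree(2500, 0.8, 0.1, 0.1, seed=0)
print(len(tree["kind"]), len(tree["prob"]), len(tree["edges"]))
```

Because edges between logic nodes always go from a later node in the sequence to an earlier one, the result is acyclic by construction, and atomic nodes may end up with several parents, as allowed by the relaxed tree definition.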
Each case is specified in an individual .wcnf (DIMACS-like, weighted CNF) file named with the case id and the number of nodes involved. The weight for hard clauses (top value) has been set to . × . The weight of each soft constraint is an integer value that corresponds to the transformation (right shifting) of the probability value in $-\log$ space. Tables II and III detail each case as well as the results obtained with our tool. The fields are as follows:

- id identifies each case.
- gNodes and gEdges indicate the total number of nodes and edges in the fault tree.
- gAT, gAND, and gOR indicate the approximate composition of the graph in terms of atomic (basic events), AND, and OR nodes.
- tsVars and tsClauses show the number of variables and clauses involved in the MaxSAT formulation after applying the Tseitin transformation.
- time shows the resolution time reported by MPMCS4FTA in milliseconds. Currently, the MaxSAT solvers used in MPMCS4FTA are SAT4J [9] and a Python-based linear programming approach using Gurobi [10].
- size indicates the number of nodes identified in the MPMCS solution.
- intLogCost indicates the cost of the solution in $-\log$ space as an integer value (right shifted).
- logCost indicates the cost of the solution in $-\log$ space.
- MPMCS probability indicates the joint probability of the MPMCS.

These experiments have been performed on a MacBook Pro (16-inch, 2019), 2.4 GHz 8-core Intel Core i9, 32 GB 2666 MHz DDR4.
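The relation between the last three fields can be illustrated with a short sketch. It assumes a scaling factor of $10^6$ (the ratio between the intLogCost and logCost values reported in the tables) and the natural logarithm; both are inferred from the reported values rather than stated in the text, and `decode_cost` is a hypothetical helper name.

```python
import math

SCALE = 10 ** 6  # assumed ratio between intLogCost and logCost

def decode_cost(int_log_cost, scale=SCALE):
    """Map an integer cost back to log-space and to a joint probability."""
    log_cost = int_log_cost / scale
    return log_cost, math.exp(-log_cost)

# intLogCost values reported for cases 57, 48, and 44.
for case_id, int_cost in [(57, 104472), (48, 18442), (44, 607247314)]:
    log_cost, probability = decode_cost(int_cost)
    print(case_id, log_cost, probability)
```

For single-event solutions the decoded value agrees closely with the reported MPMCS probability. For solutions with hundreds of events, small per-weight rounding differences accumulate, so the agreement is only approximate.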
TABLE II. BENCHMARK DESCRIPTION - CASES 1 TO 40 (columns: id, gNodes, gEdges, gAT, gAND, gOR, tsVars, tsClauses, time, size, intLogCost, logCost, MPMCS probability)

| id | gNodes | gEdges | gAT | gAND | gOR | tsVars | tsClauses | time | size | intLogCost | logCost | MPMCS probability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | 7500 | 21502 | 6002 | 750 | 750 | 8871 | 28951 | 965 | 1 | 160 | 1.6E-4 | 0.999841 |
| 42 | 7500 | 21515 | 6002 | 750 | 750 | 7191 | 23069 | 852 | 1 | 393 | 3.93E-4 | 0.999607 |
| 43 | 7500 | 21497 | 6002 | 750 | 750 | 5716 | 18114 | 843 | 1 | 1095 | 0.001095 | 0.998906 |
| 44 | 7500 | 21536 | 6002 | 750 | 750 | 6476 | 20645 | 849 | 600 | 607247314 | 607.247314 | 1.8912103369207186E-264 |
| 45 | 7500 | 21472 | 6002 | 750 | 750 | 10277 | 34266 | 859 | 251 | 235979386 | 235.979386 | 3.279829621872166E-103 |
| 46 | 7500 | 21607 | 6002 | 750 | 750 | 10235 | 34064 | 849 | 31 | 27638401 | 27.638401 | 9.927826703704467E-13 |
| 47 | 7500 | 21609 | 6002 | 750 | 750 | 11377 | 38597 | 920 | 689 | 644477962 | 644.477962 | 1.2810988897753624E-280 |
| 48 | 7500 | 21397 | 6002 | 750 | 750 | 4488 | 14083 | 815 | 1 | 18442 | 0.018442 | 0.981728 |
| 49 | 7500 | 21410 | 6002 | 750 | 750 | 12792 | 44789 | 1031 | 668 | 672741572 | 672.741572 | 6.81228495760467E-293 |
| 50 | 7500 | 21566 | 6002 | 750 | 750 | 13253 | 47290 | 851 | 1 | 9154 | 0.009154 | 0.990888 |
| 51 | 7500 | 20454 | 4502 | 1500 | 1500 | 11031 | 39763 | 972 | 1 | 2151 | 0.002151 | 0.997852 |
| 52 | 7500 | 20450 | 4502 | 1500 | 1500 | 8927 | 30739 | 855 | 1 | 738 | 7.38E-4 | 0.999263 |
| 53 | 7500 | 20616 | 4502 | 1500 | 1500 | 11843 | 43792 | 894 | 1 | 37 | 3.7E-5 | 0.999964 |
| 54 | 7500 | 20530 | 4502 | 1500 | 1500 | 9961 | 35071 | 1053 | 502 | 480184105 | 480.184105 | 2.8797108920892045E-209 |
| 55 | 7500 | 20563 | 4502 | 1500 | 1500 | 9462 | 32930 | 1368 | 769 | 739302414 | 739.302414 | 8.45E-322 |
| 56 | 7500 | 20493 | 4502 | 1500 | 1500 | 9084 | 31398 | 833 | 1 | 7545 | 0.007545 | 0.992484 |
| 57 | 7500 | 20491 | 4502 | 1500 | 1500 | 4922 | 16088 | 817 | 1 | 104472 | 0.104472 | 0.9008 |
| 58 | 7500 | 20594 | 4502 | 1500 | 1500 | 5943 | 19507 | 987 | 267 | 256660486 | 256.660486 | 3.4340775952647096E-112 |
| 59 | 7500 | 20406 | 4502 | 1500 | 1500 | 9340 | 32356 | 898 | 158 | 148111431 | 148.111431 | 4.74472781242486E-65 |
| 60 | 7500 | 20445 | 4502 | 1500 | 1500 | 8882 | 30572 | 827 | 1 | 14066 | 0.014066 | 0.986033 |
| 61 | 10000 | 28613 | 8002 | 1000 | 1000 | 16234 | 56222 | 1087 | 1 | 1904 | 0.001904 | 0.998099 |
| 62 | 10000 | 28675 | 8002 | 1000 | 1000 | 14261 | 47804 | 914 | 197 | 185985480 | 185.98548 | 1.6901841317920728E-81 |
| 63 | 10000 | 28558 | 8002 | 1000 | 1000 | 13755 | 45717 | 893 | 1 | 43 | 4.3E-5 | 0.999957 |
| 64 | 10000 | 28738 | 8002 | 1000 | 1000 | 13370 | 44343 | 882 | 1 | 127 | 1.27E-4 | 0.999874 |
| 65 | 10000 | 28752 | 8002 | 1000 | 1000 | 15537 | 53105 | 917 | 643 | 606121928 | 606.121928 | 5.826520007473361E-264 |
| 66 | 10000 | 28803 | 8002 | 1000 | 1000 | 9981 | 32065 | 852 | 1 | 796 | 7.96E-4 | 0.999205 |
| 67 | 10000 | 28632 | 8002 | 1000 | 1000 | 13418 | 44550 | 861 | 448 | 439405919 | 439.405919 | 1.4772121624185204E-191 |
| 68 | 10000 | 28830 | 8002 | 1000 | 1000 | 17774 | 63650 | 874 | 1 | 3047 | 0.003047 | 0.996959 |
| 69 | 10000 | 28717 | 8002 | 1000 | 1000 | 14505 | 48831 | 861 | 1 | 1691 | 0.001691 | 0.998311 |
| 70 | 10000 | 28604 | 8002 | 1000 | 1000 | 16032 | 55089 | 855 | 1 | 436 | 4.36E-4 | 0.999564 |
| 71 | 10000 | 27114 | 6002 | 2000 | 2000 | 15244 | 55476 | 2286 | 652 | 652324945 | 652.324945 | 5.016628484164324E-284 |
| 72 | 10000 | 27515 | 6002 | 2000 | 2000 | 10588 | 36029 | 867 | 1 | 15974 | 0.015974 | 0.984154 |
| 73 | 10000 | 27411 | 6002 | 2000 | 2000 | 9596 | 32332 | 862 | 422 | 440653751 | 440.653751 | 4.240514855635819E-192 |
| 74 | 10000 | 27271 | 6002 | 2000 | 2000 | 15985 | 59167 | 873 | 1 | 2033 | 0.002033 | 0.997969 |
| 75 | 10000 | 27228 | 6002 | 2000 | 2000 | 13506 | 47651 | 2223 | 621 | 639112478 | 639.112478 | 2.7423451190246526E-278 |
| 76 | 10000 | 27345 | 6002 | 2000 | 2000 | 12066 | 41598 | 1253 | 326 | 307525901 | 307.525901 | 2.779537506735469E-134 |
| 77 | 10000 | 27310 | 6002 | 2000 | 2000 | 10310 | 34812 | 835 | 1 | 10970 | 0.01097 | 0.989091 |
| 78 | 10000 | 27306 | 6002 | 2000 | 2000 | 12092 | 41711 | 1004 | 228 | 218680041 | 218.680041 | 1.0684631282749114E-95 |
| 79 | 10000 | 27315 | 6002 | 2000 | 2000 | 14069 | 50130 | 848 | 1 | 1447 | 0.001447 | 0.998555 |
| 80 | 10000 | 27375 | 6002 | 2000 | 2000 | 14851 | 53699 | 859 | 1 | 180 | 1.8E-4 | 0.999821 |

TABLE III. BENCHMARK DESCRIPTION - CASES 41 TO 80

REFERENCES
[1] E. Ruijters and M. Stoelinga, "Fault tree analysis: A survey of the state-of-the-art in modeling, analysis and tools,"
Computer Science Review, vol. 15-16, pp. 29-62, 2015.
[2] S. Kabir, "An overview of Fault Tree Analysis and its application in model based dependability analysis," Expert Systems with Applications, vol. 77, pp. 114-135, 2017.
[3] W. Vesely, M. Stamatelatos, J. Dugan, J. Fragola, J. Minarick III, and J. Railsback, "Fault Tree Handbook with Aerospace Applications," Office of Safety and Mission Assurance, NASA Headquarters, US, 2002.
[4] M. Barrère and C. Hankin, "Fault Tree Analysis: Identifying Maximum Probability Minimal Cut Sets with MaxSAT," https://arxiv.org/abs/2005.03003, May 2020.
[5] G. S. Tseitin, "On the Complexity of Derivation in Propositional Calculus," in Studies in Constructive Mathematics and Mathematical Logic, Part II, A. Slisenko, Ed., 1970, pp. 234-259.
[6] M. Barrère, "MPMCS4FTA - Maximum Probability Minimal Cut Sets for Fault Tree Analysis," https://github.com/mbarrere/mpmcs4fta, March 2020.
[7] M. Barrère, C. Hankin, N. Nicolaou, D. Eliades, and T. Parisini, "MaxSAT Evaluation 2019 - Benchmark: Identifying Security-Critical Cyber-Physical Components in Weighted AND/OR Graphs. In MaxSAT Evaluation 2019 (MSE19)," https://arxiv.org/abs/1911.00516, 2019.
[8] M. Barrère, C. Hankin, N. Nicolaou, D. Eliades, and T. Parisini, "Measuring cyber-physical security in industrial control systems via minimum-effort attack strategies," Journal of Information Security and Applications