Formal Methods in Dependable Systems Engineering: A Survey of Professionals from Europe and North America

Abstract

Context: Formal methods (FMs) have been around for a while, yet it remains unclear how to leverage their benefits, overcome their challenges, and set new directions for their improvement towards a more successful transfer into practice. Objective: We study the use of formal methods in mission-critical software domains, examining industrial and academic views. Method: We perform a cross-sectional on-line survey. Results: Our results indicate an increased intent to apply FMs in industry, suggesting a positively perceived usefulness. But the results also indicate a negatively perceived ease of use. Scalability, skills, and education seem to be among the key challenges to support this intent. Conclusions: We present the largest study of this kind so far (N = 216), and our observations provide valuable insights, highlighting directions for future theoretical and empirical research of formal methods. Our findings are strongly coherent with earlier observations by Austin and Parkin (1993).


Preprint, compiled September 23, 2020. *Correspondence: [email protected]

Mario Gleirscher and Diego Marmsoler

Department of Computer Science, University of York, Deramore Lane, Heslington, York YO10 5GH, United Kingdom
Institut für Informatik, Technical University of Munich, Boltzmannstraße 3, 85748 Garching, Germany

Keywords: formal methods · empirical research · on-line survey · usage · usefulness · practical challenges · research transfer · software engineering education & training

Acronyms

CMMI: Capability Maturity Model Integration
DI: respondents with decreased usage intent
EOU: ease of use
FM: formal method
GQM: goal-question-metric
HQ: head quarter
ICT: information and communication technology
II: respondents with increased usage intent
IS: information system
LE: less experienced respondents
M: respondents with some motivations to use FMs
MbE: model-based engineering
ME: more experienced respondents
NP: non-practitioners
P: practitioners
PEOU: perceived ease of use
PU: perceived usefulness
RQ: research question
SE: software engineering
SMT: satisfiability modulo theory
TAM: technology acceptance model
TLD: top-level domain
NM: respondents without any motivations to use FMs
UFM: Use of FMs in mission-critical SE
U: usefulness

* Funding: Partly funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under the Grant no. 381212925.

Conflict of Interest: The authors declare that they have no conflict of interest.

This is a post-peer-review, pre-copyedit version of an article published in Empirical Software Engineering. The final authenticated version is available online at: doi:10.1007/s10664-020-09836-5

1 Motivation and Challenges

Over the past decades, many software errors have been deployed in the field and some of these errors had a clearly intolerable impact. Cost savings from reducing such impact have been the motivation of formal methods (FMs) as a first-class approach to error prevention, detection, and removal (Holloway, 1997).

In university courses on software engineering, we learned that FMs are among the best we have to design and assure correct systems. The question “Why are FMs not used more widely?” (Knight et al., 1997) is hence more than justified. With a Twitter poll, which emerged from our coffee spot discussions, we solicited opinions on a timely paraphrase of a statement argued by Holloway (1997): “FMs should be a cornerstone of dependability and security of highly distributed and adaptive automation.” What can a tiny opportunity sample of 22 respondents from our social network tell? Not much. Still, (i) 55% agree, i.e., seem to attribute importance to this role of FMs, (ii) 14% disagree, i.e., oppose that view, and (iii) 32% just don't know. Why should and how could FMs be a cornerstone?

Since the beginning of software engineering (SE) there has been a debate on the usefulness of FMs to improve SE. In the 1970s and 1980s, several SE and FM researchers had started to examine this usefulness and to identify error possibilities despite the rigour in FMs (Gerhart and Yelowitz, 1976), with the aim of responding to critical observations of practitioners (Jackson, 1987).

Hall (1990) and Bowen and Hinchey (1995a) illuminate 14 myths (e.g. “formal methods are unnecessary”), providing their insights on when FMs are best used and highlighting that FMs can be overkill in some cases but are highly recommended in others. The transfer of FMs into SE practice is far from straightforward. Knight et al. (1997) examine reasons for the low adoption of FMs in practice. Barroca and McDermid (1992) ask: “To what extent should FMs form part of the [safety-critical SE] method?” Glass (2002, pp. 148–149, 165–166) and Parnas (2010) observe that “many [SE] researchers advocate rather than investigate” by assuming the need for more methodologies. Glass summarises that FMs were supposed to help represent firm requirements concisely and support rigorous inspections and testing. He observes that changing requirements has become an established practice even in critical domains, and inspections, even if based on FMs, are insufficient for complete error removal. In line with Barroca and McDermid (1992, p. 591), he notes that FMs have occasionally been sold as making error removal complete, but there is no silver bullet (Glass, 2002, pp. 108–109). Bad communication between theorists and practitioners sustains the issue that FMs are taught but rarely applied (Glass, 2002; Holloway and Butler, 1996, pp. 68–70). Parnas (2010) compares alternative paradigms in FM research (e.g. axiomatic vs. relational calculi) and points to challenges of FM adoption (e.g. valid simple abstractions).

In contrast, Miller et al. (2010) draw positive conclusions from recent applications of model checking and highlight lessons learned. In his keynote, O'Hearn (2018) conveys positive experiences in scaling FMs through adequate tool support for continuous reasoning in agile projects (see, e.g. Chudnov et al., 2018). Many researchers (see, e.g. Aichernig and Maibaum, 2003) have been working on the improvement of FMs towards their successful transfer. Boulanger (2012) and Gnesi and Margaria (2013) summarise promising industry-ready FMs and present larger case studies.

Have software errors been overlooked because of hidden inconsistencies that can be detected when properly formalised? Are such errors compelling arguments for the wider use of FMs? Strong evidence for the ease of use of FMs and their efficacy and usefulness is scarce and largely anecdotal, rarely drawn from comparative studies (e.g. Pfleeger and Hatton, 1997; Sobel and Clarkson, 2002), often primarily conducted in research labs (e.g. Chudnov et al., 2018; Galloway et al., 1998 and many others). In late response to Holloway and Butler's request for empirical data (Holloway and Butler, 1996), Graydon (2015) […] effectiveness of FMs in assurance argumentation for safety-critical systems, suggesting empirical studies to examine hypotheses and collect evidence.

FMs have many potentials but SE research has reached a stage of maturity where strong empirical evidence is crucial for research progress and transfer. Jeffery et al. (2015) identify questions and metrics for FM productivity assessment, supporting FM research transfer.

Footnote (software-related incidents): See anecdotal evidence (grey literature, press articles) on software-related incidents, for example, by Kaner and Pels (1998, 2018), Charette (2018) and Neumann (2018).
Footnote (Twitter poll): See https://twitter.com/MarioGleirscher/status/889737625178976256.
Footnote (inspections): For example, walking through development artefacts in a structured and moderated discussion group and with bug pattern checklists (Fagan, 1976).

Contributions.

We contribute to SE and FM research (1) by presenting results of the largest cross-sectional survey of FM use among SE researchers and practitioners to this date, (2) by answering research questions about the past and intended use of FMs and the perception of systematically mapped FM challenges, (3) by relating our findings to the perceived ease of use and usefulness of FMs using a simplified variant of the technology acceptance model for evaluating engineering methods and techniques, and (4) by providing a research design for repetitive (e.g. longitudinal) FM studies.

Overview.

The next section introduces important terms. Section 3 relates our work to existing research. In Section 4, we explain our research design. We describe our data and answer our research questions in Section 5. In Section 6, we summarise and interpret our findings in the light of existing evidence and with respect to threats to validity. Section 7 highlights our conclusions and potential follow-up work.

2 Background and Terminology

By formal methods, we refer to explicit mathematical models and sound logical reasoning about critical properties (Rushby, 1994), such as reliability, safety, security, and more generally dependability and performance, of electrical, electronic, and programmable electronic or software systems in mission- or property-critical application domains. Model checking, theorem proving, abstract interpretation, assertion checking, and formal contracts are examples of FMs. By use of FMs, we refer to their application in the development and analysis of critical systems and to substantially integrating FMs with the used programming methodologies (e.g. structured development, model-based engineering (MbE), assertion-based programming, test-driven development), notations (e.g. UML, SysML), and tools.

Tool and Method Evaluation.

In the following, we give an overview of several evaluation approaches and explain in Section 4.2 which approach we take.

The widely used technology acceptance model (TAM; Davis, 1989) is a psychological test that allows the assessment of end-user IT based on the two constructs perceived ease of use (PEOU, i.e., positive and negative experiences while using an IT system) and perceived usefulness (PU, i.e., positive experiences of accomplishing a task using an IT system compared to not using this system for accomplishing the same task).

Complementary to TAM, Basili (1985) proposes the goal-question-metric (GQM) approach to method and tool evaluation. While GQM serves as a good basis for quantitative follow-up studies, we follow the user-focused TAM. Maturity models according to the Capability Maturity Model Integration (SEI, 2010) do not fit our purposes because they focus on engineering process improvement beyond particular development techniques. Poston and Sexton (1992) present tool survey guidelines based on technology-focused classification and selection criteria with a very limited view on tool usefulness and usability. Miyoshi and Azuma (1993) evaluate ease of use of development environments (i.e., specification and modelling tools) using metrics from the ISO/IEC 9126 quality model.

From comparing two models of predicting an individual's intention to use a tool, Mathieson (1991) supports TAM's validity and convenience but indicates its limits in providing enough information on users' opinions. For software methods and programming techniques, Murphy et al. (1999) show how surveys, case studies, and experiments can be used to compensate for this lack of information about usefulness and usability.

The ease of use (EOU) of a FM characterises the type and amount of effort a user is likely to spend to learn, adopt, and apply this FM. Usefulness (U) determines how fit a FM is for its purpose, that is, how well it supports the engineer to accomplish an appropriate task. If EOU and U are measured by a survey whose data points are user perceptions then we talk of perceived ease of use (PEOU) and perceived usefulness (PU). Together, PEOU and PU form the user acceptance of a FM and, by support of Mathieson (1991) and Riemenschneider et al. (2002), can predict the intention to use this FM.

Whereas TAM is a model based on the two user-focused constructs PEOU and PU, Kitchenham et al. (1997) propose a meta-evaluation approach called DESMET for tools and methods based on multiple performance indicators (e.g. with TAM as one of the indicators).

3 Related Work

Table 1 shows a systematic map (Petersen et al., 2008) of 36 studies on FM research evaluation and transfer. For each study, we estimate the authors' attitude against or in favour of FMs, the motivation of the study, the approach followed, and the type of result obtained. Most of these works present personal experiences, opinions, case studies, or literature summaries. In contrast, the work presented in this paper focuses on the analysis of experience from a wide range of practitioners and experts. However, we found four similar studies.

Austin and Parkin (1993) sought to explain the low acceptance of FMs in industry around 1992. Using a questionnaire similar to ours with only open questions, they evaluated 111 responses from a sample of size 444, using a sampling method similar to ours (then using different channels). Responses from FM users are distinguished from general responses.
Their questions examine benefits, limitations, barriers, suggestions to overcome those barriers, personal reasons for or against the use of FMs, and ways of assessing FMs.

In a second study in 2001, Snook and Harrison (2001) conduct interviews with representatives of five companies to discover the main issues involved in FM use, in particular, issues of understandability and the difficulty of creating and utilising formal specifications.

A similar, though more comprehensive interview study was performed by Woodcock et al. (2009) in 2009. They assess the state of the art of the application of FMs, using questionnaires to collect data on 62 industrial projects.

Liebel et al. (2016, pp. 102–103) assess effects on and shortcomings of the adoption of MbE in embedded SE including a discussion of FM adoption. The authors observe a lack of tool support, bad reputation, and rigid development processes as obstacles to FM adoption. Their data suggests a need for FM adoption: 30% of the responses from industry declare the need for FMs as a reason to adopt MbE. Moreover, responses indicate that MbE adoption has a positive effect on FM adoption. One limitation of their study is the small number of responses from FM users.

Footnote (attitude estimate): This estimate is based on opinions and attitudes expressed by the original authors and, where unavailable, on our own interpretation when reading the studies.

Table 1: Systematic map of studies on FM research evaluation and transfer

| Study | A | Motivation | Support | E C R |
|---|---|---|---|---|
| Surveys | | | | |
| Austin and Parkin (1993) | = | LoEv | Interviews | • • |
| Snook and Harrison (2001) | = | LoEv | Interviews | • |
| Oliveira (2004) | = | Edu./Train. | Course websites | • |
| Woodcock et al. (2009) (a) | = | LoEv | Interviews | • |
| Davis et al. (2013) | + | TechTx | Interviews | • • |
| Liebel et al. (2016) | + | LoEv | Online questionnaire | • |
| Ferrari et al. (2019) | + | TechTx | Literature study | • |
| Literature Studies and Summaries | | | | |
| Wing (1990) | + | SotA | O/E | • • |
| Bloomfield et al. (1991) | = | SotA | | • |
| Fraser et al. (1994) | = | TechTx | | • |
| Heitmeyer (1998) | = | TechTx | | • |
| Gleirscher et al. (2019) | + | TechTx | SWOT analysis | • • |
| Expert Opinions and Experience Reports | | | | |
| Jackson (1987) | = | TechTx | | • |
| Bjorner (1987) | = | TechTx | | • |
| Barroca and McDermid (1992) | = | SotA | Multiple cases | • |
| Bowen and Hinchey (1995a) | + | Hyp. Testing | | • |
| Bowen and Hinchey (1995b) | + | TechTx | | • |
| Hinchey and Bowen (1996) | – | TechTx | | • |
| Heisel (1996) | + | TechTx | | • |
| Holloway and Butler (1996) | + | LoEv | | • |
| Lai (1996) | + | TechTx | | • |
| Bowen and Hinchey (2005) | + | Hyp. Testing | Literature study | • |
| Parnas (2010) | = | TechTx | | • • |
| Case Studies and Experiments | | | | |
| Gerhart and Yelowitz (1976) | = | LoEv | Multiple cases, O/E | • • • |
| Hall (1990) | + | Hyp. Testing | O/E | • |
| Craigen et al. (1995) (b) | + | SotA | Multiple cases, O/E | • |
| Knight et al. (1997) | = | TechTx | Field experiment | • |
| Pfleeger and Hatton (1997) | = | Hyp. Testing | Effect analysis | • |
| Galloway et al. (1998) | + | TechTx | Single case in lab | • • |
| Sobel and Clarkson (2002) | = | Hyp. Testing | Lab experiment | • |
| Miller et al. (2010) | = | TechTx | Multiple cases, O/E | • |
| Klein et al. (2018) | + | TechTx | | • |
| Chudnov et al. (2018) | = | TechTx | | • |

(a) See also Bicarregui et al. (2009); (b) see also Craigen (1995) and Craigen et al. (1993). (A)ttitude, (E)valuation/analysis, (C)hallenges, (R)ecommendations; +/=/– ... positive/neutral/negative; LoEv ... lack of empirical evidence; Hyp. Testing ... hypotheses testing; Edu./Train. ... education/training; O/E ... opinion/experience report; SotA ... state of the art; SWOT ... strengths, weaknesses, opportunities, and threats; TechTx ... technology transfer

While these studies focus on the elicitation of the state of the art and the state of practice, the main focus of our study is to compare the current FM adoption or use with the intention to adopt and use FMs in the future. To the best of our knowledge, our study offers the largest set of data points investigating the use of FMs in SE, so far. In Section 6.3, we provide a further discussion of how our findings relate to the findings of these studies, particularly to the works of Austin and Parkin (1993) and Woodcock et al. (2009).

Table 2: Concepts constituting the construct UFM, with scales and points of measurement

| Concept | Id. | Description [Scale] | Point [Question] |
|---|---|---|---|
| Measured twice ... | | | |
| Domain | C1 | Application domains of FMs [MC among domains] | Past [Q1], Intent [Q8] |
| Role | C2 | Role in using FMs [MC among roles] | Past [Q4], Intent [Q9] |
| Use | C3 | Use of FMs [experience level / relative frequency per FM class] | Past [Q5/Q6], Intent [Q10/Q11] |
| Purpose | C4 | Purpose of using FMs [absolute / relative frequency per purpose] | Past [Q7], Intent [Q12] |
| Measured once ... | | | |
| Experience | C5 | Level of FM experience [duration ranges in years] | Single [Q2] |
| Motivation | C6 | Motivation to use FMs [degree per motivational factor] | Single [Q3] |
| Obstacles | C7 | Difficulty of obstacles to using FMs [degree per challenge] | Single [Q13] |

MC ... multiple-choice

4 Research Method

In this section, we describe our research design, our survey instrument, and our procedure for data collection and analysis. For this research, we follow the guidelines of Kitchenham and Pfleeger (2008) for self-administered surveys and use our experience from a previous more general survey (Gleirscher and Nyokabi, 2018).

The questions in Section 1 have led to this survey on the use, usage intent, and challenges of FMs. Our interest is devoted to the following research questions (RQs):

RQ1 In which typical domains, for which purposes, in which roles, and to what extent have FMs been used?
RQ2 Which relationships can we observe between past experience in using FMs and the intent to use FMs?
RQ3 How difficult do study participants perceive widely known FM challenges to be?
RQ4 What can we say about the perceived ease of use and the perceived usefulness of FMs?

Table 2 lists the (C)oncepts that constitute the construct Use of FMs in mission-critical SE (UFM), the corresponding scales, the points of measurement, and references to (Q)uestions from the questionnaire.

Measuring Past and Intended Use.

For RQ1 (UFM), we examine potential application domains for FMs (C1), roles when using FMs (C2), motivations and purposes of using FMs (C6, C4), and the extent of UFM at the general (C5) and specific (C3) experience level of our study participants when using FMs.

For RQ2, we compare the past (UFM_p) and intended use (UFM_i) of FMs regarding the domain (C1), role (C2), FM class (C3), and purpose (C4). We measure UFM_i by relative frequency (Table 4) with respect to a participant's current situation, FM class, and purpose of use. Using a relative instead of an absolute frequency scale slightly reduces the burden on respondents to make detailed and, hence, uncertain predictions of UFM_i.

For RQ3, we measure the perception of difficulty of several obstacles (C7) known from the literature and from our experience.

Method Evaluation and TAM-style Interpretation.

We follow DESMET (Kitchenham et al., 1997) and Murphy et al. (1999) insofar as we combine a qualitative survey (i.e., FM evaluation by SE practitioners and researchers) and a qualitative effects analysis based on the past and intent measurements for C4 (i.e., subjective assessment of effects of FMs by asking SE practitioners and researchers).

We assume that UFM nowadays largely implies the use of the tools automating the corresponding FMs. This assumption is justified inasmuch as tools are available for all FMs referred to in this survey. In fact, in the past two decades (the period in which most survey respondents could possibly have used FMs), the development of a FM has mostly gone hand in hand with the development of its supporting tools.

For RQ4, we associate our findings from RQ2 and RQ3 with PEOU and PU. Whereas TAM predicts UFM_i of a specific tool by measuring PEOU and PU, we directly interrogate past (like in Mohagheghi et al., 2012, Figure 2) and intended use of classes of FMs. We measure UFM_i (C1, C2, C3, C4) in more detail than TAM. Our approach relates to TAM for methods (Riemenschneider et al., 2002, Table 2) inasmuch as we collect data for PEOU through asking about potential obstacles to the further use of FMs (C7) based on experience with past FM use (UFM_p). For this, respondents are asked to rate the difficulty of several known challenges to be tackled in typical FM applications. Furthermore, UFM_i is known to be correlated with PU. We then interpret the answers to RQ3 to examine the PEOU and, furthermore, interpret the answers to RQ2 to reason about PU. In Section 4.4, we discuss our questionnaire including the questions for measuring the sub-constructs.

Our target group for this survey includes persons with (1) an educational background in engineering and the sciences related to critical computer-based or software-intensive systems, preferably having gained their first degree, or (2) a practical engineering background in a reasonably critical system or product domain involving software practice. We use (study or survey) participant and respondent as synonyms. We talk of FM users to refer to the part of the population that has already used FMs in one way or another. See Appendix A.1 and Table 8 for a more fine-grained analysis of the population.

Table 3 summarises the questionnaire we use to measure UFM (Table 2). The scales used for encoding the answers are described in Table 4.

Although we do not collect personal data, respondents could leave us their email address if they want to receive our results. We expect participants to spend about 8 to 15 minutes to complete the questionnaire. However, we considered it unnecessary in our case to instrument the questionnaire or our tooling to determine the time spent on submitting complete data points.
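To make the encoding of answers concrete, the following minimal Python sketch maps answers on the degree-of-difficulty scale from Table 4 to ordinal codes and treats “I don't know.” as a missing value. The numeric codes 1 to 3, the function names, and the sample answers are assumptions chosen for illustration; they are not taken from the survey's own analysis scripts.

```python
from collections import Counter
from typing import Iterable, Optional

# Answer texts follow Table 4 (degree of difficulty).
# The numeric codes 1..3 are an assumed encoding, not the authors' one.
DIFFICULTY_CODES = {
    "not as an issue.": 1,
    "as a moderate challenge.": 2,
    "as a tough challenge.": 3,
}
DNK = "I don't know."


def encode_difficulty(answer: str) -> Optional[int]:
    """Return the ordinal code of an answer, or None for a dnk-answer."""
    # Accept typographic apostrophes as they appear in the questionnaire text.
    normalised = answer.strip().replace("\u2019", "'")
    if normalised == DNK:
        return None
    return DIFFICULTY_CODES[normalised]


def summarise(answers: Iterable[str]) -> dict:
    """Count ratings per level and report how many dnk-answers were excluded."""
    codes = [encode_difficulty(a) for a in answers]
    rated = [c for c in codes if c is not None]
    counts = Counter(rated)
    return {
        "n_rated": len(rated),
        "n_dnk": len(codes) - len(rated),
        "counts": {level: counts.get(level, 0) for level in (1, 2, 3)},
    }


if __name__ == "__main__":
    # Hypothetical answers of four respondents for one obstacle.
    sample = [
        "as a tough challenge.",
        "I don't know.",
        "not as an issue.",
        "as a tough challenge.",
    ]
    print(summarise(sample))
```

Reporting the number of excluded dnk-answers next to the counts keeps it visible how many respondents were left out of a given rating.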

Face and Content Validity.

We derived answer options from the literature, our own experience with FMs, SE research training, discussions with other SE researchers and colleagues from industry, pilot responses, and coding of open answers. Particularly, the classification of FMs (C3; Q5, Q6, Q10, Q11) and the list of obstacles or challenges (C7; Q13) were derived from our own training, literature knowledge prior to this study, and experience as well as from occasional personal discussions with SE experts from academia and industry. Most questions are half-open, allowing respondents to go beyond given answer options. We treat degree and relative frequency as 3-level Likert-type scales.

For each question, we provide “do not know” (dnk) options to include participants without previous knowledge of FMs in any academic or practical context. If participants are not able to provide an answer they can choose, e.g. “do not know”, “not yet used”, “no experience”, or “not at all”, and proceed. This way, we reduce bias by forced response. We indicate dnk-answers whenever we exclude them. Our questionnaire tool (Section 4.6) supports us with getting complete data points, reducing the effort to deal with missing answers.

Table 3: Questionnaire used to measure UFM

| Id. | Question or Question Template | Scale (see Table 4) | Sec. | Fig. |
|---|---|---|---|---|
| Q1 | In which application domains (C1) in industry or academia have you mainly used FMs? | MC among domains | 5.2 | 2 |
| Q2 | How many years of FM experience (including the study of FMs, C5) have you gained? | Duration range in years | 5.2 | 3 |
| Q3 | Which have been your motivations (C6) to use FMs? | Degree per motivational factor | 5.2 | 4 |
| Q4 | In which roles (C2) have you used FMs? | MC among roles | 5.3 | 5 |
| Q5 | Describe your level of experience (C3) for ⟨class of formal description techniques⟩. | Level of experience per class | 5.3 | 6 |
| Q6 | Describe your level of experience (C3) for ⟨class of formal reasoning techniques⟩. | Level of experience per class | 5.3 | 7 |
| Q7 | I have mainly used FMs for (C4) ... | Absolute frequency per purpose | 5.3 | 8 |
| Q8 | In which domains (C1) in industry or academia do you intend to use FMs? | MC among domains | 5.4 | 9 |
| Q9 | In which roles (C2) would (or do) you intend to use FMs? | MC among roles | 5.4 | 10 |
| Q10 | I (would) intend to use (C3) ⟨class of formal description techniques⟩ ⟨this⟩ often. | Relative frequency per class | 5.4 | 11 |
| Q11 | I (would) intend to use (C3) ⟨class of formal reasoning techniques⟩ ⟨this⟩ often. | Relative frequency per class | 5.4 | 12 |
| Q12 | I (would) intend to use FMs for (C4) ⟨purpose⟩. | Relative frequency per purpose | 5.4 | 13 |
| Q13 | For any use of FMs in my future activities, I consider ⟨obstacle⟩ (C7) as ⟨that⟩ difficult. | Degree of difficulty per obstacle | 5.5 | 16 |

MC ... multiple-choice

Table 4: Scales used in the questionnaire

| Name | Values | Type |
|---|---|---|
| degree of motivation | “no motivation”, “moderate motivation”, “strong motivation (or requirement)” | L3 |
| degree of difficulty | “not as an issue.”, “as a moderate challenge.”, “as a tough challenge.”, “I don’t know.” | L3 |
| experience level (duration-based) | “I do not have any knowledge of or experience in FMs.”, “less than 3 years”, “3 to 7 years”, “8 to 15 years”, “16 to 25 years”, “more than 25 years” | O |
| experience level (task-based) | “no experience or no knowledge”, “studied in (university) course”, “applied in lab, experiments, case studies”, “applied once in engineering practice”, “applied several times in engineering practice” | O |
| frequency (absolute) | “not at all.”, “once.”, “in 2 to 5 separate tasks.”, “in more than 5 separate tasks.” | O |
| frequency (relative) | “no more or not at all.”, “less often than in the past.”, “as often as in the past.”, “more often than in the past.”, “I don’t know.” | L3 |
| choice | single/multiple: (ch)ecked, (un)checked | N |

bold ... express lack of knowledge or indecision; (N)ominal, (O)rdinal, Ln ... Likert-type scale with n values

| Channel Type | Examples & References |
|---|---|
| General panels | SurveyCircle |
| LinkedIn groups | E.g. on ARP 4754, DO-178, FME, ISO 26262 |
| Mailing lists | E.g. system safety (U Bielefeld, formerly U York) |
| Newsletters | BCS FACS; GI RE, SWT, TAV |
| Personal pages | E.g. Facebook, Twitter, LinkedIn, Xing |
| Q&A forums | on ResearchGate |
| Xing groups | E.g. Safety Engineering, RE |

Table 5: Channels used for sampling

We could not find an open, non-commercial panel of engineers. Large-scale panel services are either commercial (e.g. Decision Analyst, 2018) or they do not allow the sampling of software engineers (e.g. Leiner, 2014). Hence, we opt for a mixture of opportunity, volunteer, and cluster-based sampling. To draw a diverse sample of potential FM users, we
1. advertise our survey on various on-line discussion channels,
2. invite software practitioners and researchers from our social networks, and
3. ask these people to disseminate our survey.

We examine C5, C1, C2, and C3 from Table 2 to check how well our sample covers the given concept categories. The better the coverage of these categories, the wider is the range of analyses possible from our data set. Less covered categories might indicate inappropriate concepts as well as the case that our sample just does not touch this fraction of the target population. Under the assumption that the sample is drawn from the target population in a uniformly random fashion, we would be able to draw conclusions about the constitution of the target population. However, as noted, this assumption was in our case not controllable.
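A coverage check of this kind can be sketched as a simple tally per category. The snippet below is an illustration under assumptions: the function names, the threshold `min_count`, and the three sample responses are hypothetical, while the domain labels are taken from the answer options of Q1.

```python
from collections import Counter
from typing import Dict, Iterable, List, Sequence


def category_coverage(
    responses: Iterable[Sequence[str]], categories: Sequence[str]
) -> Dict[str, int]:
    """Count, per category, how many responses checked it (MC answers)."""
    counts: Counter = Counter()
    known = set(categories)
    for checked in responses:
        # A set avoids double counting if an option appears twice in one response.
        counts.update(set(checked) & known)
    return {category: counts.get(category, 0) for category in categories}


def weakly_covered(coverage: Dict[str, int], min_count: int) -> List[str]:
    """List categories with fewer than min_count responses (assumed threshold)."""
    return [category for category, n in coverage.items() if n < min_count]


if __name__ == "__main__":
    domains = ["Transportation", "Platforms", "Process automation"]
    # Hypothetical MC answers of three respondents to Q1.
    answers = [["Transportation", "Platforms"], ["Transportation"], []]
    coverage = category_coverage(answers, domains)
    print(coverage)
    print(weakly_covered(coverage, min_count=1))
```

A category that shows up in the second list is a candidate for the two explanations given above: either the concept is inappropriate or the sample does not touch that fraction of the target population.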

For RQ1, we summarise the data and apply descriptive statistics for categorical and ordinal variables in Section 5.3. We answer RQ2 by comparison of the data for the past and future views regarding the domain (C1), role (C2), FM class (C3), and purpose (C4) in Section 5.4. Then, in Section 5.5, we answer RQ3 by
• describing the challenge difficulty ratings after associating one of (1) domain, (2) motivational factor, (3) role, (4) purpose, and (5) FM class with challenge (C7) and
• distinguishing (1) more experienced (ME, > […] ≤ […] pairs of matrices (e.g. Figure 17).

We answer RQ4 by arguing from results for RQ1, 2, and 3.

Half-Open and Open Questions.

We code open answers in additional text fields as follows: If we can subsume an open answer into one of the given options, we add a corresponding rating (if necessary). If we cannot do this then we introduce a new category “Other” and estimate the rating. Finally, we cluster the added answers and split the “Other” category (if necessary). For Q13, we performed the latter step combined with independent coding (Neuendorf, 2016) to confirm that the understanding of the challenge categories is consistent among the authors of the present study. For MC questions, we eliminate the choice of “I do/have not...” options from the data if ordinary answer options were also checked.

[Figure 1: Distribution of responses over time]

Tooling.

We use Google Forms (Google, 2018) for implementing our questionnaire (Appendix A.12) and for data collection (Section 4.5) and storage. For statistical analysis and data visualisation (Section 4.6), we use GNU R (The R Project, 2018) (with the packages likert, gplots, and ggplot2 and some helpers from the “Cookbook for R” and the “Stack Exchange Stats” community). Content analysis and coding take place in a spreadsheet application. A draft of Appendix A has been archived in Gleirscher and Marmsoler (2018).

5 Execution, Results, and Analysis

In this section, we summarise the responses to the questions in Table 3 and answer the RQs 1, 2, and 3 as explained in Section 4.1. To answer RQ1, we describe the sample in Section 5.2 and discuss some facets of FM use in Section 5.3. For RQ2, we summarise data about past use and usage intent in Section 5.4. For RQ3, we analyse further data in Section 5.5.
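Before responses to MC questions are summarised, the cleaning rule stated for MC questions (dropping an “I do/have not...” option when ordinary answer options were also checked) could be applied along the following lines. This is a sketch under assumptions: the function name and the sample answers are hypothetical, and the negative option text is the one listed among the Q1 answer options.

```python
from typing import Iterable, Set


def clean_mc_answer(checked: Iterable[str], negative_options: Iterable[str]) -> Set[str]:
    """Drop negative options when at least one ordinary option was checked."""
    checked_set = set(checked)
    ordinary = checked_set - set(negative_options)
    # If only negative options were checked, the answer is kept unchanged.
    return ordinary if ordinary else checked_set


if __name__ == "__main__":
    negative = {"I have not used FMs in any academic or industrial domain."}
    # Hypothetical, contradictory answer: a domain plus the negative option.
    mixed = ["Transportation", "I have not used FMs in any academic or industrial domain."]
    print(clean_mc_answer(mixed, negative))
    # Hypothetical answer with the negative option only.
    print(clean_mc_answer(list(negative), negative))
```

Keeping an answer that consists only of the negative option preserves respondents without FM use in the data instead of turning them into empty records.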

For data collection, we (1) advertised our survey on the channels in Table 5 and (2) personally invited more than 30 persons. The sampling period lasted from August 2017 until March 2019. In this period, we repeated step 1 up to three times to increase the number of participants. Figure 1 summarises the distribution of responses. The channels in Table 5 particularly cover the European and North American areas.

A size estimation of the channels in Table 5 yields around 65K channel memberships (for some channels we make a best guess, but for others, e.g. LinkedIn, the counts are given). Assuming participants are, on average, members of at least three of the channels, we could have reached up to 20K real persons. Given a recent estimate of 23 million SE practitioners worldwide (Evans Data, 2018) and assuming that at least 1% are mission-critical SE practitioners, our population might comprise at least 230K persons, possibly around 38K in the US and 61K in Europe. We received N = 216 responses, resulting in an estimated response rate between 1 and 2% and a population coverage of at most 0.1% globally and 0.2% in the US and in Europe. About 40% of our respondents provided their email addresses, the majority from the US, UK, Germany, France, and a sixth from other EU and non-EU countries.

Footnote: See and https://stats.stackexchange.com.

Footnote: An estimation in Gleirscher et al. (2019) suggests that about 5% of the overall ICT / IS developer population are embedded systems practitioners in critical and non-critical domains. Moreover, Evans Data (2018) and Wikipedia contributors (2018) describe data from 2016 and 2017, suggesting that 3.87 million (19%) SE practitioners live in the US and about 13.3 million (39%) in Europe, the Middle East, and Africa. According to an analysis of data from Stack Overflow by ATOMICO (2019), there is a “software engineering talent pool” of about 6.1 million in Europe.

Figure 2: (Q1) In which application domains in industry or academia have you mainly used FMs? (MC; N = 216)
Bars as listed in the chart: other; Military systems not in the above domains; Process automation; Industrial machinery; I have not used FMs in any academic or industrial domain; Device industry; Platforms; Business information; Critical infrastructures; Supportive; Transportation.

In the following, we summarise the responses to the questions about the application domain (Q1), the level of experience (Q2), and the motivations (Q3) of an FM user.
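As a cross-check of the reach, response-rate, and coverage estimates given for the sample, the arithmetic can be sketched as follows. The numbers are the ones quoted in the text; the variable names are illustrative, and the rounding of the reach to 20K follows the text rather than the raw quotient.

```python
# Figures quoted in the text; variable names are illustrative only.
channel_memberships = 65_000     # estimated memberships over all channels
channels_per_person = 3          # assumed average memberships per person
se_practitioners = 23_000_000    # worldwide estimate (Evans Data, 2018)
mission_critical_percent = 1     # assumed lower bound in percent
responses = 216                  # N

# Raw quotient is about 21.7K; the text rounds this to "up to 20K".
reach_upper_bound = channel_memberships / channels_per_person
reached = 20_000

# Integer arithmetic avoids floating-point noise in the population size.
population = se_practitioners * mission_critical_percent // 100

response_rate = responses / reached        # about 1.08%
global_coverage = responses / population   # about 0.094%

print(f"reach (raw quotient): {reach_upper_bound:,.0f}")
print(f"population:           {population:,}")
print(f"response rate:        {response_rate:.2%}")
print(f"global coverage:      {global_coverage:.3%}")
```

Both results agree with the stated ranges: a response rate between 1 and 2% and a global coverage of at most 0.1%.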

Guide to the Figures.

For Likert-type ordered scales, we use centred diverging stacked bar charts (see, e.g. Figure 4) as recommended by Robbins and Heiberger (2011). The horizontal bars in each line show the answer fractions according to the legend at the bottom and are annotated with the percentages of the left-most, middle, and right-most answer options. These bars are aligned by the midpoint of the middle group (for 3- and 5-level scales) or by the boundary between the two central groups (for 4-level scales).
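The alignment rule can be illustrated with a small sketch that computes how far one bar extends to the left and right of the zero line. This is a minimal illustration of the rule as described above, not the code of the likert package; the function name is hypothetical.

```python
def diverging_bar_extent(percentages):
    """Return (left, right) extents of one bar around the zero line.

    `percentages` lists the answer shares from the left-most to the
    right-most option. With an odd number of levels, the middle group is
    split evenly across zero; with an even number, the zero line is the
    boundary between the two central groups.
    """
    k = len(percentages)
    half = k // 2
    left = sum(percentages[:half])
    right = sum(percentages[half + (k % 2):])
    if k % 2 == 1:
        left += percentages[half] / 2
        right += percentages[half] / 2
    return left, right


# 3-level scale: the middle group (50%) straddles the zero line.
print(diverging_bar_extent([20, 50, 30]))      # (45.0, 55.0)
# 4-level scale: zero lies between the second and third group.
print(diverging_bar_extent([10, 20, 30, 40]))  # (30, 70)
```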

Bar labels often abbreviate the corresponding answer options in the questionnaire. The questionnaire copy in Appendix A.12 contains short definitions, explanations, and examples to clarify the answer options. For the sake of brevity, we do not repeat this information here. “M” denotes the median, “CI” the 95% confidence interval for the median calculated according to Campbell and Gardner (1988), “X” the number of excluded data points per answer option, and “NA” the number of invalid data points.
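To make the “M” and “CI” annotations concrete, the following sketch computes a median and an approximate 95% confidence interval for ordinal answers. It assumes the rank-based normal approximation usually attributed to Campbell and Gardner (1988), with ranks r = n/2 − 1.96·√n/2 and s = 1 + n/2 + 1.96·√n/2 rounded to the nearest integer; the helper used for the figures in this study may differ in detail. The function names, the use of the lower median, and the example answers are assumptions for illustration only.

```python
import math


def median_ci_ranks(n, z=1.96):
    """1-based ranks (r, s) bounding an approximate 95% CI for the median."""
    half_width = z * math.sqrt(n) / 2
    r = max(1, round(n / 2 - half_width))
    s = min(n, round(1 + n / 2 + half_width))
    return r, s


def ordinal_median_with_ci(answers, levels):
    """Median and CI of Likert answers; `levels` lists options in scale order."""
    codes = sorted(levels.index(a) for a in answers)
    n = len(codes)
    r, s = median_ci_ranks(n)
    median = codes[(n - 1) // 2]  # lower median, so the result is a real level
    return levels[median], (levels[codes[r - 1]], levels[codes[s - 1]])


# Made-up answers for illustration only (not the survey data).
levels = ["no", "moderate", "strong motivation"]
answers = ["no"] * 100 + ["moderate"] * 80 + ["strong motivation"] * 36
median, (low, high) = ordinal_median_with_ci(answers, levels)
print(f"M={median}, CI[{low},{high}]")  # M=moderate, CI[no,moderate]
```

For N = 216, the ranks are 94 and 123, so the interval spans the 94th to the 123rd ordered answer.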

Q1: Application Domain.

For each domain, Figure 2 shows the number of participants having experience in that domain. Note that 180 of the respondents do have experience with applying FMs in different industrial contexts, while only 36 have not applied FMs to any application domain. Medical healthcare is an example where participants could have checked more than one answer category because medical devices would belong to “device industry” and emergency management IT would belong to “critical infrastructures”. See Appendix A.12 for more information about the answer categories.

Q2: FM Experience.

Figure 3 depicts participants’ years of experience in using FMs, showing that the sample covers all experience levels. However, the fraction of respondents with no experience (i.e., category “0”) is comparatively low. According to Section 4.6, one third of the participants can be considered LEs with up to three years of experience, and two thirds can be considered MEs with at least three years of experience (29 of those with even more than 25 years). A further analysis of the study participants’ experience profile is available from Table 8 in Appendix A.1 on page 35.

Q3: Motivation.

Figure 4 suggests that regulatory authorities play a subordinate role in triggering the use of FMs. In contrast, intrinsic motivation (in terms of private interest) seems to be the major factor for using FMs. For 9 respondents, none of the given factors was motivating at all. The 88 open responses for this question could either be subsumed in at least one of the given categories (65 in “Own (private) interest”, 11 in other categories) or be declared as a comment (3) or not a further motivation (9). Hence, coding did not require an additional answer category to Q3.

Footnote: MC entails that the sum of answers can exceed N.

Figure 3: (Q2) How many years of FM experience (including the study of FMs) have you gained? (y-axis: Frequency)

Figure 4: (Q3) Which have been your motivations to use FMs? (x-axis: Percentage; degree of motivation: no, moderate, strong motivation)
• Regulatory authorities (X=0, M=no, CI[no,moderate], NA=0)
• Customers / scientific community (X=0, M=moderate, CI[moderate,moderate], NA=0)
• Employer / research collaborators (X=0, M=moderate, CI[moderate,moderate], NA=0)
• Superior / principal investigator (X=0, M=no, CI[no,moderate], NA=0)
• Study or research program (X=0, M=moderate, CI[moderate,strong motivation], NA=0)
• Own interest (X=0, M=moderate, CI[moderate,strong motivation], NA=0)
• On behalf of an FM (X=0, M=no, CI[no,moderate], NA=0)

In the following, we summarise the responses to the questions about the role of a user (Q4), use in specification (Q5), use in analysis (Q6), and the underlying purpose (Q7) of such use.

Q4: Role.

Figure 5 shows in which roles the respondents applied FMs. An analysis of the MC answers shows that 72% of the participants used FMs in an academic environment, as a researcher, lecturer, or student. 50% of the participants applied FMs in practice, as an engineer or consultant (see also Gleirscher and Marmsoler, 2018).
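Because Q4 is a multiple-choice question, a respondent who checked several academic roles must be counted only once in such a share. The sketch below illustrates this counting together with the data-cleaning rule for “I have not ...” options. The role labels follow Figure 5; which labels count as academic is an assumption made here for illustration, and the example answers are made up.

```python
# Role labels as in Figure 5; treating these three as "academic" is an
# assumption for illustration.
ACADEMIC_ROLES = {
    "Researcher in academia",
    "Lecturer, teacher, trainer, or coach",
    "Bachelor, master, or PhD student",
}
NO_ROLE = "I have not used FMs in any specific role."


def clean_mc_answer(checked):
    """Drop the 'have not' option if ordinary options were also checked."""
    checked = set(checked)
    if len(checked) > 1:
        checked.discard(NO_ROLE)
    return checked


def share_with_any(answers, roles):
    """Fraction of respondents who checked at least one of `roles`."""
    cleaned = [clean_mc_answer(a) for a in answers]
    hits = sum(1 for a in cleaned if a & roles)
    return hits / len(cleaned)


# Made-up answers for illustration only (not the survey data).
answers = [
    {"Researcher in academia", "Bachelor, master, or PhD student"},
    {"Engineering practitioner in industry", NO_ROLE},
    {NO_ROLE},
    {"Lecturer, teacher, trainer, or coach"},
]
print(share_with_any(answers, ACADEMIC_ROLES))  # 0.5
```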

Q5: Use in Specification.

The degree of usage of FMs for specification is depicted in Figure 6. There is an almost balanced proportion between theoretical and practical experience with the use of various specification techniques. Only the use of FMs for the description of dynamical systems seems to be remarkably low.

Q6: Use in Analysis.

The use of FMs for analysis is depicted in Figure 7. Similar to specification techniques, we observe an almost balanced proportion between theoretical and practical experience with the usage of various analysis techniques. The use of assertion checking techniques, such as contracts, stands out. As expected from the observations for Q5, the use of FMs in computational engineering, such as algebraic reasoning about differential equations, is again exceptionally low.

Figure 5: (Q4) In which roles have you used FMs? (MC; N = 216)
Bars as listed in the chart: I have not used FMs in any specific role; Stakeholder of an FM tool or service provider; Consulting or managing practitioner in industry; External consultant; Researcher in industry; Engineering practitioner in industry; Lecturer, teacher, trainer, or coach; Bachelor, master, or PhD student; Researcher in academia.

Figure 6: (Q5) Describe your level of experience with each of the following classes of formal description techniques. (x-axis: Percentage; experience: none, course, lab, practiced, practiced > 1)
• Predicative, relational, or algebraic (X=0, M=lab, CI[course,lab], NA=0)
• Modal and temporal logic specification (X=0, M=lab, CI[course,lab], NA=0)
• Process models (X=0, M=lab, CI[course,lab], NA=0)
• Dynamical systems (X=0, M=course, CI[course,course], NA=0)

Figure 7: (Q6) Describe your level of experience with each of the following classes of formal reasoning techniques. (x-axis: Percentage; experience: none, course, lab, practiced, practiced > 1)
• Abstract interpretation (X=0, M=course, CI[course,lab], NA=0)
• Assertion checking (X=0, M=lab, CI[lab,lab], NA=0)
• Process calculi (X=0, M=course, CI[course,course], NA=0)
• Model checking, SMV (X=0, M=lab, CI[course,lab], NA=0)
• Constraint solving (X=0, M=course, CI[course,lab], NA=0)
• Generic theorem proving (X=0, M=course, CI[course,lab], NA=0)
• Computational engineering, simulation (X=0, M=course, CI[course,course], NA=0)
• Symbolic execution (X=0, M=course, CI[course,lab], NA=0)
• Consistency checking (X=0, M=lab, CI[course,lab], NA=47)

Figure 8: (Q7) I have mainly used FMs for ... (x-axis: Percentage)
• Clarification (X=0, M=2-5, CI[1,2-5], NA=0)
• Specification (X=0, M=2-5, CI[2-5,2-5], NA=0)
• Inspection (X=0, M=2-5, CI[1,2-5], NA=0)
• Synthesis (X=0, M=0, CI[0,1], NA=0)
• Assurance (X=0, M=2-5, CI[2-5,2-5], NA=0)

Figure 9: Number of respondents using FMs by domain (past vs. intent)
Bars as listed in the chart, each for “Application Domain (past)” and “Application Domain (future)”: Business information; Platforms; Supportive; Critical infrastructures; Device industry; Military systems not in the above domains; other; Industrial machinery; Transportation; Process automation; I have not used FMs in any academic or industri...; I would not or do not intend to use FMs in any...

Q7: Purpose.

Figure 8 depicts the participants’ purposes to apply FMs. It seems that the respondents employ FMs mainly for assurance, specification, and inspection. Synthesis, on the other hand, seems to be only a subordinate purpose for them.

We investigate the usage intent of FMs across various domains and roles as well as the participants’ intent to use various FMs and their intended purpose to use FMs.

Application Domain.

Figure 9 compares the respondents’ past domains of FM application with their intended domains (see Q8). This figure reveals two insights into the participants’ intentions to use FMs: (i) Fewer participants do not want to apply FMs in the future (19) than participants that have not used FMs (36, see yellow bars). Ten participants fall into both categories: they have not used FMs and do not intend to use FMs. (ii) The intended application of FMs exceeds the current application of FMs across all domains. Hence, there is a tendency to increase the use of FMs across all application domains.

Role.

Figure 10 compares the participants’ roles in which they applied FMs in the past with their intended role to apply FMs in the future (see Q9). Similar to the results for the application domain, we observe that some participants, who have not applied FMs in any role so far, intend to apply such methods in the future. However, the comparison reveals that academic disciplines (i.e., researcher and lecturer) seem to be stable. There is only a small difference between the number of participants who applied FMs in academic domains in the past and the number of participants who want to apply such methods to these domains in the future.

In contrast, there is a significant increase in the number of participants aiming to apply FMs across all industrial roles.

Furthermore, the diagram shows a strong contrast between past and intended use in the category “Bachelor, master, or PhD student.” We can see several reasons for this difference. From the respondents who “used FMs as a student,” many (i) might not be able to “use FMs as a student” anymore because of having graduated, (ii) did not find FMs or the way FMs were taught helpful, or (iii) moved into a business domain with no foreseeable demand for the application of FMs.

Figure 10: Number of respondents applying FMs by role (past vs. intent)
Bars as listed in the chart, each for “Role (past)” and “Role (future)”: Bachelor, master, or PhD student; Consulting or managing practitioner in industry; External consultant; Researcher in academia; Researcher in industry; Lecturer, teacher, trainer, or coach; Stakeholder of an FM tool or service provider; Engineering practitioner in industry; I have not used FMs in any specific role; I do not or would not intend to use FMs in any ...

Figure 11: (Q10) I (would) intend to use ... (x-axis: Percentage; relative frequency: no more, less, equally, more often)
• Predicative, relational, or algebraic specification (X=26, M=equally, CI[equally,equally], NA=0)
• Modal or temporal logic specification (X=27, M=equally, CI[equally,equally], NA=0)
• Process models (X=27, M=equally, CI[equally,equally], NA=0)
• Dynamical system models (X=39, M=equally, CI[equally,equally], NA=0)

Q10: Intended use for Specification.

Figure 11 depicts the respondents’ intended future use of various FMs for system specification (i.e., formal description techniques). The figure shows an almost equal number of participants aiming to decrease (i.e., “no more” and “less”) and increase (i.e., “more often”) their use of FMs for specification. Only dynamical system models again seem to be an exception: more participants want to decrease their use of this technology than want to increase it.

Q11: Intended use for Analysis.

The respondents’ intended use of FMs for the analysis of specifications (i.e., formal reasoning techniques) is depicted in Figure 12. Except for process calculi, we observe a general tendency of the participants to increase their future FM use.

Q12: Intended Purpose.

Figure 13 indicates why respondents intend to apply FMs. Again, there is a tendency of the participants to increase FM use across all listed purposes.

Figure 12: (Q11) I (would) intend to use ... (x-axis: Percentage; relative frequency: no more, less, equally, more often)
• Abstract interpretation (X=37, M=equally, CI[equally,equally], NA=0)
• Assertion checking (X=17, M=equally, CI[equally,more often], NA=0)
• Process calculi (X=46, M=equally, CI[equally,equally], NA=0)
• Model checking, SMV (X=21, M=equally, CI[equally,more often], NA=0)
• Constraint solving (X=23, M=equally, CI[equally,more often], NA=0)
• Generic theorem proving (X=38, M=equally, CI[equally,more often], NA=0)
• Simulation (X=28, M=equally, CI[equally,equally], NA=0)
• Symbolic execution (X=26, M=equally, CI[equally,more often], NA=0)
• Consistency checking (X=23, M=equally, CI[equally,more often], NA=47)

Figure 13: (Q12) I (would) intend to use FMs for ... (x-axis: Percentage; relative frequency: no more, less, equally, more often)
• Clarification (X=22, M=equally, CI[equally,more often], NA=0)
• Specification (X=12, M=equally, CI[equally,more often], NA=0)
• Inspection (X=20, M=equally, CI[equally,more often], NA=0)
• Synthesis (X=34, M=equally, CI[equally,more often], NA=0)
• Assurance (X=18, M=equally, CI[equally,more often], NA=0)

Q7 and Q12: Comparison of Code- and Model-based FMs.

In the following, we regard practitioners with experience level “applied several times in engineering practice” or “applied once in engineering practice” and frequency “applied in 2 to 5 separate tasks” or “applied in more than 5 separate tasks” (see Table 4). We compare users of code-based FMs (CBs; including “abstract interpretation”, “assertion checking”, “symbolic execution”, “consistency checking”; with N = ) with users of model-based FMs (MBs; including “process calculi”, “model checking”, “theorem proving”, and “simulation”; with N = ).

Figure 14: Approximation (likelihood) of practised use (UFM_p) by FM class and application domain (x-axis: FM Class; y-axis: Application Domain)

Looking at inspection (e.g. error detection, bug finding) shows the following:
• CBs show slightly more frequently an increased intent (the “more often” group) than MBs, for both sub-groups, i.e., respondents with 2 to 5 and with more than 5 past uses.
• MBs show slightly more frequently a decreased intent (the “no more” group) than CBs.

Looking at assurance (e.g. proof, error removal) shows the following:
• MBs show slightly more frequently an increased intent than CBs when looking at respondents who have used FMs more than 5 times. However, MBs indicate slightly less frequently an increased intent than CBs when looking at respondents with 2 to 5 uses.
• CBs indicate more dnks after 2 to 5 uses and slightly more frequently a decreased intent after 5 uses in comparison with MBs.
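The selection behind this comparison can be sketched as follows. This is a minimal sketch under stated assumptions: a respondent counts as a CB (or MB) user if at least one code-based (or model-based) class has been practised, a respondent may belong to both groups, and the record layout (`Respondent`, `experience`, `past_use`, `intent`) is hypothetical. The example respondents are made up.

```python
from dataclasses import dataclass

CODE_BASED = {"abstract interpretation", "assertion checking",
              "symbolic execution", "consistency checking"}
MODEL_BASED = {"process calculi", "model checking",
               "theorem proving", "simulation"}
PRACTISED = {"applied once in engineering practice",
             "applied several times in engineering practice"}
FREQUENT = {"applied in 2 to 5 separate tasks",
            "applied in more than 5 separate tasks"}


@dataclass
class Respondent:
    experience: dict  # FM class -> experience level
    past_use: dict    # purpose -> frequency of past use
    intent: dict      # purpose -> intended future use


def user_groups(r, purpose):
    """Groups ("CB", "MB") a respondent falls into for a given purpose."""
    if r.past_use.get(purpose) not in FREQUENT:
        return set()
    practised = {c for c, lvl in r.experience.items() if lvl in PRACTISED}
    groups = set()
    if practised & CODE_BASED:
        groups.add("CB")
    if practised & MODEL_BASED:
        groups.add("MB")
    return groups


def intent_share(respondents, group, purpose, answer):
    """Share of a group's members giving `answer` as intent for `purpose`."""
    members = [r for r in respondents if group in user_groups(r, purpose)]
    if not members:
        return None
    return sum(r.intent.get(purpose) == answer for r in members) / len(members)


# Made-up respondents for illustration only (not the survey data).
r1 = Respondent({"assertion checking": "applied several times in engineering practice"},
                {"inspection": "applied in 2 to 5 separate tasks"},
                {"inspection": "more often"})
r2 = Respondent({"model checking": "applied once in engineering practice"},
                {"inspection": "applied in more than 5 separate tasks"},
                {"inspection": "no more"})
r3 = Respondent({"model checking": "studied in course"},
                {"inspection": "applied in 2 to 5 separate tasks"},
                {"inspection": "more often"})
print(intent_share([r1, r2, r3], "CB", "inspection", "more often"))
```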

Q1, Q5, and Q6: Practised FM Classes by Application Domain.

We asked respondents about their use of each FM class independent of the application domain and about their general use of FMs in each such domain. Hence, we can only approximate past usage per FM class and application domain, assuming that the overall usage per respondent is uniformly distributed among the specified FM classes and domains. For that, we interpret (and count) each respondent who specifies a domain in combination with “applied once in engineering practice” or “applied several times in engineering practice” for an FM class as a practitioner who has used (UFM_p) or, respectively, wants to use (UFM_i) FMs of that class in that domain. More generally, we count a respondent who specifies n domains, say d_1 to d_n, in combination with “applied once in engineering practice” or “applied several times in engineering practice” for m FM classes, say c_1 to c_m, as a practitioner who has used (UFM_p) or, respectively, wants to use (UFM_i) FMs of the classes c_1 to c_m in the domains d_1 to d_n. Figures 14 and 15 show these approximations for UFM_p and UFM_i.

Table 6 lists the FM challenges subject to discussion, their background, and literature referring to them. We apply the procedure described in Section 4.6.

Table 6 (columns: Challenge Name & Description; Src.; Supported by (oldest, newest); Findings for RQ3 (Section 5.5))

Scalability: Useful in handling large and technologically heterogeneous systems
Src.: Q
Supported by: 7 studies, e.g. Hall (1990) and Miller et al. (2010)
Findings for RQ3: Toughest in Figure 16; by Ps more than by NPs; when using FMs for assurance and clarification; independent of FM class

Skills & Education: Methods known (little misconception); trained and experienced users available
Src.: Q
Supported by: 13 studies, e.g. Bjorner (1987), Bicarregui et al. (2009)
Findings for RQ3: 2nd toughest; agreed by LEs and MEs; largely independent of FM class; comparatively small tough-proportions by Ms

Transfer of Proofs: Relation between models and reality (e.g. code), handling incomplete specifications
Src.: Q
Supported by: 8 studies, e.g. Jackson (1987) and Parnas (2010)
Findings for RQ3: Agreed by LEs and MEs; top-rated by DIs and NMs; largely independent of FM class

Reusability: Parametric proofs, reusable specifications and verification results
Src.: Q
Supported by: Barroca and McDermid (1992) and Bowen and Hinchey (1995b)
Findings for RQ3: Top-rated by tool provider stakeholders and lecturers

Abstraction: Useful and correct (automated) abstractions from irrelevant detail (for comprehension and validation)
Src.: Q
Supported by: 12 studies, e.g. Jackson (1987), Miller et al. (2010), and Parnas (2010)
Findings for RQ3: Varies notably across FM classes

Tools & Automation: Useful notations and trustworthy tools (for manipulation, checking, collaboration, documentation)
Src.: Q
Supported by: 16 studies, e.g. Bjorner (1987) and O’Hearn (2018)
Findings for RQ3: Top-rated by DIs; but comparatively small tough-proportions from practitioners

Maintainability: Stable proofs, easily modifiable specifications, and adaptable verification results
Src.: Q
Supported by: Barroca and McDermid (1992), Knight et al. (1997), and Parnas (2010)
Findings for RQ3: Comparatively small tough-proportions from practitioners

Resources: Sufficient resources, good cost-benefit ratio (despite adoption, training, licenses)
Src.: 4R
Supported by: 11 studies, e.g. Hall (1990) and Woodcock et al. (2009)

Process Compatibility: Integration into existing process, method culture, standards, and regulations
Src.: 6R
Supported by: 12 studies, e.g. Bjorner (1987) and O’Hearn (2018)

Practicality & Reputation: Benefit awareness and sufficient empirical evidence for benefits
Src.: 7R
Supported by: 6 studies, e.g. Lai and Leung (1995) and Parnas (2010)

Findings for RQ3 (Resources, Process Compatibility, Practicality & Reputation): No detailed data was collected. Because these challenges were mentioned several times each, we classify them to be at least of moderate difficulty.

Src. . . . source, Q . . . in questionnaire, nR . . . additionally raised by n respondents

Figure 15: Approximation (likelihood) of increased usage intent (UFM_i) by FM class and application domain (x-axis: FM Class; y-axis: Application Domain)

Figure 16: (Q13) For any use of FMs in my future activities, I consider ⟨obstacle⟩ as [not an | a moderate | a tough] issue. (x-axis: Percentage; degree of difficulty: not an issue, moderate, tough)
• Scalability (X=26, M=tough, CI[tough,tough], NA=0)
• Proper abstractions from irrelevant details (X=36, M=moderate, CI[moderate,tough], NA=0)
• Maintainability of verification results (X=36, M=moderate, CI[moderate,tough], NA=0)
• Reusability of verification results (X=37, M=moderate, CI[moderate,tough], NA=0)
• Transfer of verification results from (X=31, M=moderate, CI[moderate,tough], NA=0)
• Automation or tool support (X=27, M=moderate, CI[moderate,tough], NA=0)
• Skills and education (X=23, M=tough, CI[tough,tough], NA=0)

General Ranking (Q13).

Figure 16 shows the respondents’ ratings of all challenges. Most of them believe that scalability will be the toughest challenge, and maintainability is considered the least difficult of all rated obstacles. For reuse of proof results, proper abstractions, and tool support, the participants distribute more uniformly across moderate and high difficulty.

In the following, we compare specific groups of respondents by how they perceive the difficulty of the various challenges. We group respondents according to the criteria in Section 4.6 and according to the

Less Experienced (LE) versus More Experienced (ME) Respondents (Q2).

The comparison of the difficulty ratings of LEs with the ratings of MEs shows that (i) LEs less often perceive the given challenges as tough, (ii) MEs significantly more often rate scalability as tough, and (iii) both groups show the closest agreement on transfer of verification results and skills and education.

Non-Practitioners (NP) versus Practitioners (P) by Past Purpose (Q7).

The perception of skills and education and scalability as the most difficult challenges is largely independent of the purpose, again with Ps attributing more significance to scalability. Scalability, the forerunner in Figure 16, exhibits the most tough-ratings from NPs in synthesis and from Ps in assurance and clarification (see the top half of Figure 22 in Appendix A.6).

Decreased Intent (DI) versus Increased Intent (II) by Purpose (Q12).

The comparison of the difficulty ratings of respondents with no or decreased intent to use FMs for a specific purpose and of respondents with equal or increased intent shows: (i) Scalability and skills and education, both forerunners in Figure 16, show the most tough-ratings from IIs for assurance (67%) and inspection (66%) and from DIs for synthesis (53%). (ii) The trend in Figure 16 is more clearly observable from IIs than from DIs, where transfer of verification results and automation and tool support seem to be tougher than skills and education.

Non-Practitioners (NP) versus Practitioners (P) by FM Class (Q5, Q6).

The top half of Figure 17 shows that, for NPs, the trend in Figure 16 is largely independent of the FM class, except for consistency checking and logic leading with tough proportions of 49%.

The bottom half of Figure 17 shows that, for Ps, difficulty ratings across FM classes vary more: The foremost challenges in Figure 16 received the most tough-ratings from users of process models, dynamical systems, process calculi, model checking, and theorem proving. Difficulty ratings of users are often centred on moderate or tough; proper abstraction and skills and education show a comparatively wide variety across FM classes.

The histograms in the lower right corners in Figure 17 indicate that (i) NPs’ difficulty ratings vary less than Ps’ ratings, (ii) NPs’ ratings are more independent from the FM classes, and (iii) NPs’ difficulty ratings are lower on average than Ps’ ratings. Appendix A.6 contains several such association matrices with more detailed data in the matrix cells.

Decreased Intent (DI) versus Increased Intent (II) by FM Class (Q10, Q11).

The trend in Figure 16 is supported by many tough ratings (48%) for transfer of verification results from DIs in consistency checking. However, DIs in process calculi provide comparatively many tough-ratings (39%) for the generally low-ranked automation and tool support. Assertion checking exhibits comparatively low tough-proportions across all challenges whereas process calculi exhibit comparatively high tough-ratings. Mirroring the trend in Figure 16, IIs show less variance than DIs across all FM classes.
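The tough-proportions quoted in these comparisons, and the histograms of the association matrices in Figure 17, can be approximated with a computation like the following. This is a sketch under assumptions: each rating is a triple of FM class, challenge, and answer; excluded and dnk answers are assumed to have been removed beforehand; and the function names and example ratings are made up for illustration.

```python
from collections import Counter, defaultdict


def tough_counts(ratings):
    """Map each (fm_class, challenge) cell to (tough ratings, all ratings)."""
    totals = defaultdict(int)
    toughs = defaultdict(int)
    for fm_class, challenge, rating in ratings:
        cell = (fm_class, challenge)
        totals[cell] += 1
        if rating == "tough":
            toughs[cell] += 1
    return {cell: (toughs[cell], totals[cell]) for cell in totals}


def band_histogram(counts, band_percent=5):
    """Count cells per band of tough-proportions (band 0 covers 0% to <5%)."""
    n_bands = 100 // band_percent
    hist = Counter()
    for tough, total in counts.values():
        # Integer arithmetic keeps band boundaries exact; 100% joins the top band.
        band = min(tough * 100 // (total * band_percent), n_bands - 1)
        hist[band] += 1
    return hist


# Made-up ratings for illustration only (not the survey data).
ratings = (
    [("process calculi", "scalability", "tough")] * 7
    + [("process calculi", "scalability", "moderate")] * 3
    + [("assertion checking", "scalability", "tough")] * 1
    + [("assertion checking", "scalability", "not an issue")] * 3
)
counts = tough_counts(ratings)
for cell, (tough, total) in counts.items():
    print(cell, f"{tough / total:.0%}")
print(dict(band_histogram(counts)))
```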

Unmotivated (NM) versus Motivated (M) respondents by Motivating Factor (Q3).

Respondents with moderate to strong motivation to use FMs more likely identify the given challenges as moderate to tough, regardless of the motivating factor. The trend in Figure 16 seems explainable by many tough ratings from respondents motivated by regulatory authorities (69%), not motivated by tool providers (56%), and not motivated by superiors / principal investigators (56%, see Figure 24 in Appendix A.6). NMs’ tough-ratings are notably lower than Ms’ tough-ratings.

Past and Future Views by Role (Q4, Q9).

Although participants show role-based discrepancies between their past and intended use of FMs (Figure 10), the perception of difficulty of the rated challenges seems to be largely similar, following the trend in Figure 16. The high ranking of scalability (and reusability of verification results) is supported by many tough-ratings from tool provider stakeholders for the past view and many from lecturers for the future view. Respondents not having used FMs or not planning to use FMs exhibit the lowest tough-ratings but also the highest fractions of dnk-answers.

Figure 17: Difficulty of challenges (cols): NPs (top) compared to Ps (bottom) by class of used FM (rows). Panel titles: “Comparison of Challenge Difficulty across FMs (users not practicing FMs, past)” and “Comparison of Challenge Difficulty across FMs (users practicing FMs, past)”. Key: tough-proportion of all ratings per cell (0 to 1), with a histogram of toughs over cells.

Legend: In each cell of an association matrix, both the solid vertical line and the colour (gradient from red to white) represent the tough proportions (from 0 to 100%), with the dotted vertical line marking the 50% margin. The histogram (to the lower right corner of each matrix) counts the combinations (cells) in each 5%-band of tough ratings. E.g. ∼70% of “process calculi” users perceive “scalability” as a tough challenge.

Past and Future Views by Domain (Q1, Q8).

The trend in Figure 16 is underpinned by highest tough-proportions for respondents from the transportation, military systems, industrial machinery, and supportive domains.

Discussion

In this section, we discuss and interpret our findings, relate them to existing evidence, outline general feedback on the questionnaire, and critically assess the validity of our study.

The following (F)indings are based on the data summarised and analysed in the Sections 5.2 to 5.4. All findings are then collected in Table 7 on page 25.

Findings for RQ 1.

(F1) Regulatory authorities with their norms, codes, or policies represent only a minor motivating factor to use FMs. Intrinsic motivation (maybe market-triggered) seems to be stronger. This finding is consistent with what we know from the literature survey in Gleirscher et al. (2019): FMs are not formally required by corresponding standards today, not even for the highest safety integrity levels. If regulatory authorities change their recommendations to requirements, then this might spike as a motivating factor.

(F2) The low fraction of respondents with no experience in Figure 3 may have been caused (1) by our choice of expert channels in Table 5 where the likelihood of encountering FM users is probably higher than in more generic SE channels (e.g. Stack Overflow) and (2) by the fact that SE students will usually have an FM course or some lectures about FMs such that they would choose “1–3 years” in Q2 and “studied in course” in Q5.

(F3) We observe the least use of FMs in computational engineering and for reasoning about dynamical systems, for example, reasoning about the correctness of algorithms controlling such systems and of their implementation in embedded software. One explanation for this is that our sample mainly comprises software and systems engineers who will work less intensively with such FMs than, for example, mechanical or control engineers. Another explanation is that such FMs are still less widely known, less well developed, or less well supported by tools than FMs focusing on the reasoning about pure software.

Findings for RQ 2. (F4)

It seems that in all given domains (Figure 9, except for other ) respondentsintend to increase their future use of FMs. Moreover, we observe that this tendency is independent ofthe particular

FM class (except process calculi) or purpose . The data also suggest that the use of FMs byteachers and researchers is saturated. This saturation indicates a stable intent to teach FMs, to performresearch in FMs, or to otherwise use FMs in teaching or research. However, there is an increased intentto apply FMs in industrial contexts in the future. One explanation could be that engineers have alreadywanted to use FMs but have not had the opportunity or were not told or permitted to do so. Anotherexplanation for an increased intent of FM non-users could be due to some bias when answering questionsabout whether someone would do (e.g. try out) something. (F5)

Figure 18: The past role profile of the 46 non-practitioners (out of 216 respondents) helps to explain finding F8. [Bar chart: number of non-practitioners (MC, axis from 0 to 25) per role: Bachelor, master, or PhD student; Consulting or managing practitioner in; External consultant; Researcher in academia; Researcher in industry; Lecturer, teacher, trainer, or coach; Stakeholder of an FM tool; Engineering practitioner in industry; I have not used FMs.]

Our data suggest that experience in using a certain FM class is positively associated with the intent to use this FM class in the future. To investigate this suspicion, we analysed the intended use of a FM class based on the experience of participants in using this class (also by association analysis as described in Section 4.6). We observe that the more experience one has with using a specific FM class, the more likely they will apply it in the future (see the two charts in the Appendices A.3 and A.5). No experience with a specific FM class correlates with a low intent to use that class. Participants not having used FMs and, hence, unfamiliar with them might not have had the need in the first place. Even a little experience with a certain FM class significantly increases the intent to apply it again in the future. Similar observations can be made for the use of FMs in general for a specific purpose. (F6)
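The association between experience and usage intent described above can be illustrated with a small computation. The following is only a minimal sketch: it assumes ordinal answer scales and uses Goodman and Kruskal's gamma as the association measure, which is not necessarily the analysis of Section 4.6. The scale labels, the function name `goodman_kruskal_gamma`, and all response counts are hypothetical and do not come from the survey data.

```python
from itertools import combinations

# Hypothetical ordinal scales (not the questionnaire's actual answer options).
EXPERIENCE = ["none", "little", "substantial"]
INTENT = ["less", "same", "more"]


def goodman_kruskal_gamma(pairs):
    """Return gamma in [-1, 1] for (experience, intent) pairs of ordinal labels."""
    ranked = [(EXPERIENCE.index(e), INTENT.index(i)) for e, i in pairs]
    concordant = discordant = 0
    for (e1, i1), (e2, i2) in combinations(ranked, 2):
        product = (e1 - e2) * (i1 - i2)
        if product > 0:
            concordant += 1  # both variables order the two respondents alike
        elif product < 0:
            discordant += 1  # the variables order the two respondents oppositely
        # pairs tied on either variable are ignored by gamma
    if concordant + discordant == 0:
        return 0.0  # only ties: no ordinal information available
    return (concordant - discordant) / (concordant + discordant)


# Made-up responses for a single FM class, purely for illustration.
responses = (
    [("none", "less")] * 4 + [("none", "same")] * 3 + [("none", "more")] * 1
    + [("little", "same")] * 2 + [("little", "more")] * 5
    + [("substantial", "same")] * 1 + [("substantial", "more")] * 6
)

gamma = goodman_kruskal_gamma(responses)
print(f"gamma = {gamma:.2f}")
```

A gamma close to 1 would correspond to the pattern reported here (more experience, higher intent); a value near 0 would indicate no ordinal association. Gamma is chosen in this sketch only because it tolerates the many ties that coarse answer scales produce.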

The differences in past and intended use between code- and model-based FMs (Section 5.4), for example, when looking at inspection and assurance, are marginal. Moreover, we cannot find a significant difference or a trend between these two categories of FMs when considering different purposes, experience levels, and usage frequencies. (F7)

The approximation in Figures 14 and 15 allows the, albeit vague, interpretation of the numbers as the likelihood that respondents have used (Figure 14) or want to use (Figure 15) a particular FM in a particular domain. Assuming this model, Figure 15 indicates the highest likelihoods of an increased UFM_i for methods such as “assertion checking”, “constraint solving”, “model checking”, and “symbolic execution” in domains such as “transportation”, “critical infrastructures”, and the “device industry”.

Findings for RQ 3. (F8)

Scalability and skills and education lead the challenge ranking, independent of the domain, FM class, motivating factor, and purpose. Practitioners see scalability as more problematic than non-practitioners, whereas non-practitioners perceive skills and education as more problematic than practitioners. Figure 18 may explain the latter by showing a high fraction of students among the 46 non-practitioners. (F9)
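To make the notion of a challenge ranking per group concrete, the following minimal sketch ranks challenges by their share of tough ratings, separately for practitioners and non-practitioners. It rests on several assumptions: the rating labels ("tough", "moderate", "minor", "dnk"), the exclusion of "dnk" answers, the helper names `tough_share` and `rank_challenges`, and all counts are hypothetical and are not taken from the survey data.

```python
from collections import Counter


def tough_share(ratings):
    """Fraction of 'tough' among ratings, ignoring 'dnk' (do not know) answers."""
    counts = Counter(r for r in ratings if r != "dnk")
    total = sum(counts.values())
    return counts["tough"] / total if total else 0.0


def rank_challenges(ratings_by_challenge):
    """Order challenges from the highest to the lowest share of tough ratings."""
    shares = {name: tough_share(r) for name, r in ratings_by_challenge.items()}
    return sorted(shares, key=shares.get, reverse=True)


# Made-up ratings per group; labels and counts are illustrative assumptions.
practitioners = {
    "scalability": ["tough"] * 7 + ["moderate"] * 3,
    "skills and education": ["tough"] * 5 + ["moderate"] * 4 + ["minor"],
    "maintainability": ["tough"] * 2 + ["moderate"] * 5 + ["minor"] * 3,
}
non_practitioners = {
    "scalability": ["tough"] * 4 + ["moderate"] * 4 + ["dnk"] * 2,
    "skills and education": ["tough"] * 6 + ["moderate"] * 2 + ["dnk"] * 2,
    "maintainability": ["tough"] * 1 + ["moderate"] * 4 + ["minor"] * 2 + ["dnk"] * 3,
}

for group, data in (("practitioners", practitioners),
                    ("non-practitioners", non_practitioners)):
    print(group, rank_challenges(data))
```

With these invented counts, scalability leads for practitioners and skills and education leads for non-practitioners, mirroring the qualitative pattern of F8. Whether "dnk" answers are excluded or kept is a design choice in this sketch.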

Maintainability of proof results or other verification artefacts was found to be the least difficult challenge. However, in the lower half of Figure 17, the challenge column “maintainability” shows relatively low frequencies for “modal and temporal logic” and “model checking” (possibly because of the high level of automation) whereas “theorem proving” (possibly because of a low level of automation) and “constraint solving” (possibly because of being too versatile or generic for the present purpose) show the highest frequencies of tough ratings. See Figure 26 in Appendix A.6 for more details. (F10)

Reusability of proof results was rated as tough by several practitioner groups. (F11)

FM users with decreased usage intent rate tool deﬁciencies as their top obstacle to FM adoption. (F12)

Furthermore, our respondents raised three additional challenges (i.e., resources, process compatibility, and practicability & reputation) which we cross-validated with the literature (see highlighted rows in Table 6). The fact that these obstacles were mentioned several times in addition to the given obstacles (F13)

Challenges are perceived as moderate or tough, largely similarly between the pairs of groups we distinguish in Section 4.6. (F14)

With 72% of tough ratings for scalability, process calculi (e.g. ACP, CCS, CSP) perform in the midfield despite their high reputation as compositional methods. Scalability of process models (e.g. Petri nets, Mealy machines, labelled transition systems, Markov models) is also ranked in the middle field of tough challenges. The ranking of these models, however, is unsurprising in the light of the difficult scalability of model checking, a frequently used verification technique for process models and the leader in this ranking (cf. Figure 17). One explanation for the high number of tough ratings from NPs in synthesis could be either that NPs do not associate FMs with synthesis in general or that automated synthesis of sophisticated artefacts is known to be an unsolved problem in many cases, independent of the use of FMs.

In analogy to the reasoning in Davis (1989), an increased positive experience with practically applying FMs forms a high degree of PU (Section 2). Davis (1989, pp. 329, 331) observed that current and intended usage are significantly correlated with PU, less with PEOU. In fact, F4 suggests an increased intent to use FMs in the future. Moreover, F5 suggests a positive association of the degree of experience with UFM_i, that is, more experience increases the intent. (F15)

Because the use of FMs is not mandatory for most respondents, a likely explanation for an increased intent (UFM_i) is that our respondents perceive the usefulness of FMs to be more positive than negative.

Inspired by Riemenschneider et al. (2002), in the last paragraph of Section 4.2, we justify the use of challenge scales to collect data for PEOU and PU. We justify the validity of the FM-specific challenge scale using the studies in Table 1. The column “supported by” in Table 6 indicates studies discussing the corresponding challenges. From these discussions, we infer that tackling these challenges contributes to an increased EOU and U. First, the studies suggest that FMs are easier to use if users have sufficient skills and education, if the methods scale to large systems, if mature tools and automation are available, and if proofs are easily maintainable and reusable. Second, the studies suggest that FMs are more useful if they are compatible with the process, if their cost-benefit ratio is low, if their abstractions are correct and expressive, and if proofs can be correctly transferred to reality. Hence, these challenges represent FM-specific substrata (Davis, 1989, p. 325) of EOU and U for FMs. Moreover, a high degree of PEOU corresponds to an increased positive user experience with FMs which translates to a low proportion of tough ratings for the obstacles measured in Q13. However, from F13, we observe that respondents rate most challenges as moderate to tough, largely independent of other variables (F8). (F16)

Overall, it thus seems that our respondents perceive the ease of use of FMs to be more negative than positive. According to Table 6, many of the surveyed studies discuss skills & education (12 studies) and tools & automation (16) as important challenges. Moreover, Figure 16 suggests that conceptual difficulties (possibly, from a lack of education and training, from difficulties in FM teaching, from a lack of FM students) seem to be at least as responsible for the negative ease of use as the lower ranked tool deficiencies. Indeed, in a recent discussion of “push-button verifiers”, O’Hearn (2018) highlights that both conceptual expertise and tool deficiencies are still significant bottlenecks. However, an investigation of respondents’ experiences with FM tools in comparison to their experiences with FM concepts goes beyond the possibilities of the data collected for this study.

Our systematic map shows that our list of challenges is completely backed by substantial literature (see Table 6) raising and discussing these challenges. (F17)

However, the fact that maintainability and reusability were least covered by our literature is, on the one hand, in line with F9 but, on the other hand, not with F10 and typical cultures of reuse in practice.

RQ1: In which typical domains, for which purposes, in which roles, and to what extent have FMs been used?
F1 Intrinsic motivation to use FMs is stronger than norms or codes of regulatory authorities.
F2 The fraction of respondents with no experience at all is comparatively low.
F3 Respondents use FMs the least in computational engineering and for dynamical systems.

RQ2: Which relationships can we observe between past experience in using FMs and intent to use FMs?
F4 Increased intent to use FMs observable across all application domains.
F5 Amount of experience is positively associated with the strength of usage intent.
F6 The responses do not show any significant differences between code- and model-based FMs.
F7 Respondents show high likelihoods of an increased intent to use FMs such as “model checking” or “assertion checking” in areas such as “transportation” or “critical infrastructures”.

RQ3: How difficult do study participants perceive widely known FM challenges?
F8 Scalability and skills & education lead the challenge difficulty ranking.
F9 Maintainability of proof results is found to be the least worrying challenge.
F10 Reusability of proof results is rated as tough by several practitioner groups.
F11 FM users with decreased usage intent rate tool deficiencies as their top obstacle.
F12 Respondents identified resources, process compatibility, and reputation as further obstacles.
F13 All considered challenges are generally perceived as moderate or tough.
F14 Among the FM classes, process models are most positively associated with tough scalability.

RQ4: What can we say about the perceived ease of use and the perceived usefulness of FMs?
F15 Respondents perceive the usefulness of FMs as mainly positive and intend to increase their use.
F16 Respondents perceive the ease of use of FMs as mainly negative.

Relationship to Existing Evidence (from the literature):
F17 Proof maintainability and reusability are least covered by the literature.
F18 We repeat Austin and Parkin (1993), excluding benefit analysis but with a broader sample and more detailed questions.

Beyond the general findings about FM benefits in Austin and Parkin (1993), we steered our half-open questionnaire towards a refined classification of responses, comparing past with intended use, and interrogating recently perceived obstacles among a methodologically and geographically more diverse sample. Their sample mainly covers Z and VDM users in the UK. Our questionnaire has less focus on representation and methodology and excludes both questions on benefits and on suggestions to overcome obstacles. Regarding the latter, Austin and Parkin (1993) mention the improvement of education and standardisation, the preparation of case studies, and the definition of FM effectiveness metrics. (F18)

The report of Austin and Parkin (1993) from the National Physical Laboratory archive was unfortunately no longer available to us. We finally managed to get access to a paper copy provided by a friendly colleague. This, however, only happened after conducting this survey. Anyway, we found that our conclusions are nearly identical to Austin and Parkin’s. The data from Figure 16 and Table 6 confirm that many of the obstacles (i.e., limitations and barriers) they identified back in 1991 / / benefit evidence) and some have been more strongly addressed (e.g. lack of expressiveness, lack of appropriate tools). Not mentioned in Austin and Parkin (1993) is scalability, rated by our respondents to be the toughest obstacle.

F5 is in line with other observations in Bicarregui et al. (2009) and Woodcock et al. (2009) that the repeated use of a FM results in lower overheads (i.e., an experienced effort or cost reduction and improved error removal), up to an order of magnitude less than its first use (Miller et al., 2010). Finally, our study generalises the main findings about barriers in Davis et al. (2013) to several geographies and application domains, however, using an on-line questionnaire instead of interviews and not asking for barrier mitigations.
We assess our research design with regard to four common criteria (Shull et al., 2008; Wohlin et al., 2012). Per threat, we estimate its criticality (minor or major), describe it, and discuss its partial (◦) or full (✓) mitigation.

Why would the construct (Section 4.2) appropriately represent the phenomenon?

maj: Inappropriate questions and conceptual misalignment / To support face validity, we applied our own experience from FM use to develop a core set of questions. For the design of our questionnaire, we use feedback from colleagues, from respondents we personally know, and from the general feedback on the survey to improve and support content validity. A positive comparison with the questionnaire in Austin and Parkin (1993) finally confirms the appropriateness of our questions. However, we might have needed additional questions to check for conceptual alignment, for example, to more precisely determine whether the respondents’ understanding of FMs and of the use or application of FMs closely matches ours. However, from 18 respondents giving feedback on our questionnaire, only one commented on the definition and one on the classification of FMs. That suggests that many respondents did not have or were not aware of misunderstandings worth mentioning. ◦

min: Questionnaire limited for measurement of PEOU (e.g. per FM class) and PU / We avoid deriving conclusions specific to a FM or a corresponding tool from our data. ✓

min: Bias by omitted scale values (e.g. FM class, domain, purpose) / Respondents are encouraged to provide open answers to all questions, helping us to check scale completeness. Between 8% and 40% of the respondents made use of the text field “Other.” Our systematic map confirms that we have not listed unknown challenges in Q13. We identified three additional challenges via open answers and the literature. We believe to have achieved good criterion validity through questions and scales for distinguishing important sub-groups (see Section 4.6) of our population. ✓

min: Educational background asked indirectly / We approximate what we need to know by using data from Q1, Q3, Q4, and Q5. ✓

Why would the procedure in Section 4 lead to reasonable and justified results?

min: Incomplete data points / After the 47th response, feedback from colleagues and respondents resulted in an extension of Q3 with the option “on behalf of FM tool provider” (Figure 4) and of Q6 and Q11 with the option “consistency checking” (Figure 7). The enhancement of 169 complete data points to 216 maintained all trends. ✓

min: Duplicate & invalid answers / To identify intentional misconduct, we checked for timestamp anomalies and for duplicate or meaningless phrases in open answers. Voluntarily provided email addresses (90 / / have not. . . ” was combined with other checked options. ✓

min: Inter- vs. intra-UFM inference / Our study design is not suitable for “inter-UFM predictions”, for example, to predict that (dis)satisfied model checking practitioners have an increased (a decreased) intent to use theorem proving. However, the argumentation in the Sections 4.2 and 6.2 aims at “intra-UFM predictions”, that is, inferring an increased or decreased intent to use model checking from the quantity and quality of past experience in using model checking. Such predictions may inherit possible limitations of TAM studies. ◦

Why would the procedure in Section 4 lead to similar results with more general populations?

maj: Low response rate / We believe our estimates in Section 5.2 to be sensible. We tried to (i) improve targeting by repetitively advertising on multiple appropriate channels, (ii) spot unreliable contact information, (iii) provide incentive (study results via email), (iv) keep the questionnaire short and comprehensible, (v) avoid forced answers, and (vi) allow lack of topic knowledge. Some uncertainties remain, for example, lack of sympathy, personal motivation, and interest, or strong loyalty, and high expectations in the outcome, or intentional bias. However, from an estimated population of around 100K (i.e., the rounded sum of 38K and 61K), the minimum sample size for 95% confidence intervals with continuous scale error margins of less than 7% is 196, consistent with the ballpark figure in Gleirscher et al. (2019, p. 117:29). Our sample (N = “FMs are not practically useful” or “FMs are difficult to apply”. Because these statements address FMs as a whole, we believe such local errors do not affect our general conclusions. However, the response rate (1 to 2%) and population coverage (0.1%, cf. Section 5.1) were too low to avoid such errors and refute specific null-hypotheses, such as “FM m is effective for role r and purpose p in domain d” (by the FM community) or “FM m is difficult to apply for role r and purpose p in domain d” (by SE practitioners), with satisfactory statistical power. ✓

maj: Bias towards specific groups (Shull et al., 2008, p. 181) / We distributed our questionnaire over general SE channels.
We mix opportunity (only 5 to 10% chain referral), volunteer, and cluster-based sampling. Selection bias, a problem in snowball sampling (Biernacki and Waldorf, 1981), is limited by good visibility and accessibility of the target population in these channels (Section 5.2) as well as little use and control of referral chains among respondents. Our sample includes 50% practitioners according to Section 4.6, ≈21% NPs (incl. laypersons), and ≈31% pure academics. A bias towards FM experts (Figure 3) does not harm our PEOU discussion led by practitioners but shapes our PU discussion. Regarding application domains, our conclusions cannot be generalised to, e.g. critical IT systems in the finance and e-voting sectors. ◦

min: Non-response / We decided not to enforce responses or provide incentives. Still, our data suggests that our advertisement stimulated responses from FM-critical minds. ◦

min: Lack of FM knowledge / 11 to 18% of our respondents did not know specific challenges (Figure 16). For RQ1 (Figures 2 to 16), dnk-data points have no influence because the findings of RQ1 directly describe and interpret the status quo of UFM_p. For test purposes, we included dnk-data points in the analyses of RQ2 and RQ3 (Figures 11 to 16), with no relevant influence. ✓

min: Geographical background missing / Respondents were not required to own a Google account to avoid tracking and to increase anonymity and the response rate. The limited geographical knowledge about our sample constrains the generalisability of our conclusions, e.g. to geographies such as China, India, or Brazil. ◦

Why would a repetition of the procedure in Section 4 with different samples from the same population lead to the same results?

maj: Internal consistency / All 7 items for the concept “obstacle to FM effectiveness” (C7) show good internal consistency for our sample with a Cronbach α = .84; the PEOU-part of C7 consisting of 5 items shows an α = .79 (Shull et al., 2008). The other concepts are not measured with multiple items. ◦

maj: Change of proportions / The limited sample and the low response rate make it hard to mitigate this risk. However, we compared the first (til 4.8.2018, N = N = [...] difference does not show a significant difference between these two groups (e.g. for Q13 and Q4). Only for the Q3 item “On behalf of FM tool provider,” a p = .07 indicates a potential difference. The addition of that item only after the 47th respondent might explain this difference. ◦

Conclusions

We conducted an on-line survey of mission-critical software engineering practitioners and researchers to examine how formal methods have been used, how these professionals intend to use them, and how they perceive challenges in using them. This study aims to contribute to the body of knowledge of the software engineering and formal methods communities.

Overall Findings.

From the evidence we gathered for the use of formal methods, we make the following observations:
• Intrinsic motivation is stronger than the regulatory one.
• Despite the challenges, our respondents show an increased intent to use FMs in industrial contexts.
• Past experience is correlated with usage intent.
• All challenges were rated either moderately or highly difficult, with scalability, skills, and education leading. Experienced respondents rate challenges as highly difficult more often than less experienced respondents.
• From the literature and the responses, we identified three additional challenges: sufficient resources, process compatibility, good practicality / reputation.
• The negative responses to the questions about obstacles to FM effectiveness suggest that the ease of use of FMs is perceived as more negative than positive.
• Gaining experience and confidence in the application of a FM seems to play a role in developing a positive perception of usefulness of that FM.

Barroca and McDermid (1992) present evidence to show that FMs can be used in industry effectively and more widely. Their observation from 1992 is that FM use had been limited, benefits were clear but limitations were subtle. In response to Barroca and McDermid’s finding “FMs are both oversold and under-used”, our insights from the analysis of RQ 2 and 3 lead us to conclude that today FMs are probably more underused than oversold. However, our data also suggests that these methods still need substantial improvement and support in several areas in order for their benefits to be better utilised.

General Feedback on the Survey.

The questionnaire seems to be well received by the participants. One of them found it an “interesting set of questions.” This impression is confirmed by another participant: “Well chosen questions which do not leave me guessing. Relevant to future FM research and practice.”

Another respondent noted: “Thank you very much for this survey. It is very constructive and important. It handles most of the issues encountered by any practitioner and user of FMs.”

Only one participant found the questionnaire difficult for FM beginners.

Implications Towards a Research Agenda.

In the spirit of Jeffery et al. (2015) and complementing the suggestions from the SWOT analysis in Gleirscher et al. (2019), we want to make another step in setting out an agenda for future FM research. To address scalability, we need more research on how compositional methods (e.g. automated assume-guarantee reasoning, Cofer et al., 2012; automated assertion checking, Leino, 2017) can be better leveraged in practical settings. To address skills and education, we need an enhanced and up-to-date FM body of knowledge (FMBoK; Oliveira et al., 2018). From his survey of “FMs courses in European higher education”, Oliveira (2004) observes that (i) “model-oriented specification”, “formalising distribution, concurrency and mobility”, and “logical foundations of formal methods” were the topic areas most frequently taught by FM lecturers, and (ii) Z, B, SML, CSP, and Haskell were the most popular formal notations and languages taught in these courses. A comparison of the current state with Oliveira’s observations can help to evaluate and revise current FM curricula (e.g. for undergraduate SE as suggested in Davis et al., 2013) and to derive recommendations for improved FM courses fostering good modelling, composition, and refinement skills in SE practice. To address controllable abstractions, we need semantics workbenches for underpinning domain-specific languages with formal semantics. We believe that further steps in theory integration and unification (Gleirscher et al., 2019) can help establish proof hierarchies and, hence, reusability and proof transfer.

To address process compatibility, we need more research in continuous reasoning (e.g. Chudnov et al., 2018; O’Hearn, 2018), a revival of activities, possibly even regulations, in tool integration and model data interchange, and guidance on how to update engineering development processes.
To address reputation, we need to provide more incentives for practitioners to use FMs and take recent progress in FM research into account when changing current software processes, policies, regulations, and standards. This includes convincing practitioners to invest in the support of large-scale studies for monitoring FM use in industry. Cost-savings analyses of FM applications (e.g. Jeffery et al., 2015) supported by strong empirical designs (i.e., controlled field experiments) can help to collect the necessary evidence for decision making, successful knowledge transfer, and for implementing this vision.

This survey underpins and enhances the analysis of strengths and weaknesses of FMs in Gleirscher et al. (2019) and can be a guide (1) for consulting and managing practitioners when considering the introduction of FMs into an engineering organisation, (2) for research managers when shaping a grant programme for FM experimentation and transfer, and (3) for associate editors when organising a journal special section on applied FM research.

Future Work.

Our survey is another important step in the research of effectively applying FM-based technologies in practice. To put it in the words of one of our participants: “[A] closed questionnaire is just a start.”

Hence, we aim at a follow-up study (i) to find out which particular FM (and tool) is used in which domain for which particular purpose and role (e.g. was SMT solving used for model checking in certification or for task scheduling at run-time?), (ii) to measure where particular techniques work well (e.g. which types of formal contracts work well in control software requirements management in a DO-178C context?), (iii) to measure key indicators for successful use of FMs, (iv) to identify management techniques needed to accommodate the changes in working practices, and, finally, (v) to provide guidance to future projects wishing to adopt FMs.

In a next survey, we would like to ask about typical FM benefits and about suggestions for barrier mitigation (Davis et al., 2013), and to pose more specific questions on scalability and useful abstraction, on the geographical and educational background, and for conceptual alignment. Further analysis of obstacles, benefits, and usage intent could also benefit from a more fine-grained distinction between FMs directly applied to program code and FMs focusing on more abstract models. We would also like to change from 3-level to 5-level Likert-type scales to receive more fine-grained responses. Our research design accounts for repeatability, hence, allowing us to go for a longitudinal study.

The research design, and even our current data set, allows the derivation of the usage intent (UFM_i) for each FM class, application domain, and obstacle. These UFM_i values could be used to analyse whether a particular FM might be (1) underused (i.e., domains with an increased usage intent indicate a potential [...] effective).

According to https://en.wikipedia.org/wiki/United_Nations_geoscheme.

Acknowledgements.

It is our pleasure to thank all survey participants for their time spent and their valuable responses, and all channel moderators for forwarding our postings. We are much obliged to Jim Woodcock, who has led previous studies in our direction, and supported us to critically reflect our work and relate it to existing evidence. He connected us with John Fitzgerald, who made his paper copy of Austin and Parkin (1993) available to us such that we were able to complete our investigation. We are grateful to John Fitzgerald and also to John McDermid for helpful feedback and for encouraging us to do further research in this direction. We would like to express sincere gratitude to Krzysztof Brzezinski, Louis Brabant, and Emmanuel Eze for pointing us to several related works.

References

Aichernig, Bernhard K. and Tom Maibaum, eds. (Nov. 18, 2003). Formal Methods at the Crossroads. From Panacea to Foundational Support. Springer Berlin Heidelberg. isbn: 3-540-20527-6.
ATOMICO (2019). The State of European Tech 2019. Section 6.4. url: https://web.archive.org/web/20191220234928/http://2019.stateofeuropeantech.com/chapter/people/article/strong-talent-base/.
Austin, Stephen and Graeme Parkin (Mar. 1993). Formal methods: A survey. Tech. rep. Teddington, Middlesex, UK: National Physical Laboratory.
Barroca, Leonor M. and John A. McDermid (1992). “Formal methods: Use and relevance for the development of safety-critical systems”. In: Comp. J.
Basili, Victor R. (July 1, 1985). Quantitative evaluation of software methodology. Tech. rep. TR-1519. University of Maryland. url: https://drum.lib.umd.edu/bitstream/handle/1903/7520/Quantitative+Evaluation.pdf?sequence=1 (visited on 05 / /
[...] FM 2009: Formal Methods. Ed. by Ana Cavalcanti and Dennis R. Dams. Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 810–813. isbn: 978-3-642-05089-3.
Biernacki, Patrick and Dan Waldorf (1981). “Snowball Sampling: Problems and Techniques of Chain Referral Sampling”. In: Sociological Methods & Research.
Bjorner, D. (1987). “On the Use of Formal Methods in Software Development”. In: Proceedings of the 9th International Conference on Software Engineering. ICSE ’87. Monterey, California, USA: IEEE Computer Society Press, pp. 17–29. isbn: 0-89791-216-0. url: http://dl.acm.org/citation.cfm?id=41765.41768.
Bloomfield, RE, PKD Froome, and BQ Monahan (1991). “Formal methods in the production and assessment of safety critical software”. In: Reliability Engineering & System Safety
[...] Industrial Use of Formal Methods: Formal Verification. Wiley-ISTE. 298 pp. isbn: 9781848213630.
Bowen, J. P. and M. G. Hinchey (July 1995a). “Seven more myths of formal methods”. In: IEEE Software. issn: 0740-7459.
Bowen, J. P. and M. G. Hinchey (Apr. 1995b). “Ten commandments of formal methods”. In: Computer. issn: 0018-9162.
Bowen, Jonathan P. and Michael G. Hinchey (2005). “Ten Commandments Revisited: A Ten-year Perspective on the Industrial Application of Formal Methods”. In: Proceedings of the 10th International Workshop on Formal Methods for Industrial Critical Systems. FMICS ’05. Lisbon, Portugal: ACM, pp. 8–16. isbn: 1-59593-148-1.
Campbell, Michael J. and Martin J. Gardner (1988). “Calculating confidence intervals for some non-parametric analyses”. In: British Medical Journal
[...] Fiat Chrysler Is Being Sued Over a Software Flaw. IEEE. url: https://web.archive.org/web/20180629231601/https://spectrum.ieee.org/riskfactor/computing/software/court-allows-lawsuit-to-proceed-against-fiat-chrysler-over-software-flaw.
Chudnov, Andrey et al. (2018). “Continuous Formal Verification of Amazon s2n”. In: Computer Aided Verification. Springer International Publishing, pp. 430–446.
Cofer, Darren D. et al. (2012). “Compositional Verification of Architectural Models”. In: NASA Formal Methods - 4th International Symposium, NFM 2012, Norfolk, VA, USA, April 3-5, 2012. Proceedings, pp. 126–140.
Craigen, D., S. Gerhart, and T. Ralston (Feb. 1995). “Formal methods reality check: industrial usage”. In: IEEE Transactions on Software Engineering. issn: 0098-5589.
Craigen, Dan (1995). “Formal methods technology transfer: Impediments and innovation (abstract)”. In: CONCUR ’95: Concurrency Theory: 6th International Conference Philadelphia, PA, USA, August 21–24, 1995 Proceedings. Ed. by Insup Lee and Scott A. Smolka. Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 328–332. isbn: 978-3-540-44738-2.
Craigen, Dan, Susan Gerhart, and Ted Ralston (1993). “An International Survey of Industrial Applications of Formal Methods”. In: Z User Workshop, London 1992: Proceedings of the Seventh Annual Z User Meeting, London 14–15 December 1992. Ed. by J. P. Bowen and J. E. Nicholls. London: Springer London, pp. 1–5. isbn: 978-1-4471-3556-2.
Davis, Fred D. (Sept. 1989). “Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology”. In: MIS Quarterly
[...] Formal Methods for Industrial Critical Systems. Springer Berlin Heidelberg, pp. 63–77.
Decision Analyst (Aug. 2018). Technology Advisory Board. Decision Analyst, Inc.
Evans Data (2018). Global Developer Population and Demographic Study. Tech. rep. Volume 1. Evans Data Corporation. url: https://web.archive.org/web/20191015060004/https://evansdata.com/reports/viewRelease.php?reportID=9.
Fagan, M. E. (1976). “Design and Code Inspections to reduce Errors in Program Development”. In: IBM Systems Journal.
Ferrari, Alessio et al. (Jan. 2019). Survey on Formal Methods and Tools in Railways Technical Report on the activities performed within ASTRail, Deliverable D4.1.
Fraser, Martin D., Kuldeep Kumar, and Vijay K. Vaishnavi (1994). “Strategies for incorporating formal specifications in software development”. In: Communications of the ACM.
Galloway, Andy J., Trevor J. Cockram, and John A. McDermid (Sept. 1998). “Experiences with the Application of Discrete Formal Methods to the Development of Engine Control Software”. In: IFAC Proceedings Volumes.
Gerhart, Susan L. and Lawrence Yelowitz (1976). “Observations of Fallibility in Applications of Modern Programming Methodologies”. In: IEEE Trans. Software Eng.
Glass, Robert L. (Oct. 28, 2002). Facts and Fallacies of Software Engineering. Pearson Education (US). isbn: 978-0321117427.
Gleirscher, Mario and Diego Marmsoler (Nov. 14, 2018). Electronic Supplementary Material for “Formal Methods: Oversold? Underused? A Survey”. Zenodo.
Gleirscher, Mario and Anne Nyokabi (2018). System Safety Practice: An Interrogation of Practitioners about Their Activities, Challenges, and Views with a Focus on the European Region. Tech. rep. York, UK: Department of Computer Science, University of York, UK.
Gleirscher, Mario, Simon Foster, and Jim Woodcock (2019). “New Opportunities for Integrated Formal Methods”. In: ACM Computing Surveys 52 (6), 117:1–117:36. issn: 0360-0300.
Gnesi, Stefania and Tiziana Margaria (2013). Formal Methods for Industrial Critical Systems: A Survey of Applications. Wiley-IEEE Press. isbn: 9781118459898.
Google (Aug. 2018).

Google Forms Service . Google, Inc. url : http://forms.google.com .Graydon, P.J. (June 2015). “Formal Assurance Arguments: A Solution in Search of a Problem?” In: DependableSystems and Networks (DSN), 2015 45th Annual IEEE / IFIP International Conference on , pp. 517–528. doi : .Hall, Anthony (1990). “Seven Myths of Formal Methods”. In: IEEE Software doi : .Heisel, Maritta (Jan. 1, 1996). “A Pragmatic Approach to Formal Speciﬁcation”. In: Object-Oriented BehavioralSpeciﬁcations . Springer. isbn : 978-0-7923-9778-6. doi : .Heitmeyer, Constance L. (1998). “On the Need for ’Practical’ Formal Methods”. In: Proceedings of the 5th Inter-national Symposium on Formal Techniques in Real-Time Fault Tolerant Systems (FTRTFT) . Vol. LICS 1486.Lyngby, DenmarkLyngby, Denmark, pp. 18–26.Hinchey, Michael G. and Jonathan P. Bowen (Apr. 1996). “To formalize or not to formalize?” In:

IEEE Computer / IEEE DigitalAvionics Systems Conference. Reﬂections to the Future. Proceedings . Vol. 1, pp. 16–22. doi : . reprint – F ormal M ethods U se Holloway, C. M. and R. W. Butler (1996). “Impediments to industrial use of formal methods”. In:

Computer doi : .Jackson, Michael (1987). “Power and Limitations of Formal Methods for Software Fabrication”. In: Journal of In-formation Technology doi : .Je ﬀ ery, Ross et al. (Apr. 2015). “An empirical research agenda for understanding formal methods productivity”. In: Information and Software Technology

60, pp. 102–112. doi : .Kaner, Cem and David Pels (Aug. 1998). Bad Software . Wiley. isbn : 978-0471318262.— (Aug. 2018).

Bad Software: Website . url : https : / / web . archive . org / web / 20191210042547 / http ://badsoftware.com/ .Kitchenham, B., S. Linkman, and D. Law (1997). “DESMET: a methodology for evaluating software engineeringmethods and tools”. In: Computing & Control Engineering Journal doi :

10 . 1049 / cce :19970304 .Kitchenham, Barbara A. and Shari L. Pﬂeeger (2008). “Guide to Advanced Empirical Software Engineering”. In:Springer. Chap. Personal Opinion Surveys, pp. 63–92.Klein, Gerwin et al. (Sept. 2018). “Formally veriﬁed software in the real world”. In:

Communications of the ACM doi : .Knight, John C. et al. (1997). “Why Are Formal Methods Not Used More Widely?” In: Fourth NASA Formal MethodsWorkshop , pp. 1–12.Lai, R. (Jan. 1, 1996). “How could research on testing of communicating systems become more industrially relevant?”In: Springer, pp. 3–13. doi : .Lai, Richard and Wilfred Leung (1995). “Industrial and Academic Protocol Testing: The Gap and the Means ofConvergence”. In: Computer Networks and ISDN Systems doi : .Leiner, D. J. (2014). SoSci Survey . Tech. rep. url : .Leino, K. Rustan M. (2017). “Accessible Software Veriﬁcation with Dafny”. In: IEEE Software doi : .Liebel, Grischa et al. (2016). “Model-based engineering in the embedded systems domain: an industrial survey onthe state-of-practice”. In: Software & Systems Modeling doi : .Mathieson, Kieran (1991). “Predicting User Intentions: Comparing the Technology Acceptance Model with the The-ory of Planned Behavior”. In: Information Systems Research doi : .Miller, Steven P., Michael W. Whalen, and Darren D. Cofer (Feb. 2010). “Software model checking takes o ﬀ ”. In: Communications of the ACM doi : .Miyoshi, T. and M. Azuma (1993). “An empirical study of evaluating software development environment quality”.In: IEEE Transactions on Software Engineering doi : .Mohagheghi, Parastoo et al. (Jan. 2012). “An empirical study of the state of the practice and acceptance of model-driven engineering in four industrial cases”. In: Empirical Software Engineering doi :

10 .1007/s10664-012-9196-x .Murphy, G. C., R. J. Walker, and E. L. A. Banlassad (1999). “Evaluating emerging software development technolo-gies: lessons learned from assessing aspect-oriented programming”. In:

IEEE Transactions on Software Engi-neering doi : .Neuendorf, Kimberly A. (Aug. 2016). The Content Analysis Guidebook . 2nd. Sage. isbn : 9781412979474.Neumann, Peter G. (May 2018). “Risks to the Public”. In:

ACM SIGSOFT Software Engineering Notes doi : .O’Hearn, Peter W. (2018). “Continuous Reasoning”. In: Proceedings of the 33rd Annual ACM / IEEE Symposium onLogic in Computer Science - LICS’18 . ACM Press. doi : .Oliveira, Jos´e Nuno (2004). “A Survey of Formal Methods Courses in European Higher Education”. In: TeachingFormal Methods . Springer Berlin Heidelberg, pp. 235–248. doi : .Oliveira, Jos´e Nuno et al. (Aug. 2018). Formal Methods Body of Knowledge (FMBoK) . url : https : / / web .archive.org/web/20200109111534/https://formalmethods.wikia.org/wiki/FMBoK .Parnas, David Lorge (2010). “Really Rethinking ’Formal Methods’”. In: IEEE Computer doi : .Petersen, Kai et al. (2008). “Systematic Mapping Studies in Software Engineering”. In: . doi : .Pﬂeeger, S. L. and L. Hatton (1997). “Investigating the inﬂuence of formal methods”. In: Computer doi : .Poston, R. M. and M. P. Sexton (1992). “Evaluating and selecting testing tools”. In: IEEE Software doi : . reprint – F ormal M ethods U se Riemenschneider, C. K., B. C. Hardgrave, and F. D. Davis (2002). “Explaining software developer acceptance ofmethodologies: a comparison of ﬁve theoretical models”. In:

IEEE Transactions on Software Engineering doi : .Robbins, Naomi B. and Richard M. Heiberger (2011). “Plotting Likert and Other Rating Scales”. In: Joint StatisticalMeeting , pp. 1058–66.Rushby, John (1994). “Critical system properties: Survey and taxonomy”. In:

Reliability Engineering & System Safety doi : .SEI (2010). CMMI for Development . Tech. rep. CMU / SEI-2010-TR-033. CMU.Shull, Forrest, Janice Singer, and Dag I. K. Sjøberg, eds. (Oct. 2008).

Guide to Advanced Empirical Software Engi-neering . London: Springer.Snook, C and R Harrison (2001). “Practitioners’ views on the use of formal methods: an industrial survey by struc-tured interview”. In:

Information and Software Technology issn : 0950-5849. doi : .Sobel, A.E.K. and M.R. Clarkson (Mar. 2002). “Formal methods application: an empirical tale of software develop-ment”. In: IEEE Transactions on Software Engineering doi : .The R Project (Aug. 2018). R . The R Project. url : .Wikipedia contributors (2018). Software engineering demographics — Wikipedia, The Free Encyclopedia . url : https://en.wikipedia.org/w/index.php?title=Software_engineering_demographics\&oldid=823840899 (visited on 01 / / Computer doi : .Wohlin, Claes et al. (June 2012). Experimentation in Software Engineering . Springer. isbn : 9783642290435.Woodcock, Jim et al. (Oct. 2009). “Formal Methods: Practice and Experience”. In:

ACM Comput. Surv. issn : 0360-0300. doi : . reprint – F ormal M ethods U se A S upplementary M aterial for “F ormal M ethods in D ependable S oftware E ngineering : A S urvey ” In the following, we provide additional material to the survey, including1. a more detailed analysis of responses to certain questions (Appendix A.1),2. further visualizations of the collected data (Appendices A.2 to A.6),3. more details on our analysis of related work (Table 9 in Appendix A.7),4. more details on the mapping from studies to challenges (Appendix A.8),5. a comprehensive table of open answers (Appendix A.9),6. a copy of the advertisement ﬂyer (Appendix A.10),7. a screenshot of the Twitter poll (Appendix A.11), and8. a copy of the whole questionnaire (Appendix A.12).

A.1 Data for Analysis of RQ1 and Estimation of External Validity

Based on the responses for question Q1, Table 8 provides an overview of the categories of respondents referred to in our analysis (particularly, in Section 5.2 and Figure 10), along with the corresponding counts based on the sample from 31.3.2019 with N = 216.

Category of respondents to ... | Description | Count | Fraction

... Question Q4:
Respondents with academic educational background (AEB) | Researchers in academia; bachelor, master, or PhD students; lecturers, teachers, trainers, coaches | 156 | 72%
Academics with pure transfer experience | AEB intersected with researchers in industry, consulting and managing practitioners, and external consultants; without engineering practitioners in industry and without tool provider stakeholders | 35 | 16%
Academics with practical experience | AEB intersected with engineering practitioners in industry | 41 | 21%
Academics with experience in transfer and practice | AEB intersected with researchers in industry, consulting and managing practitioners, and external consultants; intersected with engineering practitioners in industry and without tool provider stakeholders | 31 | 14%
Practitioners incl. transfer practitioners and industrial consultants, all with academic background | AEB intersected with researchers in industry, consulting and managing practitioners, external consultants, and engineering practitioners in industry | 86 | 40%
Pure academics | AEB without respondents specifying additional roles | 66 | 31%
Respondents not specifying an educational background (NEB) | The complement of AEB | 60 | 27%
Respondents not specifying an educational background and being researchers in industry | NEB intersected with researchers in industry | 13 | 6%
Consultants | NEB intersected with consulting or managing practitioners and external consultants | 23 | 11%
Pure practitioners | NEB intersected with engineering practitioners in industry | 23 | 11%
Tool provider stakeholders not specifying an educational background | NEB intersected with stakeholders of an FM tool or service provider | 5 | 2.3%
Non-academic FM non-users | NEB intersected with “I have not used FMs in any specific role.” | 19 | 9%
Practitioners incl. industrial consultants | Consulting and managing practitioners, external consultants, and engineering practitioners in industry | 108 | 50%
FM users (all) | Respondents having used FMs in one or another way and context in the past | 212 | 98%
FM users (beyond students) | Excl. “only-in-course” respondents | 202 | 93.5%

... Question Q1:
FM non-users | Respondents who chose “I have not used FMs in any academic or industrial domain.” | 36 | 17%

... Question Q3:
Respondents with no motivation | Respondents who selected “no” for all motivating factors | 9 | 4%

... Questions Q5 and Q6:
Non-practitioners including FM non-users | Respondents who chose “no experience or no knowledge”, “studied in (university) course” or “applied in lab, experiments, case studies” for all FM techniques. This group includes laypersons. | 46 | 21%

Sample (N) | All valid responses | 216 | 100%

[Figure 19, panel 1: “Estimated geographical reachability of population via survey channels”; categories: DE, DE/IT, DE/US, EU, EU/US, EU/US/AUS, FR, UK]

[Figure 19, panel 2: “Geographical distribution of respondents by email address (TLD, company HQ, if provided)”; categories: AR, AT, BE, DE, DK, ES, FI, FR, IE, IT, JP, NG, NL, SE, UK, US, ZA]

Figure 19: Geographical analysis of the sample. Legend: top-level domain (TLD), headquarters (HQ)
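The respondent categories in Table 8 are defined through set operations (intersection, difference, complement) on the roles that respondents selected. The following is a minimal sketch of how such categories could be derived from multiple-choice answers. The role labels, the five example responses, and the helper names `respondents_with` and `summarise` are illustrative assumptions and are not taken from the survey data or its analysis scripts.

```python
# Hypothetical role labels, loosely following the wording of Table 8.
ACADEMIC_ROLES = {"researcher in academia", "student", "lecturer"}
PRACTICE_ROLE = "engineering practitioner in industry"

# Made-up responses: respondent id -> set of selected roles.
responses = {
    1: {"researcher in academia"},
    2: {"student", PRACTICE_ROLE},
    3: {PRACTICE_ROLE},
    4: {"lecturer", "external consultant"},
    5: {"consultant or manager"},
}


def respondents_with(roles):
    """Ids of respondents who selected at least one of the given roles."""
    wanted = set(roles)
    return {rid for rid, chosen in responses.items() if chosen & wanted}


everyone = set(responses)

# AEB: respondents with an academic educational background.
aeb = respondents_with(ACADEMIC_ROLES)
# NEB: the complement of AEB within the sample.
neb = everyone - aeb
# "AEB intersected with engineering practitioners in industry".
academics_with_practice = aeb & respondents_with({PRACTICE_ROLE})
# "NEB intersected with engineering practitioners in industry".
pure_practitioners = neb & respondents_with({PRACTICE_ROLE})
# "AEB without respondents specifying additional roles".
pure_academics = {rid for rid in aeb if responses[rid] <= ACADEMIC_ROLES}


def summarise(group):
    """Count and rounded percentage of the sample, as in the table columns."""
    return len(group), round(100 * len(group) / len(everyone))
```

Because respondents may select several roles, categories built this way can overlap, which is why the counts in such a table need not add up to the sample size.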

A.2 Geographical Analysis of the Sample

Figure 19 shows geographical aspects of the sample for this study.

A.3 Usage Intent (UFM_i) by Purpose (for Analysis of RQ2)

The comparison in Figure 20 (and in the figures of Appendices A.4 and A.5) contains two columns. The left column describes for each purpose (e.g. specification) how often (e.g. in 2 to 5 separate tasks) respondents have used FMs in the past (UFM_p). The right column describes for each purpose the usage intent (UFM_i) depending on how often respondents have used FMs in the past (UFM_p). The horizontal bars representing the UFM_p frequency categories are listed in descending order by the overall size of both UFM_i groups “more often” and “dnk”. We chose to keep dnk-answers visible despite the readability inconvenience caused by the dnks influencing the ordering. However, in the majority of cases the largest group of respondents intending to increase FM use in the future is visible first or near the top.

A.4 Code-based vs. Model-based FMs for Assurance vs. Inspection

The data for this comparison is summarised in Figure 21.

[Figure 20: panels for Clarification, Specification, Inspection, Synthesis, and Assurance (N=216 each); left column: past usage, right column: future usage intent; legend: no more, less, equally, more often, dnk]

Figure 20: Comparison of past and future usage intent by purpose

[Figure 21, top half: code-based FMs, panels for Clarification, Specification, Inspection, Synthesis, and Assurance (N=128 each); bottom half: model-based FMs, same panels (N=114 each); left column: past usage, right column: future usage intent; legend: no more, less, equally, more often, dnk]

Figure 21: Comparison of past and future usage for code-based (top half) and model-based FMs (bottom half) by purpose

A.5 Usage Intent (UFM_i) by FM Class (for Analysis of RQ2)

[Figure: panels for Predicative, relational, or algebraic specification; Modal and temporal logic specification; Process models; Dynamical systems; Abstract interpretation; Assertion checking; Process calculi; Model checking, SMV; Constraint solving; Generic theorem proving; Computational engineering, simulation; Symbolic execution (N=216 each); and Consistency checking (N=169); left column: past usage, right column: future usage intent; legend: no more, less, equally, more often, dnk]

A.6 Data for the Analysis of RQ3

Figure 22 and the following figures in this section show pairs of matrices, so-called “heatmaps”, useful for association analysis between categorical and ordinal variables. The cells in the matrices represent combinations of the scales, each cell containing data about the mode and median of “degree of difficulty” ratings, their proportion of tough ratings, and the actual numbers of data points. Both the colour gradient (red to white) and the solid vertical lines in the cells represent the tough proportions (left = …).
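The per-cell content described above (mode, median, proportion of tough ratings, number of data points) can be sketched as follows. This is an illustration only: the three-level scale in `SCALE`, the example ratings, and the function name `cell_statistics` are assumptions, and the questionnaire's actual answer options and the authors' analysis code may differ.

```python
from collections import defaultdict
from statistics import median_low, mode

# Assumed ordinal scale; the questionnaire's actual answer options may differ.
SCALE = ["no challenge", "moderate", "tough"]
RANK = {label: i for i, label in enumerate(SCALE)}

# Made-up data points: (purpose, challenge, difficulty rating).
ratings = [
    ("Assurance", "Scalability", "tough"),
    ("Assurance", "Scalability", "tough"),
    ("Assurance", "Scalability", "moderate"),
    ("Specification", "Skills and education", "moderate"),
    ("Specification", "Skills and education", "no challenge"),
    ("Specification", "Skills and education", "moderate"),
    ("Specification", "Skills and education", "tough"),
]


def cell_statistics(rows):
    """Group ratings by (purpose, challenge) and summarise each cell."""
    cells = defaultdict(list)
    for purpose, challenge, rating in rows:
        cells[(purpose, challenge)].append(RANK[rating])
    stats = {}
    for key, ranks in cells.items():
        stats[key] = {
            "n": len(ranks),                    # number of data points
            "mode": SCALE[mode(ranks)],         # most frequent rating
            "median": SCALE[median_low(ranks)], # ordinal median (lower middle)
            "tough_share": ranks.count(RANK["tough"]) / len(ranks),
        }
    return stats


stats = cell_statistics(ratings)
```

`median_low` is used instead of `median` so that the result is always an actual scale level, which suits ordinal data; the `tough_share` value is the quantity that a colour gradient in a heatmap cell would encode.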

[Figures 22 to 28: heatmap panels, each with a “Key/Histogram of Toughs” and a “Cells” legend. Columns (challenges) in every panel: Scalability; Proper abstractions from irrelevant details; Maintainability of verification results; Reusability of verification results; Transfer of verification results from models to reality; Automation or tool support; Skills and education.]

[Figure 22 panels: “Comparison of Challenge Difficulty across Purposes (users not practicing FMs, past)” and “(users practicing FMs, past)”; rows: Assurance, Synthesis, Inspection, Specification, Clarification]

Figure 22: Comparison of challenge difficulty across purposes (UFM_p)

[Figure 23 panels: “Comparison of Challenge Difficulty across Purposes (respondents with no or decreased intent, future)” and “(respondents with same or increased intent, future)”; rows: purposes as in Figure 22]

Figure 23: Comparison of challenge difficulty across purposes (UFM_i)

[Figure 24 panels: “Comparison of Challenge Difficulty across Motivations (users without motivation)” and “(users with at least moderate motivation)”; rows: motivating factors]

Figure 24: Comparison of challenge difficulty across motivations

[Figure 25 panel: “Comparison of Challenge Difficulty across FMs (users practicing FMs, past)”; rows: FM classes]

Figure 25: Comparison of challenge difficulty across FM classes (UFM_p)

[Figure 26 panels: “Comparison of Challenge Difficulty across FMs (users with no or decreased intent, future)” and “(users with same or increased intent, future)”; rows: FM classes]

Figure 26: Comparison of challenge difficulty across FM classes (UFM_i)

[Figure 27 panels: “Comparison of Challenge Difficulty across Roles (Past)” and “(Future)”; rows: roles]

Figure 27: Comparison of challenge difficulty across roles

[Figure 28 panels: “Comparison of Challenge Difficulty across Domains (Past)” and “(Future)”; rows: application domains]

Figure 28: Comparison of challenge difficulty across domains

[Figure 29 panels: “Comparison of Challenge Difficulty as rated by Expert Users (> 3 years of experience)”, vertical axis: % of expert users rating challenge as 'tough'; “Comparison of Challenge Difficulty as rated by non-Expert Users (<= 3 years of experience)”, vertical axis: % of non-experts rating challenge as 'tough'; horizontal axis in both: the seven challenges]

Figure 29: Comparison of expert and non-expert users by their perception of challenge difficulty

A.7 Details on the Systematic Map

Table 9 contains the data we collected from the literature for the systematic map. r e p r i n t – F o r m a l M e t h o d s U s e Table 9: Details for the classiﬁcation of related work

Reference Motivation Approach Result Relation List of Obstacles

Gerhart andYelowitz(1976) Pinpoint fallibilityin the use of FMs Evaluation of methodologies,error classiﬁcation andanalysis, using severalexamples of fundamentalalgorithms Identiﬁcation of three classesof errors: speciﬁcation errors,systematic constructionerrors, proved program errors.Discussion of potential causesof errors of each class. Observations containobstacles,recommendations theiralleviation Formality gap, appropriateabstraction and structuring,lack of skills and educationJackson (1987) Generic evaluation Expert opinion FMs have inherent limitations Aim of method transfer,design of applicable FMs Inherent informality offormalisation, di ﬃ cult tocommunicate to customers,lack ofexpressiveness / freedom ofexpression, lack ofmethodologyBjorner (1987) Proposes a softwareDevelopmentmethod based onFM Personal opinion, experience Identify 3 main challenges:education, hiring, and toolsupport. Challenges could beconsidered obstacles Lack of skills and education,lack of tools,changeability / compatibilitywith existing process (methodculture)Hall (1990) Present and test FMmyths. They evaluate FMs on onelarger case study ( 50.000lines of Objective C Code)where they use Z (550 Zschemas to deﬁne 280operations) to develop aCASE tool. Formal methods are powerfultools which must be betterunderstood by developers atlarge. Rejection of commonHypothesis about FMs.Evaluation of a singleFM by means of a casestudy. Myth 1: improper abstraction;transfer of v.results; (myth2-4: skills and directededucation;) myth 5:time-budget restrictions; myth6: improper abstraction, toolsupport (usability); myth 7:scalabilityWing (1990) FM adoption Literature review, summary,and analysis Overview / taxonomy of FMs;analysis of limitations Justiﬁes the FMclassiﬁcation used in ourquestionnaire Proper abstraction (formalitygap, neglected environmentalassumptions)Bloomﬁeldet al. 
(1991) Introduction toformal methods Reference to technologytransfer, existing case studies,and tool support At the present time, formalmethods are good for thedescription of sequentialproperties of systems, and forcommunication protocols,although they do not yetaddress temporal propertiesand concurrency particularlywell. Investigation of state ofthe art, no comparisonwith future. Handling of incomplete specs,lack of veriﬁcation tools,costly training, changingmanagement styleContinued on next page r e p r i n t – F o r m a l M e t h o d s U s e Reference Motivation Approach Result Relation List of Obstacles

Austin andParkin (1993) Lack of acceptancein industry Literature survey andquestionnaire Identify obstacles and suggestto improve education andstandardisation and toperform case studies anddeﬁne metrics. Our questionnaire hasless focus onrepresentation andmethodology andexcludes questions onbeneﬁts, suggestions.Their sample mainlycovers Z / VDM users inthe UK. Our analysis ofpast use is moreelaborate. Math, tools, lack ofcost / beneﬁt evidence, changeresistanceCraigen et al.(1993) Determine state ofthe art about the useof FM in practice Analysis of 12 case studies FMs are beginning to be usedseriously and successfully byindustry Study of FMs inindustrial practice Stated as recommendations:scalability, lack of toolsupport, lack ofskills / education, transfer ofverif. obl. / results from / tocode; resource constraintsFraser et al.(1994) Lack of FMadoption,improvement of RE Discuss beneﬁts and problemsof FM adoption, literaturestudy Present a two-dimensionalframework for assessingstrategies for incorporatingformal speciﬁcations insoftware development Add suggestions on FMintroduction Lack of method and toolsupport, lack ofskills / training, not suitable forrequirements prototyping,lack of cost / beneﬁt evidenceBowen andHinchey(1995a) Re-examine Hall’smyths, introduce 7new Myths. Argumentation andmentioning of case studies(although no reference to thestudies are given). More real links betweenindustry and academia arerequired, and the successfuluse of formal methods mustbe better publicized. Rejection of commonHypothesis about FMs.Evaluation withreference to case studies. Time-budget-restrictions, lackof tool support, lack ofintegration in current process,scalability (multi-tech)Bowen andHinchey(1995b) Identify maximsthat may help in theapplication offormal methods inan industrial setting. 
Based on observations (byourselves and others) on anumber of recently completedand in-progress projects 10 Hypotheses on how toimprove the success of FMusage Investigation of FMusage; Argumentationand Examples. Stated as commandments:lack of tool support(documentation guidelines),proper abstraction, budgetrestrictions (bad cost-beneﬁtratio), lack of experts (skillsand education); compatibilitywith current process (lack ofquality culture); lack ofreuseabilityContinued on next page r e p r i n t – F o r m a l M e t h o d s U s e Reference Motivation Approach Result Relation List of Obstacles

Lai and Leung (1995)
Motivation: Investigate reasons why academic methods are not adopted by industry
Approach: Personal experience
Result: Provide 5 reasons: practicality, too academic, education, resistance to change, difficulty to re-invest
Relation: The reasons can be considered as obstacles
Obstacles: Practicality (being too academic): scalability, skills and education, resistance to change (compat with existing process); budget restrictions

Heisel (1996)
Motivation: In practice FMs are not widely used
Approach: Personal opinion / observation
Result: Proposes a pragmatic approach to FM
Relation: Method tries to overcome obstacles
Obstacles: Lack of skills/education, lack of experts, improper abstraction (low usability, low correctness); budget restrictions; compatibility with existing process

Hinchey and Bowen (1996)
Motivation: Identify reasons for industry’s reluctance to take formal methods to heart.
Approach: Experience in editing a collection of essays on the industrial application of FM.
Result: Misconception of myths, standards, tools, and education are obstacles to use FM in industry.
Relation: They identify obstacles. However, they do not provide evidence against or in favor of them.
Obstacles: Wrong skills and education, lack of clarification (bad reputation/misconception); lack of standards/regulation; lack of tools

Holloway and Butler (1996)
Motivation: FM adoption
Approach: Position statement, experience report, expert opinion
Result: List of impediments to FM adoption
Relation: Do not measure usage intent but highlight lack of transfer efforts
Obstacles: Inadequate tools and examples, inadequate transfer

Lai (1996)
Motivation: Academic methods (FM) not used in communication industry
Approach: Personal opinions; criticize research transfer and suggest improvements that might also be helpful for FM transfer to practice
Result: 8 reasons why academia needs to do industry research, 7 catalysts, 12 industry relevant factors
Relation: Reasons can be considered obstacles
Obstacles: Lack of empirical evidence, lack of skills/education, scalability, proper abstractions; compatibility with existing process, lack of tool support

Pfleeger and Hatton (1997)
Motivation: Investigate effectiveness of FMs
Approach: Case study and effect analysis: comparison of change requests for FM-based and non-FM-based code fragments as a result of postdelivery problems caused by these fragments.
Result: FM have a positive effect on code quality
Relation: Empirical evidence for FM effectiveness shown for FMs in design phase but not in more general, however only one system
Obstacles: Cost-benefit (fault removal effectiveness)

Knight et al. (1997)
Motivation: FMs are not accepted by industry
Approach: Elementary field experiment
Result: Evaluation of Z, PVS, and state-charts
Relation: Evaluation of FMs in practice
Obstacles: Integration in existing process (tools, methods, environments, people); proper and useful abstractions (incl. usability/comprehensibility); tool support (group development, collaborative eng.); evolution and spec/proof maintainability; budget constraints

Galloway et al. (1998)
Motivation: Lack of FM adoption, improvement of requirements
Approach: Single case study in the lab
Result: Applied discrete FMs (e.g. Z, PVS, and state-charts) for specifying and aggregating requirements of aircraft engine control systems
Relation: Interesting discussion of FM weaknesses in a very relevant application context
Obstacles: Inadequate abstraction (difficult to integrate discrete and continuous abstractions); lack of training; lack of interest from engineers; FMs are perceived as too expensive to apply

Heitmeyer (1998)
Motivation: FM tools are useful but not used in industry since people are not skilled enough
Approach: Reference to a few case studies
Result: Provide a set of guidelines how to make FM more usable
Relation: Usability as obstacle
Obstacles: Tool support, proper abstraction, process changeability/compatibility

Snook and Harrison (2001)
Motivation: Lack of empirical investigation on the use of FM
Approach: 5 structured interviews
Result: Improved quality of software with little or no additional lifecycle costs
Relation: Empirical investigation of the benefits of FM in industry
Obstacles: Lack of skills/education, improper abstractions (understandability); transfer of verif. results (models to reality)

Bowen and Hinchey (2005)
Motivation: Re-examination of their 1995 commandments 10 years later
Approach: Personal experience, reference to literature
Result: Empirically validate commandments with little conclusion
Relation: They investigate the use of FM in industry
Obstacles: Tool support still an issue despite some case studies

Bicarregui et al. (2009)
Motivation: No recent study on the industrial use of FMs.
Approach: Structured questionnaire on 62 industrial projects
Result: 4 challenges were identified
Relation: Empirical study identifying obstacles
Obstacles: Lack of tool support, lack of empirical evidence; lack of experience (skills/education), budget restrictions

Woodcock et al. (2009)
Motivation: State of the art of industrial use of FM (extension of Bicarregui et al., 2009)
Approach: Structured questionnaire on 62 industrial projects (claimed to be most comprehensive review ever made of formal methods application in industry)
Result: Identify several challenges
Relation: Empirical study identifying obstacles
Obstacles: Budget/resource constraints (high entry costs, cost-benefit), lack of tool support (automation)

Miller et al. (2010)
Motivation: FMs are not widely used in industry
Approach: 3 case studies about the use of model checking in industry
Result: Model checking can be effectively used to find errors early in the development process for many classes of models
Relation: They investigate the applicability of one FM in industry
Obstacles: Lessons from case studies: scalability and proper abstraction (usability)

Parnas (2010)
Motivation: FMs are not widely used in industry
Approach: Argumentation / personal experience
Result: Provides reasons why FMs are not used and suggestions for improvement
Relation: Provides obstacles
Obstacles: Proper abstraction, lack of empirical evidence, lack of tool support; maintainability/transfer (refinement, step-by-step)

Mohagheghi et al. (2012)
Motivation: Adoption of MbE in SE
Approach: Tool evaluations based on TAM (Riemenschneider et al., 2002), interviews, survey
Result: Evaluation of PEOU, PU, current, and future use
Relation: Also use IVs current and future use; little focus on FM; but MDE is sometimes based on FM, MDE adoption can improve FM adoption
Obstacles: Lack of training, lack of maturity, broken tool chains, high cost of adoption (tool integration)

Davis et al. (2013)
Motivation: Identify barriers to FM adoption and suggestions for barrier mitigation
Approach: Interviews with 31 practitioners from the US aerospace domain
Result: Top barriers: education, tools, work environment; top mitigations: education, tool integration, evidence of FM benefits; occasional non-barriers: evidence on savings, FM complexity, training/skills
Relation: Similar research questions, open-ended interview questions; restricted to one domain and geography
Obstacles: Education, tools, environment, engineering, certification, misconceptions, scalability, evidence of benefits, cost

Liebel et al. (2016)
Motivation: Adoption of MbE (incl. FM) in embedded SE
Approach: Online survey on needs, positive/negative effects, and shortcomings of MDE adoption
Result: SotA and challenge assessment: FMs not used widely; their data suggests a need of FM adoption; 30% of the responses from industry declare the need for FMs as a reason to adopt MDE; median of responses suggests that MbE adoption has a positive effect on FM adoption
Relation: Few FM users as participants
Obstacles: Lack of tool support, bad reputation, rigid development processes

Ferrari et al. (2019)
Motivation: Lack of FM adoption in railway domain
Approach: Review of FM literature, FM projects, and FM tools according to DESMET (Kitchenham et al., 1997); survey among practitioners
Result: UML dominates as the MbE language, many FMs and FM-based tools are used, B dominates as the FM; tool ranking/selection matrix
Relation: Analyse maturity of FMs from literature review; evaluate relevance/quality/maturity of FM and FM tool features from subjective assessment of survey respondents
Obstacles: Difficulty to learn, lack of tool qualifications, lack of expressiveness

Klein et al. (2018)
Motivation: FM adoption
Approach: Large case study, measurement of proof effort
Result: FMs can scale to real systems, mixed assurance levels are possible
Relation: Evidence for scalability contradicting the belief of our responses
Obstacles: Incompleteness of theorems for abstracting from all hardware features

A.8 Mapping of Studies to Challenges for RQ3

In addition to Table 6 in Section 5.5, Table 10 provides the complete lists of surveyed studies mapped to the corresponding challenges.

Table 10: Mapping of studies to challenge names (with the number of studies in parentheses)

Challenge Name: Supported by

Scalability (7): Bowen and Hinchey (1995a), Craigen et al. (1995, 1993), Hall (1990), Lai (1996), Lai and Leung (1995), and Miller et al. (2010)

Skills & Education (13): Barroca and McDermid (1992), Bicarregui et al. (2009), Bjorner (1987), Bowen and Hinchey (1995b), Craigen et al. (1995, 1993), Galloway et al. (1998), Hall (1990), Heisel (1996), Hinchey and Bowen (1996), Lai (1996), Lai and Leung (1995), and Snook and Harrison (2001)

Transfer of Proofs (8): Barroca and McDermid (1992), Bloomfield et al. (1991), Craigen et al. (1995, 1993), Hall (1990), Jackson (1987), Parnas (2010), and Snook and Harrison (2001)

Reusability (2): Barroca and McDermid (1992) and Bowen and Hinchey (1995b)

Abstraction (12): Barroca and McDermid (1992), Bowen and Hinchey (1995b), Galloway et al. (1998), Hall (1990), Heisel (1996), Heitmeyer (1998), Jackson (1987), Knight et al. (1997), Lai (1996), Miller et al. (2010), Parnas (2010), and Snook and Harrison (2001)

Tools & Automation (16): Bicarregui et al. (2009), Bjorner (1987), Bloomfield et al. (1991), Bowen and Hinchey (1995a,b), Bowen and Hinchey (2005), Craigen et al. (1995, 1993), Hall (1990), Heitmeyer (1998), Hinchey and Bowen (1996), Knight et al. (1997), Lai (1996), O’Hearn (2018), Parnas (2010), and Woodcock et al. (2009)

Maintainability (3): Barroca and McDermid (1992), Knight et al. (1997), and Parnas (2010)

Resources (11): Bicarregui et al. (2009), Bloomfield et al. (1991), Bowen and Hinchey (1995a,b), Craigen et al. (1995, 1993), Hall (1990), Heisel (1996), Knight et al. (1997), Lai and Leung (1995), and Woodcock et al. (2009)

Process Compatibility (12): Bjorner (1987), Bloomfield et al. (1991), Bowen and Hinchey (1995a,b), Craigen et al. (1995), Heisel (1996), Heitmeyer (1998), Hinchey and Bowen (1996), Knight et al. (1997), Lai (1996), Lai and Leung (1995), and O’Hearn (2018)

Practicality & Reputation (6): Bicarregui et al. (2009), Galloway et al. (1998), Glass (2002), Lai (1996), Lai and Leung (1995), and Parnas (2010)
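
For readers who want to reuse the mapping of Table 10, the following is a minimal sketch (not part of the original study) that encodes the table as a Python dictionary, checks each declared count against the length of its study list, and counts how many challenges each study supports. The names `TABLE_10`, `count_mismatches`, and `challenges_per_study` are hypothetical, and the data is a hand transcription in which grouped citations such as "Craigen et al. (1995, 1993)" are assumed to denote two separate studies.

```python
from collections import Counter

# Hand transcription of Table 10 (assumption: "(1995, 1993)" and "(1995a,b)"
# each denote two separate studies).
# Structure: challenge name -> (declared count, list of supporting studies).
TABLE_10 = {
    "Scalability": (7, [
        "Bowen and Hinchey (1995a)", "Craigen et al. (1995)", "Craigen et al. (1993)",
        "Hall (1990)", "Lai (1996)", "Lai and Leung (1995)", "Miller et al. (2010)"]),
    "Skills & Education": (13, [
        "Barroca and McDermid (1992)", "Bicarregui et al. (2009)", "Bjorner (1987)",
        "Bowen and Hinchey (1995b)", "Craigen et al. (1995)", "Craigen et al. (1993)",
        "Galloway et al. (1998)", "Hall (1990)", "Heisel (1996)",
        "Hinchey and Bowen (1996)", "Lai (1996)", "Lai and Leung (1995)",
        "Snook and Harrison (2001)"]),
    "Transfer of Proofs": (8, [
        "Barroca and McDermid (1992)", "Bloomfield et al. (1991)",
        "Craigen et al. (1995)", "Craigen et al. (1993)", "Hall (1990)",
        "Jackson (1987)", "Parnas (2010)", "Snook and Harrison (2001)"]),
    "Reusability": (2, [
        "Barroca and McDermid (1992)", "Bowen and Hinchey (1995b)"]),
    "Abstraction": (12, [
        "Barroca and McDermid (1992)", "Bowen and Hinchey (1995b)",
        "Galloway et al. (1998)", "Hall (1990)", "Heisel (1996)", "Heitmeyer (1998)",
        "Jackson (1987)", "Knight et al. (1997)", "Lai (1996)", "Miller et al. (2010)",
        "Parnas (2010)", "Snook and Harrison (2001)"]),
    "Tools & Automation": (16, [
        "Bicarregui et al. (2009)", "Bjorner (1987)", "Bloomfield et al. (1991)",
        "Bowen and Hinchey (1995a)", "Bowen and Hinchey (1995b)",
        "Bowen and Hinchey (2005)", "Craigen et al. (1995)", "Craigen et al. (1993)",
        "Hall (1990)", "Heitmeyer (1998)", "Hinchey and Bowen (1996)",
        "Knight et al. (1997)", "Lai (1996)", "O'Hearn (2018)", "Parnas (2010)",
        "Woodcock et al. (2009)"]),
    "Maintainability": (3, [
        "Barroca and McDermid (1992)", "Knight et al. (1997)", "Parnas (2010)"]),
    "Resources": (11, [
        "Bicarregui et al. (2009)", "Bloomfield et al. (1991)",
        "Bowen and Hinchey (1995a)", "Bowen and Hinchey (1995b)",
        "Craigen et al. (1995)", "Craigen et al. (1993)", "Hall (1990)",
        "Heisel (1996)", "Knight et al. (1997)", "Lai and Leung (1995)",
        "Woodcock et al. (2009)"]),
    "Process Compatibility": (12, [
        "Bjorner (1987)", "Bloomfield et al. (1991)", "Bowen and Hinchey (1995a)",
        "Bowen and Hinchey (1995b)", "Craigen et al. (1995)", "Heisel (1996)",
        "Heitmeyer (1998)", "Hinchey and Bowen (1996)", "Knight et al. (1997)",
        "Lai (1996)", "Lai and Leung (1995)", "O'Hearn (2018)"]),
    "Practicality & Reputation": (6, [
        "Bicarregui et al. (2009)", "Galloway et al. (1998)", "Glass (2002)",
        "Lai (1996)", "Lai and Leung (1995)", "Parnas (2010)"]),
}


def count_mismatches(table):
    """Return {challenge: (declared, actual)} where the declared count is off."""
    return {
        name: (declared, len(studies))
        for name, (declared, studies) in table.items()
        if declared != len(studies)
    }


def challenges_per_study(table):
    """Count, for every study, the number of challenges it is listed under."""
    counts = Counter()
    for _declared, studies in table.values():
        counts.update(studies)
    return counts


if __name__ == "__main__":
    print("Count mismatches:", count_mismatches(TABLE_10))
    for study, n in challenges_per_study(TABLE_10).most_common(5):
        print(f"{study}: listed under {n} challenges")
```

An empty mismatch dictionary indicates that the counts given in parentheses agree with the transcribed lists; the per-study tally is only a convenience for seeing which studies are listed under many challenges.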

A.9 All Answers to Open Questions

Table 11 provides all answers to the open questions of our questionnaire. We fixed a small number of spelling mistakes in the responses during the preparation of this table.

Table 11: Open answers to the questions Q3, Q6, Q7, Q11, Q12, and Q13; R = respondent

Columns: R; Date; D3o (Further motivations to use FMs); P3o (Other FMs used); P4o (Used FMs for other purp.); F4o (Future use of other FMs); F5o (Future use for other purp.); O1o (Further obstacles to FM use); General Feedback

/ / 14 | none | none | none | none | none | cool

4 | 2017 / / 14 | Science shall be best and great. | No, that appear approximately complete to me. | That is pretty all. | No | No | In real operative situations are always hidden obstacles as well as those listed already. Imagine real and nominal definitions were really first order logic. Add operational definition and stay sound. | Abstraction, reduction, exemplification, representation, knowledge, experience, skill and the art of analysing the taxonomies under selection are necessary or in need.

8 | 2017 / / 18 | Scientific curiosity

9 | 2017 / / 18 | Test-Generation for small problems, like MCDC coverage | high license costs

19 | 2017 / / 25 | I just liked logic. Actually never cared about applying it, but research positions are in FM so well :-) | goes completely against any existing software development process: FM require that you write specifications before implementing. I don’t believe anymore that anybody would do this any time.

21 | 2017 / / 28 | Error elimination

22 | 2017 / / 28 | Completeness, accuracy | Survey not suitable for beginners (like me) with the FMs.

28 | 2017 / / 12 | Budgetary restrictions

32 | 2017 / / 16 | environment

34 | 2017 / / 20 | The cost-benefit relationship is not slanted in FM’s favour: substantial improvements in software engineering will not come from application of FM but through process improvement, better training and education, and better control of requirements. Formal methods have uses for verifying critical components but they will not solve the big problems in software engineering.

35 | 2017 / / 25 | Supporting the design and construction of reliable and dependable systems | Many dialects of Petri nets an Automata

36 | 2017 / / 25 | Improved level of confidence versus more traditional means | reverse engineering | The main obstacle to adoption by industry is the "use" of engineers not able to handle abstraction, leading to poor results.

37 | 2017 / / 25 | The need to achieve and demonstrate the highest possible integrity of systems | None | None | None | None | Perceived cost and difficulty of use requiring specialist knowledge

38 | 2017 / / 25 | No. | None | None | There are no specific barriers (apart from not having needed practice with them) apart from I find my Formalised Methods (to strict procedures) gets the task done well enough. | I make a clear distinction between the mathematically Formal Methods and the procedure based Formalised Methods.

39 | 2017 / / 25 | B method

40 | 2017 / / 25 | Cost and quality of finished product | SPARK Ada programming language, Z, CSP | Broken market / market-for-lemons in software quality

43 | 2017 / / 25 | They are the only way to guarantee certain properties of software and its documentation | Specification: TLA+, Z | Fault and failure analysis

44 | 2017 / / 25 | none | none | no | No

45 | 2017 / / 25 | Clearly thinking about the problem and correctness

47 | 2017 / / 25 | I work with ’time triggered’ systems. These can be modelled effectively (semi-formally) without a complete mathematical model. Many of our customers think this is far more advanced than they require ... | No customer demand.

48 | 2017 / / 25 | "Bugs", software defects, are semantic inconsistencies in code. Formal methods acknowledge the necessary semantics and help prevent defects. | Currently integrating a degree of formal methods into software requirements and design modelling method | Drive construction of good code. Properties of a good proof are also properties of good code: minimal steps, logical progression of steps, ... Writing code to be proveable (even if never explicitly proven) yields better code. | I don’t understand this question. | None that I can think of now | The software industry’s cultural predisposition to informal approaches ("But that’s not the way we’ve always done it!")

49 | 2017 / / 26 | Combining traditional and Formal methods based Software V & V

50 | 2017 / / 26 | I love logic and maths. | my experience is mostly as a researcher and teacher, something for which the above does not allow me to tick a box | to make exam questions :-)

54 | 2017 / / 04 | Belief it helps construct systems with much less errors | Most of practitioners don’t and don’t want to know formal methods. So there is a strong need for "hidden" use of formal methods, like compilers. | FM have strong potentials but are also difficult to use in industry practices. Transferring very good academia results into industry practices is challenging.

56 | 2017 / / 06 | no

58 | 2017 / / 23 | To reduce costs in system development. | Reduce costs.

59 | 2017 / / 23 | simplifies sometimes things, because of enforcement of a systematic approach

60 | 2017 / / 23 | The beauty of mathematics

61 | 2017 / / 23 | research program | ignorant persons

64 | 2017 / / 23 | Elegance and precision | VDM, Z, Event-B | Interesting set of questions

65 | 2017 / / 23 | Maintenance cost and reliability

66 | 2017 / / 23 | I consider it the foundation of principled software engineering | Unifying Theories of Programming | rapid prototyping calculation tools | lack of time

67 | 2017 / / 23 | Higher level of trust | pseudo-occam, SPARK-ada | Not very user friendly, too abstract and diverse syntax between different FMs, productivity

68 | 2017 / / 23 | Increasing engineering reliability | The TRIO specification language | UML with formal application-dependent tailoring.

71 | 2017 / / 23 | Teaching how to build high-quality software

72 | 2017 / / 23 | Engineers will not be able to apply formal methods.

73 | 2017 / / 23 | Achieving correct, fault-free software | In lectures and teaching FMs.

76 | 2017 / / 25 | no

77 | 2017 / / 26 | Thesis - tool verification

79 | 2017 / / 29 | productivity of the *whole* process

80 | 2017 / / 30 | Soundness checking of automation

83 | 2017 / / 06 | Well chosen questions which do not leave me guessing. Relevant to future FM research and practice.

84 | 2017 / / 06 | None - impractical for most systems

85 | 2017 / / 06 | Identified as industrial best practice

87 | 2017 / / 06 | Desire to develop systems that I have solid evidence perform correctly under all scenarios. | Aligning academic research with industrial need. Getting tool developers to invest in FM.

90 | 2017 / / 19 | business and research differentiation

91 | 2018 / / 02 | Lack of tools and wide range of methods in existence.

93 | 2018 / / 16 | I don’t know | I haven’t | I haven’t | I don’t know | Academia | I don’t know

94 | 2018 / / 30 | Niche market, outperform companies using "traditional" means

96 | 2018 / / 31 | verify protocols

97 | 2018 / / 01 | I believe that software engineering should have the same mathematical underpinning as regular engineering (I majored in engineering)

98 | 2018 / / 01 | To increase quality and reliability

100 | 2018 / / 01 | needed | - | - | - | - | - | -

102 | 2018 / / 02 | no | none | no | no

105 | 2018 / / 02 | See Andy Galloway, Trevor Cockram, and John McDermid. Experiences with the application of discrete formal methods to the development of engine control ... | Independent reviewer of FM specifications | None

107 | 2018 / / 02 | Pushing back the boundaries of knowledge. Making programs which are correct by construction. | FermaT Program Transformations

108 | 2018 / / 03 | critical infrastructures and technology requirements | specification of hybrid communicating systems | future technological design methodologies

109 | 2018 / / 03 | No further motivations to use FMs

111 | 2018 / / 03 | A candidate supplier offered FMs in a tender response to address an assurance requirement | reviewing consistent set of requirements expressed using Z, OBJ, etc | I need to provide assurance that is understandable to those who need to be assured | Formal Methods are perceived to be expensive to use, and they can be, but this can be offset by the benefits; there does not seem to be much work published on this that could be used to persuade the Customer and the budget holder...

113 | 2018 / / 03 | Formal is the only way to obtain rigorous verification.

114 | 2018 / / 04 | Strategic facilitation in large enterprises, where formal modelling can reveal gaps in the clients understanding of their own ecosystems. | Z, VDM, AXES | Requirements engineering | Protective Analysis (PAN)

115 | 2018 / / 04 | As per my interpretation, FM is a consistent a logical way to describe a System / Product | Current corporations do not consider valuable FMs during product development, as a consequence, Engineers will not even try to use it | Looks really interesting and I will do some research from my side to learn a little bit more

117 | 2018 / / 05 | To change the way the world produces software | Security and safety analysis | No | Safety and security analysis of Systems | Awareness of commercial potential | Closed questionnaire is just a start

118 | 2018 / / 05 | Improve general system reliability | TRIO, timed Automata, B | Requirement engineering | Research on innovative formalisms, foundational research

122 | 2018 / / 06 | I have developed various safety-related gas sensors and other measurement devices. I was invited to become an assessor of these systems by a certifying body. I was shocked by what I discovered when I scrutinised the methods and approaches used by the embedded software developers whose work was being certified. I resolved to improve my own knowledge and practice so that I could deal authoritatively with some of the unsatisfactory situations that arose during various certifications. | I have great difficulty in persuading anyone to take an interest in FM. I represented the UK on a European committee concerned with safety related gas systems. The ignorance and lack of interest in FM was startling - the German representatives were particularly hostile. So I only use FM in special cases with specific customers. There is much prejudice out there against FM. | See answer above. We have to publicise some successes in order to lift the veil of ignorance about FM.

123 | 2018 / / 06 | We provide customers an FM tool | Most tools have many limitations and do not support wide domains by themselves

126 | 2018 / / 07 | Quality requirements | Complexity

129 | 2018 / / 07 | Scientific curiosity. | Formalising international standards

131 | 2018 / / 07 | It is a hard problem.

133 | 2018 / / 07 | Improving confidence in software

134 | 2018 / / 07 | Formal methods are the backbone of the science in "Computing Science".

136 | 2018 / / 07 | To model complex phenomena in social sciences

137 | 2018 / / 07 | No

138 | 2018 / / 07 | Protocol analysis | Protocol analysis

141 | 2018 / / 08 | The lists essentially contains formal verification approaches. Formal methods are mathematically grounded approaches to design / analyse artefacts, e.g. the B method is a formal method to devise software components. I use the B method in my current job. | System-level analysis. Software design. | System analysis. Software design.

145 | 2018 / / 08 | Curiosity | None

146 | 2018 / / 08 | Interest / Research | As testing tools. | Industry mind set

147 | 2018 / / 08 | gain confidence in the algorithms we were reasoning about. | None. I am convinced that only FMs can help us from shitty and buggy software.

150 | 2018 / / 09 | The robustness and preciseness that the formal methods discipline can provide via its specifications and verifications techniques | Thank you very much for this survey. It is very constructive and important. It handles most of the issues encountered by any practitioner and user of formal methods.

151 | 2018 / / 09 | I only encountered FMs as a topic in my studies once. They do not play any role in my profession at the moment.

153 | 2018 / / 09 | Quality

154 | 2018 / / 09 | Main part of business model of our startup | Sorry for not disclosing our Future Extent of Formal Methods Use | Hypes (eg AI) that direct decision makers in other directions | Sorry for not disclosing our Future Extent of Formal Methods Use

157 | 2018 / / 10 | I think FM is misplaced in our study program SE, especially it being mandatory. It is very rarely used in industry (for a reason) and there are so many more important things to learn in order to become the Software Engineers of tomorrow (eg. Entrepreneurship).

160 | 2018 / / 10 | I am a PhD student using FM for formal verification. I plan on continuing to do this...

161 | 2018 / / 11 | Time constraints, financial constraints, refactoring efforts

167 | 2018 / / 13 | They are fun and solve a "real" problem

168 | 2018 / / 13 | Abstract state Machines, Petri Nets, LTL, CTL, SAT and SMT solvers

169 | 2018 / / 13 | Strong (potential) guarantees unobtainable by other methods | Requirement falsification (through simulation), dynamic / online assurance (monitoring and reasoning) | Lack of appreciation and understanding from other (CS and engineering) communities | Nice questions overall. In those with relative answers (e.g., more / less frequent than before), it would be nice to have my previous answers on the same page (to check what I answered exactly).

171 | 2018 / / 13 | I have developed both VDM and Rely / Guarantee concepts | VDM, Rely / Guarantee, Separation Logic | Rely / Guarantee | education

174 | 2018 / / 13 | model-based testing

176 | 2018 / / 14 | Petri Nets | very strange you put Petri nets to Section P2 as a description technique, and process calculi to Section P3 as a reasoning technique. In fact, Petri nets offer much more reasoning techniques than process calculi do

178 | 2018 / / 15 | Software Projects often have a large existing code base. The assumptions that hold in that code base need to be thoroughly analyzed first. Moreover, I am missing automatic mapping from source code to formal logic models / descriptions. | I’ve missed questions towards why people do not use formal methods in a given domain. What are the obstacles that they are currently facing?

179 | 2018 / / 15 | Interest in reliable critical systems | Changed focus of interest at my work

182 | 2018 / / 16 | I believe in doing things right or doing it not at all. | code generation based on allegories | The way academia communicates FMs to engineers in practice.

183 | 2018 / / 16 | Interest

186 | 2018 / / 17 | Teaching FM | Combinations of formal and semi-formal methods

187 | 2018 / / 19 | FM = only method able to solve the problem

193 | 2018 / / 26 | Z Karnaugh Maps ? for circuit design

195 | 2018 / / 04 | business in FM | mcrl2

197 | 2018 / / 24 | FMs have been in existence since 1967. they have never made it into the mainstream. | CSP | Finding synchronization bugs

198 | 2018 / / 24 | need for rigorous specification and verification

199 | 2018 / / 24 | State of the art in control engineering and automation

204 | 2018 / / 26 | Move to management level

207 | 2018 / / 05 | Only reasonable way to get my results

208 | 2018 / / 05 | I never felt tempted to use formal methods | To clearly define concepts

213 | 2018 / / 16 | Regulatory authority | ok

215 | 2019 / / 04 | Functional safety standards and their application | Modelling languages rather semi-FMs, are more practical and easy to apply | support test case generation and validation | customers ability to make use of them as well in the specification and development process of critical systems

217 | 2019 / / 20 | UML

218 | 2019 / / 22 | increase confidence in system security

221 | 2020 / / 20 | Static analyzer e.g. PVS Studio | 1. The tools don’t even work. 2. The tools aren’t cost effective. 3. We would have to train all the programmers, the QA team, the systems engineering team, and (importantly) the company executives.

A.10 Copy of the Advertisement Flyer

A.11 Screenshot of the Twitter Poll

A.12 Copy of the Questionnaire

The PDF export of our on-line questionnaire on the next page corresponds to the questionnaire we used for the sample taken until 31.3.2019 with N = https://goo.gl/forms/FnKNQtTmI3A6BekM2 . We crafted this questionnaire using Google Forms (Google, 2018). We use numbered identifiers for each question category: demographic questions are prefixed with a “D”, questions about past FM use (UFM_p) with a “P”, about future or intended FM use (UFM_i) with an “F”, questions about obstacles with an “O”. Open questions are suffixed by an “o”.

Use of Formal Methods

Dear participant, thank you for your interest in this 8-10 min survey on the use of formal methods (FMs).

This survey does NOT require previous knowledge in FMs or in their actual application in a practical context. However, this survey targets persons with an educational background in engineering and sciences OR with a practical engineering background in a reasonably critical systems or product domain.

By "FMs", we refer to explicit mathematical models and sound formal logical reasoning about critical properties---such as reliability, safety, availability, data privacy or, more generally, dependability and security---of electrical, electronic, and programmable electronic or software systems in critical application domains. FMs include, for example, formal specification, theorem proving, model checking, formal contracts, SMT solving, process algebras.

By "use of FMs", we refer to the application of FMs to engineered systems in the context of education, research, and, particularly, the field of industrial practice and by using formal languages together with manual or automated tool-based techniques.

This survey is anonymous. However, you can provide your email address if you are interested in receiving our final results afterwards.

The underlying study is conducted by Mario Gleirscher at University of York and Diego Marmsoler at Technical University of Munich.

* Required

Demographic Questions

D1. In which application domain(s) in industry or academia (if any) have you mainly used FMs? *

Select all that apply.

I have not used FMs in any academic or industrial domain.
Critical infrastructures (e.g. telecom, energy, road/air/naval/rail traffic, smart buildings or cities)
Process automation (e.g. chemical process plants, power plants, warehouse logistics, production lines)
Industrial machinery (e.g. stationary robotics, production machines)
Transportation (e.g. automotive, utility vehicles, naval, aeronautics, train systems, freight logistics, cable cars, mobile robotics, drones/UAVs)
Device industry (e.g. medical, health-care, semi-conductors, consumer electronics)
Military systems not in the above domains (e.g. for command, control, surveillance)
Business information (e.g. database applications, banking, finance, ERP, PLM, web services, cloud apps)
Platforms (e.g. operating systems, middle-ware, firmware, drivers, database systems, libraries)
Supportive (e.g. CASE tools, checking or verification tools, CAD/CAM systems)
Other:

D2. How many years of FM experience (including the study of FMs) have you gained? *

Mark only one oval.

I do not have any knowledge of or experience in FMs.
less than 3 years
3 to 7 years
8 to 15 years
16 to 25 years
more than 25 years

D3. Which have been your motivations (if any) to use FMs? *

Mark only one oval per row.

Columns: no motivation | moderate motivation | strong motivation (or requirement)
Rows:
Regulatory authorities
Customers / scientific community
Employer / research collaborators
Superior(s) / principal investigator(s)
Study or research program
Own (private) interest
On behalf of an FM tool or service provider

D3o. Which have been your further motivations to use FMs (if any)?

Past and Current Use of Formal Methods

The following questions aim at your EXPERIENCE with the use of FMs in your PAST and CURRENT activities and projects.

NOTE: If you are not able to say anything about past or current use, please, choose the corresponding "not yet used...", "no experience...", or "not at all" options and proceed to the next page!

P1. In which role(s) have you used FMs? *

Select all that apply.

I have not used FMs in any specific role.
Engineering practitioner in industry (e.g. programmer)
Consulting or managing practitioner in industry (e.g. architect, requirements or systems engineer)
External consultant (e.g. external requirements or systems engineer)
Researcher in industry
Researcher in academia
Lecturer, teacher, trainer, or coach
Bachelor, master, or PhD student
Stakeholder of an FM tool or service provider
Other:

Experience in Formal Methods Use reprint – F ormal M ethods U se P2. Describe your level of experience with each of the following classes offormal description techniques? *

Markieren Sie nur ein Oval pro Zeile. noexperienceor noknowledge studied in(university)course applied inlab,experiments,case studies appliedonce inengineeringpractice appliedseveraltimes inengineeringpracticePredicative,relational, oralgebraicspecificationModal andtemporal logicspecificationProcess models(e.g. Petri nets,Mealy machines,LTS, Markovprocesses)Dynamicalsystems (i.e.differentialequations)6. reprint – F ormal M ethods U se P3. Describe your level of experience with each of the following classes offormal reasoning techniques? *

Mark only one oval per row.

Answer scale:
no experience or no knowledge
studied in (university) lectures
applied in lab, experiments, case studies
applied once in engineering practice
applied several times in engineering practice

Items:
Abstract interpretation
Assertion checking (e.g. for pre/post specification, contracts)
Process calculi (e.g. CSP, CCS, pi, mu, hybrid)
Model checking, SMV (of e.g. temporal or probabilistic properties)
Constraint (SAT, SMT) solving (e.g. for static code analysis), optimisation techniques
Generic (first-order, HOL) theorem proving (using e.g. term rewriting, functional programming)
Computational engineering, simulation (using e.g. differential calculus, numerical methods)
Symbolic execution (e.g. scenario testing, model animation)
Consistency checking (e.g. syntax or bug pattern checking)

P3o. List other FMs you have experience with (if any):

Purposes of Formal Methods Use

P4. I have mainly used FMs for ... *

Mark only one oval per row.

Answer scale:
... not at all.
... once.
... in 2 to 5 separate tasks.
... in more than 5 separate tasks.

Items:
... clarification (i.e. explicit description for analyzing a problem)
... specification (e.g. contracts, documentation and communication of requirements and design)
... inspection (i.e. error detection, e.g. non-conformance checking, model-based testing)
... synthesis (e.g. transformation, compilation)
... assurance (e.g. error removal, property verification, refinement or equivalence proofs, argumentation)

P4o. I have used FMs for other purposes (if any):

Intended Future Use of Formal Methods

The following questions aim at your INTENT to use FMs in your FUTURE activities and projects.

NOTE: Your intent to use FMs will also be interpreted as the "mere possibility of FM usage in the corresponding ways" according to and based on your responses.

Future Extent of Formal Methods Use

F1. In which application domain(s) in industry or academia (if any) would (or do) you intend or recommend to use FMs? *

Select all that apply.

I would not or do not intend (or recommend) to use FMs in any academic or industrial domain.
Critical infrastructures (e.g. telecom, energy, road/air/naval/rail traffic, smart buildings or cities)
Process automation (e.g. chemical process plants, power plants, warehouse logistics, production lines)
Industrial machinery (e.g. stationary robotics, production machines)
Transportation (e.g. automotive, utility vehicles, naval, aeronautics, train systems, freight logistics, cable cars, mobile robotics, drones/UAVs)
Device industry (e.g. medical, health-care, semi-conductors, consumer electronics)
Military systems not in the above domains (e.g. for command, control, surveillance)
Business information (e.g. database applications, banking, finance, ERP, PLM, web services, cloud apps)
Platforms (e.g. operating systems, middle-ware, firmware, drivers, database systems, libraries)
Supportive (e.g. CASE tools, checking or verification tools, CAD/CAM systems)
Other:

F2. In which role(s) would (or do) you intend to use FMs? *

Select all that apply.

I do not or would not intend to use FMs in any specific role.
Engineering practitioner in industry (e.g. programmer, test or verification engineer)
Consulting or managing practitioner in industry (e.g. architect, requirements or systems engineer)
External consultant (e.g. external requirements or systems engineer)
Researcher in industry
Researcher in academia
Lecturer, teacher, trainer, or coach
Bachelor, master, or PhD student
Stakeholder of an FM tool or service provider
Other:

F3. I (would) intend to use ... *

Mark only one oval per row.

Answer scale:
... no more or not at all.
... less often than in the past.
... as often as in the past.
... more often than in the past.
I don't know.

Items:
... predicative, relational, or algebraic specification
... modal or temporal logic specification
... process models (e.g. Petri nets, Mealy machines, LTS, Markov processes)
... dynamical system models (i.e. differential equations)

F4. I (would) intend to use ... *

Mark only one oval per row.

Answer scale:
... no more or not at all.
... less often than in the past.
... as often as in the past.
... more often than in the past.
I don't know.

Items:
... abstract interpretation
... assertion checking (e.g. for pre/post specification, contracts)
... process calculi (e.g. CSP, CCS, pi, mu, hybrid)
... model checking, SMV (of e.g. temporal or probabilistic properties)
... constraint (SAT, SMT) solving (e.g. for static code analysis), optimisation techniques
... generic (first-order, HOL) theorem proving (using e.g. term rewriting, functional programming)
... simulation (i.e. computational engineering using e.g. differential calculus, numerical methods)
... symbolic execution (e.g. scenario testing, model animation)
... consistency checking (e.g. syntax or bug pattern checking)

F4o. I (would) intend to use other FMs, semi-FMs (i.e. without formal semantics and proof system), or highly systematic procedure (if any, please, provide some details):

Future Purposes of Formal Methods Use

F5. I (would) intend to use FMs for ... *

Mark only one oval per row.

Answer scale:
... no more or not at all.
... less often than in the past.
... as often as in the past.
... more often than in the past.
I don't know.

Items:
... clarification (i.e. explicit description for analyzing a problem)
... specification (e.g. contracts, documentation and communication of requirements and design)
... inspection (i.e. error detection, e.g. non-conformance checking, model-based testing)
... synthesis (e.g. transformation, compilation)
... assurance (e.g. error removal, property verification, refinement or equivalence proofs, argumentation)

F5o. I (would) intend to use FMs for other purposes (if any):

Potential Obstacles to the Intended Use of Formal Methods

O1. For any potential use of FMs in my future activities and projects, I consider ... *

Mark only one oval per row.

Answer scale:
... not as an issue.
... as a moderate challenge.
... as a tough challenge.
I don't know.

Items:
... scalability (e.g. towards large or heterogeneous systems)
... proper (automated) abstractions from irrelevant details
... maintainability of verification results (e.g. stable proofs)
... reusability of verification results (e.g. parametric proofs)
... transfer of verification results from models to reality
... automation or tool support (incl. notations, DSLs, IDEs)
... skills and education (e.g. methods known and ready to use)

O1o. Which further obstacles (if any) would potentially hinder you from using FMs as intended?

Thank you for your participation! The Survey Code is: XXXX-XXXX-XXXX-XXXX

Please feel free to provide us with any feedback on the questionnaire or on its topic: