Bounded verification of message-passing concurrency in Go using Promela and Spin
S. Balzer, L. Padovani (Eds.): Programming Language Approaches to Concurrency- & Communication-cEntric Software (PLACES 2020), EPTCS 314, 2020, pp. 34–45, doi:10.4204/EPTCS.314.4
Nicolas Dilley
University of Kent
Julien Lange
University of Kent
This paper describes a static verification framework for the message-passing fragment of the Go programming language. Our framework extracts models that over-approximate the message-passing behaviour of a program. These models, or behavioural types, are encoded in Promela, hence can be efficiently verified with Spin. We improve on previous works by verifying programs that include communication-related parameters that are unknown at compile-time, i.e., programs that spawn a parameterised number of threads or that create channels with a parameterised capacity. These programs are checked via a bounded verification approach with bounds provided by the user.
Go is an increasingly popular programming language that is known for its lightweight threads (called goroutines) and native support for message-passing concurrency. Go programmers are encouraged to coordinate threads by exchanging messages over channels, rather than using shared memory protected by mutexes [13]. In a recent empirical survey [3], we have discovered that more than 70% of the most popular Go projects on GitHub use message-passing primitives. Additionally, Tu et al. [18] showed that message-passing based software is as liable to errors as other concurrent programming techniques. They also showed that Go concurrency bugs are hard to detect and have a long lifetime. This is reflected in a recent survey amongst Go programmers reporting that programmers often do not feel they are able to effectively repair bugs related to Go's concurrency features [17]. Concretely, message-passing concurrency bugs in Go fall in two categories: (i) blocking errors, where a goroutine is permanently waiting for a matching send/receive action and (ii) channel errors, where a goroutine attempts to close or send to a channel that is already closed.

The Go ecosystem provides little support for users to detect concurrency bugs. Its type system only ensures that each channel instance carries a single specified data type. While a run-time global deadlock detector is available, it is silently disabled by some libraries. To help programmers produce correct concurrent software, several authors have proposed techniques to verify Go programs both statically (at compile-time) [7, 8, 10, 12, 14] and dynamically (at run-time) [15, 16]. One of the more mature techniques for statically verifying Go programs is Godel [8], which relies on the similarity of Go's message-passing aspect to CCS [11]. Godel follows an approach based on behavioural types where Go programs are over-approximated by CCS-like processes, which in turn are model-checked, using mCRL2 [2], for safety and liveness properties.
Because mCRL2 only deals with finite-state models, Godel has one key limitation: it does not support programs that spawn new threads in for-loops, e.g., the program in Figure 1 is not supported. This restriction limits the applicability of Godel to real-world code-bases. Indeed, 58% of the Go projects we studied in [3] feature thread-spawning in for-loops.

Figure 1 shows a typical Go program where several worker threads are concurrently sending data to the parent thread via channel a. Note that this program spawns |files| threads and creates a channel whose capacity is |files|. The length of files is unknown at compile-time, hence this program cannot be checked for concurrency errors with existing static verification techniques for Go [7, 8, 10, 12, 14].

    func main() {
        files := getFiles()                // decl. of getFiles() omitted
        a := make(chan string, len(files)) // create bounded buffer 'a'
        for i := 0; i < len(files); i++ {
            go worker(a, files[i], i)      // spawn worker()
        }
        for i := 0; i < len(files); i++ {
            <-a                            // receive from 'a'
        }
    }

    func worker(a chan string, f string, i int) {
        a <- parseFile(f, i) // send data on 'a' (decl. of parseFile() omitted)
    }

Figure 1: File processing example

Our approach
Our short-term objective is to improve the approach from [8] so that we can detect bugs in programs that feature communication-related parameters that are unknown at compile-time. We focus on two kinds of communication-related parameters: (i) those that determine the number of threads a program may spawn at run-time and (ii) those that determine the capacity of channels. For example, the number of threads and the capacity of channel a are unknown at compile-time in Figure 1. To fulfil our objective, we augment the behavioural types technique of [8] with (i) an intra-procedural analysis to identify unknown communication-related parameters, and (ii) a bounded verification wrt. these parameters. Concretely, we infer behavioural types from Go programs which may feature (undefined) communication-related parameters. If so, we ask the users to instantiate these parameters with bounds so that we can model-check the inferred behavioural types. The main challenges are to ask for user-provided bounds only when necessary and to ensure that these bounds are used consistently. We address these challenges by keeping track of variables that may be used in channel creation statements or for-loops that spawn threads.

Our long-term objective is to study automated repair of message-passing errors in Go. In anticipation of this next step, we deviate from [8] in several ways. (1) We infer behavioural types directly from Go source code instead of its lower-level (SSA) representation. (2) We use Promela and Spin instead of mCRL2 to encode and verify behavioural types. Promela has the advantage of being much closer to Go. It has an imperative Go-like syntax and natively supports synchronous and asynchronous channels. As a consequence, it will be easier to syntactically map an error in a Promela model to its source program. (3) We divide Go programs into independent partitions. This allows us to detect partial deadlocks and to identify the location of defects more precisely, while making our tool faster. Figure 2 gives an overview of our approach, which we have implemented in a tool called GOMELA [4].
Synopsis
In § 2, we present a core subset of Go, called MiniGo, as well as typical bugs that we want to rule out. In § 3, we give a detailed algorithm to extract Promela models from Go programs, while keeping track of communication-related parameters. In § 4 we present our implementation and its empirical evaluation. We discuss related work and conclude in § 5.

[Figure 2: GOMELA workflow. Stages shown: Go source files; program partitioning (f1() {...}, f2() {...}); model extraction (Promela); model checking (Spin); bounds supplied by the user.]

MiniGo and message-passing concurrency errors

For the sake of presentation, we use a fragment of Go that is focused on its message-passing features and call it MiniGo; we describe how our tool deals with a larger subset of Go in Section 4. The syntax of
MiniGo is given in Figure 3. We only discuss its semantics informally and refer to [7] for a formal account of the semantics of a variation of our language.

We use v to range over non-channel variables, ch to range over channel variables, x to range over any variables, e to range over expressions (excluding channel variables), a to range over expressions (possibly including channel variables), id to range over function names, and n to range over integer literals. We use r to range over mutators of for-loop indices. We use $\tilde{a}$ to range over a list of expressions and overload the notation for lists of statements ($\tilde{s}$), etc. We write $\tilde{s}_1 \tilde{s}_2$ for the concatenation of $\tilde{s}_1$ and $\tilde{s}_2$. We write chans($\tilde{a}$) (resp. chans($\tilde{x}$)) for the maximal sub-list of $\tilde{a}$ (resp. $\tilde{x}$) that contains only channel variables.

A MiniGo program p consists of a list of function declarations $\tilde{d}$, possibly including a main function (the program's entry point). Each function declaration specifies a list of parameters $\tilde{x}$ (possibly including channel variables) and a function body $\tilde{s}$.

Statement ch := make(chan, e) creates a new channel of capacity n, when e evaluates to n. If n = 0, the channel is synchronous; otherwise it is asynchronous (buffered). Communication actions α interact with channels: v ← ch receives a value from channel ch and binds it to variable v, while ch ← e sends the evaluation of expression e on channel ch. Send actions are blocking when the channel is synchronous or has reached its maximal capacity. A channel can be closed with a close(ch) operation. Any send or close action on a closed channel triggers a run-time error. Any receive action on a closed channel succeeds; if the channel is empty, a default value is returned. Select statements select { $\tilde{c}$ } are guarded choices: they block until one of the guarding communication operations succeeds, after which the corresponding case is executed. If multiple operations are available, one is chosen non-deterministically. Select statements may include a unique default branch, which is taken if all other branches are blocking.

A statement go id($\tilde{a}$) spawns a new goroutine, i.e., an instance of function id($\tilde{a}$) which is executed concurrently with its parent thread. MiniGo also includes standard constructs such as general sequencing, conditionals, for-loops, and assignment. For the sake of simplicity we model only the relevant parts of the language of expressions (see the definition of e). We assume that variable names are pairwise distinct. Additionally, as in [8], we assume that channels do not occur in e, that variables are immutable, and that recursive functions do not spawn goroutines (for-loops are more common than recursion in Go).

Message-passing errors in
MiniGo. In this work, we are interested in message-passing related bugs that MiniGo programs may encounter at run-time. We distinguish three types of such bugs. A global deadlock is a situation where at least one goroutine is waiting for a send or receive action to succeed, while all the other goroutines are either blocked or terminated. A partial deadlock is a situation where at least one goroutine is permanently stuck while waiting for a send or receive action to succeed. Go developers refer to partial deadlocks as goroutine leaks because such stuck goroutines never reach the end of their scope, and thus are never garbage-collected. A channel safety error is a situation where a send or close operation is triggered on a closed channel.

\[
\begin{array}{rcl}
p & := & \tilde{d} \\
s & := & ch := \mathsf{make}(\mathsf{chan}, e) \mid \alpha \mid \mathsf{select}\ \{\tilde{c}\} \mid \mathsf{close}(ch) \mid id(\tilde{a}) \mid \mathsf{go}\ id(\tilde{a}) \mid \{\tilde{s}\} \\
  & \mid & \mathsf{if}\ e\ \mathsf{then}\ \tilde{s}\ \mathsf{else}\ \tilde{s} \mid \mathsf{for}\ v := e;\ e;\ r\ \{\tilde{s}\} \\
\alpha & := & v \leftarrow ch \mid ch \leftarrow e \\
c & := & \mathsf{case}\ \alpha : \tilde{s} \mid \mathsf{default} : \tilde{s} \\
e & := & \mathsf{true} \mid \mathsf{false} \mid n \mid v \mid \ldots \\
a & := & ch \mid e \\
x & := & ch \mid v \\
d & := & \mathsf{func}\ id(\tilde{x})\ \{\tilde{s}\} \\
r & := & v\texttt{++} \mid v\texttt{--} \mid \ldots
\end{array}
\]

Figure 3: Syntax of MiniGo (ch ranges over channel variables and v stands for non-channel variables).

MiniGo programs
We adopt an approach based on behavioural types to produce a sound analysis of MiniGo for channel safety and global deadlock errors, following [7, 8]. Note that this approach is generally unsound wrt. liveness properties such as partial deadlock freedom without a termination checker, see [7, Section 5]. In this context, behavioural types are an over-approximation of the interactions between goroutines, i.e., they record send and receive actions, while abstracting away from the computational aspects. Typically, conditional statements are assigned behavioural types that correspond to non-deterministic choices in process calculi. In our work, behavioural types take the form of Promela models, which we extract from MiniGo source code. Remarkably, we keep track of some computational aspects when they affect the structure of the communication of the program, e.g., in Figure 1 we need to keep track of len(files).

Given a MiniGo program p, we extract Promela models as follows. For each function declaration func id($\tilde{x}$) { $\tilde{s}$ } where $\tilde{x}$ does not contain any channel variables, we generate a model which consists of three parts: (1) a model entry point (init process in Promela) that contains the translation of $\tilde{s}$; (2) a list of process declarations (proctype in Promela), one for each distinct function call occurring (inter-procedurally) in $\tilde{s}$; (3) a set of monitor processes, one for each channel created in $\tilde{s}$. Each of these models corresponds to a partition of a MiniGo program. Because these partitions do not have free channel variables, they are effectively independent. Hence, we can verify them independently by considering each function declaration without channel parameters as a program entry point. As a consequence, we obtain a more precise and wider analysis of code-bases, while reducing the computational cost of our analysis, compared to [8]. In particular, we can detect some partial deadlocks in the program under consideration by identifying global deadlocks in some of its partitions.

Hereafter, we take the following conventions: Promela strings generated by our algorithm are written in typewriter blue. MiniGo code is written in italic. Our approach is formalised through functions (in typewriter black) and algorithms (in bold-red) that manipulate MiniGo programs. Each identifier in MiniGo is translated to the equivalent string in Promela, e.g., ch in MiniGo is translated to ch1. For the sake of readability, we omit the concatenation operator between literal Promela strings and strings generated by translation functions.
Function declarations
Given a function body $\tilde{s}$, for each distinct (blocking) function call id($\tilde{a}$) occurring inter-procedurally in $\tilde{s}$ such that chans($\tilde{a}$) ≠ [], we define a Promela process (proctype) as follows:

    proctype id(chanParams(id), ch) { TransStmts(∅, body(id)); ch!0 }

where ch is a channel used to signal the termination of the function call (with ch!0). For each distinct non-blocking function call go id($\tilde{a}$), such that chans($\tilde{a}$) ≠ [], we define the process:

    proctype go_id(chanParams(id)) { TransStmts(∅, body(id)) }

where chanParams(id) (resp. body(id)) returns the channel parameters (resp. body) of function id and ∅ denotes the empty map. Observe that the non-channel parameters are abstracted away. We use proctype instead of inline definition as the latter cannot include declarations of new channels. Next, we define function TransStmts which translates MiniGo statements to Promela.

    function TransStmts(Δ, s $\tilde{s}$)
      switch s:
        case ch ← e:
          ch.in!0; ch.sending?state; TransStmts(Δ, $\tilde{s}$)
        case v ← ch:
          ch.in?0; TransStmts(Δ, $\tilde{s}$)
        case close(ch):
          ch.closing?state; TransStmts(Δ, $\tilde{s}$)
        case if e then $\tilde{s}_1$ else $\tilde{s}_2$:
          if :: true -> TransStmts(Δ, $\tilde{s}_1$) :: true -> TransStmts(Δ, $\tilde{s}_2$) fi; TransStmts(Δ, $\tilde{s}$)
        case select { $\tilde{c}$ }:
          if TransStmts(Δ, $\tilde{c}$) fi; TransStmts(Δ, $\tilde{s}$)
        case case α : $\tilde{s}$:
          :: TransStmts(Δ, α) -> TransStmts(Δ, $\tilde{s}$)
        case default : { $\tilde{s}$ }:
          :: true -> TransStmts(Δ, $\tilde{s}$)
        case id($\tilde{a}$):
          if chans($\tilde{a}$) ≠ [] then
            ch = [0] of { int }; run id(chans($\tilde{a}$), ch); ch?0; TransStmts(Δ, $\tilde{s}$)
          otherwise TransStmts(Δ, $\tilde{s}$)
        case go id($\tilde{a}$):
          if chans($\tilde{a}$) ≠ [] then
            run go_id(chans($\tilde{a}$)); TransStmts(Δ, $\tilde{s}$)
          otherwise TransStmts(Δ, $\tilde{s}$)
        case for v := e; e; r { $\tilde{s}_1$ }:
          let (Δ', x, y) = lookup_Δ(v := e; e; r) in
          if spawns($\tilde{s}_1$) ∨ Δ == Δ' then
            for (i : x .. y) { TransStmts(Δ', $\tilde{s}_1$) }; TransStmts(Δ', $\tilde{s}$)
          otherwise
            do :: true -> TransStmts(Δ, $\tilde{s}_1$) :: true -> break; od; TransStmts(Δ, $\tilde{s}$)
        case ch := make(chan, e):
          let (Δ', _, y) = lookup_Δ(i := 0; i < e; i++) in
          Chandef ch; chan ch.in = [y] of { int }; run chanmonitor(ch); TransStmts(Δ', $\tilde{s}$)

Algorithm 1: Extracting Promela from MiniGo statements. We assume that TransStmts(Δ, []) returns the empty string.

Algorithm 1 specifies how we extract a model from a list of MiniGo statements. We use b to range over the control statements of a for-loop, i.e., b ranges over triples of the form (v := e; e; r). Function TransStmts takes two parameters: (1) Δ maps expressions (corresponding to communication-related parameters) to Promela strings, and (2) a list of MiniGo statements.
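As an illustration of how Algorithm 1 walks a statement list, the following Go sketch covers only its send, receive, close, and if-then-else cases. It is a simplified approximation and not GOMELA's code: the types Send, Recv, Close, and If and the functions transStmts and branch are hypothetical, the map Δ is omitted because loops and channel creation are not handled, and an empty branch is rendered as skip (an assumption made here so that the output stays valid Promela).

```go
package main

import (
	"fmt"
	"strings"
)

// Stmt is a MiniGo statement of the small fragment handled by this sketch.
type Stmt interface{}

type Send struct{ Ch string }       // ch <- e
type Recv struct{ Ch string }       // v <- ch (MiniGo notation)
type Close struct{ Ch string }      // close(ch)
type If struct{ Then, Else []Stmt } // if e then s1 else s2 (guard abstracted away)

// branch translates the body of a conditional branch; an empty body
// becomes "skip" (an assumption of this sketch, not part of Algorithm 1).
func branch(stmts []Stmt) string {
	if len(stmts) == 0 {
		return "skip; "
	}
	return transStmts(stmts)
}

// transStmts emits Promela text for a list of statements, following the
// corresponding cases of Algorithm 1; the empty list yields the empty string.
func transStmts(stmts []Stmt) string {
	var b strings.Builder
	for _, s := range stmts {
		switch s := s.(type) {
		case Send:
			// send on the data channel, then synchronise with the monitor
			fmt.Fprintf(&b, "%s.in!0; %s.sending?state; ", s.Ch, s.Ch)
		case Recv:
			fmt.Fprintf(&b, "%s.in?0; ", s.Ch)
		case Close:
			fmt.Fprintf(&b, "%s.closing?state; ", s.Ch)
		case If:
			// both guards are true: a non-deterministic internal choice
			fmt.Fprintf(&b, "if :: true -> %s:: true -> %sfi; ", branch(s.Then), branch(s.Else))
		}
	}
	return b.String()
}

func main() {
	prog := []Stmt{If{Then: []Stmt{Send{"a"}}}, Close{"a"}}
	fmt.Println(transStmts(prog))
}
```

Because the guard of the conditional is dropped, the generated model may take either branch regardless of the value the Go program would compute, which is what makes the extracted model an over-approximation.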
Channel primitives
For each MiniGo channel we generate a custom Promela structure, called Chandef, which contains three channels: in carries the exchanged messages, while sending (resp. closing) is used to monitor send (resp. closing) actions. A send statement is translated to a send statement in Promela (on channel in), followed by a receive statement on the corresponding channel monitor (on the sending channel, see below). A receive statement is translated to a Promela receive (on channel in). A close statement is translated to a Promela receive on channel closing.

\[
{\downarrow}b = \begin{cases}
e_1 & \text{if } b \text{ is } v := e_1;\ v < e_2;\ v\texttt{++} \\
e_2 & \text{if } b \text{ is } v := e_1;\ v > e_2;\ v\texttt{--} \\
\bot & \text{otherwise}
\end{cases}
\qquad
{\uparrow}b = \begin{cases}
e_2 & \text{if } b \text{ is } v := e_1;\ v < e_2;\ v\texttt{++} \\
e_1 & \text{if } b \text{ is } v := e_1;\ v > e_2;\ v\texttt{--} \\
\bot & \text{otherwise}
\end{cases}
\]

\[
\mathit{lookup}_{\Delta}(b) = \begin{cases}
(\Delta, x, y) & \text{if } {\downarrow}b = \bot \text{ or } {\uparrow}b = \bot, \text{ with } x \text{ and } y \text{ fresh} \\
(\Delta', \Delta'({\downarrow}b), \Delta'({\uparrow}b)) & \text{otherwise, where } \Delta' = \Delta[e \mapsto x \mid e \in \{{\downarrow}b, {\uparrow}b\} \setminus \mathit{dom}(\Delta) \text{ with } x \text{ fresh}]
\end{cases}
\]

Figure 4: Auxiliary functions for Algorithm 1, where we assume Δ(n) = n for each integer n.

Conditionals
An if-then-else statement is translated to an if block in Promela. It behaves as a non-deterministic internal choice (true is an always-enabled guard). A select statement is translated to a non-deterministic choice, using an if block where each non-default branch is guarded by a send or receive action. Default branches are translated to a branch that is always available.

Control statements
A blocking function call is translated to Promela code that spawns an instantiation of the corresponding Promela process (using the run keyword), then waits for it to terminate by waiting on fresh channel ch. For spawning function calls, i.e., go id($\tilde{a}$), there are two cases. If the parameters include channels, the algorithm returns Promela code that spawns the corresponding process. Otherwise it omits the call entirely, as it will be checked in the model of an independent partition.

To translate for v := e; e; r { $\tilde{s}$ }, we need to first check (i) whether we can extract well-identified bounds that we consider as communication-related parameters and (ii) whether the loop contains (inter-procedurally) spawning function calls, i.e., spawns($\tilde{s}$) holds (whose straightforward definition is omitted). If spawns($\tilde{s}$) holds then the range of the loop needs to be finite. Hence, either we set up Promela variables (that the user will instantiate) to define a range, or we are able to identify a static range (integer literal bounds). When the loop does not spawn new threads, we only use a finite Promela for-loop if all involved variables have been seen before (Δ == Δ', see below). In all other cases, we use a non-deterministic loop using a Promela do block which can be exited at any iteration with a break operation.

We keep track of re-usable communication-related parameters with the map Δ and use the function lookup, defined in Figure 4, to query it. Functions ↓(b) and ↑(b) respectively return the lower and upper bounds of for-loop control statement b. When the control statements of a for-loop are well-formed (they obey a recognisable pattern, i.e., ↓b ≠ ⊥ ∧ ↑b ≠ ⊥), the lookup function returns a new map Δ' and the lower (Δ'(↓b)) and upper (Δ'(↑b)) bounds of the for-loop. Map Δ' augments Δ with mappings from newly identified expressions to fresh Promela variables.

Channel creation
A channel creation statement is translated to the instantiation of a Chandef structure and the spawning of its chanmonitor process. We initialise the in channel of the Chandef structure with a capacity corresponding to its MiniGo equivalent. If the capacity is not an integer literal, the lookup function ensures that we either re-use Promela variables, or generate fresh ones. Note that channels sending and closing in Chandef are always synchronous.

[Figure: automaton of the channel monitor, with states open, closed, and error, and transitions labelled ch.sending!, ch.closing!, and ch.in!.]

Channel monitors
To detect channel safety errors, we keep track of the state of MiniGo channels. We use Promela processes to monitor channel actions, i.e., send, receive, and close. As an optimisation, we only create such monitors when a close primitive appears in the program. Processes corresponding to goroutines interact with channel monitors via an instance ch of a Chandef structure which contains three channels (in, sending, and closing). The automaton with states open, closed, and error represents the behaviour of this monitor (chanMonitor). Figure 5 (lines 26-47) gives the corresponding Promela code. In MiniGo and Go, when a channel is closed, sending on it or closing it will raise an error, hence the transitions from state closed to state error in the automaton. Also, receive actions on closed channels always succeed, hence the self-loop on state closed.

    #define len_files_0 15

    typedef Chandef {
      chan in = [0] of { int };
      chan sending = [0] of { int };
      chan closing = [0] of { bool };
    }
    init {
      chan _ch0_in = [len_files_0] of { int };
      Chandef _ch0;
      _ch0.in = _ch0_in;
      run chanMonitor(_ch0);
      for (i : 1 .. len_files_0) {
        run mainworker(_ch0)
      };
      for (i : 1 .. len_files_0) {
        _ch0.in?0
      };
    }

    proctype mainworker(Chandef a) {
      a.in!0;
      a.sending?0
    }

    proctype chanMonitor(Chandef ch) {
      bool open = true;
      do
      :: true ->
        if
        :: open ->
          if
          :: ch.sending!open;
          :: ch.closing!true ->
            open = false
          fi
        :: else ->
          if
          :: ch.sending!open ->
            assert(false)
          :: ch.closing!true ->
            assert(false)
          :: ch.in!0;
          fi
        fi
      od
    }

Figure 5: Model extracted from Figure 1 with Algorithm 1

Example 1.
The model extracted from Figure 1 with Algorithm 1 is given in Figure 5. Lines 8-20 contain the translation of the main function. Note that a chanMonitor is spawned at line 12. The definition of this process is given in lines 26-47. The translation of the worker function is in lines 21-24.

Figure 5 contains one (unknown) communication-related parameter: len(files), which is used in the capacity of channel a, and in the loops at lines 4 and 7 of Figure 1. Observe that the communication-related parameter len(files) is set to 15 here (see line 1). Our implementation also allows users to specify such parameters as program (command line) arguments. Expression len(files) is first seen by Algorithm 1 in a channel creation statement. At this point, it adds len(files) ↦ 'len_files_0' to map Δ. When the algorithm processes both loops, it invokes lookup_Δ(i := 0; i < len(files); i++), which re-uses the variable len_files_0 already recorded in Δ.

We have implemented our approach in a tool called GOMELA [4]. Given a Go program, our tool extracts Promela models (as described in Section 3). If necessary, the user enters values (bounds) for the statically unknown communication-related parameters produced by Algorithm 1 (i.e., the Promela variables). For instance, the user provides a bound to instantiate len(files) in Figure 1. This value is then used as a bound in the for-loops at lines 4 and 7 as well as the capacity of channel a.

Table 1: Go programs verified by GOMELA and comparison with Godel [8]. All programs are available online [4].
Columns 1-5 are results for GOMELA, Columns 6-8 for Godel, and Columns 9-10 for the manual analysis. Cells whose content did not survive are left empty.

| # | Programs & Partitions | States (1) | CS (2) | GD (3) | Infer (ms) (4) | Spin (ms) (5) | Time (6) | ψs (7) | ψg (8) | CS (9) | GD (10) |
| 1 | (|files|=15) | 376880 | ✓ | ✓ | 85 | 1666 | ⊥ | ⊥ | ⊥ | ✓ | ✓ |
| 2 | (|files|=15) | 109 | ✓ | ✗ | 86 | 1057 | ⊥ | ⊥ | ⊥ | ✓ | ✗ |
| 3 | (|files|=15) | 110 | ✓ | ✗ | 85 | 1027 | ⊥ | ⊥ | ⊥ | ✓ | ✗ |
| 4 | (k=5, n=10, m=10) | 4802785 | ✓ | ✓ | 85 | 31239 | ⊥ | ⊥ | ⊥ | ✓ | ✓ |
| 5 | | | ✓ | ✓ | 956 | 1391 | 460 | ✓ | ✓ | ✓ | ✓ |
| 6 | | | | | | | | ✓ | ✓ | - | - |
| | - ConcurrentSearch() | | ✓ | ✓ | - | 1126 | - | - | - | ✓ | ✓ |
| | - ConcurrentSearchWC() | | ✓ | ✗ | - | 1335 | - | - | - | ✓ | ✗ |
| | - First() | | ✓ | ✓ | - | 1183 | - | - | - | ✓ | ✓ |
| | - ReplicaSearch() | | ✓ | ✗ | - | 1326 | - | - | - | ✓ | ✗ |
| | - SequentialSearch() | | ✓ | ✓ | - | 1073 | - | - | - | ✓ | ✓ |
| | - FakeSearch() | | ✓ | ✓ | - | 1083 | - | - | - | ✓ | ✓ |
| | - main() | | ✓ | ✓ | - | 1068 | - | - | - | ✓ | ✓ |
| 7 | | | ✓ | ✓ | 84 | 1278 | 623 | ✓ | ✓ | ✓ | ✓ |
| 8 | | | ✓ | ✓ | 804 | 1478 | 859 | ✓ | ✓ | ✓ | ✓ |
| 9 | | | ✓ | ✓ | 835 | 1801 | 8681 | ✓ | ✓ | ✓ | ✓ |
| 10 | double-close | 18 | ✗ | - | 85 | 1486 | 463 | ✗ | ✓ | ✗ | ✓ |
| 11 | fanin-alt | 34 | ✓ | ✗ | 769 | 1686 | 786 | ✓ | ✓ | ✓ | ✗ |
| 12 | fanin | 15 | ✓ | ✓ | 681 | 1495 | 621 | ✓ | ✓ | ✓ | ✓ |
| 13 | fixed | 14 | - | - | 740 | 2379 | 656 | ✓ | ✓ | - | - |
| | - Work() | | ✓ | ✓ | - | 1117 | - | - | - | ✓ | ✓ |
| | - main() | | ✓ | ✓ | - | 1262 | - | - | - | ✓ | ✓ |
| 14 | forselect | 21 | ✓ | ✓ | 755 | 1507 | 813 | ✓ | ✓ | ✓ | ✓ |
| 15 | jobsched | 48 | - | - | 781 | 2745 | 589 | ✓ | ✓ | - | - |
| | - main() | | ✓ | ✓ | - | 1532 | - | - | - | ✓ | ✓ |
| | - morejob() | | ✓ | ✓ | - | 1213 | - | - | - | ✓ | ✓ |
| 16 | mismatch | 12 | - | - | 714 | 2316 | 603 | ✓ | ✗ | - | - |
| | - Work() | | ✓ | ✓ | - | 1044 | - | - | - | ✓ | ✓ |
| | - main() | | ✓ | ✗ | - | 1272 | - | - | - | ✓ | ✗ |
| 17 | sel | 21 | ✓ | ✗ | 84 | 1719 | 326 | ✓ | ✗ | ✓ | ✗ |
| 18 | selFixed | 19 | ✓ | ✓ | 82 | 1330 | 572 | ✓ | ✓ | ✓ | ✓ |
| 19 | philo | 18 | ✓ | ✗ | 85 | 1362 | 537 | ✓ | ✗ | ✓ | ✗ |
| 20 | starvephil | 67 | ✓ | ✗ | 759 | 1343 | 836 | ✓ | ✗ | ✓ | ✗ |
| 21 | nonlive | 7 | ✓ | ✓ | 850 | 1194 | 550 | ✓ | ✓ | ✓ | ✓ |
| 22 | nonlive v1 | 7 | ✓ | ✗† | 850 | 1152 | 366 | ✓ | ✗† | ✓ | ✓ |
| 23 | prod-cons | 61 | ✓ | ✓ | 87 | 1390 | 508 | ✓ | ✓ | ✓ | ✓ |
| 24 | prod3-cons3 | 5746 | ✓ | ✓ | 643 | 1534 | 19963 | ✓ | ✓ | ✓ | ✓ |
| 25 | prodconsclose | 12185424 | ✓ | ✓ | 163 | 18672 | 25348 | ✓ | ✓ | ✓ | ✓ |
| 26 | stuckmsg | 5 | ✓ | ✓ | 86 | 1109 | 790 | ✓ | ✓ | ✓ | ✓ |
| 27 | data-dependent | 12 | ✓ | ✗† | 86 | 1170 | 694 | ✓ | ✗† | ✓ | ✓ |

GOMELA uses Spin to check whether each model is free from channel safety errors and global deadlocks. Spin reports any global deadlock and any trace that leads to an assert(false) statement. We use the former to check for global deadlocks (GD) in a program's partitions, and the latter to check for channel safety errors (CS). Spin detects when the main process (init) terminates while another process is still running. We use this to detect some goroutine leaks, i.e., a particular case of partial deadlocks.

In addition to MiniGo statements, GOMELA deals with constants (used as communication-related parameters), anonymous functions, break statements, for range loops and switch statements. Occurrences of integer constants are replaced with their actual values. To deal with anonymous functions, GOMELA generates the corresponding Promela process declaration (using a fresh name) and its (unique) invocation. Go's break statements are translated as Promela break statements.
For-range loops (for range list { $\tilde{s}$ }) are treated as for-loops with control statements of the form i := 0; i < len(list); i++. Finally, switch statements are translated into n-ary internal choices (similar to an if-then-else).

Evaluation
To evaluate our tool, we ran it on several benchmarks, including some from [8, Table 1]. The results of this evaluation are in Table 1. In Table 1, Column 1 gives the number of states in the model, as given by Spin. In Column 2 (resp. 3) a ✓-mark says that the corresponding program partition is channel-safe (resp. free of global deadlock); a ✗-mark says that the property is violated. Column 4 (resp. 5) shows the time (milliseconds) taken to extract (resp. verify) the Promela model of a partition. The timings for programs are the sum of all of their partitions. Column 6 shows the time (milliseconds) taken by Godel [8] to verify channel safety (ψs, Column 7) and global deadlock (ψg, Column 8) properties. A ⊥-mark means that Godel does not support this program. A ✓-mark in Column 9 (resp. 10) says that it was not possible to find any channel errors (resp. global deadlocks) manually; ✗† highlights false alarms.

In Table 1, Program 1 is the example in Figure 1 where the user has set len(files) to 15. Programs 2 and 3 are variations of Program 1 with a global deadlock and a goroutine leak, respectively. Program 4 spawns n producers and m consumers that interact (repeatedly) over a channel with capacity k. Observe that these are not supported by Godel because they include thread-spawning in a for-loop. Programs 5 to 26 are taken from [8, Table 1]. We note that Godel runs marginally faster than GOMELA on programs with small models. This is due to Spin's start-up time. In contrast, GOMELA performs much better than Godel on, e.g., Programs 9, 24, and 25.
Thissuggests that our tool scales better than Godel on larger code-bases.Below we comment on the errors identified in the programs from Table 1.• Program 6 has two partitions that contain global deadlocks. In ConcurrentSearchWC , a func-tion spawns three goroutines that send a message on a shared channel c . That function mayterminate silently (via timeout) hence leaving three unmatched send actions in the goroutines. ReplicaSearch includes a similar pattern where the parent thread may terminate while leavingsome goroutines permanently blocked.• Program 10 contains a channel safety error where a channel is closed twice by two differentthreads. We note that Spin aborts the verification as soon as it finds such errors.• Program 11 creates two producers and one consumer. The consumer may terminate silently inwhich case both producers are blocked permanently. Program 12 is similar to Program 11 exceptthat the consumer never terminates, thus fixing the bug.• Program 16 contains two partitions: Work consist of a non-terminating loop (without communica-tions), while main contains a global deadlock due to a mismatch between the number of send andreceive actions. Note that if we were modelling this program with one monolithic partition, wewould not detect a global deadlock because Work never blocks.• In Program 17 each goroutine may get stuck because of a mismatch between the number of sendand receive actions.• Program 19 is an encoding of the (starving) philosopher problem using only two philosophers,taken from [14]. Program 20 is another encoding of the (starving) dining philosophers from [8].• Programs 21 and 22 are two variants of a program that contain two goroutines: one is waitingfor a message, while the other contains a non-terminating for-loop. In Program 21, the non-terminating for-loop is followed by a (unreachable) matching send action. These programs showthe limits of the behavioural types approach: they contain proper partial deadlocks. 
The problemin Programs 21 is only detected in Godel by using an additional termination checker.• Program 27 is given in Figure 6. This program gives a typical example of a false alarm raisedby our tool, and any existing approach based on behavioural types. Because if-then-else state-ments are translated to non-deterministic choices, our approach is unable to determine that the twoconditional blocks are “synchronised” by the same invocation of function f() . Limitations Our approach is applicable to MiniGo , extended with the syntactic constructs discussedabove. Several key limitations need to be tackled to address the full Go language. We assume thatvariables are immutable, as a consequence we cannot soundly analyse programs that, e.g., mutate a list . Dilley & J. Lange func main () { a := make ( chan int ) go send (a) if f () { // decl . of f () elided a <- 0 } else {} } func send (a chan int ) { if f () { <-a } else {} } /* f () is a deterministic functionwithout side - effects */ Figure 6: Data-dependent choice (Program 27). files in between using len(files) as a communication-related parameter. Go has object oriented-like features, such as structs, methods, and interfaces which we currently do not support. Virtual methodcalls (on interfaces) are particularly difficult to model. As in [7, 8, 10, 12, 14], we do not support channelpassing (since we abstract away the data sent over channels). We note that our empirical survey [3] foundthat only 6% of projects used channels that carry channels. Related work Spin and Promela have been used extensively in software verification. Notably JavaPathFinder [5] translates Java programs to Promela models which are then verified for deadlocks andviolations of user-provided assertions. Also, Zaks and Joshi [19] use Spin to verify multi-threaded Cprograms using their LLVM representation and custom virtual machine.Several works focus on the verification of message-passing concurrency in Go [7, 8, 10, 12, 14,15, 16]. 
Four papers studied static verification using behavioural models. Ng and Yoshida [12] proposed dingo-hunter, the first static global deadlock detection tool for Go. It relies on communicating finite-state machines [1] and multiparty compatibility [9]. Their work does not support asynchronous channels nor programs that spawn goroutines or create channels in loops or conditionals. Stadtmüller et al. introduced gopherlyzer [14], which detects global deadlocks using forkable regular expressions. This work does not support channel closures, asynchronous channels, nor goroutines spawned in loops. Lange et al. [7, 8] proposed Gong and Godel, whose approach serves as a basis for this work. Gong uses an ad-hoc checker which supports bounded verification of infinite-state models, but did not scale well. Instead, Godel uses mCRL2 [2] as a back-end. Because mCRL2’s communication model is very different from Go’s, the encoding from behavioural types to mCRL2’s language is very intricate, see [6]. Promela is a more convenient language for this purpose, but because Spin supports only LTL, while mCRL2 supports the µ-calculus, it is not possible to check the liveness property specified in [7, 8]. However, we can still identify some goroutine leaks by checking whether their corresponding models reach their end states.

Conclusions. Our work builds on the approach in [8] and improves it to support statically unknown communication-related parameters via a bounded analysis. Our approach allows us to support programs that spawn a parameterised number of goroutines or create channels with a parameterised capacity. Our evaluation shows that our tool scales well and produces models that can be easily understood and adjusted by programmers.

Future work. Our short-term plans are to support additional concurrency-related Go features, e.g., barriers (WaitGroup).
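To illustrate the kind of barrier mentioned above, the following is a minimal sketch of a WaitGroup used together with a parameterised number of goroutines and a channel whose capacity depends on the same parameter. It is not taken from the benchmark programs; the function name sumWithBarrier and its structure are assumptions made for this example only.

```go
package main

import (
	"fmt"
	"sync"
)

// sumWithBarrier is a hypothetical example: it spawns n worker goroutines
// (n is only known at run time), each sending its index on a channel of
// capacity n. A sync.WaitGroup acts as a barrier, so the channel is closed
// only after every worker has returned.
func sumWithBarrier(n int) int {
	results := make(chan int, n) // capacity n: no worker blocks on its send
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			results <- id
		}(i)
	}

	wg.Wait()      // barrier: all sends have completed
	close(results) // closed exactly once, after the last send

	total := 0
	for v := range results { // drains the buffered values, then stops
		total += v
	}
	return total
}

func main() {
	fmt.Println(sumWithBarrier(4)) // 0+1+2+3
}
```

In this sketch the barrier is what orders the close after every send. An analysis that abstracts the WaitGroup away would presumably lose that ordering and could report a send on a closed channel that cannot occur in practice.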
We will also improve our algorithms to support more complex for-loop control statements, and to perform a fully inter-procedural analysis of communication-related parameters. In the longer term, we plan to use our tool to detect concurrency errors and suggest repairs for large code-bases. We plan to perform a large-scale empirical evaluation of this toolchain on the dataset identified in [3].

References

[1] Daniel Brand & Pitro Zafiropulo (1983): On Communicating Finite-State Machines. J. ACM
[2] An Overview of the mCRL2 Toolset and Its Recent Advances. In: TACAS, Lecture Notes in Computer Science
[3] An Empirical Study of Messaging Passing Concurrency in Go Projects. In: SANER, IEEE, pp. 377–387, doi:10.1109/SANER.2019.8668036.
[4] Nicolas Dilley & Julien Lange (2020): Gomela. http://github.com/nicolasdilley/gomela.
[5] Klaus Havelund & Thomas Pressburger (2000): Model Checking JAVA Programs using JAVA PathFinder. STTT
[6] Godel Checker. https://bitbucket.org/MobilityReadingGroup/godel-checker.
[7] Julien Lange, Nicholas Ng, Bernardo Toninho & Nobuko Yoshida (2017): Fencing off Go: liveness and safety for channel-based programming. In: POPL, ACM, pp. 748–761, doi:10.1145/3009837.3009847.
[8] Julien Lange, Nicholas Ng, Bernardo Toninho & Nobuko Yoshida (2018): A static verification framework for message passing in Go using behavioural types. In: ICSE, ACM, pp. 1137–1148, doi:10.1145/3180155.3180157.
[9] Julien Lange, Emilio Tuosto & Nobuko Yoshida (2015): From Communicating Machines to Graphical Choreographies. In: POPL, ACM, pp. 221–232, doi:10.1145/2676726.2676964.
[10] Jan Midtgaard, Flemming Nielson & Hanne Riis Nielson (2018): Process-Local Static Analysis of Synchronous Processes. In: SAS, Lecture Notes in Computer Science
[11] Lectures on a calculus for communicating systems. In: International Conference on Concurrency, Springer, pp. 197–220, doi:10.1007/3-540-15670-4_10.
[12] Nicholas Ng & Nobuko Yoshida (2016): Static deadlock detection for concurrent Go by global session graph synthesis. In: CC, ACM, pp. 174–184, doi:10.1145/2892208.2892232.
[13] Rob Pike (2015): Go Proverbs.
[14] Kai Stadtmüller, Martin Sulzmann & Peter Thiemann (2016): Static Trace-Based Deadlock Analysis for Synchronous Mini-Go. In: APLAS, Lecture Notes in Computer Science
[15] Trace-Based Run-Time Analysis of Message-Passing Go Programs. In: Haifa Verification Conference, Lecture Notes in Computer Science
[16] Two-Phase Dynamic Analysis of Message-Passing Go Programs Based on Vector Clocks. In: PPDP, ACM, pp. 22:1–22:13, doi:10.1145/3236950.3236959.
[17] Go Survey Results. blog.golang.org/survey[year]-results.
[18] Tengfei Tu, Xiaoyu Liu, Linhai Song & Yiying Zhang (2019): Understanding Real-World Concurrency Bugs in Go. In: ASPLOS, ACM, pp. 865–878, doi:10.1145/3297858.3304069.
[19] Anna Zaks & Rajeev Joshi (2008): Verifying Multi-threaded C Programs with SPIN. In: