Workflow Management on BFT Blockchains

Abstract

Blockchain technology has been proposed as a new infrastructure technology for a wide variety of novel applications. Blockchains provide an immutable record of transactions, making them useful when business actors do not trust each other. Their distributed nature makes them suitable for inter-organizational applications. However, proof-of-work based blockchains are computationally inefficient and do not provide final consensus, although they scale well to large networks. In contrast, blockchains built around Byzantine Fault Tolerance (BFT) algorithms are more efficient and provide immediate and final consensus, but do not scale well to large networks. We argue that this makes them well-suited for workflow management applications that typically include no more than a few dozen participants but require final consensus. In this paper, we discuss architectural options and present a prototype implementation of a BFT-blockchain-based workflow management system (WfMS).


SSMaRt Blockchain Distributed Workflow Management

Joerg Evermann and Henry Kim
Memorial University of Newfoundland, St. John's, Canada
York University, Toronto, Canada


Keywords:

Byzantine fault tolerance · blockchain · workflow management · interorganizational workflow · distributed workflow

Inter-enterprise business processes may include stakeholders in adversarial relationships who nonetheless have to jointly complete process instances. Trust in the current state of a process instance and in the correct execution of activities by other stakeholders may be lacking. Blockchain technology can help in such situations by providing a trusted, distributed workflow execution infrastructure.

A blockchain cryptographically signs a series of blocks, containing transactions, so that it is difficult or impossible to alter earlier blocks in the chain. In a distributed blockchain, actors independently validate transactions, add them to the blockchain, and replicate the chain across different nodes. The independent and distributed nature of actors requires finding a consensus regarding the validity and order of transactions and blocks. In workflow execution, it is important that actors agree on the "state of work" as this determines the set of next valid activities in the process. Hence, it is natural to use blockchain transactions to describe workflow activities or workflow states.

In contrast to prior work, which has focused on transaction ordering on proof-of-work blockchains, we examine the use of consensus protocols based on algorithms for Byzantine Fault Tolerance (BFT). Furthermore, we explore the architecture of a blockchain-based WfMS without smart contracts. We motivate both of these choices later in the paper. Even without the use of smart contracts, the blockchain remains essential as it provides independent validation of workflow activities, distribution, replication, and tamper-proofing to workflow execution.

Blockchain technology admits many different system designs, and WfMS can be implemented in many different ways on blockchain infrastructure. In this paper, we focus on the interface between the blockchain and the workflow engine and the architectural options available for the design of the system.

Contributions

We describe a prototype WfMS as a proof-of-concept implementation for an architecture that has not yet seen attention in the literature. First, in contrast to earlier work (Sec. 2) we do not use smart contracts to implement model-specific workflow engines. We show that generic or existing workflow engines can be readily adapted to fit onto a blockchain infrastructure and that smart contracts are not required. Second, as recommended, but not implemented by [22], we show how a BFT-based blockchain can be used as workflow management infrastructure. We describe the implementation of a blockchain-based WfMS that has served as our tool to investigate design choices, problems and solutions in this research area. While our prototype is an important demonstration of feasibility, our main contribution is in the identification and discussion of the different architectural choices, and highlighting the existence and feasibility of alternatives to smart contracts on proof-of-work blockchains in this research area.

The remainder of the paper is structured as follows. Section 2 reviews related work on blockchain-based WfMS. We then describe the principles of distributed blockchains with a focus on BFT-based consensus (Sec. 3). Section 4 describes the architecture of our system and discusses design choices. Section 5 presents our prototype implementation. The final Sec. 6 discusses implications of BFT-based blockchain technology for WfMS and an outlook to future work.

This section discusses existing work in two research areas. The first subsection focuses on blockchain technology applied to workflow management; the second subsection focuses on blockchains that apply a BFT ordering mechanism.

Blockchain-based workflow management has only recently received research attention [15]. The main research challenges concern the integration of blockchain infrastructure into WfMS and ensuring correctness and security of the workflow execution [15]. A number of prototype implementations have been presented, focusing on the use of "smart contracts". A smart contract is a software application that is recorded and executed on the blockchain. This application "listens" for relevant transactions sent to it and executes application logic upon receipt of a transaction. For example, the widely used Ethereum blockchain has a Turing-complete virtual machine (VM) for smart contracts (https://ethereum.github.io/yellowpaper/paper.pdf) and compilers for different programming languages.

In a project driven by a financial institution, a prototype workflow implementation using smart contracts on the Ethereum blockchain offers digital document flow in the import/export domain [8,7]. The project demonstrates significantly lowered process cost, as well as increased transparency and trust among trading partners.

A blockchain-based workflow project in the real-estate domain [13], also using the Ethereum blockchain and smart contracts, notes that the decentralized nature of blockchains and the lack of a central agency will make it difficult for regulators to enforce obligations and responsibilities of trading partners.

A complete WfMS, including collaborative workflow modelling and model instantiation, uses models as contracts between collaborators [10]. The system allows distributed, versioned modelling of private and public workflows, consensus building on versions to be instantiated, and tracking of instance states on the blockchain. The blockchain provides integrity assurance for models and instance states. The authors note that the usefulness of the approach is limited by block size limits on the blockchain and the latency of new blocks [10].

Another implementation of blockchain-based workflow execution [24,25] uses smart contracts on the Ethereum blockchain either as a choreography monitor, where the smart contract monitors execution status and validity of workflow messages against a process model, or as an active mediator, where the smart contract "drives" the process by sending and receiving messages according to a process model. BPMN models are translated into smart contracts. Local Ethereum nodes monitor the blockchain for relevant messages from the smart contract and create messages for the smart contract. Transaction cost and latency are recognized as important considerations in the evaluation of the approach. A comparison between the public Ethereum blockchain and the Amazon Simple Workflow Service cloud-based environment shows blockchain-based costs to be two orders of magnitude higher than those of a traditional infrastructure [18]. Hence, optimizing the space and computational requirements for smart contracts is important [9]. BPMN models are first translated to Petri nets, for which minimizing algorithms are known. The minimized Petri nets are then compiled into smart contracts, achieving up to 25% reduction in transaction cost [24,25], while also significantly improving the throughput time.

Building on lessons learned from [24,25], Caterpillar is an open-source blockchain-based business process management system [14]. Developed in Node.js, it uses standard Ethereum tools, like the Solidity compiler solc and the Ethereum client geth, to provide a distributed execution environment for BPMN-based process models. Lorikeet is a similar system [6], also based on BPMN models that are translated to smart contracts for the Ethereum chain.
Also working with Ethereum and Solidity, [21] present a system that focuses on resource management in addition to control flow considerations and extends the smart contracts to manage a variety of resource allocation patterns.

The replicated nature of blockchains means that information is available to all participants. One approach to address this privacy issue in the context of workflow management is the use of access control lists and their enforcement in smart contracts [17].

After examining different blockchain consensus mechanisms in terms of termination time and fault tolerance, BFT-based consensus is recommended for business process executions [22].

Solving the ordering and consensus problems not with expensive proof-of-work approaches, but with efficient and provably correct and live algorithms, is an important motivator for many recent blockchain projects. The Hyperledger project of the Linux Foundation is the umbrella for a number of BFT-based blockchain implementations at various stages of maturity. Hyperledger Burrow is a blockchain that can execute Ethereum virtual machine code but is based on the Tendermint (https://tendermint.com/) BFT-based consensus algorithm. Hyperledger Iroha is based on "YAC", a proprietary BFT-based consensus protocol, but does not provide smart contracts. Hyperledger Indy is a blockchain implementation for decentralized identity management, based on redundant byzantine fault tolerance (RBFT) [2]. Hyperledger Fabric is a generic blockchain implementation that provides smart contracts, called "chaincode", which can be written in Go or Node.js. Early implementations used the BFT-SMART ordering protocol [19], while recent versions have moved to the simpler, crash-fault tolerant (CFT) Raft algorithm [16].

A blockchain records transactions in contiguous blocks. A transaction can be any kind of content. Information integrity is maintained by applying a hash function to the content of each block, which also contains the hash of the previous block in the chain. Hence, altering a block requires changing all following blocks. In a typical blockchain, nodes are connected using a peer-to-peer network topology. New transactions may originate on any peer and must be recorded in new blocks. Blocks are generally distributed to each peer for independent validation and replicated storage. The key challenge is to achieve a consensus on the validity and order of transactions and blocks, despite peers that are characterized by "byzantine faults": they may not respond correctly, may respond unpredictably, or may become altogether unresponsive.

Blockchains may be either public or permissioned ("consortium"). Public blockchains typically have no access control or identity management. Hence, no node can be assumed to be trustworthy. In contrast, a permissioned blockchain has access controls, node operators are generally known and invited to participate, and (some) node operators may be implicitly trusted. The distinction between public and permissioned is not binary, but a continuum [22].

Public chains are typically created to serve a large number of anonymous participants. Their advantages include anonymity, universal access, and generally a high trustworthiness as a large number of nodes provide independent transaction validation. On the other hand, public chains require incentives for validation, often in the form of a cryptocurrency, which increases transaction costs. Public chains also provide little flexibility to adapt to special use cases.

In contrast, permissioned chains are typically created for a specific use case with a small number of known institutional participants. Advantages of permissioned chains include low transaction costs, high flexibility to adapt to special use cases, identifiability of transaction originators, and access controls. Disadvantages may include relatively lower trustworthiness due to the smaller number of validating nodes.

Workflow management is typically the domain of a small number of institutional collaborators, rather than a large number of anonymous participants. As such, it is a good fit with permissioned blockchains.

While the blockchain technology used for public blockchains may also be used for permissioned blockchains, the different characteristics of the latter may permit or favour the use of technology options that would not be suitable for public blockchains, such as communication-intensive BFT-based systems.
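
The hash chaining described earlier in this section (each block stores the hash of its predecessor, so altering a block invalidates every later one) can be illustrated with a minimal sketch. The block layout, the use of SHA-256 over a JSON encoding, and the names Block, build_chain and first_broken_link are assumptions made for illustration only; they are not taken from the prototype described in this paper.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

GENESIS_PREV = "0" * 64  # assumed placeholder meaning "no previous block"


@dataclass(frozen=True)
class Block:
    number: int
    prev_hash: str
    transactions: Tuple[str, ...]

    def digest(self) -> str:
        """SHA-256 over the block content, which includes the previous hash."""
        payload = json.dumps(
            {
                "number": self.number,
                "prev_hash": self.prev_hash,
                "transactions": list(self.transactions),
            },
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def build_chain(batches: Sequence[Sequence[str]]) -> List[Block]:
    """Create one block per batch of transactions, linking each to its predecessor."""
    chain: List[Block] = []
    prev = GENESIS_PREV
    for number, txs in enumerate(batches):
        block = Block(number, prev, tuple(txs))
        chain.append(block)
        prev = block.digest()
    return chain


def first_broken_link(chain: Sequence[Block]) -> Optional[int]:
    """Return the index of the first block whose prev_hash does not match, else None."""
    prev = GENESIS_PREV
    for index, block in enumerate(chain):
        if block.prev_hash != prev:
            return index
        prev = block.digest()
    return None
```

In this sketch, replacing the content of block 0 without recomputing block 1 is detected at block 1, because block 1 still carries the hash of the original block 0.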

Smart contracts allow code execution as part of transactions on the blockchain. Advantages include code integrity, as code is part of the blockchain, and a tight integration of application logic with transaction validation. Disadvantages may be limitations of the smart contract language instruction set and the need to re-develop existing application logic.

In contrast, implementing application logic off-chain means that existing applications do not need to be ported, and developers have access to familiar programming languages, code libraries and development tools. On the other hand, transaction validation must call back to the application logic.

Smart contracts ensure that all nodes provide the same validation results, whereas performing validation in off-chain logic places the onus on the developers to ensure identical results for all nodes. On the other hand, it allows developers to develop against a behavioural specification without specifying the exact algorithms or implementation to be used. For the WfMS case in this article, that means that transparency is lost about the specific details of the workflow implementation, but what is gained is that different workflow systems can interoperate as long as all obey the same workflow semantics.

Smart contracts have great potential in the context of workflow management, as witnessed by the Caterpillar and Lorikeet approaches [6]. However, neither Caterpillar nor Lorikeet provides a BPMN-based generic workflow engine as a smart contract. Instead, both systems compile individual BPMN models to model-specific smart contracts. Given the extensive investment in WfMS by researchers and practitioners, we believe that investigating how standard WfMS can be implemented on blockchain infrastructure without re-implementation in smart contract languages is worthwhile.

Bitcoin popularized the proof-of-work mechanism for consensus finding and securing the blockchain. New transactions are distributed to all peers, validated and added to a transaction pool. Validation is based on transactions that exist in the chain as well as others already in the transaction pool. Each peer can independently propose new blocks based on its latest block and distribute these to other peers. Depending on network connectivity, speeds, and topology, each peer may have a different set of blocks and transactions, and hence may propose different blocks, leading to side branches. Each peer considers the longest branch as the current main branch and proposes new blocks based on this. Transactions in side branches are not considered valid and are not considered when validating new transactions or blocks. When a side branch becomes longer than the current main branch, the chain undergoes a reorganization. What was the side branch is validated and becomes the main branch. What was the main branch is considered invalid and becomes a side branch. Transactions no longer in the main branch are added back to the transaction pool to be included in other blocks. As a consequence, different peers can at times consider different blocks and transactions as valid. As proposed blocks are distributed across the network, peers will eventually converge on a consensus regarding the valid blocks and transactions and their order in the main branch of the chain.

To limit the rate of new block proposals and to secure the blockchain against attacks, proof-of-work consensus requires block proposers to solve a hard problem ("proof-of-work", "mining"). Typically, this is to require the block hash to be less than a certain value. A limited block rate allows nodes to achieve eventual consensus, and a hard problem prevents attackers from "overtaking" the creation of legitimate blocks with fraudulent ones. Assuming equal processing power for each node, the network needs 2f + 1 total nodes to tolerate f faulty or malicious nodes.

The probability that a transaction in the main branch of the blockchain becomes invalid decreases with each block that is "mined" on top of it, although in principle it is always possible that a block becomes invalidated. Blockchain communities use rules of thumb for the number of additional blocks that is considered to make a transaction "safe" enough to act on. In addition to the lack of finality of consensus, this approach induces significant latency as applications must wait not only for one block but for many to be created. Furthermore, applications must actively monitor the status of all transactions of interest, must react to chain reorganizations, and communicate these aspects to the user.
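
The "hard problem" described above, finding a block whose hash is below a certain value, can be sketched as a brute-force search over a nonce. This is an illustrative sketch only: the string encoding of the block, the use of a single SHA-256 pass, and the difficulty_bits parameter are assumptions and do not reproduce the block format or difficulty encoding of Bitcoin or any other specific chain.

```python
import hashlib


def block_hash(prev_hash: str, payload: str, nonce: int) -> int:
    """Interpret the SHA-256 digest of the block fields as a 256-bit integer."""
    data = f"{prev_hash}|{payload}|{nonce}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def target_for(difficulty_bits: int) -> int:
    """A hash is acceptable if it is below this value (more bits = harder)."""
    return 1 << (256 - difficulty_bits)


def mine(prev_hash: str, payload: str, difficulty_bits: int) -> int:
    """Try nonces 0, 1, 2, ... until the block hash falls below the target."""
    target = target_for(difficulty_bits)
    nonce = 0
    while block_hash(prev_hash, payload, nonce) >= target:
        nonce += 1
    return nonce


def is_valid(prev_hash: str, payload: str, nonce: int, difficulty_bits: int) -> bool:
    """Verification needs a single hash, while mining needs many attempts."""
    return block_hash(prev_hash, payload, nonce) < target_for(difficulty_bits)
```

The asymmetry between mine and is_valid is the point of the mechanism: proposing a block is expensive, which limits the block rate, while every peer can check a proposed block cheaply.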

In response to the drawbacks of proof-of-work consensus, i.e. latencies, no finality of consensus, and required processing power, provably correct ordering algorithms, based on distributed systems research, have seen a resurgence in interest. Most of the ongoing research can be traced back to a practical method for achieving byzantine fault tolerance (PBFT) [5]. PBFT orders client requests using a set of nodes that are fully connected by reliable messaging. Every ordering consensus is established by a specific set of nodes ("view"), with a leader or primary node. Tolerating up to f faulty nodes requires 3f + 1 total nodes.

Protocol

PBFT is a three-stage protocol. A client sends a request to all nodes. The leader proposes a sequence number for the request and broadcasts a pre-prepare message. Upon receipt of a pre-prepare message, a node broadcasts a corresponding prepare message if it has itself received the request, has not already received another pre-prepare message for the same sequence number, and is in the current view. This indicates the node is prepared to accept the proposed sequence number. Nodes then wait to receive 2f matching prepare messages, indicating that 2f + 1 nodes are prepared to accept the proposed sequence number for the request. When a node has received 2f identical prepare messages, it broadcasts a commit message to all nodes. Each node then waits to receive 2f identical commit messages, indicating that 2f + 1 nodes have accepted the proposed sequence number for the request. Upon committing, the node executes the request and sends a reply message to the client. The client in turn waits for 2f + 1 identical replies, which indicates that a consensus has been reached on the sequence number of the request.

In case the leader fails to propose a sequence number, nodes first forward requests to the leader. When the leader continues failing to act on requests or proposes sequence numbers that are too high or too low, nodes trigger a view change. The view change uses a three-stage protocol similar to that of normal operation to determine a new leader.

Consensus about request sequencing is closely related to state machine replication (SMR). Each node maintains a state that can be changed by client requests. When every node begins with the same state and executes requests in the same order, the state machine is replicated.

BFT-SMART

BFT-SMART [4] is a software library built around the PBFT ordering protocol. It adds dynamic view reconfiguration, allowing nodes to join and leave views, and the MOD-SMART [20] state transfer system.


Collaborative state transfer is useful when nodes create state checkpoints at different times ("sequential checkpointing"). Due to the lack of multiple identical checkpoints, a simple quorum protocol cannot be used. Instead, "collaborative state transfer" [3] provides checkpoint and log information from multiple nodes in a way that allows a new node to verify its correctness.

BFT-SMART provides a simple programming interface. The client-side interface exposes the ability to submit requests for ordered or unordered operations. State-changing operations should be ordered, while read-only operations may be unordered. Applications implement a server-side interface, encapsulating the state machine, that receives ordered and unordered operation requests in consensus sequence from the BFT-SMART library for execution. Any replies are sent back to the requesting client. Operation requests are opaque to the library and are simple byte arrays. It is the responsibility of the client- and server-side applications to serialize and deserialize these in a meaningful way. View reconfigurations (adding or removing a node, or changing the level of byzantine fault tolerance) are special types of ordered requests but are treated as any other ordered request for ordering and consensus purposes.

For state management, the server-side application implements methods to fetch and set a state snapshot or checkpoint, also serialized as a byte array. State changes (ordered operations) are logged and the state is periodically checkpointed (sequential checkpointing). When a node joins a view, it is sent the latest checkpointed state (collaborative state transfer), which it sets for the server-side application, and any ordered operations after that checkpoint are then replayed, allowing the server state to catch up to the consensus state.

BFT-SMART has been proven to be correct and live, i.e. it will provide the same sequence of operations to all nodes and will not deadlock [4]. In terms of throughput, a BFT-SMART system with four nodes (f = 1) supports more than 15,000 operation requests (1 kB size) per second with latencies around 10 milliseconds on a local network. BFT-SMART's performance decreases linearly as fault tolerance (and hence the number of nodes) increases: a system with 10 nodes (f = 3) still supports more than 10,000 operations per second [4].

Summary

PBFT-based ordering, as implemented in BFT-SMART, avoids the latency, lack of finality and processing requirements of proof-of-work consensus. On the other hand, its three-stage protocol imposes significant communication overhead and requires fully-connected nodes. Fault tolerance in PBFT-derived methods increases linearly with the number of nodes, but performance tends to decrease due to additional communication. [23] presents a comparison of proof-of-work and BFT consensus, shown in Table 1. The different strengths and weaknesses of the two consensus mechanisms suggest that BFT-based ordering is a good fit with small, permissioned blockchains in the workflow management context.

Note that while throughput (transactions per second) is a key performance metric for many blockchains and consensus algorithms, it is not important in the workflow management context: even the largest organizations are unlikely to have production workflow systems that need to sustain tens of thousands of workflow actions per second.

                               Proof-of-work      BFT ordering
Node identity                  open, anonymous    permissioned, nodes know other nodes
Consensus finality             no                 yes
Scalability (ordering nodes)   excellent          limited
Scalability (clients)          excellent          excellent
Throughput                     limited            excellent
Latency                        high               low
Correctness proof              no                 yes

Table 1. Comparison between proof-of-work and BFT-based blockchains, adapted from [23]
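
The node and message counts used in the PBFT description in this section (3f + 1 nodes, 2f matching prepare or commit messages, 2f + 1 identical client replies) can be made concrete with a small worked sketch. The function names and the dictionary-based message representation are hypothetical and chosen for illustration; the thresholds follow the text above, not the internals of any particular library.

```python
from collections import Counter
from typing import Dict, Optional


def min_nodes(f: int) -> int:
    """Total nodes needed to tolerate f byzantine nodes."""
    return 3 * f + 1


def max_faults(n: int) -> int:
    """Largest f that a view of n nodes can tolerate."""
    return (n - 1) // 3


def has_quorum(n: int, received: Dict[int, str], own_digest: str) -> bool:
    """True once 2f messages from other nodes match this node's own digest.

    `received` maps a sender id to the digest carried by its prepare (or
    commit) message. Together with the node itself this gives 2f + 1 nodes.
    """
    matching = sum(1 for digest in received.values() if digest == own_digest)
    return matching >= 2 * max_faults(n)


def client_result(n: int, replies: Dict[int, str]) -> Optional[str]:
    """Return the reply value once 2f + 1 identical replies have arrived."""
    if not replies:
        return None
    value, count = Counter(replies.values()).most_common(1)[0]
    return value if count >= 2 * max_faults(n) + 1 else None
```

For the two configurations quoted above, max_faults(4) is 1 and max_faults(10) is 3. With four nodes, a node therefore proceeds after two matching messages from others, and a client accepts a result after three identical replies.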

The main component of a WfMS is the workflow engine, which interprets the workflow model and enables work items for manual execution or execution by external applications [12]. The engine maintains workflow state information and case data. It may be supported by, or include, services for organizational data management and role resolution, worklist management, document storage, etc. Designing a WfMS architecture requires choosing where to locate and how to implement the workflow engine and other services.

Existing work on blockchain-based workflow management (Sec. 2) has deployed the workflow engine on the blockchain itself. However, by compiling a workflow model to a smart contract, the contract forms a workflow engine for only that workflow model. Alternatively, blockchains can be treated as a trusted infrastructure layer for generic workflow engines, using the blockchain only for storing and sharing the state of work and achieving consensus on that state. To our knowledge, there has been no such implementation using PBFT-derived, or any other, ordering mechanisms.

Ordering, block management, and the workflow engine are the three main services in our system architecture. Fig. 1 shows the architecture of our system.

Ordering Service

The ordering service in our prototype is implemented based on the BFT-SMART library [4]. It accepts requests to add transactions to the blockchain; these are ordered (state-changing) requests. As its state, the ordering service maintains a record of the latest block hash and block number, as well as a queue of transactions that have been added. When a sufficient number of transactions has been collected, the ordering service creates a new block and clears the transaction queue. The ordering service returns the latest block hash and the hash of the set of queued transactions as a result to clients, allowing clients to detect absence of consensus. Clients can also request the latest block hash.
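
The behaviour of the ordering service described above can be sketched as a small deterministic state machine. This is a minimal sketch under stated assumptions, not the prototype's code: the class and method names, the JSON and SHA-256 encoding, the block size of three transactions, and the validate and on_new_block callbacks are all hypothetical, and the sketch deliberately does not use or imitate the BFT-SMART Java API.

```python
import hashlib
import json
from typing import Callable, Dict, List, Optional


def digest(obj) -> str:
    """Deterministic SHA-256 digest of a JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


class OrderingService:
    """Server-side state: last block hash, last block number, queued transactions."""

    def __init__(
        self,
        block_size: int = 3,  # assumed "sufficient number" of transactions
        validate: Optional[Callable[[dict], bool]] = None,
        on_new_block: Optional[Callable[[dict], None]] = None,
    ) -> None:
        self.block_size = block_size
        self.validate = validate or (lambda tx: True)
        self.on_new_block = on_new_block or (lambda block: None)
        self.last_hash = "0" * 64
        self.last_block = -1
        self.queue: List[dict] = []

    def add_transaction(self, tx: dict) -> Dict[str, str]:
        """Ordered (state-changing) request: queue the transaction if it is valid."""
        if self.validate(tx):
            self.queue.append(tx)
            if len(self.queue) >= self.block_size:
                self._create_block()
        return self.status()

    def _create_block(self) -> None:
        block = {
            "number": self.last_block + 1,
            "prev_hash": self.last_hash,
            "transactions": self.queue,
        }
        self.last_hash = digest(block)
        self.last_block = block["number"]
        self.queue = []  # fresh list, so the block keeps its own transactions
        self.on_new_block(block)

    def status(self) -> Dict[str, str]:
        """Reply sent to clients; differing replies reveal absence of consensus."""
        return {"last_hash": self.last_hash, "queue_hash": digest(self.queue)}
```

Because every replica receives the same transactions in the same order, two instances of this class produce identical replies. A client that compares the replies of several nodes can therefore notice a node whose state has diverged.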

Block Service

The block service stores the blockchain, may exchange blocks with other nodes, and verifies the integrity of the blockchain.

The block service uses a peer-to-peer network for block exchange with new and recovering nodes. This network is distinct from the network layer of BFT-SMART and is not fully connected. Block exchange is required only when a node begins operation and enters an ordering view. At that point, the ordering service state is first updated through the BFT-SMART state replication mechanisms. The block service then compares its latest block to the latest hash from the ordering service. The latter is assumed to be authoritative. Verification of the blockchain then proceeds backwards from the head of the chain, i.e. the block with the latest hash. Any missing blocks are requested from other peers and verified prior to adding them.
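
The backward verification described above can be sketched as a walk from the authoritative head hash towards the first block, fetching and checking any block that is missing or does not match. The dictionary-based block store, the fetch callback standing in for a peer request, and the function names are assumptions made for this sketch; the prototype's actual data structures are not specified here.

```python
import hashlib
import json
from typing import Callable, Dict, List, Optional, Sequence, Tuple

FIRST_PREV = "0" * 64  # assumed prev_hash of the first block


def digest(block: dict) -> str:
    """Deterministic SHA-256 digest of a block."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode("utf-8")).hexdigest()


def make_chain(batches: Sequence[list]) -> Tuple[Dict[int, dict], str]:
    """Helper for experiments: build blocks by number and return the head hash."""
    store: Dict[int, dict] = {}
    prev = FIRST_PREV
    for number, txs in enumerate(batches):
        block = {"number": number, "prev_hash": prev, "transactions": txs}
        store[number] = block
        prev = digest(block)
    return store, prev


def verify_and_repair(
    store: Dict[int, dict],
    head_number: int,
    head_hash: str,
    fetch: Callable[[int], Optional[dict]],
) -> List[int]:
    """Verify from the head backwards; return the block numbers that were fetched."""
    expected = head_hash  # the ordering service's latest hash is authoritative
    requested: List[int] = []
    for number in range(head_number, -1, -1):
        block = store.get(number)
        if block is None or digest(block) != expected:
            block = fetch(number)
            requested.append(number)
            # A fetched block is verified before it is added to the local store.
            if block is None or digest(block) != expected:
                raise ValueError(f"no valid copy of block {number} available")
            store[number] = block
        expected = block["prev_hash"]
    return requested
```

Walking backwards means that each verified block supplies the expected hash of its predecessor, so a peer cannot supply a forged earlier block without also breaking the hash that is already trusted.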

Workflow Engine

The workflow engine maintains information about workflow instances (cases) and workflow model definitions. It receives workflow transactions of new blocks that are added to the chain, updating the state of each process instance and creating work items accordingly. Through the worklist, it manages user interactions with work items and execution of external functions by work items.

[Figure: Node A and Node B, each with an ordering service client and adapter, an ordering service server (last hash, last block), a block service (blocks), a workflow engine (cases), and a worklist UI (work items) for the user, together with the BFT-SMART ordering service and the P2P block exchange service]

Fig. 1. Architecture overview, transaction flow, and block exchange

The red arrows labelled with numerals in Fig. 1 indicate the steps of handling a workflow transaction in our system:

1. User completes work item in worklist
2. Transaction is created and passed to ordering service client for submission to ordering service
3. Transaction is submitted to ordering service
4. Transaction is passed in order to the ordering service server of all nodes
5. Ordering service servers validate ordered transaction with their workflow engine
6. When transaction pool contains a sufficient number of transactions, a new block is created and passed to block service
7. Block service notifies workflow engine of new block and transactions
8. Workflow engine updates state of running cases and creates new work items for local worklist

The green arrows labelled with letters in Fig. 1 indicate the block exchange mechanism when a peer node is started:

A. Block service queries ordering service server for latest hash and transaction number
B. If block service determines it is missing blocks, it broadcasts a block request to all other nodes
C. Block services receive block requests
D. Block services assemble blocks into response message
E. Block service receives requested blocks and verifies block chain

In step A, note that the ordering service is started before the block service and receives the latest hash and transaction number through state exchange from other nodes. Furthermore, the block request contains the lower and upper block numbers required by the node. In step B, the block service begins by querying one random peer. When it receives no response, it queries an increasingly larger number of peers for blocks. In step D, other nodes only respond if they can satisfy at least the upper block number. In step E, if the block chain contains the most recent block but is missing individual earlier blocks, the block service will successively request these blocks from the peer it has most recently received blocks from. If this fails, it will again broadcast a query for a specific block. As fragments of the blockchain and individual blocks are added, the block service successively verifies the chain integrity beginning with the latest block and the last hash received from the ordering service.

Next, we discuss the architectural options that we considered when designing our prototype system. These affect performance, ease of implementation, and resilience.
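
Before turning to those options, the escalating block request of steps B to D can be illustrated with a short sketch. The text states only that one random peer is queried first and then an increasingly larger number of peers; the doubling of the fan-out per round, the re-sampling of peers in each round, and the Peer class with its handle_request method are assumptions introduced for this example.

```python
import random
from typing import Dict, List, Optional, Sequence


class Peer:
    """Hypothetical remote block service holding blocks by block number."""

    def __init__(self, blocks: Dict[int, dict]) -> None:
        self.blocks = blocks

    def handle_request(self, lower: int, upper: int) -> Optional[List[dict]]:
        # Step D: respond only if at least the upper block number can be satisfied.
        if upper not in self.blocks:
            return None
        return [self.blocks[n] for n in range(lower, upper + 1) if n in self.blocks]


def request_blocks(
    peers: Sequence[Peer], lower: int, upper: int, rng: random.Random
) -> Optional[List[dict]]:
    """Step B: ask one random peer first, then widen the request round by round."""
    fanout = 1
    while True:
        for peer in rng.sample(list(peers), min(fanout, len(peers))):
            response = peer.handle_request(lower, upper)
            if response is not None:
                return response
        if fanout >= len(peers):
            return None  # every peer was asked in this round and none could answer
        fanout *= 2  # assumed growth rule
```

Starting with a single peer keeps traffic on the peer-to-peer network low in the common case where most peers hold the full chain, while the final round still reaches every peer.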

Because BFT-SMART provides exchange of state information with new and recovering nodes, one architectural option is to employ this method also for the blocks of the blockchain. This means that the entire blockchain is part of the replicated state in BFT-SMART, effectively removing the need for a separate block service with its peer-to-peer network and block exchange protocol. While easy to implement by serializing the blockchain into the BFT-SMART state snapshot, this model becomes infeasible as the blockchain becomes too large to be rapidly exchanged with other nodes using the complex and communication-intensive collaborative state transfer mechanism in BFT-SMART. As an alternative, it is sufficient for the state to only contain the hash of the last block, the number of the last created block, and the queue of transactions waiting to be collected into new blocks.
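
The reduced replicated state (last block hash, last block number, and the queue of pending transactions) can be serialized to a byte array and restored on a joining node, after which logged operations are replayed. The following is a minimal sketch assuming a JSON encoding; the names OrderingState, get_snapshot, install_snapshot and catch_up are hypothetical and are not the method names of the BFT-SMART library.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Iterable, List


@dataclass
class OrderingState:
    last_hash: str
    last_block: int
    queue: List[dict] = field(default_factory=list)


def get_snapshot(state: OrderingState) -> bytes:
    """Serialize the minimal state; its size does not grow with the chain length."""
    return json.dumps(asdict(state), sort_keys=True).encode("utf-8")


def install_snapshot(data: bytes) -> OrderingState:
    """Restore the state on a new or recovering node."""
    return OrderingState(**json.loads(data.decode("utf-8")))


def catch_up(
    snapshot: bytes,
    logged_ops: Iterable[dict],
    apply: Callable[[OrderingState, dict], None],
) -> OrderingState:
    """Install a checkpoint, then replay the ordered operations logged after it."""
    state = install_snapshot(snapshot)
    for op in logged_ops:
        apply(state, op)
    return state
```

In this sketch the snapshot contains only the pending transactions, so its size is bounded by the block size rather than by the length of the chain; the blocks themselves travel over the separate block exchange.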

As noted above, blocks are created by the ordering service. One design option is to pass new blocks as replies from the ordering service operation back to the node that requested the add-transaction operation that triggered the block creation. That node's block service is then responsible for exchanging the block with other nodes using the peer-to-peer network. This creates significant traffic on that network and may also lead to delays in new block distribution.

A second design option, implemented in our system, is to have the ordering service server-side application that creates the new block pass the new block directly to the block service on its node. This tighter coupling between ordering service and block service reduces the communication overhead for the peer-to-peer network and latencies due to the block exchange. The peer-to-peer network is still required for block exchange with new or recovering nodes.
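The second design option can be sketched in Java as a pool that hands each completed block to the local block service through an in-process callback. The class `BlockAssembler`, the use of strings for transactions, and the `Consumer` callback are illustrative assumptions, not the prototype's actual interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical server-side helper of the ordering service: collects ordered
// transactions and hands each full block to the block service on the same node.
final class BlockAssembler {
    private final int blockSize;
    private final Consumer<List<String>> localBlockService;
    private final List<String> pool = new ArrayList<>();

    BlockAssembler(int blockSize, Consumer<List<String>> localBlockService) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be at least 1");
        }
        this.blockSize = blockSize;
        this.localBlockService = localBlockService;
    }

    // Called once for every ordered and validated transaction.
    void addTransaction(String tx) {
        pool.add(tx);
        if (pool.size() >= blockSize) {
            List<String> block = new ArrayList<>(pool);
            pool.clear();
            // Local method call: no peer-to-peer traffic is needed here.
            localBlockService.accept(block);
        }
    }

    // Number of transactions still waiting for a block.
    int pending() {
        return pool.size();
    }
}
```

Because every ordering node executes the same sequence of operations, each node in such a design would create the same block locally, so the peer-to-peer network is only needed for new or recovering nodes.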

One option is for workflow engine and block service to always be present together on each node, as we have done for our system. Block service notifying the engine of new blocks, or the engine validating transactions for the ordering service, can be done with local method calls.

While there is little to be gained by separating block service and workflow engine and running multiples of each, a second option is to operate only a single block service with multiple, distributed workflow engines. This eliminates the peer-to-peer network and block exchange communication. Blockchain integrity can still be verified from the latest hash of the ordering service nodes. However, this design eliminates the redundant storage that is an advantage of a replicated blockchain. On the other hand, redundancy can be achieved by a replicated storage layer within the block service, e.g. a distributed file system or database.

A transaction may represent workflow operations such as defining a new workflow model, launching a new case, executing an activity, aborting or cancelling a case, or removing a workflow model. Activity execution information includes the activity name and case ID, as well as input and output data values. Alternatively, a transaction can represent a workflow instance state, i.e. data values and enabled activities, without capturing activity execution itself.

The first option requires the engine to maintain its own state of the workflow (i.e. information about workflow models, running instances, data values and enabled activities). Constructing this state means reading the blockchain forwards from the genesis block and replaying all transactions. State updates are done by executing transactions in new blocks. While reducing the amount of information stored on the blockchain, as only changed information is recorded, this option requires significant effort in managing the separate state and ensuring it is consistent with the blockchain record. In contrast, the second option makes the workflow state available by reading the blockchain backwards from the head to identify the latest state for each process instance. State updates are done simply by copying workflow states from blockchain transactions as new blocks are presented. Not maintaining separate state significantly simplifies the workflow engine design but leads to more information being stored on the chain.

The first option provides activity information in each transaction. Hence, data constraints can be specified as post-execution constraints and checked when validating the transaction. The second option does not provide information about activity execution in a transaction. Hence, only global case data constraints can be specified and checked as part of transaction validation.

Finally, while transactions are waiting to be included in a block, users can be made aware of such pending transactions. For the first option, transactions are informative as they inform the user about pending workflow activities. In the second option, such transactions are less informative to the user, as they do not contain activity execution information.
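Reading the blockchain backwards to recover the latest state per instance, as in the second option, can be sketched as follows. The class `InstanceState` with a case ID and a marking string is a simplified, hypothetical stand-in for the state transactions stored on the chain.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical state transaction: one snapshot of one case.
final class InstanceState {
    final String caseId;
    final String marking;

    InstanceState(String caseId, String marking) {
        this.caseId = caseId;
        this.marking = marking;
    }
}

final class StateReader {
    // `chain` is ordered oldest block first; each block lists its transactions
    // in order. The scan starts at the head, so the first state seen for a
    // case is its most recent one and older states are skipped.
    static Map<String, InstanceState> latestStates(List<List<InstanceState>> chain) {
        Map<String, InstanceState> latest = new LinkedHashMap<>();
        for (int b = chain.size() - 1; b >= 0; b--) {
            List<InstanceState> block = chain.get(b);
            for (int t = block.size() - 1; t >= 0; t--) {
                InstanceState state = block.get(t);
                latest.putIfAbsent(state.caseId, state);
            }
        }
        return latest;
    }
}
```

No replay of earlier transactions is needed in this sketch, which reflects why the second option simplifies the engine at the cost of storing complete states on the chain.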

In proof-of-work blockchains, blocks contain multiple transactions. The block size is a trade-off among transaction arrival rate, available hashing power, desired block creation rate, available network bandwidth, and tolerance for latency. A transaction may be "pending" for some time until it is included in a block and at a "safe" depth. In contrast, in BFT-based systems, there is no reason to prevent blocks from containing only one transaction, i.e. the blockchain becomes a chain of transactions.

Moving to a chain of transactions has another advantage. Proof-of-work systems order transactions between different blocks, but the order of transactions within a block is not defined: transactions may be included in the same block as long as they are not mutually contradictory. Block miners ultimately impose an order, but this order is arbitrary. This means that as pending transactions are collected, they must be validated against the entire set of pending transactions to ensure they are not mutually conflicting. In a chain of transactions, a new transaction must be validated only against the immediately prior one.

The ordering and block services (the latter always together with a workflow engine) can be coupled to varying degrees. At one extreme, block management is part of the ordering service, as discussed in Sec. 4.1.

In the less integrated architecture implemented in our system, every block service and workflow engine node is also an ordering node and vice versa, but block management is distinct from ordering and implements its own peer-to-peer network infrastructure. This allows each ordering node to quickly validate transactions using the local workflow engine. The drawback of this design is that the number of ordering nodes should be determined by the desired level of fault tolerance, whereas the number of workflow nodes should be determined based on the business process and/or application. An application requiring more ordering than workflow nodes is not a problem as the additional nodes are simply not assigned any workflow tasks. On the other hand, when an application requires more workflow nodes than ordering nodes, the excess ordering nodes decrease performance due to the communication overhead.

Both types of coupling have the problem that a faulty ordering service also compromises the block service and with that the workflow engine on that node. However, workflow engine and block service can detect their local node's faults when adding a transaction by comparing the consensus result of the ordering service to the result of the local ordering service. If a detected fault is accidental, the node can be reset and synchronized with the consensus ordering view and receive valid blocks from peers. If the faulty behaviour is malicious, it is the intention of the process participant that controls the entire node, so the block service and workflow service are also compromised intentionally.

At the other extreme, in a very loosely coupled architecture, ordering nodes and block service / workflow nodes are separated. Newly created blocks are passed to the block service as replies from BFT operations and are communicated using the block service peer-to-peer network. However, because the ordering service validates transactions after ordering but before accepting them, each ordering node would require a reliable connection to at least one workflow engine. Managing these connections as workflow engines join and leave the network, and managing the additional communication, adds significant complexity and introduces additional latency in validating transactions.
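The dependence of the number of ordering nodes on the desired level of fault tolerance can be made concrete with the standard bound for PBFT-style protocols, which need n ≥ 3f + 1 nodes to tolerate f Byzantine faults. The bound is general background knowledge rather than something stated in the text above, and the helper class `BftSizing` is hypothetical.

```java
// Hypothetical helper applying the standard PBFT bound n >= 3f + 1.
final class BftSizing {
    // Largest number of Byzantine faults tolerated by `nodes` ordering nodes.
    static int toleratedFaults(int nodes) {
        if (nodes < 1) {
            throw new IllegalArgumentException("need at least one node");
        }
        return (nodes - 1) / 3;
    }

    // Smallest number of ordering nodes that tolerates `faults` Byzantine faults.
    static int requiredNodes(int faults) {
        if (faults < 0) {
            throw new IllegalArgumentException("faults must not be negative");
        }
        return 3 * faults + 1;
    }
}
```

For example, under this bound a workflow with six participating organizations, each running one node, tolerates one faulty node; a seventh node would be needed to tolerate two.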

Given the architectural design options discussed in the previous section and their advantages and disadvantages, we chose to implement our initial prototype by storing only the latest block hash, block number, and transaction pool as BFT-SMART state (Sec. 4.1). The workflow engine and block service are always present together at each node (Sec. 4.3) and both are always co-located with an ordering service node (Sec. 4.6). The ordering service passes new blocks directly to the local block service upon block creation, but a peer-to-peer network supports block exchange with new or recovering nodes (Sec. 4.2). We store workflow states on the blockchain, instead of workflow operations (Sec. 4.4), so that we do not have to maintain a separate workflow state in the engine. The block size is user configurable (Sec. 4.5). We developed the prototype in Java. Source code (https://joerg.evermann.ca/software.html) and a video demonstration (https://joerg.evermann.ca/BlockchainDemo.html) are available. Fig. 2 shows a screenshot of our prototype.

Fig. 2. Screenshot of prototype

We implemented a permissioned peer-to-peer infrastructure with a pre-defined list of participating actors. To keep our prototype simple, actors are identified by their internet address rather than their public keys, so that we can omit an address resolution layer. The P2P layer is implemented using Java sockets and serialization. Each P2P node has an outbound server that establishes connections to other peers, and an inbound server that accepts and verifies connection requests from peers. Each connection is served by a peer-connection thread, which in turn uses inbound and outbound queue handler threads to receive and send messages. Incoming messages are submitted to the inbound message handler, which passes them to the appropriate service. Nodes can join and leave the peer-to-peer network at will. When a node joins, it tries to open connections to running peers. The first peer to be contacted will initiate a view change in the BFT-SMART ordering service to include the new peer on that level as well.

Upon starting of a node, the BFT-SMART layer will first update state information from other nodes in the view. Next, the block service will identify missing blocks and request them from peers. Once the blockchain is complete and verified, the workflow engine reads the blockchain to get the latest state for each workflow instance. Peer-to-peer messages are cryptographically signed and verified upon receipt. Table 2 lists the message types on our peer-to-peer network.

Message type        Description
BlockRequest        Requests a block with a specific hash from one or more peers
BlockSend           Sends a block to one or more peers
BlockChainRequest   Requests multiple blocks within a hash range from one or more peers
BlockChainSend      Sends multiple blocks to one or more peers

Table 2. Message types
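The four message types, together with signing and verification of peer-to-peer messages, can be sketched in Java as follows. The enum constants mirror the message names of Table 2, but the class `SignedMessage`, the choice of RSA with SHA-256, and the byte layout that is signed are assumptions made for illustration; the prototype's actual signature scheme is not specified here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Message types on the peer-to-peer network (names follow Table 2).
enum MessageType { BLOCK_REQUEST, BLOCK_SEND, BLOCK_CHAIN_REQUEST, BLOCK_CHAIN_SEND }

// Hypothetical signed envelope; the signature covers the type and the payload.
final class SignedMessage {
    final MessageType type;
    final byte[] payload;
    final byte[] signature;

    SignedMessage(MessageType type, byte[] payload, byte[] signature) {
        this.type = type;
        this.payload = payload;
        this.signature = signature;
    }

    // Sender side: sign type and payload with the sender's private key.
    static SignedMessage create(MessageType type, byte[] payload, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(signedBytes(type, payload));
            return new SignedMessage(type, payload, signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("signing failed", e);
        }
    }

    // Receiver side: check the signature against the sender's public key.
    boolean verify(PublicKey key) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(signedBytes(type, payload));
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // malformed signature or unusable key
        }
    }

    private static byte[] signedBytes(MessageType type, byte[] payload) {
        byte[] header = (type.name() + ":").getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(header, 0, header.length);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    // Convenience for experiments; real actors would hold long-lived keys.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("key generation failed", e);
        }
    }
}
```

Including the message type in the signed bytes means that, in this sketch, a captured request cannot be relabelled as a different message type without invalidating the signature.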

Our blockchain has two transaction types. A ModelUpdate transaction installs a new workflow model definition. An InstanceState transaction contains a state of a workflow instance. It is submitted after a new case has been launched or an activity instance has been executed. Extensions to terminate cases and invalidate model definitions are readily possible.

To keep our prototype simple, our workflow models are based on plain Petri nets [1]. Each Petri net transition specifies a workflow activity. The workflow engine keeps track of the Petri net markings and case data, and can detect deadlocked and finished cases to remove them from the worklist.

Each activity is associated with a single node. This partitioning of the process to different nodes does not form the resource perspective of the workflow but is used only to signal each node whether to act on a transaction. Each node can provide its own resource management by defining roles or other organizational concepts and performing further work item allocation within each node. Our models allow the process designer to specify this information.

External method calls are specified as calls to static Java methods, and are performed synchronously by the workflow engine on work item enablement.

The data perspective is implemented as a key–value store. We currently admit only simple Java types as we implement a GUI for these; an extension to arbitrary types is readily possible. Each workflow instance has a set of data variables. When a transition is enabled, an activity instance (work item) is created for it and its input values are filled from the values of the workflow instance. The activity instance is then added to the local worklist or externally executed. After an activity instance is completed (manually or through execution of an external application), output values are written back to the workflow instance, which is then submitted as an InstanceState transaction to the ordering service.

We emphasize that our implementation is not meant to be a fully-featured WfMS. Instead, it serves only to illustrate generic WfMS functionality and its interplay with blockchain infrastructure components. The WfMS features themselves are not the focus of this research.

The ordering service, workflow engine and the block service have a simple interface (Table 3). The ordering and block services can call on the workflow engine to validate transactions against the current workflow state, and optionally, against the most recent pending transaction. Validation checks that a transaction's instance marking is reachable from the marking of the current workflow instance state or that of the pending transaction. It also checks for data constraint violation. The block service receives new blocks from the ordering service and passes them to the workflow engine. In the other direction, the workflow engine can submit new transactions to the ordering service after a work item has been completed. Finally, the block service can request the latest hash from the ordering service on joining the network or recovering from a fault.

Direction  Operation                              Description
→          validateTransaction(tx[, pendingTx])   Ordering service asks workflow engine to validate a transaction, given the current workflow state and optionally the most recent pending transaction (cf. arrow 5 in Fig. 1)
→          receiveBlock(block)                    Block service receives a new block and passes relevant transactions to the workflow engine (cf. arrow 6 in Fig. 1)
←          addTransaction(tx)                     Workflow engine submits a new transaction to the ordering service (cf. arrow 2 in Fig. 1)
←          getLatestHash()                        Block service requests the latest hash from the ordering service (cf. arrow A in Fig. 1)

Table 3. Interfaces between ordering service, block service and workflow engine (directions from the perspective of the ordering service)
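The marking check performed during transaction validation can be illustrated with a small Java sketch of a plain Petri net. It assumes arc weights of one, that each place appears at most once in a transition's input and output lists, and that "reachable" means reachable by firing exactly one enabled transition; the class `PetriNet` and its method names are hypothetical, and data constraint checks are omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain Petri net with arc weight 1 on every arc.
final class PetriNet {
    private final Map<String, List<String>> inputs = new HashMap<>();
    private final Map<String, List<String>> outputs = new HashMap<>();

    void addTransition(String name, List<String> inputPlaces, List<String> outputPlaces) {
        inputs.put(name, new ArrayList<>(inputPlaces));
        outputs.put(name, new ArrayList<>(outputPlaces));
    }

    // A transition is enabled if every input place holds at least one token.
    boolean isEnabled(Map<String, Integer> marking, String transition) {
        for (String place : inputs.get(transition)) {
            if (marking.getOrDefault(place, 0) < 1) {
                return false;
            }
        }
        return true;
    }

    // Firing consumes one token per input place and produces one per output place.
    Map<String, Integer> fire(Map<String, Integer> marking, String transition) {
        Map<String, Integer> next = new HashMap<>(marking);
        for (String place : inputs.get(transition)) {
            next.merge(place, -1, Integer::sum);
        }
        for (String place : outputs.get(transition)) {
            next.merge(place, 1, Integer::sum);
        }
        return normalize(next);
    }

    // Accepts `proposed` only if one enabled transition leads to it from `current`.
    boolean isValidSuccessor(Map<String, Integer> current, Map<String, Integer> proposed) {
        Map<String, Integer> target = normalize(proposed);
        for (String transition : inputs.keySet()) {
            if (isEnabled(current, transition) && fire(current, transition).equals(target)) {
                return true;
            }
        }
        return false;
    }

    // Drops empty places so that equal markings compare as equal maps.
    private static Map<String, Integer> normalize(Map<String, Integer> marking) {
        Map<String, Integer> copy = new HashMap<>(marking);
        copy.values().removeIf(tokens -> tokens == 0);
        return copy;
    }
}
```

A `validateTransaction` implementation along these lines would call `isValidSuccessor` with the marking of the current instance state, or with that of the most recent pending transaction when one exists.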

Previous work on blockchain-based WfMS has focused on creating smart contracts to represent specific workflow models. In particular, the Ethereum proof-of-work-based blockchain is widely used. However, proof-of-work-based systems have significant drawbacks in terms of processing power requirements, latency, and the lack of final consensus. In this work, we have shown that a PBFT-derived ordering and consensus method is a suitable WfMS infrastructure. While we do not use smart contracts of modern blockchains, the use of a blockchain remains essential, as it provides independent validation of workflow actions, distribution, replication, and tamper-proofing to workflow management systems.

Through the development of our prototype, we have identified architectural design options with their advantages and disadvantages. Our chosen design, in which we integrate ordering service, block service, and workflow engine on every node, strikes a balance between architectural and implementation simplicity on the one hand, and performance and scalability on the other.

A limitation in our chosen model is that the number of nodes must strike a balance between the requirements of the workflow (the number of actors involved), the desired level of fault tolerance, and the performance of the system. The major advantages are the low communication overhead on the P2P block exchange and the ability of local workflow engines to validate transactions quickly.

While our approach has lower resilience against faults and malicious attacks than proof-of-work chains, it also has lower latency and higher throughput. Unlike proof-of-work chains, the PBFT-based approach does not scale to a very large number of nodes. Given these characteristics, systems such as ours are suitable for permissioned blockchain applications. The low latency makes them suitable for fast-moving processes, where activities are of short duration or must follow each other quickly. Our system is cheaper to operate than public proof-of-work blockchains that incentivize block mining through cryptocurrencies. While one can implement permissioned proof-of-work chains, these lose their resilience against attacks in small networks as it is easy for a single actor to acquire the majority of processing power in a single high-performance node. Attacking a PBFT-based system cannot be done by concentrating computational power but requires control of more than 1/3 of the nodes.

– We currently assign single peer nodes statically to workflow activities. In the future, we will extend this to dynamic peer node assignment and integrate this with the workflow's resource perspective.
– Porting existing feature-complete workflow engines, such as the open-source YAWL [11] or Bonita systems, to blockchain infrastructure allows a richer workflow language and leverages existing implementations.

To conclude, this paper has described a prototype implementation for an architecture that has not yet seen any attention in the blockchain-based workflow literature. We have implemented a PBFT-based system as recommended by [22] and shown that this infrastructure is suitable for WfMS. We have shown how generic workflow engines can be readily adapted to fit onto a blockchain infrastructure without implementing these as smart contracts. The interfaces between components are quite simple. In contrast to [15], who suggest that blockchain-specific modelling languages need to be developed, our work shows that workflow engines do not need to be implemented using smart contracts, as done by [24,25], but that traditional workflow engines can be easily adapted to use blockchains as infrastructure for communication, persistence, replication, and trust building.

References

1. van der Aalst, W.M.P.: The application of Petri nets to workflow management. Journal of Circuits, Systems, and Computers (4), 398–461 (2002), https://doi.org/10.1145/571637.571640
6. Ciccio, C.D., Cecconi, A., Dumas, M., Garcia-Banuelos, L., Lopez-Pintado, O., Lu, Q., Mendling, J., Ponomarev, A., Tran, A.B., Weber, I.: Blockchain support for collaborative business processes. Informatik Spektrum (3), 182–190 (2019), https://doi.org/10.1007/s00287-019-01178-x
7. Fridgen, G., Radszuwill, S., Urbach, N., Utz, L.: Cross-organizational workflow management using blockchain technology - towards applicability, auditability, and automation. In: 51st Hawaii International Conference on System Sciences HICSS. AIS Electronic Library (2018), http://aisel.aisnet.org/hicss-51/in/blockchain/6
8. Fridgen, G., Sablowsky, B., Urbach, N.: Implementation of a blockchain workflow management prototype. ERCIM News100