Calculating a backtracking algorithm: an exercise in monadic program derivation

Abstract

Equational reasoning is among the most important tools that functional programming provides us. Curiously, relatively less attention has been paid to reasoning about monadic programs. In this report we derive a backtracking algorithm for problem specifications that use a monadic unfold to generate possible solutions, which are filtered using a \mathit{scanl}-like predicate. We develop theorems that convert a variation of \mathit{scanl} to a \mathit{foldr} that uses the state monad, as well as theorems constructing hylomorphism. The algorithm is used to solve the n-queens puzzle, our running example. The aim is to develop theorems and patterns useful for the derivation of monadic programs, focusing on the intricate interaction between state and non-determinism.


Calculating a Backtracking Algorithm: An Exercise in Monadic Program Derivation

SHIN-CHENG MU,

Academia Sinica, Taiwan

Equational reasoning is among the many gifts that functional programming offers us. Functional programs preserve a rich set of mathematical properties, which not only helps to prove properties about programs in a relatively simple and elegant manner, but also aids the development of programs. One may refine a clear but inefficient specification, stepwise through equational reasoning, to an efficient program whose correctness may not be obvious without such a derivation.

It is misleading to say that functional programming does not allow side effects. In fact, even a purely functional language may allow a variety of side effects, in a rigorous, mathematically manageable manner.
Since the introduction of monads into the functional programming community [Moggi 1989; Wadler 1992], they have become the main framework in which effects are modelled. Various monads were developed for different effects, from general ones such as IO, state, non-determinism, exception, continuation, and environment passing, to specific purposes such as parsing. Much research has also been devoted to producing practical monadic programs.

It is also a wrong impression that impure programs are bound to be difficult to reason about. In fact, the laws of monads and their operators are sufficient to prove quite a number of useful properties about monadic programs. The validity of these properties, proved using only these laws, is independent of the particular implementation of the monad.

This report follows the trail of Hutton and Fulger [2008] and Gibbons and Hinze [2011], aiming to develop theorems and patterns that are useful for reasoning about monadic programs. We focus on two effects: non-determinism and state. In this report we consider problem specifications that use a monadic unfold to generate possible solutions, which are filtered using a scanl-like predicate. We develop theorems that convert a variation of scanl to a foldr that uses the state monad, as well as theorems constructing hylomorphisms. The algorithm is used to solve the n-queens puzzle, our running example.

While the interaction between non-determinism and state is known to be intricate, when each non-deterministic branch has its own local state, we get a relatively well-behaved monad that provides a rich collection of properties to work with. The situation when the state is global and shared by all non-deterministic branches is much more complex, and is dealt with in a subsequent paper [Pauwels et al. 2019].

Author's address: Shin-Cheng Mu, Institute of Information Science, Academia Sinica, Taiwan.

Technical Report TR-IIS-19-003, Institute of Information Science, Academia Sinica. Publication date: June 2019.


A monad consists of a type constructor m :: ∗ → ∗ and two operators return and "bind" (>>=), often modelled by the following Haskell type class declaration:

    class Monad m where
      return :: a → m a
      (>>=)  :: m a → (a → m b) → m b .

They are supposed to satisfy the following monad laws:

    return x >>= f   =  f x ,                        (1)
    m >>= return     =  m ,                          (2)
    (m >>= f) >>= g  =  m >>= (λx → f x >>= g) .     (3)

We also define m₁ >> m₂ = m₁ >>= const m₂, which has type (>>) :: m a → m b → m b. Kleisli composition, denoted by (>=>), composes two monadic operations a → m b and b → m c into an operation a → m c. The operator (<$>) applies a pure function to the result of a monadic computation.

    (>=>) :: Monad m ⇒ (a → m b) → (b → m c) → a → m c
    (f >=> g) x = f x >>= g ,

    (<$>) :: Monad m ⇒ (a → b) → m a → m b
    f <$> n = n >>= (return · f) .

The following properties can be proved from their definitions and the monad laws:

    (f · g) <$> m    =  f <$> (g <$> m) ,                            (4)
    (f <$> m) >>= g  =  m >>= (g · f) ,                              (5)
    f <$> (m >>= k)  =  m >>= (λx → f <$> k x) ,  x not free in f .  (6)

Effect and Effect Operators.

Monads are used to model effects, and each effect comes with its collection of operators. For example, to model non-determinism we assume two operators ∅ and (▯), respectively modelling failure and choice. A state effect comes with operators get and put, which respectively read from and write to an unnamed state variable.

A program may involve more than one effect. In Haskell, the type class constraint MonadPlus in the type of a program denotes that the program may use ∅ or (▯), and possibly other effects, while MonadState s denotes that it may use get and put. Some theorems in this report, however, apply only to programs that, for example, use non-determinism and no other effects. In such cases we will note in the text that the theorem applies only to programs "whose only effect is non-determinism." The set of effects a program uses can always be statically inferred from its syntax.

Total, Finite Programs.

Like in other literature on program derivation, we assume a set-theoretic semantics in which functions are total. We thus have the following laws regarding branching:

    f (if p then e₁ else e₂)  =  if p then f e₁ else f e₂ ,                (7)
    if p then (λx → e₁) else (λx → e₂)  =  λx → if p then e₁ else e₂ .    (8)

(Footnote: This report uses type classes to be explicit about the effects a program uses. For example, programs using non-determinism are labelled with the constraint MonadPlus m. The style of reasoning proposed in this report is not tied to type classes or Haskell, and we do not strictly follow the particularities of type classes in the current Haskell standard. For example, we overlook the particularities that a Monad must also be Applicative, that a MonadPlus must also be Alternative, and that functional dependency is needed in a number of places in this report.)

Lists in this report are inductive types, and unfolds generate finite lists too. Non-deterministic choices are finitely branching. Given a concrete input, a function always expands to a finitely-sized expression consisting of syntax allowed by its type. We may therefore prove properties of a monadic program by structural induction over its syntax.

n-QUEENS PROBLEM

Reasoning about monadic programs gets more interesting when more than one effect is involved. Backtracking algorithms make good examples of programs that are stateful and non-deterministic, and the n-queens problem, also dealt with by Gibbons and Hinze [2011], is among the most well-known examples of backtracking. In this section we present a specification of the problem, before transforming it into the form unfoldM p f >=> filt (all ok · scanl₊ (⊕) st) (whose components will be defined later), which is the general form of problems we will deal with in this report.

Since the n-queens problem will be specified by a non-deterministic program, we discuss non-determinism before presenting the specification. We assume two operators ∅ and (▯):

    class Monad m ⇒ MonadPlus m where
      ∅   :: m a
      (▯) :: m a → m a → m a .

The former denotes failure, while m ▯ n denotes that the computation may yield either m or n. What laws they should satisfy, however, can be a tricky issue. As discussed by Kiselyov [2015], it eventually comes down to what we use the monad for. It is usually expected that (▯) and ∅ form a monoid. That is, (▯) is associative, with ∅ as its unit:

    (m ▯ n) ▯ k  =  m ▯ (n ▯ k) ,     (9)
    ∅ ▯ m  =  m  =  m ▯ ∅ .           (10)

It is also assumed that monadic bind distributes into (▯) from the end, while ∅ is a left zero for (>>=):

    left-distributivity:  (m₁ ▯ m₂) >>= f  =  (m₁ >>= f) ▯ (m₂ >>= f) ,    (11)
    left-zero:            ∅ >>= f  =  ∅ .                                   (12)

We will refer to the laws (9), (10), (11), (12) collectively as the nondeterminism laws.
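As a small sanity check of the nondeterminism laws, the following sketch instantiates them in Haskell's list monad, where mzero and mplus from Control.Monad play the roles of failure and choice. The sample values m1, m2, m3 and the continuation k are hypothetical, chosen only for illustration; checking a few instances is of course no substitute for a proof.

```haskell
import Control.Monad (MonadPlus (..))

-- Hypothetical sample computations in the list monad.
m1, m2, m3 :: [Int]
m1 = [1, 2]
m2 = [3]
m3 = [4, 5]

-- A hypothetical continuation used in the laws that mention bind.
k :: Int -> [Int]
k x = [x, x * 10]

-- Laws (9)-(12), instantiated at the samples above.
lawAssoc, lawUnit, lawLeftDistr, lawLeftZero :: Bool
lawAssoc     = ((m1 `mplus` m2) `mplus` m3) == (m1 `mplus` (m2 `mplus` m3))
lawUnit      = (mzero `mplus` m1) == m1 && (m1 `mplus` mzero) == m1
lawLeftDistr = ((m1 `mplus` m2) >>= k) == ((m1 >>= k) `mplus` (m2 >>= k))
lawLeftZero  = ((mzero :: [Int]) >>= k) == []
```

In the list monad, choice is list concatenation and failure is the empty list, so all four instances evaluate to True.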
Other properties regarding ∅ and (▯) will be introduced when needed.

The monadic function filt p x returns x if p x holds, and fails otherwise:

    filt :: MonadPlus m ⇒ (a → Bool) → a → m a
    filt p x = guard (p x) >> return x ,

where guard is a standard monadic function defined by:

    guard :: MonadPlus m ⇒ Bool → m ()
    guard b = if b then return () else ∅ .

The following properties allow us to move guard around. Their proofs are given in Appendix A.

    guard (p ∧ q)         =  guard p >> guard q ,                                  (13)
    guard p >> (f <$> m)  =  f <$> (guard p >> m) ,                                (14)
    guard p >> m          =  m >>= (λx → guard p >> return x) ,  if m >> ∅ = ∅ .  (15)

(Footnote: Curiously, Gibbons and Hinze [2011] did not finish their derivation and stopped at a program that exhaustively generates all permutations and tests each of them. Perhaps it was sufficient to demonstrate their point.)

    . . . . . Q . .
    . . . Q . . . .
    . . . . . . Q .
    Q . . . . . . .
    . . . . . . . Q
    . Q . . . . . .
    . . . . Q . . .
    . . Q . . . . .
                (a)

    [Panels (b) and (c): the 8 by 8 board with each cell labelled by its up-diagonal index and by its down-diagonal index, respectively.]

Fig. 1. (a) This placement can be represented by [3, 5, 7, 1, 6, 0, 2, 4]. (b) Up diagonals. (c) Down diagonals.

The aim of the puzzle is to place n queens on an n by n chess board such that no two queens can attack each other. Given n, we number the rows and columns by [0 .. n−1]. Since all queens should be placed on distinct rows and distinct columns, a potential solution can be represented by a permutation xs of the list [0 .. n−1], such that xs !! i = j denotes that the queen on the i-th column is placed on the j-th row (see Figure 1(a)). In this representation queens cannot be put on the same row or column, and the problem is reduced to filtering, among permutations of [0 .. n−1], those placements in which no two queens are put on the same diagonal. The specification can be written as a non-deterministic program:

    queens :: MonadPlus m ⇒ Int → m [Int]
    queens n = perm [0 .. n−1] >>= filt safe ,

where perm non-deterministically computes a permutation of its input, and the pure function safe :: [Int] → Bool determines whether no queens are on the same diagonal.

This specification of queens generates all the permutations, before checking them one by one, in two separate phases. We wish to fuse the two phases and produce a faster implementation. The overall idea is to define perm in terms of an unfold, transform filt safe into a fold, and fuse the two phases into a hylomorphism [Meijer et al. 1991]. During the fusion, some non-safe choices can be pruned off earlier, speeding up the computation.
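To make the two-phase specification concrete, here is a minimal runnable sketch in the list monad. It is an illustration under assumptions rather than the derived program: perm is stood in for by Data.List.permutations, and safe is written directly as a diagonal check on column-plus-row and column-minus-row indices (the helper nodup is hypothetical).

```haskell
import Control.Monad (MonadPlus, guard)
import Data.List (nub, permutations)

-- filt p x returns x if p x holds, and fails otherwise.
filt :: MonadPlus m => (a -> Bool) -> a -> m a
filt p x = guard (p x) >> return x

-- Stand-in for perm (an assumption): enumerate all permutations as a list.
perm :: [a] -> [[a]]
perm = permutations

-- Stand-in for safe (an assumption): no two queens share a diagonal.
-- Column index plus row gives one diagonal family, column minus row the other.
safe :: [Int] -> Bool
safe xs = nodup (zipWith (+) [0 ..] xs) && nodup (zipWith (-) [0 ..] xs)
  where
    nodup ys = length (nub ys) == length ys

-- The two-phase specification: generate every permutation, then filter.
queens :: Int -> [[Int]]
queens n = perm [0 .. n - 1] >>= filt safe
```

For example, queens 4 yields the two placements [1,3,0,2] and [2,0,3,1] (in some order), and the placement of Figure 1(a) is among the results of queens 8. Every permutation is generated in full before it is tested, which is exactly the inefficiency the fusion aims to remove.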

Permutation.

The monadic function perm can be written both as a fold or an unfold. For this problem we choose the latter. The function select non-deterministically splits a list into a pair containing one chosen element and the rest:

    select :: MonadPlus m ⇒ [a] → m (a, [a])
    select []       = ∅
    select (x : xs) = return (x, xs) ▯ ((id × (x :)) <$> select xs) ,

where (f × g) (x, y) = (f x, g y). For example, select [1, 2, 3] yields one of (1, [2, 3]), (2, [1, 3]) and (3, [1, 2]).

The function call unfoldM p f y generates a list [a] from a seed y :: b. If p y holds, the generation stops. Otherwise an element and a new seed are generated using f. It is like the usual unfoldr, except that f, and thus the result, is monadic:

    unfoldM :: Monad m ⇒ (b → Bool) → (b → m (a, b)) → b → m [a]
    unfoldM p f y | p y       = return []
                  | otherwise = f y >>= λ(x, z) → (x :) <$> unfoldM p f z .

Given these definitions, perm can be defined by:

    perm :: MonadPlus m ⇒ [a] → m [a]
    perm = unfoldM null select .

scanl

We have yet to define safe. Representing a placement as a permutation allows an easy way to check whether two queens are put on the same diagonal. An 8 by 8 chess board has 15 up diagonals (those running between bottom-left and top-right). Let them be indexed by [0 .. 14] (see Figure 1(b)). If we apply zipWith (+) [0 ..] to a permutation, we get the indices of the up diagonals where the chess pieces are placed. Similarly, there are 15 down diagonals (those running between top-left and bottom-right). By applying zipWith (−) [0 ..] to a permutation, we get the indices of their down diagonals (indexed by [−7 .. 7]; see Figure 1(c)). A placement is safe if the diagonals contain no duplicates:

    ups, downs :: [Int] → [Int]
    ups xs   = zipWith (+) [0 ..] xs ,
    downs xs = zipWith (−) [0 ..] xs ,

    safe :: [Int] → Bool
    safe xs = nodup (ups xs) ∧ nodup (downs xs) ,

where nodup :: Eq a ⇒ [a] → Bool determines whether there is no duplication in a list.

The eventual goal is to transform filt safe into a foldr, to be fused with perm, an unfold that generates a list from left to right. In order to do so, it helps if safe can be expressed as a computation that processes the list left-to-right, that is, a foldl or a scanl. To derive such a definition we use the standard trick of introducing accumulating parameters, generalising safe to safeAcc below:

    safeAcc :: (Int, [Int], [Int]) → [Int] → Bool
    safeAcc (i, us, ds) xs = nodup us′ ∧ nodup ds′ ∧ all (∉ us) us′ ∧ all (∉ ds) ds′ ,
      where us′ = zipWith (+) [i ..] xs
            ds′ = zipWith (−) [i ..] xs .

It is a generalisation because safe = safeAcc (0, [], []). By plain functional calculation, one may conclude that safeAcc can be defined using a variation of scanl:

    safeAcc (i, us, ds) = all ok · scanl₊ (⊕) (i, us, ds) ,
      where (i, us, ds) ⊕ x = (i + 1, (i + x : us), (i − x : ds))
            ok (i, (x : us), (y : ds)) = x ∉ us ∧ y ∉ ds ,

where all p = foldr (∧) True · map p and scanl₊ is like the standard scanl, but applies foldl to all non-empty prefixes of a list. It can be specified by:

    scanl₊ :: (b → a → b) → b → [a] → [b]
    scanl₊ (⊕) st = tail · scanl (⊕) st ,

and it also admits an inductive definition:

    scanl₊ (⊕) st []       = []
    scanl₊ (⊕) st (x : xs) = (st ⊕ x) : scanl₊ (⊕) (st ⊕ x) xs .

Operationally, safeAcc examines the list from left to right, while keeping a state (i, us, ds), where i is the current position being examined, while us and ds are respectively indices of all the up and down diagonals encountered so far. Indeed, in a function call scanl₊ (⊕) st, the value st can be seen as a "state" that is explicitly carried around. This naturally leads to the idea: can we convert a scanl₊ to a monadic program that stores st in its state? This is the goal of the next section.

As a summary of this section, after defining queens, we have transformed it into the following form:

    unfoldM p f >=> filt (all ok · scanl₊ (⊕) st) .

This is the form of problems we will consider for the rest of this report: problems whose solutions are generated by a monadic unfold, before being filtered by a filt that takes the result of a scanl₊.

scanl

The aim of this section is to turn the filtering phase filt (all ok · scanl₊ (⊕) st) into a foldr. For that we introduce a state monad to pass the state around.

The state effect provides two operators:

    class Monad m ⇒ MonadState s m where
      get :: m s
      put :: s → m () ,

where get retrieves the state, while put overwrites the state by the given value. They are supposed to satisfy the state laws:

    put-put:  put st >> put st′  =  put st′ ,                                (16)
    put-get:  put st >> get  =  put st >> return st ,                        (17)
    get-put:  get >>= put  =  return () ,                                    (18)
    get-get:  get >>= (λst → get >>= k st)  =  get >>= (λst → k st st) .     (19)

scanl₊ to monadic foldr

Consider the following monadic variation of scanl:

    scanlM :: MonadState s m ⇒ (s → a → s) → s → [a] → m [s]
    scanlM (⊕) st xs = put st >> foldr (⊗) (return []) xs
      where x ⊗ n = get >>= λst → let st′ = st ⊕ x
                                   in (st′ :) <$> (put st′ >> n) .

It behaves like scanl₊, but stores the accumulated information in a monadic state, which is retrieved and stored in each step.
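The relationship between the pure and the monadic scan can be observed by running both. The sketch below assumes the State monad from the mtl package as the model of the state effect; the names scanlPlus, step, and runScanlM are ours, introduced only for this illustration.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State (MonadState, get, put, runState)

-- The pure scan: results of foldl on all non-empty prefixes.
scanlPlus :: (b -> a -> b) -> b -> [a] -> [b]
scanlPlus op st = drop 1 . scanl op st

-- The monadic scan: the accumulator lives in the monadic state.
scanlM :: MonadState s m => (s -> a -> s) -> s -> [a] -> m [s]
scanlM op st xs = put st >> foldr step (return []) xs
  where
    -- Read the state, compute the next one, store it, then continue.
    step x n = get >>= \s ->
      let s' = s `op` x
      in (s' :) <$> (put s' >> n)

-- Run scanlM in the plain State monad from a given initial state,
-- returning the scanned list together with the final state.
runScanlM :: (s -> a -> s) -> s -> [a] -> s -> ([s], s)
runScanlM op st xs = runState (scanlM op st xs)
```

With these definitions, runScanlM (+) 0 [1, 2, 3] 99 returns ([1, 3, 6], 6): the list agrees with scanlPlus (+) 0 [1, 2, 3], but the initial state 99 has been overwritten by the last accumulator value.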
The main body of the computation is implemented using a foldr.

To relate scanl₊ and scanlM, one would like to have return (scanl₊ (⊕) st xs) = scanlM (⊕) st xs. However, the lefthand side does not alter the state, while the righthand side does. One of the ways to make the equality hold is to manually back up and restore the state. Define

    protect :: MonadState s m ⇒ m b → m b
    protect n = get >>= λini → n >>= λx → put ini >> return x .

We have

Theorem 4.1. For all (⊕) :: (s → a → s), st :: s, and xs :: [a],

    return (scanl₊ (⊕) st xs)  =  protect (scanlM (⊕) st xs) .

Proof. By induction on xs. We present the case xs := x : xs.

      protect (scanlM (⊕) st (x : xs))
    =   { expanding definitions, let st′ = st ⊕ x }
      get >>= λini → put st >> get >>= λst →
      ((st′ :) <$> (put st′ >> foldr (⊗) (return []) xs)) >>= λr →
      put ini >> return r
    =   { by put-get (17) }
      get >>= λini → put st >>
      ((st′ :) <$> (put st′ >> foldr (⊗) (return []) xs)) >>= λr →
      put ini >> return r
    =   { by (6) }
      (st′ :) <$> (get >>= λini → put st >> put st′ >>
                   foldr (⊗) (return []) xs >>= λr →
                   put ini >> return r)
    =   { by put-put (16) }
      (st′ :) <$> (get >>= λini → put st′ >>
                   foldr (⊗) (return []) xs >>= λr →
                   put ini >> return r)
    =   { definitions of scanlM and protect }
      (st′ :) <$> protect (scanlM (⊕) st′ xs)
    =   { induction }
      (st′ :) <$> return (scanl₊ (⊕) st′ xs)
    =
      return ((st ⊕ x) : scanl₊ (⊕) (st ⊕ x) xs)
    =
      return (scanl₊ (⊕) st (x : xs)) .  ∎

This proof is instructive due to the use of properties (16) and (17), and that (st′ :), being a pure function, can be easily moved around using (6).

We have learned that scanl₊ (⊕) st can be turned into scanlM (⊕) st, defined in terms of a stateful foldr. In the definition, state is the only effect involved. The next task is to transform filt (scanl₊ (⊕) st) into a foldr. The operator filt is defined using non-determinism. Hence the transformation involves the interaction between two effects.

We now digress a little to discuss one form of interaction between non-determinism and state. In this report, we wish that the following two additional properties are valid:

    right-distributivity:  m >>= (λx → f₁ x ▯ f₂ x)  =  (m >>= f₁) ▯ (m >>= f₂) ,    (20)
    right-zero:            m >> ∅  =  ∅ .                                              (21)

Note that the two properties hold for some monads with non-determinism, but not all. With some implementations of the monad, it is likely that in the lefthand side of (20), the effect of m happens once, while in the righthand side it happens twice. In (21), the m on the lefthand side may incur some effects that do not happen in the righthand side.
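The two properties can be tried out on one concrete model. The sketch below assumes StateT Int [] from the mtl package, that is, functions of type s -> [(a, s)], in which every branch carries its own copy of the state; tick, f1, f2, and run are hypothetical names used only here. Because Haskell lists are ordered, the sketch compares the two sides of (20) after sorting, which treats the results as a multiset.

```haskell
import Control.Monad (mplus, mzero)
import Control.Monad.State (StateT, get, put, runStateT)
import Data.List (sort)

-- One model with per-branch state (an assumption of this sketch).
type M = StateT Int []

-- A sample stateful, non-deterministic computation:
-- bump the counter, then offer two alternative results.
tick :: M Int
tick = get >>= \s -> put (s + 1) >> (return s `mplus` return (s + 100))

-- Two sample continuations, one of which writes to the state.
f1, f2 :: Int -> M Int
f1 x = put x >> return (x + 1)
f2 x = return (x * 2)

-- All (result, final state) pairs from initial state 0, sorted so that
-- comparisons ignore the order in which branches are listed.
run :: M Int -> [(Int, Int)]
run m = sort (runStateT m 0)

-- Right-zero (21): state changes made before a failure are discarded.
rightZero :: Bool
rightZero = run (tick >> mzero) == run mzero

-- Right-distributivity (20), compared up to reordering of results.
rightDistr :: Bool
rightDistr =
  run (tick >>= \x -> f1 x `mplus` f2 x)
    == run ((tick >>= f1) `mplus` (tick >>= f2))
```

Both checks evaluate to True in this model. Without the sort, the two sides of (20) list the same four outcomes in different orders, which is one way to see why the property forces choice to be treated as commutative.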


Having (20) and (21) leads to profound consequences on the semantics and implementation of monadic programs. To begin with, (20) implies that (▯) be commutative. To see that, let m = m₁ ▯ m₂ and f₁ = f₂ = return in (20). Implementations of such non-deterministic monads have been studied by Kiselyov [2013].

When mixed with state, one consequence of (20) is that get >>= (λs → f₁ s ▯ f₂ s) = (get >>= f₁ ▯ get >>= f₂). That is, f₁ and f₂ get the same state regardless of whether get is performed outside or inside the non-determinism branch. Similarly, (21) implies put s >> ∅ = ∅: when a program fails, the changes it performed on the state can be discarded. These requirements imply that each non-determinism branch has its own copy of the state. Therefore, we will refer to (20) and (21) as local state laws in this report, even though they do not explicitly mention state operators at all!

One monad satisfying the local state laws is M a = s → [(a, s)], which is the same monad one gets by StateT s (ListT Identity) in the Monad Transformer Library [Gill and Kmett 2014]. With effect handling [Kiselyov and Ishii 2015; Wu et al. 2012], the monad meets the requirements if we run the handler for state before that for list.

The advantage of having the local state laws is that we get many useful properties, which make this stateful non-determinism monad preferred for program calculation and reasoning. Recall, for example, that (21) is the antecedent of (15). The result can be stronger: non-determinism commutes with all other effects if we have local state laws.

Definition 4.2.

Let m and n be two monadic programs such that x does not occur free in n, and y does not occur free in m. We say m and n commute if

    m >>= λx → n >>= λy → f x y  =  n >>= λy → m >>= λx → f x y .    (22)

We say that m commutes with effect δ if m commutes with any n whose only effects are δ, and that effects ε and δ commute if any m and n commute as long as their only effects are respectively ε and δ.

Theorem 4.3.

If right-distributivity (20) and right-zero (21) hold in addition to the monad laws stated before, non-determinism commutes with any effect ε.

Proof.

Let m be a monadic program whose only effect is non-determinism, and stmt be any monadic program. The aim is to prove that m and stmt commute. Induction on the structure of m.

Case m := return e:

      stmt >>= λx → return e >>= λy → f x y
    =   { monad law (1) }
      stmt >>= λx → f x e
    =   { monad law (1) }
      return e >>= λy → stmt >>= λx → f x y .

Case m := m₁ ▯ m₂:

      stmt >>= λx → (m₁ ▯ m₂) >>= λy → f x y
    =   { by (11) }
      stmt >>= λx → (m₁ >>= f x) ▯ (m₂ >>= f x)
    =   { by (20) }
      (stmt >>= λx → m₁ >>= f x) ▯ (stmt >>= λx → m₂ >>= f x)
    =   { induction }
      (m₁ >>= λy → stmt >>= λx → f x y) ▯ (m₂ >>= λy → stmt >>= λx → f x y)
    =   { by (11) }
      (m₁ ▯ m₂) >>= λy → stmt >>= λx → f x y .

Case m := ∅:

      stmt >>= λx → ∅ >>= λy → f x y
    =   { by (12) }
      stmt >>= λx → ∅
    =   { by (21) }
      ∅
    =   { by (12) }
      ∅ >>= λy → stmt >>= λx → f x y .  ∎

Note.

We briefly justify proofs by induction on the syntax tree. Finite monadic programs can be represented by the free monad constructed out of return and the effect operators, which can be represented by an inductively defined data structure, and interpreted by effect handlers [Kiselyov and Ishii 2015; Kiselyov et al. 2013]. When we say two programs m₁ and m₂ are equal, we mean that they have the same denotation when interpreted by the effect handlers of the corresponding effects, for example, hdNondet (hdState s m₁) = hdNondet (hdState s m₂), where hdNondet and hdState are respectively handlers for nondeterminism and state. Such equality can be proved by induction on some sub-expression in m₁ or m₂, which are treated like any inductively defined data structure. A more complete treatment is a work in progress. (End of Note)

Having dealt with scanl₊ (⊕) st in Section 4.1, in this section we aim to turn a filter of the form filt (all ok · scanl₊ (⊕) st) into a stateful and non-deterministic foldr.

We calculate, for all ok, (⊕), st, and xs:

      filt (all ok · scanl₊ (⊕) st) xs
    =
      guard (all ok (scanl₊ (⊕) st xs)) >> return xs
    =
      return (scanl₊ (⊕) st xs) >>= λys → guard (all ok ys) >> return xs
    =   { Theorem 4.1, definition of protect, monad law }
      get >>= λini → scanlM (⊕) st xs >>= λys → put ini >>
      guard (all ok ys) >> return xs
    =   { Theorem 4.3: non-determinism commutes with state }
      get >>= λini → scanlM (⊕) st xs >>= λys →
      guard (all ok ys) >> put ini >> return xs
    =   { definition of protect, monad laws }
      protect (scanlM (⊕) st xs >>= (guard · all ok) >> return xs) .

Recall that scanlM (⊕) st xs = put st >> foldr (⊗) (return []) xs. The following theorem fuses a monadic foldr with a guard that uses its result.

Theorem 4.4.

Assume that state and non-determinism commute. Let (⊗) be defined as that in scanlM for any given (⊕) :: s → a → s. We have that for all ok :: s → Bool and xs :: [a]:

    foldr (⊗) (return []) xs >>= (guard · all ok) >> return xs  =  foldr (⊙) (return []) xs ,

where x ⊙ m = get >>= λst → guard (ok (st ⊕ x)) >> put (st ⊕ x) >> ((x :) <$> m).

Proof.

Unfortunately we cannot use a foldr fusion, since xs occurs free in λys → guard (all ok ys) >> return xs. Instead we use a simple induction on xs. For the case xs := x : xs:

      (x ⊗ foldr (⊗) (return []) xs) >>= (guard · all ok) >> return (x : xs)
    =   { definition of (⊗) }
      get >>= λst →
      (((st ⊕ x) :) <$> (put (st ⊕ x) >> foldr (⊗) (return []) xs)) >>=
      (guard · all ok) >> return (x : xs)
    =   { monad laws, (5), and (6) }
      get >>= λst → put (st ⊕ x) >>
      foldr (⊗) (return []) xs >>= λys →
      guard (all ok (st ⊕ x : ys)) >> return (x : xs)
    =   { by (13): guard (p ∧ q) = guard p >> guard q }
      get >>= λst → put (st ⊕ x) >>
      foldr (⊗) (return []) xs >>= λys →
      guard (ok (st ⊕ x)) >> guard (all ok ys) >> return (x : xs)
    =   { assumption: nondeterminism commutes with state }
      get >>= λst → guard (ok (st ⊕ x)) >> put (st ⊕ x) >>
      foldr (⊗) (return []) xs >>= λys →
      guard (all ok ys) >> return (x : xs)
    =   { monad laws and definition of (<$>) }
      get >>= λst → guard (ok (st ⊕ x)) >> put (st ⊕ x) >>
      (x :) <$> (foldr (⊗) (return []) xs >>= λys → guard (all ok ys) >> return xs)
    =   { induction }
      get >>= λst → guard (ok (st ⊕ x)) >> put (st ⊕ x) >>
      (x :) <$> foldr (⊙) (return []) xs
    =   { definition of (⊙) }
      foldr (⊙) (return []) (x : xs) .  ∎

This proof is instructive due to extensive use of commutativity.

In summary, we now have this corollary performing filt (all ok · scanl₊ (⊕) st) using a non-deterministic and stateful foldr:

Corollary 4.5.

Let (⊙) be defined as in Theorem 4.4. If state and non-determinism commute, we have:

    filt (all ok · scanl₊ (⊕) st) xs  =  protect (put st >> foldr (⊙) (return []) xs) .

To recap what we have done, we started with a specification of the form unfoldM p f z >>= filt (all ok · scanl₊ (⊕) st), where f :: MonadPlus m ⇒ b → m (a, b), and have shown that

      unfoldM p f z >>= filt (all ok · scanl₊ (⊕) st)
    =   { Corollary 4.5, with (⊙) defined as in Theorem 4.4 }
      unfoldM p f z >>= λxs → protect (put st >> foldr (⊙) (return []) xs)
    =   { Theorem 4.3: nondeterminism commutes with state }
      protect (put st >> unfoldM p f z >>= foldr (⊙) (return [])) .

The final task is to fuse unfoldM p f with foldr (⊙) (return []).

In a pure setting, it is known that, provided that the unfolding phase terminates, foldr (⊗) e · unfoldr p f is the unique solution of hylo in the equation below [Hinze et al. 2015]:

    hylo y | p y       = e
           | otherwise = let (x, z) = f y in x ⊗ hylo z .

Hylomorphisms with monadic folds and unfolds are a bit tricky. Pardo [2001] discussed hylomorphism for regular base functors, where the unfolding phase is monadic while the folding phase is pure. As for the case when both phases are monadic, he noted "the drawback ... is that they cannot be always transformed into a single function that avoids the construction of the intermediate data structure."

For our purpose, we focus our attention on lists, and have a theorem fusing the monadic unfolding and folding phases under a side condition. Given (⊗) :: b → m c → m c, e :: c, p :: a → Bool, and f :: a → m (b, a) (where Monad m), consider the expression:

    unfoldM p f >=> foldr (⊗) (return e) :: Monad m ⇒ a → m c .

The following theorem says that this combination of folding and unfolding can be fused into one, with some side conditions:

Theorem 5.1.

Let m :: ∗ → ∗ be in type class Monad. For all (⊗) :: a → m c → m c, e :: m c, p :: b → Bool, and f :: b → m (a, b), we have that unfoldM p f >=> foldr (⊗) e = hyloM (⊗) e p f, defined by:

    hyloM (⊗) e p f y
      | p y       = e
      | otherwise = f y >>= λ(x, z) → x ⊗ hyloM (⊗) e p f z ,

if the relation (¬ · p)? · snd · (=<<) · f is well-founded (see the note below) and, for all k, we have

    n >>= ((x ⊗) · k)  =  x ⊗ (n >>= k) ,    (23)

where n abbreviates unfoldM p f z.

The "well-foundedness" condition essentially says that f eventually terminates; details are explained after the proof of this theorem. Condition (23) may look quite restrictive. In most cases the author has seen, however, we can actually prove that (23) holds for an entire class of n that includes unfoldM p f z. In the application of this report, for example, the only effect of unfoldM p f z is non-determinism, and we will prove in Lemma 5.2 that (23) holds for all n whose only effect is non-determinism, for the particular operator we use in the n-queens problem.

We prove Theorem 5.1 below.

Proof.

We start with showing that unfoldM p f >=> foldr (⊗) e is a fixed-point of the recursive equations of hyloM. When p y holds, it is immediate that return [] >>= foldr (⊗) e = e. When ¬ (p y), we reason:

      unfoldM p f y >>= foldr (⊗) e
    =   { definition of unfoldM, ¬ (p y) }
      (f y >>= (λ(x, z) → (x :) <$> unfoldM p f z)) >>= foldr (⊗) e
    =   { monad law and foldr }
      f y >>= (λ(x, z) → unfoldM p f z >>= λxs → x ⊗ foldr (⊗) e xs)
    =   { since n >>= ((x ⊗) · k) = x ⊗ (n >>= k) where n = unfoldM p f z }
      f y >>= (λ(x, z) → x ⊗ (unfoldM p f z >>= foldr (⊗) e)) .

Now that unfoldM p f z >>= foldr (⊗) e is a fixed-point, we may conclude that it equals hyloM (⊗) e p f if the latter has a unique fixed-point, which is guaranteed by the well-foundedness condition. See the note below.  ∎

Note.

Let q be a predicate; q? is a relation defined by {(x, x) | q x}. The parameter y in unfoldM is called the seed used to generate the list. The relation (¬ · p)? · snd · (=<<) · f maps one seed to the next seed (where (=<<) is (>>=) written reversed). If it is well-founded, intuitively speaking, the seed generation cannot go on forever and p will eventually hold. It is known that inductive types (those that can be folded) and coinductive types (those that can be unfolded) do not coincide in SET. To allow a fold to be composed after an unfold, typically one moves to a semantics based on complete partial orders. However, it was shown [Doornbos and Backhouse 1995] that, in Rel, when the relation generating seeds is well-founded, hylo-equations do have unique solutions. One may thus stay within a set-theoretic semantics. Such an approach has recently been explored again [Hinze et al. 2015]. (End of Note)

Theorem 5.1 does not rely on the local state laws (20) and (21), and does not put restrictions on ε. To apply the theorem to our particular case, we have to show that its preconditions hold for our particular (⊙); for that we will need (21) and perhaps also (20). In the lemma below we slightly generalise (⊙) in Theorem 4.4:

Lemma 5.2.

Assume that (21) holds. Given p :: a → s → Bool, next :: a → s → s, and res :: a → b → b, define (⊙) as below:

    (⊙) :: (MonadPlus m, MonadState s m) ⇒ a → m b → m b
    x ⊙ m = get >>= λst → guard (p x st) >> put (next x st) >> (res x <$> m).

We have n >>= ((x ⊙) · k) = x ⊙ (n >>= k), if n commutes with state.

Proof.

We reason:

      n >>= ((x ⊙) · k)
    = n >>= λy → x ⊙ k y
    =   { definition of (⊙) }
      n >>= λy → get >>= λst → guard (p x st) >> put (next x st) >> (res x <$> k y)
    =   { n commutes with state }
      get >>= λst → n >>= λy → guard (p x st) >> put (next x st) >> (res x <$> k y)
    =   { by (15), since (21) holds }
      get >>= λst → guard (p x st) >> n >>= λy → put (next x st) >> (res x <$> k y)
    =   { n commutes with state }
      get >>= λst → guard (p x st) >> put (next x st) >> n >>= λy → (res x <$> k y)
    =   { properties of (<$>) }
      get >>= λst → guard (p x st) >> put (next x st) >> (res x <$> (n >>= k))
    =   { definition of (⊙) }
      x ⊙ (n >>= k). □

n-Queens

To conclude our derivation, a problem formulated as unfoldM p f z >>= filt (all ok · scanl₊ (⊕) st) can be solved by a hylomorphism. Define:

    solve :: (MonadState s m, MonadPlus m) ⇒
             (b → Bool) → (b → m (a, b)) → (s → Bool) → (s → a → s) → s → b → m [a]
    solve p f ok (⊕) st z = protect (put st >> hyloM (⊙) (return []) p f z),
      where x ⊙ m = get >>= λst → guard (ok (st ⊕ x)) >> put (st ⊕ x) >> ((x :) <$> m).

Corollary 5.3.

Given p :: b → Bool, f :: (MonadPlus m, MonadState s m) ⇒ b → m (a, b), z :: b, ok :: s → Bool, (⊕) :: s → a → s, and st :: s. If the relation (¬ · p)? · snd · (=<<) · f is well-founded, the local state laws hold in addition to the other laws, and unfoldM p f z commutes with state, we have

    unfoldM p f z >>= filt (all ok · scanl₊ (⊕) st) = solve p f ok (⊕) st z.

n-Queens Solved. Recall that

    queens n = perm [0 . . n − 1] >>= filt safe
             = unfoldM null select [0 . . n − 1] >>= filt (all ok · scanl₊ (⊕) (0, [], [])),

where the auxiliary functions select, ok, and (⊕) are defined in Section 3. The function select cannot be applied forever since the length of the given list decreases after each call, and perm, using only non-determinism, commutes with state. Therefore, Corollary 5.3 applies, and we have

    queens n = solve null select ok (⊕) (0, [], []) [0 . . n − 1].

Expanding the definitions we get:

    queens :: (MonadPlus m, MonadState (Int, [Int], [Int]) m) ⇒ Int → m [Int]
    queens n = protect (put (0, [], []) >> queensBody [0 . . n − 1]),

    queensBody :: (MonadPlus m, MonadState (Int, [Int], [Int]) m) ⇒ [Int] → m [Int]
    queensBody [] = return []
    queensBody xs = select xs >>= λ(x, ys) →
                    get >>= λst → guard (ok (st ⊕ x)) >>
                    put (st ⊕ x) >> ((x :) <$> queensBody ys),
      where (i, us, ds) ⊕ x = (1 + i, (i + x) : us, (i − x) : ds)
            ok (_, u : us, d : ds) = (u ∉ us) ∧ (d ∉ ds).

This completes the derivation of our backtracking algorithm for the n-queens problem.
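The derived program can be exercised with a small Haskell sketch. It is only an illustration under stated assumptions: the monad is fixed to StateT s [] from the transformers package (a monad in which each non-deterministic branch carries its own state), rather than being kept abstract over MonadPlus and MonadState; the helpers filt, scanlPlus (standing for scanl₊, taken here to be scanl without its leading element), protect, and select are assumed definitions, since they are introduced in earlier sections; and the initial state (0, [], []), the range [0 .. n − 1], and the increment 1 + i are the readings assumed for the constants in the formulas above.

```haskell
import Control.Monad (guard, mplus, mzero)
import Control.Monad.Trans.State (StateT, evalStateT, get, put)

-- Assumption: a local-state monad, where every branch has its own state.
type M s = StateT s []

-- Monadic unfold, as used in the specification.
unfoldM :: Monad m => (b -> Bool) -> (b -> m (a, b)) -> b -> m [a]
unfoldM p f y
  | p y       = return []
  | otherwise = f y >>= \(x, z) -> (x :) <$> unfoldM p f z

-- Monadic hylomorphism, following the equations in Theorem 5.1.
hyloM :: Monad m
      => (a -> m c -> m c) -> m c -> (b -> Bool) -> (b -> m (a, b)) -> b -> m c
hyloM op e p f y
  | p y       = e
  | otherwise = f y >>= \(x, z) -> x `op` hyloM op e p f z

-- Assumed helper: keep a value only if it satisfies the predicate.
filt :: (a -> Bool) -> a -> M s a
filt p x = guard (p x) >> return x

-- Assumed reading of scanl₊: scanl without the initial state.
scanlPlus :: (s -> a -> s) -> s -> [a] -> [s]
scanlPlus step st = drop 1 . scanl step st

-- Assumed helper: run a computation, then restore the initial state.
protect :: M s a -> M s a
protect m = get >>= \ini -> m >>= \x -> put ini >> return x

-- Assumed helper: non-deterministically pick one element and the rest.
select :: [a] -> M s (a, [a])
select []       = mzero
select (x : xs) = return (x, xs) `mplus` fmap (\(y, ys) -> (y, x : ys)) (select xs)

-- solve, specialised to the monad M s.
solve :: (b -> Bool) -> (b -> M s (a, b)) -> (s -> Bool) -> (s -> a -> s)
      -> s -> b -> M s [a]
solve p f okP step st z = protect (put st >> hyloM odot (return []) p f z)
  where
    odot x m = get >>= \s ->
      guard (okP (step s x)) >> put (step s x) >> ((x :) <$> m)

type QState = (Int, [Int], [Int])

oplus :: QState -> Int -> QState
oplus (i, us, ds) x = (1 + i, (i + x) : us, (i - x) : ds)

ok :: QState -> Bool
ok (_, u : us, d : ds) = u `notElem` us && d `notElem` ds
ok _                   = True  -- not reached on states produced by oplus

st0 :: QState
st0 = (0, [], [])

-- Specification: generate all permutations, then filter.
queensSpec :: Int -> [[Int]]
queensSpec n =
  evalStateT (unfoldM null select [0 .. n - 1] >>= filt (all ok . scanlPlus oplus st0)) st0

-- Derived backtracking program.
queens :: Int -> [[Int]]
queens n = evalStateT (solve null select ok oplus st0 [0 .. n - 1]) st0
```

In this sketch queensSpec and queens are expected to return the same list of solutions, as Corollary 5.3 states, while queens prunes a branch as soon as a newly placed queen is attacked instead of completing the permutation first.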

This report is a case study of reasoning and derivation of monadic programs. To study the interaction between non-determinism and state, we construct backtracking algorithms solving problems that can be specified in the form unfoldM p f >=> filt (all ok · scanl₊ (⊕) st). The derivation of the backtracking algorithm works by fusing the two phases into a monadic hylomorphism. It turns out that in derivations of programs using non-determinism and state, commutativity plays an important role. We assume in this report that the local state laws (right-distributivity and right-zero) hold. In this scenario we have nicer properties at hand, and commutativity holds more generally.

The local state laws imply that each non-deterministic branch has its own state. This is cheap to implement when the state can be represented by linked data structures, such as a tuple containing lists, as in the n-queens example. When the state contains blocked data, such as a large array, duplicating the state for each non-deterministic branch can be costly. Hence there is a practical need for sharing one global state among non-deterministic branches. When a monad supports shared global state and non-determinism, commutativity of the two effects holds only in limited cases. The behaviour of the monad is much less intuitive, and might sometimes be considered awkward. In a subsequent paper [Pauwels et al. 2019], we attempt to find out what algebraic laws we can expect and how to reason with programs when the state is global.

Affeldt et al. [2019] modelled a hierarchy of monadic effects in Coq. The formalisation was applied to verify a number of equational proofs of monadic programs, including some of the proofs in an earlier version of this report. A number of errors were found and reported to the author.

Acknowledgements.

The author would like to thank Tyng-Ruey Chuang for examining a very early draft of this report; Jeremy Gibbons, who has been following the development of this research and has kept giving good advice; and Tom Schrijvers and Koen Pauwels, for their cooperation on work following up this report. Thanks also go to Reynald Affeldt, David Nowak and Takafumi Saikawa for verifying and finding errors in the proofs in an earlier version of this report. The author is solely responsible for any remaining errors, however.

REFERENCES

Reynald Affeldt, David Nowak, and Takafumi Saikawa. 2019. A hierarchy of monadic effects for program verification using equational reasoning. In Mathematics of Program Construction, Graham Hutton (Ed.). Springer.
Henk Doornbos and Roland C. Backhouse. 1995. Induction and recursion on datatypes. In Mathematics of Program Construction (Lecture Notes in Computer Science), Bernhard Möller (Ed.). Springer, 242–256.
Jeremy Gibbons and Ralf Hinze. 2011. Just do it: simple monadic equational reasoning. In International Conference on Functional Programming, Olivier Danvy (Ed.). ACM Press, 2–14.
Andy Gill and Edward Kmett. 2014. The Monad Transformer Library. https://hackage.haskell.org/package/mtl.
Ralf Hinze, Nicolas Wu, and Jeremy Gibbons. 2015. Conjugate hylomorphisms, or: the mother of all structured recursion schemes. In Symposium on Principles of Programming Languages, David Walker (Ed.). ACM Press, 527–538.
Graham Hutton and Diana Fulger. 2008. Reasoning about effects: seeing the wood through the trees. In Draft Proceedings of Trends in Functional Programming, Peter Achten, Pieter Koopman, and Marco T. Morazán (Eds.).
Oleg Kiselyov. 2013. How to restrict a monad without breaking it: the winding road to the Set monad. http://okmij.org/ftp/Haskell/set-monad.html.
Oleg Kiselyov. 2015. Laws of MonadPlus. http://okmij.org/ftp/Computation/monads.html
Symposium on Haskell, John H Reppy (Ed.). ACM Press, 94–105.
Oleg Kiselyov, Amr Sabry, and Cameron Swords. 2013. Extensible effects: an alternative to monad transformers. In Symposium on Haskell, Chung-chieh Shan (Ed.). ACM Press, 59–70.
Erik Meijer, Maarten Fokkinga, and Ross Paterson. 1991. Functional programming with bananas, lenses, envelopes, and barbed wire. In Functional Programming Languages and Computer Architecture (Lecture Notes in Computer Science), R. John Muir Hughes (Ed.). Springer-Verlag, 124–144.
Eugenio Moggi. 1989. Computational lambda-calculus and monads. In Logic in Computer Science, Rohit Parikh (Ed.). IEEE Computer Society Press, 14–23.
Alberto Pardo. 2001. Fusion of recursive programs with computational effects. Theoretical Computer Science
Mathematics of Program Construction, Graham Hutton (Ed.). Springer.
Philip L. Wadler. 1992. Monads for functional programming. In Program Design Calculi: Marktoberdorf Summer School, Manfred Broy (Ed.). Springer-Verlag, 233–264.
Nicolas Wu, Tom Schrijvers, and Ralf Hinze. 2012. Effect handlers in scope. In Symposium on Haskell, Janis Voigtländer (Ed.). ACM Press, 1–12.

A MISCELLANEOUS PROOFS

Proofs of (13) and (14).

Proof.

Proof of (13) relies only on properties of if and conjunction:

      guard (p ∧ q)
    = if p ∧ q then return () else ∅
    = if p then (if q then return () else ∅) else ∅
    = if p then guard q else ∅
    = guard p >> guard q.

To prove (14), surprisingly, we need only distributivity and not (12):

      guard p >> (f <$> m)
    =   { definition of guard }
      (if p then return () else ∅) >> (f <$> m)
    =   { by (7) }
      if p then return () >> (f <$> m) else ∅ >> (f <$> m)
    =   { by (3) and (1) }
      if p then f <$> (return () >> m) else f <$> (∅ >> m)
    =   { by (7) }
      f <$> ((if p then return () else ∅) >> m)
    =   { definition of guard }
      f <$> (guard p >> m). □

Proof of (15).

Proof.

We reason:

      m >>= λx → guard p >> return x
    =   { definition of guard }
      m >>= λx → (if p then return () else ∅) >> return x
    =   { by (7), with f n = m >>= λx → n >> return x }
      if p then m >>= λx → return () >> return x else m >>= λx → ∅ >> return x
    =   { since return () >> n = n and (12) }
      if p then m else m >> ∅
    =   { assumption: m >> ∅ = ∅ }
      if p then m else ∅
    =   { since return () >> n = n and (12) }
      if p then return () >> m else ∅ >> m
    =   { by (7), with f = (>> m) }
      (if p then return () else ∅) >> m
    =   { definition of guard }
      guard p >> m. □
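As a sanity check, and not as a substitute for the proofs, properties (13), (14) and (15) can be tested on a concrete monad. The sketch below assumes the list monad, for which m >> ∅ = ∅ holds, and uses the hypothetical names law13, law14 and law15 for the three equations; the function f = (* 2) is an arbitrary choice.

```haskell
import Control.Monad (guard)

-- (13): guard (p ∧ q) = guard p >> guard q
law13 :: Bool -> Bool -> Bool
law13 p q = (guard (p && q) :: [()]) == (guard p >> guard q)

-- (14): guard p >> (f <$> m) = f <$> (guard p >> m)
law14 :: Bool -> [Int] -> Bool
law14 p m = (guard p >> (f <$> m)) == (f <$> (guard p >> m))
  where
    f = (* 2)  -- arbitrary pure function, chosen for illustration

-- (15): m >>= λx → guard p >> return x = guard p >> m,
-- which relies on m >> ∅ = ∅ (true for lists).
law15 :: Bool -> [Int] -> Bool
law15 p m = (m >>= \x -> guard p >> return x) == (guard p >> m)
```

Evaluating, for instance, and [law15 p m | p <- [False, True], m <- [[], [1], [1, 2, 3]]] in GHCi checks (15) on a handful of inputs; a monad in which m >> ∅ differs from ∅ would be needed to see the assumption of (15) fail.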