A Historical Account of My Early Research Interests
L. Fribourg and M. Heizmann (Eds.): VPT/HCVS 2020, EPTCS 320, 2020, pp. 1–28, doi:10.4204/EPTCS.320.1
© A. Pettorossi. This work is licensed under the Creative Commons Attribution License.
Alberto Pettorossi ∗
DICII, University of Rome Tor Vergata, Rome, Italy
IASI-CNR, Rome, Italy
[email protected]
This paper presents a brief account of some of my early research interests. This historical account starts from my laurea thesis on Signal Theory and my master thesis on Computation Theory. It recalls some results in Combinatory Logic and Term Rewriting Systems. Some other results concern Program Transformation, Parallel Computation, Theory of Concurrency, and Proof of Program Properties. My early research activity has been done mainly in cooperation with Andrzej Skowron, Anna Labella, and Maurizio Proietti.
Since my childhood I have very much liked Arithmetic and Mathematics. Formal reasoning always attracted my spirit and I always felt a special interest for numbers and geometrical patterns. Maybe this was due to the fact that I thought that Mathematics is a way of establishing ‘truth beyond any doubt’. As Plato says: ‘Truth becomes manifest in the mathematical process’ (Phaedo). (The actual word used by Plato for ‘mathematical process’ comes from λογίζομαι, which means: I compute, I deduce.) During my high school years I attended the Classical Lyceum. Perhaps the Scientific Lyceum would have been a better school for me to attend, but the Scientific Lyceum was located too far away from my home town.

At the age of nineteen, I began my university studies in Rome as a student of Engineering. I was in doubt whether or not to enrol as a Mathematics student, but eventually I followed my father’s suggestion to study Engineering because, as he said: “If you study Mathematics, you will have no other choice in life than to become a teacher.” My thesis work was in Telecommunication and, in particular, I studied the problem of how to pre-distort an electric signal which encodes a sequence of symbols, each one being 0 or 1, via a sequence of impulses. The pre-distortion of the electric signal should minimize the effect of a Gaussian white noise (which would require a reduction of bandwidth) and the interference between symbols (which would require an increase of bandwidth). A theoretical solution to this problem is not easy to find. Thus, I was advised to look for a practical solution via a numerical simulation of the transmission channel and the construction of the so-called eye pattern [45]. In the numerical simulation, which uses the Fast Fourier Transform algorithm, one could easily modify the various parameters of the pre-distortion for minimizing the errors in the output sequence of 0’s and 1’s.
The thesis work was done under the patient guidance of my supervisors, Professors Bruno Peroni and Paolo Mandarini. After getting the laurea degree, during 1972 I attended at Rome University a course in Engineering of Control and Computation Systems. During that year I read the book entitled

∗ This work has been partially supported by GNCS-INdAM, Italy.
Mathematical Theory of Computation written by Professor Zohar Manna (1939–2018) (at that time that book was nothing more than a thick technical report of Stanford University, California). I wrote my master thesis on the “Automatic Derivation of Control Flow Graphs of Fortran Programs”, under the guidance of Professor Vincenzo Falzone and Professor Paolo Ercoli [46]. In particular, I wrote a Fortran program which derives the control flow graphs of Fortran programs. That program ran on a UNIVAC 1108 computer with the EXEC 8 Operating System. The main memory had 128k words. The program I wrote was a bit naive, but at that time I was not familiar with efficient parsing techniques. I also studied various kinds of program schemas and, in particular, those introduced by Lavrov [37], Yanov [74], and Martynuk [39]. Having constructed the control flow graph of a given Fortran program, one could transform that program into an equivalent one with better computational properties (such as smaller time or space complexity) by applying a set of schema transformations [32] which are guaranteed to preserve semantical equivalence. Schema transformations are part of the research area in which I remained interested for some years afterwards.

During that period, which overlapped with my military service in the Italian Air Force, I also read a book on Combinatory Logic (actually, not the entire book) by J. R. Hindley, B. Lercher and J. P. Seldin [27, 28]. I read the Italian edition of the book, in which some inaccuracies of the earlier English edition had been emended (as Roger Hindley himself told me later). Under the guidance of Professor Giorgio Ausiello and with the great help of my colleague Carlo Batini, I studied various properties of subbases in Weak Combinatory Logic (WCL) [3]. WCL is an applicative system whose terms, called combinators, can be defined as follows: (i) K and S are atomic terms, and (ii) if t1 and t2 are terms, then (t1 t2) is a term.
When parentheses are missing, left associativity is assumed. A notion of reduction, denoted >, is introduced as follows: for all terms x, y, z, Sxyz > xz(yz) and Kxy > x. Thus, for instance,
SKKS > KS(KS) > S. WCL is a Turing complete system, as every partial recursive function can be represented as a combinator in WCL. A subbase in WCL is a set of terms which can be constructed starting from a fixed set of (possibly non-atomic) combinators. For instance, the subbase {B}, where B is a combinator defined by the following reduction: Bxyz > x(yz), is made out of all terms which are constructed by B’s (and parentheses) only. These terms are called B-combinators. One can show that B can be expressed in the subbase {S, K} by S(KS)K. Indeed, S(KS)Kxyz >* x(yz), where >* denotes the reflexive, transitive closure of >. The various subbases provide a way of partitioning the set of computable functions into various sets, according to the features of the combinators in the subbases. This should be contrasted with other stratifications of the set of computable functions one could define and, among them, the stratifications based on complexity classes or on the Chomsky hierarchy [30] with the type i (for i = 0, 1, 2, 3) classes of languages.

Among other subbases, we studied the subbase {B} and we showed how to construct the shortest B-combinator for constructing bracketed terms out of sequences of atomic subterms. For instance, B(B(BB)B)(BB) is the shortest B-combinator X such that: X x1 x2 x3 x4 x5 x6 >* x1(x2(x3 x4))(x5 x6).

During 1975, while attending in Rome the conference on λ-calculus and Computer Science Theory, where our results on subbases were presented [3], I heard from Professor Henk Barendregt of an open problem concerning the existence of a combinator X̃ made of only S’s (and parentheses), having no weak normal form. A combinator T is said to be in weak normal form if no combinator T′ exists such that T > T′. X is said to have a weak normal form if there exists
a combinator T such that X >* T and T is in weak normal form. It was not hard to show that one such combinator X̃ is SAA(SAA), where A denotes ((SS)S). I sent the result to Henk Barendregt (by surface mail, of course). Some years later I was happy to see that an exercise about that problem and its solution was included in Barendregt’s book on λ-calculus [2, page 162].

While studying Combinatory Logic, I became interested in terms viewed as trees and tree transformers. Indeed, combinators can be considered both as trees and as tree transformers at the same time. This area was also related to the research on Term Rewriting Systems, which was going to be one of my interests for a few years afterwards. The search for a non-terminating combinator stimulated my studies on infinite, non-terminating computations.

In 1979 I introduced a hierarchy of infinite computations within WCL (and other Turing complete systems) which is related to the Chomsky hierarchy of languages [48]. That definition uses the notion of a sampling function s, which is a total function from the set of natural numbers to {true, false}, and which, from an infinite sequence σ = ⟨w0, w1, w2, . . .⟩ of finite words constructed by an infinite computation, selects an infinite subsequence σs whose words are the elements of a (finite or infinite) language Ls. We state that Ls =def {wj | j ≥ 0 ∧ wj occurs in σ ∧ s(j) = true}. Let us assume that Ls is generated by a grammar Gs. In this case we say that also the subsequence σs is generated by the grammar Gs. Given a sequence σ, by varying the sampling function s we have different languages Ls and different generating grammars Gs. For i = 0, 1, 2,
3, we say that the infinite computation which generates σ is of type i if there exists a sampling function s selecting a subsequence σs generated by a grammar of type i, and no sampling function s′ exists such that the subsequence selected by s′ is generated by a grammar of type (i+1). For instance, let us consider the following program P:

w = “a”; while true do print w; w = “b” w “c”; π od

where a, b, c are characters, w is a string of characters, and π is a terminating program fragment associated with a type 0 language L, such that: (i) L is not of type 1, (ii) π does not modify w, (iii) at each loop body execution, π prints only one word of L, and (iv) for every word v ∈ L there is exactly one body execution in which π prints v. We have that P evokes an infinite computation of type 2, as the grammar with axiom S and productions: S → a | b S c is a type 2 (context free) grammar.

When I first presented this hierarchy definition at a conference, I met my dear colleague Philippe Flajolet (1948–2011) and he said to me: “I have already studied these topics [24]. You should look at the immune sets.” That remark motivated my first encounter with Rogers’ book on recursion theory [66], where immune sets are defined and analyzed. Then also Professor Maurice Nivat (1937–2017) came to me and said: “It is a nice piece of work, . . . but you should rewrite the paper in a better way!”. I was very glad that Nivat showed interest in my work. He was right in asking me to rewrite it and improve it. Unfortunately, I did not follow his suggestion. Not even when, a few years later, Professor Tony Hoare told me: “I like writing and rewriting my papers.”

Looking for terms with infinite behaviour in WCL, in 1980 I wrote a paper on the automatic construction of combinators having no normal form by using the so-called accumulation method and the pattern matching and hereditary embedding method [49].
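The notion of reduction recalled above is easy to experiment with on a computer. The following is only an illustrative sketch (the representation of terms as nested pairs and the leftmost-outermost strategy are choices made here, not taken from the original papers): ‘S’ and ‘K’ are the atomic combinators, any other string stands for a variable, and a fuel bound makes the absence of a weak normal form observable.

```python
# Illustrative sketch: terms are nested pairs, 'S' and 'K' are the atomic
# combinators, any other string is a variable; reduction contracts the
# leftmost-outermost redex (representation and strategy are choices made here).

def spine(t):
    """Unwind left-associated applications: ((f, a), b) -> ('f', [a, b])."""
    args = []
    while isinstance(t, tuple):
        t, a = t
        args.insert(0, a)
    return t, args

def apply_all(head, args):
    """Rebuild the left-associated application head a1 ... ak."""
    for a in args:
        head = (head, a)
    return head

def size(t):
    """Number of atoms in a term (iterative, to avoid deep recursion)."""
    n, stack = 0, [t]
    while stack:
        x = stack.pop()
        if isinstance(x, tuple):
            stack.extend(x)
        else:
            n += 1
    return n

def step(t):
    """One leftmost-outermost reduction step, or None if t is in weak normal form."""
    head, args = spine(t)
    if head == 'K' and len(args) >= 2:               # K x y > x
        return apply_all(args[0], args[2:])
    if head == 'S' and len(args) >= 3:               # S x y z > x z (y z)
        x, y, z = args[0], args[1], args[2]
        return apply_all(((x, z), (y, z)), args[3:])
    for i, a in enumerate(args):                     # no head redex: look inside
        s = step(a)
        if s is not None:
            args[i] = s
            return apply_all(head, args)
    return None

def reduce_weak(t, fuel=100, max_size=5000):
    """Reduce to weak normal form; None if fuel or the size bound runs out."""
    while fuel > 0 and size(t) <= max_size:
        s = step(t)
        if s is None:
            return t
        t, fuel = s, fuel - 1
    return None

SKKS = ((('S', 'K'), 'K'), 'S')
print(reduce_weak(SKKS))                             # 'S'

B = (('S', ('K', 'S')), 'K')                         # B = S(KS)K
print(reduce_weak((((B, 'x'), 'y'), 'z')))           # ('x', ('y', 'z')), i.e. x(yz)

A = (('S', 'S'), 'S')                                # A = ((SS)S)
SAA = (('S', A), A)
print(reduce_weak((SAA, SAA)))                       # None: no weak normal form
```

Running it shows SKKS >* S, the B-combinator behaviour of S(KS)K, and the endless reduction of SAA(SAA).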
The solutions of some equations between terms would guarantee the existence of the combinators with the desired properties.

On the other side of the camp, that is, considering the finite behaviours, many people at that time were studying properties of Term Rewriting Systems (TRSs) which would guarantee termination. Among them, Nachum Dershowitz, Samuel Kamin, Jean-Jacques Lévy, and David Plaisted. In 1981 I wrote a paper introducing the non-ascending property [50]. In that paper I related the various techniques which had been proposed, including recursive path orderings, simplification orderings, and bounded lexicographic orderings. I thank Nachum for pointing out to me some errors in that paper and, in particular, a missing left-linearity hypothesis about the TRS under consideration [20]. A TRS is said to be left linear if the variable occurrences on the left hand side of every rule are all distinct. For instance, f(x, y, z) → f(y, z, x) is a left linear rule, while f(x, y, x) → g(y, x) is not. During a conference coffee-break, Jean-Jacques showed me a simple inductive proof of Fact 1 [50, pages 436–437] using bounded lexicographic orderings (actually, that proof is based on a non-predicative definition of the non-ascending rewriting rules).

During the years 1977–1981 I visited Edinburgh University. I was supported by the British Council organization and the Italian National Research Council. I did my Ph.D. thesis work on program transformation under the guidance of Professor Rod Burstall and also of Professor Robin Milner, during Rod’s visit to Inria in Paris for some months. I met Rod in person for the first time at the Artificial Intelligence Department, in Hope Park Square at Edinburgh. I addressed him by saying: “Professor Burstall, . . . ”. I do not remember my subsequent words, but I do remember what he said to me in answering: “Alberto, this is the last time you call me ‘professor’.
Please, call me Rod.” He introduced me to functional programming and he wrote ‘for me’, as he said, a compiler for a new functional language, called NPL [9], which he was developing at that time. The language NPL later evolved into Hope [11]. While at Edinburgh, I wrote a paper [47] on the automatic annotation of functional programs for improving memory utilization. Functions could destroy the values of their arguments whenever they were no longer needed for subsequent computations. I apologize for not having Rod as co-author of that paper. My Ph.D. thesis work was mainly on program transformation, starting from the seminal paper by Rod and John Darlington [10]. Some time before, Rod had received a letter from Professor Edsger W. Dijkstra (1930–2002) proposing the following ‘exercise’ in program transformation: the derivation of an iterative program for the fusc function [22, pages 215–216, 230–232]:

fusc(0) = 0
fusc(1) = 1
fusc(2n) = fusc(n)
fusc(2n+1) = fusc(n+1) + fusc(n)  for n ≥ 0

In one of my scientific conversations with Rod, he told me about his research interests and he also mentioned the above exercise. The difficult part of the exercise was how to motivate the ‘invention’ of the new function definitions to be introduced during program transformation in the so-called eureka steps [10]. To do the same exercise, Bauer and Wössner [4, page 288] use an embedding into a linear combination, that is, they define the function F(n, a, b) =def a × fusc(n) + b × fusc(n+1). Using that function, they are able to derive for fusc a program that is linear recursive and also tail-recursive. Then, from that program they easily derive an iterative program. But where does the F come from?
I wanted to do the exercise using the unfolding/folding rules only [10] and, at the same time, I wanted to give a somewhat mechanizable account of the definition of the new functions to be introduced.

Now, the unfolding rule allows one to unroll (up to a specified depth) the recursive calls, thereby generating a directed acyclic graph of distinct calls. I called that graph the m-dag. The prefix m (short for minimal) tells us that in an m-dag identical function calls are denoted by a single node. Then, I used the so-called tupling strategy that allows one to define new functions as the result of tupling together function calls which share common subcalls, that is, calls which have common descendants in the m-dag. Note that checking this sharing property requires syntactic operations only on the m-dags. By using the tupling strategy, looking at the m-dag for fusc, we introduce the tuple function t(n) =def ⟨fusc(n), fusc(n+1)⟩ and we get the following recursive equations for fusc:

fusc(n) = u where ⟨u, v⟩ = t(n)  for n ≥ 0
t(0) = {by unfolding} = ⟨fusc(0), fusc(1)⟩ = {by unfolding} = ⟨0, 1⟩
t(2n) = {by unfolding} = ⟨fusc(2n), fusc(2n+1)⟩ = {by unfolding} =
      = ⟨fusc(n), fusc(n+1) + fusc(n)⟩ = {by where abstraction [10]} =
      = ⟨u, u+v⟩ where ⟨u, v⟩ = ⟨fusc(n), fusc(n+1)⟩ = {by folding} =
      = ⟨u, u+v⟩ where ⟨u, v⟩ = t(n)  for n > 0
t(2n+1) = ⟨u+v, v⟩ where ⟨u, v⟩ = t(n)  for n ≥ 0
      (by a derivation similar to that of t(2n))

Now a last step is needed to get the iterative program desired by Dijkstra’s exercise. I used the following schema equivalence (such as the ones in [69]) stating that t(m) defined by the non-tail recursive equations:

t(0) = a
t(2m) = b(t(m))  for m > 0
t(2m+1) = c(t(m))  for m ≥ 0

is equal to the value of res returned by the following program, where B[ℓ..0] stores the binary expansion of m, the most significant bit being at position ℓ (obviously, B[ℓ..0] can be computed by performing O(log m) successive integer divisions by 2):

res = a; p = ℓ; while p ≥ 0 do if B[p] = 0 then res = b(res) else res = c(res); p = p − 1 od

By using this schema equivalence we derive from the above linear, non-tail recursive program for fusc the following iterative program:

{n ≥ 1 ∧ n = Σp=0..ℓ B[p] · 2^p}
⟨u, v⟩ = ⟨0, 1⟩; p = ℓ;
while p ≥ 0 do if B[p] = 0 then v = u + v else u = u + v; p = p − 1 od
{⟨u, v⟩ = t(n) ∧ u = fusc(n)}

Note that we do not need to state the somewhat intricate invariant of the while-loop for showing the correctness of the derived iterative program, as Dijkstra’s methodology for program construction would have required us to do. The derived program, which is correct by construction, uses an O(log n) number of operations for computing fusc(n), as does Dijkstra’s program reported in [22, pages 215–216]. We have only to show by induction, once and for all, the validity of the schema equivalence we have used. In order to get exactly Dijkstra’s program, one should perform a generalization step as indicated in [52].
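The two end points of this derivation can be transcribed in Python for a quick check. The sketch below is only illustrative (the function names are mine): fusc follows the recursive equations, t is the tuple function ⟨fusc(n), fusc(n+1)⟩, and fusc_iter scans the binary expansion of n as the derived iterative program does.

```python
# Illustrative sketch of the derivation's end points: the recursive fusc,
# the tuple function t(n) = <fusc(n), fusc(n+1)>, and the iterative program
# driven by the binary expansion of n.

def fusc(n):
    """Dijkstra's fusc, directly from the recursive equations."""
    if n <= 1:
        return n
    if n % 2 == 0:
        return fusc(n // 2)
    return fusc(n // 2 + 1) + fusc(n // 2)

def t(n):
    """t(n) = <fusc(n), fusc(n+1)>, as introduced by the tupling strategy."""
    if n == 0:
        return (0, 1)
    u, v = t(n // 2)
    return (u, u + v) if n % 2 == 0 else (u + v, v)

def fusc_iter(n):
    """Iterative program: one step per bit of n, most significant bit first."""
    u, v = 0, 1                       # <u, v> = t(0)
    for bit in bin(n)[2:]:            # the binary expansion B[l..0] of n
        if bit == '0':
            v = u + v                 # t(2m)   = <u, u+v>
        else:
            u = u + v                 # t(2m+1) = <u+v, v>
    return u                          # u = fusc(n)

print([fusc(n) for n in range(1, 11)])    # [1, 1, 2, 1, 3, 2, 3, 1, 4, 3]
print(fusc_iter(1000) == fusc(1000))      # True
```

The iterative version performs one addition per bit of n, that is, O(log n) operations.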
Having derived an iterative program for the fusc function, I faced the problem of deriving by transformation an iterative program, such as the one suggested by [41], which computes the Fibonacci function fib(n) using an O(log n) number of arithmetic operations. Here is the definition of the Fibonacci function:

fib(0) = 0
fib(1) = 1
fib(n+2) = fib(n+1) + fib(n)  for n ≥ 0    (†1)

By using the tupling strategy the function g(n) =def ⟨fib(n), fib(n−1)⟩ is introduced and the following program is derived:

fib(0) = 0
fib(1) = 1
fib(n+2) = u where ⟨u, v⟩ = g(n+2)  for n ≥ 0
g(1) = ⟨1, 0⟩
g(n+2) = ⟨u+v, u⟩ where ⟨u, v⟩ = g(n+1)  for n ≥ 0

The iterative program for fib can be obtained by applying the following schema equivalence stating that g(n) defined by the equations:

g(1) = a
g(n+1) = b(g(n))  for n ≥ 1

is equal to the value of res returned by the program:

res = a; while n > 1 do res = b(res); n = n − 1 od

Thus, we get:

{n ≥ 0}
if n = 0 then u = 0 else
begin p = n − 1; ⟨u, v⟩ = ⟨1, 0⟩;
  while p > 0 do ⟨u, v⟩ = ⟨u+v, u⟩; p = p − 1 od end
{u = fib(n)}

This program has a linear time complexity, in the sense that it computes the result by a linear number of additions. In order to get a program which requires O(log n) arithmetic operations when computing fib(n), we should invent the multiplication operation, which is not present in Equation (†1). From that equation by unfolding we have:

fib(n+2) = fib(n+1) + fib(n) = {by unfolding fib(n+1)} =
         = 2 · fib(n) + fib(n−1) = {by unfolding fib(n)} =
         = 3 · fib(n−1) + 2 · fib(n−2)    (†2)

The unfolding process may continue for some more steps, but we stop here. We will not discuss here the important issue of how many unfolding steps should be performed when deriving programs by transformation.
Let us simply note that more unfoldings may exhibit more patterns of function calls from which more efficient functions can be derived. In our case the invention of the multiplication operation is reduced to three generalization steps [55]. First, we generalize the initial values 0 and 1 of the function fib to two variables a0 and a1, respectively. (This kind of generalization step is usually done when mechanically proving theorems about functions [7].) By promoting those new variables to arguments, we get the following new function G:

G(a0, a1, 0) = a0
G(a0, a1, 1) = a1
G(a0, a1, n+2) = G(a0, a1, n+1) + G(a0, a1, n)  for n ≥ 0    (†3)

This function G satisfies the following equation, which is derived from Equation (†3) as Equation (†2) has been derived from (†1):

G(a0, a1, n+2) = 3 · G(a0, a1, n−1) + 2 · G(a0, a1, n−2)    (†4)

The second generalization consists in generalizing the coefficients occurring in Equation (†4) to two functions p(n) and q(n), respectively (and thus, multiplication is introduced). By this generalization we establish a correspondence between the value of the coefficients and the number of unfoldings performed.
We can then derive the explicit definitions of the functions p(n) and q(n) as shown in [55], and we get that p(n) = G(0, 1, n) and q(n) = G(1, 0, n).

The third, final generalization consists in generalizing the argument n+2 on the left hand side of Equation (†4) to n+k, and promoting the new variable k to an argument of a new function defined as follows: F(a0, a1, n, k) =def G(a0, a1, n+k). From the equations defining F(a0, a1, n, k) we get (the details are in [55, pages 184–185]):

G(a0, a1, n+k) = G(0, 1, k) · G(a0, a1, n+1) + G(1, 0, k) · G(a0, a1, n)

Then, by taking n = k and n = k+1, we also get:

G(a0, a1, 2k) = G(0, 1, k) · G(a0, a1, k+1) + G(1, 0, k) · G(a0, a1, k)  for k > 0
G(a0, a1, 2k+1) = G(0, 1, k) · G(a0, a1, k+2) + G(1, 0, k) · G(a0, a1, k+1)  for k ≥ 0

Eventually, by tupling together the function calls which share the same subcalls, we get the following program which computes fib(n) by performing an O(log n) number of arithmetic operations only, as desired. For all k ≥ 1, the function r(k) is the pair ⟨G(0, 1, k), G(1, 0, k)⟩.

fib(0) = 0
fib(1) = 1
fib(n+2) = u + v where ⟨u, v⟩ = r(n+1)  for n ≥ 0
r(1) = ⟨1, 0⟩
r(2k) = ⟨u² + 2·u·v, u² + v²⟩ where ⟨u, v⟩ = r(k)  for k > 0
r(2k+1) = ⟨(u+v)² + u², u² + 2·u·v⟩ where ⟨u, v⟩ = r(k)  for k ≥ 1

We leave to the reader to derive the iterative program that can be obtained by a simple schema equivalence from this program. One can say that the program we have derived is even better than the program based on the multiplication of 2×2 matrices. One can also generalize this derivation from fib to the case of any linear recurrence relation over any semiring structure. What remains to be done? One may want to derive a constant time program for evaluating any linear recurrence relation over a semiring. This would require the introduction of the exponentiation operation.
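Also this derivation can be checked with a short transcription in Python. The sketch below is only illustrative, and assumes the equations for r(2k) and r(2k+1) given above, with r(k) being the pair ⟨fib(k), fib(k−1)⟩.

```python
# Illustrative sketch (not from [55]): r(k) is the pair <fib(k), fib(k-1)>,
# computed with O(log k) arithmetic operations via the equations for
# r(2k) and r(2k+1).

def r(k):
    """r(k) = <fib(k), fib(k-1)>, for k >= 1."""
    if k == 1:
        return (1, 0)
    u, v = r(k // 2)
    if k % 2 == 0:
        return (u * u + 2 * u * v, u * u + v * v)        # r(2k')
    return ((u + v) ** 2 + u * u, u * u + 2 * u * v)     # r(2k'+1)

def fib(n):
    """fib(n) via r: fib(n+2) = u + v where <u, v> = r(n+1)."""
    if n <= 1:
        return n
    u, v = r(n - 1)
    return u + v

print([fib(n) for n in range(10)])    # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(fib(50))                        # 12586269025
```

Each halving of the argument costs a constant number of multiplications and additions, hence O(log n) arithmetic operations in total.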
Recall that fib(n) = (A^n − B^n)/sqrt(5), where A = (1 + sqrt(5))/2 and B = (1 − sqrt(5))/2.

From September 1977 to June 1978, I visited the School of Computer and Information Science at Syracuse University, N.Y., USA. I attended courses taught by Professors Alan Robinson, John Reynolds, Lockwood Morris, and Robert Kowalski (at that time a visiting professor from Imperial College, London, UK). It was a splendid occasion for deepening my knowledge about many aspects of Computer Science from such illustrious teachers. In Syracuse I had the opportunity of reading more carefully some parts of the book Automata Theory, Languages, and Computation by Hopcroft and Ullman [30] and the book
Introduction to Mathematical Logic by Mendelson [40]. I was exposed by Professor Kowalski for the first time to various topics of Artificial Intelligence and I read the preliminary draft of his beautiful book
Logic for Problem Solving [35]. I remember the stress put by Kowalski on Keith Clark’s negation as failure semantics for logic programs [12]. This Computational Logic area was going to become my main research area in the years to come, through my cooperation with Maurizio Proietti in Logic Program Transformation.
The results of the use of tupling and generalization during program transformation were presented in a paper at the 1984 ACM Symposium on Lisp and Functional Programming, Austin, Texas, USA [52]. While I was giving a seminar on those results at the University of Warsaw (Poland), Professor Helena Rasiowa, who was in the audience, at the end kindly said to me: “Your paper is a collection of examples!”. I was not surprised by that remark, but I was happy to have, among the examples, a simple derivation of an iterative program for computing the moves of the Towers of Hanoi problem. That task was considered to be very challenging by some authors (see, for instance, [26, page 285]), and the derivation I proposed is also easily mechanizable. The following Hanoi function h(n, A, B, C) computes the shortest sequence of moves in the free monoid {AB, BC, CA, BA, CB, AC}* to move n (≥ 0) disks from peg A to peg B using peg C as an extra peg. A move of a disk from peg X to peg Y is denoted by XY, for any distinct X, Y in {A, B, C}. Every disk is of a different size and over any disk only smaller disks can be placed. ε denotes the empty sequence of moves, and :: denotes the concatenation of sequences of moves.

h(0, A, B, C) = ε
h(n+1, A, B, C) = h(n, A, C, B) :: AB :: h(n, C, B, A)  for n ≥ 0    (†5)

In order to get an iterative program for computing h(n, A, B, C), we first unfold h(n, A, C, B) and h(n, C, B, A) in (†5), and then we tuple together in the new function t(n−1) the calls of h(n−1, A, B, C), h(n−1, B, C, A), and h(n−1, C, A, B), which share common subcalls (see Figure 1). The order of the components in the tuple is insignificant. Details are in [53].
Figure 1: An upper portion of the call graph m-dag of the Hanoi function h(n+1, A, B, C). An edge from an upper node to a lower node denotes that the upper call requires the lower call. Dashed lines denote tuples. (The figure shows h(n+1, A, B, C) calling h(n, A, C, B) and h(n, C, B, A), with the tuples t(n−1) and t(n−2) each collecting the three calls h(·, B, C, A), h(·, A, B, C), and h(·, C, A, B) at levels n−1 and n−2, respectively.)

We get:

h(0, A, B, C) = ε
h(1, A, B, C) = AB
h(n+2, A, B, C) = u :: AC :: v :: AB :: w :: CB :: u where ⟨u, v, w⟩ = t(n)  for n ≥ 0
t(0) = ⟨ε, ε, ε⟩
t(1) = ⟨AB, BC, CA⟩
t(n+2) = ⟨u :: AC :: v :: AB :: w :: CB :: u,
          v :: BA :: w :: BC :: u :: AC :: v,
          w :: CB :: u :: CA :: v :: BA :: w⟩ where ⟨u, v, w⟩ = t(n)  for n ≥ 0

Then, we can apply the schema equivalence stating that g(n) defined by the equations:

g(0) = a
g(1) = b
g(n+2) = c(g(n))  for n ≥ 0

is equal to the value of res returned by the program:

if even(n) then res = a else res = b; while n > 1 do res = c(res); n = n − 2 od

We get the following program, where for k = 1, 2, 3, Tk denotes the k-th component of the triple T:

{n ≥ 0}
if n = 0 then Hanoi = ε
elseif n = 1 then Hanoi = AB
else begin n = n − 2;
  if even(n) then T = ⟨ε, ε, ε⟩ else T = ⟨AB, BC, CA⟩;
  while n > 1 do
    T = ⟨T1 :: AC :: T2 :: AB :: T3 :: CB :: T1,
         T2 :: BA :: T3 :: BC :: T1 :: AC :: T2,
         T3 :: CB :: T1 :: CA :: T2 :: BA :: T3⟩;
    n = n − 2 od;
  Hanoi = T1 :: AC :: T2 :: AB :: T3 :: CB :: T1
end
{Hanoi = h(n, A, B, C)}

The technique we have presented is based only on the tupling strategy and a simple schema equivalence. That technique is successful also for the many variants of the Towers of Hanoi problem that can be found in the literature (see, among others, [23]).

(Helena Rasiowa and Roman Sikorski gave in 1950 a first algebraic proof of the Gödel Completeness Theorem for first-order predicate calculus.)
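As an illustration, the recursive definition of h given above and its tupled counterpart can be transcribed in Python as follows (a sketch of mine, not from [53]; move sequences are represented as lists of two-letter strings rather than elements of the free monoid).

```python
# Illustrative sketch: recursive Hanoi and its iterative version obtained
# by the tupling strategy; moves are two-letter strings like 'AB'.

def h(n, a, b, c):
    """Recursive Hanoi: shortest move sequence taking n disks from peg a to peg b via c."""
    if n == 0:
        return []
    return h(n - 1, a, c, b) + [a + b] + h(n - 1, c, b, a)

def hanoi_iter(n):
    """Iterative version via tupling: T plays the role of the triple
    t(m) = <h(m,A,B,C), h(m,B,C,A), h(m,C,A,B)>, advanced two levels at a time."""
    if n == 0:
        return []
    if n == 1:
        return ['AB']
    n -= 2
    T = ([], [], []) if n % 2 == 0 else (['AB'], ['BC'], ['CA'])
    while n > 1:
        t1, t2, t3 = T
        T = (t1 + ['AC'] + t2 + ['AB'] + t3 + ['CB'] + t1,
             t2 + ['BA'] + t3 + ['BC'] + t1 + ['AC'] + t2,
             t3 + ['CB'] + t1 + ['CA'] + t2 + ['BA'] + t3)
        n -= 2
    t1, t2, t3 = T
    return t1 + ['AC'] + t2 + ['AB'] + t3 + ['CB'] + t1

print(hanoi_iter(3))   # ['AB', 'AC', 'BC', 'AB', 'CA', 'CB', 'AB']
```

The iterative version uses no call stack: the whole state of the computation is the triple T and the counter n.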
A different derivation for computing the Hanoi function can be done by introducing, besides the tuple t(n), also the tuple t′(n−1) =def ⟨h(n−1, A, C, B), h(n−1, C, B, A), h(n−1, B, A, C)⟩ corresponding to the calls of h at level n−1 (not depicted in Figure 1). We leave this derivation to the reader.

In a later paper I addressed the problem of finding the m-th move of algorithms which compute sequences of moves, without computing any other move [54]. This problem arose as a generalization of the problem relative to the Towers of Hanoi. If the moves are computed by a function defined by a recurrence relation, then under suitable hypotheses it is indeed possible to compute the m-th move without computing any other move. For the case of the Hanoi function h(n, A, B, C) we have that the length Lh(n) of the sequence of moves for n disks satisfies the following equations:

Lh(0) = 0
Lh(n+1) = 2 · Lh(n) + 1  for n ≥ 0

One can show [54] that the m-th move of h(n, A, B, C), for 1 ≤ m ≤ 2^n − 1 and n ≥ 1, can be computed using the deterministic finite automaton of Figure 2. We assume that M[ℓ..0] is the binary expansion of m, the most significant bit being at the leftmost position ℓ. Thus, m = Σi=0..ℓ M[i] · 2^i, and m is not a power of 2 iff M[ℓ..0] ∉ 10*. Let trans(X, p) denote the state Y such that in the finite automaton of Figure 2 there is an arc from state X to state Y with label p.

i = ℓ; state = AB;
while M[i..0] ∉ 10* do begin state = trans(state, M[i]); i = i − 1 end

The m-th move is the name of the final state, with B and C interchanged if an odd number of state transitions is made.
Figure 2: The finite automaton, with states AB, BC, and CA, for computing the m-th move in the sequence h(n, A, B, C) of moves for the Towers of Hanoi problem with n disks and pegs A, B, and C.
Suppose that we want to compute a given m-th move of h(n, A, B, C). Starting from the left, we take the prefix of the binary expansion of m up to (and excluding) its suffix in 10*. We then perform the transitions on the automaton of Figure 2, starting from state AB, according to that prefix (read from left to right). If, for instance, we get to state CA and the length of the prefix is odd, then the move to be computed is BA, that is, CA with B and C interchanged.

In a subsequent paper with Maurizio Proietti [57] we wanted to explore the idea of introducing lists, rather than arrays (indeed, tuples, being of fixed size, can be seen as arrays). Originally, this idea was suggested to me by Rod Burstall. Since every recursive function can be computed by using stacks (actually, two stacks are sufficient for computing any partial recursive function on natural numbers [30]), this technique seemed to me, at first, not very relevant in the practice of improving the time complexity of a program or avoiding inefficient recursions. We explored the use of this technique and, indeed, we managed to achieve good results. In particular, the list introduction strategy can be used when the recursive calls do not generate a sequence of cuts of constant size in the m-dag of the function calls, a situation which does not allow the use of the tupling strategy. A cut in an m-dag is a set C of nodes such that every path from the root to a leaf intersects C. In the case of the Hanoi function (see Figure 1) we have depicted the cuts associated with t(n−1) and t(n−2). Both of them are of size 3 and thus the tupling strategy (with three function calls) is successful. More details on cuts and their use for program transformation, also in relation with pebble games [44], can be found in my Ph.D. thesis [51]. We used the list introduction strategy for deriving a program for computing the binomial coefficients, which satisfy the recurrence C(n+1, k+1) = C(n, k) + C(n, k+1), where C(n, k) denotes ‘n choose k’.
In this case the sequence of cuts from the root to the leaves is of increasing size. Indeed, C(n+1, k+1) requires the computations of C(n, k) and C(n, k+1), which in turn require the computations of C(n−1, k−1), C(n−1, k), and C(n−1, k+1), and so on. (Indeed, in the Pascal Triangle the basis has an increasing size when the height of the triangle increases.) Therefore, the tupling strategy cannot be used.

Now, in order to show the power of the list introduction strategy, let us consider the n-queens problem. Details are in [57]. An n × n board configuration Qs is represented by a list of pairs of the form: [⟨R1, C1⟩, . . . , ⟨Rn, Cn⟩], where for i = 1, . . . , n, ⟨Ri, Ci⟩ denotes a queen placed in row Ri and column Ci. For i = 1, . . . , n, the values of Ri and Ci belong to the list [1, . . . , n]. We start from the following initial program Queens:

1. queens(Ns, Qs) ← placequeens(Ns, Qs), safeboard(Qs)
2. placequeens([ ], [ ]) ←
3. placequeens(Ns, [Q|Qs]) ← select(Q, Ns, Ns1), placequeens(Ns1, Qs)
4. safeboard([ ]) ←
5. safeboard([Q|Qs]) ← safequeen(Q, Qs), safeboard(Qs)
6. safequeen(Q, [ ]) ←
7. safequeen(Q1, [Q2|Qs]) ← notattack(Q1, Q2), safequeen(Q1, Qs)

In order to place n queens we solve the goal queens([1, . . . , n], Qs). By clause 1 we have that placequeens([1, . . . , n], Qs) generates a board configuration Qs and safeboard(Qs) checks that in Qs no two queens lie on the same diagonal (either ‘up diagonal’ or ‘down diagonal’ in Dijkstra’s terminology [21]). We assume that notattack(Q1, Q2) holds iff queen position (or queen, for short) Q1, that is, ⟨R1, C1⟩, is not on the same diagonal of the queen Q2. The tests that the queens are neither on the same row nor on the same column can be avoided by assuming that select(Q, Ns, Ns1) holds iff Ns is a list of distinct numbers in [1, . . . , n], Q is the queen ⟨R, C⟩ such that R is the length of Ns and column C is a member of Ns, and Ns1 is obtained from Ns by deleting the occurrence of C. The length of the list Ns decreases by one unit after each call of placequeens. In particular, we have that board configurations having k queens (with 1 ≤ k ≤ n) are of the form: [⟨n, c1⟩, ⟨n−1, c2⟩, . . . , ⟨n−k+1, ck⟩], where c1, c2, . . . , ck are distinct numbers in [1, . . . , n].

Program Queens solves the problem using the generate-and-test approach and it is not efficient. A more efficient program, using an accumulator that stores the diagonals which are not safe, has been proposed in [67, page 255]. Efficiency is increased because backtracking is reduced. By applying the list introduction strategy (which includes also some generalization steps) one can derive the following program
TransfQueens, whose behaviour is similar to that of the accumulator version. The various transformation steps are described in [57]. The higher efficiency of the final program is due to the fact that the test for a safe board configuration is 'promoted' into the process of generating new configurations, and the number of generated unsafe board configurations is decreased (see the filter promotion technique [5, 15]).

8. queens([ ], [ ]) ←
9. queens(Ns, [Q|Qs]) ← select(Q, Ns, Ns1), genlist(Ns1, Qs, [Q])
10. genlist(Ns, [ ], Ps) ←
11. genlist(Ns, [Q|Qs], [ ]) ← select(Q, Ns, Ns1), genlist(Ns1, Qs, [Q])
12. genlist(Ns, [Q|Qs], [P|Ps]) ← select(Q, Ns, Ns1), notattack(P, Q), genlist(Ns1, Qs, [P], Ps, Q)
13. genlist(Ns, Qs, Ps, [ ], Q) ← genlist(Ns, Qs, [Q|Ps])
14. genlist(Ns, Qs, Ps, [P|Ps1], Q) ← notattack(P, Q), genlist(Ns, Qs, [P|Ps], Ps1, Q)

Let us briefly explain the behaviour of this derived Queens program. By clause 9, the first queen position Q is selected and genlist is called with the board [Q]. When a new queen position Q is selected (see clause 12), it is checked against every position P already on the board (see clauses 13 and 14), and if no check fails, Q is added to the board (see clause 13). If some check fails, by backtracking (see the atom select in clause 12), a different queen position is selected. If all positions for the new queen are under attack, then by backtracking (see the atoms select in clauses 9 and 11), the position of a previously placed queen, if there is one, is selected in a different way.

The explanation which we have just given about the derived program (clauses 8–14) may appear unclear to the non-expert reader, but one should note that it was not needed at all. Indeed, the correctness of the derived program is guaranteed by the correctness of the transformation rules, and the efficiency improvement is due to filter promotion. In some experiments we have done, for 10 queens TransfQueens runs about 70 times faster than Queens.

While studying the tupling strategy and analyzing its power, a sentence by John Darlington, with whom I shared the office in Edinburgh, came often to my mind: "After unfolding, having done some local improvements (such as the ones obtained by the where abstraction as shown in Section 3 for the fusc function), you need to fold." This need for folding [16] is an important requirement. Folding steps make the local improvements become global, so that they can be replicated at each level of recursion and thus become significant.

However, folding steps need matchings between expressions and these matchings may sometimes be impossible. Generalization of constants to variables may allow matchings in some cases, but not always. In particular, when an expression should match one of its subexpressions, generalization of constants to variables does not help. In those cases we have suggested to construct functions from expressions [62]. This is done by replacing the expression E[e], in which the subexpression e occurs, by the application (λx. E[x]) e. We call this technique the lambda abstraction strategy (or, as in other papers, higher-order abstraction).

Let us see how lambda abstraction works in the following two examples taken from [62]. The first example refers to the following program Reverse for reversing a list, where [ ], :, and @ denote the empty list, cons, and append on lists, respectively.

1. rev([ ]) = [ ]
2. rev(a : ℓ) = rev(ℓ) @ [a]
3. [ ] @ y = y
4. (a : ℓ) @ y = a : (ℓ @ y)

We want to derive a tail recursive definition of rev. For folding, the right hand side of Eq. 2, that is, rev(ℓ) @ [a], should match the expression rev(ℓ). There is a subexpression mismatch between rev(ℓ) @ [a] and rev(ℓ). Then we proceed as follows: (i) instead of rev(ℓ), we consider rev(ℓ) @ [ ], (ii) we generalize the constant [ ] to the variable x, thereby deriving rev(ℓ) @ x, and (iii) we abstract rev(ℓ) @ x with respect to x, thereby deriving the function λx.
rev(ℓ) @ x.

The definition of the new function f(ℓ) =def λx. rev(ℓ) @ x is as follows.

5. f([ ]) = λx. rev([ ]) @ x = {by Eq. 1} = λx. [ ] @ x = {by Eq. 3} = λx. x
6. f(a : ℓ) = λx. rev(a : ℓ) @ x = λx. (rev(ℓ) @ [a]) @ x = {by associativity of @} = λx. rev(ℓ) @ ([a] @ x) = λx. rev(ℓ) @ (a : x) = {by folding} = λx. (f(ℓ) (a : x))

We also have:

7. rev(ℓ) = f(ℓ) [ ]

The derived program (Eqs. 5–7) is more efficient than the initial program (Eqs. 1–4) because the expensive operation append has been replaced by the cheaper operation cons. Eqs. 5–7 are basically equivalent to the program proposed in [31], where a new representation for lists has to be invented. Note that the mechanization of the transformation we have now presented requires the use of the associativity property of the append function. Thus, in general, it is important to have knowledge of the algebraic properties of the operations in use.

A second example refers to a problem proposed by Richard Bird [6]. Given a binary tree t, we want to construct an isomorphic binary tree t′ such that: (i) t and t′ have the same multiset of leaves, and (ii) the leaves of t′, when read from left to right, are in ascending order. One should derive a program which constructs t′ by making one traversal only of the tree t. In order to solve this problem Richard Bird uses the so-called locally recursive programs, whose semantics is quite complex and is based on the call-by-need mode of evaluation. By using the tupling and lambda abstraction strategies we will get the desired program with the following advantages over Bird's solution: (i) the use of call-by-value semantics, (ii) the absence of locally recursive definitions, (iii) the construction of the output tree on the fly, and (iv) the computation of components of tuples is done only when they are required for later computations.

By tip(n) we denote a binary tree whose single leaf is the integer n, and by t1 ∧ t2 we denote a binary tree with children t1 and t2.
By hd and tl we denote, as usual, the head and tail functions on lists. Our initial program is as follows.

1. TreeSort(t) = replace(t, sort(leaves(t)))

where: (i) leaves(t) returns the list of the leaves of the tree t, (ii) sort(ℓ) rearranges the list ℓ in ascending order from left to right, and (iii) replace(t, ℓ) uses, in the left-to-right order, the elements of the list ℓ to replace from left to right the leaves of the tree t. We assume that the length of ℓ is at least the number of leaves in t. For instance, we have: TreeSort((tip(3) ∧ tip(1)) ∧ tip(2)) = (tip(1) ∧ tip(2)) ∧ tip(3). Here is the definition of the various functions required:

2. leaves(tip(n)) = [n]
3. leaves(t1 ∧ t2) = leaves(t1) @ leaves(t2)
4. replace(tip(n), ℓ) = tip(hd(ℓ))
5. replace(t1 ∧ t2, ℓ) = replace(t1, take(k, ℓ)) ∧ replace(t2, drop(k, ℓ))  where k = size(t1)
6. take(n, ℓ) = if n = 0 then [ ] else take(n−1, ℓ) @ [hd(drop(n−1, ℓ))]
7. drop(n, ℓ) = if n = 0 then ℓ else tl(drop(n−1, ℓ))

For instance, take(2, [a, b, c, d, e]) = [hd([a, b, c, d, e]), hd([b, c, d, e])] = [a, b] and drop(2, [a, b, c, d, e]) = tl(tl([a, b, c, d, e])) = [c, d, e]. As usual, given a list ℓ, we denote by length(ℓ) the number of elements in ℓ. We assume that 0 ≤ k ≤ length(ℓ) holds when evaluating take(k, ℓ) and drop(k, ℓ). For every list ℓ and for all 0 ≤ n ≤ length(ℓ), we have ℓ = take(n, ℓ) @ drop(n, ℓ). The function size(t) returns the number of leaves in the tree t. We have:

8. size(tip(n)) = 1
9. size(t1 ∧ t2) = size(t1) + size(t2)

Here is the definition of sort using the function merge of two ordered lists:

10. sort(ℓ) = if ℓ = [ ] then [ ] else merge([hd(ℓ)], sort(tl(ℓ)))
11. merge([ ], ℓ) = ℓ
12. merge(ℓ, [ ]) = ℓ
13. merge(a : ℓ1, b : ℓ2) = if a ≤ b then a : merge(ℓ1, b : ℓ2) else b : merge(a : ℓ1, ℓ2)

Unfortunately,
TreeSort(t) traverses the tree t twice: a first visit is for collecting the leaves, and a second visit is for replacing them in ascending order.

Now, let us start off the derivation of the one-traversal algorithm by getting the inductive definition of TreeSort(t). From the equation defining TreeSort we get:

replace(tip(n), sort(leaves(tip(n)))) = replace(tip(n), sort([n])) = tip(n)
replace(t1 ∧ t2, sort(leaves(t1 ∧ t2))) = replace(t1, take(size(t1), ℓ)) ∧ replace(t2, drop(size(t1), ℓ))  where ℓ = sort(leaves(t1 ∧ t2))

Now no folding step can be performed, because in replace(t1, take(size(t1), ℓ)) the subexpression take(size(t1), ℓ) does not match sort(leaves(t1)). Similarly for the subtree t2, instead of t1. By lambda abstraction we generalize the mismatching subexpression to the list variable z, and we introduce the function λz. replace(t, z), whose definition is as follows (the details are in [62]):

λz. replace(tip(n), z) = λz. tip(hd(z))
λz. replace(t1 ∧ t2, z) = λz. (((λy. replace(t1, y)) (take(k, z))) ∧ ((λy. replace(t2, y)) (drop(k, z))))  where k = size(t1)

The functions λz. replace(t, z) and sort(leaves(t)) visit the same tree t. We apply the tupling strategy and we define the function T(t) =def ⟨λz. replace(t, z), sort(leaves(t))⟩, whose explicit definition is:

T(tip(n)) = ⟨λz. tip(hd(z)), [n]⟩
T(t1 ∧ t2) = ⟨λz. ((a1 (take(size(t1), z))) ∧ (a2 (drop(size(t1), z)))), merge(b1, b2)⟩  where ⟨a1, b1⟩ = T(t1) and ⟨a2, b2⟩ = T(t2)

Now T(t1), take(size(t1), z), and drop(size(t1), z) visit the same tree t1. We apply the tupling strategy and we introduce the new function:

U(t, y) =def ⟨λz. replace(t, z), sort(leaves(t)), take(size(t), y), drop(size(t), y)⟩

We get the following explicit definition for U(tip(n), y):

U(tip(n), y) = ⟨λz. tip(hd(z)), [n], [hd(y)], tl(y)⟩

However, when looking for the explicit definition of U(t1 ∧ t2, y) we get again a subexpression mismatch (see [62]) and we use again lambda abstraction for the last two components of the 4-tuple U(t, y). Thus, we introduce the following function:

V(t) =def ⟨λz. replace(t, z), sort(leaves(t)), λz. take(size(t), z), λz. drop(size(t), z)⟩

whose explicit definition is:

V(tip(n)) = ⟨λz. tip(hd(z)), [n], λz. [hd(z)], λz. tl(z)⟩
V(t1 ∧ t2) = ⟨λz. ((a1 (c1 z)) ∧ (a2 (d1 z))), merge(b1, b2), λz. ((c1 z) @ (c2 (d1 z))), λz. (d2 (d1 z))⟩  where ⟨a1, b1, c1, d1⟩ = V(t1) and ⟨a2, b2, c2, d2⟩ = V(t2)

We get the following program such that, for all trees t, NewTreeSort(t) = TreeSort(t):

NewTreeSort(t) = (a2 b2)  where ⟨a2, b2⟩ = T(t)
T(tip(n)) = ⟨λz. tip(hd(z)), [n]⟩
T(t1 ∧ t2) = ⟨λz. ((a1 (c1 z)) ∧ (a2 (d1 z))), merge(b1, b2)⟩  where ⟨a1, b1, c1, d1⟩ = V(t1) and ⟨a2, b2⟩ = T(t2)

together with the above equations for the function V(t). A further improvement of this program can be made by avoiding the append function @ occurring in the equation for V(t1 ∧ t2). One can use the same technique of lambda abstraction shown in the Reverse example at the beginning of this section. We consider a variant V∗(t) of the function V(t) whose third component is the abstraction λz x. take(size(t), z) @ x, instead of λz. take(size(t), z). The function T∗(t) is like T(t), but uses V∗(t) instead of V(t). We get the following final program such that, for all trees t, NewTreeSort∗(t) = TreeSort(t):

NewTreeSort∗(t) = (a2 b2)  where ⟨a2, b2⟩ = T∗(t)
T∗(tip(n)) = ⟨λz. tip(hd(z)), [n]⟩
T∗(t1 ∧ t2) = ⟨λz. ((a1 (c1 (z, [ ]))) ∧ (a2 (d1 z))), merge(b1, b2)⟩  where ⟨a1, b1, c1, d1⟩ = V∗(t1) and ⟨a2, b2⟩ = T∗(t2)
V∗(tip(n)) = ⟨λz. tip(hd(z)), [n], λz x. hd(z) : x, λz. tl(z)⟩
V∗(t1 ∧ t2) = ⟨λz. ((a1 (c1 (z, [ ]))) ∧ (a2 (d1 z))), merge(b1, b2), λz x. (c1 (z, c2 ((d1 z), x))), λz. (d2 (d1 z))⟩  where ⟨a1, b1, c1, d1⟩ = V∗(t1) and ⟨a2, b2, c2, d2⟩ = V∗(t2)

Computer experiments performed at the time of writing the paper [62], from which we take this example, show that the computation of the final function
NewTreeSort∗(t) is faster than that of the initial function TreeSort(t) for trees whose size is greater than about 30. For trees of smaller size the overhead of dealing with functions is not compensated by the fact that the input tree is visited once only.

Note also that, since our lambda expressions do not have free variables, we can operate on them by using pairs of bound variables and function bodies, instead of the more expensive closures. Thus, for instance, λz. expr can be represented by the pair ⟨z, expr⟩.

Some years later Maurizio Proietti and I studied the application of the lambda abstraction strategy in the area of logic programming. As in functional programs we have lambda expressions denoting functions, in logic programming we should have terms denoting goals, and thus goals should be allowed to occur as arguments of predicates. To allow goals as arguments, we proposed a novel logic language, we defined its semantics, and we provided for it a set of unfold/fold transformation rules, together with some goal replacement rules, such as the one stating the equivalence of the goal g ∧ true with the goal g [56, 63]. Those rules have been proved correct.

Here is an example of efficiency improvement obtained by program transformation in this novel language. This transformation has not been mechanized, but we believe that it would not be hard to do so. Details can be found in [63, Section 7.1]. Let us consider a program which, given a binary tree (either l(N) or t(L, N, R)), (i) flips all its left and right subtrees, and (ii) checks, in a subsequent traversal of the tree, whether or not all labels are natural numbers.

1. flipcheck(X, Y) ← flip(X, Y), check(Y)
2. flip(l(N), l(N)) ←
3. flip(t(L, N, R), t(FR, N, FL)) ← flip(L, FL), flip(R, FR)
4. check(l(N)) ← nat(N)
5. check(t(L, N, R)) ← nat(N), check(L), check(R)
6. nat(0) ←
7. nat(s(N)) ← nat(N)

We derived the following program, which traverses the input tree only once and uses the continuation passing style:

8. flipcheck(X, Y) ← newp(X, Y, G, true, G)
9. newp(l(N), l(N), G, C, D) ← eq_c(G, nat_c(N, C), D)
10. newp(t(L, N, R), t(FR, N, FL), G, C, D) ← newp(L, FL, U, C, newp(R, FR, V, U, eq_c(G, nat_c(N, V), D)))
11. nat_c(0, C) ← C
12. nat_c(s(N), C) ← nat_c(N, C)

For the predicate eq_c we assume that: ⊢ ∀ (eq_c(X, Y, C) ↔ ((X = Y) ∧ C)).
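To make the continuation-passing idea concrete, here is a Python sketch of ours (the tree encoding ('l', n) / ('t', l, n, r), the helper is_nat, and all function names are our own assumptions, not the notation of [63]): it flips the tree and checks the labels in a single traversal, threading the pending check as a continuation in the style of clauses 8–10.

```python
def is_nat(n):
    # the labels must be natural numbers (0, s(0), s(s(0)), ... in the paper)
    return isinstance(n, int) and n >= 0

def flipcheck(t):
    # One traversal: flip every left/right pair and check the labels.
    # k is the continuation receiving the flipped tree when all checks succeed;
    # None signals a failed check.
    def go(t, k):
        if t[0] == 'l':                      # leaf l(N)
            return k(('l', t[1])) if is_nat(t[1]) else None
        _, l, n, r = t                       # node t(L, N, R)
        if not is_nat(n):
            return None
        return go(l, lambda fl: go(r, lambda fr: k(('t', fr, n, fl))))
    return go(t, lambda y: y)
```

As in the derived logic program, the check is 'promoted' into the single traversal instead of being a second pass over the result.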
While at Edinburgh I had the privilege of attending a course on the Calculus of Communicating Systems (CCS) given by Professor Robin Milner (1934-2010) [42]. I remember the day when Robin Milner and Gordon Plotkin decided the name to be given to this new calculus. As I was told, they first decided that the name should be of three letters only! I appreciated the beauty of the calculus, which resembles a development of the lambda calculus. The application of a function λx. e[x] to an argument a can, indeed, be understood as a communication which takes place between: (i) the 'function agent' and (ii) the 'argument agent', through the 'port' named λ. After their communication, which is called a handshaking, the agents continue their respective activities, namely, (i) the function agent does the evaluation of e[a], that is, the body e[x] of the function where the variable x has been bound to the value a, and (ii) the argument agent does nothing, that is, it becomes the null agent (indeed, for the rest of the computation, the argument has nothing left to do).

At about the same time, Professor Tony Hoare in Oxford was developing his calculus of Communicating Sequential Processes (CSP) [29]. I remember a visit that Tony Hoare made to Robin Milner at Edinburgh and the stimulating seminar Hoare gave on CSP on that occasion.

In subsequent years, I thought of exploring the power of communications and parallelism in functional programming, also because the various components of the tuples introduced by the tupling strategy can be computed in parallel. These components can be considered as independent agents which may synchronize at the end of their computations. During those years, the notion of communicating agents was emerging quite significantly in various programming paradigms. Andrzej Skowron and I did some work in this area and we proposed (some variants of) a functional language with communications [60, 61].
Each function call is assumed to be an agent, that is, a triple of the form ⟨x, m⟩ :: expr, where x is its name, m is its message, that is, its local information, and expr is its expression, that is, the task it has to perform. The operational semantics of the language is based on the conditional rewriting of sets (or multisets) of agents, similarly to what is done in coordination languages (see, for instance, [25]).

As an example of a functional program with communications, let us consider the following program for computing the familiar Fibonacci function. The variable x ranges over agent names, which are strings generated by the following grammar: x ::= ε | x.1 | x.2. The left and right son-calls of the agent whose name is x have names x.1 and x.2, respectively. By default, the name of the agent of the initial function call is the empty string ε.

In our example, the variables ms and ms1 range over the three message constants: R (for ready), R1 (a variant of ready), and W (for wait). Agents with messages R and R1 may make rewritings, while agents with message W cannot (see Rules 1–7 below). The variables n and val range over integers and the variable exp ranges over integer expressions.

1. {⟨x, ms⟩ :: fib(0)} ⇒ {⟨x, ms⟩ :: 0}  if ms = R or ms = R1
2. {⟨x, ms⟩ :: fib(1)} ⇒ {⟨x, ms⟩ :: 1}  if ms = R or ms = R1
3. {⟨x, R⟩ :: fib(n+2)} ⇒ {⟨x, R⟩ :: +(x.1, x.2), ⟨x.1, R⟩ :: fib(n+1), ⟨x.2, R1⟩ :: fib(n)}  if n ≥ 0
4. {⟨x, R1⟩ :: fib(n+2)} ⇒ {⟨x, R1⟩ :: +(x.1, x.2), ⟨x.1, W⟩ :: fib(n+1), ⟨x.2, R⟩ :: fib(n)}  if n ≥ 0
5. {⟨x.1, ms⟩ :: val, ⟨x, ms1⟩ :: +(x.1, exp)} ⇒ {⟨x, ms1⟩ :: +(val, exp)}
6. {⟨x.2, ms⟩ :: val, ⟨x, ms1⟩ :: +(exp, x.2)} ⇒ {⟨x.2, ms⟩ :: val, ⟨x, ms1⟩ :: +(exp, val)}
7. {⟨x.1.2, R1⟩ :: val, ⟨x.2.1, W⟩ :: exp} ⇒ {⟨x.1.2, R1⟩ :: val, ⟨x.2.1, R⟩ :: val}

Rules 1 and 2 are the expected ones for computing fib(0) and fib(1). The recursive call of fib(n+2) has two variants (see Rules 3 and 4) so as to be able to evaluate the call of agent x.1.2 in a different way from that of agent x.2.1. The expression +(x.1, x.2) has the effect that, once the values of the son-calls are evaluated and sent to the father-call according to Rules 5 and 6, the father-call silently performs the sum of the values it has received. Rule 7 sends the value computed by agent x.1.2 to agent x.2.1. This communication is correct and improves efficiency. Indeed, by our program the value of fib(n−1), which is needed for computing both fib(n+1) and fib(n), is computed once only. Note, in fact, that one of the two agents which have to compute fib(n−1) has the message W and cannot make further rewritings.

We have considered the problem of how to modify the rules of the programs when acquiring knowledge of new facts about the functions to be evaluated, so as to improve program efficiency. In the case of the Fibonacci function, one such fact may be the equality of the expressions to be computed by the agents x.1.2 and x.2.1.

Note that the above Rules 1–7 do not perform the on-the-fly garbage collection of the agents, because right-sons are not erased. To overcome this problem one may use more complex messages [61], so that every agent knows the agents which are waiting to receive the value it computes. If there are none, the agent may be erased once it has sent its value to the father-call. Note also that, if instead of Rule 6 we use the simpler rule:

6′. {⟨x.2, ms⟩ :: val, ⟨x, ms1⟩ :: +(exp, x.2)} ⇒ {⟨x, ms1⟩ :: +(exp, val)}

deadlock may be generated.
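The saving obtained by Rule 7 can also be mimicked sequentially. Assuming the standard values fib(0) = 0 and fib(1) = 1, the following Python sketch of ours returns the pair ⟨fib(n+1), fib(n)⟩, so that each needed value of fib(n−1) is computed once and then shared, which is the effect that the communication of Rule 7 achieves, in parallel, between the two agents that would otherwise both compute fib(n−1).

```python
def fib_pair(n):
    # returns (fib(n+1), fib(n)); each Fibonacci value is computed once,
    # the shared value playing the role of the message sent by Rule 7
    if n == 0:
        return (1, 0)          # (fib(1), fib(0)), assuming fib(0)=0, fib(1)=1
    a, b = fib_pair(n - 1)     # a = fib(n), b = fib(n-1)
    return (a + b, a)

def fib(n):
    return fib_pair(n)[1]
```

This is, of course, just the tupling strategy again: the pair is the constant-size cut that the agents compute cooperatively.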
We have also proposed a modal logic for proving the correctness of our functional programs with agents and communications [59] and, in particular, the absence of deadlock. Unfortunately, no implementation of our language proposal and its modal logic has been carried out.

Concerning a more theoretical study of parallelism and communications, Anna Labella and I considered categorical models for calculi with handshaking communications, both in the case of CCS [42] and in that of CSP [29]. We were inspired by the definition of cartesian closed categories for providing models of the lambda calculus. We followed an approach different from Winskel's [73]. We did not give an a priori definition of a categorical structure, in which the embedding of the algebraic models of CCS or CSP might not be completely satisfactory. We started, instead, from the algebraic models, based on labelled trees of various kinds, and we defined suitable categories of labelled trees where one can interpret all the basic operations of CCS and CSP. In a sense, we followed the approach presented many years earlier by Rod Burstall for the description of flowchart programs [8]. The details of our categorical constructions can be found in [33, 36]. In some of our models we used enriched categories [34]. An enriched category is a category where the sets of morphisms associated with pairs of objects are replaced by objects of a fixed monoidal category. For lack of space we will not enter into the details here.

While studying at Edinburgh, I thought of applying the transformation methodology to CCS agents. I remember talking to Robin Milner about this idea. He did not show much interest,
maybe because for him it was more important to first acquire a good understanding of the equivalences between terms of the CCS calculus, before applying them to transformations of agents which, of course, should be equivalence preserving.

Then, I thought of applying program transformation to the area of logic programming, which I had first studied during my Ph.D. research at Edinburgh. At that time William Clocksin and Chris Mellish were writing their popular book on Prolog [13]. I remember reading some parts of a draft of the book. I also had the chance of looking at David Warren's report on how to compile logic programs [70]. I also read his paper comparing the Prolog implementation with Lisp [72] and the later report on the Warren Abstract Machine [71]. From those days I still remember David's kindness, his cooperation with Fernando Pereira, and his love for plants and flowers.

A few years later, when back in Italy, I was introduced by Anna Labella to her former student Maurizio Proietti who, not long before, had graduated in Mathematics at Rome University 'La Sapienza', defending a thesis on Category Theory. I spoke to Maurizio and I introduced him to logic programming [38]. I also encouraged him to work in the field of logic program transformation. He kindly accepted. The basis of his work was a paper by Hisao Tamaki and Taisuke Sato [68] that soon afterwards became the standard reference for logic program transformation.

That was the beginning of Maurizio's cooperation with me. He was first funded by a research grant from the private company Enidata (Rome) and soon afterwards he became a researcher of the Italian National Research Council in Rome. We first considered some techniques for finding the eureka predicates, that is, the predicate definitions to be introduced during program transformation [64]. Besides the definition introduction, unfolding, and folding rules, we have used for our transformations a rule called
Generalization + Equality Introduction (see also [7] for a similar rule used when proving theorems about functional programs). By this rule, a clause of the form H ← A1, ..., An is generalized to the clause H ← GenA1, ..., GenAn, X1 = t1, ..., Xr = tr, where (GenA1, ..., GenAn)ϑ = (A1, ..., An), ϑ being the substitution {X1/t1, ..., Xr/tr}.

We have also introduced: (i) the class of non-ascending programs, where, among other properties, each variable should occur in an atom at most once, (ii) the synchronized descent rule (SDR) for driving the unfolding steps by selecting the atoms to be unfolded, and (iii) the loop absorption strategy for the synthesis of the eureka predicates. We have also characterized classes of programs in which that strategy is guaranteed to be successful.

Let us see a simple example of application of the loop absorption strategy. Here is a program, called Comsub, for computing common subsequences of lists.

1. comsub(X, Y, Z) ← sub(X, Y), sub(X, Z)
2. sub([ ], X) ←
3. sub([A|X], [A|Y]) ← sub(X, Y)
4. sub(X, [A|Y]) ← sub(X, Y)

where sub(X, Y) holds iff X is a sublist of Y. The order of the elements should be preserved, but the elements of X need not be consecutive in Y. For instance, [1,2] is a sublist of [1,3,2,3], while [2,1] is not. We want to derive a program where the double visit of the list X in clause 1 is avoided.

First, we make the given program non-ascending by replacing clause 3 by the following clause:

3′. sub([A1|X], [A2|Y]) ← A1 = A2, sub(X, Y)

Let
Comsub1 be the program consisting of clauses 1, 2, 4, and the new clause replacing clause 3. In Figure 3 we have depicted an upper portion of the unfolding tree for Comsub1. In that figure we have underlined the atoms selected by the SDR rule.

1. comsub(X, Y, Z) ← sub(X, Y), sub(X, Z)
5. comsub([ ], Y, Z) ← sub([ ], Z)
6. comsub([A1|X], [A2|Y], Z) ← A1 = A2, sub(X, Y), sub([A1|X], Z)
7. comsub(X, [A|Y], Z) ← sub(X, Y), sub(X, Z)
8. comsub([ ], Y, Z) ←
9. comsub([ ], Y, [A|Z]) ← sub([ ], Z)
10. comsub([A1|X], [A2|Y], [A3|Z]) ← A1 = A2, sub(X, Y), A1 = A3, sub(X, Z)
11. comsub([A1|X], [A2|Y], [B|Z]) ← A1 = A2, sub(X, Y), sub([A1|X], Z)

Figure 3: An upper portion of the unfolding tree for Comsub1. Clauses 5, 6, and 7 are the children of clause 1, clauses 8 and 9 are the children of clause 5, and clauses 10 and 11 are the children of clause 6.

Indeed, by the synchronized descent rule, in clause 6 we have to unfold the atom sub([A1|X], Z), because in its ancestor-clause 1 we have unfolded the other atom sub(X, Y) occurring in the body of that ancestor. Unfolding is stopped when the recursively defined atoms in the body of a leaf-clause, say L, are subsumed by the body of an ancestor-clause, say A. In this case we say that a loop of the form ⟨A, L⟩ has been detected. Details can be found in [64].

According to the loop absorption strategy, for each detected loop ⟨A, L⟩ we introduce a new definition clause D so that the bodies of both clauses A and L can be folded using D. The loops ⟨1, 7⟩ and ⟨1, 10⟩ do not need a new definition because we have clause 1 defining comsub. The loops ⟨5, 9⟩ and ⟨6, 11⟩ require the following two new predicate definitions:

newsub(Z) ← sub([ ], Z)    for loop ⟨5, 9⟩
newcomsub(A, X, Y, Z) ← sub(X, Y), sub([A|X], Z)    for loop ⟨6, 11⟩

By performing the unfolding and folding steps which correspond to the subtrees rooted in clauses 1, 5, and 6 of Figure 3, we get the explicit definitions of the predicates newsub and newcomsub. Eventually, by simplifying the equalities, we get the following program:

5. comsub([ ], Y, Z) ←  (∗)
6. comsub([ ], Y, [A|Z]) ← newsub(Z)
7. comsub([A|X], [A|Y], Z) ← newcomsub(A, X, Y, Z)  (∗)
8. comsub(X, [A|Y], Z) ← comsub(X, Y, Z)  (∗)
9. newcomsub(A, X, Y, [A|Z]) ← comsub(X, Y, Z)  (∗)
10. newcomsub(A, X, Y, [B|Z]) ← newcomsub(A, X, Y, Z)  (∗)
11. newsub(Z) ←
12. newsub([A|Z]) ← newsub(Z)

Now clause 6 is subsumed by clause 5 and can be erased. Then, also clauses 11 and 12 can be erased, and the final program is made out of the marked clauses 5 and 7–10 only. This final program is equal to the one derived by Tamaki and Sato [68].
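A direct Python rendering of the derived clauses may clarify how the double visit of the list X has disappeared. In the sketch below (our own transcription; the generator names mirror the predicates comsub and newcomsub), comsub enumerates, possibly with repetitions, exactly the common subsequences of its two arguments.

```python
def newcomsub(a, y, z):
    # clauses 9 and 10: look for every occurrence of a in z, then go on with comsub;
    # the scan over z realizes clause 10, the match realizes clause 9
    for i, b in enumerate(z):
        if b == a:
            for rest in comsub(y, z[i + 1:]):
                yield (a,) + rest

def comsub(y, z):
    # enumerate (with possible repetitions) the common subsequences of y and z
    yield ()                                   # clause 5
    if y:
        yield from newcomsub(y[0], y[1:], z)   # clause 7
        yield from comsub(y[1:], z)            # clause 8
```

Each candidate element of the common subsequence is taken from y once and then matched against z, instead of being generated and tested twice as in the initial program.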
Note that our derivation does not rely on human intuition and can easily be mechanized. The computation of all solutions of the goal comsub(X, Y, Z), where X is a free variable and Y and Z are ground lists of 10 elements, is about 6 times faster when using the final program instead of the initial one [64]. A development of the technique we have now illustrated can be found in [65].

The following example, taken from a paper of ours [58] written some years later in honor of Professor Robert Kowalski, shows an application of the program transformation methodology also to the case when clauses may have negated atoms in their bodies. For that kind of logic programs, called locally stratified logic programs, we have also provided the transformation rules that can be applied and we have shown that they are correct, in the sense that they preserve the perfect model semantics. The details on the rules and the definition of the perfect model semantics can be found in [58].

Let us consider the following program CFParser for deriving a word generated by a given context-free grammar over the alphabet {a, b}:

1. derive([ ], [ ]) ←
2. derive([A|S], [A|W]) ← terminal(A), derive(S, W)
3. derive([A|S], W) ← nonterminal(A), production(A, B), append(B, S, T), derive(T, W)
4. nonterminal(s) ←
5. nonterminal(x) ←
6. terminal(a) ←
7. terminal(b) ←
8. production(s, [a, x, b]) ←
9. production(x, [ ]) ←
10. production(x, [a, x]) ←
11. production(x, [a, b, x]) ←
12. word([ ]) ←
13. word([A|W]) ← terminal(A), word(W)
14. append([ ], Ys, Ys) ←
15. append([A|Xs], Ys, [A|Zs]) ← append(Xs, Ys, Zs)

The relation derive([s], W) holds iff the word W can be derived from the start symbol s using the following productions (see clauses 8–11): s → a x b and x → ε | a x | a b x. The nonterminal symbols are s and x (see clauses 4 and 5), the terminal symbols are a and b (see clauses 6 and 7), words in {a, b}∗ are represented as lists of a's and b's, and the empty word ε is represented as the empty list [ ]. The relation derive(L, W) holds iff L is a sequence of terminal or nonterminal symbols from which the word W can be derived by using the productions.

We would like to derive an efficient program for an initial goal G of the form: word(W), ¬derive([s], W), which holds in the perfect model of the program CFParser iff W is a word which is not derivable from s by using the given context-free grammar. We perform our two-step program derivation presented in [58, Section 2.3]. In the first step, from goal G we derive the following two clauses:

16. g(W) ← word(W), ¬new1(W)
17. new1(W) ← derive([s], W)

In the second step, we apply the unfold-definition-folding strategy presented in [65]. We will not recall here the formal definition of this strategy. It will be enough to say that it is similar to the loop absorption strategy we have seen in action in the above derivation starting from the Comsub program. By transforming the
CFParser program, at the end of the second step, we get: g ([ ]) ← g ([ a | A ]) ← new A ) g ([ b | A ]) ← new A ) new ← new a | A ]) ← new A ) new b | A ]) ← new A ) new ← new a | A ]) ← new A ) new b | A ]) ← new A ) new ← new a | A ]) ← new A ) new b | A ]) ← new A ) new a | A ]) ← new A ) new b | A ]) ← new A ) new a | A ]) ← new A ) new b | A ]) ← new A )This program corresponds to the deterministic finite automaton of Figure 4. Each predicate ofthe derived program is a state, (ii) g is the initial state, (iii) a state p is final iff it has a clauseof the form p ([ ]) ← , (iv) a clause of the form p ([ ℓ | A ]) ← q ( A ) denotes a transition with label ℓ from p to q . Note that the derivation of the final program that corresponds to a finite automatonhas been possible because the context-free grammar indeed generates a regular language. new new g new new new a a, b a ba, b bab ab Figure 4: The finite automaton which accepts the words which are not generated from s by theproductions: s → a x b and x → ε | a x | a b x . State g is the initial state and the final states havedouble circles.Finally we present an example on how to use the transformation methodology for the ver-ification of program properties. This example is the so called Yale Shooting Problem which isoften used in temporal reasoning. This problem can be described and formalized as follows.We have a person and a gun. Three events are possible: (e1) a load event, when the gun isloaded, (e2) a shoot event, when the gun shoots, and (e3) a wait event, when nothing happens(see clauses 1–3 below). A situation is (the result of) a sequence of events. A sequence isrepresented as a list. 
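As a cross-check on the derivation above, the automaton of Figure 4 can be simulated directly. The following Python sketch is my own encoding, not part of the original derivation: the state names q1–q4 and dead are mine, and the transition table is read off the grammar s → a x b, x → ε | a x | a b x, whose language is a(a|ab)*b.

```python
# DFA accepting the words over {a,b} that are NOT derivable from s,
# i.e. the complement of the regular language a(a|ab)*b.
# State names are mine; 'dead' is the sink reached on an impossible
# prefix (every extension of such a prefix stays non-derivable).
DELTA = {
    ('g',    'a'): 'q1',   ('g',    'b'): 'dead',
    ('q1',   'a'): 'q2',   ('q1',   'b'): 'q4',
    ('q2',   'a'): 'q2',   ('q2',   'b'): 'q3',
    ('q3',   'a'): 'q2',   ('q3',   'b'): 'q4',
    ('q4',   'a'): 'dead', ('q4',   'b'): 'dead',
    ('dead', 'a'): 'dead', ('dead', 'b'): 'dead',
}
ACCEPT = {'g', 'q1', 'q2', 'dead'}  # final states of the complement

def not_derivable(word: str) -> bool:
    """True iff word is NOT generated from s by the grammar."""
    state = 'g'
    for ch in word:
        state = DELTA[(state, ch)]
    return state in ACCEPT
```

For instance, the derivable words ab and aabb are rejected here, while b and abab, which are not derivable from s, are accepted, matching the intended meaning of goal G.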
We assume that, as time progresses, the list grows ‘to the left’, that is, given the current list S of events, when a new event E occurs, the new list of events is [E|S]. In any situation, at least one of the following three facts holds: (f1) the person is alive, (f2) the person is dead, and (f3) the gun is loaded (see clauses 4–6 below).

We also assume the following hypotheses (see clauses 7–11 and note the presence of a negated atom in clause 11). (s1) In the initial situation, denoted by the empty list, the person is alive. (s2) After a load event the gun is loaded. (s3) If the gun is loaded, then after a shoot event the person is dead. (s4) If the gun is loaded, then it is abnormal that after a shoot event the person is alive. (s5) Inertia Axiom: If a fact F holds in a situation S and it is not abnormal that F holds after the event E following S, then F holds also after the event E.

The following locally stratified program YSP formalizes the above statements. A similar formalization is in a paper by Apt and Bezem [1].

1. event(load) ←
2. event(shoot) ←
3. event(wait) ←
4. fact(alive) ←
5. fact(dead) ←
6. fact(loaded) ←
7. holds(alive,[ ]) ←
8. holds(loaded,[load|S]) ←
9. holds(dead,[shoot|S]) ← holds(loaded,S)
10. ab(alive,shoot,S) ← holds(loaded,S)
11. holds(F,[E|S]) ← fact(F), event(E), holds(F,S), ¬ab(F,E,S)
12. append([ ],Ys,Ys) ←
13. append([A|Xs],Ys,[A|Zs]) ← append(Xs,Ys,Zs)

By applying SLDNF-resolution [38], Apt and Bezem showed that holds(dead,[shoot,wait,load]) is true in the perfect model of program YSP. Now we consider a property Γ which cannot be shown by SLDNF-resolution (see [58]):

Γ ≡ ∀S (holds(dead,S) → ∃S1,S2,S3,S′ (append(S1,[shoot|S2],S′) ∧ append(S′,[load|S3],S)))

Property Γ means that the fact that the person is dead in the current situation S implies that in the past there was a load event followed, possibly not immediately, by a shoot event. Thus, since time progresses ‘to the left’, S is a list of events of the form: [..., shoot, ..., load, ...].

In the first step of our two step verification method (see [58, Section 2.3]), we apply the Lloyd-Topor transformation [38, page 113] starting from the statement: g ← Γ (where g is a new predicate name) and we derive the following clauses:

14. g ← ¬new2
15. new2 ← holds(dead,S), ¬new1(S)
16. new1(S) ← append(S1,[shoot|S2],S′), append(S′,[load|S3],S)

At the end of the second step, after a few iterations of the unfold-definition-folding strategy and after the deletion of all definitions of predicates which are not required by g, we are left with the single clause: g ← .
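The statements (s1)–(s5) and property Γ can also be checked by a direct interpreter of clauses 7–11, independently of the transformation-based proof. The following small Python sketch is my own encoding, not from the original paper; situations are lists of events, most recent event first, as in the text.

```python
from itertools import product

EVENTS = ['load', 'shoot', 'wait']

def holds(fact, sit):
    """True iff holds(fact, sit) is in the perfect model of YSP
    (clauses 7-11); sit is a list of events, most recent first."""
    if not sit:
        return fact == 'alive'                                     # clause 7
    e, rest = sit[0], sit[1:]
    if fact == 'loaded' and e == 'load':                           # clause 8
        return True
    if fact == 'dead' and e == 'shoot' and holds('loaded', rest):  # clause 9
        return True
    # clause 11 (inertia), with abnormality given by clause 10
    abnormal = fact == 'alive' and e == 'shoot' and holds('loaded', rest)
    return holds(fact, rest) and not abnormal

def gamma_holds(max_len):
    """Exhaustively check property Gamma on all situations of length
    at most max_len: whenever the person is dead, a shoot event occurs
    to the left of (i.e., after) a load event."""
    for n in range(max_len + 1):
        for s in product(EVENTS, repeat=n):
            if holds('dead', list(s)):
                if not any(s[i] == 'shoot' and 'load' in s[i+1:]
                           for i in range(n)):
                    return False
    return True
```

In this interpretation holds('dead', ['shoot', 'wait', 'load']) is true, as in the Apt and Bezem example, and gamma_holds(6) finds no counterexample to Γ among all situations of length at most 6.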
Details can be found in [58]. Since g holds in the (perfect model of the) final program, we have that property Γ holds in the (perfect model of the) final program. Thus, Γ holds also in the initial program made out of clauses 1–13.

Much more recently we have explored some verification techniques based on the transformation of constrained Horn clauses, also in the case of imperative and functional programs [19] and in the case of business processes (see, for instance, [17]). This recent work has been done in cooperation with Emanuele De Angelis and Fabio Fioravanti. They also have been working, and still work, on the implementation and development of an automatic transformation and verification tool [18], which was originally set up by Ornella Aioni and Maurizio Proietti.

Reviewing my research activity when writing this paper, I realized that many topics and issues would need a more accurate analysis and study. It would be difficult to list them all, but I have been encouraged to mention at least some of them. I hope that researchers in the field may find these suggestions useful and of some interest.

Concerning the theory of combinators and WCL presented in Section 1, one should note that the combinator X ≡ B(B(BB)B)(BB) we have presented has parentheses, and one could consider constructing a B-combinator, call it B̃, which places those parentheses in a sequence of seven B's, so that B̃ BBBBBBB >* B(B(BB)B)(BB). A routine construction, following [3], shows that B̃ is, in fact, B(B(B(BB)B)B)(BB). The relation between the combinators X and B̃ could be for the reader a stimulus for studying the process of placing parentheses in a list of variables, that is, the process of constructing a binary tree from the list of its leaves.

One can start by considering, first, the use of regular combinators only. A combinator X is said to be regular if its reduction is of the form X x1 . . . xn > x1 t1 . . . tm, where t1, . . . , tm are terms made out of x2, . . . , xn only. A particular regular combinator for placing parentheses is, indeed, B. Similarly, one could study the permutative and duplicative properties of the regular combinators C (defined by Cxyz > xzy) and W (defined by Wxy > xyy) and other regular (or non-regular) combinators. This study will improve the results reported in the classical book by Curry and Feys [14, Chapter 5].

For Section 2 one could develop the techniques presented in [49]. Those developments can be useful in the area of Term Rewriting Systems for constructing terms with infinite behaviour.

For the issues considered in Section 3 on Program Transformation, it will be important to investigate how to invent the multiplication operation, having at our disposal in the initial program version only the addition operation. Generalizations of various kinds can be suggested, as we have done in this paper, but an interesting technique would be the one based on the idea of deriving multiplication as the iteration of additions. Then, in an analogous way, exponentiation can be invented as the iteration of multiplications, thus allowing us to derive even more efficient programs. The idea of iteration can hopefully be generated by mechanically analyzing the m-dags constructed by unfolding and looking at repeated patterns.

For Sections 4 and 5, it could be important to mechanize the techniques we have presented there, and in particular those for finding the suitable tuples of functions and suitable lambda-abstractions via the analysis of: (i) cuts and pebble games in the m-dags, and (ii) subexpression mismatchings, respectively.

For Section 6 one can provide an implementation of the functional language with communications we have proposed, so that one can execute programs written in that language.
One may also: (i) automate the process of adding communications to functional programs for improving their efficiency by making use of the properties of the functions to be evaluated, and (ii) automate the reasoning on the modal theories presented in [59], in which one can prove the correctness of those communications. Thus, one will have a machine-checked proof of correctness of the communications which have been added.

For Section 7 a possible project is to construct a transformation system for logic programs with goals as arguments in which: (i) one can run the programs according to the operational semantics we have defined in our paper [56], and (ii) one can apply the various transformation rules (definition introduction, unfolding, folding, goal replacement) we have listed in that paper.
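Returning to the combinator B̃ mentioned above, the claimed reduction B̃ BBBBBBB >* B(B(BB)B)(BB) can be checked mechanically. The following minimal Python sketch uses my own term encoding, not one from the cited papers: the atom 'B' is the combinator B, and a pair (f, a) stands for the application f a.

```python
# B-combinator reduction: B x y z > x (y z).
# Terms: the atom 'B', or a pair (f, a) standing for the application f a.

def app(*ts):
    """Left-associated application: app(f, a, b) = ((f, a), b)."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def step(t):
    """One leftmost-outermost reduction step; returns (term, changed)."""
    if isinstance(t, tuple):
        f, a = t
        # redex B x y z at this node: t = (((B, x), y), z)
        if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == 'B':
            x, y, z = f[0][1], f[1], a
            return (x, (y, z)), True
        f2, ch = step(f)
        if ch:
            return (f2, a), True
        a2, ch = step(a)
        if ch:
            return (f, a2), True
    return t, False

def normalize(t, limit=10000):
    """Reduce to normal form (terms built from B alone are
    strongly normalizing, so this terminates)."""
    for _ in range(limit):
        t, ch = step(t)
        if not ch:
            return t
    raise RuntimeError('step limit exceeded')

B = 'B'
# X = B(B(BB)B)(BB)
X = app(B, app(B, app(B, B), B), app(B, B))
# eB stands for B-tilde = B(B(B(BB)B)B)(BB)
eB = app(B, app(B, app(B, app(B, B), B), B), app(B, B))
```

Applying normalize to app(eB, B, B, B, B, B, B, B), that is, B̃ followed by seven B's, yields exactly the term X, confirming the routine construction mentioned above.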
My gratitude goes to the various people who taught me Computer Science, and among them Paolo Ercoli, Rod Burstall, Robin Milner, Gordon Plotkin, John Reynolds, Alan Robinson, and Leslie Valiant. I am also grateful to my colleagues and students. In particular, I would like to thank Anna Labella, Andrzej Skowron, Maurizio Proietti, Valerio Senni, Sophie Renault, Fabio Fioravanti, and Emanuele De Angelis. A very special thanks goes to Maurizio, who for many years has been ‘so devoted to me, so patient, so zealous’, as John Henry Newman said of his friend Ambrose St. John [43, page 190]. Maurizio has been for me an invaluable source of inspiration and strength.

Many thanks to Andrei Nemytykh for inviting me to write this paper and for his comments, and to Maurizio Proietti and Laurent Fribourg for their friendly help and encouragement.
References

[1] K. R. Apt & M. Bezem (1991): Acyclic Programs. New Generation Computing 9, pp. 335–363.
[2] H. P. Barendregt (1984): The Lambda Calculus, its Syntax and Semantics. North-Holland, Amsterdam.
[3] C. Batini & A. Pettorossi (1975): On Subrecursiveness in Weak Combinatory Logic. In: Proceedings of the Symposium λ-Calculus and Computer Science Theory, Lecture Notes in Computer Science 37, Springer-Verlag, pp. 297–311.
[4] F. L. Bauer & H. Wössner (1982): Algorithmic Language and Program Development. Springer-Verlag.
[5] R. S. Bird (1984): The Promotion and Accumulation Strategies in Transformational Programming. ACM TOPLAS.
[6] R. S. Bird (1984): Using Circular Programs to Eliminate Multiple Traversal of Data. Acta Informatica 21, pp. 239–250.
[7] R. S. Boyer & J. S. Moore (1975): Proving Theorems About Lisp Functions. Journal of the ACM.
[8] R. Burstall (1972): An Algebraic Description of Programs with Assertions, Verification and Simulation. In: Proc. ACM Conference on Proving Assertions about Programs, ACM, New York, NY, USA, pp. 7–14.
[9] R. M. Burstall (1977): Design Considerations for a Functional Programming Language. In: Proc. Infotech State of the Art Conference “The Software Revolution”, Copenhagen, Denmark, pp. 47–57.
[10] R. M. Burstall & J. Darlington (1977): A Transformation System for Developing Recursive Programs. Journal of the ACM.
[11] R. M. Burstall, D. B. MacQueen & G. H. Sannella (1980): Hope: An Experimental Applicative Language. In: Conference Record of the 1980 LISP Conference, Stanford University, Stanford, Ca, USA, pp. 136–143.
[12] K. L. Clark (1978): Negation as Failure. In H. Gallaire & J. Minker, editors: Logic and Data Bases, Plenum Press, New York, pp. 293–322.
[13] W. F. Clocksin & C. S. Mellish (1984): Programming in Prolog, Second edition. Springer-Verlag, New York.
[14] H. B. Curry & R. Feys (1974): Combinatory Logic. North-Holland.
[15] J. Darlington (1978): A Synthesis of Several Sorting Algorithms. Acta Informatica 11, pp. 1–30.
[16] J. Darlington (1981): An Experimental Program Transformation System. Artificial Intelligence.
[17] E. De Angelis, F. Fioravanti, M. C. Meo, A. Pettorossi & M. Proietti (2017): Verification of Time-Aware Business Processes using Constrained Horn Clauses. In: Proceedings of the 26th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2016), Lecture Notes in Computer Science 10184, Springer, pp. 38–55.
[18] E. De Angelis, F. Fioravanti, A. Pettorossi & M. Proietti (2014): VeriMAP: A Tool for Verifying Programs through Transformations. In: Proc. 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS ’14, Lecture Notes in Computer Science 8413, Springer, pp. 568–574.
[19] E. De Angelis, F. Fioravanti, A. Pettorossi & M. Proietti (2017): Semantics-based generation of verification conditions via program specialization. Science of Computer Programming. Selected and Extended papers from the Int. Symp. on Principles and Practice of Declarative Programming 2015.
[20] N. Dershowitz (1987): Termination of rewriting. Journal of Symbolic Computation.
[21] E. W. Dijkstra (1971): A Short Introduction to the Art of Programming. Technical Report, EWD 316.
[22] E. W. Dijkstra (1982): Selected Writings on Computing: A Personal Perspective. Springer-Verlag, New York, Heidelberg, Berlin.
[23] M. C. Er (1983): An iterative solution to the generalized Towers of Hanoi problems. BIT 23, pp. 295–302.
[24] P. Flajolet & J.-M. Steyaert (1974):
On Sets Having Only Hard Subsets. In J. Loeckx, editor: Lecture Notes in Computer Science 14, Springer, pp. 446–457.
[25] D. Gelernter & N. Carriero (1992): Coordination Languages and their Significance. Communications of the ACM.
[26] P. J. Hayes (1977): A note on the Towers of Hanoi problem. The Computer Journal.
[27] J. R. Hindley, B. Lercher & J. P. Seldin (1975): Introduzione alla Logica Combinatoria. Serie di Logica Matematica, Boringhieri. (In Italian).
[28] J. R. Hindley & J. P. Seldin (1986): Introduction to Combinators and λ-Calculus. London Mathematical Society, Cambridge University Press.
[29] C. A. R. Hoare (1978): Communicating Sequential Processes. CACM.
[30] J. E. Hopcroft & J. D. Ullman (1979): Introduction to Automata Theory, Languages and Computation. Addison-Wesley.
[31] R. J. M. Hughes (1986): A novel representation of lists and its application to the function “reverse”. Information Processing Letters 22, pp. 141–144.
[32] V. E. Itkin & Z. Zwienogrodsky (1971): On equivalence of program schemata. Journal of Computer and System Sciences.
[33] Observers, experiments, and agents: A comprehensive approach to parallelism. In I. Guessarian, editor: Semantics of Systems of Concurrent Processes, LITP Spring School, Lecture Notes in Computer Science 469, Springer-Verlag, pp. 375–406.
[34] G. M. Kelly (1982): Basic Concepts of Enriched Category Theory. Cambridge University Press, Cambridge.
[35] R. A. Kowalski (1979): Logic for Problem Solving. North-Holland.
[36] Categorical Models for Handshaking Communications. Fundamenta Informaticae, Series IV, VIII(3-4), pp. 322–357.
[37] S. S. Lavrov (1961): Economy of memory in closed operator schemes. U.S.S.R. Computat. Math. and Math. Physics, pp. 810–828.
[38] J. W. Lloyd (1987): Foundations of Logic Programming. Springer-Verlag, Berlin. Second Edition.
[39] V. V. Martynuk (1965): On the analysis of control-flow graphs for a program scheme. Journ. Comp. Math. and Math. Phys.
[40] E. Mendelson: Introduction to Mathematical Logic. Wadsworth & Brooks/Cole Advanced Books & Software, Monterey, California, USA. Third Edition.
[41] J. Miller & S. Brown (1966): An algorithm for evaluation of remote terms in a linear recurrence sequence. The Computer Journal 9, pp. 188–190.
[42] R. Milner (1989): Communication and Concurrency. Prentice Hall.
[43] J. H. Newman (2001): Apologia Pro Vita Sua. Maisie Ward (ed.), Sheed and Ward, London.
[44] M. S. Paterson & C. E. Hewitt (1970): Comparative Schematology. In: Conference on Concurrent Systems and Parallel Computation, Project MAC, Woods Hole, Mass., USA, pp. 119–127. Available at: https://dl.acm.org/doi/pdf/10.1145/1344551.1344563.
[45] A. Pettorossi (1971): Ottimizzazione di un Collegamento per Trasmissione di Dati Mediante Simulazione Numerica. Laurea Thesis (in Italian). University of Rome, Italy. Available on request to the author.
[46] A. Pettorossi (1972): Automatic Derivation of Control Flow Graphs of Fortran Programs. Master Thesis (in Italian). Original title: “Generazione Automatica del Grafo di Flusso del Controllo per un Programma di Calcolo Scritto in Fortran”. University of Rome, Italy. Available on request to the author.
[47] A. Pettorossi (1978): Improving memory utilization in transforming programs. In: Proc. Mathematical Foundations of Computer Science 1978, Zakopane (Poland), Lecture Notes in Computer Science 64, Springer-Verlag, pp. 416–425.
[48] A. Pettorossi (1979): On the Definition of Hierarchies of Infinite Sequential Computations. In Lothar Budach, editor: Fundamentals of Computation Theory, FCT ’79, Akademie-Verlag, Berlin, pp. 335–341. Available on request to the author.
[49] A. Pettorossi (1980): Synthesis of Subtree Rewriting Systems Behaviour by Solving Equations. In: Proc. 5ème Colloque de Lille (France) on “Les Arbres en Algèbre et en Programmation”, U.E.R. I.E.E.A. BP 36, Université de Lille I, 59655 Villeneuve d’Ascq Cedex, France, pp. 63–74. Available on request to the author.
[50] A. Pettorossi (1981): Comparing and Putting Together Recursive Path Orderings, Simplification Orderings, and Non-Ascending Property for Termination Proofs of Term Rewriting Systems. In: Proc. ICALP 1981, Haifa (Israel), Lecture Notes in Computer Science 115, Springer-Verlag, pp. 432–447.
[51] A. Pettorossi (1984):
Methodologies for Transformations and Memoing in Applicative Languages. Ph.D. thesis, Edinburgh University, Edinburgh, Scotland. Available at: https://era.ed.ac.uk/handle/1842/15643.
[52] A. Pettorossi (1984): A Powerful Strategy for Deriving Efficient Programs by Transformation. In: ACM Symposium on Lisp and Functional Programming, ACM Press, pp. 273–281.
[53] A. Pettorossi (1985): Towers of Hanoi Problems: Deriving Iterative Solutions by Program Transformation. BIT 25, pp. 327–334.
[54] A. Pettorossi (1987): Derivation of Efficient Programs For Computing Sequences of Actions. Theoretical Computer Science 53, pp. 151–167.
[55] A. Pettorossi & R. M. Burstall (1982): Deriving Very Efficient Algorithms for Evaluating Linear Recurrence Relations Using the Program Transformation Technique. Acta Informatica 18, pp. 181–206.
[56] A. Pettorossi & M. Proietti (2000): Transformation Rules for Logic Programs with Goals as Arguments. In A. Bossi, editor: Proceedings of the 9th International Workshop on Logic-based Program Synthesis and Transformation (LOPSTR ’99), Venezia, Italy, Lecture Notes in Computer Science 1817, Springer-Verlag, Berlin, pp. 177–196.
[57] A. Pettorossi & M. Proietti (2002): The List Introduction Strategy for the Derivation of Logic Programs. Formal Aspects of Computing.
[58] A. Pettorossi & M. Proietti (2002): Program Derivation = Rules + Strategies. In A. Kakas & F. Sadri, editors: Computational Logic: Logic Programming and Beyond (Essays in honour of Bob Kowalski, Part I), Lecture Notes in Computer Science 2407, Springer-Verlag, pp. 273–309.
[59] A. Pettorossi & A. Skowron (1983): Complete Modal Theories for Verifying Communicating Agents Behaviour in Recursive Equations Programs. Internal Report CSR-128-83, University of Edinburgh, Edinburgh, Scotland. Available on request to the authors.
[60] A. Pettorossi & A. Skowron (1985): A methodology for improving parallel programs by adding communications. In: Computation Theory, SCT 1984, Lecture Notes in Computer Science 280, Springer-Verlag, Berlin, pp. 228–250.
[61] A. Pettorossi & A. Skowron (1985): A System for Developing Distributed Communicating Programs. In M. Feilmeier, G. Joubert & U. Schendel, editors: International Conference ‘Parallel Computing 85’, North-Holland, pp. 241–246. Available on request to the authors.
[62] A. Pettorossi & A. Skowron (1989): The Lambda Abstraction Strategy for Program Derivation. Fundamenta Informaticae XII(4), pp. 541–561. Available on request to the authors.
[63] Alberto Pettorossi & Maurizio Proietti (2004): Transformations of Logic Programs with Goals as Arguments. Theory and Practice of Logic Programming.
[64] M. Proietti & A. Pettorossi (1990): Synthesis of Eureka Predicates for Developing Logic Programs. In N. D. Jones, editor: Third European Symposium on Programming, ESOP ’90, Lecture Notes in Computer Science 432, Springer-Verlag, pp. 306–325.
[65] M. Proietti & A. Pettorossi (1991): Unfolding-Definition-Folding, in this Order, for Avoiding Unnecessary Variables in Logic Programs. In J. Małuszyński & M. Wirsing, editors: Third International Symposium on Programming Language Implementation and Logic Programming, PLILP ’91, Lecture Notes in Computer Science 528, Springer-Verlag, pp. 347–358.
[66] H. Rogers (1967): Theory of Recursive Functions and Effective Computability. McGraw-Hill.
[67] L. S. Sterling & E. Shapiro (1994): The Art of Prolog. The MIT Press, Cambridge, Massachusetts. Second Edition.
[68] H. Tamaki & T. Sato (1984): Unfold/Fold Transformation of Logic Programs. In S.-Å. Tärnlund, editor: Proceedings of the Second International Conference on Logic Programming, ICLP ’84, Uppsala University, Uppsala, Sweden, pp. 127–138.
[69] S. A. Walker & H. R. Strong (1973): Characterization of Flowchartable Recursions. Journal of Computer and System Sciences.
[70] D. H. D. Warren (1977): Implementing Prolog – Compiling Predicate Logic Programs. Research Report 39 & 40, Department of Artificial Intelligence, University of Edinburgh, Scotland.
[71] D. H. D. Warren (1983): An Abstract Prolog Instruction Set. Technical Report 309, SRI International.
[72] D. H. D. Warren, L. M. Pereira & F. Pereira (1977): Prolog – the language and its implementation compared with Lisp. SIGART Newsletter 64, pp. 109–115.
[73] G. Winskel (1984): Synchronization Trees. Theoretical Computer Science.
[74] Y. I. Yanov (1960):