An Improved Algorithm for Generating Database Transactions from Relational Algebra Specifications
I. Mackie and A. Martins Moreira (Eds.): Tenth International Workshop on Rule-Based Programming (RULE 2009), EPTCS 21, 2010, pp. 77–89, doi:10.4204/EPTCS.21.7
Daniel J. Dougherty
Worcester Polytechnic Institute, Worcester, MA, USA, 01609
Alloy is a lightweight modeling formalism based on relational algebra. In prior work with Fisler, Giannakopoulos, Krishnamurthi, and Yoo, we have presented a tool, Alchemy, that compiles Alloy specifications into implementations that execute against persistent databases. The foundation of Alchemy is an algorithm for rewriting relational algebra formulas into code for database transactions. In this paper we report on recent progress in improving the robustness and efficiency of this transformation.
Alloy [5] is a popular modeling language that implements the lightweight formal methods philosophy [6]. Its expressive power is that of first-order logic extended with transitive closure, and its syntax, based on relational algebra, is strongly influenced by object modeling notations. The language is accompanied by the Alloy Analyzer: the analyzer builds models (or “instances”) for a specification using SAT-solving techniques. Users can employ a graphical browser to explore instances and counter-examples to claims.

Having written an Alloy specification, the user must then write the corresponding code by hand; consequently there are no formal guarantees that the resulting code has any relationship to the specification. The Alchemy project addresses this issue. Alchemy is a tool under active development [7, 4] at Worcester Polytechnic Institute and Brown University, by Kathi Fisler, Shriram Krishnamurthi, and the author, with our students Theo Giannakopoulos and Daniel Yoo, that compiles Alloy specifications into libraries of database operations. This is not a straightforward enterprise since, in contrast to Z [8] and B [1], where a notion of state machine is built into the language, Alloy does not have a native machine model.

Alchemy opens up a new way of working with Alloy specifications: as declarative notations for imperative programs. In this way Alloy models support a novel kind of rule-based programming, in which underspecification is a central aspect of program design.

In this note we report on recent progress in improving the process of generating imperative code for declarative specifications in a language like Alloy. This paper is a companion to [4], which developed a better semantic foundation for interpreting Alloy predicates as operations. With this better foundation we are able to generate code for a wider class of predicates than that treated in [7] and also prove a more robust correctness theorem relating the imperative code to the original specification.
Some of the material in this expository section is taken from [7].
An excellent introduction to Alloy is Daniel Jackson’s book [5]. Here we start with an informal introduction to Alloy syntax and semantics via an example. The example is a homework submission and grading system, shown in Figure 1. In this system, students may submit work in pairs. The gradebook stores the grade for each student on each submission. Students may be added to or deleted from the system at any time, as they enroll in or drop the course.

The system’s data model centers around a course, which has three fields: a roster (set of students), submitted work (relation from enrolled students to submissions), and a gradebook. Alloy uses signatures to capture the sets and relations that comprise a data model. Each sig (Submission, etc.) defines a unary relation. The elements of these relations are called atoms; the type of each atom is its containing relation. Fields of signatures define additional relations. The sig for Course, for example, declares roster to be a relation on Course × Student. Similarly, the relation work is of type Course × Student × Submission, but with the projection on Course and Student restricted to pairs in the roster relation. The lone annotation on gradebook allows at most one grade per submission.

The predicates (Enroll, etc.) capture the actions supported in the system. The predicates follow a standard Alloy idiom for stateful operations: each has parameters for the pre- and post-states of the operation (c and c', respectively), with the intended interpretation that the latter reflects a change applied to the former. Alloy facts (such as SameGradeForPair) capture invariants on the models. This particular fact states that students who submit joint work get the same grade.

An important aspect of Alloy is that everything is a relation. In particular sets are viewed as unary relations, and individual atoms are viewed as singleton unary relations. As a consequence the in operator does double duty: it is interpreted formally as subset, but also stands in for the “element-of” relation, in the sense that if, intuitively, a is an atom that is an element of a set r, this is expressed in Alloy as a in r, since a is formally a (singleton) set.

The Alloy semantics defines a set of models for the signatures and facts. Operators over sets and relations have their usual semantics: ∪ (union), ∩ (intersection), ⟨ , ⟩ (tupling), and . (join). As noted above, in denotes subset and is also used to encode membership. Square brackets provide a convenient syntactic sugar for certain joins: $e_1[e_2]$ is equivalent to $e_2 . e_1$. The following relations constitute a model under the Alloy semantics.
    Student = {Harry, Meg}
    Submission = {hwk1}
    Grade = {A, A−, B+, B}
    Course = {c0, c1}
    roster = {⟨c0, Harry⟩, ⟨c1, Harry⟩, ⟨c1, Meg⟩}
    work = {⟨c1, Harry, hwk1⟩}
    gradebook = {⟨c1, Harry, hwk1, A−⟩}

A model of a predicate also associates each predicate parameter with an atom in the model such that the predicate body holds. The above set of relations models the Enroll predicate under bindings c = c0, c' = c1 and sNew = Meg. A model may include tuples beyond those required to satisfy a predicate: the Enroll predicate does not constrain the work relation for pre-existing students, so the appearance of tuple ⟨c1, Harry, hwk1⟩ in the work relation is semantically acceptable.

(For consistency with the presentation and analysis of the algorithms below, we use standard mathematical notation in two places where Alloy uses ASCII notation: ∪ is “+” in Alloy, ∩ is “&”.)

    sig Submission {}
    sig Grade {}
    sig Student {}
    sig Course {
      roster : set Student,
      work : roster → Submission,
      gradebook : work → lone Grade
    }

    pred Enroll (c, c' : Course, sNew : Student) {
      c'.roster = c.roster ∪ sNew and
      c'.work[sNew] = ∅
    }

    pred Drop (c, c' : Course, s : Student) {
      s not in c'.roster
    }

    pred SubmitForPair (c, c' : Course, s1, s2 : Student, bNew : Submission) {
      // pre-condition
      s1 in c.roster and s2 in c.roster and
      // update
      c'.work = c.work ∪ <s1, bNew> ∪ <s2, bNew> and
      // frame condition
      c'.gradebook = c.gradebook
    }

    pred AssignGrade (c, c' : Course, s : Student, b : Submission, g : Grade) {
      c'.gradebook in c.gradebook ∪ <s, b, g> and
      c'.roster = c.roster
    }

    fact SameGradeForPair {
      all c : Course, s1, s2 : Student, b : Submission |
        b in (c.work[s1] & c.work[s2]) implies
          c.gradebook[s1][b] = c.gradebook[s2][b]
    }

Figure 1: Alloy specification of a gradebook.

The reader may want to check that the relations shown do not happen to model the predicate SubmitForPair, in the sense that no bindings for c, c', s1, s2, and bNew make the body of SubmitForPair true. Under c = c0 and c' = c1, for example, the requirement c'.gradebook = c.gradebook fails because the gradebook starting from c' has one tuple while that starting from c has none. The requirement on work also fails. Similar inconsistencies contradict other possible bindings for c and c'.

We illustrate Alchemy in the context of the gradebook specification from Figure 1. Alchemy creates a database table for each relation (e.g., Submission, roster), a procedure for each predicate (e.g., Enroll), and a function for creating new elements of each atomic signature (e.g., CreateSubmission). A sample session using Alchemy might proceed as follows. We create a course with two students using the following command sequence:

    cs311 = CreateCourse("cs311");
    pete = CreateStudent("Pete");
    caitlin = CreateStudent("Caitlin");
    Enroll(cs311, pete);
    Enroll(cs311, caitlin)

Note that the Enroll function takes only one course argument, in contrast to the two in the original Alloy predicate, since the implementation maintains only a single set of tables over time (the second course parameter in the predicate corresponds to the resulting updated table). Executing the Enroll function adds the pairs ⟨"cs311", "Pete"⟩ and ⟨"cs311", "Caitlin"⟩ to the roster table. The second clause of the Enroll specification guarantees that the work table will not have entries for either student.

Next, we submit a new homework for "Pete" and "Caitlin":

    hwk1 = CreateSubmission("hwk1");
    SubmitForPair(cs311, pete, caitlin, hwk1)

The implementation of SubmitForPair is straightforward relative to the specification. It treats the first clause in the specification as a pre-condition by terminating the computation with an error if the clause is false in the database at the start of the function execution. Next, it adds the work tuples required in the second (update) clause. It ensures that the gradebook table is unchanged, as required by the third clause.

Assigning a grade illustrates the way that Alloy facts constrain Alchemy’s updates:

    gradeA = CreateGrade("A");
    AssignGrade(cs311, pete, hwk1, gradeA)

AssignGrade inserts a tuple into the gradebook relation according to the first clause, and checks that the roster is unchanged according to the second. If execution were to stop here, however, the resulting tables would contradict the SameGradeForPair invariant (which requires "Caitlin" to receive the same grade on the joint assignment). Alchemy determines that adding the tuple ⟨cs311, Caitlin, hwk1, A⟩ to gradebook will satisfy both the predicate body and the SameGradeForPair fact, and executes this command automatically. If there is no way to update the database to respect both the predicate and the fact, Alchemy will raise an exception. This could happen, for example, if the first clause in AssignGrade used = instead of in: in this case, adding the repairing tuple would violate the predicate body.

Maintaining invariants
Alloy’s use of facts to constrain possibly-underspecified predicates offers a powerful lightweight modeling tool. The facts in an Alloy specification are axioms in the sense that they hold in any instance for the specification. We may view the facts as integrity constraints: they capture the fundamental invariants to be maintained across all transactions. Alchemy will guarantee preservation of all facts as database invariants. This is akin to the notion of repair of database transactions.
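The repair step in the sample session can be illustrated with a small sketch. It is not Alchemy's generated code: the in-memory tables (Python sets of tuples) and the helper names `grades`, `violations`, and `repair` are assumptions made purely for illustration. Starting from the tables as they stand right after AssignGrade has inserted Pete's grade, the sketch computes the tuples whose absence violates SameGradeForPair and adds them, failing if a student would end up with two grades for one submission.

```python
# Hypothetical in-memory tables; Alchemy itself targets a persistent database.
work = {("cs311", "Pete", "hwk1"), ("cs311", "Caitlin", "hwk1")}
gradebook = {("cs311", "Pete", "hwk1", "A")}  # state right after AssignGrade


def grades(gradebook, course, student, submission):
    """Grades recorded for one student on one submission."""
    return {g for (c, s, b, g) in gradebook
            if (c, s, b) == (course, student, submission)}


def violations(work, gradebook):
    """Tuples missing from gradebook according to SameGradeForPair.

    For every pair of students sharing a submission, each grade held by
    one of them must also be held by the other.
    """
    missing = set()
    for (c1, s1, b1) in work:
        for (c2, s2, b2) in work:
            if (c1, b1) == (c2, b2) and s1 != s2:
                extra = grades(gradebook, c1, s1, b1) - grades(gradebook, c1, s2, b1)
                for g in extra:
                    missing.add((c1, s2, b1, g))
    return missing


def repair(work, gradebook):
    """Insert the missing tuples; fail if a student would get two grades."""
    repaired = gradebook | violations(work, gradebook)
    for (c, s, b, _) in repaired:
        if len(grades(repaired, c, s, b)) > 1:  # the 'lone Grade' constraint
            raise ValueError("no consistent repair")
    return repaired


repaired = repair(work, gradebook)
```

In this sketch the only missing tuple is Caitlin's grade for hwk1, which matches the repair described in the session above.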
Alloy specifications
Formally, the Alloy specifications we treat in this paper are tuples of signatures, predicates, and facts. In practice Alloy specifications may also include assertions to be checked by the analyzer, but they do not play a direct role in Alchemy’s code generation so we omit them here.

• A signature specifies its type name and a set of fields. Each field has a name and a type specification $A_1 \to A_2 \to \dots \to A_n$, where each $A_i$ is the type name of some signature.
• A predicate has a header and a body. The header declares a set of variable names, each with an associated signature type name; the body is a formula in which the only free variables are defined in the header.
• A fact is a closed formula, having the force of an axiom: models of a specification are required to satisfy these facts. Alloy permits the user to specify certain constraints on the signatures and fields when they are declared, such as “relation r may have at most one tuple.” These can be alternatively expressed as facts and, for simplicity of presentation, we assume this is always done.

The following language for expressions and formulas is essentially equivalent to the Kernel language of Alloy [5] (modulo the lexical differences between standard mathematical notation used here and Alloy’s ASCII).

    expr         ::= rel | var | none | expr binop expr | unop expr
    binop        ::= ∪ | ∩ | − | . | ⟨ , ⟩
    unop         ::= ∼ | ∗
    formula      ::= elemFormula | compFormula | quantFormula
    elemFormula  ::= expr in expr | expr = expr
    compFormula  ::= not formula | formula ∧ formula | formula ∨ formula
    quantFormula ::= ∀ var: expr {formula} | ∃ var: expr {formula}

State-based specifications
The elements of an Alloy specification suggest natural implementation counterparts. The signatures lay out relations that translate directly into persistent database schemas. The facts (those properties that are meant to hold of all models constructed by Alloy) function as database integrity constraints. Finally, under a common idiom, certain predicates in an Alloy specification connote state changes. It is these state-based specifications that Alchemy (currently) treats.

The state-transition idiom is a commonly understood convention rather than a formal notion in Alloy. To precisely define the class of specifications that Alchemy treats, we first require some terminology. Fix a distinguished signature, which we will call State. An immutable type is one with no occurrences of the State signature.

The assumptions Alchemy makes about the specifications it treats are:
• specifications are state-based, and
• facts have at most one variable of type State, and this variable is unprimed and universally quantified.
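To make the relational reading of the expression language concrete, the following minimal sketch evaluates two formulas from Figure 1 over the example instance given earlier. The encoding is an assumption made for illustration (relations as Python sets of tuples, atoms as singleton unary relations), as are the helper names `join`, `box`, `atom`, `enroll_holds`, and `frame_holds`; none of it is Alchemy's implementation.

```python
def join(r1, r2):
    """Relational join (the '.' operator): match the last column of r1
    against the first column of r2 and drop the matched column."""
    return {t1[:-1] + t2[1:] for t1 in r1 for t2 in r2 if t1[-1] == t2[0]}


def box(e1, e2):
    """Box join: e1[e2] is sugar for e2.e1."""
    return join(e2, e1)


def atom(a):
    """An atom is viewed as a singleton unary relation."""
    return {(a,)}


# The example instance from the text (only the fields of Course are needed).
roster = {("c0", "Harry"), ("c1", "Harry"), ("c1", "Meg")}
work = {("c1", "Harry", "hwk1")}
gradebook = {("c1", "Harry", "hwk1", "A-")}


def enroll_holds(c, c_post, s_new):
    """Body of Enroll: c'.roster = c.roster ∪ sNew and c'.work[sNew] = ∅."""
    roster_ok = join(atom(c_post), roster) == join(atom(c), roster) | atom(s_new)
    work_ok = box(join(atom(c_post), work), atom(s_new)) == set()
    return roster_ok and work_ok


def frame_holds(c, c_post):
    """Frame condition of SubmitForPair: c'.gradebook = c.gradebook."""
    return join(atom(c_post), gradebook) == join(atom(c), gradebook)
```

Under the bindings c = c0, c' = c1, sNew = Meg, `enroll_holds` is true, while `frame_holds("c0", "c1")` is false because the gradebook reached from c1 has one tuple and the one reached from c0 has none.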
An operational semantics
The static semantics of Alloy is based on the class of relational algebras. To give an operational semantics for state-based Alloy specifications, one that takes seriously the reading of predicates as state-transformers, we pass to the class of transition systems whose nodes are relational algebras. We also assume that each state has a single atom of type State. When individual relational algebras are read as database instances, transitions between states can be viewed as database update sequences transforming one state to another. We adopt a constant-domain assumption concerning our transition systems. Space considerations prohibit us from presenting the motivation and justification for this (including the explanation why it is not as great a restriction as it may appear); details are in [4].

Since predicates have parameters, the meaning of a predicate is relative to bindings from variables to values. It is technically convenient to assume that for a given specification we identify, for each type, a universe of possible values of this type. Then an environment h is a mapping from typed variables to values.

Definition 1 (Operational semantics of predicates). Let p be a predicate with the property that p has among its parameters exactly two variables s and s′ of type State, and let h be an environment. The meaning $\llbracket p \rrbracket h$ of p under h is the set of pairs ⟨I, I′⟩ of instances such that
• h maps the parameters of p into the set of atoms of I (which equals the set of atoms of I′), mapping the unprimed State parameter to the State-atom of I and the primed State parameter to the State-atom of I′;
• (I, I′) makes the body of p true under the environment h: occurrences of the State variable s are interpreted in I, while occurrences of the State variable s′ are interpreted in I′.

The meaning of a predicate p is a set of transitions because p can be applied to different nodes, with different bindings of the parameters of course, but also (and more interestingly) because predicates typically under-specify actions: different implementations of a predicate can yield different outcomes I′ on the same input I. Any of these should be considered acceptable as long as the relation between pre- and post-states is described by the predicate.

We observed that a predicate p determines a family of binary relations over instances, parametrized by environments. That is, for a given environment h:

$$\llbracket p \rrbracket h : \mathit{Inst} \to \mathit{Inst} \qquad (1)$$

Now suppose t is a procedure defining a database transaction (so t is the sort of procedure that a predicate p specifies). Given an instance I and an environment h, t may return a new instance I′, terminate with failure, or diverge. None of the procedures we describe in this paper will diverge, so we are considering procedures t that (under an environment) determine a function over instances:

$$\llbracket t \rrbracket h : \mathit{Inst} \to (\mathit{Inst} + \mathit{fail}) \qquad (2)$$

Alchemy’s job is precisely the following: given predicate p, construct a procedure t = code(p) such that the semantics of code(p) as given in (2) refines the semantics of p as given in (1), in the following sense.

Theorem 2 (Main theorem). Let p be a predicate and let code(p) be any backtracking implementation of the algorithm $A_p$, given in Definition 5 below. Then for each instance I and each environment h:
1. $\llbracket \mathit{code}(p) \rrbracket h$ terminates on I;
2. If there exists any instance I′ such that (I, I′) satisfies p under h then the result of $\llbracket \mathit{code}(p) \rrbracket h$ is such an I′. In particular in this situation $\llbracket \mathit{code}(p) \rrbracket$ does not return “failure” under h on I.

Proof. The proof is given in Section 4.4.

It is worth noting that the task of generating updates from a specification admits an uninteresting trivial solution, particularly if we are willing to tolerate partial functions. Given predicate p we could define code(p) by: on input I, exhaustively generate all possible I′; for each one test whether (I, I′) is in $\llbracket p \rrbracket$. If and when such an I′ is found, replace I by I′. Obviously this is a silly algorithm, even though it is “correct” in a formal sense. Our goal with Alchemy is to write code that is intuitively reasonable, and still is correct in the sense of Theorem 2.
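The trivial generate-and-test solution just described can be sketched as follows, to show why it is formally correct yet unreasonable. The setting is a deliberately tiny assumption: a single mutable relation (a roster) over a fixed finite domain, with the predicate given as a Python function on (pre, post) pairs. The names `all_instances`, `naive_code`, and `drop` are hypothetical.

```python
from itertools import chain, combinations, product


def all_instances(courses, students):
    """Every possible roster relation over the fixed (constant) domain."""
    pairs = list(product(courses, students))
    subsets = chain.from_iterable(
        combinations(pairs, k) for k in range(len(pairs) + 1))
    return (frozenset(s) for s in subsets)


def naive_code(pred, pre, courses, students):
    """Generate-and-test: return the first post-state satisfying pred,
    or None (failure) when no post-state exists."""
    for post in all_instances(courses, students):
        if pred(pre, post):
            return post
    return None


def drop(course, student):
    """The Drop predicate: the student is not in the post-state roster."""
    return lambda pre, post: (course, student) not in post


pre = frozenset({("c1", "Harry"), ("c1", "Meg")})
post = naive_code(drop("c1", "Harry"), pre, ["c1"], ["Harry", "Meg"])
```

Because subsets are enumerated smallest first, the search here returns the empty roster: it satisfies Drop, but it also removes Meg, which is exactly the kind of unintended outcome a more principled update strategy should avoid.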
Suppose we are given an Alloy predicate p. Alchemy generates code for a procedure with parameters corresponding to those of p (without the primed parameter).

As observed above, a crucial aspect of Alloy is that it encourages “lightweight” specifications of procedures: the designer is free to ignore details about the computation that she may consider inessential. As a consequence, Alchemy must be extremely flexible: different input instances may require quite different computations in order to satisfy a specification, yet Alchemy must generate code that works uniformly across all instances.

The top-level view of how Alchemy generates code for a procedure is as follows.
• In Definition 5 below we present a construction that, based on predicate p, builds a non-deterministic procedure $A_p$.
• The code generated by Alchemy, code(p), is a backtracking implementation of $A_p$. Computation paths that do not succeed are recognized as such and abandoned, and $A_p$ is finite-branching, so code(p) will always terminate.
• If there exists any instance I′ such that (I, I′) satisfies p under h then some branch of $A_p$ is guaranteed to compute some such instance.

Coping with inconsistent predicates
It is possible for the code for a predicate p to fail on a given database instance I, either because the predicate is internally inconsistent or because no update of I can implement p without violating the facts. Alchemy is guaranteed to detect such situations; we treat predicates as transactions that roll back if they cannot be executed without violating their bodies or a fact.

The general form of an Alloy predicate that specifies an operation and that Alchemy treats is

$$\mathtt{pred}\ p(s, s' : \mathit{State},\ a_1 : A_1, \dots, a_n : A_n)\ \{\ \vec{Q}x \,.\, b(\vec{a}, \vec{x})\ \}$$

where $\vec{Q}$ is a sequence of quantified atoms and b is a quantifier-free formula of relational algebra. Before giving an imperative interpretation of a predicate it is useful to massage it into a convenient form.

Skolemization
By the classical technique of Skolemization any formula $\vec{Q}x \,.\, b(\vec{a}, \vec{x})$ can be converted into a universal formula which is satisfiable if and only if $\vec{Q}x \,.\, b(\vec{a}, \vec{x})$ is satisfiable. We exploit this trick in Alchemy as follows. Given a predicate p we convert it to a predicate $p^\forall$ whose body is in universal form; this involves expanding the specification language to include the appropriate Skolem functions. Suppose we generate code for $p^\forall$ (over the expanded language). Then given an original instance I we may view it as an instance $I^+$ over the enlarged schema, and apply the generated code to obtain an instance $I'^+$. We ultimately return the instance I′ that is the reduct of $I'^+$ to the original schema. So in what follows we restrict attention to predicates whose body is a universal formula.

Incorporating the facts
Intuitively the facts in a specification comprise a separate set of constraints on how a predicate may build new instances from old ones. But by the following simple trick we can avoid treating the facts separately. When compiling a predicate to code we take each fact, prime every occurrence of the State sig, and add the fact to the body of the predicate. The use of primed State names means that the fact acts as a post-condition on the predicate. (Strictly speaking this is only true under an assumption of “state-boundedness” on the form of the facts, defined in [4]. The specifics of this syntactic assumption are irrelevant to the current paper so we omit details.) This in turn guarantees that any post-instance defined by the predicate will satisfy the facts.

The following is a convenient form for formulas.

Definition 3 (Special formulas). A special formula is a formula in either of the two forms

$$(e_1 \cap \dots \cap e_k) = \emptyset \qquad \text{or} \qquad (e_1 \cap \dots \cap e_k) \neq \emptyset$$

for $k \geq 1$, with each $e_i$ not containing ∪ or ∅ and with converse applied only to variables and relation names.

Lemma 4. Any quantifier-free formula can be transformed into an equivalent Boolean combination of special formulas.

Proof. It is easy to see that every expression is equivalent to one in which the converse operator ∼ applies only to relation names or variables. It is easy to see that every expression other than ∅ itself is equivalent to one in which the constant ∅ never appears. Because union distributes over the other connectives every expression is equivalent to one of the form $e_1 \cup \dots \cup e_n$ ($n \geq 1$) with each $e_i$ being ∪-free.

We may take any equation e = f and replace it with (e in f) ∧ (f in e). We do this as long as neither e nor f is the term ∅.

Now each basic formula is in one of the forms

$$(d_1 \cup \dots \cup d_m)\ \mathit{in}\ (f_1 \cup \dots \cup f_n) \qquad \text{or} \qquad (d_1 \cup \dots \cup d_m)\ \mathit{not\ in}\ (f_1 \cup \dots \cup f_n)$$

with $n, m \geq 0$, where the $d_i$ and the $f_i$ are ∪-free. We may transform the basic formulas above into the corresponding forms

$$(d_1 \cup \dots \cup d_m) - (f_1 \cup \dots \cup f_n) = \emptyset, \quad \text{respectively,} \quad (d_1 \cup \dots \cup d_m) - (f_1 \cup \dots \cup f_n) \neq \emptyset \qquad (3)$$

The first equation in (3) is equivalent, via distributivity of ∪ over ∩, to the conjunction of the equations

$$d_i - (f_1 \cup \dots \cup f_n) = \emptyset, \qquad 1 \leq i \leq m$$

In turn, each of these is equivalent to the special formula

$$(d_i - f_1) \cap \dots \cap (d_i - f_n) = \emptyset$$

Similar reasoning shows that each dis-equation as in (3) is equivalent to a disjunction of special formulas

$$(\dots((d_i - f_1) - f_2) - \dots - f_n) \neq \emptyset$$

Bridging the declarative/imperative gap
The main procedure $A_p$ below is generated by an induction that walks the structure of the formula that is the body of p. There is a natural correspondence between the logical operators in the predicate and control-flow operators in the generated procedure. The disjunctive (logical ∨ and ∃) constructors in predicates naturally suggest imperative nondeterminism; this of course results in backtracking in generated code. Likewise, conjunctive (logical ∧ and ∀) constructors lead naturally to sequencing. This is natural enough, but a difficulty arises due to the fact that the logical operators are commutative but command-sequencing certainly is not. Indeed, implementing one part of a predicate can undo the effect achieved by an earlier part. The solution is to iterate computation until a fixed point is reached on the post-state. So we must be careful to ensure that such an iteration will always halt.

Compiling special formulas to code
Consider for example the body of the Drop predicate in Figure 1. There are certainly many ways to update the data to make this true; for example we could delete all the tuples in the roster table! This is not what the specifier had in mind. But even this silly example points out the need for a principled approach to update. We start with the following goal: we attempt to make a minimal set of updates (measured by the number of tuples inserted into or deleted from tables) to the system to satisfy the predicate.

The virtue of special formulas is that they facilitate identifying minimal updates to make a formula true. For example the formula a in s′.r, which, when a is an atom, is to say that a is in the relation s′.r, is equivalent to the formula a − (s′.r) = ∅. So suppose a − (s′.r) = ∅ is part of the body of a predicate. We evaluate the expression a − (s′.r) in the pre-state and the current post-state: if the value of this expression is indeed empty then there is nothing to do. If it is not empty then a is not in s′.r, and it is clear what action to take: add a to s′.r.

More generally, when confronted with a special formula e = ∅ we may view any tuples in the current value of e as obstacles to the truth of the formula. Then the action suggested by the formula is clear: make whatever insertions or deletions we can to ensure the formula becomes true. (The presence of the difference operator means that making an expression empty may involve insertions.) The important thing to note is that we may focus exclusively on tuples that are already in the value of e in attempting to make e = ∅ in the updated state. This is our strategy for doing minimal updates for a predicate.

Inserting and deleting tuples
We have seen that compiling a special formula amounts to orchestrating the insertion or deletion of individual tuples from the relations denoted by expressions. These expressions correspond to database views, and indeed the task of inserting or deleting a tuple from a view is an instance of the well-known view update problem [2, 3]. Our code proceeds by a structural induction over the expression: see the procedures insertTuple and deleteTuple below.
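The obstacle-removal idea and the structural induction over expressions can be sketched for a small fragment of the language (relation names, ∪, ∩, and −). This is an illustration under stated assumptions, not the paper's algorithm: expressions are encoded as nested tuples, nondeterministic choice is modeled with Python generators (each yielded database is one successful branch), and the sketch omits the Updates bookkeeping that the paper relies on for termination. The names `eval_expr`, `insert_tuple`, `delete_tuple`, and `make_empty` are hypothetical.

```python
def eval_expr(e, db):
    """Value of expression e in database db (a dict: name -> set of tuples)."""
    op = e[0]
    if op == "rel":
        return db[e[1]]
    left, right = eval_expr(e[1], db), eval_expr(e[2], db)
    return {"union": left | right, "inter": left & right, "diff": left - right}[op]


def insert_tuple(t, e, db, immutable):
    """Yield every database obtained by making t a member of e."""
    op = e[0]
    if op == "rel":
        if t in db[e[1]]:
            yield db                      # nothing to do
        elif e[1] not in immutable:
            yield {**db, e[1]: db[e[1]] | {t}}
        # immutable relation lacking t: no result, so this branch fails
    elif op == "union":                   # choose one side
        yield from insert_tuple(t, e[1], db, immutable)
        yield from insert_tuple(t, e[2], db, immutable)
    elif op == "inter":                   # both sides, in sequence
        for mid in insert_tuple(t, e[1], db, immutable):
            yield from insert_tuple(t, e[2], mid, immutable)
    elif op == "diff":                    # into e1, out of e2
        for mid in insert_tuple(t, e[1], db, immutable):
            yield from delete_tuple(t, e[2], mid, immutable)


def delete_tuple(t, e, db, immutable):
    """Yield every database obtained by removing t from e."""
    op = e[0]
    if op == "rel":
        if t not in db[e[1]]:
            yield db                      # nothing to do
        elif e[1] not in immutable:
            yield {**db, e[1]: db[e[1]] - {t}}
    elif op == "union":                   # out of both sides
        for mid in delete_tuple(t, e[1], db, immutable):
            yield from delete_tuple(t, e[2], mid, immutable)
    elif op == "inter":                   # choose one side
        yield from delete_tuple(t, e[1], db, immutable)
        yield from delete_tuple(t, e[2], db, immutable)
    elif op == "diff":                    # out of e1, or into e2
        yield from delete_tuple(t, e[1], db, immutable)
        yield from insert_tuple(t, e[2], db, immutable)


def make_empty(e, db, immutable):
    """Realize e = ∅: remove each obstacle tuple, backtracking over choices."""
    obstacles = eval_expr(e, db)
    if not obstacles:
        yield db
        return
    t = next(iter(obstacles))
    for mid in delete_tuple(t, e, db, immutable):
        yield from make_empty(e, mid, immutable)


# Drop, for one course: the special formula s ∩ roster = ∅ (s is a parameter).
db = {"s": {("Harry",)}, "roster": {("Harry",), ("Meg",)}}
drop_body = ("inter", ("rel", "s"), ("rel", "roster"))
dropped = next(make_empty(drop_body, db, immutable={"s"}))
```

Only the obstacle tuple for Harry is removed from the roster, and Meg's tuple is left alone. The same machinery handles the formula a − r = ∅ by inserting into r, since the branch that would delete from the immutable a fails.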
Putting it all together
After the preceding discussion the pseudocode for Alchemy’s translation algorithm should be largely self-explanatory. For simplicity in notation we adopt the following conventions. There are global variables pre-state and post-state ranging over instances, and a global variable Updates which keeps a record of the insertions and deletions done as the algorithm progresses. We make use of the following function Eval(e : expression, J, J′ : database instances) that returns the set of tuples denoted by expression e under the convention that immutable relation-name occurrences are interpreted in J and mutable relation-name occurrences are interpreted in J′. The pseudocode given here for procedures $A_p$, $B_p$, insertTuple, and deleteTuple is directly based on the discussion in the previous paragraphs.

Definition 5 (Algorithm $A_p$). Let p be an Alloy predicate of the form

$$\mathtt{pred}\ p(s, s' : \mathit{State},\ a_1 : A_1, \dots, a_n : A_n)\ \{\ \forall \vec{x} \,.\, \bigwedge_i \bigvee_j s_{i,j}\ \}$$

where each $s_{i,j}$ is a special formula. The procedure $A_p$ determined by p is as follows. Each of $A_p$ and $B_p$ reads the instance I globally and reads and writes I′ and Updates globally.

    procedure A_p(I : database instance) {
      initialize poststate I′ to be I;
      initialize Updates to be empty;
      repeat B_p(a_1 : A_1, ..., a_n : A_n) until no change in Updates
    }

    procedure B_p(a_1 : A_1, ..., a_n : A_n) {
      for each binding b⃗ of values in I for the variables in a⃗:
        let ⋀_i ⋁_j s̄_{i,j} be the body of p instantiated by b⃗:
        for each conjunct ⋁_j s̄_{i,j} choose some s̄_{i,j} and realize s̄_{i,j} as follows:
          Case 1: s̄_{i,j} is of the form (e_1 ∩ ... ∩ e_k) = ∅
            set e ≡ (e_1 ∩ ... ∩ e_k)
            for each tuple t in Eval(e, I, I′):
              call deleteTuple(t, e, I, I′);
          Case 2: s̄_{i,j} is of the form (e_1 ∩ ... ∩ e_k) ≠ ∅
            set e ≡ (e_1 ∩ ... ∩ e_k)
            choose some t of the same type as e
            call insertTuple(t, e, I, I′)
        update Updates accordingly;
    }

    procedure insertTuple(t : tuple, e : expression) {
      match e:
        atom a: if a ≠ t then FAIL else RETURN
        immutable relation r: if t ∉ r then FAIL else RETURN
        mutable relation r: if t has been previously deleted from r then FAIL
                            else add t to the table r in J′
        e_1 ∪ e_2: choose some e_i; insertTuple(t, e_i)
        e_1 ∩ e_2: insertTuple(t, e_1); insertTuple(t, e_2)
        ∼e: insertTuple(∼t, e)    (∼t denotes t reversed)
        ⟨e_1, e_2⟩: let t = ⟨t_1, t_2⟩ where t_i matches type of e_i;
                    insertTuple(t_1, e_1); insertTuple(t_2, e_2)
        e_1 − e_2: insertTuple(t, e_1); deleteTuple(t, e_2)
        e_1 . e_2: let T be the common sig-type that joins e_1 and e_2;
                   if T is the type of e_1 then for some a in Eval(e_1, I, I′), insertTuple(⟨a, t⟩, e_2)
                   elseif T is the type of e_2 then for some a in Eval(e_2, I, I′), insertTuple(⟨t, a⟩, e_1)
                   else choose a : T; set t_1 = ⟨s_1, a⟩ and set t_2 = ⟨a, s_2⟩;
                        insertTuple(t_1, e_1); insertTuple(t_2, e_2)
        (e)∗: insertTuple(t, e)
    }

    procedure deleteTuple(t : tuple, e : expression) {
      match e:
        atom a: if a = t then FAIL else RETURN
        immutable relation r: if t ∈ r then FAIL else RETURN
        mutable relation r: if t has been previously inserted into r then FAIL
                            else delete t from the table r in J′
        e_1 ∪ e_2: deleteTuple(t, e_1); deleteTuple(t, e_2)
        e_1 ∩ e_2: choose some e_i; deleteTuple(t, e_i)
        ∼e: deleteTuple(∼t, e)    (∼t denotes t reversed)
        ⟨e_1, e_2⟩: let t = ⟨t_1, t_2⟩ where t_i matches type of e_i;
                    choose some e_i; deleteTuple(t_i, e_i)
        e_1 − e_2: choose: deleteTuple(t, e_1) or insertTuple(t, e_2)
        e_1 . e_2: let T be the common sig-type that joins e_1 and e_2;
                   if T is the type of e_1 then for each a in Eval(e_1, I, I′), deleteTuple(⟨a, t⟩, e_2)
                   elseif T is the type of e_2 then for each a in Eval(e_2, I, I′), deleteTuple(⟨t, a⟩, e_1)
                   else for each a : T such that for some s_1, s_2, ⟨s_1, a⟩ = t_1 is in e_1
                        and ⟨a, s_2⟩ = t_2 is in e_2 and t_1 . t_2 = t;
                        choose e_i then deleteTuple(t_i, e_i)
        (e)∗: for each (x, y_1), (y_1, y_2), ..., (y_n, y) such that t = (x, y) and each pair is in e
              choose some pair (y_i, y_{i+1}); deleteTuple(⟨y_i, y_{i+1}⟩, e)
    }

Proof of Theorem 2
Theorem 2 follows from the following lemma about $A_p$.

Lemma 6. Let p be a predicate; let $A_p$ be the non-deterministic procedure constructed from p by Definition 5. Then for every instance I and binding h for the parameters of p:
1. Every computation of $A_p$ terminates on I under h, and if $A_p$ returns an instance I′, we have $(I, I') \in \llbracket p \rrbracket h$;
2. If there is an instance I′ such that $(I, I') \in \llbracket p \rrbracket(h)$ then $A_p$ will not fail.

Proof of the lemma. For the first claim, first note that algorithm $B_p$ proceeds by primitive recursion over the body of the predicates and algorithms insertTuple and deleteTuple proceed by primitive recursion over the body of expressions. So it suffices to argue that the iteration until fixed point in algorithm $A_p$ always terminates. But this follows from the fact that we never add or delete the same tuple from a given relation and the total size of the domain we work with never changes. It is easy to see that when $A_p$ halts without failure it is the case that the body of the predicate has been satisfied.

To establish the second claim we start with a definition. Given instances I and I′ let us say that instance J is an (I, I′)-approximation if I − J ⊆ I − I′ and J − I ⊆ I′ − I. We abuse notation slightly here: these calculations are done on a per-relation basis. Intuitively J is an (I, I′)-approximation if J can be obtained from I by making some of the inserts and deletes that transform I into I′. Note that I is an (I, I′)-approximation, as is I′. Now the second claim follows from the fact that, for initial instance I and chosen I′ with $(I, I') \in \llbracket p \rrbracket(h)$, whenever algorithm $B_p$ is called (by $A_p$) when the current value of the poststate is an (I, I′)-approximation then there is a computation of $B_p$ that (i) does not fail, and (ii) updates the poststate so that it still is an (I, I′)-approximation. In particular $A_p$ will never fail.

Complexity
There is nothing interesting that can be said about the run-time complexity of code(p) since it depends on the nature of the predicate p, and p can be an arbitrary predicate. On the other hand it is natural to ask about the complexity of code() itself. In other words, what is the running time of Alchemy’s code generation algorithm? Since code(p) comprises a backtracking wrapper around the algorithm $A_p$ the question is essentially the same as asking: what is the complexity of building the text of algorithm $A_p$ from the text of predicate p? It is easy to see that this is linear in the size of p. Note in particular that the procedures insertTuple and deleteTuple do not depend on p at all.

For an extensive discussion of previous research relevant to the Alchemy project itself we refer the reader to the related work section in [7]. The relationship of the present paper to the previous work on Alchemy is as follows. In [7] we did not handle the relational difference operator, we did not treat Skolemization, and our correctness result was only for a subset of Alloy predicates (those admitting “homogeneous” implementations as defined there). But most importantly, the treatment of when relation names were evaluated in the pre-state and when in the post-state was ad hoc: in the current paper this important semantic decision rests on the secure foundations of the work in [4]. This allows us to prove a true soundness and completeness theorem (Theorem 2) for our code-generation algorithm.
References
[1] Jean-Raymond Abrial (1996): The B-Book: Assigning Programs to Meanings. Cambridge University Press.
[2] José A. Blakeley, Per-Åke Larson & Frank Wm. Tompa (1986): Efficiently Updating Materialized Views. In: Carlo Zaniolo, editor: SIGMOD Conference. ACM Press, pp. 61–71. Available at http://doi.acm.org/10.1145/16894.16861,db/conf/sigmod/BlakeleyLT86.html .
[3] Vanessa P. Braganholo, Susan B. Davidson & Carlos A. Heuser (2004): From XML View Updates to Relational View Updates: old solutions to a new problem. In: Mario A. Nascimento, M. Tamer Özsu, Donald Kossmann, Renée J. Miller, José A. Blakeley & K. Bernhard Schiefer, editors: VLDB. Morgan Kaufmann, pp. 276–287. Available at .
[4] Theophilos Giannakopoulos, Daniel J. Dougherty, Kathi Fisler & Shriram Krishnamurthi (2009): Towards an Operational Semantics for Alloy. In: Proc. 16th International Symposium on Formal Methods. To appear.
[5] Daniel Jackson (2006): Software Abstractions. MIT Press.
[6] Daniel Jackson & Jeanette Wing (1996): Lightweight Formal Methods. IEEE Computer.
[7] Shriram Krishnamurthi, Daniel J. Dougherty, Kathi Fisler & Daniel Yoo (2008): Alchemy: Transmuting Base Alloy Specifications into Implementations. In: ACM SIGSOFT International Symposium on the Foundations of Software Engineering. pp. 158–169.
[8] J. Michael Spivey (1992):