The Optics of Language-Integrated Query

J. López-González a,b,∗, Juan M. Serrano a,b

a Universidad Rey Juan Carlos, Calle Tulipán, s/n, 28933 Móstoles, Spain
b Habla Computing SL, Avda. Gregorio Peces Barba, 28918 Leganés, Spain

Highlights

• Getter, Affine Fold and Fold optics are lifted into Optica, a query language for LINQ
• XQuery and SQL queries derived from non-standard optic representations
• Optics as a higher-level interface over comprehension-based query languages
• Typed tagless-final encoding in Scala of the Optica type system and semantics
Abstract
Monadic comprehensions reign over the realm of language-integrated query (LINQ), and for good reasons. Indeed, comprehensions are tightly integrated with general-purpose programming languages and close enough to common query languages, such as SQL, to guarantee their translation into effective queries. Comprehensions also support features for writing reusable and composable queries, such as the handling of nested data and the use of functional abstractions. In parallel to these developments, optics have emerged in recent years as the technology of choice to write programs that manipulate complex data structures with nested components. Optic abstractions are easily composable and, in principle, permit both data access and updates. This paper attempts to exploit the notion of optic for LINQ as a higher-level language that complements comprehension-based approaches. In order to do this, we lift a restricted subset of optics, namely getters, affine folds and folds, into a full-blown DSL. The type system of the resulting language of optics, which we have named Optica, distills their compositional properties, whereas its denotational semantics is given by standard optics. This formal specification of the concept of optic enables the definition of non-standard optic representations beyond van Laarhoven, profunctor optics, etc. In particular, the paper demonstrates that a restricted subset of XQuery can be understood as an optic representation; it introduces Triplets, a non-standard semantic domain to normalize optic expressions and facilitate the generation of SQL queries; and it describes how to generate comprehension-based queries from optic expressions, thus showing that both approaches can coexist. Despite the limited expressiveness of optics in relation to comprehensions, results are encouraging enough to anticipate the convenience and feasibility of extending existing comprehension-based libraries for LINQ in the functional ecosystem with optic capabilities.
In order to show this potential, the paper also describes S-Optica, a Scala implementation of Optica using the tagless-final approach.
Keywords: optics, language-integrated query, type systems, comprehensions, typed tagless-final, Scala

⋆ This work is partially supported by a Doctorate Industry Program grant to Habla Computing SL, from the Spanish Ministry of Economy, Industry and Competitiveness.
∗ Corresponding author
Email addresses: [email protected] (J. López-González), [email protected] (Juan M. Serrano)
Preprint submitted to Science of Computer Programming, September 3, 2020

1. Introduction
The research field of language-integrated query (LINQ [1, 2, 3]) aims at alleviating the impedance mismatch problem [4, 5] that commonly originates in software systems where general-purpose programming languages, on the one hand, and query languages, on the other, need to interoperate. The problem manifests itself in the form of maintenance, reliability and security problems, which are essentially due to the mismatches of programming paradigms and data models endorsed by the interacting languages. In order to tackle this issue, the LINQ research field favors a domain-specific language (DSL)-based approach [6]. From this perspective, the programmer does not simply inject query expressions into the general-purpose language as plain strings, a practice which is a well-known source of bugs and injection attack problems; on the contrary, she uses a DSL which ensures that the query is well-formed and correctly typed, and which, moreover, helps to overcome the conceptual gap between the general-purpose and the query language.

Indeed, not every DSL can be given the seal of approval from a language-integrated perspective. For instance, we may embed SQL in a host language like Scala [7] to attain the stated demand of type safety and yet, the disparity between the flat and nested computational models of both languages would not be reduced in the slightest. Type safety is, without a shadow of a doubt, a necessary step in order to generate well-formed SQL queries, and Scala libraries such as Doobie [8], which focus on this specific issue, are worthwhile. However, to properly bridge the impedance mismatch gap, we need DSLs at a higher level of abstraction: close enough to general-purpose languages, yet specific enough to allow for the efficient generation of queries for a wide range of querying languages [9]. Since early in its foundation, the LINQ research field has exploited monadic comprehensions [10, 11] as its DSL of choice for this purpose.
The basic insight was originally introduced in [12], and then developed by the Nested Relational Calculus (NRC) [13, 14], which provides the foundation of query languages based on comprehensions. NRC subsumes much work on LINQ theories and systems such as Kleisli [15], Links [16], Microsoft's LINQ [2, 17], Database Supported Haskell (DSH) [18], T-LINQ [3], QUEΛ [19] and SQUR [20].

In essence, the purpose of research on LINQ is borrowing the comprehension syntax that in-memory data structures such as lists, sets, bags, and other bulk types enjoy, in order to express queries at a generic monadic level. To this aim, bulk types are lifted into a proper DSL that abstracts away their characteristic in-memory representation but still allows queries to be expressed using comprehension syntax. In some cases, as in Kleisli, Links and Microsoft's LINQ, this lifting mechanism is a primitive part of the general-purpose language itself. In the case of more conventional, functional programming (FP) languages, such as Scala or F#, the query language is embedded in the host language using one of the several techniques that FP offers for this purpose: typed tagless-final [21], generalized algebraic data types (GADTs) [22] or quoted domain-specific languages (QDSLs) [23]. For instance, quotation is used to embed the T-LINQ language in F#, and the tagless-final approach to embed QUEΛ in OCaml. Similarly, Quill [24] is a QDSL heavily inspired by T-LINQ, which is embedded in Scala.

To illustrate the use of comprehensions in LINQ, we consider the data structures in column "comprehensions" of Table 1, implemented in the Scala programming language, our language of choice throughout this paper. (The entities of a flat model are either base types or classes with no nested collections; otherwise, we say that the model is nested. Appendix A offers a brief account of the major Scala features that are used in this paper.)

Model (comprehensions):
  class Couple(fst: String, snd: String)
  class Person(name: String, age: Int)

Model (optics):
  class Couple(fst: Person, snd: Person)
  class Person(name: String, age: Int)

Immutable (comprehensions):
  def under50_a(couples: List[Couple], people: List[Person]): List[String] =
    for {
      c ← couples
      w ← people
      if c.fst == w.name && w.age < 50
    } yield w.name

Immutable (optics):
  val under50Fl: Fold[List[Couple], String] =
    couples ≫ fst ≫ filtered(age < 50) ≫ name
  val under50_c: List[Couple] => List[String] =
    under50Fl.getAll

Generic (comprehensions):
  val under50_b = quote {
    for {
      c ← query[Couple]
      w ← query[Person]
      if c.fst == w.name && w.age < 50
    } yield w.name
  }

Generic (optics): ?

Table 1: Towards Optic-Based LINQ.

According to this model, a couple consists of two members, first and second, to which we refer by their names; besides names, each person also has an age. Given a list of couples and a list of people, we may obtain the names of those partners who occupy the first position and are under 50 years of age by using a list comprehension, as query under50_a shows. Now, using Slick [25] or Quill [24], two well-known libraries of the Scala ecosystem, we may express the very same query in a generic way. To be precise, we call a query generic if it can be efficiently run (e.g. via appropriate compilation) against data stores of different sorts: in-memory, external relational databases, non-relational stores such as XML/JSON files, etc. For instance, the query under50_b in Table 1 is a generic query implemented in Quill. Being generic, this query may be compiled to different targets according to the mappings between Scala case classes and database schemas that the Quill framework supports (as of this writing, Cassandra's CQL [26] and SQL). For instance, the resulting SQL expression generated for this query by Quill would be as follows:

  SELECT w.name
  FROM Couple AS c INNER JOIN Person AS w ON c.fst = w.name
  WHERE w.age < 50
We have illustrated the use of comprehensions with a simple example of a so-called flat-flat query, i.e. a query that receives and returns flat types. These queries are significant because relational databases cannot handle nested results. For instance, we cannot write an SQL query that returns rows in which a field contains a list of values. This demonstrates a significant mismatch between SQL and programming languages, where nested data models are customary. Moreover, comprehension queries, being founded on NRC, can perfectly well handle nested data, which might lead us to think about wasted expressiveness. In fact, the opposite is true: several conservativity results show that we can certainly use nested data as intermediate values in comprehensions [27], even in the presence of parameterized queries (i.e. using lambda expressions) [16], and still be able to generate normalized queries which do not use nested data in any way. We can even accommodate flat-nested queries through several flat-flat normalized queries by using techniques for query shredding [28]. In sum, comprehensions are exceptionally good from a LINQ perspective: well integrated into a wide range of programming languages and close enough to relational databases to generate effective queries.

In spite of this, we find three major problems or inconveniences in the current use of comprehensions for LINQ. First, comprehensions can only express retrieval queries, but updates are equally important. This is acknowledged as an open problem in the LINQ field [9]. Second, the use of nested data and functional abstraction in comprehension-based languages such as Links/T-LINQ undoubtedly helps in obtaining more compositional queries [3]. However, this is done at the expense of complex rewriting machinery, especially in the case of QDSLs like T-LINQ. Alternative approaches based on normalization-by-evaluation [20] ameliorate this problem, but the support for compositional queries is nevertheless limited.
Basically, this is due to the fact that comprehensions are nearer to the point-wise notation exemplified by the relational calculus than to pure relational algebra, and have to deal with variable (re)naming, freshness and scope. Functionally, both formalisms are equally expressive, but the point-free combinators of relational algebra are arguably more flexible [29]. This flexibility naturally results in more modular queries, which directly impacts non-functional concerns such as reuse and change tolerance [30, 31]. Third, there are potential querying infrastructures which are essentially hierarchical rather than relational, such as NoSQL document-oriented databases, which build upon nested data sources in JSON, XML or YAML. The translation of queries at the programming-language level into these infrastructures may benefit from a more primitive, algebraic querying model, which is hierarchical by nature.

This paper attempts to show that optics [32] may play this role in the realm of language-integrated query, since they support a pure algebraic approach to LINQ that may potentially be extended to deal with updates. Indeed, optics, also known as functional references, are abstractions that select parts contextualized within a whole, and that provide methods to access and/or update the values that they select. Since the first appearance of the lens [33], arguably the most prominent optic, a rich catalog of optics has emerged [34]. With very few exceptions, they compose with each other, so that they can seamlessly produce quite complex transformations over immutable nested data structures. Indeed, they have become an essential companion for the functional programmer, as evidenced by the growing popularity of optic libraries in the functional ecosystem [35, 36].
In sum, we may say that optics are the de-facto standard for manipulating nested data in a point-free, algebraic style; they are as ubiquitous as comprehensions in functional programming languages and, more importantly, the most common variants are explicitly designed to handle both read and write accessors.

How do we use optics as a higher-level language to express generic queries? How do these optic-based queries relate to generic comprehension queries? How do we translate optic expressions to SQL/NoSQL query languages directly? To answer these questions, we may follow the same strategy that is illustrated in Table 1 for comprehension queries: by using lenses [33], traversals [37], folds, and other optic abstractions, we can query and update immutable data structures quite naturally; why not borrow this very same syntax to express generic queries that can be interpreted over different target storage systems? For instance, in column "optics" of Table 1, an alternative nested model is defined for the couples example, where keys in Couple are replaced by full-blown Person values. Then, query under50_c, derived from optic under50Fl, which in turn is composed from several optics (the fold couples, and getters fst, age and name), offers an alternative formulation to the comprehension query under50_a. What we are looking for is a generic optic-based query, analogous to the comprehension-based query under50_b. In essence, we need to lift optics into a full-fledged DSL.

Contributions and outline.
This paper sets out to define the language of optics, which we have dubbed Optica. We aim at showing that Optica may play the role of an effective query language for LINQ, alone and in combination with comprehension queries, albeit at the expense of limited expressiveness. In general, we aim to show that optics offer a fruitful abstraction for LINQ, and we restrict our attention to proving the feasibility of this approach on a selected subset of optic abstractions and domain examples. In particular, these are our contributions:

• A review of concrete optics from the mindset of LINQ. We show how to exploit a subset of standard optic abstractions and their combinators in order to express compositional queries (Sect. 2). We focus exclusively on read-only optics, i.e. those which select parts from the whole but do not write back into the data structure, namely getters, affine folds and folds. This allows us to focus on a tractable subset of optics, and to prepare the ground to tackle more ambitious problems in future work, such as the modeling of updates in LINQ.

• A formal specification of read-only optics in terms of Optica, a full-blown DSL. The syntax and type system of the language formalize their compositional and querying features in an abstract way. Its denotational semantics is given by concrete optics themselves (Sect. 3). We show how to implement generic queries over abstract optic models by using Optica in a declarative way, once and for all.

• The abstract specification of read-only optics provided by Optica enables the definition of alternative, non-standard optic representations. We provide three major Optica interpretations which attempt to show the capabilities of Optica as a general query language for LINQ:

  – An XQuery interpretation, which allows us to directly translate Optica queries into XQuery [38] expressions (Sect. 4). This aims at showing the adequacy of Optica for dealing with common data sources of NoSQL document-oriented databases.

  – A SQL interpretation, which generates SQL queries from Optica expressions (Sect. 5). This non-standard denotational semantics builds upon Triplets, a semantic domain which normalizes the optic expression in order to facilitate its direct translation to SQL. The proposed semantics works similarly to the normalization-by-evaluation approach of SQUR [20]. The major difference is that SQUR consists of a relational calculus, whereas we work over optics, which are more akin to relational algebra.

  – A T-LINQ interpretation, which generates comprehension queries. This non-standard semantics is aimed at showing how to use Optica as a higher-level language for nested data in conjunction with comprehension-based languages (Sect. 6).

• S-Optica, an embedded DSL implementation of Optica in Scala using the tagless-final approach (Sect. 7). This implementation is intended as a proof of concept illustrating how to implement the formal Optica type system and semantics in a common, general-purpose programming language of the software industry. It also aims at providing an example of the tagless-final approach, as well as serving as a reference for extending existing LINQ libraries with optic capabilities.

As can be seen, Sects. 2-7 contain the bulk of the paper. Section 8 discusses related work and limitations of the approach. Finally, Section 9 concludes and points towards current and future work. The Scala library that accompanies this paper is publicly available on a GitHub repository [39].

2. Querying with Optics
This section introduces three different kinds of read-only optics: getters, affine folds and folds, together with their main combinators, where we use Scala as the vehicle to implement them. In essence, read-only optics are just views without updates, and hence they are not subject to the familiar optic laws [34]. They are not as widespread as their siblings with updating capabilities (namely, lenses, affine traversals and traversals), given that selecting nested fields from immutable data structures is usually a trivial task. Nonetheless, they exhibit the same compositional features and patterns as the rest of the optics, and will thus allow us to illustrate the general declarative querying style advocated by them. The abstractions and examples that we put forward in this section will be used throughout the paper.

2.1. Getters, Affine Folds and Folds

First of all, it is worth noting that we choose the concrete optic representation, where the notions of whole and part are clear, in order to make definitions easier to understand. There are other representations, such as van Laarhoven [37] or profunctor [32], that implement optic composition in a remarkably elegant way, but whose signatures are not as easy to approach for an outsider.

Definition 1 (Getter).
A getter consists of a function that selects a single part from the whole. We encode it in Scala as follows:

  case class Getter[S, A](get: S => A)

The type parameters S and A will consistently serve as the whole and the part, respectively, throughout the different optic definitions.

There are several getter combinators that will be used frequently in the text; they have been collected in the companion object for Getter, which is shown in Fig. 1. The andThen method combines getters that select nested values in order to produce a new getter that selects a deeply nested value. The getter id is the neutral element under andThen composition, where whole and part coincide. The fork combinator is required if we wish to put different parts together. The like combinator selects a constant part which is taken as a parameter, where the whole is ignored. The remaining combinators essentially lift arithmetic operations into functions that take getters selecting the operands as parameters and produce a getter that selects the operation's result.

Remark 1. We assume ≫ and ∗∗∗ as infix versions of andThen and fork, respectively, where the latter symbol has precedence over the former. Similarly, we will overload the operators ===, > and - as infix versions of equal, greaterThan and subtract, respectively. Last, we will use the postfix expression p.not as an alias for not(p).

(We ignore other read-only optics such as fold1, since they do not add value in the particular examples that we have selected for this paper. Appendix A provides a Scala cheat sheet that describes the most fundamental Scala abstractions in the context of this work. The difficulty of the van Laarhoven and profunctor signatures is evidenced by the jokes around this topic in the functional programming community: https://pbs.twimg.com/media/CypY7B1W8AAvqwl.jpg. A concrete lens is basically a getter plus a function to update the whole from a new version of the part.)

  object Getter {
    def id[A]: Getter[A, A] =
      Getter(a => a)

    def andThen[S, A, B](u: Getter[S, A], d: Getter[A, B]): Getter[S, B] =
      Getter(s => d.get(u.get(s)))

    def fork[S, A, B](l: Getter[S, A], r: Getter[S, B]): Getter[S, (A, B)] =
      Getter(s => (l.get(s), r.get(s)))

    def like[S, A](a: A): Getter[S, A] =
      Getter(const(a))

    def not[S](b: Getter[S, Boolean]): Getter[S, Boolean] =
      b ≫ Getter(!_)

    def equal[S, A: Equal](x: Getter[S, A], y: Getter[S, A]): Getter[S, Boolean] =
      Getter(s => x.get(s) === y.get(s))

    def greaterThan[S](x: Getter[S, Int], y: Getter[S, Int]): Getter[S, Boolean] =
      Getter(s => x.get(s) > y.get(s))

    def subtract[S](x: Getter[S, Int], y: Getter[S, Int]): Getter[S, Int] =
      Getter(s => x.get(s) - y.get(s))
  }

Figure 1: Getter Combinators.
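To see the combinators of Fig. 1 in action, the following is a minimal, self-contained sketch. The Couple/Person model follows the optics column of Table 1; the infix operators are written >>> and *** here purely for the sake of a plain ASCII example, which is an assumption of this sketch rather than the paper's actual notation.

```scala
// Minimal Getter with the two composition combinators from Fig. 1,
// exposed as methods so that we can write them infix.
case class Getter[S, A](get: S => A) {
  def >>>[B](d: Getter[A, B]): Getter[S, B] =        // vertical composition (andThen)
    Getter(s => d.get(get(s)))
  def ***[B](r: Getter[S, B]): Getter[S, (A, B)] =   // horizontal composition (fork)
    Getter(s => (get(s), r.get(s)))
}

case class Person(name: String, age: Int)
case class Couple(fst: Person, snd: Person)

val fst: Getter[Couple, Person]  = Getter(_.fst)
val name: Getter[Person, String] = Getter(_.name)
val age: Getter[Person, Int]     = Getter(_.age)

// Select the first member's name, and pair that name with her age.
val fstName: Getter[Couple, String]        = fst >>> name
val fstInfo: Getter[Couple, (String, Int)] = fst >>> (name *** age)

val c = Couple(Person("Alex", 60), Person("Bert", 55))
// fstName.get(c) evaluates to "Alex"; fstInfo.get(c) to ("Alex", 60)
```

Note how vertical composition reaches into the nested Person, while horizontal composition pairs two selections over the same whole.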
Remark 2. Fork-like optic composition (which we will also refer to as horizontal composition) is not widespread in the folklore. Indeed, it is not possible to implement it in a safe way for most optics. For example, an analogous implementation for composing lenses horizontally would violate the lens laws [40] when both lenses select the very same part.

Definition 2 (AffineFold). An affine fold consists of a function that selects at most one part from the whole. We encode it as follows:

  case class AffineFold[S, A](preview: S => Option[A])
We could see this optic as a simplification of an affine traversal, where we omit the updating part.

Once again, we have packaged several affine fold combinators in Fig. 2. The identity affine fold simply selects the whole value and wraps it in a Some case. The andThen combinator selects the innermost value just in case both optics u and d denote defined values; otherwise, it selects nothing. We implement this functionality in terms of the Option monad using for-comprehension syntax. Finally, we consider filtered as an interesting builder of affine folds, which declares the same types for whole and part. It simply discards the value (returning None) in case it is actually pointing to something and the input predicate does not hold for it.
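As a quick illustration of the behaviour just described, the following self-contained sketch composes filtered with a getter-derived affine fold. This is illustrative code written for this passage (it mimics, but is not, the combinators of Fig. 2):

```scala
case class Getter[S, A](get: S => A)

case class AffineFold[S, A](preview: S => Option[A]) {
  def >>>[B](d: AffineFold[A, B]): AffineFold[S, B] =  // vertical composition
    AffineFold(s => preview(s).flatMap(d.preview))
}

// filtered keeps the whole only when the predicate (a getter!) holds.
def filtered[S](p: Getter[S, Boolean]): AffineFold[S, S] =
  AffineFold(s => if (p.get(s)) Some(s) else None)

case class Person(name: String, age: Int)

// A getter cast into an affine fold that always succeeds.
val name: AffineFold[Person, String] = AffineFold(p => Some(p.name))

val under50Name: AffineFold[Person, String] =
  filtered[Person](Getter(_.age < 50)) >>> name

// under50Name.preview(Person("Cora", 33)) evaluates to Some("Cora")
// under50Name.preview(Person("Alex", 60)) evaluates to None
```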
Remark 3. It is worth emphasizing that the predicate that filtered takes as a parameter is a getter itself. This is unusual in folklore libraries, where a plain lambda expression is taken as an argument instead. However, predicates can be perfectly well understood as queries (getters, in particular). We will exploit this idea in the next section to avoid introducing lambda terms in the Optica DSL.

(Consequently, we will refer to andThen as vertical composition.)

  object AffineFold {
    def id[A]: AffineFold[A, A] =
      AffineFold(a => Some(a))

    def andThen[A, B, C](u: AffineFold[A, B], d: AffineFold[B, C]): AffineFold[A, C] =
      AffineFold(s =>
        for {
          b ← u.preview(s)
          c ← d.preview(b)
        } yield c)

    def filtered[S](p: Getter[S, Boolean]): AffineFold[S, S] =
      AffineFold(s => if (p.get(s)) Some(s) else None)

    implicit def to_af[S, A](g: Getter[S, A]): AffineFold[S, A] =
      AffineFold(s => Some(g.get(s)))
  }

Figure 2: Affine Fold Combinators.

Remark 4.
One of the major benefits of optics is that they compose heterogeneously; in other words, it is possible to combine getters, affine folds and folds among themselves. To put it simply, we can turn getters into affine folds, and we can turn affine folds into folds. An example of such a casting is shown in Fig. 2 (to_af), where we find it implemented as an implicit converter. Thereby, the Scala compiler applies the conversion automatically when it detects a getter where an affine fold is expected instead.

Definition 3 (Fold). A fold consists of an optic that selects a (possibly empty) sequence of parts from the whole.

  case class Fold[S, A](getAll: S => List[A])

We could see this optic as a simplification of a traversal [37], where we omit the updating part.

As usual, we have packaged the fold primitives in the corresponding companion object, which can be found in Fig. 3. The implementation of id and andThen is basically the same as the one we showed for affine folds, the difference being that we work with the List monad instead of the Option one. The nonEmpty method takes a fold as a parameter and produces a getter that checks whether the number of selected parts is greater than zero. The remaining combinators (empty, all, any and elem) are just derived definitions, which are implemented in terms of other combinators, where we assume that object-oriented dot syntax is available. For instance, nonEmpty(fl) becomes fl.nonEmpty and all(fl)(p) becomes fl.all(p).

The implementation of elem might deserve further explanation. Since we favor getters over plain functions as predicates (as stated in Remark 3), we need to use optic abstractions and combinators to build them. The following is a derivation where we start with an implementation that we consider natural, which requires lambda expressions, and we end up with the standing implementation, where lambda expressions are removed.

(Heterogeneous composition holds with very few exceptions, which are beyond the scope of this paper. Similarly, we could define getters following the same pattern by using the Id monad, but we avoid doing so for brevity.)

  object Fold {
    def id[A]: Fold[A, A] =
      Fold(a => List(a))

    def andThen[A, B, C](u: Fold[A, B], d: Fold[B, C]): Fold[A, C] =
      Fold(s =>
        for {
          b ← u.getAll(s)
          c ← d.getAll(b)
        } yield c)

    def nonEmpty[S, A](fl: Fold[S, A]): Getter[S, Boolean] =
      Getter(fl.getAll(_).nonEmpty) /* List.nonEmpty */

    def empty[S, A](fl: Fold[S, A]): Getter[S, Boolean] =
      fl.nonEmpty.not

    def all[S, A](fl: Fold[S, A])(p: Getter[A, Boolean]): Getter[S, Boolean] =
      (fl ≫ filtered(p.not)).empty

    def any[S, A](fl: Fold[S, A])(p: Getter[A, Boolean]): Getter[S, Boolean] =
      fl.all(p.not).not

    def elem[S, A: Equal](fl: Fold[S, A])(a: A): Getter[S, Boolean] =
      fl.any(id === like(a))

    implicit def to_fl[S, A](a: AffineFold[S, A]): Fold[S, A] =
      Fold(s => a.preview(s).toList)
  }

Figure 3: Fold Combinators.
  fl.any(Getter(s => s === a))
    ≃ [definition of ‘id‘ getter]
  fl.any(Getter(s => id.get(s) === a))
    ≃ [definition of ‘like‘ getter]
  fl.any(Getter(s => id.get(s) === like(a).get(s)))
    ≃ [definition of ‘equal‘ getter]
  fl.any(id === like(a))

Note that === is overloaded. In fact, the occurrence of this method in the last line corresponds to the equal combinator from getters, while the rest refer to the standard comparison method from the Equal type class.
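To exercise the derived combinators on concrete data, here is a small self-contained sketch. For brevity, all, any and elem are implemented directly with List methods rather than through the derivations of Fig. 3, but the observable behaviour is the same (the equivalences are noted in the comments):

```scala
case class Getter[S, A](get: S => A)

case class Fold[S, A](getAll: S => List[A]) {
  def nonEmpty: Getter[S, Boolean] = Getter(getAll(_).nonEmpty)
  def empty: Getter[S, Boolean]    = Getter(getAll(_).isEmpty)
  // Directly with List.forall; equivalent to (fl ≫ filtered(p.not)).empty
  def all(p: Getter[A, Boolean]): Getter[S, Boolean] =
    Getter(getAll(_).forall(p.get))
  // Directly with List.exists; equivalent to fl.all(p.not).not
  def any(p: Getter[A, Boolean]): Getter[S, Boolean] =
    Getter(getAll(_).exists(p.get))
  // Equivalent to fl.any(id === like(a))
  def elem(a: A): Getter[S, Boolean] = any(Getter(_ == a))
}

val tasks: Fold[List[String], String] = Fold(identity)

// tasks.elem("abstract").get(List("build", "abstract")) evaluates to true
// tasks.all(Getter(_.nonEmpty)).get(List("build", "")) evaluates to false
// and all(...) holds vacuously on an empty list, while any(...) does not
```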
Remark 5. As we have seen throughout this section, read-only optics are essentially functions that select parts from the whole, yet we have introduced them as separate definitions. This distinction between functions and optics turns out to be central in this work, since Optica expressions denoting one or the other can be evaluated in very different ways, as we will see throughout Sects. 3-6.

Now that we have seen several standard combinators and some interesting features of optics, we will exercise them to illustrate the querying style and common patterns advocated by optics. For this task, we have selected two examples from [3], which will be used throughout the paper. (The first example has been slightly updated in order to adapt it to today's society.)

2.2.1. Couples Example

This example extends the one which was introduced in Table 1. Remember that it consists of a simple relation of couples, where the name and age of each person forming them is also supplied:

  type Couples = List[Couple]
  case class Couple(fst: Person, snd: Person)
  case class Person(name: String, age: Int)
The associated data structures are defined following a nested, rather than a relational, approach, i.e. couples contain a full person value rather than a person key. This distinction becomes essential in Sect. 6. Once we have defined the model, we provide CoupleModel-specific optics to select relevant parts from the domain.

  object CoupleModel {
    val couples: Fold[Couples, Couple] = Fold(identity)
    val fst: Getter[Couple, Person]    = Getter(_.fst)
    val snd: Getter[Couple, Person]    = Getter(_.snd)
    val name: Getter[Person, String]   = Getter(_.name)
    val age: Getter[Person, Int]       = Getter(_.age)
  }

Basically, and for this particular example, we supply a getter for each field, where whole and part correspond to the data and field types, respectively. The Scala placeholder syntax is used in these definitions. There is also a simple fold that we can use to select each couple from Couples, which we see as the root type of the nested model.
Remark 6. The examples that are presented in this paper do not include affine folds as part of the domain models, but they could be helpful to model optional values. For instance, we could use them to consider an optional address field associated to each person.

Now, we can use the standard optics defined in the previous section and the specific optics defined for this domain to compose new ones. For instance, the following fold selects the name and age difference from all those couples where the first member is older than the second one.

  val differencesFl: Fold[Couples, (String, Int)] =
    couples ≫ filtered((fst ≫ age) > (snd ≫ age)) ≫
      (fst ≫ name) ∗∗∗ ((fst ≫ age) - (snd ≫ age))

Firstly, we use couples as an entry point, and filtered to remove the couples in which the age of the first member is not greater than the age of the second one. Right after filtering, we select the name of the first member and put it together with the age difference, by means of ∗∗∗, to determine the subparts that the optic is selecting.

Once we have defined the fold, we need to generate the query that selects the corresponding information from the immutable structure, i.e. a function that takes the couples as its argument and returns the matching values. For this task, we simply use getAll.

  val differences: Couples => List[(String, Int)] =
    differencesFl.getAll

If we feed this query with the same data that was used in the original example [3], we should expect the same result.

  val data: Couples = List(
    Couple(Person("Alex", 60), Person("Bert", 55)),
    Couple(Person("Cora", 33), Person("Demi", 31)),
    Couple(Person("Eric", 21), Person("Fred", 60)))

  val res: List[(String, Int)] = differences(data)
  // res: List[(String, Int)] = List((Alex,5), (Cora,2))

The comment shows the value of res when we run the query with the original data. As expected, it indicates that Alex and Cora are older than their mates by 5 and 2 years, respectively.
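Following the suggestion of Remark 6, a hedged sketch of how an optional address could be modelled with an affine fold. The Address type and these optics are hypothetical additions made for illustration; they are not part of the couples model used in this paper:

```scala
case class AffineFold[S, A](preview: S => Option[A]) {
  def >>>[B](d: AffineFold[A, B]): AffineFold[S, B] =  // vertical composition
    AffineFold(s => preview(s).flatMap(d.preview))
}

// Hypothetical extension of the model: each person may have an address.
case class Address(city: String)
case class Person(name: String, age: Int, address: Option[Address])

val address: AffineFold[Person, Address] = AffineFold(_.address)
val city: AffineFold[Address, String]    = AffineFold(a => Some(a.city)) // a getter, cast

val personCity: AffineFold[Person, String] = address >>> city

// personCity.preview(Person("Cora", 33, Some(Address("Madrid")))) == Some("Madrid")
// personCity.preview(Person("Demi", 31, None)) == None
```

The optional field fits the affine fold contract exactly: at most one part is selected from the whole.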
We move on to the next example, where our model is an organization formed by departments of employees. In addition, each employee has a set of tasks that she is able to perform.

  type Org = List[Department]
  case class Department(dpt: String, employees: List[Employee])
  case class Employee(emp: String, tasks: List[Task])
  case class Task(tsk: String)
Once again, we supply OrgModel-specific optics to select relevant parts from the domain:

  object OrgModel {
    val departments: Fold[Org, Department]    = Fold(identity)
    val dpt: Getter[Department, String]       = Getter(_.dpt)
    val employees: Fold[Department, Employee] = Fold(_.employees)
    val emp: Getter[Employee, String]         = Getter(_.emp)
    val tasks: Fold[Employee, Task]           = Fold(_.tasks)
    val tsk: Getter[Task, String]             = Getter(_.tsk)
  }
In this case, we find several fields containing lists, thus, we provide folds instead of getters todeal with sequences of parts. Now, we compose a fold to select the name of those departmentswhere all employees know how to abstract . def expertiseFl: Fold[Org, String] = departments ≫ filtered(employees.all(tasks.elem("abstract"))) ≫ dpt This expression refers first to all departments, and then it filters the ones where all employeescontain the task "abstract" . Finally, it selects their textual identifier ( dpt ). Once the fold isdefined, we produce the query to obtain the selected departments: def expertise: Org => List[String] = expertiseFl.getAll Once more, we feed the query with the original organization’s data. val data: Org = List(Department("Product", List(Employee("Alex", List(Task("build"))),Employee("Bert", List(Task("build"))))),Department("Quality", List.empty),Department("Research", List(Employee("Cora", List(Task("abstract"), Task("build"), Task("design"))),Employee("Demi", List(Task("abstract"), Task("design"))),Employee("Eric", List(Task("abstract"), Task("call"), Task("design"))))),Department("Sales", List(Employee("Fred", List(Task("call")))))) val res: List[String] = expertise(data) // res: List[String] = List(Quality, Research)
The resulting value shows that the departments of Quality and Research are the only ones where all employees are able to abstract.

The general pattern should be clear now. Firstly, we define the involved data types in the model and supply specific optics to select their parts. Secondly, we use these optics and the standard ones to express more sophisticated selectors in a modular and elegant way. Finally, we run the optic method with an initial whole to produce the expected query. As can be seen, the approach is eminently declarative: the aspects of composing the desired optic and running it are completely decoupled.

Base types        b ::= N | B | S
Model types       t ::= b | (t, t)
Optic types       s ::= getter t t | affine t t | fold t t
Query types       u ::= t → t | t → option t | t → list t
Constants         c (of base type)
Optic expressions e ::= id gt | id af | id fl | e ≫gt e | e ≫af e | e ≫fl e
                      | like c | not e | e > e | e == e | e − e | e *** e
                      | filtered e | nonEmpty e | to af e | to fl e
Query expressions q ::= get e | preview e | getAll e

Figure 4: Optica syntax
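The highlights announce a typed tagless-final Scala encoding of this type system. As a rough illustration of what that means for the getter fragment of Fig. 4, here is a hedged sketch: the algebra, method names and signatures below are ours, not the paper's actual encoding.

```scala
// Each typing rule of the getter fragment becomes a method of an algebra
// indexed by an abstract representation Repr (tagless-final style).
trait GetterAlg[Repr[_, _]] {
  def id[S]: Repr[S, S]                                            // id gt
  def andThen[S, A, B](g: Repr[S, A], h: Repr[A, B]): Repr[S, B]   // g >>gt h
  def like[S, A](a: A): Repr[S, A]                                 // constant getter
  def gt[S](g: Repr[S, Int], h: Repr[S, Int]): Repr[S, Boolean]    // g > h
}

// Standard semantics: Repr is a plain function, mirroring concrete getters.
type Fn[S, A] = S => A
object StandardGetter extends GetterAlg[Fn] {
  def id[S]: Fn[S, S] = s => s
  def andThen[S, A, B](g: Fn[S, A], h: Fn[A, B]): Fn[S, B] = s => h(g(s))
  def like[S, A](a: A): Fn[S, A] = _ => a
  def gt[S](g: Fn[S, Int], h: Fn[S, Int]): Fn[S, Boolean] = s => g(s) > h(s)
}

// A generic expression is written once against the algebra...
def over50[Repr[_, _]](alg: GetterAlg[Repr]): Repr[Int, Boolean] =
  alg.gt(alg.id[Int], alg.like(50))

// ...and only acquires a meaning when an interpreter is plugged in.
val over50Fn: Int => Boolean = over50(StandardGetter)
```

The point of the encoding is that `over50` is representation-agnostic: swapping `StandardGetter` for another interpreter reuses the very same expression.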
Remark 7. We will refer to get, preview and getAll as the queries derived from their corresponding optic types. Read-only optics just supply basic reading queries, in a one-to-one mapping. Thereby, the separation of concerns between optic expressions and generated queries is not as clear as with other optics. For instance, lenses broaden their catalog of derived queries with queries to read, replace or even modify the part that the lens is selecting. We further discuss the implications of this insight in Sect. 8.
3. Optica Core
The last section has introduced optics using their so-called concrete representation, but the same querying style is actually supported by other isomorphic representations, such as van Laarhoven or profunctor optics. This section aims at specifying the concepts of getters, affine folds and folds in a generic way, independently of any particular representation. The resulting formalization is a domain-specific language that we have named Optica, which directly supports the optic querying style, and potentially allows for non-standard optic representations that generate queries in XQuery or SQL, for instance, as the next sections will show.

This section will first introduce the syntax and type system of Optica, where standard primitives and combinators are declared. Secondly, we will show how to provide a generic version of the models and queries that we have seen in Sect. 2.2. Finally, we will present the standard semantics that we can use as an interpretation to deal with immutable data structures: it recovers concrete optics and queries, as introduced in the last section.
We introduce the syntax of Optica in Fig. 4. The upper part contains the model types (natural numbers, booleans, strings and products), optic types (getters, affine folds and folds) and query types (selection functions). The second part shows the set of expressions that form the language, which are defined in close correspondence with their concrete counterparts, introduced in the previous section. Despite that, there are several aspects which deserve further explanation:

def empty fl = not (nonEmpty fl)
def all fl p = empty (fl ≫ filtered (not p))
def any fl p = not (all fl (not p))
def elem fl a = any fl (id == like a)

Figure 5: Optica derived definitions

• Constants are not valid query expressions on their own. As we will see later, we use like as the mechanism to represent literals in the language. By doing so, we can reuse optic combinators for constants, improving the language's composability.

• The formal syntax avoids the object-oriented dot notation, idiomatic in Scala, and favours the prefix notation, as is usual practice in related work.

• The methods all, any, elem and empty are not included as syntax primitives. Instead, they are introduced as derived definitions, as can be seen in Fig. 5.

• At present, query expressions are atomic, i.e. it is not possible to compose several of them.

The type system is presented in Fig. 6, where α, β and γ represent model types (see Fig. 4). Unlike T-LINQ [9] or QUEΛ [20], Optica does not introduce terms for variables. Thereby, its type rules are slightly simplified, since they omit the characteristic 'Γ ⊢' prefix. They are structured into four groups, corresponding to getters, affine folds, folds, and their derived queries. The only optic constructors are id∗ and like, which allow us to form new optic expressions from scratch. The remaining combinators should be straightforward, since they exactly correspond with the ones introduced in the companion objects for getters, affine folds and folds of Sect. 2 (with the exception of those combinators, like any, all, elem and empty, which can be defined compositionally in terms of other combinators and do not need to access the internal optic representation). In regard to queries, note that their type rules do not proceed from the companion objects, but from the case class definitions of concrete optics themselves. Their formalization leads to introducing functions as a new semantic domain for Optica. However, note that the part of the language corresponding to optics is purely first-order, i.e. no lambdas are needed in order to create optic expressions.

As we have seen in Sect. 2, we defined domain optics to model the couple and organization examples. Now, we want to do the same, but in a general way. To do so, we need to extend the core language from Sect. 3.1 with new primitives, specific to the particular domain. We present them, along with their associated type rules, in Fig. 7. As can be seen, it introduces the entity types (Couple and Person) and a term for each optic introduced in Sect. 2.2.1. The type rules just determine the type associated to each term, i.e. the kind, whole and part associated to each optic.

(id gt)      id gt : getter α α
(≫gt)        g : getter α β,  h : getter β γ  ⇒  g ≫gt h : getter α γ
(***)        g : getter α β,  h : getter α γ  ⇒  g *** h : getter α (β, γ)
(like)       b : β,  β ∈ base types  ⇒  like b : getter α β
(not)        g : getter α B  ⇒  not g : getter α B
(>)          g : getter α N,  h : getter α N  ⇒  g > h : getter α B
(==)         g : getter α β,  h : getter α β  ⇒  g == h : getter α B
(−)          g : getter α N,  h : getter α N  ⇒  g − h : getter α N
(id af)      id af : affine α α
(≫af)        a : affine α β,  b : affine β γ  ⇒  a ≫af b : affine α γ
(filtered)   p : getter α B  ⇒  filtered p : affine α α
(to af)      g : getter α β  ⇒  to af g : affine α β
(id fl)      id fl : fold α α
(≫fl)        f : fold α β,  g : fold β γ  ⇒  f ≫fl g : fold α γ
(nonEmpty)   f : fold α β  ⇒  nonEmpty f : getter α B
(to fl)      a : affine α β  ⇒  to fl a : fold α β
(get)        g : getter α β  ⇒  get g : α → β
(preview)    a : affine α β  ⇒  preview a : α → option β
(getAll)     f : fold α β  ⇒  getAll f : α → list β

Figure 6: Optica type system

Entity types       t+ ::= Couples | Couple | Person
Optic expressions  e+ ::= couples | fst | snd | name | age

couples : fold Couples Couple
fst : getter Couple Person
snd : getter Couple Person
name : getter Person S
age : getter Person N

Figure 7: Couple syntax and type system

Entity types       t+ ::= Org | Department | Employee | Task
Optic expressions  e+ ::= departments | dpt | employees | emp | tasks | tsk

departments : fold Org Department
dpt : getter Department S
employees : fold Department Employee
emp : getter Employee S
tasks : fold Employee Task
tsk : getter Task S

Figure 8: Organization syntax and type system

Once we have defined the Optica language primitives (where we place the standard optics and combinators) and the domain extension (where we find the structure of the domain data model in terms of specific optics), we should be able to provide generic domain queries. We adapt differences (Sect. 2.2.1) as follows:
Definition 4. The generic versions of differencesFl (optic expression) and differences (query expression) are implemented as follows, in terms of the Optica and couple-specific primitives.

def differencesFl =
  couples ≫ to fl (
    filtered ((fst ≫ age) > (snd ≫ age)) ≫
      to af ((fst ≫ name) *** ((fst ≫ age) − (snd ≫ age))))

def differences = getAll differencesFl

The implementations of the generic versions of differencesFl and differences are basically the same as the ones we introduced in Sect. 2.2.1 —modulo the differences that we listed in Sect. 3.1 and the fact that the invocations to casting methods such as to af and to fl are made explicit. We can carry out the same exercise for the organization example (Sect. 2.2.2). Once again, we introduce a language extension containing the entity types and terms associated to this example (Fig. 8). Once we do that, we are able to introduce a generic counterpart for the expertise query.

T[N] = Int
T[B] = Boolean
T[S] = String
T[(α, β)] = (T[α], T[β])
T[getter α β] = Getter[T[α], T[β]]
T[affine α β] = AffineFold[T[α], T[β]]
T[fold α β] = Fold[T[α], T[β]]
T[α → β] = T[α] ⇒ T[β]
T[α → option β] = T[α] ⇒ Option[T[β]]
T[α → list β] = T[α] ⇒ List[T[β]]

Figure 9: Optica semantic domains
Definition 5. The generic versions of expertiseFl (optic expression) and expertise (query expression) are implemented as follows, in terms of the Optica and organization-specific primitives.

def expertiseFl =
  departments ≫ to fl (
    filtered (all employees (elem (tasks ≫ to fl (to af tsk)) 'abstract')) ≫
      to af dpt)

def expertise = getAll expertiseFl

At this point, we have defined generic queries which are not coupled to any particular querying infrastructure. In the rest of the paper, we will show how to reuse such queries for generating in-memory, XQuery, SQL and comprehension-based queries.
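Definition 5 leans on the derived combinators of Fig. 5 (empty, all, any and elem). To see that these definitions really behave like the usual quantifiers, here is a hedged sketch that spells them out over a minimal concrete Fold/Getter pair; the classes and helper signatures below are ours, not the paper's Sect. 2 definitions.

```scala
// Minimal concrete optics: a Getter is a total selector, a Fold selects many.
case class Getter[S, A](get: S => A)
case class Fold[S, A](getAll: S => List[A])

def not[S](g: Getter[S, Boolean]): Getter[S, Boolean] = Getter(s => !g.get(s))
def nonEmpty[S, A](f: Fold[S, A]): Getter[S, Boolean] =
  Getter(s => f.getAll(s).nonEmpty)
// fl >>fl filtered(p), fused into a single step for brevity
def filtered[S, A](f: Fold[S, A], p: Getter[A, Boolean]): Fold[S, A] =
  Fold(s => f.getAll(s).filter(p.get))

// Derived definitions, following Fig. 5 term by term
def empty[S, A](f: Fold[S, A]): Getter[S, Boolean] = not(nonEmpty(f))
def all[S, A](f: Fold[S, A], p: Getter[A, Boolean]): Getter[S, Boolean] =
  empty(filtered(f, not(p)))
def any[S, A](f: Fold[S, A], p: Getter[A, Boolean]): Getter[S, Boolean] =
  not(all(f, not(p)))
def elem[S, A](f: Fold[S, A], a: A): Getter[S, Boolean] =
  any(f, Getter(_ == a))

val ints: Fold[List[Int], Int] = Fold(identity)
```

Note that the universal quantifier all is obtained purely by double negation over filtering, which is exactly the shape that resurfaces later as nested NOT EXISTS in the SQL interpretation.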
In defining a new language, it is common practice to start with its syntax and type system, and then proceed to define its semantics. In our case, we have proceeded in reverse: we started with the intended semantics (optics and queries) and created an abstract syntax and type system which mimic its structure. Therefore, what is new in this section is how to formalize the connection between the syntax and type system of Optica and concrete optics, its intended semantics. For this task, we provide the semantic functions T (Fig. 9) and E (Fig. 10). The first maps Optica types to their corresponding semantic domains. The second maps an expression of type t to an element of the semantic domain T(t). As can be seen, T just maps types to their Scala counterparts (Scala does not include a standard type for natural numbers; instead of supplying one on our own, we prefer to choose the standard Int type for simplicity). Given this scenario, the implementation of E turns out to be trivial. In fact, we just translate the Optica expressions into their Scala analogues from Sect. 2. We use ⊕ to unify the different binary combinators (>, −, etc.).

E[id gt : getter α α] = Getter.id
E[g ≫gt h : getter α γ] = Getter.andThen(E[g : getter α β], E[h : getter β γ])
E[g *** h : getter α (β, γ)] = Getter.fork(E[g : getter α β], E[h : getter α γ])
E[like b : getter α β] = Getter.like(E[b : β])
E[not g : getter α B] = Getter.not(E[g : getter α B])
E[g ⊕ h : getter α δ] = Getter.⊕(E[g : getter α β], E[h : getter α γ])
E[id af : affine α α] = AffineFold.id
E[g ≫af h : affine α γ] = AffineFold.andThen(E[g : affine α β], E[h : affine β γ])
E[filtered p : affine α α] = AffineFold.filtered(E[p : getter α B])
E[to af g : affine α β] = AffineFold.to af(E[g : getter α β])
E[id fl : fold α α] = Fold.id
E[g ≫fl h : fold α γ] = Fold.andThen(E[g : fold α β], E[h : fold β γ])
E[nonEmpty g : getter α B] = Fold.nonEmpty(E[g : fold α β])
E[to fl a : fold α β] = Fold.to fl(E[a : affine α β])
E[get g : α → β] = E[g : getter α β].get
E[preview g : α → option β] = E[g : affine α β].preview
E[getAll g : α → list β] = E[g : fold α β].getAll

Figure 10: Optica standard semantics

T[Couples] = Couples
T[Couple] = Couple
T[Person] = Person

E[couples : fold Couples Couple] = CoupleModel.couples
E[fst : getter Couple Person] = CoupleModel.fst
E[snd : getter Couple Person] = CoupleModel.snd
E[name : getter Person S] = CoupleModel.name
E[age : getter Person N] = CoupleModel.age

Figure 11: Semantic domains and standard semantics for couples extension

We also need to take into account the evaluation of the extended versions of the language, where terms specific to each example are introduced. For instance, Fig. 11 shows the semantic domains and the evaluation of each term from the couples example extension. It is also trivial, since it just maps the domain-specific terms to the concrete optics from Sect. 2.2.1. The corresponding instance for the organization extension follows the very same pattern and we omit it for brevity. Once we have defined the standard semantics for all terms, we should be able to translate generic queries into plain functions, by means of E. We evaluate differences (Def. 4) as follows:

def differencesR: Couples ⇒ List[(String, Int)] =
  E[differences : Couples → list (S, N)]

As can be seen, the resulting value is a Scala function that works with immutable data structures. Finally, expertise (Def. 5) is evaluated in this way:

def expertiseR: Org ⇒ List[String] =
  E[expertise : Org → list S]

It recovers a Scala function which selects the corresponding department names. These functions are exactly the same as their counterparts from Sect. 2.
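To make this recovery concrete, the function obtained for differences can be written out by hand. A hedged sketch follows: the Person/Couple case classes reconstruct the Sect. 2.2.1 model (which is not reproduced in this excerpt), and the sample data mirrors the rows later shown in Fig. 16.

```scala
// Reconstructed couples model (assumed shape, not the paper's verbatim code)
case class Person(name: String, age: Int)
case class Couple(fst: Person, snd: Person)
type Couples = List[Couple]

// filtered ((fst >> age) > (snd >> age)) followed by the projection
// (fst >> name) *** ((fst >> age) - (snd >> age)), as a plain function
val differencesR: Couples => List[(String, Int)] =
  _.filter(c => c.fst.age > c.snd.age)
   .map(c => (c.fst.name, c.fst.age - c.snd.age))

val couples: Couples = List(
  Couple(Person("Alex", 60), Person("Bert", 55)),
  Couple(Person("Cora", 33), Person("Demi", 31)),
  Couple(Person("Eric", 21), Person("Fred", 60)))
```

Running `differencesR(couples)` keeps only the couples whose first member is older, pairing her name with the age gap.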
4. XQuery
So far, we have seen that optics allow us to manipulate immutable data structures in a modular and elegant way, and that concrete optics can be lifted into the Optica language, a full-blown DSL. The standard semantics of Optica is given in terms of concrete optics; however, this is not significant from the point of view of language-integrated query. The state of real applications is mostly handled through SQL and NoSQL databases, web services, etc.; therefore, this section and the following will show how to reuse Optica expressions in order to generate queries for external data sources beyond in-memory data structures. In particular, this section shows how getters, affine folds and folds from the Optica DSL can actually be given a non-standard representation in terms of XQuery expressions. Prior to that, we will manually adapt the differences and expertise queries and corresponding models into the XML / XQuery setting [38] in an idiomatic way. This will serve as a point of reference to implement the aforementioned semantics. In this sense, there are several assumptions that we need to make in order to map optics and XML models, which are described subsequently.

4.1. XML / XQuery Background
Accommodating objects into XML types is not a trivial task [41]. Figure 12 shows a possible way of encoding the state of the couples example in an XML document. It contains a root element xml from which couples hang as couple elements, which in turn contain subelements for the first (fst) and second (snd) members that form the couple. Finally, name and age are simple tags that contain primitive attributes.

Usually, an XML document is accompanied by an XSD schema, which is essential to validate the information that we place in the document. The schema associated to the couple document can be found in Appendix B.1. Among other things, it prevents us from defining people without a name element, placing non-numerical values as age values, and defining several fst elements inside a couple. Later, we will see that it is important to take this schema into account while implementing queries.

Now, we would like to produce an XQuery expression, analogous to differences. It should be able to collect the name and age difference of all people who occupy the first position in the couple.
E xml[couples : fold Couples Couple] = couple
E xml[fst : getter Couple Person] = fst
E xml[snd : getter Couple Person] = snd
E xml[name : getter Person S] = name
E xml[age : getter Person N] = age

Figure 14: XQuery non-standard semantics for couples extension

…the same semantic domain as the one that we have embraced for queries. Therefore, we define T xml as follows:

T xml[t] = XQuery
In fact, regardless of the input type, it will always evaluate to an XQuery expression. Remark 8 will shed some light on this decision. The rest of the section revolves around the details of E xml and discusses the results.

Before presenting the implementation of E xml, there are several assumptions about the adaptation of the optic models into XML schemas that need to be made, where we basically adopt the same conventions that we have seen in Sect. 4.1. Firstly, we will assume that all information hangs from an xml element, which acts as the root of the XML document. Secondly, we will assume that every optic corresponds to an XML element, where the optic kind determines the cardinality. Finally, optics that select base types are adapted as simple type elements containing a value with the corresponding base type; optics that select domain entity types are adapted as elements with complex type, since they nest other elements in turn. Each of these conventions is assumed in the XSD schemas that can be found in the appendix.

Now, we have all the ingredients that we need to provide the implementation of the XQuery evaluator. Given its simplicity, we will start with the evaluation of the extended terms for the couples example, which we have collected in Fig. 14. As we said in the previous paragraph, optics correspond to XML elements, and thus we represent them as mere element selections. Indeed, optic names are good candidates as tag names. However, we need to adjust the plural names of folds into the singular, as in couple, since this information is supplied as individual elements. The evaluation for the organization model should be straightforward now and does not add any value; therefore, we do not show it.

The evaluations for the core combinators are collected in Fig. 15. We start with the combinators for getters. Firstly, ≫gt is translated as nested access, where the evaluations of the composing expressions g and h are tied together.
E xml[id gt : getter α α] = .
E xml[g ≫gt h : getter α γ] = E xml[g : getter α β] / E xml[h : getter β γ]
E xml[g *** h : getter α (β, γ)] = <tuple><fst>E xml[g : getter α β]</fst><snd>E xml[h : getter α γ]</snd></tuple>
E xml[like b : getter α β] = b
E xml[not g : getter α B] = not(E xml[g : getter α B])
E xml[g ⊕ h : getter α δ] = (E xml[g : getter α β] ⊕ E xml[h : getter α γ])
E xml[id af : affine α α] = .
E xml[g ≫af h : affine α γ] = E xml[g : affine α β] / E xml[h : affine β γ]
E xml[filtered p : affine α α] = .[E xml[p : getter α B]]
E xml[to af g : affine α β] = E xml[g : getter α β]
E xml[id fl : fold α α] = .
E xml[g ≫fl h : fold α γ] = E xml[g : fold α β] / E xml[h : fold β γ]
E xml[nonEmpty g : getter α B] = exists(E xml[g : fold α β])
E xml[to fl a : fold α β] = E xml[a : affine α β]
E xml[get g : α → β] = /xml/E xml[g : getter α β]
E xml[preview g : α → option β] = /xml/E xml[g : affine α β]
E xml[getAll g : α → list β] = /xml/E xml[g : fold α β]

Figure 15: XQuery non-standard semantics

For *** we use XML interpolation, where the evaluations of the composing expressions are placed in the corresponding projection elements. Finally, id gt is interpreted as a self reference (.), which is neutral under composition. Now, we move on to standard getter constructions, beginning with like. Since it produces constant optics, whose part does not depend on the surrounding whole, we decide to map them to XQuery literals. Next, we can see that not is interpreted as the function not, and binary combinators, which are unified by the symbol ⊕, are interpreted as the corresponding XQuery operations.

Moving on to affine folds, we find that the composition and identity primitives share the same implementation as the ones we have seen for getters.
This situation —which also occurs in fold combinators— demonstrates that we do not make a difference between semantic domains in the interpretation. In fact, if we understand XQuery as a representation of an affine fold, it is natural that we can also use it as a representation of a getter, and the implementation of to af confirms this intuition. This module also contains filtered. Since we have a filtering mechanism available in XQuery, we simply interpret this primitive into square brackets ([]), passing the semantics of the predicate getter as an argument to it.

Finally, we present the fold-related method nonEmpty. In this particular case, we need to adapt any fold into a getter that selects a boolean. Luckily, XQuery provides a function exists which turns XQuery expressions into booleans. It does so by checking that the result produced by the query is not empty. It might have been noticed that exists was not even mentioned in the background section. In fact, it was not necessary, since the not function does the trick by turning an expression denoting a sequence into a boolean. In particular, not(exists(sq)) (where sq denotes a sequence of elements) is equivalent to not(sq). However, while evaluating, we do not know whether the exists invocation denoted by nonEmpty will be consumed by a function like not (denoted by another expression), and thus we need to invoke exists explicitly.

As final notes, we must say that interpreting optic expressions like differencesFl (Def. 4) or expertiseFl (Def. 5) leads to relative queries, i.e. queries that do not start with / and which are relative to the current context. Those queries are valid XQuery expressions, but they will not produce any results if we run them against the XML document which contains the whole hierarchy. Fortunately, we could easily compose such relative queries with the ones generated by external models to produce queries over more complex domains. Leaving this possibility aside, the next section shows the final refinement that we need to perform in order to obtain the expected XQuery expressions.

The evaluation of query expressions from Optica can be found at the bottom of Fig. 15. Since both optic and query types denote an XQuery expression, the semantics of query expressions is almost direct. The only caveat is that get, preview and getAll prepend the /xml fragment to the relative query obtained in the optic representation. This is just a consequence of one of the assumptions that we made when adopting XML, where we stated that an <xml/> root element was necessary by convention. Thereby, we take the opportunity to prepend it here. At this point, with the required evaluations at hand, we should be able to recover the target queries. As a result, E xml[differences] provides the following XQuery expression:

/xml/couple[fst/age > snd/age]/
As we have seen throughout this section, both optic and query types are evaluated into the same semantic domain XQuery. Indeed, if we leave interpolation facilities aside, this is essentially an interpretation into XPath, which is just a language to select parts from an XML document, just like optics select parts from immutable data structures. In this sense, it is only natural that XQuery can behave as a non-standard optic representation.
5. SQL
SQL is a query language for relational data sources, which greatly differs from the hierarchical nature of both XML and optic models. Nevertheless, this section will show that we can generate SQL statements from Optica expressions. Firstly, we manually adapt the couple and organization examples into the SQL setting to better understand the kind of queries that we want to produce. Then, we present the SQL non-standard semantics of Optica and the assumptions that we build upon in order to automatically generate queries analogous to the ones that we have obtained manually.

As opposed to XML, relational databases are organized around flat data sources. As a consequence, we face the object-relational impedance mismatch [5] when trying to accommodate the object models underlying optics into the relational setting. Fortunately, there are patterns that we can embrace to approach this task, like the Foreign Key Aggregation or the Foreign Key Association patterns [42]. We take them as a reference and propose the following tables to adapt the couples example model that we introduced in 2.2.1:
CREATE TABLE Person (
  name varchar(255) PRIMARY KEY,
  age int NOT NULL
);

CREATE TABLE Couple (
  fst varchar(255) NOT NULL,
  snd varchar(255) NOT NULL,
  FOREIGN KEY (fst) REFERENCES Person(name),
  FOREIGN KEY (snd) REFERENCES Person(name)
);

Footnote: Once again, these invocations could be removed from the resulting query by means of annotations, as in [19], but we wanted to keep the interpretation compositional in order to make it simpler.
Footnote: http://basex.org/basex/xquery/

name  age
Alex  60
Bert  55
Cora  33
Demi  31
Eric  21
Fred  60

(a) Person

fst   snd
Alex  Bert
Cora  Demi
Eric  Fred

(b) Couple

Alex  5
Cora  2

(c) Differences

Figure 16: Data for the couples example.
As can be seen, case classes are adapted as tables and their attributes are adapted as columns. Once again, as we have seen in the XQuery interpretation, it is necessary to distinguish between attributes which contain base types and attributes containing other entities. In fact, attributes that refer to entities require pointers to establish the precise connections between the adapted tables, following the Foreign Key Aggregation pattern. We assume figures 16a and 16b as the initial state for these tables, where the columns in Couple are clearly selecting names from Person.

Previously, we saw that the adaptation of differences in the XML setting produced XML as output. We are now dealing with SQL tables, where the output of a statement is a table itself. Thereby, we would expect Fig. 16c as the result of executing the adaptation of differences. In particular, we could produce such output with the following query:
SELECT w.name, w.age - m.age
FROM Couple c
INNER JOIN Person w ON c.fst = w.name
INNER JOIN Person m ON c.snd = m.name
WHERE w.age > m.age;
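Before examining the statement clause by clause, it may help to replay it in memory. A hedged sketch follows, over the rows of Fig. 16; the row case classes are our own encoding of the tables, and the for-comprehension mirrors the join/filter/projection structure of the SQL.

```scala
// Rows of the Person and Couple tables (Fig. 16a and 16b)
case class PersonRow(name: String, age: Int)
case class CoupleRow(fst: String, snd: String)

val person = List(
  PersonRow("Alex", 60), PersonRow("Bert", 55), PersonRow("Cora", 33),
  PersonRow("Demi", 31), PersonRow("Eric", 21), PersonRow("Fred", 60))
val couple = List(
  CoupleRow("Alex", "Bert"), CoupleRow("Cora", "Demi"), CoupleRow("Eric", "Fred"))

// FROM Couple c INNER JOIN Person w ON c.fst = w.name
//               INNER JOIN Person m ON c.snd = m.name
// WHERE w.age > m.age
// SELECT w.name, w.age - m.age
val result: List[(String, Int)] =
  for {
    c <- couple
    w <- person if c.fst == w.name
    m <- person if c.snd == m.name
    if w.age > m.age
  } yield (w.name, w.age - m.age)
```

The generators play the role of the FROM variables c, w and m, the guards correspond to the join conditions and the WHERE clause, and the yield is the SELECT projection, reproducing Fig. 16c.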
This statement is clearly separated into three major sections. First, we describe FROM, which builds the raw table that the other parts use to gather information from. This table is created by joining the table Couple with two occurrences of table Person, thereby incorporating the information from the couple members fst and snd. Variables c, w and m allow us to refer to these three tables. Second, the WHERE clause introduces filters that are applied over the compound table to discard the rows that do not match the criteria: those where the age of the first member is not greater than the age of the second one. Last, the SELECT clause indicates the columns that we are interested in: the name of the first member and the age difference.

Now, we move on to the organization example. First of all, we create tables for departments, employees and tasks following the same adaptation pattern:

CREATE TABLE Department (
  dpt varchar(255) PRIMARY KEY
);

CREATE TABLE Employee (
  emp varchar(255) PRIMARY KEY,
  dpt varchar(255) NOT NULL,
  FOREIGN KEY (dpt) REFERENCES Department(dpt)
);

CREATE TABLE Task (
  tsk varchar(255) NOT NULL,
  emp varchar(255) NOT NULL,
  FOREIGN KEY (emp) REFERENCES Employee(emp)
);
All components in the previous statements should be familiar at this point, but there is an important change in the way we configure foreign keys. As we have seen in the couples example, getters selecting entities were mapped into a column containing a foreign key. However, the organization example contains multivalued attributes, like employees or tasks, that should not be adapted as a single column. For this situation we adopt the Foreign Key Association pattern. We assume that these tables have been populated with the data in figures 17a, 17b and 17c.

dpt
Product
Quality
Research
Sales

(a) Department

emp   dpt
Alex  Product
Bert  Product
Cora  Research
Demi  Research
Eric  Research
Fred  Sales

(b) Employee

tsk       emp
build     Alex
build     Bert
abstract  Cora
build     Cora
design    Cora
abstract  Demi
design    Demi
abstract  Eric
call      Eric
design    Eric
call      Fred

(c) Task

dpt
Quality
Research

(d) Expertise

Figure 17: Data for the organization example.

As we have seen before, Quality and Research are the departments where all employees are able to abstract; therefore, the adaptation of expertise should produce Fig. 17d as a result. We propose the following query to generate it:
SELECT d.dpt
FROM Department AS d
WHERE NOT (EXISTS (
  SELECT e.*
  FROM Employee AS e
  WHERE NOT (EXISTS (
    SELECT t.*
    FROM Task AS t
    WHERE (t.tsk = "abstract") AND (e.emp = t.emp)))
  AND (d.dpt = e.dpt)));

Reading this query is by no means trivial. Fortunately, it shares the same pattern as the query that we saw while adapting expertise in the XQuery setting. In fact, EXISTS is a function that returns true as long as the nested statement produces non-empty results. If we combine it with NOT to negate predicates, we can check whether all rows satisfy a condition. Beyond the noise generated by this pattern, there are additional filters which manifest relations between nested and outer variables, introducing even more complexity into the picture.
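The doubly-nested NOT EXISTS pattern becomes easier to read once replayed in memory: NOT EXISTS of a counterexample is just a universal quantification. A hedged sketch over the rows of Fig. 17 (row case classes are our own encoding of the tables):

```scala
// Rows of the Employee and Task tables (Fig. 17b and 17c)
case class EmployeeRow(emp: String, dpt: String)
case class TaskRow(tsk: String, emp: String)

val department = List("Product", "Quality", "Research", "Sales")
val employee = List(
  EmployeeRow("Alex", "Product"), EmployeeRow("Bert", "Product"),
  EmployeeRow("Cora", "Research"), EmployeeRow("Demi", "Research"),
  EmployeeRow("Eric", "Research"), EmployeeRow("Fred", "Sales"))
val task = List(
  TaskRow("build", "Alex"), TaskRow("build", "Bert"),
  TaskRow("abstract", "Cora"), TaskRow("build", "Cora"), TaskRow("design", "Cora"),
  TaskRow("abstract", "Demi"), TaskRow("design", "Demi"),
  TaskRow("abstract", "Eric"), TaskRow("call", "Eric"), TaskRow("design", "Eric"),
  TaskRow("call", "Fred"))

// d is kept iff there is NO employee of d withOUT an "abstract" task,
// mirroring WHERE NOT (EXISTS (... WHERE NOT (EXISTS (...))))
val expertiseR: List[String] = department.filter { d =>
  !employee.exists(e =>
    e.dpt == d && !task.exists(t => t.tsk == "abstract" && t.emp == e.emp))
}
```

Note how Quality qualifies vacuously: it has no employees, so no counterexample exists, exactly as in the SQL statement.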
The peculiarities of SQL have been known for a long time now [43]. As Date states in connection with certain aspects of SQL, "there is so much confusion in this area that it is difficult to criticize it coherently". Part of the problem resides in the fact that the formal definition of SQL was produced after the fact, where many academic considerations were neglected. Consequently, the language does have its weak points, where the lack of orthogonality becomes a central issue. Although many deficiencies have been remedied in the last decades, the obtrusive syntax of the SELECT statement remains a problem. For example, despite the fact that relational algebra combinators may appear in any order, the rigid structure of SELECT statements might demand the programmer to recast a relational algebra expression that is considered natural (like UNION (tabexp1, tabexp2)) into a semantically equivalent form, compliant with the SQL standard (like (SELECT ... FROM ... WHERE ...) UNION (SELECT ... FROM ... WHERE ...)). Fortunately, [16] supplies a list of syntactic rules which we can use to rewrite any expression from an ordinary impure functional programming language into its SQL form. Optica expressions share with relational algebra the purely compositional character of algebraic expressions; hence, they also require a set of transformations before being able to be translated to SQL queries. These transformations will not be carried out on the optic expression itself, but through a new semantic domain which plays the role of an intermediate expression that can be directly translated to SQL.

T sql[getter α β] = Triplet → Triplet
T sql[affine α β] = Triplet → Triplet
T sql[fold α β] = Triplet → Triplet
T sql[α → list β] = (S → S) → SQL
T sql[N] = Fragment
T sql[B] = Fragment
T sql[S] = Fragment

Figure 18: SQL semantic domains
Accordingly, the new semantic domains defined by the semantic function T sql are shown in Fig. 18. Firstly, all optic types are mapped to a Triplet endofunction. Triplets are the intermediate expressions which lie between optic and SQL expressions, whose major purpose is to reconcile the main disagreements among them. Secondly, since we aim at generating SQL statements, the semantic domain associated to query types is, as expected, an SQL expression. However, it is required to supply a function (S → S) that maps relational table names to the column name which corresponds to the primary key —information that is not contemplated by the optic model— in order to produce SQL statements. It is important to remark that the types of the queries get and preview are ignored here. Later on we will explain why this partiality is needed. Finally, base types are mapped to triplet fragments, i.e. their evaluation will be used to form triplets.

For the rest of the section we will proceed as usual, introducing the semantic function E sql, which is responsible for evaluating domain, optic and query terms, and we will conclude discussing the results. Prior to that, we find it essential to describe the details of the intermediate structure Triplet.

As we have just seen, a SQL select statement exhibits a remarkable separation of concerns, where selection and filtering, although sharing syntax, belong to different query clauses. This separation requires a unifying mechanism to refer to the very same item from both clauses. SQL solves this problem by means of variables declared in the FROM clause, which are accessible from the SELECT and WHERE scopes.

This way of representing queries in SQL contrasts with its optic counterpart. In Optica, the aspects of selection and filtering may appear anywhere within the expression. Moreover, the information required by these components does not need to be collected in a single FROM component, but specified on demand. In optic expressions, there is no need for variables either, since it is the context where two optics appear that determines whether they are selecting the same item or not. For example, consider the following optic expression (this query is only a less direct way of implementing the query under50 from Sect. 1), where we find two occurrences of fst:

couples ≫ filtered (fst ≫ age < like 50) ≫ fst ≫ name

Despite having one of them surrounded by filtered, we can see that both of them are selecting the very same person. Furthermore, note that the information required by the filtering expression (the age of the first member) is collected within the predicate scope and not shared globally.

Figure 19: Triplet generated for differencesFl.
Triplet is the data structure that we use as an intermediary to alleviate the aforementioned disagreements. Its main objective is to segregate, out of an Optica expression, the three different aspects which are evident in a SELECT statement. In particular, a triplet is made of three components which correspond to the SELECT, FROM and WHERE clauses, respectively. We present an informal view of this concept in Fig. 19, where we represent the triplet associated to the expression differencesFl (Def. 4). A triplet may be considered as a structured optic whose actual focus is determined through three components:

• The middle component determines the potential focus of the optic. In particular, Fig. 19 shows this component as a trie (https://en.wikipedia.org/wiki/Trie) whose edges are optics focusing on entity types (not base types). Its elements are sequences of optics that represent a vertical composition, e.g. the sequence made of the primitive fold couples and the getter fst represents the fold couples ≫ fst. The figure labels each node with a distinct name that refers to the entities of that unique path. In this example we potentially refer to the list of couples (c) and two lists of people: their first (w) and second (m) members. The nodes of the trie, colored in black, and their associated names can be reused in the left and right components.

• The right component further constrains the potential collections of entities identified by the entity trie, by imposing conditions over them. In the example, there is just one condition that restricts the collection of couples (and, consequently, its dependent collections of people) to those where her age (w) is greater than his (m). These conditions are represented in terms of directed graphs whose edges are optics or binary combinators, like >, which make two different paths converge. Note that red nodes form restriction graphs.

• Last, the left component defines the actual selection of the overall optic by selecting in sequence certain collections from the entity trie, and possibly by further refining them through additional optic expressions selecting base values. In the example, we select her name and the age difference of the couple (which will be greater than 0) according to the constraints which were imposed by the right component. Selections are represented using the same graphs as in the constraint component, but the nodes forming them are colored in blue.

t ::= (s, f, w)
s ::= (e₁, e₂, …, eₙ)
f ::= / | insert ˆp f
w ::= {e₁, e₂, …, eₙ}
e ::= like c | not e | e > e | e == e | e − e | ˆp | ˆp.optic | nonEmpty t
ˆp ::= (optic₁, optic₂, …, opticₙ)

Figure 20: Triplet syntax

We formalize the notion of triplet in Fig. 20 through its associated syntax. As we have pointed out previously, the middle component is just a trie whose keys are primitive optic expressions focusing on entities. Thus, the elements stored in the trie are sequences of such expressions, which we will refer to as paths (ˆp). (We will use hats, as in ˆp, to emphasize the terms which correspond to paths.) Entity tries may be the empty trie, /, or the result of inserting a new path, insert ˆp f. The left and right components, s and w, are a sequence and a set of expressions (e), respectively. Repeated restrictions in w are redundant and their ordering is irrelevant, which is why a set is chosen. Expressions e are very similar to those from Optica (like, not, >, etc.), but there are a few major changes that deserve further explanation. Essentially, expressions do not include vertical composition as such; instead, if the vertical composition selects an entity, it is simply represented through a path from the entity trie. Otherwise, if it selects a base type, it is represented as the projection of an attribute from a path, as in ˆp.optic. For instance, the Optica expression couples ≫ fst would denote the path (couples, fst), while the expression couples ≫ fst ≫ name would denote the projection (couples, fst).name. Horizontal composition is also unneeded, since the left component is able to collect a sequence of single selections.
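To make the shape of Fig. 20 concrete, the following is a minimal Python sketch of the triplet structure, with paths encoded as tuples of primitive-optic names and the entity trie as its prefix-closed set of paths. All names here (Triplet, insert, EMPTY) are illustrative choices of this sketch, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triplet:
    select: tuple      # left component: sequence of selection expressions
    trie: frozenset    # middle component: entity trie as its set of paths
    where: frozenset   # right component: set of restriction expressions

def insert(path, trie):
    """Inserting a path into the trie also inserts all of its prefixes."""
    return trie | frozenset(path[:i] for i in range(len(path) + 1))

# The empty triplet (Def. 6): a single empty-path selection, the empty trie
# and no restrictions.
EMPTY = Triplet(select=((),), trie=frozenset({()}), where=frozenset())

# The entity trie for differencesFl holds the paths c = (couples),
# w = c + (fst) and m = c + (snd):
trie = insert(("couples", "fst"), insert(("couples", "snd"), EMPTY.trie))
```

Keeping the trie prefix-closed mirrors the fact that inserting couples ≫ fst also makes the intermediate couples node (c in Fig. 19) available for reuse.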
Finally, expressions also contain a nonEmpty term, where we keep a snapshot of a triplet that is later used to produce nested queries. Section 5.2.4, where we formalize the precise correspondence between triplets and SQL, will show that a nonEmpty term is eventually translated into an EXISTS operator.

At this point, we might consider using Triplet as the chosen optic representation. However, composing the different triplets generated by optic subexpressions turns out to be a clumsy task. Instead, we would like to use a representation with better compositional guarantees. In this sense, it is more convenient to use a triplet endofunction, so that each subexpression can describe the precise transformation that it performs over the input triplet when it is composed through vertical composition. This is how we obtain Triplet → Triplet, the chosen semantic domain for optic types. (This is reminiscent of the functional representation of difference lists, where concatenation is implemented in terms of plain composition, and the list is recovered by passing the empty list as input. In this case, the analogues of the empty list and concatenation are the empty triplet (Def. 6) and vertical composition.) We illustrate the idea behind this function in Fig. 21, which shows the evolution of the differencesFl query, starting from the empty triplet. The arcs in this figure are labelled by the optic subexpressions that identify the applied transformations. As expected, the last triplet in the chain corresponds to the structure that we presented in Fig. 19. We will detail these steps throughout the next sections while presenting the E_sql definition.

This section provides the semantics of primitive optics from the domain syntax, such as couples, fst, etc., in terms of the precise transformations that they carry out over an input Triplet. Their formalization can be found in Fig. 22, using the couples domain for illustration purposes. Note that ⌢ represents the concatenation of sequences. Before explaining this formalization, we will describe the occurrences of domain primitives in the particular example shown in Fig. 21, as well as in Fig. 24, where Step b) is shown in detail.

• Step a) shows the changes introduced by the term couples.
This is a very special case since it takes the initial triplet as input. As can be seen, the new changes consist of introducing the new path in the trie and selecting it in the left component. Bear in mind that we can only introduce in the trie optics selecting domain entities, like couples, which selects a sequence of Couple entities.

• Step b) contains more domain terms in the predicate. In particular, Step b1) (Fig. 24) shows the changes introduced by fst when it is applied to a triplet that focuses on couples. Since this optic focuses on entities and the input triplet is not empty, the result is a triplet that extends the entity trie by appending the new optic to the couples path, changing its focus (i.e. the left part of the triplet) to the new path w.

• Step b3) represents the changes introduced by age. In this case, we deal with an optic selecting a base type N. Thereby, it cannot be introduced in the trie. Instead, we refine the focus of the input triplet, which becomes a projection through the new optic.

Fortunately, the behaviour of the first and second cases can be factorized, as long as we contemplate the following definition for the empty triplet:

Definition 6. We formalize the empty triplet as the one that contains a single selection ˆ(), an empty trie and an empty set of restrictions.

empty = ((ˆ()), /, ∅)

The empty sequence () is used in tries to refer to the root, and thereby we use it as the initial path in the left component.

However, our formalization must take into account the distinction between optics selecting entities and optics selecting base types. Figure 22 introduces the functions base and entity for this task, where ↦ just represents the standard "maps to" notation for functions. They take the optic expression as parameter and produce triplet endofunctions as result. The rest of the implementation should be straightforward, given the previous explanations. We do not show the evaluation of the organization terms, since they follow the very same pattern, exploiting entity and base.

Figure 21: Triplet evolution for differencesFl

E_sql[_ : op α β where op ∈ {getter, affine, fold}] :: Triplet → Triplet
E_sql[couples : fold Couples Couple] = entity couples
E_sql[fst : getter Couple Person] = entity fst
E_sql[snd : getter Couple Person] = entity snd
E_sql[name : getter Person S] = base name
E_sql[age : getter Person N] = base age

base b = ((ˆx), f, w) ↦ ((ˆx.b), f, w)
  where ˆx is an element of f
entity e = ((ˆx), f, w) ↦ ((ˆy), f2, w)
  where ˆy = ˆx ⌢ (e) and f2 = insert ˆy f and ˆx is an element of f

Figure 22: Triplet non-standard semantics for the couples extension
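The base and entity endofunctions of Fig. 22 can be sketched in Python as follows. The encoding (dicts with select/trie/where keys, tuple paths) is an assumption of this illustration, and the single-selection destructuring relies on Prop. 1.

```python
def entity(e):
    """Fig. 22's `entity`: extend the focused path with optic e and
    register the extended path (and its prefixes) in the trie."""
    def go(t):
        (x,) = t["select"]                  # single selection (Prop. 1)
        y = x + (e,)
        prefixes = {y[:i] for i in range(len(y) + 1)}
        return {"select": (y,), "trie": t["trie"] | prefixes,
                "where": t["where"]}
    return go

def base(b):
    """Fig. 22's `base`: refine the focused path into a projection of the
    base-typed attribute b; the trie is left untouched."""
    def go(t):
        (x,) = t["select"]
        return {"select": ((x, b),), "trie": t["trie"], "where": t["where"]}
    return go

empty = {"select": ((),), "trie": {()}, "where": set()}

# couples ≫ fst ≫ name evaluates to the chaining of three endofunctions,
# starting from the empty triplet (Def. 6):
t = base("name")(entity("fst")(entity("couples")(empty)))
```

Note how the base-typed step (name) changes only the focus, while the entity steps (couples, fst) also grow the trie, exactly the factorization discussed above.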
This section specifies the triplet transformations that are associated to the Optica core combinators, which can be found in Fig. 23. Before delving into the semantics of the getter, affine fold and fold combinators, we introduce the next definitions, which will be useful for ensuring the consistency of the formalization:

Definition 7. Given e : optic β γ, where optic ∈ {getter, affine, fold}, a triplet t is a valid input for e if one of the following conditions holds: (1) t = empty; (2) t = E_sql[e′ : optic′ α β] t′, for some e′, t′ such that t′ is a valid input for e′.

Basically, an input triplet is valid for a given optic e if it is the empty triplet (Def. 6), or if it is the result obtained from evaluating an optic expression e′ with a valid input, where the 'part' type of optic e′ coincides with the 'whole' type of e.

Definition 8. A singleton model type is either a base type or a domain type, i.e. it is the result of discarding product types from Optica model types.

Proposition 1. Let e : optic α β, where optic ∈ {getter, affine, fold}, β a singleton model type, and t a valid input for e; then:

((s), _, _) = E_sql[e : optic α β] t

The proposition states that, given a valid input, the result of evaluating an optic that selects a singleton model type always contains a single selection s. This can be easily proven by induction, since all combinators producing optics that select singleton model types generate a single expression as left component, according to Fig. 23. In fact, this proposition turns out to be necessary for the implementations of base and entity (Fig. 22) to be well-defined.

Getters.
First, we describe the implementation of ≫_gt. As Fig. 21 suggests, vertical composition should be evaluated as the chaining of transformations, i.e. as function composition. Consequently, id_gt is implemented as the identity function, meaning no transformation at all.

E_sql[_ : op α β where op ∈ {getter, affine, fold}] :: Triplet → Triplet
E_sql[id_gt : getter α α] = t ↦ t
E_sql[g ≫_gt h : getter α γ] = E_sql[h : getter β γ] · E_sql[g : getter α β]
E_sql[g ∗∗∗ h : getter α (β, γ)] = t ↦ (s₁ ⌢ s₂, f₁ ⋎ f₂, w₁ ∪ w₂)
  where (s₁, f₁, w₁) = E_sql[g : getter α β] t
  and (s₂, f₂, w₂) = E_sql[h : getter α γ] t
E_sql[like b : getter α β] = (_, f, w) ↦ ((like b), f, w)
E_sql[not g : getter α B] = (((b), f, w) ↦ ((not b), f, w)) · E_sql[g : getter α B]
E_sql[g ⊕ h : getter α δ] = t ↦ ((b₁ ⊕ b₂), f₁ ⋎ f₂, w₁ ∪ w₂)
  where ((b₁), f₁, w₁) = E_sql[g : getter α β] t
  and ((b₂), f₂, w₂) = E_sql[h : getter α γ] t
E_sql[id_af : affine α α] = t ↦ t
E_sql[g ≫_af h : affine α γ] = E_sql[h : affine β γ] · E_sql[g : affine α β]
E_sql[filtered g : affine α α] = (s, f, w) ↦ (s, f′, {b} ∪ w)
  where ((b), f′, ∅) = E_sql[g : getter α B] (s, f, ∅)
E_sql[to_af g : affine α β] = E_sql[g : getter α β]
E_sql[id_fl : fold α α] = t ↦ t
E_sql[g ≫_fl h : fold α γ] = E_sql[h : fold β γ] · E_sql[g : fold α β]
E_sql[nonEmpty g : getter α B] = (s, f, w) ↦ ((nonEmpty (E_sql[g : fold α β] (s, f, ∅))), f, w)
E_sql[to_fl a : fold α β] = E_sql[a : affine α β]

Figure 23: Triplet non-standard semantics

Figure 24: Filtered step in detail

Step e) (Fig. 21) shows an example of horizontal composition (∗∗∗), where a pair of diverging triplets are somehow combined.
In this special case, the changes are only reflected in the left component, since the middle and right components are exactly the same in both triplets. The evaluation of ∗∗∗ supplies the input triplet t as an argument to the evaluations of g and h, which results in a pair of diverging triplets, as those in the illustration. To combine them, we concatenate the selections s₁ and s₂, we merge the tries f₁ and f₂ (⋎), and we take the union of the sets of restrictions w₁ and w₂. Both the union of sets and the merging of tries are idempotent operations.

Next, we find like and not as examples of unary standard combinators which just update the left component of the input triplet. The former ignores the previous selection and replaces it by the constant value. The latter transforms the triplet by applying the operation over the current selection. The evaluation demands a single expression as input, where we rely on Prop. 1. Moreover, the Optica type system guarantees that such an expression represents an optic selecting a boolean.

Finally, Step b5) (Fig. 24) represents a binary combinator. The situation is very similar to that of ∗∗∗. However, instead of concatenating the selections, their single components are fused into the corresponding expression. The evaluation of this term assumes that the triplets which derive from the evaluations of g and h contain singleton selections. Once again, we rely on Prop. 1, since all the binary combinators that we can find in Optica take base types as operands.

Affine Folds. As in the XQuery evaluation, the composition and identity primitives are exactly the same as those we have just presented for getters. In addition, to_af only returns the evaluation of its argument. The same applies to folds. Consequently, there only remains filtered. As we have seen before, Step b) (Fig. 21) represents this combinator, which was further detailed in Fig. 24 given its complexity.
This figure shows an inner box that describes the triplet evolution specified by the predicate, which starts from the same input triplet as the whole filtered expression. The rest of the evolution inside the box should be straightforward by now. However, it has yet to be explained how to move from the result of Step b5) to the final result of Step b). Informally, what happens in this example is that the selection of the whole expression does not change at all, which seems meaningful since the filter expression should not change the focus; the left component of the inner expression represents the predicate, which becomes a new constraint in the right component of the resulting triplet; last, the middle component remains unchanged in this particular case.

The evaluation of filtered in Fig. 23 formalizes the previous intuitions. Firstly, the overall input triplet is passed as argument to the evaluation of the predicate, with its right component reset to the empty set; the triplet generated by the predicate is in turn expected to contain an empty set of restrictions, since getters are unable to update the restriction component of the triplet. In the resulting triplet, the selection s passes as is, while the restriction that was selected in the predicate is appended to the existing ones in w. Finally, note that the new entity trie results from the inner triplet, f′, since new paths may have been created internally.

Folds.
Lastly, we present the interpretation of nonEmpty, which introduces a significant difference in comparison with the rest of combinators: it takes a fold as parameter. The evaluation of folds is problematic, since they lead to the introduction of nested queries in this infrastructure, as we will see later. This is the reason why we use the nonEmpty term from Fig. 20 here, which stores the triplet generated by the inner fold so that the corresponding nested query can be produced later on. (Incidentally, this step is better suited to illustrate the result of merging two tries.)
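The core-combinator transformations just described (∗∗∗, the binary combinators and filtered) can be sketched in Python under the same illustrative dict encoding of triplets; the helper names (combine, hcomp, binop) are made up for this sketch.

```python
# ∗∗∗ and the binary combinators share one combination step: concatenate
# (or fuse) the selections, merge the tries and union the restriction
# sets; both merge and union are idempotent, so shared structure is kept.
def combine(t1, t2, fuse=None):
    select = (t1["select"] + t2["select"] if fuse is None
              else (fuse(t1["select"][0], t2["select"][0]),))
    return {"select": select,
            "trie": t1["trie"] | t2["trie"],
            "where": t1["where"] | t2["where"]}

def hcomp(g, h):                        # g ∗∗∗ h
    return lambda t: combine(g(t), h(t))

def binop(op, g, h):                    # e.g. g > h, g − h
    return lambda t: combine(g(t), h(t), fuse=lambda a, b: (op, a, b))

# filtered: run the predicate with an emptied restriction set, move its
# single boolean selection (Prop. 1) into the restriction set, keep the
# inner trie and restore the original focus.
def filtered(pred):
    def go(t):
        inner = pred({"select": t["select"], "trie": t["trie"],
                      "where": set()})
        (b,) = inner["select"]
        return {"select": t["select"], "trie": inner["trie"],
                "where": t["where"] | {b}}
    return go
```

A toy run with two selections standing for w.age and m.age reproduces the shape of Steps b) and e) in Fig. 21: the binary combinator fuses them into one expression, and filtered moves that expression into the restriction set while leaving the focus untouched.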
Remark 9.
An Optica expression is always translatable into a triplet endofunction, as evidenced by the total implementation of E_sql, where Prop. 1 has proven essential. In fact, this evaluation just consists of moving things around to adapt Optica expressions to the triplet configuration. Unfortunately, translating triplets into SQL statements is a partial process, as described in the following section.

We have designed triplets to be easily translatable into SELECT statements. This is clearly evidenced in Fig. 25, where we compare the triplet generated for differencesFl (Def. 4) and the expected SQL query that we presented in the background (Sect. 5.1) for the same example. While the translation of the expressions in the left and right components is straightforward, the generation of the FROM clause from the middle component requires further explanation. We present the formal translation of Optica query expressions into SQL in Fig. 26. What first calls our attention is the absence of translations for get and preview. In fact, it is only possible to produce a SQL statement from getAll. As suggested in Remark 9, the translation of triplets into SQL statements is a partial process.
Preconditions. We describe the precise conditions that an Optica query should satisfy in order to produce a valid SQL statement (an error should be raised if any of them is not satisfied):

1. The optic selected type, i.e. its 'part', is a flat type. For instance, couples is not translatable into SQL, since it selects Couple as part, which contains nested references to the entity Person. The expression couples ≫ fst is valid, since Person does not contain further nested data structures: name and age are plain values.

2. The expression cannot contain a fold selecting a base type. For example, departments ≫ employees ≫ tasks is valid, since all the involved folds select entity types.

3. The original kind (ignoring castings) of the leftmost expression forming a query has to be a fold. For example, couples ≫ fst ≫ name is translatable into SQL (it starts with the couples fold) while fst ≫ name is not (it starts with the fst getter). Thereby, get and preview are omitted, since getter and affine fold expressions do not satisfy this condition.

We will further motivate each limitation in the following paragraphs, where the whole process of generating SQL statements from triplets is described.

Since we aim at turning triplets into SQL expressions, the very first step is to produce a triplet. We achieve this by evaluating the fold expression that accompanies getAll and supplying the empty triplet (Def. 6) to the resulting function. Then, we need to refine the entity trie of the obtained triplet by assigning fresh names to each of its paths (which the evaluation function E_sql does not generate). Last, we pass the refined triplet as argument to the actual translator (sql). Besides the triplet, note that this function receives an additional argument, ˆlocal, that specifies the local path at which the translation takes place. The sql function will be used to translate both the whole SQL query and the inner queries of nonEmpty expressions.
In this very first invocation, we aim at translating the whole triplet; thus, we pass as ˆlocal the ˆtop of the entity trie, which represents the common prefix of every path of the trie.

The sql function delegates the generation of each clause of the whole SELECT statement to the corresponding functions select, from and where. Moreover, it calls an additional function, where+, whose purpose will be motivated later on. The results obtained from each function are concatenated to form the final query. Note that parentheses and brackets are discarded in the result; they are simply introduced to delimit the arguments supplied to each function. In particular, an invocation surrounded by brackets indicates that the invocation may be omitted, according to the accompanying conditions. We describe the generation process of each clause in the next paragraphs, where we will make frequent use of the following additional definitions:

ρ.ˆtop       The key which starts every path of the entity trie, if any
ρ.local(ρ′)  The local path of ρ which is extended by ρ′, if any
ρ(ˆp)        The name assigned to the given path in the refined trie
ˆp.last      The key which finishes the given path
ˆp.up        The second-to-last key of the given path
optic.name   The name of the given optic primitive
optic.kind   The kind of optic: getter, affine fold or fold
optic.whole  The type of the whole entity that the optic inspects
optic.part   The type of focus to which the optic points (an entity or base type)
pk(type)     The primary key of the relational table associated to the given type
With a little abuse of notation, we will omit the last attribute in path expressions, writing ˆp.name instead of the more verbose ˆp.last.name.

Select clause. The select function generates the SELECT clause by separating with commas the results of translating each expression. We describe the translation of the different types of expressions in the following lines:

• The translation of a path ˆx simply refers to all the columns of the table corresponding to that path, using the name assigned by the fresh function. Also note that the path must refer to an entity with no further nested entities (Precondition 1). Otherwise, the query output would not contain all the data required to reassemble the entity, i.e. this work does not support query shredding [28] yet.

• The translation of a projection ˆx.base is basically the same, but we find an interesting restriction here. SQL does not support multivalued columns, and therefore we cannot use a fold to project values (Precondition 2).

• The translation of nonEmpty is given in terms of EXISTS, which contains a nested SQL expression. Thereby, we invoke the sql generator recursively. Before doing this, we first need to generate fresh names for the trie of the nonEmpty expression and to merge it with the outer entity trie (the combinator ◁ merges tries, keeping the names from the left when it finds conflicting paths). Second, we need to calculate the right ˆlocal path and to pass it to the sql generator.

The evaluation of the rest of expressions should be straightforward, since they just adapt operators and literals into their SQL form.
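The select translation just described admits a direct Python rendering sketch; the tagged-tuple expression encoding and the names dictionary playing the role of ρ are assumptions of this illustration.

```python
def expr(e, names):
    """Render a triplet expression; `names` maps trie paths to aliases."""
    kind, payload = e
    if kind == "path":                   # whole-entity selection: alias.*
        return f"{names[payload]}.*"
    if kind == "proj":                   # attribute projection: alias.column
        path, attr = payload
        return f"{names[path]}.{attr}"
    if kind == "binop":                  # binary combinators like -, >
        op, l, r = payload
        return f"{expr(l, names)} {op} {expr(r, names)}"
    if kind == "like":                   # constant literal
        return str(payload)
    raise ValueError(f"unknown expression: {kind}")

def select(es, names):
    return "SELECT " + ", ".join(expr(e, names) for e in es)

# The selection of differencesFl: her name and the age difference.
names = {("couples", "fst"): "w", ("couples", "snd"): "m"}
clause = select([("proj", (("couples", "fst"), "name")),
                 ("binop", ("-", ("proj", (("couples", "fst"), "age")),
                                 ("proj", (("couples", "snd"), "age"))))],
                names)
```

The nonEmpty/EXISTS case is omitted here, since it requires the recursive invocation of the full sql generator.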
Remark 10.
None of the optic models associated to the guiding examples include affine folds in their definitions. In the particular case of the SQL interpretation, such optics are interpreted as fields which may contain a NULL value, i.e. as nullable table columns.
Where clause. We continue with the WHERE clause, given its similarity with the SELECT clause. It is generated by the where and where+ functions. The former is quite similar to select, since it basically delegates on the evaluation of the restriction expressions, although it uses AND as the delimiter for the results. The evaluation of expressions is exactly the same as the one introduced in the previous paragraph. Note that where produces WHERE True if the set of restrictions is empty. Concerning where+, this function is responsible for appending the precise connection between nested and outer variables, which were introduced at the very end of Sect. 5.1. We will explain it together with the generation of the FROM clause in the next paragraph.
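The where function admits a very small sketch. Restriction sets are unordered, so we sort the rendered conditions purely to make the output deterministic; that ordering is a choice of this sketch, not of the paper.

```python
def where(restrictions, render):
    """Join rendered restrictions with AND; an empty set degenerates to
    the trivially true clause, as described above."""
    if not restrictions:
        return "WHERE True"
    return "WHERE " + " AND ".join(sorted(render(e) for e in restrictions))
```

For instance, the single restriction of differencesFl would render as `where({"w.age > m.age"}, str)`.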
From clause. Before venturing into the from function, there are a few conditions that the generator should preserve. Firstly, it is assumed that ρ.ˆtop must refer to a fold (Precondition 3), since we need an entry point into the hierarchical tables. This means that we can only translate expressions that start with a fold, like differencesFl (Def. 4), which starts with couples, or expertiseFl (Def. 5), which starts with departments. Secondly, the invocation to from is omitted if ˆlocal is not defined, since this indicates that the current query is not introducing new variables, and therefore no FROM clause is required.

As expected, the from function prepares the FROM clause. It selects the 'part' type of ˆlocal as the initial table. Then, it produces an INNER JOIN expression for each element hanging from it. This is the reason why tries contain nothing but entities, since they correspond to relational tables. In general, the complexity associated to these definitions is due to the choice and implementation of the corresponding Foreign Key patterns (Sect. 5.1).
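A much-simplified sketch of the from generation follows: it covers only the paths that extend the local path by a single optic, and for brevity it renders every join with the USING form of eqjoin, even though non-fold optics like fst would use the ON pattern of Fig. 26. The tables, aliases and primary keys are supplied through hypothetical dictionaries.

```python
def from_clause(trie, names, local, table, pk):
    """FROM the local path's table, then one INNER JOIN per path that
    extends it by one optic (sorted only for deterministic output)."""
    parts = [f"FROM {table[local]} AS {names[local]}"]
    for path in sorted(trie):
        if len(path) == len(local) + 1 and path[:len(local)] == local:
            parts.append(f"INNER JOIN {table[path]} AS {names[path]} "
                         f"USING ({pk[table[local]]})")
    return " ".join(parts)

# The refined trie of differencesFl, with hypothetical alias and table maps:
trie = {(), ("couples",), ("couples", "fst"), ("couples", "snd")}
names = {("couples",): "c", ("couples", "fst"): "w", ("couples", "snd"): "m"}
table = {("couples",): "Couple", ("couples", "fst"): "Person",
         ("couples", "snd"): "Person"}
clause = from_clause(trie, names, ("couples",), table, {"Couple": "id"})
```

The real generator additionally distinguishes the fold and non-fold join conditions and handles arbitrarily deep extensions of the local path.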
Figure 25: From triplet to SQL

E_sql[getAll g : α → list β] :: (S → S) → SQL
E_sql[getAll g : α → list β] pk = sql (s, ρ, w) pk ρ.ˆtop
  where (s, f, w) = E_sql[g : fold α β] empty and ρ = fresh f and ρ.ˆtop is defined

sql (s, ρ, w) pk [ˆlocal] = (select s ρ pk) [from ρ pk ˆlocal] (where w ρ pk) [where+ ρ pk ˆlocal];
  where ρ.ˆtop.kind = fold
  and the from invocation is omitted if ˆlocal is not defined
  and the where+ invocation is omitted if ρ.ˆtop = ˆlocal

select (e₁, e₂, …, eₙ) ρ pk = SELECT expr e₁ ρ pk, expr e₂ ρ pk, …, expr eₙ ρ pk

expr ˆx ρ pk = ρ(ˆx).∗  where ˆx.last.part ∈ flat types
expr ˆx.optic ρ pk = ρ(ˆx).(optic.name)  where optic.part ∈ base types and optic.kind ∈ {getter, affine}
expr (t ⊕ u) ρ pk = expr t ρ pk ⊕ expr u ρ pk
expr (not e) ρ pk = NOT (expr e ρ pk)
expr (like a) ρ pk = a
expr (nonEmpty (s, f, w)) ρ pk = EXISTS (sql (s, ρ′, w) pk ρ.local(ρ′))  where ρ′ = ρ ◁ fresh f

where ∅ ρ pk = WHERE True
where {e₁, e₂, …, eₙ} ρ pk = WHERE expr e₁ ρ pk AND expr e₂ ρ pk AND … AND expr eₙ ρ pk
where+ ρ pk ˆlocal = AND ρ(ˆlocal).key = ρ(ˆlocal.up).key  where key = pk(ˆlocal.whole)

from ρ pk ˆlocal = FROM ˆlocal.part AS ρ(ˆlocal) joins
  where joins = { eqjoin ˆx ρ pk | ˆx ∈ ρ, ˆx = ˆlocal ⌢ ˆy for some ˆy ≠ () }
eqjoin ˆx ρ pk = INNER JOIN ˆx.part AS ρ(ˆx) cond
  where cond = USING pk(ˆx.whole) if ˆx.kind = fold
             = ON ρ(ˆx.up).(ˆx.up.name) = ρ(ˆx).(pk(ˆx.part)) otherwise

Figure 26: SQL generation

Once we have implemented E_sql, we can use it to translate generic queries into SQL statements. For differences (Def. 4) we get:

def differencesSQL : SQL = E_sql[differences : Couples → list (S, N)] (Person ↦ name)

and we adapt expertise (Def. 5) as follows:

def expertiseSQL : SQL = E_sql[expertise : Org → list S] (Department ↦ dpt, Employee ↦ emp)

Unlike other evaluators, E_sql requires the relation of primary keys for the involved tables as an additional argument, since this information is not contemplated in the optic model. We use the notation (t₁ ↦ k₁, t₂ ↦ k₂, …, tₙ ↦ kₙ) to build such an argument. For instance, the primary key associated to the table Person is the column name. If we ignore variable names, the SQL statements generated by the previous definitions are exactly the same as those introduced in Sect. 5.1.

As can be inferred from Fig. 26, the evaluation of getAll always leads to a SQL
SELECT statement, unless an error condition is present. The resulting query does not contain nested subqueries, beyond the ones that emerge in the context of EXISTS (due to the nonEmpty term). The FROM clause uses INNER JOINs as the means to navigate downwards through the tables in the model. Besides the previous elements, the evaluator just produces expressions with basic functions, operators and literals; no additional SQL features are required.

Clearly, the SQL semantics are not as neat as those associated to the XML infrastructure (Sect. 4), since they require a non-trivial normalization into triplets prior to generating the SQL statement. Besides, such generation is partial, and thus triplets must meet certain conditions to guarantee a correct translation. Fortunately, as we will see in the next section, we can take a different path towards the generation of SQL, where we can rely on existing work on language-integrated query. Despite this fact, in Sect. 8 we will discuss why the triplet normalization is still relevant.
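As a final illustration, the clause concatenation performed by sql can be replayed by hand for the differencesFl triplet. The aliases (c, w, m) and the foreign-key column names (her, him) are hypothetical, and where+ is omitted for simplicity.

```python
def sql(select_exprs, from_text, where_exprs):
    """Concatenate the three clauses, defaulting to WHERE True when no
    restriction is present, mirroring the shape of Fig. 26."""
    return (" ".join(["SELECT " + ", ".join(select_exprs),
                      from_text,
                      "WHERE " + (" AND ".join(where_exprs) or "True")])
            + ";")

query = sql(["w.name", "w.age - m.age"],
            "FROM Couple AS c "
            "INNER JOIN Person AS w ON c.her = w.name "
            "INNER JOIN Person AS m ON c.him = m.name",
            ["w.age > m.age"])
```

Up to variable names and the exact join columns, this is the kind of flat SELECT statement (joins plus basic operators, no nesting outside EXISTS) that the evaluation of getAll produces.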
6. T-LINQ
This section introduces Optica as a higher-level language that we can interpret into comprehensions. In particular, we generate T-LINQ [3] queries from optic expressions, essentially following a similar translation implemented for the Links language in [44]. By doing this, we demonstrate that the compositional style embraced by optics can be fruitfully exploited in order to generate comprehension expressions automatically. Moreover, we open the possibility of delegating the arduous task of generating SQL statements from optic expressions, described in the previous section, to the existing translation and normalization techniques of T-LINQ. As usual, we supply a brief background on the querying language, T-LINQ, and then we show the non-standard semantics that is needed in order to produce the corresponding comprehension-based queries.
In order to manually adapt the expertise query (Def. 5) as a T-LINQ expression, we will first review the difference between a relational (or flattened) model and a nested model. (This section omits the couples example for the sake of brevity. We select expertise over differences since we consider it to be more challenging.)

type NestedOrg = NestedDepartment list
type NestedDepartment = { dpt : string; employees : NestedEmployee list }
type NestedEmployee = { emp : string; tasks : Task list }
type Task = { tsk : string }

(a) Nested organization

type Org = { departments : { dpt : string } list;
             employees : { dpt : string; emp : string } list;
             tasks : { emp : string; tsk : string } list }

(b) Flattened organization

Figure 27: Alternative organization models

Figure 27 shows the nested (NestedOrg) and flat (Org) models for the organization example from [3], as T-LINQ records. Note that Org differs from the nested version NestedOrg in the type of its fields, since it contains textual values which act as foreign keys to refer to the corresponding entities. In fact, the second version has a strong correspondence with the SQL tables that we introduced in Sect. 5.1. Cheney et al. show the quoted expression that adapts the flattened model into the nested one (Fig. 28), where %org splices the database representation (<@ database("Org") @>). In particular, the programmer understands such representation as a list of entities from the relational model; therefore, she can use the widespread notation of list comprehensions to implement the desired queries, where filtering (if ... then) and mapping (yield) features are also available. Figure 29 shows the implementation of the expertise query in terms of T-LINQ, which builds upon the flattened model. Later on, we will see that the nested model becomes essential in the evaluation of Optica expressions, where we will try to produce an equivalent for expertiseTlinq from the evaluation of expertise.

def nestedOrg = <@
  for d in %org.departments do
    yield { dpt = d.dpt,
            employees = for e in %org.employees do
              if d.dpt = e.dpt then
                yield { emp = e.emp,
                        tasks = for t in %org.tasks do
                          if e.emp = t.emp then
                            yield { tsk = t.tsk } } } @>

Figure 28: From flat to nested organization model

(T-LINQ does support a compositional style, where analogous combinators for all, any, etc. could be supplied [3, Sect. 3.2]. Using these combinators and the nested version of the organization model, NestedOrg, expertiseTlinq could be written more concisely. Then, thanks to its normalization engine, the query could be rewritten into its equivalent version over the relational model.)

def expertiseTlinq = <@
  for d in %org.departments do
    if not exists
      for e in %org.employees do
        if d.dpt = e.dpt ∧ not exists
          for t in %org.tasks do
            if e.emp = t.emp ∧ t.tsk = 'abstract' then yield t.tsk
        then yield e.emp
    then yield d.dpt @>

Figure 29: T-LINQ analogue for expertise
As usual, we provide E_tlinq in order to evaluate Optica expressions into T-LINQ expressions. Prior to this, we need to determine the semantic domains for this evaluation by means of T_tlinq, which is shown in Fig. 30. As expected, this semantic function maps Optica types to T-LINQ representation types (Repr). In particular, it just relies on an auxiliary function T_aux and wraps the resulting type with Expr. The implementation of T_aux is direct for base types, whereas tuples are represented as records. Concerning query types, their representation is also straightforward, since functions are directly supported by T-LINQ, although we map option to list, since such a datatype is not contemplated by T-LINQ. Last, optic types are simply represented by the query type they generate. The next sections present the semantic domains for domain types and the implementation of E_tlinq, and discuss the final results.

T_tlinq[t] = Expr<T_aux[t]>

T_aux[N] = int
T_aux[B] = bool
T_aux[S] = string
T_aux[(α, β)] = { T_aux[α], T_aux[β] }
T_aux[α → β] = T_aux[α] → T_aux[β]
T_aux[α → option β] = T_aux[α] → list T_aux[β]
T_aux[α → list β] = T_aux[α] → list T_aux[β]
T_aux[getter α β] = T_aux[α → β]
T_aux[affine α β] = T_aux[α → option β]
T_aux[fold α β] = T_aux[α → list β]

Figure 30: Semantic domains of the T-LINQ evaluation
This section introduces the evaluation of domain and core terms into T-LINQ expressions. As we have already seen, all domain terms represent optic expressions, and thus they have to be adapted as functions. Figure 31 shows the semantic domains (by extending T_aux) and the evaluation of the terms in the organization example. Note how the organization types are mapped to the corresponding nested (instead of relational) types. This aspect will be relevant later on while generating the target queries. Back to the evaluation of terms, we can see that this is essentially a T-LINQ adaptation of the code that we presented in OrgModel (Sect. 2.2.2), where we used lambda expressions to build concrete optics.

T_aux[Org] = NestedOrg
T_aux[Department] = NestedDepartment
T_aux[Employee] = NestedEmployee
T_aux[Task] = Task

E_tlinq[departments : fold Org Department] = <@ fun(ds) → ds @>
E_tlinq[dpt : getter Department S] = <@ fun(d) → d.dpt @>
E_tlinq[employees : fold Department Employee] = <@ fun(es) → es @>
E_tlinq[emp : getter Employee S] = <@ fun(e) → e.emp @>
E_tlinq[tasks : fold Employee Task] = <@ fun(ts) → ts @>
E_tlinq[tsk : getter Task S] = <@ fun(t) → t.tsk @>

Figure 31: T-LINQ semantic domains and non-standard semantics for the organization extension
The evaluation of core combinators (Fig. 32) also shares a strong resemblance with those that we have seen for their concrete counterparts in Sect. 2.1. In essence, the difference lies in the fact that concrete optics build directly upon the type system of Scala, whereas the T-LINQ interpretation builds upon its own type system. Thus, the evaluation of ∗∗∗ creates a T-LINQ lambda expression using T-LINQ records instead of using the lambda expressions and products of Scala. Similarly, ≫af and ≫fl implement composition by directly using the primitives of T-LINQ, whereas the implementation of this combinator in concrete optics is based upon the standard Scala implementation.

The last step towards the generation of final queries is supplying the non-standard semantics for queries, which are shown in Fig. 33. This step is trivial since they share the very same semantic domain as their associated optics; therefore, we just need to evaluate their optic argument. However, in order to produce the final queries, there is a non-negligible disagreement that we need to address: the T-LINQ expressions which are generated by E_tlinq refer to entities from the nested model, as introduced by Fig. 31. To resolve this mismatch, we need to reconcile the relational model with the nested model, so we use nestedOrg (Fig. 28) for the task. Thereby, we just supply the nested data to the T-LINQ lambda expression generated from the Optica expression:

def expertiseTlinq = <@ %E_tlinq[expertise : Org → list S] %nestedOrg @>

This produces an alternative implementation of the query which was presented in Fig. 29. However, the T-LINQ expression generated by the new version is much more difficult to read and less efficient, given the complexity introduced by nestedOrg. Fortunately, this is not a problem, since both queries share the very same normal form and, consequently, they produce the same SQL statement.
E_tlinq[id_gt : getter α α] = <@ fun(a) → a @>
E_tlinq[g ≫gt h : getter α γ] = <@ fun(a) → %E_tlinq[h : getter β γ] (%E_tlinq[g : getter α β] a) @>
E_tlinq[g ∗∗∗ h : getter α (β, γ)] = <@ fun(a) → { = %E_tlinq[g : getter α β] a, = %E_tlinq[h : getter α γ] a } @>
E_tlinq[like b : getter α β] = <@ fun(a) → b @>
E_tlinq[not g : getter α B] = <@ fun(a) → not (%E_tlinq[g : getter α B] a) @>
E_tlinq[g ⊕ h : getter α δ] = <@ fun(a) → (%E_tlinq[g : getter α β] a ⊕ %E_tlinq[h : getter α γ] a) @>
E_tlinq[id_af : affine α α] = <@ fun(a) → yield a @>
E_tlinq[g ≫af h : affine α γ] = <@ fun(a) → for b in %E_tlinq[g : affine α β] a do for c in %E_tlinq[h : affine β γ] b do yield c @>
E_tlinq[filtered p : affine α α] = <@ fun(a) → if %E_tlinq[p : getter α B] a then yield a @>
E_tlinq[to_af g : affine α β] = <@ fun(a) → yield (%E_tlinq[g : getter α β] a) @>
E_tlinq[id_fl : fold α α] = <@ fun(a) → yield a @>
E_tlinq[g ≫fl h : fold α γ] = <@ fun(a) → for b in %E_tlinq[g : fold α β] a do for c in %E_tlinq[h : fold β γ] b do yield c @>
E_tlinq[nonEmpty g : getter α B] = <@ fun(a) → exists (%E_tlinq[g : fold α β] a) @>
E_tlinq[to_fl a : fold α β] = E_tlinq[a : affine α β]

Figure 32: T-LINQ non-standard semantics for optic terms

E_tlinq[get g : α → β] = E_tlinq[g : getter α β]
E_tlinq[preview g : α → option β] = E_tlinq[g : affine α β]
E_tlinq[getAll g : α → list β] = E_tlinq[g : fold α β]

Figure 33: T-LINQ non-standard semantics for query terms
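For readers who want to experiment with the concrete counterparts referred to above, the following is a hedged sketch of read-only concrete optics. The names follow Sect. 2.1, but the definitions are our own simplification, not the paper's exact code:

```scala
// Simplified concrete read-only optics (assumed shapes, cf. Sect. 2.1):
// a getter is a total accessor, an affine fold selects at most one part,
// and a fold selects any number of parts.
case class Getter[S, A](get: S => A)
case class AffineFold[S, A](preview: S => Option[A])
case class Fold[S, A](getAll: S => List[A])

object Optics {
  // Sequential composition, mirroring the ≫gt and ≫fl cases of Fig. 32.
  def andThenGt[S, A, B](g: Getter[S, A], h: Getter[A, B]): Getter[S, B] =
    Getter(s => h.get(g.get(s)))
  def andThenFl[S, A, B](g: Fold[S, A], h: Fold[A, B]): Fold[S, B] =
    Fold(s => g.getAll(s).flatMap(h.getAll))
  // filtered keeps the focus only when the predicate getter holds.
  def filtered[S](p: Getter[S, Boolean]): AffineFold[S, S] =
    AffineFold(s => if (p.get(s)) Some(s) else None)
  // Casts between optic kinds, as in the to_af / to_fl combinators.
  def toFl[S, A](af: AffineFold[S, A]): Fold[S, A] =
    Fold(s => af.preview(s).toList)
  def nonEmpty[S, A](fl: Fold[S, A]): Getter[S, Boolean] =
    Getter(s => fl.getAll(s).nonEmpty)
}
```

Composing a fold with a filtered affine fold in this sketch reproduces, over in-memory data, the behaviour that E_tlinq encodes as nested for/if/yield expressions.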
7. S-Optica: Optica as a Scala library
This section aims at implementing the Optica DSL in Scala. The resulting library (which we call S-Optica) is provided as a proof of concept of the feasibility of extending existing libraries for LINQ, especially those based on comprehensions, with optic capabilities. We will show in detail the S-Optica implementation of the syntax and type system of Optica, as well as its standard semantics. The reader may want to look into the accompanying sources for more information about the S-Optica implementation of the interpreters for XQuery, SQL and T-LINQ. The S-Optica implementation is also intended to serve as an illustration of the tagless-final style [21], which we have chosen in order to implement our DSL.
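As a warm-up, the tagless-final style can be illustrated with a minimal, self-contained sketch (a hypothetical toy DSL, unrelated to Optica's actual type classes): the syntax is declared once as a type class over a representation Repr[_], and each instance supplies one semantics.

```scala
// Toy tagless-final DSL: integer expressions.
trait ExprSym[Repr[_]] {
  def lit(n: Int): Repr[Int]
  def add(x: Repr[Int], y: Repr[Int]): Repr[Int]
}

// Standard, meta-circular semantics: Repr is the identity constructor.
type Id[A] = A
object Eval extends ExprSym[Id] {
  def lit(n: Int): Int = n
  def add(x: Int, y: Int): Int = x + y
}

// A non-standard semantics: pretty-printing into strings.
type Str[A] = String
object Show extends ExprSym[Str] {
  def lit(n: Int): String = n.toString
  def add(x: String, y: String): String = s"($x + $y)"
}

// A generic expression, written once and interpreted many times.
def onePlusTwo[Repr[_]](S: ExprSym[Repr]): Repr[Int] =
  S.add(S.lit(1), S.lit(2))
```

Evaluating onePlusTwo(Eval) yields 3, while onePlusTwo(Show) yields the string "(1 + 2)"; S-Optica applies the same recipe with optics as expressions and queries as observations.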
In the tagless-final style, the syntax and type system of a typed DSL is implemented through a type constructor class, which represents the class of representations, or possible interpretations, of that DSL. This type class does not need to be a single, monolithic module; it is usually decomposed into different type classes which encode different aspects of the DSL. In our case, the division of classes has taken into consideration the structure of optics and combinators that we followed in Sect. 2, and the difference between optic and query types as introduced in Sect. 3.1. Accordingly, Fig. 34 shows the syntax and semantics of the Optica fragment corresponding to getters, affine folds and folds; Fig. 35 shows the implementation of the fragment of queries, as well as the overall Optica type class. Some comments on the implementation follow below:

• The primitive combinators of the different types of optics (getters, affine folds and folds) are implemented in their respective modules. Those which are not primitive, but can be defined in terms of other combinators, namely any, all, elem and empty, are defined in the OpticaCom type class.

• The implementation of these derived combinators benefits from the same syntactic enhancements that we assumed in Sect. 2.1. In fact, their implementation is literally the same as that for concrete optics shown in Fig. 3. The differences simply lie in their signatures and the intended semantics: whereas the implementations of Fig. 3 only work for concrete optics, the implementations of Fig. 34 work for any optic representation Repr[_]. Thus, we may instantiate this class in order to work with concrete optics, or any other standard representation such as van Laarhoven or profunctor optics; of course, we may also instantiate this class in order to work with XQuery, TripletFun or T-LINQ, since these are legitimate read-only optic representations, as has been shown throughout the paper.
We supply a brief tour of how to encode type (constructor) classes in Scala in Appendix A.1.

• We actually used concrete optic types in the signatures of these type classes, i.e. the types Getter[_,_], AffineFold[_,_], etc. (to which the signatures refer) are exactly those defined in Sect. 2.1. How can these signatures work for any representation, then? The reason is simply that these combinators do not receive and return plain concrete optic types, but their representations: the empty combinator does not receive a concrete fold, but anything that counts as a fold optic. Concrete types thus behave mostly as phantom types [45, 46], which specify the abstract semantic domains of the language and aid in the definition of its type system.

• The query types of Optica correspond in the tagless-final style to observations [19]. These can be understood as the standard interpretations that we demand from any representation of the implemented DSL. This matches perfectly with the distinction between optic and query types: for instance, we will always want a getAll interpretation from a fold program, irrespective of the optic representation. In the tagless-final style, we commonly assign different type constructors to DSL expressions and observations. Thus, in Fig. 35 we use Repr[_] and Obs[_] for optic and query representations, respectively. This is actually equivalent to having two different DSLs, one for optics and another for queries, into which optics are compiled.

• Base types of Optica also enjoy a different representation. As the implementation of the like combinator shows, base values are represented using the very type system of the host language, i.e. Scala. Thus, their representation is neither Repr[_] nor Obs[_], but the identity type constructor. This representation for base types is also common practice in the tagless-final style.

• To avoid overloading the like method for the different base types, Int, String and Boolean, we use the GADT Base, whose object instances are marked as implicits, thereby enabling the context bound syntax in Scala. The Base GADT is also declared in those signatures that depend on the like combinator, namely elem and the combinator equal.

In order to write domain queries, we need to extend the syntax and type system of the Optica language, as we have seen in Sect. 3.2. Quoting from [47], "extensibility is the strong suite of the tagless-final embedding"; therefore, this task should be easy. Indeed, we simply need to declare a new type class containing an entry for each domain optic in the model, as shown in Fig. 36. The types
Couples, Person, etc., are immutable data structures which mostly behave as phantom types and aid in the extension of the type system of the language. Once we have the core and domain primitives available, we should be able to implement generic optic expressions by declaring both dependencies, the combinators of the Optica API and the domain model (note that observations are not needed to write pure optic expressions):

We would run into problems, however, if the target optic representation does not also use the Scala types for representing base types. This constraint could be slightly lifted, since we may want to compare not only base but model types in general (cf. Fig. 4).

trait GetterCom[Repr[_]] {
  def id_gt[S]: Repr[Getter[S, S]]
  def andThen_gt[S, A, B](u: Repr[Getter[S, A]],
      d: Repr[Getter[A, B]]): Repr[Getter[S, B]]
  def fork_gt[S, A, B](l: Repr[Getter[S, A]],
      r: Repr[Getter[S, B]]): Repr[Getter[S, (A, B)]]
  def like[S, A: Base](a: A): Repr[Getter[S, A]]
  def not[S](b: Repr[Getter[S, Boolean]]): Repr[Getter[S, Boolean]]
  def equal[S, A: Base](x: Repr[Getter[S, A]],
      y: Repr[Getter[S, A]]): Repr[Getter[S, Boolean]]
  def greaterThan[S](x: Repr[Getter[S, Int]],
      y: Repr[Getter[S, Int]]): Repr[Getter[S, Boolean]]
  def subtract[S](x: Repr[Getter[S, Int]],
      y: Repr[Getter[S, Int]]): Repr[Getter[S, Int]]
}

trait AffineFoldCom[Repr[_]] {
  def id_af[S]: Repr[AffineFold[S, S]]
  def andThen_af[S, A, B](u: Repr[AffineFold[S, A]],
      d: Repr[AffineFold[A, B]]): Repr[AffineFold[S, B]]
  def filtered[S](p: Repr[Getter[S, Boolean]]): Repr[AffineFold[S, S]]
  def to_af[S, A](gt: Repr[Getter[S, A]]): Repr[AffineFold[S, A]]
}

trait FoldCom[Repr[_]] {
  def id_fl[S]: Repr[Fold[S, S]]
  def andThen_fl[S, A, B](u: Repr[Fold[S, A]],
      d: Repr[Fold[A, B]]): Repr[Fold[S, B]]
  def nonEmpty[S, A](fl: Repr[Fold[S, A]]): Repr[Getter[S, Boolean]]
  def to_fl[S, A](afl: Repr[AffineFold[S, A]]): Repr[Fold[S, A]]
}

trait OpticaCom[Repr[_]] extends GetterCom[Repr]
    with AffineFoldCom[Repr] with FoldCom[Repr] {
  def empty[S, A](fl: Repr[Fold[S, A]]): Repr[Getter[S, Boolean]] =
    fl.nonEmpty.not
  def all[S, A](fl: Repr[Fold[S, A]])(
      p: Repr[Getter[A, Boolean]]): Repr[Getter[S, Boolean]] =
    (fl ≫ filtered(p.not)).empty
  def any[S, A](fl: Repr[Fold[S, A]])(
      p: Repr[Getter[A, Boolean]]): Repr[Getter[S, Boolean]] =
    fl.all(p.not).not
  def elem[S, A: Base](fl: Repr[Fold[S, A]])(a: A): Repr[Getter[S, Boolean]] =
    fl.any(id_gt === like(a))
}

Figure 34: OpticaCom symantics (optic combinators)

trait GetterQuery[Repr[_], Obs[_]] {
  def get[S, A](gt: Repr[Getter[S, A]]): Obs[S => A]
}

trait AffineFoldQuery[Repr[_], Obs[_]] {
  def preview[S, A](af: Repr[AffineFold[S, A]]): Obs[S => Option[A]]
}

trait FoldQuery[Repr[_], Obs[_]] {
  def getAll[S, A](fl: Repr[Fold[S, A]]): Obs[S => List[A]]
}

trait Optica[Repr[_], Obs[_]] extends OpticaCom[Repr]
    with GetterQuery[Repr, Obs]
    with AffineFoldQuery[Repr, Obs]
    with FoldQuery[Repr, Obs]

Figure 35: Optica symantics (generic combinators and queries)

trait CoupleModel[Repr[_]] {
  def couples: Repr[Fold[Couples, Couple]]
  def fst: Repr[Getter[Couple, Person]]
  def snd: Repr[Getter[Couple, Person]]
  def name: Repr[Getter[Person, String]]
  def age: Repr[Getter[Person, Int]]
}

Figure 36: Couple domain symantics

def differencesFl[Repr[_]](implicit
    O: OpticaCom[Repr],
    M: CoupleModel[Repr]): Repr[Fold[Couples, (String, Int)]] =
  couples ≫ filtered((fst ≫ age) > (snd ≫ age)) ≫
    (fst ≫ name) ∗∗∗ ((fst ≫ age) - (snd ≫ age))

and generic query expressions (on this occasion, we pass the whole Optica type class, which includes the queries):

def differences[Repr[_], Obs[_]](implicit
    O: Optica[Repr, Obs],
    M: CoupleModel[Repr]): Obs[Couples => List[(String, Int)]] =
  differencesFl.getAll

As can be seen, the required primitives are injected using the Scala implicit mechanism. In contrast with Def. 4, this version makes explicit the aforementioned existence of different representations for optics and queries, as evidenced by the result types. Scala implicits are also exploited by the library to omit invocations to casting methods, although the required syntax is not shown for brevity.

Type classes in the tagless-final style are commonly named Symantics, a portmanteau of 'syntax' and 'semantics', to emphasise the fact that the same abstraction serves a double purpose: the type class declaration defines the syntax and type system of the language, whereas type class instances provide its semantics. The standard semantics of the language is no exception, and for this purpose we greatly benefit from having reused the standard semantic domains at the syntactic level: we simply use the identity type lambda λ[x => x] for both the Repr and Obs parameters, and map each primitive into its concrete counterpart.

We can find the interpretation that supplies the standard semantics of Optica in Fig. 37. In particular, it is represented by the singleton object R, which is also a common name for meta-circular interpretations in the tagless-final style. We follow the very same pattern to instantiate the couple domain terms, as we show in Fig. 38. Now we can use the standard semantics to evaluate generic queries, and to re-implement, in a modular way, the ad-hoc functions that deal with immutable data structures. For instance:

val differencesR: Couples => List[(String, Int)] =
  differences[λ[x => x], λ[x => x]](R, CoupleModelR)

As can be seen, we have specified the standard representation types for optics and queries alongside the associated evidences. Fortunately, they could be inferred implicitly, as shown in this alternative and preferred version:

val differencesR: Couples => List[(String, Int)] = differences

The resulting function is extensionally equal to differences from Sect. 2.2.1. The implementation of the rest of the interpreters in this article (XQuery, SQL and T-LINQ) follows the same principles. Interested readers can find a README file in the companion sources [39] which briefly describes the library structure and supplies links to the aforementioned interpreters and other relevant modules.
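The Base GADT mentioned in the bullets above can be sketched as follows (an assumed shape for illustration, not S-Optica's exact declaration):

```scala
// Base witnesses the types that `like` may inject as literals.
// The implicit object instances enable the `A: Base` context bound.
sealed abstract class Base[A]
object Base {
  implicit object IntBase extends Base[Int]
  implicit object StringBase extends Base[String]
  implicit object BooleanBase extends Base[Boolean]
}

// A like-style combinator restricted to base types (modelled here as a
// plain constant function rather than a Repr-wrapped getter):
def like[S, A: Base](a: A): S => A = _ => a
```

With these instances in scope, like[Unit, Int](50) compiles, whereas like[Unit, List[Int]](Nil) is rejected at compile time, since no Base[List[Int]] instance exists.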
8. Discussion
The language of optics.
One of the most prominent sought-after features of optics is modularity, i.e. the capacity of creating optics for compound data structures out of simpler optics for their parts. This is especially emphasized in the framework of profunctor optics [32], where optic composition builds upon plain function composition and enables straightforward combinations of isos, prisms, lenses, affine traversals and traversals. The profunctor representation is particularly convenient to implement (and even reveal) the compositional structure of the different varieties of optics but, in essence, this structure is also enjoyed by concrete optics, van Laarhoven optics, etc. Modularity is a feature of the language of optics, rather than of any particular representation. This paper has shown, albeit for a very restricted subset of optics (getters, affine folds and folds), that this compositional structure of optics can be encoded in the type system of a formal language, which we have named Optica. The denotational semantics of this language was given in terms of concrete optics, but any other isomorphic representation, such as profunctor optics, may have served as well.

Now, the specification of Optica includes not only the compositional features of read-only optics but also, and significantly, their characteristic queries. Taking into account this non-compositional character of optics is essential as soon as we tackle the extension of Optica with new varieties of optics. For instance, the major difference between folds and traversals is not found in their compositional properties, but in the queries that they must support: besides getAll, traversals must also support a putAll query to replace the content of the elements that they are selecting.

This is the common case in which standard semantic domains do not eventually behave as phantom types.

trait RGetterCom extends GetterCom[λ[x => x]] {
  def id_gt[S] = Getter.id
  def andThen_gt[S, A, B](u: Getter[S, A], d: Getter[A, B]) =
    Getter.andThen(u, d)
  def fork_gt[S, A, B](l: Getter[S, A], r: Getter[S, B]) = Getter.fork(l, r)
  def like[S, A: Base](a: A) = Getter.like(a)
  def not[S](b: Getter[S, Boolean]) = Getter.not(b)
  def equal[S, A: Base](x: Getter[S, A], y: Getter[S, A]) = Getter.equal(x, y)
  def greaterThan[S](x: Getter[S, Int], y: Getter[S, Int]) =
    Getter.greaterThan(x, y)
  def subtract[S](x: Getter[S, Int], y: Getter[S, Int]) =
    Getter.subtract(x, y)
}

trait RAffineFoldCom extends AffineFoldCom[λ[x => x]] {
  def id_af[S] = AffineFold.id
  def andThen_af[S, A, B](u: AffineFold[S, A], d: AffineFold[A, B]) =
    AffineFold.andThen(u, d)
  def filtered[S](p: Getter[S, Boolean]) = AffineFold.filtered(p)
  def to_af[S, A](gt: Getter[S, A]) = gt
}

trait RFoldCom extends FoldCom[λ[x => x]] {
  def id_fl[S] = Fold.id
  def andThen_fl[S, A, B](u: Fold[S, A], d: Fold[A, B]) = Fold.andThen(u, d)
  def nonEmpty[S, A](fl: Fold[S, A]) = fl.nonEmpty
  def to_fl[S, A](afl: AffineFold[S, A]) = afl
}

trait RGetterQuery extends GetterQuery[λ[x => x], λ[x => x]] {
  def get[S, A](gt: Getter[S, A]) = gt.get
}

trait RAffineFoldQuery extends AffineFoldQuery[λ[x => x], λ[x => x]] {
  def preview[S, A](af: AffineFold[S, A]) = af.preview
}

trait RFoldQuery extends FoldQuery[λ[x => x], λ[x => x]] {
  def getAll[S, A](fl: Fold[S, A]) = fl.getAll
}

implicit object R extends Optica[λ[x => x], λ[x => x]]
    with RGetterCom with RAffineFoldCom with RFoldCom
    with RGetterQuery with RAffineFoldQuery with RFoldQuery

Figure 37: Optica standard semantics

implicit object CoupleModelR extends CoupleModel[λ[x => x]] {
  val couples = CoupleModel.couples
  val fst = CoupleModel.fst
  val snd = CoupleModel.snd
  val name = CoupleModel.name
  val age = CoupleModel.age
}
Figure 38: Couple domain standard semantics

On the implementation side, we have found the typed tagless-final approach especially suitable in order to encode this separation of concerns between declarative optic combinators and their intended queries. In essence, it closely corresponds to the difference between representations of the DSL and their observations or interpreters. Another essential feature of the tagless-final pattern that we plan to profit from is extensibility. In particular, new optics will be added to S-Optica through their own type classes (as we have done for getters, affine folds and folds) so that we can fully reuse old queries without recompiling sources.

Optics versus comprehensions
Optics can be seamlessly combined with comprehensions, as shown in Sect. 6. Indeed, by using the T-LINQ interpreter of Optica we can freely mix optic expressions with general comprehension queries. In this way, optics may play within comprehensions a similar role to that which is played by XPath within XQuery [44]. In the following paragraphs, we discuss the basic trade-off between expressiveness and modularity of comprehension and Optica queries, so as to better appreciate their role in the LINQ landscape.

The separation of concerns between declaratively selecting parts of a data structure and building a variety of queries related to those parts is the cornerstone of optics. In this regard, the LINQ approach based on comprehensions focuses on the query building side and, commonly, on constructing queries of a simple kind: retrieval queries denoting a multiset (the semantic domain for queries in QUEΛ [19], T-LINQ [3], NRC [13], etc.). The optics approach is, hence, potentially more modular. For instance, a representation of traversals intended for SQL should allow us to generate both SELECT and UPDATE statements for the queries getAll and putAll, respectively. We plan to deal with this extension and its trade-offs with expressiveness in future versions of Optica.

We can still claim further modularity advantages of optics over comprehensions. Basically, these are due to the fact that optics provide a language which is more akin to relational algebra than the calculus approach that monads provide for comprehensions [29]. Arguably, the support for functional abstraction and intermediate nested data that comprehension languages and systems such as Links, T-LINQ or DSH offer also leads to highly compositional queries. We can find examples, however, where the difference in style manifests itself. For instance, this is the query that remains to complete Table 1, in the style of S-Optica:

def under50_d[Repr[_], Obs[_]](implicit
    O: Optica[Repr, Obs],
    M: CoupleModel[Repr]): Obs[Couples => List[String]] =
  (couples ≫ fst ≫ filtered(age < 50) ≫ name).getAll

which we compare to an analogous query using the Scala implementation of T-LINQ:

def under50_e[Repr[_]](couples: Repr[Couples])(implicit
    Q: Tlinq[Repr],
    N: CoupleNested[Repr]): Repr[List[String]] =
  foreach(couples)(c =>
    where (c.fst.age < 50) (
      yields (c.fst.name)))

DSH, in particular, comes with an extensive catalog of list-processing combinators: https://github.com/ulricha/dsh/blob/master/src/Database/DSH/Frontend/Externals.hs. Indeed, our version of the expertise query in Sect. 6 is no simpler than the equivalent version using nested data in [3]. Tlinq[_[_]] provides the tagless-final implementation in Scala of T-LINQ, which we have used to implement the corresponding interpreter for S-Optica. The role of CoupleNested in the sample query is similar to that of the OrgNested model in Sect. 6.1.
As this example shows, in adopting the language of optics, modularity is improved in several respects. First, as we have mentioned earlier, the query is actually composed of two major parts: the optic expression, which declares what to select, and the query expression, which actually specifies the kind of query to be executed over the selection. Second, the optic expression is unaware of variables and builds upon finer-grained and reusable modules, such as couples, fst, age and name. This results in pure algebraic queries that are arguably simpler to compose and maintain. In essence, we are building out of simpler optics in a purely compositional style, and deriving queries in one shot.

The downside of the optics approach in relation to comprehensions, at least in the current version of Optica, is its limited expressiveness. Indeed, variables are fruitfully exploited in comprehensions to express arbitrary joins (e.g. cyclic), whereas optic queries appear to move only downwards from the root of the hierarchy. Relational models are more general than nested models, providing the programmer with better navigation tools [48], and therefore not every model is expressible in Optica. Take the couple model as an example. We assume that each person hangs from a couple, so that we can find them by diving into the couple fields fst/snd. However, the relational model is able to supply more entries for people who do not necessarily form a couple. To alleviate this problem, we may introduce a new fold people besides the existing couples, sharing a virtual root type as source. The connections between people and fst/snd would still be unclear in the optic model; therefore, new mechanisms should be introduced in order to establish the precise relationship among them.
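Over plain in-memory data, the contrast between the two styles of the under50 query above can be mimicked with ordinary Scala collections (the Person/Couple case classes and the sample data are ours, for illustration only):

```scala
case class Person(name: String, age: Int)
case class Couple(fst: Person, snd: Person)

val couples = List(
  Couple(Person("Alex", 60), Person("Bert", 55)),
  Couple(Person("Cora", 33), Person("Demi", 31)))

// Comprehension style (cf. under50_e): a variable c ranges over the source.
val byComprehension: List[String] =
  for (c <- couples if c.fst.age < 50) yield c.fst.name

// Optic-like style (cf. under50_d): a variable-free pipeline of selectors.
val byPipeline: List[String] =
  couples.map(_.fst).filter(_.age < 50).map(_.name)
```

Both produce List("Cora"): the comprehension names an intermediate variable, while the pipeline composes selectors point-free, which is the style that Optica distills.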
We leave for future work a more precise investigation of the compared expressiveness of the comprehension and optic languages, as well as the extension of Optica with features already supported in comprehension languages, like grouping, aggregation and order-by queries [19, 20, 49].

Optics as a general query language.
The role of optics in LINQ expands beyond their combined use with comprehensions. By lifting optics into a full-fledged DSL, we have opened the door to non-standard interpretations that directly translate the language of optics to data accessors for alternative representations beyond immutable data structures. For instance, we have provided an interpretation to turn Optica queries into XQuery expressions, where we have seen that the connection between them is straightforward, leading to a compositional interpreter. The translation ignores the XQuery FLWOR syntax and basically focuses on XPath features. Indeed, we understand XPath as a language to select parts from an XML document, which makes it a perfect example of an optic representation. Moreover, since XPath does not provide the means to update an XML document, it also fits perfectly with read-only optics such as getters, affine folds and folds.

It might be worth mentioning that synergies between optics and XML are by no means new. In fact, prominent optic libraries are extended with modules to cope with XML or JSON documents, even packaged as domain-specific query languages, such as JsonPath. In these projects, standard optics facilitate the definition of these DSLs for querying JSON or XML documents. Nevertheless, our approach is radically different, since we provide a general optic language in order to build generic optics which may be translated over those DSLs (JsonPath, XQuery, etc.).

Our approach also differs from others where the process is reversed and a translation of XPath expressions into a general query language based on comprehensions is performed [3]. That translation exploits view forests, a concept used by the framework to separate the XML structure from its computation.

https://hackage.haskell.org/package/xml-lens-0.1.6.3/docs/Text-XML-Lens.html
https://github.com/julien-truffaut/jsonpath.pres
On its part, Optica exposes a hierarchy of domain optics that external components may use to compose optic expressions, as application queries. In addition, we could understand view forests as a kind of optic, since they also select parts from the underlying database. However, Optica is more general, considering that the same application queries can be reused against different targets, and not only SQL.

Nevertheless, SQL is the primary target of classical LINQ with comprehensions, and we have also provided a non-standard SQL interpreter for Optica. Commonly, comprehension-based queries need to be flattened in order to guarantee good performance: the naive translation to SQL is not optimal, since it typically leads to nested subqueries. Moreover, translations of flat-flat queries to SQL are guaranteed to be total and to avoid the problem of query avalanche [16]. In systems like Links or T-LINQ, these guarantees are even statically checked. In Sect. 5.2.4, our translation to SQL attains similar guarantees concerning the type of generated queries, which are free of subqueries, beyond those generated by EXISTS, which are unavoidable. However, failures in query generation are signalled at run time rather than at compile time. We plan to address this limitation in future work by using the optimization techniques that the tagless-final approach offers [19]. Our translation process resembles the denotational approach of SQUR [20] and Links [51] rather than the rewriting approach followed in [16, 3, 19]. In particular, we use an intermediate language, TripletFun, to decouple the filtering, selection and collection aspects of the final SQL query. We differ from SQUR, however, because the ultimate translation to SQL is performed directly from this non-standard semantic domain rather than from a normalized optic query.
We plan to incorporate normalization and partial evaluation in future work, which will become convenient as soon as we extend the language with the projections first and second, in correspondence with the fork combinator.

Given the existing translation to comprehensions from Optica, and the established results concerning the generation of SQL from comprehensions, the usefulness of TripletFun for this purpose is certainly relative. However, it demonstrates an instance of optic representation in the relational setting, which we believe has the potential of being very useful when we extend our results to optics with updating capabilities. In this light, we intend to exploit the very same TripletFun representation to generate both SELECT and UPDATE statements. Moreover, the TripletFun interpreter represents an example of a complex translation using an intermediate optic representation, which resembles the denotational approach of [20] but performed in the algebraic setting of optics rather than in the relational calculus of comprehensions. This semantic style may serve as a reference for similar complex interpreters, e.g. for NoSQL databases such as MongoDB [52].
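To make the idea of decoupling concrete, here is an illustrative sketch (our own, far simpler than the paper's TripletFun, and with hypothetical column names) of a query split into its collection (FROM), filtering (WHERE) and selection (SELECT) aspects, with SQL rendered only at the very end:

```scala
// A "triplet"-style intermediate form: the three aspects are kept apart
// and can be composed independently before rendering.
case class Triplet(from: String, where: List[String], select: List[String]) {
  // Composing a filter only touches the WHERE aspect.
  def filter(cond: String): Triplet = copy(where = where :+ cond)
  // Rendering combines the three aspects into a single SQL statement.
  def toSql: String = {
    val w = if (where.isEmpty) "" else where.mkString(" WHERE ", " AND ", "")
    s"SELECT ${select.mkString(", ")} FROM $from$w"
  }
}

val q = Triplet("Couple c", Nil, List("c.fst_name"))
  .filter("c.fst_age < 50")
// q.toSql == "SELECT c.fst_name FROM Couple c WHERE c.fst_age < 50"
```

Because filtering, selection and collection are accumulated separately, optic composition can target each aspect independently, postponing the commitment to concrete SQL syntax until the final rendering step.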
Optica versus ORMs and LINQ libraries
Connections between optics and databases are widespread. As a matter of fact, lenses emerged in this context [33] under the umbrella of bidirectional programming. We remark [53] as a recent work in this field, where a practical approach to the view update problem is introduced by means of the so-called incremental relational lenses. Although we still do not know if extending Optica will lead us to contemplate views in the non-standard SQL semantics, we find this research essential to deal with updating optics in an effective way.

S-Optica and object-relational mappers (ORMs), like Hibernate, pursue similar goals: they aim at working with data in persistent stores as if it were plain in-memory data. However, S-Optica uses the language of optics, while ORMs try to remain as close as possible to the customary object-oriented style. There are other relevant differences:

• S-Optica does not stick to relational databases as its preferred target infrastructure. In fact, Sect. 3 and Sect. 4 show that in-memory immutable data structures and XML files are also potential sources of information.

• S-Optica is eminently declarative. Indeed, S-Optica queries are simply values that do not produce side effects on their own. This contrasts with ORMs, where a deep understanding of the particular ORM is required to identify which queries are being launched at any time. The declarative style of S-Optica enables compositionality, as well as the possibility of introducing further optimizations.

• S-Optica queries are expressive and well-typed. Many ORMs introduce contrived additional languages to express queries, and their expressions are usually presented as plain strings, so that errors are not detected at compile time.

• ORMs usually consider the notion of object as the smallest-granularity concept to deal with, while S-Optica supports queries that select very specific parts from the whole data.
• ORMs are able to write data back to the store, while this feature is future work in Optica.

In general, ORMs have been used for a long time and they are consequently very mature, while Optica is still an experimental and limited library. However, it already solves many of the problems that are deep-rooted in the ORM approach.

The Scala libraries Quill [24] and Slick [25] are arguably the most similar frameworks to S-Optica. The former is strongly inspired by T-LINQ [3] and it therefore follows the same theoretical principles. Its major benefit with regard to the original P-LINQ is the flatMap method that it supplies for the type constructor Query; however, it apparently lacks an implementation for point, which is required to translate some Optica queries into Quill expressions. (Strictly speaking, values are objects in Scala whereas S-Optica queries are polymorphic methods; these may be easily turned into values by using a Church encoding representation — for instance, the S-Optica query (like 1).getAll.) Slick is similar to Quill, but it does not build upon a theoretical language like T-LINQ; a comparison between Quill and Slick, written by Quill's author, is provided at https://github.com/getquill/quill/blob/master/SLICK.md. In any case, both Quill and Slick map relational models in Scala in a direct way, i.e. as flat data models, whereas S-Optica works with nested data models and has to solve a bigger impedance mismatch. On the other hand, although Quill and Slick support updates and deletes, they do this with ad-hoc languages that escape the collection-like interface. Optica should be able to supply a standard interface in order to support writes by introducing additional optics. As a final note, we want to recall that, as Sect. 6 points out, Optica should not be seen as a competitor but as a complement for these libraries, since optics and comprehensions were shown to be compatible.

Conclusions

This paper has attempted to demonstrate that optics embrace a much wider range of representations beyond concrete, van Laarhoven, profunctor optics and other isomorphic acquaintances. We have shown, for instance, that a restricted subset of XQuery can be properly understood as an optic representation, i.e. as an abstraction whose essential purpose is to allow us to select parts from a data source by using powerful combinators, declaratively, and derive queries from those selectors. From this standpoint, data sources of optic representations may range far beyond general immutable structures: they might be XML documents, as in the case of XQuery, or relational databases.
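The tagless-final recipe behind this claim can be illustrated with a deliberately tiny sketch. The names below (GetterSym, FunGetter, PathGetter) are hypothetical and far simpler than the actual Optica type system; the point is only that one query, written once against the class, admits both a standard function-based semantics and a non-standard one that emits a query-like path:

```scala
object GetterSketch {
  // Hypothetical class of "getter representations" (NOT the actual Optica
  // API): the only combinator provided is sequential composition.
  trait GetterSym[Repr[_, _]] {
    def andThen[S, A, B](f: Repr[S, A], g: Repr[A, B]): Repr[S, B]
  }

  // Standard semantics: a getter is just a function S => A.
  type Fun[S, A] = S => A
  object FunGetter extends GetterSym[Fun] {
    def andThen[S, A, B](f: Fun[S, A], g: Fun[A, B]): Fun[S, B] = f.andThen(g)
  }

  // Non-standard semantics: a getter is an access path, a toy stand-in for
  // query generation in the style of the XQuery interpreter.
  final case class Path[S, A](steps: List[String])
  object PathGetter extends GetterSym[Path] {
    def andThen[S, A, B](f: Path[S, A], g: Path[A, B]): Path[S, B] =
      Path(f.steps ++ g.steps)
  }
}
```

A query abstracted over GetterSym can then be run in memory via FunGetter or rendered as a path via PathGetter, mirroring at toy scale how Optica queries are evaluated by concrete optics or translated into XQuery.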
In fact, we have also shown how to derive SQL queries from TripletFun, an optic representation that endorses the separation of concerns between selection, filtering and collection aspects, which characterizes SQL SELECT statements. Strictly speaking, we may say that SQL is not an optic but a query language which is translatable from an optic representation. In future work, we aim at testing the generality of the language of optics through the generation of other effective, idiomatic translations into a diverse range of querying infrastructures. We will particularly pay attention to technologies that are more recent than XQuery, with a clear bias towards nested data models such as document-oriented NoSQL databases like MongoDB [52], and languages like GraphQL [55].

We put forward Optica, a full-fledged DSL, to specify what all these representations have in common, i.e. the concept of optic itself. Technically speaking, the type system of Optica encodes the compositional and querying features of getters, affine folds and folds, independently of any particular representation; concrete optics provide the semantic domains for its standard denotational semantics; and XQuery, TripletFun and T-LINQ represent semantic domains for non-standard optic representations. Currently, Optica only pays attention to a very restricted set of optics, namely getters, affine folds and folds. In future work, we will contemplate other optics like lenses, affine traversals or traversals, as well as additional combinators that populate de-facto libraries like Haskell lens and Monocle. This will force us to also pay attention to the laws (e.g. the get-set law of lenses) that the intended queries of optics must comply with. We think optic algebras [56] will be instrumental in that formalization.

The ultimate goal behind this quest for the language of optics has been to show that optics can play a significant role in the theory and practice of language-integrated query. In particular, we have demonstrated how optics can be used as a high-level language in order to derive comprehension queries, the most common approach in the LINQ field nowadays.
This has the advantage of allowing programmers to exploit optics, the de-facto standard for dealing with hierarchical data structures, in their LINQ developments. Additionally, the XQuery and SQL interpretations have also shown that the language of optics is general enough to cope with LINQ systems independently from comprehensions. However, in the case of SQL, this is done at the expense of a more limited expressiveness, since joins are not currently supported. We plan to investigate possible extensions to Optica based on the compositional encoding of equijoins in [29]. We also plan to investigate future interpretations of Optica into declarative query languages such as Datalog [57] and description logics [58], as well as its connection with recent developments in comprehension-related languages based on monoids [59]. We also think that Optica in its current shape has a great potential to deal with modern warehouse technologies aimed at data analytics, where updates are not customary.

Optics show a potential to cope not only with retrieval queries but also with updates, a kind of query that is commonly unaddressed in theoretical accounts but patently necessary in practice. This paper lays the foundation to engage with this issue in future work. On the one hand, extending the syntax and type system of Optica (and S-Optica) with new optic types and combinators is trivial. On the other hand, the feasibility of introducing updates in the interpretation is subject to limitations of the particular infrastructure. For example, XQuery does not support updates (although there are extensions that deal with them [60]), and thus the evaluation of optics with updating capabilities would be partial. As for SQL, it does support updates, but there is a tradeoff with expressiveness: not all relational queries can be updatable views, which introduces a new level of partiality. Whether triplets need to be extended in order to accommodate updates is something that requires further research.
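For reference, the update capability under discussion corresponds to the classic concrete lens, whose textbook shape and laws can be sketched in Scala as follows. These are the standard folklore definitions, not Optica's encoding (which remains future work); Person and age are illustrative names only:

```scala
object LensSketch {
  // Concrete lens: a getter paired with a setter.
  final case class Lens[S, A](get: S => A, set: A => S => S) {
    // get-set law: writing back what was just read leaves the source unchanged.
    def getSet(s: S): Boolean = set(get(s))(s) == s
    // set-get law: reading right after a write returns the written value.
    def setGet(s: S, a: A): Boolean = get(set(a)(s)) == a
  }

  final case class Person(name: String, age: Int)

  // A lens focusing on the age field of Person.
  val age: Lens[Person, Int] = Lens(_.age, a => p => p.copy(age = a))
}
```

The partiality discussed above shows up precisely here: a backend can only interpret set when the corresponding view is updatable, whereas get is always available.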
Lastly, we are very optimistic about the potential of updates in modern technologies based on nested models [52, 55], where we have carried out several simple experiments with positive results.

Finally, we have implemented a proof-of-concept of Optica and its interpreters in the Scala library S-Optica by using the tagless-final style. Optica is thus implemented as a type class: the class of optic representations and their intended queries. Beyond the generic queries that we have used to guide the explanations, we have tested S-Optica with other queries around the same domains and with new domains that were extracted from the official documentation of Monocle, Slick and Quill. These examples are located in an experimental branch of S-Optica that will be available as soon as the library matures. In this sense, we intend to profit from the many improvements in the imminent release of Scala 3.0, particularly in regard to type classes and metaprogramming facilities (https://dotty.epfl.ch/docs/reference/metaprogramming/toc.html), with a new implementation of Optica in Dotty [61]. Similar implementations may also be developed in other languages that support the tagless-final approach, such as Haskell or OCaml. In any case, the results that have been obtained are encouraging enough to anticipate the feasibility of extending existing comprehension-based LINQ libraries in these languages with optic capabilities.

Appendix A. Scala Background
This section aims at providing a brief background on those Scala features that we use in this paper. First, Table A.2 supplies a cheat sheet where we can find examples and short descriptions of the abstractions and constructions that we consider to be more relevant in the particular context of this work. As can be seen, some of them are specific to Scala, but there are other concepts which are widespread in the functional programming community, for which we just want to show how to encode them in this language. Second, we describe the general pattern to encode type classes [62] in Scala [63].
Appendix A.1. Encoding Type Classes in Scala
In Scala, we can use traits to define new type classes. For instance, we encode Functor as follows:

  trait Functor[F[_]] {
    def fmap[A, B](fa: F[A])(f: A => B): F[B]
  }
The trait itself is parameterized with a type constructor F[_]; therefore, this is a type constructor class. It declares the fmap method, which is parameterized with concrete type parameters A and B and value parameters fa and f, organized in two sections. (Scala supports definitions with multiple groups of value parameters, delimited by parentheses; in this particular situation, the separation turns out to be helpful to improve type inference while invoking the method.) As can be seen, function types are represented with the arrow =>.

Table A.2: Scala Cheat Sheet

- algebraic data type
  Example: sealed abstract class Option[A]; case class None[A]() extends Option[A]; case class Some[A](a: A) extends Option[A]
  ADTs are implemented using object inheritance. The example shows Option, which is also known as Maybe.

- case class
  Example: case class Person(name: String, age: Int)
  Defines a class with special features, like construction and observation facilities.

- companion object
  Example: trait Person; object Person
  A module that serves as a companion to a class or trait with the very same name. We use it to supply class members and provide implicit definitions, like conversors and type class instances.

- for comprehension
  Example: for { i <- List(1, 2); j <- List(3, 4) } yield i + j // res: List[Int] = List(4, 5, 5, 6)
  Syntactic sugar for flatMap, map, etc. Analogous to Haskell's do-notation.

- function type
  Example: val f: Int => Boolean = i => i > 0; f(3) // res: Boolean = true
  Function types are represented with arrows separating domain and codomain. Lambda expressions follow a similar syntax, where the arrow separates parameter and function body.

- implicit resolution
  Example: def isum(x: Int)(implicit y: Int): Int = x + y; implicit val i: Int = ...; isum(...) // res: Int = ...
  A family of techniques to let the compiler infer certain parameters automatically. In the example, i is implicitly passed as second argument to isum.

- partial function syntax
  Example: val f: Option[Int] => Boolean = { case None => true; case Some(_) => false }
  Special syntax for those situations where we want to produce an anonymous (potentially partial) function that requires pattern matching on its parameter.

- placeholder syntax
  Example: val inc: Int => Int = i => i + 1; val inc2: Int => Int = _ + 1
  Syntax for lambda expressions where we refer to the parameter as '_'; consequently, there is no need to name it. Both inc and inc2 (placeholder syntax) are equivalent.

- trait
  Example: trait Person { def name: String = "John"; def age: Int }
  Similar to Java interfaces (as they enable multiple inheritance), but traits support partial implementation of members.

- type parameter
  Example: trait List[A]; def nil[A]: List[A]; trait Symantics[Repr[_]]
  Types that are taken as parameters by class or method definitions. A special notation is required if we expect higher-kinded types, like Repr.

- type lambda
  Example: λ[x => x]; λ[x => Int]; λ[x => Option[x]]
  Notation enabled by the kind-projector compiler plugin to produce anonymous type functions of the kind ∗ → ∗.

Now, we could follow the same pattern to provide other type classes, like Pointed:

  trait Pointed[F[_]] {
    def point[A](a: A): F[A]
  }

or Bind:

  trait Bind[F[_]] {
    def bind[A, B](fa: F[A])(f: A => F[B]): F[B]
  }

which follow the very same pattern. The previous definitions form the building blocks of
Monad. Thereby, we could compose them to provide the corresponding type class:

  trait Monad[F[_]] extends Functor[F] with Pointed[F] with Bind[F] {
    def fmap[A, B](fa: F[A])(f: A => B) = bind(fa)(a => point(f(a)))
  }

Here, we exploit the multiple inheritance mechanism provided by Scala to mix the involved traits. At this point, we are able to implement fmap in terms of bind and point, once and for all. It is common practice to deploy type class instances in the type class companion object, since the Scala compiler will search for instances in this module, among other places. For example, this is the Monad companion object, where we have placed the monad instance for Option (Maybe in Haskell):

  object Monad {
    implicit object OptionMonad extends Monad[Option] {
      def point[A](a: A) = Some(a)
      def bind[A, B](fa: Option[A])(f: A => Option[B]) = fa match {
        case None    => None
        case Some(a) => f(a)
      }
    }
  }

There are several alternatives to supply an instance. On this occasion, we have decided to implement it as an object OptionMonad which is declared with the implicit modifier, so that the compiler can find it implicitly if necessary. The implementation of point and bind turns out to be trivial. Once we have defined a type class, we can implement derived functionality. For instance, we can define the typical join method:

  def join[F[_], A](ffa: F[F[A]])(implicit M: Monad[F]): F[A] =
    M.bind(ffa)(identity)
This method requires implicit evidence of Monad for F, which is used in the implementation to invoke bind. Now, we can use the Option instance to flatten a nested optional value by means of join:

  join[Option, Int](Option(Option(3)))(Monad.OptionMonad) // res: Option[Int] = Some(3)

Here, we manually supply the type parameters and the monad evidence. Fortunately, the Scala compiler is able to infer them; therefore, the next version is preferred:

  join(Option(Option(3))) // res: Option[Int] = Some(3)
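For convenience, the pieces introduced along this appendix can be assembled into a single self-contained object; the snippet also exercises the derived fmap, which comes for free from the Monad instance:

```scala
object TypeClassDemo {
  trait Functor[F[_]] { def fmap[A, B](fa: F[A])(f: A => B): F[B] }
  trait Pointed[F[_]] { def point[A](a: A): F[A] }
  trait Bind[F[_]]    { def bind[A, B](fa: F[A])(f: A => F[B]): F[B] }

  // Monad composes the three classes and derives fmap once and for all.
  trait Monad[F[_]] extends Functor[F] with Pointed[F] with Bind[F] {
    def fmap[A, B](fa: F[A])(f: A => B): F[B] = bind(fa)(a => point(f(a)))
  }

  object Monad {
    // Instance placed in the companion object for implicit resolution.
    implicit object OptionMonad extends Monad[Option] {
      def point[A](a: A): Option[A] = Some(a)
      def bind[A, B](fa: Option[A])(f: A => Option[B]): Option[B] = fa match {
        case None    => None
        case Some(a) => f(a)
      }
    }
  }

  // Derived functionality: flatten a nested value.
  def join[F[_], A](ffa: F[F[A]])(implicit M: Monad[F]): F[A] =
    M.bind(ffa)(identity)

  // join(Option(Option(3))) evaluates to Some(3);
  // Monad.OptionMonad.fmap(Option(2))(_ + 1) evaluates to Some(3) as well.
}
```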
As a final remark, note that a monad instance subsumes instances for the rest of the type classes that form it; e.g. OptionMonad also serves as a Pointed instance for Option.

Appendix B. XML Schemas
Appendix B.1. Couple XSD
Acknowledgements

We would like to thank James Cheney, Oleg Kiselyov, Eric Torreborre and the anonymous reviewers for their helpful comments and corrections to a previous version of this paper. In particular, we give Cheney credit for Sect. 6, since he showed us how to translate into comprehensions the optic-based organization example, using the Links programming language.
References

[1] V. Tannen, P. Buneman, L. Wong, Naturally embedded query languages, in: J. Biskup, R. Hull (Eds.), Database Theory - ICDT'92, 4th International Conference, Berlin, Germany, October 14-16, 1992, Proceedings, Vol. 646 of Lecture Notes in Computer Science, Springer, 1992, pp. 140–154. doi:10.1007/3-540-56039-4_38.
[2] E. Meijer, B. Beckman, G. M. Bierman, LINQ: reconciling object, relations and XML in the .NET framework, in: S. Chaudhuri, V. Hristidis, N. Polyzotis (Eds.), Proceedings of the ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, USA, June 27-29, 2006, ACM, 2006, p. 706. doi:10.1145/1142473.1142552.
[3] J. Cheney, S. Lindley, P. Wadler, A practical theory of language-integrated query, in: Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming, ICFP '13, ACM, New York, NY, USA, 2013, pp. 403–416. doi:10.1145/2500365.2500586.
[4] G. Copeland, D. Maier, Making Smalltalk a database system, in: ACM Sigmod Record, Vol. 14, ACM, 1984, pp. 316–325.
[5] C. Ireland, D. Bowers, M. Newton, K. Waugh, A classification of object-relational impedance mismatch, in: 2009 First International Conference on Advances in Databases, Knowledge, and Data Applications, IEEE, 2009, pp. 36–43.
[6] P. Hudak, Building domain-specific embedded languages, ACM Comput. Surv. 28 (4es) (Dec. 1996). doi:10.1145/242224.242477.
[7] M. Odersky, P. Altherr, V. Cremet, B. Emir, S. Maneth, S. Micheloud, N. Mihaylov, M. Schinz, E. Stenman, M. Zenger, An overview of the Scala programming language, Tech. rep. (2004).
[8] R. Norris, Doobie - a functional JDBC layer for Scala [cited 14 / / ]. URL https://tpolecat.github.io/doobie/
[9] J. Cheney, Language-integrated query: state of the art and open problems, in: L. Daynès, G. Fletcher, W. S. Han (Eds.), NII Shonan Meeting Report No. 2017-6, 2017. URL https://shonan.nii.ac.jp/archives/seminar/098/wp-content/uploads/sites/149/2017/02/cheney-shonan-linq.pdf
[10] P. Wadler, Comprehending monads, in: Proceedings of the 1990 ACM Conference on LISP and Functional Programming, ACM, 1990, pp. 61–78.
[11] P. Wadler, Monads for functional programming, in: International School on Advanced Functional Programming, Springer, 1995, pp. 24–52.
[12] P. Trinder, P. Wadler, Improving list comprehension database queries, in: Fourth IEEE Region 10 International Conference TENCON, 1989, pp. 186–192. doi:10.1109/TENCON.1989.176921.
[13] P. Buneman, L. Libkin, D. Suciu, V. Tannen, L. Wong, Comprehension syntax, SIGMOD Record 23 (1) (1994) 87–96. doi:10.1145/181550.181564.
[14] P. Buneman, S. A. Naqvi, V. Tannen, L. Wong, Principles of programming with complex objects and collection types, Theor. Comput. Sci. 149 (1) (1995) 3–48. doi:10.1016/0304-3975(95)00024-Q.
[15] L. Wong, Kleisli, a functional query system, J. Funct. Program. 10 (1) (2000) 19–56. doi:10.1017/S0956796899003585.
[16] E. Cooper, The script-writer's dream: How to write great SQL in your own language, and be sure it will succeed, in: P. Gardner, F. Geerts (Eds.), Database Programming Languages - DBPL 2009, 12th International Symposium, Lyon, France, August 24, 2009, Proceedings, Vol. 5708 of Lecture Notes in Computer Science, Springer, 2009, pp. 36–51. doi:10.1007/978-3-642-03793-1_3.
[17] D. Syme, Leveraging .NET meta-programming components from F#. doi:10.1145/1159876.1159884.
[18] A. Ulrich, G. Giorgidze, J. Weijers, N. Schweinsberg, DSH: Database supported Haskell, http://hackage.haskell.org/package/DSH (2000–2004).
[19] K. Suzuki, O. Kiselyov, Y. Kameyama, Finally, safely-extensible and efficient language-integrated query, in: Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, PEPM '16, ACM, New York, NY, USA, 2016, pp. 37–48. doi:10.1145/2847538.2847542.
[20] O. Kiselyov, T. Katsushima, Sound and efficient language-integrated query - maintaining the ORDER, in: B. E. Chang (Ed.), Programming Languages and Systems - 15th Asian Symposium, APLAS 2017, Suzhou, China, November 27-29, 2017, Proceedings, Vol. 10695 of Lecture Notes in Computer Science, Springer, 2017, pp. 364–383. doi:10.1007/978-3-319-71237-6_18.
[21] O. Kiselyov, Typed tagless final interpreters, in: Generic and Indexed Programming, Springer, 2012, pp. 130–174.
[22] H. Xi, C. Chen, G. Chen, Guarded recursive datatype constructors, in: Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '03, ACM, New York, NY, USA, 2003, pp. 224–235. doi:10.1145/604131.604150.
[23] S. Najd, S. Lindley, J. Svenningsson, P. Wadler, Everything old is new again: quoted domain-specific languages, in: M. Erwig, T. Rompf (Eds.), Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, PEPM 2016, St. Petersburg, FL, USA, January 20-22, 2016, ACM, 2016, pp. 25–36. doi:10.1145/2847538.2847541. URL http://dl.acm.org/citation.cfm?id=2847538
[24] F. W. Brasil, Quill - compile-time language integrated queries for Scala [cited 14 / / ]. URL https://getquill.io
[25] Lightbend, Slick - functional relational mapping for Scala [cited 14 / / ]. URL http://slick.lightbend.com/
[26] Apache Software Foundation, The Cassandra query language. URL http://cassandra.apache.org/doc/4.0/cql/
[27] L. Wong, Normal forms and conservative extension properties for query languages over collection types, J. Comput. Syst. Sci. 52 (3) (1996) 495–505. doi:10.1006/jcss.1996.0037.
[28] J. Cheney, S. Lindley, P. Wadler, Query shredding: efficient relational evaluation of queries over nested multisets, in: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, ACM, 2014, pp. 1027–1038. doi:10.1145/2588555.2612186.
[29] J. Gibbons, F. Henglein, R. Hinze, N. Wu, Relational algebra by way of adjunctions, PACMPL 2 (ICFP) (2018) 86:1–86:28. doi:10.1145/3236781.
[30] D. L. Parnas, On the criteria to be used in decomposing systems into modules, Commun. ACM 15 (12) (1972) 1053–1058. doi:10.1145/361598.361623.
[31] J. Hughes, Why functional programming matters, The Computer Journal 32 (2) (1989) 98–107.
[32] M. Pickering, N. Wu, J. Gibbons, Profunctor optics: Modular data accessors, The Art, Science, and Engineering of Programming 1 (2) (2017). doi:10.22152/programming-journal.org/2017/1/7.
[33] J. N. Foster, M. B. Greenwald, J. T. Moore, B. C. Pierce, A. Schmitt, Combinators for bidirectional tree transformations: A linguistic approach to the view-update problem, ACM Trans. Program. Lang. Syst. 29 (3) (May 2007). doi:10.1145/1232420.1232424.
[34] O. Grenrus, Glassery (Apr 2018) [cited 14 / / ]. URL http://oleg.fi/gists/posts/2017-04-18-glassery.html
[35] E. Kmett, lens - lenses, folds, and traversals [cited 14 / / ]. URL https://github.com/ekmett/lens
[36] J. Truffaut, Monocle - optics library for Scala [cited 14 / / ]. URL http://julien-truffaut.github.io/Monocle/
[37] R. O'Connor, Functor is to Lens as Applicative is to Biplate: Introducing Multiplate, CoRR abs/1103.2841.
[38] P. Wadler, XQuery: A typed functional language for querying XML, in: International School on Advanced Functional Programming, Springer, 2002, pp. 188–212.
[39] Habla Computing, Optica - optic-based language-integrated query [cited 14 / / ]. URL https://github.com/hablapps/scico19
[40] S. Fischer, Z. Hu, H. Pacheco, A clear picture of lens laws, in: International Conference on Mathematics of Program Construction, Springer, 2015, pp. 215–223.
[41] R. Lämmel, E. Meijer, Revealing the X/O impedance mismatch, in: International Spring School on Datatype-Generic Programming, Springer, 2006, pp. 285–367.
[42] W. Keller, Mapping objects to tables, in: Proc. of European Conference on Pattern Languages of Programming and Computing, Kloster Irsee, Germany, Vol. 206, Citeseer, 1997, p. 207.
[43] C. J. Date, A critique of the SQL database language, SIGMOD Rec. 14 (3) (1984) 8–54. doi:10.1145/984549.984551.
[44] J. Cheney, Email correspondence, Personal communication (May 2019).
[45] D. Leijen, E. Meijer, Domain specific embedded compilers, in: Proceedings of the 2nd Conference on Domain-Specific Languages - Volume 2, DSL'99, USENIX Association, USA, 1999, p. 9.
[46] R. Hinze, et al., Fun with phantom types, The Fun of Programming (2003) 245–262.
[47] O. Kiselyov, Effects without monads: Non-determinism back to the meta language, Electronic Proceedings in Theoretical Computer Science 294 (2019) 15–40. doi:10.4204/eptcs.294.2. URL http://dx.doi.org/10.4204/EPTCS.294.2
[48] C. W. Bachman, The programmer as navigator, Commun. ACM 16 (11) (1973) 653–658. doi:10.1145/355611.362534.
[49] T. Katsushima, O. Kiselyov, Language-integrated query with ordering, grouping and outer joins (poster paper), in: U. P. Schultz, J. Yallop (Eds.), Proceedings of the 2017 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, PEPM 2017, Paris, France, January 18-20, 2017, ACM, 2017, pp. 123–124. doi:10.1145/3018882. URL http://dl.acm.org/citation.cfm?id=3018893
[50] M. Fernández, Y. Kadiyska, D. Suciu, A. Morishima, W.-C. Tan, Silkroute: A framework for publishing relational data in XML, ACM Transactions on Database Systems (TODS) 27 (4) (2002) 438–493.
[51] S. Lindley, J. Cheney, Row-based effect types for database integration, in: B. C. Pierce (Ed.), Proceedings of TLDI 2012: The Seventh ACM SIGPLAN Workshop on Types in Languages Design and Implementation, Philadelphia, PA, USA, January 28, 2012, ACM, 2012, pp. 91–102. doi:10.1145/2103786.2103798. URL http://dl.acm.org/citation.cfm?id=2103786
[52] MongoDB, Inc., MongoDB (2019).
[53] R. Horn, R. Perera, J. Cheney, Incremental relational lenses, Proc. ACM Program. Lang. 2 (ICFP) (2018) 74:1–74:30. doi:10.1145/3236769.
[54] E. Burmako, Scala macros: let our powers combine!: on how rich syntax and static types work with metaprogramming, in: Proceedings of the 4th Workshop on Scala, ACM, 2013, p. 3.
[55] O. Hartig, J. Pérez, An initial analysis of Facebook's GraphQL language, in: AMW 2017: 11th Alberto Mendelzon International Workshop on Foundations of Data Management and the Web, Montevideo, Uruguay, June 7-9, 2017, Vol. 1912, 2017.
[56] J. López-González, J. M. Serrano, Towards optic-based algebraic theories: The case of lenses, in: International Symposium on Trends in Functional Programming, Springer, 2018, pp. 74–93.
[57] S. Ceri, G. Gottlob, L. Tanca, What you always wanted to know about Datalog (and never dared to ask), IEEE Transactions on Knowledge and Data Engineering 1 (1) (1989) 146–166.
[58] M. Krötzsch, F. Simancik, I. Horrocks, A description logic primer, CoRR abs/1201.4089. URL https://arxiv.org/pdf/1201.4089
[59] L. Fegaras, An algebra for distributed big data analytics, Journal of Functional Programming 27 (2017) e27. doi:10.1017/S0956796817000193.
[60] M. Benedikt, J. Cheney, Semantics, types and effects for XML updates, in: P. Gardner, F. Geerts (Eds.), Database Programming Languages, Springer Berlin Heidelberg, Berlin, Heidelberg, 2009, pp. 1–17.
[61] M. Odersky, D. Petrashko, G. Martres, et al., The Dotty project [cited 18 / / ]. URL https://github.com/lampepfl/dotty
[62] P. Wadler, S. Blott, How to make ad-hoc polymorphism less ad hoc, in: Proceedings of the 16th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, ACM, 1989, pp. 60–76.
[63] B. C. Oliveira, A. Moors, M. Odersky, Type classes as objects and implicits, in: Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA '10, ACM, New York, NY, USA, 2010, pp. 341–360. doi:10.1145/1869459.1869489.