A Rule-based Language for Application Integration
Daniel Ritter
SAP SE, Dietmar-Hopp-Allee 16, Walldorf, Germany
[email protected]

Jan Broß
Karlsruhe Institute of Technology, Kaiserstraße 12, Karlsruhe, Germany
[email protected]
ABSTRACT
Although message-based (business) application integration is based on orchestrated message flows, current modeling languages exclusively cover (parts of) the control flow, while under-specifying the data flow. Especially for more data-intensive integration scenarios, this fact adds to the inherent data processing weakness in conventional integration systems. We argue that with a more data-centric integration language and a relational logic based implementation of integration semantics, optimizations from the data management domain (e. g., data partitioning, parallelization) can be combined with common integration processing (e. g., scatter/gather, splitter/gather). With the Logic Integration Language (LiLa) we re-define integration logic tailored for data-intensive processing and propose a novel approach to data-centric integration modeling, from which we derive the control and data flow and apply them to a conventional integration system.
1. INTRODUCTION
Conventional message-based integration systems show weaknesses when it comes to data-intensive (business) application integration (e. g., fast-growing business areas like online player position tracking in sports management, internet of things), using integration patterns like message transformations and (partially) message routing [16]. This is due to the facts that (P1) most of the application data is stored in relational databases, which leads to many format conversions during the end-to-end processing, and the observation (P2) that application tier programming languages (e. g., Java, C#) are not designed for data-intensive processing.

The Enterprise Integration Patterns (EIP) [11, 17] can be considered a "de-facto" modeling standard. Through the icons, the collected common integration patterns can be combined to describe and configure integration semantics on an abstract level. For instance, Figure 1 shows a "Soccer Player Event" integration scenario from sports management in the EIP icon notation. The player event data is gained through a Polling Consumer, loading game events collected by sensors attached to the players and the playing field during a soccer match. Depending on the event code, a Content-based Router pattern is used to route the messages to specific filter operations for "Shots on goal" and "Player at ball" through a Content Filter, whereafter additional player information is merged into the resulting messages using a Content Enricher. Then the messages are converted into the formats understood by their receivers using a Message Translator. The "Shots on goal" information is posted as twitter feed and ball possessions are stored to file. While the control flow is modeled, the message formats (e. g., "Game events", "Player information") and the actual data processing (e. g., routing and filter conditions, enricher and mapping programs) remain hidden on a second level configuration. In contrast, a more data-aware formalization should treat data as a first-class citizen of an integration scenario. This (P3) would give an integration expert immediate control over the actual core aspect of integration, the data and its format, and (P4) would take away the burden of explicitly modeling the system's control flow, while keeping best practices and optimizations in mind, which should rather be configured by the system itself. In this context, there is a new trend to use Datalog-style rule-based languages to declaratively specify data-centric application development by Green et al. [9] and Abiteboul et al. [1], who applied logic programming (i. e., extended Datalog) to analytical and web application development. Similarly, we showed in [16] the applicability and expressiveness of standard Datalog in the context of the EIPs [11, 17].

To approach these observations (P1–P4), we propose a novel formalization tailored to data-intensive, message-based integration and a data-centric modeling approach, which we call Logic Integration Language (LiLa). For that, we re-define core EIPs as part of a conventional integration system using Datalog. Datalog allows for data processing closer to its storage representation, and is sufficiently expressive for the evaluation of EIPs [16]. Similarly, LiLa programs are based on standard Datalog+, for which we carefully defined a small set of integration-specific extensions.

Figure 1: Excerpt from a soccer player event-message integration scenario (EIP icon notation [11]).

@from(file:gameEvents.json, json){
  gE(period, time, eventCode, pId).
}
g(period, time, pId) :- gE(period, time, "Goal", pId).
br(period, time, pId) :- gE(period, time, "BallReception", pId).
gByP(period, time, firstN, lastN) :- g(period, time, pId),
                                     pInfo(pId, firstN, lastN).
pAtB(period, time, firstN, lastN) :- br(period, time, pId),
                                     pInfo(pId, firstN, lastN).
@enrich(playerInfo.json, json){ pInfo(pId, firstN, lastN). }
@to(twitter:$config, json){ gByP }
@to(file:playersAtBall.json){ pAtB }

Listing 1: Soccer Game Event Integration with LiLa.

For instance, Listing 1 shows the LiLa program of our motivating example. Notably, the data flow, formats and operations are represented as a Datalog program with annotations. The file-based message adapter @from reads a stream of game events in the JSON format, canonically converts and projects the message body to Datalog facts of the form gE. Several Datalog rules represent operations on the data like filters (i. e., predicates g, br) and an enricher @enrich (loading pInfo and merging it into gByP and pAtB), before binding the IDB relations to receiver endpoints @to that only pass specified predicates and (canonically) convert them to the configured format (e. g., JSON). From the LiLa programs we derive integration semantics and an efficient control flow using pattern detection. To show the applicability of our approach to real-world integration scenarios and to conduct performance measurements, we synthesize LiLa programs to the open-source integration system Apache Camel [4] that implements most of the integration semantics in form of EIPs. The results of the runtime analysis show that the usage of a more data-centric message processing is especially promising (a) for message transformations, while the routing efficiency remains similar to the conventional processing, and (b) from an end-to-end messaging point of view. Furthermore, the data-centric modeling with LiLa emphasizes the potential for optimizations and a novel modeling clarity compared to the existing control flow centric languages.

The remainder of the paper is organized along its contributions. Section 2 briefly describes the re-definition of EIPs as Datalog programs as foundation for the construction of LiLa in Section 3. The synthesis of LiLa programs to Apache Camel is explained in Section 4 as basis for experimental evaluations discussed in Section 5. Section 6 sets LiLa in context to related work and Section 7 concludes the paper.
2. INTEGRATION PATTERNS IN A NUTSHELL
The Enterprise Integration Patterns (EIPs) [11, 17] define operations on the header (i. e., payload's meta-data) and body (i. e., message payload) of a message, which are normally implemented in the integration system's host language (e. g., Java, C#). Integration Logic Programming (ILP) targets an enhancement of conventional integration systems for data-intensive processing, while preserving the general integration semantics like Quality of Service (e. g., best effort, exactly once) and the Message Exchange Pattern (e. g., one-way, two-way). In other words, the content part of the patterns is evaluated by a Datalog system, which is invoked by an integration system that processes the results.
When connecting applications, various operations are executed on the transferred messages in a uniform way. The arriving messages are converted into an internal format understood by the pattern implementation, called Canonical Data Model (CDM) [11, 17], before the messages are transformed to the target format. Hence, if a new application is added to the integration solution, only conversions between the CDM and the application format have to be created. Consequently, for the re-definition of integration patterns, we define a CDM as Datalog Program, which consists of a set of facts, with an optional set of (supporting) rules as message body, and a set of meta-facts that describes the actual data as header. The meta-facts encode the name of the fact's predicate and all parameter names within the relation as well as the position of each parameter. With that information, parameters can be accessed by name instead of position by Datalog rules (e. g., for selections, projections).
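As an illustration of this CDM encoding, the following sketch (hypothetical names and data, not the paper's implementation) models a message body as facts and a header of meta-facts that map parameter names to positions, enabling access by name:

```python
# Body: facts of the predicate gE(period, time, eventCode, pId)
body = {("gE", (1, "00:12", "Goal", 10)),
        ("gE", (1, "00:47", "BallReception", 7))}

# Header: meta-facts encoding the predicate name and parameter positions
header = {"gE": {"period": 0, "time": 1, "eventCode": 2, "pId": 3}}

def param(fact, name, header):
    """Access a fact's parameter by name via the header meta-facts."""
    pred, args = fact
    return args[header[pred][name]]

# A selection by parameter name instead of position
goals = [f for f in body if param(f, "eventCode", header) == "Goal"]
```

A rule that selects on eventCode thus does not break when the position of a parameter changes, as long as the meta-facts are updated accordingly.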
Before re-defining the integration semantics of the routing and transformation patterns using Datalog (by a function ilp), let us recall some relevant, basic Datalog operations: join, projection, union, and selection. The join of two relations r(x, y) and s(y, z) on parameter y is encoded as j(x, y, z) ← r(x, y), s(y, z), which projects all three parameters to the resulting predicate j. More explicitly, a projection on parameter x of relation r(x, y) is encoded as p(x) ← r(x, y). The union of r(x, y) and s(x, y) is u(x, y) ← r(x, y). u(x, y) ← s(x, y), which combines several relations into one. The selection on r(x, y) according to a built-in predicate φ(x, [const|z]) is encoded as s(x, y) ← r(x, y), φ(x, [const|z]), which only returns s(x, y) records for which φ(x, [const|z]) evaluates to true for a given constant value const or a variable value z. Built-in predicates can be numerical binary relations φ(x, const) like <, >, <=, >=, =, string binary relations like equals, contains, startswith, endswith, numerical expressions based on binary operators like =, +, −, *, / (e. g., x = p(y) + 1), and operations on relations like y = max(p(x)), y = min(p(x)), which assign the maximal or the minimal value of a predicate p to a parameter y. Although our approach allows each single pattern definition to evaluate arbitrary Datalog rules, queries and built-in predicates, the Datalog-to-pattern mapping tries to identify and focus on the most relevant Datalog operations for a specific pattern. An overview of all discussed, re-defined routing functions and their mapping to Datalog constructs is shown in Figure 2.

Message Routing Patterns.
The routing patterns can be seen as control and data flow definitions of an integration channel pipeline. For that, they access the message to route it within the integration system and eventually to its receiver(s). They influence the channel and message cardinality as well as the content of the message.

Figure 2: Message routing and transformation patterns mapped to Datalog (patterns: Router, Filter, Recipient List, Multicast, Join Router, Splitter, Correlation, Completion, Aggregation, Message Translator, Content Filter, Content Enricher; operations: built-in, join, selection, projection, union). Most common Datalog operations for a single pattern are marked "dark blue", less common ones "light blue", and possible but uncommon ones "white".

The most common routing pattern that determines the message's route based on its body is the Content-based Router. The stateless router has a channel cardinality of 1:n, where n is the number of leaving channels, while one channel enters the router, and a message cardinality of 1:1. The entering message constitutes the leaving message according to the evaluation of a routing condition. This condition is a function rc, with {out_1, out_2, ..., out_n} := rc(msg_in.body.x, conds), where msg_in determines the entering message and body.x is an arbitrary field x of its structure. The function rc evaluates to a list of Boolean output on a list of conditions conds (i. e., Datalog rules) for each leaving channel. The output {out_1, out_2, ..., out_n} is a list of Boolean values for each of the n ∈ N leaving channels. However, only one channel must evaluate to true, all others to false. The Boolean output determines on which leaving channel the message is routed further (i. e., exactly one channel will route the message). Common integration systems implement a routing function that provides the entering message msg_in, represented by a Datalog program (i. e., mostly facts), and the conds configurations as Datalog rules. Since standard Datalog rules cannot directly produce a Boolean result, there are at least two ways of re-defining rc: (a) by a supporting function in the integration system, or (b) by adding Boolean Datalog facts for each leaving channel that are joined with the evaluated conditions and exclusively returned by projection (not further discussed). An additional function help_rc for option (a) could be defined as {out_1, out_2, ..., out_n} := help_rc(list(list(fact))), fitting to the input of the routing function, where list(list(fact)) describes the resulting facts of the evaluation of conds for each channel. The function help_rc emits true, if and only if list(fact) ≠ ∅, and false otherwise.
Now, the ILP routing condition is defined as list(fact) := ilp_rc(msg_in.body.x, conds), while being evaluated for each channel condition, thus generating list(list(fact)). The conds would then mainly be Datalog operations like selection or built-in predicates. For the message filter, which is a special case of the router that differs only in its channel cardinality of 1:1 and the resulting message cardinality of 1:[0|1], ilp_rc would have to be evaluated once.

The stateless Multicast and Recipient List patterns route multiple messages to leaving channels, which gives them a message and channel cardinality of 1:n. While the multicast routes messages statically to the leaving channels, the recipient list determines the receiving channels dynamically. The receiver determination function rd, with {out_1, out_2, ..., out_n} := rd(msg_in.[header.y | body.x]), computes n ∈ N receiver channel configurations {out_1, out_2, ..., out_n} by extracting their key values either from an arbitrary message header field y or from the body field x of the message. The integration system has to implement a receiver determination function that takes the list of key-strings {out_1, out_2, ..., out_n} as input, for which it looks up receiver configurations recv_i, recv_i+1, ..., recv_i+m, where i, m, n ∈ N and m ≤ n, and passes copies of the entering message {msg'_out_1, msg'_out_2, ..., msg'_out_m}. In terms of Datalog, rd_ilp is a projection from values of the message body or header to a unary output relation. For instance, the receiver configuration keys recv_1 and recv_2 have to be part of the message body like body(x, 'recv_1'). body(x, 'recv_2'). and rd_ilp would evaluate a Datalog rule similar to config(y) ← body(x, y). For more dynamic receiver determinations, a dynamic routing pattern could be used. However, deviations from the original pattern defined in [11, 17] would extend the expressiveness of the recipient list. Our ILP definition does not prevent from doing that. The multicast and join router patterns are statically configurable 1:n and n:1 channel patterns, which do not need a re-definition as ILP.

The antipodal Splitter and
Aggregator patterns both have a channel cardinality of 1:1 and create new leaving messages. Thereby the splitter breaks the entering message into multiple (smaller) messages (i. e., message cardinality of 1:n) and the aggregator combines multiple entering messages to one leaving message (i. e., message cardinality of n:1). To be able to receive multiple messages from different channels, a Join Router [14] pattern with a channel cardinality of n:1 and a message cardinality of 1:1 can be used as predecessor to the aggregator. Hereby, the stateless splitter uses a split condition sc, with {out_1, out_2, ..., out_n} := sc(msg_in.body, conds), which accesses the entering message's body to determine a list of distinct body parts {out_1, out_2, ..., out_n}, based on a list of conditions conds, that are each inserted into a list of individual, newly created, leaving messages {msg_out_1, msg_out_2, ..., msg_out_n} with n ∈ N by a splitter function. The header and attachments are copied from the entering to each leaving message. The re-definition sc_ilp of split condition sc evaluates a set of Datalog rules as conds, which mostly use Datalog selection, and sometimes built-in and join constructs (the latter two are marked "light blue"). Each part of the body out_i is a set of facts that is passed to a split function, which wraps each set into a single message.

The stateful aggregator defines a correlation condition, a completion condition and an aggregation strategy. The correlation condition crc, with coll_i := crc(msg_in.[header.y | body.x], conds), determines the aggregate collection coll_i, with i ∈ N, based on a set of conditions conds, to which the message is stored. The completion condition cpc, with cp_out := cpc(msg_in.[header.y | body.x]), evaluates to a Boolean output cp_out based on header or body field information (similar to the message filter). If cp_out == true, then the aggregation strategy as, with agg_out := as(msg_in_1, msg_in_2, ..., msg_in_n), is called by an implementation of the messaging system and executed, else the current message is added to the collection coll_i. The as evaluates the correlated entering message collection coll_i and emits a new leaving message msg_out. For that, the messaging system has to implement an aggregation function that takes agg_out (i. e., the output of as) as input. These three functions are re-defined as crc_ilp, cpc_ilp such that the conds are rules mainly with selection and built-in Datalog constructs. The cpc_ilp makes use of the defined help_rc function to map its evaluation result (i. e., list of facts or empty) to the Boolean value cp_out. The aggregation strategy as is re-defined as as_ilp, which mainly uses Datalog union to combine lists of facts from different messages. The message format remains the same. To transform the aggregates' formats, a message translator should be used to keep the patterns modular. However, the combination of the aggregation strategy with translation capabilities could lead to runtime optimizations. An overview of all discussed, re-defined routing functions and their mapping to Datalog constructs is shown in Figure 2.

Message Transformation Patterns.
The transformation patterns exclusively target the content of the messages in terms of format conversions and modifications.

The stateless Message Translator changes the structure or format of the entering message without generating a new one (i. e., channel and message cardinality 1:1). For that, the translator computes the transformed structure by evaluating a mapping program mt, with msg_out.body := mt(msg_in.body). Thereby the field content can be altered.

The related Content Filter and Content Enricher patterns can be subsumed by the general Content Modifier pattern and share the same characteristics as the translator pattern. The filter evaluates a filter function, which only filters out parts of the message structure, e. g., fields or values, and the enricher adds new fields or values as data to the existing content structure using an enricher program ep, with msg_out.body := ep(msg_in.body, data).

The re-definition of the transformation function mt_ilp for the message translator mainly uses Datalog join and projection (plus built-ins for numerical calculations and string operations, thus marked "light blue"), and Datalog selection, projection and built-in (mainly numerical expressions and character operations) for the content filter. While projections allow for rather static, structural filtering, the built-in and selection operators can be used to filter more dynamically based on the content. The resulting Datalog programs are passed as msg_out.body. In addition, the re-defined enricher program ep_ilp mainly uses Datalog union operations to add additional data to the message as Datalog programs. Figure 2 summarizes the discussed message transformation functions.

Pattern Composition.
The defined patterns can be composed to more complex integration programs (i. e., integration scenarios or pipelines). From the many combinations of patterns, we briefly discuss two important structural patterns that are frequently used in integration scenarios: (1) scatter/gather and (2) splitter/gather [11, 17]. Both are supported by the patterns re-defined as ILPs.

The scatter/gather pattern (with a 1:n:1 channel cardinality) is a multicast or recipient list that copies messages to several, statically or dynamically determined pipeline configurations, which each evaluate a sequence of patterns on the messages in parallel. Through a join router and an aggregator pattern, the messages are structurally and content-wise joined.

The splitter/gather pattern (with a 1:n:1 message cardinality) splits one message into multiple parts, which can be processed in parallel by a sequence of patterns. In contrast to the scatter/gather, the pattern sequence is the same for each instance. A subsequently configured aggregator combines the messages to one.
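A splitter/gather composition can be sketched as follows (illustrative helpers, not the paper's implementation; the parallel execution is elided and each part runs the same filter sequence, the gather step being a Datalog-style union):

```python
def split(facts, key):
    """Split one fact set into parts by a key function (split condition)."""
    parts = {}
    for f in facts:
        parts.setdefault(key(f), set()).add(f)
    return list(parts.values())

def gather(parts):
    """Aggregate the processed parts back into one fact set (union)."""
    return set().union(*parts)

msg = {("gE", 1, "Goal"), ("gE", 1, "BallReception"), ("gE", 2, "Goal")}
parts = split(msg, key=lambda f: f[1])               # split by period
processed = [{f for f in p if f[2] == "Goal"} for p in parts]  # same sequence
result = gather(processed)
```

Because every part passes through the same pattern sequence, the parts could be processed by parallel workers without changing the gathered result.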
3. LOGIC INTEGRATION LANGUAGE
In the context of data-intensive message processing, the current control flow-centric integration languages do not allow to design the data flow. Through the re-definition of the integration patterns with Datalog as ILPs, a foundation for a data-centric definition of integration scenarios is provided. Hence the language design of the subsequently defined Logic Integration Language (LiLa) is based on Datalog: LiLa programs carefully extend standard Datalog+ by integration semantics, using annotations for message endpoints and complex routing patterns. As shown in Listing 2, an annotation consists of a head with a name preceded by "@" and zero or more parameters enclosed in brackets, as well as a body enclosed in curly brackets.

@<annotationName>(<parameter>+){
  <Annotation Body>
}

Listing 2: Format of an annotation in LiLa.
A LiLa program defines dependencies between Datalog facts, rules and annotations similar to the dependency graph of a Datalog program [20]. Let us recall that the cyclic dependency graph DG_D of a (recursive) Datalog program is defined as DG_D := (V_D, E_D), where the nodes V_D of the graph are IDB predicates, and the edges E_D are defined from a node n_1 ∈ V_D (predicate 1) to a node n_2 ∈ V_D (predicate 2), if and only if there is a rule with predicate 1 in the head and predicate 2 in the body.

Analogously, the directed, acyclic LiLa dependency graph LDG is defined as LDG := (V_p, E_p), where V_p are collections of IDB predicates, which we call processors. An edge in E_p from processor p_1 ∈ V_p to p_2 ∈ V_p exists, if there is a rule with predicate 1 from p_1 in the head and predicate 2 from p_2 in the body. Hence the LDG contains processors with embedded cyclic rule dependency graphs, which do not lead to cycles in the LDG. In contrast to the DG_D, annotations are added to the LDG as nodes. If an annotation uses a predicate, an edge is drawn from that predicate to the node of the annotation (i. e., the annotation depends on that predicate). If another annotation or rule uses the predicates produced by an annotation, an edge is drawn from the annotation to the node representing the annotation or rule that uses the data produced by the annotation. Figure 3 shows the
LDG for the LiLa program depicted in Listing 1.

Figure 3: Dependency graph of the LiLa program from the motivating example.

The message endpoint nodes are labeled with their consumer/producer URI, with the predicate name of the rule for content filters.
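As an illustration of this construction (hypothetical encoding, not the paper's implementation), the dependencies of the rules in Listing 1 can be derived as follows; predicates that never occur in a rule head are EDB predicates or are produced by fact source/enricher annotations:

```python
# Rules of Listing 1 as (head predicate, [body predicates]) pairs
rules = [("g", ["gE"]), ("br", ["gE"]),
         ("gByP", ["g", "pInfo"]), ("pAtB", ["br", "pInfo"])]

# Per the definition above: an edge from the head predicate's processor
# to each predicate used in the rule body
edges = {(head, b) for head, body in rules for b in body}

heads = {head for head, _ in rules}
used = {b for _, body in rules for b in body}
edb = used - heads  # provided by @from / @enrich annotations
```

On top of this graph, annotation nodes such as @from, @enrich and @to would be connected to the predicates they produce or consume.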
To connect the message sender, the Fact Source, with the message receiver, the Routing Goal, LiLa extends Datalog by @from and @to annotation statements, similar to the open source integration system Apache Camel [4]. Nodes of the LDG with no incoming edges are either EDB predicates or fact sources. Nodes with no outgoing edges are (mostly) routing goals. The only counter-example are obsolete/unused processing steps, which can be deleted.

The sender-facing fact source specifies the sender's transport and message protocol. Listing 3 defines the fact source, which consists of a location, a configuration URI that can be directly interpreted by an integration system and defines the location of the facts, and a format, the message format of the data source (e. g., JSON, CSV, XML). The annotation body specifies the format's relations in form of Datalog facts. The message format is canonically converted to Datalog programs according to the ILP-CDM.

@from(<location>, <format>){
  <relationName(<parameter>+).>+
}

Listing 3: Definition of a fact source in LiLa.
Similarly, the routing goal definitions specify the receiver-facing transport and message protocols (cf. Listing 4). Hereby, the ILP-CDM is canonically converted to the message format understood by the receiver.

@to(<producerURI>, <format>){
  <relationName>[<linebreak><relationName>]*
}

Listing 4: Definition of a routing goal in LiLa.

3.3 Inherent Integration Patterns
The Datalog facts provided by the fact source can be directly evaluated by Datalog rules. The LiLa dependency graph is used to automatically identify message transformation and basic routing patterns.
Message Transformation Patterns.
Further patterns that can be derived from the LDG are message transformation patterns like Content Filter, Message Translator, and the local Content Enricher.

The content filter and message translator patterns are used to filter parts of a message as well as to translate the message's structure. Both are inherently declared in LiLa by using Datalog rules, which are collected in processors of the LDG. Each set of rules producing the same predicate corresponds to a filter or translator in the integration middleware. For instance, the LiLa program for the motivating example produces two content filters: one for the relation gByP and another one for the relation pAtB. The routing between multiple content filters is decided based on the dependency graph of the LiLa program. If a node has a single outgoing edge, the incoming data is directly routed to the processor corresponding to the subsequent node. If a node has multiple incoming edges, a join router pattern is present, which is detected and transformed as described in Section 4. The same is the case for a node having multiple outgoing edges, which corresponds to a multicast pattern.

For the local content enricher, LiLa allows to specify facts in a LiLa program. The facts are treated as a processor (i. e., node in the LDG) and are automatically placed into the message after a relation with this name is produced.
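The degree-based detection described above can be sketched as follows (illustrative graph encoding with data-flow edge direction as drawn in Figure 3, not the paper's implementation):

```python
from collections import defaultdict

def detect(edges):
    """Derive multicast / join router candidates from node degrees:
    multiple outgoing edges -> multicast, multiple incoming -> join router."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1
    patterns = {}
    for n in set(out_deg) | set(in_deg):
        if out_deg[n] > 1:
            patterns.setdefault(n, []).append("multicast")
        if in_deg[n] > 1:
            patterns.setdefault(n, []).append("join_router")
    return patterns

# Data-flow edges for the motivating example (cf. Figure 3)
edges = [("gE", "g"), ("gE", "br"), ("g", "gByP"), ("br", "pAtB"),
         ("pInfo", "gByP"), ("pInfo", "pAtB")]
found = detect(edges)
```

Nodes with a single outgoing edge need no pattern node; their data is routed directly to the subsequent processor.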
Message Routing Patterns.
In addition to the message transformation patterns, some simple routing patterns can be derived from the dependency graph like Multicast, Message Filter, Content-based Router and Join Router.

The multicast pattern can be used as part of the common map/reduce-style message processing. The multicast is derived by analyzing the dependency graph for independent rules, to which copies of the message are provided. One potential side-effect is the detection of (too) many multicast configurations, when a routing goal requests multiple intermediary results of a single route, which we mitigate by an optimization that keeps these results (not shown).

The message filter removes messages according to a filter condition (cf. Section 2). For the filter, LiLa does not define a special construct. The filtering of a message can be achieved by performing a content filtering, which leads to an empty message. Empty messages are discarded before sending the message for further processing to a routing goal. This behavior can be used to describe a content-based router, which distinguishes itself from the filter by its message cardinality of 1:n. However, in LiLa we use the router with a channel cardinality of 1:n (i. e., multicast) with message filters on each leaving message channel according to [11].

A structurally channel-combining pattern is the join router. The join router has a channel cardinality of n:1; however, it only combines channels, not messages. For that, an aggregator is used, which is defined subsequently.

The more complex routing patterns Aggregator, Splitter and the remote Content Enricher can neither be described by standard Datalog nor inherently detected in the dependency graph. Hence we define special annotations for these patterns.

For the aggregator, Listing 5 shows the @aggregate annotation with pre-defined aggregation strategies like union and an either time-based (e. g., completionTime=3) or number-of-messages-based (e. g., completionSize=5) completion condition. The annotation body consists of several Datalog queries. The message correlation is based on the query evaluation, where true means that the evaluation result is not an empty set of facts and false otherwise. As the aggregator does not produce facts with a new relation name, but combines multiple messages keeping their relations (i. e., it is not message producing), it is challenging how to reference the aggregated relations in a LiLa program, as their name does not change. This leads to problems when building the dependency graph, i. e., it is undecidable whether a rule uses the relation prior to or after aggregation. As we do not want the user to specify explicitly, whether she means the relation prior to or after aggregation in every rule using a predicate used in an aggregator, we suffix all predicates after an aggregation step with -aggregate by default. In combination with a join router, messages from several entering channels can be combined.

@aggregate(<aggregationStrategy>, <completionCondition>){
  <query>+
}

Listing 5: Definition of an aggregator in LiLa.
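A minimal sketch of such an aggregation (illustrative Python assuming a union strategy and a completionSize=2 condition, not the paper's implementation): the correlation condition assigns messages to a collection, the completion condition triggers the strategy, and the strategy unions the collected fact sets.

```python
collections = {}

def aggregate(msg_facts, corr_key, completion_size=2):
    coll = collections.setdefault(corr_key, [])    # crc: correlate message
    coll.append(msg_facts)
    if len(coll) < completion_size:                # cpc: not yet complete
        return None                                # message stays collected
    # as: aggregation strategy "union" over the correlated messages
    return set().union(*collections.pop(corr_key))

first = aggregate({("g", 1)}, "period1")   # waiting for more messages
second = aggregate({("g", 2)}, "period1")  # completed: union of both
```

A time-based completion (completionTime) would additionally require a timer that flushes incomplete collections, which is omitted here.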
LiLa specifies the splitter as in Listing 6 with a new @split annotation, which does not have any parameters in the annotation head. Datalog queries are used in the annotation body as splitter expressions, as briefly described in Section 2. The queries are evaluated on the exchange and each evaluation result is passed for further processing as a single message. Similar to the aggregator, all newly generated relations leaving a splitter are suffixed with -split by default, in order to not have to specify explicitly whether the relation prior to or after splitting is meant.

@split(){
  <relationName(<parameter>+).>+
}

Listing 6: Definition of a splitter in LiLa.
The remote content enricher can be seen as a special message endpoint. For instance, for an enricher including data from a file, the filename and format have to be specified as shown in Listing 7. Similar to the fact source, a set of relations has to be specified. Again, a canonical conversion from the specified file format to the ILP-CDM is conducted according to the definitions in Section 2. If the relations to enrich via this construct are already generated by another construct or Datalog rule, they are enriched after this construct by adding the additional facts to the message. If there is no construct or Datalog rule producing the relations specified in the annotation body, the relations are enriched directly before their usage. The enricher construct is especially useful when a single message shall be combined with additional information.

@enrich(<filename>, <format>){
  <relationName(<parameter>+).>+
}

Listing 7: Definition of a remote content enricher in LiLa
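A rough Python sketch of the merge step (the function name and fact encoding are assumptions; file reading and format conversion are omitted): only the relations named in the annotation body are taken from the enrichment source and added to the message's fact set:

```python
# Hypothetical sketch of @enrich semantics: add those facts from the
# (already format-converted) enrichment source whose relation name is
# listed in the annotation body to the incoming message.
def enrich(message_facts, source_facts, relation_names):
    extra = {f for f in source_facts if f[0] in relation_names}
    return set(message_facts) | extra
```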
4. SYNTHESIS OF LOGIC INTEGRATION LANGUAGE PROGRAMS
The defined LiLa constructs can be combined to complex representations of integration programs, which can be executed by integration systems. For that we have chosen the lightweight, open-source integration system Apache Camel [4], since it implements all discussed integration semantics. To guarantee data-intensive processing, LiLa programs are not synthesized to Apache Camel constructs directly, but to the ILP integration pattern re-definitions that are plugged into the respective system implementations (cf. Section 2). The equivalent to message channels in Apache Camel are Camel Routes.
Figure 4: LiLa Compiler Pipeline
The definition of a platform-independent message channel representation, called Route Graph (RG), enables a graph transformation t : LDG → RG and an efficient code generation for different runtime systems. The transformation is a two-step process: In the first step, a condition is evaluated on each edge or node, respectively. If the condition evaluates to true, further processing on this node/edge is performed. The second step is the execution of the actual transformation. The route graph RG is defined as RG := (V_R, E_R), where the nodes V_R are runtime components of an integration system (representing an ILP-EIP), and the edges E_R are communication channels from one node n ∈ V_R to another node n' ∈ V_R, or to itself. In most integration systems cyclic route graphs are possible; however, they are not considered in this work. The nodes in V_R can be partitioned into different routes, while edges in E_R from one route to another have to be of type to for the source node and of type from for the target node.
For instance, Figure 5 shows the route graph for the LDG of our motivating example (cf. Figure 3). The message flow between separately generated routes (dashed lines) indicates a to/from construct. Consequently, the LiLa program from Listing 1 results in four distinct routes, e. g., with a multicast multicast(direct:p,direct:g) and a file enricher from(direct:pEnrichInfo), identified through pattern detection, which is subsequently discussed.
Figure 5: Route graph for the LiLa program of the motivating example in Figure 3.
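The RG definition can be sketched as a small Python data structure (illustrative names, not the paper's implementation): nodes are assigned to routes, and any edge crossing a route boundary has to be realized as a to/from endpoint pair:

```python
# Hypothetical sketch of a route graph RG = (V_R, E_R): nodes are runtime
# components partitioned into routes; edges between different routes must
# become a to(...) endpoint on the source and a from(...) on the target.
class RouteGraph:
    def __init__(self):
        self.route_of = {}     # node -> route id
        self.edges = []        # list of (source, target) pairs

    def add_node(self, node, route):
        self.route_of[node] = route

    def add_edge(self, source, target):
        self.edges.append((source, target))

    def cross_route_edges(self):
        """Edges that require a to/from construct between routes."""
        return [(s, t) for (s, t) in self.edges
                if self.route_of[s] != self.route_of[t]]
```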
The more complex, structural join router, multicast and remote enricher patterns are automatically derived from the
LDG through a rule-based pattern detection approach. With these building blocks, common optimizations in integration systems, e. g., the map/reduce-like scatter/gather pattern [11, 17], which is a combination of the multicast, join router and aggregator patterns, can be synthesized. The rule-based detection and transformation approach defines a matching function [true|false] := mf_{LDG,mc} on the LDG, with matching condition mc, and a transformation t_G, with t_G : LDG → RG. The matching function denotes a node and edge graph traversal on the LDG that evaluates to true if the condition holds, false otherwise. The transformation t_G is executed only if the condition holds.
Join Router.
The router is an m:1 message channel join pattern, which usually has to be combined with an aggregator to join messages. The match condition is defined as mc_JR := deg^-(n_i) > 1, where deg^-(n_i) determines the number of entering message channels on a specific node n_i ∈ V_P, with i ∈ N. Hence, only in case of multiple entering edges, the graph transformation t_jr is executed. The transformations t_jr change the RG: through t_jr : n_i → n_fd ⊕ n_i, for all matching nodes n_i a from-direct node n_fd is added, denoted by ⊕. Additionally, all nodes n_j with direct, outgoing edges to the matching node get an additional to-direct node n_td: t_jr : n_j → n_j ⊕ n_td. Then all original edges e_m have to be removed: t_jr : E_P → E_P \ e_m. Figure 6a shows the LDG and Figure 6b the corresponding RG after the transformation.
Multicast.
The multicast has a channel cardinality of 1:n. The match condition is defined as mc_Mu := deg^+(n_i) > 1, where deg^+(n_i) determines the number of leaving message channels on a specific node n_i ∈ V_P, with i ∈ N. Hence, only in case of multiple leaving edges, the graph transformation t_mu is executed. The transformations t_mu change the RG: through t_mu : n_i → n_i ⊕ n_multic{n_j}, for all matching nodes n_i a multicast node n_multic is added, which references all previous neighboring nodes n_j via leaving edges. Then a from-direct node n_fd is added to all neighboring nodes n_j through transformation t_mu : n_j → n_fd ⊕ n_j. Additionally, all original edges e_m have to be removed: t_mu : E_P → E_P \ e_m. Figure 7a shows the LDG and Figure 7b the corresponding RG after the transformation.
Remote Enricher.
An enricher potentially merges several predicate relations into the main route as additional data. Therefore it has to get a route of its own that (periodically) gathers the respective messages. The intermediate transformation t_re is defined as t_re : n_i ⊕ {n_j} → n_fd ⊕ n_file ⊕ {n_j}, with n_i, n_j ∈ V_P, which takes all matching enricher nodes n_i and the list of connected nodes {n_j} and translates them to a from-direct node n_fd that is followed by a file relation n_file, referencing the connected nodes {n_j}. Additionally, all original edges e_m from the enricher n_i to the list of connected nodes {n_j} have to be removed: t_re : E_P → E_P \ e_m. The match condition for the remote enricher is the node type type(n_i), determined through the @enrich annotation: mc_RE := type(n_i) == '@enrich'. After the intermediate translation, all produced relations (nodes) that are linked to nodes in the main tree create a join router (cf. transformations t_jr) with a built-in aggregator that merges the facts, e. g., via a union operation. Figure 8a denotes the LDG of an example remote enricher pattern that is transformed to its corresponding RG, shown in Figure 8b. In order to find the complete path of nodes to extract, the leaving edges have to be followed, starting at the enricher node, until a node with multiple incoming edges is reached. Before the node with multiple incoming edges, a to-direct node is inserted through t_jr (dashed line). The URI of the call to the enricher node is set to the URI of the consumer, which has to be added directly before the enricher node.
The route graph represents the foundation for the code synthesis of the message channels, which are a combination of ILP constructs and Apache Camel patterns and routes. The construction of the routes is a trivial graph traversal starting from the fact source nodes. The multicast t_mu and join router t_jr transformations construct a RG with deg^-(n) == 1, with n ∈ V_R.
Hence the ILP constructs can be synthesized one after the other based on their type and the ILP properties, which were preserved during the transformations and optimizations. For instance, Figure 9 shows the synthesized Apache Camel routes in the EIP icon notation. In comparison to the motivating example in Figure 1, the content-based router is replaced by a multicast and message filters are added before the outbound message endpoints, while preserving the same semantics and allowing for parallel message processing.
Message Endpoints.
The fact source and routing goal nodes are transformed to components in Apache Camel, passing the configurations that are stored in the node properties. A detected (not @from annotated) fact source gets an additional numOfMsgsToAgg property, which remembers the entering message count of a join router; a corresponding aggregator ILP with completionSize = numOfMsgsToAgg is added. The location property defines the component's endpoint configuration, and the format leads to the generation of an ILP format converter (e. g., JSON, CSV to Datalog) that is configured using the meta-facts supplied in the annotation body, conducting an additional projection. If the format is set to datalog, no format conversion is needed. The routing goals are configured similarly. A message filter ILP is added that discards empty messages. Then a format converter (e. g., Datalog to JSON/CSV) is added and configured through the meta-facts property. Finally, a Camel producer component is added to the route and configured.
Complex Routing Patterns.
For the aggregator, additional renamingRules properties and renaming message translators are generated, containing a Datalog rule that adds -aggregate suffixes to every Datalog predicate used in the head of a query (for name differentiation). Similarly, for the splitter, -split suffixes are generated that allow additional message translators to rename the predicates. This is necessary in order to build the dependency graph as described in Section 3. The inherent multicast nodes are configured through a recipient list property, containing the target node identifiers, which allows for a translation to the Camel multicast (no ILP defined).
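A sketch of such a renaming translator in Python (illustrative encoding: facts as (relation, args...) tuples), appending the -aggregate or -split suffix to every relation name of a message:

```python
# Hypothetical sketch of the renaming message translator: suffix every
# relation name, e.g. with "-aggregate" after an aggregator or "-split"
# after a splitter, so later rules reference the post-pattern relations.
def rename_relations(facts, suffix):
    return {(f[0] + suffix,) + tuple(f[1:]) for f in facts}
```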
Message Translation Patterns.
The content filter and message translator nodes can be generated to the ILP content filter, which is based on a Camel processor and configured accordingly. The node of the inherent content enricher, which can be specified by writing facts into a LiLa program, stores the facts as properties. The generated ILP (again based on a Camel processor) adds the facts to every incoming message. The explicit file enricher pattern is configured similarly to a fact source; however, the configuration specifies a fileName property, used to configure the Camel component. Again, ILP format converters are added and configured by the meta-facts property.
5. EXPERIMENTAL EVALUATION
We implemented ILP constructs as extensions to the lightweight, open-source integration system Apache Camel in version 2.12.2, based on Section 2, that are referenced from LiLa programs as described in Section 4. The HLog Datalog system we used for the measurements is a Java implementation of the standard naïve-recursive Datalog evaluation (i. e., without stratification) from Ullman [20] in version 0.0.6, as described in [18].

Figure 6: Detection of the Join Router Pattern: (a) LiLa Graph, (b) Route Graph.
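For orientation, naive-recursive evaluation re-applies all rules to the entire fact set until no new facts are derived; a compact Python sketch (our encoding: facts as tuples, rules as functions deriving new facts) might look like this:

```python
# Sketch of naive-recursive Datalog evaluation: iterate all rules over
# the complete fact set until a fixpoint is reached (no stratification,
# as in the HLog setup described above; the encoding is illustrative).
def naive_eval(facts, rules):
    known = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(known)
        if derived <= known:      # fixpoint reached: nothing new
            return known
        known |= derived

# Example rules encoding transitive closure over edge/2 facts:
def path_base(fs):
    return {("path", a, b) for (r, a, b) in fs if r == "edge"}

def path_step(fs):
    return {("path", a, c)
            for (r1, a, b) in fs if r1 == "path"
            for (r2, b2, c) in fs if r2 == "edge" and b2 == b}
```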
We conduct all measurements on a HP Z600 work station, equipped with two Intel Xeon processors clocked at 2.67GHz with a total of 2 x x.

match("true").
meta("match","matching",1).

Listing 8: Single-fact message

[{"matching": "true"}]

Listing 9: Single-entry message in JSON

The multi-facts message tests were conducted with the JSON message shown in Listing 11 with its corresponding Datalog representation in Listing 10. The messages' payloads are approximately 58 and 85 bytes, respectively.

match("true",1).
match("false",2).
meta("match","matching",1).
meta("match","count",2).

Listing 10: Multi-fact message

[{"matching": "true", "count": 1},
 {"matching": "false", "count": 2}]

Listing 11: Multi-entry message in JSON
The subsequently described measurements target an experimental runtime evaluation of some of the introduced integration patterns from Section 2 through a comparison of the ILP with the original Java-based implementation. Keep in mind that tests with empty messages or routes without any processing steps result in identical performance results. All tests measure the pattern processing only, and thus neglect the necessary format conversions.
Message Filter and Content-based Router Patterns.
The basic router pattern analysis is conducted for the message filter and content-based router using the single-fact message for ILP (cf. Listing 8) and the corresponding JSON message for the Java implementation (cf. Listing 9). We execute the message filter in a Camel route for ILP (generated from a LiLa program) and for Java as shown in Listings 12 and 13. The performance is measured without the message endpoints by sending multiple single-fact messages, while one half of the messages is filtered out and the other half is routed further.

Figure 7: Detection of Multicast Pattern: (a) LiLa Graph, (b) Route Graph.

@from(file:data/testMessageFilter){
  match(matching).
}
match-filtered(matching) :- match("true").
@to(file:data/filtered)

Listing 12: LiLa message filter performance test program

from("file:data/testMessageFilter")
  .filter(new JsonMatchKeyValueExpression("match", "true"))
  .to("file:data/filtered");

Listing 13: Camel route used for the message filter performance measurements of Camel-Java

The performance measurement results are depicted in Figure 10. The Camel-ILP and the Camel-Java implementation show a linear performance in the amount of incoming messages. Although the setup favors the Java processing, because (a) only single-fact messages are sent (i. e., Datalog evaluation is better suited for set operations), and (b) the type of operation during message routing is mostly only used to "peek" into the message content (cf. Section 2), the Java implementation seems to be only slightly better for higher amounts of single-fact messages. A similar result/behavior can be observed for the content-based router pattern.
Content Filter Pattern.
As an example for message transformations, we evaluate the content filter on a single message containing a varying amount of facts based on the message payloads from Listings 10 and 11. The routes in Listings 14 and 15 show that the content filter is configured with a single rule, for which half of the facts match.

@from(file:data/testContentFilter){
  match(matching, count).
}
match-filtered(matching, count) :- match("true", count).
@to(file:data/contentFilter){ match-filtered }

Listing 14: LiLa content filter performance test program

from("file:data/testContentFilter")
  .process(new JSONContentFilter(
      new JsonMatchKeyValueExpression("match", "true")))
  .to("file:data/filtered");

Listing 15: Camel route used for the content filter performance measurements of Camel-Java

The results of the measurement, depicted in Figure 11, show a linear performance in the amount of facts processed. Noticeably, ILP is approximately twice as fast as the pure Camel-Java implementation. This is especially relevant for data-intensive processing scenarios and supports the observations (a,b) from the routing pattern measurement. Even if the multi-fact message contains only two facts, (a) the Datalog evaluation is already faster, and (b) the approach favors more data-intensive operations on the message that are not only "peeking" into the content for simple routing.
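The semantics of the measured filter rule can be stated in a few lines of Python (illustrative tuple encoding of the facts, not the benchmarked implementation):

```python
# Sketch of the content filter rule
# match-filtered(matching, count) :- match("true", count):
# keep only the facts whose first argument equals "true" and emit them
# under the new head relation.
def content_filter(facts):
    return {("match-filtered",) + f[1:] for f in facts
            if f[0] == "match" and f[1] == "true"}
```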
Coming back to the motivating example in Listing 1, we extended it with the calculation of the player's position while shooting on goal (cf. posAtShotOnGoal) and with a sampling of the player positions on a minute basis using a recursive rule (cf. pPosPerMinute). Listing 16 shows the extended version of the LiLa program that "tweets" the calculated positions and stores them with the "players at ball" to a file, and the positions per minute to a database.

Figure 8: Detection of the Remote Enricher Pattern: (a) LiLa Graph, (b) Route Graph.

@from(file:gameEvents.json, json){
  gE(period, time, eventCode, pId).
}
@from(file:playerPosition.json, json){
  pPos(period, time, playerId, posX, posY).
}
g(period, time, pId) :- gE(period, time, "Goal", pId).
p(period, time, pId) :- gE(period, time, "BallReception", pId).
gByP(period, time, pId, firstN, lastN) :-
  g(period, time, pId), pInfo(pId, firstN, lastN).
pAtB(period, time, pId, firstN, lastN) :-
  p(period, time, pId), pInfo(pId, firstN, lastN).
posAtShotOnGoal(period, time, firstN, lastN, posX, posY) :-
  gByP(period, time, pId, firstN, lastN),
  pPos(period, time, pId, posX, posY).
pPosPerMinute(period, time, playerId, posX, posY) :-
  pPos(period, millitime, posX, posY),
  time := 1, time = millitime / 600.
pPosPerMinute(period, time, playerId, posX, posY) :-
  pPos(period, millitime, posX, posY),
  pPosPerMinute(A, previousTime, B, C, D),
  time := previousTime + 1, time = millitime / 600.
@enrich(playerInfo.json, json){
  pInfo(pId, firstN, last).
}
@to(twitter:$config, json){ gByP }
@to(file:playersAtBall.json){ pAtB }
@to(file:positionAtShotOnGoal){ posAtShotOnGoal }
@to(jdbc:soccerDatabase){ pPosPerMinute }

Listing 16: Soccer Game Event Integration (revisited) as LiLa program.

The corresponding (extended) LiLa dependency graph
LDG is shown in Figure 12, which is used to generate the route graph RG depicted in Figure 13. As the node posAtShotOnGoal has multiple incoming arcs, a join router pattern is detected and generated. Similarly, a multicast pattern is detected and generated after the from(file:playerPosition,json) and gByP nodes.
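The two detections at work here can be reproduced with a few lines of Python on a plain edge-list encoding of the LDG (the function names are ours; cf. the match conditions mc_JR and mc_Mu from Section 4):

```python
# Hypothetical sketch of the rule-based pattern detection on the LDG:
# a node with more than one entering edge matches the join router
# condition (mc_JR := deg-(n) > 1), one with more than one leaving edge
# the multicast condition (mc_Mu := deg+(n) > 1).
def detect_join_routers(edges):
    indeg = {}
    for _, dst in edges:
        indeg[dst] = indeg.get(dst, 0) + 1
    return {n for n, d in indeg.items() if d > 1}

def detect_multicasts(edges):
    outdeg = {}
    for src, _ in edges:
        outdeg[src] = outdeg.get(src, 0) + 1
    return {n for n, d in outdeg.items() if d > 1}
```

On the revisited example, the edges into posAtShotOnGoal from gByP and pPos trigger the join router detection, while the two edges leaving pPos trigger the multicast detection.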
6. RELATED WORK
The application of Datalog to integration programming for current middleware systems has not been considered before and was only recently brought into the discussion by our BPMN-based modeling approach [16]. However, the work on Java systems (e. g., Telegraph Dataflow [19], Jaguar [21]) can be considered related work in the area of programming languages on application systems for faster, data-intensive processing. These approaches mainly target making Java better suited for data-intensive processing, while struggling with threading, garbage collection and memory management. None of them considers the combination of the host language with relational logic processing.
Figure 9: Generated Apache Camel routes in EIP-icon notation for the motivating example (inherent message filters before recipients omitted).
Figure 10: Basic message routing pattern test.
Figure 11: Basic message transformation pattern test.
Declarative XML Processing and Semantic Web.
Related work can be found in the area of declarative XML message processing (e. g., [5]). Using an XQuery data store for defining persistent message queues, this work targets only a subset of ILP (i. e., persistent message queuing). In the semantic web domain, several approaches use Datalog to integrate and query data from different, mostly XML-based sources. For instance, the Semantic Web Integration
Figure 12: Dependency graph of the motivating example (revisited).

Middleware (SWIM) extends Datalog with XPath expressions in the rule body to map XML to RDF as well as RQL to relational queries [6]. ILP takes this approach one step further by using standard Datalog+ to describe integration patterns that can be composed to integration scenarios.
Data Integration.
The data integration domain uses integration middleware systems for querying remote data that is treated as local or "virtual" relations. Starting with SQL-based approaches, e. g., using the Garlic integration system [10], the data integration research reached relational logic programming, summarized by [8]. In contrast to remote queries in data integration, ILP extends integration programming with declarative, relational logic programming for application integration as well as the expressiveness of logic programs through integration semantics.
Figure 13: Route graph of the motivating example (revisited).
Declarative Application Programming.
With LiLa, we defined a language design similar to the trend of Datalog-style rule-based languages for declarative, data-centric application development. Major work in this complementary field has been conducted by Green et al. [9] with the DatalogLB language for (analytical) applications, and by Abiteboul et al. [1], who applied logic programming (i. e., extended Datalog) to analytical and web application development and developed Webdamlog [2, 1], a language based on Datalog for specifying distributed applications.
Data-aware Integration Languages.
The modeling of data-intensive workflows and integration scenarios has been approached only recently. Abiteboul et al. compare business entity modeling to their Active XML (AXML) approach [3]. AXML is a data-aware workflow language, which specifies XML documents with embedded Web Service calls. Compared to LiLa, Active XML partly defines the notion of fact sources and content enrichers, omitting message translation, routing goals and complex routing patterns like the aggregator and splitter patterns. Another workflow approach is described in [12], a decision mining approach that results in a Product Data Model, which strives to give insights into the data view of a business decision process. In LiLa the data graph is more explicit and the control flow model is of no concern to the user.
In the area of message-based integration, we define integration scenarios as BPMN-based Integration Flows (IFlows), which specify the control-, data-, and exception-flow modeling [13, 14]. Although the IFlow approach is far better than control-flow-centric models (e. g., the Guaraná DSL [7]), data operations and formats still remain implicit.
7. CONCLUSION AND FUTURE WORK
According to the observations
P1–P4, the main contributions of this work are (a) the analysis of the "de-facto" standard integration patterns with respect to their enhancement for data-intensive processing, (b) the definition of integration logic programs, which are relational logic language constructs that can be embedded into patterns aligned with their semantics, (c) the definition of a data-aware logic integration language, which can be synthesized to integration logic programs, (d) an application to a conventional integration system, and (e) a brief performance analysis and the application to a data-intensive integration scenario.
Future work will be conducted in the area of rule-based optimization during the automatic program-to-runtime compilation for common integration processing styles, e. g., for scatter/gather and splitter/gather, with the related questions on data partitioning and provisioning during message processing.
8. REFERENCES
[1] S. Abiteboul, E. Antoine, G. Miklau, J. Stoyanovich, and J. Testard. Rule-based application development using webdamlog. In SIGMOD, pages 965–968, 2013.
[2] S. Abiteboul, M. Bienvenu, A. Galland, and E. Antoine. A rule-based language for web data management. In PODS, pages 293–304, 2011.
[3] S. Abiteboul and V. Vianu. Models for data-centric workflows. In In Search of Elegance in the Theory and Practice of Computation - Essays Dedicated to Peter Buneman, pages 1–12, 2013.
[4] J. Anstey and H. Zbarcea. Camel in Action. Manning, 2011.
[5] A. Böhm, C.-C. Kanne, and G. Moerkotte. Demaq: A foundation for declarative XML message processing. In CIDR, pages 33–43, 2007.
[6] V. Christophides, G. Karvounarakis, I. Koffina, G. Kokkinidis, A. Magkanaraki, D. Plexousakis, G. Serfiotis, and V. Tannen. The ICS-FORTH SWIM: A powerful semantic web integration middleware. In SWDB, pages 381–393, 2003.
[7] R. Z. Frantz, A. M. R. Quintero, and R. Corchuelo. A domain-specific language to design enterprise application integration solutions. Int. J. Cooperative Inf. Syst., 20(2):143–176, 2011.
[8] M. R. Genesereth. Data Integration: The Relational Logic Approach. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2010.
[9] T. J. Green, M. Aref, and G. Karvounarakis. Logicblox, platform and language: A tutorial. In Datalog, pages 1–8, 2012.
[10] L. M. Haas, D. Kossmann, E. L. Wimmers, and J. Yang. Optimizing queries across diverse data sources. In VLDB, pages 276–285, 1997.
[11] G. Hohpe and B. Woolf. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2003.
[12] R. Petrusel, I. T. P. Vanderfeesten, C. C. Dolean, and D. Mican. Making decision process knowledge explicit using the decision data model. In Business Information Systems (BIS), pages 172–184, 2011.
[13] D. Ritter. Experiences with business process model and notation for modeling integration patterns. In European Conference on Modelling Foundations and Applications (ECMFA), pages 254–266, 2014.
[14] D. Ritter. Using the business process model and notation for modeling enterprise integration patterns. CoRR, abs/1403.4053, 2014.
[15] D. Ritter. What about database-centric enterprise application integration? In Central-European Workshop on Services and their Composition (ZEUS), pages 73–76, 2014.
[16] D. Ritter and J. Bross. Datalogblocks: Relational logic integration patterns. In Database and Expert Systems Applications (DEXA), pages 318–325, 2014.
[17] D. Ritter, N. May, and S. Rinderle-Ma. Patterns for emerging application integration scenarios: A survey. Inf. Syst., 67:36–57, 2017.
[18] D. Ritter and T. Westmann. Business network reconstruction using datalog. In Datalog in Academia and Industry - Second International Workshop (Datalog 2.0), pages 148–152, 2012.
[19] M. A. Shah, S. Madden, M. J. Franklin, and J. M. Hellerstein. Java support for data-intensive systems: Experiences building the telegraph dataflow system. SIGMOD Record, 30(4):103–114, 2001.
[20] J. D. Ullman. Principles of Database and Knowledge-Base Systems, Volume I. Computer Science Press, 1988.
[21] M. Welsh and D. E. Culler. Jaguar: enabling efficient communication and I/O in java.