A Rule-based Language for Application Integration
Daniel Ritter
SAP SE, Dietmar-Hopp-Allee 16, Walldorf, Germany
[email protected]

Jan Broß
Karlsruhe Institute of Technology, Kaiserstraße 12, Karlsruhe, Germany
[email protected]
ABSTRACT
Although message-based (business) application integration is based on orchestrated message flows, current modeling languages exclusively cover (parts of) the control flow, while under-specifying the data flow. Especially for more data-intensive integration scenarios, this fact adds to the inherent data processing weakness in conventional integration systems. We argue that with a more data-centric integration language and a relational logic based implementation of integration semantics, optimizations from the data management domain (e. g., data partitioning, parallelization) can be combined with common integration processing (e. g., scatter/gather, splitter/gather). With the Logic Integration Language (LiLa) we re-define integration logic tailored for data-intensive processing and propose a novel approach to data-centric integration modeling, from which we derive the control and data flow and apply them to a conventional integration system.
1. INTRODUCTION
Conventional message-based integration systems show weaknesses when it comes to data-intensive (business) application integration (e. g., fast-growing business areas like online player position tracking in sports management, internet of things), using integration patterns like message transformations and (partially) message routing [16]. This is due to the facts that (P1) most of the application data is stored in relational databases, which leads to many format conversions during the end-to-end processing, and the observation (P2) that application tier programming languages (e. g., Java, C#) are not designed for data-intensive processing.

The Enterprise Integration Patterns (EIP) [11, 17] can be considered a "de-facto" modeling standard. Through the icons, the collected common integration patterns can be combined to describe and configure integration semantics on an abstract level. For instance, Figure 1 shows a "Soccer Player Event" integration scenario from sports management in the EIP icon notation. The player event data is gained through a Polling Consumer, loading game events collected by sensors attached to the players and the playing field during a soccer match. Depending on the event code, a Content-based Router pattern is used to route the messages to specific filter operations for "Shots on goal" and "Player at ball" through a Content Filter, whereafter additional player information is merged into the resulting messages using a Content Enricher. Then the messages are converted into the formats understood by their receivers using a Message Translator. The "Shots on goal" information is posted as twitter feed and ball possessions are stored to file. While the control flow is modeled, the message formats (e. g., "Game events", "Player information") and the actual data processing (e. g., routing and filter conditions, enricher and mapping programs) remain hidden on a second level configuration. In contrast, a more data-aware formalization should treat data as a first-class citizen of an integration scenario. This (P3) would give an integration expert immediate control over the actual core aspect of integration, the data and its format, and (P4) would take away the burden of explicitly modeling the system's control flow, while keeping best practices and optimizations in mind, which should rather be configured by the system itself. In this context, there is a new trend to use Datalog-style rule-based languages to declaratively specify data-centric application development by Green et al. [9] and Abiteboul et al. [1], who applied logic programming (i. e., extended Datalog) to analytical and web application development. Similarly, we showed in [16] the applicability and expressiveness of standard Datalog in the context of the EIPs [11, 17].

To approach these observations (P1–P4), we propose a novel formalization tailored to data-intensive, message-based integration and a data-centric modeling approach, which we call Logic Integration Language (LiLa). For that, we re-define core EIPs as part of a conventional integration system using Datalog. Datalog allows for data processing closer to its storage representation, and is sufficiently expressive for the evaluation of EIPs [16]. Similarly, LiLa programs are based on standard Datalog+, for which we carefully defined a small set of integration-specific extensions.

Figure 1: Excerpt from a soccer player event-message integration scenario (EIP icon notation [11]).

@from(file:gameEvents.json, json){
  gE(period, time, eventCode, pId).
}
g(period, time, pId) :- gE(period, time, "Goal", pId).
br(period, time, pId) :- gE(period, time, "BallReception", pId).
gByP(period, time, firstN, lastN) :- g(period, time, pId),
                                     pInfo(pId, firstN, lastN).
pAtB(period, time, firstN, lastN) :- br(period, time, pId),
                                     pInfo(pId, firstN, lastN).
@enrich(playerInfo.json, json){ pInfo(pId, firstN, lastN). }
@to(twitter:$config, json){ gByP }
@to(file:playersAtBall.json){ pAtB }

Listing 1: Soccer Game Event Integration with LiLa.

For instance, Listing 1 shows the LiLa program of our motivating example. Notably, the data flow, formats and operations are represented as a Datalog program with annotations. The file-based message adapter @from reads a stream of game events in the JSON format, canonically converts and projects the message body to Datalog facts of the form gE. Several Datalog rules represent operations on the data like filters (i. e., predicates g, br) and an enricher @enrich (loading pInfo and merging it into gByP and pAtB), before binding the IDB relations to receiver endpoints @to that only pass specified predicates and (canonically) convert them to the configured format (e. g., JSON). From the LiLa programs we derive integration semantics and an efficient control flow using pattern detection. To show the applicability of our approach to real-world integration scenarios and to conduct performance measurements, we synthesize LiLa programs to the open-source integration system Apache Camel [4] that implements most of the integration semantics in form of EIPs. The results of the runtime analysis show that the usage of a more data-centric message processing is especially promising (a) for message transformations, while the routing efficiency remains similar to the conventional processing, and (b) from an end-to-end messaging point of view. Furthermore, the data-centric modeling with LiLa emphasizes the potential for optimizations and a novel modeling clarity compared to the existing control flow centric languages.

The remainder of the paper is organized along its contributions. Section 2 briefly describes the re-definition of EIPs as Datalog programs as foundation for the construction of LiLa in Section 3. The synthesis of LiLa programs to Apache Camel is explained in Section 4 as basis for experimental evaluations discussed in Section 5. Section 6 sets LiLa in context to related work and Section 7 concludes the paper.
2. INTEGRATION PATTERNS IN A NUTSHELL
The Enterprise Integration Patterns (EIPs) [11, 17] define operations on the header (i. e., payload's meta-data) and body (i. e., message payload) of a message, which are normally implemented in the integration system's host language (e. g., Java, C#). Integration Logic Programming (ILP) targets an enhancement of conventional integration systems for data-intensive processing, while preserving the general integration semantics like Quality of Service (e. g., best effort, exactly once) and the Message Exchange Pattern (e. g., one-way, two-way). In other words, the content part of the patterns is evaluated by a Datalog system, which is invoked by an integration system that processes the results.
When connecting applications, various operations are executed on the transferred messages in a uniform way. The arriving messages are converted into an internal format understood by the pattern implementation, called Canonical Data Model (CDM) [11, 17], before the messages are transformed to the target format. Hence, if a new application is added to the integration solution, only conversions between the CDM and the application format have to be created. Consequently, for the re-definition of integration patterns, we define a CDM as Datalog Program, which consists of a set of facts, with an optional set of (supporting) rules as message body, and a set of meta-facts that describes the actual data as header. The meta-facts encode the name of the fact's predicate and all parameter names within the relation as well as the position of each parameter. With that information, parameters can be accessed by name instead of position by Datalog rules (e. g., for selections, projections).
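As an illustration of this CDM encoding, the following sketch (hypothetical names and data, not the paper's implementation) models a message body as facts and a header of meta-facts that map parameter names to positions, enabling access by name:

```python
# Body: facts of the predicate gE(period, time, eventCode, pId)
body = {("gE", (1, "00:12", "Goal", 10)),
        ("gE", (1, "00:47", "BallReception", 7))}

# Header: meta-facts encoding the predicate name and parameter positions
header = {"gE": {"period": 0, "time": 1, "eventCode": 2, "pId": 3}}

def param(fact, name, header):
    """Access a fact's parameter by name via the header meta-facts."""
    pred, args = fact
    return args[header[pred][name]]

# A selection by parameter name instead of position
goals = [f for f in body if param(f, "eventCode", header) == "Goal"]
```

A rule that selects on eventCode thus does not break when the position of a parameter changes, as long as the meta-facts are updated accordingly.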
Before re-defining the integration semantics of the routing and transformation patterns using Datalog (by a function ilp), let us recall some relevant, basic Datalog operations: join, projection, union, and selection. The join of two relations r(x, y) and s(y, z) on parameter y is encoded as j(x, y, z) ← r(x, y), s(y, z), which projects all three parameters to the resulting predicate j. More explicitly, a projection on parameter x of relation r(x, y) is encoded as p(x) ← r(x, y). The union of r(x, y) and s(x, y) is u(x, y) ← r(x, y). u(x, y) ← s(x, y), which combines several relations into one. The selection on r(x, y) according to a built-in predicate φ(x, [const|z]) is encoded as s(x, y) ← r(x, y), φ(x, [const|z]), which only returns s(x, y) records for which φ(x, [const|z]) evaluates to true for a given constant value const or a variable value z. Built-in predicates can be numerical binary relations φ(x, const) like <, >, <=, >=, =, string binary relations like equals, contains, startswith, endswith, numerical expressions based on binary operators like =, +, −, *, / (e. g., x = p(y) + 1), and operations on relations like y = max(p(x)), y = min(p(x)), which assign the maximal or the minimal value of a predicate p to a parameter y. Although our approach allows each single pattern definition to evaluate arbitrary Datalog rules, queries and built-in predicates, the Datalog-to-pattern mapping tries to identify and focus on the most relevant Datalog operations for a specific pattern. An overview of all discussed, re-defined routing functions and their mapping to Datalog constructs is shown in Figure 2.

Message Routing Patterns.
The routing patterns can be seen as control and data flow definitions of an integration channel pipeline. For that, they access the message to route it within the integration system and eventually to its receiver(s). They influence the channel and message cardinality as well as the content of the message.

Figure 2: Message routing and transformation patterns mapped to Datalog (patterns: Router, Filter, Recipient List, Multicast, Join Router, Splitter, Correlation, Completion, Aggregation, Message Translator, Content Filter, Content Enricher; operations: built-in, join, selection, projection, union). Most common Datalog operations for a single pattern are marked "dark blue", less common ones "light blue", and possible but uncommon ones "white".

The most common routing pattern that determines the message's route based on its body is the Content-based Router. The stateless router has a channel cardinality of 1:n, where n is the number of leaving channels, while one channel enters the router, and a message cardinality of 1:1. The entering message constitutes the leaving message according to the evaluation of a routing condition. This condition is a function rc, with {out_1, out_2, ..., out_n} := rc(msg_in.body.x, conds), where msg_in determines the entering message and body.x is an arbitrary field x of its structure. The function rc evaluates to a list of Boolean output on a list of conditions conds (i. e., Datalog rules) for each leaving channel. The output {out_1, out_2, ..., out_n} is a list of Boolean values for each of the n ∈ N leaving channels. However, only one channel must evaluate to true, all others to false. The Boolean output determines on which leaving channel the message is routed further (i. e., exactly one channel will route the message). Common integration systems implement a routing function that provides the entering message msg_in, represented by a Datalog program (i. e., mostly facts), and the conds configurations as Datalog rules. Since standard Datalog rules cannot directly produce a Boolean result, there are at least two ways of re-defining rc: (a) by a supporting function in the integration system, or (b) by adding Boolean Datalog facts for each leaving channel that are joined with the evaluated conditions and exclusively returned by projection (not further discussed). An additional function help_rc for option (a) could be defined as {out_1, out_2, ..., out_n} := help_rc(list(list(fact))), fitting to the input of the routing function, where list(list(fact)) describes the resulting facts of the evaluation of conds for each channel. The function help_rc emits true, if and only if list(fact) ≠ ∅, and false otherwise.
Now, the ILP routing condition is defined as list(fact) := ilp_rc(msg_in.body.x, conds), while being evaluated for each channel condition, thus generating list(list(fact)). The conds would then mainly be Datalog operations like selection or built-in predicates. For the message filter, which is a special case of the router that differs only in its channel cardinality of 1:1 and the resulting message cardinality of 1:[0|1], ilp_rc would have to be evaluated once.

The stateless Multicast and Recipient List patterns route multiple messages to leaving channels, which gives them a message and channel cardinality of 1:n. While the multicast routes messages statically to the leaving channels, the recipient list determines the receiving channels dynamically. The receiver determination function rd, with {out_1, out_2, ..., out_n} := rd(msg_in.[header.y | body.x]), computes n ∈ N receiver channel configurations {out_1, out_2, ..., out_n} by extracting their key values either from an arbitrary message header field y or from the body field x of the message. The integration system has to implement a receiver determination function that takes the list of key-strings {out_1, out_2, ..., out_n} as input, for which it looks up receiver configurations recv_i, recv_i+1, ..., recv_i+m, where i, m, n ∈ N and m ≤ n, and passes copies of the entering message {msg'_out_1, msg'_out_2, ..., msg'_out_m}. In terms of Datalog, rd_ilp is a projection from values of the message body or header to a unary output relation. For instance, the receiver configuration keys recv_1 and recv_2 have to be part of the message body like body(x, 'recv_1'). body(x, 'recv_2'). and rd_ilp would evaluate a Datalog rule similar to config(y) ← body(x, y). For more dynamic receiver determinations, a dynamic routing pattern could be used. However, deviations from the original pattern defined in [11, 17] would extend the expressiveness of the recipient list. Our ILP definition does not prevent from doing that. The multicast and join router patterns are statically configurable 1:n and n:1 channel patterns, which do not need a re-definition as ILP.

The antipodal Splitter and
Aggregator patterns both have a channel cardinality of 1:1 and create new leaving messages. Thereby the splitter breaks the entering message into multiple (smaller) messages (i. e., message cardinality of 1:n) and the aggregator combines multiple entering messages to one leaving message (i. e., message cardinality of n:1). To be able to receive multiple messages from different channels, a Join Router [14] pattern with a channel cardinality of n:1 and a message cardinality of 1:1 can be used as predecessor to the aggregator. Hereby, the stateless splitter uses a split condition sc, with {out_1, out_2, ..., out_n} := sc(msg_in.body, conds), which accesses the entering message's body to determine a list of distinct body parts {out_1, out_2, ..., out_n}, based on a list of conditions conds, that are each inserted into a list of individual, newly created, leaving messages {msg_out_1, msg_out_2, ..., msg_out_n} with n ∈ N by a splitter function. The header and attachments are copied from the entering to each leaving message. The re-definition sc_ilp of split condition sc evaluates a set of Datalog rules as conds, which mostly use Datalog selection, and sometimes built-in and join constructs (the latter two are marked "light blue"). Each part of the body out_i is a set of facts that is passed to a split function, which wraps each set into a single message.

The stateful aggregator defines a correlation condition, a completion condition and an aggregation strategy. The correlation condition crc, with coll_i := crc(msg_in.[header.y | body.x], conds), determines the aggregate collection coll_i, with i ∈ N, based on a set of conditions conds, to which the message is stored. The completion condition cpc, with cp_out := cpc(msg_in.[header.y | body.x]), evaluates to a Boolean output cp_out based on header or body field information (similar to the message filter). If cp_out == true, then the aggregation strategy as, with agg_out := as(msg_in_1, msg_in_2, ..., msg_in_n), is called by an implementation of the messaging system and executed, else the current message is added to the collection coll_i. The as evaluates the correlated entering message collection coll_i and emits a new leaving message msg_out. For that, the messaging system has to implement an aggregation function that takes agg_out (i. e., the output of as) as input. These three functions are re-defined as crc_ilp, cpc_ilp such that the conds are rules mainly with selection and built-in Datalog constructs. The cpc_ilp makes use of the defined help_rc function to map its evaluation result (i. e., list of facts or empty) to the Boolean value cp_out. The aggregation strategy as is re-defined as as_ilp, which mainly uses Datalog union to combine lists of facts from different messages. The message format remains the same. To transform the aggregates' formats, a message translator should be used to keep the patterns modular. However, the combination of the aggregation strategy with translation capabilities could lead to runtime optimizations. An overview of all discussed, re-defined routing functions and their mapping to Datalog constructs is shown in Figure 2.

Message Transformation Patterns.
The transformation patterns exclusively target the content of the messages in terms of format conversions and modifications.

The stateless Message Translator changes the structure or format of the entering message without generating a new one (i. e., channel and message cardinality 1:1). For that, the translator computes the transformed structure by evaluating a mapping program mt, with msg_out.body := mt(msg_in.body). Thereby the field content can be altered.

The related Content Filter and Content Enricher patterns can be subsumed by the general Content Modifier pattern and share the same characteristics as the translator pattern. The filter evaluates a filter function, which only filters out parts of the message structure, e. g., fields or values, and the enricher adds new fields or values as data to the existing content structure using an enricher program ep, with msg_out.body := ep(msg_in.body, data).

The re-definition of the transformation function mt_ilp for the message translator mainly uses Datalog join and projection (plus built-ins for numerical calculations and string operations, thus marked "light blue"), and Datalog selection, projection and built-in (mainly numerical expressions and character operations) for the content filter. While projections allow for rather static, structural filtering, the built-in and selection operators can be used to filter more dynamically based on the content. The resulting Datalog programs are passed as msg_out.body. In addition, the re-defined enricher program ep_ilp mainly uses Datalog union operations to add additional data to the message as Datalog programs. Figure 2 summarizes the discussed message transformation functions.

Pattern Composition.
The defined patterns can be composed to more complex integration programs (i. e., integration scenarios or pipelines). From the many combinations of patterns, we briefly discuss two important structural patterns that are frequently used in integration scenarios: (1) scatter/gather and (2) splitter/gather [11, 17]. Both are supported by the patterns re-defined as ILPs.

The scatter/gather pattern (with a 1:n:1 channel cardinality) is a multicast or recipient list that copies messages to several, statically or dynamically determined pipeline configurations, which each evaluate a sequence of patterns on the messages in parallel. Through a join router and an aggregator pattern, the messages are structurally and content-wise joined.

The splitter/gather pattern (with a 1:n:1 message cardinality) splits one message into multiple parts, which can be processed in parallel by a sequence of patterns. In contrast to the scatter/gather, the pattern sequence is the same for each instance. A subsequently configured aggregator combines the messages to one.
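A splitter/gather composition can be sketched as follows (illustrative helpers, not the paper's implementation; the parallel execution is elided and each part runs the same filter sequence, the gather step being a Datalog-style union):

```python
def split(facts, key):
    """Split one fact set into parts by a key function (split condition)."""
    parts = {}
    for f in facts:
        parts.setdefault(key(f), set()).add(f)
    return list(parts.values())

def gather(parts):
    """Aggregate the processed parts back into one fact set (union)."""
    return set().union(*parts)

msg = {("gE", 1, "Goal"), ("gE", 1, "BallReception"), ("gE", 2, "Goal")}
parts = split(msg, key=lambda f: f[1])               # split by period
processed = [{f for f in p if f[2] == "Goal"} for p in parts]  # same sequence
result = gather(processed)
```

Because every part passes through the same pattern sequence, the parts could be processed by parallel workers without changing the gathered result.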
3. LOGIC INTEGRATION LANGUAGE
In the context of data-intensive message processing, the current control flow-centric integration languages do not allow to design the data flow. Through the re-definition of the integration patterns with Datalog as ILPs, a foundation for a data-centric definition of integration scenarios is provided. Hence the language design of the subsequently defined Logic Integration Language (LiLa) is based on Datalog: LiLa programs carefully extend standard Datalog+ by integration semantics, using annotations for message endpoints and complex routing patterns. As shown in Listing 2, an annotation consists of a head with a name preceded by "@" and zero or more parameters enclosed in brackets, as well as a body enclosed in curly brackets.

@<annotationName>(<parameter>+){
  <Annotation Body>
}

Listing 2: Format of an annotation in LiLa.
A LiLa program defines dependencies between Datalog facts, rules and annotations similar to the dependency graph of a Datalog program [20]. Let us recall that the cyclic dependency graph DG_D of a (recursive) Datalog program is defined as DG_D := (V_D, E_D), where the nodes V_D of the graph are IDB predicates, and the edges E_D are defined from a node n_1 ∈ V_D (predicate 1) to a node n_2 ∈ V_D (predicate 2), if and only if there is a rule with predicate 1 in the head and predicate 2 in the body.

Analogously, the directed, acyclic LiLa dependency graph LDG is defined as LDG := (V_p, E_p), where V_p are collections of IDB predicates, which we call processors. An edge in E_p from processor p_1 ∈ V_p to p_2 ∈ V_p exists, if there is a rule with predicate 1 from p_1 in the head and predicate 2 from p_2 in the body. Hence the LDG contains processors with embedded cyclic rule dependency graphs, which do not lead to cycles in the LDG. In contrast to the DG_D, annotations are added to the LDG as nodes. If an annotation uses a predicate, an edge is drawn from that predicate to the node of the annotation (i. e., the annotation depends on that predicate). If another annotation or rule uses the predicates produced by an annotation, an edge is drawn from the annotation to the node representing the annotation or rule that uses the data produced by the annotation. Figure 3 shows the
LDG for the LiLa program depicted in Listing 1.

Figure 3: Dependency graph of the LiLa program from the motivating example.

The message endpoint nodes are labeled with their consumer/producer URI, with the predicate name of the rule for content filters.
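As an illustration of this construction (hypothetical encoding, not the paper's implementation), the dependencies of the rules in Listing 1 can be derived as follows; predicates that never occur in a rule head are EDB predicates or are produced by fact source/enricher annotations:

```python
# Rules of Listing 1 as (head predicate, [body predicates]) pairs
rules = [("g", ["gE"]), ("br", ["gE"]),
         ("gByP", ["g", "pInfo"]), ("pAtB", ["br", "pInfo"])]

# Per the definition above: an edge from the head predicate's processor
# to each predicate used in the rule body
edges = {(head, b) for head, body in rules for b in body}

heads = {head for head, _ in rules}
used = {b for _, body in rules for b in body}
edb = used - heads  # provided by @from / @enrich annotations
```

On top of this graph, annotation nodes such as @from, @enrich and @to would be connected to the predicates they produce or consume.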
To connect the message sender, the Fact Source, with the message receiver, the Routing Goal, LiLa extends Datalog by @from and @to annotation statements, similar to the open source integration system Apache Camel [4]. Nodes of the LDG with no incoming edges are either EDB predicates or fact sources. Nodes with no outgoing edges are (mostly) routing goals. The only counter-example are obsolete/unused processing steps, which can be deleted.

The sender-facing fact source specifies the sender's transport and message protocol. Listing 3 defines the fact source, which consists of a location, a configuration URI that can be directly interpreted by an integration system and defines the location of the facts, and a format, the message format of the data source (e. g., JSON, CSV, XML). The annotation body specifies the format's relations in form of Datalog facts. The message format is canonically converted to Datalog programs according to the ILP-CDM.

@from(<location>, <format>){
  <relationName(<parameter>+).>+
}

Listing 3: Definition of a fact source in LiLa.
Similarly, the routing goal definitions specify the receiver-facing transport and message protocols (cf. Listing 4). Hereby, the ILP-CDM is canonically converted to the message format understood by the receiver.

@to(<producerURI>, <format>){
  <relationName>[<linebreak><relationName>]*
}

Listing 4: Definition of a routing goal in LiLa.

3.3 Inherent Integration Patterns
The Datalog facts provided by the fact source can be directly evaluated by Datalog rules. The LiLa dependency graph is used to automatically identify message transformation and basic routing patterns.
Message Transformation Patterns.
Further patterns that can be derived from the LDG are message transformation patterns like Content Filter, Message Translator, and the local Content Enricher.

The content filter and message translator patterns are used to filter parts of a message as well as to translate the message's structure. Both are inherently declared in LiLa by using Datalog rules, which are collected in processors of the LDG. Each set of rules producing the same predicate corresponds to a filter or translator in the integration middleware. For instance, the LiLa program for the motivating example produces two content filters: one for the relation gByP and another one for the relation pAtB. The routing between multiple content filters is decided based on the dependency graph of the LiLa program. If a node has a single outgoing edge, the incoming data is directly routed to the processor corresponding to the subsequent node. If a node has multiple incoming edges, a join router pattern is present, which is detected and transformed as described in Section 4. The same is the case for a node having multiple outgoing edges, which corresponds to a multicast pattern.

For the local content enricher, LiLa allows to specify facts in a LiLa program. The facts are treated as a processor (i. e., node in the LDG) and are automatically placed into the message after a relation with this name is produced.
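The degree-based detection described above can be sketched as follows (illustrative graph encoding with data-flow edge direction as drawn in Figure 3, not the paper's implementation):

```python
from collections import defaultdict

def detect(edges):
    """Derive multicast / join router candidates from node degrees:
    multiple outgoing edges -> multicast, multiple incoming -> join router."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1
    patterns = {}
    for n in set(out_deg) | set(in_deg):
        if out_deg[n] > 1:
            patterns.setdefault(n, []).append("multicast")
        if in_deg[n] > 1:
            patterns.setdefault(n, []).append("join_router")
    return patterns

# Data-flow edges for the motivating example (cf. Figure 3)
edges = [("gE", "g"), ("gE", "br"), ("g", "gByP"), ("br", "pAtB"),
         ("pInfo", "gByP"), ("pInfo", "pAtB")]
found = detect(edges)
```

Nodes with a single outgoing edge need no pattern node; their data is routed directly to the subsequent processor.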
Message Routing Patterns.
In addition to the message transformation patterns, some simple routing patterns can be derived from the dependency graph like Multicast, Message Filter, Content-based Router and Join Router.

The multicast pattern can be used as part of the common map/reduce-style message processing. The multicast is derived by analyzing the dependency graph for independent rules, to which copies of the message are provided. One potential side-effect is the detection of (too) many multicast configurations, when a routing goal requests multiple intermediary results of a single route, which we mitigate by an optimization that keeps these results (not shown).

The message filter removes messages according to a filter condition (cf. Section 2). For the filter, LiLa does not define a special construct. The filtering of a message can be achieved by performing a content filtering, which leads to an empty message. Empty messages are discarded before sending the message for further processing to a routing goal. This behavior can be used to describe a content-based router, which distinguishes itself from the filter by its message cardinality of 1:n. However, in LiLa we use the router with a channel cardinality of 1:n (i. e., multicast) with message filters on each leaving message channel according to [11].

A structurally channel-combining pattern is the join router. The join router has a channel cardinality of n:1; however, it only combines channels, not messages. For that, an aggregator is used, which is defined subsequently.

The more complex routing patterns Aggregator, Splitter and the remote Content Enricher can neither be described by standard Datalog nor inherently detected in the dependency graph. Hence we define special annotations for these patterns.

For the aggregator, Listing 5 shows the @aggregate annotation with pre-defined aggregation strategies like union and an either time-based (e. g., completionTime=3) or number-of-messages-based (e. g., completionSize=5) completion condition. The annotation body consists of several Datalog queries. The message correlation is based on the query evaluation, where true means that the evaluation result is not an empty set of facts and false otherwise. As the aggregator does not produce facts with a new relation name, but combines multiple messages keeping their relations (i. e., it is not message producing), it is challenging how to reference the aggregated relations in a LiLa program, as their name does not change. This leads to problems when building the dependency graph, i. e., it is undecidable whether a rule uses the relation prior to or after aggregation. As we do not want the user to specify explicitly, whether she means the relation prior to or after aggregation in every rule using a predicate used in an aggregator, we suffix all predicates after an aggregation step with -aggregate by default. In combination with a join router, messages from several entering channels can be combined.

@aggregate(<aggregationStrategy>, <completionCondition>){
  <query>+
}

Listing 5: Definition of an aggregator in LiLa.
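A minimal sketch of such an aggregation (illustrative Python assuming a union strategy and a completionSize=2 condition, not the paper's implementation): the correlation condition assigns messages to a collection, the completion condition triggers the strategy, and the strategy unions the collected fact sets.

```python
collections = {}

def aggregate(msg_facts, corr_key, completion_size=2):
    coll = collections.setdefault(corr_key, [])    # crc: correlate message
    coll.append(msg_facts)
    if len(coll) < completion_size:                # cpc: not yet complete
        return None                                # message stays collected
    # as: aggregation strategy "union" over the correlated messages
    return set().union(*collections.pop(corr_key))

first = aggregate({("g", 1)}, "period1")   # waiting for more messages
second = aggregate({("g", 2)}, "period1")  # completed: union of both
```

A time-based completion (completionTime) would additionally require a timer that flushes incomplete collections, which is omitted here.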
LiLa specifies the splitter as in Listing 6 with a new @split annotation, which does not have any parameters in the annotation head. Datalog queries are used in the annotation body as splitter expressions, as briefly described in Section 2. The queries are evaluated on the exchange and each evaluation result is passed for further processing as a single message. Similar to the aggregator, all newly generated relations leaving a splitter are suffixed with -split by default, in order to not have to specify explicitly whether the relation prior to or after splitting is meant.

@split(){
  <relationName(<parameter>+).>+
}

Listing 6: Definition of a splitter in LiLa.
The remote content enricher can be seen as a special message endpoint. For instance, for an enricher including data from a file, the filename and format have to be specified as shown in Listing 7. Similar to the fact source, a set of relations has to be specified. Again, a canonical conversion from the specified file format to the ILP-CDM is conducted according to the definitions in Section 2. If the relations to enrich via this construct are already generated by another construct or Datalog rule, they are enriched after this construct by adding the additional facts to the message. If there is no construct or Datalog rule producing the relations specified in the annotation body, the relations are enriched directly before their usage. The enricher construct is especially useful when a single message shall be combined with additional information.

@enrich(<filename>, <format>){
  <relationName(<parameter>+).>+
}

Listing 7: Definition of a remote content enricher in LiLa
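A rough Python sketch of the merge step (the function name and fact encoding are assumptions; file reading and format conversion are omitted): only the relations named in the annotation body are taken from the enrichment source and added to the message's fact set:

```python
# Hypothetical sketch of @enrich semantics: add those facts from the
# (already format-converted) enrichment source whose relation name is
# listed in the annotation body to the incoming message.
def enrich(message_facts, source_facts, relation_names):
    extra = {f for f in source_facts if f[0] in relation_names}
    return set(message_facts) | extra
```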
4. SYNTHESIS OF LOGIC INTEGRATION LANGUAGE PROGRAMS
The defined LiLa constructs can be combined to complex representations of integration programs, which can be executed by integration systems. For that we have chosen the lightweight, open-source integration system Apache Camel [4], since it implements all discussed integration semantics. To guarantee data-intensive processing, LiLa programs are not synthesized to Apache Camel constructs directly, but to the ILP integration pattern re-definitions that are plugged into the respective system implementations (cf. Section 2). The equivalent to message channels in Apache Camel are Camel Routes.
Figure 4: LiLa Compiler Pipeline
The definition of a platform-independent message channel representation, called Route Graph (RG), enables a graph transformation t : LDG → RG and an efficient code generation for different runtime systems. The transformation is a two-step process: In the first step, a condition is evaluated on each edge or node, respectively. If the condition evaluates to true, further processing on this node/edge is performed. The second step is the execution of the actual transformation. The route graph RG is defined as RG := (V_R, E_R), where the nodes V_R are runtime components of an integration system (representing an ILP-EIP), and the edges E_R are communication channels from one node n ∈ V_R to another node n' ∈ V_R, or to itself. In most integration systems cyclic route graphs are possible; however, they are not considered in this work. The nodes in V_R can be partitioned into different routes, while edges in E_R from one route to another have to be of type to for the source node and of type from for the target node.
For instance, Figure 5 shows the route graph for the LDG of our motivating example (cf. Figure 3). The message flow between separately generated routes (dashed lines) indicates a to/from construct. Consequently, the LiLa program from Listing 1 results in four distinct routes, e. g., with a multicast multicast(direct:p,direct:g) and a file enricher from(direct:pEnrichInfo), identified through pattern detection, which is subsequently discussed.
Figure 5: Route graph for the LiLa program of the motivating example in Figure 3.
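The RG definition can be sketched as a small Python data structure (illustrative names, not the paper's implementation): nodes are assigned to routes, and any edge crossing a route boundary has to be realized as a to/from endpoint pair:

```python
# Hypothetical sketch of a route graph RG = (V_R, E_R): nodes are runtime
# components partitioned into routes; edges between different routes must
# become a to(...) endpoint on the source and a from(...) on the target.
class RouteGraph:
    def __init__(self):
        self.route_of = {}     # node -> route id
        self.edges = []        # list of (source, target) pairs

    def add_node(self, node, route):
        self.route_of[node] = route

    def add_edge(self, source, target):
        self.edges.append((source, target))

    def cross_route_edges(self):
        """Edges that require a to/from construct between routes."""
        return [(s, t) for (s, t) in self.edges
                if self.route_of[s] != self.route_of[t]]
```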
The more complex, structural join router, multicast and remote enricher patterns are automatically derived from the
LDG through a rule-based pattern detection approach. With these building blocks, common optimizations in integration systems, e. g., the map/reduce-like scatter/gather pattern [11, 17], which is a combination of the multicast, join router and aggregator patterns, can be synthesized. The rule-based detection and transformation approach defines a matching function [true|false] := mf_{LDG,mc} on the LDG, with matching condition mc, and a transformation t_G, with t_G : LDG → RG. The matching function denotes a node and edge graph traversal on the LDG that evaluates to true if the condition holds, false otherwise. The transformation t_G is executed only if the condition holds.
Join Router.
The router is an m:1 message channel join pattern, which usually has to be combined with an aggregator to join messages. The match condition is defined as mc_JR := deg^-(n_i) > 1, where deg^-(n_i) determines the number of entering message channels on a specific node n_i ∈ V_P, with i ∈ N. Hence, only in case of multiple entering edges, the graph transformation t_jr is executed. The transformations t_jr change the RG: through t_jr : n_i → n_fd ⊕ n_i, for all matching nodes n_i a from-direct node n_fd is added, denoted by ⊕. Additionally, all nodes n_j with direct, outgoing edges to the matching node get an additional to-direct node n_td: t_jr : n_j → n_j ⊕ n_td. Then all original edges e_m have to be removed: t_jr : E_P → E_P \ e_m. Figure 6a shows the LDG and Figure 6b the corresponding RG after the transformation.
Multicast.
The multicast has a channel cardinality of 1:n. The match condition is defined as mc_Mu := deg^+(n_i) > 1, where deg^+(n_i) determines the number of leaving message channels on a specific node n_i ∈ V_P, with i ∈ N. Hence, only in case of multiple leaving edges, the graph transformation t_mu is executed. The transformations t_mu change the RG: through t_mu : n_i → n_i ⊕ n_multic{n_j}, for all matching nodes n_i a multicast node n_multic is added, which references all previous neighboring nodes n_j via leaving edges. Then a from-direct node n_fd is added to all neighboring nodes n_j through transformation t_mu : n_j → n_fd ⊕ n_j. Additionally, all original edges e_m have to be removed: t_mu : E_P → E_P \ e_m. Figure 7a shows the LDG and Figure 7b the corresponding RG after the transformation.
Remote Enricher.
An enricher potentially merges several predicate relations into the main route as additional data. Therefore it has to get a route of its own that (periodically) gathers the respective messages. The intermediate transformation t_re is defined as t_re : n_i ⊕ {n_j} → n_fd ⊕ n_file ⊕ {n_j}, with n_i, n_j ∈ V_P, which takes all matching enricher nodes n_i and the list of connected nodes {n_j} and translates them to a from-direct node n_fd that is followed by a file relation n_file, referencing the connected nodes {n_j}. Additionally, all original edges e_m from the enricher n_i to the list of connected nodes {n_j} have to be removed: t_re : E_P → E_P \ e_m. The match condition for the remote enricher is the node type type(n_i), determined through the @enrich annotation: mc_RE := type(n_i) == '@enrich'. After the intermediate translation, all produced relations (nodes) that are linked to nodes in the main tree create a join router (cf. transformations t_jr) with a built-in aggregator that merges the facts, e. g., via a union operation. Figure 8a denotes the LDG of an example remote enricher pattern that is transformed to its corresponding RG, shown in Figure 8b. In order to find the complete path of nodes to extract, the leaving edges have to be followed, starting at the enricher node, until a node with multiple incoming edges is reached. Before the node with multiple incoming edges, a to-direct node is inserted through t_jr (dashed line). The URI of the call to the enricher node is set to the URI of the consumer, which has to be added directly before the enricher node.
The route graph represents the foundation for the code synthesis of the message channels, which are a combination of ILP constructs and Apache Camel patterns and routes. The construction of the routes is a trivial graph traversal starting from the fact source nodes. The multicast t_mu and join router t_jr transformations construct a RG with deg^-(n) == 1, with n ∈ V_R.
Hence the ILP constructs can be synthesized one after the other based on their type and the ILP properties, which were preserved during the transformations and optimizations. For instance, Figure 9 shows the synthesized Apache Camel routes in the EIP icon notation. In comparison to the motivating example in Figure 1, the content-based router is replaced by a multicast and message filters are added before the outbound message endpoints, while preserving the same semantics and allowing for parallel message processing.
Message Endpoints.
The fact source and routing goal nodes are transformed to components in Apache Camel, passing the configurations that are stored in the node properties. A detected (not @from annotated) fact source gets an additional numOfMsgsToAgg property, which remembers the entering message count of a join router; a corresponding aggregator ILP with completionSize = numOfMsgsToAgg is added. The location property defines the component's endpoint configuration, and the format leads to the generation of an ILP format converter (e. g., JSON, CSV to Datalog) that is configured using the meta-facts supplied in the annotation body, conducting an additional projection. If the format is set to datalog, no format conversion is needed. The routing goals are configured similarly. A message filter ILP is added that discards empty messages. Then a format converter (e. g., Datalog to JSON/CSV) is added and configured through the meta-facts property. Finally, a Camel producer component is added to the route and configured.
Complex Routing Patterns.
For the aggregator, additional renamingRules properties and renaming message translators are generated, containing a Datalog rule that adds -aggregate suffixes to every Datalog predicate used in the head of a query (for name differentiation). Similarly, for the splitter, -split suffixes are generated that allow additional message translators to rename the predicates. This is necessary in order to build the dependency graph as described in Section 3. The inherent multicast nodes are configured through a recipient list property, containing the target node identifiers, which allows for a translation to the Camel multicast (no ILP defined).
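A sketch of such a renaming translator in Python (illustrative encoding: facts as (relation, args...) tuples), appending the -aggregate or -split suffix to every relation name of a message:

```python
# Hypothetical sketch of the renaming message translator: suffix every
# relation name, e.g. with "-aggregate" after an aggregator or "-split"
# after a splitter, so later rules reference the post-pattern relations.
def rename_relations(facts, suffix):
    return {(f[0] + suffix,) + tuple(f[1:]) for f in facts}
```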
Message Translation Patterns.
The content filter and message translator nodes can be generated to the ILP content filter, which is based on a Camel processor and configured accordingly. The node of the inherent content enricher, which can be specified by writing facts into a LiLa program, stores the facts as properties. The generated ILP (again based on a Camel processor) adds the facts to every incoming message. The explicit file enricher pattern is configured similarly to a fact source; however, the configuration specifies a fileName property, used to configure the Camel component. Again, ILP format converters are added and configured by the meta-facts property.
5. EXPERIMENTAL EVALUATION
We implemented ILP constructs as extensions to the lightweight, open-source integration system Apache Camel in version 2.12.2, based on Section 2, that are referenced from LiLa programs as described in Section 4. The HLog Datalog system we used for the measurements is a Java implementation of the standard naïve-recursive Datalog evaluation (i. e., without stratification) from Ullman [20] in version 0.0.6, as described in [18].

Figure 6: Detection of the Join Router Pattern: (a) LiLa Graph, (b) Route Graph.
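For orientation, naive-recursive evaluation re-applies all rules to the entire fact set until no new facts are derived; a compact Python sketch (our encoding: facts as tuples, rules as functions deriving new facts) might look like this:

```python
# Sketch of naive-recursive Datalog evaluation: iterate all rules over
# the complete fact set until a fixpoint is reached (no stratification,
# as in the HLog setup described above; the encoding is illustrative).
def naive_eval(facts, rules):
    known = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(known)
        if derived <= known:      # fixpoint reached: nothing new
            return known
        known |= derived

# Example rules encoding transitive closure over edge/2 facts:
def path_base(fs):
    return {("path", a, b) for (r, a, b) in fs if r == "edge"}

def path_step(fs):
    return {("path", a, c)
            for (r1, a, b) in fs if r1 == "path"
            for (r2, b2, c) in fs if r2 == "edge" and b2 == b}
```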
We conduct all measurements on a HP Z600 work station, equipped with two Intel Xeon processors clocked at 2.67GHz with a total of 2 x x.

match("true").
meta("match","matching",1).

Listing 8: Single-fact message

[{"matching": "true"}]

Listing 9: Single-entry message in JSON

The multi-facts message tests were conducted with the JSON message shown in Listing 11 with its corresponding Datalog representation in Listing 10. The messages' payloads are approximately 58 and 85 bytes, respectively.

match("true",1).
match("false",2).
meta("match","matching",1).
meta("match","count",2).

Listing 10: Multi-fact message

[{"matching": "true", "count": 1},
 {"matching": "false", "count": 2}]

Listing 11: Multi-entry message in JSON
The subsequently described measurements target an experimental runtime evaluation of some of the introduced integration patterns from Section 2 through a comparison of the ILP with the original Java-based implementation. Keep in mind that tests with empty messages or routes without any processing steps result in identical performance results. All tests measure the pattern processing only, and thus neglect the necessary format conversions.
Message Filter and Content-based Router Patterns.
The basic router pattern analysis is conducted for the message filter and content-based router using the single-fact message for ILP (cf. Listing 8) and the corresponding JSON message for the Java implementation (cf. Listing 9). We execute the message filter in a Camel route for ILP (generated from a LiLa program) and for Java as shown in Listings 12 and 13. The performance is measured without the message endpoints by sending multiple single-fact messages, while one half of the messages is filtered out and the other half is routed further.

Figure 7: Detection of Multicast Pattern: (a) LiLa Graph, (b) Route Graph.

@from(file:data/testMessageFilter){
  match(matching).
}
match-filtered(matching) :- match("true").
@to(file:data/filtered)

Listing 12: LiLa message filter performance test program

from("file:data/testMessageFilter")
  .filter(new JsonMatchKeyValueExpression("match", "true"))
  .to("file:data/filtered");

Listing 13: Camel route used for the message filter performance measurements of Camel-Java

The performance measurement results are depicted in Figure 10. The Camel-ILP and the Camel-Java implementation show a linear performance in the amount of incoming messages. Although the setup favors the Java processing, because (a) only single-fact messages are sent (i. e., Datalog evaluation is better suited for set operations), and (b) the type of operation during message routing is mostly only used to "peek" into the message content (cf. Section 2), the Java implementation seems to be only slightly better for higher amounts of single-fact messages. A similar result/behavior can be observed for the content-based router pattern.
Content Filter Pattern.
As an example for message transformations, we evaluate the content filter on a single message containing a varying amount of facts based on the message payloads from Listings 10 and 11. The routes in Listings 14 and 15 show that the content filter is configured with a single rule, for which half of the facts match.

@from(file:data/testContentFilter){
  match(matching, count).
}
match-filtered(matching, count) :- match("true", count).
@to(file:data/contentFilter){ match-filtered }

Listing 14: LiLa content filter performance test program

from("file:data/testContentFilter")
  .process(new JSONContentFilter(
      new JsonMatchKeyValueExpression("match", "true")))
  .to("file:data/filtered");

Listing 15: Camel route used for the content filter performance measurements of Camel-Java

The results of the measurement, depicted in Figure 11, show a linear performance in the amount of facts processed. Noticeably, ILP is approximately twice as fast as the pure Camel-Java implementation. This is especially relevant for data-intensive processing scenarios and supports the observations (a,b) from the routing pattern measurement. Even if the multi-fact message contains only two facts, (a) the Datalog evaluation is already faster, and (b) the approach favors more data-intensive operations on the message that are not only "peeking" into the content for simple routing.
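The semantics of the measured filter rule can be stated in a few lines of Python (illustrative tuple encoding of the facts, not the benchmarked implementation):

```python
# Sketch of the content filter rule
# match-filtered(matching, count) :- match("true", count):
# keep only the facts whose first argument equals "true" and emit them
# under the new head relation.
def content_filter(facts):
    return {("match-filtered",) + f[1:] for f in facts
            if f[0] == "match" and f[1] == "true"}
```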
Coming back to the motivating example in Listing 1, we extended it with the calculation of the player's position while shooting on goal (cf. posAtShotOnGoal) and with a sampling of the player positions on a minute basis using a recursive rule (cf. pPosPerMinute). Listing 16 shows the extended version of the LiLa program that "tweets" the calculated positions and stores them with the "players at ball" to a file, and the positions per minute to a database.

Figure 8: Detection of the Remote Enricher Pattern: (a) LiLa Graph, (b) Route Graph.

@from(file:gameEvents.json, json){
  gE(period, time, eventCode, pId).
}
@from(file:playerPosition.json, json){
  pPos(period, time, playerId, posX, posY).
}
g(period, time, pId) :- gE(period, time, "Goal", pId).
p(period, time, pId) :- gE(period, time, "BallReception", pId).
gByP(period, time, pId, firstN, lastN) :-
  g(period, time, pId), pInfo(pId, firstN, lastN).
pAtB(period, time, pId, firstN, lastN) :-
  p(period, time, pId), pInfo(pId, firstN, lastN).
posAtShotOnGoal(period, time, firstN, lastN, posX, posY) :-
  gByP(period, time, pId, firstN, lastN),
  pPos(period, time, pId, posX, posY).
pPosPerMinute(period, time, playerId, posX, posY) :-
  pPos(period, millitime, posX, posY),
  time := 1, time = millitime / 600.
pPosPerMinute(period, time, playerId, posX, posY) :-
  pPos(period, millitime, posX, posY),
  pPosPerMinute(A, previousTime, B, C, D),
  time := previousTime + 1, time = millitime / 600.
@enrich(playerInfo.json, json){
  pInfo(pId, firstN, last).
}
@to(twitter:$config, json){ gByP }
@to(file:playersAtBall.json){ pAtB }
@to(file:positionAtShotOnGoal){ posAtShotOnGoal }
@to(jdbc:soccerDatabase){ pPosPerMinute }

Listing 16: Soccer Game Event Integration (revisited) as LiLa program.

The corresponding (extended) LiLa dependency graph
LDG is shown in Figure 12, which is used to generate the route graph RG depicted in Figure 13. As the node posAtShotOnGoal has multiple incoming arcs, a join router pattern is detected and generated. Similarly, a multicast pattern is detected and generated after the from(file:playerPosition,json) and gByP nodes.
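The two detections at work here can be reproduced with a few lines of Python on a plain edge-list encoding of the LDG (the function names are ours; cf. the match conditions mc_JR and mc_Mu from Section 4):

```python
# Hypothetical sketch of the rule-based pattern detection on the LDG:
# a node with more than one entering edge matches the join router
# condition (mc_JR := deg-(n) > 1), one with more than one leaving edge
# the multicast condition (mc_Mu := deg+(n) > 1).
def detect_join_routers(edges):
    indeg = {}
    for _, dst in edges:
        indeg[dst] = indeg.get(dst, 0) + 1
    return {n for n, d in indeg.items() if d > 1}

def detect_multicasts(edges):
    outdeg = {}
    for src, _ in edges:
        outdeg[src] = outdeg.get(src, 0) + 1
    return {n for n, d in outdeg.items() if d > 1}
```

On the revisited example, the edges into posAtShotOnGoal from gByP and pPos trigger the join router detection, while the two edges leaving pPos trigger the multicast detection.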
6. RELATED WORK
The application of Datalog to integration programming for current middleware systems has not been considered before and was only recently brought into the discussion by our BPMN-based modeling approach [16]. However, the work on Java systems (e. g., Telegraph Dataflow [19], Jaguar [21]) can be considered related work in the area of programming languages on application systems for faster, data-intensive processing. These approaches mainly target making Java better suited for data-intensive processing, while struggling with threading, garbage collection and memory management. None of them considers the combination of the host language with relational logic processing.
Figure 9: Generated Apache Camel routes in EIP-icon notation for the motivating example (inherent message filters before recipients omitted).
Figure 10: Basic message routing pattern test.
Figure 11: Basic message transformation pattern test.
Declarative XML Processing and Semantic Web.
Related work can be found in the area of declarative XML message processing (e. g., [5]). Using an XQuery data store for defining persistent message queues, this work targets only a subset of ILP (i. e., persistent message queuing). In the semantic web domain, several approaches use Datalog to integrate and query data from different, mostly XML-based sources. For instance, the Semantic Web Integration
Figure 12: Dependency graph of the motivating example (revisited).

Middleware (SWIM) extends Datalog with XPath expressions in the rule body to map XML to RDF as well as RQL to relational queries [6]. ILP takes this approach one step further by using standard Datalog+ to describe integration patterns that can be composed to integration scenarios.
Data Integration.
The data integration domain uses integration middleware systems for querying remote data that is treated as local or "virtual" relations. Starting with SQL-based approaches, e. g., using the Garlic integration system [10], the data integration research reached relational logic programming, summarized by [8]. In contrast to remote queries in data integration, ILP extends integration programming with declarative, relational logic programming for application integration as well as the expressiveness of logic programs through integration semantics.
Figure 13: Route graph of the motivating example (revisited).
Declarative Application Programming.
With LiLa, we defined a language design similar to the trend of Datalog-style rule-based languages for declarative, data-centric application development. Major work in this complementary field has been conducted by Green et al. [9] with the DatalogLB language for (analytical) applications, and by Abiteboul et al. [1], who applied logic programming (i. e., extended Datalog) to analytical and web application development and developed Webdamlog [2, 1], a language based on Datalog for specifying distributed applications.
Data-aware Integration Languages.
The modeling of data-intensive workflows and integration scenarios has been approached only recently. Abiteboul et al. compare business entity modeling to their Active XML (AXML) approach [3]. AXML is a data-aware workflow language, which specifies XML documents with embedded Web Service calls. Compared to LiLa, Active XML partly defines the notion of fact sources and content enrichers, omitting message translation, routing goals and complex routing patterns like the aggregator and splitter patterns. Another workflow approach is described in [12], a decision mining approach that results in a Product Data Model, which strives to give insights into the data view of a business decision process. In LiLa the data graph is more explicit and the control flow model is of no concern to the user.
In the area of message-based integration, we define integration scenarios as BPMN-based Integration Flows (IFlows), which specify the control-, data-, and exception-flow modeling [13, 14]. Although the IFlow approach is far better than control-flow-centric models (e. g., the Guaraná DSL [7]), data operations and formats still remain implicit.
7. CONCLUSION AND FUTURE WORK
According to the observations
P1–P4, the main contributions of this work are (a) the analysis of the "de-facto" standard integration patterns with respect to their enhancement for data-intensive processing, (b) the definition of integration logic programs, which are relational logic language constructs that can be embedded into patterns aligned with their semantics, (c) the definition of a data-aware logic integration language, which can be synthesized to integration logic programs, (d) an application to a conventional integration system, and (e) a brief performance analysis and the application to a data-intensive integration scenario.
Future work will be conducted in the area of rule-based optimization during the automatic program-to-runtime compilation for common integration processing styles, e. g., for scatter/gather and splitter/gather, with the related questions on data partitioning and provisioning during message processing.
8. REFERENCES
[1] S. Abiteboul, E. Antoine, G. Miklau, J. Stoyanovich, and J. Testard. Rule-based application development using webdamlog. In SIGMOD, pages 965–968, 2013.
[2] S. Abiteboul, M. Bienvenu, A. Galland, and E. Antoine. A rule-based language for web data management. In PODS, pages 293–304, 2011.
[3] S. Abiteboul and V. Vianu. Models for data-centric workflows. In In Search of Elegance in the Theory and Practice of Computation - Essays Dedicated to Peter Buneman, pages 1–12, 2013.
[4] J. Anstey and H. Zbarcea. Camel in Action. Manning, 2011.
[5] A. Böhm, C.-C. Kanne, and G. Moerkotte. Demaq: A foundation for declarative XML message processing. In CIDR, pages 33–43, 2007.
[6] V. Christophides, G. Karvounarakis, I. Koffina, G. Kokkinidis, A. Magkanaraki, D. Plexousakis, G. Serfiotis, and V. Tannen. The ICS-FORTH SWIM: A powerful semantic web integration middleware. In SWDB, pages 381–393, 2003.
[7] R. Z. Frantz, A. M. R. Quintero, and R. Corchuelo. A domain-specific language to design enterprise application integration solutions. Int. J. Cooperative Inf. Syst., 20(2):143–176, 2011.
[8] M. R. Genesereth. Data Integration: The Relational Logic Approach. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2010.
[9] T. J. Green, M. Aref, and G. Karvounarakis. Logicblox, platform and language: A tutorial. In Datalog, pages 1–8, 2012.
[10] L. M. Haas, D. Kossmann, E. L. Wimmers, and J. Yang. Optimizing queries across diverse data sources. In VLDB, pages 276–285, 1997.
[11] G. Hohpe and B. Woolf. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2003.
[12] R. Petrusel, I. T. P. Vanderfeesten, C. C. Dolean, and D. Mican. Making decision process knowledge explicit using the decision data model. In Business Information Systems (BIS), pages 172–184, 2011.
[13] D. Ritter. Experiences with business process model and notation for modeling integration patterns. In European Conference on Modelling Foundations and Applications (ECMFA), pages 254–266, 2014.
[14] D. Ritter. Using the business process model and notation for modeling enterprise integration patterns. CoRR, abs/1403.4053, 2014.
[15] D. Ritter. What about database-centric enterprise application integration? In Central-European Workshop on Services and their Composition (ZEUS), pages 73–76, 2014.
[16] D. Ritter and J. Bross. Datalogblocks: Relational logic integration patterns. In Database and Expert Systems Applications (DEXA), pages 318–325, 2014.
[17] D. Ritter, N. May, and S. Rinderle-Ma. Patterns for emerging application integration scenarios: A survey. Inf. Syst., 67:36–57, 2017.
[18] D. Ritter and T. Westmann. Business network reconstruction using datalog. In Datalog in Academia and Industry - Second International Workshop (Datalog 2.0), pages 148–152, 2012.
[19] M. A. Shah, S. Madden, M. J. Franklin, and J. M. Hellerstein. Java support for data-intensive systems: Experiences building the telegraph dataflow system. SIGMOD Record, 30(4):103–114, 2001.
[20] J. D. Ullman. Principles of Database and Knowledge-Base Systems, Volume I. Computer Science Press, 1988.
[21] M. Welsh and D. E. Culler. Jaguar: enabling efficient communication and I/O in java.