Dynamic Graph Queries ∗
Pablo Muñoz †, Nils Vortmeier, and Thomas Zeume ‡
{nils.vortmeier, thomas.zeume}@tu-dortmund.de
Abstract
Graph databases in many applications (semantic web, transport or biological networks, among others) are not only large, but also frequently modified. Evaluating graph queries in this dynamic context is a challenging task, as those queries often combine first-order and navigational features.
Motivated by recent results on maintaining dynamic reachability, we study the dynamic evaluation of traditional query languages for graphs in the descriptive complexity framework. Our focus is on maintaining regular path queries, and extensions thereof, by first-order formulas. In particular we are interested in path queries defined by non-regular languages and in extended conjunctive regular path queries (which allow comparing labels of paths based on word relations). Further we study the closely related problems of maintaining distances in graphs and reachability in product graphs.
In this preliminary study we obtain upper bounds for those problems in restricted settings, such as undirected and acyclic graphs, or under insertions only, and negative results regarding quantifier-free update formulas. In addition we point out interesting directions for further research.
F.4.1. Mathematical Logic
Keywords and phrases
Dynamic descriptive complexity, graph databases, graph products, reachability, path queries
Graph databases are important in applications in which the topology of data is as important as the data itself. Intuitively, a graph database represents objects (by nodes), and relationships between those objects (often modeled by labeled edges; see [1] for a survey on graph database models). Recent years have witnessed an increasing interest in graph databases, due to the rise of applications that need to manage and query massive and highly connected data, for example the semantic web, social networks or biological networks. In most of these applications, databases are not only large, but also highly dynamic. Data is frequently inserted and deleted, and hence so is its network structure. The goal of this work is to explore how query languages for graph databases can be evaluated in this dynamic context.
∗ Parts of this work are also included in the dissertation thesis of the third author [23].
† The author acknowledges the financial support by Conicyt PhD scholarship and Millennium Nucleus Center for Semantic Web Research under Grant NC120004.
‡ The author acknowledges the financial support by DFG grant SCHW 678/6-1.
© Pablo Muñoz, Nils Vortmeier and Thomas Zeume; licensed under Creative Commons License CC-BY. Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany.
Many query languages for graph databases combine traditional first-order features with navigational ones. Already basic languages (such as regular path queries, see e.g. [21, 2]) allow testing the existence of paths satisfying constraints on their labels (e.g. adherence to a regular expression in regular path queries). Computing the answers to this kind of query on large, highly dynamic graphs is a big challenge. It is conceivable, though, for answers to a query before and after small modifications to be closely related.
Thus a reasonable hope is to be able to update the answer to a query more efficiently than by recomputing it from scratch after each modification, even more so if we are allowed to store extra auxiliary data that might ease the updating task. To what extent this is possible, and under which precise conditions, is the subject of dynamic computational complexity.
Here we are interested in studying the dynamic complexity of query languages for graph databases from a descriptive approach. In the dynamic descriptive complexity setting, proposed independently by Dong, Su and Topor [9, 8] and by Patnaik and Immerman [16], a dynamic program maintains auxiliary relations with the intention to help answering a query over a (relational) database subject to small modifications (insertions or deletions of tuples). When a modification occurs, the query answers and every auxiliary relation are updated by first-order formulas (or, equivalently, by core SQL queries) evaluated over the current database and the available auxiliary data. Such programs therefore benefit from being both highly parallelizable (due to the close connection of first-order logic and small-depth boolean circuits) and readily implementable in standard relational database engines. The class of queries maintainable by first-order update formulas is called DynFO.
Query languages for graphs have, so far, not been studied systematically in the dynamic descriptive complexity setting. Very likely the main reason is that until recently it was not even known whether reachability in directed graphs could be maintained by first-order update formulas. That this is indeed possible was shown in [6], with the immediate consequence that all fixed (conjunctions of) regular path queries can also be maintained. Thus regular path queries can be evaluated in a highly parallel fashion in dynamic graph databases. Motivated by this result we study the dynamic maintainability of more expressive query languages.
Goal.
Gain a better understanding of the limits of maintaining graph query languages in the dynamic context.
Our focus is on regular path queries and extensions thereof: non-regular path queries and extended conjunctive regular path queries (short: ECRPQs).
Some previous work on non-regular path queries has been done. Weber and Schwentick exhibited a context-free path query (the Dyck language $D_2$) that can be maintained in DynFO on acyclic graphs [20]. Also, for the simple class of path-shaped graph databases, formal language results can be transferred. Patnaik and Immerman already pointed out that regular languages can be maintained in DynFO [16]. Later, Gelade et al. systematically studied the dynamic complexity of formal languages [11]. They showed, among other results, that regular languages can be maintained by quantifier-free update formulas, and that all context-free languages can be maintained in DynFO.
The second extension of regular path queries to be studied here are extended conjunctive regular path queries. In previous work it has been noticed that conjunctions of regular path queries (CRPQs) fall short in expressive power for modern applications of graph databases [4]. A feature commonly demanded by these applications is the comparison of labels of paths defined by CRPQs based on relations of words (e.g. prefix, length constraints, fixed edit-distance). ECRPQs have been introduced to fulfill this requirement [4], that is, they generalize CRPQs by allowing tests of whether multiple labels of paths adhere to given regular relations. Two basic properties expressible by ECRPQs are whether two pairs of nodes are connected by paths of the same length and, if so, whether paths with the same label sequence also exist. In general, maintaining the result of ECRPQs seems to be a difficult task. In this article we therefore explore the maintenance of ECRPQs in restricted settings.
Finally, there is also a close connection between the evaluation of graph queries and the reachability problem in unlabeled and labeled product graphs. We discuss this connection (see Section 2), and exploit it in several of our results.
Contributions
First we study path queries and show that
- all regular path queries can be maintained by quantifier-free formulas when only insertions are allowed,
- all context-free path queries can be maintained by first-order formulas on acyclic graphs, and
- there are non-context-free path queries maintainable by first-order formulas on undirected and acyclic graphs, as well as on general graphs under insertions only.
As a first step towards maintaining ECRPQs we explore for which graph classes the lengths of paths between nodes can be maintained. We exhibit dynamic programs for maintaining all distances for undirected and acyclic graphs, as well as for directed graphs when only insertions are allowed. It remains open whether distances can be maintained in DynFO for general directed graphs, but we show that quantifier-free update formulas do not suffice.
The techniques used to maintain all distances can be used to maintain variants of ECRPQs in restricted settings. Denote the extension of a class of queries by linear constraints on the number of occurrences of symbols on paths by +LC. This extension was introduced and studied in [4]. We show that
- all CRPQ+LCs can be maintained by first-order formulas when only insertions are allowed, and
- all ECRPQ+LCs can be maintained by first-order formulas on acyclic graphs.
An immediate consequence of our results for distances is that reachability can be maintained in products of (unlabeled) graphs for those restrictions. By using the dynamic program for maintaining the rank of matrices from [6], we extend this result to more general graph products. Furthermore we show that pairs of nodes connected by paths with the same label sequence can be maintained in acyclic graphs using first-order update formulas.
Related work
The maintenance of problems has also been studied from an algorithmic point of view. A good starting point for readers interested in upper bounds for dynamic algorithms is [18, 7]; a good starting point for lower bound techniques is the survey by Miltersen on cell probe complexity [14]. The upper bounds for reachability obtained in [18, 7] immediately transfer to dynamic algorithmic evaluation of regular path queries (using the reduction exhibited in [6]).
Outline
The dynamic setting and the basic graph query languages are introduced in Section 2. There we also discuss the connection between query evaluation and reachability in product graphs. Section 3 contains the results on maintaining graph queries. Our results for maintaining distances and ECRPQs are presented in Section 4. In Section 5 some of the results for maintaining graph queries are transferred to reachability in graph products, and we also provide results for reachability in generalized graph products. We conclude in Section 6.
This is a full version of [15].
Acknowledgements
We thank Pablo Barceló, Samir Datta and Thomas Schwentick for stimulating and illuminating discussions.
In this section we introduce the dynamic complexity framework as well as the graph query languages used in this article.
Dynamic complexity framework
In this work we use the dynamic complexity framework as introduced by Patnaik and Immerman [16]. The following introduction of the framework is borrowed from previous work [25].
Intuitively, the goal of a dynamic program is to keep the result of a given query $\mathcal{Q}$ up to date while the database to be queried (the input database) is subject to tuple insertions and deletions. To this end the dynamic program stores auxiliary relations (the auxiliary database) with the aim that one of those relations always (that is, after every possible sequence of modifications) stores the result of $\mathcal{Q}$ for the current input structure. Whenever a tuple is inserted into or deleted from the input structure, each auxiliary relation is updated by the dynamic program by evaluating a specified first-order formula.
We make this more precise now. A dynamic instance of a query $\mathcal{Q}$ is a pair $(\mathcal{D}, \alpha)$, where $\mathcal{D}$ is a database over some finite domain $D$ and $\alpha$ is a sequence of modifications to $\mathcal{D}$. Here, a modification is either an insertion of a tuple over $D$ into a relation of $\mathcal{D}$ or a deletion of a tuple from a relation of $\mathcal{D}$. The result of $\mathcal{Q}$ for $(\mathcal{D}, \alpha)$ is the relation that is obtained by first applying the modifications from $\alpha$ to $\mathcal{D}$ and then evaluating $\mathcal{Q}$ on the resulting database. We use the Greek letters $\alpha$ and $\beta$ to denote modifications as well as modification sequences. The database resulting from applying a modification $\alpha$ to a database $\mathcal{D}$ is denoted by $\alpha(\mathcal{D})$. The result $\alpha(\mathcal{D})$ of applying a sequence of modifications $\alpha \stackrel{\text{def}}{=} \alpha_1 \ldots \alpha_m$ to a database $\mathcal{D}$ is defined by $\alpha(\mathcal{D}) \stackrel{\text{def}}{=} \alpha_m(\ldots(\alpha_1(\mathcal{D}))\ldots)$.
Dynamic programs, to be defined next, consist of an initialization mechanism and an update program. The former yields, for every (input) database $\mathcal{D}$, an initial state with initial auxiliary data.
The latter defines how the new state of the dynamic program is obtained from the current state when applying a modification.
A dynamic schema is a tuple $(\tau_{\text{in}}, \tau_{\text{aux}})$ where $\tau_{\text{in}}$ and $\tau_{\text{aux}}$ are the schemas of the input database and the auxiliary database, respectively. While $\tau_{\text{in}}$ may contain constants, we do not allow constants in $\tau_{\text{aux}}$ in the basic setting. We always let $\tau \stackrel{\text{def}}{=} \tau_{\text{in}} \cup \tau_{\text{aux}}$.
Definition 1 (Update program).
An update program $P$ over a dynamic schema $(\tau_{\text{in}}, \tau_{\text{aux}})$ is a set of first-order formulas (called update formulas in the following) that contains, for every relation symbol $R$ in $\tau_{\text{aux}}$ and every $\delta \in \{\text{ins}_S, \text{del}_S\}$ with $S \in \tau_{\text{in}}$, an update formula $\phi^R_\delta(\bar{x}; \bar{y})$ over the schema $\tau$ where $\bar{x}$ and $\bar{y}$ have the same arity as $S$ and $R$, respectively.
A program state $\mathcal{S}$ over dynamic schema $(\tau_{\text{in}}, \tau_{\text{aux}})$ is a structure $(D, \mathcal{I}, \mathcal{A})$ where $D$ is a finite domain, $\mathcal{I}$ is a database over the input schema (the current database) and $\mathcal{A}$ is a database over the auxiliary schema (the auxiliary database).
The semantics of update programs is as follows. Let $P$ be an update program, $\mathcal{S} = (D, \mathcal{I}, \mathcal{A})$ be a program state and $\alpha = \delta(\bar{a})$ a modification where $\bar{a}$ is a tuple over $D$ and $\delta \in \{\text{ins}_S, \text{del}_S\}$ for some $S \in \tau_{\text{in}}$. If $P$ is in state $\mathcal{S}$ then the application of $\alpha$ yields the new state $P_\alpha(\mathcal{S}) \stackrel{\text{def}}{=} (D, \alpha(\mathcal{I}), \mathcal{A}')$ where, in $\mathcal{A}'$, a relation symbol $R \in \tau_{\text{aux}}$ is interpreted by $\{\bar{b} \mid \mathcal{S} \models \phi^R_\delta(\bar{a}; \bar{b})\}$. The effect $P_\alpha(\mathcal{S})$ of applying a modification sequence $\alpha \stackrel{\text{def}}{=} \alpha_1 \ldots \alpha_m$ to a state $\mathcal{S}$ is the state $P_{\alpha_m}(\ldots(P_{\alpha_1}(\mathcal{S}))\ldots)$.
Definition 2 (Dynamic program).
A dynamic program is a triple $(P, \text{Init}, Q)$, where $P$ is an update program over some dynamic schema $(\tau_{\text{in}}, \tau_{\text{aux}})$, Init is a mapping that maps $\tau_{\text{in}}$-databases to $\tau_{\text{aux}}$-databases, and $Q \in \tau_{\text{aux}}$ is a designated query symbol.
A dynamic program $\mathcal{P} = (P, \text{Init}, Q)$ maintains a query $\mathcal{Q}$ if, for every dynamic instance $(\mathcal{D}, \alpha)$, the query result $\mathcal{Q}(\alpha(\mathcal{D}))$ coincides with the content of $Q$ in the state $\mathcal{S} = P_\alpha(\mathcal{S}_{\text{Init}}(\mathcal{D}))$ where $\mathcal{S}_{\text{Init}}(\mathcal{D})$ is the initial state for $\mathcal{D}$, that is, $\mathcal{S}_{\text{Init}}(\mathcal{D}) \stackrel{\text{def}}{=} (D, \mathcal{D}, \text{Init}(\mathcal{D}))$.
The following example due to [16] shows how the transitive closure of an acyclic graph subject to edge insertions and deletions can be maintained in this set-up. The basic technique of this example will be crucial in some of the later proofs.
Example 3.
Consider an acyclic graph $G$ subject to edge insertions and deletions. In the following, our goal is to maintain the transitive closure of $G$ using a dynamic program with first-order update formulas. It turns out that if the graph is guaranteed to remain acyclic, then it is sufficient to store the current transitive closure relation in an auxiliary relation $T$. We follow the argument from [16].
When an edge $(u, v)$ is inserted into $G$ the following very simple rule updates $T$: there is a path from $x$ to $y$ after inserting $(u, v)$ if (1) there was already a path from $x$ to $y$ before the insertion, or (2) there were paths from $x$ to $u$ and from $v$ to $y$ before the insertion. This rule can be easily specified by a first-order update formula that defines the updated transitive closure relation:
$$\phi^T_{\text{ins}\,E}(u, v; x, y) \stackrel{\text{def}}{=} T(x, y) \lor \big(T(x, u) \land T(v, y)\big).$$
Deletions are slightly more involved. There is a path $\rho$ from $x$ to $y$ after deleting an edge $(u, v)$ if there was a path from $x$ to $y$ before the deletion and (1) there was no such path via $(u, v)$, or (2) there is an edge $(z, z')$ on $\rho$ such that $u$ can be reached from $z$ but not from $z'$. If there is still a path $\rho$ from $x$ to $y$, such an edge $(z, z')$ must exist, as otherwise $u$ would be reachable from $y$, contradicting acyclicity. This rule can be described by a first-order formula:
$$\phi^T_{\text{del}\,E}(u, v; x, y) \stackrel{\text{def}}{=} T(x, y) \land \Big(\big(\lnot T(x, u) \lor \lnot T(v, y)\big) \lor \exists z \exists z' \big(T(x, z) \land E(z, z') \land (z \neq u \lor z' \neq v) \land T(z', y) \land T(z, u) \land \lnot T(z', u)\big)\Big)$$
Footnote (on the update formulas): For simplicity we use the same names for elements and variables.
A word on the initial input and auxiliary databases is due. As default we use the original setting of Patnaik and Immerman, where the input database is empty at the beginning, and the auxiliary relations are initialized by first-order formulas evaluated on the initial input database. When we use a different initialization setting we state it explicitly. In the literature several other settings have been investigated and we refer to [25, 23] for a detailed discussion.
The class of queries that can be maintained by first-order update formulas in the setting of Patnaik and Immerman is called DynFO. Restricting update formulas to be quantifier-free yields the class DynProp.
When showing that a particular query is in DynFO we often assume that arithmetic on the domain is available from initialization time, that is, we assume the presence of relations $\leq, +, \times$ that are interpreted as a linear order (which allows identifying elements with numbers), addition and multiplication on the domain. From a DynFO program that relies on built-in arithmetic, a program without built-in arithmetic can be constructed for all queries studied here by using a technique from [6].
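Returning to Example 3, the two update rules for the transitive closure of an acyclic graph can be tried out in a few lines of Python. The following is a minimal sketch, not part of the original construction: the class name `AcyclicTransitiveClosure` and the choice of `range(num_nodes)` as the domain are assumptions made for illustration, the relation `T` is taken to be reflexive (paths of length zero count), and the caller is assumed to keep the graph acyclic. Each update evaluates the corresponding formula on the old state for every pair $(x, y)$.

```python
class AcyclicTransitiveClosure:
    """Maintains the reflexive-transitive closure T of a graph that stays acyclic."""

    def __init__(self, num_nodes):
        self.nodes = range(num_nodes)           # hypothetical fixed domain
        self.E = set()                          # input relation: edges (u, v)
        self.T = {(x, x) for x in self.nodes}   # empty graph: only trivial paths

    def insert(self, u, v):
        # T'(x, y) = T(x, y) or (T(x, u) and T(v, y)), evaluated on the old T.
        T = self.T
        self.T = {(x, y) for x in self.nodes for y in self.nodes
                  if (x, y) in T or ((x, u) in T and (v, y) in T)}
        self.E.add((u, v))

    def delete(self, u, v):
        if (u, v) not in self.E:
            return  # the deletion rule presupposes that the edge is present
        T, E = self.T, self.E  # old state, still containing the edge (u, v)

        def still_connected(x, y):
            if (x, y) not in T:
                return False
            if (x, u) not in T or (v, y) not in T:
                return True  # no old path from x to y runs via (u, v)
            # Look for an edge (z, z2) other than (u, v) with x ->* z, z2 ->* y,
            # where z reaches u but z2 does not.
            return any((x, z) in T and (z2, y) in T
                       and (z, u) in T and (z2, u) not in T
                       for (z, z2) in E if (z, z2) != (u, v))

        self.T = {(x, y) for x in self.nodes for y in self.nodes
                  if still_connected(x, y)}
        self.E.discard((u, v))
```

A first-order update formula is evaluated for all tuples independently, which the two set comprehensions mimic; nothing here is tuned for efficiency.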
Proposition 4 ([6, Theorem 4]).
Every domain-independent query $\mathcal{Q}$ that can be maintained in DynFO with built-in arithmetic can also be maintained in DynFO.
Here, a query is domain-independent if its result does not change when elements are added to the domain.
Constructing a DynFO program for a specific query $\mathcal{Q}$ can be a tedious task. Such a construction can often be simplified by reducing $\mathcal{Q}$ to a query $\mathcal{Q}'$ for which a dynamic program has already been obtained. Such a reduction needs to be consistent with first-order logic and its use in this dynamic context. A suitable kind of reduction is the bounded first-order reduction. Intuitively, a query $\mathcal{Q}$ reduces to a query $\mathcal{Q}'$ via a bounded first-order reduction if a modification of an instance of $\mathcal{Q}$ induces constantly many, first-order definable modifications in an instance of $\mathcal{Q}'$. Note that if $\mathcal{Q}$ can be reduced to $\mathcal{Q}'$ via a bounded first-order reduction, then first-order update formulas for a modification of an instance of $\mathcal{Q}$ can be obtained by composing the first-order update formulas for the corresponding (first-order definable) modifications of the instance of $\mathcal{Q}'$. We refer to [16] and [12] for a detailed exposition of bounded first-order reductions.
In this article we study dynamic programs for queries on (labeled) graphs. For most of our dynamic programs the precise encoding of graphs is not important. If the input to a query is a single $\Sigma$-labeled graph $G = (V, E)$ then it can, for example, be encoded by binary relations $E_\sigma$ that store all $\sigma$-labeled edges, for all $\sigma \in \Sigma$. The same holds for constantly many graphs. Some of our results are for input databases that contain more than constantly many graphs. Those can be encoded in higher arity relations in a straightforward way. For example, linearly many graphs can be stored in ternary relations $E_\sigma$ containing a tuple $(g, u, v)$ if graph $g$ contains a $\sigma$-labeled edge $(u, v)$.
Graph databases and query languages
We review basic definitions of graph databases in order to fix notations and introduce the query languages used in this work.
A graph database over an alphabet $\Sigma$ is a finite $\Sigma$-labeled graph $G = (V, E)$ where $V$ is a finite set of nodes and $E \subseteq V \times \Sigma \times V$ is a set of labeled edges $(u, \sigma, v)$. Here $\sigma$ is called the label of edge $(u, \sigma, v)$. Given a $\Sigma$-labeled graph $G = (V, E)$ and a symbol $\sigma \in \Sigma$, we denote by $G_\sigma$ the projection of $G$ onto its $\sigma$-labeled edges, that is, the graph $G_\sigma$ has the edge set $\{(u, v) \mid (u, \sigma, v) \in E\}$. We say that a $\Sigma$-labeled graph $G$ is acyclic if the graph $\bigcup_{\sigma \in \Sigma} G_\sigma$ is acyclic, and undirected if for each $\sigma \in \Sigma$ the graph $G_\sigma$ is undirected.
Footnote (on the class DynFO): In [25, 24, 22] the class DynFO comes with an arbitrary initialization, yet there the focus is on lower bounds.
A path $\rho$ in $G$ from $v_0$ to $v_m$ is a sequence of edges $(v_0, \sigma_1, v_1), \ldots, (v_{m-1}, \sigma_m, v_m)$ of $G$, for some length $m \geq 0$. The label of $\rho$, denoted by $\lambda(\rho)$, is the word $\sigma_1 \cdots \sigma_m \in \Sigma^*$. Paths of length zero are labeled by the empty string $\epsilon$. For a formal language $L \subseteq \Sigma^*$, we say that $\rho$ is an $L$-path if $\lambda(\rho) \in L$.
The basic building block of many graph query languages are regular path queries (short: RPQs). An RPQ selects all pairs of nodes in a $\Sigma$-labeled graph that are connected by an $L$-path, for a given regular language $L \subseteq \Sigma^*$. Here we are interested in two extensions of regular path queries. The first are path queries defined by non-regular languages, namely context-free and non-context-free languages.
The second extension to be studied, extended conjunctive regular path queries (short: ECRPQs), allows defining multiple paths and comparing their labels based on relations on words. In the following we give a short introduction to ECRPQs and refer to [4] for a detailed study.
In ECRPQs, paths are compared by regular relations. A $k$-ary regular relation $R$ over alphabet $\Sigma$ is defined by a finite state automaton $A$ that synchronously reads $k$ words over $\Sigma \cup \{\bot\}$, with $\bot \notin \Sigma$. The $\bot$ symbol is a padding symbol that may only occur at the end of a word, and therefore allows for processing words of different length. More formally, $A$ reads words over the alphabet $(\Sigma \cup \{\bot\})^k$, and a $k$-tuple of words is in $R$ if its corresponding string over $(\Sigma \cup \{\bot\})^k$ is accepted by $A$.
An ECRPQ is of the form
$$\mathcal{Q}(\vec{z}) \longleftarrow \bigwedge_{1 \leq i \leq m} (x_i, \pi_i, y_i), \ \bigwedge_{1 \leq j \leq t} R_j(\vec{\omega}_j)$$
where
- each $R_j$ is a regular relation over $\Sigma$ (specified by some finite state automaton),
- $\vec{x} = (x_1, \ldots, x_m)$, $\vec{y} = (y_1, \ldots, y_m)$ and $\vec{z}$ are tuples of node variables such that the variables in $\vec{z}$ occur in $\vec{x}$ or $\vec{y}$, and
- $\vec{\pi} = (\pi_1, \ldots, \pi_m)$ and $\vec{\omega}_1, \ldots, \vec{\omega}_t$ are distinct tuples of path variables such that all variables in each $\vec{\omega}_j$ occur in $\vec{\pi}$.
In general, both node and path variables can occur in the head of an ECRPQ. In that case, outputs of ECRPQs are potentially infinite sets, since there can be infinitely many paths in graphs with cycles. Nevertheless, given a $\Sigma$-labeled graph $G = (V, E)$ and a tuple $\vec{v}$ of nodes, the answer set is a regular relation over the alphabet $(V^k \cup \Sigma_\bot^k)$, where $k$ is the number of path variables in the head. Such a regular relation is used as an encoding of all possible path outputs. For a fixed ECRPQ $\mathcal{Q}$, given $G$ and $\vec{v}$, the automaton for this regular relation can be obtained in PTime [4]. More precisely, it is definable by first-order queries evaluated on the graph database. We can therefore neglect path variables in the dynamic setting: for every tuple of nodes in the answer of the query we can obtain the regular relation encoding the output paths by first-order queries.
The semantics of ECRPQs is defined in a natural way. For an ECRPQ $\mathcal{Q}$ of the above form, a $\Sigma$-labeled graph $G = (V, E)$, and mappings $\nu$ from node variables to nodes and $\mu$ from path variables to paths, we write $(G, \nu, \mu) \models \mathcal{Q}$ if
- $\mu(\pi_i)$ is a path in $G$ from $\nu(x_i)$ to $\nu(y_i)$ for $1 \leq i \leq m$, and
- the tuple $(\lambda(\mu(\pi_{j_1})), \ldots, \lambda(\mu(\pi_{j_k})))$ belongs to the relation $R_j$ for each $\vec{\omega}_j = (\pi_{j_1}, \ldots, \pi_{j_k})$.
The result of $\mathcal{Q}$ evaluated on $G$ is defined by $\mathcal{Q}(G) \stackrel{\text{def}}{=} \{\nu(\vec{z}) : (G, \nu, \mu) \models \mathcal{Q}\}$.
Product Graphs and Graph Query Languages
There is a strong connection between the evaluation problem for many graph query languages and the reachability query for products of labeled graphs. For example, the evaluation of a regular path query $L$ on a labeled graph $G$ can be reduced to reachability in the product graph $A \times G$ where $A$ is a finite state automaton for $L$. Product graphs also help with the evaluation of fragments of ECRPQs. We will exploit this connection in several places and therefore present some basic properties of product graphs next.
The product graph $\prod_i G_i$ of $m$ $\Sigma$-labeled graphs $G_i = (V_i, E_i)$, $1 \leq i \leq m$, has nodes $\prod_i V_i$ and an edge $(\vec{x}, \vec{y})$ between two nodes $\vec{x} = (x_1, \ldots, x_m)$ and $\vec{y} = (y_1, \ldots, y_m)$ if there is a symbol $\sigma \in \Sigma$ such that $(x_i, \sigma, y_i) \in E_i$ for each $1 \leq i \leq m$. The graphs $G_i$ are called factors of the graph product. Graph products for unlabeled graphs are defined analogously. The following well-known property characterizes reachability in (labeled) product graphs.
Fact 5.
Let $(G_i)_{1 \leq i \leq m}$ be graphs ($\Sigma$-labeled graphs) with $G_i = (V_i, E_i)$ and let $\vec{x} = (x_1, \ldots, x_m)$, $\vec{y} = (y_1, \ldots, y_m)$ be two nodes of $\prod_i G_i$. Then $\vec{y}$ is reachable from $\vec{x}$ in $\prod_i G_i$ if and only if there are paths $\rho_i$ from $x_i$ to $y_i$ in $G_i$ with $|\rho_i| \leq |\prod_i V_i|$, for $i \in \{1, \ldots, m\}$, and $|\rho_i| = |\rho_j|$ ($\lambda(\rho_i) = \lambda(\rho_j)$, respectively) for all $i, j \in \{1, \ldots, m\}$.
The preceding fact can be used in the dynamic context as well, i.e. it is compatible with bounded first-order reductions. More precisely, reachability in products of unlabeled graphs can be inferred from all distances in the factors. We say that all distances up to $n^c$, for $c \in \mathbb{N}$, are computed by a dynamic program if, for a graph $G$ with $n$ nodes and arithmetic on the domain, it maintains a relation $D$ that contains all tuples $(x, y, \ell)$ such that there is a path from $x$ to $y$ of length $\ell$, for $0 \leq \ell \leq n^c$.
Proposition 6.
The following problems are equivalent under bounded first-order reductions with built-in arithmetic:
(a) Maintaining all distances up to $n^2$.
(b) Maintaining reachability in the product of two graphs (both of them subject to modifications).
(c) Maintaining reachability in the product of two graphs, one of them a fixed path.
Proof sketch.
Problem (c) is clearly a special case of Problem (b). The reduction from Problem (b) to Problem (a) is an immediate consequence of Fact 5: two nodes $(x, x')$ and $(y, y')$ of a product graph $G \times G'$ are connected if and only if there are equal-length paths of length at most $n^2$ from $x$ to $y$ in $G$ and from $x'$ to $y'$ in $G'$.
Thus it remains to reduce Problem (a) to Problem (c). For a graph $G$ over domain $D \stackrel{\text{def}}{=} \{0, \ldots, n-1\}$, consider the product graph $G \times P$ where $P$ is the path $\{(0, 1), \ldots, (n^2 - 1, n^2)\}$ (as usual, numbers larger than $n$ are encoded as tuples over $D$). Then there is a path of length $\ell$ between two nodes $x$ and $y$ of $G$ if and only if there is a path from $(x, 0)$ to $(y, \ell)$ in $G \times P$. Furthermore, the path $P$ is never modified.
A similar equivalence, stated in Proposition 7, can be established for problems related to reachability in products of $\Sigma$-labeled graphs.
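The correspondence behind the reduction from Problem (a) to Problem (c) in Proposition 6 can be checked on small inputs. The sketch below is static: it rebuilds the product graph and searches it from scratch, so it illustrates only the statement "a path of length ℓ from x to y exists iff (y, ℓ) is reachable from (x, 0) in G × P", not a dynamic program. The function names `product_edges`, `reachable_from` and `path_lengths`, and the parameter `bound` for the length of the fixed path, are assumptions made for illustration. As in the text, paths may visit nodes repeatedly.

```python
def product_edges(edges_g, edges_h):
    """Edges of the product of two unlabeled graphs given as sets of pairs."""
    return {((x, a), (y, b)) for (x, y) in edges_g for (a, b) in edges_h}


def reachable_from(edges, source):
    """All nodes reachable from source (including source itself)."""
    successors = {}
    for a, b in edges:
        successors.setdefault(a, set()).add(b)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def path_lengths(edges, x, y, bound):
    """Lengths l <= bound of paths from x to y, read off the product G x P."""
    fixed_path = {(i, i + 1) for i in range(bound)}  # P = 0 -> 1 -> ... -> bound
    reach = reachable_from(product_edges(edges, fixed_path), (x, 0))
    return {length for length in range(bound + 1) if (y, length) in reach}
```

For the directed triangle `{(0, 1), (1, 2), (2, 0)}`, a call such as `path_lengths(edges, 0, 2, 9)` collects exactly the lengths at which node 2 can be reached from node 0.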
Proposition 7.
The following problems are equivalent under bounded first-order reductions:
(a) Maintaining the existence of equally labeled paths between two pairs of nodes.
(b) Maintaining reachability in the product of two $\Sigma$-labeled graphs.
(c) Maintaining reachability in the product of two $\Sigma$-labeled graphs, one of them undirected.
(d) Maintaining the palindrome path query on $\Sigma$-labeled graphs.
Footnote (on the arithmetic assumed when computing all distances up to $n^c$): We note that from the arithmetic on the domain, arithmetic up to $n^c$ can be defined using first-order formulas.
Proof sketch.
The equivalence of Problems (a) and (b) is an immediate consequence of Fact 5.
We next show the equivalence of Problems (b) and (c). Clearly, Problem (c) is a special case of (b). For reducing Problem (b) to Problem (c) consider two (directed) $\Sigma$-labeled graphs $G_1$ and $G_2$. Let $\# \notin \Sigma$ be a fresh symbol, and denote $\Sigma_\# = \Sigma \cup \{\#\}$. In a first step, from $G_1$ and $G_2$ we construct two undirected $\Sigma_\#$-labeled graphs $G'_1$ and $G'_2$ such that (1) there is a path between two nodes of $G_1 \times G_2$ if and only if there is a $(\Sigma \circ \{\#\})^*$-labeled path between the corresponding nodes in $G'_1 \times G'_2$, and (2) one modification in $G_i$ corresponds to at most two modifications in $G'_i$ definable in first-order logic. To this end, the graph $G'_i$ has two nodes $x_{\text{in}}$ and $x_{\text{out}}$ for each node $x$ of $G_i$. An edge $(x, \sigma, y)$ of $G_i$ is encoded by the edges $(x_{\text{out}}, \sigma, y_{\text{in}})$ and $(y_{\text{in}}, \#, y_{\text{out}})$ in $G'_i$. (In particular, the edge $(y_{\text{in}}, \#, y_{\text{out}})$ is present in $G'_i$ as soon as $y$ has an incoming edge in $G_i$.)
Now every path in $G_1 \times G_2$ corresponds to a $(\Sigma \circ \{\#\})^*$-labeled path in $G'_1 \times G'_2$, which in turn corresponds to a path in the product graph $G'_1 \times G'_2 \times A$ where $A$ is the labeled (directed) graph representing the language $(\Sigma \circ \{\#\})^*$. Since $G'_1 \times G'_2 \times A$ is the product of the undirected graph $G'_1$ and the directed graph $G'_2 \times A$, this yields the intended reduction (as one modification in $G_2$ yields at most six modifications in $G'_2 \times A$).
We now show that Problems (b) and (d) are equivalent. For reducing (d) to (b) we consider, for simplicity, only palindromes of even length; the construction can be easily adapted to arbitrary palindromes. Let $G$ be a labeled (directed) graph. Then there is a path from $x$ to $y$ labeled by a palindrome $ww^R$ if and only if there is a node $z$ such that there are $w$-labeled and $w^R$-labeled paths from $x$ to $z$ and from $z$ to $y$, respectively. Thus finding a palindromic path from $x$ to $y$ corresponds to finding a node $z$ such that there is a path from $(x, y)$ to $(z, z)$ in the product graph $G \times G^{-1}$, where $G^{-1}$ denotes the graph obtained from $G$ by reversing each of its edges (definable in first-order logic). Note that one modification of $G$ corresponds to two modifications in the factors of $G \times G^{-1}$.
For the other direction, let $G_1$ and $G_2$ be two arbitrary $\Sigma$-labeled graphs and let $\# \notin \Sigma$ be a fresh symbol. We assume, without loss of generality, that the node sets of the graphs are disjoint. There is a path from $(x_1, x_2)$ to $(y_1, y_2)$ in $G_1 \times G_2$ if and only if there is a word $w$, a $w$-labeled path from $x_1$ to $y_1$ and a $w$-labeled path from $x_2$ to $y_2$. The latter condition is equivalent to the existence of a $w \# w^R$-labeled path from $x_1$ to $x_2$ in the graph $G \stackrel{\text{def}}{=} G_1 \cup G_2^{-1}$ extended by the edge $(y_1, \#, y_2)$.
Path queries, as mentioned in the introduction, have hardly been studied in dynamic complexity before. Until recently not even the simple query induced by the language $L(a^*)$ was known to be in DynFO. Yet as an immediate consequence of the dynamic first-order update program for reachability exhibited in [6], all fixed regular path queries (and, since DynFO is closed under conjunctions, also conjunctions of them) can be maintained by first-order update formulas.
In this section we continue the exploration of the dynamic maintainability of path queries. We show that under insertions quantifier-free update formulas are sufficient to maintain (fixed) regular path queries, and that more expressive path queries can be maintained for restricted classes of graphs and constrained modifications.
Theorem 8.
When only insertions are allowed then every regular path query can bemaintained by quantifier-free update formulas.
We conjecture that quantifier-free update formulas do not suffice to maintain RPQs under both insertions and deletions. Otherwise reachability could be maintained without quantifiers, which seems to be very unlikely. A first step towards verifying this conjecture was done in [25], where it was shown that reachability cannot be maintained with binary quantifier-free programs.
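The insertion-only update rule behind Theorem 8 can be tried out with a small Python sketch. It stores, for every pair of automaton states $(p, q)$, the set of node pairs $(x, y)$ joined by a path that the automaton can read from $p$ to $q$. After the insertion of a $\sigma$-labeled edge $(u, v)$, it combines old paths with up to $|Q|$ traversals of the new edge, using only Boolean combinations of lookups in the old relations (no search over intermediate nodes). The class name `InsertOnlyRPQ`, the dictionary encoding of the transition function, and the fixed node set `range(num_nodes)` are assumptions made for illustration.

```python
class InsertOnlyRPQ:
    """Regular path query under edge insertions; delta maps (state, symbol) -> state."""

    def __init__(self, num_nodes, states, delta, start, finals):
        self.nodes = range(num_nodes)
        self.states = list(states)
        self.delta = dict(delta)
        self.start = start
        self.finals = set(finals)
        # Empty graph: only paths of length zero, which leave the state unchanged.
        self.R = {(p, q): ({(x, x) for x in self.nodes} if p == q else set())
                  for p in self.states for q in self.states}

    def insert(self, u, sigma, v):
        R, Q = self.R, self.states
        # first[p, q]: the automaton gets from p to q along the new edge itself
        # or along an old path from u to v.
        first = {(p, q): self.delta.get((p, sigma)) == q or (u, v) in R[p, q]
                 for p in Q for q in Q}
        # After the loop, phi[p, q] allows up to |Q| such segments from u to v,
        # separated by old paths leading from v back to u.
        phi = dict(first)
        for _ in range(len(Q) - 1):
            phi = {(p, q): phi[p, q] or any(first[p, p1] and (v, u) in R[p1, q1]
                                            and phi[q1, q]
                                            for p1 in Q for q1 in Q)
                   for p in Q for q in Q}
        # New R[p, q]: old pairs, plus old path x ->* u, the segment u ->* v,
        # and old path v ->* y with matching intermediate states.
        self.R = {(p, q): {(x, y) for x in self.nodes for y in self.nodes
                           if (x, y) in R[p, q]
                           or any((x, u) in R[p, p1] and phi[p1, q1]
                                  and (v, y) in R[q1, q]
                                  for p1 in Q for q1 in Q)}
                  for p in Q for q in Q}

    def answer(self):
        """Pairs (x, y) connected by a path whose label the automaton accepts."""
        result = set()
        for f in self.finals:
            result |= self.R[self.start, f]
        return result
```

For instance, with the two-state automaton for $(ab)^*$ one would pass `delta = {(0, "a"): 1, (1, "b"): 0}`, `start = 0` and `finals = {0}`, and then call `insert(u, sigma, v)` once per inserted edge.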
Proof.
The following notion will be useful. Let $\mathcal{A}$ be a deterministic finite state automaton (short: DFA) and let $G$ be a labeled graph. Then a path $\rho$ in $G$ can be read by $\mathcal{A}$ starting in a state $p$ and ending in a state $q$ if $\mathcal{A}$ can reach state $q$ from state $p$ by reading the label sequence $\lambda(\rho)$ of $\rho$.

Let $L$ be a regular path query and let $\mathcal{A} = (Q, \Sigma, \delta, s, F)$ be a DFA with $L = L(\mathcal{A})$. We construct a DynProp-program $\mathcal{P}$ that maintains $L$.

The program $\mathcal{P}$ has input schema $\{E_\sigma \mid \sigma \in \Sigma\}$ and an auxiliary schema that contains a binary relation symbol $R_{p,q}$ for every pair $(p,q) \in Q^2$ of states, as well as a binary designated query symbol $R$. The simple idea is that in a state $\mathcal{S}$ with underlying labeled graph $G$, the relation $R^{\mathcal{S}}_{p,q}$ contains all tuples $(x,y) \in V^2$ such that $\mathcal{A}$, for some labeled path $\rho$ from $x$ to $y$, can read $\rho$ by starting in state $p$ and ending in state $q$.

The update formulas for the relations $R_{p,q}$ are slightly more involved than the formulas for maintaining reachability under insertions. This is because $\mathcal{A}$ might reach a state $q$ from a state $p$ only by reading a labeled path from $x$ to $y$ that contains one or more loops. The crucial observation is, however, that for deciding whether $(x,y)$ is in $R_{p,q}$ it suffices to consider paths that contain the node $x$ at most $|Q|$ times (as paths that contain $x$ more than $|Q|$ times can be shortened). This suffices to maintain the relations $R_{p,q}$ dynamically.

The update formulas for $R_{p,q}$ and $R$ are as follows:

\[ \phi^{R_{p,q}}_{\text{ins } E_\sigma}(u,v;x,y) \overset{\text{def}}{=} R_{p,q}(x,y) \lor \bigvee_{p',q'} \Big( R_{p,p'}(x,u) \land \varphi^{|Q|}_{p',q'}(u,v) \land R_{q',q}(v,y) \Big) \]

\[ \phi^{R}_{\text{ins } E_\sigma}(u,v;x,y) \overset{\text{def}}{=} \bigvee_{f \in F} \phi^{R_{s,f}}_{\text{ins } E_\sigma}(u,v;x,y) \]

Here the formula $\varphi^{|Q|}_{p',q'}(u,v)$ shall only be satisfied by tuples $(u,v)$ for which there exists a path $\rho$ from $u$ to $v$ such that $\mathcal{A}$ can read $\rho$ by starting in $p'$ and ending in $q'$. It shall be satisfied by all such tuples with a witness path $\rho$ that contains node $u$ at most $|Q|$ times. We inductively define, for every $1 \le i \le |Q|$ and all $p, q \in Q$, the slightly more general formulas $\varphi^{i}_{p,q}(u,v)$ as follows:

\[ \varphi^{1}_{p,q}(u,v) \overset{\text{def}}{=} [(p,\sigma,q) \in \delta] \lor R_{p,q}(u,v) \]

\[ \varphi^{i}_{p,q}(u,v) \overset{\text{def}}{=} \varphi^{i-1}_{p,q}(u,v) \lor \bigvee_{p',q'} \Big( \varphi^{1}_{p,p'}(u,v) \land R_{p',q'}(v,u) \land \varphi^{i-1}_{q',q}(u,v) \Big) \]

◀

Capturing non-regular path queries by first-order update formulas seems to be significantly harder than capturing CRPQs. We provide only some preliminary results for restricted classes of graphs and modifications.

When all distances for all pairs of nodes can be maintained for a restricted class of graphs, then also non-regular and even non-context-free path queries can be maintained (e.g. the language $\{a^n b^n c^n \mid n \in \mathbb{N}\}$).

Theorem 9. (a) There is a non-context-free path query that can be maintained in
DynFO on acyclic and undirected $\Sigma$-labeled graphs.
(b) There is a non-context-free path query that can be maintained in DynFO when only insertions are allowed.
Proof.
The non-context-free path query induced by $L = \{a^n b^n c^n\}$ can be maintained since, for a graph $G$, distances on $G_a$, $G_b$, $G_c$ can be kept up to date under those restrictions (see Theorem 12 and Theorem 13). The arithmetic needed for those theorems can be simulated by Proposition 4. We note that path queries induced by languages such as $\{a^n b^{n+m} c^m\}$ can also be maintained by first-order update formulas. ◀

On acyclic graphs, all context-free path queries can be maintained. It is known that context-free languages are in DynFO [11] and that the Dyck language with two types of parentheses can be maintained on acyclic graphs [20]. Generalizing the techniques used for those two results yields the following theorem.
Theorem 10.
All context-free path queries can be maintained in
DynFO on acyclic graphs.
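To make the statement of Theorem 10 concrete, the following minimal sketch evaluates a context-free path query statically, by a naive fixpoint over the rules of a grammar in Chomsky normal form. It is only a reference semantics for the query, not the dynamic program of the theorem. The function name, the encoding of rules as tuples, and the example grammar for $\{a^n b^n \mid n \ge 1\}$ are assumptions made for illustration.

```python
def context_free_pairs(edges, unary_rules, binary_rules, start):
    """Return all (x, y) joined by a path whose label is derivable from `start`.

    edges:        set of (x, symbol, y) triples
    unary_rules:  set of (X, symbol) for rules X -> symbol
    binary_rules: set of (X, Y, Z) for rules X -> Y Z
    The rule S -> epsilon is ignored here for brevity.
    """
    # Base facts: X derives the label of a single edge.
    facts = {(X, x, y) for (x, a, y) in edges for (X, b) in unary_rules if a == b}
    changed = True
    while changed:
        changed = False
        for (X, Y, Z) in binary_rules:
            # Compose a Y-path from x to z with a Z-path from z to y.
            derived = {(X, x, y)
                       for (Y1, x, z) in facts if Y1 == Y
                       for (Z1, z1, y) in facts if Z1 == Z and z1 == z}
            if not derived <= facts:
                facts |= derived
                changed = True
    return {(x, y) for (X, x, y) in facts if X == start}


# Hypothetical example: S generates a^n b^n (n >= 1); C is a copy of S so that
# S does not occur on any right-hand side.
UNARY = {("A", "a"), ("B", "b")}
BINARY = {("S", "A", "B"), ("S", "A", "T"),
          ("C", "A", "B"), ("C", "A", "T"), ("T", "C", "B")}
EDGES = {(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)}
ANSWER = context_free_pairs(EDGES, UNARY, BINARY, "S")
```

On the example path labeled `aabb`, the answer consists of the pairs `(1, 3)` and `(0, 4)`. The dynamic program below avoids this recomputation by storing the relations $R_{Z \to Z'}$ instead.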
To prove Theorem 10, we fix a context-free language $L$ and a grammar $G = (V, \Sigma, S, P)$ for $L$. We assume, without loss of generality, that $G$ is in Chomsky normal form, that is, it has only rules of the form $X \to YZ$ and $X \to \sigma$. Furthermore, if $\epsilon \in L$ then $S \to \epsilon \in P$ and no right-hand side of a rule contains $S$. We write $Z \Rightarrow^* w$ if $w \in (\Sigma \cup V)^*$ can be derived from $Z \in V$ using rules of $G$.

The dynamic program maintaining $L$ on acyclic graphs will use 4-ary auxiliary relation symbols $R_{Z \to Z'}$ for all $Z, Z' \in V$. The intention is that in every state $\mathcal{S}$ with input database $G$, the relation $R^{\mathcal{S}}_{Z \to Z'}$ contains a tuple $(x_1, y_1, x_2, y_2)$ if and only if there are strings $s_1, s_2 \in \Sigma^*$ such that $Z \Rightarrow^* s_1 Z' s_2$ and there is an $s_i$-path $\rho_i$ from $x_i$ to $y_i$ for $i \in \{1, 2\}$. The paths $\rho_1$ and $\rho_2$ are called witnesses for $(x_1, y_1, x_2, y_2) \in R^{\mathcal{S}}_{Z \to Z'}$. Later we will see that whether two nodes are connected by an $L$-path after an update can be easily verified using those relations.

It turns out that for updating the relations $R^{\mathcal{S}}_{Z \to Z'}$ it is necessary to have access to $(2k+2)$-ary relations $R^{\mathcal{S}}_{X \to Y_1, \ldots, Y_k}$, for $k \in \{1, 2, 3\}$, which contain a tuple $(x_1, y_1, \ldots, x_{k+1}, y_{k+1})$ if and only if there are strings $s_1, \ldots, s_{k+1} \in \Sigma^*$ such that $X \Rightarrow^* s_1 Y_1 s_2 \ldots s_k Y_k s_{k+1}$ and there is an $s_i$-path $\rho_i$ from $x_i$ to $y_i$ in the input database underlying $\mathcal{S}$.

Next, in Lemma 11, we prove that every relation $R^{\mathcal{S}}_{X \to Y_1, \ldots, Y_k}$ is first-order definable from the relations $R^{\mathcal{S}}_{Z \to Z'}$ (and thus only relations $R^{\mathcal{S}}_{Z \to Z'}$ have to be stored as auxiliary data). This lemma is inspired by Lemma 7.3 from [20], and its proof is a generalization of the technique used in the proof of Theorem 4.1 in [11]. Afterwards we prove Theorem 10 by showing how to use the relations $R^{\mathcal{S}}_{Z \to Z'}$ to maintain $L$ and how to update the relations $R^{\mathcal{S}}_{Z \to Z'}$ using the formulas that define relations of the form $R^{\mathcal{S}}_{X \to Y_1, Y_2}$ and $R^{\mathcal{S}}_{X \to Y_1, Y_2, Y_3}$.

Lemma 11.
For a grammar $G$ in Chomsky normal form, $k \ge 1$ and variables $X, Y_1, \ldots, Y_k$ there is a first-order formula $\varphi_{X \to Y_1, \ldots, Y_k}$ over schema $\tau = \{R_{Z \to Z'} \mid Z, Z' \in V\}$ that defines $R_{X \to Y_1, \ldots, Y_k}$ in states $\mathcal{S}$ where the relations $R^{\mathcal{S}}_{Z \to Z'}$ are as described above.

Proof sketch.
We explain how $\varphi_{X \to Y_1, Y_2, Y_3}$ tests whether a tuple is contained in $R^{\mathcal{S}}_{X \to Y_1, Y_2, Y_3}$. The construction for general $k$ is analogous.

If a tuple $(x_1, y_1, x_2, y_2, x_3, y_3, x_4, y_4)$ is contained in $R^{\mathcal{S}}_{X \to Y_1, Y_2, Y_3}$ witnessed by $s_i$-paths $\rho_i$ from $x_i$ to $y_i$ such that $X \Rightarrow^* s_1 Y_1 s_2 Y_2 s_3 Y_3 s_4$, then in the derivation tree of $s_1 Y_1 s_2 Y_2 s_3 Y_3 s_4$ from $X$ there is a variable $U$ such that $U \to U_1 U_2$ and either (1) $Y_1$ and $Y_2$ are derived from $U_1$, and $Y_3$ is derived from $U_2$; or (2) $Y_1$ is derived from $U_1$, and $Y_2$ and $Y_3$ are derived from $U_2$. In case (1), the derivation subtree starting from $U_1$ contains a variable $W$ such that $W \to W_1 W_2$ and $Y_1$ is derived from $W_1$ and $Y_2$ is derived from $W_2$. Analogously for case (2). The derivation tree of $X$ for case (1) is illustrated in Figure 1.

[Figure: derivation tree with root $X$, inner variables $U, U_1, U_2, W, W_1, W_2$ and leaves $Y_1, Y_2, Y_3$; the strings $s_1, \ldots, s_4$ are split at the positions $x_1, u_1, w_1, y_1, x_2, w_2, y_2, x_3, w_3, u_2, y_3, x_4, u_3, y_4$.]

Figure 1
Illustration of when a tuple $(x_1, y_1, x_2, y_2, x_3, y_3, x_4, y_4)$ is contained in $R_{X \to Y_1, Y_2, Y_3}$ in Lemma 11.

The formula $\varphi_{X \to Y_1, Y_2, Y_3}$ is the disjunction of formulas $\psi_1$ and $\psi_2$, responsible for dealing with the cases (1) and (2) respectively. We only exhibit $\psi_1$; the formula $\psi_2$ can be constructed analogously. The formula $\psi_1$ guesses the variables $U, U_1, U_2, W, W_1$ and $W_2$, and the start and end positions of strings derived from those variables. Whether $(x_1, y_1, x_2, y_2, x_3, y_3, x_4, y_4)$ is contained in $R^{\mathcal{S}}_{X \to Y_1, Y_2, Y_3}$ can then be tested using the relations $R_{Z \to Z'}$. For simplicity the formula $\psi_1$ reuses the element names $x_i$ and $y_i$ as variable names and is defined as follows:

\[
\begin{aligned}
\psi_1(x_1, y_1, \ldots, x_4, y_4) = {} & \exists u_1 \exists u_2 \exists u_3 \bigvee_{\substack{U, U_1, U_2 \in V \\ U \to U_1 U_2 \in P}} \exists w_1 \exists w_2 \exists w_3 \bigvee_{\substack{W, W_1, W_2 \in V \\ W \to W_1 W_2 \in P}} \\
& \Big( R_{X \to U}(x_1, u_1, u_3, y_4) \land R_{U_1 \to W}(u_1, w_1, w_3, u_2) \land R_{W_1 \to Y_1}(w_1, y_1, x_2, w_2) \\
& \quad \land R_{W_2 \to Y_2}(w_2, y_2, x_3, w_3) \land R_{U_2 \to Y_3}(u_2, y_3, x_4, u_3) \Big)
\end{aligned}
\]

◀

We now use the relations $R_{Z \to Z'}$ and the formulas $\varphi_{X \to Y_1, Y_2, Y_3}$ for maintaining context-free path queries on acyclic graphs.

Proof idea (of Theorem 10).
Let $L$ be an arbitrary context-free language and let $G = (V, \Sigma, S, P)$ be a grammar for $L$ in Chomsky normal form. We provide a DynFO-program $\mathcal{P}$ with designated binary query symbol $Q$ that maintains $L$ on acyclic graphs. The input schema is $\{E_\sigma \mid \sigma \in \Sigma\}$ and the auxiliary schema is $\tau_{\text{aux}} = \{R_{X \to Y} \mid X, Y \in V\} \cup \{T\}$. The intention of the auxiliary relation symbols $R_{X \to Y}$ has already been explained above; the relation symbol $T$ shall store the transitive closure of the input graph (where the input graph is the union of all $E_\sigma$).

Before showing how to update the relations $R_{X \to Y}$, we state the update formulas for the query relation $Q$. The update formulas distinguish whether the witness path is of length 0 or of length at least 1. The updated relations $R_{X \to Y}$ are used for the latter case.

\[
\begin{aligned}
\phi^{Q}_{\text{ins } E_\sigma}(u,v;x,y) \overset{\text{def}}{=} {} & ([S \to \epsilon \in P] \land x = y) \\
& \lor \exists z_1 \exists z_2 \bigvee_{\substack{U \in V \\ U \to \tau \in P}} \big( \phi^{R_{S \to U}}_{\text{ins } E_\sigma}(u,v;x,z_1,z_2,y) \land E_\tau(z_1,z_2) \big) \\
& \lor \bigvee_{\substack{U \in V \\ U \to \sigma \in P}} \big( \phi^{R_{S \to U}}_{\text{ins } E_\sigma}(u,v;x,u,v,y) \big)
\end{aligned}
\]

\[
\begin{aligned}
\phi^{Q}_{\text{del } E_\sigma}(u,v;x,y) \overset{\text{def}}{=} {} & ([S \to \epsilon \in P] \land x = y) \\
& \lor \exists z_1 \exists z_2 \bigvee_{\substack{U \in V,\ \tau \neq \sigma \\ U \to \tau \in P}} \big( \phi^{R_{S \to U}}_{\text{del } E_\sigma}(u,v;x,z_1,z_2,y) \land E_\tau(z_1,z_2) \big) \\
& \lor \exists z_1 \exists z_2 \bigvee_{\substack{U \in V \\ U \to \sigma \in P}} \big( \phi^{R_{S \to U}}_{\text{del } E_\sigma}(u,v;x,z_1,z_2,y) \land E_\sigma(z_1,z_2) \land (z_1 \neq u \lor z_2 \neq v) \big)
\end{aligned}
\]

It remains to present update formulas for each $R_{X \to Y}$. For simplicity we identify names of variables and elements.

After inserting a $\sigma$-edge $(u,v)$, a tuple $(x_1, y_1, x_2, y_2)$ is contained in $R_{X \to Y}$ if there are two witness paths $\rho_1$ and $\rho_2$ such that (1) $\rho_1$ and $\rho_2$ have already been witnesses before the insertion, or (2) only $\rho_1$ uses the new $\sigma$-edge, or (3) only $\rho_2$ uses the new $\sigma$-edge, or (4) both $\rho_1$ and $\rho_2$ use the new $\sigma$-edge. In case (2) the path $\rho_1$ can be split into a path from $x_1$ to $u$, the edge $(u,v)$ and a path from $v$ to $y_1$. Similarly in the other cases and for $\rho_2$.
Using the formulas from Lemma 11 this can be expressed as follows:

\[
\begin{aligned}
\phi^{R_{X \to Y}}_{\text{ins } E_\sigma}(u,v;x_1,y_1,x_2,y_2) \overset{\text{def}}{=} {} & R_{X \to Y}(x_1,y_1,x_2,y_2) && \text{(1)} \\
& \lor \bigvee_{\substack{U_1, U_2 \in V \\ U_1 \to \sigma \in P \\ U_2 \to \sigma \in P}} \big( \varphi_{X \to U_1, Y}(x_1,u,v,y_1,x_2,y_2) && \text{(2)} \\
& \qquad \lor \varphi_{X \to Y, U_2}(x_1,y_1,x_2,u,v,y_2) && \text{(3)} \\
& \qquad \lor \varphi_{X \to U_1, Y, U_2}(x_1,u,v,y_1,x_2,u,v,y_2) \big) && \text{(4)}
\end{aligned}
\]

After deleting a $\sigma$-edge $(u,v)$ a tuple $(x_1, y_1, x_2, y_2)$ is in $R_{X \to Y}$ if it still has witness paths $\rho_1$ and $\rho_2$ from $x_1$ to $y_1$ and from $x_2$ to $y_2$, respectively. The update formula for $R_{X \to Y}$ verifies that such witness paths exist. Therefore, similar to Example 3, the formula distinguishes for each $i \in \{1, 2\}$ whether (1) there was no path from $x_i$ to $y_i$ via $(u,v)$ before deleting the $\sigma$-edge $(u,v)$, or (2) there was a path from $x_i$ to $y_i$ via $(u,v)$. See Figure 2 for an illustration.

In case (1) all paths present from $x_i$ to $y_i$ before the deletion of the $\sigma$-edge $(u,v)$ are also present after the deletion. In particular the set of possible witnesses $\rho_i$ remains the same. For case (2), the update formula has to check that there is still a witness path $\rho_i$. Such a path $\rho_i$ can either (a) still use the edge $(u,v)$, but with a label $\tau \neq \sigma$, or (b) not use the edge $(u,v)$ at all.

The update formula for $R_{X \to Y}$ is a disjunction over all those cases for the witnesses for $(x_1, y_1)$ and $(x_2, y_2)$. Instead of presenting formulas for all those cases, we explain the idea for two representative cases. All other cases are analogous.

[Figure: nodes $x_1, x_2, z, z', u, v, y_1, y_2$ and the deleted $\sigma$-edge $(u,v)$.]

Figure 2
Illustration of the update of $R_{X \to Y}$ after deletion of $\sigma$-edge $(u,v)$ in the proof of Lemma 11. The nodes $x_1$ and $y_1$ satisfy Condition (1), whereas nodes $x_2$ and $y_2$ satisfy Condition (2).

We first look at the case where $(x_1, y_1)$ satisfies (1), $(x_2, y_2)$ satisfies (2) and there are witness paths $\rho_1$ and $\rho_2$ where $\rho_2$ satisfies (a). The following formula deals with this case:

\[
\begin{aligned}
& (\neg T(x_1,u) \lor \neg T(v,y_1)) \land T(x_2,u) \land T(v,y_2) \\
& \land \bigvee_{\substack{\tau \neq \sigma,\ U \in V \\ U \to \tau \in P}} \big( \varphi_{X \to Y, U}(x_1,y_1,x_2,u,v,y_2) \land E_\tau(u,v) \big)
\end{aligned}
\]

In the first line the premises for this case are checked, in the second line it is verified that $\rho_2$ uses the $\tau$-edge $(u,v)$ for $\sigma \neq \tau$.

Now we consider the case where both $(x_1, y_1)$ as well as $(x_2, y_2)$ satisfy (2), and where there are witness paths $\rho_1$ and $\rho_2$ where $\rho_1$ satisfies (a) and $\rho_2$ satisfies (b). The existence of such a path $\rho_1$ can be verified as above. For verifying the existence of such a path $\rho_2$, a path not using $(u,v)$ has to be found. This is achieved by relying on the same technique as for maintaining reachability for acyclic graphs (see Example 3). The following formula verifies the existence of such $\rho_1$ and $\rho_2$:

\[
\begin{aligned}
& T(x_1,u) \land T(v,y_1) \land T(x_2,u) \land T(v,y_2) \\
& \land \exists z \exists z' \bigvee_{\substack{\tau \neq \sigma,\ U_1, U_2 \in V \\ U_1 \to \tau \in P \\ U_2 \to \tau' \in P}} \Big( \varphi_{X \to U_1, Y, U_2}(x_1,u,v,y_1,x_2,z,z',y_2) \land E_\tau(u,v) \\
& \qquad \land \big( T(x_2,z) \land E_{\tau'}(z,z') \land (z \neq u \lor z' \neq v) \\
& \qquad\quad \land T(z',y_2) \land T(z,u) \land \neg T(z',u) \big) \Big)
\end{aligned}
\]

Again, in the first line the premises for this case are checked. In the second line $z$ and $z'$ are chosen with the purpose of finding an alternative path $\rho_2$ (as in Example 3), and it is verified that $\rho_1$ and $\rho_2$ are witness paths. The third and fourth lines verify that $z$ and $z'$ yield an alternative path. ◀

In this section we explore the maintainability of ECRPQs. In contrast to path queries, ECRPQs allow for testing properties of tuples of paths between pairs of nodes.
Comparing the length of two paths is one of the simplest such properties and is therefore studied first. Afterwards we extend some of the techniques developed for maintaining the lengths of paths to ECRPQs.
Maintaining all distances in arbitrary graphs is one of the big challenges of dynamic complexity. Recall that for maintaining all distances up to $n^c$ a dynamic program has to update, for a graph $G$, a relation $D$ that contains all tuples $(x, y, \ell)$ such that there is a path from $x$ to $y$ of length $\ell$ in $G$, for $0 \le \ell \le n^c$.

The recent dynamic algorithm for maintaining reachability (see [6]) does, unfortunately, not offer hints at how to maintain distances. A dynamic upper bound for distances is provided by Hesse's DynTC$^0$-program for reachability [13]. The program actually maintains the number of different paths of length $\ell$ between every pair of nodes, for any length $\ell$ up to the size of the graph, and thus all distances for all pairs of nodes. The program can be easily modified to compute all distances up to fixed polynomials.

Here we present preliminary results for maintaining all distances with first-order formulas for restricted modifications as well as for restricted classes of graphs. Furthermore we show that distances cannot be maintained with quantifier-free update formulas.

The shortest distance between every pair of nodes can be easily maintained in DynFO when edges can only be inserted; basically because shortest paths do not contain loops. Maintaining all distances for all pairs of nodes under insertions requires some work.
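The easy case mentioned above, shortest distances under insertions, can be sketched as follows. Since a shortest path uses a newly inserted edge $(u,v)$ at most once, the new shortest distance from $x$ to $y$ is the minimum of the old one and the length of a shortest old path to $u$, plus one, plus the length of a shortest old path from $v$. The code is a plain-Python illustration of this update, not a first-order formula; the function names and the dictionary encoding are assumptions made for illustration.

```python
import math


def empty_graph(nodes):
    """Shortest-distance table of the graph without edges."""
    return {(x, y): 0 if x == y else math.inf for x in nodes for y in nodes}


def insert_edge(dist, nodes, u, v):
    """Return the shortest-distance table after inserting the edge (u, v).

    Only old distances are consulted, mirroring an update formula that
    refers to the auxiliary relation before the modification.
    """
    return {(x, y): min(dist[x, y], dist[x, u] + 1 + dist[v, y])
            for x in nodes for y in nodes}


NODES = range(4)
DIST = empty_graph(NODES)
for (u, v) in [(0, 1), (1, 2), (2, 3), (0, 2)]:
    DIST = insert_edge(DIST, NODES, u, v)
```

After the four insertions the shortest distance from node 0 to node 3 is 2, via the edge `(0, 2)`. The same idea does not extend to all distances, because longer paths may use the new edge several times.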
Theorem 12.
All distances up to $p(n)$ can be maintained in DynFO under insertions for every fixed polynomial $p(n)$.

Proof.
We describe how to maintain distances up to $n$; the generalization to distances up to $p(n)$ is straightforward and sketched at the end of the proof. The idea is to maintain a 4-ary relation $A$ that contains a tuple $(x, y, t, \ell)$ if there are $t$ (not necessarily distinct) paths from $x$ to $y$ such that the sum of their lengths is $\ell$.

There is a path of length $\ell$ from node $x$ to node $y$ if and only if $A(x, y, 1, \ell)$ holds. For maintaining this information, we need the full relation: a path from $x$ to $y$ can use a newly inserted edge $(u,v)$ several times if cycles are present. Also, the path can use an arbitrary combination of cycles including that edge, and each cycle can be used arbitrarily often.

When inserting an edge $(u,v)$ the updated relation $A$ is defined by the following formula, where $\ell^1_+$ and $\ell^2_+$ denote the two distinct length variables belonging to the $t_+$ paths:

\[
\begin{aligned}
\phi^{A}_{\text{ins } E}(u,v;x,y,t,\ell) \overset{\text{def}}{=} {} & \exists t_- \exists t_+ \exists t_\circlearrowleft \exists \ell_- \exists \ell^1_+ \exists \ell^2_+ \exists \ell_\circlearrowleft \\
& \Big( A(x,y,t_-,\ell_-) \land A(x,u,t_+,\ell^1_+) \land A(v,y,t_+,\ell^2_+) \land A(v,u,t_\circlearrowleft,\ell_\circlearrowleft) \\
& \quad \land (t_+ = 0 \to t_\circlearrowleft = 0) \land t_- + t_+ = t \\
& \quad \land \ell_- + \ell^1_+ + \ell^2_+ + \ell_\circlearrowleft + t_+ + t_\circlearrowleft = \ell \Big)
\end{aligned}
\]

If there are $t$ paths with total length $\ell$ from $x$ to $y$ after the edge $(u,v)$ is inserted, these paths can be divided into $t_-$ paths that do not use the new edge $(u,v)$, with a total length of $\ell_-$, and $t_+$ paths that use the edge $(u,v)$. Each one of these $t_+$ paths is composed of (i) one path from $x$ to $u$ that does not use $(u,v)$, (ii) the edge $(u,v)$, (iii) possibly some cycles from $v$ back to $v$ created by combining an old path from $v$ to $u$ and the new edge $(u,v)$, and (iv) one path from $v$ to $y$ that does not use $(u,v)$.

Without considering the cycles in $v$ that use $(u,v)$, in total there are $t_+$ paths from $x$ to $u$ (with total length $\ell^1_+$), $t_+$ paths from $v$ to $y$ (with total length $\ell^2_+$) and $t_+$ times the new edge $(u,v)$. So these paths have total length $\ell^1_+ + \ell^2_+ + t_+$.
Additionally, let $t_\circlearrowleft$ be the number of times the edge $(u,v)$ is used in cycles from $v$ to $v$ in all $t_+$ paths together. These cycles can be obtained from $t_\circlearrowleft$ paths from $v$ to $u$ of total length $\ell_\circlearrowleft$ and $t_\circlearrowleft$ times the new edge $(u,v)$. So in total, the $t_+$ paths have a total length of $\ell^1_+ + \ell^2_+ + t_+ + \ell_\circlearrowleft + t_\circlearrowleft$, where $\ell^1_+$ and $\ell^2_+$ are the total lengths of the path segments from $x$ to $u$ and from $v$ to $y$, respectively.

For maintaining distances up to $p(n)$, numbers of this magnitude are encoded by tuples of elements. Arithmetic up to $p(n)$ can be easily defined in a first-order fashion from the built-in arithmetic up to $n$. The above construction then translates in a straightforward way. ◀

Next we show that all distances for all pairs of nodes in undirected and acyclic graphs can be updated using first-order update formulas. For undirected graphs this slightly extends a result by Grädel and Siebertz [12] that the shortest distance can be maintained for undirected paths. For acyclic graphs the maintenance of all distances is a straightforward extension of the dynamic program for maintaining reachability shown in Example 3.
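The insertion update from the proof of Theorem 12 can be checked on small graphs with the following sketch. It stores the relation $A$ as a set of tuples $(x, y, t, \ell)$ with $t, \ell$ restricted to $0, \ldots, \texttt{bound}$ and evaluates the update formula by brute force. The initial relation for the empty graph (zero paths of total length 0 between any two nodes, and any number of empty paths from a node to itself) is an assumption of this sketch, as are all function names; `walk_lengths` is an independent reference computation used only for comparison.

```python
from itertools import product


def initial_relation(nodes, bound):
    """Relation A for the empty graph, with t and l restricted to 0..bound."""
    rel = {(x, y, 0, 0) for x in nodes for y in nodes}                 # zero paths
    rel |= {(x, x, t, 0) for x in nodes for t in range(1, bound + 1)}  # empty paths
    return rel


def holds_after_insert(rel, u, v, x, y, t, l):
    """Evaluate the update formula for the tuple (x, y, t, l)."""
    for t_minus in range(t + 1):
        t_plus = t - t_minus
        # Cycles through (u, v) are only allowed if some path uses the edge.
        loop_counts = range(l + 1) if t_plus > 0 else range(1)
        for l_minus in range(l + 1):
            if (x, y, t_minus, l_minus) not in rel:
                continue
            for l_in in range(l - l_minus + 1):
                if (x, u, t_plus, l_in) not in rel:
                    continue
                for l_out in range(l - l_minus - l_in + 1):
                    if (v, y, t_plus, l_out) not in rel:
                        continue
                    for t_loop in loop_counts:
                        l_loop = l - l_minus - l_in - l_out - t_plus - t_loop
                        if l_loop >= 0 and (v, u, t_loop, l_loop) in rel:
                            return True
    return False


def insert_edge(rel, nodes, bound, u, v):
    """Return the relation A after inserting the edge (u, v)."""
    return {(x, y, t, l)
            for x, y in product(nodes, repeat=2)
            for t, l in product(range(bound + 1), repeat=2)
            if holds_after_insert(rel, u, v, x, y, t, l)}


def walk_lengths(nodes, edges, bound):
    """Reference: all (x, y, l) with a path of length l <= bound from x to y."""
    layer = {(x, x) for x in nodes}
    result = {(x, y, 0) for (x, y) in layer}
    for l in range(1, bound + 1):
        layer = {(x, z) for (x, y) in layer for (y2, z) in edges if y2 == y}
        result |= {(x, y, l) for (x, y) in layer}
    return result


NODES, BOUND = range(4), 5
EDGES = [(0, 1), (1, 2), (2, 0), (1, 3)]
A = initial_relation(NODES, BOUND)
for (u, v) in EDGES:
    A = insert_edge(A, NODES, BOUND, u, v)
```

In the example, the cycle through nodes 0, 1, 2 makes paths from 0 to 3 of lengths 2 and 5 available within the bound, which is exactly the situation where the components $t_\circlearrowleft$ and $\ell_\circlearrowleft$ of the formula are needed.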
Theorem 13.
All distances up to $p(n)$ can be maintained in DynFO for every fixed polynomial $p(n)$ for (a) undirected graphs, and (b) acyclic graphs.

Proof.
Again we describe how to maintain distances up to $n$ only; the generalization to distances up to $p(n)$ is straightforward.

(a) We use the simple observation that if two nodes in an undirected graph $G$ are connected by a path of length $m > 0$, then they are connected by a path of length $m + 2$ as well, since any edge of the path can be traversed repeatedly. A consequence is that all possible distances between two nodes $x$ and $y$ in an undirected graph can be easily determined if the shortest lengths $d_o$ and $d_e$ of paths of odd and even length between $x$ and $y$ are known: there is a path of length $m$ if $m$ is odd and $m \ge d_o$ or if $m$ is even and $m \ge d_e$ (and if $x = y$, then $x$ is not an isolated node). Thus in order to maintain whether two nodes $x$ and $y$ are connected by a path of length $m$, it suffices (1) to maintain $d_o$ and $d_e$ and (2) to know whether $m$ is even or odd.

The second part is easy since arithmetic is available. Maintaining $d_o$ and $d_e$ can be done by maintaining the shortest distances of pairs of nodes in the graph $G \times K_2$, where $K_2$ is the complete graph on nodes $\{1, 2\}$. The shortest distance between $(u, 1)$ and $(v, 1)$ in $G \times K_2$ equals the length of the shortest even path from $u$ to $v$ in $G$, whereas the distance between $(u, 1)$ and $(v, 2)$ is equal to the length of the shortest odd one. Observe that an edge modification in $G$ results in only two modifications in $G \times K_2$. Since shortest distances in undirected graphs can be maintained in DynFO [12], the result follows.

(b) This is a simple adaptation of the maintenance procedure for the transitive closure of acyclic graphs (see Example 3). In addition to the transitive closure relation $T$, the dynamic program for distances in acyclic graphs maintains a ternary relation $D$ that contains a tuple $(x, y, \ell)$ if and only if there is a path from $x$ to $y$ of length $\ell$. The update formulas from Example 3 can be adapted easily by using the built-in arithmetic.

\[ \phi^{D}_{\text{ins } E}(u,v;x,y,\ell) \overset{\text{def}}{=} D(x,y,\ell) \lor \exists d_1 \exists d_2 \big( d_1 + d_2 + 1 = \ell \land D(x,u,d_1) \land D(v,y,d_2) \big) \]

\[
\begin{aligned}
\phi^{D}_{\text{del } E}(u,v;x,y,\ell) \overset{\text{def}}{=} {} & T(x,y) \land \Big( \big( (\neg T(x,u) \lor \neg T(v,y)) \land D(x,y,\ell) \big) \\
& \lor \exists z \exists z' \exists d_1 \exists d_2 \big( d_1 + d_2 + 1 = \ell \land D(x,z,d_1) \land E(z,z') \land (z \neq u \lor z' \neq v) \\
& \qquad \land D(z',y,d_2) \land T(z,u) \land \neg T(z',u) \big) \Big)
\end{aligned}
\]

◀

In the rest of this subsection we discuss why distance information cannot be maintained by quantifier-free update formulas. So far the goal, when maintaining distances, was to store tuples $(a, b, \ell)$ in some relation if there is a path from $a$ to $b$ of length $\ell$, where the length $\ell$ referred to the built-in arithmetic. It can be easily seen that maintaining distances in this fashion is not possible with quantifier-free formulas (basically because a quantifier-free formula only has access to the numbers represented by the modified nodes).

Another way of maintaining distance information is to store a 4-ary relation that contains a tuple $(a_1, a_2, b_1, b_2)$ if and only if there are paths from $a_1$ to $a_2$ and from $b_1$ to $b_2$ of equal length.
We show that this relation cannot be maintained by quantifier-free programs. Denote by Equal-Length-Paths the query on (unlabeled) graphs that selects all tuples $(a_1, a_2, b_1, b_2)$ such that there are paths from $a_1$ to $a_2$ and from $b_1$ to $b_2$ of equal length.

Theorem 14.
The query
Equal-Length-Paths cannot be maintained by quantifier-free update formulas, even when the auxiliary relations can be initialized arbitrarily. In particular, ECRPQs and reachability in product graphs cannot be maintained in this setting either.
Intuitively this is not very surprising. It is well known that non-regular languages and therefore, in particular, the language $\{a^n b^n \mid n \in \mathbb{N}\}$ cannot be maintained by a quantifier-free program [11]. Thus maintaining whether two isolated paths have the same length should not be possible either. Technical issues arise from the fact that the query Equal-Length-Paths is over graphs, not strings. Yet the techniques used for proving lower bounds for languages can be adapted.

We employ the following Substructure Lemma from [25, Lemma 4.1] which is a slight variation of Lemma 1 from [11].

The intuition of the Substructure Lemma is as follows. When updating an auxiliary tuple $\vec{c}$ after an insertion or deletion of a tuple $\vec{d}$, a quantifier-free update formula has access to $\vec{c}$, $\vec{d}$, and the constants only. Thus if a sequence of modifications changes only tuples from a substructure $\mathcal{A}$ of $\mathcal{S}$, then the auxiliary data of $\mathcal{A}$ is not affected by information outside $\mathcal{A}$. In particular, two isomorphic substructures $\mathcal{A}$ and $\mathcal{B}$ remain isomorphic when corresponding modifications are applied to them.

The notion of corresponding modifications is formalized as follows. Let $\pi$ be an isomorphism from a structure $\mathcal{A}$ to a structure $\mathcal{B}$. Two modifications $\delta(\vec{a})$ on $\mathcal{A}$ and $\delta'(\vec{b})$ on $\mathcal{B}$ are said to be $\pi$-respecting if $\delta = \delta'$ and $\vec{b} = \pi(\vec{a})$. Two sequences $\alpha = \delta_1 \cdots \delta_m$ and $\beta = \delta'_1 \cdots \delta'_m$ of modifications respect $\pi$ if $\delta_i$ and $\delta'_i$ are $\pi$-respecting for every $i \le m$. Recall that $\mathcal{P}_\alpha(\mathcal{S})$ denotes the state obtained by executing the dynamic program $\mathcal{P}$ for the modification sequence $\alpha$ from state $\mathcal{S}$.

Lemma 15 (Substructure Lemma [11]). Let $\mathcal{P}$ be a DynProp-program and let $\mathcal{S}$ and $\mathcal{T}$ be states of $\mathcal{P}$ with domains $S$ and $T$. Further let $A \subseteq S$ and $B \subseteq T$ such that $\mathcal{S} \restriction A$ and $\mathcal{T} \restriction B$ are isomorphic via $\pi$. Then $\mathcal{P}_\alpha(\mathcal{S}) \restriction A$ and $\mathcal{P}_\beta(\mathcal{T}) \restriction B$ are isomorphic via $\pi$ for all $\pi$-respecting modification sequences $\alpha$, $\beta$ on $A$ and $B$.

Proof (of Theorem 14).
Towards a contradiction, assume that $\mathcal{P} = (P, \text{Init}, Q)$ is a dynamic program over schema $\tau = (\tau_{\text{in}}, \tau_{\text{aux}})$ that maintains the query Equal-Length-Paths in its designated query relation $Q$. Let $n$ be sufficiently large with respect to $\tau$ and $n'$ be sufficiently large with respect to $n$. Further let $m$ be the highest arity of a relation symbol from $\tau_{\text{aux}}$.

Let $G = (V, E)$ be the empty graph with $|V| = n'$ and let $\mathcal{S} = (V, E, \mathcal{A})$ be the state obtained by applying the initialization mapping of $\mathcal{P}$ to $G$.

By Ramsey's Theorem for structures (see, e.g., [25, Theorem 4.3]) and because $n' = |V|$ is sufficiently large with respect to $n$ there is a set $V' \subseteq V$ of size $2n$ and an order $\prec$ on $V'$ such that all $\prec$-ordered $m$-tuples over $V'$ are of equal atomic $\tau_{\text{aux}}$-type. Let us assume that $V' = A \cup B$ with $A = \{a_1, \ldots, a_n\}$ and $B = \{b_1, \ldots, b_n\}$, and that $a_1 \prec \ldots \prec a_n \prec b_1 \prec \ldots \prec b_n$.

Let $\mathcal{S}' \overset{\text{def}}{=} (V, E', \mathcal{A}')$ be the state of $\mathcal{P}$ that is reached from $\mathcal{S}$ after inserting the edges $(a_1, a_2), (a_2, a_3), \ldots, (a_{n-1}, a_n)$.

Our goal is to find $i_1, i_2, i_3$ with $i_1 < i_2 < i_3$ such that the substructures $\mathcal{S}' \restriction \{a_{i_1}, a_{i_2}, b_1, \ldots, b_n\}$ and $\mathcal{S}' \restriction \{a_{i_1}, a_{i_3}, b_1, \ldots, b_n\}$ are isomorphic. Then, in the state $\mathcal{S}''$ obtained from $\mathcal{S}'$ by inserting the edges $(b_1, b_2), (b_2, b_3), \ldots, (b_{i_2 - i_1}, b_{i_2 - i_1 + 1})$, which form a path of length $i_2 - i_1$ starting in $b_1$, the tuples $(a_{i_1}, a_{i_2}, b_1, b_{i_2 - i_1 + 1})$ and $(a_{i_1}, a_{i_3}, b_1, b_{i_2 - i_1 + 1})$ will either be both in $Q$ or both not in $Q$ (due to the Substructure Lemma). However, there is a path of length $i_2 - i_1$ from $a_{i_1}$ to $a_{i_2}$ but not from $a_{i_1}$ to $a_{i_3}$, a contradiction.

It remains to exhibit such $i_1$, $i_2$ and $i_3$. To this end observe that for all $m$-ary tuples $\vec{b}$ and $\vec{b}'$, the tuples $(a_i, a_j, \vec{b})$ and $(a_i, a_j, \vec{b}')$ have the same atomic type due to the Substructure Lemma. Furthermore, by Ramsey's Theorem for structures, one can find $i_1, i_2, i_3$ such that $(a_{i_1}, a_{i_2}, \vec{b})$ and $(a_{i_1}, a_{i_3}, \vec{b})$ have the same atomic type. But then $\mathcal{S}' \restriction \{a_{i_1}, a_{i_2}, b_1, \ldots, b_n\} \simeq \mathcal{S}' \restriction \{a_{i_1}, a_{i_3}, b_1, \ldots, b_n\}$ via the isomorphism that maps $a_{i_1}$ and each $b_i$ to itself and $a_{i_2}$ to $a_{i_3}$. ◀

Here we study the maintenance of ECRPQs and provide results in restricted settings. First we show that answers to an ECRPQ can be maintained in DynFO on acyclic graphs. Even more, answers to the following extension of ECRPQs introduced in [4] can still be maintained. An ECRPQ with linear constraints on the number of occurrences of symbols on paths over an alphabet $\Sigma = \{\sigma_1, \ldots, \sigma_k\}$ is of the form

\[ Q(\vec{z}) \longleftarrow \bigwedge_{1 \le i \le m} (x_i, \pi_i, y_i),\ \bigwedge_{1 \le j \le t} R_j(\vec{\omega}_j),\ A\vec{\ell} \ge \vec{b} \]

where $A \in \mathbb{Z}^{h \times (km)}$ for some $h \in \mathbb{N}$, $\vec{b} \in \mathbb{Z}^h$, and $\vec{\ell} = (\ell_{1,1}, \ldots, \ell_{1,k}, \ldots, \ell_{m,1}, \ldots, \ell_{m,k})$. The semantics extends the semantics of ECRPQs as follows: for each $1 \le i \le m$ and $1 \le j \le k$, the variable $\ell_{i,j}$ is interpreted as the number of occurrences of the symbol $\sigma_j$ in the path $\pi_i$. The last clause of the query $Q$ is true if $A\vec{\ell} \ge \vec{b}$ under this interpretation.

Theorem 16.
Every ECRPQ with linear constraints on the number of occurrences of symbols is maintainable in
DynFO on acyclic graphs.
Proof.
Let $\Sigma = \{\sigma_1, \ldots, \sigma_k\}$. We show how to maintain the answer of an ECRPQ $Q$ with linear constraints with only one regular relation $R$ on an acyclic $\Sigma$-labeled graph $G = (V, E)$. Thus $Q$ is of the form:

\[ Q(\vec{z}) \longleftarrow \bigwedge_{1 \le i \le m} (x_i, \pi_i, y_i),\ R(\pi_1, \ldots, \pi_m),\ A\vec{\ell} \ge \vec{b} \]

An arbitrary ECRPQ with linear constraints can be rewritten in this form by using closure properties of regular relations.

In a first step we reduce this problem to a structurally simpler one: the problem of maintaining $Q$ on a $\Sigma$-labeled graph consisting of $m$ disjoint acyclic graphs $G_1, \ldots, G_m$, restricted in such a way that solutions may only map the variables $x_i, y_i$ to nodes in $G_i$, for each $1 \le i \le m$. The simple reduction from the original problem copies the queried graph $m$ times. As $m$ is a constant, this is a bounded first-order reduction.

Let $\mathcal{A} = (Q, (\Sigma \cup \{\bot\})^m, \delta, s, F)$ be a finite automaton with padding symbol $\bot \notin \Sigma$ that recognizes the $m$-ary regular relation $R$. The idea is to maintain $(2m + km)$-ary auxiliary relations $R_{p,q}$ for all $p, q \in Q$ intended to store a tuple $(\vec{x}, \vec{y}, \vec{\ell}_1, \ldots, \vec{\ell}_m)$ with $\vec{x} = (x_1, \ldots, x_m)$, $\vec{y} = (y_1, \ldots, y_m)$ and $\vec{\ell}_i = (\ell_{i,1}, \ldots, \ell_{i,k})$ if and only if the state $q$ is reachable from the state $p$ in $\mathcal{A}$ by reading a tuple of words $(\lambda(\rho_1), \ldots, \lambda(\rho_m))$, where for each $1 \le i \le m$, $\rho_i$ is a path in $G_i$ from $x_i$ to $y_i$, and $\ell_{i,1}, \ldots, \ell_{i,k}$ are the numbers of occurrences of the symbols $\sigma_1, \ldots, \sigma_k$ in the label sequence of $\rho_i$.

We show how to express the query relation $Q$ by these relations. To this end observe that the (fixed) linear inequality system $A\vec{\ell} \ge \vec{b}$ can be defined by an $(m \cdot k)$-ary first-order formula $\psi_{A,\vec{b}}(\vec{\ell}_1, \ldots, \vec{\ell}_m)$ that uses the built-in arithmetic.

The query relation $Q$ is then defined by the following formula:

\[ \varphi(\vec{z}) \overset{\text{def}}{=} \exists \vec{v}\, \exists \vec{\ell}_1 \cdots \exists \vec{\ell}_m \bigvee_{f \in F} R_{s,f}(\vec{x}, \vec{y}, \vec{\ell}_1, \ldots, \vec{\ell}_m) \land \psi_{A,\vec{b}}(\vec{\ell}_1, \ldots, \vec{\ell}_m) \]

Here the existentially quantified variables $\vec{v}$ correspond to variables of $Q$ that do not occur in the head of the query, and all $x_i$ and $y_i$ occur in either $\vec{z}$ or $\vec{v}$.

The update formulas for the relations $R_{p,q}$ are similar in spirit to those for reachability in acyclic graphs used in Example 3. Suppose an edge $(u, \sigma, v)$ is inserted into the graph $G_i$ for some $i \in \{1, \ldots, m\}$. The update formulas compose runs of $\mathcal{A}$ from the runs stored in the relations $R_{p,q}$ as follows. For all states $p, q \in Q$, a tuple $(\vec{x}, \vec{y}, \vec{\ell}_1, \ldots, \vec{\ell}_k)$ shall be in $R_{p,q}$ after the insertion if and only if it was in $R_{p,q}$ before the insertion or if the following conditions are satisfied:

(a) There is a state $p' \in Q$, a tuple of nodes $\vec{x}' = (x'_1, \ldots, x'_m)$ with $x'_i = u$, and vectors $\vec{a}_1, \ldots, \vec{a}_k \in \mathbb{N}^m$, such that $(\vec{x}, \vec{x}', \vec{a}_1, \ldots, \vec{a}_k) \in R_{p,p'}$.
(b) There is a state $q' \in Q$, a tuple of nodes $\vec{y}' = (y'_1, \ldots, y'_m)$ with $y'_i = v$, and vectors $\vec{b}_1, \ldots, \vec{b}_k \in \mathbb{N}^m$, such that $(\vec{y}', \vec{y}, \vec{b}_1, \ldots, \vec{b}_k) \in R_{q',q}$.
(c) There is a tuple of symbols $\vec{s} \in (\Sigma \cup \{\bot\})^m$ such that
  (i) $s_i = \sigma$,
  (ii) $s_j = \bot$ for each $j \neq i$ with $x'_j = y'_j$, and
  (iii) there is an edge $(x'_j, s_j, y'_j) \in E_j$ for each $j \neq i$ with $x'_j \neq y'_j$,
  and $\mathcal{A}$ has a transition from $p'$ to $q'$ by reading $\vec{s}$.
(d) $\vec{\ell}_j = \vec{a}_j + \vec{b}_j + \vec{c}_{\sigma_j}$ for each $j \in \{1, \ldots, k\}$, where $\vec{c}_{\sigma_j} \in \{0, 1\}^m$ is the vector whose $r$th component is 1 if the $r$th component of $\vec{s}$ is $\sigma_j$, and 0 otherwise.
(e) $A\vec{\ell} \ge \vec{b}$, where $\vec{\ell}$ is the concatenation of $\vec{\ell}_1, \ldots, \vec{\ell}_k$.

The conditions (a)-(c) can be easily expressed by first-order formulas using existential quantification. The conditions (d)-(e) can be expressed by using built-in arithmetic: since the graphs $G_1, \ldots, G_m$ are acyclic, it is easy to see that numbers used in those conditions are polynomial in the size of the active domain. We can therefore build the needed arithmetic
We can therefore build the needed arithmeticincrementally by Proposition 4.Deletions can be handled along the same lines by using the technique from Example 3. (cid:74) It remains open whether the answer relation of ECRPQs can be maintained on generalgraphs, even when only insertions are allowed. Yet when the rational relations are restrictedto be unary, the ECRPQs can be maintained under insertions. More formally, a
CRPQ withlinear constraints on the number of occurrences of symbols over Σ = { σ , . . . , σ k } is of theform Q ( ~z ) ←− ^ ≤ i ≤ m ( x i , π i , y i ) , ^ ≤ j ≤ m L j ( π j ) , A~‘ ≥ ~b where L j is a unary rational relation (that is, a regular language), and A , ~b and ~‘ are as inthe definition of ECRPQs with linear constraints. (cid:73) Theorem 17.
Every CRPQ with linear constraints on the number of occurrences of symbols is maintainable in
DynFO under insertions.
Proof.
Let $\Sigma = \{\sigma_1, \ldots, \sigma_k\}$ and $Q$ be a CRPQ over $\Sigma$ with linear constraints on the number of occurrences of symbols as above. Further let $\mathcal{A}_j = (Q_j, \Sigma, \delta_j, s_j, F_j)$, $1 \le j \le m$, be finite state automata for the regular languages $L_j$ occurring in $Q$.

We exhibit a DynFO-program with built-in arithmetic for maintaining $Q$ on general graphs under insertions. The necessity for built-in arithmetic can be removed by Proposition 4.

The idea is similar to the proof of the previous Theorem 16. We maintain $(k+2)$-ary auxiliary relations $R^j_{p,q}$ for each $j \in \{1, \ldots, m\}$ and all $p, q \in Q_j$ with the intention that $R^j_{p,q}$ stores a tuple $(x, y, \ell_1, \ldots, \ell_k)$ if and only if the state $q$ is reachable from state $p$ in the automaton $\mathcal{A}_j$ by reading the label of a path $\rho$ between $x$ and $y$ in $G$ such that $\ell_1, \ldots, \ell_k$ are the numbers of occurrences of $\sigma_1, \ldots, \sigma_k$ in $\rho$.

Before sketching how to maintain the relations $R^j_{p,q}$, we show how they can be used to express the answer of $Q$. As in the proof of Theorem 16 the (fixed) linear inequality system $A\vec{\ell} \ge \vec{b}$ can be defined by an $(m \cdot k)$-ary first-order formula $\psi_{A,\vec{b}}(\ell_{1,1}, \ldots, \ell_{m,k})$ that uses the built-in arithmetic. Then a tuple $\vec{u}$ of nodes in $G$ is in the answer of $Q$ if and only if the following formula holds:

\[ \varphi(\vec{z}) \overset{\text{def}}{=} \exists \vec{v}\, \exists \ell_{1,1}, \ldots, \ell_{m,k} \bigwedge_{1 \le j \le m} \bigvee_{f \in F_j} R^j_{s_j,f}(x_j, y_j, \ell_{j,1}, \ldots, \ell_{j,k}) \land \psi_{A,\vec{b}}(\ell_{1,1}, \ldots, \ell_{m,k}) \]

Here the existentially quantified variables $\vec{v}$ correspond to variables of $Q$ that do not occur in the head of the query, and all $x_j$ and $y_j$ occur in either $\vec{v}$ or $\vec{z}$.

A small technical issue arises from the fact that it is not obvious why the length of paths $\rho_1, \ldots, \rho_m$ witnessing that a tuple of nodes $\vec{u}$ is in the answer of $Q$ is polynomially bounded. This, however, is necessary for being able to quantify the lengths $\ell_{1,1}, \ldots, \ell_{m,k}$ and to use the built-in arithmetic for computations.
Fortunately the length of (shortest) witness paths can be bounded by a fixed polynomial in the size of the active domain. This has been shown even for ECRPQs with such linear constraints in [4, Lemma 8.6].

Now we show how to maintain the relations $R^j_{p,q}$. The following notion is useful. A relation $R$ stores the Parikh distances of a $\Sigma$-labeled graph if it contains a tuple $(x, y, \ell_1, \ldots, \ell_k)$ if and only if there is a path $\rho$ between $x$ and $y$ such that its label $\lambda(\rho)$ contains $\ell_i$ occurrences of the symbol $\sigma_i$ for each $1 \le i \le k$. We observe that the relations $R^j_{p,q}$ can be defined from the Parikh distance relations of the product graphs $G \times \mathcal{A}_j$. Since the automata $\mathcal{A}_j$ are fixed, a modification of $G$ yields a bounded number of first-order definable modifications to $G \times \mathcal{A}_j$.

Thus in order to maintain $R^j_{p,q}$, it suffices to be able to maintain the Parikh distance relation of a $\Sigma$-labeled graph under insertions. The dynamic program for maintaining distances under insertions from Theorem 12 can be easily generalized to maintain Parikh distances. For the sake of completeness we present the general construction. The goal is to maintain an auxiliary relation $S$ intended to store a tuple $(x, y, t, \vec{\ell})$ with $\vec{\ell} \overset{\text{def}}{=} (\ell_1, \ldots, \ell_k)$ if there are (not necessarily distinct) paths $\rho_1, \ldots, \rho_t$ from $x$ to $y$ in $G$ such that each symbol $\sigma_i \in \Sigma$ appears exactly $\ell_i$ times among all paths $\rho_1, \ldots, \rho_t$. The update formula for $S$ after inserting an edge $(u, \sigma_i, v)$ is as follows, where $\vec{\ell}^1_+$ and $\vec{\ell}^2_+$ denote the two distinct tuples belonging to the $t_+$ paths:

\[
\begin{aligned}
\phi^{S}_{\text{ins } E_{\sigma_i}}(u,v;x,y,t,\vec{\ell}) \overset{\text{def}}{=} {} & \exists t_- \exists t_+ \exists t_\circlearrowleft \exists \vec{\ell}_- \exists \vec{\ell}^1_+ \exists \vec{\ell}^2_+ \exists \vec{\ell}_\circlearrowleft \\
& \Big( S(x,y,t_-,\vec{\ell}_-) \land S(x,u,t_+,\vec{\ell}^1_+) \land S(v,y,t_+,\vec{\ell}^2_+) \land S(v,u,t_\circlearrowleft,\vec{\ell}_\circlearrowleft) \\
& \quad \land (t_+ = 0 \to t_\circlearrowleft = 0) \land t_- + t_+ = t \\
& \quad \land \vec{\ell}_- + \vec{\ell}^1_+ + \vec{\ell}^2_+ + \vec{\ell}_\circlearrowleft + (t_+ + t_\circlearrowleft)\,\vec{e}_i = \vec{\ell} \Big)
\end{aligned}
\]

Here, for clarity we quantify $k$-ary tuples of variables.
The tuple $\vec e_i$ contains zeroes except for its $i$-th component, which is 1.
The correctness of this update formula follows immediately from the proof of Theorem 12.
We remark that already Boolean ECRPQs cannot be maintained under insertions in DynProp due to lower bounds for non-regular languages [11], and Boolean CRPQs with $k + 2$ existentially quantified node variables cannot be maintained in DynProp with $k$-ary relations due to a lower bound for the $k$-clique query [24].
In this final section we study the reachability query for product graphs. In addition to its importance for the evaluation of fixed graph queries, reachability in graph products can be used to maintain the result of regular path queries in combined complexity (i.e., when the query is subject to modifications as well). Furthermore it is relevant in model checking, where subsystems correspond to factors in product graphs (see, e.g., [3]).
The results for maintaining all distances obtained in the previous section immediately transfer to reachability in simple graph products (see the discussion at the end of Section 2). A small technical obstacle arises from the fact that the reachability query does not come with built-in arithmetic, while the distance query studied so far does. However, this is not a problem due to Proposition 4.
Theorem 18. Let $\mathcal{G}$ be a class of graphs and $m \in \mathbb{N}$. If all distances up to $n^m$ on $\mathcal{G}$ can be maintained in DynFO with built-in arithmetic, then reachability in the product of $m$ $\mathcal{G}$-graphs is maintainable in DynFO (without built-in arithmetic).
Proof. For DynFO with arithmetic this follows from Fact 5. As reachability is a domain independent query, the result follows from Proposition 4.
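The idea behind Theorem 18 can be illustrated by a small static sketch in Python. It is not the dynamic program: walk lengths are recomputed from scratch, and the function names (`walk_lengths`, `product_reachable`, `product_reachable_bfs`) as well as the edge-list representation of graphs are assumptions made only for this illustration. The sketch shows that a node tuple is reachable in the completely synchronized product exactly if some common length $d$ is the length of a walk in every factor, which is the information a distance relation provides.

```python
from collections import deque
from itertools import product


def walk_lengths(edges, source, target, bound):
    """Set of d <= bound such that a walk of length d leads from source to target."""
    current, lengths = {source}, set()
    for d in range(bound + 1):
        if target in current:
            lengths.add(d)
        # nodes reachable by walks of length d + 1
        current = {v for (u, v) in edges if u in current}
    return lengths


def product_reachable(factors, xs, ys, bound):
    """Reachability in the synchronized product via a common walk length."""
    common = set(range(bound + 1))
    for edges, x, y in zip(factors, xs, ys):
        common &= walk_lengths(edges, x, y, bound)
    return bool(common)


def product_reachable_bfs(factors, xs, ys):
    """Reference check: explicit breadth-first search on the product graph."""
    start, goal = tuple(xs), tuple(ys)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        # every factor has to take one step simultaneously
        successors = [[v for (u, v) in edges if u == a]
                      for edges, a in zip(factors, node)]
        for nxt in product(*successors):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For the search to be complete, `bound` has to be at least the length of a shortest path in the product, for instance $n^m$ for $m$ factors with at most $n$ nodes each.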
Shortest paths in products of acyclic and undirected graphs are of length at most n and n, respectively. For these two classes of graphs, reachability can therefore be maintained in products of polynomially many factors using the program for all distances. More precisely, this is doable for reachability between two specified nodes $\vec s$ and $\vec t$ as opposed to all pairs of nodes (as there are exponentially many nodes in such product graphs).
For directed graphs, shortest paths in products of polynomially many graphs can be of exponential length. For this reason, the approach to maintain reachability in such products via distances fails. Even more, it is unlikely that there is a DynFO-program for this problem: such a program could be used to decide reachability in the product of polynomially many graphs in PTime, although this problem is NP-hard. The hardness follows from a reduction from emptiness of intersections of unary regular expressions, which is known to be NP-hard [10].
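A standard way to see the exponential blow-up is to take, as an illustrative assumption not spelled out in the text, directed cycles whose lengths are distinct primes. Starting from $(0, \ldots, 0)$, the product reaches $(t_1, \ldots, t_m)$ after exactly those numbers of steps $d$ with $d \equiv t_i \pmod{p_i}$ for every cycle length $p_i$. The hypothetical helper below finds the smallest such $d$ by brute force.

```python
from math import prod


def shortest_product_path(cycle_lengths, targets):
    """Smallest d with d % p == t for every cycle length p and target t, or None.

    Models the synchronized product of directed cycles C_p with nodes 0..p-1
    and edges i -> (i + 1) % p, started in (0, ..., 0).
    """
    # the set of solutions is periodic with a period dividing the product
    for d in range(prod(cycle_lengths)):
        if all(d % p == t for p, t in zip(cycle_lengths, targets)):
            return d
    return None
```

For the cycle lengths 2, 3, 5 and 7 the factors have 17 nodes in total, but the node $(1, 2, 4, 6)$ is only reached after $2 \cdot 3 \cdot 5 \cdot 7 - 1 = 209$ steps. With more primes the gap between the total size of the factors and the path length grows exponentially.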
Corollary 19. Reachability can be maintained in DynFO in the product of
(a) polynomially many undirected graphs,
(b) polynomially many acyclic graphs, and
(c) a constant number of directed graphs under insertions.
This follows immediately from Theorem 18, Theorem 13 and Theorem 12. Reachability in products of an undirected and an acyclic graph and similar constellations can, of course, also be maintained.
For labeled graph products, the following corollary follows immediately from the proof of Theorem 16.
Corollary 20. Reachability in products of constantly many acyclic $\Sigma$-labeled graphs can be maintained in DynFO.
In the following we generalize Corollary 19 to a broader class of graph products. In the product graphs considered so far, there is an edge from a node $(x_1, \ldots, x_m)$ to a node $(y_1, \ldots, y_m)$ if there is an edge $(x_i, y_i)$ in every factor $G_i$. This can be seen as a completely synchronized traversal through the given graphs. The graph products to be introduced next allow for more flexible, partially synchronized traversals.
Let $(G_i)_{1 \le i \le m}$ be a sequence of graphs with $G_i \overset{\text{def}}{=} (V_i, E_i)$, and let $\mathcal{A} \overset{\text{def}}{=} (\vec a_1, \ldots, \vec a_k)$ be a list of tuples from $\{0, 1\}^m$, called transition rules. We often identify $\mathcal{A}$ with the matrix that has the tuples $\vec a_i$ as columns. The generalized graph product of $(G_i)_i$ with respect to $\mathcal{A}$, denoted $\prod^{\mathcal{A}}_i G_i$, has nodes $V_1 \times \cdots \times V_m$ and edges $(\vec x, \vec y)$ defined by the first-order formula
$$\bigvee_{\substack{\vec a \in \mathcal{A} \\ \vec a = (a_1, \ldots, a_m)}} \Big( \bigwedge_{a_i = 0} x_i = y_i \wedge \bigwedge_{a_i = 1} E_i(x_i, y_i) \Big)$$
For example, the usual product of two graphs is defined by the transition rule $\{(1, 1)\}$, and the so-called Cartesian product is defined by the rules $\{(1, 0), (0, 1)\}$. We remark that generalized graph products have also been called non-complete extended p-sums, short: NEPS (see, for example, [19]).
Theorem 21. Reachability in generalized product graphs is maintainable in DynFO under modifications to factors and transition rules for
(a) a constant number of directed graphs under insertions and a constant number of transition rules,
(b) polynomially many acyclic graphs and a constant number of transition rules,
(c) polynomially many undirected graphs and polynomially many transition rules.
Proof sketch.
As usual we assume built-in arithmetic, which can be removed by Proposition 4. Here we permit single-bit modifications to $\mathcal{A}$, that is, modifying one bit of a transition rule at a time.
For (a) and (b), the key observation is that reachability in generalized graph products can be reduced to finding a solution in natural numbers to a linear equation system. Let $(G_i)_{1 \le i \le m}$ be a list of graphs, $\vec x = (x_1, \ldots, x_m)$ and $\vec y = (y_1, \ldots, y_m)$ nodes of $\prod^{\mathcal{A}}_i G_i$, and let
$$D \overset{\text{def}}{=} \{ \vec d = (d_1, \ldots, d_m) \mid \text{there is a path from } x_i \text{ to } y_i \text{ in } G_i \text{ of length } d_i, \text{ for each } 1 \le i \le m \}.$$
Then there is a path from $\vec x$ to $\vec y$ in $\prod^{\mathcal{A}}_i G_i$ if and only if there is a tuple $\vec d \in D$ and $n_1, \ldots, n_k \in \mathbb{N}$ such that $n_1 \vec a_1 + \ldots + n_k \vec a_k = \vec d$. A shortest path witnessing that two tuples $\vec x$ and $\vec y$ are connected in a generalized product of constantly many directed graphs (or of polynomially many acyclic graphs) can be of at most polynomial length. In particular, we can restrict the numbers $n_1, \ldots, n_k \in \mathbb{N}$ to be of polynomial size.
The dynamic program for maintaining reachability in those graph products works as follows. It maintains all distances for each of the factors. Upon modification of a graph $G_i$, the program updates all distances for $G_i$ (using the program for maintaining all distances). Then it guesses $n_1, \ldots, n_k$ by using existential quantification, computes $\vec d \overset{\text{def}}{=} n_1 \vec a_1 + \ldots + n_k \vec a_k$, and checks for each component $d_i$ of $\vec d$ that in $G_i$ there is a path from $x_i$ to $y_i$ of length $d_i$. Modifications of the transition rules are handled in a similar way.
For (c) we rely on the following fact, which is a consequence of the proof of Theorem 2 in [19].
Fact. Let $G \overset{\text{def}}{=} \prod^{\mathcal{A}}_i G_i$ be the generalized product of the undirected graphs $(G_i)_{1 \le i \le m}$ with respect to a list $\mathcal{A}$ of $k$ transition rules. Let $\vec x \overset{\text{def}}{=} (x_1, \ldots, x_m)$ and $\vec y = (y_1, \ldots, y_m)$ be two nodes of $G$, let $C_i$ be the connected component of $x_i$ in $G_i$ and assume that $C_{i_1}, \ldots, C_{i_\ell}$ are the only bipartite components. Then there is a path from $\vec x$ to $\vec y$ in $G$ if and only if
for each $i \in \{1, \ldots, m\}$ there is a path from $x_i$ to $y_i$ in $G_i$, and
the linear equation system $B\vec x = \vec d$ is solvable over $\mathbb{Z}_2$,
where $B$ is obtained from $\mathcal{A}$ by setting rows $r \notin \{i_1, \ldots, i_\ell\}$ to zero, and the $r$-th component of $\vec d \in \mathbb{Z}_2^{\ell}$ is the parity of the distances between $x_r$ and $y_r$ for $r \in \{i_1, \ldots, i_\ell\}$ and zero for $r \notin \{i_1, \ldots, i_\ell\}$. Note that since the component of $x_r$ with $r \in \{i_1, \ldots, i_\ell\}$ is bipartite, all paths between $x_r$ and $y_r$ have the same parity.
We use the above fact to construct a DynFO-program that maintains whether there is a path from $\vec x$ to $\vec y$ in the generalized product of polynomially many undirected graphs $(G_i)_i$ with respect to polynomially many transition rules $\mathcal{A}$ under single edge modifications to factors and single bit modifications to transition rules.
The dynamic program maintains auxiliary data that contains (1) all distances for each of the factors (and thus, in particular, also whether there is a path from $x_i$ to $y_i$ and whether the component $C_i$ that contains $x_i$ is bipartite) and (2) whether the equation system $B\vec x = \vec d$ has a solution over $\mathbb{Z}_2$. For the latter the program maintains whether $\mathrm{rank}(B) = \mathrm{rank}(B, \vec d)$ over $\mathbb{Z}_2$.
It is known that the rank of matrices can be maintained in DynFO [6]. Even more, as observed by William Hesse, the algorithm from [6] can maintain the rank even when whole rows may be replaced.
Upon modification of a factor $G_i$, the program updates all distances for $G_i$ using the dynamic program for maintaining distances in undirected graphs. If the modification yields a bipartite component $C_i$ of $G_i$, then the $i$-th row of $B$ is replaced by the $i$-th row of $\mathcal{A}$ and $d_i$ is set to the parity of paths between $x_i$ and $y_i$. If the component $C_i$ became non-bipartite, then the $i$-th row of $B$ is replaced by the all-zero row and $d_i$ is set to zero. On the other hand, if the $i$-th bit of transition rule $\vec a_j$ in $\mathcal{A}$ is modified, then the $i$-th row of $B$ is modified, and only at its $j$-th entry, if and only if $C_i$ is bipartite. In all scenarios, at most one row of $B$ is modified. The program can therefore maintain the ranks of $B$ and $(B, \vec d)$ accordingly. Finally, if $y_i$ is reachable from $x_i$ in $G_i$ for every $1 \le i \le m$ and $\mathrm{rank}(B) = \mathrm{rank}(B, \vec d)$, then the query bit of the dynamic program is set to true.
The update operations described above can be expressed by first-order formulas with the aforementioned auxiliary data.
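The key observation used for parts (a) and (b) of the proof sketch can be checked on small instances with the following Python sketch. It is a static illustration under assumed names and representations (graphs as lists of directed edges, transition rules as 0/1 tuples, the functions `generalized_product_reachable` and `reachable_by_rule_counts`), not the dynamic program itself: the first function searches the generalized product directly, the second searches for numbers $n_1, \ldots, n_k$ with $n_1 \vec a_1 + \ldots + n_k \vec a_k = \vec d$ for some $\vec d$ whose components are walk lengths in the factors.

```python
from collections import deque
from itertools import product


def walk_lengths(edges, source, target, bound):
    """Set of d <= bound such that a walk of length d leads from source to target."""
    current, lengths = {source}, set()
    for d in range(bound + 1):
        if target in current:
            lengths.add(d)
        current = {v for (u, v) in edges if u in current}
    return lengths


def generalized_product_reachable(factors, rules, xs, ys):
    """BFS in the generalized product: a rule moves factor i iff its i-th bit is 1."""
    start, goal = tuple(xs), tuple(ys)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for rule in rules:
            # bit 1: follow an edge of the factor; bit 0: stay at the same node
            options = [[v for (u, v) in edges if u == a] if bit else [a]
                       for edges, a, bit in zip(factors, node, rule)]
            for nxt in product(*options):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


def reachable_by_rule_counts(factors, rules, xs, ys, bound):
    """Search n_1..n_k <= bound such that sum_j n_j * a_j is a vector of walk lengths."""
    lengths = [walk_lengths(edges, x, y, bound * len(rules))
               for edges, x, y in zip(factors, xs, ys)]
    for counts in product(range(bound + 1), repeat=len(rules)):
        d = [sum(n * rule[i] for n, rule in zip(counts, rules))
             for i in range(len(factors))]
        if all(d_i in ls for d_i, ls in zip(d, lengths)):
            return True
    return False
```

The second function is only complete if `bound` is at least the length of a shortest path in the product, for example the number of product nodes. The brute-force enumeration of the $n_j$ stands in for the existential quantification of the dynamic program and is feasible only for very small inputs.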
Observe that deciding reachability in generalized products of (1) polynomially many graphs with constantly many transition rules and of (2) polynomially many acyclic graphs with polynomially many transition rules are NP-hard problems. More precisely, the first generalizes reachability in the product of polynomially many graphs, which we already discussed above. As for the second, notice that the problem of deciding the existence of a 0-1 solution of a linear equation $A\vec x = \vec 1$, which is known to be NP-hard even for a 0-1 matrix $A$ [5, Chapter 8], can be straightforwardly reduced to reachability in the generalized product of acyclic graphs when polynomially many transition rules are allowed (by using the distance and linear equations characterization used in the proof of Theorem 21). These problems are thus unlikely to be maintainable in DynFO.
In this article we explored graph query languages in the dynamic descriptive complexity framework introduced independently by Dong, Su and Topor, and Patnaik and Immerman. Furthermore we investigated the strongly related question under which conditions distances in graphs as well as reachability in product graphs can be maintained. Our work is only a first step towards a systematic understanding of graph queries in dynamic graph databases. In the following we discuss some interesting directions for further research.
For several restricted classes of graphs we exhibited first-order update programs for maintaining distances. We also showed that quantifier-free update formulas do not suffice. It remains open whether distances can be maintained for general graphs; we conjecture that this is the case.
Open problem 1. Exhibit a DynFO-program for maintaining distances.
As we have seen, reachability in products of labeled graphs is related to maintaining fragments of the graph query language ECRPQ. While we showed that reachability can be maintained in labeled products of acyclic graphs, this problem is already much harder for products of undirected, labeled paths, not to mention arbitrary labeled graphs.
Open problem 2. Find DynFO-programs for maintaining reachability in products of restricted classes of labeled graphs.
Another interesting direction is to exhibit dynamic programs for other, more expressive query languages.
Open problem 3.
Identify further expressive query languages that can be maintained dynamically.
A candidate query language to be studied are nested regular expressions (NREs) [17]. NREs allow to express queries with some branching capabilities. For example, the NRE $(a[b])^*$ selects pairs of nodes that are connected by an $a^*$-labeled path such that every node on this path has an outgoing edge with label $b$. This query can easily be maintained in DynFO, as it is bounded first-order reducible to reachability. On the other hand, it is already unclear whether the query $(a[bc])^*$ can be maintained in DynFO.
References
[1] Renzo Angles and Claudio Gutierrez. Survey of graph database models. ACM Computing Surveys (CSUR), 40(1):1, 2008.
[2] Pablo Barceló Baeza. Querying graph databases. In Richard Hull and Wenfei Fan, editors, Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2013, New York, NY, USA - June 22 - 27, 2013, pages 175–188. ACM, 2013.
[3] Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. The MIT Press, 2008.
[4] Pablo Barceló, Leonid Libkin, Anthony Widjaja Lin, and Peter T. Wood. Expressive languages for path queries over graph-structured data. ACM Trans. Database Syst., 37(4):31, 2012.
[5] Sanjoy Dasgupta, Christos H. Papadimitriou, and Umesh Vazirani. Algorithms. McGraw-Hill, Inc., 2006.
[6] Samir Datta, Raghav Kulkarni, Anish Mukherjee, Thomas Schwentick, and Thomas Zeume. Reachability is in DynFO. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, Automata, Languages, and Programming - 42nd International Colloquium, ICALP 2015, Kyoto, Japan, July 6-10, 2015, Proceedings, Part II, volume 9135 of Lecture Notes in Computer Science, pages 159–170. Springer, 2015.
[7] Camil Demetrescu and Giuseppe F. Italiano. Mantaining dynamic matrices for fully dynamic transitive closure. Algorithmica, 51(4):387–427, 2008.
[8] Guozhu Dong and Jianwen Su. First-order incremental evaluation of datalog queries. In Catriel Beeri, Atsushi Ohori, and Dennis Shasha, editors, Database Programming Languages (DBPL-4), Proceedings of the Fourth International Workshop on Database Programming Languages - Object Models and Languages, Manhattan, New York City, USA, 30 August - 1 September 1993, Workshops in Computing, pages 295–308. Springer, 1993.
[9] Guozhu Dong and Rodney W. Topor. Incremental evaluation of datalog queries. In Joachim Biskup and Richard Hull, editors, Database Theory - ICDT'92, 4th International Conference, Berlin, Germany, October 14-16, 1992, Proceedings, volume 646 of Lecture Notes in Computer Science, pages 282–296. Springer, 1992.
[10] Zvi Galil. Hierarchies of complete problems. Acta Informatica, 6(1):77–88, 1976.
[11] Wouter Gelade, Marcel Marquardt, and Thomas Schwentick. The dynamic complexity of formal languages. ACM Trans. Comput. Log., 13(3):19, 2012.
[12] Erich Grädel and Sebastian Siebertz. Dynamic definability. In Alin Deutsch, editor, pages 236–248. ACM, 2012.
[13] William Hesse. The dynamic complexity of transitive closure is in DynTC0. Theoretical Computer Science, 296(3):473–485, 2003.
[14] Peter Bro Miltersen. Cell probe complexity - a survey. 1999.
[15] Pablo Muñoz, Nils Vortmeier, and Thomas Zeume. Dynamic graph queries. To be presented at ICDT 2016.
[16] Sushant Patnaik and Neil Immerman. Dyn-FO: A parallel, dynamic complexity class. J. Comput. Syst. Sci., 55(2):199–209, 1997.
[17] Jorge Pérez, Marcelo Arenas, and Claudio Gutierrez. nSPARQL: A navigational language for RDF. J. Web Sem., 8(4):255–270, 2010.
[18] Liam Roditty and Uri Zwick. Improved dynamic reachability algorithms for directed graphs. SIAM J. Comput., 37(5):1455–1471, 2008.
[19] Dragan Stevanović. When is NEPS of graphs connected? Linear Algebra and its Applications, 301(1):137–144, 1999.
[20] Volker Weber and Thomas Schwentick. Dynamic complexity theory revisited. Theory Comput. Syst., 40(4):355–377, 2007.
[21] Peter T. Wood. Query languages for graph databases. ACM SIGMOD Record, 41(1):50–60, 2012.
[22] Thomas Zeume. The dynamic descriptive complexity of k-clique. In Erzsébet Csuhaj-Varjú, Martin Dietzfelbinger, and Zoltán Ésik, editors, Mathematical Foundations of Computer Science 2014 - 39th International Symposium, MFCS 2014, Budapest, Hungary, August 25-29, 2014. Proceedings, Part I, volume 8634 of Lecture Notes in Computer Science, pages 547–558. Springer, 2014.
[23] Thomas Zeume. Small Dynamic Complexity Classes. PhD thesis, TU Dortmund University, 2015.
[24] Thomas Zeume and Thomas Schwentick. Dynamic conjunctive queries. In Nicole Schweikardt, Vassilis Christophides, and Vincent Leroy, editors, Proc. 17th International Conference on Database Theory (ICDT), Athens, Greece, March 24-28, 2014, pages 38–49. OpenProceedings.org, 2014.
[25] Thomas Zeume and Thomas Schwentick. On the quantifier-free dynamic complexity of reachability.