Constant delay enumeration with FPT-preprocessing for conjunctive queries of bounded submodular width

Abstract

Marx (STOC 2010, J. ACM 2013) introduced the notion of submodular width of a conjunctive query (CQ) and showed that for any class Φ of Boolean CQs of bounded submodular width, the model-checking problem for Φ on the class of all finite structures is fixed-parameter tractable (FPT). Note that for non-Boolean queries, the size of the query result may be far too large to be computed entirely within FPT time. We investigate the free-connex variant of submodular width and generalise Marx's result to non-Boolean queries as follows: For every class Φ of CQs of bounded free-connex submodular width, within FPT-preprocessing time we can build a data structure that allows one to enumerate, without repetition and with constant delay, all tuples of the query result. Our proof builds upon Marx's splitting routine to decompose the query result into a union of results; but we have to tackle the additional technical difficulty of ensuring that these can be enumerated efficiently.


Christoph Berkholz, Nicole Schweikardt
Humboldt-Universität zu Berlin
{berkholz,schweikn}@informatik.hu-berlin.de
March 3, 2020


In the past decade, starting with Durand and Grandjean [21], the fields of logic in computer science and database theory have seen a large number of contributions that deal with the efficient enumeration of query results. In this scenario, the objective is as follows: given a finite relational structure (i.e., a database) and a logical formula (i.e., a query), after a short preprocessing phase, the query results shall be generated one by one, without repetition, with guarantees on the maximum delay time between the output of two tuples. In this vein, the best that one can hope for is constant delay (i.e., the delay may depend on the size of the query but not on that of the input structure) and linear preprocessing time (i.e., time $f(\varphi) \cdot O(N)$ where $N$ is the size of a reasonable representation of the input structure, $\varphi$ is the query, and $f(\varphi)$ is a number only depending on the query but not on the input structure). Constant delay enumeration has also been adopted as a central concept in factorised databases that gained recent attention [39, 38].

Quite a number of query evaluation problems are known to admit constant delay algorithms preceded by linear or pseudo-linear time preprocessing. This is the case for all first-order queries, provided that they are evaluated over classes of structures of bounded degree [21, 29, 13, 32], low degree [22], bounded expansion [30], locally bounded expansion [43], and on classes that are nowhere dense [41]. Also different data models have been investigated, including tree-like data and document spanners [7, 31, 5]. Recently, also the dynamic setting, where a fixed query has to be evaluated repeatedly against a database that is constantly updated, has received quite some attention [33, 13, 12, 27, 14, 4, 37, 36, 6].

This paper deals with the classical, static setting without database updates.
We focus on evaluating conjunctive queries (CQs, i.e., primitive-positive formulas) on arbitrary relational structures.¹ In the following, FPT-preprocessing (resp., FPL-preprocessing) means preprocessing that takes time $f(\varphi) \cdot N^{O(1)}$ (resp., $f(\varphi) \cdot O(N)$), and constant delay means delay $f(\varphi)$, where $f$ is a computable function, $\varphi$ is the query, and $N$ is the size of the input structure.

∗ This is the full version of the conference contribution [15].

Bagan et al. [9] showed that every free-connex acyclic CQ allows constant delay enumeration after FPL-preprocessing. More refined results in this vein are due to Bagan [8] and Brault-Baron [17]; see [42] for a survey and [11] for a tutorial. Bagan et al. [9] complemented their result by a conditional lower bound: assuming that Boolean matrix multiplication cannot be accomplished in time $O(n^2)$, self-join-free acyclic CQs that are not free-connex cannot be enumerated with constant delay and FPL-preprocessing. This demonstrates that even if the evaluation of Boolean queries is easy (as known for all acyclic CQs [44]), the enumeration of the results of non-Boolean queries might be hard (here, for acyclic CQs that are not free-connex).

Bagan et al. [9] also introduced the notion of free-connex (fc) treewidth (tw) of a CQ and showed that for every class $\Phi$ of CQs of bounded fc-tw, within FPT-preprocessing time, one can build a data structure that allows constant delay enumeration of the query results. This can be viewed as a generalisation, to the non-Boolean case, of the well-known result stating that the model-checking problem for classes of Boolean CQs of bounded treewidth is FPT.
Note that for non-Boolean queries, even if they come from a class of bounded fc-tw, the size of the query result may be $N^{\Omega(\|\varphi\|)}$, i.e., far too large to be computed entirely within FPT-preprocessing time; and generalising the known tractability result for Boolean CQs to the non-Boolean case is far from trivial.

In a series of papers, the FPT-result for Boolean CQs has been strengthened to more and more general width-measures, namely to classes of queries of bounded generalised hypertree width (ghw) [25], bounded fractional hypertree width (fhw) [26], and bounded submodular width (subw) [35]. The result on bounded fhw has been generalised to the non-Boolean case in the context of factorised databases [39], which implies constant delay enumeration after FPT-preprocessing for CQs of bounded free-connex fractional hypertree width (fc-fhw). Related data structures that allow constant delay enumeration after FPT-preprocessing for (quantifier-free) CQs of bounded (fc-)fhw have also been provided in [19, 28].

An analogous generalisation of the result on bounded submodular width, however, is still missing. The present paper's main result closes this gap: we show that on classes of CQs of bounded fc-subw, within FPT-preprocessing time one can build a data structure that allows constant delay enumeration of the query results. And within the same FPT-preprocessing time, one can also construct a data structure that enables one to test in constant time whether an input tuple belongs to the query result. Our proof uses Marx's splitting routine [35] to decompose the query result of $\varphi$ on $\mathcal{A}$ into the union of results of several queries $\varphi_i$ on several structures $\mathcal{A}_i$, but we have to tackle the additional technical difficulty of ensuring that the results of all the $\varphi_i$ on $\mathcal{A}_i$ can be enumerated efficiently.
Once having achieved this, we can conclude by using an elegant trick provided by Durand and Strozecki [23] for enumerating, without repetition, the union of query results.

As an immediate consequence of the lower bound provided by Marx [35] in the context of Boolean CQs of unbounded submodular width, one obtains that our main result is tight for certain classes of CQs, namely, recursively enumerable classes $\Phi$ of quantifier-free and self-join-free CQs: assuming the exponential time hypothesis (ETH), such a class $\Phi$ allows constant delay enumeration after FPT-preprocessing if, and only if, $\Phi$ has bounded fc-subw.

Let us mention a related recent result which, however, is incomparable to ours. Abo Khamis et al. [2] designed an algorithm for evaluating a quantifier-free CQ $\varphi$ of submodular width $w$ within time $O(N^w) \cdot (\log N)^{f(\varphi)} + O(r \cdot \log N)$; and an analogous result is also achieved for non-quantifier-free CQs of fc-subw $w$ [2]. Here, $N$ is the size of the input structure, $r$ is the number of tuples in the query result, and $f(\varphi)$ is at least exponential in the number of variables of $\varphi$. In particular, the algorithm does not distinguish between a preprocessing phase and an

¹ In this paper, structures will always be finite and relational.

Outline.

The rest of the paper is structured as follows. Section 2 provides basic notations concerning structures, queries, and constant delay enumeration. Section 3 recalls concepts of (free-connex) decompositions of queries, provides a precise statement of our main result, and collects the necessary tools for obtaining this result. Section 4 is devoted to the detailed proof of our main result. We conclude in Section 5.

In this section we fix notation and summarise basic definitions.

Basic notation.

We write $\mathbb{N}$ and $\mathbb{R}_{\geq 0}$ for the set of non-negative integers and reals, respectively, and we let $\mathbb{N}_{\geq 1} := \mathbb{N} \setminus \{0\}$ and $[n] := \{1, \ldots, n\}$ for all $n \in \mathbb{N}_{\geq 1}$. By $2^S$ we denote the power set of a set $S$. Whenever $G$ denotes a graph, we write $V(G)$ and $E(G)$ for the set of nodes and the set of edges, respectively, of $G$. Whenever writing $a$ to denote a $k$-tuple (for some arity $k \in \mathbb{N}$), we write $a_i$ to denote the tuple's $i$-th component; i.e., $a = (a_1, \ldots, a_k)$. For a $k$-tuple $a$ and indices $i_1, \ldots, i_\ell \in [k]$ we let $\pi_{i_1,\ldots,i_\ell}(a) := (a_{i_1}, \ldots, a_{i_\ell})$. For a set $S$ of $k$-tuples we let $\pi_{i_1,\ldots,i_\ell}(S) := \{\pi_{i_1,\ldots,i_\ell}(a) : a \in S\}$.

If $h$ and $g$ are mappings with domains $X$ and $Y$, respectively, we say that $h$ and $g$ are joinable if $h(z) = g(z)$ holds for all $z \in X \cap Y$. In case that $h$ and $g$ are joinable, we write $h \bowtie g$ to denote the mapping $f$ with domain $X \cup Y$ where $f(x) = h(x)$ for all $x \in X$ and $f(y) = g(y)$ for all $y \in Y$. If $A$ and $B$ are sets of mappings with domains $X$ and $Y$, respectively, then $A \bowtie B := \{h \bowtie g : h \in A,\ g \in B, \text{ and } h \text{ and } g \text{ are joinable}\}$.

We use the following further notation where $A$ is a set of mappings with domain $X$ and $h \in A$. For a set $I \subseteq X$, the projection $\pi_I(h)$ is the restriction $h|_I$ of $h$ to $I$; and $\pi_I(A) := \{\pi_I(h) : h \in A\}$. For objects $z, c$ where $z \notin X$, we write $h \cup \{(z, c)\}$ for the extension $h'$ of $h$ to domain $X \cup \{z\}$ with $h'(z) = c$ and $h'(x) = h(x)$ for all $x \in X$.

Signatures and structures.

A signature is a finite set $\sigma$ of relation symbols, where each $R \in \sigma$ is equipped with a fixed arity $\mathrm{ar}(R) \in \mathbb{N}_{\geq 1}$. A $\sigma$-structure $\mathcal{A}$ consists of a finite set $A$ (called the universe or domain of $\mathcal{A}$) and an $\mathrm{ar}(R)$-ary relation $R^{\mathcal{A}} \subseteq A^{\mathrm{ar}(R)}$ for each $R \in \sigma$. The size $\|\sigma\|$ of a signature $\sigma$ is $|\sigma| + \sum_{R \in \sigma} \mathrm{ar}(R)$.
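We write $n^{\mathcal{A}}$ to denote the cardinality $|A|$ of $\mathcal{A}$'s universe, we write $m^{\mathcal{A}}$ to denote the number of tuples in $\mathcal{A}$'s largest relation, and we write $N^{\mathcal{A}}$ or $\|\mathcal{A}\|$ to denote the size of a reasonable encoding of $\mathcal{A}$. To be specific, let $N^{\mathcal{A}} = \|\mathcal{A}\| = \|\sigma\| + n^{\mathcal{A}} + \sum_{R \in \sigma} \|R^{\mathcal{A}}\|$, where $\|R^{\mathcal{A}}\| = \mathrm{ar}(R) \cdot |R^{\mathcal{A}}|$. Whenever $\mathcal{A}$ is clear from the context, we will omit the superscript $\cdot^{\mathcal{A}}$ and write $n, m, N$ instead of $n^{\mathcal{A}}, m^{\mathcal{A}}, N^{\mathcal{A}}$. Consider signatures $\sigma$ and $\tau$ with $\sigma \subseteq \tau$. The $\sigma$-reduct of a $\tau$-structure $\mathcal{B}$ is the $\sigma$-structure $\mathcal{A}$ with $A = B$ and $R^{\mathcal{A}} = R^{\mathcal{B}}$ for all $R \in \sigma$. A $\tau$-expansion of a $\sigma$-structure $\mathcal{A}$ is a $\tau$-structure $\mathcal{B}$ whose $\sigma$-reduct is $\mathcal{A}$.

Conjunctive Queries.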

We fix a countably infinite set $\mathrm{var}$ of variables. We allow queries to use arbitrary relation symbols of arbitrary arities. An atom $\alpha$ is of the form $R(v_1, \ldots, v_r)$ with $r = \mathrm{ar}(R)$ and $v_1, \ldots, v_r \in \mathrm{var}$. We write $\mathrm{vars}(\alpha)$ to denote the set of variables occurring in $\alpha$. A conjunctive query (CQ, for short) is of the form $\exists z_1 \cdots \exists z_\ell\, (\alpha_1 \wedge \cdots \wedge \alpha_d)$, where $\ell \in \mathbb{N}$, $d \in \mathbb{N}_{\geq 1}$, $\alpha_j$ is an atom for every $j \in [d]$, and $z_1, \ldots, z_\ell$ are pairwise distinct elements in $\mathrm{vars}(\alpha_1) \cup \cdots \cup \mathrm{vars}(\alpha_d)$. For such a CQ $\varphi$ we let $\mathrm{atoms}(\varphi) = \{\alpha_1, \ldots, \alpha_d\}$. We write $\mathrm{vars}(\varphi)$ and $\sigma(\varphi)$ for the set of variables and the set of relation symbols occurring in $\varphi$, respectively. The set of quantified variables of $\varphi$ is $\mathrm{quant}(\varphi) := \{z_1, \ldots, z_\ell\}$, and the set of free variables is $\mathrm{free}(\varphi) := \mathrm{vars}(\varphi) \setminus \mathrm{quant}(\varphi)$. We sometimes write $\varphi(x_1, \ldots, x_k)$ to indicate that $x_1, \ldots, x_k$ are the free variables of $\varphi$. The arity of $\varphi$ is the number $k := |\mathrm{free}(\varphi)|$. The query $\varphi$ is called quantifier-free if $\mathrm{quant}(\varphi) = \emptyset$, it is called Boolean if its arity is 0, and it is called self-join-free if no relation symbol occurs more than once in $\varphi$.

The semantics are defined as usual: A valuation for $\varphi$ on a $\sigma(\varphi)$-structure $\mathcal{A}$ is a mapping $\beta : \mathrm{vars}(\varphi) \to A$. A valuation $\beta$ is a homomorphism from $\varphi$ to $\mathcal{A}$ if for every atom $R(v_1, \ldots, v_r) \in \mathrm{atoms}(\varphi)$ we have $(\beta(v_1), \ldots, \beta(v_r)) \in R^{\mathcal{A}}$. The query result $\llbracket \varphi \rrbracket^{\mathcal{A}}$ of a CQ $\varphi$ on the $\sigma(\varphi)$-structure $\mathcal{A}$ is defined as the set $\{\pi_{\mathrm{free}(\varphi)}(\beta) : \beta \text{ is a homomorphism from } \varphi \text{ to } \mathcal{A}\}$. Often, we will identify the mappings $g \in \llbracket \varphi \rrbracket^{\mathcal{A}}$ with tuples $(g(x_1), \ldots, g(x_k))$, where $x_1, \ldots, x_k$ is a fixed listing of the free variables of $\varphi$.

The size $\|\varphi\|$ of a query $\varphi$ is the length of $\varphi$ when viewed as a word over the alphabet $\sigma(\varphi) \cup \mathrm{vars}(\varphi) \cup \{\exists, \wedge, (, )\} \cup \{,\}$.

Model of computation.

For the complexity analysis we assume the RAM-model with a uniform cost measure. In particular, storing and accessing elements from a structure's universe requires $O(1)$ space and time. For an $r$-ary relation $R^{\mathcal{A}}$ we can construct in time $O(\|R^{\mathcal{A}}\|)$ an index that allows one to enumerate $R^{\mathcal{A}}$ with $O(1)$ delay and to test for a given $r$-tuple $a$ whether $a \in R^{\mathcal{A}}$ in time $O(r)$. Moreover, for every $\{i_1, \ldots, i_\ell\} \subseteq [r]$ we can build a data structure where we can enumerate for every $\ell$-tuple $b$ the selection $\{a \in R^{\mathcal{A}} : \pi_{i_1,\ldots,i_\ell}(a) = b\}$ with $O(1)$ delay. Such a data structure can be constructed in time $O(\|R^{\mathcal{A}}\|)$, for instance by a linear scan over $R^{\mathcal{A}}$ where we add every tuple $a \in R^{\mathcal{A}}$ to a list $L_{\pi_{i_1,\ldots,i_\ell}(a)}$. Using a constant access data structure of linear size, the list $L_b$ can be accessed in time $O(\ell)$ when receiving an $\ell$-tuple $b$.

Constant delay enumeration and testing.

An enumeration algorithm for query evaluation consists of two phases: the preprocessing phase and the enumeration phase. In the preprocessing phase the algorithm is allowed to do arbitrary preprocessing on the query $\varphi$ and the input structure $\mathcal{A}$. We denote the time required for this phase by $t_p$. In the subsequent enumeration phase the algorithm enumerates, without repetition, all tuples (or, mappings) in the query result $\llbracket \varphi \rrbracket^{\mathcal{A}}$, followed by the end-of-enumeration message EOE. The delay $t_d$ is the maximum time that passes between the start of the enumeration phase and the output of the first tuple, between the output of two consecutive tuples, and between the last tuple and EOE.

A testing algorithm for query evaluation also starts with a preprocessing phase of time $t_p$ in which a data structure is computed that allows one to test for a given tuple (or, mapping) $b$ whether it is contained in the query result $\llbracket \varphi \rrbracket^{\mathcal{A}}$.
The testing time $t_t$ of the algorithm is an upper bound on the time that passes between receiving $b$ and providing the answer.

One speaks of constant delay (testing time) if the delay (testing time) depends on the query $\varphi$, but not on the input structure $\mathcal{A}$.

We make use of the following result from Durand and Strozecki, which allows one to efficiently enumerate the union of query results, provided that each query result in the union can be enumerated and tested efficiently. Note that this is not immediate, because the union might contain many duplicates that need to be avoided during enumeration.

Theorem 2.1 ([23]). Suppose that there is an enumeration algorithm $\mathbb{A}$ that receives a query $\varphi$ and a structure $\mathcal{A}$ and enumerates $\llbracket \varphi \rrbracket^{\mathcal{A}}$ with delay $t_d(\varphi)$ after $t_p(\varphi, \mathcal{A})$ preprocessing time. Further suppose that there is a testing algorithm $\mathbb{B}$ that receives a query $\varphi$ and a structure $\mathcal{A}$ and has $t_p(\varphi, \mathcal{A})$ preprocessing time and $t_t(\varphi)$ testing time. Then there is an algorithm $\mathbb{C}$ that receives $\ell$ queries $\varphi_i$ and structures $\mathcal{A}_i$ and allows one to enumerate $\bigcup_{i \in [\ell]} \llbracket \varphi_i \rrbracket^{\mathcal{A}_i}$ with $O\big(\sum_{i \in [\ell]} t_d(\varphi_i) + \sum_{i \in [\ell]} t_t(\varphi_i)\big)$ delay after $O\big(\sum_{i \in [\ell]} t_p(\varphi_i, \mathcal{A}_i)\big)$ preprocessing time.

Proof (sketch). The induction start $\ell = 1$ is trivial. For the induction step $\ell \to \ell + 1$ start an enumeration of $\bigcup_{i \in [\ell]} \llbracket \varphi_i \rrbracket^{\mathcal{A}_i}$ and test for every tuple whether it is contained in $\llbracket \varphi_{\ell+1} \rrbracket^{\mathcal{A}_{\ell+1}}$. If the answer is no, then output the tuple. Otherwise discard the tuple and instead output the next tuple in an enumeration of $\llbracket \varphi_{\ell+1} \rrbracket^{\mathcal{A}_{\ell+1}}$. Subsequently enumerate the remaining tuples from $\llbracket \varphi_{\ell+1} \rrbracket^{\mathcal{A}_{\ell+1}}$.

At the end of this section, we provide a precise statement of our main result.
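The induction in the proof sketch of Theorem 2.1 can be illustrated by a small Python sketch. It is not the authors' implementation: the names `Enumerator`, `Tester`, `_combine` and `enumerate_union` are hypothetical, each result set is assumed to be given by a restartable duplicate-free enumerator together with a membership test, and Python generators only model the control flow, not the RAM-model delay bounds.

```python
from typing import Callable, Hashable, Iterator, List, Tuple

# Assumed interface (hypothetical names): one result set is represented by
#   - an Enumerator: a function returning a fresh duplicate-free iterator, and
#   - a Tester: a membership test for the same set.
Enumerator = Callable[[], Iterator[Hashable]]
Tester = Callable[[Hashable], bool]


def _combine(prefix: Enumerator, last_enum: Enumerator, last_test: Tester) -> Enumerator:
    """Induction step: enumerate (union behind `prefix`) ∪ (last set) without repetition."""
    def run() -> Iterator[Hashable]:
        last_iter = last_enum()
        for t in prefix():
            if not last_test(t):
                # t is not in the last set, so it can be output right away.
                yield t
            else:
                # t also lies in the last set: discard it and output the next
                # tuple of the last set instead. Such a tuple always exists,
                # because the number of discards never exceeds the set's size.
                yield next(last_iter)
        # Finally output the tuples of the last set that are still pending.
        yield from last_iter
    return run


def enumerate_union(parts: List[Tuple[Enumerator, Tester]]) -> Iterator[Hashable]:
    """Enumerate the union of all given sets, each element exactly once."""
    if not parts:
        return iter(())
    current = parts[0][0]
    for last_enum, last_test in parts[1:]:
        current = _combine(current, last_enum, last_test)
    return current()
```

For instance, with `sets = [{1, 2, 3}, {2, 3, 4}, {1, 5}]` and `parts = [(lambda s=s: iter(sorted(s)), s.__contains__) for s in sets]`, the call `list(enumerate_union(parts))` lists each of the five elements once. Every output step advances each underlying enumerator at most once and performs at most one test per set, which mirrors the sum of delays and testing times in the theorem.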
Before we can do so, we have to recall the concept of free-connex decompositions of queries and the notion of submodular width. It will be convenient for us to use the following notation.

Definition 3.1.

Let $\varphi = \exists z_1 \cdots \exists z_\ell\, (\alpha_1 \wedge \cdots \wedge \alpha_d)$ be a CQ and $S \subseteq \mathrm{vars}(\varphi)$. We write $\varphi\langle S\rangle$ for the CQ that is equivalent to the expression

\[ (\exists y_1 \cdots \exists y_r\, \alpha_1) \wedge \cdots \wedge (\exists y_1 \cdots \exists y_r\, \alpha_d), \qquad (1) \]

where $\{y_1, \ldots, y_r\} = \mathrm{vars}(\varphi) \setminus S$.

Note that $\varphi\langle S\rangle$ is obtained from $\varphi$ by discarding existential quantification and projecting every atom to $S$, hence $\mathrm{free}(\varphi\langle S\rangle) = S$. However, $\llbracket \varphi\langle S\rangle \rrbracket^{\mathcal{A}}$ shall not be confused with the projection of $\llbracket \varphi \rrbracket^{\mathcal{A}}$ to $S$. In fact, it might be that $\llbracket \varphi \rrbracket^{\mathcal{A}}$ is empty, but $\llbracket \varphi\langle S\rangle \rrbracket^{\mathcal{A}}$ is not, as the following example illustrates:

\[ \varphi = E(x, y) \wedge E(y, z) \wedge E(x, z) \quad \text{and} \qquad (2) \]
\[ \varphi\langle\{x, z\}\rangle \equiv \exists y\, E(x, y) \wedge \exists y\, E(y, z) \wedge \exists y\, E(x, z) \qquad (3) \]
\[ \equiv E(x, z). \qquad (4) \]

We use the same notation as [24] for decompositions of queries: A tree decomposition (TD, for short) of a CQ $\varphi$ is a tuple $TD = (T, \chi)$, for which the following two conditions are satisfied:

1. $T = (V(T), E(T))$ is a finite undirected tree.
2. $\chi$ is a mapping that associates with every node $t \in V(T)$ a set $\chi(t) \subseteq \mathrm{vars}(\varphi)$ such that
   (a) for each atom $\alpha \in \mathrm{atoms}(\varphi)$ there exists $t \in V(T)$ such that $\mathrm{vars}(\alpha) \subseteq \chi(t)$, and
   (b) for each variable $v \in \mathrm{vars}(\varphi)$ the set $\chi^{-1}(v) := \{t \in V(T) : v \in \chi(t)\}$ induces a connected subtree of $T$ (this condition is called path condition).

To use a tree decomposition $TD = (T, \chi)$ of $\varphi$ for query evaluation one considers, for each $t \in V(T)$, the query $\varphi\langle S\rangle$ for $S := \chi(t)$, evaluates this query on the input structure $\mathcal{A}$, and then combines these results for all $t \in V(T)$ along a bottom-up traversal of $T$. If the query is Boolean, this yields the result of $\varphi$ on $\mathcal{A}$; if it is non-Boolean, $\llbracket \varphi \rrbracket^{\mathcal{A}}$ can be computed by performing additional traversals of $T$.
This approach is efficient if the result sets $\llbracket \varphi\langle\chi(t)\rangle \rrbracket^{\mathcal{A}}$ are small and can be computed efficiently (later on, we will sometimes refer to the sets $\llbracket \varphi\langle\chi(t)\rangle \rrbracket^{\mathcal{A}}$ as projections on bags).

The simplest queries where this is the case are acyclic queries [10, 16]. A number of equivalent characterisations of the acyclic CQs have been provided in the literature (cf. [1, 25, 27, 18]); among them a characterisation by Gottlob et al. [25] stating that a CQ is acyclic if and only if it has a tree decomposition where every bag is covered by an atom, i.e., for every bag $\chi(t)$ there is some atom $\alpha$ in $\varphi$ with $\chi(t) \subseteq \mathrm{vars}(\alpha)$. The approach described above leads to a linear time algorithm for evaluating an acyclic CQ $\varphi$ that is Boolean, and if $\varphi$ is non-Boolean, $\llbracket \varphi \rrbracket^{\mathcal{A}}$ is computed in time linear in $\|\mathcal{A}\| + |\llbracket \varphi \rrbracket^{\mathcal{A}}|$. This method is known as Yannakakis' algorithm. But this algorithm does not distinguish between a preprocessing phase and an enumeration phase and does not guarantee constant delay enumeration. In fact, Bagan et al. identified the following additional property that is needed to ensure constant delay enumeration.

Definition 3.2 ([9]). A tree decomposition $TD = (T, \chi)$ of a CQ $\varphi$ is free-connex if there is a subset $U \subseteq V(T)$ that induces a connected subtree of $T$ and that satisfies the condition $\mathrm{free}(\varphi) = \bigcup_{t \in U} \chi(t)$.

Bagan et al. [9] identified the free-connex acyclic CQs, i.e., the CQs $\varphi$ that have a free-connex tree decomposition where every bag is covered by an atom, as the fragment of the acyclic CQs whose results can be enumerated with constant delay after FPL-preprocessing:

Theorem 3.3 (Bagan et al. [9]). There is a computable function $f$ and an algorithm which receives a free-connex acyclic CQ $\varphi$ and a $\sigma(\varphi)$-structure $\mathcal{A}$ and computes within $t_p = f(\varphi)\, O(\|\mathcal{A}\|)$ preprocessing time and space a data structure that allows one to
(i) enumerate $\llbracket \varphi \rrbracket^{\mathcal{A}}$ with $f(\varphi)$ delay and
(ii) test for a given tuple (or, mapping) $b$ if $b \in \llbracket \varphi \rrbracket^{\mathcal{A}}$ within $f(\varphi)$ testing time.

The approach of using free-connex tree decompositions for constant delay enumeration can be extended from acyclic CQs to arbitrary CQs. To do this, we have to compute for every bag $\chi(t)$ in the tree decomposition the projection $\llbracket \varphi\langle\chi(t)\rangle \rrbracket^{\mathcal{A}}$. This reduces the task to the acyclic case, where the free-connex acyclic query contains one atom $\alpha$ with $\mathrm{vars}(\alpha) = \chi(t)$ for every bag $\chi(t)$ and the corresponding relation is defined by $\llbracket \varphi\langle\chi(t)\rangle \rrbracket^{\mathcal{A}}$. Because the runtime in this approach is dominated by computing $\llbracket \varphi\langle\chi(t)\rangle \rrbracket^{\mathcal{A}}$, it is only feasible if the projections are efficiently computable for every bag. If the decomposition has bounded treewidth or bounded fractional hypertree width, then it is possible to compute $\llbracket \varphi\langle\chi(t)\rangle \rrbracket^{\mathcal{A}}$ for every bag in time $f(\varphi) \cdot \|\mathcal{A}\|^{O(1)}$ [26], which in turn implies that the result can be enumerated after FPT-preprocessing time for CQs of bounded fc-tw [9] and for CQs of bounded fc-fhw [39].
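To make the conditions on tree decompositions and Definition 3.2 concrete, the following Python sketch checks them by brute force. It is only an illustration under stated assumptions: the function names and the input encoding (atoms as sets of variable names, bags as a dict from tree nodes to sets, tree edges as pairs of nodes) are hypothetical, the empty node set is treated as inducing a connected subtree (so Boolean queries count as free-connex), and the search over subsets $U$ is exponential in $|V(T)|$, which is acceptable only because $T$ depends on the query alone.

```python
from itertools import combinations


def induces_connected_subtree(tree_edges, nodes):
    """True if `nodes` induces a connected subgraph of the tree.
    Assumption of this sketch: the empty set counts as connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for a, b in tree_edges:
            for x, y in ((a, b), (b, a)):
                if x == u and y in nodes and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == nodes


def is_tree_decomposition(atoms, tree_edges, bags):
    """atoms: list of variable sets; bags: dict node -> variable set."""
    nodes = set(bags)
    variables = set().union(*atoms)
    # Condition 1: T is a tree (connected, with |V(T)| - 1 edges).
    if len(tree_edges) != len(nodes) - 1:
        return False
    if not induces_connected_subtree(tree_edges, nodes):
        return False
    # Bags may only contain variables of the query.
    if any(not bag <= variables for bag in bags.values()):
        return False
    # Condition 2(a): every atom is contained in some bag.
    if any(not any(atom <= bag for bag in bags.values()) for atom in atoms):
        return False
    # Condition 2(b): path condition for every variable.
    return all(
        induces_connected_subtree(tree_edges, {t for t in nodes if v in bags[t]})
        for v in variables
    )


def is_free_connex(tree_edges, bags, free):
    """Search for a connected U whose bags cover exactly the free variables."""
    free = set(free)
    nodes = list(bags)
    for r in range(len(nodes) + 1):
        for U in combinations(nodes, r):
            covered = set().union(*(bags[t] for t in U))
            if covered == free and induces_connected_subtree(tree_edges, U):
                return True
    return False
```

As a usage example, for the atoms of a 4-cycle over `x1, ..., x4` the decomposition with bags `{x1, x2, x4}`, `{x2, x4}`, `{x2, x3, x4}` arranged on a path is free-connex for the free variables `{x2, x4}`, whereas the two-bag decomposition `{x1, x2, x3}`, `{x1, x3, x4}` is not.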
Before providing the precise definition of the submodular width of a query, let us first consider an example. The central idea behind algorithms that rely on submodular width [35, 2, 40] is to split the input structure into several parts and use for every part a different tree decomposition of $\varphi$. This will give a significant improvement over the fractional hypertree width, which uses only one tree decomposition of $\varphi$. A typical example to illustrate this idea is the following 4-cycle query (see also [2, 40]):

\[ \varphi := E_{12}(x_1, x_2) \wedge E_{23}(x_2, x_3) \wedge E_{34}(x_3, x_4) \wedge E_{41}(x_4, x_1). \]

There are essentially two non-trivial tree decompositions $TD' = (T, \chi')$, $TD'' = (T, \chi'')$ of $\varphi$, which are both defined over the two-vertex tree $T = (\{t_1, t_2\}, \{(t_1, t_2)\})$ by $\chi'(t_1) = \{x_1, x_2, x_3\}$, $\chi'(t_2) = \{x_1, x_3, x_4\}$ and $\chi''(t_1) = \{x_2, x_3, x_4\}$, $\chi''(t_2) = \{x_1, x_2, x_4\}$. Both tree decompositions lead to an optimal fractional hypertree decomposition of width $\mathrm{fhw}(\varphi) = 2$. Indeed, for the worst-case instance $\mathcal{A}$ with

\[ E_{12}^{\mathcal{A}} = E_{34}^{\mathcal{A}} := [\ell] \times \{a\} \,\cup\, \{b\} \times [\ell], \qquad E_{23}^{\mathcal{A}} = E_{41}^{\mathcal{A}} := [\ell] \times \{b\} \,\cup\, \{a\} \times [\ell] \]

we have $\|\mathcal{A}\| = O(\ell)$ while the projections on the bags have size $\Omega(\ell^2)$ in both decompositions:

\[ \llbracket \varphi\langle\chi'(t_1)\rangle \rrbracket^{\mathcal{A}} = \llbracket \varphi\langle\chi'(t_2)\rangle \rrbracket^{\mathcal{A}} = [\ell] \times \{a\} \times [\ell] \,\cup\, \{b\} \times [\ell] \times \{b\}, \]
\[ \llbracket \varphi\langle\chi''(t_1)\rangle \rrbracket^{\mathcal{A}} = \llbracket \varphi\langle\chi''(t_2)\rangle \rrbracket^{\mathcal{A}} = [\ell] \times \{b\} \times [\ell] \,\cup\, \{a\} \times [\ell] \times \{a\}. \]
However, we can split $\mathcal{A}$ into $\mathcal{A}'$ and $\mathcal{A}''$ such that $\llbracket \varphi \rrbracket^{\mathcal{A}}$ is the disjoint union of $\llbracket \varphi \rrbracket^{\mathcal{A}'}$ and $\llbracket \varphi \rrbracket^{\mathcal{A}''}$ and the bag-sizes in the respective decompositions are small:

\[ E_{12}^{\mathcal{A}'} = E_{34}^{\mathcal{A}'} := \{b\} \times [\ell], \qquad E_{23}^{\mathcal{A}'} = E_{41}^{\mathcal{A}'} := [\ell] \times \{b\}, \]
\[ E_{12}^{\mathcal{A}''} = E_{34}^{\mathcal{A}''} := [\ell] \times \{a\}, \qquad E_{23}^{\mathcal{A}''} = E_{41}^{\mathcal{A}''} := \{a\} \times [\ell], \]
\[ \llbracket \varphi\langle\chi'(t_1)\rangle \rrbracket^{\mathcal{A}'} = \llbracket \varphi\langle\chi'(t_2)\rangle \rrbracket^{\mathcal{A}'} = \{b\} \times [\ell] \times \{b\}, \]
\[ \llbracket \varphi\langle\chi''(t_1)\rangle \rrbracket^{\mathcal{A}''} = \llbracket \varphi\langle\chi''(t_2)\rangle \rrbracket^{\mathcal{A}''} = \{a\} \times [\ell] \times \{a\}. \]

(Recall from Section 2 our convention to identify mappings in query results with tuples; the free variables are listed canonically here, by increasing indices.)

Thus, we can efficiently evaluate $\varphi$ on $\mathcal{A}'$ using $TD'$ and $\varphi$ on $\mathcal{A}''$ using $TD''$ (in time $O(\ell)$ in this example) and then combine both results to obtain $\varphi(\mathcal{A})$. Using the strategy of Alon et al. [3], it is possible to split every database $\mathcal{A}$ for this particular 4-cycle query $\varphi$ into two instances $\mathcal{A}'$ and $\mathcal{A}''$ such that the bag sizes in $TD'$ on $\mathcal{A}'$ as well as in $TD''$ on $\mathcal{A}''$ are bounded by $\|\mathcal{A}\|^{3/2}$ and can be computed in time $O(\|\mathcal{A}\|^{3/2})$ (see [2, 40] for a detailed account on this strategy).
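The bag sizes claimed in the 4-cycle example above can be checked by brute force for a small $\ell$. The sketch below is only an illustration: the helper names `bag_projection` and `four_cycle_instances` are hypothetical, the relation names `E12`, `E23`, `E34`, `E41` are an assumed reading of the four relation symbols, and the evaluation of $\varphi\langle S\rangle$ simply projects every atom to the bag and joins the projections, without any attempt at efficiency.

```python
# Atoms of the 4-cycle query; the relation names are an assumption of this sketch.
ATOMS = [
    ("E12", ("x1", "x2")),
    ("E23", ("x2", "x3")),
    ("E34", ("x3", "x4")),
    ("E41", ("x4", "x1")),
]


def bag_projection(atoms, structure, bag):
    """Evaluate phi<S> for S = bag: project every atom to the bag variables and
    join the projections. Assumes no atom repeats a variable and that every
    bag variable occurs in some atom. Returns tuples ordered as in `bag`."""
    partial = [{}]  # set of joinable partial mappings, starting with the empty one
    for rel_name, atom_vars in atoms:
        keep = [i for i, v in enumerate(atom_vars) if v in bag]
        projected = {tuple(t[i] for i in keep) for t in structure[rel_name]}
        extended = []
        for h in partial:
            for tup in projected:
                g = {atom_vars[i]: c for i, c in zip(keep, tup)}
                # h and g are joinable if they agree on their common variables
                if all(h.get(v, c) == c for v, c in g.items()):
                    extended.append({**h, **g})
        partial = extended
    return {tuple(h[v] for v in bag) for h in partial}


def four_cycle_instances(l):
    """Build the worst-case instance A and its two parts A', A'' for [l] = {1..l}."""
    L = range(1, l + 1)
    l_a = {(i, "a") for i in L}   # [l] x {a}
    b_l = {("b", i) for i in L}   # {b} x [l]
    l_b = {(i, "b") for i in L}   # [l] x {b}
    a_l = {("a", i) for i in L}   # {a} x [l]
    A = {"E12": l_a | b_l, "E34": l_a | b_l, "E23": l_b | a_l, "E41": l_b | a_l}
    A1 = {"E12": b_l, "E34": b_l, "E23": l_b, "E41": l_b}   # A'
    A2 = {"E12": l_a, "E34": l_a, "E23": a_l, "E41": a_l}   # A''
    return A, A1, A2
```

With `A, A1, A2 = four_cycle_instances(5)`, the call `bag_projection(ATOMS, A, ("x1", "x2", "x3"))` returns a set whose size grows quadratically in $\ell$, whereas the same bag on `A1` and the bag `("x2", "x3", "x4")` on `A2` yield only $\ell$ tuples each. Passing all four variables as the bag returns the full query result, which allows checking that the results on `A1` and `A2` partition the result on `A`.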
As both decompositions are free-connex, this also leads to a constant delay enumeration algorithm for $\varphi$ with $O(\|\mathcal{A}\|^{3/2})$ time preprocessing, which improves the $O(\|\mathcal{A}\|^2)$ preprocessing time that follows from using one decomposition.

In general, whether such a data-dependent decomposition is possible is determined by the submodular width $\mathrm{subw}(\varphi)$ of the query. The notion of submodular width was introduced in [35]. To present its definition, we need the following terminology. A function $g : 2^{\mathrm{vars}(\varphi)} \to \mathbb{R}_{\geq 0}$ is

• monotone if $g(U) \leq g(V)$ for all $U \subseteq V \subseteq \mathrm{vars}(\varphi)$.
• edge-dominated if $g(\mathrm{vars}(\alpha)) \leq 1$ for every atom $\alpha \in \mathrm{atoms}(\varphi)$.
• submodular if $g(U) + g(V) \geq g(U \cap V) + g(U \cup V)$ for every $U, V \subseteq \mathrm{vars}(\varphi)$.

We denote by $\mathcal{S}(\varphi)$ the set of all monotone, edge-dominated, submodular functions $g : 2^{\mathrm{vars}(\varphi)} \to \mathbb{R}_{\geq 0}$ that satisfy $g(\emptyset) = 0$, and by $\mathcal{T}(\varphi)$ the set of all tree decompositions of $\varphi$. The submodular width of a conjunctive query $\varphi$ is

\[ \mathrm{subw}(\varphi) := \sup_{g \in \mathcal{S}(\varphi)} \; \min_{(T, \chi) \in \mathcal{T}(\varphi)} \; \max_{t \in V(T)} \; g(\chi(t)). \qquad (5) \]

In particular, if the submodular width of $\varphi$ is bounded by $w$, then for every submodular function $g$ there is a tree decomposition in which every bag $B$ satisfies $g(B) \leq w$.

It is known that $\mathrm{subw}(\varphi) \leq \mathrm{fhw}(\varphi)$ for all queries $\varphi$ [35, Proposition 3.7]. Moreover, there is a constant $c$ and a family of queries $\varphi$ such that $\mathrm{subw}(\varphi) \leq c$ is bounded and $\mathrm{fhw}(\varphi) = \Omega(\sqrt{\log \|\varphi\|})$ is unbounded [34, 35]. The main result in [35] is that the submodular width characterises the tractability of Boolean CQs in the following sense.

Theorem 3.4 ([35]).
(1) There is a computable function $f$ and an algorithm that receives a Boolean CQ $\varphi$, $\mathrm{subw}(\varphi)$, and a $\sigma(\varphi)$-structure $\mathcal{A}$ and evaluates $\varphi$ on $\mathcal{A}$ in time $f(\varphi)\, \|\mathcal{A}\|^{O(\mathrm{subw}(\varphi))}$.
(2) Let $\Phi$ be a recursively enumerable class of Boolean, self-join-free CQs of unbounded submodular width.
Assuming the exponential time hypothesis (ETH) there is no algorithm which, upon input of a query $\varphi \in \Phi$ and a structure $\mathcal{A}$, evaluates $\varphi$ on $\mathcal{A}$ in time $\|\mathcal{A}\|^{o(\mathrm{subw}(\varphi)^{1/4})}$.

The free-connex submodular width of a conjunctive query $\varphi$ is defined in a similar way as submodular width, but this time ranges over the set $\mathrm{fc}\mathcal{T}(\varphi)$ of all free-connex tree decompositions of $\varphi$ (it is easy to see that we can assume that $\mathrm{fc}\mathcal{T}(\varphi)$ is finite):

\[ \text{fc-subw}(\varphi) := \sup_{g \in \mathcal{S}(\varphi)} \; \min_{(T, \chi) \in \mathrm{fc}\mathcal{T}(\varphi)} \; \max_{t \in V(T)} \; g(\chi(t)). \qquad (6) \]

Note that if $\varphi$ is either quantifier-free or Boolean, we have $\text{fc-subw}(\varphi) = \mathrm{subw}(\varphi)$. In general, this is not always the case. Consider for example the following quantified version $\varphi' := \exists x_1 \exists x_3\, \varphi$ of the quantifier-free 4-cycle query $\varphi$. Here we have $\mathrm{subw}(\varphi') = \tfrac{3}{2}$, but $\text{fc-subw}(\varphi') = 2$: one can verify $\text{fc-subw}(\varphi') \geq$ […] $\{x_1, x_2, x_3, x_4\}$ and taking the submodular function $g(U) :=$ […] $|U|$.

Now we are ready to state the main theorem of this paper.

Theorem 3.5.

For every $\delta > 0$ and $w \geq 1$ there is a computable function $f$ and an algorithm which receives a CQ $\varphi$ with $\text{fc-subw}(\varphi) \leq w$ and a $\sigma(\varphi)$-structure $\mathcal{A}$ and computes within $t_p = f(\varphi)\, \|\mathcal{A}\|^{(2+\delta)w}$ preprocessing time and space $f(\varphi)\, \|\mathcal{A}\|^{(1+\delta)w}$ a data structure that allows one to
(i) enumerate $\llbracket \varphi \rrbracket^{\mathcal{A}}$ with $f(\varphi)$ delay and
(ii) test for a given tuple (or, mapping) $b$ if $b \in \llbracket \varphi \rrbracket^{\mathcal{A}}$ within $f(\varphi)$ testing time.

The following corollary is an immediate consequence of Theorem 3.5 and Theorem 3.4. A class $\Phi$ of CQs is said to be of bounded free-connex submodular width if there exists a number $w$ such that $\text{fc-subw}(\varphi) \leq w$ for all $\varphi \in \Phi$. And by an algorithm for $\Phi$ that enumerates with constant delay after FPT-preprocessing we mean an algorithm that receives a query $\varphi \in \Phi$ and a $\sigma(\varphi)$-structure $\mathcal{A}$ and spends $f(\varphi)\, \|\mathcal{A}\|^{O(1)}$ preprocessing time and then enumerates $\llbracket \varphi \rrbracket^{\mathcal{A}}$ with delay $f(\varphi)$, for a computable function $f$.

Corollary 3.6.
(1) For every class $\Phi$ of CQs of bounded free-connex submodular width, there is an algorithm for $\Phi$ that enumerates with constant delay after FPT-preprocessing.
(2) Let $\Phi$ be a recursively enumerable class of quantifier-free self-join-free CQs and assume that the exponential time hypothesis (ETH) holds. Then there is an algorithm for $\Phi$ that enumerates with constant delay after FPT-preprocessing if, and only if, $\Phi$ has bounded free-connex submodular width.

To prove Theorem 3.5, we make use of Marx's splitting routine for queries of bounded submodular width. In the following, we will adapt the main definitions and concepts from [35] to our notions. While doing this, we provide the following additional technical contributions: First, we give a detailed time and space analysis of the algorithm and improve the runtime of the consistency algorithm [35, Lemma 4.5] from quadratic to linear (see Lemma 4.2).
Second, we fix an oversight in [35, Lemma 4.12] by establishing strong $M$-consistency (unfortunately, this fix incurs a blow-up in running time). Afterwards we prove our main theorem, where the non-Boolean setting requires us to relax Marx's partition into refinements (Lemma 4.5) so that the subinstances are no longer disjoint.

Let $\varphi$ be a quantifier-free CQ with $\mathrm{vars}(\varphi) = \{x_1, \ldots, x_k\}$, and let $\sigma := \sigma(\varphi)$. For every $S = \{x_{i_1}, \ldots, x_{i_\ell}\} \subseteq \mathrm{vars}(\varphi)$ where $i_1 < \cdots < i_\ell$ we set $x_S := (x_{i_1}, \ldots, x_{i_\ell})$ and let $R_S \notin \sigma$ be a fresh $\ell$-ary relation symbol. For every collection $\mathfrak{s} \subseteq 2^{\mathrm{vars}(\varphi)}$ we let

\[ \sigma_{\mathfrak{s}} := \sigma \cup \{R_S : S \in \mathfrak{s}\} \quad \text{and} \qquad (7) \]
\[ \varphi_{\mathfrak{s}} := \varphi \wedge \bigwedge_{S \in \mathfrak{s}} R_S(x_S). \qquad (8) \]

A refinement of $\varphi$ and a $\sigma$-structure $\mathcal{A}$ is a pair $(\mathfrak{s}, \mathcal{B})$, where $\mathfrak{s} \subseteq 2^{\mathrm{vars}(\varphi)}$ is closed under taking subsets and $\mathcal{B}$ is a $\sigma_{\mathfrak{s}}$-expansion of $\mathcal{A}$. Note that if $(\mathfrak{s}, \mathcal{B})$ is a refinement of $\varphi$ and $\mathcal{A}$, then $\llbracket \varphi_{\mathfrak{s}} \rrbracket^{\mathcal{B}} \subseteq \llbracket \varphi \rrbracket^{\mathcal{A}}$. In the following we will construct refinements that do not change the result relation, i.e., $\llbracket \varphi_{\mathfrak{s}} \rrbracket^{\mathcal{B}} = \llbracket \varphi \rrbracket^{\mathcal{A}}$. Subsequently, we will split refinements in order to partition the query result.

The following definition collects useful properties of refinements. Recall from Section 2 that for a CQ $\psi$ and a structure $\mathcal{B}$, the query result $\llbracket \psi \rrbracket^{\mathcal{B}}$ actually is a set of mappings from $\mathrm{free}(\psi)$ to $B$. For notational convenience we define $\mathcal{R}_S^{\mathcal{B}} := \llbracket R_S(x_S) \rrbracket^{\mathcal{B}}$ and use the set $\mathcal{R}_S^{\mathcal{B}}$ of mappings instead of the relation $R_S^{\mathcal{B}}$. In particular, by addressing/inserting/deleting a mapping $h : S \to B$ from $\mathcal{R}_S^{\mathcal{B}}$ we mean addressing/inserting/deleting the tuple $(h(x_{i_1}), \ldots, h(x_{i_\ell}))$ from $R_S^{\mathcal{B}}$, where $(x_{i_1}, \ldots, x_{i_\ell}) = x_S$.

Definition 4.1.

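Let $\varphi$ be a quantifier-free $\sigma$-CQ, $\mathcal{A}$ a $\sigma$-structure, $(\mathbf{s}, \mathcal{B})$ a refinement of $\varphi$ and $\mathcal{A}$, and $M$ an integer.

1. The refinement $(\mathbf{s}, \mathcal{B})$ is consistent if
$$R_S^{\mathcal{B}} = \llbracket\varphi_{\mathbf{s}}\langle S\rangle\rrbracket^{\mathcal{B}} \quad\text{for all } S \in \mathbf{s}, \text{ and} \tag{9}$$
$$R_S^{\mathcal{B}} = \pi_S\bigl(R_T^{\mathcal{B}}\bigr) \quad\text{for all } S, T \in \mathbf{s} \text{ with } S \subset T. \tag{10}$$

2. The refinement $(\mathbf{s}, \mathcal{B})$ is $M$-consistent if it is consistent and
$$S \in \mathbf{s} \iff \text{for all } T \subseteq S:\ \bigl|\llbracket\varphi_{\mathbf{s}}\langle T\rangle\rrbracket^{\mathcal{B}}\bigr| \leq M. \tag{11}$$

3. The refinement $(\mathbf{s}, \mathcal{B})$ is strongly $M$-consistent if it is $M$-consistent and
$$S \in \mathbf{s},\ T \in \mathbf{s},\ (S \cup T) \notin \mathbf{s} \implies \bigl|\llbracket\varphi_{\mathbf{s}}\langle S \cup T\rangle\rrbracket^{\mathcal{B}}\bigr| > M. \tag{12}$$

Lemma 4.2.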

There is an algorithm that receives a refinement $R = (\mathbf{s}, \mathcal{B})$ of $\varphi$ and $\mathcal{A}$ and computes in time $O(|\mathbf{s}| \cdot \|\mathcal{B}\|)$ a consistent refinement $(\mathbf{s}, \mathcal{B}')$ with $R_S^{\mathcal{B}'} \subseteq R_S^{\mathcal{B}}$ for all $S \in \mathbf{s}$ and $\llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}'} = \llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}}$.

Proof. We start by letting $\mathcal{B}' := \mathcal{B}$ and then proceed by iteratively modifying $\mathcal{B}'$. We first establish the first consistency requirement (9) by removing from every $R_S^{\mathcal{B}'}$ all mappings $h$ such that $h \notin \llbracket\varphi_{\mathbf{s}}\langle S\rangle\rrbracket^{\mathcal{B}'}$. To ensure the second consistency requirement (10), the algorithm iteratively deletes mappings in $R_S^{\mathcal{B}'}$ that do not extend to larger mappings in $R_T^{\mathcal{B}'}$ (for all $S \subset T \in \mathbf{s}$). Note that removing a mapping from $R_T^{\mathcal{B}'}$ might shrink the set $\llbracket\varphi_{\mathbf{s}}\langle S'\rangle\rrbracket^{\mathcal{B}'}$ for sets $S' \in \mathbf{s}$ that have a nonempty intersection with $S$. In this case, we also have to delete affected mappings from $R_{S'}^{\mathcal{B}'}$ in order to ensure that $R_{S'}^{\mathcal{B}'} = \llbracket\varphi_{\mathbf{s}}\langle S'\rangle\rrbracket^{\mathcal{B}'}$. These steps are iterated until the refinement is consistent. It is clear that the refinement does not exclude tuples from the query result, i.e., the final structure $\mathcal{B}'$ satisfies $\llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}'} = \llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}}$. To see that this can be achieved in time linear in $|\mathbf{s}| \cdot \sum_{S \in \mathbf{s}} |R_S^{\mathcal{B}}|$, we formulate the problem as a set of Horn clauses. The consistent refinement can then be computed by applying any linear-time unit propagation algorithm (cf., e.g., [20]). For every $S \in \mathbf{s}$ and every mapping $h \in R_S^{\mathcal{B}}$ we introduce a Boolean variable $d_S^h$ which expresses that, in order to achieve consistency, $h$ has to be deleted from $R_S^{\mathcal{B}}$.
The Horn formula contains, for every $S, T \in \mathbf{s}$ with $S \subset T$, the clauses

$$d_S^g \leftarrow \bigwedge \bigl\{ d_T^h : h \in R_T^{\mathcal{B}},\ \pi_S(h) = g \bigr\} \quad\text{for all } g \in R_S^{\mathcal{B}}, \text{ and} \tag{13}$$
$$d_T^h \leftarrow d_S^g \quad\text{for all } h \in R_T^{\mathcal{B}},\ g \in R_S^{\mathcal{B}},\ \pi_S(h) = g. \tag{14}$$

The first type of clauses ensures that when a mapping $g$ with domain $S$ does not extend to a tuple $h$ with domain $T \supset S$, then it will be excluded from $R_S^{\mathcal{B}'}$. The second type of clauses ensures that for all $T \in \mathbf{s}$ we have $R_T^{\mathcal{B}'} = \llbracket\varphi_{\mathbf{s}}\langle T\rangle\rrbracket^{\mathcal{B}'}$. Note that the size of the resulting Horn formula is bounded by $O\bigl(|\mathbf{s}| \cdot \sum_{S \in \mathbf{s}} |R_S^{\mathcal{B}}|\bigr)$. Now we apply a linear-time unit propagation algorithm to find a solution of minimum weight. If the formula is unsatisfiable, we know that $\llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}} = \emptyset$ and can safely set $R_S^{\mathcal{B}'} = \emptyset$ for all $S \in \mathbf{s}$. Otherwise, we obtain a minimal satisfying assignment $\beta$ that sets a variable $d_S^h$ to true if, and only if, $h$ has to be deleted from $R_S^{\mathcal{B}}$. Thus we set $R_S^{\mathcal{B}'} := R_S^{\mathcal{B}} \setminus \{h : \beta(d_S^h) = 1\}$. By minimality we have $\llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}'} = \llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}}$.

Lemma 4.3.

Let $\varphi$ be a quantifier-free CQ, let $\mathcal{A}$ be a $\sigma(\varphi)$-structure where the largest relation contains $m$ tuples, and let $M \geq m$. There is an algorithm that computes in time $O(2^{|\text{vars}(\varphi)|} \cdot M^2)$ and space $O(2^{|\text{vars}(\varphi)|} \cdot M)$ a strongly $M$-consistent refinement $(\mathbf{s}, \mathcal{B})$ that satisfies $\llbracket\varphi\rrbracket^{\mathcal{A}} = \llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}}$.

Proof. The pseudocode of the algorithm is shown in Figure 1. For computing the strongly $M$-consistent refinement we first compute all sets $S$ where for all $T \subseteq S$ we have $|\llbracket\varphi_{\mathbf{s}}\langle T\rangle\rrbracket^{\mathcal{B}}| \leq M$; as in [35], we say that such sets $S$ are $M$-small. First note that the empty set is $M$-small. For nonempty sets $S$ we know that $S$ is only $M$-small if for every $x \in S$ the set $S \setminus \{x\}$ is $M$-small and hence already included in $\mathbf{s}$. If this is the case, then $\llbracket\varphi_{\mathbf{s}}\langle S\rangle\rrbracket^{\mathcal{B}}$ can be computed in time $O(M \cdot n)$ by testing for every $h \in R_{S \setminus \{x\}}^{\mathcal{B}}$ (for an arbitrary $x \in S$) and every element $c$ in the structure's universe, whether $h \cup \{(x, c)\} \in \llbracket\varphi_{\mathbf{s}}\langle S\rangle\rrbracket^{\mathcal{B}}$. If $|\llbracket\varphi_{\mathbf{s}}\langle S\rangle\rrbracket^{\mathcal{B}}| \leq M$, then we include $S$ and $R_S^{\mathcal{B}} := \llbracket\varphi_{\mathbf{s}}\langle S\rangle\rrbracket^{\mathcal{B}}$ into our current refinement. Afterwards, we want to satisfy the condition on strong $M$-consistency (12) by trying all pairs of $M$-small sets $S$ and $T$. This is the bottleneck of our algorithm and requires time $O(|R_S^{\mathcal{B}}| \cdot |R_T^{\mathcal{B}}|) \leq O(M^2)$. In the third step we apply Lemma 4.2 to enforce consistency of the current refinement. In particular, every set $S \cup T$ that was found in step 2 becomes $M$-small. Note that after deleting tuples to ensure consistency, new sets may become $M$-small. Therefore, we have to repeat steps 1–3 until no more sets become $M$-small. Overall, we repeat the outer loop at most $2^k$ times, step 1 takes time $2^{O(k)} \cdot M \cdot n$, step 2 takes time $2^{O(k)} \cdot M^2$ and step 3 takes time $2^{O(k)} \cdot M$. Since $n \leq M$ this leads to the required running time.

    INPUT: quantifier-free CQ ϕ(x_1, …, x_k), σ(ϕ)-structure A
    s ← ∅; B ← A
    repeat
        for ℓ = 1, …, k do                          ▷ Step 1: Ensure condition (11).
            for S = {x_{i_1}, …, x_{i_ℓ}} ⊆ vars(ϕ) do
                if S ∉ s and S \ {x} ∈ s for all x ∈ S then
                    R_S^B ← ∅
                    Choose x ∈ S arbitrary
                    for h ∈ R_{S\{x}}^B and c ∈ A do
                        if h ∪ {(x, c)} ∈ ⟦ϕ_s⟨S⟩⟧^B then
                            R_S^B ← R_S^B ∪ {h ∪ {(x, c)}}
                    if |R_S^B| ≤ M then
                        s ← s ∪ {S}
        for S, T ∈ s such that S ∪ T ∉ s do         ▷ Step 2: Ensure condition (12).
            for g ∈ R_S^B and h ∈ R_T^B do
                if g ⋈ h ∈ ⟦ϕ_s⟨S ∪ T⟩⟧^B then
                    R_{S∪T}^B ← R_{S∪T}^B ∪ {g ⋈ h}
                if |R_{S∪T}^B| > M then break
            if |R_{S∪T}^B| ≤ M then
                s ← s ∪ {S ∪ T}
        (s, B) ← Consistent(s, B)                   ▷ Step 3: Apply Lemma 4.2 to ensure (9), (10).
    until s remains unchanged
    return (s, B)

Figure 1: Computing a strongly $M$-consistent refinement

The key step in the proof of Theorem 3.5 is to compute $f(\varphi)$ strongly $M$-consistent refinements $(\mathbf{s}_i, \mathcal{B}_i)$ of $\varphi$ and $\mathcal{A}$ such that $\llbracket\varphi\rrbracket^{\mathcal{A}} = \bigcup_i \llbracket\varphi_{\mathbf{s}_i}\rrbracket^{\mathcal{B}_i}$. In addition to being strongly $M$-consistent, we want the structures $\mathcal{B}_i$ to be uniform in the sense that the degree of tuples (i.e., the number of extensions) is roughly the average degree. We make this precise in a moment, but for illustration it might be helpful to consult the example from Section 3.2 again. In every relation in $\mathcal{A}$ there is one vertex ($a$ or $b$) of out-degree $\ell$ and there are $\ell$ vertices of out-degree 1. Hence the average out-degree is $2\ell/(\ell + 1)$ and the vertex degrees are highly imbalanced.
However, after splitting the instance into $\mathcal{A}'$ and $\mathcal{A}''$, in every relation, all vertices have either out-degree $\ell$ or 1 and the out-degree of every vertex matches the average out-degree of the corresponding relation. The next definition generalises this to tuples of variables. We call a refinement $(\mathbf{s}, \mathcal{B})$ non-trivial if every additional relation in the expansion $\mathcal{B}$ is non-empty. For a non-trivial consistent refinement $(\mathbf{s}, \mathcal{B})$ and $S, T \in \mathbf{s}$, $S \subseteq T$, we let

$$\text{avgdeg}(S, T) := |R_T^{\mathcal{B}}| \,/\, |R_S^{\mathcal{B}}| \quad\text{and} \tag{15}$$
$$\text{maxdeg}(S, T) := \max_{g \in R_S^{\mathcal{B}}} \bigl|\bigl\{ h \in R_T^{\mathcal{B}} : \pi_S(h) = g \bigr\}\bigr|. \tag{16}$$

Note that consistency ensures that these numbers are well-defined and non-zero. Furthermore, we can compute them from $(\mathbf{s}, \mathcal{B})$ in time $O(|\mathbf{s}| \cdot \|\mathcal{B}\|)$. By definition we have $\text{maxdeg}(S, T) \geq \text{avgdeg}(S, T)$. The next definition states that the maximum degree does not deviate too much from the average degree.
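As a concrete illustration of (15) and (16), the following Python sketch computes both quantities for a single pair $S \subseteq T$. It is not taken from the paper: the representation of relations as sets of tuples over a fixed variable order, the function names `project` and `degrees`, and the small edge relation (one vertex of out-degree $\ell = 4$ plus four vertices of out-degree 1, mimicking the degree pattern described above) are assumptions made purely for illustration.

```python
from collections import Counter


def project(row, schema_t, schema_s):
    """Restrict a tuple over schema_t to the variables in schema_s (the map pi_S)."""
    position = {var: i for i, var in enumerate(schema_t)}
    return tuple(row[position[var]] for var in schema_s)


def degrees(r_s, schema_s, r_t, schema_t):
    """Return (avgdeg(S, T), maxdeg(S, T)) for non-empty, consistent R_S and R_T."""
    # Count, for every g in R_S, how many h in R_T project onto g.
    extensions = Counter(project(h, schema_t, schema_s) for h in r_t)
    avgdeg = len(r_t) / len(r_s)
    maxdeg = max(extensions[g] for g in r_s)
    return avgdeg, maxdeg


# Hypothetical example data: vertex "a" has out-degree 4, u0..u3 have out-degree 1.
ell = 4
r_t = {("a", f"c{i}") for i in range(ell)} | {(f"u{i}", "v") for i in range(ell)}
r_s = {project(h, ("x", "y"), ("x",)) for h in r_t}  # consistency: R_S = pi_S(R_T)

avgdeg, maxdeg = degrees(r_s, ("x",), r_t, ("x", "y"))
print(avgdeg, maxdeg)
```

On this instance the average degree is $2\ell/(\ell+1) = 1.6$ while the maximum degree is $\ell = 4$, which is the kind of imbalance that the uniformity condition is meant to rule out.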

Definition 4.4.

Let $(\mathbf{s}, \mathcal{B})$ be a non-trivial consistent refinement of $\varphi$ and $\mathcal{A}$, and let $m$ be the number of tuples of the largest relation of $\mathcal{A}$. The refinement $(\mathbf{s}, \mathcal{B})$ is $\varepsilon$-uniform if for all $S, T \in \mathbf{s}$ with $S \subseteq T$ we have $\text{maxdeg}(S, T) \leq m^{\varepsilon} \cdot \text{avgdeg}(S, T)$.

The next lemma uses Marx's splitting routine to obtain a partition into strongly $M$-consistent $\varepsilon$-uniform refinements, for $M := m^c$.

Lemma 4.5.

Let $\varphi$ be a quantifier-free CQ, let $\mathcal{A}$ be a $\sigma(\varphi)$-structure where the largest relation contains $m$ tuples, and let $c \geq 1$ and $\varepsilon > 0$ be real numbers. There is a computable function $f$ and an algorithm that computes in time $O(f(\varphi, c, \varepsilon) \cdot m^{2c})$ and space $O(f(\varphi, c, \varepsilon) \cdot m^{c})$ a sequence of $\ell \leq f(\varphi, c, \varepsilon)$ strongly $m^c$-consistent $\varepsilon$-uniform refinements $(\mathbf{s}_i, \mathcal{B}_i)$ such that $\llbracket\varphi\rrbracket^{\mathcal{A}}$ is the disjoint union of the sets $\llbracket\varphi_{\mathbf{s}_i}\rrbracket^{\mathcal{B}_i}$.

Proof (sketch). We follow the same splitting strategy as in [35], but use the improved algorithm from Lemma 4.3 to ensure strong $m^c$-consistency. Starting with the trivial refinement $(\emptyset, \mathcal{A})$, in each step we first apply Lemma 4.3 to ensure strong $m^c$-consistency. Afterwards, we check whether the current refinement $(\mathbf{s}, \mathcal{B})$ contains sets $S, T \in \mathbf{s}$ that contradict $\varepsilon$-uniformity, i.e., $S \subseteq T$ and $\text{maxdeg}(S, T) > m^{\varepsilon} \cdot \text{avgdeg}(S, T)$. If this is the case, we split the refinement $(\mathbf{s}, \mathcal{B})$ into $(\mathbf{s}, \mathcal{B}')$ and $(\mathbf{s}, \mathcal{B}'')$ such that $R_S^{\mathcal{B}}$ is partitioned into tuples of small degree and tuples of large degree:

$$R_U^{\mathcal{B}'} = R_U^{\mathcal{B}''} := R_U^{\mathcal{B}} \quad\text{for all } U \in \mathbf{s} \setminus \{S\}, \tag{17}$$
$$R_S^{\mathcal{B}'} := \bigl\{ g \in R_S^{\mathcal{B}} : \bigl|\{ h \in R_T^{\mathcal{B}} : \pi_S(h) = g \}\bigr| \leq m^{\varepsilon/2} \cdot \text{avgdeg}(S, T) \bigr\}, \tag{18}$$
$$R_S^{\mathcal{B}''} := \bigl\{ g \in R_S^{\mathcal{B}} : \bigl|\{ h \in R_T^{\mathcal{B}} : \pi_S(h) = g \}\bigr| > m^{\varepsilon/2} \cdot \text{avgdeg}(S, T) \bigr\}. \tag{19}$$

It is clear that $\llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}}$ is the disjoint union of $\llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}'}$ and $\llbracket\varphi_{\mathbf{s}}\rrbracket^{\mathcal{B}''}$ and that the recursion terminates at some point with a sequence of strongly $m^c$-consistent $\varepsilon$-uniform refinements that partition $\llbracket\varphi\rrbracket^{\mathcal{A}}$. It is also not hard to show that the height of the recursion tree is bounded by $2^{O(|\text{vars}(\varphi)|)} \cdot c/\varepsilon$ (see [35, Lemma 4.11]).
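A single split step as in (18) and (19) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: relations are modelled as sets of tuples, the function name `split_by_degree` and the projection callback `proj` are hypothetical, and the value $\varepsilon = 0.5$ is chosen only so that the threshold is visible on a tiny instance.

```python
from collections import Counter


def split_by_degree(r_s, r_t, proj, m, eps):
    """Partition R_S into (low, high) by comparing degrees with m^(eps/2) * avgdeg(S, T)."""
    # Number of extensions in R_T for every tuple of R_S.
    extensions = Counter(proj(h) for h in r_t)
    threshold = m ** (eps / 2) * (len(r_t) / len(r_s))
    low = {g for g in r_s if extensions[g] <= threshold}   # relation R_S in B'
    high = {g for g in r_s if extensions[g] > threshold}   # relation R_S in B''
    return low, high


# Hypothetical example: S = {x}, T = {x, y}; "a" has 4 extensions, u0..u3 have 1 each.
r_t = {("a", f"c{i}") for i in range(4)} | {(f"u{i}", "v") for i in range(4)}
r_s = {(x,) for (x, _) in r_t}
m = len(r_t)  # assumed size of the largest relation

low, high = split_by_degree(r_s, r_t, lambda h: (h[0],), m, eps=0.5)
print(sorted(low), sorted(high))
```

The two parts are disjoint and cover $R_S^{\mathcal{B}}$, so the two resulting subinstances partition the result of the current refinement; in the full routine each part would then be passed to Lemma 4.3 again before the next uniformity check.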
Hence, by Lemma 4.3 the procedure can be implemented in time $O(f(\varphi, c, \varepsilon) \cdot m^{2c})$ and space $O(f(\varphi, c, \varepsilon) \cdot m^{c})$.

The nice thing about $\varepsilon$-uniform and strongly $m^c$-consistent refinements is that they define, for small enough $\varepsilon$, a submodular function $g \in \mathcal{S}(\varphi)$, which in turn guarantees the existence of a tree decomposition with small projections on the bags. The following lemma from [35, Lemma 4.12] provides these functions. However, there is an oversight in Marx's proof and in order to fix this, we have to ensure strong $m^c$-consistency instead of only $m^c$-consistency as stated in [35, Lemma 4.12]. As suggested by Marx (personal communication), an alternative way to achieve strong $m^c$-consistency would be to enforce $m^{2c}$-consistency, which leads to the same runtime guarantees, but requires more space.

Lemma 4.6.

Let $(\mathbf{s}, \mathcal{B})$ be an $\varepsilon$-uniform strongly $m^c$-consistent refinement of $\varphi$ and $\mathcal{A}$, and let $c \geq 1$ and $|\text{vars}(\varphi)|^{-3} \geq \varepsilon > 0$ be real numbers. Then $g_{\mathbf{s},\mathcal{B}} : 2^{\text{vars}(\varphi)} \to \mathbb{R}_{\geq 0}$ is a monotone, edge-dominated, submodular function that satisfies $g_{\mathbf{s},\mathcal{B}}(\emptyset) = 0$:

$$g_{\mathbf{s},\mathcal{B}}(U) := \begin{cases} (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_U^{\mathcal{B}}|\bigr) + h(U) & \text{if } U \in \mathbf{s}, \\ (1 - \varepsilon^{1/3}) \cdot c + h(U) & \text{if } U \notin \mathbf{s}, \end{cases} \tag{20}$$

where $h(U) := 2\varepsilon^{2/3}|U| - \varepsilon|U|^2 \geq 0$ for all $U \subseteq \text{vars}(\varphi)$.

The proof can be copied verbatim from Marx's proof of [35, Lemma 4.12] by using the notion of strong consistency instead of plain consistency. For the reader's convenience, we provide the proof below.

Proof of Lemma 4.6 (Lemma 4.12 in [35]).

The function $h$ is non-negative and monotone in the range $0 \leq |U| \leq 1/\varepsilon^{1/3}$. In particular, $0 \leq h(S) \leq h(T) \leq \varepsilon^{1/3}$ for all $S \subseteq T \subseteq \text{vars}(\varphi)$. Moreover, $h$ is submodular:

$$h(S) + h(T) - h(S \cap T) - h(S \cup T) = 2\varepsilon \cdot |S \setminus T| \cdot |T \setminus S| \geq 0. \tag{21}$$

The monotonicity of $g_{\mathbf{s},\mathcal{B}}$ follows from the monotonicity of $h$ and the $m^c$-consistency of the refinement. To see that $g_{\mathbf{s},\mathcal{B}}$ is edge-dominated, note that $\text{vars}(\alpha)$ is $m^c$-small for every $c \geq 1$ and $\alpha \in \text{atoms}(\varphi)$. Hence, $g_{\mathbf{s},\mathcal{B}}(\text{vars}(\alpha)) \leq (1 - \varepsilon^{1/3}) + h(\text{vars}(\alpha)) \leq 1$. It remains to show that $g_{\mathbf{s},\mathcal{B}}$ is submodular, i.e., that for all $S, T \subseteq \text{vars}(\varphi)$ we have

$$g_{\mathbf{s},\mathcal{B}}(S) + g_{\mathbf{s},\mathcal{B}}(T) - g_{\mathbf{s},\mathcal{B}}(S \cap T) - g_{\mathbf{s},\mathcal{B}}(S \cup T) \geq 0. \tag{22}$$

This is trivial when $S \subseteq T$ or $T \subseteq S$. Thus we can assume that $|S \setminus T| \geq 1$ and $|T \setminus S| \geq 1$, and hence by (21)

$$h(S) + h(T) - h(S \cap T) - h(S \cup T) \geq 2\varepsilon. \tag{$*$}$$

If at least one of $S$ and $T$ is not contained in $\mathbf{s}$, then (22) follows from $\log_m\bigl(|R_U^{\mathcal{B}}|\bigr) \leq c$ and the submodularity of $h$. The remaining case is that $S \in \mathbf{s}$ and $T \in \mathbf{s}$.
Here we have

\begin{align}
& g_{\mathbf{s},\mathcal{B}}(S) + g_{\mathbf{s},\mathcal{B}}(T) \tag{23} \\
&= (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_S^{\mathcal{B}}|\bigr) + (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_T^{\mathcal{B}}|\bigr) + h(S) + h(T) \tag{24} \\
&= (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_S^{\mathcal{B}}|\bigr) + (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_{S \cap T}^{\mathcal{B}}| \cdot \text{avgdeg}(S \cap T, T)\bigr) \tag{25} \\
&\quad + h(S) + h(T) \tag{26} \\
&\geq (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_S^{\mathcal{B}}|\bigr) + (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_{S \cap T}^{\mathcal{B}}|\bigr) \tag{27} \\
&\quad + (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(\text{maxdeg}(S \cap T, T) / m^{\varepsilon}\bigr) + h(S) + h(T) \tag{28} \\
&= (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_{S \cap T}^{\mathcal{B}}|\bigr) + (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_S^{\mathcal{B}}| \cdot \text{maxdeg}(S \cap T, T)\bigr) \tag{29} \\
&\quad - (1 - \varepsilon^{1/3})\,\varepsilon + h(S) + h(T) \tag{30} \\
&\geq (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_{S \cap T}^{\mathcal{B}}|\bigr) + (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_S^{\mathcal{B}}| \cdot \text{maxdeg}(S, S \cup T)\bigr) \tag{31} \\
&\quad - (1 - \varepsilon^{1/3})\,\varepsilon + h(S \cap T) + h(S \cup T) + 2\varepsilon \tag{32} \\
&\geq (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_{S \cap T}^{\mathcal{B}}|\bigr) + (1 - \varepsilon^{1/3}) \cdot \log_m\bigl(|R_{S \cup T}^{\mathcal{B}}|\bigr) \tag{33} \\
&\quad + h(S \cap T) + h(S \cup T) \tag{34} \\
&\geq g_{\mathbf{s},\mathcal{B}}(S \cap T) + g_{\mathbf{s},\mathcal{B}}(S \cup T). \tag{35}
\end{align}

The first inequality holds because of $\varepsilon$-uniformity. The second inequality holds because in general $\text{maxdeg}(X, Y) \geq \text{maxdeg}(X \cup Z, Y \cup Z)$ and because of ($*$). The last inequality holds because $S \cap T \in \mathbf{s}$ by consistency and because of strong $m^c$-consistency we have either $|R_{S \cup T}^{\mathcal{B}}| > m^c$ or $S \cup T \in \mathbf{s}$ (and this is where the new requirement of strong $m^c$-consistency is needed).

Now we are ready to prove our main theorem.

Proof of Theorem 3.5.

We fix $c = (1 + \delta) w$ and let $\varepsilon$ be the minimum of $\bigl(1 - 1/(1 + \delta)\bigr)^3$ and $|\text{vars}(\varphi)|^{-3}$. Suppose that $\varphi$ is of the form $\exists x_1 \cdots \exists x_k\, \tilde{\varphi}$ where $\tilde{\varphi}$ is quantifier-free. We apply Lemma 4.5 to $\tilde{\varphi}$, $\mathcal{A}$, $c$, $\varepsilon$ to obtain in time $O(f(\varphi)\, m^{2c})$ a sequence of $\ell \leq f(\varphi)$ strongly $m^c$-consistent $\varepsilon$-uniform refinements $(\mathbf{s}_i, \mathcal{B}_i)$ such that $\llbracket\tilde{\varphi}\rrbracket^{\mathcal{A}}$ is the disjoint union of $\llbracket\tilde{\varphi}_{\mathbf{s}_1}\rrbracket^{\mathcal{B}_1}, \ldots, \llbracket\tilde{\varphi}_{\mathbf{s}_\ell}\rrbracket^{\mathcal{B}_\ell}$. By Lemma 4.6 we have $g_{\mathbf{s}_i,\mathcal{B}_i} \in \mathcal{S}(\tilde{\varphi}) = \mathcal{S}(\varphi)$ for every $i \in [\ell]$. Hence, by the definition of free-connex submodular width (5), we know that there is a free-connex tree decomposition $(T_i, \chi_i)$ of $\varphi$ such that $g_{\mathbf{s}_i,\mathcal{B}_i}(\chi_i(t)) \leq w$ for every $t \in V(T_i)$. Note that by the choice of $c$, $\varepsilon$ and the non-negativity of $h$ (see Lemma 4.6) we have

$$w = c/(1 + \delta) \leq (1 - \varepsilon^{1/3}) \cdot c < (1 - \varepsilon^{1/3}) \cdot c + h(U). \tag{36}$$

Hence, $g_{\mathbf{s}_i,\mathcal{B}_i}(U) \leq w$ implies $U \in \mathbf{s}_i$ and therefore $|R_U^{\mathcal{B}_i}| = |\llbracket\varphi_{\mathbf{s}_i}\langle U\rangle\rrbracket^{\mathcal{B}_i}| \leq m^c$ by (9) and (11). Thus, every bag of the free-connex tree decomposition $(T_i, \chi_i)$ is small in the $i$-th refinement. However, $(T_i, \chi_i)$ is a tree decomposition of $\varphi$, but not necessarily of $\varphi_{\mathbf{s}_i}$! In fact, $\varphi_{\mathbf{s}_i}$ can be very dense, e.g., if $\mathbf{s}_i = 2^{\text{vars}(\varphi)}$. To take care of this, we thin out the refinement and only keep those atoms and relations that correspond to bags of the decomposition. In particular, for every $i \in [\ell]$ we define $\tilde{\psi}_i := \bigwedge_{t \in V(T_i)} R_{\chi_i(t)}(x_{\chi_i(t)})$ and let $\psi_i := \exists x_1 \cdots \exists x_k\, \tilde{\psi}_i$ be the quantified version. Note that $\psi_i$ is a free-connex acyclic CQ. Additionally, we let $\mathcal{C}_i$ be the $\sigma(\psi_i)$-reduct of $\mathcal{B}_i$. We argue that $\llbracket\tilde{\varphi}_{\mathbf{s}_i}\rrbracket^{\mathcal{B}_i} \subseteq \llbracket\tilde{\psi}_i\rrbracket^{\mathcal{C}_i} \subseteq \llbracket\tilde{\varphi}\rrbracket^{\mathcal{A}}$.
The first inclusion holds because $\tilde{\varphi}_{\mathbf{s}_i}$ and $\mathcal{B}_i$ refine $\tilde{\psi}_i$ and $\mathcal{C}_i$. The second inclusion holds because every atom from $\tilde{\varphi}$ is contained in a bag of the decomposition and is hence covered by an atom in $\tilde{\psi}_i$ because of consistency. It therefore also follows that $\pi_F\bigl(\llbracket\tilde{\varphi}_{\mathbf{s}_i}\rrbracket^{\mathcal{B}_i}\bigr) \subseteq \pi_F\bigl(\llbracket\tilde{\psi}_i\rrbracket^{\mathcal{C}_i}\bigr) \subseteq \pi_F\bigl(\llbracket\tilde{\varphi}\rrbracket^{\mathcal{A}}\bigr)$ for $F := \text{free}(\varphi)$, and hence $\llbracket\varphi_{\mathbf{s}_i}\rrbracket^{\mathcal{B}_i} \subseteq \llbracket\psi_i\rrbracket^{\mathcal{C}_i} \subseteq \llbracket\varphi\rrbracket^{\mathcal{A}}$. Overall, we have that $\llbracket\varphi\rrbracket^{\mathcal{A}} = \bigcup_{i \in [\ell]} \llbracket\psi_i\rrbracket^{\mathcal{C}_i}$, where the union is not necessarily disjoint, each $\psi_i$ is free-connex acyclic, and $\|\mathcal{C}_i\| = O(|\text{vars}(\varphi)| \cdot m^{(1+\delta)w})$. By combining Theorem 3.3 with Theorem 2.1, the theorem follows.

In this paper, we have investigated the enumeration complexity of conjunctive queries and have shown that every class of conjunctive queries of bounded free-connex submodular width admits constant delay enumeration with FPT-preprocessing. To date, these are the largest classes of CQs known to allow efficient enumeration in this sense.

For quantifier-free self-join-free CQs this upper bound is matched by Marx's lower bound [35]. That is, recursively enumerable classes of quantifier-free self-join-free CQs of unbounded free-connex submodular width do not admit constant delay enumeration after FPT-preprocessing (assuming the exponential time hypothesis ETH).

A major future task is to obtain a complete dichotomy, or at least one for all self-join-free CQs. The gray zone for the latter consists of classes of CQs that have bounded submodular width, but unbounded free-connex submodular width. An intriguing example in this gray zone is the $k$-star query with a quantified center, i.e., the query $\psi_k$ of the form $\exists z \bigwedge_{i=1}^{k} R_i(z, x_i)$. Here we have $\text{subw}(\psi_k) = 1$ and $\text{fc-subw}(\psi_k) = k$.
It is open whether the class $\Psi = \{\psi_k : k \in \mathbb{N}_{\geq 1}\}$ admits constant delay enumeration with FPT-preprocessing.

Acknowledgements

Funded by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG), project numbers 316451603 and 414325841.

References

[1] Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. URL: http://webdam.inria.fr/Alice/.
[2] Mahmoud Abo Khamis, Hung Q. Ngo, and Dan Suciu. What do Shannon-type inequalities, submodular width, and disjunctive datalog have to do with one another? In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, pages 429–444, 2017. Full version available at CoRR, abs/1612.02503, 2016 (URL: http://arxiv.org/abs/1612.02503). doi:10.1145/3034786.3056105.
[3] Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length cycles. Algorithmica, 17(3):209–223, 1997. doi:10.1007/BF02523189.
[4] Antoine Amarilli, Pierre Bourhis, and Stefan Mengel. Enumeration on trees under relabelings. Pages 5:1–5:18, 2018. doi:10.4230/LIPIcs.ICDT.2018.5.
[5] Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth. Constant-delay enumeration for nondeterministic document spanners. Pages 22:1–22:19, 2019. doi:10.4230/LIPIcs.ICDT.2019.22.
[6] Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth. Enumeration on trees with tractable combined complexity and efficient updates. In Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2019, Amsterdam, The Netherlands, June 30 – July 5, 2019, pages 89–103, 2019. doi:10.1145/3294052.3319702.
[7] Guillaume Bagan. MSO queries on tree decomposable structures are computable with linear delay. In Computer Science Logic, 20th International Workshop, CSL 2006, 15th Annual Conference of the EACSL, Szeged, Hungary, September 25–29, 2006, Proceedings, pages 167–181, 2006. doi:10.1007/11874683_11.
[8] Guillaume Bagan. Algorithmes et complexité des problèmes d'énumération pour l'évaluation de requêtes logiques. (Algorithms and complexity of enumeration problems for the evaluation of logical queries). PhD thesis, University of Caen Normandy, France, 2009. URL: https://tel.archives-ouvertes.fr/tel-00424232.
[9] Guillaume Bagan, Arnaud Durand, and Etienne Grandjean. On acyclic conjunctive queries and constant delay enumeration. In Proceedings of the 16th Annual Conference of the EACSL, CSL'07, Lausanne, Switzerland, September 11–15, 2007, pages 208–222, 2007. doi:10.1007/978-3-540-74915-8_18.
[10] Catriel Beeri, Ronald Fagin, David Maier, and Mihalis Yannakakis. On the desirability of acyclic database schemes. J. ACM, 30(3):479–513, 1983. doi:10.1145/2402.322389.
[11] Christoph Berkholz, Fabian Gerhardt, and Nicole Schweikardt. Constant delay enumeration for conjunctive queries: a tutorial. SIGLOG News, 7(1):4–33, 2020. URL: https://doi.org/10.1145/3385634.3385636.
[12] Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt. Answering conjunctive queries under updates. In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS'17, Chicago, IL, USA, May 14–19, 2017, pages 303–318, 2017. Full version available at http://arxiv.org/abs/1702.06370. doi:10.1145/3034786.3034789.
[13] Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt. Answering FO+MOD queries under updates on bounded degree databases. ACM Trans. Database Syst., 43(2):7:1–7:32, 2018. doi:10.1145/3232056.
[14] Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt. Answering UCQs under updates and in the presence of integrity constraints. Pages 8:1–8:19, 2018. doi:10.4230/LIPIcs.ICDT.2018.8.
[15] Christoph Berkholz and Nicole Schweikardt. Constant delay enumeration with FPT-preprocessing for conjunctive queries of bounded submodular width. Pages 58:1–58:15, 2019. doi:10.4230/LIPIcs.MFCS.2019.58.
[16] Philip A. Bernstein and Nathan Goodman. Power of natural semijoins. SIAM J. Comput., 10(4):751–771, 1981. doi:10.1137/0210059.
[17] Johann Brault-Baron. De la pertinence de l'énumération : complexité en logiques propositionnelle et du premier ordre. (The relevance of the list: propositional logic and complexity of the first order). PhD thesis, University of Caen Normandy, France, 2013. URL: https://tel.archives-ouvertes.fr/tel-01081392.
[18] Johann Brault-Baron. Hypergraph acyclicity revisited. ACM Comput. Surv., 49(3):54:1–54:26, 2016. doi:10.1145/2983573.
[19] Shaleen Deep and Paraschos Koutris. Compressed representations of conjunctive query results. In Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Houston, TX, USA, June 10–15, 2018, pages 307–322, 2018. doi:10.1145/3196959.3196979.
[20] William F. Dowling and Jean H. Gallier. Linear-time algorithms for testing the satisfiability of propositional horn formulae. J. Log. Program., 1(3):267–284, 1984. doi:10.1016/0743-1066(84)90014-1.
[21] Arnaud Durand and Etienne Grandjean. First-order queries on structures of bounded degree are computable with constant delay. ACM Trans. Comput. Log., 8(4), 2007. doi:10.1145/1276920.1276923.
[22] Arnaud Durand, Nicole Schweikardt, and Luc Segoufin. Enumerating answers to first-order queries over databases of low degree. In Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS'14, Snowbird, UT, USA, June 22–27, 2014, pages 121–131, 2014. doi:10.1145/2594538.2594539.
[23] Arnaud Durand and Yann Strozecki. Enumeration complexity of logical query problems with second-order variables. In Computer Science Logic, 25th International Workshop / 20th Annual Conference of the EACSL, CSL 2011, September 12–15, 2011, Bergen, Norway, Proceedings, pages 189–202, 2011. doi:10.4230/LIPIcs.CSL.2011.189.
[24] Georg Gottlob, Gianluigi Greco, Nicola Leone, and Francesco Scarcello. Hypertree decompositions: Questions and answers. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 – July 01, 2016, pages 57–74, 2016. doi:10.1145/2902251.2902309.
[25] Georg Gottlob, Nicola Leone, and Francesco Scarcello. Hypertree decompositions and tractable queries. J. Comput. Syst. Sci., 64(3):579–627, 2002. doi:10.1006/jcss.2001.1809.
[26] Martin Grohe and Dániel Marx. Constraint solving via fractional edge covers. ACM Trans. Algorithms, 11(1):4:1–4:20, 2014. doi:10.1145/2636918.
[27] Muhammad Idris, Martín Ugarte, and Stijn Vansummeren. The Dynamic Yannakakis Algorithm: Compact and efficient query processing under updates. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14–19, 2017, pages 1259–1274, 2017. doi:10.1145/3035918.3064027.
[28] Ahmet Kara and Dan Olteanu. Covers of query results. Pages 16:1–16:22, 2018. doi:10.4230/LIPIcs.ICDT.2018.16.
[29] Wojciech Kazana and Luc Segoufin. First-order query evaluation on structures of bounded degree. Logical Methods in Computer Science, 7(2), 2011. doi:10.2168/LMCS-7(2:20)2011.
[30] Wojciech Kazana and Luc Segoufin. Enumeration of first-order queries on classes of structures with bounded expansion. In Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2013, New York, NY, USA, June 22–27, 2013, pages 297–308, 2013. doi:10.1145/2463664.2463667.
[31] Wojciech Kazana and Luc Segoufin. Enumeration of monadic second-order queries on trees. ACM Trans. Comput. Log., 14(4):25:1–25:12, 2013. doi:10.1145/2528928.
[32] Dietrich Kuske and Nicole Schweikardt. First-order logic with counting. Pages 1–12, 2017. doi:10.1109/LICS.2017.8005133.
[33] Katja Losemann and Wim Martens. MSO queries on trees: enumerating answers under updates. In Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS '14, Vienna, Austria, July 14–18, 2014, pages 67:1–67:10, 2014. doi:10.1145/2603088.2603137.
[34] Dániel Marx. Tractable structures for constraint satisfaction with truth tables. Theory Comput. Syst., 48(3):444–464, 2011. doi:10.1007/s00224-009-9248-9.
[35] Dániel Marx. Tractable hypergraph properties for constraint satisfaction and conjunctive queries. Journal of the ACM (JACM), Volume 60, Issue 6, Article No. 42, November 2013. doi:10.1145/2535926.
[36] Matthias Niewerth. MSO queries on trees: Enumerating answers under updates using forest algebras. In Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, Oxford, UK, July 09–12, 2018, pages 769–778, 2018. doi:10.1145/3209108.3209144.
[37] Matthias Niewerth and Luc Segoufin. Enumeration of MSO queries on strings with constant delay and logarithmic updates. In Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Houston, TX, USA, June 10–15, 2018, pages 179–191, 2018. doi:10.1145/3196959.3196961.
[38] Dan Olteanu and Maximilian Schleich. Factorized databases. SIGMOD Record, 45(2):5–16, 2016. doi:10.1145/3003665.3003667.
[39] Dan Olteanu and Jakub Závodný. Size bounds for factorised representations of query results. ACM Trans. Database Syst., 40(1):2:1–2:44, 2015. doi:10.1145/2656335.
[40] Francesco Scarcello. From hypertree width to submodular width and data-dependent structural decompositions. In Proceedings of the 26th Italian Symposium on Advanced Database Systems, Castellaneta Marina (Taranto), Italy, June 24–27, 2018, 2018. URL: http://ceur-ws.org/Vol-2161/paper24.pdf.
[41] Nicole Schweikardt, Luc Segoufin, and Alexandre Vigny. Enumeration for FO queries over nowhere dense graphs. In