Online Sketch-based Query Optimization
Yesdaulet Izenov, Asoke Datta, Florin Rusu, Jun Hyung Shin
{yizenov, adatta2, frusu, jshin33}@ucmerced.edu
University of California, Merced
February 2021

Abstract
Cost-based query optimization remains a critical task in relational databases even after decades of research and industrial development. Query optimizers rely on a large range of statistical synopses – including attribute-level histograms and table-level samples – for accurate cardinality estimation. As the complexity of selection predicates and the number of join predicates increase, two problems arise. First, statistics cannot be incrementally composed to effectively estimate the cost of the sub-plans generated in plan enumeration. Second, small errors are propagated exponentially through join operators, which can lead to severely sub-optimal plans.

In this paper, we introduce COMPASS, a novel query optimization paradigm for in-memory databases based on a single type of statistics—Fast-AGMS sketches. In COMPASS, query optimization and execution are intertwined. Selection predicates and sketch updates are pushed down and evaluated online during query optimization. This allows Fast-AGMS sketches to be computed only over the relevant tuples—which enhances cardinality estimation accuracy. Plan enumeration is performed over the query join graph by incrementally composing attribute-level sketches—not by building a separate sketch for every sub-plan.

We prototype COMPASS in MapD – an open-source parallel database – and perform extensive experiments over the complete JOB benchmark. The results prove that COMPASS generates better execution plans – both in terms of cardinality and runtime – compared to four other database systems. Overall, COMPASS achieves a speedup ranging from 1.35X to 11.28X in cumulative query execution time over the considered competitors.
Consider query 6a from the JOB benchmark [30]:
SELECT MIN(k.keyword), MIN(n.name), MIN(t.title)
FROM cast_info ci, keyword k, movie_keyword mk, name n, title t
WHERE -- selection predicates
      k.keyword = 'marvel-cinematic-universe'
  AND n.name LIKE '%Downey%Robert%'
  AND t.production_year > 2010
      -- join predicates
  AND k.id = mk.keyword_id
  AND t.id = mk.movie_id
  AND t.id = ci.movie_id
  AND ci.movie_id = mk.movie_id
  AND n.id = ci.person_id
The query has 3 selection predicates – point, subset, and range – and joins 5 tables with 5 join predicates—there is a triangle subquery between tables t, mk, and ci. The corresponding join graph is depicted in Figure 1. For each join, the graph contains a named edge e1–e5 that connects the tables involved in the join predicate. For example, edge e1 represents the join predicate k.id = mk.keyword_id.

Figure 1: Join graph and corresponding execution plans for query JOB 6a. The numbers represent cardinality.

Figure 1 also includes the execution plans together with their cost – the total cardinality of the intermediate results – for COMPASS and the four other databases considered in the paper. Although all the plans are left-deep trees, their cost ranges from thousands of tuples to hundreds of millions. This is entirely due to the statistics used for cardinality estimation. MapD [76] does not use any statistics, thus its cost is orders of magnitude higher. The plan is determined by sorting the tables in decreasing order of their size—number of tuples. MonetDB [77] has a rule-based optimizer with minimum support for statistics [19], which generates a better plan. The reason why both of these systems have primitive optimizers is that they are relatively “young” and are targeted at modern architectures. They try to compensate for bad plans with highly-optimized execution engines that make use of extensive in-memory processing supported by massive multithread parallelism and vectorized instructions.
However, this approach is clearly limited. PostgreSQL [78] and the industrial-grade DBMS A – name anonymized for legal reasons – are “mature” databases with advanced query optimizers. In order to find the much better plan, they use a large variety of statistics. Histograms, most frequent values, and the number of distinct values are used to estimate the selectivity of the point predicate on attribute k.keyword and of the range predicate on t.production_year. The subset LIKE predicate on n.name is estimated with table-level samples. Estimating join cardinality requires correlated statistics on the join attributes. While such statistics exist, e.g., correlated samples [22, 62, 29], they require the existence of indexes on every join attribute combination, which severely limits their applicability in the case of multi-way joins. As a result, even advanced optimizers rely on crude formulas that assume uniformity, inclusion, and independence—which are likely to produce highly sub-optimal execution plans [28]. Since implementing and maintaining this many statistics requires considerable effort, it is understandable that only mature systems implement them.
Problem.
We investigate how to design a lightweight – yet effective – query optimizer for modern in-memory databases. We have two design principles. First, we aim to capitalize on the highly-parallel execution engine in the query optimization process. Since query execution is already fast, it is challenging to minimize the overhead incurred by the additional optimization. Second, the type and number of synopses included in the optimizer have to be minimal. Our goal is to employ a single type of synopsis built exclusively for single attributes and without the requirement of additional data structures such as indexes. The challenge is to design a composable – and consistent – synopsis that provides incremental cardinality estimates for the sub-plans generated in plan enumeration.
COMPASS query optimizer.
We introduce the online sketch-based COMPASS query optimizer. Fast-AGMS sketches [7] are the only statistics present in COMPASS. These sketches are a type of correlated synopses for join cardinality estimation [47, 49] that use small space, can be computed efficiently in a single scan over the data, are linearly composable, and – more importantly – have statistically high accuracy. These properties allow Fast-AGMS sketches to be computed online in COMPASS by leveraging the optimized parallel execution engine in modern databases. This is realized by decomposing query processing into two stages performed before and after the optimization. In the first stage, selection predicates are pushed down and Fast-AGMS sketches are built concurrently only over the relevant tuples. Sketches are built for each two-way join independently—not for every combination of tables. In the query optimization stage, plan enumeration is performed over the join graph by incrementally composing the corresponding two-way join sketches in order to estimate the cardinality of multi-way joins. The optimal join ordering is finally passed to the execution engine to finalize the query. As shown in Figure 1, COMPASS identifies a plan as good as PostgreSQL and DBMS A, while relying exclusively on a single synopsis—Fast-AGMS sketches. In addition to the novel query optimization paradigm, we make the following technical contributions:
• We present a systematic approach of using sketches for join cardinality estimation in a query optimizer. This includes two-way and multi-way joins. We do this for two types of sketches—AGMS [1] and Fast-AGMS [7].
• We introduce two novel strategies to extend Fast-AGMS sketches to multi-way join cardinality estimation. The first strategy – sketch partitioning – is a theoretically sound estimator for a given multi-way join. Since it does not support composition, sketch partitioning is not scalable for join order enumeration.
The second strategy – sketch merging – addresses scalability by incrementally creating multi-way sketches from two-way sketches. Although this is done heuristically for each multi-way join taken separately, all the multi-way joins with a given size are equally impacted. This property guarantees estimation consistency in plan enumeration.
• We prototype COMPASS in MapD and perform extensive experiments over the complete JOB benchmark—113 queries. The results prove the reduced overhead COMPASS incurs – below 500 milliseconds – while generating similar or better execution plans compared to the four database systems included in Figure 1. COMPASS outperforms the other databases both in terms of the number of queries it obtains the best result on and in cumulative workload execution time.
Outline.
The paper is organized as follows. Background information on cost-based query optimization and sketches is given in Section 2. A high-level overview of COMPASS is presented in Section 3, followed by the technical details of sketch-based cardinality estimation in Section 4. The novel Fast-AGMS sketches for multi-way joins are introduced in Section 5. In Section 6, we show how the sketches are integrated in a typical enumeration algorithm. The empirical evaluation of COMPASS is detailed in Section 7. We discuss related work in Section 8 and conclude with future work directions in Section 9.
Cost-based query optimization.
The query optimization problem [67, 30, 28, 6] consists of finding the best execution plan – which typically corresponds to the one with the fastest execution time – for a given query. The search space is defined over all the valid plans – combinations of relational algebra operators – which can answer the query correctly. The number of potential plans grows exponentially with the number of tables. Thus, inspecting all of them is not practical for a large number of tables.
Plan enumeration is the procedure that defines the plans in the search space. Since the execution time of a plan cannot be determined without running it – which defeats the purpose – alternative cost functions are defined. The most common cost function is the total size – or cardinality – of the intermediate results produced by all the operators in the plan. This function captures the correlation between the amount of accessed data and execution time—which holds true in general. Computing the cardinality of a relational algebra operator is itself a difficult problem and requires knowledge about the data on which the operator is performed. This knowledge is captured by incomplete statistics – or synopses – about the data. Different classes of statistics [8] are useful for different relational operators. For example, attribute histograms and the number of distinct values are optimal for selection predicates, while correlated samples are better for join predicates. With statistics, the cardinality can only be estimated—it is not exact. While accurate for simple predicates over a small number of attributes, cardinality estimation becomes harder for correlated predicates and multi-way joins. This is not necessarily a problem if all the plans are equally impacted. However, estimation errors vary widely across sub-plans and this can potentially lead to a highly suboptimal plan. The COMPASS query optimizer includes solutions both for effective plan enumeration and for incremental cardinality estimation for the enumerated sub-plans.
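As a toy illustration of this cost function, summing the intermediate-result cardinalities reported in Figure 1 ranks the plans (a minimal Python sketch; the helper name and the approximate MonetDB cardinalities are ours, read off the figure):

```python
def plan_cost(intermediate_cardinalities):
    """Cost of a plan = total cardinality of its intermediate results."""
    return sum(intermediate_cardinalities)

# Intermediate-result sizes of the COMPASS plan for JOB 6a (Figure 1)
compass_cost = plan_cost([6, 6, 14, 1242])
# Approximate sizes of the MonetDB plan for the same query (Figure 1)
monetdb_cost = plan_cost([6, 1194, 300_000, 17_000_000])
# A plan with one huge intermediate result dominates the comparison
assert compass_cost < monetdb_cost
```

A single mis-ordered join can thus change the cost by several orders of magnitude, which is why accurate sub-plan cardinality estimates matter.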
Parallel in-memory databases.
Database systems for modern computing architectures rely on extensive in-memory processing supported by massive multithread parallelism and vectorized instructions. GPUs represent the pinnacle of such architectures, harboring thousands of SMT threads which execute tens of vectorized SIMD instructions simultaneously. MapD, Ocelot [70], CoGaDB [74], Kinetica [75], and Brytlyt [72] are a few examples of modern in-memory databases with GPU support. They provide relational algebra operators and pipelines for GPU architectures [16, 4, 13] that optimize memory access and bandwidth. This results in considerable performance improvement for certain classes of queries. However, these databases provide only primitive rule-based query optimization—if at all. This drastically limits their applicability to general workloads. In COMPASS, we leverage the optimized execution engine of MapD to build a lightweight – yet accurate and general – query optimizer based on a single type of synopsis.

Sketches.
Sketch synopses [8] summarize the tuples of a relation as a set of random values. This is accomplished by projecting the domain of the relation on a significantly smaller domain using random functions or seeds. In the case of join attributes, correlation between attributes is maintained by using the same random function. While sketches compute only approximate results with probabilistic guarantees, they satisfy several major requirements of a query optimizer for in-memory databases—single-pass computation, small space, fast update and query time, and linearity:
• A sketch is built by streaming over the input data and considers each tuple at most once.
• A basic sketch is composed of a single counter and one or more random seeds—a few bytes. In order to improve accuracy, a standard method is to use multiple independent basic sketch instances. The number of instances is derived from the desired accuracy and confidence levels. In practice, very good accuracy can be achieved with sketches having size in kilobytes.
• The update of a sketch with a new tuple consists in generating one or more random numbers and adding them to the sketch counter. The answer to a query involves simple arithmetic operations on the sketch. In the case of multiple sketches, both the update and the query are applied to all the instances. Overall, update and query time are linearly proportional to the sketch size.
• A sketch can be computed by partitioning the input relation into multiple parts, building a sketch for every part, and then merging the partial sketches. This mergeable property makes sketches amenable to parallel processing on modern hardware and can result in linear speedups in update and query time [44, 53].
While previous work addresses how to apply sketches to certain cardinality estimation problems that occur in query optimization, COMPASS is a complete query optimizer based exclusively on sketches.
In addition to cardinality estimation, we show how to integrate the sketch estimations in plan enumeration. We are not aware of any work that integrates sketches effectively with plan enumeration. This is the main reason why sketches have not been integrated in a query optimizer before. COMPASS solves this problem.
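These requirements – single-pass update, tiny state, and mergeability – can be illustrated with a minimal basic-sketch class (a toy Python sketch of ours, not COMPASS code; the seeded hash stands in for a proper 4-wise independent random family):

```python
import hashlib

def xi(seed: int, value) -> int:
    """Map a value to +1/-1. A seeded hash stands in for a true 4-wise
    independent family; a real implementation would use, e.g.,
    degree-3 polynomials over a prime field."""
    h = hashlib.blake2b(f"{seed}:{value}".encode(), digest_size=8)
    return 1 if int.from_bytes(h.digest(), "big") & 1 else -1

class BasicSketch:
    """One counter sharing the random family identified by `seed`."""
    def __init__(self, seed: int):
        self.seed = seed
        self.counter = 0

    def update(self, value) -> None:          # single pass: one call per tuple
        self.counter += xi(self.seed, value)

    def merge(self, other: "BasicSketch") -> "BasicSketch":
        assert self.seed == other.seed        # linearity: same family required
        merged = BasicSketch(self.seed)
        merged.counter = self.counter + other.counter
        return merged

# Two partitions of the same column, sketched independently, then merged:
part1, part2 = BasicSketch(42), BasicSketch(42)
for v in [1, 2, 2, 3]:
    part1.update(v)
for v in [3, 3, 5]:
    part2.update(v)
whole = BasicSketch(42)
for v in [1, 2, 2, 3, 3, 3, 5]:
    whole.update(v)
assert part1.merge(part2).counter == whole.counter   # mergeable by linearity
```

The merge is a plain counter addition, which is what makes per-thread sketching followed by a cheap reduction possible on parallel hardware.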
In this section, we provide a high-level description of the COMPASS query optimization paradigm, while the technical details of cardinality estimation, join ordering, and plan enumeration are presented in Sections 4, 5, and 6, respectively.
Workflow.
The workflow performed by the COMPASS query optimizer is depicted in Figure 2. It consists of a two-step process that requires interaction with the query processor. First, the optimizer extracts the selection predicates and join attributes for every table. A sketch is built for every join attribute while performing the selection query on the base table, and only over the tuples that satisfy the predicate. Figure 2 shows the procedure for table title, which has a range predicate and two join conditions—although both join predicates involve the same attribute t.id, two independent sketches have to be built. COMPASS leverages the high parallelism of in-memory databases and the mergeable property of sketches to execute this process with minimal overhead. Two additional optimizations can be applied to further reduce the overhead. Sketches for join attributes from tables without selection predicates can be built offline and plugged in directly. Sketches can be built only over a sample [46], which, however, incurs a decrease in accuracy. In the second step of the workflow, plan enumeration is performed by estimating the cardinality of all the sub-plans using the sketches built in the first step. This is possible only because the attribute-level sketches we design are incrementally composable. Otherwise, separate sketches would have to be built for every enumerated sub-plan. In our example, there are two sketches on attribute t.id, one for join e2 and one for join e3 in the join graph (Figure 1). The sketch for e2 is included in all the sub-plans that contain this join attribute—similarly for e3. In a sub-plan that includes both e2 and e3, these two sketches are first merged and then used in estimation as before. This process is performed incrementally during plan enumeration. Finally, the optimal plan is submitted for execution together with any materialized intermediates from step one.

Partitioned query execution.
As shown in Figure 2, COMPASS intertwines query optimization and evaluation by partitioning execution into push-down selection (step 1) and join computation (step 3). Query optimization, i.e., join ordering plan enumeration, is performed in-between these two stages (step 2). Since plan enumeration and join computation are standard, we focus on push-down selection, where online sketch building is performed.

Figure 2: COMPASS workflow: online sketch-based query optimization for in-memory databases.

Push-down selection computes the exact selectivity cardinalities for all the base tables that have selections. This is similar to the ESC approach introduced in [51]. However, in addition to predicate evaluation, COMPASS also builds sketches for every join attribute in the table by piggybacking on the same traversal—sketch building is performed during the selection. Notice that this works both for sequential and index scans. It is important to emphasize that only the tuples that satisfy the predicate are included in the sketch, which increases their accuracy significantly. Moreover, the sketch update overhead is kept to the minimum necessary. While the exact cardinalities and sketches are always materialized due to their reduced size and role in optimization, the decision to materialize the selection output – the intermediate result – depends on its size. COMPASS follows the same approach as in [51]. If the intermediate size is smaller than a threshold, it is materialized. Otherwise, it is not, since the space reduction does not compensate for the access time reduction. Notice, though, that, even when intermediates are not materialized, sketches still contain only the relevant tuples for join cardinality estimation.

While the idea of partitioned query execution for XML processing is introduced in ROX [22], the COMPASS approach is different in several aspects. First, similar to adaptive query processing [9], COMPASS works for relational data and operators. However, COMPASS does not change the plan while the query is executing. This is not necessary because the sketch-based optimization strategy finds better plans in the first place. ROX can decompose a join graph into an arbitrary number of stages, each of which requires materialization.
COMPASS, on the other hand, splits execution in exactly two stages, and intermediate result materialization is only optional. The reason ROX requires materialization is that it uses chain sampling to estimate cardinalities. In order to provide acceptable accuracy, samples have to be extracted from the most recent intermediate results—not the base tables. Moreover, ROX chain sampling requires indexes on all the join attributes to guarantee a minimum sample size. This is a stringent constraint hardly satisfied in most real-world databases. Sketches, on the other hand, do not impose any constraints. Lastly, due to its incremental greedy exploration of the join order space, ROX considers only a limited number of plans—possibly sub-optimal. In COMPASS, plan enumeration is performed at once after push-down selection and can cover any portion of the join space. This can be achieved with the base table sketches, which can be composed without the risk of becoming empty—the case for chain sampling.
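The push-down selection stage described above can be sketched as a single scan that evaluates the predicate and updates one sketch per join attribute (illustrative Python; `pushdown_select`, the toy schema, the seeds, and the seeded-hash stand-in for a 4-wise independent family are ours, not MapD code):

```python
import hashlib

def xi(seed, value):
    # seeded +/-1 hash standing in for a 4-wise independent random family
    h = hashlib.blake2b(f"{seed}:{value}".encode(), digest_size=8)
    return 1 if int.from_bytes(h.digest(), "big") & 1 else -1

def pushdown_select(rows, predicate, sketch_specs):
    """One scan over a base table: evaluate the selection predicate and,
    for every passing tuple, update one basic sketch per join attribute.
    sketch_specs maps a join-edge name to (column index, random seed)."""
    sketches = {edge: 0 for edge in sketch_specs}
    passing = []
    for row in rows:
        if predicate(row):                       # selection evaluated online
            passing.append(row)
            for edge, (col, seed) in sketch_specs.items():
                sketches[edge] += xi(seed, row[col])
    return passing, sketches                     # exact cardinality + sketches

# title(id, production_year): range predicate, two independent sketches on t.id
title = [(10, 2008), (11, 2012), (12, 2015), (13, 2002)]
passing, sks = pushdown_select(title, lambda r: r[1] > 2010,
                               {"e2": (0, 21), "e3": (0, 22)})
assert len(passing) == 2                         # exact selectivity for free
```

Because the sketch updates piggyback on the same traversal that evaluates the predicate, only the qualifying tuples ever reach the sketches.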
Plan enumeration.
The join attribute-level sketches computed during push-down selection can be composed to estimate the cardinality of any valid join order – excluding cross products – generated during plan enumeration. In most cases, cross products are ignored by join enumeration algorithms anyway [29]. As shown in Section 5, sketch composition consists of two stages. First, the sketches of all the relevant join attributes in a table are merged together. An attribute is relevant for a partial join order if its join is part of the order. Second, the sketches across tables are combined to estimate the cardinality of the join order. Since the overall composition consists only of arithmetic operations, sketches can be integrated into any enumeration algorithm—exhaustive, bushy, or left-deep. Essentially, sketches can readily replace the standard join cardinality estimation formula based on table and join attribute distinct cardinality [14]. However, since sketches capture the correlation between join attributes and do not make the independence and containment assumptions, their accuracy is expected to be better.

Sketches vs. other synopses.
The decision to exclusively use sketches in COMPASS may seem questionable given that sketches are designed for specific stream processing tasks, while traditional databases support generic batch-oriented execution. To put it differently, there is a specific sketch for every streaming query, while synopses are for the entire database. To achieve generality, COMPASS has to build a set of sketches for every query—except for base tables without predicates. However, this is done concurrently with push-down selection and is highly parallel, resulting in low overhead (Section 7). As a result, sketches do not require any maintenance under modification operations since they are built on the current data. This is not possible for any of the other database synopses. The benefit of having query-specific synopses is also exploited in [29], where index-based join sampling – a variation of ROX chain sampling [22] – is introduced. Index-based join sampling is performed during the plan enumeration of every query under the corresponding selection predicates. Since the sample size – both minimum and maximum – is carefully controlled, index-based join sampling has improved memory usage and accuracy because it avoids empty results. Compared to sketches, though, this sampling strategy has two serious shortcomings. First, it requires the existence of an index and the complete frequency distribution on every join attribute. Sketches require nothing beyond the data. Second, the estimation of every join cardinality requires separate sampling from each of the involved tables. Since this process is time-consuming, plan enumeration is performed bottom-up – or breadth-first – within a limited time budget. Sketches can be composed incrementally in any order, without the need to access the data.

The other types of synopses – histograms and distinct cardinality – are not query-specific. Thus, they do not incur any creation overhead during optimization.
To estimate join cardinality, the attribute-level instances of these synopses are composed by simple arithmetic operations [50, 14]. However, due to the strong assumptions – uniformity, independence, inclusion, ad-hoc constants – made by these operations, the estimates can be highly inaccurate. Sketches do not make any of these assumptions because they capture correlations by design.

Figure 3: Cardinality estimation for query JOB 6a with AGMS sketches. (The figure shows the per-table sketch update expressions, the r × b matrices of basic sketches sk_k, sk_t, sk_n, sk_mk, and sk_ci, and the final estimator—the median over the r row-averages of the five-sketch products.)
SKETCH CARDINALITY ESTIMATION
In this section, we present how the class of AGMS sketches is applied to estimating the cardinality of complex queries involving selection predicates and multi-way joins. We organize the presentation around the original AGMS sketches [1], which have known solutions to these problems. However, AGMS sketches are too inefficient to be accurate and cannot be integrated in query plan enumeration. This leads us to the Fast-AGMS sketches [7], which are asymptotically more efficient and have been shown to be statistically more accurate [47, 49]. However, Fast-AGMS sketches are limited to estimating two-way join cardinality. Our main contributions are to extend Fast-AGMS sketches to multi-way joins and to effectively integrate them in query plan enumeration.
The basic AGMS sketch of an attribute from a relation consists of a single random value sk that summarizes the values of the attribute across all the tuples in the relation. For example, all the values of attribute id from table keyword can be summarized by a sketch sk(k.id) computed as

sk(k.id) = Σ_{t ∈ k} ξ(t.id),

where ξ is a family of {+1, −1} random variables that are 4-wise independent. Essentially, a random value of either +1 or −1 is associated to each point in the domain of attribute k.id. Then, the corresponding random value is added to the sketch sk(k.id) – initialized to 0 – for each tuple t in table keyword. Intuitively, the more frequent a value is, the more it is “pulling” the sketch toward its frequency. Since all the tuples are combined in the same sketch sk(k.id), they are conflicting and the output can be far away from the frequency of each single tuple. This is where the 4-wise independence property of ξ is important. It guarantees that, for any group of at most 4 different values of attribute k.id, the product of their corresponding ξ values is 0 in expectation—they cancel out. This, in turn, allows each individual attribute value frequency to be unbiasedly estimated by multiplying the sketch with the corresponding ξ random value. For example, the frequency of k.id = 5 is estimated by the product sk(k.id) · ξ(5).

Consider the join e1 between tables keyword and movie_keyword with predicate k.id = mk.keyword_id (Figure 1). The cardinality of this join operator can be estimated with two AGMS sketches sk(k.id) and sk(mk.keyword_id) built on the join attributes. The requirement is that these sketches share the same family ξ of random variables—ξ_e1 is associated with edge e1. ξ_e1 guarantees that join keys with the same value are assigned the same {+1, −1} random value—they are correlated.
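The shared-family requirement can be made concrete with a toy two-way join estimator that builds both sketches from the same seed and averages many independent instances (Python of ours; the seeded hash and the median-of-means loop are illustrative stand-ins, not the paper's implementation):

```python
import hashlib
import statistics

def xi(seed, value):
    # seeded +/-1 hash standing in for a 4-wise independent random family
    h = hashlib.blake2b(f"{seed}:{value}".encode(), digest_size=8)
    return 1 if int.from_bytes(h.digest(), "big") & 1 else -1

def estimate_two_way(R, S, r=7, b=128):
    """Median over r rows of averages over b basic sketch products."""
    rows = []
    for i in range(r):
        basic = []
        for j in range(b):
            seed = i * b + j                      # one family per basic sketch
            sk_R = sum(xi(seed, x) for x in R)    # correlated: same seed
            sk_S = sum(xi(seed, y) for y in S)    # is used for both sides
            basic.append(sk_R * sk_S)
        rows.append(sum(basic) / b)
    return statistics.median(rows)

k_id   = [1, 2, 2, 3]       # toy k.id column
mk_kid = [2, 2, 3, 3, 4]    # toy mk.keyword_id column
true_card = sum(1 for x in k_id for y in mk_kid if x == y)   # = 6
est = estimate_two_way(k_id, mk_kid)
```

Each basic product is unbiased but very noisy; the averaging and median steps are what bring the estimate close to the true cardinality of 6.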
The basic AGMS estimator is the product of sk(k.id) and sk(mk.keyword_id):

Est(|e1|) = sk(k.id) · sk(mk.keyword_id) = Σ_{x ∈ k} Σ_{y ∈ mk} ξ_e1(x.id) · ξ_e1(y.keyword_id)

Due to the 4-wise independence property of ξ_e1, this estimator is unbiased—its expectation equals the true |e1| cardinality. However, its variance is high—it has poor accuracy. This is expected since a full table with any number of tuples is summarized as a single number. The standard technique to improve accuracy is to build multiple independent basic sketch estimators. This is achieved by using independent families of random variables ξ_e1. It is theoretically proven that, in order to obtain an estimator with relative error at most ε with confidence δ, O(1/ε² · log(1/δ)) basic sketches are necessary. As shown in Figure 3, they are grouped into a matrix of r rows and b columns. Then, the final AGMS estimator is obtained by averaging the b instances in each row and taking the median over the resulting r averages. In summary, an AGMS sketch has Ω(r · b) update and query time, and its space usage is also Ω(r · b). This assumes that the random number generators ξ have small seeds and produce their values fast—aspects that require careful implementation.

We show how to extend AGMS sketches to multi-way join cardinality estimation. For this, we add the join e2 between movie_keyword and title to e1 and aim to estimate the cardinality of this 3-table query. Following the approach for two-way joins, a family of sketches is built for edge e2 on attributes mk.movie_id and t.id, respectively. These sketches share their own family ξ_e2 of random variables. Since two attributes from mk – keyword_id and movie_id – participate in join operators with other tables, we have to preserve their tuple connection. This is achieved by creating a single composed sketch sk(mk.k_id, mk.m_id) instead of separate sketches for each attribute [11].
The value of sk(mk.k_id, mk.m_id) is computed as

sk(mk.k_id, mk.m_id) = Σ_{t ∈ mk} ξ_e1(t.k_id) · ξ_e2(t.m_id),

where the product of the two random variables is added to the sketch. The cardinality estimator is defined as the product of three sketches in this case:

Est(|e1 ∪ e2|) = sk(k.id) · sk(mk.k_id, mk.m_id) · sk(t.id) = Σ_{x ∈ k} Σ_{y ∈ mk} Σ_{z ∈ t} ξ_e1(x.id) · ξ_e1(y.k_id) · ξ_e2(y.m_id) · ξ_e2(z.id)

As long as the families ξ_e1 and ξ_e2 are independent, this estimator is unbiased. However, its variance can be exponentially worse than that of the two-way join estimator—which makes sense, given the additional degree of randomness. Thus, to achieve the same accuracy, a considerably larger number of basic sketches is required.

This strategy can be generalized to complex queries involving any number of tables and join predicates. A sketch is built for every table. Independent random families ξ are used for every join predicate. The sketch corresponding to a table is updated with the product of all the ξ families incident to it, applied to the corresponding join attribute. In the case of our example query JOB 6a with 5 tables and 5 join predicates (Figure 3), there are 5 sketches and 5 families ξ. The sketch sk_mk for table mk is updated with the product ξ_e1(k_id) · ξ_e2(m_id) · ξ_e4(m_id), which includes a factor for each of the three join predicates incident to mk. The unbiased cardinality estimator is the product of the 5 sketches sk_k · sk_mk · sk_t · sk_ci · sk_n. For the same number of basic sketches r · b as in the case of the |e1| join, the accuracy of the |e1 ∪ e2 ∪ e3 ∪ e4 ∪ e5| join can be exponentially worse.

Query JOB 6a contains 3 selection predicates—point on k, subset on n, and range on t. These have to be accounted for when estimating the overall query cardinality. AGMS sketches can handle selection predicates as long as the domain of the attribute is discrete—which is the case for the fixed-size data types in databases.
The idea is to express the selection as a join predicate between the table and the domain of the selection attribute [45, 48]. Following the two-way join approach, a sketch is built on the selection attribute over all the tuples in the table. The sketch over the domain – which shares the same random family ξ – summarizes the values in the domain that satisfy the predicate by adding an entry for each of them to the sketch—for a point predicate, the sketch includes only the ξ value corresponding to the constant in the predicate; for a range, the ξ values for all points in the range; for a subset, the ξ values for the points in the subset. As long as the number of points is small, these sketches can be computed fast. Moreover, even for ranges, there is a specific fast range-summable random family ξ for which the sketch can be computed in constant time, independent of the range size [45, 48]. In the JOB 6a query depicted in Figure 3, the sketch update procedure for tables with predicates includes an additional factor corresponding to the selection attribute. For example, the sketch $sk_k$ for table keyword is updated with the product $\xi_{e_1}(id) \cdot \xi_p(keyword)$. Overall, 8 families ξ and 8 sketches are required—the sketches over the domain of the selection attributes are not included in Figure 3. The final estimator is the product of these 8 sketches. Since this estimator is a multi-way join with a larger number of sketches, its accuracy becomes worse than that of the join sketches alone.

As shown, AGMS sketches can be theoretically used to estimate the cardinality of arbitrarily complex queries with join and selection predicates. While all the sketches for a table can be built in a single scan, the update time per AGMS sketch is linear in the sketch size, so updating an exponential number of sketches becomes the dominant cost. Moreover, the space requirement for all the sketches is also a problem. These scalability issues hinder the application of AGMS sketches to join order enumeration.
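The selection-as-join reduction can be sketched in a few lines of Python. This is an illustrative mock-up with our own names, again using a salted hash as a stand-in for the shared ξ family; it handles point and subset predicates by enumerating the satisfying domain values (the fast range-summable construction of [45, 48] is not reproduced here).

```python
import hashlib

def xi(key, seed):
    # +/-1 sign; salted hash stands in for a 4-wise independent family
    d = hashlib.blake2b(f"{seed}:{key}".encode(), digest_size=1).digest()
    return 1 if d[0] & 1 else -1

def selection_estimate(column, predicate_values, trials=256):
    """Cardinality of a selection, expressed as a two-way join between the
    column and the subset of domain values satisfying the predicate.
    The domain sketch adds one xi entry per satisfying value."""
    total = 0.0
    for seed in range(trials):
        sk_col = sum(xi(v, seed) for v in column)       # sketch over the table
        sk_dom = sum(xi(v, seed) for v in predicate_values)  # domain sketch
        total += sk_col * sk_dom
    return total / trials
```

For a point predicate, `predicate_values` contains a single constant; for a subset or range, one entry per satisfying point, which is why small predicate cardinalities keep the domain sketch cheap.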
However, AGMS sketches suffer from a more serious problem in query optimization—they cannot be incrementally composed. What this means is that a sketch used to estimate a two-way join between two relations cannot be reused to estimate a three-way join that includes another relation. The addition of join $e_2$ to $e_1$ in our example illustrates this well. It is not possible to compute the sketch sk(mk.k_id, mk.m_id) from sketch sk(mk.keyword_id). It is not even possible to compute sk(mk.k_id, mk.m_id) from sk(mk.keyword_id) and sk(mk.movie_id). The reason is the order of multiplication and addition—the composed sketch sums the product of the two signs per tuple, while the product of the two single-attribute sketches multiplies the per-tuple sums. The other direction – using sk(mk.k_id, mk.m_id) instead of sk(mk.keyword_id) or sk(mk.movie_id) – is also not possible. Thus, in order to support plan enumeration, a separate sketch has to be built for every combination of the join attributes—which is an exponential number. For example, 7 sketches have to be built for each of the tables mk and ci, which participate in 3 join predicates. If we include the attributes that can appear in selection predicates, the number of sketches that have to be built for a table can become exponential in the number of attributes in the table. While workload information can be used to reduce this number, there is little that can be done for tables that join with several other tables on different attributes. Practically, AGMS sketches cannot achieve the goal of having synopses only for single attributes.

Figure 4: Cardinality estimation for query JOB 6a with Fast-AGMS sketches.
Fast-AGMS sketches preserve the (r × b) matrix structure of AGMS sketches. However, they define a complete row of b counters as a basic sketch element (Figure 4). Only one of these counters is updated for every tuple; thus, a factor b reduction in update time is obtained. The updated counter is chosen by a random hash function h associated with the row. The purpose of h is to spread tuples with different values as evenly as possible—tuples with the same key still end up in the same bucket. On average, a factor of b fewer tuples collide on the same counter, which preserves the frequency of each of them better. Since a full row is a sketch element, a single ξ family of random variables is associated with every row. Thus, a Fast-AGMS sketch with r rows requires only r hash and ξ random functions. The value of a counter j is $sk(k.id)_j = \sum_{t \in k,\, h(t.id)=j} \xi(t.id)$.

Two-Way Join Cardinality Estimation

In order to estimate join cardinality, the same principle applies—Fast-AGMS sketches are built over the join attributes using the same random functions h and ξ. The hash function h maps identical keys to the same bucket, while ξ assigns them the same sign. The unbiased estimator for a basic sketch sums up the products of the corresponding buckets:

$$\text{Est}(|e_1|) = \sum_{j=1}^{b} sk(k.id)_j \cdot sk(mk.keyword\_id)_j$$

Summation is necessary because h partitions the tuples. As for AGMS sketches, the final estimate is obtained by taking the median of the r independent basic sketches. Although the accuracy of Fast-AGMS sketches is asymptotically equal to that of AGMS sketches [7] in the worst case, it has been shown empirically that Fast-AGMS sketches have considerably better accuracy than any other sketching technique on average [47]. The combined accuracy and fast update time make Fast-AGMS sketches suitable for query optimization.

As far as we know, there is no work that extends Fast-AGMS sketches to multi-way join estimation.
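The Fast-AGMS structure described above can be rendered in a short Python sketch: one hash and one sign function per row, exactly one counter touched per tuple, and the bucket-wise dot product with a median over rows for estimation. The hash/sign functions are simulated with a salted cryptographic digest and all names are ours, not the paper's.

```python
import hashlib
import statistics

def _bucket_and_sign(key, row, b):
    """Row hash h_i and row sign xi_i, carved out of one salted digest.
    A production sketch would use proper pairwise/4-wise families."""
    d = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(d[:4], "little") % b, (1 if d[4] & 1 else -1)

def fast_agms_build(column, r=5, b=64):
    """r x b counters; each tuple touches exactly one counter per row:
    sk[i][h_i(key)] += xi_i(key), hence O(r) update time per tuple."""
    sk = [[0] * b for _ in range(r)]
    for key in column:
        for i in range(r):
            j, sign = _bucket_and_sign(key, i, b)
            sk[i][j] += sign
    return sk

def fast_agms_join_estimate(sk_a, sk_b):
    """Median over rows of the bucket-wise dot product; both sketches must
    be built with the same h_i and xi_i per row."""
    return statistics.median(
        sum(x * y for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(sk_a, sk_b))
```

Building sketches over the keys 0..59 and 0..29 gives a true join cardinality of 30; the estimate is the median of 5 row-wise dot products, each of which is exact up to hash collisions across buckets.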
The main problem is the requirement to have independent hash functions $h_{e_1}$ and $h_{e_2}$ for the two join attributes. These functions allocate the two attributes of a tuple to different buckets, which means that the tuple is added to the sketch twice and the relationship between the attributes is lost. Since sketch-based selectivity estimation is also reduced to a join between the selection attribute and its domain, this implies that Fast-AGMS sketches cannot be used to estimate the cardinality of two-way joins with predicates. In fact, optimally computing the Fast-AGMS sketch of the domain of a range predicate does not have a solution, because there is no order relationship between the hash values of adjacent points in the domain. Due to these limitations, Fast-AGMS sketches have not been used in query optimization before. COMPASS introduces Fast-AGMS extensions for multi-way joins and solves the selectivity issue by pushing down predicates during query optimization and adding only the relevant tuples to the sketch.

We present two strategies to extend Fast-AGMS sketches to multi-way join cardinality estimation. The first strategy – sketch partitioning – is a theoretically sound estimator for a given multi-way join. Its limitation is that it cannot be composed/decomposed; thus, it is not scalable for plan enumeration. The second strategy – sketch merging – addresses the scalability issue by incrementally creating multi-way sketches from two-way sketches. Although this is done heuristically for each multi-way join taken separately, all the multi-way joins of a given size are equally impacted. We show empirically that this property is a good surrogate for accuracy – which is much harder to consistently achieve – in join order enumeration.
The idea of sketch partitioning is to reorganize the b buckets of the elementary sketch into a (b × b) matrix. $h_{e_1}$ hashes a tuple mk(k_id, m_id) to one of the b rows, while $h_{e_2}$ hashes it to one of the b columns. Then, only the counter at indices $\left[h_{e_1}(k\_id), h_{e_2}(m\_id)\right]$ is updated with the product $\xi_{e_1}(k\_id) \cdot \xi_{e_2}(m\_id)$. This process is depicted in Figure 5. For example, tuple (6,3) in mk adds 1 to the counter [2,1]. $h_{e_1}$ guarantees that all the tuples with k_id = 6 are hashed to row 2, while $h_{e_2}$ sends tuples with m_id = 3 to column 1. Conflicts happen only when the outputs of both hash functions are identical. Given the quadratic number of buckets compared to the sketch for a single attribute – while the number of tuples is the same – conflicts are less frequent. The cardinality estimate for the 3-table join k ⋈ mk ⋈ t is obtained by summing up all the entries in the matrix resulting from the scalar multiplication between sk(k.id) and every row of $sk_{part}(mk)$, followed by the scalar multiplication between the transpose of sk(t.id) and every column of $sk_{part}(mk)$. This can be written as:

$$\text{Est}(|e_1 \cup e_2|) = \sum_{1 \le i \le b} \sum_{1 \le j \le b} sk(k.id)_i \cdot sk_{part}(mk)[i,j] \cdot sk(t.id)_j$$
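A single elementary partitioned sketch and the matrix-form estimator can be sketched as follows; this is an illustrative Python mock-up under our own naming (a complete estimator would take the median over r such elementary sketches), with the hash/sign families again simulated by salted digests.

```python
import hashlib

def _bucket_and_sign(key, family, b):
    # h_e and xi_e for one join family, simulated via a salted digest
    d = hashlib.blake2b(f"{family}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(d[:4], "little") % b, (1 if d[4] & 1 else -1)

def build_partitioned(mk_pairs, b=256):
    """(b x b) elementary sketch for mk: h_e1(k_id) picks the row,
    h_e2(m_id) picks the column, and the counter at [row, col] is
    updated with the product of the two signs."""
    sk = [[0] * b for _ in range(b)]
    for k_id, m_id in mk_pairs:
        i, s1 = _bucket_and_sign(k_id, "e1", b)
        j, s2 = _bucket_and_sign(m_id, "e2", b)
        sk[i][j] += s1 * s2
    return sk

def estimate_3way_partitioned(k_ids, sk_part, t_ids, b=256):
    """Sum over all (i, j) of sk(k.id)_i * sk_part[i][j] * sk(t.id)_j,
    where sk(k.id) shares h_e1/xi_e1 and sk(t.id) shares h_e2/xi_e2."""
    sk_k = [0] * b
    for v in k_ids:
        i, s = _bucket_and_sign(v, "e1", b)
        sk_k[i] += s
    sk_t = [0] * b
    for v in t_ids:
        j, s = _bucket_and_sign(v, "e2", b)
        sk_t[j] += s
    return sum(sk_k[i] * sk_part[i][j] * sk_t[j]
               for i in range(b) for j in range(b))
```

With k = {1, 2, 3}, mk = {(1,10), (2,10), (3,20), (3,30)}, t = {10, 20}, the true three-way cardinality is 3; absent hash collisions, the sum recovers it exactly because every matching (k, mk, t) triple contributes a squared-sign product of 1.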