Online Sketch-based Query Optimization
Yesdaulet Izenov, Asoke Datta, Florin Rusu, Jun Hyung Shin
{yizenov, adatta2, frusu, jshin33}@ucmerced.edu
University of California, Merced
February 2021

Abstract
Cost-based query optimization remains a critical task in relational databases even after decades of research and industrial development. Query optimizers rely on a large range of statistical synopses – including attribute-level histograms and table-level samples – for accurate cardinality estimation. As the complexity of selection predicates and the number of join predicates increase, two problems arise. First, statistics cannot be incrementally composed to effectively estimate the cost of the sub-plans generated in plan enumeration. Second, small errors are propagated exponentially through join operators, which can lead to severely sub-optimal plans.

In this paper, we introduce COMPASS, a novel query optimization paradigm for in-memory databases based on a single type of statistics—Fast-AGMS sketches. In COMPASS, query optimization and execution are intertwined. Selection predicates and sketch updates are pushed down and evaluated online during query optimization. This allows Fast-AGMS sketches to be computed only over the relevant tuples—which enhances cardinality estimation accuracy. Plan enumeration is performed over the query join graph by incrementally composing attribute-level sketches—not by building a separate sketch for every sub-plan.

We prototype COMPASS in MapD – an open-source parallel database – and perform extensive experiments over the complete JOB benchmark. The results prove that COMPASS generates better execution plans – both in terms of cardinality and runtime – compared to four other database systems. Overall, COMPASS achieves a speedup ranging from 1.35X to 11.28X in cumulative query execution time over the considered competitors.
Consider query 6a from the JOB benchmark [30]:
SELECT MIN(k.keyword), MIN(n.name), MIN(t.title)
FROM cast_info ci, keyword k, movie_keyword mk, name n, title t
WHERE -- selection predicates
      k.keyword = 'marvel-cinematic-universe'
  AND n.name LIKE '%Downey%Robert%'
  AND t.production_year > 2010
      -- join predicates
  AND k.id = mk.keyword_id
  AND t.id = mk.movie_id
  AND t.id = ci.movie_id
  AND ci.movie_id = mk.movie_id
  AND n.id = ci.person_id
The query has 3 selection predicates – point, subset, and range – and joins 5 tables with 5 join predicates—there is a triangle subquery between tables t, mk, and ci. The corresponding join graph is depicted in Figure 1. For each join, the graph contains a named edge e1–e5 that connects the tables involved in the join predicate. For example, edge e1 represents the join predicate k.id = mk.keyword_id.

Figure 1: Join graph and corresponding execution plans for query JOB 6a. The numbers represent cardinality.

Figure 1 also includes the execution plans together with their cost – the total cardinality of the intermediate results – for COMPASS and the four other databases considered in the paper. Although all the plans are left-deep trees, their cost ranges from thousands of tuples to hundreds of millions. This is entirely due to the statistics used for cardinality estimation. MapD [76] does not use any statistics, thus its cost is orders of magnitude higher. The plan is determined by sorting the tables in decreasing order of their size—number of tuples. MonetDB [77] has a rule-based optimizer with minimum support for statistics [19], which generates a better plan. The reason why both of these systems have primitive optimizers is that they are relatively “young” and are targeted at modern architectures. They try to compensate for bad plans with highly-optimized execution engines that make use of extensive in-memory processing supported by massive multithread parallelism and vectorized instructions.
However, this approach is clearly limited. PostgreSQL [78] and the industrial-grade DBMS A – name anonymized for legal reasons – are “mature” databases with advanced query optimizers. In order to find the much better plan, they use a large variety of statistics. Histograms, most frequent values, and the number of distinct values are used to estimate the selectivity of the point predicate on attribute k.keyword and of the range predicate on t.production_year. The subset LIKE predicate on n.name is estimated with table-level samples. Estimating join cardinality requires correlated statistics on the join attributes. While such statistics exist, e.g., correlated samples [22, 62, 29], they require the existence of indexes on every join attribute combination, which severely limits their applicability in the case of multi-way joins. As a result, even advanced optimizers rely on crude formulas that assume uniformity, inclusion, and independence—which are likely to produce highly sub-optimal execution plans [28]. Since implementing and maintaining this many statistics requires considerable effort, it is understandable that only mature systems implement them.
Problem.
We investigate how to design a lightweight – yet effective – query optimizer for modern in-memory databases. We have two design principles. First, we aim to capitalize on the highly-parallel execution engine in the query optimization process. Since query execution is already fast, it is challenging to minimize the overhead incurred by the additional optimization. Second, the type and number of synopses included in the optimizer have to be minimal. Our goal is to employ a single type of synopsis built exclusively for single attributes and without the requirement of additional data structures such as indexes. The challenge is to design a composable – and consistent – synopsis that provides incremental cardinality estimates for the sub-plans generated in plan enumeration.
COMPASS query optimizer.
We introduce the online sketch-based COMPASS query optimizer. Fast-AGMS sketches [7] are the only statistics present in COMPASS. These sketches are a type of correlated synopses for join cardinality estimation [47, 49] that use small space, can be computed efficiently in a single scan over the data, are linearly composable, and – more importantly – have statistically high accuracy. These properties allow Fast-AGMS sketches to be computed online in COMPASS by leveraging the optimized parallel execution engine in modern databases. This is realized by decomposing query processing into two stages performed before and after the optimization. In the first stage, selection predicates are pushed down and Fast-AGMS sketches are built concurrently only over the relevant tuples. Sketches are built for each two-way join independently—not for every combination of tables. In the query optimization stage, plan enumeration is performed over the join graph by incrementally composing the corresponding two-way join sketches in order to estimate the cardinality of multi-way joins. The optimal join ordering is finally passed to the execution engine to finalize the query. As shown in Figure 1, COMPASS identifies a plan as good as PostgreSQL and DBMS A, while relying exclusively on a single synopsis—Fast-AGMS sketches. In addition to the novel query optimization paradigm, we make the following technical contributions:
• We present a systematic approach of using sketches for join cardinality estimation in a query optimizer. This includes two-way and multi-way joins. We do this for two types of sketches—AGMS [1] and Fast-AGMS [7].
• We introduce two novel strategies to extend Fast-AGMS sketches to multi-way join cardinality estimation. The first strategy – sketch partitioning – is a theoretically sound estimator for a given multi-way join. Since it does not support composition, sketch partitioning is not scalable for join order enumeration.
The second strategy – sketch merging – addresses scalability by incrementally creating multi-way sketches from two-way sketches. Although this is done heuristically for each multi-way join taken separately, all the multi-way joins with a given size are equally impacted. This property guarantees estimation consistency in plan enumeration.
• We prototype COMPASS in MapD and perform extensive experiments over the complete JOB benchmark—113 queries. The results prove the reduced overhead COMPASS incurs – below 500 milliseconds – while generating similar or better execution plans compared to the four database systems included in Figure 1. COMPASS outperforms the other databases both in terms of the number of queries it obtains the best result on and in cumulative workload execution time.
Outline.
The paper is organized as follows. Background information on cost-based query optimization and sketches is given in Section 2. A high-level overview of COMPASS is presented in Section 3, followed by the technical details of sketch-based cardinality estimation in Section 4. The novel Fast-AGMS sketches for multi-way joins are introduced in Section 5. In Section 6, we show how the sketches are integrated in a typical enumeration algorithm. The empirical evaluation of COMPASS is detailed in Section 7. We discuss related work in Section 8 and conclude with future work directions in Section 9.
Cost-based query optimization.
The query optimization problem [67, 30, 28, 6] consists of finding the best execution plan – which typically corresponds to the one with the fastest execution time – for a given query. The search space is defined over all the valid plans – combinations of relational algebra operators – which can answer the query correctly. The number of potential plans grows exponentially with the number of tables. Thus, inspecting all of them is not practical for a large number of tables.
Plan enumeration is the procedure that defines the plans in the search space. Since the execution time of a plan cannot be determined without running it – which defeats the purpose – alternative cost functions are defined. The most common cost function is the total size – or cardinality – of the intermediate results produced by all the operators in the plan. This function captures the correlation between the amount of accessed data and execution time—which holds true in general. Computing the cardinality of a relational algebra operator is itself a difficult problem and requires knowledge about the data on which the operator is performed. This knowledge is captured by incomplete statistics – or synopses – about the data. Different classes of statistics [8] are useful for different relational operators. For example, attribute histograms and the number of distinct values are optimal for selection predicates, while correlated samples are better for join predicates. With statistics, the cardinality can only be estimated—it is not exact. While accurate for simple predicates over a small number of attributes, cardinality estimation becomes harder for correlated predicates and multi-way joins. This is not necessarily a problem if all the plans are equally impacted. However, estimation errors vary widely across sub-plans and this can potentially lead to a highly suboptimal plan. The COMPASS query optimizer includes solutions both for effective plan enumeration and for incremental cardinality estimation for the enumerated sub-plans.
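As a toy illustration of this cost function, summing the intermediate-result cardinalities reported in Figure 1 ranks the plans (a minimal Python sketch; the helper name and the approximate MonetDB cardinalities are ours, read off the figure):

```python
def plan_cost(intermediate_cardinalities):
    """Cost of a plan = total cardinality of its intermediate results."""
    return sum(intermediate_cardinalities)

# Intermediate-result sizes of the COMPASS plan for JOB 6a (Figure 1)
compass_cost = plan_cost([6, 6, 14, 1242])
# Approximate sizes of the MonetDB plan for the same query (Figure 1)
monetdb_cost = plan_cost([6, 1194, 300_000, 17_000_000])
# A plan with one huge intermediate result dominates the comparison
assert compass_cost < monetdb_cost
```

A single mis-ordered join can thus change the cost by several orders of magnitude, which is why accurate sub-plan cardinality estimates matter.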
Parallel in-memory databases.
Database systems for modern computing architectures rely on extensive in-memory processing supported by massive multithread parallelism and vectorized instructions. GPUs represent the pinnacle of such architectures, harboring thousands of SMT threads which execute tens of vectorized SIMD instructions simultaneously. MapD, Ocelot [70], CoGaDB [74], Kinetica [75], and Brytlyt [72] are a few examples of modern in-memory databases with GPU support. They provide relational algebra operators and pipelines for GPU architectures [16, 4, 13] that optimize memory access and bandwidth. This results in considerable performance improvement for certain classes of queries. However, these databases provide only primitive rule-based query optimization—if at all. This drastically limits their applicability to general workloads. In COMPASS, we leverage the optimized execution engine of MapD to build a lightweight – yet accurate and general – query optimizer based on a single type of synopsis.

Sketches.
Sketch synopses [8] summarize the tuples of a relation as a set of random values. This is accomplished by projecting the domain of the relation on a significantly smaller domain using random functions or seeds. In the case of join attributes, correlation between attributes is maintained by using the same random function. While sketches compute only approximate results with probabilistic guarantees, they satisfy several major requirements of a query optimizer for in-memory databases—single-pass computation, small space, fast update and query time, and linearity:
• A sketch is built by streaming over the input data and considers each tuple at most once.
• A basic sketch is composed of a single counter and one or more random seeds—a few bytes. In order to improve accuracy, a standard method is to use multiple independent basic sketch instances. The number of instances is derived from the desired accuracy and confidence levels. In practice, very good accuracy can be achieved with sketches having size in kilobytes.
• The update of a sketch with a new tuple consists in generating one or more random numbers and adding them to the sketch counter. The answer to a query involves simple arithmetic operations on the sketch. In the case of multiple sketches, both the update and the query are applied to all the instances. Overall, update and query time are linearly proportional to the sketch size.
• A sketch can be computed by partitioning the input relation into multiple parts, building a sketch for every part, and then merging the partial sketches. This mergeable property makes sketches amenable to parallel processing on modern hardware and can result in linear speedups in update and query time [44, 53].
While previous work addresses how to apply sketches to certain cardinality estimation problems that occur in query optimization, COMPASS is a complete query optimizer based exclusively on sketches.
In addition to cardinality estimation, we show how to integrate the sketch estimations in plan enumeration. We are not aware of any work that integrates sketches effectively with plan enumeration. This is the main reason why sketches have not been integrated in a query optimizer before. COMPASS solves this problem.
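These requirements – single-pass update, tiny state, and mergeability – can be illustrated with a minimal basic-sketch class (a toy Python sketch of ours, not COMPASS code; the seeded hash stands in for a proper 4-wise independent random family):

```python
import hashlib

def xi(seed: int, value) -> int:
    """Map a value to +1/-1. A seeded hash stands in for a true 4-wise
    independent family; a real implementation would use, e.g.,
    degree-3 polynomials over a prime field."""
    h = hashlib.blake2b(f"{seed}:{value}".encode(), digest_size=8)
    return 1 if int.from_bytes(h.digest(), "big") & 1 else -1

class BasicSketch:
    """One counter sharing the random family identified by `seed`."""
    def __init__(self, seed: int):
        self.seed = seed
        self.counter = 0

    def update(self, value) -> None:          # single pass: one call per tuple
        self.counter += xi(self.seed, value)

    def merge(self, other: "BasicSketch") -> "BasicSketch":
        assert self.seed == other.seed        # linearity: same family required
        merged = BasicSketch(self.seed)
        merged.counter = self.counter + other.counter
        return merged

# Two partitions of the same column, sketched independently, then merged:
part1, part2 = BasicSketch(42), BasicSketch(42)
for v in [1, 2, 2, 3]:
    part1.update(v)
for v in [3, 3, 5]:
    part2.update(v)
whole = BasicSketch(42)
for v in [1, 2, 2, 3, 3, 3, 5]:
    whole.update(v)
assert part1.merge(part2).counter == whole.counter   # mergeable by linearity
```

The merge is a plain counter addition, which is what makes per-thread sketching followed by a cheap reduction possible on parallel hardware.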
In this section, we provide a high-level description of the COMPASS query optimization paradigm, while the technical details of cardinality estimation, join ordering, and plan enumeration are presented in Sections 4, 5, and 6, respectively.
Workflow.
The workflow performed by the COMPASS query optimizer is depicted in Figure 2. It consists of a two-step process that requires interaction with the query processor. First, the optimizer extracts the selection predicates and join attributes for every table. A sketch is built for every join attribute while performing the selection query on the base table, and only over the tuples that satisfy the predicate. Figure 2 shows the procedure for table title, which has a range predicate and two join conditions—although both join predicates involve the same attribute t.id, two independent sketches have to be built. COMPASS leverages the high parallelism of in-memory databases and the mergeable property of sketches to execute this process with minimal overhead. Two additional optimizations can be applied to further reduce the overhead. Sketches for join attributes from tables without selection predicates can be built offline and plugged in directly. Sketches can be built only over a sample [46], which, however, incurs a decrease in accuracy. In the second step of the workflow, plan enumeration is performed by estimating the cardinality of all the sub-plans using the sketches built in the first step. This is possible only because the attribute-level sketches we design are incrementally composable. Otherwise, separate sketches would have to be built for every enumerated sub-plan. In our example, there are two sketches on attribute t.id, one for join e2 and one for join e3 in the join graph (Figure 1). The sketch for e2 is included in all the sub-plans that contain this join attribute—similarly for e3. In a sub-plan that includes both e2 and e3, these two sketches are first merged and then used in estimation as before. This process is performed incrementally during plan enumeration. Finally, the optimal plan is submitted for execution together with any materialized intermediates from step one.

Partitioned query execution.
As shown in Figure 2, COMPASS intertwines query optimization and evaluation by partitioning execution into push-down selection (step 1) and join computation (step 3). Query optimization, i.e., join ordering plan enumeration, is performed in-between these two stages (step 2). Since plan enumeration and join computation are standard, we focus on push-down selection, where online sketch building is performed.

Figure 2: COMPASS workflow: online sketch-based query optimization for in-memory databases.

Push-down selection computes the exact selectivity cardinalities for all the base tables that have selections. This is similar to the ESC approach introduced in [51]. However, in addition to predicate evaluation, COMPASS also builds sketches for every join attribute in the table by piggybacking on the same traversal—sketch building is performed during the selection. Notice that this works both for sequential and index scans. It is important to emphasize that only the tuples that satisfy the predicate are included in the sketch, which increases their accuracy significantly. Moreover, the sketch update overhead is kept to the minimum necessary. While the exact cardinalities and sketches are always materialized due to their reduced size and role in optimization, the decision to materialize the selection output – the intermediate result – depends on its size. COMPASS follows the same approach as in [51]. If the intermediate size is smaller than a threshold, it is materialized. Otherwise, it is not, since the space reduction does not compensate for the access time reduction. Notice, though, that, even when intermediates are not materialized, sketches still contain only the relevant tuples for join cardinality estimation.

While the idea of partitioned query execution for XML processing is introduced in ROX [22], the COMPASS approach is different in several aspects. First, similar to adaptive query processing [9], COMPASS works for relational data and operators. However, COMPASS does not change the plan while the query is executing. This is not necessary because the sketch-based optimization strategy finds better plans in the first place. ROX can decompose a join graph into an arbitrary number of stages, each of which requires materialization.
COMPASS, on the other hand, splits execution in exactly two stages, and intermediate result materialization is only optional. The reason ROX requires materialization is that it uses chain sampling to estimate cardinalities. In order to provide acceptable accuracy, samples have to be extracted from the most recent intermediate results—not the base tables. Moreover, ROX chain sampling requires indexes on all the join attributes to guarantee a minimum sample size. This is a stringent constraint hardly satisfied in most real-world databases. Sketches, on the other hand, do not impose any constraints. Lastly, due to its incremental greedy exploration of the join order space, ROX considers only a limited number of plans—possibly sub-optimal. In COMPASS, plan enumeration is performed at once after push-down selection and can cover any portion of the join space. This can be achieved with the base table sketches, which can be composed without the risk of becoming empty—the case for chain sampling.
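The push-down selection stage described above can be sketched as a single scan that evaluates the predicate and updates one sketch per join attribute (illustrative Python; `pushdown_select`, the toy schema, the seeds, and the seeded-hash stand-in for a 4-wise independent family are ours, not MapD code):

```python
import hashlib

def xi(seed, value):
    # seeded +/-1 hash standing in for a 4-wise independent random family
    h = hashlib.blake2b(f"{seed}:{value}".encode(), digest_size=8)
    return 1 if int.from_bytes(h.digest(), "big") & 1 else -1

def pushdown_select(rows, predicate, sketch_specs):
    """One scan over a base table: evaluate the selection predicate and,
    for every passing tuple, update one basic sketch per join attribute.
    sketch_specs maps a join-edge name to (column index, random seed)."""
    sketches = {edge: 0 for edge in sketch_specs}
    passing = []
    for row in rows:
        if predicate(row):                       # selection evaluated online
            passing.append(row)
            for edge, (col, seed) in sketch_specs.items():
                sketches[edge] += xi(seed, row[col])
    return passing, sketches                     # exact cardinality + sketches

# title(id, production_year): range predicate, two independent sketches on t.id
title = [(10, 2008), (11, 2012), (12, 2015), (13, 2002)]
passing, sks = pushdown_select(title, lambda r: r[1] > 2010,
                               {"e2": (0, 21), "e3": (0, 22)})
assert len(passing) == 2                         # exact selectivity for free
```

Because the sketch updates piggyback on the same traversal that evaluates the predicate, only the qualifying tuples ever reach the sketches.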
Plan enumeration.
The join attribute-level sketches computed during push-down selection can be composed to estimate the cardinality of any valid join order – excluding cross products – generated during plan enumeration. In most cases, cross products are ignored by join enumeration algorithms anyway [29]. As shown in Section 5, sketch composition consists of two stages. First, the sketches of all the relevant join attributes in a table are merged together. An attribute is relevant for a partial join order if its join is part of the order. Second, the sketches across tables are combined to estimate the cardinality of the join order. Since the overall composition consists only of arithmetic operations, sketches can be integrated into any enumeration algorithm—exhaustive, bushy, or left-deep. Essentially, sketches can readily replace the standard join cardinality estimation formula based on table and join attribute distinct cardinality [14]. However, since sketches capture the correlation between join attributes and do not make the independence and containment assumptions, their accuracy is expected to be better.

Sketches vs. other synopses.
The decision to exclusively use sketches in COMPASS may seem questionable given that sketches are designed for specific stream processing tasks, while traditional databases support generic batch-oriented execution. To put it differently, there is a specific sketch for every streaming query, while synopses are for the entire database. To achieve generality, COMPASS has to build a set of sketches for every query—except for base tables without predicates. However, this is done concurrently with push-down selection and is highly parallel, resulting in low overhead (Section 7). As a result, sketches do not require any maintenance under modification operations since they are built on the current data. This is not possible for any of the other database synopses. The benefit of having query-specific synopses is also exploited in [29], where index-based join sampling – a variation of ROX chain sampling [22] – is introduced. Index-based join sampling is performed during the plan enumeration of every query under the corresponding selection predicates. Since the sample size – both minimum and maximum – is carefully controlled, index-based join sampling has improved memory usage and accuracy because it avoids empty results. Compared to sketches, though, this sampling strategy has two serious shortcomings. First, it requires the existence of an index and the complete frequency distribution on every join attribute. Sketches require nothing beyond the data. Second, the estimation of every join cardinality requires separate sampling from each of the involved tables. Since this process is time-consuming, plan enumeration is performed bottom-up – or breadth-first – within a limited time budget. Sketches can be composed incrementally in any order, without the need to access the data.

The other types of synopses – histograms and distinct cardinality – are not query-specific. Thus, they do not incur any creation overhead during optimization.
To estimate join cardinality, the attribute-level instances of these synopses are composed by simple arithmetic operations [50, 14]. However, due to the strong assumptions – uniformity, independence, inclusion, ad-hoc constants – made by these operations, the estimates can be highly inaccurate. Sketches do not make any of these assumptions because they capture correlations by design.

Figure 3: Cardinality estimation for query JOB 6a with AGMS sketches. (The figure shows the per-table sketch update expressions, the r × b matrices of basic sketches sk_k, sk_t, sk_n, sk_mk, and sk_ci, and the final estimator—the median over the r row-averages of the five-sketch products.)
SKETCH CARDINALITY ESTIMATION
In this section, we present how the class of AGMS sketches is applied to estimating the cardinality of complex queries involving selection predicates and multi-way joins. We organize the presentation around the original AGMS sketches [1], which have known solutions to these problems. However, AGMS sketches are too inefficient to be accurate and cannot be integrated in query plan enumeration. This leads us to the Fast-AGMS sketches [7], which are asymptotically more efficient and have been shown to be statistically more accurate [47, 49]. However, Fast-AGMS sketches are limited to estimating two-way join cardinality. Our main contributions are to extend Fast-AGMS sketches to multi-way joins and to effectively integrate them in query plan enumeration.
The basic AGMS sketch of an attribute from a relation consists of a single random value sk that summarizes the values of the attribute across all the tuples in the relation. For example, all the values of attribute id from table keyword can be summarized by a sketch sk(k.id) computed as

sk(k.id) = Σ_{t ∈ k} ξ(t.id),

where ξ is a family of {+1, −1} random variables that are 4-wise independent. Essentially, a random value of either +1 or −1 is associated to each point in the domain of attribute k.id. Then, the corresponding random value is added to the sketch sk(k.id) – initialized to 0 – for each tuple t in table keyword. Intuitively, the more frequent a value is, the more it is “pulling” the sketch toward its frequency. Since all the tuples are combined in the same sketch sk(k.id), they are conflicting and the output can be far away from the frequency of each single tuple. This is where the 4-wise independence property of ξ is important. It guarantees that, for any group of at most 4 different values of attribute k.id, the product of their corresponding ξ values is 0 in expectation—they cancel out. This, in turn, allows each individual attribute value frequency to be unbiasedly estimated by multiplying the sketch with the corresponding ξ random value. For example, the frequency of k.id = 5 is estimated by the product sk(k.id) · ξ(5).

Consider the join e1 between tables keyword and movie_keyword with predicate k.id = mk.keyword_id (Figure 1). The cardinality of this join operator can be estimated with two AGMS sketches sk(k.id) and sk(mk.keyword_id) built on the join attributes. The requirement is that these sketches share the same family ξ of random variables—ξ_e1 is associated with edge e1. ξ_e1 guarantees that join keys with the same value are assigned the same {+1, −1} random value—they are correlated.
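The shared-family requirement can be made concrete with a toy two-way join estimator that builds both sketches from the same seed and averages many independent instances (Python of ours; the seeded hash and the median-of-means loop are illustrative stand-ins, not the paper's implementation):

```python
import hashlib
import statistics

def xi(seed, value):
    # seeded +/-1 hash standing in for a 4-wise independent random family
    h = hashlib.blake2b(f"{seed}:{value}".encode(), digest_size=8)
    return 1 if int.from_bytes(h.digest(), "big") & 1 else -1

def estimate_two_way(R, S, r=7, b=128):
    """Median over r rows of averages over b basic sketch products."""
    rows = []
    for i in range(r):
        basic = []
        for j in range(b):
            seed = i * b + j                      # one family per basic sketch
            sk_R = sum(xi(seed, x) for x in R)    # correlated: same seed
            sk_S = sum(xi(seed, y) for y in S)    # is used for both sides
            basic.append(sk_R * sk_S)
        rows.append(sum(basic) / b)
    return statistics.median(rows)

k_id   = [1, 2, 2, 3]       # toy k.id column
mk_kid = [2, 2, 3, 3, 4]    # toy mk.keyword_id column
true_card = sum(1 for x in k_id for y in mk_kid if x == y)   # = 6
est = estimate_two_way(k_id, mk_kid)
```

Each basic product is unbiased but very noisy; the averaging and median steps are what bring the estimate close to the true cardinality of 6.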
The basic AGMS estimator is the product of sk(k.id) and sk(mk.keyword_id):

Est(|e1|) = sk(k.id) · sk(mk.keyword_id) = Σ_{x ∈ k} Σ_{y ∈ mk} ξ_e1(x.id) · ξ_e1(y.keyword_id)

Due to the 4-wise independence property of ξ_e1, this estimator is unbiased—its expectation equals the true |e1| cardinality. However, its variance is high—it has poor accuracy. This is expected since a full table with any number of tuples is summarized as a single number. The standard technique to improve accuracy is to build multiple independent basic sketch estimators. This is achieved by using independent families of random variables ξ_e1. It is theoretically proven that, in order to obtain an estimator with relative error at most ε with confidence δ, O(1/ε² · log(1/δ)) basic sketches are necessary. As shown in Figure 3, they are grouped into a matrix of r rows and b columns. Then, the final AGMS estimator is obtained by averaging the b instances in each row and taking the median over the resulting r averages. In summary, an AGMS sketch has Ω(r · b) update and query time, and its space usage is also Ω(r · b). This assumes that the random number generators ξ have small seeds and produce their values fast—aspects that require careful implementation.

We show how to extend AGMS sketches to multi-way join cardinality estimation. For this, we add the join e2 between movie_keyword and title to e1 and aim to estimate the cardinality of this 3-table query. Following the approach for two-way joins, a family of sketches is built for edge e2 on attributes mk.movie_id and t.id, respectively. These sketches share their own family ξ_e2 of random variables. Since two attributes from mk – keyword_id and movie_id – participate in join operators with other tables, we have to preserve their tuple connection. This is achieved by creating a single composed sketch sk(mk.k_id, mk.m_id) instead of separate sketches for each attribute [11].
The value of sk(mk.k_id, mk.m_id) is computed as

sk(mk.k_id, mk.m_id) = Σ_{t ∈ mk} ξ_e1(t.k_id) · ξ_e2(t.m_id),

where the product of the two random variables is added to the sketch. The cardinality estimator is defined as the product of three sketches in this case:

Est(|e1 ∪ e2|) = sk(k.id) · sk(mk.k_id, mk.m_id) · sk(t.id) = Σ_{x ∈ k} Σ_{y ∈ mk} Σ_{z ∈ t} ξ_e1(x.id) · ξ_e1(y.k_id) · ξ_e2(y.m_id) · ξ_e2(z.id)

As long as the families ξ_e1 and ξ_e2 are independent, this estimator is unbiased. However, its variance can be exponentially worse than that of the two-way join estimator—which makes sense, given the additional degree of randomness. Thus, to achieve the same accuracy, a considerably larger number of basic sketches is required.

This strategy can be generalized to complex queries involving any number of tables and join predicates. A sketch is built for every table. Independent random families ξ are used for every join predicate. The sketch corresponding to a table is updated with the product of all the ξ families incident to it, applied to the corresponding join attribute. In the case of our example query JOB 6a with 5 tables and 5 join predicates (Figure 3), there are 5 sketches and 5 families ξ. The sketch sk_mk for table mk is updated with the product ξ_e1(k_id) · ξ_e2(m_id) · ξ_e4(m_id), which includes a factor for each of the three join predicates incident to mk. The unbiased cardinality estimator is the product of the 5 sketches sk_k · sk_mk · sk_t · sk_ci · sk_n. For the same number of basic sketches r · b as in the case of the |e1| join, the accuracy of the |e1 ∪ e2 ∪ e3 ∪ e4 ∪ e5| join can be exponentially worse.

Query JOB 6a contains 3 selection predicates—point on k, subset on n, and range on t. These have to be accounted for when estimating the overall query cardinality. AGMS sketches can handle selection predicates as long as the domain of the attribute is discrete—which is the case for the fixed-size data types in databases.
The idea is to express the selection as a join predicate between the table and the domain of the selection attribute [45, 48]. Following the two-way join approach, a sketch is built on the selection attribute over all the tuples in the table. The sketch over the domain – which shares the same random family ξ – summarizes the values in the domain that satisfy the predicate by adding an entry for each of them to the sketch—for a point predicate, the sketch includes only the ξ value corresponding to the constant in the predicate; for a range, the ξ values for all points in the range; for a subset, the ξ values for the points in the subset. As long as the number of points is small, these sketches can be computed fast. Moreover, even for ranges, there is a specific fast range-summable random family ξ for which the sketch can be computed in constant time, independent of the range size [45, 48]. In the JOB 6a query depicted in Figure 3, the sketch update procedure for tables with predicates includes an additional factor corresponding to the selection attribute. For example, the sketch $sk_k$ for table keyword is updated with the product $\xi_{e_1}(id) \cdot \xi_p(keyword)$. Overall, 8 families ξ and 8 sketches are required—the sketches over the domain of the selection attributes are not included in Figure 3. The final estimator is the product of these 8 sketches. Since this estimator is a multi-way join with a larger number of sketches, its accuracy becomes worse than that of the join sketches alone.

As shown, AGMS sketches can be theoretically used to estimate the cardinality of arbitrarily complex queries with join and selection predicates. While all the sketches for a table can be built in a single scan, the update time per AGMS sketch is linear in the sketch size, so updating an exponential number of sketches becomes the dominant cost. Moreover, the space requirement for all the sketches is also a problem. These scalability issues hinder the application of AGMS sketches to join order enumeration.
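The selection-as-join reduction can be sketched in a few lines of Python. This is an illustrative mock-up with our own names, again using a salted hash as a stand-in for the shared ξ family; it handles point and subset predicates by enumerating the satisfying domain values (the fast range-summable construction of [45, 48] is not reproduced here).

```python
import hashlib

def xi(key, seed):
    # +/-1 sign; salted hash stands in for a 4-wise independent family
    d = hashlib.blake2b(f"{seed}:{key}".encode(), digest_size=1).digest()
    return 1 if d[0] & 1 else -1

def selection_estimate(column, predicate_values, trials=256):
    """Cardinality of a selection, expressed as a two-way join between the
    column and the subset of domain values satisfying the predicate.
    The domain sketch adds one xi entry per satisfying value."""
    total = 0.0
    for seed in range(trials):
        sk_col = sum(xi(v, seed) for v in column)       # sketch over the table
        sk_dom = sum(xi(v, seed) for v in predicate_values)  # domain sketch
        total += sk_col * sk_dom
    return total / trials
```

For a point predicate, `predicate_values` contains a single constant; for a subset or range, one entry per satisfying point, which is why small predicate cardinalities keep the domain sketch cheap.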
However, AGMS sketches suffer from a more serious problem in query optimization—they cannot be incrementally composed. What this means is that a sketch used to estimate a two-way join between two relations cannot be reused to estimate a three-way join that includes another relation. The addition of join $e_2$ to $e_1$ in our example illustrates this well. It is not possible to compute the sketch sk(mk.k_id, mk.m_id) from sketch sk(mk.keyword_id). It is not even possible to compute sk(mk.k_id, mk.m_id) from sk(mk.keyword_id) and sk(mk.movie_id). The reason is the order of multiplication and addition—the composed sketch sums the product of the two signs per tuple, while the product of the two single-attribute sketches multiplies the per-tuple sums. The other direction – using sk(mk.k_id, mk.m_id) instead of sk(mk.keyword_id) or sk(mk.movie_id) – is also not possible. Thus, in order to support plan enumeration, a separate sketch has to be built for every combination of the join attributes—which is an exponential number. For example, 7 sketches have to be built for each of the tables mk and ci, which participate in 3 join predicates. If we include the attributes that can appear in selection predicates, the number of sketches that have to be built for a table can become exponential in the number of attributes in the table. While workload information can be used to reduce this number, there is little that can be done for tables that join with several other tables on different attributes. Practically, AGMS sketches cannot achieve the goal of having synopses only for single attributes.

Figure 4: Cardinality estimation for query JOB 6a with Fast-AGMS sketches.
Fast-AGMS sketches preserve the (r × b) matrix structure of AGMS sketches. However, they define a complete row of b counters as a basic sketch element (Figure 4). Only one of these counters is updated for every tuple; thus, a factor b reduction in update time is obtained. The updated counter is chosen by a random hash function h associated with the row. The purpose of h is to spread tuples with different values as evenly as possible—tuples with the same key still end up in the same bucket. On average, a factor of b fewer tuples collide on the same counter, which preserves the frequency of each of them better. Since a full row is a sketch element, a single ξ family of random variables is associated with every row. Thus, a Fast-AGMS sketch with r rows requires only r hash and ξ random functions. The value of a counter j is $sk(k.id)_j = \sum_{t \in k,\, h(t.id)=j} \xi(t.id)$.

Two-Way Join Cardinality Estimation

In order to estimate join cardinality, the same principle applies—Fast-AGMS sketches are built over the join attributes using the same random functions h and ξ. The hash function h maps identical keys to the same bucket, while ξ assigns them the same sign. The unbiased estimator for a basic sketch sums up the products of the corresponding buckets:

$$\text{Est}(|e_1|) = \sum_{j=1}^{b} sk(k.id)_j \cdot sk(mk.keyword\_id)_j$$

Summation is necessary because h partitions the tuples. As for AGMS sketches, the final estimate is obtained by taking the median of the r independent basic sketches. Although the accuracy of Fast-AGMS sketches is asymptotically equal to that of AGMS sketches [7] in the worst case, it has been shown empirically that Fast-AGMS sketches have considerably better accuracy than any other sketching technique on average [47]. The combined accuracy and fast update time make Fast-AGMS sketches suitable for query optimization.

As far as we know, there is no work that extends Fast-AGMS sketches to multi-way join estimation.
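The Fast-AGMS structure described above can be rendered in a short Python sketch: one hash and one sign function per row, exactly one counter touched per tuple, and the bucket-wise dot product with a median over rows for estimation. The hash/sign functions are simulated with a salted cryptographic digest and all names are ours, not the paper's.

```python
import hashlib
import statistics

def _bucket_and_sign(key, row, b):
    """Row hash h_i and row sign xi_i, carved out of one salted digest.
    A production sketch would use proper pairwise/4-wise families."""
    d = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(d[:4], "little") % b, (1 if d[4] & 1 else -1)

def fast_agms_build(column, r=5, b=64):
    """r x b counters; each tuple touches exactly one counter per row:
    sk[i][h_i(key)] += xi_i(key), hence O(r) update time per tuple."""
    sk = [[0] * b for _ in range(r)]
    for key in column:
        for i in range(r):
            j, sign = _bucket_and_sign(key, i, b)
            sk[i][j] += sign
    return sk

def fast_agms_join_estimate(sk_a, sk_b):
    """Median over rows of the bucket-wise dot product; both sketches must
    be built with the same h_i and xi_i per row."""
    return statistics.median(
        sum(x * y for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(sk_a, sk_b))
```

Building sketches over the keys 0..59 and 0..29 gives a true join cardinality of 30; the estimate is the median of 5 row-wise dot products, each of which is exact up to hash collisions across buckets.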
The main problem is the requirement to have independent hash functions $h_{e_1}$ and $h_{e_2}$ for the two join attributes. These functions allocate the two attributes of a tuple to different buckets, which means that the tuple is added to the sketch twice and the relationship between the attributes is lost. Since sketch-based selectivity estimation is also reduced to a join between the selection attribute and its domain, this implies that Fast-AGMS sketches cannot be used to estimate the cardinality of two-way joins with predicates. In fact, optimally computing the Fast-AGMS sketch of the domain of a range predicate does not have a solution, because there is no order relationship between the hash values of adjacent points in the domain. Due to these limitations, Fast-AGMS sketches have not been used in query optimization before. COMPASS introduces Fast-AGMS extensions for multi-way joins and solves the selectivity issue by pushing down predicates during query optimization and adding only the relevant tuples to the sketch.

We present two strategies to extend Fast-AGMS sketches to multi-way join cardinality estimation. The first strategy – sketch partitioning – is a theoretically sound estimator for a given multi-way join. Its limitation is that it cannot be composed/decomposed; thus, it is not scalable for plan enumeration. The second strategy – sketch merging – addresses the scalability issue by incrementally creating multi-way sketches from two-way sketches. Although this is done heuristically for each multi-way join taken separately, all the multi-way joins of a given size are equally impacted. We show empirically that this property is a good surrogate for accuracy – which is much harder to consistently achieve – in join order enumeration.
The idea of sketch partitioning is to reorganize the b buckets of the elementary sketch into a (b × b) matrix. $h_{e_1}$ hashes a tuple mk(k_id, m_id) to one of the b rows, while $h_{e_2}$ hashes it to one of the b columns. Then, only the counter at indices $\left[h_{e_1}(k\_id), h_{e_2}(m\_id)\right]$ is updated with the product $\xi_{e_1}(k\_id) \cdot \xi_{e_2}(m\_id)$. This process is depicted in Figure 5. For example, tuple (6,3) in mk adds 1 to the counter [2,1]. $h_{e_1}$ guarantees that all the tuples with k_id = 6 are hashed to row 2, while $h_{e_2}$ sends tuples with m_id = 3 to column 1. Conflicts happen only when the outputs of both hash functions are identical. Given the quadratic number of buckets compared to the sketch for a single attribute – while the number of tuples is the same – conflicts are less frequent. The cardinality estimate for the 3-table join k ⋈ mk ⋈ t is obtained by summing up all the entries in the matrix resulting from the scalar multiplication between sk(k.id) and every row of $sk_{part}(mk)$, followed by the scalar multiplication between the transpose of sk(t.id) and every column of $sk_{part}(mk)$. This can be written as:

$$\text{Est}(|e_1 \cup e_2|) = \sum_{1 \le i \le b} \sum_{1 \le j \le b} sk(k.id)_i \cdot sk_{part}(mk)[i,j] \cdot sk(t.id)_j$$
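A single elementary partitioned sketch and the matrix-form estimator can be sketched as follows; this is an illustrative Python mock-up under our own naming (a complete estimator would take the median over r such elementary sketches), with the hash/sign families again simulated by salted digests.

```python
import hashlib

def _bucket_and_sign(key, family, b):
    # h_e and xi_e for one join family, simulated via a salted digest
    d = hashlib.blake2b(f"{family}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(d[:4], "little") % b, (1 if d[4] & 1 else -1)

def build_partitioned(mk_pairs, b=256):
    """(b x b) elementary sketch for mk: h_e1(k_id) picks the row,
    h_e2(m_id) picks the column, and the counter at [row, col] is
    updated with the product of the two signs."""
    sk = [[0] * b for _ in range(b)]
    for k_id, m_id in mk_pairs:
        i, s1 = _bucket_and_sign(k_id, "e1", b)
        j, s2 = _bucket_and_sign(m_id, "e2", b)
        sk[i][j] += s1 * s2
    return sk

def estimate_3way_partitioned(k_ids, sk_part, t_ids, b=256):
    """Sum over all (i, j) of sk(k.id)_i * sk_part[i][j] * sk(t.id)_j,
    where sk(k.id) shares h_e1/xi_e1 and sk(t.id) shares h_e2/xi_e2."""
    sk_k = [0] * b
    for v in k_ids:
        i, s = _bucket_and_sign(v, "e1", b)
        sk_k[i] += s
    sk_t = [0] * b
    for v in t_ids:
        j, s = _bucket_and_sign(v, "e2", b)
        sk_t[j] += s
    return sum(sk_k[i] * sk_part[i][j] * sk_t[j]
               for i in range(b) for j in range(b))
```

With k = {1, 2, 3}, mk = {(1,10), (2,10), (3,20), (3,30)}, t = {10, 20}, the true three-way cardinality is 3; absent hash collisions, the sum recovers it exactly because every matching (k, mk, t) triple contributes a squared-sign product of 1.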