Buffer Pool Aware Query Scheduling via Deep Reinforcement Learning (Extended Abstract)
Chi Zhang, Ryan Marcus, Anat Kleiman, Olga Papaemmanouil
Brandeis University, MIT CSAIL and Intel Labs
{chizhang, akleiman, opapaemm}@brandeis.edu, [email protected]

ABSTRACT
In this extended abstract, we propose a new technique for query scheduling with the explicit goal of reducing disk reads and thus implicitly increasing query performance. We introduce SmartQueue, a learned scheduler that leverages overlapping data reads among incoming queries and learns a scheduling strategy that improves cache hits. SmartQueue relies on deep reinforcement learning to produce workload-specific scheduling strategies that focus on long-term performance benefits while being adaptive to previously-unseen data access patterns. We present results from a proof-of-concept prototype, demonstrating that learned schedulers can offer significant performance improvements over hand-crafted scheduling heuristics. Ultimately, we make the case that this is a promising research direction in the intersection of machine learning and databases.
1. INTRODUCTION
Query scheduling, the problem of deciding which of a set of queued queries to execute next, is an important and challenging task in modern database systems. Query scheduling can have a significant impact on query performance and resource utilization, while it may need to account for a wide number of considerations, such as cached data sets, available resources (e.g., memory), per-query performance goals, query prioritization, or inter-query dependencies (e.g., correlated data access patterns).

In this work, we attempt to address the query scheduling problem by leveraging overlapping data access requests. Smart query scheduling policies can take advantage of such overlaps, allowing queries to share cached data, whereas naive scheduling policies may induce unnecessary disk reads. For example, consider three queries q1, q2, q3 which need to read disk blocks (b1, b2), (b3, b4), and (b1, b3) respectively. If the DBMS's buffer pool (i.e., the component of the database engine that caches data blocks) can only cache two blocks at once, executing the queries in the order [q1, q2, q3] will result in reading 6 blocks from disk. However, if the queries are executed in the order [q2, q1, q3], then only 5 blocks will be read from disk, as q3 will use the cached b1. Since buffer pool hits can be orders of magnitude faster than cache misses, such savings could be substantial.

In reality, designing a query scheduler that is aware of the current buffer pool is a complex task. First, the exact data block read set of a query is not known ahead of time, and is dependent on data and query plan parameters (e.g., index lookups). Second, a smart scheduler must balance short-term rewards (e.g., executing a query that will take advantage of the current buffer state) against long-term strategy (e.g., selecting queries that keep the most important blocks cached). One could imagine many simple heuristics, such as greedily selecting the next query with the highest expected buffer usage, to solve this problem. However, a hand-designed policy that handles the complexity of the entire problem, including different buffer sizes, shifting query workloads, heterogeneous data types (e.g., index files vs. base relations), and the balance of short-term gains against long-term strategy, is much more difficult to conceive.

Here, we showcase a prototype of SmartQueue, a deep reinforcement learning (DRL) system that automatically learns to maximize buffer hits in an adaptive fashion. Given a set of queued queries, SmartQueue combines a simple representation of the database's buffer state, the expected reads of queries, and a deep Q-learning model to order queued queries in a way that garners long-term increases in buffer hits. SmartQueue is fully learned, and requires minimal tuning. SmartQueue custom-tailors itself to the user's queries and database, and learns policies that are significantly better than naive or simple heuristics. In terms of integrating SmartQueue into an existing DBMS, our prototype only requires access to the execution plan for each incoming query (to assess likely reads) and the current state of the DBMS buffer pool (i.e., its cached data blocks).

We present our system model and formalize our learning task in Section 2. We present preliminary experimental results from a proof-of-concept prototype implementation in Section 3, related work in Section 4, and in Section 5 we highlight directions for future work.
2. THE SMARTQUEUE MODEL
SmartQueue is a learned query scheduler that automatically learns how to order the execution of queries to minimize disk access requests. The core of SmartQueue includes a deep reinforcement learning (DRL) agent [3] that learns a query scheduling policy through continuous interactions with its environment, i.e., the database and the incoming queries. This DRL agent is not a static model; instead, it continuously learns from its past scheduling decisions and adapts to new data access and caching patterns. Furthermore, as we discuss below, using a DRL model allows us to define a reward function and scheduling policy that capture long-term benefits vs. short-term gains in disk access.

Our system model is depicted in Figure 1. Incoming user queries are placed into an execution queue and SmartQueue decides their order of execution. For each query execution, the database collects the required data blocks of each input base relation, where a data block is the smallest data unit used by the database engine. Data block requests are first resolved by the buffer pool. Blocks found in the buffer (buffer hits) are returned for processing, while the rest of the blocks (buffer misses) are read from disk and placed into the buffer pool (after possible block evictions). Higher buffer hit rates (and hence lower disk access rates) can enormously impact query execution times but require strategic query scheduling, as execution ordering affects the data blocks cached in the buffer pool.

One tempting solution to address this challenge could involve a greedy scheduler which executes the query that will re-use the maximum number of cached data blocks. While this simple approach would yield short-term benefits, it ignores the long-term impact of each choice. Specifically, while the next query picked for execution will maximally utilize the buffer pool contents, it will also lead to newly cached data blocks, which will affect future queries. A greedy approach fails to identify whether these new cached blocks could be of any benefit to the yet-unscheduled queries.

SmartQueue addresses this problem by training a deep reinforcement learning agent to make scheduling decisions that maximize long-term benefits. Specifically, it uses a model that simultaneously estimates and tries to improve a weighted average between short-term buffer hits and the long-term impact of query scheduling choices. In the next paragraphs, we discuss the details of our approach: (a) the input feature vectors that capture data access requests (Query Vector) and buffer state (Buffer Bitmap), and (b) the formalized DRL task.
Buffer Bitmap.
One input to the DRL model is the state of the buffer pool, namely which blocks are currently cached in memory. The buffer state B is represented by a bitmap where rows represent base relations and columns represent data blocks. The (i, j) entry is set to 1 if the j-th block of relation i is cached in the buffer pool and is set to zero otherwise. Since the number of blocks of any given relation can be very high and differs across relations, each row vector F_i is downsized by calculating a simple moving average over its block entries. Specifically, if D_i is the downsized row of relation i, F_i is the full-size row, and w_i = \lfloor |F_i| / |D_i| \rfloor is the resulting window size, we have:

B_{ij} = \frac{1}{w_i} \sum_{k = j \cdot w_i}^{(j+1) \cdot w_i} F_{ik}    (1)
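For concreteness, the following is a minimal sketch of this downsizing step, assuming NumPy arrays and, for brevity, that the number of blocks is (roughly) a multiple of the target length; the function name is ours and purely illustrative:

import numpy as np

def downsize_row(full_row, target_len):
    # Downsize a full-size bitmap row to target_len cells by averaging over
    # fixed-size windows of blocks, as in Equation (1).
    full_row = np.asarray(full_row, dtype=float)
    window = len(full_row) // target_len            # blocks averaged per cell
    trimmed = full_row[: window * target_len]       # drop any ragged tail
    return trimmed.reshape(target_len, window).mean(axis=1)

# Example: 8 blocks, of which blocks 0-3 are cached, downsized to 4 cells.
print(downsize_row([1, 1, 1, 1, 0, 0, 0, 0], 4))    # -> [1. 1. 0. 0.]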
[Figure 1: SmartQueue's system model. Queued queries (query state: block requests) and the buffer state (cached blocks) feed a Q-learning DNN (state layer, hidden layers, action layer) that produces a Q-value for each queued query; action selection picks the query to execute on the DB engine, and the resulting buffer hit ratio is returned as the reward.]

Query Vector.

The second input to the DRL model is the data block requests of each query in the queue. Specifically, given a query q, we generate a vector that indicates the data blocks to be accessed by q for each base relation in the database. To implement this, SmartQueue collects the query plan of q and approximates the probability of each table's data block being accessed. Our approach handles requests for index files and base relations similarly, as both types of blocks will be cached in the buffer pool. The query vector is downsized in the same way as the buffer bitmap. A full table scan of a base relation i indicates that all data blocks of that relation will be accessed, and therefore each cell of the i-th row vector has the value of 1. For indexed table scans, we calculate the number of tuples to be accessed based on the selectivity of the index scan. If the index scan is feeding a loop-based operator (i.e., a nested loop join), the selectivity is adapted accordingly to account for any iterations over the relation. We assume the relation is uniformly stored across data blocks and therefore, if x% of the tuples of a base relation are to be selected by an indexed operation, we set the access probability of each data block of the relation to x%. Similarly, we assume that the indexed operation reads x% of the index's blocks. We note that much more sophisticated probabilistic models could be used, but for this preliminary work we use this simple approximation.
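As an illustration only, a rough sketch of how such a per-relation access-probability vector could be assembled from plan information follows; the scans/n_blocks inputs and the function name are assumptions made for exposition, not the prototype's actual interface:

import numpy as np

def query_vector(scans, n_blocks, target_len):
    # scans: {relation: {"type": "seq" or "index", "selectivity": float in [0, 1]}}
    # n_blocks: {relation: number of blocks}; assumes n_blocks[rel] >= target_len.
    rows = {}
    for rel, blocks in n_blocks.items():
        info = scans.get(rel)
        if info is None:
            prob = 0.0                        # relation not touched by this query
        elif info["type"] == "seq":
            prob = 1.0                        # full table scan: every block is read
        else:
            prob = info["selectivity"]        # index scan: uniform-block assumption
        full = np.full(blocks, prob)          # per-block access probabilities
        window = blocks // target_len         # same moving-average downsizing as Eq. (1)
        rows[rel] = full[: window * target_len].reshape(target_len, -1).mean(axis=1)
    return rows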
Deep Q-Learning.

SmartQueue uses deep Q-learning [17] in order to decide which query to execute next. As with any deep reinforcement learning system, SmartQueue is an agent that operates over a set of states S (buffer pool states) and a set of actions A per state (candidate queries to execute next). SmartQueue models the problem of query scheduling as a Markov Decision Process (MDP) [38]: by picking one query from the queue to execute, the agent transitions from the current buffer pool state to a new one (i.e., a new set of cached data blocks). Executing a new query on the current buffer state provides the agent with a reward. In our case, the reward of an action is the buffer hit ratio of the executed query, calculated as buffer hits / total block requests.

The goal of the agent is to learn a scheduling policy that maximizes its total reward. This is a continuous learning process: as more queries arrive and the agent makes more scheduling decisions, it collects more information (i.e., the context of each decision and its reward) and adapts its policy accordingly. The scheduling policy is expressed as a function Q(S_t, A_t) that outputs a Q-value for taking an action A_t (i.e., a query to execute next) on a buffer state S_t. Given a state S_t and an action A_t, the Q-value Q(S_t, A_t) is calculated by adding the maximum reward attainable from future buffer states to the reward for achieving the current buffer state, effectively influencing the current scheduling decision by the potential future reward. This potential reward is a weighted sum of the expected buffer hit ratios of all future scheduling decisions starting from the current buffer state. Formally, after each action A_t on a state S_t the agent learns a new policy Q_new(S_t, A_t) defined as:

Q_{new}(S_t, A_t) = Q(S_t, A_t) + \alpha \left[ R_t + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \right]    (2)

The parameter γ is the discount factor, which weighs the contribution of short-term vs. long-term rewards. Decreasing the value of γ diminishes the contribution of future rewards (favoring queries that make use of the current buffer state), while increasing it emphasizes them (favoring queries that allow long-term increased usage of the buffer). The parameter α is the learning rate, or step size. This simply determines to what extent newly acquired information overrides old information: a low learning rate implies that new information should be treated skeptically, and may be appropriate when a workload is mostly stable but contains some outliers. A high learning rate implies that new information is more fully trusted, and may be appropriate when query workloads change smoothly over time. Since the above is a recursive equation, the agent starts with arbitrary assumptions for all Q-values (and hence arbitrary initial scheduling decisions). However, as more experience is collected through the execution of incoming queries, the network likely converges to the optimal policy [27].
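The following is a minimal sketch of this update in its tabular form (Equation 2); SmartQueue itself approximates Q with a neural network, so the dictionary-based Q here is purely illustrative:

def q_update(Q, state, action, reward, next_state, next_actions, alpha=0.1, gamma=0.9):
    # Best attainable future value from the next buffer state.
    future = max((Q.get((next_state, a), 0.0) for a in next_actions), default=0.0)
    old = Q.get((state, action), 0.0)
    # Temporal-difference update: blend the old estimate with reward + discounted future.
    Q[(state, action)] = old + alpha * (reward + gamma * future - old)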
3. PRELIMINARY RESULTS
Here, we present preliminary experiments demonstrating that SmartQueue can generate query orderings that increase the buffer hit ratio and improve query execution times compared with alternative non-learned schedulers.
Experimental Setup.
Our experimental study used workloads generated using the 99 query templates of the TPC-DS benchmark [29]. We deployed a 49GB database on a single-node server with 4 cores and 32GB of RAM. For our experiments, we generated 1,000 random query instances out of these 99 templates and placed them in a random order in the execution queue. The benchmark includes 165 tables and indexes, and the number of blocks for each of these ranged between 100 and 130,000. For each query, we collect its query plan without executing the query by using the EXPLAIN command.

SmartQueue uses a fully-connected neural network. Our DRL agent was implemented with Keras [12] and uses 2 hidden layers with 128 neurons each. We also use an adaptive learning rate optimization algorithm (Adam [13]) and our loss function is the mean squared error.
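For illustration, a sketch of the kind of network described above follows; the input dimensionality and the exact layer arrangement beyond the stated two 128-unit hidden layers are assumptions, not the prototype's code:

from tensorflow import keras
from tensorflow.keras import layers

def build_q_network(input_dim):
    # Two fully connected hidden layers of 128 units, as described above; the single
    # output is the predicted Q-value for one (buffer state, candidate query) pair.
    model = keras.Sequential([
        layers.Dense(128, activation="relu", input_shape=(input_dim,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")   # Adam optimizer, mean squared error loss
    return model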
In our study, we compare SmartQueue with two alternative scheduling approaches. First-Come-First-Served (FCFS) simply executes queries in the order they appear in the queue. Greedy employs a simple heuristic to identify the query with the best expected hit ratio given the current contents of the buffer pool. Specifically, for each queued query it calculates the dot product of the buffer state bitmap with the data request bitmap, essentially estimating the probability of a buffer hit for each data block request. We then order all queries based on the sum of these probabilities over all blocks and execute the query with the highest sum. Following the execution, the new buffer state is calculated and the heuristic is applied again until the queue is empty. This greedy approach focuses on short-term buffer hit improvements (a minimal sketch of its scoring step appears below). We configured PostgreSQL to bypass the OS filesystem cache; in future work, multiple levels of caching should be considered.
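For clarity, here is a minimal sketch of the Greedy scoring step described above, with bitmaps as NumPy matrices of matching shape; names are illustrative:

import numpy as np

def pick_greedy(buffer_bitmap, query_bitmaps):
    # Score each queued query by the element-wise product of its request bitmap with
    # the buffer bitmap (expected buffer hits) and pick the highest-scoring query.
    scores = {q: float(np.sum(buffer_bitmap * bm)) for q, bm in query_bitmaps.items()}
    return max(scores, key=scores.get)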
Effectiveness. First, we demonstrate that SmartQueue can improve its effectiveness as it collects more experience. In this set of experiments, we placed all 1,000 queries in the queue and started scheduling them using SmartQueue. In the beginning our agent makes arbitrary scheduling decisions, but as it schedules more queries, SmartQueue collects more experience from its past actions and starts improving its policy. To demonstrate that, we evaluated the learned model at different stages of its training. Figures 2a and 2b show how the model performs as we increase the number of training queries. In Figure 2a, we measure the average buffer hit ratio when scheduling our 1,000 queries and compare it with the buffer hit ratio of FCFS and Greedy (which is not affected by the number of training queries). We observe that the DRL agent is able to improve the buffer hit ratio as it schedules more queries. It outperforms the buffer hit ratio of the other two heuristics, eventually converging to a ratio that is 65% higher than FCFS and 35% higher than Greedy.

In addition, Figure 2b shows the number of executed queries over time. The results demonstrate that the DRL-guided scheduling of SmartQueue allows our approach to execute the workload of 1,000 queries around 42% faster than Greedy and 55% faster than FCFS. This indicates that SmartQueue can effectively capture the relationship between buffer pool state and data access patterns, and leverage that to better utilize the buffer pool and improve its query scheduling decisions.
Adaptability to new queries. Next, we studied SmartQueue's ability to adapt to unseen queries. For these experiments, we trained SmartQueue by first scheduling 950 random queries drawn from 79 TPC-DS templates. We then tested the model on 50 random queries drawn from 20 previously unseen TPC-DS templates. Figure 3a demonstrates how the average buffer hit ratio of the testing queries is affected as SmartQueue collects experience from scheduling more training queries. The graph shows that the average buffer hit ratio of the testing queries increases from 0.2 (when SmartQueue is untrained) to 0.64 (when SmartQueue has scheduled all 950 queries). Furthermore, SmartQueue outperforms FCFS and Greedy after having scheduled fewer than 500 queries.

Finally, Figure 3b shows that the query latency of our testing queries keeps decreasing (and eventually outperforms FCFS and Greedy) as SmartQueue is trained on more queries. Our approach enables unseen queries to eventually be executed 11% faster than FCFS and 22% faster than Greedy. These results indicate that the query scheduling policy can adapt to new query templates, leading to significant performance and resource sharing improvements.
Overhead. We also measured the training and inference time. Our proof-of-concept prototype needed 240 mins to incorporate 950 queries into our agent (so on average the training overhead is 3.95 mins per query). This time does not include the execution time of the queries. This training overhead can potentially be optimized by offloading it into another thread, introducing early stopping, or re-using previous network weights to get a good "starting point." There is no training overhead for FCFS and Greedy. The inference time of SmartQueue is 3.12 seconds, while the inference time for Greedy is 2.52 seconds and 0.0012 seconds for FCFS.
[Figure 2: SmartQueue's effectiveness (buffer hit ratio and query completion rate) with increasing training sets. (a) Average buffer hit ratio; (b) query execution time (queries completed over time).]

[Figure 3: Buffer hit ratio and latency improvement on unseen query templates and increasing training queries. (a) Average buffer hit ratio; (b) query latency (minutes).]
4. RELATED WORK
Prior work on query scheduling has focused on query parallelism [40], elastic cloud databases [4, 9, 18, 19, 22, 24, 30], meeting SLAs [6, 7, 15, 25, 33, 34, 43, 44], or cluster scheduling [20, 36, 39]. In terms of buffer pools and caching, most prior work has focused on smart cache management [2, 10] (i.e., assuming the query order is fixed and choosing which blocks to evict or replace), or on (memory) cache-aware algorithms [46]. Here, we take a flipped approach, in which we assume the buffer management policy is fixed and the query order may be modified (e.g., batch processing). More broadly, our work follows recent trends in integrating machine learning components into systems [11], especially database systems. Machine learning techniques have also been applied to query optimization [23, 32, 41], cardinality estimation [14, 31, 45], cost modeling [37], data integration [8, 28], tuning [42], and security [35].
5. CONCLUSION AND FUTURE WORK
We have presented SmartQueue, a deep reinforcement learning query scheduler that seeks to maximize buffer hit rates in database management systems. While simple, SmartQueue was able to provide substantial improvements over naive and simple heuristics, suggesting that cache-aware, deep-learning-powered query schedulers are a promising research direction. SmartQueue is only an early prototype, and in the future we plan to conduct a full experimental study of SmartQueue. In general, we believe the following areas of future work are promising.
Neural network architecture.
While effective in our initial experiments, a fully connected neural network is likely not the correct inductive bias [26] for this problem. A fully connected neural network is not likely to innately carry much useful information for query scheduling [21], nor is there much of an intuitive connection between a fully-connected architecture and the query scheduling problem [5]. The first layer of our network learns one linear combination of the entire input per neuron. These linear combinations would have to be extremely sparse to learn features like "the query reads this block, which is cached." Other network architectures, like locally connected neural networks [16], may provide significant benefit.
SLAs.
Improving raw workload latency is helpful, but often applications have much more complex performance requirements (e.g., some queries are more important than others). Integrating query priorities and customizable Service Level Agreements (SLAs) into SmartQueue by modifying the reward signal could result in a buffer-aware and SLA-compliant scheduler.
Query optimization.
Different query plans may perform differently with different buffer states. Integrating SmartQueue into the query optimizer, so that query plans can be selected to maximize buffer usage, may provide significant performance gains.
Buffer management.
SmartQueue only considers query ordering, and assumes that the buffer management policy is opaque. A larger system could consider both query ordering and buffer management, choosing to evict or hold buffered blocks based on future queries. Such a system could represent an end-to-end query scheduling and buffer management policy.

6. REFERENCES
[1] …
[2] … VLDB '03.
[3] K. Arulkumaran et al. A Brief Survey of Deep Reinforcement Learning. IEEE Signal Processing '17.
[4] Y. Azar et al. Cloud scheduling with setup cost. In SPAA '13.
[5] P. W. Battaglia et al. Relational inductive biases, deep learning, and graph networks. arXiv '18.
[6] Y. Chi et al. iCBS: Incremental Cost-based Scheduling Under Piecewise Linear SLAs. VLDB '11.
[7] Y. Chi et al. SLA-tree: A Framework for Efficiently Supporting SLA-based Decisions in Cloud Computing. In EDBT '11.
[8] R. C. Fernandez et al. Termite: A System for Tunneling Through Heterogeneous Data. In aiDM '19.
[9] S. Genaud et al. Cost-wait trade-offs in client-side resource provisioning with elastic clouds. In CLOUD '11.
[10] S. Ghandeharizadeh et al. Cache augmented database management systems. In DBSocial '13.
[11] J. Gottschlich et al. The three pillars of machine programming. In MAPL '18.
[12] Keras: The Python Deep Learning library. https://keras.io/.
[13] D. P. Kingma et al. Adam: A Method for Stochastic Optimization. In ICLR '15.
[14] A. Kipf et al. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In CIDR '19.
[15] W. Lang et al. Towards Multi-Tenant Performance SLOs. In ICDE '14.
[16] Y. LeCun. Generalization and network design strategies. In Connectionism '89.
[17] Y. LeCun et al. Deep learning. Nature '15.
[18] P. Leitner et al. Cost-Efficient and Application SLA-Aware Client Side Request Scheduling in an Infrastructure-as-a-Service Cloud. In CLOUD '12.
[19] Z. Liu et al. PMAX: Tenant Placement in Multitenant Databases for Profit Maximization. In EDBT '13.
[20] H. Mao et al. Learning Scheduling Algorithms for Data Processing Clusters. arXiv '18.
[21] G. Marcus. Innateness, AlphaZero, and Artificial Intelligence. arXiv '18.
[22] R. Marcus et al. NashDB: An End-to-End Economic Method for Elastic Database Fragmentation, Replication, and Provisioning. In SIGMOD '18.
[23] R. Marcus et al. Neo: A Learned Query Optimizer. VLDB '19.
[24] R. Marcus et al. Releasing Cloud Databases from the Chains of Performance Prediction Models. In CIDR '17.
[25] R. Marcus et al. WiSeDB: A Learning-based Workload Management Advisor for Cloud Databases. VLDB '16.
[26] T. M. Mitchell. The Need for Biases in Learning Generalizations. PhD thesis.
[27] V. Mnih et al. Human-level control through deep reinforcement learning. Nature '15.
[28] S. Mudgal et al. Deep Learning for Entity Matching: A Design Space Exploration. In SIGMOD '18.
[29] R. O. Nambiar et al. The Making of TPC-DS. In VLDB '06.
[30] V. Narasayya et al. SQLVM: Performance Isolation in Multi-Tenant Relational Database-as-a-Service. In CIDR '13.
[31] P. Negi et al. Cost-Guided Cardinality Estimation: Focus Where it Matters. In SMDB @ ICDE '20.
[32] J. Ortiz et al. Learning State Representations for Query Optimization with Deep Reinforcement Learning. In DEEM '18.
[33] J. Ortiz et al. PerfEnforce Demonstration: Data Analytics with Performance Guarantees. In SIGMOD '16.
[34] J. Ortiz et al. SLAOrchestrator: Reducing the Cost of Performance SLAs for Cloud Data Analytics. In USENIX '18.
[35] S. Jain et al. Database-Agnostic Workload Management. In CIDR '19.
[36] B. Sotomayor et al. Virtual Infrastructure Management in Private and Hybrid Clouds. IEEE IC '09.
[37] J. Sun et al. An end-to-end learning-based cost estimator. VLDB '19.
[38] R. S. Sutton et al. Introduction to Reinforcement Learning. 1998.
[39] R. Taft et al. STeP: Scalable Tenant Placement for Managing Database-as-a-Service Deployments. In SoCC '16.
[40] S. Tozer et al. Q-Cop: Avoiding bad query mixes to minimize client timeouts under heavy loads. In ICDE '10.
[41] I. Trummer et al. SkinnerDB: Regret-bounded Query Evaluation via Reinforcement Learning. VLDB '18.
[42] D. Van Aken et al. Automatic Database Management System Tuning Through Large-scale Machine Learning. In SIGMOD '17.
[43] P. Xiong et al. ActiveSLA: A Profit-oriented Admission Control Framework for Database-as-a-Service Providers. In SoCC '11.
[44] P. Xiong et al. Intelligent management of virtualized resources for database systems in cloud environment. In ICDE '11.
[45] Z. Yang et al. Deep unsupervised cardinality estimation. VLDB '19.
[46] J. Zhou et al. Buffering database operations for enhanced instruction cache performance. In SIGMOD '04.