Robustness Comparison of Scheduling Algorithms in MapReduce Framework

Amirali Daghighi · Jim Q. Chen
St. Cloud State University

arXiv preprint [cs.PF]

Abstract

Parallel computing is the fundamental base for the MapReduce framework in Hadoop. Big data is split into small data chunks, and a Map task refers to the processing of one data chunk. Each data chunk is replicated over three servers in Hadoop to increase the availability of data and decrease the probability of data loss. As a result, the three servers that have the data of a Map task stored on their disks are the fastest servers to process it and are called local servers. All the servers in the same rack as the local servers are called rack-local servers; they are slower than local servers in processing Map tasks since the data chunk associated with the Map task must be fetched through the top of the rack switch. All the other servers are called remote servers; they are the slowest servers for processing a Map task since they need to fetch data from a local server in another rack, so data must be transmitted through at least two top of the rack switches and a core switch. Note that the number of switches in the path of data transfer depends on the internal network structure of data centers. The First-In-First-Out (FIFO) and Hadoop Fair Scheduler (HFS) algorithms do not take the rack structure of data centers into account, so they are known to be neither heavy-traffic delay optimal nor even throughput optimal. Recent advances in scheduling for data centers that consider their rack structure and the heterogeneity of servers resulted in the state-of-the-art Balanced-PANDAS algorithm, which outperforms the classic MaxWeight algorithm and its derivative, the JSQ-MaxWeight algorithm. In both the Balanced-PANDAS and MaxWeight algorithms, the processing rates of local, rack-local, and remote servers are assumed to be known. However, with the change of traffic over time in addition to estimation errors of processing rates, it is not realistic to consider the processing rates to be known. In this work, we study the robustness of the Balanced-PANDAS and MaxWeight algorithms in terms of inaccurate estimations of processing rates. We observe that Balanced-PANDAS is not as sensitive as MaxWeight to the accuracy of processing rates, making it more appealing to use in data centers.

Keywords

Hadoop · MapReduce · Data center · Scheduling · Load balancing · Robustness

1 Introduction

Parallel computing for big data has different applications, from online social networks, the health-care industry, advertisement placement, and machine learning to search engines and data centers. The most popular big data parallel computing framework is MapReduce, which is broadly used in Hadoop [60], Dryad [25], Google [14], Deep Learning [11], and grid-computing [26, 46, 47]. Before talking about MapReduce, we present some details on the network structure of data centers. Data centers used to consist mainly of two parts, storage and computing, connected to each other through a high-bandwidth network link. When this structure was in use, big data was not yet common, so moving data from storage to the processing unit did not create a bottleneck. However, with the emergence of big data in the health-care industry, machine learning applications, search engines, etc., the network between the two units was unable to accommodate fast and reliable data transfer. Hence, scientists came up with the idea of bringing data to the processing unit by splitting both the data and the processing units into hundreds of small units and combining each computing unit with a small storage unit. As a result, each small unit consisting of storage and processing units, called a server, is capable of storing small pieces of data and processing them at the same time. In other words, data does not need to be transferred from the storage unit to the processing unit since they are already together. However, note that big data cannot be stored in the storage unit of a single server. The solution is to split the big data into small chunks of data, typically 64-128 MB, and store them on multiple servers. In practice, each data chunk is stored on three servers to increase availability and decrease the probability of data loss.
As a result, processing big data consists of processing the multiple data chunks that make up the whole data set and concatenating the per-chunk results to complete the job. The processing of each data chunk is called a Map task, and the concatenation of the results of all the Map tasks is called a Reduce task; together they make up the MapReduce processing framework for big data. Note that at least for Reduce tasks, servers need to be connected and cannot be completely isolated. We later see that even for executing Map tasks, servers may need to exchange data. Hence, servers are connected to each other through links and switches. The structure of the switches connecting servers is a complete field of research in computer science, but servers are generally connected to each other by top of the rack switches as well as core switches in the following way. The hundreds of servers are grouped into batches of 20-50 servers, where the servers in each batch are inter-connected by a switch, called a rack switch, and all rack switches are connected to one or more core switches, which connect all the servers.

The rack structure of data centers brings a lot of complexity to load balancing. As a result, most theoretical work on load balancing for data centers either considers a homogeneous model of servers or ignores the rack structure and only considers data locality, where data locality refers to the fact that the data chunk associated with a Map task is stored on three servers, so it is not immediately available on other servers. Examples of works that consider a homogeneous server model are [31], [52], [29], and [32]. A branch of research on homogeneous servers is the utilization of the power of two or more choices for load balancing, which lowers the messaging overhead between hundreds of servers and the core load balancing scheduler.
For examples, see [13], [37], [19], [43], [6], [16], [34], [10], [33], and [20]. Although there has been a huge body of work on heuristic algorithms for the heterogeneous server model, examples of which are [59], [26], [70], [27], [23], [42], [71], [12], [28], [38], and [49], there have been only a few recent works on algorithms with theoretical guarantees for such more complicated models.

The scheduling problem for a data center with a rack structure is a specific case of the open affinity scheduling problem, where each task type can be processed by each server but with different processing rates. The classic MaxWeight algorithm [54], [48], [53], [36], the cμ-rule [55], [35], and the works by Harrison and Lopez [22], [21] and Bell and Williams [4], [3] take different approaches to solving the load balancing problem for affinity scheduling, but they either do not achieve delay optimality or rely on unrealistic assumptions, including known task arrival rates and the existence of one queue per task type. The state of the art on scheduling for data centers that considers the rack structure, requires no knowledge of task arrival rates, and keeps the number of queues on the order of the number of servers rather than the number of task types is presented by Xie et al. [63] and Yekkehkhany [65], and is extended to a general number of data locality levels by Yekkehkhany et al. [67]. The central idea of all the algorithms in [63], [65], and [67] is to use the weighted workload on servers instead of the queue lengths, which leads to better performance in terms of the average delay experienced by submitted tasks. The Balanced-PANDAS algorithm, where PANDAS stands for Priority Algorithm for Near-Data Scheduling, is the name for the weighted-workload based algorithms proposed in [63], [65], and [67]. The Join-the-Shortest-Queue-MaxWeight (JSQ-MaxWeight, JSQ-MW) algorithm proposed by Wang et al. [57], which only considers data locality, is also extended to the case where the rack structure is considered by Xie et al. [63].
The priority algorithm proposed in [62] is another work that only considers data locality, not the rack structure of data centers. It is interesting in its own right since both throughput optimality and heavy-traffic delay optimality are proved for it; however, it is not even throughput optimal for a system with a rack structure.

All of the algorithms mentioned above assume complete knowledge of the processing rates of different task types on different servers. However, in reality the processing rates are mostly not known, due to errors in estimation methods and to changes in the system structure or in the traffic over time, which can change the processing rates. Hence, it is important that the algorithm used for load balancing is robust to estimation errors of processing rates. In this work, we run extensive simulations to evaluate the robustness of the state-of-the-art algorithms on load balancing with different levels of data locality. It is observed that the Balanced-PANDAS algorithm not only has a better heavy-traffic delay performance, but is also more robust to changes in processing rate estimates, while the MaxWeight based algorithm does not perform as well as Balanced-PANDAS under processing rate estimation errors. In order to estimate the processing rates of tasks on servers and better model the data center structure, reinforcement learning methods can be used, as discussed in [69], [40], [41], [39]. A recent work by Yekkehkhany and Nagi [68] considers an exploration-exploitation approach, as in reinforcement learning, to both learn the processing rates and exploit load balancing based on the current estimates of the processing rates. They propose the Blind GB-PANDAS algorithm, which is proven to be throughput optimal and to have a lower mean task completion time than the existing methods.
A more sophisticated risk-averse exploration-exploitation approach can be considered for this problem, as discussed in [66].

The rest of the paper is organized as follows. Section 2 presents the system model that is used throughout the paper, and section 3 summarizes the preliminary material, including the description of the Balanced-PANDAS, Priority, and MaxWeight based algorithms, that is needed before we present the robustness comparison among the different algorithms in section 4.

2 System model

We consider the same system model described in [63] and [67], as follows. A discrete time model is considered, where time is indexed by $t \in \mathbb{N}$. Assume a data center with $M$ servers and denote the set of all servers as $\mathcal{M} = \{1, 2, \cdots, M\}$. Without loss of generality, assume that the first $M_R$ servers are connected to each other with a top of the rack switch and are called the first rack, the second $M_R$ servers are connected to each other with another top of the rack switch and are called the second rack, and so on. Hence, there are $N_R = M / M_R$ racks in total. All the top of the rack switches are connected to each other with one or more core switches in a symmetric manner. As a result, there are three levels of data locality, as described below. Recall that the data chunk associated with a map task is stored on three servers by Hadoop's default, so all those three servers are called local servers for the map task; in other words, the map task can receive service locally from those three servers. Since these servers have the data chunk of their local tasks, processing starts immediately after a server is assigned to process them. Note that the three servers storing the data for a map task are normally different for different map tasks. Hence, we associate a type with each map task, which is the label of the three servers, i.e. $(m_1, m_2, m_3) \in \mathcal{M}^3$, such that $m_1 < m_2 < m_3$.
This gives us a unique and informative way of representing different task types, as follows:

$$\bar{L} \in \mathcal{L} = \{ (m_1, m_2, m_3) \in \mathcal{M}^3 : m_1 < m_2 < m_3 \},$$

where a task type is denoted by $\bar{L} = (m_1, m_2, m_3)$, given that $m_1$, $m_2$, and $m_3$ are the three local servers for a task of type $\bar{L}$, and the set of all task types is denoted by $\mathcal{L}$.

A map task is not limited to receiving service from one of the local servers. It can receive service from one of the servers that are in the same rack as the local servers, with a slightly lower service rate. The slight reduction in processing rate for such servers, which are called rack-local servers, is due to the travel time of the data associated with a map task from a local server to the rack-local server that is assigned to process the map task rack-locally. Finally, all servers other than the local and rack-local servers, which are called remote servers, have the lowest processing rate for a map task, since data needs to be transmitted through at least two top of the rack switches and a core switch, so the server cannot immediately start processing the task when it is assigned to do so remotely. In order to formally define the rack-local and remote servers, we need to introduce some notation. Let $R(m) \in \{1, 2, \cdots, N_R\}$ denote the label of the rack that the $m$-th server belongs to. Then the sets of rack-local and remote servers for a task of type $\bar{L} = (m_1, m_2, m_3)$, denoted by $\bar{L}_k$ and $\bar{L}_r$, respectively, are as follows:

$$\bar{L}_k = \left\{ m \in \mathcal{M} : m \notin (m_1, m_2, m_3),\ R(m) \in \big( R(m_1), R(m_2), R(m_3) \big) \right\},$$

$$\bar{L}_r = \left\{ m \in \mathcal{M} : R(m) \notin \big( R(m_1), R(m_2), R(m_3) \big) \right\}.$$

The service and arrival processes of tasks are described below.
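
Before turning to the arrival and service processes, the locality sets defined above can be illustrated with a minimal Python sketch. It assumes servers and racks are numbered from 1 and that racks are filled in order, as in the model; the function names `rack_of` and `locality_sets` and the toy sizes are hypothetical and chosen only for illustration.

```python
def rack_of(m: int, servers_per_rack: int) -> int:
    """Rack label R(m) of server m, with servers and racks numbered from 1."""
    return (m - 1) // servers_per_rack + 1


def locality_sets(task_type, num_servers, servers_per_rack):
    """Return (local, rack_local, remote) server sets for a task type (m1, m2, m3)."""
    local = set(task_type)
    local_racks = {rack_of(m, servers_per_rack) for m in local}
    all_servers = set(range(1, num_servers + 1))
    # Rack-local: not a local server, but in the same rack as some local server.
    rack_local = {m for m in all_servers - local
                  if rack_of(m, servers_per_rack) in local_racks}
    # Remote: every server in a rack that holds no replica of the data chunk.
    remote = all_servers - local - rack_local
    return local, rack_local, remote


# Toy data center (hypothetical sizes): M = 12 servers, M_R = 4 per rack, N_R = 3 racks.
local, rack_local, remote = locality_sets((1, 2, 6), num_servers=12, servers_per_rack=4)
print(sorted(local), sorted(rack_local), sorted(remote))
```

For the task type (1, 2, 6), the replicas sit in racks 1 and 2, so the remaining servers of those two racks are rack-local and the whole third rack is remote.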

Task arrival process:

Let $A_{\bar{L}}(t)$ denote the number of tasks of type $\bar{L}$ that arrive to the system at time slot $t$, where $\mathbb{E}[A_{\bar{L}}(t)] = \lambda_{\bar{L}}$, and it is assumed that $A_{\bar{L}}(t) < C_A$ and $P(A_{\bar{L}}(t) = 0) > 0$. The set of arrival rates for all task types is denoted by $\lambda = (\lambda_{\bar{L}} : \bar{L} \in \mathcal{L})$.

Service process:

The processing of a task on a local server is assumed to be faster than on a rack-local server, and the processing of a task on a rack-local server is faster than on a remote server. This fact is formalized as follows. The processing time of a task on a local, rack-local, and remote server has means $1/\alpha$, $1/\beta$, and $1/\gamma$, respectively, where $\alpha > \beta > \gamma$. Note that the processing time of a task can have any distribution with the given means, but the heavy-traffic delay optimality of the Balanced-PANDAS algorithm is only proven under a Geometric service time distribution, while the MaxWeight based algorithm does not have a general heavy-traffic delay optimality under any distribution for service time.
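
As a small illustration of the service model, the sketch below samples geometric service times in the discrete-time setting, treating $\alpha$, $\beta$, and $\gamma$ as per-slot completion probabilities so that the mean processing time is the reciprocal of the rate (consistent with the capacity region condition). The numeric rates and the helper name `geometric_service_time` are assumptions made for this example, not values from the paper.

```python
import random


def geometric_service_time(rate: float, rng: random.Random) -> int:
    """Slots needed to finish a task that completes in each slot with probability `rate`."""
    slots = 1
    while rng.random() >= rate:
        slots += 1
    return slots


# Hypothetical rates with alpha > beta > gamma (not taken from the paper).
rates = {"local": 0.8, "rack-local": 0.6, "remote": 0.3}

rng = random.Random(0)
empirical_means = {}
for level, rate in rates.items():
    samples = [geometric_service_time(rate, rng) for _ in range(100_000)]
    empirical_means[level] = sum(samples) / len(samples)
    print(f"{level}: empirical mean {empirical_means[level]:.3f}, expected {1 / rate:.3f}")
```

The empirical means should land close to $1/\alpha$, $1/\beta$, and $1/\gamma$, which makes the ordering of the three locality levels visible: the lower the rate, the longer a task occupies a server.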

Capacity region characterization:

An arrival rate vector for all task types is supportable for service by the $M$ servers if and only if the load on each server is strictly less than the capacity of the server. Considering a processing rate of one for each server, an arrival rate vector $\lambda = (\lambda_{\bar{L}} : \bar{L} \in \mathcal{L})$ is in the capacity region of the system if and only if:

$$\sum_{\bar{L} : m \in \bar{L}} \frac{\lambda_{\bar{L},m}}{\alpha} + \sum_{\bar{L} : m \in \bar{L}_k} \frac{\lambda_{\bar{L},m}}{\beta} + \sum_{\bar{L} : m \in \bar{L}_r} \frac{\lambda_{\bar{L},m}}{\gamma} < 1, \quad \forall m \in \mathcal{M},$$

where $\lambda_{\bar{L},m}$ is the rate of incoming tasks of type $\bar{L}$ that are processed by server $m$, so that $\sum_{m \in \mathcal{M}} \lambda_{\bar{L},m} = \lambda_{\bar{L}}$.

3 Load balancing algorithms

In this section, we introduce the three main algorithms for scheduling in data centers with two or more levels of data locality that have theoretical guarantees on optimality in some senses and under some conditions. In order to introduce the load balancing algorithm of each method, we also need to present the queueing structure required for that method. The three algorithms are:

1. Priority algorithm [62], which is best fit for applications with two levels of data locality, e.g. for the cases where only data locality is taken into account. An example is scheduling for Amazon Web Services inside a rack.
2. Balanced-PANDAS [63], [65], and [67], which is the state of the art for scheduling applications with multiple levels of data locality and is observed to improve the average task completion time fourfold in comparison to MaxWeight based algorithms. It is proven by Xie et al. [63] that under mild conditions, Balanced-PANDAS is both throughput and heavy-traffic delay optimal for a system with three levels of data locality and a rack structure.
3. MaxWeight based algorithms [53] and [58], which can be used for multiple levels of data locality and are throughput optimal, but not heavy-traffic delay optimal; it is observed that they generally have poor performance at high loads compared to the weighted-workload based algorithm used in Balanced-PANDAS.

The following three subsections present a complete introduction to these three main algorithms.

3.1 Priority algorithm

The Priority algorithm is designed for a system with two levels of data locality. In other words, it only considers data locality, but not the rack structure. Hence, there are only local and remote servers from the perspective of a task. The queueing structure under this algorithm is to have a single queue per server, where the queue corresponding to a server only keeps tasks that are local to that server. At the arrival of a task, a central scheduler routes the incoming task to the local server with the shortest queue length. An idle server is scheduled to process a task in its corresponding queue as long as there is one, and if the idle server's queue length is zero, it is scheduled to process a task from the longest queue in the system. The priority algorithm is proved to be both throughput and heavy-traffic delay optimal. However, its extension to more than two levels of data locality is not even throughput optimal, let alone heavy-traffic delay optimal.

3.2 Balanced-PANDAS algorithm

The Balanced-PANDAS algorithm can be used for a system with multiple levels of data locality, but here we present the algorithm for a data center with a rack structure with three levels of data locality. The queueing structure under this algorithm is to have three queues per server: one queue for storing tasks that are local to the server, another queue for storing rack-local tasks, and a third queue for storing remote tasks.
Hence, server $m$ has a tuple of three queues denoted by $(Q_m^l, Q_m^k, Q_m^r)$, which refer to the queues storing local, rack-local, and remote tasks, respectively. The corresponding queue lengths at time $t$ are denoted by $(Q_m^l(t), Q_m^k(t), Q_m^r(t))$. The workload on server $m$ at time slot $t$ is defined as follows:

$$W_m(t) = \frac{Q_m^l(t)}{\alpha} + \frac{Q_m^k(t)}{\beta} + \frac{Q_m^r(t)}{\gamma}.$$

An incoming task of type $\bar{L}$ is routed to the corresponding queue of the server with the minimum weighted workload, where ties are broken randomly, in the set below:

$$\arg\min_{m \in \mathcal{M}} \left\{ \frac{W_m(t)}{\alpha \cdot \mathbf{1}_{\{m \in \bar{L}\}} + \beta \cdot \mathbf{1}_{\{m \in \bar{L}_k\}} + \gamma \cdot \mathbf{1}_{\{m \in \bar{L}_r\}}} \right\}.$$

An idle server $m$ at time slot $t$ is scheduled to process a local task from $Q_m^l$ if $Q_m^l(t) \neq 0$; otherwise, it is scheduled to process a rack-local task from $Q_m^k$ if $Q_m^k(t) \neq 0$; otherwise, it is scheduled to process a remote task from $Q_m^r$ if $Q_m^r(t) \neq 0$; otherwise, it remains idle until a task joins one of its three queues. The Balanced-PANDAS algorithm is throughput optimal. It is also heavy-traffic delay optimal for a system with a rack structure of three levels of data locality if $\beta^2 > \alpha \cdot \gamma$, which means that the rack-local service is faster than the remote service in a specific manner.

3.3 MaxWeight based algorithm

The MaxWeight algorithm is proposed in [53], and a modification of it called the JSQ-MaxWeight algorithm is proposed in [58], which is described below. Consider one queue per server, i.e. server $m$ has a single queue called $Q_m$, whose length at time slot $t$ is denoted by $Q_m(t)$. The routing policy is the same as in the Priority algorithm, i.e. an incoming task of type $\bar{L}$ is routed to the shortest queue among the local servers in $\bar{L}$. This routing policy is called join-the-shortest-queue (JSQ).
An idle server $m$ at time slot $t$, on the other hand, is scheduled to process a task from a queue with the maximum weighted queue length, where ties are broken at random, in the set below:

$$\arg\max_{n \in \mathcal{M}} \left\{ \left( \alpha \cdot \mathbf{1}_{\{n = m\}} + \beta \cdot \mathbf{1}_{\{n \neq m,\ R(n) = R(m)\}} + \gamma \cdot \mathbf{1}_{\{R(n) \neq R(m)\}} \right) \cdot Q_n(t) \right\}.$$

The JSQ-MaxWeight algorithm is throughput optimal, but it is not heavy-traffic delay optimal.
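
The routing and scheduling rules of Balanced-PANDAS from section 3.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the rates `ALPHA`, `BETA`, `GAMMA`, the `Server` class, and the six-server layout are hypothetical, and queues are represented only by their lengths.

```python
import random
from dataclasses import dataclass, field

# Hypothetical local, rack-local, and remote processing rates (alpha > beta > gamma).
ALPHA, BETA, GAMMA = 0.8, 0.6, 0.3
RATE = {"local": ALPHA, "rack": BETA, "remote": GAMMA}


@dataclass
class Server:
    rack: int
    queues: dict = field(default_factory=lambda: {"local": 0, "rack": 0, "remote": 0})

    def workload(self) -> float:
        # W_m(t) = Q^l/alpha + Q^k/beta + Q^r/gamma
        return sum(self.queues[level] / RATE[level] for level in RATE)


def locality(servers, m, task_type):
    """Locality level of server m for a task whose local servers are task_type."""
    if m in task_type:
        return "local"
    if servers[m].rack in {servers[i].rack for i in task_type}:
        return "rack"
    return "remote"


def route(servers, task_type, rng=random):
    """Route an arriving task to the server with the minimum weighted workload."""
    def weighted(m):
        return servers[m].workload() / RATE[locality(servers, m, task_type)]

    best = min(weighted(m) for m in servers)
    m = rng.choice([m for m in servers if weighted(m) == best])  # random tie-breaking
    level = locality(servers, m, task_type)
    servers[m].queues[level] += 1
    return m, level


def next_task(server):
    """An idle server serves its local queue first, then rack-local, then remote."""
    for level in ("local", "rack", "remote"):
        if server.queues[level] > 0:
            server.queues[level] -= 1
            return level
    return None  # all three queues are empty: the server stays idle


# Toy example: three racks with two servers each, task type (1, 2, 3).
servers = {1: Server(1), 2: Server(1), 3: Server(2), 4: Server(2), 5: Server(3), 6: Server(3)}
for m in (1, 2, 3):
    servers[m].queues["local"] = 4
servers[4].queues["remote"] = 1
servers[5].queues["local"] = 2
servers[6].queues["local"] = 2
print(route(servers, task_type=(1, 2, 3)))  # (4, 'rack')
```

In this toy state the three local servers are busy enough that the lightly loaded rack-local server 4 has the smallest weighted workload, so the task is routed there even though it will be served at rate $\beta$ rather than $\alpha$.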

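For comparison, the JSQ-MaxWeight rules of section 3.3 can be sketched in the same style. Again this is only an illustration: the rates, the dictionaries `queue_len` and `rack`, and the choice to return `None` when every queue is empty are assumptions of this sketch.

```python
import random

# Hypothetical local, rack-local, and remote processing rates (alpha > beta > gamma).
ALPHA, BETA, GAMMA = 0.8, 0.6, 0.3


def jsq_route(queue_len, task_type, rng=random):
    """Join the shortest queue among the three local servers of the task."""
    shortest = min(queue_len[m] for m in task_type)
    m = rng.choice([m for m in task_type if queue_len[m] == shortest])
    queue_len[m] += 1
    return m


def maxweight_pick(queue_len, rack, m, rng=random):
    """Queue that idle server m serves next, or None if every queue is empty."""
    def weight(n):
        if n == m:
            rate = ALPHA   # server m's own queue holds tasks local to m
        elif rack[n] == rack[m]:
            rate = BETA    # queue of another server in the same rack
        else:
            rate = GAMMA   # queue of a server in a different rack
        return rate * queue_len[n]

    best = max(weight(n) for n in queue_len)
    if best == 0:
        return None
    n = rng.choice([n for n in queue_len if weight(n) == best])  # random tie-breaking
    queue_len[n] -= 1
    return n


# Toy example: two racks with two servers each.
rack = {1: 1, 2: 1, 3: 2, 4: 2}
queue_len = {1: 1, 2: 2, 3: 5, 4: 0}
print(maxweight_pick(queue_len, rack, m=1))        # 3
print(jsq_route(queue_len, task_type=(1, 2, 3)))   # 1
```

Here idle server 1 prefers the long remote queue of server 3 (weight 0.3 · 5 = 1.5) over its own short queue (weight 0.8 · 1 = 0.8), which shows how the weights trade locality against queue length.
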
4 Robustness comparison of scheduling algorithms

In this section, we present the results of our extensive simulations on the robustness of the scheduling algorithms presented in section 3. To this end, we run the algorithms with parameters that contain errors to study which algorithm tolerates errors better than the others. More specifically, we use incorrect $\alpha$, $\beta$, and $\gamma$ in the algorithms for calculating weighted workloads or weighted queue lengths and observe the average task completion time under these scenarios. The arrival process is a Poisson process and the processing time has an exponential distribution. We have also tested the algorithms for processing times with heavy-tailed distributions and observed similar results. We make these parameters 5%, 10%, 15%, 20%, 25%, and 30% off their real values, either greater or smaller than the real value, and evaluate the algorithms under these cases. The traffic load is assumed to be the same under all algorithms so that the comparison makes sense. We further compare all three algorithms mentioned in section 3 with Hadoop's default scheduler, which is First-In-First-Out (FIFO). Figure 1 shows the comparison between the four algorithms when the precise values of the parameters are known by the central scheduler. As we see, the Balanced-PANDAS algorithm has the lowest average task completion time at high loads. A closer look at high loads is presented in figure 2, where Balanced-PANDAS clearly outperforms JSQ-MaxWeight in terms of average task completion time.

The performance of the four algorithms is compared when the parameters are lower than their real values by certain percentages; the results are shown in figure 3. As is seen, Balanced-PANDAS has the best performance among all algorithms as the parameters' error changes from 5% to 30%. In fact, figure 4 shows that Balanced-PANDAS has the least sensitivity to changes of the parameters, while JSQ-MaxWeight's performance varies notably as the error in parameter estimation increases.

The comparison of the algorithms when the parameters are off by some percentage but higher than their real values is given in figure 5. It is again observed that the Balanced-PANDAS algorithm has consistently better performance than the JSQ-MaxWeight algorithm. The sensitivity comparison of the Balanced-PANDAS and JSQ-MaxWeight algorithms in this case is presented in figure 6.

Fig. 1

Comparison of the algorithms using the precise value of parameters.

Fig. 2

Comparison of Balanced-PANDAS and JSQ-MaxWeight at high loads using the precise value of parameters.
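
The parameter perturbation used in the robustness experiments of this section can be sketched as follows. The text does not say whether the three parameters are perturbed jointly or one at a time, so the helper below takes the set of perturbed parameters as an argument; that argument, the name `perturb`, the true rates, and the choice to perturb only $\beta$ in the example are all assumptions of this sketch. One reason the choice matters: scaling all three rates by one common factor rescales every weighted workload and every weighted queue length by the same amount, so it leaves both the arg min and the arg max decisions unchanged.

```python
ERROR_LEVELS = (0.05, 0.10, 0.15, 0.20, 0.25, 0.30)


def perturb(true_rates, error, direction, which=("alpha", "beta", "gamma")):
    """Copy of true_rates with the rates named in `which` scaled by 1 + direction * error.

    direction is +1 for an overestimate and -1 for an underestimate.
    """
    factor = 1 + direction * error
    return {name: rate * factor if name in which else rate
            for name, rate in true_rates.items()}


# Hypothetical true rates (not taken from the paper).
true_rates = {"alpha": 0.8, "beta": 0.6, "gamma": 0.3}

# Twelve scenarios: six error levels, each lower (-1) and higher (+1) than the real value.
# Perturbing only beta is an arbitrary choice made for illustration.
scenarios = {
    (direction, error): perturb(true_rates, error, direction, which=("beta",))
    for direction in (-1, +1)
    for error in ERROR_LEVELS
}

for (direction, error), rates in sorted(scenarios.items()):
    print(f"{'higher' if direction > 0 else 'lower'} by {error:.0%}: {rates}")
```

Each scenario would then be handed to the scheduler as its believed rates while the simulated service times keep following the true rates, and the average task completion time would be recorded per scenario.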

Fig. 3

Robustness comparison of algorithms when parameters are off and lower than their real values. Panels (a) to (f): parameters are 5%, 10%, 15%, 20%, 25%, and 30% lower, respectively.

Fig. 4

Sensitivity comparison of Balanced-PANDAS and JSQ-MaxWeight against parameter estimation error.

Fig. 5

Robustness comparison of algorithms when parameters are off and higher than their real values. Panels (a) to (f): parameters are 5%, 10%, 15%, 20%, 25%, and 30% higher, respectively.

Fig. 6

Sensitivity comparison of Balanced-PANDAS and JSQ-MaxWeight against parameter estimation error.

5 Conclusion and future work

In this work, we did a literature review on both classical and state-of-the-art scheduling algorithms for the affinity scheduling problem. Data center load balancing is a special case of the affinity scheduling problem. Considering the rack structure of data centers, there are three levels of data locality. The priority algorithm, which is heavy-traffic delay optimal for two levels of data locality, is not even throughput optimal for three levels of data locality. The Balanced-PANDAS algorithm is the state of the art in heavy-traffic delay optimality. We investigated the robustness of the Balanced-PANDAS and JSQ-MaxWeight algorithms with respect to errors in parameter estimation. We observe that Balanced-PANDAS keeps its better performance over JSQ-MaxWeight even in the absence of precise parameter values. Note that the JSQ-MaxWeight algorithm is also robust under parameter estimation errors, but it is more sensitive than Balanced-PANDAS, especially at high loads close to the boundary of the capacity region. For future work, one can use machine learning tools to estimate the system parameters and make them more precise while the load balancing algorithm is working with the estimated parameters. The scheduling algorithms presented in this work can also be applied to a vast number of applications including but not limited to healthcare and supermarket models [61], [9], [17], [24], web search engines [51], [50], [5], [30], [64], electric vehicle charging [1], [18], [56], [15], [8], [2], [7], [44], [45], and so on.

References

1. Bahram Alinia, Mohammad Sadegh Talebi, Mohammad H Hajiesmaili, Ali Yekkehkhany, and Noel Crespi. Competitive online scheduling algorithms with applications in deadline-constrained ev charging.
2. Mishari Metab Almalki, Maziar Isapour Chehardeh, and Constantine J Hatziadoniu. Capacitor bank switching transient analysis using frequency dependent network equivalents. In North American Power Symposium (NAPS), 2015. IEEE, 2015.
3. Steven Bell, Ruth Williams, et al. Dynamic scheduling of a parallel server system in heavy traffic with complete resource pooling: Asymptotic optimality of a threshold policy. Electronic Journal of Probability, 10:1044–1115, 2005.
4. Steven L Bell, Ruth J Williams, et al. Dynamic scheduling of a system with two parallel servers in heavy traffic with resource pooling: asymptotic optimality of a threshold policy. The Annals of Applied Probability, 11(3):608–649, 2001.
5. Andrei Broder. A taxonomy of web search. In ACM Sigir forum, volume 36, pages 3–10. ACM, 2002.
6. John W Byers, Jeffrey Considine, and Michael Mitzenmacher. Geometric generalizations of the power of two choices. In Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures, pages 54–63. ACM, 2004.
7. Maziar Isapour Chehardeh, Mishari Metab Almalki, and Constantine J Hatziadoniu. Remote feeder transfer between out-of-phase sources using sts. In Power and Energy Conference at Illinois (PECI), 2016 IEEE. IEEE, 2016.
8. Maziar Isapour Chehardeh and Constantine J Hatziadoniu. A systematic method for reliability index evaluation of distribution networks in the presence of distributed generators. In . IEEE, 2018.
9. Robert Clower and Axel Leijonhufvud. The coordination of economic activities: a keynesian perspective. The American Economic Review, 65(2):182–188, 1975.
10. Colin Cooper, Robert Elsässer, and Tomasz Radzik. The power of two choices in distributed voting. In International Colloquium on Automata, Languages, and Programming, pages 435–446. Springer, 2014.
11. Amirali Daghighi. Application of an artificial neural network as a third-party database auditing system. 2019.
12. Amirali Daghighi and Mohammadamir Kavousi. Scheduling for data centers with multi-level data locality. In Electrical Engineering (ICEE), 2017 Iranian Conference on, pages 927–936. IEEE, 2017.
13. Søren Dahlgaard, Mathias Bæk Tejs Knudsen, Eva Rotenberg, and Mikkel Thorup. The power of two choices with simple tabulation. In Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, pages 1631–1642. SIAM, 2016.
14. Jeffrey Dean and Sanjay Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
15. Sara Deilami, Amir S Masoum, Paul S Moses, and Mohammad AS Masoum. Real-time coordination of plug-in electric vehicle charging in smart grids to minimize power losses and improve voltage profile. IEEE Transactions on Smart Grid, 2(3):456–467, 2011.
16. Benjamin Doerr, Leslie Ann Goldberg, Lorenz Minder, Thomas Sauerwald, and Christian Scheideler. Stabilizing consensus with the power of two choices. In Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures, pages 149–158. ACM, 2011.
17. Elizabeth Eisenhauer. In poor health: Supermarket redlining and urban nutrition. GeoJournal, 53(2):125–133, 2001.
18. Lingwen Gan, Ufuk Topcu, and Steven H Low. Optimal decentralized protocol for electric vehicle charging. IEEE Transactions on Power Systems, 28(2):940–951, 2013.
19. Kristen Gardner, Mor Harchol-Balter, Alan Scheller-Wolf, Mark Velednitsky, and Samuel Zbarsky. Redundancy-d: The power of d choices for redundancy. Operations Research, 65(4):1078–1094, 2017.
20. Nicolas Gast. The power of two choices on graphs: the pair-approximation is accurate? ACM SIGMETRICS Performance Evaluation Review, 43(2):69–71, 2015.
21. J Michael Harrison. Heavy traffic analysis of a system with parallel servers: asymptotic optimality of discrete-review policies. Annals of applied probability, pages 822–848, 1998.
22. J Michael Harrison and Marcel J López. Heavy traffic resource pooling in parallel-server systems. Queueing systems, 33(4):339–368, 1999.
23. Chen He, Ying Lu, and David Swanson. Matchmaking: A new mapreduce scheduling technique. In Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on, pages 40–47. IEEE, 2011.
24. Mohammad Hosseini, Yu Jiang, Ali Yekkehkhany, Richard R Berlin, and Lui Sha. A mobile geo-communication dataset for physiology-aware dash in rural ambulance transport. In Proceedings of the 8th ACM on Multimedia Systems Conference. ACM, 2017.
25. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. In ACM SIGOPS operating systems review, volume 41, pages 59–72. ACM, 2007.
26. Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar, and Andrew Goldberg. Quincy: fair scheduling for distributed computing clusters. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, pages 261–276. ACM, 2009.
27. Jiahui Jin, Junzhou Luo, Aibo Song, Fang Dong, and Runqun Xiong. Bar: An efficient data locality driven task scheduling algorithm for cloud computing. In Proceedings of the 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pages 295–304. IEEE Computer Society, 2011.
28. Mohammadamir Kavousi. Affinity scheduling and the applications on data center scheduling with data locality. arXiv preprint arXiv:1705.03125, 2017.
29. Joseph Kreimer. Real-time system with homogeneous servers and nonidentical channels in steady-state. Computers & Operations Research, 29(11):1465–1473, 2002.
30. K Hari Krishna and Kosuru Anusha Rani. Reducing the energy consumption energy-efficient query processing node in web search engines. 2018.
31. Miron Livny and Myron Melman. Load balancing in homogeneous broadcast distributed systems. In ACM SIGMETRICS Performance Evaluation Review, volume 11, pages 47–55. ACM, 1982.
32. James Craig Lowery, Mark Andrew Collins, and Brent Schroeder. Systems and methods for provisioning homogeneous servers, May 22 2008. US Patent App. 11/562,921.
33. Malwina J Luczak, Colin McDiarmid, et al. On the power of two choices: balls and bins in continuous time.

ACM SIGMETRICS Performance Evaluation Review , volume 11, pages47–55. ACM, 1982.32. James Craig Lowery, Mark Andrew Collins, and Brent Schroeder. Systems and methodsfor provisioning homogeneous servers, May 22 2008. US Patent App. 11/562,921.33. Malwina J Luczak, Colin McDiarmid, et al. On the power of two choices: balls and binsin continuous time.

The Annals of Applied Probability , 15(3):1733–1764, 2005.34. Steve Lumetta and Michael Mitzenmacher. Using the power of two choices to improvebloom ﬁlters.

Internet Mathematics , 4(1):17–33, 2007.35. Avishai Mandelbaum and Alexander L Stolyar. Scheduling ﬂexible servers with convexdelay costs: Heavy-traﬃc optimality of the generalized c µ -rule. Operations Research ,52(6):836–855, 2004.36. Sean Meyn. Stability and asymptotic optimality of generalized maxweight policies.

SIAM Journal on control and optimization , 47(6):3259–3294, 2009.37. Michael Mitzenmacher. The power of two choices in randomized load balancing.

IEEETransactions on Parallel and Distributed Systems , 12(10):1094–1104, 2001.38. Amir Moaddeli, Iman Nabati Ahmadi, and Negin Abhar. The power of dchoices in scheduling for data centers with heterogeneous servers. arXiv preprintarXiv:1904.00447 , 2019.39. Negin Musavi. A game theoretical framework for the evaluation of unmanned aircraftsystems airspace integration concepts. arXiv preprint arXiv:1904.08477 , 2019.40. Negin Musavi, Deniz Onural, Kerem Gunes, and Yildiray Yildiz. Unmanned aircraftsystems airspace integration: A game theoretical framework for concept evaluations.

Journal of Guidance, Control, and Dynamics , pages 96–109, 2016.41. Negin Musavi, Kaan Bulut Tekelio˘glu, Yildiray Yildiz, Kerem Gunes, and Deniz Onural.A game theoretical modeling and simulation framework for the integration of unmannedaircraft systems in to the national airspace. In

AIAA Infotech@ Aerospace . 2016.42. Jorda Polo, Claris Castillo, David Carrera, Yolanda Becerra, Ian Whalley, MalgorzataSteinder, Jordi Torres, and Eduard Ayguad´e. Resource-aware adaptive scheduling formapreduce clusters. In

ACM/IFIP/USENIX International Conference on DistributedSystems Platforms and Open Distributed Processing , pages 187–207. Springer, 2011.43. Andrea W Richa, M Mitzenmacher, and R Sitaraman. The power of two random choices:A survey of techniques and results.

Combinatorial Optimization , 9:255–304, 2001.obustness Comparison of Scheduling Algorithms in MapReduce Framework 1544. Sepehr Saadatmand, Sima Azizi, Mohammadamir Kavousi, and Donald Wunsch. Au-tonomous control of a line follower robot using a q-learning controller. In , pages0556–0561. IEEE, 2020.45. Sepehr Saadatmand, Mohammadamir Kavousi, and Sima Azizi. The voltage regulationof boost converters using dual heuristic programming. In , pages 0531–0536. IEEE,2020.46. Sepehr Saadatmand, Mohammad Saleh Sanjarinia, Pourya Shamsi, and Mehdi Ferdowsi.Dual heuristic dynamic programing control of grid-connected synchronverters.

NorthAmerican Power Symposium (NAPS), 2019 , pages 1–6, 2019.47. Sepehr Saadatmand, Mohammad Saleh Sanjarinia, Pourya Shamsi, Mehdi Ferdowsi, andDonald C Wunsch. Heuristic dynamic programming for adaptive virtual synchronousgenerators.

North American Power Symposium (NAPS), 2019 , pages 1–6, 2019.48. Bilal Sadiq and Gustavo De Veciana. Throughput optimality of delay-driven maxweightscheduler for a wireless system with ﬂow dynamics. In

Communication, Control, andComputing, 2009. Allerton 2009. 47th Annual Allerton Conference on , pages 1097–1102. IEEE, 2009.49. Mehdi Salehi. Optimal physiology-aware scheduling of clinical states in rural ambulancetransport. In , pages 247–252. IEEE, 2017.50. Sara Salehi, Jia Tina Du, and Helen Ashman. Use of web search engines and person-alisation in information searching for educational purposes.

Information Research: AnInternational Electronic Journal , 23(2):n2, 2018.51. Candy Schwartz. Web search engines.

Journal of the American Society for InformationScience , 49(11):973–982, 1998.52. Vijendra P Singh. Two-server markovian queues with balking: heterogeneous vs. homo-geneous servers.

Operations Research , 18(1):145–159, 1970.53. Alexander L Stolyar et al. Maxweight scheduling in a generalized switch: State spacecollapse and workload minimization in heavy traﬃc.

The Annals of Applied Probability ,14(1):1–53, 2004.54. Peter van de Ven, Sem Borst, and Seva Shneer. Instability of maxweight schedulingalgorithms. In

INFOCOM 2009, IEEE , pages 1701–1709. IEEE, 2009.55. Jan A Van Mieghem. Dynamic scheduling with convex delay costs: The generalized c—mu rule.

The Annals of Applied Probability , pages 809–833, 1995.56. Chwei-Sen Wang, Oskar H Stielau, and Grant A Covic. Design considerations for acontactless electric vehicle battery charger.

IEEE Transactions on industrial electronics ,52(5):1308–1314, 2005.57. Weina Wang, Kai Zhu, Lei Ying, Jian Tan, and Li Zhang. A throughput optimalalgorithm for map task scheduling in mapreduce with data locality.

ACM SIGMETRICSPerformance Evaluation Review , 40(4):33–42, 2013.58. Weina Wang, Kai Zhu, Lei Ying, Jian Tan, and Li Zhang. Maptask scheduling inmapreduce with data locality: Throughput and heavy-traﬃc optimality.

IEEE/ACMTransactions on Networking (TON) , 24(1):190–203, 2016.59. Tom White. Hadoop: The deﬁnitive guide, yahoo, 2010.60. Tom White.

Hadoop: The deﬁnitive guide . ” O’Reilly Media, Inc.”, 2012.61. Fedelma Winkler. Consumerism in health care: beyond the supermarket model.

Policy& Politics , 15(1):1–8, 1987.62. Qiaomin Xie and Yi Lu. Priority algorithm for near-data scheduling: Throughput andheavy-traﬃc optimality. In

Computer Communications (INFOCOM), 2015 IEEE Con-ference on , pages 963–972. IEEE, 2015.63. Qiaomin Xie, Ali Yekkehkhany, and Yi Lu. Scheduling with multi-level data locality:Throughput and heavy-traﬃc optimality. In

INFOCOM 2016-The 35th Annual IEEEInternational Conference on Computer Communications, IEEE . IEEE, 2016.64. Xiaohui Xie, Yiqun Liu, Maarten de Rijke, Jiyin He, Min Zhang, and Shaoping Ma.Why people search for images using web search engines. In

Proceedings of the EleventhACM International Conference on Web Search and Data Mining , pages 655–663. ACM,2018.6 Amirali Daghighi , Jim Q. Chen

65. Ali Yekkehkhany. Near data scheduling for data centers with multi levels of data locality. (Dissertation, University of Illinois at Urbana-Champaign) .66. Ali Yekkehkhany, Ebrahim Arian, Mohammad Hajiesmaili, and Rakesh Nagi. Risk-averse explore-then-commit algorithms for ﬁnite-time bandits. arXiv preprintarXiv:1904.13387 , 2019.67. Ali Yekkehkhany, Avesta Hojjati, and Mohammad H Hajiesmaili. Gb-pandas:: Through-put and heavy-traﬃc optimality analysis for aﬃnity scheduling.

ACM SIGMETRICSPerformance Evaluation Review , 45(2), 2018.68. Ali Yekkehkhany and Rakesh Nagi. Blind gb-pandas: A blind throughput-optimal loadbalancing algorithm for aﬃnity scheduling.

IEEE/ACM Transactions on Networking ,2020.69. Yildiray Yildiz, Adrian Agogino, and Guillaume Brat. Predicting pilot behavior inmedium scale scenarios using game theory and reinforcement learning. In

AIAA Mod-eling and Simulation Technologies (MST) Conference , page 4908, 2013.70. Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy, ScottShenker, and Ion Stoica. Delay scheduling: A simple technique for achieving local-ity and fairness in cluster scheduling. In

Proceedings of the 5th European Conferenceon Computer Systems , pages 265–278. ACM, 2010.71. Matei Zaharia, Andy Konwinski, Anthony D Joseph, Randy H Katz, and Ion Stoica.Improving mapreduce performance in heterogeneous environments. In