Approximate Solution Approach and Performability Evaluation of Large Scale Beowulf Clusters
Yonal Kirsal · Yoney Kirsal Ever
Abstract
Beowulf clusters are very popular and deployed worldwide in support of scientific computing because of their high computational power and performance. However, they also pose several challenges, and yet they need to provide high availability. In practice, large-scale fault-tolerant Beowulf clusters can behave unpredictably, often with detrimental outcomes. Achieving high performance in storing and processing huge amounts of data in large-scale clusters necessitates accurate quality of service (QoS) evaluation. This calls for the design and development of analytical models to understand and predict complex system behaviour in order to ensure the availability of large-scale systems. Exact modelling of such clusters is not feasible due to the large number of nodes and the diversity of user requests. An analytical model for the QoS of large-scale server farms, together with suitable solution approaches, is therefore necessary. In this paper, analytical modelling of large-scale Beowulf clusters is considered together with availability issues. A generic and flexible approximate solution approach is developed to handle large numbers of nodes for performability evaluation. The proposed analytical model and the approximate solution approach provide flexibility to evaluate the QoS measurements for such systems. In order to show the efficacy and the accuracy of the proposed approach, the results obtained from the analytical model are validated against the results obtained from discrete event simulations.
Keywords
Beowulf clusters · approximate solution technique · analytical modelling · performability evaluation · large-scale clusters · high performance computing
arXiv [cs.PF]
The recent rapid development of computers and of parallel and distributed systems provides flexible, efficient, and highly available services to users. Thus, users increasingly expect better quality of service (QoS) from such systems.

Due to high QoS expectations, computer/communication systems and cloud-based networks have to be highly available and easy to manage [1]. In order to provide the best QoS and seamless service to users, various kinds of computing paradigms are being used effectively. Parallel computing is the most commonly used approach in such systems. It is a way of dividing a large task into smaller tasks and using more than one node simultaneously to perform these smaller tasks [2]. Hence, parallel processing provides high performance by sharing the necessary work among the cluster nodes, which reduces the time needed to perform complex computations [3]. In addition, high-performance computing (HPC) uses supercomputers and parallel processing techniques to solve complex computational problems [4-6]. HPC delivers sustained performance by offering storage resources. However, computations for parallel processing are very expensive, and for this reason many organizations cannot afford them. This has led to the introduction of different architectures such as symmetric multiprocessors, vector processors and cluster computing.

Beowulf clusters have been widely used for clustering. They allow a large number of computing nodes to be available for parallel or simultaneous processing at much lower cost [7-9]. Beowulf clusters are scalable performance clusters based on a multi-processor system that consists of commodity hardware on a private system network with an open source software infrastructure [10]. In a Beowulf cluster, a network of tightly connected computers is dedicated to simultaneously providing service to incoming task requests.
A typical Beowulf cluster generally has two types of nodes: a head node (master node) and identical computing nodes [11]. The main duty of the head node is serving and distributing user requests to the computing nodes. Computing nodes are usually dedicated to service. In addition, the computing nodes cannot serve tasks if the head node is not operative [12]. Because there is only a single head node, such systems are vulnerable. Head node failures significantly affect the availability of the cluster and therefore limit access to the healthy identical nodes. Additionally, depending on the structure of the cluster, the head node may or may not be part of the computation.

In this paper, performability models that take both performance and availability concerns into account are formed for large-scale Beowulf clusters from an analytical point of view. It is essential to evaluate the performability output parameters in order to obtain the QoS of the system, since the availability of multi-node systems can be affected by node failures due to various reasons. The single head node architecture makes the performability analysis even more interesting and essential. Therefore, in order to obtain more realistic QoS measurements and analysis of Beowulf clusters, performance and availability models should be employed together for such server farms. Although analytical approaches have been presented for server farms (with one head and several identical computing nodes) as well as for Beowulf clusters, large-scale systems could not be considered due to the state space explosion problem [13]. Existing modelling and exact solution approaches such as the spectral expansion method are not able to handle such large networks [14]. In [15], the state space explosion problem is encountered with the spectral expansion method, where Beowulf clusters are modelled and solved for various performance measures.
Thus, the number of parallel servers considered could not exceed a certain limit due to state space explosion. However, cloud computing, web server and cluster computing systems provide a total of 256, 372 or even more (e.g., 512) nodes [16,17]. The well-known exact solution approaches to availability issues or open queuing networks with failures suffer from similar problems. In addition, solving complex systems through simulation results in prohibitively long computational times for such systems. Therefore, it is of great importance to develop an analytical method and solution approach to overcome these problems. The main contribution of the paper can be summarized as follows:

– An approximate and flexible three-dimensional (3D) solution approach is presented for performability evaluation of large-scale Beowulf clusters. Although the Markov models given are not new, the approach is new, and it can be used for large-scale Beowulf systems where large numbers of computing nodes can be considered, typically up to several hundreds or thousands.

The performability results obtained from the analytical model are compared with discrete event simulation (DES) results in order to show the accuracy and effectiveness of the proposed work. Findings show that the analytical modelling and approximate solution approach presented for large-scale Beowulf clusters provide a high degree of accuracy and are significantly more efficient than the simulation approach in terms of computational time.

The rest of the paper is organized as follows: Section 2 presents the related studies and gives the motivation of the study. Section 3 describes the system model and the analytical solution approach. Section 4 presents the numerical results and discussions for the proposed model. Finally, Section 5 concludes this paper.
Beowulf clusters have been widely used all over the world. They can be employed for coarse-grained applications such as Monte Carlo calculations, statistical simulations and high-throughput applications, and can also be used in grid computing. A Beowulf-type cluster is a good example of an HPC system. Because it has a single head node, it is possible to provide a backup node for the head node. Such systems are called highly available Beowulf systems [18,19].

HPC is a paradigm that has come to dominate the visualization and processing of huge amounts of web data. HPC systems fundamentally provide access to large pools of data and computational resources through a variety of interfaces similar to the existing grid, HPC resource management and programming systems [20]. HPC service providers must strive to ensure good QoS by offering highly available services with dynamically scalable resources, as stated in [19]. In [19], the authors used HA-OSCAR, an open source High Availability (HA) solution for HPC/cloud that offers component redundancy, failure detection, and automatic failover. It is assumed that any task sent to the cloud/cluster center is serviced by a suitable node, which is called a facility node [21]. When the task has been serviced, it leaves the center. This facility node may contain different computing resources such as web servers, database servers, directory servers, and others. However, it is emphasized in [21] that cloud centers differ from traditional queuing systems in a number of important aspects. Most importantly, a cloud center can have a large number of facility nodes, of the order of hundreds or thousands, whereas traditional queuing analysis rarely considers systems of this size. In 2006, Amazon introduced the Elastic Compute Cloud (EC2), in which customers could rent, by the hour, Xen-based virtual machines hosted in Amazon's data centers [22]. Users have full root-level access to the virtual machines, so that they can fully customize and optionally publish machine images. In [23], by contrast, the compute nodes are subordinate: they are configured from the head node. The authors used the Rocks toolkit, a methodology with which more than 2000 clusters have been built using an open-source software stack.

Homogeneous multi-server systems have been considered with different repair strategies for performability evaluation in the literature [5, 7, 10-12, 14, 15, 27, 29, 31]. However, the proposed models and solution approaches are more applicable to small and/or medium size systems than to large-scale systems. State space explosion is a general problem for the state space representation of queuing systems, and it is encountered especially in multiprocessor systems. The spectral expansion and matrix-geometric methods solve this problem partially. They are capable of considering systems with infinite queuing capacities.
However, when the number of servers increases rapidly, they face analytical difficulties. Large-scale Beowulf clusters face the state space explosion problem as well. In [7] and [15], Beowulf clusters are modelled and solved for various measures, but the number of identical computing nodes considered could not exceed certain limits due to the state space explosion problem. Thus, the aim of the model proposed in this paper is to solve this problem for a large number of nodes in Beowulf clusters. The given approach is new and flexible, so it can be applied to similar large-scale and complex systems.

As stated in [24], the majority of current cloud computing infrastructure consists of services that are offered and delivered through a service center, such as a data center, that can be accessed from a web browser anywhere in the world. Our proposal also relies on that. As the population size of a typical cloud center is relatively high while the probability that a given user will request service is relatively small, the arrival process can be modelled as a Markovian process [25]. In [26], an open Jackson queueing network based model is used to characterize the service components in content-delivery-as-a-service (CoDaaS). A Jackson network is constructed as a network of queues, where the arrivals at each queue are modelled as a Poisson process and the service times follow the exponential distribution. In [27], queuing theory is used to identify and manage the users' response time for services. In [28], the authors obtained the response time distribution of a cloud system modelled as a classic $M/M/m$ open network, assuming exponentially distributed inter-arrival and service times. In [29], the authors obtained the response time distribution for a cloud with an $M/M/m/m+r$ system model. Both inter-arrival and service times were assumed to be exponentially distributed, and the system had a finite buffer of size $m + r$. In [30], a queuing performance model consisting of a cloud architecture and a service center such as a data center is studied. The service center is taken as a database server. Both the time between user arrivals to the system and the service time of the system follow an exponential distribution, with parameters $\lambda$ and $\mu$ respectively, and there are $m$ servers with a first come first served (FCFS) scheduling policy.

In [31], availability modelling and evaluation of HPC systems is presented. The necessity of availability is also emphasized and shown. The authors of [31] developed a novel solution approach using an object oriented Markov model which provides availability modelling for typical high-performance cluster computing systems. Numerical results presented in [31] demonstrated that availability modelling and evaluation need to be considered at the system design stage for typical high-performance cluster computing systems.

In this section, a homogeneous multi-node system is presented for performability evaluation of large-scale Beowulf clusters with failures and repairs. As defined before, a typical Beowulf cluster has two types of nodes: a head node and multiple identical computing nodes, as shown in Fig. 1. The head node may or may not serve tasks; however, its main responsibility is distributing tasks to computing nodes. Identical computing nodes normally provide computation. In this paper, the head node does participate in computations.
Fig. 1 Multi-nodes Beowulf cluster architecture
If the head node is not working, the computing nodes cannot serve tasks. This is because the head node is responsible for the organization and distribution of tasks. Because of this dependence, the identical computing nodes are vulnerable to head node failures: the failure of the head node limits access to healthy identical computing nodes. The Beowulf multi-node system is shown in Fig. 2. The system consists of a head node (node 1) and $S-1$ identical computing nodes, numbered $2, 3, \dots, S$, with a bounded common queue. $L$ is the queue capacity of the proposed system, where $L \ge S$. Tasks arrive at the system in a Poisson stream with mean rate $\lambda$ and join the queue. Tasks are homogeneous and the service rates of the identical computing nodes are equal. The service times of requests serviced by computing node $k$ ($k = 2, \dots, S$) and by the head node are exponentially distributed with means $1/\mu$ and $1/\mu_h$, respectively. When the head node participates in computations, it generally has the same service rate as the identical computing nodes ($\mu = \mu_h$).

Fig. 2 The proposed system considered with failures and repairs

$1/\xi_h$ and $1/\xi$ are the mean operative periods of the head node and the computing nodes, respectively, and these periods are exponentially distributed. Thus, $\xi_h$ and $\xi$ are the failure rates of the head node and the identical computing nodes, respectively. At the end of an operative period of node $k$ ($k = 2, \dots, S$), an exponentially distributed repair time with mean $1/\eta$ is needed. Similarly, if the head node fails, it is repaired with mean repair time $1/\eta_h$. Repair priority is given to the head node when the head node and computing nodes have failed at the same time. This is because the identical computing nodes cannot serve without the head node. If there are requests waiting to be served, operative computing nodes cannot be idle. In addition, the computing nodes serve with higher service rates if the number of operative computing nodes is more than the number of requests in the system. Services that are interrupted by failures are eventually resumed from the point of interruption or repeated with re-sampling. In case of head node failure, tasks continue to arrive at the same rate $\lambda$, and tasks in the queue remain in the queue without being serviced.

In this section, the proposed analytical model and approximate solution methods are introduced for large-scale Beowulf clusters. It is possible to represent the proposed system with $S$ computing nodes, including the head node, by using a Quasi Birth and Death (QBD) process with finite state space. Since in Beowulf systems none of the computing nodes can operate without the head node, the relation between the failure and repair rates of the head node leads us to model the proposed system in two phases. Fig. 3 indicates the relationship between the two phases. The first phase is used for states where the head node is available, which is indicated as Plane 1.
Plane 0 is the second phase, which is used to represent the states where the head node is broken. Hence, the proposed system has two phases, as shown in Fig. 3, and can be presented in three dimensions (3D).

Fig. 3 General transitions between two phases

$P_{i,j,n}$ are the steady state probabilities of the proposed system. As can be seen in Fig. 3, $i$ and $j$ indicate the number of computing nodes and the number of tasks in the system, respectively, and $n$ indicates the mode of the head node. When $n = 0$ the head node is not operative, and these states form Plane 0. On the other hand, when $n = 1$ the head node is operative, and these states form Plane 1.

Figs. 4 and 5 show the state diagrams of Plane 0 and Plane 1 of the proposed 3D system, respectively. In Plane 0, the index $i$ takes the values $i = 0, 1, \dots, S-1$. $L$ represents the maximum number of tasks/requests in the system in both figures. Plane 0 describes the case in which the head node is not operative and the whole system does not provide service, as shown in Fig. 4. Therefore, the downward transitions with service rate $\mu$ are not available. Repair priority is given to the head node, and the identical computing nodes cannot be repaired before the head node is back in operation. In other words, there are no repair transitions for the identical computing nodes in Plane 0, since the only repair transition which can take place is the transition to Plane 1. On the other hand, in Fig. 5 there are $S$ computing node configurations, $i = 1, \dots, S$, and they are used to represent the possible operative states, similarly to Fig. 4. However, the number of computing nodes starts from one because the head node is operative. Downward transitions are possible since the system is alive. It is possible to use the repair facility in order to deal with identical computing node failures since the head node is operative. Available identical computing nodes can provide service with a service rate of $\mu$, and broken nodes can be repaired with rate $\eta$.

Fig. 4 State diagram for Plane 0, where the head node is not working

Fig. 5 State diagram for Plane 1, where the head node is working

$P_{i,j,1}$ can be computed by analytical decomposition. These state probabilities may not give very close approximations to the real steady state probabilities; however, the balance equations are still required in order to take all possible transitions into account, and the approximations help to achieve fast convergence. Therefore, mathematical equations are required in order to have an approximate solution for these probabilities. Thus, to find all $P_{i,j,n}$, the sum of all probabilities in each of the two planes should be considered individually. The two planes, Plane 0 and Plane 1, can then be analyzed separately. Each plane has its own states, and the sum of the state probabilities within a single plane is not equal to one, unlike in a single/multi-server queueing system. Thus, it is necessary to compute the sum of all probabilities in each phase. The sum of the overall probabilities (Plane 0 + Plane 1) should be one. In order to obtain a general solution for the sum of overall probabilities in each plane, equation 1 can be used, where each term stands for the total probability of the corresponding plane.

$$P_{i,j,0} + P_{i,j,1} = 1 \tag{1}$$

Therefore, equations 2 and 3 are derived for Plane 0 and Plane 1, respectively.

$$P_{i,j,0} = \frac{\xi_h}{\eta_h + \xi_h} \tag{2}$$

$$P_{i,j,1} = \frac{\eta_h}{\eta_h + \xi_h} \tag{3}$$

Both equations clearly indicate that the head node failure and repair rates are essential to find the overall probabilities for the systems considered. Therefore, the following steps are taken to analyze the performability of Beowulf multi-node systems. The state probabilities of Plane 1, where the head node is operative, are analyzed first. The sum of all probabilities of Plane 1 is taken as $\eta_h/(\eta_h + \xi_h)$, as follows from equation 3. Hence, equation 4 can be written as follows:

$$\sum_{i=1}^{S}\sum_{j=0}^{L} P_{i,j,1} = \frac{\eta_h}{\eta_h + \xi_h} \tag{4}$$

It is required to obtain the $P_{i,j,1}$ values in Plane 1. However, it is clear that these probabilities cannot be obtained directly by using a product form solution.
Therefore, first the sums of all probabilities are considered for each operative state of the system in Plane 1. Figure 6 shows the overall operative states of computing nodes for Plane 1.

Fig. 6 General lateral transitions of the system for Plane 1

Equation 5 can then be used to calculate $\sum_{j} P_{i,j,1}$ for all possible values of $i$.

$$\sum_{j=0}^{L} P_{i,j,1} = \frac{\eta_h}{\eta_h + \xi_h}\cdot\frac{\frac{1}{i!}\left(\frac{\eta}{\xi}\right)^{i}}{\sum_{k=0}^{S}\frac{1}{k!}\left(\frac{\eta}{\xi}\right)^{k}} \tag{5}$$

where $i = 1, 2, 3, \dots, S$. Since the sum of all $P_{i,j,1}$ in Plane 1 is known, it is easy to compute the overall probabilities for each number of operative computing nodes. $P_{i,j,1}$ can then be calculated in terms of $P_{i,0,1}$. Hence, equation 6 is derived using the product form solution [32]. With this set of equations, all $P_{i,j,1}$ can be expressed in terms of $P_{i,0,1}$.

$$P_{i,j,1} = \begin{cases} \dfrac{\rho^{j}}{j!}\,P_{i,0,1}, & 0 \le j \le i \\[2ex] \dfrac{\rho^{j}}{i!\,i^{\,j-i}}\,P_{i,0,1}, & i+1 \le j \le L \end{cases} \tag{6}$$

where $\rho = \lambda/\mu$ and $1 \le i \le S$. Then, equation 6 can be summed over each column as follows:

$$\sum_{j=0}^{L} P_{i,j,1} = \left[\sum_{j=0}^{i}\frac{\rho^{j}}{j!} + \sum_{j=i+1}^{L}\frac{\rho^{j}}{i!\,i^{\,j-i}}\right] P_{i,0,1} \tag{7}$$

where $i = 0, 1, \dots, S$. Hence, $P_{i,0,1}$ can be computed as in equation 8 using equations 5-7 with some simplifications.

$$P_{i,0,1} = \frac{\left(\frac{\eta}{\xi}\right)^{i}\frac{1}{i!}}{\sum_{k=0}^{S}\left(\frac{\eta}{\xi}\right)^{k}\frac{1}{k!}}\left[\sum_{j=0}^{i}\frac{\rho^{j}}{j!} + \frac{i^{L-i}\rho^{i+1} - \rho^{L+1}}{i!\,i^{L-i}(i-\rho)}\right]^{-1} \tag{8}$$

where $i = 0, 1, \dots, S$. Since the $P_{i,0,1}$ values have been obtained, it is easy to find all $P_{i,j,1}$ by using the above equations. Thus, the general expression for the approximate state probabilities can be written as follows:

$$P_{i,j,1} = \begin{cases} \dfrac{\left(\frac{\eta}{\xi}\right)^{i}\frac{1}{i!}}{\sum_{k=0}^{S}\left(\frac{\eta}{\xi}\right)^{k}\frac{1}{k!}}\left[\displaystyle\sum_{m=0}^{i}\frac{\rho^{m}}{m!} + \frac{i^{L-i}\rho^{i+1} - \rho^{L+1}}{i!\,i^{L-i}(i-\rho)}\right]^{-1}\dfrac{\rho^{j}}{j!}, & j = 0, 1, \dots, i \\[4ex] \dfrac{\left(\frac{\eta}{\xi}\right)^{i}\frac{1}{i!}}{\sum_{k=0}^{S}\left(\frac{\eta}{\xi}\right)^{k}\frac{1}{k!}}\left[\displaystyle\sum_{m=0}^{i}\frac{\rho^{m}}{m!} + \frac{i^{L-i}\rho^{i+1} - \rho^{L+1}}{i!\,i^{L-i}(i-\rho)}\right]^{-1}\dfrac{\rho^{j}}{i!\,i^{\,j-i}}, & j = i+1, i+2, \dots, L \end{cases} \tag{9}$$

Hence, all $P_{i,j,1}$ can be calculated using equation 9, although these values are not exact. Note that the state probabilities obtained for Plane 1 are used to obtain faster computations and more accurate results. However, it is not possible to follow a similar approach for Plane 0.
This is mainly because in Plane 0 the system does not serve.

3.2 Balance Equations and Iterative Solution

In this section, the main balance equations are derived for each plane individually in order to obtain all $P_{i,j,n}$. In addition, the balance equations obtained are also used to consider the transitions between these planes, so that all possible transitions are taken into account.

Plane 0 and Plane 1

In order to accumulate the effects of lateral and horizontal transitions together for Plane 1, all possible balance equations can be derived from Fig. 5. These transitions lead to the following balance equations.

For $i = 1$:

$$j = 0:\quad P_{i,j,1} = \frac{\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1}}{\eta + \lambda} \tag{10}$$

$$0 < j < L:\quad P_{i,j,1} = \frac{\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1}}{\eta + \lambda + \mu} \tag{11}$$

$$j = L:\quad P_{i,j,1} = \frac{(i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1}}{\eta + \mu} \tag{12}$$

For $2 \le i < S$:

$$j = 0:\quad P_{i,j,1} = \frac{\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \eta P_{i-1,j,1}}{\lambda + \eta + i\xi} \tag{13}$$

$$1 \le j < i:\quad P_{i,j,1} = \frac{(j+1)\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1}}{\lambda + \eta + j\mu + i\xi} \tag{14}$$

$$i \le j < L:\quad P_{i,j,1} = \frac{i\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1}}{\lambda + \eta + i\mu + i\xi} \tag{15}$$

$$j = L:\quad P_{i,j,1} = \frac{(i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1}}{\eta + i\mu + i\xi} \tag{16}$$

For $i = S$:

$$j = 0:\quad P_{i,j,1} = \frac{\mu P_{i,j+1,1} + \eta P_{i-1,j,1}}{\lambda + i\xi} \tag{17}$$

$$1 \le j < i:\quad P_{i,j,1} = \frac{(j+1)\mu P_{i,j+1,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1}}{\lambda + j\mu + i\xi} \tag{18}$$

$$i \le j < L:\quad P_{i,j,1} = \frac{i\mu P_{i,j+1,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1}}{\lambda + i\mu + i\xi} \tag{19}$$

$$j = L:\quad P_{i,j,1} = \frac{\lambda P_{i,j-1,1} + \eta P_{i-1,j,1}}{i\mu + i\xi} \tag{20}$$

On the other hand, the balance equations of Plane 0 are also required to have an approximate solution of the proposed system. Therefore, similarly to Plane 1, the general balance equations are produced considering Fig. 4, and all possible balance equations are derived as follows.

For $i = 0$:

$$j = 0:\quad P_{i,j,0} = \frac{\xi P_{i+1,j,0}}{\lambda} \tag{21}$$

$$0 < j < L:\quad P_{i,j,0} = \frac{\xi P_{i+1,j,0} + \lambda P_{i,j-1,0}}{\lambda} \tag{22}$$

$$j = L:\quad P_{i,j,0} = \xi P_{i+1,j,0} + \lambda P_{i,j-1,0} \tag{23}$$

For $1 \le i < S-1$:

$$j = 0:\quad P_{i,j,0} = \frac{(i+1)\xi P_{i+1,j,0}}{\lambda + i\xi} \tag{24}$$

$$1 \le j < i \ \text{and}\ i \le j < L:\quad P_{i,j,0} = \frac{(i+1)\xi P_{i+1,j,0} + \lambda P_{i,j-1,0}}{\lambda + i\xi} \tag{25}$$

$$j = L:\quad P_{i,j,0} = \frac{(i+1)\xi P_{i+1,j,0} + \lambda P_{i,j-1,0}}{i\xi} \tag{26}$$

For $i = S-1$:

$$1 \le j < i \ \text{and}\ i \le j < L:\quad P_{i,j,0} = \frac{\lambda P_{i,j-1,0}}{\lambda + i\xi} \tag{27}$$

$$j = L:\quad P_{i,j,0} = \frac{\lambda P_{i,j-1,0}}{i\xi} \tag{28}$$

Therefore, the final values of $P_{i,j,0}$ can also be obtained. The general balance equations are derived for both planes separately.
However, the relation between the two planes and the corresponding balance equations are also required for the proposed system in order to obtain more accurate results, as given in the next section.

In order to obtain correct steady state probabilities of the proposed system, the essential balance equations have to be considered. Fig. 7 shows the three-dimensional model considered, which gives the relation between the two planes.

Fig. 7 Three dimension (3D) feature of the proposed system

Equation 29 gives the relation between the two planes, which can easily be obtained from Fig. 7.

$$\eta_h P_{i,j,0} = \xi_h P_{i,j,1} \tag{29}$$

Hence, the relation between the two planes, with essential balance equations for each state, can be obtained by using the relation in equation 29. The essential balance equations obtained for Plane 1 considering the effects of Plane 0 are as follows.

For $i = 1$:

$$j = 0:\quad P_{i,j,1} = \frac{\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \eta_h P_{i-1,j,0}}{\lambda + \eta + \xi_h} \tag{30}$$

$$0 < j < L:\quad P_{i,j,1} = \frac{\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1} + \eta_h P_{i-1,j,0}}{\lambda + \eta + \mu + \xi_h} \tag{31}$$

$$j = L:\quad P_{i,j,1} = \frac{(i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1} + \eta_h P_{i-1,j,0}}{\eta + \mu + \xi_h} \tag{32}$$

For $2 \le i < S$:

$$j = 0:\quad P_{i,j,1} = \frac{\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \eta P_{i-1,j,1} + \eta_h P_{i-1,j,0}}{\lambda + \eta + i\xi + \xi_h} \tag{33}$$

$$1 \le j < i:\quad P_{i,j,1} = \frac{(j+1)\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1} + \eta_h P_{i-1,j,0}}{\lambda + \eta + j\mu + i\xi + \xi_h} \tag{34}$$

$$i \le j < L:\quad P_{i,j,1} = \frac{i\mu P_{i,j+1,1} + (i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1} + \eta_h P_{i-1,j,0}}{\lambda + \eta + i\mu + i\xi + \xi_h} \tag{35}$$

$$j = L:\quad P_{i,j,1} = \frac{(i+1)\xi P_{i+1,j,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1} + \eta_h P_{i-1,j,0}}{\eta + i\mu + i\xi + \xi_h} \tag{36}$$

For $i = S$:

$$j = 0:\quad P_{i,j,1} = \frac{\mu P_{i,j+1,1} + \eta P_{i-1,j,1} + \eta_h P_{i-1,j,0}}{\lambda + i\xi + \xi_h} \tag{37}$$

$$1 \le j < i:\quad P_{i,j,1} = \frac{(j+1)\mu P_{i,j+1,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1} + \eta_h P_{i-1,j,0}}{\lambda + j\mu + i\xi + \xi_h} \tag{38}$$

$$i \le j < L:\quad P_{i,j,1} = \frac{i\mu P_{i,j+1,1} + \lambda P_{i,j-1,1} + \eta P_{i-1,j,1} + \eta_h P_{i-1,j,0}}{\lambda + i\mu + i\xi + \xi_h} \tag{39}$$

$$j = L:\quad P_{i,j,1} = \frac{\lambda P_{i,j-1,1} + \eta P_{i-1,j,1} + \eta_h P_{i-1,j,0}}{i\mu + i\xi + \xi_h} \tag{40}$$

In addition, the essential balance equations obtained for Plane 0 considering the effects of Plane 1 are as follows.

For $i = 0$:

$$j = 0:\quad P_{i,j,0} = \frac{\xi P_{i+1,j,0} + \xi_h P_{i+1,j,1}}{\lambda + \xi_h} \tag{41}$$

$$0 < j < L:\quad P_{i,j,0} = \frac{\xi P_{i+1,j,0} + \lambda P_{i,j-1,0} + \xi_h P_{i+1,j,1}}{\lambda + \xi_h} \tag{42}$$

$$j = L:\quad P_{i,j,0} = \frac{\xi P_{i+1,j,0} + \lambda P_{i,j-1,0} + \xi_h P_{i+1,j,1}}{\xi_h} \tag{43}$$

For $1 \le i < S-1$:

$$j = 0:\quad P_{i,j,0} = \frac{(i+1)\xi P_{i+1,j,0} + \xi_h P_{i+1,j,1}}{\lambda + i\xi + \xi_h} \tag{44}$$

$$0 < j < L:\quad P_{i,j,0} = \frac{(i+1)\xi P_{i+1,j,0} + \lambda P_{i,j-1,0} + \xi_h P_{i+1,j,1}}{\lambda + i\xi + \xi_h} \tag{45}$$

$$j = L:\quad P_{i,j,0} = \frac{(i+1)\xi P_{i+1,j,0} + \lambda P_{i,j-1,0} + \xi_h P_{i+1,j,1}}{i\xi + \xi_h} \tag{46}$$

For $i = S-1$:

$$j = 0:\quad P_{i,j,0} = \frac{\xi_h P_{i+1,j,1}}{\lambda + i\xi + \eta_h} \tag{47}$$

$$0 < j < L:\quad P_{i,j,0} = \frac{\lambda P_{i,j-1,0} + \xi_h P_{i+1,j,1}}{\lambda + i\xi + \eta_h} \tag{48}$$

$$j = L:\quad P_{i,j,0} = \frac{\lambda P_{i,j-1,0} + \xi_h P_{i+1,j,1}}{(i-1)\xi + \eta_h} \tag{49}$$

Then, the iterative procedure can be applied in order to accurately calculate $P_{i,j,n}$. The iterative procedure can be given as follows:
1. Define the input parameters: $S, L, \lambda, \mu_h, \mu, \eta_h, \eta, \xi_h, \xi$, the maximum number of iterations (i.e., 2) and the convergence parameter $\Delta$.

2. Equation 9 calculates the approximate steady state probabilities $P_{i,j,1}$ of Plane 1 to obtain faster computations and more accurate results. Equations 10-20 and 21-28 are then used to calculate rough steady state probabilities of the system for both planes. These equations and computations are used to obtain faster convergence when the iterative procedure of essential balance equations is employed. For instance, $MQL_{converge}$ is obtained as follows:

   Set the initial value of $MQL_{converge} = 0.0$
   for (i=0; i \leq S; i++) do
      for (j=0; j \leq L; j++) do
         $MQL_{converge} = MQL_{converge} + j*P_{i,j,n}$
      end for
   end for

3. The balance equations given in equations 29-49 are used to calculate the correct steady state probabilities $P_{i,j,n}$.

4. The sum of all $P_{i,j,n}$ can be determined with the help of the normalizing condition. Steps 3 and 4 are repeated until the normalization condition is satisfied, in other words until the sum of probabilities converges sufficiently closely to one. A performability measure, MQL, is chosen to check the accuracy against the convergence parameter $\Delta$, where $\Delta = 0.001$ is taken in this paper.

   Set the initial value of $MQL_{approx} = 0.0$
   for (i=0; i \leq S; i++) do
      for (j=0; j \leq L; j++) do
         $MQL_{approx} = MQL_{approx} + j*P_{i,j,n}$
      end for
   end for

   The iteration ends when $\lVert MQL_{approx} - MQL_{converge} \rVert \le \Delta$. Otherwise, the iterative procedure assigns the recent value of the performability measure to the old value, i.e., $MQL_{converge} = MQL_{approx}$, and continues from step 3 separately for both planes.

5. Hence, the performability measures such as mean queue length (MQL), $MQL = MQL_0 + MQL_1$, throughput (THRP), $THRP = THRP_0 + THRP_1$, and mean response time (MRT), $MRT = MRT_0 + MRT_1$, can be computed using the most recent steady state probabilities $P_{i,j,n}$ as:

$$MQL = \sum_{i=0}^{S-1}\sum_{j=0}^{L} j\,P_{i,j,0} + \sum_{i=1}^{S}\sum_{j=0}^{L} j\,P_{i,j,1}$$

$$THRP = \sum_{i=0}^{S-1}\sum_{j=0}^{L} j\mu\,P_{i,j,0} + \sum_{i=1}^{S}\sum_{j=0}^{L} j\mu\,P_{i,j,1}$$

$$MRT = \frac{MQL}{THRP}$$
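The final step of the procedure can be illustrated with a minimal Python sketch. It assumes the steady state probabilities are already available as two NumPy arrays, `p0` for Plane 0 (rows $i = 0, \dots, S-1$) and `p1` for Plane 1 (rows $i = 0, \dots, S$, with row 0 unused); this array layout and all function names are hypothetical choices made for the illustration. MQL weights each state by its number of tasks $j$, as in the loops of steps 2 and 4. For the throughput, the sketch makes its own assumption: a Plane 1 state $(i, j)$ completes tasks at rate $\min(i, j)\mu$, which is the service rate that appears in the Plane 1 balance equations, and Plane 0 contributes nothing because the system does not serve while the head node is down. This weighting is an assumption of the sketch and may differ from the expression used by the authors.

```python
import numpy as np


def normalise(p0, p1):
    """Rescale both planes so that all state probabilities sum to one."""
    total = p0.sum() + p1.sum()
    return p0 / total, p1 / total


def mean_queue_length(p0, p1):
    """MQL: every state is weighted by its number of tasks j (column index)."""
    j0 = np.arange(p0.shape[1])
    j1 = np.arange(p1.shape[1])
    return float((p0 * j0).sum() + (p1 * j1).sum())


def throughput(p1, mu):
    """ASSUMPTION of this sketch: state (i, j) of Plane 1 completes tasks at
    rate min(i, j) * mu; Plane 0 is ignored because nothing is served there."""
    i = np.arange(p1.shape[0])[:, None]   # operative nodes (row index)
    j = np.arange(p1.shape[1])[None, :]   # tasks in the system (column index)
    return float((np.minimum(i, j) * mu * p1).sum())


def has_converged(mql_approx, mql_converge, delta=1e-3):
    """Stopping test of step 4: |MQL_approx - MQL_converge| <= delta."""
    return abs(mql_approx - mql_converge) <= delta


# Toy example with made-up numbers: S = 2, L = 3 (not taken from the paper).
p0 = np.zeros((2, 4))
p0[0, 1] = 0.5
p1 = np.zeros((3, 4))
p1[1, 0] = 0.5
p1[2, 2] = 1.0
p0, p1 = normalise(p0, p1)

mql = mean_queue_length(p0, p1)
thrp = throughput(p1, mu=2.0)
mrt = mql / thrp  # mean response time as MQL / THRP
```

In a full implementation, `has_converged` would be called once per sweep over the essential balance equations, with `mql_converge` replaced by the latest `mql_approx` whenever the test fails.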
In this section, the results are presented for the performability evaluation of thelarge-scale Beowulf clusters considering a head node and a large number of com-puting nodes. Numerical results are presented for MQL, THRP, and MRT in orderto show the capability of the proposed analytical model and the solution approach.Numerical results obtained clearly show that, the proposed model and an approxi-mate solution approach can easily handle large number of nodes in Beowulf clusterswithout a state explosion problem. In addition, numerical results obtained fromthe proposed model are validated by DES. The CPU times of the proposed so-lution approach and DES are also compared in order to show the efficacy of theproposed model and an approximate solution approach.The mean arrival and service rates are mainly application dependent. The as-sumptions and parameters used in [3,7,9,11-13,32] are also employed in this paperfor consistency. However, the approach presented in this paper is flexible and canbe adopted to similar applications. For instance, the average failure and repairrates of the computing nodes considered are chosen from studies [3], and [7] con-sidering the availability of Beowulf clusters as well as cloud computing systemsfor consistency. Therefore, time between failures for head node and computingnodes can be taken as 250( ξ = ξh = 0 . /h ) , ξ = ξh = 0 . /h ), and1000( ξ = ξh = 0 . /h ) hours. The parameters are taken for many computa-tions as µ = 0 . requests/sec, η = ηh = 0 . /hr, ξ = ξh = 0 . /hr and the λ rateper user varies from 10 requests per second unless stated otherwise. MQL, THRP,and MRT results are presented as a function of λ for proposed analytical modeland DES in Figs. 8-10, respectively with different computing node failure rates. Asystem with S = 500, and L = 1000 is considered. Fig. 8
MQL vs λ for S=500 and L=1000 with different ξ.

Fig. 9
THRP vs λ for S=500 and L=1000 with different ξ.

The figures clearly show that the effect of computing node failures on the QoS of the system is quite significant in large-scale fault-tolerant Beowulf clusters. The system fills up quickly when a higher value of ξ is considered. However, for lower values of ξ, the considered Beowulf cluster can serve the arriving requests owing to the large number of computing nodes. In other words, increasing the time between failures of the computing nodes decreases the MQL, as shown in Fig. 8. For instance, in Fig. 8, the MQL value is 280.924 when ξ = 0.001 and λ = 70. However, when the failure rate of the computing nodes is increased to 0.002 and 0.004 at λ = 70, the MQL values increase to 976.147 and 999.188, respectively. That is, the average number of requests in the system approaches the maximum capacity of the system, L, as the failure rate increases. It can be clearly observed in Fig. 9 that the THRP of the system increases as λ increases; however, the THRP saturates beyond a point that depends on, among other factors, the failure rates. This is because the system cannot serve the requests efficiently, and incoming requests start to queue up in the system, especially in heavily loaded systems. Higher THRP values are obtained for the systems with lower failure rates because more computing nodes are operative on average. Similar behaviour is observed for MRT in Fig. 10. The MRT increases when the failure rate of the computing nodes increases, as expected.

The effects of the failure and repair rates of the head node are given in Figs. 11 and 12, respectively. Figure 11 shows the THRP results as a function of arrival rate for different head node failure rates. The THRP results decrease when ξh = 0.01. This is due to the frequent failure of the head node, since the computing nodes directly depend on the head node. Even though a good repair facility is provided to the system (η = ηh = 0 . /h), the QoS degrades due to the failure rate of the head node. In addition, the MQL results as a function of arrival rate for different repair rates of the head node are given in Fig. 12.

Fig. 10
MRT vs λ for S=500 and L=1000 with different ξ.

Fig. 11
THRP vs λ with different ξh.

Fig. 12
MQL vs λ with different ηh.

The significance of the head node repair rate is clearly shown in the figure in terms of the queue length. When the system has light traffic (λ = 70), the MQL is 281.849. However, for the same arrival rate the queue is almost full (MQL = 999.189) due to the frequent failure of the head node. When the head node fails frequently, the computing nodes are no longer able to serve. Hence, the MQL of the system increases rapidly. In addition, the THRP of the system decreases because no requests are served: if the head node fails, the system stops serving requests. The head node may regain connectivity and rejoin the system after some time. This situation is clearly shown in Fig. 12. In summary, the repair and failure rates of the head node have a significant effect on the performance of such systems.

Fig. 13 shows the MQL results as a function of arrival rate for different numbers of computing nodes. The numbers of computing nodes (S = 32, 64, 128, 256, and 372) are taken from [16], where researchers currently use such configurations in their laboratories for different purposes. As can be seen, the proposed model and solution approach can handle up to several hundred computing nodes together with availability issues. Fig. 14 shows the MQL results as a function of arrival rate for different queue capacities. As can be clearly seen from the figure, queue capacity is the limiting factor of large-scale Beowulf clusters.

A comparative study is further performed in order to assess the accuracy of the proposed solution approach against DES. Tables 1 and 2 present the MQL, THRP, and MRT results together with the simulation results for S=500, L=1000 and S=1000, L=2000, respectively. The discrepancies between the proposed analytical model and DES are also presented in all the tables.
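A minimal sketch of how such a discrepancy can be computed is given below. It assumes that the discrepancy is defined as the absolute difference relative to the simulation result, expressed as a percentage; this definition, the function names, and the sample values are assumptions for illustration and are not taken from Tables 1 and 2.

```python
def discrepancy_percent(analytical, simulated):
    """Absolute difference relative to the simulated value, in percent.

    Assumed definition; the simulated value is used as the reference.
    """
    if simulated == 0:
        raise ValueError("simulated value must be non-zero")
    return abs(analytical - simulated) / abs(simulated) * 100.0


def within_tolerance(analytical, simulated, tolerance_percent=5.0):
    """True if the discrepancy does not exceed the given tolerance."""
    return discrepancy_percent(analytical, simulated) <= tolerance_percent


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the functions
    print(discrepancy_percent(102.0, 100.0))
    print(within_tolerance(102.0, 100.0))
    print(within_tolerance(110.0, 100.0))
```

Applying `within_tolerance` to every analytical and simulated pair of a table gives a direct check against a chosen threshold such as 5%.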
The maximum discrepancies for MQL, THRP, and MRT are less than 1.149%, 3.82%, and 3.76%, respectively, for both tables, which is well within the 5% confidence interval of the simulation. DES is mainly used for validation purposes; however, it can also be used for the performance evaluation of such systems, because it simulates the actual scenario rather than the Markov models presented in this paper. The DES model is implemented in C++ and adapted for the scenario considered [33].

Fig. 13 MQL vs λ with different number of nodes.

Fig. 14 MQL vs λ with different queue capacities.

In addition, the CPU times of the analytical approach and DES are presented comparatively in Tables 3 and 4. All of the numerical results presented are obtained using workstations with an Intel(R) Core(TM) i7-363QM CPU @ 2.40GHz, 16GB RAM, and a 64-bit operating system. The proposed 3D analytical model uses an iterative approach to obtain the steady state probabilities based on (S) x (L + 1) equations for each of the two planes. For instance, the number of states considered in Table 4 is (1000 x 2001) x (1000 x 2001). Thus, the processing times of the analytical model are presented alongside the processing times of the simulation for comparison. Tables 3 and 4 show the CPU times of systems with S=500, L=1000 and S=1000, L=2000, respectively. The computational efficiency of the proposed solution approach relative to DES is clearly shown in both tables in terms of CPU times. For example, in Table 4 the maximum CPU time for simulation is 80878.871 seconds (about 22.47 hours) for S=1000, L=2000, whereas the maximum CPU time of the analytical approach is less than 5 hours for an extreme case. Thus, the proposed analytical model and approximate solution approach are efficient for the performability evaluation of large-scale Beowulf clusters.

This paper proposed an analytical modelling approach and an approximate solution approach to obtain QoS measurements for large-scale Beowulf clusters without a state explosion problem. In order to obtain more realistic QoS measurements, availability issues are considered together with performance modelling.
The proposed analytical model, solution approach, and analysis are useful for achieving better performance in such systems. The system is modelled as a three-dimensional continuous-time Markov chain to determine the state probabilities. The proposed model can be used to analyze QoS measures such as mean queue length (MQL), throughput (THRP), and mean response time (MRT).

The method used is novel and flexible, and it can be extended to many other practical, fault-tolerant multi-server farms. In addition, the numerical results presented show that the proposed solution approach can handle up to several million states of the model presented. Depending on the system, the number of states can be increased further using the proposed model without a state explosion problem. The comparative results presented in this paper show that the discrepancy between the simulation and the analytical model is less than 5% for all cases. In terms of efficacy, the computation time of the proposed model is significantly shorter than that of the simulation, especially for heavily loaded and large-scale cases.

The main demerit of this method is the large number of equations. In the proposed model, the balance equations depend on each other and are chained together to obtain approximate steady state probabilities. An iterative technique has been used to solve for the steady state probabilities, and this technique increases the computation times for approximate results. Furthermore, increasing the queue capacity, the number of computing nodes, or the mean arrival rate significantly increases computation times. Therefore, it is essential that programming techniques are used effectively to further reduce computation times. This work is still in progress. Although speed remains an issue, the proposed method required less CPU time than simulation in the cases considered. The results show that the proposed method works effectively with large queue capacities and large numbers of computing nodes, giving accurate results.
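The state-space sizes and CPU times quoted in the results can be checked with simple arithmetic. The sketch below assumes that each of the two planes contains S x (L + 1) states, in line with the number of balance equations stated for each plane; the function names are hypothetical.

```python
def states_per_plane(S, L):
    """Number of (i, j) states in one plane: S values of i times L + 1 values of j."""
    return S * (L + 1)


def seconds_to_hours(seconds):
    """Convert a CPU time in seconds to hours."""
    return seconds / 3600.0


if __name__ == "__main__":
    # The two system sizes used in Tables 1-4
    for S, L in [(500, 1000), (1000, 2000)]:
        per_plane = states_per_plane(S, L)
        # Assumption: two planes in total
        print(S, L, per_plane, 2 * per_plane)

    # Maximum simulation CPU time reported for S=1000, L=2000
    sim_hours = seconds_to_hours(80878.871)
    print(round(sim_hours, 2))
    # The analytical approach is reported to need less than 5 hours,
    # so this ratio is only a lower bound on the speed-up
    print(round(sim_hours / 5.0, 2))
```

For S=1000 and L=2000 this gives 2,001,000 states per plane and 4,002,000 states over two planes, which is consistent with the statement that several million states can be handled.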
References
1. Hwang, K., Geoffrey, C., Fox, J. J. Dongarra (2012). Distributed and Cloud Computing: From Parallel Processing to the Internet of Things. Morgan Kaufmann.
2. Ngxande, M., Moorosi, N. (2014). Development of Beowulf cluster to perform large datasets simulations in educational institutions. Int. J. Computer App., 99, 29-35.
3. Ever, E., Gemikonakli, O., Chakka, R. (2006). A Mathematical Model for Highly Available Clusters with One Head and Several Identical Computing Nodes. In Proceedings of the 9th International Conference on Computer Modelling and Simulation. 32-37.
4. Pijanowski, B.C., Tayyebi, A., Doucette, J., Pekin, B.K., Braun, D. and Plourde, J. (2014). A big data urban growth simulation at a national scale: configuring the GIS and neural network based land transformation model to run in a high performance computing (HPC) environment. Environmental Modelling and Software. 51. 250-268.
5. Song, A., Wang, W., and Luo, J. (2014). Stochastic modeling of dynamic power management policies in server farms with setup times and server failures. International Journal of Communication Systems. 27. 680-703.
6. Ekanayake, J., and Geoffrey, F. (2010). High performance parallel computing with clouds and cloud technologies. Cloud Computing. Springer Berlin Heidelberg. 20-38.
7. Ever, E., Gemikonakli, O., Chakka, R. (2006). A Mathematical Model for Performability of Beowulf Clusters. In Proceedings of 39th Annual Symposium on Network Simulation. 118-126.
8. Donald, J. B., Sterling, T., Savarese, D., Dorband, J., E., Ranawake, U., A., Packer, and Charles., P. V. (1995). Beowulf: A parallel workstation for scientific computation. In Proceedings International Conference on Parallel Processing.
9. Adams, J. and Vos, D. (2002). Small-College Supercomputing: Building A Beowulf Cluster At A Comprehensive College. In Proceedings of 33rd Technical symposium on computer science education. Cincinnati. Kentucky. 411-415.
10. Qin, C., Z. (2014). A strategy for raster-based geocomputation under different parallel computing platforms. International Journal of Geographical Information Science. 28. 11. 2127-2144.
11. Boukerche, A., Shaikh, A., and Notare, M. (2006). Towards Building a Highly-Available Cluster Based Model for High Performance Computing, International Parallel and Distributed Processing Symposium. 1-8.
12. Gemikonakli, O., Do, T. V., Chakka, R., and Ever, E. (2005). Numerical Solution to the Performability of a Multiprocessor System with Reconfiguration and Rebooting Delays. In Proceedings of ECMS. 766-773.
13. Ever, E., Gemikonakli, O., Kocyigit, A. and Gemikonakli, E. (2013). A hybrid approach to minimize state space explosion problem for the solution of two stage tandem queues, J. Network and Computer Applications. 36. 2. 908-926.
14. Kirsal, Y., Ever, E., Kocyigit, A., Gemikonakli, O., Mapp, G. (2015). Modeling and analysis of vertical handover in highly mobile environments. J. Supercomput. 71. 4352-4380.
15. Ever, E., Gemikonakli, O., Chakka, R. (2009). Analytical modelling and simulation of small scale, typical and highly available Beowulf clusters with breakdowns and repairs, Simulation Modelling Practice and Theory. 17. 327-347.
16. Academic (UB-HPC) Compute Cluster Hardware Specs, The University of Buffalo,
17. High Performance Computing using Beowulf clusters