CAPre: Code-Analysis based Prefetching for Persistent Object Stores

Abstract

Data prefetching aims to improve access times to data storage systems by predicting data records that are likely to be accessed by subsequent requests and retrieving them into a memory cache before they are needed. In the case of Persistent Object Stores, previous approaches to prefetching have been based on predictions made through analysis of the store's schema, which generates rigid predictions, or monitoring access patterns to the store while applications are executed, which introduces memory and/or computation overhead. In this paper, we present CAPre, a novel prefetching system for Persistent Object Stores based on static code analysis of object-oriented applications. CAPre generates the predictions at compile-time and does not introduce any overhead to the application execution. Moreover, CAPre is able to predict large amounts of objects that will be accessed in the near future, thus enabling the object store to perform parallel prefetching if the objects are distributed, in a much more aggressive way than in schema-based prediction algorithms. We integrate CAPre into a distributed Persistent Object Store and run a series of experiments that show that it can reduce the execution time of applications from 9% to over 50%, depending on the nature of the application and its persistent data model.


Rizkallah Touma (a), Anna Queralt (a), Toni Cortes (a, b)

(a) Barcelona Supercomputing Center, Jordi Girona 29, 08034 Barcelona
(b) Universitat Politècnica de Catalunya, Jordi Girona 31, 08034 Barcelona


Keywords: Persistent Object Stores; Static Code Analysis; Data Prefetching; Parallel Prefetching; Object-Oriented Programming Languages

DOI: 10.1016/j.future.2019.10.023. © Elsevier 2019. This manuscript version is made available under the CC-BY-NC-ND 4.0 license.

Preprint submitted to Elsevier, May 26, 2020

1. Introduction

Persistent Object Stores (POSs) are data storage systems that record and retrieve persistent data in the form of complete objects [1]. They are especially used with Object-Oriented (OO) programming languages to avoid the impedance mismatch that occurs when developing OO applications on top of other types of databases, such as Relational Database Management Systems (RDBMSs). POSs make it easier to access persistent data without worrying about database access and query details, which can amount to 30% of the total code of an application [2, 3].

Examples of POSs include object-oriented databases (e.g. Caché [4] and Actian NoSQL [5]) and Object-Relational Mapping (ORM) systems (e.g. Hibernate [6], Apache OpenJPA [7] and DataNucleus [8]). The rise of NoSQL databases has also led to the development of mapping systems for non-relational databases, such as Neo4J's Object-Graph Mapping (OGM) [9]. Moreover, several POSs that support data distribution have been developed to accommodate the needs of parallel and distributed programming (e.g. Mneme [10], Nexus [11], Thor [12] and dataClay [13, 14]).

As in any other storage system, accessing persistent media is very slow, so prefetching is needed to improve access times to stored data. Previous approaches to prefetching in POSs can be split into three broad categories: 1. schema-based, 2. data-based, and 3. code-based. An example of a schema-based approach is the Referenced-Objects Predictor (ROP), which uses the following heuristic: each time an object is accessed, all the objects referenced from it are likely to be accessed as well [15]. This type of approach gives rigid predictions that do not take into account how persistent objects are accessed by different applications. Nevertheless, ROP is widely used in commercial POSs because it achieves a reasonable accuracy and does not involve a costly prediction process (see Section 2).

On the other hand, data-based approaches predict which objects to prefetch by detecting data access patterns while monitoring application execution. This type of approach causes overhead that can amount to roughly 10% of the application execution time [16]. Furthermore, such approaches may require large amounts of memory to store the detected patterns. Finally, few approaches have based their predictions on analyzing the source code of the OO applications that access the POS, and these have been largely theoretical, without any in-depth analysis of the prediction accuracy or the performance improvement that they can achieve. For more details, Section 2 includes a study of the related work in the field of prefetching in POSs.

In this paper, we present an approach to predict access to persistent objects through static code analysis of object-oriented applications. The approach includes a complex inter-procedural analysis and takes non-deterministic program behavior into consideration. Then, we present CAPre: a prefetching system that uses this prediction approach to prefetch objects from a POS. CAPre performs the prediction at compile-time without adding any overhead to application execution time. It then uses source code generation and injection to modify the application's original code to activate automatic prefetching of the predicted objects when the application is executed. CAPre also includes a further optimization by automatically prefetching data in parallel whenever possible, in order to maximize the benefits obtained from prefetching when using distributed POSs.

We integrate CAPre into dataClay [14], a distributed POS, and run a series of experiments to measure the improvement in application performance that it can achieve. The experimental results indicate that using CAPre to prefetch objects from a POS can reduce execution times of applications, with the most significant gains observed in applications with complex data models and/or many collections of persistent objects.

Contributions. The main contributions of the present paper can be summarized as follows:

• We propose the theoretical basis of an approach to predict access to persistent objects based on static code analysis.
• We design and implement CAPre, a prefetching system for Persistent Object Stores, using this prediction approach.
• We demonstrate how CAPre improves the performance of applications by integrating it into an independent POS and running experiments on a set of well-known object-oriented and Big Data benchmarks.

The work reported here extends our previous work [17] in several directions. First, after presenting the theoretical grounds, we present the design and implementation of a complete prefetching system, based on static code analysis, and integrate it into an independent POS. Second, we evaluate the accuracy and performance gains obtained by our system by executing a set of benchmarks instead of simulating the expected accuracy results. These executions show the real effect of the technique on benchmarks and applications, which was impossible to obtain using simulation alone.

Paper Organization. Section 2 discusses the main differences between our proposal and the current state of the art. Section 3 presents an example that motivates our work and that is used throughout the paper to guide the different steps. Section 4 summarizes the formalization of the static code analysis approach. Section 5 presents our proposed prefetching system, CAPre, and how it was implemented. Section 6 discusses the details of integrating CAPre into a distributed POS. Section 7 presents the experimental evaluation of the system. Finally, Section 8 concludes the paper and outlines some future work.

2. Related Work

The structure in which Persistent Object Stores (POSs) expose data, in the form of objects and relations between these objects, is rich in semantics that are ideal for predicting access to persistent data, and it has invited a significant amount of research [18]. The most popular previous approach is the schema-based Referenced-Objects Predictor (ROP), defined in Section 1. Hibernate [15], DataNucleus [19], Neo4J OGM [9] and Spring Data JPA [20] all support this technique through specific configuration settings with varying degrees of flexibility (e.g. applying the prefetching at system level or only to specific types). For instance, Hibernate offers developers OR-Mappers [21], which include predefined instructions that can be used to decide which related objects to prefetch for each object type, while with Django [22] developers need to supply explicit prefetching hints with each access to the POS. This type of implementation of ROP requires manual inspection of the entire application code by the developer and is an error-prone process, given that correct prefetching hints are difficult to determine and incorrect ones are hard to detect [23].

Schema-based techniques, as opposed to our proposal, only take into account the structure of the classes, but not how they are actually used by application methods, and thus can imply accessing a significant amount of unused data. Furthermore, given their heuristic nature, ROP approaches do not prefetch collections because the probability of bringing many unnecessary objects is very high. In our approach, since we know exactly which collections will be accessed, we show that we can use this information to prefetch them safely, increasing the effectiveness of the prefetching without incurring unnecessary overhead.

Other prefetching mechanisms are data-based techniques that rely on the history of accesses to objects stored in the POS. Some examples of these approaches include object-page relation modeling [24, 18], stochastic methods [16], Markov chains [16, 25], traversal profiling [26, 23], the Lempel-Ziv data compression algorithm [27] and context-aware prefetching [28]. Moreover, predicting access to persistent objects at the type level was first introduced by Han et al. based on the argument that patterns do not necessarily exist between individual objects but rather between object types [29]. The same authors later presented an optimization of this approach by materializing the objects for each detected access pattern [30]. However, all of these approaches gather the information needed to make the predictions by monitoring access to the POS during application execution and thus introduce overhead in both memory and execution time.

Using code-based analysis to prefetch persistent objects was first suggested by Blair et al., who analyze the source code of OO applications at compile-time in order to model object relations and detect when the invocation of a method causes access to a different page [31]. This information is then used at runtime in order to prefetch the page once the execution of the corresponding method starts. The main difference with our approach is that theirs works at page granularity, thus bringing and keeping many objects that may not be necessary just because they reside in the same page.

Finally, there is a completely different approach based on the queries executed over the data: "query rewriting". This mechanism is another type of optimization that can be used to prefetch objects. The idea behind it is to execute queries that are made more general in order to prefetch objects that might be relevant for future requests. Nevertheless, this again is based on heuristics, and many unnecessary objects may be brought to the cache, adding overhead and filling the cache with useless data. For more information, [32] includes an extensive, albeit outdated, survey of different prefetching techniques, while both [33] and [31] present taxonomies categorizing prefetching techniques in object-oriented databases.

In summary, our approach performs the prediction process at compile-time and produces type-level prefetching hints, combining the benefits of both types of approaches. The advantage of performing the prediction process at compile-time is the absence of the overhead present in techniques that need information gathered at runtime. Similarly, type-level prediction is more powerful than its object-level counterpart and can capture patterns even when different objects of the same type are accessed. Moreover, information is not stored for each individual object, which reduces the amount of used memory [34]. Finally, our approach also prefetches individual objects instead of entire pages of objects, which reduces the amount of memory occupied by other objects in the same page that might not necessarily be accessed.

[Figure 1: UML class diagram. Classes and attributes:
  Account: accountID : Integer, balance : Integer, status : Integer
  Company: compID : Integer, name : String, address : String, phone : String
  Transaction: transID : Integer, dateTime : Date, creditDebit : Boolean, amount : Integer
  Employee: empID : Integer, salary : Integer, level : Integer, dateOfBirth : Date
  Transaction Type: typeID : Integer, name : String, desc : String
  Customer: custID : Integer, type : String, name : String, custSince : Date
  Department: deptID : Integer, name : String
  Associations (each with multiplicity 1): type, emp, account, cust, company, dept]

Figure 1: Example of a Persistent Object Store (POS) schema. The schema represents a banking system with 7 entities, each of which corresponds to an object type in the POS.

3. Motivating Example

Figure 1 shows the POS schema of a bank management system. In the figure, we can see various classes representing the entities of the system, such as Transaction, Account and Customer. Suppose that we want to update the accounts involved in all the transactions so that they are in the name of the manager of the bank, i.e. the manager becomes the customer owning each of these accounts. However, as a security measure, the system restricts updates on accounts to customers of the same company as the customer currently owning the account.

In order to achieve this task, we need to retrieve and iterate through all the Transaction objects. We then navigate to the referenced Account and Customer until reaching the Company of each customer. Finally, we need to compare the company of the customer currently owning the account with the company of the bank manager.

As we have mentioned, the most widely used prediction technique that can be applied in this case is the Referenced-Objects Predictor (ROP), defined in Section 1. Applying ROP to our example means that, for instance, each time a Transaction object is accessed, the referenced Transaction Type, Account and Employee objects are predicted to be accessed along with it.

However, in order to accomplish our task we also need to access the Customer and Company objects, which will not be prefetched. On the other hand, the Transaction Type and Employee objects will be prefetched with Transaction but in reality are not needed for the task at hand. To put this in numbers, if we have 100,000 Transaction objects, ROP would wrongly predict access to as many as 200,000 objects in the worst case (one Transaction Type and one Employee per transaction) while missing another 200,000 objects that will be accessed (one Customer and one Company per transaction).

The prediction accuracy of ROP can be improved by increasing its "fetch depth", i.e. the number of levels of referenced objects to predict. For instance, instead of only predicting access to Transaction Type, Account and Employee, which are directly referenced from Transaction, a fetch depth of 2 would also predict the objects referenced from them, which are Department and Customer in this example.

Increasing the fetch depth of ROP may help in predicting more relevant objects, but it does not solve the problem of predicting access to objects that are not necessary. As a matter of fact, the more the fetch depth is increased, the more likely it is to predict irrelevant objects as well. This is because ROP applies a heuristic based on the schema of the POS that does not take into account the application behavior.

Another, more complex approach would be to monitor accesses to the POS and generate predictions based on the most commonly accessed objects [29, 23, 16]. For instance, monitoring accesses to the POS shown in Figure 1 might tell us that in 80% of the cases where a Transaction object is accessed, its related Account and Customer objects are accessed as well.

This would work well for our task: we would only need to load the referenced Company object, as all the other necessary objects would have already been prefetched. However, in the 20% of cases where a transaction's Account and Customer are not needed, they will still be prefetched despite the fact that they will not be accessed. Moreover, retrieving the necessary information for this approach requires runtime monitoring of the application, which adds overhead to the application execution time and memory consumption [16].

Listing 1: Example OO application written in Java.

    public class Transaction {
      private Account account;
      private Employee emp;
      private TransactionType type;

      public Account getAccount() {
        if (this.type.typeID == 1) {
          this.emp.doSmth();
        } else {
          this.emp.dept.doSmthElse();
        }
        return this.account;
      }
    }

    public class Account {
      private Customer cust;

      public void setCustomer(Customer newCust) {
        if (this.cust.company == newCust.company) {
          this.cust = newCust;
        }
      }
    }

    public class BankManagement {
      private ArrayList<Transaction> transactions;
      private Customer manager;

      public void setAllTransCustomers() {
        for (Transaction trans : this.transactions) {
          trans.getAccount().setCustomer(this.manager);
        }
      }
    }

The problem faced in both cases is that sometimes we prefetch objects that are not needed into memory and at the same time we do not prefetch objects that are actually accessed. This partially stems from the fact that the prediction heuristics are applied without taking into consideration the actual applications being used to access the data.
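The fetch-depth trade-off of ROP discussed in this section can be illustrated with a small type-level sketch. It is not taken from any real POS: the class name RopSketch, the SCHEMA map (a hand-written encoding of the references in Figure 1) and the predict method are assumptions made for illustration only.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch of the Referenced-Objects Predictor (ROP) at the type level. */
final class RopSketch {

    // Hypothetical encoding of the references in Figure 1:
    // type name -> types it references directly.
    static final Map<String, List<String>> SCHEMA = Map.of(
        "Transaction", List.of("TransactionType", "Account", "Employee"),
        "Account", List.of("Customer"),
        "Employee", List.of("Department"),
        "Customer", List.of("Company"));

    /** Types predicted when root is accessed, following references up to depth levels. */
    static Set<String> predict(String root, int depth) {
        Set<String> predicted = new LinkedHashSet<>();
        Set<String> frontier = Set.of(root);
        for (int level = 0; level < depth; level++) {
            Set<String> next = new LinkedHashSet<>();
            for (String type : frontier) {
                for (String target : SCHEMA.getOrDefault(type, List.of())) {
                    if (predicted.add(target)) {
                        next.add(target); // expand newly reached types at the next level
                    }
                }
            }
            frontier = next;
        }
        return predicted;
    }

    public static void main(String[] args) {
        // Types the motivating task actually needs besides Transaction itself.
        List<String> needed = List.of("Account", "Customer", "Company");
        for (int depth = 1; depth <= 3; depth++) {
            Set<String> predicted = predict("Transaction", depth);
            Set<String> unused = new LinkedHashSet<>(predicted);
            unused.removeAll(needed);
            Set<String> missed = new LinkedHashSet<>(needed);
            missed.removeAll(predicted);
            System.out.println("depth " + depth + ": unused=" + unused + ", missed=" + missed);
        }
    }
}
```

Under these assumptions, a depth of 1 misses Customer and Company while predicting the unused Transaction Type and Employee. A depth of 3 misses nothing but also predicts Department, which mirrors the observation that a larger fetch depth increases the number of irrelevant predictions.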

4. Approach Formalization

This section summarizes the formalization of the approach we use to predict access to persistent objects. The formalization is based on the concept of type graphs presented by Ibrahim and Cook [23], which we have extended in order to capture the persistent objects accessed by a method in the form of a graph. After constructing these graphs, we generate a set of prefetching hints that predict which objects should be prefetched from the POS for each method of the analyzed application.

Example. To help explain the approach, we use the sample object-oriented application shown in Listing 1, which uses the schema presented in Figure 1, as a running example.

For any such object-oriented application that uses a POS, we define $T$ as the set of types of the application and $PT \subseteq T$ as its subset of persistent types. Furthermore, $\forall t \in T$ we define:

• $F_t$: the set of persistent member fields of $t$ such that $\forall f \in F_t : type(f) \in PT$,
• $M_t$: the set of member methods of $t$.

First, we need to represent in a graph all the relationships between classes in order to be able to decide which other classes are reachable starting from the fields of a given class. To keep this information, we represent the schema of the application through a directed type graph $G_T = (T, A)$, where:

• $T$ is the set of types defined by the application.
• $A$ is a function $T \times F \to PT \times \{single, collection\}$ representing a set of associations between types. Given types $t$ and $t'$ and field $f$, if $A(t, f) \mapsto (t', c)$ then there is an association from $t$ to $t'$ represented by $f \in F_t$, where $type(f) = t'$, with cardinality $c$ indicating whether the association is single or collection.

[Figure 2, panel (a): type graph $G_T$ of the whole application, with nodes BankManagement, Transaction, Account, Type, Employee, Department, Customer and Company and edges transactions, manager, account, type, emp, dept, cust and company. Panel (b): type graph $G_m$ of the method getAccount() (lines 6 to 13 of Listing 1), with nodes Transaction, Account, Type, Employee and Department and edges account, type, emp and dept; branch-dependent navigations (Section 4.4) are highlighted in orange.]

Figure 2: Two type graphs from Listing 1. Solid lines represent single associations and dashed lines represent collection associations.

Example. Figure 2 (a) shows the type graph of the application from Listing 1. Some of the associations of this type graph are:

• $A(BankManagement, transactions) \mapsto (Transaction, collection)$
• $A(Transaction, account) \mapsto (Account, single)$
• $A(Employee, dept) \mapsto (Department, single)$
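One possible in-memory representation of the type graph $G_T = (T, A)$ is sketched below. The class TypeGraphSketch, the "type.field" map key and the bankExample() helper are assumptions introduced for illustration; they are not CAPre's actual data structures. The fields Employee.dept and Customer.company are taken from Figure 2 (a), since those classes are not shown in Listing 1.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch of a type graph G_T = (T, A). */
final class TypeGraphSketch {

    enum Cardinality { SINGLE, COLLECTION }

    /** Result of A(t, f): the target type t' and the cardinality c. */
    static final class Association {
        final String target;
        final Cardinality cardinality;

        Association(String target, Cardinality cardinality) {
            this.target = target;
            this.cardinality = cardinality;
        }

        @Override
        public String toString() {
            return "(" + target + ", " + cardinality + ")";
        }
    }

    // A is stored as a map keyed by "type.field" (an assumed encoding).
    private final Map<String, Association> associations = new HashMap<>();

    void add(String type, String field, String target, Cardinality c) {
        associations.put(type + "." + field, new Association(target, c));
    }

    /** A(t, f), or empty if f is not a persistent field of t. */
    Optional<Association> lookup(String type, String field) {
        return Optional.ofNullable(associations.get(type + "." + field));
    }

    /** Type graph of the running example, following Figure 2 (a). */
    static TypeGraphSketch bankExample() {
        TypeGraphSketch g = new TypeGraphSketch();
        g.add("BankManagement", "transactions", "Transaction", Cardinality.COLLECTION);
        g.add("BankManagement", "manager", "Customer", Cardinality.SINGLE);
        g.add("Transaction", "account", "Account", Cardinality.SINGLE);
        g.add("Transaction", "emp", "Employee", Cardinality.SINGLE);
        g.add("Transaction", "type", "TransactionType", Cardinality.SINGLE);
        g.add("Employee", "dept", "Department", Cardinality.SINGLE);
        g.add("Account", "cust", "Customer", Cardinality.SINGLE);
        g.add("Customer", "company", "Company", Cardinality.SINGLE);
        return g;
    }

    public static void main(String[] args) {
        TypeGraphSketch g = bankExample();
        System.out.println(g.lookup("BankManagement", "transactions"));
        // Primitive fields such as transID have no entry in A.
        System.out.println(g.lookup("Transaction", "transID"));
    }
}
```

Keying the map by type and field keeps the lookup close to the function notation $A(t, f)$; a nested map per type would work equally well.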

While $G_T$ represents the general schema of the application, it does not capture how the associations between the different types are traversed by the application's methods. When a method $m$ is executed, some of its instructions might trigger the navigation of a subset of the associations in $G_T$.

An association navigation $t \overset{f}{\rightsquigarrow} t'$ is triggered when an instruction accesses a field $f$ in an object of type $t$ (navigation source) to navigate to an object of type $t'$ (navigation target) such that $A(t, f) \mapsto (t', c)$. A navigation of a collection association has multiple target objects corresponding to the collection's elements. The set of all association navigations in $m$ forms the method type graph $G_m$, which is a sub-graph of $G_T$ and captures the objects directly accessed by the method's instructions.

Example. Figure 2 (b) shows the type graph $G_m$ of the method getAccount() with the implementation shown in Listing 1 (lines 6 to 13). Notice that instructions that involve fields of primitive types, such as typeID (integer), are not part of the graph because they do not trigger a navigation between objects.
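As a rough illustration, the method type graph of getAccount() can be written down as a list of navigations and checked against the associations of $G_T$. The names MethodTypeGraphSketch, Navigation, ASSOCIATIONS and isSubGraph are hypothetical, and the branchDependent flag is an assumed way of marking navigations that occur on only some execution paths.

```java
import java.util.List;
import java.util.Map;

/** Illustrative sketch: the method type graph G_m of getAccount() as a list of navigations. */
final class MethodTypeGraphSketch {

    /** One association navigation from source to target through field. */
    static final class Navigation {
        final String source;
        final String field;
        final String target;
        final boolean branchDependent; // true if triggered on only some execution paths

        Navigation(String source, String field, String target, boolean branchDependent) {
            this.source = source;
            this.field = field;
            this.target = target;
            this.branchDependent = branchDependent;
        }
    }

    // Subset of A from G_T, keyed by "type.field" (an assumed encoding).
    static final Map<String, String> ASSOCIATIONS = Map.of(
        "Transaction.account", "Account",
        "Transaction.emp", "Employee",
        "Transaction.type", "TransactionType",
        "Employee.dept", "Department");

    // Navigations triggered by the instructions of getAccount() in Listing 1.
    static final List<Navigation> GET_ACCOUNT = List.of(
        new Navigation("Transaction", "type", "TransactionType", false), // this.type.typeID
        new Navigation("Transaction", "emp", "Employee", false),         // this.emp, used in both branches
        new Navigation("Employee", "dept", "Department", true),          // this.emp.dept, else branch only
        new Navigation("Transaction", "account", "Account", false));     // return this.account

    /** G_m is a sub-graph of G_T if every navigation follows an association in A. */
    static boolean isSubGraph(List<Navigation> gm) {
        return gm.stream()
                 .allMatch(n -> n.target.equals(ASSOCIATIONS.get(n.source + "." + n.field)));
    }

    public static void main(String[] args) {
        System.out.println("sub-graph of G_T: " + isSubGraph(GET_ACCOUNT));
        for (Navigation n : GET_ACCOUNT) {
            System.out.println(n.source + " -" + n.field + "-> " + n.target
                + (n.branchDependent ? " (branch-dependent)" : ""));
        }
    }
}
```

The access to typeID does not appear as a navigation of its own because it is a primitive field; only the step from Transaction to its type is recorded.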

The limitation of the method type graph $G_m$ is that it only includes association navigations that occur in the code of the method $m$, but does not include associations navigated in other methods invoked by $m$. Thus, after constructing $G_m$, we perform an inter-procedural analysis to capture the objects accessed inside other methods invoked by $m$. The result of this analysis is the augmented method type graph $AG_m$, which we construct by adding association navigations that are triggered inside an invoked method $m' \in M_{t'}$ to $G_m$ as follows:

• The type graph of the invoked method $G_{m'}$ is added to $G_m$ through the navigation $t \overset{f}{\rightsquigarrow} t'$ that caused the invocation.
• Association navigations triggered by passing a persistent object as a parameter to $m'$ are added directly to $G_m$.

Example. Figure 3 shows the augmented method type graph $AG_m$ of method setAllTransCustomers() from Listing 1. It includes the type graphs of the invoked methods getAccount() and setCustomer(newCust). Note that the navigations $BankManagement \overset{manager}{\rightsquigarrow} Customer \overset{company}{\rightsquigarrow} Company$ are triggered by passing the persistent object BankManagement.manager as a parameter to the method setCustomer(newCust).

After constructing the augmented type graph of a method, we can predict which objects will be accessed once the execution of the method starts. We achieve this by traversing $AG_m$ and generating a set of prefetching hints $PH_m$ that predict access to persistent objects:

$$PH_m = \{\, ph \mid ph = f_1.f_2.\ldots.f_n \text{ where } t_i \overset{f_i}{\rightsquigarrow} t_{i+1} \in AG_m : 1 \le i < n \,\}$$

Each prefetching hint $ph \in PH_m$ corresponds to a sequence of association navigations in $AG_m$ and indicates that the target object(s) of the navigations is/are accessed.
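The traversal of $AG_m$ into prefetching hints can be sketched as follows. This is a simplified illustration, not CAPre's implementation: the names PrefetchHintSketch, Node, navigate and hints are hypothetical, the graph is built by hand rather than by code analysis, and the sketch assumes that only maximal paths are emitted, since a longer hint already implies access to every object along its prefix. The branch-dependent navigation through emp.dept is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch: derive prefetching hints from an augmented method type graph. */
final class PrefetchHintSketch {

    /** Node of AG_m; children are keyed by the navigated field name. */
    static final class Node {
        final Map<String, Node> children = new LinkedHashMap<>();

        /** Adds the navigation path f1.f2...fn below this node, sharing common prefixes. */
        Node navigate(String... fields) {
            Node current = this;
            for (String field : fields) {
                current = current.children.computeIfAbsent(field, f -> new Node());
            }
            return this;
        }
    }

    /** Emits one hint per root-to-leaf path of the graph. */
    static List<String> hints(Node root) {
        List<String> result = new ArrayList<>();
        collect(root, "", result);
        return result;
    }

    private static void collect(Node node, String prefix, List<String> out) {
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue().children.isEmpty()) {
                out.add(path);                   // leaf: a complete hint
            } else {
                collect(e.getValue(), path, out); // keep extending the path
            }
        }
    }

    /** Hand-built AG_m of setAllTransCustomers(), rooted at BankManagement. */
    static Node setAllTransCustomersGraph() {
        return new Node()
            .navigate("transactions", "type")                       // getAccount(): this.type
            .navigate("transactions", "emp")                        // getAccount(): this.emp
            .navigate("transactions", "account", "cust", "company") // getAccount() then setCustomer()
            .navigate("manager", "company");                        // parameter newCust.company
    }

    public static void main(String[] args) {
        hints(setAllTransCustomersGraph()).forEach(System.out::println);
    }
}
```

With this hand-built graph, the sketch yields the four hints transactions.type, transactions.emp, transactions.account.cust.company and manager.company, which is the hint set the text gives for setAllTransCustomers().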

[Figure 3: graph with nodes including BankManagement, Employee, Customer and Company and edges transactions, manager, emp, cust and company. Legend: single and collection associations.]

Figure 3: Augmented method type graph $AG_m$ of setAllTransCustomers() from Listing 1 (lines 30-34). Navigations highlighted in orange are branch-dependent (Section 4.4).

Example. The augmented method type graph $AG_m$ of Fig. 3 results in the following set of prefetching hints for method setAllTransCustomers(). Note that hints starting with the collection transactions predict that all its elements are accessed:

$$PH_m = \{\, transactions.type,\ transactions.emp,\ transactions.account.cust.company,\ manager.company \,\}$$

Given that we perform this analysis statically prior to the execution of the application, there are association navigations for which we cannot decide whether they are traversed or not, since they depend on the runtime behavior of the application. Thus, in this section, we study how to react in such cases where a static analysis might lead to erroneous predictions of which objects should be prefetched. In particular, we consider two types of such behavior:

• Navigations that depend on a method's branching behavior, which is determined by the method's conditional statements (e.g. if, if-else, switch-case) and branching instructions (e.g. return, break). These navigations may or may not be triggered during execution, depending on which branch is taken, and hence might lead us to predict access to an object that does not occur. An example of this is $Employee \overset{dept}{\rightsquigarrow} Department$, highlighted in orange in Fig. 3, which is only triggered inside one branch of a conditional statement (the else branch of getAccount() in Listing 1).
• Navigations that are triggered inside a method's overridden versions. This behavior is caused by the dynamic binding feature of OO languages, which allows an object declared with type $t$ to be initialized to a sub-type $t'$. Thus, when invoking a method of type $t$, the method being executed might actually be an overridden version defined in $t'$, which in turn might result in erroneous predictions.

Having identified the problem, and before proposing a solution, we analyzed how often methods contain this kind of runtime-dependent behavior in order to understand the magnitude of the problem. We performed this analysis using the applications we later use in Section 7 to evaluate our prefetching algorithm (OO7, WC, K-means, and PGA), combined with the applications of the SF110 corpus, which is a statistically representative sample of 100 Java applications from SourceForge, a popular open source repository, extended with the 10 most popular applications from the same repository [35].

Figure 4 shows an aggregation of relevant characteristics of the applications used in our study: number of classes, methods, conditional statements and loop statements. Table 1 also shows some summarized statistics of these characteristics and indicates that the test suite covers a wide range of applications, from very small applications to large applications containing over 20,000 methods.

Table 1: Summarized statistics of the corpus of applications used in our approach study. Columns: Max, Median, Avg, Std. Dev., Total.

[Figure 4: bar chart. x-axis: "Number of:" in power-of-10 intervals; y-axis: "Number of Applications"; series: Classes, Methods, Cond. Stmts., Loop Stmts.]

Figure 4: For each power-of-10 interval on the x-axis, the y-axis represents the number of applications of the SF110 corpus that have the number of classes, methods, conditional statements and loop statements (as detected by our approach) in that interval. For instance, the first dark blue line starting from the left means that the number of applications that have between 0 and 10 methods is 20.

Let us now analyze the conditional and loop statements in the studied applications. Figure 5 (a) shows the number of applications per percentage of conditional and loop statements that do not trigger any branch-dependent navigations. This means that the prefetching hints obtained when any branch is taken are the same (although the methods executed in each branch may be different, the accessed objects are the same). The category axis of Figure 5 (a) starts at 20% as none of the analyzed applications scored less in either case. It should be noted that one of the studied applications, greencow, does not have any conditional statements while two, greencow and dash-framework, do not have any loop statements. Table 2 shows that an average of 67.5% of conditional statements and 82% of loop statements do not trigger branch-dependent navigations, and hence do not pose a problem when generating access hints.

We aggregated these results to calculate the percentage of methods of each application that do not trigger any branch-dependent navigations, i.e. the methods for which our approach predicts the exact set of persistent objects that will be accessed. Figure 5 (b) shows the results of this experiment; its category axis starts at 40% as none of the studied applications scored a lower percentage. Figure 5 (b) shows that only 6 of the studied applications scored below 80%, which indicates that for 95.5% of the studied applications, our approach can generate the exact set of access hints for over 80% of methods. Table 2 indicates that on average, 88.8% of an application's methods do not trigger branch-dependent navigations, which is significantly higher than the average reported for conditional and loop statements, and also reports a low standard deviation of 7.9%.

[Figure 5, panel (a) "Conditional and Loop Statements": x-axis "Percentage of:", y-axis "Number of Applications"; series: Cond. Stmts., Loop Stmts. Panel (b) "Methods": x-axis "Percentage of Methods", y-axis "Number of Applications".]

Figure 5: For each 10% interval on the x-axis, the y-axis represents the number of applications of the SF110 corpus that have the percentage of conditional statements, loop statements and methods that do not trigger any branch-dependent navigations in that interval.

Table 2: Summarized statistics of the experimental results. The first three rows show the percentage of conditional statements, loop statements and methods that do not trigger any branch-dependent navigations. The last row shows the analysis time of the studied applications.

                      Min      Max     Median   Avg      Std. Dev.
  Cond. Stmts. (%)    26.8%    100%    67.1%    67.5%    17%
  Loop Stmts. (%)     24.8%    100%    85.7%    82%      15.7%
  Methods (%)         44%      100%    89.9%    88.8%    7.9%

These results indicate that the prediction errors stemming from branch-dependent navigations are confined to a limited number of methods, while our static code analysis approach can accurately predict access to persistent objects in most cases. This is also in line with the intuition of the authors of [23] that accesses to persistent data are, in general, independent of an application's branching behavior.

[Figure: diagram with the labels Static Code Analysis Component, Source Code Injection Component, Construct Type Graphs, Generate Prefetching Hints, Generate Prefetching Methods, Inject Prefetching Method Calls, and JAVA.]

Given these results, we conclude that the difference between the prefetching hints of the different branches of an application is quite small. Thus, in the implementation of CAPre we will include hints corresponding to branch-dependent navigations (i.e. assuming both branches are taken) to increase the true positive

ApplicationModiﬁed Classes

Persistent Object Store

JAVA

Application Classes

Figure 6: Overview of the proposed prefetching system. rate (i.e. predicted objects that are accessed by the application), with minimaleﬀect on false positives (i.e. predicted objects that are not accessed).By contrast, our previous work has a detailed study indicating that includingprefetching hints of overridden methods sharply increases the false positives ratein some cases [17]. Based on the results of this study, in the implementation of

CAPre , we will not include the prefetching hints of overridden methods whengenerating

P H m of a particular method m .
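To make the dynamic-binding issue concrete, the following minimal Java sketch shows how the declared type of a variable can hide which navigations are actually triggered at runtime. The class CorporateAccount and the method owner() are hypothetical names introduced only for this illustration; Account, Customer and Company merely echo the example data model and are not taken from the paper's listings.

```java
// Hypothetical mini data model; field names echo the running example.
class Company {
    final String name = "ACME";
}

class Customer {
    final Company company = new Company();
}

class Account {
    final Customer cust = new Customer();

    // Base version: navigates Account -> cust only.
    Object owner() {
        return cust;
    }
}

class CorporateAccount extends Account {
    // Overridden version: navigates one association further, Account -> cust -> company.
    @Override
    Object owner() {
        return cust.company;
    }
}

class DynamicBindingSketch {
    // The parameter is declared as Account, but the method that runs
    // depends on the runtime type of the argument.
    static String navigated(Account account) {
        return account.owner().getClass().getSimpleName();
    }
}
```

An analysis that only sees the declared type Account would predict the navigation to cust. Adding the hints of every overriding version would additionally predict company for all accounts, including those that never reach it, which illustrates how such hints can turn into false positives.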

5. System Overview

CAPre is a prefetching system for Persistent Object Stores based on the static code analysis of object-oriented applications described in Section 4. It consists of two main components, as depicted in Figure 6: 1. Static Code Analysis Component, and 2. Source Code Injection Component. The Static Code Analysis Component takes as input the source code of the application classes, written in Java, and executes the static analysis approach formalized in Section 4 in order to generate prefetching hints that predict, for each method of the application, which persistent objects should be prefetched. We implemented this analysis for Java applications since it is the most common OO language, but the theoretical concepts of our approach can be applied to any other OO language.

Afterwards, the Source Code Injection Component generates, for each method, a helper prefetching method that prefetches the objects predicted by the generated prefetching hints. It also injects an invocation of this prefetching method to activate the prefetching automatically when the application is executed. The generated and injected code snippets use multi-threading in order to perform the prefetching without interrupting the normal execution of the application, as well as to prefetch objects in parallel when using a distributed POS.

In the following subsections, we describe both components in detail.

The Static Code Analysis Component includes the implementation of the prediction approach summarized in Section 4. We used IBM Wala [36], an open-source tool that parses and manipulates Java source code, to generate an Abstract Syntax Tree (AST) and an Intermediate Representation (IR) of each method of the analyzed application. We then constructed the augmented type graphs of the application's methods using these two structures, before finally generating the set of prefetching hints for each method.

We used Wala's AST to identify conditional and loop statements. In particular, we identified two loop patterns used to iterate collections: using indexes or using iterators, each of which can be implemented with a for or a while loop. Similarly, we took if, if-else and switch-case statements into consideration when identifying conditional statements. On the other hand, we used the IR, which contains a custom representation of the method's instructions, in order to detect association navigations that occur inside the method. Each IR instruction consists of five parts:

• II: the instruction's index inside the IR,
• IType: the instruction type (e.g. method invocation),
• IParams: the instruction parameters (e.g. the invoked method, the accessed field),
• defVarID: the ID of the variable defined by the instruction (can be null if the instruction doesn't define any variables),
• usedVarIDs[]: zero or more previously-defined variables that are used by the instruction, indicated by their IDs.

Example. Listing 2 shows the IR instructions of the method setAllTransCustomers() from Listing 1. The line numbers correspond to the instruction indexes (II). Note that II , II , II , II and II are implicit instructions generated due to the for loop and are not explicitly invoked in the method's source code. Some examples of instructions from Listing 2 include:

• II , IType = getfield, IParams = <BankManagement, transactions, java/util/ArrayList>, defVarID = v , usedVarIDs = {v }: this instruction accesses the field BankManagement.transactions of type ArrayList and assigns it the variable ID v . It also uses the variable ID v , which corresponds to the self-reference this, to access the field.
• II , IType = invokemethod, IParams = <Account, setCustomer(Customer)V>, defVarID = φ, usedVarIDs = {v , v }: this instruction invokes the method Account.setCustomer(newCust) and uses two variable IDs: v corresponding to the object of type Account on which the method is invoked, and v corresponding to the field manager used as a parameter of the invoked method.

Listing 2: Wala's Intermediate Representation (IR) of the method setAllTransCustomers() from Listing 1.

    v = getfield : v
    v = invokemethod : v
    v = invokemethod : v
    conditionalbranch (eq, to iindex = -1): v , true
    v = invokemethod : v
    v = invokemethod : v
    v = getfield : v
    invokemethod : v , v
    goto (from iindex = 10 to iindex = 3)

Table 3 summarizes the IR instructions that we take into consideration when constructing the augmented method type graphs. We detect single association navigations with the instruction getfield when the type of the used field is user-defined (i.e. the type corresponds to a class defined in the application). As for collection association navigations, we detect them when one of the two following instructions occurs inside a loop statement:

• arrayload: which is used to access array elements,
• invokemethod of the method next() of the class java.util.Iterator: which is used to access collection elements.

To detect branch-dependent navigations, we consider the branching instructions continue, break and return when they occur inside a loop statement. When such an instruction is detected, the navigations resulting from all instructions inside the loop are marked as branch-dependent.
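The five-part instruction format and the detection rules for single and collection association navigations described above can be sketched in Java as follows. This is only an illustration under stated assumptions: IrInstruction, Navigation and NavigationClassifier are hypothetical names, IParams is reduced to a single string (the field type or the invoked method), and none of these classes correspond to Wala's actual API.

```java
import java.util.List;
import java.util.Set;

// Simplified, hypothetical stand-in for one IR instruction (not Wala's classes).
class IrInstruction {
    final int index;                 // II
    final String type;               // IType, e.g. "getfield"
    final String params;             // IParams, reduced to a field type or invoked method
    final Integer defVarId;          // defVarID, null if no variable is defined
    final List<Integer> usedVarIds;  // usedVarIDs[]

    IrInstruction(int index, String type, String params,
                  Integer defVarId, List<Integer> usedVarIds) {
        this.index = index;
        this.type = type;
        this.params = params;
        this.defVarId = defVarId;
        this.usedVarIds = usedVarIds;
    }
}

enum Navigation { SINGLE, COLLECTION, NONE }

class NavigationClassifier {
    static final String ITERATOR_NEXT = "java.util.Iterator.next()";

    // Applies the two detection rules from the text:
    // getfield on a user-defined type -> single association;
    // arrayload or Iterator.next() inside a loop -> collection association.
    static Navigation classify(IrInstruction instr, boolean insideLoop,
                               Set<String> userDefinedTypes) {
        if (instr.type.equals("getfield") && userDefinedTypes.contains(instr.params)) {
            return Navigation.SINGLE;
        }
        boolean elementAccess = instr.type.equals("arrayload")
                || (instr.type.equals("invokemethod") && instr.params.equals(ITERATOR_NEXT));
        if (elementAccess && insideLoop) {
            return Navigation.COLLECTION;
        }
        return Navigation.NONE;
    }
}
```

Under these assumptions, a getfield on a field of type java/util/ArrayList is classified as NONE, which matches the observation that accessing the collection field itself does not yet navigate to any persistent object.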
Moreover, all navigations resulting from instructions inside a branch of a conditional statement are marked as branch-dependent.

We also use invokemethod instructions to detect method invocations and augment the method's type graph with the type graph of the invoked method, as discussed in Section 4.2. When we do so, we bind the parameters of the method with the variables used in the invocation to detect association navigations triggered by passing a persistent object as a method parameter. Finally, we take into consideration return instructions, if any, to detect the object that was returned by a method, which might be used to access further objects from the method invocation directly (e.g. getAccount().setCustomer(newCust)).

Table 3

IR Instruction     Restrictions
Single Association Navigations
  getfield         User-defined field type
Collection Association Navigations
  arrayload        Inside loop analysis scope
  invokemethod     Method java.util.Iterator.next(), inside loop statement
Branch-Dependent Navigations
  break            Inside loop statement
  continue         Inside loop statement
  return           Inside loop statement
Method Invocations
  invokemethod     Method of user-defined class
Method Return Object
  return           N/A

These steps are detailed by the pseudo-code of Algorithm 1, which takes as input the source code of a method m and returns as output the augmented method type graph AG_m. The algorithm iterates through the instructions of the method and creates new nodes in AG_m through the method createNode(), which takes as parameters the ID of the variable defined by the instruction, whether it corresponds to a navigation of a single or collection association and if it is branch-dependent. The method createEdge() is used to add an edge to AG_m between the node of the current instruction and the nodes of previous instructions, based on the variable IDs used and defined by the instructions. Finally, in order to identify branch-dependent navigations, we implemented three helper methods used by the algorithm:

• getASTNode(instr): which returns the AST node corresponding to an IR instruction, and
• hasConditionalParent(node), hasLoopParent(node): which indicate if an AST node has a parent node corresponding to a conditional or loop statement, respectively.

Algorithm 1: Construct Augmented Method Type Graph

    Input: m ∈ M_t: Source code of the method to analyze
    Output: AG_m: Augmented Method Type Graph of m

    AG_m ← (φ, φ)
    foreach instr ∈ I_m do
        instrASTNode ← getASTNode(instr)
        // Identify branch-dependent navigations
        if hasConditionalParent(instrASTNode) || (hasLoopParent(instrASTNode) && IType(instr) ∈ {return, break, continue}) then
            isBranchDependent ← true
        else
            isBranchDependent ← false
        // Create single-association node in AG_m
        if IType(instr) = getfield && IParams(instr).fieldType ∈ T then
            AG_m ← AG_m ∪ createNode(defVarID(instr), 'single', isBranchDependent)
        // Create collection-association node in AG_m
        if (IType(instr) = arrayload || (IType(instr) = invokemethod && IParams(instr).invokedMethod = 'java.util.Iterator.next()')) && hasLoopParent(instrASTNode) then
            AG_m ← AG_m ∪ createNode(defVarID(instr), 'collection', isBranchDependent)
        // Add nodes of invoked method to AG_m
        if IType(instr) = invokemethod && IParams(instr).invokedMethod ∈ M_T then
            m′ ← IParams(instr).invokedMethod
            AG_m′ ← getMethodGraph(m′)
            foreach node ∈ AG_m′ do
                if isParameterNode(node) then
                    AG_m ← AG_m ∪ bindParameter(node)
                else
                    AG_m ← AG_m ∪ node
        // Flag return object of method
        if IType(instr) = return then
            usedNode ← getNode(defVarID(instr))
            setIsReturnNode(usedNode)
        // Create edges between new and previous nodes
        definedNode ← getNode(defVarID(instr))
        foreach usedVarID ∈ usedVarIDs(instr) do
            usedNode ← getNode(usedVarID)
            AG_m ← AG_m ∪ createEdge(usedNode, definedNode)
    return AG_m

Example. Applying Algorithm 1 on the instructions of setAllTransCustomers() shown in Listing 2 results in the type graph AG_m depicted in Figure as follows:

• The instruction II = getfield transactions accesses a field of type collection. Hence, no changes are made to AG_m.
• II is an invocation of java.util.Iterator.next() inside a loop statement, which means it is accessing elements of the collection transactions.
Hence, a new node with the variable ID of transactions, cardinality collection and isBranchDependent = false is added to AG_m.
• II is an invocation of getAccount(). Hence, the type graph of getAccount() is added to AG_m and linked to the node corresponding to II , based on the used variable ID v .
• II is a getfield instruction that accesses the object manager. Thus, it results in the creation of a new node with the variable ID of manager and cardinality single.
• II is an invocation of setCustomer(newCust) and results in adding its type graph to AG_m, linking it to the node resulting from II , which represents the return object of getAccount(). We also bind the method's parameter to the node resulting from II .

Note that II , II , II , and II do not access any persistent objects and hence do not cause any changes to AG_m.

We generate the set of prefetching hints of a method, PH_m, by traversing the augmented method type graph constructed following Algorithm 1. At this point, it is important to remember how we handle runtime application behavior (discussed in Section 4.4). In case of branch-dependent navigations, we will include the prefetching hints of all the branches, both taken and not taken, since it was shown in Section 4.4 to be the best option. On the other hand, we will not include any hints to prefetch the objects accessed by overridden methods, because it was shown to be a significant source of false positives in our previous work [17].

We then perform one final modification to PH_m by removing hints already found in previous method calls. For instance, a method m that invokes another method m′ will have the prefetching hints resulting from both m and m′, which allows us to bring the prefetching forward, ensuring that the predicted objects are prefetched before they are accessed.

However, this also means that m and m′ might have prefetching hints predicting access to the same objects, which leads to launching several requests to prefetch the same objects when the application is executed, causing unnecessary overhead. We solve this problem by removing from PH_m those prefetching hints that are found in all of the methods that invoke m. This solution does not affect the prediction accuracy of the approach since the objects predicted by the removed hints are prefetched by other hints in a previously executed method.

Considering an application with a set of methods M, Algorithm 1 has a complexity of O(|I_m|) when generating the augmented method type graph of any method m ∈ M, where |I_m| is the number of Wala IR instructions of m. Moreover, constructing the augmented method type graphs of all of the methods in M has a computational complexity of O(|M| · max(|I_m|)), where max(|I_m|) is the number of IR instructions of the largest method in the application.

This is due to the fact that each method of the application is only analyzed, and its prefetching hints are only generated, once, even if it is invoked multiple times by different methods of the application. Apart from this theoretical computational complexity, we provide detailed results of the time it takes to execute this static code analysis on various applications in Section 7.1.

The goal of the Source Code Injection Component is to modify the original source code of the application in order to prefetch the objects predicted by the prefetching hints generated by the Static Code Analysis Component. To do so, we first generate a helper prefetching method for each method of the application, which loads the objects predicted by the method's prefetching hints from the POS. Afterwards, we use AspectJ to inject an invocation of the generated prefetching method inside each method of the application. By doing so, the objects predicted by a method's AG_m are automatically prefetched when the application is executed.

Given that each POS has specific instructions that are used to retrieve stored objects, the exact instructions used in the prefetching methods to load the predicted objects depend on the used POS. For the purposes of this example, we assume that the POS has an instruction called load() that loads and returns a typed object from the POS. The generated prefetching method takes as parameter the object on which the original method is executed, starting from which it then prefetches the predicted objects.

Example. The Source Code Injection Component generates the following prefetching method for the method setAllTransCustomers() from Listing 1. Note that the prefetching method is defined in a new prefetching class corresponding to the class BankManagement. Also note that the instruction load() is substituted with the concrete instruction that loads an object depending on the used POS, as will be explained in Section 6.

Listing 3: Helper prefetching method of setAllTransCustomers() from Listing 1.

    public class BankManagement_prefetch {
        public void setAllTransCustomers_prefetch(BankManagement rootObject) {
            for (Transaction trans : rootObject.load(transactions)) {
                trans.load(type);
                trans.load(emp);
                trans.load(account).load(cust).load(company);
            }
            rootObject.load(manager).load(company);
        }
    }
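As a rough illustration of how prefetching hints could be turned into the source text of such a helper method, the sketch below generates a method body from a list of hint paths. Everything here is an assumption made for illustration: the class PrefetchMethodGenerator, the dot-separated hint syntax with a "[]" suffix marking a collection association, and the use of forEach in the generated text are hypothetical and do not describe CAPre's actual generator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PrefetchMethodGenerator {

    // Hypothetical hint syntax: "manager.company" is a chain of single
    // associations; "transactions[].type" starts at a collection.
    static String generate(String className, String methodName, List<String> hints) {
        Map<String, List<String>> perCollection = new LinkedHashMap<>();
        List<String> singles = new ArrayList<>();
        for (String hint : hints) {
            int marker = hint.indexOf("[].");
            if (marker >= 0) {
                String collection = hint.substring(0, marker);
                String rest = hint.substring(marker + 3);
                perCollection.computeIfAbsent(collection, k -> new ArrayList<>()).add(rest);
            } else {
                singles.add(hint);
            }
        }
        StringBuilder src = new StringBuilder();
        src.append("public static void ").append(methodName).append("_prefetch(")
           .append(className).append(" rootObject) {\n");
        // One loop per collection, grouping all hints that start from it.
        for (Map.Entry<String, List<String>> entry : perCollection.entrySet()) {
            src.append("  rootObject.load(").append(entry.getKey()).append(").forEach(elem -> {\n");
            for (String path : entry.getValue()) {
                src.append("    elem").append(chain(path)).append(";\n");
            }
            src.append("  });\n");
        }
        for (String path : singles) {
            src.append("  rootObject").append(chain(path)).append(";\n");
        }
        return src.append("}\n").toString();
    }

    // Turns "account.cust.company" into ".load(account).load(cust).load(company)".
    static String chain(String path) {
        StringBuilder sb = new StringBuilder();
        for (String field : path.split("\\.")) {
            sb.append(".load(").append(field).append(")");
        }
        return sb.toString();
    }
}
```

Calling generate("BankManagement", "setAllTransCustomers", ...) with the hints transactions[].type, transactions[].emp, transactions[].account.cust.company and manager.company would produce text with the same load chains as Listing 3, grouped by the collection they start from.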

We further optimize CAPre by performing parallel prefetching when an application accesses objects stored in a distributed POS. For instance, in the set of prefetching hints PH_m defined in Section 4, the elements of the transactions collection can be prefetched in parallel if they are stored in different nodes of a distributed POS. On the other hand, parallelizing single-association hints, such as manager.company, is not possible since we need to load the object manager before its associated company is loaded.

We implemented this parallel prefetching by using the Parallel Streams of Java 8, which convert a collection into a stream and divide it into several substreams. The Java Virtual Machine (JVM) then uses a predefined pool of threads to execute a specific task for each substream, which avoids the costs of creating and destroying threads in each prefetching method. The number of threads in the pool is derived by the JVM from the number of processor cores of the current machine, and the management of the threads is done automatically by the JVM.

Example. The parallel version of the prefetching method setAllTransCustomers_prefetch() is shown in Listing 4.

Listing 4: Parallelized prefetching method of setAllTransCustomers() from Listing 1.

    public class BankManagement_prefetch {
        public static void setAllTransCustomers_prefetch(BankManagement rootObject) {
            // Parallel prefetching of collection elements
            rootObject.load(transactions).parallelStream().forEach(trans -> {
                trans.load(type);
                trans.load(emp);
                trans.load(account).load(cust).load(company);
            });
            // Cannot be parallelized
            rootObject.load(manager).load(company);
        }
    }
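Since load() is a placeholder for a POS-specific instruction, Listing 4 cannot be run on its own. The following self-contained sketch reproduces only the parallel-stream pattern, under the assumption that a remote load can be simulated by a short sleep; ParallelPrefetchSketch, loadFromStore() and prefetchAll() are hypothetical names. By default, parallel streams execute on the JVM's common fork/join pool, so no threads are created explicitly here.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ParallelPrefetchSketch {

    // Stand-in for loading one object from a storage node
    // (hypothetical; the latency is simulated with a short sleep).
    static String loadFromStore(int objectId) {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "obj-" + objectId;
    }

    // Loads all collection elements in parallel into a thread-safe cache.
    // forEach is a terminal operation, so it returns only after every
    // element has been processed.
    static Set<String> prefetchAll(List<Integer> objectIds) {
        Set<String> cache = ConcurrentHashMap.newKeySet();
        objectIds.parallelStream().forEach(id -> cache.add(loadFromStore(id)));
        return cache;
    }
}
```

The cache is a concurrent set because several worker threads add to it at the same time; a plain HashSet would not be safe in this pattern.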

5.2.3. Injecting Prefetching Method Invocations

Instead of directly invoking the prefetching methods, we implemented a multi-threaded approach where the prefetching methods are executed by a background thread in parallel to the main thread of the application. By doing so, we allow the execution of the application to continue uninterrupted while prefetching objects in another thread whenever possible.

We achieved this by using a thread pool executor that creates a pool of one or more threads at the application level and then schedules tasks for execution in the created threads. This solution helps to save resources, since threads are not created and destroyed multiple times, and also keeps the parallelism within predefined limits, such as the number of threads that are run in parallel. Hence, we inject the following instruction into the class that contains the main method from which the execution of the application starts:

    public static final ThreadPoolExecutor prefetchingExecutor =
        (ThreadPoolExecutor) Executors.newFixedThreadPool(1);

This instruction creates a thread pool executor with a single thread to execute the generated prefetching methods. Afterwards, we inject at the beginning of each method a scheduling of its helper prefetching method using this constructed thread pool. The executor then checks the scheduled tasks and executes them consecutively in its thread. Note that when using the parallel prefetching methods, the single thread of the executor creates multiple sub-threads to perform the prefetching in parallel.
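The scheduling behaviour described above can be observed with a small stand-alone sketch. It assumes nothing about CAPre beyond the single-threaded executor: PrefetchSchedulingSketch and the task labels are hypothetical, and the executor is shut down at the end only so that the example terminates, which a long-running application would not do after each method.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PrefetchSchedulingSketch {

    // Schedules two prefetching tasks on a single background thread and
    // returns the order in which they were executed.
    static List<String> run() {
        ExecutorService prefetchingExecutor = Executors.newFixedThreadPool(1);
        List<String> executed = new CopyOnWriteArrayList<>();

        // Each "application method" first schedules its helper prefetching
        // method and then continues with its own work without waiting.
        prefetchingExecutor.submit(() -> { executed.add("prefetch:methodA"); });
        prefetchingExecutor.submit(() -> { executed.add("prefetch:methodB"); });

        // Stop accepting new tasks and wait for the queued ones to finish.
        prefetchingExecutor.shutdown();
        try {
            prefetchingExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executed;
    }
}
```

With a pool of one thread, the queued tasks run one after another in submission order, which corresponds to the consecutive execution described in the text.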

Example. Listing 5 shows the instructions injected into the method setAllTransCustomers(), which schedule its helper prefetching method setAllTransCustomers_prefetch() for execution.

Listing 5: Injected scheduling of the prefetching method from Listing 4 into setAllTransCustomers().

    public void setAllTransCustomers() {
        // Injected scheduling of prefetching method
        final BankManagement rootObject = this;
        prefetchingExecutor.submit(new Runnable() {
            @Override
            public void run() {
                BankManagement_prefetch.setAllTransCustomers_prefetch(rootObject);
            }
        });
        ...
    }

6. Prefetching in dataClay

In order to evaluate the effect of CAPre on application performance, we integrated it into dataClay. dataClay is an object store that distributes objects across the network [14, 13] among the available storage nodes. In contrast with other database systems, data stored in dataClay never moves outside the POS. Instead, data is manipulated in the form of objects, exposing only the operations that can be executed on the data, which are executed inside the data store, in a manner transparent to the applications using the store. Figure 7 shows the system architecture of dataClay.

Figure 7: System architecture of dataClay. A deployment of a Logic Module and three Data Services on different nodes is depicted with the communications between the client and dataClay and between Logic Module and Data Services [14].

To use dataClay, the client first needs to register the application schema, i.e. the set of persistent classes (fields and methods) that will be used by the application, to a centralized service called the Logic Module. The Logic Module then adds system-specific functionality to the received classes and deploys the modified classes to the Data Services, which are the nodes of dataClay where the persistent objects are stored, and sends them back to the client.

We integrated CAPre into dataClay during this registration process. When the classes are sent to the Logic Module for registration, CAPre intercepts the source code of the classes, performs the analysis and injects the prefetching classes and prefetching method invocations. These prefetching classes are then sent along with the modified application classes to the Logic Module for registration. Since dataClay automatically loads an object when a reference to that object is made, the generated prefetching methods do not use any specific instructions to load the predicted objects but rather make explicit references to them (e.g. trans.type, trans.account.cust.company).

Once the application schema is registered, the client can store any local objects with the type of a registered class in dataClay, which automatically distributes the stored objects among the available Data Services. The client can then access the stored objects to execute any method defined in the registered schema. However, dataClay does not send the objects to the client but rather executes the methods locally in the same Data Service where the object is stored. Given the changes made by CAPre during the schema registration, the helper prefetching method of the executed method is invoked once an execution request is received by a Data Service, and the predicted objects are prefetched into the local memory of the Data Service. When a prefetching method encounters an object in another Data Service, dataClay communicates with that Data Service to load the object where it is stored.

Example. Executing the method setAllTransCustomers() (Listing 5) from a client application using dataClay with three Data Services, DS , DS and DS (Figure 7) on an object of type BankManagement stored in DS , is done through the following steps:

• First, the client application launches the execution request to dataClay, which in turn automatically redirects it to DS , where the object BankManagement is stored.
• When DS receives the execution request of setAllTransCustomers(), it schedules the prefetching method setAllTransCustomers_prefetch() for execution with the prefetching thread pool, as explained in Section 5.2.3.
• Once the prefetching method is executed, it creates several sub-threads and starts loading the elements of the collection transactions, which was automatically distributed by dataClay, in parallel from the different Data Services.
• When one of these threads, currently being executed on DS , tries to load an object stored in a different Data Service, say DS , dataClay redirects the load request to DS and loads the object where it is stored.

7. Evaluation

The purpose of this evaluation is to analyse how CAPre reduces application execution time, which is the ultimate goal of our prefetching technique. For other indicators such as the true positive or the false negative rates, we refer the reader to [17], where these metrics were analysed in detail.

Before we evaluate the performance gains obtained by applications when using CAPre, it is important to show that the proposed static code analysis and the generation of the prefetching hints can be run in a reasonable amount of time. To this end, we ran the static code analysis on the applications of the SF110 corpus (introduced in Section 4.4) as well as on the applications we used to evaluate the performance gains of CAPre, as detailed in Section 7.2.

Figure 8 plots the number of applications per range of analysis time in milliseconds and shows that 96 of the SF110 applications were analyzed in less than 1 second. Moreover, it also shows that the longest time the static code analysis took was 16 seconds, and this occurred with weka, the second largest application with over 20,000 methods.

Figure 8: For each power-of-10 interval on the x-axis, the y-axis represents the number of applications for which our static code analysis approach finishes within that interval (in milliseconds).

As expected, the analysis time of our approach is correlated with the number of classes and methods of an application. However, with an average analysis time of 651 milliseconds and a maximum of roughly 16 seconds, we believe that the analysis finishes within a reasonable time for all of the analyzed applications. It is worth mentioning again here that this static analysis is done only once, prior to application execution, and does not add any overhead to its execution time.

Going into more detail on the four benchmarks that we will later use to assess the performance gains of CAPre, Table 4 shows the time needed to compile each of the benchmarks (by executing a javac command) and the time needed to perform our code analysis. As we can see, the time needed to analyze the application code is lower than the pure compilation time for three of the four benchmarks and only marginally higher for PGA (1,068 ms versus 1,041 ms), thus it will not imply a significant overhead when compiling the application (again, this analysis is only performed once before the applications are executed).

Table 4: Comparison between the compilation times and the times needed to perform the CAPre static code analysis of each of the four benchmarks used in our evaluation (Section 7.2).

             Compilation   CAPre Analysis
OO7          1,030 ms      827 ms
Wordcount    923 ms        633 ms
K-Means      916 ms        519 ms
PGA          1,041 ms      1,068 ms

We tested the effect that CAPre has on application performance by measuring the execution times of four benchmarks using dataClay without any prefetching, and with CAPre. We also compared CAPre with the Referenced-Objects Predictor (ROP), defined in Section 1, using different fetch depths, which indicate the levels of related objects that the ROP should prefetch.

For each experiment, we executed the benchmark 10 times and took the average execution time. We ran all of the experiments on a cluster of 5 nodes interconnected by a 10GbE link. Each node is composed of a 4-core Intel Xeon E5-2609v2 processor (2.50GHz), 32GB of DRAM (DDR3) and a 1TB HDD (WD10JPVX 5400rpm). We deployed dataClay on the cluster using one node as both the client and Logic Module, and 4 nodes as 4 distinct Data Services. The rest of this section presents the results of our experiments on each of the studied benchmarks separately.

OO7 is the de facto standard benchmark for POSs and object-orienteddatabases [37]. Its data model is meant to be an abstraction of diﬀerent CAD/-CAM/CASE applications and contains a recursive data structure involving a setof classes with complex inheritance and composition relationships, as depictedin Figure 9. The benchmark includes a random data generator that takes asparameter the size of the database to be generated: small (˜1,000 objects),medium (˜30,000 objects) and large (˜600,000 objects). The benchmark alsohas an implemented set of 6 traversals, from which we executed the following:

• t1: tests the data access speed by traversing the benchmark's data model starting from the object Module.
• t2a, t2b and t2c: test the update speed by updating different numbers of Composite Parts and Atomic Parts.

We did not execute the two remaining traversals, t8 and t9, given that they were designed to test text processing speed and only load one persistent object, Manual.

Figure 9: Class diagram of the OO7 benchmark. Classes shown: OO7Benchmark (createOO7Database(int dbSize), runTraversals()), Module, Manual (title : String, text : String), Assembly, ComplexAssembly, BaseAssembly, CompositePart, AtomicPart (x : Integer, y : Integer), Document (title : String, text : String), Connection (length : Integer, type : String).

Figure 10 (a) shows the execution times of the traversal t1 with the three OO7 database sizes. It indicates that CAPre offers more improvement over the original execution time than the ROP, which offers gradually better improvement when increasing its fetch depth from 1 to 5 before it stagnates with a fetch depth of 10. This behavior is expected, since ROP can only prefetch objects up to a certain depth before running out of referenced objects to prefetch. On the other hand, CAPre does not depend on a predefined fetch depth and can prefetch as many levels of related objects as predicted by the code analysis it performs. In addition, given that CAPre knows which collections will be accessed, their elements can also be prefetched, something that is not done by the ROP algorithm regardless of its depth (prefetching a collection that may not be used is too much overhead). This enables CAPre to prefetch many more objects, and thus benefit more from the parallel access to the distributed storage.

When considering previous work on prefetching that has used OO7 as a benchmark, Ibrahim et al. report an improvement of 7% in execution time with the small OO7 database, while Bernstein et al. report an improvement of 11% on the medium-sized database [28]. While these numbers are not directly comparable to the ones obtained in our experiments, given that the approaches use a different POS with different levels of optimization and run their experiments on different hardware, it is worth mentioning that CAPre achieves an improvement of 30% and 26% with the small and medium OO7 databases respectively.

As for the traversal t2b, Figure 10 (b) shows that neither CAPre nor the ROP offer any improvement, since the latency of the traversal is not caused by data access but rather by the time taken to store the updated objects. However, the figure also indicates that using the ROP produces significant overhead, caused by the fact that it prefetches the objects referenced from the object being updated, when in fact these objects are never accessed. By contrast, CAPre does not prefetch these objects, since it takes the application's code into consideration and is aware that they are not needed, thus producing very little overhead. Note that the execution times of the traversals t2a and t2c were left out of this paper because they exhibit similar behavior in terms of added overhead for both CAPre and the ROP.

Figure 10: Execution times of the traversals t1 and t2b of the OO7 benchmark. Panels: (a) Traversal t1, execution time in seconds; (b) Traversal t2b, execution time in milliseconds; each for the small, medium and large DB. Legend: No prefetching, ROP (depth = 1), ROP (depth = 3), ROP (depth = 5), ROP (depth = 10), CAPre.
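The contrast drawn above between a depth-limited predictor and hint-driven prefetching can be sketched with a toy object graph. The Java code below is only an illustration under stated assumptions: the Obj class, the association names ("manual", "assemblies", "parts") and the path-based hint format are hypothetical and are neither the actual ROP implementation nor CAPre's generated prefetching methods. It shows that following single references to any depth never reaches collection elements, whereas a hint naming the collections to traverse does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PrefetchScopeSketch {
    /** Toy persistent object with single-valued references and collection-valued associations. */
    static final class Obj {
        final String id;
        final Map<String, Obj> single = new HashMap<>();
        final Map<String, List<Obj>> collections = new HashMap<>();
        Obj(String id) { this.id = id; }
    }

    /** Depth-limited heuristic: follow every single reference, level by level, up to fetchDepth. */
    static Set<String> depthLimitedPrefetch(Obj root, int fetchDepth) {
        Set<String> seen = new LinkedHashSet<>();
        seen.add(root.id);
        List<Obj> frontier = new ArrayList<>();
        frontier.add(root);
        for (int level = 0; level < fetchDepth && !frontier.isEmpty(); level++) {
            List<Obj> next = new ArrayList<>();
            for (Obj o : frontier) {
                for (Obj ref : o.single.values()) {
                    if (seen.add(ref.id)) {
                        next.add(ref);
                    }
                }
            }
            frontier = next;
        }
        seen.remove(root.id); // the root is already in memory, so it is not prefetched
        return seen;
    }

    /** Hint-driven prefetch: follow one named path, expanding collections along the way. */
    static Set<String> hintPrefetch(Obj root, List<String> path) {
        Set<String> fetched = new LinkedHashSet<>();
        List<Obj> current = List.of(root);
        for (String name : path) {
            List<Obj> next = new ArrayList<>();
            for (Obj o : current) {
                if (o.single.containsKey(name)) {
                    next.add(o.single.get(name));
                }
                if (o.collections.containsKey(name)) {
                    next.addAll(o.collections.get(name));
                }
            }
            for (Obj o : next) {
                fetched.add(o.id);
            }
            current = next;
        }
        return fetched;
    }

    /** Hypothetical store: a module with one manual and two assemblies of two parts each. */
    static Obj buildExample() {
        Obj module = new Obj("module");
        module.single.put("manual", new Obj("manual"));
        List<Obj> assemblies = new ArrayList<>();
        for (int a = 0; a < 2; a++) {
            Obj assembly = new Obj("assembly-" + a);
            List<Obj> parts = new ArrayList<>();
            for (int p = 0; p < 2; p++) {
                parts.add(new Obj("part-" + a + "-" + p));
            }
            assembly.collections.put("parts", parts);
            assemblies.add(assembly);
        }
        module.collections.put("assemblies", assemblies);
        return module;
    }

    public static void main(String[] args) {
        Obj module = buildExample();
        System.out.println("depth-limited: " + depthLimitedPrefetch(module, 10));
        System.out.println("hint-driven:   " + hintPrefetch(module, List.of("assemblies", "parts")));
    }
}
```

In this toy store, raising the fetch depth does not change what the depth-limited heuristic retrieves, which mirrors the stagnation reported for the ROP, while the number of objects reached by the hint grows with the size of the collections.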

Wordcount is a parallel algorithm that parses input text files, splitting their text lines into words, and outputs the number of appearances of each unique word. Due to the resemblance of this algorithm to the problem of creating histograms, Wordcount is commonly used as a Big Data benchmark. Unlike OO7, the data model of this benchmark, depicted in Figure 11, is fairly simple. It consists of several Text Collections, each containing one or more Texts representing the input files. Each of the Text objects in turn contains one or more Chunks, which represent fragments of the text and contain the words to be counted.

Figure 11: Class diagram of the Wordcount benchmark. Classes shown: WordcountBenchmark (createTextCollections(String filePath), computeWordCount(int itrs)), TextCollection, TextChunk (words : ArrayList); association labels collections 1..*, texts 1..*, chunks 1..*.

In our experiments, we used a data set of 8 files, containing a total of 10 words, divided them into four collections, and distributed the collections among the four dataClay Data Services. Furthermore, we ran the benchmark with different numbers of chunks c, ranging from one chunk containing all the words in each text (i.e. few large objects) to 10 chunks per text containing very few words (i.e. many small objects).

Figure 12 shows the execution times of the Wordcount benchmark. Given that the data model of Wordcount is simpler than that of OO7, we can see that the ROP stagnates at a lower fetch depth of 3. For this reason, we do not include the results for ROP with a fetch depth of 10 in any of the remaining experiments in this section. On the other hand, given that most of the data are collections, CAPre knows which ones to prefetch and thus brings them to main memory (something that, as we have mentioned, cannot be done by ROP), increasing the hit ratio and thus reducing the execution time by more than 50% in some cases.

This improvement is considerably higher than what we obtained with OO7, because the Wordcount data model contains many collection associations, which can be prefetched by our approach. Finally, Figure 12 also shows that CAPre offers stable improvement regardless of the number of chunks, which indicates that it can be equally beneficial for applications that handle a small number of large objects or many small-sized objects.

Figure 12: Execution times of the Wordcount benchmark. Execution time in seconds, for different numbers of chunks c.

Figure 13: Class diagram of the K-Means benchmark. Classes shown: KMeansBenchmark (generateRandomVecs(int n), computeKMeans(int k)), VectorCollection, Vector (dims : integer[]); association labels collections 1..*, vectors 1..*.
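To make the role of the chunk count c in the Wordcount experiments more concrete, the following Java sketch splits a list of words into chunks and counts word occurrences across all chunks in parallel. It is a simplified in-memory illustration only: the class WordcountSketch and its methods are assumptions for this example and do not reproduce the benchmark's persistent classes or its distribution over Data Services.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class WordcountSketch {
    /** Splits the words into at most c chunks of near-equal size (c >= 1). */
    static List<List<String>> chunk(List<String> words, int c) {
        if (c < 1) {
            throw new IllegalArgumentException("c must be >= 1");
        }
        int size = Math.max(1, (words.size() + c - 1) / c); // ceil(n / c), at least 1
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < words.size(); i += size) {
            chunks.add(words.subList(i, Math.min(words.size(), i + size)));
        }
        return chunks;
    }

    /** Counts the occurrences of each unique word over all chunks, processing chunks in parallel. */
    static Map<String, Long> count(List<List<String>> chunks) {
        return chunks.parallelStream()
                .flatMap(List::stream)
                .collect(Collectors.groupingBy(w -> w, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = List.of("to", "be", "or", "not", "to", "be");
        System.out.println(count(chunk(words, 1))); // one large chunk
        System.out.println(count(chunk(words, 3))); // several small chunks, same counts
    }
}
```

The result does not depend on c; only the number and size of the objects that have to be loaded changes, which is the dimension varied in the experiments above.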

K-Means is a clustering algorithm commonly used as a Big Data benchmark that aims to partition n input vectors into k clusters in which each vector belongs to the cluster with the nearest mean. It is a complex recursive algorithm that requires several iterations to reach a converging solution. The data model of K-Means that we used, depicted in Figure 13, consists of a set of VectorCollections, each containing a subset of the n input Vectors. We ran our experiments using various numbers of randomly generated vectors, n, each consisting of 10 dimensions, and different values of k. We also divided the input vectors into 4 collections and distributed the collections among the dataClay Data Services.

Figure 14 shows the execution times of this benchmark. In this case, the ROP does not offer any significant improvement regardless of the fetch depth, given that the benchmark's data model does not contain any single associations that can be prefetched. On the contrary, CAPre achieves better improvement, reducing the benchmark's execution time by between 9% and 15% when prefetching data collections in parallel, which again shows the advantage of CAPre.

Figure 14: Execution times of the K-Means benchmark. Execution time in seconds, for different values of n and k.

The Princeton Graph Algorithms (PGA) is a benchmark used to test the execution times of complex graph traversal algorithms using different types of graphs (e.g. undirected, directed, weighted) [38]. Figure 15 depicts the subset of the benchmark's classes that we used in our experiments. Namely, we executed the Depth-First Search (DFS) and Bellman-Ford Shortest Path algorithms using a WeightedDirectedGraph. The graph consists of a set of Vertex objects, each containing the outgoing WeightedEdges of the vertex. We ran our experiments using different numbers of randomly generated vertices v and edges e, which we chose to construct graphs with different levels of edge density. As with the rest of the benchmarks, we distributed the data among the four Data Services of dataClay.

Figure 15: Class diagram of the PGA benchmark. Classes shown: PGABenchmark (generateRandomGraph(int v, int e), executeAlgorithms()), WeightedDirectedGraph, Vertex (id : int), WeightedEdge (source : int, target : int, weight : int); association labels graph 1, vertices 1..*, outgoingEdges 1..*.

Figure 16: Execution times of the Princeton Graph Algorithms benchmark. Panels: (a) Depth-First Search (DFS); (b) Bellman-Ford Shortest Path; execution time in seconds, for different values of v and e. Legend: No prefetching, ROP (depth = 1), ROP (depth = 3), ROP (depth = 5), CAPre.

Figure 16 (a) shows that the execution times of the DFS algorithm are similar to those reported for the Wordcount benchmark, where CAPre doubles the improvement achieved by ROP, and the same rationale applies. On the other hand, Figure 16 (b) indicates that even when using CAPre, we do not see significant improvement in the execution time of the Bellman-Ford algorithm. This is due to the fact that this algorithm does not access the graph's vertices in a predetermined order, but rather starts from a source vertex and applies a trial-and-error approach to reach the shortest path solution using various intermediate data structures, and thus predicting access to the objects it uses is more difficult. Nevertheless, it is also important to notice that in these cases, CAPre knows what not to prefetch and does not add unnecessary overhead, as happens in some cases with ROP.
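The DFS access pattern discussed above can be illustrated with a small in-memory version of the data model in Figure 15. The Java sketch below reuses the class and field names from the figure (WeightedDirectedGraph, Vertex, WeightedEdge, outgoingEdges), but the constructors, the addEdge helper and the iterative traversal are assumptions written for this example and are not the benchmark's code. Each visited vertex leads to an iteration over its outgoingEdges collection, which is the kind of collection access that the rationale given for Wordcount relies on.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class GraphDfsSketch {
    static final class WeightedEdge {
        final int source;
        final int target;
        final int weight;
        WeightedEdge(int source, int target, int weight) {
            this.source = source;
            this.target = target;
            this.weight = weight;
        }
    }

    static final class Vertex {
        final int id;
        final List<WeightedEdge> outgoingEdges = new ArrayList<>();
        Vertex(int id) { this.id = id; }
    }

    static final class WeightedDirectedGraph {
        final List<Vertex> vertices = new ArrayList<>();
        WeightedDirectedGraph(int vertexCount) {
            for (int i = 0; i < vertexCount; i++) {
                vertices.add(new Vertex(i));
            }
        }
        void addEdge(int source, int target, int weight) {
            vertices.get(source).outgoingEdges.add(new WeightedEdge(source, target, weight));
        }
    }

    /** Iterative DFS from `start`; returns vertex ids in the order in which they are first visited. */
    static List<Integer> dfsOrder(WeightedDirectedGraph g, int start) {
        List<Integer> order = new ArrayList<>();
        boolean[] visited = new boolean[g.vertices.size()];
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            int v = stack.pop();
            if (visited[v]) {
                continue;
            }
            visited[v] = true;
            order.add(v);
            List<WeightedEdge> edges = g.vertices.get(v).outgoingEdges;
            // Push in reverse so that edges are explored in insertion order.
            for (int i = edges.size() - 1; i >= 0; i--) {
                int target = edges.get(i).target;
                if (!visited[target]) {
                    stack.push(target);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        WeightedDirectedGraph g = new WeightedDirectedGraph(4);
        g.addEdge(0, 1, 5);
        g.addEdge(0, 2, 3);
        g.addEdge(1, 3, 1);
        g.addEdge(2, 3, 2);
        System.out.println(dfsOrder(g, 0));
    }
}
```

Bellman-Ford, by contrast, repeatedly relaxes edges through intermediate distance structures, so the order in which persistent objects are touched is much harder to read off the code, which is consistent with the limited improvement reported above.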

.3. Discussion

The results obtained from our experiments indicate that CAPre offers the highest improvement in execution time when used with applications with a complex data model, such as OO7. This is due to the fact that CAPre is based on type graphs, which analyze the way that the data model of the application is accessed by its methods. As such, the more complex a data model is, the more information can be retrieved on which to base the prefetching predictions.

Nonetheless, the fact that CAPre can safely predict access to collections as well as single objects allows it to be used with simple data models that contain many collection associations as well, as is the case with the Wordcount and K-Means benchmarks. This prediction of access to collections also increases the number of objects to be prefetched at a time, thus giving CAPre more margin to take advantage of any potential parallelism in the POS when prefetching the predicted objects.

This prediction of access to collections of persistent objects, and the associated parallel prefetching of these objects, is an important area where CAPre outperforms ROP. As discussed throughout this section, ROP is limited to predicting access to single objects and is unable to predict access to collections, due to its heuristic of retrieving objects related to the one currently being accessed. This in turn means that a prefetching system based on ROP is not able to take advantage of parallelism in the POS, given that collections of objects that can be accessed in parallel are never predicted for prefetching.

In terms of data size, the experiments indicate that CAPre provides the same level of improvement regardless of the number or size of persistent objects manipulated by each benchmark. This indicates that CAPre can be used both with applications that manipulate a large number of small persistent objects and with those that manipulate a small number of large persistent objects. When compared with the ROP, CAPre achieves at least the same improvement and, in cases where prefetching is not needed, the negative effect on application performance is significantly smaller than when using ROP.

Throughout our experiments, we encountered one limitation of CAPre, with the Bellman-Ford shortest path algorithm, where it could not offer significant improvement because the algorithm accesses persistent objects in a random order that is difficult to predict. Theoretically, we can also run into another limitation when the objects accessed by different branches of a conditional statement do not have any overlap. In this case, CAPre would retrieve many unnecessary objects, given that it prefetches the objects predicted by the union of the prefetching hints of the different branches. However, our analysis of the SF110 corpus, detailed in Section 4.4, shows that this limitation only occurs in a very small minority of the analyzed applications, and that in the majority of cases there is a big overlap between the objects accessed by different branches of a conditional statement (even though the methods executed on these objects may be very different).

In these cases, any prefetching approach that uses a compile-time prediction technique will face the problem of unpredictability of the accessed objects, as evidenced by the inability of ROP to offer any improvement in the execution time as well. One solution to this problem is to use a hybrid approach that collects some information during runtime in order to complement the predictions made prior to the execution of the application. Such an approach will evidently have to be studied and analyzed in detail in order to determine the overhead that it might introduce.

Finally, CAPre currently uses the Java Virtual Machine's (JVM) predefined thread pool to execute the parallel prefetching of collections. This approach reduces the costs of creating and destroying threads and delegates the management of the threads to the JVM. Nonetheless, it does not allow us to test the effects that the number of threads has on the experiment results, given that it is the JVM that decides the optimal number of threads to create without overloading the machine. It may be interesting, as future work, to take control of the thread management operations from the JVM in order to evaluate how the number of prefetching threads influences the efficiency of the prefetching performed by CAPre.
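The trade-off between a JVM-managed pool and an explicitly sized one can be sketched as follows. This Java example is an illustration under assumptions: the text does not state which pool CAPre uses, so ForkJoinPool.commonPool() stands in for "the JVM's predefined thread pool", and the prefetchAll method, the loader function and the object identifiers are hypothetical names introduced here rather than part of CAPre or dataClay.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Function;

class PrefetchPoolSketch {
    /** Loads every id on the given executor and returns the loaded values in input order. */
    static <T> List<T> prefetchAll(List<String> ids, Function<String, T> loader, Executor executor) {
        List<CompletableFuture<T>> futures = new ArrayList<>();
        for (String id : ids) {
            futures.add(CompletableFuture.supplyAsync(() -> loader.apply(id), executor));
        }
        List<T> loaded = new ArrayList<>();
        for (CompletableFuture<T> future : futures) {
            loaded.add(future.join()); // wait for each load to complete
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("v1", "v2", "v3", "v4");
        Function<String, String> fakeLoader = id -> "object:" + id; // stand-in for a remote fetch

        // Option 1 (assumed equivalent of the predefined pool): the JVM sizes and manages the threads.
        System.out.println(prefetchAll(ids, fakeLoader, ForkJoinPool.commonPool()));

        // Option 2: an explicitly sized pool, which makes the number of prefetching threads a parameter.
        ExecutorService fixed = Executors.newFixedThreadPool(2);
        try {
            System.out.println(prefetchAll(ids, fakeLoader, fixed));
        } finally {
            fixed.shutdown();
        }
    }
}
```

Passing the executor as a parameter is what would allow the number of prefetching threads to be varied in an experiment, at the cost of having to manage the pool's lifecycle explicitly.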

. Conclusions

In this paper, we presented CAPre, a prefetching system for Persistent Object Stores based on static code analysis of object-oriented applications. We detailed the analysis we perform to obtain prefetching hints that predict which persistent objects are accessed by the application, and how we use code generation and injection to prefetch the predicted objects when the application is executed. We also optimized the system by parallelizing the generated prefetching methods, allowing objects to be prefetched from various nodes of a distributed POS in parallel. Afterwards, we integrated CAPre into a distributed POS and performed a series of experiments on known benchmarks to evaluate the improvement to application performance that it can achieve.

In the future, we want to address cases where CAPre offers limited improvement by collecting more information during application execution, while studying the overhead that such a hybrid approach might introduce. We also plan to use the predictions made by the developed static code analysis to apply other performance improvement techniques in conjunction with prefetching, such as smart cache replacement policies [39, 40, 41] and dynamic data placement [42, 43].

Acknowledgements

This work has been supported by the European Union's Horizon 2020 research and innovation program under the BigStorage European Training Network (ETN) (grant H2020-MSCA-ITN-2014-642963), the Spanish Ministry of Science and Innovation (contract TIN2015-65316) and the Generalitat de Catalunya (contract 2014-SGR-1051).

References

[1] A. L. Brown, R. Morrison, A generic persistent object store, Software Engineering Journal 7 (2) (1992) 161–168. doi:10.1049/sej.1992.0017. URL http://dx.doi.org/10.1049/sej.1992.0017
[2] M. P. Atkinson, P. J. Bailey, K. J. Chisholm, P. W. Cockshott, R. Morrison, An approach to persistent programming, The Computer Journal 26 (4) (1983) 360. doi:10.1093/comjnl/26.4.360.
[3] T.-H. Chen, W. Shang, Z. M. Jiang, A. E. Hassan, M. Nasser, P. Flora, Detecting performance anti-patterns for applications developed using object-relational mapping, in: Proceedings of the 36th International Conference on Software Engineering, ICSE 2014, ACM, New York, NY, USA, 2014, pp. 1001–1012. doi:10.1145/2568225.2568259.
[4] InterSystems, Caché for unstructured data analysis, [Accessed 08/10/2018] (2018).
[5] Actian, Actian NoSQL object database, [Accessed 08/10/2018] (2018).
[6] A. S. Foundation, Hibernate. everything data., [Accessed 08/10/2018] (2018). URL http://hibernate.org/
[7] R. C. Contributors, Apache OpenJPA, [Accessed 08/10/2018] (2013). URL http://openjpa.apache.org/
[8] D. C. Contributors, DataNucleus, [Accessed 08/10/2018] (2018).
[9] N. C. Contributors, Neo4J OGM - an object graph mapping library for Neo4j v3.1, [Accessed 08/10/2018] (2018). URL https://neo4j.com/docs/ogm-manual/current/
[10] J. E. B. Moss, Design of the mneme persistent object store, ACM Trans. Inf. Syst. 8 (2) (1990) 103–139. doi:10.1145/96105.96109. URL http://doi.acm.org/10.1145/96105.96109
[11] A. Tripathi, R. Wolfe, S. Koneru, Z. Attia, Management of persistent objects in the nexus distributed system, in: Proceedings of the 2nd International Workshop on Object Orientation in Operating Systems, IEEE, Washington, DC, USA, 1992, pp. 100–104. doi:10.1109/IWOOOS.1992.252992.
[12] B. Liskov, M. Castro, L. Shrira, A. Adya, Providing persistent objects in distributed systems, in: R. Guerraoui (Ed.), ECOOP' 99 - Object-Oriented Programming, Springer Berlin Heidelberg, Berlin, Heidelberg, 1999, pp. 230–257.
[13] dataClay Contributors, dataClay - BSC-CNS, [Accessed 11/10/2018] (2018).
[14] J. Martí, A. Queralt, D. Gasull, A. Barceló, J. J. Costa, T. Cortes, Dataclay: A distributed data store for effective inter-player data sharing, Journal of Systems and Software 131 (2017) 129–145. doi:10.1016/j.jss.2017.05.080.
[15] H. C. Contributors, Hibernate documentation - chapter 19 - improving performance, [Accessed 08/10/2018] (2018). URL https://docs.jboss.org/hibernate/orm/3.3/reference/en/html/performance.html
[16] S. Garbatov, J. Cachopo, Data access pattern analysis and prediction for object-oriented applications, INFOCOMP Journal of Computer Science 10 (4) (2011) 1–14.
[17] R. Touma, A. Queralt, T. Cortes, M. S. Pérez, Predicting access to persistent objects through static code analysis, in: New Trends in Databases and Information Systems, Springer International Publishing, Cham, 2017, pp. 54–62.
[18] N. Knafla, A prefetching technique for object-oriented databases, in: Advances in Databases, Vol. 1271, Springer-Verlag, Berlin, Heidelberg, 1997, pp. 154–168. doi:10.1007/3-540-63263-8_19.
[19] DataNucleus, Datanucleus - JDO fetch-groups, [Accessed 08/10/2018] (2017).
[20] O. Gierke, T. Darimont, C. Strobl, M. Paluch, Spring data JPA - reference documentation, [Accessed 08/10/2018] (2018). URL http://docs.spring.io/spring-data/jpa/docs/current/reference/html/
[21] [online][link].
[22] Django, Queryset api reference - django documentation, [Accessed 08/10/2018] (2018). URL https://docs.djangoproject.com/en/1.9/ref/models/querysets/
[23] A. Ibrahim, W. Cook, Automatic prefetching by traversal profiling in object persistence architectures, in: Proceedings of the 20th European Conference on Object-Oriented Programming, ECOOP 2006, Springer-Verlag, Berlin, Heidelberg, 2006, pp. 50–73. doi:10.1007/11785477_4.
[24] J.-H. Ahn, H.-J. Kim, Dynamic SEOF: An adaptable object prefetch policy for object-oriented database systems, The Computer Journal 43 (6) (2000) 524–537. doi:10.1093/comjnl/43.6.524.
[25] N. Knafla, Analysing object relationships to predict page access for prefetching, in: Proceedings of the 8th International Workshop on Persistent Object Systems (POS8), Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1999, pp. 160–170.
[26] Z. He, A. Marquez, Path and cache conscious prefetching (PCCP), The VLDB journal 16 (2) (2007) 235–249.
[27] K. M. Curewitz, P. Krishnan, J. S. Vitter, Practical prefetching via data compression, SIGMOD Rec. 22 (2) (1993) 257–266. doi:10.1145/170036.170077.
[28] P. A. Bernstein, S. Pal, D. Shutt, Context-based prefetch for implementing objects on relations, in: Proceedings of the 25th International Conference on Very Large Data Bases, VLDB '99, Morgan Kaufmann Publishers, San Francisco, CA, USA, 1999, pp. 7–10.
[29] W. Han, K. Whang, Y. Moon, A formal framework for prefetching based on the type-level access pattern in object-relational DBMSs, IEEE Trans. Knowledge Data Eng. 17 (10) (2005) 1436–1448. doi:10.1109/TKDE.2005.156.
[30] W. Han, W. Loh, K. Whang, Type-level access pattern view: A technique for enhancing prefetching performance, in: Proceedings of the 11th International Conference on Database Systems for Advanced Applications, DASFAA'06, Springer-Verlag, Berlin, Heidelberg, 2006, pp. 389–403. doi:10.1007/11733836_28.
[31] S. A. Blair, On the classification and evaluation of prefetching schemes, Ph.D. thesis, University of Glasgow (2003).
[32] N. Knafla, Prefetching techniques for client/server, object-oriented database systems, Ph.D. thesis, University of Edinburgh (1999).
[33] C. Gerlhof, A. Kemper, A multi-threaded architecture for prefetching in object bases, in: Proceedings of the 4th International Conference on Extending Database Technology: Advances in Database Technology, Vol. 779 of EDBT '94, Springer-Verlag, New York, NY, USA, 1994, pp. 351–364. doi:10.1007/3-540-57818-8_63.
[34] W. Han, Y. Moon, K. Whang, Prefetchguide: capturing navigational access patterns for prefetching in client/server object-oriented/object-relational dbmss, Information Sciences 152 (2003) 47–61.
[35] G. Fraser, A. Arcuri, A large-scale evaluation of automated unit test generation using evosuite, ACM Trans. Softw. Eng. Methodol. 24 (2) (2014) 8:1–8:42. doi:10.1145/2685612.
[36] I. Wala, Wala wiki, [Accessed 08/10/2018] (2015). URL http://wala.sourceforge.net/wiki/index.php/Main_Page
[37] M. J. Carey, D. J. DeWitt, J. F. Naughton, The OO7 benchmark, in: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, SIGMOD '93, ACM, New York, NY, USA, 1993, pp. 12–21. doi:10.1145/170035.170041.
[38] R. Sedgewick, K. Wayne, Algorithms, 4th edition - graphs, [Accessed 09/10/2018] (2016). URL https://algs4.cs.princeton.edu/40graphs/
[39] A. Jaleel, H. H. Najaf-abadi, S. Subramaniam, S. C. Steely, J. Emer, Cruise: Cache replacement and utility-aware scheduling, SIGARCH Comput. Archit. News 40 (1) (2012) 249–260. doi:10.1145/2189750.2151003. URL http://doi.acm.org/10.1145/2189750.2151003
[40] J. Jeong, M. Dubois, Cost-sensitive cache replacement algorithms, in: Proceedings of the 9th International Symposium on High-Performance Computer Architecture, IEEE Computer Society, Washington, DC, USA, 2003, pp. 327–337. doi:10.1109/HPCA.2003.1183550.
[41] G. Keramidas, P. Petoumenos, S. Kaxiras, Cache replacement based on reuse-distance prediction, in: Proceedings of the 25th International Conference on Computer Design, ICCD'07, IEEE, Washington, DC, USA, 2007, pp. 245–250. doi:10.1109/ICCD.2007.4601909.
[42] C.-W. Lee, K.-Y. Hsieh, S.-Y. Hsieh, H.-C. Hsiao, A dynamic data placement strategy for hadoop in heterogeneous environments, Big Data Research 1 (2014) 14–22, special Issue on Scalable Computing for Big Data. doi:10.1016/j.bdr.2014.07.002.
[43] N. Maheshwari, R. Nanduri, V. Varma, Dynamic energy efficient data placement and cluster reconfiguration algorithm for mapreduce framework, Future Generation Computer Systems 28 (1) (2012) 119–127. doi:10.1016/j.future.2011.07.001.