Function Delivery Network: Extending Serverless Computing for Heterogeneous Platforms
Anshul Jindal, Michael Gerndt, Mohak Chadha, Vladimir Podolskiy, Pengfei Chen
Received: Added at production | Revised: Added at production | Accepted: Added at production | DOI: xxx/xxxx
SPECIAL ISSUE PAPER
Chair of Computer Architecture and Parallel Systems, Technical University of Munich, Garching (near Munich), Germany
School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
Correspondence
Anshul Jindal, Chair of Computer Architecture and Parallel Systems, Technical University of Munich, Informatics 10, Boltzmannstr. 3, 85748 Garching b. München, Germany. Email: [email protected]
Summary
Serverless computing has grown rapidly since the launch of Amazon's Lambda platform. Function-as-a-Service (FaaS), a key enabler of serverless computing, allows an application to be decomposed into simple, standalone functions that are executed on a FaaS platform. The FaaS platform is responsible for deploying the functions and provisioning resources for them. Many of today's cloud applications spread over heterogeneous connected computing resources and are highly dynamic in their structure and resource requirements. However, FaaS platforms are limited to homogeneous clusters and homogeneous functions, and do not account for the data access behavior of functions before scheduling. We introduce an extension of FaaS to heterogeneous clusters and to heterogeneous functions through a network of distributed heterogeneous target platforms called the Function Delivery Network (FDN). A target platform is the combination of a cluster of homogeneous nodes and a FaaS platform on top of it. The FDN provides Function-Delivery-as-a-Service (FDaaS), delivering each function to the right target platform. We showcase the opportunities that the FDN offers in fulfilling two objectives, Service Level Objective (SLO) requirements and energy efficiency, such as the varied characteristics of the target platforms, the possibility of collaborative execution across multiple target platforms, and the localization of data, by evaluating five distributed target platforms using FDNInspector, a tool developed by us for benchmarking distributed target platforms. In our evaluation, scheduling functions on an edge target platform reduced the overall energy consumption by 17x compared to scheduling on a high-end target platform, without violating the SLO requirements.
KEYWORDS: cloud computing, edge computing, high performance computing, serverless computing, function delivery network, function-as-a-service, heterogeneous platforms, heterogeneous FaaS
Presently, there exists a multitude of resources for processing and data storage, ranging from small, inexpensive devices with limited computing resources, to modestly priced servers with mid-range resources, to expensive high-performance computers with extensive compute, storage, and network capabilities. All of these combined form the computing continuum. Many of today's applications are spread out over this heterogeneous connected computing continuum. (1) Web applications, for instance, combine mobile devices, edge computers for content delivery, and servers to enable interaction and collaboration. (2) IoT applications use micro-controllers, mini-computers, edge computers, and servers for delivering sensor measurements and controlling devices in the physical world. (3) Large-scale experiments gather big data sets that need to be preprocessed and aggregated, forwarded to analytics functions, fed into compute-intensive simulations, and visualized for the scientists. Many of these applications are highly dynamic with respect to their structure as well as their workload. Programming and deploying these applications is a highly challenging task. This is due to the heterogeneity of the underlying hardware, compute and data access requirements that vary across time and application components, as well as the dynamic structure of the applications resulting from agile programming techniques combined with continuous delivery.

Significant progress has been made in the context of cloud computing based on the idea of serverless computing since its launch by Amazon as AWS Lambda in November 2014. Serverless computing is a cloud computing model that abstracts server management and infrastructure decisions away from the users. In this model, the allocation of resources is managed by the cloud service provider rather than by the team of application developers and deployment managers, i.e., DevOps, thereby increasing their productivity.
Additionally, over the last couple of years, a shift has been observed in cloud-native application architectures from independently deployable microservices towards serverless architectures, which are more decentralized and distributed. Function-as-a-Service (FaaS) is a key enabler of serverless computing. In FaaS, an application is decomposed into simple, standalone functions that are uploaded to a FaaS platform for execution. These functions are stateless, i.e., the state is not kept across function invocations. Functions can be invoked by a user's HTTP request or by another type of event created within the FaaS platform. The FaaS platform is responsible for deploying the application functions and provisioning resources for them.

Currently, a significant number of open source and commercial FaaS platforms are available. All of the large cloud providers offer FaaS platforms based on a container orchestration platform such as Kubernetes. However, these platforms are limited to homogeneous clusters of nodes as well as to homogeneous functions. These assumptions facilitate the scheduling of function invocations onto the available resources. Furthermore, FaaS platforms do not account for the data access behavior of functions during scheduling. Since the functions are stateless, state changes and lookups require frequent access to databases, which can lead to latency in data accesses.

In this article, we introduce an extension to the concept of FaaS as a programming interface for heterogeneous clusters and for heterogeneous functions with varying computational and data requirements. This extension is a network of distributed heterogeneous target platforms called the Function Delivery Network (FDN), analogous to Content Delivery Networks. A target platform is the combination of a cluster of homogeneous nodes and a FaaS platform on top of it.
The FDN provides Function-Delivery-as-a-Service (FDaaS), delivering the function to the right target platform based on its computational and data demand. We target the integration of HPC clusters and distributed mini-computers (such as those used as edge devices) with the current platforms running on homogeneous clusters of servers in the cloud. In contrast to the elastic resource management in the cloud, HPC clusters are statically partitioned machines focusing on batch workloads. Space sharing is used to distribute the nodes to long-running applications that have exclusive access for their entire lifetime. The batch scheduling algorithm decides on the resource distribution to optimize the overall utilization of the system. Edge computers are currently used as deployment devices for single applications. In Amazon's IoT Greengrass system, it is already possible to integrate edge devices with cloud resources in an IoT platform, and application Lambda functions running on it are deployed to the edge computers to implement computing on the edge. This approach is thus limited to single applications on the edge and a static distribution of computation. The integration of edge systems for general FaaS applications will require an extension of the FaaS platform across heterogeneous devices.

The automatic management of resources in the proposed serverless-based FDN facilitates application development by shifting the burden to the cloud platform. However, existing challenges like the fast startup of containers, communication, and the latency of data accesses are further increased. The heterogeneity of the resources in the continuum is especially challenging for resource management. At the same time, due to its heterogeneity, the FDN offers a wide range of opportunities for meeting different objectives, like SLO requirements and energy efficiency, in unconventional ways.
Towards this, we present an external component of the FDN, FDNInspector, a tool for benchmarking different target platforms, and we show, based on our experiments, the opportunities offered by the FDN in meeting the two objectives: SLO requirements and energy efficiency. In summary, our main contributions are as follows.
1. We propose an extension to the concept of FaaS as a programming interface for the computing continuum, called the Function Delivery Network (FDN). It should be noted that in this article we introduce the overall architecture of the FDN and describe its components in detail, but the development of the FDN is still underway and therefore its implementation details are out of scope for this work.

2. We develop and present a tool called FDNInspector for evaluating distributed heterogeneous FaaS-based target platforms. This tool is one of the external components of the FDN and is used for benchmarking the target platforms. It also contains the monitoring of the target platforms using Prometheus, which will be extended and reused as part of the FDN later.

3. We highlight various opportunities offered by the FDN in meeting the two objectives, SLO requirements and energy efficiency, when scheduling function invocations on five different target platforms by evaluating various function benchmarks using the developed FDNInspector.

4. We present the performance evaluation results of the target platforms for the introduced objectives and the opportunities provided by the FDN in meeting them.

The rest of this article is organized as follows. Section 2 gives a brief overview of the FaaS cloud model and the different FaaS platforms used in this work. In Section 3, the Function Delivery Network (FDN) and its components are introduced. Section 4 describes our overall experimental system design and the tool FDNInspector. The different goals and the performance evaluation results of the opportunities provided by the FDN in meeting those goals are presented in Section 5. In Section 6, a few additional opportunities provided by the FDN are discussed. In Section 7, we describe some of the previous works in this domain, and in Section 8 we discuss the threats to validity. Finally, Section 9 concludes the article and presents an outlook.
In this section, we first present an overview of the FaaS cloud model. Following this, we describe the architecture and high-level workflow of the three FaaS platforms used in this work.
Function-as-a-Service (FaaS) provides an attractive cloud model, since it facilitates application development in which the user does not have to worry about infrastructure management, but only about the code being deployed. Pricing is based on the number of requests to the functions and the duration, i.e., the time it takes for the function code to execute. The latter varies according to the amount of resources, such as memory and CPU cores, allocated to the function, which are automatically adapted to deliver the best performance. Instead of developing application logic in the form of services and managing the required resources, the application developer implements fine-grained functions connected in an event-driven application and deploys them onto the FaaS platform. The platform is responsible for providing resources for function invocations and performs automatic scaling depending on the workload. The functions can be closely integrated with other services, e.g., cloud databases, authentication and authorization services, and messaging services. These services are called Backend-as-a-Service (BaaS). The Cloud Native Computing Foundation (CNCF) divides serverless into FaaS and BaaS. BaaS are third-party services that replace a subset of functionality in a function and allow the users to focus only on the application logic. In FaaS, function invocations are handled by using containers. Since functions are stateless, the state of the application is stored in databases. In comparison to microservice applications, FaaS has three advantages: (1) no continuously running services are required, (2) functions are only charged when they are executed, and (3) the function abstraction increases the developer's productivity.

One of the biggest differences between other cloud models and the serverless model is scalability. In serverless computing, the application automatically scales up or down based on the resource usage (including scaling down to zero instances), and DevOps do not have to specify any scaling parameters. The infrastructure of the cloud service provider starts up ephemeral instances of each function on demand. BaaS services are not set up to scale in this way unless the BaaS provider also offers serverless computing and the developers build this into their applications.
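The request-plus-duration pricing model described above can be sketched as a short calculation. The two rates below are illustrative placeholders, not any provider's actual price list:

```python
def faas_cost(invocations: int, avg_duration_s: float, memory_gb: float,
              price_per_gb_s: float = 0.0000166667,
              price_per_million_req: float = 0.20) -> float:
    """Estimate a FaaS bill: a per-request charge plus a compute charge
    metered in GB-seconds (allocated memory x execution time).

    Both default rates are illustrative assumptions only.
    """
    compute_charge = invocations * avg_duration_s * memory_gb * price_per_gb_s
    request_charge = invocations / 1_000_000 * price_per_million_req
    return compute_charge + request_charge

# One million invocations of 100 ms each, with 128 MB allocated memory:
bill = faas_cost(1_000_000, 0.1, 0.128)
```

Note that doubling the allocated memory doubles the compute charge, which is why right-sizing function memory matters on platforms with this style of billing.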
FIGURE 1 OpenWhisk high-level workflow.

FaaS-based functions can be invoked by a user's HTTP request or by another type of event created within the FaaS platform. The FaaS platform is responsible for providing resources for function invocations and performs automatic scaling. Currently, a significant number of open source and commercial FaaS platforms are available. FaaS platform implementations are based on starting containers for function invocations on top of a container orchestration platform such as Kubernetes. Applications are defined via a deployment specification that describes the functions, APIs, permissions, configurations, and events that make up a serverless application. The specification can be given via a command-line or web interface, or by using frameworks like Serverless and Architect. Updating a deployment is also done through this deployment specification. All updates to the specification are instantly propagated, after which either the containers are restarted or only some configuration files are updated.

Apache OpenWhisk is a serverless open source cloud platform that was originally developed by a research group at IBM in 2015 and was released in December 2016. It was later donated to the Apache Software Foundation. It powers IBM's serverless offering, IBM Cloud Functions, and implements FaaS on top of Kubernetes as the container orchestration platform. Functions in OpenWhisk are called actions, and the execution of an action is called an invocation. Actions and rules can be created through the command-line interface (CLI) (wsk), the user interface (UI), or an SDK. Created actions can then be invoked either manually through the same methods or by event triggers.
Events can originate from multiple sources, including timers, databases, message queues, or websites like Slack or GitHub.

OpenWhisk consists of multiple components under the hood, as shown in Figure 1, and all the components are packaged inside their individual Docker containers when OpenWhisk is deployed. Each function invocation is translated into an HTTP request to the Nginx server. The Nginx server is a single point of entry, and its main purpose is to implement support for the HTTPS secure web protocol. On receiving a request, the Nginx server forwards it to the controller. The controller is responsible for authenticating and authorizing the requests in coordination with CouchDB, where all the users' data and their privilege levels are stored. The controller also has a load balancer which keeps track of the availability of the invokers, i.e., the workers that run the code, and chooses one of them for the invocation. Controller and invokers communicate through Kafka, a publish-subscribe messaging system. The controller publishes the messages to Kafka addressed at a chosen invoker, and once the message delivery is confirmed by the invoker, an HTTP request is sent back to the user with an ActivationId, which can be used for retrieving the results of this function call. This processing is asynchronous; however, synchronous processing is also available.
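The unit of execution being invoked is ordinary user code. As a sketch, an OpenWhisk Python action is a module whose main function receives the invocation parameters as a dict and returns a JSON-serializable dict; the file name, parameter, and greeting below are our own illustrative choices:

```python
# hello.py -- a minimal OpenWhisk-style Python action (sketch).
# OpenWhisk's Python runtime calls main() with the invocation
# parameters and serializes the returned dict as the JSON result.
def main(args: dict) -> dict:
    name = args.get("name", "world")
    return {"greeting": f"Hello, {name}!"}
```

Such an action would typically be registered and invoked via the wsk CLI, e.g. with `wsk action create hello hello.py` followed by `wsk action invoke hello --result --param name FDN` (command shapes shown for illustration).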
FIGURE 2 OpenFaaS high-level workflow.

It functions similarly to asynchronous processing, except that in this case the client blocks until the action is completed and retrieves the results immediately. Invokers set up a new Docker container for each action, inject the code into it, execute the code, obtain the results, and then destroy the container. These containers run inside Kubernetes pods. There can be an invoker per Kubernetes worker node, as shown in Figure 1, or an invoker can be responsible for managing multiple Kubernetes worker nodes. Functions can also be chained together into sequences, where each chained function uses the output of the preceding function as input. OpenWhisk supports running functions in the following languages: Python, Node.js, Scala, Java, Go, Ruby, Swift, PHP, Ballerina, .NET, and Rust. Functions in other languages can be created by providing a custom-built Docker runtime.

OpenFaaS is another widely popular open source serverless cloud platform, hosted by OpenFaaS Ltd. Until March 2019, it was developed by a team of full-time developers from VMware. It also implements FaaS on top of Kubernetes as the container orchestration platform. Functions in OpenFaaS can be written in any language, and unlike OpenWhisk, one does not have to create custom runtimes to make them work; a pre-built Docker image of the function can be supplied to it.

Similar to OpenWhisk, functions can be deployed through any interface to the OpenFaaS Gateway (CLI/UI/REST), either manually or by setting up triggers. The OpenFaaS Gateway is the single point of entry for all requests. Figure 2 shows a high-level workflow of the interaction between the different components of OpenFaaS. From the gateway, CRUD (create, read, update, delete) operations and invocations are forwarded to the faas-provider, i.e., the controller which translates OpenFaaS functionality to a certain provider. faas-netes is an example of a faas-provider in OpenFaaS, which enables Kubernetes for it.
Because of this transparency to Kubernetes, one can interact with OpenFaaS resources directly through kubectl, the command-line interface for Kubernetes. When a function is created, its code is pulled from the Docker registry and executed inside a container. OpenFaaS utilizes Prometheus and its AlertManager to continuously expose metrics. The AlertManager uses these metrics to make auto-scaling decisions and communicates them to the OpenFaaS gateway, which then scales the function replicas up or down. The minimum (initial) and maximum replica counts can be set at deployment time by adding a label to the function. When using Kubernetes, the built-in Horizontal Pod Autoscaler (HPA) can also be used instead of the AlertManager. Scaling to zero to recover idle resources is available in OpenFaaS, but is not turned on by default. Scaling down to zero replicas is also called "idling" in OpenFaaS. The faas-idler, an external component, is responsible for making the scale-to-zero decision. It monitors the built-in Prometheus metrics on a regular basis, along with the inactivity_duration variable, to determine whether a function should be scaled to zero or not. Only functions with the label com.openfaas.scale.zero=true are scaled to zero; all others are ignored. When using faas-netes as the provider, the faas-idler is automatically deployed by default.
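To make the function container contract concrete, here is a sketch of a handler in the style of OpenFaaS's classic watchdog (described below): the watchdog hands the HTTP request body to the process on stdin and returns whatever the process writes to stdout. The handler logic itself is our own toy example:

```python
import sys

def handle(req: str) -> str:
    # Toy business logic: echo the request body back, uppercased.
    return req.strip().upper()

def run_watchdog_style(stdin=sys.stdin, stdout=sys.stdout):
    # Classic-watchdog contract (sketch): the request body arrives on
    # stdin and the HTTP response is whatever is written to stdout.
    stdout.write(handle(stdin.read()))
```

Parameterizing the streams keeps the handler testable outside a container; the real entrypoint would simply call run_watchdog_style() once per invocation.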
OpenFaaS's watchdog is responsible for starting and monitoring functions in OpenFaaS. It provides a generic interface between the outside environment and the function. The watchdog is a tiny Golang web server which every function uses as its Docker ENTRYPOINT. It acts as the init process for the function container. Once the function is invoked, the watchdog passes in the HTTP request via stdin, reads the HTTP response via stdout, and sends it back to the user.

OpenFaaS enables long-running tasks or function invocations to run in the background through the use of NATS Streaming. This decouples the HTTP transaction between the caller and the function. The HTTP request is serialized to NATS Streaming through the gateway as a "producer". The queue-worker acts as a subscriber, deserializes the HTTP request, and uses it to invoke the function directly. To fetch the results from an asynchronous call, the user can specify a callback URL.

Google Cloud Functions is a serverless execution environment for building and connecting services in a cloud-based application, offered by the Google Cloud Platform (GCP). With Google Cloud Functions, developers do not need to provision any infrastructure or worry about managing any servers; the whole environment, including the infrastructure, operating systems, and runtime environments, is managed by Google. Currently, Cloud Functions supports JavaScript, Python 3, Go, and Java runtimes. Cloud Functions are simple, single-purpose functions that are attached to events emitted from the cloud infrastructure and services. The function is triggered when an event being watched occurs. These events can be things like changes in a database, files added to a storage system, or the creation of a new virtual machine instance. A response to an event is created using a trigger, which can then be attached to a function to capture and act on events.
GCFs can be deployed either through the web interface or with the gcloud command-line tool (https://cloud.google.com/sdk/gcloud). Each Cloud Function runs in its own isolated, secure execution context, scales automatically, and has a lifecycle independent from other functions. Cloud Functions handles incoming requests by assigning them to instances of the function. Depending on the volume of requests, as well as the number of existing function instances, Cloud Functions may assign a request to an existing instance or create a new one. Each instance of a function handles only one concurrent request at a time; thus, the original request can use the full amount of resources (CPU and memory) that it requested. In cases where the inbound request volume exceeds the number of existing instances, Cloud Functions starts multiple new instances to handle the requests. This automatic scaling behavior allows Cloud Functions to handle many requests in parallel, each using a different instance of the function.

Serverless computing in the form of FaaS is extremely attractive to DevOps, as they are no longer responsible for managing infrastructure resources and autoscaling application components. FaaS provides automatic scaling for each function invocation as a result of a trigger. These invocations are then automatically distributed across the available resources. Current FaaS platforms are limited to clusters of homogeneous nodes. However, many cloud applications in the computing continuum require heterogeneous resources for their execution. At a high level, heterogeneity in FaaS exists in two ways:

• First, by using FaaS over heterogeneous clusters, i.e., clusters with different system architectures: for example, one cluster consisting of VMs in the cloud and another cluster consisting of resource-constrained edge devices. Such a method has the advantage of achieving higher application performance by placing the functions onto specific clusters depending on their computational requirements, and could even be used to reduce the overall energy consumption.

• Second, by using heterogeneous FaaS platforms. Due to resource constraints on edge devices, not all serverless platforms can run on them. Four open source serverless frameworks, namely Kubeless, Apache OpenWhisk, OpenFaaS, and Knative, have been evaluated on resource-constrained edge devices. Also, Pfandzelter et al. highlight the problem of running cloud-based FaaS platforms on the edge and introduce a new FaaS platform called tinyFaaS for edge environments. Therefore, one cannot run a homogeneous FaaS platform over heterogeneous clusters.

In this article, a target platform is the combination of a homogeneous cluster and a FaaS platform on top of it. To extend the serverless FaaS platform to heterogeneous clusters and to support heterogeneous functions with varying computational and data requirements, we introduce a network of distributed target platforms called the Function Delivery Network (FDN), analogous to Content Delivery Networks, which distribute web content and media to a network of distributed resources to provide service with the best quality of service (QoS). The FDN provides Function-Delivery-as-a-Service (FDaaS), delivering the function to the right target platform based on its computational and data requirements.

FIGURE 3 Overall architecture and high-level workflow of the Function Delivery Network (FDN). The FDN combines several target platforms (Cloud, Edge, HPC) via a joint Control Plane for the continuous deployment of applications into the Computing Continuum. It analyzes application characteristics (Behavioral Modeling) and the FDN platform parameters (Monitoring), and applies a distributed approach to function scheduling (Scheduler and Sidecars in the platforms). External tools, such as the FDNInspector presented in this paper, allow benchmarking the FDN and tuning its hyperparameters.

When extending FaaS to target platforms, challenges like communication latencies, function scheduling, and data access patterns are further increased. Deploying heterogeneous functions on these target platforms can make these challenges even harder to solve. However, the opportunities that the FDN offers, such as the varied characteristics of the target platforms, the possibility of collaborative execution between multiple target platforms, and the localization of data, can help in achieving higher Service Level Objective (SLO) matching, lower energy consumption, and higher throughput for a mix of applications.

Extending FaaS to the continuum of resources requires scheduling functions and placing data onto the target platforms. This requires more knowledge about the behavior of the application functions. The assumption of similar granularity does not hold, since applications will use functions with significantly different computational requirements. Furthermore, data will be stored in different databases at different locations, providing non-uniform access latency. Although the functions are stateless, state changes and lookups require frequent access to databases. The data access behavior of functions is not taken into account by the current platforms for scheduling. Therefore, when scheduling function invocations, both the computational and the data requirements have to be considered in an optimized manner, benefiting from the distribution and the heterogeneity of the compute and data resources.
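To give an intuition of how such a placement decision could weigh compute demand, data locality, and energy together, consider the following toy policy. This is our own sketch for illustration, not the FDN's actual scheduler; all names and fields are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class TargetPlatform:
    name: str
    free_cores: int        # currently available cores in the cluster
    power_watts: float     # rough power draw of the platform
    holds_data: bool       # does this platform store the function's data?

def pick_target(platforms, required_cores, prefer_energy=True):
    """Toy FDaaS placement policy: keep only platforms that can satisfy
    the compute demand, prefer platforms already holding the function's
    data (locality), then break ties by lowest power draw (energy) or
    by largest free capacity (performance)."""
    feasible = [p for p in platforms if p.free_cores >= required_cores]
    if not feasible:
        raise RuntimeError("no target platform can satisfy the demand")
    if prefer_energy:
        return min(feasible, key=lambda p: (not p.holds_data, p.power_watts))
    return min(feasible, key=lambda p: (not p.holds_data, -p.free_cores))
```

Under this policy, a lightweight invocation whose data sits on an edge platform lands there, while a compute-heavy invocation falls through to a larger cloud or HPC platform, mirroring the trade-off between SLO fulfillment and energy consumption studied later in the paper.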
Integrating target platforms with different levels of computing power has the potential to improve overall application performance. Different types of hardware may reduce the overall energy consumption, e.g., by integrating IoT and other low-power target platforms. In the same way, high-performance computing target platforms can add large amounts of computing power. Other domains where heterogeneous FaaS platforms can be relevant are edge and fog computing. Both domains include several different types of hardware nodes, sometimes with a huge difference in computing power (e.g., a smartphone and an AWS server). Situations like these raise the need for intelligent placement of computational tasks to fully exploit the benefits of edge and fog computing. In these domains, network links may vary vastly in terms of bandwidth (e.g., InfiniBand in data centers versus mobile networking on cell phones). This increases the need for efficient scheduling of functions to reduce the overall latency for the participants of the heterogeneous target platforms. Integrating specialized FaaS platforms which operate better on specific hardware (e.g., an HPC cluster or a cluster of IoT devices) can leverage both dimensions of heterogeneity and optimally exploit the available resources.

Figure 3 outlines the overall architecture and high-level workflow of the proposed Function Delivery Network (FDN). The user provides an application configuration specification, which describes the functions, APIs, permissions, configurations, and events. The specification can be given via a command-line or web interface, or by using frameworks like Serverless and Architect. The Deployment Generator annotates this file with the deployment configuration, either based on previous knowledge captured in the Knowledge Base or based on expert knowledge provided externally. This updated specification is then passed to the FDN Control Plane. It manages function scheduling and data placement, monitors the overall infrastructure and applications, and provides access control for authentication and authorization. The functions are scheduled to the target platforms based on the specification. Various behavioral models are constructed during application execution by the Behavioral Modeling component. These models are updated regularly in an online learning manner as data from the application functions is collected. The runtime decisions on function scheduling and data placement made by the FDN Control Plane are based on these models. Furthermore, the gathered historic application knowledge is used by external components for recommendations to the user, or for offline tuning of the FDN itself. The FDNInspector (presented in Section 4.4), an external component of the FDN, is utilized for benchmarking the FDN. The following subsections describe each component of the FDN in more detail.
This is the main component of the FDN and is responsible for managing it. Its responsibilities include access control for authentication and authorization, monitoring across the different target platforms, and the scheduling of function invocations and placement of data. The management of target platforms is done in a hierarchical manner: the scheduling and placement decisions concerning the target platforms are taken by the scheduler within this component, while the selection of the nodes within a target platform is delegated to the Sidecar Controller component within each target platform. Both the control plane and the local sidecar controller work in collaboration to make the final decision. The details regarding each sub-component of the FDN control plane are presented below.
Every individual computing platform requires certain security measures for scheduling functions on it and collecting resource utilization data from it. This component deals with these measures.
This component is responsible for gathering data related to platform, application, and function level metrics. For collecting awide variety of metrics, it interfaces and extends the existing monitoring of the FaaS platforms and the Kubernetes clusters, andprovides base data for the
Scheduler and the
Behavioral Modeling components. Prometheus, a well-known monitoring system, will be used along with some added instrumentation for collecting heterogeneous monitoring data. However, in this work we have already built a monitoring system based on Prometheus as part of the FDNInspector (Section 4.4) to extract different metrics for evaluation. This will be reused for the implementation of the FDN. Metrics are classified under three categories: (i) User-Centric metrics, the metrics observed at the user side, (ii) FaaS-Platform-Centric metrics, the metrics from the FaaS platform, and (iii) Infrastructure-Centric metrics, the metrics from the host machines.

TABLE 1 Monitoring metrics from three different layers.

User-Centric Metrics                          FaaS-Platform-Centric Metrics     Infrastructure-Centric Metrics
Requests' 90-percentile (P90) response time   Number of function replicas       Total number of cores
Number of requests served                     Number of function invocations    Total memory
                                              Number of cold starts             CPU utilization of cluster
                                              Function execution time           Memory utilization of cluster
                                              Memory allocated to the function  Disk I/O of cluster

• User-Centric metrics: The response time for an HTTP request below which 90% of the response time values lie is called the 90-percentile (P90) response time; it means that 90 percent of the requests are processed in the 90-percentile response time or less. This metric is important from the SLA point of view, where one wants to have most of the requests (90% in this case) completed before a certain time. This metric and the number of requests served per unit time are calculated as part of this class of metrics.
•
FaaS-Platform-Centric metrics: The number of function invocations resulting from the received requests, the number of function replicas created to load-balance those invocations, the number of invocations resulting in cold starts, and the execution time of the function (excluding the startup latency), along with the memory allocated to each function instance, are considered in this class of metrics.
•
Infrastructure-Centric metrics: In this case, the amount and the usage over time of static resources, such as the number of cores and the memory inside individual nodes of a target platform, are considered when functions are scheduled on it.
A summary of the considered metrics from the three categories is shown in Table 1. For all these metrics, the data is collected per unit time. Tracing of events (allocation of resources, start of container, deletion of container, etc.) will be added in the future, since these events are helpful for building models for anomaly detection and root cause analysis. The monitoring solution must be carefully designed to reduce application jitter and performance degradation. All the collected monitoring data are stored inside the database and are used by the FDN's Behavioural Modeling (Section 3.3) component for building various models.
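As an illustration, the P90 value for one sampling interval can be computed directly from the collected per-request latencies. The sketch below uses the nearest-rank method; the exact aggregation used by the FDN monitoring is not prescribed here, so this is only an assumed implementation:

```python
def p90(response_times):
    """90-percentile (P90) response time: the value below which 90% of
    the observed request latencies lie (nearest-rank method)."""
    ordered = sorted(response_times)
    rank = max(round(0.9 * len(ordered)) - 1, 0)  # 0-based nearest rank
    return ordered[rank]

# Hypothetical latencies (in seconds) collected over one sampling interval
latencies = [0.2, 0.3, 0.25, 0.9, 0.4, 0.35, 0.5, 1.2, 0.45, 0.3]
print(p90(latencies))  # -> 0.9
```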
It is responsible for (1) scheduling, or delivering, the function and (2) placing the data on an appropriate target platform based on the compute and data requirements of the function. Apart from function scheduling and data placement, this component also keeps track of the high availability of the applications. To make decisions, this component uses the data from the Monitoring (Section 3.1.2) and the Behavioural Modeling (Section 3.3) components, and applies a hierarchical decision-making approach. In this approach, the scheduling and placement decisions with respect to the target platform are taken by the scheduler, while the selection of the resources within the target platform is delegated to the Sidecar Controller component. The three important functionalities of the scheduler are described below:
Function Scheduling
The Scheduler is responsible for scheduling the function to the right target platform based on a distributed scheduling algorithm. This algorithm uses the function's behavioral models, the target platforms' configurations, and the current state of the FDN for making a decision. Additionally, it investigates the trade-offs between staging the data for individual function invocations and a long-term migration to a specific server within the target platform for faster data access, and then selects the most suitable option.
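The decision logic can be pictured as a scoring loop over the available target platforms. The sketch below is not the distributed algorithm itself; the model callbacks, the weight `w_energy`, and all numbers are invented placeholders standing in for the behavioral models:

```python
def deliver(function_name, targets, predict_latency, predict_energy, w_energy=0.2):
    """Pick the target platform minimizing a weighted combination of
    predicted P90 latency (SLO objective) and predicted energy use."""
    def score(target):
        return (predict_latency(function_name, target)
                + w_energy * predict_energy(function_name, target))
    return min(targets, key=score)

# Hypothetical model outputs: predicted latency (s) and energy (J) per platform
latency = {"hpc-node-cluster": 0.4, "cloud-cluster": 1.2, "edge-cluster": 2.5}
energy = {"hpc-node-cluster": 9.0, "cloud-cluster": 4.0, "edge-cluster": 1.0}
best = deliver("primes-python", list(latency),
               lambda f, t: latency[t], lambda f, t: energy[t])
print(best)  # -> cloud-cluster
```

Note how the energy term shifts the decision away from the fastest platform: with `w_energy=0` the hpc-node-cluster would win on latency alone, while the weighted score selects the cloud-cluster.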
Data Placement
Data Placement functionality includes tools and methods for adaptive data management. It enables the migration of data between the target platforms to exploit data affinity. Targets of the adaptive data management are mostly the NoSQL databases and object storage platforms, such as MinIO, that are used for storing the state of the functions and data files. It includes the following three main methods for data management:
1. Distributed Data Caching: It is used for supporting data affinity. It acts as an intermediate layer between the functions and the used storage platform (databases or object storage). Written data as well as accessed data will be cached, and means will be provided to proactively migrate or replicate data to the selected target platform for future function invocations.
2.
File Staging and Migration: It is applied when the data to be used is stored in files and is accessed by compute-intensive functions. Such function invocations are candidates for being scheduled to a target platform of HPC nodes. Ideally, staging is not done on demand but proactively, and scheduling decisions might even lead to a migration of files to reduce the staging overhead in case of repetitive execution of those functions.
3. Data Access Instrumentation: To enable distributed data caching for NoSQL databases as well as file staging, database and file accesses in the functions have to be redirected to the data management layer. This method automatically instruments the deployment specification and will be as transparent as possible for the application developer. The general approach for this method is based on automatically intercepting the REST calls and file system functions through library interposition.
The function scheduling and data placement decision methods work in collaboration with each other by taking into account multiple objectives such as compute and storage requirements, communication between functions, and cost.
Fault Tolerance
This functionality provides methods for setting up a fault-tolerant environment in which the failure of a specific device/node in a target platform leads to a restart or a continuation at another device/node in the same or a different target platform, such that the system continues to operate. It also includes algorithms to detect failures in advance, to maintain high availability using the models from the Behavioural Modeling (Section 3.3) component.
This component resides along with the local FaaS platform installation on the target platform, where it acts as a local decision maker. While the Control Plane is responsible for deciding the target platform to which a function invocation goes, the local decision to select a node of the Kubernetes cluster is taken by this component. Furthermore, it also decides whether to schedule a locally triggered function locally or to delegate it to the higher-level
Control Plane.
This component is responsible for characterizing the behaviour of a function based on the monitoring information (from the Monitoring component) and the deployment configuration file. It characterizes the application by the following models:
1.
Application Event Model: Information about events such as the frequency of function invocations, the sequence of functions invoked, and the creation, deletion, or upgrade of functions is used to build this model. The model will then be used for use cases such as anomaly detection, forecasting of future events, and event tracing. It will also be used for reducing the cold start time by predicting the workload and starting the function containers ahead of time.
2.
Function Interaction Model: It characterizes the producer-consumer interactions of functions based on data accesses. The interactions might, for example, suggest packaging functions together to reduce the communication costs.
3.
Data Access Model: It characterizes the functions with respect to their data accesses. It determines, for example, how frequently data is read from or written to certain databases or files. This can be useful for the placement of functions considering caching scenarios.
4.
Function Performance Model: The Function Performance Model will capture the performance with respect to time and energy for certain combinations of resources, such as the number of cores, the network bandwidth, the memory size, and the I/O bandwidth. The model will be based on measured information obtained from the FDN Monitoring (Section 3.1.2) as well as on the current workload. This model will be used by the
Scheduler (Section 3.1.3) to find the right target platform for invoking the function (called function delivery in this work) based on resource requirements and availability.
The models will be provided to the Scheduler (Section 3.1.3) and will be stored in the Knowledge Base (Section 3.4).
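For instance, a Function Performance Model for execution time versus allocated cores could be fitted from profiled samples. The Amdahl-style model form t(n) ≈ a/n + b and the sample numbers below are illustrative assumptions, not measurements from this work:

```python
def fit_inverse_model(samples):
    """Least-squares fit of t(n) ≈ a/n + b from (cores, time) samples:
    a parallelizable part that shrinks with cores plus a serial floor."""
    xs = [1.0 / c for c, _ in samples]
    ts = [t for _, t in samples]
    n = len(samples)
    mx, mt = sum(xs) / n, sum(ts) / n
    a = (sum((x - mx) * (t - mt) for x, t in zip(xs, ts))
         / sum((x - mx) ** 2 for x in xs))
    b = mt - a * mx
    return a, b

# Hypothetical profiled samples: (allocated cores, execution time in seconds)
a, b = fit_inverse_model([(1, 8.1), (2, 4.2), (4, 2.3), (8, 1.4)])

def predict_time(n_cores):
    """Predicted execution time for a given core allocation."""
    return a / n_cores + b
```

The Scheduler could query such a fitted model to estimate whether a candidate target platform can meet a function's SLO before delivering the invocation to it.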
This component stores the application models prepared by Behavioral Modeling (Section 3.3) as well as the decisions taken by the Scheduler (Section 3.1.3). The previously saved high-performing decisions are used by the Deployment Generator (Section 3.5) for automatically adding annotations to the deployment configuration in case of redeployments or deployment updates of the application. Furthermore, the external components use the stored information for further analysis and decision making. Scalable NoSQL and SQL databases, along with scalable file storage platforms, are the basis for the implementation of the
Knowledge Base. The
Deployment Generator is responsible for annotating the deployment specification provided by the user. The provided application configuration file can describe the initial application deployment configuration, but it also serves as a means to specify updates to already running applications. The Deployment Generator component inserts hints into the deployment specification, such as where to deploy a function, as well as function and data characteristics. To this end, it adds annotations based on analyzing the results in the Knowledge Base for previous deployments. This is especially important for deployment updates that modify the running application. It also performs any required instrumentation of the application, for example, to enable data caching and migration.
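The annotation step can be pictured as a small transformation over the deployment specification. The annotation key, the spec shape, and the knowledge-base structure below are invented for illustration and do not reflect the actual FDN format:

```python
def annotate_deployment(spec, knowledge_base):
    """Return a copy of the deployment spec with a scheduling hint
    inserted from a previously recorded high-performing decision."""
    annotated = dict(spec)
    hints = dict(spec.get("annotations", {}))
    best_target = knowledge_base.get(spec["name"])
    if best_target is not None:
        hints["fdn/preferred-target"] = best_target
    annotated["annotations"] = hints
    return annotated

kb = {"primes-python": "hpc-node-cluster"}  # prior Scheduler decision
spec = {"name": "primes-python", "image": "primes:latest"}
print(annotate_deployment(spec, kb)["annotations"])
# -> {'fdn/preferred-target': 'hpc-node-cluster'}
```

On a redeployment, the Scheduler could read such hints instead of re-deriving the placement from scratch.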
There are some external components which either aid the FDN in taking better decisions or help the user by recommending deployment configurations for optimizing function deployment, explaining runtime decisions through visualizations, or benchmarking the overall FDN with various applications and functions. The following components are part of it:
•
Recommendation and Visualization: It extracts data from the Knowledge Base, explains the FDN runtime decisions to the user, and recommends configurations for optimizing the application deployment. Such visualizations can be helpful to the application developer for knowing where the functions are scheduled, based on which optimizations related to the system architecture can be added.
•
FDNInspector (Benchmarking): This external component is responsible for benchmarking the FDN on certain functions and applications. The benchmarking results can further be used by the user for comparing application performance on various target platforms. This component is described in more detail in Section 4.4, where we utilize it to show the opportunities offered by the FDN in achieving different objectives.
•
Threshold Tuning: The decisions taken by the Scheduler (Section 3.1.3) are frequently based on thresholds that decide, for example, when to migrate data to a different target platform. A tuning based on historic data of the FDN will improve the effectiveness of resource management across different applications. This tuning is part of this external component.
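A minimal form of such tuning is to derive a threshold from the distribution of historic observations rather than fixing it by hand. The heuristic below (median plus a slack multiple of the spread) is only an illustration, not the tuning algorithm of this component:

```python
import statistics

def tune_threshold(history, slack=1.5):
    """Derive a data-migration threshold from historic response times:
    values above median + slack * stddev are treated as outliers that
    could trigger migration (an illustrative heuristic)."""
    return statistics.median(history) + slack * statistics.pstdev(history)

observed = [0.40, 0.50, 0.45, 0.60, 0.50, 0.55]  # hypothetical P90 history
threshold = tune_threshold(observed)
```

Recomputing the threshold periodically from the Knowledge Base would let it track each application's own latency profile.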
In this section, we first present details about the different benchmarks used for evaluating the various opportunities offered by the FDN and then describe the five different target platforms used in this work. We also present the load testing details used for the evaluation and, finally, the high-level design and functioning of the developed
FDNInspector.
To investigate the performance of each target platform available as part of the FDN, we used a subset of the benchmarks provided with FaaSProfiler and modified them for our use case. Furthermore, we developed OpenFaaS implementations of the chosen functions to enable their execution on the target platforms using the OpenFaaS platform.

TABLE 2 List of FaaS-based functions we developed or modified for demonstrating the opportunities offered by the FDN.

Function Name       Description                                                                Language runtime
nodeinfo            Gives basic characteristics of a node like CPU count,                      Node.js
                    architecture, uptime.
primes-python       Calculates prime numbers till 10000000.                                    Python3
image-processing    Reads an image from object storage (here MinIO) and performs basic         Python3
                    operations (flip, rotate, filter, grayscale and resize) on the image.
sentiment-analysis  Sentiment analysis of the given text.                                      Python3
JSON-loads          Takes a big JSON file as input containing 1000 three-coordinate (x,y,z)    Python3
                    records and returns the average coordinate values.

An OpenWhisk action container generally includes the code for the function along with its language runtime. OpenWhisk processes the incoming HTTP requests for the function invocation with any number of arguments and sends the results back to the user or caller. For most of the functions, we have used the default runtime environment provided by OpenWhisk for the language that the function is written in. If a function uses extra packages which are not part of its default language runtime, we created a docker runtime for it based on the default docker runtime. The OpenFaaS functions are similar to the OpenWhisk ones; however, we created our own docker images for the functions to run them on the ARM platform. Google Cloud Functions are also similar to OpenWhisk functions; however, one cannot create one's own docker image of the runtime. The functions used as part of this work are summarized in Table 2 along with their descriptions and language runtimes. The nodeinfo function exposes an HTTP endpoint and provides the user with basic information about the system, such as hostname, underlying architecture, number of CPUs, etc. We utilize this function to test the general performance of each target platform and to get an overall idea of their capabilities. The more demanding primes-python, sentiment-analysis, and JSON-loads functions are used for comparing the high-end target platforms (without the edge-based target platforms).
Finally, for demonstrating the advantage of the data localisation opportunity in the FDN, the object (in our work an image) access latency from the MinIO object storage platform is showcased using the image-processing function.
To demonstrate the opportunities offered by the FDN, we evaluate the function benchmarks on five different target platforms ranging from a high-performance HPC node to resource-constrained edge devices. The configuration of each target platform, the type of FaaS platform used, and the number of nodes present in that target platform are shown in Table 3. The edge-cluster consists of three embedded Nvidia Jetson Nano devices. Due to the limited resources available on these boards, it was not possible to run the heavy OpenWhisk platform on our edge-cluster; OpenFaaS, in contrast, does support low-end devices and provides binaries for ARM processors, therefore we utilized OpenFaaS on top of k3s, a lightweight version of Kubernetes, to host a Kubernetes cluster on it. k3s reduces the footprint and bootstrap process of Kubernetes and combines all the low-level components required for running a Kubernetes cluster, such as containerd, runc, and kubectl, into a single binary. The cloud-cluster is composed of three virtual machines hosted on a private cloud at the Leibniz Supercomputing Center (LRZ). Each VM has four virtual CPU cores and 8 GiB of memory. This target platform is based on OpenWhisk on top of Kubernetes. Additionally, we used Google Cloud Functions (GCF) for creating the google-cloud-cluster platform; the internal configuration details of the VMs or the containers in which the functions are deployed are not available to the user. The other two target platforms represent compute nodes from High Performance Computing (HPC) environments.
The hpc-node-cluster is a dual-socket system, with each socket containing an Intel Cascade Lake processor with 22 cores, and the old-hpc-node-cluster consists of four sockets, with each socket containing an Intel Westmere-EX processor with 10 cores. We disabled hyper-threading and turbo boost on both HPC clusters. OpenWhisk on top of Kubernetes is deployed on each of these nodes.

TABLE 3 Different target platforms used as part of this work for evaluating the benchmarks.

Target Platform       Processor                                   FaaS Platform  H/W Specifications        Nodes in Cluster
hpc-node-cluster      Intel(R) Xeon(R) Gold 6238 CPU @ 2.10GHz    OpenWhisk      44 Cores, 754 GiB memory  1
old-hpc-node-cluster  Intel(R) Xeon(R) CPU E7-4850 @ 2.00GHz      OpenWhisk      40 Cores, 251 GiB memory  1
cloud-cluster         Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz  OpenWhisk      4 vCPU, 8 GiB memory      3
google-cloud-cluster  N/A†                                        GCF            N/A†                      N/A†
edge-cluster          ARMv8 Processor rev 1 (v8l)                 OpenFaaS       4 Cores, 4 GiB memory     3
† The configuration of the host VMs or containers in which the functions are deployed is not available.

The evaluation of the various opportunities was done using the free and open-source load testing tool k6. k6 uses a script for running the tests, in which the HTTP(s) endpoint along with the request parameters is specified. The HTTP(s) endpoint represents the deployed function endpoint and varies with each function and target platform in our work. Two of the other k6 parameters which are configured as part of each test are:
•
Virtual Users (VUs): Virtual Users (VUs) are the entities in k6 that execute the test and make HTTP(s) or websocket requests. VUs are concurrent and will continuously iterate through the request endpoint until the test ends.
•
Duration: A string specifying the total duration a test will run. During this time each VU will execute the script in a loop.
In our evaluations, the duration was fixed to 10 minutes and the number of VUs varied from 10 to 50, depending on the function and the target platform. The total duration for which the metrics data is collected is set to 20 minutes and the sampling rate is set to 10 seconds, i.e., metrics values are aggregated over 10 seconds. The term unit time refers to this sampling interval in Section 5. The number of requests per second generated by k6 depends on the number of VUs and the time taken by each request to complete. For example, if there are 10 VUs with the total test duration set to 10 minutes and each request from a VU takes 30 seconds to complete, then each VU issues 2 requests per minute, the 10 VUs together issue 20 requests per minute, and roughly 200 requests are completed over the whole duration. The request rate will therefore vary for each target platform depending on the time taken by each request to complete. Moreover, for each target platform using OpenWhisk, we increased the default limits on the number of concurrent invocations and invocations per minute which can be served in OpenWhisk, and increased the memory allocated to the invoker.
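The request-rate arithmetic above can be reproduced with a short calculation, assuming a closed loop in which each VU issues the next request immediately after the previous one completes:

```python
def expected_requests(vus, duration_s, request_time_s):
    """Approximate number of completed requests in a closed-loop k6 test."""
    per_vu = duration_s // request_time_s  # requests one VU completes
    return vus * per_vu

# Example from the text: 10 VUs, 10-minute test, 30 s per request
print(expected_requests(10, 600, 30))  # -> 200
```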
In this work, we introduce the
FDNInspector, a tool for benchmarking the different target platforms of the FDN to identify opportunities from smart function scheduling and data placement across the heterogeneous target platforms. This external component of the FDN was built first to support our proposed work towards combining heterogeneous target platforms into the FDN. FDNInspector is written in Python (https://github.com/ansjin/hetrogenous-faas-profiler) and serves the following purposes:
• Facilitating the benchmarking via a centralized tool for remotely deploying and executing FaaS benchmarks on any of the target platforms.
• Enabling load generation through function invocations for the desired duration and amount on different target platforms.
• Automatic collection of a diverse range of metrics.
• Visualization of the gathered information for manual analysis.
• Support for multiple testing opportunities which can be achieved using the FDN, where new ones can be added easily.

FIGURE 4 Overall architecture of FDNInspector along with the interaction between its components when a load of function invocations is generated for five different target platforms.

Figure 4 shows the overall architecture of the FDNInspector and the interaction of its components with the different target platforms when functions are invoked. In our experimental setup, the hpc-node-cluster and old-hpc-node-cluster target platforms are running on-premise in a private network and use a proxy server for accessing the internet. The cloud-cluster is running on the private cloud at LRZ and can access the internet directly. The google-cloud-cluster is running on Google Cloud Platform (GCP) in the us-east region. Similar to the HPC clusters, the edge-cluster is also running on-premise and uses a proxy server for accessing the internet. The FDNInspector was deployed on a virtual machine having access to all the on-premise target platforms (in Germany). Each target platform runs a Prometheus instance to collect data for a variety of metrics. The user provides the input file in JSON format, an example of which is shown in Listing 1. The target platform information, like host address, authentication, and hardware resources, is present in the clusters configuration file (Line 2). Information related to the functions, such as name, docker image, and runtime, is present in the functions configuration file (Line 3). The user provides the function name and the target platforms (based on OpenWhisk, OpenFaaS, and public FaaS platforms) on which the test is to be executed (Line 5-7). Furthermore, the user provides the test parameters: the number of virtual users (VUs) analogous to actual users, the duration of the test, a parameter file (if it exists), and how much time (in seconds) to sleep in-between the requests (Line 8-14). The sleep parameter is particularly useful in cases where requests take a longer time to complete and we do not want to send the next request until a certain time has passed. The
Function Deployer takes this configuration file as input and deploys the functions on the listed target platforms. The wsk command-line interface is used for deploying the functions onto the OpenWhisk-based target platforms; for the OpenFaaS target platforms, faas-cli is used, and for the google cloud target platform, gcloud is used.

Listing 1: Example configuration file in JSON format.
{
  "test_name": "example_test_name",
  "functions_config": "functions/config.json",
  "target_platforms_config": "target_platforms/config.json",
  "influxdb_url": "http://localhost:8086/",
  "openwhisk_target_platforms": ["cluster-..."],
  "openfaas_target_platforms": ["cluster-..."],
  "public_cloud_target_platforms": ["google_cloud_cluster"],
  "test_instances": {
    "instance1": {
      "application": "primes-python",
      "test_settings": {
        "vus": "30",
        "duration": "600s",
        "param_file": "",
        "sleep": "1"
      }
    }
  }
}

Once the functions are deployed, we utilize k6 for invoking the functions on each of the target platforms based on the input parameters (VUs, duration, and sleep time). After the load generation is finished, the Data Collector collects data for a variety of metrics by querying the Prometheus instance of each target platform. The collected data is presented to the user through graphs. After the completion of the tests, the
Function Destroyer deletes the function instances from each target platform.
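The end-to-end flow of one test run (deploy, generate load, collect, destroy) can be sketched as an orchestration loop. The helper functions below are stubs standing in for the wsk/faas-cli/gcloud, k6, and Prometheus interactions; they are not the actual FDNInspector code:

```python
calls = []  # records the order of operations, for illustration only

def deploy(app, platform): calls.append(("deploy", app, platform))
def generate_load(platform, settings): calls.append(("load", platform, settings["vus"]))
def collect_metrics(platform): calls.append(("collect", platform)); return {}
def destroy(app, platform): calls.append(("destroy", app, platform))

def run_test(config, platforms):
    """One benchmark run per target platform, mirroring the
    Function Deployer -> k6 -> Data Collector -> Function Destroyer flow."""
    results = {}
    for platform in platforms:
        deploy(config["application"], platform)
        generate_load(platform, config["test_settings"])
        results[platform] = collect_metrics(platform)
        destroy(config["application"], platform)
    return results

cfg = {"application": "primes-python",
       "test_settings": {"vus": "30", "duration": "600s"}}
run_test(cfg, ["hpc-node-cluster", "edge-cluster"])
```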
We evaluate the five target platforms to demonstrate the opportunities offered by the FDN for handling function scheduling and data placement across these heterogeneous target platforms in achieving different objectives. However, before evaluating them, it is important to know the capability of each target platform. To this end, we evaluate the resource usage of the nodeinfo function on each cluster by varying the number of virtual users (from 10 to 50). Figure 5 shows the result of the conducted evaluation for different metrics represented as rows (from bottom to top): the number of requests processed, the percentage CPU utilization of the cluster, the number of activations, and the 90th percentile (P90) of the response time in seconds of the requests. The different numbers of VUs are represented as columns (increasing from 10 VUs to 50 VUs). The edge-cluster exhibits the worst performance in terms of the number of requests processed (approximately 70-150 requests/second) and P90 response time (approximately 1 second and above) across all scenarios. This can be attributed to the limited number of resources and the low compute capability of ARM processors as compared to the other architectures in our platform. When the number of VUs is less than or equal to 20, the four target platforms hpc-node-cluster, old-hpc-node-cluster, google-cloud-cluster, and cloud-cluster perform similarly. However, when the load is increased to 50 VUs (last column in Figure 5), the different compute capabilities of the target platforms become more prominent. The hpc-node-cluster performs best and can handle around 500 requests per second with a P90 response time below 500 ms. The google-cloud-cluster performs second-best (around 450 requests per second at a P90 response time of 500 ms), followed by the old-hpc-node-cluster (400 requests per second, 1 s) and then the cloud-cluster (200 requests per second, 2.5 s).

FIGURE 5 Comparison between different target platforms on four different metrics (represented as rows) when the nodeinfo function is invoked with varied workload (represented as columns). The edge-cluster, being a resource-constrained target platform, exhibits the worst performance in terms of the number of requests processed (maximum 150 requests/second) and their P90 response time (minimum 1 s) among all five target platforms. All the other target platforms perform similarly at workloads of less than 50 VUs. However, at the 50 VUs workload, the different compute power of each target platform is more prominent (hpc-node-cluster and google-cloud-cluster exhibiting the best performance). Note that the 90th percentile response time of the requests is truncated above 2, 4, and 10 seconds in the respective workloads to allow a better comparison after the initially longer request times due to cold starts.

It is apparent from the P90 Response Time metric that the requests initially suffer from the cold-start problem in all tests for all target platforms. The initial P90 response time is above five seconds, but after the containers are warm it decreases significantly for all target platforms. Activations represent the number of functions invoked over time. All the requests in OpenWhisk were sent with the blocking parameter enabled. This means that a function invocation request will wait for the activation result to be available. For OpenFaaS there is no such parameter, and invocations are blocking by default. For the workload with 50 VUs, the hpc-node-cluster has the highest number of activations since it serves more requests over time as compared to the other clusters. The overall CPU utilization of both HPC node clusters is similar across all tests and is lower than the utilization of the other two clusters. The CPU utilization metric indicates the amount of workload a cluster can handle. Figure 6 shows a detailed view of all 9 metrics (divided into 3 classes: user-centric, platform-centric, and infrastructure-centric) when the workload of 20 VUs is applied to the nodeinfo function for all five target platforms. Execution time represents the function execution time, and response time is the difference between the time when the request was sent and when the response was returned. For each target platform, initially the function execution time is high due to cold starts (2nd row, 2nd column) and the slow increase in the number of replicas (3rd row, 2nd column).
Since the edge-cluster was deployed on the OpenFaaS platform, values for the cold start metric were not available, as they are not exposed by OpenFaaS; it is only possible to obtain them through external instrumentation. Likewise, various infrastructure-level metrics from within the google-cloud-cluster require external instrumentation, which was not done within the scope of this work. While the nodeinfo function does not have much effect on the memory usage of a target platform, a significant change can be observed in the CPU and disk I/O usage for all target platforms. Based on the values of these metrics, different derived metrics can be formulated, for instance, the relation between the number of requests and the replicas created, or a relation between the function execution time and the CPU usage of the target platform. These derived metrics can then be used for scheduling the functions to the target platforms based on their requirements. In the latter half of this section, a subset of these metrics is used for demonstrating the opportunities the FDN offers in achieving different objectives.
FIGURE 6 Comparison of the nodeinfo function for all target platforms with the workload from 20 VUs on three different classes of metrics.

nodeinfo, being a simple HTTP endpoint function, does not characterize the performance of the target platforms for more complex functions. Towards this, we perform a workload test with 30 VUs using three different functions: primes-python and sentiment-analysis, being compute-intensive, and JSON-loads, being I/O-intensive (see Table 2). Figure 7 shows the results of our experiment, where the three different functions are represented as columns and four different metrics are represented as rows. The edge-cluster cannot handle a high load for these three functions; therefore, this comparison is only conducted for the four target platforms hpc-node-cluster, old-hpc-node-cluster, cloud-cluster, and google-cloud-cluster. The function primes-python is the most compute-intensive, with a P90 response time of 14 seconds and 2 seconds per request for the cloud-cluster and hpc-node-cluster, respectively. Also, the google-cloud-cluster performed worst for this function, with 20 requests per unit time and 19 seconds as the P90 response time. This could be attributed to the inability of GCF to handle compute-intensive functions. Furthermore, these evaluation results demonstrate the higher computation power of the hpc-node-cluster as compared to the other target platforms. For the other two functions, all target platforms perform similarly to each other, with the google-cloud-cluster performing best. However, the CPU utilization of the cloud-cluster is much higher for the JSON-loads and sentiment-analysis functions. Due to the high computation requirements of the primes-python function, each target platform is able to process a smaller number of requests per unit time (a maximum of around 100 requests per unit time) as compared to the other functions (a maximum of around 250 requests per unit time). Such an analysis can be used to derive inter-target-platform relations for the same function. These relations can be used to offload a function from one target platform to another, based on the function's performance within one target platform, and then to find which target platform will be ideal for it. In the following subsections, we present and evaluate various opportunities that the FDN offers in achieving two main objectives: meeting the SLO requirements and energy efficiency. All the opportunities presented in meeting the objectives are evaluated using the implemented
FDNInspector.

FIGURE 7 Comparison of three different functions, primes-python, sentiment-analysis, and JSON-loads, for four target platforms with 30 VUs generating the load, on four different metrics.

A Service Level Agreement (SLA) defines a contract between the provider and the client to meet certain Service Level Objectives (SLOs), such as a minimum uptime or a maximum response time. Due to the current homogeneity of nodes in FaaS platforms, it is not possible to scale a function vertically or to provide a specialized machine for its execution. In the case of heterogeneous target platforms, functions can benefit from the heterogeneity of the underlying platforms, which can help in meeting the SLOs. The FDN offers multiple opportunities for achieving this, and a few of them are presented and evaluated in the following subsections.
One method is to always invoke the function on the target platform that has the highest compute capability (and gives the best performance). The hpc-node-cluster performed best among all target platforms, as shown in Figure 7. Therefore, function invocations can always be scheduled on this target platform to meet the SLO, under the assumption that no new target platform is added to the FDN. If a new target platform is added, it first needs to be benchmarked to analyze its performance and is then ranked accordingly among the target platforms (in this work hpc-node-cluster, old-hpc-node-cluster, cloud-cluster, and edge-cluster respectively). Following this, the functions requiring strict SLOs can be scheduled on the target platform with the highest performance.

However, scheduling the function invocations on the target platform with the highest compute capability will not always lead to the best performance, for example, when another workload is already running on that target platform. In this case, scheduling function invocations on it can hamper the performance of both workloads and result in SLO violations. Therefore, it is important to know the usage of each target platform before scheduling functions on it. Figure 8 shows the performance comparison of image-processing function invocations for 40 VUs on the old-hpc-node-cluster across three scenarios: 1) when there is no additional workload on the target platform, 2) when the target platform has an additional 50% CPU load on it, and 3) when the target platform is fully utilized. Scheduling the function invocations on a target platform with 100% CPU load leads to a degradation in performance (the P90 response time increased from approximately 0.8s to 1.5s and the number of requests processed decreased by 100 per unit time). However, for scenario 2, no decrease in performance is seen.

FIGURE 8
Performance comparison of the image-processing function with invocations generated from 40 VUs on the old-hpc-node-cluster in three scenarios: 1) when the cluster is idle, 2) when the target platform has an additional 50% CPU load on it, and 3) when the target platform has an additional 100% CPU load on it. Scheduling the function invocations on a cluster with 100% CPU load underneath can impact its performance.
FIGURE 9
Performance comparison of the image-processing function with invocations generated from 40 VUs on the old-hpc-node-cluster in three scenarios: 1) when the cluster is idle, 2) when the target platform has an additional 50% Memory load on it, and 3) when the target platform has an additional 100% Memory load on it. Scheduling the function invocations on a cluster with 100% Memory load underneath can impact its performance significantly.

The performance comparison of the image-processing function for the same three scenarios for 40 VUs on the old-hpc-node-cluster, with additional load on memory rather than on the CPU, is shown in Figure 9. When a function is invoked, it leads to the creation of function replicas that require a certain amount of memory (256MB in this work). If the required memory is not available, the performance decreases, as shown in Figure 9. Scheduling function invocations on a target platform with 100% memory load leads to a significant decrease in performance (the P90 response time increased from approximately 0.8s to 6s). However, similar to Figure 8, invoking functions on a machine with an additional 50% memory load does not affect performance, as there is still free memory available for creating additional function replicas.

Therefore, because of the heterogeneity offered by the FDN, offloading function invocations from a target platform with high resource utilization to one with lower utilization helps in meeting the SLOs. Furthermore, this approach can also be applied for placing or scheduling functions together if they use complementary resources. For instance, placing memory-intensive and compute-intensive functions together on the same target platform, while placing other compute-intensive functions on different target platforms, leads to better utilization of the underlying resources. In this way, the performance of functions does not decrease on simultaneous execution.
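The scheduling logic suggested by these experiments (rank the target platforms by compute capability, then skip platforms whose CPU or memory is already saturated) can be sketched as follows. This is a minimal illustration, not the FDN implementation; all platform names, utilization values, and thresholds are illustrative assumptions.

```python
# A minimal sketch of utilization-aware platform selection; the platform
# names, utilization values, and thresholds below are illustrative assumptions.
CPU_LIMIT = 0.9   # avoid platforms that are (almost) fully CPU-loaded
MEM_LIMIT = 0.9   # avoid platforms without memory for new 256MB replicas

# (cpu_utilization, memory_utilization) per target platform
utilization = {
    "hpc-node-cluster": (1.0, 0.4),       # busy with another workload
    "old-hpc-node-cluster": (0.5, 0.3),   # 50% CPU load is still acceptable
    "cloud-cluster": (0.2, 0.95),         # no memory room for new replicas
}

# Target platforms ranked by compute capability (best first)
ranking = ["hpc-node-cluster", "old-hpc-node-cluster", "cloud-cluster"]

def select_platform():
    """Pick the most capable platform whose CPU and memory still have headroom."""
    for platform in ranking:
        cpu, mem = utilization[platform]
        if cpu < CPU_LIMIT and mem < MEM_LIMIT:
            return platform
    return None  # fall back to collaboration between platforms

print(select_platform())  # old-hpc-node-cluster
```

With the utilization values above, the most capable cluster is skipped because it is fully CPU-loaded, mirroring the degradation observed in Figure 8.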
This method can be helpful in scenarios where there is already some additional load on a target platform; scheduling function invocations on multiple target platforms can then prevent performance degradation and SLO violations. In this work, we deployed an NGINX server in between the old-hpc-node-cluster and the cloud-cluster to distribute the function invocations across both clusters, as shown in Figure 10. We consider two scenarios:

1.
Round-robin Collaboration: In this case, the function invocations are distributed across both target platforms in a round-robin manner.
FIGURE 10 Performance comparison of the primes-python function with invocations generated from 30 VUs on the old-hpc-node-cluster, on the cloud-cluster, and when both were collaborated in a round-robin and a weighted load-balancing manner.

2. Weighted Collaboration: The biggest drawback of the round-robin approach is that it assumes the target platforms are similar enough to handle equivalent loads. However, because of the heterogeneous target platforms in the FDN, the algorithm has no way to distribute more or fewer requests to these target platforms based on their resources. As a result, target platforms with less capacity may overload and fail more quickly while capacity on other target platforms remains idle. Therefore, in this case we use weighted collaboration, where function invocations are distributed across the two target platforms based on the weights assigned to each target platform. In this work, the old-hpc-node-cluster target platform is assigned a weight of five and the cloud-cluster a weight of one, which means that out of every six function invocations, five are invoked on the old-hpc-node-cluster and one on the cloud-cluster.

We deploy the primes-python function on the two target platforms old-hpc-node-cluster and cloud-cluster. To demonstrate the benefits of collaborative function invocations between multiple target platforms, we consider four scenarios. In scenarios 1 and 2, all functions are invoked exclusively on the old-hpc-node-cluster and the cloud-cluster respectively. In scenarios 3 and 4, the two target platforms are collaborated in a round-robin and a weighted manner respectively. For all scenarios, we generate a load of 30 VUs. The performance comparison for all four scenarios is shown in Figure 10. We observe a significant increase in performance when the two platforms are collaborated in a round-robin manner as compared to when the functions are invoked exclusively on the cloud-cluster.
In this case, the number of requests processed increased from 20 to 55 per unit time with a lower P90 response time of six seconds per request. Moreover, when compared to scenario 1, the number of requests served was higher in scenario 3 with approximately the same P90 response time. We observe the best performance in scenario 4, i.e., with weighted collaboration. In this case, 60 requests per unit time were served with a response time of five seconds per request.

Collaboration between multiple heterogeneous target platforms in the FDN is a method to overcome the shortcomings of individual target platforms. Additionally, this mechanism can also be used to mitigate the cold-start problem. This can be done by keeping a low-resource cluster always warm, directing initial function invocations to it, and later using weighted collaboration between the other target platforms. Moreover, it is also possible to create a dynamic rule inside the load balancer that checks for the warm target platform and directs the initial function invocations to it, leading to an overall better performance.
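The weighted collaboration described above maps directly onto NGINX's weighted round-robin upstream mechanism. The following is a minimal sketch of such a configuration; the host names and port are hypothetical, and this is not the exact configuration used in the experiments.

```nginx
# Weighted round-robin across two target platforms (hypothetical hosts).
# With weight=5 and weight=1, five out of every six invocations go to the
# old-hpc-node-cluster gateway; omitting the weights gives plain round-robin.
upstream fdn_targets {
    server old-hpc-node-cluster.example.com:8080 weight=5;
    server cloud-cluster.example.com:8080 weight=1;
}

server {
    listen 80;
    location /function/ {
        proxy_pass http://fdn_targets;
    }
}
```

A dynamic rule that prefers a pre-warmed cluster for initial invocations could be layered on top of such a configuration, for example by adjusting the weights at runtime.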
Although functions in FaaS are stateless, changes in state and look-ups require frequent access to databases and object storage. Current platforms do not take into account the data access behavior of functions while scheduling. This leads to longer execution times and violations of the SLO requirements. To demonstrate this, we hosted two MinIO instances: one locally on the target platform and another remotely on the Google Cloud Platform (GCP) in the us-east region. MinIO is an object store which can store unstructured data such as photos, videos, log files, backups, and container images. Following this, we evaluated the performance of the image-processing function, which takes an image from the two MinIO instances and performs different operations on it. For our experiments, we used the cloud-cluster with function invocations from 20 VUs for accessing the data from the two MinIO instances, and the google-cloud-cluster with the same number of function invocations for showcasing the performance when the function is scheduled closer to the remote data storage.

FIGURE 11 Performance comparison of the image-processing function with invocations generated from 20 VUs on the cloud-cluster when the data is available locally and remotely.

Figure 11 shows the performance comparison for these three scenarios. The cloud-cluster with function invocations accessing the local MinIO instance was able to serve more requests (approx. 60 per unit time) than when accessing the remote MinIO instance (approx. 45 per unit time), and at a lower P90 response time (three seconds per request versus four seconds per request in the case of the remote MinIO instance). Executing the function on the google-cloud-cluster performed worst, with 20 requests per unit time at a P90 response time of 8.5 seconds. This can be attributed to the inability of GCF to handle compute-intensive functions and also to the large latency caused by the difference between the region from where the request is issued (in Germany) and the region where it is handled (us-east).

Migrating data closer to the target platform can significantly reduce the access latency. Hence, adaptive data management is a key part of the FDN in meeting the SLOs. For instance, data required for training a neural network can be migrated to a high-performance target platform. This reduces the data access latency, leading to a decrease in training time. Furthermore, a subset of the data can be migrated to the edge-cluster for low-latency machine learning model inference. Additionally, placing the functions closer to the data location provides another way of achieving a lower access latency. However, in our experiments, executing the function on the google-cloud-cluster, which is closer to the data, performed worst due to the large distance between the request origin and the execution location.
Nevertheless, one can use such a strategy to handle large numbers of function requests when the local cluster does not have enough resources to handle them.
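The data-locality-aware placement described above can be sketched as a simple cost model that weighs compute time against data access latency. All platform names, latencies, and compute times below are illustrative assumptions, not measured values from the experiments.

```python
# A minimal sketch of data-locality-aware scheduling; all numbers and names
# are illustrative assumptions. Placement minimizes compute time plus the
# estimated latency of accessing the function's data store.
access_latency = {  # estimated data access latency (s) per (platform, store)
    ("cloud-cluster", "minio-local"): 0.05,
    ("cloud-cluster", "minio-gcp-us-east"): 1.0,
    ("google-cloud-cluster", "minio-local"): 1.0,
    ("google-cloud-cluster", "minio-gcp-us-east"): 0.05,
}
compute_time = {  # estimated pure compute time (s) of the function
    "cloud-cluster": 2.5,
    "google-cloud-cluster": 7.5,
}

def schedule(data_store):
    """Pick the platform minimizing compute time plus data access latency."""
    return min(compute_time,
               key=lambda p: compute_time[p] + access_latency[(p, data_store)])

# Even for data stored in us-east, the slower google-cloud-cluster loses here,
# mirroring the experimental result above:
print(schedule("minio-gcp-us-east"))  # cloud-cluster
```

Such a model also captures when migrating data (changing `access_latency`) becomes cheaper than moving the function.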
Another important objective highlighted in the FDN is providing energy efficiency for certain workloads, due to the availability of resource-constrained target platforms like our edge-cluster. This cluster is made up of Nvidia Jetson Nano edge devices and consumes significantly less energy than the other target platforms.

To obtain power measurements for the Jetson Nano edge devices, we utilize the inbuilt power monitors that measure power consumption for different supply rails. Specifically, we measure the power consumption for the rail POM_5V_CPU. On the other hand, the power consumed by the hpc-node-cluster is obtained through the running average power limit (RAPL) counters PKG0 and PKG1 for the two sockets respectively. It is important to note that for all experiments we measure the CPU power consumption and average the power values over five runs of the same experiment. We evaluate the energy consumed by the edge-cluster and the hpc-node-cluster when a load of 400 requests per second from 40 VUs is invoked on the function JSON-loads deployed on each of them. We calculate the energy consumed by multiplying the average power with the duration of the experiment. Although the P90 response time (6.32s) is higher for the edge-cluster as compared to the hpc-node-cluster (2.3s), the total number of requests served is the same for both target platforms (400 requests per second). Therefore, if a client has an SLO P90 response time of seven seconds, then both target platforms can meet it for this workload. However, there is a significant difference in the CPU energy consumption of the target platforms, as shown in Table 4. For the edge-cluster, we obtain a total CPU energy consumption of 2647.2J as compared to 44645.64J for the hpc-node-cluster. Table 4 also shows the individual CPU power consumption with and without workload for each node in the edge-cluster and for each socket in the hpc-node-cluster. Clearly, choosing the edge-cluster as the target platform for this small workload saves a lot of energy. Automatically placing the functions on the low-energy-consumption target platform based on the workload is part of the FDN.
TABLE 4 Total energy consumption for the edge-cluster and hpc-node-cluster target platforms when a load of 400 requests per second using 40 VUs is invoked on the function JSON-loads.

                                             edge-cluster              hpc-node-cluster
                                             Node 1  Node 2  Node 3    Socket 0  Socket 1
CPU power consumption without workload (W)      .
CPU power consumption with workload (W)         .
Total CPU energy consumption (J)                    2647.2                  44645.64
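The energy figures above follow from the calculation described in the text, E = P_avg × t (average power times experiment duration). The sketch below only illustrates this arithmetic; the two total-energy values are taken from Table 4, while the example power and duration are placeholders.

```python
# Energy is computed as average power times experiment duration (E = P_avg * t),
# as described in the text. The totals below are taken from Table 4.
def energy_joules(avg_power_watts, duration_seconds):
    """Energy consumed over the experiment, in joules."""
    return avg_power_watts * duration_seconds

edge_energy = 2647.2      # J, edge-cluster total (Table 4)
hpc_energy = 44645.64     # J, hpc-node-cluster total (Table 4)

# The ~17x savings reported in the conclusion follows from the ratio:
print(round(hpc_energy / edge_energy, 1))  # 16.9
```

Since both platforms served the same 400 requests per second within the seven-second SLO, this ratio translates directly into energy saved per unit of work.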
In this section, we discuss a few other opportunities offered by the FDN that can be used for achieving various objectives.
We have used three different FaaS platforms in this work (OpenWhisk, OpenFaaS, and GCF), but there are several others available, including the ones offered by public cloud providers, such as AWS Lambda from Amazon and Azure Functions from Microsoft. One can integrate all of these into the FDN. However, mapping the metrics from all these platforms to a common metric, so that the FDN can make decisions, is challenging due to differences in their semantics, aggregation, and measurement.

Mature platforms like OpenWhisk use optimized caching and distinguish between cold, prewarm, and warm containers to address the cold-start problem. Prewarm containers are containers that already have the runtime environment for an action set up. For example, when OpenWhisk's algorithm anticipates Node.js-based actions, it will start preparing generic Node.js containers, which removes most of the cold-start time. When an action is executed very frequently, OpenWhisk will detect that and keep its containers warm. Warm containers are containers where the action is already initialized and ready to be run at any time. On the other hand, OpenFaaS does not have the concept of warm and prewarm containers; as a result, this can affect the performance of a target platform using it. OpenFaaS, like OpenWhisk, supports the option to scale to zero and hence save money on idle resources. Additionally, OpenFaaS provides support for low-end edge devices with ARM processors and is therefore a clear candidate for usage on edge target platforms. The FaaS platforms of public cloud providers offer the advantage of executing the functions globally in any region of the world and also have large scaling capabilities.

The FDN offers heterogeneity between FaaS platforms, through which multiple devices with different system architectures, such as Android phones, can be integrated into it.
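The cold/prewarm/warm start paths described above for OpenWhisk-style platforms can be sketched as a simple container-pool lookup. This is a simplification for illustration; the container model and names are assumptions, not OpenWhisk's actual data structures.

```python
# A minimal sketch of the cold/prewarm/warm container selection performed by
# OpenWhisk-style platforms; the container model here is a simplification.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Container:
    runtime: str                    # e.g. "nodejs", "python"
    action: Optional[str] = None    # initialized action, or None if prewarm

def pick_start_path(pool, runtime, action):
    """Return which start path an invocation takes for a given container pool."""
    if any(c.runtime == runtime and c.action == action for c in pool):
        return "warm"     # action already initialized: fastest path
    if any(c.runtime == runtime and c.action is None for c in pool):
        return "prewarm"  # runtime is up, only the action must be injected
    return "cold"         # a new container must be created from scratch

pool = [Container("nodejs"), Container("python", "sentiment-analysis")]
print(pick_start_path(pool, "python", "sentiment-analysis"))  # warm
print(pick_start_path(pool, "nodejs", "json-loads"))          # prewarm
print(pick_start_path(pool, "go", "primes"))                  # cold
```

A platform without prewarm containers, such as OpenFaaS as described above, would fall through from "warm" directly to "cold", which is one source of the performance differences between target platforms.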
Furthermore, using a FaaS platform optimized for certain system architectures, such as tinyFaaS for edge devices, can lead to higher performance and better SLOs. This can be exploited by application developers.

Due to the availability of heterogeneous target platforms in the FDN, application developers can optimize their code to use specialized hardware like GPUs or specialized processor features like SIMD/AVX for running their functions. For designing such applications, hints or recommendations on which target platform a function will be scheduled by the FDN can be provided to the developer, which allows developers to target code for specific hardware features and enables innovative hardware/software co-design. Moreover, application developers could provide functions in a high-level domain-specific language, and the FDN could automatically compile these functions for the most cost-effective target platform based on the user-specified SLOs.

Furthermore, conventional enabling technologies for ML at edge networks require personal data to be shared with external parties, e.g., edge servers. Recently, in light of growing data privacy concerns, the concept of Federated Learning (FL) has been introduced. In FL, end devices use their local data to train an ML model required by the server. The end devices then send the model updates rather than the raw data to the server for aggregation. FL can serve as an enabling technology in edge networks. However, in a large-scale and complex mobile edge network, heterogeneous devices with varying constraints are involved, which raises challenges of resource allocation in the implementation of FL at scale. The FDN, having heterogeneous target platforms including the edge, can be used to provide automatic resource allocation and function scheduling for FL-based applications.
Functions can also be chained together into sequences, where each chained function uses the output of the preceding function as input. In the AWS Lambda platform, such workflows are realized as Step Functions. AWS charges an additional cost for each transition from one function to another. Therefore, in such cases, in order to reduce the number of state transitions and make the overall deployment cost-efficient without violating the SLOs, multiple functions can be composed together. Moreover, functions with different functionalities can be composed into a larger function to meet user requests when serving requests from a single function is not possible. For example, if the output parameters of one function can be used as the input parameters of another function, these two functions can be connected as a new function whose input parameters are those of the first function and whose output parameters are those of the second function. This new function is called a composed function, and the elemental functions are referred to as the member functions. Deploying these member functions together on a target platform with higher compute capability can result in a higher QoS of the overall application. One way of achieving this in the FDN is by deploying the member functions together within a Kubernetes pod; the two functions will then always be deployed together, resulting in a lower number of transitions and a cost reduction. In , the author mentions the problem of double-spending with function composition, where a serverless function (a composer function) whose only purpose is to call other serverless functions is also billed to the user, although only the called functions consume the resources. The FDN can recognize such composer functions and automatically schedule them to low-resource target platforms to reduce the overall cost.
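The composition of two member functions into one composed function can be sketched as ordinary function composition: the output of the first becomes the input of the second, so no billed state transition occurs between them. The member functions below are purely illustrative.

```python
# A minimal sketch of composing two member functions into one composed
# function, avoiding a billed state transition between them; the member
# functions here are illustrative examples, not from the paper.
def compose(f, g):
    """Return a composed function: the output of f feeds the input of g."""
    def composed(*args, **kwargs):
        return g(f(*args, **kwargs))
    return composed

def resize(image):            # member function 1
    return {**image, "size": (128, 128)}

def watermark(image):         # member function 2
    return {**image, "watermarked": True}

# The composed function has the input of `resize` and the output of `watermark`.
process = compose(resize, watermark)
print(process({"name": "cat.png"}))
# {'name': 'cat.png', 'size': (128, 128), 'watermarked': True}
```

Deployed as one unit (e.g., within one Kubernetes pod), `process` replaces two separately billed invocations with a single one.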
Researchers have already identified the limitations of current serverless platforms, such as no control over specifying additional hardware resources like the required number of CPUs, GPUs, or other types of accelerators for the functions, and inefficient communication patterns between functions because of the data access latency. Jonas et al. suggest some improvements and workarounds which can be adopted to overcome these limitations. Since the FDN targets heterogeneous platforms, it overcomes these limitations by taking into account the computational (CPUs, GPUs, etc.) and data requirements (remote or local data availability) of the function and then scheduling the function automatically on the right target platform. In this process, migration of data closer to the function can also take place if there is a high data access latency. Furthermore, Shahrad et al. studied the architectural implications of serverless computing and pointed out that the exploitation of system architectural features like temporal locality and reuse is hampered by the short function runtimes in FaaS. In the FDN, deployment hints regarding the target platforms of the functions will be provided to the user, from which the developer can exploit the system architectural features to optimize the application and achieve higher SLOs.

In the following paragraphs, we present prior work from four aspects: (i) heterogeneity in the performance of public cloud providers' FaaS platforms and how the FDN can take advantage of it, (ii) FaaS for HPC and how the FDN can be advantageous for HPC workloads, (iii) FaaS for edge devices, and (iv) different strategies for coordination among heterogeneous platforms and how the FDN's strategies differ from them.

FaaSProfiler is the first to take a bottom-up approach in analyzing the architectural implications of the FaaS model and to unwrap its server-level overheads.
They analyzed the difference between native and in-FaaS function execution and quantified the additional server-level overheads like computational overheads, memory consumption, and bandwidth usage, as well as management overheads like orchestration, queuing, scheduling, and power consumption. Furthermore, Lee et al. compared the performance of various serverless computing environments offered by public cloud providers by showcasing results for throughput, network bandwidth, file I/O, and compute performance under concurrent function invocations. L. Wang et al. performed an in-depth study of resource management and performance isolation with three popular serverless computing providers: AWS Lambda, Azure Functions, and Google Cloud Functions. Their analysis demonstrates a considerable difference in performance between the FaaS platforms and states that Azure Functions uses different types of VM hosts, and 55% of the time a function instance runs on a VM with degraded performance. They also state that on Azure the VMs hosting functions can have 1, 2, or 4 vCPUs. Additionally, K. Figiela et al. developed a cloud function benchmarking framework, with CPU-intensive functions deployed on the FaaS platforms of the major cloud providers. The authors observe fluctuations in response time based on the different underlying hardware, runtime systems, and resource management. These observations showcase the heterogeneity in the performance and resource availability of the public cloud FaaS offerings. Thus, an FDN across these public FaaS platforms can enable the scheduling of functions on them by delivering each function to the right platform based on its requirements, such that the performance adheres to the defined SLOs at the lowest cost.

Lynn et al. study seven different public serverless platforms, including AWS Lambda, Google Cloud Functions, and Microsoft Azure Functions, to showcase that serverless computing can be applied to a wide range of use cases. Serverless computing is highly relevant for scientific applications, especially in conjunction with HPC capabilities. PyWren utilized an external ad-hoc orchestrator to share state and synchronize the parallel execution of functions in simple map-reduce applications. There has also been some work on reducing function startup latencies, such as SAND, in which the authors utilized application-level sandboxing and a hierarchical message bus for achieving shorter startup delays and efficient resource usage. McGrath et al. proposed a queuing scheme in which reusable function containers are put into warm queues, while workers where new containers need to be created are put into cold queues. Spillner et al. demonstrated that the FaaS cloud model can be used for different HPC batch workloads, such as calculating the value of π, image face detection, password cracking, and weather forecasting. Malla et al. compared Google Cloud Functions with Google Compute Engine in terms of cost and performance for an HPC workload.
They found that FaaS can be 14% to 40% less expensive than IaaS for the same level of performance, but the performance of FaaS exhibits higher variation due to on-demand CPU allocation by the cloud service providers. Based on these observations, we have integrated an HPC node cluster platform into the FDN. Furthermore, the FDN provides the option of scheduling HPC-based workload functions either to the more performant HPC node cluster platform or to highly available and scalable public cloud FaaS platforms. The decision to choose a platform can be made based on user requirements such as performance vs. cost. In our previous work, we used a similar approach for achieving federated learning using heterogeneous FaaS platforms.

The first documented efforts for bringing serverless capabilities to the edge came from industry with the introduction of AWS Lambda@Edge, which allows one to explicitly deploy Lambda functions to edge locations. This is used within Amazon's IoT Greengrass system, which integrates edge devices with cloud resources in an IoT platform; application Lambda functions running on it are deployed to the edge computers. Baresi et al. propose a serverless model for Multi-Access Edge Computing (MEC). They provide a broader range of application scenarios along with optimizations that compose a serverless edge platform. KubeEdge is an open source system extending native containerized application orchestration and device management to hosts at the edge. These frameworks focus on executing the applications only on the edge by extending cloud-based FaaS platforms to the edge. Pfandzelter et al. highlight the problems of running cloud-based FaaS platforms on the edge and introduce a new FaaS platform called tinyFaaS for edge environments. The FDN includes an edge-cluster platform, allowing the opportunity of scheduling the functions closer to the user and hence providing better performance.
Furthermore, the FDN allows multiple instances of the same function to coexist across multiple heterogeneous platforms, thus providing a way to handle function invocations with varying requirements.

With respect to these works, our proposed FDN provides a way for cooperation among various heterogeneous platforms to increase their performance, robustness, and scalability. To share resources efficiently for multiple tasks in the cloud, a game-theoretic approach is introduced by Freeman et al. Designed for latency-critical applications, PARTIES presents an online learning approach to efficiently allocate fine-grained resources such as memory bandwidth and last-level caches without QoS degradation. Delimitrou et al. present a collaborative filtering-based approach to assign a workload to the most appropriate hardware configuration. Satyanarayanan et al. propose an edge computing approach to offload computation from mobile devices to the network edge using virtual machine (VM) based cloudlets. In fog and edge computing, a considerable amount of research work has also been done on methods for resource provisioning and management. There have also been studies on integrating edge and cloud computing to allow the deployment of services on resource-constrained edge devices while offloading compute-intensive parts to the cloud. Although the different proposed approaches for resource provisioning show promising results in traditional computing environments, they have not been evaluated and extended for the heterogeneous collection of target platforms in the FDN, especially involving HPC systems. Bermbach et al. follow a particular auction-based approach in which application developers bid on the resources of fog nodes to make a local decision about which functions to offload while maximizing revenue. It requires no centralized coordination and focuses on maximizing the earnings for the infrastructure provider.
On the other hand, there is no guarantee for the user that its function will be executed. Our approach within the FDN is designed to have a central coordination point and focuses on a fast response to the user. Hellerstein et al. describe FaaS as a data-shipping architecture in the sense that it still ships data to code rather than shipping code to data, and see this as perhaps the biggest shortcoming of FaaS platforms. The approach of fluid code and data placement, described as a step towards the future, is the suggested solution to this problem, by which the platform would physically colocate certain code and data. Based on this approach, we designed the data migration and function placement strategies in the FDN.

To the best of the authors' knowledge, there has not been any work which involves using the heterogeneous platforms cloud, edge, and HPC together for achieving different objectives in a serverless manner.

In this section, we discuss potential threats to the replicability, reliability, and external validity of the study. There are two threats to the replicability and reliability of the study. The first one lies in the type of systems used in this study as experimental target platforms. These systems with the presented configurations may not be publicly available to everyone and hence present a threat to the replication of the presented results. However, even if systems similar to the presented configurations are used, the authors believe that the drawn conclusions would still hold. Furthermore, the presented study showcases that different heterogeneous platforms provide different opportunities for scheduling functions across the platforms. Secondly, in this study we see a potential risk of confirmation bias towards the reliability of the study, where we try to confirm our assumptions.
This risk was mitigated by checking ourselves to make sure that we do not have any preference with regard to the outcome. The whole research process was conducted using open source tools along with standard benchmarks and is made transparent, from how we gathered data to how we designed the tool and conducted our performance evaluations. Additionally, we open-source our designed tool and all the collected data.

There are two major threats to the external validity of the study. The first one lies in the limitations of the benchmarks used in this work. The benchmarks used are smaller than complex industrial FaaS applications and do not involve the BaaS services of the various public cloud providers. The second one lies in the amount of user workload generated for the benchmarking. The generated user workload may be simpler and smaller than real workloads and represents only a limited part of the different types of possible workloads. Thus, it is not clear whether the work can be effectively applied to much larger industrial applications and to more complex and real user workloads. Furthermore, the drawn conclusions and opportunities presented in this study may change with the type of platforms used for the evaluations and thus cannot be generalized to all platforms.
Due to the current limitations of serverless computing for applications which are highly dynamic in their structure and computational requirements, we introduced the Function Delivery Network (FDN), a network of distributed heterogeneous target platforms enabling the automatic scheduling of heterogeneous functions to target platforms based on their computational and data requirements. The concept of the FDN was evaluated using five distributed target platforms with different computational capabilities (ranging from small edge servers to high-end HPC-based machines) against two goals, SLO requirements and energy efficiency, using
FDNInspector, a tool for benchmarking distributed FaaS-based target platforms. It was found that scheduling function invocations on the high-performance target platform leads to a higher QoS in most cases. However, in the scenario where the target platform's resources are already being used, scheduling functions on it can lead to a degradation in the QoS of the application. Therefore, it is important to consider the resource usage of the target platform before scheduling functions on it. Moreover, collaborating the function invocations between multiple target platforms can lead to a higher QoS as compared to scenarios where functions are exclusively invoked on individual target platforms. Migrating data closer to the target platform can also significantly reduce the data access latency. We showcase that such opportunities offered by the FDN can help in meeting the SLO requirements. Finally, using an edge-based target platform can achieve significantly lower energy consumption. In this work, we showed that by using an edge-based target platform the overall energy consumption is reduced by 17x as compared to scheduling on a high-end target platform, without violating the SLO requirements.
ET AL . In the future, we plan to complete the implementation of the Function Delivery Network and demonstrate its use for thevarious dynamic heterogeneous applications such as Federated Learning. In addition, integrating AWS lambda as one of thetarget platforms in FDN is another perspective future scope.
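The scheduling considerations summarized above (avoiding overloaded platforms, meeting the SLO, and preferring energy-efficient targets) can be illustrated with a minimal sketch. This is not the FDN implementation; the platform attributes and the `schedule` policy below are simplified assumptions for illustration only:

```python
# Illustrative sketch (not the authors' implementation): pick a target
# platform for a function invocation by filtering out overloaded platforms,
# keeping those expected to meet the SLO, and preferring the lowest
# estimated energy cost among the remaining candidates.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TargetPlatform:
    name: str
    expected_latency_ms: float      # estimated execution latency on this platform
    cpu_usage: float                # current CPU utilization, 0.0..1.0
    energy_per_invocation_j: float  # estimated energy cost per invocation (J)


def schedule(platforms: List[TargetPlatform],
             slo_ms: float,
             max_usage: float = 0.8) -> Optional[TargetPlatform]:
    """Return the lowest-energy platform among those that are not overloaded
    and are expected to meet the SLO; fall back to the fastest platform if
    none qualifies."""
    candidates = [p for p in platforms
                  if p.cpu_usage < max_usage and p.expected_latency_ms <= slo_ms]
    if candidates:
        return min(candidates, key=lambda p: p.energy_per_invocation_j)
    # No platform meets both constraints: degrade gracefully to the fastest one.
    return min(platforms, key=lambda p: p.expected_latency_ms, default=None)


platforms = [
    TargetPlatform("edge", expected_latency_ms=900, cpu_usage=0.3,
                   energy_per_invocation_j=2.0),
    TargetPlatform("hpc", expected_latency_ms=200, cpu_usage=0.9,
                   energy_per_invocation_j=34.0),
]
chosen = schedule(platforms, slo_ms=1000)
print(chosen.name)  # edge: meets the SLO at a far lower energy cost
```

Under these (hypothetical) numbers the HPC platform is excluded because it is already heavily utilized, and the edge platform is chosen since it still meets the SLO at a much lower energy cost, mirroring the trade-off observed in the evaluation.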
ACKNOWLEDGMENTS
This work was supported by funding from the German Federal Ministry of Education and Research (BMBF) within the scope of the Software Campus program. Google Cloud credits were provided through the Google Cloud Platform research credits program. We thank the anonymous reviewers for their constructive reviews, which helped improve this work and inspire future work.