From Zero to Fog: Efficient Engineering of Fog-Based IoT Applications
Tobias Pfandzelter, Jonathan Hasenburg, and David Bermbach
Mobile Cloud Computing Research Group
Technische Universität Berlin & Einstein Center Digital Future
{tp, jh, db}@mcc.tu-berlin.de
August 19, 2020

Abstract
In IoT data processing, cloud computing alone does not suffice due to latency constraints, bandwidth limitations, and privacy concerns. By introducing intermediary nodes closer to the edge of the network that offer compute services in proximity to IoT devices, fog computing can reduce network strain and high access latency to application services. While this is the only viable approach to enable efficient IoT applications, the issue of component placement among cloud and intermediary nodes in the fog adds a new dimension to system design. State-of-the-art solutions to this issue rely on either simulation or solving a formalized assignment problem through heuristics, which are both inaccurate and fail to scale with a solution space that grows exponentially. In this paper, we present a three-step process for designing practical fog-based IoT applications that uses best practices, simulation, and testbed analysis to converge towards an efficient system architecture. We then apply this process in a smart factory case study. By deploying filtered options to a physical testbed, we show that each step of our process converges towards more efficient application designs.
1 Introduction

For more than a decade, cloud computing has been the dominating paradigm when designing and deploying software services, but this is not a good fit for new application domains such as the Internet of Things (IoT): Sending the world's IoT data to a centralized cloud for processing is not only inefficient but also prohibitively expensive [1]. Processing should instead happen where IoT data is generated and needed [2]. Fog computing, as first proposed by Bonomi et al. [3], brings the required paradigm shift: It extends the cloud to the edge of the network so that applications can leverage additional infrastructure between the cloud and end-devices. From powerful data centers in larger cities to small, single-board computers co-located with cellular base stations, application designers can deploy their services not only in a central cloud but anywhere on the edge-cloud continuum. While cloud resources still provide elastic, seemingly infinite scalability at low cost, edge infrastructure offers service consumers low latency access while also consuming less network bandwidth [2]. Overall, fog computing enables hitherto impossible application architectures but it does not simplify application design. Even worse, when designing fog-based IoT applications, the placement of software services within the fog is now a new dimension on top of actually building the application.

Correctly placing services, however, is vital in leveraging fog computing for the IoT as it directly influences both quality and cost of applications. At the same time, the number of deployment options grows exponentially with each service or location. Existing approaches to designing fog-based IoT data processing applications each have their drawbacks. First, there are those that try to parameterize the entire system to form an optimization problem solved by heuristics or similar means (e.g., [4–8]). This requires detailed information upfront, is limited by the assumptions of the applied model, and can become insolvable for complex applications. Alternatively, a second approach is to follow guidelines, best practices, or reference architectures such as [9–12], which, while useful as a starting point, target generalized scenarios and are hence not sufficient for a specific use case. Third, there are approaches that aim to simulate the fog environment to help make informed decisions about application performance (e.g., [13–15]) and, fourth, those that introduce tooling to create (emulated) fog testbeds (e.g., [16–18]) to deploy, test, and benchmark applications. Simulation and emulation, however, do not scale well with the growing amount of application deployment options, especially given the cost of testbeds.

In this paper, we propose a new process for designing efficient fog-based systems that combines and extends existing approaches, namely following best practices, simulation, and testbed emulation. Through this combination, we leverage the advantages of each approach while mitigating their respective limitations. For instance, we apply best practices to reduce the parameter space for simulation, which prevents incurring the costs of simulating the entire parameter space without sacrificing the accuracy of simulation results. Our overall goal is, hence, to identify an efficient fog application design as effectively as possible.

To this end, we make two core contributions:

• We extend and integrate previous research of ours into a novel framework. We use best practices [9], simulation with FogExplorer [13, 14], and infrastructure emulation with MockFog [16] (Section 3).

• We implement a smart factory application following our proposed process and compare the final application design to a range of discarded design options in experiments on a physical fog testbed (Section 4).

Figure 1: The layered fog architecture comprises cloud, intermediary, and edge nodes, as well as IoT devices.
2 Background

In this section, we summarize fog computing concepts and discuss characteristics of fog-based IoT applications and efficient IoT application design.
2.1 Fog Computing

Our definition of fog computing is adapted from [2]: Fog computing is the extension of the cloud towards the edge of the network. The idea is to combine cloud resources, intermediary nodes, edge computing, and even on-device computation to distribute applications across a wide variety of infrastructure. In this way, application developers can leverage both low access latency at the edge and scalability in the cloud. We show an example of a layered fog architecture in Fig. 1.

As fog computing combines platforms from different vendors, e.g., a cloud provider or a network provider, heterogeneity is a major challenge. Different platforms are likely to provide different programming models and service levels. Furthermore, intermediary and especially edge nodes are also likely to be more expensive and less scalable than their cloud counterparts.

A major obstacle to using fog computing is that applications need to be deployed in a distributed manner, with different software components placed on different nodes in the fog. This is impossible when dealing with traditional monolithic information systems. Only a modularized application split into distinct software services allows each service to be placed at specific locations within the fog, whether that be in the cloud or towards the edge. While increasing the communication overhead, smaller services are necessary for fine-granular scaling and enable more flexibility in service placement on the fog infrastructure [2]. To this end, leveraging lightweight virtualization technologies such as Docker can make software deployment easier [19].
2.2 IoT Applications

IoT applications analyze data from sensors or process them to trigger actuator devices and software systems [9]. A key characteristic of IoT applications is that they do not follow a request-response model as in user-facing systems; instead, data move through a processing pipeline in a more "linear" way – typically in the form of a directed acyclic graph (DAG). Overall, there are two classes of IoT data processing: event processing and data analytics. Zhang et al. [1] describe these as "real-time applications with low-latency requirements" and "ambient data collection and analytics," respectively. An application often comprises multiple data processing components that can each be classified individually in this manner.

In event processing, events from the outside world, measured through connected devices, trigger reactions in the system and, by extension, possibly in the physical world. The main focus here is time sensitivity: events are expected to be reacted to as fast as possible. Advantageously, operations are thus also well-defined and simple, and events as data points are small as they only carry metadata [20].

Data analytics is the process of collecting and processing data to obtain information. Here, complex operations are applied to data from multiple sources over a longer period of time [21].
2.3 Efficient IoT Application Design

We consider two dimensions to efficiency in fog-based IoT applications: service level and cost.
Service level, often also referred to as quality of service (QoS), can be both the availability of the application and the access latency for particular services [2, 22]. Latency is highly dependent on service placement and is caused by data processing and transmission. Data processing latency describes the time that passes between the input into the processing unit, which could, for example, be a cloud function, and the output of a computed result. Data transmission latency, on the other hand, is the delay from the first packet of data to be sent by the sender to the last packet of data to be received by the receiver. To limit the scope of this paper, we do not consider availability and leave this to future work.
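Since the service level objectives introduced later constrain the combination of both latency components, it helps to keep the resulting quantity in mind. In our own notation (not a formalization used in this paper), the end-to-end latency of one application path is simply

```latex
\[
t_{\text{end-to-end}} \;=\; \underbrace{\sum_{s \in \text{services}} t_{\text{processing}}(s)}_{\text{data processing latency}} \;+\; \underbrace{\sum_{\ell \in \text{links}} t_{\text{transmission}}(\ell)}_{\text{data transmission latency}} .
\]
```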
Cost is incurred through the usage of resources in the fog, such as compute, storage, and network bandwidth, and through upfront investment in IoT devices or other hardware. Generally, compute and storage are far cheaper towards the cloud, as providers are more capable of leveraging economies of scale in large data centers rather than on the edge. For network bandwidth, fog platform providers often charge for outgoing and incoming traffic to a data center, and IoT devices may use cellular network access where each packet incurs a specific cost. These costs are the main contributors towards the total cost of operating an IoT application.

When designing fog-based IoT applications, different design options result in different service levels and cost. An efficient design offers the best possible QoS levels at the lowest possible cost, i.e., it finds a sweet spot on the QoS and cost tradeoffs [22–24], as QoS and cost are not independent from each other. Deploying powerful servers at every edge location would minimize latency but result in high cost. Similarly, moving all services to the cloud can minimize cost but dramatically increase latency [1, 2].
3 Designing Efficient IoT Applications
In this section, we present our proposed fog application design process. We start by giving a high-level overview of our approach before describing the individual steps in detail.
3.1 Process Overview

The process we propose for designing efficient fog-based IoT applications comprises five main steps. Initially, there is a broad range of design options which each describe a mapping of software services to nodes in the cloud, edge, or in-between, i.e., the service placement. Each step of the proposed process then filters out application design options, starting at the Cartesian product of all software and infrastructure models, thereby converging towards the limited set of most efficient designs. The key idea is to create a sequence of steps in which each step provides more accurate recommendations than its predecessor but is also more expensive to execute. Since each step reduces the application design space by orders of magnitude, we use more expensive analysis steps for only a limited number of options late in the process while relying on low-cost heuristics in the first steps; see Figure 2 for a high-level overview of the proposed process.

In the first step, we build models of software components and of the infrastructure on which the application will be deployed. In each later step, we then extend these models and augment them with additional details as available further in the design process. Finally, we are able to select an efficient fog application design.

In the second step, we apply a set of best practices in IoT data processing. By following these informed rules, we can already discard all highly inefficient options. This reduces the set of options that we have to consider later in the process, enabling us to move through these subsequent steps more efficiently. As the number of available options grows exponentially with each additional component, this step reduces the design options considered in the subsequent steps from millions to only thousands.

In the third step, we simulate service placement to infrastructure components. With this, we can calculate service cost based on the given cost factors and examine latency constraints for different designs. By introducing service level objectives (SLOs) for parts of the application, we can remove application design options that violate required service levels and instead focus only on inexpensive options that conform to all constraints.

In the fourth step, we set up emulated testbeds for each of the remaining application design options to deploy and benchmark software services. As this step is expensive and time-consuming, we propose to use only the options in the 95th percentile of the third step, again reducing the number of considered application design options by orders of magnitude. Based on the number of remaining design options, this selection may be limited or broadened, reducing testing cost or leading to more accurate results, respectively.

This process eventually converges towards a small set of highly efficient design options. If available, these options that show the best performance at good cost levels can then be deployed on a physical testbed or the actual infrastructure to measure their performance in their real environment (fifth step).

3.2 Preparation

Our process requires basic insights into the available runtime infrastructure and the individual software services. We start with a notion of infrastructure components, yet at this early step in the design process we cannot assume that detailed information about runtime infrastructure is available. We therefore only require high-level, abstract descriptions of available data processing locations, such as IoT devices, edge nodes, or cloud platform providers.
Such knowledge can, for example, be gained by surveying and analyzing eligible providers and products or by comparing options for IoT devices and gateways. For some more complex use cases, synthesizing possible edge infrastructure configurations as proposed by Rausch et al. [25] could be an alternative approach.

Figure 2: Starting from the set of all possible application design options derived from a software and infrastructure model, we remove poor design options in each step of the application design process, converging on the most efficient one.

Aside from infrastructure components, we also model software components. At this point, no actual implementation has to be available yet. For our model, we use three kinds of components: sources, services, and sinks. Sources are components that produce new data. For an IoT use case, sources are typically IoT sensors. Services consume data and perform operations, thereby producing new data. Services could, for instance, transform data through aggregation or trigger events. Finally, sinks are components that persist data, e.g., a database system, or interact with the physical world based on data, e.g., an IoT actuator. Sinks that persist data can also have a secondary role as sources exposing historical data. We show an example application of this kind in Fig. 3.

We define the overall application as a collection of application paths. Each application path starts with one or more data sources, has a number of services along the way, and ends in one sink, i.e., an application path is the DAG of processing steps that leads to a particular sink. Please note that we are not trying to derive a formal model as used in either mathematically formulated optimization problems or in standardized modeling languages. Rather, our efforts focus on deriving and abstracting certain properties from an application and its available infrastructure; the way this abstract information is represented is irrelevant for our purposes.
At this point, albeit early in the process, it is already useful to simplify both software and infrastructure models. In most IoT applications, specific components are instances of the same class of components. In a smart home use case, for example, there could be various light bulbs and corresponding light switches. Assuming that each switch controls a number of lights, a pattern emerges. To simplify simulation and benchmarking, we model only one application path and later apply this to all instances of light switches and lights in the system. This also allows our process to scale well and to require less upfront information about the system, while not influencing the results, as we merge instances of the same component rather than modifying them.

For sources and sinks, the mapping to infrastructure components is clear, as these are tied to the physical world. An IoT device, for instance, exists as a physical device, i.e., an infrastructure component, and as a source in the software model. Consequently, we only need to consider the placement of services, i.e., the software components that process data, in the subsequent steps of the design process.
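To keep the later steps concrete, these abstract models can be written down as a handful of data structures. The representation is deliberately left open by our process, so the following Go sketch, including all type and field names, is merely one possible illustration:

```go
package fogdesign

// ComponentKind distinguishes the three classes of software components.
type ComponentKind int

const (
	Source  ComponentKind = iota // produces new data, e.g., an IoT sensor
	Service                      // consumes data, performs operations, and produces new data
	Sink                         // persists data or acts on the physical world
)

// Component is one element of the software model.
type Component struct {
	Name string
	Kind ComponentKind
}

// Node is an abstract infrastructure location such as an IoT device,
// an edge node, or a cloud platform.
type Node struct {
	Name string
}

// Path is one application path: the processing steps that lead to a particular
// sink (stored as an ordered slice for simplicity, although the general case is a DAG).
type Path struct {
	Components []Component
}

// DesignOption maps every service to an infrastructure node. Sources and sinks
// are tied to the physical world, so only services appear in the mapping.
type DesignOption map[string]string // service name -> node name
```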
Figure 3: Example software model with two application paths, sources, services, and sinks.

3.3 Best Practices

In previous work [9], we proposed best practices for fog-based IoT application design, which we now use to exclude unsuitable application design options. In the following, we will briefly describe how we apply these best practices, which we split into rules for event processing and data analytics application paths.

In event processing, processing is time-sensitive and services should be placed on the shortest communication path between data source(s) and sink, as close to the cloud as possible to minimize cost, and as close to the edge as necessary to fulfill SLOs. As typical event processing services are not compute-intensive, minimizing round-trip time is more important than reducing processing delay. Yet, as cloud computing resources scale better and moving towards the edge reduces flexibility and increases cost, it is still important to process events as close to the cloud as possible. That means selecting the infrastructure node that provides the most flexibility and least expensive compute power from the set of nodes on the shortest path between the event source and its sink.

For data analytics, rather than time sensitivity being the focus, operations are complex and require a lot of processing power. These operations range from filtering or aggregation to predictive analytics with machine learning. Furthermore, services here must consider and even combine data from different sources. Data analytics processors that preprocess data and reduce the data volume, e.g., through filtering and aggregation, should be kept as close to the edge as possible and as close to the cloud as necessary. Compute-heavy operators, on the other hand, should be placed near the cloud, where processing is cheaper.

Given these best practices, we can filter the set of application design options. Here, we consider each application path individually. For each application path, we first identify whether it targets event processing or data analytics. For an event processing application path, infrastructure nodes that lie on the shortest path between the infrastructure components that host the event source and sink are an efficient location for software services. In data processing, we argue for preprocessing of data close to the edge where possible. This reduces usage of bandwidth towards the cloud, where we propose to place more complex data processing. We also rule out options where the resulting data flow uses the same network links more than twice.
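Applied mechanically, the event-processing rule becomes a filter over the design space: keep only those mappings whose services sit on the shortest source-to-sink path. The following Go sketch assumes that design options are given as service-to-node maps and that the shortest path has already been computed on the infrastructure graph; all function names are ours:

```go
package fogdesign

// onShortestPath reports whether a node is part of the shortest communication
// path between the infrastructure components hosting an event source and its sink.
func onShortestPath(node string, shortestPath []string) bool {
	for _, n := range shortestPath {
		if n == node {
			return true
		}
	}
	return false
}

// filterEventProcessing keeps only those design options (service -> node maps)
// that place every service of an event-processing application path on the
// shortest path between its source and sink, as the best practice demands.
func filterEventProcessing(options []map[string]string, services []string, shortestPath []string) []map[string]string {
	var kept []map[string]string
	for _, option := range options {
		valid := true
		for _, service := range services {
			if !onShortestPath(option[service], shortestPath) {
				valid = false
				break
			}
		}
		if valid {
			kept = append(kept, option)
		}
	}
	return kept
}
```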
3.4 Simulation

In the third step, we use simulation to analyze the remaining application design options. For this, we rely on FogExplorer [13, 14], which we presented in previous work. For a given mapping, FogExplorer calculates QoS and cost metrics through simulation and provides recommendations for optimizing component placement. FogExplorer can be used in an interactive way in which application designers update mappings and observe the resulting metric values instantly, but it can also be used in a batch mode through its API.

Based on an infrastructure and software model, FogExplorer calculates four metrics per mapping: processing cost, processing time, transmission cost, and transmission time. Processing cost and transmission cost describe average cost per second within the system. Processing time and transmission time describe latency induced by services and transmission of data.

To calculate these metrics, FogExplorer first determines the data stream routing by identifying the path with the lowest total bandwidth cost for each set of two communicating software components. In a second step, FogExplorer calculates resource usage to assert that the selected mapping does not exceed resource limits; for example, a connection may have a limited amount of bandwidth. FogExplorer will thus determine if the bandwidth required by any connection within the mapping exceeds the available bandwidth. In the third step, FogExplorer calculates total cost based on resource usage. Transmission costs depend on bandwidth used and the respective bandwidth price. In a similar manner, FogExplorer also calculates processing costs. In addition, FogExplorer also determines time metrics and calculates processing time and transmission time for each application path. Processing time is the total latency induced by services processing data, while transmission time is the total connection latency along the application path. Finally, FogExplorer tallies transmission costs and processing costs, as well as transmission times and processing times, to project the total cost and end-to-end latency of the given mapping.

We use FogExplorer to further filter out application design options as the third step of our proposed process. To use this simulation, we have to extend our software and infrastructure models slightly.

In the infrastructure model, we also specify different hardware options that are available for each node. For example, at an edge data center location, the installation of different types of servers with different capabilities yet also different price points may be possible. Here, FogExplorer allows us to compare these different options to find the most efficient one. While this increases the space of application design options, this is necessary to determine the optimal infrastructure. For each infrastructure option at each node, we specify a relativePerformanceIndicator, which is a rough estimate of compute power compared to a chosen reference machine. For instance, if a machine type has a performance indicator of 2, it is twice as "fast" as the reference machine. Furthermore, the availableMemory metric specifies how much memory is available for the machine and the price metric specifies the price for using the machine. Network components are extended with an availableBandwidth, a bandwidthPrice, and a latency for each connection. If latency cannot be accurately benchmarked ahead of time, it is also possible to use estimates based on link layer performance and geographical locations of nodes as done in, e.g., [26].

Similarly, we add quantitative attributes to software model components as well. Sources produce data at a constant rate that we mark as their average outputRate in the form of Byte/s. The rate at which services output data depends on their input rate, hence we use an outputRatio to calculate their outgoing bandwidth. For services, we also employ a referenceProcessingDelay factor that describes how long, on average, the service needs to process data on the aforementioned reference machine, and a requiredMemory metric to describe the amount of memory needed by the service. Of course, both sinks and sources as software components require a certain amount of memory as well once they run on an infrastructure node. The infrastructure nodes then incur cost for running these components. As we have described, however, the mapping for sources and sinks is fixed, as these components relate to objects in the physical world. Hence, while it is possible to simulate costs incurred here as well, these costs would be static and, subsequently, not influence our decision between one application design option and another, which is why we omit them in the simulation and only focus on resources required by service components. We show the extended version of our example software model from Section 3 in Fig. 4.

Figure 4: Extension of the example software model; we infer connection data rates from given outputRates and outputRatios.

We also introduce SLOs in the form of limits to end-to-end latency for each application path at this point. As we have described in Section 2.3, we measure efficiency for fog application design in cost and latency. Yet as cost and latency depend on each other, finding the most efficient application design is a difficult multi-objective optimization problem. Rather than finding the quantitatively optimal solution, we apply constraints in the form of SLOs to convert this problem into a single-objective optimization problem. (An alternative would be to use a utility function to transform the multi-objective optimization problem into a single-objective optimization problem.) While it depends on the specific application, the economic law of diminishing returns usually also applies to the tradeoff between cost and latency: To give an example, imagine both a user-facing web service and a machine-to-machine communication use case. In the first use case, investing a considerable cost to decrease latency by 10ms would often not be useful, while it can be in the second scenario. Application designers can set the required access latency for all application paths arbitrarily high or low as is required by the application, and our process will optimize cost within this specified service level.

Given these limits on end-to-end latency, we only consider those models further that satisfy these constraints in an efficient way, that is, at low cost. From the set of application design options, we select only those that do not violate the service levels for any application path as defined in Section 3.2. If no model conforms to these constraints, it is useful to reconsider the constraints or available infrastructure. From the remaining design options, we now select those that we will consider in the testbed step through the remaining influence factor, i.e., total cost.

As testbed evaluation is expensive and time-consuming, the number of application design options that will be benchmarked needs to be low. On the other hand, the design options that are identified as good options in the simulation step are not necessarily the best options, i.e., it can be beneficial to proceed with a broader variety of options. We propose to solve this tradeoff by proceeding with design options that lie in the 95th percentile when considering their total cost. If necessary, this range can be adapted.
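The metrics and checks of this step can be approximated in a few lines. The Go sketch below is not FogExplorer's actual implementation (FogExplorer is used through its Node.js API); in particular, the cost formulas are simplified assumptions of ours:

```go
package fogdesign

import "sort"

// Machine is one hardware option at an infrastructure node.
type Machine struct {
	RelativePerformanceIndicator float64 // compute power relative to the reference machine
	AvailableMemoryGB            float64
	PricePerMonth                float64
}

// ServiceSpec carries the simulation attributes of a service component.
type ServiceSpec struct {
	ReferenceProcessingDelayMs float64 // processing delay on the reference machine
	RequiredMemoryGB           float64
	OutputRatio                float64 // outgoing data rate = incoming data rate * OutputRatio
}

// Hop is one network connection used by an application path.
type Hop struct {
	LatencyMs           float64
	BandwidthPricePerGB float64
	TrafficGBPerMonth   float64
}

// processingTimeMs scales the reference delay by the hosting machine's performance.
func processingTimeMs(s ServiceSpec, m Machine) float64 {
	return s.ReferenceProcessingDelayMs / m.RelativePerformanceIndicator
}

// pathLatencyMs sums processing and transmission time along one application path;
// hosts[i] is the machine that the i-th service is mapped to.
func pathLatencyMs(services []ServiceSpec, hosts []Machine, hops []Hop) float64 {
	total := 0.0
	for i, s := range services {
		total += processingTimeMs(s, hosts[i])
	}
	for _, h := range hops {
		total += h.LatencyMs
	}
	return total
}

// transmissionCost multiplies the traffic on every hop with its bandwidth price.
func transmissionCost(hops []Hop) float64 {
	total := 0.0
	for _, h := range hops {
		total += h.TrafficGBPerMonth * h.BandwidthPricePerGB
	}
	return total
}

// violatesSLO checks one application path against its end-to-end latency limit.
func violatesSLO(latencyMs, limitMs float64) bool {
	return latencyMs > limitMs
}

// cheapestCandidates sorts SLO-compliant designs by total cost and keeps the given
// fraction, e.g., 0.05 for the percentile-based cut described in the text.
func cheapestCandidates(totalCost map[string]float64, fraction float64) []string {
	names := make([]string, 0, len(totalCost))
	for name := range totalCost {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool { return totalCost[names[i]] < totalCost[names[j]] })
	keep := int(float64(len(names)) * fraction)
	if keep < 1 && len(names) > 0 {
		keep = 1
	}
	return names[:keep]
}
```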
3.5 Emulation

In the fourth step of our process, we evaluate design options through experiments on an emulated fog testbed. This evaluation requires an implementation of the application software that we can deploy to the testbed and is thus the most time-consuming and costly. Yet, the low number of viable options that remain after the first three steps of our process limits the required experiments. Furthermore, it also limits needed implementation efforts as services only need to be implemented for the platforms they could be deployed on in the remaining application design options [19, 27].

To benchmark fog application design options, we propose to use MockFog as we have presented in [16]. MockFog provides an emulated yet realistic environment for functional testing and benchmarking of fog applications in the cloud. In MockFog, cloud, edge, and intermediate nodes as well as IoT devices are instantiated as cloud virtual machines. Compute power, memory, and inter-node network characteristics such as latency or failure rates can be configured; also, failure scenarios can be emulated.

Again, we need to modify our initial software and infrastructure models to fit the model used by MockFog. Instead of a performance indicator given for machines in the infrastructure model, we now need to quantify the actual compute power, memory, and storage capabilities. Furthermore, we have to define bandwidth and latency parameters for network connections. MockFog introduces routers between connected machines rather than direct connections. Hence, in order for all nodes to be able to communicate, we have to add these routing components where applicable.

Rather than extending it, we need to replace the application model with actual implementations of service and sink components that we then deploy on the MockFog testbed. For source components, the majority of which are IoT devices, implementation is more difficult. These source components need to produce IoT data in conformance with the application model. It is possible to use traces of real IoT data, e.g., through BenchFoundry [28], or to attach real-world IoT devices, although this requires a consideration of network conditions between these devices and the MockFog testbed location. Finally, we can also employ artificial workload generators such as Apache JMeter (https://jmeter.apache.org) to generate data.

On the emulated MockFog testbed, we can then analyze the behavior of the IoT application, especially in the context of component placement. While the MockFog environment also allows us to change configuration parameters at runtime, e.g., to inject failures, we use it only to benchmark application designs under the assumption that the provided application implementation is correct.
3.6 Deployment

After these four steps, an informed decision on the best design option can be made. The selected design option is likely to be the most efficient one regarding cost and service level, as it has been selected through best practices and simulation as well as verified on an emulated fog testbed. If in doubt, the best two or three options can then also be test-deployed in the real runtime environment or on a physical testbed to further substantiate the results.
4 Evaluation

To evaluate our approach, we use a case study based on a smart factory scenario. In the first part, we follow the process described in Section 3 to show that it can be used in practice. We make all software we use available as open source (https://github.com/pfandzelter/zero2fog-artifacts).

In the second part (Section 4.2), we show that the design option identified by our process is among the best options; for this, we implement the design on a physical testbed and compare it to alternative design options. Due to the number of permutations and the resulting experiment effort, it is not feasible to show that the identified option is the best option. We, hence, rely on sampling and run experiments with randomly selected design options that we have discarded in earlier process steps.

Figure 5: The smart factory comprises a factory floor, factory data center, and logistics office, and is augmented by an office data center and the cloud.
4.1 Case Study

In our case study, we apply our proposed process to a smart factory IoT application. We start by describing the scenario and derive software and infrastructure models (Section 4.1.1), apply our set of best practices (Section 4.1.2), use simulation (Section 4.1.3) and testbed experiments with the implemented software services (Section 4.1.4) to identify good design options, and briefly discuss the results of following our approach (Section 4.1.5). This shows that it is indeed possible to follow our proposed process and to pick a resulting design option.
4.1.1 Scenario and Models

We give an overview of our IoT application's components in Figure 5. The factory comprises a factory floor, a small data center, and a logistics office. In addition to the factory, there is a central office in an offsite location.

The factory floor has two machines: the Production Machine produces a part that the Packaging Machine then prepares for shipment. To ensure that the Packaging Machine processes only faultless parts, the Production Machine has an attached camera that takes a picture of each produced part and checks for defects. The Packaging Machine should adapt its speed to the output rate of its preceding machine. Furthermore, the Packaging Machine can only operate within a fixed ambient temperature range and thus has a temperature sensor installed to shut off the Packaging Machine if necessary. Each machine is also equipped with a controller that controls the speed at which the machine operates. These controllers are able to communicate over a common wireless gateway. In the onsite logistics office, logistics personnel make the decision on when to arrange outgoing product shipments. To this end, a logistics dashboard predicts machine output based on recent productivity. The factory data center provides some compute power and a connection to the WAN.

In the central company office in an offsite location, the business requires central reporting of factory productivity. This central office also has a collocated medium-size data center. Additionally, it is possible to leverage cloud computing to outsource some computational tasks.

We use this information to create our infrastructure model with the cloud, data centers in the smart factory and central office, as well as wireless gateway, machine controllers, and sensor nodes that all have additional compute capabilities.

Figure 6: Data sources, services, and sinks in our application. We mark application paths A1-A4 for the components.

We also derive the following application paths from the initial concept (see Figure 6 for the software model):
A1: The Camera takes pictures of parts leaving the first machine and the Check for Defects service analyzes each picture for defects. In case of a defect, the service instructs the Production Controller to discard the respective part.

A2: The Production Controller has information on the output rate of the machine that produces parts and uses this information to adapt the packaging rate of the packaging machine through an intermediary service. As a second input, the Packaging Controller also relies on data from the Temperature Sensor to control the packaging rate. When temperature readings leave a specified range, as detected by the Adapt Machine service, the Packaging Controller instructs the packaging machine to pause operation.

A3: The Packaging Controller provides data on the rate and amount of packaged parts to the Predict Pickup service that feeds into the Logistics Team Prognosis.

A4: Data of the Packaging Controller is also consumed by a service that aggregates and filters that data to generate a dashboard for the central office, which then runs inside a browser on a machine in the central office.

Data sources and sinks closely mirror the real world and placement for them is straightforward. For example, the Camera component in both the infrastructure and software models is the same device in the real world. For services, however, we still need to find an efficient mapping. To this end, we now follow the process introduced in Section 3.
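For reference, the four application paths can be written down directly in the abstract model from Section 3; the component names follow Figure 6, while everything else in this Go sketch is our own structuring:

```go
package smartfactory

// ApplicationPath lists the components of one path from its source(s) to its sink.
type ApplicationPath struct {
	Name     string
	Sources  []string
	Services []string
	Sink     string
}

// The four application paths derived from the smart factory scenario.
var applicationPaths = []ApplicationPath{
	{Name: "A1", Sources: []string{"Camera"}, Services: []string{"Check for Defects"}, Sink: "Production Controller"},
	{Name: "A2", Sources: []string{"Production Controller", "Temperature Sensor"}, Services: []string{"Adapt Machine"}, Sink: "Packaging Controller"},
	{Name: "A3", Sources: []string{"Packaging Controller"}, Services: []string{"Predict Pickup"}, Sink: "Logistics Team Prognosis"},
	{Name: "A4", Sources: []string{"Packaging Controller"}, Services: []string{"Aggregate", "Generate Dashboard"}, Sink: "Central Office Dashboard"},
}
```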
4.1.2 Best Practices

As described in Section 3.3, we need to consider all application paths individually in this step. We begin by classifying each application path and then use the corresponding best practice advice to filter out some application design options.
A1: Although a photo is larger than a sensor value, we classify A1 as event processing. Each photo corresponds to an event in the physical world, in this case the production of a part, and the Camera translates this event into a message carrying metadata in the form of the image. Processing the image is also time-critical as the Production Machine should discard any faulty part before it arrives at the Packaging Machine. Although the event message has a relatively large size, the Check for Defects service on this application path only needs to consider one source at a time, which, depending on the complexity of analysis for each event, limits processing time. As such, limited bandwidth and high network latency can be a bigger factor in not achieving QoS goals here. Therefore, image processing should at least be kept on factory premises, if not even inside the machine on either Camera or Production Controller. A more specific decision is not possible as long as more detailed information about service complexity and infrastructure capabilities is not available at this stage.

A2: We can make a similar argument for A2. Here, two event sources produce events independently but a single service that controls the packaging rate consumes all of them. Again, we classify this path as event processing as events are small in size and decisions need to be made quickly. Service complexity is also low as, albeit consuming two data sources, the service does not consider historic data and performs simple calculations. Thus, placing the Adapt Machine service close to data sources and sinks, on factory premises, is the most efficient option.

A3: Despite using only one data source producing rather simple data items, we classify A3 as data analytics since it needs to consider current and historical data; also, the processing is more complex as the goal is to predict future packaging rates. Furthermore, QoS limits for latency are in the range of seconds (rather than milliseconds) as the staff will only periodically check the report. Consequently, depending on prediction complexity, we propose placing the Predict Pickup service where compute power is the cheapest, the cloud or a data center for instance. Correct placement then comes down to a cost calculation between bandwidth price and compute costs, as is part of the subsequent simulation.

A4: Finally, there is A4, which monitors the factory output rate to feed data into a dashboard in the central office. This, too, is a data analytics workflow and there are no strict latency constraints. Instead, again, data amount and processing complexity are the limiting factors. Consequently, as the Aggregate service is a preprocessing step, placing it close to the Packaging Controller limits bandwidth usage. Similarly to A3, we can then place the complex processing service Generate Dashboard where processing is available for the lowest price, which is likely to be the cloud or one of the data centers.

Starting with five services that we can deploy to one of eight infrastructure components each means that there are 32,768 permutations, growing exponentially with additional services or infrastructure components. By following our best practices, we managed to reduce the set of options to only 864.
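The size of the initial design space follows directly from these counts; as a quick check (our arithmetic):

```latex
\[
\text{design options} = (\text{candidate nodes})^{\text{services}} = 8^{5} = 32{,}768.
\]
```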
4.1.3 Simulation

We now use FogExplorer to simulate QoS and cost of the remaining application design options as explained in Section 3.4. To use FogExplorer, we first need to extend the software and infrastructure models, which we show in Figures 7 and 8, respectively. To give an example, in the application model the camera produces data in the form of images at a rate of 100kb/s, and the service thereafter takes an estimated 20ms to process data items on a reference machine with an outputRatio of 0.1, meaning that with a 100kb/s input it outputs 10kb/s. Furthermore, this service requires 250MB of memory. For each application path we have also introduced QoS requirements in the form of latency limits. In the simulation, we discard any service mapping that violates either of these conditions. For the A1 application path, for instance, we set an upper limit of 50ms as delay between taking a picture and the command reaching the production controller. In the infrastructure model, we introduce different machine options with different capabilities and price points for some nodes. For example, there are two options for the camera component: One has computational capabilities of 0.1% that of the reference machine with 1MB of memory at a price of $0.5/month, while the other has 5% performance of the reference machine with 10MB of memory available at a higher price of $5/month.

Figure 7: We extend components of the application paths in our software model with attributes as required by FogExplorer: sources have an outputRate and services have an outputRatio, referenceProcessingDelay, and requiredMemory. Furthermore, application paths have a QoS limit of acceptable end-to-end latency. (In the figure: A1: Camera, 100kb/s; Check for Defects, ratio 0.1, 20ms, 250MB; limit 50ms. A2: Production Controller and Temperature Sensor, 10kb/s each; Adapt Machine, ratio 0.5, 1ms, 100MB; limit 30ms. A3: Packaging Controller, 10kb/s; Predict Pickup, ratio 0.1, 100ms, 1500MB; limit 1s. A4: Packaging Controller, 10kb/s; Aggregate, ratio 0.1, 2ms, 100MB; Generate Dashboard, ratio 100, 50ms, 2500MB; limit 2s.)

As our case study is fictional, we estimate these prices in lieu of actual infrastructure. As a basis, we use pricing for a moderate compute instance with a 2-core processor and 4GB of memory on Amazon Web Services (AWS) Lightsail (https://aws.amazon.com/lightsail), which costs $20/month. This is similar in price and performance to the medium machine option for the Factory Data Center node. We estimate the total cost of ownership per performance to be lower near the cloud and with more powerful machine options, yet higher near the edge, where maintenance is a higher factor, and extrapolate accordingly.

Figure 8: We extend infrastructure components and their network links with more attributes as required by FogExplorer: each node has a relativePerformanceIndicator, availableMemory, and price. Network connections have a latency, availableBandwidth, and a bandwidthPrice. Square brackets denote that more than one hardware option is available at a specific node. These hardware options differ in price and capability. (In the figure: Cloud rPI [10,25,50], memory [64,256,512]GB, [50,250,500]$/month; Office Data Center rPI [5,10,20], memory [32,64,256]GB, [50,100,300]$/month; Factory Data Center rPI [0.5,1,5], memory [2,4,16]GB, [10,20,50]$/month; Wireless Gateway rPI [0.1,0.5], memory [0.1,0.25]GB, [25,60]$/month; Camera rPI [0.001,0.05], memory [0.001,0.01]GB, [0.5,5]$/month; Production Controller rPI [0.05,0.5], memory [0.01,0.1]GB, [10,30]$/month; Packaging Controller rPI 0.5, memory 0.1GB, 30$/month; Sensor rPI 0.001, memory 0.001GB, 0.5$/month.)

The A2 application path has two sources and, depending on its placement, they have a different connection latency to their common service. As both sources send their data in parallel, we consider the maximum end-to-end latency for this application path and assert that this does not violate the QoS.

We automate the simulation using the Node.js interface of FogExplorer. Although the number of possible application design options grows exponentially with software and infrastructure components and machine options for nodes, the preceding step in which we have discarded options using best practices has already limited those options, allowing us to simulate all remaining design options efficiently. In fact, with the current software and infrastructure models we need to consider only 186,624 different options and are able to simulate and calculate metrics for all of them in about one minute on a standard laptop computer. For comparison, and to emphasize the importance of the first step of our process, there is a total of 7,077,888 application design options and a complete simulation of those already takes 50 minutes for this simple use case. As such, using only simulation without applying best practices first is infeasible, especially for more complex application scenarios.

In addition to overall cost and time metrics, we also calculate metrics for each application path on its own. This helps us discard options that violate SLO limits. From 186,624 possible application design options only 2,520 are valid and only 215 remain after applying the latency limits we defined. Consequently, FogExplorer lets us discard the 99.9% of application design options that are impossible to deploy in practice given infrastructure and SLO constraints.

The options that remain are therefore those that conform to all infrastructure and QoS constraints, and we can now choose those that have the lowest overall cost according to the simulation. We select the application designs in the 95th percentile in the pool of options based on cost, a total of ten designs. From the simulation, it is clear that placing the Check for Defects service of the A1 application path in the Factory Data Center, the Adapt service of the A2 path on the Packaging Controller or the Factory Data Center, and the Aggregate service of path A4 on the Wireless Gateway are the most efficient application design options. Furthermore, it becomes apparent that the Camera, Production Controller, and Sensor do not require additional compute capabilities as they do not need to run any data processing services. For the Factory Data Center, the simulation recommends the medium machine option for each application design option and the least expensive options for both Office Data Center and Cloud.

4.1.4 Emulation

Through simulation, we have chosen the ten most efficient application designs and can now deploy these on an emulated MockFog testbed. Before deployment can begin, we must first implement our software components. To this end, we implement each source, service, and sink in Go 1.14. We then install the compiled binaries on the MockFog nodes as Docker containers. We use an extended version of MockFog for our experiments that is available with all other software artifacts.

Each node in the system maps to one instance on AWS Elastic Compute Cloud (EC2, https://aws.amazon.com/ec2) in the same availability zone of the us-east-1 region. To emulate different kinds of hardware, we use different instance types. We show the mapping from referencePerformanceIndicator as employed in FogExplorer to EC2 instance types in Table 1.

Table 1: referencePerformanceIndicator (rPI) and Corresponding AWS EC2 Instance Types Used in Our MockFog Experiments

rPI      | EC2 Instance Type | vCPUs | Memory (GB) | sysbench CPU Score
[0, 1[   | t2.micro          | 1     | 2           | 1.25
[1, 5[   | t2.medium         | 2     | 4           | 2.90
[5, 10[  | t2.xlarge         | 4     | 16          | 5.89
[10, 20[ | t2.2xlarge        | 8     | 32          | 11.78
[20, 50[ | m5a.12xlarge      | 48    | 192         | 45.48
[50, ∞[  | m5a.24xlarge      | 96    | 384         | 90.91
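In code, the mapping of Table 1 is a simple interval lookup; a Go sketch (the function name is our own):

```go
package smartfactory

// instanceTypeFor returns the AWS EC2 instance type that Table 1 assigns to a
// given referencePerformanceIndicator (rPI) interval.
func instanceTypeFor(rPI float64) string {
	switch {
	case rPI < 1:
		return "t2.micro"
	case rPI < 5:
		return "t2.medium"
	case rPI < 10:
		return "t2.xlarge"
	case rPI < 20:
		return "t2.2xlarge"
	case rPI < 50:
		return "m5a.12xlarge"
	default:
		return "m5a.24xlarge"
	}
}
```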
For instances of the t2 family, we enable unlimited accrual of CPU credits to prevent inconsistent CPU bursting. Given the limited number of available instance types, however, this is not as fine-grained as the referencePerformanceIndicator in FogExplorer. Furthermore, it also does not allow setting the availableMemory to the same value as in the FogExplorer infrastructure model. To validate performance differences between instance types, we use the sysbench CPU benchmark in version 1.0.20 (https://github.com/akopytov/sysbench/tree/1.0). This benchmark calculates all prime numbers up to a certain limit, which we set at 1,000,000, in 1,024 threads simultaneously. It reports a CPU speed metric that describes the number of events the benchmarked CPU was able to handle per second, with each event corresponding to one completed prime computation. We repeat this benchmark three times and report median results. As shown in Table 1, this metric scales nearly linearly with the amount of CPU cores. Note that in order to leverage this performance for our application, the services we deploy have to actually use all available CPU cores. To this end, our implemented application services use multithreading through goroutines. Nevertheless, we can expect that performance does not scale strictly linearly with the number of CPU cores in practice.

MockFog sets artificial network bandwidth and latency limits between machines, and deploys our software components to the machines. The mappings for sinks and sources are identical each time, with, for instance, the Camera process running on the Camera node. Service mappings follow the ten most efficient design options identified through simulation. For each option, MockFog runs the application for 20 minutes and then collects logs to determine end-to-end latency for each application path. We repeat this process three times to gain accurate results and use median results in further analysis.

We measure end-to-end latency by attaching timestamps and unique identifiers to each request that passes through the system. Each component logs when it sends or receives a request with a specific identifier. One problem with measuring end-to-end latency in this manner is clock skew. When the clocks of two machines are not in sync, the measurement can become inaccurate. To limit this effect, all machines synchronize their clocks through the AWS Time Sync Service in their region before the experiments run, which, during our experiments, resulted in clock deviations lower than 0.3ms.

Between re-runs of the same experiment setup, we see a small overall coefficient of variation of between 0% and 3%. Consequently, we can say that our experiment results are robust. We use the average end-to-end latency unless stated otherwise and show these results in Figure 9. As expected, latency for the A1 application path is similar across all design options, as the Check for Defects service is always deployed to the same kind of Factory Data Center. On the A2 application path, we observe an end-to-end latency of between 3ms and 4ms when the Adapt service runs on the Packaging Controller and 14ms when placed on the Factory Data Center, due to the increase in network latency caused by additional hops for each request. This difference is even greater when considering only the Sensor source, where end-to-end latency is sub-millisecond when the Adapt service is deployed on the Packaging Controller. For the A3 application path, processing latency of the Predict service is higher when it runs on the Factory Data Center, with an average latency of 89ms for application design option 1, and even higher for options 2 and 9, where the Check for Defects, Adapt Machine, and Predict service are all deployed on this node, with 123ms and 108ms, respectively. When the Predict service runs on the Office Data Center or Cloud, this processing latency is lower, between 67ms and 77ms. For placement on the Cloud node, this reduction of processing latency is offset by a considerable increase in network latency to 257ms. The Aggregate service of application path A4 has a processing latency of between 0.1ms and 0.15ms, regardless of the machine type of the Wireless Gateway, which this service is always deployed to. At this scale, this difference could also be attributed to measurement error. The Generate Dashboard service has a lower processing latency when deployed to the Cloud at 89ms to 90ms than when deployed to the Factory Data Center, where processing latency ranges from 95ms up to 109ms. Yet again, this difference is offset by transmission latency, which, here, is lower at 23ms compared to 243ms.

As we had already ensured through simulation with FogExplorer, all application design options we have benchmarked on the MockFog testbed comply with all SLOs defined for the application paths.
Figure 9: Results for testbed experiments with MockFog. We show average end-to-end latency measured for all application design options for each application path (panels for application paths A1-A4). Error bars show the standard deviation. One application design option is consistently among those with the lowest end-to-end latency for each application path.

4.1.5 Results

Using the results from our MockFog experiments, we can now discard more application design options. From the ten application design options we have deployed to the emulated testbed, one option is the most efficient. We show its service mapping and determined infrastructure options in Figure 10. Here, the Factory Data Center hosts the Check for Defects and Generate Dashboard services, the Adapt Machine service is placed on the Packaging Controller, the Predict Pickup service on the Office Data Center, and the Wireless Gateway is used for the Aggregate service. As infrastructure options, we use the smallest available machines for the Wireless Gateway and Office Data Center, and the medium option for the Factory Data Center. In this application design option, the Cloud is not used to host any services, hence we do not require a machine there. Here, we skip the optional deployment of several options on a physical fog testbed as described in Section 3.6, since we will do exactly that in our evaluation of result quality in Section 4.2.

Figure 10: Service mapping and infrastructure options in the best application design option as determined in our case study.

Table 2: Overview of placement options and the step in which the option was discarded. This shows that early process steps alone cannot provide good enough recommendations.

Path | Service            | Camera         | Production Controller | Sensor         | Packaging Controller | Wireless Gateway | Factory Data Center | Office Data Center | Cloud
A1   | Check for Defects  | Simulation     | Simulation            | Best Practices | Best Practices       | Simulation       | Final Design        | Best Practices     | Best Practices
A2   | Adapt Machine      | Best Practices | Best Practices        | Simulation     | Final Design         | Simulation       | Simulation          | Best Practices     | Best Practices
A3   | Predict Pickup     | Best Practices | Best Practices        | Best Practices | Best Practices       | Best Practices   | Emulated Testbed    | Final Design       | Emulated Testbed
A4   | Aggregate          | Simulation     | Simulation            | Simulation     | Simulation           | Final Design     | Simulation          | Best Practices     | Best Practices
A4   | Generate Dashboard | Best Practices | Best Practices        | Best Practices | Best Practices       | Best Practices   | Final Design        | Simulation         | Emulated Testbed
After having shown the applicability of our processthrough a case study, we now evaluate it by deploy-ing our resulting architecture on a physical testbed.We benchmark our application with a synthetic work-load and determine whether our process has reallyconverged towards the most efficient design by com-paring it to application design options that were dis-carded in earlier steps of the process.We show application design options and at whichstep we have filtered them out in Table 2. This figurealso shows the final application design as determinedby our process to be the most efficient. The finaldesign has passed the check for best practices, sim-ulation with FogExplorer, and benchmarking on theemulated MockFog testbed. We now further evaluatethis design by comparing it to other design optionsthat we have filtered out during the process. Obvi-ously, we cannot compare all possible design options. For each filter we have applied, we randomly choosethree of the discarded design options, deploy themon a physical testbed and benchmark them.
M1-3, F1-3, and B1-3 denote the three designs each that were filtered out by MockFog, FogExplorer, and the application of best practices, respectively. For the sake of comparison, we also deploy and benchmark our final, winning design as presented in Section 4.1.5, which we denote as W. Software components use the same implementation and deployment method, i.e., Docker containers, as on our emulated MockFog testbed.
Our testbed comprises two Raspberry Pi 3B+ single-board computers, one acting as Camera and Production Controller and the other as Sensor and Packaging Controller. These boards connect over 2.4 GHz WiFi to a MacBook Pro with an Intel Core 2 Duo processor that we use as our Wireless Gateway. This computer, in turn, connects to a LAN over Gigabit Ethernet. This network has a 50 Mbit/s Internet uplink and a ThinkPad X220 laptop with an Intel Core i5 processor connected to it that acts as the Factory Data Center. Finally, as our Office Data Center, we use a virtual machine instance on AWS EC2 in the eu-west-1 (Ireland) region. As the Cloud instance, we use an AWS EC2 virtual machine instance in the ap-northeast-2 region. The respective instance types depend on the machine type used in the selected application design, see Table 1.
Experiments run for 20 minutes after an initial startup time of 5 minutes and are repeated three times. We report the results of the median run; variance across runs with the same experiment setup was between 1% and 4% for all experiments, except for setup M3 (9%), where one outlier had a higher end-to-end latency for the A3 application path, and experiments B1 (15%) and B3 (6%), which were unable to complete correctly.
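As a minimal sketch of this aggregation (not the actual evaluation scripts; the list runs_ms and the spread metric are illustrative assumptions), the median run and the relative spread across the three repetitions of one setup could be computed as follows:

```python
import statistics

# Hypothetical mean end-to-end latencies (ms) of one application path for the
# three repetitions of a single experiment setup.
runs_ms = [102.4, 103.1, 104.3]

median_run = statistics.median(runs_ms)  # the repetition we report
# Relative spread across repetitions, roughly corresponding to the 1-4%
# variance reported above (the exact metric is an assumption).
spread = (max(runs_ms) - min(runs_ms)) / median_run

print(f"median run: {median_run:.1f} ms, spread: {spread:.1%}")
```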
Figure 11: Latency results for experiments on the physical testbed. We show the average end-to-end latency measured for all application design options (W, M1-M3, F1-F3, B1-B3) on each application path (A1 to A4). Error bars show the standard deviation. Application design options B1 and B2 were unable to run the Predict Pickup service as the infrastructure component would run out of memory, hence no results for the A3 application path can be shown for them.

Figure 11 shows the average transmission and processing times measured in our experiments. Experiments for application design options B1 and B2 were unable to complete as the Predict Pickup service ran out of memory on the Packaging Controller and the Wireless Gateway, respectively, where it was deployed in these design options. The B3 option, while able to run all services, leads to a higher latency than other options that were selected in the first step of our process. Design option F1 was determined by FogExplorer to comply with all SLOs, yet was not in the 95th percentile cost-wise and was hence discarded. Nevertheless, its latency measurements are on par with designs W and M1 through M3. Option F2 violates SLO requirements in the simulation and we observe that it is also less efficient than the other options we test, so this elimination was correct. Finally, while FogExplorer discards F3 for insufficient resources, as the Wireless Gateway here has too little available memory for the Check for Defects service, we were able to deploy it correctly on our physical testbed and its latency is similar to that of our winning design option W. Yet this deployment is more costly than W as it uses more expensive infrastructure components. For options W and M1 through M3, we see the same results as in MockFog, where we have already tested these design options. Consequently, design option W is again the most efficient option among them.

Discussion
The five-step design process we propose can help to address the challenge of designing efficient fog-based IoT applications. Yet, as with all tools, it is important to know its limits in order to employ it correctly. First and foremost, our proposed process targets static applications. Although not all information about the system is necessarily required upfront, and infrastructure and software models are extended and modified along the way, as we have described, our design process is not equipped to deal with dynamic deployment changes as would be necessary for physically moving sources, sinks, or compute nodes. For example, in order to augment the application with a new service, parts of the process would need to be re-run from the start. While simulation and testbed emulation can be automated, best practices would need to be applied by an actual application design engineer.
While networks with mobile nodes, frequent outages, or regular changes in topology may exist, we envision that static applications such as the smart factory in our case study are common. Furthermore, our process may be used for the static components of a more dynamic application while the dynamic components are deployed using other approaches such as [29].
Another challenge is that fog application design is complicated by the number of factors that are at play. For example, we mention in Section 2.3 that we only consider service latency and cost as metrics to describe application design efficiency. Beyond that, availability is of course important as well. Cloud platforms may, for instance, provide better availability guarantees than a local data center. Availability, performance, network latency, or available network bandwidth may also be subject to external influence factors such as another tenant using the same network connection. Abstracting from such factors in our models means that our simulation and testbed experiments cannot accurately reflect the results that we would observe in the real world. Yet, we argue that we need this abstraction to keep models and simulation simple, which in turn is necessary to even facilitate their use in such a design process. These factors can then be tested later in the process by using physical testbeds.
In Section 3.4, we have introduced SLOs for application paths as a way to convert the multi-objective optimization of cost and service latency for each path into a single-objective optimization of cost within the specified latency constraints. While reducing end-to-end latency is always better, we argue that additional investment leads to diminishing returns after a certain point. Finding these fixed constraints, however, can be difficult for system designers, and setting SLOs too low or too high can negatively impact the overall satisfaction with the final application design by unnecessarily increasing cost or latency, respectively. In future work, we want to further explore this relationship between the cost and utility of reduced latency so that this decision can be made on a more informed basis.
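Written out (a sketch with notation not used in the paper: $D$ denotes the set of candidate application designs, $P$ the set of application paths, $c(d)$ the cost of design $d$, $\ell_p(d)$ the end-to-end latency of path $p$ under design $d$, and $L_p$ the latency SLO of path $p$), this reduction corresponds to the constrained problem

\[
\min_{d \in D} \; c(d) \quad \text{subject to} \quad \ell_p(d) \leq L_p \quad \text{for all } p \in P .
\]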
Related Work

We have motivated how the correct placement of IoT application components in the fog is difficult yet crucial for an efficient use of resources. This is a known research problem that has been discussed in existing publications.
Brogi et al. [30] present FogTorch, which models fog infrastructure by parameterizing available fog nodes, communication links, end devices, application components, and QoS constraints, and then finds eligible deployments of application components. While this approach leads to a set of valid application deployment options, solving fog application deployment in this manner is NP-hard, as the authors show. Consequently, finding valid deployment options becomes exponentially harder with each added component and is infeasible for larger deployments. Tong et al. [31] and, to some extent, Heintz et al. [32] take a similar approach to FogTorch, while [4–7, 33–36] employ heuristics to solve the formalized optimization problem. While using heuristics can lead to results more efficiently, it requires infrastructure and software implementation details upfront, allowing little room for flexibility and agile development. Often, such information may not be available at design time. Furthermore, these approaches only find solutions through static analysis, yet it is hard to verify whether the calculated results hold up in a real deployment, which is only possible through benchmarks on an emulated or physical testbed.
Khare et al. [37] also employ heuristics to create an efficient application design for distributed, edge-based stream processing. They apply them in a multi-step process in which a DAG of the entire application is first split into a set of linear chains whose latency is estimated individually, similar to the application paths we introduced in Section 3.2. The authors, however, approximate these processing chains algorithmically, which is an interesting alternative approach as it leads to less overhead for application designers, albeit at the price of accuracy.
Fogernetes, as proposed in [38], automates the deployment of software services across a number of fog nodes by leveraging the Kubernetes orchestrator, as Santos et al. [39] have also proposed. Similarly, [27, 40, 41] have presented such dynamic middleware. While these systems are flexible, they can only optimize latency and are not aware of system cost. Rather, they assume that a specific set of infrastructure already exists and that a mapping exists that does not lead to under-provisioning. In our proposed process, we provision only infrastructure that is really needed, keeping overall cost to a minimum. We argue that a more efficient fog application can be designed by building the underlying infrastructure alongside it. Furthermore, the infrastructure may often not yet be fixed when the development process is started. To this end, Roy et al. [42] present MAQ-PRO, a process for infrastructure capacity planning for component-based applications that is similar to our proposed process. MAQ-PRO begins with a profile of components, an analysis of the application scenario (compare Section 3.2), and a base performance model (compare Section 3.4), and it also considers SLA bounds and workloads. Their approach, however, is unsuitable for the novel paradigm of fog computing as it does not consider the network distance between infrastructure components, which is crucial in the fog.
In Section 3.4, we propose to use FogExplorer to simulate fog placement. Alternatively, Gupta et al. [15] have proposed the iFogSim tool to model and simulate the use of fog application resources. Their tool, however, is constrained in that it only allows tree-shaped infrastructure models, which is not representative of most fog infrastructures that can contain cycles, as is the case in our case study, for example. Furthermore, their tool requires highly detailed application traces, which is not feasible this early in the design phase. In [43], Brambilla et al. present an approach for simulating large-scale sensor networks for the IoT. While useful in its own right, it lacks an estimation of system cost, and we target more heterogeneous fog networks, albeit at a lower scale. Additionally, [44–49] also present simulation tools that could be applied to fog computing.
We also propose to use MockFog as an emulated testbed for different application designs in Section 3.5. Besides MockFog, other application testbeds exist as well. Eisele et al. [50] propose a hardware-in-the-loop simulation that uses a simulation tool in conjunction with a physical testbed. This allows them to leverage the flexibility in workload generation of the simulation tool and the realistic environment of the physical testbed, yet it also leads to increased cost without being entirely accurate. The D-Cloud [51] software testing framework allows individual software components to be placed on different virtual machines to emulate a cloud environment. This tool, however, cannot be applied to a fog infrastructure. Furthermore, Coutinho et al. [18] and Mayer et al. [17] propose Fogbed and EmuFog, which use the network simulators Mininet and MaxiNet [52] to test distributed fog applications. Yet unlike MockFog, these testbeds can only simulate realistic network conditions, not the constrained compute capabilities of fog nodes, especially at the edge. Balasubramanian et al. [53] present a testbed for fog applications that facilitates emulating these constraints, yet it requires physical hardware for each node rather than cheaper virtual machines.
To the best of our knowledge, our work is the first that combines best practices, simulation, and emulation into a complete design process for fog-based IoT applications.
Conclusion

Engineering IoT applications in an efficient way is challenging, as the process needs to consider both the software architecture and its deployment to a physical infrastructure. Existing approaches can only provide limited guidance since they are either based on theoretical models and simulation, i.e., inherently limited in their accuracy, or based on experiment testbeds, i.e., the evaluation effort is too high to explore more than a few design options.
In this paper, we have proposed a five-step process for designing efficient fog-based IoT applications that integrates and extends previous work of ours. Rather than relying solely on global optimization, simulation, or testbed benchmarking, we combine best practices, simulation, and testbed evaluation to choose the most efficient infrastructure options and software service placements from an exponentially growing pool of deployment options. Furthermore, we have shown the effectiveness of this approach through a smart factory case study. By deploying different options on a physical testbed, we also showed that our process identified an efficient application design in our case study and, by extension, that our process achieves the desired results.
Acknowledgments
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 415899119.
References

[1] B. Zhang, N. Mor, J. Kolb, D. S. Chan, K. Lutz, E. Allman, J. Wawrzynek, E. Lee, and J. Kubiatowicz, “The cloud is not enough: Saving IoT from the cloud,” in , Jul. 2015, pp. 21–27.
[2] D. Bermbach, F. Pallas, D. G. Pérez, P. Plebani, M. Anderson, R. Kat, and S. Tai, “A research perspective on fog computing,” in Service-Oriented Computing – ICSOC 2017 Workshops, Jun. 2018, pp. 198–210.
[3] F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, “Fog computing and its role in the internet of things,” in Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing, Aug. 2012, pp. 13–16.
[4] A. Brogi, S. Forti, and A. Ibrahim, “How to best deploy your fog applications, probably,” in , May 2017, pp. 105–114.
[5] O. Skarlat, M. Nardelli, S. Schulte, M. Borkowski, and P. Leitner, “Optimized IoT service placement in the fog,” Service Oriented Computing and Applications, vol. 11, no. 4, pp. 427–443, 2017.
[6] R. Mahmud, K. Ramamohanarao, and R. Buyya, “Latency-aware application module management for fog computing environments,” ACM Trans. Internet Technol., vol. 19, no. 1, pp. 1–21, 2018.
[7] H. Hong, P. Tsai, and C. Hsu, “Dynamic module deployment in a fog computing platform,” in , Oct. 2016, pp. 1–6.
[8] O. Skarlat, M. Nardelli, S. Schulte, and S. Dustdar, “Towards QoS-aware fog service placement,” in , May 2017, pp. 89–96.
[9] T. Pfandzelter and D. Bermbach, “IoT data processing in the fog: Functions, streams, or batch processing?” in Proc. of DaMove, Jun. 2019, pp. 201–206.
[10] M. Gusev, B. Koteska, M. Kostoska, B. Jakimovski, S. Dustdar, O. Scekic, T. Rausch, S. Nastic, S. Ristov, and T. Fahringer, “A deviceless edge computing approach for streaming IoT applications,” IEEE Internet Computing, vol. 23, no. 1, pp. 37–45, 2019.
[11] V. Karagiannis and S. Schulte, “Comparison of alternative architectures in fog computing,” in , May 2020, pp. 19–28.
[12] L. Santos, E. Silva, T. Batista, E. Cavalcante, J. Leite, and F. Oquendo, “An architectural style for internet of things systems,” in Proceedings of the 35th Annual ACM Symposium on Applied Computing, Mar. 2020, pp. 1488–1497.
[13] J. Hasenburg, S. Werner, and D. Bermbach, “Supporting the evaluation of fog-based IoT applications during the design phase,” in Proceedings of the 5th Workshop on Middleware and Applications for the Internet of Things, Dec. 2018, pp. 1–6.
[14] ——, “FogExplorer,” in Proceedings of the 19th International Middleware Conference (Demos and Posters), Dec. 2018, pp. 1–2.
[15] H. Gupta, A. Vahid Dastjerdi, S. K. Ghosh, and R. Buyya, “iFogSim: A toolkit for modeling and simulation of resource management techniques in the internet of things, edge and fog computing environments,” Softw. Pract. Exp., vol. 47, no. 9, pp. 1275–1296, 2017.
[16] J. Hasenburg, M. Grambow, E. Grünewald, S. Huk, and D. Bermbach, “MockFog: Emulating fog computing infrastructure in the cloud,” in , Jun. 2019, pp. 144–152.
[17] R. Mayer, L. Graser, H. Gupta, E. Saurez, and U. Ramachandran, “EmuFog: Extensible and scalable emulation of large-scale fog computing infrastructures,” in , Oct. 2017, pp. 1–6.
[18] A. Coutinho, F. Greve, C. Prazeres, and J. Cardoso, “Fogbed: A rapid-prototyping emulation environment for fog computing,” in , May 2018, pp. 1–7.
[19] R. Morabito, V. Cozzolino, A. Y. Ding, N. Beijar, and J. Ott, “Consolidate IoT edge computing with lightweight virtualization,” IEEE Netw., vol. 32, no. 1, pp. 102–111, 2018.
[20] N. Govindarajan, Y. Simmhan, N. Jamadagni, and P. Misra, “Event processing across edge and the cloud for internet of things applications,” in Proceedings of the 20th International Conference on Management of Data, Dec. 2014, pp. 101–104.
[21] M. R. Anawar, S. Wang, M. Azam Zia, A. K. Jadoon, U. Akram, and S. Raza, “Fog computing: An overview of big IoT data analytics,” Proc. Int. Wirel. Commun. Mob. Comput. Conf., vol. 2018, pp. 1–22, 2018.
[22] D. Bermbach, E. Wittern, and S. Tai, Cloud Service Benchmarking: Measuring Quality of Cloud Services from a Client Perspective. Springer, Cham, 2017.
[23] D. Bermbach, “Benchmarking eventually consistent distributed storage systems,” Ph.D. dissertation, Karlsruhe Institute of Technology, Feb. 2014.
[24] D. Kossmann, T. Kraska, and S. Loesing, “An evaluation of alternative architectures for transaction processing in the cloud,” in Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, Jun. 2010, pp. 579–590.
[25] T. Rausch, C. Lachner, P. A. Frangoudis, P. Raith, and S. Dustdar, “Synthesizing plausible infrastructure configurations for evaluating edge computing systems,” in USENIX Workshop on Hot Topics in Edge Computing (HotEdge 20), Jun. 2020.
[26] J. Hasenburg, F. Stanek, F. Tschorsch, and D. Bermbach, “Managing latency and excess data dissemination in fog-based publish/subscribe systems,” in Proceedings of the Second IEEE International Conference on Fog Computing (ICFC 2020), Apr. 2020, pp. 9–16.
[27] O. Skarlat, V. Karagiannis, T. Rausch, K. Bachmann, and S. Schulte, “A framework for optimization, service placement, and runtime operation in the fog,” in , Dec. 2018, pp. 164–173.
[28] D. Bermbach, J. Kuhlenkamp, A. Dey, A. Ramachandran, A. Fekete, and S. Tai, “BenchFoundry: A benchmarking framework for cloud storage services,” in Proceedings of the 15th International Conference on Service Oriented Computing (ICSOC 2017). Springer, 2017.
[29] D. Bermbach, S. Maghsudi, J. Hasenburg, and T. Pfandzelter, “Towards auction-based function placement in serverless fog platforms,” in Proceedings of the Second IEEE International Conference on Fog Computing (ICFC 2020). IEEE, 2020.
[30] A. Brogi and S. Forti, “QoS-aware deployment of IoT applications through the fog,” IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1185–1192, 2017.
[31] L. Tong, Y. Li, and W. Gao, “A hierarchical edge cloud architecture for mobile computing,” in IEEE INFOCOM 2016 – The 35th Annual IEEE International Conference on Computer Communications, Apr. 2016, pp. 1–9.
[32] B. Heintz, A. Chandra, and R. K. Sitaraman, “Optimizing timeliness and cost in geo-distributed streaming analytics,” IEEE Transactions on Cloud Computing, vol. 8, no. 1, pp. 232–245, 2020.
[33] X. Xu, D. Li, Z. Dai, S. Li, and X. Chen, “A heuristic offloading method for deep learning edge services in 5G networks,” IEEE Access, vol. 7, pp. 67734–67744, 2019.
[34] V. Cardellini, V. Grassi, F. Lo Presti, and M. Nardelli, “Optimal operator placement for distributed stream processing applications,” in Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems, Jun. 2016, pp. 69–80.
[35] S. Shekhar, A. Chhokra, H. Sun, A. Gokhale, A. Dubey, X. Koutsoukos, and G. Karsai, “URMILA: Dynamically trading-off fog and edge resources for performance and mobility-aware IoT services,” Journal of Systems Architecture, vol. 107, no. 101710, 2020.
[36] K. Oh, A. Chandra, and J. Weissman, “A network cost-aware geo-distributed data analytics system,” in , May 2020, pp. 649–658.
[37] S. Khare, H. Sun, J. Gascon-Samson, K. Zhang, A. Gokhale, Y. Barve, A. Bhattacharjee, and X. Koutsoukos, “Linearize, predict and place: Minimizing the makespan for edge-based stream processing of directed acyclic graphs,” in Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, Nov. 2019, pp. 1–14.
[38] C. Wöbker, A. Seitz, H. Mueller, and B. Bruegge, “Fogernetes: Deployment and management of fog computing applications,” in NOMS 2018 – 2018 IEEE/IFIP Network Operations and Management Symposium, Apr. 2018, pp. 1–7.
[39] J. Santos, T. Wauters, B. Volckaert, and F. De Turck, “Towards network-aware resource provisioning in kubernetes for fog computing applications,” in , Jun. 2019, pp. 351–359.
[40] E. Saurez, K. Hong, D. Lillethun, U. Ramachandran, and B. Ottenwälder, “Incremental deployment and migration of geo-distributed situation awareness applications in the fog,” in Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems, Jun. 2016, pp. 258–269.
[41] D. Santoro, D. Zozin, D. Pizzolli, F. D. Pellegrini, and S. Cretti, “Foggy: A platform for workload orchestration in a fog computing environment,” in , Dec. 2017, pp. 231–234.
[42] N. Roy, A. Dubey, A. Gokhale, and L. Dowdy, “A capacity planning process for performance assurance of component-based distributed systems,” in Proceedings of the 2nd ACM/SPEC International Conference on Performance Engineering, Sep. 2011, pp. 259–270.
[43] G. Brambilla, M. Picone, S. Cirani, M. Amoretti, and F. Zanichelli, “A simulation platform for large-scale internet of things scenarios in urban environments,” in Proceedings of the First International Conference on IoT in Urban Space, Oct. 2014, pp. 50–55.
[44] S. Sotiriadis, N. Bessis, E. Asimakopoulou, and N. Mustafee, “Towards simulating the internet of things,” in , May 2014, pp. 444–448.
[45] X. Zeng, S. K. Garg, P. Strazdins, P. P. Jayaraman, D. Georgakopoulos, and R. Ranjan, “IOTSim: A simulator for analysing IoT applications,” Int. J. High Perform. Syst. Archit., vol. 72, pp. 93–107, 2017.
[46] T. Qayyum, A. W. Malik, M. A. Khan Khattak, O. Khalid, and S. U. Khan, “FogNetSim++: A toolkit for modeling and simulation of distributed fog environment,” IEEE Access, vol. 6, pp. 63570–63583, 2018.
[47] D. Fernández-Cerero, A. Fernández-Montes, F. Javier Ortega, A. Jakóbik, and A. Widłak, “Sphere: Simulator of edge infrastructures for the optimization of performance and resources energy consumption,” Simulation Modelling Practice and Theory, vol. 101, no. 1019663, 2020.
[48] C. Sonmez, A. Ozgovde, and C. Ersoy, “EdgeCloudSim: An environment for performance evaluation of edge computing systems,” Trans. Emerging Tel. Tech., vol. 29, no. 11, 2018.
[49] N. K. Giang, M. Blackstock, R. Lea, and V. C. M. Leung, “Developing IoT applications in the fog: A distributed dataflow approach,” in , Oct. 2015, pp. 155–162.
[50] S. Eisele, G. Pettet, A. Dubey, and G. Karsai, “Towards an architecture for evaluating and analyzing decentralized fog applications,” in , Oct. 2017, pp. 1–6.
[51] T. Banzai, H. Koizumi, R. Kanbayashi, T. Imada, T. Hanawa, and M. Sato, “D-Cloud: Design of a software testing environment for reliable distributed systems using cloud computing technology,” in , May 2010, pp. 631–636.
[52] R. L. S. de Oliveira, C. M. Schweitzer, A. A. Shinoda, and L. Rodrigues Prete, “Using Mininet for emulation and prototyping software-defined networks,” in , Jun. 2014, pp. 1–6.
[53] D. Balasubramanian, A. Dubey, W. R. Otte, W. Emfinger, P. S. Kumar, and G. Karsai, “A rapid testing framework for a mobile cloud,” in , Oct. 2014, pp. 128–134.
Overview of Application Design Options Deployed to Emulated Testbed in Case Study
Table 3: Overview of the ten most efficient designs as established by our FogExplorer simulation.
(a) Service placement in the different application design options tested on the emulated MockFog testbed.

Application Design Option | Check for Defects Placement | Adapt Machine Placement | Predict Pickup Placement | Aggregate Placement | Generate Dashboard Placement
1  | FDC | PKC | FDC | WGW | CLD
2  | FDC | FDC | FDC | WGW | CLD
3  | FDC | FDC | ODC | WGW | CLD
4  | FDC | FDC | CLD | WGW | CLD
5  | FDC | PKC | ODC | WGW | FDC
6  | FDC | FDC | ODC | WGW | FDC
7  | FDC | PKC | CLD | WGW | FDC
8  | FDC | FDC | CLD | WGW | FDC
9  | FDC | FDC | FDC | WGW | CLD
10 | FDC | FDC | CLD | WGW | FDC

PKC = Packaging Controller, WGW = Wireless Gateway, FDC = Factory Data Center, ODC = Office Data Center, CLD = Cloud.
(b) Infrastructure options in the different application design options tested on the emulated MockFog testbed. Hardware options for the Camera and Production Controller have been omitted for brevity as no service is deployed on these nodes.

Application Design Options | Wireless Gateway Hardware Option | Factory Data Center Hardware Option | Office Data Center Hardware Option | Cloud Hardware Option
1, 2, 4, 7, 8 | 1 | 2 | — | 1
3             | 1 | 2 | 1 | 1
5, 6          | 1 | 2 | 1 | —
9, 10         | 2 | 2 | — | 1
(c) Results of the FogExplorer simulation for the ten best application design options tested on the emulated MockFog testbed: simulated end-to-end latency in ms for each application path (A1 to A4) and cost in $/month per application design option.

Overview of Application Design Options Deployed to the Physical Testbed in Case Study
Table 4: Overview of the ten application design options selected for deployment on the physical testbed. W denotes the most efficient design as determined by our process. M1-3, F1-3, and B1-3 denote the three designs that were filtered out by MockFog, FogExplorer, and the application of best practices, respectively.
(a) Service placement in the different application design options tested on the physical testbed.

Application Design Option | Check for Defects Placement | Adapt Machine Placement | Predict Pickup Placement | Aggregate Placement | Generate Dashboard Placement
W  | FDC | PKC | ODC | WGW | FDC
M1 | FDC | PKC | FDC | WGW | CLD
M2 | FDC | FDC | CLD | WGW | CLD
M3 | FDC | PKC | CLD | WGW | FDC
F1 | FDC | FDC | ODC | WGW | CLD
F2 | FDC | PKC | CLD | WGW | CLD
F3 | WGW | PKC | FDC | WGW | ODC
B1 | PKC | CLD | PRC | ODC | FDC
B2 | CLD | ODC | WGW | WGW | PKC
B3 | FDC | CLD | FDC | PKC | PKC

PRC = Production Controller, PKC = Packaging Controller, WGW = Wireless Gateway, FDC = Factory Data Center, ODC = Office Data Center, CLD = Cloud.
(b) Infrastructure options in the different application design options tested on the physical testbed. Hardware options for the Camera and Production Controller have been omitted for brevity as no service is deployed on these nodes.

Application Design Options | Wireless Gateway Hardware Option | Factory Data Center Hardware Option | Office Data Center Hardware Option | Cloud Hardware Option
W          | 1 | 2 | 1 | —
M1, M2, M3 | 1 | 2 | — | 1
F1         | 1 | 2 | 2 | 3
F2         | 1 | 2 | 3 | 3
F3         | 1 | 2 | 3 | —
B1         | — | 2 | 1 | 2
B2         | 1 | — | 1 | 3
B3         | — | 2 | — | 1