Considerations about Continuous Experimentation for Resource-Constrained Platforms in Self-Driving Vehicles
Federico Giaimo, Christian Berger, and Crispin Kirchner
Chalmers University of Technology, Göteborg, Sweden
University of Göteborg, Göteborg, Sweden
RWTH Aachen University, Germany
Abstract.
Autonomous vehicles are slowly becoming reality thanks to the efforts of many academic and industrial organizations. Due to the complexity of the software powering these systems and the dynamicity of the development processes, an architectural solution capable of supporting long-term evolution and maintenance is required.
Continuous Experimentation (CE) is an increasingly adopted practice in software-intensive web-based systems to steadily improve them over time. CE allows organizations to steer the development efforts by basing decisions on data collected about the system in its field of application. Despite the advantages of Continuous Experimentation, this practice is only rarely adopted in cyber-physical systems and in the automotive domain. Reasons for this include the strict safety constraints and the computational capabilities needed from the target systems.
In this work, a concept for using Continuous Experimentation for resource-constrained platforms like a self-driving vehicle is outlined.
Keywords:
Software architecture for cyber-physical systems · Continuous experimentation · Software evolution · Middleware
In European Conference on Software Architecture (pp. 84-91), Springer, 2017, https://link.springer.com/chapter/10.1007/978-3-319-65831-5_6
Constant efforts in technology and software development by various research and commercial institutions are gradually making autonomous cars a reality. While this final objective is still out of reach in the near future, many features that can replace the human driver in ordinary driving tasks are already available.

Due to its safety constraints, the software in vehicles needs to be of very high quality. This will prove even more true for autonomous vehicles, which will have the responsibility to assess the real world around them to decide a course of action while always meeting the safety requirements. For this reason it is imperative to find and enable a process that allows continuous software quality improvements, possibly even after the vehicle is sold to the customers.

Continuous Experimentation (CE) is an Extreme Programming practice that could satisfy these needs by running so-called “experiments” to collect meaningful data. These experiments are usually either variants of the deployed software or additional software features. The goal is to collect and use the resulting real-world data in order to decide in an objective way which of the possible variants or features is the most successful one. A CE setup begins with the target base divided into sets: one control set, running unmodified software, and one or more experimental sets, each of which runs one experiment. The software in all sets then collects relevant usage and performance data that is relayed back to the developers. The best-performing set determines which software variant or feature will be further developed and deployed to all the other targets.

CE is increasingly adopted in the context of software-intensive web-based applications, and the current state-of-practice is outlined in Section 2. With a focus on autonomous vehicles, we outlined in our previous work the design criteria for the software architecture to enable experimentation on Cyber-Physical Systems (CPS) as well [1]. However, challenges related to safety considerations are still unresolved and pose a significant obstacle for the adoption of software experimentation on vehicles. Scarcity of resources also plays an important role in this sense, since the hardware in the car is carefully dimensioned in terms of performance to provide “just enough”. Further challenges, like scalability issues when several systems conduct experiments, have also been identified in our previous study [2].

The Research Goal of this work is to assess the challenges related to the scarcity of resources that prevent the widespread adoption of CE in the automotive context, and to propose strategies to overcome them.

This goal is further elaborated into the following Research Questions: RQ RQ

Several works in the literature focus on Continuous Experimentation. One of these is Fagerholm et al. [3], which describes a CE model that takes into account the roles, tasks, infrastructure and information artifacts involved in this practice. In that paper, the authors developed and extended their model, validating it against the results of two empirical case studies conducted in startup companies.

Another article of interest is Olsson and Bosch [4], which describes the steps that should be taken to move a traditional software development process to a “continuous” one. These steps involve the gradual introduction of Agile practices and the modification of the organization and its strategies in order to align them with those that better support continuous product evolution and delivery.

Several articles related to CE report the advancements and characteristics of the experimentation processes and platforms in industrial settings. An example of these works is Tang et al. [5], who described the experimental setting at Google Inc. where, in order to improve the experimentation process and execution, experiments that involve independent factors are overlapped. Further examples are Kohavi et al. [6], who described Microsoft Bing’s own solution to run “over 200 experiments concurrently”, and Amatriain [7], who outlined Netflix’s approach to experimentation.

To the best of the authors’ knowledge, and perhaps hinting at the novelty of the field, no related work exists on Continuous Experimentation in the context of CPS: searches of some of the major academic databases, i.e. IEEE Xplore, ACM Digital Library, Scopus, and Web of Science, returned either unrelated results or none at all at the time of writing.
Running experimental software alongside production software requires additional computational resources. In contrast to web-based applications running in server farms, where additional virtual servers can be spawned if needed, acquiring additional computational power in CPS is not trivial, as their hardware cannot be changed after delivery to the customers.

To address these limitations, different execution strategies for acquiring unused computational power are proposed, taking into consideration different initial conditions that we have explored in the context of one of our research projects [8]. These strategies are explained in the following paragraphs and depicted in Fig. 1.

The automotive software in the proposed execution scenarios is assumed to be structured in modules, which are recurrently executed in time slots, either data- or time-triggered [9]. This means, respectively, that a module is executed either whenever new information arrives, or at a fixed frequency even if new data has not been gathered or if new data was queued waiting to be processed. The ideal way to test an experimental version of a production software module would be to run it in parallel to the production version in order to provide the same input to both modules. However, due to safety reasons and lack of computational resources, the experimental module may be forced to run on a less frequent schedule than the production module, and its communication capabilities may be reduced (for example, its output could be logged instead of forwarded to the intended recipients). In order to make the experimental software “believe” that it is being run without such handicaps, it is required to encapsulate the time and the communication resources that the software modules can access.

Due to the level of control needed over the software modules, in the authors’ understanding it is not enough to simply delegate the experiment’s execution schedule to the operating system’s Process Scheduler. Firstly, the choice of whether to run an experimental module and what execution schedule to adopt depends on several factors that are only known at high levels of abstraction. Secondly, and more importantly, executing an experiment can imply the execution of a software module at the potential “expense” of another selected one when computational resources are scarce, and unfairly favoring one software module over another goes against the Process Scheduler’s goal of serving resources fairly among all processes.

In the following, the identified execution strategies are described.

Parallel Execution.
In the simplest case, even though either time or computational resources are scarce on a particular core or processor alongside the production module, a third software module can be paused or stopped in order to reuse its resources to run the experiment. In this case it is possible to assume that an unused processor is available, and the experimental module can be executed in parallel to the production module. As both modules run on independent computing units, they are not necessarily coupled in terms of execution frequency. This case has been described for completeness but it is unlikely to be applicable.
Serial Execution.
In the typical case that there is no additional computing unit available to independently execute an experimental module, the computing time needed by the experiment could come from the unused time of a production module. In this case the experimental module could be executed serially, i.e. always after the production module has finished its computation and until the production module is executed again in its next time slot. When production and experimental modules are functionally related and are supposed to operate as synchronously as possible, two cases with different implications can be identified, depending on whether or not the experimental module can conclude its calculations in the unused time left in the production module’s time slot. In the first case, the experimental module can finish its tasks inside the time window left over by the production module. In the second case, the time left unused by the production module is not enough for the experimental module to complete its operations, which results in an interruption of the experimental module. It is worth noting that whenever the execution of the experimental module needs to be stretched over two non-contiguous time slots due to the lack of unused time in the current slot, the experimental module will be executed less frequently than the production module, potentially resulting in time synchronization issues and affecting the comparability of metrics in the case of A/B testing.
Downsampled Execution.
The third execution strategy, called downsampling, is applicable if there is no additional computing node available and no computation time is left in the time slice of a module. As computational power on cars is limited, it can also be expected to be the most likely applicable strategy. This approach is based on the assumption that conditions exist under which the execution of a production module can periodically be skipped (analogous to suspending the production module from time to time), freeing computational resources to be used for experimentation purposes. Skipping execution cycles of a production module may compromise safety-critical aspects of the vehicle, hence great care must be taken to ensure that the planned downsampling is safe. A possible way to ensure its safety could be to run preliminary tests before applying this strategy, to verify in advance that it is viable in practice and at which rate the production module can skip computation cycles before dependent modules downstream in the data-processing chain are affected. Furthermore, the conditions under which the downsampling rate has been tested need to be fixed, and the experiment must only be carried out when the vehicle operates under those conditions. As the time slots available to the experimental module are non-contiguous with this strategy, the considerations about time synchronization and logic coherence that were expressed for the serial execution strategy apply to this case as well.
[Figure 1: timelines over time t for three panels, Down-sampling (CPU0), Serial (CPU0), and Parallel (CPU0 and CPU1); each time slot is labelled P or E.]
Fig. 1. Execution strategies. “P” and “E” stand for Production and Experimental software module. Picture based on Kirchner [8].
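The three strategies can be illustrated with a minimal Python sketch. It is not taken from the original work: the function names, the `skip_every` parameter, and the microsecond budgets are assumptions introduced here for illustration only. The first function lays out which module occupies each time slot; the second computes over how many production slots one experimental iteration is stretched under serial execution when only the leftover time of each slot is available.

```python
from typing import Dict, List


def timeline(strategy: str, n_slots: int, skip_every: int = 3) -> Dict[str, List[str]]:
    """Return, per computing unit, which module runs in each time slot.

    "P" is the production module, "E" the experimental one, and "PE"
    means E runs in the time P leaves unused within the same slot.
    """
    if strategy == "parallel":
        # Each module owns a computing unit; no coupling in frequency.
        return {"CPU0": ["P"] * n_slots, "CPU1": ["E"] * n_slots}
    if strategy == "serial":
        # E always follows P inside the same slot.
        return {"CPU0": ["PE"] * n_slots}
    if strategy == "downsampled":
        if skip_every < 2:
            raise ValueError("skip_every must be at least 2")
        # Every skip_every-th production cycle is skipped in favor of E.
        return {"CPU0": ["E" if (i + 1) % skip_every == 0 else "P"
                         for i in range(n_slots)]}
    raise ValueError(f"unknown strategy: {strategy}")


def slots_per_experiment_run(slot_us: int, production_us: int, experiment_us: int) -> int:
    """Number of production slots one experimental iteration spans (serial case)."""
    leftover_us = slot_us - production_us
    if leftover_us <= 0:
        raise ValueError("no unused time in the slot; serial execution is not applicable")
    # Integer ceiling division: partial slots still count as a full slot.
    return -(-experiment_us // leftover_us)
```

With a hypothetical 10 ms slot of which production uses 7 ms, an experiment needing 5 ms spans two slots, so it runs at half the production frequency, which is the synchronization concern raised for the serial strategy.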
The proposed strategies may also be composed and adjusted at runtime. For example, an experiment might initially require the analysis of relatively small amounts of data, thus making the serial execution strategy feasible. If, however, more intensive calculations were later required and the conditions allowed it, the strategy could be changed to downsampling in order to allocate more time to each experimental iteration at the cost of a less frequent execution schedule.
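A runtime choice between strategies could look like the following sketch. The decision order, the `SlotBudget` fields, and the rule that serial execution is preferred only when the experiment fits entirely into the leftover time are assumptions made here for illustration; the text does not prescribe a selection policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Strategy(Enum):
    PARALLEL = "parallel"
    SERIAL = "serial"
    DOWNSAMPLED = "downsampled"


@dataclass(frozen=True)
class SlotBudget:
    """Hypothetical description of the resources currently available."""
    slot_us: int             # length of the production module's time slot
    production_us: int       # time the production module uses per slot
    spare_unit: bool         # an idle computing unit exists
    downsampling_safe: bool  # skipping cycles was validated for the current conditions


def choose_strategy(budget: SlotBudget, experiment_us: int) -> Optional[Strategy]:
    """Pick an execution strategy, or None if the experiment must not run."""
    if budget.spare_unit:
        return Strategy.PARALLEL
    if experiment_us <= budget.slot_us - budget.production_us:
        # The experiment fits into the time left over by production.
        return Strategy.SERIAL
    if budget.downsampling_safe and experiment_us <= budget.slot_us:
        # A whole slot is freed by skipping one production cycle.
        return Strategy.DOWNSAMPLED
    return None
```

Re-evaluating `choose_strategy` whenever the experiment's time demand or the operating conditions change reproduces the example above: a light experiment starts in serial mode and moves to downsampling once it needs more time per iteration, provided downsampling has been validated for the current conditions.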
Section 3 has identified three potential strategies to execute an experimental software module next to a production module. Furthermore, we have pointed out that the production and experimental modules need to be decoupled from the real system time and from their respective potential communication vector with downstream modules. The reason is that the production and experimental modules should believe that they are triggered at the very same point in time by the same input data, while the execution strategy in effect must be entirely transparent for the modules. Also, the communication of data into and from the production and experimental modules must be controlled entirely. While the ingoing communication may not be critical, strict control of any outgoing communication is needed to avoid unwanted interference with the dependent downstream software modules. Furthermore, any time stamping related to sending data from the production and experimental modules to other modules may need to be adjusted to make the rest of the system believe that these modules have not been executed with different execution strategies. The need to rewrite time stamp information for communication is another indicator of why the regular Process Scheduler provided by the operating system does not meet the requirements for conducting experiments in a resource-constrained computational environment.

Chalmers University of Technology hosts a vehicle laboratory called Revere, “Resource for Vehicle Research” [10], with the goal of conducting and developing research for self-driving vehicles and active safety. The Revere laboratory uses our middleware OpenDaVINCI, which allows the realization of distributed microservices communicating via Protobuf-encoded messages. The activation of software modules realized with OpenDaVINCI complies with the time-triggered or data-triggered principle described in Section 3. OpenDaVINCI by default encapsulates the system time via an object called TimeStamp that either invokes the POSIX time API, returning the “real” time, or transparently replaces the real system clock with a virtual one. The communication facilities available to the software modules are also encapsulated. OpenDaVINCI uses UDP multicast by default as its communication principle. In OpenDaVINCI a so-called ContainerConference is provided, as the data to be exchanged is wrapped into a Container holding the actual data and some meta-information such as time stamps for sending, receiving, and the sample time point.

To enable CE using these building blocks, both the production and experimental modules will be handled by an Experimenter software module that will manage them to realize the aforementioned execution strategies by forwarding input data to both modules, activating and suspending them according to the respective execution strategy, and receiving data containers to be distributed for either delivery or logging purposes.
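The role of the Experimenter can be sketched as follows, here for the downsampling strategy only. This is a hedged Python illustration and not OpenDaVINCI code: the classes `Container`, `VirtualClock`, and `Experimenter`, their fields, and the `skip_every` parameter are hypothetical stand-ins for the encapsulated time and communication resources described above.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Container:
    """Hypothetical stand-in for a data container with meta-information."""
    payload: Any
    sample_time: float      # when the data was sampled
    sent_time: float = 0.0  # set by the Experimenter, not by the module


class VirtualClock:
    """Clock handed to the modules instead of the real system clock."""

    def __init__(self) -> None:
        self._now = 0.0

    def set(self, t: float) -> None:
        self._now = t

    def now(self) -> float:
        return self._now


Module = Callable[[Container, VirtualClock], Any]


class Experimenter:
    """Activates one module per slot and controls all outgoing data."""

    def __init__(self, production: Module, experiment: Module, skip_every: int = 3) -> None:
        if skip_every < 2:
            raise ValueError("skip_every must be at least 2")
        self.production = production
        self.experiment = experiment
        self.skip_every = skip_every
        self.clock = VirtualClock()
        self.delivered: List[Container] = []  # forwarded downstream
        self.logged: List[Container] = []     # experimental output, log only
        self._cycle = 0

    def on_input(self, data: Container) -> None:
        self._cycle += 1
        # Both modules perceive the sample time of their input as "now".
        self.clock.set(data.sample_time)
        run_experiment = self._cycle % self.skip_every == 0
        module = self.experiment if run_experiment else self.production
        result = module(data, self.clock)
        # The send time stamp is written from the virtual clock, hiding
        # the actual execution schedule from the rest of the system.
        out = Container(result, data.sample_time, sent_time=self.clock.now())
        (self.logged if run_experiment else self.delivered).append(out)
```

Keeping the routing decision inside the Experimenter means that neither module can reach downstream modules on its own: production output is delivered, while experimental output is only logged.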
For the current state-of-practice of CE in web-based systems, which usually involves validation of user feedback, small-scale approaches are not viable since their results are less generalizable. However, in the automotive domain the experiments would focus on algorithmic problems and their verification in realistic scenarios, making the results easier to generalize even if collected by a small number of vehicles.

OpenDaVINCI: http://code.opendavinci.org

This work proposes a new element to consider in order to apply CE on cyber-physical systems, which is the execution strategy. This element is introduced to account for the possible lack of computational resources, and can critically impact the amount of collected results or the overall viability of the experiments. For this reason we propose an addition to the CE model proposed by Fagerholm et al. [3] when it involves experiments on CPS: the domain expert, a person or team with deep knowledge of the system and its capabilities. The domain expert’s main role is to advise the experimenter and data scientist while devising and planning the experiment to be run. The insights this figure could provide would not be limited to the choice of the execution strategy but could range, for example, from deciding whether an experiment could be run “live” on customers’ vehicles to whether preliminary measurements would be needed to ensure its viability. As a direct application of “web-based” continuous experimentation would prove difficult or even impossible in the context of CPS due to the several key differences between the two fields, we claim that the presence of an intermediary figure can smooth or in some cases enable the experimentation process thanks to its knowledge of both the system and the proposed techniques to obtain the additional computational time needed to run experiments.

We report on threats to the validity of this study according to Runeson and Höst [11]. Our current work in the lab concerns the validation of the proposed strategies using our self-driving vehicles to increase the external validity of the suggested architectural design considerations. It is also impossible to completely eliminate the threat to reliability, i.e. whether different researchers would come up with the same solution if they were to assess the same problem. To mitigate this threat, we carefully described our reasoning to motivate our suggested design decisions.

The present work aims at contextualizing the Continuous Experimentation process in the Cyber-Physical System field, assessing the lack of surplus resources that would be needed for the system to run the additional experimental code.

In order to address this deficit, three different execution strategies have been proposed that would allow an experimental software module to run alongside a production module. The different characteristics of the strategies enable the adaptation of the solution to different application scenarios.

In order for a software architecture to enable and make use of the proposed strategies, it must be possible to strictly control two crucial types of information that are accessible to both the production and experimental software modules, which are the time and the communication resource. Controlling the modules’ access to these resources acts as an enabling criterion ensuring the transparency of the execution strategy to the software modules themselves.

Future efforts will focus on evaluating the contributions in a setting closer to the specific challenges encountered in industry, by continuing the research in the COPPLAR project, which is Chalmers University of Technology’s contribution to the DriveMe context. The DriveMe project is an autonomous driving pilot project by Volvo Cars that aims at releasing 100 cars with self-driving capabilities on selected public roads in 2017.

Acknowledgment
This work has been supported by the COPPLAR Project – CampusShuttle cooperative perception and planning platform [12], funded by Vinnova FFI, Diarienr: 2015-04849.
References
1. Giaimo, F., Berger, C.: Design criteria to architect continuous experimentation for self-driving vehicles. In: Proceedings of the International Conference on Software Architecture. ICSA ’17, New York, NY, USA, IEEE (2017)
2. Giaimo, F., Yin, H., Berger, C., Crnkovic, I.: Continuous experimentation on cyber-physical systems: Challenges and opportunities. In: Proceedings of the Scientific Workshop Proceedings of XP2016, ACM (2016) 14
3. Fagerholm, F., Guinea, A.S., Mäenpää, H., Münch, J.: The right model for continuous experimentation. Journal of Systems and Software (2017) 292–305
4. Olsson, H.H., Bosch, J.: Climbing the stairway to heaven: evolving from agile development to continuous deployment of software. In: Continuous software engineering. Springer (2014) 15–27
5. Tang, D., Agarwal, A., O’Brien, D., Meyer, M.: Overlapping experiment infrastructure: More, better, faster experimentation. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM (2010) 17–26
6. Kohavi, R., Deng, A., Frasca, B., Walker, T., Xu, Y., Pohlmann, N.: Online controlled experiments at large scale. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’13, New York, NY, USA, ACM (2013) 1168–1176
7. Amatriain, X.: Beyond data: from user information to business value through personalized recommendations and consumer science. In: Proceedings of the 22nd ACM international conference on Information & Knowledge Management, ACM (2013) 2201–2208
8. Kirchner, C.: Assessing Safety Aspects for Continuous Experimentation on the Example of Automated Driving. Master’s thesis, RWTH Aachen (February 2017)
9. Navet, N., Simonot-Lion, F.: Automotive embedded systems handbook. CRC press (2008)
10. ReVeRe - Research Vehicle Resource at Chalmers. Accessed 2017-01-14.
11. Runeson, P., Höst, M.: Guidelines for conducting and reporting case study research in software engineering. Empirical software engineering (2) (2009) 131
12. COPPLAR Project - CampusShuttle cooperative perception and planning platform. Accessed 2017-01-14.