The Dark Side of Unikernels for Machine Learning
Matthew Leon
Vanderbilt University
Nashville, Tennessee, [email protected]
Abstract: This paper analyzes the shortcomings of unikernels as a method of deployment for machine learning inference applications and provides insights and analysis on future work in this space. The findings advocate for a tool that manages dependent libraries in a unikernel, enabling a more ergonomic build process while taking advantage of the inherent security and performance benefits of unikernels.
Index Terms: unikernel, virtualization, Xen, kernel samepage merging, Docker, containerization, lightweight operating system, library operating system, cloud computing
I. INTRODUCTION
Virtualization technology is used in datacenters around the world to provide availability, scalability, and security to millions of client workloads. While virtualizing an entire computer and running everything on top of it, including the OS image, libraries, and application code, is still popular, many alternative methods have emerged over the last decade, each promising to increase security and improve resource utilization. There are numerous benefits to improving resource utilization of the host system. For users, fewer resources used by the host operating system means more resources available to application code. For the large corporations hosting public clouds, maximizing utilization on a small number of machines means lower costs, as datacenters consume massive amounts of power [1]. Recently, containerization technology has been adopted as a way to reduce the number of virtualization layers in the modern datacenter, with services like Docker providing benefits including easier deployment, consistency between the development and production environments, limits on resource usage, and sandboxing of applications for better security. Containerization cuts down on costs by reducing duplication of the operating system: rather than running several different stacks all virtualized on top of a hypervisor, a user can run a single operating system and divide up its resources among several containers. Unikernels are yet another lightweight virtualization technology increasingly being adopted in cloud data centers.

II. WHAT ARE UNIKERNELS?

Where containers share one operating system among many applications, unikernels take the opposite approach. With unikernels, the operating system is eliminated entirely: the application code itself is augmented with the minimal set of code necessary to interface with the hypervisor and is then run directly as a bootable image on top of a hypervisor. The compactness of this system can result in numerous benefits over containers and fully virtualized Linux servers. In one study, boot times as low as 50 ms were achieved, as well as lower memory usage and reduced latency due to a zero-copy network implementation [2]. The significant benefits of unikernels are discussed in Section V-B.

III. STUDY GOALS
This study aimed to analyze the state of a few different unikernels and their environments, comparing them to traditional methods of virtualization in terms of developer experience, performance, flexibility, security, and feasibility for adoption. Specifically, the study was conducted through a use case in which we wanted to understand whether it was feasible to deploy machine learning image classification inference inside a unikernel. To that end, we implemented an image classification API capable of receiving an image via HTTP and responding with an inference as to the contents of the image.

IV. REPORT ORGANIZATION
This report first outlines preliminary knowledge about unikernels, including major vendors of unikernel technology as well as an overview of the pertinent differences from ordinary virtualization solutions. The next section provides an overview of the work done in the process of evaluating the maturity of unikernels as a modern, lightweight alternative to containerization technology. Finally, the paper concludes with an analysis of the hurdles that must be addressed before unikernels are sufficient for a modern deployment.

V. UNIKERNELS IN-DEPTH
A. What are Unikernels?
Unikernels can be grouped into two distinct categories. The first comprises unikernels that function as a library operating system. OSs in this group, such as IncludeOS [3], HaLVM [4], and MirageOS [5], cannot run full executable programs; instead, applications are written in and run in an augmented runtime environment that implements operating system functions, such as I/O. The other group of unikernels, such as RumpRun [6] and Nanos [7], provides application binaries with an entire POSIX-compatible runtime environment which can run arbitrary ELF executables. In addition to these runtime environments, several build, orchestration, and packaging tools are available, such as ops [7], Unikraft [8], and UniK [9]. This study investigates the feasibility and shortcomings of using these tools to deploy a deep neural network inference solution available via a web API.

B. Benefits of Unikernels
The single address space architecture of unikernels provides numerous benefits that are not achievable with conventional preemptive multitasking operating systems. Firstly, the total attack surface is much smaller with a unikernel. Bratterud, Happe, and Duncan highlight a 92% reduction in total bytes of code in a running unikernel, which they translate to a 92% smaller attack surface [10]. The lack of a shell prevents an entire class of vulnerabilities, while a single address space allows for compile-time address space layout randomization, which is more performant than the runtime alternative. In addition to the security implications of a single address space, the removal of kernel space eliminates time spent in kernel-space context switches as well as scheduling interrupts by the guest OS. Instead, scheduling and load balancing are handled entirely by the hypervisor.

In terms of load balancing itself, unikernels offer distinct benefits for web-related tasks, especially due to their startup time. Because the unikernel itself is the executable, no file systems need to be initialized, and the kernel code occupies little space, so the only boot step necessary is initializing the network interface. In a hypervisor environment, this allows the unikernel to be booted in response to an incoming request in time to handle that request. Such a fast boot time allows horizontal scaling with the granularity of individual requests. This instant availability enables applications such as fog deployment for IoT, which was investigated by Cozzolino, Ding, and Ott [11]. That work is being extended, concurrently with this research, as infrastructure for smart city monitoring of ongoing road hazards [12].

VI. INSIGHTS FROM OUR STUDY
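For orientation, the kind of service under test (an HTTP endpoint that receives an image and responds with an inference about its contents) can be sketched in a few lines. This is a minimal sketch using only the Python standard library and is not taken from the study's source code: the names `classify`, `InferenceHandler`, and `serve_once`, the `/classify` path, and the JSON response shape are all hypothetical, and `classify` is a placeholder that merely inspects the file signature where a trained model would normally run.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def classify(image_bytes: bytes) -> str:
    """Hypothetical stand-in for a model: labels the upload by its file signature."""
    if image_bytes.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png-image"
    if image_bytes.startswith(b"\xff\xd8\xff"):
        return "jpeg-image"
    return "unknown"


class InferenceHandler(BaseHTTPRequestHandler):
    """Accepts raw image bytes in a POST body and answers with a JSON label."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        body = json.dumps({"label": classify(image_bytes)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging in this sketch


def serve_once(payload: bytes) -> dict:
    """Start the server on an ephemeral local port, POST one payload, return the reply."""
    server = HTTPServer(("127.0.0.1", 0), InferenceHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/classify"
        request = urllib.request.Request(url, data=payload, method="POST")
        # An empty ProxyHandler keeps the request on the loopback interface.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            return json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()
```

Calling `serve_once` with the bytes of an image file returns the decoded JSON reply. In a real deployment a framework such as PyTorch would take the place of `classify`, and that substitution is where the native-library dependencies discussed in this section enter.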
Supplementary source code materials and motivating examples for the following findings may be found at [13]. Many simple implementations of image classification are available on GitHub, such as [14]. To evaluate the effectiveness of unikernels in different environments and implementations, three different machine learning frameworks were tested: TensorFlow, PyTorch, and TensorFlow.js (TensorFlow and TensorFlow.js are included separately as they do not share bindings to the same underlying library; they are completely separate implementations, in two different languages, of the same API). IncludeOS was used in conjunction with TensorFlow, and RumpRun and Nanos were both used to test each of PyTorch and TensorFlow.js.

Our findings revealed that none of the tested solutions were successful. The shortcomings varied by implementation: TensorFlow and PyTorch struggled with linking issues inside the unikernel, and TensorFlow.js struggled to fetch the trained model via URL due to the lack of a DNS resolver in the unikernel environment. When the Node.js extension to TensorFlow.js was added to allow loading the model from within the image, the unikernel failed due to the lack of support for node-gyp (C/C++ native binding) modules inside the unikernel. We note that TensorFlow.js could be extended to support loading from file without involving node-gyp, but performing large modifications to the source of the application was out of scope for this study's investigation of unikernels as an alternative deployment environment. PyTorch encountered similar issues, as it is an optimized runtime with most of the deep learning code implemented in C: the modules for the library could not be loaded inside the unikernel environment.

Seeing as most of the encountered issues were due to the lack of interoperability between native libraries and interpreted code, the next approach we took was compiling TensorFlow into an application built with IncludeOS, the library operating system capable of transforming the C/C++ application it is built with into a Xen-bootable executable. Unfortunately, linking also became an issue in this case. The publicly available distributions of TensorFlow depend on over 10 shared libraries, whereas IncludeOS applications must be linked statically, which no version of TensorFlow supports (nor is it possible in an unsupported fashion). Copying the shared libraries into the image from the system used to build TensorFlow resulted in a bootable system, but execution failed due to missing symbols in the outdated version of glibc used in the host system. No other languages were tested after these failures, as all languages link to the C library, with the only exception being the previously mentioned TensorFlow.js without Node.js extensions, which is designed for the browser environment. Whether other, smaller toolkits would have been more successful was not explored; mlpack [15] appears to be a good candidate for future research, as it may allow static linking [16].

VII. ANALYSIS AND POSSIBILITIES FOR FUTURE WORK
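The linking failures reported in Section VI turn on whether a binary is statically or dynamically linked, which can be checked before attempting to package it. The following is a minimal sketch, not a tool used in the study: it reads the program header table of a 64-bit little-endian ELF image and reports whether a `PT_INTERP` entry is present, which is how an executable requests a dynamic loader. The function names are hypothetical, and `build_minimal_elf` only fabricates header bytes for demonstration; it does not produce a runnable program.

```python
import struct

PT_LOAD = 1     # loadable segment
PT_DYNAMIC = 2  # dynamic linking information
PT_INTERP = 3   # path of the requested dynamic loader


def program_header_types(elf_bytes: bytes) -> list:
    """Return the p_type of every program header in a 64-bit little-endian ELF image."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if elf_bytes[4] != 2 or elf_bytes[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 header layout: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", elf_bytes, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 54)
    return [
        struct.unpack_from("<I", elf_bytes, e_phoff + i * e_phentsize)[0]
        for i in range(e_phnum)
    ]


def needs_dynamic_loader(elf_bytes: bytes) -> bool:
    """True when the image asks for a dynamic loader via a PT_INTERP entry."""
    return PT_INTERP in program_header_types(elf_bytes)


def build_minimal_elf(p_types) -> bytes:
    """Fabricate a header-only ELF64 image with the given segment types (demo only)."""
    header = bytearray(64)
    header[0:4] = b"\x7fELF"
    header[4] = 2  # ELFCLASS64
    header[5] = 1  # ELFDATA2LSB (little-endian)
    struct.pack_into("<Q", header, 32, 64)                  # e_phoff: table follows header
    struct.pack_into("<HH", header, 54, 56, len(p_types))   # e_phentsize, e_phnum
    # Each 56-byte entry starts with its 4-byte p_type; remaining fields are zeroed.
    table = b"".join(struct.pack("<I", t) + bytes(52) for t in p_types)
    return bytes(header) + table
```

Against a real file, the check would be `needs_dynamic_loader(open(path, "rb").read())`, where `path` is whatever binary is being considered for packaging. A positive result suggests the binary expects a loader and shared libraries to be present in the image, which is the situation that caused the failures described above.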
Unikernels, when compared to a deployment solution using Docker containers or a native Linux virtual machine, still have many hurdles to overcome before they can claim full parity in terms of supported use cases. Because of the decades-long prevalence of ecosystems that rely on dynamic linking as a way to quickly fix security issues and reduce compiled code duplication across binaries, even common libraries like TensorFlow do not support static linking, which is unfortunate news for any application developer looking to use these libraries in a unikernel. There are ways to build a static library manually, such as by packing GCC's object file output with tools such as ar, but these are steps for build system maintainers rather than application developers [17]. It is this researcher's opinion that unikernels would benefit most from a robust build tool which handles dependency bundling inside the unikernel environment, much like Docker's build command or Ansible scripts. With access to a layered build system, unikernels could provide a compelling base layer for virtualization due to their lightweight and secure runtimes; however, dependencies in Docker are handled through Linux distribution archives, which would be lacking in the environment of a unikernel. Without such tools, the art of manually packing a static archive for linking, or building each shared library with the correct version of glibc, will remain out of reach for all but the most skilled DevOps engineers deploying in the most demanding situations, where the significant cost and performance benefits of unikernels may offset the additional development work required for deploying the unikernel. The build tools tested during this study, UniK and ops, were both unable to contend with library dependencies in an efficient manner.

Beyond the deployment itself, there are supplementary considerations that must be investigated in terms of the performance implications of unikernels. Docker's AUFS allows for something which unikernel images, in their current, statically linked form, do not: deduplication of layers. For example, if the unikernel is being used for a microservice-based web API, it would not be uncommon for there to be two endpoints that look very similar from a dependency point of view (endpoints involved with creating and updating a user's profile, for one), which would duplicate all library code in each binary. In a Docker deployment, the libraries for the operating system would be shared on disk, as the containers are stored as layers and extended with each command executed in the Dockerfile. This benefit extends to memory as well: different Docker containers descending from the same parent layers are able to share the same pages in memory due to Kernel Samepage Merging [18], [19]. Unikernels, on the other hand, may be able to share less memory due to differences in how private pages may be accessed by unikernels that share a majority of their code but are compiled with different static dependencies. This is an area requiring further research to experimentally determine the extent of the memory savings, and the concept of copy-on-write deduplication of memory pages is currently subject to security concerns discovered along with side-channel attacks [19], [20].

VIII. CONCLUSION
Unikernels present compelling benefits in terms of performance and security for deploying applications to the fog or the cloud, but currently face issues with regard to managing dependencies, updates, and compatibility with third-party libraries. A solution à la docker build for unikernels, providing a method for dependency management as well as possibly for sharing and extending images others have made, may provide a more secure and performant platform for future cloud computing needs.

REFERENCES

[1] Direct Energy Business. Powering a Google search: The facts and figures. [Online]. Available: https://business.directenergy.com/blog/2017/november/powering-a-google-search
[2] A. Madhavapeddy, R. Mortier, C. Rotsos, D. Scott, B. Singh, T. Gazagnaire, S. Smith, S. Hand, and J. Crowcroft, “Unikernels: Library operating systems for the cloud,” SIGARCH Comput. Archit. News, vol. 41, no. 1, pp. 461–472, Mar. 2013. [Online]. Available: https://doi.org/10.1145/2490301.2451167
[3] N. Hussein, “IncludeOS: a unikernel for C++ applications,” LWN.net. [Online]. Available: https://lwn.net/Articles/728682/
[4] GaloisInc. The Haskell lightweight virtual machine (HaLVM) source archive. [Online]. Available: https://github.com/GaloisInc/HaLVM
[5] Xen Foundation and Linux Foundation. (2017) MirageOS. [Online]. Available: https://mirage.io/
[6] A. Kantee, “Flexible operating system internals: The design and implementation of the anykernel and rump kernels; flexible operating systems: Design and implementation of the kernel and stubkernels,” p. 358, 2012. [Online]. Available: http://urn.fi/URN:ISBN:978-952-60-4917-5
[7] NanoVMs Inc., “NanoVMs in depth,” Tech. Rep., 2017. [Online]. Available: https://nanovms.com/whitepapers
[8] NEC Laboratories Europe GmbH. (2020) Home unikraft. [Online]. Available: http://unikraft.org/
[9] Solo IO. The unikernel & microVM compilation and deployment platform. [Online]. Available: https://github.com/solo-io/unik
[10] A. Bratterud, A. Happe, and R. Duncan, “Enhancing cloud security and privacy: The unikernel solution,” in Eighth International Conference on Cloud Computing, GRIDs, and Virtualization, 19 February 2017 - 23 February 2017, Athens, Greece, ser. Cloud Computing IARIA. Curran Associates, Feb. 2017, pp. 79–86.
[11] V. Cozzolino, A. Y. Ding, and J. Ott, “FADES: Fine-grained edge offloading with unikernels,” in Proceedings of the Workshop on Hot Topics in Container Networking and Networked Systems, ser. HotConNet '17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 36–41. [Online]. Available: https://doi.org/10.1145/3094405.3094412
[12] V. Cozzolino, J. Ott, A. Ding, and R. Mortier, “ECCO: Edge-cloud chaining and orchestration framework for road context assessment,” in Proceedings of the 2020 IEEE/ACM Fifth International Conference on Internet-of-Things Design and Implementation, 2020.
[13] M. Leon. Supplementary materials for “The Dark Side of Unikernels for Machine Learning”. [Online]. Available: https://github.com/leonm1/unikernel-research-2020
[14] avinassh. Pytorch flask API. [Online]. Available: https://github.com/avinassh/pytorch-flask-api
[15] mlpack Contributors. (2020) mlpack. [Online]. Available: https://github.com/mlpack/mlpack
[16] mlpack Contributors. mlpack CMakeLists.txt. [Online]. Available: https://github.com/mlpack/mlpack/blob/1c25a1bda52832841efec4e41477bfcfbbbf6f3f/CMakeLists.txt
[…] 2014, pp. 379–384.
[20] K. Suzaki, K. Iijima, T. Yagi, and C. Artho, “Software side channel attack on memory deduplication,”