Comparative Study of Virtual Machines and Containers for DevOps Developers
Instructor: Prof. Richard Martin
Sumit Maheshwari, Saurabh Deochake, Ridip De, Anish Grover
Abstract: In this work, we plan to develop a system to compare virtual machines with container technology. We devise ways to measure the administrator effort of containers vs. virtual machines (VMs). The metrics tested include the human effort required, ease of migration, resource utilization, and ease of use of containers and virtual machines.
Keywords: DevOps, Virtual Machines, Containers, Cloud Computing, Internet Services
I. INTRODUCTION
Virtualization is a key technique driving multiple research areas in today's world. Gone are the days when a cluster of nodes in a cloud computing environment was set up using physical servers in a data center. Since the advent of virtualization, a term that goes hand in hand with cloud computing and is often used interchangeably with it even though the two are not the same, cloud infrastructure costs have gone down considerably. Virtualization is a technology in which a virtual version of machine hardware, storage devices and network devices is created using software called a hypervisor. Industry accepted virtual machines as the de facto deployment method for production software. But as the cloud computing industry matured, it was noticed that deploying virtual machines caused overhead. For a few virtual machines the overhead seems acceptable, but when hundreds of virtual machines are deployed in production, the overhead adds up to a lot. This is when some companies and open source contributors came up with the idea of a virtual environment instead of a virtual machine. Building on the idea behind the Unix 'chroot' command, open source contributors came up with projects like LXC (Linux Containers) and FreeBSD Jails. With a container, you do not have to worry about a full-blown virtual machine and the infrastructure costs and overhead associated with it. Rather, you just have a virtual environment, and each container runs as a process [1]. This revolutionized the operations sector of software development. Fig. 1 and Fig. 2 show the system using virtual machines and containers respectively, each running an Apache web server.
Fig. 1. System Architecture using Virtual Machines
Fig. 2. System Architecture using Containers
In this project, we discuss the trade-offs of using virtual machines and containers for software deployment from a DevOps perspective. Along with benchmarking virtual machines and containers, we also study the impact of using Docker containers to deploy software to clients. It is commonly claimed that containers have reduced deployment time from days to minutes compared to virtual machines. Our project tries to understand the reasoning behind this claim.
II. LITERATURE SURVEY
This section describes the need for isolation when running any service, and how modern virtualization techniques such as virtual machines and containers provide isolation at runtime.
A. Cloud Computing
Cloud computing is often confused with virtualization; the two technologies are similar but not the same. Virtualization is the technique that separates the physical architecture into various dedicated resources so that the hardware can be utilized fully. Cloud computing, on the other hand, is a service that results from the manipulation of the hardware carried out by virtualization. Today the two technologies are combined to get the best results, i.e. using a virtualized environment over the cloud. Virtualization allows you to run more than one operating system and multiple applications on the same server. To take advantage of these benefits, cloud computing technology saw a boom in 2012. The cloud brings several benefits: it can be set up relatively quickly, and the servers, services and licenses are all provided. Small businesses benefit from the SaaS applications available today, which allow you to pay as you use the resources. While cloud computing and virtualization each have their benefits, we consider cloud computing an extension of virtualization.
B. Virtual Machine
A virtual machine is an application environment that imitates dedicated hardware. Using such an environment allows a user to utilize the resources of the machine efficiently. A virtual machine relies on specialized software, a hypervisor, which emulates the hardware resources and enables virtual machines to share them. This limits the cost of additional hardware by using the resources of a single machine efficiently. Reducing the amount of hardware reduces power and cooling demands, which in turn reduces the effort of managing resources. Vendors such as VMware, Oracle and Microsoft dominate the market with their products in this field.
C. Container
Containers are platforms for developing and deploying applications independently of the underlying infrastructure. The container methodology enables developers to deploy quickly and significantly reduces the delay between writing code and having it in production. Containers provide the ability to package and run an application in a loosely isolated environment, thus allowing many containers to run simultaneously on a given host. Unlike virtual machines, containers do not have the overhead of a hypervisor. Docker is one of the best-known container platforms. It also provides tools to manage the lifecycle of containers.
D. DevOps
DevOps is a product management methodology that aims at unifying application development (Dev) and operations (Ops). The main motive of this methodology is to support automation and monitoring of all steps in software construction, from building and testing to deployment. To achieve this, a DevOps engineer has the option to use either a virtual machine or a container. Virtual machines give high adaptability, while containers primarily concentrate on applications and their dependencies. Multiple containers can be deployed on a single host or virtual machine. Thus, there is a good chance that if the host goes down, all of its containers go down with it. If security and reliability are the main concern, VMs should be preferred. Containers, on the other hand, do not provide isolation as strong as that of VMs. Since containers run on a single kernel, they are easy to deploy and maintain.
III. CHOOSING BETWEEN CONTAINERS AND VIRTUAL MACHINES
Virtual machines and containers provide different ways to virtualize resources for running applications. In the case of a virtual machine, an infrastructure layer called the hypervisor partitions the server below the operating system, creating true virtual machines that share only hardware. In the case of a container, virtualization is done at the operating system level, where the operating system and some middleware are shared. Virtual machines offer higher flexibility: because the application runs as if on a bare-metal server, we can pick our own operating system and middleware, whereas with containers we have to choose a common operating system and middleware for our applications. So if you want a full platform to run multiple services, virtual machines are the better option, but if you want to deploy a scalable service on a distributed platform, containers should be used [5]. We present the following factors that determine which technology should be used and when:
A. Operating System Requirements
As a DevOps engineer, the admin can select an operating system and middleware of their choice for each virtual machine. This choice of operating system can be independent of the other VMs [1] running on the same server. This is not the case with containers: the admin has to provide a "common" operating system and middleware elements when running the applications, because each container uses the core server platform and shares it with other containers. Virtual machines are thus more flexible, mainly because the applications running in the guest environment behave as they would on a bare-metal server.
B. Security
A virtual machine draws a demarcation between the operating system and the physical hardware. An instance of a virtual machine has its own services such as a BIOS, virtualized adapters for network, storage units, CPU, and a replicated operating system. Therefore, applications running in a VM are oblivious to the host system and hardware resources, and their control over system resources is heavily restricted, which implies that the failure of one VM is less likely to affect other running VMs. Containers, on the other hand, share kernel resources and application libraries. Applications running in containers are thus system aware, and there is no hardware isolation, so a misbehaving application can potentially take control of system resources.
C. Scope of the application
One of the most important factors to keep in mind is the nature of the application. Containers can run in any environment regardless of the infrastructure, whereas virtual machines partition the server below the operating system and share only the hardware. Thus, VMs are a better choice for embedded systems and infrastructural applications, while containers can be used to run web applications and small databases that run on all platforms.
D. Size of the application
If there is a need to run the maximum number of instances of a particular application on the minimum number of servers, the best option is to run the application in containers [2]. Deploying multiple instances of a single application in multiple containers is less troublesome than doing so with virtual machines. One area in which containers have gained high usage is microservices, where each container runs a single service and can be scaled quickly [4]. Also, the hardware requirement for deploying multiple web servers in containers is significantly lower than on virtual machines. But if we have to run really large applications or databases that require multiple machines, virtual machines would be the better choice.
E. Service Model
Another factor to take into account when choosing between containers and virtual machines is the service model used to deploy and manage environments. Virtual machines are usually deployed using tools like VMware. These tools are responsible for creating the virtual machine and migrating it to other environments, but they are not that useful for application management with DevOps. Containers, on the other hand, allow software to be developed independently of the underlying hardware, which means that containers can be deployed quickly in comparison to virtual machines.
F. Requirements of the organizations
For organizations that need to run different applications on a variety of software platforms, virtual machines definitely have an edge over containers. Containers would be more difficult to use in this case because of the need to standardize on a single hosting platform. If the software depends on a specific version of the operating system and there is a need to run multiple instances of the application, containers should be preferred, as deployment is easy and resource utilization is optimal.
G. Overheads
Compared to virtual machines, containers have very little deployment overhead because they don't duplicate the platform software for every instance of the application. Thus it is possible to run more components per server with container technology. Also, the deployment and redeployment of applications or components is faster with containers. In addition, every user in a container environment shares the same instance of the OS, kernel and memory as well as the same network connection. Since an application instance uses only an instance of user space, the overhead of running multiple operating systems is avoided, which reduces CPU usage and improves performance. A new kernel is not invoked for each user session.
IV. SYSTEM AND TECHNOLOGY
• A. VirtualBox or QEMU/KVM as a virtual machine
• B. Docker Containers
• C. Kubernetes to orchestrate the containers
• D. Tunneling techniques
• E. Measurement tools
• F. Linux kernel programming, e.g. shell scripting
• G. Apache Web Server
V. EXPERIMENT
This section discusses the steps we took to measure the effectiveness of Docker containers relative to virtual machines for deploying software. We measured effectiveness in terms of the time required by each virtualization technique to start operating. In our test environment we tested a simple Apache web server that serves a website performing heavy mathematical functions. We deployed the web server through both Docker containers and a virtual machine. A PostgreSQL database server serves the data to the Apache web server in the backend.
The system configuration of the machines is as follows:
Fig. 3. System Architecture
A. Virtual Machine
We test our Apache-based web service using a virtual machine based on the Ubuntu Linux distribution. Below are the system specifications that we use to study our web application in the virtual machine:
• Operating System: Ubuntu 16.04
• Linux Kernel: v4.10
• Apache Web Server: Apache httpd v2.24
• Hypervisor: Oracle VM Type-2 Hypervisor
B. Containers
After deploying the web application through virtual machines, we test the same deployment through Docker containers. Below are the container specifications that we use to study our web application:
• Docker Image: Ubuntu:latest
• Linux Kernel: v4.10
• Apache Web Server: Apache httpd v2.24
• Docker: 17.09-ce
Fig. 4. Creating Containers in Ubuntu
Figure 4 shows a few of the steps involved in running a Docker container on the Ubuntu platform.
It is essential to have exactly the same versions of the Apache web server and Linux kernel to avoid skewed results due to version differences in kernel modules. We test virtual machines and Docker containers from a DevOps perspective on multiple metrics such as time to set up the environment, time required to start and stop the services, ease of operations such as continuous integration and deployment of features, and ease of scaling the application. We then also test both approaches with respect to the resources that must be managed in virtual machines and containers. Finally, we argue whether containers are the best practice or whether virtual machines are more beneficial for a DevOps developer.
VI. RESULTS
So far in our project, we have finished testing containers and virtual machines with respect to the initial human effort required to set up each virtualization solution.
A. Initial Human Efforts
When we say human effort required to set up a virtualization solution, we mean to quantify the human effort index in various classes such as infrastructure costs, time to set up the infrastructure, and time to start and stop the virtualization services.
Time to Start and Stop: We measured the time required to start and stop the containers and virtual machines, as observed by four system administrators (nowadays called DevOps developers). Because Docker containers are built from images using layered file system operations, it takes very little time, around 40 ms to start and 28 ms to stop on average, to bring up a container from a basic Docker image. Virtual machines, on the other hand, run as standalone operating systems and take about 55 sec to start and 30 sec to stop on average. In addition, a whole operating system must be installed from scratch every time a new virtual machine is created. In our comparison test, we assume that the operating system is already installed inside the virtual machine and that a Docker image with Apache installed is readily available. Tables I to IV report the time required to start and stop a virtual machine and a container. Because Docker containers run as processes inside the host operating system, launching a container through the Docker command line takes only milliseconds. A virtual machine, in contrast, takes tens of seconds because it must go through the whole generic operating system boot process: Power-On Self-Test (POST), loading GRUB, loading the initramfs and then starting the init process. We therefore found that, for a DevOps developer, Docker container technology is an easy choice over virtual machines, as it saves a lot of human effort in terms of the time required to start or stop the systems.
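The start and stop measurement described above can be sketched as a small timing harness. This is only an illustration under stated assumptions, not the procedure used in our experiments: the container name `apache-test` and the VM name `apache-vm` are hypothetical, and timing `VBoxManage startvm` measures only how long the command takes to return, so a real VM measurement would also need to wait until the guest has finished booting.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def time_start_stop(start_cmd, stop_cmd, runs=4):
    """Alternate start and stop commands and return their mean durations."""
    starts, stops = [], []
    for _ in range(runs):
        starts.append(time_command(start_cmd))
        stops.append(time_command(stop_cmd))
    return sum(starts) / runs, sum(stops) / runs


# Hypothetical names; replace them with an existing container or VM.
DOCKER_START = ["docker", "start", "apache-test"]
DOCKER_STOP = ["docker", "stop", "apache-test"]
VM_START = ["VBoxManage", "startvm", "apache-vm", "--type", "headless"]
VM_STOP = ["VBoxManage", "controlvm", "apache-vm", "acpipowerbutton"]
```

Calling `time_start_stop(DOCKER_START, DOCKER_STOP)` on a host with Docker installed would return the mean start and stop durations in seconds; the same function can be pointed at the VM commands for comparison.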
TABLE I. ADMIN EFFORTS
Type     Time to Start   Time to Stop
Docker   44 ms           28 ms
VM       59 sec          33 sec

TABLE II. ADMIN EFFORTS
Type     Time to Start   Time to Stop
Docker   39 ms           27 ms
VM       51 sec          29 sec

TABLE III. ADMIN EFFORTS
Type     Time to Start   Time to Stop
Docker   41 ms           31 ms
VM       54 sec          28 sec

Apart from the human effort required to start and stop virtual machines and containers, we must also test virtual machines and containers against other parameters that are more related to system performance as a whole.
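As a cross-check of the start and stop times, the values in Tables I to IV can be averaged directly. The short sketch below copies the numbers from the four tables and computes the means and the VM-to-container ratios; the variable names are ours and purely illustrative.

```python
from statistics import mean

# Values copied from Tables I to IV (Docker in milliseconds, VM in seconds).
DOCKER_START_MS = [44, 39, 41, 43]
DOCKER_STOP_MS = [28, 27, 31, 31]
VM_START_S = [59, 51, 54, 49]
VM_STOP_S = [33, 29, 28, 30]

docker_start_ms = mean(DOCKER_START_MS)
docker_stop_ms = mean(DOCKER_STOP_MS)
vm_start_s = mean(VM_START_S)
vm_stop_s = mean(VM_STOP_S)

# Convert VM seconds to milliseconds before taking the ratio.
start_ratio = vm_start_s * 1000 / docker_start_ms
stop_ratio = vm_stop_s * 1000 / docker_stop_ms

print(f"Docker: start {docker_start_ms} ms, stop {docker_stop_ms} ms")
print(f"VM:     start {vm_start_s} s, stop {vm_stop_s} s")
print(f"VM/Docker ratio: start {start_ratio:.0f}x, stop {stop_ratio:.0f}x")
```

The means come out to 41.75 ms and 29.25 ms for Docker and 53.25 sec and 30 sec for the VM, close to the approximate averages quoted in the text, so the VM takes roughly three orders of magnitude longer to start and stop.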
B. Disk I/O Resource Utilization
In this section, we gauge the performance of containers against virtual machines with respect to disk I/O. For any DevOps developer deploying an I/O-heavy application, it is essential to select an appropriate virtualization solution. We ran the fio workload generator inside Docker containers and a virtual machine, performing randread and randwrite operations on the disk. Figure 5 shows that for I/O-heavy applications, a DevOps developer would select containers, since containers impose less overhead and deliver performance almost equal to a bare-metal machine.
Fig. 5. Benchmarking of Disk IOPS (higher the better)

TABLE IV. ADMIN EFFORTS
Type     Time to Start   Time to Stop
Docker   43 ms           31 ms
VM       49 sec          30 sec
Therefore, we can infer that containers provide enough disk isolation at the expense of little to no resource overhead, whereas performance in a virtual machine suffers badly.
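A disk benchmark of this kind could be driven from a script along the following lines. This is a sketch under assumptions rather than our exact configuration: the block size, file size, I/O engine and runtime are illustrative choices that the text does not specify, and the helper names are hypothetical. The parser relies on the `jobs[].read.iops` and `jobs[].write.iops` fields of fio's JSON report.

```python
import json


def fio_command(rw, filename, runtime_s=60):
    """Build a fio command line for a random-I/O job that reports JSON."""
    if rw not in ("randread", "randwrite"):
        raise ValueError("rw must be 'randread' or 'randwrite'")
    return [
        "fio",
        f"--name={rw}-test",
        f"--rw={rw}",
        f"--filename={filename}",
        "--bs=4k",            # assumed block size
        "--size=1G",          # assumed test file size
        "--direct=1",         # bypass the page cache
        "--ioengine=libaio",  # assumed Linux asynchronous I/O engine
        "--time_based",
        f"--runtime={runtime_s}",
        "--output-format=json",
    ]


def iops_from_fio_json(text):
    """Return (read_iops, write_iops) summed over all jobs in a fio JSON report."""
    report = json.loads(text)
    read = sum(job["read"]["iops"] for job in report["jobs"])
    write = sum(job["write"]["iops"] for job in report["jobs"])
    return read, write
```

Running the same command on bare metal, inside a container and inside a virtual machine, and feeding each report to `iops_from_fio_json`, gives directly comparable IOPS figures.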
C. Network Resource Utilization
Another important aspect of any web application is how much network resource it utilizes. We used a network benchmark tool named netperf, which performs network bandwidth benchmarking and network tuning. We ran netperf inside a virtual machine and a container and then compared the results with bare metal.
Fig. 6. Benchmarking of Network Bandwidth (higher the better)
Fig. 7. Benchmarking of Network Latency (ms) (lower the better)
We can see from Figure 6 that the virtual machine performs worst when it comes to networking, while containers provide bare-metal-like performance. Figure 7 shows the average latency measurements for containers and the virtual machine. Therefore, for a DevOps developer who is trying to deploy a web application, it is wise to select containers for deployment rather than virtual machines.
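The comparison against bare metal can be made explicit as a relative overhead. The sketch below builds netperf command lines for a bandwidth test (TCP_STREAM) and a request/response test (TCP_RR), and defines the overhead as the fractional loss relative to the bare-metal value. The function names, the default duration and the overhead definition are our assumptions for illustration, not values taken from the experiments.

```python
def netperf_command(host, test="TCP_STREAM", duration_s=10):
    """Build a netperf command line against a host running netserver."""
    if test not in ("TCP_STREAM", "TCP_RR"):
        raise ValueError("test must be 'TCP_STREAM' or 'TCP_RR'")
    return ["netperf", "-H", host, "-t", test, "-l", str(duration_s)]


def relative_overhead(bare_metal, measured, higher_is_better=True):
    """Return the overhead of a measurement relative to bare metal as a fraction.

    For throughput-like metrics (higher is better) this is the fraction lost;
    for latency-like metrics (lower is better) it is the fraction added.
    """
    if bare_metal <= 0:
        raise ValueError("bare_metal must be positive")
    if higher_is_better:
        return (bare_metal - measured) / bare_metal
    return (measured - bare_metal) / bare_metal
```

For example, a bandwidth of 90 units against a bare-metal value of 100 units corresponds to an overhead of 0.1, i.e. 10 percent; for latency, `higher_is_better=False` reports how much slower the virtualized environment is.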
D. CPU Resource Utilization
We further compare containers and virtual machines in terms of the Floating Point Operations per Second (FLOPS) that we get. We test CPU performance using Intel's Linpack tool, which measures the CPU by performing heavy floating point operations.
In our tests, we found that containers again outperform virtual machines because they make use of the native system calls of the host operating system. Containers also use native memory swapping when performing heavy floating point operations, whereas the virtual machine performs poorly because the hypervisor has to translate system calls to the hardware.
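To illustrate what a Linpack-style measurement computes, the following toy sketch solves a dense linear system by Gaussian elimination with partial pivoting and divides the conventional LINPACK operation count, 2/3 n^3 + 2 n^2, by the elapsed time. It is not Intel's Linpack tool and, being pure Python, it reports far lower FLOPS than an optimized benchmark; it is only meaningful for relative comparisons between environments. The problem size and seed are arbitrary assumptions.

```python
import random
import time


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (modifies a, b)."""
    n = len(b)
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k into row k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def estimate_flops(n=100, seed=0):
    """Time one dense solve of size n and return the estimated FLOPS."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    solve(a, b)
    elapsed = time.perf_counter() - start
    operations = (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2
    return operations / elapsed
```

Running `estimate_flops()` on bare metal, in a container and in a virtual machine on the same hardware gives three numbers whose ratios can be compared in the same way as the benchmark results.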
Fig. 8. Benchmarking of CPU Performance (higher the better)
Our CPU benchmark on an Intel Core i7 Skylake CPU shows that containers again outperform the virtual machine. Therefore, containers provide enough CPU isolation with little to no overhead compared to the virtual machine.
E. Checkpoint, Restore and Migration
After our research on checkpointing and restore mechanisms in virtual machines and containers, we found that virtual machine managers like VirtualBox, VMware and KVM/QEMU have better built-in solutions that provide easy checkpointing and restore of virtual machines. On the other hand, we found that checkpointing and restore is difficult to implement for containers, and there are not many robust tools that achieve it. We found that Docker's native support for checkpoint and restore and third-party tools like CRIU (Checkpoint Restore in User-space) do not have enough documentation and community support to allow a DevOps developer to easily migrate their containers, let alone support live migration of containers.
VII. LIMITATION OF CONTAINERS
While containers are beneficial for their lower resource requirements, there are a few limitations when using containers:
A. Security Issues in Containers
While there may be numerous benefits for aDevOps developer to deploy their applicationinside a container, there are a few scenarioswhere it would be wise to use virtual machinesinstead of containers.
1) Shared Kernel:
As containers are based on a single Linux kernel, if an attacker attacks the underlying Linux kernel and the kernel is brought down, then all the containers running on top of that kernel are at risk. Thus, even if containers require fewer resources to provide just enough isolation, this should not be taken for granted, especially if an application requires strong isolation and security. If a DevOps developer must use containers for their benefits but also requires strong isolation and security, they may run containers inside virtual machines for added isolation and security. Unlike virtual machines, the shared kernel in containers impacts security. The whole host might shut down if a container does something nasty [7].
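The shared-kernel property is easy to observe: the kernel release reported inside a container matches that of the host. The sketch below assumes a Linux host with Docker installed and permission to run containers; the helper names are hypothetical. On platforms where Docker itself runs inside a helper virtual machine, the container reports that VM's kernel instead of the host's.

```python
import platform
import subprocess


def host_kernel_release():
    """Return the kernel release of the machine running this script."""
    return platform.release()


def container_kernel_command(image="ubuntu:latest"):
    """Build a docker command that prints the kernel release seen in a container."""
    return ["docker", "run", "--rm", image, "uname", "-r"]


def container_kernel_release(image="ubuntu:latest"):
    """Run a throwaway container and return the kernel release it reports."""
    result = subprocess.run(container_kernel_command(image),
                            check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

On a native Linux Docker host, `container_kernel_release()` is expected to equal `host_kernel_release()`, whereas a virtual machine reports whatever kernel its own guest operating system boots.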
2) Unrestricted Network Access:
Various containers on a host operating system share the same hardware. Therefore, in most deployments, containers are left with unrestricted access through network interfaces. This may cause a security concern that is often overlooked. An attacker who has gotten into a container may take down other containers in the cluster by exploiting unrestricted network access.
3) Running Containers with Privilege Mode:
As containers are run as daemon processes, it is very risky to run containers with root user privileges. An attacker can exploit the privilege level with which the container is running to take down the whole host operating system along with all other containers running on it.
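One way to audit this risk is to ask Docker whether a container was started in privileged mode and which user it runs as. The sketch below uses `docker inspect` with a Go template over the `HostConfig.Privileged` and `Config.User` fields; the helper names and the simplified root check (an empty user, `root` or `0`) are our assumptions for illustration.

```python
import subprocess

# Go template printing the privileged flag and the configured user, separated by '|'.
INSPECT_FORMAT = "{{.HostConfig.Privileged}}|{{.Config.User}}"


def inspect_command(container):
    """Build the docker inspect command for one container name or ID."""
    return ["docker", "inspect", "--format", INSPECT_FORMAT, container]


def parse_inspect(output):
    """Return (privileged, runs_as_root) parsed from the template output."""
    privileged_text, user = output.strip().split("|", 1)
    privileged = privileged_text == "true"
    # An empty user means the image default, which is root unless the image sets USER.
    runs_as_root = user in ("", "root", "0")
    return privileged, runs_as_root


def audit(container):
    """Run docker inspect and return (privileged, runs_as_root) for a container."""
    result = subprocess.run(inspect_command(container),
                            check=True, capture_output=True, text=True)
    return parse_inspect(result.stdout)
```

A result of `(True, True)` from `audit` marks a container that combines privileged mode with the root user, which is the riskiest configuration discussed above.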
B. Securing the Containers
Below are a few ways in which we can secure containers [8]:
• AppArmor - Administrators can assign a security profile to each program running in the system.
• BlackDuck Security - It is mainly used to inventory containers and map known security vulnerabilities to image indexes.
• REMnux - It is an open source Linux toolkit. It assists DevOps in analysing malware and reverse engineering infected applications.
• Cilium - It provides network security between container applications.
• Dockscan - It analyses the installation process and monitors the running containers.
C. Live Migration
We tried to test the checkpoint and restore mechanism inside containers using Checkpoint and Restore in Userspace (CRIU), but it did not work well inside Docker. We communicated with the CRIU community and the Docker community to carry out checkpoint and restore inside containers. However, as discussed above, owing to limited documentation and support we could not perform migration efficiently. This might be another area where containers are less beneficial, as various virtual machine vendors already provide robust tools and support for live migration of virtual machines.
Containers provide weak isolation compared to virtual machines. Therefore, if a DevOps developer wants to deploy an application that requires higher levels of security, it is better to use virtual machines.
VIII. CONTRIBUTIONS
All members of our group contributed equally. All the tasks, including infrastructure setup, microbenchmarking the infrastructure and authoring the documentation of the project, were divided among and completed by all the team members with equal effort.
IX. ACKNOWLEDGEMENT
We would like to thank Prof. Richard Martin for his suggestions and support throughout the project.
X. CONCLUSION
Our project studies the concept of virtualization using virtual machines and containers. Our work demonstrates the benefits of containers compared to virtual machines for a DevOps developer who is looking to deploy their application. As showcased in Sections V and VI, containers exhibit lower overhead while providing the same services for application deployment as virtual machines. Containers are shown to use fewer system and human resources to operate than virtual machines, and they provide just enough isolation and better performance for an application deployed inside them. On the other hand, we also discussed the possibility of deploying highly secured and isolated applications. In such scenarios, it is much wiser to use virtual machines for deployment rather than containers, owing to their stronger isolation and security mechanisms. Therefore, we conclude that the choice of virtualization technique really depends on the DevOps developer and the application they want to integrate, deploy or migrate. If the application requires just enough isolation with lower resource utilization, we definitely recommend using containers over virtual machines.
REFERENCES