μTiles: Efficient Intra-Process Privilege Enforcement of Memory Regions

Abstract

With the alarming rate of security advisories and privacy concerns on connected devices, there is an urgent need for strong isolation guarantees in resource-constrained devices that demand very lightweight solutions. However, the status quo is that Unix-like operating systems do not offer privilege separation inside a process. Lack of practical fine-grained compartmentalization inside a shared address space leads to private data leakage through applications' untrusted dependencies and compromised threads. To this end, we propose μTiles, a lightweight kernel abstraction and set of security primitives based on mutual distrust for intra-process privilege separation, memory protection, and secure multithreading. μTiles takes advantage of hardware support for virtual memory tagging (e.g., ARM memory domains) to achieve significant performance gain while eliminating various hardware limitations. Our results (based on OpenSSL, the Apache HTTP server, and LevelDB) show that μTiles is extremely lightweight (adds ≈10KB to kernel image) for IoT use cases. It adds negligible runtime overhead (≈0.5%−3.5%) and is easy to integrate with existing applications for providing strong privilege separation.


Zahra Tarkhani

University of Cambridge

Anil Madhavapeddy

University of Cambridge

arXiv [cs.OS]

Introduction

Many software attacks target sensitive content in an application's address space, usually through remote exploits, malicious third-party libraries, or unsafe language vulnerabilities. Conventional operating systems consider processes as units of isolation. However, particularly in IoT use cases, most applications generate and analyze highly sensitive data in a single process for efficiency reasons. This leads to real threats (summarised in Table 1) that require effective protection: (i) an application's secret data (e.g., private keys or user passwords) can be leaked in the presence of compromised third-party libraries like OpenSSL [22]; (ii) privileged functions can be misused to access private content [21]; (iii) applications written in memory-safe languages such as Rust or OCaml are vulnerable via unsafe external libraries that jeopardize all other safety guarantees [6, 33]; and (iv) in multithreaded servers, attackers can exploit vulnerabilities (e.g., buffer overflows) so that a compromised thread can access sensitive data owned by other threads [2]. This whole class of attacks could be avoided by providing a practical way to enforce least privilege within a shared address space.

Figure 1: High-level architecture of µTiles: it provides strong intra-process isolation and privilege separation. Each thread can define its own trust boundaries in the form of µTiles that are protected memory regions. µTiles are guarded against both untrusted code within the same thread as well as any untrusted threads.

Table 1: A representative selection of vulnerabilities that cause sensitive content leakage. The attacks with a tick can be mitigated by using µTiles protection.

Category            CVE                Description                  µTiles
In-process threats  CVE-2019-9345      Shared mapping bug           ✓
                    CVE-2019-9423      Missing bounds check         ✓
                    CVE-2019-15295     Unsafe third party library   ✓
                    CVE-2019-1278      Unsafe third party library   ✓
                    CVE-2018-0487      Unsafe third party library   ✓
                    CVE-2017-1000376   Unsafe native bindings       ✓
                    CVE-2014-0160      Heartbleed bug               ✓
Other               CVE-2018-0497      SW side-channels
                    CVE-2017-5754      HW side-channels

Process-based isolation is the primary compartmentalization technique for security-sensitive applications such as OpenSSH to separate their components into different processes [1, 3, 42]. However, this usually causes a large overhead and requires redesigning an application from scratch using a multiprocess architecture (e.g., Chrome), which is impractical for most multithreaded applications such as web servers. Previous work such as Privtrans [17] and Wedge [16] provide automatic process-based isolation of applications with a huge overhead ( ≈ − fork that already suffers from various efficiency and security issues [13]. Even fork alternatives such as clone are not flexible enough for fine-grained data sharing between processes for security-critical resources. We need a better abstraction for shrinking the trust boundaries from inter-process to intra-process, so developers can effectively prevent in-process attacks and build secure multithreaded applications.

The importance of these security threats has led to significant improvements in hardware support for efficient memory isolation [9, 11, 29, 50]. However, simple APIs for utilizing such hardware features are not effective due to the complexity of attacks as well as various hardware limitations [40, 47]. We need a more principled mitigation approach. In particular, the requirements of real-world IoT applications show a practicality gap in the existing solutions (summarized in Table 2) that needs to be covered.

In this paper, we present µTiles, a new OS abstraction for enforcing least privilege between threads and on slabs of memory within the same address space. Unlike previous work, the µTiles security model allows each thread to selectively protect or share its memory compartments both from the untrusted code within itself as well as from any untrusted thread (see Figure 1). The µTiles access control layer maps threads' security policies to the µTiles dedicated virtual memory (VM) abstraction.
This VM manager provides an efficient memory tagging layer by bypassing most of the kernel's paging abstraction. It utilizes hardware-enforced VMA tagging (e.g., ARM memory domains [11] or Intel MPK [29]) to achieve low overhead. It should be noted that these hardware features have various security and practicality limitations (§2) that are mitigated by the µTiles high-level abstraction.

Table 2: Overview of in-process isolation techniques. We consider these metrics for our design, focusing on the requirements of IoT use cases. Legend: ● has the feature, ◐ partially has the feature, - does not have the feature.

Features: (1) No compiler modification; (2) Low hardware dependency; (3) Flexible security policy; (4) Simplicity/Usability; (5) Low overhead; (6) Unlimited isolated units; (7) Multithreaded privilege separation; (8) POSIX compatibility; (9) Embedded device suitable.

Isolation mechanism               (1) (2) (3) (4) (5) (6) (7) (8) (9)
SFI/HFI [19,44,54]                 -   ◐   -   ◐   -   ●   -   ●   ◐
Tagged-VMA/TLB [36,40,47]          ●   -   -   ◐   ◐   ●   ◐   ●   -
LibOS [37,41]                      ◐   ●   -   ◐   ◐   -   -   ◐   ◐
Process-based isolation [16,17]    ●   ●   -   ◐   -   -   -   ●   ◐
Capability hardware [50,52]        ◐   -   ●   ◐   ◐   ●   ●   ●   -
DIFC-OSes [32,51]                  ●   ●   ●   ◐   -   ●   ◐   ◐   -
µTiles                             ●   ◐   ●   ●   ●   ●   ●   ●   ●

Our contributions can be summarised as follows:

• We present new kernel security primitives based on mutual distrust for intra-process privilege separation. They provide strong protection of private content, a secure multithreading model, and guarded communication within a shared address space.
• We describe how to utilize modern CPU facilities for efficient memory tagging to avoid the overhead of existing solutions (due to TLB flushes, per-thread page tables, or nested page table management) while relieving the hardware limitations.
• We show that the solution is ultra-lightweight ( ≈ K LoC) and hence practical for embedded devices with a minimal memory footprint.
• We evaluate our implementation using real-world software such as the Apache HTTP server, OpenSSL, and Google's LevelDB, which shows that µTiles adds negligible runtime overhead for lightly modified applications while significantly improving their security through strong compartmentalization.

The remainder of this paper elaborates on the hardware features we use (§2), describes the architecture (§3) and implementation of µTiles (§4), presents an evaluation (§5), and finally discusses the trade-offs of our approach (§6).

Goals & Assumptions

The µTiles abstraction aims to enforce thread-granularity least privilege for memory accesses via the following principles and assumptions on the underlying hardware.

Fine-grained strong isolation:

All threads of execution should be able to define their security policies and trust models to selectively protect their sensitive resources. Current OS security models of sharing ("everything-or-nothing") are not flexible enough for defining fine-grained trust boundaries within processes or threads (lightweight processes).

Performance: µTiles operations, including launching, running, changing access permissions, and sharing across threads, should have minimal overhead. Moreover, untrusted (i.e., µTiles-independent) parts of applications should not suffer any overhead.

Efficiency: µTiles should be lightweight to be practical for embedded devices running on a few megabytes of memory and slow ARM CPUs.

Compatibility:

It is difficult to provide strong security guarantees with no code modifications, and µTiles is no exception. We move most of these modifications into the Linux kernel (increasingly popular for embedded deployments [5]) and provide simple userspace interfaces. µTiles should be implemented without extensive changes to Linux and should not depend on a specific programming language, so existing applications can be ported easily.

To achieve effective isolation, we need a security model based on mutual distrust that lets each thread protect its own µTiles from untrusted parts of the same thread as well as from other threads and processes. Simply providing POSIX memory management within µTiles (e.g., malloc or mprotect) is inadequate. As a simple example, attackers can misuse the API to change the memory layout of other threads' µTiles or to perform unauthorized memory accesses. The µTiles interface needs (i) to provide isolation within a single thread; (ii) to be flexible for sharing between threads; and (iii) to restrict all unauthorized permission changes or memory mapping modifications of allocated µTiles. Previous work such as ERIM [47] or libMPK [40] does not offer such security guarantees since their focus is more on performance and virtualization of the hardware protection keys.

We derive inspiration from Decentralised Information Flow Control (DIFC) [32] but with a more constrained interface: by not supporting information flow within a program, we avoid the complexities and performance overheads that it typically involves. Existing DIFC kernels such as HiStar [51] achieve our isolation goals but require a non-POSIX-based OS, which makes them impractical for many applications, particularly IoT use cases. To have a practical and lightweight solution, we therefore built µTiles by modifying the Linux kernel and additionally utilizing modern hardware facilities for VMA tagging to deliver low overhead.
Modern CPUs support memory protection mechanisms that are more efficient than traditional paging. For example, VMA tagging features such as Intel Memory Protection Keys [29] and ARMv7 Memory Domains (MDs) [11] provide fast isolation by reducing page table walks and TLB flushes. Though the implementations across Intel and ARM vary considerably (see Table 3), the µTiles high-level VM abstraction can securely utilize these efficient building blocks while hiding their limitations. In this paper, we primarily describe the design of µTiles for ARMv7-A, which is widely used in IoT and mobile devices. Also, ARM-MD is a less flexible and more challenging interface to support (§2.3), and it covers most MPK limitations as well.

Table 3: ARM memory domains vs Intel MPK: Despite being efficient building blocks of isolation, such features have various limitations that the µTiles abstraction resolves to provide effective intra-process privilege separation.

Feature                     ARM Memory Domains                               Intel MPK
Per process domains         16                                               16
Access control register     DACR (2 bits per domain, privileged register)    PKRU (2 bits per domain, userspace register)
Access rights               No-access, Full access, MMU default              No-access, write-disable, MMU default
Paging modes                2-level paging (bits 8:5, level 1 entries)       4-level paging (bits 62:59 of PDPTE)
Address space privilege     Privileged & userspace                           Userspace only
Specific page fault         Domain fault                                     PK fault
Kernel virtual memory API   No support                                       Limited support (pkey_mprotect, pkey_alloc, pkey_free, and mmap)

As a summary of ARMv7-A memory management, page table entries consist of a virtual base address, a physical base address, Address Space Identifier (ASID) tags, domain IDs, and a set of flags for access control and other page attributes. It supports a two-level hierarchical page table when using a short-descriptor translation table format, and supports variable page sizes (1GB, 1MB, 64KB, and 4KB). ARM supports two page tables simultaneously, using the hardware registers TTBR0 and TTBR1. A virtual address is mapped to a physical address by the CPU, depending on settings in TTBCR. This control register has a field that sets a split point in the address space. Addresses below the cutoff value are mapped through the page tables pointed to by TTBR0 (used per process), and addresses above the cutoff value are mapped through TTBR1 (used by the kernel).

The translation tables hold a four-bit domain ID ranging from D0 to D15. Access control for each domain is handled by setting the domain access control register (DACR) in CP15, which is a 32-bit register only accessible in the privileged processor modes. Each domain is assigned two bits in the DACR, which define its access rights. The four possible access rights for a domain are No Access, Manager, Client, and Reserved (see Table 4).

Table 4: ARM memory domains access permissions

Mode        Bits   Description
No Access   00     Any access causes a domain fault.
Manager     11     Full accesses with no permissions check.
Client      01     Accesses are checked against the page tables.
Reserved    10     Unknown behaviour.

Those fields let the processor (i) prohibit access to the domain-mapped memory (No Access); (ii) allow unlimited access to the memory despite permission bits in the page table (Manager); or (iii) let the access right be the same as the page table permissions (Client). Any access violation causes a domain fault. Changes to the DACR are low cost and activated without affecting the TLB. Hence changing domain permissions does not require TLB flushes.
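The two-bit-per-domain layout of the DACR described above can be illustrated with a small sketch. It only computes the 32-bit register value; actually writing the DACR is a privileged CP15 operation and is not shown. The names `dacr_set`, `dacr_get`, and `enum dacr_mode` are hypothetical helpers introduced for illustration and are not part of µTiles or the Linux kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Two-bit access modes for one domain, following Table 4. */
enum dacr_mode {
    DACR_NO_ACCESS = 0x0, /* 00: any access raises a domain fault */
    DACR_CLIENT    = 0x1, /* 01: accesses are checked against the page tables */
    DACR_RESERVED  = 0x2, /* 10: reserved, rejected by this sketch */
    DACR_MANAGER   = 0x3  /* 11: no permission checks */
};

#define DACR_NUM_DOMAINS 16u

/* Return a copy of `dacr` in which the field of `domain` holds `mode`. */
static uint32_t dacr_set(uint32_t dacr, unsigned domain, enum dacr_mode mode)
{
    assert(domain < DACR_NUM_DOMAINS);
    assert(mode != DACR_RESERVED);
    unsigned shift = domain * 2u;          /* domain n occupies bits 2n+1:2n */
    dacr &= ~(UINT32_C(3) << shift);       /* clear the old two-bit field */
    dacr |= (uint32_t)mode << shift;       /* install the new mode */
    return dacr;
}

/* Read back the mode currently encoded for `domain`. */
static enum dacr_mode dacr_get(uint32_t dacr, unsigned domain)
{
    assert(domain < DACR_NUM_DOMAINS);
    return (enum dacr_mode)((dacr >> (domain * 2u)) & 3u);
}
```

For example, `dacr_set(0, 15, DACR_MANAGER)` yields `0xC0000000`. Because only this one register value changes, no page table entry is touched, which is consistent with the observation that switching domain permissions needs no TLB flush.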

Though ARM memory domains are a promising primitive in concept, the current hardware implementation and OS support suffer from significant problems that have prevented their broader adoption:

Scalability:

ARM relies on a 32-bit DACR register and so supports only up to 16 domains. Allocating a larger register (e.g., 512 bits) would mean larger page table entries or additional storage for domain IDs.

Flexibility:

Unlike Intel MPK, ARM-MDs only apply to first-level entries; the second-level entries inherit the same permissions. This prevents arbitrary granularity of memory protection down to small page boundaries and reduces the performance of some applications [20]. Also, the DACR access control options cannot directly mark a domain as read-only, write-only, or exec-only. The higher-level VM abstraction therefore has to resolve these issues.

Performance:

Changing the DACR is a fast but privileged operation, so any change of domain access permissions from userspace requires a system call. This is unlike Intel MPK, which makes its Protection Key Rights Register (PKRU) accessible directly from userspace.

Userspace:

There is no Linux userspace interface for using ARM-MD; it is only used within the kernel to map the kernel and userspace into separate domains. In contrast, Linux already provides some basic support for utilizing Intel MPK from userspace.

Security:

Though the DACR is only accessible in privileged mode, any syscall that changes this register is a potential breach that could cause the attacker to gain full control; this has happened once already through the misuse of the put_user/get_user kernel API (CVE-2013-6282). Also, since only 16 domains are supported, it is trivial to guess other domains' identifiers, making it essential not to expose these directly to application code.

We now describe the µTiles architecture, which is a kernel abstraction for intra-process privilege separation with an emphasis on strong isolation, performance, and practicality for IoT use cases. The µTiles abstraction contains three primary kernel components for (1) access control and least privilege enforcement, (2) threading and task management, and (3) a dedicated virtual memory manager.

This paper focuses on two types of threats. First, memory-corruption based threats inside a shared address space that lead to sensitive information leakage; these threats can be caused by bugs or malicious third-party libraries (see Table 1). Second, attacks from threads that could get compromised by exploiting logical bugs or vulnerabilities (e.g., buffer overflow attacks, code injection, or ROP attacks). We assume the attacker can control a thread in a vulnerable multithreaded application, allocate memory, and spawn more threads up to the resource limits set by the OS and hardware. The attacker will try to escalate privileges through the attacker-controlled threads or gain control of another thread, e.g., by manipulating another thread's data or via code injection. The adversary may also bypass protection regions by exploiting race conditions between threads or by leveraging confused-deputy attacks.

µTiles thus provides isolation in two stages: (1) within a single thread, and (2) across threads in the same process. We consider threads to be security principals that define their security policies based on mutual distrust within the shared address space. We protect each thread's µTiles against unauthorized, accidental, and malicious access or disclosure. Therefore, the TCB consists of the OS kernel, which performs security policy enforcement. We also assume that developers correctly specify their policies through the userspace interface for managing µTiles.

µTiles are not protected against covert channels based on shared hardware resources (e.g., a cache). Systems such as Nickel [45] or hardware-assisted platforms such as Hyperflow [24] could be a helpful future addition for side-channel protection on µTiles.

Our modified Linux kernel enforces the principle of least privilege via a dynamic security policy based on DIFC [49, 51] and a simpler version of the Flume [32] labeling with only two kinds of kernel objects: threads and address spaces. Any thread $t$ has one secrecy label ($SL_t$) and one integrity label ($IL_t$), each of which is a set of unique tags. µTile objects (i.e., contiguous units of memory) have only a secrecy label instead of both types. Integrity violations are restricted at the higher level by controlling the flow of threads' labels; this improves performance and reduces complexity.

Privileges are represented in the form of two capabilities, $\theta^{+}$ and $\theta^{-}$, per tag $\theta$ for adding or removing tags to/from labels. These capabilities are stored in a capability list $C_p$ per thread $p$. Unique tags are assigned internally by the kernel when utile_create is called. To improve security, none of the µTiles APIs propagates tags to userspace; all API access control is done internally within the kernel. The kernel allows secrecy information flow from $\alpha$ to $\beta$ only if $SL_\alpha \subseteq SL_\beta$, and integrity flow only if $IL_\beta \subseteq IL_\alpha$. Every thread $p$ may change its label from $L_i$ to $L_j$ if it has the capability to add the tags present in $L_j$ but not in $L_i$, and can drop the tags that are in $L_i$ but not in $L_j$. This is formally declared as $(L_j - L_i \subseteq C_p^{+}) \wedge (L_i - L_j \subseteq C_p^{-})$.

When a thread has the $\theta^{+}$ capability for µTile $\theta$, it gains the privilege to access µTile $\theta$ only with the permissions set by its owner (read/write/execute). The access privileges to each µTile can be different; hence, two threads can share a µTile while holding different access privileges. Having the $\theta^{-}$ capability lets a thread declassify µTile $\theta$. Declassification allows the thread to modify the µTile memory layout (by adding/removing pages to it), change permissions, or copy the content to untrusted sources.
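The flow and label-change rules above reduce to subset tests on tag sets. The following is a minimal sketch under the assumption that a label is a 64-bit bitmask with one bit per tag; the paper does not say the kernel stores labels this way, and every name here (`tagset_t`, `tag_bit`, `label_change_allowed`, and so on) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed representation: bit i set means tag i belongs to the set. */
typedef uint64_t tagset_t;

/* Singleton set containing only `tag`. */
static tagset_t tag_bit(unsigned tag)
{
    assert(tag < 64);
    return UINT64_C(1) << tag;
}

/* True when every tag of `a` is also in `b` (a is a subset of b). */
static bool tagset_subset(tagset_t a, tagset_t b)
{
    return (a & ~b) == 0;
}

/* Secrecy may flow from alpha to beta only if SL_alpha is a subset of SL_beta. */
static bool secrecy_flow_allowed(tagset_t sl_alpha, tagset_t sl_beta)
{
    return tagset_subset(sl_alpha, sl_beta);
}

/* Integrity may flow from alpha to beta only if IL_beta is a subset of IL_alpha. */
static bool integrity_flow_allowed(tagset_t il_alpha, tagset_t il_beta)
{
    return tagset_subset(il_beta, il_alpha);
}

/* Label change L_i -> L_j: added tags need plus capabilities,
 * dropped tags need minus capabilities. */
static bool label_change_allowed(tagset_t l_i, tagset_t l_j,
                                 tagset_t cap_plus, tagset_t cap_minus)
{
    tagset_t added   = l_j & ~l_i;   /* L_j - L_i */
    tagset_t dropped = l_i & ~l_j;   /* L_i - L_j */
    return tagset_subset(added, cap_plus) && tagset_subset(dropped, cap_minus);
}
```

With this encoding, a thread that holds only the plus capability for tag 5 can add the tag to its label but cannot remove it again, which mirrors the distinction between gaining access and declassifying.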
Unsafe operations such as declassifying µTiles or endorsing a µTile as high-integrity require the thread to be an owner or an authority (acts-for relationship), which can be managed by the utile_grant and utile_revoke_grant calls (see Table 5).

Table 5: µTiles access control system calls. tid represents a thread ID; struct u_info* is the owner list of µTile IDs and other fields for ownership management and capabilities per µTile. There is no direct propagation of labels, which are security-critical data structures, and security policies are enforced within the kernel.

Syscall                                           Description
utile_transfer_caps(u_info*, tid)                 Passes only plus capabilities to thread tid
utile_declassify(u_info*)                         Thread declassification or endorsement
utile_grant(u_info*, tid)                         Adds an acts-for or a delegation link to another thread
utile_revoke_grant(u_info*, tid)                  Removes an acts-for or a delegation link
utile_lock(u_info*)                               Disables access to a set of µTiles
utile_unlock(u_info*)                             Enables access to locked µTiles
utile_clone(u_info*, int(*fn)(void*)...) → tid    Creates a thread

Each thread may have multiple µTiles attached to it. There is no concept of inheriting credentials and capabilities by default (e.g., in the style of fork), as this makes reasoning about security difficult [13]. For a µTile to propagate, it must be through transferring capabilities; this can be done directly by calling utile_transfer_caps for "plus" capabilities and utile_grant for declassification or endorsement. Both these operations are also possible via specific arguments of the utile_clone syscall when creating a child thread. Figure 2 shows how each thread can use the µTiles API for creating tags, changing labels, and passing capabilities to other threads. For instance, thread 2 gains access to µTile 18 by directly getting the $b^{+}$ capability from thread 1.
Since it does not have the $b^{-}$ capability, it cannot change µTile 18's permissions or its memory mappings.

It should be noted that a µTile ID is not the same as its label. All security-critical data structures for managing labels are stored inside the kernel, so they cannot be modified by userspace attackers. Table 5 describes the userspace µTile API. Threads can lock access or permission changes of their µTiles via utile_lock, which temporarily changes the µTile tag to restrict any modification of the µTile's state. A locked µTile can only be accessed by calling utile_unlock.

A tagged thread can create a child by calling utile_clone; the child thread does not inherit any of its parent's capabilities. However, the parent can create a child with a list of its µTiles and selected capabilities as an argument of utile_clone. For instance, in Figure 2, thread 1 creates its child with only a "plus" capability to two of its µTiles (18 ,

Figure 2: µTiles threading abstraction: each thread is a security principal, it can define security policies for controlling its µTiles collection, and pass its capabilities to other threads. The kernel enforces the security policies and handles virtual memory management of µTiles.

The µTiles dedicated VM abstraction provides familiar semantics for µTiles-aware memory management, VM tagging, mappings, protection, page fault handling, and least privilege enforcement. It bypasses most of the kernel's paging abstraction. Hence, it does not require extensive modifications to the kernel memory management structures that might otherwise introduce security holes due to inevitable TLB and memory management bugs [53]. Threads' security policy enforcement is done by adding custom security hooks in the VM interfaces that check the correct flow of labels (§3.3).

To improve performance (§2.1), the VM abstraction maps each thread's high-level security policies and memory management interface to the underlying hardware domains while hiding their limitations (§2.3). Listing 1 shows a basic way of using µTiles to protect sensitive content in a single thread.

An application creates a new µTile by calling utile_create; the kernel creates a unique tag with both capabilities (since the caller is the owner), adds it to the thread's label and capability lists, and returns a unique ID. Then the owner thread maps pages to its µTile by calling utile_mmap, which updates the µTile's metadata with its address space ranges. The kernel allows mappings based on the thread's labels and free hardware domains. If there is a free hardware domain, it maps pages to that domain and places the µTile in the µTiles cache.
When the µTile already exists in the cache, further access to it is fast. When there is no free hardware domain, we have to evict one of the µTiles from the cache and map the new µTile metadata to the freed hardware domain; this requires storing all the necessary information for restoring the evicted µTile, such as its permissions, address space range, and label. The caching process can be further optimized by tuning the eviction rate and suitable caching policies similar to libMPK [40].

    /* create a utile */
    int utile_id = utile_create();
    /* map a memory region to the utile */
    memblock = (char*) utile_mmap(utile_id, addr, len, prot, 0, 0);
    /* set permissions by utile_mprotect */
    /* allocate memory from utile */
    private_blk = (char*) utile_malloc(utile_id, priv_len);
    /* make utile inaccessible */
    lock_utile(utile_id);
    /* ... untrusted computations ... */
    /* make utile accessible */
    unlock_utile(utile_id);
    /* ... trusted computations ... */
    /* cleanup utile */
    utile_free(private_blk);
    utile_munmap(utile_id, memblock, len);

Listing 1: Basic µTiles usage

Table 6: Some of the userspace µTiles memory management API. Each µTile has an id and is a tagged kernel object internally. µTiles access control is checked within the kernel.

Name                              Description
utile_create → id                 Create a new µTile
utile_kill(id)                    Destroy a µTile
utile_malloc(id, size) → void*    Allocate memory within a µTile
utile_free(id, void*)             Free memory from a µTile
utile_mprotect(id, ...)           Change a µTile's pages permission
utile_mmap(id, ...) → void*       Map a page group to a µTile
utile_munmap(id, ...)             Unmap all pages of a µTile
utile_get(id) → perms             Get a µTile permission

The application uses utile_malloc and the µTile ID to allocate memory within the µTile boundaries, utile_free to deallocate memory, and utile_mprotect to change its permissions; Table 6 shows the familiar API for µTiles memory management. To mitigate attacks inside a single thread, unauthorized accesses to a µTile, whether accidental or from malicious code, are restricted once the owner calls utile_lock. The application developer can then allow only trusted functions or necessary parts of the code to gain access by calling utile_unlock. For example, our single-threaded OpenSSL uses this mechanism for isolating private keys from vulnerabilities like the Heartbleed bug (§5.2).

The VM manager has a separate fault handler for µTiles-specific cases. Illegal access to µTiles causes domain faults; our handler logs them (e.g., the violating thread's information) and terminates the violating thread with a signal.

µTiles Kernel Modifications: The µTiles access control and security model is implemented as a new Linux Security Module (LSM) [39] with only four custom hooks. The LSM initializes the required data structures, such as the label registry. Access control system calls (Table 5) for enforcing least privilege are implemented as part of the LSM, including locking µTiles, transferring capabilities, authority operations, and declassification based on the labeling model we described (§3.3).

We modify the Linux task structure to store the metadata required to distinguish µTiles threads from regular ones. Specifically, we add fields for storing µTiles metadata, label/ownership as an array data structure holding its tags (each tag is a 32-bit identifier whose upper 2 bits store the plus and minus capabilities), and a capability list, all included in the task's cred->security data structure.
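The tag layout mentioned above (a 32-bit value whose upper two bits carry the plus and minus capabilities) can be sketched as follows. The paper does not state which of the two bits is "plus" and which is "minus", so the bit positions below are an assumption, and the helper names (`tag_pack`, `tag_id`, `tag_plus_only`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bit 31 = plus capability, bit 30 = minus capability,
 * bits 29..0 = tag identifier. Only "upper 2 bits" comes from the text. */
#define TAG_CAP_PLUS  UINT32_C(0x80000000)
#define TAG_CAP_MINUS UINT32_C(0x40000000)
#define TAG_ID_MASK   UINT32_C(0x3FFFFFFF)

/* Combine a 30-bit identifier with its capability bits. */
static uint32_t tag_pack(uint32_t id, bool plus, bool minus)
{
    assert((id & ~TAG_ID_MASK) == 0);   /* identifier must fit in 30 bits */
    return id | (plus ? TAG_CAP_PLUS : 0) | (minus ? TAG_CAP_MINUS : 0);
}

/* Extract the identifier, ignoring capability bits. */
static uint32_t tag_id(uint32_t tag) { return tag & TAG_ID_MASK; }

static bool tag_has_plus(uint32_t tag)  { return (tag & TAG_CAP_PLUS) != 0; }
static bool tag_has_minus(uint32_t tag) { return (tag & TAG_CAP_MINUS) != 0; }

/* Strip the minus capability, e.g. before handing a tag to another thread
 * in the way utile_transfer_caps passes only plus capabilities. */
static uint32_t tag_plus_only(uint32_t tag) { return tag & ~TAG_CAP_MINUS; }
```

Packing the capabilities into the tag word keeps each label entry at a fixed four bytes, which fits the goal of a small memory footprint; a real implementation may of course lay the bits out differently.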
We implemented a hash table-based registry to make the most frequently used operations (e.g., store, set, get, remove) on these data structures more efficient.

The LSM also provides custom security hooks for parsing userspace labels into the kernel (copy_user_label), labeling a task (set_task_label), checking whether a task is labeled (is_task_labeled), and checking whether the information flow between two tasks is allowed (check_labels_allowed). These security hooks are added in various places within the kernel to guard µTiles against unauthorized access or permission changes through either the POSIX API (e.g., mmap, mprotect, fork) or the µTiles API. For example, forking a labeled task should not copy its labels and capability lists, and this is enforced using these security hooks. As another example, µTiles-independent applications that use the traditional POSIX API cannot perform any unauthorized memory allocation from a random µTile or map pages to it; this is restricted via the security hooks that are placed in the kernel's virtual memory management layer, similar to the µTiles VM manager (Table 6).

The µTiles virtual memory abstraction is implemented as a set of kernel functions similar to their Linux equivalents (e.g., do_mmap, do_munmap, and do_mprotect) with similar high-level semantics, but it replaces the paging complexity with simpler hardware domain-based operations. When an application creates a µTile by calling utile_create and maps an address range to it via utile_mmap, the µTiles VM manager tags a 1MB-aligned address space that covers the requested range, stores the µTile metadata, maps it to a free hardware domain, and updates the µTiles cache.

When µTiles are mapped to hardware domains, the exact physical domain number is hidden from the userspace code to avoid possible misuse of the API. The mappings between µTiles and hardware domains are maintained through a cache-like structure similar to libmpk [40].
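The cache that multiplexes many µTiles onto the 16 hardware domains can be sketched as a small table with least-recently-used eviction. This is an illustrative sketch only: it assumes all 16 domains are available to µTiles (a real kernel reserves some for itself), it tracks only the µTile-to-domain mapping, and it omits saving and restoring the evicted µTile's metadata and retagging page table entries. `struct domain_cache`, `cache_init`, and `cache_lookup` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define HW_DOMAINS 16      /* number of hardware domains (assumed all usable) */
#define NO_UTILE   (-1)    /* marks a free slot / no eviction */

struct domain_cache {
    int      utile[HW_DOMAINS];      /* µTile ID held by each hardware domain */
    uint64_t last_used[HW_DOMAINS];  /* logical time of last access, 0 = never */
    uint64_t clock;                  /* incremented on every lookup */
};

static void cache_init(struct domain_cache *c)
{
    for (int d = 0; d < HW_DOMAINS; d++) {
        c->utile[d] = NO_UTILE;
        c->last_used[d] = 0;
    }
    c->clock = 0;
}

/* Return the hardware domain holding `utile_id`, mapping it first if needed.
 * On a miss the least recently used slot is reused (free slots count as
 * oldest); `*evicted` receives the displaced µTile ID or NO_UTILE. */
static int cache_lookup(struct domain_cache *c, int utile_id, int *evicted)
{
    assert(utile_id >= 0);
    int victim = 0;
    *evicted = NO_UTILE;
    c->clock++;
    for (int d = 0; d < HW_DOMAINS; d++) {
        if (c->utile[d] == utile_id) {           /* hit: refresh timestamp */
            c->last_used[d] = c->clock;
            return d;
        }
        if (c->last_used[d] < c->last_used[victim])
            victim = d;                          /* remember the oldest slot */
    }
    *evicted = c->utile[victim];                 /* miss: reuse oldest slot */
    c->utile[victim] = utile_id;
    c->last_used[victim] = c->clock;
    return victim;
}
```

A caller would treat a non-`NO_UTILE` value in `*evicted` as the point at which the evicted µTile's permissions, address range, and label have to be stored so it can be restored later. The linear scan is adequate for 16 slots; the domain index returned here stays inside the kernel and is never exposed to userspace.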
A µTile is inside the cache if it is already associated with a hardware domain; otherwise, the VM manager evicts another µTile based on the least recently used (LRU) caching policy, saving all the metadata required for restoring the evicted µTile's mapping and permission flags.

µTiles owners (or authorities) can change their µTiles' permissions via utile_mprotect. This operation is faster when the requested permission matches one of the domain's supported options (Table 4); otherwise it incurs the overhead of operations that affect the TLB. Any violation of µTiles permissions causes a µTiles fault that leads to the violating thread being terminated.

Userspace:

To reduce the size of the TCB, we did not modify existing system libraries (e.g., glibc) and instead provided a small userspace library for the µTiles operations that are summarized in Tables 5 and 6. As demonstrated, the library supports a familiar API for memory management within a µTile, including utile_malloc and utile_free; these are implemented as a custom memory allocator similar to HeapLayer [15] that allocates memory from an already mapped µTile. For each µTile, an in-kernel memory domain metadata structure keeps essential information such as the µTile address space range (base and length) and two lists of free blocks, from the head and the tail of the µTile region, that are used when searching for free blocks of memory.
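The domain cache described above (a µTile is a hit if it already owns a hardware domain; otherwise the least recently used µTile is evicted and its metadata saved) can be sketched in a few lines of Python. This is a toy model rather than the kernel implementation: the class name DomainCache, the touch method, the saved dictionary, and the use of a permission string as the only saved metadata are all hypothetical.

```python
from collections import OrderedDict


class DomainCache:
    """Toy LRU mapping from µTile IDs to a fixed pool of hardware domains."""

    def __init__(self, free_domains):
        self.free = list(free_domains)      # domain numbers not yet assigned
        self.mapped = OrderedDict()         # utile_id -> (domain, perms), oldest first
        self.saved = {}                     # permissions of evicted µTiles

    def touch(self, utile_id, perms="rw"):
        """Return (domain, evicted_id) for utile_id, evicting if needed."""
        if utile_id in self.mapped:         # hit: already has a domain
            self.mapped.move_to_end(utile_id)
            return self.mapped[utile_id][0], None
        evicted = None
        if self.free:                       # direct mapping to a free domain
            domain = self.free.pop(0)
        else:                               # virtualized: evict the LRU µTile
            evicted, (domain, old_perms) = self.mapped.popitem(last=False)
            self.saved[evicted] = old_perms  # keep what is needed to restore it
        perms = self.saved.pop(utile_id, perms)  # restore if evicted earlier
        self.mapped[utile_id] = (domain, perms)
        return domain, evicted
```

With two free domains, touching µTiles a, b, a, c in that order evicts b (the least recently used) when c arrives, and touching b again later restores its saved permissions on whichever domain is reclaimed.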

We evaluated our implementation of µTiles on a Raspberry Pi 3 Model B [4], which uses a Broadcom BCM2837 SoC with a 1.2 GHz 64-bit quad-core ARM Cortex-A53 processor with 32 KB of L1 and 512 KB of L2 cache memory, running an unmodified 32-bit Linux kernel version 4.19.42 and glibc version 2.28 as the baseline. We use microbenchmarks and compartmentalize real-world applications to evaluate µTiles in terms of performance and usability (§2.1 and §2.3) by answering the following questions:

• What is the initialization and runtime overhead of µTiles? How does utilizing hardware domains impact performance?
• Are µTiles practical and adaptable for real-world applications? How much application change and programming effort is required? What is the performance impact? How does it perform for hardening a multi-threaded environment?
• What is the memory footprint of µTiles? Is it suitable for small IoT devices? How much memory does it add (statically and dynamically) to both the kernel and userspace?

Creating µTiles:

Table 7 reports the cost of creating µTiles and mapping pages to them using utile_mmap when µTiles are directly mapped to hardware domains, compared to virtualized µTiles, where there is no free hardware domain and a µTile must be evicted from the cache. The results show that when there is a free hardware domain, performance improves by 4.9% compared to the virtualized case. Note that creating µTiles is usually a one-time operation at the initial phase of an application.

Figure 3: Cost of µTiles memory allocation (malloc & free versus utile_malloc & free; y-axis: latency in ms). On average utile_malloc outperforms malloc by a small rate (0.…

Operation                        Overhead   stddev
Direct utile_mmap/munmap         4.8%       ± 0.17%
Virtualised utile_mmap/munmap    10.01%     ± 0.15%

Table 7: Cost of creating µTiles when directly mapped to hardware domains vs. virtualized mapping that requires µTiles caching. The results show the average of 10000 runs.
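Mapping a range with utile_mmap tags a 1MB-aligned region that covers the requested range, as described in the implementation section. The rounding involved can be shown with a short Python sketch; the function name covering_region is hypothetical, and the sketch models only the address arithmetic, not the kernel's page-table or domain updates.

```python
SECTION_SIZE = 1 << 20  # 1 MB tagging granularity, per the text


def covering_region(addr, length):
    """Return (base, size) of the smallest 1MB-aligned region covering the range."""
    if length <= 0:
        raise ValueError("length must be positive")
    mask = ~(SECTION_SIZE - 1)
    base = addr & mask                                  # round the start down
    end = (addr + length + SECTION_SIZE - 1) & mask     # round the end up
    return base, end - base
```

In this model a 4 KB request that starts on a 1 MB boundary still yields a full 1 MB tagged region, and a request that straddles a boundary yields 2 MB.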

Memory protection & allocation:

Changing µTile permissions and memory allocation operations inside µTiles have the most impact on runtime overhead. We compared the performance of utile_mprotect and glibc mprotect for different permission flags. Since hardware memory domains do not have flexible access control options (§2.3), we cannot benefit from a domain switch through the DACR register for all possible permission flags, such as the RO, WO, and EO variants. Our results show that on average utile_mprotect is 1.17x faster than mprotect for no access (PROT_NONE) or RW permissions (PROT_READ | PROT_WRITE), but 1.3x slower for the read-only, write-only, and execute-only options, which are emulated.

Allocating memory using utile_malloc is on average 1.08x faster than glibc malloc for blocks ≤ … KB and introduces a small overhead (8.… for blocks > … KB), as demonstrated in Figure 3. This cost can be reduced by using high-performance memory allocators [34]. We report the average of running the microbenchmarks 20000 times; they show that µTiles adds only a small overhead for memory allocation and permission changes.

Threading:

We tested the cost of µTile threading operations (creating and joining) through utile_clone, which creates µTile-aware threads; utile_clone internally uses the clone syscall with minor modifications to restrict any credential sharing with the child by default (instead, it provides additional clone options for passing the parent's capabilities to its child). We implemented utile_join similarly to waitpid. Figure 4 shows that utile_clone outperforms pthread_create by 0.… and fork by 83.…; utile_clone simply does less sharing when initializing new threads.

Figure 4: Overhead of creating µTiles-enabled threads (utile_clone, pthread, and fork; launch and join latency in ms): the results are the average of 100000 runs with 1MB and 2MB heap sizes. On average, utile_clone latency is 5.39% lower than that of pthread_create.

Codebase overhead:

Another factor in the usability of µTiles is the codebase size, which is important both from a security perspective and given the resource limitations of small IoT devices. We implemented µTiles as a Linux kernel patch with no dependency on any third-party library. As Table 8 shows, it adds 3023 LoC to the kernel and 2405 LoC to userspace. It adds 7 KB to the kernel image size and 204 KB for kernel slabs at runtime. The userspace library needs only about 10 KB of memory. These results show that the µTiles memory footprint is extremely low and suitable for many resource-constrained uses.

Overhead           Linux Kernel                  Userspace
Added LoC          3023                          2405
Memory footprint   static (7KB), slab (204KB)    static (10KB)

Table 8: µTiles codebase size and memory footprint in the Linux kernel and userspace.
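The fast path of utile_mprotect discussed in the microbenchmarks depends on changing a domain's access field in the DACR register. On ARMv7-A, the DACR holds sixteen 2-bit fields, one per domain, encoding no access (0b00), client (0b01), or manager (0b11). The Python sketch below models only this encoding; the helper names dacr_set and dacr_get are hypothetical, and how µTiles maps its permission flags onto these fields (Table 4) is not reproduced here.

```python
# ARMv7-A DACR field values (0b10 is reserved).
NO_ACCESS = 0b00   # any access raises a domain fault
CLIENT = 0b01      # accesses are checked against page-table permissions
MANAGER = 0b11     # accesses are not checked against page-table permissions


def dacr_set(dacr, domain, access):
    """Return a copy of the 32-bit DACR value with one domain's field replaced."""
    if not 0 <= domain < 16:
        raise ValueError("ARMv7-A has 16 domains (0-15)")
    if access not in (NO_ACCESS, CLIENT, MANAGER):
        raise ValueError("invalid or reserved access value")
    shift = 2 * domain
    cleared = dacr & ~(0b11 << shift) & 0xFFFFFFFF   # zero the old field
    return cleared | (access << shift)


def dacr_get(dacr, domain):
    """Read the 2-bit access field for one domain."""
    return (dacr >> (2 * domain)) & 0b11
```

Because a permission change of this kind touches a single register field rather than page-table entries, it is plausible that it avoids the TLB-related cost the text attributes to the emulated permissions.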

OpenSSL is a widely used open-source library implementing cryptographic operations and the Transport Layer Security (TLS) protocol. It handles sensitive content such as private keys and encrypted data. Hence it benefits significantly from isolating its sensitive content in separate compartments to mitigate information leakage attacks (e.g., Heartbleed). We modified OpenSSL to use µTiles for protecting private keys from potential information leakage by storing the keys in protected memory pages inside a single µTile or in multiple µTiles, one assigned per private key. Using multiple µTiles provides stronger security while adding more overhead due to the cost of caching µTiles.

Figure 5: Overhead of httpd on unmodified OpenSSL vs. the µTiles-enabled one (latency in seconds against number of requests, for the original, µTiles with a single µTile, and µTiles with per-session µTiles).

To enable µTiles inside OpenSSL, all the data structures that store private keys, such as EVP_PKEY, needed protected heap memory allocation. This meant replacing OPENSSL_malloc with utile_malloc and using utile_mmap at the initialization phase to create one or multiple (per-session) µTiles to store private keys. After the keys are stored, access to the µTiles is disabled by calling utile_lock. Only trusted functions that require access to private keys (e.g., EVP_EncryptUpdate or pkey_rsa_encrypt/decrypt) can access the µTiles by calling utile_unlock. Modifying OpenSSL required fairly small code changes and added only 281 LoC.

We measured the performance overhead of µTiles-enabled OpenSSL by evaluating it on the Apache HTTP server (httpd), which uses OpenSSL to implement HTTPS. Figure 5 shows the overhead of ApacheBench on httpd with both the original OpenSSL library and the one secured with µTiles. ApacheBench is launched 100 times with various request parameters. We chose the TLS1.2 DHE-RSA-AES256-GCM-SHA384 cipher suite with 2048-bit keys for the evaluation.

The results show that on average µTiles introduces 0.47% performance overhead in terms of latency when using a single µTile for protecting all keys, and 3.…

To show how µTiles can be used for hardening multi-threaded applications, we modified Google's LevelDB, a fast key-value store and storage engine used by many applications as a backend database. It supports multithreading, both for concurrent writers inserting data into the database and for concurrent reads to improve performance. However, there is no privilege separation between threads, so threads cannot communicate securely with the database or protect their private content from other threads. We modified LevelDB to evaluate the performance overhead of using the µTiles secure threading model when each thread has its own private storage that cannot be accessed by other threads.

We replaced the LevelDB threading backend (env_posix), which uses pthreads, with µTiles-aware threading, where each thread creates an isolated µTile to protect its private storage and sensitive computations. We used the LevelDB db_bench tool (without modification) to measure the performance overhead of µTiles.

We generated a database of 400K records with 16-byte keys and 100-byte values (a raw size of 44.3MB). The number of reader threads is set to 1, 2, 4, 8, 16, and 32 for each successive run. The threads operate on randomly selected records in the database. The results in Figures 6 and 7 show how multithreading can improve the performance of LevelDB, and that using µTiles adds a small overhead on write (5%) and read (1.…

Figure 6: LevelDB: performance overhead of µTiles-based multithreading compared to pthread-based multithreading in terms of write throughput in MB/s (5%).

Figure 7: LevelDB: performance overhead of µTiles-based multithreading compared to pthread-based multithreading in terms of read throughput in MB/s (1.…
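The OpenSSL case study relies on a simple discipline: keys live in a µTile that is locked by default and unlocked only inside trusted functions. The Python sketch below is a userspace toy model of that calling pattern, not the µTiles API. Real enforcement happens in the kernel through hardware domains, and the ToyTile class, its methods, and the trusted_sign function are hypothetical stand-ins for utile_lock/utile_unlock and an OpenSSL routine.

```python
from contextlib import contextmanager


class ToyTile:
    """Toy stand-in for a µTile holding one secret; locked means no access."""

    def __init__(self, secret):
        self._secret = secret
        self.locked = True                  # owner locks right after storing

    def read(self):
        if self.locked:                     # models the fault on illegal access
            raise PermissionError("µTile is locked")
        return self._secret

    @contextmanager
    def unlocked(self):
        """Open the tile for the duration of a trusted call, then relock."""
        self.locked = False
        try:
            yield self
        finally:
            self.locked = True


def trusted_sign(tile, message):
    """Hypothetical trusted function: the only place the key is touched."""
    with tile.unlocked():
        key = tile.read()
    return sum(key) ^ sum(message)          # placeholder, not real cryptography
```

Relocking in a finally clause mirrors the intent that the key is reachable only for the duration of the trusted call, even if that call fails partway through.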
We have shown that µTiles provides a practical and efficient mechanism for intra-process isolation and inter-thread privilege separation on data objects. However, the mechanism can still be taken further.

For single-threaded scenarios (e.g., event-driven servers), although µTiles can protect sensitive content from unsafe libraries or untrusted parts of the application, it can be vulnerable if the untrusted modules are also µTiles-aware and already use the µTiles APIs. An untrusted library can use utile_get to query µTile IDs and use the API to reach them. Note that this is not an issue for untrusted legacy libraries. To remove the possibility of such attacks, it is better to run these unsafe libraries in a separate thread that is isolated through the µTiles abstraction.

Various covert attacks [45] and side-channel attacks such as Meltdown [35] and Spectre [30] demonstrate how hardware and kernel isolation can be bypassed [28]. µTiles is currently vulnerable to these classes of attacks, although the existing countermeasures within the Linux kernel offer sufficient protection. We believe these types of attacks are important security threats, and hardening µTiles against them would be significant future work.

Providing a solution that is compatible with various operating systems and heterogeneous hardware is challenging. Although we based our work on the Linux kernel and built the abstraction with minimal dependencies, some application modification is still required. We believe that building more compatibility layers into our existing userspace implementation is possible. We are open-sourcing our code and will incorporate further feedback and patches from the relevant upstream projects we have modified.

Although Linux is the most widespread general-purpose kernel for embedded devices, many even smaller devices depend on operating systems such as FreeRTOS. These often use ARM Cortex-M based hardware features for isolation (such as memory protection units (MPUs) [10, 46]) or more recent CPUs with the memory tagging extension [9]. We plan to explore the implementation of the µTiles kernel memory management on these single-address-space operating systems, as well as broadening the port to Intel architectures on Linux (where the memory domains support is generally simpler to use than on ARM).

There are many software- or hardware-based techniques for providing process and intra-process memory protection.

OS/hypervisor-based solutions:

Hardware virtualization features are used for in-process data encapsulation by Dune [14], which uses the Intel VT-x virtualization extensions to isolate compartments within user processes. However, the overheads of such hardware virtualization-based encapsulation are much heavier than those of µTiles and are not practical for IoT applications. ERIM [47], light-weight contexts (lwCs) [36], and secure memory views (SMVs) [27] all provide in-process memory isolation and have reduced the overhead of sensitive data encapsulation on x86 platforms. The µTiles abstraction provides stronger security guarantees and privilege separation. It allows more flexible ways of defining security policies for legacy code (e.g., within a single thread, as in our OpenSSL example). Its small memory footprint makes it suitable for smaller devices, and it takes advantage of efficient virtual memory tagging by using hardware domains to reduce overhead.

Burow et al. [18] also leverage Intel MPK and the memory protection extensions (MPX) to isolate the shadow stack. Our effort to provide an OS abstraction for in-process memory protection is orthogonal to, but more general than, these studies, all of which are potential use cases for µTiles. Our focus has also been on lowering the resource cost to work well on embedded and IoT devices, whereas these projects are currently x86-only.

HiStar [51] is a DIFC-based OS that supports fine-grained in-process address space isolation. It influenced our work, but we focused on providing a more general-purpose solution for small devices by basing our work on the Linux kernel instead of a custom operating system. Other DIFC-based systems support only per-process protection with very large overhead [32, 49] or need specific programming language support [43].

Compiler & Language Runtime:

Various compiler techniques introduce memory isolation as part of a memory-safe programming language. These approaches are fine-grained and efficient if the checks can be done statically [23]. However, such isolation is language-specific, relies on the compiler and runtime, and is not effective when applications are co-linked with libraries written in unsafe languages. µTiles abstractions are fine-grained enough to be useful to these tools, for example, to isolate unsafe bindings.

Software fault isolation (SFI) [44, 48] uses runtime memory access checks, inserted by the compiler or by rewriting binaries, to provide memory isolation in unsafe languages with substantial overhead. Bounds checks impose overhead on the execution of all components (even untrusted ones), and additional overhead is required to prevent control-flow hijacks, which could bypass the bounds checks [31]. ARMLock [54] is an SFI-based solution that offers lower overhead by utilizing ARM memory domains. Similarly, Shreds [19] provides new programming primitives for in-process private memory support. µTiles also uses ARM memory domains to improve the performance of intra-process memory protection, but it is a more flexible solution for intra-process privilege separation; it provides a new threading model for dynamic fine-grained access control over the address space with no dependency on a binary rewriter, specific compiler, or programming language (see Table 2).

Hardware-enforced techniques:

A wide range of systems use hardware enclaves such as Intel's SGX [7] or ARM's TrustZone [8] to provide a trusted execution environment that protects applications against a malicious kernel or hypervisor [12, 25, 26]. The trust model exposed by these hardware features is very rigid and usually results in porting monolithic codebases to execute within the enclaves. EnclaveDom [38] utilizes Intel MPK to provide in-enclave privilege separation. µTiles provides better performance and a more general solution with no dependency on these hardware features; hence it can be used for in-enclave isolation and secure multi-threading to improve both the security and performance of enclave-assisted applications.

Ultimately, dedicated hardware support for tagged memory and capabilities (e.g., ARM MTE [9]) would be the ideal platform to run µTiles on [52]. We plan to build this support as future work, with a view to analyzing whether the overall increase in hardware complexity offsets the resource usage in software for embedded systems.

We have presented µTiles, an OS abstraction with a set of security primitives and APIs for protecting data objects inside a shared address space and providing flexible privilege separation for multithreaded applications. We designed µTiles to be extremely lightweight for IoT applications, with no programming language requirements and with a small performance overhead, by utilizing efficient hardware-based memory protection that makes it practical for a variety of use cases and security-sensitive applications.

We thank Ed Nightingale, Reuben Olinsky, and Jewell Seay for helpful discussions, and David Chisnall, Jon Crowcroft, Marno van der Maas, and Ali Varamesh for feedback on earlier drafts of this paper.

References

[1] firejail. https://github.com/netblue30/firejail. Access Date: 2019-09-28.
[2] Format string vulnerability in the cherokee. Access Date: 2020-1-5.
[3] nsjail. https://github.com/google/nsjail. Access Date: 2019-09-28.
[4] Raspberry pi 3 model b.
[5] Iot developer survey 2019. https://iot.eclipse.org/resources/iot-developer-survey/iot-developer-survey-2019.pdf, 2019.
[6] Hussain MJ Almohri and David Evans. Fidelius charm: Isolating unsafe rust code. In Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy, pages 248–255. ACM, 2018.
[7] Ittai Anati, Shay Gueron, Simon Johnson, and Vincent Scarlata. Innovative technology for CPU based attestation and sealing. In Proceedings of the 2nd international workshop on hardware and architectural support for security and privacy, volume 13. ACM New York, NY, USA, 2013.
[8] ARM. ARM®v8-M Security Extensions: requirements on development tools. 2015.
[9] ARM. Arm® architecture reference manual armv8, for armv8-a architecture profile documentation, 2018.
[10] ARM. Cmsis-zone. https://arm-software.github.io/CMSIS_5/Zone/html/index.html, 2018.
[11] ARM ARM. Architecture reference manual. ARMv7-A and ARMv7-R edition, 2012.
[12] Sergei Arnautov, Bohdan Trach, Franz Gregor, Thomas Knauth, Andre Martin, Christian Priebe, Joshua Lind, Divya Muthukumaran, Dan O'keeffe, Mark Stillwell, et al. Scone: Secure linux containers with intel sgx. In OSDI, volume 16, pages 689–703, 2016.
[13] Andrew Baumann, Jonathan Appavoo, Orran Krieger, and Timothy Roscoe. A fork() in the road. In Proceedings of the Workshop on Hot Topics in Operating Systems, pages 14–22. ACM, 2019.
[14] Adam Belay, Andrea Bittau, Ali Mashtizadeh, David Terei, David Mazières, and Christos Kozyrakis. Dune: Safe user-level access to privileged CPU features. In Presented as part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 335–348, 2012.
[15] Emery D Berger, Benjamin G Zorn, and Kathryn S McKinley. Composing high-performance memory allocators. 2001.
[16] Andrea Bittau, Petr Marchenko, Mark Handley, and Brad Karp. Wedge: Splitting applications into reduced-privilege compartments. USENIX Association, 2008.
[17] David Brumley and Dawn Song. Privtrans: Automatically partitioning programs for privilege separation. In USENIX Security Symposium, pages 57–72, 2004.
[18] Nathan Burow, Xinping Zhang, and Mathias Payer. Sok: Shining light on shadow stacks. Pages 985–999. IEEE, 2019.
[19] Yaohui Chen, Sebassujeen Reymondjohnson, Zhichuang Sun, and Long Lu. Shreds: Fine-grained execution units with private memory. Pages 56–71. IEEE, 2016.
[20] Guilherme Cox and Abhishek Bhattacharjee. Efficient address translation for architectures with multiple page sizes. ACM SIGOPS Operating Systems Review, 51(2):435–448, 2017.
[21] Zhui Deng, Brendan Saltaformaggio, Xiangyu Zhang, and Dongyan Xu. iris: Vetting private api abuse in ios applications. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 44–56. ACM, 2015.
[22] Zakir Durumeric, Frank Li, James Kasten, Johanna Amann, Jethro Beekman, Mathias Payer, Nicolas Weaver, David Adrian, Vern Paxson, Michael Bailey, et al. The matter of heartbleed. In Proceedings of the 2014 conference on internet measurement conference, pages 475–488. ACM, 2014.
[23] Archibald Samuel Elliott, Andrew Ruef, Michael Hicks, and David Tarditi. Checked c: Making c safe by extension. Pages 53–60. IEEE.
[24] Andrew Ferraiuolo, Mark Zhao, Andrew C Myers, and G Edward Suh. Hyperflow: A processor architecture for nonmalleable, timing-safe information flow security. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages 1583–1600. ACM, 2018.
[25] Tommaso Frassetto, David Gens, Christopher Liebchen, and Ahmad-Reza Sadeghi. Jitguard: hardening just-in-time compilers with sgx. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 2405–2419. ACM, 2017.
[26] Le Guan, Peng Liu, Xinyu Xing, Xinyang Ge, Shengzhi Zhang, Meng Yu, and Trent Jaeger. Trustshadow: Secure execution of unmodified applications with arm trustzone. In Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services, pages 488–501. ACM, 2017.
[27] Terry Ching-Hsiang Hsu, Kevin Hoffman, Patrick Eugster, and Mathias Payer. Enforcing least privilege memory views for multithreaded applications. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 393–405. ACM, 2016.
[28] Tyler Hunt, Zhipeng Jia, Vance Miller, Christopher J Rossbach, and Emmett Witchel. Isolation and beyond: Challenges for system security. In Proceedings of the Workshop on Hot Topics in Operating Systems, pages 96–104. ACM, 2019.
[29] Intel. Intel® 64 and ia-32 architectures software developer's manual, 2019.
[30] Paul Kocher, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, and Yuval Yarom. Spectre attacks: Exploiting speculative execution. arXiv preprint arXiv:1801.01203, 2018.
[31] Koen Koning, Xi Chen, Herbert Bos, Cristiano Giuffrida, and Elias Athanasopoulos. No need to hide: Protecting safe regions on commodity hardware. In Proceedings of the Twelfth European Conference on Computer Systems, pages 437–452. ACM, 2017.
[32] Maxwell Krohn, Alexander Yip, Micah Brodsky, Natan Cliffer, M Frans Kaashoek, Eddie Kohler, and Robert Morris. Information flow control for standard os abstractions. In ACM SIGOPS Operating Systems Review, volume 41, pages 321–334. ACM, 2007.
[33] Benjamin Lamowski, Carsten Weinhold, Adam Lackorzynski, and Hermann Härtig. Sandcrust: Automatic sandboxing of unsafe components in rust. In Proceedings of the 9th Workshop on Programming Languages and Operating Systems, pages 51–57. ACM, 2017.
[34] Paul Liétar, Theodore Butler, Sylvan Clebsch, Sophia Drossopoulou, Juliana Franco, Matthew J Parkinson, Alex Shamis, Christoph M Wintersteiger, and David Chisnall. Snmalloc: a message passing allocator. In Proceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management, pages 122–135, 2019.
[35] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, and Mike Hamburg. Meltdown. arXiv preprint arXiv:1801.01207, 2018.
[36] James Litton, Anjo Vahldiek-Oberwagner, Eslam Elnikety, Deepak Garg, Bobby Bhattacharjee, and Peter Druschel. Light-weight contexts: An OS abstraction for safety and performance. In USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 49–64, 2016.
[37] Anil Madhavapeddy, Richard Mortier, Charalampos Rotsos, David Scott, Balraj Singh, Thomas Gazagnaire, Steven Smith, Steven Hand, and Jon Crowcroft. Unikernels: Library operating systems for the cloud. In Acm Sigplan Notices, volume 48, pages 461–472. ACM, 2013.
[38] Marcela S Melara, Michael J Freedman, and Mic Bowman. Enclavedom: Privilege separation for large-tcb applications in trusted execution environments. arXiv preprint arXiv:1907.13245, 2019.
[39] James Morris, Stephen Smalley, and Greg Kroah-Hartman. Linux security modules: General security support for the linux kernel. In USENIX Security Symposium, pages 17–31. ACM Berkeley, CA, 2002.
[40] Soyeon Park, Sangho Lee, Wen Xu, Hyungon Moon, and Taesoo Kim. libmpk: Software abstraction for intel memory protection keys. arXiv preprint arXiv:1811.07276, 2018.
[41] Donald E Porter, Silas Boyd-Wickizer, Jon Howell, Reuben Olinsky, and Galen C Hunt. Rethinking the library os from the top down. In ACM SIGPLAN Notices, volume 46, pages 291–304. ACM, 2011.
[42] Niels Provos, Markus Friedl, and Peter Honeyman. Preventing privilege escalation. In USENIX Security Symposium, 2003.
[43] Indrajit Roy, Donald E Porter, Michael D Bond, Kathryn S McKinley, and Emmett Witchel. Laminar: Practical fine-grained decentralized information flow control, volume 44. ACM, 2009.
[44] David Sehr, Robert Muth, Cliff L Biffle, Victor Khimenko, Egor Pasko, Bennet Yee, Karl Schimpf, and Brad Chen. Adapting software fault isolation to contemporary cpu architectures. 2010.
[45] Helgi Sigurbjarnarson, Luke Nelson, Bruno Castro-Karney, James Bornholt, Emina Torlak, and Xi Wang. Nickel: a framework for design and verification of information flow control systems. In USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 287–305, 2018.
[46] tock. Finer grained memory protection on cortex-m3 mpus. https://github.com/tock/tock/issues/1532, 2019.
[47] Anjo Vahldiek-Oberwagner, Eslam Elnikety, Nuno O Duarte, Michael Sammler, Peter Druschel, and Deepak Garg. ERIM: Secure, efficient in-process isolation with protection keys (MPK). In USENIX Security Symposium (USENIX Security 19), pages 1221–1238, 2019.
[48] Robert Wahbe, Steven Lucco, Thomas E Anderson, and Susan L Graham. Efficient software-based fault isolation. In ACM SIGOPS Operating Systems Review, volume 27, pages 203–216. ACM, 1994.
[49] Jun Wang, Xi Xiong, and Peng Liu. Between mutual trust and mutual distrust: practical fine-grained privilege separation in multithreaded applications. In USENIX Annual Technical Conference (USENIX ATC), pages 361–373, 2015.
[50] Robert NM Watson, Ben Laurie, Steven J Murdoch, Robert Norton, Michael Roe, Stacey Son, Munraj Vadera, Jonathan Woodruff, Peter G Neumann, Simon W Moore, et al. Cheri: A hybrid capability-system architecture for scalable software compartmentalization. Pages 20–37. IEEE, 2015.
[51] Nickolai Zeldovich, Silas Boyd-Wickizer, Eddie Kohler, and David Mazières. Making information flow explicit in histar. In Proceedings of the 7th symposium on Operating systems design and implementation, pages 263–278. USENIX Association, 2006.
[52] Nickolai Zeldovich, Hari Kannan, Michael Dalton, and Christos Kozyrakis. Hardware enforcement of application security policies using tagged memory.
[53] Project Zero. Introduction: Bugs in memory management code, 2019.
[54] Yajin Zhou, Xiaoguang Wang, Yue Chen, and Zhi Wang. Armlock: Hardware-based fault isolation for arm. In