Enabling Failure-resilient Intermittent Systems Without Runtime Checkpointing
Wei-Ming Chen, Student Member, IEEE, Tei-Wei Kuo, Fellow, IEEE, and Pi-Cheng Hsiu, Senior Member, IEEE
Abstract—Self-powered intermittent systems typically adopt runtime checkpointing as a means to accumulate computation progress across power cycles and recover system status from power failures. However, existing approaches based on the checkpointing paradigm normally require system suspension and/or logging at runtime. This paper presents a design which overcomes the drawbacks of checkpointing-based approaches, to enable failure-resilient intermittent systems. Our design allows accumulative execution and instant system recovery under frequent power failures while enforcing the serializability of concurrent task execution to improve computation progress and ensuring data consistency without system suspension during runtime, by leveraging the characteristics of data accessed in hybrid memory. We integrated the design into FreeRTOS running on a Texas Instruments device. Experimental results show that our design can still accumulate progress when the power source is too weak for checkpointing-based approaches to make progress, and improves the computation progress by up to 43% under a relatively strong power source, while reducing the recovery time by at least 90%.
Index Terms—Data consistency, system recovery, serializability, concurrency, energy harvesting, intermittent systems
I. INTRODUCTION
Applications based on smart embedded devices have become a ubiquitous part of daily life. However, powering such devices is a critical challenge because of their size restrictions and applications in large-scale scenarios. Energy harvesting has emerged as a promising alternative power source for these devices. To enable intermittent computing, self-powered systems typically checkpoint execution progress and data residing in volatile memory (VM) to non-volatile memory (NVM) at runtime such that the systems can be recovered after power resumption. However, because ambient power sources suffer from frequent power failures, the overheads incurred by frequent checkpoints could significantly reduce system performance, thus increasing the difficulty of designing hardware chips and system software.

Many attempts have been made to enable intermittent systems, which can survive in unstable power environments, at the level of hardware circuits, system architectures, and system software by efficiently checkpointing data residing in VM to NVM [5], [16], [26]. To accumulate execution progress made in different power-on periods, non-volatile processors (NVPs) have emerged as a potential solution by checkpointing volatile states in the CPU registers, allowing the system to resume from the drop-off point when power is restored [28]. The volatile states in main memory, including data, stacks, and heaps of tasks, can also be backed up to non-volatile memory so that the entire system can be recovered by restoring the checkpointed states after power resumption [10].

W.-M. Chen is with the Department of Computer Science and Information Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., Taipei 10617, Taiwan, and also with the Research Center for Information Technology Innovation (CITI), Academia Sinica, No. 128, Sec. 2, Academia Rd., Nankang Dist., Taipei 115, Taiwan (E-mail: [email protected]). T.-W. Kuo is with the Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106, Taiwan, and also with the College of Engineering, City University of Hong Kong, 88 Tat Chee Avenue, Kowloon Tong, Hong Kong (E-mail: [email protected]). P.-C. Hsiu is with the Research Center for Information Technology Innovation (CITI) and the Institute of Information Science (IIS), Academia Sinica, No. 128, Sec. 2, Academia Rd., Nankang Dist., Taipei 115, Taiwan, and also with the Department of Computer Science and Information Engineering, National Chi Nan University, No. 1, University Rd., Puli, Nantou 54561, Taiwan (E-mail: [email protected]). A preliminary version of this paper was presented at the IEEE/ACM Design Automation Conference (DAC) 2019.
Various mechanisms based on the checkpointing paradigm have also been introduced to adapt peripheral I/O devices (e.g., sensors [13], Wi-Fi modules [14], and electrophoretic displays [20]) to intermittent power supply [4].

Recently, increased interest has focused on adapting system software to NVP-based devices. The compilers for NVP-based systems have been designed to reduce the size of checkpointing data (e.g., stack [12] and register [27]), thus increasing checkpointing efficiency. In addition, task schedulers have been investigated to improve the quality delivered by the system in terms of respective performance indexes (e.g., the deadline miss rate [31] or system value [7]) under different application scenarios. To optimize performance in a best-effort fashion with unpredictable power supply, the forward progress of intermittent task execution was deemed a sensible index and maximized using redesigned resource allocation policies (e.g., the scheduler [23] or power manager [18]). Under weak power supply, program sections may be longer than power-on periods. Thus, to ensure forward progress while avoiding repeated code execution, program atomicity was supported in [11] by ensuring that an uninterruptible code section can be run through in one execution, and progress stagnation has been addressed in [8] by dynamically adapting the checkpoint interval and size to the harvested energy.

Without careful consideration of different system snapshots in the memory hierarchy, the checkpointing paradigm may suffer from inconsistency between the data in non-volatile memory and the restored task progress [24]. How to achieve data consistency is a critical issue because correctness is one of the basic requirements of computer systems. Some solutions have been proposed to eliminate consistency errors.
In particular, consistency-aware checkpointing approaches have been proposed to checkpoint the system at safe lines of program code [30] or to insert auxiliary code to ensure the correctness of all checkpoints [29]. A hardware scheme has been proposed to automatically checkpoint system states while discarding all speculative modifications which may lead to inconsistency [15]. Moreover, programming models have been proposed to prevent errors by performing data versioning for non-volatile data [17] or using a task-based execution model which only allows executing one task at a time in the system [19]. However, these solutions are either based on the checkpointing paradigm, which requires system suspension to back up volatile data frequently, resulting in non-negligible runtime overheads, or require changing the existing programming model, imposing a burden on application developers.

This paper proposes a failure-resilient design which overcomes the drawbacks of checkpointing-based approaches while preserving progress across power cycles. Our design, which is compatible with multitasking operating systems, enables intermittent systems to (1) run multiple tasks concurrently to improve computation progress, (2) achieve data consistency without system suspension during runtime, (3) recover instantly from power failures, and (4) accumulatively preserve computation progress across power cycles to avoid stagnation. To realize the design, we add a data manager and a recovery handler in an operating system, so that the system runtime can cope with intermittence and exempt application developers from this responsibility. The idea behind the design is to leverage the characteristics of data accessed in hybrid memory, where VM provides high-performance data access while NVM provides data persistency when power failures occur.

However, endowing intermittent systems with the four abilities raises corresponding challenges. First, serializability of concurrent task execution must be guaranteed.
In our design, the data manager allows two-version copies for each data object in VM and NVM to increase the concurrency, while ensuring that data objects modified by tasks in VM are written into NVM atomically and will not violate the serializability of task execution. Second, data consistency must be maintained. To this end, the recovery handler tracks the progress of all tasks, while the data manager ensures that a nonvolatile data version in NVM is consistent with the progress of finished tasks at all times. This also guarantees that a persistently consistent version is always available in NVM. Third, because the data is instantly recoverable, after power resumption, the recovery handler only needs to recreate and rerun unfinished tasks in VM, thereby achieving instant system recovery. Finally, to prevent tasks whose execution times are longer than power-on periods from being repeatedly recreated and rerun, the data manager will allocate their contexts in NVM so that the recovery handler can accumulatively complete them across multiple power cycles.

To evaluate the efficacy of our design, we integrated it into a real-time operating system called FreeRTOS, and conducted extensive experiments with a collection of real tasks on an ultra-lightweight platform, namely the Texas Instruments MSP-EXP430FR5994 LaunchPad. Compared to checkpointing-based approaches that require runtime checkpointing and/or system logging [10], [22], the proposed design can improve the forward progress by between 8% and 49% while maintaining data consistency under a strong power source, and can still make progress under a weak power source where checkpointing-based approaches cannot. Experimental results with various power traces also show that our design can reduce recovery time by at least 90%, thus making it particularly suitable for self-powered devices which may suffer from frequent power failures.

The remainder of the paper is organized as follows.
Section II provides background information and explains some drawbacks of existing approaches based on checkpointing. In Section III, we present the details of our failure-resilient design. Experimental results are reported in Section IV. Section V presents some concluding remarks.

II. BACKGROUND AND MOTIVATION
A. Ultra-lightweight Intermittent Devices
Fig. 1: System architecture of a self-powered device.
1) Hardware Architecture:
Figure 1 shows the system architecture of a typical ultra-lightweight device equipped with various hardware components. To provide basic computing functionality, such devices must contain essential hardware components like a CPU and main memory. The CPU executes program code in memory and performs general logic and arithmetic operations on data in the processor's registers and memory. Recently, such devices have increasingly used hybrid memory architectures to take advantage of the characteristics of hybrid memory. Specifically, volatile memory (VM) features high performance and low energy consumption for data access and is usually used to store runtime data, like task stacks and intermediate results. By contrast, non-volatile memory (NVM) features non-volatility and high capacity and is usually used to preserve data when power failures occur. To provide additional functionality that may be required by various applications, an ultra-lightweight device can be equipped with extra components like a DMA controller, a timer, a system clock, and external I/O ports. The DMA controller allows applications to manage memory without occupying CPU time, in that external hardware components connected via the I/O ports can directly access data in main memory through the DMA controller. For applications requiring timely responses, the real-time clock and timer can be used to measure time and trigger interrupt functions to handle events in real time.

To provide mobility without frequent recharging, energy harvesting has emerged as a promising power source for ultra-lightweight devices. However, power supplies reliant on energy harvesting are inherently unpredictable and unstable, increasing the difficulty of designing intermittently-powered devices.
For example, a sudden power loss will cause unsaved volatile data and the computing progress of tasks to be lost. To deal with this issue, non-volatile processors (NVPs) have emerged as a promising alternative to traditional processors. An intermittently-powered device equipped with an NVP typically contains a voltage detector and a capacitor. The capacitor saves (resp. uses) additional energy when the input (resp. output) voltage is higher (resp. lower) than the output (resp. input) voltage, while the voltage detector monitors the power supply voltage and can trigger specific functions when the voltage falls to a predefined threshold. By implementing backup/restore mechanisms triggered by the voltage detector, several checkpointing-based solutions have been proposed to allow for intermittent task execution [10], [21]. For example, an intuitive approach is to checkpoint all volatile data in VM (including registers and main memory) to NVM when the voltage falls below a threshold and then write the data back to VM when power is resumed.
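The intuitive checkpoint-all approach can be sketched in C as follows. The function names, the array-backed memory regions, and the `snapshot_valid` flag are our assumptions for illustration, not the implementation of any cited system; on a real device, `on_voltage_low()` would be an interrupt handler wired to the voltage detector, and the two arrays would correspond to SRAM and FRAM regions.

```c
#include <string.h>
#include <stdint.h>

#define VM_SIZE 64

static uint8_t vm[VM_SIZE];            /* stands in for volatile SRAM       */
static uint8_t nvm_snapshot[VM_SIZE];  /* stands in for an FRAM checkpoint  */
static int snapshot_valid = 0;

/* Hypothetical hook fired when the voltage detector crosses the threshold:
 * checkpoint all volatile data to NVM before power is lost. */
void on_voltage_low(void) {
    memcpy(nvm_snapshot, vm, VM_SIZE);
    snapshot_valid = 1;
}

/* Hypothetical hook fired on power resumption: restore the snapshot.
 * Returns 1 if a checkpoint was restored, 0 on a cold boot. */
int on_power_restore(void) {
    if (!snapshot_valid)
        return 0;
    memcpy(vm, nvm_snapshot, VM_SIZE);
    return 1;
}

uint8_t *vm_data(void) { return vm; }
```

Note that the entire VM region is copied on every checkpoint, which is exactly the suspension overhead the rest of this paper seeks to avoid.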
2) System Software:
A lightweight operating system, providing system services and exempting application developers from the responsibility of managing hardware resources, usually employs a scheduler to support multitasking and control the execution order of tasks. Specifically, once the system boots up, the scheduler will set up the timer to generate periodic interrupts that divide CPU time into slices. Whenever the CPU handles an interrupt, the scheduler is invoked to allocate the next time slice to a task selected to occupy the CPU and access memory in the subsequent time slice. Note that, if the selected task differs from the currently running task, the scheduler will first perform a context switch, which saves the running task's context by pushing the data of the CPU registers onto the running task's stack in memory and then restores the selected task's context by popping the previously saved data in the selected task's stack into the CPU's registers.

In a multitasking operating system, which enables tasks to be executed in an interleaved manner, the system typically supports concurrency control to allow concurrently executed tasks to access shared data objects. When tasks attempt to access the same data objects via the provided data access operations, the operating system controls the order of data access operations invoked by the tasks and manipulates the copies of data objects in memory, keeping data management transparent to the tasks. However, when data objects are concurrently accessed by interleaved tasks, the outcome of the data objects is not deterministic and depends on the execution order of the operations invoked by the tasks. To ensure data access predictability, the operating system should ensure that each task can be deemed to be executed in isolation by guaranteeing serializability, in that the concurrent execution of tasks must be equivalent to the case where these tasks are executed serially in some arbitrary order.
Note that any serial order of task execution is legitimate, so the resultant values of data are not deterministic. By allowing more tasks to be executed concurrently, the operating system increases the CPU utilization and thus improves the forward progress achieved by the system.
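As a toy illustration of why serializability matters, the following C sketch (all names are ours, not part of any cited system) contrasts a serial execution of two read-modify-write "tasks" on a shared object with a lost-update interleaving whose outcome matches no serial order:

```c
typedef struct { int x; } Shared;

/* Task A then task B, each running to completion: any serial order of the
 * two increments yields x == 2. */
int run_serial(void) {
    Shared s = {0};
    int r1 = s.x; s.x = r1 + 1;   /* task A: read, then write back     */
    int r2 = s.x; s.x = r2 + 1;   /* task B: sees A's committed result */
    return s.x;
}

/* An uncontrolled interleaving: both tasks read before either writes,
 * so A's update is lost. x == 1 corresponds to no serial order. */
int run_lost_update(void) {
    Shared s = {0};
    int r1 = s.x;                  /* A reads 0                 */
    int r2 = s.x;                  /* B reads 0 (interleaved)   */
    s.x = r1 + 1;                  /* A writes 1                */
    s.x = r2 + 1;                  /* B writes 1: lost update   */
    return s.x;
}
```

A serializability-enforcing runtime must reject (or abort and rerun) schedules like the second one.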
B. Drawbacks of Checkpointing
To preserve the forward progress of task execution, typical intermittent systems have to frequently checkpoint task status and/or data at runtime. However, adopting checkpointing-based approaches in intermittently-powered devices presents some critical drawbacks. First, to preserve execution progress made between power failures, at runtime these approaches periodically checkpoint the (entire or partial) snapshot in VM to NVM, so that, after power resumption, the system can be recovered to the latest checkpoint by restoring the snapshot from NVM to VM. Consequently, data inconsistency may occur if some data in NVM is modified between the latest checkpoint and a power failure [32]. Specifically, after power resumption, the execution progress will be rolled back to the latest checkpoint, whereas the data in NVM cannot be rolled back. This leads to data inconsistency between VM and NVM because the data in NVM may be modified again.

To achieve data consistency, a straightforward approach is to adopt system-wise checkpointing, which checkpoints an entire system snapshot, including data, heaps, and stacks of tasks [10]. This approach requires a lengthy suspension of all running tasks to ensure that all volatile content in VM is exclusively accessed by the checkpointing procedure, resulting in extra runtime overhead. To reduce the checkpoint size and time required by checkpointing, an alternative approach is to adopt logging-based checkpointing, which records and dumps all write-ahead logs and modified data residing in VM to NVM [22]. In this way, the system can traverse logs to recover inconsistent data accordingly by redoing (resp. undoing) modifications made by finished (resp. unfinished) tasks. However, such logging-based checkpointing approaches suffer from long recovery times due to log traversal and progress loss of unfinished tasks whenever a power failure occurs.
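The redo/undo recovery of logging-based checkpointing can be sketched as below. The log-record layout and the recovery order (redo oldest-first for finished tasks, undo newest-first for unfinished ones) follow generic write-ahead-logging conventions; they are our assumptions, not details of [22].

```c
typedef struct {
    int task_id;   /* task that performed the write       */
    int addr;      /* index of the modified data object   */
    int old_val;   /* value before the write (for undo)   */
    int new_val;   /* value after the write (for redo)    */
} LogRec;

static int is_finished(int id, const int *finished, int n_finished) {
    for (int j = 0; j < n_finished; j++)
        if (finished[j] == id) return 1;
    return 0;
}

/* Traverse the log after power resumption: redo modifications of finished
 * tasks, then undo modifications of unfinished tasks. Recovery time grows
 * with the log length, which is the drawback noted above. */
void recover(int *data, const LogRec *log, int n,
             const int *finished, int n_finished) {
    for (int i = 0; i < n; i++)            /* redo pass, oldest first  */
        if (is_finished(log[i].task_id, finished, n_finished))
            data[log[i].addr] = log[i].new_val;
    for (int i = n - 1; i >= 0; i--)       /* undo pass, newest first  */
        if (!is_finished(log[i].task_id, finished, n_finished))
            data[log[i].addr] = log[i].old_val;
}
```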
For intermittently-powered devices, which could suffer from extremely frequent power failures, the checkpointing paradigm may be unable to provide timely checkpointing and data recovery based on logs within a short power-on period. This observation suggests that intermittently-powered devices should be capable of not only progress accumulation within short power-on periods but also instant recovery immediately after power resumption. Furthermore, to improve the forward progress, task concurrency and data consistency should be achieved without runtime suspension and logging.

III. FAILURE-RESILIENT TASK EXECUTION
In this section, we present a failure-resilient design which allows instant system recovery from power failures and enables computation progress accumulation while achieving data consistency and the serializability of concurrent task execution without runtime checkpointing and logging. The rationale behind our design is to ensure that tasks are executed serially in the logical sense and all modifications to data objects in NVM are written atomically, while computation progress is accumulated by allocating data in VM or NVM instead of copying data from VM to NVM. Two components, namely a data manager and a recovery handler, are developed to realize the design. Section III-A gives a design overview, while Sections III-B and III-C respectively present some design details of the data manager and the recovery handler.
A. Design Overview
As shown in Figure 2, a lightweight operating system typically provides a task scheduler and memory management to support multitasking and allow tasks to access data in memory. The scheduler provides functions to create and delete tasks and controls the execution order of tasks. Whenever a task is created, the scheduler will allocate memory space in VM by default to the task and initialize the task's status. Then, the task will enter a ready queue and wait to be scheduled. At runtime, concurrently executed tasks can read, write, and commit data objects through the corresponding data access operations provided by the operating system. If data access operations made by interleaved tasks are uncontrolled, the resultant values of data are unpredictable and may be undesirable. Therefore, the operating system should guarantee serializability, in that the outcome of concurrently executed tasks is equivalent to the outcome of serially executed tasks in any serial order. The adoption of task concurrency can improve forward progress but significantly complicates data management in intermittent systems. Specifically, consistency between the data and execution progress of tasks must be achieved. This is particularly difficult for lengthy tasks whose execution times are longer than power-on periods, because a lengthy task can finish only if its execution progress is accumulated across different power-on periods; otherwise, it may continuously rerun and never finish.

Fig. 2: Our failure-resilient design.

Our design enables intermittently-powered systems to be capable of failure-resilient task concurrency without runtime checkpointing and logging. As shown in Figure 2, we employ a data manager to enforce the atomicity and serializability of concurrent task execution while maximizing forward progress, as well as a recovery handler to instantly recover the system after power is resumed. The data manager is responsible for allocating and maintaining data and task status in VM and NVM.
To ensure serializability, it replaces the original implementations of read, write, and commit operations, and allows two-version copies respectively in VM and NVM for each data object. Moreover, the data manager monitors the operations invoked by every task and validates whether serializability will be violated immediately before the task attempts to commit its modifications to data copies from VM to NVM. If serializability is violated, the task is simply aborted and recreated. To maintain data consistency, the recovery handler is responsible for keeping track of task execution progress as tasks are created, finished, and aborted. Specifically, once a task is created by the scheduler, the recovery handler records the task's attributes in NVM so that all unfinished tasks, which are volatile in VM, can be recreated after power resumption or task abortion. After a task is finished by successfully committing its modifications to data objects from VM to NVM, the recovery handler marks the task as finished, preventing the committed data objects from becoming inconsistent due to being repeatedly modified by finished tasks.

The data manager and the recovery handler also cooperate to accumulate the progress of lengthy tasks whose execution times are too long to be finished within one power-on period. Specifically, after power resumption, the recovery handler determines whether a task is lengthy based on whether the task has ever been rerun due to a power failure. Once a task is deemed lengthy while being recreated by the recovery handler, the data manager will allocate memory space in NVM (instead of the default VM) to the task so that its execution progress will become nonvolatile at the cost of lower execution performance. To avoid data inconsistency, before a power failure occurs (detected by a voltage detector in our implementation), the recovery handler forces the scheduler to context switch the currently executed lengthy task (if any) to prevent it from being scheduled at a low voltage.
After the task is switched out, the data of the CPU registers are automatically pushed to the top of its stack and its context can be preserved in NVM during power-off periods. After power resumption, those unfinished lengthy tasks can instantly resume from where they left off by simply being added into the scheduler during system recovery. This design allows lengthy tasks to accumulate their progress across power cycles without the additional overhead of memory copying required by runtime checkpointing between VM and NVM. Note that, to ensure serializability, the data manager allocates data copies modified by lengthy tasks in NVM and performs serializability validation as usual before a lengthy task attempts to commit its modifications. If serializability is violated, the lengthy task is also aborted and recreated.

B. Consistency-aware Memory Management

1) Task Context Allocation:
After a task is created by the scheduler, the data manager maintains the memory space allocated to the task as well as its stack. A task's stack stores local variables created by unfinished function calls invoked by the task. These variables will be declared and initialized by the system and then modified by the task at runtime. To maximize computation efficiency, when a task is created, its stack is allocated in VM by default and, during task execution, some variables in the stack will be fetched into the CPU registers.
Because the stack size is usually fixed and needs to be specified prior to task creation, the operating system normally supports dynamic memory allocation as well, allowing a task to acquire additional memory space to store local variables whose sizes will be specified at runtime. The data manager allocates the additional memory space required by the task from the system heap via system calls (e.g., malloc() and free() in a system-supported standard C library). Because the stacks and heaps of tasks are allocated in VM by default, when a power failure occurs, the contexts of unfinished tasks, which have yet to commit the modified data to NVM, will be lost as if the tasks had never been executed.

To preserve the contexts of lengthy tasks (as determined by the recovery handler) and modified data during power-off periods, the data manager instead allocates their stacks, heaps, and all used memory space in NVM. Moreover, the data manager uses the memory management mechanism provided by the operating system to maintain the memory space used by lengthy tasks in data structures which are stored in NVM, so that, after power resumption, the stacks, heaps, and memory space allocated to lengthy tasks can be found and reused accordingly. However, if a power failure occurs during the execution of a lengthy task, its context will become invalid because the variables currently fetched into the CPU registers will be lost, whereas the stack and heap will still be preserved in NVM, resulting in inconsistent task contexts in the memory hierarchy. Thus, to completely preserve the contexts of lengthy tasks, we prevent a lengthy task from being scheduled at a low voltage by forcing the scheduler to context switch the currently executed task if it is lengthy, so that the variables fetched into the CPU registers will be pushed on top of the task's stack and also preserved in NVM during power-off periods.
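The region-aware allocation policy can be sketched as follows. The bump allocators and pool names are stand-ins of our own invention; in a FreeRTOS port, the two pools would be separate heap regions placed by the linker in SRAM and FRAM, respectively.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL 256

static uint8_t vm_pool[POOL];    /* stands in for the SRAM heap */
static uint8_t nvm_pool[POOL];   /* stands in for the FRAM heap */
static size_t vm_top = 0, nvm_top = 0;

/* Allocate task memory (stack or heap space) from VM by default, or from
 * NVM for lengthy tasks so their contexts survive power failures. */
void *task_alloc(size_t n, int is_lengthy) {
    if (is_lengthy) {
        if (nvm_top + n > POOL) return 0;
        void *p = &nvm_pool[nvm_top];
        nvm_top += n;
        return p;
    }
    if (vm_top + n > POOL) return 0;
    void *p = &vm_pool[vm_top];
    vm_top += n;
    return p;
}

/* Illustrative check of which region a block landed in. */
int in_nvm(const void *p) {
    const uint8_t *q = p;
    return q >= nvm_pool && q < nvm_pool + POOL;
}
```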
2) Two-version Data Allocation:
In addition to local variables, all tasks are allowed to access data objects which may be shared by multiple concurrently executed tasks. The data manager maintains two respective versions (i.e., a working version and a consistent version) for each data object, where the working version (allocated in VM by default unless otherwise specified) provides high performance and energy efficiency for data access, while the consistent version (stored permanently in NVM) provides reliability and persistency when a power failure occurs. Moreover, the data manager allows multiple working copies for the working version of a data object, as well as a temporary copy and a persistent copy, respectively in VM and NVM, for the consistent version, while keeping the two copies identical at all times. This can increase the flexibility of concurrent task execution by allowing multiple tasks to simultaneously access the same data object, thus improving forward progress. All data accesses are via the three operations, namely read, write, and commit, provided by the data manager. A task can read a data object by obtaining its memory address via the read operation. To improve data access efficiency, we adopt the copy-on-write strategy for the write operation. Specifically, once a task attempts to modify a data object that has yet to be modified by the task, a working copy of the data object will be created and dedicated for the task to read and write afterward. To update the persistent copy of the consistent version in NVM, a task must perform the commit operation, and the update will be made only if the serializability condition is not violated. By using these operations to access data objects, data management can take advantage of the characteristics of hybrid memory while being transparent to tasks.

Fig. 3: Data copies accessed by three operations. (a) Data access and allocation policy for non-lengthy tasks. (b) Data access and allocation policy for lengthy tasks.

The data manager carries out the three operations to improve forward progress while ensuring data consistency. Figure 3(a) shows the data copies accessed when a non-lengthy task invokes each of the three operations. Once a task invokes the read operation on a data object, the temporary copy of the data object is read by default unless the working copy dedicated for the task is available (i.e., the data object has been modified by the task). However, if the temporary copy in VM is not identical to the persistent copy in NVM (e.g., the system resumes after a power failure), the temporary copy is deemed to be invalid and the persistent copy is read instead. Considering the access efficiency of writing a data object, the data manager allocates the working copy dedicated for each task in VM by default. A task calls the commit operation immediately before finishing its execution to update the consistent versions of those data objects modified by the task. Before updating the consistent versions, the data manager validates whether the update violates the serializability of those finished tasks. If serializability is violated, the task is aborted and rerun. Otherwise, for each data object modified by the finished task, its persistent copy in NVM is updated as the task's working copy, and the working copy becomes its temporary copy in VM. Note that a data object may have multiple working copies if it is accessed concurrently by several tasks, and the working copy left by the most recently finished task in VM always transits into the temporary copy.

The data copies accessed by a lengthy task are slightly different from the copies accessed by a non-lengthy task, because the former is intermittently executed in NVM while the latter is atomically executed in VM. Figure 3(b) shows the data copies accessed when a lengthy task invokes each of the three operations.
Once the lengthy task attempts to read a data object, the data copy to be read is determined according to the default rule applied to non-lengthy tasks. The main difference is that the working copies of all data objects modified by the lengthy task will be allocated in NVM (instead of VM) to ensure that the execution progress and data modifications of lengthy tasks are consistent in NVM across power-on periods. When the lengthy task attempts to commit its modifications, serializability is also validated to determine whether the update to consistent versions is permitted or the task should be aborted and rerun. However, if the commit operation is permitted, the working copy of each modified data object directly transits into its persistent copy in NVM (without additional memory copying from VM to NVM), but the temporary copy remains unchanged and becomes invalid in VM because it may not be identical to the persistent copy.

To prevent data corruption due to a power failure during the commit operation, the operation must be implemented atomically. In other words, to prevent partial updating of the consistent version in NVM, the commit operation must atomically update none or all of the modifications made by the task. To this end, we borrow an idea proposed to atomically update shadow pages from [9] and use a bit map stored in NVM to maintain the addresses of valid persistent copies. Specifically, for each data object committed by a non-lengthy task, its modification will first be made on a shadow copy in NVM, while for each data object committed by a lengthy task, its working copy will first transit into a shadow copy. Then, those modified shadow copies and their persistent copies will be swapped by updating the bit map only after all modifications or transitions for the committed data objects are finished.
Because updating the bit map only requires one CPU instruction, which is the minimum execution unit of a CPU, the commit operation is atomic and resilient against power failures. More implementation details will be discussed in Section III-D3.
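A minimal sketch of this bit-map commit follows; the two-slot layout per object and the word-sized bit map are our own encoding, assumed for illustration. Each staged object writes its new value into its currently invalid (shadow) slot, and a single store to the bit map then flips every staged object to valid at once, so a power failure before that store leaves all persistent copies untouched.

```c
#include <stdint.h>

#define NOBJ 8

static int      slots[NOBJ][2];  /* two NVM slots per data object        */
static uint32_t valid_bits = 0;  /* bit i selects the valid slot of obj i */

/* Read the persistent copy of object i. */
int read_persistent(int i) {
    return slots[i][(valid_bits >> i) & 1u];
}

/* Stage a new value for object i in its shadow slot; return the bit that
 * must be flipped to make the shadow slot the persistent copy. */
uint32_t stage(int i, int v) {
    unsigned shadow = 1u ^ ((valid_bits >> i) & 1u);
    slots[i][shadow] = v;
    return 1u << i;
}

/* One store to valid_bits commits every staged object atomically. */
void commit_bits(uint32_t mask) {
    valid_bits ^= mask;
}
```

On the target platform this single word write corresponds to the one-instruction bit-map update described above.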
3) Serializability Validation:
The data manager ensuresserializability with a backward validation procedure whichdetermines whether those finished tasks remain serializable ifa new commit operation is performed by a task. To this end,for each finished task (whether it is lengthy or non-lengthy),the validation procedure maintains a validity time interval ,in which the task can be viewed as having been executed inisolation. Moreover, each data object is also associated witha validity time interval which is updated as the validity timeinterval of the most recently finished task that commits theobject. The validation procedure is invoked whenever a taskattempts to commit its modifications to data objects. If a validvalidity time interval can be derived for the task, its commitoperation proceeds; otherwise, it is aborted and rerun.Algorithm 1 implements the validation procedure whichdetermines the validity time interval ( T.begin to T.end ) for agiven task T . At runtime, the data manager records all readactions, R = { r ...r n } , made by task T on the temporary orpersistent copies, as well as all write actions, W = { w ...w m } ,made by task T on its working copies. Each read action r i records the validity time interval ( r i .begin to r i .end ) ofthe data object, r i .obj , read by task T via the i th readoperation. Similarly, each write action w i records the validitytime interval ( w i .begin to w i .end ) of the data object, w i .obj ,written by task T via the i th write operation. The algorithmfirst initializes the validity time interval of task T as the rangefrom the outset to the current time of the system (Lines 1-2). Then, the interval shrinks according to the task’s read and In our implementation, the unit of validity time intervals is set as onesingle system time tick, and the time tick is triggered every 2ms on the usedTexas Instruments platform.
Algorithm 1
Validation Procedure
Input: T, R = {r_1, ..., r_n}, W = {w_1, ..., w_m}
 1: T.begin = 0;
 2: T.end = get_current_time();
 3: for i = 1 : n do
 4:     T.begin = max(T.begin, r_i.begin + 1);
 5:     if r_i.obj was first modified by any finished task τ after r_i then
 6:         T.end = min(T.end, τ.begin − 1);
 7: for i = 1 : m do
 8:     T.begin = max(T.begin, w_i.begin + 1);
 9:     if w_i.obj was last modified by any finished task τ after w_i then
10:         T.begin = max(T.begin, τ.begin + 1);
11: if T.begin ≤ T.end then
12:     commit(T);
13: else
14:     abort(T);

write actions. For each read action r_i, the beginning of the time interval of task T is pushed forward because T can only read the data object r_i.obj after the object is committed by another finished task at r_i.begin (Line 4). Moreover, if the data object r_i.obj is first committed again by any finished task τ after r_i, the end of the interval of task T is pushed backward so that task T can be viewed as having finished before task τ started (Lines 5-6). Similarly, for each write action w_i, the beginning of the interval of task T is pushed forward because T can only commit the data object w_i.obj after the object is committed at w_i.begin (Line 8). Moreover, if the data object w_i.obj is last committed again by any finished task τ after w_i, the beginning of the interval of task T is further pushed forward so that task T can be viewed as having started after task τ committed (Lines 9-10). Finally, the algorithm checks whether the time interval of task T is valid (Line 11). If the interval is not empty, the commit operation is performed (Line 12); otherwise, task T is aborted (Line 14).
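The backward validation can be sketched in C as follows. This is a minimal illustration, not the released implementation: the struct layout and field names are our assumptions, and `later_begin` stands for the interval begin of the finished task τ that committed the same object again after the recorded action (0 if no such task exists).

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ACTIONS 8

/* One recorded read or write action: `begin` is the interval begin of the
 * accessed object; `later_begin` is the interval begin of the finished
 * task that committed the same object again after this action (0 = none). */
typedef struct { uint32_t begin; uint32_t later_begin; } action_t;

typedef struct {
    uint32_t begin, end;            /* derived validity interval of T */
    action_t r[MAX_ACTIONS]; int n; /* read actions  */
    action_t w[MAX_ACTIONS]; int m; /* write actions */
} task_t;

/* Backward validation following Algorithm 1; returns 1 to commit, 0 to abort. */
static int validate(task_t *t, uint32_t now)
{
    t->begin = 0;                                   /* Line 1 */
    t->end = now;                                   /* Line 2 */
    for (int i = 0; i < t->n; i++) {                /* Lines 3-6 */
        if (t->r[i].begin + 1 > t->begin) t->begin = t->r[i].begin + 1;
        if (t->r[i].later_begin && t->r[i].later_begin - 1 < t->end)
            t->end = t->r[i].later_begin - 1;       /* finish before writer */
    }
    for (int i = 0; i < t->m; i++) {                /* Lines 7-10 */
        if (t->w[i].begin + 1 > t->begin) t->begin = t->w[i].begin + 1;
        if (t->w[i].later_begin && t->w[i].later_begin + 1 > t->begin)
            t->begin = t->w[i].later_begin + 1;     /* start after writer */
    }
    return t->begin <= t->end;                      /* Line 11 */
}
```

For example, a task that read an object committed at tick 5 and wrote an object committed at tick 3, with no later conflicting commits, validates with the interval [6, now]; a later conflicting commit that shrinks the interval to empty forces an abort.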
4) Property Analysis:
We now analyze the time complexity of Algorithm 1 and prove that it maintains the serializability of the finished tasks. To prove serializability, we first construct a precedence graph based on the data access operations made by finished tasks. In the precedence graph, each node represents a finished task, and an arc between two nodes indicates the precedence order between the two tasks due to their data access patterns on some shared data objects. Then, we show that the precedence graph is acyclic. An acyclic graph indicates that all finished tasks are conflict-serializable [9], in that the data access operations conducted by all finished tasks can be viewed as if they were conducted by the tasks executed in a serial order according to the graph.
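The acyclicity test on such a precedence graph can be sketched with a standard three-color depth-first search. This is an illustrative check only; the paper's runtime design does not build the graph but validates intervals instead.

```c
#include <assert.h>

/* Cycle check on a precedence graph of finished tasks given as an
 * adjacency matrix: adj[u][v] = 1 means task u must precede task v. */
#define NTASKS 4

/* state: 0 = unvisited, 1 = on the current DFS path, 2 = fully explored. */
static int dfs(int adj[NTASKS][NTASKS], int u, int *state)
{
    state[u] = 1;
    for (int v = 0; v < NTASKS; v++) {
        if (!adj[u][v]) continue;
        if (state[v] == 1) return 1;    /* back edge: cycle found */
        if (state[v] == 0 && dfs(adj, v, state)) return 1;
    }
    state[u] = 2;
    return 0;
}

/* Returns 1 if the precedence graph contains a cycle (not serializable). */
static int has_cycle(int adj[NTASKS][NTASKS])
{
    int state[NTASKS] = {0};
    for (int u = 0; u < NTASKS; u++)
        if (state[u] == 0 && dfs(adj, u, state)) return 1;
    return 0;
}
```

A chain τ_1 → τ_2 → τ_3 → τ_4 is acyclic (serializable in that order); adding a back arc τ_4 → τ_1 creates a cycle, which is exactly the situation the validation procedure prevents.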
Lemma 1.
The time complexity of Algorithm 1 for validating a task T is O(N + M), where N and M respectively represent the number of data objects accessed by T and the number of tasks concurrently executed with T.

Proof. To validate whether serializability is maintained after task T is committed, Algorithm 1 examines the read and write actions made by the task. Because task T can read at most N data objects, which may be concurrently modified by at most M tasks, for all read actions, the algorithm needs to check the validity time intervals of at most N data objects and at most M concurrently executed tasks that may modify the same data objects. Moreover, for all write actions, the algorithm needs to check the validity time intervals of at most N data objects written by task T. Therefore, the time complexity of the algorithm is O(N + M).

Theorem 1.
All the finished tasks validated by Algorithm 1 are conflict-serializable.

Proof.
This theorem can be proved by showing that the precedence graph is acyclic. For ease of presentation, let G_i represent the precedence graph constructed from the first i finished tasks. We prove the theorem by mathematical induction on index i for i ≥ 1. As the induction basis, when i = 1, the theorem holds because only one task is committed and no cycle can be formed in the precedence graph G_1. For the induction hypothesis, suppose that the claim holds for the first k finished tasks, so that the precedence graph G_k is acyclic. We show that the claim also holds for the first k + 1 finished tasks.

We prove that the precedence graph G_{k+1} is acyclic by contradiction. Suppose that G_{k+1} contains a cycle formed by a task set {τ_1, τ_2, ..., τ_c} in G_k and the latest committed task T. Without loss of generality, assume that these tasks finished in the order τ_1, τ_2, ..., τ_c, and T. A cycle formed immediately after task T is committed indicates that the graph contains an arc from T to τ_1 and an arc from τ_c to T. Based on the precedence relationship determined by the algorithm, the arc from T to τ_1 implies that the validity time interval of T is earlier than that of τ_1. Similarly, the arc from τ_c to T implies that the validity time interval of T is later than that of τ_c. In other words, the validity time interval of τ_c is earlier than that of τ_1. However, because τ_1, τ_2, ..., τ_c finished in order, the validity time interval of τ_1 must be earlier than that of τ_c. This contradiction implies that G_{k+1} cannot contain a cycle if T is permitted to be committed. Therefore, we can conclude that all the finished tasks validated by Algorithm 1 are conflict-serializable.

C. Instant System Recovery

1) Data Recovery:
To enable the system to recover to a consistent state from power or task failures, the recovery handler maintains consistency between data objects and task execution. Recall that, by atomically updating the persistent copies in NVM, the data manager prevents data objects from being partially modified even if a power failure occurs during an update. Therefore, although all temporary copies in VM are lost after a power failure, once power is restored, a persistent copy of each data object can be accessed immediately from NVM, and its temporary and working copies can be recreated in VM, when necessary, according to the persistent copy. As a result, the recovery handler achieves instant data recovery.
2) Task Recovery:
To recover tasks when power is resumed or validation fails, the recovery handler records whether each task is finished, along with the attributes (e.g., code address, name, stack size, and priority) needed to recreate the task. Note that this information is stored in a data structure in NVM. Based on the information, the recovery handler monitors unfinished tasks, recreates aborted tasks, and maintains consistency between task execution and data objects. Specifically, if the validation result for a task is not serializable, the recovery handler is notified to rerun the task by aborting and recreating it. Similarly, if a power failure occurs, although the execution progress (e.g., data, stacks, and heaps) of non-lengthy tasks in VM and the CPU registers are lost, the recovery handler simply identifies the unfinished tasks according to the data structure and recreates the non-lengthy ones, thereby achieving instant task recovery.

To prevent tasks whose execution times are too long to finish within one power-on period from being repeatedly recreated and rerun, the recovery handler detects lengthy tasks and allows their computation progress to be preserved in NVM across multiple power-on periods. When power is resumed, if a task has been recreated before and still could not finish within the latest power-on period, the task will be created in NVM and deemed a lengthy task thereafter. Note that whether a task is classified as lengthy depends on the power conditions at the moment. At runtime, to successfully preserve the context of a lengthy task, including its variables fetched into the CPU registers, the recovery handler forces the currently executed task, if it is lengthy, to be switched out at a low voltage (before a potential power failure occurs). Whenever the currently executed task is switched out, the recovery handler removes all lengthy tasks from the ready queue and prevents them from being scheduled and executed until the next power-on period.
Recall that once a lengthy task is detected and created, the data manager allocates its context in NVM so that its computation progress can be preserved across power cycles. Therefore, after each power resumption, the recovery handler only needs to repeatedly add (instead of recreating) the task to the ready queue of the scheduler until the task is finished. More implementation details about low-voltage detection and context switch enforcement will be discussed in Sections III-D4 and III-D5, respectively.
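The per-task recovery decision can be sketched as follows. The struct and function names are ours; the actual NVM record holds the attributes listed in the text (code address, name, stack size, priority, and execution status).

```c
#include <assert.h>

/* Per-task record kept by the recovery handler in NVM (field names are
 * ours, for illustration). */
typedef struct {
    int finished;   /* has the task committed? */
    int lengthy;    /* context allocated in NVM across power cycles? */
} task_rec_t;

enum recovery_action { SKIP, RECREATE, REENQUEUE };

/* A task that was recreated before and still failed to finish within the
 * latest power-on period is deemed lengthy thereafter. */
static void classify(task_rec_t *t, int recreated_before)
{
    if (!t->finished && recreated_before) t->lengthy = 1;
}

/* On power resumption: finished tasks need no action, lengthy tasks are
 * simply re-added to the ready queue (their contexts survived in NVM),
 * and non-lengthy unfinished tasks are recreated from their attributes. */
static enum recovery_action recover_action(const task_rec_t *t)
{
    if (t->finished) return SKIP;
    return t->lengthy ? REENQUEUE : RECREATE;
}
```

The key distinction is that a lengthy task is re-enqueued rather than recreated, so its NVM-resident context (and hence its accumulated progress) carries over to the next power-on period.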
D. Implementation Issues
Our design was integrated into FreeRTOS [2], a real-time operating system supporting many kinds of commercial microcontrollers, running on an MSP-EXP430FR5994 LaunchPad [3], a Texas Instruments platform featuring 256 KB FRAM (ferroelectric random access memory) and 8 KB on-chip SRAM (static random-access memory). For portability across different system architectures and platforms, we integrated the proposed design into FreeRTOS while minimizing kernel code modifications. Our implementation comprises 11 files and 1460 lines of C code, among which 72 lines are scattered across 3 files belonging to the kernel. We discuss some technical issues that arose when implementing our design in FreeRTOS.
1) Operating System Integration:
The data manager and the recovery handler are implemented on top of the memory management and the task scheduler, respectively. FreeRTOS provides a set of APIs for program developers to create, schedule, suspend, and delete tasks, where the created tasks are executed by the scheduler using a round-robin scheduling policy. The scheduler records the status of each task by maintaining its task control block, which keeps the task's information (e.g., code address, stack address, and priority) in VM by default. (The intermittent OS is released under an open-source license and available at https://github.com/meenchen/Intermittent-OS.) Through the APIs, the recovery handler recreates tasks aborted due to validation or power failures. Moreover, if a task is deemed lengthy during system recovery, the scheduler stores the task's control block in NVM, keeping the statuses of lengthy tasks across power failures. To this end, we extend the kernel to record task attributes once a task is created, to allocate the contexts of lengthy tasks in NVM, and to count the number of context switches as the current timestamp (to avoid the additional overhead caused by frequently accessing the real-time clock or timer).

On the other hand, FreeRTOS supports a memory management mechanism and provides interfaces to allocate and deallocate heap memory in VM by default. Through these interfaces, the data manager manages the physical addresses of the working and temporary copies of each data object and reclaims their space when they become invalid. In addition, we reserve an amount of memory space in NVM and adopt the memory management mechanism to maintain the persistent copies of data objects, the contexts of lengthy tasks, and the task attributes (i.e., code address, priority, stack size, name, and execution status) of the unfinished tasks recorded by the recovery handler.
Note that, to ensure the serializability of concurrent task execution, our read, write, and commit operations replace the original implementations provided by typical operating systems with versions that incorporate concurrency control.
2) Early-abortion Validation:
Whenever a task finishes, immediately before its modified data objects are committed, the data manager invokes the proposed algorithm to validate the serializability of the read and write actions performed by the task; if serializability is violated, the task is aborted and rerun. To mitigate wasting computation power on non-serializable tasks, we detect whether a task becomes non-serializable during its execution by executing parts of Algorithm 1 incrementally. This is achieved by adding a data structure to record the validity time intervals of all tasks and data objects. According to the algorithm, the time interval of a task shrinks immediately when the task reads a data object (Lines 3-4) or when a data object read by the task is first committed by another task (Lines 5-6). Therefore, whenever the task performs a read operation, we can check whether the time interval of the task remains valid and early-abort the task once the interval becomes empty. Note that the first half of the validation procedure (i.e., Lines 3-6 of the algorithm), which has already been executed during task execution, can thus be skipped when the task attempts to commit its modifications.
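The early-abort bookkeeping can be sketched as two hooks that shrink a running task's interval during execution, mirroring Lines 3-6 of Algorithm 1. The struct and hook names are ours; we also assume commit ticks start at 1, so the subtraction below cannot underflow.

```c
#include <assert.h>
#include <stdint.h>

/* Incremental interval tracking for early abortion (illustrative sketch). */
typedef struct { uint32_t begin, end; int aborted; } live_task_t;

/* Called inside the read operation; obj_begin is the validity-interval
 * begin of the object being read (Line 4 of Algorithm 1). */
static void on_read(live_task_t *t, uint32_t obj_begin)
{
    if (obj_begin + 1 > t->begin) t->begin = obj_begin + 1;
    if (t->begin > t->end) t->aborted = 1;   /* empty interval: early abort */
}

/* Called when another finished task first re-commits an object this task
 * has read; writer_begin (assumed >= 1) is that task's interval begin
 * (Lines 5-6 of Algorithm 1). */
static void on_conflicting_commit(live_task_t *t, uint32_t writer_begin)
{
    if (writer_begin - 1 < t->end) t->end = writer_begin - 1;
    if (t->begin > t->end) t->aborted = 1;
}
```

Because these checks run on every read, a doomed task is detected as soon as its interval empties, and the first half of the commit-time validation becomes redundant.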
3) Atomic Commit Operations:
When a task attempts to commit its modifications to persistent copies, the commit operation must atomically update none or all of the modifications to prevent the consistent version in NVM from being partially updated and thus becoming inconsistent. To this end, we borrow the idea used to atomically update shadow pages in database systems [9] and employ a data structure stored in NVM to maintain the addresses of persistent copies so that all updates to the consistent version can be finalized by one single CPU instruction. A similar idea was also adopted in [6] to realize atomic commit operations. As shown in Figure 4, the data structure contains two address maps and a bit map. Each entry in an address map stores the physical address of a persistent copy, and each bit in the bit map is associated with a data object to indicate which address map holds the valid address of its persistent copy (e.g., in the figure, address map 0 holds the valid addresses of the persistent copies of data objects 0 and 1).

Fig. 4: Data structure for addressing persistent data copies.

Whenever a task finishes, for each data object modified by the task, the commit operation updates the invalid address in one of the two maps to the new address (e.g., if data object 0 is modified, the first entry in address map 1 is updated to its new address). Then, after the addresses of all modified data objects have been updated, the commit operation simultaneously toggles the corresponding bits in the bit map (e.g., for data object 0, the first bit in the bit map is changed from 0 to 1). Because updating the bit map only requires one CPU instruction, which is the minimum execution unit of a CPU, the commit operation is atomic and resilient against power failures. However, the maximum number of bits updated by one CPU instruction is limited by the bit width of the CPU.
For example, the Texas Instruments platform provides a 16-bit CPU, so at most 16 data objects can be simultaneously committed with the data structure. This should be sufficient for most applications on lightweight embedded devices. If the number of modified data objects exceeds the CPU bit width, a hierarchical bit map can be used to extend the data structure to commit a larger number of data objects.
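The two address maps and the bit map can be sketched as follows. This is an illustrative layout assuming at most 16 data objects; `commit_objects` and the field names are ours, not the released implementation.

```c
#include <assert.h>
#include <stdint.h>

#define N_OBJS 16

/* Directory of persistent copies kept in NVM: two address maps plus a
 * 16-bit map whose bit i selects the valid map for data object i. */
typedef struct {
    void *addr_map[2][N_OBJS];
    volatile uint16_t bitmap;
} directory_t;

static void *valid_copy(directory_t *d, int obj)
{
    return d->addr_map[(d->bitmap >> obj) & 1][obj];
}

/* Commit new persistent copies for the objects whose bits are set in
 * `mask`: step 1 writes the new addresses into the currently-invalid map
 * slots (safe to repeat after a power failure, since the valid map is
 * untouched); step 2 toggles all the bits with one 16-bit store, which is
 * the atomic point of no return. */
static void commit_objects(directory_t *d, uint16_t mask,
                           void *new_addrs[N_OBJS])
{
    for (int i = 0; i < N_OBJS; i++)
        if (mask & (1u << i)) {
            int invalid = 1 - ((d->bitmap >> i) & 1);
            d->addr_map[invalid][i] = new_addrs[i];
        }
    d->bitmap ^= mask;   /* single CPU instruction on a 16-bit MCU */
}
```

A power failure before the final store leaves the old addresses valid; a failure after it leaves the new ones valid, so the consistent version is never partially updated.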
4) Low Voltage Detection:
To ensure that the contexts of lengthy tasks can be successfully switched out before power failures, we use the platform's analog-to-digital converter (ADC) to generate interrupts when the voltage of the capacitor drops below a threshold. After power resumption, if there are lengthy tasks running in the system, we initialize and activate the ADC to detect whether the current voltage is below the given threshold. The amount of energy stored in the capacitor is (1/2)CV² (in joules), where C and V respectively denote the capacitance and the current voltage of the capacitor. Based on the platform's specification, including the operating voltage (V_op), the maximum power consumption (P), and the context switch period of the operating system (T_cs), the threshold (V_th) can be appropriately predetermined. Once a low-voltage interrupt is triggered, to ensure that the remaining energy is sufficient to successfully switch out a lengthy task, the energy stored in the capacitor above the operating voltage, (1/2)C(V_th² − V_op²), must be greater than or equal to the energy required for one context switch period, P × T_cs, so the threshold can be set as V_th ≥ √(2·P·T_cs/C + V_op²).
5) Context Switch Enforcement:
When a low-voltage interrupt is triggered, the interrupt service routine notifies the recovery handler to set a low-voltage flag. If the flag is set, after the currently executed task is switched out by the scheduler, the recovery handler suspends all lengthy tasks by invoking an API, namely vTaskSuspend(), provided by the scheduler in FreeRTOS to remove every lengthy task from the ready queue. Therefore, after the currently executed task, which could itself be lengthy, is successfully switched out, only non-lengthy tasks are eligible to be scheduled, and the contexts of lengthy tasks are preserved in NVM for the current power-on period. After power resumption, the recovery handler resumes all lengthy tasks by invoking another API, namely vTaskResume(), provided by the scheduler to put every lengthy task back into the ready queue.
6) Compatibility with Hardware-assisted Checkpointing:
Our design is also compatible with hardware-assisted checkpointing mechanisms, e.g., NVP-based devices that automatically checkpoint all volatile data to NVM when a low voltage is detected [28]. However, a checkpointing failure would cause the system status to be rolled back to the latest successful checkpoint [25]. As a consequence, some finished tasks could be rolled back and update their modifications to the consistent version in NVM again, thus giving rise to data inconsistency. To address this issue, if the system status is rolled back to the latest checkpoint after power resumption, the recovery handler can simply delete those finished tasks that have already committed their modifications, based on the data structure it maintains in NVM.

IV. PERFORMANCE EVALUATION
A. Experimental Setup
Hardware
  MCU: 16-bit RISC EXP430FR5994
  Memory: 8 KB SRAM & 256 KB FRAM
Software
  OS: FreeRTOS V9.0.0
Energy harvesting management & power supply
  Capacitance: 200 µF
  Switch on/off voltage: 2.8 V / 2.4 V
  Strong power source: 3 mW

TABLE I: Specifications of the experimental platform.

We conducted a series of experiments on the Texas Instruments platform with an energy harvesting management (EHM) module. Table I details the specifications of the related hardware and software. The platform is powered by the EHM unit, which consists of a BQ25504 low-power boost converter, a 200 µF capacitor to store the harvested energy, and a switch to turn on (resp. off) the power supply of the platform when the voltage of the capacitor rises above 2.8 V (resp. drops below 2.4 V). We used a programmable power supply made by B&K Precision to emulate the power source for the EHM. To simulate different energy harvesting sources while keeping the experiments reproducible, we manufactured strong (3 mW) and weak power sources.
Fig. 4: The experimental environment.

Given that self-powered devices typically run simple applications for data collection and processing, we ported four tasks from the benchmarks provided by Texas Instruments and implemented one task to encrypt and transmit the resultant data to an external device. (MSP430 Competitive Benchmarking is a collection of applications used to evaluate different aspects of the microcontroller's performance.) Specifically, the four tasks respectively perform matrix multiplication, floating-point arithmetic, integer arithmetic, and a finite impulse response filter on given inputs, and then commit their computation results to four respective data objects. The last task reads all four data objects, performs SHA-256 (a secure hash algorithm) to encrypt the data, and transmits the encrypted data to an external device via a universal asynchronous receiver-transmitter (UART) interface. Thus, the four data objects are read, written, and committed by the five concurrently executed tasks.

To derive the low-voltage threshold V_th, which must be sufficient to successfully switch out a running lengthy task before a power failure, we used a profiling tool, namely EnergyTrace Technology [1], provided by Texas Instruments. Based on our measurements, the maximum power consumption P is up to 5.25 mW when these tasks are concurrently executed on the platform. Alternatively, the maximum power consumption can be obtained directly from the specifications of the platform and external modules. As discussed in Section III-D4, once a low-voltage interrupt is triggered, the remaining energy must be greater than P × T_cs, where the context switch period T_cs is 1 ms in FreeRTOS. Thus, the remaining energy must be greater than 5.25 µJ and, according to the platform specifications in Table I, V_th can be derived and set accordingly.

To ensure the serializability of task execution and allow instant recovery, our design needs to validate the data access operations made by tasks and maintain data in hybrid memory at runtime.
We first evaluated the overall additional costs incurred by our design by comparing the forward progress (i.e., the number of finished tasks per second) achieved by our design and by native FreeRTOS when the device is powered with a stable power supply. Then, we conducted a breakdown analysis of the costs. We measured the time and space costs required by our design, which requires additional computation time and memory space to invoke data access operations and record task attributes, respectively. Moreover, because our design accumulates the computation progress of lengthy tasks at the cost of increased execution time and energy consumption due to NVM access latency, we measured the execution time and energy consumption of each task when its context is allocated in VM or NVM.

To gain more insight into our design, which achieves data consistency without runtime checkpointing and system logging, we compared the performance of our design to that of the system-wise checkpointing [10] and logging-based checkpointing [22] approaches described in Section II-B, respectively denoted as SYS and
LOG. To explore the impact of different checkpointing periods, we measured the performance achieved by SYS and LOG when they perform checkpointing frequently (i.e., every 20 ms) and infrequently (i.e., every 200 ms). Note that LOG and SYS adopt our validation procedure to ensure the serializability of concurrent task execution because they do not originally consider task concurrency. All tasks were run repeatedly, and the number of finished tasks per second (i.e., forward progress) was adopted as the performance metric. Finally, to explore the runtime overheads incurred to enable intermittent computing, we measured the suspension time, the recovery time, and the data recentness achieved by
SYS, LOG, and our design. These runtime overheads affect the forward progress and data quality when the system suffers frequent checkpointing and recovery due to an unstable power supply.
B. Experimental Results

1) Cost Measurement:
Our design enables an embedded operating system to achieve serializability and data consistency at the cost of additional overheads. Figure 5 shows that the forward progress achieved by FreeRTOS with our design integrated is reduced by 6.9%. Note that native FreeRTOS does not guarantee the serializability of task execution, so the resultant values of data objects could be unpredictable, and the permanent version in NVM could become inconsistent after power failures. In contrast, our design provides the read, write, and commit operations to ensure serializability while maintaining task contexts and data copies in hybrid memory to achieve data consistency. To analyze the costs, we investigated the average computation time of each operation, as well as the additional memory space required by the data manager and the recovery handler, respectively. We also evaluated the average execution time and energy consumption required by each task when its context is allocated in VM or NVM.
                          Read          Write         Commit
Average execution time    48 µs         65 µs         93 µs

                          Data manager      Recovery handler
Additional memory usage   5378 bytes        164 bytes

TABLE II: Execution time required by our data access operations and additional memory space used by our design.

Table II lists the average execution time of each operation, where the incurred cost in terms of average execution time
Fig. 5: Forward progress achieved by FreeRTOS with and without our design.

is 48, 65, and 93 µs, respectively, for the read, write, and commit operations. The commit operation requires relatively more time than the read and write operations because the data manager needs to validate the serializability of task execution and update the data structure that ensures the atomicity of the commit operation. By contrast, the read operation only fetches the addresses of data copies from the address maps, and the write operation only modifies working versions with copy-on-write. However, compared to the task execution times, which range from a few to hundreds of milliseconds, the runtime overheads incurred by these operations are almost negligible. Moreover, our design uses an additional 5378 bytes and 164 bytes beyond the 256 KB + 8 KB memory space to store the data structures maintained by the data manager and the task attributes recorded by the recovery handler, respectively. Thus, both the time and space costs of the operations are justifiable.

          MatMul    FIR filter    SHA256    Float math    Int. math
VM        439 ms    336 ms        246 ms    1.89 ms       1.5 ms
NVM       470 ms    352 ms        265 ms    1.9 ms        1.53 ms
TABLE III: Execution time required by each task.
          MatMul     FIR filter    SHA256     Float math    Int. math
VM        1.67 mJ    1.44 mJ       1.04 mJ    5.6 µJ        … µJ
NVM       2.21 mJ    1.56 mJ       1.37 mJ    5.7 µJ        … µJ

TABLE IV: Energy consumption required by each task.

Tables III and IV respectively show the average execution time and energy consumption required by each task when its context is allocated in VM or NVM. Overall, the execution time of a task increases by between 2% and 7% when its context is allocated in NVM rather than in VM, and the energy consumed by a task increases by between 2% and 32%. This result is expected because accessing NVM requires more energy and time than accessing VM. Consequently, when a task is deemed lengthy, the additional cost of a memory-intensive task (e.g., matrix multiplication) is higher than that of a computation-intensive task (e.g., integer arithmetic) due to frequent memory access. Note that our design allocates the context of a task in NVM only when the task is deemed lengthy, to preserve its computation progress across power cycles. In our experimental settings, the three tasks that respectively perform a finite impulse response filter, matrix multiplication, and SHA-256 are often deemed lengthy because they cannot finish within a power-on period under the weak power source.

(a) Non-lengthy tasks. (b) Lengthy tasks.
Fig. 6: Forward progress achieved by our design, SYS, and LOG under the strong power source.
2) Forward progress:
Figures 6(a) and 6(b) respectively show the forward progress of non-lengthy and lengthy tasks achieved by our design, LOG, and SYS when the device is powered by the strong power source. In general, our design outperforms SYS and LOG for both long and short checkpointing periods. The forward progress achieved by our design is 1.1 to 1.49 times that achieved by SYS and 1.08 to 1.28 times that achieved by LOG. The improved forward progress is mainly because our design eliminates the runtime overheads of snapshot checkpointing and data logging, which are respectively required by SYS and LOG to preserve forward progress and maintain data consistency. Therefore, when the short checkpointing period is adopted, our design achieves a larger progress improvement by eliminating the overheads frequently incurred by the checkpointing-based approaches.

Figures 7(a) and 7(b) respectively show the forward progress of non-lengthy and lengthy tasks achieved by the different approaches when the weak power source is adopted. For non-lengthy tasks, as shown in Figure 7(a), our design achieves 1.58 to 1.83 times the forward progress achieved by SYS and 1.4 to 1.49 times that achieved by LOG. The efficacy of our design becomes more manifest when the power supply is relatively unstable because our design enables instant system recovery, whereas SYS and LOG respectively need to restore the system snapshot from NVM to VM and

(a) Non-lengthy tasks. (b) Lengthy tasks.
Fig. 7: Forward progress achieved by our design, SYS, and LOG under the weak power source.

traverse logs in NVM to maintain data consistency during system recovery. For lengthy tasks, as shown in Figure 7(b), LOG makes no forward progress because lengthy tasks cannot finish within a power-on period and are rolled back to the outset after power resumption. Compared with SYS, our design achieves 1.21 to 1.39 times the forward progress. The improved progress is because our design allocates the contexts of lengthy tasks in NVM to preserve their progress across power-off periods and switches them out before power failures, incurring less overhead than checkpointing the entire system snapshot from VM to NVM at runtime. Note that because a power-on period is usually much longer than the context switch period (e.g., 1 ms in the FreeRTOS version used for our implementation) and lengthy tasks are only switched out during a context switch period towards the end of a power-on period, lengthy tasks remain executable for a large portion of the power-on period.

Comparing Figures 6 and 7, when the power supply is relatively stable, both SYS and LOG achieve more forward progress with a longer checkpointing period than with a shorter one. However, when the power supply is relatively unstable, the forward progress achieved with a longer checkpointing period decreases more substantially than with a shorter checkpointing period because more uncheckpointed progress can be lost under power failures. This also raises a robustness concern because the performance of checkpointing-based approaches is highly dependent on the relationship between the checkpointing period and the power failure period.

Suspension time (ms):    Ours 0;     SYS 7.5;    LOG 3.2
Recovery time (ms):      Ours 0.6;   SYS 7.6;    LOG 7
Data recentness (ms):    Ours 4.7;   SYS (20 ms) 10.3;   SYS (200 ms) 97.8;   LOG (20 ms) 10.4;   LOG (200 ms) 99.6
TABLE V: Average checkpoint time, recovery time, and data recentness achieved by our design, SYS, and LOG.
3) Runtime overhead:
To enable intermittent computing, the system may be suspended at runtime for checkpointing and may take some time to recover after power resumption, incurring additional runtime overheads. To quantify the overheads incurred by checkpointing and recovery, we measured the average time required for system suspension during each checkpoint, the average time required for system recovery after power resumption (i.e., until the first task can run after power resumption), and the average time required to complete a non-lengthy or lengthy task. Moreover, the recentness of data objects after recovery (i.e., the time difference between the last data update and the recovered system state) was measured to evaluate the quality of data over intermittent execution.

As shown in Table V, compared to SYS and LOG, our design completely eliminates runtime suspension, which takes 7.5 and 3.2 ms for SYS and LOG, respectively. By eliminating the time required to restore system snapshots from NVM or to traverse logs in NVM, our design reduces the recovery time required by SYS and LOG from 7.6 and 7 ms, respectively, to 0.6 ms, a reduction of at least 90%. Our design achieves shorter recovery because, after power resumption, it simply reruns unfinished non-lengthy tasks and adds unfinished lengthy tasks to the ready queue based on their attributes maintained by the recovery handler in NVM. Moreover, our design significantly improves the data recentness compared with SYS and LOG, and the improvement is more manifest as the checkpointing period increases because SYS and LOG roll the data back to an older version after power resumption.

To sum up, extensive experiments based on a prototype system running real tasks demonstrate that our design not only ensures data consistency but also outperforms checkpointing-based approaches in terms of forward progress, checkpoint time, recovery time, and data recentness after power resumption.
This also suggests that our design is particularly suitable for self-powered devices that may suffer frequent power failures.

V. CONCLUDING REMARKS
We present a failure-resilient design, which employs a data manager and a recovery handler, to endow intermittent systems with concurrent task execution, data consistency without runtime suspension, instant system recovery, and stagnation-free computation. The data manager maintains the serializability of concurrent task execution by controlling the data access operations conducted by tasks, while ensuring data consistency by atomically committing data copies modified by finished tasks in VM to persistently consistent versions in NVM. In contrast to checkpointing-based approaches, which require frequent system suspension to back up volatile data at runtime, the persistently consistent version allows the recovery handler to instantly recover the system by rerunning all unfinished tasks whose progress is lost in VM due to power failures, thereby eliminating the time required to restore the system to a previously checkpointed state. Moreover, to accumulate the progress of lengthy tasks across power cycles, the data manager allocates the data and contexts of lengthy tasks in NVM, and the recovery handler allows these tasks to instantly resume execution based on their states preserved in NVM after power resumption. We implemented the data manager and the recovery handler on top of the memory management and the task scheduler in FreeRTOS. The results of experiments conducted on a Texas Instruments EXP430FR5994 LaunchPad show that our design significantly increases the forward progress achieved by system-wise checkpointing [10] and logging-based checkpointing [22] approaches. They also suggest that our design is particularly appropriate for self-powered devices suffering frequent power disruptions, because it guarantees data consistency while reducing both the runtime overhead and the recovery time.

To further improve forward progress, future work will seek to extend our design to consider not only data objects but also the compiled program code of tasks.
This will raise opportunities for, as well as challenges to, designing a memory allocation policy that considers the trade-off between the overhead incurred by copying program code from NVM to VM and the performance improvement gained by running programs in VM.

ACKNOWLEDGEMENT
This work was supported in part by the Ministry of Science and Technology, Taiwan, under grant 107-2628-E-001-001-MY3.

REFERENCES
IEEE Trans. on Computers, pages 1390–1403, 2019.
[5] P. Bogdan, M. Pajic, P. P. Pande, and V. Raghunathan. Making the Internet-of-Things a Reality: From Smart Models, Sensing and Actuation to Energy-Efficient Architectures. In Proc. of IEEE/ACM CODES+ISSS, pages 1–10, 2016.
[6] W.-M. Chen, Y. Chen, P.-C. Hsiu, and T.-W. Kuo. Multiversion Concurrency Control on Intermittent Systems. In Proc. of IEEE/ACM ICCAD, 2019.
[7] W.-M. Chen, T.-S. Cheng, P.-C. Hsiu, and T.-W. Kuo. Value-Based Task Scheduling for Nonvolatile Processor-Based Embedded Devices. In Proc. of IEEE RTSS, pages 247–256, 2016.
[8] J. Choi, H. Joe, Y. Kim, and C. Jung. Achieving Stagnation-Free Intermittent Computation with Boundary-Free Adaptive Execution. In Proc. of IEEE RTAS, pages 331–344, 2019.
[9] J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann Publishers Inc., 1st edition, 1992.
[10] H. Jayakumar, A. Raha, and V. Raghunathan. QuickRecall: A Low Overhead HW/SW Approach for Enabling Computations Across Power Cycles in Transiently Powered Computers. In International Conference on VLSI Design and International Conference on Embedded Systems, pages 330–335, 2014.
[11] C.-K. Kang, C.-H. Lin, P.-C. Hsiu, and M.-S. Chen. HomeRun: HW/SW Co-Design for Program Atomicity on Self-Powered Intermittent Systems. In Proc. of IEEE/ACM ISLPED, pages 29:1–29:6, 2018.
[12] Q. Li, M. Zhao, J. Hu, Y. Liu, Y. He, and C. J. Xue. Compiler Directed Automatic Stack Trimming for Efficient Non-volatile Processors. In Proc. of IEEE/ACM DAC, pages 1–6, 2015.
[13] Z. Li, Y. Liu, D. Zhang, C. J. Xue, Z. Wang, X. Shi, W. Sun, J. Shu, and H. Yang. HW/SW Co-design of Nonvolatile IO System in Energy Harvesting Sensor Nodes for Optimal Data Acquisition. In Proc. of IEEE/ACM DAC, pages 1–6, 2016.
[14] Y.-C. Lin, P.-C. Hsiu, and T.-W. Kuo. Autonomous I/O for Intermittent IoT Systems. In Proc. of IEEE/ACM ISLPED, pages 1–6, 2019.
[15] Q. Liu and C. Jung. Lightweight Hardware Support for Transparent Consistency-aware Checkpointing in Intermittent Energy-harvesting Systems. In Proc. of IEEE NVMSA, pages 1–6, 2016.
[16] Y. Liu, Z. Li, H. Li, Y. Wang, X. Li, K. Ma, S. Li, M.-F. Chang, S. John, Y. Xie, J. Shu, and H. Yang. Ambient Energy Harvesting Nonvolatile Processors: From Circuit to System. In Proc. of IEEE/ACM DAC, pages 150:1–150:6, 2015.
[17] B. Lucia and B. Ransford. A Simpler, Safer Programming and Execution Model for Intermittent Systems. In Proc. of ACM PLDI, pages 575–585, 2015.
[18] K. Ma, X. Li, H. Liu, X. Sheng, Y. Wang, K. Swaminathan, Y. Liu, Y. Xie, J. Sampson, and V. Narayanan. Dynamic Power and Energy Management for Energy Harvesting Nonvolatile Processor Systems. ACM Trans. Embed. Comput. Syst., pages 107:1–107:23, 2017.
[19] K. Maeng, A. Colin, and B. Lucia. Alpaca: Intermittent Execution Without Checkpoints. Proc. ACM Program. Lang., pages 96:1–96:30, 2017.
[20] H. R. Mendis and P.-C. Hsiu. Accumulative Display Updating for Intermittent Systems. In Proc. of IEEE/ACM CODES+ISSS, 2019.
[21] N. Onizawa, A. Mochizuki, A. Tamakoshi, and T. Hanyu. Sudden Power-Outage Resilient In-Processor Checkpointing for Energy-Harvesting Nonvolatile Processors. IEEE Transactions on Emerging Topics in Computing, pages 151–163, 2017.
[22] M. Özsu and P. Valduriez. Principles of Distributed Database Systems. NY: Springer, 3rd edition, 2011.
[23] C. Pan, M. Xie, Y. Liu, Y. Wang, C. J. Xue, Y. Wang, Y. Chen, and J. Hu. A Lightweight Progress Maximization Scheduler for Non-volatile Processor Under Unstable Energy Harvesting. In Proc. of ACM SIGPLAN/SIGBED, pages 101–110, 2017.
[24] B. Ransford and B. Lucia. Nonvolatile Memory is a Broken Time Machine. In Proc. of MSPC, pages 5:1–5:3, 2014.
[25] B. Ransford, J. Sorber, and K. Fu. Mementos: System Support for Long-running Computation on RFID-scale Devices. SIGARCH Comput. Archit. News, pages 159–170, 2011.
[26] W. Song, Y. Zhou, M. Zhao, L. Ju, C. J. Xue, and Z. Jia. EMC: Energy-Aware Morphable Cache Design for Non-Volatile Processors. IEEE Trans. on Computers, pages 498–509, 2019.
[27] Y. Wang, H. Jia, Y. Liu, Q. Li, C. J. Xue, and H. Yang. Register Allocation for Hybrid Register Architecture in Nonvolatile Processors. In Proc. of IEEE ISCAS, pages 1050–1053, 2014.
[28] Y. Wang, Y. Liu, S. Li, D. Zhang, B. Zhao, M. F. Chiang, Y. Yan, B. Sai, and H. Yang. A 3us Wake-up Time Nonvolatile Processor Based on Ferroelectric Flip-flops. In Proc. of IEEE ESSCIRC, pages 149–152, 2012.
[29] M. Xie, C. Pan, M. Zhao, Y. Liu, C. J. Xue, and J. Hu. Avoiding Data Inconsistency in Energy Harvesting Powered Embedded Systems. ACM Trans. Des. Autom. Electron. Syst., pages 38:1–38:25, 2018.
[30] M. Xie, M. Zhao, C. Pan, J. Hu, Y. Liu, and C. J. Xue. Fixing the Broken Time Machine: Consistency-aware Checkpointing for Energy Harvesting Powered Non-volatile Processor. In Proc. of IEEE/ACM DAC, pages 1–6, 2015.
[31] D. Zhang, Y. Liu, X. Sheng, J. Li, T. Wu, C. J. Xue, and H. Yang. Deadline-aware Task Scheduling for Solar-powered Nonvolatile Sensor Nodes with Global Energy Migration. In Proc. of IEEE/ACM DAC, pages 1–6, 2015.
[32] M. Zhao, K. Qiu, Y. Xie, J. Hu, and C. J. Xue. Redesigning Software and Systems for Non-volatile Processors on Self-powered Devices. In