IoT Notary: Sensor Data Attestation in Smart Environment
Nisha Panwar, Shantanu Sharma, Guoxi Wang, Sharad Mehrotra, Nalini Venkatasubramanian, Mamadou H. Diallo, Ardalan Amiri Sani
University of California, Irvine, California, USA.
Abstract: Contemporary IoT environments, such as smart buildings, require end-users to trust data-capturing rules published by the systems. There are several reasons why such trust is misplaced: IoT systems may violate the rules deliberately, or IoT devices may transfer user data to a malicious third party due to cyberattacks, leading to the loss of individuals' privacy or service integrity. To address such concerns, we propose IoT Notary, a framework to ensure trust in IoT systems and applications. IoT Notary provides secure log sealing on live sensor data to produce a verifiable 'proof-of-integrity,' based on which a verifier can attest that captured sensor data adheres to the published data-capturing rules. IoT Notary is an integral part of TIPPERS, a smart space system that has been deployed at UCI to provide various real-time location-based services on the campus. IoT Notary imposes nominal overheads for verification, so that users can verify one day of their data in less than two seconds.
I. INTRODUCTION
While fine-grained continuous monitoring by IoT devices (e.g., cameras and WiFi access-points) offers numerous benefits and empowers existing systems with new capabilities, it also raises several privacy and security concerns (e.g., revealing smoking habits, gender, and religious belief). To highlight the privacy concern, we first share our experience in building location-based services at UC Irvine using WiFi connectivity data.

Use-case: University WiFi data collection. In our on-going project, entitled TIPPERS [1], we have developed a variety of location-based services based on a WiFi connectivity dataset. At UC Irvine, more than 2000 WiFi access-points and four WLAN controllers (managed by the university IT department) provide campus-wide wireless network coverage. Whenever a device connects to the campus WiFi network (through an access-point), the access-point generates a Simple Network Management Protocol (SNMP) trap for this association event. Each association event contains the access-point-id, $s_i$, the user device MAC address, $d_j$, and the time of the association, $t_k$. All SNMP traps $\langle s_i, d_j, t_k \rangle$ are sent to the access-point's controllers in realtime. The access-point controller anonymizes device MAC addresses (to preserve the privacy of users on the campus).

Accepted in IEEE International Symposium on Network Computing and Applications (NCA), 2019. For the final version, please refer to the conference proceeding.
This work is based on research sponsored by DARPA under agreement number FA8750-16-2-0021 and partially supported by NSF grants 1527536 and 1545071. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or the U.S. Government.

TIPPERS collects WiFi connectivity data from one of the controllers, which manages 490 access-points, and receives $\langle s_i, d_j, t_k \rangle$ tuples for each connectivity event. However, before receiving any WiFi data, TIPPERS notifies all WiFi users about the data-capture rules by sending emails over a university mailing list. Subsequently, based on WiFi connectivity data $\langle s_i, d_j, t_k \rangle$, TIPPERS provides various realtime applications. Some of these services, e.g., computing occupancy levels of (regions in) buildings in the form of a live heatmap, require only anonymous data. Other services, e.g., providing location information (within buildings) or contextualized messaging (to provide messages to a user when he/she is in the vicinity of the desired location), require the user's original disambiguated data. To date, over one hundred users have registered into TIPPERS to utilize realtime services. A key requirement imposed by the university in sharing data with TIPPERS is that the system supports provable mechanisms to verify that individuals have been notified prior to their data (anonymized or not) being used for service provisioning. Also, an option for users to opt-out of sharing their WiFi connectivity data with TIPPERS must be supported.
If users opt-out, the system must prove to the users that indeed their data was not shared with TIPPERS. TIPPERS uses immutable log-sealing to help all users verify that the captured data is consistent with pre-notified data-capture rules.

Our experience in working with various groups on the campus is that (persistent) location data can be deemed quite sensitive by certain individuals with concerns about being spied upon by the administration or by others. Thus, mechanisms for notification of data-capture rules, secure log-sealing, and verification components make up a sub-framework, entitled IoT Notary, which has become an integral part of TIPPERS.

Data-capture concerns in IoT environments are similar to those in mobile computing, where mobile applications may have continuous access to resident sensors on mobile devices. In the mobile setting, data-capture rules and permissions are used to control data access, i.e., which applications have access to which data generated at the mobile device (e.g., location and contact list) for which purpose and in which context. However, in IoT settings, the data-capture framework differs from that in the mobile settings in two important ways:
1) Unlike the mobile setting, where applications can seek the user's permission at the time of installation, in IoT settings, there are no obvious mechanisms/interfaces to seek users' preferences about the data being captured by sensors of the smart environment. Recent work [2] has begun to explore mechanisms using which environments can broadcast their data-capture rules to users and seek their explicit permissions.
2) Unlike the mobile setting, users cannot control sensors in IoT settings. While in mobile settings, a user can trust the device operating system not to violate the data-capture rules, in IoT settings, trust (in the environment controlling the sensors) may be misplaced.
IoT systems may not be honest or may inadvertently capture sensor data, even if data-capture rules are not satisfied.

We focus on the above-mentioned second scenario and determine ways to provide trustworthy sensing in an untrusted IoT environment. Thus, the users can verify their data captured by the IoT environment based on pre-notified data-capture rules. Particularly, we deal with three sub-problems, namely secure notification to the user about data-capture rules, secure (sensor data) log-sealing to retain immutable sensor data as well as data-capture rules, and remote attestation to verify the sensor data against pre-notified data-capture rules by a user, without being heavily involved in the attestation process.

Our contribution and outline of the paper.
We provide:
• A user-centric framework (§III) to ensure trustworthy data collection in untrusted IoT spaces, entitled IoT Notary.
• Two models to inform the user about the data-capture rules (§IV-A): notice-only model and notice-and-ACK model.
• A secure log-sealing mechanism (§IV-B) implemented by secure hardware that cryptographically retains logs, data-capture rules, sensors' state, and contextual information to generate a proof-of-integrity in an immutable fashion.
• A secure attestation mechanism (§IV-C), mixed with the SIGMA protocol [3], allowing a verifier (a user or a non-mandatory auditor) to securely attest the sealed logs as per the data-capture rules.
Implementation results of IoT Notary on the university live WiFi data are provided in §V.

Full version. Due to space limitations, we could not describe several details about IoT Notary, which are given in the full version [4]. These include: future temporal password-based notification method, log retrieval at the verifier using SIGMA, details of the verification phase, throughput and communication cost experiments, and security proofs.

II. MODELING IOT DATA ATTESTATION
A. Entities
Our model has the following entities, see Figure 1:
Infrastructure Deployer (IFD).
IFD (which is the university IT department in our use-case; see §I) deploys and owns a network of $p$ sensor devices (denoted by $s_1, s_2, \ldots, s_p$), which capture information related to users in a space. The sensor devices could be: (i) dedicated sensing devices, e.g., energy meters and occupancy detectors, or (ii) facility providing sensing devices, e.g., WiFi access-points and RFID readers. Our focus is on facility providing sensing devices, especially WiFi access-points that also capture some user-related information in response to services. E.g., WiFi access-points capture the associated user-device-ids (MAC addresses), time of association, and some other parameters (such as signal strength and signal-to-noise ratio); denoted by: $\langle d_i, s_j, t_k, param \rangle$, where $d_i$ is the $i$-th user-device-id, $s_j$ is the $j$-th sensor device, $t_k$ is the $k$-th time, and $param$ is other parameters (we do not deal with the $param$ field and focus on only the first three fields). All sensor data is collected at a controller (server) owned by IFD. The controller may keep sensor data in cleartext or in encrypted form; however, it only sends encrypted sensor data to the service provider.

[Figure 1: Entities in IoT Notary: Infrastructure Deployer (IFD, e.g., University IT Department), Service Provider (SP, e.g., TIPPERS), Users, and Auditor.]

Service Providers (SP).
SP (which is TIPPERS in our use-case; see §I) utilizes the sensor data of a given space to provide different services, e.g., monitoring a location and tracking a person. SP receives encrypted sensor data from the controller.

Data-capture rules. SP establishes data-capture rules (denoted by a list $DC$ having different rules $dc_1, dc_2, \ldots, dc_q$). Data-capture rules are conditions on device-ids, time, and space. Each data-capture rule has an associated validity that indicates the time during which the rule is valid. Data-capture rules could be to capture user data by default (unless the user has explicitly opted out). Alternatively, default rules may be to opt-out, unless users opt-in explicitly. Consider a default rule that individuals on the [...]th floor of the building will be monitored from 9pm to 9am. Such a rule has an associated condition on the time and the id of the sensor used to generate the data. Now, consider a rule corresponding to a user with a device $d_i$ opting-out of data capture based on the previously mentioned rule. Such an opt-out rule would have conditions on the user-id, as well as on time and the sensor-id. For sensor data for which the default data-capture rule is opt-in, the captured data is forwarded to SP if there does not exist any associated opt-out rule whose conditions are satisfied by the sensor data. Likewise, for sensor data where the default is opt-out, the data is forwarded to SP only if there exists an explicit opt-in condition. We refer to the sensor data as having a sensor state ($s_i.state$ denotes the state of the sensor $s_i$) of 1 (or active), if the data can be forwarded to SP; otherwise, 0 (or passive).
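The opt-in/opt-out semantics described above can be illustrated with a minimal sketch. The `Rule` class, its field names, the integer timestamps, and the `sensor_state` helper are all hypothetical choices made for this illustration; the paper does not prescribe a rule encoding, and the enclave's actual rule check may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical encoding of a data-capture rule (not from the paper)."""
    kind: str                      # "opt-in" or "opt-out"
    sensors: Optional[frozenset]   # sensor-ids covered; None means any sensor
    devices: Optional[frozenset]   # device-ids covered; None means any device
    valid_from: int                # start of the validity window (inclusive)
    valid_to: int                  # end of the validity window (exclusive)

    def matches(self, device, sensor, t):
        # A rule applies when its device, sensor, and time conditions all hold.
        return ((self.sensors is None or sensor in self.sensors)
                and (self.devices is None or device in self.devices)
                and self.valid_from <= t < self.valid_to)


def sensor_state(reading, rules, default="opt-out"):
    """Return 1 (active) if the reading may be forwarded to SP, else 0 (passive)."""
    device, sensor, t = reading
    if default == "opt-out":
        # Default opt-out: forward only if an explicit opt-in rule matches.
        return int(any(r.kind == "opt-in" and r.matches(device, sensor, t)
                       for r in rules))
    # Default opt-in: forward unless some opt-out rule matches.
    return int(not any(r.kind == "opt-out" and r.matches(device, sensor, t)
                       for r in rules))


# Example: device d1 has opted in on every sensor during the window [0, 100).
rules = [Rule("opt-in", None, frozenset({"d1"}), 0, 100)]
states = [sensor_state(r, rules) for r in [("d1", "s1", 10), ("d2", "s1", 10)]]
```

Under the opt-out default used in the rest of the paper, only readings covered by a valid opt-in rule receive state 1; all other readings receive state 0.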
In the remainder of the paper, unless explicitly noted, opt-out is considered as the default rule, for simplicity of discussion.

Whenever SP creates a new data-capture rule, SP must send a notice message to user devices about the current usage of sensor data (this phase is entitled notification phase). SP uses Intel Software Guard eXtension (SGX) [5], which works as a trusted agent of IFD, for securely storing sensor data corresponding to data-capture rules. SGX keeps all valid data-capture rules in the secure memory and retains only the sensor data that qualifies under pre-notified valid data-capture rules; it discards all other sensor data. Further, SGX creates immutable and verifiable logs of the sensor data (this phase is entitled log-sealing phase). The assumption of secure hardware at a machine is rational with the emerging system architectures, e.g., Intel machines are equipped with SGX [6]. However, existing SGX architectures suffer from side-channel attacks, e.g., cache-line, branch shadow, and page-fault attacks [7], which are outside the scope of this paper.

Users.
Let $d_1, d_2, \ldots, d_m$ be $m$ (user) devices carried by $u_1, u_2, \ldots, u_{m'}$ users, where $m' \le m$. Using these devices, users enjoy services provided by SP. We define a term, entitled user-associated data. Let $\langle d_i, s_j, t_k \rangle$ be a sensor reading. Let $d_i$ be the $i$-th device-id owned by a user $u_i$. We refer to $\langle d_i, s_j, t_k \rangle$ as user-associated data with the user $u_i$. Users worry about their privacy, since SP may capture user data without informing them, or in violation of their preference (e.g., when the opt-out was a default rule or when a user opted-out from an opt-in default). Users may also require SP to prove service integrity by storing all sensor data associated with the user (when users have opted-in into services), while minimally being involved in the attestation process and storing records at their sides (this phase is entitled attestation phase).

Auditor.
An auditor is a non-mandatory trusted third party that can (periodically) verify the entire sensor data against data-capture rules. Note that a user can only verify his/her own data, not the entire sensor data or sensor data related to other users, since doing so may violate the privacy of other users.
B. Threat Model
We assume that SP and users may behave like adversaries. The adversarial SP may store sensor data without informing the user of the data-capture rules. The adversarial SP may tamper with the sensor data by inserting, deleting, modifying, and truncating sensor readings and secured-logs in the database. By tampering with the sensor data, SP may simulate the sealing function over the sensor data to produce secured-logs that are identical to real secured-logs. Thus, the adversary may hinder the attestation process and make it impossible for the verifier (that may be an auditor or a user) to detect any tampering with the sensor data. Further, as mentioned before, SP utilizes sensor data to provide services to the user. However, an adversarial SP may provide false answers in response to user queries. We assume that the adversarial SP cannot obtain the secret key of the enclave (by any means of side-channel attacks on SGX). Since we assumed that sensors are trusted and cannot be spoofed, we do not need to consider the case when sensors would collude with SP to fabricate the logs.

An adversarial user may repudiate the reception of notice messages about data-capture rules. Also, an adversarial user may impersonate a real user to retrieve the sensor data and secured-log during the verification phase. Thus, an adversarial user may violate the privacy of other users by observing sensor data. Also, a user may infer the identity of other users associated with sensor data by potentially launching frequency-count attacks (e.g., by determining which device-ids are prominent).

C. Security Properties
In the above-mentioned adversarial model, an adversary wishes to learn the (entire/partial) data about the user, without notifying or by mis-notifying about data-capture rules, such that the user/auditor cannot detect any inconsistency between data-capture rules and stored sensor data at SP. Hence, a secure attestation algorithm must make it detectable if the adversary stores sensor data in violation of the data-capture rules notified to the user. To achieve a secure attestation algorithm, we need to satisfy the following properties:
Authentication.
Authentication is required: (i) between SP and users, during the notification phase; thus, the user can detect a rogue SP, and SP can detect rogue users, and (ii) between SP and the verifier (auditor/user), before sending sensor data to the verifier, to prevent any rogue verifier from obtaining sensor data. Thus, authentication prevents threats such as impersonation and repudiation. Further, a periodic mutual authentication is required between IFD and SP, thereby discarding rogue sensor data by SP, as well as preventing any rogue SP from obtaining real sensor data.

Immutability and non-identical outputs.
We need to maintain immutability of notice messages, sensor data, and the sealing function. Note that if the adversary can alter notice messages after transmission, it can do anything with the sensor data, in which case sensor data may be completely stored or deleted without respecting notice messages. Further, if the adversary can alter the sealing function, the adversary can generate a proof-of-integrity, as desired, which makes flawless attestation impossible. The output of the sealing function should not be identical for each sensor reading, to prevent an adversary from forging the sealing function (and to prevent the execution of a frequency-count attack by the user). Thus, the immutability and non-identical outputs properties prevent threats, e.g., inserting, deleting, modifying, and truncating the sensor data, as well as simulating the sealing function.

Minimality, non-refutability and privacy-preserving verification.
The verification method must find any misbehavior of SP in storing sensor data inconsistent with pre-notified data-capture rules. However, if the verifiers wish to verify a subset of the sensor data, then they should not need to verify the entire sensor data. Thus, SP should send a minimal amount of sensor data to the verifier, enabling them to attest what they wish to attest. Further, the verification method: (i) cannot be refuted by SP, and (ii) should not reveal any additional information to the user about all the other users during the verification process. These properties ensure that SP stores only sensor data that is consistent with the data-capture rules notified to the user. Further, these properties preserve the privacy of other users during attestation and impose minimal work on the verifier.

[Figure 2: Dataflow and computation in the protocol. Trusted parts are shown in shaded boxes. Labeled components: Infrastructure Deployer (IFD) with its controller; Service Provider (SP) with SGX (decrypt, check, seal), data-capture rule store, trusted notifier, and applications; user; trusted auditor. Labeled flows: encrypted WiFi sensor data, data-capture rule creation, cleartext sensor data, secured logs for integrity verification, secured logs for user verification, and AKE-secured communication.]
D. Assumptions
This section presents the assumptions we made, as follows:
1) The sensor devices are assumed to be computationally too limited to locally generate a verifiable log for the continuous data stream as per the data-capture rules.
2) Sensor devices are tamper-proof, and they cannot be replicated/spoofed (i.e., two devices cannot have an identical id). In short, we assume a correct identification of sensors before accepting any sensor-generated data at the controller at IFD, which ensures that no rogue sensor device can generate data on behalf of an authentic sensor. Further, we assume that an adversary cannot deduce any information from the dataflow between a sensor and the controller. Recall that in our setting the university IT department collects the entire sensor data from their owned and deployed sensors, before sending it to TIPPERS.
3) We assume the existence of an authentication protocol between the controller and SP, so that SP receives sensor data only from the authenticated and desired controller.
4) The communication channels between SP and users, as well as between SP and the auditor, are insecure. Thus, our solution incorporates an authenticated key exchange based on the SIGMA protocol (which protects sender identity). When the verifier's identity is proved, the cryptographically sealed logs are sent to the verifier.
5) No side-channel attack on SGX allows an adversary to tamper with SGX or retrieve the secret key of SGX. (Otherwise, the adversary could simulate the sealing process.)

III. IOT NOTARY
This section presents an overview of the three phases and dataflow among different entities and devices, see Figure 2.
Notification phase: SP to Users messages.
This is the first phase that notifies users about data-capture rules for the IoT space using notice messages (in a verifiable manner for later stages). Such messages can be of two types: (i) notice messages, and (ii) notice-and-acknowledgment messages. SP establishes (the default) data-capture rules and informs trusted hardware (step 1). Trusted hardware securely stores data-capture rules (steps 2, 5) and informs the trusted notifier (step 3) that transmits the message to all users (step 4). Only notice messages need a trusted notifier to transmit the message (see §IV-A).
Log-sealing phase: Sensor devices to SP messages.
Each sensor sends data to the controller (step 0). The controller receives the correct data, generated by the actual sensor, as per our assumptions (and settings of the university IT department). The controller sends encrypted data to SP (step 6), which authenticates the controller using any existing authentication protocol before accepting data. Trusted hardware (Intel SGX) at SP reads the encrypted data in the enclave (step 7).
Working of the enclave.
The enclave decrypts the data and checks it against the pre-notified data-capture rules. Recall that the decrypted data is of the format: $\langle d_i, s_j, t_k \rangle$, where $d_i$ is the $i$-th user-device-id, $s_j$ is the $j$-th sensor device, and $t_k$ is the $k$-th time. After checking each sensor reading, the enclave adds a new field, entitled sensor (device) state. The sensor state of a sensor $s_j$ is denoted by $s_j.state$, which can be active or passive, based on capturing user data. For example, $s_j.state$ = active (or 1), if data captured by the sensor $s_j$ satisfies the data-capture rules; otherwise, $s_j.state$ = passive (or 0). For all the sensors whose state = 0, the enclave deletes the data. Then, the enclave cryptographically seals sensor data, regardless of the sensor state, and provides cleartext sensor data of the format: $\langle d_i, s_j, s_j.state = 1, t_k \rangle$ to SP (step 8), which provides services using this data (step 9). Note that the cryptographically sealed logs and cleartext sensor data are kept at untrusted storage of SP (steps 8, 10).

Verification phase: SP to verifier messages.
In our model, an auditor and a user can verify the sensor data. The auditor can verify the entire/partial sensor data against data-capture rules by asking SP to provide cleartext sensor data and cryptographically sealed logs (steps 8, 10). The users can also verify their own data against pre-notified messages or can verify the results of the services provided by SP using only cryptographically sealed logs (step 12). Note that using an underlying authentication technique (as per our assumptions), auditor/users and SP authenticate each other before transmitting data from SP to auditor/users.

IV. ATTESTATION PROTOCOL
This section presents the three phases of the attestation protocol.
Preliminary Setup Phase.
We assume a preliminary setup phase that distributes public keys ($PK$) and private keys ($PR$), and registers user devices into the system. The trusted authority (which is the university IT department in our setup of TIPPERS) generates/renews/revokes keys used by the secure hardware enclave (denoted by $\langle PK_E, PR_E \rangle$) and the notifier (denoted by $\langle PK_N, PR_N \rangle$). The keys are provided to the enclave during the secure hardware registration process. Also, $\langle PK_{d_i}, PR_{d_i} \rangle$ denotes the keys of the $i$-th user device. Usage of keys: The controller uses $PK_E$ to encrypt sensor readings before sending them to SP. $PR_E$ is also used by the enclave to write encrypted sensor logs and decrypt sensor readings. $PK_N$ is used during the notification phase by SGX to send an encrypted message to the notifier. User devices' keys are used during device registration, as given below.

We assume a registration process during which a user identifies herself to the underlying system. For instance, in a WiFi network, users are identified by their mobile devices, and the registration process consists of users providing the MAC addresses of their devices (and other personally identifiable information, e.g., email and a public key). During registration, users also specify their preferred modality through which the system can communicate with the user (e.g., email and/or push messages to the user device). Such communication is used during the notification phase.

A. Notification Phase
The notification phase informs (registered) users of the data-capture rules established by SP by explicitly sending notice messages. We consider two models for notification, differing based on acknowledgment from users. In the notice-only model (NoM), SP informs users of data-capture rules, but users may not acknowledge receipt of the message. Such a model is used to implement policies when data capture is mandatory and the user cannot exercise control over data capture. Since there is no acknowledgment, SP is only required to ensure that it sends a notice, but is not required to guarantee that the user received the notice. In contrast, a notice-and-ACK model (NaM) is intended for discretionary data-capture rules that require explicit permission from users prior to data capture. Such rules may be associated, for instance, with fine-grained location services that require users' location. A user can choose not to let SP track his location, but will likely not be able to avail some services.

Implementation of notification differs based on the model used. Interestingly, since NaM requires acknowledgment, the notification phase is easier as compared to NoM, which uses a trusted notifier to deliver the message to users. Below we discuss the implementation of both models:
Notification implementation in NoM. NoM assumes that, by default, data-capture rules are set not to retain any user data, unless SP first informs SGX about a data-capture rule (i.e., SP cannot use the encrypted sensor data for building any application, see step 9 in Figure 2). When SP creates a new data-capture rule, SP must inform SGX. Then, the enclave encrypts the data-capture rule using the public key (i.e., $PK_N$) of the notifier and informs the trusted notifier (via SP) about the encrypted data-capture rule by writing it outside of the enclave (in our use-case, §I, the university IT department works as a trusted notifier). Data-capture rules are maintained by SP on stable storage, which is read by SGX into the enclave to check if the sensor data should be forwarded to SP. SGX can retain a cache of rules in the enclave, if such rules are still valid (and hence used for enforcement). Finally, the trusted notifier acknowledges to SP the receipt of the encrypted data-capture rule, and then informs users of the encrypted data-capture rule via signed notice messages. On receiving the notice message, the users may decrypt it and obtain the data-capture rule.

To see the role of trusted hardware above, suppose that SP was responsible for informing users about data-capture rules directly. Since data-capture rules are also required by SGX during log-sealing (PHASE [...]

To see the role of the trusted notifier above, suppose that SP can directly inform users about encrypted data-capture rules obtained from SGX. An adversarial SP may not deliver the data-capture rule to all/some of the users; thus, an encrypted data-capture rule alone is not helpful. A trusted notifier therefore ensures that the notice message is sent to all the registered users. Note that the trusted notifier might be a trusted web site that lists all the data-capture rules, which users can access.
Implementation of notification in NaM. Unlike NoM, the notification phase of NaM does not require the trusted notifier. In NaM, by default, SP cannot utilize any sensor readings having device-ids for which the users have not acknowledged. As in NoM, in NaM, SP informs SGX of data-capture rules; SGX encrypts the rule and writes it outside of the enclave. Unlike NoM, the encrypted rules are delivered by SP to users. On receiving the message, a user may securely acknowledge her consent to the enclave. The enclave retains all those device-ids that acknowledge the notice message and considers those device-ids during the log-sealing phase to retain their data while discarding data of others.

Due to the enclave's limited memory, it cannot keep all valid and non-valid data-capture rules, after a certain size. Thus, the enclave writes all non-valid data-capture rules on the disk, after computing a secured hash digest over all rules. Taking a hash over the rules is needed to maintain the integrity of all rules.

[Figure 3: Cryptographic sealing procedure executed on a chunk, $C_x$. Gray-shaded data is not stored on the disk. White-shaded data is stored on the disk and accessible by SP. The figure shows the proof-of-integrity for a chunk, $C_x$. Panels: sensor data after passing the enclave; sealing function execution for log-integrity; sealing function execution for user's data/query verification.]

B. Log Sealing Phase
The second phase cryptographically seals the sensor data for future verification against pre-notified data-capture rules. The sensor data is sealed into secured logs using authenticated data structures, e.g., hash-chains and XOR-linked lists (as shown in Figures 3, 4), by the sealing function, $Sealing(PR_E, \langle d_i, s_j, s_j.state, t_k \rangle)$, executed in the enclave at SP. Let us explain log-sealing in the context of WiFi connectivity data. The enclave reads the encrypted sensor data (step 7 in Figure 2) and executes three steps: (i) decrypts the data, (ii) checks the data against pre-notified valid data-capture rules, and (iii) cryptographically seals the data and stores appropriate secured logs.

Below we explain our log sealing approach. To simplify the discussion, we consider the case when all the sensor data satisfies some data-capture rule (i.e., the state of all the sensor data is one), and hence, data is forwarded to and stored at SP (§IV-B1). Similar to the protocol for the case where all sensor data has state one, a protocol can also handle the case when some sensor data satisfies some data-capture rule, while the remaining sensor data does not satisfy any rule (i.e., the state of the remaining sensor data is zero). However, due to page limitations, we skip details of such a protocol.

Sealing Entire Sensor Data: The sealing operation contains three phases: (i) chunk creation, (ii) hash-chain creation, and (iii) proof-of-integrity creation; described below.

PHASE 1: Chunk creation.
The first phase of the sealing operation finds an appropriate size of a chunk (to speed up the attestation process). Note that the incoming encrypted sensor data may be large, and it may create problems during verification, due to increased communication between SP and the verifier. Also, without chunking, the verifier would need to verify the entire data, which may have been collected over a large period of time (e.g., months/years). Further, creating cryptographic sealing over the entire sensor data may also degrade the performance of the $Sealing()$ function, due to the limited size of the SGX enclave. Thus, we first determine an appropriate chunk size, and the sealing function is executed on each chunk.

The chunk size depends on time epochs, the enclave size, the computational overhead of executing sealing on the chunk, and the communication overhead for providing the chunk to the verifier. A small chunk size reduces the communication overhead and maintains the log minimality property, so that during the log verification phase, a verifier retrieves only the desired log chunks, instead of retrieving the entire sensor data. Consequently, SP stores many chunks.

PHASE 2: Hash-chain creation.
Consider a chunk, $C_x$, that can store at most $n$ sensor readings, each of the format $\langle d_i, s_j, t_k \rangle$. The sealing function checks each sensor reading against the data-capture rules and adds the sensor state to each reading, as $\langle d_i, s_j, s_j.state, t_k \rangle$. Since in this section we assume that all sensor data will be stored, the sensor state of each sensor reading is set to 1. The sealing function starts with the first sensor reading of the chunk $C_x$, as follows:

First sensor reading. For the first sensor reading of the chunk, the sealing function computes a hash function on the value zero, i.e., $H(0)$. Then, the sealing function mixes $H(0)$ with the remaining values of the sensor reading, i.e., sensor-id, device-id, sensor state, and time, and computes the hash function over the result, denoted by $H(d_1 || s_j || s_j.state || t_k || H(0))$, which yields a hash digest, denoted by $h^x_1$. After processing the first sensor reading of the chunk $C_x$, the enclave writes the first sensor reading of $C_x$ in cleartext, i.e., $\langle d_1, s_j, s_j.state, t_k \rangle$, on the disk, which can be accessed by SP.

Second sensor reading. Let $\langle d_2, s_j, s_j.state, t_{k+1} \rangle$ be the second sensor reading. For this, the sealing function works identically to the processing of the first sensor reading. It computes a hash function on the values of the second sensor reading, while mixing them with the hash digest of the first sensor reading, i.e., $H(d_2 || s_j || s_j.state || t_{k+1} || h^x_1)$, which yields a hash digest, say $h^x_2$. Finally, the enclave writes the second sensor reading in cleartext on the disk.

Processing the remaining sensor readings. As with the second sensor reading, the sealing function computes the hash function on all the remaining sensor readings of the chunk $C_x$, each time mixing in the hash digest of the preceding reading. After processing the last sensor reading of the chunk $C_x$, the hash digest $h^x_n$ is obtained.
PHASE 3: Proof-of-integrity creation.
Since each sensor reading is written on disk, SP can alter the sensor readings, making it impossible for an auditor to verify log integrity. Thus, to show that all the sensor readings are kept according to the pre-notified data-capture rules, the sealing function prepares an immutable proof-of-integrity for each chunk, as follows:

[Figure 4: three consecutive chunks $C_v$, $C_x$, and $C_y$, each shown with its sensor readings and its proofs $PI$ and $PU$.]
Figure 4: PHASE 3: end of chunk, $S_{eoc}$, creation for three chunks. Observe that $S^x_{eoc} = g_a \oplus g_b \oplus g_c$.

For each chunk $C_i$, the sealing function generates a random string, denoted by $g_j$, where $i \neq j$. Let $C_v$, $C_x$, and $C_y$ be three consecutive chunks (see Figure 4), based on consecutive sensor readings. Let $g_a$, $g_b$, and $g_c$ be the random strings for chunks $C_v$, $C_x$, and $C_y$, respectively. The use of random strings ensures that none of the consecutive chunks has been deleted by SP (this will become clear in § IV-C). Now, to produce the proof-of-integrity for the chunk $C_x$, the sealing function: (i) executes the XOR operation on $g_a$, $g_b$, and $g_c$, whose output is denoted by $S^x_{eoc}$, where eoc denotes the end-of-chunk; (ii) signs $h^x_n$ XORed with $S^x_{eoc}$ with the private key of the enclave; and (iii) writes the proof-of-integrity for log verification of the chunk $C_x$ with the random string $g_b$, as follows:

$$PI_{C_x} = (g_b, Sign_{PR_E}(h^x_n \oplus S^x_{eoc}))$$

Note. We do not generate a proof for each sensor reading. The enclave writes only the proof and the random string for each chunk to the disk, which is accessible by SP. Further, the sensor readings having state one are written on the disk, and SP develops services based on them.
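The hash-chain and proof-of-integrity steps described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: readings are serialized as strings joined by "||", $H(0)$ is taken to be the hash of the byte string "0", and HMAC-SHA256 under a shared key stands in for the enclave signature $Sign_{PR_E}$ (an actual deployment would use an asymmetric signature so that a verifier needs only the public key). All function names are hypothetical.

```python
import hashlib
import hmac
from typing import List, Tuple

Reading = Tuple[str, str, int, int]   # (device_id, sensor_id, state, time)
Proof = Tuple[bytes, bytes]           # (g_b, signed value)


def H(data: bytes) -> bytes:
    """SHA-256, used here as the hash function H."""
    return hashlib.sha256(data).digest()


def xor_bytes(*parts: bytes) -> bytes:
    """XOR equal-length byte strings together."""
    out = bytes(len(parts[0]))
    for p in parts:
        out = bytes(a ^ b for a, b in zip(out, p))
    return out


def hash_chain(readings: List[Reading]) -> bytes:
    """PHASE 2: fold every reading into the digest of the previous one."""
    digest = H(b"0")  # assumed encoding of H(0)
    for d, s, state, t in readings:
        digest = H(f"{d}||{s}||{state}||{t}||".encode() + digest)
    return digest  # h_n for the chunk


def sign(key: bytes, msg: bytes) -> bytes:
    """Stand-in for Sign_{PR_E}: a keyed MAC, NOT the paper's signature."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def seal_chunk(key: bytes, readings: List[Reading],
               g_prev: bytes, g_cur: bytes, g_next: bytes) -> Proof:
    """PHASE 3: bind the chunk digest to the neighbours' random strings."""
    s_eoc = xor_bytes(g_prev, g_cur, g_next)
    return g_cur, sign(key, xor_bytes(hash_chain(readings), s_eoc))


def verify_chunk(key: bytes, readings: List[Reading],
                 g_prev: bytes, g_next: bytes, proof: Proof) -> bool:
    """Recompute the chain and S_eoc, then compare with the sealed value."""
    g_cur, tag = proof
    s_eoc = xor_bytes(g_prev, g_cur, g_next)
    expected = sign(key, xor_bytes(hash_chain(readings), s_eoc))
    return hmac.compare_digest(tag, expected)
```

Under these assumptions, recomputing the chain over the stored readings exposes any altered or dropped reading, and because $S^x_{eoc}$ mixes in the random strings of the neighbouring chunks, the proof of a chunk cannot be checked without the strings of the chunks around it.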
Example. See Figure 3, where the middle box shows the sealing phases for a chunk. After the final hash digest, $h$, is computed, the proof-of-integrity, $PI$, is created, which includes the signed $h \oplus S^x_{eoc}$ and a random string, $g_b$.

Note. $g^*$ for the first chunk. The initialization of the log-sealing function requires an initial seed value, say $g^*$, due to the absence of a 0th chunk. Thus, to initialize the secure binding for the first chunk, the seed value is used as a substitute random string.

Sealing Data for User Data/Service Verification: While user-associated data is being captured, users may wish to verify their user-associated data against the notified messages. Note that the protocol presented so far requires the entire cleartext data to be sent to the verifier to attest log integrity (this will become clear in § IV-C). However, such cleartext data transmission is not possible in the case of user-associated data verification, since it may violate other users' privacy. Thus, to allow verification of user-associated data (or of service/query results), we develop a new sealing method, consisting of three phases: (i) chunk creation, (ii) hash-generation, and (iii) proof-of-integrity creation. The chunk creation phase of this new sealing method is identical to the above-mentioned chunk creation phase (PHASE 1; see § IV-B1). Below, we only describe PHASE 2 and PHASE 3.

Footnote: The users who access services developed by SP (as mentioned in § I) may also wish to verify the query results, since SP may tamper with the data to show wrong results.
PHASE 2: Hash-generation.
Consider a chunk, $C_x$, that can have at most $n$ sensor readings, each of the format $\langle d_i, s_j, s_j.state, t_k \rangle$. Our objective is to hide users' device-ids and their frequency counts (i.e., which device-id is prominent in the given chunk). Thus, on the $i$th sensor reading, the sealing function mixes $d_i$ with $t_k$ and computes a hash function over them, denoted by $H(d_i || t_k)$, which results in a digest value, say $o_i$. Note that hashing device-ids mixed with time yields different digests for different occurrences of the same device-id. Note also that $o_i$ lets the user learn of their presence or absence in the data during attestation, but it does not prove that the data has not been tampered with. Then, the sealing function mixes $o_i$ with the sensor state of the $i$th sensor reading (to produce a proof of sensor state) and computes the hash function over the result, denoted by $H(o_i || s_j.state)$, which yields a hash digest, denoted by $hu^x_i$. After processing the $i$th sensor reading of the chunk $C_x$, the enclave writes $o_i$ on the disk. After processing all $n$ sensor readings of the chunk $C_x$, the sealing function XORs all the hash digests $hu^x_i$, where $1 \le i \le n$: $hu^x_1 \oplus hu^x_2 \oplus \ldots \oplus hu^x_n$, whose output is denoted by $hu^x_{end}$. (The reason for computing $hu^x_{end}$ will become clear in § IV-C.)
PHASE 3: Proof-of-integrity creation for the user.
The sealing function prepares an immutable proof-of-integrity for users, denoted by $PU$, for each chunk and writes it on the disk. As with the proof-of-integrity for entire log verification, $PI$ (§ IV-B1), the sealing function obtains $S_{eoc}$ for each chunk; refer to the proof-of-integrity creation phase in § IV-B1. Now, to produce $PU$ for the chunk $C_x$, the sealing function: (i) signs $hu^x_{end}$ XORed with $S^x_{eoc}$ with the private key of the enclave, and (ii) writes the signed output with the random string of the chunk, $g_b$, as $PU_{C_x}$:

$$PU_{C_x} = (g_b, Sign_{PR_E}(hu^x_{end} \oplus S^x_{eoc}))$$

Note. The enclave writes the hash digests, $o_i$, for each sensor reading, the proof for user verification, and the random string for each chunk on the disk. Of course, the sensor readings having state one are also written on the disk.

Example. See Figure 3, where the last box shows the computation of $PU$.
C. Attestation Phase
The attestation phase contains two sub-phases: (i) key establishment between the verifier and SP to retrieve logs, and (ii) verification of the logs. Due to space restrictions, we skip the key establishment phase. Here, we briefly describe the verification process at the auditor and/or the user.
Verification process at the auditor.
Recall that the auditor can verify any part of the sensor data. Suppose the auditor wishes to verify a chunk $C_x$; see Figure 4. Hence, the entire sensor data of the chunk $C_x$ (the data written in the first box of Figure 3), the random strings $g_a$, $g_b$, and $g_c$ (corresponding to $C_x$ and to its previous and next chunks; see Figure 4), and the proof-of-integrity $PI_{C_x}$ are provided to the auditor. The auditor performs the same hash-chain computation as the sealing function (§ IV-B2) to obtain $h^x_n$. Also, the auditor computes the end-of-chunk string $S^x_{eoc} = g_a \oplus g_b \oplus g_c$. At the end, the auditor matches the result of $h^x_n \oplus S^x_{eoc}$ against the decrypted value of the received $PI_{C_x}$; if both values are identical, the entire chunk is unchanged.

Note that since SP transfers the sensor readings of the chunk $C_x$, the random strings ($g_a$, $g_b$, and $g_c$), and $PI_{C_x}$ to the verifier, SP can alter any transmitted data. However, SP cannot alter the signed $Sign_{PR_E}(h^x_n \oplus S^x_{eoc})$, because it does not have the private key of the enclave, $PR_E$, which was generated and provided to the enclave by the trusted authority. Thus, by following the above-mentioned procedure on the sensor readings of $C_x$, the auditor detects any inconsistency created by SP.
Verification process at the user.
If a user wishes to verify her data in a chunk, say $C_x$, SP provides the user with all the hash digests computed over device-id and time ($o_i$; see the last box in Figure 3), the time values, the sensor states, the random strings $g_a$, $g_b$, and $g_c$ (see Figure 4), and the proof $PU$. Since the user knows her device-id, the user first verifies her occurrences in the data by computing the hash function on her device-id mixed with the provided time values and comparing the results against the received hash digests. This confirms the user's presence or absence in the data. Also, to verify that no hash digest has been modified or deleted by SP, the user computes the hash function on each received $o_i$ mixed with the sensor state ($1 \le i \le n$, where $n$ is the number of sensor readings in $C_x$) and computes $hu^x_{end} = hu^x_1 \oplus hu^x_2 \oplus \ldots \oplus hu^x_n$. Finally, the user computes $hu^x_{end} \oplus S^x_{eoc}$ and compares it against the decrypted value of $PU$. The correctness of this method can be argued in a manner similar to the correctness of the verification at the auditor.
V. EXPERIMENTAL EVALUATION
This section presents our experimental results on live WiFi data. We execute IoT Notary on a 4-core, 16GB RAM machine equipped with SGX in the Microsoft Azure cloud.
Setup.
In our setup, the IT department at UCI is the trusted infrastructure deployer. It also plays the role of the trusted notifier (notifying users via mailing lists). At UCI, 490 WiFi sensors, installed over 30 buildings, send data to a controller that forwards the data to the cloud server, where IoT Notary is installed. The cloud keeps the cryptographic log digests that are transmitted to the verifier, while sensor data that satisfies the data-capture rules is ingested into the real-time applications supported by TIPPERS. We use SHA-256 as the hashing algorithm and 256-bit random strings in IoT Notary. We allow users to verify the data collected over the last 30 minutes (min).

[Figure 5 plot: size of data (GB); series: size of original data (GB) and size of logs (GB).]
Figure 5: Exp 1: Storage overhead.

[Figure 6 plot: user verification time (seconds); series: User1 (1-core 1GB RAM), User2 (1-core 2GB RAM), User3 (2-core 1GB RAM), User4 (2-core 2GB RAM), User5 (4-core 16GB RAM).]
Figure 6: Exp 4: Verification time.
Dataset size.
Although IoT Notary deals with live WiFi data, we report results for data processed by the system over 180 days, during which IoT Notary processed 13GB of WiFi data comprising 110 million WiFi events.
Data-capture rules. We set the following four data-capture rules: (i) Time-based: always retain data, except from $t_i$ to $t_j$ each day; (ii) User-location-based: do not store data about specified devices if they are in a specific building; (iii) User-time-based: do not capture data having a specific device-id from $t_x$ to $t_y$ ($x \neq i$, $y \neq j$) each day; and (iv) Time-location-based: do not store any data from a specific building from time $t_x$ to $t_y$ each day. These rules were valid for 40 days. After every 40 days, the variables $i$, $j$, $x$, and $y$ were changed.
Exp 1. Storage overhead at the cloud.
We fix the size of each chunk to 5MB; on average, each chunk contains ≈37K sensor readings, covering around 30min of data from 30 buildings in peak hours. Based on the 5MB chunk size, we got 3291 chunks for 180 days. For each chunk, the sealing function generates two types of logs: (i) for auditor verification, it produces the proof-of-integrity $PI$ of size 512 bytes, and (ii) for user verification, it produces hashed values (see Figure 3) and the proof-of-integrity for users $PU$ of size 1.05MB. Figure 5 shows the size of the 180-days WiFi data without sealed logs (red) and with sealed logs (green).
Exp 2. Performance at the cloud.
The sealing function took around 310ms to seal each 5MB chunk. This includes the time to compute $PI$ and $PU$ and to encrypt them.

Footnote (on the number of chunks in Exp 1): The reason for getting more chunks is that during non-peak hours a 5MB chunk can store sensor readings for more than one hour. However, as per our assumption, we allow the user to verify the data collected over the last 30min. Hence, regardless of whether a chunk is full or not, we compute the sealing function on each chunk after 30min.

Number of chunks  | 1        | 50  | 100 | 500  | 1000  | 3000
≈ Duration (days) | 30-60min | 1-2 | 2-5 | 8-18 | 35-55 | 175
Time (seconds)    | 1        | 49  | 102 | 544  | 1160  | 4400
Table I: The auditor verification time. Duration varies due to different class schedules in buildings and working hours.
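As a rough cross-check of Exp 1 and Exp 2, the reported figures can be combined into a back-of-the-envelope estimate. The sketch below is only an approximation under stated assumptions: decimal units (1GB = 1000MB), and the 1.05MB figure read as the combined per-chunk size of the hashed values and $PU$. The measured sizes are those plotted in Figure 5, and the variable names are ours.

```python
# Figures reported in Exp 1 and Exp 2 (decimal units assumed).
DATA_GB = 13            # WiFi data processed over 180 days
CHUNK_MB = 5            # fixed chunk size
CHUNKS = 3291           # chunks actually produced
PI_BYTES = 512          # proof-of-integrity for the auditor, per chunk
USER_LOG_MB = 1.05      # hashed values plus PU, assumed to be per chunk
SEAL_SECONDS = 0.310    # sealing time per chunk

# Chunks that would be needed if every chunk were completely full.
full_chunks = DATA_GB * 1000 / CHUNK_MB

# Estimated size of all sealed logs.
log_gb = CHUNKS * (PI_BYTES / 1e6 + USER_LOG_MB) / 1000

# Total enclave time spent on sealing over the 180 days.
total_seal_minutes = CHUNKS * SEAL_SECONDS / 60

print(f"chunks if all were full: {full_chunks:.0f} (reported: {CHUNKS})")
print(f"estimated sealed-log size: {log_gb:.2f} GB")
print(f"total sealing time: {total_seal_minutes:.1f} min")
```

Under these assumptions, full chunks alone would number 2600, so the reported 3291 agrees in direction with the footnote's explanation that chunks are sealed after 30min even when not full. The estimate comes to roughly 3.5GB of sealed logs and about 17 minutes of total sealing time over the 180 days.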
Exp 3. Auditor verification time.
The auditor at our campus uses a machine with a 7th-Gen quad-core i7 CPU and 16GB RAM. It downloads the chunks from the cloud and executes the auditor verification. The auditor varied the number of chunks from 1 to 3000; see Table I. Note that to attest one-day data across 30 buildings, the auditor needs to download at most 50 chunks, which took less than 1min to verify. Observe that as the number of chunks increases, the time also increases, due to executing the hash function on more data.
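To read Table I as a per-chunk cost, the reported times can be divided by the number of chunks. The short sketch below uses only the values listed in Table I; the variable names are ours.

```python
# Values copied from Table I.
chunks = [1, 50, 100, 500, 1000, 3000]
seconds = [1, 49, 102, 544, 1160, 4400]

# Average auditor verification time per chunk for each experiment size.
per_chunk = [t / n for n, t in zip(chunks, seconds)]

for n, rate in zip(chunks, per_chunk):
    print(f"{n:>5} chunks: {rate:.2f} s per chunk")
```

With these numbers, the cost stays close to one second per chunk up to 1000 chunks and rises to about 1.47 seconds per chunk at 3000 chunks, so the growth is slightly faster than linear at the largest size.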
Exp 4: Verification at a resource-constrained user.
To show the practicality of IoT Notary for resource-constrained users, we considered four types of users, differing in computational capabilities (e.g., the available main memory (1GB/2GB) and the number of cores (1 or 2 cores)). Each user verified 1/10/20-days data; see Figure 6. Note that verifying 1-day data, which is ≈50 chunks, took at most 30s at resource-constrained users. As the number of chunks increases, the computational time also increases; the maximum computational time, incurred when verifying 20-days data, is shown in Figure 6.
VI. COMPARISON WITH EXISTING WORK
We classify the related work in the area of IoT attestationinto the following three categories:
Attestation in the context of IoT.
The existing remote attestation protocols verify the internal memory state of untrusted devices through a trusted remote verifier. For example, AID [8] attests the internal state of neighboring devices through key exchange and proofs-of-non-absence. SEDA [9] attests embedded devices and provides the number of devices that pass attestation. Also, DARPA [10] and SANA [11] allow detection of physical attacks by using heartbeat messages and provide aggregate network attestation. In short, existing work can verify only the sensors' internal state and cannot verify sensor data against the data-capture rules. In contrast, IoT Notary does not deal with the verification of the internal state of sensors, since in our case the (WiFi access-point) sensors are deployed by a trusted entity (e.g., the university IT department). Of course, cyberattacks that make sensors maliciously record data are possible, and they can also be detected by IoT Notary.

Attestation using secure hardware. [12] provided an SGX-based attestation method for physical attacks on the sensor. Fiware [13] provides secure key management through a key vault running in SGX. However, [12], [13] cannot verify any sensor data. Also, in [12], [13], if data-capture rules are mis-notified to the user, SGX cannot detect any inconsistency. In contrast, IoT Notary deals neither with attacks on sensors nor with a specific key management protocol. However, IoT Notary can detect and discard the sensor data that does not comply with the notifications released earlier.
Integrity verification. [14] proposed a privacy-preserving scheme based on zero-knowledge proofs to detect log-exclusion attacks. [15] proposed a Bloom tree that stores proofs of logs at an untrusted cloud. vSQL [16] may be used for verifying cleartext query results. However, these techniques cannot detect log deletion and incur significant overheads. For example, vSQL takes more than 4000 seconds to verify a SQL query. In contrast, IoT Notary provides complete security to sensor data and a real-time data attestation approach.
VII. CONCLUSION
This paper presented a framework, IoT Notary, for sensor data attestation through cryptographically enforced log-sealing mechanisms that produce immutable proofs, used for log verification. We improve the naïve end-to-end encryption model, where retroactive verification is not provable. The user-data verification mechanism allows users to revoke services of the concerned IoT space. Thus, we empower the users with the right-to-audit instead of the right-to-own the data captured by sensors. IoT Notary is a part of a real IoT system (TIPPERS) and provides verification on live WiFi data with almost no overheads on users.
REFERENCES
[1] S. Mehrotra et al., “TIPPERS: A privacy cognizant IoT environment,” in PerCom Workshops, 2016, pp. 1–6, http://tippersweb.ics.uci.edu/.
[2] A. Rao et al., “Expecting the unexpected: Understanding mismatched privacy expectations online,” in SOUPS, 2016, pp. 77–96.
[3] H. Krawczyk, “SIGMA: The ‘SIGn-and-MAc’ approach to authenticated Diffie-Hellman and its use in the IKE protocols,” in CRYPTO, 2003.
[4] Full version of the paper available at: https://isg.ics.uci.edu/publications/.
[5] V. Costan et al., “Intel SGX explained,” IACR Cryptology ePrint Archive, vol. 2016, p. 86, 2016.
[6] https://newsroom.intel.com/newsroom/wp-content/uploads/sites/11/2017/09/8th-gen-intel-core-product-brief.pdf.
[7] W. Wang et al., “Leaky cauldron on the dark land: Understanding memory side-channel hazards in SGX,” in CCS, 2017, pp. 2421–2434.
[8] A. Ibrahim et al., “AID: autonomous attestation of IoT devices,” in SRDS, 2018.
[9] N. Asokan et al., “SEDA: Scalable embedded device attestation,” in CCS, 2015, pp. 964–975.
[10] A. Ibrahim et al., “DARPA: Device attestation resilient to physical attacks,” in WiSec, 2016, pp. 171–182.
[11] M. Ambrosin et al., “SANA: secure and scalable aggregate network attestation,” in CCS, 2016, pp. 731–742.
[12] J. Wang et al., “Enabling security-enhanced attestation with Intel SGX for remote terminal and IoT,” TCDICS, vol. 37, no. 1, pp. 88–96, 2018.
[13] D. C. G. Valadares et al., “Achieving data dissemination with security using FIWARE and Intel software guard extensions,” in ISCC, 2018.
[14] J. Frankle et al., “Practical accountability of secret processes,” in USENIX, 2018, pp. 657–674.
[15] S. Zawoad et al., “Towards building forensics enabled cloud through secure logging-as-a-service,” IEEE TDSC, vol. 13, pp. 148–162, 2016.
[16] Y. Zhang et al., “vSQL: Verifying arbitrary SQL queries over dynamic outsourced databases,” in IEEE SP, 2017, pp. 863–880.