Practical Location Validation in Participatory Sensing Through Mobile WiFi Hotspots
Francesco Restuccia, Andrea Saracino, and Fabio Martinelli
Abstract — The reliability of information in participatory sensing (PS) systems largely depends on the accuracy of the location of the participating users. However, existing PS applications are not able to efficiently validate the position of users in large-scale outdoor environments. In this paper, we present an efficient and scalable Location Validation System (LVS) to secure PS systems from location-spoofing attacks. In particular, the user location is verified with the help of mobile WiFi hot spots (MHSs), which are users activating the WiFi hotspot capability of their smartphones and accepting connections from nearby users, thereby validating their position inside the sensing area. The system also comprises a novel verification technique called Chains of Sight, which tackles collusion-based attacks effectively. LVS also includes a reputation-based algorithm that rules out sensing reports of location-spoofing users. The feasibility and efficiency of the WiFi-based approach of LVS are demonstrated by a set of indoor and outdoor experiments conducted using off-the-shelf smartphones, while the energy efficiency of LVS is demonstrated by experiments using the Power Monitor energy tool. Finally, the security properties of LVS are analyzed by simulation experiments. Results indicate that the proposed LVS system is energy-efficient, applicable to most practical PS scenarios, and efficiently secures existing PS systems from location-spoofing attacks.
Index Terms — Participatory Sensing, Smartphones, Security, WiFi Hotspots, Location Spoofing.
I. INTRODUCTION
Undoubtedly, smartphones have become one of the most powerful and pervasive technologies today. Among all their features, their simplicity of use makes smartphones ideally suited for a novel sensing paradigm with tremendous potential, known as participatory sensing (PS) [1]. The basic idea behind PS is to allow ordinary citizens to participate in large-scale sensing surveys with the help of user-friendly applications installed on their smartphones. This not only dramatically reduces the deployment costs of fixed infrastructures, but also provides fine-grained spatio-temporal coverage of the sensing area. Significant research and development effort from industry and academia has been devoted to designing PS systems that improve the life experience of users. Indeed, an abundance of real-life applications, which take advantage of both low-level sensor data and high-level user activities, range from real-time traffic monitoring [2], [3] to air pollution or garbage monitoring [4]–[6] to social networking [7], to name a few. For a complete survey of PS applications, the reader may refer to [8].

Above all, the most innovative aspect of PS systems is that they are infrastructure-free and rely only on the users' active participation to gather reliable data about the sensing area. Therefore, it becomes paramount to verify with relative precision the users' current location, since the sensed data (e.g., temperature) is significantly dependent on the spatial context. However, smartphone applications (apps) like LocationHolic or FakeLocation make it extremely easy for users to spoof their current GPS location. Such software provides users with easy-to-use interfaces to manually set the Global Positioning System (GPS) coordinates, thereby deceiving location-based apps running on the device.

Fig. 1: Screenshot of Waze and FakeLocation apps.

F. Restuccia is with the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA (e-mail: [email protected]). A. Saracino and F. Martinelli are with the Istituto di Informatica e Telematica del Consiglio Nazionale delle Ricerche, Via G. Moruzzi n. 1, 56124, Pisa, Italy (e-mail: {a.saracino, f.martinelli}@iit.cnr.it).
A. Motivation
Large-scale location-spoofing attacks (LSAs) have potentially devastating consequences in terms of reliability and revenue loss of PS systems [9]–[13]. To motivate and demonstrate the simplicity and impact of LSAs, we considered the well-known traffic monitoring application Waze, which is a community-driven application gathering complementary map data and other traffic information from users. Similar to other location-based apps, Waze learns from users' driving times to provide routing and real-time traffic updates. This application is free to download and use, and people can report accidents, traffic jams, speed and police traps, and can update roads, landmarks, house numbers, and so on. Figure 1 depicts a screenshot of the Waze and FakeLocation apps. Waze has a point-based reputation system based on the frequency of traffic reports and miles driven.

Fig. 2: Revenue given to attackers in [14] (x-axis: percentage of attackers; y-axis: reward paid to attackers (%); curves for \(t_a = t_u\), \(t_a = 2t_u\) and \(t_a = 3t_u\)).

In order to understand how much revenue the administration of the PS application may lose due to LSAs, we implemented and simulated the reward mechanism due to Yang et al. (which appeared at ACM MobiCom 2012 [14]). In particular, we focused our attention on the Platform-Centric reward model, in which the users are paid proportionally to the time \(t_u\) they declare to dedicate to the sensing services (see [14] for additional details). We focused on this particular model because of its simplicity and its strong game-theoretical properties. Figure 2 shows the average percentage of revenue that attackers receive from the reward mechanism at each time step with respect to the total reward \(R\) offered at each sensing request, as a function of the number of attackers and the declared time \(t_a\) that will be dedicated to the sensing task (as a multiple of \(t_u = 1\) time unit). Figure 2 shows that the amount of revenue the attackers steal from the reward mechanism grows linearly with the number of attackers.

To get an idea of the impact of attackers in terms of revenue loss, suppose that 5% of the users are attackers, that there are 1000 users in the system, that \(R\) equals $10, that \(t_a\) equals \(t_u\), and that sensing requests are issued every minute. Every time the PS application requests data from the users, the attackers steal $0.5 from the system, which means that in a day the attackers steal $720. In a month and in a year, the administration will lose $21,600 and $262,800, respectively. If the application requests data every 10 minutes, the attackers would still steal $26,280 a year.

B. Our contribution
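The revenue-loss figures quoted at the end of Section I-A can be reproduced with a few lines of arithmetic. The sketch below is only an illustration, not the simulator used for Figure 2: it assumes that the reward \(R\) is split among users in proportion to their declared sensing time, and the function names `attacker_share` and `revenue_loss` are hypothetical.

```python
def attacker_share(frac_attackers, t_a=1.0, t_u=1.0):
    """Fraction of the reward R that goes to attackers.

    Assumption: the reward is split in proportion to declared time,
    attackers declare t_a and honest users declare t_u.
    """
    attacker_time = frac_attackers * t_a
    honest_time = (1.0 - frac_attackers) * t_u
    return attacker_time / (attacker_time + honest_time)


def revenue_loss(reward, frac_attackers, minutes_between_requests,
                 t_a=1.0, t_u=1.0):
    """Return the loss per request, per day, per 30-day month and per 365-day year."""
    per_request = reward * attacker_share(frac_attackers, t_a, t_u)
    requests_per_day = 24 * 60 / minutes_between_requests
    per_day = per_request * requests_per_day
    return per_request, per_day, per_day * 30, per_day * 365


if __name__ == "__main__":
    # Scenario from the text: 5% attackers, R = $10, t_a = t_u.
    for minutes in (1, 10):
        per_request, per_day, per_month, per_year = revenue_loss(10.0, 0.05, minutes)
        print(f"request every {minutes} min: "
              f"${per_request:.2f}/request, ${per_day:.0f}/day, "
              f"${per_month:.0f}/month, ${per_year:.0f}/year")
```

Note that the number of users (1000) does not enter the computation: under this proportional split, only the fraction of attackers and their declared time matter.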
The above examples demonstrate that LSAs severely undermine the reliability and the revenue of existing PS systems. However, given the extremely large scale of real-world PS systems and the uncontrollable, random mobility of smartphone users, verifying the location of users with relative accuracy becomes remarkably challenging. LSAs are also extremely difficult to detect, given that the PS system has no means to find out whether users are using apps such as FakeLocator. This issue calls for a distributed solution which leverages the collective action of participating users. This paper makes the following novel contributions.
• We propose the Location Validation System (LVS), which efficiently and effectively tackles LSAs. LVS exploits the collaborative actions of users and the WiFi capability of smartphones to validate the position of other users. In fact, two smartphones directly connected through WiFi are practically sharing the same location, due to the limited WiFi range. In this way, these two users can mutually validate their locations inside the sensing area. By exploiting this technique on a large scale, LVS implements an effective, scalable and distributed anti-location-spoofing system. A reputation-based algorithm is also proposed to filter out reports coming from malicious users.
• We propose the novel Chains of Sight (CoS) verification algorithm to counter collusion-based attacks. A CoS gives a representation of the history of the validation process introduced through LVS. This process is based only on feedback sent by users to the system and does not rely on trusted control mechanisms. In fact, such a process opens the possibility of smart attacks based on colluding malicious users, which are part of a more challenging threat model not considered in previous related work [15].
• The viability of the WiFi-based approach of LVS is demonstrated by real experiments conducted using off-the-shelf smartphones on indoor and outdoor testbeds. The results show that the framework is effective both in typical indoor environments and in outdoor environments with greater distances between the users (up to 60 m). We also measure the energy consumption of the WiFi-based mechanism of LVS using the Power Monitor [16] hardware tool. Results show that the proposed approach has practically no overhead on the smartphone resources in terms of energy consumption.
• The efficiency and effectiveness of the LVS framework against location-spoofing attacks are demonstrated through simulations. Simulation results indicate that the framework is resilient to high percentages of attackers (up to 40%) and scales well with the number of users in the system.

This paper largely extends the work presented in [17] and [18] by introducing the concept of chain of sight, a deeper security analysis, an energy performance evaluation and a formal demonstration of theorems.

The rest of the paper is organized as follows. Section II introduces preliminary concepts and formally defines the notion of LSA, while Section V discusses related work. Section III describes in depth the proposed LVS framework, while Section IV presents experimental and simulation results of the LVS framework considering practical PS scenarios. Finally, Section VI draws conclusions and gives directions for future work.

II. PRELIMINARIES
In this section, we first provide some technical background on PS and then define the threat model.

In this paper, we adopt the most common architecture for PS systems, which is based on a PS platform (PSP) hidden inside a mobile cloud computing system [19]. Periodically, users are requested by the PSP to submit their sensed data to the PS system. At each request, users may choose to participate by sending their data to the PSP, or may simply ignore the request. The sensing app, which can be distributed through common application markets like Google Play or the App Store, is responsible for providing the users with a friendly interface for data visualization and acquisition, as well as for ensuring reliable data communication between the users and the PSP through a cellular or WiFi network. After operations such as data filtering and aggregation, global information about the sensing area may be sent back to the users through the PS application, so as to be used for their daily activities.

As far as potential threats are concerned, we will assume that the communication between the users and the PS server runs over a reliable and protected wireless channel, where data cannot be lost, eavesdropped, modified or substituted. We also assume the PS server is totally reliable and trustworthy (root of trust), in particular in terms of user registration, key management, issuing credentials, trust assessment and reputation management. Users are uniquely identified inside the network through an identifier (ID) which exploits a digest of the smartphone IMEI (International Mobile Equipment Identity), which is unique for any device worldwide. Moreover, trying to modify a device IMEI (i.e., spoofing) is considered illegal and is far more complex than spoofing the location [20]. Therefore, given that users cannot spoof their identity inside the system, we assume that sybil attacks are not possible.

Henceforth, we will focus our attention on solving the attacks formalized in Section III-E. In particular, attacks via the communication channels (e.g., eavesdropping, traffic jamming, etc.) are out of the scope of this paper.

III. LOCATION VALIDATION SYSTEM
In this section we describe the LVS security framework to tackle the location-spoofing attack (LSA). We first describe the system model and formalize the LSA under such a model. Next, we describe in detail the algorithm used by LVS to select the users acting as mobile hot-spots (MHSs), as well as the WiFi-based location validation algorithm of LVS. Finally, we describe the reputation-based algorithm used by LVS to filter out unreliable reports and therefore guarantee the reliability of the PS system.
A. System Model and Assumptions
Hereafter, we will suppose the smartphone sensing area is logically divided into \(W\) location areas of size \(S \times S\), in which \(N\) users can move without restrictions (we do not assume any particular user mobility pattern or model). Specifically, users are free to move from one location area to another, and a given location area may contain any number of users (from 0 to \(N\)). However, users cannot be in two different location areas at the same time. We assume \(S\) and \(W\) are tuning parameters depending on the specific PS application and its required accuracy of user location. We assume the location area \(L_k^t\) of user \(u_k\) at time \(t\) is identified by a pair of numerical coordinates representing a point in the two-dimensional Cartesian coordinate system \(\mathcal{C} \triangleq \{O, X, Y\}\). We also assume that users are connected to the Internet through WiFi or a 3G/4G connection.

Fig. 3: Block diagram of LVS (blocks: LVS user module on the user side; PSP communication module and PSP computation module on the PSP side; Internet in between).
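As an illustration of the system model, the sketch below maps a point of the Cartesian system to the index of the \(S \times S\) location area that contains it. It is a minimal example under assumptions that the paper does not state: the location areas are laid out as a rectangular grid of `cols` by `rows` cells (so that \(W\) = `cols * rows`) anchored at the origin \(O\), and the helper names are hypothetical.

```python
import math


def location_area(x, y, S, cols, rows):
    """Return the row-major index of the S x S location area containing (x, y).

    Assumption: the W = cols * rows areas tile the sensing region as a grid
    whose lower-left corner is the origin O. Returns None outside the region.
    """
    col = math.floor(x / S)
    row = math.floor(y / S)
    if not (0 <= col < cols and 0 <= row < rows):
        return None
    return row * cols + col


def is_spoofed(advertised_xy, real_xy, S, cols, rows):
    """True when the advertised and the real positions fall in different areas."""
    fake_area = location_area(*advertised_xy, S, cols, rows)
    real_area = location_area(*real_xy, S, cols, rows)
    return fake_area != real_area


if __name__ == "__main__":
    # 3 x 2 grid of 100 m x 100 m areas (illustrative values only).
    print(location_area(250.0, 150.0, S=100.0, cols=3, rows=2))
    print(is_spoofed((250.0, 150.0), (20.0, 30.0), S=100.0, cols=3, rows=2))
```

Because a user belongs to exactly one cell, the constraint that users cannot be in two location areas at the same time holds by construction. The PSP, of course, never observes `real_xy`; `is_spoofed` only restates the definition of a spoofed position.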
B. Overview of LVS
Figure 3 depicts the four logical components of LVS. Let us describe the functionality of each module in detail.
• LVS user module. It is implemented inside the sensing app installed on the users' smartphones, and is responsible for handling the communication (through the Internet) between the PSP and the smartphone as far as the operations performed by LVS are concerned. It also handles the user WiFi hotspot activation, as explained in Section III-C.
• PSP communication module. This module is implemented on the PSP, and is responsible for the communication (through the Internet) between the PSP and the LVS user module.
• PSP computation module. It handles the computational burden of LVS, namely the optimal selection of the users (see Section III-D) that will run the location validation algorithm (see Section III-C), as well as the computation of the Chains of Sight (see Section III-F) and the calculation of the reputation of the users (see Section III-G).
C. Location Validation Algorithm
Before describing the LVS location validation algorithm, let us define as mobile hot-spots (MHSs) the subset of selected users (the selection is explained in Section III-D) who activate the built-in WiFi hotspot feature of their smartphones and wait for other users to connect. Users that reside in the WiFi range of an MHS are called its neighbors. The neighbors and the MHS mutually validate their locations.

The location validation algorithm divides time into validation rounds, occurring every \(T_r\) time units; henceforth, we will refer to \(t_j = j \cdot T_r\) as the time of the \(j\)-th validation round. During a validation round, the MHSs and their neighbors mutually validate their locations. A set of consecutive validation rounds is called a validation epoch (Figure 4). The number of rounds composing a validation epoch, and hence its duration \(T_e\), is variable and will be detailed later in this subsection.

Fig. 4: Validation epoch timeline (labels: \(T_r\), \(T_e\), time axis \(t\)).

Let \(N_i^j\) denote the number of users physically present in the \(i\)-th location area at time \(t_j\). Also, let \(D_i^{t_j}\) denote the number of users advertising their position to be inside location area \(i\) at time \(t_j\). During every validation round, the LVS validation algorithm performs the following three steps.
S1. The LVS user module of each user \(u_k\) transmits her current location \(L_k^{t_j}\) to the PSP computation module through the PSP communication module. For each location area, the PSP selects a subset of users among the \(D_i^{t_j}\) users that appear to be in the \(i\)-th location area (the selection algorithm is described in Section III-D).
S2. The selected users receive a request message from the PSP to act as MHSs and validate the position of their neighbors through a WiFi connection. At the same time, the neighbors also validate the position of the MHS for additional security. This is when the location validation takes place (details are explained below), which we call spotting for brevity.
S3. Each user transmits the location validation information acquired in the current validation round to the PSP through the LVS user module. This information is used by the PSP computation module to compute the users' reputation, as detailed in Sections III-F and III-G.

Fig. 5: Validation round timeline (labels: \(T_r\), \(T_{sw}\), \(T_{vt}\), time axis \(t\)).

In detail, the operations performed by each user during each validation round in step S2 are summarized as follows.
• Each MHS turns on the WiFi hotspot capability and, after the WiFi setup time \(T_{sw}\) (see Figure 5), starts accepting connections from nearby users for a maximum validation time \(T_{vt}\). Next,
  – each user \(u_i\) connected to MHS \(u_j\) sends a packet containing her unique ID number \(ID_i\) to \(u_j\) (remember that IDs cannot be spoofed);
  – MHS \(u_j\) replies by sending a packet containing \(ID_j\) to every neighbor;
  – after the reception of the packet from their MHS, the neighbors disconnect from their MHS.
• After the validation time \(T_{vt}\) elapses, each MHS turns off the WiFi connection (if it was not active before the validation phase).
• Each user reports to the PSP the IDs of the users verified in the current validation round (if any).

This operation of mutual validation between an MHS and a user in its WiFi range is called spotting. If the user \(u_i\) is an MHS and the user \(u_j\) is in its WiFi range, we say that \(u_i\) spots \(u_j\) and \(u_j\) spots \(u_i\). As an illustrative example, let us consider location area \(A_i\) containing four users A, B, C and D at validation round \(j\) (Figure 6.a). During this round, the PSP chooses B to be the MHS since she is close to users A and C. Users A and C are within the WiFi range of B, while D is in a different zone of \(A_i\). Therefore, B validates the location of A and C, while both A and C validate the location of B. During round \(j+1\) (Figure 6.b), D and B validate the location of each other.

Fig. 6: Position of users A, B, C and D at rounds (a) \(j\) and (b) \(j+1\).

As mentioned earlier, the PSP evaluates the reputation of each user once a validation epoch is finished. In particular, a validation epoch ends when the position of all users in a given location area has been validated by at least \(q\) users, where \(q\) is a system parameter. More formally, the duration of the \(j\)-th validation epoch for location area \(A_i\) is defined as \(\min(e_M, e_{\max})\), where \(e_M\) is the number of validation rounds required to validate \(M\%\) of the \(D_i^{t_j}\) users by at least \(q\) users, and \(e_{\max}\) is a system parameter.

Let us now discuss in detail some aspects and advantages of the location validation algorithm of LVS.
• The operations performed by the LVS user module, including the activation of the WiFi hotspot capability, do not require manual activation by the user, but are instead handled automatically by the PSP through the user module of LVS. This allows the users to act as MHSs without manual intervention, easing the burden on the users.
• Modern smartphones include functionalities enabling WiFi and 3G/4G connections at the same time. In particular, this functionality avoids disconnections if the user is using the WiFi interface for Internet connection when a validation round begins. Hence, before the user's WiFi is disconnected to become an MHS or to connect to a nearby MHS, the connection is migrated to mobile data (3G/4G) to ensure continuity. This allows the location validation algorithm to run without disrupting existing connections on the user's smartphone.
• The WiFi-based algorithm of LVS has the remarkable advantage that neighbors and MHSs mutually authenticate each other. This gives additional security to the system, as more location validation information is available to the PSP to prevent LSAs.
• The cooperation of the users in acting as MHSs can be guaranteed by using an efficient and effective incentive mechanism, such as [14]. Therefore, the assumption that enough users acting as MHSs will be available at each validation round is sound.
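The spotting step and the epoch-termination rule described in this subsection can be sketched as follows. This is a simplified illustration rather than the authors' implementation: positions are plain 2D points, WiFi reachability is modeled as a fixed radius `wifi_range`, every non-MHS user connects to every MHS in range, `M` is an integer percentage, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def validation_round(positions, mhs_ids, wifi_range):
    """Return the set of (spotter, spotted) pairs produced in one round.

    Spotting is mutual: if neighbor u reaches MHS m, both (m, u) and (u, m)
    are recorded. Assumption: a device acting as MHS does not connect to
    another MHS during the same round.
    """
    spotted = set()
    for m in mhs_ids:
        for u, pos in positions.items():
            if u in mhs_ids:
                continue
            if math.dist(pos, positions[m]) <= wifi_range:
                spotted.add((m, u))
                spotted.add((u, m))
    return spotted


def epoch_length(rounds, advertised_users, q, M, e_max):
    """Number of rounds in the epoch, i.e. min(e_M, e_max).

    rounds: list of per-round spotting sets, in chronological order.
    e_M is the first round after which at least M% of the advertised users
    have been validated by at least q distinct users.
    """
    validators = defaultdict(set)
    # Integer ceiling of M% of the users (avoids floating-point rounding).
    needed = (M * len(advertised_users) + 99) // 100
    for e, spotted in enumerate(rounds[:e_max], start=1):
        for spotter, target in spotted:
            validators[target].add(spotter)
        validated = sum(1 for u in advertised_users if len(validators[u]) >= q)
        if validated >= needed:
            return e
    return e_max


if __name__ == "__main__":
    # Scenario loosely following Figure 6: B is the MHS in round j, then
    # B moves close to D and D acts as the MHS in round j + 1.
    round_j = validation_round(
        {"A": (0, 0), "B": (30, 0), "C": (60, 0), "D": (200, 0)}, {"B"}, 60)
    round_j1 = validation_round(
        {"A": (0, 0), "B": (170, 0), "C": (60, 0), "D": (200, 0)}, {"D"}, 60)
    print(sorted(round_j), sorted(round_j1))
    print(epoch_length([round_j, round_j1], ["A", "B", "C", "D"],
                       q=1, M=100, e_max=5))
```

The choice of D as the MHS in the second round is an assumption made only to complete the example; the text only states that D and B validate each other in round \(j+1\).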
D. Selection of MHSs
The selection of MHSs at each validation round is an extremely important problem for the PS system performance. MHSs should be chosen so as to maximize sensing area coverage and, therefore, validate as many users as possible at each validation round. In this subsection, we define the problem of optimum selection of MHSs at each validation round and prove that it is an NP-Hard problem. In particular, we first seek a minimum set of users to act as MHSs such that every other user with at least one neighbor may connect to at least one MHS. Next, we present an approximation algorithm which yields a suboptimal solution in polynomial time.

Available at https://play.google.com/store/apps/details?id=it.opbyte.superdownload
Problem 1 (P1). Let \(U_j = \{u_1, \ldots, u_N\}\) be the set of users of the PS system at time \(t_j\) having at least one neighbor in WiFi range. Given the position \(L_k^j\), \(1 \le k \le N\), of every user \(u_k\) at time \(t_j\), select a subset \(P_j \subseteq U_j\) to act as MHSs such that (i) each user with at least one neighbor can connect to at least one MHS, and (ii) \(|P_j|\) is minimum.

Lemma 1: Problem 1 is NP-Hard. Proof shown in the Appendix.

To solve P1, we use the greedy algorithm proposed by Chvátal in [21]. The complexity of the algorithm is \(O(|P_j| \log |P_j|)\) with the best known sorting algorithm. It can be proven [21] that this algorithm achieves an approximation ratio of \(H(|P_j|)\), where \(H(n)\) is the \(n\)-th harmonic number, i.e.,
\[ H(n) = \sum_{k=1}^{n} \frac{1}{k} \le \ln(n) + 1. \]

Proof: We prove the hardness of the problem by reducing the optimization version of the set cover problem (O-SCP), which is known to be NP-Hard, to P1. O-SCP is stated as follows. Given a set \(X\) of elements (called the universe) and a set \(S\) of \(n\) subsets whose union equals the universe, identify the smallest subset of \(S\) whose union equals the universe. By defining \(V_j = X\) and \(Z_j = X\), a set cover for \(X\) is a solution \(Q_j\) to P1, and vice versa. Therefore, O-SCP \(\le_P\) P1, which means P1 is NP-Hard.
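A greedy selection in the spirit of the set-cover heuristic of [21] can be sketched as follows. This is an illustrative sketch, not the authors' code: it assumes that WiFi reachability is a fixed radius `wifi_range`, that a user selected as MHS counts as covered, and that ties are broken by user ID; the names `neighbors` and `greedy_mhs_selection` are hypothetical.

```python
import math


def neighbors(positions, wifi_range):
    """Map each user to the set of other users within WiFi range."""
    return {
        u: {v for v, q in positions.items()
            if v != u and math.dist(p, q) <= wifi_range}
        for u, p in positions.items()
    }


def greedy_mhs_selection(positions, wifi_range):
    """Greedily pick MHSs until every user with a neighbor is covered.

    A user is covered when she is an MHS or has an MHS in range. At each
    step the candidate covering the most still-uncovered users is chosen.
    """
    nbrs = neighbors(positions, wifi_range)
    # Only users with at least one neighbor take part in the problem.
    candidates = sorted(u for u, n in nbrs.items() if n)
    uncovered = set(candidates)
    selected = []
    while uncovered:
        best = max(candidates,
                   key=lambda u: len(({u} | nbrs[u]) & uncovered))
        selected.append(best)
        uncovered -= {best} | nbrs[best]
    return selected


if __name__ == "__main__":
    users = {
        "A": (0, 0), "B": (30, 0), "C": (60, 0),   # first cluster
        "D": (200, 0), "E": (230, 0),              # second cluster
        "F": (500, 0),                             # isolated, no neighbor
    }
    print(greedy_mhs_selection(users, wifi_range=40))
```

The loop always terminates because any still-uncovered user is herself a candidate that covers at least one uncovered user. This straightforward version rescans all candidates at every step, so it is simpler but slower than an implementation tuned for the complexity quoted above.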
E. Attacks Formalization
After defining the main components of LVS, it is possible to extend the threat model by formally defining two attacks in addition to the LSA, which is formally restated here for the sake of readability:
• Location Spoofing Attack. Let a PS system have \(N\) active users \(U_j = \{u_1, \ldots, u_N\}\) at time \(t_j\) and \(W\) location areas \(A = \{A_1, \ldots, A_W\}\). A location-spoofing attack (LSA) is performed when one or more users \(u_s\) (called spoofers) belonging to the set \(U_s \subseteq U_j\) advertise to the PSP a position (fake position) in a location area \(A_{fl}\) (fake location area), while their real location is in the location area \(A_{rl}\) (real location area), where \(A_{fl} \ne A_{rl}\). The location in \(A_{fl}\) is provided continuously by the spoofer. We assume that during the attack, spoofers can move from one location area to another, but the condition \(A_{fl} = A_{rl}\) is never met.
• Collusion Attack. The collusion attack is performed when one or more sets of users \(U_c = \{u_1, \ldots, u_c\}\) perform an LSA by advertising the location area \(A_k\) while they are in location areas different from \(A_k\), and at each validation round each user in \(U_c\) validates the fake position in \(A_k\) of all other users in \(U_c\). This attack can represent a situation in which a group of users is expected to be in a specific place, while they are all in different places. Thus, they collude by mutually validating the fake position.
• Fraud Covering Attack. In the fraud covering attack, a user \(u_m\) performs an LSA by advertising a position in \(A_j\) while being located in \(A_k\). At the same time, another user \(u_f\) actually residing in \(A_j\) validates at each validation round the position of \(u_m\) in \(A_j\). With this attack, a user can be located in a low-density area, so as not to be spotted by other nodes residing in \(A_k\), while pretending to provide information on a different area.

F. Chains of Sight
Chains of Sight (CoSs) represent the situation of a location area by describing the series of direct and indirect spottings between users in a validation epoch. CoSs have been designed to improve the performance of LVS and to effectively tackle the Collusion and Fraud Covering attacks described above. During each validation epoch, each user keeps track of the users spotted in the various rounds and shares this knowledge with the users spotted in the following rounds. As an example, if the user \(u_i\) spotted the user \(u_j\) in round \(r\), then when \(u_i\) is spotted by the user \(u_k\) in round \(r+1\), \(u_i\) will tell \(u_k\) about the presence of \(u_j\) in the area. Thus \(u_k\) indirectly spots \(u_j\) and also validates the position of \(u_j\). This information is expressed through a CoS in the following form: \(u_k \to u_i/u_j\), read as "\(u_k\) sees \(u_j\) through \(u_i\)". It follows that CoSs make it possible to reduce the number of validation rounds per epoch. A CoS has two main elements: the spotted node, which is the user identifier at the right end of the chain, and the length, which is the number of users in the chain. Notice that the chain length cannot be greater than \(\psi_{\max}\). At the end of each validation round, the collection of CoSs stored by a user \(u_i\) is defined as the user area knowledge \(\Omega_{u_i}\), while the collection of all user area knowledges composes the global area knowledge.

Formally, the CoSs are generated according to the following algorithm:
1) At each sensing round, the MHSs send their current area knowledge to the spotted users.
2) Spotted users send their current area knowledge only to their hot spot.
3) Each user \(u_l\), including hot spots, updates its area knowledge \(\Omega_l\) with the useful information from the received area knowledge(s), called \(\Omega_r\), using the following algorithm:

Algorithm 1 Updating \(\Omega_l\)
  for all \(\gamma_i\) in \(\Omega_r\) do
    if \(\gamma_i\) not in \(\Omega_l\) then
      \(\Omega_l = \Omega_l \cup \{\gamma_i\}\)
    end if
  end for

Figure 7 reports an example of a validation epoch composed of four rounds in a specific location area, where a fraud covering attack is being performed. We will use this example to give a better understanding of the CoS algorithm. The user C is located in a different location area and colludes with B, who will always validate the position of C in the area of Figure 7. In the first validation round, the users B and E are selected as MHSs. Thus, B validates the position of D, who is nearby, and maliciously also validates the position of C, whilst E validates the position of F. We recall that the authentication is mutual; thus D and C will also validate the location of B, while F will validate the location of E. The area knowledge will be the following:
  \(\Omega_A = \emptyset\)
  \(\Omega_B = \{B \to C;\ B \to D\}\)
  \(\Omega_C = \{C \to B\}\)
  \(\Omega_D = \{D \to B\}\)
  \(\Omega_E = \{E \to F\}\)
  \(\Omega_F = \{F \to E\}\)
In the second round, B is spotted by the MHS F and, in the information exchange, will tell F that he spotted D and C in the former round. Thus, the knowledge of F at the end of the second round will be:
  \(\Omega_F = \{F \to E;\ F \to D;\ F \to B;\ F \to B/C\}\)

Fig. 7: Example of fraud covering (panels: Round 1, Round 2, Round 3, Round 4; legend: Normal User, Spoofer, MHS, Colluder).

For the sake of brevity, we omit the knowledge of the other users. In the third validation round, the user C (spoofer) is selected as MHS together with A. C validates the position of B and no one else, since she cannot see anyone in the area. In the fourth round, B is chosen again as hot spot, together with D, and B will validate again the position of D. The area knowledge of the various users at the end of round four is represented with a shortened notation in the following:
  \(\Omega_A = \{A \to \{B, E, F\};\ A \to B/D;\ A \to B/C\}\)
  \(\Omega_B = \{B \to \{A, C, D, E, F\}\}\)
  \(\Omega_C = \{C \to B;\ C \to B/\{D, E, F\}\}\)
  \(\Omega_D = \{D \to \{B, F\};\ D \to F/E\}\)
  \(\Omega_E = \{E \to \{A, B, F\};\ E \to B/\{C, D\}\}\)
  \(\Omega_F = \{F \to \{A, B, D, E\};\ F \to B/C\}\)
At the end of the fourth validation round, each user has been spotted by at least four other users (80%). However, the presence of C is validated directly only by B. Analyzing the CoSs, it is easy to see that all other users validate the presence of C indirectly, from the information received from B. It is unlikely that no other user directly validates C; thus, as formalized in the following, if this suspicious situation is repeated for a specific number of rounds, C is considered a spoofer and B a colluder.

Chains of sight are effective in countering the Collusion and Fraud Covering attacks. In the collusion attack, all the colluding users will provide to the system chains of sight where each user in the chain belongs to the set \(U_C\) of the colluding users. Assuming that the number of colluding users is lower than the average length of a CoS, to find colluding users it is sufficient to find CoSs that report a set of users \(U_S\) that are never spotted, directly or indirectly, by users outside this set. Formally, calling \(\Omega_{u_k}\) the list of CoSs owned by the user \(u_k\) at a validation epoch, a collusion attack is detected if
\[ \forall\, u_s \in U_S,\ u_k \in U_K: \quad \Omega_{u_s} \cap \Omega_{u_k} = \emptyset \]
where \(U_K\) is the set of all users declaring their position in the location area that are not part of \(U_S\), i.e., \(U_S \cap U_K = \emptyset\). In such a case, all users in \(U_S\) are deemed malicious after the threshold of \(\theta_c\) validation epochs.

Chains of sight also make it possible to tackle the Fraud Covering Attack described above. In fact, the fake location of the user \(u_m\) is only validated by the user \(u_f\). Thus, every chain of sight that validates the position of \(u_m\) will have one of the following two formats:
\[ u_i \to u_j / \ldots / u_f / u_m \qquad\qquad u_f \to u_m \tag{1} \]
Given a threshold \(\theta_f\), if for a number of validation epochs greater than \(\theta_f\) all chains validating the position of \(u_m\) are in the formats of Eq. 1, \(u_m\) and \(u_f\) are deemed malicious.

G. Reputation Algorithm
Let us now introduce the reputation model used by LVS to rule out the reports submitted by users spoofing their location. LVS assigns to each user \(u_i\) a reputation value \(\rho_i^m\), which is updated at the end of the \(m\)-th validation epoch according to the following relation, inspired by the Jøsang reputation model [22]:
\[ \rho_i^m = b_i^m - d_i^m - u_i^m \]
where \(0 \le \rho_i^m, b_i^m, d_i^m, u_i^m \le 1\). In detail, \(b_i^m\), \(d_i^m\) and \(u_i^m\) are respectively the belief, disbelief and uncertainty levels associated with the reputation of user \(u_i\) after the \(m\)-th validation epoch. These three values are updated at the end of the \((m-1)\)-th validation epoch according to Algorithm 2.

Algorithm 2 Updating \(\rho_l\)
  \(\rho_l = b_l - d_l - u_l\)
  \(b_l + d_l + u_l = 1\)
  for all \(u \in U_j\) do
    if \(u\) location is verified then
      \(b_l = b_l + \Delta_b\)
      \(u_l = u_l - \Delta_b\)
      \(d_l = d_l - \Delta_b\)
    else if \(u\) location is not verified then
      \(u_l = u_l + \Delta_u\)
      \(b_l = b_l - \Delta_u\)
    else if \(u\) location is fake \(\wedge\) \(u\) is malicious then
      \(d_l = d_l + \Delta_d\)
      \(b_l = b_l - \Delta_d\)
      \(u_l = u_l - \Delta_d\)
    end if
  end for

Let us now explain the algorithm in detail. By defining \(A_i^d\) as the location area advertised by user \(u_i\), the location of user \(u_i\) is verified when, at the end of a validation epoch, her position has been validated by at least \(q\) users. The location of user \(u_i\) is not verified when, at the end of a validation epoch, fewer than \(q\) users have validated the position of \(u_i\) to be in the location area \(A_i^d\). Finally, the location of \(u_i\) is considered fake when her position has been validated by \(q_e\) users in a location area \(A_i^e \ne A_i^d\) and \(q_e > q\). The user reputation is also lowered, as if her location were considered fake, when one or more of the conditions related to the collusion and fraud covering attacks apply.

Since the condition \(b_l + d_l + u_l = 1\) must always hold, after each update the three components are normalized. We point out that \(\Delta_b\), \(\Delta_d\) and \(\Delta_u\) are configurable parameters of the LVS framework and can be varied to best fit different configurations with different values of \(T_r\), \(e_{\max}\), user density and number of location areas.

IV. RESULTS
The goal of the experimental evaluation is to assess the viability of the WiFi-based approach of LVS in indoor and outdoor scenarios, as well as its energy efficiency. In particular, we validate the following assumptions.

• First, some time is necessary to ensure effective WiFi pairing between the MHS and its neighbors. Moreover, considering that smartphones are battery-powered devices, it is not sound to assume that devices will always have their WiFi interfaces active. Therefore, before the pairing phase, it is necessary to wait until the WiFi interface becomes active. Such overhead must be reasonably low with respect to the total time the user and her MHS will be connected in the current validation round.
• Second, supposing that a normally moving user and an MHS are nearby when the validation protocol starts, if the user is not in the MHS range for enough time, the location validation will not occur.

To validate these assumptions, we performed experiments aimed at measuring the number of successful mutual verifications between two users. The experiments have been performed using two Galaxy Nexus 4 smartphones running Android 4.4, both indoors and outdoors, with different configurations, user speeds, distances and movement patterns. Specifically, in each experiment the two users (hereafter referred to as U1 and U2) perform the following operations. At the beginning of each experiment, both U1 and U2 have their WiFi interface off. As soon as the experiment starts, U1 becomes an MHS and activates the built-in WiFi hotspot feature, while U2 simply turns on the WiFi interface and attempts to connect to U1. After 30 seconds from the beginning of each experiment (i.e., $T_{vt} = 30$ s), U1 and U2 shut down their WiFi interfaces, ending the experiment. We developed a simple Android application which implements the LVS authentication protocol.

The indoor experiments (results summarized in Table II) have been performed at the National Research Council (CNR) building in Pisa, Italy. The indoor experimental setup is depicted in the upper side of Figure 8 (reported in the Appendix due to space limitations), with the following configurations. Each test has been repeated 15 times under the same conditions.

• Experiment 1 (E1). In this experiment, U1 moves along a linear pattern while U2 stands still. The two users are physically separated by a wall, as can be seen in Figure 8. The experiment has been performed with U1 moving at two different speeds, namely 6 km/h and 15 km/h, to evaluate the effectiveness of LVS at different walking speeds.
• Experiment 2 (E2). Both U1 and U2 move along straight, parallel linear patterns but in opposite directions. As shown in Figure 8, this experiment is performed with several obstacles between U1 and U2. The two users move on parallel trajectories which are 19.2 meters apart. The presence of the obstacles and the moving speed caused the authentication protocol to fail 9 times out of 15 in the slow-speed experiments, and 11 times out of 15 in the fast-speed experiments.
• Experiment 3 (E3). Same configuration as E2, but with the users moving in the same direction and at the same speed.

Fig. 8: Indoor and outdoor experimental setups.

Table II shows that if users are in the same room or in nearby rooms, the mutual verification is almost guaranteed to succeed. However, walls and interference caused by other electronic devices may affect the mutual verification, as E2 shows. The improvement in E3 is due to the reduction of the variance of the perceived signal strength between U1 and U2.

The outdoor experiments (summarized in Table III) have been performed on the premises of the National Research Council (CNR) of Pisa, Italy (lower side of Figure 8). Outdoor experiments differ from indoor ones in distances and obstacles.

In these experiments, U1 and U2 move on parallel linear trajectories in opposite directions. Overall, we performed 5 sets of outdoor experiments, with different distances and types of obstacles. As for the indoor experiments, each test has been performed 15 times at two different user speeds, namely 6 km/h and 15 km/h. Experiments 1.1, 1.2, and 1.3 (lower side of Figure 8) have been performed with the users moving in different aisles of a parking lot. Each aisle is separated by two rows of cars. Details on the distances and on the obstacles between the users are summarized in Table III. Experiment 1.4 has been performed at a distance of 70 meters in a condition of partial line of sight. Experiment 1.5 has been performed in the central plaza of the CNR area, with the users in direct line of sight. From this set of experiments, we derive that in the absence of walls, with the users in line of sight, the verification is always performed correctly if they are within a range of 60 meters. Experimental results indicate that the LVS approach is viable in both outdoor and indoor environments, given that the mutual authentication occurs almost every time in every considered experimental scenario.

[Table I: columns Energy (µAh), Power (mW) and Lifetime (h), each with a 90% CI; first row LTE, energy 5783.20.]

TABLE II: Details of Indoor Experiments. [Columns: Exp., Distance, 6 km/h, 15 km/h.]

TABLE III: Details of the outdoor experiments. [Columns: Exp., Dist., Obstacles, 6 km/h, 15 km/h.]
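The second assumption above (a moving user must stay within MHS range long enough) can be illustrated with a back-of-the-envelope calculation. The following is a minimal sketch and not part of LVS: it assumes two users on parallel straight paths, an idealized circular WiFi range with no obstacles, and hypothetical function and parameter names. The 50 m range is the idealized value used later in the simulations; the time needed for the protocol exchange (`exchange_s`) is a placeholder, not a measured figure.

```python
import math


def contact_time_s(range_m, lane_gap_m, speed1_kmh, speed2_kmh, opposite=True):
    """Seconds during which two users on parallel straight paths are in range.

    Idealized geometry: the users are in range while their along-track
    separation is at most sqrt(range^2 - gap^2), i.e. half the chord that
    the other user's path cuts through the circular range.
    """
    if lane_gap_m > range_m:
        return 0.0  # paths never come close enough
    half_chord_m = math.sqrt(range_m ** 2 - lane_gap_m ** 2)
    v1 = speed1_kmh / 3.6  # km/h -> m/s
    v2 = speed2_kmh / 3.6
    relative_speed = v1 + v2 if opposite else abs(v1 - v2)
    if relative_speed == 0:
        return math.inf  # same speed, same direction: separation never changes
    return 2 * half_chord_m / relative_speed


def can_validate(contact_s, setup_s, exchange_s):
    """True if the contact window covers WiFi setup plus the protocol exchange."""
    return contact_s >= setup_s + exchange_s


if __name__ == "__main__":
    # E2-like geometry: paths 19.2 m apart, opposite directions.
    for speed in (6, 15):
        t = contact_time_s(50.0, 19.2, speed, speed, opposite=True)
        print(f"{speed} km/h: {t:.1f} s in range")
```

Under these idealized assumptions the contact window shrinks in inverse proportion to the relative speed, which is consistent with the lower success rate observed at 15 km/h. Indoor obstacles reduce the effective range further, so the real window is shorter than this sketch suggests.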
A. Energy consumption evaluation
In order to measure with a high degree of precision the energy efficiency of the LVS location verification algorithm, we set up the experimental testbed depicted in Figure 9. In particular, we used the Power Monitor device [16] to acquire the instantaneous power and current consumption of the smartphone. The tool provides an interface through which we could also obtain an estimate of the battery lifetime of the smartphone according to the current energy consumption rate. The measurements are then averaged over a predefined period of time and repeated over different experiments.

Fig. 9: Experimental setup to calculate energy consumption.

The following experiments have been performed. First, we evaluated the energy consumption of the smartphone while running the sensing app and LTE, with the WiFi interface and the LVS localization mechanism turned off. Next, we measured the energy consumption of the smartphone when the sensing app is running, LTE is active and the LVS localization mechanism is active, that is, the WiFi interface is turned on for $T_{vt} = 30$ seconds for neighbor discovery and then turned off. Finally, we evaluated the energy consumption of the smartphone when the sensing app is running, WiFi is active 100% of the time and LTE is active. Table I summarizes the energy consumed, the average power consumption and the expected residual lifetime, assuming a battery of 2600 mAh (i.e., the one equipping the Samsung Galaxy S4). The experiments have been conducted by monitoring the energy consumption for 200 seconds and then averaging over different repetitions. In all experiments, the screen was turned off and no other apps were running on the phone; in particular, no app was generating any downstream or upstream traffic, and WiFi was not connected to any network.

Table I highlights the significant difference in energy consumption between WiFi and LTE. In particular, Table I shows a difference of almost 1600 µAh between WiFi and LTE. This was expected, given that WiFi consumes a large amount of energy even when not connected to a network [23]. However, the most important result indicated by Table I is that LVS has almost no impact on the energy consumption of the smartphone, as WiFi is turned on only for a short period of time (30 seconds) and remains turned off most of the time.
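To make the relation between the quantities in Table I explicit, the following minimal sketch converts a charge reading over a measurement window into an average current and an estimated battery lifetime. It is an illustration, not the Power Monitor tool's actual computation: it assumes that the reported charge refers to the 200-second window, that lifetime is simply capacity divided by average current, and that the duty-cycle helper (with its hypothetical currents and period) is a reasonable first-order model. All function names are hypothetical.

```python
def average_current_ma(charge_uah, window_s):
    """Average current (mA) given the charge drawn (uAh) over a window (s)."""
    window_h = window_s / 3600.0
    return (charge_uah / 1000.0) / window_h


def lifetime_h(capacity_mah, current_ma):
    """Estimated lifetime (h), assuming a constant average current draw."""
    return capacity_mah / current_ma


def duty_cycled_current_ma(off_ma, on_ma, on_s, period_s):
    """First-order average current when WiFi is on for on_s out of period_s."""
    on_fraction = on_s / period_s
    return off_ma * (1.0 - on_fraction) + on_ma * on_fraction


if __name__ == "__main__":
    # LTE-only charge reading from Table I, 200 s window, 2600 mAh battery.
    lte_ma = average_current_ma(5783.20, 200)
    print(f"average current: {lte_ma:.1f} mA")
    print(f"estimated lifetime: {lifetime_h(2600, lte_ma):.1f} h")
```

The duty-cycle helper shows why a 30-second WiFi activation has limited impact: the extra current is weighted by the fraction of time the interface is on, so the longer the interval between validations, the closer the average stays to the LTE-only baseline.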
B. Simulation results
In this section, we evaluate through simulation experiments the performance of LVS in terms of resilience to attackers and efficiency. To simulate a realistic environment, we modeled the sensing area as a single location area of 4 square kilometers (the size of a small city or city block). As far as user mobility is concerned, we assumed that users move about the location area following the Truncated Lévy Walk (TLW) mobility model [24], which has been shown to best represent the mobility of humans [25]. Due to space limitations, we refer the reader to [25] for additional insights.

For the sake of simplicity, we modeled the WiFi range of the smartphones as circles centered on the user with a radius of 50 meters. As default reputation parameters, we chose $\Delta_b$ = 0. , $\Delta_d$ = 0. , and $\Delta_u$ = 0. . The setup time $T_{sw}$ has been set to 7 seconds according to the experimental evaluation of Section II, while the validation round time $T_{vr}$ has been set to 15 s. The validation epoch threshold $M$ and $\theta$ have been respectively set to 0.9 and 0.8, while the $q$ parameter has been set to in all experiments. The confidence intervals are set to 95%. For the sake of graphical clarity, the confidence intervals are not shown when they are below 1% of the average. In the following, we refer to the participants not faking their position as "users", and to the participants who fake their position and implement the LSA attack described in Section II as "attackers". For the sake of simplicity, and without loss of generality, we also assumed that users remain active inside the same location area for the whole duration of a validation epoch.

Fig. 10: Reputation of users (users density). [Plot: reputation of users vs. validation rounds, for 50, 75, 100 and 125 users/sq.km.]

Fig. 11: Average time of validation epochs. [Plot: average number of verification rounds vs. users/sq.km., for M = 90%, 80% and 70%.]

Fig. 12: Reputation of attackers (50 users/sq.km). [Plot: reputation of attackers vs. validation rounds, for 5%, 10% and 15% of attackers.]

First, we evaluate the impact of the user density on the users' reputation and on the efficiency of LVS. Specifically, Figure 10 shows the average reputation of users as a function of user density, supposing no attackers are present in the location area. As expected, Figure 10 shows that a greater user density corresponds to a faster increase of the user reputation level over time, which is due to the faster termination of each validation epoch of LVS. This is further confirmed by Figure 11, which depicts the average duration of the validation epochs of LVS as a function of user density and of the $M$ parameter. Recall that in LVS, a validation epoch ends when $M$ percent of users have their position validated by at least $q$ users.

Figure 11 shows that the validation epoch duration decreases as $M$ decreases and as the user density increases, which is consistent with Figure 10. This means that increasing the $M$ parameter allows a more complete validation of users' locations, at the expense of longer validation epochs. This trade-off should be settled by the administrator of the PS application by considering the average user density and the desired level of security.

As expected, it also turns out (results not shown here for the sake of space) that the validation epoch duration increases as the WiFi range decreases. This is reasonable, since with a shorter range fewer users are within reach of an MHS at each validation round. However, a shorter WiFi range implies that the validation of users' positions will be finer-grained. Therefore, the WiFi range parameter may be set by the administrator of the PS application depending on the desired trade-off between precision and efficiency of LVS.

Users per sq.km.   % of MHSs   C.I.
50                 14.5        2.51
75                 17.33       3.12
100                21.75       3.56
125                24.25       4.11

TABLE IV: Number of users selected as MHSs.

Finally, to further validate the scalability of the LVS approach, Table IV reports the percentage of users selected as MHSs as a function of the user density. Table IV shows that only a small fraction of the users is selected as MHSs, even when the density becomes relatively high.
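The two modeling choices used throughout this section, the circular WiFi range and the rule that closes a validation epoch, can be sketched as follows. This is an illustrative sketch rather than the simulator used for the experiments: the function names, the dictionary layout and the value of `q` in the usage example are assumptions introduced here.

```python
import math


def in_wifi_range(pos_a, pos_b, radius_m=50.0):
    """Idealized range test: positions are (x, y) in meters, range is a circle."""
    return math.dist(pos_a, pos_b) <= radius_m


def epoch_finished(validator_counts, q, m_fraction):
    """Check the epoch termination rule.

    validator_counts maps each user id to the number of distinct users that
    validated her position in the current epoch. The epoch ends when at least
    a fraction m_fraction of users has been validated by at least q users.
    """
    if not validator_counts:
        return False
    validated = sum(1 for count in validator_counts.values() if count >= q)
    return validated / len(validator_counts) >= m_fraction


if __name__ == "__main__":
    counts = {"u1": 2, "u2": 2, "u3": 0, "u4": 3}  # hypothetical epoch state
    print(epoch_finished(counts, q=2, m_fraction=0.75))  # 3 of 4 users validated
    print(epoch_finished(counts, q=2, m_fraction=0.90))  # threshold not reached
```

The sketch makes the trade-off discussed above concrete: raising `m_fraction` (the $M$ parameter) requires more users to reach `q` validations before the epoch can close, so epochs last longer at low user density.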
1) Resilience from attackers:
Let us now evaluate the resilience of LVS to attackers with Figures 12 and 13, which show the average reputation of all the attackers as a function of the percentage of attackers in the system. Specifically, the attack has been simulated by setting the position of all attackers outside the location area, and by making them advertise a random position inside the location area to the PS platform. We recall that colluding attackers are not considered in this analysis. Figures 12 and 13 show that the attackers' reputation decreases faster when the user density is higher, due to the shorter duration of validation rounds. Therefore, LVS detects users not advertising their real position faster when the user density is higher. Moreover, as anticipated earlier, LVS does not increase the reputation level of attackers under any circumstance, given that the location of the attackers will never be validated by any MHS. Also, note that the reputation of attackers never reaches the $\theta$ threshold necessary to accept their reports inside the PS system. Therefore, we conclude that LVS is able to exclude unreliable reports from the PS system and thus protects it from the location-spoofing attack defined in Section II, without compromising the functionality of the PS application. We point out that the security parameters of the reputation algorithm, as well as the validation round time $T_{vt}$, may be tuned by the administrator of the PS application according to the desired trade-off between efficiency and security.
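The behavior described above, where an attacker whose location is never verified cannot reach the acceptance threshold, can be reproduced with a toy version of the reputation update of Algorithm 2. The following is a minimal sketch under stated assumptions, not the authors' implementation: the step sizes `db`, `dd` and `du` are hypothetical values (not the defaults of the simulations), the initial opinion is assumed to be fully uncertain, components are clamped at zero before normalization, and the resulting reputation is clamped to $[0, 1]$.

```python
from dataclasses import dataclass


@dataclass
class Opinion:
    b: float = 0.0  # belief
    d: float = 0.0  # disbelief
    u: float = 1.0  # uncertainty (assumed initial state: fully uncertain)

    def reputation(self) -> float:
        # rho = b - d - u; clamping to [0, 1] is an assumption of this sketch
        return min(1.0, max(0.0, self.b - self.d - self.u))


def update(op, outcome, db=0.1, dd=0.2, du=0.05):
    """One epoch update following the three branches of Algorithm 2."""
    b, d, u = op.b, op.d, op.u
    if outcome == "verified":
        b, u, d = b + db, u - db, d - db
    elif outcome == "not_verified":
        u, b = u + du, b - du
    elif outcome == "fake":
        d, b, u = d + dd, b - dd, u - dd
    else:
        raise ValueError(f"unknown outcome: {outcome}")
    # Clamp at zero (assumption), then normalize so that b + d + u = 1.
    b, d, u = (max(0.0, x) for x in (b, d, u))
    total = b + d + u
    return Opinion(b / total, d / total, u / total)


def simulate(outcome, epochs):
    """Reputation after the same outcome is observed for a number of epochs."""
    op = Opinion()
    for _ in range(epochs):
        op = update(op, outcome)
    return op.reputation()


if __name__ == "__main__":
    theta = 0.8  # acceptance threshold used in the simulations
    print("honest user accepted:", simulate("verified", 20) >= theta)
    print("attacker accepted:", simulate("not_verified", 20) >= theta)
```

With these assumptions, a user verified at every epoch accumulates belief until the reputation exceeds $\theta$, while a never-verified attacker only accumulates uncertainty and stays at the bottom of the scale, mirroring the trend of Figures 12 and 13.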
Fig. 13: Reputation of attackers (125 users/sq.km). [Plot: reputation of attackers vs. validation rounds, for 20%, 30% and 40% of attackers.]
Fig. 14: Reputation of users (network and attacker density). [Plot: reputation of users vs. validation rounds, for 50 users/sq.km. with 10%, 15% and 20% of attackers, and for 125 users/sq.km. with 20%, 30% and 40% of attackers.]

To gain further insights on the impact of the attackers on the reputation of users, Figure 14 shows the reputation of users as a function of the percentage of attackers and of the user density. Figure 14 shows that when the density of users is relatively low (50 users/sq.km.), more validation epochs are needed to increase the reputation of users. This is simply because in this case the user density in the location area becomes very low, and therefore LVS takes additional time to validate the users' locations. However, Figure 14 also shows that when the density of users is relatively high (125 users/sq.km.), LVS is able to tolerate a very high percentage of attackers (40%) without hindering the reputation of users. This is because in this case LVS still retains enough users to validate each user's location and therefore tolerates a higher number of attackers.

V. RELATED WORK
In this section, we survey existing work related to the localization of smartphones and users, and highlight the novel contributions brought by this paper.

Given that location-spoofing software like Fake Locator is able to hijack both GPS and GSM location services, approaches such as the one presented in [26] are prone to the LSA attack and are therefore not suitable to validate user location in PS systems. In addition, the user location obtained through GSM cell triangulation is known by the telephone service providers only, and may not be shared with external parties due to privacy issues. Conversely, the LVS framework does not require any piece of information that cannot be retrieved on smartphones, which is essential for easy deployment.

Existing WiFi-based solutions [27], [28] were specifically designed for indoor environments only, and are therefore not applicable to large-scale outdoor PS systems. Instead, LVS leverages a technique that is valid for both indoor and outdoor PS systems. Although [27], [28] and similar solutions yield a greater accuracy than LVS, we point out that LVS is not aimed at calculating the precise location of users. Instead, the goal of LVS is to verify the user location provided by other localization services and thus counter LSA attacks. Finally, approaches based on ambient-based fingerprints [15], [29] are not suitable in PS scenarios in which users are not able to observe the same phenomenon (e.g., users located in different floors or rooms of a building). The proposed LVS framework, instead, is independent of the collected data type and relies only on WiFi to verify user position. A trust algorithm for distributed environments, similar to the one proposed in this work, has been exploited in [30] to verify attribute values in usage control systems with faulty Attribute Managers, and in [20] to manage a collaborative framework for Android malware analysis.

VI. CONCLUSIONS
In this paper, we have proposed LVS, a location validation system which verifies user location in participatory sensing (PS) systems and counters the location-spoofing attack (LSA). LVS authenticates user location in a distributed and scalable way through the mobile WiFi hotspot capability of modern smartphones. Furthermore, we have introduced the formalism of the Chains of Sight, which are used to implement an algorithm to tackle collusion-based attacks. We have also proposed a reputation-based system built on LVS which rules out reports coming from users spoofing their location. Finally, we have tested the proposed approach in indoor and outdoor testbeds, measured its energy consumption, and shown its effectiveness against the LSA attack through simulations. Results indicate that LVS is energy-efficient, applicable in almost every practical PS scenario, and effective against LSA-based attacks.

TABLE OF SYMBOLS

Symbol               Meaning                                  Placement
$u_i, u_j, \ldots$   Generic user                             III
$A_i, A_j, \ldots$   Generic Location Area                    III
$A^t_k$              Location Area of user $k$ at time $t$    III-C
$T_r$                Validation round duration                III-C
$T_e$                Validation epoch duration                III-C
$T_{sw}$             WiFi setup time of MHS                   III-C
$T_{vt}$             Validation time of MHS                   III-C
$N^{t_j}_i$          Users present in $A_i$ at time $t_j$     III-C
$D^{t_j}_i$          Users declared in $A_i$ at time $t_j$    III-C
$\psi_{max}$         Max duration of validation epoch         III-F
$\Omega_i$           Area knowledge of user $u_i$             III-F

TABLE V: Table of Symbols

REFERENCES
[1] J. Burke, D. Estrin, M. Hansen, A. Parker, N. Ramanathan, S. Reddy, and M. B. Srivastava. Participatory sensing. In: Workshop on World-Sensor-Web (WSW06): Mobile Device Centric Sensor Networks and Applications, pages 117–134, 2006.
[2] P. Mohan, V.N. Padmanabhan, and R. Ramjee. Nericell: Rich monitoring of road and traffic conditions using mobile smartphones. SenSys 2008 - Proceedings of 6th ACM Conference on Embedded Networked Sensor Systems, pages 323–336, 2008.
[3] A. Thiagarajan, J. Biagioni, T. Gerlich, and J. Eriksson. Cooperative transit tracking using smart-phones. SenSys 2010 - Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems, pages 85–98, 2010.
[4] M. Mun, S. Reddy, K. Shilton, N. Yau, J. Burke, D. Estrin, M. Hansen, E. Howard, R. West, and P. Boda. Peir, the personal environmental impact report, as a platform for participatory sensing systems research.
MobiSys'09 - Proceedings of the 7th ACM International Conference on Mobile Systems, Applications, and Services
Pervasive Computing, IEEE, 8(4):50–57, 2009.
[7] E. Miluzzo, N. D. Lane, K. Fodor, R. Peterson, H. Lu, M. Musolesi, S. B. Eisenman, X. Zheng, and A. T. Campbell. Sensing meets mobile social networks: The design, implementation and evaluation of the cenceme application. SenSys 2008 - Proceedings of the 6th ACM Conference on Embedded Network Sensor Systems, pages 337–350, 2008.
[8] W.Z. Khan, Yang X., M.Y. Aalsalem, and Q. Arshad. Mobile phone sensing systems: A survey. Communications Surveys Tutorials, IEEE, 15(1):402–427, First 2013.
[9] Francesco Restuccia, Salvatore D'Oro, and Tommaso Melodia. Securing the Internet of Things: New Perspectives and Research Challenges. arXiv preprint arXiv:1803.05022, 2018.
[10] Francesco Restuccia, Nirnay Ghosh, Shameek Bhattacharjee, Sajal K Das, and Tommaso Melodia. Quality of Information in Mobile Crowdsensing: Survey and Research Challenges. ACM Transactions on Sensor Networks (TOSN), 13(4):34, 2017.
[11] Francesco Restuccia, Sajal K Das, and Jamie Payton. Incentive Mechanisms for Participatory Sensing: Survey and Research Challenges. ACM Transactions on Sensor Networks (TOSN), 12(2):13, 2016.
[12] Francesco Restuccia, Pierluca Ferraro, Timothy S Sanders, Simone Silvestri, Sajal K Das, and Giuseppe Lo Re. FIRST: A Framework for Optimizing Information Quality in Mobile Crowdsensing Systems. arXiv preprint arXiv:1804.11147, 2018.
[13] Francesco Restuccia and Sajal K Das. FIDES: A Trust-based Framework for Secure User Incentivization in Participatory Sensing. In World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2014 IEEE 15th International Symposium on a, pages 1–10. IEEE, 2014.
[14] D. Yang, G. Xue, X. Fang, and J. Tang. Crowdsourcing to smartphones: Incentive mechanism design for mobile phone sensing. Proceedings of the 18th Annual International Conference on Mobile Computing and Networking, pages 173–184, 2012.
[15] M. Talasila, R. Curtmola, and C. Borcea. Improving location reliability in crowd sensed data with minimal efforts. In Wireless and Mobile Networking Conference (WMNC), 2013 6th Joint IFIP , pages 81–86, 2015.
[18] Francesco Restuccia, Andrea Saracino, Sajal K. Das, and Fabio Martinelli. LVS: A wifi-based system to tackle location spoofing in location-based services. In , pages 1–4, 2016.
[19] N.D. Lane, E. Miluzzo, Hong Lu, D. Peebles, T. Choudhury, and A.T. Campbell. A survey of mobile phone sensing. Communications Magazine, IEEE, 48(9):140–150, Sept 2010.
[20] Mario Faiella, Antonio La Marra, Fabio Martinelli, Francesco Mercaldo, Andrea Saracino, and Mina Sheikhalishahi. A distributed framework for collaborative and dynamic analysis of android malware. In , pages 321–328, 2017.
[21] V. Chvatal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4(3):233–235, 1979.
[22] A. Josang. An algebra for assessing trust in certification chains. In Proceedings of the Network and Distributed System Security Symposium, pages 89–99, 1999.
[23] Yuvraj Agarwal, Ranveer Chandra, Alec Wolman, Paramvir Bahl, Kevin Chin, and Rajesh Gupta. Wireless wakeups revisited: Energy management for voip over wi-fi smartphones. In ACM/USENIX Mobile Systems, Applications, and Services (MobiSys). Association for Computing Machinery, Inc., June 2007.
[24] S. Hachem, A. Pathak, and V. Issarny. Probabilistic registration for large-scale mobile participatory sensing. In Pervasive Computing and Communications (PerCom), 2013 IEEE International Conference on, pages 132–140, March 2013.
[25] I. Rhee, M. Shin, S. Hong, K. Lee, and S. Chong. On the levy-walk nature of human mobility. In INFOCOM 2008. The 27th Conference on Computer Communications. IEEE, pages 1597–1606, April 2008.
[26] F. Alcala, J. Beel, A. Frenkel, B. Gipp, J. Lülf, and H. Höpfner. Ubiloc: A system for locating mobile devices using mobile devices. In Proceedings of 1st Workshop on Positioning, Navigation and Communication, pages 43–48, 2004.
[27] P. Bahl and V. N. Padmanabhan. Radar: An in-building rf-based user location and tracking system. In INFOCOM 2000. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE, volume 2, pages 775–784. IEEE, 2000.
[28] H. Liu, Y. Gan, J. Yang, S. Sidhom, Y. Wang, Y. Chen, and F. Ye. Push the limit of wifi based localization for smartphones. In Proceedings of the 18th Annual International Conference on Mobile Computing and Networking, Mobicom '12, pages 305–316, New York, NY, USA, 2012. ACM.
[29] S. P. Tarzia, P. A. Dinda, R. P. Dick, and G. Memik. Indoor localization without infrastructure using the acoustic background spectrum. In Proceedings of the 9th International Conference on Mobile Systems, Applications, and Services, MobiSys '11, pages 155–168, New York, NY, USA, 2011. ACM.
[30] Mario Faiella, Fabio Martinelli, Paolo Mori, Andrea Saracino, and Mina Sheikhalishahi. Collaborative attribute retrieval in environment with faulty attribute managers. In 11th International Conference on Availability, Reliability and Security, ARES 2016, Salzburg, Austria, August 31 - September 2, 2016