Bluetooth-based COVID-19 Proximity Tracing Proposals: An Overview
Meng Shen,
Member, IEEE,
Yaqian Wei, and Tong Li*
Abstract: Large-scale COVID-19 infections have occurred worldwide, with a tremendous impact on the economy and people’s lives. The traditional method for tracing a contagious virus, for example, determining the infection chain according to the memory of infected people, has many drawbacks. With the continuous spread of the pandemic, many countries and organizations have started to study how to use mobile devices to trace COVID-19, aiming to help people automatically record information about encounters with infected people through technology, reducing the manpower required to determine the infection chain and alerting people at risk of infection. This article gives an overview of various Bluetooth-based COVID-19 proximity tracing proposals, including centralized and decentralized proposals. We discuss the basic workflow and the differences between them before providing a survey of five typical proposals with explanations of their design features and benefits. Then, we summarize eight security and privacy design goals for Bluetooth-based COVID-19 proximity tracing proposals and apply them to analyze the five proposals. Finally, open problems and future directions are discussed.
Index Terms: COVID-19, Bluetooth-based, proximity tracing, security, privacy
I. INTRODUCTION
The COronaVIrus Disease of 2019, referred to as COVID-19, has become a global pandemic, with tens of millions of people infected and hundreds of thousands of deaths. The large-scale infection has had a tremendous impact on people’s livelihoods and on the economies of many countries. Many countries have had to lock down cities to restrain the development of the pandemic, which prevents people from working and traveling. Therefore, how to effectively curb the spread of COVID-19 has become a focus of research.
Traditionally, to trace people who may be at risk of infection, the infected person needs to actively recall where they have been and whom they have contacted during the infection period. Experts trace the people at risk of infection by constructing a relationship network and isolate them to cut off the source of infection. However, relying on the memory of the infected person is likely to miss key information. When the infected person has been to a crowded place, he/she cannot enumerate the strangers who came into close contact with him/her, which makes the analysis difficult for experts.
M. Shen and Y. Wei are with the School of Computer Science, Beijing Institute of Technology, Beijing 100081, China (e-mail: [email protected], [email protected]). T. Li is with 2012 Labs, Huawei, Shenzhen 518129, China (e-mail: [email protected]). T. Li is the corresponding author (e-mail: [email protected]).
After the COVID-19 outbreak, many countries and organizations began to study the use of technological means to trace people who may be infected and deployed applications accordingly. These applications are expected to reduce the labor required to determine infection chains and improve the accuracy of tracing virus infections. There are already dozens of COVID-19 tracing applications. Because they inevitably need to collect certain user information, how to protect users’ security and privacy has become a focus for researchers. The tracing applications can be divided into three categories based on the data collected: location data, proximity data, and mixed data that includes the former two. Location data can be obtained by using the Global Positioning System (GPS) to identify a user’s latitude and longitude, while proximity data can be obtained by using the Bluetooth function of the mobile device. Bluetooth classifies close contacts with a significantly lower false positive rate than GPS, especially in indoor environments, and it consumes less battery [1]. These Bluetooth-based applications are essentially built on five Bluetooth-based COVID-19 proximity tracing proposals.
In this article, we focus on the Bluetooth-based COVID-19 proximity tracing proposals, which can be divided into centralized proposals and decentralized proposals. First, we summarize the basic workflows of the two categories of proposals and the differences between them. Then we analyze, for two decentralized and three centralized proposals, the generation algorithms of anonymous IDs, the locally stored data, the uploaded data and so on. Moreover, we summarize eight security and privacy design goals of proximity tracing proposals and analyze the five proposals according to them. We find that none of them achieves all the goals. Finally, we shed light on open problems and opportunities for Bluetooth-based COVID-19 proximity tracing proposals.
II. OVERVIEW OF BLUETOOTH-BASED PROXIMITY TRACING PROPOSALS
With the continuous spread of COVID-19 all over the world, many countries and organizations have successively announced Bluetooth-based proximity tracing proposals. The following are five typical proposals. In Asia, Singapore announced a privacy-preserving protocol called BlueTrace [1]. In Europe, the Pan-European Privacy-Preserving Proximity Tracing project, referred to as PEPP-PT [2], comprises more than 130 members across eight European countries. France’s Inria and Germany’s Fraunhofer, as members of PEPP-PT, shared a ROBust and privacy-presERving proximity Tracing protocol, referred to as ROBERT [3]. The Decentralised Privacy-Preserving Proximity Tracing proposal, referred to as DP-3T [4], is an open protocol that ensures personal data and computation stay entirely on an individual’s phone; it was produced by a team of members from across Europe. In North America, under the influence of DP-3T [4], Google and Apple announced a two-phase exposure notification solution, referred to as GAEN [5]. In the first phase, they released Application Programming Interfaces (APIs) that allow applications from health authorities to work across Android and iOS devices. In the second phase, this capability will be introduced at the operating system level to help ensure broad adoption [6].
Many applications are built on these five proposals. Based on BlueTrace, Singapore deployed the application called TraceTogether, which is the world’s first Bluetooth-based proximity tracing system deployed nationwide. The COVIDSafe application, announced by the Australian Government, was also created based on BlueTrace. PEPP-PT has been implemented in Germany, which deployed the application called NTK. The French government has deployed the StopCovid application based on ROBERT to trace COVID-19. Ketju, based on DP-3T, was trialed in Finland and is among the first applications in Europe to use a decentralised approach to proximity tracing. Many countries have released open source applications based on GAEN, such as Corona-Warn-App [7] in Germany, Stopp Corona in Austria, SwissCovid in Switzerland, Immuni in Italy and COVID Tracker in Ireland.
A. Centralized and Decentralized Proposals
According to the role of the server, Bluetooth-based proximity tracing proposals can be divided into two categories. One is centralized proximity tracing proposals, such as BlueTrace of Singapore, PEPP-PT of Europe and ROBERT of France. The other is decentralized proximity tracing proposals, such as GAEN and DP-3T of Europe. Figure 1 (a) and (b) show the workflows of centralized and decentralized proposals, respectively.
In the centralized proximity tracing proposals, users broadcast and receive encounter information (anonymous ID, transmission time, etc.) via Bluetooth. When users are infected with COVID-19, they can upload the encounter information to a central server, which analyzes the encounter information, determines whether any related user is at risk of infection and notifies them. The server plays a vital role in the workflow of centralized proposals: it handles and analyzes the encounter information between users.
In the decentralized proximity tracing proposals, when users are infected with COVID-19, the keys used to generate their anonymous IDs are uploaded to the server. The server then simply passes the keys of these positive users to other users, who regenerate the anonymous IDs and analyze whether they are at risk of infection. The server only stores and distributes the uploaded keys.
The differences between centralized and decentralized proposals are shown in Table I. The back-end server in the centralized proposals handles each user’s pseudonym (a unique pseudo-random identifier) and encounter information.
[Figure 1: workflow diagrams. (a) Centralized: users tested positive upload encounter information; the server handles all the uploaded user pseudonyms, anonymous IDs and encounter information, does matching and risk calculation, and alerts users potentially infected. (b) Decentralized: users tested positive upload keys; the server handles all the uploaded keys and passes them to other users, who do matching and risk calculation and receive alerts.]
Fig. 1. Workflows of Centralized and Decentralized Proposals: (a) Centralized; (b) Decentralized.
The weakness is that the server can associate all the anonymous IDs of each user with his/her pseudonym. This allows operators of back-end servers to monitor users’ behavior. The centralized tracing proposals have been strongly criticized by privacy advocates and other stakeholders in the technical community, who believe that these proposals provide the government with information that can be used to reverse-engineer personal information about individuals [8]. Singapore and Italy have stated that they will switch from centralized applications to decentralized applications. The issue of trust has also prompted the German government, which previously favored a centralized proposal, to adopt a decentralized one. The French Parliament debated similar concerns.
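To make the contrast between the two workflows concrete, the following minimal Python sketch models both server roles. It is an illustration under stated assumptions, not the implementation of any of the five proposals: the names `CentralServer`, `KeyServer`, `derive_ids` and `local_match` are hypothetical, the centralized server simply remembers which anonymous ID it issued to which pseudonym (standing in for decryption with a server-only key), and the key-to-ID derivation is an HMAC stand-in.

```python
import hashlib
import hmac
import secrets


def derive_ids(key: bytes, count: int = 4) -> list:
    """Assumed stand-in: derive `count` 16-byte anonymous IDs from one key."""
    return [
        hmac.new(key, index.to_bytes(2, "big"), hashlib.sha256).digest()[:16]
        for index in range(count)
    ]


class CentralServer:
    """Centralized role: issues anonymous IDs and does the matching itself."""

    def __init__(self):
        self.issued = {}      # anonymous ID -> pseudonym (stands in for decryption)
        self.at_risk = set()  # pseudonyms the server decides to alert

    def issue_id(self, pseudonym: str) -> bytes:
        anon_id = secrets.token_bytes(16)
        self.issued[anon_id] = pseudonym
        return anon_id

    def upload_encounters(self, heard_ids) -> None:
        # A positive user uploads the IDs heard from others; the server links
        # each one back to a pseudonym, which is the weakness discussed above.
        for anon_id in heard_ids:
            if anon_id in self.issued:
                self.at_risk.add(self.issued[anon_id])


class KeyServer:
    """Decentralized role: only stores and redistributes uploaded keys."""

    def __init__(self):
        self.published = []

    def upload_keys(self, keys) -> None:
        self.published.extend(keys)

    def download_keys(self) -> list:
        return list(self.published)


def local_match(heard_ids, published_keys) -> bool:
    """Runs on the phone: regenerate IDs from published keys and compare."""
    heard = set(heard_ids)
    return any(anon_id in heard
               for key in published_keys
               for anon_id in derive_ids(key))


if __name__ == "__main__":
    central = CentralServer()
    bob_id = central.issue_id("pseudonym-bob")
    central.upload_encounters([bob_id])        # Alice tested positive
    print(central.at_risk)                     # the server learns who met Alice

    alice_key = secrets.token_bytes(32)
    key_server = KeyServer()
    key_server.upload_keys([alice_key])        # Alice tested positive
    bob_heard = [derive_ids(alice_key)[1]]     # Bob heard one of Alice's IDs
    print(local_match(bob_heard, key_server.download_keys()))
```

In the centralized half the server ends up holding the set of at-risk pseudonyms, whereas in the decentralized half the only party that learns about the match is the phone that runs `local_match`.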
B. Comparison Between Proximity Tracing Proposals
In this section, the two decentralized proximity tracing proposals (Table II introduces GAEN and the three designs of DP-3T) and the three centralized proximity tracing proposals (Table III introduces BlueTrace, PEPP-PT and ROBERT) are analyzed.

TABLE I
THE DIFFERENCES BETWEEN CENTRALIZED AND DECENTRALIZED PROPOSALS

| | Centralized | Decentralized |
|---|---|---|
| Information obtained by the server | All the user pseudonyms, anonymous IDs and encounter information uploaded by the users tested positive | All the keys uploaded by users tested positive |
| The role that the server plays | Analyze the information obtained and determine whether the related users may be at risk of infection | Store and distribute the keys |
| The data volume communicated between the mobile device and the server | The data uploaded by the users tested positive is small | The server needs to periodically distribute keys to all the users, which means the data volume greatly exceeds that of the centralized proposals |

The two decentralized proposals have roughly similar processes; they differ mainly in the algorithms for generating anonymous IDs. In the formulas of Table II, AnonID represents an anonymous ID, H is a hash function, PRF is a pseudo-random function, PRG is a pseudorandom generator and str is a fixed, public string. In the low-cost design of DP-3T, the seed keys of one user are linkable: t represents the current day, and the seed key of day t can be hashed to generate that of the next day. Thus only the seed key of the first day is needed to generate all the anonymous IDs for the next few days. Each seed key can be used to generate all the anonymous IDs of the day. In the formula of the unlinkable design, epochs i are encoded relative to a fixed starting point shared by all the entities, and LEFT128 takes the leftmost 128 bits of the hash output. This design generates a seed key for each epoch i and hashes it to generate the anonymous ID. Thus, all of these seed keys are unlinkable. The user can choose the time period for uploading; the server then regenerates the hash values based on the seed keys uploaded by the user and puts them into a cuckoo filter before sending them to other users. Compared with the low-cost design, the unlinkable design provides better privacy at the cost of increased bandwidth. The hybrid design uses a time window w, whose length is an integer multiple of the anonymous ID’s validity period, to reduce the validity period of a seed key. The user can also select the time period and time window w for uploading. Compared with the low-cost design, this design requires more bandwidth and storage space, but less than the unlinkable design. GAEN is similar to the hybrid design of DP-3T.
It corresponds to the case where the time window of the hybrid design is one day, but with an upgraded generation algorithm for anonymous IDs. In the formula, SecSeed represents the secondary seed key and PriSeed represents the primary seed key. HKDF is a key derivation function. GAEN first generates a primary seed key every day; these daily keys are not associated with each other. It then uses the primary seed key to derive a secondary seed key, which is used to generate the anonymous IDs.
The servers in the three centralized proposals all hold the user pseudonyms, anonymous IDs and encounter information uploaded. BlueTrace needs to collect the user’s phone number and associates the number with the user’s pseudonym. The differences between them are also mainly reflected in the algorithms for generating anonymous IDs. In addition to generating an anonymous ID using a key known only to itself, the server in ROBERT also uses the anonymous ID and a key known only to itself to generate an encrypted country code, so that the proposal can work across countries. In these proposals, the server plays an important role.
In the decentralized proposals, the users tested positive store the encounter information broadcast by other users on their mobile devices, and upload the keys that generate their own anonymous IDs. In the centralized proposals, by contrast, the users tested positive store and upload the encounter information broadcast by other users. The server handles user pseudonyms and anonymous IDs, so as soon as a user uploads the encounter information, the server can infer whether there are related users at risk of infection. In the decentralized proposals, because the server is only responsible for storing and distributing data, users need to upload keys so that other users can acquire the keys and regenerate the anonymous IDs to match.
DP-3T proposes three different decentralized designs with different bandwidth and privacy requirements. The low-cost design requires the minimum bandwidth and provides the weakest privacy.
The unlinkable design requires the maximum bandwidth and provides the strongest privacy. The bandwidth and privacy of the hybrid design lie between those of the low-cost design and the unlinkable design. GAEN is similar to the hybrid design of DP-3T, but it has a better anonymous ID generation algorithm. The three centralized proposals differ in the information collected from users and in the anonymous ID generation algorithm. BlueTrace needs to collect the user’s phone number, while ROBERT does not. ROBERT uses a secret key known only to the server to encrypt the country/region code as part of the encounter information, while the other two proposals do not. In decentralized proposals, a user uploads keys related to his/her own anonymous IDs to the server, whereas in centralized proposals, the user uploads the encounter information related to other users’ anonymous IDs to the server.
III. SECURITY AND PRIVACY ANALYSIS OF BLUETOOTH-BASED PROXIMITY TRACING PROPOSALS
A. Security Analysis
This section summarizes the security design goals required for the Bluetooth-based proximity tracing proposals, based on the six types of threats in Microsoft’s STRIDE threat model [9], and analyzes the security of the five proposals.
1) Security Design Goals:
The eight security design goals are as follows.
Information confidentiality.
Attackers cannot obtain information transmitted by users through wireless communication.
Information integrity.
When transmitting and storing encounter information, these proposals should ensure that the information is not tampered with by unauthorized entities, or that any tampering can be discovered afterwards.
Normal reception.
A user can normally receive the information broadcast by other users after granting the application permission.
TABLE II
COMPARISON OF THE TWO DECENTRALIZED PROXIMITY TRACING PROPOSALS

| | GAEN | DP-3T: 1. Low-cost design | DP-3T: 2. Unlinkable design | DP-3T: 3. Hybrid design |
|---|---|---|---|---|
| Personal information | None | None | None | None |
| User pseudonym | None | None | None | None |
| Generation algorithm of anonymous ID | $\mathrm{SecSeed}_t = \mathrm{HKDF}(\mathrm{PriSeed}_t)$; $\mathrm{AnonID}_i = \mathrm{PRG}(\mathrm{SecSeed}_t, i)$ | $\mathrm{Seed}_t = H(\mathrm{Seed}_{t-1})$; $\mathrm{AnonID}_1 \Vert \cdots \Vert \mathrm{AnonID}_n = \mathrm{PRG}(\mathrm{PRF}(\mathrm{Seed}_t, \mathrm{str}))$ | $\mathrm{AnonID}_i = \mathrm{LEFT128}(H(\mathrm{Seed}_i))$ | $\mathrm{AnonID}_{w,1} \Vert \cdots \Vert \mathrm{AnonID}_{w,n} = \mathrm{PRG}(\mathrm{PRF}(\mathrm{Seed}_w, \mathrm{str}))$ |
| Generating location | Mobile device | Mobile device | Mobile device | Mobile device |
| Identity of infected users | The server does not know | The server does not know | The server does not know | The server does not know |
| Data saved on mobile devices | Anonymous ID and Associated Encrypted Metadata (version and transmit power level) | Anonymous ID, exposure measurement (for example, signal attenuation) and receiving date | Hash string of anonymous ID and time, exposure measurement and receiving date | Anonymous ID, exposure measurement, time window for receiving anonymous ID |
| Data uploaded | $(\mathrm{PriSeed}_t, t)$ | $(\mathrm{Seed}_t, t)$ | $(\mathrm{Seed}_i, i)$ in the time period selected | $(\mathrm{Seed}_w, w)$ in the time period selected |

Processing of big data.
The application works normally when it receives a large amount of encounter information.
Avoidance of false contact.
Only when two users have close contact can they receive the information broadcast by each other.
Real identity.
An attacker cannot claim to be a certain user.
Authorization.
The user tested positive needs authorization or identity verification before uploading data to the server.
Non-repudiation.
Users cannot deny that they have had close contact with someone.
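Two of the goals above, information integrity and avoidance of false contact, are commonly approached with a message authentication code and a limited validity period. The following is a minimal sketch under stated assumptions: the message layout (16-byte anonymous ID, 4-byte send time, 8-byte truncated HMAC-SHA256 tag), the 15-minute validity period and the function names are all hypothetical and are not taken from any of the five proposals. Verification is done by whoever holds `mac_key`, which in a centralized design would be the server.

```python
import hashlib
import hmac

ID_LEN = 16              # assumed anonymous ID length in bytes
TAG_LEN = 8              # assumed truncated MAC length in bytes
VALIDITY_SECONDS = 900   # assumed validity period of one anonymous ID


def build_message(mac_key: bytes, anon_id: bytes, sent_at: int) -> bytes:
    """Encode anonymous ID, send time and a truncated HMAC-SHA256 tag."""
    body = anon_id + sent_at.to_bytes(4, "big")
    tag = hmac.new(mac_key, body, hashlib.sha256).digest()[:TAG_LEN]
    return body + tag


def check_message(mac_key: bytes, message: bytes, received_at: int) -> bool:
    """Accept a message only if its tag verifies and it is still fresh."""
    if len(message) != ID_LEN + 4 + TAG_LEN:
        return False
    body, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()[:TAG_LEN]
    if not hmac.compare_digest(tag, expected):
        return False  # integrity: the content was altered after it was sent
    sent_at = int.from_bytes(body[ID_LEN:], "big")
    # Freshness: reject messages replayed after the validity period.
    return 0 <= received_at - sent_at <= VALIDITY_SECONDS
```

The freshness check only narrows the window for relayed messages: a copy rebroadcast within the validity period still passes, so a check of this kind reduces false contacts without ruling them out.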
2) Security Analysis of Proposals:
The analysis of these five proposals’ achievement of the security design goals is as follows.
Information confidentiality.
In all the proposals, a user broadcasts information to other nearby users via Bluetooth. In this process, attackers can use tools such as sniffers to obtain messages broadcast by users. However, attackers cannot obtain valid information by analyzing these messages, because of the generation algorithm of anonymous IDs. In decentralized proposals, only users who may be at risk of infection can do the risk calculation. In centralized proposals, only servers can decrypt the encounter information and obtain confidential information about users.
Information integrity.
In the two decentralized proposals, if the seed keys uploaded by users tested positive are tampered with, other users cannot regenerate the real anonymous IDs from the false keys. In GAEN, anonymous IDs and the Associated Encrypted Metadata (AEM) are both encrypted. If they are tampered with, the encounter information regenerated by users based on the real seed keys cannot match them. In DP-3T, if the anonymous IDs or the hash values of the anonymous IDs in the encounter information are tampered with, the anonymous IDs regenerated from the seed keys cannot match them. In PEPP-PT, the anonymous IDs in the encounter information are generated with periodically changing seed keys. When the decrypted user pseudonym is invalid, it can be determined that the information has been tampered with. In BlueTrace, the anonymous IDs contain fields for integrity checking. In ROBERT, the Message Authentication Code (MAC) in the encounter information can be used to check the integrity.
Normal reception.
Any proximity tracing system based on Bluetooth Low Energy (BLE) is vulnerable to jamming by active attackers. Such an attack may stop the normal recording of anonymous IDs, thereby preventing a user from discovering other users. This is an inherent problem of the Bluetooth-based approach.
Processing of big data.
When an attacker sends a large amount of encounter information to a user, the user’s application may use too much memory to store the information, which may cause the application to crash. To mitigate this, a storage limit can be set for the encounter information, but the application is then unable to receive more encounter information once the limit is reached. None of the five proposals can deal with this problem.
Avoidance of false contact.
For all the proposals, false contact incidents cannot be completely avoided. The attacker can record the information broadcast by a user and rebroadcast it to victims as quickly as possible. If the user later tests positive, the victims will mistakenly believe that they are in danger. Technically savvy attackers can use large antennas to artificially increase their broadcast range. Attackers without budget restrictions may relay and broadcast anonymous IDs extensively to create large-scale false contact incidents. All the proposals resist these attacks as far as possible by limiting the validity period of anonymous IDs, but this cannot solve the problem completely.
Real identity.
All the proposals use specific cryptographic algorithms to prevent the attacker from deriving seed keys or user pseudonyms from the collected anonymous IDs, so the attacker cannot pretend to be a certain user.
TABLE III
COMPARISON OF THE THREE CENTRALIZED PROXIMITY TRACING PROPOSALS

| | BlueTrace | PEPP-PT | ROBERT |
|---|---|---|---|
| Personal information | Phone number | Not clear | None |
| User pseudonym | A unique random identity | A unique random identity | A unique random identity |
| Generation algorithm of anonymous ID | Use a seed key known only to the server to encrypt the user pseudonym, creation time and expiration time | Use a seed key periodically generated by the server to encrypt the user pseudonym | Use a seed key known only to the server to encrypt the user pseudonym and the epoch |
| Generating location | The server | The server | The server |
| Identity of infected users | The server knows | Not clear | The server knows user pseudonyms but cannot link them to users’ identities |
| Data saved on mobile devices | Anonymous ID, Received Signal Strength Indication (RSSI), device model, timestamp, etc. | Anonymous ID, metadata and optional device information and device status (RSSI and TX/RX power), timestamp and optional further data (such as WiFi status) | Encrypted country code, anonymous ID, timestamp, Message Authentication Code (MAC) and transmission time |
| Data uploaded | Data saved on mobile devices | Data saved on mobile devices | Data saved on mobile devices |
Authorization.
In all the proposals, users infected with COVID-19 can upload data to the server only after being authorized by health authorities.
Non-repudiation.
In the centralized proposals, the server handles user pseudonyms, anonymous IDs generated from the user pseudonyms and the encounter information uploaded. When two users are in proximity, they send encounter information including anonymous IDs to each other. When one user tests positive and uploads encounter information to the server, the other user cannot deny the proximity contact with him/her, because the server can recover the other user’s pseudonym from the anonymous ID in the encounter information. In the decentralized proposals, if one user tests positive and uploads keys to the server, and another user gets the keys, regenerates the anonymous IDs and matches them successfully, the positive user cannot deny the proximity contact with the other user, because the keys are known only to him/her.
Based on the above analysis, we list the achievement of the eight security design goals by the five proximity tracing proposals in Table IV. None of the five proposals achieves the security design goals of normal reception, processing of big data and avoidance of false contact, while all of them achieve the design goals of confidentiality, integrity, real identity, authorization and non-repudiation.
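The integrity and real-identity arguments in this analysis rest on how anonymous IDs are derived from seed keys (Table II). The sketch below instantiates those formulas so that the tamper-evidence argument can be checked directly. It is illustrative only: the source names H, PRF, PRG and HKDF abstractly, so SHA-256, HMAC-SHA256 and SHAKE-256 are stand-ins chosen here, and the public string, the HKDF label, the use of HMAC for PRG(SecSeed, i) and the figure of 96 IDs per day are assumptions.

```python
import hashlib
import hmac

PUBLIC_STR = b"broadcast key"  # assumed value of the fixed public string str


def next_seed(seed: bytes) -> bytes:
    """Low-cost design: Seed_t = H(Seed_{t-1}), with H assumed to be SHA-256."""
    return hashlib.sha256(seed).digest()


def daily_ids(seed: bytes, n: int) -> list:
    """AnonID_1 || ... || AnonID_n = PRG(PRF(Seed_t, str))."""
    prf_out = hmac.new(seed, PUBLIC_STR, hashlib.sha256).digest()  # PRF stand-in
    stream = hashlib.shake_256(prf_out).digest(16 * n)             # PRG stand-in
    return [stream[16 * i:16 * (i + 1)] for i in range(n)]


def unlinkable_id(seed_i: bytes) -> bytes:
    """Unlinkable design: AnonID_i = LEFT128(H(Seed_i))."""
    return hashlib.sha256(seed_i).digest()[:16]


def hkdf_sha256(key: bytes, info: bytes, length: int = 16) -> bytes:
    """HKDF (RFC 5869) with an all-zero salt."""
    prk = hmac.new(bytes(32), key, hashlib.sha256).digest()   # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                  # expand step
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def gaen_like_id(pri_seed: bytes, i: int) -> bytes:
    """SecSeed_t = HKDF(PriSeed_t); AnonID_i = PRG(SecSeed_t, i)."""
    sec_seed = hkdf_sha256(pri_seed, b"secondary seed")       # assumed label
    # HMAC is used here as a stand-in for PRG(SecSeed_t, i).
    return hmac.new(sec_seed, i.to_bytes(4, "big"), hashlib.sha256).digest()[:16]


def matches(heard_id: bytes, uploaded_seed: bytes, n: int = 96) -> bool:
    """Receiver side: does a stored ID match any ID regenerated from the seed?"""
    return heard_id in daily_ids(uploaded_seed, n)
```

Flipping a single bit of an uploaded seed before calling `matches` produces an unrelated set of IDs, which is the sense in which tampered keys cannot regenerate real anonymous IDs.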
B. Privacy Analysis
This section summarizes the eight privacy design goals required for the Bluetooth-based proximity tracing proposals, based on the six data protection principles of the General Data Protection Regulation (GDPR) of the European Union [10], and analyzes the privacy of the five proposals.
1) Privacy Design Goals:
The eight privacy design goals are as follows.
Right of access.
Users shall have the right to obtain confirmation as to whether or not personal data concerning them are being processed and, where that is the case, access to the personal data and the following information: the purposes of the processing, the categories of personal data concerned, the period for which the personal data will be stored, etc.
Data minimisation.
Personal data shall be adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed.
Right to erasure.
Users shall have the right to obtain the erasure of personal data concerning them without undue delay and the system shall have the obligation to erase personal data without undue delay.
Storage limitation.
Data is kept only for the limited time necessary and then erased.
Untraceability.
Users’ locations cannot be exposed based on the information broadcast by them.
Protection of infected users.
The identity of infected users should not be exposed to unauthorized entities.
Protection of risky users.
The identity of risky users should not be exposed to unauthorized entities.
Protection of interaction information.
The interaction information which reflects close-range physical interactions between users should not be exposed to unauthorized entities.
2) Privacy Analysis of Proposals:
The analysis of the five proposals against the privacy design goals is as follows.
Right of access.
In all proposals, the applications provide an introduction to users before they use specific functions. They inform users of the permissions users should grant, the purpose of collecting the data, the data they will collect, the period for which personal data will be stored, etc.
Data minimisation.
All of the five proposals indicate that the location information of mobile phones is not collected. The amount of personal data processed is limited to the minimum required by the system, and no unnecessary personal data is collected.
Right to erasure.
In the five proposals, users have the right to stop using the applications and delete personal data at any time.
Storage limitation.
All of the five proposals limit the number of days the data can be kept. Once the data expires, it is deleted, ensuring the accuracy of the stored data.
TABLE IV
ACHIEVEMENT OF SECURITY DESIGN GOALS OF THE FIVE PROPOSALS

| Security Design Goals | GAEN | DP-3T | BlueTrace | PEPP-PT | ROBERT |
|---|---|---|---|---|---|
| Information confidentiality | √ | √ | √ | √ | √ |
| Information integrity | √ | √ | √ | √ | √ |
| Normal reception | × | × | × | × | × |
| Processing of big data | × | × | × | × | × |
| Avoidance of false contact | × | × | × | × | × |
| Real identity | √ | √ | √ | √ | √ |
| Authorization | √ | √ | √ | √ | √ |
| Non-repudiation | √ | √ | √ | √ | √ |
Untraceability.
In the two decentralized proposals, since users use cryptographic algorithms to generate anonymous IDs that change periodically, other entities cannot link the anonymous IDs to users by analyzing the broadcasts. In the three centralized proposals, other entities cannot link anonymous IDs to user pseudonyms unless they have the encryption keys, and the encryption keys are held only by the servers. Consequently, other entities cannot link to users by analyzing the anonymous IDs they broadcast.
Protection of infected users.
The data uploaded by the infected user is not related to personal information. In the decentralized proposals, the data uploaded are keys, and in the centralized proposals, they are the encounter information. However, attackers can determine that a user is uploading a large amount of data to the server by tracing the phone numbers of health authorities or observing the network traffic, and infer that the user has tested positive for COVID-19. These attackers can be Internet Service Providers (ISPs), network operators, or hackers who set up malicious access points or sniff public WiFi networks.
Protection of risky users.
In the two decentralized proposals, the server sends the seed keys uploaded by the infected users to the other users, who use the seed keys locally to regenerate the anonymous IDs and calculate the risk score. These seed keys are not associated with the identity of any user at risk of infection, so the decentralized proposals do not disclose information about users at risk of infection to others. In PEPP-PT and ROBERT, all the users periodically send requests to the server to update their risk scores. The format of the reply is the same regardless of whether the user is at risk of infection. Therefore, if the server is trusted and the communication channel is confidential, an eavesdropper cannot distinguish which users are at risk of infection. In BlueTrace, because the server uses users’ phone numbers to notify them of the risk of COVID-19 infection, information about high-risk users may be leaked if an attacker tracks the health authority’s phone number.
Protection of interaction information.
In the two decentralized proposals, the system does not disclose any information about the interaction between two users to any entity. The anonymous IDs derived from the keys uploaded by an infected user have nothing to do with whoever interacted with this user. In the three centralized proposals, only the server can learn an infected user’s interaction information by analyzing the encounter information uploaded by the user. If the server is trusted, other unauthorized parties will not learn this interaction information.
Based on the above analysis, we list the achievement of the eight privacy design goals by the five proximity tracing proposals in Table V. All the proposals achieve the same level of privacy in terms of right of access, data minimisation, right to erasure, storage limitation, untraceability and protection of interaction information, and none of them achieves the privacy goal of protecting infected users. For the privacy design goal of protecting risky users, since BlueTrace needs to collect users’ phone numbers, it may disclose the information of users at risk of infection. In terms of privacy analysis, BlueTrace therefore achieves fewer privacy design goals than the other proposals.
IV. OPEN PROBLEMS AND OPPORTUNITIES
At present, research on Bluetooth-based proximity tracing proposals is still at an exploratory stage, and researchers face many challenges.
A. Precise Proximity Measurement and Risk Calculation
Precise proximity measurement and risk calculation are the key steps in tracing COVID-19. GAEN standardizes four scores, namely attenuationScore, daysSinceLastExposureScore, durationScore and transmissionRiskScore, and multiplies them to calculate the risk value of infection. SwissCovid, which is based on GAEN, uses two attenuation thresholds to divide the attenuation range into three intervals and assigns a different weight to each interval. It uses the GAEN API to request the duration a user spent in each attenuation interval and obtains the risk value of infection as the weighted sum of these durations. GAEN is still evolving, and the measurement and calibration of parameters such as attenuation values, contact duration, thresholds and weights across different operating systems and mobile phone models are not yet complete. To estimate the distance between two users more accurately, GAEN released a Bluetooth Low Energy RSSI Calibration Tool to calibrate as many devices as possible. It collects the RSSI correction and the transmit power of different mobile phone models to improve the consistency of attenuation values across devices [11]. GAEN currently uses this rough calibration method as a stopgap measure. The United Kingdom considered GAEN's RSSI-based distance measurement inaccurate and created its own centralized tracing application. NHSX, the digital innovation unit of the British National Health Service, released the NHS COVID-19 application [12] and trialled it on the Isle of Wight, but system testing identified many technical challenges.

TABLE V
ACHIEVEMENT OF PRIVACY DESIGN GOALS OF THE FIVE PROPOSALS
Privacy Design Goals                    GAEN  DP-3T  BlueTrace  PEPP-PT  ROBERT
Right of access                         √     √      √          √        √
Data minimisation                       √     √      √          √        √
Right to erasure                        √     √      √          √        √
Storage limitation                      √     √      √          √        √
Untraceability                          √     √      √          √        √
Protection of infected users            ×     ×      ×          ×        ×
Protection of risky users               √     √      ×          √        √
Protection of interaction information   √     √      √          √        √

Distance measurement between users could combine Bluetooth and ultrasonic ranging so that each complements the other. In Bluetooth-based proximity tracing proposals, mobile devices broadcast anonymous IDs using Bluetooth Low Energy (BLE), and the attenuation of the Bluetooth signal is generally used as an indirect indicator of the distance between users. Ultrasound is another way to measure distance; it is more accurate and does not depend on special hardware. When the distance between users is greater than the officially recommended safe distance, the attenuation of the Bluetooth signal is sufficient, because this scenario does not require an accurate distance measurement. When users are in close contact, for example when two users shake hands, ultrasonic ranging could be triggered on demand to calibrate the distance measurement.
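The two risk calculations described in this subsection can be illustrated with a minimal Python sketch. It is not the GAEN or SwissCovid implementation: the function names, the example weights and the input values are hypothetical, and the attenuation helper assumes that attenuation is obtained as the transmit power minus the corrected RSSI, which is our reading of how the calibration values in [11] are meant to be combined.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ExposureScores:
    """The four scores named in the text; concrete values are hypothetical."""
    attenuation_score: int
    days_since_last_exposure_score: int
    duration_score: int
    transmission_risk_score: int


def multiplicative_risk(scores: ExposureScores) -> int:
    """Risk value obtained by multiplying the four scores."""
    return (scores.attenuation_score
            * scores.days_since_last_exposure_score
            * scores.duration_score
            * scores.transmission_risk_score)


def weighted_duration_risk(minutes_per_interval: Sequence[float],
                           weights: Sequence[float]) -> float:
    """Weighted sum of the time spent in each attenuation interval."""
    if len(minutes_per_interval) != len(weights):
        raise ValueError("one weight is required per attenuation interval")
    return sum(m * w for m, w in zip(minutes_per_interval, weights))


def calibrated_attenuation(tx_power_dbm: float, rssi_dbm: float,
                           rssi_correction_db: float) -> float:
    """Assumed form: transmit power minus the corrected received strength."""
    return tx_power_dbm - (rssi_dbm + rssi_correction_db)


if __name__ == "__main__":
    print(multiplicative_risk(ExposureScores(2, 3, 4, 5)))
    # Three intervals (near, medium, far) with hypothetical weights.
    print(weighted_duration_risk((10, 20, 30), (1.0, 0.5, 0.0)))
    print(calibrated_attenuation(-20.0, -75.0, 5.0))
```

A multiplicative score drops to zero as soon as any single factor is zero, whereas the weighted sum accumulates exposure across intervals, so the two schemes can rank the same encounter differently.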
B. Security and Privacy Guarantee
Researchers have conducted real-world experiments on the privacy and security risks of GAEN and showed that the current framework design is vulnerable to two kinds of attacks [13]. The first attack profiles an infected person and possibly de-anonymizes them. The researchers used mobile devices as Bluetooth sniffers to capture the anonymous IDs broadcast by users passing through six locations. The captured data appeared random and could not be associated with a single user. However, once a user tests positive and uploads their primary keys, the situation is completely different. By regenerating the user's anonymous IDs from these keys and matching them against the anonymous IDs received by the Bluetooth sniffers at the six locations, an attacker can learn exactly which locations the user visited, and the user's route can be reconstructed from the time information. The attacker can thus collect a lot of information about the user and remove their anonymity. Because the code of GAEN is not open source and the API can only be used by health authorities, the experiment used a simulated tracker that conforms to the anonymous ID encryption specification in the GAEN API. The second attack is a relay-based wormhole attack, in which an attacker fabricates contact events and may seriously affect a tracing application built on GAEN. The researchers built a multi-location wormhole from Raspberry Pi devices with Bluetooth Low Energy (BLE). First, a wormhole device sends the encounter information collected at one location to a central Message Queuing Telemetry Transport (MQTT) server. The server distributes the received messages among the wormhole devices, which copy each beacon and rebroadcast it within the validity period of the anonymous ID (10 minutes). In this way, the researchers established a logical connection between mobile devices 40 kilometers apart that had no real contact.
The cost of this wormhole attack is relatively low, and attackers can use higher-than-normal signal strength and/or high-gain antennas to significantly increase the range of each wormhole device. An attacker may therefore establish false connections between a large number of users and inflate the number of people who need to be tested and isolated, causing unnecessary panic. Because GAEN was unavailable, the researchers used DP-3T, which inspired GAEN, as an alternative. All COVID-19 tracing applications built on GAEN are vulnerable to these two attacks. For the centralized proposals, since the server handles every user's pseudonym, each user can be monitored. It is therefore necessary to ensure that the server is trustworthy and will not disclose information.
To promote progress in security and privacy, governments and research institutions can open-source their proposals and draw on the wider community to find a better direction for their evolution.
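The matching step of the profiling attack described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in [13]: the ID derivation is a stand-in based on HMAC-SHA256 rather than the derivation specified by GAEN, the function names and the sighting format are hypothetical, and the count of 144 IDs per key simply assumes one ID per 10-minute validity period over a day.

```python
import hashlib
import hmac
from typing import Iterable, List, Tuple

# Assumption: one anonymous ID per 10-minute validity period over 24 hours.
IDS_PER_KEY = 144

# A sniffer record: (timestamp, location label, captured anonymous ID).
Sighting = Tuple[int, str, bytes]


def derive_anonymous_ids(key: bytes, count: int = IDS_PER_KEY) -> List[bytes]:
    """Stand-in derivation (truncated HMAC-SHA256), NOT the GAEN scheme."""
    return [
        hmac.new(key, index.to_bytes(4, "big"), hashlib.sha256).digest()[:16]
        for index in range(count)
    ]


def reconstruct_route(published_keys: Iterable[bytes],
                      sightings: Iterable[Sighting]) -> List[Tuple[int, str]]:
    """Return the time-ordered locations where a published key's IDs were seen."""
    known_ids = {
        anonymous_id
        for key in published_keys
        for anonymous_id in derive_anonymous_ids(key)
    }
    return sorted(
        (timestamp, location)
        for timestamp, location, anonymous_id in sightings
        if anonymous_id in known_ids
    )


if __name__ == "__main__":
    infected_key = b"k" * 16
    bystander_key = b"x" * 16
    infected_ids = derive_anonymous_ids(infected_key)
    bystander_ids = derive_anonymous_ids(bystander_key)
    captured = [
        (30, "station", infected_ids[3]),
        (10, "cafe", infected_ids[1]),
        (20, "mall", bystander_ids[0]),
    ]
    # Only the infected user's key has been published.
    print(reconstruct_route([infected_key], captured))
```

Before any key is published, the captured IDs cannot be linked to each other; once the key is public, a set lookup is enough to link them, which is why the attack requires so little effort beyond deploying the sniffers.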
C. Interoperability of Applications
GAEN provides a Bluetooth-based proximity tracing framework on the Android and iOS platforms to improve the security and privacy of the Bluetooth functions used in the proposal, but the framework may not be available on other platforms [8]. European countries found that Apple's mobile phones restrict background Bluetooth scanning by third-party applications. Their users' mobile devices would need to keep Bluetooth on and active at all times, which negatively affects battery life and device usability; this made their own proposals impractical and led them to turn to GAEN to build applications. In addition, when performing accurate proximity measurements based on radio signal strength, devices with different technical characteristics need to be considered.
D. Large-scale User Group
To be effective, proximity tracing applications must be downloaded and granted the necessary permissions by a large number of users. When only a small number of people use tracing applications, none of these proposals can play its intended role. To protect user privacy, all five proposals mentioned above let users infected with COVID-19 decide whether to upload data to the server. If only some of them choose to upload data, these proposals cannot trace COVID-19 effectively. The applications are unavailable in areas lacking 3rd or 4th generation mobile communication technologies. The elderly, children, and people from economically disadvantaged families may not have suitable mobile devices and consequently cannot use these applications. For example, smartphone penetration in India and Bangladesh is very low, at 25.3% and 18.5%, respectively [14]. Moreover, in some countries or regions that are highly concerned about personal privacy, the security and privacy risks of the applications also deter people from using them.
The government should increase publicity for such applications while protecting users' safety and privacy, and try to implement this function on other portable devices to lower the threshold for using them [15].
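A rough calculation shows why adoption matters so much. The Python sketch below rests on simplifying assumptions that are ours rather than the cited sources': both people in an encounter must run the application, people adopt it independently of one another, and every smartphone owner installs it, so the smartphone penetration figures act as an upper bound on adoption. The function name and parameters are hypothetical.

```python
def contact_coverage(adoption_rate: float, upload_rate: float = 1.0) -> float:
    """Fraction of encounters that can be traced under the stated assumptions.

    Both parties must run the application (adoption_rate squared), and the
    infected party must agree to upload data (upload_rate).
    """
    if not 0.0 <= adoption_rate <= 1.0 or not 0.0 <= upload_rate <= 1.0:
        raise ValueError("rates must lie between 0 and 1")
    return adoption_rate ** 2 * upload_rate


if __name__ == "__main__":
    # Smartphone penetration figures quoted in the text, used as upper bounds.
    for country, penetration in (("India", 0.253), ("Bangladesh", 0.185)):
        print(country, round(contact_coverage(penetration), 4))
```

Under these assumptions, a penetration of 25.3% allows at most about 6.4% of encounters to be traced, and 18.5% at most about 3.4%, even if every infected user uploads data.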
E. Powerful Infection Detection Capability
In any country, a strong COVID-19 testing capability is the basis for preventing the spread of COVID-19. Health authorities must be able to test accurately and on a large scale whether people are infected with COVID-19, so that these auxiliary tracing proposals can fulfil their function. If users who are informed of a risk of COVID-19 infection cannot get tested in time and obtain accurate results, their enthusiasm for using such applications will decline.
The government should take responsibility for COVID-19 testing of the public and provide convenient and affordable testing options.
V. CONCLUSION
With the global COVID-19 pandemic, how to use technology to assist in tracing and suppressing the spread of COVID-19 has become a focus of research. This article gives an overview of Bluetooth-based COVID-19 proximity tracing proposals. We classified the proposals into two categories, centralized and decentralized, and summarized the differences between them. We then analyzed five proposals in detail and summarized their features and benefits. For a deeper understanding, we summarized eight security and privacy design goals for proximity tracing proposals and analyzed how well the five proposals achieve them. We found that none of them achieves all of the design goals. Moreover, we shed light on the numerous open issues and opportunities that need further research effort, from the perspectives of both technical requirements and community building.
REFERENCES
[1] J. Bay, J. Kek, A. Tan, C. S. Hau, L. Yongquan, J. Tan, and T. A. Quy, “Bluetrace: A privacy-preserving protocol for community-driven contact tracing across borders,” Government Technology Agency-Singapore, Tech. Rep
et al., “Decentralized privacy-preserving proximity tracing,” arXiv preprint arXiv:2005.12273
Threat modeling: Designing for security. John Wiley & Sons, 2014.
[10] C. J. Hoofnagle, B. van der Sloot, and F. Z. Borgesius, “The european union general data protection regulation: What it is and what it means,” Information & Communications Technology Law, vol. 28, no. 1, pp. 65–98, 2019.
[11] “Exposure notifications ble attenuations,” [Online]. Available: https://developers.google.com/android/exposure-notifications/ble-attenuation-overview.
[12] “Covid-19 app - documentation,” [Online]. Available: https://github.com/nhsx/COVID-19-app-Documentation-BETA.
[13] L. Baumgärtner, A. Dmitrienko, B. Freisleben, A. Gruler, J. Höchst, J. Kühlberg, M. Mezini, M. Miettinen, A. Muhamedagic, T. D. Nguyen et al., “Mind the gap: Security & privacy risks of contact tracing apps,” arXiv preprint arXiv:2006.05914, 2020.
[14] M. J. M. Chowdhury, M. S. Ferdous, K. Biswas, N. Chowdhury, and V. Muthukkumarasamy, “Covid-19 contact tracing: Challenges and future directions,” 2020.
[15] R. A. Kleinman and C. Merkel, “Digital contact tracing for covid-19,” CMAJ, 2020.
Meng Shen (M’14) received the B.Eng degree from Shandong University, Jinan, China in 2009, and the Ph.D degree from Tsinghua University, Beijing, China in 2014, both in computer science. He is currently an associate professor with the School of Computer Science, Beijing Institute of Technology, Beijing, China. His research interests include privacy protection for cloud and IoT, blockchain applications, and encrypted traffic classification. He received the Best Paper Runner-Up Award at IEEE IPCCC 2014. He is a member of the IEEE.
Yaqian Wei received the B.Eng degree in computer science from Xidian University, Shaanxi, China in 2020. She is currently a master's student in the School of Computer Science, Beijing Institute of Technology. Her research interests include cyber security.