Privacy-Preserving Contact Tracing: current solutions and open questions
Qiang Tang
Luxembourg Institute of Science and Technology (LIST), 4362, Esch sur Alzette
Abstract.
The COVID-19 pandemic has posed a unique challenge for the world to find solutions, ranging from vaccines to ICT solutions, to slow down the spread of the virus. Due to the highly contagious nature of the virus, social distancing is one fundamental measure which has already been adopted by many countries. At the technical level, this prioritises contact tracing solutions, which can alert the users who have been in close contact with infected persons and meanwhile allow health authorities to take proper actions. In this paper, we examine several existing privacy-aware contact tracing solutions and analyse their (dis)advantages. At the end, we describe several major observations and outline an interdisciplinary research agenda towards more comprehensive and effective privacy-aware contact tracing solutions.
The Coronavirus disease (COVID-19) pandemic, caused by the SARS-CoV-2 virus, has put the world into panic mode. Up to now, the virus has infected more than 2 million people, and more than 120 thousand of them have lost their lives (see https://en.wikipedia.org/wiki/2019-20_coronavirus_pandemic). Among the survivors, many are in critical condition and rely heavily on medical equipment such as ventilators to survive. It is reported that many victims have tragically died due to the shortage of such equipment. While these numbers are still increasing on a daily basis, the world has united in an unprecedented way to find solutions to suppress the pandemic.

According to the World Health Organization (WHO), on 31 December 2019, the WHO China Country Office was informed of a pneumonia of unknown cause, detected in the city of Wuhan in Hubei province, China. Soon, similar cases, later attributed to COVID-19, appeared rapidly in Wuhan and Hubei province. To combat the epidemic, the Chinese government adopted a variety of extreme measures, e.g. completely closing the borders of villages, cities, and provinces; tracking and then quarantining all close contacts of COVID-19 victims; and making the wearing of masks mandatory. According to public information, the epidemic has been successfully controlled in China and only dozens of new cases are now identified daily. In sharp contrast, the numbers are still growing rapidly in the western democracies, where some of these extreme measures cannot be easily enforced due to the potential violation of fundamental rights. Nevertheless, in order to slow down the spread of the virus and reduce the pressure on medical systems, social distancing is widely promoted and enforced, and many countries are investigating the concept of contact tracing.
Without careful consideration, contact tracing can turn into a massive surveillance tool that seriously damages individuals' privacy; see the case analyses for China and for South Korea. What makes things harder is that the perception of privacy heavily depends on the political regime and the underlying culture; see the analysis by Asghar et al. of Singapore's TraceTogether app (http://tiny.cc/fljqmz). It remains an open problem to design privacy-aware contact tracing solutions which also satisfy the requirements in other dimensions.

contact tracing

Up to now, a number of contact tracing solutions have been introduced. For instance, China and South Korea have been tracing COVID-19 victims and their contacts from the very early stage of the epidemic, via smartphones as well as face recognition technologies. In addition, the Korean government even made a lot of the collected data public (https://coronamap.site/). On the one hand, these mandatory tracing solutions greatly facilitate the containment of the virus; on the other hand, they raise serious privacy concerns [5].

Recently, the Singapore government published an app named TraceTogether (http://tiny.cc/onlqmz), which exploits some cryptographic primitives for privacy protection. This app has attracted attention from some other countries, such as Australia, which are currently evaluating its privacy guarantees according to their own privacy regulations, see [2]. At the end of March 2020, Israel passed an emergency law to launch a smartphone app to reveal if a user was, over the previous 14 days,
in close proximity to anyone who has contracted the virus. Besides the efforts at the national level, initiatives have also been started by the general public. For instance, the Pan-European Privacy-Preserving Proximity Tracing (PEPP-PT) project has been kicked off with both public and private partners from several EU countries, MIT is collaborating with WHO to advocate its app named Private Kit: Safe Paths, and Google and Apple are also collaborating on new solutions. There are also more theoretical proposals, e.g. those from [1,3,4,7].

In the meantime, the concept of contact tracing and the proposed solutions have been scrutinized by commenters and researchers. For instance, Anderson provided a very insightful analysis of the practical aspects of contact tracing. Wang gave some interesting remarks on current contact tracing solutions, including those from Google and Apple. Asghar et al. analysed Singapore's TraceTogether app, and Vaudenay [8] provided a detailed analysis of the DP-3T solution by Troncoso et al. [4].

In this paper, we aim at a deeper understanding of the utility and privacy issues associated with emerging contact tracing solutions, particularly those related to respiratory system diseases such as COVID-19. Our contribution lies in three aspects.
1. We analyse the application context in this COVID-19 pandemic and identify a broad set of utility and security requirements.
2. We examine several existing privacy-aware contact tracing solutions and analyse their (dis)advantages. These solutions include the
TraceTogether app from Singapore and three cryptographic solutions by Reichert et al. [7], Altuwaiyan et al. [1] and Troncoso et al. [4]. It is worth noting that many similar solutions exist and more apps are being developed, and we hope that our analysis can be extended to them.
3. We summarize our findings into several major observations and outline an interdisciplinary research agenda towards more comprehensive and effective privacy-aware contact tracing solutions.

Footnotes: http://safepaths.mit.edu/ http://tiny.cc/2z0zmz http://tiny.cc/tx0zmz http://tiny.cc/fljqmz http://tiny.cc/onlqmz

In the public health domain, contact tracing refers to the process of identifying the persons who may have come into contact with an infected victim and subsequently collecting further information about these contacts. By tracing the contacts of infected individuals, testing them for infection, treating the infected and tracing their contacts in turn, public health aims to reduce infections in the population. In practice, contact tracing is widely performed for diseases like sexually transmitted infections (including HIV) and virus infections (e.g. SARS-CoV and SARS-CoV-2/COVID-19). Despite some pioneering attempts at applying advanced ICT technologies, e.g. the FluPhone [9], contact tracing has mainly been implemented manually by medical personnel. Regardless, this approach has been proven effective in combating contagious diseases because it can at least (1) interrupt ongoing transmission and reduce spread, alert contacts to the possibility of infection and offer preventive counseling or prophylactic care and (2) allow medical professionals to learn about the epidemiology in a particular population.

Ferretti et al. [5] investigated the key parameters of epidemic spread for COVID-19 and concluded that viral spread is too fast to be contained by manual contact tracing (see an illustration in Fig. 1) but could be controlled if this process was faster, more efficient and happened at scale.
This necessitates digital contact tracing solutions, such as those based on mobile apps, which can efficiently achieve epidemic control at large scale. Interestingly, this coincides with the empirical analysis of the control measures in China by Tian et al. [6].

In the following discussions, we focus on digital contact tracing solutions, cautiously referring to their manual predecessors with respect to functional requirements. For simplicity, we omit the term “digital” in the rest of the paper.

contact tracing
Overall, the purpose of a contact tracing solution is to prevent an epidemic or a pandemic caused by a contagious virus or something similar. Depending on the standpoint, it may serve (at least) two purposes.
– At a global level, it helps medical personnel to trace the pattern of virus spreading, produce transmission graphs, trace the origin of the virus, and so on. With adequate knowledge of the virus, the authority can take appropriate actions (e.g. disinfecting a facility) and make appropriate plans (e.g. enforcing social distancing) to fight against the virus and prevent similar epidemics or pandemics in the future.
– At an individual level, it helps medical personnel to alert individuals who might have been infected. Alternatively, it may allow an individual to evaluate his/her risk of being infected and take further actions. As a quick remark, most cryptographic solutions focus on individual-level contact tracing and devote their attention to privacy protection for individuals.

Fig. 1. Quarantine Plan (taken from [5])

Footnote: https://en.wikipedia.org/wiki/Contact_tracing

For many diseases, such as COVID-19, an individual might contract the virus through either direct or indirect contact with an infected person. In the case of direct contact, we can imagine that droplets containing the virus fly from the infected person to the individual's mouth, nose, or eyes. In addition, the droplets might also attach to his clothes. In the case of indirect contact, we can imagine that the infected person leaves virus-carrying droplets on a book, a chair, or any other physical object, and later on an individual touches the object and contracts the virus. Taking COVID-19 as an example, the virus can be transmitted through either direct or indirect contact, which makes comprehensive contact tracing a very hard problem.

To design a contact tracing solution, the main anchor is location data.
Intuitively, if two persons are located close to each other at a certain point of time, then we can informally assume that they have been in close contact with each other. In reality, other anchoring technologies might be employed, such as face recognition and other AI-based tracking technologies. However, we do not consider them because such technologies are only deployed in very limited regions of the world.

With the abundance of electronic gadgets, location data can be generated and collected in many ways, e.g. GPS, WiFi, telecom cell towers, Bluetooth beacons. Broadly, we can divide location data into two categories.
– One is absolute location data. In this category, we can think of GPS location, and location with respect to static WiFi access points and telecom cell towers. A location data point can often be written in the form of a geolocation coordinate pair.
– The other is relative location data. In this category, we can think about the pairing of two Bluetooth-enabled smart devices, or the boarding of a means of transport such as a plane, bus, car, or ship. In this case, we can have some reference description of the location, for example both persons are on the same flight on the day XYZ.

Besides the difference in collection and management, when applied in contact tracing solutions, they also have very different precision and security implications.

With respect to precision, absolute location data is often generated by external infrastructures and might not be precise enough to define “close contact” in an epidemic or pandemic. On the other hand, such location data can be collected constantly and will provide a big picture of the mobility patterns of individuals. In contrast, relative location data could be more precise. But in order to collect such data, we need to assume that a large portion of the population will install the same app to support the service.
Another drawback of this type of location data is that it might be ad hoc and will not be able to provide a comprehensive view of individuals' mobility history. Moreover, it is hard to use such data alone to study the transmission pattern of a disease in a population. It seems that, in order to facilitate both the global-level and the individual-level objectives, a contact tracing solution should be based on fusing location data of both categories.

With respect to security, two aspects are of crucial importance. One aspect is data authenticity. In some scenarios, absolute location data could be more authentic because a third party could offer some sort of attestation. For instance, a telecom operator can attest the location of an individual's smartphone. In contrast, it is harder to find attestations of an individual's relative location data. However, there are exceptions. For instance, if a user presented a boarding

Footnote: Note that there are exceptions though.
With respect to the existing contact tracing solutions, there is neither a uniform system architecture nor a defined set of participants. Nonetheless, all the potential participants can be divided into two groups. In one group, there are individual users, who either have been confirmed as infected or are at risk of being infected after close contact with the infected. In the other group, there are third parties, which vary between specific solutions. For instance, one potential player in this group is the health authority and medical personnel, who need to evaluate the situation and help the individual users if necessary. If a contact tracing solution relies on smartphone apps, then the developer of the app may also be involved. In case individual users need to communicate with each other, a server may be required to facilitate the communication. Without loss of generality, the system architecture can be illustrated as in Fig. 2, where (1, 2, 3, 4) indicate the four phases in the workflow of a contact tracing solution, i.e. (initialisation, sensing, reporting, tracing).
– In the initialisation phase, individual users and relevant third parties need to set up the system to enable the operations in the other phases. For example, every individual user might be required to have a smartphone and download an app from a third party. In addition, cryptographic credentials may need to be generated and distributed.
– In the sensing phase, individual users will record their own location trails and also collect location data from their close contacts.
– In the reporting phase, if an individual user is confirmed to be infected then she needs to collaborate with some third parties to make her relevant location data available for further use.
– In the tracing phase, some third parties could collect and aggregate the location data from infected individuals for any possible legitimate purposes.
For instance, a third party can communicate with the close contacts of the infected, or let uninfected individuals evaluate the risk of being infected on their own.

In comparison to other digital solutions (e.g. a general-purpose social app), a contact tracing solution is expected to provide stricter guarantees of utility and security.

Fig. 2. System Architecture

As to utility, the solution should provide fine-grained and accurate measurement of the distance between a contact and the infected victim. The “fine-grained” requirement refers to a precise description of the time of occurrence and the duration of the contact event, while the “accurate” requirement means that the error in the distance measurement should be as small as possible. Taking COVID-19 as an example, it is commonly considered that there is a risk when a contact and the infected victim stay within 2 meters of each other. In this case, if a GPS system has an error of around 5 meters, then it should not be used alone in any solution.

The security guarantee imposes requirements on the protection of both authenticity and privacy.
– The “authenticity” requirement means that the location data reported by a user must be real and should not have been forged or modified by anybody, including the user herself. Lack of authenticity can cause a lot of serious issues, as happened in China and South Korea. For example, an infected victim can blackmail a shop by claiming that she has been a visitor, an infected victim can cause a social panic by claiming that she has visited some heavily populated areas such as shopping malls or train stations, a user can forge location data in order to probe the location data of infected victims and identify them given that some matching service is offered, and fake location data from infected users can mislead medical personnel in their professional activities.
In addition, the “authenticity” requirement can be extended to the binding between the location data and an individual. Quite often, smart devices are used to collect and manage location data, while such devices can be shared by several individuals. Lack of binding can lead to some fraudulent activities.
For example, an individual user can present another user's device and location data to get priority in virus testing, or avoid going to work by triggering some quarantine policy on purpose.
– The “privacy” requirement applies to both infected victims and other users. Except for disclosing the necessary information to the authorities, an infected victim might want to prevent any further disclosure to avoid social embarrassment or discrimination. An individual user may want to check his risk of being infected by matching his location data with that from the infected, but he may not want to disclose his data to third parties or the infected for any other purpose.

Note that the DP-3T consortium recently published a summary of privacy and security threats.

In this section, we provide some high-level analysis of several solutions against our formulation in Section 2. The common issues with these solutions are summarized into the major observations in Section 4.
TraceTogether
The TraceTogether protocol from Singapore, as recapped in [2], has two types of entities, namely users ($1 \le i \le N$) and the Ministry of Health (MoH) of the Singapore government. It is assumed that all users trust the MoH to protect their information. Note that, as shown below, the users are not required to share everything with MoH if they have not been in close contact with any confirmed COVID-19 victim. The protocol is elaborated below.
– In the initialisation phase, a user $i$ downloads the TraceTogether app and installs it on her smartphone. The app sends the phone number $NUM_i$ to MoH, which generates an identifier $ID_i$. MoH stores the $(NUM_i, ID_i)$ pair in its database.
MoH generates a secret key $K$ and selects an encryption algorithm $\mathsf{Enc}$. At the beginning of the app launch, MoH decides some time intervals $[t_0, t_1, \cdots]$, which will end when the pandemic is over. For user $i$, MoH pushes $TID_{i,x} = \mathsf{Enc}(ID_i, t_x; K)$ to user $i$'s app at the beginning of $t_x$, for $x \ge 0$.
– In the sensing phase, user $i$ broadcasts $TID_{i,x}$ during the time interval $[t_x, t_{x+1})$ for all $x \ge 0$. For example, when user $i$ and user $j$ come into Bluetooth communication range during the interval $[t_x, t_{x+1})$, they exchange $TID_{i,x}$ and $TID_{j,x}$. Each of them stores a tuple $(TID_{i,x}, TID_{j,x}, Sigstren)$ locally on her smartphone. The parameter $Sigstren$ indicates the Bluetooth signal strength between their devices.
– In the reporting phase, suppose that user $i$ has tested positive for COVID-19; then she is obliged to share with MoH the locally stored pairs $(TID_{i,x}, TID_{j,x})$ for all relevant $j$ and $x$.
– In the tracing phase, after receiving the pairs from user $i$, MoH decrypts every $TID_{j,x}$ and obtains $ID_j$. Based on $ID_j$, MoH can look up $NUM_j$ and then contact user $j$ for further instructions.

Footnote (DP-3T summary of privacy and security threats): https://github.com/DP-3T/documents

Whether or not to install the TraceTogether app is a voluntary choice for Singapore residents, and it is not clear how many installations have been made up to this moment. So far, we have not found any official information showing how much this app has contributed to Singapore's war against COVID-19. Soon after its launch, Asghar et al. pointed out some privacy concerns. We have the following additional comments.

With TraceTogether, individual users are required to put more trust in the third party, MoH, than in other solutions. Note that this might be a result of the political regime and cultural status of Singapore. To prevent a curious MoH from learning unnecessary mobility information, the mobility data of low-risk individuals are not required to be uploaded to MoH. However, things will change when more and more individual users are infected and diagnosed. By then, MoH will have decrypted data for most of the individuals and learned their relative location data at certain points of time. If a large portion of the Singapore population has deployed the app, then the location data on their smartphones would provide MoH with a very clear mobility view of the whole Singapore population.

Without using secure hardware or other trusted computing technology, a malicious user can potentially manipulate the location data collected by the app, e.g. delete or add entries. Moreover, an attacker can mount relay attacks, e.g.
to relay the Bluetooth signal from Alice's smartphone to Bob's smartphone even when they are far from each other. To this end, some other attacks, demonstrated in Vaudenay's analysis [8] of the DP-3T solution by Troncoso et al. [4], can also apply. For example, an attacker can place a Bluetooth range extender in a relatively populated place, such as a city square, and as a result make any pair of users there appear to be close contacts of each other. Such an attack is easy to mount and will seriously distort the functionality of the underlying solution.

Footnote: http://tiny.cc/fljqmz

In comparison to other solutions, one advantage of TraceTogether is that it offers MoH the ability to draw the transmission graph of COVID-19 in the population which has installed the app. This can easily be done because the temporary identifiers are linked to the phone numbers, which help MoH identify the individual users. This advantage is an outcome of the tradeoff between privacy and utility.
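The initialisation and tracing phases of the TraceTogether description above can be illustrated with a minimal Python sketch. The text only states that MoH "selects an encryption algorithm Enc", so the cipher below (an HMAC-SHA256 keystream XORed with the plaintext, without integrity protection) is a toy stand-in chosen for this illustration, not the algorithm used by the app. The names `MoH`, `make_tid` and `open_tid`, the 16-byte identifier and the 8-byte interval counter are likewise assumptions.

```python
import hashlib
import hmac
import os
import struct

ID_LEN = 16  # assumed length of ID_i in bytes (not specified in the text)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: one HMAC-SHA256 block (32 bytes) covers the 24-byte plaintext.
    return hmac.new(key, nonce, hashlib.sha256).digest()[:length]


def make_tid(key: bytes, user_id: bytes, t_x: int) -> bytes:
    """TID_{i,x} = Enc(ID_i, t_x; K), with a toy stand-in for Enc."""
    plaintext = user_id + struct.pack(">Q", t_x)
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def open_tid(key: bytes, tid: bytes):
    """Decrypt a temporary identifier back into (ID_i, t_x)."""
    nonce, body = tid[:16], tid[16:]
    stream = _keystream(key, nonce, len(body))
    plaintext = bytes(a ^ b for a, b in zip(body, stream))
    return plaintext[:ID_LEN], struct.unpack(">Q", plaintext[ID_LEN:])[0]


class MoH:
    """Hypothetical model of the trusted party in the protocol description."""

    def __init__(self):
        self.key = os.urandom(32)  # secret key K
        self.phone_by_id = {}      # database of (ID_i, NUM_i) pairs

    def register(self, phone: str) -> bytes:
        user_id = os.urandom(ID_LEN)
        self.phone_by_id[user_id] = phone
        return user_id

    def issue_tid(self, user_id: bytes, t_x: int) -> bytes:
        return make_tid(self.key, user_id, t_x)

    def trace(self, reported_pairs):
        # reported_pairs: (TID_{i,x}, TID_{j,x}) tuples uploaded by an infected user i.
        phones = set()
        for _own_tid, peer_tid in reported_pairs:
            peer_id, _t_x = open_tid(self.key, peer_tid)
            phones.add(self.phone_by_id[peer_id])
        return phones
```

In this sketch only the holder of K can map a temporary identifier back to a phone number, which mirrors both the trust placed in MoH and its ability to reconstruct who met whom.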
In the solution proposed by Reichert et al. [7], it is assumed that user $i$ ($1 \le i \le N$) possesses a smart device that can collect and store geolocation data. In addition, a Health Authority (HA) collects the geolocation history of all COVID-19 victims and offers data matching as a service.
– In the initialisation phase, HA prepares the cryptographic key materials for generating garbled circuits for later use.
– In the sensing phase, user $i$ records her geolocation data points on the fly. Let the time intervals be denoted by $[t_0, t_1, \cdots]$. At time $t_x$, user $i$ generates and stores a tuple $(t_x, l_{x,u}, l_{x,v})$, where $l_{x,u}$ and $l_{x,v}$ represent the latitude and longitude of the location, respectively.
– In the reporting phase, if user $i$ has tested positive for COVID-19, then she shares with HA her $(t_x, l_{x,u}, l_{x,v})$ for all relevant $x$.
– In the tracing phase, suppose user $j$ wants to check whether he has been in close contact with any COVID-19 victim. For any COVID-19 victim user $i$, HA constructs a garbled circuit based on her geolocation data points and shares the circuit with user $j$, who can then retrieve some key materials from the HA and privately evaluate the circuit based on his own geolocation data points. Note that user $j$ is assumed to be at high risk if, for some of his data point(s), both the time stamp and the geolocation coordinates are close to those of one of user $i$'s data points.

This is a theoretical cryptographic solution, which covers the matching between an infected user and other individual users while ignoring other aspects of a contact tracing solution. With respect to the cryptographic design, scalability will be a bottleneck. For any user $j$, HA will need to prepare garbled circuits for all the infected victims, and interact with user $j$ in the tracing phase. When the sizes of the infected population and of the requesters become large, the complexity for HA will be formidable.
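To make the matching rule concrete, the following Python sketch evaluates in plaintext the kind of predicate that the garbled circuit would compute privately: user $j$ is flagged if one of his data points is close in both time and space to one of user $i$'s. It involves no cryptography. The haversine formula, the function names and the 300-second time window are assumptions made for this illustration; the 2-meter default follows the distance mentioned in the utility requirements.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_high_risk(points_j, points_i, max_dist_m=2.0, max_dt_s=300):
    """Each point is a tuple (t_x, l_xu, l_xv): unix time, latitude, longitude.

    Both thresholds are illustrative; the text notes that choosing the
    time threshold is an open question.
    """
    return any(
        abs(tj - ti) <= max_dt_s and haversine_m(uj, vj, ui, vi) <= max_dist_m
        for (tj, uj, vj) in points_j
        for (ti, ui, vi) in points_i
    )
```

Even in plaintext the check compares every point of user $j$ with every point of each infected user, so the work grows with the product of the two trail lengths; a garbled-circuit version has to encode all of these comparisons.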
In addition, malicious users might leverage this to mount (D)DoS attacks, unless proper countermeasures are deployed. Another issue is the lack of details on the computation of infection risks, which are based on the proximity of both the time stamps and the geolocation coordinates. It is unclear how to set a threshold on the time stamps, considering that there is a variety of mobility patterns among the concerned users.

In the solution proposed by Altuwaiyan et al. [1], it is assumed that user $i$ ($1 \le i \le N$) possesses a smart device that can exchange information (e.g. Bluetooth messages) with similar devices nearby. In addition, a server will collect the data of all infected victims and offer data matching as a service.
– In the initialisation phase, there is no special setup for the server and users.
– In the sensing phase, user $i$ exchanges information with similar devices nearby. Let the time intervals be denoted by $[t_0, t_1, \cdots]$. At time $t_x$, user $i$ generates and stores a tuple of data points $(t_x, (m_{i,1}, r_{i,1}, p_{i,1}), \cdots, (m_{i,n_{i,x}}, r_{i,n_{i,x}}, p_{i,n_{i,x}}))$, where $(m_{i,1}, r_{i,1}, p_{i,1})$ records information about the first encountered device: $m_{i,1}$ is the hashed identifier of this device, $r_{i,1}$ is the detected signal strength and $p_{i,1}$ is the type of device, and so on.
– In the reporting phase, if user $i$ has tested positive, then she shares with the server her data $(t_x, (m_{i,1}, r_{i,1}, p_{i,1}), \cdots, (m_{i,n_{i,x}}, r_{i,n_{i,x}}, p_{i,n_{i,x}}))$ for all relevant $x$.
– In the tracing phase, suppose user $j$ wants to check whether he has been in close contact with some infected victim. He generates a public/private key pair $(pk, sk)$ for a homomorphic encryption scheme. Then, user $j$ sends the timestamps in his data to the server, which will find all the infected users who have some overlapping timestamps. For each such infected user $i$ and overlapping timestamp $t_x$, the server and user $j$ perform the following protocol.
1. User $j$ sends $(\mathsf{Enc}(m_{j,1}, pk), \cdots, \mathsf{Enc}(m_{j,n_{j,x}}, pk))$ to the server.
2. The server computes the following matrix and sends it to user $j$.
$$
\begin{pmatrix}
\mathsf{Rand}(\mathsf{Enc}(m_{j,1} - m_{i,1}, pk)) & \cdots & \mathsf{Rand}(\mathsf{Enc}(m_{j,n_{j,x}} - m_{i,1}, pk)) \\
\vdots & \ddots & \vdots \\
\mathsf{Rand}(\mathsf{Enc}(m_{j,1} - m_{i,n_{i,x}}, pk)) & \cdots & \mathsf{Rand}(\mathsf{Enc}(m_{j,n_{j,x}} - m_{i,n_{i,x}}, pk))
\end{pmatrix}
$$
In the matrix, $\mathsf{Rand}$ is a ciphertext randomization function. If the ciphertext encrypts 0 then the randomized ciphertext still encrypts 0, otherwise the randomized ciphertext will encrypt a random number.
3. User $j$ decrypts every ciphertext in the received matrix, and obtains the $m_{j,y}$, where $1 \le y \le n_{j,x}$, which overlap with those from user $i$. Then, user $j$ sends all the matched $m_{j,y}$ together with $r_{j,y}$ to the server.
4. Based on the data from user $j$, the server computes the distance between user $i$ and user $j$, and then acts accordingly.

We note that, in addition to the aforementioned privacy-aware matching protocol, the other contribution of Altuwaiyan et al. [1] is a new method to measure the distance between two smart devices. In comparison to other solutions where the distance is measured based on the perceived signal strength of the peer device, the method leverages the signal strengths of more devices so that it is more accurate in practice.

Regarding privacy, the infected victims reveal the location data to the server, where the location data contain hashed network identifiers such as those of WiFi access points. Given that network identifiers are often static, this allows the server to recover the absolute location data points of the infected users. If a match has been found in the tracing phase, then user $j$'s absolute locations at some time stamps are also revealed to the server. In addition, in order to improve computational efficiency, time stamps are always disclosed to the server. The revelation of time stamps and absolute locations implies serious privacy leakage, and should be avoided.

In the solution proposed by Troncoso et al. [4], it is assumed that user $i$ ($1 \le i \le N$) possesses a smart device that can collect and store data. In addition, there is a backend server, and the Health Authority (HA). Note that the backend server acts as a communication platform to facilitate the matching activities among the users. Let $H$, $\mathsf{PRG}$ and $\mathsf{PRF}$ denote a cryptographic hash function, a pseudorandom number generator and a pseudorandom function, respectively.
– In the initialisation phase, user $i$ generates a random initial daily key $SK_{i,0}$, and computes the subsequent daily keys based on a chain of hashes: i.e. the key for day 1 is $SK_{i,1} = H(SK_{i,0})$ and the key for day $x$ is $SK_{i,x} = H(SK_{i,x-1})$. Suppose $n$ ephemeral identifiers are required in one day; then the identifiers for user $i$ on day $x$ are generated as follows:
$$EphID_{i,x,1} \,\|\, \cdots \,\|\, EphID_{i,x,n} = \mathsf{PRG}(\mathsf{PRF}(SK_{i,x}, \text{“broadcast key”}))$$
– In the sensing phase, on day $x$, user $i$ broadcasts the ephemeral identifiers $\{EphID_{i,x,1}, \cdots, EphID_{i,x,n}\}$ in a random order. At the same time, her smart device stores the received ephemeral identifiers together with the corresponding proximity (based on signal strength), duration, and other auxiliary data, and a coarse time indication (e.g., “The morning of April 2”).
– In the reporting phase, if user $i$ has tested positive for COVID-19, then HA will instruct her to send $SK_{i,x}$ to the backend server, where $x$ is the first day that user $i$ becomes infectious.
After sending $SK_{i,x}$ to the backend server, user $i$ chooses a new daily key $SK_{i,y}$ depending on the day when this event occurs. While not mentioned in [4], we believe this new key should also be sent to the backend server, as user $i$ might continue to be infectious.
– In the tracing phase, periodically, the backend server broadcasts $SK_{i,x}$ after user $i$ has been confirmed as infected. On receiving $SK_{i,x}$, user $j$ can recompute the ephemeral identifiers for day $x$ as $\mathsf{PRG}(\mathsf{PRF}(SK_{i,x}, \text{“broadcast key”}))$. Similarly, user $j$ can compute the identifiers for day $x+1$ and so on. With the ephemeral identifiers, user $j$ can check whether any of the computed identifiers appears in her local storage.
Based on the associated information, namely “proximity, duration, and other auxiliary data, and a coarse time indication”, user $j$ can act accordingly.

Vaudenay [8] provided a detailed security and privacy analysis of this solution. He argues that decentralisation introduces new attack vectors against privacy, in contrast to the common belief that decentralisation helps solve the privacy concerns in centralised systems. One of the conclusions from [8] is that trusted computing technology seems unavoidable in order to prevent all the identified attacks.

As noted by Vaudenay [8], a lot of practical details are missing from the whitepaper [4]. For instance, it is unclear how to instantiate the backend server. With regard to its pan-European ambition, it is not clear how HAs and backend servers from different countries can efficiently coordinate with each other to respond promptly. It also remains open how users' privacy will be affected if the credentials of some HAs and backend servers are compromised. We want to point out that the hash-chain-based method of generating daily keys (see the initialisation phase) brings linkability risks when an individual's credential is leaked to an attacker, e.g. via malware. For example, if the attacker learns $SK_{i,x}$ then it can compute the identifiers for day $x$, day $x+1$ and so on. Then, the attacker can link user $i$ to her broadcast identifiers, either collected by the attacker itself or bought from other attackers. This could lead to disclosure of absolute location data points of user $i$.

By design, the DP-3T protocol favors the privacy of user $j$ over that of the infected user $i$. Referring to the tracing phase description, if an identifier $EphID^*$ is matched, then user $j$ learns the exact time stamp when the perceived risk occurs. This can very likely enable user $j$ to link $EphID^*$ to the real person whose device has broadcast $EphID^*$, due to the fact that user $j$ can be close to very few users at a specific time stamp.
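The hash-chain key schedule and the matching step in the description of the solution by Troncoso et al. can be sketched in Python as follows. The concrete instantiations are assumptions made for this illustration: SHA-256 for $H$, HMAC-SHA256 for $\mathsf{PRF}$, SHAKE-256 as a stand-in for $\mathsf{PRG}$, and 16-byte ephemeral identifiers. The function names are hypothetical as well.

```python
import hashlib
import hmac

EPHID_LEN = 16  # assumed identifier length in bytes


def next_day_key(sk: bytes) -> bytes:
    """SK_{i,x} = H(SK_{i,x-1}), with H assumed to be SHA-256."""
    return hashlib.sha256(sk).digest()


def ephids_for_day(sk: bytes, n: int):
    """EphID_{i,x,1} || ... || EphID_{i,x,n} = PRG(PRF(SK_{i,x}, "broadcast key"))."""
    seed = hmac.new(sk, b"broadcast key", hashlib.sha256).digest()  # PRF (assumed)
    stream = hashlib.shake_256(seed).digest(n * EPHID_LEN)          # PRG stand-in
    return [stream[k * EPHID_LEN:(k + 1) * EPHID_LEN] for k in range(n)]


def ephids_from(sk_x: bytes, days: int, n: int):
    """Identifiers for day x, x+1, ..., recomputed from a published SK_{i,x}."""
    result, sk = [], sk_x
    for _ in range(days):
        result.append(ephids_for_day(sk, n))
        sk = next_day_key(sk)
    return result


def find_matches(sk_x: bytes, days: int, n: int, observed: set):
    """Tracing-phase check run by user j against her locally stored identifiers.

    Returns (day_offset, EphID) pairs, where day_offset counts from day x.
    """
    return [
        (offset, ephid)
        for offset, day_ids in enumerate(ephids_from(sk_x, days, n))
        for ephid in day_ids
        if ephid in observed
    ]
```

The same `ephids_from` computation is available to anyone who obtains $SK_{i,x}$, whether through the broadcast or through a leak, which is the source of the linkability risk noted in the text: all identifiers from day $x$ onwards follow deterministically from that one key.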
Moreover, the exposure of time stamps also raises the concern of a targeted identification attack. If an attacker wants to find out whether or not some users have been infected, then he can simply get close enough to them and make their devices exchange identifiers (of course he should record the time stamps), and finally he can make a decision if identifiers have been matched at these time stamps. In contrast, TraceTogether avoids this problem by letting MoH calculate the infection risk for encountered users. Overall, the DP-3T protocol leaks too much unnecessary information about the infected users to the public and raises serious privacy concerns unless further countermeasures are deployed. One possible solution is to deploy a two-party secure computation protocol between the backend server and user $j$, for the latter to only learn a risk score. How to scale this up is a very practical problem to be addressed.

After examining many privacy-aware contact tracing solutions, we feel that the situation is a bit chaotic at this moment. There is a rush in academia to propose new technical solutions and in industry to launch new apps, and new initiatives pop up every day. These efforts will contribute to the fight against the pandemic, at least by raising awareness and helping start discussions. Unfortunately, many solutions are based on unrealistic assumptions that hamper their usefulness and massive adoption. In this section, we first discuss our major observations, partly based on the analysis in Section 3, and then put forward a roadmap which foresees an interdisciplinary research agenda for the future.

The first major observation is that many solutions only focused on the direct contact scenario and paid little attention to the precision of location data and the accuracy of distance measurement (e.g. those based on Bluetooth signal strength). In addition, a number of exceptional situations have not been taken into account.
For instance, the neighbours of an infected and quarantined victim can be in very close range and be classified as high risk, while the truth is that they are in different apartments or houses so that the risk of being infected is low. Inaccurate distance estimation and false risk alerts will cause unnecessary panic in the population and also waste a lot of resources in the healthcare system on handling false suspects. Moreover, the scenario of indirect contact has been ignored most of the time. Since COVID-19 can be transmitted via indirect contact, ignoring this scenario makes the existing contact tracing solutions less practical and effective than expected.

The second major observation is that the scope of contact tracing in existing solutions is very narrow. It is mostly about an individual evaluating his risk of being infected based on his contact history with the infected victims. However, as we have described in the beginning of Section 2 and in Section 2.1, contact tracing is supposed to enable the health authority and medical personnel not only to study the epidemic or pandemic at the global level but also to help at the individual level. This raises the open question of how the health authority and medical personnel can collect the necessary information and perform their normal duties. Inevitably, new privacy-aware solutions need to be designed and implemented to fulfill the objectives of comprehensive contact tracing while minimizing information disclosure. One root problem, facing both existing and new solutions, is that a consensus on the trust relationship between the different players in the scope of contact tracing is missing. When an app is deployed, more practical questions can occur.
For example, how should the results from the app of Google and Apple be interpreted, how is the liability distributed, and can an individual interact with the health authority and medical personnel on the basis of these results?

The third major observation is that most solutions only emphasized the privacy concerns in contact tracing while paying no attention to the authenticity aspect (elaborated in Section 2.2). This implies that an individual user can potentially forge location data for herself and for others as well. Consequently, the results of contact tracing could be manipulated to a large extent, enabling dishonest users and attackers to disrupt the service and commit various fraudulent activities. Of particular importance is that the lack of authenticity can also lead to a breach of privacy, as demonstrated by Vaudenay [8]. How to balance authenticity and privacy seems to be the biggest challenge in designing a privacy-aware and effective contact tracing solution.

The fourth major observation is that most theoretical solutions have not provided all the technical details needed to facilitate an implementation. In many solutions, an untrusted backend server is required. It is unclear how this server can be chosen in practice and how to incentivize it to behave as specified. Furthermore, a health authority is often involved and required to perform some cryptographic operations. This seems to be an unrealistic requirement for a governmental authority in many countries. At least, it will not be easily done in a short period of time.

To fight against a pandemic like COVID-19, every technology and solution could matter. Nevertheless, we believe it is also important to set an interdisciplinary agenda to comprehensively formulate the problem, identify the requirements, and search for the opportunities. We foresee the following key research elements on the roadmap.
– Bridge the gap between health authority and solution designers.
This essentially requires the solution designers to figure out what an effective contact tracing solution needs to generate for individual users, the health authority and medical personnel. Accordingly, different players’ roles should be clearly defined. Decisions should be made on the basis of regulations (at least) related to healthcare and privacy.
– Evaluate and model the privacy and other security risks. This will need to clarify the trust relationships among the players and result in a set of security requirements, which should be satisfied by a solution. It should also reflect the accountability and liability configuration among the players. Various tradeoffs could be inevitable among privacy, utility, efficiency and other aspects.
– Build incentive mechanisms into solutions from the beginning. Instead of simply providing a dichotomous choice through “opt-in” and “opt-out”, it is important to incentivize the participation of individual users and other players, e.g. by employing technologies such as DLT or Blockchain. It is also important to deploy mechanisms to deter dishonest and malicious behaviour and encourage honest behaviour for the good of society. In particular, it should prevent the solutions from being used in any manner as a surveillance tool for either political or economic purposes.
Acknowledgement
This work is partially funded by the European Union’s Horizon 2020 SPARTA project, under grant agreement No 830892.
References
1. T. Altuwaiyan, M. Hadian, and X. Liang. EPIC: efficient privacy-preserving contact tracing for infection detection. In , pages 1–6. IEEE, 2018.
2. H. Asghar, F. Farokhi, D. Kaafar, and B. Rubinstein. On the privacy of tracetogether, the singaporean covid-19 contact tracing mobile app, and recommendations for australia. Accessed on 10th April 2020. http://tiny.cc/pb3lmz , 2020.
3. S. Brack, L. Reichert, and B. Scheuermann. Decentralized Contact Tracing Using a DHT and Blind Signatures. Accessed on 10th April 2020. https://eprint.iacr.org/2020/398 , 2020.
4. C. Troncoso et al. Decentralized privacy-preserving proximity tracing. Version: 3rd April 2020. https://github.com/DP-3T/documents , 2020.
5. L. Ferretti et al. Quantifying dynamics of sars-cov-2 transmission suggests that epidemic control with digital contact tracing. Science, 2020.
6. Tian et al. An investigation of transmission control measures during the first 50 days of the covid-19 epidemic in china. Science, 2020.
7. L. Reichert, S. Brack, and B. Scheuermann. Privacy-preserving contact tracing of covid-19 patients. Accessed on 10th April 2020. https://eprint.iacr.org/2020/375 , 2020.
8. S. Vaudenay. Analysis of DP3T. Accessed on 10th April 2020. https://eprint.iacr.org/2020/399 , 2020.
9. E. Yoneki. Fluphone study: virtual disease spread using haggle. In F. Legendre and A. Helmy, editors, Proceedings of the 6th ACM workshop on Challenged networks, CHANTS@MOBICOM 2011, pages 65–66. ACM, 2011.