Limits of individual consent and models of distributed consent in online social networks
Juniper Lovato, Antoine Allard, Randall Harp, Laurent Hébert-Dufresne
Vermont Complex Systems Center, University of Vermont, Burlington, VT; Département de physique, de génie physique et d'optique, Université Laval, Québec (Québec), Canada G1V 0A6; Centre interdisciplinaire de modélisation mathématique, Université Laval, Québec (Québec), Canada G1V 0A6; Department of Philosophy, University of Vermont, Burlington, VT; Department of Computer Science, University of Vermont, Burlington, VT
ABSTRACT
Personal data is not discrete in socially-networked digital environments. A single user who consents to allow access to their own profile can thereby expose the personal data of their network connections to non-consented access. The traditional (informed individual) consent model is therefore not appropriate in online social networks, where informed consent may not be possible for all users affected by data processing and where information is shared and distributed across many nodes. Here, we introduce a model of “distributed consent” where individuals and groups can coordinate by giving consent conditional on that of their network connections. We model the impact of distributed consent on the observability of social networks and find that relatively low adoption of even the simplest formulation of distributed consent would allow macroscopic subsets of online networks to preserve their connectivity and privacy. Distributed consent is of course not a silver bullet, since it does not follow data as it flows in and out of the system, but it is one of the most straightforward non-traditional models to implement and it better accommodates the fuzzy, distributed nature of online data.
Introduction
One key focus of the blooming field of data ethics concerns how big data and networked systems challenge classic notions of privacy, bias, transparency and consent. In particular, the traditional privacy model (TPM), which relies on individual self-determination and individual consent, we argue, is no longer appropriate for the digital age. First, TPM requires that consent be informed, which may not be possible in the context of large data sets and complicated technologies. Second, TPM presumes individual control over personal information, but the flow of information in networked systems precludes anyone from having such control over any piece of data. While the modern information environment shows both conditions to be problematic, and while we briefly discuss the information condition, we focus most of our attention here on the individuality condition.

Individual consent has many limitations. Notably, we live in a highly networked and technologically advanced society, where digital decisions and actions are interconnected and affect not just ourselves but our digital community as a whole. Individual consent, in a digital age, is flawed and ineffectual when protected class data and social profiles can be easily inferred via our social networks. The individual consent model works most effectively in a physical space with linear contracts between two discrete parties and no externalities. This however does not translate well to a digital realm where personal data boundaries are fuzzy and interwoven. The current overuse of individual consent online has also led to a negative externality of weaker consent due to consent desensitization, in part because users are now faced with a deluge of consent requests. Thus, a new approach to data privacy and consent in this context is needed.

The new model of data privacy will need to take into account several factors: the networked virtual space that we occupy; the integration of group consent; and a mechanism for distributed moral responsibility when data privacy is breached or data is processed, combined, or manipulated in unethical manners. In this paper, we focus on distributed consent in particular and evaluate, in a mathematical model, its potential to increase the general privacy of online social networks. We aim to cover the latter data privacy concerns in future work.

Failures of individual consent in the online world
Consent is an expressed action that facilitates an agreed-upon initiative of another party. Consent, in this context, should not be mistaken for a state of mind or an attitudinal event. It is an autonomous act that must meet certain criteria in order to be considered a valid action. The legitimacy of consent hinges on a number of criteria:

1. the subject has sufficient accurate information and understands the nature of the agreement,
2. the agreement is entered into without coercion,
3. the agreement is entered into knowingly and intentionally,
4. the agreement authorizes a specific course of action.

It should be noted that consent is not an end in itself; rather, it is a mechanism for preserving autonomy, self-determination, and the ability to make decisions about one's personal and political development. There are, however, boundaries to this autonomy; the value of acting autonomously does not trump other individual or collective rights or harms. As the saying goes, “your right to swing your arm leaves off where my right not to have my nose struck begins”.

Importantly, the four criteria listed above fail in the context of online data and classic Terms of Service (ToS) agreements.

First, most users entering into consent agreements know very little about data processing or the risks associated with handing over their data. The dense legal and technical nature of ToS agreements tasks non-experts with consenting to something they do not understand. This dynamic takes advantage of an asymmetry in technical and legal knowledge.

Second, it is difficult to opt out of these services since online platforms are an important social ecology where people form personhood, maintain personal relationships, and build valuable networked counter-publics. Yet, there is little to no power on the part of the individual to negotiate the ToS with these companies, as consent in these ToS is typically presented on a take-it-or-leave-it basis and offers no conditions of choice. Online privacy then turns into an unfortunate social optimization problem, where the user must choose between the pressures of disclosing too much personal information (being digitally crowded) and being socially isolated.

Third, the volume of consent requests a user is faced with has led to a troublesome externality where the user is fatigued and in turn habitually agrees to everything due to consent desensitization. This delegitimizes the premise that each act of putative consent actually reflects the individual user's autonomous judgment.

Fourth, the language in ToS is typically so broad and open-ended that data processors have the flexibility to manipulate the data in many ways. The scope of consent cannot be so broad as to allow actions that the user could not have considered or would otherwise not have consented to. A properly limited scope of consent also implies that there should be some mechanism for a user to check whether their data is indeed following the agreed-upon course of action. However, data processors often make it very difficult if not impossible to track personal data, to know what they have collected or how it is being processed, and to hold them accountable for misuse.

Perhaps more importantly, a major concern with the individual consent model is that personal data, in this context, is distributed information that contains information about more than a single individual.
A fundamental assumption for individual consent is that the user has power over their personal data, and that they are able to trade their personal privacy in exchange for using an online service. In reality, these data may not be wholly the individual's, and therefore it is not appropriate for the individual to act alone in controlling the course of action or the flow of these data.

From individual to distributed consent
The densely interconnected nature of online social ecology creates a significant problem for the model of individual consent. When a user shares personal information online, they are also leaking personal information about others in their social network (digital or otherwise). In fact, platforms can create digital dossiers about users who do not even share their data online, through shadow profiles built from inferred data and from data collected directly from their social contacts. When a user attempts to sign on to a new online service, they may be prompted to skip the hassle of entering their personal information manually and instead use an existing account as a secured access delegation in order to gain quicker access to the new third-party online service. In turn, the online service can ask to gain access to the user's contacts and other personal data. Through these leaky data, in combination with the user profile, services are granted access to a wealth of knowledge about people who never agreed to share their information with them. According to Bagrow et al., “due to the social flow of information, we estimate that approximately 95% of the potential predictive accuracy attainable for an individual is available within the social ties of that individual only, without requiring the individual's data”.

The shadow profile and leaky data issue calls into question the boundary of personal data online. If sensitive data is not controllable by individual self-determination alone but also rests in the hands of social ties, then the model of individual consent may be invalid in this context. The physical metaphor of privacy in face-to-face interactions does not work here, and the idea of a discrete personal online identity is challenged. Projecting the idea of the discrete self onto the online world leads users to leak others' data without forethought of what this means to their digital neighbors.

Figure 1. (left) Cartoon of information flow across a network with our basic implementation of distributed consent. Blue nodes have the lowest security settings and are susceptible to surveillance from third-party applications or websites. Purple nodes have stricter security settings but share their posts, and therefore data, with all their neighbors. Orange nodes follow a distributed consent model and only share their data with purple nodes or other orange nodes. (right) The same network where a handful of low-security accounts are directly observed by a third party, shown in red with a shaded aura. All nodes sharing their data with directly observed accounts are de facto observed as well, and are also shown in red. Nodes at a distance greater than L from directly observed accounts remain unobserved.

A model of distributed consent and network observability
To account for the distributed nature of personal data (i.e., the distributed online self), we consider a simple model of distributed consent. Imagine a social network platform where individuals have the following privacy options:

0. Individuals share their data with all their connections and are vulnerable to third-party surveillance (similar to Facebook accounts with access for “Apps, Websites and Games” turned on).
1. Individuals share their data with all their connections but are not directly vulnerable to third-party surveillance.
2. Individuals only share their data with their connections whose privacy level is set at least to 1.
N. Individuals only share their data with their connections whose privacy level is set at least to N − 1.

We then imagine a third party (e.g., a malicious application) that gains direct access to the accounts of a fraction ϕ of individuals with privacy level set to 0 who get infected by the malware. They can then leverage these accounts to access the data of neighboring individuals with privacy level set to 1 or 0, therefore using the network structure to indirectly observe more nodes. They can further leverage all of these data to infer information about other individuals further away in the network, for example through statistical methods, facial recognition, other datasets, etc.

We model this process through the concept of depth-L percolation: monitoring an individual allows the third party to monitor their neighbors up to L hops away. Depth-0 percolation is a well-studied process known as site percolation. The third party would then be observing the network without the help of any inference method and by ignoring its network structure. With depth-1 percolation, they would observe nodes either directly or indirectly by observing neighbors of directly observed nodes (e.g., by simply observing their data feed or timeline). Depth-2 percolation would allow one to observe not only directly monitored nodes, but also their neighbors and neighbors' neighbors (e.g., through statistical inference). And so on, with deeper observation requiring increasingly advanced methods.

We now study the interplay of distributed consent with network observability. We simulate our model on subsets of Facebook friendship data to capture the density and heterogeneity of real online network platforms.
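To make the observation process concrete, the following is a minimal sketch of how depth-L percolation with these privacy levels could be simulated. It is not the code used for the simulations in this paper; the function names (shares_with, observed_nodes) and the use of networkx are illustrative assumptions, but the sharing rule and the L-hop expansion follow the definitions above.

```python
import random
from collections import deque

import networkx as nx


def shares_with(level_v, level_u):
    """True if a node at privacy level `level_v` shares its data with
    a neighbour at level `level_u`. Levels 0 and 1 share with every
    connection; a node at level n only shares with connections whose
    level is at least n - 1 (the rule defined in the text)."""
    return level_u >= level_v - 1


def observed_nodes(G, level, phi, L, rng):
    """Set of nodes observed by the third party.

    A fraction `phi` of level-0 nodes is compromised directly; from
    these seeds, a multi-source breadth-first search expands up to L
    hops, but only along edges through which data actually flows."""
    seeds = {v for v in G if level[v] == 0 and rng.random() < phi}
    depth = {v: 0 for v in seeds}  # hop distance at which each node is seen
    queue = deque(seeds)
    while queue:
        v = queue.popleft()
        if depth[v] == L:
            continue  # inference does not reach past L hops
        for u in G[v]:
            # u is exposed through v only if u shares its data with v
            if u not in depth and shares_with(level[u], level[v]):
                depth[u] = depth[v] + 1
                queue.append(u)
    return set(depth)
```

Note that with L = 0 the search stops at the seeds, recovering ordinary site percolation, and that a level-2 node adjacent only to level-0 nodes is never exposed, since it shares data with neither of them.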
We then ask to what extent distributed consent can preserve individual privacy even when a large fraction of nodes can be directly observed and third parties can infer the data of unobserved neighbors. How widely should distributed consent be adopted to guarantee the connectivity and privacy of secure accounts? The results of our simulations are shown in Fig. 2.

Figure 2. We use the anonymized Facebook100 dataset. We assume that one third of the population has a taste for privacy while the remaining two thirds use the default setting with the lowest security, option 0. The privacy-conscious third is split between security options 1 and 2 (i.e., classic or distributed privacy) according to the adoption rate of distributed consent. We vary the adoption rate and measure (a) the relative size of the largest unobserved connected component, along with the fraction of prevented data flow, (b) the fraction of observed individuals with security option 1, and (c) the total fraction of observed accounts.

We find that a low adoption level of distributed consent (roughly 1 in 5 users) can lead to a phase transition in unobservable nodes; see Fig. 2(a). At a low adoption rate of distributed consent, there are few unobserved nodes, and they are mostly disconnected from each other. At higher levels of adoption, the system transitions to an unobservable and connected phase where privacy can co-exist with connectedness and information flow. With large-scale adoption of distributed consent (say one third of users), we find that close to half of all accounts are protected while their privacy settings only prevent about 22% of the data flow around them.

To understand this result, notice that any user with a privacy setting greater than the percolation depth will be unobservable. Indeed, users with security setting N only share their data with users whose settings are at least N − 1, who in turn only share their data with users whose settings are at least N − 2, and so on. We thus know that users using setting N will be at least N steps away from users using the lowest setting, which are the only directly observable nodes. Users with security level set to 1 ≤ N ≤ L can however be observed indirectly through their relationships. At low levels of adoption of distributed consent, a large amount of luck is required to remain unobservable (e.g., having zero connections with low-security users). At higher levels of adoption, users of distributed consent connect to, and therefore protect, one another. These connections are however localized and do not spread throughout the entire system. We find that when roughly 25% of nodes adopt distributed consent, a large macroscopic component of connected unobservable nodes emerges. This component reflects a parallel, protected community that is unobservable but still connected to the rest of the social network.

The macroscopic but unobservable component that emerges with increased adoption of distributed consent does not only contain adopters of distributed consent. Early adopters of distributed consent provide some low amount of herd-like immunity to the population, protecting otherwise vulnerable users; see Fig. 2(b). Users with lower privacy settings can thus also benefit, since adoption of distributed consent in one's neighborhood reduces the probability that one of their neighbors is directly or indirectly observed, thereby reducing the probability that they are themselves observed. However, as long as a majority of users rely on default lax security settings, this effect will be limited, as a single compromised neighbor is sufficient to observe a node.
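The experiment behind Fig. 2 can be sketched along the same lines. The snippet below reuses observed_nodes() from the sketch above and substitutes a synthetic heavy-tailed graph for the Facebook100 snapshots; the two-thirds default rate and the adoption split mirror the setup described in the Figure 2 caption, while the specific values of phi and L are illustrative stand-ins, not the paper's exact settings.

```python
import random

import networkx as nx

rng = random.Random(1)
# Synthetic stand-in for a Facebook100 snapshot: dense and heavy-tailed.
G = nx.barabasi_albert_graph(10_000, 10, seed=1)


def assign_levels(G, adoption, rng):
    """Two thirds of nodes keep the default level 0; the remaining,
    privacy-conscious third picks level 2 (distributed consent) with
    probability `adoption` and level 1 (classic privacy) otherwise.
    The population-level adoption rate is therefore adoption / 3."""
    return {v: 0 if rng.random() < 2 / 3
            else (2 if rng.random() < adoption else 1)
            for v in G}


for adoption in (0.0, 0.25, 0.5, 0.75, 1.0):
    level = assign_levels(G, adoption, rng)
    obs = observed_nodes(G, level, phi=0.1, L=2, rng=rng)
    hidden = G.subgraph(set(G) - obs)
    largest = max((len(c) for c in nx.connected_components(hidden)), default=0)
    print(f"adoption={adoption:.2f}  "
          f"observed fraction={len(obs) / len(G):.2f}  "
          f"largest unobserved component={largest / len(G):.2f}")
```

Sweeping the adoption rate finely, rather than over five values, is how one would locate the phase transition in the largest unobserved component reported in Fig. 2(a).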
Despite the fact that a phase transition in connected unobservable nodes occurs at a fairly low level of distributed consent adoption, and that these nodes provide secondary protection to other users, pervasive adoption of group consent is required to fully protect a network; see Fig. 2(c). Again, all it takes for a vulnerable node to be indirectly observed is a single observable neighbor. Because of this, and because of the density of most online network platforms, it is extremely hard to completely protect vulnerable nodes even if distributed consent provides some secondary protection to all nodes. We thus see coexistence of both observed and unobserved connected components at medium adoption levels of distributed consent. Interestingly, these components are interconnected, with data flowing both ways across observable and unobservable components, yet the users in the latter remain fully protected from statistical inference of their data.

Discussion

We listed four criteria for the legitimacy of consent in the introduction, and we argued that none are met by individual consent within the complex ecology of online media. One key problem is that if personal data is distributed across individuals, so should be their consent.

Our results based on computational simulations suggest that even the simplest implementation of distributed consent could allow users to protect themselves and the flow of their data in the network. They do so by consenting to share their data conditionally on the consent or security settings of their contacts, thereby not sharing their data with users who might in turn make it available to third parties. This simple condition allows users to authorize a specific course of action for their own personal data (criterion 4).

While this protection disconnects them from some other users, only a relatively low level of adoption of distributed consent is required to create a connected, macroscopic sub-system within existing online network platforms. This sub-system consists of different individuals, including some that are granted secondary protection despite their low security settings, and remains connected to the rest of the system such that information still flows throughout the entire population of users. Via this protected sub-system, distributed consent removes the de facto coercion (criterion 2) involved in forcing individuals to choose between relinquishing control of their data or simply not participating in a platform.

Beyond the actual protection mechanism, this new model of consent may also have interesting behavioral impacts on users. Exposing users to this type of coordinated privacy setting might prompt them to reflect on the distributed nature of their personal data and its flow through online media. This realization may prompt a user to more openly voice their social boundaries to their social network, or to restrict sending sensitive information to social neighbors who do not share their taste for privacy. Imagine a user publishing a post to their social network, before enacting the new privacy settings, urging those who want to remain connected to change their settings as well. Beyond the utility of limiting observability of the social network, this measure could also serve as an important educational tool on the interconnectedness of personal data (criterion 1).
Further work is required to observe and quantify the behavioral consequences of new privacy options.

Altogether, it is our recommendation that simple implementations of distributed consent should be considered. Even in its simplest form, distributed consent would allow concerned users to protect themselves without fully leaving a platform, and would also let platforms maintain a large critical mass of observable users who chose to remain vulnerable and who are not granted sufficient protection through their contacts.

That being said, criterion 1 (understanding the consent agreement) and criterion 3 (consent fatigue) remain an issue. In fact, useful implementations of distributed consent might require additional education regarding data privacy. Moreover, there are many other types of privacy violations that are not solved by distributed consent alone. The data are still leaky; individual users can still aggregate information about their neighbors that they did not directly consent to. And finally, while the distributed consent model goes beyond the strict individuality of the traditional privacy model, it does so modestly; it models the agents, choices, and values as fundamentally individual. There is obviously no silver bullet to solve this complex problem; data privacy is a significant societal issue with multi-level interdependencies that need to be considered thoughtfully and ethically. Much work therefore remains to be done in this area.

Effective data privacy measures will need to integrate a mechanism for distributed moral responsibility that will simultaneously involve both top-down and bottom-up interventions. Doing so will involve a synergy between increased regulation, technological intervention, distributed consent, and empowerment of citizens. Increasing data privacy and protection is not only an important public service but a democratic imperative. Access to data privacy and protection is a growing global issue that must be tackled by a combination of technological, ethical, legal, sociological, and educational interventions.

References
1. Leonard, P. G. Emerging Concerns for Responsible Data Analytics: Trust, Fairness, Transparency and Discrimination. SSRN Electron. J. (2018).
2. Garcia, D. Leaking privacy and shadow profiles in online social networks. Sci. Adv. e1701172, DOI: 10.1126/sciadv.1701172 (2017).
3. Bagrow, J. P., Liu, X. & Mitchell, L. Information flow reveals prediction limits in online social activity. Nat. Hum. Behav. 122–128, DOI: 10.1038/s41562-018-0510-5 (2019).
4. Schermer, B. W., Custers, B. & van der Hof, S. The crisis of consent: How stronger legal protection may lead to weaker consent in data protection. Ethics Inf. Technol. 171–182, DOI: 10.1007/s10676-014-9343-8 (2014).
5. Floridi, L. Faultless responsibility: On the nature and allocation of moral responsibility for distributed moral actions. Philos. Trans. Royal Soc. A 20160112, DOI: 10.1098/rsta.2016.0112 (2016).
6. Kleinig, J. The Ethics of Consent. Can. J. Philos. 91–118, DOI: 10.1080/00455091.1982.10715825 (1982).
7. Westen, P. The logic of consent: the diversity and deceptiveness of consent as a defense to criminal conduct (Ashgate, 2004).
8. Faden, R. R. & Beauchamp, T. L. A History and Theory of Informed Consent (Oxford University Press, 1986).
9. Finch, J. B. & McCully, C. A. The people versus the liquor traffic: Speeches of John B. Finch, delivered in the prohibition campaigns of the United States and Canada (The R.W.G. Lodge, 1887), 24th (rev.) edn.
10. Skirpan, M. W., Yeh, T. & Fiesler, C. What's at Stake: Characterizing Risk Perceptions of Emerging Technologies. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI '18, 70, DOI: 10.1145/3173574.3173644 (Association for Computing Machinery, 2018).
11. Custers, B., van der Hof, S., Schermer, B., Appleby-Arnold, S. & Brockdorff, N. Informed consent in social media use: the gap between user expectations and EU personal data protection law. SCRIPTed 435 (2013).
12. Fraser, N. Rethinking the Public Sphere: A Contribution to the Critique of Actually Existing Democracy. Soc. Text (1990).
13. Jackson, S. J. & Foucault Welles, B. Hijacking #myNYPD: Social Media Dissent and Networked Counterpublics. J. Commun. 932–952, DOI: 10.1111/jcom.12185 (2015).
14. Jackson, S. J. & Banaszczyk, S. Digital Standpoints: Debating Gendered Violence and Racial Exclusions in the Feminist Counterpublic. J. Commun. Inq. 391–407, DOI: 10.1177/0196859916667731 (2016).
15. Schwartz, P. M. Internet Privacy and the State. Conn. L. Rev. 815–859, DOI: 10.2139/ssrn.229011 (2000).
16. Tufekci, Z. Can You See Me Now? Audience and Disclosure Regulation in Online Social Network Sites. Bull. Sci. Technol. Soc. 20–36, DOI: 10.1177/0270467607311484 (2007).
17. Altman, I. Privacy Regulation: Culturally Universal or Culturally Specific? J. Soc. Issues 66–84, DOI: 10.1111/j.1540-4560.1977.tb01883.x (1977).
18. Lapowsky, I. One Man's Obsessive Fight to Reclaim His Cambridge Analytica Data. Wired (2019).
19. Solove, D. J. The Digital Person: Technology and Privacy in the Information Age (New York University Press, 2004).
20. Cohen, J. E. Examined Lives: Informational Privacy and the Subject as Object. Stan. L. Rev. 1373–1438 (2000).
21. Wang, N., Xu, H. & Grossklags, J. Third-party apps on Facebook: privacy and the illusion of control. In Proceedings of the 5th ACM Symposium on Computer Human Interaction for Management of Information Technology, 1–10 (2011).
22. Yang, Y., Wang, J. & Motter, A. E. Network observability transitions. Phys. Rev. Lett. 109, 258701, DOI: 10.1103/PhysRevLett.109.258701 (2012).
23. Allard, A., Hébert-Dufresne, L., Young, J.-G. & Dubé, L. J. Coexistence of phases and the observability of random graphs. Phys. Rev. E 89, 022801, DOI: 10.1103/PhysRevE.89.022801 (2014).
24. Traud, A. L., Mucha, P. J. & Porter, M. A. Social structure of Facebook networks. Phys. A 4165–4180, DOI: 10.1016/j.physa.2011.12.021 (2012).
25. Lewis, K., Kaufman, J. & Christakis, N. The Taste for Privacy: An Analysis of College Student Privacy Settings in an Online Social Network. J. Comput. Commun. 79–100, DOI: 10.1111/j.1083-6101.2008.01432.x (2008).
26. Rouvroy, A. & Poullet, Y. The right to informational self-determination and the value of self-development: Reassessing the importance of privacy for democracy. In Reinventing data protection?, 45–76 (Springer, 2009).
27. Dutt, R., Deb, A. & Ferrara, E. “Senator, We Sell Ads”: Analysis of the 2016 Russian Facebook Ads Campaign. In International Conference on Intelligent Information Technologies, 151–168 (Springer, 2018).
28. Mba, G., Onaolapo, J., Stringhini, G. & Cavallaro, L. Flipping 419 cybercrime scams: Targeting the weak and the vulnerable. In Proceedings of the 26th International Conference on World Wide Web Companion, 1301–1310 (2017).