The Impact of Privacy Laws on Online User Behavior
Julia Schmitt, Klaus M. Miller, Bernd Skiera Goethe University Frankfurt December 2020
Prof. Dr. Bernd Skiera, Department of Marketing, Faculty of Business and Economics, Goethe University Frankfurt, Theodor-W.-Adorno-Platz 4, 60323 Frankfurt, Germany, Phone +49-69-798-34649, email: [email protected].
Jun.-Prof. Dr. Klaus M. Miller, Department of Marketing, Faculty of Business and Economics, Goethe University Frankfurt, Theodor-W.-Adorno-Platz 4, 60323 Frankfurt, Germany, Phone +49-69-798-33865, email: [email protected].
Julia Schmitt, Department of Marketing, Faculty of Business and Economics, Goethe University Frankfurt, Theodor-W.-Adorno-Platz 4, 60323 Frankfurt, Germany, Phone +49-69-798-34563, email: [email protected].
ABSTRACT
Policy makers worldwide draft privacy laws that require trading off between safeguarding consumer privacy and preventing economic damage to companies that use consumer data. However, little empirical knowledge exists as to how privacy laws affect companies’ performance. Accordingly, this paper empirically quantifies the effects of the enforcement of the EU’s General Data Protection Regulation (GDPR) on online user behavior over time, analyzing data from 6,286 websites spanning 24 industries, during the 10 months before and 18 months after the GDPR’s enforcement in 2018. A difference-in-differences analysis, with a synthetic control group approach, enables the short- and long-term effects of the GDPR on user behavior to be reliably isolated. The results show that, on average, the GDPR’s effects on user quantity and usage intensity were negative; e.g., 3 months (18 months) post-GDPR, the numbers of unique visitors and total visits to a website decreased by 0.8% (6.6%) and 4.9% (10%), respectively. These effects could translate into average revenue losses of $7 million for e-commerce websites and almost $2.5 million for ad-based websites 18 months after GDPR. The GDPR’s effects vary across websites, with some industries even benefiting from it; moreover, more-popular websites suffered less damage, suggesting that the GDPR increased market concentration.
Keywords: Privacy Law, Online Privacy, Consumer Protection, GDPR, Data Privacy Regulation
Internet users generally perceive their privacy as a cause for concern. For example, a survey in 2019 by Pew Research Center showed that 79% of American users are concerned about how companies use their data, partly because they do not know which data are being collected. In recent years, in an effort to mitigate these types of concerns, policy makers worldwide have drafted and enforced privacy laws. One of the highest-profile and most expansive laws is the European Union’s (EU) General Data Protection Regulation (GDPR), which became enforceable on May 25th, 2018. Another example is the California Consumer Privacy Act (CCPA), which came into effect about two years later, in January 2020. Other countries such as Chile, Serbia, Brazil, India, and Thailand have also recently enforced or approved privacy laws. While the specific details of the various privacy laws differ, their basic idea is the same: to increase individuals’ privacy, which is commonly defined as individuals’ control over their personal data (Holvast 1993). More broadly, when drafting privacy laws, policy makers aim to increase the protection of personal data as a fundamental right (GDPR 2016/679).
In practical terms, privacy laws such as the GDPR seek to enhance data privacy by targeting the operations of companies that handle consumer data, through two main avenues: (i) limiting companies’ capacity to collect and use consumer data; and (ii) requiring that companies be transparent about their data collection practices. As will be elaborated in what follows, these requirements affect companies’ operations, which may lead to economic loss. Moreover, companies’ attempts to recoup these losses may have downstream societal effects. For example, a company might scale back its services, or begin to charge for services that it once provided for free.
If the company’s services include providing information (e.g., as in the case of a news website), the outcome could be a less-informed citizenry. Moreover, some companies might cut jobs, causing financial distress to their employees; if such layoffs take place on a large scale, the societal harm could be profound.
Thus, in establishing privacy regulations, policy makers must carefully balance between ensuring citizens’ right to privacy and avoiding excessive damage to the performance of companies that use consumer data, given the potential societal effects of such damage. Yet, it is challenging to predict in advance how implementation of data privacy laws will actually affect companies’ performance. Part of the challenge stems from the fact that consumers may respond in unexpected ways to efforts to protect their privacy. Indeed, it is well established that, though consumers claim to value their privacy, their actual behavior online does not necessarily align with these stated preferences (a phenomenon known as the privacy paradox; e.g., Acquisti 2004).
Accordingly, we present a study that relies on field data to examine how implementation of the GDPR affected user behavior on thousands of websites. We focus on two classes of user behavior metrics: user quantity (e.g., numbers of unique visitors and numbers of visits) and usage intensity (e.g., number of page impressions per visit). These metrics are of interest in themselves as indicators of company performance, and they can be directly linked to the revenue of companies with different business models (e.g., e-commerce sites or sites with ad-based revenue; see also the concluding sections of this paper).
Our analysis begins from the premise that the implementation of a privacy law can have both positive and negative effects on user quantity and usage intensity.
Regarding user quantity, we assume that limitations on data collection and usage restrict companies’ marketing activities, such as the capacity to target new customers through personalized ads on other websites. As a result, users might be less aware of certain companies than they would have been previously, and might face increased search costs to find them. Consequently, traffic to those companies’ websites might be expected to decrease. At the same time, traffic to certain websites might increase among users who find themselves with fewer alternatives; indeed, shortly after the enforcement of the GDPR, some websites operating outside the EU blocked access to EU users, to avoid having to comply with the law (Lecher 2018).
Regarding usage intensity, to comply with requirements for transparency and attainment of consent to collect data, websites may have to adjust their appearances, thereby affecting the user experience. For example, users might face a pop-up with information regarding the website’s cookie usage or other data collection activities, and then have to click to accept or decline cookies (i.e., to allow or refuse the collection of their data). This interaction might increase users’ awareness of their data disclosure and thereby influence their usage intensity (Dinev and Hart 2006). In particular, they might spend less time on the website in order to reduce the amount of data it can collect, or they might abandon the website to avoid having to authorize it to collect data. Alternatively, once a user has consented to have her data collected, she might use the website more than she would otherwise, to avoid having to visit other websites and authorize them to collect data. Lastly, there might be users who do not change their behavior at all.
These arguments suggest that, overall, enforcement of a privacy law such as the GDPR may have positive or negative effects, or no effect at all, on the quantity of users who visit a particular website and on their usage intensity.
Moreover, different websites might be affected in different ways, as users’ expectations regarding their privacy, and their consequent responses to privacy-driven changes in website operations, may vary across regions (cultures), or across websites in different industries (e.g., Dinev et al. 2006). It is also important to understand how these effects develop over time, as it might take several months for a website to consolidate its policies to ensure compliance with new laws, and for users to adjust their usage habits. Thus, our study aims to achieve the following specific objectives:
1) Quantifying the effects of the enforcement of the GDPR on metrics of user quantity (i.e., number of unique visitors, number of (non-unique) visits, number of page impressions, time on website, number of bouncing visitors) and usage intensity (i.e., number of visits per unique visitor, page impressions per visit, time per visit, bounce rate) on websites over time (from 3 months up to 18 months after the enforcement of the privacy law);
2) Identifying how these effects vary as a function of website characteristics (i.e., website industry and popularity) and user characteristics (i.e., user’s country of origin).
Our analysis relies on a dataset capturing user behavior on 6,286 unique websites spanning 24 industries; these websites represent the most popular websites in 13 countries (11 EU countries, Switzerland and the United States). The data cover the period from July 2017 to December 2019, i.e., 10 months before and 18 months after the enforcement of the GDPR (henceforth referred to simply as “GDPR”) on May 25th, 2018, enabling us to construct a before-and-after analysis. Within our dataset, some website–user interactions are subject to regulation by the GDPR (i.e., interactions involving EU-based websites or EU-based users), whereas others are not (i.e., interactions involving non-EU-based websites and non-EU-based users), effectively creating a “control group.” Thus, we are able to use a difference-in-differences (DiD) strategy (e.g., Janakiraman et al. 2018, Kumar et al. 2016, Goldstein et al. 2014), which we combine with a synthetic control group approach (Abadie et al. 2015) to isolate the effect of the GDPR on our metrics of interest. We obtain the following results:
Among websites to which the GDPR is applicable, the average number of unique visitors per website decreases by 0.8% (6.6%) in the first 3 months (18 months) after GDPR. Moreover, the average total number of visits per website decreases by almost 5% after 3 months, and about 10% after 18 months; about two-thirds of websites continue to be negatively affected by the GDPR 1.5 years after its enforcement. We similarly observe decreases in the average number of page impressions (drops of 3% at 3 months; >9% at 18 months) and the amount of time on the website (decrease of 4.7% at 3 months; 9.7% at 18 months).
Among websites that suffer from a reduction in user quantity, the remaining users exhibit an increase in usage intensity; for example, the number of visits per user increases, on average, by about 4.8% at 18 months post-GDPR. Conversely, among websites that gain users after the GDPR, usage intensity decreases; e.g., the number of visits per user decreases, on average, by about 9.1% at 18 months post-GDPR.
The effects of the GDPR vary across websites; for example, less-popular websites lose more total visits (10%-21% drop) than more-popular websites do (2%-9% drop), suggesting that the GDPR increases market concentration. The effects also vary across industries, with Entertainment and Leisure websites being most negatively affected (-12.5% to -13.8% after 18 months), whereas Business and Consumer Service websites even experience a positive effect (+4.7% after 18 months).
User characteristics (i.e., a user’s country of origin) have only a small effect on how the GDPR affects user behavior.
Though several studies have begun to explore the effects of the GDPR on various outcomes, including short-term effects on (recorded) web traffic (Goldberg et al. 2020), to our knowledge, this study is the first to rigorously isolate these effects by using a control group, and to investigate how these effects develop over the long term. Accordingly, the findings of this study can provide policy makers with a reliable assessment of the effects of the GDPR and inform the development of future privacy laws.
KNOWLEDGE ON EFFECTS OF PRIVACY CHANGES ON ONLINE USER BEHAVIOR
We draw from and contribute to two main streams of literature. The first stream attempts to illuminate, through surveys and lab experiments, users’ attitudes towards data privacy and their responses to different levels of privacy or control over their data. The second stream uses field studies to examine the effects of privacy laws on various outcomes of interest.
User Attitudes and Behavior with Regard to Privacy
Lab experiments and survey-based studies have examined how users’ attitudes and website usage behavior are affected by websites’ handling of user privacy. The results of these studies point to a nuanced relationship between privacy and user behavior. For example, several studies based on consumer surveys suggest that when users perceive themselves as having more control over their privacy (and specifically, more options to regulate their privacy), they experience lower privacy concerns (Martin 2015), a higher level of trust in a website and an increase in their purchase intentions (Martin et al. 2017) as well as their willingness to disclose data to websites (Malhotra et al. 2004; Culnan and Armstrong 1999), and can even react more positively to personalized ads (Tucker 2013).
Other studies, in contrast, find that user behavior is not, in fact, affected by different levels of privacy: For example, Belanger and Crossler (2011) show that while users claim to be concerned about their privacy, they exhibit contradictory behavior and continue to share data with companies despite privacy concerns. This result may have been induced by users’ feelings of powerlessness regarding their privacy (Few 2018). Acquisti et al. (2012) further show that user privacy concerns and preferences for the same level of privacy are not stable and that the willingness to disclose data can depend on other factors, such as the amount and order of data requests. These findings are in line with the privacy paradox, indicating that users’ stated privacy preferences often differ from their actual behavior (e.g., Acquisti 2004).
Still, other studies suggest that inclusion of more privacy control options for users might have negative effects on website usage. In particular, privacy features, such as requesting users’ explicit consent for data collection (as required by the GDPR), can make users aware of data disclosure that they were not previously aware of (Dinev and Hart 2006).
This awareness may lead users to feel warier about using the site, and thus diminish their usage behavior.
The privacy calculus theory, proposed by Dinev and Hart (2006), provides a framework that can encompass all these different responses to privacy controls. Specifically, the theory suggests that the extent to which a user values privacy on a particular website depends on the user’s individual privacy concerns, the user’s trust in the website, and the value that the user derives from the website’s offerings. Users with higher levels of privacy concerns or lower trust towards a website may be more likely than others to respond favorably to more stringent privacy measures. In turn, when users attribute a high value to the website’s offerings, they may be willing to sacrifice privacy in exchange for convenient access to those offerings, and thus may be indifferent to privacy levels, or may even respond unfavorably if privacy hurts the website’s accessibility.
This theory suggests that users’ responses to changes in a website’s handling of privacy may vary across users, and across websites. Indeed, several studies show that differences in privacy perceptions and expectations depend on a user’s country and cultural background (e.g., Dinev et al. 2006; Steenkamp and Geyskens 2006; Miltgen and Peyrat-Guillard 2014) and on the device used by a user to access a website (Melumad and Meyer 2020). The current study extends these findings by comparing how users in different countries vary in their actual responses to privacy laws, as well as by considering variation across websites with different characteristics.
Field Studies: Effects of Privacy Laws on Various Outcomes
The findings outlined above suggest that it is likely to be difficult to predict how large populations of users will respond to the enforcement of new privacy laws. Accordingly, several studies use field data to construct event studies of users’ revealed behavior after the enforcement of such laws. Examples that predate the GDPR are the work of Goldfarb and Tucker (2011), who show that implementation of the EU Privacy and Electronic Communications Directive (2002/58/EC) reduces ad effectiveness on websites, making it more challenging for ad-financed websites to generate revenues, and that of Campbell et al. (2015), who show that it especially hurts smaller online firms.
Several recent studies have specifically sought to characterize various effects of the GDPR. Some of these works focus on websites’ actions in response to the law, showing that many update their privacy policies (Degeling et al. 2019) and increase their privacy policy length (Linden et al. 2019). Furthermore, a clear reduction in third-party cookies occurs (Libert et al. 2018; Goldberg et al. 2020; Lefrere et al. 2019; Hu and Sastry 2019). Partly due to the anticipated reduction in third-party cookies, Mirreh (2018) predicts that websites could lose almost half of their traffic because of an inevitable shift of retargeting strategies, making it more challenging for companies to get users to their websites. A study by Lefrere et al. (2019), however, suggests that the GDPR does not affect website reach, page views or the content of websites.
A study that is particularly relevant to our research is that of Goldberg et al. (2020), who measure how the GDPR affected recorded web traffic and e-commerce sales in the 4 months after the regulation came into effect. The authors show that after the GDPR there was an average 11.70% drop in recorded page views from EU users (Goldberg et al. 2020).
Our empirical study delivers insights that greatly extend Goldberg et al.’s research (for a detailed comparison of this study to Goldberg et al. (2020), see Table 1). Primarily, our study offers a methodological improvement: Whereas Goldberg et al. (2020) use a panel difference approach to compare recorded traffic outcomes from 2017 (before GDPR) to 2018 (after GDPR), we observe a control group, enabling us to use a DiD approach. Moreover, our study adopts a long-term orientation, providing a more comprehensive analysis of GDPR. Given that the GDPR was the first major new privacy law in the EU since the e-Privacy Directive in 2002, companies and users may have needed some time to adjust their behavior to the GDPR and, therefore, the full effect of the privacy law might only become observable at a later stage. Furthermore, our study examines differences across websites and users.
Finally, our study provides an empirical estimation of metrics covering actual traffic, whereas Goldberg et al. (2020) examine recorded traffic. As the authors mention in their study, recorded traffic differs from actual traffic because recorded traffic depends on (i) the actual traffic and (ii) the number of users that provide their consent to a website for measuring, and thus recording, traffic. As a result, a change in recorded traffic after GDPR is, in fact, a combination of two changes: a change in the number of consenting users and a change in the actual traffic that these consenting users generate.
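The difference between a before-after (panel difference) comparison and a DiD comparison can be illustrated with a minimal sketch. It assumes log-transformed outcomes and simple group means; the function names and the numbers are hypothetical and serve only as illustration, and the synthetic control step used in this study is not included.

```python
from statistics import mean

def before_after_estimate(treated_pre, treated_post):
    """Before-after change in the treated group only (no control group)."""
    return mean(treated_post) - mean(treated_pre)

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: treated change minus control change."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical log(visits) values, not taken from the study's data.
treated_pre, treated_post = [10.0, 10.2], [9.9, 10.1]
control_pre, control_post = [8.0, 8.2], [8.1, 8.3]

naive = before_after_estimate(treated_pre, treated_post)
did = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"before-after: {naive:.2f}, DiD: {did:.2f}")
```

In this made-up example the control group grows over the same period, so the before-after comparison understates the decline that the DiD comparison attributes to the treatment.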
DESCRIPTION OF EMPIRICAL STUDY
The aim of our empirical study is to analyze the effects of the enforcement of the GDPR on online user behavior, as reflected in measures of user quantity and usage intensity; to understand how these effects evolve over time (between 3 and 18 months after enforcement); and to reveal how these effects vary as a function of website and user characteristics.
Table 1: Comparison of this Study to the Most Similar Study to Date: Goldberg et al. (2020)
Aim of Study
  Goldberg et al. (2020): Effect of GDPR on recorded web traffic in Europe
  This Study: Effect of GDPR on affected user behavior in Europe and US, and variation across websites
Existence of Effect of GDPR
  Goldberg et al. (2020): Yes
  This Study: Yes
Before-After Comparison
  Goldberg et al. (2020): Yes
  This Study: Yes
Control Group
  Goldberg et al. (2020): No
  This Study: Yes (websites and users not affected by GDPR)
Duration of Effects
Variation across Websites and Users
  Goldberg et al. (2020): No
  This Study: Yes
Observation Period
  Goldberg et al. (2020): January to September 2017 and January to September 2018 (18 months)
  This Study: July 2017 to December 2019 (30 months)
Website Sample
User Behavior Metrics
  Goldberg et al. (2020): Recorded Page Impressions
  This Study: Unique Visitors, Total Visits, Visits per Unique Visitor, Page Impressions, Page Impressions per Visit, Time on Website, Time per Visit, Bouncing Visitors, Bounce Rate
Background on the GDPR
The GDPR, which came into effect on May 25th, 2018, is the first major privacy law in Europe since the e-Privacy Directive in 2002. The GDPR regulates any activity performed on personal data from EU citizens. As a regulation, the law is binding for all websites based in EU countries; according to Article 3 of the GDPR, a website’s “base” (and thus the applicability of the GDPR) is determined according to the geographical location where the site’s data processing takes place. Websites within the scope of GDPR that do not comply with the privacy law face significant fines of up to 4% of the website’s global annual turnover or €20 million, depending on the severity of the infringement.
The GDPR handles a variety of privacy aspects that can have either a direct or an indirect effect on how a user engages with a website. Compared with other approved or enforced privacy laws, the GDPR has more stringent privacy protection requirements (Lucente and Clark 2020). For example, the CCPA, another major privacy law, only covers a subset of activities, namely the “collection”, “selling” or “sharing” of personal information, whereas the GDPR covers “any activity performed on data”. Unlike the GDPR, the CCPA also does not require websites to obtain a user’s explicit consent for data processing; rather, it mandates only that users be able to withdraw their consent, i.e., the CCPA follows an opt-out approach. Additionally, the financial penalties that the GDPR imposes for non-compliance exceed those of the CCPA (up to 4% of the website’s global annual turnover or €20 million under the GDPR compared with up to $7,500 under the CCPA). Given the strict nature of GDPR compared with other privacy laws, the findings of this study might serve as an upper bound of the effects of privacy laws on user quantity and usage intensity on websites.
Description of Set-Up of Empirical Study
The GDPR provides a useful setting for quantifying the effects of privacy laws on user behavior because it implicitly divides website–user interactions (here referred to as “observations”) into a treatment group (i.e., those to which the GDPR is applicable) and control group (those to which the GDPR does not apply), as depicted in Figure 1. As noted above, the scope of the GDPR includes all websites based in the EU and further encompasses the processing of personal data from all EU-based users. Thus, the treatment group comprises observations corresponding to EU users visiting any website or to non-EU users visiting EU-based websites. The control group consists of observations corresponding to non-EU users visiting non-EU-based websites. In line with Article 3 of the GDPR described above, we use the website’s server location (retrieved from https://check-host.net) to determine the respective website’s data processing location, and the applicability of the GDPR.
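The assignment rule described above can be written as a short sketch. The function and parameter names are hypothetical; the rule itself follows the text: an observation is treated if the user is EU-based or the website is EU-based, and belongs to the control group only if neither is.

```python
def gdpr_group(user_in_eu: bool, website_in_eu: bool) -> str:
    """Assign a website-user observation to the treatment or control group.

    The GDPR applies if either the user or the website (by data processing
    location) is EU-based; otherwise the observation is a control.
    """
    return "treatment" if (user_in_eu or website_in_eu) else "control"

# All four combinations of user and website location.
for user_in_eu in (True, False):
    for website_in_eu in (True, False):
        print(user_in_eu, website_in_eu, gdpr_group(user_in_eu, website_in_eu))
```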
Figure 1: Scope of GDPR and Resulting Assignment to Treatment and Control Group
We then use the enforcement date of GDPR (May 25th, 2018) to construct a before-and-after analysis, comparing the treatment group to the control group. Although not all companies were compliant on the GDPR enforcement date, very few companies were compliant prior to the enforcement date (Hochstadt 2018). Still, to eliminate the concern of possible early or late compliance of websites with the GDPR, we conducted all analyses for two observation periods: the entire observation period as well as a period that excludes the 30 days before and after the enforcement date. The results presented in this study are robust to exclusion of this 60-day period. Hence, we report only the results based on the full dataset, including the 60-day period.
Overview of Data
Description of data sample.
This study utilizes data from SimilarWeb for the Top 1,000 websites (as listed in Alexa Top Sites in April 2018) of 11 EU countries (Austria, Denmark, France, Germany, Hungary, Italy, Netherlands, Poland, Spain, Sweden and the UK) and two Non-EU countries, Switzerland and USA. (Note: During the time of our study the UK was still a member of the EU. Its membership ended on January 31, 2020.) SimilarWeb is a company that draws on a diversified and rich global user panel to measure online user behavior. Its data are primarily used by firms (e.g., Google, Alibaba, eBay, P&G), but have also been used in research (e.g., Calzada and Gill 2020, Lu et al. 2020). The websites in our sample span diverse industries (see Figure 2), audiences and popularity levels (here measured by SimilarWeb ranks). For each website in our sample, the dataset also includes information about the website industry as well as the global, country and industry rank, based on the website’s popularity worldwide, in the analyzed country and within the website’s industry.
For each website in the sample, the dataset includes information on the user quantity metrics of users accessing that website from the country in which the website is most popular. Additionally, for each website, user quantity data are available for users accessing the website from the US. (For websites that are most popular in the US, data are available only for US users.) These data span the period between July 1st, 2017 and December 31st, 2019, i.e., almost a year prior to GDPR’s enforcement (May 25th, 2018) and 1.5 years after the enforcement, and can therefore be used for a before-and-after analysis as outlined above.
Figure 2: Distribution of Websites across Industries
We start with 13 countries with 1,000 websites each; after removing duplicate websites, our initial sample includes 7,332 unique websites. For example, the website “google.com” is a duplicate website as it is among the Top 1,000 websites in all 13 countries.
Rather than appearing 13 times, google.com occurs only once in our sample. For each of these 7,332 websites, we have user behavior data corresponding to Non-EU users. For 6,460 of those 7,332 websites, the dataset additionally includes user behavior data of EU users.
Thus, for 6,460 websites, we have two sets of observations, corresponding, respectively, to the Non-EU user base and to the EU user base of that website. For the remaining 872 websites, we only observe the Non-EU user base. In what follows, we consider each website’s Non-EU and EU user bases separately and refer to each combination of a website with one of the two user bases, for convenience, as a “website-instance”. For example, for a website such as “zeit.de” that is based in an EU country (here, Germany), we observe two website-instances: One website-instance corresponds to the set of observations for the EU user base of “zeit.de”. The second website-instance corresponds to the set of observations for the Non-EU user base of “zeit.de”. As “zeit.de” is EU-based, GDPR applies to both its website-instances and both website-instances belong to the treatment group (as depicted in Figure 1).
Similarly, for a website such as “pinterest.com” that is based in a Non-EU country (here the US), we observe two website-instances: one website-instance corresponding to the set of observations for the Non-EU user base of “pinterest.com” and the second website-instance corresponding to the set of observations for the EU user base. As the website of this second example is not EU-based, GDPR applies only to the website-instance that corresponds to the EU user base of “pinterest.com”, which belongs to the treatment group, but not to the Non-EU user base, which belongs to the control group (see Figure 1).
Overall, the initial sample includes 7,332 website-instances corresponding to a set of observations of the Non-EU user base and 6,460 website-instances corresponding to a set of observations of the EU user base, totaling 13,792 website-instances. We then drop website-instances for which the user base generated, on average, fewer than 1,000 visits per week or not a single visit for more than an entire month in the observation period. We also drop website-instances that exhibited strong traffic drops or peaks at some point in time that cannot be explained by our available data. This procedure results in a final sample of 9,683 website-instances, corresponding to 6,286 unique websites (Table 2).
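The first two exclusion rules above can be sketched as a simple filter. This is only an illustration under stated assumptions: the function names are hypothetical, "more than an entire month" without a visit is interpreted as more than four consecutive weeks with zero visits, and the outlier rule (unexplained traffic drops or peaks) is not implemented because the text does not specify it.

```python
def longest_zero_run(weekly_visits):
    """Length of the longest run of consecutive weeks with zero visits."""
    longest = current = 0
    for visits in weekly_visits:
        current = current + 1 if visits == 0 else 0
        longest = max(longest, current)
    return longest

def keep_instance(weekly_visits, min_avg_visits=1000, max_zero_weeks=4):
    """Return True if a website-instance passes both exclusion rules.

    Assumption: a gap of more than `max_zero_weeks` consecutive zero-visit
    weeks stands in for 'no visit for more than an entire month'.
    """
    if not weekly_visits:
        return False
    average = sum(weekly_visits) / len(weekly_visits)
    if average < min_avg_visits:
        return False
    return longest_zero_run(weekly_visits) <= max_zero_weeks

# Hypothetical weekly visit series.
print(keep_instance([1500] * 10))                         # enough traffic, no gap
print(keep_instance([500] * 10))                          # too little traffic
print(keep_instance([5000] * 5 + [0] * 5 + [5000] * 5))   # five-week gap
```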
Table 2: Derivation of Final Sample of Website-Instances
Columns: Website-Instances with EU-based data; Website-Instances with Non-EU-based data
Website sample (top 1,000 websites of 11 EU countries, CH and US)
Sample after removal of duplicated and non-existent websites (e.g., fraudulent pop-ups)
Sample after additional removal of website-instances with average weekly visits <1,000
Final sample after additional removal of website-instances with no visits in >1 month or strong outliers
Final sample for Unique Visitor analysis after additional removal of website-instances with monthly unique visitors <5,000
For 3,397 websites, we have two website-instances (EU and Non-EU user bases) and for 2,889 websites, we have one website-instance (EU or Non-EU user base). Overall, 5,683 websites, corresponding to 7,892 website-instances, belong to our treatment group, encompassing over 1.15 trillion total website visits from the EU. Our control group consists of 1,701 websites, corresponding to the same number of website-instances (see Table 3), encompassing almost 1.8 trillion total website visits from Switzerland and the US. SimilarWeb does not report unique visitor information for websites with fewer than 5,000 unique visitors in a month. The unique visitor analysis thus contains a set of 5,198 treated websites.
Table 3: Distribution of Website-Instances in Treatment and Control Groups
Description of user quantity and usage intensity metrics.
In what follows, we define our variables of interest, namely, our user quantity and usage intensity metrics. Though some variables may be connected to one another to some extent, each variable provides a slightly different insight into the effects of the GDPR on user behavior and engagement on websites. For an overview of the relationship between the user quantity and usage intensity metrics, see Figure 3. Our user quantity metrics are as follows:
1) Monthly Number of Unique Visitors: The number of unique users visiting a website in a month. This metric reveals the website’s reach.
2) Weekly Total Number of Visits: This metric depends on the number of unique visitors to a website and the number of visits generated by each of these unique visitors. It reflects the website’s traffic volume and, thus, the interactions with a website.
3) Weekly Number of Page Impressions: This metric depends on the weekly number of visits to a website and the number of page impressions generated by each visit. It measures how well the website sparks users’ interest, keeps them engaged, and encourages them to continue browsing the website’s offerings.
4) Weekly Time on Website: This metric depends on the weekly number of visits to a website and the time spent on the website in each visit. It reflects the website’s likelihood of achieving its goal in bringing users to the site (e.g., page views, purchases).
5) Weekly Number of Bouncing Visitors: The number of visits in a week in which the user leaves a website after only a single page impression. It indicates how well a website can retain traffic.
Figure 3: Relationship Between the User Quantity and Usage Intensity Metrics
We analyze all variables on a weekly level, with the exception of the number of unique visitors, for which data are only available on a monthly level. Due to large differences in the values of each metric across websites and countries, we convert all user quantity metrics (after adding 1 to avoid zero values) to their natural logarithm so that we capture relative (i.e., percentage) effects. For a mean comparison (before and after GDPR) of the log-transformed user quantity variables for the 7,892 website-instances within the treatment group and the 1,701 control website-instances, see Figure 4.

We calculate the effects for the user quantity metrics per website-instance. We then determine the effect of GDPR on a website as follows: If only one website-instance corresponds to a specific website, i.e., only data for one user base is available for that website, the effect of GDPR on that website comprises only the effect of that one website-instance. This is the case for all Non-EU-based websites (for these websites, only one website-instance belongs to the treatment group, see Figure 1) and for about 30% of the EU-based websites.
Figure 4: Comparison of Logarithm of User Quantity Metrics (i.e., Dependent Variables) between Treatment and Control Group and Pre- and Post-GDPR Period
For most of the EU-based websites, two treated website-instances correspond to the same website, i.e., data for the EU and Non-EU user base are available. For these websites, the overall effect of GDPR comprises the effects of both website-instances. Hereby, the relative sizes of the two website-instances prior to GDPR are considered when merging the two effects into one. For example, the website “zeit.de”, a reputable German online news website, received 98.94% of its total visits from German (EU) users. Thus, GDPR’s effect on the number of total visits on the website “zeit.de” comprises 98.94% of its effect on the website-instance corresponding to the EU user base of “zeit.de”, and 1.06% of GDPR’s effect on the website-instance corresponding to the Non-EU user base of “zeit.de”. This weighting procedure results in the same effects as we would have determined if we had combined the two website-instances from the beginning of the calculation.

We then compare the respective effects of GDPR on the user quantity metrics for each website to examine GDPR’s effect on the usage intensity metrics that are calculated based on the changes of the user quantity metrics. Our usage intensity metrics are as follows:
1) Visits per Unique Visitor: The average number of visits generated by a unique visitor.
2) Page Impressions per Visit: The average number of pages viewed in a visit.
3) Time per Visit: The average time spent on a website per visit.
4) Bounce Rate: The share of visitors who leave a website after visiting just one page.
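The weighting of two website-instance effects described above (the “zeit.de” example) can be illustrated with a minimal sketch. It assumes that the website-level effect is a weighted average of the two instance-level effects, with weights equal to each instance’s pre-GDPR share of the metric. The function name and the two effect values are hypothetical; only the 98.94% / 1.06% split is taken from the text.

```python
def merge_instance_effects(effect_eu, effect_non_eu, pre_volume_eu, pre_volume_non_eu):
    """Combine two website-instance effects into one website-level effect.

    Assumption: each effect is weighted by the instance's pre-GDPR share
    of the metric (e.g., its share of total visits).
    """
    total = pre_volume_eu + pre_volume_non_eu
    if total <= 0:
        raise ValueError("pre-GDPR volumes must sum to a positive number")
    weight_eu = pre_volume_eu / total
    return weight_eu * effect_eu + (1.0 - weight_eu) * effect_non_eu


# Hypothetical instance-level effects (-10% and +5%); the split is from the text.
website_effect = merge_instance_effects(-0.10, 0.05, 98.94, 1.06)
print(f"{website_effect:.5f}")
```

Because the EU user base dominates in this example, the website-level effect stays very close to the EU instance’s effect.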
Description of Methodology to Analyze Data
To examine the effect of the GDPR on the various user behavior metrics, we use a combination of DiD and the synthetic control group (SCG) approach and exploit the enforcement of the GDPR as the “treatment” event. DiD (e.g., Janakiraman et al. 2018, Kumar et al. 2016, Goldstein et al. 2014) aims to isolate the effect of treatment by comparing the differences in the treatment and control groups before and after treatment (here GDPR).
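As a rough illustration of the DiD comparison, the following sketch computes the difference-in-differences of mean log-transformed values, using the ln(Y + 1) transformation described earlier. All input numbers are hypothetical, and the function names are ours; the sketch shows the basic 2x2 comparison only, not the full estimation procedure of this paper.

```python
import math
from statistics import mean


def log1p_series(values):
    """Apply the ln(Y + 1) transformation to a list of raw metric values."""
    return [math.log(v + 1) for v in values]


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences of mean ln(Y + 1) values."""
    t_pre = mean(log1p_series(treated_pre))
    t_post = mean(log1p_series(treated_post))
    c_pre = mean(log1p_series(control_pre))
    c_post = mean(log1p_series(control_post))
    return (t_post - t_pre) - (c_post - c_pre)


# Hypothetical weekly visits: the treated series drops, the control stays flat.
effect_log = did_estimate([999, 999], [899, 899], [999, 999], [999, 999])
effect_pct = math.exp(effect_log) - 1  # convert the log-scale effect to a relative change
print(f"{effect_pct:.4f}")
```

On the log scale, the estimate is approximately a percentage change; the exponential conversion in the last step makes that relationship exact for this toy example.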
Description of methodology to analyze user quantity metrics.
Using the regression formula below and the control and treatment group assignment described in Figure 1, we calculate the treatment effect ($\beta_3$, the coefficient of $Treated_{t,wi}$) for every website-instance wi for each user quantity metric q. To determine the development of the treatment effect over time t (here measured in weeks for all user quantity metrics except unique visitors, where we measure t in months), we rerun our analysis several times, extending the duration of the post-GDPR observation period in each analysis. We first consider a post-treatment period of 3 months after GDPR (up to August 25th, 2018, thus including observations from week 1 to week 60), then periods of 6 (week 1 to 73), 9 (week 1 to 86), 12 (week 1 to 99) and 18 (week 1 to 125) months. These analyses enable us to determine the GDPR’s short- up to the long-term effects.

$$\ln(Y_{q,t,wi} + 1) = \beta_0 + \beta_1 \cdot EU_{wi} + \beta_2 \cdot Postperiod_{t} + \beta_3 \cdot Treated_{t,wi} + \epsilon_{q,t,wi} \tag{1}$$

$Y_{q,t,wi}$: Value of user quantity metric q in week t on website-instance wi
$EU_{wi}$: EU-Dummy, i.e., binary variable for which a value of 1 indicates that the users or website of website-instance wi are EU-based, else 0
$Postperiod_{t}$: Postperiod-Dummy, i.e., binary variable for which a value of 1 indicates that the observation in week t lies in the post-treatment period, else 0
$Treated_{t,wi}$: $= EU_{wi} \cdot Postperiod_{t}$; Treatment-Dummy, i.e., binary variable for which a value of 1 indicates that in week t, website-instance wi needs to consider GDPR, otherwise 0
$\epsilon_{q,t,wi}$: Error term for user quantity metric q in week t for website-instance wi

We rely on the SCG method (Abadie et al. 2015), which entails a synthetic construction of a control group whose pre-treatment patterns are comparable to those of the treatment group. This matched control group is constructed by selecting, for each treated website-instance, a weighted combination of several control website-instances.
Thus, this approach requires (i) choosing a set of control website-instances to use and (ii) weighting each of these website-instances. The weighting is done such that the weighted combination of control website-instances, referred to collectively as the “synthetic control website-instance”, minimizes the pre-treatment mean squared error (MSE) between the resulting synthetic control website-instance and the treated website-instance (following the approach outlined in Xu (2017)). Thus, this approach fulfills the parallel pre-treatment condition by construction. Then, the post-treatment metric of interest is calculated for the synthetic control website-instance and serves as the treated website-instance’s counterfactual.

For each metric, the choice of control website-instances and their respective weights is as follows: For each treated website-instance, we select a set of five website-instances in the control group that (i) belong to the same industry as the website of the corresponding treated website-instance and (ii) have the highest correlations with the respective metric of the treated website-instance in the pre-treatment period. Using these five control website-instances, we follow the approach outlined above to calculate the weights of these website-instances, to create a synthetic control website-instance that exhibits a pre-treatment pattern comparable to that of the treated website-instance. We then use the weights and observed values of the five website-instances to calculate a synthetic time series for the synthetic control website-instance, spanning the post-treatment period. The outcomes of these calculations serve as the control group in the DiD to determine the causal effect of the treatment on the metric of interest for all website-instances. We repeat the process for each user quantity metric q and website-instance wi.
Depending on the number of website-instances that correspond to a specific website, we determine the effect of GDPR on that website, referred to as ∆, as described above: If only one website-instance corresponds to a website, the GDPR’s effect on that website-instance and user quantity metric (the treatment effect 𝛽 calculated in Equation 1) determines the GDPR’s effect on that website (∆). If two website-instances correspond to a website, the effects of GDPR on both website-instances for the user quantity metric (two treatment effects 𝛽) determine the GDPR’s effect on that website (∆), taking the relative sizes of the two website-instances for that one website into account.

Description of methodology to analyze usage intensity.
After these steps, we have determined the effect of interest, the treatment effect 𝛽, for all treated website-instances, user quantity metrics and post-treatment periods, and merged the treatment effects of the website-instances to obtain the treatment effects for all corresponding websites and user quantity metrics. We then use these treatment effects for all websites and post-treatment periods to examine the change in our usage intensity metrics for each website over time. For this examination, we take advantage of two aspects:

(i) Each usage intensity metric is a function of two user quantity metrics. For example, the number of visits per unique visitor is a function of the number of unique visitors and the total number of visits on a website w (see Figure 3):

$$\text{Number of Visits per Unique Visitor}_{w} = \frac{\text{Total Number of Visits}_{w}}{\text{Number of Unique Visitors}_{w}} \tag{2}$$

(ii) The treatment effects calculated with Equation (1) result in relative (i.e., approximately percentage) changes of our user quantity metrics for each post-treatment period p. Thus, to visualize the relative change of the number of visits per unique visitor due to GDPR for a particular website w for a particular post-treatment period p, we include the GDPR’s effect in Equation (2):

$$\text{Number of Visits per Unique Visitor}_{w} \cdot \left(1 + \Delta\text{Number of Visits per Unique Visitor}_{p,w}\right) = \frac{\text{Total Number of Visits}_{w} \cdot \left(1 + \Delta\text{Total Number of Visits}_{p,w}\right)}{\text{Number of Unique Visitors}_{w} \cdot \left(1 + \Delta\text{Number of Unique Visitors}_{p,w}\right)} \tag{3}$$

While the GDPR’s effect, reflected in ∆, is calculated for the two user quantity metrics (number of unique visitors and total number of visits) with Equation (1), it is not known for the usage intensity metric (number of visits per unique visitor).
Thus, we rewrite Equation (3) to isolate the GDPR’s effect on the number of visits per unique visitor (note that the usage intensity metric can be expressed by the two user quantity metrics, see Equation (2)):

$$\Delta\text{Number of Visits per Unique Visitor}_{p,w} = \frac{1 + \Delta\text{Total Number of Visits}_{p,w}}{1 + \Delta\text{Number of Unique Visitors}_{p,w}} - 1 \tag{4}$$

$\Delta\text{Number of Visits per Unique Visitor}_{p,w}$: GDPR’s effect in period p on the number of visits per unique visitor for website w
$\Delta\text{Total Number of Visits}_{p,w}$: GDPR’s effect in period p on the total number of visits for website w
$\Delta\text{Number of Unique Visitors}_{p,w}$: GDPR’s effect in period p on the number of unique visitors for website w

This process enables us to reveal the GDPR’s effects on the number of visits per unique visitor for each website and each post-treatment period p (i.e., after 3, 6, 9, 12 and 18 months of GDPR). We calculate the GDPR’s effects on the other usage intensity metrics i (i.e., page impressions per visit, time per visit, and bounce rate) with the same procedure. However, these usage intensity metrics are a function of the total number of visits and different user quantity metrics q (i.e., number of page impressions, time on website, and number of bouncing visitors, respectively). Therefore, we need to slightly adjust Equation (4):

$$\Delta\text{Usage Intensity Metric}_{i,p,w} = \frac{1 + \Delta\text{User Quantity Metric}_{p,q,w}}{1 + \Delta\text{Total Number of Visits}_{p,w}} - 1 \tag{5}$$

$\Delta\text{Usage Intensity Metric}_{i,p,w}$: GDPR’s effect on the usage intensity metric i (Page Impressions per Visit, Time per Visit, Bounce Rate) in period p for website w
$\Delta\text{User Quantity Metric}_{p,q,w}$: GDPR’s effect in period p on the corresponding user quantity metric q (Number of Page Impressions, Time on Website, Number of Bouncing Visitors) for website w
$\Delta\text{Total Number of Visits}_{p,w}$: GDPR’s effect in period p on the total number of visits for website w

Description of methodology to analyze variations of effects across websites.
After calculating the GDPR’s effects on user quantity and usage intensity for each website, we subsequently classify the websites according to a particular feature of interest (namely, website industry, popularity as measured by the ranks within SimilarWeb’s global, country and industry rankings of websites, and the country of origin of the predominant user base) and examine whether specific website or user characteristics are associated with positive or negative as well as stronger or weaker effects due to GDPR.
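The two steps described in this subsection, deriving a usage intensity effect from two user quantity effects (Equations (4) and (5)) and then averaging effects within a category, can be sketched as follows. The website names, effect values and category labels are hypothetical, and the function names are ours.

```python
from collections import defaultdict
from statistics import mean


def ratio_metric_effect(delta_numerator, delta_denominator):
    """Relative change of a ratio metric, given the relative changes of its parts.

    Equation (4): numerator = total visits, denominator = unique visitors.
    Equation (5): numerator = user quantity metric q, denominator = total visits.
    """
    return (1.0 + delta_numerator) / (1.0 + delta_denominator) - 1.0


def average_effect_by_group(effects, groups):
    """Average per-website effects within each category (e.g., industry)."""
    buckets = defaultdict(list)
    for website, effect in effects.items():
        buckets[groups[website]].append(effect)
    return {group: mean(values) for group, values in buckets.items()}


# Hypothetical per-website effects on page impressions and total visits.
delta_page_impressions = {"a.example": -0.05, "b.example": 0.10}
delta_total_visits = {"a.example": -0.10, "b.example": 0.20}
industry = {"a.example": "News", "b.example": "News"}

delta_pages_per_visit = {
    site: ratio_metric_effect(delta_page_impressions[site], delta_total_visits[site])
    for site in delta_page_impressions
}
by_industry = average_effect_by_group(delta_pages_per_visit, industry)
```

Note that the usage intensity effect is computed per website before averaging; averaging the user quantity effects first and then taking the ratio would generally give a different number.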
RESULTS OF EMPIRICAL STUDY
GDPR’s Effect on User Quantity Metrics
The following subsections outline the distribution of the GDPR’s effect across websites for each user quantity metric. GDPR does not affect all websites the same way. While some websites experience negative effects, others are not affected by GDPR or even experience positive effects. The sizes of GDPR’s effects further differ across websites. As we later show, the GDPR’s effects on the analyzed metrics result in severe economic effects for websites. Thus, to examine the GDPR’s effect on the websites over time and the resulting economic impact of the privacy law, we first include both the significant (at the 5% level of significance) and the insignificant (i.e., statistically not different from zero) effects in our reporting. We then report the share of websites that the GDPR significantly affected for all user quantity metrics.
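The reporting scheme described above (an average over all effects, followed by the shares of significantly affected websites) can be illustrated with a short sketch. It assumes that each website comes with an estimated effect and a p-value; how p-values are obtained for websites whose effect is merged from two website-instances is not covered here. The function name and all values are hypothetical.

```python
from statistics import mean


def summarize_effects(effects, p_values, alpha=0.05):
    """Mean over all effects plus shares of significantly positive/negative effects."""
    n = len(effects)
    sig_pos = sum(1 for e, p in zip(effects, p_values) if p < alpha and e > 0)
    sig_neg = sum(1 for e, p in zip(effects, p_values) if p < alpha and e < 0)
    return {
        "mean_effect": mean(effects),  # includes insignificant effects
        "share_negative": sum(1 for e in effects if e < 0) / n,
        "share_sig_positive": sig_pos / n,
        "share_sig_negative": sig_neg / n,
        "share_significant": (sig_pos + sig_neg) / n,
    }


# Hypothetical effects and p-values for four websites.
summary = summarize_effects([-0.10, -0.02, 0.05, 0.01], [0.01, 0.40, 0.03, 0.70])
print(summary)
```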
GDPR’s effect on monthly number of unique visitors.
Figure 5 shows the distribution of the GDPR’s effect on the monthly number of unique visitors over time. On average, at each post-GDPR time point considered, the GDPR is associated with a decrease in the number of unique visitors as compared with pre-GDPR values. Specifically, 3 months after GDPR, the number of unique visitors decreases by 0.77%, on average, as compared with pre-GDPR values. Over time, this decrease becomes even stronger: One-and-a-half years after GDPR, the number of unique visitors to the treated website is 6.61% lower due to GDPR. After three months of GDPR, for about half of the websites, the number of unique visitors decreased (53.40%). The share of websites that experience a decrease in the number of unique visitors increases over time to 61.73% after 18 months. Overall, while only 54.15% of the websites experience a significant effect 3 months after the GDPR (24.77% of the websites were significantly positively affected, 29.38% of the websites experienced a negative significant effect), the share of significantly affected websites increases to 80.52% after 18 months of the GDPR (28.47% of the websites experienced a positive significant effect, 52.05% of the websites experienced a negative significant effect).
Figure 5: Development of the Distribution of the Effect of GDPR on Monthly Number of Unique Visitors over Time
GDPR’s effect on weekly total number of visits.
Figure 6 shows the distribution of the effect of GDPR on the weekly total number of visits over time. As in our analysis of the average number of unique visitors, we observe that, at each time point examined, the GDPR is associated with a decrease in treated websites’ average total number of visits, as compared with pre-GDPR values. Specifically, in the period of 3 months after the GDPR, treated websites experience an average decline in total visits of about 4.88% as a result of the GDPR. The average decrease after 18 months is 10.02%.

With regard to the distribution of positive versus negative effects of the GDPR, we observe that, 3 months after the GDPR, the share of positively affected websites is 40.69%, as compared with 33.30% after 18 months. The effect sizes corresponding to these positive effects also decrease over time. Thus, while there are websites that benefit from GDPR in the beginning, these positive effects diminish over time. For some of the initially positively affected websites, the effects even become negative after 12 or 18 months. Among websites that are initially affected negatively by the GDPR, the effect sizes become even more negative over time. Overall, 78.83% of the websites are significantly affected by GDPR after 18 months of the GDPR (23.96% of the websites experienced positive significant effects, 54.87% of the websites experienced negative significant effects).
Figure 6: Development of the Distribution of the Effect of GDPR on Weekly Total Number of Visits over Time
GDPR’s effect on weekly number of page impressions.
Figure 7 shows the distribution of the effects of GDPR on the weekly number of page impressions over time. On average, the effect of the GDPR is negative at each time point examined: 3 months after the GDPR, the average number of page impressions decreases by 3.12% as a result of the GDPR. This effect becomes stronger after 18 months, with an average number of page impressions that is about 9.28% lower than pre-GDPR levels. Overall, about 76.06% of the websites are significantly affected by GDPR after 18 months of the GDPR (24.64% positively; 51.44% negatively).
Figure 7: Development of the Distribution of the Effect of GDPR on Weekly Number of Page Impressions over Time

GDPR’s effect on weekly time on website.
Figure 8 shows the distribution of the effects of GDPR on the weekly amount of time spent on the website. On average, the GDPR’s effect is negative at each post-GDPR period examined. Specifically, 3 months after the GDPR, the time on the website decreases by 4.72%. This effect becomes stronger a year after the GDPR, as the time spent on a website is more than 9.87% lower due to GDPR. After 1.5 years of GDPR, the average amount of time spent on the website starts to exhibit a slight upward trend, such that its value is, on average, only 9.68% lower than pre-GDPR levels. Overall, about 74.18% of the websites are significantly affected by GDPR after 18 months of the GDPR (23.99% positively, 50.19% negatively).
Figure 8: Development of the Distribution of the Effect of GDPR on Weekly Time on Website over Time
GDPR’s effect on weekly number of bouncing visitors.
Figure 9 depicts the distribution of the GDPR’s effects on the number of bouncing visitors over time. We observe that, on average, the GDPR decreases the number of bouncing visitors at each time point examined. After 3 months of GDPR, the number of bouncing visitors decreases by 4.35%. One-and-a-half years after the GDPR, the number of bouncing visitors is about 10.16% lower than pre-GDPR levels. After 18 months of the GDPR, 79.01% of the websites are significantly affected by GDPR (24.24% positively, 54.77% negatively).
Figure 9: Development of the Distribution of the Effect of GDPR on Weekly Number of Bouncing Visitors over Time
GDPR’s Effect on Usage Intensity Metrics

GDPR’s effect on number of visits per unique visitor.
As depicted in Table 4, the average number of visits generated by each unique visitor to the websites drops by 1.62% 3 months after GDPR. This effect becomes weaker over time until it reaches an average drop of 0.59% after 18 months of GDPR. However, the effect of the GDPR on the number of visits per unique visitor differs substantially between websites that gain unique visitors and websites that lose unique visitors after the GDPR. Specifically, among websites that gain unique visitors, the number of visits per unique visitor indeed decreases: For example, at 3 months (18 months) after GDPR, the 46.60% (38.27%) of websites that gain unique visitors (see Figure 5) experience a 6.46% (9.09%) decrease in the number of visits per unique visitor, as compared with pre-GDPR levels. Yet, the 53.40% (61.73%) of websites that lose unique visitors experience a 2.56% (4.77%) increase in the number of visits per unique visitor. Thus, among websites that lose unique visitors, the remaining visitors visit those websites more often. Websites that gain unique visitors, however, gain visitors who tend to re-visit less frequently.
Table 4: Average Change in Visits per Unique Visitor Based on Change in Unique Visitors
| Analyzed Websites | 3 Months: ∆Unique Visitors | 3 Months: ∆Total Visits | 3 Months: ∆Visits per Unique Visitor | 18 Months: ∆Unique Visitors | 18 Months: ∆Total Visits | 18 Months: ∆Visits per Unique Visitor |
|---|---|---|---|---|---|---|
| All Treated Websites | -0.77% | -3.59% | -1.62% | -6.61% | -8.67% | -0.59% |
| Websites that Gain Unique Visitors | +15.90% | +6.93% | -6.46% | +21.60% | +8.18% | -9.09% |
| Share of Gaining Websites | 46.60% | | | 38.27% | | |
| Websites that Lose Unique Visitors | -15.39% | -12.63% | +2.56% | -24.14% | -19.19% | +4.77% |
| Share of Losing Websites | 53.40% | | | 61.73% | | |
GDPR’s effect on number of page impressions per visit.
We observe that, on average, the number of page impressions per visit increases after GDPR (see Table 5). In line with our analysis in the previous subsection, we examine the effect distribution based on the direction of the change in total visits for the websites. Among the 40.69% (33.30%) of websites that gain total visits 3 (18) months after GDPR (see Figure 6), the number of page impressions increases, on average; however, the usage intensity shows a negative trend. Specifically, the number of page impressions per visit decreases by 2.44% (4.58%).
Table 5: Average Change in Page Impressions per Visit based on Change in Total Visits
| Analyzed Websites | 3 Months: ∆Total Visits | 3 Months: ∆Page Impressions | 3 Months: ∆Page Impressions per Visit | 18 Months: ∆Total Visits | 18 Months: ∆Page Impressions | 18 Months: ∆Page Impressions per Visit |
|---|---|---|---|---|---|---|
| All Treated Websites | -4.88% | -3.12% | +1.97% | -10.02% | -9.28% | +2.15% |
| Websites that Gain Total Visits | +14.92% | +11.39% | -2.44% | +20.45% | +13.22% | -4.58% |
| Share of Gaining Websites | 40.69% | | | 33.30% | | |
| Websites that Lose Total Visits | -18.47% | -25.23% | +5.05% | -25.23% | -20.07% | +5.53% |
| Share of Losing Websites | | | | | | |

The websites that lose total visits, however, exhibit a positive change in page impressions per visit: At 3 (18) months, GDPR increases the number of page impressions per visit by 5.05% (5.53%), as compared with pre-GDPR levels. Thus, the more a website is visited, the lower the usage intensity for these visits in terms of page impressions.
GDPR’s effect on time per visit.
As depicted in Table 6, the websites that increase their total numbers of visits as a result of GDPR experience a drop in the average amount of time per visit: 3 (18) months after the GDPR, the time per visit drops by 3.18% (4.02%). Among websites that experience a decrease in total visits, however, the time per visit increases—at 3 months, the increase is slight (0.82%), whereas at 18 months, it is more substantial (2.14%). These results suggest that, the more a website is visited, the lower the usage intensity in terms of time for these visits.
Table 6: Average Change in Time per Visit based on Change in Total Visits
| Analyzed Websites | 3 Months: ∆Total Visits | 3 Months: ∆Time on Website | 3 Months: ∆Time per Visit | 18 Months: ∆Total Visits | 18 Months: ∆Time on Website | 18 Months: ∆Time per Visit |
|---|---|---|---|---|---|---|
| All Treated Websites | -4.88% | -4.72% | -0.83% | -10.02% | -9.68% | +0.09% |
| Websites that Gain Total Visits | +14.92% | +10.59% | -3.18% | +20.45% | +13.79% | -4.02% |
| Share of Gaining Websites | | | | | | |
| Websites that Lose Total Visits | -18.47% | -15.76% | +0.82% | -25.23% | -20.79% | +2.14% |
| Share of Losing Websites | | | | | | |
GDPR’s effect on bounce rate.
Comparing the change in the number of bouncing visitors to the change in total visits reveals the change in the underlying bounce rate (see Table 7). The average bounce rate across all websites increases slightly after 3 months of GDPR (by 0.81%) and reaches an increase level of 2.51% 18 months after the GDPR. We further observe that websites that experience increases in the total number of visits also experience decreases in bounce rate, of 2.86% (3.76%) after 3 months (18 months) of GDPR. Moreover, websites that experience decreases in the total number of visits also suffer from an increased bounce rate: The bounce rate for these websites is 3.38% (5.67%) higher due to GDPR in the first 3 months (18 months) of GDPR. Thus, the less a website is visited, the more initial visitors bounce.
Table 7: Average Change in Bounce Rate based on Change in Total Visits
| Analyzed Websites | 3 Months: ∆Total Visits | 3 Months: ∆Bouncing Visitors | 3 Months: ∆Bounce Rate | 18 Months: ∆Total Visits | 18 Months: ∆Bouncing Visitors | 18 Months: ∆Bounce Rate |
|---|---|---|---|---|---|---|
| All Treated Websites | -4.88% | -4.35% | +0.81% | -10.02% | -10.16% | +2.51% |
| Websites that Gain Total Visits | +14.92% | +11.09% | -2.86% | +20.45% | +14.83% | -3.76% |
| Share of Gaining Websites | | | | | | |
| Websites that Lose Total Visits | -18.47% | -15.71% | +3.38% | -25.23% | -21.69% | +5.67% |
| Share of Losing Websites | | | | | | |
Variation in the Effects of GDPR as a Function of Characteristics of the Website and the User
In the previous section, we have examined the distribution of GDPR’s effects on user quantity and usage intensity across websites, showing that GDPR has affected websites in very different ways. In what follows, we analyze how the effects of the GDPR on user quantity metrics vary as a function of website characteristics (website industry and website popularity) and user characteristics, namely, users’ country of origin. For each of these analyses, we classify the websites according to the focal feature of interest (e.g., website industry) and calculate the average effect of GDPR on each user quantity metric across all websites within each category (e.g., same industry). The patterns of the results of this analysis are very similar across the different user quantity metrics, as might be expected given the similarity of the results outlined in Figures 5-9. Accordingly, to avoid redundancy, we present the results for only one metric: the total number of visits.
Variation in the effects of GDPR as a function of industry of the website.
As Figure 10 shows, websites from different industries are affected by the GDPR in very different ways. Websites within the “Heavy Industry and Engineering” and “Gambling” industries show the most negative effects, losing an average of almost 50% and 20% of their total visits, respectively, 3 months after GDPR, followed by “Lifestyle”, “Games”, “Arts and Entertainment”, “Reference Materials” and “Hobbies and Leisure” websites. Websites in the “Business and Consumer Services” and “Vehicles” industries experience positive effects throughout the entire observation period. Some website industries exhibit positive effects in the early periods after the GDPR and subsequently experience negative effects. Such industries include “Travel and Tourism” and “E-Commerce and Shopping”.

Figure 10: Distribution of the Effect of GDPR Across Website Industries

Reading example: The value of 4.31% in the upper left panel (i.e., the figure with the GDPR’s 3-month effect across website industries) means that, on average, GDPR increases the total number of visits of the websites in the business and consumer services industry by 4.31%.

Variation in the effects of GDPR as a function of website popularity.
To examine the role of website popularity in GDPR’s effect distribution, we split up the websites’ global, country and industry ranks into deciles. While the country and industry ranks are initially reported separately for each country and industry that a website belongs to, we group the corresponding ranks for all countries and industries together, respectively, for the assignment into deciles. That way, the 10% most popular websites (i.e., the ones with the lowest rank numbers) worldwide, across all countries, and across all industries are part of the 1st decile, while the 10% least popular websites are part of the 10th decile. Figure 11 shows the distribution of the average effect of GDPR on the websites based on the industry rank deciles. Analyses based on the global and country ranks result in similar distributions.

Website popularity plays an important role in the effect distribution: Less-popular websites suffer from more negative effects compared with more-popular ones. Specifically, websites within the four bottom deciles exhibit far more negative effects than do websites within other deciles. While the least popular websites (i.e., those in the bottom decile) suffer the most from GDPR (up to a 21% drop in total visits 18 months after the GDPR), websites within the 6th-9th industry-rank deciles exhibit a drop in the number of visits by, on average, 4.30%-6.23% (10.31%-11.51%) after 3 (18) months. Interestingly, the websites in the top decile (i.e., the most popular websites) show more negative effects, and even more so over time (3.74% after 3 months; 9.04% after 18 months), than websites in the 2nd-5th deciles (1.82%-2.96% after 3 months; 5.82%-7.25% after 18 months). Still, the overall trend shows that users react less negatively to the changes induced by GDPR on websites that are more popular in comparison to the less popular websites, suggesting that the market is more concentrated after GDPR. This increase in market concentration is strongest in the first 3-9 months after GDPR but still exists 1.5 years after the GDPR.

Figure 11: Distribution of the Effect of GDPR Across Deciles of Industry Ranks

Reading example: The value of -3.74% in the upper left panel (i.e., the figure with the GDPR’s 3-month effect across industry rank deciles) means that, on average, GDPR reduces the total number of visits across the most popular (i.e., those in the top decile) websites in each industry by 3.74%. The results of the top decile reflect the change of the 10% highest ranked websites over the 24 industries.

Variation in the effects of GDPR as a function of users’ country of origin.
To examine the relationship between users’ country of origin and the effect of GDPR, we categorized each website according to its most popular user base’s country of origin (recall that our dataset provides user activity data corresponding to the country in which the website is most popular, as well as data corresponding to users in the US). Figure 12 suggests that the effects of the GDPR vary as a function of users’ country of origin. Websites whose primary user base is from Denmark, Poland or Germany suffered the least from GDPR over the analyzed period: 3 months after GDPR, the number of visits from users based in these countries decreased, on average, by 1%, 2.3% and 2.9%, respectively. The strongest drops in website visits were associated with users from Austria, the Netherlands, UK, Hungary, Sweden and Switzerland.
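The assignment of websites to popularity deciles used in the popularity analysis above can be sketched as follows. The sketch assumes a single pooled list of ranks (lower rank number means more popular) and splits it into ten equally sized groups by position; how ties or group sizes that are not divisible by ten are handled in our data is not specified here, so the rule below is only one plausible choice. Website names and ranks are hypothetical.

```python
def assign_deciles(ranks):
    """Map {website: rank} to {website: decile}; decile 1 = most popular 10%."""
    ordered = sorted(ranks, key=ranks.get)  # lowest rank number first
    n = len(ordered)
    return {site: (position * 10) // n + 1 for position, site in enumerate(ordered)}


# Hypothetical ranks for 20 websites: two websites fall into each decile.
ranks = {f"site{i}.example": i for i in range(1, 21)}
deciles = assign_deciles(ranks)
print(deciles["site1.example"], deciles["site20.example"])
```

The per-decile (or per-country) average effect can then be computed by grouping the website-level effects by the resulting category label.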
DISCUSSION
Summary of Results: User Quantity and Usage Intensity
Table 8 summarizes the results of our user quantity and usage intensity analyses and reveals that, on average, and at each time point investigated, the GDPR negatively affected most user quantity and usage intensity metrics. The only metric that was positively affected, on average, after GDPR was the number of page impressions per visit (note that an increase in bounce rate represents a negative development for that metric, as low values are preferred). We further observe that the negative effects of GDPR become stronger over time: 3 months after the GDPR, the user quantity metrics of treated website-instances drop by 0.8%–4.9% on average; yet 18 months after the GDPR, the average values of these metrics are 6.6%-10.2% below their pre-GDPR levels. These findings highlight the importance of investigating the effects of the GDPR over time. Overall, though some websites were affected positively by the GDPR, 62%-67% of the websites experienced negative effects (in terms of user quantity) after 18 months of GDPR.

Figure 12: Distribution of GDPR Effect Across User Countries

Reading example: The value of -1.06% in the upper left panel (i.e., the figure with the GDPR’s 3-month effect across user countries) means that, on average, GDPR reduces the total number of visits of users coming from Denmark by 1.06%.

As outlined in Figure 3, many of our user quantity metrics are dependent on one another: For example, the total number of visits is dependent on the number of unique visitors of a website, and the number of page impressions, time on website and number of bouncing visitors each depend on the total number of visits. Thus, it is not too surprising that the directions of the effects of the user quantity metrics are aligned. This alignment becomes even more apparent when looking at the websites that gain (lose) unique visitors: For the 38% (62%) of websites that increase (decrease) their numbers of unique visitors, all other user quantity metrics exhibit positive (negative) effects as well. However, the effect sizes differ between the user quantity metrics, revealing that the dependencies among the effects are not straightforward, but rather are influenced by post-GDPR changes in the underlying usage intensity.
With regard to usage intensity, the average effect of the GDPR is generally negative in the first 3 months (the bounce rate rises by 0.8%, and the number of visits per unique visitor and the time spent per visit decrease by 1.6% and 0.8%, respectively; only the page impressions per visit show a positive effect, with a 2% increase), but this negative effect weakens over time. After 18 months of GDPR, the number of visits per unique visitor is only 0.6% below the pre-GDPR baseline, and the number of page impressions per visit and the time spent per visit even increase by 2.2% and 0.1%, respectively. Only the bounce rate shows a negative trend: it increases by 2.5%.
Table 8: Summary of Results for Different User Behavior Metrics

Metric                        3 months   6 months   9 months   12 months   18 months

User quantity metrics
Unique Visitors
  Median                       -1.24%     -3.50%     -5.60%     -6.04%      -6.65%
  Mean                         -0.77%     -3.27%     -5.50%     -6.18%      -6.61%
  Effects ≠ 0                  54.15%     66.14%     73.43%     75.82%      80.52%
Total Visits
  Median                       -3.49%     -5.54%     -7.54%     -8.24%      -8.91%
  Mean                         -4.88%     -7.22%     -9.07%     -9.57%     -10.02%
  Effects ≠ 0                  49.93%     62.78%     70.64%     74.05%      78.83%
Page Impressions
  Median                       -2.75%     -3.92%     -5.44%     -6.04%      -9.29%
  Mean                         -3.12%     -4.83%     -6.33%     -6.48%      -9.28%
  Effects ≠ 0                  46.53%     58.56%     66.21%     70.51%      76.06%
Time on Website
  Median                       -4.51%     -6.07%     -8.42%     -9.19%      -9.50%
  Mean                         -4.72%     -6.84%     -8.89%     -9.87%      -9.68%
  Effects ≠ 0                  43.88%     55.71%     64.30%     67.72%      74.18%
Bouncing Visitors
  Median                       -4.14%     -6.77%     -8.94%     -9.48%     -10.16%
  Mean                         -4.35%     -7.28%     -9.48%     -9.94%     -10.16%
  Effects ≠ 0                  47.22%     59.92%     68.73%     73.10%      79.00%

Usage intensity metrics
Visits per Unique Visitor
  Median                       -2.62%     -2.40%     -2.20%     -2.23%      -2.81%
  Mean                         -1.62%     -1.53%     -0.97%     -0.82%      -0.59%
Page Impressions per Visit
  Median                       +0.56%     +1.32%     +1.86%     +2.09%      +0.28%
  Mean                         +1.97%     +2.83%     +3.58%     +4.39%      +2.15%
Time per Visit
  Median                       -1.96%     -0.95%     -0.92%     -1.27%      -0.93%
  Mean                         -0.83%     -0.48%     -0.57%     -0.93%      +0.09%
Bounce Rate
  Median                       -0.57%     -0.88%     -1.04%     -0.93%      -0.58%
  Mean                         +0.81%     +0.70%     +0.79%     +1.22%      +2.51%
Rows 1-5 show a summary of GDPR’s effect on user quantity metrics (see also Figures 5-9) and rows 6-9 on the usage intensity metrics (see also Tables 4-7). The table shows the mean and median values of the change in the metrics due to GDPR over all websites in each of the analyzed periods. For the user quantity metrics, the share of effects that are significantly different from zero (at the 5% level) is reported for each period. For example, the 3-month effect of GDPR on total visits over all websites (Total Visits row, 3 months column) was on average -4.88%, the median effect was -3.49%, and 49.93% of the websites were significantly affected.

The effect distribution across websites reveals that the GDPR’s effects on usage intensity metrics somewhat balance out the effects on user quantity. While the direction of the GDPR’s effects becomes positive for only some usage intensity metrics, the alignment among these usage intensity metrics becomes clear again when the websites are divided according to whether their user quantity metrics increase or decrease. Among websites that lose unique visitors after GDPR, the remaining visitors use the website more intensively than they did pre-GDPR: Each user generates more visits to those websites (e.g., 4.8% more visits per unique visitor 18 months post-GDPR) and engages more with the websites in each visit, as reflected in increases in the number of page impressions (+4.6%) as well as the time spent per visit (+2.1%). The same users, however, exhibit an increased bounce rate, although the absolute number of bouncing visitors decreases, suggesting that visitors who once bounced from a website might be less likely to return to it.
Websites that gain unique visitors experience the opposite effect: The number of visits per visitor is lower post-GDPR than pre-GDPR (e.g., 9% lower at 18 months post-GDPR), and these visitors spend less time on each visit (-4%) and view fewer pages per visit (-4.6%) as compared with pre-GDPR visitors. Along with the increasing total number of visits, the number of bouncing visitors rises as well, although the bounce rate slightly decreases.

Together, these results suggest that the GDPR negatively affects the average website in one of two major ways: Either the website experiences difficulties in attracting users, or, having attracted users, it struggles to keep them engaged and get them to return.
Differential Effects of the GDPR as a Function of Website and User Characteristics
Our results show that the effects of the GDPR varied across different websites. In particular, less popular websites were hurt the most: For example, 18 months post-GDPR, the 10% least popular websites experienced an average drop of 21% in the total number of visits. The 10% most popular websites, in contrast, experienced an average drop of only 9% in total visits. These results suggest that the GDPR increased market concentration. These effects may reflect users’ stronger motivation to continue using more popular and valued websites despite any potential disadvantages created by the GDPR, such as users’ heightened awareness of data disclosure or diminished convenience of use due to website compliance. For less popular websites, users may be less likely to feel that the benefits of continued use outweigh the disadvantages.

We further observed that the effects of the GDPR varied across websites from different industries. For example, the most negatively affected websites included those in the Entertainment and Lifestyle segment (7.4%–13.8% decrease in total visits after 18 months of GDPR). Other types of websites experienced positive effects (e.g., Vehicles, with an increase of 3.9% in total visits after 18 months). These effect differences may indicate differences in users’ expectations regarding privacy across industries. For example, users visiting entertainment websites may previously have been less aware of data collection compared with users on, e.g., e-commerce websites, where it is necessary to provide information to purchase products. Consequently, highlighting data collection practices may have been more “surprising” to users of entertainment websites and had more of an effect on their behavior. Likewise, as in the case of more popular websites, users seeking services in domains that they deem more necessary may feel that the advantages of continuing to use the website outweigh the disadvantages, and in some cases they may even value the added safeguards on their privacy.

Finally, we observed that the GDPR’s effects differed across users from different countries of origin, reflecting cultural differences across countries. For example, users from Denmark, Poland, Germany, Italy and Spain reacted less negatively to the GDPR compared with users from the Netherlands, Sweden and the UK.
ANALYSIS OF GDPR’S ECONOMIC EFFECT ON WEBSITES
Our study focused on quantifying the effect of the GDPR on user quantity and usage intensity. However, what is truly of interest in policy makers’ tradeoffs is the extent to which the GDPR damages companies’ revenue, as this damage is likely to cause downstream societal effects. In what follows, we show how our findings can be used to generate a back-of-the-envelope estimation of the magnitude of possible economic effects on the websites, resulting from changes in user behavior due to GDPR. In these estimations, we rely on the average effects of GDPR after 18 months as a basis for our calculations. We present two different estimations, corresponding to two kinds of websites: 1) websites that earn money by selling products, i.e., e-commerce websites, and 2) websites that earn money by displaying ads.
Analysis of GDPR’s Economic Effect on E-Commerce Websites
For the e-commerce websites within this study’s sample, the average drop in total visits at 18 months post-GDPR amounted to 3.37% (see Figure 10). The determining revenue factors of an e-commerce website w are the number of visits (i.e., non-unique visitors), the conversion rate (i.e., the share of visits resulting in a purchase) and the revenue per purchase:

\[ \text{Revenue} = \text{Number of Visits} \times \text{Conversion Rate} \times \text{Revenue per Purchase} \tag{6} \]

Based on the Q1 2020 e-commerce benchmarks by Monetate (2020), the revenue per purchase globally is $105.99 and the average conversion rate per visit is 1.91%. Looking at the e-commerce websites within the website sample of this study, the average yearly total number of visits across all countries amounted to 70,461,862. Thus, the average yearly revenue for an e-commerce website within this study’s sample before GDPR amounts to:

\[ \text{Revenue}_{\text{avg.}} = 70{,}461{,}862 \times 1.91\% \times \$105.99 = \$142{,}643{,}623.54 \tag{7} \]

Because revenue in Equation (6) is proportional to the number of visits, the average drop in total visits 18 months (= 1.5 years) after the GDPR translates into a proportional drop in the revenue earned over that period:

\[ \text{Revenue Change}_{18\ \text{months}} = -3.37\% \times \$142{,}643{,}623.54 \times 1.5 = -\$7{,}209{,}722.73 \tag{8} \]

A decrease of 3.37% in total visits due to an increase in privacy standards can thus decrease the revenue of an average e-commerce website by over $7 million in the first 18 months after enforcement of the privacy law.
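The back-of-the-envelope calculation for e-commerce websites can be retraced with a short sketch. It takes the inputs exactly as printed in the text; the function and variable names are hypothetical. Because the printed inputs are rounded, the outputs agree with the reported figures only approximately (roughly $142.6 million in yearly revenue and a change of about -$7.2 million over 18 months).

```python
def ecommerce_revenue(visits: float, conversion_rate: float,
                      revenue_per_purchase: float) -> float:
    """Yearly revenue = visits * conversion rate * revenue per purchase."""
    return visits * conversion_rate * revenue_per_purchase


# Inputs as printed in the text (rounded values).
YEARLY_VISITS = 70_461_862      # average yearly visits per e-commerce website
CONVERSION_RATE = 0.0191        # share of visits resulting in a purchase
REVENUE_PER_PURCHASE = 105.99   # US dollars per purchase
VISIT_CHANGE = -0.0337          # average change in total visits after 18 months
YEARS = 1.5                     # 18 months expressed in years

yearly_revenue = ecommerce_revenue(YEARLY_VISITS, CONVERSION_RATE,
                                   REVENUE_PER_PURCHASE)

# Revenue is proportional to visits, so the relative drop in visits is
# applied to the revenue earned over the 1.5-year window.
revenue_change = VISIT_CHANGE * yearly_revenue * YEARS

print(f"Yearly revenue before GDPR: ${yearly_revenue:,.2f}")
print(f"Revenue change over 18 months: ${revenue_change:,.2f}")
```

The sketch follows the text in applying the 18-month effect to the entire 1.5-year window, which is a simplification, since the estimated effects are smaller in the earlier periods.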
Analysis of GDPR’s Economic Effect on Ad-Based Websites
For an ad-based website w, the determining factors for the revenue are the number of page impressions, the number of ads displayed per page impression and the price per ad impression:

\[ \text{Revenue} = \text{No. of Page Impressions} \times \text{Ads per Page Impression} \times \text{Ad Price} \tag{9} \]
As an example of an ad-based industry, we examine the economic effect of the privacy standard increase on websites within the News and Media industry. In our sample, the average number of yearly page impressions on a news and media website across all regions was 358,859,344. A random sample of the homepages and article pages of 7 key news websites (nytimes.com, huffpost.com, washingtonpost.com, news.yahoo.com, bbc.com, wsj.com and cnn.com) shows an average of 7.6 ads per page. Based on both ComScore (2010) and TheBrandOwner (2017), the average CPM (cost per thousand ad impressions) for news websites lies between $7 and $8 (here, $0.0075 per impression). Using these values, the total yearly revenue for an average news website prior to a privacy standard increase for the analyzed website sample amounts to:

\[ \text{Revenue}_{\text{avg.}} = 358{,}859{,}344 \times 7.6 \times \$0.0075 = \$20{,}454{,}982.61 \tag{10} \]

The average effect of GDPR on page impressions after 18 months on our sample of news and media websites is a drop of 8.05%. This reduction in the number of page impressions decreases the revenue of news websites significantly:

\[ \text{Revenue Change}_{18\ \text{months}} = -8.05\% \times \$20{,}454{,}982.61 \times 1.5 = -\$2{,}469{,}953.43 \tag{11} \]

A decrease in page impressions of 8% due to an increase in privacy standards can thus decrease the revenue of an average news and media website by almost $2.5 million in the first 1.5 years after the privacy standard increase.
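The estimate for ad-based websites can be retraced in the same way. The sketch below uses the inputs as stated in the text (page impressions, ads per page and a price of $0.0075 per ad impression); the function and variable names are hypothetical. With these inputs, the product is roughly $20.5 million per year and the 18-month change is close to -$2.47 million; small deviations from the reported result can arise from rounding of the inputs.

```python
def ad_revenue(page_impressions: float, ads_per_page: float,
               price_per_ad_impression: float) -> float:
    """Yearly revenue = page impressions * ads per page * price per ad impression."""
    return page_impressions * ads_per_page * price_per_ad_impression


# Inputs as stated in the text (rounded values).
YEARLY_PAGE_IMPRESSIONS = 358_859_344  # average news and media website
ADS_PER_PAGE = 7.6                     # from the sample of 7 news websites
PRICE_PER_AD_IMPRESSION = 0.0075       # US dollars, i.e., a CPM of $7.50
IMPRESSION_CHANGE = -0.0805            # average change after 18 months
YEARS = 1.5                            # 18 months expressed in years

yearly_revenue = ad_revenue(YEARLY_PAGE_IMPRESSIONS, ADS_PER_PAGE,
                            PRICE_PER_AD_IMPRESSION)

# Revenue is proportional to page impressions, so the relative drop in
# impressions is applied to the revenue earned over the 1.5-year window.
revenue_change = IMPRESSION_CHANGE * yearly_revenue * YEARS

print(f"Yearly revenue before GDPR: ${yearly_revenue:,.2f}")
print(f"Revenue change over 18 months: ${revenue_change:,.2f}")
```

As in the e-commerce case, the result scales linearly with each input, so a different assumption about the number of ads per page or the CPM changes the estimate proportionally.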
CONCLUDING REMARKS