Under the Spotlight: Web Tracking in Indian Partisan News Websites
Vibhor Agarwal, Yash Vekaria, Pushkal Agarwal, Sangeeta Mahapatra, Shounak Set, Sakthi Balan Muthiah, Nishanth Sastry, Nicolas Kourtellis
Vibhor Agarwal ★ 𝛼, Yash Vekaria ★ 𝛼, Pushkal Agarwal 𝛽, Sangeeta Mahapatra 𝛾, Shounak Set 𝛽, Sakthi Balan Muthiah 𝛼, Nishanth Sastry 𝛿, Nicolas Kourtellis 𝜁
𝛼 The LNM Institute of Information Technology, Jaipur, India
𝛽 King’s College London, London, United Kingdom
𝛾 German Institute for Global and Area Studies, Hamburg, Germany
𝛿 University of Surrey, Surrey, United Kingdom
𝜁 Telefonica Research, Barcelona, Spain
𝛼 {vibhor.agarwal.y16, yash.vekaria.y16, sakthi.balan}@lnmiit.ac.in, 𝛽 {pushkal.agarwal, shounak.set}@kcl.ac.uk, 𝛾 [email protected], 𝛿 n.s[email protected], 𝜁 [email protected]
ABSTRACT
India is experiencing intense political partisanship and sectarian divisions. The paper performs, to the best of our knowledge, the first comprehensive analysis on the Indian online news media with respect to tracking and partisanship. We build a dataset of 103 online, mostly mainstream news websites. With the help of two experts, alongside data from the Media Ownership Monitor of the Reporters without Borders, we label these websites according to their partisanship (Left, Right, or Centre). We study and compare user tracking on these sites with different metrics: numbers of cookies, cookie synchronizations, device fingerprinting, and invisible pixel-based tracking. We find that Left and Centre websites serve more cookies than Right-leaning websites. However, through cookie synchronization, more user IDs are synchronized in Left websites than Right or Centre. Canvas fingerprinting is used similarly by Left and Right, and less by Centre. Invisible pixel-based tracking is 50% more intense in Centre-leaning websites than Right, and 25% more than Left. Desktop versions of news websites deliver more cookies than their mobile counterparts. A handful of third-parties are tracking users in most websites in this study. This paper, by demonstrating intense web tracking, has implications for research on overall privacy of users visiting partisan news websites in India.
India represents the largest and the most diverse news media market among democracies, with more than 100,000 registered newspapers and 400 news channels in 22 official languages. The growth of online news has been the fastest in the emerging markets, with India ranking among the top ten globally when it comes to print and online news media [21]. Unfortunately, this growth of online political communications has been accompanied by rising partisanship [9, 25]. The mainstream news media, as major agents of information and influence, become important here.

This paper focuses on major news websites in terms of how they track their users. Tracking allows them to obtain rich information about readers, which may serve their business interest in revenue generation through targeted ads, as well as their political interest in setting agendas. There have been US-based studies about partisan media mostly in terms of their polarizing effects [5, 16, 38] and a few on tracking [2, 24]. For India, while there have been a few works on the division in the news media along partisan lines [27], there is a lack of comprehensive, data-driven research on news websites and tracking behavior. Indian news media are a major source of information for the population [34]. Their tracking behavior has socio-political implications as they are, by and large, a trusted source of public information [22].

In this work, for the first time, we provide a comprehensive study of the news websites in India with respect to partisanship and tracking of online users. We focus on the online platforms of the largest English, Hindi, and regional language news media (including those with print or broadcast platforms and the digital only ones) that can reach more than 77% of India’s population, making them vulnerable to tracking. We first identify the major Indian news publications based on their circulation figures from the Registrar of Newspapers for India (RNI) supplemented with Indian Readership Survey of Q4 2019. We then create a list of 103 news websites, curated primarily from Alexa [3] and Feedspot [14]. Secondly, with the help of two experts in political science and journalism, alongside data from the Media Ownership Monitor of the Reporters without Borders, which traces associations between the media and political parties and corporate interests [28], we label the 103 websites according to their partisanship as Right-, Left-, Centre-leaning, or Unknown (methodology explained in Section 3).

We address the following questions: RQ1: What is the extent of tracking across partisan news websites? RQ2: What kind of tracking methods are used on users? To answer them, we measure the intensity of user tracking across partisan websites with simple and advanced mechanisms: basic first and third-party cookies, cookie synchronization, device fingerprinting, and invisible pixel-based tracking (Section 4).

We share our Dataset, OpenWPM Crawls, and Codes publicly with the research community for reproducibility and extension of our work. From this study, we derive the following key findings (Section 5). The 103 Indian news websites studied have more than 100K cookies, for an average of over 100 cookies per website, but several websites have a much higher number of cookies. For example, more than 1,400 cookies are set on Sandesh.com, by itself and its third-parties. Left- and Centre-leaning websites serve more (median) cookies than Right-leaning websites. Desktop versions of websites set more cookies than their mobile versions, with interesting exceptions. Third-party domain doubleclick.net is present in 86% of news websites; such ubiquitous presence allows the tracking of a huge proportion of users’ browsing histories.

In addition to the large numbers of cookies, we also find evidence of practically every known advanced method of user fingerprinting. Around 18% of all distinct third-parties, and 25% of all distinct first-parties in our data are involved in cookie synchronization. Around 50% of unique user IDs are synced across tracking domains through cookie synchronization. Cookie synchronization is higher among Left-leaning websites and their third-parties than for Right- and Centre-leaning websites. Over 25% of news websites use device fingerprinting, which is invisible to the user and invasive to their online privacy. Around 25.7% of Left, 23.7% of Right, and 17.9% of Centre websites employ different fingerprinting scripts to track users. More than 2.5K invisible (1x1 pixel) images (i.e., 23% of all sent images) are detected on news website homepages. Invisible pixel-based tracking is employed more by Centre, followed by Left and then the Right websites.

★ Both the authors contributed equally to this research.
Registrar of Newspapers for India: http://rni.nic.in/
https://bestmediainfo.in/mailer/nl/nl/IRS-2019-Q4-Highlights.pdf
Media Research Users Council: https://bestmediainfo.in/mailer/nl/nl/IRS-2019-Q4-Highlights.pdf
Broadcast Audience Research Council, India: https://barcindia.co.in/
We briefly discuss here the partisan nature of Indian news websites as well as online tracking techniques studied in literature.
Partisan nature of Indian news: This paper takes partisanship to mean an adherence to the political beliefs and identification with a political party or cause, manifesting positively as a civic ideal of shared values or negatively as a pathology where loyalty to a party’s ideology/values/goals may trump logic and tolerance to other political views [39]. While numerous political parties exist in India, the three broad strands of political worldviews correspond to three principal political formations at the national level of Indian politics: “Left” represented by parties like the Communist Party of India (Marxist), “Right to Right-Centre” represented by the Bharatiya Janata Party, and “Left-Centre” corresponding to the Indian National Congress. As India is a highly diverse country with its political parties and media reflecting this diversity, we take Right-leaning news media to correspond with the Right to Right of the Centre spectrum of ideologies, the Left-leaning news media to correspond with the Left to Left of the Centre spectrum, and the Centre-leaning media to be positioned in between the Right-Centre and the Left-Centre. The growth of heightened political partisanship may have a dramatic impact on media behavior and their influence on public opinion, especially if they intensely track users.
Online tracking ecosystem and measurements: With the rise of online information consumption, online platforms have attracted third parties for online advertising [26, 30]. These advertisements are strategically drafted and placed on websites to get more user attention including pop-ups and banners [26, 35]. These websites track users by injecting cookies at the users’ side [7, 10, 20, 37] for content personalization and improving user experience. However, cookies and other data are also shared with other third parties, raising privacy concerns. Users have an option to accept or reject these third-party cookies, but many users are not aware of the consequences if they accept them. These websites also use more sophisticated tracking techniques like cookie synchronization [1, 2, 11, 20, 31, 36], device fingerprinting [11, 29], and invisible (1x1) pixel-based tracking [15]. Since users are often unaware of their presence, such methods pose a greater privacy threat to the websites’ visitors. Studies have shown that some popular trackers like Doubleclick and Google Analytics (both Google trackers) can be present in up to 50% and 70%, respectively, of top one million visited websites [11]. Specifically, news websites have seen a large volume of trackers and advertisements including political campaigns [2, 11, 32]. Among USA news websites, Right-leaning websites track users more and have high cookie synchronization within the partisan group websites [2]. Having said that, less is known about the tracking ecosystem of Indian news media, which has recently seen exponential growth in online consumption. There are studies in online engagement (including social media) showing polarization and media bias, but none covers the exposure of user data to the tracking world [8, 25, 33]. With our work, we aim to fill this gap by measuring the extent to which users are exposed to a high amount of web tracking, using the aforementioned four tracking techniques. We also explore tracking on desktop and mobile platforms in Indian news media with partisan leanings.

Data and code available at http://tiny.cc/india-tracking
Here, we discuss the methodology followed to curate a list of top news websites in India, including metadata crawled for each using Feedspot [14] and Alexa.com [3], and to label these websites based on their political leanings (Sec. 3.1). Furthermore, in Sec. 3.2, we provide details of our website traffic crawling using OpenWPM [11, 12], a tool for desktop browser automation and crawling, and Cookies.txt [17], a browser plug-in for mobile browser automation.
We follow the methodology outlined in Figure 1 (left part) for website list creation and partisanship labeling.
List Creation: We first examined a list of 141 top Indian news websites on the Web (ranked as on 28 April 2020) provided by Feedspot [14]. This website, maintained by over 25 experts, is updated daily and covers a wide range of factors to rank and discover the most prominent online news websites in India. They curate websites whose publishers explicitly publish their content via Feedspot, as well as by monitoring search engines and social media through in-house media tools. The next list of websites we studied is from Alexa (29 April 2020) [4]. Alexa Internet, Inc., is an American Web traffic analysis company, whose toolbar gathers information of around 30 million websites across the globe, based on their internet browsing behavior and traffic patterns. Their website stores the data and provides extensive analysis of the websites. From Alexa, we got a list of 49 top Indian news websites based on their online popularity and traffic. Some of them were common with the Feedspot data. We combined Feedspot and Alexa lists to obtain a list of 153 websites.
[Figure 1 diagram labels: Sources of Indian News Websites; Curation; Labelling of Political-leaning (verified by Political Science Experts); Indian News Websites (Right, Left, and Centre Leaning); OpenWPM (Stateless Crawl, Stateful Crawl; Central India); SQLite DB; Interactions between Partisan Websites and Web Trackers (HTTP(s) Responses, Tracking-related Data); Web Tracking Ecosystem; Privacy Analysis (Cookie-based Tracking, Cookie Synchronization, Device Fingerprinting, Invisible Pixel-based Tracking).]
Figure 1: Our framework for labeling Indian news websites along partisan lines and collecting web traffic data for studying web tracking mechanisms. Colors represent party-leaning: Right=Blue, Centre=Yellow, and Left=Red.
A large portion of news consumption in India happens through online platforms (Facebook, Twitter, and Instagram) rather than TV/Radio [34]. Therefore, we further augment our data by visiting each website’s Facebook, Twitter, and Instagram pages for metadata collection. After opening a particular website on Facebook, Twitter or Instagram, we performed (in April 2020) a breadth-first search on other ‘Indian news page recommendations’ shown in the right-side panel under the heading of “Related Pages” in Facebook, “You might like” in Twitter, and “Related Accounts” at the bottom in Instagram. We added to our list all Indian news media shown in recommendations (as described above) while visiting the social media pages of initially curated websites. In the second iteration, we repeated this with newly collected news media from the first iteration. We repeated this approach up to five times, by which we observed that 90% of recommendations were already in our dataset. Using this approach, we added to our list 65 new Indian news media leading to a total of 218 websites. Then we removed websites with inactive web pages and retained only those which had more than 10K followers on at least one of the three social media platforms investigated (to ensure we only include the popular ones). Our final list has 123 Indian news websites, spanning nine languages and 28 states. All have an online website, which can be freely accessed over the internet. Out of 123 websites, 10.56% are popular as TV channels, 53.66% are print media and the remaining 35.78% only have a website (no TV channel or print media). We determine popularity in terms of viewership/readership in TV/print media.
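The iterative expansion over social media recommendations can be illustrated with a short sketch. It is not the procedure used in the study: the callback `get_recommendations` is a hypothetical stand-in for looking up the recommended Indian news pages of a given page, and the stopping rule (stop once at least 90% of the recommendations seen in an iteration are already known, or after five iterations) is one possible reading of the description above.

```python
def expand_by_recommendations(seeds, get_recommendations,
                              max_iterations=5, stop_overlap=0.9):
    """Breadth-first expansion of a seed list of news pages.

    get_recommendations(page) is a hypothetical callback returning the
    pages recommended next to `page` on a social media platform.
    """
    known = set(seeds)
    frontier = list(seeds)
    for _ in range(max_iterations):
        total = already_known = 0
        next_frontier = []
        for page in frontier:
            for rec in get_recommendations(page):
                total += 1
                if rec in known:
                    already_known += 1
                else:
                    known.add(rec)
                    next_frontier.append(rec)
        # Stop when nothing new was found or most recommendations repeat.
        if not next_frontier:
            break
        if total and already_known / total >= stop_overlap:
            break
        frontier = next_frontier
    return known


# Toy recommendation graph (invented page names, for illustration only).
TOY_GRAPH = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": []}
pages = expand_by_recommendations(["a"], lambda p: TOY_GRAPH.get(p, []))
```

Filtering by activity and follower count (more than 10K followers on at least one platform) would then be applied to the returned set as a separate step.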
Website Labeling: In order to understand and categorize websites based on their partisan leanings, we undertook a three-step labeling process. First, we approached two political science and journalism experts who manually coded the political leanings of these websites. This approach has been used by media monitors at Buzzfeed News in past studies to review political leaning in the US news ecosystem. Second, we checked for their partisan associations from Media Ownership Monitor [28] including data on parent company. The labeling was then done along a spectrum of Right (Conservative: Right to Right-Centre), Left (Liberal: Left to Left-Centre), and Centre (i.e., less biased or a combination of both Left and Right, that is, when the same parent company has two ideologically different news sites) categories based on ownership and ideological association. Twenty websites were discarded due to uncertainty in their leaning, and the remaining 103 websites were labeled with a partisan leaning and considered for our study. The inter-annotator agreement between experts, measured by Cohen’s Kappa, is 0.97. Throughout the paper, we use this categorization, with short names: “Left” for “Left to Left-Centre”, “Right” for “Right to Right-Centre”, and “Centre” for “Centrist or representing view-points of Right and Left”. Our dataset consists of Left-, Centre-, and Right-leaning websites.
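For readers unfamiliar with the agreement measure, the following minimal sketch shows how Cohen’s Kappa is computed for two annotators. The label lists are invented for illustration and are not the experts’ annotations from this study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from two annotators for six websites.
expert_1 = ["Left", "Left", "Right", "Right", "Centre", "Centre"]
expert_2 = ["Left", "Left", "Right", "Centre", "Centre", "Centre"]
kappa = cohens_kappa(expert_1, expert_2)
```

A value close to 1, such as the 0.97 reported above, indicates near-perfect agreement beyond what chance alone would produce.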
We start our data collection using OpenWPM [11] by performing five stateless crawls, visiting the websites’ homepages from Central India between August 10 and August 30, 2020. Stateless crawls make each website visit independent. Parallel browser instances were launched to allow multiple, simultaneous crawls of these news websites from a single location. We performed such crawls across different times and days to account for infrequent but unavoidable network errors during each crawl. We recorded more than 100K cookies in total.

We also performed five time-variant and order-variant, stateful crawls of the websites’ homepages from September 01, 2020 to September 15, 2020. Stateful crawls are important since we want to study tracking mechanisms such as cookie synchronization (CS). CS requires state information to be maintained across different websites and visits, to detect if user IDs from previous visits are being synced in future visits and with other websites and their third-parties. Time-variance is applied by crawling on different days with days-long time between crawls. Order-variant means the websites are visited in a shuffled order for each crawl, for the results to be independent of the website ordering. In stateful crawls, no parallel browser instances are launched, so that third-parties that engage in cross-site tracking of users can be detected.

For 23 of the 103 websites, we also find manually that they serve separate mobile versions. Therefore, we perform five additional crawls for these mobile websites to compare tracking behavior in desktop websites and their mobile counterparts. The crawling for mobile websites uses Cookies.txt, a Firefox Plug-In [17], to get browser cookies information. We automate this process using Selenium. At first, a Firefox browser is set to not block any type of cookies. Further steps include opening a Firefox Mobile Emulator in an incognito mode, loading the plug-in, visiting the mobile versions of the websites’ homepages (e.g., m.timesofindia.com), and storing cookies information. In these five crawls, we store 1400 cookies in total.

In this section, we detail the methodology to measure various tracking methods used by Indian news websites and the associated ad-ecosystem (Figure 1, right part).
To perform the cookie-based analysis, we use the javascript_cookies table of the SQLite dump from the OpenWPM crawled data. This data provides information on all different types of cookies being set by different domains. In addition, we use the Disconnect List (https://github.com/disconnectme/disconnect-tracking-protection), which is extensively used by the research community to report known tracking domains, and categorize them into eight distinct categories: Advertising, Analytics, Content, Social, Fingerprinting, Cryptomining, Disconnect, and Unknown. We use this list to understand the distribution of cookies across these categories.

Cookie synchronization (CS) is a cross-site tracking mechanism that enables two trackers to generate a detailed browsing profile of the user, by sharing unique user IDs with each other. CS circumvents the Same-Origin Policy (SOP), which allows tracking domains to access only cookies set by them. Past works have studied CS in different contexts [1, 2, 11, 13, 20, 31, 36]. However, CS has never been studied specifically for Indian news websites along partisan lines or with respect to the privacy implications that it has in the context of India. CS can be abstracted as a two-step process. In the first step, a unique user ID is exchanged between two third-parties (TPs) in the form of HTTP(s) requests, responses, or redirects in an effort to learn the identity of the given user on the web. This ID can be used to aggregate user information by a variety of means [19] through step two. In the second step, domains exchange or merge the identified user’s data including browsing histories, browsing patterns, and interests through a separate “data sharing channel” to build a complete, consolidated user profile.
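The cookie categorization step described at the start of this section (counting cookies in the javascript_cookies table per Disconnect category) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of the study: the table is assumed to expose a `host` column, the demo database and its rows are invented, `CATEGORY_BY_DOMAIN` is a tiny hypothetical excerpt rather than the real Disconnect List, and `base_domain` is a naive stand-in for proper domain resolution.

```python
import sqlite3
from collections import Counter

# Hypothetical two-entry excerpt standing in for the Disconnect List.
CATEGORY_BY_DOMAIN = {
    "doubleclick.net": "Advertising",
    "google-analytics.com": "Analytics",
}


def base_domain(host):
    """Naive stand-in: keep the last two labels of a cookie host."""
    labels = host.lstrip(".").split(".")
    return ".".join(labels[-2:])


def cookies_per_category(conn):
    """Count cookies per tracker category; unlisted domains are 'Unknown'."""
    counts = Counter()
    # Assumption: the cookie table has a `host` column naming the setter.
    for (host,) in conn.execute("SELECT host FROM javascript_cookies"):
        counts[CATEGORY_BY_DOMAIN.get(base_domain(host), "Unknown")] += 1
    return counts


def demo_connection():
    """In-memory database with invented rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE javascript_cookies (host TEXT, name TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO javascript_cookies VALUES (?, ?, ?)",
        [
            (".doubleclick.net", "IDE", "abc"),
            ("stats.g.doubleclick.net", "id", "def"),
            (".google-analytics.com", "_ga", "ghi"),
            ("news.example", "session", "xyz"),
        ],
    )
    return conn


category_counts = cookies_per_category(demo_connection())
```

With a real crawl, the connection would instead point at the SQLite file produced by OpenWPM, and the mapping would be loaded from the published Disconnect List.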
Privacy impact: Tracking and targeting based on CS primarily help advertisers [23], especially in programmatic (real-time bidding) advertising, where data sharing and purchasing involves CS for better targeting [18]. As a result of CS, trackers are able to track a given user over a larger set of websites, where they may not even be embedded as TPs. In fact, repetitive CS across websites can enrich a particular user’s profile built by trackers, helping them to precisely track and target a user over time. Also, server-to-server exchanges of user data (CS step 2 above) have become common [11], enabling deeper user profiling.
Methodology: We capture CS for websites in our dataset using a methodology similar to that of past studies [1, 13, 31]. We use the fundamental structure of the open-source python code from [1] (referred to as CSCode hereafter) and make modifications to work for our scenario: unlike [1], which crawled data simultaneously on two machines before analyzing them with CSCode, we perform time-variant crawls (Sec. 3.2).

For each crawl, we detect CS for each leaning group and a combination of them. For example, while studying CS between Left and Right, we iterate over all distinct pairs of websites (w1,w2) where w1 is any website which is Left only, while w2 is Right only (with w1!=w2 and (w1,w2) ≡ (w2,w1)). Since we have 39 Left and 37 Right websites, there are 39x37=1443 total pairs. For intra-party comparisons like Right-Right for instance, the total unique pairs will be computed as $\binom{37}{2} = 666$. Next, for each pair, we consider all the HTTP(s) request, response, and cookies data related to w1 and w2, and use CSCode to search for IDs synced between first-parties (FPs) and TPs while visiting w1 and w2. We try all possible combinations of website pairs falling into different partisan lines, i.e.:
• $w_1 \in W_L$ and $w_2 \in W_L$; $w_1 \in W_R$ and $w_2 \in W_R$
• $w_1 \in W_C$ and $w_2 \in W_C$; $w_1 \in W_L$ and $w_2 \in W_R$
• $w_1 \in W_L$ and $w_2 \in W_C$; $w_1 \in W_R$ and $w_2 \in W_C$

Since [1] is an older paper on CS, we validated CSCode, as well as various parameters used, with recent works on CS [2, 20, 31, 36]. We made the following key changes to ensure result correctness. First, for each URL, CSCode in [1] extracts the top-level domain (e.g., com from rtb.gumgum.com). However, it is not relevant to study CS across such top-level domains. Instead, we follow [31] and map all domains (from cookies, requests, response URLs, etc.) to the high-level domains returned by the WhoIS tool (e.g., rtb.gumgum.com is mapped to gumgum.com as obtained from WhoIS). Second, CSCode constrains the minimum length of an ID to be […] characters. However, [36] suggests discarding shorter IDs, since they do not contain sufficient entropy to represent a user ID. We follow [31] and use a threshold of […] characters to minimize false positives. Interestingly, the shortest ID detected in our data is […] characters long. Third, we upgraded CSCode to support python3 and dependencies.
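The pair enumeration and the ID matching idea above can be illustrated with a simplified sketch. It is not CSCode and does not reproduce its heuristics: IDs are matched only as exact URL parameter values, `base_domain` is a naive stand-in for the WhoIS-based mapping, and `MIN_ID_LEN` is a hypothetical placeholder value, not the threshold used in this study.

```python
from itertools import combinations, product
from urllib.parse import parse_qsl, urlsplit

MIN_ID_LEN = 8  # hypothetical placeholder, NOT the paper's threshold


def base_domain(host):
    """Naive stand-in for WhoIS mapping: keep the last two host labels."""
    labels = host.lstrip(".").split(".")
    return ".".join(labels[-2:])


def cross_pairs(group_a, group_b):
    """All (w1, w2) pairs with w1 from one leaning and w2 from another."""
    return list(product(group_a, group_b))


def intra_pairs(group):
    """All unordered pairs of distinct websites within one leaning."""
    return list(combinations(group, 2))


def synced_ids(cookies, request_urls, min_len=MIN_ID_LEN):
    """Find cookie values that reappear as URL parameters sent elsewhere.

    cookies: iterable of (cookie_host, value) tuples.
    request_urls: iterable of request URLs observed during the visits.
    Returns a set of (id, owner_domain, receiver_domain) triples.
    """
    owners = {}
    for host, value in cookies:
        if len(value) >= min_len:
            owners.setdefault(value, set()).add(base_domain(host))
    found = set()
    for url in request_urls:
        parts = urlsplit(url)
        receiver = base_domain(parts.hostname or "")
        for _, param_value in parse_qsl(parts.query):
            for owner in owners.get(param_value, ()):
                if owner != receiver:
                    found.add((param_value, owner, receiver))
    return found


# Invented example: tracker-a's cookie ID is passed to tracker-b in a URL.
example_cookies = [(".tracker-a.example", "u1234567890abcdef")]
example_urls = [
    "https://sync.tracker-b.example/pixel?uid=u1234567890abcdef&x=1",
    "https://tracker-a.example/p?uid=u1234567890abcdef",
]
matches = synced_ids(example_cookies, example_urls)
```

Exact matching of this kind is deliberately conservative: it misses IDs that are encoded, encrypted, or embedded inside longer strings.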
Limitations: CSCode gives a strict conservative ID detection with fewer false positives [1]. However, false negatives may occur when an ID is shared in URL parameters in an encoded or encrypted format [6, 31], or when ID strings are hidden inside longer strings with non-standard delimiters. According to [1], adversarial trackers could have short-lived cookies (as in [1], we consider cookies with expiration date ≤ 30 days) mapped to user IDs at the backend-server to later on track the user. Such cases are not captured by our code. Hence, our results represent a lower bound on the actual CS taking place in a real-time scenario.
Privacy impact: Device or browser fingerprinting is a powerful technique that websites and TPs use to identify unique users and track their online behavior. This method collects information about the user’s browser type and version, operating system, time-zone, language, screen resolution, and other settings. It can lead to serious privacy issues as users are oblivious to this happening, and can have important implications on the way third-parties track users across the Web without cookies in the future.
Methodology: Our fingerprinting measurement methodology [11] utilizes data collected by OpenWPM, as described in Sec. 3.2. In particular, we detect different types of fingerprinting such as canvas, WebRTC, and audioContext, by checking webpages and the interfaces they call, such as HTMLCanvasElement and CanvasRenderingContext2D for canvas, RTCPeerConnection, createDataChannel and createOffer for WebRTC, and AudioContext and OscillatorNode for audioContext.
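The interface-based check can be illustrated with a short sketch. It assumes, hypothetically, that recorded JavaScript calls are available as (script URL, symbol) records with symbols of the form `Interface.member`; the records below are invented. It only flags scripts that touch the listed interfaces, so it is a coarse illustration rather than the detection criteria of [11].

```python
# Interfaces named in the text, grouped by fingerprinting type.
FINGERPRINT_INTERFACES = {
    "canvas": ("HTMLCanvasElement", "CanvasRenderingContext2D"),
    "WebRTC": ("RTCPeerConnection",),
    "audioContext": ("AudioContext", "OscillatorNode"),
}


def fingerprinting_types(js_calls):
    """Map each script URL to the fingerprinting types its calls suggest.

    js_calls: iterable of (script_url, symbol) records (assumed format).
    """
    result = {}
    for script_url, symbol in js_calls:
        interface = symbol.split(".", 1)[0]
        for kind, interfaces in FINGERPRINT_INTERFACES.items():
            if interface in interfaces:
                result.setdefault(script_url, set()).add(kind)
    return result


# Invented call records for illustration.
example_calls = [
    ("https://fp.example/a.js", "HTMLCanvasElement.toDataURL"),
    ("https://fp.example/a.js", "RTCPeerConnection.createOffer"),
    ("https://cdn.example/b.js", "window.navigator.userAgent"),
    ("https://audio.example/c.js", "OscillatorNode.frequency"),
]
flagged = fingerprinting_types(example_calls)
```

Since benign scripts also use these interfaces (for example, to draw charts), a practical detector would likely need stricter conditions before labeling a script as fingerprinting.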
Privacy impact: Invisible pixels are 1x1 pixel images that do not add any content to the websites hosting them. TPs use these invisible pixels to track users’ behavior on a website. Whenever a website loads, it sends subsequent requests to the server to load various assets like images, ads, and other media on the website. To load these invisible (1x1) pixels on the websites, TPs send some information using the requests sent to retrieve the images. Crucially, the users are unaware of the pixels’ existence on the websites and that these pixels report users’ activity. Therefore, every such pixel represents a threat to the user’s privacy.
Methodology: We follow [15], and for every crawl using OpenWPM, we store all HTTP requests, responses, and redirects, along with response headers, to capture the communication between a client and a server. We then filter HTTP requests and responses by checking the content-type in the response header. If the content-type is an image, the corresponding requests and responses are for images. Next, we check for content-length in the response headers to filter out only those HTTP requests and responses with content-length less than 1KB. This threshold is used to save storage space (i.e., not to store all images but only probable 1x1 pixel images). In [15], they use a 100KB threshold, but this is a very large size for such 1x1 pixel images. In fact, we found all detected invisible pixels in our dataset are less than 1KB in size. All such images are downloaded using the image’s URL recorded in the filtered HTTP requests and responses and then checked for the image’s dimensions. If both height and width of an image are 1 pixel, then the image is labeled as an invisible pixel. The corresponding HTTP request/response, image URL, content length, and third-party setting of each invisible pixel are recorded for further analysis.
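The filtering steps above (image content-type, content-length below 1KB, then a 1x1 dimension check) can be sketched as follows. This is an illustrative sketch, not the study’s code: it assumes response headers are given as a dictionary with lower-case keys, and it reads dimensions only from GIF and PNG headers, whereas a full pipeline would need to handle other image formats too.

```python
import struct

MAX_PIXEL_BYTES = 1024  # the 1KB content-length threshold from the text


def is_candidate(headers):
    """Keep only image responses smaller than the size threshold."""
    content_type = headers.get("content-type", "").lower()
    try:
        length = int(headers.get("content-length", ""))
    except ValueError:
        return False
    return content_type.startswith("image/") and length < MAX_PIXEL_BYTES


def image_size(data):
    """Return (width, height) for GIF or PNG bytes, or None otherwise."""
    if data[:6] in (b"GIF87a", b"GIF89a") and len(data) >= 10:
        # GIF stores width and height as little-endian 16-bit integers.
        return struct.unpack("<HH", data[6:10])
    if data[:8] == b"\x89PNG\r\n\x1a\n" and len(data) >= 24:
        # PNG stores them as big-endian 32-bit integers in the IHDR chunk.
        return struct.unpack(">II", data[16:24])
    return None


def is_invisible_pixel(headers, body):
    """True if the response looks like a 1x1 image under the filters."""
    return is_candidate(headers) and image_size(body) == (1, 1)


# Hand-built headers of a 1x1 GIF and a 2x2 PNG (truncated after the
# dimension fields, which is all this sketch reads).
gif_1x1 = b"GIF89a" + struct.pack("<HH", 1, 1) + b"\x00\x00\x00;"
png_2x2 = b"\x89PNG\r\n\x1a\n" + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 2, 2)
```

For example, `is_invisible_pixel({"content-type": "image/gif", "content-length": "14"}, gif_1x1)` evaluates to `True`, while the same call with `png_2x2` and a matching PNG header does not.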
In this section, we present our privacy analysis on the partisan websites of our dataset, and how they track users. We start with cookie-based tracking analysis (Sec. 5.1). We then study more complex tracking techniques such as cookie synchronization (Sec. 5.2), device fingerprinting (Sec. 5.3), and invisible pixel-based tracking (Sec. 5.4).

Figure 2: CDF of number of cookies for Left, Centre, and Right-leaning news websites, for their desktop and mobile versions (if available).
We analyze 100K cookies placed by FPs and TPs while visiting the 103 Indian news websites. Figure 2 shows the CDF of the number of cookies for all the Left-, Centre-, and Right-leaning news websites available for desktop (103) and mobile (23) versions of the websites. The median numbers of cookies are 86, 84, and 92 for Left-, Right-, and Centre-leaning desktop websites, and 30, 42, and 36, respectively, for mobile websites. Therefore, in all political leanings, websites for desktop push more cookies to the user’s browser than mobile versions (in median). In mobile versions, Centre and Right websites track users more compared to the Left by 1.2 and 1.4 times (KS-value: 0.33, p-value: 0.007), respectively, and Right websites track more than Centre websites by 1.2 times (KS-value: 0.28, p-value: 0.054). In desktop versions, median numbers are close for all leanings. The Right websites have fewer cookies than the Left, and the Left has fewer than the Centre. Interestingly, among desktop websites that deliver many more cookies than the median, Left tracks more than the Right and Centre. For example, sandesh.com, which is in the Left to Left-Centre political spectrum, has the highest number of cookies: more than 1400 cookies (median over five crawls). These cookies are set by the FP and TPs on this website. When desktop websites have fewer cookies than the median, the trend is reversed, i.e., Right tracks more than Left and Centre.

The different versions for desktop and mobile platforms for the same news website imply opportunity for collaboration or data leakage between the two tracking ecosystems across different devices. In Figure 3, we compare the total number of cookies for each of the 23 news websites with mobile and desktop versions. Most websites (20/23) set more cookies in their desktop as compared to their mobile versions. Interesting exceptions are Times of India, Punjab Kesari, and Daily Hunt, which set more cookies in their mobile websites. More cookies indicate higher intensity of tracking as well as network activity (for storing, updating, and synchronizing said cookies) between the browser and server. Therefore, such (mobile) websites
Figure 3: Median number of cookies in mobile vs. desktop versions for 23 news websites, grouped by political leaning in decreasing order of their Facebook followers.