A Visualization Interface to Improve the Transparency of Collected Personal Data on the Internet

Abstract

Online services are used for all kinds of activities, like news, entertainment, publishing content or connecting with others. But information technology enables new threats to privacy by means of global mass surveillance, vast databases and fast distribution networks. Current news is full of reports of data misuse and data leaks. In most cases, users are powerless in such situations and develop an attitude of neglect toward their online behaviour. On the other hand, the GDPR (General Data Protection Regulation) gives users the right to request a copy of all their personal data stored by a particular service, but the received data is hard for the common internet user to understand or analyze. This paper presents TransparencyVis, a web-based interface to support the visual and interactive exploration of data exports from different online services. With this approach, we aim to increase the awareness of personal data stored by such online services and of the effects of online behaviour. This design study provides a prototype that is accessible online and a best practice for unifying data exports from different sources.


© 2020 IEEE. This is the author's version of the article that will be published in the proceedings of the IEEE Visualization conference. The final version of this record will soon be available at: xx.xxxx/TVCG.201x.xxxxxxx


Marija Schufrin, Steven Lamarr Reynolds, Arjan Kuijper and Jörn Kohlhammer, Member, IEEE

Fig. 1: The TimeView of the web interface TransparencyVis with MultiView mode on. The data elements from the GDPR data exports of two different users, each from Google and Facebook, are visualized in interactive scatterplots as circles over time. Different colors represent the categories of the data elements. Patterns can be detected and compared as described in use case 2 (see Sect. 5.2).


Index Terms: Information visualization, usable privacy, privacy awareness, transparency-enhancing technologies, user-centered design

1 INTRODUCTION

In the last few decades, humanity has entered the digital age and become a modern information society. It is estimated that over fifty percent of the global human population now uses the Internet [21]. Online services are used for all kinds of activities, like news, entertainment, publishing content or connecting with others. But information technology enables new threats to privacy through global mass surveillance, vast databases and fast distribution networks. When people use online services, data about them is collected on a daily basis. Companies collect data to offer more content, improve their services, gather insight about the users, or increase the relevance of advertisements. A few major companies let users connect all their devices to their accounts for free. This enables services to create an increasingly detailed profile through the continued use of these services. Users are often unaware of the consequences of these choices and of the amount of data that is collected from them as a result. A key point is that users lose control over the data that concerns them because they are not aware of the data that is distributed in different ways over multiple services. Furthermore, users cannot control exactly what happens with this data. Faced with this impotence, Internet users often develop an attitude of neglect toward data privacy concerns. However, with regard to the inherent human right to privacy of each individual, as written in Article 12 of the Universal Declaration of Human Rights [52], the ability to control the provision of one's own data to different services on the internet is of utmost importance [37]. Privacy describes the right of individuals to decide how they seclude and expose information about themselves. In the context of this paper, the primary focus is on informational privacy. It can be described as “the right to select what personal information is known about me to what people?” [55].

People use so many services today that it is often challenging to keep track of the data these services collect. People employ different tactics to preserve their privacy. Teenagers, for example, try to flood the services with random non-sensitive content [6]. Others try to dismiss privacy threats with common arguments such as “I have nothing to hide” [45]. We argue that the main reason for such arguments and tactics is the inability to grasp the amount and value of the personal data being collected. Therefore, means to support the users' mental access to these data collections are desirable. We argue that visualizing such data collections in a usable way can contribute to the situational awareness of common internet users concerning their own personal data stored at different services. The data collection exports introduced by the GDPR, which has applied in the European Union since 2018, proved to be a valuable resource for this aim. The GDPR gives more control to users by regulating how companies can collect, store and use their personal data. It also enables users to download and access their personal data and transmit it to other services. However, the many files and the differences in formats between and within the data exports make it difficult for casual Internet users to get an overview of the content. Therefore, in this paper we present TransparencyVis, a web tool accessible online that supports the visual, interactive exploration of such data exports. In a user-centered design process we identified the relevant users, data and tasks, which are also presented in this paper. We have implemented the tool experimentally for four of the most popular online services (Google, Facebook, Instagram, Twitter). However, the interface is extensible to other services. We therefore share the generalization scheme for the data exports from different services, so that the community can contribute by parsing the data exports from further services. Our main contributions are:

1. A web-based prototype for the visual exploration of the data exports enabled by the GDPR, representing the collections of one's own personal data stored by different online services.
2. Characterization of the relevant users, data and tasks based on Miksch and Aigner [30], as appropriate for the presented challenge of raising situational awareness concerning personal data.
3. A unification scheme to generalize the data exports from different services, making it possible to merge and compare the various data sets in one visualization.
4. Evaluation of the usability and the appropriateness of the tool after the first design iteration, together with lessons learned and the changes implemented in the current version.

• Marija Schufrin is with Fraunhofer IGD, Germany.
• Steven Lamarr Reynolds is with Fraunhofer IGD, Germany.
• Arjan Kuijper is with Fraunhofer IGD, TU Darmstadt, Germany.
• Jörn Kohlhammer is with Fraunhofer IGD, TU Darmstadt, Germany.

2 RELATED WORK

A popular research field that aims to increase the transparency of personal data is Transparency Enhancing Technologies (TETs) [18, 22, 32]. They enable users to better understand the implications of disclosing personal data, to protect their privacy and to take an active part in the value creation of services [7]. TETs can be categorized into tools that enhance privacy before personal data is disclosed (ex-ante TETs) and tools that retrospectively enhance privacy once personal data has been disclosed (ex-post TETs) [15]. The approach presented in this paper can be classified as an ex-post TET. With this approach we aim to increase the situational awareness of common Internet users with respect to their personal data stored by different online services. To this end, we visualize the current content of the data collections that have been gathered so far. The goal is to help users reflect on their privacy attitude and their future behavior.

TransparencyVis prototype: https://transparency-vis.vx.igd.fraunhofer.de/

In our research of related work we found a number of helpful approaches for visual interactive systems that increase the transparency of personal data. However, we did not find any approach that addresses the visualization of the complete GDPR data exports from different services in a comprehensive view. Some approaches use parts of the download [49, 50] or aim to use a direct API of the service [15]. While some approaches let users try the tools with their own data in their own environment [3, 15, 49], many either require an implementation on the server side or are not designed for personal data at all [5, 15, 23, 39]. We have not found any approach where the data from multiple services can be combined and explored in one tool. However, there are tools which support data from multiple sources [41, 49]. While many approaches extend their data by deriving or adding further information (e.g. by machine learning, statistical information or outside knowledge) [10, 41], our focus is mainly on depicting the collection as is. Most of the related work is suitable for non-experts in IT. In the following, we present the most relevant groups of related work that we have found.

Visualization of data flows

Related approaches that also aim to visualize personal collections of data with the goal of increasing privacy awareness are DataTrack from Fischer-Hübner et al. [1, 15, 16, 24], PrivacyInsight from Bier et al. [5], Privacy Dashboard from Raschke et al. [39] and the online interactive tool developed by Kani-Zabihi and Helmhout [23]. These approaches are designed to be implemented on the server side and, while they are also designed for personal data, their main focus seems to be on showing the data flows, who the data is shared with, and the details of the provided information. Our approach instead focuses on visualizing a collection of personal data to be viewed by a common internet user in an easily accessible way.

Inferred data

The approaches of Do Thi Duc [10] (Dataselfie) and Rieder et al. [41] (FindYou) also aim at visualizing personal data and thereby increasing its transparency. Their main focus is to infer additional data using machine learning and statistical means, to show what can be inferred from the data. Do Thi Duc uses several bar charts that show the statistical information and also uses a timeline visualization similar to the one in TransparencyVis, but only the last seven days are visible because the tool focuses on collecting the data in real time. In TransparencyVis the whole time span of all available data is shown. FindYou, on the other hand, is a location auditing tool and thus provides a more specific service. Users can enter their own location data from three popular online services: Instagram, Twitter and Foursquare.

Visualizing privacy policies

There are also approaches that visualize privacy policies, for example Harkous et al. [19], Tesfay et al. [48], or Kelly et al. [25]. Some notable but non-scientific web tools for this application area are PrivacySpy [29], Trackography [47], Privacy Program [9], ToS;DR [42] and useguard [38]. These approaches rate privacy policies based on different assessment schemes and, while they help users reflect on their privacy attitude, they differ strongly from our approach by not visualizing the actual disclosed personal data.

Personal visualizations

Some approaches visualize personal data for the purpose of reminiscing, self-reflection and self-expression rather than for privacy awareness. These approaches try to gain additional value from the collected data, whereas approaches for privacy awareness try to show the value of the data collection itself. One example of this category is Visits from Thudt et al. [49, 50], where personal location histories are visualized in an appealing and interactive way. Users can upload their location history from Google and three other location-based services. Another example is LastHistory, a work of Baur et al. [3], which visualizes the music listening history from the Last.fm [27] service and its context (photo and calendar streams) in a timeline. Both approaches visualize an already collected data set of the users and provide the possibility to use the service with personal data, albeit for very specific data collections.

Non-scientific tools

We also found some non-scientific online tools which are comparable to our approach, for example myfbdata developed by Do Thi Duc [12] and the Facebook Analysis Tool by Wolfram Alpha [57]. Both were designed to visualize personal data on Facebook, either from the data export or directly via an API. myfbdata provided a map and a timeline, while the tool by Wolfram Alpha let users gain insight through multiple visualizations of friend circles, distributions and others. Both tools allowed few interactions, offered no categorization of the data, and were designed for only one online service (Facebook). However, they became obsolete some years ago. Beyond that, there are several other online tools designed to increase the transparency of personal data on the web. One category of these tools is the visualization of tracked user activity, e.g. re:log [35], Vorratsdaten by ZEIT Online [58], vds-suisse by OpenDataCity [36], publicdefault [11], OnlineStatusMonitor [28], WhatsSpy Public [59] and WhatsAppAll [26]. They visualize one or multiple static data sets to show the sensitivity of personal data. Other tools focus on the visualization of tracking behavior on websites: Mozilla Lightbeam [31], Netograph [33] and Trackography [47].

Thus, to the best of our knowledge, there is no other approach that provides the means to analyze personal data collections from more than one online service simultaneously in a comprehensive and transparency-enhancing way.

3 DATA-USER-TASK

In this section we present the targeted data, user and tasks according to Miksch and Aigner’s design triangle [30].

In the scope of this research topic we focus on personal data that is collected on the Internet. Personal data is “any information related to an identified or identifiable natural person” [40]. It is primarily provided by users to online services simply by using them. In recent years, the Internet rights of users were strengthened by the introduction of the Californian CCPA and the European GDPR [40]. The latter provides European citizens with Article 15, the right of access, i.e. they can request a copy of their personal data, a data export. It also includes Article 20, the right to data portability, with which they can use their data export for their own purposes across other services. It also requires the service to deliver the data export in a structured, commonly used and machine-readable format.

During our research, we investigated the data export request on several services and found large differences in the retrieval process. While most services employ an automated data export, some require users to contact support via email and identify themselves with an image of their passport. Further, the time until the export is created varies. For some services the duration depends on the size of the data export or the current workload of that service; however, a few services need several days to weeks to generate the data export. Most data exports we encountered were available as a zip archive and contained many different file formats, including json, js, csv, html, tcx, vcf, ics and others. As each file contained data about various topics, the files all had an individual data structure and only occasionally used recurring data types. Some services used special encodings such as UTF-8 encoded strings or JavaScript files with an exported variable that contains the JSON data, or included data whose purpose or context could not be identified. These files and data structures were almost never documented by the service; only the Twitter data export provided documentation. It should be noted that some services allow choosing between multiple file formats, in most cases JSON and an HTML variant that allows for easier viewing. Some data that was available on the website of the service was not included in the data export, but it was mostly miscellaneous data or newer features which had not been added yet. We selected services which are popular among users, have an automated and simple data export request feature, need only a short time to generate the data export, and allow easy maintenance. We therefore chose Facebook [14], Google [17], Twitter [51] and Instagram [20] for our initial prototype.
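One of the encodings mentioned above, a JavaScript file whose exported variable contains the JSON data, has to be reduced to plain JSON before it can be parsed. The following is a minimal sketch of that step, not the prototype's actual code. It assumes the file has the shape `<variable> = <JSON>` with an optional trailing semicolon; the function name `parseJsWrappedJson` and the variable name in the usage lines are hypothetical.

```typescript
// Minimal sketch (assumption: the file looks like `<variable> = <JSON>;`).
// The function name is hypothetical and not taken from the prototype.
function parseJsWrappedJson(source: string): unknown {
  const assignment = source.indexOf("=");
  if (assignment === -1) {
    throw new Error("No variable assignment found in file");
  }
  // Everything after the first "=" is treated as the JSON payload.
  let payload = source.slice(assignment + 1).trim();
  if (payload.endsWith(";")) {
    payload = payload.slice(0, -1);
  }
  return JSON.parse(payload);
}

// Usage with a made-up variable name and record.
const wrappedExample = 'window.exampleExport = [{"text": "Hello World"}];';
const unwrappedExample = parseJsWrappedJson(wrappedExample);
console.log(JSON.stringify(unwrappedExample));
```

Returning `unknown` rather than a concrete type keeps the sketch honest about the fact that the structure of each file still has to be described separately.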

The data export comprises several folders that contain the data of certain parts or features of the service. These folders contain subfolders and files in multiple file formats. Some files are in a common data format, like json, while others can contain images, videos, documents or binary data whose format may not be known in advance. Due to the high variation in the content and the structure of the data exports, we defined a unification scheme with the goal of simplifying the data and making it comparable. The overall unification scheme is shown in Fig. 2. Based on our observations of the data formats we defined two types of data for our visualizations:

File elements: A file element represents a file contained in the data export. This can for example be a video, an image, another archive or a machine-readable document. The files are categorized based on their file extension to make it easier for users to understand the files’ purpose. The main attributes of this type of data are:
• File Name - messages.json
• File Category - Picture, Video, Audio, Text, Document, Other
• Folder - messages/
• File Size - 5 MB
• Data Category - Messages, Security ...

Data elements: A data element represents a chunk of data that could be identified in the machine-readable files contained in the data export. Most of the machine-readable data was given in a list or array of individual elements that contain multiple relevant attributes. Most elements are events that happen within the service, for example account creation, password changes, sending messages, accepting friend requests, visiting a URL, using search, liking a page and others. After documenting several machine-readable files from multiple providers, we created the following attributes for this data type:
• Time - 2019-01-01 12:34:56
• Text - Person says: “Hello World”
• Category - Messages, Security ...
• Subcategory - Chat with Person B, Chat with Person C, ...

Finally, in order to support pattern exploration and a comparison between data sets from different online services or users, a set of ten categories has been derived so that each data element can be classified according to these categories (see also Fig. 2): Account (any data related to the users’ account), Activity (any data that is collected passively from users), Contacts (any data that contains contact addresses, friends lists or similar), Location (any location-oriented data), Media (any data that primarily describes media data from the user), Messages (any communication data), Posts and Comments (any posts or comments from the user), Security (any security-related data such as logins or IP addresses), Other (any data that does not fit the other categories). File elements which contain data elements have the same value for the attribute data category as the contained data elements. These categories can be applied universally to different services, which also eases the combination and comparison of data from different services.
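The two data types and the category set described above can be written down as types. The following is a sketch under stated assumptions rather than the prototype's actual typings: the attribute and category names follow the text and Fig. 2 (which additionally lists Likes), while the property names, the `null` data category for files without data elements, and the whole extension table are illustrative assumptions.

```typescript
// Sketch of the unification scheme as types. Category names follow the text
// and Fig. 2; property names and the extension table are assumptions.
type DataCategory =
  | "Account" | "Activity" | "Contacts" | "Likes" | "Location" | "Media"
  | "Messages" | "Posts and Comments" | "Security" | "Other";

type FileCategory = "Picture" | "Video" | "Audio" | "Text" | "Document" | "Other";

interface FileElement {
  fileName: string;                  // e.g. "messages.json"
  fileCategory: FileCategory;
  folder: string;                    // e.g. "messages/"
  fileSizeBytes: number;
  dataCategory: DataCategory | null; // assumption: null if no data elements are contained
}

interface DataElement {
  time: Date;
  text: string;
  category: DataCategory;
  subcategory: string;
}

// Hypothetical extension table; the real mapping is not given in the text.
const EXTENSION_TO_CATEGORY = new Map<string, FileCategory>([
  ["jpg", "Picture"], ["png", "Picture"],
  ["mp4", "Video"],
  ["mp3", "Audio"],
  ["txt", "Text"], ["json", "Text"], ["csv", "Text"],
  ["pdf", "Document"], ["html", "Document"],
]);

// Categorize a file by its extension, falling back to "Other".
function categorizeFile(fileName: string): FileCategory {
  const dot = fileName.lastIndexOf(".");
  if (dot === -1) return "Other";
  const extension = fileName.slice(dot + 1).toLowerCase();
  return EXTENSION_TO_CATEGORY.get(extension) ?? "Other";
}

// The example values from the attribute lists above.
const exampleFileElement: FileElement = {
  fileName: "messages.json",
  fileCategory: categorizeFile("messages.json"),
  folder: "messages/",
  fileSizeBytes: 5 * 1024 * 1024,
  dataCategory: "Messages",
};

const exampleDataElement: DataElement = {
  time: new Date("2019-01-01T12:34:56Z"),
  text: 'Person says: "Hello World"',
  category: exampleFileElement.dataCategory ?? "Other",
  subcategory: "Chat with Person B",
};
```

Using closed union types for the categories means a parser for a new service cannot silently introduce an eleventh category, which is what keeps exports from different services comparable.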

The proposed approach has primarily been designed with the ordinary Internet user in mind. For the user description in the design triangle, we characterize the user group by two attributes: privacy concern and internet skills. With respect to privacy attitude, Westin [56] defines three main groups of users based on their privacy concern index.
• Fundamentalist: A person who is distrustful of data collection by organizations and cares about privacy.
• Pragmatist: A person who weighs the benefits against the intrusiveness of data collection and believes that organizations should earn their trust rather than automatically have it.
• Unconcerned: A person who is trustful of organizations collecting personal data.

For each of these groups, we see benefits in being able to gain visual insight into their own data stored by different online services. Furthermore, we assume that all user groups have sufficient digital skills [53] to use online services such as Google, Facebook, Instagram and Twitter. The evaluation results show that the usability is appropriate for the evaluated user groups. However, an evaluation of privacy and visualization skills against the effectiveness of these tools would be valuable future work.

[Figure 2 content: data exports from Instagram, Twitter, Google and Facebook are passed through service-specific parsers. File elements carry File Name, File Category (Document, Video, Picture, Audio, Text, Other), Folder, File Size and Data Category; data elements carry Time, Text, Category and Subcategory. Categories: Account, Activity, Contacts, Location, Media, Messages, Other, Likes, Posts and Comments, Security.]

Fig. 2: Unification scheme: Data exports from different online services are unified to a defined scheme by a specific set of parsers. A separate parser must be defined for each service. The unification results in two data types, file elements and data elements, as well as an assignment of the elements to a category from the defined set of categories.

The main goal of our visualizations is to provide a comprehensive insight into the collection of personal data stored by different online services. This collection is represented by the data export that can be requested from the services as guaranteed by the GDPR. With this we aim to support the situational awareness of one’s personal data on the Internet. According to Endsley [13], situational awareness consists of three stages: perception, comprehension, projection. Applied to the context and data considered in this paper, the following three main goals can be defined.

1: Support Perception: Support the investigation of the distribution of the user's own data elements with regard to information type, time and the service by which they are stored.

2: Support Comprehension: Support the identification of possibly sensitive information.

3: Support Projection: Increase attention to the user's current and future online behavior.

Based on these goals we have identified the following tasks that our approach should support:

T1: OVERVIEW of all data elements contained in the exported data collection (perception)
T2: INSPECT the details of each data element (comprehension)
T3: RELATE the data elements to services (perception and comprehension)
T4: RELATE the data elements to time (perception and comprehension)
T5: COMPARE data between services and time periods (perception and comprehension)
T6: EXPLORE possible patterns and information resulting from aggregation of the data (comprehension)
T7: REFLECT on the personal value and perceived sensitivity of the revealed information (projection)

Through the overview of the whole data collection, users should gain a first insight into the data. At this stage they might already identify unexpected data elements. Users can inspect the details of a data element to determine how confidential or critical the information really is to them. By relating the data elements to the context of time or exploring different services, users should gain an additional perspective on the value of the provided data. Furthermore, patterns and unexpected information resulting from bringing together different data can be identified. Finally, the active reflection of users on the personally perceived sensitivity of the data should increase their awareness of the value of the stored data. While we defined the tasks mainly based on the three defined goals, we argue that the tasks are beneficial for all three user sub-groups. However, there might be different effects on the different sub-groups. For example, while the Fundamentalist might use T5 to detect sensitive information resulting from aggregation, the Unconcerned might use T5 to reminisce or self-reflect. On the other hand, the latter might lead to a higher awareness of their own data as a side effect.

For the visualization solution itself, the following requirements (R1 - R7) have been derived based on the above data, user and task identification. To increase the willingness of users to use our tool, we additionally added three system-related requirements (R8 - R10). These requirements are in line with the requirements for tools supporting privacy awareness proposed by Pötzsch et al. [37]. With these we mainly aimed to ensure that the evaluated effect on the users' experience results from the real inspection of their own data and not from a mockup, which we believe makes a huge difference.

R1: A view which shows all elements contained in the export at once (T1)
R2: Zoom and filter, details for each data element on demand (T2)
R3: Ability to upload data from different online services (T3)
R4: Timeline layout for data with a time attribute (T4, T5, T6)
R5: Visual categorisation by type of data to support the pattern exploration process (T5, T6)
R6: Display multiple data sets at the same time (T5)
R7: Functionality to evaluate the perceived sensitivity of a piece of information (T7)
R8: Own data, not just demo data
R9: No invasion of privacy by the prototype itself
R10: Understandable for non-experts in IT and visualization

4 TRANSPARENCYVIS

In this section, we present our prototype TransparencyVis. First we explain the infrastructure and the main technologies used in our prototype. Then, we describe the visualization components and demonstrate how TransparencyVis can be used in practice by means of some use cases.

TransparencyVis is implemented as a web application that primarily runs on the client side. The interface is written in TypeScript and React.js, and for the visualizations we use the JavaScript library d3.js. These technologies enable the implementation of an interface with interaction paradigms familiar to the common Internet user (R10). To meet R8 and to enable users to explore the tool with their own data, we have implemented an upload and parsing mechanism for four exemplary but well-known services. To ensure R9 we decided to avoid any unnecessary connections to the server. Therefore, instead of uploading the data to a server, the processing is done in a web-worker thread in the browser, which preserves privacy while keeping the interface interactive. When a data export is selected, its contents are extracted and the service is automatically detected, to reduce the complexity for the user as far as possible. As defined in Sect. 3.1, each service has its own parser for each parsable file, which extracts the relevant data from a JSON or other file into data elements. The structure of the JSON files is documented by TypeScript typings to facilitate the extension and maintenance of the application. Additional services can be added by implementing a parser for their data export structure.
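The extension mechanism described above (one parser per parsable file, grouped per service, with automatic service detection) could be organized roughly as follows. This is a sketch only: the text does not say how detection works, so detecting a service by a marker path is an assumption, and the service name, file paths, record fields and all identifiers below are hypothetical placeholders rather than real export layouts.

```typescript
// Sketch only: service name, paths and record fields are hypothetical.
interface ParsedElement {
  time: Date;
  text: string;
  category: string;
  subcategory: string;
}

// A file parser turns the raw text of one parsable file into data elements.
type FileParser = (content: string) => ParsedElement[];

interface ServiceDefinition {
  name: string;
  markerPath: string;               // assumed to occur only in this service's export
  parsers: Map<string, FileParser>; // file path -> parser for that file
}

// Pick the first service whose marker path occurs in the extracted archive.
function detectService(
  paths: string[],
  services: ServiceDefinition[],
): ServiceDefinition | undefined {
  return services.find((service) => paths.includes(service.markerPath));
}

// Run every registered parser on its file; files without a parser are skipped.
function parseExport(
  files: Map<string, string>,
  service: ServiceDefinition,
): ParsedElement[] {
  const elements: ParsedElement[] = [];
  files.forEach((content, path) => {
    const parser = service.parsers.get(path);
    if (parser !== undefined) {
      elements.push(...parser(content));
    }
  });
  return elements;
}

// Hypothetical parser for a made-up message file format.
const parseExampleMessages: FileParser = (content) => {
  const raw = JSON.parse(content) as { sent: string; body: string; chat: string }[];
  return raw.map((message) => ({
    time: new Date(message.sent),
    text: message.body,
    category: "Messages",
    subcategory: message.chat,
  }));
};

const exampleService: ServiceDefinition = {
  name: "ExampleService",
  markerPath: "example/profile.json",
  parsers: new Map([["example/messages.json", parseExampleMessages]]),
};
```

Under this layout, supporting a further service amounts to adding one more `ServiceDefinition`; the detection and parsing loops stay unchanged.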

Based on the requirements stated in Sect. 3.4 we developed a collection of views (see Fig. 3) to support the users and their tasks. The two main views are the FileView (b) and the TimeView (c). They are complemented by the Data Page (a) and the ListView (d). The FileView is mainly based on a TreeMap [43] and is primarily meant to give the user an overview of all file elements contained in the export at a glance (R1). The TimeView is mainly designed as a scatterplot [8] that shows the temporal aspect of the data elements. It contains time-dependent data and is primarily meant for exploring patterns and time relations (R4). The ListView displays all data elements in a list. Additionally, users can rate the perceived sensitivity of each data element to support reflection (R7). The common process is as follows: Users start by retrieving their personal data from the online services and dragging the zip archive into the Data Page. Multiple zip archives from different services can be inserted at once (R3, R6). The user proceeds to the FileView to explore the files contained in the data export. They can then explore the temporal data in the TimeView and finally look at the details in the ListView. However, the user can also switch between the views as desired. The sidebar contains the ten categories described in Sect. 3.1 with their mapped colors (R5), which are the same in all views. The data in each visualization is mapped to the color of its assigned category. The FileView has an additional category Files. The legend list in the sidebar can also be used to filter each category (R2).

At first, the user is provided with an initial view that consists of a dropzone to enter the data export and an overview of the supported services. The Data Page (Fig. 3a) has a minimalistic design to reduce the user's cognitive load. For each service, instructions on how to retrieve the data export are provided. After a data export is loaded into TransparencyVis, the corresponding service field is colored and the inserted data set is listed within this field. After the data exports are processed they are kept in memory until the browser tab is closed or reloaded.
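The category legend in the sidebar acts as a filter shared by all views. A minimal sketch of such filter state is given below; it is not taken from the prototype, and the function names and the choice to store the set of hidden categories (rather than the visible ones) are assumptions made for illustration.

```typescript
// Sketch of category-based filtering; all names are illustrative assumptions.
interface CategorizedElement {
  category: string;
  text: string;
}

// Clicking a legend entry hides a visible category or shows a hidden one.
// A new set is returned so the previous state stays untouched.
function toggleCategory(hidden: ReadonlySet<string>, category: string): Set<string> {
  const next = new Set(hidden);
  if (!next.delete(category)) {
    next.add(category);
  }
  return next;
}

// Every view would draw only the elements whose category is not hidden.
function visibleElements<T extends CategorizedElement>(
  elements: T[],
  hidden: ReadonlySet<string>,
): T[] {
  return elements.filter((element) => !hidden.has(element.category));
}
```

Keeping one filter state outside the individual views is one way to make a legend click affect the FileView, TimeView and ListView consistently.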

The FileView (Fig. 3b) displays all files of the data export in a treemap (R1). We decided to use a treemap because our goal was to show an overview of all elements contained in the data export and to depict the proportions in the part-to-whole relationship between the file elements and the whole export. We also saw the metaphor of boxes in which the data elements are stored as an appropriate representation for the file visualization. Because the number of files contained in the export can vary considerably from user to user, we also see the space-filling treemap as a good choice to support scalability. When choosing the treemap we also had the hierarchical data in mind. While the current version only displays the leaves, for future extensions we aim to emphasize the hierarchical structure of the data to increase understandability. Each field represents a file which is contained in the export. The user can choose from multiple attributes for scaling the treemap slices. We see the file size and the number of data points included in a file as the most valuable options. While the first attribute can help to discover large (possibly) sensitive files, for example videos or images, the latter can help to discover collections of many elements. This could for example be a conversation record or a search history with many items. Details can further be inspected in the TimeView or the ListView. A further possible, but not yet implemented, option would be to scale according to the sensitivity value. However, this option depends on input from the user. The color represents the category of the contained data (R5). Files which do not contain further data elements are colored white. In the treemap a user can compare the different categories to see which is prevalent and how much data is collected in each category. Users can inspect details about the files via tooltips and zooming (R2). Multiple data sets are merged into one in this view. This allows the user to combine the data from different services in one overview. However, in the sidebar the user can select and deselect the data sets to display.

To support the exploration of patterns and trends, the TimeView uses a timeline visualization (Fig. 3c) to display the temporal aspect of the data (R4). A scatterplot was chosen for this purpose. While time is one-dimensional, we take its repetitive cycles into account and split it into two dimensions. The x-axis shows the years and months across the data contained in the export. The y-axis shows the time within a single day. A grid allows for better orientation and comparability. Each circle in the visualization represents a data element. To reduce overplotting, only the border of each circle is drawn. The color indicates the category of the data element. We decided to use a scatterplot to allow the display of every single data element while still making general trends perceivable. Representing the data elements as units should support the perception of the possible relevance of every single data element. Because the data elements are assigned to a category and colored accordingly, the dense formation of the individual elements in the scatterplot additionally makes it possible to observe patterns in groups of data elements. To fulfill R2 according to Shneiderman's Mantra [44] and to support R10, the familiar interaction paradigm of zooming and panning with the scroll wheel is implemented.
Therewith userscan look at speciﬁc time frames, like years, months, or weeks by zoom-ing into the timeline. By seeing changes or deviations in the activitypatterns it is possible for the users to identify certain important eventsin their life. The data elements can be ﬁltered based on the categories.Therefore the user can click on the category ﬁlters on the left to hideirrelevant or overplotting data, such as the location history data fromthe Google service that is collected every few minutes on Androidphones (Fig. 1, (4)). To inspect the details of each data element, userscan hover over the circles to view a tooltip that shows additional infor-mation about that item. One extension of the current version after theevaluation was the search ﬁled in

TimeView and

ListView . With it, userscan search for speciﬁc terms or names and inspect the patterns in aspeciﬁc context. Multiple datasets are merged by default and are shownin one timeline visualization. However, the

MultiView option allows theuser to plot the different datasets on separate time lines. This is similarto small multiples and can be used to compare the patterns betweendifferent data exports. This is shown in Fig. 1 and is demonstrated inuse case 2 in Sect. 5.2. A combination of multiple sources increases thepossibilities to detect patterns such as daily routines, deviations, sleep,holidays, moving to a new place and others. a) Data Page (b) FileView(c) TimeView (d) ListView
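The two-dimensional time mapping of the TimeView can be illustrated with a short sketch. It is not taken from the TransparencyVis code base: the `DataElement` class, its field names, the example categories, and both helper functions are hypothetical, and the sketch only assumes that every data element carries a timestamp, a category, and some text. Each timestamp is split into a calendar-day coordinate (x-axis) and a time-of-day coordinate (y-axis); a category filter and a search term are applied before the points are handed to a plotting layer.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataElement:
    """Hypothetical unified data element: one timestamped item of an export."""
    timestamp: datetime
    category: str  # e.g. "messages" or "location" (illustrative names only)
    text: str


def to_plot_point(element):
    """Split one timestamp into (day on the timeline, hour within that day)."""
    x = element.timestamp.date().toordinal()       # grows by 1 per calendar day
    t = element.timestamp.time()
    y = t.hour + t.minute / 60 + t.second / 3600   # hours since midnight, 0 <= y < 24
    return x, y


def visible_points(elements, hidden_categories=frozenset(), query=""):
    """Apply the category filter and a case-insensitive search, then map to points."""
    needle = query.lower()
    return [
        (to_plot_point(e), e.category)
        for e in elements
        if e.category not in hidden_categories and needle in e.text.lower()
    ]
```

With this layout, a daily routine shows up as a horizontal band (same hour, many days), while a single busy day shows up as a vertical stripe. Hiding a dense category such as a location history only removes points and leaves the coordinates of the remaining elements unchanged.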

Fig. 3: The four views of TransparencyVis: (a) the Data Page, onto which the user can drag and drop their data export folder, (b) the FileView, which gives an overview of all files contained in the export, (c) the TimeView, which shows categorized data elements to explore temporal trends and patterns, and (d) the ListView, which provides details on each data element and the possibility to reflect on each data element by rating its perceived sensitivity.

The ListView (Fig. 3d) is meant to support the user in inspecting the data elements in detail (R2) and in reflecting on the perceived sensitivity of this data, as required by R7. It consists of a chronologically sorted, scrollable list of the data elements from the selected data exports. It displays the date, the category, and the contained text of each data element. Furthermore, the user can rate the perceived sensitivity of each reviewed data point by interacting with a slider. The slider allows the user to choose a value between Not very sensitive and Very Sensitive. The average of the sensitivity ratings over all elements is calculated and displayed to the user. This should increase the motivation to inspect and reflect on further data elements. A search field can be used to search for specific terms and thereby to examine particular questions in detail. The last two features are improvements based on the evaluation results.

5 USE CASES

In this section, we demonstrate two use cases that show how TransparencyVis can be used. We do this with imaginary scenarios based on real data.

Bob has uploaded his data from Facebook to TransparencyVis by dragging and dropping the received zip archive into the Data Page (Fig. 3a, T3). In the FileView (Fig. 3b), he can see all the files and folders contained in the data export (T1). While hovering over the boxes and revealing the names of the files (T2), he wonders about some files that he sent to friends years ago and that still seem to be stored on Facebook’s servers (T7). He also wonders about the large number of images stored there, which he did not expect (or had forgotten about). Then he spots the message file that is relatively large in comparison to the others (the big rose one). By inspecting the details in the tooltip, he learns that the file contains the conversation with Alice. Having detected this, Bob might go on to the TimeView (Fig. 3c) and search for all data elements that contain “Alice”. In the timeline he can see, for example, that the conversation mostly took place around 2011 (T6). He can also explore further patterns of the conversation. Bob might also go to the ListView (Fig. 3d) and search for “Alice” in the search field. There he would get all messages that he has exchanged with her and could check whether they contain especially sensitive information that he would like to delete.

Fig. 1 shows how four different datasets can be compared (T5) with each other in one view. Alice (left) and Bob (right) have both provided their personal data sets retrieved from Facebook (top) and from Google (bottom) (T3). Compared to Bob, Alice seems to have used Facebook mostly for private messaging (rose circles in (1)). Hovering over the circles reveals the communication partner as well as the full text of the message item (T2). According to the data, Alice primarily used Facebook (1) rather than Google (3). She seems to have some messaging data on Google around 2015, but after that she seems to have avoided using her Google account (T6). In contrast, Bob’s Google dataset (4) reveals a large amount of tracked activities (green). Beginning in 2014, his location is tracked constantly (orange). Each circle reveals the concrete stored information in a tooltip, such as actual location coordinates, search terms, watched videos, or visited webpages. Bob has an Android phone that is connected to his Google account, and his privacy settings allow Google to track all of his activities on the platform. Alice, on the other hand, was surprised to discover that the green activities around 2015 hint at her YouTube history of videos she had watched at that time. Thinking about how her taste and interests have changed over the years, she caught herself thinking that she would feel uncomfortable sharing part of this history with others (T7). Both Bob and Alice noticed that the security-related data in the Facebook collections has increased since around 2016. While the scenario in which two users provide their data to merge them in one visualization seems quite unrealistic, we chose this use case to present the possibilities of the tool and the differences between the data sets of different personalities. Possible applications of this scenario, however, might be the combination of the data sets of family members or the comparison of one’s own data with exemplary average datasets.

Fig. 4: Evaluation procedure. The questionnaire was used to receive feedback about the evaluated user group, the appropriateness of TransparencyVis, the effect on the privacy attitude, and usability. The online evaluation started with an introduction, followed by a questionnaire to determine the users’ attitude towards privacy. A PANAS questionnaire was used to examine the participants’ emotional affect when seeing the data in TransparencyVis. The participants could explore TransparencyVis with their own data. The evaluation concluded with questions regarding usability and general feedback.

6 EVALUATION

We evaluated the first iteration of our prototype TransparencyVis with regard to the three goals defined in Sect. 3.3. The evaluation focused on four main aspects: (1) evaluated user group, (2) appropriateness of TransparencyVis, (3) effect on the privacy attitude, and (4) usability. We used the results to improve TransparencyVis into the version presented in this paper.

As detailed in Sect. 4, the prototype is a web interface that can be used with personal data. Therefore, we conducted an online study with 37 users (14 f, 21 m, 2 other) and their own personal data. The age ranged between 20 and 64 years, with a predominance of the age group of 20-34 years (30/37). Most participants were either students (12) or employees (24). All participants were from Germany. The study ran for 21 days. The average duration of an evaluation session was about 30 minutes. Only participants who reached the last page of the evaluation were recorded. The participants were led through a fixed process by the evaluation tool [46] without the need for an instructor. Participants could therefore conduct the evaluation on their own, at their own pace, and in their familiar environment. This way, the usage of the tool during the evaluation resembled the natural context of an everyday situation, in line with the targeted user group. The process of the evaluation is shown in Fig. 4. The questions of the evaluation can be found in the supplementary materials. The evaluation started with the introduction, which consisted of a consent form, a questionnaire on demographic data, and a data preparation session. Then, the participants had to fill out a questionnaire about their attitude towards privacy. This questionnaire was inspired by the works of Cabinakova et al. [7] (trust), Westin [56], and Bergmann [4] (privacy concern index). Furthermore, we asked the participants to fill out the PANAS questionnaire [54] before and after the actual interaction with the tool. This was used to measure the possible emotional affect caused by exploring their own data as presented by TransparencyVis. After using the tool, the participants were asked some questions regarding possible discoveries, followed by a questionnaire about the perceived appropriateness of TransparencyVis for selected tasks, primarily concerning the goals to support perception and comprehension. Then we checked the overall perceived usability with the SUS questionnaire [2]. Finally, we asked the participants for their subjective opinion on whether and how the insights into the data had changed their attitude towards privacy, and gathered more general feedback.

Among the participants were 9 Unconcerned, 17 Pragmatists, and 11 Fundamentalists, which is in line with the distributions observed by Bergmann [4], in which unconcerned users are usually underrepresented. The users’ trust in services was measured with two questions from Cabinakova et al. [7]. The answers were converted from their Likert scale to a score from 0 to 100. The mean over all participants was 68.2 with a standard deviation of 26.5. Most users had an account on the Google platform (33 out of 36 participants), followed by Facebook with 30, Instagram with 22, and Twitter with 12 participants (see Fig. 5). Participants from the Unconcerned group used the most services, with 2 to 5 services. The Pragmatists group used between 2 and 4 services. The Fundamentalists group used the fewest, with 1 to 3 services. The Unconcerned group had the lowest usage time, with an average of 15 hours weekly across all services. The Pragmatists group had an average of 32 hours, and the Fundamentalists group an average of 16 hours. In conclusion, the participants of this survey are well distributed in their privacy attitude and are users of multiple services.
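The conversion of Likert answers into a trust score from 0 to 100 can be sketched as follows. The paper does not state the number of scale points or how the two questions were combined, so the five-point scale, the linear rescaling, and the averaging of the two answers are assumptions made for illustration; the function names are hypothetical as well.

```python
from statistics import mean


def likert_to_score(answer, points=5):
    """Linearly rescale a Likert answer in 1..points to the range 0..100.

    Assumption: the lowest answer maps to 0 and the highest to 100.
    """
    if not 1 <= answer <= points:
        raise ValueError(f"answer must be between 1 and {points}")
    return (answer - 1) / (points - 1) * 100


def trust_score(answers, points=5):
    """Average the rescaled answers of one participant (assumed aggregation)."""
    return mean(likert_to_score(a, points) for a in answers)


# One participant answering the two trust questions with 4 and 5.
participant_score = trust_score([4, 5])
```

Under these assumptions, a participant who answers both questions with the middle option receives a score of 50, and the group-level mean and standard deviation are then computed over the per-participant scores.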

Overall, we received much appreciation from the participants as well as from informal presentations of the interface. Nine participants explicitly expressed their praise in the feedback section with a corresponding comment. Several participants asked whether they could forward the link to friends.

Support perception of data: The questionnaire supports the appropriateness of the tool for the perception of the amount of data (28/37 agreed), the type of data (25/37 agreed), and trends and patterns (18/37 agreed). These are the main aspects with regard to the goal to support perception.

Support comprehension of data: To determine whether the participants were able to relate the perceived data to its meaning, we asked what they saw during the exploration phase. This way, we wanted to estimate the effect with respect to the goal to support comprehension. In particular, we asked about the patterns or insights that participants gained from their data. Interestingly, the ability to find patterns seems to be related to the privacy concern group to which the participant belonged. This is shown in Fig. 5b. While the majority of the Fundamentalists (7/11) reported exciting trends, only 3 of 9 Unconcerned claimed to have seen any trend. For the Pragmatists, it was roughly half and half. We received reports about some identified trends, which we have clustered into the following groups.

1. Usage patterns regarding the platform: e.g., decreased activity on Facebook, changes from one platform to another
2. Changes in online behavior: Periods with a high number of messages or friend requests, last deletion of the browser history, and similar
3. Changes in location: Changes of place of residence, change of workplace, holidays, location change from working day to weekend
4. Changes induced by the platform: For example, an increase of security elements from Facebook in recent years
5. Personal events and patterns: Sleeping patterns, online times, holidays, birthday congratulations, or change of jobs

Fig. 5: Evaluation results. (a) Number of participants using each online service, (b) number of participants who claimed to have seen any trends in their data, differentiated according to the privacy attitude groups, (c) number of participants who claimed to have changed their privacy attitude after having seen their data through TransparencyVis.

The SUS revealed an average score of 65.4, which is a good value for the first iteration. There was nearly no difference between the different user groups. We further clustered the textual answers from the feedback section and derived the following main points for improvement, which we have addressed in the version presented in this paper.

1. Zoom function: Adding a zoom function or a selection of a time period to the TimeView
2. Filter improvement: Improving the filter option, e.g., filtering by category, reducing the overload
3. Tooltip improvement: e.g., format, details, position
4. Search function: Improving the support for pattern detection by adding a search capability to the timeline

Furthermore, the FileView and the ListView turned out to be complicated to understand for the participants. We have implemented some improvements for the current version. For the FileView, we simplified some interactions and improved the tooltips. We further classified the files (white) according to the type of file, which is displayed as a label and in the tooltip. We also simplified the layout of the ListView and added the average rating value as feedback for the rating of the perceived sensitivity. Additionally, we added a search functionality to filter the elements.

In summary, the main extensions we have implemented after the evaluation are: support for multiple data sets simultaneously, filtering by categories, zoom and pan, feedback for the sensitivity rating, improved tooltips, and layout simplifications.
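For readers unfamiliar with the SUS value reported above, the following sketch shows how a single participant's score is commonly derived. The paper does not describe its own computation; the sketch assumes the standard ten-item SUS questionnaire with five-point answers and the usual scoring rule, and the function name is hypothetical.

```python
def sus_score(answers):
    """Compute one participant's SUS score (0..100) from ten answers in 1..5.

    Standard scoring: odd-numbered (positively worded) items contribute
    answer - 1, even-numbered (negatively worded) items contribute 5 - answer;
    the sum of all contributions is multiplied by 2.5.
    """
    if len(answers) != 10:
        raise ValueError("SUS expects exactly ten answers")
    total = 0
    for item_number, answer in enumerate(answers, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("each answer must be between 1 and 5")
        total += (answer - 1) if item_number % 2 == 1 else (5 - answer)
    return total * 2.5


# A participant who chooses the neutral option for every item.
neutral = sus_score([3] * 10)
```

A study-level value such as the one reported here would then be the mean of these per-participant scores.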

With regard to the goal to support projection, we wanted to know whether the use of TransparencyVis had any effect on the privacy attitude of the participants. The results of the PANAS questionnaire revealed a significant increase of the negative attributes Upset (+0.78) and Scared (+0.65) and a marginally significant decrease of the positive attribute Determined (-0.41). Since one of our goals is to draw more attention to the effects of online behavior, a slight increase in alertness based on the insights can be seen as a success. However, since this evaluation was only meant to indicate a trend regarding possible effects, further studies should evaluate the effects and their reasons in more depth.

Support projection: 17 participants confirmed that the use of TransparencyVis had an influence on their privacy attitude. Interestingly, most of these participants belonged to the Fundamentalists group (9/11), while only one Unconcerned (1/9) was affected in a similar way (see Fig. 5c). We further clustered the answers to the questions about which kind of influence had been experienced. Overall, we derived the following clusters of answers to the questions on privacy attitude:

1. More attentiveness: Many anticipated paying more attention to how they handle their own personal data and to their online behavior (8 participants)
2. Checking settings: Some intended to check and change privacy settings, and maybe to switch to more trustworthy platforms (4)
3. Deletion: Some stated that they would delete their data or the entire account, or avoid such platforms (4)
4. Gain of confidence: A few participants gained confidence in handling their personal online data (2)
5. Surprise: Some expressed surprise about which data had actually been collected (2)
6. Curiosity: One expressed curiosity about what the exported copy might not include (1)

7 DISCUSSION

With our design study, we have gained several insights into the understanding of personal data stored by online services. First, the idea and the current implementation have received much approval and interest from the targeted user group. We observed that especially the inclusion of the usability requirements R8-R10 had a strong influence on the positive feedback. This is especially true because the users could use the tool with their own personal data, at their own pace, and in their own private environment. The evaluation showed good results regarding the effect on the privacy attitude and the perceived appropriateness of the tool for the intended purposes. However, the obviously subjective answers with regard to the change in privacy attitude should not be interpreted as trends in changed behavior. Long-term studies therefore have to be conducted with improved versions of the interface to examine the significance of the effects. During the evaluation, we also gained valuable feedback on how to improve the usability of the interface, which we have partly integrated into the version of the tool presented in this paper. Some implications will require further research. This is especially true for the ListView and the FileView, whose purpose seemed not to have been understood very well by the participants. In addition to the optimizations in this paper, the rating functionality and the corresponding feedback of the ListView should be improved further in future work. In particular, the calculation of the perceived sensitivity should receive more attention. While the concept of the treemap in the FileView seems to be not very intuitive for common users, it has many advantages with regard to the data of the exports, as described in Sect. 4.2.2. Future work, however, should consider optimizing the treemap visualization for non-experts. One of the challenges, which is also related to the FileView, is the huge variance in the formats of the data exports between different online services as well as between different users. This complicates the development of appropriate parsers for the proposed unification scheme. The huge variance in formats and content also leads to the open challenge of achieving a comprehensive overview for the user. Additionally, the communication of the difference between files and data elements to the user still needs to be improved.

Overall, we are encouraged by the results of the evaluation of the first iteration of the tool and are more confident that tools of this type have the potential to receive attention from a broad range of Internet users. While the current version is primarily designed as an independent interface for the individual, the application of such a visualization by online services is another possibility. Such functionality could increase the users’ trust in the services, which is an increasingly important factor for the willingness to share personal data with an online service [7, 34]. As an extended stand-alone application, the interface could, however, also be used as a management tool for online data by bringing the exports of all used online services together.

8 CONCLUSION AND FUTURE WORK

In this paper, we have presented our design study on increasing the attention and awareness of common internet users for their own personal data that is stored by different online services. We have presented the targeted user group, which we differentiated by the privacy concern index, together with the data source used. We have also provided a unification scheme based on two defined data types and ten plus one categories, which can be used by other researchers to develop new parsers for further services. We have also presented the tasks that we derived based on the theory of situation awareness applied to stored personal data. Based on the derived requirements, we have implemented the online accessible prototype TransparencyVis, which can be used with one’s own real personal data. We have evaluated this tool with 37 targeted users and have gained important insights with regard to the tool’s appropriateness, its usability, and its effect on the participants’ attitude towards privacy. The evaluation of this first iteration has led to many ideas for improvements of the approach. The main next step would be to improve the FileView to enhance the overview of all data contained in the download at first glance. Furthermore, we want to investigate how an active reflection on the presented data can be supported more effectively. A possible approach could be to connect a user’s sensitivity ratings to an active learning approach to support the visualization of the results. A remaining challenge for any new ideas is our effort to preserve the privacy of the user by not using a server-based approach. Further potential improvements include the use of additional analysis methods, the inclusion of other services, and further optimizations of the data parsing.

ACKNOWLEDGMENTS

This research work has been funded by the German Federal Ministry of Education and Research and the Hessen State Ministry for Higher Education, Research and the Arts within their joint support of the National Research Center for Applied Cybersecurity ATHENE.

REFERENCES

[1] J. Angulo, S. Fischer-Hübner, T. Pulls, and E. Wästlund. Usable transparency with the data track: a tool for visualizing data disclosures. In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems, pp. 1803–1808. ACM, 2015.
[2] A. Bangor, P. Kortum, and J. Miller. Determining what individual sus scores mean: Adding an adjective rating scale. Journal of usability studies, 4(3):114–123, 2009.
[3] D. Baur, F. Seiffert, M. Sedlmair, and S. Boring. The streams of our lives: Visualizing listening histories in context. IEEE Transactions on Visualization and Computer Graphics, 16(6):1119–1128, 2010.
[4] M. Bergmann. Testing privacy awareness. In IFIP Summer School on the Future of Identity in the Information Society, pp. 237–253. Springer, 2008.
[5] C. Bier, K. Kühne, and J. Beyerer. Privacyinsight: the next generation privacy dashboard. In Annual Privacy Forum, pp. 135–152. Springer, 2016.
[6] D. Boyd. It’s complicated: The social lives of networked teens. Yale University Press, 2014.
[7] J. Cabinakova, C. Zimmermann, and G. Müller. An empirical analysis of privacy dashboard acceptance: the google case. In , p. Research Paper 114, 2016.
[8] W. S. Cleveland and R. McGill. The many faces of a scatterplot. Journal of the American Statistical Association, 79(388):807–822, 1984.
[9] Common Sense Media. Privacy program. https://privacy.commonsense.org/evaluations/1, 2013. Accessed: 2020-07-13.
[10] H. Do Thi Duc. Data Selfie: To Know Thyself Like Facebook Knows Thee. Master’s thesis, Parsons School of Design, 2017. Retrieved from http://hangdothiduc.de/mfadt/thesis/2016_dothh489_01.pdf, Accessed on 2019-11-18.
[11] H. Do Thi Duc. Publicbydefault.fyi. https://publicbydefault.fyi/, 2018. Accessed: 2020-07-13.
[12] H. Do Thi Duc and T. Bazichelli. myfbdata. https://myfbdata.schloss-post.com/, 2017. Accessed: 2020-07-13.
[13] M. R. Endsley, D. J. Garland, et al. Theoretical underpinnings of situation awareness: A critical review. Situation awareness analysis and measurement, 1(1):3–21, 2000.
[14] Facebook. Accessed: 2020-07-13.
[15] S. Fischer-Hübner, J. Angulo, F. Karegar, and T. Pulls. Transparency, privacy and trust–technology for tracking and controlling my data disclosures: Does this work? In IFIP International Conference on Trust Management, pp. 3–14. Springer, 2016.
[16] S. Fischer-Hübner, J. Angulo, and T. Pulls. How can cloud users be supported in deciding on, tracking and controlling how their data are used? In IFIP PrimeLife International Summer School on Privacy and Identity Management for Life, pp. 77–92. Springer, 2013.
[17] Google. Accessed: 2020-07-13.
[18] M. Hansen. Marrying transparency tools with user-controlled identity management. In IFIP International Summer School on the Future of Identity in the Information Society, pp. 199–220. Springer, 2007.
[19] H. Harkous, K. Fawaz, R. Lebret, F. Schaub, K. G. Shin, and K. Aberer. Polisis: Automated analysis and presentation of privacy policies using deep learning. In USENIX Security Symposium (USENIX Security 18), pp. 531–548, 2018.
[20] Instagram. Accessed: 2020-07-13.
[21] International Telecommunication Union (ITU). Statistics - Individuals using the Internet, 2005-2019. 2019. Accessed: 2020-07-13.
[22] M. Janic, J. P. Wijbenga, and T. Veugen. Transparency enhancing tools (tets): an overview. In , pp. 18–25. IEEE, 2013.
[23] E. Kani-Zabihi and M. Helmhout. Increasing service users privacy awareness by introducing on-line interactive privacy features. In Nordic Conference on Secure IT Systems, pp. 131–148. Springer, 2011.
[24] F. Karegar, T. Pulls, and S. Fischer-Hübner. Visualizing exports of personal data by exercising the right of data portability in the data track-are people ready for this? In IFIP International Summer School on Privacy and Identity Management, pp. 164–181. Springer, 2016.
[25] P. G. Kelley, J. Bresee, L. F. Cranor, and R. W. Reeder. A nutrition label for privacy. In Proceedings of the 5th Symposium on Usable Privacy and Security, p. 4. ACM, 2009.
[26] L. Kloeze. Whatsallapp. https://github.com/LoranKloeze/WhatsAllApp, 2019. Accessed: 2020-07-13.
[27] Last.fm. Accessed: 2020-07-13.
[28] Lehrstuhl für Informatik 1, Friedrich-Alexander-Universität Erlangen-Nürnberg. Onlinestatusmonitor. https://onlinestatusmonitor.com/user_statistics/, 2014. Accessed: 2020-07-13.
[29] M. McCain and I. Barakaiev. Privacyspy. https://privacyspy.org/, 2019. Accessed: 2020-07-13.
[30] S. Miksch and W. Aigner. A matter of time: Applying a data–users–tasks design triangle to visual analytics of time-oriented data. Computers & Graphics, 38:286–290, 2014.
[31] Mozilla Foundation. Lightbeam. https://github.com/mozilla/ ightbeam-we , 2011. Accessed: 2020-07-13.
[32] P. Murmann and S. Fischer-Hübner. Tools for achieving usable ex post transparency: a survey. IEEE Access, 5:22965–22991, 2017.
[33] netograph.io. netograph. https://netograph.io, 2019. Accessed: 2020-07-13.
[34] P. A. Norberg, D. R. Horne, and D. A. Horne. The privacy paradox: Personal information disclosure intentions versus behaviors. Journal of consumer affairs, 41(1):100–126, 2007.
[35] OpenDataCity. re:log. https://opendatacity.github.io/relog/, 2013. Accessed: 2020-07-13.
[36] OpenDataCity. vds-suisse. https://opendatacity.github.io/vds-suisse/index_en.html, 2017. Accessed: 2020-07-13.
[37] S. Pötzsch. Privacy awareness: A means to solve the privacy paradox? In IFIP Summer School on the Future of Identity in the Information Society, pp. 226–236. Springer, 2008.
[38] J. Rameerez. useguard. https://useguard.com/, 2019. Accessed: 2020-07-13.
[39] P. Raschke, A. Küpper, O. Drozd, and S. Kirrane. Designing a gdpr-compliant and usable privacy dashboard. In IFIP International Summer School on Privacy and Identity Management, pp. 221–236. Springer, 2017.
[40] Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation), 2018.
[41] C. Riederer, D. Echickson, S. Huang, and A. Chaintreau. Findyou: A personal location privacy auditing tool. In Proceedings of the 25th International Conference Companion on World Wide Web, pp. 243–246. International World Wide Web Conferences Steering Committee, 2016.
[42] Roy, Hugo and community. Tos;dr. https://tosdr.org/, 2012. Accessed: 2020-07-13.
[43] B. Shneiderman. Tree Visualization with Tree-maps: 2-d Space-filling Approach. Technical Report 1, ACM Transactions on Graphics (TOG), New York, NY, USA, Jan. 1992. doi: 10.1145/102377.115768
[44] B. Shneiderman. The eyes have it: A task by data type taxonomy for information visualizations. In The craft of information visualization, pp. 364–371. Elsevier, 2003.
[45] D. J. Solove. I’ve got nothing to hide and other misunderstandings of privacy. San Diego L. Rev., 44:745, 2007.
[46] SoSci Survey die Lösung für eine professionelle Onlinebefragung. Accessed: 2020-07-13.
[47] Tactical Technology Collective. Trackography. https://trackography.org/, 2016. Accessed: 2020-07-13.
[48] W. B. Tesfay, P. Hofmann, T. Nakamura, S. Kiyomoto, and J. Serna. Privacyguide: towards an implementation of the eu gdpr on internet privacy policy evaluation. In Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics, pp. 15–21. ACM, 2018.
[49] A. Thudt, D. Baur, and S. Carpendale. Visits: A spatiotemporal visualization of location histories. In Proceedings of the eurographics conference on visualization, pp. 79–83, 2013.
[50] A. Thudt, D. Baur, S. Huron, and S. Carpendale. Visual mementos: Reflecting memories with personal data. IEEE transactions on visualization and computer graphics, 22(1):369–378, 2015.
[51] Twitter. Accessed: 2020-07-13.
[52] Universal Declaration of Human Rights, 1948. Art. 12. Accessed 2019-11-18.
[53] A. van Deursen, E. Helsper, and R. Eynon. Measuring digital skills: from digital skills to tangible outcomes project report. University of Twente, Netherlands, 2014.
[54] D. Watson, L. A. Clark, and A. Tellegen. Development and validation of brief measures of positive and negative affect: the panas scales. Journal of personality and social psychology, 54(6):1063, 1988.
[55] A. F. Westin. Privacy and freedom. Washington and Lee Law Review, 25(1):166, 1968.
[56] A. F. Westin. Harris-equifax consumer privacy survey 1991. Atlanta, GA: Equifax Inc, 1991.
[57] WolframAlpha. Facebook analysis. 2015. Accessed: 2020-07-13.
[58] ZEIT ONLINE. Malte Spitz Vorratsdaten. 2013. Accessed: 2020-07-13.
[59] M. Zweerink. Whatsspy public. https://maikel.pro/blog/en-whatsapp-privacy-options-are-illusions/, 2015. Accessed: 2020-07-13.