A Privacy-Preserving Architecture for the Protection of Adolescents in Online Social Networks

Markos Charalambous, Petros Papagiannis, Antonis Papasavva, Pantelitsa Leonidou, Rafael Constaninou, Lia Terzidou, Theodoros Christophides, Pantelis Nicolaou, Orfeas Theofanis, George Kalatzantonakis, Michael Sirivianos

Cyprus University of Technology, Limassol, Cyprus; Cyprus Research and Innovation Center, Nicosia, Cyprus; Aristotle University of Thessaloniki, Thessaloniki, Greece; LSTech LTD, Milton Keynes, United Kingdom
Email: {marcos.charalambous, petros.papagiannis, t.christophides, michael.sirivianos}@cut.ac.cy, {as.papasavva, pl.leonidou}@edu.cut.ac.cy, {r.constantinou, p.nicolaou}@cyric.eu, [email protected], {orfetheo, george}@lstech.io

Abstract—Online Social Networks (OSNs) constitute an integral part of people's everyday social activity. Specifically, mainstream OSNs, such as Twitter, YouTube, and Facebook, are especially prominent in adolescents' lives for communicating with other people online, expressing and entertaining themselves, and finding information. However, adolescents face a significant number of threats when using online platforms. Some of these threats include aggressive behavior and cyberbullying, sexual grooming, false news and fake activity, radicalization, and exposure of personal information and sensitive content. There is a pressing need for parental control tools and Internet content filtering techniques to protect the vulnerable groups that use online platforms. Existing parental control tools occasionally violate the privacy of adolescents, leading them to use other communication channels to avoid moderation.
In this work, we design and implement a user-centric Cybersafety Family Advice Suite (CFAS) with Guardian Avatars, aiming at preserving the privacy of the individuals towards their custodians and towards the advice tool itself. Moreover, we present a systematic process for designing and developing state-of-the-art techniques and a system architecture to prevent minors' exposure to numerous risks and dangers while using Facebook, Twitter, and YouTube on a browser.
Keywords – online social networks; online threats; cybersecurity risks; privacy; minors.

I. INTRODUCTION
The majority of teens use more than one social media site, according to a Pew Research Center survey (N = 743) [1]. A 2018 poll (N = 1001) [2] found that 5 to 15 year-olds spend, on average, about 15 hours online every week. Additionally, a large share of the 11 to 16 year-olds surveyed said that they have an online social network account. These numbers illustrate that the overwhelming majority of young people use OSNs, even if they are not old enough to legally register accounts for most mainstream OSNs, like Facebook, Instagram, Twitter, YouTube, and Snapchat. Alarmingly, there are many risks adolescents are exposed to when using OSNs. Specifically, a 2019 study [3] of 21.6K primary school children and 18.1K secondary school children found that a share of each group had seen content that encouraged people to hurt themselves. The same study reports that 11 to 18 year-olds reported seeing sexual content on the most popular OSNs. Last, reviews from over 2K young people aged 11 to 18 show that many witnessed violence and hatred, encountered sexual content, and witnessed others being victims of cyberbullying. A different study conducted in 2018 found that a considerable share of U.S. teens have been victims of cyberbullying or harassment online. Additionally, about a third of teens report that someone has spread false rumors about them on the Internet, while smaller shares have been the target of physical threats online. Notably, the majority of the victims tend to be females. The study concludes that many parents worry that their child might be getting bullied online, but most are confident they can teach their teen about acceptable online behavior [4].

Overall, the popularity of the Internet, and of OSN usage in particular, is very high and increasing among youngsters. Thus, the online risks for these sensitive age groups have received increased attention. To design an architecture for the protection of youngsters in OSNs, we list the most frequent dangers that young users might encounter.
Existing literature [4]–[6] agrees on the following distinctive threats: i) cyberbullying; ii) cyberpredators; iii) sensitive information leakage; iv) manipulated content and pornography; and v) offensive images and messages.

Contributions.
In summary, this work makes the following contributions:
1) The design and implementation of a privacy-preserving CFAS that utilizes machine learning classifiers and other filters to protect minors when using OSNs.
2) CFAS makes efforts to keep minors fully aware of what their custodians and what the Family Advice Suite can monitor, filter, and analyze about their online activity.
3) CFAS employs fine-grained tools to spread awareness among custodians and minors about the various threats they face when using OSNs. It also utilizes the Guardian Avatar, which interacts with and advises adolescents in a direct and user-friendly way.
4) The proposed architecture can accurately detect: (i) cyberbullying; (ii) sexual grooming; (iii) abusive users; (iv) bot accounts; (v) personal information exposure; (vi) sensitive content in pictures; (vii) hateful and racist memes; and (viii) disturbing videos.

Paper Organization.
The rest of the paper is organized as follows. First, we provide a detailed demonstration of the proposed architecture in Section II, followed by our design principles in Section III. Then, we list and discuss how the classifiers hosted on the Intelligent Web-Proxy (IWP) work in Section IV. We also provide an early evaluation of the system via a virtual environment and physical experiments with beta testers (Section V), before discussing existing related work on parental control tools in Section VI. Last, we conclude this work in Section VII.

II. ARCHITECTURAL OVERVIEW
In this section, we describe the main pillars of our architecture. The architecture comprises the following: 1) the OSN Data Analytics Software Stack (Back-End); 2) the Intelligent Web-Proxy; and 3) the browser add-on. For the tool to work efficiently, all three components interact with each other, but none depends on the others to function. Figure 1 depicts the proposed architecture of the CFAS framework, including its main components and the interfaces that interconnect them. We describe the main purposes and functionalities of each component below.
A. OSN Data Analytics Software Stack
The first component of the CFAS architecture is the OSN Data Analytics Software Stack, referred to as the Back-End henceforth. This is a single machine, which is responsible for training machine learning algorithms for the detection of threats in OSNs. The trained classifiers and detection rules created on this machine are sent automatically to the registered Intelligent Web-Proxies (IWPs) when available. Anonymized OSN data are sent from the IWPs back to the Back-End only if both the custodian and the minor give their explicit consent (4* in the figure). These anonymized data are used to retrain the machine learning algorithms hosted in the Back-End to extract more accurate and intelligent classifiers, which are then sent back to the IWPs to replace the existing ones, as shown in Figure 1.
B. Intelligent Web-Proxy
The Intelligent Web-Proxy (IWP) is a small device that is connected to the router of the service provider in the house of the protected family. We note that every network needs its own IWP, as a single IWP supports only one network. The IWP consists of three modules that handle specific tasks, as described below.
1) DOM Tree Analysis:
This part of the IWP captures all the incoming and outgoing traffic of the user (child). Note that the word user refers to the child protected by our architecture henceforth. First, the user requests a webpage using their browser (see 1 in Figure 1). The response of this request is sent to the IWP, specifically to the DOM Tree Analysis module (step 3 in the figure). After capturing the traffic, the DOM Tree Analysis module handles TLS connections and performs TLS termination to decrypt HTTPS websites (only Facebook and Twitter currently). Importantly, the IWP is tested to manage high network traffic load and extract the webpage content from the captured DOM tree. At the same time, the same data are sent to the Data Access Layer for analysis (see 4 in Figure 1). We describe how the Data Access Layer (DAL) works below.
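The content extraction step can be illustrated with a minimal sketch. The snippet below is a simplified, hypothetical stand-in for the DOM Tree Analysis module (the real module operates on decrypted live traffic); it uses Python's standard html.parser to pull the text of post-like elements out of a captured page, where the class name "post" is an illustrative marker, not the real OSN markup:

```python
from html.parser import HTMLParser

class PostTextExtractor(HTMLParser):
    """Collects the text of elements whose class matches a target.

    Illustrative stand-in for the DOM Tree Analysis module; void
    elements (img, br, ...) are skipped for brevity.
    """
    VOID = {"img", "br", "hr", "input", "meta", "link"}

    def __init__(self, target_class="post"):
        super().__init__()
        self.target_class = target_class
        self.depth = 0          # >0 while inside a matching element
        self.posts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.target_class in classes:
            self.depth += 1
            if self.depth == 1:
                self.posts.append("")   # open a new post buffer

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.posts[-1] = self.posts[-1].strip()

    def handle_data(self, data):
        if self.depth:
            self.posts[-1] += data

html = '<div class="post">hello world</div><div class="ad">buy now</div>'
parser = PostTextExtractor()
parser.feed(html)
print(parser.posts)  # ['hello world']
```

In the deployed system, the extracted pieces of content would then be forwarded individually to the DAL, as described next.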
2) Data Access Layer:
The Data Access Layer hosts all the trained classifiers and detection rules generated by the Back-End, which are used to check all the captured traffic it receives. Figure 2 demonstrates the functionality of the Data Access Layer, which is the main storage unit hosted in the IWP and the Back-End of the CFAS infrastructure. First, the data captured by the DOM Tree Analysis module are sent to the Decision Mechanism of the DAL (step 1 in Figure 2). Every piece of information (Facebook chat, Facebook news-feed pictures, Facebook posts created by the user, Facebook pictures uploaded by the user, visited YouTube videos, and visited Twitter user profiles) is sent individually. Upon reception of this data, the Decision Mechanism creates a unique Execution ID (ExecID); see step 2 in the figure. This unique string is used by the Decision Mechanism to define the job number of the trained classifier that is used to analyze the data.

Then, the Decision Mechanism requests the Data Access API to store this data in the database, a MongoDB (step 3). Once the data are stored, the Data Access API binds them to a unique number, which is used as a primary key to identify the data: the DataID. The DataID is sent back to the Decision Mechanism (step 4), which combines it with the ExecID to call the suitable trained classifier to detect suspicious behavior (see step 5). Once the trained classifier receives the ExecID and the DataID, it sends the DataID to the Data Access API to request the retrieval of the data for analysis (step 6), which in return are sent back to the trained classifier (step 7). Once the trained classifier has finished the analysis of the data, it sends its results to the Data Access API, along with the ExecID and DataID, to be stored in the database (step 8). Then, the trained classifier sends the ExecID and DataID back to the Decision Mechanism to inform it that the analysis has finished (step 9). In response, the Decision Mechanism requests the results of the job from the Data Access API (step 10), and the Data Access API responds with the results of the analysis (step 11). Last, based on the results of the trained classifier and thresholds set in the Decision Mechanism, the Decision Mechanism decides whether a notification needs to be sent to the user via the CFAS browser add-on and to the custodian of the user via the Parental Console. If this is the case, the Decision Mechanism triggers an event via the Notification Module (step 12). Note that step 12 in Figure 2 is the same as steps 5 and 5* in Figure 1.
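The twelve-step flow above can be sketched in a few lines. The snippet below is a minimal, in-memory approximation: a dict stands in for the MongoDB behind the Data Access API, the classifier is a toy stub, and all names are illustrative rather than the actual CFAS API:

```python
import uuid

DB = {}  # stands in for the MongoDB behind the Data Access API

def store(data):                      # steps 3-4: store data, return a DataID
    data_id = str(uuid.uuid4())
    DB[data_id] = {"data": data, "result": None}
    return data_id

def classify(exec_id, data_id):       # steps 5-9: stub trained classifier
    text = DB[data_id]["data"]        # steps 6-7: retrieve the data
    score = 1.0 if "stupid" in text else 0.0   # toy "bullying" score
    DB[data_id]["result"] = score     # step 8: store the result
    return exec_id, data_id

def decision_mechanism(data, threshold=0.5):
    exec_id = str(uuid.uuid4())       # step 2: unique ExecID
    data_id = store(data)
    classify(exec_id, data_id)
    result = DB[data_id]["result"]    # steps 10-11: fetch the result
    return result >= threshold        # step 12: notify if above threshold

print(decision_mechanism("you are stupid"))  # True -> notification triggered
print(decision_mechanism("nice weather"))    # False
```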
3) Parental Console:
The last component hosted in the Intelligent Web-Proxy is the Parental Console. The Parental Console is a fine-grained web-based platform that enables the custodian of the user to manage which data of the user (child) the custodian and the IWP can see. Also, via the Parental Console, the custodian can choose what the IWP filters, protects, and blocks. Additionally, custodians can set the level of the child's cybersafety. To put these options into operation, the child receives notifications on their browser add-on through the Notification Module, informing them that their custodian has made changes to the options.

We highlight that for these options to operate, the child needs to approve them via their browser add-on. This way, we ensure that the child gives their consent about what the IWP captures, analyzes, filters, and blocks. At the same time, this functionality ensures that the child knows exactly what notifications their custodian will be receiving about the child's online activity, and what OSN traffic activity the custodian can see.

Figure 1. Cybersafety Family Advice Suite Architecture
Figure 2. Data Access Layer (DAL) processes. DAL is the main storage unit of the IWP and the Back-End of the CFAS infrastructure.

We note that our proposed architecture promotes a conversation and close communication between the custodian and the child. This way, the family protected by CFAS can agree on what online activity of the child the custodians need to monitor, and what the main risks and threats involved in using OSNs are. Moreover, this architecture promotes OSN threat awareness, hence enforcing a culture of safe OSN usage. To achieve this, we introduce specific Parental and Back-End Visibility options and Cybersafety options.

1) Parental Visibility Options: These options define what the custodian of the user can see, while enabling various levels of monitoring for the custodians, always with the explicit consent of the user.
We define three Visibility Levels:
• Level 1: This is the lowest level of parental visibility, meaning that the custodian cannot see any data regarding the OSN traffic of the user. We note that the custodian still receives notifications regarding the threats detected by the trained classifiers hosted in the IWP, without mentioning the name of the perpetrator or revealing any OSN data. For the sake of the following examples, we assume that the protected child's name is John: "John might be a victim of cyberbullying."
• Level 2: This level of visibility allows the custodian to select some of the following OSN activity of the child to be visible to them: suspicious Twitter usernames the child visited, disturbing YouTube videos the child watched, and the Facebook wall, photos, and friends of the child. Once the user gives their consent via their browser add-on for this data to be visible to the custodian, the visibility option is operational. A notification example: "John might be a victim of cyberbullying by Eve", where John is the protected child and Eve is the perpetrator.
• Level 3: This is the default and highest level of parental visibility. When this option is selected, it adds all the options from Level 2, along with data regarding the user's Facebook chat. So at this level, the custodian of the child can see all the incoming and outgoing traffic of the child's Facebook wall, photos, notifications, friends, and chat, but only in case of an incident. A notification example: "John might be a victim of cyberbullying by Eve. Click here to see the suspicious chat". This way, the custodian can see the portions of the chat between the user and the perpetrator that show signs of cyberbullying.

We note that these options expire every six months, so the custodian and the child can reset them as they wish. All the above levels of visibility can be set up after a mutual agreement between the custodian and the user, while keeping the user fully aware of what their custodian can see.

2) Back-End Visibility Options: Through the Back-End Visibility options, the Cybersafety Family Advice Suite offers options regarding which OSN traffic data is sent to the Back-End. OSN data sent to the Back-End are used to retrain the machine learning algorithms and detection rules hosted there to make them more accurate in future predictions. The custodian can choose among the child's Facebook wall, photos, notifications, friends, and chat. We note that the user needs to give their consent for the data to be sent to the Back-End. We define the following Back-End Visibility Levels:
• Level 1: This is the lowest level of Back-End visibility. If this option is set, no data is sent to the Back-End.
• Level 2: At this level, the custodian allows the IWP to send data to the Back-End regarding the child's Facebook wall, friends' Facebook walls, and the profiles of the child's Facebook friends. The custodian may select one or all of the above. Also, these data may be sent anonymized or not.
• Level 3: This is the highest level of Back-End visibility. When this option is set, it allows the IWP to send all the data from Level 2, in addition to the child's Facebook chats. Once again, these data may be sent anonymized or not, and always with the consent of both the custodian and the child.

3) Cybersafety Options: Last, the Parental Console allows the custodian to choose the child's level of cybersafety. These options define how aggressive the IWP can be regarding the protection of the user: what the IWP can filter, protect, block, replace, encrypt, or watermark. These options can be configured at two different levels:
• Level 1: This is the lowest level of cybersafety. If set, the IWP only pushes notifications to the user explaining that certain suspicious or malicious activity was detected. This means that the IWP still detects suspicious activity, but it does not hide, protect, encrypt, block, or watermark any content. Via the Parental Console, the custodian can choose the notifications they wish the child to receive for each detection mechanism.
The detection mechanisms include: a) cyber grooming; b) hate or inappropriate speech (cyberbullying); c) distressed behavior (when the child is suicidal, scared, or depressed); d) fake activity (fake OSN profiles); e) personal information exposure (when the child is about to publish personal information); f) hateful memes; g) inappropriate YouTube videos; and h) sensitive content in pictures (when the child is about to share a benign picture that includes nudity without protection, like a picture in a swimsuit).
• Level 2: At this level, the custodian may choose any of the above IWP detection mechanisms to take action and filter, replace, protect, encrypt, or block content before it reaches the browser of the protected child. The detection mechanisms remain the same as in Level 1, but the custodian needs to select at least one for this level to be operational.

Overall, the IWP is responsible for capturing the incoming and outgoing Facebook, Twitter, and YouTube traffic of the user and sending it to the locally hosted trained classifiers to detect malicious activity. In case suspicious activity is detected by one or more trained classifiers, the IWP pushes a notification to the browser add-on of the user to inform them about the detected threat. At the same time, the suspicious or malicious content is blocked or filtered by the browser add-on to protect the minor, given that Cybersafety Option Level 2 is set by the custodian and the user.
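The Parental Visibility Levels described above lend themselves to a compact encoding. The sketch below is illustrative only (the field names and the check are hypothetical, not the actual Parental Console code), but it shows the intended semantics: each level exposes a cumulative set of fields, and nothing is visible without the child's consent:

```python
from enum import IntEnum

class ParentalVisibility(IntEnum):
    LEVEL_1 = 1   # notifications only, no OSN data
    LEVEL_2 = 2   # selected activity (usernames, videos, wall, photos, friends)
    LEVEL_3 = 3   # Level 2 plus Facebook chat, on incidents only

# Data visible to the custodian at each level (cumulative).
VISIBLE_FIELDS = {
    ParentalVisibility.LEVEL_1: set(),
    ParentalVisibility.LEVEL_2: {"twitter_usernames", "youtube_videos",
                                 "fb_wall", "fb_photos", "fb_friends"},
}
VISIBLE_FIELDS[ParentalVisibility.LEVEL_3] = (
    VISIBLE_FIELDS[ParentalVisibility.LEVEL_2] | {"fb_chat"}
)

def custodian_can_see(level, field, child_consented):
    """A field is visible only if the level exposes it AND the child consented."""
    return child_consented and field in VISIBLE_FIELDS[level]

print(custodian_can_see(ParentalVisibility.LEVEL_2, "fb_chat", True))   # False
print(custodian_can_see(ParentalVisibility.LEVEL_3, "fb_chat", True))   # True
print(custodian_can_see(ParentalVisibility.LEVEL_3, "fb_chat", False))  # False
```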
The IWP hosts trained classifiers and detection rules to perform the following actions:
1) detect nudity in images included in the captured traffic;
2) encrypt sensitive images with steganography;
3) detect and warn the minor in case they are about to share personal information;
4) detect cyberbullying in Facebook conversations;
5) detect sexual grooming in Facebook conversations;
6) detect hateful and racist memes in the Facebook feed;
7) detect bot, aggressive, bully, and spam Twitter users;
8) detect videos inappropriate for children on YouTube;
9) provide sentiment analysis of the chat of the minor;
10) generate informative notifications to the minor;
11) push notifications to the custodian about an incident (e.g., sexual grooming);
12) push notifications to the child via the browser add-on;
13) submit data to the Back-End through a secure tunnel; and
14) block adult sites, or any other site defined by the custodian.
C. Browser add-on
The last component of our architecture is the browser add-on (CFAS add-on in Figure 1). The browser add-on is the gateway between the IWP and the user, responsible for informing the user about the threats detected by the IWP, and about the Visibility and Cybersafety options set by their custodian.

Importantly, our browser add-on operates as a Guardian Avatar that the child may interact with to ask for advice. Our avatar operates as the guardian angel of the user while they use different OSN platforms (currently Facebook, Twitter, and YouTube only). By following the Guardian Avatar approach as a gamification feature [7], CFAS aims to encourage users to use and interact with it because of its extended usability and improved user-experience functionalities.

In addition, the user can select their favorite avatar icon from a list of icons. The Guardian Avatar "follows" the user in their online activities as a virtual friend. When the IWP detects any malicious behavior or incidents, the notifications (warnings, advice, etc.) appear as chat bubbles of the avatar, in friendly and encouraging text. An example of the avatar notifying the minor about a detected incident is depicted in Figure 3. With the addition of the avatar, we expect that the CFAS warnings and advice will be less disturbing for children (especially adolescents) and will make users more willing to use the tool.
Figure 3. Guardian Avatar notifies the minor of any detected incidents
The browser add-on can:
1) notify the user about the activity detected by the IWP;
2) notify the user about what their custodian can see based on the applied preferences (Parental Visibility options);
3) notify the user about what data is sent to the Back-End to help the machine learning classifiers become more accurate (Back-End Visibility options);
4) let the user change the options about what OSN traffic activity their custodian can see;
5) let the user change the options about what data is sent to the Back-End;
6) let the user flag content/text as cyberbullying activity, sexual cyber grooming activity, aggressive behavior activity, fake identity activity, or false information activity in case the IWP failed to detect it;
7) let the user flag sensitive or nudity content in case the IWP failed to detect it; and
8) let the user flag content/text as incorrectly detected sensitive content, cyberbullying, sexual grooming, aggressive behavior, fake identity, or false information activity in case the IWP flagged it erroneously.

Overall, we propose a fully privacy-preserving architecture for the protection of minors when they use OSNs, both towards their custodians and towards the system itself. First, the minor is empowered to choose the online activity and warnings that their custodian receives in case a threat is detected by the IWP. This can be done via the Parental Visibility options. Second, the user can choose which online activity the IWP filters, captures, and protects via the Cybersafety options. Also, the IWP, the device that is responsible for capturing and analyzing the online activity of the minor to detect online threats, is connected and physically exists within the network of the user. Thus, the online activity of the minor is captured and analyzed locally and is isolated within the network of the user.
In addition, the IWP never makes any data visible to the rest of the system (Back-End or other IWPs) without the explicit authorization and consent of both the user and their custodian via the Back-End Visibility options.

III. DESIGN
We now detail the design of the proposed architecture. Instead of simple rule-based filters, our architecture utilizes advanced machine learning algorithms. The downside of rule-based filters is that they are blunt. There are situations where a particular piece of content technically does not violate the specified policies, but when this content is analyzed with advanced machine learning techniques, it might turn out to be hate speech, sarcasm, sexual grooming, etc. Such techniques allow us to detect bullies or predators that are close to the line. To sum up, the aim is to have granular standards so that our design can control for bias. Our design approach is based on the following design principles:

1) We place all functionalities (filters, text replacement, notifications, data submission to the Back-End, etc.) in the IWP instead of the browser add-on whenever they can be correctly and efficiently implemented there. This way, we prevent a minor from modifying or disabling the system's functionality through the browser add-on. For example, in case a minor accidentally or willingly disables the browser add-on, the IWP is not affected, and all the processes and functionalities continue their operation normally. We assume that the device of the minor is still configured to route social network services through the IWP and that the child does not have the permission, knowledge, or access to alter the configuration of the IWP or their personal device. Also, the IWP can notify the custodian through the Parental Console that the browser add-on of the minor is no longer responding.

This architecture aims to provide the ability to seamlessly support multiple types of clients (desktop browsers, mobile apps, etc.) with minimal client or client-platform configurations or modifications. Moreover, the browser add-on does not support complex functionalities other than JavaScript and HTML scripts.
For example, functionalities like text replacement, picture encryption, and filtering are too complex to be implemented and run on a browser add-on.

In case the IWP is down, the browser add-on issues REST API requests to the Back-End, and the Back-End DAL is employed to identify suspicious content. This means that the OSN traffic activity of the user is sent outside of the network, to the Back-End, for analysis. Whether suspicious activity is detected by the Back-End or not, all the user's OSN traffic data is automatically deleted from the Back-End afterwards. Having these functionalities on the IWP prevents it from issuing REST API requests to the Back-End every time it needs to analyze OSN traffic activity. In addition, placing these functionalities on the IWP avoids the whole system going down in case of Back-End unavailability, thus avoiding a single point of failure. Examples: i) the IWP can push notifications to the browser add-on without the need of the Back-End; ii) before any content reaches the minor's device, the IWP can replace cyberbullying content without issuing REST API requests to the Back-End, using the functionality already installed on it.

2) Rules and trained classifiers are generated in the Back-End. Trained classifiers are placed in the IWP only if they can run there efficiently. The Back-End collects data from all the IWPs to generate detection rules and trained classifiers. Data collected from the IWPs are used to generate cyberbullying, sexual cyber grooming, distressed behavior, aggressive behavior, fake identity, and false information detection rules.

3) Warning, flagging, and feedback functionality is placed on the browser add-on. The Guardian Avatar displays notifications in dialogue boxes after the IWP detects suspicious behavior and pushes a notification to the browser add-on.
The user can flag content as cyberbullying activity, sexual cyber grooming activity, aggressive behavior activity, fake identity, false information, or a sensitive picture through the browser add-on in case the IWP failed to detect it. The user can also give feedback on the activity detected by the IWP. For example, in case the IWP detects cyberbullying, it pushes a notification to the browser add-on. The Guardian Avatar shows the notification/warning to the user, explaining that cyberbullying was detected (Figure 3). Then, the user can provide feedback on whether this detection is accurate or not.

4) The minor can check the content that their custodian, the IWP, and the Back-End can see. The custodian can set up the Visibility settings in a fine-grained way and always with the consent of the minor. This way, we enable various levels of monitoring for parents and the Back-End with the child's consent, while keeping the child fully aware of what their custodians and the Back-End can see, e.g., chat messages.

Overall, we propose a system that eases the tension between ensuring the safety of minors and respecting their privacy with respect to what their custodians and third parties can see. By automating the detection of malicious communication, we enable custodians to be continuously aware of their child's safety. This is achieved without the parent having to go through the minor's online communication manually, and thus without having to invade the minor's privacy. Our approach aims to warn the custodians about the suspicious online activity that was detected, without violating the privacy of the minor. For example, if the minor has a Facebook online conversation with sexual content with somebody, the custodian of the minor will receive a warning that such a conversation is taking place, once the IWP captures it. Still, the parent won't be able to see the actual content, because that would violate the teenager's privacy.
Instead, the parent can only see the actual conversation through their Parental Console once the explicit consent of the child has been granted. To sum up, our design principles intend to encourage custodians to have a conversation with the minor, thus bringing families closer and spreading awareness about the numerous threats that exist in contemporary OSNs.

IV. IMPLEMENTATION
We implement all the architecture components and integrations that we describe in Sections II and III. In this section, we provide the details of the prototype implementation. Note that we employ classifiers created in previous work for the detection of threats in OSNs. These classifiers are generated on the Back-End and hosted on the IWP. In case the classifiers detect suspicious activity, the IWP pushes notifications to the browser add-on of the user and to the Parental Console.
A. Detection of Abusive Users on Twitter
When the minor visits a Twitter user account, the IWP captures the username of the visited user and calls the Twitter API to collect the last 20 tweets (including retweets) of that user [8]. This information is then sent to a classifier developed by Chatzakou et al. [9] for analysis. The classifier is trained with annotated Twitter data [10], [11] and analyzes the last 20 tweets of the visited Twitter user to detect whether it is an aggressive, bully, spam, or normal account.
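As an illustration of this pipeline (not the actual classifier of Chatzakou et al. [9]), the sketch below extracts a few simple features from a user's recent tweets and passes them to a stub scoring function; the feature names, the toy lexicon, and the threshold are all hypothetical:

```python
# Illustrative sketch of the per-user Twitter analysis pipeline.
# The real system fetches the last 20 tweets via the Twitter API and
# uses a trained classifier; here both steps are stubbed.

OFFENSIVE = {"idiot", "loser"}  # toy lexicon, for illustration only

def extract_features(tweets):
    """Turn a list of tweet texts into a small feature dict."""
    words = [w.lower().strip(".,!?") for t in tweets for w in t.split()]
    return {
        "n_tweets": len(tweets),
        "offensive_ratio": sum(w in OFFENSIVE for w in words) / max(len(words), 1),
        "mention_ratio": sum(w.startswith("@") for w in words) / max(len(words), 1),
    }

def classify_account(tweets, threshold=0.05):
    """Stub classifier: label the account from its offensive-word ratio."""
    feats = extract_features(tweets)
    return "bully" if feats["offensive_ratio"] > threshold else "normal"

print(classify_account(["you are an idiot", "total loser!"]))  # bully
print(classify_account(["nice weather today", "great game"]))  # normal
```

The real classifier uses a far richer feature set and a trained model; the sketch only shows where the fetched tweets enter the analysis and where the account label comes out.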
B. Fake and Bot user detection on Twitter
When the minor visits an account on Twitter, the IWP captures the username of the Twitter account and sends it for analysis via a REST API call to the service developed in [17] and by Echeverria et al. [18]. This API returns True if the Twitter user account is a bot, and False otherwise. In the former case, the IWP pushes a notification to the browser add-on of the minor and to the Parental Console of the custodian (based on the Parental Visibility options).
C. Detection of Hateful and Racist memes on Facebook
The IWP captures the incoming and outgoing Facebook traffic of the minor and performs TLS termination to obtain the DOM tree. All the images extracted from the DOM tree are sent to the classifier developed by Zannettou et al. [12] to be labeled as hateful memes or not. This classifier is trained using images from Twitter, Reddit, 4chan's Politically Incorrect board [13], and Gab [14]. In case the detection is positive, the picture is automatically replaced by the IWP with a static image to inform the minor.

Similarly, when the minor uploads an image on Facebook, the picture is analyzed by the aforementioned classifier to detect whether the image is hateful or racist. If so, the IWP pushes a notification to the Guardian Avatar to advise the minor that the image they are trying to upload contains hateful content and that they should not upload it.
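The replace-on-detection step can be sketched as follows; the classifier is stubbed as a set of known-bad URLs (standing in for the model of Zannettou et al. [12]), and the placeholder URL is hypothetical:

```python
# Illustrative sketch: swap images flagged by a (stubbed) hateful-meme
# classifier for a static placeholder before the page reaches the browser.

PLACEHOLDER = "https://cfas.example/blocked.png"  # hypothetical static image

def is_hateful(image_url, flagged):
    """Stub for the meme classifier: a set of known-bad URLs stands in
    for the trained model."""
    return image_url in flagged

def filter_images(image_urls, flagged):
    """Return the URL list with flagged images swapped for the placeholder."""
    return [PLACEHOLDER if is_hateful(u, flagged) else u for u in image_urls]

page_images = ["https://osn.example/a.jpg", "https://osn.example/meme.jpg"]
print(filter_images(page_images, flagged={"https://osn.example/meme.jpg"}))
# ['https://osn.example/a.jpg', 'https://cfas.example/blocked.png']
```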
D. Sexual Predator Detection on Facebook
When the minor is chatting with a friend on Facebook, the conversation is captured by the IWP and sent to the classifier developed by Partaourides et al. [15] for analysis. A previous version of this classifier was trained with data from the Perverted Justice website [16] to recognize patterns similar to the ones exhibited by convicted sexual predators. Upon positive detection, the IWP pushes a notification to the browser add-on of the minor, notifying them that signs of a sexual predator have been detected. The custodian can see only portions of the chat between the minor and the predator via the Parental Console, and only if the minor consents via the Parental Visibility options explained in Section II. We note that the custodian can only see the portions of the chat that the classifier detects as a sexual grooming pattern.
E. Cyberbullying Detection on Facebook
Similar to the sexual predator detection, when the minor is chatting with a friend on Facebook, the conversation is captured by the IWP and sent to the classifier developed by Partaourides et al. [15] for analysis. Using sentiment analysis, this classifier returns scores of how angry, frustrated, and sad the minor is during the Facebook chat conversation. If any of these three scores exceeds a set threshold, the IWP pushes a notification to the browser add-on of the child to warn them that the Facebook chat they are having seems to be toxic for them. Similar to the sexual predator detection above, the custodian is only able to see portions of the suspicious chat, and only if the minor gave their consent beforehand.
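The threshold check reduces to a few lines. The alert threshold value is not given in the text, so it is an explicit parameter here; the scores dict mirrors the three emotions the classifier reports.

```python
def is_toxic_chat(scores, threshold=0.5):
    """Return True if any reported emotion score exceeds the threshold.

    scores: e.g. {"angry": 0.7, "frustrated": 0.2, "sad": 0.1}
    threshold: assumed cut-off; the deployed value is not stated in the paper.
    """
    return any(scores.get(k, 0.0) > threshold
               for k in ("angry", "frustrated", "sad"))
```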
F. Personal Information Leakage Detection on Facebook
When the user tries to make a post on Facebook, the IWP captures the text written by the user and analyzes it to detect dates, times, phone numbers with or without extensions, links, emails, IPv4 and IPv6 addresses, prices, credit card numbers, street addresses, and zip codes. We implement this detection technique using existing Python libraries [19]. In case any of the above personal information is detected, the IWP pushes a warning to the minor to remove the sensitive information from their post. In case the minor dismisses these warnings, a notification is sent to the Parental Console of the custodian (in accordance with the Parental Visibility options).
G. Watermarking and Steganography
For the purposes of this detection mechanism, we consider any image that includes nudity (topless images of boys, or swimsuit images) as a sensitive content image. When the minor tries to send a sensitive image to a friend over Facebook chat, the image first passes through the IWP for analysis. We followed techniques similar to Ghazali et al. [20] and Kolkur et al. [21] to develop our skin and nudity detection. If the image contains sensitive content, the IWP watermarks it [22]. Then, the IWP hides the original image in another static image using steganography. This way, only the person the picture was sent to is allowed to see the hidden original image. We note that for this to work, the receiver needs to be part of the Cybersafety Family Advice Suite network, as the decryption keys hosted on the Back-End are requested by the IWP to decrypt the image.

Similarly, if the minor tries to post an image that contains sensitive content on their Facebook wall, the IWP watermarks the image and applies steganography before posting it on Facebook. The minor, using the browser add-on, can set who is able to see (decrypt) this picture (family members, friends, classmates, etc.). For this scenario, we assume that the minor allows the image to be visible to family members only, and that their family members are registered CFAS members with their own IWP set up at home. When a family member of the minor scrolls through Facebook, their IWP captures that image and communicates with the CFAS Back-End to check whether they have permission to see it. If so, the IWP decrypts the image automatically. In case the image does not contain sensitive content, the IWP only applies watermarking before posting it. Receivers that are not part of the CFAS network can only see the static encrypted image.
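The steganographic step can be illustrated with a minimal least-significant-bit (LSB) sketch over raw pixel bytes. This is only a sketch: the actual IWP additionally watermarks the image [22] and gates recovery through decryption keys on the CFAS Back-End, both of which are omitted here.

```python
def hide(cover, secret):
    """Embed `secret` in the least-significant bits of the cover pixel bytes,
    one bit per byte, preceded by a 32-bit big-endian length header."""
    payload = len(secret).to_bytes(4, "big") + secret
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover image too small for payload")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the LSB
    return stego

def reveal(stego):
    """Read the length header, then recover the hidden payload bit by bit."""
    def read_bytes(bit_start, n):
        out = bytearray()
        for b in range(n):
            val = 0
            for i in range(8):
                val = (val << 1) | (stego[bit_start + b * 8 + i] & 1)
            out.append(val)
        return bytes(out)
    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)  # payload bits start after the 32-bit header
```

Because only the least-significant bit of each pixel byte changes, the cover image remains visually indistinguishable from the stego image.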
H. Disturbing videos on YouTube
Our architecture also detects YouTube videos that are disturbing for young children, using the classifier developed by Papadamou et al. [23]. This classifier was trained on YouTube videos [24] and can discern inappropriate content. When a minor visits a YouTube video, the IWP captures the YouTube link, which includes the YouTube video ID, and calls the YouTube API to collect the video features [25]. These features include the video upload date, likes, tags, title, thumbnail, etc. The IWP then sends these video features to the classifier for analysis. In case the classifier returns a positive detection (inappropriate), it warns the minor via the browser add-on that the video they are watching is not suitable for them.

Figure 4. OSN actions with CFAS & without CFAS

V. EVALUATION
In this section, we evaluate the performance of the prototype implementation of the Cybersafety Family Advice Suite.
A. Performance Evaluation
To test the performance with respect to the number of concurrent users, we set up a small home cluster: a laptop with 4 GB RAM and a quad-core Intel Core i5 processor, running Ubuntu 18.04 (64-bit) and Google Chrome version 80.0.3987.162 (64-bit), serves as the minor's laptop hosting the browser add-on. In addition, we set up two virtual machines with 2 GB RAM each and one tablet with 3 GB RAM: 4 users in total. The IWP is a virtual machine hosted on the Google cloud, configured with 4 GB RAM and a dual-core Intel Xeon CPU, running CentOS 7 (64-bit), and it uses mitmproxy [26] as the HTTPS proxy. The IWP also hosts a MongoDB instance for data storage and Python 3 for the API calls. We run the experiments with a downlink of ∼20 Mbps.

B. User Experience
In this section, we present the results of a user experience evaluation questionnaire given to minors and custodians after interacting with CFAS. The participation of minors required their custodians' consent. The sample consists of 30 minors and 12 custodians who had no prior knowledge or experience of the CFAS tools. The questionnaires were GDPR-compliant and anonymous. The study received data protection approvals from the Ethics Committee of the Cyprus University of Technology and from the Office of the Commissioner for Personal Data Protection of the Republic of Cyprus.

To evaluate our tools, the minors answered a variety of questions regarding usability, accessibility, and performance. The minors were between 12 and 16 years old and reported using the Internet daily for entertainment and education purposes. Minors in our sample reported having registered Facebook, Instagram, and YouTube accounts.

We report some of the results obtained from the questionnaires given to minors and their custodians after they used the CFAS tools. When minors were asked whether they would allow CFAS to send notifications to their custodians, the majority reported high or complete agreement (Figure 5). In addition, the majority of minors believe that these tools could improve their safety when using OSNs, as depicted in Figure 6. Importantly, all of the minors report being very happy with the capabilities of CFAS (Figure 7). Alarmingly, Figure 8 depicts that many minors had their personal data and photos misused, were victims of cyberbullying, and witnessed inappropriate speech and racism on social networks.

Figure 5. (Minors) Would you allow CFAS to send notifications to your custodian regarding suspicious detection? (1: Totally Disagree, 5: Totally Agree)
Note that the minors could select any option that applied to them for this question.

On the other hand, the overwhelming majority of the custodians report that their child never complained of being a victim or a spectator of such threats online (Figure 9). Although the number of participants is small, this suggests that minors usually do not report the threats they face on OSNs to their custodians. Last, all of the custodians agree that CFAS could improve the safety of minors online (Figure 10), and the overwhelming majority of custodians report that they would install CFAS at home (Figure 11).

Figure 6. (Minors) Do you believe CFAS would improve your safety when using OSNs? (1: Totally Disagree, 5: Totally Agree)

Figure 7. (Minors) Are you satisfied with CFAS capabilities? (1: Totally Disagree, 5: Totally Agree)

Figure 8. (Minors) Have you ever experienced the following online threats? Select all that apply to you: (a) I prefer not to say; (b) None; (c) Personal data misused; (d) Personal photo misused; (e) Cyberbullying; (f) Inappropriate speech and racism; and (g) Sexual grooming

Figure 9. (Custodians) Has your child ever reported to you being a victim of the following? (a) I prefer not to say; (b) None; (c) Personal data misused; (d) Personal photo misused; (e) Cyberbullying; (f) Inappropriate speech and racism; and (g) Sexual grooming

Figure 10. (Custodians) Do you think that CFAS would improve the safety of minors when using OSNs? (1: Totally Disagree, 5: Totally Agree)

Figure 11. (Custodians) Would you install CFAS at home? (1: Totally Disagree, 5: Totally Agree)

VI. RELATED WORK

This section reviews some web-based and mobile applications that try to protect adolescents on the Internet and OSNs. We list the ones most relevant to the concepts of CFAS.

Qustodio is a parental control software [27] that enables parents to monitor and manage their kids' web and offline activity on their devices. It also tracks with whom the child communicates on various OSNs and can be used as a sensitive-content detection and protection tool (using filters). Last, it monitors messages, calls, and the location of the minor's device.

Kidlogger allows custodians to monitor what their children are doing on their computer or smartphone [28]. It performs keystroke logging, keeps a schedule of which websites the minors visit and what applications they use, and records with whom they are communicating on Facebook. Also,
Kidlogger offers sound recording of phone and online calls, smartphone location tracking, and photo capture monitoring.

Web of Trust (WoT) is a browser add-on and smartphone application for website reputation rating that warns users about whether to trust a website or not [29].

Mspy is a smartphone application that monitors almost all the applications and activities on the smartphone of the minor [30]. Alarmingly, the application may be installed on the minor's smartphone by the custodian and remain hidden, so the minor cannot know they are being monitored.

Syfer [31] is a device, still in production, that can be plugged into the router of the house network and analyzes the traffic activity for possible threats. It protects against cyber threats in real time, stops invasive data collection, offers a VPN, uses artificial intelligence for enhanced security, and blocks advertisements. It does not log any information, and it offers encrypted activity. It restricts inappropriate content with real-time website analysis provided by its AI engine.

Bark [32] monitors text messages, YouTube, emails, and 24 different social networks for potential safety concerns. Bark looks for activity that may indicate online predators, adult content, cyberbullying, drug use, suicidal thoughts, and more. In case anything suspicious is detected, the custodians receive automatic alerts along with expert recommendations from child psychologists for addressing the issue. Bark offers applications for iOS, Android, and Kindle, and browser add-ons for Google Chrome on PC and Safari on Mac. The user has to allow the Bark application to send all the traffic data to Bark's Back-End for analysis and detection.

The majority of the existing applications follow a more traditional approach (monitoring, restrictions over online activities). Most applications consider parents or custodians as the end-users, instead of the children [33], [34].
Many of the applications do not have interfaces for children but are just installed as services running in the background [35]. A new notion suggests designing and developing tools and software that are more "children-aware" and "children-friendly". Online safety applications should consider the child as the major user and try to enrich children's self-regulation and their risk-coping skills in cases of online dangers [36]. By enforcing this child-friendly approach, we achieve a collaboration in which parents and children need to communicate and discuss online risks and behavior, in contrast with the approach of restriction and monitoring. We aim to teach children how to cope with online threats and use social media with responsibility and self-awareness. CFAS follows this approach by involving the child in the process of setting the filters and the parental and Back-End visibility options. In addition, the cybersafety tools require the child's consent to be activated. Last, we note that this work is a follow-up to the work presented by Papasavva [37].

VII. CONCLUSION
In this paper, we present the architecture of a user-centric, privacy-preserving advanced family advice suite for the protection of minors on OSNs. The architecture comprises three main components, namely, the Data Analytics Software Stack, the Intelligent Web-Proxy, and a browser add-on, which operates as a guardian angel of the child while using OSNs. This architecture aims to protect minors when using OSNs while preserving their privacy. We propose Guardian Avatars that interact with, warn, and advise adolescents when they face threats on OSNs. Also, the custodian of the adolescent receives notifications on their Parental Console in case a malicious activity is detected by the classifiers hosted on the IWP, so as to be aware of the threats their child was exposed to. Importantly, the custodian can see the relevant content, which was indicated as suspicious, only if the minor had previously given their explicit consent.

Blocking content from minors or thoroughly monitoring their every online move should not be the solution, as it violates the privacy of adolescents. The proposed architecture promotes collaboration between parents and children and aims at bringing the family together to protect the vulnerable groups of the Internet while using OSNs.

ACKNOWLEDGMENTS
This project has received funding from the European Union's Horizon 2020 Research and Innovation program under the Marie Skłodowska-Curie ENCASE project (Grant Agreement No. 691025) and the CYberSafety II project (Grant Agreement No. 1614254). This work reflects only the authors' views.

REFERENCES

[1] M. Anderson and J. Jiang, "Teens, Social Media & Technology 2018," Pew Research Center: Internet, Science & Tech, vol. 31, 2018.
[2] "Children and parents media use and attitudes: annex 1," 2019, URL: https://bit.ly/2JIshIk [accessed: 2020-08-25].
[3] "Online Abuse - How safe are our children?" 2019, URL: https://bit.ly/390zOhO [accessed: 2020-08-25].
[4] "Pew Research Center. A Majority of Teens Have Experienced Some Form of Cyberbullying," 2018, URL: https://pewrsr.ch/32o2AHY [accessed: 2020-08-25].
[5] "EU Kids Online II Dataset: A cross-national study of children's use of the Internet and its associated opportunities and risks," 2017, URL: https://ab.co/30dr3NB [accessed: 2020-08-26].
[6] T. Andreas, T. Nicolas, S. Makis, P. Kwstantinos, and S. Michael, "Cyber Security Risks for Minors: A Taxonomy and a Software Architecture," in International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP), Thessaloniki, Greece, Nov. 2016, pp. 93–99, ISBN: 978-1-5090-5246-2, URL: https://ieeexplore.ieee.org/abstract/document/7753391 [accessed: 2020-08-26].
[7] S. Deterding, M. Sicart, L. Nacke, K. O'Hara, and D. Dixon, "Gamification: Using game design elements in non-gaming contexts," ACM CHI, vol. 125, pp. 2425–2428, 2011, ISBN: 9781450302685.
[8] "Twitter API," 2020, URL: https://developer.twitter.com/en/docs [accessed: 2020-08-25].
[9] D. Chatzakou et al., "Mean Birds: Detecting Aggression and Bullying on Twitter," in Proceedings of the 2017 ACM Web Science Conference (WebSci), New York, NY, United States, Jun. 2017, pp. 13–22, ISBN: 9781450348966, URL: https://dl.acm.org/doi/pdf/10.1145/3091478.3091487 [accessed: 2020-08-26].
[10] "Restricted Dataset for "Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior"," 2020, URL: https://zenodo.org/record/3706866 [accessed: 2020-08-25].
[11] "Dataset for "Mean Birds: Detecting Aggression and Bullying on Twitter"," 2018, URL: https://zenodo.org/record/1184178 [accessed: 2020-08-25].
[12] S. Zannettou et al., "On the Origins of Memes by Means of Fringe Web Communities," in Proceedings of the Internet Measurement Conference 2018 (IMC), New York, NY, United States, Oct. 2018, pp. 188–202, ISBN: 9781450356190, URL: https://dl.acm.org/doi/pdf/10.1145/3278532.3278550 [accessed: 2020-08-26].
[13] A. Papasavva, S. Zannettou, E. De Cristofaro, G. Stringhini, and J. Blackburn, "Raiders of the Lost Kek: 3.5 Years of Augmented 4chan Posts from the Politically Incorrect Board," in Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), Atlanta, Georgia, US, Jun. 2020.
[15] H. Partaourides et al., in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, May 2020.
[18] J. Echeverria et al., "LOBO: Evaluation of Generalization Deficiencies in Twitter Bot Classifiers," in Proceedings of the 34th Annual Computer Security Applications Conference (ACSAC), New York, NY, United States, Dec. 2018, pp. 137–146, ISBN: 9781450365697.
[19] "GitHub - madisonmay/CommonRegex: A collection of common regular expressions bundled with an easy to use interface," 2019, URL: https://bit.ly/2Zu4gh8 [accessed: 2020-08-26].
[20] G. Osman, M. S. Hitam, and M. N. Ismail, "Enhanced skin colour classifier using RGB ratio model," arXiv, 2012.
[21] S. Kolkur, D. Kalbande, P. Shimpi, C. Bapat, and J. Jatakia, "Human skin detection using RGB, HSV and YCbCr color models," arXiv, 2017.
[22] "Watermark with PIL," 2005, URL: http://code.activestate.com/recipes/362879/ [accessed: 2020-08-25].
[23] K. Papadamou et al., "Disturbed YouTube for Kids: Characterizing and Detecting Inappropriate Videos Targeting Young Children," in Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), Palo Alto, California, USA, May 2020.
[33] ""Stranger Danger!" Social media app features co-designed with children to keep them safe online," in Proceedings of the 18th ACM International Conference on Interaction Design and Children (IDC), New York, NY, United States, Jun. 2019, pp. 394–406, ISBN: 9781450366908, URL: https://dl.acm.org/doi/pdf/10.1145/3311927.3323133 [accessed: 2020-08-26].
[34] B. McNally et al., "Co-designing mobile online safety applications with children," in Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI), New York, NY, United States, Apr. 2018, p. 523, ISBN: 9781450356206, URL: https://dl.acm.org/doi/pdf/10.1145/3173574.3174097 [accessed: 2020-08-26].
[35] P. Wisniewski, A. K. Ghosh, H. Xu, M. B. Rosson, and J. M. Carroll, "Parental control vs. teen self-regulation: Is there a middle ground for mobile online safety?" in Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW), New York, NY, United States, Feb. 2017, pp. 51–69, ISBN: 9781450343350, URL: https://dl.acm.org/doi/pdf/10.1145/2998181.2998352 [accessed: 2020-08-26].
[36] A. K. Ghosh, C. E. Hughes, P. J. Wisniewski, and J. M. Carroll, "Circle of Trust: A New Approach to Mobile Online Safety for Families," in