Bubble Storytelling with Automated Animation: A Brexit Hashtag Activism Case Study

Abstract

Hashtag data are common and easy to acquire. Thus, they are widely used in studies and visual data storytelling. For example, a recent story by China Central Television Europe (CCTV Europe) depicts Brexit as a hashtag movement displayed on an animated bubble chart. However, creating such a story is usually laborious and tedious, because narrators have to switch between different tools and discuss with different collaborators. To reduce the burden, we develop a prototype system to help explore the bubbles' movement by automatically inserting animations connected to the storytelling of the video creators and the interaction of viewers to those videos. We demonstrate the usability of our method through both use cases and a semi-structured user study.


Journal of Visualization manuscript No. (will be inserted by the editor)


Noptanit Chotisarn · Junhua Lu · Libinzi Ma · Jingli Xu · Linhao Meng · Bingru Lin · Ying Xu · Xiaonan Luo · Wei Chen

Received: date / Accepted: date


Keywords

Storytelling · Data Journalism · Automated Animation

Hashtags are an essential part of Twitter trends. Using Twitter’s hashtags for Internet activism is called “hashtag activism” (Carr, 2012) and is commonly used to make people aware of social and political issues. Hashtag activism occurs when vast numbers of posts appear under common hashtagged words, expressions, or paragraphs in social or political cases via social media (Yang, 2016). This type of information can be easily acquired through web crawling techniques and is widely used in scientific research (e.g., Sun et al., 2017; Wu et al., 2018) and data storytelling (e.g., data journalism).

Recently, Brexit has been discussed on online social media all over the world. A data story presented by CCTV Europe shows the dynamics of hashtags on animated bubble charts and has attracted much attention in China. This presentation is inspired by the famous animated visualization in Hans Rosling’s talk. The movements of the bubbles (hashtags) are interrelated and can illustrate interesting events to tell stories. As collaborators on this Brexit visualization story project, we provided various types of visualization, including an animated bubble chart, to support the visual presentation of the data. However, creating the final story requires multiple additional steps. The data journalists must first understand all the data and write news scripts based on specific themes. Subsequently, they need to explore the data and find stories with the provided visualizations based on the scripts. Finally, proper animations (e.g., highlighting, slow-motion) are added based on the previous findings to enhance the animated visualizations. This process requires much manual work, and journalists have to switch between different tools or discuss with collaborators to achieve the final results.
Noptanit Chotisarn, Junhua Lu, Libinzi Ma, Jingli Xu, Linhao Meng, Bingru Lin, Ying Xu, Wei Chen
State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
Wei Chen is the corresponding author.

Xiaonan Luo
Guilin University of Electronic Technology, Guilin, China

arXiv [cs.CY]

To reduce effort during the creation process and provide more opportunities for data storytelling, we present Brexble (short for Brexit bubble), a storytelling prototype system with animated data visualization. This system supports automatic data storytelling through animated bubble charts. We use a scatterplot to visualize the data, as it is intuitive to read (Xie et al., 2018) and helps readers understand the data distribution. We then analyze the data to determine the level of movement of each hashtag. Intense movement levels correspond to essential events. Therefore, we can map the movements to animations that emphasize these events and allow the hashtags to tell a story. The story can be enhanced by fine-tuning animations and adding captions. The narrators can outline the event using all the provided hashtags, or they can select hashtags from the system. The viewers can watch the videos created by the narrators via the system. To validate the system’s usefulness, we conducted semi-structured interviews to measure the awareness of stories by viewers and narrators. Awareness is the extent to which the viewers and narrators can perceive the relationship between the animated bubble movements and the events of the story through the use of captions. The system is available online at https://bit.ly/2SHqEjw.

To summarize, the contributions are twofold.
– A method that defines the three types of animations used for emphasizing the important parts of storytelling through the movement of bubbles,
– A prototype system that provides automated animations to support the storytelling of bubble movements, which represent real-life events.

et al.
suggested to tackle data by using techniques that encourage a mixture of exploration and explanation (Young et al., 2018).

Our authoring tool provides two abilities of visual data-driven stories: communicating a narrative and presenting information based on data via video playback, following the data journalism pipeline.

2.2 Visualization authoring tools

A survey (Young et al., 2018) indicated that the interactions most used by journalists are straightforward data techniques, such as “inspect”, “filter”, “extract”, and “elaborate”. Another study systematically identified factors by proposing a design space derived from the reading experiences in 80 data-driven visual stories; the design space comprises seven factors: navigation input, level of control, navigation progress, story layout, role of visualization, story progression, and navigation feedback (McKenna et al., 2017).

There are some programming-based tools for authoring visualizations (Li et al., 2018; Mei et al., 2018), but they require considerable programming skill and visualization expertise. Some interactive approaches are more usable for casual users. ChartAccent (Ren et al., 2017) allows charts to be annotated quickly and easily through a series of interactions that create self-explanatory, data-driven annotations. VisJockey is a technique that enables viewers to easily access the author’s intended view through orchestrated interactive visualization (Kwon et al., 2014). Some visualization forms of VisJockey require the user to keep scrolling to read the content continuously. DataClips (Amini et al., 2016) allows non-experts to assemble data-driven “clips” into longer sequences and export them as a video clip with added captions. iStoryline (Tang et al., 2018) is a tool that integrates high-level user interactions into automatic layout optimization algorithms to balance manual and automatic storyline layouts.

The sharing function in data journalism is also essential for journalists to increase audience reach in news and discussions, which indicates the quality of their work (Young et al., 2018). The Gapminder authoring tool has a sharing function that can share or embed an HTML iframe for websites.

We studied the surveys mentioned above and compared them with the cited authoring tools to create the design space. The narrator can tell the story from multiple perspectives of the dataset. Furthermore, video playback supports sharing to increase viewer awareness. We present the data movement through automatic animation and share it in a video format that viewers can instantly receive from narrators through video playback.

In this section, we describe the Brexit dataset used to create the visualization. This dataset was generated based on the requirements of the journalists from CCTV Europe. Sections 3.1 and 3.2 give an overview of the Brexit dataset’s preparation process and explain how the data preparation team processed the data for visualization (Figure 1). The team collected data from Twitter, tagged tweets as “leave” or “remain” for training data, and ran the classification model to categorize all the captured tweets into “leave” vs. “remain” and to determine the size and polarization of each bubble by tendency score. For other datasets, these processes may not be needed to prepare the data.

Section 3.3 explains how to find insights and convert the data obtained from Section 3.2 into story movement data used to tell the story further. Another dataset, the COVID-19 dataset, was prepared without passing through the preparation process described in Sections 3.1 and 3.2.
The COVID-19 data is readily available from the European Center for Disease Prevention and Control website.

3.1 Brexit data characteristics

The Brexit dataset used in the visualization is characterized by political polarization, a tendency to participate in politics on one side of the left-right political spectrum (Garimella and Weber, 2017), which is currently visible online in the form of internet activism on Twitter. The Brexit topic consists of bipolar opinions between leaving the EU and staying in the EU. The middle ground between these two is considered neutral. In the following part, we explain the rationale for dividing the Brexit dataset into three viewpoints: leave, remain, and neutral.

3.2 Brexit data preprocessing

This section explains how the tendency value x is produced to categorize selected hashtags within the political polarization of the Brexit topic. x ranges from 0 to 1, where 1 represents a complete leave and 0 represents a complete remain. The closer x is to 1, the more obvious the tendency to leave the EU. The closer x is to 0, the more obvious the tendency to remain in the EU. Those close to 0 .

The Twitter Developer API was used to capture tweets for 41 monthly periods, from January 2016 to May 2019, by first gathering tweets under the “trending categories of and . Subsequently, the monthly tweet data was captured according to the required hashtags.

Tweets were labeled as “remain” or “leave” and used as training data for the classification model. Some of these hashtags are very emotional, meaning there is no need to judge anything else to comprehend the tendency of the tweet. For example, the tweets with the hashtag almost certainly referred to Brexit.

We pre-marked tweets with the following topics as leave: and tweets with the following topics as remain: .

Fig. 1: The following steps determine how the data preparation team processed the data: firstly, the Brexit tweets were collected from Twitter. Secondly, the tweets were tagged as “leave” or “remain” for training data. Thirdly, the SVM classification model was run to categorize all the captured tweets by polarization. Next, the tendency scores were derived from the probability. Finally, the results were summarized in a CSV file for finding insights.
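As a minimal illustration of how the tendency value x from Section 3.2 could be turned into the three viewpoints, the Python sketch below assigns a polarity label. The neutral band around the midpoint (`neutral_low`, `neutral_high`) is an assumption made only for illustration; the exact cut-offs are not stated in the text.

```python
def polarity(x: float, neutral_low: float = 0.4, neutral_high: float = 0.6) -> str:
    """Map a tendency value x in [0, 1] to 'remain', 'neutral', or 'leave'.

    1 means a complete leave and 0 a complete remain (Section 3.2).
    The neutral band [neutral_low, neutral_high] is a hypothetical choice.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("tendency value must lie in [0, 1]")
    if x > neutral_high:
        return "leave"
    if x < neutral_low:
        return "remain"
    return "neutral"
```

Keeping the band as parameters makes it easy to try different definitions of "neutral" without touching the rest of the pipeline.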

The SVM classification model was used to categorize all the captured tweets into the “leave” vs. “remain” dichotomy. First, it is necessary to clean the initial tweets of hyperlinks and non-English characters, and then evenly format them into two phrases with a space. Lastly, they are converted into lowercase English.

TF-IDF is used to extract feature information. The token pattern of the word is expressed by the regular expression, which means the beginning of the letter or
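The cleaning steps named above (removing hyperlinks and non-English characters, normalizing spacing, lowercasing) can be sketched in Python as follows. This is an illustrative reading, not the authors’ script: treating "non-English characters" as non-ASCII characters and collapsing runs of whitespace into single spaces are both assumptions, and the regular expressions are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # hyperlinks
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")     # assumed stand-in for non-English characters
SPACE_RE = re.compile(r"\s+")                   # runs of whitespace


def clean_tweet(text: str) -> str:
    """Strip links and non-ASCII characters, normalize spaces, lowercase."""
    text = URL_RE.sub(" ", text)
    text = NON_ASCII_RE.sub(" ", text)
    text = SPACE_RE.sub(" ", text).strip()
    return text.lower()


print(clean_tweet("Vote NOW → https://example.org/poll  #Brexit"))
```

The cleaned strings would then be passed to a TF-IDF vectorizer before classification.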

Naïve Bayes is a probabilistic model that weighs the probability of a given classification under a given condition to reveal the probability of each feature of the specified class. For this Brexit dataset, multinomial Naïve Bayes is adopted, whose parameters are set without learning prior probability. Additionally, all the data previously classified as data with known labels are re-used. Finally, the tendency score corresponding to each hashtag is generated. The tendency score is used to determine the size of the bubble and the polarization mentioned at the beginning of Section 3.2.

The result is a CSV file (Figure 2) in which the tendency score is stored in the “Trend” column to determine the size and polarization of bubbles. The data in the “x” and “y” columns represent the total number of tweets and retweets with those hashtags (“Topic” column). The data is used to find the data movement (the grey area) through the processes in Section 3.3. As mentioned before, Section 3.3 can be applied to other input data, such as the sample COVID-19 dataset that we present, which is then passed to the animation mapping pipeline in Section 3.3.3.

Fig. 2: The structure of the CSV files before and after mapping the levels of movement, where the “Topic” column lists the hashtags and the “Trend” column shows the tendency scores. The “x” and “y” columns refer to the number of tweets and retweets. Furthermore, the “lv” columns give the level of x and y over time.
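To make the CSV structure of Figure 2 concrete, the following Python sketch parses a small in-memory sample. The column names “Topic”, “Trend”, “x”, and “y” follow the text; the `Period` column, the hashtag names, and all values are hypothetical and serve only to make the example self-contained.

```python
import csv
import io

# Hypothetical sample: column names follow Figure 2, values are made up.
SAMPLE = """Topic,Period,Trend,x,y
#hashtag_a,2016-01,0.82,150,420
#hashtag_b,2016-01,0.31,40,65
"""


def load_rows(text):
    """Parse the CSV text into typed records, one per hashtag and period."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "topic": rec["Topic"],
            "period": rec["Period"],        # assumed column, not named in the text
            "trend": float(rec["Trend"]),   # tendency score in [0, 1]
            "tweets": int(rec["x"]),        # "x": number of tweets
            "retweets": int(rec["y"]),      # "y": number of retweets
        })
    return rows


rows = load_rows(SAMPLE)
```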

This project intends to specify the level of bubble movement and identify the event’s importance at a specific time. The narrator uses bubble movement to tell the story through essential facts that connect each topic.

A cluster analysis was used to determine the importance of hashtag movements by plotting the dots of all hashtags, that is, 36 hashtags over 41 periods, for a total of 1476 dots on the scatter plot. We set the threshold for this dataset at the average number of tweets (98), because hashtags occur in tweets first, followed by retweets. With this condition, the initial 1476 dots decreased to 282, and the dots with tweet counts below the threshold are grouped as the Zero level. The clustering algorithms were applied to the 282 dots and compared using the Python library sklearn.

Clustering algorithms, e.g., k-means, Affinity Propagation, Mean Shift, Spectral Clustering, Agglomerative Clustering, DBSCAN, and HDBSCAN, were examined on the dataset. It was discovered that k-means is suitable. The elbow technique (Kodinariya and Makwana, 2013) is a helpful graphical instrument to estimate the ideal number of clusters k for a specified task. The concept behind the elbow technique is to quickly identify the value of k where the distortion starts to decline, which becomes more apparent if the distortion is plotted for distinct values of k. Based on the plot result, the elbow is at k = 3, which indicates that k = 3 is a suitable choice. We use k-means clustering to determine “natural” groupings of instances for given unlabeled data in predefined classes. After this step, the 282 dots are separated into three classes, namely, Low level, Medium level, and High level. Combined with the Zero level mentioned above, there are four levels in total. This automatic method is used to reduce the burden of authoring. In the future, we can use a more curated method (e.g., Weng et al., 2018) to allow narrators to assign the importance of events interactively.

As a result of the clustering analysis from the previous insights process, we have determined the bubble movement level. It is divided into four levels: the first group is very low (level 0), the second group is low (level 1), the third group is medium (level 2), and the fourth group is high (level 3).

The four levels of each hashtag can be used to draw a line chart where the x-axis has 41 periods and the y-axis has four levels. Each hashtag has its own line chart, bringing the total to 36 line charts. The line charts are called hashtag movement behavior. They are included in the Hashtag Pulse presentation (Section 4.2.1).

In the dataset, the 41 monthly periods contain essential events, but their effect on each hashtag is not the same. A tipping point of the line chart indicates a distinct movement that may be affected by relevant events at that time. It means that each hashtag reacts differently to the main events. Therefore, we can map the events to the tipping points, allowing the hashtags to tell a story.
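The thresholding and k-means grouping described above can be sketched in plain Python as follows. This is a simplified stand-in rather than the sklearn-based analysis reported here: the deterministic initialization, the use of `>=` at the threshold, and the ranking of clusters by the sum of their centroid coordinates are assumptions, and k = 3 is fixed to the value suggested by the elbow plot instead of being re-estimated.

```python
from math import dist

THRESHOLD = 98  # average number of tweets reported for the Brexit dataset


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm with a deterministic start (evenly spaced
    points after sorting), so that the sketch is reproducible."""
    ordered = sorted(points)
    step = (len(ordered) - 1) / (k - 1)
    centroids = [ordered[round(i * step)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: mean of the members of each cluster.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def movement_levels(dots, k=3):
    """dots: list of (tweets, retweets). Returns one level (0-3) per dot."""
    active = [d for d in dots if d[0] >= THRESHOLD]
    labels, centroids = kmeans(active, k)
    # Rank clusters by centroid magnitude: smallest -> level 1, largest -> level k.
    order = sorted(range(k), key=lambda c: sum(centroids[c]))
    level_of_cluster = {c: rank + 1 for rank, c in enumerate(order)}
    levels, next_label = [], iter(labels)
    for d in dots:
        if d[0] >= THRESHOLD:
            levels.append(level_of_cluster[next(next_label)])
        else:
            levels.append(0)  # Zero level: below the tweet threshold
    return levels
```

For example, `movement_levels([(10, 5), (100, 120), (400, 600), (950, 1600)])` assigns level 0 to the first dot and levels 1 to 3 to the remaining ones.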

This part summarizes the process mentioned above as the animation mapping pipeline (Figure 3). In the first step, 36 hashtags are imported over 41 months, which yields 1476 dots. Hashtags with tweet counts below the threshold (98) are considered to have the lowest movement level. In the second step, only the dots above the threshold are selected, which leaves 282 dots. The 282 dots are divided into three levels using k-means. In the third step, the level of each hashtag is transformed into a movement line. The fourth step compares the movement levels of the hashtags, which results in the following animations: the highest value in each period is used to determine hashtag highlighting, and the value change between two periods is used to determine slow-motion. Moreover, when there are many highlighted hashtags, pausing is applied to let the user become aware of the highlighted hashtags. The pipeline can also be shortened by importing other inputs, such as the COVID-19 dataset, to find the level of motion of the bubbles for further visualization.

Fig. 3: The following steps are required to carry out animation mapping: firstly, the data is extracted and plotted. Secondly, the data is filtered and grouped into four levels. Thirdly, the level of each hashtag is transformed into a movement line, and lastly, the levels of movement are compared to define the three animations.

3.4 Brexit data stories example

An example of a movement that can classify events is the hashtag that moves at a more intense level than all other hashtags from April to June 2017 (highlighted with a red rectangle in Figure 4). The movement corresponds to the critical event of that period: the Prime Minister calls a General Election, to be held on 8 June 2017.

It can be seen that if we do not emphasize the bubble, we may neglect this important election event. So we offer a way to understand the movement of these hashtags by focusing on the movement level. To reiterate, the movement level of each hashtag corresponds to actual events. Therefore, narrators must first understand the real story and then choose a viewpoint to tell the story through the movement of hashtags.

Fig. 4: The hashtag highlighted with a red rectangle moves at a more intense level than all other hashtags from April to June 2017.
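Returning to the animation mapping pipeline of Figure 3, its fourth step can be sketched in Python as below. The sketch is an assumption-laden illustration, not the Brexble implementation: the data layout, the rule that nothing is highlighted when every level is 0, the `min_highlighted` default of four (taken from the default mentioned in the case study), and the normalization of the level change are all hypothetical. `ease_poly_out` mirrors the formula of D3’s `easePolyOut` with its default exponent of 3; how the eased value is turned into a playback speed is left open.

```python
def ease_poly_out(t, exponent=3.0):
    """Python counterpart of D3's polyOut easing: 1 - (1 - t) ** exponent."""
    return 1.0 - (1.0 - t) ** exponent


def map_animations(levels, min_highlighted=4, max_level=3):
    """levels: dict hashtag -> list of movement levels (0-3), one per period.

    Returns one dict per period with the highlighted hashtags, a pause flag,
    and an eased slow-motion value in [0, 1].
    """
    n_periods = len(next(iter(levels.values())))
    frames = []
    previous_peak = None
    for p in range(n_periods):
        peak = max(series[p] for series in levels.values())
        # Highlight: hashtags holding the highest level of the period.
        highlighted = sorted(h for h, s in levels.items() if peak > 0 and s[p] == peak)
        # Pause: many hashtags are highlighted at the same time.
        pause = len(highlighted) >= min_highlighted
        # Slow-motion: eased change of the peak level between two periods.
        change = 0.0 if previous_peak is None else abs(peak - previous_peak) / max_level
        frames.append({
            "highlight": highlighted,
            "pause": pause,
            "slow_motion": ease_poly_out(change),
        })
        previous_peak = peak
    return frames
```

Calling `map_animations({"#a": [0, 3, 3], "#b": [0, 1, 3]})` returns three frames, with only `#a` highlighted in the second period and both hashtags highlighted in the third.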

Brexble has been created through an iterative customization process of evolutionary prototyping. The first version of the system was edited via post-production to present the news on CCTV Europe. It was extracted and compared with the next version, which has an automated animation movement function, to see whether the movement is consistent with the first version. It was discovered that a movement could tell the story by itself; the captioning function was then developed so that narrators could create captions while also choosing the hashtags they want to present. Subsequently, the result can be exported as a video for viewers.

4.1 Design rationales

Brexble allows narrators to create their story and emphasize the bubble movements with captions that convey the story from different points of view, based on three design tasks: exploration, explanation (Riche et al., 2018), and suggestion.

The exploration task gives narrators the power to find their own story in a set of data. In this study, the bubble chart displays the color and size of the bubbles, indicating the side and degree of inclination to either side of the Brexit campaign. Narrators can choose and explore the bubbles they want to convey in the story.

The explanation task communicates the narrators’ story from the data; storytelling is the ability to explain stories. Narrators can choose which topic to tell via the selected bubble. Moreover, storytelling includes the ability to add captions. The captions used here explain the movement of the bubbles at the desired time and are presented in the form of short subtitle messages.

The suggestion task is added to help narrators create their story. Each bubble has a movement level, on which the recommended automatic animation is based: highlight, pause, or slow-motion. Highlights are used for emphasis.
Automatic pauses are used so that the audience can understand better. Slow-motion is included so that the viewer can notice the crucial movements.

Fig. 5: A browser-based visual Brexit storytelling authoring tool, Brexble, portraying a timeline of one month before the EU referendum, where each color corresponds to a political polarization; the interactive bubble chart can be accessed at https://bit.ly/2SHqEjw. A demonstration video can be viewed at https://youtu.be/huuko8p3-e4.

4.2 User interface

The main elements of the hashtag storytelling are described as follows.


Bubbles are the key visual elements of the visualization. The colors encode the polarization: blue represents the side that wants to stay in the EU, red represents those who want to leave, and grey is used for neutral bubbles (Figure 5(a)). Bubbles that have not yet been specified and given a color remain white until they are highlighted. The size of a bubble encodes the extent of the hashtag’s inclination toward either side (Figure 5(a)): the bigger it is, the more inclined it is. Since neutral bubbles have little inclination, they are never large. The trajectories of the hashtags show the direction and trend of each bubble’s movement.

Hashtag bubble selection at the top-left of the system (Figure 5(a)) is where narrators choose the bubbles with which they want to tell their story. By default, every bubble is shown. However, if the narrator chooses only some bubbles, those bubbles are displayed with highlights shown automatically. The narrator can turn the display of the hashtags’ trajectories on or off at the bottom of the Hashtag Bubble Selection panel (Figure 5(a)).

The bubble chart is placed in the middle (Figure 5(c)), and the narrator can select the bubble that matches their story. Under the bubble chart, the narrator can insert captions to explain the movements of bubbles. The video progress bar indicates the timeline of the movement. The main screen and captions can be exported for video playback.

Hashtag pulse is a line chart (Figure 5(b)) that combines the movement levels of every hashtag, which are divided into four levels, 0-3. Hashtags with the lowest level move along the bottom of the chart and may rise to the middle or go up to the top right corner of the chart when their level of movement changes to the highest level. The Hashtag Bubble Selection panel can be used to display information on the selected hashtags, so the narrator can compare hashtags.

Hashtag timeline (Figure 5(d)) is used with Hashtag Pulse. It helps narrators carry out analysis and make informed decisions. Each hashtag has a different timeline of occurrence, existence, and disappearance. The system calculates the occurrences in all the selected hashtag timelines. The earliest first occurrence among the selected hashtags is used as the base of the main timeline. The main timeline determines the start and end of the video progress bar.
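The derivation of the main timeline from the selected hashtags can be sketched in Python as follows. The data layout (a first and last period per hashtag) and the function name are hypothetical; the start follows the rule stated above, while taking the latest disappearance as the end is an assumption, since only the start is described explicitly.

```python
def main_timeline(occurrences, selected):
    """occurrences: dict hashtag -> (first_period, last_period), where the
    periods are sortable values such as 'YYYY-MM' strings.

    Returns (start, end) of the main timeline for the selected hashtags.
    """
    spans = [occurrences[h] for h in selected]
    if not spans:
        raise ValueError("select at least one hashtag")
    start = min(first for first, _ in spans)  # earliest first occurrence
    end = max(last for _, last in spans)      # assumed: latest disappearance
    return start, end
```

With this layout, changing the selection immediately changes the range that the video progress bar would cover.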

In terms of customization, narrators can adjust the captions (e.g., their font size), the size of the text on the bubbles, the video duration, and the beginning and end of the pauses in bubble movement, and they can record and replay the bubble chart.

Caption transcript: In the system, the narrator can describe hashtag movements with stories of at most 160 characters. The narrator can choose the time at which to tell the story by adjusting the timeline, or by pressing the play button, waiting until the precise time, and then filling in the text in the caption box. A maximum subtitle length of two lines is recommended. While the text is being filled in, the bubble chart does not move until the narrator clicks the play button again. Narrators can edit the captions in the Caption Transcript panel on the right (Figure 5(f)). They can customize the start and end time of each caption by clicking on the date, and a caption can be deleted by double-clicking on its text. At the bottom of the Caption Transcript panel, narrators can customize the caption, the font size of the caption, the size of the text on a bubble, and the video duration (Figure 5(g)).

Auto pause: Pausing, configured in the Auto Pause panel (Figure 5(h)), is the only animation that narrators can choose to enable or disable. The narrator can specify the minimum number of colored bubbles that must be moving during the same period for a pause to be triggered. This function enables the narrator to observe and discern the colors of the bubbles being displayed at that time. Narrators can also specify the duration of the pause, which is calculated as the amount of time a human needs to read an object multiplied by the number of bubbles.

Record and replay the video: Once the narrator has written a caption to describe the hashtag movement of the bubble and adequately adjusted the size of the characters and the pausing duration, the narrator can record the video using the top bar (Figure 5(e)). While recording, the narrator can continue to interact with the bubbles on the screen, such as scrolling the mouse to adjust the size of the text on the bubble. The interaction is recorded as a JSON file for uploading and replaying later in video format.

https://bbc.github.io/subtitle-guidelines/

Three animations were created from the animation mapping pipeline (Figure 3) and automatically added to the bubble chart to give the story clarity and meaning.

Highlight

As mentioned before, during each period, different hashtags have different levels of movement, which are identified by highlights. This allows the narrator to tell stories based on the highlighted hashtags, which signify the importance of real events. Highlighting keeps the bubbles with the highest movement level in their original color, while the bubbles that do not have the highest value are shown in white. When all the bubbles are displayed without any particular selection, the narrator can enable or disable the display of the non-highlighted bubbles.

Pause

Although the essential bubbles have been highlighted, it may still be difficult for the narrator to focus on every highlighted bubble simultaneously. For this reason, we offer an automatic animation that temporarily pauses the movement of the bubbles at moments with many highlighted bubbles. The duration of the pause is calculated from the reading speed per word multiplied by the number of highlighted bubbles; the reading speed is set by the narrator or by a system default. Once the pause has expired, the bubbles continue to move along the timeline.
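The pause rule can be written down as a small Python sketch. The function name and the default of 0.5 seconds per bubble are hypothetical; the default minimum of four highlighted bubbles follows the default number mentioned in the case study.

```python
def pause_duration(n_highlighted, seconds_per_bubble=0.5, min_highlighted=4):
    """Return the pause length in seconds for one period.

    A pause is triggered only when at least min_highlighted bubbles are
    highlighted; its length is the reading time per bubble multiplied by
    the number of highlighted bubbles. seconds_per_bubble is an assumed
    default that a narrator would normally override.
    """
    if n_highlighted < min_highlighted:
        return 0.0
    return seconds_per_bubble * n_highlighted
```

For instance, six highlighted bubbles with a reading time of 0.25 seconds each give a pause of 1.5 seconds.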

Slow-motion

The movement level used to calculate and adjust the slow-motion speed is taken from the highest level of bubble movement in each period. Different periods have different movement levels, whereby the higher the level of movement, the faster the bubbles move. Moreover, when the bubbles move from one period to the next, the system compares the maximum movement values of the two periods, and if the maximum values of the two periods are higher, the bubbles move faster. Nevertheless, the speed is adjusted so that the higher the value, the more slowly the bubbles move. This decrease in speed is achieved by employing the d3.easePolyOut(t) function, where t refers to the difference between the maximum values of the two time periods.

4.3 Interaction

The provided interactions are grouped according to the categorization of interaction techniques proposed in (Yi et al., 2007). The narrator can tell the story from the hashtag movement behavior through the following interactions:

Selection & Exploration help the narrator explore the display, movement behavior, and life span of the topics that can be used to tell the story. By default, Brexble displays all the hashtags on the bubble chart. Brexble allows narrators to select hashtags of interest, which are displayed prominently, from the Hashtag Bubble Selection panel; the chosen hashtags are displayed on the bubble chart, the Hashtag Pulse, and the Hashtag Timeline.

Reconfiguration enables narrators to choose topics of interest and tell a story by comparing the animations of the chosen topics. Each hashtag selection yields different automatic animation insertions, so the movement of the selected bubbles is displayed with a different combination of highlighting, pausing, and slow-motion.

Abstraction & Elaboration allow narrators to alter the representation of the animations through bubble selection. Brexble enables narrators to select bubbles, and the highlighted bubbles are then shown automatically. This is related to pausing, which is activated when many highlighted bubbles are displayed at the same time. For pausing, the minimum number of bubbles and the pause duration can be adjusted. Slow-motion is adjusted automatically when a selected bubble changes abruptly, representing a leap in motion.

Filtering enables narrators to change the items being presented based on specific conditions or critical information. They can show all the hashtags, or they can view only the highlighted bubbles. Thus, hiding the non-highlighted bubbles makes the highlighted ones more noticeable.

4.4 Prototype implementation

The back end of the pipeline, which provides the data handling for the extract and format tasks, is implemented in Python. The front end, which supports the presentation of the data, is implemented with D3.js. We use rrweb.io to export a JSON file of the recorded screen of the bubble chart, which can be replayed as a video through the Brexble system.

This section discusses the user study design for the system through the Brexit hashtag activism case study. The case study is divided into four types of user study, which are summarized at the end. We have also applied Brexble to a dataset that captures the COVID-19 epidemic in Southeast Asia.

5.1 Brexit hashtag activism case study

Bojo is interested in Brexit news and wants to create a video based on Brexit events through the movement of hashtag bubbles. Initially, he clicks on the play button to see all the movements of the bubbles from start to end with the default animation of the system. He explores the overall movement level by using Hashtag Pulse together with Hashtag Timeline.

Then, he selects a few hashtags and presses play to view the movement of those hashtags. The animations differ depending on the selection of hashtags. At this point, Bojo writes a caption to explain the events and hashtag movements to tell the story in his chosen style. If Bojo has not chosen any hashtags, the system displays all the bubbles by default. All the colored hashtags, with the default minimum number of four, are displayed at the same time.
The movement of the bubbles then pauses, and Bojo enters the caption at that time.

The critical pause periods are at the 2nd, 4th, and 5th months, which contain major hashtag events on both the leave and remain sides. These events are highly contested because this period precedes the referendum on withdrawal from the EU. Furthermore, after the referendum, the level of competition between the bubbles on the two sides dropped between June and July 2016, which was a remarkable event and caused the system to slow the animation down.

Subsequently, the moment Theresa May became Prime Minister was considered a standout event of great importance, so captions were added. After that, a new election was held in 2017, which is also seen as a prominent time, so captions were added there as well. After the 2016 referendum, there were numerous calls to revisit the referendum results, which Bojo can see represented in the hashtag presentation. Therefore, he wrote a caption describing the story here as well.

This user story demonstrates that highlights, pauses, and slow-motion added to the hashtag bubbles allow the narrator to communicate a variety of stories. By experimenting with narrators and viewers, four types of storytelling were defined, which are discussed in the next section.

5.2 User study

This visualization displays the changes that occur over time (Andrienko et al., 2003):

1. Existential changes,
2. Changes of spatial properties,
3. Changes of thematic properties expressed through values of attributes.

We consider the movement of the bubbles in space over time to consist of three co-dependent parts: “Where” (location/space), “When” (time), and “What” (objects) (Peuquet, 1994).

Hashtag activism is presented along with the time-series data, meaning that each event should be mapped to a specific period. A Brexit timeline (Walker, 2018), provided by the House of Commons Library, discusses the events leading up to the UK’s exit from the EU.
The users are offered a summary of the important events from the Brexit timeline as an awareness guide on the movement of the bubbles.

The important moments (“When”) from the Brexit timeline, or from a timeline created by the narrators, are used to describe the bubble chart and to show the relationship between the movement of the bubbles and the captions. Movement refers to a change of position in space: the object is the “What”, and its location is the “Where”. The relationship between the three co-dependent parts is outlined in Figure 6.

The experiment was designed for both sides, the narrators and the viewers. The narrators were required to create a video of the Brexit timeline story through events (“When”) and to specify and describe the “What” and “Where”. A semi-structured interview was carried out with both the narrators and the viewers. It covered the “What” and “Where” questions about the hashtags and included a part where participants were asked to identify and explain what they perceived from the videos.

Fig. 6: Three co-dependent parts show the relations of “What” + “Where” and “When”.

On the narrators’ side, two narrators were asked to create videos. One, an IT researcher working on social research, created the first and second videos. The other, a social media researcher, created the third and fourth videos.

On the viewers’ side, five IT students who were interested in Brexit were asked to participate in a semi-structured interview. They watched the videos that the narrators created. They were asked to rate their awareness of the bubble chart movement and how it relates to the events on a scale from 1 to 5 (1 meaning “not at all aware” and 5 meaning “extremely aware”). Moreover, viewers were asked to share an overview of the events that occurred with the movement of the bubbles.

For this research, four videos were created based on “When” and “What” + “Where”.

“When” is divided into two types:

1. The researcher specifies the events from the Brexit timeline (Walker, 2018).
2. The narrator specifies the events.

“What” + “Where” is divided into two types:

1. Select all the hashtags that the system provides.
2. Select some of the hashtags that the system provides.

Subsequently, the video versions were created as follows:

1. Version 1: Specify “When” and Specify “What” + “Where”
2. Version 2: Specify “When” and Do not specify “What” + “Where”
3. Version 3: Do not specify “When” and Specify “What” + “Where”
4. Version 4: Do not specify “When” and Do not specify “What” + “Where”

The results are reported in the sequence of the videos, where 1 means “not at all aware”, 2 means “slightly aware”, 3 means “moderately aware”, 4 means “very aware”, and 5 means “extremely aware”. The interval width is 0.8, obtained by subtracting the lowest score from the highest score and dividing the result by the number of levels, which is 5.
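The interval computation can be illustrated with a short sketch that maps a mean score to one of the five awareness labels. The function names `interval_width` and `awareness_label` are hypothetical, and treating each band as upper-inclusive is an assumption inferred from the 4.21-5.00 range reported for “extremely aware” in the summary of results.

```python
import math

LABELS = [
    "not at all aware",
    "slightly aware",
    "moderately aware",
    "very aware",
    "extremely aware",
]


def interval_width(lowest=1, highest=5, levels=5):
    """Width of each band: (highest - lowest) / number of levels."""
    return (highest - lowest) / levels


def awareness_label(mean_score, lowest=1, highest=5):
    """Map a mean score to its label, assuming upper-inclusive bands."""
    if not lowest <= mean_score <= highest:
        raise ValueError("score outside the rating scale")
    width = interval_width(lowest, highest, len(LABELS))
    # ceil(...) - 1 places a score on a band edge into the lower band.
    index = max(math.ceil((mean_score - lowest) / width) - 1, 0)
    return LABELS[index]


if __name__ == "__main__":
    for score in (4.31, 4.44, 4.49, 4.97):
        print(score, awareness_label(score))
```

With a width of 0.8, the bands are 1.00-1.80, 1.81-2.60, 2.61-3.40, 3.41-4.20, and 4.21-5.00.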

Version 1: The first narrator created the first video. This version was created under the condition that all the provided hashtags must be used with the given events. The events were defined as general events of Brexit. The video does not explicitly mention the movement of a hashtag unless a specific word in the event relates directly to the word used as a hashtag.

The awareness results for the viewers watching this version (Figure 7(a)) indicate that four out of five people were extremely aware of the events according to the movement of the hashtags, while one person was very aware.

Version 2: The second video was also created by the first narrator, who was required to use the specified events. In this version, the narrator chose only the hashtags she knows to tell the story, which are , and . These hashtags were chosen because they are consistent with the keywords in the given events and because they coincide with the personal knowledge of the narrator.

The awareness results for the viewers watching the second version (Figure 7(a)) show that four out of five people were extremely aware of the events through the movement of the hashtags, while one person was very aware. The results indicate that viewers were more aware of the hashtags used, and also more aware of the slow-motion, than in the first video.

The bar chart (Figure 7(b)) compares the two versions of the video with the control variable, which is an identical set of events. The data indicate that the viewers watching the second version of the video had a higher level of awareness. During the interview, the audience expressed that they had a higher level of awareness because fewer hashtags were related to the events. However, for events 5 and 8, there was a higher level of awareness for the first version. Moreover, in the final event, where many hashtags were presented simultaneously, both the narrators and viewers had more awareness of the hashtags and animations than in the first video.

Fig. 7: The bar charts show the results of the four versions of the Brexit user study.

Version 3: The third version was created by the second narrator and shows all the hashtags in the system. The narrator specified the captions himself, and the movement of the bubble group was emphasized more prominently than that of the individual bubbles.

The viewers mostly focused on the movement because the captions say more about the bubbles’ changes than about the real Brexit events. The viewers claimed they understood the progress of the bubbles because the captions led them to see its transformation and to link it with their prior knowledge.

For the third video (Figure 7(c)), all five viewers gave ratings of either “very aware” or “extremely aware”.

Version 4: The second narrator created the fourth video. In this version, the narrator is free to write the captions himself and to choose which hashtags tell the story. The narrator chose eight hashtags: and because they correspond to his knowledge of the Brexit timeline. The narrator inserted captions that summarize Brexit’s importance according to his viewpoint. Moreover, he added an overview of the movement of the hashtags in each event period.

The results indicate that viewers were more aware (Figure 7(a)) of the hashtags that the narrator chose to convey and more aware of the slow-motion than in the other video versions.

For the events in the fourth video (Figure 7(d)), almost all of them scored a high level of awareness, apart from one that had a lower score. However, the video is still categorized as obtaining “extremely aware” results.

5.3 A Summary of results

Based on the interviews, the narrators are aware of the highlighted animation, the slowing down when hashtags move too quickly, and the pauses when too many important hashtags appear simultaneously.

As for the awareness level (Figure 7(a)), the results indicate that all video versions scored an awareness level of “extremely aware”, which corresponds to an average between 4.21 and 5.00. The averages were 4.31, 4.44, 4.49 and 4.97, respectively.

Most viewers expressed a higher level of awareness in the second and fourth videos than in the first and third videos. In the second and fourth videos, the narrators chose the hashtags that they thought were important and related to the events, which, according to the viewers, made the important events easier to understand because there were fewer hashtags.

Fig. 8: Brexble trial with the COVID-19 dataset, named Brexble × COVID-19, portraying a timeline of the panic period of Southeast Asian countries under the COVID-19 curtain; the interactive bubble chart can be accessed at https://bit.ly/38aLv5i.

5.4 COVID-19 case study

We also applied this visualization to a dataset that reflects the COVID-19 outbreak in Southeast Asia. The sum of weekly new confirmed cases of each country is represented by its x-axis position, and the y-axis position encodes the weekly total number of deaths (Figure 8). The bubble size encodes the population of each country.

The topic changes from hashtags to Southeast Asian countries, without polarization. The number of confirmed cases and the number of deaths can convey the story from a variety of perspectives, e.g., public health, foreign affairs, and people’s anxiety. Narrators can choose among these perspectives to tell the story using captions on the movement of the bubbles.

This dataset is converted to a log scale, which differs from the linear scale used for the Brexit dataset. The time scale also differs: because the COVID-19 outbreak was still unfolding, the data were put into weekly spans, unlike the Brexit dataset, which is divided into monthly spans. For movement levels, k-means was still used to divide the movement into four levels.

One case is briefly described as follows. During January 2020, people in the region were not yet very alert to COVID-19. However, by the beginning of March 2020, many countries had started to take very intense action, which is consistent with the bubbles’ automatic highlights during the third month, when highlights are shown for almost every country. By May 2020, many bubbles are no longer highlighted, which means that many countries were starting to ease up but still lacked trust. In June 2020, many countries, especially those with large populations, still had many confirmed cases, and people continued to die. The system is available online at https://bit.ly/38aLv5i.
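The division of movement into four levels with k-means can be sketched as below. The text states only that k-means with four levels was used, so everything else is an assumption: the function name `kmeans_1d`, the use of one scalar movement magnitude per bubble and time step, and the deterministic quantile-style initialization are all hypothetical choices made to keep the example reproducible.

```python
def kmeans_1d(values, k=4, iterations=50):
    """Lloyd's algorithm on scalars.

    Returns (sorted centroids, level index per value), where level 0 is
    the smallest movement and level k - 1 the largest.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: spread the initial centroids over the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid when a cluster receives no points.
        updated = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    centroids.sort()
    levels = [
        min(range(k), key=lambda i: abs(value - centroids[i]))
        for value in values
    ]
    return centroids, levels


if __name__ == "__main__":
    # Made-up movement magnitudes, for illustration only.
    movements = [0.1, 0.2, 0.15, 1.0, 1.1, 5.0, 5.2, 20.0, 21.0]
    centroids, levels = kmeans_1d(movements)
    print(centroids)
    print(levels)
```

Because the data are one-dimensional, sorting the centroids gives the levels a natural order from slowest to fastest movement.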
The narrators used the content from the Brexit timeline to create videos. They tried to determine the duration of each event according to the timeline and observed the hashtag movement to tell a story by selecting specific hashtags from the system. That is, the narrators observed and associated the events (“When”) with the hashtag movements (“What” + “Where”). On the other hand, the viewers first described the movement of the hashtags (“What” + “Where”), then described the event behind the movement (“When”), and finally connected the two and explained their perceptions.

The study included five users as viewers and two users as narrators. A larger number of test subjects would be desirable to gather more quantitative data for analysis.

To apply the Brexble framework to another dataset, we tried the dataset of the COVID-19 outbreak in Southeast Asia, which differs from the Brexit dataset in time scale, data scale, and the absence of polarization. We found that it was still able to tell its own story from the narrators’ perspective. For polarized or grouped topics, the colors help the viewer remember the moving bubbles better than for non-grouped topics.

Both datasets can therefore tell their own stories through animations and captions, but the narrators may need prior knowledge of the topic. Brexble helps the narrators connect that knowledge, in the form of events, with the bubbles’ movements and then communicate it as a story.

The study presents a prototype system that augments the animated bubble chart by automatically inserting animations connected to the storytelling of the video narrators and the interaction of viewers with those videos. It reduces the burden on humans when making visualizations. The proposed prototype system not only helps in exploring changing data patterns but also supports drawing conclusions.

In the future, the k-means clustering could be replaced with another clustering algorithm to label the data for presentation. Furthermore, as the results indicate, most viewers have prior knowledge of the story. Therefore, our design should draw on Belief-Driven Data Journalism (Nguyen et al., 2019), a framework that integrates the viewers’ beliefs into the design to encourage interaction.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (61772456, 61761136020). Moreover, the first author wishes to thank Mr. Jaturapat Patanasongsivilai and Ms. Min Zhu for their valuable technical support on this project. Mr. Jaturapat Patanasongsivilai, the founder of “facebook.com/programmerthai” and a computer textbook author, helped lay the foundation of the three proposed animations. Ms. Min Zhu, a postgraduate student at the Heinz College at Carnegie Mellon University in Pittsburgh, Pennsylvania, United States, helped with the initial collection of the Brexit dataset and advised on the data preprocessing.

References

Amini F, Henry Riche N, Lee B, Hurter C, Irani P (2015) Understanding data videos: Looking at narrative visualization through the cinematography lens. In: Proceedings of the ACM Conference on Human Factors in Computing Systems, pp 1459–1468
Amini F, Riche NH, Lee B, Monroy-Hernandez A, Irani P (2016) Authoring data-driven videos with DataClips. IEEE Transactions on Visualization and Computer Graphics 23(1):501–510
Andrienko N, Andrienko G, Gatalsky P (2003) Exploratory spatio-temporal visualization: an analytical review. Journal of Visual Languages & Computing 14(6):503–541
Carr D (2012) Hashtag activism, and its limits. New York Times 25
Garimella VRK, Weber I (2017) A long-term analysis of polarization on twitter. In: Eleventh International AAAI Conference on Web and Social Media
Gray J, Chambers L, Bounegru L (2012) The data journalism handbook: how journalists can use data to improve the news. O’Reilly Media, Inc.
Kodinariya TM, Makwana PR (2013) Review on determining number of cluster in k-means clustering. International Journal 1(6):90–95
Kwon BC, Stoffel F, Jäckle D, Lee B, Keim D (2014) Visjockey: Enriching data stories through orchestrated interactive visualization. In: Poster Compendium of the Computation+ Journalism Symposium, vol 3
Li D, Mei H, Shen Y, Su S, Zhang W, Wang J, Zu M, Chen W (2018) Echarts: A declarative framework for rapid construction of web-based visualization. Visual Informatics 2(2):136–146
Lorenz M (2010) Data driven journalism: What is there to learn. In: IJ-7 Innovation Journalism Conference, pp 7–9
McKenna S, Henry Riche N, Lee B, Boy J, Meyer M (2017) Visual narrative flow: Exploring factors shaping data visualization story reading experiences. Computer Graphics Forum 36(3):377–387