Dynamic Social Media Monitoring for Fast-Evolving Online Discussions
Maya Srikanth, Anqi Liu, Nicholas Adams-Cohen, Jian Cao, R. Michael Alvarez, Anima Anandkumar
Maya Srikanth, California Institute of Technology, Pasadena, CA, [email protected]
Anqi Liu, California Institute of Technology, Pasadena, CA, [email protected]
Nicholas Adams-Cohen, Stanford University, Palo Alto, CA, [email protected]
Jian Cao, California Institute of Technology, Pasadena, CA, [email protected]
R. Michael Alvarez, California Institute of Technology, Pasadena, CA, [email protected]
Anima Anandkumar, California Institute of Technology, Pasadena, CA, [email protected]

Preprint, Under Review, 2021
ABSTRACT
Tracking and collecting fast-evolving online discussions provides vast data for studying social media usage and its role in people's public lives. However, collecting social media data using a static set of keywords fails to satisfy the growing need to monitor dynamic conversations and to study fast-changing topics. We propose a dynamic keyword search method to maximize the coverage of relevant information in fast-evolving online discussions. The method uses word embedding models to represent the semantic relations between keywords and predictive models to forecast the future time series. We also implement a visual user interface to aid in the decision-making process in each round of keyword updates. This allows for both human-assisted tracking and fully-automated data collection. In simulations using historical
Conversations on social media platforms like Twitter are often dynamic [3]. As new events occur, the language in tweets changes: words rise and fall in popularity and evolve with time. Influencers, celebrities, and political leaders often change the topics discussed online to sell their products, brands, and ideas. Furthermore, extremist groups, state-sponsored organizations targeting dissidents, and those intent on harassing and intimidating others online often permute the syntax of existing keywords or create new hashtags to avoid detection by social media platforms (and others who might be monitoring them) [1]. To understand the natural evolution of online discussions and detect abusive conversations in real time, it is important to develop data collection methods which track shifts in online discourse.

Many researchers who use social media data from Twitter collect data using a set of static (unchanging) keywords and hashtags, e.g., [4]. But as previous research shows, static data collection methods fall short when social media conversations change, either because the language used to discuss some topic alters or the hashtags are syntactically modified [8, 9]. Thus, there is a need for building dynamic social media monitors that can adapt to changes in social media conversations.

Developing a dynamic social media data collection monitor that can update keywords and hashtags is a challenging task. Prior research has proposed methods that require human intervention, or are semi-automated [8, 9]. Other researchers may prefer fully-automated methods. Either way, a dynamic monitor requires the integration of a number of different methods: it needs to start with a collection of social media posts on a certain topic, which can then be analyzed by natural-language processing tools to determine if there are new keywords or hashtags emerging in the data over time.
The dynamic monitor then needs a predictive modeling step, where it forecasts the likelihood that the new language on the topic will continue to grow. Finally, based on the predictive model, the dynamic monitor then needs to adjust the keywords or hashtags it collects information on, and needs to continue to analyze whether new keywords or hashtags should continue to be included in the monitor.

We design and implement a dynamic monitor for collecting data on fast-evolving online discussions. We allow for both semi-automatic and fully-automatic data collection. Our final dynamic monitor design uses word embeddings, corpus frequencies, and predictive time series modelling to visualize trends in a real-time social media discussion, recommend new keywords for data streaming, and facilitate social media data collection. We provide the code (https://github.com/mayasrikanth/DynamicMonitor) so that other researchers can use these tools. Figure 1 demonstrates the four components of our framework, which include data collection and storage, data analysis, and visualization.

Our work makes the following contributions:
(1) By combining word embeddings with predictive time series modeling, our methodology allows for fully-automated, semi-automated, or completely by-hand updating of keywords used to pull social media data over extended periods.
(2) By providing an online interface, we give analysts the means to visualize the various components of the dynamic monitor: using our code, researchers can oversee the operations of a dynamic monitor and alter the course of their data collection.
(3) By conducting simulations and case studies with both historical data from the prominent 2017
[Figure 1 components: Twitter Stream API; GCP Monitor (PubSub, DataFlow, Storage); Backend Analyses (GloVe Training, Time Series Modelling); Data Visualization Platform (Charts, Tables, UI)]
Figure 1: Workflow of our Data Collection, Storage, Analysis, and Visualization Platform. Here we use Twitter APIs as an example for data collection and we use Google Cloud Platform (GCP) as an example for data streaming and storage. The GCP Monitor works in a sequential way while the backend analyses can be parallel.
and the tumultuous 2021 Presidential Inauguration, we put the deployed components of our process into use on real-world, fast-evolving online discussions to demonstrate their efficacy.

Figure 2: Keywords updated (added and removed) by the dynamic monitor for data collection in the historical simulation.

More specifically, Figure 2 summarizes our dynamic monitor updates of keywords in the simulation, where we only use GloVe embeddings and keyword frequency information to model the keyword relations and apply human judgement for keyword updates. Our method achieves a 37.1% higher F-1 score on average than the traditional static monitor in tracking the top trending keywords for each month in 2017 (Table 1). Figure 3 shows dynamic monitor
[Figure 3 timeline: Jan 11 to Jan 22, 2021; keywords added/removed include safety, security, bidenharrisinauguration]
updates of keywords used to collect real-time data in the case study.

Figure 3: Keywords updated (added and removed) by the dynamic monitor for data collection in the real-time case study of the 2021 Inauguration.
Notation.
We define 𝑠_𝑡 as the set of keywords we are interested in tracking at timestep 𝑡. Using 𝑠_𝑡, we can filter a corpus 𝐾_𝑡 using APIs from different social media platforms. For example, Twitter provides various APIs that filter tweets containing specific keywords and hashtags. We define 𝐺_𝑡 as the semantic representation of words included in the filtered corpus. In prevalent word embedding models, we can use a vector to represent each word's semantic relation with other words. We use 𝑃_𝑡 to represent the future trends for each word. 𝑃_𝑡 can be directions (increases or decreases) or specific frequencies. Our goal then is to build a system for dynamic data collection that maximizes the information coverage for evolving discussions around certain topics. As such, given keywords 𝑠_𝑡 and the corresponding corpus 𝐾_𝑡, we aim to update the keyword set 𝑠_{𝑡+1} according to the patterns in 𝐺_𝑡 and 𝑃_𝑡.

Data Collection and Storage.
We assume data collection using the APIs from social media platforms can be conducted efficiently. In practice, there exist additional difficulties in conducting this data filtering at a large scale and storing the data reliably [4]. However, in this paper, we focus on the process of decision making for updating the keywords and on visualization for facilitating that decision making when human intervention is needed.
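As a concrete illustration of the filtering step that produces 𝐾_𝑡 from 𝑠_𝑡, the sketch below selects the posts in a toy corpus that contain at least one tracked keyword. The helper name and sample posts are invented for this example; the authors' pipeline filters via the platform APIs directly.

```python
# Minimal sketch of the notation: s_t is the keyword set, K_t the filtered
# corpus. All names here are illustrative, not the authors' implementation.

def filter_corpus(posts, keywords):
    """Return the subset of posts containing at least one tracked keyword."""
    return [p for p in posts if any(k in p.lower() for k in keywords)]

s_t = {"inauguration"}                      # keyword set at timestep t
posts = [
    "Watching the Inauguration today",
    "Unrelated post about weather",
    "inauguration security is tight",
]
K_t = filter_corpus(posts, s_t)             # corpus filtered by s_t
print(len(K_t))                             # 2 of the 3 posts match
```

In the real system this filtering happens server-side (e.g., Twitter's filtered stream), but the semantics are the same: 𝐾_𝑡 is exactly the set of posts matched by 𝑠_𝑡.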
Decision Making on Updating Keywords.
Based on 𝐾_1, ..., 𝐾_𝑡, we first need to generate the semantic representation 𝐺_𝑡 and the future trend 𝑃_𝑡. Given 𝐺_𝑡 and 𝑃_𝑡, there are many different ways to determine the updated keywords, and in many contexts it is desirable that this updating process remain flexible. When there is a smooth trend in topic shifting or there are emerging events that slowly change the direction or sentiment of the discussion, fully-automatic updating using simple rules is sufficient. However, in some contexts human intervention guided by information from 𝐺_𝑡 and 𝑃_𝑡 is needed. For example, swift topic shifting is hard to forecast using historical data. In these situations, human intervention may be helpful.

User Interface.
Demonstrating 𝐺_𝑡 and 𝑃_𝑡 for decision making is important for closing the loop for future data collection. The interface should consist of user-friendly elements that enable clear illustration of the semantic relation between keywords and the forecasts of future trends.

At present, we have built and deployed the word embedding component of our dynamic keyword and hashtag monitoring process (focusing at this point on hashtags). The predictive modeling component of our process is still under development; as we discuss later, we have implemented some time series predictive models and will demonstrate their utility in future research.
At time 𝑡, we train the GloVe model [12] using 𝐾_𝑡 to produce 50-dimensional word embedding representations 𝐺_𝑡 of all tokens in our filtered corpus 𝐾_𝑡. GloVe can represent linear substructures in data. It is a log-bilinear model with a weighted least-squares objective, and aims to learn word vectors such that their dot product equals the logarithm of the words' probability of co-occurrence. In the resulting word vector space, cosine similarity indicates linguistic or semantic similarity between two words, while vector differences capture analogies between pairs of words. These embeddings allow us to project each token into Euclidean space, where we can use distance metrics to measure the "closest" neighbors to each of our keywords 𝑠 ∈ 𝑠_𝑡.

Our latest implementation supports time series forecasting with ARIMA (auto-regressive integrated moving average) for univariate frequency data. Specifically, we visualize keyword frequencies and predict their trajectory within a confidence interval to determine whether a discussion topic is increasing or decreasing. Prior to any forecasting, we apply a log transform to all corpus counts in order to stabilize the variance [10] and induce stationarity in all series by differencing lags. For each keyword, we grid search ARIMA model parameters and select those which maximize performance (minimize mean-squared error) on the validation set. The best model is then used to forecast frequencies a fixed number of time-steps into the future. While linear models like ARIMA do not outperform larger deep learning models when data is abundant, they can be informative in earlier stages of data streaming when textual data is sparse.
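The nearest-neighbor lookup described above can be sketched with plain NumPy. The tiny three-dimensional vectors and vocabulary below stand in for the 50-dimensional GloVe embeddings and are invented purely for illustration:

```python
import numpy as np

def closest_neighbors(word, vocab, vectors, k=2):
    """Return the k vocabulary words most cosine-similar to `word`."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    target = unit[vocab.index(word)]
    sims = unit @ target                  # cosine similarity to every word
    order = np.argsort(-sims)             # indices by descending similarity
    return [vocab[i] for i in order if vocab[i] != word][:k]

# Toy 3-d "embeddings" (the paper's monitor uses 50-d GloVe vectors).
vocab = ["inauguration", "bideninauguration", "weather", "capitol"]
vectors = np.array([
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],    # nearly parallel to "inauguration"
    [0.0, 1.0, 0.0],    # unrelated direction
    [0.7, 0.1, 0.6],
])
print(closest_neighbors("inauguration", vocab, vectors, k=2))
```

Because cosine similarity ignores vector length, a rare hashtag used in the same contexts as a popular keyword can still surface as a close neighbor, which is exactly the behavior the monitor exploits.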
For applications with more abundant data, we plan to include options for training more robust deep learning models in the future. In order to collect enough data to allow for meaningful time series prediction during the data collection process, our dynamic implementation design pulls several days' worth of data from Twitter's REST API using the starting set of keywords as the query.

With 𝐺_𝑡 and 𝑃_𝑡 defined above, our algorithm proceeds as follows. For each 𝑠 ∈ 𝑠_𝑡 at time 𝑡, from embedding space 𝐺_𝑡 we find 𝐶, the set of 30 closest neighbors to keyword 𝑠, defining closest with the cosine similarity metric. From this set of thirty words, we choose the most relevant hashtags or mentions in the domain or event we are tracking. We define these neighbors as 𝑛_𝑠 ∈ 𝐶_𝑠 ⊆ 𝐶. For each 𝑛_𝑠 ∈ 𝐶_𝑠, we use our time series model 𝑃_𝑡 to predict future frequencies for the hashtag or mention. If we predict 𝑛_𝑠 is declining in future time periods, we drop these values from the set 𝐶_𝑠. We define 𝐶_𝑠′ ⊆ 𝐶_𝑠 as the set of neighbors without declining time series predictions. Finally, we define 𝑠_{𝑡+1}, the set of keywords we track in the next time period, as 𝐶_𝑠′.

Algorithm 1:
Dynamic Algorithm
Input: 𝑠_𝑡: keyword set at 𝑡; 𝐾_𝑡: filtered corpus at 𝑡
Output: 𝑠_{𝑡+1} = {}
Data: 𝐺_𝑡 ← 50-dimensional GloVe embeddings trained on 𝐾_𝑡; 𝑃_𝑡 ← time series models updated with the latest frequency data from the corpus
for 𝑠 ∈ 𝑠_𝑡 do
  1. 𝐶: 30 closest neighbors to 𝑠 in embedding space 𝐺_𝑡.
  2. 𝐶_𝑠: choose relevant hashtags or mentions from 𝐶.
  3. 𝐶_𝑠′: discard hashtags from 𝐶_𝑠 with declining trendlines in time series prediction or low corpus counts.
  4. 𝑠_{𝑡+1} ← 𝐶_𝑠′
Return 𝑠_{𝑡+1}.

The case studies we conduct on historical data are informed only by corpus frequencies and GloVe embeddings: we do not use time series models to predict the trajectory of keywords. However, we observed from independent analysis that even linear time series models can produce reasonable short-term estimates of keyword frequency trajectories. Thus, our data visualization platform code includes interactive charts with time series forecasts (see Figure 5), as well as code for training ARIMA on real-time Twitter data.
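One round of the keyword update in Algorithm 1 can be sketched as follows. As a lightweight stand-in for the ARIMA forecast, this sketch fits a least-squares slope on log-transformed counts to decide whether a candidate is rising or declining; the function names, threshold, and count histories are all invented for illustration:

```python
import math

def log_slope(counts):
    """Least-squares slope of log(counts) against time."""
    ys = [math.log(c) for c in counts]
    n = len(ys)
    xbar = (n - 1) / 2
    ybar = sum(ys) / n
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(ys))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

def update_keywords(candidates, histories, min_count=10):
    """Keep candidates with a non-declining trend and adequate corpus counts
    (step 3 of Algorithm 1)."""
    keep = []
    for word in candidates:
        counts = histories[word]
        if counts[-1] >= min_count and log_slope(counts) >= 0:
            keep.append(word)
    return keep

histories = {
    "bideninauguration": [20, 35, 60, 110],   # rising -> kept
    "oldhashtag": [90, 40, 15, 12],           # declining -> dropped
    "nichehashtag": [3, 4, 5, 6],             # rising but below min_count
}
print(update_keywords(list(histories), histories))
```

In the actual pipeline the trend decision comes from the grid-searched ARIMA forecast rather than a raw slope, but the filtering logic (drop declining or low-count neighbors, promote the rest into 𝑠_{𝑡+1}) is the same.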
To enable researchers to collaboratively adapt their social media data collection process to dynamically changing discussions, we are building a data visualization platform which integrates a browser-based user interface on the frontend with scripts for data streaming and AI-driven predictive modelling on the backend. Below, we detail the features and capabilities of this tool, before demonstrating its use on real-time monitoring of 2021 Twitter discussions about
Frontend.
The frontend of our platform is built with JavaScript, HTML, and CSS. We use the JavaScript library echarts to render figures and tables which are dynamically updated by the backend to reflect real-time data. Our frontend code uses the echarts API to make our figures interactive: users can "zoom in" on specific numerical values. We host our frontend using GitHub Pages and plan to release our code to allow any research group to host their own personalized version of the data visualization platform. Figures 4, 5, and 6 show the interface visualizations.
Figure 4: Interactive t-SNE (t-distributed stochastic neighbor embedding) plot of the closest 30 neighbors to the tracked keyword.

Figure 5: Time series forecasts for a tracked keyword; 𝑝, 𝑑, 𝑞 values were naively grid-searched. X-axis shows the last 50 hours in the case study.

Backend.
Our code streams data using the Twitter API and stores it in a cloud compute service (Oracle Cloud or GCP). On the cloud, several scripts preprocess data, train GloVe and other predictive time series models, and update the frontend interface. Then on the frontend, the user can select new keywords or drop old ones to customize their data collection process. Time intervals for updating all frontend visualizations are customizable, with a lower bound on the order of minutes.

Our dynamic monitor design lends itself to semi-automated and fully-automated data collection processes.
Figure 6: Table of 30 closest neighbors to the tracked keyword.
Semi-automated.
We consider a semi-automated data collection process to mean a human-assisted one. That is, a group of researchers interested in tracking a particular set of topics on social media can utilize the AI-driven keyword recommendations on the frontend to alter their keyword set throughout data collection. The time series forecasting and word embeddings can uncover new discussion topics and indicate whether existing hashtags are increasing or decreasing in frequency: given this real-time information, researchers can adjust their data streaming.
Fully-automated.
Our implementation leaves room for a fully-automated approach which sorts candidate keywords using a linear combination of predictive factors, such as:

𝑠_𝑖 = 𝛼 · 𝑚_𝑖 + 𝛽 · 𝑑_𝑖 + 𝛾 · 𝑓_𝑖 + 𝛿 · 𝑣_𝑖   (1)

where 𝑠_𝑖 is the "virality" score for a given keyword, 𝑚_𝑖 is the slope of the projected frequency trend-line, 𝑑_𝑖 is the average cosine distance from the current set of keywords, 𝑓_𝑖 is the current corpus frequency of the keyword, and 𝑣_𝑖 is the variance of the keyword's frequency trend-line. Keywords can be sorted according to this metric, and the top-ranked keywords can be automatically added to the set. Further, removal criteria can be imposed; e.g., we can drop keywords that are relatively old and have low usage in the corpus. The scaling factors for these variables can be customized to fit the research objective: for instance, if a researcher wants to track niche topics with low predicted popularity, they can reduce the weight of 𝑓_𝑖.

We take the semi-automated approach in our studies, as it better fits our research objective of testing a human-in-the-loop dynamic data collection method. In future iterations, we will include a framework for the fully-automated keyword selection in our code.

Building on a series of women's rights marches and protests, the
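The scoring rule in Eq. (1) can be sketched as a small ranking function. The weights and the per-keyword statistics below are invented for the example; in practice they would come from the fitted trend-lines and the live corpus:

```python
# Illustrative computation of the "virality" score in Eq. (1).
# alpha..delta and the candidate statistics are made-up example values.

def virality(m, d, f, v, alpha=1.0, beta=1.0, gamma=0.5, delta=0.25):
    """s_i = alpha*m_i + beta*d_i + gamma*f_i + delta*v_i."""
    return alpha * m + beta * d + gamma * f + delta * v

candidates = {
    # keyword: (slope m_i, avg. cosine distance d_i, frequency f_i, variance v_i)
    "bidenharris": (2.0, 0.3, 4.0, 0.5),
    "weather":     (0.1, 0.9, 1.0, 0.2),
}
ranked = sorted(candidates, key=lambda k: virality(*candidates[k]), reverse=True)
print(ranked)
```

Because the score is linear, reweighting is cheap: lowering gamma de-emphasizes raw frequency and favors niche but fast-growing keywords, as described above.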
Figure 7: Upper: Percentage of monthly tweets containing
globe adopted the
Evolution of Topics:
As shown in Figure 8(a), January 2017 sees an upsurge of the hashtag “
Simulation of a Dynamic Monitor:
The shifts in the language used to describe a general set of issues and the advent of unfiltered historical data represents the ground truth target set for evaluation.
Figure 8: (a) Word cloud containing the 40 most frequently used hashtags in January 2017; (b) Word cloud of the 40 most frequently used hashtags in August 2017. The size of the words indicates the frequency. There is a significant topic shift on Twitter between these two months.
We compare our method with two baselines: Static and Last-Top.
(1) Static: uses the top 𝑛 keywords from January 2017 for each of the following months throughout the simulation.
(2) Last-Top: assumes previously trending hashtags can expose trending conversations in the current month and uses the keyword set from the last month.

To make comparison between the methods more tractable, we set 𝑛 = 20 keywords at any time 𝑡, although in practice it is certainly possible to use more. Since we have access to the top trending hashtags for each month in the whole dataset, we set these to be the ground truth, and evaluate the most frequently used hashtags pulled by the various monitors against this. We use the Jaccard similarity index and F1-score for the evaluation. To calculate the F1-score, we regard the proportion of correctly retrieved top trending hashtags in the retrieved hashtags (𝑅) and in the ground truth hashtags (𝐺) as precision and recall. For the Jaccard similarity, we look at the union and the intersection of the retrieved hashtags (𝑅) and the ground truth hashtags (𝐺). Importantly, all monitors begin with the same set of keywords: the top 20 most frequently used hashtags in our January 2017 data.

𝐹1 = 2 · Precision · Recall / (Precision + Recall)   (2)

𝐽 = |𝑅 ∩ 𝐺| / |𝑅 ∪ 𝐺|   (3)

Table 1: Global Performance Comparison
           Jaccard    Avg. F1 (Weighted)    Avg. F1 (Unweighted)
Dynamic       .              .                      .
Last-Top      .              .                      .
Static        .              .                      .

Figure 9 shows the performance of our algorithms and the baselines. Table 1 shows the weighted average of the Jaccard similarity index and F1-score of each algorithm as percentages with respect to the target set (top hashtags for the full-month data), where the weight is the proportion of the size of monthly data to the entire corpus size. We also compare with an unweighted average F1-score. In all of the metrics, our method outperforms the Last-Top baseline and the conventional Static monitor typically used in social science research. This indicates that our method can better capture the trending topics in the
Figure 9: The similarity between the set of top 20 hashtags in the subset of data recovered by each algorithm (dynamic, last-top, and static) and the target set of top 20 hashtags in the full month of data, in terms of the Jaccard index (top) and F1 score (bottom). The dynamic algorithm outperforms the baselines at most timesteps. In particular, the dynamic monitor covers 75% more top trending keywords than the static monitor on average.
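The evaluation metrics in Eqs. (2) and (3) can be sketched directly; the hashtag sets below are invented for the example:

```python
# F1 (Eq. 2) and Jaccard similarity (Eq. 3) between retrieved hashtags R
# and ground-truth trending hashtags G.

def f1_score(retrieved, truth):
    hits = len(retrieved & truth)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(truth)
    return 2 * precision * recall / (precision + recall)

def jaccard(retrieved, truth):
    return len(retrieved & truth) / len(retrieved | truth)

R = {"metoo", "timesup", "resist"}        # hashtags a monitor retrieved
G = {"metoo", "timesup", "womensmarch"}   # ground-truth trending hashtags
print(round(f1_score(R, G), 3), round(jaccard(R, G), 2))
```

With two of three hashtags shared, precision and recall are both 2/3, so F1 is 2/3 while Jaccard is 2/4: Jaccard penalizes the symmetric difference more heavily, which is why the paper reports both.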
Keyword Evolution in
Figure 2 demonstrates the keywords retrieved by our algorithm, which reflects the evolution of the topics in the early stage of the
To provide a contemporary case study using the dynamic keyword selection method, we started the dynamic monitor on January 11, 2021 (at 10:40:44 Pacific Standard Time), with a single keyword, “inauguration.” We used the Twitter data collection architecture developed by [4] to stream data. We stopped collecting data using this dynamic monitor on January 22, 2021 at 15:21:06 Pacific Standard Time.

We show in Figure 10 the amount of data collected by this dynamic monitor (by hour in the top panel, and by day in the bottom panel). The bottom panel of Figure 10 provides what a static monitor would have collected, daily, during this period. During the first full day of data collection, the dynamic monitor pulled 499,111 tweets, rising by January 14, 2021 to 2,470,410 tweets on that day. The number of tweets collected by the dynamic monitor peaked on January 20, 2021 (the day of the inauguration), pulling 3,434,650 tweets. In total, the dynamic monitor collected 19,723,508 tweets.
Figure 10: Hourly number of tweets containing the tracked keywords, collected by the dynamic and static monitors.
We pulled discussions related to the event; at a set time each day (PST), we used the closest-neighbor table and t-SNE plots on our webpage (see Figures 4 and 6) to collaboratively determine whether to add or subtract keywords for data collection.

As shown in Figure 11, the semi-automated dynamic approach is visibly better than a static data collection procedure, producing a more uniform distribution of popular keywords. The dynamic approach keeps old frequently-discussed hashtags (
We have the following observations in the data analysis and key-word update for this case study.
Word Ambiguity.
General terms are ambiguous and may induce irrelevant information. On day 2, we saw a surge in the keyword frequency of “safety" and “security" in our filtered corpus. We then decided to include them in our keyword set, expecting to capture more text about data security. However, with the knowledge that “safety" and “security" are general and ambiguous terms, we took extra caution in the following days. After taking a closer look at the corpus containing the keyword “security", we dropped it and
Figure 11: Word clouds from the static monitor (left) and the dynamic monitor (right) on day 9 of the experiment, showing the most frequently used hashtags in the statically and dynamically obtained Twitter datasets. The dynamic monitor covers a wider range of topics and has a more uniform distribution in terms of the frequency of different keywords.

include more specific terms like “datasecurity" and “databreach" instead.
Convergence.
The dynamic discussion would “converge", with no new keywords added in the end. In our case study, we did not find new keywords to add to the keyword set in the last three days. Since we did not set an upper limit on the keyword set, it is possible that we captured all the topics after 9 days. In practice, we believe convergence may or may not happen, depending on whether there is a constraint on the total number of keywords or on the number of new keywords each round.
Forecast Information.
One difficulty in our case study is the usage of the forecast information. Due to the short time horizon we are working on, there is not enough data for us to train reliable predictive models on the keywords, which imposes difficulty for fully automatic updates. Fortunately, our method is flexible enough to also account for human intervention. In the future, we aim to further test various types of update rules.
Information Retrieval.
Existing information retrieval methods like Okapi BM25 [13], which take a probabilistic approach to ranking documents, use a complex and rigid weighting scheme with various parameters that require refinement. Deep learning information retrieval methods like BERT [6] are non-transparent and require abundant data to produce sensible rankings. These approaches focus on optimizing information retrieval in largely static databases and returning results that are most similar or relevant to a user's query. Our task is different: social media data is extremely dynamic and prediction intervals occur on smaller scales. Therefore, rankings must be produced efficiently and robustly on relatively small datasets which are streamed in real time. Further, the search process for new keywords begins with a known keyword, but could end in new discoveries which alter the course of data collection. Our target user is uncertain about what they are looking for, and their “information retrieval” process is highly informed by real-time predictions and newly exposed keywords. Finally, our application requires flexibility in ranking documents (or keywords): in one use case, corpus frequency may be the strongest indicator of relevance, while in another use case, semantic similarity to an existing set of keywords may be more important.
Word Embedding Models.
Previous work has shown that word embedding models serve as an efficient information retrieval mechanism on vast corpora [7]. In previous work, we showed the ability of word embedding models to uncover conversational threads and emergent hashtags in online conversation [9]. As our application requires real-time processing of streamed social media data, we iteratively train GloVe word embeddings [12] on incoming data to efficiently create a vector representation of the corpus and retrieve the closest neighbors under metrics like cosine similarity. To provide greater flexibility and transparency in the ranking scheme, our final implementation sorts keywords based on a linear combination of keyword frequency information and embedding information.
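The linear combination described above can be sketched as follows. This is a minimal illustration rather than our production code: the function name, the max-over-seeds similarity, and the equal default weight `alpha = 0.5` are assumptions made for exposition.

```python
import numpy as np

def rank_candidates(candidates, embeddings, seed_vecs, freqs, alpha=0.5):
    """Rank candidate keywords by
    alpha * (max cosine similarity to the seed-keyword vectors)
    + (1 - alpha) * (corpus frequency, normalized to [0, 1])."""
    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    max_freq = max(freqs[w] for w in candidates)
    scored = [
        (w,
         alpha * max(cosine(embeddings[w], s) for s in seed_vecs)
         + (1 - alpha) * freqs[w] / max_freq)
        for w in candidates
    ]
    return sorted(scored, key=lambda t: t[1], reverse=True)

# Toy 2-d "embeddings"; in the monitor these would be GloVe vectors
# trained iteratively on the streamed tweets.
emb = {"election": np.array([1.0, 0.0]),
       "ballot": np.array([0.9, 0.1]),
       "weather": np.array([0.0, 1.0])}
freqs = {"election": 100, "ballot": 80, "weather": 90}
seeds = [emb["election"]]
print(rank_candidates(["election", "ballot", "weather"], emb, seeds, freqs))
```

Adjusting `alpha` moves the monitor between the two use cases named above: `alpha` near 0 ranks by corpus frequency, near 1 by semantic similarity to the existing keyword set.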
Keyword Selection.
Prior research has focused on the problem of keyword selection [14]. This differs from our work in that we assume the analyst begins with subject-matter expertise that provides keywords and hashtags to seed the dynamic monitor. Other related work has used similar static keyword search query approaches [e.g. 2, 5, 11], which fall short in the study of dynamic debates with rapidly evolving conversation. Another line of work takes a semi-automated approach to keyword selection and search through a semi-supervised dynamic keyword methodology [15]. Their approach differs from ours: we use relatively easy-to-estimate word embedding models and straightforward time-series predictive models, making our approach more intuitive and likely faster computationally. Further, related deep learning methods [14] often lack transparency and require vast data to perform well, which precludes their use in applications with sparse data. Finally, related work has proposed semi-automated keyword selection combining computer and human input [8]; these methods are based on complex rules and are more effort-intensive, as researchers must themselves apply the rules in their decision process. We offer an AI-driven dynamic keyword search methodology that is fast and efficient (like fully automated methods), yet provides more transparency and intuition. We also offer novel visualizations and an interface for user decision making.
In this paper, we design and implement a novel dynamic keyword search method for tracking and monitoring fast-evolving online discussions. This closes the gap between traditional static keyword search methods and highly dynamic social media data sources. We use word embedding models to find relevant keywords automatically and use indicators from predictive time-series models to aid decision making in keyword updates. We allow for both semi-automatic and fully automatic data collection. The whole system is built using modern data collection, storage, and visualization tools. To test the current deployment, we simulate on data collected from the
REFERENCES
[1] Adam Badawy, Emilio Ferrara, and Kristina Lerman. 2018. Analyzing the Digital Traces of Political Manipulation: The 2016 Russian Interference Twitter Campaign. 258–265. https://doi.org/10.1109/ASONAM.2018.8508646
[2] Pablo Barberá, Ning Wang, Richard Bonneau, John T. Jost, Jonathan Nagler, Joshua Tucker, and Sandra González-Bailón. 2015. The Critical Periphery in the Growth of Social Protests. PLOS ONE 10 (11 2015), 1–15. https://doi.org/10.1371/journal.pone.0143611
[3] Axel Bruns. 2012. How Long Is a Tweet? Mapping Dynamic Conversation Networks on Twitter Using Gawk and Gephi. Information, Communication & Society 15, 9 (2012), 1323–1351. https://doi.org/10.1080/1369118X.2011.635214
[4] Jian Cao, Nicholas Adams-Cohen, and R. Michael Alvarez. 2020. Reliable and Efficient Long-Term Social Media Monitoring. arXiv:cs.CY/2005.02442
[5] Michael D. Conover, Bruno Gonçalves, Alessandro Flammini, and Filippo Menczer. 2012. Partisan asymmetries in online political activity. EPJ Data Science 1 (2012).
[6] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT 2019. 4171–4186.
[7] Lukas Galke, Ahmed Saleh, and Ansgar Scherp. 2017. Word Embeddings for Practical Information Retrieval. In INFORMATIK 2017. Gesellschaft für Informatik, Bonn, 2155–2167. https://doi.org/10.18420/in2017_215
[8] Gary King, Patrick Lam, and Margaret Roberts. 2017. Computer-Assisted Keyword and Document Set Discovery from Unstructured Text. American Journal of Political Science 61, 4 (2017), 971–988. https://onlinelibrary.wiley.com/doi/abs/10.1111/ajps.12291
[9] Anqi Liu, Maya Srikanth, Nicholas Adams-Cohen, R. Michael Alvarez, and Anima Anandkumar. 2019. Finding social media trolls: Dynamic keyword selection methods for rapidly-evolving online debates. In AI for Social Good Workshop, NeurIPS. https://arxiv.org/abs/1911.05332
[10] Helmut Lütkepohl and Fang Xu. 2012. The role of the log transformation in forecasting economic variables. Empirical Economics (2012). https://doi.org/10.1007/s00181-010-0440-1
[11] Brendan O'Connor, Ramnath Balasubramanyan, and Bryan R. Routledge. 2010. From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. In Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media.
[12] Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 1532–1543.
[13] K. Sparck Jones, S. Walker, and S. E. Robertson. 2000. A probabilistic model of information retrieval: development and comparative experiments: Part 1. Information Processing & Management 36, 6 (2000), 779–808. https://doi.org/10.1016/S0306-4573(00)00015-7
[14] Shuai Wang, Zhiyuan Chen, Bing Liu, and Sherry Emery. 2016. Identifying Search Keywords for Finding Relevant Social Media Posts. Proceedings of the AAAI Conference on Artificial Intelligence 30, 1 (Mar. 2016). https://ojs.aaai.org/index.php/AAAI/article/view/10387
[15] Xin Zheng, Aixin Sun, Sibo Wang, and Jialong Han. 2017. Semi-Supervised Event-Related Tweet Identification with Dynamic Keyword Generation. In