Author Profiling for Hate Speech Detection
Pushkar Mishra (Dept. of CS and Technology, University of Cambridge, United Kingdom)
Marco Del Tredici (ILLC, University of Amsterdam, The Netherlands)
Helen Yannakoudakis (Dept. of CS and Technology, The ALTA Institute, University of Cambridge, United Kingdom)
Ekaterina Shutova (ILLC, University of Amsterdam, The Netherlands)
Abstract
The rapid growth of social media in recent years has fed into some highly undesirable phenomena, such as the proliferation of abusive and offensive language on the Internet. Previous research suggests that such hateful content tends to come from users who share a set of common stereotypes and form communities around them. The current state-of-the-art approaches to hate speech detection are oblivious to user and community information and rely entirely on textual (i.e., lexical and semantic) cues. In this paper, we propose a novel approach to this problem that incorporates community-based profiling features of Twitter users. Experimenting with a dataset of 16k tweets, we show that our methods significantly outperform the current state of the art in hate speech detection. Further, we conduct a qualitative analysis of model characteristics. We release our code, pre-trained models and all the resources used in the public domain.

Introduction

Hate speech, a term used to collectively refer to offensive language, racist comments, sexist remarks, etc., is omnipresent in social media. Users on social media platforms are at risk of being exposed to content that may not only be degrading but also harmful to their mental health in the long term.
The Pew Research Center highlighted the gravity of the situation via a recently released report (Duggan, 2014). As per the report, 40% of adult Internet users have personally experienced harassment online, and 60% have witnessed the use of offensive names and expletives. Expectedly, the majority (66%) of those who have personally faced harassment have had their most recent incident occur on a social networking website or app. While most of these websites and apps provide ways of flagging offensive and hateful content, only 8.8% of the victims have actually considered using such provisions. These statistics suggest that passive or manual techniques for curbing the propagation of hateful content (such as flagging) are neither effective nor easily scalable (Pavlopoulos et al., 2017). Consequently, efforts to automate the detection and moderation of such content have been gaining popularity in natural language processing (NLP) (Waseem and Hovy, 2016; Wulczyn et al., 2017).

This work is licensed under a Creative Commons Attribution 4.0 International License. License details: http://creativecommons.org/licenses/by/4.0/

Several approaches to hate speech detection demonstrate the effectiveness of character-level bag-of-words features in a supervised classification setting (Djuric et al., 2015; Nobata et al., 2016; Davidson et al., 2017). More recent approaches, and currently the best performing ones, utilize recurrent neural networks (RNNs) to transform content into dense low-dimensional semantic representations that are then used for classification (Pavlopoulos et al., 2017; Badjatiya et al., 2017). All of these approaches rely solely on lexical and semantic features of the text they are applied to. Waseem and Hovy (2016) adopted a more user-centric approach based on the idea that perpetrators of hate speech are usually segregated into small demographic groups; they went on to show that gender information of authors (i.e., users who have posted content) is a helpful indicator. However, Waseem and Hovy focused only on coarse demographic features of the users, disregarding information about their communication with others. Yet previous research suggests that users who subscribe to particular stereotypes that promote hate speech tend to form communities online. For example, Zook (2012) mapped the locations of racist tweets in response to President Obama's re-election to show that such tweets were not uniformly distributed across the United States but formed clusters instead. In this paper, we present the first approach to hate speech detection that leverages author profiling information based on properties of the authors' social network, and we investigate its effectiveness.

Author profiling has emerged as a powerful tool for NLP applications, leading to substantial performance improvements in several downstream tasks, such as text classification, sentiment analysis and author attribute identification (Hovy, 2015; Eisenstein, 2015; Yang and Eisenstein, 2017). The relevance of the information gained from it is best explained by the idea of homophily, i.e., the phenomenon that people, both in real life and on the Internet, tend to associate more with those who appear similar. Here, similarity can be defined along various axes, e.g., location, age, language, etc. The strength of author profiling lies in the fact that if we have information about members of a community c defined by some similarity criterion, and we know that a person p belongs to c, we can infer information about p.
This concept has a straightforward application to our task: knowing that members of a particular community are prone to creating hateful content, and knowing that an author p is connected to this community, we can leverage information beyond linguistic cues and more accurately predict the use of hateful/non-hateful language by p. The questions that we seek to address here are: are some authors, and the respective communities that they belong to, more hateful than others? And can such information be effectively utilized to improve the performance of automated hate speech detection methods?

In this paper, we answer these questions and develop novel methods that take into account community-based profiling features of authors when examining their tweets for hate speech. Experimenting with a dataset of 16k tweets, we show that the addition of such profiling features to the current state-of-the-art methods for hate speech detection significantly enhances their performance. We also release our code (including code that replicates previous work), pre-trained models and the resources we used in the public domain.

Related Work

Amongst the first to apply supervised learning to the task of hate speech detection were Yin et al. (2009), who used a linear
SVM classifier to identify posts containing harassment based on local (e.g., n-grams), contextual (e.g., similarity of a post to its neighboring posts) and sentiment-based (e.g., presence of expletives) features. Their best results were obtained with all of these features combined.

Djuric et al. (2015) experimented with comments extracted from the Yahoo Finance portal and showed that distributional representations of comments learned using paragraph2vec (Le and Mikolov, 2014) outperform simpler bag-of-words (BOW) representations in a supervised classification setting for hate speech detection. Nobata et al. (2016) improved upon the results of Djuric et al. by training their classifier on a combination of features drawn from four different categories: linguistic (e.g., count of insult words), syntactic (e.g., POS tags), distributional semantic (e.g., word and comment embeddings) and BOW-based (word and character n-grams). They reported that while the best results were obtained with all features combined, character n-grams contributed more to performance than all the other features.

Waseem and Hovy (2016) created and experimented with a dataset of racist, sexist and clean tweets. Utilizing a logistic regression (LR) classifier to distinguish amongst them, they found that character n-grams coupled with gender information of users formed the optimal feature set; on the other hand, geographic and word-length distribution features provided little to no improvement. Working with the same dataset, Badjatiya et al. (2017) improved on their results by training a gradient-boosted decision tree (GBDT) classifier on averaged word embeddings learnt using a long short-term memory (LSTM) network that they initialized with random embeddings.

Waseem (2016) sampled more tweets in the same manner as Waseem and Hovy (2016). They recruited expert and amateur annotators to annotate the tweets as racism, sexism, both or neither, in order to study the influence of annotator knowledge on the task of hate speech detection. Combining this dataset with that of Waseem and Hovy (2016), Park et al. (2017) explored the merits of a two-step classification process. They first used an LR classifier to separate hateful and non-hateful tweets, followed by another LR classifier to distinguish between racist and sexist ones. They showed that this setup had comparable performance to a one-step classification setup built with convolutional neural networks.

Davidson et al. (2017) created a dataset of about 25k tweets wherein each tweet was annotated as being racist, offensive or neither of the two. They tested several multi-class classifiers with the aim of distinguishing clean tweets from racist and offensive tweets while simultaneously being able to separate the racist and offensive ones. Their best model was an LR classifier trained using TF-IDF and
POS n-gram features, as well as the count of hashtags and number of words.

Wulczyn et al. (2017) prepared three different datasets of comments collected from the English Wikipedia Talk pages; one was annotated for personal attacks, another for toxicity and the third one for aggression. Their best performing model was a multi-layered perceptron (MLP) classifier trained on character n-gram features. Experimenting with the personal attack and toxicity datasets, Pavlopoulos et al. (2017) improved on the results of Wulczyn et al. by using a gated recurrent unit (GRU) model to encode the comments into dense low-dimensional representations, followed by an LR layer to classify the comments based on those representations.

Author profiling has been leveraged in several ways for a variety of purposes in
NLP. For instance, many studies have relied on demographic information of the authors. Amongst these are Hovy (2015) and Ebrahimi and Dou (2016), who extracted age and gender-related information to achieve superior performance in a text classification task. Pavalanathan and Eisenstein (2015) further showed the relevance of the same information to automatic text-based geo-location. Researching along the same lines, Johannsen et al. (2015) and Mirkin et al. (2015) utilized demographic factors to improve syntactic parsing and machine translation respectively.

While demographic information has proved to be relevant for a number of tasks, it presents a significant drawback: since this information is not always available for all authors in a social network, it is not particularly reliable. Consequently, of late, a new line of research has focused on creating representations of the users of a social network by leveraging the information derived from the connections that they have with other users. In this case, node representations (where nodes represent the authors in the social network) are typically induced using neural architectures. Given the graph representing the social network, such methods create low-dimensional representations for each node, which are optimized to predict the nodes close to it in the network. This approach has the advantage of overcoming the absence of information that the previous approaches face. Among those that implement this idea are Yang et al. (2016), who used representations derived from a social graph to achieve better performance in entity linking tasks, and Chen and Ku (2016), who used them for stance classification.

A considerable amount of literature has also been devoted to sentiment analysis with representations built from demographic factors (Yang and Eisenstein, 2017; Chen et al., 2016). Other tasks that have benefited from social representations are sarcasm detection (Amir et al., 2016) and political opinion prediction (Tălmăcel and Leon, 2017).
Dataset

We experiment with the dataset of Waseem and Hovy (2016), containing tweets manually annotated for hate speech. The authors retrieved around 136k tweets over a period of two months. They bootstrapped their collection process with a search for commonly used slurs and expletives related to religious, sexual, gender and ethnic minorities. From the results, they identified terms and references to entities that frequently showed up in hateful tweets. Based on this sample, they used a public Twitter API to collect the entire corpus of ca. 136k tweets. After having manually annotated a randomly sampled subset of 16,914 tweets under the categories racism, sexism or none themselves, they asked an expert to review their annotations in order to mitigate against any biases. The inter-annotator agreement was reported at κ = 0.84, with a further insight that 85% of all the disagreements occurred in the sexism class.

The dataset was released as a list of 16,907 tweet IDs along with their corresponding annotations. Using Python's Tweepy library, we could only retrieve 16,202 of the tweets, since some of them have now been deleted or their visibility limited. Of the ones retrieved, 1,939 (12%) are labelled as racism, 3,148 (19.4%) as sexism, and the remaining 11,115 (68.6%) as none; this distribution follows the original dataset very closely (11.7%, 20.0%, 68.3%).

We were able to extract community-based information for 1,836 out of the 1,875 unique authors who posted the 16,202 tweets, covering a cumulative 16,124 of them; the remaining 39 authors have either deactivated their accounts or are facing suspension. Tweets in the racism class are from 5 of the 1,875 authors, while those in the sexism class are from 527 of them.

Community Graph

In order to leverage community-based information for the authors whose tweets form our dataset, we create an undirected, unlabeled community graph wherein nodes are the authors and edges are the connections between them.
An edge is instantiated between two authors u and v if u follows v on Twitter or vice versa. There are a total of 1,836 nodes and 7,561 edges. Approximately 400 of the nodes have no edges, indicating solitary authors who neither follow any other author nor are followed by any. The other nodes have an average degree of 8, with close to 600 of them having a degree of at least 5. The graph is overall sparse, with a density of 0.0075.

From this community graph, we obtain a vector representation, i.e., an embedding that we refer to as an author profile, for each author using the node2vec framework (Grover and Leskovec, 2016). Node2vec applies the skip-gram model of Mikolov et al. (2013) to a graph in order to create a representation for each of its nodes based on its position and its neighbors. Specifically, given a graph with nodes $V = \{v_1, v_2, \ldots, v_n\}$, node2vec seeks to maximize the following log probability:

$\sum_{v \in V} \log Pr(N_s(v) \mid v)$

where $N_s(v)$ denotes the network neighborhood of node $v$ generated through sampling strategy $s$.

In doing so, the framework learns low-dimensional embeddings for the nodes of the graph. These embeddings can emphasize either their structural role or the local community they are a part of. This depends on the sampling strategy used to generate the neighborhood: if breadth-first sampling (BFS) is adopted, the model focuses on the immediate neighbors of a node; when depth-first sampling (
DFS) is used, the model explores farther regions in the network, which results in embeddings that encode more information about the nodes' structural role (e.g., hub in a cluster, or peripheral node). The balance between these two ways of sampling the neighbors is directly controlled by two node2vec parameters, namely p and q. The default value for both is 1, which ensures a node representation that gives equal weight to structural and community-oriented information. In our work, we use the default value for both p and q. Additionally, since node2vec does not produce embeddings for solitary authors, we map these to a single zero embedding.

Figure 1 shows example snippets from the community graph. Some authors belong to densely-connected communities (left figure), while others are part of more sparse ones (right figure). In either case, node2vec generates embeddings that capture the authors' neighborhoods.

The dataset annotations are available at https://github.com/ZeerakW/hatespeech/blob/master/NAACL_SRW_2016.csv

The degree of a node is equal to the number of its direct connections to other nodes.
Figure 1: Snippets from the community graph for our Twitter data: (a) densely-connected authors; (b) sparsely-connected authors.
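The construction of such a community graph can be sketched in a few lines of pure Python. This is an illustrative sketch with our own function names and toy data, not the actual pipeline (which runs node2vec over the full Twitter graph); the density computation uses the standard undirected-graph formula.

```python
from collections import defaultdict

def build_community_graph(follow_pairs):
    """Build an undirected adjacency map: an edge (u, v) exists if
    u follows v on Twitter or vice versa."""
    adj = defaultdict(set)
    for u, v in follow_pairs:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def graph_stats(adj, all_authors):
    """Edge count, per-author degree, and density over the full author
    set (solitary authors have degree 0)."""
    n = len(all_authors)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    degrees = {a: len(adj.get(a, ())) for a in all_authors}
    density = 2 * num_edges / (n * (n - 1)) if n > 1 else 0.0
    return num_edges, degrees, density

# Toy example: four authors, one of them solitary.
follows = [("a", "b"), ("b", "c"), ("c", "a")]
adj = build_community_graph(follows)
edges, degrees, density = graph_stats(adj, ["a", "b", "c", "d"])
# edges == 3, degrees['d'] == 0, density == 2*3/(4*3) == 0.5
```

The adjacency map built this way is the input over which node2vec's random walks would then be run to produce the author profiles.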
Methods

We experiment with seven different methods for classifying tweets as one of racism, sexism, or none. We first re-implement three established and currently best-performing hate speech detection methods, based on character n-grams and recurrent neural networks, as our baselines. We then test whether incorporating author profiling features improves their performance.

Char n-grams (LR). As our first baseline, we adopt the method used by Waseem and Hovy (2016), wherein they train a logistic regression (LR) classifier on the Twitter dataset using character n-gram counts. We use uni-grams, bi-grams, tri-grams and four-grams, and L2-normalize their counts. Character n-grams have been shown to be effective for the task of hate speech detection (Nobata et al., 2016).

Hidden-state (HS). As our second baseline, we take the "
RNN" method of Pavlopoulos et al. (2017), which achieves state-of-the-art results on the Wikipedia datasets released by Wulczyn et al. (2017). The method comprises a 1-layer gated recurrent unit (GRU) that takes a sequence $w_1, \ldots, w_n$ of words represented as d-dimensional embeddings and encodes them into hidden states $h_1, \ldots, h_n$. This is followed by an LR layer that uses the last hidden state $h_n$ to classify the tweet. We make two minor modifications to the authors' original architecture: we deepen the 1-layer GRU to a 2-layer GRU and use softmax instead of sigmoid in the LR layer. Like Pavlopoulos et al., we initialize the word embeddings to GloVe vectors (Pennington et al., 2014). In all our methods, words not available in the GloVe set are randomly initialized within a small range around zero, indicating the lack of semantic information. By not mapping these words to a single random embedding, we mitigate against the errors that may arise due to their conflation (Madhyastha et al., 2015). A special OOV (out of vocabulary) token is also initialized in the same range. All the embeddings are updated during training, allowing some of the randomly-initialized ones to become task-tuned; the ones that do not get tuned lie closely clustered around the OOV token, to which unseen words in the test set are mapped.
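The initialization scheme described above (a fresh random vector per unknown word, plus a dedicated OOV token) can be sketched as follows. All names are ours, and the range value r is illustrative only, since the exact range used is not recoverable from this text.

```python
import random

def init_embeddings(vocab, pretrained, dim, r=0.05, seed=13):
    """Initialize word embeddings: pretrained (GloVe-style) vectors
    where available, otherwise a fresh random vector per word drawn
    from [-r, r].  One vector per word, so distinct unknown words are
    not conflated.  A dedicated <OOV> token handles words unseen at
    training time."""
    rng = random.Random(seed)
    emb = {}
    for w in vocab:
        if w in pretrained:
            emb[w] = list(pretrained[w])
        else:
            emb[w] = [rng.uniform(-r, r) for _ in range(dim)]
    emb["<OOV>"] = [rng.uniform(-r, r) for _ in range(dim)]
    return emb

def lookup(emb, word):
    """Map any word unseen at training time to the <OOV> vector."""
    return emb.get(word, emb["<OOV>"])
```

In the actual models these vectors are further updated during training; the sketch only covers the initialization and lookup behavior.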
Word-sum (WS). As a third baseline, we adopt the "
LSTM + GloVe + GBDT" method of Badjatiya et al. (2017), which achieves state-of-the-art results on the Twitter dataset we are using. The authors first utilize an
LSTM to task-tune GloVe-initialized word embeddings by propagating the error back from an LR layer. They then train a gradient-boosted decision tree (GBDT) classifier to classify texts based on the average of the embeddings of constituent words. We make two minor modifications to this method: we use a 2-layer
GRU instead of the LSTM to tune the embeddings, and we train the
GBDT classifier on the L2-normalized sum of the embeddings instead of their average. Although the authors achieved state-of-the-art results on Twitter by initializing embeddings randomly rather than with GloVe (which is what we do here), we found the opposite when performing a 10-fold stratified cross-validation (CV). A possible explanation lies in the authors' decision to not use stratification, which for such a highly imbalanced dataset can lead to unexpected outcomes (Forman and Scholz, 2010). Furthermore, the authors train their LSTM on the entire dataset (including the test set) without any early stopping criterion, which leads to over-fitting of the randomly-initialized embeddings.

We also experimented with 1-layer GRUs/LSTMs and 1/2-layer bi-directional GRUs/LSTMs, but performance only worsened or showed no gains; using sigmoid instead of softmax did not have any noteworthy effects on the results either. We note that the deeper 2-layer GRU slightly improves performance. Although the GBDT, as a tree-based model, is not affected by the choice of monotonic function, the L2-normalized sum ensures uniformity of range across the feature set in all our methods.
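The word-sum feature extraction (element-wise sum of constituent word embeddings, L2-normalized) can be sketched in pure Python; function and token names are ours, and a real run would use the task-tuned embeddings rather than the toy ones shown in the test.

```python
import math

def word_sum_features(tokens, emb, oov_key="<OOV>"):
    """Feature vector for a tweet: element-wise sum of its word
    embeddings (sum, not average, as in our modification),
    L2-normalized so that ranges stay uniform across the feature
    set.  Unknown words fall back to the OOV embedding."""
    dim = len(next(iter(emb.values())))
    vec = [0.0] * dim
    for t in tokens:
        e = emb.get(t, emb[oov_key])
        for i, x in enumerate(e):
            vec[i] += x
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec
```

The resulting vector is what the GBDT classifier consumes in the WS method.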
Author profile (AUTH). In order to test whether community-based information about authors is in itself sufficient to correctly classify the content produced by them, we utilize just the author profiles we generated to train a
GBDT classifier.
Char n-grams + author profile (LR + AUTH). This method builds upon the LR baseline by appending author profile vectors onto the character n-gram count vectors for training the LR classifier.

Hidden-state + author profile (HS + AUTH) and Word-sum + author profile (WS + AUTH). These methods are identical to the char n-grams + author profile method, except that here we append the author profiling features onto features derived from the hidden-state and word-sum baselines respectively and feed them to a
GBDT classifier.
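Combining textual features with the author profile is a simple vector concatenation; this is a sketch with our own names, using the zero-vector fallback for solitary or missing authors described earlier.

```python
def combine_features(text_features, author_profiles, author_id, dim=200):
    """Concatenate a tweet's textual feature vector with its author's
    profile embedding.  Solitary or unseen authors map to a zero
    vector of the profile dimensionality (200 in our setup)."""
    profile = author_profiles.get(author_id, [0.0] * dim)
    return list(text_features) + list(profile)
```

The concatenated vector is then fed to the downstream classifier (LR or GBDT, depending on the method).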
Experimental Setup

We normalize the input by lowercasing all words and removing stop words. For the
GRU architecture, we use exactly the same hyper-parameters as Pavlopoulos et al. (2017), i.e., 128 hidden units, Glorot initialization, cross-entropy loss, and the Adam optimizer (Kingma and Ba, 2015). Badjatiya et al. (2017) also use the same settings, except that they have fewer hidden units. In all our models, besides dropout regularization (Srivastava et al., 2014), we hold out a small part of the training set as validation data to prevent over-fitting. We implement the models in Keras (Chollet et al., 2015) with a
Theano back-end and use 200-dimensional pre-trained GloVe word embeddings. We employ
LightGBM (Ke et al., 2017) as our
GBDT classifier and tune its hyper-parameters using a 5-fold grid search. For the node2vec framework, we use the same parameters as in the original paper (Grover and Leskovec, 2016), except that we set the dimensionality of the node embeddings to 200 and increase the number of iterations to 25 for better convergence.
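For reference, the hyper-parameters stated above can be collected into a small configuration sketch; only values explicitly mentioned in the text are included, and the dictionary names are ours.

```python
# Hyper-parameters as stated in the text; anything not stated
# (e.g., dropout rate, grid-search ranges) is deliberately omitted.
GRU_CONFIG = {
    "hidden_units": 128,        # as in Pavlopoulos et al. (2017)
    "layers": 2,                # our deepened variant
    "loss": "cross_entropy",
    "optimizer": "adam",
    "embedding_dim": 200,       # pre-trained GloVe (Twitter) vectors
}
NODE2VEC_CONFIG = {
    "dimensions": 200,          # raised from the original paper's default
    "p": 1,                     # defaults: balanced BFS/DFS sampling
    "q": 1,
    "iterations": 25,           # increased for better convergence
}
```

Keeping the settings in one place like this makes it easy to confirm that the GRU and node2vec embedding dimensionalities match before concatenating features.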
Results

We perform 10-fold stratified cross-validation (CV), as suggested by Forman and Scholz (2010), to evaluate all seven methods described in the previous section. Following previous research (Badjatiya et al., 2017; Park and Fung, 2017), we report the average weighted precision, recall, and F1 scores for all the methods. The average weighted precision is calculated as:

$\frac{1}{10} \sum_{i=1}^{10} (w_r \cdot P_r^i + w_s \cdot P_s^i + w_n \cdot P_n^i)$

where $P_r^i$, $P_s^i$, $P_n^i$ are the precision scores on the racism, sexism, and none classes from the i-th fold of the CV. The values $w_r$, $w_s$, and $w_n$ are the proportions of the racism, sexism, and none classes in the dataset respectively; since we use stratification, these proportions are constant ($w_r = 0.12$, $w_s = 0.19$, $w_n = 0.69$) across all folds. Average weighted recall and F1 are calculated in the same manner.

The results are presented in Table 1. For all three baseline methods (LR, WS, and HS), the addition of author profiling features significantly improves performance (p < 0.05 under the 10-fold CV paired t-test). The LR + AUTH method yields the highest performance overall, exceeding its respective baseline F1 by nearly 4 points. A similar trend can be observed for the other methods as well. These results point to the importance of community-based information and author profiling in hate speech detection and demonstrate that our approach can further improve the performance of existing state-of-the-art methods.

The authors of Badjatiya et al. (2017) have not released their models, and we therefore replicate their approach based on the details in their paper.

Pre-trained embeddings: http://nlp.stanford.edu/data/glove.twitter.27B.zip
Table 1: Average weighted precision (P), recall (R) and F1 scores of the baselines (LR, HS, WS) and of the profiling-based methods (AUTH, LR + AUTH, HS + AUTH, WS + AUTH) on the Twitter dataset. All improvements are significant (p < 0.05) under the 10-fold CV paired t-test.
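The fold-averaged weighted precision defined above can be computed in a few lines; the function name is ours, and it generalizes to any number of folds rather than hard-coding 10.

```python
def average_weighted_precision(fold_precisions, weights):
    """Average weighted precision over CV folds.

    fold_precisions: list of (P_r, P_s, P_n) tuples, one per fold.
    weights: the class proportions (w_r, w_s, w_n), constant across
    folds under stratification."""
    w_r, w_s, w_n = weights
    per_fold = [w_r * pr + w_s * ps + w_n * pn
                for pr, ps, pn in fold_precisions]
    return sum(per_fold) / len(per_fold)
```

Average weighted recall and F1 follow the same pattern, with the per-class scores swapped in for the precisions.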
Table 2: Average weighted precision, recall and F1 of the methods on the racism and sexism classes separately. All improvements are significant (p < 0.05) under the 10-fold CV paired t-test.

In Table 2, we further compare the performance of the different methods on the racism and sexism classes individually. As in the previous experiments, the scores are averaged over the 10 folds of CV. Of particular interest are the scores on the sexism class, where F1 increases by over 10 points upon the addition of author profiling features. Upon analysis, we find that such a substantial increase in performance stems from the fact that many of the 527 unique authors of the sexist tweets are closely connected in the community graph. This allows their penchant for sexism to be expressed in their respective author profiles.

The author profiling features on their own (AUTH) achieve impressive results overall, and in particular on the sexism class, where their performance is typical of a community-based generalization, i.e., low precision but high recall. For the racism class, on the other hand, the performance of
AUTH on its own is quite poor. This contrast can be explained by the fact that tweets in the racism class come from only 5 unique authors who: (i) are isolated in the community graph, or (ii) have also authored several tweets in the sexism class, or (iii) are densely connected to authors from the sexism and none classes, which possibly camouflages their racist nature.

We believe that the gains in performance will become more pronounced as the underlying community graph grows, since there will be fewer solitary authors and more edges worth harnessing information from. Even when the data is skewed and there is an imbalance of hateful vs. non-hateful authors, we expect our approach to still be able to identify clusters of authors with similar views.
Analysis

We conduct a qualitative analysis of system errors and of the cases where author profiling leads to the correct classification of previously misclassified examples. Table 3 shows examples of hateful tweets from the dataset that are misclassified by the LR method but are correctly classified upon the addition of author profiling features, i.e., by the LR + AUTH method. It is worth noting that some of the wins scored by the latter are on tweets that are part of a larger hateful discourse or contain links to hateful content while not explicitly having textual cues that are indicative of hate speech per se. The addition of author profiling features may then be viewed as a proxy for wider discourse information, thus allowing us to correctly resolve the cases where lexical and semantic features alone are insufficient.

Regarding the scalability of our approach, we quote the authors of node2vec: "The major phases of node2vec are trivially parallelizable, and it can scale to large networks with millions of nodes in a few hours."
Table 3: Examples of improved classification upon the addition of author profiling features (AUTH). For instance, the tweet "@Mich McConnell Just "her body" right?" is predicted as none by LR but as sexism by LR + AUTH.

However, a number of hateful tweets still remain misclassified despite the addition of author profiling features. According to our analysis, many of these tend to contain
URLs to hateful content, e.g., "@salmonfarmer1: Logic in the world of Islam http://t.co/6nALv2HPc3" and "@juliarforster Yes. http://t.co/ixbt0uc7HN". Since Twitter shortens all
URLs into a standard format, there is no indication of what they refer to. One way to deal with this limitation could be to additionally maintain a blacklist of links. Another source of system errors is the deliberate obfuscation of words by authors in order to evade detection, e.g., "
Kat, a massive c*nt. The biggest ever on ". Current hate speech detection methods, including ours, do not directly attempt to address this issue. While this is a challenge for bag-of-words-based methods such as LR, we hypothesize that neural networks operating at the character level may be helpful in recognizing obfuscated words.

Figure 2: Visualization of author embeddings in 2-dimensional space.

We note that the annotators of the dataset took discourse into account when annotating the tweets. However, the dataset was released as a list of tweet ID and corresponding annotation (racism/sexism/none) pairs; there is no annotation available regarding which tweets are related to which other ones.
Figure 3: Visualization of authors from different classes: (a) none class; (b) sexism class.

We further conducted an analysis of the author embeddings generated by node2vec, in order to validate that they capture the relevant aspects of the community graph. We visualized the author embeddings in 2-dimensional space using t-SNE (van der Maaten and Hinton, 2008), as shown in Figure 2. We observe that, as in the community graph, there are a few densely populated regions in the visualization that represent authors in closely-knit groups who exhibit similar characteristics. The other regions are largely sparse, with smaller clusters. Note that we exclude solitary users from this visualization, since we have to use a single zero embedding to represent them.

Figure 3 further provides visualizations for authors from the sexism and none classes separately. While the authors from the none class are spread out in the embedding space, the ones from the sexism class are more tightly clustered. Note that we do not visualize the 5 authors from the racism class, since 4 of them are already covered in the sexism class.
Conclusion

In this paper, we explored the effectiveness of community-based information about authors for the purpose of identifying hate speech. Working with a dataset of 16k tweets annotated for racism and sexism, we first comprehensively replicated three established and currently best-performing hate speech detection methods based on character n-grams and recurrent neural networks as our baselines. We then constructed a graph of all the authors of tweets in our dataset and extracted community-based information in the form of dense low-dimensional embeddings for each of them using node2vec. We showed that the inclusion of author embeddings significantly improves system performance over the baselines and advances the state of the art in this task. Users prone to hate speech do tend to form social groups online, and this stresses the importance of utilizing community-based information for automatic hate speech detection. In the future, we wish to explore the effectiveness of community-based author profiling in other tasks, such as stereotype identification and metaphor detection.

References

[Amir et al.2016] Silvio Amir, Byron C. Wallace, Hao Lyu, and Paula Carvalho Mário J. Silva. 2016. Modelling context with user embeddings for sarcasm detection in social media. arXiv preprint arXiv:1607.00976.

[Badjatiya et al.2017] Pinkesh Badjatiya, Shashank Gupta, Manish Gupta, and Vasudeva Varma. 2017. Deep learning for hate speech detection in tweets. In
Proceedings of the 26th International Conference on World Wide Web Companion, WWW '17 Companion, pages 759–760, Republic and Canton of Geneva, Switzerland. International World Wide Web Conferences Steering Committee.

[Chen and Ku2016] Wei-Fan Chen and Lun-Wei Ku. 2016. UTCNN: a deep learning model of stance classification on social media text. arXiv preprint arXiv:1611.03599.

[Chen et al.2016] Huimin Chen, Maosong Sun, Cunchao Tu, Yankai Lin, and Zhiyuan Liu. 2016. Neural sentiment classification with user and product attention. In
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1650–1659.

[Chollet and others2015] François Chollet et al. 2015. Keras.

[Davidson et al.2017] Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. 2017. Automated hate speech detection and the problem of offensive language. In
Proceedings of the 11th International AAAIConference on Web and Social Media , ICWSM ’17.[Djuric et al.2015] Nemanja Djuric, Jing Zhou, Robin Morris, Mihajlo Grbovic, Vladan Radosavljevic, andNarayan Bhamidipati. 2015. Hate speech detection with comment embeddings. In
Proceedings of the 24thInternational Conference on World Wide Web , WWW ’15 Companion, pages 29–30, New York, NY, USA.ACM.[Duggan2014] Maeve Duggan. 2014. Online harassment.[Ebrahimi and Dou2016] Javid Ebrahimi and Dejing Dou. 2016. Personalized semantic word vectors. In
Pro-ceedings of the 25th ACM International on Conference on Information and Knowledge Management , pages1925–1928. ACM.[Eisenstein2015] Jacob Eisenstein. 2015. Written dialect variation in online social media.
Charles Boberg, JohnNerbonne, and Dom Watt, editors, Handbook of Dialectology. Wiley .[Forman and Scholz2010] George Forman and Martin Scholz. 2010. Apples-to-apples in cross-validation studies:Pitfalls in classifier performance measurement.
SIGKDD Explor. Newsl. , 12(1):49–57, November.[Grover and Leskovec2016] Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning fornetworks. In
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery andData Mining .[Hovy2015] Dirk Hovy. 2015. Demographic factors improve classification performance. In
Proceedings of the53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Confer-ence on Natural Language Processing (Volume 1: Long Papers) , volume 1, pages 752–762.[Johannsen et al.2015] Anders Johannsen, Dirk Hovy, and Anders Søgaard. 2015. Cross-lingual syntactic varia-tion over age and gender. In
Proceedings of the Nineteenth Conference on Computational Natural LanguageLearning , pages 103–112.[Ke et al.2017] Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. Lightgbm: A highly efficient gradient boosting decision tree. In I. Guyon, U. V. Luxburg,S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors,
Advances in Neural InformationProcessing Systems 30 , pages 3149–3157. Curran Associates, Inc.[Kingma and Ba2015] Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In
Proceedings of the 3rd International Conference on Learning Representations , ICLR ’15.[Le and Mikolov2014] Quoc V. Le and Tomas Mikolov. 2014. Distributed representations of sentences and docu-ments.
CoRR , abs/1405.4053.[Madhyastha et al.2015] Pranava Swaroop Madhyastha, Mohit Bansal, Kevin Gimpel, and Karen Livescu. 2015.Mapping unseen words to task-trained embedding spaces.
CoRR , abs/1510.02387.[Mikolov et al.2013] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation ofword representations in vector space. arXiv preprint arXiv:1301.3781 .[Mirkin et al.2015] Shachar Mirkin, Scott Nowson, Caroline Brun, and Julien Perez. 2015. Motivating personality-aware machine translation. In
Proceedings of the 2015 Conference on Empirical Methods in Natural LanguageProcessing , pages 1102–1108.[Nobata et al.2016] Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, and Yi Chang. 2016. Abu-sive language detection in online user content. In
Proceedings of the 25th International Conference on WorldWide Web , WWW ’16, pages 145–153, Republic and Canton of Geneva, Switzerland. International World WideWeb Conferences Steering Committee.Park and Fung2017] Ji Ho Park and Pascale Fung. 2017. One-step and two-step classification for abusive lan-guage detection on twitter. In
Proceedings of the First Workshop on Abusive Language Online , pages 41–45.Association for Computational Linguistics.[Pavalanathan and Eisenstein2015] Umashanthi Pavalanathan and Jacob Eisenstein. 2015. Confounds and conse-quences in geotagged twitter data. arXiv preprint arXiv:1506.02275 .[Pavlopoulos et al.2017] John Pavlopoulos, Prodromos Malakasiotis, and Ion Androutsopoulos. 2017. Deep learn-ing for user comment moderation. In
Proceedings of the First Workshop on Abusive Language Online , pages25–35. Association for Computational Linguistics.[Pennington et al.2014] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. Glove: Globalvectors for word representation. In
Empirical Methods in Natural Language Processing (EMNLP) , pages 1532–1543.[Srivastava et al.2014] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhut-dinov. 2014. Dropout: A simple way to prevent neural networks from overfitting.
Journal of Machine LearningResearch , 15:1929–1958.[T˘alm˘acel and Leon2017] Ciprian T˘alm˘acel and Florin Leon. 2017. Predicting political opinions in social net-works with user embeddings. In
Proceedings of the 2017 13th IEEE International Conference on IntelligentComputer Communication and Processing .[van der Maaten and Hinton2008] Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data usingt-SNE.
Journal of Machine Learning Research , 9:2579–2605.[Waseem and Hovy2016] Zeerak Waseem and Dirk Hovy. 2016. Hateful symbols or hateful people? predictivefeatures for hate speech detection on twitter. In
Proceedings of the NAACL Student Research Workshop , pages88–93, San Diego, California, June. Association for Computational Linguistics.[Waseem2016] Zeerak Waseem. 2016. Are you a racist or am i seeing things? annotator influence on hate speechdetection on twitter. In
Proceedings of the First Workshop on NLP and Computational Social Science , pages138–142. Association for Computational Linguistics.[Wulczyn et al.2017] Ellery Wulczyn, Nithum Thain, and Lucas Dixon. 2017. Ex machina: Personal attacksseen at scale. In
Proceedings of the 26th International Conference on World Wide Web , WWW ’17, pages1391–1399, Republic and Canton of Geneva, Switzerland. International World Wide Web Conferences SteeringCommittee.[Yang and Eisenstein2017] Yi Yang and Jacob Eisenstein. 2017. Overcoming language variation in sentimentanalysis with social attention.
Transactions of the Association for Computational Linguistics .[Yang et al.2016] Yi Yang, Ming-Wei Chang, and Jacob Eisenstein. 2016. Toward socially-infused informationextraction: Embedding authors, mentions, and entities. arXiv preprint arXiv:1609.08084 .[Yin et al.2009] Dawei Yin, Brian D. Davison, Zhenzhen Xue, Liangjie Hong, April Kontostathis, and Lynne Ed-wards. 2009. Detection of harassment on web 2.0. In