Combating Hostility: Covid-19 Fake News and Hostile Post Detection in Social Media
Omar Sharif∗, Eftekhar Hossain†, and Mohammed Moshiul Hoque∗
∗Department of Computer Science and Engineering
†Department of Electronics and Telecommunication Engineering
Chittagong University of Engineering and Technology, Bangladesh
{omar.sharif, eftekhar.hossain, moshiul_240}@cuet.ac.bd

Abstract.
This paper presents a detailed description of the system, and its results, that was developed as a part of our participation in the CONSTRAINT shared task at AAAI-2021. The shared task comprises two tasks: (a) COVID-19 fake news detection in English and (b) hostile post detection in Hindi. Task-A is a binary classification problem with fake and real classes, while task-B is a multi-label multi-class classification task with five hostile classes (i.e. defame, fake, hate, offence, non-hostile). Various techniques are used to perform the classification, including SVM, CNN, BiLSTM, and CNN+BiLSTM with tf-idf and Word2Vec embedding techniques. Results indicate that SVM with tf-idf features achieved the highest weighted f-score of 94.39% on the test set in task-A. Label powerset SVM with n-gram features obtained the maximum coarse-grained and fine-grained f-scores of 86.03% and 50.98% respectively on the task-B test set.

Keywords:
Natural language processing · Fake news detection · Hostile post classification · Machine learning · Deep learning
1 Introduction

In recent years, there has been a phenomenal surge in the number of users on social media platforms (e.g. Facebook, Twitter) who communicate, publish content, and express their opinions. The swelling number of users has resulted in the generation of a countless amount of posts on social media platforms [1]. Although communication has proliferated via social media platforms, they also create space for anti-social and unlawful activities such as disseminating preposterous information, rumours, bullying, harassment, stalking, trolling, and hate speech [2][3]. During emergencies and crises, these anti-social behaviours stir up immensely and thus, deliberately or unintentionally, create a hazardous effect on a community. The COVID-19 pandemic is one such situation that has changed people's lifestyles by confining them to their homes and engaging them in spending more time on social media. As a result, many online social media users post hostile (such as fake, offensive, defamatory) content by crossing the line defined by constitutional rights. Moreover, hostile posts on COVID-19 are a matter of concern, as they can impel people to take extreme actions by believing the posts to be real. In order to combat hostile content, the development of an automated system is of utmost importance. To address this issue, the key contributions of this work are as follows:
• Develop various machine learning and deep learning-based models to detect hostile texts in social media.
• Present performance analysis and qualitative error analysis of the systems.
The rest of the paper is organized as follows: related work is discussed in Section 2. The problem definition and a brief analysis of the dataset are presented in Section 3. Section 4 describes the methods used to develop the system. Findings and the results of the error analysis are presented in Section 5. Finally, the conclusion and future directions are given in Section 6.
2 Related Work

Identification and classification of hostile content have become a prominent research issue in recent years. Various machine and deep learning approaches have achieved reasonable accuracy in solving NLP tasks such as fake news, hate speech and abusive language detection. Saroj et al. [4] used an SVM classifier along with tf-idf features to identify hate speech and offensive language. They obtained macro f-scores of 78% and 72% respectively for Arabic and Greek datasets. Ibrohim et al. [5] exploited a combination of n-gram tf-idf features with SVM and Naive Bayes (NB) classifiers to detect abusive language. Their approach achieved the highest f-score of 86.43% for the NB classifier with unigram features. A machine learning-based approach was used by Gaydhani et al. [6] to classify hate speech and offensive language on Twitter. This work used tf-idf for varying n-gram ranges and experimented with multiple classifiers such as logistic regression (LR) and SVM. Ibrohim et al. [7] used a machine learning approach with problem transformation methods for classifying multi-label hate speech and abusive content written in the Indonesian language. The authors exploited various transformation techniques such as binary relevance (BR), label powerset (LP), and classifier chain (CC). They obtained the highest accuracy of 76.16% by using word unigram features and an ensemble approach with the LP data transformation method. To detect offensive language, an LSTM and word embedding based technique was proposed by Goel et al. [8]. Sadiq et al. [9] employed a combination of CNN and BiLSTM methods to identify aggression on Twitter. In order to detect fake news, a hybrid convolutional neural network (CNN) was developed by Wang et al. [10]. This work also exploited BiLSTM, LR and SVM techniques for the classification.

3 Task and Dataset Description

The CONSTRAINT shared task comprises two tasks: task-A and task-B. The task definitions with various class labels and a brief analysis of the dataset are described in the following subsections.
The goal of task-A is to identify whether a tweet contains real or fake information. The tweets are related to the COVID-19 pandemic and written in English. In task-B, we have to perform multi-label multi-class classification over five hostile dimensions: fake news, hate speech, offensive, defamation and non-hostile. The organizers culled hostile posts in Hindi Devanagari script from Facebook and Twitter. To better understand the task, it is essential to have a clear idea of the class labels. The organizers [11][12] have defined the various hostile and fake classes as follows (a short illustration of the multi-label target encoding is given after this list):
• Fake:
Articles, posts and tweets that provide information or make claims which are verified not to be true.
• Real:
Articles, posts and tweets that provide verified information and make authentic claims.
• Hate speech:
Posts with the malicious intention of spreading hate and violence against a specific group or person based on specific characteristics such as religious beliefs, ethnicity, and race.
• Offensive:
A post containing vulgar, rude, impolite and obscene language to insult a targeted individual or group.
• Defamation:
Posts spreading misinformation against a group or individual with the aim of publicly damaging their social identity.
• Non-hostile:
Posts without any hostility.
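Since a task-B post may carry several of these hostile dimensions at once, the targets are naturally represented as binary indicator vectors. Below is a minimal sketch, assuming scikit-learn's MultiLabelBinarizer; the example label tuples are illustrative, not drawn from the dataset.

```python
# Hypothetical illustration: encoding task-B's overlapping labels as
# binary indicator vectors with scikit-learn's MultiLabelBinarizer.
from sklearn.preprocessing import MultiLabelBinarizer

classes = ["defamation", "fake", "hate", "offensive", "non-hostile"]
mlb = MultiLabelBinarizer(classes=classes)

# A post may carry several hostile dimensions at once.
y = mlb.fit_transform([
    ("fake", "hate"),                       # hostile post with two dimensions
    ("non-hostile",),                       # no hostility
    ("defamation", "offensive", "hate"),    # three overlapping dimensions
])
print(mlb.classes_)   # ['defamation' 'fake' 'hate' 'offensive' 'non-hostile']
print(y)              # e.g. first row -> [0 1 1 0 0]
```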
Classifier models have been trained and tested with the dataset provided by the organizers (https://constraint-shared-task-2021.github.io/). A validation set is utilized to tune the model parameters and settle on the optimal hyperparameter combination. There are two classes in task-A, and we have to deal with five overlapping classes in task-B. The number of instances used to train, validate and test the models is summarized in Table 1.

                  Task-A           Task-B
            Real   Fake   Defame  Fake  Hate  Offense  Non-Hostile
Train       3360   3060   564     1144  792   742      3050
Validation  1120   1020   77      160   103   110      435
Test        1120   1020   169     334   237   219      873

Table 1: Number of instances in the train, validation and test sets for each task

To get useful insights, we investigated the train set. Statistics of the train set are exhibited in Table 2. From the distribution, it is observed that the training set is highly imbalanced for both tasks. In task-A, the real class has a total of about 100k words, while the fake class has only about 64k words. Although there is a considerable difference in the number of total words, the number of unique words in both classes is approximately identical. That means words in the real class are more frequent than words in the fake class. In task-B, the total number of words in the non-hostile class is about four times that of the defame class. On average, the fake class has the maximum, and the non-hostile class the minimum, number of words per text.

Task    Class        Total   Unique  Max text        Avg. no. of
                     words   words   length (words)  words per text
Task-A  Real         102100  10029   58              30.39
        Fake         64929   9980    1409            21.22
Task-B  Defame       17287   4051    69              30.65
        Fake         40265   7129    403             35.22
        Hate         25982   5101    116             32.80
        Offense      21520   4624    123             29.00
        Non-Hostile  69481   9485    60              22.78

Table 2: Training set statistics

Fig. 1: Number of tweets/posts in varying length ranges for different classes in the training set: (a) Task-A, (b) Task-B

Figure 1 depicts the number of texts that fall in various length ranges. It is observed that fake tweets are relatively shorter than real tweets. Approximately 2000 fake tweets have a length of less than 20 words. In contrast, more than 2800 tweets of the real class have more than 20 words. Only a fraction of tweets have more than 60 words. Meanwhile, in task-B, the non-hostile class dominates in every length range. This difference occurs because the number of instances in the non-hostile class is higher compared to the other classes. The length of most posts is in the range of 20 to 40 words. The analysis of the training set was performed after removing punctuation, numbers and other unwanted characters.
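As an illustration, statistics like those in Table 2 can be derived with a few lines of Python. The helper below is a hypothetical sketch; the cleaning regex and the sample texts are assumptions (the character class keeps Latin and Devanagari letters to cover both tasks).

```python
# Sketch: per-class corpus statistics as reported in Table 2.
import re

def class_stats(texts_of_class):
    # Strip punctuation/numbers, as done before the training-set analysis;
    # keep Latin and Devanagari letters plus whitespace.
    cleaned = [re.sub(r"[^A-Za-z\u0900-\u097F\s]", " ", t) for t in texts_of_class]
    tokens_per_text = [t.split() for t in cleaned]
    all_tokens = [tok for toks in tokens_per_text for tok in toks]
    return {
        "total_words": len(all_tokens),
        "unique_words": len(set(all_tokens)),
        "max_text_length": max(len(toks) for toks in tokens_per_text),
        "avg_words_per_text": len(all_tokens) / len(tokens_per_text),
    }

print(class_stats(["COVID19 vaccine is ready!", "stay home, stay safe"]))
```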
4 Methodology

Figure 2 presents the schematic diagram of our system, which has three major phases: preprocessing, feature extraction and classification.

Fig. 2: Schematic view of the hostile text detection system
Raw input texts are processed in the preprocessing phase. All flawed characters, numbers, emojis and punctuation marks are discarded from the texts. Identical preprocessing techniques are used for both tasks. For the deep learning methods, texts are converted into fixed-length numeric sequences using the 'tokenizer' and 'pad_sequences' methods of Keras (https://keras.io/api/preprocessing/text/). After preprocessing, features are extracted to train the ML and DL models. N-gram features of the texts are extracted with the tf-idf technique [13]. Although tf-idf is a useful feature extraction technique, it cannot capture a word's semantic meaning in a text. To tackle this, we used the Word2Vec word embedding technique [14], which maps words into a dense feature vector. Furthermore, pre-trained word vectors [15] are also explored to investigate the models' effectiveness. These features are employed with different methods to perform the tasks; a sketch of the preprocessing and feature extraction pipeline is given below.
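The following is a minimal sketch of this pipeline, assuming TensorFlow/Keras and gensim; the cleaning regex, the max_len cap and the toy texts are assumptions, not the authors' exact settings.

```python
# Sketch: cleaning, fixed-length sequence encoding, and Word2Vec training.
import re
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from gensim.models import Word2Vec

def clean(text):
    # Drop numbers, emojis, punctuation and other flawed characters;
    # keep Latin and Devanagari letters plus whitespace.
    return re.sub(r"[^A-Za-z\u0900-\u097F\s]", " ", text).lower()

texts = ["COVID-19 vaccine approved!!!", "Stay home, stay safe :)"]
cleaned = [clean(t) for t in texts]

max_len = 60                       # assumed cap; most posts are 20-40 words
tokenizer = Tokenizer()
tokenizer.fit_on_texts(cleaned)    # build the vocabulary on training texts
seqs = tokenizer.texts_to_sequences(cleaned)
X = pad_sequences(seqs, maxlen=max_len, padding="post")
print(X.shape)                     # (2, 60) fixed-length numeric sequences

# Word2Vec embeddings trained on the corpus; sizes are illustrative
# (use size= instead of vector_size= for gensim < 4).
w2v = Word2Vec([t.split() for t in cleaned],
               vector_size=64, window=5, min_count=1)
```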
We have employed different machine learning (ML) and deep learning (DL) approaches to crack the tasks. In this section, we describe the methods used to develop the classifier models for each task.

Task-A: a binary classification problem where we have to detect whether a tweet is fake or real. Sketches of the two approaches below are given after this list.
• ML approach: SVM is implemented with both 'linear' and 'rbf' kernels along with the tf-idf feature extraction technique. A combination of unigram and bigram features is utilized. The value of 'C' and other parameters are fixed by a trial-and-error approach. Furthermore, SVM is applied on top of word embedding features as well. Embedding vectors are generated by varying the window size, embedding dimension and other influential parameters. Other ML methods (logistic regression, decision tree, random forest) are also exploited. However, the performance of SVM is superior compared to the other techniques.
• DL approach: We have implemented a convolutional neural network (CNN), a bidirectional long short-term memory (BiLSTM) network, and a combined (CNN+BiLSTM) network. For experimentation, the network architectures are varied in the number of layers, number of neurons, dropout rate, learning rate, and other hyperparameters. In the CNN, we used 64 convolution filters with kernel size 5. The BiLSTM network comprises 32 bidirectional cells with a dropout rate of 0.2. In the combined model, these two networks are sequentially added one after another.
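The sketches below show how these two task-A configurations could be assembled, assuming scikit-learn and Keras. The hyperparameters named above (unigram+bigram tf-idf, 64 filters of size 5, 32 bidirectional cells, dropout 0.2) come from the description; vocab_size, max_len, the pooling layer and the variable names are assumptions.

```python
# Sketch 1: SVM over tf-idf unigram+bigram features (task-A's best model).
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

svm_clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # unigrams + bigrams
    ("svm", SVC(kernel="linear")),  # 'rbf' was also tried; C set by trial and error
])
# svm_clf.fit(train_texts, train_labels); svm_clf.predict(test_texts)
```

The combined network can be stacked sequentially, as described:

```python
# Sketch 2: the combined CNN+BiLSTM network; vocab_size/max_len are assumed.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, MaxPooling1D,
                                     Bidirectional, LSTM, Dense, Dropout)

vocab_size, max_len = 20000, 60    # illustrative values
model = Sequential([
    Embedding(vocab_size, 64, input_length=max_len),
    Conv1D(64, 5, activation="relu"),   # 64 convolution filters, kernel size 5
    MaxPooling1D(2),                    # pooling layer is an assumption
    Bidirectional(LSTM(32)),            # 32 bidirectional cells
    Dropout(0.2),                       # dropout rate 0.2
    Dense(1, activation="sigmoid"),     # fake vs. real
])
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])
```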
Task-B: a multi-label multi-class problem where our aim is to identify the different hostile dimensions. A sketch of the ML configuration is given after this list.
• ML approach: We used label powerset (LP) [16] along with a support vector machine (LPSVM) classifier to develop our model. LP generates a new label for each label combination in the training corpus, thus transforming a multi-label problem into a multi-class problem [17]. At first, we use tf-idf feature values of n-grams (1, 3) with a linear SVM where C = 1.7. To reduce computational complexity, features that appear in fewer than five documents are discarded. In other approaches, unigram and n-gram (1, 2) features are applied along with LPSVM, where the value of 'C' is taken as 3.8 and 0.7, respectively. In both cases, the linear kernel is used.
• DL approach: A BiLSTM network is employed with the Word2Vec embedding technique to capture the sequential and semantic features of the texts. To obtain the embedding vectors, we use the entire training corpus with embedding dimension 64. These features are propagated to the LSTM layer consisting of 32 bidirectional cells. The BiLSTM layer's output is transferred to a dense layer having nodes equal to the number of classes (5), where the sigmoid activation function is used. A dropout layer is also employed.
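A minimal sketch of the best task-B configuration described above, assuming the scikit-multilearn package for the label powerset transformation; variable names and the commented fit/predict calls are illustrative. The DL variant corresponds to the task-A network above with its final layer replaced by a 5-unit sigmoid dense layer.

```python
# Sketch: label powerset over a linear SVM with tf-idf n-gram (1, 3) features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from skmultilearn.problem_transform import LabelPowerset

# min_df=5 drops features that appear in fewer than five documents.
vectorizer = TfidfVectorizer(ngram_range=(1, 3), min_df=5)
# X_train = vectorizer.fit_transform(train_texts)
# y_train: binary indicator matrix of shape (n_samples, 5)

lp_svm = LabelPowerset(classifier=LinearSVC(C=1.7))  # linear kernel, C = 1.7
# lp_svm.fit(X_train, y_train)
# y_pred = lp_svm.predict(vectorizer.transform(test_texts))
```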
5 Results and Analysis

Experiments were conducted on the Google Colaboratory platform with Python 3.6.9. Machine learning models are developed using the scikit-learn 0.22.2 package. Besides, Keras 2.4.0 with the TensorFlow 2.3.0 framework is chosen to implement the deep learning models. Models are developed on the train data, whereas the validation set is used for tweaking the model parameters. Several measures such as accuracy (A), precision (P), recall (R) and f-score (F) are chosen to evaluate the models. Moreover, coarse-grained and fine-grained scores are used to compare the performance of the models.

In task-A, the superiority of a model is determined based on the weighted f-score. On the other hand, the coarse-grained (CG) and fine-grained (FG) f-scores are used to find the best task-B model. Table 3 presents the evaluation results of task-A on the test set. We have reported the outcomes of four models. The results reveal that the combined CNN and BiLSTM approach achieved an f-score of 92.01%. In contrast, the two models combining SVM and Word2Vec obtained slightly higher f-scores of 92.66% and 92.94% respectively. However, SVM with the tf-idf approach shows about a 2% rise and exceeds all other approaches by achieving the highest f-score (94.39%). Our best model lags by almost 4% behind the best result (f-score = 98.69%) obtained in task-A.

Method                    A      P      R      F
CNN + BiLSTM              92.01  92.01  92.01  92.01
SVM + tf-idf              94.35  94.42  –      94.39
SVM + Word2Vec (ED=200)   92.66  92.67  92.66  92.66
SVM + Word2Vec (ED=150)   92.94  92.94  92.94  92.94
Best                      98.69  98.69  98.69  98.69
Table 3: Evaluation results of task-A on the test set. Here A, P, R, F denote accuracy, precision, recall and weighted f-score respectively, and ED indicates the embedding dimension

Evaluation results of task-B on the test set are presented in Table 4. With BiLSTM and word embedding features, we obtained CG and FG f-scores of 83.37% and 52.80% respectively. Employing LPSVM with unigram tf-idf features shows a slight rise in the CG f-score (84.10%). However, the FG f-score drops by approximately 3% to 49.12%. For n-grams (1, 2) with LPSVM, we observed a further rise in the CG f-score (85.31%), while the FG score also increased slightly to 50.98%. Surprisingly, we achieve the highest CG f-score of 86.03% by varying the n-gram range to (1, 3) with LPSVM. Nevertheless, the FG f-score decreases from the previous score to 50.66%. Meanwhile, for the defamation and hate classes, the BiLSTM + Word2Vec technique provides the highest f-scores of 28.65% and 52.06% respectively. On the other hand, the LPSVM + Ngram (1, 2) and LPSVM + Ngram (1, 3) methods give the highest f-scores of 64.97% (for the fake class) and 57.91% (for the offensive class). Compared to the best CG (f-score = 97.15%) and FG (f-score = 64.40%) obtained in task-B, our best performing model lags by more than 10% and 14% respectively.

Method                 CG     Defame  Fake   Hate   Offense  FG
BiLSTM + Word2Vec      83.37  28.65   63.63  52.06  55.72    52.80
LPSVM + Unigram        84.10  25.81   61.30  44.39  53.59    49.12
LPSVM + Ngram (1,2)    85.31  27.59   64.97  47.21  51.72    50.98
LPSVM + Ngram (1,3)    86.03  –       –      –      57.91    50.66
Table 4: Evaluation results of task-B on the test set. All values are f-scores; CG and FG denote the coarse-grained and fine-grained f-scores

The results reveal that SVM + tf-idf and LPSVM + Ngram (1, 3) are the best performing models for task-A and task-B respectively. To get more insights, a quantitative error analysis of the classification models was carried out using confusion matrices. Tables 5a-5f represent the confusion matrices of the classes for tasks A and B.
(a) Task-A
Class  Real  Fake
Real   1047  73
Fake   47    973

(b) Task-B: Defame
Class   Defame  Other
Defame  25      144
Other   42      1442

(c) Task-B: Fake
Class  Fake  Other
Fake   190   144
Other  116   1203

(d) Task-B: Hate
Class  Hate  Other
Hate   91    143
Other  75    1344

(e) Task-B: Non-Hostile
Class  NH   Other
NH     814  59
Other  170  610

(f) Task-B: Offensive
Class    Offense  Other
Offense  108      111
Other    49       1385
Table 5: Confusion matrices for SVM + tf-idf (task-A) and LPSVM + Ngram (1,3) (task-B)
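These per-class matrices follow a one-vs-other layout. Below is a minimal sketch of how they can be derived from binary indicator predictions, assuming scikit-learn; the toy arrays are illustrative, not model outputs.

```python
# Sketch: per-class (one-vs-other) confusion matrices as in Table 5.
import numpy as np
from sklearn.metrics import confusion_matrix

classes = ["defame", "fake", "hate", "offense", "non-hostile"]
y_true = np.array([[0, 1, 1, 0, 0], [0, 0, 0, 0, 1]])   # toy ground truth
y_pred = np.array([[0, 1, 0, 0, 0], [0, 0, 0, 0, 1]])   # toy predictions

for i, name in enumerate(classes):
    # labels=[1, 0] puts the positive class in the first row/column,
    # matching the "class vs. other" layout used above.
    cm = confusion_matrix(y_true[:, i], y_pred[:, i], labels=[1, 0])
    print(name, cm.ravel())  # TP, FN, FP, TN
```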
Quantitative analysis:
Table 5a indicates that a total of 73 real tweets are wrongly identified as fake, while the system marked 47 fake tweets as real. The false-negative rates in the defame, hate, and offensive classes are very high. Only 25, 91 and 108 posts are correctly classified among 169, 234 and 219 posts respectively for these classes. The majority of these posts are wrongly classified as either fake or non-hostile. In contrast, the false-positive rate is high for the fake and non-hostile classes. A total of 116 posts are incorrectly classified as fake, while the model could not identify 144 actual fake posts. The model even struggled to differentiate hostile from non-hostile posts and thus misclassified 170 hostile posts. Dataset imbalance might be the reason behind this poor performance: the model was trained with many fake and non-hostile instances compared to the other hostile classes. Increasing the number of training examples would undoubtedly help the model to perform better.
Qualitative analysis:
Some misclassified examples with their actual (A) and predicted (P) labels are presented in Table 6. After taking a closer look at the examples, we discovered some interesting facts. Errors occur due to some frequent terms, such as 'covid19', 'coronavirus', 'modi', and 'vaccine', that exist in both real and fake tweets. These influential words make it difficult to differentiate between fake and real. We also noticed that fake news claims unverified facts citing responsible agencies, i.e. the FDA and WHO. It is challenging to verify such claims and make a proper prediction. In the case of hostility detection, some posts express hostility inherently, which is very arduous to identify from surface-level analysis without understanding the context. Moreover, due to the overlapping characteristics of the hostile dimensions, confusion mostly occurred when separating the defame, hate and offensive posts. Analyzing the context of the posts might help to develop more successful models.
Surprisingly, the machine learning models have performed better than the deep learning models in the CONSTRAINT shared tasks. As deep learning techniques are data-driven methods, the lack of training examples in a few classes might be a reason for this peculiar behaviour. In order to handle this issue, pre-trained word embeddings can be utilized. However, no noticeable change in performance was observed: for example, the combined (CNN+BiLSTM) model obtained a 92.43% weighted f-score with pre-trained word vectors, which is lower than that of SVM (94.39%). After analysis, we realized that large pre-trained language models might help the system make predictions more accurately. Thus, after getting the actual text labels at the end of the shared task, we applied the BERT model [18] and noticed an astonishing rise in performance for both tasks. The task-A weighted f-score increased from 0.94 to 0.98, and the coarse-grained f-score increased from 0.86 to 0.97. Table 7 shows the outcomes of the BERT model.

          Task-A     Task-B
Method    f-score    Coarse-grained  Defame  Fake  Hate  Offense
BERT      0.98       0.97            0.38    0.79  0.46  0.54

Table 7: Results obtained by the BERT model on task-A and task-B
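The paper does not detail the BERT setup; below is a minimal sketch of how such a classifier could be fine-tuned for task-A, assuming the HuggingFace transformers library. The bert-base-uncased checkpoint, the sequence length and the single gradient step are assumptions, not the authors' exact recipe.

```python
# Sketch: fine-tuning a BERT sequence classifier for fake/real detection.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)   # fake vs. real

# Tokenize a toy tweet into fixed-length tensors (max_length is assumed).
batch = tokenizer(["this vaccine claim is unverified"], padding=True,
                  truncation=True, max_length=128, return_tensors="pt")
labels = torch.tensor([1])

outputs = model(**batch, labels=labels)  # returns cross-entropy loss + logits
outputs.loss.backward()                  # one illustrative training step
```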