Harassment detection: a benchmark on the #HackHarassment dataset
Alexei Bastidas, Edward Dixon, Chris Loo, John Ryan (Intel). Email: [email protected]
Keywords: Machine Learning, Natural Language Processing, Cyberbullying

Introduction
Online harassment has been a problem to a greater or lesser extent since the early days of the internet. Previous work has applied anti-spam techniques like machine-learning-based text classification (Reynolds, 2011) to detecting harassing messages. However, existing public datasets are limited in size, with labels of varying quality. Hack Harassment (a collaboration of tech companies and NGOs devoted to fighting bullying on the internet) has begun to address this issue by creating a new dataset superior to its predecessors in terms of both size and quality.

Related Work
Previous work in the area by Bayzick (2011) showed that machine learning and natural language processing could be successfully applied to detect bullying messages on an online forum. However, the same work also made clear that the limiting factor on such models was the availability of a suitable quantity of labeled examples. For example, the Bayzick work relied on a dataset of 2,696 samples, only 196 of which were found to be examples of bullying behaviour. Additionally, this work relied on model types like J48 and JRIP (types of decision tree) and k-nearest-neighbour classifiers like IBk, as opposed to popular modern ensemble methods or deep-neural-network-based approaches.
Methodology
Our work was carried out using the Hack Harassment dataset. All preprocessing, training and evaluation was carried out in Python, using the popular scikit-learn library (for feature engineering and linear models) in combination with NumPy (for matrix operations) and Keras and TensorFlow (for models based on deep neural networks, DNNs).
For the linear models, features were generated by tokenizing the text (breaking it apart into words), hashing the resulting unigrams, bigrams and trigrams (collections of one, two, or three adjacent words) and computing a TF-IDF weight for each hashed value. The resulting feature vectors were used to train and test Logistic Regression, Support Vector Machine and Gradient Boosted Tree models, with 80% of the data used for training and 20% held out for testing (results given are based on the held-out 20%).

For the DNN-based approach, a similar approach was taken to tokenization: both bigram and trigram hashes were computed; these were one-hot encoded, and dense representations of these features were learned during training, as per Joulin (2016). The FastText model used is a Python implementation of the model described in "Bag of Tricks for Efficient Text Classification" (Joulin, 2016). For the text encoding, bigrams and trigrams are used. 20% of the data was held out for testing.

The Recurrent Character-Level Neural Network model consists of two GRU layers of width 100 followed by a dense layer of size 2 with softmax on the output; between each of the layers, batch normalization is performed. The optimizer used was RMSprop. For data preparation, each character was one-hot encoded and each sample was truncated/padded to 500 characters in length. 20% of the data was held out for testing.

Results
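The linear-model pipeline described above can be sketched as follows. This is an illustrative sketch, not the exact configuration used in the experiments: the toy texts and labels are invented, and the specific choice of scikit-learn's HashingVectorizer plus TfidfTransformer is an assumption consistent with the description.

```python
# Sketch: hashed word uni/bi/trigrams, TF-IDF weighting, 80/20 split,
# and a Logistic Regression classifier. Toy data is purely illustrative.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

texts = ["you are awesome", "go away loser", "nice work", "nobody likes you"]
labels = [0, 1, 0, 1]  # 1 = harassing, 0 = non-harassing (invented)

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0)

model = make_pipeline(
    HashingVectorizer(ngram_range=(1, 3), alternate_sign=False),  # hash 1/2/3-grams
    TfidfTransformer(),   # TF-IDF weight each hashed feature
    LogisticRegression())
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```

The same pipeline can be reused for the SVM and Gradient Boosted Tree models by swapping the final estimator.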
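A minimal Keras sketch of the recurrent character-level model described above: one-hot characters truncated/padded to 500, two GRU layers of width 100 with batch normalization between layers, and a size-2 softmax output trained with RMSprop. The vocabulary size (128, printable ASCII) and the loss function are assumptions not specified in the text.

```python
# Sketch of the character-level GRU model; hyperparameters follow the
# description above, the character vocabulary of 128 is an assumption.
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, GRU, BatchNormalization, Dense

MAXLEN, VOCAB = 500, 128  # 500 chars per sample; 128 = ASCII assumption

def encode(text):
    """One-hot encode a string, truncated/padded to MAXLEN characters."""
    x = np.zeros((MAXLEN, VOCAB), dtype="float32")
    for i, ch in enumerate(text[:MAXLEN]):
        x[i, ord(ch) % VOCAB] = 1.0
    return x

model = Sequential([
    Input(shape=(MAXLEN, VOCAB)),
    GRU(100, return_sequences=True),  # first GRU layer, width 100
    BatchNormalization(),
    GRU(100),                         # second GRU layer, width 100
    BatchNormalization(),
    Dense(2, activation="softmax"),   # harassing vs. non-harassing
])
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")
```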
Model                                      Precision (Harassing)   Recall (Harassing)
Gradient Boosted Trees (scikit-learn)      0.80                    0.71
Bernoulli Naive Bayes                      0.54                    0.30
FastText                                   0.60                    0.78
Recurrent Character-Level Neural Network   0.71                    0.73
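The figures above are per-class precision and recall for the "Harassing" label on the held-out 20%. For illustration only, such scores can be computed with scikit-learn as follows; the labels and predictions here are invented, not taken from the experiments.

```python
# Per-class precision/recall for the positive ("harassing") class,
# computed on invented example labels and predictions.
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 1, 0, 1]  # 1 = harassing (invented ground truth)
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]  # invented model outputs

precision = precision_score(y_true, y_pred, pos_label=1)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred, pos_label=1)        # TP / (TP + FN)
```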
Conclusions
We have presented the first results on a new open cyberbullying/harassment dataset. While our models clearly demonstrate a degree of ability to discriminate between the content classes, the achieved precision in particular falls far short of our ambitions for
References
Reynolds, Kelly, April Kontostathis, and Lynne Edwards. "Using machine learning to detect cyberbullying." Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on. IEEE, 2011.

Joulin, Armand, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. "Bag of Tricks for Efficient Text Classification." arXiv preprint arXiv:1607.01759 (2016).

"Improved Cyberbullying Detection Through Personal Profiles."

"fastText." 2015. 22 Jul. 2016 <https://github.com/sjhddh/fastText>