2019 5th International Conference on Web Research (ICWR) | 2019

A Supervised Framework for Review Spam Detection in the Persian Language

 
 
 

Abstract


Sentiment analysis of online reviews has attracted increasing attention from both academia and industry. Although online reviews are valuable sources of information for detecting public opinion towards different aspects of products, they may be written by spammers with different purposes. Several methods have been proposed to detect such spam reviews in English, but no study has been reported on Persian spam detection so far. In the current study, Persian reviews of cell phones are investigated to find type 1 and type 2 spam, that is, fake reviews and reviews written only about brands, respectively. In the proposed framework, a labeled dataset, SpamPer, is first created by majority voting over human annotators' answers to 11 questions previously designed for spam detection. Then several preprocessing steps for the Persian language are performed to refine the training data. Finally, review-based and metadata features are extracted. The results obtained on 3000 reviews from SpamPer show that the decision tree achieves the best performance, with an F1-measure of 0.78. Moreover, the results reveal that SVM for unbalanced data and decision tree for balanced data achieve better performance when they are trained on the combination of metadata and review-based features.
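
The labeling step described in the abstract can be illustrated with a minimal sketch of majority voting. The abstract does not specify how votes are aggregated or how ties are handled, so the function name `majority_label`, the label strings, and the tie rule (return `None`) are all assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by a strict majority of votes, else None.

    Hypothetical helper: the tie rule and label names are assumptions.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # A strict majority requires more than half of all votes.
    if count * 2 > len(votes):
        return label
    return None


# Hypothetical votes from annotators for two reviews.
print(majority_label(["spam", "spam", "not_spam"]))
print(majority_label(["spam", "not_spam"]))
```

Returning `None` when no label has a strict majority keeps undecided reviews visible, so they can be re-annotated or dropped rather than labeled arbitrarily.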

Pages 203-207
DOI 10.1109/ICWR.2019.8765275
Language English
Journal 2019 5th International Conference on Web Research (ICWR)
