2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS) | 2021

Leveraging Universal Sentence Encoder to Predict Movie Genre


Abstract


Multi-label text classification (MLTC) is the task of assigning multiple labels or tags to a piece of text. Many real-world scenarios call for labels that describe the properties of an object, and an object often has more than one such property, so it needs more than one label. For textual data, one such scenario is assigning genres to a movie: in practice, a plot usually needs more than one genre to describe it. This makes movie genre prediction a popular choice for multi-label classification in the literature. This paper presents an in-depth analysis of an approach to movie genre prediction that uses a sequential model with the Universal Sentence Encoder (USE) for text encoding and label powerset (LP) as the problem transformation method. It also compares the model's performance across different optimizers. The best result is an F1-score of 0.69 and an accuracy of 0.89, obtained with the Adam optimizer, which equals or exceeds the performance reported in comparable literature in the same domain.
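The label powerset transformation named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the toy genre data are hypothetical, chosen only to show how each distinct combination of genres becomes a single class, which turns the multi-label problem into an ordinary multi-class one.

```python
from typing import Dict, FrozenSet, List, Sequence


def fit_label_powerset(label_sets: Sequence[Sequence[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one class id to each distinct combination of labels."""
    class_of: Dict[FrozenSet[str], int] = {}
    for labels in label_sets:
        key = frozenset(labels)  # order of genres within a movie is irrelevant
        if key not in class_of:
            class_of[key] = len(class_of)
    return class_of


def transform(label_sets: Sequence[Sequence[str]],
              class_of: Dict[FrozenSet[str], int]) -> List[int]:
    """Map each label combination to its single class id."""
    return [class_of[frozenset(labels)] for labels in label_sets]


def inverse_transform(class_ids: Sequence[int],
                      class_of: Dict[FrozenSet[str], int]) -> List[List[str]]:
    """Recover the (sorted) label combination behind each class id."""
    combos = {idx: key for key, idx in class_of.items()}
    return [sorted(combos[i]) for i in class_ids]


# Hypothetical toy data: one list of genres per movie.
genres = [
    ["Action", "Thriller"],
    ["Comedy"],
    ["Thriller", "Action"],
    ["Comedy", "Romance"],
]

class_of = fit_label_powerset(genres)
y = transform(genres, class_of)
print(y)  # [0, 1, 0, 2]
print(inverse_transform(y, class_of))
```

A multi-class classifier trained on `y` predicts one class id per movie, and `inverse_transform` maps that id back to a set of genres. One known trade-off of this transformation is that it can only predict genre combinations that appear in the training data.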

Volume 1
Pages 1013-1018
DOI 10.1109/ICACCS51430.2021.9441685
Language English
Journal 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS)
