Proceedings of The 6th Asia-Pacific Education And Science Conference, AECon 2020, 19-20 December 2020, Purwokerto, Indonesia | 2021

Text Mining to Analyse Publication Topics of COVID-19 using HDP and LDA Methods

 
 
 

Abstract


COVID-19 is a disease caused by the novel coronavirus, and almost all countries have been affected by it. This worldwide impact has led many researchers to conduct research related to COVID-19. We want to know which topics have been covered by the studies published by researchers in various countries. This research analyzes data crawled from the full abstracts of publications related to COVID-19 from January 2020 to August 2020. The abstract text was crawled and then preprocessed by removing punctuation, lowercasing, lemmatization, and stopword removal. The cleaned data was then analyzed with text mining methods to allocate topics, which can serve as information for future research. The methods used are the Hierarchical Dirichlet Process (HDP) and Latent Dirichlet Allocation (LDA) approaches. We also found that the LDA method has a coherence score 42% higher than that of the HDP method, which suggests that LDA is more appropriate in this case.
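
The preprocessing steps named in the abstract (punctuation removal, lowercasing, lemmatization, stopword removal) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `STOPWORDS` set and the `LEMMAS` lookup table are hypothetical, deliberately tiny stand-ins for a full stopword list and a real lemmatizer, and the sample sentence is invented for the example.

```python
import string

# Hypothetical, deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "and", "is", "are", "by", "to", "in", "with", "on", "for"}

# Assumption: a toy lookup table stands in for a real lemmatizer.
LEMMAS = {
    "patients": "patient",
    "infected": "infect",
    "studies": "study",
    "vaccines": "vaccine",
}

# Map every punctuation character to a space so "COVID-19" splits into two tokens.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(abstract: str) -> list:
    """Return the cleaned tokens of one abstract, in the order the steps are listed."""
    no_punct = abstract.translate(PUNCT_TABLE)            # 1. remove punctuation
    tokens = no_punct.lower().split()                     # 2. lowercase and tokenize
    lemmas = [LEMMAS.get(tok, tok) for tok in tokens]     # 3. lemmatize (toy lookup)
    return [tok for tok in lemmas if tok not in STOPWORDS]  # 4. drop stopwords


sample = "Studies of COVID-19 patients: the infected are treated with vaccines."
print(preprocess(sample))
```

The resulting token lists would be the input to a topic model such as LDA or HDP; the model fitting and coherence scoring are outside the scope of this sketch.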

DOI 10.4108/eai.19-12-2020.2309174
Language English
Journal Proceedings of The 6th Asia-Pacific Education And Science Conference, AECon 2020, 19-20 December 2020, Purwokerto, Indonesia
