Archive | 2021
Comparison of Effectiveness of Stemming Algorithms in Indonesian Documents
Abstract
Stemming is the process of reducing a word to its base form by applying a set of rules. In Bahasa Indonesia, this is done by removing prefixes, infixes, suffixes, or combinations of prefixes and suffixes from derived words. Several stemming algorithms for Bahasa Indonesia have been developed, but their effectiveness has not been studied. This study compares three of them: Sastrawi, the Nazief & Adriani algorithm, and the Arifin Setiono algorithm. We used 900 affixed words for the comparison. Each word was stemmed with all three algorithms, and the resulting base word was checked against the KBBI (the Indonesian dictionary) to determine whether it was correct. The comparison shows that Sastrawi performed best, reducing 95.2% of the tested affixed words to their root words. The Nazief & Adriani algorithm reached 92.4%, while the Arifin Setiono algorithm reached 89%. This suggests that the Arifin Setiono algorithm needs considerable improvement, because many affixed words could not be reduced to their root words.
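The evaluation described above can be sketched as a simple accuracy calculation: stem each affixed word, look the result up in a dictionary, and report the share of hits. The sketch below is illustrative only and is not the study's code. The names `stemming_accuracy`, `toy_stem`, and `TOY_DICTIONARY` are hypothetical, the toy stemmer is a deliberately naive stand-in for the three algorithms, and the four-word dictionary stands in for the KBBI. It also assumes that a stem counts as correct when it appears in the dictionary, which is a simplification of the check described in the abstract.

```python
from typing import Callable, Iterable, Set


def stemming_accuracy(
    words: Iterable[str],
    stem: Callable[[str], str],
    dictionary: Set[str],
) -> float:
    """Return the percentage of words whose stem is found in the dictionary."""
    words = list(words)
    if not words:
        return 0.0
    hits = sum(1 for word in words if stem(word) in dictionary)
    return 100.0 * hits / len(words)


def toy_stem(word: str) -> str:
    """Hypothetical, naive stemmer: strip at most one prefix and one suffix.

    This is NOT Sastrawi, Nazief & Adriani, or Arifin Setiono; it only
    gives the accuracy function something to evaluate.
    """
    for prefix in ("me", "di", "ber"):
        if word.startswith(prefix) and len(word) > len(prefix) + 2:
            word = word[len(prefix):]
            break
    for suffix in ("kan", "an", "i"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            word = word[: -len(suffix)]
            break
    return word


# Tiny stand-in for the KBBI (assumption: the real lookup uses the full dictionary).
TOY_DICTIONARY = {"makan", "baca", "jalan", "minum"}

# Hypothetical test words; the study used 900 affixed words.
TEST_WORDS = ["makanan", "dibaca", "berjalan", "minuman"]

accuracy = stemming_accuracy(TEST_WORDS, toy_stem, TOY_DICTIONARY)
```

In this toy run, "berjalan" is overstemmed to "jal" because the naive rule also strips the final "an" from the root "jalan", so the stem is not found in the dictionary and counts as a miss. Errors of this kind are what lower an algorithm's percentage in such a comparison.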