2019 5th International Conference on Science in Information Technology (ICSITech) | 2019

Categorical Data Classification based on Fuzzy K-Nearest Neighbor Approach

Abstract

An intelligent document classification system is needed to help organize complaints and to make administrators more efficient at sorting them and assigning each one a field category and a work unit category. Complaint data are unevenly distributed across categories, a condition known in text classification as imbalanced text. Complaints also contain ambiguous sentences, so a single complaint may plausibly belong to several field and work unit categories. Incorporating fuzzy theory into K-Nearest Neighbor can handle this sentence ambiguity. In the final stage, testing uses a multiclass confusion matrix to compute accuracy, precision, and recall. The best accuracy for field classification was 96.528% with K = 11 at an 80:20 train/test split, and the best accuracy for work unit classification was 97.375% with K = 9 at a 70:30 train/test split. K-Nearest Neighbor can therefore serve as an alternative for classifying categorical data because it achieves high accuracy, although its precision and recall still need improvement.
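The abstract does not say which fuzzy K-Nearest Neighbor formulation or distance measure was used. As a rough illustration only, the sketch below assumes the widely used Keller-style membership weighting (inverse distance raised to 2/(m-1), with fuzzifier m = 2), crisp training labels, and a Hamming distance over categorical attributes. The function names, the toy complaint tuples, and the category labels are all hypothetical. It also shows how accuracy and per-class precision and recall could be read off multiclass predictions.

```python
from collections import defaultdict


def hamming(a, b):
    """Count the positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def fuzzy_knn_memberships(train, query, k=3, m=2.0, eps=1e-9):
    """Return {label: membership} for `query`; memberships sum to 1.

    train: list of (features, label) pairs with crisp labels (assumption).
    m:     fuzzifier, must be greater than 1.
    eps:   small constant that avoids division by zero for exact matches.
    """
    # Pick the k closest training samples (ties keep input order).
    neighbors = sorted(train, key=lambda s: hamming(s[0], query))[:k]
    weights = defaultdict(float)
    total = 0.0
    for features, label in neighbors:
        d = hamming(features, query)
        w = 1.0 / (d ** (2.0 / (m - 1.0)) + eps)  # closer neighbors weigh more
        weights[label] += w
        total += w
    return {label: w / total for label, w in weights.items()}


def classify(train, query, k=3, m=2.0):
    """Assign the label with the highest fuzzy membership."""
    memberships = fuzzy_knn_memberships(train, query, k, m)
    return max(memberships, key=memberships.get)


def confusion_metrics(y_true, y_pred):
    """Return (accuracy, {label: (precision, recall)}) for multiclass labels."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_class = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        per_class[c] = (precision, recall)
    return accuracy, per_class


# Hypothetical toy data: three categorical attributes per complaint.
train = [
    (("road", "damage", "urgent"), "infrastructure"),
    (("road", "light", "normal"), "infrastructure"),
    (("clinic", "queue", "urgent"), "health"),
    (("clinic", "staff", "normal"), "health"),
]
query = ("road", "queue", "urgent")
memberships = fuzzy_knn_memberships(train, query, k=3)
print(memberships)
print(classify(train, query, k=3))
```

Because the memberships are graded rather than all-or-nothing, an ambiguous complaint shows up as a split between categories instead of being forced silently into one class, which is the property the abstract attributes to the fuzzy approach.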

Pages 171-175

DOI 10.1109/ICSITech46713.2019.8987477

Language English

Journal 2019 5th International Conference on Science in Information Technology (ICSITech)
