The Influence Of Applying Stopword Removal And SMOTE On Indonesian Sentiment Classification

Authors

  • Arif Bijaksana Putra Negara

DOI:

https://doi.org/10.24843/LKJITI.2023.v14.i03.p05

Abstract

Information such as public opinions or responses can be obtained from Twitter tweets. These opinions can be expressed as sentiment, which can be positive, neutral, or negative. Sentiment analysis (opinion mining) on a text can be performed through text classification. This research aims to determine the influence of applying Stopword Removal and SMOTE on a sentiment classification model for Indonesian tweets. The algorithms used are Logistic Regression and Random Forest. Based on the evaluation, the best classification model was obtained with the Random Forest algorithm combined with SMOTE, with an f1-score of 75.03%. The worst model was obtained with the Random Forest algorithm combined with Stopword Removal, with an f1-score of 68.09%. In both algorithms, Stopword Removal had a negative impact, lowering the resulting f1-score, whereas SMOTE had a positive impact, raising it. The decrease occurs because Stopword Removal can discard information and alter the meaning of the processed tweets, causing a tweet to lose its sentiment.
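The two preprocessing steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stopword list, the example tweet, and the function names `remove_stopwords` and `smote_sample` are assumptions made for illustration. The first function shows how removing a negation that appears in a stopword list can change the meaning of a tweet; the second shows the core idea of SMOTE, which is to create a synthetic minority-class point by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random

# Hypothetical stopword list for illustration only. Whether a negation such as
# "tidak" ("not") is included depends on the stopword list actually used.
STOPWORDS = {"yang", "ini", "itu", "dan", "di", "tidak"}


def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase the text and drop every token found in the stopword list."""
    return " ".join(tok for tok in text.lower().split() if tok not in stopwords)


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point in the style of SMOTE.

    A minority sample is picked at random, one of its k nearest minority
    neighbours (squared Euclidean distance) is chosen, and the new point is
    placed at a random position on the segment between the two.
    """
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    neighbours = sorted(
        (p for p in minority if p is not base),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
    )[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # interpolation factor in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# "Film ini tidak bagus" means "This film is not good"; after removal the
# negation is gone and the remaining tokens read as positive.
cleaned = remove_stopwords("Film ini tidak bagus")

# Toy minority-class feature vectors (assumed values).
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
synthetic = smote_sample(minority_points, k=2, rng=random.Random(42))
```

In a real pipeline the feature vectors would come from a text vectoriser, and SMOTE would be applied to the training split only, so that synthetic samples never leak into the evaluation data.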

Published

2025-10-13

How to Cite

[1]
Arif Bijaksana Putra Negara, “The Influence Of Applying Stopword Removal And SMOTE On Indonesian Sentiment Classification”, LKJITI, vol. 14, no. 03, pp. 172–185, Oct. 2025.