Main Article Content

Abstract

 In late 2019, a flu-like illness that infects the lungs emerged in the city of Wuhan. The disease is suspected to have originated in bats. The WHO named the disease Covid-19, and the virus spread throughout the world, causing a pandemic. The government launched a vaccination drive to combat the virus, but the public response was divided between supporters and opponents. Many studies have examined public sentiment towards vaccination, and sentiment classification is one such approach. This study classifies sentiment towards Covid-19 vaccines on Twitter using the K-Nearest Neighbor algorithm and Fasttext. The data were obtained by crawling Twitter with the Python programming language and the Twitter API. The data were labeled through crowdsourcing with majority voting. After balancing, the data comprise 6000 training samples, 778 development samples, and 400 test samples. After various experiments and feature engineering, the best configuration achieved an accuracy of 69% and an F1-score of 60%. This is the best result compared with previous studies on the same dataset.
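
The pipeline summarized in the abstract (majority-vote labeling, word-embedding features, K-Nearest Neighbor classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional `TOY_VECTORS` table is a hypothetical stand-in for real Fasttext embeddings, the example tweets and annotator votes are invented for illustration, and averaging word vectors with cosine similarity is only one plausible way to build and compare features.

```python
from collections import Counter
import math


def majority_vote(labels):
    """Return the label chosen by the most annotators."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical 2-dimensional word vectors standing in for Fasttext embeddings.
TOY_VECTORS = {
    "safe": (0.9, 0.1),
    "effective": (0.8, 0.2),
    "thanks": (0.7, 0.1),
    "scared": (0.1, 0.9),
    "dangerous": (0.2, 0.8),
    "refuse": (0.1, 0.7),
}


def sentence_vector(text):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_predict(train, text, k=3):
    """Label a text by majority vote among its k most similar training texts."""
    query = sentence_vector(text)
    ranked = sorted(
        train,
        key=lambda item: cosine(sentence_vector(item[0]), query),
        reverse=True,
    )
    return majority_vote([label for _, label in ranked[:k]])


# Invented crowdsourced annotations: each tweet has three annotator votes.
annotations = {
    "vaccine is safe and effective": ["positive", "positive", "neutral"],
    "thanks vaccine feels safe": ["positive", "positive", "positive"],
    "scared the vaccine is dangerous": ["negative", "negative", "positive"],
    "i refuse it is dangerous": ["negative", "negative", "negative"],
}

# Resolve each tweet's final label by majority vote, then classify a new tweet.
train = [(text, majority_vote(votes)) for text, votes in annotations.items()]
prediction = knn_predict(train, "vaccine seems safe", k=3)
print(prediction)
```

In practice the toy table would be replaced by vectors from a trained Fasttext model, and the value of `k` would be tuned on the development data.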

Keywords

K-Nearest Neighbor, Fasttext, Sentiment Classification, Covid-19 Vaccine

Article Details

How to Cite
Safrizal AN, Surya Agustian. Classification of Covid-19 Vaccine Sentiment Using K-Nearest Neighbor and Fasttext on Twitter. EKSAKTA [Internet]. 2024 Sep 30 [cited 2025 Jan 21];25(03):362-71. Available from: https://eksakta.ppj.unp.ac.id/index.php/eksakta/article/view/384
