Hyperparameter Tuning Feature Selection with Genetic Algorithm and Gaussian Naïve Bayes for Diabetes Disease Prediction

Authors

  • Ilham Firman Ashari, Institut Teknologi Sumatera
  • Meida Cahyo Untoro, Institut Teknologi Sumatera

DOI:

https://doi.org/10.61769/telematika.v17i1.488

Keywords:

Diabetes Mellitus, hyperparameter, feature selection, genetic algorithm, Naïve Bayes

Abstract

Diabetes mellitus is a disease caused by disorders of carbohydrate, fat, and protein metabolism associated with impaired insulin secretion. It is a degenerative disease that requires appropriate and serious treatment, and it can lead to serious complications such as heart disease, stroke, erectile dysfunction, kidney failure, and nervous system damage. Because diabetes has so many consequences, it is important to study this disease. This study aims to help prevent severe complications, help medical personnel predict the disease early, and reduce the resulting cost burden. Its purpose is to determine the accuracy obtained by using feature selection with a genetic algorithm and Naïve Bayes. Predictions are made using hyperparameter tuning with a genetic algorithm and Naïve Bayes, optimized by performing feature selection. After conducting the study, it was found that the accuracy with 17 features using a genetic algorithm was better than modeling with 10 features. Using 17 features and hyperparameter tuning with a genetic algorithm and Naïve Bayes modeling, the accuracy is 93.2%. Using 17 features without feature selection, the accuracy is 91.2%; the reported increase in accuracy is 1.5%.
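
The general idea of genetic-algorithm feature selection wrapped around a Gaussian Naïve Bayes classifier can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data (two informative and three noise features), the population size, number of generations, mutation rate, tournament selection, single-point crossover, and all function names are assumptions chosen only for illustration. Each chromosome is a 0/1 mask over the features, and its fitness is the accuracy of Gaussian Naïve Bayes on a hold-out validation set when only the masked-in features are used.

```python
import math
import random

def fit_gnb(X, y):
    """Estimate per-class prior, feature means, and feature variances."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A tiny constant guards against zero variance (an assumption).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[c] = (n / len(X), means, variances)
    return model

def predict_gnb(model, x):
    """Return the class with the highest log posterior for sample x."""
    def log_post(c):
        prior, means, variances = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, variances))
    return max(model, key=log_post)

def accuracy(mask, X_train, y_train, X_val, y_val):
    """Fitness: validation accuracy using only features where mask is 1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    project = lambda X: [[row[i] for i in idx] for row in X]
    model = fit_gnb(project(X_train), y_train)
    hits = sum(predict_gnb(model, x) == t
               for x, t in zip(project(X_val), y_val))
    return hits / len(y_val)

def ga_select(fitness, n_features, pop_size=20, generations=15,
              mutation_rate=0.1, rng=None):
    """Search for a feature mask with tournament selection, single-point
    crossover, bit-flip mutation, and elitism (all illustrative choices)."""
    rng = rng or random.Random(0)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    pop[0] = [1] * n_features  # seed with the "all features" baseline
    best = max(pop, key=fitness)

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

def make_data(n, rng):
    """Synthetic data: features 0-1 depend on the label, 2-4 are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
        X.append(informative + noise)
        y.append(label)
    return X, y

data_rng = random.Random(42)
X_train, y_train = make_data(200, data_rng)
X_val, y_val = make_data(100, data_rng)
fitness = lambda mask: accuracy(mask, X_train, y_train, X_val, y_val)
best_mask = ga_select(fitness, n_features=5)
print("selected mask:", best_mask)
print("accuracy with mask:", fitness(best_mask))
print("accuracy with all features:", fitness([1] * 5))
```

Because the all-features mask is seeded into the initial population and the best mask is carried forward each generation, the selected mask never scores below the all-features baseline on the validation set. That guarantee applies only to the data used for fitness; a real study would confirm the gain on a separate test set or with cross-validation.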

Author Biographies

Ilham Firman Ashari, Institut Teknologi Sumatera

Informatics Engineering

Meida Cahyo Untoro, Institut Teknologi Sumatera

Informatics Engineering

Published

2022-10-31

Section

Articles