Comparative Analysis of Machine Learning Models for Email Spam Detection

Mugi Lestari; Yasir Salih; Alim Jaizul

doi:10.47194/ijgor.v6i3.392

https://doi.org/10.47194/ijgor.v6i3.392

Authors

Mugi Lestari
mu2lestari@gmail.com
Yasir Salih, Department of Mathematics, Faculty of Education, Red Sea University, Sudan
Alim Jaizul, Research Collaboration Community, Bandung, Indonesia

Abstract

The growth of information technology has driven a significant increase in the use of email as a primary communication tool across many sectors. Spam email has become a serious problem that disrupts productivity and threatens data security and user privacy. Conventional rule-based spam filters are no longer considered effective against increasingly sophisticated and adaptive spam patterns, so a more dynamic and accurate approach based on Machine Learning is required. This study analyzes and compares the performance of four Machine Learning algorithms for detecting spam: Extra Trees Classifier, Random Forest, Support Vector Machine (SVM) with an RBF kernel, and CatBoost. The methodology comprises data acquisition from the SMS Spam Collection Dataset, preprocessing through text cleaning and feature extraction using Term Frequency–Inverse Document Frequency (TF-IDF), and model training and evaluation using the Accuracy, F1 Score, and ROC AUC metrics. The Extra Trees Classifier achieved the best overall performance, with an Accuracy of 97.29%, an F1 Score of 0.8814, and a ROC AUC of 0.9868. The tree-based ensemble models, particularly Extra Trees and Random Forest, were better at balancing precision and recall. The SVM (RBF) recorded the highest AUC value, but at the cost of a higher number of False Negatives. These findings can serve as a reference for developing more adaptive and effective Machine Learning–based spam detection systems.
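The feature extraction and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the cleaning rule, the smoothed IDF variant, the function names `clean`, `tfidf`, and `scores`, and the sample messages and labels are all assumptions made for illustration, and no classifier is trained here.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase the text and keep alphabetic tokens only (an assumed cleaning rule)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = raw term count * (ln((1 + n) / (1 + df)) + 1), a common smoothed
    IDF variant. The abstract does not say which variant the study used.
    """
    tokenized = [clean(d) for d in docs]
    n = len(tokenized)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def scores(y_true, y_pred):
    """Accuracy, precision, recall and F1, with spam (label 1) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up messages and labels, used only to exercise the functions.
messages = ["Free prize now", "See you now"]
vectors = tfidf(messages)

y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # three spam messages, seven legitimate
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # one spam message missed (a False Negative)
result = scores(y_true, y_pred)

print(vectors[0])
print(result)
```

In this toy case a single False Negative leaves accuracy at 0.9 but pulls recall down to 2/3 and F1 to 0.8. This is one reason a model can report high Accuracy alongside a noticeably lower F1 Score on data where spam is the minority class.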

Published

2025-08-25

Issue

Vol. 6 No. 3 (2025): International Journal of Global Operations Research (IJGOR), August 2025

Section

Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors who publish with this journal agree to the following terms:

Once an article has been received by the Editorial Board of the International Journal of Global Operations Research (IJGOR) and accepted for publication, copyright in the article is transferred to IJGOR.

The International Journal of Global Operations Research (IJGOR) holds the copyright in all published articles and has the right to reproduce and distribute them under the Creative Commons Attribution 4.0 International license.

The author transfers copyright to the journal by completing the copyright transfer form, which is available for download from the journal.
