SpamShield: An Explainable Multi-Model Spam Detection Framework with Machine Learning and Deep Neural Networks

Yadav, Sainath; Nahar, Deep; Shingote, Omkar; Sinarkar, Prathamesh

doi:https://doi.org/10.55041/ijcope.v2i4.1044

Volume 02, Issue 05

Published on: May 2026

Department of Artificial Intelligence, Vishwakarma University, Pune, India

Abstract

This paper presents SpamShield, a production-grade, modular spam detection system that combines three classifier paradigms with post-hoc explainability from Local Interpretable Model-agnostic Explanations (LIME). The classifiers are a Multinomial Naive Bayes (MNB) model within a fully encapsulated sklearn Pipeline, a Support Vector Machine (SVM) optimised via GridSearchCV across a nine-combination hyperparameter grid, and a five-layer Deep Neural Network (DNN) with Batch Normalisation, progressive Dropout regularisation, and class-weighted training. Evaluated on the UCI SMS Spam Collection (5,571 messages, 13.4% spam), the DNN achieves 99.2% accuracy, F1 = 98.9%, and AUC = 0.999, with only 5 false positives on 1,115 test messages. The GridSearchCV-optimised SVM achieves 98.5% accuracy and F1 = 98.3% with 22 false positives, providing a practical high-performance alternative. A novel multi-class sklearn Pipeline replaces hardcoded regex category classification, achieving 94.0% accuracy across five spam sub-categories (Financial, Promotional, Scam, Adult, Phishing). The system also features sklearn Pipeline encapsulation to eliminate preprocessing skew between training and inference, rotating file logging, a three-tier Flask Blueprint architecture, real-time Gmail API integration with token-bucket rate limiting and automated spam-folder routing, LIME explanations with vocabulary learned from NB log-probability differences, and a fully dynamic frontend in which all metrics are fetched from REST APIs. This work demonstrates that architectural discipline (modular code, full Pipeline encapsulation, systematic hyperparameter search, and explainability integration) transforms a prototype classifier into a portfolio-grade production system.

Index Terms— email classification, hyperparameter optimization, interpretable artificial intelligence, automated message filtering, production-ready machine learning
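As a rough illustration of the Pipeline encapsulation and nine-combination grid search described in the abstract, the following Python sketch wraps a vectoriser and an SVM in a single scikit-learn Pipeline and tunes it with GridSearchCV. The TF-IDF features, the choice of C and kernel as the tuned parameters, their values, the F1 scoring, and the toy messages are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, ParameterGrid
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy messages for illustration only; not taken from the UCI SMS Spam Collection.
texts = [
    "win a free prize now", "claim your cash reward today",
    "free entry win cash now", "urgent prize claim free",
    "are we still meeting for lunch", "see you at home tonight",
    "can you call me after class", "thanks for the lunch today",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = ham

# Vectoriser and classifier live in one Pipeline, so the same
# preprocessing runs at fit time and at predict time.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True)),
    ("svm", SVC(class_weight="balanced")),
])

# Hypothetical 3 x 3 grid giving nine combinations; the parameters and
# values actually searched in the paper are not stated in the abstract.
param_grid = {
    "svm__C": [0.1, 1.0, 10.0],
    "svm__kernel": ["linear", "rbf", "sigmoid"],
}
n_combinations = len(ParameterGrid(param_grid))

# cv=2 only because the toy dataset is tiny; a real run would use more folds.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="f1")
search.fit(texts, labels)

# Raw strings go straight in; there is no separate transform step to keep in sync.
predictions = search.predict(["free cash prize", "lunch at home"])
print(n_combinations, search.best_params_, predictions)
```

Because the vectoriser is refit inside each cross-validation fold, vocabulary from held-out messages does not leak into training, and the fitted `search` object can be serialised and served as a single unit.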

How to Cite this Paper

Yadav, S., Nahar, D., Shingote, O. & Sinarkar, P. (2026). SpamShield: An Explainable Multi-Model Spam Detection Framework with Machine Learning and Deep Neural Networks. International Journal of Creative and Open Research in Engineering and Management, 02(05). https://doi.org/10.55041/ijcope.v2i4.1044

Yadav, Sainath, et al. "SpamShield: An Explainable Multi-Model Spam Detection Framework with Machine Learning and Deep Neural Networks." International Journal of Creative and Open Research in Engineering and Management, vol. 02, no. 05, 2026. doi:https://doi.org/10.55041/ijcope.v2i4.1044.

Yadav, Sainath, Deep Nahar, Omkar Shingote, and Prathamesh Sinarkar. "SpamShield: An Explainable Multi-Model Spam Detection Framework with Machine Learning and Deep Neural Networks." International Journal of Creative and Open Research in Engineering and Management 02, no. 05 (2026). https://doi.org/10.55041/ijcope.v2i4.1044.

Ethical Compliance & Review Process

• All submissions are screened under plagiarism detection.
• Review follows editorial policy.
• Authors retain copyright.
• Peer Review Type: Double-Blind Peer Review
• Published on: May 03 2026

This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. You are free to share and adapt this work for non-commercial purposes with proper attribution.
