2024 | R.J. van Geest, G. Cascavilla, J. Hulstijn, N. Zannone
This research explores the applicability of a hybrid framework for automated phishing detection, addressing the limitations of single-analysis models that are vulnerable to sophisticated bypass attempts by cybercriminals. The study introduces a novel framework designed to enhance both the robustness and effectiveness of phishing detection in real-world scenarios. The framework combines multiple models to analyze different website features, including the URL, HTML content, and HTML DOM tree structure, and uses a stacking function to consolidate their predictions. The authors conduct a proof of concept to evaluate the framework's effectiveness, robustness, and detection speed, achieving an accuracy of 97.44% and demonstrating superior performance compared to individual models.
The research highlights the importance of considering multiple factors such as effectiveness, robustness, speed of detection, scalability, adaptability, and flexibility in the design of phishing detection systems. The findings provide insights into the strengths and limitations of hybrid models and emphasize the need for holistic approaches to address the critical challenge of robust phishing detection.
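
The stacking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `BaseScores` container, the logistic meta-function, and the weight and bias values are all hypothetical, chosen only to show how per-feature predictions (URL, HTML content, DOM structure) might be consolidated into one decision. In a real system the meta-learner's parameters would be fitted on held-out base-model predictions rather than set by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class BaseScores:
    """Phishing probabilities from three hypothetical base models."""
    url: float   # score from a URL-based model
    html: float  # score from an HTML-content model
    dom: float   # score from a DOM-tree-structure model


# Hypothetical meta-learner parameters (assumption, not from the paper).
WEIGHTS = {"url": 2.0, "html": 2.5, "dom": 1.5}
BIAS = -3.0


def stack(scores: BaseScores) -> float:
    """Combine the base-model probabilities with a logistic meta-function."""
    z = (
        BIAS
        + WEIGHTS["url"] * scores.url
        + WEIGHTS["html"] * scores.html
        + WEIGHTS["dom"] * scores.dom
    )
    return 1.0 / (1.0 + math.exp(-z))


def is_phishing(scores: BaseScores, threshold: float = 0.5) -> bool:
    """Flag a page when the stacked probability reaches the threshold."""
    return stack(scores) >= threshold


if __name__ == "__main__":
    # A page whose URL evades the URL model but whose HTML and DOM look suspicious.
    evasive = BaseScores(url=0.05, html=0.9, dom=0.9)
    print(is_phishing(evasive))
```

With these illustrative weights, a page that fools the URL model alone is still flagged because the other two models outvote it, which is the kind of robustness a hybrid design aims for.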