From Traditional Machine Learning to Large Language Models for Advancing Cybersecurity in Dynamic Environments
Abstract
The gap between machine learning (ML) research in cybersecurity and its practical application in industry presents a significant challenge. Although academic models often demonstrate impressive results under controlled conditions, they frequently underperform in dynamic industrial environments where continuous changes and evolving threats complicate their effectiveness. This dissertation aims to bridge this gap by systematically assessing the limitations of existing ML-based cybersecurity models, specifically in scenarios where environmental dynamism demands frequent model adaptation, and by leveraging traditional and advanced ML techniques to address these challenges.
The dissertation begins by evaluating the limitations of traditional ML approaches in several key areas of cybersecurity and proposing effective solutions. First, in the context of malware detection, we investigate how adversarial examples can significantly degrade model performance. Second, we examine user authentication systems based on typing patterns, highlighting how changes in behavior over time can undermine these systems. Third, we focus on IoT device identification, where the frequent addition of new devices after training introduces significant classification challenges. Finally, we extend the investigation to Quishing, a recent evolution of phishing that leverages QR codes to deceive users, and propose a robust ML-based detection framework to identify and mitigate these threats. In each of these cases, we propose novel methods to mitigate the impact of these dynamic factors, improving model robustness and reducing, wherever possible, the need for continuous retraining, thereby enhancing the reliability of these models in industrial settings.
With the emergence of large language models (LLMs), which are designed primarily for processing and generating text, and large multimodal models (LMMs), which can handle multiple modalities such as text and visual data, new opportunities arise to overcome the limitations of traditional ML models by reducing the burden of retraining and maintaining local models. This dissertation explores the application of such models to develop adaptive cybersecurity solutions capable of responding to changing environments without the need for extensive training and model maintenance.
We first compare the effectiveness of prompt engineering, where prompts are adjusted to elicit desired outputs from a model, with fine-tuning, where model weights are adjusted to tailor the base model to specific tasks. This comparison is conducted across text-based tasks, such as phishing URL detection, and image-based tasks, such as trigger detection and malware classification. In both cases, while fine-tuning consistently yielded better performance by directly optimizing the model for each specific task, it also required significantly more resources and time. Conversely, prompt engineering, though less resource-intensive, offered adaptable solutions that could be deployed quickly, but often at some cost in accuracy and with recurring API expenses.
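As an illustration of the prompt-engineering side of this comparison, a zero-shot classification prompt for the phishing-URL task might be constructed as below; the instruction wording and the two-label scheme are illustrative assumptions, not the exact templates used in the dissertation.

```python
def build_phishing_prompt(url: str) -> str:
    """Build a zero-shot classification prompt for a candidate URL.

    The instruction text and label set here are hypothetical; the
    returned string could be sent to any text-capable LLM API.
    """
    return (
        "You are a security analyst. Classify the following URL as "
        "'phishing' or 'benign'. Reply with a single word.\n"
        f"URL: {url}"
    )

print(build_phishing_prompt("http://example.com/login"))
```

Adjusting only this string, rather than model weights, is what makes prompt engineering fast to deploy and adapt, at the price of less task-specific optimization.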
As such, we propose strategies that enhance the effectiveness of prompt-engineered large models while requiring no or minimal training. Specifically, we explore ensemble methods based on majority voting and stacking to improve the performance of prompt-engineered models. Furthermore, we evaluate the relationship between manual and automated prompting techniques, demonstrating that while automated methods can accelerate deployment and improve consistency, manual refinement remains necessary to achieve optimal task-specific performance. In addition, we investigate advanced techniques such as Retrieval-Augmented Generation (RAG) to enrich prompts with relevant external context. Finally, we introduce agentic approaches that not only elevate the performance of such models but also reduce API costs.
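The majority-voting ensemble mentioned above can be sketched as follows; the verdict labels are hypothetical placeholders for the outputs of independently prompt-engineered models, and the tie-breaking policy is an assumption for the sketch, not the dissertation's exact design.

```python
from collections import Counter

def majority_vote(predictions: list[str]) -> str:
    """Return the label predicted by the most models.

    Ties resolve in favor of the label seen first (Counter preserves
    insertion order); a production ensemble might weight models instead.
    """
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical verdicts from three prompt-engineered models on one sample
votes = ["phishing", "benign", "phishing"]
print(majority_vote(votes))  # phishing
```

Because each base model is steered only by its prompt, such an ensemble improves robustness without any retraining of the underlying models.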
By integrating these advanced techniques, this dissertation aims to deliver practical, cost-effective solutions that enhance the resilience and reliability of cybersecurity systems in industrial settings, ultimately narrowing the gap between ML research and its application in the real world.
Description
Dissertation. Ph.D. American University of Beirut. Department of Electrical and Computer Engineering, 2025.