A Comprehensive Framework for Software Vulnerability Prediction Using Large Language Models (Master's Thesis)

Shehada, Wala'a Yusuf Ata$AAUP$Palestinian

Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/3443

Title:	A Comprehensive Framework for Software Vulnerability Prediction Using Large Language Models (Master's Thesis)
Other Titles:	A Comprehensive Framework for Intrusion Detection Using Machine Learning Algorithms.
Authors:	Shehada, Wala'a Yusuf Ata$AAUP$Palestinian
Keywords:	Software Vulnerability, Large Language Models, Network Intrusion, Machine Learning
Issue Date:	2025
Publisher:	AAUP
Abstract:	Numerous threats have emerged in the digital age, most notably software vulnerabilities and network intrusions. These threats can result in significant financial losses, data breaches, and system disruptions across various industries. Because cyberattacks are complex and constantly evolving, addressing them requires new and advanced mechanisms that detect or prevent them before they occur. LLMs such as GPT, LLaMA, and BERT were evaluated for their effectiveness in classifying software code as vulnerable or non-vulnerable; for intrusion detection, classical machine learning algorithms were applied to the NSL-KDD dataset. This thesis contributes to cybersecurity by examining how accurately large language models and machine learning models detect software vulnerabilities and intrusions. It also suggests directions for improving the results and expanding the study. The study focused on two main goals: predicting software vulnerabilities using large language models and detecting network intrusions using machine learning models. To predict software vulnerabilities, we used large language models (GPT, LLaMA, and BERT) to analyze software code and classify it as vulnerable or non-vulnerable. Test sets of three different sizes (1,000, 5,000, and 20,000 records) were extracted from the DiverseVul dataset. The performance of GPT and LLaMA was compared in the zero-shot and few-shot settings, and GPT outperformed LLaMA in all cases. When CodeBERT-5000 and CodeBERT-1000 were compared with GPT and LLaMA, CodeBERT-5000 achieved the best results, with an accuracy of 79.2%, which is considered suitable for a task as complex as software vulnerability prediction.
The main contribution of this study is a set of three experiments that examine the performance of the models and provide deeper insight into their efficiency and reliability: (1) a consistency check that verifies the stability of the models by repeating the experiment several times; (2) an analysis of the ability of the GPT and LLaMA models to explain their predictions and the logic behind them; and (3) a measurement of each model's latency when predicting 1,000 records. These experiments analyze the models from multiple aspects, which strengthens the reliability of the results. In the second part, on network intrusion detection, machine learning algorithms (Logistic Regression, Random Forest, Support Vector Machine, and Decision Tree) were applied to the NSL-KDD data to categorize network traffic as normal or abnormal. The results showed the best performance for the Decision Tree model, which obtained an accuracy of 100%, followed by the Random Forest model at 99.8%; both outperformed the Support Vector Machine and Logistic Regression models.
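The accuracy, consistency, and latency measurements described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code: `toy_predict` is a hypothetical stand-in for a call to GPT, LLaMA, or CodeBERT, and the four C snippets and their labels are invented for illustration rather than taken from DiverseVul.

```python
import time
from typing import Callable, Sequence

Label = int  # 1 = vulnerable, 0 = non-vulnerable


def toy_predict(code: str) -> Label:
    """Hypothetical stand-in for a model call; flags code that calls strcpy."""
    return 1 if "strcpy(" in code else 0


def accuracy(predict: Callable[[str], Label],
             samples: Sequence[str], labels: Sequence[Label]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for s, y in zip(samples, labels) if predict(s) == y)
    return correct / len(labels)


def consistency_rate(predict: Callable[[str], Label],
                     samples: Sequence[str], runs: int = 5) -> float:
    """Fraction of samples that receive the same label in every repeated run."""
    per_run = [[predict(s) for s in samples] for _ in range(runs)]
    stable = sum(1 for votes in zip(*per_run) if len(set(votes)) == 1)
    return stable / len(samples)


def mean_latency_seconds(predict: Callable[[str], Label],
                         samples: Sequence[str]) -> float:
    """Average wall-clock time per prediction over the given samples."""
    start = time.perf_counter()
    for s in samples:
        predict(s)
    return (time.perf_counter() - start) / len(samples)


# Invented examples (assumption): not records from DiverseVul.
SAMPLES = [
    "void f(char *s) { char b[8]; strcpy(b, s); }",
    "int add(int a, int b) { return a + b; }",
    "void g(char *s) { char b[8]; strncpy(b, s, sizeof b - 1); b[7] = 0; }",
    'void h(char *s) { char b[4]; sprintf(b, "%s", s); }',
]
LABELS = [1, 0, 0, 1]

if __name__ == "__main__":
    print("accuracy:", accuracy(toy_predict, SAMPLES, LABELS))
    print("consistency:", consistency_rate(toy_predict, SAMPLES, runs=3))
    print("mean latency (s):", mean_latency_seconds(toy_predict, SAMPLES))
```

Because the stub is deterministic, its consistency rate is always 1.0. A real LLM queried with a nonzero sampling temperature may return different labels across runs, which is the situation a repeated-run check is meant to expose.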
Description:	Master \ Computer Science
URI:	http://repository.aaup.edu/jspui/handle/123456789/3443
Appears in Collections:	Master Theses and Ph.D. Dissertations

Files in This Item:

File	Description	Size	Format
ولاء شحادة.pdf		4.09 MB	Adobe PDF

ARAB AMERICAN UNIVERSITY Repository