Building and Evaluating Phishing Detection Systems with Machine Learning and Deep Learning (Master's Thesis)

Kmail, Tahany Sadeq (AAUP, Palestinian)

Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/3799

Title:	Building and Evaluating Phishing Detection Systems with Machine Learning and Deep Learning (Master's Thesis)
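Other Titles:	بناء و تقييم أنظمة الكشف عن التصيد الاحتيالي باستخدام التعلم الالي و التعلم العميق (Arabic: "Building and Evaluating Phishing Detection Systems Using Machine Learning and Deep Learning")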
Authors:	Kmail, Tahany Sadeq (AAUP, Palestinian)
Keywords:	Phishing Detection, Machine Learning, Deep Learning, Ensemble Models, and Datasets
Issue Date:	2026
Publisher:	AAUP
Abstract:	Phishing attacks remain a significant cybersecurity threat. They enable attackers to compromise users through social engineering, deceptive URLs, and content obfuscation that bypass many rule-based defenses. Because phishing techniques are not fixed and new attack patterns keep emerging, adaptive detection strategies are needed against a wide range of context-dependent threats. This study assesses and compares the performance of several Machine Learning (ML) and Deep Learning (DL) methods for phishing detection, including cross-dataset evaluation from global to local data. The experiments were carried out in Palestine; data collection and experimental evaluation used real internet browsing traces from local institutional domains together with published global phishing datasets. An experimental quantitative approach was followed, implementing and assessing several detection methods: Convolutional Neural Networks (CNN), Bidirectional Long Short-Term Memory networks (BiLSTM), transformer-based models such as Distilled Bidirectional Encoder Representations from Transformers (DistilBERT), and advanced ensemble classifiers. The dataset consisted of phishing and legitimate URLs: 11,430 global samples and more than 6,000 locally collected instances. Features were URL-based, HTML-based, and domain-based, with preprocessing and feature engineering applied to improve data quality. For a fair comparison, all models were evaluated using common classification metrics. Detection performance differed clearly across methods: clustering methods showed low discriminative power, CNN models served as a good baseline, and BiLSTM and DistilBERT improved the results by modelling sequential and contextual patterns. Ensembles gave the best stability and consistency, with one ensemble attaining .0966 accuracy on the local Palestinian dataset. An important contribution of this work is a new, evaluated, localized phishing dataset that supports realistic evaluation in a regionally bounded context. The proposed detection pipeline achieved stable performance on both the local and global datasets, demonstrating its ability to generalize. The study therefore recommends ensemble-based detection mechanisms, the inclusion of localized datasets in training, and a focus on model adaptability and interpretability to improve phishing detection systems in real-life settings.
Description:	Master \ Cyber Security
URI:	http://repository.aaup.edu/jspui/handle/123456789/3799
Appears in Collections:	Master Theses and Ph.D. Dissertations

Files in This Item:

File	Description	Size	Format
تهاني كميل.pdf		2.22 MB	Adobe PDF

ARAB AMERICAN UNIVERSITY Repository