Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/3780
Full metadata record
DC Field | Value | Language
dc.contributor.author | Alawneh, Hussam Fawzi Abed$AAUP$Palestinian | -
dc.date.accessioned | 2026-02-23T11:39:19Z | -
dc.date.available | 2026-02-23T11:39:19Z | -
dc.date.issued | 2025 | -
dc.identifier.uri | http://repository.aaup.edu/jspui/handle/123456789/3780 | -
dc.description | Master \ Data Science and Business Analytics | en_US
dc.description.abstract | Social media users often express emotions, ideas, and thoughts through text in posts and tweets, which can be used to determine the text’s polarity as positive or negative, a process known as sentiment analysis. Sentiment analysis has become critical for various real-world domains, including politics, tourism, e-commerce, education, and health. However, although sentiment analysis approaches perform well with English text, they face notable drawbacks when dealing with Arabic text. The morphological complexity inherent in the Arabic language poses challenges for building robust models, making it difficult to understand public sentiment and subsequently make informed decisions. In response to these challenges, effective data preprocessing and deep learning techniques are employed to overcome the complexity of the Arabic language and provide insightful sentiment predictions. This thesis evaluates a combined Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) framework with different data preprocessing techniques for Arabic Sentiment Analysis (ASA) using the Arabic Sentiment Twitter Corpus (ASTC) dataset. Three experiments with eight distinct preprocessing configurations were conducted to evaluate the effect of data preprocessing on Arabic sentiment analysis, namely the effect of encoding and translating emojis to their real and emotional meanings. Emoji meanings were collected from four websites specialized in defining the meaning of emojis in social media, which resulted in a new dataset of emoji meanings called the “Emoji Meaning” dataset. Furthermore, the CNN-LSTM parameters were optimized using the Keras Tuner during the 5-fold cross-validation process. The proposed model with emoji translation into Arabic text obtained the highest accuracy (91.85%) by keeping non-Arabic words, removing punctuation, using the Snowball stemmer, and using Keras embedding. This approach yields competitive results compared to other state-of-the-art approaches, proves that emoji encoding enriches text by accurately reflecting emotions, and investigates the effect of data preprocessing on model performance. This allows the hybrid model to achieve results comparable to other studies that use the same ASTC dataset, thereby improving sentiment analysis accuracy. | en_US
dc.publisher | AAUP | en_US
dc.subject | Social media, models, Memory (CNN-LSTM), model performance, Hybrid models, ASTC Data, Data Science, Business | en_US
dc.title | A Hybrid CNN-LSTM Framework for Enhanced Arabic Sentiment Analysis: Investigating Emoji Encoding and Preprocessing Strategies (Master's Thesis) | en_US
dc.title.alternative | إطار هجين قائم على الشبكات العصبية الالتفافية و شبكات الذاكرة طويلة المدى قصيرة الاجل لتعزيز تحليل المشاعر في اللغة العربية: دراسة استراتيجيات ترميز الرموز التعبيرية و المعالجة المسبقة. | en_US
dc.type | Thesis | en_US
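
As a rough illustration of the emoji translation step described in the abstract, the sketch below replaces emojis with Arabic meaning words and removes punctuation while keeping non-Arabic words. The three-entry mapping and all function names are hypothetical; the thesis's “Emoji Meaning” dataset and its actual preprocessing code are not part of this record. Stemming and embedding are left out.

```python
import unicodedata

# Hypothetical emoji-to-meaning table for illustration only; the thesis's
# "Emoji Meaning" dataset is not reproduced in this record.
EMOJI_MEANINGS = {
    "😂": "ضحك",  # laughter
    "😢": "حزن",  # sadness
    "😍": "حب",   # love
}


def translate_emojis(text, meanings=EMOJI_MEANINGS):
    """Replace each known emoji with its Arabic meaning, padded by spaces."""
    for emoji, meaning in meanings.items():
        text = text.replace(emoji, f" {meaning} ")
    return text


def remove_punctuation(text):
    """Replace Unicode punctuation (categories P*) with spaces.

    Letters, digits, diacritics, and unknown emojis are left untouched.
    """
    return "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )


def preprocess(text):
    """Translate emojis, drop punctuation, and collapse whitespace."""
    text = translate_emojis(text)
    text = remove_punctuation(text)
    return " ".join(text.split())


if __name__ == "__main__":
    print(preprocess("الفيلم رائع 😂!"))
```

Non-Arabic words pass through unchanged, matching the "keeping non-Arabic words" setting reported for the best configuration; a stemmer and an embedding layer would follow this step in a full pipeline.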
Appears in Collections: Master Theses and Ph.D. Dissertations

Files in This Item:
File | Description | Size | Format
حازم علاونة.pdf | | 2.12 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
