Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/3127
Full metadata record
DC Field | Value | Language
dc.contributor.author | Mesqali, enas$AAUP$Palestinian | -
dc.date.accessioned | 2025-02-05T07:44:25Z | -
dc.date.available | 2025-02-05T07:44:25Z | -
dc.date.issued | 2024 | -
dc.identifier.uri | http://repository.aaup.edu/jspui/handle/123456789/3127 | -
dc.description | Master's degree in Computer Science | en_US
dc.description.abstract | Recent advancements in Natural Language Processing (NLP) technologies have significantly enhanced the capabilities of processing, analyzing, and understanding sentiments expressed in user-generated reviews across various products and services. This surge of interest in sentiment analysis has spurred considerable research efforts. In this study, we explore sentiment analysis with a specific focus on the Arabic language. Leveraging both traditional pre-processing techniques and machine learning algorithms, we propose a comprehensive sentiment analysis model consisting of four stages. The primary objective of our model is to harness English language resources and techniques to gauge their impact on classifier accuracy when applied to Arabic sentences. Through a series of experiments conducted on Arabic datasets and their English translations, we assess the effectiveness of various pre-processing methods and machine learning classifiers: Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), and Support Vector Machines (SVM). Notably, the SVM classifier consistently outperformed the others, exhibiting the highest accuracy across most scenarios, especially when lemmatization and stemming were combined. Furthermore, we explore the influence of translating datasets and incorporating synonyms on sentiment analysis accuracy. While the translation of datasets from Arabic to English and vice versa did not yield significant changes in accuracy, the inclusion of synonyms from English datasets in Arabic sentiment analysis experiments produced mixed results. This underscores the intricacies of language-specific nuances and the challenges in effectively capturing sentiment across different languages. When comparing our study with previous research that used the ASTD dataset, several key differences and similarities emerge. Previous studies explored a range of classifiers, including SVM, NB, LR, CNN, and RNTN, with accuracy results varying between 85% and 90% for traditional features like n-grams and TF-IDF and for word embeddings like Word2Vec. However, the RNTN algorithm showed a lower accuracy rate of 58.5%, and the SVM algorithm achieved 51.7%. Other research focused on deep learning models like CNN and LSTM, which yielded accuracy rates of 64.3% and 64.75%, respectively. In contrast, our study highlighted the importance of specific pre-processing techniques, demonstrating that methods such as lemmatization and stemming could significantly enhance the performance of machine learning classifiers like SVM, achieving accuracy results of up to 80%. Overall, our study showcases the evolving landscape of sentiment analysis research, highlighting the adaptability of techniques to address language-specific challenges and nuances. These findings contribute to the broader understanding of sentiment analysis methodologies and underscore the importance of considering linguistic differences in sentiment analysis tasks. Finally, recommendations for future research include expanding the Arabic dataset and exploring advanced deep learning models to capture more complex patterns. Additionally, refining linguistic tools specific to Arabic could further enhance sentiment analysis accuracy. These steps aim to better address the intricacies of language-specific challenges and contribute to more effective sentiment analysis methodologies. | en_US
dc.publisher | AAUP | en_US
dc.subject | Data Collection, Computer Science, Classification techniques | en_US
dc.title | On the Combination of NLP and Extrinsic Semantic Resources for Developing an Arabic-English Sentiment Analyzer رسالة ماجستير | en_US
dc.title.alternative | نحو دمج تقنيات معالجة اللغة الطبيعية ومصادر دلالات المعاني الخارجية لبناء نظام تحليل الآراء. | en_US
dc.type | Thesis | en_US
Appears in Collections: Master Theses and Ph.D. Dissertations

Files in This Item:
File | Description | Size | Format
ايناس مسقلة.pdf | | 2.34 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
