Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/2439
Title: Lexicon-Based Sentiment Analysis for Arabic Slang Text (Master's Thesis)
Authors: Rantisi, Khalil Edward Khalil (AAUP; Palestinian)
Keywords: data preprocessing, filter stop words, software, data preparation, sentiment dataset
Issue Date: 2021
Publisher: AAUP
Abstract: The rapid spread of social media generates a massive amount of data every day. Understanding and mining this data to determine users' attitudes towards products, services, events, and other topics is becoming increasingly beneficial for individuals as well as stakeholders. This kind of text mining requires high-level language processing, known as Natural Language Processing (NLP). The topic is receiving increasing attention from all stakeholders, including machine learning and artificial intelligence specialists, the business community, and language specialists. However, text mining and information extraction for Arabic content still require extra effort to reach the level achieved for other languages such as English. Arabic is one of the popular languages for sharing content on social networks, yet the analysis of content written in Arabic faces various challenges, especially in the case of colloquial (slang) Arabic, which is the form most widely used on social media. This thesis explores approaches to enhance sentiment analysis of social media content written in Arabic, with a focus on Palestinian colloquial Arabic. The proposed approach uses machine learning algorithms and lexicon tools to enhance the output of sentiment analysis performed on Arabic social media content. The approach consists of two phases. The first phase is handled by machine learning using three classifiers: Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Naïve Bayes (NB). The second phase is handled by a lexicon-based method using two classifiers, SVM and NB, and is implemented to enhance the result of the first phase; the output of the first phase is used to train the SVM and NB classifiers.
The proposed methods are tested using a customized dataset extracted from the Facebook pages of some public service providers in Palestine; the dataset consists of thousands of comments and posts on various topics. The results show that the lexicon-based approach improved the accuracy of comment polarity detection: accuracy increased from 90.57% to 90.68% using the proposed approach, while the F-measure increased from 94.50% to 94.58%. The results also indicate that SVM was the best ML algorithm compared to NB and KNN for this research problem.
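As a rough illustration of the two ideas the abstract relies on, lexicon-based polarity scoring and evaluation by accuracy and F-measure, the following minimal Python sketch may help. It is not the thesis's implementation: the four-word lexicon, the sample comments, the gold labels, the tie-breaking rule (a score of zero counts as positive), and the function names `lexicon_polarity`, `accuracy`, and `f_measure` are all assumptions made for this example, and the sketch does not reproduce the SVM, KNN, or NB classifiers.

```python
# Hypothetical toy lexicon: word -> polarity weight.
# The lexicon actually used in the thesis is not given in the abstract.
LEXICON = {
    "ممتاز": 1,   # "excellent"
    "شكرا": 1,    # "thanks"
    "سيء": -1,    # "bad"
    "بطيء": -1,   # "slow"
}


def lexicon_polarity(comment, lexicon=LEXICON):
    """Sum the weights of known words; a score of zero defaults to positive (assumption)."""
    score = sum(lexicon.get(token, 0) for token in comment.split())
    return "positive" if score >= 0 else "negative"


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f_measure(y_true, y_pred, positive="positive"):
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented sample comments with invented gold labels (illustration only).
comments = [
    "الخدمة ممتاز شكرا",
    "الانترنت بطيء",
    "سيء جدا",
    "شكرا",
    "ما في كهربا",  # no lexicon word, so the scorer falls back to "positive"
]
gold = ["positive", "negative", "negative", "positive", "negative"]

pred = [lexicon_polarity(c) for c in comments]
print(f"accuracy = {accuracy(gold, pred):.2f}")
print(f"F-measure = {f_measure(gold, pred):.2f}")
```

The last sample comment contains no lexicon word, which shows the main weakness of a purely lexicon-based scorer on colloquial text: anything outside the word list contributes nothing to the score.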
Description: Master's Degree in Computer Science.
URI: http://repository.aaup.edu/jspui/handle/123456789/2439
Appears in Collections: Master Theses and Ph.D. Dissertations

Files in This Item:
File: خليل رنتيسي.pdf
Size: 2.25 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
