Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/1868
Full metadata record
DC Field | Value | Language
dc.contributor.author | Maree, Mohammed$AAUP$Palestinian | -
dc.contributor.author | Eleyat, Mujahed$AAUP$Palestinian | -
dc.contributor.author | Mesqali, Enas$AAUP$Palestinian | -
dc.date.accessioned | 2024-07-22T07:28:06Z | -
dc.date.available | 2024-07-22T07:28:06Z | -
dc.date.issued | 2024-03 | -
dc.identifier.citation | Mohammed Maree, Mujahed Eleyat, Enas Mesqali, "Optimizing Machine Learning-based Sentiment Analysis Accuracy in Bilingual Sentences via Preprocessing Techniques", The International Arab Journal of Information Technology (IAJIT), Volume 21, Number 02, pp. 257-270, March 2024, doi: 10.34028/21/2/8. | en_US
dc.identifier.issn | 1683-3198 | -
dc.identifier.uri | https://iajit.org/paper/4989/Optimizing-Machine-Learning-based-Sentiment-Analysis-Accuracy-in-Bilingual-Sentences-via-Preprocessing-Techniques | -
dc.identifier.uri | http://repository.aaup.edu/jspui/handle/123456789/1868 | -
dc.description.abstract | With the recent advances in Natural Language Processing (NLP) technologies, the ability to process, analyze, and understand sentiments expressed in user-generated reviews regarding the products and services they use is becoming more achievable. Despite the latest improvements in this field, little attention has been given to multilingual sentiment analysis. In this article, a framework is presented for sentiment analysis in Arabic and English using two datasets (ASTD, AJGT) along with their translations. Preprocessing techniques, including n-gram tokenization, Arabic-specific stop words removal, punctuation removal, removing repeating characters, parts of speech tagging, stemming, and lemmatization, are applied. Four machine learning classifiers, namely Logistic Regression (LR), Random Forest (RF), Naive Bayes (NB), and Support Vector Machine (SVM), are employed. We highlight existing specialized research in sentiment analysis for Arabic and English, as well as the employed techniques in each. Furthermore, the impact of preprocessing on accuracy results for both Arabic and English languages is investigated through separate experiments for each step. Experimental results on the ASTD dataset demonstrate close performance across classifiers, with the SVM classifier achieving the highest accuracy of 70%. However, the accuracy varied when using the AJGT dataset, with the NB classifier yielding the best accuracy at approximately 87%. The experiments on the translated datasets from Arabic to English did not exhibit significant differences, although some features performed slightly better using the Arabic datasets. | en_US
dc.language.iso | en_US | en_US
dc.publisher | Zarqa University | en_US
dc.relation.ispartofseries | The International Arab Journal of Information Technology (IAJIT);Volume 21, Number 02 | -
dc.subject | Machine learning | en_US
dc.subject | Bilingual sentiment analysis | en_US
dc.subject | NLP | en_US
dc.subject | Sentiment datasets | en_US
dc.title | Optimizing Machine Learning-based Sentiment Analysis Accuracy in Bilingual Sentences via Preprocessing Techniques | en_US
dc.type | Article | en_US
Appears in Collections: Faculty & Staff Scientific Research publications
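
Four of the preprocessing steps named in the abstract (punctuation removal, removing repeating characters, stop-word removal, n-gram tokenization) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the tiny stop-word set, and the rule of collapsing runs of three or more identical characters to one are all assumptions made for illustration.

```python
import re
import unicodedata
from typing import List

# Hypothetical, deliberately tiny stop-word set (assumption); a real
# experiment would load full Arabic and English stop-word lists.
STOP_WORDS = {"the", "is", "a", "في", "من", "على"}


def remove_punctuation(text: str) -> str:
    """Replace every Unicode punctuation character (category P*) with a space."""
    return "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )


def collapse_repeats(text: str) -> str:
    """Collapse runs of three or more identical characters to one (assumed rule)."""
    return re.sub(r"(.)\1{2,}", r"\1", text)


def preprocess(text: str) -> List[str]:
    """Lowercase, strip punctuation, collapse repeats, then drop stop words."""
    cleaned = collapse_repeats(remove_punctuation(text.lower()))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return the word-level n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = preprocess("The service is soooo greeeat!!!")
print(tokens)             # unigram features
print(ngrams(tokens, 2))  # bigram features
```

The order of the steps here is only one possibility. The abstract states that the effect of each preprocessing step was measured in separate experiments, so a faithful reproduction would toggle the steps individually rather than chain them as above.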
