A sentiment treebank and morphologically enriched recursive deep models for effective sentiment analysis in Arabic

dc.contributor.author: Baly, Ramy
dc.contributor.author: Hajj, Hazem M.
dc.contributor.author: Habash, Nizar Y.
dc.contributor.author: Shaban, Khaled Bashir
dc.contributor.author: El-Hajj, Wassim
dc.contributor.department: Department of Electrical and Computer Engineering
dc.contributor.department: Department of Computer Science
dc.contributor.faculty: Maroun Semaan Faculty of Engineering and Architecture (MSFEA)
dc.contributor.faculty: Faculty of Arts and Sciences (FAS)
dc.contributor.institution: American University of Beirut
dc.date.accessioned: 2025-01-24T11:29:23Z
dc.date.available: 2025-01-24T11:29:23Z
dc.date.issued: 2017
dc.description.abstract: Accurate sentiment analysis models encode the sentiment of words and their combinations to predict the overall sentiment of a sentence. This task becomes challenging when applied to morphologically rich languages (MRLs). In this article, we evaluate the use of deep learning advances, namely Recursive Neural Tensor Networks (RNTN), for sentiment analysis in Arabic as a case study of MRLs. While Arabic may not be representative of all MRLs, the challenges faced and the solutions proposed for Arabic are common to many other MRLs. We identify, illustrate, and address MRL-related challenges and show how RNTN is affected by the morphological richness and orthographic ambiguity of the Arabic language. To address the challenges of extracting sentiment from text in MRLs, we propose to explore different orthographic features as well as different morphological features at multiple levels of abstraction, ranging from raw words to roots. A key requirement for RNTN is the availability of a sentiment treebank: a collection of syntactic parse trees annotated for sentiment at all levels of constituency, which currently exists only for English. Therefore, our contribution also includes the creation of the first Arabic Sentiment Treebank (ARSENTB), which is morphologically and orthographically enriched. Experimental results show that, compared to the basic RNTN proposed for English, our solution achieves significant improvements of up to 8% absolute at the phrase level and 10.8% absolute at the sentence level, measured by average F1 score. It also outperforms well-known classifiers, including Support Vector Machines, Recursive Auto Encoders, and Long Short-Term Memory, by 7.6%, 3.2%, and 1.6% absolute, respectively, with all models trained with similar morphological considerations. © 2017 ACM
dc.identifier.doi: https://doi.org/10.1145/3086576
dc.identifier.eid: 2-s2.0-85026660253
dc.identifier.uri: http://hdl.handle.net/10938/27202
dc.language.iso: en
dc.publisher: Association for Computing Machinery
dc.relation.ispartof: ACM Transactions on Asian and Low-Resource Language Information Processing
dc.source: Scopus
dc.subject: Forestry
dc.subject: Long short-term memory
dc.subject: Arabic languages
dc.subject: Auto encoders
dc.subject: F1 scores
dc.subject: Morphological features
dc.subject: Multiple levels
dc.subject: Sentence level
dc.subject: Sentiment analysis
dc.subject: Syntactic parse tree
dc.subject: Data mining
dc.title: A sentiment treebank and morphologically enriched recursive deep models for effective sentiment analysis in Arabic
dc.type: Article

Files

Original bundle

Name: 2017-8860.pdf
Size: 2.03 MB
Format: Adobe Portable Document Format