Information Extraction from Arabic Social Media Content

Dankar, Rayan

URI: http://hdl.handle.net/10938/22111

Date: 9/23/2020

Abstract:

Stakeholders such as advertisers, celebrities, and politicians show strong interest in extracting information from social media content, which includes posts and interactions written in local dialects. Arabic and its dialects are among the most widely used languages in the world. Modern Standard Arabic (MSA) dominates formal platforms and events such as newscasts, speeches, books, and newspapers, while Arabic dialects dominate everyday communication and are increasingly used on social media platforms. Information extraction techniques focus on extracting entities and relational entities from unstructured text; they offer limited support for MSA and lack support for Arabic dialects. In this thesis, we construct the resources necessary for information extraction from Arabic social media content. We introduce a method that retrieves relevant information by extracting entities and relational entities from MSA and dialectal Arabic text. To improve entity and relational entity extraction, we construct ADAT, an Arabic Dialect Annotation Tool. ADAT serves as a building block for constructing computational linguistic models from Arabic text and requires basic linguistic knowledge from its users. We evaluate our entity and relational entity extraction on a corpus concerning Yemen and report its performance using precision and recall. Finally, we use graph construction and analysis techniques to draw insights from the extracted entities and relational entities.
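
The evaluation and graph steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the entity labels, the relation labels, and all sample data below are hypothetical and chosen only to show how precision and recall are computed over extracted entities and how relation triples can be turned into a graph.

```python
from collections import defaultdict


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted items against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def build_entity_graph(relations):
    """Build an undirected adjacency map from (head, label, tail) triples."""
    graph = defaultdict(set)
    for head, _label, tail in relations:
        graph[head].add(tail)
        graph[tail].add(head)
    return graph


# Hypothetical gold and predicted entities as (surface form, type) pairs.
gold_entities = {
    ("Sanaa", "LOC"), ("Aden", "LOC"), ("Org_A", "ORG"), ("Person_A", "PER"),
}
predicted_entities = {("Sanaa", "LOC"), ("Aden", "LOC"), ("Org_A", "PER")}

precision, recall = precision_recall(predicted_entities, gold_entities)

# Hypothetical relational entities as (head, relation label, tail) triples.
relations = [
    ("Person_A", "located_in", "Aden"),
    ("Org_A", "operates_in", "Sanaa"),
    ("Org_A", "operates_in", "Aden"),
]
graph = build_entity_graph(relations)

# Node degree is one simple graph statistic for spotting central entities.
degrees = {node: len(neighbors) for node, neighbors in graph.items()}
print(f"precision={precision:.2f} recall={recall:.2f}")
print(degrees)
```

In this sketch an entity counts as correct only when both its surface form and its type match the gold annotation; a different matching criterion (for example, partial span overlap) would change the scores.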

Advisor(s):

Zaraket, Fadi

Files in this item

Name: Thesis_Signed.pdf

Size: 9.034Mb

Format: PDF

This item appears in the following Collection(s)

AUB Students' Theses, Dissertations, and Projects [12709]

Copyright Statement

All materials included in the institutional repository are protected by copyright laws and are the property of their respective copyright holders. Materials may be used for non-commercial, educational, or research purposes only, and must be cited or attributed to the original source. Permission for any other use must be obtained from the copyright holder(s) directly. The American University of Beirut Libraries does not assume responsibility for any infringement of copyright laws that may occur as a result of the use of materials in the repository. If you believe that your copyright has been infringed upon in the repository, please contact the AUB Libraries immediately.

For further information, please contact us at scholarworks@aub.edu.lb
