AraDialAug: A Meta-Learning and Data Augmentation Approach for Arabic Dialogue Generation

El Halabi, Assaad

URI: http://hdl.handle.net/10938/24319

Date: 2024-02-07

Abstract:

Arabic dialogue generation presents unique challenges due to the language's rich morphology and the scarcity of data resources. Recent work has employed meta-learning to help language models adapt quickly to low-resource domains. This thesis builds on that groundwork by introducing paraphrase data augmentation to further improve the generalization and adaptation capabilities of pre-trained models in Arabic Natural Language Generation (NLG). We propose an enhanced approach that combines a fine-tuned AraT5 model with meta-learning via the Reptile algorithm. We augment both the contexts and the responses in the auxiliary and target datasets, applying paraphrase augmentation to 10% and 30% of the seed data and examining the resulting impact on model performance. Our experiments demonstrate significant improvements in dialogue generation quality even with limited data, as evidenced in intrinsic evaluation by higher BLEU-4 and Semantic Textual Similarity (STS) scores. These results surpass those of the state-of-the-art methods described in prior work. The qualitative extrinsic evaluations reinforce the quantitative metrics, indicating a noticeable improvement in the fluency and relevance of the generated responses. Our findings suggest that paraphrase data augmentation, when used judiciously within a meta-learning framework, can be a powerful tool for advancing Arabic conversational AI, particularly in low-resource scenarios.
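The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the paraphrase function is a hypothetical placeholder for whatever paraphrasing model is used, the names augment_fraction and reptile_step are invented for illustration, and a toy gradient on a list of floats stands in for fine-tuning AraT5. The first function adds paraphrased copies of a chosen fraction (for example 0.1 or 0.3) of the (context, response) pairs. The second performs one Reptile meta-update: a few gradient steps on a sampled task, then a move of the shared parameters part of the way toward the adapted ones.

```python
import random
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (context, response)


def augment_fraction(
    pairs: Sequence[Pair],
    fraction: float,
    paraphrase: Callable[[str], str],  # hypothetical paraphraser, supplied by the caller
    seed: int = 0,
) -> List[Pair]:
    """Return the original pairs plus paraphrased copies of a random subset."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)
    k = round(len(pairs) * fraction)
    chosen = rng.sample(range(len(pairs)), k)
    # Paraphrase both sides of each selected pair: context and response.
    extra = [(paraphrase(pairs[i][0]), paraphrase(pairs[i][1])) for i in chosen]
    return list(pairs) + extra


def reptile_step(
    theta: List[float],
    grad_fn: Callable[[List[float]], List[float]],  # gradient of one task's loss
    inner_steps: int = 5,
    inner_lr: float = 0.1,
    meta_lr: float = 0.5,
) -> List[float]:
    """One Reptile meta-update on a single sampled task."""
    # Inner loop: plain gradient descent on the task, starting from theta.
    phi = list(theta)
    for _ in range(inner_steps):
        grad = grad_fn(phi)
        phi = [p - inner_lr * g for p, g in zip(phi, grad)]
    # Outer update: interpolate theta toward the task-adapted parameters phi.
    return [t + meta_lr * (p - t) for t, p in zip(theta, phi)]


if __name__ == "__main__":
    seed_pairs = [(f"context {i}", f"response {i}") for i in range(10)]
    augmented = augment_fraction(seed_pairs, 0.3, lambda s: s + " (paraphrased)")
    print(len(seed_pairs), "->", len(augmented))

    # Toy task: minimise (x - 3)^2, whose gradient is 2 * (x - 3).
    theta = reptile_step([0.0], lambda p: [2.0 * (p[0] - 3.0)])
    print(theta)
```

In a real setting the inner loop would run over batches from one auxiliary dialogue dataset, and the outer update would be repeated across many sampled tasks before fine-tuning on the target data.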

Advisor(s):

El Hajj, Wassim

Files in this item

Name: ElHalabiAssaad_20 ...

Size: 1.762Mb

Format: PDF

This item appears in the following Collection(s)

AUB Students' Theses, Dissertations, and Projects [12709]

Copyright Statement

All materials included in the institutional repository are protected by copyright laws and are the property of their respective copyright holders. Materials may be used for non-commercial, educational, or research purposes only, and must be cited or attributed to the original source. Permission for any other use must be obtained from the copyright holder(s) directly. The American University of Beirut Libraries does not assume responsibility for any infringement of copyright laws that may occur as a result of the use of materials in the repository. If you believe that your copyright has been infringed upon in the repository, please contact the AUB Libraries immediately.

For further information, please contact us at scholarworks@aub.edu.lb
