Building a Comprehensive Large Arabic Fact Checking Dataset Using Large Language Models

Abstract

Large-scale fact verification poses a significant challenge in Arabic natural language processing due to limited datasets and resources. This work introduces a new large-scale dataset for fact-checking in Modern Standard Arabic, constructed through an automated framework leveraging large language models (LLMs). We propose a three-step pipeline: (1) claim generation from Arabic Wikipedia articles with supporting evidence, (2) systematic claim mutation to create challenging counterfactual statements, and (3) rigorous verification and labeling. The resulting dataset comprises 180,000 claim-evidence pairs labeled as Supported, Refuted, or Not Enough Info. Human evaluation on our test sample shows strong inter-annotator agreement, with Cohen's kappa of κ = 0.89 for the Generation Task and κ = 0.94 for the Refutation Task, while our baseline models achieve 87% accuracy on the verification task relative to the expert annotator. Our approach employs specialized prompt engineering and grammatical rules to address Arabic-specific linguistic features. This provides the first large-scale benchmark for Arabic fact verification. Our methodology presents a scalable approach for developing similar resources for other low-resource languages. Through this work, we aim to advance the state of automated fact verification in Arabic and provide a foundation for future research in multilingual fact-checking.
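For readers unfamiliar with the agreement statistic reported in the abstract, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators over the three labels. It is not the authors' evaluation code: the function name `cohen_kappa` and the example label lists are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")

    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over labels of the product of marginal frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    all_labels = set(count_a) | set(count_b)
    p_e = sum(count_a[label] * count_b[label] for label in all_labels) / (n * n)

    if p_e == 1.0:
        # Both annotators used a single identical label; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations (illustrative only, not taken from the dataset).
annotator_1 = ["Supported", "Supported", "Refuted", "Refuted"]
annotator_2 = ["Supported", "Supported", "Refuted", "Supported"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

In this toy example the annotators agree on three of four items (p_o = 0.75) and chance agreement is p_e = 0.5, so kappa is 0.5. Values near 0.9, as reported in the abstract, indicate agreement far above chance.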
