Abstract:
Misinformation can undermine public trust and fact-checking efforts, and lead to misguided
actions based on unreliable sources. Traditional manual fact-checking systems suffer
from several challenges, including issues related to scaling, performance, and complexity.
In response to these challenges, we introduce FactFormer, an automatic fact-checking system
that retrieves evidence from trustworthy sources for a given claim and subsequently
classifies the claim into one of several labels based on the retrieved evidence. Our retrieval
model adopts the extractive question-answering technique. This approach treats claims as
questions and trusted sources as context from which evidence, construed as answers, is
retrieved. We fine-tuned the Bidirectional Encoder Representations from
Transformers (BERT) and Distilled Bidirectional Encoder Representations from Transformers
(DistilBERT) architectures specifically for the task of evidence
extraction. Subsequently, claim verification was accomplished using a multi-headed BERT
combined with a fully connected network layer. During the evaluation phase, our retrieval
models demonstrated state-of-the-art results: the BERT model yielded an exact match rate
of 89.89% and an F1-measure score of 93.93%, while the DistilBERT model achieved an
exact match rate of 90.19% and an F1-measure score of 93.98% when evaluated with a
maximum evidence length of 100 words. Our claim verification model achieved an
accuracy of 90% on the existing manually annotated Fact Extraction and VERification
(FEVER) dataset with three classes, outperforming other state-of-the-art approaches.
We further conducted end-to-end system experiments in which the claim verification model
used our retrieved evidence instead of manually annotated evidence, to assess how well the
system generalizes on the two-label FEVER-2 dataset. With evidence retrieved by DistilBERT,
the claim verification model achieved 87.14% accuracy on FEVER-2, outperforming the 86.54%
accuracy obtained with the manually annotated FEVER-2 evidence. In conclusion, our approach
significantly enhances fact-checking
by improving both evidence retrieval and claim classification.
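
The retrieval step and its two reported metrics can be illustrated with a minimal sketch. It assumes the question-answering model emits one start score and one end score per context token, and that exact match and F1 are computed in the usual SQuAD style (lowercasing, stripping punctuation and English articles, token overlap). The function names, the toy scores, and the normalization rules are illustrative assumptions, not the evaluation code used for FactFormer. The length cap here is applied to tokens, whereas the abstract states the limit in words.

```python
import re
import string
from collections import Counter


def best_span(start_scores, end_scores, max_len):
    """Pick the (start, end) token span with the highest start + end score.

    Only spans with start <= end and at most max_len tokens are considered.
    """
    best, best_score = (0, 0), float("-inf")
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s + end_scores[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Toy example: hypothetical per-token scores for a six-token context.
tokens = ["The", "Eiffel", "Tower", "is", "in", "Paris"]
start_scores = [0.1, 2.0, 0.3, 0.0, 0.2, 0.5]
end_scores = [0.0, 0.4, 1.5, 0.1, 0.0, 1.0]

start, end = best_span(start_scores, end_scores, max_len=3)
evidence = " ".join(tokens[start:end + 1])
print(evidence, exact_match(evidence, "the Eiffel Tower"))
```

In this sketch the claim plays the role of the question and the trusted source supplies the context tokens; the selected span is the retrieved evidence, which is then compared against the annotated evidence string.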