AUB ScholarWorks

Bridging the Semantic Gap: Tackling Contradictions in Semantic Similarity for Natural Language

dc.contributor.advisor Khreich, Wael
dc.contributor.author Abou Fares, Marian
dc.date.accessioned 2024-05-09T10:20:25Z
dc.date.available 2024-05-09T10:20:25Z
dc.date.issued 2024-05-09
dc.date.submitted 2024-05-02
dc.identifier.uri http://hdl.handle.net/10938/24430
dc.description.abstract Understanding and accurately processing semantic relations is key to advancing Natural Language Processing (NLP). One primary semantic relation is contradiction between sentences, which plays a crucial role in shaping the interpretation of other semantic relations and is essential for several NLP tasks, such as sarcasm and inconsistency detection. The ability to automatically detect contradictions is vital for identifying mutually exclusive statements, thereby recognizing the underlying irony in sarcastic expressions and ensuring logical coherence in textual data. Additionally, differentiating between various semantic relations can significantly enhance the precision of automated systems and virtual assistants in generating contradiction-free information. However, contradiction detection has often been overshadowed within the semantic field in favor of entailment and similarity tasks. Contradictory ideas can appear in diverse forms within sentences, making them challenging to identify. Our research addresses this gap by developing reliable models specifically tailored for contradiction detection. We employed extensive methodologies, including data restructuring, benchmarking, and fine-tuning, achieving an accuracy of 98% in classifying contradictions. Furthermore, we developed another model specialized in differentiating between the three semantic relations: contradiction, similarity, and dissimilarity; it achieved an accuracy of 97% in distinguishing contradicting from dissimilar pairs. Leveraging these models, we discovered historically overlooked contradictory pairs within the Semantic Textual Similarity (STS) benchmarks that were inaccurately labeled as similar or dissimilar and that represent about a quarter of the dataset. This mislabeling may bias how language models differentiate between contradicting, similar, and dissimilar pairs. Highlighting these neglected contradicting pairs provides insight into the impact of contradictions within the STS dataset on the corresponding models. These insights confirm that the presence of contradictions significantly affects the accuracy and effectiveness of STS models. This thesis contributes significantly to realizing the full potential of NLP in capturing the complexity of human communication, thereby enriching both academic discourse and practical applications in the digital age.
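
As a rough illustration of the fine-tuning step mentioned in the abstract, the sketch below trains a binary contradiction detector on sentence pairs with a pretrained transformer. The dataset (SNLI), base encoder (bert-base-uncased), and hyperparameters are illustrative assumptions, not the data or configuration used in the thesis.

```python
# Illustrative sketch only: binary contradiction detection by fine-tuning a
# pretrained transformer on SNLI-style sentence pairs. Dataset, base model,
# and hyperparameters are assumptions, not the thesis's actual setup.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-uncased"  # assumed base encoder

# SNLI labels: 0 = entailment, 1 = neutral, 2 = contradiction, -1 = unlabeled.
raw = load_dataset("snli")
raw = raw.filter(lambda ex: ex["label"] != -1)
# Recast to binary: 1 = contradiction, 0 = everything else.
raw = raw.map(lambda ex: {"label": int(ex["label"] == 2)})

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    # Encode each premise/hypothesis pair jointly so the model sees both sentences.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

encoded = raw.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

args = TrainingArguments(
    output_dir="contradiction-detector",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,
)

trainer.train()
print(trainer.evaluate())
```

The same recipe extends to the three-way setting described in the abstract (contradiction vs. similarity vs. dissimilarity) by setting num_labels=3 and supplying labels for all three relations.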
dc.language.iso en
dc.subject Natural Language Processing
dc.subject Machine Learning
dc.subject Semantic Textual Similarity
dc.subject Contradictions
dc.subject Embeddings
dc.subject Large Language Models
dc.title Bridging the Semantic Gap: Tackling Contradictions in Semantic Similarity for Natural Language
dc.type Thesis
dc.contributor.department Suliman S. Olayan School of Business
dc.contributor.faculty Suliman S. Olayan School of Business
dc.contributor.commembers Nasr, Walid
dc.contributor.commembers Taleb, Sirine
dc.contributor.degree MSBA
dc.contributor.AUBidnumber 202370135

