AUB ScholarWorks

Bridging the Semantic Gap: Tackling Contradictions in Semantic Similarity for Natural Language

dc.contributor.advisor Khreich, Wael
dc.contributor.author Abou Fares, Marian
dc.date.accessioned 2024-05-09T10:20:25Z
dc.date.available 2024-05-09T10:20:25Z
dc.date.issued 2024-05-09
dc.date.submitted 2024-05-02
dc.identifier.uri http://hdl.handle.net/10938/24430
dc.description.abstract Understanding and accurately processing semantic relations is key to advancing Natural Language Processing (NLP). One primary semantic relation is contradiction between sentences, which plays a crucial role in shaping the interpretation of other semantic relations and is essential for several NLP tasks, such as sarcasm and inconsistency detection. The ability to automatically detect contradictions is vital for identifying mutually exclusive statements, thereby recognizing the underlying irony in sarcastic expressions and ensuring logical coherence in textual data. Additionally, differentiating between various semantic relations can significantly enhance the precision of automated systems and virtual assistants in generating contradiction-free information. However, contradiction detection has often been overshadowed within the semantic field in favor of entailment and similarity tasks. Contradictory ideas can appear in diverse forms within sentences, making them challenging to identify. Our research addresses this gap by developing reliable models specifically tailored for contradiction detection. We employed extensive methodologies, including data restructuring, benchmarking, and fine-tuning, achieving an accuracy of 98% in classifying contradictions. Furthermore, we developed another model specialized in differentiating between the three semantic relations: contradiction, similarity, and dissimilarity; it achieved an accuracy of 97% in distinguishing contradicting from dissimilar pairs. Leveraging these models, we discovered historically overlooked contradictory pairs within the Semantic Textual Similarity (STS) benchmarks that were inaccurately labeled as similar or dissimilar and that represent about a quarter of the dataset. This mislabeling may bias how language models differentiate between contradicting, similar, and dissimilar pairs. Highlighting these neglected contradicting pairs provides insight into the impact of contradictions within the STS dataset on the corresponding models. These insights confirm that the presence of contradictions significantly affects the accuracy and effectiveness of STS models. This thesis contributes significantly to realizing the full potential of NLP in capturing the complexity of human communication, thereby enriching both academic discourse and practical applications in the digital age.
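
As a rough illustration of the fine-tuning step mentioned in the abstract, the sketch below trains a binary contradiction detector on sentence pairs with a pretrained transformer. The dataset (SNLI), base encoder (bert-base-uncased), and hyperparameters are illustrative assumptions, not the data or configuration used in the thesis.

```python
# Illustrative sketch only: binary contradiction detection by fine-tuning a
# pretrained transformer on SNLI-style sentence pairs. Dataset, base model,
# and hyperparameters are assumptions, not the thesis's actual setup.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-uncased"  # assumed base encoder

# SNLI labels: 0 = entailment, 1 = neutral, 2 = contradiction, -1 = unlabeled.
raw = load_dataset("snli")
raw = raw.filter(lambda ex: ex["label"] != -1)
# Recast to binary: 1 = contradiction, 0 = everything else.
raw = raw.map(lambda ex: {"label": int(ex["label"] == 2)})

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    # Encode each premise/hypothesis pair jointly so the model sees both sentences.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

encoded = raw.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

args = TrainingArguments(
    output_dir="contradiction-detector",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,
)

trainer.train()
print(trainer.evaluate())
```

The same recipe extends to the three-way setting described in the abstract (contradiction vs. similarity vs. dissimilarity) by setting num_labels=3 and supplying labels for all three relations.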
dc.language.iso en
dc.subject Natural Language Processing
dc.subject Machine Learning
dc.subject Semantic Textual Similarity
dc.subject Contradictions
dc.subject Embeddings
dc.subject Large Language Models
dc.title Bridging the Semantic Gap: Tackling Contradictions in Semantic Similarity for Natural Language
dc.type Thesis
dc.contributor.department Suliman S. Olayan School of Business
dc.contributor.faculty Suliman S. Olayan School of Business
dc.contributor.commembers Nasr, Walid
dc.contributor.commembers Taleb, Sirine
dc.contributor.degree MSBA
dc.contributor.AUBidnumber 202370135

