AUB ScholarWorks

Unveiling Gender in Text: Advanced Approaches in Language Model Analysis


dc.contributor.advisor Khreich, Wael
dc.contributor.author Al Bidewe, Nour Ayman
dc.date.accessioned 2024-01-29T06:49:22Z
dc.date.available 2024-01-29T06:49:22Z
dc.date.issued 2024-01-29
dc.date.submitted 2024-01-27
dc.identifier.uri http://hdl.handle.net/10938/24280
dc.description.abstract This thesis investigates gender detection in text analysis: identifying an author's gender through linguistic and stylistic analysis. The study emphasizes the role of gender detection in improving the precision and relevance of information processing systems, which matters for more personalized content strategies and for combating gender bias in sectors such as social media and AI-driven analytics. The research evaluates a range of preprocessing techniques and feature selection strategies, and assesses the effectiveness of both traditional models and advanced language models such as BERT, particularly in analyzing tweets. Key findings show that splitting social media data by username, rather than at random, improves model performance and generalization and prevents data leakage. Integrating word and character N-grams, along with combining linguistic and textual features, proved highly effective. BERT was the strongest performer among the large language models, though it did not outperform traditional models. This work advances the understanding of gender detection and contributes to the development of more sophisticated and equitable text analysis tools in computational linguistics.
dc.language.iso en
dc.subject Gender Detection
dc.subject Large Language Model
dc.subject Natural language processing
dc.subject Bidirectional Encoder Representations from Transformers (BERT)
dc.subject Generative pre-trained transformers (GPT)
dc.title Unveiling Gender in Text: Advanced Approaches in Language Model Analysis
dc.type Thesis
dc.contributor.department Suliman S. Olayan School of Business
dc.contributor.faculty Suliman S. Olayan School of Business
dc.contributor.commembers Nasr, Walid
dc.contributor.commembers Taleb, Sirine
dc.contributor.degree MSBA
dc.contributor.AUBidnumber 201706202
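
The abstract names two techniques, username-based data splitting and combined word and character N-grams, without showing them. The following is a minimal sketch of both in plain Python, not the thesis's implementation. The function names (`split_by_username`, `combined_features`), the N-gram sizes, the test fraction, and the toy usernames are all assumptions made for illustration.

```python
import random
from collections import Counter


def split_by_username(usernames, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) so that no username appears in both sets.

    Hypothetical helper: whole users, not individual tweets, are assigned
    to the test set, so one author's tweets cannot leak across the split.
    """
    users = sorted(set(usernames))
    random.Random(seed).shuffle(users)
    n_test = max(1, round(len(users) * test_fraction))
    test_users = set(users[:n_test])
    train_idx = [i for i, u in enumerate(usernames) if u not in test_users]
    test_idx = [i for i, u in enumerate(usernames) if u in test_users]
    return train_idx, test_idx


def word_ngrams(text, n):
    """Word N-grams over a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Character N-grams over the lowercased text, spaces included."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def combined_features(text, word_n=(1, 2), char_n=(2, 3)):
    """Count word and character N-grams in one feature bag.

    word_n and char_n list the N-gram sizes to extract (assumed values).
    The "w:" and "c:" prefixes keep the two feature families separate.
    """
    feats = Counter()
    for n in word_n:
        feats.update(f"w:{g}" for g in word_ngrams(text, n))
    for n in char_n:
        feats.update(f"c:{g}" for g in char_ngrams(text, n))
    return feats


# Toy usernames, invented for illustration: two tweets per user.
usernames = ["ana", "ana", "bo", "bo", "cy", "cy", "di", "di"]
train_idx, test_idx = split_by_username(usernames)
```

A random tweet-level split could place one user's tweets on both sides, letting a model recognize the author rather than the linguistic signal; grouping by username rules that out by construction. The feature counts from `combined_features` could then be passed to any classifier.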

