Unveiling Gender in Text: Advanced Approaches in Language Model Analysis

Al Bidewe, Nour Ayman

URI: http://hdl.handle.net/10938/24280

Date: 2024-01-29

Abstract:

This thesis investigates gender detection in text: identifying an author's gender through linguistic and stylistic analysis. The study emphasizes the role of gender detection in improving the precision and relevance of information processing systems, which supports more personalized content strategies and helps combat gender bias in areas such as social media and AI-driven analytics. The research evaluates a range of preprocessing techniques and feature selection strategies, and assesses the effectiveness of both traditional models and advanced language models such as BERT, particularly on tweets. The key findings show that splitting social media data by username, as opposed to random splitting, enhances model performance and generalization and prevents data leakage. Integrating word and character n-grams, along with combining linguistic and textual features, proved highly effective. BERT was the strongest performer among the large language models evaluated, though it did not outperform traditional models. This work advances the understanding of gender detection and contributes to the development of more sophisticated and equitable text analysis tools in computational linguistics.
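
The two techniques named in the abstract, username-based splitting and combined word and character n-grams, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names `split_by_username` and `ngram_features`, the 20% test fraction, the n-gram ranges, and the toy records are all assumptions made for illustration. The point is only that every text from one username lands on a single side of the split, so a model cannot be evaluated on an author it has already seen.

```python
import random
from collections import Counter


def split_by_username(records, test_fraction=0.2, seed=0):
    """Split (username, text, label) records so that no username
    appears in both the training set and the test set."""
    usernames = sorted({username for username, _, _ in records})
    rng = random.Random(seed)
    rng.shuffle(usernames)
    # Hold out a fraction of the users, not a fraction of the texts.
    n_test = max(1, round(len(usernames) * test_fraction))
    test_users = set(usernames[:n_test])
    train = [r for r in records if r[0] not in test_users]
    test = [r for r in records if r[0] in test_users]
    return train, test


def ngram_features(text, word_n=(1, 2), char_n=(2, 3)):
    """Count word and character n-grams in one combined feature map.
    Keys are prefixed with "w:" or "c:" to keep the two kinds apart."""
    feats = Counter()
    lowered = text.lower()
    tokens = lowered.split()
    for n in range(word_n[0], word_n[1] + 1):
        for i in range(len(tokens) - n + 1):
            feats["w:" + " ".join(tokens[i:i + n])] += 1
    for n in range(char_n[0], char_n[1] + 1):
        for i in range(len(lowered) - n + 1):
            feats["c:" + lowered[i:i + n]] += 1
    return feats


# Toy records invented for illustration; labels are placeholders.
records = [
    ("user_a", "great game tonight", "class_0"),
    ("user_a", "what a match", "class_0"),
    ("user_b", "new recipe today", "class_1"),
    ("user_c", "traffic is awful", "class_0"),
    ("user_d", "loving this weather", "class_1"),
    ("user_d", "coffee first", "class_1"),
]

train, test = split_by_username(records)
train_users = {r[0] for r in train}
test_users = {r[0] for r in test}
features = ngram_features("hi there")
```

A random split over individual tweets could place two tweets by `user_a` on opposite sides, which is the kind of leakage the abstract describes; splitting over usernames rules that out by construction.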

Advisor(s):

Khreich, Wael

Files in this item

Name: AlBideweNour_2024.pdf

Size: 1.195Mb

Format: PDF

This item appears in the following Collection(s)

AUB Students' Theses, Dissertations, and Projects [12709]

Copyright Statement

All materials included in the institutional repository are protected by copyright laws and are the property of their respective copyright holders. Materials may be used for non-commercial, educational, or research purposes only, and must be cited or attributed to the original source. Permission for any other use must be obtained from the copyright holder(s) directly. The American University of Beirut Libraries does not assume responsibility for any infringement of copyright laws that may occur as a result of the use of materials in the repository. If you believe that your copyright has been infringed upon in the repository, please contact the AUB Libraries immediately.

For further information, please contact us at scholarworks@aub.edu.lb
