Abstract:
Hate speech on social media platforms has proliferated in recent years,
causing severe adverse effects on victims’ mental health and well-being. This serious
phenomenon requires updated automated detection systems. However, existing
supervised machine learning models have significant limitations because they rely heavily on
labeled data, which is costly to obtain, prone to errors, and limits scalability and generalizability.
This thesis explores unsupervised learning techniques, specifically clustering enhanced
with deep representation learning, to overcome these limitations. Traditional methods (TF-IDF,
Word2Vec) and modern methods (transformers, pre-trained language models, and
contrastive learning) are leveraged to enrich representations of short texts and capture
semantic similarities without labeled data. We investigate the state-of-the-art Simple
Contrastive Learning of Sentence Embeddings (SimCSE) approach and propose
Hate-SimCSE, a fine-tuned SimCSE framework that encodes robust hate speech
representations, leading to better clustering results. Extensive
experiments on diverse public datasets demonstrate that Hate-SimCSE significantly improves
clustering performance over conventional text clustering approaches, with
accuracy ranging from 0.58 to 0.86, an improvement of 2% to 15%. Overall, our work
illustrates the potential of these new techniques to develop more effective methods for
combating the pressing societal issue of online hate and to create a safer online
environment for all users. Additionally, this research can extend beyond hate speech
detection to other downstream NLP tasks, such as semantic textual
similarity, information extraction, and question answering.