Abstract:
The rise of social media platforms has been accompanied by a disturbing increase in online abusive behavior, causing psychological harm, especially among children. This study focuses on detecting implicit abusive language, which is often overlooked in favor of explicit abuse. Implicit abuse conceals derogatory meaning within seemingly positive expressions, making it harder to identify. Using Twitter data, we collected and annotated a dataset distinguishing implicit, explicit, and non-abusive language. Our research leveraged traditional machine learning, deep learning, and transfer learning models to detect online abusive language. The Ensemble BERT model achieved a remarkable F1 score of 0.72 and an AUC of 0.81 in distinguishing implicit abuse from non-abusive content. This research provides a deeper understanding of the nuances of online abuse and offers a significant step toward creating a safer online environment that promotes healthy digital interactions and the well-being of users.