Evaluation of Recurrent Neural Network architectures for abusive language detection in cyberbullying contexts

Filip Markoski; Eftim Zdravevski; Nikola Ljubešić; Sonja Gievska

Date Issued

2020-05-08

Author(s)

Filip Markoski

Eftim Zdravevski

Nikola Ljubešić

Sonja Gievska

Abstract

Cyberbullying is bullying that takes place over digital devices, and social media is one of the most common environments where it occurs. It can cause serious, long-lasting trauma and lead to problems with fear, anxiety, sadness, mood, energy level, sleep, and appetite. Detecting and tagging hateful or abusive comments can therefore help mitigate or prevent the negative consequences of cyberbullying. This paper evaluates seven architectures based on Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) units for the classification of comments. The evaluation is conducted on two abusive language detection tasks, one on a Wikipedia data set and one on a Twitter data set, obtaining ROC-AUC scores of up to 0.98. The architectures incorporate various neural network mechanisms, such as bi-directionality, regularization, convolutions, and attention. The paper reports results in multiple evaluation metrics, which may serve as baselines for future research. We conclude that the performance difference between the two families is negligible, with the GRU models marginally outperforming their LSTM counterparts while taking less training time.

Subjects

Deep Learning, NLP, R...

File(s)

Name

CIIT2020_paper_21.pdf

Size

580.02 KB

Format

Adobe PDF

Checksum

MD5: e9d2fb56f4c2638f3d31c6549ec425b8