Predicting the victims of hate speech on microblogging platforms

Sahrish Khan; Rabeeh Ayaz Abbasi; Muddassar Azam Sindhu; Sachi Arafat; Akmal Saeed Khattak; Ali Daud; Mubashar Mushtaq

doi:10.1016/j.heliyon.2024.e40611

Heliyon. 2024 Nov 26;10(23):e40611. doi: 10.1016/j.heliyon.2024.e40611. eCollection 2024 Dec 15.

Authors

Sahrish Khan¹,², Rabeeh Ayaz Abbasi¹, Muddassar Azam Sindhu¹, Sachi Arafat³, Akmal Saeed Khattak¹, Ali Daud⁴, Mubashar Mushtaq⁵

Affiliations

¹ Department of Computer Science, Quaid-i-Azam University, Islamabad, Pakistan.
² Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK.
³ Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia.
⁴ Faculty of Resilience, Rabdan Academy, Abu Dhabi, United Arab Emirates.
⁵ Department of Computer Science, Forman Christian College (A Chartered University), Lahore, Pakistan.

Abstract

Hate speech constitutes a major problem on microblogging platforms, and its automatic detection is a growing research area. Most existing works focus on analyzing the content of social media posts. Our study shifts the focus to predicting which users are likely to become targets of hate speech. This paper proposes a novel Hate-speech Target Prediction Framework (HTPK) and introduces a new Hate Speech Target Dataset (HSTD), which contains tweets labeled for targets and non-targets of hate speech. We tested various machine learning algorithms using a combination of Term Frequency-Inverse Document Frequency (TF-IDF), N-grams, and Part-of-Speech (PoS) tags as features. The Naïve Bayes (NB) classifier performed best, with an accuracy of 93%, significantly outperforming the other algorithms. This research identifies the optimal combination of features for predicting hate speech targets and compares various machine learning algorithms, providing a foundation for more proactive hate speech mitigation on social media platforms.
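
To make the feature and classifier combination named in the abstract more concrete, the following is a minimal sketch of a multinomial Naïve Bayes classifier over TF-IDF-weighted word n-grams, written in plain Python. It is not the authors' implementation: the class name `TfidfNaiveBayes`, the helper `ngrams`, the smoothing choices, and the four toy sentences are all assumptions made for illustration, the toy data is not drawn from HSTD, and the PoS-tag features are omitted because they would require an external tagger.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return word n-grams (n = 1..n_max) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class TfidfNaiveBayes:
    """Hypothetical helper: multinomial Naive Bayes over TF-IDF-weighted n-grams."""

    def fit(self, texts, labels):
        docs = [Counter(ngrams(t)) for t in texts]
        n_docs = len(docs)

        # Document frequency of each n-gram, then a smoothed IDF
        # (assumed form: log((1 + N) / (1 + df)) + 1).
        df = Counter()
        for doc in docs:
            df.update(doc.keys())
        self.idf = {g: math.log((1 + n_docs) / (1 + c)) + 1 for g, c in df.items()}
        self.vocab = set(self.idf)
        self.classes = sorted(set(labels))

        # Accumulate the TF-IDF weight of every n-gram per class.
        weight = {c: defaultdict(float) for c in self.classes}
        for doc, label in zip(docs, labels):
            for g, tf in doc.items():
                weight[label][g] += tf * self.idf[g]

        # Class priors and add-one smoothed log-likelihoods.
        self.log_prior = {}
        self.log_like = {}
        for c in self.classes:
            self.log_prior[c] = math.log(list(labels).count(c) / n_docs)
            total = sum(weight[c].values()) + len(self.vocab)
            self.log_like[c] = {
                g: math.log((weight[c][g] + 1) / total) for g in self.vocab
            }
        return self

    def predict(self, text):
        # n-grams never seen in training are ignored.
        counts = Counter(g for g in ngrams(text) if g in self.vocab)
        scores = {}
        for c in self.classes:
            scores[c] = self.log_prior[c] + sum(
                tf * self.idf[g] * self.log_like[c][g] for g, tf in counts.items()
            )
        return max(scores, key=scores.get)


# Invented toy examples, for illustration only (not taken from HSTD).
train_texts = [
    "you people are disgusting and should leave",
    "nobody wants your kind here go away",
    "lovely weather for a walk today",
    "great match last night well played",
]
train_labels = ["target", "target", "non-target", "non-target"]

model = TfidfNaiveBayes().fit(train_texts, train_labels)
print(model.predict("your kind should leave"))
print(model.predict("great weather today"))
```

With only four training sentences the predictions carry no statistical weight. The sketch only shows how n-gram extraction, TF-IDF weighting, and the Naïve Bayes decision rule fit together.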

Keywords: Hate speech; Machine learning; Prediction; Social media; Twitter.