2020 Conference

Hate Speech Classification of Social Media Posts Using Text Analysis and Machine Learning


Hate crimes are on the rise in the United States and other parts of the world. Hate speech is one tool that a person or group uses to express bias, hatred, and prejudice toward a religion, race, ethnicity, ancestry, sexual orientation, gender, or disability, thereby spreading hatred. This paper describes how SAS Enterprise Miner’s Text Analytics was used to develop a model that categorizes tweets by content as either hateful or normal. After the data were sampled and cleaned and the tweets were broken down into quantifiable components, several models were built and compared. The best-performing model was then used to score unseen data and achieved reasonable classification accuracy. The paper also touches on how organizations like Twitter could harness text analytics to encourage civic responsibility among their users: a user-level feature that labels tweets by category as they are typed would give users an opportunity to review, and possibly modify, hateful tweets before posting them.
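
The workflow summarized in the abstract (clean the text, break it into countable components, fit a classifier, score unseen posts) can be illustrated with a small Python sketch. This is not the authors' SAS Enterprise Miner workflow: the abstract does not name the models that were compared, so a multinomial Naive Bayes classifier is swapped in here purely for illustration. The helper names (`tokenize`, `NaiveBayes`, `needs_review`) and the four toy training posts are hypothetical and were invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post, drop URLs and @mentions, and split it into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over bag-of-words counts.

    Illustrative stand-in only; the paper's actual models are not specified.
    """

    def fit(self, texts, labels):
        self.word_counts = {}              # label -> Counter of token counts
        self.doc_counts = Counter(labels)  # label -> number of training posts
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]  # skip unseen words
        scores = {}
        for label, counts in self.word_counts.items():
            total = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)  # log prior
            for tok in tokens:
                # Add-one smoothing keeps unseen (label, word) pairs from zeroing the score.
                score += math.log((counts[tok] + 1) / (total + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy training data, invented for this sketch.
train_texts = [
    "I hate those people, they should all leave",
    "those people are disgusting and worthless",
    "what a lovely morning for a walk",
    "great game last night, congrats to the team",
]
train_labels = ["hateful", "hateful", "normal", "normal"]

model = NaiveBayes().fit(train_texts, train_labels)


def needs_review(draft):
    """Return True when a draft post is labelled hateful, so the user can be prompted."""
    return model.predict(draft) == "hateful"
```

A function like `needs_review` could be called on a draft as it is typed, which is the kind of user-level labelling feature the abstract proposes. A realistic model would need a far larger labelled sample and a held-out set for comparing candidate models.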

Presenter(s)

Venkateshwarlu Konduri | Sarada Padathula | Asish Pamu | Sravani Sigadam

Coalition for Advancing Digital Research and Education
