2020 Conference
Hate Speech Classification of Social Media Posts Using Text Analysis and Machine Learning
Hate crimes are on the rise in the United States and other parts of the world. Hate speech is one means by which a person or group expresses bias, hatred and prejudice towards a religion, race, ethnicity, ancestry, sexual orientation, gender or disability, and thereby spreads hatred. This paper describes how SAS Enterprise Miner's Text Analytics was used to develop a model that categorizes tweets by their content, specifically as hateful or normal. After the data were sampled and cleaned and the tweets were broken down into quantifiable components, several models were built and compared. The best-performing model was then used to score unseen data, achieving reasonable classification accuracy. The paper also touches on how organizations such as Twitter could harness text analytics to encourage civic responsibility among their users. A user-level feature that labels a tweet with a category as it is typed would give users an opportunity to review, and possibly modify, a hateful tweet before posting it.
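The workflow summarized in the abstract (clean the text, break it into countable terms, fit a classifier, score unseen posts) can be illustrated with a small sketch. This is not the authors' SAS Enterprise Miner process: it swaps in a hand-written multinomial Naive Bayes classifier in plain Python, and the function names (`tokenize`, `train`, `classify`) and the four toy posts are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post, drop URLs and @mentions, and split it into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def train(labelled_posts):
    """Collect per-label term counts and document counts from (text, label) pairs."""
    term_counts = {}
    doc_counts = Counter()
    for text, label in labelled_posts:
        doc_counts[label] += 1
        term_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {term for counts in term_counts.values() for term in counts}
    return term_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest Naive Bayes log-probability for the text."""
    term_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in term_counts.items():
        total_terms = sum(counts.values())
        # Start from the log prior for this label.
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:  # ignore terms never seen in training
                # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
                score += math.log((counts[token] + 1) / (total_terms + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data; a real study would use a sampled, labelled corpus.
TOY_POSTS = [
    ("I hate those people, they are disgusting", "hateful"),
    ("those people are awful and should leave", "hateful"),
    ("lovely day at the park with friends", "normal"),
    ("great game last night, what a finish", "normal"),
]

model = train(TOY_POSTS)
```

Calling `classify(draft_text, model)` on a draft before it is posted is the kind of check the proposed user-level feature would rely on. A production system would need far more training data and a proper comparison of candidate models, as the paper describes.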
Presenter(s)
Venkateshwarlu Konduri | Sarada Padathula | Asish Pamu | Sravani Sigadam