Hatebase is a collaborative, regionalized repository of multilingual hate speech
Hatebase was built to help companies, government agencies, NGOs and research organizations moderate online conversations and potentially use hate speech as a predictor for regional violence. (Language-based classification, or symbolization, is one of a handful of quantifiable steps toward genocide.)
Social media has become ubiquitous in present-day society and, as with all technological tools and platforms, it reflects both the good and bad aspects of human nature and decision making. Put another way, social media platforms are neither the pure-intentioned bastions of community-building espoused by their evangelists nor the source of the destruction of civil engagement and bipartisan discussion. Rather, they amplify and echo what is already present within society. From this perspective, the presence of hate speech and misinformation is unsurprising. What is unique about online conversation is the physical and mental distance, and the anonymity, with which perpetrators of hate speech can operate, as well as the speed and reach with which hateful messages can spread.

When tens of thousands of inflammatory posts appeared on Facebook and Twitter in the months preceding the 2016 US presidential election, an embittered rural electorate seemed a likely cause. Few suspected Russian state interference, although in hindsight the warnings were there.

When we relaunched Hatebase at the end of last year, we deployed a wealth of new data attributes (e.g. targeted groups, plurals, transliterations) and a new API (now v4.1). We also included a complete bottom-up rebuild of HateBrain, the natural language processing (NLP) engine at the heart of Hatebase, which is responsible for 738,000 regionalized, timestamped hate speech sightings as of this month.