Measuring Harm Potential of Online Hate Speech

This project is developing a framework, in collaboration with experts across fields, to assess the ‘harm potential’ of online posts in India. It aims to address online harm with context-aware responses, focusing on preventing physical violence. The project also involves creating tools for identifying harmful content. More details and resources are available in the sections listed below.

This ongoing project seeks to build a framework to measure the ‘harm potential’ of public online posts in India. The framework is being developed in conjunction with linguists, social anthropologists, sociologists, legal experts, law enforcement agents, parliamentarians, online activists, and victims of hate speech and disinformation campaigns. Informed by the nuances of Indian languages, legal systems, ethnic and cultural divisions, and existing response mechanisms, the framework aims to suggest proportionate responses to online content that has high harm potential. This interdisciplinary framework can be used to design tool(s) that identify online content likely to lead to real-world (physical) harm and that recommend a proportionate response. While harm can take many forms, the project focuses on the potential for physical violence, as a pragmatic approach in the current context. The project recognizes the limitations of existing content moderation systems, which often lack the contextual information that determines whether online content has harm potential. With this understanding, the framework seeks to create a context-rich tool for mitigating online harm.

Project Docs and Details

Tagset and Draft Annotation Guidelines

Annotation Tool

Publications

Events

Podcasts