Version: 1.1.4

Understanding Governance in Kadal

How to Monitor Violations Based on the Moderation Rules Set in Kadal

Kadal provides a governance framework that helps maintain a safe and respectful environment for users by monitoring violations against predefined moderation rules. The system maintains lists of disturbing and potentially offensive terms, grouped into categories such as profanity, hate speech, and sexual content. Admins can monitor the violations users commit while interacting with agents and view details about the nature of each violation. This helps keep the platform respectful and compliant with community standards.

Steps:

  1. Log in to Kadal as an Admin.
  2. Navigate to Governance in the Admin panel.
  3. Under the Moderation tab, you’ll see a list of categories containing potentially offensive terms. These terms are automatically filtered to help ensure that interactions remain appropriate.
    • The list includes categories such as Race, Violence, Gender, Profanity, Sexuality, Swear Words, Self-Harm, Hate, Insults, Sexual Content, and Misconduct.
  4. Switch to the Violation tab. Here, you’ll find a list of users along with the number of violations each user has committed while interacting with agents.
  5. Click on the number of violations next to any user’s name to see a detailed view of the violations.
  6. In the detailed view, violations will be displayed in a table format, showing the following information:
    • Category: The type of violation (e.g., profanity or hate speech).
    • Agent Name: The name of the agent involved in the violation.
    • Input: The prompt entered by the user that triggered the violation.
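
To make the relationship between the two views concrete, here is a minimal Python sketch of the data described in the steps above. It is only an illustration and does not use any Kadal API: the `Violation` class, its field names, the helper functions, and the sample records are all hypothetical. It shows how a per-user count (as on the Violation tab) and a per-user detail table (Category, Agent Name, Input) could be derived from the same set of records.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    """Hypothetical record; field names are assumptions, not Kadal's schema."""
    user: str        # user who entered the prompt
    category: str    # moderation category, e.g. "Profanity" or "Hate"
    agent_name: str  # agent the user was interacting with
    input_text: str  # prompt that triggered the violation


def violations_per_user(records):
    """Mirror the Violation tab: number of violations for each user."""
    return Counter(r.user for r in records)


def details_for_user(records, user):
    """Mirror the detailed view: (Category, Agent Name, Input) rows for one user."""
    return [(r.category, r.agent_name, r.input_text)
            for r in records if r.user == user]


# Made-up sample data with placeholder prompts.
records = [
    Violation("alice", "Profanity", "Support Bot", "<flagged prompt 1>"),
    Violation("alice", "Insults", "Sales Bot", "<flagged prompt 2>"),
    Violation("bob", "Hate", "Support Bot", "<flagged prompt 3>"),
]

counts = violations_per_user(records)
alice_rows = details_for_user(records, "alice")

for user, count in counts.items():
    print(f"{user}: {count} violation(s)")
```

In this sketch, clicking a user's violation count in the Admin panel corresponds to calling `details_for_user` for that user.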