Safety

Overview

The safety policy can be used to detect unsafe material, such as obscene, toxic, or harmful language, in user-provided inputs or model responses.

Safety Policy Definition

Dynamo AI uses the following policy definition for detecting unsafe content.

  • Description: Safeguard chat bot from queries with unsafe content, context, or intent.
  • Disallowed:
    • Illegal or criminal activities
    • Violence, harassment, or hurtful topics
    • Hateful, derogatory, or discriminatory speech
    • Dangerous, unethical, or harmful topics
    • Encouraging or providing methods of self-harm or suicide
    • Abusive, biased, or unfair remarks
    • Sexual references including but not limited to sexual innuendos, acts, devices, and favors
    • Sexist or racist discourse including implicit and explicit discrimination or stereotyping
    • Untruthful, manipulative, or misleading statements
    • Creating, obtaining, or operating weapons
    • Procurement or use of harmful substances
    • Vulgar or offensive language
    • Child harm topics
    • Malware topics
    • Privacy violation topics
  • Allowed:
    • General harmless queries
    • Requests for responsible information on violence and discrimination
    • Requests for responsible sexual education, health, or consent
    • Requests for factual resources for mental health
    • Queries on resources for managing conflicts and reporting harassment
    • Queries on promoting diversity, fairness, and inclusion
    • Queries on responsible, harmless, and safe information on substances
    • Requests for explanations of ethical and responsible behavior
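For illustration, the policy definition above can be represented as structured data. The sketch below is a hypothetical encoding, not Dynamo AI's actual policy schema; the field names (`description`, `disallowed`, `allowed`) mirror the bullets above but are an assumption.

```python
# Hypothetical, illustrative encoding of the safety policy definition.
# Dynamo AI's real policy format may differ; only a few items from each
# list are shown for brevity.
safety_policy = {
    "description": (
        "Safeguard chat bot from queries with unsafe content, "
        "context, or intent."
    ),
    "disallowed": [
        "Illegal or criminal activities",
        "Violence, harassment, or hurtful topics",
        "Hateful, derogatory, or discriminatory speech",
        # ... remaining disallowed topics omitted for brevity
    ],
    "allowed": [
        "General harmless queries",
        "Requests for factual resources for mental health",
        # ... remaining allowed topics omitted for brevity
    ],
}
```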

Safety Policy Actions

You can control how user inputs and model outputs are handled when the safety policy is applied, using the actions below:

  • Flag: flag content for moderator review
  • Block: block user inputs or model outputs containing unsafe content
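The two actions can be sketched as a minimal handler. This is an illustrative example only; `SafetyAction` and `apply_safety_action` are hypothetical names, not part of the Dynamo AI API.

```python
from enum import Enum


class SafetyAction(Enum):
    # Hypothetical enum mirroring the two documented actions.
    FLAG = "flag"    # surface the content for moderator review
    BLOCK = "block"  # refuse to pass the content through


def apply_safety_action(content: str, is_unsafe: bool,
                        action: SafetyAction) -> dict:
    """Decide what happens to content the safety policy has classified.

    Safe content passes through unchanged. Unsafe content is always
    flagged; under BLOCK it is additionally withheld from the response.
    """
    if not is_unsafe:
        return {"content": content, "flagged": False, "blocked": False}
    if action is SafetyAction.BLOCK:
        return {"content": None, "flagged": True, "blocked": True}
    return {"content": content, "flagged": True, "blocked": False}
```

Note that in this sketch BLOCK implies FLAG: blocked content is still recorded for moderator review, which matches the usual layering of moderation actions but is an assumption about Dynamo AI's behavior.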