Safety
Overview
The safety policy can be used to detect user-provided inputs or model responses that contain unsafe material, such as obscene, toxic, or harmful content.
Safety Policy Definition
Dynamo AI uses the following policy definition for detecting unsafe content.
- Description: Safeguard the chatbot from queries with unsafe content, context, or intent.
- Disallowed:
- Illegal or criminal activities
- Violent, harassing, or hurtful topics
- Hateful, derogatory, or discriminatory speech
- Dangerous, unethical, or harmful topics
- Encouraging or providing methods of self-harm or suicide
- Abusive, biased, or unfair remarks
- Sexual references including but not limited to sexual innuendos, acts, devices, and favors
- Sexist or racist discourse including implicit and explicit discrimination or stereotyping
- Untruthful, manipulative, or misleading statements
- Creating, obtaining, or operating weapons
- Procurement or use of harmful substances
- Vulgar or offensive language
- Child harm topics
- Malware topics
- Privacy violation topics
- Allowed:
- General harmless queries
- Requests for responsible information on violence and discrimination
- Requests for responsible sexual education, health, or consent
- Requests for factual resources for mental health
- Queries on resources for managing conflicts and reporting harassment
- Queries on promoting diversity, fairness, and inclusion
- Queries on responsible, harmless, and safe information on substances
- Requests for explanations of ethical and responsible behavior
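The policy definition above can be represented as a structured configuration. The sketch below is illustrative only: the dict-based shape and the field names `description`, `disallowed`, and `allowed` are assumptions for this example, not Dynamo AI's actual schema.

```python
# Illustrative, assumed representation of the safety policy definition;
# field names are not Dynamo AI's actual schema.
safety_policy = {
    "description": (
        "Safeguard the chatbot from queries with unsafe "
        "content, context, or intent."
    ),
    "disallowed": [
        "Illegal or criminal activities",
        "Violent, harassing, or hurtful topics",
        "Hateful, derogatory, or discriminatory speech",
        "Dangerous, unethical, or harmful topics",
        "Encouraging or providing methods of self-harm or suicide",
        "Abusive, biased, or unfair remarks",
        "Sexual references including innuendos, acts, devices, and favors",
        "Sexist or racist discourse, implicit or explicit",
        "Untruthful, manipulative, or misleading statements",
        "Creating, obtaining, or operating weapons",
        "Procurement or use of harmful substances",
        "Vulgar or offensive language",
        "Child harm topics",
        "Malware topics",
        "Privacy violation topics",
    ],
    "allowed": [
        "General harmless queries",
        "Responsible information on violence and discrimination",
        "Responsible sexual education, health, or consent",
        "Factual resources for mental health",
        "Resources for managing conflicts and reporting harassment",
        "Promoting diversity, fairness, and inclusion",
        "Responsible, harmless, and safe information on substances",
        "Explanations of ethical and responsible behavior",
    ],
}
```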
Safety Policy Actions
You can control what happens to inputs and outputs when the safety policy is applied by choosing one of the actions below:
- Flag: flag content for moderator review
- Block: block user inputs or model outputs containing unsafe content
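The two actions can be sketched as a simple gating step around a safety classifier. Everything below is a hypothetical illustration: the function names `apply_safety_policy` and `is_unsafe` are assumptions, not part of the Dynamo AI API, and `is_unsafe` stands in for the real safety detector, which is not shown in this document.

```python
# Hypothetical sketch of applying a safety-policy action to a message.
# `is_unsafe` is a stand-in for the real safety classifier.
def apply_safety_policy(text, action, is_unsafe):
    if not is_unsafe(text):
        # Safe content passes through unchanged under either action.
        return {"allowed": True, "text": text, "flagged": False}
    if action == "flag":
        # Flag: pass the content through but mark it for moderator review.
        return {"allowed": True, "text": text, "flagged": True}
    if action == "block":
        # Block: withhold the unsafe input or output entirely.
        return {"allowed": False, "text": None, "flagged": True}
    raise ValueError(f"unknown action: {action}")
```

For example, with a classifier that marks a message unsafe, the `flag` action still returns the text (marked for review) while the `block` action withholds it.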