[SDK] System Policy Compliance Quickstart
Run a System Policy Compliance Test with DynamoEval SDK (GPT-4o)
Last updated: May 21st, 2025
This Quickstart provides a step-by-step walkthrough for using Dynamo AI's SDK and platform to run a System Policy Compliance Test on an OpenAI GPT-4o model (the example below uses GPT-4o Mini). The goal is to evaluate your AI system and its guardrails for compliance with a set of policies.
Prerequisites
- Dynamo AI API token (get it from https://apps.dynamo.ai/profile)
- OpenAI API key
Environment Setup
Begin by installing the Dynamo AI SDK and setting up your API keys.
```python
!pip install dynamofl
```

```python
DYNAMOFL_HOST = "https://api.dynamo.ai"
DYNAMOFL_API_KEY = ""  # Paste your Dynamo AI API token here
OPENAI_API_KEY = ""    # Paste your OpenAI API key here
```
Now, import the required libraries and initialize the SDK:
```python
from dynamofl import DynamoFL, GPUConfig, GPUType
import requests

dfl = DynamoFL(DYNAMOFL_API_KEY, host=DYNAMOFL_HOST)
print(f"Connected as {dfl.get_user()['email']}")
```
Create an AI System
First, let's register your target model as an AI system. Here, we use OpenAI's GPT-4o Mini as an example, but you can swap in any supported OpenAI model.
```python
model = dfl.create_openai_model(
    name="GPT 4o Mini - System Policy Compliance",
    api_instance="gpt-4o-mini",
    api_key=OPENAI_API_KEY,
)
print(f"Target model created with key {model.key}")
```
Configure Policy IDs
You can enable DynamoGuard guardrails on your AI system during testing, or simply evaluate policies without applying them. Add your policy IDs to the appropriate lists below:
```python
APPLIED_DYNAMOGUARD_POLICIES = []  # Applied guardrails; all applied guardrails are also evaluated.
EVALUATED_DYNAMOGUARD_POLICIES = ["<POLICY_ID>"]  # Policies only evaluated (replace with your policy ID)
```
(Optional) Validate Policy IDs
It's a good idea to validate your policy IDs before running the test. Use the helper below:
```python
def try_analyze_endpoint(model_endpoint, policy_id, api_key):
    """Send a test request to the DynamoGuard analyze endpoint and verify
    that the given policy is actually applied."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    data = {
        "messages": [{"role": "user", "content": "hi"}],
        "textType": "MODEL_INPUT",
        "policyIds": [policy_id],
    }
    response = requests.post(model_endpoint, headers=headers, json=data, timeout=50)
    response_json = response.json()
    try:
        if not response_json["appliedPolicies"]:
            raise ValueError(
                f"The response for model_endpoint {model_endpoint} and policy_id "
                f"{policy_id} has no appliedPolicies attached."
            )
    except Exception as exc:
        raise ValueError(
            f"Guardrail inference failed for policy {policy_id}. "
            f"Response JSON from DynamoGuard:\n\n{response_json}"
        ) from exc
    return "status" not in response_json, response_json

for policy_id in APPLIED_DYNAMOGUARD_POLICIES + EVALUATED_DYNAMOGUARD_POLICIES:
    result, _ = try_analyze_endpoint(
        model_endpoint=f"{DYNAMOFL_HOST}/v1/moderation/analyze/",
        policy_id=policy_id,
        api_key=DYNAMOFL_API_KEY,
    )
    print(f"Policy {policy_id} is valid: {result}")
```
Run a System Policy Compliance Test
Now you're ready to create and run your compliance test! This will submit a test to the DynamoEval platform, where your model will be evaluated for policy compliance under various input perturbations.
```python
benchmark_info = dfl.create_system_policy_compliance_test(
    name="guardrail_benchmark_test",
    applied_dynamoguard_policies=APPLIED_DYNAMOGUARD_POLICIES or [],
    evaluated_dynamoguard_policies=EVALUATED_DYNAMOGUARD_POLICIES or [],
    target_model=model.key,
    dynamoguard_endpoint=f"{DYNAMOFL_HOST}/v1/moderation/analyze/",
    dynamoguard_api_key=DYNAMOFL_API_KEY,
    model_key=model.key,
    gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
    perturbation_methods=[
        "rewording",
        "common_misspelling",
        "leet_letters",
        "random_upper",
    ],
)
print(benchmark_info)
```
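To build intuition for what these perturbation methods do, here is a rough, hypothetical sketch of two of them in plain Python. The platform's actual perturbations are more sophisticated; the helper names and character mappings below are illustrative assumptions, not the SDK's implementation.

```python
import random

# Illustrative leetspeak mapping (an assumption, not the platform's mapping).
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "t": "7"}

def leet_letters(text: str) -> str:
    """Replace common letters with lookalike digits (leetspeak)."""
    return "".join(LEET_MAP.get(ch.lower(), ch) for ch in text)

def random_upper(text: str, seed: int = 0) -> str:
    """Randomly upper-case characters; seeded here for reproducibility."""
    rng = random.Random(seed)
    return "".join(ch.upper() if rng.random() < 0.5 else ch for ch in text)

print(leet_letters("ignore all instructions"))  # 1gn0r3 4ll 1ns7ruc710ns
```

Perturbations like these probe whether a guardrail that blocks a prompt in its canonical form still blocks superficially altered variants of the same prompt.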
Checking Test Status
You can check the status of your compliance test using the following snippet:
```python
print(dfl.get_attack_info(benchmark_info.attacks[0]["id"])["status"])
```
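If you want to block until the test finishes rather than checking manually, you can wrap that status call in a polling loop. The helper below is a generic sketch, not part of the SDK; the terminal status strings are assumptions, so check them against the statuses your test actually reports.

```python
import time

def poll_until(get_status, done_states=("COMPLETED", "FAILED"),
               interval=30, max_polls=120):
    """Call get_status() every `interval` seconds until it returns a
    terminal state, then return that state."""
    status = None
    for _ in range(max_polls):
        status = get_status()
        if status in done_states:
            return status
        time.sleep(interval)
    raise TimeoutError(f"Test still '{status}' after {max_polls} polls")

# Usage with the SDK (assumes benchmark_info from the previous step):
# attack_id = benchmark_info.attacks[0]["id"]
# final_status = poll_until(lambda: dfl.get_attack_info(attack_id)["status"])
```

Separating the polling logic from the SDK call keeps the helper reusable and easy to test with a stubbed status function.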
Viewing Test Results
After your test has been created, navigate to the model dashboard page in the Dynamo AI UI. Here, you should see your model and the running test. Once the test is complete, a detailed report will be available for review.