[SDK] System Policy Compliance Quickstart

Run a System Policy Compliance Test with DynamoEval SDK (GPT-4o)

Last updated: May 21st, 2025


This Quickstart provides a step-by-step walkthrough for using Dynamo AI's SDK and platform to run a System Policy Compliance Test on a GPT-4o model. The goal is to evaluate your AI system and its guardrails for compliance with a set of policies.

Prerequisites

To follow along, you'll need a Dynamo AI API token and an OpenAI API key. This walkthrough is also available as a Colab notebook.

Environment Setup

Begin by installing the Dynamo AI SDK and setting up your API keys.

!pip install dynamofl
DYNAMOFL_HOST = "https://api.dynamo.ai"
DYNAMOFL_API_KEY = "" # Paste your Dynamo AI API token here
OPENAI_API_KEY = "" # Paste your OpenAI API key here
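If you prefer not to paste secrets directly into the notebook, a minimal alternative is to read them from environment variables. The variable names below are just an illustrative convention, not something the SDK requires:

import os

# Optional: pull credentials from environment variables instead of hardcoding them.
# Assumes DYNAMOFL_API_KEY and OPENAI_API_KEY are set in your shell or Colab secrets.
DYNAMOFL_API_KEY = os.environ.get("DYNAMOFL_API_KEY", DYNAMOFL_API_KEY)
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", OPENAI_API_KEY)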

Now, import the required libraries and initialize the SDK:

from dynamofl import DynamoFL, GPUConfig, GPUType
import requests

dfl = DynamoFL(DYNAMOFL_API_KEY, host=DYNAMOFL_HOST)
print(f"Connected as {dfl.get_user()['email']}")

Create an AI System

First, let's register your target model as an AI system. Here, we use OpenAI's GPT-4o Mini as an example, but you can swap in any supported OpenAI model.

model = dfl.create_openai_model(
    name="GPT 4o Mini - System Policy Compliance",
    api_instance="gpt-4o-mini",
    api_key=OPENAI_API_KEY
)
print(f"Target model created with key {model.key}")

Configure Policy IDs

You can apply DynamoGuard guardrails to your AI system during testing, or evaluate policies without applying them. Add your policy IDs to the appropriate list below:

APPLIED_DYNAMOGUARD_POLICIES = []  # Applied guardrails. Note that all applied guardrails will also be evaluated.
EVALUATED_DYNAMOGUARD_POLICIES = ["<POLICY_ID>"] # Policies only evaluated (replace with your policy ID)

(Optional) Validate Policy IDs

It's a good idea to validate your policy IDs before running the test. Use the helper below:

def try_analyze_endpoint(model_endpoint, policy_id, api_key):
    # Send a minimal moderation request to confirm the policy ID resolves to an applied policy.
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    data = {
        "messages": [{"role": "user", "content": "hi"}],
        "textType": "MODEL_INPUT",
        "policyIds": [policy_id],
    }
    response = requests.post(model_endpoint, headers=headers, json=data, timeout=50)
    response_json = response.json()
    try:
        if not response_json["appliedPolicies"]:
            raise ValueError(
                f"The response from model_endpoint {model_endpoint} for policy_id {policy_id} "
                "doesn't have any appliedPolicies attached."
            )
    except Exception as exc:
        raise ValueError(
            f"Guardrail inference failed for policy {policy_id}. "
            f"Response JSON from DynamoGuard:\n\n{response_json}"
        ) from exc
    return "status" not in response_json, response_json

for policy_id in APPLIED_DYNAMOGUARD_POLICIES + EVALUATED_DYNAMOGUARD_POLICIES:
    result, _ = try_analyze_endpoint(
        model_endpoint=f"{DYNAMOFL_HOST}/v1/moderation/analyze/",
        policy_id=policy_id,
        api_key=DYNAMOFL_API_KEY
    )
    print(f"Policy {policy_id} is valid: {result}")

Run a System Policy Compliance Test

Now you're ready to create and run your compliance test! This will submit a test to the DynamoEval platform, where your model will be evaluated for policy compliance under various input perturbations.

benchmark_info = dfl.create_system_policy_compliance_test(
    name="guardrail_benchmark_test",
    applied_dynamoguard_policies=APPLIED_DYNAMOGUARD_POLICIES or [],
    evaluated_dynamoguard_policies=EVALUATED_DYNAMOGUARD_POLICIES or [],
    target_model=model.key,
    dynamoguard_endpoint=f"{DYNAMOFL_HOST}/v1/moderation/analyze/",
    dynamoguard_api_key=DYNAMOFL_API_KEY,
    model_key=model.key,
    gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
    perturbation_methods=[
        "rewording",
        "common_misspelling",
        "leet_letters",
        "random_upper"
    ]
)
print(benchmark_info)

Checking Test Status

You can check the status of your compliance test using the following snippet:

print(dfl.get_attack_info(benchmark_info.attacks[0]["id"])['status'])
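If you'd rather wait for completion from the notebook, a simple polling loop over the same call works. This is a minimal sketch: the terminal status strings ("COMPLETED", "FAILED") are assumptions and may differ on your deployment.

import time

# Poll the test status until it reaches a terminal state.
# The terminal status values below are assumptions; adjust to match what get_attack_info returns.
attack_id = benchmark_info.attacks[0]["id"]
while True:
    status = dfl.get_attack_info(attack_id)["status"]
    print(f"Current status: {status}")
    if status in ("COMPLETED", "FAILED"):  # assumed terminal states
        break
    time.sleep(60)  # check once a minute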

Viewing Test Results

After your test has been created, navigate to the model dashboard page in the Dynamo AI UI. Here, you should see your model and the running test. Once the test is complete, a detailed report will be available for review.