CLI

The AgentWarden CLI drives build-time workflows: creating use cases, running evaluations, and drafting, compiling, and deploying reviewed policies.

Start here when you want to run build-time evaluation and policy workflows from the command line.

The CLI is a thin client for the AgentWarden API. It validates local inputs, submits remote jobs, and lets AgentWarden run evaluation and policy workflows on the server side.

Access and Setup

Contact the Dynamo AI support team for the AgentWarden CLI package or install command for your environment.

After installation, configure the CLI with an AgentWarden server URL and a build-time API key:

agentwarden setup \
  --api-key <agentwarden-api-key> \
  --server https://agentwarden.example.com

The CLI prints JSON by default, which suits scripts. Add --format human when you want terminal-friendly output during an interactive session.
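Because the default output is JSON, a script can parse it directly. The following is a minimal Python sketch, not part of the CLI. It assumes the CLI writes its JSON to stdout and exits non-zero on failure, neither of which is stated on this page. The helper name run_json is hypothetical.

```python
import json
import subprocess


def run_json(argv: list[str]) -> object:
    """Run a command, raise on a non-zero exit, and parse stdout as JSON.

    Assumption: the command prints a single JSON document to stdout.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

A script might call run_json(["agentwarden", "status"]) after setup. The shape of the returned JSON is not documented here, so inspect it before relying on specific fields.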

Core Workflow

The build-time workflow usually follows this path:

create use case -> submit eval -> review results -> draft policy -> compile policy -> deploy policy

For the product flow behind these commands, see Build-Time Flow. For the meaning of evaluation inputs and reports, see Static Evaluation. For how draft, compile, and deploy fit together, see Policy Workflow.

Command Map

Area	Commands	Use for
Configure	`agentwarden setup`, `agentwarden status`	Configure local CLI access and verify connectivity.
Use cases	`agentwarden use-case create`, `agentwarden use-case list`, `agentwarden use-case show`	Create and inspect the unit that groups an agent, tools, evaluations, and active policy.
Evaluation jobs	`agentwarden eval submit`, `agentwarden eval status`, `agentwarden eval results`, `agentwarden eval list`	Submit tool/MCP evidence, track eval jobs, and fetch reports.
Policy draft jobs	`agentwarden policy draft submit`, `agentwarden policy draft status`, `agentwarden policy draft results`, `agentwarden policy draft list`	Turn a completed eval into a reviewable policy draft.
Policy compile jobs	`agentwarden policy compile submit`, `agentwarden policy compile status`, `agentwarden policy compile results`, `agentwarden policy compile list`	Compile a reviewed draft for a target runtime and check deployability.
Policy deploy	`agentwarden policy deploy`	Preview or activate the compiled policy for runtime enforcement.

Evaluation, draft, and compile are job-shaped command groups: submit work, check status, fetch results, and list prior jobs. Deploy is a synchronous action because it changes the active runtime policy.
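The submit, status, results pattern above lends itself to a small polling loop in automation. The sketch below is illustrative Python, not CLI code. The only status value that appears on this page is completed; the other terminal states, the function name wait_for_job, and its parameters are assumptions. The status lookup is passed in as a callable so the loop does not depend on any particular output format.

```python
import time
from typing import Callable

# Assumption: "completed" is documented as a job status; "failed" and
# "cancelled" are hypothetical terminal states used only for illustration.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status: Callable[[], str],
                 interval_s: float = 5.0,
                 max_polls: int = 120,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll fetch_status() until it returns a terminal state, then return it."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"job not in a terminal state after {max_polls} polls")
```

In practice fetch_status would wrap a call such as agentwarden eval status --job-id <job-id> and extract the status from its JSON output. The field that carries the status is not documented here.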

Create a Use Case

agentwarden use-case create \
  --name "Support Bot" \
  --description "Customer support assistant"

Most eval and policy commands use the returned use_case_id to group jobs and artifacts.

You can inspect use cases later:

agentwarden use-case list
agentwarden use-case show --use-case-id <use-case-id>

Submit an Eval

Submit raw tool inventory:

agentwarden eval submit \
  --use-case-id <use-case-id> \
  --tool-inventory tools.json

Example tools.json:

[
  {
    "server": "support-mcp",
    "name": "get_user_data",
    "doc": "Fetch a user profile including private email, address, and account history.",
    "parameters": [
      {"name": "user_id", "type": "string", "description": "Internal user id"}
    ]
  }
]
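A quick local check of the inventory file can catch shape mistakes before submission. This Python sketch mirrors the keys in the example above. Which keys the server actually requires is not stated on this page, so treat EXPECTED_TOOL_KEYS and the function name check_tool_inventory as assumptions.

```python
import json

# Assumption: these keys come from the example tools.json; the authoritative
# schema is not given in this document.
EXPECTED_TOOL_KEYS = ("server", "name", "doc", "parameters")


def check_tool_inventory(text: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    tools = json.loads(text)
    if not isinstance(tools, list):
        return ["top level must be a JSON array of tool objects"]
    problems = []
    for i, tool in enumerate(tools):
        if not isinstance(tool, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        for key in EXPECTED_TOOL_KEYS:
            if key not in tool:
                problems.append(f"entry {i}: missing '{key}'")
        params = tool.get("parameters", [])
        if not isinstance(params, list):
            problems.append(f"entry {i}: 'parameters' must be an array")
            continue
        for j, param in enumerate(params):
            if not isinstance(param, dict) or "name" not in param:
                problems.append(f"entry {i}, parameter {j}: missing 'name'")
    return problems
```

The CLI validates local inputs itself, so a check like this is only a convenience for catching errors earlier, for example in a pre-commit hook.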

You can also submit pre-tagged tools, discover tools from an MCP config, or include observed trajectories:

agentwarden eval submit \
  --use-case-id <use-case-id> \
  --tool-inventory tools.json \
  --trajectories trajectories.json \
  --wait

Common evaluation input modes:

Input mode	CLI flags
Raw tool inventory	`--tool-inventory tools.json`
Pre-tagged tools	`--tags tags.json`
MCP discovery	`--mcp-json mcp.json --mcp-host <host>`
Trajectories	`--trajectories trajectories.json`
Scope constraints	`--scope scope.json`
Policy constraints	`--policy-global policy-global.json`, `--policy-servers policy-servers.json`, `--policy-tools policy-tools.json`
Server metadata	`--server-metadata server-metadata.json`
Saved report examples	`--max-save-static-paths <count>`
Job grouping metadata	`--domain <value>`

For MCP discovery, --mcp-host identifies the host environment for the MCP config, such as a supported coding-agent runtime. Use --allow-partial-mcp only when the eval should continue after recoverable MCP discovery failures.

Use --server-metadata when raw MCP or tool-inventory inputs need server-level context before tagging. The file maps server ids to short descriptions. It applies to raw MCP and tool-inventory inputs, not pre-tagged --tags inputs.
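As a rough illustration of a file that maps server ids to short descriptions, the Python sketch below writes a flat JSON object keyed by server id. The flat-object layout is an assumption based on the sentence above, not a documented schema, and the description text and the function name build_server_metadata are hypothetical.

```python
import json


def build_server_metadata(descriptions: dict[str, str]) -> str:
    """Serialize a server-id -> description mapping as pretty-printed JSON.

    Assumption: the metadata file is a flat JSON object keyed by server id.
    """
    for server_id, description in descriptions.items():
        if not server_id or not description.strip():
            raise ValueError(f"server '{server_id}' needs a non-empty id and description")
    return json.dumps(descriptions, indent=2, sort_keys=True)


if __name__ == "__main__":
    # "support-mcp" matches the server in the tools.json example.
    print(build_server_metadata({"support-mcp": "Customer support account tools"}))
```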

Use --max-save-static-paths to control how many static path and observed witness examples are saved in the downloadable report. Summary counts remain complete. Lower values keep large reports smaller.

Use --created-by <actor> when automation should set an explicit actor instead of using the local OS username.

Raw sources and pre-tagged sources can be combined in one evaluation. If the same tool identity appears in more than one source, the CLI rejects the submission so the report has one source of truth for each tool.
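To find such collisions before the CLI rejects a submission, a script can compare tool identities across sources. The Python sketch below assumes a tool identity is the pair (server, name), which this page does not define, and that each source has already been reduced to such pairs. The function name find_duplicate_tools and the reset_password tool are hypothetical.

```python
from collections import defaultdict


def find_duplicate_tools(
    sources: dict[str, list[tuple[str, str]]]
) -> dict[tuple[str, str], list[str]]:
    """Map each (server, name) pair found in more than one source to those sources."""
    seen: dict[tuple[str, str], list[str]] = defaultdict(list)
    for label, tools in sources.items():
        for identity in set(tools):  # ignore repeats within a single source
            seen[identity].append(label)
    return {
        identity: sorted(labels)
        for identity, labels in seen.items()
        if len(labels) > 1
    }


if __name__ == "__main__":
    print(find_duplicate_tools({
        "tools.json": [("support-mcp", "get_user_data"), ("support-mcp", "reset_password")],
        "tags.json": [("support-mcp", "get_user_data")],
    }))
```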

Inspect Eval Jobs

agentwarden eval status --job-id <job-id>

agentwarden eval list \
  --use-case-id <use-case-id> \
  --status completed

Fetch result metadata:

agentwarden eval results --job-id <job-id>

Download a report:

agentwarden eval results \
  --job-id <job-id> \
  --download \
  --report-format html \
  --output report.html

Draft, Compile, and Deploy Policy

Start from a completed eval job:

agentwarden policy draft submit \
  --eval-job-id <eval-job-id> \
  --classification classification.json \
  --presets presets.yaml \
  --use-case-context use-case-context.yaml \
  --wait

The classification, presets, and use-case context files are the review inputs. They should reflect the approved risk context for the use case.

Policy draft input options:

Input	CLI flags
Evaluation evidence	`--eval-job-id <eval-job-id>`
Risk classification	`--classification classification.json`
Organization presets	`--presets presets.yaml`
Use-case context	`--use-case-context use-case-context.yaml`
Existing policy seed	`--policy policy.yaml`
Duplicate-submit handling	`--idempotency-key <key>`

Inspect or download the draft before compiling:

agentwarden policy draft results --job-id <draft-job-id>

agentwarden policy draft results \
  --job-id <draft-job-id> \
  --download \
  --output draft-artifacts/

Compile the reviewed draft for a target runtime:

agentwarden policy compile submit \
  --draft-job-id <draft-job-id> \
  --target <runtime-target> \
  --wait

Compile can start from a completed draft job or a reviewed draft spec file. The target identifies the runtime policy format to produce.

If the draft was reviewed and edited as a file, compile from that file:

agentwarden policy compile submit \
  --use-case-id <use-case-id> \
  --draft-spec draft_spec.json \
  --target <runtime-target> \
  --wait

Compile input options:

Input	CLI flags
Draft job source	`--draft-job-id <draft-job-id>`
Draft file source	`--use-case-id <use-case-id> --draft-spec draft_spec.json`
Runtime target	`--target <runtime-target>`

Compile results include deployability and compatibility information for the selected runtime.
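Automation that supports both compile sources can keep them apart with a small argument builder. The Python sketch below only assembles the flags listed in the table above. Treating the two sources as mutually exclusive is an assumption, and the function name compile_submit_args is hypothetical.

```python
from typing import Optional


def compile_submit_args(target: str,
                        draft_job_id: Optional[str] = None,
                        use_case_id: Optional[str] = None,
                        draft_spec: Optional[str] = None,
                        wait: bool = True) -> list[str]:
    """Build argv for `agentwarden policy compile submit` from one draft source."""
    args = ["agentwarden", "policy", "compile", "submit"]
    from_job = draft_job_id is not None
    from_file = use_case_id is not None or draft_spec is not None
    # Assumption: exactly one source is allowed per submission.
    if from_job == from_file:
        raise ValueError("choose one source: a draft job id, or a use case id plus a draft spec file")
    if from_job:
        args += ["--draft-job-id", draft_job_id]
    else:
        if use_case_id is None or draft_spec is None:
            raise ValueError("the file source needs both use_case_id and draft_spec")
        args += ["--use-case-id", use_case_id, "--draft-spec", draft_spec]
    args += ["--target", target]
    if wait:
        args.append("--wait")
    return args
```

The returned list can be passed to subprocess.run. Building a list rather than a shell string avoids quoting problems with file paths.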

Preview a deploy:

agentwarden policy deploy \
  --compile-job-id <compile-job-id> \
  --dry-run

Commit the deploy:

agentwarden policy deploy \
  --compile-job-id <compile-job-id> \
  --yes

Deploy makes the reviewed policy available for runtime enforcement.

Deploy options:

Input	CLI flags
Compiled policy package	`--compile-job-id <compile-job-id>`
Preview only	`--dry-run`
Confirm activation	`--yes`
Audited unsupported-capability override	`--ignore-unsupported`

After deploy, choose the runtime integration path that will emit events and enforce decisions. See Runtime Integrations.
