CLI

The AgentWarden CLI drives build-time workflows: managing use cases, running evaluations, and drafting, compiling, and deploying reviewed policies.

Start here when you want to run build-time evaluation and policy workflows from the command line.

The CLI is a thin client for the AgentWarden API. It validates local inputs, submits remote jobs, and lets AgentWarden run evaluation and policy workflows on the server side.

Access and Setup

Contact the Dynamo AI support team for the AgentWarden CLI package or install command for your environment.

After installation, configure the CLI with an AgentWarden server URL and a build-time API key:

agentwarden setup \
--api-key <agentwarden-api-key> \
--server https://agentwarden.example.com

JSON output is the default for scripts. Add --format human when you want terminal-friendly output during an interactive session.
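
Because JSON is the default output, a script can parse the CLI's stdout directly. The following is a minimal Python sketch, assuming the command prints a single JSON document to stdout and exits nonzero on failure (this page confirms neither detail). `run_cli` and its `runner` parameter are hypothetical helper names, not part of the CLI.

```python
import json
import subprocess


def run_cli(args, runner=subprocess.run):
    """Run `agentwarden <args>` and return its stdout parsed as JSON.

    Assumption: the CLI prints exactly one JSON document to stdout.
    `runner` is injectable so the helper can be exercised without the CLI.
    """
    completed = runner(
        ["agentwarden", *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a nonzero exit code
    )
    return json.loads(completed.stdout)
```

For example, `run_cli(["use-case", "list"])` would return whatever JSON structure the list command emits.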

Core Workflow

The build-time workflow usually follows this path:

create use case -> submit eval -> review results -> draft policy -> compile policy -> deploy policy
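
The path above can be written down as an ordered list that a pipeline script steps through. This is only an illustrative sketch: the command strings come from the Command Map on this page, the pairing of "review results" with `agentwarden eval results` is an assumption, and `WORKFLOW` and `next_stage` are hypothetical names.

```python
# Ordered build-time stages and the CLI command that starts each one.
# The "review results" pairing is an assumption made for this sketch.
WORKFLOW = [
    ("create use case", "agentwarden use-case create"),
    ("submit eval", "agentwarden eval submit"),
    ("review results", "agentwarden eval results"),
    ("draft policy", "agentwarden policy draft submit"),
    ("compile policy", "agentwarden policy compile submit"),
    ("deploy policy", "agentwarden policy deploy"),
]


def next_stage(current):
    """Return the stage that follows `current`, or None after the last one."""
    names = [name for name, _ in WORKFLOW]
    index = names.index(current)  # raises ValueError for an unknown stage
    return names[index + 1] if index + 1 < len(names) else None
```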

For the product flow behind these commands, see Build-Time Flow. For the meaning of evaluation inputs and reports, see Static Evaluation. For how draft, compile, and deploy fit together, see Policy Workflow.

Command Map

| Area | Commands | Use for |
| --- | --- | --- |
| Configure | agentwarden setup, agentwarden status | Configure local CLI access and verify connectivity. |
| Use cases | agentwarden use-case create, agentwarden use-case list, agentwarden use-case show | Create and inspect the unit that groups an agent, tools, evaluations, and active policy. |
| Evaluation jobs | agentwarden eval submit, agentwarden eval status, agentwarden eval results, agentwarden eval list | Submit tool/MCP evidence, track eval jobs, and fetch reports. |
| Policy draft jobs | agentwarden policy draft submit, agentwarden policy draft status, agentwarden policy draft results, agentwarden policy draft list | Turn a completed eval into a reviewable policy draft. |
| Policy compile jobs | agentwarden policy compile submit, agentwarden policy compile status, agentwarden policy compile results, agentwarden policy compile list | Compile a reviewed draft for a target runtime and check deployability. |
| Policy deploy | agentwarden policy deploy | Preview or activate the compiled policy for runtime enforcement. |

Evaluation, draft, and compile are job-shaped command groups: submit work, check status, fetch results, and list prior jobs. Deploy is a synchronous action because it changes the active runtime policy.
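
The submit/status/results shape lends itself to a polling loop when a script wants to do other work instead of passing `--wait`. The sketch below is hedged: it assumes the status command's JSON has a top-level `status` field, and that `failed` is a terminal value alongside `completed` (only `completed` appears on this page). `wait_for_job` and `fetch_status` are hypothetical names.

```python
import time


def wait_for_job(fetch_status, job_id, poll_seconds=5.0, max_polls=120,
                 terminal=("completed", "failed"), sleep=time.sleep):
    """Poll a job until it reaches a terminal status, then return that status.

    fetch_status(job_id) must return a dict with a "status" key (assumed shape).
    Raises TimeoutError if the job is still running after max_polls checks.
    """
    for _ in range(max_polls):
        status = fetch_status(job_id)["status"]
        if status in terminal:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")
```

In practice `fetch_status` would wrap a call such as `agentwarden eval status --job-id <job-id>` and parse its JSON output.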

Create a Use Case

agentwarden use-case create \
--name "Support Bot" \
--description "Customer support assistant"

Most eval and policy commands use the returned use_case_id to group jobs and artifacts.
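
A script can capture that id from the create command's JSON output and pass it to later commands. This is a minimal sketch, assuming `use_case_id` is a top-level field of the printed JSON (the exact output shape is not documented on this page); `extract_use_case_id` is a hypothetical helper.

```python
import json


def extract_use_case_id(create_stdout):
    """Pull use_case_id out of the JSON printed by `use-case create`.

    Assumption: the id is a top-level "use_case_id" field of the output.
    """
    payload = json.loads(create_stdout)
    try:
        return payload["use_case_id"]
    except (KeyError, TypeError) as exc:
        raise ValueError("no use_case_id in CLI output") from exc
```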

You can inspect use cases later:

agentwarden use-case list
agentwarden use-case show --use-case-id <use-case-id>

Submit an Eval

Submit raw tool inventory:

agentwarden eval submit \
--use-case-id <use-case-id> \
--tool-inventory tools.json

Example tools.json:

[
  {
    "server": "support-mcp",
    "name": "get_user_data",
    "doc": "Fetch a user profile including private email, address, and account history.",
    "parameters": [
      {"name": "user_id", "type": "string", "description": "Internal user id"}
    ]
  }
]
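
If the inventory is generated rather than written by hand, a small script can emit the same shape. The sketch below uses only the fields shown in the example above (`server`, `name`, `doc`, `parameters`); whether other fields are accepted is not stated here, and `make_tool` and `inventory_json` are hypothetical helper names.

```python
import json


def make_tool(server, name, doc, parameters=()):
    """Build one inventory entry using the fields shown in the example.

    `parameters` is an iterable of (name, type, description) tuples.
    """
    return {
        "server": server,
        "name": name,
        "doc": doc,
        "parameters": [
            {"name": p_name, "type": p_type, "description": p_desc}
            for p_name, p_type, p_desc in parameters
        ],
    }


def inventory_json(tools):
    """Serialize a list of tool entries as a JSON array."""
    return json.dumps(list(tools), indent=2)
```

Writing the result of `inventory_json(...)` to `tools.json` produces a file that can be passed to `--tool-inventory`.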

You can also submit pre-tagged tools, discover tools from an MCP config, or include observed trajectories:

agentwarden eval submit \
--use-case-id <use-case-id> \
--tool-inventory tools.json \
--trajectories trajectories.json \
--wait

Common evaluation input modes:

| Input mode | CLI flags |
| --- | --- |
| Raw tool inventory | --tool-inventory tools.json |
| Pre-tagged tools | --tags tags.json |
| MCP discovery | --mcp-json mcp.json --mcp-host <host> |
| Trajectories | --trajectories trajectories.json |
| Scope constraints | --scope scope.json |
| Policy constraints | --policy-global policy-global.json, --policy-servers policy-servers.json, --policy-tools policy-tools.json |
| Server metadata | --server-metadata server-metadata.json |
| Saved report examples | --max-save-static-paths <count> |
| Job grouping metadata | --domain <value> |

For MCP discovery, --mcp-host identifies the host environment for the MCP config, such as a supported coding-agent runtime. Use --allow-partial-mcp only when the eval should continue after recoverable MCP discovery failures.

Use --server-metadata when raw MCP or tool-inventory inputs need server-level context before tagging. The file maps server ids to short descriptions. It applies to raw MCP and tool-inventory inputs, not pre-tagged --tags inputs.
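
A metadata file can be generated from the servers that actually appear in a raw inventory. This sketch assumes the file is a flat JSON object keyed by server id, which is one reading of "maps server ids to short descriptions"; the real schema may differ. `server_metadata_json` is a hypothetical helper.

```python
import json


def server_metadata_json(descriptions, inventory):
    """Serialize {server id: short description} for servers in the inventory.

    Assumption: the metadata file is a flat JSON object keyed by server id.
    Raises ValueError when a server in the inventory has no description.
    """
    servers = sorted({tool["server"] for tool in inventory})
    missing = [server for server in servers if server not in descriptions]
    if missing:
        raise ValueError(f"missing descriptions for: {', '.join(missing)}")
    return json.dumps({server: descriptions[server] for server in servers}, indent=2)
```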

Use --max-save-static-paths to control how many static path and observed witness examples are saved in the downloadable report. Summary counts remain complete. Lower values keep large reports smaller.

Use --created-by <actor> when automation should set an explicit actor instead of using the local OS username.
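
When submissions are scripted, it can help to assemble the argument list in one place. The following sketch maps keyword options to the flags listed on this page; it does not validate flag combinations, since the CLI's own rules are not described here. `build_eval_submit_args` is a hypothetical helper name.

```python
# Options that take a value, named after the flags documented on this page.
_VALUE_OPTIONS = {
    "tool_inventory", "tags", "mcp_json", "mcp_host", "trajectories", "scope",
    "policy_global", "policy_servers", "policy_tools", "server_metadata",
    "max_save_static_paths", "domain", "created_by",
}


def build_eval_submit_args(use_case_id, *, wait=False, allow_partial_mcp=False,
                           **options):
    """Return the argv list for `agentwarden eval submit`.

    Keyword names map to flags by replacing "_" with "-", for example
    tool_inventory="tools.json" becomes --tool-inventory tools.json.
    """
    args = ["agentwarden", "eval", "submit", "--use-case-id", use_case_id]
    for key, value in options.items():
        if key not in _VALUE_OPTIONS:
            raise ValueError(f"unknown option: {key}")
        if value is not None:
            args += ["--" + key.replace("_", "-"), str(value)]
    if allow_partial_mcp:
        args.append("--allow-partial-mcp")
    if wait:
        args.append("--wait")
    return args
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting issues with file paths.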

Raw sources and pre-tagged sources can be combined in one evaluation. If the same tool identity appears in more than one source, the CLI rejects the submission so the report has one source of truth for each tool.
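
A script can run a similar duplicate check before submitting, so the conflict surfaces early with the offending file names. This is a sketch under two assumptions that this page does not confirm: a tool's identity is its `(server, name)` pair, and every source exposes those two keys per tool. `find_duplicate_tools` is a hypothetical helper.

```python
from collections import defaultdict


def find_duplicate_tools(sources):
    """Return {(server, name): [source labels]} for tools seen in 2+ sources.

    `sources` maps a label (for example a file name) to a list of tool dicts.
    Assumption: a tool's identity is its (server, name) pair.
    """
    seen = defaultdict(set)
    for label, tools in sources.items():
        for tool in tools:
            seen[(tool["server"], tool["name"])].add(label)
    return {
        identity: sorted(labels)
        for identity, labels in seen.items()
        if len(labels) > 1
    }
```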

Inspect Eval Jobs

agentwarden eval status --job-id <job-id>

agentwarden eval list \
--use-case-id <use-case-id> \
--status completed

Fetch result metadata:

agentwarden eval results --job-id <job-id>

Download a report:

agentwarden eval results \
--job-id <job-id> \
--download \
--report-format html \
--output report.html

Draft, Compile, and Deploy Policy

Start from a completed eval job:

agentwarden policy draft submit \
--eval-job-id <eval-job-id> \
--classification classification.json \
--presets presets.yaml \
--use-case-context use-case-context.yaml \
--wait

These review inputs should reflect the approved risk context for the use case.

Policy draft input options:

| Input | CLI flags |
| --- | --- |
| Evaluation evidence | --eval-job-id <eval-job-id> |
| Risk classification | --classification classification.json |
| Organization presets | --presets presets.yaml |
| Use-case context | --use-case-context use-case-context.yaml |
| Existing policy seed | --policy policy.yaml |
| Duplicate-submit handling | --idempotency-key <key> |
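
For `--idempotency-key`, one common approach is to derive the key from the inputs so that a retried submission reuses the same value. The sketch below assumes the key is an opaque caller-chosen string; how the server matches or expires keys is not described on this page. `idempotency_key` and the `draft-` prefix are hypothetical.

```python
import hashlib


def idempotency_key(eval_job_id, *input_blobs):
    """Derive a stable key from the eval job id and the bytes of each input file.

    Identical inputs always produce the same key; any change produces a new one.
    """
    digest = hashlib.sha256()
    for part in (eval_job_id.encode("utf-8"), *input_blobs):
        # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return "draft-" + digest.hexdigest()[:32]
```

A caller would read `classification.json`, `presets.yaml`, and any other inputs as bytes and pass them in a fixed order.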

Inspect or download the draft before compiling:

agentwarden policy draft results --job-id <draft-job-id>

agentwarden policy draft results \
--job-id <draft-job-id> \
--download \
--output draft-artifacts/

Compile the reviewed draft for a target runtime:

agentwarden policy compile submit \
--draft-job-id <draft-job-id> \
--target <runtime-target> \
--wait

Compile can start from a completed draft job or a reviewed draft spec file. The target identifies the runtime policy format to produce.

If the draft was reviewed and edited as a file, compile from that file:

agentwarden policy compile submit \
--use-case-id <use-case-id> \
--draft-spec draft_spec.json \
--target <runtime-target> \
--wait

Compile input options:

| Input | CLI flags |
| --- | --- |
| Draft job source | --draft-job-id <draft-job-id> |
| Draft file source | --use-case-id <use-case-id> --draft-spec draft_spec.json |
| Runtime target | --target <runtime-target> |
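
Since compile takes either a draft job or a draft spec file, a wrapper script can enforce that choice before calling the CLI. The guard in this sketch is a design choice of the example; whether the CLI itself rejects mixed sources is not stated here. `build_compile_args` is a hypothetical helper.

```python
def build_compile_args(target, *, draft_job_id=None, use_case_id=None,
                       draft_spec=None, wait=False):
    """Return the argv list for `agentwarden policy compile submit`.

    Exactly one source is allowed: a draft job id, or a use case id plus a
    draft spec file. This restriction belongs to the sketch, not the CLI.
    """
    from_job = draft_job_id is not None
    from_file = use_case_id is not None or draft_spec is not None
    if from_job == from_file:
        raise ValueError("choose exactly one source: draft job or draft spec file")

    args = ["agentwarden", "policy", "compile", "submit"]
    if from_job:
        args += ["--draft-job-id", draft_job_id]
    else:
        if use_case_id is None or draft_spec is None:
            raise ValueError("a draft spec file needs both use_case_id and draft_spec")
        args += ["--use-case-id", use_case_id, "--draft-spec", draft_spec]
    args += ["--target", target]
    if wait:
        args.append("--wait")
    return args
```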

Compile results include deployability and compatibility information for the selected runtime.

Preview a deploy:

agentwarden policy deploy \
--compile-job-id <compile-job-id> \
--dry-run

Commit the deploy:

agentwarden policy deploy \
--compile-job-id <compile-job-id> \
--yes

Deploy makes the reviewed policy available for runtime enforcement.

Deploy options:

| Input | CLI flags |
| --- | --- |
| Compiled policy package | --compile-job-id <compile-job-id> |
| Preview only | --dry-run |
| Confirm activation | --yes |
| Audited unsupported-capability override | --ignore-unsupported |
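
In automation, a preview-first default reduces the chance of activating a policy by accident. The sketch below builds the deploy arguments and treats `--dry-run` and `--yes` as alternatives; that pairing is an assumption of the example rather than documented CLI behavior. `build_deploy_args` is a hypothetical helper.

```python
def build_deploy_args(compile_job_id, *, dry_run=True, ignore_unsupported=False):
    """Return the argv list for `agentwarden policy deploy`.

    Defaults to a preview (--dry-run). Pass dry_run=False to add --yes and
    activate the policy. Treating the two flags as alternatives is a choice
    made in this sketch.
    """
    args = ["agentwarden", "policy", "deploy", "--compile-job-id", compile_job_id]
    args.append("--dry-run" if dry_run else "--yes")
    if ignore_unsupported:
        args.append("--ignore-unsupported")
    return args
```

A pipeline might run the preview arguments first, inspect the output, and only then run the same call with `dry_run=False`.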

After deploy, choose the runtime integration path that will emit events and enforce decisions. See Runtime Integrations.