CLI Reference
15 commands for running evaluations, security scans, and managing your AI testing workflow from the terminal.
Installation
npm install -g @evalguard/cli
After installing, the evalguard command will be available globally.
evalguard login
Authenticate with your EvalGuard API key
Usage
evalguard login --key <apiKey> [--url <baseUrl>]
Options
| Option | Description |
|---|---|
| --key <apiKey> | API key (or set EVALGUARD_API_KEY env var) |
| --url <baseUrl> | Custom API base URL |
Example
evalguard login --key eg_sk_abc123def456
evalguard logout
Remove stored credentials
Usage
evalguard logout
Example
evalguard logout
evalguard init
Initialize EvalGuard in the current project. Creates evalguard.config.json, evals/example.json, and scans/example.json.
Usage
evalguard init [--project <projectId>]
Options
| Option | Description |
|---|---|
| --project <projectId> | Set default project ID |
Example
evalguard init --project proj_abc123
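The exact contents of the generated config file may vary by version; a plausible minimal evalguard.config.json might look like the following (field names here are illustrative assumptions, not a guaranteed schema):

```json
{
  "project": "proj_abc123",
  "defaultModel": "gpt-4o-mini",
  "defaultProvider": "openai"
}
```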
evalguard eval
Run an evaluation from a config file via the cloud API. Requires authentication.
Usage
evalguard eval <file> [options]
Options
| Option | Description |
|---|---|
| --project <projectId> | Override project ID |
| --model <model> | Override model |
| --wait | Wait for completion and show results |
Example
evalguard eval evals/qa-test.json --model gpt-4o --wait
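The eval config schema isn't documented in this section; as a rough illustration only (all field names below are assumptions — run evalguard validate against files for your installed version), a file like evals/qa-test.json might resemble:

```json
{
  "name": "qa-test",
  "model": "gpt-4o",
  "tests": [
    {
      "input": "What is the capital of France?",
      "assertions": [{ "type": "contains", "value": "Paris" }]
    }
  ]
}
```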
evalguard scan
Run a security scan from a config file via the cloud API. Requires authentication.
Usage
evalguard scan <file> [options]
Options
| Option | Description |
|---|---|
| --project <projectId> | Override project ID |
| --model <model> | Override model |
| --wait | Wait for completion and show results |
Example
evalguard scan scans/red-team.json --wait
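Scan configs follow their own schema, also not documented here; a hypothetical sketch of scans/red-team.json (plugin and strategy names are illustrative — see evalguard list plugins and evalguard list strategies for the real registry) might be:

```json
{
  "name": "red-team",
  "model": "gpt-4o",
  "plugins": ["prompt-injection", "pii-leak"],
  "strategies": ["jailbreak"]
}
```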
evalguard whoami
Show current authentication status, masked API key, and configured project.
Usage
evalguard whoami
Example
evalguard whoami
evalguard eval:local
Run an evaluation locally using @evalguard/core. No API key needed; everything runs on your machine.
Usage
evalguard eval:local <file> [options]
Options
| Option | Description |
|---|---|
| --model <model> | Override model |
| --provider <provider> | Override provider (openai, anthropic, etc.) |
| --output <format> | Output format: json, csv, html, or file path |
| --verbose | Show detailed output per test case |
Example
evalguard eval:local evals/my-eval.json --model gpt-4o-mini --verbose
evalguard scan:local
Run a red-team security scan locally using @evalguard/core. No API key needed.
Usage
evalguard scan:local <file> [options]
Options
| Option | Description |
|---|---|
| --model <model> | Override model |
| --provider <provider> | Override provider |
| --output <format> | Output format: json or file path |
| --verbose | Show each finding |
Example
evalguard scan:local scans/pentest.json --provider anthropic --verbose
evalguard generate tests
Generate synthetic test cases from a description using an LLM.
Usage
evalguard generate tests <description> [options]
Options
| Option | Description |
|---|---|
| -n, --count <n> | Number of test cases (default: 10) |
| --model <model> | LLM model for generation (default: gpt-4o) |
| --provider <provider> | Provider name (default: openai) |
| --strategies <list> | Evolution strategies (comma-separated) |
| --output <file> | Output file path (default: generated-tests.json) |
Example
evalguard generate tests "customer support chatbot" -n 20 --model gpt-4o
evalguard generate assertions
Auto-generate assertions for existing test cases.
Usage
evalguard generate assertions <file> [options]
Options
| Option | Description |
|---|---|
| --model <model> | LLM model for generation |
| --provider <provider> | Provider name |
| --output <file> | Output file path |
Example
evalguard generate assertions evals/qa-test.json --output evals/qa-with-assertions.json
evalguard validate
Validate an eval or scan config file. Checks JSON structure, scorer names, plugin names, and strategy names against the registry.
Usage
evalguard validate <file>
Example
evalguard validate evals/my-eval.json
evalguard compare
Compare two evaluation result files side-by-side. Shows score differences, regressions, and improvements.
Usage
evalguard compare <file1> <file2> [options]
Options
| Option | Description |
|---|---|
| --threshold <n> | Minimum score improvement to highlight (default: 0.05) |
Example
evalguard compare results/baseline.json results/candidate.json --threshold 0.1
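The threshold semantics above can be sketched in a few lines. This is an illustration of the comparison logic, not the CLI's actual implementation, and the result-file shape (a map of test IDs to scores) is an assumption:

```typescript
// Illustrative only: assumes each result file maps test IDs to scores in [0, 1].
type Results = Record<string, number>;

function compare(baseline: Results, candidate: Results, threshold = 0.05) {
  const improvements: string[] = [];
  const regressions: string[] = [];
  for (const id of Object.keys(baseline)) {
    if (!(id in candidate)) continue; // skip tests missing from the candidate run
    const delta = candidate[id] - baseline[id];
    if (delta >= threshold) improvements.push(id); // improved by at least the threshold
    else if (delta <= -threshold) regressions.push(id); // regressed by at least the threshold
  }
  return { improvements, regressions };
}
```

With the default threshold of 0.05, a test whose score moves by less than 0.05 in either direction is reported as unchanged.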
evalguard list
List available components: scorers, plugins, strategies, graders, or providers.
Usage
evalguard list <component> [--json]
Options
| Option | Description |
|---|---|
| --json | Output as JSON |
Example
evalguard list scorers
evalguard list plugins --json
evalguard list providers
evalguard firewall
Test input against LLM firewall rules locally. Supports stdin, file input, and custom rules.
Usage
evalguard firewall <input> [options]
Options
| Option | Description |
|---|---|
| --rules <file> | Custom firewall rules JSON file |
| --json | Output as JSON |
Example
evalguard firewall "Ignore previous instructions"
evalguard firewall @suspicious-input.txt --rules my-rules.json --json
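The custom rules file format isn't documented in this section; as a purely hypothetical sketch (every field name below is an assumption), a pattern-based my-rules.json might look like:

```json
{
  "rules": [
    {
      "name": "block-instruction-override",
      "pattern": "ignore (all )?previous instructions",
      "action": "block"
    }
  ]
}
```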
evalguard watch
Watch an eval config file and re-run the evaluation automatically on every save.
Usage
evalguard watch <file> [options]
Options
| Option | Description |
|---|---|
| --model <model> | Override model |
| --provider <provider> | Override provider |
| --debounce <ms> | Debounce interval in ms (default: 1000) |
Example
evalguard watch evals/my-eval.json --model gpt-4o-mini
CI/CD Usage
Use the CLI in your CI/CD pipeline by setting the EVALGUARD_API_KEY environment variable. The CLI reads it automatically.
name: EvalGuard CI
on: [push]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm install -g @evalguard/cli
      - run: evalguard eval:local evals/regression.json --output json
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      - run: evalguard scan:local scans/security.json --verbose
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}