Documentation
Everything you need to evaluate, secure, and monitor your AI applications with EvalGuard.
Get started
Install, migrate, and run your first eval.
API & SDKs
Reference for the REST API, CLI, and three first-party SDKs.
API Reference
307 REST endpoints for evals, security, gateway, traces, datasets, and more.
CLI Reference
33 commands: eval, scan, firewall, gateway, models-scan, shadow-ai, siem, debug, BYOK, budgets, and more.
TypeScript SDK
93+ methods covering evals, security, traces, gateway, cost, compliance, and more.
Python SDK
99+ methods with full parity — evals, scans, traces, OTLP, Shadow AI, AI-SPM.
Go SDK
93+ methods for Go backends — evals, security, gateway, monitoring, compliance.
gRPC Gateway
Connect-RPC service over HTTP/1.1 + JSON. Same authoritative gateway logic with a strongly-typed contract.
Catalogs
What ships in the box — scorers, attack plugins, and provider adapters.
Scorers
188 built-in scorers across 8 categories: accuracy, relevance, safety, quality, RAG, agent, cost, custom.
Attack Plugins
249 red-team plugins across 42 adversarial strategies.
Providers
91 LLM providers including OpenAI, Anthropic, Gemini, Bedrock, Azure, Vertex, Groq, and more.
Dataset Versioning
Immutable per-dataset snapshots. Pin experiments to a frozen version for bit-perfect reproducible re-runs.
Fine-Tuning
Cross-provider fine-tuning ledger. Pin training to immutable dataset snapshots; track jobs across OpenAI, Anthropic, Vertex, and more.
Integrations & Ops
Wire EvalGuard into your stack and deploy on your terms.
Integrations
15 integrations — Slack, Discord, Teams, PagerDuty, Jira, Linear, GitHub Actions, GitLab CI, and more.
OpenTelemetry
Point any OTLP/HTTP exporter at EvalGuard. Traces, metrics, and logs — no agent install.
Self-Hosting
Docker Compose, Kubernetes, and Helm deployment guides.
MCP Vendor Presets
One-click registration for 12 popular MCP servers: GitHub, Slack, Atlassian, Linear, Notion, Figma, Stripe, Sentry, PagerDuty, Postgres, Cloudflare, Datadog.
Agent Graph View
Node-and-edge DAG of agent execution — third view mode on the trace detail page.
ClickHouse OLAP
Opt-in dual-write of traces to ClickHouse for sub-second multi-day rollups. Postgres stays the OLTP authority.
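For the OpenTelemetry entry above, pointing an OTLP/HTTP exporter at a backend usually comes down to the standard OpenTelemetry SDK environment variables. The fragment below is a minimal sketch, not EvalGuard's documented setup: the endpoint URL is a hypothetical placeholder, and passing the API key as a Bearer token in the Authorization header is an assumption. Check the OpenTelemetry page for the real values.

```shell
# Hypothetical endpoint: replace with the OTLP URL from your EvalGuard workspace.
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.evalguard.example"

# Standard OpenTelemetry setting: send OTLP over HTTP rather than gRPC.
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"

# Assumed auth scheme (not confirmed by this page): API key as a Bearer token.
# Header values are URL-encoded, so the space after "Bearer" becomes %20.
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer%20eg_your_api_key"
```

With a base endpoint set this way, OpenTelemetry SDKs append the per-signal paths (/v1/traces, /v1/metrics, /v1/logs) themselves, so one variable covers traces, metrics, and logs.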
Governance
Map findings to compliance frameworks.
Quick Start
Get running with EvalGuard in three commands.
npm install @evalguard/sdk
npx @evalguard/cli login --key eg_your_api_key
npx @evalguard/cli eval my-eval.json
Full getting started guide