The most comprehensive AI evaluation and security platform on earth. 170 attack plugins. 90 scorers. 86 providers. One platform, zero blind spots.
Integrates with 83+ LLM providers and frameworks
From evaluation to monitoring, EvalGuard covers the entire AI quality lifecycle.
Run 87 built-in scorers across faithfulness, relevance, toxicity, and more. Create custom LLM-as-judge evaluators. Catch regressions before your users do.
A complete toolkit built for teams who take AI quality seriously.
87 built-in scorers, custom LLM-as-judge evaluators, A/B testing, and CI/CD integration.
170 attack plugins across 42 strategies with OWASP LLM Top 10 compliance.
Full trace visualization with infinite loop detection and root cause analysis.
Real-time content filtering with under 5 ms of latency. Block prompt injections before they reach your model.
Real-time dashboards for latency, cost, quality drift, and anomaly detection.
Three steps. No infrastructure to manage.
# Install the SDK
npm install @evalguard/sdk

# Run an evaluation
npx evalguard eval --suite faithfulness \
  --model gpt-4o

# Add to CI/CD pipeline
npx evalguard gate --threshold 0.9
> All 87 scorers passed. Deploying...
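To make the gate step concrete, here is a minimal sketch of the kind of threshold check a command like `gate --threshold 0.9` might perform. It is not EvalGuard's actual implementation or SDK API: the `ScorerResult` and `GateDecision` types and the `evaluateGate` and `exitCodeFor` functions are hypothetical names introduced only for illustration, and the sketch assumes scores are normalized to the range 0 to 1.

```typescript
// Hypothetical result shape for one scorer; NOT the EvalGuard SDK's real types.
interface ScorerResult {
  scorer: string;
  score: number; // assumed to be normalized to [0, 1]
}

// Hypothetical outcome of a gate check.
interface GateDecision {
  passed: boolean;
  failures: ScorerResult[];
}

// Pass only if at least one scorer ran and every score meets the threshold.
function evaluateGate(results: ScorerResult[], threshold: number): GateDecision {
  if (threshold < 0 || threshold > 1) {
    throw new RangeError("threshold must be between 0 and 1");
  }
  // The negated comparison also treats NaN scores as failures.
  const failures = results.filter((r) => !(r.score >= threshold));
  // An empty result set does not pass: nothing was actually evaluated.
  return { passed: results.length > 0 && failures.length === 0, failures };
}

// Map the decision to a process exit code so a CI job can stop the deploy.
function exitCodeFor(decision: GateDecision): number {
  return decision.passed ? 0 : 1;
}
```

In a CI pipeline, a non-zero exit code from the gate step is what prevents the deploy step from running. The sketch treats an empty result set as a failure so that a misconfigured suite cannot pass silently; a real tool may make a different choice.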
Six products. One platform. Complete AI quality coverage.
Run 168+ attack plugins against your LLM app. Detect jailbreaks, data leaks, and prompt injection vulnerabilities in minutes, not weeks.
88+ built-in scorers for relevance, faithfulness, toxicity, bias, and hallucination. Run thousands of evaluations with one command.
OpenTelemetry-native tracing, drift detection, and anomaly alerts. Know when your model degrades before your users do.
Intelligent routing across 83+ providers with automatic failover, rate limiting, and semantic caching. Cut costs with smart routing.
Map findings to OWASP LLM Top 10, NIST AI RMF, MITRE ATLAS, EU AI Act, ISO 42001, HIPAA, PCI DSS, and FedRAMP.
Per-model cost tracking, budget alerts, and optimization recommendations. Know exactly where your AI budget goes.
Tailored workflows for every stakeholder in the AI pipeline.
Enterprise-grade security, compliance, and deployment options from day one.
Architecture designed for SOC 2 compliance with continuous monitoring and evidence collection.
Full data processing agreements with EU data residency options.
Enterprise identity providers with SCIM provisioning.
Run in your own cloud. AWS, GCP, and Azure supported.
Granular role-based access control with audit logging.
Guaranteed uptime with dedicated support and escalation paths.
Every feature is backed by comprehensive testing.
Start evaluating, securing, and monitoring your AI in production today.