Public commitment

Model coverage 24h after every GA.

EvalGuard supports every flagship model from OpenAI, Anthropic, Google, Mistral, and Meta within 24 hours of general availability (GA). This is a public commitment: any miss is tracked in the changelog.

Today

  • 91 typed providers in packages/core/src/providers/registry.ts — drift-checked at build time against the live registry.
  • All major frontier models: GPT-5.3, GPT-4o, Claude Opus 4.7, Claude Sonnet 4.6 (1M ctx), Gemini 2.5 Pro, Mistral Large, Llama 3.3 — supported on day of release.
  • All major regional providers: Cerebras, Groq, Fireworks, Together, DeepSeek, Moonshot, Zhipu, NVIDIA NIM, AWS Bedrock, Azure OpenAI, Google Vertex AI, Oracle Cloud GenAI.
  • Voice + multimodal: Whisper, ElevenLabs, OpenAI TTS, vision models on GPT-4o / Claude / Gemini.

The commitment

  • Day-0 support — when a flagship model goes GA, we ship a typed provider entry within 24 hours. If we miss, it's tracked at /changelog with cause and ETA.
  • Typed, not proxied — each provider has a real type-safe schema, not a config dictionary. You get autocompletion + compile-time errors when the provider's API changes, not runtime surprises.
  • Open source — Apache 2.0. Add a provider yourself in a PR and we'll merge within 48 hours if it has tests.
  • No proxy markup — we don't intermediate the API call or charge per-token. Self-hosted or BYOK; you pay the provider directly.
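
To illustrate the "typed, not proxied" point above, here is a minimal sketch of what a type-safe provider schema can look like. The `ProviderConfig` union, its fields, and `describeProvider` are hypothetical names chosen for this example; they are not taken from packages/core/src/providers/registry.ts.

```typescript
// Hypothetical shapes for illustration only; not EvalGuard's actual registry types.
type ProviderConfig =
  | { kind: "openai"; model: string; apiKey: string; organization?: string }
  | { kind: "anthropic"; model: string; apiKey: string; maxTokens: number }
  | { kind: "bedrock"; model: string; region: string };

// Exhaustive switch: adding a new `kind` to the union without handling it
// here makes the `never` assignment fail at compile time.
function describeProvider(config: ProviderConfig): string {
  switch (config.kind) {
    case "openai":
      return `openai:${config.model}`;
    case "anthropic":
      return `anthropic:${config.model} (max ${config.maxTokens} tokens)`;
    case "bedrock":
      return `bedrock:${config.model}@${config.region}`;
    default: {
      const unreachable: never = config;
      throw new Error(`Unhandled provider: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Placeholder model name and key, not real values.
const claude: ProviderConfig = {
  kind: "anthropic",
  model: "claude-example",
  apiKey: "sk-placeholder",
  maxTokens: 1024,
};

const label = describeProvider(claude);

// With a config dictionary such as Record<string, unknown>, a missing field
// is only noticed at runtime. With the union, the compiler rejects it:
// const broken: ProviderConfig = { kind: "anthropic", model: "m", apiKey: "k" };
// (error: property 'maxTokens' is missing)
```

The design choice shown here is a discriminated union keyed on `kind`, so editor autocompletion narrows to the fields valid for each provider.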

Methodology

The figure of 91 is hand-maintained in packages/core/src/counts.ts and verified against the live registry by scripts/verify-counts.cjs in CI. If the marketing claim drifts from the code, the build fails. The same drift check protects every other published count (188 scorers, 249 attack plugins, 33 compliance frameworks).
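
The drift check described above can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of scripts/verify-counts.cjs: the `Counts` type, the `findDrift` function, and the three-entry sample registry are all hypothetical.

```typescript
// Hypothetical stand-ins: the real counts live in packages/core/src/counts.ts
// and the real check is scripts/verify-counts.cjs; neither is reproduced here.
type Counts = Record<string, number>;

// Compare each claimed count with the count derived from the code.
// Returns one message per mismatch; an empty array means no drift.
function findDrift(claimed: Counts, actual: Counts): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(claimed)) {
    const claimedValue = claimed[name];
    const actualValue = actual[name];
    if (actualValue === undefined) {
      problems.push(`${name}: claimed ${claimedValue}, but no source to count`);
    } else if (actualValue !== claimedValue) {
      problems.push(`${name}: claimed ${claimedValue}, found ${actualValue}`);
    }
  }
  return problems;
}

// Illustrative data only: a tiny fake registry instead of the real one.
const sampleRegistry = ["openai", "anthropic", "google"];
const claimedCounts: Counts = { providers: 91, scorers: 188 };
const actualCounts: Counts = { providers: sampleRegistry.length, scorers: 188 };

const drift = findDrift(claimedCounts, actualCounts);
// In CI, a non-empty `drift` would be reported and the build would fail.
```

Deriving the actual count from the registry itself, rather than from a second hand-maintained number, is what lets a check like this catch a stale claim.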

Provider integration tests live at apps/web/src/__tests__/integration/llm-*.test.ts and run nightly against real provider APIs (they are skipped when API keys are absent; see the Test Workflow on GitHub Actions).
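
The skip-when-keys-are-absent behaviour can be pictured with a small sketch. Everything here is hypothetical: `planNightlyRun`, the `ProviderSuite` shape, and the environment variable names are assumptions made for illustration, and the real tests may gate on keys differently.

```typescript
// Hypothetical sketch: names and env variables are assumptions, not EvalGuard's code.
type Env = Record<string, string | undefined>;

interface ProviderSuite {
  provider: string;
  requiredEnvVar: string;
}

interface RunPlan {
  run: string[];
  skipped: string[];
}

// Decide which provider suites can run given the keys present in `env`.
function planNightlyRun(suites: ProviderSuite[], env: Env): RunPlan {
  const plan: RunPlan = { run: [], skipped: [] };
  for (const suite of suites) {
    const key = env[suite.requiredEnvVar];
    // A missing or blank key means the suite is skipped rather than failed.
    if (key !== undefined && key.trim() !== "") {
      plan.run.push(suite.provider);
    } else {
      plan.skipped.push(suite.provider);
    }
  }
  return plan;
}

const exampleSuites: ProviderSuite[] = [
  { provider: "openai", requiredEnvVar: "OPENAI_API_KEY" },
  { provider: "anthropic", requiredEnvVar: "ANTHROPIC_API_KEY" },
];

// Only one placeholder key is supplied, so only one suite is scheduled.
const nightlyPlan = planNightlyRun(exampleSuites, { OPENAI_API_KEY: "sk-placeholder" });
```

Skipping instead of failing keeps forks and contributor PRs green when secrets are unavailable, at the cost that a skipped suite proves nothing about that provider.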

Try EvalGuard Free

Want a model added? Open an issue or send a PR.