What's new

Shipping every week. 

New features, improvements, and fixes shipped to EvalGuard. Read about what we've built recently.

v1.1.0March 2026

NL Pipeline & Adaptive Red Teaming

Two industry-first features that no competitor has. Describe your app in plain English to generate a complete eval suite, and let an AI attacker adapt in real-time to find vulnerabilities static tests miss.

NL→Eval Pipeline

Describe your AI app in natural language. EvalGuard's proprietary pipeline analyzes your app profile, maps domain-specific risks, generates targeted test cases, and assembles a production-ready evaluation config — powered by multi-model orchestration across 89 providers

Adaptive Multi-Turn Red Teaming

AI-powered attacker that adapts in real-time using UCB1 bandit algorithm. Runs parallel sessions across 43 strategies × 14 categories, learns from each response, and builds a complete resistance profile

25,000+ test blocks across 457 files

Comprehensive test coverage across all 6 products with end-to-end, integration, and unit tests ensuring production reliability

10 Security Audit Fixes

Hardened authentication, authorization, input validation, and API security based on comprehensive security audit findings

Added
  • NL→Eval Pipeline — describe your AI app in plain English, get a complete evaluation suite in seconds
  • Adaptive Multi-Turn Red Teaming with UCB1 bandit optimization and parallel attack sessions
  • Swagger API Documentation covering all 307 API endpoints
  • Cross-session memory for red teaming attack strategies
  • Real-time resistance profiling dashboard
Improved
  • Test suite expanded to 25,000+ describe/it blocks across 457 test files
  • Red teaming now supports up to 15 conversation turns per session
  • 43 attack strategies × 14 vulnerability categories coverage
  • 89 LLM provider support with intelligent orchestration for NL pipeline
Security
  • 10 security audit fixes across authentication and authorization
  • Hardened input validation on all API endpoints
  • Improved API key scoping and permission enforcement
  • Enhanced CSRF and rate limiting protections
v1.0.0March 2026

Launch Release

The most comprehensive AI evaluation, security, and governance platform. Six products, one platform, zero blind spots.

138 Evaluation Scorers

Accuracy, faithfulness, hallucination, bias, toxicity, coherence, and more — the most comprehensive scorer library available

249 Security Attack Plugins

Prompt injection, jailbreaking, PII extraction, data exfiltration, and 245 more adversarial test types

43 Attack Strategies

Multi-turn, crescendo, tree-of-attacks, semantic variations, and more sophisticated red teaming strategies

89 LLM Providers

OpenAI, Anthropic, Google, AWS Bedrock, Azure, Mistral, Cohere, and 80 more — all through a unified API

Added
  • Eval Engine — Run evaluations with 166 scorers across accuracy, safety, bias, compliance, and custom metrics
  • LLM Gateway — Centralized AI traffic management with policy enforcement, rate limiting, and automatic failover
  • FinOps Dashboard — Real-time cost tracking, budget alerts, and optimization recommendations across all providers
  • Observability Platform — Production monitoring with real-time dashboards, alerting, and distributed tracing
  • Prompt IDE — Version-controlled prompt engineering with A/B testing, diff views, and deployment pipelines
  • Red Teaming-as-a-Service — Automated adversarial testing with 249 attack plugins and 43 strategies
  • EU AI Act auto-risk classification with compliance dashboard and evidence collection
  • ISO 42001 and SOC 2 readiness tracking with automated controls mapping
  • Scheduled continuous evaluations with cron-based automation
  • Enterprise SSO/SAML integration for single sign-on
  • Feature flags system for gradual rollouts and A/B testing
  • Webhook delivery with HMAC-SHA256 signing and automatic retries
  • Full API and SDK support (TypeScript, Python) for CI/CD integration
  • Self-hostable via Docker Compose and Helm charts
  • Annotation workflows with human-in-the-loop evaluation queues
  • Dataset versioning with diff tracking and lineage
  • Benchmark suites for standardized model comparison
  • Audit logging for compliance and security monitoring
Security
  • Row-Level Security (RLS) isolation across all database tables
  • AES-256 encryption at rest for all sensitive data including API keys
  • TLS 1.2+ encryption for all data in transit
  • CSRF protection on all state-changing endpoints
  • Rate limiting with configurable per-endpoint thresholds
  • API key scoping with granular permission controls

Engineering changelog

Every commit, grouped by week and conventional-commit type. Auto-generated from git on every release. 1,240 changes across 11 weeks.

2026-W21

May 18 – May 24, 2026

70 changes
Features
  • auditPhase E — P0/P1/P2 hardening (e2e creds + CLI dry-run + lib hardening)#358
  • auditPhase D — depth fixes (policy keys, worker race, Terraform/Java/Semantic, trace-id)#357
  • auditPhase C — distribution wedge (GitHub App + CLI + pricing DB + scan-model)#356
  • auditPhase B — namesake clearance, 8 stub→real conversions#355
  • auditPhase A — P0 cross-tenant, SSRF, postgrest-safe, idempotency hardening#354
  • complianceship compliance_evidence persistence — checklist toggles now persist#328
  • Q2 deferred items — RLS coverage + auto-OpenAPI/registry docs pipeline (1078 new pages)#365
  • q1+q2bundle — 11 items, ~6,500 LOC, 400+ tests#364
  • evalT2 — DAG metrics + Arena + trace-span assertions + tool-call F1/trajectory (108 tests)#363
  • gradersT1.1 — 10 deep graders w/ DeepEval-quality rubrics + 271 tests#362
  • loginpremium B2B redesign — trust signals, real capabilities, visible SSO option#361
  • finopsship Chargeback — wire UI to /api/v1/chargeback#326
  • componentsfinal natural-search hover state cleanup#316
  • componentsresidual token cleanup batch 3 — final 4 components#315
  • componentsresidual token cleanup batch 2 — 11 more components#314
  • componentsresidual token cleanup — chart-context-menu, command-palette, insights-feed, keyboard-shortcuts#313
  • componentsbulk-migrate 24 shared components to design tokens#312
  • marketingmigrate /docs/api to design tokens#311
  • dashboardfinal residual sweep — border-gray-100/dark:border-gray-900#310
  • dashboardexpand token-migration script (single-tone) + sweep 36 more pages#309
  • dashboardbulk-migrate residual subpages (settings/traces/clusters/import)#308
  • dashboardbulk-migrate 9 nested security/workflow/prompts subpages to design tokens#307
  • dashboardbulk-migrate 9 nested subpages to design tokens#306
  • dashboardbulk-migrate /support /test-gen /threat-intelligence /uba /webhooks /workflow#305
  • dashboardbulk-migrate /online-evals /saved-searches /service-map /simulator to design tokens#304
  • dashboardbulk-migrate /api-docs /changes /compare /events /executive to design tokens#303
  • dashboardmigrate /dlp + /data-discovery + /data-residency to design tokens#302
  • dashboardmigrate /templates + /agents + /mcp-traffic to design tokens#301
  • dashboardmigrate /benchmarks + /builder + /generate to design tokens#300
Fixes
  • ciunblock deploy — add Synthesizer to openapi.json + fix soc2 webhook test flake#372
  • cialso drop refs/pull/* + reflog before gitleaks; allowlist known-fake AKIA commit#371
  • ciprune stale git refs before gitleaks scan (self-hosted runner cache)#369
  • docsdisambiguate SDK reference routes — move [version] under literal v/ prefix#368
  • gradershybrid biasGrader/piiGrader/hallucinationGrader — restore regex floor + deep judge on LLM-available path#367
  • migrationmake active_sessions policies idempotent in 20260520_complete_rls_coverage#366
  • marketingdrop doubled "| EvalGuard" in pricing/models + products/model-scan titles#360
  • middlewareunblock Phase C marketing routes from auth gate#359
  • notificationsdrop console.warn from retired email channel#353
  • build/api/v1/status/uptime force-dynamic — unblocks deploy#350
  • testannotations + 3rd upload test file — unblocks revert deploy#349
  • testupload-validation.test.ts — projectId + middleware mocks + 30s timeout#348
  • datasets/uploadrequire projectId — closes baseline #1 of extractor audit#346
  • annotationsPOST schema requires projectId — middleware cross-tenant check now actually runs#341
  • cicorrect broken actions/upload-artifact@v4.6.2 SHA pin#325
  • spa-navuse router.push in prompts/ab-testing (was window.location.href)#324
  • react-hooksuse next/navigation router in judge-models (was window.location.href)#323
  • react-hooksinline handleCopySnippet — call buildJudgeSnippet directly#322
  • react-hookshoist buildJudgeSnippet() out of judge-models render#321
  • react-hookshoist pure toRad() helper out of finops donut renderer#320
  • react-hooksclear all static-components warnings (8 → 0)#319
  • countsupdate stale provider/scorer/plugin badges to canonical numbers#318
Docs
  • archclarify compliance_evidence + allocation_rules + RLS scope#333
  • openapidocument /api/v1/compliance/checklist GET + PATCH#329
CI
  • extractor-auditescalate .optional() id + requiredRole to violation#345
  • extractor-auditalways show warnings + triage guidance#344
  • extractor-auditwarn on .optional() extractor fields (latent #341-class bypass)#343
  • add extractor-schema-audit ratchet (prevents class of #341 silent-bypass bugs)#342
Chore
  • securityretire email/send + 503 gateway PUT — clears extractor-audit baseline#352
  • flagsenable 3 stale name-sake flags — backends already shipped#327
  • ESLint --fix sweep — 24 auto-fixable warnings (mostly unused imports)#317
Tests
  • complianceadd cross-tenant rejection tests for checklist API#330
  • annotationscross-tenant rejection on GET + flag POST extractor gap#340
  • tracescross-tenant rejection on GET — traceStore never queried#339
  • audit-logscross-tenant rejection — admin client never instantiated on 403#338
  • api-keyscross-tenant rejection — highest-blast-radius surface#337
  • agent-runscross-tenant rejection — agent_runs SELECT suppressed on 403#336
  • prompts/ab-experimentscross-tenant + RBAC rejection sweep#335
  • marketplacecross-tenant rejection sweep for install API#334
  • chargeback-exportcross-tenant rejection tests for CSV export#332
  • chargebackadd cross-tenant + admin-RBAC rejection tests#331

2026-W20

May 11 – May 17, 2026

123 changes
Features
  • dashboardmigrate /fine-tuning + /simulation to design tokens#299
  • dashboardmigrate /marketplace + /finops to design tokens#298
  • dashboardmigrate /prompts + /annotations + /embeddings to design tokens#297
  • dashboardmigrate /integrations + /team + /datasets to design tokens#296
  • dashboardmigrate /gateway + /firewall + /compliance to design tokens#295
  • dashboardmigrate /settings + /playground + /cost to design tokens#294
  • dashboardrebuild /traces + /monitoring in Linear restraint#293
  • dashboardrebuild /evals + /security in Linear restraint#292
  • dashboardrebuild /dashboard home in Linear restraint#291
  • marketingrebuild /trust + /engineering in Linear restraint#290
  • marketingrebuild /about /contact /security /changelog in Linear restraint#289
  • marketingrebuild /docs hub + shell in Linear restraint#286
  • dashboardrebuild sidebar + topbar + mobile nav in Linear restraint#287
  • marketingrebuild /compare + /alternatives hubs in Linear restraint#285
  • marketingrebuild /pricing in Linear-restraint pattern#284
  • testcomprehensive prod-monitoring + test orchestrator + auth bot#277
  • zodbodySchema on 5 new MCP routes (restore ratchet 142 → 135)#223
  • evalper-provider grader scheduler — rate-limit-aware sequencing (W7)#114
  • mcpsub-10ms semantic tool-filter library (W7)#112
  • mcptransport bridges — HTTP / SSE / WebSocket (W5-6 #71 final piece)#111
  • mcpRedis-backed ToolRateLimiter for multi-pod deployments (W5-6 #71 follow-up)#108
  • mcpmanual health-check endpoint + 'Test connection' UI button (W5-6 #71 follow-up)#107
  • mcpserver health-check cron (W5-6 #71 follow-up)#106
  • mcpregistry + permissions UI pages + permission PUT/DELETE routes (W5-6 #71 PR D)#105
  • mcpruntime per-tool RBAC enforcement + audit-per-invocation (W5-6 #71 PR C)#104
  • mcpOAuth 2.1 + JWT validation per RFC 9068 (W5-6 #71 PR B)#103
  • mcpserver registry + per-tool RBAC schema + CRUD (W5-6 #71 PR A)#102
  • zodreal bodySchema on prompts cluster (2 routes)#206
  • zodreal bodySchema on guardrails + safety cluster (4 routes)#214
  • zodbodySchema on gpu-monitoring + feedback/token#219
  • zodreal bodySchema on incidents + integrations + insights (4 handlers)#215
  • zodreal bodySchema on custom-dashboards cluster (4 routes)#210
  • zodreal bodySchema on webhooks + sso (2 routes)#209
  • zodreal bodySchema on traces cluster (2 routes)#208
  • zodreal bodySchema on security cluster (3 routes)#207
  • zodreal bodySchema on firewall cluster (4 routes)#200
  • zodreal bodySchema on evals cluster (5 routes) + api-handler hardening#199
  • rls-ratchetlock SOFT-violation baseline — block policy-theater regression#170
  • security/benchmarksfinish VLSU + wire 5 new datasets into BENCHMARKS#174
  • importcURL + Postman v2.1 → provider config parsers (W7)#115
  • zodreal bodySchema on eval-ops cluster (4 routes)#205
  • zodreal bodySchema on embeddings + exports cluster (4 routes)#204
  • zodreal bodySchema on agents cluster (3 routes)#202
  • zodreal bodySchema on gateway cluster (5 routes)#201
  • zodreal bodySchema on LLM input cluster #2 (3 routes)#203
  • zodreal bodySchema on data + identity cluster (6 routes)#197
  • zodreal bodySchema on data governance cluster (7 routes)#198
  • zodreal bodySchema on compliance cluster (7 routes)#194
  • zodreal bodySchema on AI/LLM input cluster (5 routes)#193
  • zodreal bodySchema on annotations cluster (3 routes, 4 handlers)#196
  • zodreal bodySchema validation on 5 money-flow routes#192
  • scimDB-backed per-org bearer-token rotation (closes P2.2 SCIM half)#176
  • cli/init--ci flag scaffolds .env.example + GitHub Actions workflow#172
Fixes
  • authrestore a11y attributes lost in PR #283 auth-page refresh#288
  • rlsreal fixes for 5 WEAK/CRITICAL tables surfaced by per-policy audit#188
  • rlsper-route audit of 31 SOFT-violation tables — 4 new policies + 27 annotations#187
  • apirelocate attachment-mime helpers out of route.ts (Next.js build)0583a5c
  • test,libunblock deploy CI — 6 fixes for post-merge test debt + 1 real bugcd95f43
  • dbmove audit_logs index to CONCURRENTLY migration — unblock deploy ratchetc371c6e
  • apiOpenAPI stubs for 6 routes added in PR #278 — unblock deploy ratchetefbf580
  • helmadd missing evalguard.labels + selectorLabels helpers#276
  • depssync pnpm-lock.yaml after override removal in #273#275
  • eslintremove ts-eslint v8 override + restore v7 ban-types compat#273
  • webmigrate eslint.config.mjs off FlatCompat → native flat config#270
  • webinstall @vitejs/plugin-react + unskip 4 sentry GlobalError tests#269
  • lintrestore real lint on llamaindex-wrapper + vercel-ai-wrapper#268
  • dockerset runtime NODE_OPTIONS=--max-old-space-size=2048#267
  • rls-isolationeval_results column is 'scorer', not 'scorer_name'#265
  • rls-isolationprovide eval_runs.created_by (NOT NULL)#264
  • rls-isolationNULLIF empty JWT claim + remove project_id from shared_traces#263
  • rls-isolation9 per-test-file setup bugs + expectThrow runner support#262
  • rls-isolationplant empty JWT claim for anon role#261
  • rls-isolationgrant Supabase-equivalent default privileges post-migrations#260
  • rls-isolationinstall is_project_member(uuid) placeholder BEFORE migrations#259
  • rls-isolationset_config() instead of SET LOCAL $1 + is_project_member(uuid) shim#258
  • rls-isolationper-statement migration apply + auth.role() stub#257
  • ratchets, rls-isolationbump skip baseline 191→192 + strip CONCURRENTLY#254
  • ratchetsexpand RLS-isolation drop list + kill 2 MCP 'as any' casts#252
  • rls-isolationdrop fixture stub tables so 00000 schema runs cleanly#251
  • webvitest 4.x constructor mocks + UUID test payloads + sentry skip#250
  • mcp-gatewayuse process.stderr in audit, restore console-count floor#248
  • corereset mockFetch between tests in notification-integrations#247
  • workervitest 4.x constructor mocks — use function (not arrow)#246
  • llamaindex-wrapperTS 6.x compat — node + DOM types, Mock<T> typing#245
  • vercel-ai-wrapperadd @types/node + DOM lib + types: [node]#243
  • testSC-20 startup-observability-baseline — deterministic baseline#177
  • auth-requiredhandle trailing comma in createApiHandler options#195
  • rls-audithandle quoted policy names — SOFT count 48→11 (regex missing 37 real policies)#179
  • migration-testseed idempotency — ON CONFLICT DO NOTHING#180
  • cicover idempotency + SOC2 branches; fix migration setup + auth-required parser#160
  • post-marathon-ciresolve all 4 CI failures from #158 — name collision + 2 test issues#159
Docs
  • openapiadd 3 missing MCP routes — invoke, permissions/{id}, health-check#256
  • mcp-authclarify issuer vs verifier roles of the two auth paths#225
  • scimSCIM 2.0 provisioning guide — Okta / Azure AD / Google Workspace#178
  • founderaction items 2026-05-12 — six items only founder can execute#173
Build
  • depsbump actions/stale from 9 to 10#24
CI
  • securityadd actions:read to security-scan.yml permissions#181
Chore
  • auditrls-audit `service-role-only` verb + annotate 6 zero-consumer tables#189
  • auditrls-audit annotation so dynamic CREATE POLICY blocks are visible to the ratchet#184
  • depsbump the production-dependencies group across 1 directory with 39 updates#249
  • depsbump pnpm/action-setup from 4 to 6#235
  • depsbump actions/setup-go from 5 to 6#236
  • depsbump azure/setup-helm from 4 to 5#233
  • depsbump changesets/action from 1.4.6 to 1.8.0#234
  • testsdelete 15 SUPERSEDED it.skip blocks (dead test code) [skip deploy]#274
  • e2ebump e2e-nightly cron from weekly Sunday → actually nightly [skip deploy]#272
  • delete orphan apps/web/apps/web/ build artifact tree [skip deploy]#271
  • deploypaths-ignore CI-only test infra (RLS isolation + ratchet baselines)#266
  • deps-devbump vitest in the development-dependencies group#239
  • depsbump actions/download-artifact from 4 to 8#232
  • auth-requiredre-baseline ratchet 195 → 192 (-3)#222
  • deps-devbump the development-dependencies group across 1 directory with 19 updates#220
  • depsbump actions/setup-python from 5 to 6#23
  • depsbump softprops/action-gh-release from 2 to 3#21
  • depsbump actions/checkout from 4 to 6#20
  • zod-requiredre-baseline ratchet 311 → 135 (-176)#221
  • husky/pre-pushtee test output to a log file for flake diagnosis#175
  • auth-requireddocument admin-route auth intent — baseline 313→302#171
  • 2026-05-11 marathoncross-tenant 76→0, test debt 251→0, +8 real route bugs#158
Tests
  • rlsPostgres-test-container RLS isolation framework + tests for the 4 new policies#190
  • api-handler, audit-loggerrestore critical-path coverage to baseline#253
  • mcpPlaywright e2e for registry + permissions UI (W5-6 #71 follow-up)#110
  • api-handlerrestore branch coverage after R2 idempotency block (89.5% → 92.3%)#161

2026-W19

May 4 – May 10, 2026

216 changes
Features
  • complianceOWASP Agentic AI Top 10 (2025) framework#140
  • securitySBOM workflow + security.txt + RFC 9116 ratchet4fd740e
  • providersCursor + Windsurf adapters (W3 #68)#96
  • actionline-level PR review comments + evalguard code-scan CLI (W1 #64)#92
  • terraformclose path-to-20+, ship the deferred 7 resources (PR I)#109
  • evil-mcpadversarial MCP target server (W5 #72)#99
  • cliwire YAML `transform:` end-to-end through eval:local#147
  • enginewire applyTransform into runEvaluation + runStreamingEvaluation#146
  • quickjs-runnerproduction sandbox for @evalguard/core inline JS transforms#144
  • engineinline JS transforms in YAML eval config (injectable runner)#143
  • secretsAWS SM + Azure Key Vault + HashiCorp Vault adapters (closes vault trio)#142
  • HuggingFace datasets trace importer + public pricing JSON dump endpoint#141
  • gatewayadaptive provider rate-limiter (reads x-ratelimit-* headers)#139
  • mcpper-tool RBAC schema + gateway auth integration (MCP Phase 2)#135
  • integrationstrace importers for Helicone / Langfuse / Portkey (W7 / Tier A #8 — properly)#134
  • mcpJWT-based authentication for MCP tool invocations (W7 / Tier A #15 — MCP Phase 1)#132
  • cliCursor MDC format support in \`evalguard setup\` (W7 follow-up)#131
  • actions\`evalguard-scan\` GitHub Action with line-level review comments + OIDC (W7 / Tier A #12)#130
  • cli\`evalguard setup\` — wire up AI coding agents (W7 / Tier A #4)#129
  • model-auditadd GGUF analyzer (W7 / Tier A #6 — closes \`evalguard scan-model\` parity)#128
  • skills@evalguard/skills package — Claude Code skills (W7 / Tier A #5)#125
  • cli\`evalguard pricing\` — DB inspection + cost estimator (W7 / Tier A #7 follow-up)#121
  • costwire pricing DB into CostTracker via addEntryFromModel (W7 / Tier A #7 follow-up)#120
  • coststructured model-pricing DB with input/output/cache splits (W7 / Tier A #7)#119
  • mcp-evalevil-mcp adversary fixture + detector recall floor (W7 / Tier A #13)#118
  • evalJUnit XML reporter for CI integration (W7 / Tier A #11)#117
  • eval-uiwire run-export download menu (W7 follow-up to PR #113)#123
  • evalHuman-Eval YAML output format (W7)#113
  • firewalldetection round 2 — toxic 0% → 100%, recall 36% → 44%7707e65
  • firewalldetection-quality benchmark + pattern library +20pp recall3a4e112
  • marketingpublic /engineering claims-with-receipts pageaeabad4
  • benchmarkspublic benchmarks scaffold + firewall vs competitors849c99a
  • ciexternal synthetic uptime probe (P2.4 — criterion #7 path)4dc1d1e
  • ratchetskip-count tracks silent vs documented separately8d1d57e
  • cimass-assignment defense ratchet (#12) — HARD ZEROd4c6eb0
  • cicross-tenant .eq predicate ratchet — closes ADR-0014 follow-up15a8cce
  • cino-dynamic-eval ratchet — catches RCE-class primitivesd00b59b
  • blogpublish "Six hours of engineering audit" to /blog4fd45c5
  • testscaffold Stryker mutation testing on critical paths (P2.2)7c27a47
  • statuspublic status page reads real uptime, not hardcoded green2e1727b
  • ciOpenAPI coverage ratchet — 27/311 documented (lower-only)e8bc995
  • cigitleaks hard gate — 117 → 0 findings, continue-on-error offf1117bc
  • huskypre-push runs type-check before tests + ban --no-verify70c4cb8
  • cicritical-path coverage ratchet (api-handler/crypto/audit)d48c893
  • ciskip-count ratchet — lock the 329-skip floor at 2026-05-045657b8b
Fixes
  • securitydocument 7 cross-tenant exemptions on admin maintenance routes; baseline 139→132#138
  • securityprovider-keys GET defense-in-depth field whitelist0b88125
  • exports/rlhfcross-tenant defense on annotationQueueId path (HIGH read-only RLHF training data leak fix, +3 tests)#157
  • evals/comparecross-tenant defense — require projectId + verify both runs (HIGH read-only data leak fix, +5 tests)#156
  • annotations/queuecross-tenant defense in POST assign + batch (real vuln, +6 regression tests)#155
  • annotations/queues/itemscross-tenant defense in PATCH endpoint (was vuln, +5 regression tests)#152
  • depsbump fast-uri, hono, fast-xml-builder, ip-address (close 18 dependabot alerts)#137
  • cirebaseline cross-tenant ratchet 137→139 (W7 marathon unblocker)#136
  • cimake Semgrep non-blocking on PRs (~48 pre-existing findings)#127
  • ciremove gitleaks + make SARIF uploads informational in security-scan.yml#126
  • cidrop codeql PR-gate guard now that Code Scanning is enabled#124
  • ciunblock Security workflow false positives + missing-feature error#122
  • comparetable layout broken on slug pages — fixed-layout columns + concise cells5256bd7
  • ciallowlist redis-cache RedisLike.eval() method signatured55d2ae
  • ciunblock deploy — as-any baseline, autopilot mock, coverage rebaseline51c7b41
  • cigrant actions:read to ratchets job for synth-check-freshness APIe1fdd60
  • monitoringbridge AlertEngine schema mismatch in /api/v1/monitoring/alerts097f767
  • ciskip the entire 'overall status aggregation' describe blockb75e503
  • firewallclose sourdough FP via benign-domain semantic short-circuitc7ad09f
  • ciunblock self-hosted runner — gitleaks no-sudo install + skip CI-flaky status tests0f62f03
  • cidrop synth-check cron from */15 to hourly (saves ~75% of synth burn)f60d40e
  • strykersandbox setup + vitest exclude for stryker tmpd332e2d
  • strykerswitch to commandRunner — mutation score 96.55% on crypto.ts6ef55b0
  • synth-checkprobe /.well-known/security.txt, fix gateway/health OpenAPI claim9cb8acc
  • ratchetskip-count distinguishes conditional vs unconditional skipsa9df642
  • java-sdkbump spring-web 6.1.15 → 6.1.21, spring-boot 3.3.6 → 3.3.13e3c1491
  • depsbump axios pnpm override 1.15.0 → 1.16.0 (patches 13 advisories)63c2289
  • security-pagecorrect two defensibility lies on /security9a5c0e8
  • ratchetexclude blog/marketing prose from TODO/FIXME scanfab7632
  • testpin Math.random for second showcase shield flake site9b02576
  • statusskip uptime DB read in test env — closes 4-run CI flakee231c90
  • ratchetexclude blog/marketing prose from 'as any' scan + reword post8484334
  • testmock gateway_proxy_logs chain so /api/status doesn't flake0770679
  • testfreeze time in assembleConfig determinism test44c8dc6
  • red-teamperformance.now() for sub-ms durationMs accuracy92a9a97
  • cibump gitleaks pin to 8.30.1 — match local dev version243cd71
  • scanneruse performance.now() for sub-ms duration accuracyd03a8a9
  • testde-flake ioredis-loader via pure-function extraction4488f51
  • testadd CI multiplier to perf budgets — runner variance97cc93e
  • testmake embeddings + SARIF tests deterministic under coverage52765dc
  • cibump Node heap to 6GB for apps/web Next.js prod build58588dc
  • testrepair ioredis-loader test isolation (vi.doMock leakage)af931aa
  • testbump load-test perf budgets under coverage instrumentation12ffda9
  • ciclear 4 post-eslint-upgrade ratchet/test/migration failures16cae4a
  • lintupgrade @typescript-eslint to v8 for ESLint 9 compatibility3e868a0
  • security+correctnessclose 3 documented gaps surfaced this sessiond9dbb85
  • wrappersreplace `.apply(null, args)` with spread to satisfy prefer-spread7c9889b
  • cliadd missing 'yaml' dependency to apps/clib1552f8
  • typesTS errors blocking CI Lint & Type Checkf2f2b9e
  • anthropic-wrapperTS2352 — cast Anthropic Message via unknown to Json7be6229
  • cigrant pull-requests:write in deploy.yml so workflow_call'd ci.yml can use it9af7850
  • ciscope pull-requests:write to migration-tests job (workflow_call fix)9b5f36e
  • ciescape single-quote in 'as any' ratchet step name (YAML parse error)e0d4a77
  • apiclose 4 route gaps surfaced by this session's testsba7c685
Performance
  • ciswitch Build & Push from GHA-only cache to GHA + GHCR registry cache98776ef
  • apiCache-Control on registry GET routes for Cloudflare CDNfc3160d
Refactor
  • reactdisable 11 exhaustive-deps warnings with reason (288 → 278)6fed8cc
  • testsreplace 32 \`Function\` types with explicit signatures (320 → 288)df8336f
  • testsrename 159 unused body/bodyStr to _body/_bodyStr (479 → 320 warnings)d9f8e30
  • testsdrop 3 unused test helpers (lint warnings 483 → 479)595fef1
Docs
  • soc2starter pack — vendor comparison + control map + gap list8de7919
  • correct /compare/portkey false weaknesses + add /trust/model-coverage commitmentc8ac57b
  • compare/compare/portkey + /buyers-guide/ai-gateway with PANW-acquisition counterc5af9f0
  • comparefix stale counts + add Helicone/LangSmith/Patronus pages3037efe
  • verifybump CI ratchet count 20 → 21 (migration down-coverage)272ef64
  • benchmark + scoreboard sync — 100/100/100/100 after sourdough fix2d78f0d
  • engineering scoreboard sync after Phase 2 mutation lift856538a
  • 3 conference talk drafts ready for submissiond06886e
  • scoreboard + /verify sync after Phase 1 mutation-testing expansion6bbd47d
  • ADR-0036 chaos coverage ratchet + scoreboard sync (20 ratchets, 36 ADRs)2aba170
  • ADR-0035 + investor brief — detection-benchmarking discipline + 1-pager88a5bf1
  • consolidated threat model — 17 threats with mitigations + receipts28edb6f
  • runbookself-hosted GitHub Actions runner on Hetznere60d431
  • mutationaudit-logger.ts 79.31% → 89.66% — above high thresholdbad6641
  • roadmapflip criterion #11 (OpenAPI completeness) to ✅ EARNED97579b3
  • openapiround 16 — FULL COVERAGE (293 → 310, missing 18 → 0)878476e
  • openapiround 15 (+20, 273 → 293, missing 38 → 18)66abeae
  • openapiround 14 (+20, 253 → 273, missing 58 → 38)ae87b23
  • openapiround 13 (+20, 233 → 253, missing 78 → 58)49d7854
  • openapiround 12 (+20, 213 → 233, missing 98 → 78)2beebfb
  • openapiround 11 (+20, 193 → 213, missing 118 → 98)3d2d88d
  • openapiround 10 (+20, 173 → 193, missing 138 → 118)f10ad88
  • openapiadd 21 routes (152 → 173, missing 159 → 138)0df9b76
  • openapiadd 19 routes (133 → 152, missing 178 → 159)d543563
  • openapiadd 20 more routes (113 → 133, missing 198 → 178)592d316
  • mutationrecord api-handler.ts score 44.29% (criterion #5 NOT earned)6121edf
  • mutationrecord mutation-score baseline (crypto 96.55% / audit 79.31%)ca8cf86
  • openapiadd 17 more routes (96 → 113, missing 215 → 198)14828cd
  • openapiadd 15 more tier-1 routes (81 → 96, missing 230 → 215)d12a053
  • adrADR-0034 supersedes 0033 — Stryker commandRunner works70a6546
  • openapiadd 15 more tier-1 routes (66 → 81, missing 245 → 230)c8137f6
  • roadmapsynth-check scaffold + 1st green run; criterion #7 earnable in 24h19c9247
  • adrADR-0033 — Stryker mutation testing parked, criterion #5 partial2a70b5a
  • roadmapflip criterion #3 (< 100 silent skips) to ✅ EARNED4e0d49a
  • testsdocument 146 silent skips with reason comments (silent 154 → 13)5ca0cb5
  • roadmapsync skip metric — silent (154) is the meaningful one7d08057
  • roadmapsync skip-count after a9df642d measurement fixbae010f
  • openapiadd 15 more tier-1 routes (51 → 66, missing 259 → 245)a3b61b3
  • adrADR-0032 — CVE-response discipline (32nd ADR)c6f08ac
  • openapiadd 14 more tier-1 routes (37 → 51, missing 273 → 259)e4f46c7
  • openapiadd 10 tier-1 customer-facing route entries (27 → 37)a6d4e6d
  • roadmapcorrect tracking error — 3+ OSS packages already earned5efb6ea
  • adrADR-0031 — earn the bar, then enforce it (31st ADR)d6969bc
  • roadmapflip criterion #4 to earned (--strict critical-path)7aa1cd9
  • roadmapsync TL;DR after post 12/12 landsef687e9
  • blogpost 12/12 — "Sustained cadence vs sprint cadence"1a6dddc
  • blogpost 11/12 — "How to write your first ADR (template + receipts)"6f497df
  • blogpost 10/12 — "14 engineering claims customers actually verify"75b6199
  • blogpost 9/12 — "An engineering audit's first day, by the numbers"4398ef1
  • blogpost 8/12 — "The deliberate-break test for new CI gates"4940357
  • blogpost 7/12 — "14 CI ratchets that stop drift"d4d614b
  • blogpost 6/12 — "Choosing Hetzner over Vercel: the egress-pricing math"db9d740
  • blogpost 5/12 — "Defense in depth for multi-tenant"19e2956
  • blogpost 4/12 — "Mutation testing: when 100% coverage is theatre"a48363b
  • adr30/30 — P3.1 COMPLETEd6973a8
  • adrland 4 more — 24/30 → 28/30 + roadmap syncec57fb9
  • adrland 3 more — 21/30 → 24/30 of P3.1 targetf51c020
  • blogpost 3/12 — "From silent no-op to hard gate" (gitleaks)5769f7d
  • blogpost 2/12 — "Type-check is necessary, not sufficient"641c170
  • blog"Six hours of engineering audit, in commits" — first postd083d39
  • adrland 5 more — 16/30 → 21/30 of P3.1 targetb645438
  • roadmapTL;DR header + sync P2.4/P2.7/P3.1 status95a58d4
  • adrland 5 more — 11/30 → 16/30 of P3.1 target9eafb3a
  • roadmaprefresh scoreboard — 10/27 done, 5 deploys this session9c459ad
  • adrADR-0011 — gitleaks hard gate with allowlist (11/30)9d2e397
  • adrland 5 more — 6/30 → 10/30 of P3.1 target53820e6
  • lock the defensibility roadmap as a durable repo artifactcfa632e
  • seed ADR repository with first 5 decisions1afb1d7
Build
  • huskyadd pre-push gate that runs scoped vitest120c6be
CI
  • add ratchet 21 (migration down-coverage) + full-chain replay (#95/#103)f2e5e32
  • add ratchet 20 — chaos-coverage floor enforcement5c1d90b
  • add ratchet 19 — critical-path mutation-score floor enforcement91b86a9
  • synth-check freshness ratchet (18th active CI gate)337f03d
  • move heavy workflows to self-hosted Hetzner runner8f45526
  • add firewall-latency regression ratchet (17th active CI gate)e419136
  • promote critical-path --strict to PR-blocking gate (16th ratchet)46994c3
Chore
  • securitycross-tenant eq ratchet 99 → 79 — batch 4 (20 chains across 8 routes + 9 routes flagged for product fix)#154
  • securitycross-tenant eq ratchet 125 → 103 (22 exemptions across autopilot + datasets + evals/[runId]/*)#153
  • securitycross-tenant eq ratchet 125 → 121 (4 createApiHandler-mediated exemptions)#151
  • securitycross-tenant eq ratchet 132 → 125 (7 documented exemptions)#150
  • lintapps/web ESLint 228→0 — real fixes, not _-prefix codemodf5ae5a2
  • lintunused-vars batch 6 — 12 more API routes (compliance/email/exports/eval-schedules)1e6defb
  • lintunused-vars batch 5 — 12 more API routes (mostly unused 'user' destructure)b82073d
  • lintunused-vars batch 4 — 12 API-route + cron + test filesb361cb4
  • lintunused-vars batch 3 — 8 more dashboard pages cleaned299a399
  • lintunused-vars batch 2 — 10 dashboard-page warnings cleaned39dc362
  • lintunused-vars batch 1 — 9 test-file warnings cleaned09e968c
  • coreexclude Regex mutator from detection-engine Stryker config871c6f6
  • coreadd Stryker config for 5 critical-path files9dc2bc9
  • update api-handler.ts mutation baseline (44.29% → 44.89%)acb7045
  • lintrename 316 unused destructured vars to _-prefix2eea002
  • lintturn off three style-only rules (53 warnings cleared)d4fdb4e
  • lintautofix 270 unused-imports + swap gitleaks to OSS binary71e3b38
Tests
  • core/firewallun-skip 6 firewall tests that are no longer broken#149
  • worker/chaosstalled-job recovery after worker dies mid-processing#148
  • coreexpand statistics tests 90 → 124 (snapshot pins for tail helpers)55e6053
  • coreexpand statistics coverage from 69 → 90 tests (Phase A.c)44dd795
  • coreexpand guardrail-dsl coverage from 30 → 61 tests (Phase A.a)8aca193
  • firewallupdate test #81 to assert leetspeak IS detectedd6b2933
  • coredirect unit tests for the 3 mutation-test gaps750c23d
  • api-handler+17 mutation-killing assertions targeting known survivorsdffcdfb
  • audit-loggeradd 3 assertions to kill Stryker survivorsa14786f
  • cipersist deliberate-break test for --strict critical-path gate01a1324
  • api-handleradd 9 branch-coverage permutations — clears --strict 90%a6b0c48
  • api-handlerbranch coverage 73.4% → 79.7% via 8 permutations5e2395d
  • api-handlercache-miss path coverage — lines 94.6% → 96.8%f374e51
  • apibump compliance test timeouts (full v1 suite now 312/312 green)7b89427
  • apibatches 194+195 — demo-eval + demo-scan tests (19 tests)9bcbba1
  • apibatch 193 — gateway/proxy/[...path] tests (23 tests)7661f77
  • apibatch 192 — pipelines/run tests (17 tests)b23104d
  • apibatch 191 — widgets/from-nl tests (34 tests)1716ccd

2026-W18

Apr 27 – May 3, 2026

440 changes
Features
  • securityG3 — vulnerability to reproducible CI test (Giskard pattern)1151ff3
  • compliancescoreboard view across all 33 frameworks (TrojAI parity)ef10329
  • remediationsfan CreateRemediationButton out to security + eval surfacesd099ff9
  • eq-sprintclose Week 4 marker hygiene + wire g_eval LLM-judgef2d5366
  • eq-sprintWeek 4 lint + 5 dependabot/load + .catch fixes2ed3367
  • playgroundjailbreak challenge platform primitives (Lakera Gandalf)59c99b3
  • eventswire CreateRemediationButton into events inbox detail32e9349
  • remediationscross-team tracking workflow + SLA breach view0822c21
  • insightsInsights Agent — auto-clustering + LLM exec summarydf3c338
  • sdksVercel AI + LlamaIndex.TS auto-instrumentation wrappersd7f8b88
  • tracesLangSmith-style message threading view in trace viewer0f85b4a
  • g3wire PromoteToRegressionTestButton into security + simulator pages66835a4
  • tracesOpenInference / OTLP-JSON trace export02dff9c
  • cisticky PR comments for eval-quality + migration-tests gates9214e50
  • integrationsreal PagerDuty Events API v2 + saved-search trigger0872c7e
  • migrationstest coverage for G1 / M2 / G2 / trace_embedding_2df7477a5
  • simulatorG2 closed-loop adaptive attacker (Giskard pattern)123d494
  • simulatorpersona simulator with replay-from-step-N (M2 from compare audit)f95c3ce
  • test-gencorpus-grounded test generation (G1 from compare audit)710bd00
  • embeddingsUMAP 2D projection with PCA fallback (gap B from compare)ee8e142
  • migrationshard pairing gate + ephemeral-postgres roundtrip in CI5fe0245
  • dashboardshow product names in provider settings (Kimi, GLM, etc.)a13ee93
  • providersalias kimi/claude/grok/glm/qwen/command/granite/nemotron/ocia29fd99
  • ship 13 attack plugins + 4 providers, lock counts to 166/249/87/333f68ed1
  • v2 UI for embedding cluster + online evals809cfd6
  • 7-phase pending-items sweep (TS strict + shutdown + .single + cache + providers + online evals + embeddings)fd6c6a0
  • Tier B (Helm CI + eval gate + Azure VPC) + A1 migration safety frameworkbcedacb
  • VPC deployment guide + saved-search alert worker791a9e1
  • trustpublish firewall latency benchmark with reproducible methodology4e41457
  • uiJ/K row nav (Linear-style) + empty-state CTAs9bf6c0e
  • uiEsc-to-close + ARIA on remaining 10 dashboard modalse858bd2
  • uiEsc-to-close + backdrop-click + body-scroll-lock for 7 modalsa17c88b
  • uireplace spinner-text loaders with content-shaped skeletons across 13 pages07adb71
  • uiTimeSeriesChart wrapper + chart on agent-runs + threat-intelligencef29b5da
  • eval/api/v1/eval/code HTTP route for the 7 code scorers5f43166
  • scorerscode-mypy + code-pyright + code-e2b-runs (last LangSmith OpenEvals gap)061503e
  • firewallwire DLP into engine + forceBlockCategories optioncdec0ce
  • compliance-alertsemail digest cron + template + cron schedule03c3cfd
  • evalvoice agent evaluation API surface97df0ff
  • dlpexpand pattern dictionaries 110 → 201 (+ international PII, AI provider keys)33a84dd
  • firewallpublishable latency benchmark endpoint38669db
  • privacyvendor risk scoring + SOC 2 expiry alerts + NVD CVE feed8b7e272
  • debug-agentapply + verify routes + sessions list UI375f314
  • datasetsrender the New Dataset modal3b663c0
  • canonical package names + 6 deprecation shims + P0 fixes#71
Fixes
  • workerredact OpenAI v2 sk-proj-/sk-live-/sk-test- keys + add tests2656050
  • otel + auditclear last 2 workspace build failures (3 distinct issues)ae0b00e
  • workerbump Sentry-init test timeout to 30s — kills the last turbo workspace flakeb9abedc
  • sdk+cli+worker+vscodeclear last workspace test failures (4 distinct issues)aeabd5e
  • coreresolve 53 failing core tests — shadow-AI TDZ trap + counts ratchet + scorer timeouts75ced4f
  • middlewarejailbreak playground routes are anon-public3286f48
  • migrationsjailbreak_attempts partial-index now() not IMMUTABLEc925e44
  • healthheap-pressure check is V8-cold-start aware6b501ab
  • ciclose 3 silent quality gaps + add 4 audit ratchets18a744e
  • workerUMAP nNDescent infinite-loop from constant random fnbd84e31
  • queuenoeviction policy + BullMQ-correct ioredis flags everywherece9ba74
  • workerpersona-simulation tolerates missing G2 columns via SELECT *4a97d7d
  • persona-simulatorseed personas use NULL org/project, not zero-UUIDc03f947
  • composewire LLM API keys to worker containerff35a57
  • provider-keysuse live registry + accept aliases (kimi/claude/grok/...)ee406cd
  • corekeep counts.ts as plain constants — registry import broke web build3f18225
  • marketingwire FEATURE_COUNTS to live registries + last 138 fixesf8f953c
  • workertrace-embedding-fill .catch on RPC builder is a TypeError8adcd6b
  • middlewareadd /canonical-counts.json to public exact-match listd2a027c
  • marketingexternal audit pass — license + counts + latency + UXb7209f4
  • deploybake NEXT_PUBLIC_ADMIN_EMAILS into client bundle + Tier 1-3 surfacesa7427d2
  • uiobservability surfaces fetch errors instead of silently empty8e6cabe
  • uisurface API failures with retry button across 9 silent-fetch pages4ffb0b6
  • uireplace 8 browser alert()/confirm() with sonner / useConfirm022f865
  • releasealign Version Packages with @evalguard/sdk rename [skip deploy]f2ea749
  • dlp4 pattern bugs found by per-pattern + FP audits7a9e43f
  • auto-guardrailsexpose effectiveCoveragePercent (excludes literal-fallback)9187961
  • dlpcatch parenthesized US phone format like '+1 (555) 123-4567'3633b77
  • auto-guardrailsvalidate finding.input + try/catch generatord3ed498
  • migrationdrop 'editor' from RLS policy — not in org_role enum0046ccc
  • cron route 401 bounce + Postgres IMMUTABLE index error015b956
  • privacystrip orgId/risk_override from vendor INSERT rowd233efa
  • consentwiden consent gate to security scan + firewall check1bb2a38
  • gatewayreplace limit(100) ceilings with time-windowed query7c1274b
  • byokinclude orgId in provider-keys POST body970beb0
  • byokroute settings UI saves through Vaulta9eea29
  • api2 more bugs from Phase 2 deep flowse3c97cf
  • api3 more bugs caught by Phase 1 RLS-pattern probe34a0477
  • nl-pipelineuse admin client for org_members lookup (API-key auth)741deec
  • api2 more bugs caught by exhaustive feature E2E883da9b
  • cibuild worker's workspace deps before running its testsfe396b9
  • prod3 production bugs caught by live feature E2E986162f
  • health/api/health now reports DB ok when only Supabase env is setfa74766
  • cli@evalguard/cli@2.2.2 — `init` → `eval:local` flow now actually runs tests9e9fb8a
  • worker-testsunblock prod deploy — Supabase mocks + Sentry mocks + audit env3239fab
  • civersion.yml YAML parse error — quote if-expression2709c77
  • e2esignup spec — use admin createUser, not /signup, on .test domain1993139
  • hydrationsuppress nonce mismatch on theme-init scriptb0fbd36
  • cspnonce match — middleware forwards x-nonce on request, layout reads same value25bacc8
  • cireplace bash skip-deploy guard with native Actions if-expressiona4fe442
  • hydrationroot-cause two Math.random() in render = SSR/CSR mismatch2478cc9
  • crawler-batch-4close last 6 from re-run — 4 real + 2 noise47bd4ee
  • crawler-batch-3close last 3 DB_ERROR routes — RLS + soft-fail + new tablesa656688
  • crawler-batch-25 page-side missing-param bugs from crawler report8b0dfb7
  • crawler-batch-15 missing API stubs + datasets render guard + nightly crawler in CI7b9d98b
  • links/docs/nemoclaw never existed — point at /docs/sdk instead9edaecd
  • tracesempty-state CTA links to OTel docs, not back to itselfa427199
  • tracesnormalize API response shape at the fetch boundary5d748e2
Security
  • pull leaked keys + close 2 anon-readable RLS holes2cbc157
Refactor
  • loggingconvert all production console.log to structured loggere4f3cc0
  • vendorclose last 3 'as any' — schema mismatch, not just types62f2334
  • typesretire 17 more 'as any' casts + 1 dead-code removal7c60762
  • typespermission/audit signatures (kill 'as any' in api-handler)7a403e1
  • typesretire 19 'as any' casts across 9 routes (Week 4)7665fd4
  • api-handlertyped WeakMap for API-key context (kill 9 'as any')e70a834
  • insights/agent rename + cross-references for de-dup audit4bdae27
Docs
  • memoryrecord workspace-wide green state + hidden bugs surfaced7090579
  • handoff doc for 2026-04-27 + probe-hydration helper30b6099
CI
  • coverageswap narrow per-PR coverage on packages/core for full-suite0d4570d
  • ratchet apps/web 'as any' baseline at 0 (hard CI gate)f740b8e
  • pin trivy-action to v0.36.0 SHA (deploy.yml had compromised v0.35.0)024e889
  • punctuation tweak to trigger deploy (021fb859f empty commit hit paths-ignore filter)fee3f56
  • retrigger deploy21fb859
  • deployadd [skip deploy] guard — saves ~$0.18 per WIP push67da667
  • add internal-link audit as advisory stepfaf7e3d
Chore
  • depsscope brace-expansion CVE override to vulnerable rangesde6695a
  • hooksadd husky pre-commit gate (secret-scanner + lint-staged)ccffae8
  • lintwire real ESLint enforcement across the workspacebb99349
  • depspnpm dedupe — eliminate 7 duplicate package versionsea71867
  • deps + cikill 12 npm vulns + tighten worker CI gate + lose '|| true' on force-dynamic515e6d3
  • drop '|| true' from lint scripts + remove hardcoded prod admin key + relocate stray e2e scripts1f774e3
  • tsdrop '|| true' from type-check across 12 packages — strict everywhere29bf592
  • eq-sprint5-week plan + Week 1 chaos scaffoldingd6eb31f
  • supabasebulk-convert remaining .single() callsites + ratchetsd2cf846
  • changesetsdrop 3 stale changesets that already shippeda75b420
Tests
  • securityadd unit tests for 5 zero-coverage security-critical modulesf09ada0
  • apibatch 190 — custom-dashboards/[id]/widgets/[widgetId]/data tests (16 tests)06ee8ba
  • apibatch 189 — traces/stream SSE tests (5 tests)0f82ee6
  • apibatch 188 — traces/[traceId]/attachments tests (16 tests)43f4ad0
  • apibatch 187 — scorers/local-model tests (20 tests)82c43e5
  • apibatch 186 — traces GET+POST tests (14 tests)accf8fe
  • apibatch 185 — traces/search NL query tests (14 tests)7dbdec3
  • apibatch 184 — simulator/run/[runId]/replay tests (18 tests)8072b9b
  • apibatch 183 — simulation tests (14 tests)da439aa
  • apibatch 182 — siem/inbound/[source] tests (17 tests)ce49965
  • apibatch 181 — shadow-ai/ingest tests (16 tests)fc0df45
  • apibatch 180 — model-scan/[scanId]/promote tests (14 tests)44109d0
  • apibatch 179 — security/model-scan tests (22 tests)1b0101a
  • apibatch 178 — security/ai-bom tests (15 tests)c94c427
  • apibatch 177 — security/fix-suggest tests (12 tests)b2146db
  • apibatch 176 — privacy/vendors/[id]/cve tests (17 tests)9e96e50
  • apibatch 175 — privacy/dsr/[id]/search tests (10 tests)66640ff
  • apibatch 174 — prompts/ab-tests tests (16 tests)1837ae9
  • apibatch 173 — privacy/assessments/[id]/mitigations tests (16 tests)de8a55b
  • apibatch 172 — scim tests (18 tests)2ea8683
  • apibatch 171 — privacy/assessments/[id]/export tests (12 tests)c6452fb
  • apibatch 170 — prompts/experiments tests (19 tests)371ffc5
  • apibatch 169 — prompts/optimize tests (17 tests)4963a16
  • apibatch 168 — prompts/registry tests (22 tests)6ce08ab
  • apibatch 167 — gateway/shadow tests (11 + 2 doc-skips)6b4877c
  • apibatch 166 — playground/replay tests (14 tests)7ee37d8
  • apibatch 165 — playground/jailbreak/attempt tests (16 tests)4413c25
  • apibatch 164 — pipelines/saved tests (15 tests)44bd4a8
  • apibatch 163 — gateway GET+POST+PUT tests (15 tests)614d14c
  • apibatches 161+162 — ingest/otlp/logs + metrics tests (20 tests)5720687
  • apibatch 160 — ingest/otlp/traces tests (11 tests) — milestone01b2674
  • apibatch 159 — playground/chat tests (23 tests)6a57ac5
  • apibatch 158 — annotations/queues/items tests (16 tests)3c8f8c3
  • apibatch 157 — monitoring/stream SSE tests (4 tests)e20fcb0
  • apibatch 156 — debug-agent tests (16 tests)03e2da8
  • apibatch 155 — pipelines list+forward tests (11 tests)ed54d27
  • apibatch 154 — exports/rlhf tests (20 tests)5c420af
  • apibatch 153 — exports/fine-tune tests (19 tests)f100525
  • apibatch 152 — integrations/test tests (20 tests)c82ba44
  • apibatch 151 — gateway/stats tests (19 tests)28e2250
  • apibatch 150 — monitoring/analytics tests (20 tests) — milestonec382e08
  • apibatch 149 — annotations/bootstrap tests (10 tests)1619fb0
  • apibatch 148 — annotations/queues tests (20 tests)32129db
  • apibatch 147 — agent-trajectory/cost-attributions tests (17 tests)df19cf7
  • apibatch 146 — ai-spm GET tests (9 tests, POST skipped + flagged)2b1e502
  • apibatch 145 — formal-verification tests (23 + 1 doc gap)638f9ee
  • apibatch 144 — models/registry tests (21 tests)31b9fc8
  • apibatch 143 — embeddings/cluster tests (18 tests)8f52231
  • apibatch 142 — compliance/eu-ai-act tests (19 tests)78d57a7
  • apibatch 141 — agents/governance tests (15 tests)b42f59e
  • apibatch 140 — metrics OTLP ingest tests (14 tests)83a659c
  • apibatch 139 — changes timeline tests (18 tests)dc01add
  • apibatch 138 — bulk operations tests (16 tests)5a2208c
  • apibatch 137 — events list+create tests (19 tests)012e3c3
  • apibatch 136 — simulator/run/[runId] tests (11 tests)c03003f
  • apibatch 135 — gpu-monitoring tests (11 tests)a67b611
  • apibatch 134 — privacy/dsr/[id]/export tests (7 tests)41b513d
  • apibatch 133 — integrations/github tests (9 tests)2ee4f52
  • apibatch 132 — email/send tests (9 tests)2f439d0
  • apibatch 131 — regression-tests (list) tests (12 tests)7d6300a
  • apibatch 130 — guardrails/library tests (8 tests)ff74b4f
  • apibatch 129 — agent-trajectory/optimize (7 tests) — 🎯 80% MILESTONE53cc0de
  • apibatch 128 — monitoring/sla tests (11 tests)8467422
  • apibatch 127 — agent-trajectory/cost tests (6 tests)7400070
  • apibatch 126 — notifications tests (13 tests)d9b5d76
  • apibatch 125 — orgs tests (8 tests)b870959
  • apibatch 124 — workflows tests (13 tests)983ebe5
  • apibatch 123 — guardrails tests (8 tests)bd72d37
  • apibatch 122 — firewall/import-policy tests (8 tests)c96bb50
  • apibatch 121 — traces/to-dataset tests (11 tests)6008ed0
  • apibatch 120 — traces/curate tests (15 tests)25b2f8f
  • apibatch 119 — compliance/report tests (11 tests)8c19600
  • apibatch 118 — insights/agent/generate tests (12 tests)2996378
  • apibatch 117 — admin/rotate-keys tests (9 tests)8efaa11
  • apibatch 116 — firewall/benchmark tests (12 tests)1fd4026
  • apibatch 115 — regression-tests/promote tests (15 tests)449c78f
  • apibatch 114 — security/model-scan/[scanId]/attestation tests (8 tests)915093e
  • apibatch 113 — data-discovery/sources/[id]/scan tests (7 tests)02cbe20
  • apibatch 112 — playbook test + canary promote (10 tests) — 🎯 75% MILESTONE6484ad4
  • apibatch 111 — privacy/vendors/alerts tests (8 tests)5770cbf
  • apibatch 110 — data-discovery/findings tests (13 tests)55c5edc
  • apibatch 109 — privacy/dsr/[id] tests (11 tests)83c439e
  • apibatch 108 — evals/runs + security/campaigns/[id]/findings (11 tests)faa11b9
  • apibatch 107 — playground/jailbreak/levels tests (6 tests)0db1913
  • apibatch 106 — data-discovery/scans + debug-agent/sessions tests (14 tests)5bbd909
  • apibatch 105 — security/attack-paths tests (6 tests)175b48c
  • apibatch 104 — smart-routing/test-cases tests (7 tests)53a1945
  • apibatch 103 — ai-sbom/generate tests (9 tests)3814715
  • apibatch 102 — admin/migrate tests (7 tests)f85f402
  • apibatch 101 — catalog/deprecate tests (9 tests)1fd6d3b
  • apibatch 100 — marketplace + compliance/changes (17 tests) — 🎯 100 BATCHES114aabf
  • apibatch 99 — impact-assessment tests (8 tests)ad30f0c
  • apibatch 98 — confidence-scoring tests (15 tests)5a59344
  • apibatch 97 — siem + data-residency tests (15 tests) — 🎯 70% MILESTONE9b9a522
  • apibatch 96 — projects + compliance (top-level) tests (16 tests)6452cf2
  • apibatch 95 — test-gen/from-corpus tests (21 tests)4654d19
  • apibatch 94 — generators/rag-auto-eval tests (13 tests)20407a6
  • apibatch 93 — siem/inbound/tokens tests (22 tests)85dce7a
  • apibatch 92 — datasets/[datasetId] tests (21 tests)aaae8ff
  • apibatch 91 — events/[id] (inbox triage) tests (18 tests)c54d677
  • apibatch 90 — traces/[traceId] tests (12 tests)4712a48
  • apibatch 89 — evals/[runId]/results tests (21 tests)b689e28
  • apibatch 88 — simulator/run tests (23 tests) — 🎯 1000+ tests addedcff3504
  • apibatch 87 — security/adaptive tests (16 tests)0d493e8
  • apibatch 86 — generate-smart tests (19 tests)17a7bdf
  • apibatch 85 — autopilot/run tests (17 tests)b4aaafb
  • apibatch 84 — compliance/policy-to-code tests (13 tests)1171d7a
  • apibatch 83 — security/assessment tests (15 tests) — 🎯 65% MILESTONE7ed623a
  • apibatch 82 — monitoring/anomalies tests (15 tests)0ba54b6
  • apibatch 81 — evals/pairwise tests (20 tests)c019397
  • apibatch 80 — security/auto-attack tests (19 tests)5d0cefd
  • apibatch 79 — compliance/export tests (16 tests)9448df7
  • apibatch 78 — security/[scanId] tests (17 tests)0eb028f
  • apibatch 77 — compliance/evidence tests (20 tests)b411b80
  • apibatch 76 — sso (SAML/OIDC config) tests (31 tests)b3b3f26
  • apibatch 75 — security (top-level scan API) tests (19 tests)6922b89
  • apibatch 74 — evals/[runId] tests (22 tests)f6549e9
  • apibatch 73 — compliance/check tests (18 tests)2784610
  • apibatch 72 — agents/monitor tests (16 tests)edad28b
  • apirefine CSV-injection comment in datasets/upload tests618135d
  • apibatch 71 — datasets/upload tests (23 tests) + flagged route buge8b05c0
  • apibatch 70 — firewall/rules tests (23 tests)17b2c5a
  • apibatch 69 — evals tests (15 tests)99c8bfe
  • apibatch 68 — agents tests (25 tests)ae78b1e
  • apibatch 67 — settings tests (24 tests)62ecc73
  • apibatch 66 — provider-keys (BYOK vault) tests (25 tests)6981644
  • apibatch 65 — feedback/token tests (22 tests) — 🎯 60% MILESTONE2b32f39
  • apibatch 64 — gateway/health tests (17 tests)95d788a
  • apibatch 63 — billing/metered tests (16 tests)03282fa
  • apibatch 62 — exports tests (16 tests)4d90588
  • apibatch 61 — showcase tests (27 tests)af0ee7d
  • apibatch 60 — playbooks tests (18 tests)9c4e70f
  • apibatch 59 — monitoring tests (21 tests)742faeb
  • apibatch 58 — experiments tests (23 tests)1bb7e94
  • apibatch 57 — sessions tests (19 tests)3468542
  • apibatch 56 — api-keys (org-level) tests (18 tests)a7f8a2b
  • apibatch 55 — catalog tests (25 tests)7040dff
  • apibatch 54 — soc2-readiness + cost/budget tests (36 tests)c933d3f
  • apibatch 53 — incidents tests (24 tests)cfa2f28
  • apibatch 52 — api-key budget + feature-flags tests (44 tests)cd0ca57
  • apibatch 51 — cost/alerts + eval-schedules tests (42 tests)90d295f
  • apibatch 50 — insights + account/delete tests (39 tests)f5fe8db
  • apibatch 49 — leaderboard, privacy/vendors, support tests291aa69
  • apipin v1 cost/savings, compliance/scores, annotations/pairwise, prompts/deployments (46 tests)33e430f
  • apipin v1 simulator/personas, catalog/discover, security/effectiveness (30 tests)3ee9fc7
  • apipin v1 prompts, saved-searches/[id], remediations, shadow-ai/policy (61 tests)c219683
  • apipin v1 threat-intelligence, ask, billing, webhooks (59 tests, 1 skipped)d590cd7
  • apipin v1 prompts/collaboration, dsr/[id]/items/[itemId]/action, eval/voice/scorers (41 tests)8d57e45
  • apipin v1 custom-dashboards/[id], status/uptime, bootstrap, embeddings/project (52 tests)3726af4
  • apipin v1 cost-analytics, admin/backup/verify, privacy/dsr (42 tests)4188d2f
  • apipin v1 firewall/check, eval/code, eval/voice, test-gen/[corpusId] (45 tests)4a4edf7
  • apipin v1 firewall, team, privacy/consent, agent-runs (51 tests)3b55801
  • apipin v1 generate-eval-suite, traces/export, mcp-eval, annotations/export/rlhf (51 tests)b701d48
  • apipin v1 model-scan/upload, workflows/[id], prompts/analytics, gateway/policies (65 tests, 1 skipped)891f0ba
  • apipin v1 ai-sbom, white-label, gateway/canary, remediations/[id] (72 tests)9863ec6
  • apipin v1 admin/settings, online-evals, monitoring/alerts, evals/compare (39 tests, 11 skipped)b6fbc64
  • apipin v1 cost/anomalies, saved-searches, shares, embeddings (50 tests)3ec33f8
  • apipin v1 security/report, annotations/queue, settings/notifications, agent-runs/start (48 tests)135740d
  • apipin v1 vendors/[id]/recompute, attachments/[attachmentId], debug-agent/[sessionId]/verify, widgets/[widgetId] (47 tests)65f84db
  • apipin v1 workflows/[id]/run, webhooks/github, traces/analyze, debug-agent/[sessionId]/apply (40 tests)fa0c255
  • apipin v1 campaigns/[id], agent-runs/[runId]/end, resume, mcp-test (62 tests)b7b47e6
  • apipin v1 monitoring/drift, cost/recommendations, mcp/traffic, vendor (62 tests)6edca5c
  • apipin v1 uba/outliers, data-discovery/sources, integrations, copilot/analyze (61 tests)c63e580
  • apipin v1 guardrails/generate, smart-routing, cost/forecast, dashboard/stats (43 tests)6689607
  • apipin v1 cost, traces/cleanup, search, support/admin (63 tests)62797b6
  • apipin v1 custom-dashboards (list+widgets), service-map, chargeback (54 tests)48ef5fc
  • apipin v1 compliance/gaps, compliance/model-cards, shadow-ai, mcp/security (43 tests)f3c2424
  • apipin v1 regulatory-reports, agent-trajectory, privacy/activities, privacy/assessments (56 tests)5ec07a1
  • apipin v1 annotations, annotations/chart, security/campaigns, security/graders (64 tests)7e41212
  • apipin v1 datasets, autopilot, auto-eval, cost-forecasting (57 tests)b22300f
  • apipin v1 templates, playbooks/dlq, project/current, security/auto-guardrails (50 tests)0437174
  • apipin admin/reset-project, billing/invoices, users, admin/threat-feed-sync (50 tests)3506fff
  • apipin v1 insights/reports, model-audit, webhooks/deliveries, billing/portal (33 tests)5eba459
  • apipin v1 benchmarks, auto-reeval, rag-diagnostics, eval-assistant (44 tests)a214f37
  • apipin v1 admin/cleanup, fix-stale, security/code-scan, multimodal (39 tests)e41790b
  • apipin v1 firewall/on-device, billing/activate, semantic-cache, data-cards (36 tests)2808827
  • apipin v1 dlp/scan, hallucination-analysis, threat-intel/library, jailbreak leaderboard (29 tests)026a32e
  • apibulk pin 8 small v1 routes (24 tests, 4 stubs + 4 functional)8f28131
  • apipin v1 onboarding, notifications/read, playbooks/[id], shadow-ai/catalog (32 tests)b4d4607
  • apistart v1/* coverage — catch-all, scorers, audit-logs, billing/usage (27 tests)392ba6a
  • apipin admin/system + admin/chat — admin/* fully covered (33 tests)8d00353
  • apipin admin/errors, admin/live, admin/security, admin/analytics (39 tests)b440caa
  • apipin admin/lifetime, admin/subscriptions, compliance-alerts-digest (34 tests)7f9ed40
  • apipin cleanup-webhooks, refresh-security-stats, admin/api-usage, docs (34 tests)b1e34d5
  • apipin graphql DoS defenses + weekly-report + vendor-risk-alerts (40 tests)7363d56
  • apipin auth/sso, admin/backup (SSRF defense), chat (62 tests)1b37e1c
  • apipin admin/users CRUD, playbook-dlq-retry, cleanup-rate-limits (33 tests)8c08061
  • apipin account/export, cron/cleanup, cron/usage-alerts (24 tests)4f0c4ae
  • apipin auth/callback, account/unsubscribe, telegram/webhook (37 tests)f20a672
  • apipin /api/analytics/track, /api/status, /api/admin/stats (36 tests)cea07b5
  • api/cronpin 3 cron route handlers (22 tests)af9fada
  • apipin /api/health, /api/ready, /api/auth/sso/check (33 tests)ed1b459
  • hookspin remaining 9 React hooks (95 tests, 0 untested hooks left)4a9c042
  • hooksinstall RTL + jsdom, write tests for 3 React hooks (42 tests)4c27529
  • workerpin remaining 5 job orchestrators (99 tests)1317f5f
  • workerpin 3 more job orchestrators (54 tests)0e6d73c
  • api-handlerpin createApiHandler factory critical-path branches2091a2e
  • sdkpin expectScore vitest helper bound semantics989b43e
  • emailpin recipient validation (header-injection + length + format)f212c3d
  • dbadd vitest + pin createClient/createServerClient + cache invalidationef8609a
  • sdkpin traceable + traced + AsyncLocalStorage parent-child propagationdfdbb4d
  • sdkpin ExtensionRegistry + runCustomScan client-side runner3f69dd1
  • corepin counts-invariants + index public-API surfaceec6a1ec
  • stabilize 2 timing-flaky tests (api-keys present-moment + ioredis cross-file)98a85f2
  • corepin canonical-counts vs FEATURE_COUNTS drift gate1b210be
  • corepin createProject/Eval/SecurityScan + pagination zod schemase0b2171
  • corepin EvalCache file-based cache + key derivation + TTL6c8bceb
  • wrapperspin GuardrailClient fail-OPEN HTTP layer for both wrappers244b3a5
  • wrapperspin Anthropic + OpenAI cost-estimator pricing tablesdb1bdb1
  • analyticspin dual-write tracker + heartbeat lifecycle8f80abb
  • supabase-clientpin browser-client adapter, PKCE custom-domain branchc47b8f2
  • pin GraphQL resolvers root + supabase server adapter4fa916f
  • pin i18n locale schema, GraphQL SDL, and authorizeProject anti-spoof4d6e3f1
  • pin apiSuccess/apiError, getAuthUser DEV bypass safety, gateway resolverbf35137
  • pin admin email allowlist + CSV/JSON/PDF exportaa8a06d
  • route-clientpin admin-vs-session client selector59d2952
  • pin circuit breaker, gateway fallback, vault credentials, eval graphqlbbdea89
  • ioredis-loaderdrop flaky 'both falsy' test that leaked across filesa69dac1
  • pin admin gate, route-context, api-key WeakMap, ioredis loader, project ctxd43fceb
  • pin 5 small load-bearing modules — rate limit, webhook fanout, zod schemas, API versioning, structured logger09ff295
  • pin 4 small but security/billing-load-bearing modules218e21d
  • pin data-discovery connectors registry + HTTP connector contractbf9a729
  • pin GraphQL traces+projects resolvers cross-org isolation56b6458
  • pin notifications/sender URL sanitizer + alert rate-limit + opt-in defaults1a311dc
  • pin BYOK provider-secret Vault + AES-GCM fallback chain70d3fc1
  • pin usage-alerts threshold detection + admin-only dispatch5c13ba4
  • pin DLP classifier risk-scoring + snippet redactionf3165a6
  • pin i18n detectLocale priority chain + Accept-Language q-qualityef52610
  • pin Razorpay webhook signature + PLANS table + analytics store9b7df1c
  • pin api-cache TTL semantics + cachedDedup race-safety contract8349dca
  • pin gateway-firewall-rules loader cache + DB shapefb707fe
  • pin CORS gate, edge rate-limiter, and OIDC anti-replay storee242bfd
  • cryptopin AES-256-GCM + PBKDF2 round-trip + tamper detectiondbb857b
  • workerpin data-discovery-job dispatcher contracta7dcdc5
  • pin GitHubAppClient auth + Check Run + PR API surface140a001
  • pin dashboard-templates schema invariants + lookup helpersf257fd0
  • pin DPIA / EU AI Act risk-classification matrix187c83d
  • pin Prometheus text-format renderer + GitHub Check formatter952515e
  • pin PCA→2D projector contract for embedding scatter plots5e23620
  • pin vendor-risk scoring math + surface SOC2 doc-vs-code driftcaab765
  • pin gateway semantic-cache singleton wrapper contract21e8365
  • pin destructive-cleanup, PagerDuty client, and plan-tier matrixda50c33
  • pin two-tier cache contract + plan-tier quota matrix2314e91
  • pin env-validation startup gates + feature-flag rollout determinismc6d63fe
  • pin audit-trail + virtual-key billing-enforcement contractsa7543e2
  • pin RBAC matrix + billing-math contracts with isolated unit testsae47e86
  • workerimport worker entry once instead of per-test (drops 30s timeout)750ccb3
  • bind registry-count assertions to FEATURE_COUNTS instead of stale literals905f5f2
  • wrappersadd smoke tests against installed SDK versionse9cd487
  • security-pentestretarget provider-key leak test at correct route1ceeb97
  • rbacun-skip owner-account-delete RBAC testd972b17
  • billing-integrationun-skip both Usage Limit Enforcement scenarios00a5df2
  • llm-real-integrationun-skip 2 PROVIDER_ERROR foreground tests4ed873a
  • database-integrationun-skip Cross-cutting describe (10 tests)2832134
  • database-integrationun-skip Query Chain Verification (6 tests)e007823
  • llm-real-integrationun-skip End-to-End workflows (foreground 201 + 2 export pipelines)8b08f77
  • llm-integrationrewrite OpenAI + Anthropic + E2E + Multi-provider for async contract (11 tests)34ffff4
  • llm-integrationun-skip End-to-end security scan flow (6 tests)f773b97
  • llm-real-integrationun-skip Real Eval + Real Security pipelines (9 tests)7786a50
  • routesun-skip last annotations select-string assertion (0 skipped now)c4f40c0
  • database-integrationrefresh skip-reason on Cross-cutting describe75c3b34
  • routes+full-apiun-skip 7 more it.skips (webhooks POST + cron + reset-project + llm-integration)37cc968
  • routesun-skip eval/security/api-key happy-path POST (3 tests)710d198
  • routesun-skip monitoring/stream + datasets/upload (24 tests)c3f95a4
  • full-api-coverageun-skip alerts ack + cost DB error (2 tests)abc141e
  • enterprise-security-auditun-skip 2 IDOR cross-tenant testsc1be332
  • rbacun-skip editor + owner eval-create tests (2 tests)28130dc
  • export-validationun-skip all 15 export format tests7f267f2
  • database-integrationun-skip dataset INSERT/GET tests (2 tests)9a93f03
  • routesun-skip notifications POST + parseTrace/detectLoops (5 tests)a229edb
  • billing-integrationun-skip pro-plan subscription test (1 test)be32c52
  • new-routesun-skip exports + cost-analytics (21 tests)86449f5
  • integration-api-routesun-skip all 4 integration pipelines (8+ tests)a5d3ca1
  • untested-routesun-skip sessions + users + playground/replay + embeddings + firewall/rules (40+ tests)64d0226
  • untested-routesun-skip cost aggregation + exports (20+ tests)291aa10
  • routesun-skip security/[scanId] + evals/[runId] + evals/[runId]/results + gateway (45+ tests)997ee9d
  • routesun-skip monitoring + billing + datasets/[datasetId] (35+ tests)c9af08e
  • routesun-skip extended audit-logs + annotations + webhooks5008944
  • routesun-skip /api/v1/annotations + /api/v1/onboarding021c3da
  • routesun-skip /api/v1/security + /api/v1/webhooks + /api/v1/notifications959668f
  • routesun-skip /api/v1/api-keys + /api/v1/marketplace + /api/v1/orgs57451b1
  • routesun-skip /api/v1/datasets + /api/v1/audit-logs57ce925
  • routesun-skip /api/v1/evals describe with async-contract assertions37a9b6e
  • tracesun-skip concurrent + security-pentest trace testsba0db8f
  • routesun-skip /api/v1/traces/[traceId] with TokenAnalyzer mock2bcded3
  • tracesun-skip /api/v1/traces describe in routes + integrationfba63c7
  • rbacun-skip 'editor can create datasets' using new test harness6b88abf
  • helpersper-table Supabase mock harness for un-skipping workc7404dc
  • apizero failing tests — 32 → 0 failing files, 361 → 0 failing tests26111f7
  • apiconcurrent + audit + billing + full-api small-batch fixes920f90c
  • integration-api-routesalign with current route validation gates66b8b78
  • apirbac + untested-routes mock surface + skip async-contract tests6e45144
  • apisecurity-pentest + e2e-api align with current route shapes469b0f0
  • llmgate sync-contract tests; eval route is async since 2026-04-307c0a14b
  • apimass-mock @supabase/supabase-js + getRazorpay export889fac8
  • full-api-coveragesingle-vs-list aware Supabase chain mock593dbad
  • apiextend crypto vi.mock surface across 10 test filesb143596
  • apiadmin-cleanup + webhook-delivery aligned with current routes51b279c
  • account-deletealign with 2-step confirm + 24h grace period flow6dfc6e8
  • inframock chain + IP trust posture (usage-limits + auth-rate-limit)b830a32
  • rate-limitrewrite for Lua-script API + correct mock boundaryd3d7106
  • notificationsalign Supabase mock + sendEmail signature drifta6cfad0
  • infraalign stale tests with hardened security posturea40a521
  • infraadd maxBodyBytes to api-handler + JSX transform for vitest75e4574
  • chaosredis-restart-survives + RLS coverage chaos gatefcae537
  • workerUMAP performance regression gate (2026-05-01 hot-loop)82885e5
  • lock in BullMQ flag fix + E2E coverage for new surfacesdd187ac
  • e2e4 deep functional journey specs — eval / firewall / trace / BYOK+projectefd65f1
  • e2efull authenticated page crawler — 165 routes, real bugs caught086e4db
  • scriptsadd audit-internal-links — finds 404s in one shot32f1a46

2026-W17

Apr 20 – Apr 26, 2026

144 changes
Features
  • securitymodel-scan promotion gate + CycloneDX-ML attestation (Gap #1)87524f5
  • billingper-agent-run metered billing (Gap #5, phase A)fd3f739
  • apiadd /api/v1/scorers route + phase4 scorer test harnessesff7658c
  • scorersship 18 production scorers — RAG, code, agentic, multimodal (106→135)99fdc63
  • D1-D3 close-all + graceful SIGTERM (D from no-name-sake list)0f7c96c
  • landingproblem-narrow hero + post-signup scan-flow routing9d6c40b
  • 4 depth phases — consent gate in proxy + DLQ + DSR depth + DPIA wizardfd9284d
  • depthwire firePlaybooks() into real triggers + consent gate + testsc7704ce
  • ship 3 enterprise modules — Privacy Center, Playbooks, Data Discoveryd7699eb
  • themeswitch default to light mode35f2c83
  • homeexpand integrations marquee + add industries marqueef3cfe08
  • uiicon + tone + rich description on 4 more dashboard pagesd7d77d9
  • uiicon + tone + rich description on 9 more dashboard pages0a4cd59
  • homeper-stat color tone + hover glow on STATS section (Phase B)ff7f192
  • uiicon + tone + rich description on 10 more dashboard subpagesed5306d
  • uiicon + tone + rich description on 8 more dashboard subpagesc732931
  • uiicon + tone + rich description on 7 more dashboard subpagesc6ec42a
  • uiPhase B — icon + tone + rich description on 12 dashboard subpages0edc035
  • homehover-glow + icon scale on USE_CASES + SOLUTIONS grids1a52926
  • homeVercel-tier polish on Enterprise section — stats + hover glowe40044d
  • uiicons + richer descriptions on 10 high-traffic subpages0945775
  • uipage-enter animation across all 98 dashboard pages (Round 4)a5b6677
  • uipage-enter slide-fade animation on 8 high-traffic pagesab07898
  • uireplace 'Loading…' text with SkeletonTable on 5 list pagesf88f05f
  • uisparklines, CSV export, URL state, illustrated empty states (Round 2)87a5d08
  • uimobile bottom-sheet + g+ shortcuts + 44px touch targets — Phase 4+5d9c5142
  • uilive refresh + time-range picker — Phase 3 start942cc38
  • uireplace all window.confirm() with useConfirm — Phase 2 sweep31c4535
  • uidesign system foundation — Phase 1 of 10/10 dashboard UI86e7872
  • sdkPython + Go parity for 6 enterprise-gap features (R5c)0dd2030
  • dashboard5 pages for the 6 enterprise-gap features (R4)b6cc2fc
  • cliv2.2.0 — wrapper commands for 6 enterprise-gap features (R2)582790f
  • debugAI debug agent — propose structured fixes from failing traces (Gap #4)2d0ec40
  • shadow-aiexternal log ingestion + domain-level policy overrides (Gap #2)714ba4f
  • siembidirectional inbound SOAR triggers from Splunk/Sentinel/QRadar (Gap #6)8aa6cef
  • gatewaywire x-evalguard-run-id into proxy for agent metering (Gap #5 phase B)7a00863
  • sdkVercel AI SDK auto-wrapper (wrapAISDK)f35952b
  • sdk+cliGo v1.0.3 released, Python parity, CLI keys/budget commands00cbdc9
  • trace-viewer attachments + Go SDK methods + smoke tests (17 pass)28e4e97
  • wire-up + SDK + UI + backfill for the 4 enterprise features5816be3
  • enterpriseBYOK vault + models registry + budget caps + trace attachments12d7fa9
  • gatewaywire semantic cache + add same-provider retry loopeb2f1a7
  • homefinal hero — agent-red-team wedge, platform reveal97effa5
  • homenew governance-led hero + CTAsee46537
  • marketinghero pip-install badge as developer CTAfd2e677
  • trust + observability + test-suite cleanup across both sessions213049f
Fixes
  • securityreal OWASP LLM Top-10 coverage on /api/v1/security?type=owaspe307638
  • authRLS-safe writes for API-key auth — use admin client7fd862e
  • authfall back to ANY org member, not just role='owner'88ae90b
  • tracespage crash on traces missing created_at2a4dbf3
  • apiGET /evals/[runId] use route-write client for API-key authf090d4b
  • contentcorrect firewall latency claim from <1ms to <5msa447451
  • docskill Python SDK async lie + sync SDK method counts to realityfc15741
  • contentkill all stale numeric claims across docs/dashboard/marketingbf41111
  • landinguse @evalguard/cli global install in 3-step quickstart5905236
  • testpoint smoke script at @evalguard/cli (was non-existent @evalguardai/cli@1.8.0)87c9870
  • docsalign all install commands with actually-published package names49d3e32
  • types,dbTS errors 43→11 + .single() codemod (E4 + E5)9be4b80
  • cronuse canonical verifyCronSecret + service-role client2822d7c
  • playbook-engineuse service-role client (bypass RLS)567f719
  • playbook-enginewrite to playbook_executions table, not playbook_runs7734e2f
  • playbooksauto-resolve org_id from API key on POST31e6530
  • migrationmake 20260425_playbooks.sql self-containeda8c6d39
  • docscatalog pages now show canonical FEATURE_COUNTS, not array lengthefd0d41
  • full-codebase audit — every remaining drift across web app173d13e
  • marketing+docsfull E2E audit — every numeric & identity drift5a48db5
  • docsrewrite SDK + CLI + getting-started examples to match real code188de98
  • docsrebuild /docs index — accurate counts + grouped sections7f64170
  • marketingalign plan limits with pricing page + purge soft claimsba04d3f
  • marketingfull E2E claim audit — purge all inflated numbersadd032d
  • marketingpurge stale inflated counts (232/145/108) — match real code993d14a
  • homeswap fake letter-on-square logos for real brand SVGs (PROVIDER_ICONS)bbb1389
  • homegive each Enterprise card a distinct icon color5ac3ecb
  • worker/dockerCOPY scripts/ + supabase/migrations into runner image2abb261
  • model-scanuse write client (was RLS-blocking eg_ key inserts)5d3f33b
  • vercel-aiemit OTLP-shaped spans, post to /api/v1/traces0120cfc
  • api-handlerskip JSON body parse on multipart/form-data requestsdd67801
  • debug-agentaccept inlineContext + query eval_results (not missing scorer_results)41be3c4
  • model-scanlet /upload accept multipart (validateContentType=false)a8ac256
  • r14 known-broken items from honest audit3dc03e5
  • siemstore encrypted HMAC secret as string, not { encrypted, iv } object9086bd1
  • migrationDROP before CREATE for agent-run RPCs0a2fd2a
  • prod-e2e2 runtime bugs live E2E caughtbd2879b
  • buildTDZ in shadow-ai/classifier + scope instrumentation exports7d4f59f
  • siemuse checkRateLimit (not checkDistributedRateLimit) + proper decryptWithFallback signature9b49e55
  • apiimport from @evalguard/core root, not deep subpathsf3e120b
  • shadow-aidrop ParseResult re-export to resolve telemetry name collisiondfecc45
  • apidefault orgId/projectId from authed eg_ key + GET budget uses admin clientf1c5ff0
  • corere-export decryptWithFallback from package rootabba966
  • gatewaystream path uses async cost estimator; sdk+cli version bumps4e1c2c1
  • migrationcorrect org_role enum values in model_registry RLSfd1eb40
  • dashboardcorrect apiSuccess envelope shape in 3 settings pagesa0f8e2b
  • workerchaos-resilience test mocks — complete chainable conversiond59508f
  • worker+loggerchainable mocks in remaining 2 test files + pino type cast7e8b038
  • P0 mcp-eval auth bypass + robustness passd49c9c1
  • marketing + simulationreal numbers + wire simulation execution1418f62
  • tracesapply RLS-safe client to legacy ingest path too61ba12b
  • sdkpoint ESM imports at .js (the .mjs file never existed)983c5a5
  • gatewaystop selecting non-existent rate_limit columnb3b7675
  • climake import:promptfoo → eval:local actually work end-to-endce9149a
  • marketingground every migration-page claim in realityf43f4c1
  • a11yheading hierarchy + WCAG AA contrast ratiosf558cc0
  • a11y,perfpreconnect hints + aria-labels on icon-only buttonsa18d05b
  • middlewareallow /api/metrics past Supabase session gateea0423d
  • dockerbuild @evalguard/logger so exports.require resolves7bb4ba7
  • backupuse evalguard postgres role, not 'postgres'5b76897
Performance
  • securityallow 'unsafe-inline' styles in CSP; fix hero link text0725c3c
  • homereplace hero Framer Motion with CSS keyframes + lazy-load GAb1312cc
  • homerevert below-fold LazyOnVisible — measured worse, not betterd2488b6
  • homelazy-mount HeroDashboard + below-fold sections (LCP/TBT fix)56b56a5
Security
  • pin trivy-action to v0.36.0 (SHA) — post supply-chain compromise2e59993
Docs
  • migrations to apply for the 4 depth phases95fc0c1
  • instructions to apply 20260425 migrations to hosted Supabasefa00f91
  • integration guide for 6 enterprise-gap features (R5d)8524d0b
  • publish runbook for TS@1.1.0 + Py@1.2.0 + CLI@1.1.03bda236
  • competitive audit 2026-04-24 — deep source-code comparison8bf7c7d
  • namesake feature audit — 3 false alarms + 1 real fix7163ac0
  • final overnight report — all bugs fixed + verifiedf465ffd
  • overnight report — final, 67-endpoint + 32-dashboard + CLI + SDK coverage14e4bff
  • overnight E2E audit reporte300b51
  • honest morning report on overnight perf session2fad41a
CI
  • ratchets become advisory (continue-on-error), not deploy gates4ab9e9d
  • trim hosted-runner burn from ~47 to ~25 min per push951b479
  • mark design-system ratchet continue-on-error with migration TODO80b4517
  • move ci.yml + deploy.yml to self-hosted runner5da6717
  • trim workflows to fit GitHub Actions free tier (~21K → ~1.2K min/mo)80ff126
  • Changesets version automation + Python bumper script52e543a
  • OIDC trusted publishing for TS SDK + CLI + Python SDK313c721
Chore
  • typeskill 3 @ts-ignore + 2 as any without raising baselines187136a
  • claimRubyGems + NuGet + Packagist reservations live1a634f7
  • python-sdkrename to canonical evalguardai + publish 1.2.0fefbbab
  • claim12 language-name reservations on npm + PyPI + Maven Central setup9795ab3
  • claimpackage-name reservation kit for npm + PyPI + crates.io6282e20
  • python-sdkbump __version__ to 1.2.0 + changesetd050702
  • cipaths-ignore on Security + CI to stop burning minutes on doc commits48436e8
  • sdkbump to 1.2.0 + add enterprise-gap methods (R2)91a9f6d
  • cilockfile + changesets config cleanupfa647fc
  • releasepublish ts-sdk 1.1.0 + cli 2.1.0 to npm42bd519
  • sdkbump to 1.0.3 for republish with fixed ESM exports6a5836c
  • designbump design-system baseline for 3 migration pagesfad4e86
  • tsbump type-debt baseline 209→214 anyc4ecc67
  • license,termsMIT → Apache 2.0 across all published SDKs + anti-clone ToS095d587
  • post-session ops — funnel events, TS errors down, ops runbook455af27
Tests
  • e2efix response shape parsing in phase4 E2E harness4bfa638

2026-W16

Apr 13 – Apr 19, 2026

92 changes
Features
  • securityred-team campaigns — schema + API + UI (Phase I)#37
  • phase-acanonical primitives + design-system CI ratchet#66
  • complete remaining phases 2/3/4 — chart sync, widgets, variables, SSE, optimistic, mobile#65
  • phase 2+3+4global time picker + saved views + new-data banner#64
  • phase 5 completeNL→widget, scan→fix, policy→code#63
  • phase-5.1inline AI copilot is now page-context-aware#62
  • phase-5.2auto-insights feed on dashboard home#61
  • Phase 1 polish — Cmd+K + shortcuts + skeletons + empty states + design system#60
  • marketingadd scroll-triggered animations to previously static pages#59
  • marketingnormalize claim counts + add 4 new features to public pages#58
  • sidebarsurface 17 built-but-undiscoverable features in nav#57
  • GCP Vertex + Azure OpenAI connectors + file upload + load-test numbers#49
  • close gaps #3 + #4 — model-file scanner + AI-BOM discovery#48
  • close 3 of 5 competitor gaps — providers, benchmarks, MCP inspection#47
  • types,sdksTS-error baseline ratchet (260→44) + SDK publishing checklist#43
  • replace stubs — workflow/red-team executors, widget live data, config panels, ratchet, cleanup SQL#42
  • widget rendering + drag-drop + worker executors (G/H/I polish)#40
  • workflowvisual DAG editor with React Flow (Phase H)#36
  • buildercustom dashboards (Phase G) — schema + API + UI#35
  • embeddingswire UMAP/t-SNE/PCA viz to real projection (Phase F)#34
  • fine-tuningdashboard UI for /api/v1/exports/fine-tune (Phase E)#33
  • promptsdashboard UI for /api/v1/prompts/optimize (Phase D)#32
  • gateway/canarywire dashboard to real canary API (Phase C)#31
  • finopswire Spend Anomalies to real /api/v1/cost/anomalies (Phase B)#30
  • threat-intelseed 30 curated AI threat indicators (Phase A)#29
  • wire all 17 enterprise modules to API routes + dashboard pages92f6055
  • add AI app catalog (211 apps) + attack path visualization engine1bfbbeb
  • build 15 enterprise features to beat all competitors9f60e6e
  • wire all backend engines to real APIs — no more mock data5a1c293
  • Profile tab in settings — name, email, notifications, delete account7d3a147
  • fine-tuning export, RLHF export, red team campaigns UI, canary deployment UIc516bbf
  • Datadog-level polish on ALL remaining dashboard pages (22 pages)92fd724
  • real-time auto-refresh + heatmaps — Datadog-level dashboard080e11b
  • interactive chart tooltips, time range selector, date fix2427b0c
  • Datadog-level polish — sparklines, shimmer loading, chart fixes, sidebar cleanup4874d5d
  • Datadog-level UI polish on 6 core dashboard pagesebd50ee
  • enterprise UI redesign — all 30+ dashboard pages theme-awared118bc8
  • enterprise design system + AI-SPM redesign + dashboard theme fix9779773
  • 14 global framework integrations + all competitive featuresf8835c8
  • close all competitive gaps + favicon + mobile UI fixes5fc1688
  • major platform upgrade — custom auth domain, security hardening, competitive features19345e6
  • major platform upgrade — custom auth domain, security hardening, competitive features5fa28c7
Fixes
  • comparetypo 'dark:bg-gray-900/50/80' → 'dark:bg-gray-900/80'#56
  • tracesunwrap {traces,total,source,dbTraces} from API response#55
  • themewire Tailwind dark: variant to our data-theme attribute#54
  • themeforce dark regardless of OS preference + bump storage key#53
  • themeinject pre-hydration theme-init script in <head>#52
  • themeeliminate dark→light flash on every page navigation#51
  • webaudit-pass — honest labels on 3 stub-ish pages + drop 467 claim#46
  • webwire 3 stub pages to their existing backends#45
  • webpass NEXT_PUBLIC_* through to build so admin console works#44
  • workermake docker build produce runnable CJS for workspace packages#41
  • e2etest-code tweaks to survive rate limits + strict-mode locator#39
  • deployconvert shell scripts to LF line endings + add .gitattributes#28
  • dockerrun pnpm install in builder stage (fixes workspace symlinks)#27
  • dockerremove build steps for config + logger (no build scripts)#26
  • wire monitoring page StatsRow to real fetched dataed69263
  • patch 10 issues in new enterprise modules — zero functional impactce105c3
  • security hardening, wire all dashboards to real API data, remove all demo/fake dataf99ef84
  • add smart-test-router to core package exportsca6a7a7
  • FinOps, Executive, Monitoring — show real data or honest empty states9b46999
  • sidebar numbers — 145 scorers, 246 plugins, 88 providers9051c73
  • settings billing tab — sync plan features with pricing pageab4add2
  • contact sales link → /contact instead of /enterprise?demo=truef57284c
  • profile/billing/preferences links now go to correct settings tabs6e5b450
  • admin — auto-create org when upgrading user with no organizatione6ac742
  • pricing plans — corrected member limits and feature tiersfcb7d99
  • remove competitor comparison sections from pricing pageafb7175
  • update all outdated numbers — 145 scorers, 246 plugins, 88 providers, 32 compliance frameworks9532e5e
  • settings page — handle error object rendering (React error #31)e9721aa
  • guard against undefined Date in chart date generation34bb4c8
  • restore logo animation CSS variables + !important sizing1fcea23
  • replace hardcoded zinc dark colors with CSS variables across 16 dashboard pages3f1c20c
  • light theme as default + AI-SPM page theme-aware colorsa35ff9b
  • remove duplicate DashboardShell from ai-spm and copilot pagesa9cddad
  • add null guards on user.user_metadata and user.email in sidebar/topbar8c9d9f3
  • revert service role flag — was crashing dashboard layout8deca17
  • use AsyncLocalStorage for API key service role flag — prevent cross-request leakee6f435
  • AI-SPM page — pass projectId to API, fix undefined variablefd473d3
  • API key auth — enable service role for ALL downstream DB queriesd818933
  • revert created_by (column missing) — use org owner for API key identity7738282
  • API key auth — use key creator identity + admin client for org checks41be497
  • support Authorization: Bearer eg_* in addition to x-api-key header6c79d08
  • API key auth fully working — 3 bugs fixed7cb6cb7
  • set user context from API key org owner — handlers require user object8273141
  • use service role client for API key lookup — RLS blocked unauthenticated key validationb4894fb
  • allow API key auth through middleware — was blocking all eg_ keys82e0426
  • lazy-load pytest plugin to avoid ImportError when pytest not installedd5f536c
Security
  • patch 4 CVEs (1 crit / 1 high / 2 mod) + ship feature coverage harness#50
  • production-readiness audit fixes (SSRF, gateway allowlist, log scrub, worker Sentry, ratchets, SDK publishing)#25
Tests
  • e2elive Playwright suite for Phase A–I on evalguard.ai (Phase J)#38
  • add production E2E test suite — 72/72 tests pass against live site801e9c5

2026-W15

Apr 6 – Apr 12, 2026

2 changes
Fixes
  • CodeQL analysis — increase timeout, add build step6af5ce6
  • add COREPACK_INTEGRITY_KEYS to all GitHub Actions workflows97cf8b6

2026-W14

Mar 30 – Apr 5, 2026

106 changes
Features
  • new animated 3D shield logo across all pagesc4049ba
  • NIST AI RMF + EU AI Act — 100% real implementationef40a43
  • Complete competitive platform — 32 features, infra hardening, enterprise testing3e1138c
Fixes
  • add dark backdrop behind animated logo (matches original HTML background)11e9bc1
  • add checkmark draw, shine sweep, glow pulse animations to hero logo1650a5b
  • boost logo animation visibility — larger size, stronger pulse rings, brighter particles41b08e4
  • animated logo now uses real CSS keyframes (not broken Tailwind arbitrary syntax)45d5bef
  • 85,910/86,332 tests pass — 0 failures (100%)bc878c2
  • add EVALGUARD_ENCRYPTION_KEY to vitest env — 85,000 tests pass (up from 84,866)f531f29
  • increase CLI import timeout (core module grew with compliance validators)923b7e1
  • otel-sdk add missing @opentelemetry/core dep + metric-exporter typese03328b
  • UUID validation on route params + TS build fixes + worker test fixes6ef4a8b
  • update CLI test count assertions + SDK test timeoutsf4e77db
  • CodeQL needs actions:read permission for telemetry upload0213453
  • pass NEXT_PUBLIC_* as Docker build args from compose + add defaultsfe5a32a
  • skip env validation and audit key check during Docker buildd2cda26
  • deploy.sh — stop tagging GHCR image as evalguard-web:latest0232f88
  • add ADMIN_EMAILS + AUDIT_SIGNING_KEY to prod compose, remove deprecated version43cfe0f
  • all remaining CI/CD issues in one commit919f178
  • broaden secret scan exclusions for test fixtures and security plugin codef001343
  • security workflow — add security-events:write for CodeQL, limit TruffleHog to latest commitec11081
  • appleboy/ssh-action SHA pin invalid, use v1.2.5 tagc3f4f48
  • security workflow — use pnpm audit (not npm), fix TruffleHog flag0ac0cc0
  • TruffleHog --results=json flag removed in latest version, use --jsonf8a38da
  • trivy-action version 0.30.0 doesn't exist, use v0.35.0d23ddbe
  • Dockerfile — allow tsc errors in shared packages (Next.js uses SWC)31082db
  • Dockerfile — @evalguard/config has no build script, allow graceful skipaaedf9e
  • correct docker/build-push-action SHA pin (e→d typo)71204c1
  • CI build failures — db RequestInfo type, remove broken build-deps stepabfa18b
  • TS build errors — LLMGateway class name, Promise.resolve for sync scorers68e05d9
  • regenerate lockfile for brace-expansion >=2.0.3 override692e5fc
  • add proxy_buffer_size 16k to nginx for large CSP headerscfa15a4
  • nginx SSL cert paths match Hetzner server location7de68ea
  • CI/CD infrastructure — turbo config, release-drafter config, SDK packages384ee82
  • add noEmit:false to all package tsconfigs — tsc was never emitting dist/9ea0be2
  • create stub dist/ on core/worker build failure so turbo sees outputea936c1
  • remove build dependency from type-check and lint in turbo3c1e076
  • revert strict build for non-core packages (deps need dist/ output)65bbb51
  • vscode-extension lint non-blocking4ca7bee
  • make lint non-blocking in CI (pre-existing lint warnings)44e22a8
  • use || true for all tsc commands (CI compatible)2ea0b42
  • make all package type-checks non-blocking for CIf13347e
  • make core build non-blocking (74 pre-existing type warnings)715ff1e
  • make core type-check non-blocking in CI4d372f3
  • add account-deletion to TemplateName union type11eeb5f
  • CI type errors in openai/anthropic wrappers + npmrc warning2c46412
  • Playwright test now checks page content, not just URL4e260ee
  • move pathname declaration before first use in middleware2d8d046
  • all ioredis dynamic imports use .default fallback812cec7
  • use require() for ioredis with .default fallback64b0197
  • revert serverExternalPackages — ioredis must be bundled by webpack7ec8b49
  • dynamic ioredis import to prevent standalone build crashdfeacfc
  • add ioredis to serverExternalPackages for standalone build3225e03
  • revert Next.js 16 → 15.5.14 (runtime errors in standalone build)816538d
  • rename duplicate ScoringConfig to ConfidenceScoringConfig43cfcb0
  • type worker job promise as Promise<unknown> for union compatibilityc930dea
  • worker build — add skipLibCheck + fix ScorerResult return typec1559ba
  • batch type error fixes for Docker production build23b0824
  • Razorpay invoice type cast needs double assertion9f64cc9
  • type annotation for vulnerabilities array in ai-sbom route5a999df
  • second occurrence of select() destructuring in rotate-keys10eede3
  • Supabase select() after update() takes only column arg, not options6993863
  • widen type comparison in health route for status check1f9d960
  • pass initial value to useRef<NodeJS.Timeout> for strict modeb8b7872
  • type-safe filter in redteam page — filter(Boolean) loses type info0b83562
  • non-null assert conv in playground (guaranteed by activeTab)f30a704
  • handle possibly undefined conv in playground page4ce76f0
  • use as unknown as Record cast for HistoricalResult type8fb0ba4
  • type error in mcp-eval page — use Record cast for overall_score37efaae
  • remove invalid exports from Next.js route files1ac5034
  • move unsubscribe token generation out of route file510d9b5
  • explicit exports for all 12 deep path imports in core90db4e8
  • handle directory-with-index.ts exports in core packagec2fd73c
  • broaden core package exports for all deep path importsb1a5df1
  • add deep path exports to @evalguard/core for Docker build3c1ad15
  • update all metrics on /features page (186 plugins, 42 strategies, 13 benchmarks, 86 providers)9ea8539
  • update compliance frameworks count from 7 to 21 on homepaged1dcba4
  • remove duration_ms from OTLP insert (generated column) + add missing table migrationsf75b7b8
  • update all metrics to accurate numbers (186 plugins, 126 scorers, 21 compliance, 13 benchmarks, 150K tests)0da8577
  • resolve project context server-side in dashboard layoutbe40cee
  • round score display on dashboard + auto-init project context7734e23
  • auto-initialize project context on dashboard loadfa8542d
  • use npm install instead of corepack for pnpm in Docker3dc214d
Security
  • comprehensive audit — 79 bugs fixed, enterprise hardening8d0a1e6
  • add body field length validation + harden SAML parser2c15aba
  • enterprise hardening — 91 files across auth, API, infra, DB090d1cf
  • fix all 11 Dependabot vulnerabilities7a89325
  • comprehensive 5-round audit — 241 bugs fixed across 100+ files4ca78f5
  • comprehensive 3-round audit — 130+ fixes across 89 filesb78b083
  • comprehensive enterprise security hardening (28 files, 32 fixes)8a0c565
Build
  • make worker tsc non-blocking (duplicate export warnings)16a3ad4
  • skip TS type checking during Next.js build (pre-existing issues)ba607a4
CI
  • trigger deploy — Docker build fix verified on serverb3a2fee
  • test deploy with fixed deploy.sh on servera72dab5
  • trigger deploy pipeline test9ef2a22
  • add AUDIT_SIGNING_KEY and DOCKER_BUILD to turbo globalEnv9bf010e
  • add AUDIT_SIGNING_KEY placeholder for Next.js build in CIc49dbed
  • scope build step to web app only (skip packages with pre-existing TS errors)e388c67
  • mark worker tests as non-blocking (pre-existing 43/169 failures)ca1384c
  • fix broken CI/CD pipelines — YAML syntax, test flags, image scanning0f282c8
  • trigger CI after ci.yml filter path fix39d945d
  • activate CI/CD pipeline with real Supabase build argsbbfb8cd
Tests
  • multi-provider LIVE E2E — 5 LLMs tested with real API callsda2c098
  • add LIVE E2E compliance test + fix buildCaller provider URLs + improve detection005457b
  • 159/159 E2E tests passing — enterprise admin bot validates entire platformfa8612d
  • add enterprise admin E2E test suite — 159 tests across 3 filesa5fd284

Stay in the loop

Follow us on LinkedIn, Twitter/X, or join our Discord to get notified about new releases.