ClickHouse OLAP layer
Opt-in dual-write of traces to ClickHouse for fast multi-day aggregations. Postgres stays the OLTP authority; ClickHouse accelerates rollups.
Self-hosted operators only. Cloud customers use the managed analytics pipeline by default.
Phase 5b foundation
The connection config table, encrypted-password vault, HTTP client, and CLI evalguard clickhouse status ship today. Dual-write of trace_spans into ClickHouse + the rollup-routing on the dashboard land in the follow-up sprint.
Why opt-in?
ClickHouse pays off when you ingest hundreds of millions of trace spans per month and need sub-second rollups across them. Below that, Postgres + the materialized views already in 00000_combined_schema.sqlare faster end-to-end (no second store to keep in sync). The opt-in row in clickhouse_config gates the dual-write path so orgs without analytics scale stay on the simpler stack.
Connection config
host_url https://my-clickhouse:8443/
database_name evalguard
username default
password ••• (encrypted at rest via the vault path
shared with provider_keys)
enabled true
health_status healthy | degraded | unreachable
last_health_check_at 2026-05-22T...
last_error NULL or vendor error textRLS: SELECT for org members; INSERT / UPDATE / DELETE owner-only. We don't let admins redirect trace writes to an attacker-controlled cluster — that's a sensitive piece of infra config.
CLI verification
# Quick probe — confirms the configured cluster is reachable + auth works evalguard clickhouse status # Output: # Status: healthy # Latency: 42ms # Database: evalguard # Host: https://my-clickhouse:8443/
HTTP client
The package exports a thin ClickHouseHTTPClientwrapper around ClickHouse's documented HTTP interface — no heavy Node driver dep, keeps the worker bundle slim. Surface:
import { ClickHouseHTTPClient } from "@evalguard/core";
const client = new ClickHouseHTTPClient({
hostUrl: "https://chc.example/",
database: "evalguard",
username: "default",
password: "•••",
timeoutMs: 10_000,
});
await client.ping();
// → { ok: true, latencyMs: 42 }
await client.insertJSONEachRow("traces", [
{ id: "t-1", duration_ms: 142, model: "gpt-4o" },
{ id: "t-2", duration_ms: 88, model: "gpt-4o" },
]);
const { data } = await client.queryJSON<{ p95: number }>(
"SELECT quantile(0.95)(duration_ms) AS p95 FROM traces WHERE date >= today() - 7"
);Vs the competition
- Langfuse — also ships a ClickHouse path; this is parity.
- Datadog — proprietary analytics layer. Ours is open + opt-in.
- Promptfoo / DeepEval — no analytics layer.