DATA ARCHITECTURE / V4.2

Ingested. Normalized. Auditable.

Quantum.DX captures every engineering signal from source through deploy, normalizes it into a canonical model, and serves it through APIs and MCP servers your agents can query directly. Nothing is recomputed without a replayable raw event behind it.

Processing pipeline

01 Edge ingest
Cloudflare Workers · HMAC verify

Per-source webhook endpoints validate signatures, deduplicate by event ID, and write raw payloads to a durable queue within ~50 ms.
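A minimal sketch of the verify-then-deduplicate step, assuming an HMAC-SHA256 signature sent as a hex string. The `EdgeIngest` class, its in-memory set, and the list standing in for the durable queue are hypothetical illustrations, not the production Workers code.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so a mismatch leaks no position.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


class EdgeIngest:
    """Hypothetical in-memory stand-in for one webhook endpoint."""

    def __init__(self, secret: bytes) -> None:
        self.secret = secret
        self.seen_event_ids: set[str] = set()
        self.queue: list[bytes] = []  # stand-in for the durable queue

    def handle(self, event_id: str, body: bytes, signature_hex: str) -> str:
        if not verify_signature(self.secret, body, signature_hex):
            return "rejected"
        if event_id in self.seen_event_ids:
            return "duplicate"
        self.seen_event_ids.add(event_id)
        self.queue.append(body)  # the raw payload is stored unmodified
        return "accepted"
```

Verification runs before deduplication so that an unsigned request can never mark an event ID as seen.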

02 Event bus
Kafka · NATS JetStream

All raw events land on a partitioned, replayable log keyed by org_id. Schema registry enforces backwards-compatible Avro contracts.
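Keying by org_id means every event for one organization lands on the same partition, so per-org ordering survives a replay. The sketch below illustrates the idea with a SHA-256 hash and an in-memory `PartitionedLog`; both are assumptions made for illustration, and Kafka's own default partitioner uses a different hash function.

```python
import hashlib


def partition_for(org_id: str, num_partitions: int) -> int:
    """Map an org_id to a stable partition index."""
    digest = hashlib.sha256(org_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class PartitionedLog:
    """Hypothetical in-memory stand-in for a partitioned, replayable log."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions: list[list[dict]] = [[] for _ in range(num_partitions)]

    def append(self, event: dict) -> int:
        index = partition_for(event["org_id"], len(self.partitions))
        self.partitions[index].append(event)
        return index

    def replay(self, org_id: str) -> list[dict]:
        # Events for one org share a partition, so append order is preserved.
        index = partition_for(org_id, len(self.partitions))
        return [e for e in self.partitions[index] if e["org_id"] == org_id]
```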

03 Normalization
Rust workers · Protobuf

Source-specific adapters map heterogeneous payloads into a unified canonical model (Actor, Repo, Change, Deploy, Incident, Session).
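The adapter pattern can be sketched as one small function per source, all returning the same canonical type. The `Change` fields below are hypothetical (the page names the entity but not its schema), and the payloads are trimmed-down versions of the public GitHub pull request and GitLab merge request webhook shapes. Python is used for brevity; the production workers are Rust.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical canonical Change entity; the real schema is not shown."""
    source: str
    repo: str
    number: int
    actor: str


def from_github(payload: dict) -> Change:
    pr = payload["pull_request"]
    return Change("github", payload["repository"]["full_name"],
                  pr["number"], pr["user"]["login"])


def from_gitlab(payload: dict) -> Change:
    attrs = payload["object_attributes"]
    return Change("gitlab", payload["project"]["path_with_namespace"],
                  attrs["iid"], payload["user"]["username"])


# One adapter per source; adding a source means adding one entry here.
ADAPTERS = {"github": from_github, "gitlab": from_gitlab}


def normalize(source: str, payload: dict) -> Change:
    return ADAPTERS[source](payload)
```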

04 Identity resolution
Deterministic + ML matcher

Stitches Git authors, Jira accounts, Slack IDs, SSO subjects, and HR records into a single Person graph per organization.
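The deterministic half of the matcher can be illustrated with a union-find over shared email addresses: any two accounts that share a normalized email collapse into one person, transitively. The account shape (`id`, `emails`) is an assumption for this sketch, and the ML matcher is not covered.

```python
from collections import defaultdict


def resolve_identities(accounts: list[dict]) -> list[set[str]]:
    """Group account IDs that are linked by at least one shared email."""
    parent = {a["id"]: a["id"] for a in accounts}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_owner: dict[str, str] = {}
    for account in accounts:
        for email in account["emails"]:
            key = email.strip().lower()
            if key in first_owner:
                # Same email seen before: merge the two accounts.
                parent[find(account["id"])] = find(first_owner[key])
            else:
                first_owner[key] = account["id"]

    people: dict[str, set[str]] = defaultdict(set)
    for account in accounts:
        people[find(account["id"])].add(account["id"])
    return list(people.values())
```

In this sketch a Git author and a Slack ID that never share an email directly still merge if a third account carries both addresses.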

05 Enrichment
Embeddings · LLM classifiers

PRs are classified as feature, maintenance, or fix; incidents are tagged by severity; free-text sentiment is scored. Every label carries provenance.
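One way to read "with provenance" is that every label records which classifier produced it, at what version, and on what evidence. The sketch below uses a keyword rule set purely as a stand-in for the LLM classifier; the `Label` fields, rule list, and version string are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    value: str       # "feature", "maintenance", or "fix"
    classifier: str  # which classifier produced the label
    version: str     # classifier version, so a label can be re-derived
    evidence: str    # the text that triggered the label ("" for the default)


# Stand-in rules; a real deployment would call a model here instead.
RULES = [
    ("fix", ("fix", "bug", "hotfix")),
    ("maintenance", ("refactor", "bump", "chore", "upgrade")),
]


def classify_pr(title: str) -> Label:
    lowered = title.lower()
    for value, keywords in RULES:
        for keyword in keywords:
            if keyword in lowered:
                return Label(value, "keyword-rules", "0.1", keyword)
    return Label("feature", "keyword-rules", "0.1", "")
```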

06 Metric computation
Flink · dbt · materialized views

DORA, DX Core 4, PR cycle time, AI utilization, and team rollups are computed incrementally; backfills run via deterministic replays.
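Incremental computation and deterministic replay fit together when a metric is a fold over events: the live path applies one event at a time, and a backfill re-applies the same function over history. A minimal sketch for mean PR cycle time follows; the event fields (`event_id`, `opened_at_h`, `merged_at_h`) are hypothetical and hours are used only to keep the arithmetic readable.

```python
from dataclasses import dataclass


@dataclass
class CycleTimeState:
    """Running aggregate for mean PR cycle time."""
    count: int = 0
    total_hours: float = 0.0

    def apply(self, event: dict) -> None:
        # Incremental path: fold one merged-PR event into the aggregate.
        self.count += 1
        self.total_hours += event["merged_at_h"] - event["opened_at_h"]

    @property
    def mean_hours(self) -> float:
        return self.total_hours / self.count if self.count else 0.0


def replay(events: list[dict]) -> CycleTimeState:
    """Backfill path: rebuild the aggregate from raw events."""
    state = CycleTimeState()
    # Sorting by event ID makes the result independent of arrival order.
    for event in sorted(events, key=lambda e: e["event_id"]):
        state.apply(event)
    return state
```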

07 Serving layer
GraphQL · REST · SQL passthrough

API gateway exposes pre-computed metrics, ad-hoc SQL, and CSV exports. Row-level security enforced by org and team scope.
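Row-level security by org and team scope amounts to a filter applied to every result set before it leaves the gateway. The sketch below shows the predicate in application code with a hypothetical `Caller` type; an actual deployment could just as well enforce the same rule inside the database, for example with PostgreSQL row security policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    """Hypothetical caller identity derived from the request credentials."""
    org_id: str
    team_ids: frozenset
    org_wide: bool = False  # True for callers allowed to read every team


def visible_rows(rows: list[dict], caller: Caller) -> list[dict]:
    """Keep only rows inside the caller's org and team scope."""
    return [
        row for row in rows
        if row["org_id"] == caller.org_id
        and (caller.org_wide or row["team_id"] in caller.team_ids)
    ]
```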

Ingestion catalog

Version control
PR lifecycle, commits, reviews, branches
GitHub | Webhook + GraphQL | STREAMING
GitLab | Webhook + REST | STREAMING
Bitbucket | Webhook + REST | STREAMING
Azure Repos | Service Hook + REST | STREAMING
Gerrit | Events stream API | STREAMING

Issue & project tracking
Epics, sprints, status flow, throughput
Jira Cloud / DC | Webhook + REST | STREAMING
Linear | GraphQL subscriptions | STREAMING
Asana | Webhook + REST | STREAMING
Shortcut | Webhook | STREAMING
ClickUp | Webhook | STREAMING
Azure Boards | Service Hook | STREAMING

CI / CD & deploys
Build duration, deploy frequency, lead time
GitHub Actions | workflow_run webhook | STREAMING
GitLab CI | Pipeline webhook | STREAMING
CircleCI | Webhook + API | STREAMING
Buildkite | Webhook + API | STREAMING
Jenkins | Plugin push | STREAMING
Argo CD / Flux | Kubernetes events | STREAMING
Spinnaker | Webhook | STREAMING

Incidents & observability
MTTR, change fail rate, alert volume
PagerDuty | Webhook + REST | STREAMING
Opsgenie | Webhook | STREAMING
Incident.io | Webhook | STREAMING
FireHydrant | Webhook | STREAMING
Statuspage | Webhook | STREAMING
Sentry | Webhook + REST | STREAMING
Datadog | Event API | STREAMING
New Relic | NerdGraph | POLLING
Grafana / Prometheus | Alertmanager webhook | STREAMING

Collaboration & calendar
Focus time, meeting load, async ratio
Slack | Events API | STREAMING
Microsoft Teams | Graph subscriptions | STREAMING
Discord | Gateway API | STREAMING
Google Calendar | Push notifications | STREAMING
Outlook Calendar | Graph subscriptions | STREAMING
Zoom | Webhook | STREAMING

IDE & developer signals
Active coding time, AI tool usage
VS Code extension | Local telemetry | BATCHED
JetBrains plugin | Local telemetry | BATCHED
GitHub Copilot | Usage API | POLLING
Cursor | Usage export | BATCHED
Cody / Tabnine | Usage API | POLLING

Identity & HR
Team graph, tenure, allocation
Okta / Entra ID | SCIM + SAML | STREAMING
Google Workspace | Directory API | POLLING
BambooHR | REST | POLLING
Rippling / Workday | REST | POLLING

Surveys & sentiment
DXI, eNPS, qualitative pulse
DX Surveys (native) | Internal API | TRIGGERED
Culture Amp | REST | POLLING
Lattice | REST | POLLING
Typeform / Tally | Webhook | STREAMING

Persistence layer

Store | Role | Technology | Retention
Raw event lake | Immutable audit & replay | S3 · Parquet · Iceberg | 13 months hot · cold archive forever
Canonical OLTP | Entities, relationships, identity graph | PostgreSQL 16 · logical replication | Live
Time-series metrics | DORA, cycle time, utilization series | ClickHouse · MergeTree | 5 years rolled up
Warehouse mirror | Customer-owned analytics | Snowflake · BigQuery · Databricks | Customer policy
Vector store | Semantic search over PRs, incidents, notes | pgvector · Qdrant | 13 months
Secrets vault | OAuth tokens, signing keys | HashiCorp Vault · KMS-wrapped | Rotated 90d

MCP servers

quantum-metrics

DORA, DX Core 4, PR cycle, AI utilization

get_metric({ metric: "pr_cycle_time", team: "Platform", window: "90d" })

quantum-deliverables

Epics, allocation, investment cost

list_deliverables({ status: "in_progress", quarter: "Q4" })

quantum-incidents

Incident feed, MTTR, postmortems

search_incidents({ severity: ">=SEV2", since: "30d" })

quantum-sentiment

Survey results, eNPS, qualitative themes

get_pulse({ team: "Core Product", question: "tooling" })

quantum-people

Identity graph, tenure, team tree

resolve_person({ email: "user@org.com" })

quantum-sql

Read-only SQL over the warehouse mirror

run_sql({ query: "SELECT * FROM dora_weekly LIMIT 100" })
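On the wire, MCP tool invocations are JSON-RPC 2.0 messages with the method `tools/call`. The sketch below builds such a request body for the tool examples above; the `tool_call` helper and its request-ID counter are illustrative, and the argument keys are copied from the examples rather than from a published schema.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request body for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


body = tool_call(
    "get_metric",
    {"metric": "pr_cycle_time", "team": "Platform", "window": "90d"},
)
```

An agent would send `body` to the server's MCP endpoint along with its credentials.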
Transport · Auth
All MCP servers expose Streamable HTTP (POST /mcp) and SSE transports. Authentication uses short-lived JWTs scoped to an organization, team, or service account. Tools are dynamically advertised per caller based on RBAC, so agents only see what their identity is allowed to read.
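Dynamic advertisement can be sketched as a filter over the tool list, driven by the claims in the caller's token. The scope names and the tool-to-scope mapping below are hypothetical, and the claims are assumed to come from a JWT whose signature has already been verified.

```python
# Hypothetical mapping from tool name to the scope required to see it.
TOOL_REQUIRED_SCOPE = {
    "get_metric": "metrics:read",
    "list_deliverables": "deliverables:read",
    "search_incidents": "incidents:read",
    "get_pulse": "sentiment:read",
    "resolve_person": "people:read",
    "run_sql": "sql:read",
}


def advertised_tools(claims: dict, now: int) -> list[str]:
    """Return the tools a caller may see, given verified JWT claims."""
    if claims.get("exp", 0) <= now:
        return []  # short-lived token has expired: advertise nothing
    granted = set(claims.get("scopes", []))
    return sorted(
        tool for tool, scope in TOOL_REQUIRED_SCOPE.items() if scope in granted
    )
```

Filtering at advertisement time means an agent never learns the name of a tool it could not call.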
EXACTLY-ONCE
Deduped by source event ID

Replays from the raw lake are idempotent at every downstream stage.
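Idempotent replay usually comes down to writing with the source event ID as the key, so applying the same event twice leaves the same state. A minimal sketch with a dictionary standing in for a downstream table follows; the `IdempotentSink` class and the event fields are hypothetical.

```python
class IdempotentSink:
    """Hypothetical downstream table keyed by source event ID."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}

    def upsert(self, event: dict) -> None:
        # Keyed write: a repeated event overwrites itself instead of
        # appending a second row.
        self.rows[event["event_id"]] = event


def replay_into(sink: IdempotentSink, events: list[dict]) -> None:
    for event in events:
        sink.upsert(event)


events = [
    {"event_id": "d1", "kind": "deploy"},
    {"event_id": "d2", "kind": "deploy"},
]
sink = IdempotentSink()
replay_into(sink, events)
replay_into(sink, events)  # a second replay changes nothing
```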

REPLAYABLE
Backfill any metric in hours

Canonical events make any new metric definition a deterministic recompute.

SOC 2 · ISO 27001
Encrypted in transit & at rest

Per-tenant KMS keys, row-level security, audit log of every read.