Data Flow
This page documents the runtime data flows through Platform-Kernel — how a request travels from a browser to a database, how events propagate between services, and how analytical data reaches ClickHouse.
Scope: for the service inventory, see Service Map; for CDC internals, see CDC Pipeline.
1. Authenticated Request Lifecycle
This is the critical path for every API call. The Gateway enforces authentication and authorization before any downstream service sees the request.
Key points:
| Step | Detail |
|---|---|
| Rate limiting | Two-tier: Envoy local token-bucket (zero-latency) → Valkey global counter (cross-instance) |
| JWT validation | ES256 asymmetric; public key cached in Gateway memory; no IAM gRPC call per request |
| RBAC | O(1) Valkey set lookup; permissions cached under rbac:{tenantId}:{userId} |
| Circuit breaker | Opens after 5 consecutive failures within 30s; returns 503 immediately while open |
| RLS | PostgreSQL Row-Level Security enforces tenant_id = current_setting('app.tenant_id') |
2. Login Flow (JWT Issuance)
Token Grace Window (multi-tab refresh race):
```
Problem : Two browser tabs simultaneously see an expired access token
          → both call POST /auth/refresh with the same refresh token
          → the first tab gets a new pair; the second gets 401 (token revoked)

Solution: The refresh token stays valid for 10s after first use (grace window)
          Within the grace window: same refresh token → same new token pair (idempotent)
          After 10s: permanently revoked
          The SDK also uses BroadcastChannel to coordinate in-flight refresh across tabs
```
3. Domain Event Flow
Every state mutation in any core service publishes a typed event to Kafka. Consumers process asynchronously with at-least-once delivery.
Event deduplication:
| Event domain | Dedup mechanism | TTL |
|---|---|---|
| `money.*` | PostgreSQL `processed_event_ids` table | Permanent (no TTL) |
| `auth.*`, `module.*` | Valkey `SET {domain}:{event_id} 1 EX 86400` | 24 hours |
| DLQ retry | `EventService.DLQRetry` RPC requeues to original topic | N/A — explicit |
4. CDC Pipeline (PostgreSQL → ClickHouse)
The Change Data Capture pipeline propagates all tenant data mutations from PostgreSQL to ClickHouse for analytical queries.
At-least-once → exactly-once deduplication in ClickHouse:
```
Table engine  : ReplacingMergeTree(updated_at)
Ordering key  : (tenant_id, record_id)
Dedup         : On INSERT — Kafka consumer produces idempotent rows
                On QUERY  — SELECT ... FINAL collapses duplicates at read time
Background    : OPTIMIZE TABLE ... FINAL on a schedule merges duplicate rows
WAL retention : 24h; Kafka topic retention (KAFKA_RETENTION_HOURS=168, i.e. 7 days)
                covers any consumer lag
Bloat guard   : Alert if WAL > 50GB (CDC_WAL_BLOAT_ALERT_GB=50)
Recovery      : Auto-snapshot on consumer-lag spike; lastSyncAt checkpoint
```
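The read-time dedup can be illustrated by simulating what `SELECT ... FINAL` (or a background merge) yields. `finalView` is a hypothetical helper, not ClickHouse code: per `(tenant_id, record_id)` ordering key it keeps the row with the greatest `updated_at`, matching `ReplacingMergeTree(updated_at)` semantics.

```go
package main

import "fmt"

// row models a CDC record landing in ClickHouse.
type row struct {
	TenantID, RecordID string
	UpdatedAt          int64 // version column of ReplacingMergeTree(updated_at)
	Payload            string
}

// finalView keeps, for each (tenant_id, record_id) key, only the row
// with the greatest updated_at — what FINAL exposes after duplicates
// from at-least-once delivery collapse.
func finalView(rows []row) map[[2]string]row {
	out := map[[2]string]row{}
	for _, r := range rows {
		k := [2]string{r.TenantID, r.RecordID}
		if cur, ok := out[k]; !ok || r.UpdatedAt >= cur.UpdatedAt {
			out[k] = r
		}
	}
	return out
}

func main() {
	rows := []row{
		{"t1", "r1", 100, "v1"},
		{"t1", "r1", 100, "v1"}, // duplicate delivery from the consumer
		{"t1", "r1", 200, "v2"}, // later update wins
	}
	v := finalView(rows)
	fmt.Println(len(v), v[[2]string{"t1", "r1"}].Payload) // 1 v2
}
```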
5. Money Hold/Confirm Pattern
Financial operations use a two-phase commit pattern inside PostgreSQL to prevent over-spending and ensure consistency without distributed transactions.
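The two phases can be sketched with the two wallet balances the pattern relies on. This in-memory `wallet` type is illustrative only; the real service performs each transition inside a single PostgreSQL transaction with row locks, which is what makes the pattern safe without distributed transactions.

```go
package main

import (
	"errors"
	"fmt"
)

// wallet holds the two balances: spendable funds and funds frozen by
// active holds.
type wallet struct{ Available, Frozen int64 }

var errInsufficient = errors.New("insufficient available balance")

// Hold (phase 1) reserves funds: available → frozen. Rejecting holds
// larger than the available balance is what prevents over-spending.
func (w *wallet) Hold(amount int64) error {
	if amount > w.Available {
		return errInsufficient
	}
	w.Available -= amount
	w.Frozen += amount
	return nil
}

// Confirm (phase 2) settles the hold: frozen funds leave the wallet.
func (w *wallet) Confirm(amount int64) { w.Frozen -= amount }

// Release (cancel or expiry) returns frozen funds: frozen → available.
func (w *wallet) Release(amount int64) {
	w.Frozen -= amount
	w.Available += amount
}

func main() {
	w := &wallet{Available: 100}
	_ = w.Hold(30)
	fmt.Println(w.Available, w.Frozen) // 70 30
	w.Confirm(30)
	fmt.Println(w.Available, w.Frozen) // 70 0
}
```

`Release` is exactly the transition the expired-hold cleanup job below applies in bulk.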
Expired hold cleanup:
A background job (Money Service ticker, interval: 60s) scans for expired holds:

```sql
SELECT * FROM transactions
WHERE status = 'held' AND hold_expires_at < NOW();
```

For each expired hold (pseudocode; `tx` is the selected row):

```sql
BEGIN;
UPDATE wallets
   SET available = available + tx.amount,
       frozen    = frozen    - tx.amount
 WHERE id = tx.wallet_id;
UPDATE transactions SET status = 'expired' WHERE id = tx.id;
COMMIT;
```

It then publishes `money.hold.expired` to Kafka.
6. WebSocket Notification Push
Real-time notifications travel from a publishing service through Kafka / RabbitMQ to the browser via the Notify Service WebSocket hub.
Connection limits:
```
Max connections per tenant : 1000 (NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT)
Max broadcast rate         : 200 msg/s per tenant (NOTIFY_WS_RATE_PER_TENANT)
Heartbeat                  : Server Ping every 30s; 2 missed Pongs → close(4408)
Replay on reconnect        : Valkey LPUSH/LRANGE ws:replay:{tenantId}:{channel}
                             Buffer: 100 messages per channel (LIFO)
```
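The replay mechanics can be sketched in memory. `replayBuffer` is a hypothetical stand-in for the Valkey list: `Push` mirrors `LPUSH` plus an `LTRIM` cap, `Replay` mirrors `LRANGE`, and the newest-first order matches the LIFO note above; the cap is shrunk from 100 for the demo.

```go
package main

import "fmt"

// replayBuffer mimics the Valkey ws:replay:{tenantId}:{channel} list.
type replayBuffer struct {
	msgs []string
	cap  int
}

// Push is LPUSH + LTRIM: newest message at the head, oldest dropped
// once the cap is exceeded.
func (b *replayBuffer) Push(msg string) {
	b.msgs = append([]string{msg}, b.msgs...)
	if len(b.msgs) > b.cap {
		b.msgs = b.msgs[:b.cap]
	}
}

// Replay is LRANGE 0 -1: messages newest-first, for a reconnecting
// client to backfill what it missed.
func (b *replayBuffer) Replay() []string { return b.msgs }

func main() {
	b := &replayBuffer{cap: 3} // demo cap; the doc specifies 100
	for _, m := range []string{"m1", "m2", "m3", "m4"} {
		b.Push(m)
	}
	fmt.Println(b.Replay()) // [m4 m3 m2]
}
```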
7. Module Installation Flow
Related Pages
- Architecture Overview — C4 Level 1 System Context
- Service Map — C4 Level 2 service inventory and ports
- CDC Pipeline — ClickHouse analytics pipeline deep dive
- Security Deep Dive — Encryption and key hierarchy
- Tenant Isolation — 6-layer isolation model