Data Primitive — Data Layer Service
The Data Layer is the unified data access service for every module
on the platform. It provides OLTP persistence via PostgreSQL, sub-second
analytics via ClickHouse, distributed caching via Valkey, and real-time
change propagation via the Debezium CDC pipeline. Modules never connect
to databases directly — all access flows through the Data Layer and
the kernel.data() SDK primitive.
Responsibilities
| Responsibility | Detail |
|---|---|
| CRUD persistence | PostgreSQL OLTP — type-safe queries via sqlc |
| Auto REST endpoints | autoApi: true in module manifest → 5 CRUD endpoints generated, zero controller code |
| Analytics | ClickHouse OLAP — aggregations over millions of rows, sub-second latency |
| Caching | Valkey — TTL-based cache, pattern invalidation, rate-limiting, session dedup |
| Real-time | CDC pipeline: PostgreSQL → Debezium → Kafka → ClickHouse; SDK useRealtime() |
| Tenant isolation | PostgreSQL RLS + ClickHouse row policy — dual enforcement at DB level |
| Schema introspection | listModels(), describeModel() — runtime schema discovery from information_schema |
Technical Stack
| Component | Technology | Purpose |
|---|---|---|
| Language | Go | All service logic |
| PostgreSQL driver | pgx/v5 + pgxpool | Connection pooling + type-safe queries |
| Query generation | sqlc (sql_package: pgx/v5) | Compile-time checked SQL → Go code |
| OLAP database | ClickHouse (ClickHouse/clickhouse-go) | Analytics, billions of rows |
| Cache / session store | Valkey (valkey-io/valkey-go) | TTL cache, rate limiting, dedup |
| CDC connector | Debezium | PostgreSQL WAL → Kafka |
| Migrations | pressly/goose | Versioned SQL + Go migrations |
| Service path | services/data/ | Monorepo location |
| OpenAPI spec | services/data/api/openapi.yaml | REST API specification |
10 Internal Packages
| Package | Purpose |
|---|---|
internal/api | Auto-generated REST endpoint router and handler registration |
internal/cache | Valkey cache operations: get, set, delete, pattern invalidation |
internal/clickhouse | ClickHouse connection pool, query builder, tenant isolation middleware |
internal/config | Environment-based configuration |
internal/crud | Core CRUD operations: create, retrieve, list, update, soft-delete |
internal/hook | Lifecycle hook registry and synchronous-before / async-after dispatch |
internal/migration | goose migration runner, post-migration permission reconciliation |
internal/permissions | Auto-registration of {module}.{model}.* permissions with IAM |
internal/query | PostgREST-inspired query language parser: operators, ordering, cursor pagination |
internal/schema | Schema introspection from information_schema, model metadata cache |
Auto-generated REST — autoApi
If a module declares autoApi: true for a model in its manifest,
the Data Layer automatically exposes 5 REST endpoints — no controller
code required:
| Method | Endpoint | Operation |
|---|---|---|
GET | /api/v1/data/{module}/{model} | List with filters, ordering, cursor pagination |
GET | /api/v1/data/{module}/{model}/:id | Retrieve single record |
POST | /api/v1/data/{module}/{model} | Create |
PATCH | /api/v1/data/{module}/{model}/:id | Update (partial) |
DELETE | /api/v1/data/{module}/{model}/:id | Soft delete |
For a module crm with model contacts, the auto-generated endpoints
are:
GET https://api.septemcore.com/v1/data/crm/contacts
GET https://api.septemcore.com/v1/data/crm/contacts/:id
POST https://api.septemcore.com/v1/data/crm/contacts
PATCH https://api.septemcore.com/v1/data/crm/contacts/:id
DELETE https://api.septemcore.com/v1/data/crm/contacts/:id
Auto-registered Permissions
When autoApi: true is set, the Data Layer automatically registers
granular permissions with IAM — no manual declaration needed in the
manifest:
crm.contacts.read
crm.contacts.write
crm.contacts.delete
These appear automatically in the role builder UI.
Tenant Isolation
PostgreSQL
Every module table includes a tenant_id UUID NOT NULL column.
Row-Level Security policies enforce tenant boundaries at the database
level:
-- Applied automatically by the Data Layer migration runner
ALTER TABLE contacts ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON contacts
USING (tenant_id = current_setting('app.tenant_id')::uuid);
The tenant_id is injected from the JWT via the request context.
Application code cannot override it.
ClickHouse (Dual Protection)
ClickHouse does not support native RLS. Two independent layers enforce tenant isolation for analytics:
| Layer | Mechanism | Guarantees |
|---|---|---|
| Application | WHERE tenant_id = ? appended to every query in Go | No leakage even if policy is missing |
| Database | CREATE ROW POLICY tenant_filter ON ... USING tenant_id = {tenantId} | DB-level fence even if Go code has a bug |
All ClickHouse queries are logged with tenantId + traceId for
forensic audit.
CDC Pipeline
PostgreSQL (OLTP)
│
│ WAL
▼
Debezium
│
│ Kafka topic: platform.data.events
▼
ClickHouse consumers
│
▼
Analytics tables (ReplacingMergeTree)
| Parameter | Value |
|---|---|
| Latency | < 5 seconds (OLTP write → ClickHouse) |
| Delivery guarantee | At-least-once. Debezium → Kafka retries on failure |
| Deduplication | ClickHouse ReplacingMergeTree + FINAL keyword deduplicates on read |
| WAL retention | Minimum 24 hours (wal_keep_size). Slot bloat guard: max_slot_wal_keep_size = 50 GB. Alert on WAL bytes > 10 GB |
| Debezium loss recovery | Full snapshot of PostgreSQL → ClickHouse re-sync |
| Consistency disclosure | meta.lastSyncAt (ISO 8601) returned on all analytics responses |
Deduplication — Three Layers
| Layer | Strategy | Mechanism |
|---|---|---|
| PostgreSQL (CRUD) | Idempotency key | INSERT ... ON CONFLICT (idempotency_key) DO NOTHING. SDK auto-generates UUID v7. |
| Kafka (Event Bus) | Consumer dedup | eventId (UUID v7) stored in Valkey SET, TTL 24 h. Duplicate event → skip. |
| ClickHouse (CDC) | ReplacingMergeTree | Dedup by PK at merge time. SELECT ... FINAL always returns single latest row. |
SDK Quick Reference
import { kernel } from '@platform/sdk-core';
const data = kernel.data();
// CRUD
await data.create('contacts', payload);
await data.retrieve('contacts', id);
await data.list('contacts', { filters: { status: 'eq.active' } });
await data.update('contacts', id, patch);
await data.delete('contacts', id);
// Analytics (ClickHouse)
await data.analytics({ model: 'contacts', aggregate: 'count', groupBy: 'status' });
// Cache
await data.cache.get('my-key');
await data.cache.set('my-key', value, { ttl: 300 });
await data.cache.delete('my-key');
await data.cache.invalidate('contacts:*');
// Schema introspection
const models = await data.listModels();
const schema = await data.describeModel('contacts');
Full operation reference:
- CRUD Operations — create, retrieve, list, update, delete
- Query Language — filters, ordering, field selection, OR logic
- Relations — FK eager loading with
include(Batch 9) - Analytics — ClickHouse aggregations (Batch 9)
- Caching — Valkey cache API (Batch 9)