Skip to main content

Data Primitive — Data Layer Service

The Data Layer is the unified data access service for every module on the platform. It provides OLTP persistence via PostgreSQL, sub-second analytics via ClickHouse, distributed caching via Valkey, and real-time change propagation via the Debezium CDC pipeline. Modules never connect to databases directly — all access flows through the Data Layer and the kernel.data() SDK primitive.


Responsibilities

ResponsibilityDetail
CRUD persistencePostgreSQL OLTP — type-safe queries via sqlc
Auto REST endpointsautoApi: true in module manifest → 5 CRUD endpoints generated, zero controller code
AnalyticsClickHouse OLAP — aggregations over millions of rows, sub-second latency
CachingValkey — TTL-based cache, pattern invalidation, rate-limiting, session dedup
Real-timeCDC pipeline: PostgreSQL → Debezium → Kafka → ClickHouse; SDK useRealtime()
Tenant isolationPostgreSQL RLS + ClickHouse row policy — dual enforcement at DB level
Schema introspectionlistModels(), describeModel() — runtime schema discovery from information_schema

Technical Stack

ComponentTechnologyPurpose
LanguageGoAll service logic
PostgreSQL driverpgx/v5 + pgxpoolConnection pooling + type-safe queries
Query generationsqlc (sql_package: pgx/v5)Compile-time checked SQL → Go code
OLAP databaseClickHouse (ClickHouse/clickhouse-go)Analytics, billions of rows
Cache / session storeValkey (valkey-io/valkey-go)TTL cache, rate limiting, dedup
CDC connectorDebeziumPostgreSQL WAL → Kafka
Migrationspressly/gooseVersioned SQL + Go migrations
Service pathservices/data/Monorepo location
OpenAPI specservices/data/api/openapi.yamlREST API specification

10 Internal Packages

PackagePurpose
internal/apiAuto-generated REST endpoint router and handler registration
internal/cacheValkey cache operations: get, set, delete, pattern invalidation
internal/clickhouseClickHouse connection pool, query builder, tenant isolation middleware
internal/configEnvironment-based configuration
internal/crudCore CRUD operations: create, retrieve, list, update, soft-delete
internal/hookLifecycle hook registry and synchronous-before / async-after dispatch
internal/migrationgoose migration runner, post-migration permission reconciliation
internal/permissionsAuto-registration of {module}.{model}.* permissions with IAM
internal/queryPostgREST-inspired query language parser: operators, ordering, cursor pagination
internal/schemaSchema introspection from information_schema, model metadata cache

Auto-generated REST — autoApi

If a module declares autoApi: true for a model in its manifest, the Data Layer automatically exposes 5 REST endpoints — no controller code required:

MethodEndpointOperation
GET/api/v1/data/{module}/{model}List with filters, ordering, cursor pagination
GET/api/v1/data/{module}/{model}/:idRetrieve single record
POST/api/v1/data/{module}/{model}Create
PATCH/api/v1/data/{module}/{model}/:idUpdate (partial)
DELETE/api/v1/data/{module}/{model}/:idSoft delete

For a module crm with model contacts, the auto-generated endpoints are:

GET https://api.septemcore.com/v1/data/crm/contacts
GET https://api.septemcore.com/v1/data/crm/contacts/:id
POST https://api.septemcore.com/v1/data/crm/contacts
PATCH https://api.septemcore.com/v1/data/crm/contacts/:id
DELETE https://api.septemcore.com/v1/data/crm/contacts/:id

Auto-registered Permissions

When autoApi: true is set, the Data Layer automatically registers granular permissions with IAM — no manual declaration needed in the manifest:

crm.contacts.read
crm.contacts.write
crm.contacts.delete

These appear automatically in the role builder UI.


Tenant Isolation

PostgreSQL

Every module table includes a tenant_id UUID NOT NULL column. Row-Level Security policies enforce tenant boundaries at the database level:

-- Applied automatically by the Data Layer migration runner
ALTER TABLE contacts ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON contacts
USING (tenant_id = current_setting('app.tenant_id')::uuid);

The tenant_id is injected from the JWT via the request context. Application code cannot override it.

ClickHouse (Dual Protection)

ClickHouse does not support native RLS. Two independent layers enforce tenant isolation for analytics:

LayerMechanismGuarantees
ApplicationWHERE tenant_id = ? appended to every query in GoNo leakage even if policy is missing
DatabaseCREATE ROW POLICY tenant_filter ON ... USING tenant_id = {tenantId}DB-level fence even if Go code has a bug

All ClickHouse queries are logged with tenantId + traceId for forensic audit.


CDC Pipeline

PostgreSQL (OLTP)

│ WAL

Debezium

│ Kafka topic: platform.data.events

ClickHouse consumers


Analytics tables (ReplacingMergeTree)
ParameterValue
Latency< 5 seconds (OLTP write → ClickHouse)
Delivery guaranteeAt-least-once. Debezium → Kafka retries on failure
DeduplicationClickHouse ReplacingMergeTree + FINAL keyword deduplicates on read
WAL retentionMinimum 24 hours (wal_keep_size). Slot bloat guard: max_slot_wal_keep_size = 50 GB. Alert on WAL bytes > 10 GB
Debezium loss recoveryFull snapshot of PostgreSQL → ClickHouse re-sync
Consistency disclosuremeta.lastSyncAt (ISO 8601) returned on all analytics responses

Deduplication — Three Layers

LayerStrategyMechanism
PostgreSQL (CRUD)Idempotency keyINSERT ... ON CONFLICT (idempotency_key) DO NOTHING. SDK auto-generates UUID v7.
Kafka (Event Bus)Consumer dedupeventId (UUID v7) stored in Valkey SET, TTL 24 h. Duplicate event → skip.
ClickHouse (CDC)ReplacingMergeTreeDedup by PK at merge time. SELECT ... FINAL always returns single latest row.

SDK Quick Reference

import { kernel } from '@platform/sdk-core';

const data = kernel.data();

// CRUD
await data.create('contacts', payload);
await data.retrieve('contacts', id);
await data.list('contacts', { filters: { status: 'eq.active' } });
await data.update('contacts', id, patch);
await data.delete('contacts', id);

// Analytics (ClickHouse)
await data.analytics({ model: 'contacts', aggregate: 'count', groupBy: 'status' });

// Cache
await data.cache.get('my-key');
await data.cache.set('my-key', value, { ttl: 300 });
await data.cache.delete('my-key');
await data.cache.invalidate('contacts:*');

// Schema introspection
const models = await data.listModels();
const schema = await data.describeModel('contacts');

Full operation reference:

  • CRUD Operations — create, retrieve, list, update, delete
  • Query Language — filters, ordering, field selection, OR logic
  • Relations — FK eager loading with include (Batch 9)
  • Analytics — ClickHouse aggregations (Batch 9)
  • Caching — Valkey cache API (Batch 9)