Tenant Isolation

Tenant isolation is the most critical security property of Platform-Kernel. One tenant must never see another tenant's data — regardless of bugs, misconfiguration, or malicious module code.

The platform enforces isolation at six independent layers. A vulnerability in any single layer does not cause a cross-tenant data exposure because the other five layers remain intact.


Isolation Overview

The six layers, from the storage engine outward: PostgreSQL Row-Level Security, the ClickHouse dual filter, the Kafka consumer SDK filter, WebSocket channel namespacing, feature-flag key prefixing, and the S3 path prefix. Each is detailed below.

Layer 1 — PostgreSQL Row-Level Security

Every tenant-scoped table in Platform-Kernel has a tenant_id UUID NOT NULL column. PostgreSQL Row-Level Security (RLS) is enabled and forced on each of these tables and enforced at the database engine level — not in application code.

-- Applied to every tenant-scoped table
ALTER TABLE {table_name} ENABLE ROW LEVEL SECURITY;
ALTER TABLE {table_name} FORCE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON {table_name}
USING (tenant_id = current_setting('app.tenant_id')::uuid);

How app.tenant_id is set:

Before any query, the Go Data Layer executes:

SET LOCAL app.tenant_id = '{tenantId}';
-- Then executes the actual query within the same transaction

SET LOCAL is scoped to the current transaction and is automatically reset when the transaction ends. With pooled connections (pgBouncer in transaction-pooling mode), the setting therefore never survives onto a connection handed to another tenant — cross-tenant leakage of app.tenant_id is architecturally impossible.
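The transaction wrapper can be sketched in Go as follows. This is illustrative, not the actual Data Layer code: the withTenant helper, the UUID guard, and the use of set_config('app.tenant_id', $1, true) — the bind-parameter-friendly equivalent of SET LOCAL, since SET LOCAL itself does not accept placeholders — are assumptions.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"regexp"
)

// Guard against injecting anything but a UUID into the session setting.
var uuidRe = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

func validTenantID(id string) bool { return uuidRe.MatchString(id) }

// withTenant runs fn inside a transaction in which app.tenant_id is set,
// so every RLS policy evaluated in that transaction filters on the
// caller's tenant.
func withTenant(ctx context.Context, db *sql.DB, tenantID string, fn func(*sql.Tx) error) error {
	if !validTenantID(tenantID) {
		return fmt.Errorf("invalid tenant id %q", tenantID)
	}
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	// set_config(name, value, is_local=true) is transaction-scoped,
	// exactly like SET LOCAL, but accepts a bind parameter.
	if _, err := tx.ExecContext(ctx,
		`SELECT set_config('app.tenant_id', $1, true)`, tenantID); err != nil {
		tx.Rollback()
		return err
	}
	if err := fn(tx); err != nil {
		tx.Rollback()
		return err
	}
	return tx.Commit()
}
```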

What FORCE ROW LEVEL SECURITY does:

Table owners are normally exempt from RLS on tables they own. FORCE ROW LEVEL SECURITY removes that exemption, so even the privileged role that runs migrations (and owns the tables) is subject to the policy at runtime. Note that genuine superusers and roles with the BYPASSRLS attribute still bypass RLS — which is why the migration role should be a table owner, not a superuser.

| Protection scenario | Without RLS | With RLS |
| --- | --- | --- |
| Bug in Go code forgets WHERE tenant_id | Cross-tenant data returned | Policy filters rows — query returns an empty set |
| SQL injection bypasses the WHERE clause | All tenant data exposed | RLS still filters by app.tenant_id |
| Malicious module calls Data Layer directly | Could access other tenants | Impossible — policy applied in the DB engine |

Layer 2 — ClickHouse Dual Filter

ClickHouse's native row policies are not comparable to PostgreSQL's Row-Level Security, so the platform applies two independent filters; a bug in either one does not cause data exposure.

Application Filter (Go Data Layer)

// All ClickHouse queries go through this builder — tenantId is mandatory
func (r *AnalyticsRepo) Query(ctx context.Context, q AnalyticsQuery) (*Result, error) {
	tenantID := auth.TenantIDFromContext(ctx) // extracted from the JWT by the Gateway
	sql := fmt.Sprintf(
		"SELECT %s FROM %s WHERE tenant_id = ? AND %s",
		q.Select, q.Table, q.Where,
	)
	// tenant_id binds to the first placeholder; q.Args fill the rest —
	// the two argument lists must be combined before the variadic call
	args := append([]any{tenantID}, q.Args...)
	return r.ch.Query(ctx, sql, args...)
}

Database-Level Row Policy

-- Created automatically when a new tenant is provisioned
CREATE ROW POLICY tenant_filter_{tenantId}
ON analytics.data_records
FOR SELECT
USING tenant_id = '{tenantId}'
TO tenant_user_{tenantId};

Each tenant has a dedicated ClickHouse user with this policy attached. The Go Data Layer connects as the tenant's user — even a completely correct query cannot return another tenant's rows.
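The per-tenant user and policy names follow the provisioning pattern shown above. A minimal sketch of the naming helpers the Data Layer might use — clickhouseDSN, its host and password handling, and the helper names themselves are assumptions, not the real configuration code:

```go
package main

import "fmt"

// One ClickHouse user and one row policy per tenant, matching the
// tenant_user_{tenantId} / tenant_filter_{tenantId} naming scheme.
func tenantUserName(tenantID string) string {
	return fmt.Sprintf("tenant_user_%s", tenantID)
}

func rowPolicyName(tenantID string) string {
	return fmt.Sprintf("tenant_filter_%s", tenantID)
}

// clickhouseDSN builds a connection string for the tenant's dedicated
// user, so even a correct query cannot escape the attached row policy.
func clickhouseDSN(host, tenantID, password string) string {
	return fmt.Sprintf("clickhouse://%s:%s@%s/analytics",
		tenantUserName(tenantID), password, host)
}
```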

All ClickHouse queries are logged with tenantId and traceId for forensic analysis (SOC 2 compliance).


Layer 3 — Kafka Consumer SDK Filter

Kafka topics are shared across all tenants (one topic per domain). A naive consumer would receive every tenant's events. The SDK enforces tenant isolation at the subscription level.

Publish-side isolation:

When a module publishes an event, the Gateway injects tenantId from the JWT — the module cannot set or override it:

Module: kernel.events().publish("order.created", { ... })

SDK: EventEnvelope { tenantId: JWT.tenantId (injected), payload: { ... } }

Gateway: Validates tenantId matches JWT — rejects if mismatch
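The injection and validation steps above can be sketched as two small functions. The EventEnvelope field names beyond tenantId and the function names are assumptions; the point is that the SDK always overwrites TenantID from the JWT, and the Gateway independently rejects any mismatch:

```go
package main

import (
	"errors"
	"fmt"
)

type EventEnvelope struct {
	Topic    string
	TenantID string
	Payload  []byte
}

var ErrTenantMismatch = errors.New("envelope tenantId does not match JWT")

// injectTenant is the SDK side: the module never supplies TenantID —
// it is always taken from the authenticated JWT claims.
func injectTenant(env EventEnvelope, jwtTenantID string) EventEnvelope {
	env.TenantID = jwtTenantID
	return env
}

// validateEnvelope is the Gateway side: reject any envelope whose
// tenantId differs from the JWT, even if the SDK was bypassed.
func validateEnvelope(env EventEnvelope, jwtTenantID string) error {
	if env.TenantID != jwtTenantID {
		return fmt.Errorf("%w: %q != %q", ErrTenantMismatch, env.TenantID, jwtTenantID)
	}
	return nil
}
```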

RBAC on events:

  • A module can only subscribe to events declared in manifest.events.subscribes[].
  • A module can only publish events declared in manifest.events.publishes[].
  • Kernel-owned events (auth.*, money.*, billing.*) are on a hardcoded whitelist — only core services can publish them.
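The RBAC rules above can be sketched as manifest checks. The Manifest struct, helper names, and the isCoreService flag are assumptions; the kernel-owned prefixes are the ones listed in the bullet list:

```go
package main

import "strings"

// Kernel-owned namespaces: only core services may publish under these.
var kernelPrefixes = []string{"auth.", "money.", "billing."}

type Manifest struct {
	Publishes  []string // manifest.events.publishes[]
	Subscribes []string // manifest.events.subscribes[]
}

func contains(list []string, v string) bool {
	for _, x := range list {
		if x == v {
			return true
		}
	}
	return false
}

// canPublish: the topic must be declared in the manifest AND must not
// fall under a kernel-owned namespace unless the caller is a core service.
func canPublish(m Manifest, topic string, isCoreService bool) bool {
	for _, p := range kernelPrefixes {
		if strings.HasPrefix(topic, p) && !isCoreService {
			return false
		}
	}
	return contains(m.Publishes, topic)
}

func canSubscribe(m Manifest, topic string) bool {
	return contains(m.Subscribes, topic)
}
```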

Layer 4 — WebSocket Channel Namespace

WebSocket connections are managed by the Notify Service. The hub uses a compound key that includes tenantId as a mandatory prefix.

Internal hub key: {tenantId}:{channel}

Example:
Module calls: kernel.notify().broadcast("dashboard.metrics", data)
SDK sends: subscribe { channel: "dashboard.metrics" }
Hub stores: connections["abc123:dashboard.metrics"]

Cross-tenant broadcast is architecturally impossible:
Tenant A "abc123" → hub["abc123:dashboard.metrics"]
Tenant B "def456" → hub["def456:dashboard.metrics"]
These are different in-memory buckets — no API to cross.
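The bucket separation above can be sketched as a minimal hub. The real Notify Service hub also tracks connection state, limits, and auth; this sketch (Hub, Subscribe, Broadcast are assumed names) only shows why a broadcast can never cross tenants — the tenant prefix is part of the map key:

```go
package main

import "fmt"

// hubKey builds the compound key — tenantId is a mandatory prefix, so
// two tenants subscribing to the same channel name land in different
// in-memory buckets.
func hubKey(tenantID, channel string) string {
	return fmt.Sprintf("%s:%s", tenantID, channel)
}

type Hub struct {
	subscribers map[string][]chan []byte
}

func NewHub() *Hub {
	return &Hub{subscribers: make(map[string][]chan []byte)}
}

func (h *Hub) Subscribe(tenantID, channel string) chan []byte {
	ch := make(chan []byte, 1)
	key := hubKey(tenantID, channel)
	h.subscribers[key] = append(h.subscribers[key], ch)
	return ch
}

// Broadcast addresses exactly one tenant's bucket — there is no API
// that fans out across tenants. Returns the number of receivers.
func (h *Hub) Broadcast(tenantID, channel string, msg []byte) int {
	subs := h.subscribers[hubKey(tenantID, channel)]
	for _, ch := range subs {
		ch <- msg
	}
	return len(subs)
}
```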

Auth on connection:

Step 1: Client opens WSS connection
Step 2: Client sends { "type": "auth", "token": "<JWT>" }
Step 3: Notify Service validates JWT (ES256 + expiry + revocation check)
Step 4: Connection is registered under hub["{tenantId}:{channel}"]
Step 5: Any message arriving on this connection is tenant-scoped

Invalid JWT → close connection with code 4401 (Unauthorized)

Connection limits per tenant:

NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT = 1000
NOTIFY_WS_RATE_PER_TENANT = 200 msg/s

One tenant cannot exhaust WebSocket capacity for another tenant.
Limits are enforced per tenantId bucket in the hub.
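Per-tenant buckets can be sketched as a fixed-window limiter. The real Notify Service may use a smoother algorithm; the TenantLimiter type and its window handling are assumptions kept deterministic for illustration:

```go
package main

// TenantLimiter enforces a per-tenant message budget per window.
// One tenant exhausting its bucket never affects another tenant's.
type TenantLimiter struct {
	limit  int
	counts map[string]int
}

func NewTenantLimiter(limit int) *TenantLimiter {
	return &TenantLimiter{limit: limit, counts: make(map[string]int)}
}

// Allow charges one message against tenantID's bucket for the
// current window; it returns false once the budget is spent.
func (l *TenantLimiter) Allow(tenantID string) bool {
	if l.counts[tenantID] >= l.limit {
		return false
	}
	l.counts[tenantID]++
	return true
}

// Reset starts a new window (once per second for a msg/s limit).
func (l *TenantLimiter) Reset() {
	l.counts = make(map[string]int)
}
```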

Layer 5 — Feature Flag Key Prefix

Feature flags are stored in a shared PostgreSQL table and evaluated by GoFeatureFlag. The SDK prepends tenantId as a mandatory key prefix, so the same flag name resolves to a different key — and can hold a different state — for each tenant.

Developer writes: kernel.flags().isEnabled("new-checkout")
SDK transforms to: isEnabled("abc123:new-checkout")

PostgreSQL storage:
key                  | tenant_id | enabled
---------------------|-----------|--------
abc123:new-checkout  | abc123    | true
def456:new-checkout  | def456    | false   ← different tenant, different state

Admin UI filters: WHERE tenant_id = currentTenantId
→ Tenant A admin cannot see Tenant B flags
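The SDK transform shown above is a one-line prefix; a sketch of it in Go, with one added guard that is an assumption: rejecting ":" in the developer-supplied name keeps the tenant prefix unambiguous when keys are parsed back.

```go
package main

import (
	"fmt"
	"strings"
)

// flagKey applies the SDK transform: "new-checkout" → "abc123:new-checkout".
// Rejecting ":" in the flag name (an assumption, not documented behavior)
// guarantees the tenant prefix is always the text before the first colon.
func flagKey(tenantID, name string) (string, error) {
	if strings.Contains(name, ":") {
		return "", fmt.Errorf("flag name %q must not contain ':'", name)
	}
	return tenantID + ":" + name, nil
}
```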

Optimistic locking for concurrent admin edits:

Every flag has a version: int. PATCH /flags/:key requires { version: N } in the request body. Concurrent edits by two admins result in 409 Conflict for the second writer — no silent overwrite.
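The compare-and-set semantics can be sketched in memory. The real handler performs the same check in PostgreSQL (an UPDATE guarded by the expected version); the Flag struct and patchFlag are illustrative names:

```go
package main

import "errors"

// ErrConflict maps to HTTP 409 in the PATCH /flags/:key handler.
var ErrConflict = errors.New("version conflict")

type Flag struct {
	Enabled bool
	Version int
}

// patchFlag only applies the edit if the caller saw the current
// version; the second of two concurrent writers gets ErrConflict
// instead of silently overwriting the first.
func patchFlag(f *Flag, expectedVersion int, enabled bool) error {
	if f.Version != expectedVersion {
		return ErrConflict
	}
	f.Enabled = enabled
	f.Version++
	return nil
}
```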


Layer 6 — S3 Path Prefix

All files uploaded through the Files Service are stored under a path that includes tenantId as the first path component:

S3 key structure:
{tenantId}/{bucket}/{fileId}

Examples:
abc123/avatars/usr_01j8m...webp
abc123/documents/doc_01j8n...pdf
def456/avatars/usr_01j9a...webp ← different tenant, different prefix

Gateway enforcement:
1. Module requests: GET /api/v1/files/{fileId}
2. Gateway fetches file metadata from Files Service
3. Files Service: WHERE file_id = ? AND tenant_id = JWT.tenantId
4. If tenant_id mismatch → 404 Not Found (not 403 — prevent oracle)
5. Presigned URLs: generated with tenantId scoped IAM policy
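The metadata lookup in steps 2-4 can be sketched against an in-memory map. The FileMeta struct and lookupFile are assumed names; the essential property is that "belongs to another tenant" and "does not exist" are indistinguishable to the caller, preventing an existence oracle:

```go
package main

import "errors"

// ErrNotFound is returned both for a missing file and for a file owned
// by another tenant — a uniform 404 leaks nothing about existence.
var ErrNotFound = errors.New("not found")

type FileMeta struct {
	FileID   string
	TenantID string
	Key      string // {tenantId}/{bucket}/{fileId}
}

// lookupFile sketches the Files Service query
// `WHERE file_id = ? AND tenant_id = ?`.
func lookupFile(files map[string]FileMeta, fileID, jwtTenantID string) (FileMeta, error) {
	f, ok := files[fileID]
	if !ok || f.TenantID != jwtTenantID {
		return FileMeta{}, ErrNotFound // identical outcome in both cases
	}
	return f, nil
}
```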

Presigned URL security:

Presigned GET URL → includes X-Amz-Credential scoped to {tenantId} path prefix
Presigned PUT URL → upload allowed ONLY to {tenantId}/staging/{fileId}

Cross-tenant presigned URL: impossible — credential is path-scoped

Isolation Summary Table

| Layer | Technology | Enforced by | Failure mode if layer is bypassed |
| --- | --- | --- | --- |
| PostgreSQL RLS | CREATE POLICY + FORCE RLS | PostgreSQL engine | ClickHouse row policy + 4 more layers remain |
| ClickHouse dual filter | row_policy + Go WHERE | DB engine + application | One sub-layer remains active |
| Kafka SDK filter | EventEnvelope.tenantId check | SDK consumer | Gateway publish validation remains |
| WebSocket namespace | Hub key {tenantId}:{channel} | Notify Service hub | JWT auth on connection remains |
| Feature flag prefix | Key {tenantId}:{flagName} | SDK + PostgreSQL RLS | Admin UI filter remains |
| S3 path prefix | Path {tenantId}/{bucket}/ | Gateway + Files Service | Files Service DB query remains |

Defence in depth: each layer is sufficient on its own to prevent cross-tenant access, and the six together provide layered redundancy — the defence-in-depth posture that SOC 2 Type II and ISO 27001 audits expect of multi-tenant SaaS.


Tenant Provisioning Isolation

When a new tenant is created, the following isolation primitives are provisioned. Steps 1-3 run in a single PostgreSQL transaction; the ClickHouse and IAM steps (4-7) cannot join that transaction and run as subsequent steps of the same provisioning flow:

1. INSERT INTO tenants (id, slug, ...)
2. INSERT INTO wallets (id, tenant_id, type='system', ...) ← system wallet
3. INSERT INTO subscriptions (id, tenant_id, status='trialing')
4. CREATE ROW POLICY tenant_filter_{tenantId} ON analytics.* ← ClickHouse
5. CREATE CLICKHOUSE USER tenant_user_{tenantId} ← ClickHouse user
6. GRANT SELECT WITH row_policy TO tenant_user_{tenantId}
7. IAM: create Owner role with wildcard (*) for tenantId

If any PostgreSQL step fails → ROLLBACK. If a later ClickHouse or IAM step fails, the provisioning flow removes everything created so far. Either way, no partial tenant state survives.