
Security Deep Dive

Platform-Kernel follows an "encrypt everything" principle inspired by Apple's data protection model. There are no unencrypted data paths at rest, in transit, or between services.


7-Layer Encryption Model

| Layer | What is encrypted | Standard | How |
| --- | --- | --- | --- |
| At rest | PostgreSQL, ClickHouse, S3, backups, logs | AES-256 | PostgreSQL TDE, S3 SSE-S3/SSE-KMS, ClickHouse encrypted volumes |
| In transit | All HTTP, REST, WebSocket traffic | TLS 1.3 | HTTPS-only, HSTS, no TLS 1.2 |
| Service-to-service | All inter-service communication | mTLS | Mutual TLS; Istio service mesh in Kubernetes |
| Field-level (PII) | email, phone, ip_address in PostgreSQL | AES-256-GCM | Column-level encryption; DEK per-tenant |
| Envelope | Encryption keys themselves | RSA-4096 | DEK → KEK → Master Key (see hierarchy below) |
| Secrets | API keys, DB passwords, third-party tokens | | HashiCorp Vault exclusively; never in env, code, or Git |
| Backups | All backup archives | AES-256 | Encrypted before upload; key from Vault |

Key Hierarchy

Why three levels?

Wrapping keys (envelope encryption) limits the blast radius of any single key compromise:

  • A compromised DEK affects only one tenant's field-level data.
  • A compromised KEK cannot decrypt data directly — it only unwraps DEKs.
  • The Master Key never leaves the HSM boundary in plaintext form.

Automatic Key Rotation (90-Day Cycle)

Zero-downtime guarantee: the dual-secret window ensures that JWTs issued just before rotation remain valid. The window lasts at most 15 minutes, the access-token TTL (JWT_ACCESS_TOKEN_TTL), after which every token signed with the old key has expired. During this window:

  • New JWTs are signed with the new key.
  • Existing JWTs remain valid under the old key.
  • No 401 errors occur.
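
The dual-key behaviour above can be sketched as a verifier that tries the new key first and falls back to the old key only while old tokens can still be alive. This is a minimal illustration under stated assumptions, not the IAM implementation: HMAC-SHA256 from the Python standard library stands in for ES256, the token format is simplified (no header, no exp check), and the names sign and KeyRing are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Optional

JWT_ACCESS_TOKEN_TTL = 900  # seconds; the 15-minute TTL from this page


def _b64(data: bytes) -> str:
    """URL-safe Base64 without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _signature(body: str, key: bytes) -> str:
    return _b64(hmac.new(key, body.encode(), hashlib.sha256).digest())


def sign(payload: dict, key: bytes) -> str:
    """Return 'body.signature' (simplified stand-in for a signed JWT)."""
    body = _b64(json.dumps(payload, sort_keys=True).encode())
    return f"{body}.{_signature(body, key)}"


@dataclass
class KeyRing:
    new_key: bytes             # signs and validates
    old_key: Optional[bytes]   # validates only, during the overlap window
    rotated_at: float          # Unix time of the rotation

    def verify(self, token: str, now: float) -> bool:
        body, _, sig = token.partition(".")
        candidates = [self.new_key]
        # Accept the old key only while a token signed by it could be unexpired.
        if self.old_key is not None and now - self.rotated_at <= JWT_ACCESS_TOKEN_TTL:
            candidates.append(self.old_key)
        return any(
            hmac.compare_digest(_signature(body, key), sig) for key in candidates
        )
```

With a ring created at rotation time, a token signed with the old key verifies one minute after rotation but no longer verifies once 900 seconds have passed; tokens signed with the new key verify in both cases.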

JWT Signing Key Lifecycle

| State | Behaviour |
| --- | --- |
| Loading | IAM fetches current key from Vault on process start |
| Active | All JWTs signed with in-memory key; public JWKS served from /.well-known/jwks.json |
| DualActive | Both old and new keys accepted; new key signs; old key validates |
| Degraded | Vault down at runtime; IAM continues with key in memory; rotation impossible; alert system.health channel |
| Critical | Key age exceeds KEY_MAX_AGE (env variable; default 7 days); IAM enters degraded mode; alert level critical; PagerDuty |

Env variables:
KEY_MAX_AGE=604800 # 7 days in seconds — degraded → critical
VAULT_ROTATION_PERIOD=7776000 # 90 days in seconds
JWT_ACCESS_TOKEN_TTL=900 # 15 minutes
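
The lifecycle states can be read as a small decision function over these variables. The sketch below is one possible reading, not the IAM code: it assumes that "key age" means the time since the key was last fetched successfully from Vault (which matches the 7-day outage threshold described on this page), and the function name key_state and its parameters are hypothetical.

```python
from enum import Enum
from typing import Optional

KEY_MAX_AGE = 604_800              # 7 days in seconds
VAULT_ROTATION_PERIOD = 7_776_000  # 90 days in seconds
JWT_ACCESS_TOKEN_TTL = 900         # 15 minutes in seconds


class KeyState(Enum):
    LOADING = "Loading"
    ACTIVE = "Active"
    DUAL_ACTIVE = "DualActive"
    DEGRADED = "Degraded"
    CRITICAL = "Critical"


def key_state(
    key_loaded: bool,
    vault_up: bool,
    seconds_since_fetch: float,
    seconds_since_rotation: Optional[float],
) -> KeyState:
    """Map runtime observations to a lifecycle state (assumed semantics)."""
    if not key_loaded:
        return KeyState.LOADING
    # Assumption: key age is measured from the last successful Vault fetch.
    if seconds_since_fetch > KEY_MAX_AGE:
        return KeyState.CRITICAL
    if not vault_up:
        return KeyState.DEGRADED
    # Right after a rotation the old key still validates for one token TTL.
    if seconds_since_rotation is not None and seconds_since_rotation <= JWT_ACCESS_TOKEN_TTL:
        return KeyState.DUAL_ACTIVE
    return KeyState.ACTIVE
```

The constants also make the unit arithmetic easy to check: 7 × 24 × 3600 = 604800, 90 × 24 × 3600 = 7776000, and 15 × 60 = 900.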

At-Rest Encryption Detail

PostgreSQL: Field-Level Encryption (PII)

PII columns (email, phone, ip_address) are encrypted at the application layer using AES-256-GCM before being written to PostgreSQL. Each tenant has its own DEK.

Write path:
plaintext PII → AES-256-GCM encrypt (DEK from Vault) → Base64 → store in column

Read path:
Base64 column → AES-256-GCM decrypt (DEK from Vault cache) → plaintext

DEKs fetched from Vault are cached in IAM memory (one entry per tenant, TTL = 1h).
A cache miss triggers a single gRPC call to Vault for that tenant, not one call per row.
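
The caching rule can be illustrated with a small per-tenant TTL cache. This is a sketch under assumptions rather than the IAM implementation: the class name DekCache is hypothetical, and the fetch callable stands in for whatever client call retrieves a tenant's DEK from Vault.

```python
import time
from typing import Callable, Dict, Tuple

DEK_CACHE_TTL = 3600  # seconds; the 1h TTL described above


class DekCache:
    """Per-tenant DEK cache: one fetch per tenant per TTL, not one per row."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],            # hypothetical Vault lookup
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._entries: Dict[str, Tuple[bytes, float]] = {}  # tenant -> (dek, expiry)

    def get(self, tenant_id: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(tenant_id)
        if entry is not None and now < entry[1]:
            return entry[0]                       # cache hit: no Vault call
        dek = self._fetch(tenant_id)              # cache miss: exactly one call
        self._entries[tenant_id] = (dek, now + DEK_CACHE_TTL)
        return dek
```

Decrypting a thousand rows for one tenant therefore calls fetch once; the next call happens only after the entry expires.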

PostgreSQL Transparent Data Encryption (TDE) is the second layer — even if the disk volume is extracted, data is unreadable without the TDE key (stored in Vault).

ClickHouse: Encrypted Volumes

ClickHouse stores data on an encrypted disk (disk type encrypted) using AES-256-CTR:

<!-- config.xml (managed by kernel-cli) -->
<storage_configuration>
    <disks>
        <encrypted_disk>
            <type>encrypted</type>
            <disk>default</disk>
            <path>encrypted/</path>
            <algorithm>AES_256_CTR</algorithm>
            <key_hex from_env="CLICKHOUSE_ENCRYPTION_KEY_HEX"/>
        </encrypted_disk>
    </disks>
</storage_configuration>

CLICKHOUSE_ENCRYPTION_KEY_HEX is injected by the Vault Sidecar at pod startup — it is never written to disk or committed to Git.
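
Because AES-256 needs a 32-byte key, the injected value must be exactly 64 hexadecimal digits. A startup check along the following lines could catch a malformed value early; the helper name load_clickhouse_key is hypothetical and is not part of kernel-cli or ClickHouse.

```python
import os
from typing import Mapping

ENV_NAME = "CLICKHOUSE_ENCRYPTION_KEY_HEX"
AES_256_KEY_BYTES = 32  # 256 bits


def load_clickhouse_key(env: Mapping[str, str] = os.environ) -> bytes:
    """Decode and length-check the key without echoing it in error messages."""
    raw = env.get(ENV_NAME, "")
    try:
        key = bytes.fromhex(raw)
    except ValueError:
        raise ValueError(f"{ENV_NAME} is not valid hexadecimal") from None
    if len(key) != AES_256_KEY_BYTES:
        raise ValueError(
            f"{ENV_NAME} must decode to {AES_256_KEY_BYTES} bytes, got {len(key)}"
        )
    return key
```

The error messages report only the decoded length, never the key material itself.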

S3 / MinIO: Server-Side Encryption

AWS S3:
  SSE-S3 (AES-256) for module bundles and files
  SSE-KMS (AES-256 with KMS key) for sensitive exports

MinIO:
  MinIO SSE-S3 with AES-256
  Key stored in Vault (MinIO KES sidecar)

Backup archives:
  AES-256 encrypted before upload
  Key fetched from Vault per backup job

In-Transit Encryption

| Hop | Protocol | Certificate |
| --- | --- | --- |
| Client → Envoy | TLS 1.3 + HSTS | Let's Encrypt / custom domain |
| Envoy → Go Gateway | HTTP/1.1 (pod-local) | N/A (within pod network) |
| Go Gateway → Core Services | gRPC mTLS | Internal mTLS CA (Vault PKI) |
| Services → PostgreSQL | TLS 1.3 (sslmode=verify-full) | Internal CA |
| Services → Valkey | TLS 1.3 (tls config) | Internal CA |
| Services → Kafka | TLS 1.3 + SASL | Internal CA + SCRAM-SHA-512 |

HSTS policy:

Strict-Transport-Security: max-age=63072000; includeSubDomains; preload

HashiCorp Vault Configuration

Vault Deployment (per environment):
Dev : Vault in dev mode (Docker Compose, unsealed automatically)
Staging : Vault single-node, auto-unseal via GCP Cloud KMS
Production : Vault cluster (3 nodes), auto-unseal via HSM (FIPS 140-3)

Secret Engines enabled:
transit/ — key operations (encrypt, decrypt, rotate, rewrap)
pki/ — internal CA for mTLS certificates
secret/ — KV v2 for API keys, DB passwords, third-party tokens
database/ — dynamic PostgreSQL credentials (TTL 1h)

Auth methods:
kubernetes — services authenticate via K8s service account JWT
approle — CI/CD pipeline authentication

Vault Sidecar (per pod):

Every core service pod runs a vault-agent sidecar that:

  1. Authenticates to Vault using the pod's Kubernetes service account.
  2. Fetches secrets and writes them to a shared in-memory volume.
  3. Renews leases automatically before expiry.

The main container reads secrets from that volume; it never calls Vault directly.
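
From the main container's side, consuming a secret then reduces to reading a file on the shared volume. The sketch below assumes a mount path of /vault/secrets and one file per secret; both are illustrative assumptions, as is the helper name read_secret.

```python
from pathlib import Path

# Assumed mount point of the shared in-memory volume (illustrative only).
SECRETS_DIR = Path("/vault/secrets")


def read_secret(name: str, base_dir: Path = SECRETS_DIR) -> str:
    """Read one secret file written by the sidecar.

    The file is re-read on every call so that values refreshed by the
    sidecar are picked up without restarting the main container.
    """
    root = base_dir.resolve()
    path = (root / name).resolve()
    # Refuse names that escape the secrets directory (e.g. "../x").
    if root not in path.parents:
        raise ValueError(f"secret name escapes {root}: {name!r}")
    return path.read_text(encoding="utf-8").strip()
```

A call such as read_secret("db-password") would return the current file contents with surrounding whitespace removed; the file name here is hypothetical.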

Vault Outage Behaviour

| Vault state | Impact | Automatic recovery |
| --- | --- | --- |
| Down < 15 min | None; all secrets already in memory | Yes, on reconnect |
| Down > 15 min | New key rotation skipped; alert fired | Yes |
| Down 7 days | KEY_MAX_AGE exceeded; critical alert; degraded mode | No; manual action required |
| Pod restart during outage | Vault sidecar cannot inject secrets → pod fails readiness → K8s keeps old pod running | Auto when Vault recovers |

Security Threat Model

| Threat | Mitigation layer |
| --- | --- |
| Malicious third-party module | CSP + Wasm sandbox (Rust, < 5ms budget) + SCA scan (Snyk) |
| XSS between MFE modules | Error boundaries + CORS + per-module isolated scope |
| Cross-tenant data leak | PostgreSQL RLS + ClickHouse row policy (dual) |
| Stolen JWT | 15-minute TTL + ES256 + refresh token rotation |
| Stolen refresh token | Valkey invalidation + single-use + 10-second grace window |
| Compromised DEK | Blast radius = one tenant; re-wrap via new KEK on rotation |
| SQL injection | sqlc prepared statements; input validated by OpenAPI 3.x |
| API abuse / DDoS | Envoy local token-bucket + Valkey global rate limit + Cloudflare edge |
| Secret in Git | Vault-only secret storage; automated trufflehog scan in CI |
| Kubernetes secret exposure | Vault sidecar injection into memory volume; no K8s Secret objects |
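
The "Stolen refresh token" row combines single-use tokens with a short grace window. One way this could work is sketched below, with an in-memory dictionary standing in for Valkey. The semantics are an assumption, not taken from this page: a token that was already used is tolerated for 10 seconds (to absorb concurrent retries) and returns the same successor, while reuse after that window is treated as theft and revokes the successor. The class name RefreshTokenStore is hypothetical.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

GRACE_WINDOW = 10  # seconds; the grace window from the table above


class RefreshTokenStore:
    """In-memory stand-in for a Valkey-backed refresh token store."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._active: Dict[str, str] = {}                   # token -> user_id
        self._used: Dict[str, Tuple[float, str]] = {}       # token -> (used_at, successor)

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._active[token] = user_id
        return token

    def rotate(self, token: str) -> Optional[str]:
        """Exchange a refresh token for its successor, or return None."""
        now = self._clock()
        user_id = self._active.pop(token, None)  # single use: removed on first use
        if user_id is not None:
            successor = self.issue(user_id)
            self._used[token] = (now, successor)
            return successor
        used = self._used.get(token)
        if used is None:
            return None                          # unknown token
        used_at, successor = used
        if now - used_at <= GRACE_WINDOW:
            return successor                     # assumed: retry gets the same successor
        # Assumed: late reuse signals theft, so the successor is invalidated.
        # A fuller implementation would revoke the whole token family.
        self._active.pop(successor, None)
        return None
```

Under these assumptions, replaying a used token 5 seconds later returns the same successor, while replaying it 11 seconds later returns nothing and also invalidates that successor.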

  • Service Map — Vault Sidecar in the service inventory
  • Tenant Isolation — how RLS and ClickHouse row policies enforce per-tenant data boundaries
  • Data Flow — JWT issuance and validation in the request lifecycle