Deployment
Platform Kernel ships as a polyglot monorepo (Go services + TypeScript shell) and supports two deployment targets:
| Environment | Orchestrator | Use case |
|---|---|---|
| Local / Dev | Docker Compose | Full-stack sandbox — one command |
| Staging / Prod | Kubernetes (k8s) | Multi-tenant SaaS, HPA, rolling updates |
Container Model — Dockerfile.monorepo
All 12 Go services share a single Dockerfile (`Dockerfile.monorepo` at the monorepo root). The target service is selected via `--build-arg SERVICE=<name>`.
Build Stages
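The stage layout itself isn't reproduced in this document, but a shared per-service Dockerfile of this kind typically follows a two-stage shape. The sketch below is illustrative only — stage names, paths, and the `appuser` setup are assumptions, not the repo's actual `Dockerfile.monorepo`:

```dockerfile
# Illustrative sketch of a shared multi-stage build; stage names and
# paths are assumptions, not the actual Dockerfile.monorepo.
ARG GO_VERSION
ARG ALPINE_VERSION

FROM golang:${GO_VERSION}-alpine AS build
ARG SERVICE
WORKDIR /src
COPY . .
# GOWORK=off: resolve modules via replace directives, not the workspace
RUN CGO_ENABLED=0 GOWORK=off \
    go build -ldflags="-s -w" -o /out/${SERVICE} ./services/${SERVICE}

FROM alpine:${ALPINE_VERSION}
ARG SERVICE
RUN adduser -D -u 10001 appuser
USER appuser
COPY --from=build /out/${SERVICE} /app/service
ENTRYPOINT ["/app/service"]
```

A per-service build would then look like `docker build -f Dockerfile.monorepo --build-arg SERVICE=iam .`; the files service would deviate from this sketch, since it needs `CGO_ENABLED=1` and libvips installed in both stages.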
Key Build Properties
| Property | Value |
|---|---|
| Go version | 1.26 (from `docker/versions.env`) |
| Alpine version | 3.21 (from `docker/versions.env`) |
| Binary | Fully static, no libc (`CGO_ENABLED=0`) |
| Debug symbols | Stripped (`-ldflags="-s -w"`) — minimizes image size |
| User | `appuser`, UID 10001 — non-root |
| `files` service | `CGO_ENABLED=1` (requires libvips for bimg); the one exception to the static-binary rule |
| Go workspace | `GOWORK=off` during build — module resolution via `replace` directives |
| Platform | `linux/amd64` (CI) |
Image Tag Strategy
```
ghcr.io/{owner}/platform-kernel/{service}:{git-sha}
ghcr.io/{owner}/platform-kernel/{service}:latest    # main branch only
ghcr.io/{owner}/platform-kernel/{service}:staging   # staging tag
```
Local Development — Docker Compose
Stack Definition
The full development stack is defined in docker/docker-compose.yml. Versions
are pinned in docker/versions.env (single source of truth for all
environments):
```sh
# docker/versions.env (source of truth)
GO_VERSION=1.26
ALPINE_VERSION=3.21
POSTGRES_VERSION=17-alpine
VALKEY_VERSION=8.1-alpine
KAFKA_VERSION=3.9.0
RABBITMQ_VERSION=4.1-management-alpine
CLICKHOUSE_VERSION=25.3-alpine
SEAWEEDFS_VERSION=3.84
VAULT_VERSION=1.19
GO_FEATURE_FLAG_VERSION=v1.42.0
ENVOY_VERSION=v1.33-latest
```
Startup
```sh
# Start full kernel stack (all 12 services + infra)
docker compose \
  --env-file docker/versions.env \
  -f docker/docker-compose.yml \
  up -d --wait

# Start only infrastructure (for local Go service development)
docker compose \
  --env-file docker/versions.env \
  -f docker/docker-compose.yml \
  up -d --wait \
  postgres valkey kafka clickhouse rabbitmq vault
```
Service Port Map (local)
| Service | Port | Notes |
|---|---|---|
| API Gateway (Envoy) | 8443 | HTTPS, TLS termination |
| Gateway (Go) | 8080 | Internal gRPC |
| IAM | 8081 | gRPC + /health |
| PostgreSQL | 5432 | platform_kernel database |
| ClickHouse | 8123 / 9000 | HTTP + native |
| Kafka | 9092 | Plaintext |
| Valkey | 6379 | — |
| Vault | 8200 | Dev mode |
| GoFeatureFlag | 1031 | — |
| SeaweedFS | 8888 | — |
Compose Profiles
| Compose file | Purpose |
|---|---|
| docker-compose.yml | Full stack |
| docker-compose.ci.yml | CI overlay — removes volumes, adds healthchecks |
| docker-compose.pact.yml | Pact Broker overlay — Postgres + pact-broker UI |
| docker-compose.kafka.yaml | Kafka-only |
| docker-compose.sandbox.yml | Tenant sandbox — isolated per-tenant env |
| docker-compose.gateway.yaml | Gateway + Envoy only |
Production — Kubernetes
Zero-Downtime Rolling Update
All Kernel services use the `RollingUpdate` deployment strategy with `maxUnavailable: 0` — no old pod is terminated until its replacement is Ready.
```yaml
# Example: IAM service Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: platform-iam
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # One extra pod spun up before old removed
      maxUnavailable: 0  # Zero downtime — old pod stays until new ready
  template:
    spec:
      containers:
        - name: iam
          image: ghcr.io/{owner}/platform-kernel/iam:{sha}
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 5
      terminationGracePeriodSeconds: 30
```
Zero-Downtime Kafka Consumer Rolling (Cooperative Sticky)
A standard (eager) Kafka rebalance pauses every consumer in the group during partition reassignment. The cooperative sticky protocol pauses only the partitions actually being migrated, so a rolling restart of consumer pods never stalls the whole group. Configured in the services/event-bus Go consumer:
```go
// services/event-bus — reader configuration
r := kafka.NewReader(kafka.ReaderConfig{
	Brokers: cfg.Brokers,
	GroupID: cfg.GroupID,
	Topic:   cfg.Topic,
	GroupBalancers: []kafka.GroupBalancer{
		kafka.CooperativeStickyGroupBalancer{},
	},
})
```
Horizontal Pod Autoscaler (HPA)
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: platform-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: platform-gateway
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # Scale out when CPU > 70%
```
The HPA scales out when average CPU exceeds 70% (configurable via the `HPA_CPU_THRESHOLD` env var). Scale-down uses a 5-minute stabilization window to prevent flapping.
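The 5-minute scale-down window is not something the metrics block expresses; in `autoscaling/v2` it would sit under the HPA's `behavior` field. A sketch, assuming the window is set declaratively per-HPA rather than cluster-wide:

```yaml
# Sketch: scale-down stabilization under spec.behavior (autoscaling/v2)
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300  # 5 min — ignore short CPU dips
```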
Canary Deployment via Feature Flags
New features ship behind GoFeatureFlag gates. Traffic is split at the application level — no Kubernetes traffic-splitting machinery is required. Canary configuration lives in docker/feature-flags/:
```yaml
# feature-flags/flags.yaml
feature-x:
  variations:
    enabled: true
    disabled: false
  defaultRule:
    percentage:
      enabled: 10   # 10% of tenants get the new behavior
      disabled: 90
  targeting: []
```
Rollout sequence:
- Deploy new code (Feature Flag OFF for all) — zero impact
- Enable for 10% → monitor error rates and latency
- Ramp to 50% → 100% → remove flag from code
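GoFeatureFlag does the bucketing itself, but the mechanics of percentage targeting can be illustrated with a small sketch. The hashing scheme and the `inCanary` helper below are illustrative, not GoFeatureFlag's actual implementation; the point is that deterministic hashing gives each tenant a stable bucket, so ramping 10% → 50% only ever adds tenants to the canary:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// inCanary reports whether tenantID falls inside the rollout percentage.
// FNV-1a is deterministic, so a tenant's bucket never changes between
// evaluations: raising the percentage only adds tenants to the canary,
// it never flips an already-enabled tenant back out.
func inCanary(tenantID string, percentage uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(tenantID))
	return h.Sum32()%100 < percentage
}

func main() {
	for _, p := range []uint32{0, 10, 50, 100} {
		fmt.Printf("tenant-42 at %3d%%: %v\n", p, inCanary("tenant-42", p))
	}
}
```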
Envoy xDS Dynamic Config
Envoy Gateway uses xDS (LDS/RDS/CDS/EDS) for zero-restart config updates. When a new module is installed, the Gateway Service pushes a new RDS route config:
`ENVOY_DRAIN_TIMEOUT_SEC=10` — in-flight requests on removed routes have 10 s to complete before the route is deleted.
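How that env var reaches Envoy isn't shown here. Assuming it maps onto Envoy's standard `--drain-time-s` startup flag, the wiring in the gateway pod might look like the following sketch (not the repo's actual manifest):

```yaml
# Sketch: passing ENVOY_DRAIN_TIMEOUT_SEC to Envoy's drain timer
args:
  - "--config-path"
  - "/etc/envoy/bootstrap.yaml"
  - "--drain-time-s"
  - "$(ENVOY_DRAIN_TIMEOUT_SEC)"  # Kubernetes expands env vars in args
```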
CI/CD Pipeline
The pipeline is defined in `.github/workflows/ci.yml`. All jobs run on GitHub Actions `ubuntu-latest` runners, except load tests, which run on a self-hosted VDS runner tagged `[self-hosted, linux, load-test]`.
Pipeline Graph
Job Details
| Job | Trigger | Tool | SLA |
|---|---|---|---|
| Lint | Every push/PR | golangci-lint v2 · ESLint 9.x · markdownlint | < 2 min |
| Unit Tests — Go | Every push/PR | go test -race -count=1 all 12 services | < 5 min |
| Unit Tests — TypeScript | Every push/PR | Vitest + Playwright (Browser Mode) | < 4 min |
| Build — Go | Every push/PR | go build -ldflags="-s -w" all 12 | < 3 min |
| Build — TypeScript | Every push/PR | pnpm build | < 3 min |
| Integration Tests | Every PR (needs Build-Go) | Testcontainers (real PG 16, Kafka 3.9, Valkey 8.1) | < 8 min |
| Contract Tests | Every PR (needs Integration) | Pact v4 Go SDK + self-hosted Pact Broker | < 4 min |
| Security Scan | Every push/PR | Snyk + OWASP ZAP + Semgrep | < 6 min |
| Docker Build | Every PR (needs Build-Go) | BuildKit parallel --no-cache | < 10 min |
| Smoke Tests | Every PR (needs Docker) | curl /health/live /health/ready all services | < 5 min |
| Load Test | push main only | k6 — 10K RPS, p99 < 200 ms, 0 errors | < 15 min |
| CI Gate | After all above | Required status check for branch protection | instant |
Concurrency Policy
```yaml
concurrency:
  group: ci-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
```
- PRs: cancel in-progress on new push (saves CI minutes)
- main: serialize — no torn deploys
Integration Test Stack
Testcontainers spins up real infrastructure (not mocks) for integration tests via `docker compose up -d --wait`:

postgres (PG 16-alpine) · valkey (8.1-alpine) · kafka (3.9.0) · clickhouse (25.3-alpine) · rabbitmq (4.1-management) · vault (1.19)
Environment variables injected:
```sh
TEST_POSTGRES_URL=postgres://kernel:kernel_dev_password@localhost:5432/platform_kernel
TEST_KAFKA_BROKERS=localhost:9092
TEST_VALKEY_ADDR=localhost:6379
```
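Integration tests typically guard on these variables so the suite degrades gracefully when the infra stack isn't running. A minimal sketch — the `pgHost` helper is illustrative, not part of the repo:

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// pgHost extracts host:port from a Postgres DSN so a test can report
// (or skip on) the target instance without connecting first.
func pgHost(dsn string) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", err
	}
	return u.Host, nil
}

func main() {
	dsn := os.Getenv("TEST_POSTGRES_URL")
	if dsn == "" {
		fmt.Println("TEST_POSTGRES_URL unset — integration tests would be skipped")
		return
	}
	host, err := pgHost(dsn)
	if err != nil {
		fmt.Println("invalid DSN:", err)
		return
	}
	fmt.Println("integration tests target", host)
}
```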
See Also
- Testing Strategy — 8-level testing pyramid
- Observability — OpenTelemetry → VictoriaMetrics
- Deduplication — 3-layer idempotency