Kubernetes Deployment
This page documents the production Kubernetes deployment architecture
for Platform-Kernel. As of April 2026, the project does not ship
first-party Helm charts — operators must adapt the Docker Compose stack
to their cluster using the patterns described here. Each Go service
image is built from the project's multi-stage Dockerfile and pushed
to GHCR (see CI pipeline).
Recommended toolchain:
- Kubernetes 1.36 (April 2026 GA)
- Helm 4.1.4
- Istio 1.29.2 (service mesh, mTLS enforcement)
- cert-manager 1.17+ (TLS certificate lifecycle)
Architecture Overview
Namespace Layout
kubectl create namespace platform-infra # Stateful: PG, CH, Kafka, Vault
kubectl create namespace platform-services # Go services
kubectl create namespace platform-ingress # Envoy (Ingress)
kubectl create namespace istio-system # Istio control plane
Docker Images
Each Go service is built using a two-stage Dockerfile:
- Builder stage: golang:1.26.1-alpine — compiles a static binary with CGO_ENABLED=0 for cross-platform compatibility
- Runtime stage: alpine:3.21 — minimal image, non-root user appuser (UID 10001), only ca-certificates, wget, and tzdata
# Shared build pattern (all 12 Go services):
FROM golang:${GO_VERSION}-alpine AS builder
ARG TARGETOS TARGETARCH
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH \
go build -ldflags="-s -w" -o /service ./cmd/<service>
FROM alpine:${ALPINE_VERSION}
RUN adduser -D -u 10001 appuser
COPY --from=builder /service /service
USER appuser
EXPOSE 8080 50050
ENTRYPOINT ["/service"]
Images are published to GHCR:
ghcr.io/<org>/platform-kernel/<service>:<git-sha>
The CI pipeline (ci.yml) builds and pushes on every merge to main.
CI and Image Registry
The ci.yml pipeline (GitHub Actions) builds production images using
the docker-build job. Required secrets and variables:
| Secret / Variable | Description |
|---|---|
| REGISTRY_TOKEN | GHCR push credentials |
| SONAR_TOKEN | SonarQube 25.1 SAST upload |
| STAGING_SSH_KEY | SSH key for staging deploy |
| STAGING_HOST | Staging server hostname |
| STAGING_USER | SSH user on staging |
| STAGING_WORK_DIR | Working directory on staging |
Build args passed to every docker build:
docker build \
--build-arg GO_VERSION=1.26 \
--build-arg ALPINE_VERSION=3.21 \
-t ghcr.io/<org>/platform-kernel/iam:<sha> \
-f services/iam/Dockerfile \
. # Build context = monorepo root (required for go.work)
Stateless Services — Deployment Pattern
All 12 Go services share the same Deployment pattern. Example for
the IAM service:
apiVersion: apps/v1
kind: Deployment
metadata:
name: iam
namespace: platform-services
spec:
replicas: 3
selector:
matchLabels:
app: iam
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0 # Zero-downtime rolling update
template:
metadata:
labels:
app: iam
spec:
serviceAccountName: platform-iam
securityContext:
runAsNonRoot: true
runAsUser: 10001
fsGroup: 10001
containers:
- name: iam
image: ghcr.io/<org>/platform-kernel/iam:<sha>
ports:
- containerPort: 8080 # HTTP health + REST
- containerPort: 50050 # gRPC
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: platform-db-secret
key: iam-dsn
- name: VAULT_ADDR
value: "http://vault.platform-infra.svc.cluster.local:8200"
- name: VAULT_TOKEN
valueFrom:
secretKeyRef:
name: vault-token
key: token
- name: IAM_GRPC_PORT
value: "50050"
resources:
requests:
cpu: 100m
memory: 64Mi
limits:
cpu: 500m
memory: 128Mi
livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 10
periodSeconds: 10
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
Key principles for all services:
- maxUnavailable: 0 — zero-downtime rolling update
- Non-root user (UID 10001) enforced in both the Dockerfile and the pod securityContext
- Memory limit matches Docker Compose deploy.resources.limits
- Health probes use the /health/live and /health/ready endpoints
StatefulSets — PostgreSQL
PostgreSQL 17 requires a StatefulSet with persistent volumes:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres
namespace: platform-infra
spec:
serviceName: postgres
replicas: 2 # 1 primary + 1 replica (streaming replication)
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:17-alpine
env:
- name: POSTGRES_DB
value: platform_kernel
- name: POSTGRES_USER
value: kernel
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
resources:
limits:
memory: 512Mi
cpu: "2"
volumeMounts:
- name: postgres-data
mountPath: /var/lib/postgresql/data
livenessProbe:
exec:
command: ["pg_isready", "-U", "kernel", "-d", "platform_kernel"]
periodSeconds: 10
volumeClaimTemplates:
- metadata:
name: postgres-data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd # NVMe StorageClass required
resources:
requests:
storage: 200Gi
RLS note: PostgreSQL Row-Level Security is applied at the
application level via goose migrations. Kubernetes does not require
any special PostgreSQL configuration for RLS.
StatefulSets — ClickHouse
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: clickhouse
namespace: platform-infra
spec:
serviceName: clickhouse
replicas: 2 # 1 shard + 1 replica
template:
spec:
containers:
- name: clickhouse
image: clickhouse/clickhouse-server:25.3-alpine
resources:
limits:
memory: 2Gi
cpu: "4"
volumeMounts:
- name: clickhouse-data
mountPath: /var/lib/clickhouse
volumeClaimTemplates:
- metadata:
name: clickhouse-data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 500Gi
Kafka — KRaft Mode (No ZooKeeper)
Kafka 3.9.0 runs in KRaft mode — no separate ZooKeeper cluster is
required. Deploy as a 3-node StatefulSet for HA:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: kafka
namespace: platform-infra
spec:
serviceName: kafka
replicas: 3
template:
spec:
containers:
- name: kafka
image: apache/kafka:3.9.0
env:
- name: KAFKA_PROCESS_ROLES
value: broker,controller
- name: KAFKA_CONTROLLER_QUORUM_VOTERS
value: "0@kafka-0:9093,1@kafka-1:9093,2@kafka-2:9093"
- name: KAFKA_LOG_DIRS
value: /var/lib/kafka/data
- name: KAFKA_NUM_PARTITIONS
value: "6"
- name: KAFKA_DEFAULT_REPLICATION_FACTOR
value: "3"
- name: KAFKA_MESSAGE_MAX_BYTES
value: "1048576" # 1 MB max event payload
- name: KAFKA_LOG_RETENTION_HOURS
value: "168" # 7 days
- name: KAFKA_AUTO_CREATE_TOPICS_ENABLE
value: "false"
resources:
limits:
memory: 512Mi
cpu: "2"
volumeClaimTemplates:
- metadata:
name: kafka-data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 200Gi
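The manifest above references brokers by ordinal in KAFKA_CONTROLLER_QUORUM_VOTERS but omits the per-broker node ID, which KRaft requires to be unique. One common pattern — a sketch, not taken from the project; the entrypoint path is an assumption about the apache/kafka image — derives the ID from the StatefulSet pod ordinal in the container command:

```yaml
# Container command override (sketch): derive KAFKA_NODE_ID from the
# pod ordinal so kafka-0 / kafka-1 / kafka-2 get IDs 0 / 1 / 2.
command:
  - /bin/sh
  - -c
  - |
    export KAFKA_NODE_ID="${HOSTNAME##*-}"
    exec /etc/kafka/docker/run   # image entrypoint (assumption)
```

A headless Service named kafka is also needed so that the kafka-0, kafka-1, kafka-2 DNS names in the quorum voter list resolve.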
HashiCorp Vault — HA Raft Mode
In production, Vault runs in HA mode with integrated Raft storage
(3 nodes). In dev, Vault runs as a single node with server -dev.
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: vault
namespace: platform-infra
spec:
serviceName: vault
replicas: 3
template:
spec:
containers:
- name: vault
image: hashicorp/vault:1.19
args: ["server"]
env:
- name: POD_IP           # downward API; required for $(POD_IP) expansion below
  valueFrom:
    fieldRef:
      fieldPath: status.podIP
- name: POD_NAME         # required for $(POD_NAME) expansion below
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
- name: VAULT_LOCAL_CONFIG
value: |
ui = true
cluster_addr = "https://$(POD_IP):8201"
api_addr = "https://$(POD_IP):8200"
storage "raft" {
path = "/vault/data"
node_id = "$(POD_NAME)"
}
listener "tcp" {
address = "0.0.0.0:8200"
tls_disable = false
tls_cert_file = "/vault/tls/tls.crt"
tls_key_file = "/vault/tls/tls.key"
}
securityContext:
capabilities:
add: ["IPC_LOCK"] # Required for Vault memory locking
ports:
- containerPort: 8200
- containerPort: 8201 # Cluster/raft
Vault roles used by Platform-Kernel:
| Role | Policies |
|---|---|
| platform-iam | read jwt-signing-keys, write token-store |
| platform-domain-resolver | read tls-certs, write acme-challenges |
| platform-services | read db-creds (dynamic secrets) |
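In Vault's policy language (HCL) the first role above might look like the following sketch — the secret paths are assumptions based on the roles table, not the project's actual policy files:

```hcl
# platform-iam policy (sketch; KV v2 paths require the data/ prefix)
path "secret/data/platform/iam/jwt-signing-keys" {
  capabilities = ["read"]
}
path "secret/data/platform/iam/token-store" {
  capabilities = ["create", "update"]
}
```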
JWT signing keys (ES256, P-256) are stored in Vault KV v2 and rotated every 90 days using a dual-key strategy: the old key remains valid during the rotation window, so tokens signed before the rotation continue to verify until they expire.
Istio Service Mesh
Istio 1.29.2 enforces mutual TLS (mTLS) between all services in
the platform-services namespace:
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: default
namespace: platform-services
spec:
mtls:
mode: STRICT
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: platform-services-mtls
namespace: platform-services
spec:
host: "*.platform-services.svc.cluster.local"
trafficPolicy:
tls:
mode: ISTIO_MUTUAL
This replaces the services/shared/mtls Go package's self-managed
certificate handling in production. In development (Docker Compose),
shared/mtls handles mTLS without a service mesh.
Horizontal Pod Autoscaling
CPU-based HPA for stateless Go services:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: gateway-hpa
namespace: platform-services
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: gateway
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 60
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 75
Apply the same pattern to iam, data-layer, money — the services
under highest load.
Zero-Downtime Deployment
Platform-Kernel's zero-downtime strategy uses:
- maxUnavailable: 0 in all Deployment rolling update strategies
- Kafka cooperative sticky rebalancing — consumers continue processing during rolling restarts without a full partition rebalance
- Kafka static group membership (group.instance.id=$POD_NAME) — prevents unnecessary consumer group rebalances when pods restart
- Readiness gating — kubectl rollout status waits for all pods to pass /health/ready before marking the rollout complete
# Deploy a new image version:
kubectl set image deployment/iam \
iam=ghcr.io/<org>/platform-kernel/iam:<new-sha> \
-n platform-services
# Monitor rollout:
kubectl rollout status deployment/iam -n platform-services
# → deployment "iam" successfully rolled out
# Rollback if needed:
kubectl rollout undo deployment/iam -n platform-services
Secret Management
All secrets are stored in Vault and injected via the Vault Agent sidecar or Vault Secrets Operator (VSO):
# Kubernetes Secret referencing Vault (via VSO):
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
name: platform-jwt-keys
namespace: platform-services
spec:
type: kv-v2
mount: secret
path: platform/iam/jwt-keys
destination:
name: jwt-keys
create: true
refreshAfter: 1h
Do not store JWT_PRIVATE_KEY or JWT_PUBLIC_KEY in plain
Kubernetes Secrets without encryption at rest.
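Once VSO has synced the Vault entry into the jwt-keys Secret shown above, services consume it like any native Secret. A sketch — the key name inside the KV v2 entry is an assumption:

```yaml
# Container env in the IAM Deployment (sketch):
env:
  - name: JWT_PRIVATE_KEY
    valueFrom:
      secretKeyRef:
        name: jwt-keys       # Secret created by the VaultStaticSecret
        key: private-key     # field name in the KV v2 entry (assumption)
```

Because VSO refreshes the Secret every hour (refreshAfter: 1h), a rotated key reaches pods without a manual redeploy, though pods only pick up the new value on restart unless the service re-reads it.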
Node Affinity
Separate stateful and stateless containers to prevent resource contention:
# For Go service Deployments:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-type
operator: In
values: ["stateless"]
# For StatefulSets (PG, CH, Kafka):
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-type
operator: In
values: ["stateful"]
Label your nodes accordingly:
kubectl label node <worker-1> node-type=stateless
kubectl label node <worker-2> node-type=stateless
kubectl label node <storage-1> node-type=stateful
kubectl label node <storage-2> node-type=stateful
See Also
- Requirements — Kubernetes version matrix and hardware sizing
- Docker Compose Setup — local and staging deployment
- Vault Setup — JWT key rotation and Raft HA configuration
- Monitoring — VictoriaMetrics, Grafana, alerts
- Architecture → Deployment — CI/CD pipeline and deployment lifecycle