Vault Setup

HashiCorp Vault 1.19 setup for Platform-Kernel: KV v2 secrets engine, ES256 JWT signing key lifecycle, AppRole authentication, 90-day automatic rotation, Raft HA unsealing, and TLS certificate management.

Platform-Kernel integrates with HashiCorp Vault via the services/vault/ package, which implements the SecretProvider interface. There is no fallback to environment variables for secrets (Architectural Decision AD-1): if Vault is unreachable at startup, the service aborts after VAULT_INIT_TIMEOUT_SEC seconds (120 by default).
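The fail-fast startup behaviour can be illustrated with a short Go sketch. It is not the code in services/vault/; it assumes the retry delay doubles from VAULT_RETRY_INITIAL_MS up to VAULT_RETRY_MAX_MS (the defaults listed under Vault Environment Variables) and that the service gives up once the schedule would exceed VAULT_INIT_TIMEOUT_SEC. The names backoffDelays, defaultDelays, and waitForVault are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Defaults taken from the Vault environment variables documented on this page.
const (
	initTimeout  = 120 * time.Second        // VAULT_INIT_TIMEOUT_SEC
	retryInitial = 250 * time.Millisecond   // VAULT_RETRY_INITIAL_MS
	retryMax     = 16000 * time.Millisecond // VAULT_RETRY_MAX_MS
)

var errVaultUnavailable = errors.New("vault unreachable before init timeout")

// backoffDelays returns doubling delays, capped at maxDelay, whose sum stays
// within timeout. Time spent inside each probe is ignored in this sketch.
func backoffDelays(initial, maxDelay, timeout time.Duration) []time.Duration {
	if initial <= 0 {
		return nil
	}
	var delays []time.Duration
	elapsed, d := time.Duration(0), initial
	for elapsed+d <= timeout {
		delays = append(delays, d)
		elapsed += d
		d *= 2
		if d > maxDelay {
			d = maxDelay
		}
	}
	return delays
}

// defaultDelays applies the documented defaults.
func defaultDelays() []time.Duration {
	return backoffDelays(retryInitial, retryMax, initTimeout)
}

// waitForVault calls probe until it succeeds or the delay schedule is exhausted.
// There is deliberately no fallback path: failure is returned to the caller.
func waitForVault(probe func() error, delays []time.Duration) error {
	for _, d := range delays {
		if probe() == nil {
			return nil
		}
		time.Sleep(d)
	}
	if probe() == nil {
		return nil
	}
	return errVaultUnavailable
}

func main() {
	delays := defaultDelays()
	fmt.Println("retries:", len(delays), "first:", delays[0], "last:", delays[len(delays)-1])

	// A probe that always fails, run with an empty schedule so the demo does not sleep.
	err := waitForVault(func() error { return errors.New("connection refused") }, nil)
	fmt.Println("result:", err)
}
```

A real probe would typically issue a request to /v1/sys/health; a caller that receives the error would log it and exit rather than read secrets from the environment.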

Project-pinned version: Vault 1.19 (docker/versions.env).

Vault 2.0.0 note: Vault 2.0.0 was released April 2026 and introduces breaking changes in the Agent/SPIFFE integrations. The project remains on 1.19 until a migration guide is published. Do not upgrade without reviewing the Vault 2.0 changelog against services/vault/vault_provider.go.


Architecture

[Architecture diagram]

Development — Dev Mode

In development, Vault runs as a single node in dev mode (no persistence, auto-unsealed, root token):

# docker/docker-compose.yml (excerpt)
vault:
  image: hashicorp/vault:1.19
  command: "server -dev"
  environment:
    VAULT_DEV_ROOT_TOKEN_ID: kernel-dev-root-token
    VAULT_DEV_LISTEN_ADDRESS: "0.0.0.0:8200"
  ports:
    - "8200:8200"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:8200/v1/sys/health"]
    interval: 10s
    timeout: 5s
    retries: 5

The dev root token (kernel-dev-root-token) is injected directly into all Go services via VAULT_TOKEN. Dev-mode JWT keys are pre-seeded base64-encoded ES256 key pairs in docker-compose.yml.

Start Vault (dev):

docker compose \
  --env-file docker/versions.env \
  -f docker/docker-compose.yml \
  up -d vault --wait

# Verify:
curl -s http://localhost:8200/v1/sys/health | python3 -m json.tool
# → {"initialized":true,"sealed":false,"standby":false,...}
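The same health check can be evaluated programmatically. The sketch below decodes the three fields shown in the sample output and treats a node as usable when it is initialized and unsealed. The type and function names are illustrative and do not come from the Platform-Kernel codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// healthStatus holds the subset of /v1/sys/health fields used on this page.
type healthStatus struct {
	Initialized bool `json:"initialized"`
	Sealed      bool `json:"sealed"`
	Standby     bool `json:"standby"`
}

// parseHealth decodes a /v1/sys/health response body.
func parseHealth(body []byte) (healthStatus, error) {
	var h healthStatus
	err := json.Unmarshal(body, &h)
	return h, err
}

// usable reports whether the node can serve secrets at all.
func (h healthStatus) usable() bool {
	return h.Initialized && !h.Sealed
}

func main() {
	body := []byte(`{"initialized":true,"sealed":false,"standby":false}`)
	h, err := parseHealth(body)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("usable:", h.usable(), "standby:", h.Standby)
}
```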

Production — Raft HA Mode

Production uses three-node Raft integrated storage (no Consul required). Each node is a Vault StatefulSet pod in the platform-infra namespace.

HCL Configuration

# /etc/vault/config.hcl (mounted as a ConfigMap)
ui            = true
cluster_addr  = "https://NODE_IP:8201"
api_addr      = "https://NODE_IP:8200"

storage "raft" {
  path    = "/vault/data"
  node_id = "POD_NAME"     # injected via Downward API env var
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_disable   = false
  tls_cert_file = "/vault/tls/tls.crt"
  tls_key_file  = "/vault/tls/tls.key"
}

seal "awskms" {               # or "gcpckms" / "azurekeyvault"
  region     = "eu-central-1"
  kms_key_id = "arn:aws:kms:..."
}

Initialisation

Run once per cluster on fresh install:

export VAULT_ADDR=https://vault-0.vault.platform-infra.svc.cluster.local:8200

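# Initialise. The flags below (5 key shares, threshold 3) apply to the default
# Shamir seal. With the KMS seal stanza shown in the HCL above, Vault issues
# recovery keys instead: use -recovery-shares=5 -recovery-threshold=3.
vault operator init \
  -key-shares=5 \
  -key-threshold=3 \
  -format=json > /secure/vault-init.json

# Store the keys in a separate HSM or secret manager. NEVER commit them to git.

# Unseal vault-0 (Shamir seal only; a KMS-sealed node unseals itself)
vault operator unseal <key-1>
vault operator unseal <key-2>
vault operator unseal <key-3>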

# Join vault-1 and vault-2 to the Raft cluster
VAULT_ADDR=https://vault-1.vault.platform-infra.svc.cluster.local:8200 \
  vault operator raft join https://vault-0.vault.platform-infra.svc.cluster.local:8200

VAULT_ADDR=https://vault-2.vault.platform-infra.svc.cluster.local:8200 \
  vault operator raft join https://vault-0.vault.platform-infra.svc.cluster.local:8200

Auto-Unseal (Production Requirement)

In production, Vault must be configured with auto-unsealing via a cloud KMS (AWS KMS, GCP Cloud KMS, or Azure Key Vault). Manual unseal from init keys is only for break-glass scenarios.


Secrets Engine Setup

Enable KV v2

vault secrets enable -path=secret kv-v2

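All Platform-Kernel secrets are stored under secret/data/platform/. This is the KV v2 API path, which policies also use; the vault kv CLI takes the same location without the data/ segment (secret/platform/...).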

Secret Path Convention

| Secret | Vault path | Reader |
|---|---|---|
| IAM JWT signing key (ES256) | secret/data/platform/iam/jwt-keys | IAM |
| IAM OIDC RSA key (RS256) | secret/data/platform/iam/oidc-rsa-key | IAM |
| MFA encryption key (AES-256) | secret/data/platform/iam/mfa-key | IAM |
| Domain TLS certificate | secret/data/platform/domains/{domain}/cert | Domain Resolver |
| DB credentials (dynamic) | database/creds/platform-db-role | All DB services |
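KV v2 places a data/ or metadata/ segment directly after the mount point in API paths. That is why the table lists secret/data/platform/... while CLI commands on this page use secret/platform/.... The helper below is a small illustration of that mapping; its function names are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// kvV2Path joins a mount, a KV v2 segment ("data" or "metadata"), and a logical path.
func kvV2Path(mount, segment, logical string) string {
	return strings.Trim(mount, "/") + "/" + segment + "/" + strings.Trim(logical, "/")
}

// dataPath is the API path used to read or write secret versions.
func dataPath(mount, logical string) string {
	return kvV2Path(mount, "data", logical)
}

// metadataPath is the API path used to list keys and inspect version metadata.
func metadataPath(mount, logical string) string {
	return kvV2Path(mount, "metadata", logical)
}

func main() {
	// The CLI form "vault kv get secret/platform/iam/jwt-keys" maps to:
	fmt.Println(dataPath("secret", "platform/iam/jwt-keys"))
	fmt.Println(metadataPath("secret", "platform/iam/jwt-keys"))
}
```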

JWT Signing Key Management

Key Generation (ES256 — ECDSA P-256)

# Generate private key
openssl ecparam -name prime256v1 -genkey -noout \
  -out /tmp/iam-jwt-private.pem

# Extract public key
openssl ec -in /tmp/iam-jwt-private.pem \
  -pubout -out /tmp/iam-jwt-public.pem

# Store in Vault KV v2
vault kv put secret/platform/iam/jwt-keys \
  private_key="$(cat /tmp/iam-jwt-private.pem | base64)" \
  public_key="$(cat /tmp/iam-jwt-public.pem | base64)"

# Destroy local copies immediately
shred -u /tmp/iam-jwt-private.pem /tmp/iam-jwt-public.pem
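For tooling written in Go, a roughly equivalent key pair can be produced with the standard library. This sketch assumes the stored values are base64-encoded PEM, as in the commands above: a SEC1 "EC PRIVATE KEY" block and a PKIX "PUBLIC KEY" block. The function names are hypothetical, and the sketch only generates and decodes the values; it does not write them to Vault.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"encoding/base64"
	"encoding/pem"
	"errors"
	"fmt"
)

// encodedKeyPair holds base64-encoded PEM, matching the private_key/public_key fields.
type encodedKeyPair struct {
	PrivateKey string
	PublicKey  string
}

// generateES256 creates an ECDSA P-256 key pair and encodes both halves.
func generateES256() (encodedKeyPair, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return encodedKeyPair{}, err
	}
	privDER, err := x509.MarshalECPrivateKey(key) // SEC1, as written by "openssl ecparam -genkey"
	if err != nil {
		return encodedKeyPair{}, err
	}
	pubDER, err := x509.MarshalPKIXPublicKey(&key.PublicKey)
	if err != nil {
		return encodedKeyPair{}, err
	}
	privPEM := pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: privDER})
	pubPEM := pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: pubDER})
	return encodedKeyPair{
		PrivateKey: base64.StdEncoding.EncodeToString(privPEM),
		PublicKey:  base64.StdEncoding.EncodeToString(pubPEM),
	}, nil
}

// decodePrivate reverses the encoding, as a reader of the secret would.
func decodePrivate(b64 string) (*ecdsa.PrivateKey, error) {
	pemBytes, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return nil, err
	}
	block, _ := pem.Decode(pemBytes)
	if block == nil {
		return nil, errors.New("no PEM block found")
	}
	return x509.ParseECPrivateKey(block.Bytes)
}

func main() {
	pair, err := generateES256()
	if err != nil {
		fmt.Println("keygen failed:", err)
		return
	}
	key, err := decodePrivate(pair.PrivateKey)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("curve:", key.Curve.Params().Name)
}
```

Generating the key in-process avoids writing the private key to /tmp at all, which removes the need for the shred step.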

Dual-Key Rotation (Zero Downtime)

The SecretProvider.RotateSecret method, defined in services/vault/provider.go, implements atomic dual-key rotation:

t=0      Write new key pair → Vault (version N+1)
t=0      Old key (version N) stays readable for gracePeriod
t+poll   IAM WatchRotation callback fires (within one poll interval, 30 s by default) → begin signing with N+1
t+poll   Gateway receives new public key → validates both N and N+1
t+grace  gracePeriod expires → old key (N) invalidated

The grace period defaults to 24 hours (OIDC_SECRET_ROTATION_GRACE_HOURS = 24). In-flight tokens signed with the old key remain valid until their exp claim or the end of the grace period, whichever comes first.
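On the validating side, the dual-key window can be modelled as a small key ring that accepts the previous version until the grace period runs out. This is a simplified sketch under that assumption, not the Gateway's actual implementation; keyRing and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// defaultGrace mirrors OIDC_SECRET_ROTATION_GRACE_HOURS = 24.
const defaultGrace = 24 * time.Hour

// keyRing tracks which key versions a verifier should accept.
type keyRing struct {
	current      int
	previous     int       // 0 means no previous key
	previousEnds time.Time // moment the previous key stops being accepted
	grace        time.Duration
}

func newKeyRing(version int, grace time.Duration) *keyRing {
	return &keyRing{current: version, grace: grace}
}

// rotate makes newVersion current and keeps the old version for the grace period.
func (r *keyRing) rotate(newVersion int, now time.Time) {
	r.previous = r.current
	r.previousEnds = now.Add(r.grace)
	r.current = newVersion
}

// accepted lists the versions whose signatures should verify at time now.
func (r *keyRing) accepted(now time.Time) []int {
	if r.previous != 0 && now.Before(r.previousEnds) {
		return []int{r.previous, r.current}
	}
	return []int{r.current}
}

func main() {
	rotatedAt := time.Date(2026, time.January, 1, 2, 0, 0, 0, time.UTC) // example timestamp
	ring := newKeyRing(1, defaultGrace)
	ring.rotate(2, rotatedAt)

	fmt.Println(ring.accepted(rotatedAt.Add(time.Hour)))    // inside the grace period
	fmt.Println(ring.accepted(rotatedAt.Add(25 * time.Hour))) // after the grace period
}
```

Keeping only one previous version is a deliberate simplification: a second rotation inside the grace period would drop the oldest key immediately.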

Automated 90-Day Rotation

Schedule a CronJob (Kubernetes) or cron task (Docker Compose host) to rotate the JWT signing key every 90 days:

# kubernetes/cronjob-jwt-rotation.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: jwt-key-rotation
  namespace: platform-infra
spec:
  schedule: "0 2 1 */3 *"    # 02:00 UTC on the 1st of every third month (about every 90 days)
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: rotate
              image: hashicorp/vault:1.19
              command:
                - /bin/sh
                - -c
                - |
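                  # Abort on the first failure so a failed keygen never writes empty values.
                  set -eu
                  # Generate new ES256 key pair (assumes openssl is available in the image;
                  # the stock Vault image may not ship it)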
                  openssl ecparam -name prime256v1 -genkey -noout -out /tmp/key.pem
                  openssl ec -in /tmp/key.pem -pubout -out /tmp/pub.pem

                  # Atomic KV v2 write (creates new version)
                  vault kv put secret/platform/iam/jwt-keys \
                    private_key="$(cat /tmp/key.pem | base64)" \
                    public_key="$(cat /tmp/pub.pem | base64)"

                  shred -u /tmp/key.pem /tmp/pub.pem
              env:
                - name: VAULT_ADDR
                  value: https://vault.platform-infra.svc.cluster.local:8200
                - name: VAULT_TOKEN
                  valueFrom:
                    secretKeyRef:
                      name: vault-rotation-token
                      key: token

IAM's WatchRotation callback polls Vault every VAULT_WATCH_INTERVAL_SEC seconds (default: 30). After a new version is written, IAM reloads the signing key without a restart:

// services/vault/vault_provider.go — WatchRotation flow
// Poll interval: VAULT_WATCH_INTERVAL_SEC (default 30s)
// On version change: invoke all registered callbacks with newData
// IAM callback: reload JWT private key in-memory
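The flow described in those comments can be sketched as a version-comparing poller. This is an assumption-laden illustration rather than the contents of vault_provider.go: watcher, pollOnce, and fetchFunc are hypothetical names, and the fetch function stands in for a KV v2 read that returns the current version number and data.

```go
package main

import "fmt"

// secretData is the key/value payload of one secret version.
type secretData map[string]string

// fetchFunc stands in for a Vault read returning the latest version and its data.
type fetchFunc func() (version int, data secretData, err error)

// watcher remembers the last seen version and the callbacks to notify.
type watcher struct {
	lastVersion int
	callbacks   []func(secretData)
}

func newWatcher(initialVersion int) *watcher {
	return &watcher{lastVersion: initialVersion}
}

// onRotation registers a callback, for example IAM's in-memory key reload.
func (w *watcher) onRotation(cb func(secretData)) {
	w.callbacks = append(w.callbacks, cb)
}

// pollOnce runs one poll cycle and reports whether callbacks were invoked.
// A real loop would call it from a time.Ticker every VAULT_WATCH_INTERVAL_SEC.
func (w *watcher) pollOnce(fetch fetchFunc) (bool, error) {
	version, data, err := fetch()
	if err != nil {
		return false, err // keep the current key; retry on the next tick
	}
	if version == w.lastVersion {
		return false, nil
	}
	w.lastVersion = version
	for _, cb := range w.callbacks {
		cb(data)
	}
	return true, nil
}

func main() {
	w := newWatcher(1)
	w.onRotation(func(d secretData) {
		fmt.Println("reloading signing key, fields:", len(d))
	})

	// Simulated Vault responses: unchanged, then rotated to version 2.
	responses := []int{1, 2}
	for _, v := range responses {
		version := v
		changed, _ := w.pollOnce(func() (int, secretData, error) {
			return version, secretData{"private_key": "...", "public_key": "..."}, nil
		})
		fmt.Println("version", version, "changed:", changed)
	}
}
```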

AppRole Authentication (Production)

In production, services must NOT use the root token. Configure AppRole:

# Enable AppRole auth
vault auth enable approle

# Create IAM policy
vault policy write platform-iam - <<EOF
path "secret/data/platform/iam/*" {
  capabilities = ["read"]
}
path "secret/metadata/platform/iam/*" {
  capabilities = ["read", "list"]
}
EOF

# Create AppRole for IAM
vault write auth/approle/role/platform-iam \
  token_policies="platform-iam" \
  token_ttl=1h \
  token_max_ttl=4h \
  secret_id_ttl=0

# Get RoleID (store in ConfigMap)
vault read auth/approle/role/platform-iam/role-id

# Get SecretID (store in Kubernetes Secret)
vault write -f auth/approle/role/platform-iam/secret-id

Services authenticate at startup:

vault write auth/approle/login \
  role_id=<ROLE_ID> \
  secret_id=<SECRET_ID>
# → Returns a service token (TTL 1h, renewable up to the 4h token_max_ttl)
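In a Go service the same login is a POST to /v1/auth/approle/login with a JSON body, and the token is returned under auth.client_token. The sketch below builds the request and parses a response without sending anything. The function names are illustrative, and the sample response is trimmed to the fields used here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// loginResponse holds the subset of the AppRole login response used by a client.
type loginResponse struct {
	Auth struct {
		ClientToken   string `json:"client_token"`
		LeaseDuration int    `json:"lease_duration"` // seconds
		Renewable     bool   `json:"renewable"`
	} `json:"auth"`
}

// newLoginRequest builds the AppRole login request for the given Vault address.
func newLoginRequest(vaultAddr, roleID, secretID string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"role_id": roleID, "secret_id": secretID})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, vaultAddr+"/v1/auth/approle/login", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// parseLogin extracts the service token and its TTL from a login response body.
func parseLogin(body []byte) (token string, ttlSeconds int, err error) {
	var resp loginResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", 0, err
	}
	if resp.Auth.ClientToken == "" {
		return "", 0, errors.New("login response contains no client_token")
	}
	return resp.Auth.ClientToken, resp.Auth.LeaseDuration, nil
}

func main() {
	req, err := newLoginRequest("https://vault.example.internal:8200", "<ROLE_ID>", "<SECRET_ID>")
	if err != nil {
		fmt.Println("request error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())

	sample := []byte(`{"auth":{"client_token":"hvs.example","lease_duration":3600,"renewable":true}}`)
	token, ttl, err := parseLogin(sample)
	fmt.Println(token, ttl, err)
}
```

The host name and token value are placeholders. A service would send the request with an http.Client that trusts the Vault CA and renew the token before the TTL elapses.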

Vault Policies per Service

| Service | Policy path | Capabilities |
|---|---|---|
| platform-iam | secret/data/platform/iam/* | read |
| platform-domain-resolver | secret/data/platform/domains/* | read, create, update, delete |
| platform-services | database/creds/platform-db-role | read |
| jwt-rotation-job | secret/data/platform/iam/jwt-keys | create, update |

TLS Certificate Storage

Domain Resolver stores per-domain TLS certificates (issued by Certbot v4.0.0 via Let's Encrypt) in Vault:

# Store certificate after ACME issuance
vault kv put secret/platform/domains/example.com/cert \
  fullchain="$(cat /etc/letsencrypt/live/example.com/fullchain.pem)" \
  privkey="$(cat /etc/letsencrypt/live/example.com/privkey.pem)" \
  expires_at="2026-07-22T00:00:00Z"

Domain Resolver renews certificates DOMAINS_SSL_RENEW_DAYS_BEFORE (30 days) before expiry and writes new versions atomically. Old versions are retained for the grace period.
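The renewal decision reduces to date arithmetic on the stored expires_at value. The sketch below assumes expires_at is an RFC 3339 timestamp, as in the example above, and uses hypothetical function names. It shows only the threshold check, not the ACME renewal itself.

```go
package main

import (
	"fmt"
	"time"
)

// renewAt returns the moment from which a certificate should be renewed:
// daysBefore days ahead of its expiry.
func renewAt(expiresAt string, daysBefore int) (time.Time, error) {
	exp, err := time.Parse(time.RFC3339, expiresAt)
	if err != nil {
		return time.Time{}, err
	}
	return exp.AddDate(0, 0, -daysBefore), nil
}

// needsRenewal reports whether now has reached the renewal threshold.
func needsRenewal(expiresAt string, now time.Time, daysBefore int) (bool, error) {
	threshold, err := renewAt(expiresAt, daysBefore)
	if err != nil {
		return false, err
	}
	return !now.Before(threshold), nil
}

func main() {
	const expiresAt = "2026-07-22T00:00:00Z" // value from the example above
	const daysBefore = 30                    // DOMAINS_SSL_RENEW_DAYS_BEFORE

	threshold, err := renewAt(expiresAt, daysBefore)
	if err != nil {
		fmt.Println("bad expires_at:", err)
		return
	}
	fmt.Println("renew from:", threshold.Format(time.RFC3339))

	now := time.Date(2026, time.July, 1, 0, 0, 0, 0, time.UTC) // example "current" time
	due, _ := needsRenewal(expiresAt, now, daysBefore)
	fmt.Println("due on 2026-07-01:", due)
}
```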


Vault Environment Variables

| Variable | Default | Description |
|---|---|---|
| VAULT_ADDR | http://vault:8200 | Vault HTTP/HTTPS address. |
| VAULT_TOKEN | (required) | Root token (dev only) or AppRole service token. |
| VAULT_INIT_TIMEOUT_SEC | 120 | Max seconds to wait for Vault at startup. |
| VAULT_RETRY_INITIAL_MS | 250 | Initial exponential backoff (ms). |
| VAULT_RETRY_MAX_MS | 16000 | Max exponential backoff (ms). |
| VAULT_WATCH_INTERVAL_SEC | 30 | JWT key rotation poll interval (seconds). |
Health and Status

# Cluster status (Raft)
vault operator raft list-peers

# Seal status
curl -s "$VAULT_ADDR/v1/sys/health" | python3 -m json.tool
# initialized: true, sealed: false, standby: false

# List KV versions for a secret
vault kv metadata get secret/platform/iam/jwt-keys

# Read current JWT keys (requires policy)
vault kv get -format=json secret/platform/iam/jwt-keys
