
Testing Strategy

Platform Kernel enforces an 8-level testing pyramid. Each level has a distinct frequency, toolchain, and acceptance threshold. No level can be skipped: the CI Gate (gate job in ci.yml) blocks merge until all required levels are green.

Pyramid Overview


Level 1 — Unit Tests

Go Services

# Run across all 12 services (as in ci.yml)
for svc in services/audit services/billing services/data-layer \
           services/domain-resolver services/event-bus services/files \
           services/gateway services/iam services/integration-hub \
           services/module-registry services/money services/notify; do
  # The subshell returns to the repo root after each service,
  # so the next relative path still resolves.
  ( cd "$svc" && go test ./... -race -count=1 \
      -coverprofile="/tmp/cover-$(basename "$svc").out" ) || exit 1
done
| Flag | Purpose |
|---|---|
| -race | Detect data races (Go race detector) |
| -count=1 | Disable test result cache — always rerun |
| -coverprofile | Per-service coverage artifact (uploaded to GitHub Actions) |

Coverage target: ≥ 80% per service (enforced by CI).

TypeScript Packages

pnpm vitest run --coverage \
--exclude '**/e2e/**' \
--exclude '**/*.browser.test.tsx'

Browser-mode tests (*.browser.test.tsx) require Playwright Chromium and run in a separate step:

pnpm --filter @platform/sdk-ui exec playwright install chromium --with-deps

Logger Pattern

All Go services use log/slog (stdlib) with JSON handler in production:

// Pattern used across all services (services/iam, services/vault, etc.)
logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
	Level: slog.LevelInfo,
}))
slog.SetDefault(logger)

Tests use a discarding no-op logger to suppress output:

func noopLogger() *slog.Logger {
	return slog.New(slog.NewTextHandler(io.Discard, nil))
}

Level 2 — Integration Tests

Integration tests run against real infrastructure via Testcontainers. No mocks for databases, message brokers, or secret stores.

Infrastructure Stack

Tag Convention

All integration tests are gated by a build tag to prevent accidental execution during unit test runs:

//go:build integration

package integration_test

Run command: go test -tags=integration ./tests/integration/... -timeout=120s

IAM Integration Test Structure

The IAM service (services/iam/tests/integration/) is the reference implementation for integration test patterns across all services:

| Test file | What it tests |
|---|---|
| user_service_test.go | User CRUD, soft delete, restore — real PostgreSQL |
| oauthapp_service_test.go | OAuth app lifecycle, token rotation |
| handler_routes_test.go | HTTP handler routes with full middleware stack |
| block44_final_verification_test.go | Tenant lifecycle state machine, idempotency |

Connection Configuration

# Injected via env in CI (ci.yml)
TEST_POSTGRES_URL=postgres://kernel:kernel_dev_password@localhost:5432/platform_kernel
TEST_KAFKA_BROKERS=localhost:9092
TEST_VALKEY_ADDR=localhost:6379
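A test suite needs to read these variables and fail clearly when one is missing. The sketch below is one possible way to do that, not the repository's actual helper. `testInfra` and `loadTestInfra` are hypothetical names, and the comma-separated broker list is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// testInfra holds the connection settings for the integration suite.
// Hypothetical type, introduced for illustration.
type testInfra struct {
	PostgresURL  string
	KafkaBrokers []string
	ValkeyAddr   string
}

// loadTestInfra reads the three TEST_* variables through lookup
// (os.LookupEnv in real use) and reports every missing name at once.
func loadTestInfra(lookup func(string) (string, bool)) (testInfra, error) {
	var missing []string
	get := func(key string) string {
		v, ok := lookup(key)
		if !ok || v == "" {
			missing = append(missing, key)
		}
		return v
	}

	cfg := testInfra{
		PostgresURL: get("TEST_POSTGRES_URL"),
		ValkeyAddr:  get("TEST_VALKEY_ADDR"),
	}
	// Assumption: multiple brokers are separated by commas.
	if brokers := get("TEST_KAFKA_BROKERS"); brokers != "" {
		cfg.KafkaBrokers = strings.Split(brokers, ",")
	}

	if len(missing) > 0 {
		return testInfra{}, fmt.Errorf("missing env vars: %s", strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	cfg, err := loadTestInfra(os.LookupEnv)
	if err != nil {
		fmt.Println("integration tests cannot start:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Taking the lookup function as a parameter lets the helper itself be unit-tested without touching the process environment.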

Level 3 — Contract Tests (Pact)

Pact v4 (Go SDK) verifies that the Gateway (consumer) and downstream services (providers) agree on the same API contract, independent of integration tests.

Consumer–Provider Map

CI Flow

Contract Location

services/gateway/tests/contract/pacts/
├── Gateway-IAM.json
├── Gateway-Billing.json
└── Gateway-DataLayer.json

Contracts are uploaded as GitHub Actions artifacts (90-day retention) after each successful consumer test run.
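A pact file is plain JSON that names the consumer, the provider, and the interactions they agree on. As a rough illustration of what the artifacts contain, the sketch below decodes only those top-level fields using the standard library. The sample document and the `summarizePact` helper are invented for the example; real contracts carry full request and response definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pactFile models only the top-level fields of a pact document.
type pactFile struct {
	Consumer struct {
		Name string `json:"name"`
	} `json:"consumer"`
	Provider struct {
		Name string `json:"name"`
	} `json:"provider"`
	Interactions []struct {
		Description string `json:"description"`
	} `json:"interactions"`
}

// summarizePact is a hypothetical helper that returns a one-line summary
// of a contract, e.g. for a CI log.
func summarizePact(raw []byte) (string, error) {
	var p pactFile
	if err := json.Unmarshal(raw, &p); err != nil {
		return "", fmt.Errorf("decode pact: %w", err)
	}
	if p.Consumer.Name == "" || p.Provider.Name == "" {
		return "", fmt.Errorf("pact is missing consumer or provider name")
	}
	return fmt.Sprintf("%s -> %s: %d interaction(s)",
		p.Consumer.Name, p.Provider.Name, len(p.Interactions)), nil
}

func main() {
	// Invented sample; not the contents of Gateway-IAM.json.
	sample := []byte(`{
		"consumer": {"name": "Gateway"},
		"provider": {"name": "IAM"},
		"interactions": [{"description": "a request for an existing user"}]
	}`)
	summary, err := summarizePact(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(summary)
}
```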

Pact Broker

Self-hosted Pact Broker runs via docker-compose.pact.yml:

# docker/docker-compose.pact.yml (excerpt)
pact-broker:
  image: pactfoundation/pact-broker:latest
  ports:
    - "9292:9292"
  environment:
    PACT_BROKER_DATABASE_URL: "postgres://pact:pact@postgres:5432/pact"

Access: http://localhost:9292 · credentials: pact / pact_dev


Level 4 — Load Tests (k6)

Load tests run only on push to main via a self-hosted VDS runner tagged [self-hosted, linux, load-test].

Acceptance Criteria

| Metric | Target |
|---|---|
| Sustained RPS | 10,000 |
| p99 latency | < 200 ms |
| Error rate | 0% |
| Traffic mix | 80% GET / 20% POST |

Script Location

scripts/load-test/gateway-load.js

k6 Invocation (ci.yml)

k6 run \
--env BASE_URL="$BASE_URL" \
--env JWT_TOKEN="$STAGING_JWT" \
--out json=k6-results.json \
scripts/load-test/gateway-load.js

Results are uploaded as GitHub Actions artifacts (90-day retention) for historical trend analysis.


Level 5 — Chaos Engineering (Litmus)

Chaos tests run monthly against the staging environment.

| Experiment | Target | Expected outcome |
|---|---|---|
| Pod kill (IAM) | IAM pod | Gateway circuit breaker opens, 503 < 30s, auto-recovery |
| Network partition | Kafka broker | Consumer group rebalances (Cooperative Sticky), no loss |
| Disk full (ClickHouse) | ClickHouse data volume | CDC pipeline pauses, WAL slot bloat alert fires |
| Vault seal | Vault | Services enter degraded mode (cached DEKs), alert fires |
| PostgreSQL kill | Primary PG pod | Data Layer 503, Readiness probe fails → pod exits LB |

Litmus Workflow


Level 6 — Fuzz Tests (AFL)

Fuzz targets run monthly on the CI self-hosted runner.

| Target | Input |
|---|---|
| Protobuf parser | Malformed proto binary blobs |
| JSON deserializer | Malformed JSON payloads for all gRPC requests |
| JWT parser | Malformed JWT tokens (header + payload + signature) |
| OpenAPI validator | Malformed HTTP request bodies |

Go targets use the standard library's native fuzzing (go test -fuzz). AFL is used for boundary testing of C libraries (libvips in the files service).
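As a minimal sketch of what a native Go fuzz target for the JWT parser row might look like, the example below fuzzes a stand-in function. `splitJWT` is hypothetical and only checks the three-segment shape; the real parser and its invariants are not shown in this document. Everything sits in one file for brevity. In a real package the `Fuzz...` function would live in a `_test.go` file, since `go test -fuzz` only discovers targets there.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"testing"
)

// splitJWT is a hypothetical stand-in for the real parser: it accepts only
// tokens with three dot-separated segments and a non-empty header and payload.
func splitJWT(token string) (header, payload, signature string, err error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", "", "", errors.New("jwt must have three dot-separated segments")
	}
	if parts[0] == "" || parts[1] == "" {
		return "", "", "", errors.New("jwt header and payload must not be empty")
	}
	return parts[0], parts[1], parts[2], nil
}

// FuzzSplitJWT checks that the parser never panics and that any accepted
// token can be reassembled into the original input.
// Run with: go test -fuzz=FuzzSplitJWT
func FuzzSplitJWT(f *testing.F) {
	// Seed corpus: one well-formed shape and a few malformed ones.
	f.Add("aaa.bbb.ccc")
	f.Add("")
	f.Add("a..b")
	f.Add("a.b.c.d")

	f.Fuzz(func(t *testing.T, token string) {
		h, p, s, err := splitJWT(token)
		if err != nil {
			return // rejecting malformed input is the expected outcome
		}
		if got := h + "." + p + "." + s; got != token {
			t.Fatalf("round trip mismatch: %q != %q", got, token)
		}
	})
}

func main() {
	_, _, _, err := splitJWT("not-a-jwt")
	fmt.Println("malformed token rejected:", err != nil)
}
```

Without the `-fuzz` flag, `go test` runs the target once per seed, so the seed corpus doubles as a regression suite.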


Level 7 — Security Tests

SAST / SCA — Every PR

| Tool | What it scans |
|---|---|
| Snyk | Go dependency CVEs (SCA), Docker image CVEs |
| Semgrep | Go source code patterns (SAST), hardcoded secrets |
| golangci-lint | gosec linter — SQL injection, path traversal, weak crypto |

Defined in .github/workflows/security-scan.yml.

Gate: 0 Critical, 0 High vulnerabilities required for merge to main.

DAST — Quarterly

| Tool | Target |
|---|---|
| OWASP ZAP | API Gateway (Envoy) — active scan against staging |
| Burp Suite | Manual penetration test (quarterly, by security team) |

OWASP ZAP runs in API scan mode against the staging URL, authenticating with the staging JWT.


Level 8 — Smoke Tests

Smoke tests run:

  • Every PR in CI (after Docker Build) — scripts/smoke-test.sh
  • Every minute in production — Kubernetes livenessProbe + external uptime monitor

Health Endpoint Contract

Defined in services/iam/api/openapi.yaml and implemented identically across all 12 services:

| Endpoint | Probe type | Returns |
|---|---|---|
| GET /health/live | Kubernetes livenessProbe | 200 {"status":"alive"} always if process alive |
| GET /health/ready | Kubernetes readinessProbe | 200 {"status":"ready"} if PG up; 503 if not |
| GET /health | Full report | JSON with status, service, version, uptime_seconds, checks |
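The liveness and readiness rows of this contract can be sketched with the standard library alone. This is an illustration under stated assumptions, not the services' actual handler code: `newHealthMux` and the `pgUp` check are hypothetical, and the "not_ready" body returned with the 503 is an assumption, since the contract only specifies the status code for that case.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeStatus sends {"status": "<status>"} with the given HTTP code.
func writeStatus(w http.ResponseWriter, code int, status string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	_ = json.NewEncoder(w).Encode(map[string]string{"status": status})
}

// newHealthMux wires the two probe endpoints. pgUp is a hypothetical
// dependency check, e.g. a PostgreSQL ping with a short timeout.
func newHealthMux(pgUp func() bool) *http.ServeMux {
	mux := http.NewServeMux()
	// Liveness: succeeds whenever the process can serve requests at all.
	mux.HandleFunc("/health/live", func(w http.ResponseWriter, r *http.Request) {
		writeStatus(w, http.StatusOK, "alive")
	})
	// Readiness: 503 takes the pod out of the load balancer while PG is down.
	mux.HandleFunc("/health/ready", func(w http.ResponseWriter, r *http.Request) {
		if !pgUp() {
			writeStatus(w, http.StatusServiceUnavailable, "not_ready") // body is an assumption
			return
		}
		writeStatus(w, http.StatusOK, "ready")
	})
	return mux
}

// probe issues an in-memory GET and returns the response status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	down := newHealthMux(func() bool { return false })
	fmt.Println("live:", probe(down, "/health/live"), "ready:", probe(down, "/health/ready"))
}
```

Keeping liveness independent of PostgreSQL means a database outage removes pods from rotation without triggering container restarts.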

smoke-test.sh Target List

The smoke test iterates over all 12 service /health/live endpoints to verify every container started successfully after a Docker build:

#!/usr/bin/env bash
SERVICES=(iam gateway data-layer event-bus notify files money
audit module-registry billing integration-hub domain-resolver)
for svc in "${SERVICES[@]}"; do
curl -sf "http://${svc}:8080/health/live" \
|| { echo "❌ ${svc} health check failed"; exit 1; }
done
echo "✅ All smoke tests passed"

CI Pipeline — Complete Sequence


See Also