Authentication & Authorization

The Gateway enforces authentication and authorization on every request. No request reaches an upstream service without a verified JWT and a successful permission check. These two checks are handled by the internal/middleware and internal/permission packages inside the Go Gateway Service.

JWT Verification

The Gateway verifies every JWT with ES256 (ECDSA over the P-256 curve with SHA-256). The IAM Service signs all tokens with the platform's private key. The Gateway holds only the corresponding public key (read-only), so it can verify tokens but cannot issue them.

Incoming request:
  Authorization: Bearer eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9...

Gateway:
  1. Extract token from Authorization header
  2. Verify ES256 signature with platform public key
  3. Check exp claim (tokens are valid for 15 minutes)
  4. Extract claims → userId, tenantId, roles[]
  5. Pass claims to permission resolution
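
Steps 1 to 4 can be sketched with the Go standard library alone. This is a minimal illustration, not the Gateway's actual middleware: the names `VerifyES256`, `Claims`, `newP256Key`, and `signES256` are hypothetical, `signES256` only stands in for the IAM Service so the example can round-trip, and `iss`/`aud` validation is left out because the steps above do not list it.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"math/big"
	"strings"
)

// Claims holds the fields extracted in step 4 (hypothetical type).
type Claims struct {
	Sub      string   `json:"sub"`
	TenantID string   `json:"tenantId"`
	Roles    []string `json:"roles"`
	Exp      int64    `json:"exp"`
}

var b64 = base64.RawURLEncoding // JWTs use unpadded Base64url

// VerifyES256 extracts the bearer token, verifies its ES256 signature, and checks exp.
func VerifyES256(authHeader string, pub *ecdsa.PublicKey, nowUnix int64) (*Claims, error) {
	if !strings.HasPrefix(authHeader, "Bearer ") {
		return nil, errors.New("missing bearer token")
	}
	parts := strings.Split(strings.TrimPrefix(authHeader, "Bearer "), ".")
	if len(parts) != 3 {
		return nil, errors.New("malformed token")
	}
	head, err := b64.DecodeString(parts[0])
	if err != nil {
		return nil, errors.New("malformed header")
	}
	var hdr struct {
		Alg string `json:"alg"`
	}
	// Pin the algorithm instead of letting the token choose the verifier.
	if json.Unmarshal(head, &hdr) != nil || hdr.Alg != "ES256" {
		return nil, errors.New("unsupported alg")
	}
	sig, err := b64.DecodeString(parts[2])
	if err != nil || len(sig) != 64 { // ES256 signature is r||s, 32 bytes each
		return nil, errors.New("malformed signature")
	}
	digest := sha256.Sum256([]byte(parts[0] + "." + parts[1]))
	r, s := new(big.Int).SetBytes(sig[:32]), new(big.Int).SetBytes(sig[32:])
	if !ecdsa.Verify(pub, digest[:], r, s) {
		return nil, errors.New("invalid signature")
	}
	body, err := b64.DecodeString(parts[1])
	if err != nil {
		return nil, errors.New("malformed payload")
	}
	var c Claims
	if err := json.Unmarshal(body, &c); err != nil {
		return nil, errors.New("malformed claims")
	}
	if nowUnix >= c.Exp {
		return nil, errors.New("token expired")
	}
	return &c, nil
}

// newP256Key generates a throwaway key pair for the example.
func newP256Key() *ecdsa.PrivateKey {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	return key
}

// signES256 is a stand-in for the IAM Service signer, used only to exercise VerifyES256.
func signES256(priv *ecdsa.PrivateKey, c Claims) string {
	payload, _ := json.Marshal(c)
	input := b64.EncodeToString([]byte(`{"alg":"ES256","typ":"JWT"}`)) + "." + b64.EncodeToString(payload)
	digest := sha256.Sum256([]byte(input))
	r, s, err := ecdsa.Sign(rand.Reader, priv, digest[:])
	if err != nil {
		panic(err)
	}
	sig := make([]byte, 64)
	r.FillBytes(sig[:32])
	s.FillBytes(sig[32:])
	return input + "." + b64.EncodeToString(sig)
}
```

The current time is passed in as Unix seconds so that the expiry check is easy to test. A production middleware would more likely rely on a maintained JWT library than on hand-rolled parsing.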

JWT Claims

Claim	Type	Description
`sub`	`string` (ULID)	User ID — forwarded as `X-User-ID` header to upstream
`email`	`string`	User email address
`tenantId`	`string` (ULID)	Tenant ID — forwarded as `X-Tenant-ID` header to upstream
`roles`	`string[]`	Role names assigned to the user
`iat`	`number`	Issued at (Unix timestamp)
`exp`	`number`	Expiry (Unix timestamp — issued at + 15 minutes)
`iss`	`string`	Issuer (OIDC issuer URL of the platform)
`aud`	`string`	Audience (`platform-kernel` for internal requests; `client_id` for OIDC app tokens)
`custom claims`	`object`	Namespace-prefixed custom data from modules

Permissions are NOT stored in the JWT. With hundreds of atomic permissions from dozens of modules, a token that carried them would grow to 5–10 KB, which exceeds the 4 KB cookie limit and adds bandwidth to every request. Only roles[] is stored; permissions are resolved server-side on each request.

Maximum JWT size: 8 KB (NGINX header limit). Custom claims must respect this limit. PII such as full phone numbers or IP addresses is forbidden in custom claims: a JWT is only Base64url-encoded, not encrypted, so anyone holding the token can read it without a key.

Permission Resolution

Roles stored in the JWT are resolved to permissions server-side on every request. The resolution uses a two-level cache to minimize latency:

Request arrives with roles: ['admin', 'billing-viewer']

Level 1: Check Valkey permission_version:{tenantId}
  → Compare with cached version
  → Version match → Cache HIT
     → Use cached permissions[] (Valkey key: permissions:{tenantId}:{sorted_roles_hash}, TTL 5 min)
  → Version mismatch or MISS
     → gRPC call to IAM: resolve roles → permissions[]
     → Write to Valkey with current version + TTL 5 min

Level 2 (fallback — IAM + Valkey both down):
  → In-process LRU cache (GATEWAY_LOCAL_PERMISSION_CACHE_TTL_SEC=60)
  → Serve stale permissions + inject X-Permissions-Stale: true header
  → Cold start + IAM down → 503 Service Unavailable (fail-closed)

Cache parameter	Value
Valkey key (permissions)	`permissions:{tenantId}:{sorted_roles_hash}`
Valkey TTL	5 minutes
Local in-process cache TTL	60 seconds (`GATEWAY_LOCAL_PERMISSION_CACHE_TTL_SEC`)
Platform Owner wildcard	Role `*` → permission check bypassed (no Valkey lookup)
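
One way to derive the `{sorted_roles_hash}` part of the key is sketched below. The source does not specify the hash function, the separator, or the digest length, so SHA-256, a NUL separator, and an 8-byte truncation are assumptions, as are the function names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
	"strings"
)

// PermissionsKey builds permissions:{tenantId}:{sorted_roles_hash}.
// Sorting first makes the key independent of the order of roles[] in the JWT.
func PermissionsKey(tenantID string, roles []string) string {
	sorted := append([]string(nil), roles...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	sum := sha256.Sum256([]byte(strings.Join(sorted, "\x00")))
	return "permissions:" + tenantID + ":" + hex.EncodeToString(sum[:8])
}

// IsPlatformOwner reports whether the wildcard role is present,
// in which case the permission check is bypassed without a Valkey lookup.
func IsPlatformOwner(roles []string) bool {
	for _, r := range roles {
		if r == "*" {
			return true
		}
	}
	return false
}
```

Because the roles are sorted before hashing, `['admin', 'billing-viewer']` and `['billing-viewer', 'admin']` share one cache entry.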

Permission Check Result

✅ Permission found → forward request to upstream service
❌ Permission missing → 403 Forbidden (RFC 9457)

{
  "type":   "https://api.septemcore.com/problems/forbidden",
  "title":  "Forbidden",
  "status": 403,
  "detail": "Required permission 'billing.plan.change' not found for roles ['billing-viewer'].",
  "traceId": "01j9ptr0000000000000002"
}
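
A minimal sketch of how such a response could be produced in Go. The `Problem` struct mirrors the JSON above; the function names are hypothetical, and the `application/problem+json` media type comes from RFC 9457 rather than from this document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// Problem mirrors the RFC 9457 body shown above.
type Problem struct {
	Type    string `json:"type"`
	Title   string `json:"title"`
	Status  int    `json:"status"`
	Detail  string `json:"detail"`
	TraceID string `json:"traceId"`
}

// hasPermission reports whether the resolved permissions contain the required one.
func hasPermission(granted []string, required string) bool {
	for _, p := range granted {
		if p == required {
			return true
		}
	}
	return false
}

// forbiddenProblem builds the 403 body for a missing permission.
func forbiddenProblem(required string, roles []string, traceID string) Problem {
	quoted := make([]string, len(roles))
	for i, r := range roles {
		quoted[i] = "'" + r + "'"
	}
	return Problem{
		Type:   "https://api.septemcore.com/problems/forbidden",
		Title:  "Forbidden",
		Status: http.StatusForbidden,
		Detail: fmt.Sprintf("Required permission '%s' not found for roles [%s].",
			required, strings.Join(quoted, ", ")),
		TraceID: traceID,
	}
}

// writeProblem sends a problem document with the matching status code.
func writeProblem(w http.ResponseWriter, p Problem) {
	w.Header().Set("Content-Type", "application/problem+json")
	w.WriteHeader(p.Status)
	_ = json.NewEncoder(w).Encode(p)
}
```

A handler would call `hasPermission` first and fall through to `writeProblem(w, forbiddenProblem(...))` only when the check fails.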

Cache Invalidation: Dual Channel

When a role or permission changes in the IAM Service, the permission cache must be invalidated immediately. The Gateway uses two independent channels to guarantee invalidation even during outages:

Channel	Mechanism	Dependency
Primary (async)	Event Bus: `auth.role.changed` → Gateway invalidates Valkey cache	Kafka must be available
Fallback (sync)	Version counter: on every role change, IAM runs `INCR permission_version:{tenantId}` in Valkey. The Gateway compares the version on every request (~0.1 ms); a mismatch forces a reload.	Kafka-independent

The version counter adds one Valkey GET (~0.1 ms) per request. This cost is justified: it ensures that even a complete Kafka outage does not delay permission revocation.

For bulk role assignments, IAM publishes a single `auth.roles.bulk_changed` event. The Gateway debounces these events in a 50 ms window to prevent a thundering herd (1 000 invalidations → 1 batch reload).

Forwarded Headers

After JWT verification, the Gateway injects claims into upstream request headers. Upstream services must read these headers — they must not re-verify the JWT:

Header	Source	Example
`X-User-ID`	`sub` claim	`01j9pa5mz700000000000000`
`X-Tenant-ID`	`tenantId` claim	`01j9ten0000000000000000`
`X-Request-ID`	Generated by RequestID middleware (ULID)	`01j9ptr0000000000000003`
`X-Permissions-Stale`	Injected when serving from stale local cache	`true`

Upstream services trust these headers unconditionally because the Gateway is the single verification point. Re-verifying the JWT in an upstream service is an anti-pattern in this architecture.

JWT Refresh Token Race Protection

When a user has multiple browser tabs open and the access token expires, all tabs simultaneously call POST /auth/refresh with the same refresh token. Without protection, only the first call succeeds; the others present a refresh token that has already been consumed and receive 401.

The platform protects against this with a grace window on the refresh token:

Parameter	Value
Refresh grace window	10 seconds after first use (`AUTH_REFRESH_GRACE_WINDOW_SEC=10`)
Behavior in window	Same refresh token → same new token pair (idempotent)
After window	Refresh token invalidated — next use returns `401`
SDK coordination	`BroadcastChannel` API: one in-flight refresh per origin, result broadcast to all tabs

B2B2B Delegation Middleware

The platform supports a three-level tenant hierarchy: Platform Owner → Partner → Client.

A Partner-level user acting on behalf of a Client tenant must be authenticated in their own tenant but authorized to access the Client tenant's resources. The delegation middleware validates this:

Partner user (tenantId: partner-01j...) makes request to Client tenant:
  Header: X-Delegate-To-Tenant: client-01j...
  JWT:    tenantId=partner-01j..., roles=['partner-admin']

Gateway delegation middleware:
  1. Detect X-Delegate-To-Tenant header
  2. gRPC: TenantHierarchyService.IsDescendant(
         ancestor: partner-01j...,
         descendant: client-01j...
     )
     ┌─ Not a descendant → 403 Forbidden (partner cannot access this client)
     └─ Is a descendant:
         Forward X-Tenant-ID: client-01j...   ← effective tenant
         Forward X-User-ID:   partner-user-id ← original user
         Forward X-Delegated-By: partner-01j... ← audit trail
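
The decision made by the delegation middleware can be sketched as a pure function. `HierarchyChecker` abstracts the `TenantHierarchyService.IsDescendant` gRPC call, `parentMap` is a test double that walks parent links, and the remaining names are hypothetical.

```go
package main

import "errors"

var ErrDelegationForbidden = errors.New("403: partner cannot access this client")

// HierarchyChecker abstracts TenantHierarchyService.IsDescendant.
type HierarchyChecker interface {
	IsDescendant(ancestor, descendant string) (bool, error)
}

// parentMap is a test double mapping each tenant to its parent.
type parentMap map[string]string

func (p parentMap) IsDescendant(ancestor, descendant string) (bool, error) {
	for cur, ok := p[descendant]; ok; cur, ok = p[cur] {
		if cur == ancestor {
			return true, nil
		}
	}
	return false, nil
}

// UpstreamIdentity is what the Gateway forwards after the delegation check.
type UpstreamIdentity struct {
	TenantID    string // X-Tenant-ID (effective tenant)
	UserID      string // X-User-ID (original user)
	DelegatedBy string // X-Delegated-By (empty when no delegation happened)
}

// ResolveDelegation applies the X-Delegate-To-Tenant header to the JWT identity.
func ResolveDelegation(h HierarchyChecker, userID, jwtTenant, delegateTo string) (UpstreamIdentity, error) {
	if delegateTo == "" {
		return UpstreamIdentity{TenantID: jwtTenant, UserID: userID}, nil
	}
	ok, err := h.IsDescendant(jwtTenant, delegateTo)
	if err != nil {
		return UpstreamIdentity{}, err // hierarchy service unavailable: do not guess
	}
	if !ok {
		return UpstreamIdentity{}, ErrDelegationForbidden
	}
	return UpstreamIdentity{TenantID: delegateTo, UserID: userID, DelegatedBy: jwtTenant}, nil
}
```

The user ID never changes during delegation; only the effective tenant does, and the original tenant is kept for the audit trail.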

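The IsDescendant check reads a PostgreSQL closure table, which stores one row per ancestor-descendant pair. An ancestry check is therefore a single indexed lookup whose cost does not grow with hierarchy depth, and no recursive queries are needed.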

Three-Level Security Enforcement

Level 1: UI Shell     → Hides UI elements (UX, not security)
Level 2: API Gateway  → hasPermission() on every request (BLOCKING)
Level 3: PostgreSQL   → Row-Level Security by tenant_id (DATA ISOLATION)

Level 2 is mandatory even when Level 1 hides the UI element. A user who manually constructs a URL hits Level 2. A compromised service that bypasses Level 2 still hits Level 3.

Anonymous Requests

Requests without an Authorization header are processed as anonymous. Anonymous requests:

- Are subject to anonymous rate limits (100 req/min)
- Can only access publicly declared endpoints (e.g. POST /auth/login, POST /auth/register)
- Receive 401 Unauthorized on any protected endpoint

{
  "type":   "https://api.septemcore.com/problems/unauthorized",
  "title":  "Unauthorized",
  "status": 401,
  "detail": "Authorization header is missing or malformed.",
  "traceId": "01j9ptr0000000000000004"
}
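
The routing decision for anonymous requests can be sketched as below. The allowlist contains only the two endpoints named above, the function name is hypothetical, and treating a malformed header the same as a missing one is an assumption based on the `detail` message. Rate limiting is not covered.

```go
package main

import (
	"net/http"
	"strings"
)

// publicEndpoints lists the publicly declared routes mentioned in the text.
var publicEndpoints = map[string]bool{
	"POST /auth/login":    true,
	"POST /auth/register": true,
}

// AdmitAnonymous classifies a request before JWT verification.
// status is 0 when processing should continue, or the HTTP status to return.
func AdmitAnonymous(method, path, authHeader string) (anonymous bool, status int) {
	token := strings.TrimPrefix(authHeader, "Bearer ")
	if strings.HasPrefix(authHeader, "Bearer ") && token != "" {
		return false, 0 // not anonymous: JWT verification decides next
	}
	if publicEndpoints[method+" "+path] {
		return true, 0 // anonymous, but the endpoint is public
	}
	return true, http.StatusUnauthorized
}
```

A 401 result would be rendered as the problem document shown above, using the same response shape as the 403 case.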
