Feature Flags — Overview
The Feature Flag Service enables canary releases, A/B testing, and
instant feature rollback without code deployments. Developers wrap new
functionality in isEnabled('flag-name') and tenant admins control
enablement percentage from the Admin UI — no redeploy required.
Technical Stack
| Component | Technology | Detail |
|---|---|---|
| Engine | GoFeatureFlag (self-hosted, Apache 2.0, Go) | Flag storage and evaluation |
| Backing store | PostgreSQL | Single database shared with all other services |
| Flag resolution | REST API → Go service → PostgreSQL | Go service exposes /flags REST API |
| SDK | @platform/sdk-flags | HTTP poller — fetches flags every 15 seconds, caches in-process |
| Cache fallback | Valkey | Stale flag cache on pod startup (Readiness probe waits for at least stale data) |
SDK API — kernel.flags()
isEnabled()
Check whether a feature flag is enabled for the current user context:
const isNew = await kernel.flags().isEnabled('new-checkout-ui');
if (isNew) {
return <NewCheckout />;
} else {
return <LegacyCheckout />;
}
The call performs an in-memory lookup with no network call, so latency
is on the order of nanoseconds. The SDK automatically prefixes the key
with the current tenant's ID: isEnabled('new-checkout-ui') evaluates
{tenantId}:new-checkout-ui internally.
isEnabled() returns false if:
- The flag does not exist
- The flag is archived
- The flag's percentage is 0
- GoFeatureFlag is down AND the cache is empty (cold start)
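The conditions above can be sketched as a single lookup against the in-process cache. This is an illustrative sketch only, not the SDK's actual code: the `CachedFlag` shape, the `archived` field, the `evaluateFlag` name, and the precomputed `userBucket` argument are all assumptions made for the example.

```typescript
// Hypothetical shape of one cached flag (assumption, not the SDK's type).
interface CachedFlag {
  enabled: boolean;
  percentage: number; // 0..100
  archived: boolean;
}

// Returns false for every "safe default" case listed above.
function evaluateFlag(
  cache: ReadonlyMap<string, CachedFlag>,
  tenantId: string,
  flagName: string,
  userBucket: number, // 0..99, assumed to be derived elsewhere from the user ID
): boolean {
  // The key is namespaced by tenant before the lookup.
  const flag = cache.get(`${tenantId}:${flagName}`);
  // Missing flag, or an empty cache after a cold start.
  if (flag === undefined) return false;
  if (flag.archived) return false;
  if (!flag.enabled) return false;
  if (flag.percentage === 0) return false;
  return userBucket < flag.percentage;
}
```

Because every failure path returns false, a caller never needs a try/catch around the check; new functionality simply stays off.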
getVariant()
Return which variant the current user sees in an A/B test:
const variant = await kernel.flags().getVariant('checkout-button-color');
// variant = 'blue' | 'green' | 'red' | undefined
Up to 10 variants per flag. The variant is determined by deterministic hashing of the user ID — the same user always sees the same variant within a flag's lifetime.
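Deterministic assignment of this kind can be illustrated with a small hash-and-modulo sketch. The hash function (32-bit FNV-1a over UTF-16 code units), the inclusion of the flag key in the hashed string, and the `pickVariant` name are assumptions chosen for the example; the source does not state which hash the service uses.

```typescript
// 32-bit FNV-1a over UTF-16 code units. Chosen for illustration only.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // coerce to an unsigned 32-bit integer
}

// Maps a user to one of the flag's variants; same inputs, same output.
function pickVariant(
  flagKey: string,
  userId: string,
  variants: readonly string[],
): string | undefined {
  if (variants.length === 0) return undefined;
  if (variants.length > 10) {
    throw new RangeError("a flag supports at most 10 variants");
  }
  // Assumption: hashing flag key + user ID so buckets differ between flags.
  const bucket = fnv1a32(`${flagKey}:${userId}`) % variants.length;
  return variants[bucket];
}
```

Calling `pickVariant('checkout-button-color', userId, ['blue', 'green', 'red'])` twice with the same `userId` yields the same colour, which is the property the A/B test relies on.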
snapshot()
Freeze the current flag state for the duration of a critical flow (form submission, checkout, multi-step wizard). Flags cannot change mid-session even if the 15-second poll delivers new values:
const flags = kernel.flags().snapshot();
// During a multi-step checkout (may span multiple 15s cache updates):
const showNewPayment = flags.isEnabled('new-payment-flow');
// showNewPayment remains constant for this snapshot instance
// regardless of admin toggling the flag mid-checkout
getVariant() on a live kernel.flags() instance may return a
different value after each 15-second cache update. snapshot() is
immutable for its entire lifetime.
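One way to picture the difference between the live instance and a snapshot is a copy-on-snapshot design, sketched below. The class names `LiveFlagCache` and `FrozenFlags` and the `update()` method are hypothetical and stand in for whatever the SDK does internally.

```typescript
// Immutable view: copies the map once and never sees later updates.
class FrozenFlags {
  private readonly values: ReadonlyMap<string, boolean>;

  constructor(source: ReadonlyMap<string, boolean>) {
    this.values = new Map(source); // defensive copy at snapshot time
  }

  isEnabled(key: string): boolean {
    return this.values.get(key) ?? false;
  }
}

// Live view: replaced wholesale on every poll.
class LiveFlagCache {
  private current: Map<string, boolean> = new Map();

  // Called by the (hypothetical) poller after each fetch.
  update(next: ReadonlyMap<string, boolean>): void {
    this.current = new Map(next);
  }

  isEnabled(key: string): boolean {
    return this.current.get(key) ?? false;
  }

  snapshot(): FrozenFlags {
    return new FrozenFlags(this.current);
  }
}
```

A checkout flow would take one snapshot at its first step and pass that object through the remaining steps, rather than calling the live instance again.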
SDK Caching and Fallback
| State | Behavior |
|---|---|
| GoFeatureFlag available | isEnabled() = in-memory lookup (nanoseconds). Zero network calls. |
| GoFeatureFlag unavailable | SDK returns last cached values. Works autonomously. |
| Cold start + GoFeatureFlag down | All flags = false (safe default — new features are off, not on). |
| Cold start + Valkey available | Pod loads stale flags from Valkey. Readiness probe passes. Kubernetes routes traffic. |
| Cold start + both down | Pod = NOT READY (Readiness probe fails). Kubernetes does not route traffic to this pod. Existing pods with warm cache continue serving. |
SDK startup sequence:
1. Try load from GoFeatureFlag HTTP API → success → pod = READY
2. Try load from Valkey (stale cache) → success → pod = READY (stale)
3. Both unavailable → pod = NOT READY (no traffic routed)
4. GoFeatureFlag recovers → SDK polls, updates cache → pod = READY
Cache update interval: 15 seconds (FLAGS_POLL_INTERVAL_SEC).
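The startup sequence can be sketched as a two-source fallback. The loader functions passed in are placeholders for the real HTTP and Valkey clients, and the `Readiness` type and function names are assumptions for illustration, not the SDK's API.

```typescript
type LoadedFlags = Map<string, boolean>;

type Readiness =
  | { ready: true; source: "service" | "valkey"; flags: LoadedFlags }
  | { ready: false; flags: LoadedFlags };

// Pure decision step: mirrors steps 1 to 3 of the sequence above.
function decideReadiness(
  fromService: LoadedFlags | null,
  fromValkey: LoadedFlags | null,
): Readiness {
  if (fromService !== null) return { ready: true, source: "service", flags: fromService };
  if (fromValkey !== null) return { ready: true, source: "valkey", flags: fromValkey };
  return { ready: false, flags: new Map() };
}

// Runs a loader and converts any failure into null.
async function attemptLoad(load: () => Promise<LoadedFlags>): Promise<LoadedFlags | null> {
  try {
    return await load();
  } catch {
    return null;
  }
}

// Tries the flag service first and consults Valkey only if that fails.
async function startFlagCache(
  loadFromService: () => Promise<LoadedFlags>,
  loadFromValkey: () => Promise<LoadedFlags>,
): Promise<Readiness> {
  const fromService = await attemptLoad(loadFromService);
  const fromValkey = fromService === null ? await attemptLoad(loadFromValkey) : null;
  return decideReadiness(fromService, fromValkey);
}
```

A readiness probe handler would report the `ready` field; step 4 (recovery) would simply call `startFlagCache` again on the next poll.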
Tenant Isolation
Every flag key is automatically namespaced by tenant. Developers never manage this namespace manually:
Developer writes: isEnabled('new-reporting-ui')
SDK evaluates: isEnabled('{tenantId}:new-reporting-ui')
Tenant A (01j9ten...): evaluates '01j9ten...:new-reporting-ui'
Tenant B (01j9ten...): evaluates '01j9ten...:new-reporting-ui'
→ entirely independent flag states
| Isolation mechanism | Detail |
|---|---|
| Flag key format | {tenantId}:{flagName} (SDK prefix, automatic) |
| Module visibility | Each module only sees its own tenant's flags |
| Admin UI | Shows only the flags belonging to the current tenant |
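A minimal sketch of the key format shows why two tenants never collide even when they share one backing map. The `TenantScopedFlags` class and the tenant IDs used below are hypothetical.

```typescript
// Wraps a shared store and prefixes every key with the tenant ID.
class TenantScopedFlags {
  private readonly tenantId: string;
  private readonly store: Map<string, boolean>;

  constructor(tenantId: string, store: Map<string, boolean>) {
    this.tenantId = tenantId;
    this.store = store;
  }

  // Builds the {tenantId}:{flagName} key described in the table above.
  private namespaced(flagName: string): string {
    return `${this.tenantId}:${flagName}`;
  }

  set(flagName: string, enabled: boolean): void {
    this.store.set(this.namespaced(flagName), enabled);
  }

  isEnabled(flagName: string): boolean {
    return this.store.get(this.namespaced(flagName)) ?? false;
  }
}
```

Two instances built over the same `Map` with IDs `tenant-a` and `tenant-b` can both set `new-reporting-ui` without affecting each other.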
REST API
For layout brevity, the /api/v1 base path prefix is omitted
from the endpoint table below.
| Method | Endpoint | Description |
|---|---|---|
| GET | /flags | List all flags for this tenant (paginated) |
| GET | /flags/:key | Get a single flag state (enabled, percentage, variants) |
| POST | /flags | Create a new flag |
| PATCH | /flags/:key | Update flag (toggle, percentage, variants). Requires version. |
| DELETE | /flags/:key | Archive flag (soft delete, data retained 180 days) |
Create Flag
POST https://api.septemcore.com/v1/flags
Authorization: Bearer <access_token>
Content-Type: application/json
{
"key": "new-checkout-ui",
"enabled": true,
"percentage": 10,
"variants": {
"control": { "label": "Legacy checkout" },
"treatment": { "label": "New checkout UI" }
}
}
Response 201 Created:
{
"key": "new-checkout-ui",
"tenantId": "01j9ten0000000000000000",
"enabled": true,
"percentage": 10,
"variants": {
"control": { "label": "Legacy checkout" },
"treatment": { "label": "New checkout UI" }
},
"version": 1,
"createdAt": "2026-04-22T02:10:00Z",
"updatedAt": "2026-04-22T02:10:00Z"
}
Update Flag (Optimistic Locking)
All PATCH requests must include the current version:
PATCH https://api.septemcore.com/v1/flags/new-checkout-ui
Authorization: Bearer <access_token>
Content-Type: application/json
{
"percentage": 50,
"version": 1
}
Response 200 OK:
{
"key": "new-checkout-ui",
"enabled": true,
"percentage": 50,
"version": 2,
"updatedAt": "2026-04-22T02:15:00Z"
}
Conflict (another admin changed the flag first):
{
"type": "https://api.septemcore.com/problems/flag-version-conflict",
"title": "Conflict",
"status": 409,
"detail": "This flag was modified by another user. Current version is 2, you sent 1. Review the current state and retry.",
"traceId": "01j9ptr0000000000000012"
}
When a 409 occurs, the Admin UI shows a diff (who changed the flag,
when, what the current value is, and what the user attempted to set)
with an "Overwrite with my value?" confirmation.
Optimistic Locking
| Parameter | Value |
|---|---|
| Mechanism | version: int field, auto-incremented on every PATCH |
| Required on | Every PATCH /flags/:key request |
| Conflict response | 409 Conflict, RFC 9457 flag-version-conflict |
| Audit | Every flag change recorded: flags.updated + { before, after, version } |
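The version check can be illustrated with an in-memory model of the PATCH handler. This is a sketch of the general optimistic-locking technique, not the service's implementation; the `VersionedFlag`, `FlagPatch`, and `patchFlag` names are assumptions, and the behaviour for an unknown key (throwing) is not specified by the source.

```typescript
interface VersionedFlag {
  key: string;
  enabled: boolean;
  percentage: number;
  version: number;
}

interface FlagPatch {
  enabled?: boolean;
  percentage?: number;
  version: number; // the version the caller last read
}

type PatchOutcome =
  | { status: 200; flag: VersionedFlag }
  | { status: 409; detail: string };

function patchFlag(
  store: Map<string, VersionedFlag>,
  key: string,
  patch: FlagPatch,
): PatchOutcome {
  const current = store.get(key);
  // Assumption: unknown keys are treated as a programming error here.
  if (current === undefined) throw new Error(`unknown flag: ${key}`);

  // Reject the write if someone else updated the flag first.
  if (patch.version !== current.version) {
    return {
      status: 409,
      detail: `Current version is ${current.version}, you sent ${patch.version}.`,
    };
  }

  const updated: VersionedFlag = {
    key: current.key,
    enabled: patch.enabled ?? current.enabled,
    percentage: patch.percentage ?? current.percentage,
    version: current.version + 1, // auto-increment on every successful PATCH
  };
  store.set(key, updated);
  return { status: 200, flag: updated };
}
```

A client that receives the 409 outcome would re-read the flag, show the difference to the user, and retry with the fresh version only after confirmation.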
Limits
| Parameter | Value |
|---|---|
| Max flags per tenant | Determined by Billing plan. checkLimit("feature_flags") → 402 Payment Required if exceeded. Env FLAGS_MAX_PER_TENANT (default: 0 = unlimited) is a safety net. |
| Max variants per flag | 10 |
| Max variant payload | 4 KB (JSON) |
| SDK optimization | percentage: 0 AND enabled: true → short-circuits to false without computing hash (protects CPU on tenants with 500+ flags at 0%) |
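The short-circuit in the last row can be sketched as an early return placed before the hash call. The `isInRollout` name, the injected `hash` parameter, and the symmetric early return at 100% are assumptions added for illustration.

```typescript
// Any function mapping a string to a non-negative integer.
type BucketHash = (input: string) => number;

function isInRollout(
  enabled: boolean,
  percentage: number,
  userId: string,
  hash: BucketHash,
): boolean {
  if (!enabled) return false;
  // Short-circuit: at 0% no user can match, so the hash is never computed.
  if (percentage <= 0) return false;
  // Assumption: the same shortcut applies at full rollout.
  if (percentage >= 100) return true;
  // Bucket the user into 0..99 and compare against the rollout percentage.
  return hash(userId) % 100 < percentage;
}
```

Passing a hash function that counts its own invocations makes the saving visible: with `percentage` set to 0 the counter stays at zero regardless of how many flags are evaluated.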
Stale Flag Cleanup
Unused flags accumulate over time. The service auto-detects and archives stale flags:
| Parameter | Value |
|---|---|
| Stale detection | Flag not changed in 90 days AND evaluates identically for 100% of requests |
| Admin notification | UI shows: "X flags may be stale. Archive?" |
| On archive | DELETE /flags/:key → soft delete. Code referencing the flag → returns false (safe default). No data created under the flag is lost. |
| Archive retention | 180 days in PostgreSQL, then physical delete. Un-archive possible within 180 days. |
| Audit | Archive event recorded in Audit Service indefinitely. |
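The stale-detection rule in the first row can be expressed as a small predicate. The `FlagUsage` shape with separate true/false evaluation counters is an assumption about how evaluation results might be tracked, and treating a flag with zero evaluations as "evaluating identically" is likewise a choice made for this sketch.

```typescript
const STALE_AFTER_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical usage record for one boolean flag.
interface FlagUsage {
  updatedAt: Date;
  trueCount: number; // evaluations that returned true
  falseCount: number; // evaluations that returned false
}

function isStaleFlag(usage: FlagUsage, now: Date): boolean {
  const ageDays = (now.getTime() - usage.updatedAt.getTime()) / MS_PER_DAY;
  const unchanged = ageDays >= STALE_AFTER_DAYS;
  // "Evaluates identically for 100% of requests": only one outcome was ever seen.
  const uniform = usage.trueCount === 0 || usage.falseCount === 0;
  return unchanged && uniform;
}
```

A flag fully rolled out four months ago and returning true for every request would match; the same flag with even one false evaluation would not.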