Feature Flags — Overview

The Feature Flag Service enables canary releases, A/B testing, and instant feature rollback without code deployments. Developers wrap new functionality in isEnabled('flag-name') and tenant admins control enablement percentage from the Admin UI — no redeploy required.


Technical Stack

| Component | Technology | Detail |
| --- | --- | --- |
| Engine | GoFeatureFlag (self-hosted, Apache 2.0, Go) | Flag storage and evaluation |
| Backing store | PostgreSQL | Single database shared with all other services |
| Flag resolution | REST API → Go service → PostgreSQL | Go service exposes /flags REST API |
| SDK | @platform/sdk-flags | HTTP poller: fetches flags every 15 seconds, caches in-process |
| Cache fallback | Valkey | Stale flag cache on pod startup (readiness probe waits for at least stale data) |

SDK API — kernel.flags()

isEnabled()

Check whether a feature flag is enabled for the current user context:

const isNew = await kernel.flags().isEnabled('new-checkout-ui');

if (isNew) {
  return <NewCheckout />;
} else {
  return <LegacyCheckout />;
}

The call performs an in-memory lookup — no network call, latency in nanoseconds. The SDK automatically prefixes the key with the current tenant's ID: isEnabled('new-checkout-ui') → evaluates {tenantId}:new-checkout-ui internally.

isEnabled() returns false if:

  • The flag does not exist
  • The flag is archived
  • The flag's percentage is 0
  • GoFeatureFlag is down AND the cache is empty (cold start)
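
The conditions above can be sketched as a single check against the in-process cache. This is a minimal illustration, not the SDK's actual code: the CachedFlag shape and the scopedKey and isDefinitelyOff helpers are hypothetical names, and treating a flag with enabled: false as off is an assumption that the list does not state explicitly.

```typescript
// Hypothetical shape of a cached flag; the field names are assumptions.
interface CachedFlag {
  enabled: boolean;
  percentage: number; // 0..100
  archived: boolean;
}

// Builds the tenant-scoped key that the SDK evaluates internally.
function scopedKey(tenantId: string, flagName: string): string {
  return `${tenantId}:${flagName}`;
}

// Returns true when isEnabled() must answer false without further evaluation.
// An empty cache (cold start with the engine down) looks the same as a missing flag.
function isDefinitelyOff(
  cache: Map<string, CachedFlag>,
  tenantId: string,
  flagName: string,
): boolean {
  const flag = cache.get(scopedKey(tenantId, flagName));
  if (flag === undefined) return true; // flag does not exist, or the cache is empty
  if (flag.archived) return true; // archived flags always read as false
  if (!flag.enabled) return true; // assumption: a disabled flag is off
  if (flag.percentage === 0) return true; // 0% rollout
  return false; // otherwise the percentage rollout decides
}
```

Because the lookup key includes the tenant ID, a flag cached for one tenant is never visible to another tenant that uses the same flag name.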

getVariant()

Return which variant the current user sees in an A/B test:

const variant = await kernel.flags().getVariant('checkout-button-color');
// variant = 'blue' | 'green' | 'red' | undefined

Each flag supports up to 10 variants. The variant is chosen by deterministic hashing of the user ID, so the same user always sees the same variant for the lifetime of a flag.
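
Deterministic assignment can be illustrated with any stable hash. The sketch below uses 32-bit FNV-1a over the string's UTF-16 code units purely as an example; the hash the service actually uses is not documented here, and pickVariant and the flagKey:userId input format are hypothetical.

```typescript
// 32-bit FNV-1a over UTF-16 code units. Chosen for illustration only;
// this is not necessarily the hash the service uses.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // 32-bit multiply, kept unsigned
  }
  return hash >>> 0;
}

// Maps a user to one of the flag's variants. The same inputs always give the same output.
function pickVariant(
  flagKey: string,
  userId: string,
  variants: string[],
): string | undefined {
  if (variants.length === 0) return undefined;
  if (variants.length > 10) {
    throw new RangeError("a flag supports at most 10 variants");
  }
  return variants[fnv1a(`${flagKey}:${userId}`) % variants.length];
}
```

Including the flag key in the hashed string is a design choice in this sketch: it keeps a user's bucket in one experiment independent of their bucket in another.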

snapshot()

Freeze the current flag state for the duration of a critical flow (form submission, checkout, multi-step wizard). Flags cannot change mid-session even if the 15-second poll delivers new values:

const flags = kernel.flags().snapshot();

// During a multi-step checkout (may span multiple 15s cache updates):
const showNewPayment = flags.isEnabled('new-payment-flow');
// showNewPayment remains constant for this snapshot instance
// regardless of admin toggling the flag mid-checkout

getVariant() on a live kernel.flags() instance may return a different value after each 15-second cache update. snapshot() is immutable for its entire lifetime.
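
One way to picture the immutability guarantee is a snapshot that copies the live values at creation time and never reads the live cache again. This is a sketch under that assumption; the FlagSnapshot class and its constructor argument are hypothetical and do not describe the SDK's internals.

```typescript
// Hypothetical snapshot: copies the live flag values once, then reads only the copy.
class FlagSnapshot {
  private readonly values: ReadonlyMap<string, boolean>;

  constructor(live: Map<string, boolean>) {
    // Copy the entries so later poll updates to `live` are not observed.
    this.values = new Map(live);
  }

  // Unknown flags read as false, matching the safe default of isEnabled().
  isEnabled(name: string): boolean {
    return this.values.get(name) ?? false;
  }
}

// Usage: the snapshot keeps the value seen at creation time.
const liveCache = new Map<string, boolean>([["new-payment-flow", true]]);
const frozen = new FlagSnapshot(liveCache);
liveCache.set("new-payment-flow", false); // simulates an admin toggle arriving via poll
console.log(frozen.isEnabled("new-payment-flow")); // still true for this snapshot
```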


SDK Caching and Fallback

| State | Behavior |
| --- | --- |
| GoFeatureFlag available | isEnabled() = in-memory lookup (nanoseconds). Zero network calls. |
| GoFeatureFlag unavailable | SDK returns last cached values. Works autonomously. |
| Cold start + GoFeatureFlag down | All flags = false (safe default: new features are off, not on). |
| Cold start + Valkey available | Pod loads stale flags from Valkey. Readiness probe passes. Kubernetes routes traffic. |
| Cold start + both down | Pod = NOT READY (readiness probe fails). Kubernetes does not route traffic to this pod. Existing pods with warm cache continue serving. |

SDK startup sequence:
1. Try load from GoFeatureFlag HTTP API → success → pod = READY
2. Try load from Valkey (stale cache) → success → pod = READY (stale)
3. Both unavailable → pod = NOT READY (no traffic routed)
4. GoFeatureFlag recovers → SDK polls, updates cache → pod = READY
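
The decision at the heart of the startup sequence can be sketched as a pure function over the outcome of the two load attempts. The types and the resolveReadiness name are hypothetical, and a failed load is modelled as null for brevity; real loaders would be asynchronous HTTP and Valkey calls.

```typescript
type FlagSet = Record<string, boolean>;

// Outcome of startup: either ready with a known source, or not ready at all.
type Readiness =
  | { ready: true; source: "engine" | "stale-cache"; flags: FlagSet }
  | { ready: false };

// `fromEngine` and `fromValkey` are the results of the two load attempts,
// with null meaning the source was unavailable.
function resolveReadiness(
  fromEngine: FlagSet | null,
  fromValkey: FlagSet | null,
): Readiness {
  if (fromEngine !== null) {
    return { ready: true, source: "engine", flags: fromEngine }; // step 1
  }
  if (fromValkey !== null) {
    return { ready: true, source: "stale-cache", flags: fromValkey }; // step 2
  }
  return { ready: false }; // step 3: readiness probe fails, no traffic routed
}
```

Step 4 corresponds to calling the same function again on a later poll: once the engine load succeeds, the result switches to ready with the engine as its source.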

Cache update interval: 15 seconds (FLAGS_POLL_INTERVAL_SEC).


Tenant Isolation

Every flag key is automatically namespaced by tenant. Developers never manage this namespace manually:

Developer writes: isEnabled('new-reporting-ui')
SDK evaluates: isEnabled('{tenantId}:new-reporting-ui')

Tenant A (01j9ten...): evaluates '01j9ten...:new-reporting-ui'
Tenant B (01j9ten...): evaluates '01j9ten...:new-reporting-ui'
→ entirely independent flag states

| Isolation mechanism | Detail |
| --- | --- |
| Flag key format | {tenantId}:{flagName} (SDK prefix, automatic) |
| Module visibility | Each module only sees its own tenant's flags |
| Admin UI | Shows only the flags belonging to the current tenant |

REST API

Base Path

For layout brevity, the /api/v1 base path prefix is omitted from the endpoint table below.

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /flags | List all flags for this tenant (paginated) |
| GET | /flags/:key | Get a single flag state (enabled, percentage, variants) |
| POST | /flags | Create a new flag |
| PATCH | /flags/:key | Update flag (toggle, percentage, variants). Requires version. |
| DELETE | /flags/:key | Archive flag (soft delete, data retained 180 days) |

Create Flag

POST https://api.septemcore.com/v1/flags
Authorization: Bearer <access_token>
Content-Type: application/json

{
  "key": "new-checkout-ui",
  "enabled": true,
  "percentage": 10,
  "variants": {
    "control": { "label": "Legacy checkout" },
    "treatment": { "label": "New checkout UI" }
  }
}

Response 201 Created:

{
  "key": "new-checkout-ui",
  "tenantId": "01j9ten0000000000000000",
  "enabled": true,
  "percentage": 10,
  "variants": {
    "control": { "label": "Legacy checkout" },
    "treatment": { "label": "New checkout UI" }
  },
  "version": 1,
  "createdAt": "2026-04-22T02:10:00Z",
  "updatedAt": "2026-04-22T02:10:00Z"
}

Update Flag (Optimistic Locking)

All PATCH requests must include the current version:

PATCH https://api.septemcore.com/v1/flags/new-checkout-ui
Authorization: Bearer <access_token>
Content-Type: application/json

{
  "percentage": 50,
  "version": 1
}

Response 200 OK:

{
  "key": "new-checkout-ui",
  "enabled": true,
  "percentage": 50,
  "version": 2,
  "updatedAt": "2026-04-22T02:15:00Z"
}

Conflict (another admin changed the flag first):

{
  "type": "https://api.septemcore.com/problems/flag-version-conflict",
  "title": "Conflict",
  "status": 409,
  "detail": "This flag was modified by another user. Current version is 2, you sent 1. Review the current state and retry.",
  "traceId": "01j9ptr0000000000000012"
}

When a 409 occurs, the Admin UI shows a diff: who changed the flag, when, what the current value is, and what the user attempted to set. It then asks for confirmation with an "Overwrite with my value?" prompt.
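
A client that calls the API directly needs to recognise this conflict body before it decides whether to refetch and retry. The type guard below is a minimal sketch based only on the fields shown in the example response; the ProblemDetails interface and the isVersionConflict name are hypothetical.

```typescript
// Fields taken from the conflict example above; others may exist.
interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail?: string;
  traceId?: string;
}

const VERSION_CONFLICT_TYPE =
  "https://api.septemcore.com/problems/flag-version-conflict";

// Narrows an already-parsed JSON body to a version-conflict problem document.
function isVersionConflict(body: unknown): body is ProblemDetails {
  if (typeof body !== "object" || body === null) return false;
  const problem = body as Record<string, unknown>;
  return (
    problem["status"] === 409 && problem["type"] === VERSION_CONFLICT_TYPE
  );
}
```

Matching on the type URI rather than on the human-readable detail string keeps the check stable if the message wording changes.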


Optimistic Locking

| Parameter | Value |
| --- | --- |
| Mechanism | version: int field, auto-incremented on every PATCH |
| Required on | Every PATCH /flags/:key request |
| Conflict response | 409 Conflict, RFC 9457 flag-version-conflict |
| Audit | Every flag change recorded: flags.updated + { before, after, version } |
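
The version check itself is a compare-then-increment. The following in-memory sketch shows the rule only and assumes nothing about how the service implements it in PostgreSQL; StoredFlag, FlagPatch, VersionConflictError, and applyPatch are hypothetical names.

```typescript
interface StoredFlag {
  key: string;
  enabled: boolean;
  percentage: number;
  version: number;
}

// A patch carries the version the caller last saw, plus the fields to change.
interface FlagPatch {
  version: number;
  enabled?: boolean;
  percentage?: number;
}

class VersionConflictError extends Error {
  readonly current: number;
  readonly sent: number;

  constructor(current: number, sent: number) {
    super(`Current version is ${current}, you sent ${sent}.`);
    this.name = "VersionConflictError";
    this.current = current;
    this.sent = sent;
  }
}

// Applies the patch only if the caller saw the latest version, then bumps the version.
function applyPatch(stored: StoredFlag, patch: FlagPatch): StoredFlag {
  if (patch.version !== stored.version) {
    throw new VersionConflictError(stored.version, patch.version); // maps to 409
  }
  return {
    ...stored,
    enabled: patch.enabled ?? stored.enabled,
    percentage: patch.percentage ?? stored.percentage,
    version: stored.version + 1,
  };
}
```

In a real store the comparison and the write would have to happen atomically, for example in a single conditional UPDATE, so that two concurrent patches cannot both pass the check.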

Limits

| Parameter | Value |
| --- | --- |
| Max flags per tenant | Determined by Billing plan. checkLimit("feature_flags") returns 402 Payment Required if exceeded. Env FLAGS_MAX_PER_TENANT (default: 0 = unlimited) is a safety net. |
| Max variants per flag | 10 |
| Max variant payload | 4 KB (JSON) |
| SDK optimization | percentage: 0 AND enabled: true → short-circuits to false without computing the hash (protects CPU on tenants with 500+ flags at 0%) |
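
The short-circuit can be sketched by making the hash computation lazy, so it runs only when the percentage actually needs a bucket. This is an illustration under assumptions: inRollout and the bucketOf callback are hypothetical, the bucket range of 0 to 99 is assumed, and the shortcut at 100% is an addition of this sketch rather than documented behavior.

```typescript
// `bucketOf` lazily computes the user's bucket in the range 0..99 (assumed range).
// It is only invoked when the percentage cannot decide the outcome on its own.
function inRollout(
  enabled: boolean,
  percentage: number,
  bucketOf: () => number,
): boolean {
  if (!enabled || percentage <= 0) return false; // short-circuit: no hash computed
  if (percentage >= 100) return true; // assumption: full rollout also skips the hash
  return bucketOf() < percentage;
}

// Usage: count how often the (expensive) hash is actually computed.
let hashCalls = 0;
const bucket = (): number => {
  hashCalls += 1;
  return 42; // stand-in for hash(userId) % 100
};

const atZero = inRollout(true, 0, bucket); // false, hash not computed
const atFifty = inRollout(true, 50, bucket); // true, because 42 < 50
const atTen = inRollout(true, 10, bucket); // false, because 42 >= 10
```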

Stale Flag Cleanup

Unused flags accumulate over time. The service detects stale flags automatically and prompts admins to archive them:

| Parameter | Value |
| --- | --- |
| Stale detection | Flag not changed in 90 days AND evaluates identically for 100% of requests |
| Admin notification | UI shows: "X flags may be stale. Archive?" |
| On archive | DELETE /flags/:key → soft delete. Code referencing the flag → returns false (safe default). No data created under the flag is lost. |
| Archive retention | 180 days in PostgreSQL, then physical delete. Un-archive possible within 180 days. |
| Audit | Archive event recorded in Audit Service indefinitely. |
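
The stale-detection rule combines an age check with a uniformity check. The sketch below assumes the service keeps per-flag counts of true and false evaluations; that bookkeeping, the FlagUsage shape, and the isStale name are hypothetical. Note that a flag with no recorded evaluations counts as uniform in this sketch.

```typescript
// Hypothetical usage record for one flag.
interface FlagUsage {
  updatedAt: Date; // last time the flag definition changed
  trueCount: number; // evaluations that returned true
  falseCount: number; // evaluations that returned false
}

const STALE_AFTER_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A flag is stale when it has not changed for 90 days AND every evaluation
// produced the same result.
function isStale(usage: FlagUsage, now: Date): boolean {
  const ageMs = now.getTime() - usage.updatedAt.getTime();
  const unchanged = ageMs >= STALE_AFTER_DAYS * MS_PER_DAY;
  const uniform = usage.trueCount === 0 || usage.falseCount === 0;
  return unchanged && uniform;
}
```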