
Custom Domains — Overview

The Custom Domain Mapping service lets tenants replace the default platform subdomain (tenant-123.platform.io) with their own branded hostname (mycompany-crm.com, shop.mybrand.co). The service handles end-to-end domain verification, TLS certificate issuance, and live Envoy routing — all without restarting the gateway.


Technical Stack

Component | Technology | Responsibility
Go service | domain-resolver | DNS verification worker, ACME client, Envoy xDS control plane
TLS | Let's Encrypt (ACME HTTP-01 / DNS-01) | Automatic certificate issuance and renewal
Gateway | Envoy xDS, SDS (Secret Discovery Service) | SNI routing → tenant resolution, zero-restart config updates
Secrets | HashiCorp Vault | Private key storage: never on disk, never in Kubernetes Secrets
Database | PostgreSQL + RLS | custom_domains table; tenants see only their own records

End-to-End Domain Flow

1. Tenant:   POST /api/v1/domains { "hostname": "mycompany.com" }
   Platform: generate verification_token
             INSERT custom_domains (status: 'pending')
   Response: { id, hostname, verification_token, instructions }

2. Tenant configures DNS with their registrar:
   Option A (subdomain): CNAME shop.mybrand.co → custom.platform.io
   Option B (apex):      A     mycompany.com   → 76.76.21.21 (Anycast IP)
                         (a CNAME cannot be placed at the zone apex, hence the A record)
   Both options:         TXT   _platform-verify.{hostname} → "verify-{token}"

3. Background worker (Go, ticker every 30s):
   DNS TXT lookup for _platform-verify.mycompany.com
   ┌─ Token matches → status = 'verified'
   └─ No match      → retry (max 72 hours, then status = 'failed')

4. ACME HTTP-01 challenge:
   Envoy answers /.well-known/acme-challenge/{token}
   Let's Encrypt verifies → certificate issued
   Private key → stored in HashiCorp Vault (never on disk)
   ssl_status: 'none' → 'validating' → 'issued'

5. Envoy xDS SDS (on-demand):
   Envoy receives request for mycompany.com
   SDS: lazy-fetch certificate from Go control plane → Vault
   Certificate loaded into Envoy memory (not disk)
   Domain status → 'active'

6. Live traffic:
   TLS terminated by Envoy
   Go middleware: SELECT tenant_id FROM custom_domains
                  WHERE hostname = $1 AND status = 'active'
   Inject X-Tenant-ID header → upstream services
   Standard JWT + RLS flow applies
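
Step 6 can be sketched as a small net/http middleware. This is a minimal illustration, not the service's actual code: the TenantResolver type, the function names, and the 404 response for unknown hosts are assumptions. The resolver stands in for the SELECT shown in step 6.

```go
package main

import (
	"net"
	"net/http"
	"strings"
)

// TenantResolver is a hypothetical lookup: hostname -> tenant ID for
// domains whose status is 'active' (the SELECT in step 6).
type TenantResolver func(hostname string) (tenantID string, ok bool)

// normalizeHost lowercases a Host header value and strips an optional
// port and trailing dot, so lookups match the stored hostname.
func normalizeHost(hostport string) string {
	host := hostport
	if h, _, err := net.SplitHostPort(hostport); err == nil {
		host = h
	}
	host = strings.TrimSuffix(host, ".")
	return strings.ToLower(host)
}

// tenantMiddleware resolves the tenant for the request host and injects
// X-Tenant-ID before calling the upstream handler.
func tenantMiddleware(resolve TenantResolver, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tenantID, ok := resolve(normalizeHost(r.Host))
		if !ok {
			// Assumption: unknown or inactive hosts get a 404.
			http.Error(w, "Domain not configured for this platform", http.StatusNotFound)
			return
		}
		// Set (not Add) overwrites any client-supplied X-Tenant-ID.
		r.Header.Set("X-Tenant-ID", tenantID)
		next.ServeHTTP(w, r)
	})
}
```

Overwriting the header rather than appending to it matters here: a client-supplied X-Tenant-ID must never reach upstream services.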

Domain Status State Machine

Status | Meaning
pending | Domain registered, waiting for tenant to configure DNS
dns_check | DNS TXT record found, verification in progress
verified | DNS verified, ACME SSL challenge initiated
active | SSL issued, traffic routed through platform
failed | Verification timed out (72h) or max retries exceeded
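
One way to keep status updates consistent with the table above is an explicit transition map. The sketch below is illustrative only: the type and function names are hypothetical, and the failed → pending edge (a tenant retrying) is an assumption not stated on this page.

```go
package main

import "fmt"

// DomainStatus mirrors the values of custom_domains.status.
type DomainStatus string

const (
	StatusPending  DomainStatus = "pending"
	StatusDNSCheck DomainStatus = "dns_check"
	StatusVerified DomainStatus = "verified"
	StatusActive   DomainStatus = "active"
	StatusFailed   DomainStatus = "failed"
)

// allowedTransitions encodes pending → dns_check → verified → active,
// with failed reachable from every non-terminal state.
var allowedTransitions = map[DomainStatus][]DomainStatus{
	StatusPending:  {StatusDNSCheck, StatusFailed},
	StatusDNSCheck: {StatusVerified, StatusFailed},
	StatusVerified: {StatusActive, StatusFailed},
	StatusFailed:   {StatusPending}, // assumption: retry restarts the flow
}

// transition returns nil if moving from one status to another is allowed.
func transition(from, to DomainStatus) error {
	for _, next := range allowedTransitions[from] {
		if next == to {
			return nil
		}
	}
	return fmt.Errorf("illegal domain status transition %q -> %q", from, to)
}
```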

SSL Status State Machine

Status | Meaning
none | No SSL attempted yet
validating | ACME challenge in progress
issued | Certificate issued and loaded into Envoy
expiring | Certificate expires within 30 days, renewal triggered
renewed | Certificate renewed successfully
error | ACME challenge failed (rate limit, DNS misconfiguration)
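
The issued → expiring edge depends only on cert_expiry and the 30-day window. A minimal sketch of that check follows; the function name and the sample date are illustrative assumptions.

```go
package main

import "time"

// renewDaysBefore mirrors DOMAINS_SSL_RENEW_DAYS_BEFORE (30 on this page).
const renewDaysBefore = 30

// sampleExpiry is an illustrative cert_expiry value, not real data.
var sampleExpiry = time.Date(2026, time.July, 21, 2, 0, 0, 0, time.UTC)

// renewalDue reports whether now is inside the renewal window, i.e. the
// certificate expires within renewDaysBefore days or has already expired.
func renewalDue(certExpiry, now time.Time) bool {
	windowStart := certExpiry.AddDate(0, 0, -renewDaysBefore)
	return !now.Before(windowStart)
}
```

With sampleExpiry, a check made 10 days before expiry is due for renewal, while one made 45 days before is not.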

DNS Verification

Parameter | Value
TXT record name | _platform-verify.{hostname}
TXT record value | verify-{token}
Polling interval | Every 30 seconds (DOMAINS_DNS_CHECK_INTERVAL_SEC)
Maximum verification window | 72 hours (DOMAINS_VERIFY_TTL_HOURS). Expired → status = 'failed'
Fallback verification | If CNAME already points to the platform → HTTP-01 challenge acts as functional verification

On failure, the tenant receives a Notify notification and the domain UI shows status failed with instructions to retry.
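
A single verification attempt can be sketched with the standard library resolver. The helper names and the token format (16 random bytes, hex-encoded) are assumptions; only the record name and the verify- prefix come from the table above.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"net"
	"strings"
)

// newVerificationToken returns 16 random bytes as hex.
// The token length and encoding are assumptions for illustration.
func newVerificationToken() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// txtRecordName builds the record name the worker queries.
func txtRecordName(hostname string) string {
	return "_platform-verify." + hostname
}

// txtMatches reports whether any TXT record equals "verify-{token}".
func txtMatches(records []string, token string) bool {
	want := "verify-" + token
	for _, rec := range records {
		if strings.TrimSpace(rec) == want {
			return true
		}
	}
	return false
}

// checkDomain performs one lookup. A missing record is reported as
// "not verified yet" rather than as an error, so the worker keeps polling.
func checkDomain(hostname, token string) (bool, error) {
	records, err := net.LookupTXT(txtRecordName(hostname))
	if err != nil {
		var dnsErr *net.DNSError
		if errors.As(err, &dnsErr) && dnsErr.IsNotFound {
			return false, nil
		}
		return false, err
	}
	return txtMatches(records, token), nil
}
```

Comparing against the full record value, rather than searching for a substring, avoids false positives from unrelated TXT records such as SPF entries.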


SSL — Let's Encrypt ACME

Parameter | Value
Primary challenge | HTTP-01: Envoy answers /.well-known/acme-challenge/{token}
Wildcard domains | DNS-01 challenge (TXT _acme-challenge.{hostname})
Certificate storage | HashiCorp Vault only: private keys never on disk, never in Kubernetes Secrets
Auto-renewal | Background worker triggers renewal 30 days before expiry (DOMAINS_SSL_RENEW_DAYS_BEFORE)
Zombie protection | failure_count tracked per domain. Auto-paused at 10 failures (DOMAINS_MAX_SSL_RETRIES). Exponential backoff between attempts.
Let's Encrypt rate limits | 50 certificates per registered domain per week · 300 new orders per account per 3 hours · 5 failed validations per hostname per account per hour
Enterprise bypass | External Account Binding (EAB) with Key ID + HMAC key, for ACME CAs that require it (e.g. ZeroSSL). Let's Encrypt itself does not use EAB.
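
The zombie-protection rule combines a retry cap with exponential backoff. The sketch below shows one possible schedule; the base delay, the ceiling, and the function name are assumptions, and only the limit of 10 comes from this page.

```go
package main

import "time"

// maxSSLRetries mirrors DOMAINS_MAX_SSL_RETRIES.
const maxSSLRetries = 10

// baseDelay and maxDelay are illustrative assumptions.
const (
	baseDelay = time.Minute
	maxDelay  = 6 * time.Hour
)

// nextAttemptDelay returns how long to wait before the next ACME attempt
// for a domain with the given failure_count, and whether the domain
// should be auto-paused instead of retried.
func nextAttemptDelay(failureCount int) (delay time.Duration, paused bool) {
	if failureCount >= maxSSLRetries {
		return 0, true
	}
	if failureCount <= 0 {
		return 0, false
	}
	// Doubles per failure: 1m, 2m, 4m, ... capped at maxDelay.
	delay = baseDelay << uint(failureCount-1)
	if delay > maxDelay {
		delay = maxDelay
	}
	return delay, false
}
```

Spacing attempts out this way also helps a misconfigured domain stay under the failed-validation limit listed in the table.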

Envoy xDS Integration

The domain-resolver Go service acts as an xDS control plane, pushing configuration to Envoy without restarts:

xDS component | Description
SDS (Secret Discovery Service) | On-demand certificate loading. Envoy does not load all certificates at startup; it fetches lazily when a new hostname first appears.
LDS / RDS | Dynamic Listener + Route: hostname → upstream cluster mapping. Updated without Envoy restart.
Go control plane | Uses go-control-plane library. Listens for PostgreSQL LISTEN/NOTIFY events → generates xDS Snapshot → Envoy picks up via gRPC stream.
Delta xDS | Phase 2+: incremental updates instead of full snapshots (required at > 1,000 domains to avoid control plane overload).
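
xDS resources carry a version string, and a control plane that rebuilds a snapshot on every NOTIFY event can avoid redundant pushes by deriving that version from the content. The sketch below shows one way to do this with the standard library only. The Route type and snapshotVersion function are hypothetical and not part of go-control-plane.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
)

// Route is a hypothetical projection of an active custom_domains row:
// the hostname and the upstream cluster it maps to.
type Route struct {
	Hostname string
	Cluster  string
}

// snapshotVersion returns a deterministic, order-independent version
// string for a set of routes. Identical route sets give identical
// versions, so an unchanged snapshot need not be pushed again.
func snapshotVersion(routes []Route) string {
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]Route(nil), routes...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Hostname < sorted[j].Hostname
	})
	h := sha256.New()
	for _, r := range sorted {
		// NUL separators prevent ambiguous concatenations.
		h.Write([]byte(r.Hostname))
		h.Write([]byte{0})
		h.Write([]byte(r.Cluster))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))[:16]
}
```

The control plane would compare the new version with the last one it published and skip the update when they match.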

PostgreSQL Schema

CREATE TABLE custom_domains (
    id                 UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id          UUID NOT NULL REFERENCES tenants(id),
    hostname           VARCHAR(255) UNIQUE NOT NULL,
    verification_token TEXT NOT NULL,
    status             VARCHAR(50) NOT NULL DEFAULT 'pending',
    -- pending → dns_check → verified → active / failed
    ssl_status         VARCHAR(50) NOT NULL DEFAULT 'none',
    -- none → validating → issued → expiring → renewed / error
    cert_expiry        TIMESTAMPTZ,
    failure_count      INT DEFAULT 0,
    metadata           JSONB DEFAULT '{}',
    created_at         TIMESTAMPTZ DEFAULT now(),
    updated_at         TIMESTAMPTZ DEFAULT now()
);

-- Fast hostname lookup for tenant resolution on the request path.
-- Note: the UNIQUE constraint on hostname already creates an equivalent
-- index, so this explicit index is redundant and can be dropped.
CREATE UNIQUE INDEX idx_custom_domains_hostname
    ON custom_domains(hostname);

-- Partial index: SSL renewal worker scans only validating/expiring rows
CREATE INDEX idx_domains_ssl_pending
    ON custom_domains(ssl_status)
    WHERE ssl_status IN ('validating', 'expiring');

-- RLS: tenant sees only their own domains
ALTER TABLE custom_domains ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation_domains ON custom_domains
    USING (tenant_id = current_setting('app.current_tenant')::uuid);

REST API

Base Path

For layout brevity, the /api/v1 base path prefix is omitted from the endpoint table below.

Method | Endpoint | Description
POST | /domains | Register domain. Body: { "hostname": "mycompany.com" }. Response includes verification_token and DNS setup instructions.
GET | /domains | List all domains for this tenant (paginated)
GET | /domains/:id | Get domain detail: verification status, SSL status, expiry
GET | /domains/:id/verify | Trigger DNS verification immediately (outside worker schedule)
DELETE | /domains/:id | Detach domain: soft delete + SSL certificate revoke

Register Domain — Example

POST https://api.septemcore.com/v1/domains
Authorization: Bearer <access_token>
Content-Type: application/json

{
  "hostname": "mycompany.com"
}

Response 201 Created:

{
  "id": "01j9pdom0000000000000000",
  "hostname": "mycompany.com",
  "status": "pending",
  "sslStatus": "none",
  "verificationToken": "verify-abc123def456",
  "instructions": {
    "subdomain": {
      "type": "CNAME",
      "name": "mycompany.com",
      "value": "custom.platform.io"
    },
    "apex": {
      "type": "A",
      "name": "mycompany.com",
      "value": "76.76.21.21"
    },
    "verification": {
      "type": "TXT",
      "name": "_platform-verify.mycompany.com",
      "value": "verify-abc123def456"
    }
  },
  "createdAt": "2026-04-22T02:00:00Z"
}
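
A Go client could decode this response into typed structs. The following is a sketch based only on the example payload above. The struct and function names are hypothetical, and the real API may return additional fields.

```go
package main

import (
	"encoding/json"
	"time"
)

// DNSRecord is one entry of the "instructions" object.
type DNSRecord struct {
	Type  string `json:"type"`
	Name  string `json:"name"`
	Value string `json:"value"`
}

// DomainResponse mirrors the fields of the 201 Created example.
type DomainResponse struct {
	ID                string `json:"id"`
	Hostname          string `json:"hostname"`
	Status            string `json:"status"`
	SSLStatus         string `json:"sslStatus"`
	VerificationToken string `json:"verificationToken"`
	Instructions      struct {
		Subdomain    DNSRecord `json:"subdomain"`
		Apex         DNSRecord `json:"apex"`
		Verification DNSRecord `json:"verification"`
	} `json:"instructions"`
	CreatedAt time.Time `json:"createdAt"` // RFC 3339 timestamp
}

// parseDomainResponse decodes a response body into DomainResponse.
func parseDomainResponse(body []byte) (DomainResponse, error) {
	var resp DomainResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}
```

The Verification record holds everything a tenant-facing UI needs to render the DNS setup step.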

Limits and Constraints

Parameter | Value
Max domains per tenant | 5 by default (DOMAINS_MAX_PER_TENANT). Determined by Billing plan.
Max hostname length | 253 characters (RFC 1035)
Blocked domains | *.platform.io, *.localhost, bare IP addresses
Audit | All operations (add/verify/delete/ssl_issue/ssl_renew) → Audit Service
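
The hostname constraints above could be enforced at registration time along these lines. This is a sketch under assumptions: the function name and error messages are illustrative, and rejecting the bare zones (platform.io, localhost) and single-label names goes slightly beyond what the table states.

```go
package main

import (
	"errors"
	"net"
	"strings"
)

// blockedZones lists the zones from the table; both the zone itself and
// any name under it are rejected (rejecting the bare zone is an assumption).
var blockedZones = []string{"platform.io", "localhost"}

// validateHostname normalizes a hostname and applies the documented limits.
func validateHostname(raw string) (string, error) {
	host := strings.ToLower(strings.TrimSuffix(strings.TrimSpace(raw), "."))
	switch {
	case host == "":
		return "", errors.New("hostname is empty")
	case len(host) > 253:
		return "", errors.New("hostname exceeds 253 characters")
	case net.ParseIP(host) != nil:
		return "", errors.New("bare IP addresses are not allowed")
	}
	for _, zone := range blockedZones {
		if host == zone || strings.HasSuffix(host, "."+zone) {
			return "", errors.New("hostname is in a blocked zone: " + zone)
		}
	}
	if !strings.Contains(host, ".") {
		// Assumption: single-label names are never valid custom domains.
		return "", errors.New("hostname must be fully qualified")
	}
	return host, nil
}
```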

Scaling Phases

Phase | Domain count | Architecture
Phase 1 | 0 – 1,000 | Shared Envoy fleet, full xDS snapshots
Phase 2 | 1,000 – 50,000 | Distributed Envoy + Anycast, Delta xDS, Vault SDS
Phase 3 | 50,000+ | Cell-based infrastructure, multi-region, SPIFFE/SPIRE

Error Reference

Situation | Behavior
DNS not configured | UI shows status "Pending DNS" with setup instructions and "Verify now" button
SSL pending | Status shown in UI, traffic not routed until issued
Verification timeout (72h) | status = 'failed', Notify notification sent to tenant admin
Zombie client | failure_count >= 10 → auto-pause + alert to Platform Admin
Certificate expired | Auto-renew via ACME. On failure → fallback to platform subdomain
Domain not yet active | Envoy serves default page: "Domain not configured for this platform"