Custom Domains — Overview
The Custom Domain Mapping service lets tenants replace the default
platform subdomain (tenant-123.platform.io) with their own branded
hostname (mycompany-crm.com, shop.mybrand.co). The service handles
end-to-end domain verification, TLS certificate issuance, and live
Envoy routing — all without restarting the gateway.
Technical Stack
| Component | Technology | Responsibility |
|---|---|---|
| Go service | domain-resolver | DNS verification worker, ACME client, Envoy xDS control plane |
| TLS | Let's Encrypt (ACME HTTP-01 / DNS-01) | Automatic certificate issuance and renewal |
| Gateway | Envoy xDS — SDS (Secret Discovery Service) | SNI routing → tenant resolution, zero-restart config updates |
| Secrets | HashiCorp Vault | Private key storage — never on disk, never in Kubernetes Secrets |
| Database | PostgreSQL + RLS | custom_domains table; tenants see only their own records |
End-to-End Domain Flow
1. Tenant: POST /api/v1/domains { "hostname": "mycompany.com" }
Platform: generate verification_token
INSERT custom_domains (status: 'pending')
Response: { id, hostname, verification_token, instructions }
2. Tenant configures DNS with their registrar:
Option A (subdomain, e.g. shop.mycompany.com): CNAME shop.mycompany.com → custom.platform.io
Option B (apex): A mycompany.com → 76.76.21.21 (Anycast IP)
+ TXT _platform-verify.mycompany.com → "verify-{token}"
3. Background worker (Go, ticker every 30s):
DNS TXT lookup for _platform-verify.mycompany.com
┌─ Token matches → status = 'verified'
└─ No match → retry (max 72 hours, then status = 'failed')
4. ACME HTTP-01 challenge:
Envoy answers /.well-known/acme-challenge/{token}
Let's Encrypt verifies → certificate issued
Private key → stored in HashiCorp Vault (never on disk)
ssl_status: 'none' → 'validating' → 'issued'
5. Envoy xDS SDS (on-demand):
Envoy receives request for mycompany.com
SDS: lazy-fetch certificate from Go control plane → Vault
Certificate loaded into Envoy memory (not disk)
Domain status → 'active'
6. Live traffic:
TLS terminated by Envoy
Go middleware: SELECT tenant_id FROM custom_domains
WHERE hostname = $1 AND status = 'active'
Inject X-Tenant-ID header → upstream services
Standard JWT + RLS flow applies
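Step 6 can be sketched as an HTTP middleware. This is a minimal illustration, not the platform's actual code: `TenantLookup`, `normalizeHost`, and `TenantMiddleware` are hypothetical names, the lookup function stands in for the `SELECT tenant_id ...` query, and the 404 status for unknown hosts is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
)

// TenantLookup is a hypothetical abstraction over the custom_domains query:
// it returns the tenant ID for a hostname whose status is 'active'.
type TenantLookup func(hostname string) (tenantID string, ok bool)

// normalizeHost strips an optional port and trailing dot, then lowercases.
func normalizeHost(hostport string) string {
	host := hostport
	if h, _, err := net.SplitHostPort(hostport); err == nil {
		host = h
	}
	return strings.ToLower(strings.TrimSuffix(host, "."))
}

// TenantMiddleware resolves the tenant from the Host header and injects
// X-Tenant-ID before handing the request to the upstream handler.
func TenantMiddleware(lookup TenantLookup, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tenantID, ok := lookup(normalizeHost(r.Host))
		if !ok {
			// Status code is an assumption; the message mirrors the default page text.
			http.Error(w, "Domain not configured for this platform", http.StatusNotFound)
			return
		}
		// Set (not Add) overwrites any client-supplied value, so the header cannot be spoofed.
		r.Header.Set("X-Tenant-ID", tenantID)
		next.ServeHTTP(w, r)
	})
}

func main() {
	active := map[string]string{"mycompany.com": "tenant-123"}
	lookup := func(h string) (string, bool) { id, ok := active[h]; return id, ok }

	upstream := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("X-Tenant-ID"))
	})
	h := TenantMiddleware(lookup, upstream)

	req := httptest.NewRequest(http.MethodGet, "http://MyCompany.com:443/", nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```

Overwriting the header rather than appending to it matters here: upstream services trust X-Tenant-ID, so a value sent by the client must never survive the middleware.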
Domain Status State Machine
| Status | Meaning |
|---|---|
| pending | Domain registered, waiting for tenant to configure DNS |
| dns_check | DNS TXT record found, verification in progress |
| verified | DNS verified, ACME SSL challenge initiated |
| active | SSL issued, traffic routed through platform |
| failed | Verification timed out (72h) or max retries exceeded |
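One way to enforce these states in code is an explicit transition table. The sketch below is an assumption-laden illustration: the type and function names are hypothetical, the forward chain follows the schema comment (pending → dns_check → verified → active / failed), and the `failed → pending` retry edge is assumed from the "instructions to retry" behavior rather than documented.

```go
package main

import "fmt"

// DomainStatus mirrors the values stored in custom_domains.status.
type DomainStatus string

const (
	StatusPending  DomainStatus = "pending"
	StatusDNSCheck DomainStatus = "dns_check"
	StatusVerified DomainStatus = "verified"
	StatusActive   DomainStatus = "active"
	StatusFailed   DomainStatus = "failed"
)

// allowedTransitions lists the permitted next states for each state.
// The failed → pending edge is an assumed retry path.
var allowedTransitions = map[DomainStatus][]DomainStatus{
	StatusPending:  {StatusDNSCheck, StatusFailed},
	StatusDNSCheck: {StatusVerified, StatusFailed},
	StatusVerified: {StatusActive, StatusFailed},
	StatusFailed:   {StatusPending},
}

// CanTransition reports whether moving from one status to another is allowed.
func CanTransition(from, to DomainStatus) bool {
	for _, next := range allowedTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(StatusPending, StatusDNSCheck))
	fmt.Println(CanTransition(StatusPending, StatusActive))
}
```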
SSL Status State Machine
| Status | Meaning |
|---|---|
| none | No SSL attempted yet |
| validating | ACME challenge in progress |
| issued | Certificate issued and loaded into Envoy |
| expiring | Certificate expires within 30 days; renewal triggered |
| renewed | Certificate renewed successfully |
| error | ACME challenge failed (rate limit, DNS misconfiguration) |
DNS Verification
| Parameter | Value |
|---|---|
| TXT record name | _platform-verify.{hostname} |
| TXT record value | verify-{token} |
| Polling interval | Every 30 seconds (DOMAINS_DNS_CHECK_INTERVAL_SEC) |
| Maximum verification window | 72 hours (DOMAINS_VERIFY_TTL_HOURS). Expired → status = 'failed' |
| Fallback verification | If CNAME already points to the platform → HTTP-01 challenge acts as functional verification |
On failure, the tenant receives a Notify notification, and the
domain UI shows the status 'failed' with instructions to retry.
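A single verification pass might look like the sketch below. It assumes the TXT value is exactly `verify-` followed by the stored token, as in the table above. `TXTLookup`, `checkDomain`, and `expired` are hypothetical names; in production the lookup would typically be `net.DefaultResolver.LookupTXT`, which has the same signature.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"strings"
	"time"
)

// TXTLookup matches the signature of (*net.Resolver).LookupTXT so that a
// fake resolver can be substituted in tests.
type TXTLookup func(ctx context.Context, name string) ([]string, error)

// verificationName builds the TXT record name for a hostname.
func verificationName(hostname string) string {
	return "_platform-verify." + hostname
}

// tokenMatches reports whether any TXT record equals "verify-{token}".
func tokenMatches(records []string, token string) bool {
	want := "verify-" + token
	for _, r := range records {
		if strings.TrimSpace(r) == want {
			return true
		}
	}
	return false
}

// checkDomain performs one DNS check. A missing record is not an error:
// it means "not verified yet", so the worker simply retries on the next tick.
func checkDomain(ctx context.Context, lookup TXTLookup, hostname, token string) (bool, error) {
	records, err := lookup(ctx, verificationName(hostname))
	if err != nil {
		var dnsErr *net.DNSError
		if errors.As(err, &dnsErr) && dnsErr.IsNotFound {
			return false, nil
		}
		return false, err
	}
	return tokenMatches(records, token), nil
}

// expired reports whether the verification window (72 hours by default) has passed.
func expired(createdAt, now time.Time, ttl time.Duration) bool {
	return now.Sub(createdAt) >= ttl
}

func main() {
	// Fake resolver so the example runs without network access.
	fake := func(ctx context.Context, name string) ([]string, error) {
		return []string{"v=spf1 -all", "verify-abc123def456"}, nil
	}
	ok, err := checkDomain(context.Background(), fake, "mycompany.com", "abc123def456")
	fmt.Println(ok, err)

	created := time.Date(2026, 4, 22, 2, 0, 0, 0, time.UTC)
	fmt.Println(expired(created, created.Add(73*time.Hour), 72*time.Hour))
}
```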
SSL — Let's Encrypt ACME
| Parameter | Value |
|---|---|
| Primary challenge | HTTP-01: Envoy answers /.well-known/acme-challenge/{token} |
| Wildcard domains | DNS-01 challenge (TXT _acme-challenge.{hostname}) |
| Certificate storage | HashiCorp Vault only — private keys never on disk, never in Kubernetes Secrets |
| Auto-renewal | Background worker triggers renewal 30 days before expiry (DOMAINS_SSL_RENEW_DAYS_BEFORE) |
| Zombie protection | failure_count tracked per domain. Auto-paused at 10 failures (DOMAINS_MAX_SSL_RETRIES). Exponential backoff between attempts. |
| Let's Encrypt rate limits | 50 certificates per registered domain per week · 300 new orders per account per 3 hours · 5 failed validations per hostname per account per hour |
| Enterprise bypass | External Account Binding (EAB) with Key ID + HMAC key, for ACME CAs that require it (for example ZeroSSL). Let's Encrypt itself does not use EAB. |
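The retry and renewal rules in this table can be expressed as two small functions. This is a sketch under stated assumptions: the 15-minute base delay and 24-hour cap are illustrative values not given in the source (chosen so that retries for one hostname stay below five failed validations per hour), and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

const (
	maxSSLRetries   = 10 // DOMAINS_MAX_SSL_RETRIES
	renewDaysBefore = 30 // DOMAINS_SSL_RENEW_DAYS_BEFORE

	baseDelay = 15 * time.Minute // assumed, not from the source
	maxDelay  = 24 * time.Hour   // assumed, not from the source
)

// nextRetryDelay returns the wait before the next ACME attempt.
// The second result is false once the domain should be auto-paused.
func nextRetryDelay(failureCount int) (time.Duration, bool) {
	if failureCount >= maxSSLRetries {
		return 0, false
	}
	if failureCount < 0 {
		failureCount = 0
	}
	// Exponential backoff: 15m, 30m, 1h, 2h, ... capped at maxDelay.
	d := baseDelay << uint(failureCount)
	if d > maxDelay {
		d = maxDelay
	}
	return d, true
}

// needsRenewal reports whether the certificate is inside the renewal window.
func needsRenewal(certExpiry, now time.Time) bool {
	return certExpiry.Sub(now) <= renewDaysBefore*24*time.Hour
}

func main() {
	for _, n := range []int{0, 2, 9, 10} {
		d, ok := nextRetryDelay(n)
		fmt.Println(n, d, ok)
	}
	now := time.Date(2026, 4, 22, 0, 0, 0, 0, time.UTC)
	fmt.Println(needsRenewal(now.AddDate(0, 0, 20), now))
}
```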
Envoy xDS Integration
The domain-resolver Go service acts as an xDS control plane,
pushing configuration to Envoy without restarts:
| xDS component | Description |
|---|---|
| SDS (Secret Discovery Service) | On-demand certificate loading. Envoy does not load all certificates at startup — it fetches lazily when a new hostname first appears. |
| LDS / RDS | Dynamic Listener + Route: hostname → upstream cluster mapping. Updated without Envoy restart. |
| Go control plane | Uses go-control-plane library. Listens for PostgreSQL LISTEN/NOTIFY events → generates xDS Snapshot → Envoy picks up via gRPC stream. |
| Delta xDS | Phase 2+: incremental updates instead of full snapshots (required at > 1,000 domains to avoid control plane overload). |
PostgreSQL Schema
CREATE TABLE custom_domains (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
hostname VARCHAR(255) UNIQUE NOT NULL,
verification_token TEXT NOT NULL,
status VARCHAR(50) NOT NULL DEFAULT 'pending',
-- pending → dns_check → verified → active / failed
ssl_status VARCHAR(50) NOT NULL DEFAULT 'none',
-- none → validating → issued → expiring → renewed / error
cert_expiry TIMESTAMPTZ,
failure_count INT DEFAULT 0,
metadata JSONB DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
-- Hostname lookup (Envoy → tenant resolution) is served by the unique index
-- that the UNIQUE constraint on hostname creates automatically, so no
-- separate hostname index is declared here.
-- Partial index: SSL renewal worker scans only validating/expiring rows
CREATE INDEX idx_domains_ssl_pending
ON custom_domains(ssl_status)
WHERE ssl_status IN ('validating', 'expiring');
-- RLS: tenant sees only their own domains
ALTER TABLE custom_domains ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation_domains ON custom_domains
USING (tenant_id = current_setting('app.current_tenant')::uuid);
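The policy above only works if `app.current_tenant` is set before each query. A minimal sketch of how a Go service might do that is shown below; `execer`, `setTenant`, and `recordingTx` are hypothetical names, and the example uses a recording fake instead of a real database connection. PostgreSQL's `set_config(name, value, is_local)` with `is_local = true` scopes the setting to the current transaction, so a pooled connection does not carry one tenant's ID into another tenant's request.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// execer is the subset of *sql.Tx (and *sql.Conn) that setTenant needs.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// setTenant binds app.current_tenant for the current transaction only.
// Call it first inside every transaction that touches custom_domains.
func setTenant(ctx context.Context, tx execer, tenantID string) error {
	_, err := tx.ExecContext(ctx,
		"SELECT set_config('app.current_tenant', $1, true)", tenantID)
	return err
}

// recordingTx is a test double that records the last statement it received.
type recordingTx struct {
	query string
	args  []any
}

func (r *recordingTx) ExecContext(_ context.Context, q string, args ...any) (sql.Result, error) {
	r.query, r.args = q, args
	return nil, nil
}

// demo runs setTenant against the recording fake.
func demo() (*recordingTx, error) {
	rec := &recordingTx{}
	err := setTenant(context.Background(), rec, "7f1c2a9e-0000-4000-8000-000000000001")
	return rec, err
}

func main() {
	rec, err := demo()
	fmt.Println(rec.query, rec.args, err)
}
```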
REST API
Base Path
For layout brevity, the /api/v1 base path prefix is omitted
from the endpoint table below.
| Method | Endpoint | Description |
|---|---|---|
| POST | /domains | Register domain. Body: { "hostname": "mycompany.com" }. Response includes verification_token and DNS setup instructions. |
| GET | /domains | List all domains for this tenant (paginated) |
| GET | /domains/:id | Get domain detail: verification status, SSL status, expiry |
| GET | /domains/:id/verify | Trigger DNS verification immediately (outside worker schedule) |
| DELETE | /domains/:id | Detach domain: soft delete + SSL certificate revoke |
Register Domain — Example
POST https://api.septemcore.com/v1/domains
Authorization: Bearer <access_token>
Content-Type: application/json
{
"hostname": "mycompany.com"
}
Response 201 Created:
{
"id": "01j9pdom0000000000000000",
"hostname": "mycompany.com",
"status": "pending",
"sslStatus": "none",
"verificationToken": "verify-abc123def456",
"instructions": {
"subdomain": {
"type": "CNAME",
"name": "mycompany.com",
"value": "custom.platform.io"
},
"apex": {
"type": "A",
"name": "mycompany.com",
"value": "76.76.21.21"
},
"verification": {
"type": "TXT",
"name": "_platform-verify.mycompany.com",
"value": "verify-abc123def456"
}
},
"createdAt": "2026-04-22T02:00:00Z"
}
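A Go client for this call could be sketched as follows. The struct layout is inferred from the example response above and is not an official SDK; `newRegisterRequest` and `decodeDomain` are hypothetical helpers. The example only builds the request and decodes a sample body, so it runs without network access.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Instruction is one DNS record the tenant must create.
type Instruction struct {
	Type  string `json:"type"`
	Name  string `json:"name"`
	Value string `json:"value"`
}

// Domain mirrors the fields of the 201 Created response.
type Domain struct {
	ID                string                 `json:"id"`
	Hostname          string                 `json:"hostname"`
	Status            string                 `json:"status"`
	SSLStatus         string                 `json:"sslStatus"`
	VerificationToken string                 `json:"verificationToken"`
	Instructions      map[string]Instruction `json:"instructions"`
	CreatedAt         string                 `json:"createdAt"`
}

// newRegisterRequest builds the POST /domains request for a hostname.
func newRegisterRequest(baseURL, accessToken, hostname string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"hostname": hostname})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/domains", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+accessToken)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// decodeDomain parses a response body into a Domain.
func decodeDomain(data []byte) (Domain, error) {
	var d Domain
	err := json.Unmarshal(data, &d)
	return d, err
}

func main() {
	req, err := newRegisterRequest("https://api.septemcore.com/v1", "<access_token>", "mycompany.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())

	sample := []byte(`{"id":"01j9pdom0000000000000000","hostname":"mycompany.com",
		"status":"pending","sslStatus":"none",
		"instructions":{"apex":{"type":"A","name":"mycompany.com","value":"76.76.21.21"}}}`)
	d, err := decodeDomain(sample)
	fmt.Println(d.Status, d.Instructions["apex"].Value, err)
}
```

To send the request, pass it to `http.DefaultClient.Do(req)` and feed the response body to `decodeDomain` after checking for status 201.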
Limits and Constraints
| Parameter | Value |
|---|---|
| Max domains per tenant | 5 by default (DOMAINS_MAX_PER_TENANT). Determined by Billing plan. |
| Max hostname length | 253 characters (RFC 1035) |
| Blocked domains | *.platform.io, *.localhost, bare IP addresses |
| Audit | All operations (add/verify/delete/ssl_issue/ssl_renew) → Audit Service |
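The hostname constraints above could be checked at registration time roughly as follows. This is a sketch, not the service's validator: `validateHostname` is a hypothetical name, rejecting the bare names `platform.io` and `localhost` in addition to their subdomains is an assumption, and the 63-character label limit comes from RFC 1035 rather than from this table.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

const (
	maxHostnameLen = 253 // RFC 1035, text form
	maxLabelLen    = 63  // RFC 1035, per label
)

// validateHostname applies the length and blocklist rules to a candidate hostname.
func validateHostname(raw string) error {
	host := strings.ToLower(strings.TrimSuffix(strings.TrimSpace(raw), "."))
	switch {
	case host == "":
		return errors.New("hostname is empty")
	case len(host) > maxHostnameLen:
		return errors.New("hostname exceeds 253 characters")
	case net.ParseIP(host) != nil:
		return errors.New("bare IP addresses are not allowed")
	case host == "localhost" || strings.HasSuffix(host, ".localhost"):
		return errors.New("localhost names are not allowed")
	case host == "platform.io" || strings.HasSuffix(host, ".platform.io"):
		return errors.New("platform.io names are reserved")
	}
	for _, label := range strings.Split(host, ".") {
		if len(label) == 0 || len(label) > maxLabelLen {
			return errors.New("each label must be 1 to 63 characters")
		}
	}
	return nil
}

func main() {
	for _, h := range []string{"mycompany.com", "tenant-123.platform.io", "76.76.21.21", "app.localhost"} {
		fmt.Printf("%-24s %v\n", h, validateHostname(h))
	}
}
```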
Scaling Phases
| Phase | Domain count | Architecture |
|---|---|---|
| Phase 1 | 0 – 1,000 | Shared Envoy fleet, full xDS snapshots |
| Phase 2 | 1k – 50k | Distributed Envoy + Anycast, Delta xDS, Vault SDS |
| Phase 3 | 50k+ | Cell-based infrastructure, multi-region, SPIFFE/SPIRE |
Error Reference
| Situation | Behavior |
|---|---|
| DNS not configured | UI shows status "Pending DNS" with setup instructions and "Verify now" button |
| SSL pending | Status shown in UI, traffic not routed until issued |
| Verification timeout (72h) | status = 'failed', Notify notification sent to tenant admin |
| Zombie client | failure_count >= 10 → auto-pause + alert to Platform Admin |
| Certificate expired | Auto-renew via ACME. On failure → fallback to platform subdomain |
| Domain not yet active | Envoy serves default page: "Domain not configured for this platform" |