WebSocket Protocol Reference

The WebSocket API is the real-time delivery channel for the kernel.notify() primitive. It operates on a persistent, authenticated connection and delivers notification events pushed from the Notify Service over a tenant-isolated channel.

WebSocket is the only real-time push channel; all other delivery mechanisms (email, SMS, in-app push) are dispatched asynchronously via RabbitMQ. This page documents the complete wire protocol as implemented in services/notify/internal/ws/.


Connection URL

wss://api.septemcore.com/v1/ws

The endpoint is served by the Notify Service (NOTIFY_HTTP_PORT=8095) behind the Envoy Gateway. The Gateway terminates TLS and forwards the upgrade request with Upgrade: websocket intact.

| Parameter | Value |
|---|---|
| Protocol | WebSocket (RFC 6455) |
| Transport | TLS 1.3 (Envoy terminates) |
| Message encoding | JSON (UTF-8 text frames only) |
| Binary frames | Not supported; rejected silently |
| Sub-protocols | None |
| Max message size | 4096 bytes (MaxMessageSize = 4096) |

Connection Lifecycle

[Client] [Notify Service]

connect wss://api.septemcore.com/v1/ws
──────────────────────────────────────►
TCP + TLS handshake
WebSocket upgrade (101 Switching Protocols)
◄──────────────────────────────────────

◄── {"type":"auth_required"}

{"type":"auth","token":"<JWT>"}
─────────────────────────────►
JWT validation via IAM gRPC ValidateToken
◄── {"type":"auth_ok","connId":"ws-1745012345"}

{"type":"subscribe","channels":["notifications","alerts"]}
──────────────────────────────────────────────────────────►
◄── {"type":"subscribed"}

◄── {"type":"notification",...} (asynchronous)

{"type":"ping"}
───────────────►
◄── {"type":"pong"}

┌── server ping every 30s (WebSocket protocol Ping frame) ──┐
│ client must respond Pong within 10s │
│ 2 missed pongs → close(4408) │
└────────────────────────────────────────────────────────────┘

Handshake — Step by Step

| Step | Direction | Message type | Notes |
|---|---|---|---|
| 1 | Server → Client | auth_required | Sent immediately on upgrade |
| 2 | Client → Server | auth | 10-second timeout from step 1. Includes <JWT> |
| 3 | Server → Client | auth_ok | JWT valid + tenantId/userId extracted. Includes connId |
| 3a | Server → Client | auth_error | Invalid token → connection closes (1008) |
| 4 | Client → Server | subscribe | Provides array of channels |
| 5 | Server → Client | subscribed | Subscription registered in Hub |
| 6 | Server → Client | notification | Asynchronous, pushed by Hub.Broadcast |

Auth timeout: If the client does not send {"type":"auth"} within 10 seconds of receiving auth_required, the server closes the connection with StatusPolicyViolation (1008).
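
The handshake table can be read as a small client-side state machine. The sketch below is illustrative only: the type and function names (`envelope`, `NextClientMessage`) are hypothetical and not part of the Notify Service code. It decides which frame a client sends in reply to each server frame, using only the JSON shapes documented on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope reads the "type" discriminator plus the optional "error" field.
type envelope struct {
	Type  string `json:"type"`
	Error string `json:"error"`
}

// authMsg and subscribeMsg mirror the documented client → server shapes.
type authMsg struct {
	Type  string `json:"type"`
	Token string `json:"token"`
}

type subscribeMsg struct {
	Type     string   `json:"type"`
	Channels []string `json:"channels"`
}

// NextClientMessage (hypothetical helper) returns the frame the client should
// send in reply to a server frame, or nil when no reply is expected.
func NextClientMessage(serverFrame []byte, token string, channels []string) ([]byte, error) {
	var env envelope
	if err := json.Unmarshal(serverFrame, &env); err != nil {
		return nil, fmt.Errorf("decode server frame: %w", err)
	}
	switch env.Type {
	case "auth_required": // step 1 → step 2
		return json.Marshal(authMsg{Type: "auth", Token: token})
	case "auth_ok": // step 3 → step 4
		return json.Marshal(subscribeMsg{Type: "subscribe", Channels: channels})
	case "auth_error": // step 3a: the server closes with 1008 afterwards
		return nil, fmt.Errorf("authentication rejected: %s", env.Error)
	default: // subscribed, notification, pong: nothing to send
		return nil, nil
	}
}

func main() {
	frames := []string{
		`{"type":"auth_required"}`,
		`{"type":"auth_ok","connId":"ws-1"}`,
		`{"type":"subscribed"}`,
	}
	for _, f := range frames {
		reply, err := NextClientMessage([]byte(f), "example-token", []string{"notifications", "alerts"})
		fmt.Printf("%s -> %s (err=%v)\n", f, reply, err)
	}
}
```

A real client would also start a 10-second deadline of its own after receiving auth_required, so that it never relies on the server-side timeout to notice a stalled login.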


Message Reference

Client → Server Messages

auth

Authenticates the connection. Must be sent within 10 seconds of auth_required.

{
  "type": "auth",
  "token": "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9..."
}

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Must be "auth" |
| token | string | Yes | JWT ES256 access token (without "Bearer " prefix) |

subscribe

Subscribes to one or more notification channels.

{
  "type": "subscribe",
  "channels": ["notifications", "alerts", "system"]
}

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Must be "subscribe" |
| channels | string[] | Yes | Channel names. Max 50 per connection. |

Channel namespace: Channel names are automatically namespaced by the server with the authenticated tenantId:

channels: ["notifications"]
→ namespaced: "{tenantId}:notifications"

Clients never see the namespaced form — they use bare channel names. Cross-tenant delivery is impossible at the Hub level (Hub broadcasts only to tenantID-matching connections).
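
As a rough illustration of the namespacing rule, the following sketch shows the mapping in both directions. The function names `Namespace` and `Bare` are hypothetical; only the `{tenantId}:{channel}` key format comes from this page.

```go
package main

import (
	"fmt"
	"strings"
)

// Namespace builds the server-side key "{tenantId}:{channel}" from a bare
// channel name. Hypothetical helper; only the key format is documented.
func Namespace(tenantID, channel string) string {
	return tenantID + ":" + channel
}

// Bare strips the tenant prefix again so that clients only ever see the bare
// name. ok is false when the key does not belong to tenantID.
func Bare(tenantID, key string) (channel string, ok bool) {
	prefix := tenantID + ":"
	if !strings.HasPrefix(key, prefix) {
		return "", false
	}
	return strings.TrimPrefix(key, prefix), true
}

func main() {
	key := Namespace("tenant-42", "notifications")
	fmt.Println(key)

	ch, ok := Bare("tenant-42", key)
	fmt.Println(ch, ok)

	// A key from another tenant is never resolved to a bare name.
	_, ok = Bare("tenant-42", "tenant-7:notifications")
	fmt.Println(ok)
}
```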


ping

Client-initiated heartbeat. Server responds with pong. Optional — the server also sends its own WebSocket protocol-level pings.

{"type": "ping"}

replay

Requests missed messages since the last known message ID. Used on reconnect to recover messages that were published while the connection was down.

{
  "type": "replay",
  "lastMessageId": "msg-1745012300000000001"
}

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Must be "replay" |
| lastMessageId | string | Yes | ID of the last message received before disconnect |

Replay is backed by Valkey (ws:replay:{tenantId}:{channel} key). The server replays all messages stored after lastMessageId for all channels of this tenant. The replay buffer is capped at 100 messages per channel; once the cap is reached, the oldest messages are dropped first.


Server → Client Messages

auth_required

Sent immediately upon successful WebSocket upgrade. Signals that the connection is unauthenticated and the client must send auth.

{"type": "auth_required"}

auth_ok

Sent after successful JWT validation. Includes the server-assigned connection ID.

{
  "type": "auth_ok",
  "connId": "ws-1745012345678901234"
}

| Field | Type | Description |
|---|---|---|
| connId | string | Unique connection identifier. Use in logs and support tickets. |

auth_error

Sent when JWT validation fails. Connection is closed immediately after.

{
  "type": "auth_error",
  "error": "invalid token"
}

subscribed

Confirms that the channel subscription was registered.

{"type": "subscribed"}

notification

Push message delivered to subscribed channels. The payload field is channel-specific and defined by the originating service.

{
  "id": "msg-1745012345678901234",
  "channel": "notifications",
  "type": "notification",
  "payload": {
    "title": "Payment received",
    "body": "Your invoice #INV-2026-042 has been paid.",
    "severity": "info",
    "action_url": "https://api.septemcore.com/v1/billing/invoices/INV-2026-042"
  },
  "timestamp": "2026-04-22T10:56:00Z"
}

| Field | Type | Description |
|---|---|---|
| id | string | Unique message ID. Used in replay.lastMessageId. |
| channel | string | Bare channel name (without tenant prefix). |
| type | string | Always "notification" for pushed messages. |
| payload | object | Notification data. Shape defined per channel. |
| timestamp | string | ISO 8601 UTC. Server-set delivery time. |
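
A client written in Go might decode this frame roughly as follows. The struct and function names are hypothetical; the field names and types follow the table above. The payload is kept as raw JSON because its shape is channel-specific.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Notification mirrors the documented fields of a pushed message.
type Notification struct {
	ID        string          `json:"id"`
	Channel   string          `json:"channel"`
	Type      string          `json:"type"`
	Payload   json.RawMessage `json:"payload"`   // decoded later, per channel
	Timestamp time.Time       `json:"timestamp"` // ISO 8601 / RFC 3339, UTC
}

// DecodeNotification (hypothetical helper) parses a text frame and rejects
// anything whose type is not "notification".
func DecodeNotification(frame []byte) (Notification, error) {
	var n Notification
	if err := json.Unmarshal(frame, &n); err != nil {
		return Notification{}, fmt.Errorf("decode notification: %w", err)
	}
	if n.Type != "notification" {
		return Notification{}, fmt.Errorf("unexpected message type %q", n.Type)
	}
	return n, nil
}

func main() {
	frame := []byte(`{
	  "id": "msg-1745012345678901234",
	  "channel": "notifications",
	  "type": "notification",
	  "payload": {"title": "Payment received", "severity": "info"},
	  "timestamp": "2026-04-22T10:56:00Z"
	}`)
	n, err := DecodeNotification(frame)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// n.ID is the value to remember for a later replay request.
	fmt.Println(n.ID, n.Channel, n.Timestamp.UTC().Format(time.RFC3339))
}
```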

pong

Response to a client ping.

{"type": "pong"}

replay_error

Sent when replay is unavailable (replay store not configured or internal error).

{
  "type": "replay_error",
  "error": "replay not available"
}

Heartbeat — Keepalive

The server sends a WebSocket protocol-level Ping frame every 30 seconds (PingInterval = 30s). The client must respond with a Pong frame within 10 seconds (PongTimeout = 10s).

| Parameter | Value | Source |
|---|---|---|
| Ping interval | 30 seconds | ws.PingInterval = 30 * time.Second |
| Pong timeout | 10 seconds | ws.PongTimeout = 10 * time.Second |
| Max missed pongs | 2 | ws.MaxMissedPongs = 2 |
| Close code on timeout | 4408 | ws.CloseCodeMissedPong = 4408 |

After MaxMissedPongs consecutive missed pongs, the server closes the connection with close code 4408. This is a custom application-level code in the 4000–4999 range (reserved for application use per RFC 6455).

The 2-missed-pong threshold means the effective heartbeat timeout is PingInterval + MaxMissedPongs × PongTimeout = 30 + 2 × 10 = 50 seconds.
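
The arithmetic above can be reproduced with the documented constants. This sketch only restates the formula given on this page; the constants are redeclared locally and the function name `EffectiveHeartbeatTimeout` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Local copies of the documented ws.* constants.
const (
	PingInterval   = 30 * time.Second
	PongTimeout    = 10 * time.Second
	MaxMissedPongs = 2
)

// EffectiveHeartbeatTimeout evaluates the formula stated on this page:
// PingInterval + MaxMissedPongs × PongTimeout.
func EffectiveHeartbeatTimeout() time.Duration {
	return PingInterval + MaxMissedPongs*PongTimeout
}

func main() {
	fmt.Println(EffectiveHeartbeatTimeout()) // prints "50s"
}
```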


Reconnect and Replay

Recommended reconnect strategy:

  1. Client detects connection close (any code).
  2. Wait min(2^attempt × 1s, 60s) (exponential backoff with 60s cap).
  3. Re-establish connection: full auth handshake → subscribe → send replay with the last known message.id.
  4. Server replays missing messages from Valkey buffer.

{
  "type": "replay",
  "lastMessageId": "msg-1745009999000000001"
}
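
The delay in step 2 could be computed as in the sketch below. The function name `Backoff` is hypothetical, and counting attempts from 0 is an assumption; the formula min(2^attempt × 1s, 60s) is the one given in the list above.

```go
package main

import (
	"fmt"
	"time"
)

// Backoff returns min(2^attempt × 1s, 60s). attempt is assumed to start at 0
// for the first reconnect try.
func Backoff(attempt int) time.Duration {
	const (
		base     = time.Second
		maxDelay = 60 * time.Second
	)
	if attempt < 0 {
		attempt = 0
	}
	// 2^6 s = 64 s already exceeds the cap, and returning early also avoids
	// overflowing the shift for very large attempt counts.
	if attempt >= 6 {
		return maxDelay
	}
	return base << attempt
}

func main() {
	for attempt := 0; attempt <= 7; attempt++ {
		fmt.Printf("attempt %d: wait %s\n", attempt, Backoff(attempt))
	}
}
```

Many clients add random jitter on top of this delay so that a fleet of connections dropped at the same moment does not reconnect in lockstep; that is a common practice, not something this page prescribes.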

Replay limitations:

| Parameter | Value |
|---|---|
| Replay buffer depth | 100 messages per channel (oldest dropped first) |
| Replay storage | Valkey LPUSH / LRANGE on ws:replay:{tenantId}:{channel} |
| Messages older than buffer | Not replayed; client must poll REST API |
| REST fallback | GET https://api.septemcore.com/v1/notifications?since=<ISO> |

Close Codes

| Code | Meaning | Who closes |
|---|---|---|
| 1000 | Normal closure | Either side |
| 1008 | Policy violation: auth timeout, invalid token | Server |
| 4408 | Heartbeat timeout (2 missed pongs) | Server |

Standard WebSocket close codes 1001–1007 may appear from Envoy (network errors, protocol violations, etc.) and should be treated as transient: reconnect.
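
A client might map these codes to a reaction along the following lines. This is a sketch: the `CloseInfo` type and `Classify` function are hypothetical, and obtaining a fresh access token after a 1008 close is an assumption about sensible client behaviour rather than a documented requirement.

```go
package main

import "fmt"

// CloseInfo describes how a client could react to a close code.
type CloseInfo struct {
	Meaning      string
	RefreshToken bool // assumption: get a new JWT before reconnecting
}

// Classify maps the close codes listed on this page to a reaction.
func Classify(code int) CloseInfo {
	switch {
	case code == 1000:
		return CloseInfo{Meaning: "normal closure"}
	case code == 1008:
		return CloseInfo{Meaning: "policy violation (auth timeout or invalid token)", RefreshToken: true}
	case code == 4408:
		return CloseInfo{Meaning: "heartbeat timeout (2 missed pongs)"}
	case code >= 1001 && code <= 1007:
		return CloseInfo{Meaning: "transient gateway or network condition"}
	default:
		return CloseInfo{Meaning: "unknown close code"}
	}
}

func main() {
	for _, code := range []int{1000, 1006, 1008, 4408, 4999} {
		info := Classify(code)
		fmt.Printf("%d: %s (refresh token: %v)\n", code, info.Meaning, info.RefreshToken)
	}
}
```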


Tenant Isolation

All connections, channels, and message dispatch are fully tenant-isolated at the Hub level:

| Mechanism | Implementation |
|---|---|
| Connection scope | Each Conn carries TenantID extracted from JWT |
| Channel namespace | channel → {tenantID}:{channel} before Hub registration |
| Broadcast routing | Hub.Broadcast(tenantID, channel, data) only delivers to connections matching tenantID |
| Cross-tenant delivery | Architecturally impossible; Hub maps are keyed by tenantID |
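
The idea of keying Hub maps by tenant can be sketched with a toy in-memory hub. This is not the code in hub.go: the types, fields, and the `Subscribe` method are hypothetical, the `Broadcast` signature only loosely follows the one named in the table, and the sketch leaves out locking, which a concurrent hub would need.

```go
package main

import "fmt"

// conn stands in for a WebSocket connection; inbox records delivered frames.
type conn struct {
	tenantID string
	inbox    [][]byte
}

// hub indexes subscribers as tenantID → channel → connections, so a lookup
// for one tenant can never return another tenant's connections.
type hub struct {
	subs map[string]map[string][]*conn
}

func newHub() *hub {
	return &hub{subs: make(map[string]map[string][]*conn)}
}

// Subscribe registers c under its own tenant only.
func (h *hub) Subscribe(c *conn, channel string) {
	chans, ok := h.subs[c.tenantID]
	if !ok {
		chans = make(map[string][]*conn)
		h.subs[c.tenantID] = chans
	}
	chans[channel] = append(chans[channel], c)
}

// Broadcast delivers data to the tenant's subscribers of channel and returns
// how many connections received it.
func (h *hub) Broadcast(tenantID, channel string, data []byte) int {
	delivered := 0
	for _, c := range h.subs[tenantID][channel] {
		c.inbox = append(c.inbox, data)
		delivered++
	}
	return delivered
}

func main() {
	h := newHub()
	a := &conn{tenantID: "tenant-a"}
	b := &conn{tenantID: "tenant-b"}
	h.Subscribe(a, "alerts")
	h.Subscribe(b, "alerts")

	n := h.Broadcast("tenant-a", "alerts", []byte(`{"type":"notification"}`))
	fmt.Println("delivered:", n, "a:", len(a.inbox), "b:", len(b.inbox))
}
```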

Limits

| Limit | Value | Config / source |
|---|---|---|
| Max connections per tenant | 1000 | NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT=1000 |
| Max broadcast rate per tenant | 200 msg/sec | NOTIFY_WS_RATE_PER_TENANT=200 |
| Max message size (incoming) | 4096 bytes | ws.MaxMessageSize = 4096 |
| Per-connection send buffer | 256 messages | ws.SendBufferSize = 256 |
| Auth timeout | 10 seconds | Hardcoded in handler |
| Max channels per connection | 50 | Enforced at subscribe |
| Replay buffer depth | 100 messages / channel | ReplayStore.Push cap |

When maxConnsPerTenant is reached, the new connection is closed immediately after registration — no error message is sent. The client will see a clean close and should apply exponential backoff before reconnecting.


Implementation Notes

| Component | Path |
|---|---|
| WebSocket handler | services/notify/internal/ws/handler.go |
| Hub (connection manager) | services/notify/internal/ws/hub.go |
| Protocol constants | ws.PingInterval, ws.PongTimeout, ws.MaxMissedPongs, ws.CloseCodeMissedPong |
| Config | services/notify/internal/config/config.go |
| Integration tests | services/notify/tests/integration/websocket_test.go |
| WebSocket library | github.com/coder/websocket (nhooyr/websocket fork, maintained as of April 2026) |