
Rate Limiting

Rate limiting in the Notify Service protects external providers (email, SMS, Telegram) from module bugs and prevents one tenant from degrading the delivery performance of all others.

Key principle: A notification is never silently dropped when a rate limit is hit. A 429 response means the notification was queued with a delay and will be delivered — it is not lost.


HTTP API Rate Limits

Per-Tenant

All send() and sendBatch() calls from all modules of a tenant are counted together against the tenant limit:

| Parameter | Value | Env override |
| --- | --- | --- |
| Rate | 100 notifications / minute | NOTIFY_RATE_LIMIT_PER_TENANT |
| Window | Rolling 60-second window | |
| Scope | All modules of the tenant combined | |

Per-Module

Each individual module has its own sub-limit within the tenant limit:

| Parameter | Value |
| --- | --- |
| Rate | 50 notifications / minute |
| Scope | Per module (identified by module.id from JWT claims) |

This means a module with a bug that calls send() in a tight loop cannot exhaust the tenant's entire 100/min allowance — it is capped at 50, leaving headroom for other modules.
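The interaction between the two limits can be sketched as a rolling-window counter that checks the module cap before the tenant cap. This is a minimal illustration under stated assumptions, not the service's actual implementation: the class name, the in-memory storage, and the decision labels are all hypothetical.

```typescript
// Hypothetical sketch of the two-level check; not the Notify Service's real code.
const WINDOW_MS = 60000;  // rolling 60-second window
const TENANT_LIMIT = 100; // default of NOTIFY_RATE_LIMIT_PER_TENANT
const MODULE_LIMIT = 50;  // default per-module sub-limit

type Decision = "accepted" | "module_limited" | "tenant_limited";

class RollingWindowLimiter {
  // Timestamps (ms) of accepted notifications, per tenant and per tenant/module.
  private hits: Record<string, number[]> = {};

  // Drop timestamps that have left the window and return the ones still inside it.
  private recent(key: string, now: number): number[] {
    const kept = (this.hits[key] || []).filter((t) => now - t < WINDOW_MS);
    this.hits[key] = kept;
    return kept;
  }

  check(tenantId: string, moduleId: string, now: number): Decision {
    const tenantHits = this.recent(`t:${tenantId}`, now);
    const moduleHits = this.recent(`m:${tenantId}/${moduleId}`, now);
    if (moduleHits.length >= MODULE_LIMIT) return "module_limited";
    if (tenantHits.length >= TENANT_LIMIT) return "tenant_limited";
    // Only accepted notifications consume allowance at both levels.
    tenantHits.push(now);
    moduleHits.push(now);
    return "accepted";
  }
}
```

With these defaults, a single looping module stops being accepted after 50 calls in a window, while a second module of the same tenant can still use the remaining 50.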

Rate Limit Headers

Every response from /api/v1/notifications includes rate limit headers:

X-RateLimit-Limit-Tenant: 100
X-RateLimit-Remaining-Tenant: 43
X-RateLimit-Limit-Module: 50
X-RateLimit-Remaining-Module: 17
X-RateLimit-Reset: 1744714260

X-RateLimit-Reset is a Unix timestamp, in seconds, for when the window resets.
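A client could read these headers into a typed structure along the following lines. The function and interface names are hypothetical, and the sketch assumes the headers have already been collected into a plain object and that X-RateLimit-Reset is expressed in seconds, as the sample value suggests.

```typescript
// Hypothetical helper; only the header names come from the documentation above.
interface RateLimitState {
  tenantLimit: number;
  tenantRemaining: number;
  moduleLimit: number;
  moduleRemaining: number;
  resetAt: Date;
}

function parseRateLimitHeaders(headers: Record<string, string>): RateLimitState {
  // HTTP header names are case-insensitive, so normalise them before lookup.
  const lower: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    lower[name.toLowerCase()] = headers[name];
  }
  const num = (name: string): number => {
    const value = Number(lower[name.toLowerCase()]);
    if (isNaN(value)) throw new Error(`Missing or invalid header: ${name}`);
    return value;
  };
  return {
    tenantLimit: num("X-RateLimit-Limit-Tenant"),
    tenantRemaining: num("X-RateLimit-Remaining-Tenant"),
    moduleLimit: num("X-RateLimit-Limit-Module"),
    moduleRemaining: num("X-RateLimit-Remaining-Module"),
    // Assumption: the timestamp is in seconds, so convert to milliseconds for Date.
    resetAt: new Date(num("X-RateLimit-Reset") * 1000),
  };
}
```

A module could use moduleRemaining to slow down voluntarily before it reaches the cap.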

Rate Limit Exceeded — 429 Response

When the per-tenant or per-module rate is exceeded and the priority is low, normal, or high:

HTTP/1.1 429 Too Many Requests
Retry-After: 18

{
  "type": "https://api.septemcore.com/problems/rate-limit-exceeded",
  "status": 429,
  "detail": "Module rate limit exceeded. Notification queued with 18s delay.",
  "code": "RATE_LIMIT_EXCEEDED",
  "notificationId": "01j9panot700000000000000"
}

The notification is not dropped — it enters a delayed queue and will be delivered when the rate window resets. The notificationId is returned so the module can track delivery status.

critical priority bypasses rate limiting and is always delivered immediately regardless of current rates. Reserve critical for time-sensitive security events (OTP codes, account lockout).

Module hits rate limit with priority: 'normal'
→ 429 Queued with delay
→ Notification delivered when window resets

Module sends with priority: 'critical'
→ Bypasses rate limit
→ Immediate delivery
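Because a 429 here means "queued with delay" rather than "failed", a caller would presumably record the notificationId instead of resending the same notification. The sketch below shows one way to express that; the type and function names are hypothetical, and it assumes Retry-After carries a number of seconds as in the sample response.

```typescript
// Hypothetical client-side interpretation of a send response.
interface ProblemBody {
  code?: string;
  detail?: string;
  notificationId?: string;
}

type SendOutcome =
  | { kind: "queued_with_delay"; notificationId: string; delaySeconds: number }
  | { kind: "not_rate_limited"; status: number };

function interpretSendResponse(
  status: number,
  retryAfter: string | undefined,
  body: ProblemBody
): SendOutcome {
  if (status === 429 && body.code === "RATE_LIMIT_EXCEEDED" && body.notificationId) {
    // Assumption: Retry-After is delta-seconds; fall back to 0 if it is absent or not numeric.
    const delay = Number(retryAfter);
    return {
      kind: "queued_with_delay",
      notificationId: body.notificationId,
      delaySeconds: isNaN(delay) ? 0 : delay,
    };
  }
  // Every other status is left to the caller's normal handling.
  return { kind: "not_rate_limited", status };
}
```

Treating the queued case as a distinct outcome keeps retry logic from duplicating a notification that the service has already accepted.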

Batch Limits

| Parameter | Value |
| --- | --- |
| Max recipients per sendBatch() call | 500 |
| Batch → background job threshold | > 100 recipients |
| Batch exceeds 500 | 400 Bad Request (BATCH_LIMIT_EXCEEDED) |

Each recipient in a batch counts as one notification against the rate limit. A batch to 500 email recipients therefore counts as 500 notifications against the tenant's 100/min allowance, so the background job worker automatically spreads it across multiple rate windows.
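The batch rules reduce to a little arithmetic, sketched below. The function name and return shape are hypothetical. The window count is only a lower bound: it assumes the tenant sends nothing else and uses the tenant rate, so a batch capped by the 50/min module limit would take longer.

```typescript
// Hypothetical planning helper based on the batch limits in the table above.
const MAX_BATCH_RECIPIENTS = 500;
const BACKGROUND_JOB_THRESHOLD = 100;

interface BatchPlan {
  accepted: boolean;
  errorCode?: string;           // set when the batch is rejected
  backgroundJob?: boolean;      // true when the batch exceeds the threshold
  minRateWindows?: number;      // lower bound on 60-second windows needed
}

function planBatch(recipients: number, ratePerMinute: number = 100): BatchPlan {
  if (recipients > MAX_BATCH_RECIPIENTS) {
    // Corresponds to the documented 400 Bad Request response.
    return { accepted: false, errorCode: "BATCH_LIMIT_EXCEEDED" };
  }
  return {
    accepted: true,
    backgroundJob: recipients > BACKGROUND_JOB_THRESHOLD,
    // Each recipient counts as one notification against the rate limit.
    minRateWindows: Math.ceil(recipients / ratePerMinute),
  };
}
```

For example, planBatch(500) reports a background job that needs at least five rate windows at the default tenant rate.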


WebSocket Rate Limits

WebSocket rate limiting operates at the Notify Service level, independent of the HTTP API limits. It protects the WebSocket infrastructure from broadcast storms.

| Parameter | Value | Env override |
| --- | --- | --- |
| Max messages / sec per tenant | 200 | NOTIFY_WS_RATE_PER_TENANT |
| Max concurrent connections per tenant | 1 000 | NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT |
| Exceeded | Throttle + warning message (connection stays open) | |
| Max message size | 64 KB | |
| Max subscriptions per connection | 50 channels | |

Throttle behaviour on WebSocket rate limit:

{ "type": "warning", "code": "WS_RATE_LIMITED", "message": "Message throttled. Retry in 1s." }

The connection is not closed — only that message batch is delayed.
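A client can recognise the throttle warning and pause instead of treating it as a connection error. The following is a rough sketch: the function name is hypothetical, and the fixed one-second pause is an assumption taken from the sample message text rather than a documented field.

```typescript
// Hypothetical handler: returns how long (ms) the client should pause before sending again.
function throttleDelayMs(rawMessage: string): number {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawMessage);
  } catch {
    return 0; // not JSON, so not a throttle warning
  }
  if (typeof parsed !== "object" || parsed === null) return 0;
  const msg = parsed as { type?: unknown; code?: unknown };
  // Assumption: pause for 1s, matching "Retry in 1s." in the sample warning.
  return msg.type === "warning" && msg.code === "WS_RATE_LIMITED" ? 1000 : 0;
}
```

Any other message yields a delay of 0, so the same check can run on every incoming frame without affecting normal traffic.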

Tenant Isolation on WebSocket

Each tenant broadcasts on {tenantId}:{channel} — one tenant's broadcast volume cannot affect another tenant's connection quality:

Tenant A: 1000 connections sending 200 msg/sec → throttled at limit
Tenant B: 10 connections sending 5 msg/sec → unaffected

RabbitMQ Queue Limits

Each tenant has an isolated RabbitMQ queue for outgoing notifications:

| Parameter | Value | Env override |
| --- | --- | --- |
| Queue name | notify.outgoing.{tenantId} | |
| Max queue length | 10 000 messages | NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH |
| Overflow policy | x-overflow: reject-publish | |

When the queue for a tenant reaches 10 000 pending messages (e.g. due to a stuck adapter or very high batch volume), new sends are rejected:

HTTP/1.1 503 Service Unavailable

{
  "type": "https://api.septemcore.com/problems/queue-full",
  "status": 503,
  "detail": "Notification queue is at capacity. Retry after existing notifications are processed.",
  "code": "QUEUE_FULL"
}

This 503 affects only the tenant whose queue is full. All other tenants continue delivering normally.
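The per-tenant queue settings map onto standard RabbitMQ queue arguments. The sketch below only builds the declaration data. The function name and return shape are hypothetical, and the use of x-max-length for the length cap is an assumption, since the source names only the x-overflow policy.

```typescript
// Hypothetical builder for a tenant queue declaration; it does not talk to a broker.
interface TenantQueueSpec {
  name: string;
  options: { arguments: { "x-max-length": number; "x-overflow": string } };
}

function tenantQueueSpec(tenantId: string, maxLength: number = 10000): TenantQueueSpec {
  return {
    name: `notify.outgoing.${tenantId}`,
    options: {
      arguments: {
        // Assumption: the length cap is expressed as RabbitMQ's x-max-length argument.
        "x-max-length": maxLength,
        // Reject new publishes when full instead of dropping the oldest messages.
        "x-overflow": "reject-publish",
      },
    },
  };
}
```

With a client such as amqplib, a spec like this could be passed to channel.assertQueue(spec.name, spec.options). With reject-publish, a publisher that has confirms enabled receives a negative acknowledgement when the queue is full, which a service could plausibly translate into the 503 above.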


Protection Rationale

Without rate limits:
Module bug: send() in tight loop on 100K users
→ External provider: SendGrid / Telegram bans platform IP
→ Impact: ALL tenants lose email / Telegram delivery

With rate limits:
Module bug → hits 50/min module cap → queued with delay → provider not spammed
→ Other modules: continue within tenant's 100/min allowance
→ Other tenants: completely unaffected (per-tenant queue isolation)

Configuration Reference

| Variable | Default | Description |
| --- | --- | --- |
| NOTIFY_RATE_LIMIT_PER_TENANT | 100 | Notifications per minute per tenant |
| NOTIFY_RATE_LIMIT_PER_MODULE | 50 | Notifications per minute per module |
| NOTIFY_WS_RATE_PER_TENANT | 200 | WebSocket messages per second per tenant |
| NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT | 1000 | Max concurrent WebSocket connections |
| NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH | 10000 | Max pending messages per tenant queue |
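One way to load these variables with their defaults is sketched below. The function name and the validation rule (positive integers only) are assumptions, and the environment is passed in as a plain object so the sketch stays independent of any particular runtime.

```typescript
// Defaults taken from the configuration reference above.
const DEFAULTS = {
  NOTIFY_RATE_LIMIT_PER_TENANT: 100,
  NOTIFY_RATE_LIMIT_PER_MODULE: 50,
  NOTIFY_WS_RATE_PER_TENANT: 200,
  NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT: 1000,
  NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH: 10000,
};

type LimitName = keyof typeof DEFAULTS;
type NotifyLimits = Record<LimitName, number>;

// Hypothetical loader: unset or empty variables fall back to the defaults.
function loadLimits(env: Record<string, string | undefined>): NotifyLimits {
  const limits: NotifyLimits = { ...DEFAULTS };
  for (const name of Object.keys(DEFAULTS) as LimitName[]) {
    const raw = env[name];
    if (raw === undefined || raw === "") continue;
    const value = Number(raw);
    // Assumption: every limit must be a positive integer.
    if (!isFinite(value) || value <= 0 || Math.floor(value) !== value) {
      throw new Error(`${name} must be a positive integer, got "${raw}"`);
    }
    limits[name] = value;
  }
  return limits;
}
```

In a Node.js process this could be called as loadLimits(process.env), failing fast at startup on a malformed override.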