Rate Limiting
Rate limiting in the Notify Service protects external providers (email, SMS, Telegram) from module bugs and prevents one tenant from degrading the delivery performance of all others.
Key principle: A notification is never silently dropped when a rate limit is hit. A
429 response means the notification was queued with a delay and will still be delivered; it is not lost.
HTTP API Rate Limits
Per-Tenant
All send() and sendBatch() calls from all modules of a tenant are
counted together against the tenant limit:
| Parameter | Value | Env override |
|---|---|---|
| Rate | 100 notifications / minute | NOTIFY_RATE_LIMIT_PER_TENANT |
| Window | Rolling 60-second window | — |
| Scope | All modules of the tenant combined | — |
Per-Module
Each individual module has its own sub-limit within the tenant limit:
| Parameter | Value |
|---|---|
| Rate | 50 notifications / minute |
| Scope | Per module (identified by module.id from JWT claims) |
This means a module with a bug that calls send() in a tight loop
cannot exhaust the tenant's entire 100/min allowance — it is capped
at 50, leaving headroom for other modules.
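The interaction between the two limits can be sketched as a pair of rolling-window counters, one keyed by tenant and one keyed by module. This is a minimal illustration, not the service's implementation: the class names, the in-memory storage, and the choice not to count rate-limited calls against the window are all assumptions.

```typescript
// Illustrative sketch only: class and method names are hypothetical,
// not the Notify Service's actual implementation.
type Decision = "accepted" | "module_limited" | "tenant_limited";

class RollingWindowCounter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // True if another notification fits in the window ending at nowMs.
  hasCapacity(nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only timestamps that are still inside the rolling window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    return this.timestamps.length < this.limit;
  }

  record(nowMs: number): void {
    this.timestamps.push(nowMs);
  }
}

class TwoLevelLimiter {
  private counters = new Map<string, RollingWindowCounter>();
  private readonly tenantLimit: number;
  private readonly moduleLimit: number;

  constructor(tenantLimit = 100, moduleLimit = 50) {
    this.tenantLimit = tenantLimit;
    this.moduleLimit = moduleLimit;
  }

  private counterFor(key: string, limit: number): RollingWindowCounter {
    let counter = this.counters.get(key);
    if (counter === undefined) {
      counter = new RollingWindowCounter(limit, 60000);
      this.counters.set(key, counter);
    }
    return counter;
  }

  // Checks the module sub-limit first, then the shared tenant limit.
  // Assumption: only accepted notifications consume window capacity.
  check(tenantId: string, moduleId: string, nowMs: number): Decision {
    const tenant = this.counterFor(`tenant:${tenantId}`, this.tenantLimit);
    const mod = this.counterFor(`module:${tenantId}/${moduleId}`, this.moduleLimit);
    if (!mod.hasCapacity(nowMs)) return "module_limited";
    if (!tenant.hasCapacity(nowMs)) return "tenant_limited";
    mod.record(nowMs);
    tenant.record(nowMs);
    return "accepted";
  }
}
```

With the default values, a single runaway module is stopped at 50 accepted calls, and a second module can still use the remaining 50 of the tenant allowance in the same window.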
Rate Limit Headers
Every response from /api/v1/notifications includes rate limit headers:
X-RateLimit-Limit-Tenant: 100
X-RateLimit-Remaining-Tenant: 43
X-RateLimit-Limit-Module: 50
X-RateLimit-Remaining-Module: 17
X-RateLimit-Reset: 1744714260
X-RateLimit-Reset is a Unix timestamp, in seconds, for when the window resets.
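A caller can read these headers to decide how long to hold back before sending more. The sketch below parses the five headers shown above from a plain name-to-value record; the `RateLimitSnapshot` interface and both function names are hypothetical helpers, not part of any SDK.

```typescript
// Illustrative sketch: the interface and function names are hypothetical.
interface RateLimitSnapshot {
  tenantLimit: number;
  tenantRemaining: number;
  moduleLimit: number;
  moduleRemaining: number;
  resetAtMs: number; // X-RateLimit-Reset converted from seconds to milliseconds
}

function parseRateLimitHeaders(headers: Record<string, string>): RateLimitSnapshot {
  // HTTP header names are case-insensitive, so normalise before lookup.
  const lower: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    if (value !== undefined) lower[name.toLowerCase()] = value;
  }
  const readInt = (name: string): number => {
    const raw = lower[name.toLowerCase()];
    const parsed = raw === undefined ? NaN : parseInt(raw, 10);
    if (isNaN(parsed)) throw new Error(`Missing or invalid header: ${name}`);
    return parsed;
  };
  return {
    tenantLimit: readInt("X-RateLimit-Limit-Tenant"),
    tenantRemaining: readInt("X-RateLimit-Remaining-Tenant"),
    moduleLimit: readInt("X-RateLimit-Limit-Module"),
    moduleRemaining: readInt("X-RateLimit-Remaining-Module"),
    resetAtMs: readInt("X-RateLimit-Reset") * 1000,
  };
}

// Whole seconds left until the reset time, never negative.
function secondsUntilReset(snapshot: RateLimitSnapshot, nowMs: number): number {
  return Math.max(0, Math.ceil((snapshot.resetAtMs - nowMs) / 1000));
}
```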
Rate Limit Exceeded — 429 Response
When the per-tenant or per-module rate is exceeded and the priority
is low, normal, or high, the service responds with a 429:
HTTP/1.1 429 Too Many Requests
Retry-After: 18
{
"type": "https://api.septemcore.com/problems/rate-limit-exceeded",
"status": 429,
"detail": "Module rate limit exceeded. Notification queued with 18s delay.",
"code": "RATE_LIMIT_EXCEEDED",
"notificationId": "01j9panot700000000000000"
}
The notification is not dropped — it enters a delayed queue and
will be delivered when the rate window resets. The notificationId is
returned so the module can track delivery status.
critical priority bypasses rate limiting and is always delivered
immediately regardless of current rates. Reserve critical for
time-sensitive security events (OTP codes, account lockout).
Module hits rate limit with priority: 'normal'
→ 429 Queued with delay
→ Notification delivered when window resets
Module sends with priority: 'critical'
→ Bypasses rate limit
→ Immediate delivery
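Because a 429 here means "queued", not "rejected", a caller should record the returned notificationId and avoid resending the same notification. The sketch below shows one way to classify a response; the `SendOutcome` type and function name are hypothetical, and treating any 2xx status as accepted is an assumption, since the success status is not specified here. Only the delay-in-seconds form of Retry-After is handled.

```typescript
// Illustrative sketch: type and function names are hypothetical.
type SendOutcome =
  | { kind: "accepted" }
  | { kind: "queued"; notificationId: string; delaySeconds: number }
  | { kind: "failed"; status: number };

interface ProblemBody {
  code?: string;
  notificationId?: string;
}

function interpretSendResponse(
  status: number,
  retryAfter: string | undefined,
  body: ProblemBody,
): SendOutcome {
  // Assumption: any 2xx status means the notification was accepted.
  if (status >= 200 && status < 300) return { kind: "accepted" };

  if (
    status === 429 &&
    body.code === "RATE_LIMIT_EXCEEDED" &&
    body.notificationId !== undefined
  ) {
    // The notification is already queued, so the caller must not resend it.
    const delay = retryAfter === undefined ? NaN : parseInt(retryAfter, 10);
    return {
      kind: "queued",
      notificationId: body.notificationId,
      delaySeconds: isNaN(delay) ? 0 : delay,
    };
  }

  return { kind: "failed", status };
}
```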
Batch Limits
| Parameter | Value |
|---|---|
| Max recipients per sendBatch() call | 500 |
| Batch → background job threshold | > 100 recipients |
| Batch exceeds 500 | 400 Bad Request — BATCH_LIMIT_EXCEEDED |
Each recipient in a batch counts as one notification against the
rate limit. A batch of 500 email recipients therefore counts as 500
notifications against the tenant's 100/min allowance, so it cannot fit
in a single rate window; the background job worker automatically
spreads the batch across multiple windows.
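The batch rules reduce to a size check plus some arithmetic. The following sketch applies the three thresholds from the table and estimates a lower bound on the number of rate windows a batch needs. The function and type names are hypothetical, and the estimate considers only the tenant limit; it ignores the per-module sub-limit and any other traffic from the tenant.

```typescript
// Illustrative sketch: names are hypothetical; thresholds come from the table above.
const MAX_BATCH_RECIPIENTS = 500;
const BACKGROUND_JOB_THRESHOLD = 100;

type BatchPlan =
  | { accepted: true; runsAsBackgroundJob: boolean; minimumWindows: number }
  | { accepted: false; status: 400; code: "BATCH_LIMIT_EXCEEDED" };

function planBatch(recipientCount: number, tenantRatePerMinute = 100): BatchPlan {
  if (recipientCount > MAX_BATCH_RECIPIENTS) {
    return { accepted: false, status: 400, code: "BATCH_LIMIT_EXCEEDED" };
  }
  return {
    accepted: true,
    // Batches of more than 100 recipients are handed to a background job.
    runsAsBackgroundJob: recipientCount > BACKGROUND_JOB_THRESHOLD,
    // Lower bound only: each recipient consumes one unit of the tenant rate.
    minimumWindows: Math.ceil(recipientCount / tenantRatePerMinute),
  };
}
```

For example, `planBatch(500)` reports a background job that needs at least 5 windows at the default rate, while `planBatch(501)` is rejected with BATCH_LIMIT_EXCEEDED.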
WebSocket Rate Limits
WebSocket rate limiting operates at the Notify Service level, independent of the HTTP API limits. It protects the WebSocket infrastructure from broadcast storms.
| Parameter | Value | Env override |
|---|---|---|
| Max messages / sec per tenant | 200 | NOTIFY_WS_RATE_PER_TENANT |
| Max concurrent connections per tenant | 1 000 | NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT |
| Exceeded | Throttle + warning message (connection stays open) | — |
| Max message size | 64 KB | — |
| Max subscriptions per connection | 50 channels | — |
Throttle behaviour on WebSocket rate limit:
{ "type": "warning", "code": "WS_RATE_LIMITED", "message": "Message throttled. Retry in 1s." }
The connection is not closed — only that message batch is delayed.
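On the client side, the throttle warning arrives as an ordinary message on the open socket, so it has to be told apart from regular traffic. A minimal sketch of that check follows; the `WsRateLimitWarning` interface and `parseRateLimitWarning` function are hypothetical names, and the field layout is taken from the example message above.

```typescript
// Illustrative sketch: interface and function names are hypothetical.
interface WsRateLimitWarning {
  type: "warning";
  code: "WS_RATE_LIMITED";
  message: string;
}

// Returns the warning if the raw frame is a rate-limit warning, otherwise null.
function parseRateLimitWarning(raw: string): WsRateLimitWarning | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // Not JSON, so not a warning frame.
  }
  if (typeof data !== "object" || data === null) return null;

  const frame = data as Record<string, unknown>;
  const message = frame["message"];
  if (
    frame["type"] === "warning" &&
    frame["code"] === "WS_RATE_LIMITED" &&
    typeof message === "string"
  ) {
    return { type: "warning", code: "WS_RATE_LIMITED", message };
  }
  return null;
}
```

A handler can call this on each incoming frame and pause outgoing messages briefly when it returns a warning, leaving the connection open.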
Tenant Isolation on WebSocket
Each tenant broadcasts on {tenantId}:{channel} — one tenant's
broadcast volume cannot affect another tenant's connection quality:
Tenant A: 1000 connections sending 200 msg/sec → throttled at limit
Tenant B: 10 connections sending 5 msg/sec → unaffected
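The isolation shown above follows from keeping one counter per tenant. As a rough sketch, assuming a fixed one-second bucket (the actual throttling algorithm is not specified here), it could look like this; `TenantMessageThrottle` and `channelKey` are hypothetical names.

```typescript
// Illustrative sketch: names and the fixed one-second bucket are assumptions.
function channelKey(tenantId: string, channel: string): string {
  // Matches the {tenantId}:{channel} naming described above.
  return `${tenantId}:${channel}`;
}

class TenantMessageThrottle {
  private buckets = new Map<string, { second: number; count: number }>();
  private readonly maxPerSecond: number;

  constructor(maxPerSecond = 200) {
    this.maxPerSecond = maxPerSecond;
  }

  // Returns true if the message may be broadcast now, false if it is throttled.
  allow(tenantId: string, nowMs: number): boolean {
    const second = Math.floor(nowMs / 1000);
    const bucket = this.buckets.get(tenantId);
    if (bucket === undefined || bucket.second !== second) {
      // First message of a new one-second bucket for this tenant.
      this.buckets.set(tenantId, { second, count: 1 });
      return true;
    }
    if (bucket.count >= this.maxPerSecond) return false;
    bucket.count += 1;
    return true;
  }
}
```

Each tenant has its own bucket, so a tenant that exhausts its 200 messages in a given second has no effect on the count of any other tenant.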
RabbitMQ Queue Limits
Each tenant has an isolated RabbitMQ queue for outgoing notifications:
| Parameter | Value | Env override |
|---|---|---|
| Queue name | notify.outgoing.{tenantId} | — |
| Max queue length | 10 000 messages | NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH |
| Overflow policy | x-overflow: reject-publish | — |
When the queue for a tenant reaches 10 000 pending messages (e.g. due to a stuck adapter or very high batch volume), new sends are rejected:
HTTP/1.1 503 Service Unavailable
{
"type": "https://api.septemcore.com/problems/queue-full",
"status": 503,
"detail": "Notification queue is at capacity. Retry after existing notifications are processed.",
"code": "QUEUE_FULL"
}
This 503 affects only the tenant whose queue is full. All other
tenants continue delivering normally.
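The reject-publish behaviour can be modelled as a bounded queue per tenant that refuses new messages once it is full, rather than dropping old ones. This in-memory sketch only mirrors the semantics described above; it does not use RabbitMQ, and `TenantQueues` and `PublishResult` are hypothetical names.

```typescript
// Illustrative in-memory model of per-tenant bounded queues; not a RabbitMQ client.
type PublishResult =
  | { ok: true }
  | { ok: false; status: 503; code: "QUEUE_FULL" };

class TenantQueues {
  private queues = new Map<string, string[]>();
  private readonly maxLength: number;

  constructor(maxLength = 10000) {
    this.maxLength = maxLength;
  }

  private queueName(tenantId: string): string {
    return `notify.outgoing.${tenantId}`;
  }

  // Rejects the new message when the tenant's queue is at capacity.
  publish(tenantId: string, message: string): PublishResult {
    const name = this.queueName(tenantId);
    let queue = this.queues.get(name);
    if (queue === undefined) {
      queue = [];
      this.queues.set(name, queue);
    }
    if (queue.length >= this.maxLength) {
      return { ok: false, status: 503, code: "QUEUE_FULL" };
    }
    queue.push(message);
    return { ok: true };
  }

  // Simulates a consumer taking the oldest message off a tenant's queue.
  consume(tenantId: string): string | undefined {
    const queue = this.queues.get(this.queueName(tenantId));
    return queue === undefined ? undefined : queue.shift();
  }
}
```

Because each tenant has its own queue, a full queue for one tenant produces a 503 only for that tenant, and publishing succeeds again once a consumer drains at least one message.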
Protection Rationale
Without rate limits:
Module bug: send() in tight loop on 100K users
→ External provider: SendGrid / Telegram bans platform IP
→ Impact: ALL tenants lose email / Telegram delivery
With rate limits:
Module bug → hits 50/min module cap → queued with delay → provider not spammed
→ Other modules: continue within tenant's 100/min allowance
→ Other tenants: completely unaffected (per-tenant queue isolation)
Configuration Reference
| Variable | Default | Description |
|---|---|---|
| NOTIFY_RATE_LIMIT_PER_TENANT | 100 | Notifications per minute per tenant |
| NOTIFY_RATE_LIMIT_PER_MODULE | 50 | Notifications per minute per module |
| NOTIFY_WS_RATE_PER_TENANT | 200 | WebSocket messages per second per tenant |
| NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT | 1000 | Max concurrent WebSocket connections |
| NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH | 10000 | Max pending messages per tenant queue |
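As an illustration of how these variables and defaults fit together, the sketch below reads them from an environment-like record and falls back to the documented default when a variable is unset. The `NotifyLimitsConfig` interface, its field names, and the decision to throw on invalid values are assumptions made for this example.

```typescript
// Illustrative sketch: interface, field names, and validation policy are assumptions.
interface NotifyLimitsConfig {
  tenantRatePerMinute: number;
  moduleRatePerMinute: number;
  wsMessagesPerSecond: number;
  wsMaxConnections: number;
  maxQueueLength: number;
}

type Env = Record<string, string | undefined>;

function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback; // unset: use the default
  const value = Number(raw);
  if (!isFinite(value) || value <= 0 || Math.floor(value) !== value) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadNotifyLimits(env: Env): NotifyLimitsConfig {
  return {
    tenantRatePerMinute: readPositiveInt(env, "NOTIFY_RATE_LIMIT_PER_TENANT", 100),
    moduleRatePerMinute: readPositiveInt(env, "NOTIFY_RATE_LIMIT_PER_MODULE", 50),
    wsMessagesPerSecond: readPositiveInt(env, "NOTIFY_WS_RATE_PER_TENANT", 200),
    wsMaxConnections: readPositiveInt(env, "NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT", 1000),
    maxQueueLength: readPositiveInt(env, "NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH", 10000),
  };
}
```

In a Node.js process this could be called as `loadNotifyLimits(process.env)`; passing a plain object keeps the function easy to test.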