Rate Limiting

Rate limiting in the Notify Service protects external providers (email, SMS, Telegram) from module bugs and prevents one tenant from degrading the delivery performance of all others.

Key principle: A notification is never silently dropped when a rate limit is hit. A 429 response means the notification was queued with a delay and will be delivered — it is not lost.

HTTP API Rate Limits

Per-Tenant

All send() and sendBatch() calls from all modules of a tenant are counted together against the tenant limit:

Parameter	Value	Env override
Rate	100 notifications / minute	`NOTIFY_RATE_LIMIT_PER_TENANT`
Window	Rolling 60-second window	—
Scope	All modules of the tenant combined	—

Per-Module

Each individual module has its own sub-limit within the tenant limit:

Parameter	Value	Env override
Rate	50 notifications / minute	`NOTIFY_RATE_LIMIT_PER_MODULE`
Scope	Per module (identified by `module.id` from JWT claims)	—

This means a module with a bug that calls send() in a tight loop cannot exhaust the tenant's entire 100/min allowance — it is capped at 50, leaving headroom for other modules.
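The interaction between the two limits can be sketched as a pair of rolling-window counters. This is an illustrative model only, not the Notify Service's implementation: the `RollingWindow` and `TenantLimiter` names are hypothetical, and the sketch only decides whether a send fits the limits (the real service queues the notification instead of rejecting it).

```typescript
// Hypothetical model of the tenant + module limits; all names are illustrative.
type Decision =
  | { allowed: true }
  | { allowed: false; scope: "tenant" | "module"; retryAfterSec: number };

class RollingWindow {
  private stamps: number[] = [];
  constructor(private readonly limit: number, private readonly windowMs = 60_000) {}

  // Drop timestamps that are no longer inside the rolling window.
  private prune(now: number): void {
    for (
      let oldest = this.stamps[0];
      oldest !== undefined && oldest <= now - this.windowMs;
      oldest = this.stamps[0]
    ) {
      this.stamps.shift();
    }
  }

  hasRoom(now: number): boolean {
    this.prune(now);
    return this.stamps.length < this.limit;
  }

  record(now: number): void {
    this.stamps.push(now);
  }

  // Seconds until the oldest counted send leaves the window.
  retryAfterSec(now: number): number {
    this.prune(now);
    const oldest = this.stamps[0];
    return oldest === undefined ? 0 : Math.ceil((oldest + this.windowMs - now) / 1000);
  }
}

class TenantLimiter {
  private readonly tenant: RollingWindow;
  private readonly modules = new Map<string, RollingWindow>();

  constructor(tenantLimit = 100, private readonly moduleLimit = 50) {
    this.tenant = new RollingWindow(tenantLimit);
  }

  // `now` is a millisecond timestamp, passed in to keep the sketch testable.
  check(moduleId: string, now: number): Decision {
    let mod = this.modules.get(moduleId);
    if (mod === undefined) {
      mod = new RollingWindow(this.moduleLimit);
      this.modules.set(moduleId, mod);
    }
    if (!mod.hasRoom(now)) {
      return { allowed: false, scope: "module", retryAfterSec: mod.retryAfterSec(now) };
    }
    if (!this.tenant.hasRoom(now)) {
      return { allowed: false, scope: "tenant", retryAfterSec: this.tenant.retryAfterSec(now) };
    }
    // Count the send against both limits only when both have room.
    mod.record(now);
    this.tenant.record(now);
    return { allowed: true };
  }
}
```

In this model a runaway module stops at 50 sends per window, a second module can still use the remaining 50, and a third module is then held back by the tenant limit rather than its own.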

Rate Limit Headers

Every response from /api/v1/notifications includes rate limit headers:

X-RateLimit-Limit-Tenant:     100
X-RateLimit-Remaining-Tenant: 43
X-RateLimit-Limit-Module:     50
X-RateLimit-Remaining-Module: 17
X-RateLimit-Reset:            1744714260

X-RateLimit-Reset is a Unix timestamp (in seconds) indicating when the window resets.
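A module can read these headers to pace itself before it hits a limit. The following is a minimal sketch: the header names come from the example above, while `parseRateLimitHeaders`, `HeaderGetter`, and the shape of the returned object are hypothetical helpers, not part of any SDK.

```typescript
// Hypothetical helper; only the header names are taken from the documentation.
type HeaderGetter = (name: string) => string | null;

interface RateLimitStatus {
  tenantLimit: number;
  tenantRemaining: number;
  moduleLimit: number;
  moduleRemaining: number;
  remaining: number; // the tighter of the tenant and module budgets
  resetAt: Date;
}

function readInt(get: HeaderGetter, name: string): number {
  const raw = get(name);
  const value = raw === null ? NaN : Number.parseInt(raw.trim(), 10);
  if (Number.isNaN(value)) {
    throw new Error(`Missing or invalid header: ${name}`);
  }
  return value;
}

function parseRateLimitHeaders(get: HeaderGetter): RateLimitStatus {
  const tenantRemaining = readInt(get, "X-RateLimit-Remaining-Tenant");
  const moduleRemaining = readInt(get, "X-RateLimit-Remaining-Module");
  return {
    tenantLimit: readInt(get, "X-RateLimit-Limit-Tenant"),
    tenantRemaining,
    moduleLimit: readInt(get, "X-RateLimit-Limit-Module"),
    moduleRemaining,
    // A send must fit both limits, so the usable budget is the smaller one.
    remaining: Math.min(tenantRemaining, moduleRemaining),
    // The header carries seconds; Date expects milliseconds.
    resetAt: new Date(readInt(get, "X-RateLimit-Reset") * 1000),
  };
}
```

With `fetch`, the getter can be `(name) => response.headers.get(name)`. For the example headers above, `remaining` would be 17, because the module budget is the tighter one.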

Rate Limit Exceeded — 429 Response

When the per-tenant or per-module rate is exceeded and the priority is low, normal, or high:

HTTP/1.1 429 Too Many Requests
Retry-After: 18

{
  "type":   "https://api.septemcore.com/problems/rate-limit-exceeded",
  "status": 429,
  "detail": "Module rate limit exceeded. Notification queued with 18s delay.",
  "code":   "RATE_LIMIT_EXCEEDED",
  "notificationId": "01j9panot700000000000000"
}

The notification is not dropped — it enters a delayed queue and will be delivered when the rate window resets. The notificationId is returned so the module can track delivery status.

The critical priority bypasses rate limiting: critical notifications are always delivered immediately, regardless of current rates. Reserve critical for time-sensitive security events (OTP codes, account lockout).

Module hits rate limit with priority: 'normal'
→ 429 Queued with delay
→ Notification delivered when window resets

Module sends with priority: 'critical'
→ Bypasses rate limit
→ Immediate delivery
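On the caller's side, the practical consequence is that a 429 should be treated as "queued", not as a failure to retry; resending would presumably enqueue a second copy. A minimal sketch of that interpretation follows. `interpretSendResponse` and `SendOutcome` are hypothetical names, and treating any 2xx status as success is an assumption, since the success status code is not stated here.

```typescript
// Hypothetical client-side interpretation of a send() HTTP response.
interface ProblemBody {
  code?: string;
  detail?: string;
  notificationId?: string;
}

type SendOutcome =
  | { state: "accepted" }
  | { state: "queued"; notificationId: string; delaySec: number }
  | { state: "failed"; status: number; code: string };

function interpretSendResponse(
  status: number,
  retryAfter: string | null, // raw Retry-After header value, if present
  body: ProblemBody,
): SendOutcome {
  // Assumption: any 2xx status means the notification was accepted.
  if (status >= 200 && status < 300) {
    return { state: "accepted" };
  }
  if (status === 429 && body.code === "RATE_LIMIT_EXCEEDED" && body.notificationId) {
    const parsed = retryAfter === null ? NaN : Number.parseInt(retryAfter, 10);
    return {
      state: "queued",
      notificationId: body.notificationId,
      delaySec: Number.isNaN(parsed) ? 0 : parsed,
    };
  }
  // Everything else (for example 400 or 503) is a real failure for the caller.
  return { state: "failed", status, code: body.code ?? "UNKNOWN" };
}
```

For the 429 example above this yields a `queued` outcome with a delay of 18 seconds and the returned `notificationId`, which the module can keep for tracking delivery status.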

Batch Limits

Parameter	Value
Max recipients per `sendBatch()` call	500
Batch → background job threshold	> 100 recipients
Batch exceeds 500	`400 Bad Request` — `BATCH_LIMIT_EXCEEDED`

Each recipient in a batch counts as one notification against the rate limit. A batch of 500 email recipients therefore counts as 500 notifications against the tenant's 100/min allowance; the background job worker automatically spreads the batch across multiple rate windows.
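The batch rules and the resulting arithmetic can be sketched as follows. `planBatch` is a hypothetical helper, "inline" is just a label used here for batches at or below the background-job threshold, and `minWindows` is only a lower bound that assumes no other traffic is competing for the same limit.

```typescript
// Hypothetical helper modelling the batch limits from the table above.
const MAX_BATCH_RECIPIENTS = 500;
const BACKGROUND_JOB_THRESHOLD = 100;

type BatchPlan =
  | { kind: "rejected"; code: "BATCH_LIMIT_EXCEEDED" }
  | { kind: "inline" | "background"; minWindows: number };

function planBatch(recipients: number, perMinuteLimit = 100): BatchPlan {
  if (recipients > MAX_BATCH_RECIPIENTS) {
    return { kind: "rejected", code: "BATCH_LIMIT_EXCEEDED" };
  }
  // More than 100 recipients is handed to a background job.
  const kind = recipients > BACKGROUND_JOB_THRESHOLD ? "background" : "inline";
  // Each recipient counts as one notification, so a batch needs at least
  // ceil(recipients / limit) one-minute windows to drain.
  return { kind, minWindows: Math.ceil(recipients / perMinuteLimit) };
}
```

Under these assumptions `planBatch(500)` needs at least 5 windows against the 100/min tenant limit, and `planBatch(500, 50)` needs at least 10 if the whole batch comes from a single module held to the 50/min module limit.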

WebSocket Rate Limits

WebSocket rate limiting operates at the Notify Service level, independent of the HTTP API limits. It protects the WebSocket infrastructure from broadcast storms.

Parameter	Value	Env override
Max messages / sec per tenant	200	`NOTIFY_WS_RATE_PER_TENANT`
Max concurrent connections per tenant	1 000	`NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT`
Message rate exceeded	Throttle + warning message (connection stays open)	—
Max message size	64 KB	—
Max subscriptions per connection	50 channels	—

Throttle behaviour on WebSocket rate limit:

{ "type": "warning", "code": "WS_RATE_LIMITED", "message": "Message throttled. Retry in 1s." }

The connection is not closed — only that message batch is delayed.
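A client can recognise the warning frame and back off instead of treating it as an error. The sketch below is an assumption-laden example: `throttleDelayMs` is a hypothetical helper, and extracting the delay from the human-readable `message` text is a guess at a convenient heuristic, so it falls back to one second when the text does not match.

```typescript
// Hypothetical helper: returns a back-off delay in milliseconds when the frame
// is a WS_RATE_LIMITED warning, or null for any other frame.
function throttleDelayMs(raw: string, fallbackMs = 1000): number | null {
  let frame: unknown;
  try {
    frame = JSON.parse(raw);
  } catch {
    return null; // not JSON, so not a warning frame
  }
  if (typeof frame !== "object" || frame === null) {
    return null;
  }
  const { type, code, message } = frame as {
    type?: unknown;
    code?: unknown;
    message?: unknown;
  };
  if (type !== "warning" || code !== "WS_RATE_LIMITED") {
    return null;
  }
  // Heuristic: look for "Retry in <n>s" in the message text.
  const match = typeof message === "string" ? /Retry in (\d+)s/.exec(message) : null;
  const seconds = match?.[1];
  return seconds !== undefined ? Number.parseInt(seconds, 10) * 1000 : fallbackMs;
}
```

A message handler could call this on each incoming frame and pause outgoing sends for the returned delay, leaving the connection open as described above.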

Tenant Isolation on WebSocket

Each tenant broadcasts on {tenantId}:{channel} — one tenant's broadcast volume cannot affect another tenant's connection quality:

Tenant A: 1000 connections sending 200 msg/sec → throttled at limit
Tenant B: 10 connections sending 5 msg/sec     → unaffected

RabbitMQ Queue Limits

Each tenant has an isolated RabbitMQ queue for outgoing notifications:

Parameter	Value	Env override
Queue name	`notify.outgoing.{tenantId}`	—
Max queue length	10 000 messages	`NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH`
Overflow policy	`x-overflow: reject-publish`	—

When the queue for a tenant reaches 10 000 pending messages (e.g. due to a stuck adapter or very high batch volume), new sends are rejected:

HTTP/1.1 503 Service Unavailable

{
  "type":   "https://api.septemcore.com/problems/queue-full",
  "status": 503,
  "detail": "Notification queue is at capacity. Retry after existing notifications are processed.",
  "code":   "QUEUE_FULL"
}

This 503 affects only the tenant whose queue is full. All other tenants continue delivering normally.
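The queue settings in this section map onto standard RabbitMQ queue arguments (`x-max-length` and `x-overflow`). The sketch below only builds the declaration data; `tenantQueue` and `QueueDeclaration` are hypothetical names, and `durable: true` is an assumption that this page does not confirm.

```typescript
// Hypothetical builder for a per-tenant queue declaration.
interface QueueDeclaration {
  name: string;
  durable: boolean;
  arguments: Record<string, string | number>;
}

function tenantQueue(tenantId: string, maxLength = 10_000): QueueDeclaration {
  return {
    name: `notify.outgoing.${tenantId}`,
    durable: true, // assumption: not stated in this document
    arguments: {
      // Cap the number of ready messages in the queue.
      "x-max-length": maxLength,
      // Reject new publishes at the cap instead of dropping the oldest messages.
      "x-overflow": "reject-publish",
    },
  };
}
```

With a client such as amqplib, the result could be passed along the lines of `channel.assertQueue(q.name, { durable: q.durable, arguments: q.arguments })`. With `reject-publish`, a publisher using confirms is notified of the rejection, which is the kind of signal a service can translate into the 503 shown above.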

Protection Rationale

Without rate limits:
  Module bug:            send() in tight loop on 100K users
  → External provider:   SendGrid / Telegram bans platform IP
  → Impact:              ALL tenants lose email / Telegram delivery

With rate limits:
  Module bug → hits 50/min module cap → queued with delay → provider not spammed
  → Other modules:       continue within tenant's 100/min allowance
  → Other tenants:       completely unaffected (per-tenant queue isolation)

Configuration Reference

Variable	Default	Description
`NOTIFY_RATE_LIMIT_PER_TENANT`	`100`	Notifications per minute per tenant
`NOTIFY_RATE_LIMIT_PER_MODULE`	`50`	Notifications per minute per module
`NOTIFY_WS_RATE_PER_TENANT`	`200`	WebSocket messages per second per tenant
`NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT`	`1000`	Max concurrent WebSocket connections
`NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH`	`10000`	Max pending messages per tenant queue
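For illustration, the variables above could be read with their defaults as follows. This is a sketch, not the service's actual configuration loader: `loadLimits`, `NotifyLimits`, and the positive-integer validation are assumptions.

```typescript
// Hypothetical configuration loader; only the variable names and defaults
// come from the table above.
type Env = Record<string, string | undefined>;

interface NotifyLimits {
  ratePerTenant: number;
  ratePerModule: number;
  wsRatePerTenant: number;
  wsMaxConnectionsPerTenant: number;
  rabbitMaxQueueLength: number;
}

function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") {
    return fallback; // unset or empty: use the documented default
  }
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadLimits(env: Env): NotifyLimits {
  return {
    ratePerTenant: readPositiveInt(env, "NOTIFY_RATE_LIMIT_PER_TENANT", 100),
    ratePerModule: readPositiveInt(env, "NOTIFY_RATE_LIMIT_PER_MODULE", 50),
    wsRatePerTenant: readPositiveInt(env, "NOTIFY_WS_RATE_PER_TENANT", 200),
    wsMaxConnectionsPerTenant: readPositiveInt(env, "NOTIFY_WS_MAX_CONNECTIONS_PER_TENANT", 1000),
    rabbitMaxQueueLength: readPositiveInt(env, "NOTIFY_RABBITMQ_MAX_QUEUE_LENGTH", 10000),
  };
}
```

In a Node.js process this would be called as `loadLimits(process.env)`; failing fast on a malformed value avoids silently running with an unintended limit.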
