Integration Hub — Overview
The Integration Hub is the platform's single outbound gateway for all calls to external APIs — payment processors, SMS providers, email services, analytics platforms, advertising networks, and custom third-party services. Every module that needs to reach an external API does so only through the Integration Hub — direct outbound calls from modules are prohibited.
This mirrors the inbound architecture: the API Gateway is the single point of entry for all external traffic into the platform; the Integration Hub is the single point of exit.
Technical Stack
| Component | Technology | Detail |
|---|---|---|
| Language | Go | net/http client + chi router |
| Circuit Breaker | sony/gobreaker | Per-provider state machine (CLOSED / HALF_OPEN / OPEN) |
| Database | PostgreSQL | Provider registry, DLQ, configuration |
| Cache | Valkey | Circuit breaker state, outgoing rate limiting counters |
| DLQ retry queue | RabbitMQ | Retry queue for failed requests |
| gRPC | Protobuf definition | Internal service API |
| OpenAPI | Specification file | REST API schema |
Paths & Configuration
gRPC:
proto/platform/integration_hub/v1/integration_hub_service.proto
OpenAPI:
services/integration-hub/api/openapi.yaml
Internal Packages
| Package | Responsibility |
|---|---|
| internal/config | Environment configuration |
| internal/dlq | Dead Letter Queue: persist, paginate, retry |
| internal/handler | HTTP request handlers |
| internal/integration | Provider call logic, circuit breaker, retry, timeout |
| internal/kafka | Kafka event publisher |
| internal/rabbitmq | RabbitMQ DLQ retry queue consumer |
| internal/schema | Provider schema validation |
| internal/tenant | Tenant-scoped provider registry |
Why All Modules Use the Hub
External APIs are unreliable by nature. Without centralized management, every module would independently implement circuit breaking, retry, timeout, and credential management — duplicating logic and creating inconsistent failure behavior. The Integration Hub enforces these patterns uniformly:
| Problem | Hub solution |
|---|---|
| External API flapping | Circuit breaker isolates failures per provider |
| Cascading timeouts | Per-provider timeout cap (default 10s, max 30s) |
| Thundering herd on retry | Exponential backoff with ±500ms jitter |
| Credential exposure | Credentials encrypted at rest (AES-256-GCM) |
| Untracked failures | All failed calls go to DLQ (30-day retention) |
| No outgoing rate control | Per-provider rate limit (100 req/s by default) |
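The credential row in the table above names AES-256-GCM. As a rough illustration of that primitive only, here is a minimal sketch using Go's standard library. The function names, the nonce-prefixed layout, and the in-memory key are assumptions made for the example; the source does not describe the Hub's actual storage format or key handling.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD. The 32-byte check enforces AES-256,
// since aes.NewCipher would also accept 16- and 24-byte keys.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("AES-256 requires a 32-byte key")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext || tag (layout is an assumption of this sketch).
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext and tag to its first argument (the nonce).
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt reverses encrypt and fails if the data was tampered with.
func decrypt(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := make([]byte, 32) // all-zero demo key; a real key would come from a key manager
	sealed, err := encrypt(key, []byte("provider-api-secret"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, sealed)
	fmt.Println(string(plain), err)
}
```

GCM authenticates as well as encrypts, so a modified ciphertext fails to decrypt instead of yielding corrupted credentials.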
Resilience Patterns
Circuit Breaker (sony/gobreaker)
Each provider has its own circuit breaker instance, independent from all other providers:
CLOSED (normal operation)
│
│ 5 consecutive failures within 30 seconds
▼
OPEN (all calls rejected immediately, no upstream call made)
│
│ After 60 seconds
▼
HALF_OPEN (1 probe request allowed)
│
├─ Probe succeeds → CLOSED
└─ Probe fails → OPEN (60s reset again)
| Parameter | Value |
|---|---|
| Failure threshold | 5 consecutive errors within 30 seconds |
| Open duration | 60 seconds |
| HALF_OPEN probes | 1 request |
| Error when OPEN | 503 Service Unavailable + X-Circuit-Breaker-Open: true header |
| State stored in | Valkey (shared across Hub instances) — no split-brain on scale-out |
When the circuit is OPEN, the Hub returns an immediate 503 without making any outbound request. This protects the external provider from being hammered while it is failing, and it keeps latency from accumulating in the calling modules.
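The state machine above can be sketched in plain Go. This is a standard-library illustration of the transitions, not the sony/gobreaker API and not the Hub's code: the `Breaker` and `fakeClock` types are hypothetical, state is kept in memory rather than in Valkey, there is no locking, and "within 30 seconds" is interpreted as a window that starts at the first failure of a streak.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// State is one of the three circuit states from the diagram above.
type State string

const (
	Closed   State = "CLOSED"
	Open     State = "OPEN"
	HalfOpen State = "HALF_OPEN"
)

const (
	failureThreshold = 5                // consecutive failures that trip the breaker
	failureWindow    = 30 * time.Second // the failures must fall inside this window
	openDuration     = 60 * time.Second // time spent OPEN before a probe is allowed
)

// ErrCircuitOpen means no upstream call was made; the Hub would answer 503.
var ErrCircuitOpen = errors.New("circuit breaker is open")

// errUpstream is a stand-in provider failure used by the demo.
var errUpstream = errors.New("upstream failure")

// fakeClock makes the example deterministic (hypothetical helper).
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time { return c.t }
func (c *fakeClock) AdvanceSeconds(n int) {
	c.t = c.t.Add(time.Duration(n) * time.Second)
}

// Breaker is a single-goroutine sketch, so at most one probe runs in HALF_OPEN.
type Breaker struct {
	State        State
	failures     int
	firstFailure time.Time
	openedAt     time.Time
	now          func() time.Time
}

func NewBreaker(now func() time.Time) *Breaker {
	return &Breaker{State: Closed, now: now}
}

// Call runs fn unless the circuit is OPEN, then updates the state machine.
func (b *Breaker) Call(fn func() error) error {
	t := b.now()
	if b.State == Open {
		if t.Sub(b.openedAt) < openDuration {
			return ErrCircuitOpen // rejected immediately, fn is never invoked
		}
		b.State = HalfOpen // 60s elapsed: let one probe through
	}
	err := fn()
	switch {
	case err == nil:
		b.State, b.failures = Closed, 0
	case b.State == HalfOpen:
		b.State, b.openedAt = Open, t // failed probe: back to OPEN for another 60s
	default:
		if b.failures == 0 || t.Sub(b.firstFailure) > failureWindow {
			b.failures, b.firstFailure = 0, t // start a new counting window
		}
		b.failures++
		if b.failures >= failureThreshold {
			b.State, b.openedAt, b.failures = Open, t, 0
		}
	}
	return err
}

func main() {
	clock := &fakeClock{t: time.Unix(0, 0)}
	b := NewBreaker(clock.Now)
	fail := func() error { return errUpstream }
	for i := 0; i < failureThreshold; i++ {
		_ = b.Call(fail)
	}
	fmt.Println(b.State, b.Call(fail)) // circuit is OPEN, call rejected
	clock.AdvanceSeconds(61)
	_ = b.Call(func() error { return nil }) // probe succeeds
	fmt.Println(b.State)
}
```

With gobreaker itself, the same parameters would roughly correspond to its `Settings` fields for the probe count, counting interval, open timeout, and trip condition.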
Retry + Exponential Backoff
Retries are applied after a failed call on a CLOSED or HALF_OPEN circuit:
Attempt 1: fails
→ wait 1s ± 500ms jitter
Attempt 2: fails
→ wait 2s ± 500ms jitter
Attempt 3: fails
→ wait 4s ± 500ms jitter
Attempt 4: fails
→ wait 8s ± 500ms jitter
Attempt 5: fails
→ max retries exhausted → send to DLQ
| Parameter | Value |
|---|---|
| Max attempts | 5 (initial call plus up to 4 retries) |
| Base delay | 1 second |
| Multiplier | 2× per attempt |
| Jitter | ±500ms (prevents synchronized retry storms) |
| Idempotency | UUID per outgoing request (CONVENTIONS §24) — safe to retry |
Retries apply only to failures that are safe to retry: network timeouts and 5xx responses. 4xx responses from the external provider are not retried, because they indicate a client-side error that retrying will not fix.
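A minimal sketch of the retry schedule and the retry/no-retry decision, following the attempt sequence above (five attempts, four waits). All names here are hypothetical, a status of 0 is used as a stand-in for a network error or timeout, and the sleep function is injected so the example runs instantly; a real caller would pass `time.Sleep`.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	maxAttempts = 5
	baseDelay   = time.Second
	maxJitter   = 500 * time.Millisecond
)

// backoff returns the nominal wait after the given failed attempt (1-based):
// 1s, 2s, 4s, 8s for attempts 1 through 4.
func backoff(attempt int) time.Duration {
	return baseDelay << (attempt - 1)
}

// withJitter shifts d by a uniformly random offset in [-500ms, +500ms].
func withJitter(d time.Duration) time.Duration {
	offset := time.Duration(rand.Int63n(int64(2*maxJitter)+1)) - maxJitter
	return d + offset
}

// retryable reports whether an outcome should be retried.
// 0 stands for a network error or timeout (an assumption of this sketch);
// 5xx is retried, while 2xx and 4xx are returned to the caller as-is.
func retryable(status int) bool {
	return status == 0 || status >= 500
}

// noSleep lets the demo run without waiting.
func noSleep(time.Duration) {}

// callWithRetry invokes call until it returns a non-retryable status or the
// attempts are exhausted. It returns the last status and the attempts used.
func callWithRetry(call func() int, sleep func(time.Duration)) (status, attempts int) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		status = call()
		if !retryable(status) {
			return status, attempt
		}
		if attempt < maxAttempts {
			sleep(withJitter(backoff(attempt)))
		}
	}
	// Exhausted: at this point the Hub would persist the request to the DLQ.
	return status, maxAttempts
}

func main() {
	for attempt := 1; attempt < maxAttempts; attempt++ {
		fmt.Printf("after attempt %d: nominal wait %v\n", attempt, backoff(attempt))
	}
	status, attempts := callWithRetry(func() int { return 503 }, noSleep)
	fmt.Println(status, attempts)
}
```

Because the jitter is a fixed ±500ms rather than a fraction of the delay, it spreads early retries widely and late retries only slightly.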
Timeout
| Parameter | Value |
|---|---|
| Default timeout | 10 seconds |
| Maximum timeout | 30 seconds (per-provider override) |
| Configuration | Provider config.timeout_ms field |
| Timeout response | 504 Gateway Timeout to the calling module |
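One way to apply these values is sketched below with `context.WithTimeout`. The helper names are hypothetical. Two behaviours are assumptions rather than facts from this page: a missing or non-positive `timeout_ms` falls back to the default, and a value above the maximum is clamped instead of rejected.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"time"
)

const (
	defaultTimeout = 10 * time.Second
	maxTimeout     = 30 * time.Second
)

// effectiveTimeout resolves a provider's timeout_ms setting.
// Assumptions: <= 0 means "not set"; values above the cap are clamped.
func effectiveTimeout(timeoutMs int) time.Duration {
	if timeoutMs <= 0 {
		return defaultTimeout
	}
	d := time.Duration(timeoutMs) * time.Millisecond
	if d > maxTimeout {
		return maxTimeout
	}
	return d
}

// sleepFor returns a stand-in for a provider call that takes ms milliseconds
// unless the context is cancelled first.
func sleepFor(ms int) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(time.Duration(ms) * time.Millisecond):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// callProvider runs work under the provider's timeout and returns the status
// the Hub would send back to the calling module.
func callProvider(timeoutMs int, work func(context.Context) error) int {
	ctx, cancel := context.WithTimeout(context.Background(), effectiveTimeout(timeoutMs))
	defer cancel()
	err := work(ctx)
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, context.DeadlineExceeded):
		return http.StatusGatewayTimeout // 504 to the calling module
	default:
		return http.StatusBadGateway
	}
}

func main() {
	fmt.Println(effectiveTimeout(0), effectiveTimeout(2500), effectiveTimeout(45000))
	fmt.Println(callProvider(20, sleepFor(500))) // provider slower than its timeout
}
```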
Outgoing Rate Limiting
Module → Hub: POST /api/v1/integrations/call { providerId: "..." }
Hub: Valkey INCR on outgoing_rate:{providerId}, counted per one-second window
→ under limit (100 req/s) → forward to external API
→ over limit → 429 Too Many Requests (caller must back off)
| Env variable | Default |
|---|---|
| INTEGRATION_RATE_LIMIT_PER_SEC | 100 |
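The counter check can be sketched as a fixed one-second window. In this example an in-memory map stands in for Valkey, and the window is encoded by appending the Unix second to the key; that key suffix, the `memoryStore` type, and the `RateLimiter` type are assumptions for illustration. With a real Valkey instance the counter would also need an expiry, which this sketch omits.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryStore mimics the one operation the limiter needs from Valkey:
// an atomic increment that returns the new value.
type memoryStore struct {
	mu     sync.Mutex
	counts map[string]int64
}

func (s *memoryStore) Incr(key string) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[key]++
	return s.counts[key]
}

// RateLimiter enforces a per-provider limit per one-second window.
type RateLimiter struct {
	store *memoryStore
	limit int64
}

func NewRateLimiter(limitPerSec int64) *RateLimiter {
	return &RateLimiter{
		store: &memoryStore{counts: make(map[string]int64)},
		limit: limitPerSec,
	}
}

// Allow counts one request for providerID in the window identified by unixSec.
// It returns false when the request should be answered with 429.
func (r *RateLimiter) Allow(providerID string, unixSec int64) bool {
	key := fmt.Sprintf("outgoing_rate:%s:%d", providerID, unixSec)
	return r.store.Incr(key) <= r.limit
}

func main() {
	limiter := NewRateLimiter(100) // INTEGRATION_RATE_LIMIT_PER_SEC default
	allowed, rejected := 0, 0
	for i := 0; i < 120; i++ {
		if limiter.Allow("provider-a", 1700000000) {
			allowed++
		} else {
			rejected++
		}
	}
	fmt.Println("allowed:", allowed, "rejected:", rejected)
}
```

A fixed window is simple and cheap, but it can admit up to twice the limit across a window boundary; a sliding window or token bucket would smooth that out at the cost of more state.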
Call Flow
Module: POST /api/v1/integrations/call
{ "providerId": "01j9pint...", "method": "POST", "path": "/v1/charges",
"body": { ... }, "idempotencyKey": "uuid-..." }
Integration Hub:
1. Resolve provider by ID → get baseUrl, credentials, config
2. Check outgoing rate limit (Valkey)
3. Check circuit breaker state
┌─ OPEN → 503, no upstream call
└─ CLOSED / HALF_OPEN → continue
4. Decrypt credentials (AES-256-GCM, Vault-managed key)
5. Build outgoing HTTP request: baseUrl + path, inject auth headers
6. Execute with timeout (provider config or default 10s)
7. ┌─ Success (2xx) → return response to module
└─ Failure → retry with exponential backoff (up to 5 attempts in total)
→ max retries exhausted → persist to DLQ
→ return 502 Bad Gateway to module
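From the calling module's side, the flow above starts with a single POST. The sketch below builds that request and summarizes the error statuses this page documents. The `CallRequest` struct simply mirrors the JSON payload shown above; the Hub base URL, the function names, and the example field values are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// CallRequest mirrors the JSON payload from the call flow above.
type CallRequest struct {
	ProviderID     string      `json:"providerId"`
	Method         string      `json:"method"`
	Path           string      `json:"path"`
	Body           interface{} `json:"body,omitempty"`
	IdempotencyKey string      `json:"idempotencyKey"`
}

// newHubRequest builds the POST a module would send to the Hub.
func newHubRequest(hubBaseURL string, call CallRequest) (*http.Request, error) {
	payload, err := json.Marshal(call)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost,
		hubBaseURL+"/api/v1/integrations/call", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// explain summarizes the Hub error statuses described on this page.
func explain(status int) string {
	switch status {
	case http.StatusTooManyRequests:
		return "outgoing rate limit hit: back off before calling again"
	case http.StatusBadGateway:
		return "retries exhausted: request persisted to the DLQ"
	case http.StatusServiceUnavailable:
		return "circuit open: no upstream call was made"
	case http.StatusGatewayTimeout:
		return "provider did not answer within its timeout"
	default:
		return "not a documented Hub error status"
	}
}

func main() {
	req, err := newHubRequest("http://integration-hub.internal", CallRequest{
		ProviderID:     "provider-id",
		Method:         "POST",
		Path:           "/v1/charges",
		Body:           map[string]interface{}{"amount": 1000},
		IdempotencyKey: "example-idempotency-key",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(explain(http.StatusServiceUnavailable))
}
```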
REST API
For brevity, the /api/v1 base path prefix is omitted from the endpoint table below.
| Method | Endpoint | Description |
|---|---|---|
| POST | /integrations/providers | Register a new provider |
| GET | /integrations/providers | List providers (paginated) |
| GET | /integrations/providers/:id | Get provider details |
| PATCH | /integrations/providers/:id | Update provider config or credentials |
| DELETE | /integrations/providers/:id | Soft delete provider |
| GET | /integrations/providers/:id/health | Provider health + circuit breaker state |
| POST | /integrations/call | Call a provider (via circuit breaker) |
| GET | /integrations/dlq | Dead Letter Queue (paginated) |
| POST | /integrations/dlq/:id/retry | Retry one DLQ entry |
| POST | /integrations/dlq/retry-all | Retry all DLQ entries |
| DELETE | /integrations/dlq/:id | Delete DLQ entry |
Related Pages
- Providers — provider model, auth types, credentials, AES-256-GCM encryption, status values, CRUD examples, health check
- Dead Letter Queue — DLQ schema, paginated retrieval, retry single and bulk, 30-day retention, idempotency