Introduce a production-grade queue system using BullMQ + Redis to handle all asynchronous workloads in CommDesk:
- Event processing
- Webhook delivery
- Retry handling
- Background jobs
This replaces synchronous execution with a scalable, fault-tolerant pipeline.
🎯 Problem
Current (sync):
Event → Webhook → Response
Issues:
- API latency & timeouts
- No retry mechanism
- Poor scalability
- Failure = data loss risk
✅ Proposed Solution
Event
→ Queue (BullMQ)
→ Redis
→ Worker
→ Delivery Engine
→ Logs + Retry
🧱 Tech Stack
- Queue: BullMQ (open-source, production-proven)
- Broker: Redis
- Runtime: Node.js + Express
📦 Scope
1️⃣ Queue Setup
Path:
Files:
connection.ts
queues.ts
jobs.ts
Example:
import { Queue } from "bullmq";

// Queue that carries webhook delivery jobs; Redis settings come from the environment.
export const webhookQueue = new Queue("webhook", {
  connection: {
    host: process.env.REDIS_HOST,
    port: Number(process.env.REDIS_PORT),
  },
});
2️⃣ Job Types
WEBHOOK_DELIVERY
EVENT_PROCESSING
RETRY_DELIVERY
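One way to keep these names consistent between producers and workers is a single shared constant. The sketch below is an assumption about how `jobs.ts` might look; the `isJobType` helper is hypothetical.

```typescript
// Job names from this issue, defined once so producers and workers agree.
const JobType = {
  WEBHOOK_DELIVERY: "WEBHOOK_DELIVERY",
  EVENT_PROCESSING: "EVENT_PROCESSING",
  RETRY_DELIVERY: "RETRY_DELIVERY",
} as const;

type JobType = (typeof JobType)[keyof typeof JobType];

// Hypothetical guard for job names read back from the queue.
function isJobType(value: string): value is JobType {
  // Keys and values are identical, so an own-property check is enough.
  return Object.prototype.hasOwnProperty.call(JobType, value);
}
```

A worker could call `isJobType(job.name)` before dispatching and reject unknown names early.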
3️⃣ Worker Implementation
Path:
Files:
webhook.worker.ts
event.worker.ts
Responsibilities:
Fetch job
→ Load event
→ Find matching webhooks
→ Trigger delivery
→ Log result
→ Retry if needed
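The steps above can be sketched as a plain async function that a BullMQ processor would call. Everything here is illustrative: `DeliveryDeps` and its methods are hypothetical stand-ins for the real event store, webhook registry, HTTP client, and logger, and the sketch assumes one webhook per job, as the payload in this issue suggests. Throwing from the function is what signals failure, since a BullMQ processor that throws marks the job as failed and lets the queue schedule a retry when attempts remain.

```typescript
interface DeliveryJob {
  eventId: string;
  webhookId: string;
  attempt: number;
}

// Hypothetical dependencies, injected so the pipeline can be tested without Redis.
interface DeliveryDeps {
  loadEvent(eventId: string): Promise<{ id: string; body: unknown } | null>;
  findWebhookUrl(webhookId: string): Promise<string | null>;
  deliver(url: string, body: unknown): Promise<number>; // resolves to the HTTP status
  log(message: string, fields: Record<string, unknown>): void;
}

async function processDelivery(job: DeliveryJob, deps: DeliveryDeps): Promise<void> {
  deps.log("job started", { ...job });

  const event = await deps.loadEvent(job.eventId);
  if (event === null) throw new Error(`event ${job.eventId} not found`);

  const url = await deps.findWebhookUrl(job.webhookId);
  if (url === null) throw new Error(`webhook ${job.webhookId} not found`);

  const status = await deps.deliver(url, event.body);
  if (status < 200 || status >= 300) {
    deps.log("job failed", { ...job, status });
    // Throwing hands the retry decision back to the queue.
    throw new Error(`delivery returned HTTP ${status}`);
  }
  deps.log("job completed", { ...job, status });
}
```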
4️⃣ Job Payload
{
  eventId: string,
  webhookId: string,
  attempt: number
}
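Because job data is read back from Redis as plain JSON, a worker may want to validate it before use. The following is a minimal sketch; the `parseWebhookJobPayload` name and the rule that `attempt` is a 1-based integer are assumptions, not part of this issue.

```typescript
interface WebhookJobPayload {
  eventId: string;
  webhookId: string;
  attempt: number; // assumed to be 1-based
}

// Hypothetical validator: throws on malformed data instead of delivering garbage.
function parseWebhookJobPayload(data: unknown): WebhookJobPayload {
  if (typeof data !== "object" || data === null) {
    throw new Error("payload must be an object");
  }
  const { eventId, webhookId, attempt } = data as Record<string, unknown>;
  if (typeof eventId !== "string" || eventId === "") {
    throw new Error("eventId must be a non-empty string");
  }
  if (typeof webhookId !== "string" || webhookId === "") {
    throw new Error("webhookId must be a non-empty string");
  }
  if (typeof attempt !== "number" || !Number.isInteger(attempt) || attempt < 1) {
    throw new Error("attempt must be a positive integer");
  }
  return { eventId, webhookId, attempt };
}
```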
5️⃣ Retry Strategy (Required)
1 → immediate
2 → 30 sec
3 → 2 min
4 → 10 min
5 → fail → Dead Letter Queue
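The schedule above can be expressed as a small lookup function. This sketch assumes that steps 1 to 4 are real delivery attempts and that step 5 means "no further attempt, move to the Dead Letter Queue"; the `-1` sentinel is this sketch's own convention. BullMQ supports custom backoff through a worker-level `backoffStrategy` setting combined with `backoff: { type: "custom" }` on the job, so a function of this shape could plausibly be plugged in there, but the exact wiring should be checked against the BullMQ version in use.

```typescript
// Delay before each delivery attempt, indexed by (attempt number - 1).
const RETRY_DELAYS_MS: readonly number[] = [0, 30 * 1000, 2 * 60 * 1000, 10 * 60 * 1000];

// Sentinel used by this sketch to mean "stop retrying and dead-letter the job".
const DEAD_LETTER = -1;

// attemptsMade: how many attempts have already failed (1 after the first failure).
function retryDelayMs(attemptsMade: number): number {
  // Index attemptsMade holds the delay before attempt number (attemptsMade + 1).
  const delay = RETRY_DELAYS_MS[attemptsMade];
  return delay === undefined ? DEAD_LETTER : delay;
}
```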
6️⃣ Dead Letter Queue
- Store permanently failed jobs
- Support manual retry via API
7️⃣ Rate Limiting
- Per webhook → 10 req/sec
- Per community → configurable
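BullMQ's built-in worker `limiter` option (`max` and `duration`) applies to a queue as a whole, so per-webhook and per-community limits need a key-aware check somewhere. The fixed-window counter below is only an illustration of that idea: `FixedWindowLimiter` is a hypothetical in-process helper, and a multi-worker deployment would need the counters shared (for example in Redis) rather than held in memory.

```typescript
class FixedWindowLimiter {
  private readonly max: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(max: number, windowMs: number) {
    this.max = max;
    this.windowMs = windowMs;
  }

  // Returns true if the key (e.g. a webhook ID) may send one more request at nowMs.
  tryAcquire(key: string, nowMs: number): boolean {
    const current = this.windows.get(key);
    if (current === undefined || nowMs - current.start >= this.windowMs) {
      // No window yet, or the old one expired: start a fresh window.
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count < this.max) {
      current.count += 1;
      return true;
    }
    return false;
  }
}
```

For the per-webhook limit in this issue the limiter would be created as `new FixedWindowLimiter(10, 1000)`; a job that is refused could be delayed and re-queued instead of being failed.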
8️⃣ Concurrency
concurrency: 10
9️⃣ Logging Integration (MANDATORY)
Must integrate with the logging system defined in the docs.
Log events:
- job queued
- job started
- job completed
- job failed
- retry triggered
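To keep these five events uniform, each could be emitted as a structured record. The shape below is an assumption for illustration only; the real field names should follow the logging system referenced above, and `buildLogRecord` is a hypothetical helper.

```typescript
type QueueLogEvent =
  | "job queued"
  | "job started"
  | "job completed"
  | "job failed"
  | "retry triggered";

interface QueueLogRecord {
  event: QueueLogEvent;
  queue: string;
  jobId: string;
  timestamp: string; // ISO 8601
  detail?: string;   // e.g. an error message on failure
}

// Builds one log record; the clock is passed in so output is deterministic in tests.
function buildLogRecord(
  event: QueueLogEvent,
  queue: string,
  jobId: string,
  now: Date,
  detail?: string,
): QueueLogRecord {
  const record: QueueLogRecord = { event, queue, jobId, timestamp: now.toISOString() };
  if (detail !== undefined) record.detail = detail;
  return record;
}
```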
Failure Handling
Handle:
- timeouts
- network failures
- invalid webhook URLs
- external API errors
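Not every failure in this list deserves a retry: an invalid URL will fail the same way every time, while a timeout may not. A possible classification is sketched below. The rules (which status codes and error codes count as retryable) are assumptions chosen for illustration, and `classifyFailure` is a hypothetical helper. For the permanent case, BullMQ exports an `UnrecoverableError` that a processor can throw to skip the remaining attempts.

```typescript
type FailureKind = "retryable" | "permanent";

interface DeliveryFailure {
  httpStatus?: number; // set when the endpoint answered
  code?: string;       // Node.js-style error code when the request never completed
}

// Transient network conditions (assumed list).
const RETRYABLE_CODES = ["ETIMEDOUT", "ECONNRESET", "ECONNREFUSED", "EAI_AGAIN"];

function classifyFailure(failure: DeliveryFailure): FailureKind {
  // A malformed webhook URL cannot succeed on a later attempt.
  if (failure.code === "ERR_INVALID_URL") return "permanent";
  if (failure.code !== undefined && RETRYABLE_CODES.indexOf(failure.code) !== -1) {
    return "retryable";
  }
  const status = failure.httpStatus;
  if (status !== undefined) {
    // Timeouts, throttling, and server errors may clear up; other 4xx will not.
    if (status === 408 || status === 429 || status >= 500) return "retryable";
    return "permanent";
  }
  // Unknown failures default to retry (assumption).
  return "retryable";
}
```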
Observability
Track:
- queue size
- jobs/sec
- failure rate
- retry count
Alert on:
- high failure rate
- queue backlog spike
- worker crash
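Two of these alerts can be derived from job counts alone (BullMQ exposes per-state counts through `queue.getJobCounts()`). The sketch below takes such counts as plain numbers; the `QueueSnapshot` and `AlertThresholds` shapes and any threshold values are assumptions, and worker-crash detection is out of scope for it.

```typescript
interface QueueSnapshot {
  waiting: number;   // jobs not yet picked up (backlog)
  completed: number; // jobs finished successfully in the sampling period
  failed: number;    // jobs that failed in the sampling period
}

interface AlertThresholds {
  maxFailureRate: number; // fraction between 0 and 1
  maxBacklog: number;     // waiting jobs
}

// Returns the names of the alerts that should fire for this snapshot.
function evaluateAlerts(snapshot: QueueSnapshot, thresholds: AlertThresholds): string[] {
  const alerts: string[] = [];
  const finished = snapshot.completed + snapshot.failed;
  // Avoid dividing by zero when nothing has finished yet.
  const failureRate = finished === 0 ? 0 : snapshot.failed / finished;
  if (failureRate > thresholds.maxFailureRate) alerts.push("high failure rate");
  if (snapshot.waiting > thresholds.maxBacklog) alerts.push("queue backlog spike");
  return alerts;
}
```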
Folder Structure
src/
├── queue/
├── workers/
├── modules/
└── logs/
🧪 Testing
- job enqueue works
- worker executes jobs
- retry logic triggers correctly
- failure scenarios handled
🧨 Edge Cases
- duplicate jobs
- Redis downtime
- worker crash
- retry storms
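For the duplicate-jobs case, one common approach is a deterministic job ID: BullMQ ignores an `add` call whose `jobId` already exists in the queue, so deriving the ID from the payload makes repeated enqueues harmless while the original job is still present. The helper below is a hypothetical sketch. It uses `__` as the separator because, as far as I know, recent BullMQ versions reject custom IDs containing `:`.

```typescript
// Same event + webhook always maps to the same job ID.
function deliveryJobId(eventId: string, webhookId: string): string {
  if (eventId.indexOf(":") !== -1 || webhookId.indexOf(":") !== -1) {
    // Guard for this sketch only: keep ':' out of the custom ID.
    throw new Error("IDs must not contain ':'");
  }
  return `${eventId}__${webhookId}`;
}
```

A producer would pass the result as the `jobId` option when adding the job. Note that once a finished job is removed from Redis, the same ID can be enqueued again.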
⚡ Performance Targets
- Handle 10k+ jobs/min
- Non-blocking API
- Horizontally scalable workers
Environment
docker run -d -p 6379:6379 redis
REDIS_HOST=localhost
REDIS_PORT=6379
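If `REDIS_PORT` is missing, `Number(process.env.REDIS_PORT)` evaluates to `NaN`, so it may be worth validating both variables at startup. A minimal sketch, assuming a hypothetical `readRedisConfig` helper in `connection.ts`:

```typescript
interface RedisConnectionConfig {
  host: string;
  port: number;
}

// Reads and validates the two variables above; fails fast on bad configuration.
function readRedisConfig(env: Record<string, string | undefined>): RedisConnectionConfig {
  const host = env["REDIS_HOST"];
  if (host === undefined || host === "") {
    throw new Error("REDIS_HOST is not set");
  }
  const port = Number(env["REDIS_PORT"]);
  // Number(undefined) is NaN and Number("") is 0; both are rejected here.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("REDIS_PORT must be an integer between 1 and 65535");
  }
  return { host, port };
}
```

In the application this would be called as `readRedisConfig(process.env)` and the result passed as the queue's `connection`.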
✅ Acceptance Criteria
🚫 Constraints
- Do NOT process webhooks synchronously
- Do NOT implement custom queue logic
- Use BullMQ best practices only
🔥 Impact
After implementation:
Scalable ✅
Reliable ✅
Production-ready ✅
Without this:
System fails under load ❌