@@ -17,28 +17,40 @@ A Serverless AI Agent system built on Claude Agent SDK, implementing stateful co
  ## Architecture

  ```
- Telegram User → Bot API → API Gateway → Producer Lambda → SQS Queue → Consumer Lambda
-                                                ↓                             ↓
-                                           Return 200                agent-server Lambda
-                                           immediately                        ↓
+ Telegram User → Bot API → API Gateway → Producer Lambda → SQS FIFO Queue → Consumer Lambda
+                                                ↓                                  ↓
+                                           Return 200                     agent-server Lambda
+                                           immediately                             ↓
                      DynamoDB (Session mapping) + S3 (Session files) + Bedrock (Claude)
  ```

  **Core Design**:
  - Uses the Hybrid Sessions pattern recommended by Claude Agent SDK
- - **SQS Async Architecture**: Producer returns 200 immediately to Telegram, Consumer processes requests asynchronously
+ - **SQS FIFO Async Architecture**: Producer returns 200 immediately to Telegram, Consumer processes requests asynchronously with message ordering guarantee

  ## Features

  - **Session Persistence**: DynamoDB for mapping storage, S3 for conversation history, cross-request recovery support
  - **Multi-tenant Isolation**: Client isolation based on Telegram chat_id + thread_id
+ - **Forum Group Support**: Topic-based conversation isolation with automatic permission precheck
+ - **User Whitelist**: Control private chat and group invitation permissions
  - **SubAgent Support**: Configurable specialized Agents (e.g., AWS support) with example implementations
  - **Skills Support**: Reusable skill modules with hello-world example
- - **MCP Integration**: Support for HTTP and local command-based MCP servers
+ - **MCP Integration**: Support for HTTP and local command-based MCP servers (Node.js 20+)
+ - **Security**: Telegram Webhook secret token verification (HMAC)
  - **Auto Cleanup**: 25-day TTL + S3 lifecycle management
- - **SQS Queue**: Async processing + auto retry + dead letter queue
+ - **SQS FIFO Queue**: Ordered async processing + auto retry + dead letter queue
  - **Quick Start**: Provides example Skill/SubAgent/MCP configurations for adding other components

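The isolation key and the 25-day TTL from the feature list can be sketched as follows. This is a minimal illustration, not the project's actual schema: the attribute names `client_key`, `session_id`, and `ttl`, and the `0` placeholder for chats without a topic, are assumptions. DynamoDB TTL itself expects an epoch-seconds number attribute.

```python
import time
from typing import Optional

SESSION_TTL_DAYS = 25  # value taken from the feature list


def session_key(chat_id: int, thread_id: Optional[int] = None) -> str:
    """Hypothetical isolation key: one session per chat, or per topic in a Forum group."""
    return f"{chat_id}:{thread_id if thread_id is not None else 0}"


def build_mapping_item(chat_id: int, thread_id: Optional[int],
                       session_id: str, now: Optional[int] = None) -> dict:
    """Build a session-mapping item whose `ttl` lets DynamoDB expire it after 25 days."""
    now = int(time.time()) if now is None else now
    return {
        "client_key": session_key(chat_id, thread_id),     # assumed attribute name
        "session_id": session_id,                          # assumed attribute name
        "ttl": now + SESSION_TTL_DAYS * 24 * 60 * 60,      # epoch seconds
    }
```

Passing `now` explicitly keeps the function deterministic in tests; in a Lambda handler it would normally be left unset.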
+ ## Commands
+
+ | Command | Description |
+ |---------|-------------|
+ | `/newchat <message>` | Create new Topic in Forum group and start conversation |
+ | `/debug` | Download current session files (conversation.jsonl, debug.txt, todos.json) |
+ | `/start` | Welcome message (private chat) |
+ | `/help` | Show help message |
+
  ## Project Structure

  ```
@@ -56,7 +68,9 @@ Telegram User → Bot API → API Gateway → Producer Lambda → SQS Queue →
  ├── agent-sdk-client/                  # Telegram Client (ZIP Deployment)
  │   ├── handler.py                     # Producer: Webhook receiver, writes to SQS
  │   ├── consumer.py                    # Consumer: SQS consumer, calls Agent
- │   └── config.py                      # Configuration management
+ │   ├── config.py                      # Configuration management
+ │   ├── config.toml                    # Command configuration
+ │   └── security.py                    # Security utilities
  │
  ├── docs/                              # Documentation
  │   └── anthropic-agent-sdk-official/  # SDK Official Docs Reference
@@ -98,28 +112,31 @@ sam deploy --guided
  | `BEDROCK_SECRET_ACCESS_KEY` | Bedrock secret key |
  | `SDK_CLIENT_AUTH_TOKEN` | Internal authentication token |
  | `TELEGRAM_BOT_TOKEN` | Telegram Bot Token |
+ | `TELEGRAM_WEBHOOK_SECRET` | (Optional) Webhook secret for security verification |
  | `QUEUE_URL` | SQS queue URL (auto-created) |

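A sketch of how the optional `TELEGRAM_WEBHOOK_SECRET` check might look in the Producer. When a webhook is registered with a `secret_token`, Telegram sends it back in the `X-Telegram-Bot-Api-Secret-Token` header of every update. The function name and the "skip when unset" behaviour are assumptions, not the contents of `security.py`.

```python
import hmac
from typing import Mapping, Optional

# Header Telegram attaches to webhook requests when a secret_token was set.
SECRET_HEADER = "x-telegram-bot-api-secret-token"


def is_valid_webhook(headers: Mapping[str, str], expected_secret: Optional[str]) -> bool:
    """Return True if the request carries the expected secret token.

    Assumption: when no secret is configured, verification is skipped,
    matching the "(Optional)" note in the variable table.
    """
    if not expected_secret:
        return True
    # API Gateway may deliver header names in any case, so normalise them.
    lowered = {name.lower(): value for name, value in headers.items()}
    received = lowered.get(SECRET_HEADER) or ""
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(received.encode(), expected_secret.encode())
```

A Producer would typically return 401 or 403 without enqueueing anything when this check fails.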
  ## Tech Stack

  - **Runtime**: Python 3.12 + Claude Agent SDK
  - **Computing**: AWS Lambda (ARM64)
  - **Storage**: S3 + DynamoDB
- - **Message Queue**: AWS SQS (Standard Queue + DLQ)
+ - **Message Queue**: AWS SQS (FIFO Queue + DLQ)
  - **AI**: Claude via Amazon Bedrock
  - **Orchestration**: AWS SAM
  - **Integration**: Telegram Bot API + MCP

- ## SQS Async Architecture
+ ## SQS FIFO Async Architecture

  **Problem Solved**: Telegram Webhook times out and retries after ~27s, while Agent processing may take 30-70s, causing duplicate responses.

  **Solution**:
- 1. Producer Lambda receives Webhook, writes to SQS, returns 200 immediately (<1s)
+ 1. Producer Lambda receives Webhook, writes to SQS FIFO, returns 200 immediately (<1s)
  2. Consumer Lambda consumes from SQS, calls Agent Server, sends response to Telegram
- 3. Retry 3 times on failure, then move to dead letter queue (DLQ)
+ 3. FIFO queue ensures message ordering within same session (MessageGroupId = chat_id:thread_id)
+ 4. Retry 3 times on failure, then move to dead letter queue (DLQ)

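The Producer step can be sketched as a function that turns a Telegram update into the keyword arguments for boto3's `sqs.send_message`. The `MessageGroupId` format follows the README. Using `update_id` as the `MessageDeduplicationId` and `0` for chats without a topic are assumptions made for illustration.

```python
import json


def build_fifo_message(update: dict, queue_url: str) -> dict:
    """Build send_message kwargs for one Telegram update (illustrative helper)."""
    message = update.get("message") or {}
    chat_id = message["chat"]["id"]
    # message_thread_id is only present for messages inside a forum topic.
    thread_id = message.get("message_thread_id")
    group_id = f"{chat_id}:{thread_id if thread_id is not None else 0}"
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(update),
        # Messages sharing a group ID are delivered in order.
        "MessageGroupId": group_id,
        # Assumption: a webhook retry repeats the update_id, so SQS can drop
        # the duplicate inside its deduplication window.
        "MessageDeduplicationId": str(update["update_id"]),
    }
```

In the handler this would be used as `sqs.send_message(**build_fifo_message(update, queue_url))`, followed by an immediate 200 response.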
  **Queue Configuration**:
+ - FifoQueue: true (ordered delivery per MessageGroupId)
  - VisibilityTimeout: 900s (= Lambda timeout)
  - maxReceiveCount: 3 (retry 3 times)
  - DLQ Alarm: CloudWatch alarm triggers when messages enter DLQ
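
For readers who want to see these settings outside the SAM template, the same configuration can be expressed as the `Attributes` map that SQS `CreateQueue` accepts. This is a sketch: the project itself declares the queue through AWS SAM, and the helper name and the ARN used in the test are placeholders.

```python
import json


def fifo_queue_attributes(dlq_arn: str) -> dict:
    """Return SQS queue attributes mirroring the configuration listed above."""
    return {
        "FifoQueue": "true",            # SQS attribute values are strings
        "VisibilityTimeout": "900",     # seconds, equal to the Lambda timeout
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": 3,
        }),
    }
```

Two SQS constraints apply here: a FIFO queue's name must end in `.fifo`, and its dead letter queue must also be a FIFO queue. SQS counts receives, so `maxReceiveCount: 3` allows at most three delivery attempts in total before a message is moved to the DLQ.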
@@ -137,6 +154,25 @@ sam deploy --guided
  - `debug.txt` - Debug logs
  - `todos.json` - Task status

+ ## Configure Commands
+
+ Edit `agent-sdk-client/config.toml`:
+
+ ```toml
+ [agent_commands]
+ commands = ["/custom-skill", "/hello-world"]
+
+ [local_commands]
+ # Static response
+ help = { type = "static", response = "Hello World" }
+ # Handler function
+ newchat = { type = "handler", handler = "newchat" }
+ debug = { type = "handler", handler = "debug" }
+
+ [security]
+ user_whitelist = ["all"]  # or [123456789, 987654321]
+ ```
+
  ## Configure SubAgents

  Edit `agent-sdk-server/claude-config/agents.json`:
@@ -173,6 +209,17 @@ Edit `agent-sdk-server/claude-config/mcp.json`, supporting two types:

  Examples include AWS knowledge base MCP servers. Refer to existing configurations to add more MCP servers.

+ ## Forum Group Setup
+
+ For Telegram Forum groups:
+
+ 1. Enable the Topics feature in group settings
+ 2. Add the Bot to the group (must be done by a whitelisted user)
+ 3. Promote the Bot to admin with the "Manage Topics" permission
+ 4. Use `/newchat <message>` to create new conversation topics
+
+ See [docs/forum-group-security.md](docs/forum-group-security.md) for details.
+
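Step 4 relies on the Bot API method `createForumTopic`, which requires the "Manage Topics" admin right granted in step 3. The sketch below only builds the HTTP request. Deriving the topic name from the text after `/newchat`, and the `"New chat"` fallback, are assumptions about how the handler might behave.

```python
import json
import urllib.request

MAX_TOPIC_NAME = 128  # Bot API limit for forum topic names (1-128 characters)


def topic_name_from(message_text: str) -> str:
    """Use the text after the command as the topic name (assumed behaviour)."""
    parts = message_text.split(maxsplit=1)
    body = parts[1] if len(parts) > 1 else "New chat"  # hypothetical fallback
    return body[:MAX_TOPIC_NAME]


def build_create_topic_request(bot_token: str, chat_id: int,
                               message_text: str) -> urllib.request.Request:
    """Build (but do not send) the createForumTopic call for a /newchat message."""
    payload = {"chat_id": chat_id, "name": topic_name_from(message_text)}
    return urllib.request.Request(
        f"https://api.telegram.org/bot{bot_token}/createForumTopic",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` returns a `ForumTopic` object whose `message_thread_id` identifies the new topic, which is the `thread_id` used for session isolation.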
  ## Quick Start Examples

  The project includes the following example components; follow these examples to add other components:
@@ -202,28 +249,40 @@ MIT
  ## Architecture

  ```
- Telegram User → Bot API → API Gateway → Producer Lambda → SQS Queue → Consumer Lambda
-                                                ↓                             ↓
-                                     Return 200 immediately          agent-server Lambda
-                                                                              ↓
+ Telegram User → Bot API → API Gateway → Producer Lambda → SQS FIFO Queue → Consumer Lambda
+                                                ↓                                  ↓
+                                     Return 200 immediately               agent-server Lambda
+                                                                                   ↓
                      DynamoDB (Session mapping) + S3 (Session files) + Bedrock (Claude)
  ```

  **Core Design**:
  - Adopts the Hybrid Sessions pattern officially recommended by Claude Agent SDK
- - **SQS Async Architecture**: Producer returns 200 to Telegram immediately, Consumer processes requests asynchronously
+ - **SQS FIFO Async Architecture**: Producer returns 200 to Telegram immediately, Consumer processes requests asynchronously, guaranteeing message order

  ## Features

  - **Session Persistence**: DynamoDB stores the mapping, S3 stores conversation history, with cross-request recovery support
  - **Multi-tenant Isolation**: Client isolation based on Telegram chat_id + thread_id
+ - **Forum Group Support**: Topic-based conversation isolation with automatic permission precheck
+ - **User Whitelist**: Controls private chat and group invitation permissions
  - **SubAgent Support**: Multiple specialized Agents can be configured (e.g., AWS support), with example implementations
  - **Skills Support**: Reusable skill modules, with a hello-world example
- - **MCP Integration**: Supports HTTP and local command-based MCP servers
+ - **MCP Integration**: Supports HTTP and local command-based MCP servers (Node.js 20+)
+ - **Security Verification**: Supports Telegram Webhook secret verification (HMAC)
  - **Auto Cleanup**: 25-day TTL + S3 lifecycle management
- - **SQS Queue**: Async processing + auto retry + dead letter queue
+ - **SQS FIFO Queue**: Ordered async processing + auto retry + dead letter queue
  - **Quick Start**: Provides example Skill/SubAgent/MCP configurations; follow them to add other components

+ ## Commands
+
+ | Command | Description |
+ |---------|-------------|
+ | `/newchat <message>` | Create a new Topic in a Forum group and start a conversation |
+ | `/debug` | Download current session files (conversation.jsonl, debug.txt, todos.json) |
+ | `/start` | Welcome message (private chat) |
+ | `/help` | Show help message |
+
  ## Project Structure

  ```
@@ -241,7 +300,9 @@ Telegram User → Bot API → API Gateway → Producer Lambda → SQS Queue →
  ├── agent-sdk-client/                  # Telegram client (ZIP deployment)
  │   ├── handler.py                     # Producer: receives Webhook, writes to SQS
  │   ├── consumer.py                    # Consumer: consumes from SQS, calls Agent
- │   └── config.py                      # Configuration management
+ │   ├── config.py                      # Configuration management
+ │   ├── config.toml                    # Command configuration
+ │   └── security.py                    # Security utilities
  │
  ├── docs/                              # Documentation
  │   └── anthropic-agent-sdk-official/  # SDK official docs reference
@@ -283,28 +344,31 @@ sam deploy --guided
  | `BEDROCK_SECRET_ACCESS_KEY` | Bedrock secret key |
  | `SDK_CLIENT_AUTH_TOKEN` | Internal authentication token |
  | `TELEGRAM_BOT_TOKEN` | Telegram Bot Token |
+ | `TELEGRAM_WEBHOOK_SECRET` | (Optional) Webhook secret verification |
  | `QUEUE_URL` | SQS queue URL (auto-created) |

  ## Tech Stack

  - **Runtime**: Python 3.12 + Claude Agent SDK
  - **Computing**: AWS Lambda (ARM64)
  - **Storage**: S3 + DynamoDB
- - **Message Queue**: AWS SQS (Standard Queue + DLQ)
+ - **Message Queue**: AWS SQS (FIFO Queue + DLQ)
  - **AI**: Claude via Amazon Bedrock
  - **Orchestration**: AWS SAM
  - **Integration**: Telegram Bot API + MCP

- ## SQS Async Architecture
+ ## SQS FIFO Async Architecture

  **Problem Solved**: Telegram Webhook times out and retries after ~27s, while Agent processing may take 30-70s, causing duplicate responses.

  **Solution**:
- 1. Producer Lambda receives the Webhook, writes to SQS, and returns 200 immediately (<1s)
+ 1. Producer Lambda receives the Webhook, writes to SQS FIFO, and returns 200 immediately (<1s)
  2. Consumer Lambda consumes from SQS, calls Agent Server, and sends the response to Telegram
- 3. Retries 3 times on failure; messages that still fail go to the dead letter queue (DLQ)
+ 3. The FIFO queue guarantees message order within the same session (MessageGroupId = chat_id:thread_id)
+ 4. Retries 3 times on failure; messages that still fail go to the dead letter queue (DLQ)

  **Queue Configuration**:
+ - FifoQueue: true (ordered delivery per MessageGroupId)
  - VisibilityTimeout: 900s (= Lambda timeout)
  - maxReceiveCount: 3 (retry 3 times)
  - DLQ Alarm: a CloudWatch alarm triggers when messages enter the DLQ
@@ -322,6 +386,25 @@ sam deploy --guided
  - `debug.txt` - Debug logs
  - `todos.json` - Task status

+ ## Configure Commands
+
+ Edit `agent-sdk-client/config.toml`:
+
+ ```toml
+ [agent_commands]
+ commands = ["/custom-skill", "/hello-world"]
+
+ [local_commands]
+ # Static response
+ help = { type = "static", response = "Hello World" }
+ # Handler function
+ newchat = { type = "handler", handler = "newchat" }
+ debug = { type = "handler", handler = "debug" }
+
+ [security]
+ user_whitelist = ["all"]  # or [123456789, 987654321]
+ ```
+
  ## Configure SubAgent

  Edit `agent-sdk-server/claude-config/agents.json`:
@@ -358,6 +441,17 @@ sam deploy --guided
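
  The examples configure AWS knowledge base MCP servers. Refer to the existing configurations to add more MCP servers.

+ ## Forum Group Setup
+
+ To use the Bot in Telegram Forum groups:
+
+ 1. Enable the Topics feature in group settings
+ 2. Add the Bot to the group (must be added by a whitelisted user)
+ 3. Promote the Bot to admin and grant the "Manage Topics" permission
+ 4. Use `/newchat <message>` to create a new conversation Topic
+
+ See [docs/forum-group-security.md](docs/forum-group-security.md) for details.
+
  ## Quick Start Examples

  The project already includes the following example components; follow these examples to add other components: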