Commit 8d75bc1

docs: update README and CHANGELOG for v0.3.0
- Update architecture diagram to reflect FIFO queue
- Add Commands section with /newchat, /debug, /start, /help
- Add Forum Group Setup section
- Document new features: webhook security, handler system, typing indicator
- Update tech stack to show FIFO Queue
- Add Configure Commands section for config.toml
- Update CHANGELOG with v0.3.0 release notes
- Update sqs-async-architecture-plan.md with FIFO upgrade notes
1 parent 618f8b4 commit 8d75bc1

3 files changed, +187 -25 lines changed

CHANGELOG.md

Lines changed: 40 additions & 0 deletions
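@@ -2,6 +2,36 @@
 
 ## [Unreleased]
 
+## [0.3.0] - 2025-02-04
+
+### Added
+- **SQS FIFO queue**: Upgraded to a FIFO queue so that messages within the same session are processed in order
+- **Telegram Webhook security verification**: Supports `X-Telegram-Bot-Api-Secret-Token` HMAC verification
+- **Local command handler system**: Supports two types, `static` (static reply) and `handler` (handler function)
+- **New commands**:
+  - `/newchat` - Create a new Topic in a Forum group to start an independent conversation
+  - `/debug` - Download the current session's files (conversation.jsonl, debug.txt, todos.json)
+  - `/start` - Welcome message for private chats
+- **Continuous typing indicator**: The Consumer sends a typing status every 4 seconds, improving the user experience during long requests
+- **Markdown conversion pipeline**: Converts Agent output to Telegram MarkdownV2 format
+- **Message timestamp**: The `message_time` field is passed through to the Agent Server
+- **Forum group support**:
+  - Automatically checks the Topics feature and permissions when the Bot joins a group
+  - The General Topic intercepts non-command messages and directs users to `/newchat`
+- **User whitelist**: Supports restricting private chat and group invitation permissions
+
+### Changed
+- **Node.js upgrade**: Docker image upgraded to Node.js 20+ (required by the MCP undici dependency)
+- **HOME directory**: Changed from `/root` to `/tmp` (for MCP auth file write compatibility)
+- **npm cache**: Configured the `/tmp/.npm` directory
+- **Environment variable cleanup**: Removed the duplicate `ANTHROPIC_DEFAULT_OPUS_4_5_MODEL`
+- **Producer permission expansion**: Added DynamoDB read and S3 read permissions (to support the /debug command)
+
+### Fixed
+- Removed the invalid `release-changelog.yml` workflow
+
+## [0.2.0] - 2025-01-04
+
 ### Changed
 - **Architecture change: from synchronous to asynchronous processing**
   - Refactored the SDK Client to an SQS async architecture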
@@ -14,3 +44,13 @@
 - SNS alarm topic (AlarmTopic) for CloudWatch notifications
 - CloudWatch alarms and custom metrics monitoring
 - DynamoDB session table for multi-turn conversation state management
+
+## [0.0.1-beta] - 2024-12-15
+
+### Added
+- Initial release
+- Claude Agent SDK integration
+- S3 + DynamoDB session persistence
+- Telegram Bot integration
+- SubAgent and Skills support
+- MCP server integration

README.md

Lines changed: 118 additions & 24 deletions
@@ -17,28 +17,40 @@ A Serverless AI Agent system built on Claude Agent SDK, implementing stateful co
 ## Architecture
 
 ```
-Telegram User → Bot API → API Gateway → Producer Lambda → SQS Queue → Consumer Lambda
-↓ ↓
-Return 200 agent-server Lambda
-immediately ↓
+Telegram User → Bot API → API Gateway → Producer Lambda → SQS FIFO Queue → Consumer Lambda
+
+Return 200 agent-server Lambda
+immediately
 DynamoDB (Session mapping) + S3 (Session files) + Bedrock (Claude)
 ```
 
 **Core Design**:
 - Uses the Hybrid Sessions pattern recommended by Claude Agent SDK
-- **SQS Async Architecture**: Producer returns 200 immediately to Telegram, Consumer processes requests asynchronously
+- **SQS FIFO Async Architecture**: Producer returns 200 immediately to Telegram, Consumer processes requests asynchronously with message ordering guarantee
 
 ## Features
 
 - **Session Persistence**: DynamoDB for mapping storage, S3 for conversation history, cross-request recovery support
 - **Multi-tenant Isolation**: Client isolation based on Telegram chat_id + thread_id
+- **Forum Group Support**: Topic-based conversation isolation with auto-precheck
+- **User Whitelist**: Control private chat and group invitation permissions
 - **SubAgent Support**: Configurable specialized Agents (e.g., AWS support) with example implementations
 - **Skills Support**: Reusable skill modules with hello-world example
-- **MCP Integration**: Support for HTTP and local command-based MCP servers
+- **MCP Integration**: Support for HTTP and local command-based MCP servers (Node.js 20+)
+- **Security**: Telegram Webhook secret token verification (HMAC)
 - **Auto Cleanup**: 25-day TTL + S3 lifecycle management
-- **SQS Queue**: Async processing + auto retry + dead letter queue
+- **SQS FIFO Queue**: Ordered async processing + auto retry + dead letter queue
 - **Quick Start**: Provides example Skill/SubAgent/MCP configurations for adding other components
 
+## Commands
+
+| Command | Description |
+|---------|-------------|
+| `/newchat <message>` | Create new Topic in Forum group and start conversation |
+| `/debug` | Download current session files (conversation.jsonl, debug.txt, todos.json) |
+| `/start` | Welcome message (private chat) |
+| `/help` | Show help message |
+
 ## Project Structure
 
 ```
@@ -56,7 +68,9 @@ Telegram User → Bot API → API Gateway → Producer Lambda → SQS Queue →
 ├── agent-sdk-client/ # Telegram Client (ZIP Deployment)
 │ ├── handler.py # Producer: Webhook receiver, writes to SQS
 │ ├── consumer.py # Consumer: SQS consumer, calls Agent
-│ └── config.py # Configuration management
+│ ├── config.py # Configuration management
+│ ├── config.toml # Command configuration
+│ └── security.py # Security utilities
 
 ├── docs/ # Documentation
 │ └── anthropic-agent-sdk-official/ # SDK Official Docs Reference
@@ -98,28 +112,31 @@ sam deploy --guided
 | `BEDROCK_SECRET_ACCESS_KEY` | Bedrock secret key |
 | `SDK_CLIENT_AUTH_TOKEN` | Internal authentication token |
 | `TELEGRAM_BOT_TOKEN` | Telegram Bot Token |
+| `TELEGRAM_WEBHOOK_SECRET` | (Optional) Webhook secret for security verification |
 | `QUEUE_URL` | SQS queue URL (auto-created) |
 
 ## Tech Stack
 
 - **Runtime**: Python 3.12 + Claude Agent SDK
 - **Computing**: AWS Lambda (ARM64)
 - **Storage**: S3 + DynamoDB
-- **Message Queue**: AWS SQS (Standard Queue + DLQ)
+- **Message Queue**: AWS SQS (FIFO Queue + DLQ)
 - **AI**: Claude via Amazon Bedrock
 - **Orchestration**: AWS SAM
 - **Integration**: Telegram Bot API + MCP
 
-## SQS Async Architecture
+## SQS FIFO Async Architecture
 
 **Problem Solved**: Telegram Webhook times out and retries after ~27s, while Agent processing may take 30-70s, causing duplicate responses.
 
 **Solution**:
-1. Producer Lambda receives Webhook, writes to SQS, returns 200 immediately (<1s)
+1. Producer Lambda receives Webhook, writes to SQS FIFO, returns 200 immediately (<1s)
 2. Consumer Lambda consumes from SQS, calls Agent Server, sends response to Telegram
-3. Retry 3 times on failure, then move to dead letter queue (DLQ)
+3. FIFO queue ensures message ordering within same session (MessageGroupId = chat_id:thread_id)
+4. Retry 3 times on failure, then move to dead letter queue (DLQ)
 
 **Queue Configuration**:
+- FifoQueue: true (ordered delivery per MessageGroupId)
 - VisibilityTimeout: 900s (= Lambda timeout)
 - maxReceiveCount: 3 (retry 3 times)
 - DLQ Alarm: CloudWatch alarm triggers when messages enter DLQ
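
The Producer step in the hunk above (enqueue, then return 200 right away) can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in `handler.py`: the names `message_group_id` and `handle_webhook` are hypothetical, the `enqueue` callable stands in for the real SQS write so the example runs without AWS access, and the fallback to thread `0` for messages outside a Topic is a guess.

```python
import json
from typing import Callable


def message_group_id(update: dict) -> str:
    """Build the FIFO group key `chat_id:thread_id` for a Telegram update."""
    message = update["message"]
    chat_id = message["chat"]["id"]
    # Assumption: messages outside a Forum Topic fall back to thread 0.
    thread_id = message.get("message_thread_id", 0)
    return f"{chat_id}:{thread_id}"


def handle_webhook(event: dict, enqueue: Callable[[str, str], None]) -> dict:
    """Enqueue the raw update and acknowledge Telegram right away."""
    body = event["body"]
    enqueue(message_group_id(json.loads(body)), body)
    # No Agent work happens here, so the response returns well within
    # Telegram's webhook timeout.
    return {"statusCode": 200, "body": "ok"}
```

Because every message of one chat and thread shares a group key, the FIFO queue delivers them in order, while different chats map to different groups and can be consumed in parallel.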
@@ -137,6 +154,25 @@ sam deploy --guided
 - `debug.txt` - Debug logs
 - `todos.json` - Task status
 
+## Configure Commands
+
+Edit `agent-sdk-client/config.toml`:
+
+```toml
+[agent_commands]
+commands = ["/custom-skill", "/hello-world"]
+
+[local_commands]
+# Static response
+help = { type = "static", response = "Hello World" }
+# Handler function
+newchat = { type = "handler", handler = "newchat" }
+debug = { type = "handler", handler = "debug" }
+
+[security]
+user_whitelist = ["all"] # or [123456789, 987654321]
+```
+
 ## Configure SubAgents
 
 Edit `agent-sdk-server/claude-config/agents.json`:
@@ -173,6 +209,17 @@ Edit `agent-sdk-server/claude-config/mcp.json`, supporting two types:
 
 Examples include AWS knowledge base MCP servers. Refer to existing configurations to add more MCP servers.
 
+## Forum Group Setup
+
+For Telegram Forum groups:
+
+1. Enable Topics feature in group settings
+2. Add Bot to group (must be by whitelisted user)
+3. Promote Bot to admin with "Manage Topics" permission
+4. Use `/newchat <message>` to create new conversation topics
+
+See [docs/forum-group-security.md](docs/forum-group-security.md) for details.
+
 ## Quick Start Examples
 
 The project includes the following example components; follow these examples to add other components:
@@ -202,28 +249,40 @@ MIT
 ## Architecture
 
 ```
-Telegram User → Bot API → API Gateway → Producer Lambda → SQS Queue → Consumer Lambda
-↓ ↓
-Return 200 immediately agent-server Lambda
-
+Telegram User → Bot API → API Gateway → Producer Lambda → SQS FIFO Queue → Consumer Lambda
+
+Return 200 immediately agent-server Lambda
+
 DynamoDB (Session mapping) + S3 (Session files) + Bedrock (Claude)
 ```
 
 **Core Design**
 - Adopts the Hybrid Sessions pattern officially recommended by the Claude Agent SDK
-- **SQS Async Architecture**: Producer returns 200 to Telegram immediately; Consumer processes requests asynchronously
+- **SQS FIFO Async Architecture**: Producer returns 200 to Telegram immediately; Consumer processes requests asynchronously and message order is guaranteed
 
 ## Features
 
 - **Session Persistence**: DynamoDB stores the mapping, S3 stores conversation history, with cross-request recovery
 - **Multi-tenant Isolation**: Client isolation based on Telegram chat_id + thread_id
+- **Forum Group Support**: Topic-based conversation isolation with automatic permission precheck
+- **User Whitelist**: Controls private chat and group invitation permissions
 - **SubAgent Support**: Multiple specialized Agents can be configured (e.g., AWS support), with example implementations
 - **Skills Support**: Reusable skill modules, with a hello-world example
-- **MCP Integration**: Supports HTTP and local-command MCP servers
+- **MCP Integration**: Supports HTTP and local-command MCP servers (Node.js 20+)
+- **Security Verification**: Supports Telegram Webhook secret verification (HMAC)
 - **Auto Cleanup**: 25-day TTL + S3 lifecycle management
-- **SQS Queue**: Async processing + automatic retry + dead letter queue
+- **SQS FIFO Queue**: Ordered async processing + automatic retry + dead letter queue
 - **Quick Start**: Provides example Skill/SubAgent/MCP configurations; follow them to add other components
 
+## Commands
+
+| Command | Description |
+|------|------|
+| `/newchat <message>` | Create a new Topic in a Forum group and start a conversation |
+| `/debug` | Download the current session files (conversation.jsonl, debug.txt, todos.json) |
+| `/start` | Welcome message (private chat) |
+| `/help` | Show help information |
+
 ## Project Structure
 
 ```
@@ -241,7 +300,9 @@ Telegram User → Bot API → API Gateway → Producer Lambda → SQS Queue →
 ├── agent-sdk-client/ # Telegram client (ZIP deployment)
 │ ├── handler.py # Producer: receives Webhook, writes to SQS
 │ ├── consumer.py # Consumer: consumes from SQS, calls Agent
-│ └── config.py # Configuration management
+│ ├── config.py # Configuration management
+│ ├── config.toml # Command configuration
+│ └── security.py # Security utilities
 
 ├── docs/ # Documentation
 │ └── anthropic-agent-sdk-official/ # SDK official docs reference
@@ -283,28 +344,31 @@ sam deploy --guided
 | `BEDROCK_SECRET_ACCESS_KEY` | Bedrock secret key |
 | `SDK_CLIENT_AUTH_TOKEN` | Internal authentication token |
 | `TELEGRAM_BOT_TOKEN` | Telegram Bot Token |
+| `TELEGRAM_WEBHOOK_SECRET` | (Optional) Webhook secret for verification |
 | `QUEUE_URL` | SQS queue URL (auto-created) |
 
 ## Tech Stack
 
 - **Runtime**: Python 3.12 + Claude Agent SDK
 - **Compute**: AWS Lambda (ARM64)
 - **Storage**: S3 + DynamoDB
-- **Message Queue**: AWS SQS (Standard Queue + DLQ)
+- **Message Queue**: AWS SQS (FIFO Queue + DLQ)
 - **AI**: Claude via Amazon Bedrock
 - **Orchestration**: AWS SAM
 - **Integration**: Telegram Bot API + MCP
 
-## SQS Async Architecture
+## SQS FIFO Async Architecture
 
 **Problem Solved**: The Telegram Webhook times out and retries after ~27s, while Agent processing can take 30-70s, causing duplicate responses.
 
 **Solution**
-1. Producer Lambda receives the Webhook, writes to SQS, and returns 200 immediately (<1s)
+1. Producer Lambda receives the Webhook, writes to SQS FIFO, and returns 200 immediately (<1s)
 2. Consumer Lambda consumes from SQS, calls the Agent Server, and sends the response to Telegram
-3. Retries 3 times on failure; after the final failure the message moves to the dead letter queue (DLQ)
+3. The FIFO queue guarantees message order within the same session (MessageGroupId = chat_id:thread_id)
+4. Retries 3 times on failure; after the final failure the message moves to the dead letter queue (DLQ)
 
 **Queue Configuration**
+- FifoQueue: true (ordered delivery per MessageGroupId)
 - VisibilityTimeout: 900s (= Lambda timeout)
 - maxReceiveCount: 3 (retry 3 times)
 - DLQ Alarm: a CloudWatch alarm fires when messages enter the DLQ
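
Since Agent processing can take 30-70s, the changelog in this commit has the Consumer re-send a typing status every 4 seconds. The loop below is a minimal sketch of that idea, not the project's `consumer.py`. It assumes an async Consumer, and `keep_typing`, `send_typing` and `stop` are hypothetical names. In a real bot `send_typing` would wrap Telegram's `sendChatAction` call with the `typing` action.

```python
import asyncio
from typing import Awaitable, Callable


async def keep_typing(
    send_typing: Callable[[], Awaitable[None]],
    stop: asyncio.Event,
    interval: float = 4.0,
) -> int:
    """Send a typing status every `interval` seconds until `stop` is set."""
    sent = 0
    while not stop.is_set():
        await send_typing()
        sent += 1
        try:
            # Wake up early if the Agent finishes before the interval ends.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass  # Interval elapsed without a stop signal: refresh again.
    return sent
```

The Consumer would start this as a background task before calling the Agent Server and set `stop` once the reply is ready. Telegram clears the typing status after a few seconds, which is why the interval stays short.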
@@ -322,6 +386,25 @@ sam deploy --guided
 - `debug.txt` - Debug logs
 - `todos.json` - Task status
 
+## Configure Commands
+
+Edit `agent-sdk-client/config.toml`
+
+```toml
+[agent_commands]
+commands = ["/custom-skill", "/hello-world"]
+
+[local_commands]
+# Static reply
+help = { type = "static", response = "Hello World" }
+# Handler function
+newchat = { type = "handler", handler = "newchat" }
+debug = { type = "handler", handler = "debug" }
+
+[security]
+user_whitelist = ["all"] # or [123456789, 987654321]
+```
+
 ## Configure SubAgents
 
 Edit `agent-sdk-server/claude-config/agents.json`
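@@ -358,6 +441,17 @@ sam deploy --guided
 
 The examples configure AWS knowledge base MCP servers. Refer to the existing configuration to add more MCP servers.
 
+## Forum Group Setup
+
+To use the Bot in Telegram Forum groups:
+
+1. Enable the Topics feature in the group settings
+2. Add the Bot to the group (it must be added by a whitelisted user)
+3. Promote the Bot to admin and grant the "Manage Topics" permission
+4. Use `/newchat <message>` to create a new conversation Topic
+
+See [docs/forum-group-security.md](docs/forum-group-security.md) for details
+
 ## Quick Start Examples
 
 The project already includes the following example components; follow these examples to add other components: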

docs/sqs-async-architecture-plan.md

Lines changed: 29 additions & 1 deletion
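@@ -1,4 +1,6 @@
-# Refactor the SDK Client to an SQS Async Architecture
+# Refactor the SDK Client to an SQS FIFO Async Architecture
+
+> **Note**: This is a historical design document. v0.3.0 upgraded the queue to a FIFO queue to guarantee message ordering.
 
 ## Problem Analysis
 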
@@ -191,3 +193,29 @@ Resources:
 - In rare cases, Telegram users may receive duplicate responses
 - The Agent Server would run twice for the same question
 - If this is unacceptable, DynamoDB deduplication can be kept (checked on the Consumer side)
+
+---
+
+## v0.3.0 Update: FIFO Queue Upgrade
+
+### Reason for the Upgrade
+A Standard queue cannot guarantee message order within a session, which can lead to:
+- Later messages being processed first when a user sends several messages in a row
+- State overwrites and confused conversations
+
+### FIFO Queue Configuration
+```yaml
+TaskQueue:
+  Type: AWS::SQS::Queue
+  Properties:
+    QueueName: !Sub '${AWS::StackName}-TaskQueue.fifo'
+    FifoQueue: true
+    ContentBasedDeduplication: false
+    DeduplicationScope: messageGroup
+    FifoThroughputLimit: perMessageGroupId
+```
+
+### Key Parameters
+- **MessageGroupId**: `chat_id:thread_id` - messages within the same session are ordered
+- **MessageDeduplicationId**: `chat_id-message_id-uuid` - allows retries to be processed
+- **FifoThroughputLimit**: `perMessageGroupId` - different sessions can be processed in parallel
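
The key parameters above map onto the arguments of an SQS `send_message` call. The sketch below only builds that argument dict, so it runs without AWS credentials. `build_send_params` is a hypothetical helper, and the exact payload shape is an assumption.

```python
import json
import uuid


def build_send_params(
    queue_url: str, chat_id: int, thread_id: int, message_id: int, payload: dict
) -> dict:
    """Build keyword arguments for an SQS FIFO send_message call."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(payload),
        # Same chat and thread -> same group -> strict ordering.
        "MessageGroupId": f"{chat_id}:{thread_id}",
        # The random suffix makes each send unique, so a re-sent update is
        # not dropped by the deduplication window.
        "MessageDeduplicationId": f"{chat_id}-{message_id}-{uuid.uuid4()}",
    }


# Usage with boto3 (not executed here):
#   boto3.client("sqs").send_message(**build_send_params(...))
```

The trade-off in this design is that the queue itself no longer filters duplicates, so any deduplication that is still wanted has to happen elsewhere, for example in the Consumer.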
