Closed
80 commits
db432ad
feat: update log context
Oct 22, 2025
9502acc
feat: update log context
Oct 22, 2025
d9f863e
fix: sqlite list users error (#384)
fridayL Oct 23, 2025
b5ea7e6
feat: introduce async memory add for TreeTextMemory using MemSchedule…
CaralHsi Oct 23, 2025
d74e628
feat: update mcp
Oct 23, 2025
32b2ac1
feat: update mcp
Oct 23, 2025
e4c6b92
feat: add error log
Oct 23, 2025
c27bd61
feat: add error log
Oct 23, 2025
6769b4c
feat: add error log
Oct 23, 2025
01547e1
feat: update log
Oct 24, 2025
a19584f
feat: add chat_time
Oct 24, 2025
8dfa338
feat: add chat_time
Oct 24, 2025
a91e3e2
feat: add chat_time
Oct 24, 2025
5b962e2
feat: update log
Oct 24, 2025
69a6e9a
feat: update log
Oct 24, 2025
6efe419
add pm and pref eval scripts (#385)
Nyakult Oct 24, 2025
d325a31
feat: update log
Oct 24, 2025
f0e5f5c
feat: update log
Oct 24, 2025
7fc8c05
feat: update log
Oct 24, 2025
651e8df
Meger update about scheduler and new api to Dev (#386)
tangg555 Oct 25, 2025
0b2b6ed
Feat/merge inst cplt to dev (#388)
Wang-Daoji Oct 25, 2025
185ed93
feat: add arms
Oct 26, 2025
f641b70
feat: add arms
Oct 26, 2025
d5c59a0
fix: format
Oct 26, 2025
b144470
fix: format
Oct 26, 2025
33921b7
feat: add dockerfile
Oct 26, 2025
49a9079
feat: add dockerfile
Oct 26, 2025
27c49b6
feat: add arms config
Oct 26, 2025
60c5dd8
feat: update log
Oct 26, 2025
3096321
feat: add sleep time
Oct 26, 2025
204efef
feat: add sleep time
Oct 26, 2025
f6e96d5
Feat: add reranker strategies and update configs (#390)
fridayL Oct 27, 2025
e069928
modify code in evaluation (#392)
Wang-Daoji Oct 27, 2025
84adda6
fix bug in pref_mem return (#399)
Wang-Daoji Oct 27, 2025
ce34bd1
add polardb (#395)
wustzdy Oct 27, 2025
83a7c34
feat: fix mode (#400)
lijicode Oct 28, 2025
018d759
Feat: remove long waring for internet and add content for memreader (…
fridayL Oct 28, 2025
e2c9cbf
fix: conflict
Oct 28, 2025
33a41e8
feat: update log
Oct 28, 2025
cf23174
feat: delete dockerfile
Oct 28, 2025
18e2eda
feat: delete dockerfile
Oct 28, 2025
f9a18a5
feat: update dockerfile
Oct 28, 2025
399e200
fix: conflict
Oct 28, 2025
1d4f3d1
fix: conflict
Oct 28, 2025
92be50b
feat: replace ThreadPool to context
Oct 28, 2025
8a1fd64
feat: add timed log
Oct 28, 2025
3680286
feat: redis for sync history memories and new api of mixture search (…
tangg555 Oct 28, 2025
7d7f731
fix: conflict
Oct 28, 2025
5ff29d1
memos online api eval scripts and readme (#403)
Nyakult Oct 28, 2025
1f6757d
feat: fix sources (#404)
lijicode Oct 28, 2025
e21f5bb
fix porlar (#406)
lijicode Oct 29, 2025
d79647e
Feat/arms (#402)
CarltonXiang Oct 29, 2025
f8859f1
Hotfix: memos playground prompt reverse (#408)
fridayL Oct 29, 2025
7eb531b
Feat/pref optimize update (#409)
Wang-Daoji Oct 29, 2025
4ed7574
feat: fix polardb graph (#411)
lijicode Oct 29, 2025
fef40e9
feat: async add api (#410)
CaralHsi Oct 29, 2025
6e219c4
use nacos (#407)
lijicode Oct 29, 2025
f74ea76
feat: async add api (#413)
CaralHsi Oct 29, 2025
5923001
revision of mixture api: add conversation turn and reduce 2 stage ran…
tangg555 Oct 29, 2025
a375911
Feat: add recall strategy (#414)
whipser030 Oct 29, 2025
5b8893e
Revert "Feat: add recall strategy " (#415)
CaralHsi Oct 30, 2025
445c597
Feat: add new recall and verify (#416)
fridayL Oct 30, 2025
0765e1c
Feat: remove usage data (#417)
fridayL Oct 30, 2025
39a4f29
feat: add moniter schedule (#419)
CaralHsi Oct 30, 2025
a4d1e7b
feat:turn off graph call (#418)
whipser030 Oct 30, 2025
87e2699
pm & prefEval scripts updates (#421)
Nyakult Oct 30, 2025
81c7ad9
add polardb pool (#420)
wustzdy Oct 30, 2025
25c7642
Feat/pref optimize update (#422)
Wang-Daoji Oct 30, 2025
0e7128e
fix:tree file change Searcher inputs (#423)
whipser030 Oct 30, 2025
aa80863
Feat/pref optimize update (#425)
Wang-Daoji Oct 30, 2025
8be2e80
Fix/query schedule (#424)
CaralHsi Oct 30, 2025
28cf578
fix: message schema bug (#426)
CaralHsi Oct 30, 2025
af89531
fix commit (#427)
wustzdy Oct 30, 2025
9c5d9fb
Feat/pref optimize update (#429)
Wang-Daoji Oct 30, 2025
c7e9af4
Feat/pref optimize update (#431)
Wang-Daoji Oct 31, 2025
387fe8a
Feat/pref optimize update (#432)
Wang-Daoji Nov 3, 2025
9fea59b
feat: add request log
Nov 3, 2025
4f96241
Merge branch 'main' into dev
CaralHsi Nov 3, 2025
4b72a63
feat: add request log
Nov 3, 2025
c3b9e83
fix: merge dev conflict
Nov 3, 2025
1 change: 1 addition & 0 deletions .gitignore
@@ -15,6 +15,7 @@ evaluation/.env
!evaluation/configs-example/*.json
evaluation/configs/*
**tree_textual_memory_locomo**
**script.py**
.env
evaluation/scripts/personamem

2 changes: 1 addition & 1 deletion docker/requirements.txt
@@ -157,4 +157,4 @@ volcengine-python-sdk==4.0.6
watchfiles==1.1.0
websockets==15.0.1
xlrd==2.0.2
xlsxwriter==3.2.5
xlsxwriter==3.2.5
8 changes: 7 additions & 1 deletion docs/openapi.json
@@ -884,7 +884,7 @@
"type": "string",
"title": "Session Id",
"description": "Session ID for the MOS. This is used to distinguish between different dialogue",
"default": "0ce84b9c-0615-4b9d-83dd-fba50537d5d3"
"default": "41bb5e18-252d-4948-918c-07d82aa47086"
},
"chat_model": {
"$ref": "#/components/schemas/LLMConfigFactory",
@@ -939,6 +939,12 @@
"description": "Enable parametric memory for the MemChat",
"default": false
},
"enable_preference_memory": {
"type": "boolean",
"title": "Enable Preference Memory",
"description": "Enable preference memory for the MemChat",
"default": false
},
"enable_mem_scheduler": {
"type": "boolean",
"title": "Enable Mem Scheduler",
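The hunk above adds an `enable_preference_memory` property whose schema default is `false`. As a rough illustration of what that default means for a client that omits the field, here is a minimal Python sketch. The `apply_defaults` helper is hypothetical and not part of MemOS; only the property name, title, description, and default are taken from the diff.

```python
import json

# Fragment copied from the added lines in docs/openapi.json.
SCHEMA_FRAGMENT = json.loads("""
{
  "enable_preference_memory": {
    "type": "boolean",
    "title": "Enable Preference Memory",
    "description": "Enable preference memory for the MemChat",
    "default": false
  }
}
""")


def apply_defaults(payload, properties):
    """Return a copy of payload with schema defaults filled in for missing keys.

    Hypothetical helper for illustration only; it is not a MemOS API.
    """
    merged = dict(payload)
    for name, spec in properties.items():
        if name not in merged and "default" in spec:
            merged[name] = spec["default"]
    return merged


if __name__ == "__main__":
    # A config that omits the new flag gets the schema default.
    print(apply_defaults({}, SCHEMA_FRAGMENT))
    # An explicit value is left untouched.
    print(apply_defaults({"enable_preference_memory": True}, SCHEMA_FRAGMENT))
```

Under this reading, existing configs that never mention the flag keep preference memory disabled, so the schema change should be backward compatible.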
22 changes: 14 additions & 8 deletions evaluation/.env-example
@@ -3,21 +3,27 @@ MODEL="gpt-4o-mini"
OPENAI_API_KEY="sk-***REDACTED***"
OPENAI_BASE_URL="http://***.***.***.***:3000/v1"

MEM0_API_KEY="m0-***REDACTED***"

ZEP_API_KEY="z_***REDACTED***"

# response model
CHAT_MODEL="gpt-4o-mini"
CHAT_MODEL_BASE_URL="http://***.***.***.***:3000/v1"
CHAT_MODEL_API_KEY="sk-***REDACTED***"

# memos
MEMOS_KEY="Token mpg-xxxxx"
MEMOS_URL="https://apigw-pre.memtensor.cn/api/openmem/v1"
PRE_SPLIT_CHUNK=false # pre split chunk in client end
MEMOS_URL="http://127.0.0.1:8001"
MEMOS_ONLINE_URL="https://memos.memtensor.cn/api/openmem/v1"

# other memory agents
MEM0_API_KEY="m0-xxx"
ZEP_API_KEY="z_xxx"
MEMU_API_KEY="mu_xxx"
SUPERMEMORY_API_KEY="sm_xxx"
MEMOBASE_API_KEY="xxx"
MEMOBASE_PROJECT_URL="http://***.***.***.***:8019"

MEMOBASE_API_KEY="xxxxx"
MEMOBASE_PROJECT_URL="http://xxx.xxx.xxx.xxx:8019"
# eval settings
PRE_SPLIT_CHUNK=false

# Configuration Only For Scheduler
# RabbitMQ Configuration
@@ -38,4 +44,4 @@ MEMSCHEDULER_GRAPHDBAUTH_URI=bolt://localhost:7687
MEMSCHEDULER_GRAPHDBAUTH_USER=neo4j
MEMSCHEDULER_GRAPHDBAUTH_PASSWORD=***
MEMSCHEDULER_GRAPHDBAUTH_DB_NAME=neo4j
MEMSCHEDULER_GRAPHDBAUTH_AUTO_CREATE=true
MEMSCHEDULER_GRAPHDBAUTH_AUTO_CREATE=true
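The updated `.env-example` separates a local endpoint (`MEMOS_URL`) from the hosted one (`MEMOS_ONLINE_URL`). The sketch below shows one way an evaluation script might read those values and pick an endpoint. It rests on an assumption: that the framework name `memos-api` maps to `MEMOS_URL` and `memos-api-online` maps to `MEMOS_ONLINE_URL`. Both `parse_env` and `memos_base_url` are hypothetical helpers, not functions from the repository.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep everything up to the matching closing quote.
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop a trailing inline comment.
            value = value.split("#", 1)[0].strip()
        values[key.strip()] = value
    return values


# Assumed mapping from framework name to the variable holding its base URL.
URL_VARIABLE = {
    "memos-api": "MEMOS_URL",
    "memos-api-online": "MEMOS_ONLINE_URL",
}


def memos_base_url(env, framework):
    """Return the base URL for a framework name, or raise ValueError."""
    try:
        variable = URL_VARIABLE[framework]
    except KeyError:
        raise ValueError(f"unsupported framework: {framework}") from None
    return env[variable]


if __name__ == "__main__":
    sample = (
        'MEMOS_URL="http://127.0.0.1:8001"\n'
        'MEMOS_ONLINE_URL="https://memos.memtensor.cn/api/openmem/v1"\n'
        "PRE_SPLIT_CHUNK=false\n"
    )
    env = parse_env(sample)
    print(memos_base_url(env, "memos-api"))
```

A real script would more likely load the file with a library such as `python-dotenv`; the hand-rolled parser here only keeps the example self-contained.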
41 changes: 37 additions & 4 deletions evaluation/README.md
@@ -1,6 +1,6 @@
# Evaluation Memory Framework

This repository provides tools and scripts for evaluating the LoCoMo dataset using various models and APIs.
This repository provides tools and scripts for evaluating the `LoCoMo`, `LongMemEval`, `PrefEval`, and `PersonaMem` datasets using various models and APIs.

## Installation

@@ -21,11 +21,33 @@ This repository provides tools and scripts for evaluating the LoCoMo dataset usi

2. Copy the `configs-example/` directory to a new directory named `configs/`, and modify the configuration files inside it as needed. This directory contains model and API-specific settings.

## Setup MemOS
### local server
```bash
# modify {project_dir}/.env file and start server
uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8001 --workers 8

# configure {project_dir}/evaluation/.env file
MEMOS_URL="http://127.0.0.1:8001"
```
### online service
```bash
# get your api key at https://memos-dashboard.openmem.net/cn/quickstart/
# configure {project_dir}/evaluation/.env file
MEMOS_KEY="Token mpg-xxxxx"
MEMOS_ONLINE_URL="https://memos.memtensor.cn/api/openmem/v1"

```

## Supported frameworks
We support `memos-api` and `memos-api-online` in our scripts.
We also provide unofficial implementations for the following memory frameworks: `zep`, `mem0`, `memobase`, `supermemory`, `memu`.


## Evaluation Scripts

### LoCoMo Evaluation
⚙️ To evaluate the **LoCoMo** dataset using one of the supported memory frameworks — `memos`, `mem0`, or `zep` — run the following [script](./scripts/run_locomo_eval.sh):
⚙️ To evaluate the **LoCoMo** dataset using one of the supported memory frameworks, run the following [script](./scripts/run_locomo_eval.sh):

```bash
# Edit the configuration in ./scripts/run_locomo_eval.sh
@@ -45,10 +67,21 @@ First prepare the dataset `longmemeval_s` from https://huggingface.co/datasets/x
./scripts/run_lme_eval.sh
```

### prefEval Evaluation
### PrefEval Evaluation
Download `benchmark_dataset/filtered_inter_turns.json` from https://github.com/amazon-science/PrefEval/blob/main/benchmark_dataset/filtered_inter_turns.json and save it as `./data/prefeval/filtered_inter_turns.json`.
To evaluate the **PrefEval** dataset, run the following [script](./scripts/run_prefeval_eval.sh):

```bash
# Edit the configuration in ./scripts/run_prefeval_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
./scripts/run_prefeval_eval.sh
```

### personaMem Evaluation
### PersonaMem Evaluation
Get `questions_32k.csv` and `shared_contexts_32k.jsonl` from https://huggingface.co/datasets/bowen-upenn/PersonaMem and save them in `data/personamem/`.
```bash
# Edit the configuration in ./scripts/run_pm_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
# If you want to use MIRIX, edit the configuration in ./scripts/personamem/config.yaml
./scripts/run_pm_eval.sh
```
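The PrefEval and PersonaMem sections above each require dataset files to be downloaded by hand before the scripts run. A small pre-flight check can catch a missing file early. The sketch below is hypothetical and not part of the repository; the file paths come from the README text, and it assumes they are relative to the `evaluation/` directory.

```python
from pathlib import Path

# Paths as stated in the README, assumed relative to the evaluation directory.
REQUIRED_FILES = {
    "prefeval": ["data/prefeval/filtered_inter_turns.json"],
    "personamem": [
        "data/personamem/questions_32k.csv",
        "data/personamem/shared_contexts_32k.jsonl",
    ],
}


def missing_files(benchmark, root="."):
    """Return the required files for a benchmark that are absent under root."""
    base = Path(root)
    return [path for path in REQUIRED_FILES[benchmark] if not (base / path).is_file()]


if __name__ == "__main__":
    for name in REQUIRED_FILES:
        absent = missing_files(name)
        status = "ready" if not absent else "missing: " + ", ".join(absent)
        print(f"{name}: {status}")
```

Running it from the evaluation directory before `./scripts/run_prefeval_eval.sh` or `./scripts/run_pm_eval.sh` lists any file that still needs to be downloaded.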