
MemU

Always-On Proactive Memory for AI Agents



English | 中文 | 日本語 | 한국어 | Español | Français


MemU is a 24/7 proactive memory framework that continuously learns, anticipates, and adapts. It transforms passive LLM backends into intelligent agents with always-on memory that proactively surfaces insights, predicts needs, and evolves context without explicit queries.


⭐️ Star the repository

If you find memU useful or interesting, a GitHub Star ⭐️ would be greatly appreciated.

✨ Core Capabilities

| Capability | Description |
|------------|-------------|
| 🔄 Continuous Learning | 24/7 memory extraction from every interaction: conversations, documents, actions |
| 🎯 Proactive Retrieval | Anticipates information needs before being asked; surfaces relevant context automatically |
| 🧠 Context Evolution | Memory structure adapts in real time based on usage patterns and emerging topics |
| 🔍 Dual Intelligence | Fast embedding-based recall plus deep LLM reasoning for comprehensive understanding |
| 🎨 Multimodal Awareness | Unified memory across text, images, audio, and video; remembers what it sees and hears |

🔄 How Proactive Memory Works

Unlike traditional retrieval systems that wait for queries, MemU operates in continuous mode:

Passive vs. Proactive Memory

| Traditional RAG | MemU Proactive Memory |
|-----------------|------------------------|
| ❌ Waits for explicit queries | ✅ Monitors context continuously |
| ❌ Reactive information retrieval | ✅ Anticipates information needs |
| ❌ Static knowledge base | ✅ Self-evolving memory structure |
| ❌ One-time processing | ✅ Always-on learning pipeline |

Proactive Memory Lifecycle

┌─────────────────────────────────────────────────┐
│  1. CONTINUOUS INGESTION                        │
│  └─ Every conversation, document, action        │
│     automatically processed 24/7                │
└─────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────┐
│  2. REAL-TIME EXTRACTION                        │
│  └─ Immediate memory item creation              │
│     No batch delays, instant availability       │
└─────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────┐
│  3. PROACTIVE STRUCTURING                       │
│  └─ Auto-categorization into evolving topics    │
│     Hierarchical organization adapts to usage   │
└─────────────────────────────────────────────────┘
                      ↓
┌─────────────────────────────────────────────────┐
│  4. ANTICIPATORY RETRIEVAL                      │
│  └─ Surfaces relevant memory without prompting  │
│     Context-aware suggestions and insights      │
└─────────────────────────────────────────────────┘
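
In code, this lifecycle maps directly onto the memorize() and retrieve() calls documented under Core APIs below. A minimal sketch of the loop, assuming an already-configured service instance; the pending_files source and the final hand-off are illustrative, not part of MemU:

```python
# Sketch of the proactive lifecycle; `service` is a configured MemU service
# and `pending_files` is any iterable of new inputs (illustrative).
async def proactive_loop(service, pending_files):
    for path in pending_files:                 # 1. continuous ingestion
        await service.memorize(                # 2. real-time extraction
            resource_url=path,
            modality="document",
        )
        # 3. proactive structuring happens inside memorize():
        #    categories are auto-updated as items are extracted
        context = await service.retrieve(      # 4. anticipatory retrieval
            queries=[{"role": "user", "content": {"text": "current context"}}],
            method="rag",                      # cheap enough to run continuously
        )
        # hand context["items"] to the agent before the user asks (illustrative)
```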

🎯 Proactive Use Cases

1. Contextual Assistance

Agent monitors conversation context and proactively surfaces relevant memories

# User starts discussing a topic
User: "I'm thinking about that project..."

# MemU automatically retrieves without explicit query:
- Previous project discussions
- Related preferences and constraints
- Past decisions and their outcomes
- Relevant documents and resources

Agent: "Based on your previous work on the dashboard redesign,
        I noticed you preferred Material UI components..."

2. Predictive Preparation

Agent anticipates upcoming needs based on patterns

# Morning routine detection
User logs in at 9 AM (usual time)

# MemU proactively surfaces:
- Daily standup talking points
- Overnight notifications summary
- Priority tasks based on past behavior
- Relevant context from yesterday's work

Agent: "Good morning! Here's what's relevant today..."

3. Autonomous Memory Management

System self-organizes without manual intervention

# As interactions accumulate:
✓ Automatically creates new categories for emerging topics
✓ Consolidates related memories across modalities
✓ Identifies patterns and extracts higher-level insights
✓ Prunes outdated information while preserving history

# Result: Always-optimized memory structure

🗂️ Hierarchical Memory Architecture

MemU's three-layer system enables both reactive queries and proactive context loading:


| Layer | Reactive Use | Proactive Use |
|-------|--------------|---------------|
| Resource | Direct access to original data | Background monitoring for new patterns |
| Item | Targeted fact retrieval | Real-time extraction from ongoing interactions |
| Category | Summary-level overview | Automatic context assembly for anticipation |

Proactive Benefits:

  • Auto-categorization: New memories self-organize into topics
  • Pattern Detection: System identifies recurring themes
  • Context Prediction: Anticipates what information will be needed next

🚀 Quick Start

Option 1: Cloud Version

Experience proactive memory instantly:

👉 memu.so - Hosted service with 24/7 continuous learning

For enterprise deployment with custom proactive workflows, contact info@nevamind.ai

Cloud API (v3)

Base URL: https://api.memu.so
Auth header: Authorization: Bearer YOUR_API_KEY

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /api/v3/memory/memorize | Register a continuous learning task |
| GET | /api/v3/memory/memorize/status/{task_id} | Check real-time processing status |
| POST | /api/v3/memory/categories | List auto-generated categories |
| POST | /api/v3/memory/retrieve | Query memory (supports proactive context loading) |
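
As a sketch, registering a task and polling its status with Python's requests might look like the following. Only the endpoints and auth header above come from this table; the JSON body and response fields shown here are assumptions, so check the full API documentation below for the actual schema.

```python
import time
import requests

BASE_URL = "https://api.memu.so"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Register a continuous learning task.
# NOTE: the body fields below are hypothetical; see the v3 API docs for the real schema.
resp = requests.post(
    f"{BASE_URL}/api/v3/memory/memorize",
    headers=HEADERS,
    json={"resource_url": "https://example.com/chat.json", "modality": "conversation"},
)
task_id = resp.json()["task_id"]  # assumed response field

# Poll real-time processing status.
while True:
    status = requests.get(
        f"{BASE_URL}/api/v3/memory/memorize/status/{task_id}", headers=HEADERS
    ).json()
    if status.get("status") in ("done", "failed"):  # assumed status values
        break
    time.sleep(2)
```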

📚 Full API Documentation


Option 2: Self-Hosted

Installation

# Clone the repository, then install in editable mode
git clone https://github.com/NevaMind-AI/memU.git
cd memU
pip install -e .

Basic Example

Requirements: Python 3.13+ and an OpenAI API key

Test Continuous Learning (in-memory):

export OPENAI_API_KEY=your_api_key
cd tests
python test_inmemory.py

Test with Persistent Storage (PostgreSQL):

# Start PostgreSQL with pgvector
docker run -d \
  --name memu-postgres \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=memu \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# Run continuous learning test
export OPENAI_API_KEY=your_api_key
cd tests
python test_postgres.py

Both examples demonstrate proactive memory workflows:

  1. Continuous Ingestion: Process multiple files sequentially
  2. Auto-Extraction: Immediate memory creation
  3. Proactive Retrieval: Context-aware memory surfacing

See tests/test_inmemory.py and tests/test_postgres.py for implementation details.
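
For orientation, here is a minimal end-to-end run pieced together from the configuration and API calls documented below. The class name and profile keys follow the provider examples in the next section; the model names and file path are placeholders:

```python
import asyncio
from memu import MemUService  # class name as in the provider example below

async def main():
    service = MemUService(
        llm_profiles={
            "default": {
                "base_url": "https://api.openai.com/v1",
                "api_key": "your_openai_api_key",
                "chat_model": "gpt-4o-mini",              # placeholder model
                "embed_model": "text-embedding-3-small",  # placeholder model
            },
        },
        database_config={
            "metadata_store": {"provider": "inmemory"},  # as in the OpenRouter example
        },
    )
    # Continuous learning: ingest one resource (path is a placeholder)
    await service.memorize(
        resource_url="path/to/conversation.json",
        modality="conversation",
        user={"user_id": "123"},
    )
    # Proactive retrieval: surface context for the same user
    result = await service.retrieve(
        queries=[{"role": "user", "content": {"text": "What are their preferences?"}}],
        where={"user_id": "123"},
        method="rag",
    )
    print(result["items"])

asyncio.run(main())
```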


Custom LLM and Embedding Providers

MemU supports custom LLM and embedding providers beyond OpenAI. Configure them via llm_profiles:

from memu import MemUService

service = MemUService(
    llm_profiles={
        # Default profile for LLM operations
        "default": {
            "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "your_api_key",
            "chat_model": "qwen3-max",
            "client_backend": "sdk"  # "sdk" or "http"
        },
        # Separate profile for embeddings
        "embedding": {
            "base_url": "https://api.voyageai.com/v1",
            "api_key": "your_voyage_api_key",
            "embed_model": "voyage-3.5-lite"
        }
    },
    # ... other configuration
)

OpenRouter Integration

MemU supports OpenRouter as a model provider, giving you access to multiple LLM providers through a single API.

Configuration

from memu import MemoryService

service = MemoryService(
    llm_profiles={
        "default": {
            "provider": "openrouter",
            "client_backend": "httpx",
            "base_url": "https://openrouter.ai",
            "api_key": "your_openrouter_api_key",
            "chat_model": "anthropic/claude-3.5-sonnet",  # Any OpenRouter model
            "embed_model": "openai/text-embedding-3-small",  # Embedding model
        },
    },
    database_config={
        "metadata_store": {"provider": "inmemory"},
    },
)

Environment Variables

| Variable | Description |
|----------|-------------|
| OPENROUTER_API_KEY | Your OpenRouter API key from openrouter.ai/keys |

Supported Features

| Feature | Status | Notes |
|---------|--------|-------|
| Chat Completions | Supported | Works with any OpenRouter chat model |
| Embeddings | Supported | Use OpenAI embedding models via OpenRouter |
| Vision | Supported | Use vision-capable models (e.g., openai/gpt-4o) |

Running OpenRouter Tests

export OPENROUTER_API_KEY=your_api_key

# Full workflow test (memorize + retrieve)
python tests/test_openrouter.py

# Embedding-specific tests
python tests/test_openrouter_embedding.py

# Vision-specific tests
python tests/test_openrouter_vision.py

See examples/example_4_openrouter_memory.py for a complete working example.


📖 Core APIs

memorize() - Continuous Learning Pipeline

Processes inputs in real-time and immediately updates memory:


```python
result = await service.memorize(
    resource_url="path/to/file.json",  # File path or URL
    modality="conversation",           # conversation | document | image | video | audio
    user={"user_id": "123"}            # Optional: scope to a user
)
```

Returns immediately with extracted memory:

```python
{
    "resource": {...},     # Stored resource metadata
    "items": [...],        # Extracted memory items (available instantly)
    "categories": [...]    # Auto-updated category structure
}
```


**Proactive Features:**
- Zero-delay processing: memories available immediately
- Automatic categorization without manual tagging
- Cross-reference with existing memories for pattern detection

### `retrieve()` - Dual-Mode Intelligence

MemU supports both **proactive context loading** and **reactive querying**:

<img width="100%" alt="retrieve" src="assets/retrieve.png" />

#### RAG-based Retrieval (`method="rag"`)

Fast **proactive context assembly** using embeddings:

- ✅ **Instant context**: Sub-second memory surfacing
- ✅ **Background monitoring**: Can run continuously without LLM costs
- ✅ **Similarity scoring**: Identifies most relevant memories automatically

#### LLM-based Retrieval (`method="llm"`)

Deep **anticipatory reasoning** for complex contexts:

- ✅ **Intent prediction**: LLM infers what the user needs before they ask
- ✅ **Query evolution**: Automatically refines search as context develops
- ✅ **Early termination**: Stops when sufficient context is gathered

#### Comparison

| Aspect | RAG (Fast Context) | LLM (Deep Reasoning) |
|--------|-------------------|---------------------|
| **Speed** | ⚡ Milliseconds | 🐢 Seconds |
| **Cost** | 💰 Embedding only | 💰💰 LLM inference |
| **Proactive use** | Continuous monitoring | Triggered context loading |
| **Best for** | Real-time suggestions | Complex anticipation |

#### Usage
```python
# Proactive retrieval with context history
result = await service.retrieve(
    queries=[
        {"role": "user", "content": {"text": "What are their preferences?"}},
        {"role": "user", "content": {"text": "Tell me about work habits"}}
    ],
    where={"user_id": "123"},  # Optional: scope filter
    method="rag"  # or "llm" for deeper reasoning
)

# Returns context-aware results:
{
    "categories": [...],     # Relevant topic areas (auto-prioritized)
    "items": [...],          # Specific memory facts
    "resources": [...],      # Original sources for traceability
    "next_step_query": "..." # Predicted follow-up context
}
```

Proactive Filtering: Use where to scope continuous monitoring:

  • where={"user_id": "123"} - User-specific context
  • where={"agent_id__in": ["1", "2"]} - Multi-agent coordination
  • Omit where for global context awareness

📚 For complete API documentation, see SERVICE_API.md - includes proactive workflow patterns, pipeline configuration, and real-time update handling.


💡 Proactive Scenarios

Example 1: Always-Learning Assistant

Continuously learns from every interaction without explicit memory commands:

export OPENAI_API_KEY=your_api_key
python examples/example_1_conversation_memory.py

Proactive Behavior:

  • Automatically extracts preferences from casual mentions
  • Builds relationship models from interaction patterns
  • Surfaces relevant context in future conversations
  • Adapts communication style based on learned preferences

Best for: Personal AI assistants, customer support that remembers, social chatbots


Example 2: Self-Improving Agent

Learns from execution logs and proactively suggests optimizations:

export OPENAI_API_KEY=your_api_key
python examples/example_2_skill_extraction.py

Proactive Behavior:

  • Monitors agent actions and outcomes continuously
  • Identifies patterns in successes and failures
  • Auto-generates skill guides from experience
  • Proactively suggests strategies for similar future tasks

Best for: DevOps automation, agent self-improvement, knowledge capture


Example 3: Multimodal Context Builder

Unifies memory across different input types for comprehensive context:

export OPENAI_API_KEY=your_api_key
python examples/example_3_multimodal_memory.py

Proactive Behavior:

  • Cross-references text, images, and documents automatically
  • Builds unified understanding across modalities
  • Surfaces visual context when discussing related topics
  • Anticipates information needs by combining multiple sources

Best for: Documentation systems, learning platforms, research assistants


📊 Performance

MemU achieves 92.09% average accuracy on the LoCoMo benchmark across all reasoning tasks, demonstrating reliable proactive memory operations.


View detailed experimental data: memU-experiment


🧩 Ecosystem

| Repository | Description | Proactive Features |
|------------|-------------|--------------------|
| memU | Core proactive memory engine | 24/7 learning pipeline, auto-categorization |
| memU-server | Backend with continuous sync | Real-time memory updates, webhook triggers |
| memU-ui | Visual memory dashboard | Live memory evolution monitoring |



🤝 Partners

Ten OpenAgents Milvus xRoute Jazz Buddie Bytebase LazyLLM


🤝 How to Contribute

We welcome contributions from the community! Whether you're fixing bugs, adding features, or improving documentation, your help is appreciated.

Getting Started

To start contributing to MemU, you'll need to set up your development environment:

Prerequisites

  • Python 3.13+
  • uv (Python package manager)
  • Git

Setup Development Environment

# 1. Fork and clone the repository
git clone https://github.com/YOUR_USERNAME/memU.git
cd memU

# 2. Install development dependencies
make install

The make install command will:

  • Create a virtual environment using uv
  • Install all project dependencies
  • Set up pre-commit hooks for code quality checks

Running Quality Checks

Before submitting your contribution, ensure your code passes all quality checks:

make check

The make check command runs:

  • Lock file verification: Ensures pyproject.toml consistency
  • Pre-commit hooks: Lints code with Ruff, formats with Black
  • Type checking: Runs mypy for static type analysis
  • Dependency analysis: Uses deptry to find obsolete dependencies

Contributing Guidelines

For detailed contribution guidelines, code standards, and development practices, please see CONTRIBUTING.md.

Quick tips:

  • Create a new branch for each feature or bug fix
  • Write clear commit messages
  • Add tests for new functionality
  • Update documentation as needed
  • Run make check before pushing

📄 License

Apache License 2.0


🌍 Community


⭐ Star us on GitHub to get notified about new releases!