中文版 | English | 📚 Wiki
```bash
# Basic dependencies
pip install numpy

# Optional: OpenAI support
pip install openai python-dotenv

# Optional: Better vector embeddings
pip install sentence-transformers

# Optional: Enhanced experiments (statistical analysis and visualization)
pip install scipy matplotlib
```

Create a `.env` file:
```bash
# OpenAI Official API
OPENAI_API_KEY=sk-your-key-here
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_MODEL=gpt-3.5-turbo

# Or use Azure OpenAI
# OPENAI_API_KEY=your-azure-key
# OPENAI_API_BASE=https://your-resource.openai.azure.com
# OPENAI_MODEL=your-deployment-name

# Or use local models (e.g., Ollama)
# OPENAI_API_BASE=http://localhost:11434/v1
# OPENAI_MODEL=llama2
```

```bash
# Basic experiment (4-round dialogue)
python cognitive_workspace_poc.py

# Enhanced experiments (10-round dialogue + multi-hop reasoning + conflict resolution)
python cognitive_workspace_enhanced.py
```

Requires an OpenAI API key and demonstrates real LLM behavioral differences:
- Higher quality task decomposition
- More accurate information prediction
- More coherent answer generation
No API key required; uses rule-based simulation:
- Still demonstrates architectural differences
- Suitable for proof-of-concept
- Fully reproducible
Uses local models like Ollama:
- Data privacy
- No API costs
- Performance depends on local hardware
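For example, a minimal sketch of pointing an OpenAI-compatible client at a local Ollama server (this assumes the `openai>=1.0` Python client; the repo's scripts read the same values from `.env` via python-dotenv instead):

```python
# Minimal sketch: OpenAI-compatible client against a local Ollama server.
# Assumes the openai>=1.0 Python client; the model name is illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",  # Ollama ignores the key, but the client requires one
)

response = client.chat.completions.create(
    model="llama2",
    messages=[{"role": "user", "content": "Summarize active memory management."}],
)
print(response.choices[0].message.content)
```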
Compares Cognitive Workspace vs traditional RAG on single complex questions:
- Operation count difference (12 vs 3)
- Operation type difference (active vs passive)
- Memory management difference (hierarchical vs flat)
- Single-turn memory reuse rate: 50% vs 0%
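For reference, a reuse rate like the 50% above can be computed from an operation log along these lines (the log format here is hypothetical, not the repo's actual data structure):

```python
# Hypothetical sketch: computing a memory-reuse rate from an operation log.
# The log structure is illustrative, not the repo's actual format.
operations = [
    {"op": "retrieve", "reused": False},  # first fetch: nothing to reuse
    {"op": "prepare",  "reused": False},  # proactively staged buffer
    {"op": "retrieve", "reused": True},   # hit on the staged buffer
    {"op": "retrieve", "reused": True},   # another reuse hit
]
reuse_rate = sum(op["reused"] for op in operations) / len(operations)
print(f"reuse rate: {reuse_rate:.0%}")  # -> 50%
```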
Demonstrates cumulative advantages from state persistence:
| Round | CW Reuse Rate | RAG Reuse Rate |
|-------|---------------|----------------|
| 1     | 50.0%         | 0%             |
| 2     | 55.0%         | 0%             |
| 3     | 56.7%         | 0%             |
| 4     | 56.4%         | 0%             |
Average reuse rate: 54.5% vs 0%
Memory advantages in long-term conversations:
Average reuse rate: 57.1% vs 0%
Net efficiency gain: 17.3%
Cohen's d: 23.2 (huge effect)
P-value: < 0.001 (extremely significant)
Advantages in complex reasoning chains:
Average reuse rate: 58.8% vs 0%
Net efficiency gain: 17.9%
Cohen's d: 190.0 (extremely large effect)
Operations saved: 194
Performance when handling contradictory information:
Average reuse rate: 59.8% vs 0%
Net efficiency gain: 17.8%
Cohen's d: 195.7 (extremely large effect)
Operations saved: 226
- `cognitive_workspace_results.json`: Basic experiment results
- `enhanced_results.json`: Detailed enhanced experiment results
- `cognitive_workspace_analysis.png`: Experiment visualization charts
- `.env.example`: Environment variable template (created if `.env` doesn't exist)
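To inspect the saved results programmatically, something like this works (the top-level JSON keys depend on the experiment, so list them before digging in):

```python
# Load and inspect a results file; top-level keys vary by experiment.
import json

with open("cognitive_workspace_results.json") as f:
    results = json.load(f)

print(list(results))  # list the top-level keys before digging in
```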
- Basic experiment (4 rounds): Average 54.5%, reuse starts from round 1
- 10-round dialogue: Average 57.1%, long-term dialogue advantage clear
- Multi-hop reasoning: Average 58.8%, higher reuse rate for complex tasks
- Conflict resolution: Average 59.8%, best performance in information integration scenarios
- Traditional RAG: Always 0% (stateless)
Net efficiency = Reuse rate / (1 + Extra operation ratio)

- 10-round dialogue: 17.3% net improvement
- Multi-hop reasoning: 17.9% net improvement
- Conflict resolution: 17.8% net improvement
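As a worked check of the formula against the 10-round figures (the extra-operation ratio below is back-solved from the reported 17.3%, so it is an assumption rather than a measured value):

```python
# Worked check: net efficiency = reuse_rate / (1 + extra_operation_ratio).
reuse_rate = 0.571            # 57.1% average reuse in the 10-round dialogue
extra_operation_ratio = 2.3   # assumed: back-solved from the 17.3% figure
net_efficiency = reuse_rate / (1 + extra_operation_ratio)
print(f"{net_efficiency:.1%}")  # -> 17.3%
```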
- P-values: All experiments < 0.001 (extremely significant)
- Cohen's d effect size:
- 10-round dialogue: 23.2 (huge)
- Multi-hop reasoning: 190.0 (extremely large)
- Conflict resolution: 195.7 (extremely large)
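A minimal sketch of how these statistics can be computed with scipy (the sample arrays are illustrative placeholders, not the experiments' actual per-round data):

```python
# Minimal sketch: significance test and Cohen's d for reuse-rate samples.
# The arrays are illustrative placeholders, not the real experimental data.
import numpy as np
from scipy import stats

cw = np.array([0.55, 0.57, 0.58, 0.56, 0.59])  # per-round CW reuse rates
rag = np.zeros_like(cw)                         # stateless RAG never reuses

t_stat, p_value = stats.ttest_ind(cw, rag, equal_var=False)  # Welch's t-test

# Cohen's d using the pooled standard deviation
pooled_sd = np.sqrt((cw.var(ddof=1) + rag.var(ddof=1)) / 2)
cohens_d = (cw.mean() - rag.mean()) / pooled_sd
print(f"p = {p_value:.2e}, d = {cohens_d:.1f}")
```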
- Cognitive Workspace: Sub-linear growth (reduces redundant computation through memory reuse)
- Traditional RAG: Linear growth (starts fresh for each query)
- Cognitive Workspace: Dynamically tracks task completion and information sufficiency
- Traditional RAG: No confidence concept
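As an illustration, confidence tracking of this kind can be as simple as the following sketch (the class and its sufficiency rule are hypothetical, not the repo's implementation):

```python
# Hypothetical sketch of metacognitive confidence tracking; the class,
# fields, and sufficiency rule are illustrative, not the repo's actual code.
class ConfidenceTracker:
    def __init__(self, required_facts):
        self.required_facts = set(required_facts)
        self.gathered_facts = set()

    def record(self, fact):
        """Mark a required piece of information as gathered."""
        if fact in self.required_facts:
            self.gathered_facts.add(fact)

    def confidence(self):
        """Information sufficiency: fraction of required facts gathered."""
        return len(self.gathered_facts) / len(self.required_facts)

    def gaps(self):
        """Information gaps that still block a confident answer."""
        return self.required_facts - self.gathered_facts

tracker = ConfidenceTracker({"definition", "mechanism", "evidence"})
tracker.record("definition")
print(tracker.confidence(), tracker.gaps())
```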
This code supports the following paper arguments:

- **Active memory management outperforms passive retrieval**
  - Code proof: Task decomposition, information prediction, active preparation
- **State persistence improves efficiency**
  - Code proof: Memory reuse in multi-turn dialogues
- **Hierarchical buffers optimize resource utilization**
  - Code proof: immediate→working→episodic promotion mechanism (see the sketch after this list)
- **Metacognitive control enhances intelligence**
  - Code proof: Confidence tracking, information gap identification
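A minimal sketch of the immediate→working→episodic promotion mechanism named above (tier names follow the paper; the access-count promotion rule is an assumption, not the repo's exact logic):

```python
# Hypothetical sketch of hierarchical buffers with promotion; the tier
# names follow the paper, but the access-count rule is an assumption.
from collections import defaultdict

class HierarchicalMemory:
    TIERS = ("immediate", "working", "episodic")

    def __init__(self, promote_after=2):
        self.buffers = {tier: {} for tier in self.TIERS}
        self.access_counts = defaultdict(int)
        self.promote_after = promote_after

    def store(self, key, value):
        """New items always enter the immediate buffer."""
        self.buffers["immediate"][key] = value

    def recall(self, key):
        """Look up a key across tiers; frequently reused items move up."""
        for tier in self.TIERS:
            if key in self.buffers[tier]:
                value = self.buffers[tier][key]
                self.access_counts[key] += 1
                if self.access_counts[key] >= self.promote_after:
                    self._promote(key, tier)
                return value
        return None

    def _promote(self, key, tier):
        """Move a hot item one tier up: immediate -> working -> episodic."""
        i = self.TIERS.index(tier)
        if i < len(self.TIERS) - 1:
            self.buffers[self.TIERS[i + 1]][key] = self.buffers[tier].pop(key)
            self.access_counts[key] = 0

mem = HierarchicalMemory()
mem.store("task_goal", "compare CW with traditional RAG")
mem.recall("task_goal")  # first access: stays in immediate
mem.recall("task_goal")  # second access: promoted to working
print("task_goal" in mem.buffers["working"])  # True
```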
Q: Why are the results meaningful without a real LLM?
A: Because we prove architectural behavioral differences, not generation quality. Even with rule-based simulation, the differences between active vs passive and stateful vs stateless remain obvious.
Q: How do I reference this code in my paper?
A: Use the following format in your LaTeX:

```latex
Code available at: \url{https://github.com/tao-hpu/cognitive-workspace}
```

Q: How much do the full experiments cost?
A: Full experiments require approximately:
- Single-turn experiment: ~10 API calls
- Multi-turn experiment: ~20 API calls
- Total cost: < $0.05 (using GPT-3.5-turbo)
Q: Can I use models other than OpenAI's?
A: Yes! The code supports:
- OpenAI-compatible APIs (by modifying OPENAI_API_BASE)
- Local models (Ollama, llama.cpp)
- Any service providing chat/completion interfaces
Problem: `openai.error.AuthenticationError` or connection timeout

Solutions:
- Verify your API key is correct in `.env`
- Check the `OPENAI_API_BASE` URL format (it should end with `/v1`)
- For Azure OpenAI, ensure you're using the correct endpoint format
- Test the connection:

```bash
curl -H "Authorization: Bearer $OPENAI_API_KEY" $OPENAI_API_BASE/models
```
Problem: `ModuleNotFoundError: No module named 'sentence_transformers'`

Solutions:
- Install the missing dependency: `pip install sentence-transformers`
- For full functionality: `pip install openai python-dotenv sentence-transformers scipy matplotlib`
- Check your Python version (requires 3.7+)
Problem: Reuse rates or metrics don't match documentation
Solutions:
- Simulation mode (no API key): Results are deterministic but simplified
- Full mode (with API key): Results vary slightly due to LLM randomness
- Set temperature=0 in code for more consistent results
- Run multiple trials for statistical validity
- Ensure you're comparing same experiment (basic vs enhanced)
Problem: Script runs slowly or uses too much memory
Solutions:
- Start with the basic experiment first: `python cognitive_workspace_poc.py`
- Reduce the number of documents in the test data
- For local models, ensure adequate RAM (8GB+ recommended)
- Check if background processes are consuming resources
Problem: Missing .json or .png output files
Solutions:
- Check for errors in console output
- Ensure write permissions in current directory
- For visualization: verify matplotlib is installed
- Run with: `python cognitive_workspace_enhanced.py 2>&1 | tee output.log`
- Add longer-term tests (20+ rounds)

  ```python
  # Modify the question list in cognitive_workspace_enhanced.py
  extended_questions = [...20 questions...]
  ```

- Integrate real vector databases

  ```python
  # Use ChromaDB or Pinecone
  from chromadb import Client
  ```

- Add more statistical tests

  ```python
  # Mann-Whitney U test, Friedman test, etc.
  from scipy import stats
  stats.mannwhitneyu(cw_results, rag_results)
  ```

- Performance benchmarking

  ```python
  # Test performance at different scales
  for doc_count in [10, 100, 1000]:
      test_scalability(doc_count)
  ```
We welcome contributions to improve this proof-of-concept implementation! Here's how you can help:
- Bug Reports: Open an issue describing the problem with steps to reproduce
- Feature Suggestions: Propose new experiments or architectural improvements
- Code Improvements: Submit pull requests for bug fixes or enhancements
- Documentation: Improve README, add code comments, or create tutorials
- Testing: Add test cases or validate results on different platforms
- Fork the repository and create your branch from `main`
- Make your changes with clear, descriptive commit messages
- Test your changes thoroughly (run both basic and enhanced experiments)
- Update documentation if you change functionality
- Submit a pull request with a clear description of your changes
- Be respectful and constructive in discussions
- Focus on the technical merits of contributions
- Help maintain this as a research and educational resource
- Issues: For bug reports and feature requests, use GitHub Issues
- Discussions: For questions and general discussion, start a GitHub Discussion
- Documentation: Check the Wiki for additional resources
If you're interested in collaborating on research related to Cognitive Workspace or have academic questions about the paper:
- Author: Tao An
- Paper: arXiv:2508.13171
- For research inquiries, please reference the paper for contact information
If you discover a security vulnerability, please report it privately rather than opening a public issue.
If you use this code, please cite:
```bibtex
@article{an2025cognitive,
  title={Cognitive Workspace: Towards Functional Infinite Context Through Active Memory Management},
  author={Tao An},
  year={2025},
  eprint={2508.13171},
  archivePrefix={arXiv},
  primaryClass={cs.AI}
}
```

MIT License - Free to use, modify, and distribute.