This directory contains experimental code, offered as-is. Its components are not part of the core tau2 benchmark.
⚠️ Important: The code in this directory is experimental and may not be fully tested or supported. Use at your own discretion.
The experiments/ folder holds experimental features and research code that extend beyond the core tau2 benchmark: new features, prototypes, and novel approaches that are not part of the core evaluation framework. These components are provided for research purposes and to enable advanced use cases.
This directory is organized into subdirectories for different types of experimental components. Each subdirectory should contain its own README with specific documentation and usage instructions.
To contribute experimental code:
- Create a new subdirectory for your experiment
- Add a comprehensive README.md explaining the purpose and usage
- Include example scripts and basic tests
- Follow the development guidelines below
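The steps above can be sketched as a small scaffolding helper. This is only an illustration under stated assumptions: the function name `scaffold_experiment`, the file names `example.py` and `tests/test_basic.py`, and the README section titles are hypothetical and are not prescribed by tau2.

```python
from pathlib import Path


def scaffold_experiment(root: Path, name: str) -> Path:
    """Create a skeleton for a new experiment under `root`.

    The layout used here (README.md, example.py, tests/) is an assumed
    convention for illustration, not something tau2 requires.
    """
    exp_dir = root / name
    # Fail loudly instead of overwriting an existing experiment.
    exp_dir.mkdir(parents=True, exist_ok=False)

    # README stub covering purpose and usage.
    (exp_dir / "README.md").write_text(
        f"# {name}\n\n## Purpose\n\n## Usage\n", encoding="utf-8"
    )

    # Placeholder example script.
    (exp_dir / "example.py").write_text(
        '"""Example script for this experiment."""\n', encoding="utf-8"
    )

    # Placeholder basic test.
    tests_dir = exp_dir / "tests"
    tests_dir.mkdir()
    (tests_dir / "test_basic.py").write_text(
        "def test_placeholder():\n    assert True\n", encoding="utf-8"
    )
    return exp_dir
```

Calling `scaffold_experiment(Path("experiments"), "my_experiment")` would create the subdirectory with stub files to fill in; it raises `FileExistsError` if the subdirectory already exists.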
When working with experimental code:
- Backward Compatibility: Maintain compatibility with core tau2 interfaces when possible
- Documentation: Each experimental component should have its own README
- Testing: Include basic testing scripts and examples
- Dependencies: Manage dependencies carefully to avoid conflicts with core tau2
- Isolation: Keep experimental code self-contained within this directory
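The documentation guideline lends itself to an automated check. The sketch below lists experiment subdirectories that lack a README.md. The function name `missing_readmes` and the rule of skipping directories whose names start with `.` or `_` are assumptions made for this example, not part of tau2.

```python
from pathlib import Path


def missing_readmes(experiments_root: Path) -> list[str]:
    """Return the names of experiment subdirectories without a README.md.

    Hidden and private directories (names starting with "." or "_") are
    skipped; that filter is an assumption made for this sketch.
    """
    return sorted(
        child.name
        for child in experiments_root.iterdir()
        if child.is_dir()
        and not child.name.startswith((".", "_"))
        and not (child / "README.md").is_file()
    )
```

A check like this could run in a test or pre-commit hook, failing when the returned list is non-empty.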
Experimental contributions are welcome! Please:
- Add comprehensive documentation in your subfolder's README
- Include example usage and test scripts
- Mark any breaking changes or dependencies clearly
- Keep the experimental nature in mind: code doesn't need to be production-ready
Since this is experimental code:
- No guarantees of stability or continued support
- Community-driven - contributions and improvements welcome
- Use at your own risk - test thoroughly before production use
- Documentation-first - refer to individual README files for detailed usage
For core tau2 benchmark support, see the main project documentation.