Background
In the legacy ONE framework (C++), the typical workflow relied on .cfg files and on invoking onecc from the command line. In the LLM era, however, we are developing TICO, a Python-first library that imports PyTorch modules directly and exports them to Circle.
To support this new workflow, we have published onecc as a Python library, so it can be invoked programmatically from Python instead of through the CLI and config files.
As a result, the overall compilation and quantization pipeline needs to be clarified and demonstrated with concrete examples.
What
We want to add example pipelines to the TICO repository that clearly show how onecc and TICO should be used together, depending on the model type and quantization strategy.
Proposed Examples
We plan to add two example pipelines:
1. Legacy / non-Transformer models
Quantization handled by onecc
Pipeline:
- Start from a PyTorch module
- Export to Circle
- Run:
- one-optimize
- one-quantize (performed by onecc, legacy flow)
This example targets:
- Existing / legacy models
- Users migrating from the traditional ONE workflow
- Cases where quantization is still owned by onecc
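The legacy pipeline above could be sketched roughly as follows. This is only an illustrative skeleton: the function names (`export_to_circle`, `one_optimize`, `one_quantize`) are placeholders standing in for the real TICO export and onecc Python entry points, whose exact signatures are not specified in this issue.

```python
# Sketch of the legacy flow: export first, then let onecc own both
# optimization and quantization. All functions are hypothetical stand-ins
# that only model the shape of the pipeline via artifact paths.

def export_to_circle(module_name: str) -> str:
    """Stand-in for TICO's PyTorch -> Circle export; returns the artifact path."""
    return f"{module_name}.circle"

def one_optimize(circle_path: str) -> str:
    """Stand-in for onecc's one-optimize pass."""
    return circle_path.replace(".circle", ".opt.circle")

def one_quantize(circle_path: str) -> str:
    """Stand-in for onecc's one-quantize pass (quantization owned by onecc)."""
    return circle_path.replace(".circle", ".q8.circle")

def legacy_pipeline(module_name: str) -> list[str]:
    """Export -> one-optimize -> one-quantize, returning each artifact in order."""
    exported = export_to_circle(module_name)
    optimized = one_optimize(exported)
    quantized = one_quantize(optimized)
    return [exported, optimized, quantized]
```

The key point the real example should demonstrate is the ordering: the model is exported to Circle first, and every quantization decision happens afterwards, inside onecc.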
2. Transformer / LLM-style models
Quantization handled by TICO
Pipeline:
- Start from a PyTorch module
- Perform quantization inside TICO (Python-level PTQ logic)
- Export quantized model to Circle
- Run:
- one-optimize only
This example targets:
- Transformer-based models
- LLM workloads
- Workflows where quantization must be tightly coupled with PyTorch semantics
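The Transformer/LLM pipeline could be sketched in the same illustrative style. Again, `tico_ptq_quantize`, `export_to_circle`, and `one_optimize` are hypothetical placeholders for the real TICO PTQ and onecc APIs; the sketch only captures the step ordering, in particular that no one-quantize step runs after export.

```python
# Sketch of the TICO-owned quantization flow: quantize at the PyTorch
# level first, then export, then run one-optimize only. All functions are
# hypothetical stand-ins that model the pipeline via artifact names.

def tico_ptq_quantize(module_name: str) -> str:
    """Stand-in for TICO's Python-level PTQ on the PyTorch module itself."""
    return f"{module_name}.ptq"

def export_to_circle(module_name: str) -> str:
    """Stand-in for TICO's PyTorch -> Circle export of the quantized module."""
    return f"{module_name}.circle"

def one_optimize(circle_path: str) -> str:
    """Stand-in for onecc's one-optimize pass."""
    return circle_path.replace(".circle", ".opt.circle")

def llm_pipeline(module_name: str) -> list[str]:
    """PTQ in TICO -> export -> one-optimize; no one-quantize step follows."""
    quantized = tico_ptq_quantize(module_name)
    exported = export_to_circle(quantized)
    optimized = one_optimize(exported)
    return [quantized, exported, optimized]
```

Placed side by side with the legacy sketch, this makes the intended contrast explicit: quantization moves before export and into TICO, so onecc only optimizes the already-quantized Circle model.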