Problem
There's no explicit way to pre-download the models without writing Python code. This makes it difficult to:
- Pre-download models in container builds
- Cache models in CI/CD pipelines
- Ensure models are available before running the application
- Verify model integrity explicitly
Current Workaround
Users must write custom Python code:
import iscc_sct.code_semantic_text as sct
import iscc_sct.utils as sct_utils
# Trigger download
sct.model()
# or
sct_utils.get_model()
This is not intuitive and requires understanding the internal API.
Expected Behavior
Add CLI commands to explicitly download, verify, and inspect models:
# Download all models
iscc-sct download-models
# Verify models are present and valid
iscc-sct verify-models
# Show model information
iscc-sct model-info
Example Output
$ iscc-sct download-models
Downloading ONNX model (iscc-sct-v0.1.0.onnx)...
  URL: https://github.com/iscc/iscc-sct/releases/download/v0.1.0/iscc-sct-v0.1.0.onnx
  Size: 435 MB
  Progress: ████████████████████ 100%
  Checksum: ✓ Verified (BLAKE3)
  Location: /home/user/.local/share/iscc-sct/iscc-sct-v0.1.0.onnx
Downloading tokenizer model...
  ✓ Complete
All models downloaded successfully.

$ iscc-sct verify-models
✓ ONNX model: OK
✓ Tokenizer model: OK

$ iscc-sct model-info
ONNX Model:
  Version: v0.1.0
  Path: /home/user/.local/share/iscc-sct/iscc-sct-v0.1.0.onnx
  Size: 435 MB
  Checksum: valid
Tokenizer Model:
  Path: /home/user/.local/share/iscc-sct/tokenizer
  Size: 1.2 MB
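The "Checksum: ✓ Verified" line in the example output implies hashing the downloaded file and comparing the result against a known value. A minimal sketch of that step follows. It is not the project's implementation: `file_digest` and `is_intact` are hypothetical names, and SHA-256 from the standard library stands in for BLAKE3, which would need a third-party package.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so a large model is never fully in memory.

    ASSUMPTION: SHA-256 is a stand-in here; the issue mentions BLAKE3,
    which is not available in hashlib.
    """
    hasher = hashlib.new(algorithm)
    with path.open("rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_intact(path: Path, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True only if the file exists and its digest matches expected_hex."""
    if not path.is_file():
        return False
    return file_digest(path, algorithm) == expected_hex.lower()
```

Reading in chunks matters for a file of several hundred megabytes: memory use stays at roughly `chunk_size` regardless of model size.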
Use Cases
Container Builds
# Pre-download models during build
RUN iscc-sct download-models
CI/CD
- name: Setup models
  run: iscc-sct download-models
- name: Verify models
  run: iscc-sct verify-models
User Scripts
#!/bin/bash
# Ensure models are available before starting app
iscc-sct verify-models || iscc-sct download-models
python app.py
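For applications that prefer to do this check at startup from Python rather than in a shell wrapper, the same "verify, otherwise download" logic can be sketched as below. `ensure_models` is a hypothetical helper, and the `verify` and `download` callables are placeholders for whatever the library ends up exposing; nothing here is an existing iscc-sct API.

```python
from typing import Callable


def ensure_models(verify: Callable[[], bool], download: Callable[[], None]) -> bool:
    """Download only when verification fails.

    Returns True if a download was performed, False if the models were
    already valid. Raises RuntimeError if they are still invalid afterwards.
    """
    if verify():
        return False
    download()
    # Re-check so a corrupted or partial download is not silently accepted.
    if not verify():
        raise RuntimeError("models still invalid after download")
    return True
```

Passing the two steps in as callables keeps the sketch independent of the final command or function names.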
Implementation
Add new commands to the existing sct CLI:
import typer
from iscc_sct import utils

app = typer.Typer()


@app.command()
def download_models():
    """Download all required models."""
    typer.echo("Downloading models...")
    utils.get_model()
    # Download tokenizer if needed
    typer.echo("All models downloaded successfully.")


@app.command()
def verify_models():
    """Verify all models are present and valid."""
    try:
        if not utils.MODEL_PATH.exists():
            raise FileNotFoundError("ONNX model not found")
        if not utils.check_integrity(utils.MODEL_PATH, utils.MODEL_CHECKSUM):
            raise ValueError("ONNX model integrity check failed")
        typer.echo("✓ All models verified")
    except Exception as e:
        typer.echo(f"✗ Verification failed: {e}", err=True)
        raise typer.Exit(1)


@app.command()
def model_info():
    """Show information about downloaded models."""
    # Display model paths, sizes, versions, etc.
    pass
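The `model_info` stub could be filled in along these lines. This is only a sketch using the standard library: `human_size` and `describe_model` are hypothetical helpers, the decimal units (1 MB = 1,000,000 bytes) are an assumption, and the "Status" line for a missing file is illustrative rather than part of the proposed output.

```python
from pathlib import Path
from typing import List


def human_size(num_bytes: int) -> str:
    """Format a byte count with decimal units (assumed: 1 MB = 1,000,000 bytes)."""
    size = float(num_bytes)
    units = ("B", "kB", "MB", "GB")
    for unit in units[:-1]:
        if size < 1000:
            break
        size /= 1000
    else:
        # Loop finished without break: the value is at least 1 GB.
        unit = units[-1]
    return f"{size:.0f} B" if unit == "B" else f"{size:.1f} {unit}"


def describe_model(name: str, path: Path) -> List[str]:
    """Build the output lines for one model: a header, its path, and its size."""
    lines = [f"{name}:", f"  Path: {path}"]
    if path.is_file():
        lines.append(f"  Size: {human_size(path.stat().st_size)}")
    else:
        lines.append("  Status: not downloaded")
    return lines
```

Inside the command, the result could be printed with `typer.echo("\n".join(describe_model("ONNX Model", utils.MODEL_PATH)))`, assuming `utils.MODEL_PATH` is a `pathlib.Path` as the `verify_models` code suggests.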
Benefits
- Clear, explicit way to manage models
- Better user experience
- Simplifies container and CI/CD workflows
- Follows CLI best practices
- Makes model management transparent