Problem
When multiple processes or threads attempt to download the model simultaneously (e.g., during parallel test execution with pytest -n auto), race conditions can occur that corrupt the downloaded model file. This results in INVALID_PROTOBUF errors when trying to load the model.
Current Behavior
- iscc_sct.utils.get_model() downloads the model without file locking
- Multiple concurrent calls can write to the same file simultaneously
- Downloaded file can become corrupted
- No atomic download mechanism (download to temp, then move)
Expected Behavior
- Only one process should download the model at a time
- Other processes should wait for the download to complete
- Use file locking (e.g., fcntl.flock on Unix, msvcrt.locking on Windows, or the filelock library)
- Download to temporary file, verify integrity, then atomically rename
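For reference, the platform-specific locking calls named above can be wrapped in a small context manager. This is only a sketch using the standard library; the name exclusive_lock is hypothetical and not part of iscc-sct. The filelock library is built on the same primitives and adds a timeout, so it is likely the more practical choice.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive lock on lock_path while the with-block runs.

    Hypothetical helper for illustration; not part of iscc-sct.
    """
    # Open (or create) the lock file; its contents are never used.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        if sys.platform == "win32":
            import msvcrt

            # LK_LOCK retries once per second and raises OSError after 10 attempts.
            msvcrt.locking(fd, msvcrt.LK_LOCK, 1)
            try:
                yield
            finally:
                os.lseek(fd, 0, os.SEEK_SET)  # unlock the same byte that was locked
                msvcrt.locking(fd, msvcrt.LK_UNLCK, 1)
        else:
            import fcntl

            # Blocks until no other open handle holds the lock.
            fcntl.flock(fd, fcntl.LOCK_EX)
            try:
                yield
            finally:
                fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A caller would wrap the existence check and the download in `with exclusive_lock(str(model_path) + ".lock"):`, where model_path stands for wherever the model is cached.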
Suggested Implementation
import tempfile
from pathlib import Path
from filelock import FileLock

def get_model():
    lock_path = MODEL_PATH.parent / f"{MODEL_PATH.name}.lock"

    with FileLock(str(lock_path), timeout=300):  # 5 minute timeout
        # Check again after acquiring lock (another process may have downloaded)
        if MODEL_PATH.exists() and check_integrity(MODEL_PATH, MODEL_CHECKSUM):
            return MODEL_PATH

        # Download to temporary file
        with tempfile.NamedTemporaryFile(delete=False, dir=MODEL_PATH.parent) as tmp:
            download_file(MODEL_URL, tmp.name)

        # Verify integrity before moving
        if not check_integrity(tmp.name, MODEL_CHECKSUM):
            Path(tmp.name).unlink()
            raise ValueError("Downloaded model failed integrity check")

        # Atomic rename
        Path(tmp.name).replace(MODEL_PATH)

    return MODEL_PATH
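The snippet above calls check_integrity and download_file without showing them. The stand-ins below are a sketch of what they might look like, assuming the checksum is a SHA-256 hex digest and the model is served over a plain URL. The actual helpers and checksum format in iscc_sct.utils may differ.

```python
import hashlib
import shutil
import urllib.request


def check_integrity(path, expected_sha256):
    """Return True if the file's SHA-256 hex digest equals expected_sha256.

    Assumption: the checksum is SHA-256; the real project may use another hash.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so a large model is never fully held in memory.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()


def download_file(url, dest):
    """Stream url to dest, overwriting any existing file."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
```

Streaming both the download and the hash keeps memory use flat regardless of model size.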
Impact
This would make iscc-sct safe to use in:
- Parallel test environments
- Multi-worker application servers
- Container deployments with shared volumes
- Any concurrent execution scenario
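A regression test for the parallel case could look roughly like the sketch below. It is Unix-only (it uses fcntl), it simulates the download with fixed bytes instead of touching the network, and it uses threads as a stand-in for processes. The names fetch_once, run_concurrently, PAYLOAD, and model.bin are hypothetical.

```python
import fcntl
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

PAYLOAD = b"fake-model-bytes" * 1024  # stand-in for the real model file


def fetch_once(model_path, download_log):
    """Create model_path at most once, even when called concurrently."""
    lock_path = model_path.with_name(model_path.name + ".lock")
    # Each caller opens its own handle, so flock excludes the other callers.
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            if not model_path.exists():  # re-check while holding the lock
                fd, tmp_name = tempfile.mkstemp(dir=model_path.parent)
                with os.fdopen(fd, "wb") as tmp:
                    tmp.write(PAYLOAD)  # simulated download
                os.replace(tmp_name, model_path)  # atomic on the same filesystem
                download_log.append(tmp_name)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return model_path


def run_concurrently(workers=8):
    """Call fetch_once from several threads; return (file bytes, download count)."""
    with tempfile.TemporaryDirectory() as cache_dir:
        model_path = Path(cache_dir) / "model.bin"
        download_log = []
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(lambda _: fetch_once(model_path, download_log), range(workers)))
        return model_path.read_bytes(), len(download_log)
```

A test would then assert that the returned bytes equal PAYLOAD and that the download count is 1. A closer match to pytest -n auto would spawn separate processes, at the cost of a more involved harness.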
Related
- Other ML libraries that download models (transformers, sentence-transformers, etc.) face the same problem and handle it with file locking