Add file locking during model download to prevent race conditions #18

## Problem

When multiple processes or threads attempt to download the model simultaneously (e.g., during parallel test execution with `pytest -n auto`), race conditions can occur that corrupt the downloaded model file. This results in `INVALID_PROTOBUF` errors when trying to load the model.

## Current Behavior

- `iscc_sct.utils.get_model()` downloads the model without file locking
- Multiple concurrent calls can write to the same file simultaneously
- Downloaded file can become corrupted
- No atomic download mechanism (download to temp, then move)
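
The last point can be illustrated with a small, self-contained sketch. This is not the current `iscc_sct` code: `write_in_place`, `write_atomically`, and `sizes_seen_while_writing` are hypothetical helpers, the list of byte chunks stands in for a streamed download, and the `observe` callback plays the role of a second process looking at the target path mid-download.

```python
import os
import tempfile
from pathlib import Path


def write_in_place(target, chunks, observe):
    """Stream chunks straight into target (no atomicity)."""
    with open(target, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            f.flush()
            observe(target)  # a concurrent reader could look at the file here


def write_atomically(target, chunks, observe):
    """Stream chunks into a temp file in the same directory, then rename."""
    fd, tmp_name = tempfile.mkstemp(dir=Path(target).parent, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as f:
            for chunk in chunks:
                f.write(chunk)
                f.flush()
                observe(target)
        # Atomic when the temp file and the target share a filesystem
        os.replace(tmp_name, target)
    except BaseException:
        Path(tmp_name).unlink(missing_ok=True)
        raise


def sizes_seen_while_writing(writer, target, chunks):
    """Record the size of target after every chunk (None if it does not exist yet)."""
    seen = []

    def observe(path):
        p = Path(path)
        seen.append(p.stat().st_size if p.exists() else None)

    writer(target, chunks, observe)
    return seen
```

With three 4-byte chunks, the in-place writer exposes a growing file (sizes 4, 8, 12) under the final name, so a concurrent loader may read a truncated model. The atomic writer reports the target as missing until the rename, so a reader sees either no file or the complete one.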

## Expected Behavior

- Only one process should download the model at a time
- Other processes should wait for the download to complete
- Use file locking (e.g., `fcntl.flock` on Unix, `msvcrt.locking` on Windows, or `filelock` library)
- Download to temporary file, verify integrity, then atomically rename
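
For reference, the `fcntl.flock` option could look roughly like the sketch below. It is Unix-only and the `exclusive_lock` helper is a hypothetical name, not part of `iscc_sct`. Windows would need the `msvcrt.locking` route instead, which is one reason the cross-platform `filelock` library may be the simpler choice.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path, blocking=True):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    # Each call opens its own descriptor, so separate callers contend for the lock
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        # In non-blocking mode this raises BlockingIOError if the lock is already held
        fcntl.flock(fd, flags)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

The lock is advisory: it only protects the download if every code path that writes the model file acquires it first.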

## Suggested Implementation

```python
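import os
import tempfile
from pathlib import Path

from filelock import FileLock

# MODEL_PATH, MODEL_URL, MODEL_CHECKSUM, download_file and check_integrity
# are assumed to be defined in iscc_sct.utils already.


def get_model():
    MODEL_PATH.parent.mkdir(parents=True, exist_ok=True)
    lock_path = MODEL_PATH.parent / f"{MODEL_PATH.name}.lock"

    with FileLock(str(lock_path), timeout=300):  # 5 minute timeout
        # Check again after acquiring lock (another process may have downloaded)
        if MODEL_PATH.exists() and check_integrity(MODEL_PATH, MODEL_CHECKSUM):
            return MODEL_PATH

        # Create the temp file in the target directory so the rename stays on
        # one filesystem; close the handle first so the rename works on Windows
        fd, tmp_name = tempfile.mkstemp(dir=MODEL_PATH.parent)
        os.close(fd)
        tmp_path = Path(tmp_name)
        try:
            download_file(MODEL_URL, tmp_name)

            # Verify integrity before moving
            if not check_integrity(tmp_name, MODEL_CHECKSUM):
                raise ValueError("Downloaded model failed integrity check")

            # Atomic rename
            tmp_path.replace(MODEL_PATH)
        finally:
            # Clean up if the download or the integrity check failed
            tmp_path.unlink(missing_ok=True)

    return MODEL_PATH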
```
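
The snippet above relies on an integrity check. If the existing `check_integrity` helper turns out not to fit, a streaming hash comparison along these lines would do the job. This is only a sketch: `sha256_matches` is a hypothetical name, and SHA-256 is an assumption, since the hash that `MODEL_CHECKSUM` actually uses may differ.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=1024 * 1024):
    """Return True if the SHA-256 of the file at path equals expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large model files are not loaded into memory at once
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest() == expected_hex.lower()
```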

## Impact

This would make iscc-sct safe to use in:
- Parallel test environments
- Multi-worker application servers
- Container deployments with shared volumes
- Any concurrent execution scenario
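
A regression test for these scenarios could start from a small harness like the one below. It is a sketch under assumptions: `call_concurrently` is a hypothetical helper, and it uses threads, so it only exercises in-process concurrency. Reproducing what `pytest -n auto` does would need separate processes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def call_concurrently(fn, workers=8):
    """Call fn from `workers` threads released at the same moment; return all results."""
    barrier = threading.Barrier(workers)

    def task():
        barrier.wait()  # line all workers up so the calls overlap
        return fn()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task) for _ in range(workers)]
        # result() re-raises any exception raised inside a worker
        return [f.result() for f in futures]
```

In a test, `fn` would be `get_model` with the network download replaced by a stub that counts its calls. The assertions would then be that every worker gets the same path back and that the stub ran exactly once.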

## Related

- Other ML libraries that download models (transformers, sentence-transformers, etc.) face the same problem and use file locking to address it
