Add streaming preprocess mode for inference with optional disk persistence by vadanamu · Pull Request #29 · vadanamu/DeepRM

vadanamu · 2026-04-18T03:33:59Z

The existing inference flow requires preprocessing to write .npz shards to disk before inference can start, which adds I/O overhead and rules out online inference.
This PR provides a stream mode that preprocesses POD5/BAM input on the fly and feeds chunks directly to the inference pipeline, with the option of also persisting the same chunks to disk.
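
The callback pattern described above can be illustrated with a small, self-contained sketch. It is not the DeepRM code: `preprocess_stream`, `run_model`, and `run_inference_stream_demo` are hypothetical names, the chunk type is a plain list of floats, the `emit_chunk_fn` signature is assumed, and JSON files stand in for the .npz shards so that the example needs only the standard library.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List, Optional

Chunk = List[float]  # stand-in for a preprocessed signal chunk (assumption)


def preprocess_stream(
    signals: Iterable[float],
    chunk_size: int,
    emit_chunk_fn: Callable[[int, Chunk], None],
    save_dir: Optional[Path] = None,
) -> int:
    """Group samples into chunks, optionally persist each one, then emit it."""
    buffer: Chunk = []
    n_chunks = 0

    def flush() -> None:
        nonlocal buffer, n_chunks
        if not buffer:
            return
        if save_dir is not None:
            # JSON stands in for the .npz shards mentioned in the PR.
            path = save_dir / f"chunk_{n_chunks:05d}.json"
            path.write_text(json.dumps(buffer))
        emit_chunk_fn(n_chunks, buffer)  # hand the chunk to the consumer
        buffer = []
        n_chunks += 1

    for sample in signals:
        buffer.append(sample)
        if len(buffer) >= chunk_size:
            flush()
    flush()  # emit the final, possibly partial, chunk
    return n_chunks


def run_model(chunk: Chunk) -> float:
    """Placeholder for model inference: returns the chunk mean."""
    return sum(chunk) / len(chunk)


def run_inference_stream_demo(
    signals: Iterable[float],
    chunk_size: int,
    save_dir: Optional[Path] = None,
) -> List[float]:
    """Consume chunks through a callback, with no intermediate files required."""
    results: List[float] = []

    def on_chunk(index: int, chunk: Chunk) -> None:
        results.append(run_model(chunk))

    preprocess_stream(signals, chunk_size, on_chunk, save_dir)
    return results
```

With these placeholder definitions, `run_inference_stream_demo([1.0, 2.0, 3.0, 4.0, 5.0], 2)` returns `[1.5, 3.5, 5.0]`. Passing a directory as `save_dir` additionally writes one file per chunk, which mirrors the optional persistence described above.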

Add CLI options to deeprm call run: --preprocess-mode (disk|stream), --pod5, --stream-save-dir, and several --prep-* knobs to control streaming preprocess behavior.
Extend src/deeprm/inference/inference_preprocess_python.py with _prepare_bam_dataframe, _list_pod5_paths, stream(...), and dataframe_to_chunk(...); segment_normalize_signal(...) now emits in-memory chunk payloads via an emit_chunk_fn while optionally writing .npz via write_to_disk.
Refactor src/deeprm/inference/inference.py by adding _init_model_for_device(...) and _run_chunk_inference(...), and implement run_inference_stream(...), which consumes preprocessed chunks via a callback and runs inference without intermediate files.
Preserve the existing disk-backed behavior (--preprocess-mode=disk) and make streamed execution a drop-in alternative that can also persist streamed chunks when --stream-save-dir is provided.
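
As a rough illustration of how the options listed above might be declared, here is a minimal argparse sketch. Only the flag names --preprocess-mode, --pod5, and --stream-save-dir come from this PR. Everything else is an assumption: the use of argparse, the default of disk, the two validation rules, and the helper names `build_parser` and `parse_args`. The --prep-* knobs are left out because the description does not name them.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the three options named in the PR description."""
    parser = argparse.ArgumentParser(prog="deeprm call run")
    parser.add_argument(
        "--preprocess-mode",
        choices=["disk", "stream"],
        default="disk",  # assumed default, so existing behavior is unchanged
        help="read preprocessed shards from disk, or preprocess on the fly",
    )
    parser.add_argument("--pod5", default=None, help="POD5 input for stream mode")
    parser.add_argument(
        "--stream-save-dir",
        default=None,
        help="if given, also persist streamed chunks to this directory",
    )
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse and validate; the validation rules here are assumptions."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.preprocess_mode == "stream" and args.pod5 is None:
        parser.error("--pod5 is required when --preprocess-mode=stream")
    if args.preprocess_mode == "disk" and args.stream_save_dir is not None:
        parser.error("--stream-save-dir only applies to --preprocess-mode=stream")
    return args
```

Under these assumptions, `parse_args([])` selects disk mode, while `parse_args(["--preprocess-mode", "stream"])` exits with a usage error because no POD5 input was given.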

Ran python -m compileall -q src/deeprm/inference/inference.py src/deeprm/inference/inference_preprocess_python.py, which succeeded.
Ran pytest -q tests/test_imports.py, which passed.
Ran the full pytest -q suite, which failed in this environment because the CLI help tests invoke the deeprm console script, which is not installed on the PATH (an automated-environment failure unrelated to the diff logic).
Ran PYTHONPATH=src python -m deeprm call run --help, which printed a Torch availability message and did not proceed because torch is not installed in the test environment (an environment limitation, not a code error).
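
One possible way to keep such environment-dependent failures from masking real regressions is to skip CLI tests when the console script is missing. The sketch below is not part of this PR: `console_script_available` and `CliHelpTest` are hypothetical names, and the expectation that `deeprm --help` exits with status 0 is an assumption that may not hold when torch is absent.

```python
import shutil
import subprocess
import unittest

CONSOLE_SCRIPT = "deeprm"


def console_script_available(name: str = CONSOLE_SCRIPT) -> bool:
    """Return True if the named executable can be found on the PATH."""
    return shutil.which(name) is not None


class CliHelpTest(unittest.TestCase):
    @unittest.skipUnless(
        console_script_available(),
        "deeprm console script is not installed on the PATH",
    )
    def test_help_exits_cleanly(self) -> None:
        # Assumption: a working install returns exit status 0 for --help.
        result = subprocess.run(
            [CONSOLE_SCRIPT, "--help"], capture_output=True, text=True
        )
        self.assertEqual(result.returncode, 0)
```

pytest collects unittest.TestCase classes, so a guard like this would report the test as skipped rather than failed in an environment without the installed script.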

Add streamed inference-preprocess mode with optional chunk persistence

7bd6c7f

vadanamu added the codex label Apr 18, 2026 — with ChatGPT Codex Connector
