Author: Mr. Watson 🦄 Date: 2026-02-08
- Goal
- Quick operations
- Ingest folders (SFTP)
- Supported formats
- Pipeline service
- Precision setup (high)
- Important GPU note for this host (RTX 2070 Super 8GB)
- Environment config
- Operations
- Current status snapshot (2026-02-08)
## Goal
Run a local high-precision RAG pipeline where documents are dropped via SFTP and indexed automatically.
## Quick operations
Use gpu-service to manage it:
gpu-service status      # Check if running
gpu-service start rag   # Start when needed (e.g., before uploading docs)
gpu-service stop rag    # Stop when inbox is empty
See GPU Service Management for details.
Status check (when running):
systemctl status rag-library-ingest
journalctl -u rag-library-ingest -n 120 --no-pager
ls -lah /home/sftpuser/library_inbox /home/sftpuser/library_done /home/sftpuser/library_failed

## Ingest folders (SFTP)
Upload new files to:
/home/sftpuser/library_inbox
Pipeline moves files to:
/home/sftpuser/library_done   (indexed OK)
/home/sftpuser/library_failed (parse/index errors)
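The move step above can be sketched as follows. This is a minimal illustration, not the code in rag_pipeline.py; the function name and its arguments are hypothetical, only the folder paths and extension list come from this page.

```python
"""Sketch of inbox routing: indexed files go to library_done,
files with parse/index errors go to library_failed."""
import shutil
from pathlib import Path

INBOX = Path("/home/sftpuser/library_inbox")
DONE = Path("/home/sftpuser/library_done")
FAILED = Path("/home/sftpuser/library_failed")
SUPPORTED = {".pdf", ".epub", ".txt", ".md"}

def route_file(path: Path, indexed_ok: bool,
               done: Path = DONE, failed: Path = FAILED) -> Path:
    """Move a processed inbox file to done/ on success, failed/ on error."""
    dest_dir = done if indexed_ok else failed
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / path.name
    shutil.move(str(path), str(dest))
    return dest
```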
## Supported formats
.pdf, .epub, .txt, .md
## Pipeline service
Systemd service:
rag-library-ingest.service
Script:
/opt/rag-library/rag_pipeline.py
Virtual env:
/opt/rag-library/.venv
Data:
- Chroma vectors: /opt/rag-library/data/chroma
- File registry (SQLite): /opt/rag-library/data/registry.db
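The status query in the Operations section implies the registry has a `files` table with a `status` column; the sketch below assumes that minimal shape. The exact schema of the real registry.db may differ.

```python
"""Illustrative stand-in for the SQLite file registry.
Schema is assumed, not read from the real registry.db."""
import sqlite3

def open_registry(db_path: str) -> sqlite3.Connection:
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " path TEXT PRIMARY KEY,"
        " sha256 TEXT NOT NULL,"
        " status TEXT NOT NULL)"  # e.g. 'indexed' or 'failed'
    )
    return con

def counts_by_status(con: sqlite3.Connection) -> dict:
    """Same aggregation as: select status,count(*) from files group by status;"""
    rows = con.execute("SELECT status, COUNT(*) FROM files GROUP BY status")
    return dict(rows.fetchall())
```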
## Precision setup (high)
- Embeddings model: intfloat/multilingual-e5-base
- Semantic chunking with overlap
- Metadata per chunk (source hash, title, chunk index)
- Cross-encoder reranker: cross-encoder/ms-marco-MiniLM-L-6-v2
- Dedup by SHA-256
- Incremental ingest (new/changed files only)
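The chunking, per-chunk metadata, and SHA-256 dedup points above can be sketched together. Real semantic chunking splits on meaning boundaries; this stand-in uses fixed-size windows with overlap, and the metadata field names are illustrative.

```python
"""Simplified sketch: overlapping chunks, each carrying source hash,
title and chunk index. The SHA-256 of the full text doubles as the
dedup key for incremental ingest."""
import hashlib

def chunk_with_overlap(text: str, title: str, size: int = 500, overlap: int = 100):
    source_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    chunks = []
    step = size - overlap
    for i, start in enumerate(range(0, max(len(text) - overlap, 1), step)):
        chunks.append({
            "text": text[start:start + size],
            "source_hash": source_hash,  # dedup / change detection
            "title": title,
            "chunk_index": i,
        })
    return chunks
```

A previously seen `source_hash` means the file is unchanged and can be skipped, which is the essence of ingesting only new/changed files.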
## Important GPU note for this host (RTX 2070 Super 8GB)
Upgraded eGPU to RTX 2070 Super 8GB (sm_75). All inference now runs on CUDA:
- Embedding model (multilingual-e5-base): device='cuda' (ingest and query)
- Cross-encoder reranker (ms-marco-MiniLM-L-6-v2): device='cuda'
No CPU inference paths remain in rag_pipeline.py.
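Since no CPU paths remain, a fail-fast guard is one way to keep it that way. This helper is hypothetical (not in rag_pipeline.py); it takes availability as an argument to stay framework-agnostic, but in practice something like torch.cuda.is_available() would feed it.

```python
"""Hypothetical guard: refuse to start inference on CPU instead of
silently falling back, matching the CUDA-only note above."""
def require_cuda(cuda_available: bool) -> str:
    if not cuda_available:
        raise RuntimeError("CUDA unavailable; refusing CPU fallback for inference")
    return "cuda"  # value suitable for a device='cuda' model argument
```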
## Environment config
/etc/rag-library.env
RAG_EMBED_MODEL=intfloat/multilingual-e5-base
RAG_RERANK_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2

## Operations
# Service status
systemctl status rag-library-ingest
# Logs
journalctl -u rag-library-ingest -n 200 --no-pager
# Check indexed/failed counts
sqlite3 /opt/rag-library/data/registry.db "select status,count(*) from files group by status;"
# One-shot manual query
/opt/rag-library/.venv/bin/python /opt/rag-library/rag_pipeline.py query "your question"

## Current status snapshot (2026-02-08)
- service: active
- indexed files: 2
- failed files: 0
- inbox watcher: active
(Use the SQL command in Operations above for live numbers.)