Project for PEI evaluation 25/26
Data processing service that aggregates and processes network telemetry measurements in time-aligned windows. Consumes raw network data from Kafka, groups measurements by cell and time window, applies statistical processing, and publishes aggregated results back to Kafka.
- Python 3.13 with async/await patterns
- Apache Kafka (confluent_kafka client) - Message streaming
- FastAPI/Uvicorn ecosystem
- httpx - Async HTTP requests
- Docker - Containerization
- pytest with asyncio support
- Time-windowed processing: Event-time aligned windows (configurable: 60s, 300s)
- Watermark-driven lifecycle: Manages window completion with configurable allowed lateness
- Processing profiles: Extensible ProfileBase classes
  - LatencyProfile: Aggregates RSRP, SINR, RSRQ, mean_latency, CQI with statistics (min, max, mean, std dev)
- Empty window strategies: Pluggable StrategyBase patterns
  - SkipStrategy: Ignore empty windows
  - ZeroFillStrategy: Generate zero/null-filled records
  - ForwardFillStrategy: Replicate last known values
- Parallel processing: Async cell data fetching and windowing
- Integration: Storage API for cell metadata, Kafka for input/output
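
To make the windowing and watermark behaviour listed above more concrete, here is a minimal sketch. It is not the code in `time_window_manager.py`: the `WindowTracker` class, its field names, and the rule that the watermark equals the largest event time seen so far are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


def window_start(event_ts: int, window_size: int) -> int:
    """Align an event timestamp (epoch seconds) to the start of its window."""
    return event_ts - (event_ts % window_size)


@dataclass
class WindowTracker:
    """Hypothetical tracker that buffers values per (cell, window)."""

    window_size: int = 60        # seconds, e.g. 60 or 300
    allowed_lateness: int = 10   # seconds a window stays open past its end
    watermark: int = 0           # assumed: largest event time seen so far
    open_windows: dict = field(default_factory=dict)

    def _is_closed(self, start: int) -> bool:
        # A window closes once the watermark reaches its end plus the lateness.
        return start + self.window_size + self.allowed_lateness <= self.watermark

    def add(self, cell_id: str, event_ts: int, value: float) -> bool:
        """Buffer one measurement; return False if its window already closed."""
        start = window_start(event_ts, self.window_size)
        if self._is_closed(start):
            return False
        self.open_windows.setdefault((cell_id, start), []).append(value)
        self.watermark = max(self.watermark, event_ts)
        return True

    def close_ready(self) -> dict:
        """Remove and return every window the watermark has passed."""
        ready = {
            key: values
            for key, values in self.open_windows.items()
            if self._is_closed(key[1])
        }
        for key in ready:
            del self.open_windows[key]
        return ready
```

With a 60s window and 10s of allowed lateness, events at 125 and 130 land in the window starting at 120, and that window is only released once an event at or after 190 advances the watermark.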
    docker run -p 9092:9092 apache/kafka:4.1.1
    utils/topic.sh [container] "network.data.ingested" -c
    utils/topic.sh [container] "network.data.processed" -c
    uvicorn receiver:app --reload --host 0.0.0.0 --port 8000
    python3 producer/main.py -a "http://localhost:8000/receive" -f dataset/hbahn/latency_data.csv
    docker-compose up

Runs two processor instances for different time scales (60s and 300s windows).
- Modular design: Extensible ProfileBase and StrategyBase classes
- Kafka topics: Consumes from network.data.ingested, produces to network.data.processed
- Cell-level aggregation: Validates cell consistency within windows
- Batch pagination: Supports large result sets
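
The pluggable StrategyBase design could look roughly like the sketch below. Only the class names come from this README; the `handle` method, its parameters, and the record fields (`cell_id`, `window_start`, `sample_count`, `forward_filled`) are hypothetical and may not match `empty_window_strategy.py`.

```python
from abc import ABC, abstractmethod
from typing import Optional


class StrategyBase(ABC):
    """Hypothetical interface: decide what to emit for an empty window."""

    @abstractmethod
    def handle(
        self, cell_id: str, window_start: int, last_record: Optional[dict]
    ) -> Optional[dict]:
        """Return a record to publish, or None to publish nothing."""


class SkipStrategy(StrategyBase):
    def handle(self, cell_id, window_start, last_record):
        # Ignore the empty window entirely.
        return None


class ZeroFillStrategy(StrategyBase):
    def __init__(self, metrics, fill_value=None):
        self.metrics = list(metrics)
        self.fill_value = fill_value  # None for null-fill, 0 for zero-fill

    def handle(self, cell_id, window_start, last_record):
        record = {"cell_id": cell_id, "window_start": window_start, "sample_count": 0}
        record.update({metric: self.fill_value for metric in self.metrics})
        return record


class ForwardFillStrategy(StrategyBase):
    def handle(self, cell_id, window_start, last_record):
        # Nothing to replicate until the cell has produced at least one record.
        if last_record is None:
            return None
        return {**last_record, "window_start": window_start, "forward_filled": True}
```

Because every strategy shares one method, the caller can select a strategy from configuration and never branch on its type.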
processor/
├── main.py # Entry point - Kafka consumer/watermark coordination
├── src/
│   ├── time_window_manager.py # Core windowing logic
│   ├── empty_window_strategy.py # Empty window handling
│   └── profiles/
│       ├── processing_profile.py # Abstract profile interface
│       └── latency_profile.py # Network latency aggregation
└── tests/ # pytest test suite
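
As a rough illustration of the aggregation that `latency_profile.py` is described as performing, the sketch below computes min, max, mean, and standard deviation per metric for one window. The function names, the lowercase metric keys, and the choice of population standard deviation are assumptions; the actual profile may differ.

```python
import statistics

# Assumed record keys for the metrics named in this README.
METRICS = ("rsrp", "sinr", "rsrq", "mean_latency", "cqi")


def summarize(values):
    """Return min, max, mean, and population std dev for a non-empty list."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
    }


def aggregate_window(records, metrics=METRICS):
    """Aggregate one window of measurement dicts into per-metric statistics."""
    result = {"sample_count": len(records)}
    for metric in metrics:
        # Skip missing or null values so partial records still contribute.
        values = [r[metric] for r in records if r.get(metric) is not None]
        if values:
            result[metric] = summarize(values)
    return result
```

Metrics with no values in the window are simply omitted from the result rather than reported as zero.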