@nemerna (Contributor) commented Nov 17, 2025

test

operetz-rh and others added 17 commits November 17, 2025 15:12
Update pom.xml with a YAML parsing dependency and update the Dockerfile to install the DVC CLI.
Add a 60-second timeout to DVC process execution.
Validate the items of each NVR type so that every item is added as a string.
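
The string validation mentioned in the commit above could look roughly like the following sketch. It assumes the DVC file parses into a map from NVR type to a list of items; that layout, the class name `NvrListValidator`, and the sample values are illustrative assumptions, not taken from the PR. The check matters because YAML parsers tend to turn unquoted numeric-looking scalars into numbers rather than strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: the real PR may structure this differently.
final class NvrListValidator {

    /**
     * Flattens a parsed YAML document of the assumed shape
     * {nvrType: [item, item, ...]} into one list, rejecting any
     * entry that is not a list of non-blank strings.
     */
    static List<String> validate(Map<String, Object> parsedYaml) {
        List<String> nvrs = new ArrayList<>();
        for (Map.Entry<String, Object> entry : parsedYaml.entrySet()) {
            Object value = entry.getValue();
            if (!(value instanceof List)) {
                throw new IllegalArgumentException(
                        "Expected a list under '" + entry.getKey() + "'");
            }
            for (Object item : (List<?>) value) {
                // A parser may yield Integer/Double for unquoted scalars.
                if (!(item instanceof String) || ((String) item).trim().isEmpty()) {
                    throw new IllegalArgumentException(
                            "Non-string item under '" + entry.getKey() + "': " + item);
                }
                nvrs.add((String) item);
            }
        }
        return nvrs;
    }

    public static void main(String[] args) {
        Map<String, Object> parsed = new LinkedHashMap<>();
        // Sample NVR-like values, made up for the demonstration.
        parsed.put("rpm", List.of("example-pkg-1.0-1.el9", "other-pkg-2.3-4.el9"));
        System.out.println(validate(parsed));
    }
}
```

Failing fast on a non-string item keeps a malformed entry from reaching job creation, where the error would be harder to trace back to the DVC file.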
Implement a new /api/v1/mlops-batch endpoint for automated testing of NVRs
fetched from DVC (Data Version Control). This feature is completely
independent of regular job batches and is designed for ML operations workflows.

Database Changes:
- Add mlops_batch table for batch tracking with DVC version metadata
- Add mlops_job table for individual NVR-based jobs
- Add mlops_job_metrics table for storing aggregated metrics and confusion matrix
- Create indexes for optimal query performance

Core Features:
- Fetch NVR lists from DVC using version tags
- Automatic NVR resolution (package name, version, source URL, false positives URL)
- Parallel job execution with configurable rolling window (default: 3 concurrent jobs)
- Ground truth sheet URL generation per NVR (MinIO integration)
- Pipeline metrics extraction (accuracy, precision, recall, F1, confusion matrix)
- Comprehensive error handling and graceful degradation
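
For readers unfamiliar with the metrics listed above, here is a minimal sketch of how accuracy, precision, recall, and F1 follow from confusion-matrix counts. The PR describes extracting these values from workflow results rather than computing them, so this is only an illustration of the standard definitions; the class name `ConfusionMatrixMetrics` and the sample counts are assumptions.

```java
// Illustrative only: standard definitions, not the PR's MlOpsMetricsService.
final class ConfusionMatrixMetrics {
    private final long tp;
    private final long fp;
    private final long tn;
    private final long fn;

    ConfusionMatrixMetrics(long tp, long fp, long tn, long fn) {
        this.tp = tp;
        this.fp = fp;
        this.tn = tn;
        this.fn = fn;
    }

    // Returns 0.0 for an empty denominator instead of NaN.
    private static double ratio(long numerator, long denominator) {
        return denominator == 0 ? 0.0 : (double) numerator / denominator;
    }

    double accuracy()  { return ratio(tp + tn, tp + fp + tn + fn); }
    double precision() { return ratio(tp, tp + fp); }
    double recall()    { return ratio(tp, tp + fn); }
    // Equivalent to 2 * precision * recall / (precision + recall).
    double f1()        { return ratio(2 * tp, 2 * tp + fp + fn); }

    public static void main(String[] args) {
        // Made-up counts for demonstration.
        ConfusionMatrixMetrics m = new ConfusionMatrixMetrics(3, 1, 4, 2);
        System.out.println("accuracy=" + m.accuracy()
                + " precision=" + m.precision()
                + " recall=" + m.recall()
                + " f1=" + m.f1());
    }
}
```

Returning 0.0 for an empty denominator is one possible convention; a failed pipeline with no results might be better represented as a missing metric, which matches the PR's note about handling missing metrics gracefully.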

New Entities & Models:
- MlOpsBatch, MlOpsJob, MlOpsJobMetrics entities
- Repositories for data access layer
- Request/Response DTOs with validation

Services:
- MlOpsBatchService: Core orchestration with DVC integration
- MlOpsJobService: Job lifecycle and status management
- MlOpsMetricsService: Extract and store pipeline metrics from workflow results
- PipelineParameterMapper: Enhanced with MLOps-specific parameters

Pipeline Parameters (MLOps-specific):
- CONTAINER_IMAGE: Custom SAST AI workflow image
- PROMPTS_VERSION: DVC version for prompts configuration
- KNOWN_NON_ISSUES_VERSION: DVC version for known non-issues
- INPUT_REPORT_FILE_PATH: NVR-specific ground truth sheet URL
- Standard parameters: source URL, false positives, LLM configuration

REST Endpoints:
- POST /api/v1/mlops-batch - Submit new MLOps batch
- GET /api/v1/mlops-batch - List all batches (paginated)
- GET /api/v1/mlops-batch/{id} - Get batch summary
- GET /api/v1/mlops-batch/{id}/detailed - Get batch with all jobs and metrics
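
As a rough illustration of calling the read endpoints listed above, the sketch below builds the two per-batch GET requests with the JDK's `java.net.http` API. The base URL, the numeric type of the batch id, and the JSON `Accept` header are assumptions; the PR text does not specify them, and the class name `MlOpsBatchRequests` is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper; only the paths come from the PR description.
final class MlOpsBatchRequests {

    private static HttpRequest get(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/json") // assumed response format
                .GET()
                .build();
    }

    /** GET /api/v1/mlops-batch/{id}: batch summary. */
    static HttpRequest summary(String baseUrl, long batchId) {
        return get(baseUrl + "/api/v1/mlops-batch/" + batchId);
    }

    /** GET /api/v1/mlops-batch/{id}/detailed: batch with all jobs and metrics. */
    static HttpRequest detailed(String baseUrl, long batchId) {
        return get(baseUrl + "/api/v1/mlops-batch/" + batchId + "/detailed");
    }

    public static void main(String[] args) {
        // localhost:8080 is a placeholder, not a documented deployment address.
        HttpRequest request = detailed("http://localhost:8080", 42L);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The requests can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running instance.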

Platform Integration:
- MlOpsPipelineRunWatcher: Monitor Tekton pipelines for MLOps jobs
- Automatic status updates and batch progress tracking
- Metrics extraction from workflow-metrics result
- Rolling window semaphore for controlled parallelism

Configuration:
- mlops.batch.max.parallel.jobs: Control concurrent job execution (default: 3)
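
The semaphore-based rolling window described in this PR can be sketched as follows: a permit is taken before each job starts and returned when it finishes, so a new job begins as soon as any slot frees up. This is a simplified stand-in under stated assumptions, not the PR's code; `RollingWindowRunner` and the peak-concurrency counter are hypothetical, and the real jobs are Tekton pipeline runs rather than local `Runnable`s.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a rolling window limited by a semaphore.
final class RollingWindowRunner {
    private final Semaphore permits;
    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    RollingWindowRunner(int maxParallelJobs) {
        this.permits = new Semaphore(maxParallelJobs);
    }

    /** Runs every job, never allowing more than maxParallelJobs at once. */
    void runAll(List<Runnable> jobs) {
        ExecutorService pool = Executors.newCachedThreadPool();
        for (Runnable job : jobs) {
            permits.acquireUninterruptibly(); // wait for a free slot
            pool.submit(() -> {
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                try {
                    job.run();
                } finally {
                    running.decrementAndGet();
                    permits.release(); // free the slot even if the job failed
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Highest number of jobs observed running at the same time. */
    int peakConcurrency() {
        return peak.get();
    }

    public static void main(String[] args) {
        List<Runnable> jobs = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            jobs.add(() -> {
                try {
                    Thread.sleep(20); // stand-in for pipeline work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        RollingWindowRunner runner = new RollingWindowRunner(3); // default from the PR
        runner.runAll(jobs);
        System.out.println("peak concurrency: " + runner.peakConcurrency());
    }
}
```

Compared with submitting jobs in fixed groups of three, a rolling window avoids idling while the slowest job of a group finishes. Releasing the permit in `finally` keeps a failed job from permanently shrinking the window.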

Key Design Decisions:
- Complete database isolation from regular batches per requirements
- No INPUT_SOURCE_TYPE needed - MLOps jobs work directly with NVRs
- Graceful handling of missing metrics when pipelines fail
- Transaction-safe status polling to avoid context exceptions
- Semaphore-based rolling window for optimal resource utilization
Update pom.xml with a YAML parsing dependency and update the Dockerfile to install the DVC CLI.
Co-authored-by: ikrispin <[email protected]>
@github-actions

AI Code Review Skipped: This PR is too large for automated review (diff size exceeds 100KB).

@sonarqubecloud

Quality Gate failed

Failed conditions
8.9% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@nemerna nemerna closed this Nov 24, 2025

4 participants