Finova AI Roadmap

This file is the end-to-end development plan for the MVP. Use the markdown checkboxes to track progress as we move from documentation and scaffolding to a full demo-ready intake and review flow.

How To Use

Keep this file high level and execution-oriented.
Mark a task [x] only when the outcome is actually usable, not just partially started.
Add child tasks in the relevant implementation doc or issue tracker if a checkbox becomes too large.
Use docs/agent/ as the source of truth for scope, architecture, API, pipeline, and validation rules.

Current Status Snapshot

AI documentation split into task-oriented files under docs/agent/
AGENTS.md reduced to a short context router
Extraction schemas moved to docs/agent/schemas/*.json
Frontend design guidance moved to docs/design/frontend-design-system.md
docs/agent/playbooks/ added for recurring task workflows
Backend application scaffold created
Frontend application scaffold created
End-to-end demo flow implemented

Milestone 0: Documentation Foundation

Replace monolithic AGENTS.md with a router + doc map
Create domain-focused docs under docs/agent/
Separate machine-readable extraction schemas from narrative docs
Move frontend design system into docs/design/
Validate task-based doc bundles for backend, frontend, and QA/demo work
Add docs/agent/playbooks/ for common execution paths
Add one playbook for upload flow implementation
Add one playbook for OCR or pipeline debugging
Add one playbook for review UI changes

Milestone 1: Local Dev And Project Scaffolding

Scaffold backend/ with FastAPI app structure from docs/agent/04-backend-structure.md
Scaffold frontend/ with Next.js app structure from docs/agent/05-frontend-review-ui.md
Add Dockerfiles for backend and frontend
Add docker-compose.yml with frontend, backend, postgres, minio
Add environment-variable configuration for DB, MinIO, and LLM settings
Add MinIO bucket initialization flow
Confirm the full stack boots locally with one command
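
As a rough illustration of the environment-variable configuration task, here is a minimal sketch of a settings loader. The variable names (DATABASE_URL, MINIO_ENDPOINT, MINIO_BUCKET, LLM_MODEL) and the default bucket name are assumptions made for this example, not names defined in docs/agent/.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    minio_endpoint: str
    minio_bucket: str
    llm_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, defaulting to os.environ."""
    env = os.environ if env is None else env
    # Assumption: these two variables are treated as required.
    required = ("DATABASE_URL", "MINIO_ENDPOINT")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        minio_endpoint=env["MINIO_ENDPOINT"],
        # Hypothetical defaults, used only when the variable is unset.
        minio_bucket=env.get("MINIO_BUCKET", "finova-documents"),
        llm_model=env.get("LLM_MODEL", ""),
    )
```

Failing fast on missing variables at startup makes the "boots with one command" check easier to debug than a connection error surfacing later.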

Milestone 2: Core Persistence And Storage

Implement SQLAlchemy models for applications, documents, document pages, extracted fields, validation flags, and review actions
Add Alembic migrations for the initial schema
Implement database session and repository base patterns
Implement application and document repositories
Implement MinIO storage client and deterministic storage-key generation
Persist raw uploads to MinIO and document metadata to PostgreSQL
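
One possible reading of "deterministic storage-key generation" is that the same application, document, and file bytes always map to the same object key. The sketch below assumes that reading. The key layout and the build_storage_key helper are hypothetical and are not taken from docs/agent/.

```python
import hashlib
from pathlib import PurePosixPath


def build_storage_key(application_id: str, document_id: str,
                      filename: str, content: bytes) -> str:
    """Derive an object key from identifiers and a content hash.

    No timestamps or random values are used, so repeated calls with the
    same inputs yield the same key.
    """
    digest = hashlib.sha256(content).hexdigest()
    # Keep only the lowercased extension; the original filename is not trusted.
    suffix = PurePosixPath(filename).suffix.lower()
    # Hypothetical layout: applications/<app>/documents/<doc>/raw/<hash><ext>
    return (
        f"applications/{application_id}/documents/{document_id}"
        f"/raw/{digest[:16]}{suffix}"
    )
```

Because the key depends only on its inputs, a retried upload overwrites the same object instead of leaving orphaned copies.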

Milestone 3: Upload And Intake API

Implement POST /applications
Implement GET /applications
Implement GET /applications/{application_id}
Implement POST /applications/{application_id}/documents
Validate supported file types and reject empty files
Compute file hash and prepare duplicate-upload detection hooks
Set initial document status to uploaded
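
The intake checks above could be sketched as a single function that runs before anything is persisted. The allowed extensions and the UploadRejected exception are assumptions for illustration. The supported file types are defined elsewhere in the project docs.

```python
import hashlib
import os

# Assumption: example set of supported extensions, not the project's actual list.
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


class UploadRejected(ValueError):
    """Raised when an upload fails intake validation (hypothetical type)."""


def inspect_upload(filename: str, content: bytes) -> dict:
    """Validate an upload and return the metadata to persist."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise UploadRejected(f"unsupported file type: {extension or '(none)'}")
    if not content:
        raise UploadRejected("empty file")
    return {
        # The hash can later back duplicate-upload detection.
        "file_hash": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        "status": "uploaded",
    }
```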

Milestone 4: File Normalization And OCR Pipeline

Implement POST /documents/{document_id}/process
Convert PDF uploads into page images
Normalize image uploads into the same page model
Preprocess pages with OpenCV
Run PaddleOCR page by page
Persist document_pages records with OCR text and OCR JSON
Store OCR artifacts in MinIO
Compute and persist aggregate OCR confidence
Handle corrupted PDFs, empty OCR output, and incomplete page conversion gracefully
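
The roadmap does not say how aggregate OCR confidence is computed. One plausible approach is a character-weighted mean over all recognized lines, sketched below. The OcrLine shape is a hypothetical simplification and not PaddleOCR's actual output format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OcrLine:
    text: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def aggregate_confidence(pages: List[List[OcrLine]]) -> Optional[float]:
    """Character-weighted mean confidence across all pages.

    Returns None when no text was recognized, so callers can treat
    empty OCR output as its own case instead of as a score of 0.
    """
    total_chars = 0
    weighted_sum = 0.0
    for lines in pages:
        for line in lines:
            length = len(line.text.strip())
            total_chars += length
            weighted_sum += length * line.confidence
    if total_chars == 0:
        return None
    return weighted_sum / total_chars
```

Weighting by text length keeps a short low-confidence fragment, such as a stray stamp, from dragging down a page that is otherwise read cleanly.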

Milestone 5: Classification, Extraction, And Normalization

Implement rule-based document classification for id_card, payslip, and bank_statement
Support unknown classification with validation flags
Add document-specific extraction prompts
Integrate LLM extraction with strict JSON-only output validation
Validate extraction output against docs/agent/schemas/*.json
Retry once on invalid JSON output
Fall back to rule-based partial extraction when parsing still fails
Normalize dates, salary or balance fields, account numbers, and names
Persist raw and normalized extraction payloads
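
The retry-once and fallback steps above can be sketched as a small control loop. The call_llm and fallback callables are hypothetical placeholders for the real LLM client and rule-based extractor, and the required-keys check is a simplified stand-in for validation against docs/agent/schemas/*.json.

```python
import json
from typing import Callable, Optional, Set


def parse_extraction(raw: str, required: Set[str]) -> Optional[dict]:
    """Return the parsed object, or None if it is not usable JSON."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Stand-in for full schema validation: object with all required keys.
    if not isinstance(data, dict) or not required.issubset(data.keys()):
        return None
    return data


def extract_fields(ocr_text: str,
                   call_llm: Callable[[str], str],
                   fallback: Callable[[str], dict],
                   required: Set[str],
                   max_attempts: int = 2) -> dict:
    """Try the LLM up to max_attempts times, then use the rule-based fallback."""
    for attempt in range(1, max_attempts + 1):
        parsed = parse_extraction(call_llm(ocr_text), required)
        if parsed is not None:
            return {"source": "llm", "attempts": attempt, "fields": parsed}
    # Both attempts produced invalid output: partial rule-based extraction.
    return {"source": "rule_based", "attempts": max_attempts,
            "fields": fallback(ocr_text)}
```

Recording the source and attempt count alongside the fields would also feed the "extraction retry observability" stretch goal.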

Milestone 6: Validation And Confidence Routing

Implement field-level validation for ID card, payslip, and bank statement
Implement completeness warnings for important missing fields
Implement quality flags for low-quality image and OCR cases
Implement cross-document name checks across application documents
Compute quality_score, ocr_confidence, and extraction_confidence
Route documents into processed, needs_review, or failed based on outcome
Persist validation flags for reviewer visibility
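
As an illustration of the routing step, the sketch below maps confidence scores and flag severities to the three statuses named above. The 0.80 threshold, the "error" severity label, and the function signature are assumptions. The actual rules belong in the validation docs under docs/agent/.

```python
from typing import Iterable, Optional


def route_document(ocr_confidence: Optional[float],
                   extraction_confidence: Optional[float],
                   flag_severities: Iterable[str],
                   processing_failed: bool = False,
                   review_threshold: float = 0.80) -> str:
    """Return 'processed', 'needs_review', or 'failed'."""
    # No usable OCR output at all is treated as a pipeline failure.
    if processing_failed or ocr_confidence is None:
        return "failed"
    # Assumption: any flag with severity "error" forces human review.
    if "error" in set(flag_severities):
        return "needs_review"
    # Missing extraction confidence is treated as low confidence.
    if extraction_confidence is None:
        return "needs_review"
    if min(ocr_confidence, extraction_confidence) < review_threshold:
        return "needs_review"
    return "processed"
```

Routing on the weaker of the two scores is a conservative choice: a confident extraction built on poor OCR still goes to a reviewer.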

Milestone 7: Review API And Reviewer Actions

Implement GET /documents/{document_id}
Implement GET /documents/{document_id}/pages
Implement PATCH /documents/{document_id}/review
Implement POST /documents/{document_id}/decision
Persist reviewer corrections into review history
Persist reviewer decisions for approve, reject, and request_reupload
Keep review actions auditable in review_actions
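
One way to keep review actions auditable is to treat them as an append-only log in which each correction stores both the old and the new value. The sketch below uses an in-memory list in place of the review_actions table. The ReviewAction fields and the helper names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List

# Decision values named in this milestone.
ALLOWED_DECISIONS = {"approve", "reject", "request_reupload"}


@dataclass(frozen=True)
class ReviewAction:
    document_id: str
    reviewer: str
    action: str
    payload: Dict[str, Any]
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


def record_correction(log: List[ReviewAction], fields: Dict[str, Any],
                      document_id: str, reviewer: str,
                      field_name: str, new_value: Any) -> ReviewAction:
    """Apply a field correction and append an audit entry with old and new values."""
    entry = ReviewAction(document_id, reviewer, "correct_field", {
        "field": field_name, "old": fields.get(field_name), "new": new_value})
    fields[field_name] = new_value
    log.append(entry)
    return entry


def record_decision(log: List[ReviewAction], document_id: str,
                    reviewer: str, decision: str) -> ReviewAction:
    """Append a reviewer decision after checking it is an allowed value."""
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    entry = ReviewAction(document_id, reviewer, "decision", {"decision": decision})
    log.append(entry)
    return entry
```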

Milestone 8: Frontend Review Experience

Build applications list page
Build application detail page
Build document review page
Show upload status, document type, and warning summary on the application detail page
Show document preview, editable extracted fields, validation flags, and OCR raw text on the review page
Allow manual field correction before decision
Allow approve, reject, and request-reupload actions
Apply docs/design/frontend-design-system.md to the review workflow

Milestone 9: Testing, QA, And Demo Readiness

Stretch Goals

OCR bounding-box overlay on the review screen
Duplicate upload detection
Simple image quality scoring heuristic
Extraction retry observability or diagnostics
Lightweight background-job orchestration if synchronous processing becomes too slow

Done Definition For MVP

The MVP is done when all of these are true:

A user can create an application
A user can upload an ID card, a payslip, and a bank statement
The system stores raw files and derived artifacts
The system runs normalization, OCR, classification, extraction, and validation
The review UI displays extracted fields, confidence signals, and validation flags
A reviewer can correct a field and approve a document
The project runs locally with Docker Compose
Core tests for upload, normalization, extraction, validation, and persistence exist
