The AI Agent Verification System is an automated pipeline that verifies user identities from a set of documents (Aadhaar, PAN) and a selfie. It uses Large Language Models (LLMs) and Computer Vision to extract data, detect fraudulent or masked documents, and verify face ownership.
The system is built as a distributed architecture consisting of two main components:
- Batch Dispatcher (`batch_dispatcher.py`): The "Client" that acts as an agent. It fetches batches of users from a central backend, locks them for processing, and sends them to the local AI server.
- Local AI Server (`batch.py`): The "Worker" that runs the heavy AI models (Qwen-VL, Face Verification) to process the images and return a decision.
- Job Assignment: The Dispatcher (`batch_dispatcher.py`) contacts the main backend (qoneqt.com) to "lock" a batch of pending KYC requests for a specific Agent ID.
- Data Retrieval: The Dispatcher downloads the user's uploaded images (Selfie, Aadhaar Front/Back, PAN) into memory.
- Local Processing: The data is sent to the Local AI Server (`batch.py`) running on `localhost:8101`.
- AI Analysis:
  - Document Extraction: `qwen3-vl:8b-instruct` (via Ollama) reads the Aadhaar/PAN cards.
  - Logic Check: The system checks for Masked Aadhaar cards (illegal for this specific flow) and rejects them.
  - Data Matching: Extracted text (Name, DOB, Gender) is compared against the user's input.
  - Face Verification: The selfie gender is detected and matched against the document gender.
- Result Submission: The final decision (APPROVED/REJECTED/REVIEW) and extracted metadata are pushed back to the central backend by the Dispatcher.
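The Logic Check and Data Matching steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `batch.py`: the helper names (`is_masked_aadhaar`, `normalize_name`, `fields_match`) and the dictionary keys are hypothetical, and the real comparison rules may be stricter or fuzzier.

```python
import re
import unicodedata

# Hypothetical helpers for illustration; the real logic in batch.py may differ.
MASK_CHARS = re.compile(r"[xX*]")


def is_masked_aadhaar(number: str) -> bool:
    """Return True if the extracted Aadhaar number contains 'X' or '*' mask characters."""
    compact = re.sub(r"[\s-]", "", number)
    return bool(MASK_CHARS.search(compact))


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace before comparing names."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.lower().split())


def _same(a: str, b: str) -> bool:
    """Equal and non-empty; two missing values do not count as a match."""
    return bool(a) and a == b


def fields_match(extracted: dict, user_input: dict) -> dict:
    """Compare Name, DOB, and Gender; returns a per-field boolean map."""
    return {
        "name": _same(normalize_name(extracted.get("name", "")),
                      normalize_name(user_input.get("name", ""))),
        "dob": _same(extracted.get("dob", "").strip(),
                     user_input.get("dob", "").strip()),
        # Assumption: gender is compared by first letter so "Male" matches "M".
        "gender": _same(extracted.get("gender", "").strip().lower()[:1],
                        user_input.get("gender", "").strip().lower()[:1]),
    }
```

Returning a per-field map rather than a single boolean leaves room for a later step to distinguish a full match from a partial one.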
The Batch Dispatcher (`batch_dispatcher.py`) manages the lifecycle of a verification job.
- Locking Mechanism: Uses `/admin/kyc-lock-batch` to ensure no other agent processes the same users.
- Resiliency: Implements exponential backoff for 502 Server Errors (Server Down/Overloaded).
- Logging:
  - Local SQLite: Saves every record to `local_kyc_data.db` for audit trails.
  - Google Sheets: Optionally logs stats to a Google Sheet.
  - JSON Logs: Detailed logs in `logs/agent_{id}/`.
- Redis Caching: Caches repeatedly accessed data to save bandwidth.
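As an illustration of the Resiliency item, here is a minimal sketch of exponential backoff that retries only on HTTP 502. It uses the standard library's `urllib` to stay self-contained; the dispatcher's actual HTTP client, retry count, base delay, and cap are not documented here, so `backoff_delays`, `post_with_retry`, and all default values are assumptions.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 60.0):
    """Yield delays of base * 2**attempt seconds, capped at `cap`."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def post_with_retry(url: str, payload: bytes, max_retries: int = 5,
                    sleep=time.sleep) -> bytes:
    """POST `payload`, retrying only on HTTP 502 with exponential backoff."""
    delays = backoff_delays(max_retries)
    while True:
        request = urllib.request.Request(url, data=payload, method="POST")
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code != 502:
                raise  # other HTTP errors are not retried
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted, surface the 502
            # Small random jitter so several agents do not retry in lockstep.
            sleep(delay + random.uniform(0, 0.5))
```

With the defaults, the waits are roughly 1, 2, 4, 8, and 16 seconds before the error is re-raised.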
The Local AI Server (`batch.py`) is a FastAPI application that hosts the AI models.
- Ollama Integration: Connects to a local Ollama instance running `qwen3-vl:8b-instruct`.
- Prompt Engineering: Uses specific prompts to extract fields like Name and DOB, and to strictly identify Masked Aadhaars.
- Image Optimization: Resizes images to <800px to ensure fast processing and low token usage.
- In-Memory Processing: Uses `io.BytesIO` to handle images in RAM, avoiding slow disk I/O.
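The Image Optimization and In-Memory Processing items can be sketched as below. This only computes the target dimensions and wraps bytes in a buffer; it is not the server's actual code. It assumes the 800px limit applies to the longer side and that aspect ratio is preserved, and the names `target_size` and `as_buffer` are hypothetical.

```python
import io

MAX_SIDE = 800  # assumption: the limit applies to the longer side of the image


def target_size(width: int, height: int, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most `max_side`."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # small images are left as they are, never upscaled
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def as_buffer(image_bytes: bytes) -> io.BytesIO:
    """Wrap raw bytes in a file-like object so an imaging library can read from RAM."""
    return io.BytesIO(image_bytes)
```

If the server uses Pillow (an assumption), the buffer could be passed to `Image.open(...)` and the result of `target_size(...)` to `Image.resize(...)`, so the image never touches disk.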
- Operating System: Linux / macOS (recommended for performant I/O)
- Python: 3.10 or higher
- Ollama: Installed and running.
- Redis: Installed and running (default port 6379).
- Models:
  - Pull the vision model: `ollama pull qwen3-vl:8b-instruct`
- Clone/Setup Directory: Ensure you are in the `AI_Agent_Verification` folder.
- Install Python Dependencies: `pip install -r requirements.txt`
- Configure Environment: Check `config.py` or the `.env` file (if applicable) for API keys and Backend URLs. `batch_dispatcher.py` has constants like `ADMIN_ID` and `AGENT_IDS` (77, 78, 79, 80).
The Local AI Server must be running before you start the dispatcher. Start it with `python batch.py`.
- Health Check: Open `http://localhost:8101/health` in your browser. It should say `"status": "healthy"`.
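The same health check can be scripted instead of opened in a browser. The sketch below assumes the endpoint returns a JSON object containing `"status": "healthy"`, as described above. The function names are hypothetical and the exact response shape beyond that field is not known.

```python
import json
import urllib.request

HEALTH_URL = "http://localhost:8101/health"


def is_healthy(body: str) -> bool:
    """Return True if the JSON body reports status == "healthy"."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and data.get("status") == "healthy"


def check_server(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; any connection or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(response.read().decode("utf-8"))
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
```

Calling `check_server()` before launching the dispatcher gives a quick way to fail early when the server is not up.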
In a separate terminal window, run `python batch_dispatcher.py`. This will:
- Connect to the backend.
- Lock a batch of users (default 20).
- Start processing them one by one.
- Print status logs (e.g., `✅ User 123: Processed & Pushed (APPROVED)`).
The system enforces strict rules. A user is REJECTED if:
- Masked Aadhaar: The Aadhaar number is hidden with 'X' or '*'. This is a strict rejection criterion.
- Document Not Found: The AI cannot find a valid Aadhaar card in the image.
- Gender Mismatch: The gender detected in the selfie does not match the Aadhaar gender.
A user is sent to REVIEW if:
- Low Confidence: The AI is unsure about the data extraction (rare).
- Partial Match: Specific fields match but others are ambiguous.
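The rules above can be expressed as a small decision function. This is a sketch under assumptions, not the system's actual implementation: the `Checks` fields, the order in which rules are evaluated, and the precedence of REJECTED over REVIEW are all illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class Checks:
    """Hypothetical summary of the checks run on one user."""
    aadhaar_found: bool
    aadhaar_masked: bool
    gender_matches: bool
    low_confidence: bool = False
    partial_match: bool = False


def decide(checks: Checks) -> tuple[str, str]:
    """Return (decision, reason); rejection rules are checked before review rules."""
    if not checks.aadhaar_found:
        return "REJECTED", "Document Not Found"
    if checks.aadhaar_masked:
        return "REJECTED", "Masked Aadhaar"
    if not checks.gender_matches:
        return "REJECTED", "Gender Mismatch"
    if checks.low_confidence:
        return "REVIEW", "Low Confidence"
    if checks.partial_match:
        return "REVIEW", "Partial Match"
    return "APPROVED", ""
```

For example, `decide(Checks(aadhaar_found=True, aadhaar_masked=True, gender_matches=True))` yields a rejection for a masked Aadhaar even when every other check passes.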
- `batch.py`: Core AI Server.
- `batch_dispatcher.py`: Core Orchestrator.
- `app/`: Helper modules (Gender detection, Entity definitions).
- `legacy/`: Old/Unused files (`main.py`, `scoring.py`), kept for reference.
- `logs/`: Execution logs.
- `redis_cache.py`: Redis interface.
- "Qwen agent not initialized": Stop the script, make sure `ollama serve` is running, and confirm that general model access is working.
- 502 Bad Gateway: The main backend server is down. The dispatcher will auto-retry (exponential backoff).
- Redis Connection Error: Ensure Redis is running (`sudo systemctl start redis`).