GapIQ — AI-Powered Career Gap Analysis Platform

Overview

GapIQ is an AI-powered career intelligence platform designed to analyze the gap between a candidate’s resume and a target job description. The system extracts structured information from resumes and job descriptions, computes deterministic compatibility scores, and generates explainable AI-driven insights including skill gaps, recruiter feedback, resume improvements, and personalized learning roadmaps.

The project was developed as part of the M.Tech internship and research work under Mesanite Software Solutions Pvt. Ltd. in collaboration with CHRIST (Deemed to be University), Bangalore.

Problem Statement

Students and early-career professionals often receive repeated job rejections without understanding:

Which skills are missing
Whether their experience aligns with the role
How recruiters evaluate their resumes
What improvements are required to increase selection chances

Traditional applicant tracking systems (ATS) are opaque and rejection-oriented: they filter candidates out without explaining why. GapIQ aims to provide an explainable and constructive alternative.

Objectives

The primary objectives of GapIQ are:

Perform automated resume and job description analysis
Extract structured information using AI
Compute multi-dimensional compatibility scores
Identify missing skills and domain gaps
Generate actionable recommendations
Provide recruiter-style feedback
Build a learning roadmap for improvement
Maintain explainability and transparency in scoring

Core Features

Resume Parsing

PDF resume upload support
Text extraction using pdfplumber
Noise cleaning and normalization
Structured parsing pipeline
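
The parsing steps above could look roughly like the following sketch. `pdfplumber.open` and `page.extract_text` are real pdfplumber calls; the function names `clean_text` and `extract_pdf_text` and the specific cleaning rules are assumptions made for illustration, not the project's actual code.

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Normalize extracted resume text (illustrative rules, not GapIQ's own)."""
    # Fold compatibility characters (ligatures, non-breaking spaces) to plain forms.
    text = unicodedata.normalize("NFKC", raw)
    # Replace common bullet glyphs with a plain dash.
    text = re.sub(r"[•▪●◦■]", "-", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim each line and drop the empty ones.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


def extract_pdf_text(path: str) -> str:
    """Extract and clean the text layer of a PDF using pdfplumber."""
    import pdfplumber  # third-party; imported lazily so clean_text works alone

    with pdfplumber.open(path) as pdf:
        # extract_text() can return None for pages without a text layer.
        pages = [page.extract_text() or "" for page in pdf.pages]
    return clean_text("\n".join(pages))
```

Keeping the cleaning step separate from the PDF step makes it easy to test the normalization rules on plain strings.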

Job Description Processing

Raw JD text input
Skill extraction
Seniority detection
Domain keyword identification

Structured Extraction (Phase 3A)

GapIQ converts unstructured text into validated structured JSON using:

Groq LLM
Pydantic schema validation
Deterministic formatting

Extracted entities include:

Skills
Experience signals
Keywords
Seniority indicators
Required skills
Nice-to-have skills

Deterministic Scoring Engine (Phase 3B)

GapIQ computes compatibility scores across multiple dimensions:

Score Axis	Description
Technical	Skill overlap between resume and JD
Experience	Experience relevance to job requirements
Seniority	Career level alignment
Domain	Domain and industry relevance
Language	Semantic language alignment

Scoring uses:

TF-IDF vectorization
Cosine similarity
Heuristic matching
Weighted scoring
Skill overlap analysis

AI Enhancement Layer (Phase 3C)

AI-generated insights include:

Skill gap analysis
Resume improvement suggestions
Recruiter strengths and weaknesses
Personalized learning roadmap

The enhancement layer is constrained to:

Maximum 2 LLM calls
Strict JSON output
Pydantic validation
Fallback-safe execution
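
One possible reading of these constraints is a small wrapper that spends at most two LLM calls, accepts only well-formed JSON, and otherwise returns an empty but valid result. In this sketch, `enhance`, `call_llm`, and the key check are hypothetical; the key names are taken from the `/analyze` response documented later in this README, and the two-call limit is treated here as a retry budget, which the source does not confirm.

```python
import json
from typing import Callable

MAX_LLM_CALLS = 2  # budget stated in the constraints above

# Keys expected in the AI-generated part of the analysis (assumed from /analyze).
REQUIRED_KEYS = {"gaps", "recommendations", "roadmap", "recruiter_lens"}


def fallback() -> dict:
    """Return a fresh, empty result that still satisfies the expected shape."""
    return {"gaps": [], "recommendations": [], "roadmap": [], "recruiter_lens": {}}


def enhance(prompt: str, call_llm: Callable[[str], str]) -> dict:
    """Return validated insights, or a safe fallback if the LLM misbehaves."""
    for _ in range(MAX_LLM_CALLS):
        try:
            data = json.loads(call_llm(prompt))
        except Exception:
            continue  # API error or non-JSON output: retry within the budget
        if isinstance(data, dict) and REQUIRED_KEYS <= data.keys():
            return data
    return fallback()
```

Because the deterministic scores are computed separately, a fallback here degrades the insights but never the scores.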

Explainability Layer

GapIQ emphasizes transparent evaluation.

The platform provides:

Matched skills
Missing skills
Priority-based gaps
Compatibility labels
Human-readable explanations

System Architecture

Frontend (React + Tailwind)
        ↓
FastAPI Backend
        ↓
Resume Parser + JD Processor
        ↓
Structured Extraction Layer
        ↓
Deterministic Scoring Engine
        ↓
LLM Enhancement Layer
        ↓
Final Analysis Response

Tech Stack

Frontend

Technology	Purpose
React	UI Framework
Vite	Frontend Build Tool
Tailwind CSS	Styling
Axios	API Communication

Backend

Technology	Purpose
FastAPI	API Backend
Pydantic	Data Validation
pdfplumber	Resume PDF Parsing
Scikit-learn	TF-IDF + Similarity Scoring
Groq API	LLM Inference
Python-dotenv	Environment Management

Deployment

Platform	Usage
Render	Backend Hosting
Vercel / Netlify	Frontend Hosting

Project Structure

gapiq/
│
├── backend/
│   ├── main.py
│   ├── extractor.py
│   ├── scorer.py
│   ├── enhancer.py
│   ├── models.py
│   ├── requirements.txt
│   ├── .env
│   └── uploads/
│
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   ├── App.jsx
│   │   └── main.jsx
│   ├── package.json
│   └── vite.config.js
│
└── docs/

Backend API Documentation

Base URL

http://localhost:8000

API Endpoints

Health Check

Endpoint

GET /health

Purpose

Checks whether the backend service is running.

Response

{
  "status": "ok"
}

Resume Parsing

Endpoint

POST /parse/resume

Input

Multipart PDF file upload.

Output

{
  "text": "parsed resume text"
}

Structured Extraction

Endpoint

POST /extract

Input

{
  "resume_text": "...",
  "jd_text": "..."
}

Output

{
  "resume": {
    "skills": [],
    "experience": [],
    "keywords": [],
    "seniority_signals": []
  },
  "jd": {
    "required_skills": [],
    "nice_to_have": [],
    "keywords": [],
    "seniority": ""
  }
}
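
To make the shape of this response concrete, here is a dependency-free sketch that validates such a payload. It uses standard-library dataclasses as a stand-in for the Pydantic models the backend relies on; the class names, the `parse_extraction` helper, and the assumption that every list holds plain strings are illustrative and not taken from `models.py`.

```python
from dataclasses import dataclass, field


def _str_list(data: dict, key: str) -> list[str]:
    """Return data[key] as a list of strings, defaulting to an empty list."""
    value = data.get(key, [])
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"'{key}' must be a list of strings")
    return list(value)


@dataclass
class ResumeExtract:
    skills: list[str] = field(default_factory=list)
    experience: list[str] = field(default_factory=list)
    keywords: list[str] = field(default_factory=list)
    seniority_signals: list[str] = field(default_factory=list)


@dataclass
class JDExtract:
    required_skills: list[str] = field(default_factory=list)
    nice_to_have: list[str] = field(default_factory=list)
    keywords: list[str] = field(default_factory=list)
    seniority: str = ""


def parse_extraction(payload: dict) -> tuple[ResumeExtract, JDExtract]:
    """Validate an /extract-style payload and return typed objects."""
    resume, jd = payload.get("resume"), payload.get("jd")
    if not isinstance(resume, dict) or not isinstance(jd, dict):
        raise ValueError("payload needs 'resume' and 'jd' objects")
    seniority = jd.get("seniority", "")
    if not isinstance(seniority, str):
        raise ValueError("'seniority' must be a string")
    resume_keys = ("skills", "experience", "keywords", "seniority_signals")
    jd_keys = ("required_skills", "nice_to_have", "keywords")
    return (
        ResumeExtract(*(_str_list(resume, k) for k in resume_keys)),
        JDExtract(*(_str_list(jd, k) for k in jd_keys), seniority),
    )
```

With Pydantic, the same checks would be declared as field types on `BaseModel` subclasses rather than written by hand.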

Deterministic Scoring

Endpoint

POST /score

Output

{
  "scores": {
    "technical": 78,
    "experience": 72,
    "seniority": 65,
    "domain": 70,
    "language": 80,
    "overall": 74,
    "role_fit": "Moderate Match"
  }
}

Full Analysis

Endpoint

POST /analyze

Output

{
  "scores": {},
  "gaps": [],
  "recommendations": [],
  "roadmap": [],
  "recruiter_lens": {}
}

Scoring Methodology

Technical Score

Calculated using:

Direct skill overlap
TF-IDF fallback similarity
Partial semantic alignment
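
The direct-overlap component can be sketched as a set comparison that also yields the matched and missing skills used for explanations. The function name `skill_overlap`, the case-insensitive matching, and the percentage formula are assumptions; the source does not specify how the overlap is turned into a number or how the TF-IDF fallback is blended in.

```python
def skill_overlap(resume_skills: list[str], required_skills: list[str]) -> dict:
    """Compare skill lists case-insensitively and report the overlap."""
    resume = {s.strip().lower() for s in resume_skills}
    required = {s.strip().lower() for s in required_skills}
    matched = sorted(resume & required)
    missing = sorted(required - resume)
    # Share of required skills found in the resume, on a 0-100 scale.
    score = round(100 * len(matched) / len(required)) if required else 0
    return {"score": score, "matched": matched, "missing": missing}
```

For example, a resume listing Python, SQL, and Docker against a JD requiring Python, SQL, AWS, and Kubernetes matches two of four required skills, giving a score of 50 with AWS and Kubernetes reported as missing.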

Experience Score

Computed using TF-IDF similarity between:

Resume experience
JD requirements
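
The idea behind TF-IDF similarity can be shown with a small standard-library stand-in for the scikit-learn pipeline. The sketch below fits on just the two texts, weights terms with the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1`, and takes the cosine of the resulting vectors. The function name `tfidf_cosine` and the tokenization are assumptions, and the values will not necessarily match what `scorer.py` produces.

```python
import math
import re
from collections import Counter


def tfidf_cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity of two texts under a two-document TF-IDF model."""
    # Lowercase and keep word tokens of two or more characters.
    docs = [re.findall(r"\b\w\w+\b", t.lower()) for t in (text_a, text_b)]
    counts = [Counter(d) for d in docs]
    vocab = set(counts[0]) | set(counts[1])
    n = len(docs)
    # Smoothed inverse document frequency for every term.
    idf = {
        t: math.log((1 + n) / (1 + sum(t in c for c in counts))) + 1
        for t in vocab
    }
    # Term frequency times IDF, stored sparsely per document.
    vecs = [{t: c[t] * idf[t] for t in c} for c in counts]
    dot = sum(w * vecs[1].get(t, 0.0) for t, w in vecs[0].items())
    norms = [math.sqrt(sum(w * w for w in v.values())) for v in vecs]
    if norms[0] == 0 or norms[1] == 0:
        return 0.0  # at least one text had no usable tokens
    return dot / (norms[0] * norms[1])
```

The result lies between 0 (no shared terms) and 1 (identical term distributions), so multiplying by 100 would put it on the same scale as the other axes.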

Seniority Score

Uses heuristic matching against predefined career-level keywords.
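
A keyword heuristic of this kind might be sketched as follows. The keyword table, the default level, and the per-level penalty are all hypothetical values chosen for illustration; the project's actual keyword lists are not given in this README.

```python
import re

# Hypothetical career-level keywords, ordered from junior (1) to senior (3).
LEVEL_KEYWORDS = {
    1: ("intern", "junior", "entry level", "graduate"),
    2: ("mid level", "associate"),
    3: ("senior", "lead", "principal", "staff"),
}
PENALTY_PER_LEVEL = 35  # hypothetical points lost per level of mismatch


def detect_level(text: str, default: int = 2) -> int:
    """Return the highest career level whose keyword appears as a whole word."""
    lowered = text.lower().replace("-", " ")
    found = [
        level
        for level, words in LEVEL_KEYWORDS.items()
        if any(re.search(rf"\b{re.escape(w)}\b", lowered) for w in words)
    ]
    return max(found) if found else default


def seniority_score(resume_text: str, jd_text: str) -> int:
    """Score 100 for matching levels, minus a penalty per level of distance."""
    gap = abs(detect_level(resume_text) - detect_level(jd_text))
    return max(0, 100 - PENALTY_PER_LEVEL * gap)
```

Whole-word matching avoids false hits such as "intern" inside "internal" or "lead" inside "leadership".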

Domain Score

Measures overlap between:

Resume keywords
JD keywords

Language Score

Measures semantic alignment between:

Overall resume profile
Overall job description profile

Overall Score

Weighted aggregation:

weighted = (
    technical * 0.35 +
    experience * 0.30 +
    seniority * 0.10 +
    domain * 0.15 +
    language * 0.15
)
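
Note that the listed weights sum to 1.05 rather than 1.00, so the raw weighted sum can exceed 100. One way to keep the result on a 0 to 100 scale is to divide by the weight total. Under that assumption, the sample scores from the `/score` response above (78, 72, 65, 70, 80) give 77.9 / 1.05 ≈ 74, which agrees with the sample's `overall` value. Whether `scorer.py` normalizes this way is not stated in this README, so treat the sketch below, including the name `overall_score`, as an assumption.

```python
WEIGHTS = {
    "technical": 0.35,
    "experience": 0.30,
    "seniority": 0.10,
    "domain": 0.15,
    "language": 0.15,
}


def overall_score(scores: dict) -> int:
    """Weighted average of the axis scores, normalized by the weight total."""
    weighted = sum(scores[axis] * weight for axis, weight in WEIGHTS.items())
    # Dividing by the total keeps the result within 0-100 even though
    # the weights add up to 1.05.
    return round(weighted / sum(WEIGHTS.values()))
```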

Compatibility classification:

Overall Score	Classification
75+	Strong Match
55–74	Moderate Match
40–54	Potential Match
Below 40	Weak Match
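
The table translates directly into a threshold function. The name `classify` is hypothetical, and treating each band's lower bound as inclusive (so that a non-integer score such as 74.5 falls under Moderate Match) is an assumption.

```python
def classify(overall: float) -> str:
    """Map an overall score to its compatibility label."""
    if overall >= 75:
        return "Strong Match"
    if overall >= 55:
        return "Moderate Match"
    if overall >= 40:
        return "Potential Match"
    return "Weak Match"
```

With these thresholds, the sample overall score of 74 maps to "Moderate Match", the same `role_fit` value shown in the `/score` response.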

Frontend UI Components

Component	Purpose
UploadSection	Resume upload + JD input
ScoreSection	Compatibility visualization
GapSection	Missing skills display
RecommendationSection	Resume improvement suggestions
RoadmapSection	Learning roadmap
RecruiterSection	Recruiter perspective analysis

Screenshots 📸

Code Structure

Groq API

Vercel for Deployment

Render for Backend

GAPIQ-UI

Resume with Job Description

Resume Matching Scores - Analysis

Skill Gaps - Analysis

Resume Recommendation - Modifications

Learning Roadmap - Analysis

Recruiter Lens - Scanning

Installation Guide

Clone Repository

git clone <repository-url>
cd gapiq

Backend Setup

Create Virtual Environment

cd backend
python -m venv venv

Activate Virtual Environment

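Windows (Git Bash)

source venv/Scripts/activate

Windows (Command Prompt)

venv\Scripts\activate.bat

Windows (PowerShell)

venv\Scripts\Activate.ps1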

Linux / Mac

source venv/bin/activate

Install Dependencies

pip install -r requirements.txt

Configure Environment Variables

Create a .env file in the backend directory:

GROQ_API_KEY=your_api_key_here
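
On the Python side, the key could be read at startup along these lines. `load_dotenv` is the real python-dotenv entry point; the helper name `load_groq_key` and the fail-fast behaviour are assumptions for illustration.

```python
import os


def load_groq_key() -> str:
    """Read GROQ_API_KEY from the environment, failing early if it is absent."""
    try:
        from dotenv import load_dotenv  # provided by python-dotenv

        # Loads variables from a .env file without overriding existing ones.
        load_dotenv()
    except ImportError:
        pass  # fall back to variables already present in the environment
    key = os.getenv("GROQ_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError("GROQ_API_KEY is not set; add it to backend/.env")
    return key
```

Failing at startup gives a clearer error than a rejected request on the first LLM call.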

Run Backend

uvicorn main:app --reload

Backend URL:

http://127.0.0.1:8000

Frontend Setup

Install Dependencies

cd frontend
npm install

Run Frontend

npm run dev

Frontend URL:

http://localhost:5173

Deployment Guide

Backend Deployment (Render)

Recommended Configuration

Root Directory

backend

Build Command

pip install -r requirements.txt

Start Command

uvicorn main:app --host 0.0.0.0 --port $PORT

Environment Variables

GROQ_API_KEY=your_api_key

Frontend Deployment

Deploy frontend separately using:

Vercel
Netlify

After deployment, update the frontend's API base URL so that it points to the deployed backend instead of localhost.

Design Principles

GapIQ follows these core principles:

Explainability First

Every score must be understandable.

Hybrid AI Architecture

Combines:

Deterministic scoring
LLM enhancement

Minimal LLM Dependence

The system minimizes:

hallucination risk
API cost
latency

Lightweight Infrastructure

Avoids:

heavy GPU inference
local transformer hosting
unnecessary ML overhead

Current Limitations

Resume parsing may vary across complex PDF layouts
LLM extraction quality depends on resume formatting
Domain understanding is keyword dependent
TF-IDF similarity lacks deep semantic reasoning
No authentication system currently implemented
No database persistence layer yet

Future Enhancements

Technical Improvements

Embedding API integration
Adaptive scoring weights by role type
Resume section segmentation
ATS compatibility scoring
Multi-resume comparison
Historical analytics dashboard

Research Enhancements

Confidence-aware scoring
Explainable AI evaluation metrics
Hybrid semantic retrieval
Role-specific calibration models
Bias analysis in resume evaluation

Product Features

User authentication
Resume history
Exportable reports
Skill trend analytics
Personalized job recommendations
Recruiter dashboard

Research Scope

GapIQ also serves as a research-oriented project in:

Explainable AI for recruitment
Intelligent career guidance systems
Resume-job semantic alignment
Human-centered AI scoring systems
Hybrid deterministic + generative architectures

Security Considerations

API keys stored using environment variables
File upload validation required
Rate limiting recommended
Input sanitization required
Sensitive resume data should not be permanently stored without consent
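
As an illustration of the upload validation listed above, a minimal check might reject files that are not PDFs or that exceed a size limit before any parsing happens. The function name `validate_pdf_upload` and the 5 MB limit are hypothetical; the `%PDF-` prefix is the standard PDF file signature.

```python
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical 5 MB limit


def validate_pdf_upload(filename: str, content: bytes) -> None:
    """Raise ValueError if the upload does not look like an acceptable PDF."""
    if not filename.lower().endswith(".pdf"):
        raise ValueError("only .pdf files are accepted")
    if len(content) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the upload size limit")
    # A PDF file starts with the signature "%PDF-" followed by a version.
    if not content.startswith(b"%PDF-"):
        raise ValueError("file content is not a PDF")
```

Checking the signature as well as the extension guards against renamed files, though it is not a substitute for safe parsing.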

Performance Optimizations

Implemented optimizations include:

Limited LLM calls
Lightweight TF-IDF scoring
Pydantic validation
Structured logging
Controlled output schemas

Recommended future optimizations:

Redis caching
Async processing
Background task queues
Request batching

Contributors

Project Developer

Renold Stephen R

M.Tech Computer Science and Engineering, CHRIST (Deemed to be University)

Acknowledgements

CHRIST (Deemed to be University)
Groq
FastAPI Community
Open-source NLP ecosystem

License

This project is currently developed for academic and research purposes.

Licensing terms may be defined at the time of a production release.

Final Note

GapIQ is designed not merely as an ATS-style rejection system, but as an explainable AI career guidance platform focused on helping candidates understand, improve, and strategically align themselves with industry roles.
