Role: Security / Defense Data Analyst
Objective: Identify terrorism hotspots, trends, and patterns worldwide using historical data (1970–2017).
This project combines data engineering, dimensional modeling, and business-focused analytics, ending with an interactive Power BI dashboard.
Terrorism data is:
- large (180K+ records),
- wide (130+ columns),
- messy,
- and not analytics-friendly.
Goal: Transform raw terrorism records into a clean, performant analytical model that enables:
- regional analysis,
- country-level risk assessment,
- group behavior profiling,
- and temporal trend discovery.
Raw CSV (Global Terrorism DB)
↓
Python (Pandas)
↓
Star Schema (Facts + Dimensions)
↓
Power BI Semantic Model
↓
Interactive Dashboards
The original dataset (181,691 rows × 128 columns) was remodeled into a star schema to achieve:
- high query performance
- clarity and maintainability
- BI-friendly design
Dimension tables:
- DimLocation (Country, Region, State, City)
- DimAttackType
- DimTargetType
- DimTarget
- DimWeapon
- DimGroup
- DimClaimMode
- DimEventDesc
Fact tables:
- FactAttackEvent
- FactKidnapping
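As an illustration of how a dimension such as DimLocation and its surrogate key might be derived, here is a minimal pandas sketch. It is not the notebook's code: the column names (`eventid`, `country_txt`, `region_txt`, `provstate`, `city`, `nkill`) and the key name `LocationKey` are assumptions, and the rows are toy data.

```python
import pandas as pd

# Toy rows standing in for the raw dataset; column names are assumptions.
raw = pd.DataFrame({
    "eventid": [1, 2, 3, 4],
    "country_txt": ["Iraq", "Iraq", "Spain", "Iraq"],
    "region_txt": [
        "Middle East & North Africa",
        "Middle East & North Africa",
        "Western Europe",
        "Middle East & North Africa",
    ],
    "provstate": ["Baghdad", "Baghdad", "Basque Country", "Nineveh"],
    "city": ["Baghdad", "Baghdad", "Bilbao", "Mosul"],
    "nkill": [3, 0, 1, 5],
})

location_cols = ["country_txt", "region_txt", "provstate", "city"]

# One row per distinct location, in order of first appearance.
dim_location = raw[location_cols].drop_duplicates().reset_index(drop=True)

# Surrogate key: a plain integer starting at 1 (hypothetical name).
dim_location.insert(0, "LocationKey", list(range(1, len(dim_location) + 1)))

# The fact table keeps the event id, the surrogate key, and the measures,
# and drops the descriptive text that now lives in the dimension.
fact_attack_event = raw.merge(
    dim_location, on=location_cols, how="left", validate="many_to_one"
)[["eventid", "LocationKey", "nkill"]]

print(dim_location)
print(fact_attack_event)
```

The `validate="many_to_one"` argument makes the merge fail loudly if the dimension ever contains duplicate locations, which is a cheap guard against fan-out in the fact table.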
Key steps:
- Cleaned inconsistent text fields
- Standardized categorical values
- Removed noise & invalid markers (-9, -99)
- Generated surrogate keys
- Split wide dataset into analytical dimensions
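The cleaning steps above could look roughly like the sketch below. It assumes that -9 and -99 appear as "unknown" codes in numeric columns and uses hypothetical column names (`city`, `nperps`, `ishostkid`); the helper `clean` is illustrative and not taken from the notebook.

```python
import pandas as pd

# Toy data; column names are assumptions.
raw = pd.DataFrame({
    "city": ["  baghdad ", "Baghdad", "new   york", None],
    "nperps": [-99, 4, -9, 2],
    "ishostkid": [0, -9, 1, 0],
})

INVALID_MARKERS = [-9, -99]


def clean(df, text_cols, numeric_cols):
    """Return a cleaned copy: tidy text columns, blank out invalid markers."""
    out = df.copy()
    for col in text_cols:
        # Trim, collapse repeated whitespace, and apply one consistent casing.
        out[col] = (
            out[col]
            .astype("string")
            .str.strip()
            .str.replace(r"\s+", " ", regex=True)
            .str.title()
        )
    for col in numeric_cols:
        # Rows holding -9 / -99 become missing values (NaN).
        out[col] = out[col].mask(out[col].isin(INVALID_MARKERS))
    return out


cleaned = clean(raw, text_cols=["city"], numeric_cols=["nperps", "ishostkid"])
print(cleaned)
```

Title casing is only one possible standardization; it would not suit acronym-heavy columns such as group names, where a lookup table of canonical spellings is likely a better fit.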
📓 Notebook:
👉 GT data pre-processing and modeling.ipynb
🔍 Full Data Engineering & Transformation Code
This section includes:
- dimension creation logic
- joins & merges
- cleanup and normalization
- fact table construction
- CSV exports for BI ingestion
(All original notebook code is preserved.)
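For readers without the notebook at hand, the join-check and export steps might be sketched as follows. This is a hedged illustration only: the tables are toy data, `WeaponKey` and `WeaponType` are assumed column names, and `export_tables` is a hypothetical helper rather than a function from the notebook.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy dimension and fact tables; names and columns are assumptions.
dim_weapon = pd.DataFrame(
    {"WeaponKey": [1, 2], "WeaponType": ["Explosives", "Firearms"]}
)
fact_attack_event = pd.DataFrame(
    {"eventid": [101, 102, 103], "WeaponKey": [1, 2, 1], "nkill": [4, 0, 2]}
)

# Referential integrity check: every key in the fact table should
# resolve to a row in the dimension. "left_only" rows are orphans.
check = fact_attack_event.merge(
    dim_weapon, on="WeaponKey", how="left", indicator=True
)
orphans = check[check["_merge"] == "left_only"]


def export_tables(tables, out_dir):
    """Write each DataFrame to <out_dir>/<name>.csv and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, table in tables.items():
        path = out_dir / f"{name}.csv"
        table.to_csv(path, index=False)  # no pandas index column in the file
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    written = export_tables(
        {"DimWeapon": dim_weapon, "FactAttackEvent": fact_attack_event}, tmp
    )
    written_names = [p.name for p in written]
    reloaded = pd.read_csv(written[1])  # round-trip the fact table

print(written_names, len(orphans))
```

Writing with `index=False` keeps the pandas row index out of the CSV, so Power BI sees only the modeled columns.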
Due to file size and sharing constraints, the Power BI file and screenshots are included locally.
📂 Dashboard File:
👉 Global Terrorism Dashboard.pbix
- 2014 recorded the highest number of attacks globally
- Explosives & firearms dominate attack methods
- A small number of groups account for a large share of incidents
- Iraq recorded ~23,000 attacks
- ~3,926 of Iraq's attacks occurred in 2014 alone
- Strong geographic clustering is visible via heatmaps
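Figures like these can be reproduced with a few pandas aggregations. The sketch below uses made-up rows and placeholder group names, and assumes columns named `iyear`, `gname`, and `weaptype1_txt`; it shows the shape of the calculation, not the project's actual numbers.

```python
import pandas as pd

# Made-up events; group names "A", "B", "C" are placeholders.
events = pd.DataFrame({
    "iyear": [2013, 2014, 2014, 2014, 2015, 2015],
    "gname": ["A", "A", "A", "B", "A", "C"],
    "weaptype1_txt": [
        "Explosives", "Explosives", "Firearms",
        "Explosives", "Firearms", "Melee",
    ],
})

# Attacks per year and the year with the most attacks.
attacks_per_year = events.groupby("iyear").size()
peak_year = attacks_per_year.idxmax()

# Share of attacks by weapon type (fractions that sum to 1).
weapon_share = events["weaptype1_txt"].value_counts(normalize=True)

# Concentration: share of all incidents attributed to the top N groups.
TOP_N = 1
group_counts = events["gname"].value_counts()
top_n_share = group_counts.head(TOP_N).sum() / group_counts.sum()

print(peak_year)
print(weapon_share)
print(round(top_n_share, 3))
```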
Group-level analysis includes:
- active years & lifespan
- success rate
- preferred weapons
- suicide attack usage
- target preferences
Example: ETA was active from 1972 to 2010, with:
- ~1,650 attacks
- 85.45% success rate
- Primary targets: police & civil guard units
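A group profile of this kind can be built with a single grouped aggregation. The following is a sketch under assumptions: the column names (`gname`, `iyear`, `success`, `suicide`, `weaptype1_txt`, `targtype1_txt`) are assumed, the groups "G1" and "G2" are fictional, and `most_common` is a hypothetical helper.

```python
import pandas as pd

# Fictional groups and events, for illustration only.
events = pd.DataFrame({
    "gname": ["G1", "G1", "G1", "G1", "G2", "G2"],
    "iyear": [1972, 1980, 1980, 2010, 2001, 2003],
    "success": [1, 1, 0, 1, 1, 0],   # 1 = attack succeeded
    "suicide": [0, 0, 0, 0, 1, 0],   # 1 = suicide attack
    "weaptype1_txt": [
        "Explosives", "Firearms", "Explosives",
        "Explosives", "Firearms", "Firearms",
    ],
    "targtype1_txt": [
        "Police", "Police", "Military",
        "Police", "Business", "Business",
    ],
})


def most_common(series):
    """Most frequent value; ties resolve to the first value in sorted order."""
    return series.mode().iloc[0]


profile = events.groupby("gname").agg(
    first_year=("iyear", "min"),
    last_year=("iyear", "max"),
    attacks=("iyear", "size"),
    success_rate=("success", "mean"),
    suicide_attacks=("suicide", "sum"),
    preferred_weapon=("weaptype1_txt", most_common),
    preferred_target=("targtype1_txt", most_common),
)

# Lifespan counts both the first and the last active year.
profile["active_span_years"] = profile["last_year"] - profile["first_year"] + 1
# Express the success rate as a percentage with two decimals.
profile["success_rate"] = (profile["success_rate"] * 100).round(2)

print(profile)
```

Because the success flag is 0/1, its mean is the success rate directly, which avoids a separate count-and-divide step.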
This project was developed as part of The Sparks Foundation Internship Program.
It is a demonstration and learning project only and does not represent operational or political views.
- Demonstrates real-world data engineering & modeling
- Shows strong dimensional modeling practices
- Balances technical depth with business insight
- Uses a star schema layout that carries over to production BI environments
This project reflects how I approach messy data, structure it properly, and turn it into insight.