🌍 Exploratory Data Analysis — Global Terrorism

Role: Security / Defense Data Analyst

Objective: Identify terrorism hotspots, trends, and patterns worldwide using historical data (1970–2017).

This project combines data engineering, dimensional modeling, and business-focused analytics, ending with an interactive Power BI dashboard.

🧠 Problem Statement

Terrorism data is:

large (180K+ records),
wide (128 columns),
messy,
and not analytics-friendly.

Goal: Transform raw terrorism records into a clean, performant analytical model that enables:

regional analysis,
country-level risk assessment,
group behavior profiling,
and temporal trend discovery.

🏗️ Solution Overview

Architecture

Raw CSV (Global Terrorism DB)
        ↓
Python (Pandas)
        ↓
Star Schema (Facts + Dimensions)
        ↓
Power BI Semantic Model
        ↓
Interactive Dashboards

🧱 Data Modeling (Star Schema)

The original dataset (181,691 × 128) was normalized into a star schema to achieve:

high query performance
clarity and maintainability
BI-friendly design

Dimensions

DimLocation (Country, Region, State, City)
DimAttackType
DimTargetType
DimTarget
DimWeapon
DimGroup
DimClaimMode
DimEventDesc

Fact Tables

FactAttackEvent
FactKidnapping
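As a rough illustration of how a dimension such as DimLocation could be derived, the sketch below deduplicates the descriptive columns and assigns an integer surrogate key. It is not the notebook's code: the column names (`country_txt`, `region_txt`, `provstate`, `city`), the helper `build_dimension`, and the key name `LocationKey` are assumptions made for this example, and the rows are toy data.

```python
import pandas as pd

# Toy stand-in for the raw data; column names are assumed, not taken from the notebook.
raw = pd.DataFrame({
    "eventid": [1, 2, 3, 4],
    "country_txt": ["Iraq", "Iraq", "Spain", "Iraq"],
    "region_txt": [
        "Middle East & North Africa",
        "Middle East & North Africa",
        "Western Europe",
        "Middle East & North Africa",
    ],
    "provstate": ["Baghdad", "Baghdad", "Basque Country", "Nineveh"],
    "city": ["Baghdad", "Baghdad", "Bilbao", "Mosul"],
})


def build_dimension(df, columns, key_name):
    """Return one row per distinct combination of `columns` with a surrogate key.

    Hypothetical helper: keys are consecutive integers starting at 1.
    """
    dim = df[columns].drop_duplicates().reset_index(drop=True)
    dim.insert(0, key_name, list(range(1, len(dim) + 1)))
    return dim


location_cols = ["country_txt", "region_txt", "provstate", "city"]
dim_location = build_dimension(raw, location_cols, "LocationKey")
print(dim_location)
```

The same helper could be reused for the other dimensions by passing a different column list, which keeps key generation consistent across tables.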

⚙️ Part 1 — Data Preprocessing & Modeling (Python)

Key steps:

Cleaned inconsistent text fields
Standardized categorical values
Removed noise & invalid markers (-9, -99)
Generated surrogate keys
Split wide dataset into analytical dimensions
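The cleaning steps above might look roughly like the following sketch. It assumes that -9 and -99 mark unknown values in numeric columns and treats them as missing; the column names (`attacktype1_txt`, `nperps`, `ransom`) and both helper functions are illustrative assumptions, not the notebook's actual code.

```python
import numpy as np
import pandas as pd

# Toy rows; column names are assumed for illustration.
raw = pd.DataFrame({
    "attacktype1_txt": ["  armed  assault", "Armed Assault ", "bombing/explosion", None],
    "nperps": [3, -99, 12, -9],
    "ransom": [0, -9, 1, 0],
})


def clean_text(series):
    """Trim whitespace, collapse repeated spaces, and title-case category labels."""
    return (
        series.astype("string")
        .str.strip()
        .str.replace(r"\s+", " ", regex=True)
        .str.title()
    )


def mask_invalid_markers(df, columns, markers=(-9, -99)):
    """Replace sentinel codes in `columns` with NaN so aggregates ignore them."""
    out = df.copy()
    out[columns] = out[columns].replace(list(markers), np.nan)
    return out


cleaned = mask_invalid_markers(raw, ["nperps", "ransom"])
cleaned["attacktype1_txt"] = clean_text(cleaned["attacktype1_txt"])
print(cleaned)
```

Converting the markers to missing values rather than dropping the rows keeps every event in the fact table while stopping the sentinel codes from distorting sums and averages.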

📓 Notebook: 👉 GT data pre-processing and modeling.ipynb

🔍 Full Data Engineering & Transformation Code

This section includes:

dimension creation logic
joins & merges
cleanup and normalization
fact table construction
CSV exports for BI ingestion

(All original notebook code is preserved.)
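For readers who want the general shape of the fact-table step without opening the notebook, here is a minimal sketch (not the notebook code itself). It swaps a descriptive column for its surrogate key with a left merge and writes the result as CSV. The column names, the key name `AttackTypeKey`, and the toy rows are assumptions for illustration.

```python
import io

import pandas as pd

# Cleaned event rows (toy data; column names are assumed).
events = pd.DataFrame({
    "eventid": [101, 102, 103],
    "iyear": [2014, 2014, 2015],
    "attacktype1_txt": ["Bombing/Explosion", "Armed Assault", "Bombing/Explosion"],
    "nkill": [5, 0, 2],
})

# A dimension table that already carries its surrogate key.
dim_attack_type = pd.DataFrame({
    "AttackTypeKey": [1, 2],
    "attacktype1_txt": ["Armed Assault", "Bombing/Explosion"],
})

# Look up the key for each event, then drop the descriptive text from the fact table.
# validate="many_to_one" raises if the dimension contains duplicate labels.
fact_attack_event = (
    events.merge(dim_attack_type, on="attacktype1_txt", how="left", validate="many_to_one")
    .drop(columns=["attacktype1_txt"])
)

# Export for BI ingestion; pass a file path instead of the buffer to write to disk.
buffer = io.StringIO()
fact_attack_event.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

A left merge keeps every event even when a label has no match in the dimension, so unmatched rows show up as missing keys that can be inspected rather than silently disappearing.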

📊 Part 2 — Power BI Dashboard & Analysis

Due to file size and sharing constraints, the dashboard is not published online; the Power BI file and screenshots are included in this repository.

📂 Dashboard File: 👉 Global Terrorism Dashboard.pbix

🔍 Key Insights

🌐 Global Overview

2014 recorded the highest number of attacks globally
Explosives & firearms dominate attack methods
A small number of groups account for a large share of incidents
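Figures like these can be cross-checked in pandas before building visuals. The sketch below counts attacks per year, finds the peak year, and measures how much of the total the most active groups account for. The column names (`iyear`, `gname`) and the toy rows are assumptions; the numbers it prints describe the toy data only, not the real dataset.

```python
import pandas as pd

# Toy fact rows; column names are assumed for illustration.
fact = pd.DataFrame({
    "iyear": [2013, 2014, 2014, 2014, 2015, 2015],
    "gname": ["Group A", "Group A", "Group A", "Group B", "Group C", "Group A"],
})

# Attacks per year and the year with the most attacks.
attacks_per_year = fact.groupby("iyear").size()
peak_year = attacks_per_year.idxmax()

# Share of all incidents attributed to each group, largest first.
group_share = fact["gname"].value_counts(normalize=True)
top_n = 1  # how many groups to include in the concentration measure
top_share = group_share.head(top_n).sum()

print(f"Peak year: {peak_year}")
print(f"Share of incidents from the top {top_n} group(s): {top_share:.1%}")
```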

🏳️ Country-Level Insights

Iraq recorded ~23,000 attacks
~3,926 of those attacks occurred in 2014 alone
Strong geographic clustering is visible via heatmaps

🧨 Group Behavior Analysis

Analysis includes:

active years & lifespan
success rate
preferred weapons
suicide attack usage
target preferences

Example: ETA operated from 1972–2010, with:

~1,650 attacks
85.45% success rate
Primary targets: police & civil guard units
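A per-group profile of this kind can be approximated with a single grouped aggregation. The sketch below is one possible way to compute active years, attack count, success rate, suicide attack usage, and the most common weapon. The column names (`gname`, `iyear`, `success`, `suicide`, `weaptype1_txt`) and the toy rows are assumptions, and the dashboard's own measures may be defined differently.

```python
import pandas as pd

# Toy fact rows; column names and values are assumed for illustration.
fact = pd.DataFrame({
    "gname": ["G1", "G1", "G1", "G2", "G2"],
    "iyear": [1972, 1980, 2010, 2001, 2003],
    "success": [1, 0, 1, 1, 1],   # 1 = attack succeeded
    "suicide": [0, 0, 0, 1, 0],   # 1 = suicide attack
    "weaptype1_txt": ["Explosives", "Firearms", "Explosives", "Firearms", "Firearms"],
})

profile = fact.groupby("gname").agg(
    first_year=("iyear", "min"),
    last_year=("iyear", "max"),
    attacks=("iyear", "size"),
    success_rate=("success", "mean"),
    suicide_attacks=("suicide", "sum"),
    # Most frequent weapon; ties resolve to the alphabetically first label.
    top_weapon=("weaptype1_txt", lambda s: s.mode().iloc[0]),
)

# Lifespan counted inclusively, so 1972 to 2010 is 39 years.
profile["active_span_years"] = profile["last_year"] - profile["first_year"] + 1
print(profile)
```

Taking the mean of a 0/1 success flag gives the success rate directly, which avoids a separate count of successful attacks.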

📸 Dashboard Screenshots

Overview

Country Analysis

Group Analysis

📎 Resources

📄 Power BI Dashboard
📘 Dataset Codebook

⚠️ Disclaimer

This project was developed as part of The Sparks Foundation Internship Program.

It is a demonstration and learning project only and does not represent operational or political views.

⭐ Why This Project Matters

Demonstrates real-world data engineering & modeling
Shows strong dimensional modeling practices
Balances technical depth with business insight
Scales well to production BI environments

This project reflects how I approach messy data, structure it properly, and turn it into insight.
