End-to-end analysis of NYC Yellow Taxi trip data using PySpark, covering data cleaning, EDA, feature engineering, demand patterns, and fare prediction modeling.
Updated Jan 4, 2026 - Jupyter Notebook
Through exploratory data analysis and statistical evaluation, this project investigates how customer attributes, specifically gender, age, and user type, influence ride duration. The analysis aims to identify patterns in trip behavior and assess whether demographic and subscription characteristics are associated with differences in usage.
Power BI dashboard analysing rail operations: ticket sales, delays, revenue, and customer behaviour.
An end-to-end data analysis project focused on OLA ride-sharing data, covering trip patterns, demand analysis, and driver performance metrics, using SQL and Power BI.
30-year U.S. airline market analysis (1993–2024) examining pricing trends, competition effects, demand concentration, and route-level business intelligence in Python.
Difference-in-Differences analysis of bus-lane policies and ridership trends in Israel.
FMCSA carrier data extraction tool.
Apache Airflow ETL pipeline that consolidates multi-format toll road traffic data (CSV, TSV, fixed-width) into a unified, transformed dataset using BashOperators and scheduled workflows.
Predictive model for forecasting hourly traffic volumes at different road junctions based on historical traffic data.
Real-world Markov Process Simulation using Divvy Bike Data (GBFS), demonstrating SAS-to-Python migration for operational analytics and forecasting.
Interactive Power BI dashboard analyzing Mumbai Local Railway passenger traffic, peak-hour congestion, and station utilization using DAX, Power Query, and Excel.
Python EDA on Uber ride request data to analyze trip patterns, peak demand hours, cancellation trends, and supply-demand gaps using Pandas and Seaborn.
Deep exploratory analysis of NYC TLC trip data to understand demand patterns, zone-level variability, seasonality, and revenue distribution. Conducted structured EDA on spatial heterogeneity, temporal trends, skew, feature correlations, and lag effects. Built Prophet and LightGBM models.
Ride-sharing market and demand analysis for Chicago, integrating trip, competitor, and weather data to generate operational and strategic insights.
🚖 Analyze NYC Yellow Taxi trip data for fare prediction and demand insights using PySpark, enabling efficient data processing and accurate modeling.