A modular Python project for autonomous driving research and prototyping, fully integrated with the BeamNG.tech simulator and Foxglove visualization. This system combines traditional computer vision algorithms and deep learning (CNN, YOLO) with real-time sensor fusion and autonomous vehicle control to tackle:
- Multi-Lane Detection: YOLOP, Traditional CV
- Traffic Signs: Classification & Detection
- Traffic Lights: Classification & Detection
- Object Detection: Vehicles, pedestrians, cyclists and more
- Multi-Sensor Fusion: Camera, Lidar, Radar, GPS, IMU
- Real-Time Control: PID steering, cruise control (CC), automatic emergency braking (AEB)
- Visualization: Real-time monitoring with Foxglove WebSocket + multiple CV windows
- Configuration System: YAML-based modular settings
VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception (BeamNG.tech)
Evaluation of the multi-lane perception pipeline across various environmental edge cases, including high-glare transitions, low-light tunnels, and heavy atmospheric fog:
Extended Demo: Watch the full video here
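For readers curious about the traditional-CV half of the lane pipeline, here is a minimal sketch of the general technique (Canny edges plus a probabilistic Hough transform); the thresholds and region-of-interest geometry are illustrative assumptions, not VisionPilot's tuned values.

```python
import cv2
import numpy as np

def detect_lane_segments(frame):
    """Classic CV lane detection: grayscale -> blur -> edges -> Hough lines.

    Thresholds are illustrative; a real pipeline tunes them per lighting
    condition (glare, tunnels, fog).
    """
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)

    # Keep only a trapezoidal region of interest in front of the vehicle.
    h, w = edges.shape
    mask = np.zeros_like(edges)
    roi = np.array([[(0, h), (w, h), (int(0.55 * w), int(0.6 * h)),
                     (int(0.45 * w), int(0.6 * h))]], dtype=np.int32)
    cv2.fillPoly(mask, roi, 255)
    masked = cv2.bitwise_and(edges, mask)

    # Probabilistic Hough transform yields candidate lane segments as an
    # array of (x1, y1, x2, y2) rows, or None if nothing is found.
    return cv2.HoughLinesP(masked, rho=2, theta=np.pi / 180, threshold=50,
                           minLineLength=40, maxLineGap=100)
```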
Watch the Emergency Braking System (AEB) in action with real-time radar filtering and collision avoidance:
Extended Demo: Watch the full video here
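Conceptually, an AEB stage like this filters radar returns down to approaching targets ahead of the vehicle, computes a time-to-collision (TTC) for the nearest one, and ramps braking up as TTC shrinks. The sketch below illustrates that idea only; the field names and thresholds are assumptions, not VisionPilot's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class RadarReturn:
    distance_m: float     # range to the target
    rel_speed_mps: float  # closing speed (positive = approaching)
    azimuth_deg: float    # bearing relative to the vehicle heading

def aeb_brake_command(returns, ttc_brake_s=1.5, cone_half_angle_deg=5.0):
    """Return a brake command in [0, 1] from filtered radar returns."""
    # Filter: keep approaching targets roughly ahead of the vehicle.
    targets = [r for r in returns
               if abs(r.azimuth_deg) < cone_half_angle_deg and r.rel_speed_mps > 0]
    if not targets:
        return 0.0

    nearest = min(targets, key=lambda r: r.distance_m)
    ttc = nearest.distance_m / nearest.rel_speed_mps  # seconds to impact
    if ttc >= ttc_brake_s:
        return 0.0

    # Ramp braking toward full as TTC approaches zero.
    return min(1.0, 1.0 - ttc / ttc_brake_s)
```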
See the Blind Spot Detection (BSD) system in action using radar data to identify vehicles in the blind spot:
Extended Demo: Watch the full video here
This demo shows real-time traffic sign detection and classification:
Extended Demo: Watch the full video here
Note: VisionPilot does not yet support multi-camera setups. This is for demonstration purposes only.
This demo shows real-time traffic light detection and classification:
No extended demo available yet.
Watch the improved autonomous lane keeping demo (v2) in BeamNG.tech, featuring smoother fused CV+SCNN lane detection, stable PID steering, and robust cruise control:
Extended Demo: Watch the full video here
Note: Very low-light (tunnel) scenarios are not yet supported.
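The steering side of this demo feeds the lateral offset from the fused lane estimate into a PID loop. A minimal, self-contained version of such a control law is sketched below; the gains and clamping range are placeholders, not the project's tuned values.

```python
class PID:
    """Textbook PID controller; the error here would be the car's lateral
    offset from the fused lane centerline in meters."""

    def __init__(self, kp, ki, kd, output_limits=(-1.0, 1.0)):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.lo, self.hi = output_limits
        self._integral = 0.0
        self._prev_error = 0.0

    def step(self, error, dt):
        self._integral += error * dt
        derivative = (error - self._prev_error) / dt if dt > 0 else 0.0
        self._prev_error = error
        out = self.kp * error + self.ki * self._integral + self.kd * derivative
        return max(self.lo, min(self.hi, out))  # clamp to steering range

# Placeholder gains on a 50 Hz control loop.
steering_pid = PID(kp=0.35, ki=0.01, kd=0.08)
steering = steering_pid.step(error=-0.4, dt=0.02)  # e.g. drifting 0.4 m left
```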
The original demo is still available for reference:
Lane Keeping & Multi-Model Detection Demo (v1)
Watch both the raw model segmentation output and the multiple processed lanes on a highway video.
Extended Demo: Watch the full video here
Note: This is not the final integration of the YOLOP model in VisionPilot; it only serves as a demo of the model's capabilities and its use cases for VisionPilot.
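For context, the pretrained model can be pulled straight from the authors' repository via torch.hub; the entry-point name and the three-headed output below follow the hustvl/yolop project, but verify them against the version you install.

```python
import torch

# Pull the pretrained multi-task model from the authors' repo
# (downloads code and weights on first call).
model = torch.hub.load('hustvl/yolop', 'yolop', pretrained=True)
model.eval()

# One forward pass yields object detections plus drivable-area and
# lane-line segmentation maps.
img = torch.randn(1, 3, 640, 640)  # stand-in for a preprocessed frame
with torch.no_grad():
    det_out, da_seg_out, ll_seg_out = model(img)
```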
See real-time LiDAR point cloud streaming and autonomous vehicle telemetry in Foxglove Studio:
Extended Demo: Watch the full video here
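Streaming telemetry like this into Foxglove Studio can be done with the foxglove-websocket Python package. The sketch below follows that package's JSON-channel pattern; the topic name, schema, and 10 Hz loop are illustrative assumptions.

```python
import asyncio
import json
import time

from foxglove_websocket.server import FoxgloveServer

async def main():
    # Foxglove Studio connects to ws://localhost:8765 via "Open connection".
    async with FoxgloveServer("0.0.0.0", 8765, "VisionPilot telemetry") as server:
        chan_id = await server.add_channel({
            "topic": "/vehicle/telemetry",
            "encoding": "json",
            "schemaName": "Telemetry",
            "schema": json.dumps({
                "type": "object",
                "properties": {"speed_mps": {"type": "number"}},
            }),
        })
        while True:
            await asyncio.sleep(0.1)  # placeholder 10 Hz publish loop
            msg = {"speed_mps": 13.9}
            await server.send_message(chan_id, time.time_ns(),
                                      json.dumps(msg).encode("utf8"))

asyncio.run(main())
```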
See real-time image segmentation using front and rear cameras:
Extended Demo: Watch the full video here
More demo videos and visualizations will be added as features are completed.
The vehicle is equipped with a comprehensive multi-sensor suite for autonomous perception and control:
| Sensor | Specification | Purpose |
|---|---|---|
| Front Camera | 1920×1080 @ 50Hz, 70° FOV, depth enabled | Lane detection, traffic signs, traffic lights, object detection |
| LiDAR (Top) | 80 vertical lines, 360° horizontal, 120m range, 20Hz | Obstacle detection, 3D scene understanding |
| Front Radar | 200m range, 128×64 bins, 50Hz | Collision avoidance, adaptive cruise control |
| Rear Left & Right Radar | 30m range, 64×32 bins, 50Hz | Blind spot monitoring, rear object detection |
| Dual GPS | Front & rear positioning @ 50Hz | Localization |
| IMU | 100Hz update rate | Vehicle dynamics, pose estimation |
| Sensor Array | Front Radar | Lidar Visualization |
|---|---|---|
| ![]() | ![]() | ![]() |
Configuration files are located in the `/config` directory:
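As a rough illustration of how such a modular setup can be consumed, the sketch below loads one YAML module per subsystem; the file name `sensors.yaml` and its keys are hypothetical, not the actual `/config` schema.

```python
from pathlib import Path

import yaml  # PyYAML

CONFIG_DIR = Path("config")

def load_config(name):
    """Load one YAML module from /config, e.g. load_config('sensors')."""
    with open(CONFIG_DIR / f"{name}.yaml") as f:
        return yaml.safe_load(f)

# Hypothetical usage: each subsystem reads only the module it needs.
sensors_cfg = load_config("sensors")
front_cam = sensors_cfg["front_camera"]  # e.g. resolution, fov, update_hz
```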
Note: The microservices architecture is documented below as the intended design. Currently, for active development and rapid iteration, all perception models run locally in-process (bypassing Docker containers and the aggregator). This allows faster prototyping and validation of the complete pipeline. The containerized microservices will be re-integrated once the core perception, sensor fusion, and control systems are finalized and validated.
VisionPilot is designed to use a containerized microservices architecture where each perception task runs as an independent Flask service, orchestrated by a central Aggregator:
| Service | Port | Function | Model/Framework |
|---|---|---|---|
| Object Detection | 5777 | Vehicle, pedestrian, cyclist detection | YOLOv11 |
| Traffic Light Detection | 6777 | Traffic light detection & state classification | YOLOv11 |
| Sign Detection | 7777 | Traffic sign detection | YOLOv11 |
| Sign Classification | 8777 | Traffic sign type classification | CNN |
| YOLOP | 9777 | Unified: lanes + drivable area + objects | YOLOPX |
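Each service in the table follows the same pattern: a small Flask app that accepts a frame and returns model output as JSON. A minimal sketch of one such service is below; the `/process` endpoint, base64-JPEG payload, and `run_model` stub are assumptions for illustration.

```python
import base64

import cv2
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

def run_model(frame):
    """Stub standing in for actual inference (e.g. a YOLOv11 forward pass)."""
    return []  # e.g. [{"label": "car", "bbox": [x1, y1, x2, y2], "conf": 0.9}]

@app.route("/process", methods=["POST"])
def process():
    """Decode a base64-encoded JPEG frame, run the model, return JSON."""
    payload = request.get_json()
    jpeg = base64.b64decode(payload["image"])
    frame = cv2.imdecode(np.frombuffer(jpeg, np.uint8), cv2.IMREAD_COLOR)
    return jsonify({"detections": run_model(frame)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5777)  # Object Detection port from the table
```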
```
BeamNG Simulation Loop
          ↓
PerceptionClient.process_frame()
          ↓
Aggregator (concurrent orchestration)
  ├── Object Detection (5777)
  ├── Traffic Light (6777)
  ├── Sign Detection (7777)
  ├── Sign Classification (8777)
  └── YOLOP (9777)
          ↓
Merge all responses
          ↓
Return unified AggregationResult
          ↓
Extract individual results + visualize
```
- **Concurrency:** all services run in parallel via ThreadPoolExecutor (see the sketch below)
- **Modularity:** add/remove services without modifying BeamNG code
- **Scalability:** easy horizontal scaling with container orchestration
- **Fault Tolerance:** individual service failures don't break the pipeline
- **Reusability:** services can be used independently or together
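A minimal sketch of that concurrent fan-out, assuming each service exposes a hypothetical `/process` endpoint on the ports listed above:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

# Hypothetical registry; ports match the service table above.
SERVICES = {
    "objects": "http://localhost:5777/process",
    "traffic_lights": "http://localhost:6777/process",
    "signs": "http://localhost:7777/process",
    "sign_classes": "http://localhost:8777/process",
    "yolop": "http://localhost:9777/process",
}

def aggregate(frame_payload, timeout_s=0.5):
    """Fan one frame out to every service in parallel and merge the replies.

    A failed or slow service contributes None for its key instead of
    breaking the whole pipeline (fault tolerance).
    """
    def call(url):
        try:
            resp = requests.post(url, json=frame_payload, timeout=timeout_s)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            return None  # isolate individual service failures

    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = {name: pool.submit(call, url) for name, url in SERVICES.items()}
        return {name: fut.result() for name, fut in futures.items()}
```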
- Sign Classification & Detection (CNN / YOLO)
- Traffic Light Classification & Detection (CNN / YOLO)
- Lane Detection Fusion (YOLOP / CV)
- 🔥🔥 YOLOP integration
- Drivable area segmentation
- Lane detection (segmentation output)
- Object detection
- CV Lane Detection (Traditional Computer Vision)
- Integrate Majority Voting system for CV
- Lighting Condition Detection
- ✅ 🤏 Semantic Segmentation (already built, not yet integrated here)
- Panoptic segmentation (instance + semantic)
- Real-Time Object Detection (Cars, Trucks, Buses, Pedestrians, Cyclists)
- 🔥 Speed Estimation using camera and lidar detections
- Multiple Object Tracking (MOT)
- 🔥🔥 Handle dashed lines better in lane detection
- Road Marking Detection (Arrows, Crosswalks, Stop Lines)
- 🔥 3D Lidar Object Detection
- 🤏 Occluded Object Detection (detect objects that are partially blocked or not visible in the camera view using radar/lidar)
- Detect multiple lanes
- 🤏 Multi-Camera Setup (will implement after all other camera-based features are finished)
- 🤏 Overtaking, Merging (will be part of Path Planning)
- Kalman Filtering
- Extended Kalman Filter (EKF)
- Integrate Radar
- Integrate Lidar
- Integrate GPS
- Integrate IMU
- 🔥🔥 Ultrasonic Sensor Integration
- 🤏🤏 SLAM (Simultaneous Localization and Mapping)
- Build HD Map of the BeamNG.tech map
- Localize Vehicle on HD Map
- Integrate vehicle control (throttle, steering, and braking implemented; PID needs further tuning)
- Integrate PIDF controller
- ✅ Adaptive Cruise Control (currently only basic cruise control implemented)
- Automatic Emergency Braking (AEB)
- Obstacle Avoidance (steering away from obstacles instead of just braking)
- 🔥 Model Predictive Control (MPC): a more advanced control strategy that optimizes control inputs over a future time horizon
- Curve Speed Optimization (slow down for sharp curves based on lane curvature)
- Trajectory Prediction for surrounding vehicles
- 🔥 Blind Spot Monitoring (using left/right rear short-range radars)
- Traffic Rule Enforcement (Stop at red lights, stop signs, yield signs)
- Dynamic Target Speed based on Speed Limit Signs
- Global Path planning
- Local Path planning
- 🔥 Lane Change Logic
- Check Blindspots before lane change
- Signal Lane Change
- Parking Logic (Path finding / Parallel or Perpendicular)
- 🤏🤏 Advanced traffic participant prediction (trajectory, intent)
- Integrate and test in BeamNG.tech simulation
- Modularize and clean up BeamNG.tech pipeline
- Foggy weather conditions
- Traffic scenarios: driving in heavy, moderate, and light traffic
- Test all Systems in different lighting conditions (Day, Night, Dawn/Dusk, Tunnel)
- 🤏🤏 Test using an actual RC car
- ✅ Full Foxglove visualization integration (overhaul needed)
- Modular YAML configuration system
- Real-time drive logging and telemetry
- 🔥 Bird's-Eye View (BEV): top-down view of the vehicle and surroundings
- Real-time annotation overlay in Foxglove
- Show predicted trajectories in Foxglove
- Show Global and local path plans in Foxglove
- Live Map Visualization
Note: Considering moving away from Foxglove entirely to build a custom dashboard. Not a priority at this time.
- Containerize Models for easy deployment and scalability
- ✅ Microservices Architecture (Aggregator + individual services)
- Message Broker (Redis support in docker-compose)
- Docker Compose orchestration
- Aggregator service (concurrent service orchestration)
- Add detailed documentation (lane detection first)
- Add demo images and videos to README
- 🤏 Add performance benchmarks section
- Add Table of Contents for easier navigation
- Vibe-Code a website for the project
- Redo project structure for better modularity
A Driver Monitoring System would've been pretty cool, but human drivers are not implemented in BeamNG.tech or CARLA.
🔥 = High Priority
✅ = Complete but still being improved/tuned/changed (not the final version)
🤏 = Minimal Priority, can be addressed later
🤏🤏 = Very Low Priority, may not be implemented
Status: This project is currently in active development. A stable, production-ready release with pre-trained models and complete documentation will be available eventually.
- Tunnel/Low-Light Scenarios: Camera perception fails below certain lighting thresholds
- Multi-Camera Support: Single front-facing camera only (future roadmap)
- PID Controller Tuning: May oscillate on tight curves
- Real-World Testing: Only validated in simulation (BeamNG.tech), for now...
Datasets:
- CULane, LISA, GTSRB, Mapillary, BDD100K
Simulation & Tools:
- BeamNG.tech by BeamNG GmbH
- Foxglove Studio for visualization
- Docker & Docker Compose for containerization
Special Thanks:
- Kaggle for free GPU resources (model training)
- Mr. Pratt (teacher/supervisor) for guidance
Academic Papers & Research:
YOLOP/YOLOPX: Anchor-free multi-task learning network for panoptic driving perception
```
@article{YOLOPX2024,
  title={YOLOPX: Anchor-free multi-task learning network for panoptic driving perception},
  author={Zhan, Jiao and Luo, Yarong and Guo, Chi and Wu, Yejun and Liu, Jingnan},
  journal={Pattern Recognition},
  volume={148},
  pages={110152},
  year={2024}
}
```

If you use VisionPilot in your project, please cite:
```
@software{visionpilot2026,
  title={VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception},
  author={Julian Stamm},
  year={2026},
  url={https://github.com/visionpilot-project/VisionPilot}
}
```

BeamNG.tech simulator:

```
Title: BeamNG.tech
Author: BeamNG GmbH
Address: Bremen, Germany
Year: 2025
Version: 0.35.0.0
URL: https://www.beamng.tech/
```
This project is licensed under the MIT License - see LICENSE file for details.












