Intro - Game - Model used - How to run
This project implements deep reinforcement learning (deep Q-learning) in a custom-built highway-driving game environment. It is inspired by the DeepTraffic competition, with a few changes to the rules and features of the game (e.g. the agent driver cannot know the exact speed of surrounding cars). At each time step, the agent chooses one of four actions: speed up, slow down, turn left, or turn right. The agent's goal is to achieve the highest average speed while avoiding collisions with surrounding cars.
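As a rough illustration of the decision loop described above, the following sketch shows epsilon-greedy action selection over the four actions and the standard Q-learning target. The action ordering, the use of epsilon-greedy exploration, the discount factor `gamma`, and the function names are all assumptions made for illustration; the project's actual training code may differ.

```python
import random

# Four actions from the description; this ordering is an assumption.
ACTIONS = ["speed_up", "slow_down", "turn_left", "turn_right"]


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice (assumed strategy, not confirmed by the project).

    With probability epsilon pick a random action index,
    otherwise pick the index with the highest Q-value.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def td_target(reward, next_q_values, done, gamma=0.99):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a').

    For a terminal transition (e.g. a collision) the target is just r.
    gamma=0.99 is a hypothetical default.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

With `epsilon=0.0` the agent always acts greedily, which is roughly what one would expect when evaluating a trained model rather than training it.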
The game is built with the Python library PyGame and represents a multi-lane highway, with each lane divided into segments and populated by unpredictable surrounding cars. The main parameters of the game are road difficulty (i.e. how many cars there are), number of lanes, number of segments per lane (i.e. the discretization of the Q-learning model's input), and the range of the agent's vision (i.e. the input the Q-learning model receives).
The agent uses a deep Q-learning model with two hidden layers (64 and 128 units by default, respectively) with ReLU activations, followed by an output layer with 4 units (one per action). Pre-trained weights can be found in weights/q_learner_latest_weights.pt. The model's input (the state) is an array encoding the road segments visible to the agent (0 for a vacant segment, 1 for the agent's own position, 10 for a surrounding car, -10 for a position off the road), together with the agent's current speed.
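The state encoding described above could be sketched roughly as follows. The cell values (0, 1, 10, -10) come from the description; everything else is a hypothetical illustration: the function name `encode_state`, the `(lane, segment)` coordinate convention, the shape of the vision window (`side`, `ahead`, `behind`), and the choice to mark segments beyond either end of the road as off-road.

```python
# Cell values taken from the description above.
VACANT, AGENT, CAR, OFF_ROAD = 0, 1, 10, -10


def encode_state(cars, agent, num_lanes, num_segments,
                 side=1, ahead=2, behind=1, speed=0.0):
    """Flatten the agent's visible window into a list and append its speed.

    cars  -- set of (lane, segment) tuples occupied by surrounding cars
    agent -- (lane, segment) tuple for the agent
    side, ahead, behind -- hypothetical vision range parameters
    """
    agent_lane, agent_seg = agent
    state = []
    for lane in range(agent_lane - side, agent_lane + side + 1):
        for seg in range(agent_seg - behind, agent_seg + ahead + 1):
            if not (0 <= lane < num_lanes and 0 <= seg < num_segments):
                state.append(OFF_ROAD)   # outside the road (assumed for both axes)
            elif (lane, seg) == agent:
                state.append(AGENT)
            elif (lane, seg) in cars:
                state.append(CAR)
            else:
                state.append(VACANT)
    state.append(speed)                  # the agent's current speed is the last entry
    return state


# Agent in the leftmost lane, one car diagonally ahead in the next lane.
example = encode_state({(1, 3)}, (0, 2), num_lanes=3, num_segments=10,
                       side=1, ahead=1, behind=1, speed=5.0)
print(example)  # [-10, -10, -10, 0, 1, 0, 0, 0, 10, 5.0]
```

Under these assumptions the length of the state vector, and therefore the input size of the network, is the number of visible cells plus one for the speed.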
python run.py [-h] [--mode MODE_TO_RUN] [--episodes NUM_OF_EPISODES] [--difficulty DIFFICULTY_OF_ROAD] [--weights PATH_TO_WEIGHTS] [--cpu_only] [--slow] [--silent]
This script runs the highway agent with the Deep Q-learning model.
Optional arguments:
-h, --help show this help message and exit
--mode MODE_TO_RUN Mode of the script to run (Either 'train' or 'test'), default: 'train'
--episodes NUM_OF_EPISODES Number of road runs to run, default: 1000
--difficulty DIFFICULTY_OF_ROAD
Difficulty of the road (number of surrounding cars),
default: 2
--weights PATH_TO_WEIGHTS Path to pre-trained weights. When not specified, the
model learns from scratch.
--cpu_only Include for using only CPU
--slow Include for slowing down the road simulation
--silent Include for keeping the logs from the progress of agent's
road run silent
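For reference, an argument parser that produces a command-line interface like the one listed above might look like the sketch below. The flag names, metavars, and defaults are taken from the help text; the `int` types, the `choices` restriction on `--mode`, and the function name `build_parser` are assumptions, and the actual run.py may be structured differently.

```python
import argparse


def build_parser():
    """Rebuild the documented CLI (a sketch, not the project's actual code)."""
    parser = argparse.ArgumentParser(
        prog="run.py",
        description="This script runs the highway agent with the Deep Q-learning model.",
    )
    parser.add_argument("--mode", metavar="MODE_TO_RUN",
                        choices=["train", "test"], default="train",
                        help="Mode of the script to run (either 'train' or 'test')")
    parser.add_argument("--episodes", metavar="NUM_OF_EPISODES",
                        type=int, default=1000,   # int type is an assumption
                        help="Number of road runs to run")
    parser.add_argument("--difficulty", metavar="DIFFICULTY_OF_ROAD",
                        type=int, default=2,      # int type is an assumption
                        help="Difficulty of the road (number of surrounding cars)")
    parser.add_argument("--weights", metavar="PATH_TO_WEIGHTS", default=None,
                        help="Path to pre-trained weights")
    parser.add_argument("--cpu_only", action="store_true",
                        help="Use only the CPU")
    parser.add_argument("--slow", action="store_true",
                        help="Slow down the road simulation")
    parser.add_argument("--silent", action="store_true",
                        help="Silence the progress logs")
    return parser


# Parse an example invocation that evaluates the pre-trained weights.
args = build_parser().parse_args(
    ["--mode", "test", "--weights", "weights/q_learner_latest_weights.pt", "--silent"]
)
print(args.mode, args.episodes, args.weights)
```

Based on the options above, evaluating the pre-trained model would correspond to a command such as `python run.py --mode test --weights weights/q_learner_latest_weights.pt`.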

