This fork is an MIT-licensed version of YOLO, with some bug fixes and the addition of the V9-N (nano) and V9-E (extended) variants to the original (https://github.com/MultimediaTechLab/YOLO). This repository is already capable of achieving convergence speed and accuracy comparable to the stable GPLv3 implementation.
Welcome to the official implementation of YOLOv7, YOLOv9, and YOLO-RD. This repository contains the complete codebase, pre-trained models, and detailed instructions for training and deploying YOLOv9.
- This is the official YOLO model implementation with an MIT License.
- YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
- YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors
- YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary
To get started using YOLOv9's developer mode, we recommend cloning this repository and installing the required dependencies:
git clone https://github.com/PINTO0309/YOLO.git
cd YOLO
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
source .venv/bin/activate
export PYTHONWARNINGS="ignore"
For more customization details, please refer to HOWTO.
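To confirm that the environment resolved correctly, a quick sanity check is to import PyTorch inside the synced virtual environment (a minimal sketch; it assumes PyTorch is among the dependencies installed by `uv sync`):

```python
# Minimal environment check (assumes PyTorch is installed by `uv sync`).
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```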
data
└── wholebody34
├── train.pache # Cache file automatically generated when training starts
├── val.pache # Cache file automatically generated when training starts
├── images
│ ├── train
│ │ ├── 000000000036.jpg
│ │ ├── 000000000077.jpg
│ │ ├── 000000000110.jpg
│ │ ├── 000000000113.jpg
│ │ └── 000000000165.jpg
│ └── val
│ ├── 000000000241.jpg
│ ├── 000000000294.jpg
│ ├── 000000000308.jpg
│ ├── 000000000322.jpg
│ └── 000000000328.jpg
└── labels
├── train
│ ├── 000000000036.txt
│ ├── 000000000077.txt
│ ├── 000000000110.txt
│ ├── 000000000113.txt
│ └── 000000000165.txt
└── val
├── 000000000241.txt
├── 000000000294.txt
├── 000000000308.txt
├── 000000000322.txt
└── 000000000328.txt
- 000000000036.txt

| Item | Note |
|---|---|
| classId | Class ID |
| cx, cy | 0.0-1.0 normalized center coordinates |
| w, h | 0.0-1.0 normalized width and height |

classId cx cy w h
30 0.729688 0.959667 0.141042 0.080667
25 0.919385 0.974417 0.052521 0.051167
25 0.525000 0.680847 0.049167 0.071806
23 0.663813 0.657361 0.100125 0.105889
21 0.612667 0.519583 0.068542 0.068056
29 0.628292 0.896000 0.292500 0.082889
30 0.546063 0.957611 0.210792 0.084778
19 0.547917 0.417986 0.073125 0.037361
26 0.488281 0.653583 0.123104 0.151444
24 0.840208 0.778889 0.080417 0.092222
24 0.435312 0.790972 0.074375 0.089167
22 0.411469 0.557500 0.103313 0.112222
22 0.773646 0.546944 0.087708 0.110556
9 0.560417 0.366667 0.233333 0.266667
7 0.560417 0.366667 0.233333 0.266667
27 0.956385 0.970417 0.087229 0.055833
16 0.541667 0.370833 0.154167 0.197222
26 0.956385 0.970417 0.087229 0.055833
4 0.681458 0.621667 0.637083 0.756667
0 0.681458 0.621667 0.637083 0.756667
18 0.527188 0.373333 0.042917 0.047500
20 0.644792 0.370028 0.023125 0.036667
1 0.681458 0.621667 0.637083 0.756667
28 0.488281 0.653583 0.123104 0.151444
17 0.489687 0.370972 0.032917 0.020556
17 0.561875 0.350694 0.044583 0.019722
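If you want to sanity-check these annotations, the sketch below parses one label file and converts the normalized values back to pixel coordinates (a minimal illustration; the file path and image size are placeholders, not values used by this repository):

```python
# Minimal sketch: parse one YOLO-format label file and convert the normalized
# cx, cy, w, h values to pixel-space (x1, y1, x2, y2) boxes.
# The file path and image size below are placeholders for illustration only.
from pathlib import Path


def load_labels(txt_path: str, img_w: int, img_h: int):
    boxes = []
    for line in Path(txt_path).read_text().splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id, cx, cy, w, h = parts
        cx, w = float(cx) * img_w, float(w) * img_w
        cy, h = float(cy) * img_h, float(h) * img_h
        boxes.append((int(class_id), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


# Hypothetical usage; the image size must be read from the corresponding image file.
# print(load_labels("data/wholebody34/labels/train/000000000036.txt", 640, 427))
```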
yolo/config/dataset/wholebody34.yaml
path: data/wholebody34
train: train
validation: val
class_num: 34
class_list: ['body', 'adult', 'child', 'male', 'female', 'body_with_wheelchair', 'body_with_crutches', 'head', 'front', 'right-front', 'right-side', 'right-back', 'back', 'left-back', 'left-side', 'left-front', 'face', 'eye', 'nose', 'mouth', 'ear', 'collarbone', 'shoulder', 'solar_plexus', 'elbow', 'wrist', 'hand', 'hand_left', 'hand_right', 'abdomen', 'hip_joint', 'knee', 'ankle', 'foot']
auto_download:
To train YOLO on your machine/dataset:
- Modify the configuration file yolo/config/dataset/**.yaml to point to your dataset.
- Run the training script:
uv run python yolo/lazy.py task=train dataset=** use_wandb=True
uv run python yolo/lazy.py task=train task.data.batch_size=8 model=v9-c weight=False # or more args
To perform transfer learning with YOLOv9:
configs
- https://github.com/PINTO0309/YOLO/blob/wholebody/yolo/config/config.yaml
- https://github.com/PINTO0309/YOLO/blob/wholebody/yolo/config/general.yaml
- https://github.com/PINTO0309/YOLO/blob/wholebody/yolo/config/task/train.yaml
- https://github.com/PINTO0309/YOLO/blob/wholebody/yolo/config/task/validation.yaml
- https://github.com/PINTO0309/YOLO/tree/wholebody/yolo/config/model
# n, t, s, c, e
VARIANT=n
EPOCH=100
BATCHSIZE=8
uv run python yolo/lazy.py \
task=train \
name=v9-${VARIANT} \
task.epoch=${EPOCH} \
task.data.batch_size=${BATCHSIZE} \
model=v9-${VARIANT} \
dataset=wholebody34 \
device=cuda \
use_wandb=False \
use_tensorboard=True
# When specifying trained weights as initial weights
uv run python yolo/lazy.py \
task=train \
name=v9-${VARIANT} \
task.epoch=${EPOCH} \
task.data.batch_size=${BATCHSIZE} \
model=v9-${VARIANT} \
weight="runs/train/v9-n/lightning_logs/version_1/checkpoints/best_n_0002_0.0065.pt" \
dataset=wholebody34 \
device=cuda \
use_wandb=False \
use_tensorboard=True
# Automatically downloading the initial weights published by the official repository
# Default: weight=True
# Weight download path: weights/*.pt
uv run python yolo/lazy.py \
task=train \
name=v9-${VARIANT} \
task.epoch=${EPOCH} \
task.data.batch_size=${BATCHSIZE} \
model=v9-${VARIANT} \
weight=True \
dataset=wholebody34 \
device=cuda \
use_wandb=False \
use_tensorboard=True
# When starting training without initial weights
# Default: weight=True
uv run python yolo/lazy.py \
task=train \
name=v9-${VARIANT} \
task.epoch=${EPOCH} \
task.data.batch_size=${BATCHSIZE} \
model=v9-${VARIANT} \
weight=False \
dataset=wholebody34 \
device=cuda \
use_wandb=False \
use_tensorboard=True
# Resume learning from where you left off
# Please note that you must specify the Lightning checkpoint file (.ckpt)
# and not the .pt file that contains only the EMA weights.
# Unlike the official implementation, all parameters are restored from the .ckpt file,
# so training resumes exactly where it left off.
uv run python yolo/lazy.py \
task=train \
name=v9-${VARIANT} \
task.epoch=${EPOCH} \
task.data.batch_size=${BATCHSIZE} \
model=v9-${VARIANT} \
task.resume_ckpt="runs/train/v9-n/lightning_logs/version_3/checkpoints/epoch_5_step_3660.ckpt" \
dataset=wholebody34 \
device=cuda \
use_wandb=False \
use_tensorboard=True
# To run a shorter fine-tuning schedule, use the dedicated configuration
# at `yolo/config/task/trainft.yaml`
# All CLI overrides available for `task=train` (e.g., `task.data.batch_size`,
# `task.resume_ckpt`) also apply to `task=trainft`.
VARIANT=n
EPOCH=60
BATCHSIZE=8
uv run python yolo/lazy.py \
task=trainft \
name=v9-${VARIANT} \
task.epoch=${EPOCH} \
task.data.batch_size=${BATCHSIZE} \
model=v9-${VARIANT} \
weight="runs/train/v9-n/lightning_logs/version_1/checkpoints/best_n_0002_0.0065.pt" \
dataset=wholebody34 \
device=cuda \
use_wandb=False \
use_tensorboard=True
# # DDP (Distributed data parallel training), Multi-GPU training
# # Below is a sample for 8 GPUs
# # n, t, s, c, e
# VARIANT=n
# EPOCH=100
# # Number of GPUs running on one node
# NPROC=8
# # When NPROC=8, the string [0,1,2,3,4,5,6,7] is set to DEVICES.
# DEVICES="[$(seq -s, 0 $((NPROC-1)))]"
# # When there are 8 GPUs and 8 batches are assigned to each GPU
# # {Batch size per GPU} x {Number of GPUs} = {Total batch size}
# # 8 x 8 = 64
# BATCHSIZE=8
# TOTALBATCHSIZE=$((BATCHSIZE * NPROC))
# uv run torchrun \
# --nproc_per_node=${NPROC} \
# yolo/lazy.py \
# task=train \
# device=${DEVICES} \
# name=v9-${VARIANT} \
# task.epoch=${EPOCH} \
# task.data.batch_size=${TOTALBATCHSIZE} \
# task.data.cpu_num=$((TOTALBATCHSIZE / NPROC)) \
# model=v9-${VARIANT} \
# weight=False \
# dataset=wholebody34 \
# use_wandb=False \
# use_tensorboard=True
↓↓↓ Experimental implementation. Not recommended as accuracy is significantly reduced. ↓↓↓
# Online Knowledge Distillation (Teacher E → Student {C,S,T,N})
# Default: task.kd.enable=False
# ./ARCHITECTURE_ENHANCED_YOLOv9.md#8-online-knowledge-distillation-teacher-e--student-cstn
# ./yolo/config/task/train.yaml
uv run python yolo/lazy.py \
task=train \
name=v9-${VARIANT} \
task.epoch=${EPOCH} \
task.data.batch_size=${BATCHSIZE} \
model=v9-${VARIANT} \
weight=False \
task.kd.enable=True \
task.kd.teacher_model=v9-e \
task.kd.teacher_weight=weights/v9-e.pt \
task.kd.apply_to=both \
dataset=wholebody34 \
device=cuda \
use_wandb=False \
use_tensorboard=True
↑↑↑ Experimental implementation. Not recommended as accuracy is significantly reduced. ↑↑↑
Pay particular attention to the maximum number of CPU threads and the amount of RAM on the machine you are training on. I mean RAM, not VRAM. The number of worker processes started during training is batch_size + 1, so you must set batch_size to less than the maximum number of CPU threads - 1. The amount of RAM consumed also increases in proportion to the number of enabled augmentations, so you need to pay attention to the amount of RAM installed in your PC; checking only the amount of VRAM is not enough. If you need to run heavy augmentation that would exceed the RAM capacity, we recommend setting batch_size to a relatively small value.
The figure below shows the CPU and RAM status of my work PC. When I run 16 batches with the maximum number of augmentations enabled, 17 worker processes are started, which consumes so much RAM that the learning process silently aborts after a few epochs without outputting any errors.
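As a rough pre-flight check before picking batch_size, you can compare the planned worker count against the machine's CPU threads and physical RAM (a minimal Linux-oriented sketch based on the batch_size + 1 worker rule described above; the headroom margin is only a suggestion):

```python
# Rough pre-flight check (Linux): training starts batch_size + 1 worker
# processes, so batch_size should stay below (CPU threads - 1).
import os

cpu_threads = os.cpu_count() or 1
suggested_max_batch = max(1, cpu_threads - 2)  # leave one thread of headroom

total_ram_gib = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / (1024 ** 3)

print(f"CPU threads: {cpu_threads} -> keep batch_size <= {suggested_max_batch}")
print(f"Physical RAM: {total_ram_gib:.1f} GiB (watch this, not just VRAM)")
```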
Countermeasure for situations where resume is unstable and CUDA initialization errors occur: https://discuss.pytorch.org/t/dataloader-num-workers-1-cuda-initialization-error-3/159989
In some environments, specifying the following mp.set_start_method causes the process to silently terminate before training begins. Therefore, if training does not start normally in your environment, it may be a good idea to comment out the mp.set_start_method line.
yolo/lazy.py
if __name__ == "__main__":
    # Countermeasure for situations where resume is unstable and CUDA initialization error occurs
    # https://discuss.pytorch.org/t/dataloader-num-workers-1-cuda-initialization-error-3/159989
    # If the following `mp.set_start_method` is specified, there are some environments where
    # the process will silently terminate before learning begins.
    # Therefore, if you are in an environment where learning does not start normally,
    # it may be a good idea to comment out the following line: `mp.set_start_method`.
    # mp.set_start_method("spawn", force=True)  # <--- Here
    main()
To speed up training and significantly reduce VRAM consumption during training, validation is limited to a simple, minimal evaluation per epoch. Therefore, validation results for epochs other than the final one do not properly evaluate the model's true performance, but they do confirm that training is progressing normally, that accuracy is not deteriorating significantly, and that overfitting is not occurring. The true performance of the model can only be confirmed by the rigorous validation performed at the final epoch. This means that the per-epoch spot validation results do not perfectly track the true improvement of the weights as training progresses. It would be unwise to perform early stopping based solely on the validation status of each epoch. Above all, you should not use an insufficient dataset that leads to overfitting.
The final epoch performs fairly accurate validation, so it may take several minutes or more depending on the volume of your dataset.
- NMS settings for validation at each stage of training

| Setting | Intermediate Epoch | Final Epoch |
|---|---|---|
| pre_topk | 300 | 20,000 |
| max_bbox | 300 | 20,000 |
| multi_label | False | True |
| class_agnostic | False | False |
If you want to display the AP for each class for every epoch, set print_map_per_class: True in yolo/config/task/validation.yaml and start training. If print_map_per_class: False is set, AP per class is calculated and output only once, at the end of the final epoch. Since print_map_per_class takes a very long time to process, we recommend leaving it set to False so that map_per_class is calculated automatically only in the final epoch.
┏━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━┓
┃Epoch┃Avg. Precision ┃ %┃Avg. Recall ┃ %┃
┡━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━┩
│ 2│AP @ .5:.95 │000.77╎AR maxDets 1 │003.08│
│ 2│AP @ .5 │002.02╎AR maxDets 10 │006.91│
│ 2│AP @ .75 │000.45╎AR maxDets 100 │008.74│
│ 2│AP (small) │000.33╎AR (small) │001.93│
│ 2│AP (medium) │000.69╎AR (medium) │007.74│
│ 2│AP (large) │001.34╎AR (large) │008.55│
└─────┴────────────────┴──────┴────────────────┴──────┘
┏━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ ID┃Name ┃ AP┃ ID┃Name ┃ AP┃
┡━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ 0│body │ 0.0343│ 20│ear │ 0.0023│
│ 1│adult │ 0.0320│ 21│collarbone │ 0.0003│
│ 2│child │ 0.0000│ 22│shoulder │ 0.0033│
│ 3│male │ 0.0268│ 23│solar_plexus │ 0.0003│
│ 4│female │ 0.0103│ 24│elbow │ 0.0001│
│ 5│body_with_wheelchair │ 0.0029│ 25│wrist │ 0.0001│
│ 6│body_with_crutches │ 0.0455│ 26│hand │ 0.0029│
│ 7│head │ 0.0340│ 27│hand_left │ 0.0022│
│ 8│front │ 0.0102│ 28│hand_right │ 0.0027│
│ 9│right-front │ 0.0155│ 29│abdomen │ 0.0005│
│ 10│right-side │ 0.0059│ 30│hip_joint │ 0.0006│
│ 11│right-back │ 0.0023│ 31│knee │ 0.0010│
│ 12│back │ 0.0001│ 32│ankle │ 0.0012│
│ 13│left-back │ 0.0015│ 33│foot │ 0.0063│
│ 14│left-side │ 0.0025│ │ │ │
│ 15│left-front │ 0.0105│ │ │ │
│ 16│face │ 0.0047│ │ │ │
│ 17│eye │ 0.0000│ │ │ │
│ 18│nose │ 0.0000│ │ │ │
│ 19│mouth │ 0.0000│ │ │ │
└───┴─────────────────────────┴───────┴───┴─────────────────────────┴───────┘
The weights after training are output to the following path.
| File | Note |
|---|---|
| `best_{variant}_{epoch:04}_{map:.4f}.pt` | Optimized weight file containing only EMA weights. The weights with the highest mAP are automatically saved. |
| `epoch_{epoch}_step_{step}.ckpt` | A checkpoint file containing all learning logs, automatically saved by Lightning. |
| `last.pt` | Optimized weight file containing only EMA weights. The weights of the last epoch are automatically saved. |
e.g.
runs/train/v9-n/lightning_logs/version_0/checkpoints
├── best_n_0002_0.0065.pt
├── epoch_2_step_3462.ckpt
└── last.pt
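If you want to confirm what each file contains, the sketch below simply loads them with PyTorch and prints the top-level structure (the paths are the example run shown above; the exact key layout is an assumption to verify against your own checkpoints):

```python
# Minimal sketch: peek inside the saved files. Paths are the example run above.
import torch

# Lightning checkpoint: a dict that also carries optimizer/trainer state.
ckpt = torch.load(
    "runs/train/v9-n/lightning_logs/version_0/checkpoints/epoch_2_step_3462.ckpt",
    map_location="cpu",
    weights_only=False,
)
print(sorted(ckpt.keys()))

# EMA-only weight file: smaller, used as `weight=` for fine-tuning and inference.
ema = torch.load(
    "runs/train/v9-n/lightning_logs/version_0/checkpoints/best_n_0002_0.0065.pt",
    map_location="cpu",
    weights_only=False,
)
print(type(ema), list(ema.keys())[:5] if isinstance(ema, dict) else "")
```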
To use a model for object detection, use:
# n, t, s, c, e
VARIANT=n
RENDER_LABELS=False
# If you do not specify `dataset={dataset_name}` correctly,
# the classification head weights will not be loaded properly
# and you will not see any inference results.
# The number of classes in the head part of the weights used for inference
# must match `class_num`.
# https://github.com/PINTO0309/YOLO/blob/wholebody/yolo/config/dataset/wholebody34.yaml
---
path: data/wholebody34
train: train
validation: val
class_num: 34 # <--- Here
class_list: ['body', ..., 'foot']
---
uv run python yolo/lazy.py \
task=inference \
name=v9-${VARIANT} \
model=v9-${VARIANT} \
weight="runs/train/v9-n/lightning_logs/version_1/checkpoints/best_n_0002_0.0065.pt" \
dataset=wholebody34 \
task.nms.min_confidence=0.1 \
task.fast_inference=onnx \
task.data.source=data/wholebody34/images/val \
task.data.max_samples=100 \
task.render_labels=${RENDER_LABELS} \
+quite=True
To validate model performance, or generate a json file in COCO format:
# n, t, s, c, e
VARIANT=n
# Specify the same `batch_size` as the validation batch size used during training.
# Otherwise, the mAP value after validation will be significantly degraded.
# data:
# batch_size: 32
# https://github.com/PINTO0309/YOLO/blob/wholebody/yolo/config/task/validation.yaml
BATCHSIZE=32
# The higher the model's performance, the more accurate the evaluation will be
# if the MAXDET value (the upper limit of the number of detections) is set to
# a larger value. The default value is 1,000. yolo/config/task/validation.yaml
# However, setting a value that exceeds the maximum number of labels contained
# in one image will have no effect. For example, in my dataset, an image contains
# a maximum of 3,875 labels, so setting it to 4,000 is appropriate.
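# A rough way to check that upper limit on your own dataset is to take the
# maximum number of lines across the YOLO-format label files, e.g.:
#   find data/wholebody34/labels -name '*.txt' -exec wc -l {} \; | sort -n | tail -n 1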
MAXDET=20000
uv run python yolo/lazy.py \
task=validation \
name=v9-${VARIANT} \
task.data.batch_size=${BATCHSIZE} \
task.nms.pre_topk=${MAXDET} \
task.nms.max_bbox=${MAXDET} \
task.nms.multi_label=True \
task.nms.class_agnostic=False \
model=v9-${VARIANT} \
weight="runs/train/v9-n/lightning_logs/version_1/checkpoints/best_n_0002_0.0065.pt" \
dataset=wholebody34 \
device=cuda \
use_wandb=False
Use the Hydra-driven CLI to run the export task and produce a compact ONNX graph. The exporter emits a
single [batches, 4 + num_classes, boxes] tensor, keeps detection heads minimal, and derives an informative
filename (e.g. best_e_0060_0.6585_1x3x480x640.onnx). Example:
uv run python yolo/lazy.py \
task=export \
name=v9-demo \
model=v9-e \
dataset=wholebody34 \
weight="runs/trainft/v9-e/lightning_logs/version_ft0/checkpoints/best_e_0060_0.6585.pt" \
task.dynamic_batch=False \
task.dynamic_size=False \
task.image_size=480x640 \
task.batch_size=1 \
task.opset=13 \
task.half=false \
task.apply_sigmoid=True \
task.include_metadata=True
Key overrides (all optional):
- `task.batch_size`: dummy input batch size (default 1).
- `task.dynamic_batch`: `true` marks batch as symbolic `N` and names the file accordingly.
- `task.dynamic_size`: `true` marks Height and Width as symbolic `H`, `W` and names the file accordingly.
- `task.image_size`: input resolution. Accepts `'HxW'`.
- `task.opset`: ONNX opset version (default 13).
- `task.simplify`: run `onnxsim` for graph simplification.
- `task.half`: export weights/activations in FP16.
- `task.apply_sigmoid`: emit post-sigmoid class probabilities instead of raw logits.
- `task.include_metadata`: embed class names in ONNX metadata.
- `task.output_path`: explicit destination; omit to auto-name beside the weight file.
- `task.name`: experiment/run folder label (standard Hydra behaviour).
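To consume the exported graph, the sketch below runs it with onnxruntime and splits the single output tensor (a minimal illustration that assumes the `[batches, 4 + num_classes, boxes]` layout described above, with the first four channels holding box coordinates; the file name, input size, and preprocessing are placeholders):

```python
# Minimal sketch: run the exported ONNX model with onnxruntime and split the
# single [batches, 4 + num_classes, boxes] output described above.
# Assumption: the first 4 channels are box coordinates; the file name, input
# size, and preprocessing are placeholders only. NMS still has to be applied.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "best_e_0060_0.6585_1x3x480x640.onnx", providers=["CPUExecutionProvider"]
)
input_name = session.get_inputs()[0].name

dummy = np.random.rand(1, 3, 480, 640).astype(np.float32)  # replace with a real preprocessed image
(pred,) = session.run(None, {input_name: dummy})            # [1, 4 + num_classes, boxes]

boxes = pred[0, :4, :]        # box channels, one column per candidate
scores = pred[0, 4:, :]       # class scores (post-sigmoid if task.apply_sigmoid=True)
class_ids = scores.argmax(axis=0)
confidences = scores.max(axis=0)
keep = confidences > 0.25     # simple confidence threshold for illustration
print(boxes[:, keep].shape, class_ids[keep][:5], confidences[keep][:5])
```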
If you want to use WebGPU, you can use ONNX models without NMS or TensorFlow.js models without NMS. If you don't want to go through ONNX, you can output the LiteRT model directly from PyTorch using ai_edge_torch.
- ONNX to TF/LiteRT
# Transformation with `Grouped Convolution` disabled
uv run onnx2tf -i yolov9_n_wholebody25_post_0100_1x3x480x640.onnx -dgc
- TF to TFJS
uv run tensorflowjs_converter \
  --input_format tf_saved_model \
  --output_format tfjs_graph_model \
  saved_model \
  tfjs_model
# Install CUDA==12.9
# https://developer.nvidia.com/cuda-toolkit-archive
# Install TensorRT==10.13.3.9-1+cuda12.9
# https://docs.nvidia.com/deeplearning/tensorrt/latest/installing-tensorrt/installing.html
uv run sit4onnx -if best_e_0205_0.4140_1x3x640x640.onnx -oep cpu
INFO: file: best_e_0205_0.4140_1x3x640x640.onnx
INFO: providers: ['CPUExecutionProvider']
INFO: input_name.1: images shape: [1, 3, 640, 640] dtype: float32
INFO: test_loop_count: 10
INFO: total elapsed time: 3673.502206802368 ms
INFO: avg elapsed time per pred: 367.3502206802368 ms
INFO: output_name.1: output shape: [1, 38, 8400] dtype: float32
uv run sit4onnx -if best_e_0205_0.4140_1x3x640x640.onnx -oep cuda
INFO: file: best_e_0205_0.4140_1x3x640x640.onnx
INFO: providers: ['CUDAExecutionProvider', 'CPUExecutionProvider']
INFO: input_name.1: images shape: [1, 3, 640, 640] dtype: float32
INFO: test_loop_count: 10
INFO: total elapsed time: 350.10218620300293 ms
INFO: avg elapsed time per pred: 35.01021862030029 ms
INFO: output_name.1: output shape: [1, 38, 8400] dtype: float32
# It will take a while to generate the TensorrtExecutionProvider_TRTKernel_*.engine cache.
uv run sit4onnx -if best_e_0205_0.4140_1x3x640x640.onnx -oep tensorrt
INFO: file: best_e_0205_0.4140_1x3x640x640.onnx
INFO: providers: ['TensorrtExecutionProvider', 'CPUExecutionProvider']
INFO: input_name.1: images shape: [1, 3, 640, 640] dtype: float32
INFO: test_loop_count: 10
INFO: total elapsed time: 104.28452491760254 ms
INFO: avg elapsed time per pred: 10.428452491760254 ms
INFO: output_name.1: output shape: [1, 38, 8400] dtype: float32
# With NMS + TensorRT
# For models with dynamic tensors as input, specify the size of the tensor
# to be tested using the --fixed_shapes / -fs option.
uv run sit4onnx -if yolov9_n_wholebody25_post_0100_1x3xHxW.onnx -oep tensorrt -fs 1 3 480 640
INFO: file: yolov9_n_wholebody25_post_0100_1x3x480x640.onnx
INFO: providers: ['TensorrtExecutionProvider', 'CPUExecutionProvider']
INFO: input_name.1: input_bgr shape: [1, 3, 480, 640] dtype: float32
INFO: test_loop_count: 10
INFO: total elapsed time: 20.3857421875 ms
INFO: avg elapsed time per pred: 2.03857421875 ms
INFO: output_name.1: batchno_classid_score_x1y1x2y2 shape: [0, 7] dtype: float32
Contributions to the YOLO project are welcome! See CONTRIBUTING for guidelines on how to contribute.
@inproceedings{wang2022yolov7,
title={{YOLOv7}: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors},
author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
year={2023},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
}
@inproceedings{wang2024yolov9,
title={{YOLOv9}: Learning What You Want to Learn Using Programmable Gradient Information},
author={Wang, Chien-Yao and Yeh, I-Hau and Liao, Hong-Yuan Mark},
year={2024},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
}
@inproceedings{tsui2024yolord,
author={Tsui, Hao-Tang and Wang, Chien-Yao and Liao, Hong-Yuan Mark},
title={{YOLO-RD}: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary},
booktitle={Proceedings of the International Conference on Learning Representations (ICLR)},
year={2025},
}