KNN - Beetle tracking

Authors:

Dalibor Kříčka (xkrick01)
Jakub Pekárek (xpekar19)
Pavel Osinek (xosine00)

2025, Brno

Datasets and trained YOLO11 detection models: Google Drive (login with a BUT Google account required)

An example of beetle tracking in a video using a fine-tuned YOLO11 detection model and the BoT-Sort tracker:

beetle-tracking-example.mp4

Task Description

This project aims to develop a tool for tracking beetles in video recordings (several minutes long). The outcome should be the recorded paths of individual beetles, i.e., their movement across the scene during the recording. The relevant problems for this project are:

Data annotation – creation of a custom dataset;
Object detection – fine-tuning a pre-trained YOLO11m model;
Multi-Object Tracking (MOT) – utilizing the BoT-Sort algorithm.

The input is a video recording with a resolution of 1920 × 1304 and a frame rate of 15 FPS. The output is a machine-readable file containing the individual beetle tracks (for example, JSON).
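The output format is not fixed by the task, so the following is only a minimal sketch of what a JSON track file could look like. The function name `tracks_to_json` and the field names (`id`, `points`, `frame`, `time_s`, `x`, `y`) are assumptions made for illustration, not the project's actual schema; only the 15 FPS frame rate comes from the task description.

```python
import json


def tracks_to_json(paths, fps=15.0):
    """Serialize per-beetle paths to a JSON string.

    `paths` maps a track ID to a list of (frame_index, x, y) points.
    The field names used here are illustrative, not a fixed schema.
    """
    tracks = []
    for track_id in sorted(paths):
        points = [
            # Convert the frame index to seconds using the video frame rate.
            {"frame": f, "time_s": round(f / fps, 3), "x": x, "y": y}
            for f, x, y in paths[track_id]
        ]
        tracks.append({"id": track_id, "points": points})
    return json.dumps({"fps": fps, "tracks": tracks}, indent=2)


# Hypothetical path of one beetle observed in frames 0 and 15.
example = {7: [(0, 100.0, 200.0), (15, 104.5, 198.0)]}
print(tracks_to_json(example))
```

Storing the frame index alongside the time in seconds keeps the file usable even if the recording frame rate changes.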

Dataset

To speed up dataset creation, annotation was done in two phases. In the first phase, dark objects on a light background were detected with a simple method based on grayscale thresholding and contour detection in OpenCV. This automatically annotated some of the individuals in each frame; the remaining objects were annotated manually. In the second phase, a YOLO model trained on this smaller initial dataset was used for automatic annotation. It labeled a significant portion of the objects, which reduced the manual work and made it possible to add more annotated frames to the dataset (a form of active learning).

Dataset information:

Data source: 8 video recordings of 1000 frames each, with varying beetle density per frame (see OneDrive);
Storage with individual versions of the datasets (higher number = newer) – Google Drive (you need to be logged in with a BUT Google account);
Number of annotated frames –
- Training set: 79 (number of objects: 14,319)
- Validation set: 20 (number of objects: 3,357)
Total number of annotated objects: 17,676;
Average number of objects per frame: 178.55;
Minimum number of objects per frame: 63;
Maximum number of objects per frame: 334;
Image resolution: 640 × 640;
Dataset and annotation file format: Ultralytics YOLO format;
Histogram of the number of annotated beetles per frame: see Img. 1.

Img. 1: Histogram of the number of annotated beetles per frame.

Evaluation

YOLO11

The evaluation is performed by comparing the metrics of the fine-tuned models. The best fine-tuned model so far has the following metrics:

AP@0.5 = 0.983
- Area under the precision-recall (PR) curve at an IoU threshold of 0.5, see Img. 2;
Recall = 0.957
- The proportion of beetles that were correctly detected out of all the beetles present in the image;
- $recall = \frac{TP}{TP+FN}$
Precision = 0.965
- The proportion of correctly detected beetles out of all the objects detected;
- $precision = \frac{TP}{TP+FP}$

where:

TP - True positives,
FP - False positives,
TN - True negatives,
FN - False negatives
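As a quick illustration of the two formulas, the sketch below computes precision and recall from raw counts. The counts in the example are made up to show the arithmetic and are not results from this project.

```python
def precision_recall(tp, fp, fn):
    """Compute (precision, recall) from true positive, false positive and false negative counts."""
    # Guard against division by zero when nothing was detected or nothing was present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical counts for a single frame (not project results).
p, r = precision_recall(tp=170, fp=6, fn=8)
print(f"precision={p:.3f} recall={r:.3f}")
```

True negatives do not appear in either formula, which suits object detection, where the number of "correctly undetected" background regions is not well defined.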

Img. 2: Precision-Recall Curve.

The goal is to bring these metrics as close to 1 as possible. The progression of the metric values and loss functions during training is shown in Img. 3.

Img. 3: Metric values and loss functions during training.

BoT-Sort

Tracking metrics (such as MOTA and IDF1) have not yet been evaluated in the current solution. Evaluating them requires ground truth, which means labeling all animals in a sequence of frames (several hundred, because the beetles move slowly) while keeping a unique identifier for each animal across all frames. We have not yet found an efficient way to do this, and it is a subject for further consultation.
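For reference, once ground truth exists, the two metrics named above reduce to simple ratios of counts summed over the whole sequence. The sketch below uses the standard definitions (MOTA from the CLEAR MOT metrics, IDF1 from identity-level matches); the counts in the example are hypothetical and only show the arithmetic.

```python
def mota(false_negatives, false_positives, id_switches, gt_objects):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over every frame."""
    if gt_objects <= 0:
        raise ValueError("gt_objects must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / gt_objects


def idf1(idtp, idfp, idfn):
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), using identity-level matches."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Hypothetical counts, purely illustrative (no tracking evaluation exists yet).
print(f"MOTA = {mota(false_negatives=40, false_positives=25, id_switches=5, gt_objects=1000):.3f}")
print(f"IDF1 = {idf1(idtp=900, idfp=60, idfn=100):.3f}")
```

MOTA is dominated by detection errors, while IDF1 rewards keeping the same identifier on the same animal. With slow-moving, visually similar beetles, IDF1 is likely the more informative of the two.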

The preliminary tracking results are the recorded paths of individual beetles, obtained with our fine-tuned detection model (see the video above).
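A minimal sketch of how such paths could be collected with the Ultralytics tracking API is shown below. It is an assumption about the workflow, not the project's script: the function names are hypothetical, the weights and video paths are supplied by the caller, and `botsort.yaml` is the tracker configuration name that ships with Ultralytics.

```python
from collections import defaultdict


def add_frame(paths, frame_idx, track_ids, boxes_xywh):
    """Append the centre of each tracked box to the path of its track ID."""
    for track_id, (cx, cy, _w, _h) in zip(track_ids, boxes_xywh):
        paths[track_id].append((frame_idx, cx, cy))


def track_video(weights, video, tracker="botsort.yaml"):
    """Run detection with BoT-Sort tracking and return {track_id: [(frame, x, y), ...]}."""
    # Imported here so add_frame() stays usable without Ultralytics installed.
    from ultralytics import YOLO

    paths = defaultdict(list)
    model = YOLO(weights)
    # stream=True yields one result per frame instead of keeping them all in memory.
    results = model.track(source=video, tracker=tracker, stream=True)
    for frame_idx, result in enumerate(results):
        if result.boxes.id is None:  # no tracked objects in this frame
            continue
        add_frame(
            paths,
            frame_idx,
            result.boxes.id.int().tolist(),
            result.boxes.xywh.tolist(),  # (x_center, y_center, width, height)
        )
    return dict(paths)
```

Calling `track_video` with the path to the fine-tuned weights and a recording would return one list of centre points per beetle, which can then be drawn over the video or exported.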

Baseline Solution

The best model mentioned in the Evaluation section was fine-tuned from the pre-trained YOLO11m model.

The parameters used for fine-tuning were:

The image size was set to 640;
The number of epochs was set to 100. Judging by the training graphs (Img. 3), fewer epochs might achieve similar results;
The batch size was set to 8 due to memory constraints;
The dataset used was Dataset7, see the Dataset section.
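With the Ultralytics Python API, a fine-tuning run with these settings could be started roughly as follows. This is a sketch rather than the project's training script: the helper name `finetune` and the dataset YAML path passed to it are hypothetical, and all arguments other than the three listed above keep their library defaults.

```python
# Settings from the parameter list; every other training argument keeps its default.
TRAIN_ARGS = {"imgsz": 640, "epochs": 100, "batch": 8}


def finetune(data_yaml, weights="yolo11m.pt"):
    """Fine-tune a pre-trained YOLO11m checkpoint on a YOLO-format dataset."""
    # Imported lazily so the settings can be inspected without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # fetches the pre-trained weights if they are not cached
    return model.train(data=data_yaml, **TRAIN_ARGS)
```

`data_yaml` should point to the dataset description file (image directories and class names) of the chosen dataset version.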
