Description
OpenVINO Version
2025.2.0
Operating System
Other (Please specify in description)
Device used for inference
CPU
Framework
PyTorch
Model used
yolo11n
Issue description
This is a bit of a weird issue: it only happens on a Raspberry Pi 5. I tried to reproduce it on several systems, including a Raspberry Pi Zero 2W, but it only occurs on the Pi 5.
Running CPU inference in a loop increases the process's used memory (RSS) until RAM fills up completely; the program then slows down and eventually halts.
I eventually found that I can get the inference model to work reliably on the Pi 5 when I set
`ie.set_property("CPU", {"INFERENCE_NUM_THREADS": 1, "NUM_STREAMS": 1})`
But without that setting, OpenVINO inference crashed my Pi 5 after around 5000 executions.
I'm not sure how it works under the hood, but I suspect a new inference thread is spawned whenever none exists or all existing ones are still busy, and that on the Raspberry Pi 5 previously spawned threads are somehow "forgotten".
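For what it's worth, the growth pattern I'm seeing is consistent with the following toy sketch. This is purely an illustration of my assumption, not OpenVINO's actual internals; the class names and buffer sizes are made up:

```python
# Illustrative sketch of the suspected behaviour (an assumption, not
# OpenVINO's real implementation): if every request allocates fresh
# per-stream scratch memory and old allocations are never reused or
# freed, resident memory grows linearly with the number of calls.

class LeakyStreams:
    """Each call 'spawns' a new stream with its own scratch buffer."""
    def __init__(self):
        self._streams = []

    def infer(self):
        scratch = bytearray(1_000_000)  # ~1 MB of new scratch per call
        self._streams.append(scratch)   # old streams are never reclaimed


class SingleStream:
    """Analogue of NUM_STREAMS=1: one scratch buffer, reused forever."""
    def __init__(self):
        self._scratch = bytearray(1_000_000)

    def infer(self):
        self._scratch[0] = 1            # reuse the same buffer in place
```

With `LeakyStreams`, RSS climbs by roughly 1 MB per call; with `SingleStream` it stays flat, which matches what I observe with and without the workaround.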
Step-by-step reproduction
System:
- Raspberry Pi 5 8GB
- OS: Debian GNU/Linux 12 (bookworm) (current 64-bit Raspberry Pi OS)
- CPU: Cortex-A76, ARM aarch64
- Python venv:
- Python 3.11.2
- numpy==2.2.6
- openvino==2025.2.0
- openvino-telemetry==2025.1.0
- packaging==25.0
- psutil==7.0.0
How I created the model:

```python
from ultralytics import YOLO

model = YOLO('yolo11n.pt')
path = model.export(format="openvino", device="cpu", imgsz=128, int8=True, single_cls=True, max_det=1)
```
How I used it:

```python
from openvino import Core
import numpy as np
import psutil

SET = False

ie = Core()
if SET:
    ie.set_property("CPU", {"INFERENCE_NUM_THREADS": 1, "NUM_STREAMS": 1})
compiled_model = ie.compile_model(model="yolo11n_int8_openvino_model/yolo11n.xml", device_name="CPU")

proc = psutil.Process()
last_rss = proc.memory_info().rss
for i in range(10_000):
    img = np.random.rand(1, 3, 128, 128)
    result = compiled_model(img)
    new_rss = proc.memory_info().rss
    if new_rss > last_rss:
        print(f"Step {i}: RSS increase by {(new_rss-last_rss)/1e6:.2f}MB, total RSS = {new_rss/1e6:.2f}MB")
        last_rss = new_rss
```
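As a cross-check that the growth is not a psutil artifact, peak RSS can also be sampled with the stdlib `resource` module. A minimal sketch (the helper name `peak_rss_mb` is mine; note that `ru_maxrss` units differ by platform):

```python
import resource
import sys

def peak_rss_mb():
    """Peak resident set size of this process in megabytes.

    Uses the stdlib resource module instead of psutil. On Linux,
    ru_maxrss is reported in kilobytes; on macOS it is in bytes.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return rss / 1e6
    return rss / 1e3
```

Unlike `psutil.Process().memory_info().rss`, this reports the high-water mark rather than the current value, so it only ever grows, but it confirms whether the process really reached the reported sizes.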
I ran it on a bunch of different systems with various Python and openvino versions. Sometimes RSS grew a little after some steps, but never significantly; it was only a problem on the Raspberry Pi 5. The Windows PC produced a bit more log output, but its RSS stayed more or less constant.
I ran the script on every system once with SET = True and once with SET = False, and marked in the .txt files which outputs came from which run.
Here are all the print outputs from the above code (only the Pi5 had an actual problem):
Pi5.txt
PiZero2W.txt
UbuntuPC.txt
UbuntuSurface.txt
Windows10PC.txt
Relevant log output
Uploaded as .txt files above.
Issue submission checklist
- I'm reporting an issue. It's not a question.
- I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
- There is reproducer code and related data files such as images, videos, models, etc.