[Bug]: Unbounded Memory increase on a RaspberryPi 5 #31064

@ValentinBhend

Description

OpenVINO Version

2025.2.0

Operating System

Other (Please specify in description)

Device used for inference

CPU

Framework

PyTorch

Model used

yolo11n

Issue description

This is a bit of a weird issue: it only happens on a Raspberry Pi 5. I tried to reproduce it on a number of systems, including a Raspberry Pi Zero 2W, but it only occurs on the Pi 5.
Running CPU inference increases the used memory (RSS) until it fills up completely; the program then slows down and eventually halts entirely.

I eventually found that I can get the model to run reliably on the Pi 5 by setting
ie.set_property("CPU", {"INFERENCE_NUM_THREADS": 1, "NUM_STREAMS": 1})
Without that, OpenVINO inference crashed my Pi 5 after around 5000 executions.

I'm not sure how it works under the hood, but I suspect a new inference thread is spawned whenever none exists yet or all existing ones are still busy. And maybe on the Raspberry Pi 5, all previously spawned threads are "forgotten" for some reason.
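
To illustrate the suspected mechanism (this is an assumption about OpenVINO's internals, not its actual implementation): a pool that spawns a worker only when no idle one exists, up to a fixed cap, behaves like Python's stdlib ThreadPoolExecutor. If such a pool lost track of its idle workers, every request would spawn a fresh thread (plus its stack), which would match the unbounded RSS growth seen here:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def worker():
    # Report which OS thread actually ran the task.
    return threading.get_ident()

# Workers are spawned lazily: a new thread is created only when a task
# arrives and no idle worker is available, capped at max_workers.
with ThreadPoolExecutor(max_workers=2) as pool:
    thread_ids = {pool.submit(worker).result() for _ in range(100)}

# 100 tasks, but at most 2 distinct worker threads ever existed.
print(len(thread_ids))
```

A pool that "forgets" idle workers would instead report up to 100 distinct thread IDs, and none of those threads' memory would ever be reclaimed.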

Step-by-step reproduction

System:

  • Raspberry Pi 5 8GB
  • OS: Debian GNU/Linux 12 (bookworm) (current 64-bit Raspberry Pi OS)
  • CPU: Cortex-A76, ARM aarch64
  • Python venv:
    • Python 3.11.2
    • numpy==2.2.6
    • openvino==2025.2.0
    • openvino-telemetry==2025.1.0
    • packaging==25.0
    • psutil==7.0.0

How I created the model:

from ultralytics import YOLO
model = YOLO('yolo11n.pt')
path = model.export(format="openvino", device="cpu", imgsz=128, int8=True, single_cls=True, max_det=1)

How I used it:

from openvino import Core
import numpy as np
import psutil

SET = False  # set to True to apply the single-thread workaround

ie = Core()
if SET:
    ie.set_property("CPU", {"INFERENCE_NUM_THREADS": 1, "NUM_STREAMS": 1})
compiled_model = ie.compile_model(model="yolo11n_int8_openvino_model/yolo11n.xml", device_name="CPU")

proc = psutil.Process()
last_rss = proc.memory_info().rss

for i in range(10_000):
    img = np.random.rand(1, 3, 128, 128)
    result = compiled_model(img)
    new_rss = proc.memory_info().rss
    if new_rss > last_rss:
        print(f"Step {i}: RSS increase by {(new_rss-last_rss)/1e6:.2f}MB, total RSS = {new_rss/1e6:.2f}MB")
    last_rss = new_rss
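
As a side note for anyone reproducing this: the psutil dependency can be avoided by tracking peak RSS with the stdlib resource module (Unix only; on Linux ru_maxrss is reported in KiB, on macOS in bytes). This is just a convenience sketch, not part of the original reproducer:

```python
import resource
import sys

def peak_rss_mb() -> float:
    """Peak resident set size of this process, in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports KiB, macOS reports bytes.
    return peak / 1e3 if sys.platform.startswith("linux") else peak / 1e6

print(f"peak RSS so far: {peak_rss_mb():.2f}MB")
```

Note that ru_maxrss is a high-water mark, so unlike psutil's current RSS it never decreases; for detecting unbounded growth that is sufficient.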

I ran it on several different systems with various Python and OpenVINO versions. Some of them occasionally used a bit more memory after certain steps, but never significantly; it was only a problem on the Raspberry Pi 5. The Windows PC produced a few more log lines, but its RSS stayed more or less constant.

I ran it on every system once with SET = True and once with SET = False, and noted in the .txt files which outputs come from which run.
Here are all the print outputs from the code above (only the Pi5 had an actual problem):
Pi5.txt
PiZero2W.txt
UbuntuPC.txt
UbuntuSurface.txt
Windows10PC.txt

Relevant log output

Uploaded as .txt files above.

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
