@KernAttila
Contributor

@KernAttila KernAttila commented Jun 1, 2023

Link the Issue(s) this Pull Request is related to.
#1295

Summarize your change.

Report threads:

  • Report and reserve only the requested number of threads (instead of cores), taking into account hybrid systems with P-cores and E-cores.
    • This is backward compatible by default (using cores).
  • Updated report.proto and the underlying RQD report method.
    • Added a total_threads field to CoreDetails (used for reporting); it is sent to Cuebot but not used there yet.
  • Added OVERRIDE_THREADS in rqconstants.py (overridable via rqd.conf).

Cleanup:

  • Added a separate module to analyse CPU and memory on Windows, using WinDLL directly.
  • Cleaned up the process that counts CPUs/cores/threads on Linux and Windows.
  • hyperthreadingMultiplier can be a float for hybrid CPUs
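
To illustrate why the multiplier can be fractional, here is a minimal sketch that counts sockets, physical cores and threads from `/proc/cpuinfo`-style text. It is not the code from this PR: `count_cpu_topology` and `fake_hybrid_cpuinfo` are hypothetical names, and the synthetic input only mimics an 8 P-core + 8 E-core layout such as the i9-12900.

```python
from collections import defaultdict


def count_cpu_topology(cpuinfo_text):
    """Return (sockets, physical_cores, threads, ht_multiplier) from cpuinfo text."""
    # (physical id, core id) -> list of logical processor ids
    threads_by_core = defaultdict(list)
    current = {}

    def flush(entry):
        # Only register blocks that actually describe a logical processor.
        if "processor" in entry:
            key = (entry.get("physical id", "0"),
                   entry.get("core id", entry["processor"]))
            threads_by_core[key].append(int(entry["processor"]))

    for line in cpuinfo_text.splitlines():
        if not line.strip():  # a blank line ends one processor block
            flush(current)
            current = {}
            continue
        name, _, value = line.partition(":")
        current[name.strip()] = value.strip()
    flush(current)  # the last block may not be followed by a blank line

    sockets = len({cpu_id for cpu_id, _ in threads_by_core})
    cores = len(threads_by_core)
    threads = sum(len(ids) for ids in threads_by_core.values())
    multiplier = threads / cores if cores else 1.0
    return sockets, cores, threads, multiplier


def fake_hybrid_cpuinfo(p_cores=8, e_cores=8):
    """Synthetic input: P-cores expose two threads each, E-cores one (assumption)."""
    blocks, proc = [], 0
    for core in range(p_cores + e_cores):
        for _ in range(2 if core < p_cores else 1):
            blocks.append(
                f"processor\t: {proc}\nphysical id\t: 0\ncore id\t\t: {core}\n"
            )
            proc += 1
    return "\n".join(blocks)


print(count_cpu_topology(fake_hybrid_cpuinfo()))
```

With 16 physical cores and 24 threads, the ratio is 1.5, which is why an integer multiplier cannot describe a hybrid CPU.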

Various Windows sugar:

  • getBootTime() -> defaults to psutil.boot_time() when no stat file is provided (it used to be just 0 on Windows).
  • Get the real free disk space on Windows instead of a hardcoded default value.
  • Get the CPU load on Windows machines with psutil.
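
The three fallbacks above could look roughly like the sketch below. The function names and signatures are hypothetical and do not come from the PR; `psutil` is a third-party package and is imported lazily so that the stat-file and disk paths work without it.

```python
import shutil
import tempfile


def get_boot_time(stat_path=None):
    """Boot time in epoch seconds.

    Reads the `btime` line from a /proc/stat-style file when one is given,
    otherwise falls back to psutil.boot_time().
    """
    if stat_path:
        with open(stat_path, encoding="utf-8") as stat_file:
            for line in stat_file:
                if line.startswith("btime"):
                    return int(line.split()[1])
    import psutil  # third-party; only needed for the fallback
    return int(psutil.boot_time())


def get_free_tmp_kb(path=None):
    """Free space, in KiB, on the volume holding `path` (default: the temp dir)."""
    usage = shutil.disk_usage(path or tempfile.gettempdir())
    return usage.free // 1024


def get_cpu_load_percent(interval=0.1):
    """System-wide CPU utilisation in percent, sampled over `interval` seconds."""
    import psutil  # third-party
    return psutil.cpu_percent(interval=interval)
```

Note that a CPU utilisation percentage is not the same quantity as a Unix load average, so a consumer of the reported value has to know which one it is reading on each platform.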

Various changes:

  • renamed variables in rqCore.py to match wording in the code base.
    • __procs_by_physid_and_coreid -> __threadid_by_cpuid_and_coreid
    • __physid_and_coreid_by_proc -> __cpuid_and_coreid_by_threadid
  • in rqMachine.py:
    • reserveHT() is marked as deprecated, with a suggestion to use the new reserveCores() method instead.
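
A common way to mark a method as deprecated while keeping it working is to emit a `DeprecationWarning` and forward to the replacement. The sketch below shows only that pattern: the `Machine` class, its constructor and both method signatures are assumptions made for illustration and do not reflect the real rqMachine.py.

```python
import warnings


class Machine:
    """Hypothetical stand-in; only the deprecation pattern matters here."""

    def __init__(self, free_thread_ids):
        self._free = sorted(free_thread_ids)

    def reserveCores(self, count):
        """Reserve `count` logical cores and return their ids."""
        if count > len(self._free):
            raise ValueError("not enough free cores")
        reserved, self._free = self._free[:count], self._free[count:]
        return reserved

    def reserveHT(self, count):
        """Deprecated: kept for backward compatibility, forwards to reserveCores()."""
        warnings.warn(
            "reserveHT() is deprecated, use reserveCores() instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this method
        )
        return self.reserveCores(count)
```

Existing callers keep working unchanged, and the warning shows up in test runs and logs so they can migrate at their own pace.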

Tests:

  • I had to rename some cpu files to match the number of threads in each.
  • cpuinfo numbers (end of file name) represent totalThreads-coresPerProc-numProcs + hyperThreadingMultiplier (optional)
  • Updated __cpuinfoTestHelper() for that
  • added test_reserveCores(), test_reserveHybridHT()
  • Added a test for hybrid cpu (i9-12900)
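
As a sketch of how a test helper might read the naming convention above, the snippet below parses the trailing numbers of a cpuinfo file name. The parser, the `CpuinfoExpectation` type and the example file names are hypothetical; the real `__cpuinfoTestHelper()` may work differently.

```python
import re
from typing import NamedTuple, Optional


class CpuinfoExpectation(NamedTuple):
    total_threads: int
    cores_per_proc: int
    num_procs: int
    ht_multiplier: Optional[float]  # None when the file name omits it


# totalThreads-coresPerProc-numProcs, optionally followed by -hyperThreadingMultiplier
_SUFFIX = re.compile(r"(\d+)-(\d+)-(\d+)(?:-(\d+(?:\.\d+)?))?$")


def parse_cpuinfo_name(file_name):
    """Extract the expected counts encoded at the end of a cpuinfo test file name."""
    match = _SUFFIX.search(file_name)
    if match is None:
        raise ValueError(f"no topology suffix in {file_name!r}")
    total, cores, procs, multiplier = match.groups()
    return CpuinfoExpectation(
        int(total),
        int(cores),
        int(procs),
        float(multiplier) if multiplier is not None else None,
    )


# Hypothetical file names, used only to exercise the parser.
print(parse_cpuinfo_name("_cpuinfo_hybrid_24-16-1-1.5"))
print(parse_cpuinfo_name("_cpuinfo_plain_8-4-2"))
```

Keeping the multiplier as a float allows a hybrid layout such as 24 threads on 16 cores (1.5) to be encoded in the same scheme as regular hyperthreaded CPUs.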

Edit 08/25:

The PR is ready. It reports both physical cores and logical cores (aka threads) on all kinds of CPUs (including hybrid ones).
The data is sent to Cuebot and is backward compatible (still using cores for now).
I'll update the server in a separate PR, with new fields to register the threads and display them in CueGui.

Edit 04/25:

The system should allow legacy "physical" core selection (mostly for backward compatibility) and move toward using logical cores (aka threads).
Logical cores are what users see in htop or Task Manager, which feels more intuitive.
For that to work, as discussed in previous TSC meetings, I will implement a new "thread" field in the database for hosts and jobs.
RQD will report both physical and logical cores.
Job submission will then offer both options to choose from (separate PR).

@DiegoTavares
Collaborator

This is an interesting change that never really got attention. I'm taking some time to study the architecture of hybrid systems with P-cores and E-cores to understand what this PR is trying to accomplish. At first glance, I like the direction it is taking, but I don't think it is documented properly.

@DiegoTavares
Collaborator

DiegoTavares commented Aug 28, 2024

I'm finishing up a collection of changes that were pending on SPI's end and will keep this PR right at the top of my review list for when I finish. Sorry for the wait.

…o rqd-reserve-all-cores

# Conflicts:
#	VERSION.in
#	proto/src/report.proto
#	rqd/setup.py
- Always add the LOAD_MODIFIER to the loadAvg, even on unchecked systems.
- Get cpu load instead for Windows and MacOS.
doc:
- Explicit documentation to differentiate the value meaning on each system.
…o rqd-reserve-all-cores

# Conflicts:
#	VERSION.in
@KernAttila KernAttila marked this pull request as ready for review August 6, 2025 16:57
@KernAttila
Contributor Author

KernAttila commented Aug 6, 2025

@DiegoTavares and @lithorus the PR is now fully ready for review.
I'll update Cuebot in a separate PR.
For now, RQD is fully backward compatible with the legacy physical-core reservation method; it sends extra data to Cuebot (not used yet).
The rest of the PR is a big refactor that makes the CPU/core/thread count accurate,
with some updated data for Windows (CPU load and free temp disk space).

@lithorus
Collaborator

lithorus commented Aug 27, 2025

Hmm... the Rust part is still failing. Perhaps due to the changes in the proto files?
Edit:
Looking at it more closely, it looks like the same logic needs to be implemented in the Rust part as well.

@KernAttila
Contributor Author

Hmm... the rust part is still failing. Perhaps due to the changes in the proto files? Edit: Looking at it closer, it looks like the same logic needs to be implemented in the rust part as well.

@lithorus I was afraid this was going to happen, we'll discuss this with @DiegoTavares together today.

@KernAttila KernAttila marked this pull request as draft August 27, 2025 21:31
@KernAttila KernAttila changed the title Rqd reserve all cores [rqd] reserve all cores and report threads Sep 24, 2025