Skip to content

Conversation

@dev-sda1
Copy link
Contributor

@dev-sda1 dev-sda1 commented Dec 17, 2025

Nodes that are marked as unavailable/down in Slurm, along with their processors, shouldn't be counted in the total_nodes and total_processors figure, as this leads to misleading numbers in certain areas such as the System Status, which presents more resources available than there really are.

Have also added "down" (since down nodes are also inaccessible to the user) and changed the string match on gres_lines done in #916 to /ig to account for multiple nodes being in one of those states

Total nodes shouldn't account for nodes marked as down in Slurm, as this might look misleading in certain widgets like System Status. Have also broken out unavailable nodes as a new property if directly modifying total_nodes isn't considered ideal.
@dev-sda1 dev-sda1 marked this pull request as draft December 17, 2025 16:09
@dev-sda1 dev-sda1 marked this pull request as ready for review January 16, 2026 12:16
@johrstrom
Copy link
Contributor

Thanks for the contribution. I think on the surface it makes sense, I just want a second to test it out to see what's what. I don't think accessing [] from a regex output value is super safe, but I'll have to check it out (besides that's kinda what we were doing before).

Some casts to_i may be nice for safety - that way nils get cast to 0 if there are any (for example no nodes being down).

@dev-sda1
Copy link
Contributor Author

Added some to_i casts to the gres_lines references. Regarding your first point I definitely feel the same as well (since they might suddenly change the formatting in a future release), maybe it would be worth in the long term looking into slurmrestd as a way to communicate with Slurm?

node_cpu_info = call("sinfo", "-aho %F/%D/%C").strip.split('/')
gres_length = call("sinfo", "-o %G").lines.map(&:strip).map(&:length).max + 2
gres_lines = call("sinfo", "-ahNO ,nodehost,gres:#{gres_length},gresused:#{gres_length},statelong")
.lines.uniq.reject { |line| line.match?(/maint|drain/i) }.map(&:split)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently g is not valid for ruby regular expressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be sorted now, committed fix

@johrstrom
Copy link
Contributor

OK thanks for this addition. I'm working to fix it up a bit and add a test case for the same in a subsequent PR.

@johrstrom johrstrom self-requested a review January 22, 2026 18:21
Copy link
Contributor

@johrstrom johrstrom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This addition will be in 4.1, coming out shortly.

@johrstrom johrstrom merged commit e1f2b3b into OSC:master Jan 22, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants