@ummarfarooq-swe

This PR extends the InfiniBand collector to expose 30 additional hardware counter metrics from /sys/class/infiniband//ports//hw_counters/.

These metrics provide deeper visibility into RDMA operations, RoCE performance, congestion control, and error conditions.

Changes

  • Added 30 new metric descriptions to the InfiniBand collector
  • Implemented collection of hardware counters via port.HwCounters fields
  • All new metrics follow the existing naming conventions and are exposed as counters
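
For readers unfamiliar with the hw_counters directory, the following is a minimal sketch of how such counters could be read from disk. It is not the PR's implementation: the PR reads values through the port.HwCounters fields, whereas this sketch reads the files directly. The function names `parseCounter` and `readHwCounters` are hypothetical, and the sketch assumes that each file in the directory holds a single decimal integer.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseCounter converts the text content of one counter file into a uint64.
// Assumption: the file holds a single decimal integer, possibly followed by a newline.
func parseCounter(raw string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(raw), 10, 64)
}

// readHwCounters (hypothetical helper) reads every non-directory entry in dir
// and returns a map from file name to parsed value. Entries that cannot be
// read or parsed are skipped instead of failing the whole read.
func readHwCounters(dir string) (map[string]uint64, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	counters := make(map[string]uint64, len(entries))
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		data, err := os.ReadFile(filepath.Join(dir, e.Name()))
		if err != nil {
			continue
		}
		v, err := parseCounter(string(data))
		if err != nil {
			continue
		}
		counters[e.Name()] = v
	}
	return counters, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: hwcounters <hw_counters directory>")
		return
	}
	counters, err := readHwCounters(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for name, v := range counters {
		fmt.Printf("%s %d\n", name, v)
	}
}
```

Skipping unreadable entries is a design choice of this sketch, so that one bad file does not hide the remaining counters. A real collector may prefer to log or surface such errors.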

"rx_read_requests": "Number of read Requests from hwcounters.",
"rx_icrc_encapsulated": "Number of RxIcrcEncapsulated packets from hwcounters",
"rx_dct_connect": "Number of DCT connect requests received from hwcounters",
"rx_atomic_requests": "Number of atomic requests received from hwcounters",
Member

Please first ensure that these names follow naming best practices: a _total suffix for cumulative counters, and abbreviated names (trans, err, etc.) spelled out.
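
As a rough illustration of the two rules above, the sketch below rewrites a raw counter name by spelling out abbreviated parts and appending `_total`. The `metricName` function and the `expansions` table are hypothetical. The expansions chosen for `err` and `trans` are guesses made for illustration and do not come from the PR or from the reviewer. `example_timeout_err` is an invented input.

```go
package main

import (
	"fmt"
	"strings"
)

// expansions maps abbreviated name parts to spelled-out forms.
// Assumption: these entries are illustrative guesses, not taken from the PR.
var expansions = map[string]string{
	"err":   "errors",
	"trans": "transitions",
}

// metricName (hypothetical helper) spells out known abbreviations in a raw
// counter name and appends a _total suffix unless one is already present.
func metricName(raw string) string {
	parts := strings.Split(strings.ToLower(raw), "_")
	for i, p := range parts {
		if full, ok := expansions[p]; ok {
			parts[i] = full
		}
	}
	name := strings.Join(parts, "_")
	if !strings.HasSuffix(name, "_total") {
		name += "_total"
	}
	return name
}

func main() {
	// rx_read_requests appears in the diff above; example_timeout_err is made up.
	for _, raw := range []string{"rx_read_requests", "example_timeout_err"} {
		fmt.Printf("%s -> %s\n", raw, metricName(raw))
	}
}
```

With these assumed expansions, `rx_read_requests` becomes `rx_read_requests_total` and `example_timeout_err` becomes `example_timeout_errors_total`.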

@SuperQ
Member

SuperQ commented Nov 12, 2025

Duplicate of #2827

@SuperQ SuperQ marked this as a duplicate of #2827 Nov 12, 2025
@SuperQ SuperQ closed this Nov 12, 2025