@schmikei
Contributor

Updates the IBM MQ mixin to use more modern libraries

IBM MQ cluster overview

[screenshot]

IBM MQ queue manager overview
[screenshots]

IBM MQ queue overview
[screenshots]

IBM MQ topic overview

I couldn't figure out how to quickly generate topic metrics, so I mostly validated the queries via git diff.

[screenshots]

IBM MQ logs
[screenshot]

Metrics should be flowing to the shared Grafana instance on Nov 25 12pm-5pm EST :)

@schmikei force-pushed the ibm-mq-modernization branch from b01900a to 321be7d on November 25, 2025 21:29
@schmikei marked this pull request as ready for review November 25, 2025 21:31
@schmikei requested a review from a team as a code owner November 25, 2025 21:31
@Dasomeone self-assigned this Dec 4, 2025
Member

@Dasomeone left a comment

Couple minor comments here, overall looks great

Contributor

@aalhour left a comment

A few minor changes.

summary: 'There are expired messages, which imply that application resilience is failing.',
description:
(
'The number of expired messages in the {{$labels.qmgr}} is {{$labels.value}} which is above the threshold of %(alertsExpiredMessages)s.'
Contributor

Shouldn't the description reference the $value of the metric instead of {{$labels.value}}? We probably need to format it as well: {{ printf "%.0f" $value }}, WDYT?
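
A rough sketch of what that change could look like, assuming the description is still run through jsonnet's % formatting against the mixin config. The local config object and its threshold value are placeholders for illustration, not taken from this PR. Since the string goes through % formatting, the percent sign inside printf most likely needs to be doubled so it comes out as a literal %:

```jsonnet
// Hypothetical stand-in for the mixin's config object; the threshold value is made up.
local config = { alertsExpiredMessages: 2 };

{
  annotations: {
    summary: 'There are expired messages, which imply that application resilience is failing.',
    description:
      (
        // $value is the result of the alert expression. '%%' renders as a literal '%'
        // after jsonnet formatting, leaving {{ printf "%.0f" $value }} for Prometheus to expand.
        'The number of expired messages in the {{$labels.qmgr}} is {{ printf "%%.0f" $value }} which is above the threshold of %(alertsExpiredMessages)s.'
      ) % config,
  },
}
```

The same pattern would apply to the other alert descriptions that currently use {{$labels.value}}.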

summary: 'Stale messages have been detected.',
description:
(
'A stale message with an age of {{$labels.value}} has been sitting in the {{$labels.queue}} which is above the threshold of %(alertsStaleMessagesSeconds)ss.'
Contributor

Same comment about referencing {{ $value }} instead of {{ $labels.value }}.

summary: 'There is limited disk available for a queue manager.',
description:
(
'The amount of disk space available for {{$labels.qmgr}} is at {{$labels.value}}%% which is below the threshold of %(alertsLowDiskSpace)s%%.'
Contributor

Same comment about referencing {{ $value }} instead of {{ $labels.value }}.

summary: 'There is a high CPU usage estimate for a queue manager.',
description:
(
'The amount of CPU usage for the queue manager {{$labels.qmgr}} is at {{$labels.value}}%% which is above the threshold of %(alertsHighQueueManagerCpuUsage)s%%.'
Contributor

Same comment about referencing {{ $value }} instead of {{ $labels.value }}.

name: 'Time on queue',
type: 'gauge',
description: 'The average time messages spent on the queue.',
unit: 'µs',
Contributor

The metric name ends in _seconds but the unit is set to microseconds ('µs'). Can you please change it to 's'?

Contributor Author

I do think this one needs updating though, thanks for the catch 👍

signals.queueManager.queueOperationsMqput.asTarget(),
])
+ g.panel.timeSeries.panelOptions.withDescription('The number of queue operations of the queue manager.')
+ g.panel.timeSeries.standardOptions.withUnit('operations')
Contributor

I don't think 'operations' is a unit that we support in jsonnet-libs/common-lib. Can we use 'short' instead?

signals.queue.mqputMqput1Count.asTarget(),
])
+ g.panel.timeSeries.panelOptions.withDescription('The number of queue operations of the queue manager.')
+ g.panel.timeSeries.standardOptions.withUnit('operations')
Contributor

I don't think 'operations' is a unit that we support in jsonnet-libs/common-lib. Can we use 'short' instead?

Contributor

Several signals have the unit 'operations' and I don't think it is a supported unit in jsonnet-libs/common-lib. Can we use 'short' instead?

Contributor

Similar to queue signals.

Several signals have the unit 'operations' and I don't think it is a supported unit in jsonnet-libs/common-lib. Can we use 'short' instead?

Contributor

Similar to the two other signals files.

Several signals have the unit 'operations' and I don't think it is a supported unit in jsonnet-libs/common-lib. Can we use 'short' instead?

Member

operations is preferable here actually. If Grafana doesn't recognise a unit, it's treated as a custom string unit, so it'll be just fine

@aalhour self-assigned this Dec 11, 2025
Member

@Dasomeone left a comment

Couple more comments, overall I'm happy with it!

Member

operations is preferable here actually. If Grafana doesn't recognise a unit, it's treated as a custom string unit, so it'll be just fine

Member

@Dasomeone left a comment

LGTM, all my comments have been addressed, though holding off approval until @aalhour is happy as well :)
