-
Notifications
You must be signed in to change notification settings - Fork 751
[BMC] Add BMC CLIs and integrate BMC dump into show techsupport #4104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
ffa7c45 to
e2d4059
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
e12bc40 to
6d78c81
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
33f5cfc to
84ce3a9
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
2ca1762 to
b2b0c11
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@judyjoseph @yxieca |
nate-nexthop
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request adds BMC (Baseboard Management Controller) support to the SONiC utilities, enabling BMC information retrieval and BMC debug log collection during techsupport generation.
- Adds new CLI commands
show platform bmc summaryandshow platform bmc eepromto display BMC information - Integrates BMC debug log collection into the
generate_dumpscript workflow - Introduces
bmc_techsupport.pyscript to trigger and collect BMC debug logs via Redfish API
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 14 comments.
Show a summary per file
| File | Description |
|---|---|
| show/platform.py | Adds BMC command group with summary and eeprom subcommands to display BMC hardware and firmware information |
| scripts/bmc_techsupport.py | New script implementing BMC debug log dump triggering and collection using Redfish API |
| scripts/generate_dump | Adds BMC support check, triggers BMC debug log dump at start, and collects logs during techsupport generation |
| tests/show_platform_test.py | Comprehensive test coverage for BMC commands including regular/JSON output and error scenarios |
| setup.py | Registers the new bmc_techsupport.py script for installation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ############################################################################### | ||
| # Trigger BMC debug log dump task | ||
| # Globals: | ||
| # TAR, TARFILE, DUMPDIR, BASE, TARDIR, TECHSUPPORT_TIME_INFO | ||
| # Arguments: | ||
| # None | ||
| # Returns: | ||
| # None | ||
| ############################################################################### |
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Globals section lists variables (TAR, TARFILE, DUMPDIR, BASE, TARDIR, TECHSUPPORT_TIME_INFO) that are not actually used in the trigger_bmc_debug_log_dump function. This function doesn't use any of these global variables. The Globals documentation should be updated to accurately reflect what's used or removed if no globals are used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
tests/show_platform_test.py
Outdated
| sys.modules['sonic_platform'] = mock_sonic_platform | ||
| sys.modules['sonic_platform.platform'] = mock_sonic_platform.platform | ||
|
|
||
| mock_platform = mock.MagicMock() | ||
| mock_chassis = mock.MagicMock() | ||
| mock_bmc = mock.MagicMock() | ||
|
|
||
| mock_platform.get_chassis.return_value = mock_chassis | ||
| mock_chassis.get_bmc.return_value = mock_bmc | ||
| mock_bmc.get_eeprom.return_value = self.TEST_BMC_EEPROM_INFO | ||
| mock_bmc.get_version.return_value = self.TEST_BMC_VERSION | ||
| mock_sonic_platform.platform.Platform.return_value = mock_platform | ||
|
|
||
| result = CliRunner().invoke(show.cli.commands['platform'].commands['bmc'].commands['summary'], []) | ||
| assert result.exit_code == 0, result.output | ||
| assert result.output == textwrap.dedent(expected_output) | ||
|
|
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test methods modify sys.modules to mock the sonic_platform module but never clean it up. This can cause test pollution where mocked modules from one test affect subsequent tests. Consider using a fixture or teardown to restore the original sys.modules state after each test, or use unittest.mock.patch.dict to automatically clean up:
with mock.patch.dict('sys.modules', {
'sonic_platform': mock_sonic_platform,
'sonic_platform.platform': mock_sonic_platform.platform
}):
# test code here| sys.modules['sonic_platform'] = mock_sonic_platform | |
| sys.modules['sonic_platform.platform'] = mock_sonic_platform.platform | |
| mock_platform = mock.MagicMock() | |
| mock_chassis = mock.MagicMock() | |
| mock_bmc = mock.MagicMock() | |
| mock_platform.get_chassis.return_value = mock_chassis | |
| mock_chassis.get_bmc.return_value = mock_bmc | |
| mock_bmc.get_eeprom.return_value = self.TEST_BMC_EEPROM_INFO | |
| mock_bmc.get_version.return_value = self.TEST_BMC_VERSION | |
| mock_sonic_platform.platform.Platform.return_value = mock_platform | |
| result = CliRunner().invoke(show.cli.commands['platform'].commands['bmc'].commands['summary'], []) | |
| assert result.exit_code == 0, result.output | |
| assert result.output == textwrap.dedent(expected_output) | |
| with mock.patch.dict('sys.modules', { | |
| 'sonic_platform': mock_sonic_platform, | |
| 'sonic_platform.platform': mock_sonic_platform.platform | |
| }): | |
| mock_platform = mock.MagicMock() | |
| mock_chassis = mock.MagicMock() | |
| mock_bmc = mock.MagicMock() | |
| mock_platform.get_chassis.return_value = mock_chassis | |
| mock_chassis.get_bmc.return_value = mock_bmc | |
| mock_bmc.get_eeprom.return_value = self.TEST_BMC_EEPROM_INFO | |
| mock_bmc.get_version.return_value = self.TEST_BMC_VERSION | |
| mock_sonic_platform.platform.Platform.return_value = mock_platform | |
| result = CliRunner().invoke(show.cli.commands['platform'].commands['bmc'].commands['summary'], []) | |
| assert result.exit_code == 0, result.output | |
| assert result.output == textwrap.dedent(expected_output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
scripts/generate_dump
Outdated
| fi | ||
| # Trigger BMC redfish API to start BMC debug log dump task | ||
| local task_id=$(python3 /usr/local/bin/bmc_techsupport.py -m trigger) | ||
| echo $task_id |
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable $task_id should be quoted to prevent potential issues with word splitting or globbing:
echo "$task_id"| echo $task_id | |
| echo "$task_id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
scripts/bmc_techsupport.py
Outdated
| The usage of this script is divided into two parts: | ||
| 1. Triggering BMC debug log dump Redfish task | ||
| * In this case the script, triggers a POST request to BMC to start collecting debug log dump. |
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an unnecessary comma in the sentence: "In this case the script, triggers" should be "In this case the script triggers".
| * In this case the script, triggers a POST request to BMC to start collecting debug log dump. | |
| * In this case the script triggers a POST request to BMC to start collecting debug log dump. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
scripts/bmc_techsupport.py
Outdated
| 2. Collecting BMC debug log dump | ||
| * In this step we will wait for the task-id to finish if it has not finished. | ||
| * Blocking action till we get the file or having ERROR or Timeout. |
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The phrase "having ERROR or Timeout" is grammatically awkward. Consider rephrasing to "encountering an ERROR or Timeout" or "until an ERROR or Timeout occurs".
| * Blocking action till we get the file or having ERROR or Timeout. | |
| * Blocking action until we get the file or encounter an ERROR or Timeout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
scripts/generate_dump
Outdated
| [ -f "$TARBALL_XZ" ] && rm -f "$TARBALL_XZ" | ||
|
|
||
| # Invoke BMC redfish API to extract BMC debug log dump to "/tmp/bmc_debug_log_dump.tar.xz" | ||
| python3 /usr/local/bin/bmc_techsupport.py -m collect -p "$TARBALL_XZ" -t $bmc_debug_log_dump_task_id |
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable $bmc_debug_log_dump_task_id should be quoted to prevent potential issues with word splitting or globbing. While the task ID is unlikely to contain spaces, it's a bash best practice to quote variables:
python3 /usr/local/bin/bmc_techsupport.py -m collect -p "$TARBALL_XZ" -t "$bmc_debug_log_dump_task_id"| python3 /usr/local/bin/bmc_techsupport.py -m collect -p "$TARBALL_XZ" -t $bmc_debug_log_dump_task_id | |
| python3 /usr/local/bin/bmc_techsupport.py -m collect -p "$TARBALL_XZ" -t "$bmc_debug_log_dump_task_id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
scripts/generate_dump
Outdated
| # Globals: | ||
| # TAR, TARFILE, DUMPDIR, BASE, TARDIR, TECHSUPPORT_TIME_INFO | ||
| # Arguments: | ||
| # None |
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function documentation states "Arguments: None" but the function accepts one argument $1 (the BMC debug log dump task ID) as seen on line 2009. The documentation should be updated to reflect this:
# Arguments:
# $1 - BMC debug log dump task ID| # None | |
| # $1 - BMC debug log dump task ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| try: | ||
| import sonic_platform | ||
| chassis = sonic_platform.platform.Platform().get_chassis() | ||
| bmc = chassis.get_bmc() | ||
|
|
||
| if bmc is None: | ||
| click.echo("BMC is not available on this platform") | ||
| return | ||
|
|
||
| eeprom_info = bmc.get_eeprom() | ||
| if not eeprom_info: | ||
| click.echo("Failed to retrieve BMC EEPROM information") | ||
| return | ||
|
|
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The bmc_summary and eeprom functions have significant code duplication in their initialization and error handling logic (getting the chassis, BMC, and EEPROM info). Consider extracting this common logic into a helper function to improve maintainability:
def _get_bmc_eeprom_info():
"""Helper function to get BMC and EEPROM information."""
import sonic_platform
chassis = sonic_platform.platform.Platform().get_chassis()
bmc = chassis.get_bmc()
if bmc is None:
return None, None, "BMC is not available on this platform"
eeprom_info = bmc.get_eeprom()
if not eeprom_info:
return None, None, "Failed to retrieve BMC EEPROM information"
return bmc, eeprom_info, NoneThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the helper function cannot have the @click.option, we should indicate the kind of error via the return values, and then check and log accordingly in the original functions. So there is no significant improvement in code reduction.
| trigger_bmc_debug_log_dump() { | ||
| trap 'handle_error $? $LINENO' ERR | ||
| if ! is_bmc_supported; then | ||
| echo "INFO: BMC is not found on this platform. Skipping..." |
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When BMC is not supported, the function echoes an INFO message and returns, which causes that message to be captured as the task_id value at line 2255. This leads to incorrect behavior:
- At line 2256, the check for "-1" fails (since task_id is "INFO: BMC is not found...")
- At line 2474, the check
!= "-1"succeeds, causingcollect_bmc_filesto be called with the INFO message as the task_id
The function should return "-1" explicitly without echoing the INFO message, or restructure to avoid this issue:
if ! is_bmc_supported; then
echo "-1"
return
fiAnd move the INFO message to the caller (line 2254-2258) to check for empty/invalid returns.
| echo "INFO: BMC is not found on this platform. Skipping..." | |
| echo "-1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
collect_bmc_files will be called with the INFO message as the task_id.
However, at the top of collect_bmc_files, there is a check for 'is_bmc_supported'. Therefore, we will not use this task_id.
| ############################################################################### | ||
| # Check BMC presence | ||
| # Globals: | ||
| # TAR, TARFILE, DUMPDIR, BASE, TARDIR, TECHSUPPORT_TIME_INFO | ||
| # Arguments: | ||
| # None | ||
| # Returns: | ||
| # None | ||
| ############################################################################### |
Copilot
AI
Nov 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Globals section lists variables (TAR, TARFILE, DUMPDIR, BASE, TARDIR, TECHSUPPORT_TIME_INFO) that are not actually used in the is_bmc_supported function. This function only uses the platform variable. The Globals documentation should be updated to accurately reflect what's used or removed if no globals are used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
57c1414 to
7646d3c
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
This PR is dependent on the following PRs:
sonic-net/sonic-platform-common#605
sonic-net/sonic-buildimage#24345
What I did
Added BMC dump collection to the show techsupport output.
Introduced new CLI commands for retrieving BMC information.
How I did it
Updated generate_dump script to trigger asynchronous BMC dump collection
at the beginning of the techsupport process and collect the dump before packaging the final tarball.
Implemented new BMC CLI commands under show platform bmc by extending the existing CLI framework.
The CLIs internally query the BMC API to retrieve BMC summary and EEPROM data.
How to verify it
show techsupport
show platform bmc summary
show platform bmc eeprom
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)