- 
                Notifications
    You must be signed in to change notification settings 
- Fork 373
Add initial support for powerpc64le initialization. #267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
    * Adds header definitions for PPC64le
    * Adds support to construct the processor, core, cluster, package
      and cache(L1i,L1d,L2 and L3) information reported by the system.
    Test: Build and ran cpu_info on PPC64le linux machine. confirmed
          that it properly reports the logical processors, cores, clusters,
          packages and cache information.
    | @malfet could you review the patch again | 
| Hi @malfet , Did you get a chance to review this patch? | 
| @malfet I am waiting for your review. | 
| @fbarchard @malfet can you review this. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still confused about use of #if (b & FOOBAR) conditions in the test
preprocessor code will never be compiled.
| I have updated the code. Please review it @malfet | 
| @malfet can you review the latest code. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds initial support for powerpc64le initialization including new header definitions, architectural decoding logic, and test coverage for POWER-specific features.
- Adds new cases to uarch-to-string conversion to support POWER7, POWER8, POWER9, and POWER10.
- Introduces PPC64-specific processing flows in CPU parsing, cache handling, and initialization.
- Updates configuration and test infrastructure to integrate PPC64 support.
Reviewed Changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description | 
|---|---|
| tools/cpu-info.c | New POWER architecture cases added in the uarch_to_string function. | 
| test/name/power-features.cc | Added tests for POWER-specific features based on auxiliary vector flags. | 
| test/init.cc | Conditionals to disable cache consistency tests for PPC64 are added. | 
| src/powerpc/uarch.c, linux/ppc64-isa.c, linux/ppc64-hw.c | New implementations for decoding vendor, ISA, and hardware capabilities on PPC64. | 
| src/powerpc/linux/cpuinfo.c | CPU parsing updated to support POWER processor architectures with proper logging. | 
| src/powerpc/cache.c | New implementation for decoding cache information for PPC processors. | 
| src/linux/processors.c & src/linux/api.h | Added functions to get processor online status on Linux. | 
| src/init.c, internal-api.h, src/api.c | Updated PPC64 branches added to initialization and global data. | 
| include/cpuinfo.h | Added PPC-specific fields and inline feature functions. | 
| configure.py | Configures PPC64-specific source files and tests for build targets. | 
Files not reviewed (1)
- CMakeLists.txt: Language not supported
The 'disabled' field was declared for configurations other than SMT8
mode, but it is already handled during PowerPC64 Linux system
initialization. This change removes the redundant code.
make test:
    Start 1: init-test
1/3 Test pytorch#1: init-test ........................   Passed    0.02 sec
    Start 2: get-current-test
2/3 Test pytorch#2: get-current-test .................   Passed    0.02 sec
    Start 3: power-features-test
3/3 Test pytorch#3: power-features-test ..............   Passed    0.02 sec
100% tests passed, 0 tests failed out of 3
    c4e88ef    to
    33cc865      
    Compare
  
    | goto unknown; | ||
| } | ||
| break; | ||
| case 5: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a break statement for case 5, 7, 8?
        
          
                src/powerpc/linux/cpuinfo.c
              
                Outdated
          
        
      | if (memcmp(line_start, "clock", key_length) == 0) { | ||
| parse_cpu_architecture(value_start, value_end, processor); | ||
| } else { | ||
| goto unknown; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid using goto's for logging.
        
          
                src/powerpc/linux/init.c
              
                Outdated
          
        
      | /* Asiigning linux_id's for all the online processors in consecutive manner. | ||
| * This is only needed in other than SMT8 modes. | ||
| */ | ||
| if (smt !=8) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For SMT mode otherthan 8, the Linux IDs should not be continuous.
Ex: In SMT8, Linux Ids for a core.      : 0,1,2,3,4,5,6,7
Linux Ids for two cores : 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
In SMT4, Linux Ids for single core: 0,1,2,3,-,-,-,-
Linux Ids for two cores : 0,1,2,3,-,-,-,-,8,9,10,11,-,-,-,-
Similarly for SMT2 and SMT off.
        
          
                src/powerpc/linux/init.c
              
                Outdated
          
        
      | } | ||
|  | ||
| uint32_t l1i_index = UINT32_MAX, l1d_index = UINT32_MAX, l2_index = UINT32_MAX, l3_index = UINT32_MAX; | ||
| for (uint32_t i = 0; i < valid_processors_count; i++) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For SMT=off, where we have only single hardware thread (valid_processor_count) active but two cache instances present per core, this leads to mismatch while mapping cache instances to logical processors. Instead, iterate over cache instances and map to the logical processors.
        
          
                src/powerpc/linux/init.c
              
                Outdated
          
        
      | struct cpuinfo_cache* l1d = NULL; | ||
| struct cpuinfo_cache* l2 = NULL; | ||
| struct cpuinfo_cache* l3 = NULL; | ||
| struct cpuinfo_cache* l4 = NULL; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l4 can be removed for PowerPC
| case 10: /* POWER10 */ | ||
| processor->core.uarch = cpuinfo_uarch_power10; | ||
| break; | ||
| default: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add support for Power11 and deprecate Power7.
| case cpuinfo_uarch_power9: | ||
| return "POWER9"; | ||
| case cpuinfo_uarch_power10: | ||
| return "POWER10"; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add Power11 support and deprecate Power7.
        
          
                test/name/power-features.cc
              
                Outdated
          
        
      | int a = mock_hwcap; | ||
| #else | ||
| int a = (uint32_t) getauxval(AT_HWCAP); | ||
| int b = (uint32_t) getauxval(AT_HWCAP2); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename 'b' to be more meaningful name to reflect its purpose. If 'a' is not used , we can delete initializing 'a'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove cpuinfo_linux_get_processor_online_status and then it looks like a reasonable structured PR
        
          
                src/linux/api.h
              
                Outdated
          
        
      | uint32_t processor, | ||
| uint32_t package_id[restrict static 1]); | ||
| CPUINFO_INTERNAL bool cpuinfo_linux_get_processor_core_id(uint32_t processor, uint32_t core_id[restrict static 1]); | ||
| CPUINFO_INTERNAL bool cpuinfo_linux_get_processor_online_status(uint32_t processor, uint32_t* online_status); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be subject of another PR with some documentation explaining what this is and why it is needed only for Power system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @malfet  ,Please  go through the https://www.redbooks.ibm.com/redpapers/pdfs/redp5649.pdf
We have moved the cpuinfo_linux_get_processor_online_status to src/powerpc/linux/ as it is completely written for power.
"The Power E1080 server uses the Power10 enterprise-class processor variant in which each
core can run with up to eight independent hardware threads. If all threads are active, the
mode of operation is referred to as 8-way simultaneous multithreading (SMT8) mode. A
Power10 core with SMT8 capability is named Power10 SMT8 core or SMT8 core for short.
The Power10 core also supports modes with four active threads (SMT4); that is, two active
threads (SMT2) and one single active thread (ST)."
Fixes incorrect processor linux_ids when running in modes other than SMT8. Corrects cache count reporting when SMT is disabled. Removed the cpuinfo_linux_get_processor_online_status and added it in the powerpc specific code and optimized the usage of it. Cleans up the codebase to improve readability and maintainability.
| I have addressed all the review comments. Please go through the latest changes. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated changes looks fine. Will let @malfet suggest next actions/approval.
| @malfet could you review the latest changes. | 
Added header definitions for PPC64LE.
Implemented support to construct and report processor topology information including processor, core, cluster, package, PVR, and cache levels (L1i, L1d, L2, and L3).
Test:
Built and executed cpu_info on a PPC64LE Linux machine. Confirmed that it accurately reports logical processors, cores, clusters, packages, and cache information.
Ran unit tests and verified that there were no regressions.