x86-64: enable PCIe enhanced configuration support #1822

Open: zyuiop wants to merge 4 commits into main from feat/mmap-dev-discovery

@zyuiop zyuiop commented Jul 7, 2025

No description provided.

@zyuiop zyuiop force-pushed the feat/mmap-dev-discovery branch 2 times, most recently from 684dd31 to f4a0a63 Compare July 7, 2025 14:10
@mkroening mkroening requested review from stlankes and mkroening July 8, 2025 11:37
@mkroening mkroening self-assigned this Jul 8, 2025
Contributor

@stlankes stlankes left a comment

Looks good to me. Maybe some debug messages would help to show that PCIe is detected. How did you test your PR?

@zyuiop
Author

zyuiop commented Jul 28, 2025

Hi, I'll add some debug messages. For now, I tested on VMs in which standard PCI device discovery was not available, so I confirmed this works.
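
For context on the mechanism named in the title: with PCIe enhanced configuration (ECAM), each function's 4 KiB configuration space is reached through memory-mapped I/O rather than the legacy I/O ports, which is why discovery can still work on VMs where the standard path is unavailable. The sketch below shows only the standard address layout. The `EcamRegion` type, its field names, and the choice to treat `base` as the address of bus 0 are assumptions for illustration and are not taken from this PR's commits.

```rust
/// One memory-mapped configuration region (hypothetical type, not from the PR).
/// Firmware typically reports such a region, e.g. through the ACPI MCFG table.
#[derive(Clone, Copy, Debug)]
pub struct EcamRegion {
    /// Physical address at which bus 0 of this segment would be mapped (assumption).
    pub base: u64,
    /// First bus number decoded by this region.
    pub start_bus: u8,
    /// Last bus number decoded by this region (inclusive).
    pub end_bus: u8,
}

impl EcamRegion {
    /// Returns the physical address of a configuration register, or `None`
    /// if the bus, device, function, or offset is out of range.
    pub fn config_address(&self, bus: u8, device: u8, function: u8, offset: u16) -> Option<u64> {
        if bus < self.start_bus || bus > self.end_bus {
            return None;
        }
        // 32 devices per bus, 8 functions per device, 4 KiB per function.
        if device >= 32 || function >= 8 || offset >= 4096 {
            return None;
        }
        Some(
            self.base
                + (u64::from(bus) << 20)
                + (u64::from(device) << 15)
                + (u64::from(function) << 12)
                + u64::from(offset),
        )
    }
}
```

A kernel would additionally have to map the resulting physical range into its address space before reading or writing it; that step is omitted here.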

@zyuiop zyuiop force-pushed the feat/mmap-dev-discovery branch from 3212a54 to 398eb1b Compare July 28, 2025 10:06
Contributor

@github-actions github-actions bot left a comment

Benchmark Results

| Benchmark | Current: 398eb1b | Previous: 916c0d1 | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 99.72 s | 99.64 s | 1.00 |
| startup_benchmark File Size | 0.87 MB | 0.87 MB | 1.01 |
| Startup Time - 1 core | 0.88 s (±0.02 s) | 0.91 s (±0.02 s) | 0.97 |
| Startup Time - 2 cores | 0.89 s (±0.02 s) | 0.91 s (±0.01 s) | 0.98 |
| Startup Time - 4 cores | 0.91 s (±0.02 s) | 0.92 s (±0.02 s) | 0.99 |
| multithreaded_benchmark Build Time | 99.87 s | 100.90 s | 0.99 |
| multithreaded_benchmark File Size | 0.98 MB | 0.97 MB | 1.01 |
| Multithreaded Pi Efficiency - 2 Threads | 70.33 % (±6.13 %) | 72.02 % (±3.89 %) | 0.98 |
| Multithreaded Pi Efficiency - 4 Threads | 40.32 % (±2.75 %) | 42.13 % (±3.21 %) | 0.96 |
| Multithreaded Pi Efficiency - 8 Threads | 20.94 % (±1.80 %) | 20.62 % (±1.67 %) | 1.02 |
| micro_benchmarks Build Time | 82.63 s | 82.66 s | 1.00 |
| micro_benchmarks File Size | 0.99 MB | 0.98 MB | 1.01 |
| Scheduling time - 1 thread | 51.03 ticks (±1.19 ticks) | 50.76 ticks (±1.58 ticks) | 1.01 |
| Scheduling time - 2 threads | 31.22 ticks (±5.62 ticks) | 29.97 ticks (±4.74 ticks) | 1.04 |
| Micro - Time for syscall (getpid) | 13.69 ticks (±1.84 ticks) | 13.62 ticks (±1.90 ticks) | 1.01 |
| Memcpy speed - (built_in) block size 4096 | 89679.94 MByte/s (±61805.44 MByte/s) | 87836.97 MByte/s (±60687.62 MByte/s) | 1.02 |
| Memcpy speed - (built_in) block size 1048576 | 44150.75 MByte/s (±30516.18 MByte/s) | 44377.25 MByte/s (±30655.67 MByte/s) | 0.99 |
| Memcpy speed - (built_in) block size 16777216 | 30100.54 MByte/s (±24699.95 MByte/s) | 30073.21 MByte/s (±24686.15 MByte/s) | 1.00 |
| Memset speed - (built_in) block size 4096 | 89875.60 MByte/s (±61947.13 MByte/s) | 87609.46 MByte/s (±60567.75 MByte/s) | 1.03 |
| Memset speed - (built_in) block size 1048576 | 44359.82 MByte/s (±30658.99 MByte/s) | 44601.88 MByte/s (±30809.54 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 16777216 | 30869.61 MByte/s (±25140.82 MByte/s) | 30861.29 MByte/s (±25146.75 MByte/s) | 1.00 |
| Memcpy speed - (rust) block size 4096 | 79538.43 MByte/s (±55193.44 MByte/s) | 78085.94 MByte/s (±54382.07 MByte/s) | 1.02 |
| Memcpy speed - (rust) block size 1048576 | 44099.41 MByte/s (±30486.42 MByte/s) | 44176.28 MByte/s (±30556.63 MByte/s) | 1.00 |
| Memcpy speed - (rust) block size 16777216 | 29985.26 MByte/s (±24598.82 MByte/s) | 30071.23 MByte/s (±24675.89 MByte/s) | 1.00 |
| Memset speed - (rust) block size 4096 | 79984.86 MByte/s (±55500.79 MByte/s) | 78452.24 MByte/s (±54650.22 MByte/s) | 1.02 |
| Memset speed - (rust) block size 1048576 | 44316.18 MByte/s (±30633.57 MByte/s) | 44416.03 MByte/s (±30723.13 MByte/s) | 1.00 |
| Memset speed - (rust) block size 16777216 | 30760.57 MByte/s (±25048.33 MByte/s) | 30862.88 MByte/s (±25137.48 MByte/s) | 1.00 |
| alloc_benchmarks Build Time | 79.52 s | 79.90 s | 1.00 |
| alloc_benchmarks File Size | 0.94 MB | 0.94 MB | 1.01 |
| Allocations - Allocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Deallocation success | 70.05 % (±0.27 %) | 69.94 % (±0.25 %) | 1.00 |
| Allocations - Pre-fail Allocations | 100.00 % | 100.00 % | 1 |
| Allocations - Average Allocation time | 9554.74 Ticks (±154.50 Ticks) | 10148.44 Ticks (±733.14 Ticks) | 0.94 |
| Allocations - Average Allocation time (no fail) | 9554.74 Ticks (±154.50 Ticks) | 10148.44 Ticks (±733.14 Ticks) | 0.94 |
| Allocations - Average Deallocation time | 656.80 Ticks (±15.10 Ticks) | 662.38 Ticks (±35.19 Ticks) | 0.99 |
| mutex_benchmark Build Time | 81.01 s | 80.36 s | 1.01 |
| mutex_benchmark File Size | 0.99 MB | 0.98 MB | 1.01 |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 11.72 ns (±0.45 ns) | 11.50 ns (±0.50 ns) | 1.02 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 13.64 ns (±0.62 ns) | 12.88 ns (±0.71 ns) | 1.06 |

This comment was automatically generated by workflow using github-action-benchmark.

@mkroening
Member

> For now, I tested on VMs in which standard PCI device discovery was not available, so I confirmed this works.

I think Stefan is interested in some example QEMU flags for running this. :D
