Improve efficiency of resource usage monitor#59
Merged
Conversation
…llichore into feature/resource-usage-lean
samclark2015
approved these changes
Jan 22, 2026
samclark2015
left a comment
There was a problem hiding this comment.
LGTM! Code changes look great, with special attention paid to the unsafe blocks. One nit in the MacOS process tree implementation. Tested locally on an Ubuntu VM and all looked well
crates/kcserver/src/process_tree.rs
Outdated
Comment on lines
+99
to
+117
| let count = unsafe { proc_listchildpids(pid as libc::c_int, std::ptr::null_mut(), 0) }; | ||
|
|
||
| if count <= 0 { | ||
| return Vec::new(); | ||
| } | ||
|
|
||
| // Allocate buffer for PIDs | ||
| let buffer_size = count as usize; | ||
| let mut buffer: Vec<libc::c_int> = vec![0; buffer_size]; | ||
|
|
||
| // SAFETY: We've allocated a buffer of sufficient size (as returned by the first call). | ||
| // proc_listchildpids writes at most buffersize bytes to the buffer. | ||
| let result = unsafe { | ||
| proc_listchildpids( | ||
| pid as libc::c_int, | ||
| buffer.as_mut_ptr(), | ||
| (buffer_size * size_of::<libc::c_int>()) as libc::c_int, | ||
| ) | ||
| }; |
There was a problem hiding this comment.
Nit: proc_listchildpids returns the number of bytes required, but is being used as count. This is functional, but overallocates memory.
samclark2015
approved these changes
Jan 23, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This change significantly improves the efficiency of resource usage monitoring for kernel sessions by reducing how much work the monitor does on each tick.
Previously, the resource monitor called
refresh_processes(ProcessesToUpdate::All)every tick, which scans the entire process table on the system. This is wasteful when we only care about monitoring a handful of kernel processes and their children, and can also contribute to runaway resource usage when the process table becomes pathologically large (as happened recently in CI).The new implementation introduces OS-specific optimizations that directly query only the processes we need. On macOS, we use the
proc_listchildpids()API to efficiently enumerate child processes. On Linux, we scan/procentries but limit ourselves to processes in the same process group as the kernel, and we cache the results so we don't rescan on every tick. Windows uses a similar caching strategy with periodic refreshes.Unfortunately,
sysinfohas a bug on Linux in which it does not compute CPU usage percentages when you ask it to refresh only a subset of processes, so additional work is needed; the change adds a custom CPU tracker inproc_stat.rsthat reads CPU times directly from/proc/[pid]/statand computes usage percentages manually.The branch also adds an optimization to skip resource collection entirely when no clients are connected to any session, since there's no one listening for the resource update messages anyway. This should reduce unnecessary CPU overhead when Kallichore is running but the user doesn't have a browser/window connected.
Finally, there's a new integration test in
resource_usage_test.rsthat verifies resource usage data is being populated correctly through both the HTTP API and WebSocket streams.