DOC: Clarify DeviceStatsMonitor logged metrics #20895
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help, see our docs: https://lightning.ai/docs/pytorch/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Discord. Thank you for your contributions.

The actual metrics depend on the active accelerator and the ``cpu_stats`` flag.

**CPU (via `psutil`)**
`**` does not represent the hierarchy in the headers well, so let's use nested lists instead.
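For readers of this thread, the dependence on the accelerator and the ``cpu_stats`` flag noted in the diff above can be sketched as a small decision function. This is a minimal illustration based on my understanding of the documented behavior (``None`` logs CPU stats only on a CPU accelerator, ``True`` always, ``False`` never); `logs_cpu_stats` and its parameters are hypothetical names and not part of the Lightning API.

```python
from typing import Optional


def logs_cpu_stats(cpu_stats: Optional[bool], accelerator_is_cpu: bool) -> bool:
    """Hypothetical helper: decide whether psutil-based CPU stats get logged.

    Assumption: mirrors the documented meaning of the ``cpu_stats`` flag,
    not the callback's actual implementation.
    """
    if cpu_stats is None:
        # Default: CPU stats only when training on the CPU accelerator.
        return accelerator_is_cpu
    # Explicit True/False overrides the accelerator-based default.
    return cpu_stats


# Enumerate every combination of flag value and accelerator type.
for flag in (None, True, False):
    for on_cpu in (True, False):
        print(f"cpu_stats={flag!s:5} cpu_accelerator={on_cpu!s:5} -> {logs_cpu_stats(flag, on_cpu)}")
```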
What does this PR do?
This PR addresses issue #20807 by adding detailed documentation for the metrics logged by `DeviceStatsMonitor`.

The key clarifications include:
- The source of the metrics for each device type (CPU via `psutil`, CUDA GPU via `torch.cuda.memory_stats`, and other accelerators via `accelerator.get_device_stats()`).
- The naming format of the logged metrics: `DeviceStatsMonitor.{hook_name}/{base_metric_name}`.
- A pointer to `torch.cuda.memory_stats()` for the full list of memory metrics.
- An update to `profiler_basic.rst` to align with these clarifications and link to the API docs.

This documentation aims to help users understand what statistics to expect when using `DeviceStatsMonitor` with different hardware configurations.

Fixes #20807
No breaking changes are introduced by this documentation update.
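As an illustration of the `DeviceStatsMonitor.{hook_name}/{base_metric_name}` naming format described above, here is a minimal sketch in plain Python. The helper `prefix_metric_keys` and the sample values are hypothetical and written only for this example. The hook name `on_train_batch_end` and the base names `cpu_percent` and `allocated_bytes.all.current` are assumed examples of what a run might report, not an exhaustive or guaranteed list.

```python
from typing import Dict


def prefix_metric_keys(stats: Dict[str, float], hook_name: str) -> Dict[str, float]:
    """Hypothetical helper: apply the documented key format to raw device stats."""
    return {
        f"DeviceStatsMonitor.{hook_name}/{base_name}": value
        for base_name, value in stats.items()
    }


# Made-up sample values; real values come from psutil or torch.cuda.memory_stats().
raw_stats = {"cpu_percent": 12.5, "allocated_bytes.all.current": 1024.0}
logged = prefix_metric_keys(raw_stats, "on_train_batch_end")

for key, value in logged.items():
    print(key, value)
```

Grouping keys by hook name in this way lets a logger distinguish statistics captured at the start of a batch from those captured at the end.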
📚 Documentation preview 📚: https://pytorch-lightning--20895.org.readthedocs.build/en/20895/