
Commit 8bd78d6

ZhaoqiongZ and tye1 authored

Doc content finetune (#4215)

Update known issues, profiler, and torch.compile doc contents.

Co-authored-by: Ye Ting <[email protected]>

1 parent e8317aa commit 8bd78d6

File tree

13 files changed: +173 −192 lines changed

docs/index.rst

Lines changed: 1 addition & 1 deletion

@@ -26,7 +26,7 @@ Intel® Extension for PyTorch* has been released as an open-source project at

  You can find more information about the product at:

- - `Features <https://intel.github.io/intel-extension-for-pytorch/gpu/latest/tutorials/features>`_
+ - `Features <https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/features>`_
  - `Performance <https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/performance>`_

  Architecture

docs/tutorials/features.rst

Lines changed: 4 additions & 18 deletions

@@ -137,19 +137,6 @@ For more detailed information, check `torch.compile for GPU <features/torch_comp

     features/torch_compile_gpu

- Legacy Profiler Tool (Prototype)
- --------------------------------
-
- The legacy profiler tool is an extension of PyTorch* legacy profiler for profiling operators' overhead on XPU devices. With this tool, you can get the information in many fields of the run models or code scripts. Build Intel® Extension for PyTorch* with profiler support as default and enable this tool by adding a `with` statement before the code segment.
-
- For more detailed information, check `Legacy Profiler Tool <features/profiler_legacy.md>`_.
-
- .. toctree::
-    :hidden:
-    :maxdepth: 1
-
-    features/profiler_legacy

  Simple Trace Tool (Prototype)
  -----------------------------

@@ -191,14 +178,13 @@ For more detailed information, check `Compute Engine <features/compute_engine.md

     features/compute_engine

+ ``IPEX_LOGGING`` (Prototype feature for debug)
+ ----------------------------------------------

- IPEX LOG (Prototype)
- --------------------
-
- IPEX_LOGGING provides the capacity to log IPEX internal information. If you would like to use torch-style log, that is, the log/verbose is introduced by torch, and refer to cuda code, pls still use torch macro to show the log. For example, TORCH_CHECK, TORCH_ERROR. If the log is IPEX specific, or is going to trace IPEX execution, pls use IPEX_LOGGING. For some part of usage are still discussed with habana side, if has change some feature will update here.
+ ``IPEX_LOGGING`` provides the capability to log verbose information from Intel® Extension for PyTorch\*. Please use ``IPEX_LOGGING`` to get the log information or trace the execution from Intel® Extension for PyTorch\*. Please continue using PyTorch\* macros such as ``TORCH_CHECK``, ``TORCH_ERROR``, etc. to get the log information from PyTorch\*.

- For more detailed information, check `IPEX LOG <features/ipex_log.md>`_.
+ For more detailed information, check `IPEX_LOGGING <features/ipex_log.md>`_.

  .. toctree::
     :hidden:

docs/tutorials/features/ipex_log.md

Lines changed: 46 additions & 43 deletions

@@ -1,54 +1,57 @@
- IPEX Logging usage
- ===============================================
- <style>
- table {
-     margin: auto;
- }
-
- </style>
+ `IPEX_LOGGING` (Prototype)
+ ==========================

  ## Introduction

- IPEX_LOGGING provides the capacity to log IPEX internal information. If you would like to use torch-style log, that is, the log/verbose is introduced by torch, and refer to cuda code, pls still use torch macro to show the log. For example, TORCH_CHECK, TORCH_ERROR. If the log is IPEX specific, or is going to trace IPEX execution, pls use IPEX_LOGGING. For some part of usage are still discussed with habana side, if has change some feature will update here.
+ `IPEX_LOGGING` provides the capability to log verbose information from Intel® Extension for PyTorch\*. Please use `IPEX_LOGGING` to get the log information or trace the execution from Intel® Extension for PyTorch\*. Please continue using PyTorch\* macros such as `TORCH_CHECK`, `TORCH_ERROR`, etc. to get the log information from PyTorch\*.

- ## Feature for IPEX Log
- ### Log level
- Currently supported log level and usage are as follow, default using log level is `WARN`:
+ ## `IPEX_LOGGING` Definition
+ ### Log Level
+ The supported log levels are defined as follows; the default log level is `DISABLED`:

  | log level | number | usage |
  | :----: | :----: | :----: |
- | TRACE | 0 | reserve it for further usage extension|
- | DEBUG | 1 | We would like to insert DEBUG inside each host function, when log level is debug, we can get the whole calling stack |
- | INFO | 2 | Record calls to other library functions and environment variable settings, such as onemkl calling and set verbose level|
- | WARN | 3 | On the second attempt of the program, such as memory reallocation |
- | ERR | 4 | Found error in try catch |
- | CRITICAL | 5 | reserve it for further usage extension |
+ | DISABLED | -1 | Disable the logging |
+ | TRACE | 0 | Reserved for further usage |
+ | DEBUG | 1 | Provide the whole calling stack info |
+ | INFO | 2 | Record calling info to other library functions and environment variable settings |
+ | WARN | 3 | Warn on the second attempt of an action, such as memory reallocation |
+ | ERR | 4 | Report error in try catch |
+ | CRITICAL | 5 | Reserved for further usage |

- ### Log component
- Log component is for specify which part of IPEX does this log belongs to, currently we have seprate IPEX into four parts, shown as table below.
+ ### Log Component
+ Log component is used to specify which part of Intel® Extension for PyTorch\* this log information belongs to. The supported log components are defined as follows:

  | log component | description |
  | :----: | :----: |
- | OPS | Intergrate/Launch sycl onednn, onemkl ops |
- | SYNGRAPH | Habana Syngraph related |
+ | OPS | Launch SYCL, oneDNN, oneMKL operators |
+ | SYNGRAPH | Syngraph related |
  | MEMORY | Allocate/Free memory, Allocate/Free cache |
  | RUNTIME | Device / Queue related |
+ | ALL | All output log |
+
+ ## Usage in C++
+ All the usages are defined in `utils/LogUtils.h`. Currently Intel® Extension for PyTorch\* supports:

- For `SYNGRAPH` you can also add log sub componment which is no restriction on categories.
+ ### Simple Log
+ You can use `IPEX_XXX_LOG`, where XXX represents the log level as mentioned above. There are four parameters defined for simple log:
+ - Log component, representing which part of Intel® Extension for PyTorch\* this log belongs to.
+ - Log sub component, input an empty string ("") for general usages. For `SYNGRAPH` you can add any log sub component.
+ - Log message template format string.
+ - Log name.

+ Below is an example of using simple log inside the abs kernel:

- ## How to add log in IPEX
- All the usage are inside file `utils/LogUtils.h`. IPEX Log support two types of log usage, the first one is simple log, you can use IPEX_XXX_LOG, XXX represents the log level, including six log level mentioned above. There are for params for simple log, the first one is log component, representing which part of IPEX does this log belongs to. The second on is log sub component, for most of the IPEX usage, just input a empty string("") here. For the third param, it is an log message template, you can use it as python format string, or you can also refer to fmt lib https://github.com/fmtlib/fmt. Here is an example for simple log, for add a log inside abs kernel:
  ``` c++
  IPEX_INFO_LOG("OPS", "", "Add a log for inside ops {}", "abs");
  ```
+ ### Event Log
+ Event log is used for recording a whole event, such as an operator calculation. The whole event is identified by a unique `event_id`. You can also mark each step by using `step_id`. Use `IPEX_XXX_EVENT_END()` to complete the logging of the whole event.

- For the second log level is event log, which is used for recording a whole event, such as a ops calculation. A whole event is identify through a event_id, for the whole event this event_id should be the same, but cannot be duplicate with other event_id, or it will met an undefined behaviour. You can also mark each step by using a step_id, there are no limitation for step_id. For the end of the whole event, should use IPEX_XXX_EVENT_END(), XXX is the log level mention aboved.
+ Below is an example of using event log:
- Below is an example for ipex event log:
  ```c++
  IPEX_EVENT_END("OPS", "", "record_avg_pool", "start", "Here record the time start with arg:{}", arg);
  prepare_data();
@@ -58,23 +61,23 @@ IPEX_INFO_EVENT_END("OPS", "", "record_avg_pool", "finish conv", "Here record th
  ```

  ## Environment settings
- IPEX privide some enviornment setting used for log output settings, currently, we support below five settings.
+ Intel® Extension for PyTorch\* provides five environment variables for configuring log output:

- 1. IPEX_LOGGING_LEVEL, accept int or string, default is 3 for `WARN`. Currently you can choose seven different log level within ipex, including -1 `DISABLED` under this setting, all the usage related with IPEX_LOGGING will be disabled. Another six log levels are we mentioned above.
- 2. IPEX_LOG_COMPONENT, accept a string, sepreated by `/` first part is log component and the second part is for log sub component, used for state which component and sub log component you would like to log, default is "ALL". Currently we supoort 5 different log component,, `ALL` is for all the output in IPEX, the other four log component are we mentioned above. You could also specify several log component, sepreating using `,` such as "OPS;MEMORY".For log sub component, it still discussed with habana side, pls don't use it first.
- 3. IPEX_LOG_OUTPUT, accept a string. If you are using IPEX_LOG_OUTPUT, than all the logs will log inside the file rather than log into the console, you can use it like export IPEX_LOG_OUTPUT="./ipex.log", all the log will log inside ipex.log in current work folder.
- 4. IPEX_LOG_ROTATE_SIZE, accept a int, default =10. Only validate when export IPEX_LOG_OUTPUT, specifing how large file will be used when rotating this log, size is MB.
- 5. IPEX_LOG_SPLIT_SIZE, accept a int, default = null. Only validate when export IPEX_LOG_OUTPUT, specifing how large file will be used when split this log, size is MB.
+ - `IPEX_LOGGING_LEVEL`, accepts an integer or string, default is -1 for `DISABLED`.
+ - `IPEX_LOG_COMPONENT`, accepts a string, used for specifying the log component and sub log component you would like to log; default is "ALL". The log component and sub log component are separated by `/`. You could also specify several log components, such as "OPS;MEMORY".
+ - `IPEX_LOG_OUTPUT`, accepts a string. If `IPEX_LOG_OUTPUT` is set, all the logs will be recorded in a file rather than on the console. Example: `export IPEX_LOG_OUTPUT="./ipex.log"`.
+ - `IPEX_LOG_ROTATE_SIZE`, accepts an integer, default is 10. Can be used only with `IPEX_LOG_OUTPUT`; specifies the file size in MB at which the log is rotated.
+ - `IPEX_LOG_SPLIT_SIZE`, accepts an integer, default is null. Can be used only with `IPEX_LOG_OUTPUT`; specifies the file size in MB at which the logs are split.

  ## Usage in Python
- 1. torch.xpu.set_log_level(log_level) and torch.xpu.get_log_level(), these two functions are used for get and set the log level.
- 2. torch.xpu.set_log_output_file_path(log_path) and torch.xpu.get_log_output_file_path(), these two functions are used for get and set the log output file path, once log output file path is set, logs will not be print on the console, will only output in the file.
- 3. torch.xpu.set_log_rotate_file_size(file size) and torch.xpu.get_log_rotate_file_size(), these two functions are used for get and set the log rotate file size, only validate when output file path is set.
- 4. torch.xpu.set_log_split_file_size(file size) and torch.xpu.get_log_split_file_size(), these two functions are used for get and set the log split file size, only validate when output file path is set.
- 5. torch.xpu.set_log_component(log_component), and torch.xpu.get_log_component(), these two functions are used for get and set the log component, log component string are same with enviornment settings.
+ - `torch.xpu.set_log_level(log_level)` and `torch.xpu.get_log_level()`: set and get the log level.
+ - `torch.xpu.set_log_output_file_path(log_path)` and `torch.xpu.get_log_output_file_path()`: set and get the log output file path; once the log output file path is set, logs will be recorded in the file only.
+ - `torch.xpu.set_log_rotate_file_size(file_size)` and `torch.xpu.get_log_rotate_file_size()`: set and get the log rotate file size. Effective only when the output file path is set.
+ - `torch.xpu.set_log_split_file_size(file_size)` and `torch.xpu.get_log_split_file_size()`: set and get the log split file size. Effective only when the output file path is set.
+ - `torch.xpu.set_log_component(log_component)` and `torch.xpu.get_log_component()`: set and get the log component. The log component strings are the same as defined in the environment settings.

- ## Use IPEX log for simple trace
- For now, IPEX_SIMPLE_TRACE is depre deprecated, and pls use torch.xpu.set_log_level(0), it will show logs like previous IPEX_SIMPLE_TRACE.
+ ## Replace `IPEX_SIMPLE_TRACE`
+ Use `torch.xpu.set_log_level(0)` to get the logs previously produced by `IPEX_SIMPLE_TRACE`.

- ## Use IPEX log for verbose
- For now, IPEX_VERBOSE is deprecated, pls use torch.xpu.set_log_level(1), it will show logs like previous IPEX_VERBOSE.
+ ## Replace `IPEX_VERBOSE`
+ Use `torch.xpu.set_log_level(1)` to get the logs previously produced by `IPEX_VERBOSE`.
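Editor's note: to tie the Python APIs and environment variables documented in this file together, here is a minimal sketch, assuming the `torch.xpu` logging helpers above are available in your installed build; exact return types and behavior may vary by version.

```python
import torch
import intel_extension_for_pytorch  # noqa: F401  # registers the torch.xpu backend

# Roughly equivalent to IPEX_LOGGING_LEVEL=2, IPEX_LOG_COMPONENT="OPS",
# IPEX_LOG_OUTPUT="./ipex.log", IPEX_LOG_ROTATE_SIZE=10 (sizes in MB).
torch.xpu.set_log_level(2)                        # 2 = INFO, per the log level table
torch.xpu.set_log_component("OPS")                # or "ALL", or "OPS;MEMORY"
torch.xpu.set_log_output_file_path("./ipex.log")  # logs now go to the file only
torch.xpu.set_log_rotate_file_size(10)            # effective once the file path is set

x = torch.randn(1024, device="xpu")
y = torch.abs(x)  # OPS-component log lines should land in ./ipex.log
```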

docs/tutorials/features/profiler_kineto.md

Lines changed: 10 additions & 0 deletions

@@ -168,3 +168,13 @@ prof.export_chrome_trace("trace_file.json")

  You can examine the sequence of profiled operators, runtime functions and XPU kernels in these trace viewers. Here is a trace result for a ResNet50 run on the XPU backend, viewed in the Perfetto viewer:

  ![profiler_kineto_result_perfetto_viewer](../../images/profiler_kineto/profiler_kineto_result_perfetto_viewer.png)
+
+ ## Known issues
+
+ When using the Kineto profiler based on oneTrace, you may be unable to collect profiling information for XPU kernels and device memory operations due to failures in creating the tracers. If you meet such failures, where a tracer or collector cannot be created successfully, please try the following workaround.
+
+ ```bash
+ export ZE_ENABLE_TRACING_LAYER=1
+ ```
+
+ > Note that this environment variable should be set globally before running any user-level applications.
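Editor's note: for reference, the workaround fits into a profiling run like the minimal sketch below. It assumes `ProfilerActivity.XPU` is available in your build (as used elsewhere in this file) and that `ZE_ENABLE_TRACING_LAYER=1` was exported in the shell before launching Python if tracer creation fails.

```python
import torch
import intel_extension_for_pytorch  # noqa: F401
from torch.profiler import profile, ProfilerActivity

# Assumes an XPU device is present; export ZE_ENABLE_TRACING_LAYER=1 in the
# shell first if the XPU tracer or collector fails to initialize.
model = torch.nn.Linear(128, 128).to("xpu")
data = torch.randn(64, 128, device="xpu")

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.XPU]) as prof:
    model(data)

print(prof.key_averages().table(sort_by="xpu_time_total"))
prof.export_chrome_trace("trace_file.json")  # viewable in Perfetto or chrome://tracing
```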
docs/tutorials/features/profiler_legacy.md

Lines changed: 2 additions & 89 deletions

@@ -1,93 +1,6 @@
- Legacy Profiler Tool (Prototype)
+ Legacy Profiler Tool (Deprecated)
  ================================

  ## Introduction

- The legacy profiler tool is an extension of PyTorch\* legacy profiler for profiling operators' overhead on XPU devices. With this tool, users can get the information in many fields of the run models or code scripts. User should build Intel® Extension for PyTorch\* with profiler support as default and enable this tool by a `with` statement before the code segment.
-
- ## Use Case
-
- To use the legacy profiler tool, you need to build Intel® Extension for PyTorch\* from source or install it via prebuilt wheel. You also have various methods to disable this tool.
-
- ### Build Tool
-
- The build option `BUILD_PROFILER` is switched on as default but you can switch it off via setting `BUILD_PROFILER=OFF` while building Intel® Extension for PyTorch\* from source. With `BUILD_PROFILER=OFF`, no profiler code will be compiled and all python scripts using profiler with XPU support will raise a runtime error to user.
-
- ```bash
- [BUILD_PROFILER=ON] python setup.py install # build from source with profiler tool
- BUILD_PROFILER=OFF python setup.py install # build from source without profiler tool
- ```
-
- ### Use Tool
-
- In your model script, write `with` statement to enable the legacy profiler tool ahead of your code snippets, as shown in the following example:
-
- ```python
- # import all necessary libraries
- import torch
- import intel_extension_for_pytorch
-
- # these lines won't be profiled before enabling profiler tool
- input_tensor = torch.randn(1024, dtype=torch.float32, device='xpu:0')
-
- # enable legacy profiler tool with a `with` statement
- with torch.autograd.profiler_legacy.profile(use_xpu=True) as prof:
-     # do what you want to profile here after the `with` statement with proper indent
-     output_tensor_1 = torch.nonzero(input_tensor)
-     output_tensor_2 = torch.unique(input_tensor)
-
- # print the result table formatted by the legacy profiler tool as your wish
- print(prof.key_averages().table())
- ```
-
- There are a number of useful parameters defined in `torch.autograd.profiler_legacy.profile()`. Many of them are aligned with usages defined in PyTorch\*'s official profiler, such as `record_shapes`, a very useful parameter to control whether to record the shape of input tensors for each operator. To enable legacy profiler on XPU devices, pass `use_xpu=True`. For the usage of more parameters, please refer to [PyTorch\*'s tutorial page](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html).
-
- ### Disable Tool in Model Script
-
- To disable the legacy profiler tool temporarily in your model script, pass `enabled=False` to `torch.autograd.profiler_legacy.profile()`:
-
- ```python
- with torch.autograd.profiler_legacy.profile(enabled=False, use_xpu=True) as prof:
-     # as `enabled` is set to false, the profiler won't work on these lines of code
-     output_tensor_1 = torch.nonzero(input_tensor)
-     output_tensor_2 = torch.unique(input_tensor)
-
- # This print will raise an error to user as the profiler was disabled
- print(prof.key_averages().table())
- ```
-
- ### Results
-
- Using the script shown above in **Use Tool** part, you'll see the result table printed out to the console as below:
-
- ![Legacy_profiler_result_1](../../images/profiler_legacy/Legacy_profiler_result_1.png)
-
- In this result, you can find several fields like:
-
- - `Name`: the name of run operators
- - `Self CPU %`, `Self CPU`: the time consumed by the operator itself at host excluded its children operator call. The column marked with percentage sign shows the propotion of time to total self cpu time. While an operator calls more than once in a run, the self cpu time may increase in this field.
- - `CPU total %`, `CPU total`: the time consumed by the operator at host included its children operator call. The column marked with percentasge sign shows the propotion of time to total cpu time. While an operator calls more than once in a run, the cpu time may increase in this field.
- - `CPU time avg`: the average time consumed by each once call of the operator at host. This average is calculated on the cpu total time.
- - `Self XPU`, `Self XPU %`: similar to `Self CPU (%)` but shows the time consumption on XPU devices.
- - `XPU total`: similar to `CPU total` but shows the time consumption on XPU devices.
- - `XPU time avg`: similar to `CPU time avg` but shows average time sonsumption on XPU devices. This average is calculated on the XPU total time.
- - `# of Calls`: number of call for each operators in a run.
-
- You can print result table in different styles, such as sort all called operators in reverse order via `print(prof.table(sort_by='id'))` like:
-
- ![Legacy_profiler_result_2](../../images/profiler_legacy/Legacy_profiler_result_2.png)
-
- ### Export to Chrome Trace
-
- You can export the result to a json file and then load it in the Chrome trace viewer (`chrome://tracing`) by add this line in your model script:
-
- ```python
- prof.export_chrome_trace("trace_file.json")
- ```
-
- In Chrome trace viewer, you may find the result shows like:
-
- ![Legacy_profiler_result_3](../../images/profiler_legacy/Legacy_profiler_result_3.png)
-
- For more example results, please refer to [PyTorch\*'s tutorial page](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html).
+ The legacy profiler tool will be deprecated from Intel® Extension for PyTorch* very soon. Please use the [Kineto Supported Profiler Tool](./profiler_kineto.md) instead for profiling operators' execution time cost on Intel® GPU devices.
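Editor's note: since this diff deletes the full usage walkthrough, a short migration sketch may help readers with old scripts. This is an assumption-laden example, not the project's documented migration path: it presumes `ProfilerActivity.XPU` is available in your build, mirroring the deleted legacy example above.

```python
import torch
import intel_extension_for_pytorch  # noqa: F401
from torch.profiler import profile, ProfilerActivity

input_tensor = torch.randn(1024, dtype=torch.float32, device='xpu:0')

# Before (deprecated):
#   with torch.autograd.profiler_legacy.profile(use_xpu=True) as prof:
#       ...

# After, with the Kineto-supported profiler:
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.XPU]) as prof:
    output_tensor_1 = torch.nonzero(input_tensor)
    output_tensor_2 = torch.unique(input_tensor)

print(prof.key_averages().table())
```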
