Commit 11c4dbb

Added docs for Offload Diagnostics and Controllable Fallback (#113)
Co-authored-by: etotmeni <[email protected]>
1 parent 2f3451f commit 11c4dbb

2 files changed: +22 -0 lines changed

README.md

Lines changed: 4 additions & 0 deletions
@@ -68,3 +68,7 @@ Please follow instructions in the [DEBUGGING.md](DEBUGGING.md)
## Reporting issues

Please use https://github.com/IntelPython/numba-dppy/issues to report issues and bugs.

## Features

See [INDEX.md](docs/INDEX.md) for a guide to additional features.

docs/INDEX.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
# numba-dppy

This page lists functionality implemented in numba-dppy, with detailed descriptions of some of the features.

## Offload Diagnostics

Setting the debug environment variable `NUMBA_DPPY_OFFLOAD_DIAGNOSTICS` (e.g. `export NUMBA_DPPY_OFFLOAD_DIAGNOSTICS=1`) enables parallel and offload diagnostics.

If it is set to an integer value between 1 and 4 (inclusive), diagnostic information about the parallel transforms undertaken by Numba is written to STDOUT. The higher the value, the more detailed the output.
The "Auto-offloading" section of the output names the device on which each parfor or kernel was offloaded.
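
As a rough illustration of the setting described above, the sketch below builds a child-process environment with the diagnostics level set. The helper names `diagnostics_env` and `run_with_diagnostics` are hypothetical and are not part of numba-dppy; only the variable name and the 1 to 4 range come from the text.

```python
import os
import subprocess

DIAGNOSTICS_VAR = "NUMBA_DPPY_OFFLOAD_DIAGNOSTICS"


def diagnostics_env(level, base_env=None):
    """Return a copy of the environment with the diagnostics level set.

    Hypothetical helper: it only prepares environment variables and does
    not call into numba-dppy.
    """
    # The text documents integer values from 1 to 4 (inclusive).
    if isinstance(level, bool) or not isinstance(level, int) or not 1 <= level <= 4:
        raise ValueError("diagnostics level must be an integer from 1 to 4")
    env = dict(os.environ if base_env is None else base_env)
    env[DIAGNOSTICS_VAR] = str(level)
    return env


def run_with_diagnostics(args, level):
    """Run a command with diagnostics enabled and return its STDOUT text."""
    result = subprocess.run(
        args,
        env=diagnostics_env(level),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A call such as `run_with_diagnostics([sys.executable, "my_script.py"], 4)` would then capture the most detailed report, where `my_script.py` stands in for your own numba-dppy program.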

## Controllable Fallback

By default, if a section of code cannot be offloaded to the GPU, numba-dppy automatically executes it on the CPU and prints a warning. This behavior applies only to njit functions and to the auto-offloading of numpy functions, array expressions, and prange loops.

Setting the debug environment variable `NUMBA_DPPY_FALLBACK_OPTION` (e.g. `export NUMBA_DPPY_FALLBACK_OPTION=0`) disables this fallback: the code is not automatically run on the CPU, and an error occurs instead. This helps you find out early which parts of the code do not work on the GPU, without waiting for the program to finish on the CPU when you don't need that.
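
The same switch can be set from Python instead of the shell. The sketch below is a minimal illustration: `set_cpu_fallback` is a hypothetical helper, and it assumes the variable is read when numba-dppy is imported, so it should be called before that import. Only the value `0` appears in the text, so the helper re-enables the fallback by unsetting the variable rather than guessing another value.

```python
import os

FALLBACK_VAR = "NUMBA_DPPY_FALLBACK_OPTION"


def set_cpu_fallback(enabled):
    """Toggle the CPU fallback for this process and its child processes.

    Hypothetical helper: it only edits os.environ.
    """
    if enabled:
        # Unset the variable to return to the default behavior.
        os.environ.pop(FALLBACK_VAR, None)
    else:
        # "0" is the value the text gives for turning the fallback off.
        os.environ[FALLBACK_VAR] = "0"


# Assumption: the variable is read at import time, so set it first.
set_cpu_fallback(False)
# import numba_dppy  # uncomment where the package is installed
```

With the fallback disabled this way, code that cannot be offloaded should fail with an error rather than silently running on the CPU, as described above.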
