docs/tutorials/features.rst (2 additions, 0 deletions)
@@ -117,6 +117,8 @@ Intel® Extension for PyTorch* has built-in quantization recipes to deliver good
Check more detailed information for `INT8 <features/int8.html>`_.
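The recipes mentioned in the hunk context above are applied through the extension's quantization API. A minimal sketch, assuming the static-quantization entry points documented on the linked INT8 page (`prepare`/`convert` and `default_static_qconfig`; exact names may differ across extension versions):

```python
import torch
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert

# Toy model and sample input; any eval-mode module works the same way.
model = torch.nn.Linear(64, 64).eval()
example_input = torch.randn(8, 64)

# Assumed API per the INT8 feature page; verify against your installed version.
qconfig = ipex.quantization.default_static_qconfig
prepared = prepare(model, qconfig, example_inputs=example_input, inplace=False)

# Calibration pass with representative data.
with torch.no_grad():
    prepared(example_input)

quantized = convert(prepared)  # INT8 model ready for inference
```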
oneDNN provides an evaluation feature called `oneDNN Graph Compiler <https://github.com/oneapi-src/oneDNN/tree/dev-graph-preview4/doc#onednn-graph-compiler>`_. Please refer to the `oneDNN build instructions <https://github.com/oneapi-src/oneDNN/blob/dev-graph-preview4/doc/build/build_options.md#build-graph-compiler>`_ to try this feature.
You need to make sure PyTorch is installed in order to get the extension working properly. For each PyTorch release, we have a corresponding release of the extension. Here are the PyTorch versions that we support and the mapping relationship:
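Once both packages are installed, a quick runtime check confirms the pairing. A minimal sketch (the `__version__` attributes are standard on both packages):

```python
import torch
import intel_extension_for_pytorch as ipex

# The extension's major.minor version should match the installed PyTorch release.
print("torch:", torch.__version__)
print("ipex :", ipex.__version__)
```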
**Note:** The wheel files released are compiled with AVX-512 instruction set support only. They cannot run on hardware platforms that do not support the AVX-512 instruction set. Please compile from source with AVX2 support in this case.
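A quick way to check for AVX-512 before installing the wheels is to read the CPU flags the Linux kernel reports; a Linux-only sketch (this inspects `/proc/cpuinfo` directly rather than using any extension API):

```python
# Linux-only: the prebuilt wheels require AVX-512, so look for a baseline flag.
with open("/proc/cpuinfo") as f:
    cpuinfo = f.read()

if "avx512f" in cpuinfo.split():
    print("CPU reports AVX-512 Foundation; the prebuilt wheels should work.")
else:
    print("No AVX-512 detected; compile from source with AVX2 support instead.")
```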
**Usage:** Download one zip file above according to your scenario, unzip it, and follow the [C++ example](./examples.html#c).
**Usage:** For versions newer than 1.11.0, download one run file above according to your scenario, run the following command to install it, and follow the [C++ example](./examples.html#c).
You can get the full usage help message by running the run file alone, as in the following command.
```
bash <libintel-ext-pt-name>.run
```
**Usage:** For versions prior to 1.11.0, download one zip file above according to your scenario, unzip it, and follow the [C++ example](./examples.html#c).
docs/tutorials/performance_tuning/known_issues.md (11 additions, 0 deletions)
@@ -1,6 +1,17 @@
Known Issues
============
- BFloat16 is currently only supported natively on platforms with the following instruction sets. The support will be expanded gradually to more platforms in future releases. (A rough, Linux-only check is sketched after the table.)
| Instruction Set | Description |
| --- | --- |
| AVX512\_CORE | Intel AVX-512 with AVX512BW, AVX512VL, and AVX512DQ extensions |
| AVX512\_CORE\_VNNI | Intel AVX-512 with Intel DL Boost |
| AVX512\_CORE\_BF16 | Intel AVX-512 with Intel DL Boost and bfloat16 support |
| AVX512\_CORE\_AMX | Intel AVX-512 with Intel DL Boost and bfloat16 support and Intel Advanced Matrix Extensions (Intel AMX) with 8-bit integer and bfloat16 support |
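As referenced in the bullet above, one rough way to see which of these tiers your machine supports is to inspect the CPU flags the Linux kernel reports. A Linux-only sketch (flag names follow kernel conventions, not an extension API, and the flag-to-tier mapping here is an approximation):

```python
# Rough Linux-only check: map /proc/cpuinfo flags to the ISA tiers in the table.
with open("/proc/cpuinfo") as f:
    flagset = set(f.read().split())  # crude tokenization; flags are whitespace-separated

tiers = {
    "AVX512_CORE":      ["avx512f", "avx512bw", "avx512vl", "avx512dq"],
    "AVX512_CORE_VNNI": ["avx512_vnni"],
    "AVX512_CORE_BF16": ["avx512_bf16"],
    "AVX512_CORE_AMX":  ["amx_tile", "amx_int8", "amx_bf16"],
}
for tier, needed in tiers.items():
    supported = all(flag in flagset for flag in needed)
    print(f"{tier}: {'supported' if supported else 'not supported'}")
```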
- INT8 performance of EfficientNet and DenseNet with Intel® Extension for PyTorch\* is slower than that of FP32.
- `omp_set_num_threads` function fails to change the number of OpenMP threads used by oneDNN operators if it was set before.
The `omp_set_num_threads` function is provided in Intel® Extension for PyTorch\* to change the number of threads used with OpenMP. However, it fails to change the number of OpenMP threads if they were already set before.
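A hypothetical illustration of the symptom using `torch.set_num_threads`, which drives the intra-op thread count in this stack (the extension's own entry point may differ):

```python
import torch

torch.set_num_threads(4)   # takes effect: set before any parallel work runs
x = torch.randn(1024, 1024)
y = x @ x                  # OpenMP/oneDNN thread pool is initialized here

torch.set_num_threads(8)   # per the issue above, this later change may not
                           # propagate to oneDNN operators
print(torch.get_num_threads())
```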