Commit 19a6d45

Improve and update documentation
Update supported architectures and make various usage scenarios more explicit. Fix #1202
1 parent 37e5d9f commit 19a6d45

3 files changed: 27 additions, 4 deletions
README.md

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ This example outputs:
 
 ### Auto detection of the instruction set extension to be used
 
-The same computation operating on vectors and using the most performant instruction set available:
+The same computation operating on vectors and using the most performant instruction set available at compile time, based on the provided compiler flags (e.g. ``-mavx2`` for GCC and Clang to target AVX2):
 
 ```cpp
 #include <cstddef>

docs/source/index.rst

Lines changed: 20 additions & 3 deletions
@@ -12,15 +12,27 @@ C++ wrappers for SIMD intrinsics.
 Introduction
 ------------
 
-SIMD (Single Instruction, Multiple Data) is a feature of microprocessors that has been available for many years. SIMD instructions perform a single operation
+`SIMD`_ (Single Instruction, Multiple Data) is a feature of microprocessors that has been available for many years. SIMD instructions perform a single operation
 on a batch of values at once, and thus provide a way to significantly accelerate code execution. However, these instructions differ between microprocessor
 vendors and compilers.
 
 `xsimd` provides a unified means for using these features for library authors. Namely, it enables manipulation of batches of scalar and complex numbers with the same arithmetic
 operators and common mathematical functions as for single values.
 
-`xsimd` makes it easy to write a single algorithm, generate one version of the algorithm per micro-architecture and pick the best one at runtime, based on the
-running processor capability.
+There are several ways to use `xsimd`:
+
+- one can write a generic, vectorized algorithm and compile it as part of their
+  application build, with the right architecture flag;
+
+- one can write a generic, vectorized algorithm and compile several versions of
+  it by just changing the architecture flags, then pick the best version at
+  runtime;
+
+- one can write a vectorized algorithm specialized for a given architecture and
+  still benefit from the high-level abstraction proposed by `xsimd`.
+
+Of course, nothing prevents the combination of several of those approaches, but
+more about this in section :ref:`Writing vectorized code`.
 
 You can find out more about this implementation of C++ wrappers for SIMD intrinsics at the `The C++ Scientist`_. The mathematical functions are a
 lightweight implementation of the algorithms also used in `boost.SIMD`_.
@@ -52,6 +64,10 @@ The following SIMD instruction set extensions are supported:
 +--------------+---------------------------------------------------------+
 | WebAssembly  | WASM                                                    |
 +--------------+---------------------------------------------------------+
+| RISC-V       | Vector ISA                                              |
++--------------+---------------------------------------------------------+
+| PowerPC      | VSX                                                     |
++--------------+---------------------------------------------------------+
 
 Licensing
 ---------
@@ -104,6 +120,7 @@ This software is licensed under the BSD-3-Clause license. See the LICENSE file f
 
 
 
+.. _SIMD: https://fr.wikipedia.org/wiki/Single_instruction_multiple_data
 .. _The C++ Scientist: http://johanmabille.github.io/blog/archives/
 .. _boost.SIMD: https://github.com/NumScale/boost.simd
 
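
The first usage listed in the `index.rst` introduction depends on the architecture flag passed to the compiler. As a rough illustration that does not use `xsimd` itself, the sketch below reports which instruction set a translation unit was compiled for by checking a few macros that GCC and Clang predefine. The function name `compiled_instruction_set` is hypothetical, the list of macros is deliberately short, and this is not a description of how `xsimd` performs its own detection.

```cpp
#include <string>

// Hypothetical helper, not part of xsimd: names the widest instruction set
// enabled by the compiler flags, based on predefined macros (for example,
// -mavx2 makes GCC and Clang define __AVX2__). Only a few targets are listed.
inline std::string compiled_instruction_set()
{
#if defined(__AVX512F__)
    return "avx512f";
#elif defined(__AVX2__)
    return "avx2";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE2__)
    return "sse2";
#elif defined(__ARM_NEON)
    return "neon";
#else
    return "generic"; // no known SIMD macro: fall back to scalar code
#endif
}
```

Calling `compiled_instruction_set()` from a small `main` and rebuilding the same file with a different architecture flag changes the returned string, which is the compile-time selection the updated README sentence describes.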

docs/source/vectorized_code.rst

Lines changed: 6 additions & 0 deletions
@@ -69,5 +69,11 @@ as a template parameter:
 
 .. literalinclude:: ../../test/doc/explicit_use_of_an_instruction_set_mean_arch_independent.cpp
 
+Then you just need to ``#include`` that file, force instantiation for a specific
+architecture and pass the appropriate flag to the compiler. For instance:
+
+.. literalinclude:: ../../test/doc/sum_sse2.cpp
+
+
 This can be useful to implement runtime dispatching, based on the instruction set detected at runtime. `xsimd` provides a generic machinery :cpp:func:`xsimd::dispatch()` to implement
 this pattern. Based on the above example, instead of calling ``mean{}(arch, a, b, res, tag)``, one can use ``xsimd::dispatch(mean{})(a, b, res, tag)``. More about this can be found in the :ref:`Arch Dispatching` section.
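
To make the runtime dispatching pattern more concrete, here is a minimal sketch written with the standard library only. It does not reproduce `xsimd::dispatch`: the tag types, the `cpu_level` enumeration and `simple_dispatcher` are all hypothetical stand-ins, and the CPU capability is supplied by the caller instead of being detected. It only shows the shape of the idea: one functor taking the architecture as its first argument, and a wrapper that picks the tag before forwarding the remaining arguments.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical architecture tags; they only stand in for xsimd's real
// architecture types so the example needs nothing beyond the standard library.
struct generic_tag { static const char* name() { return "generic"; } };
struct sse2_tag    { static const char* name() { return "sse2"; } };
struct avx2_tag    { static const char* name() { return "avx2"; } };

// Capability of the running CPU. A real program would query the hardware;
// here the caller supplies it (an assumption made to keep the sketch simple).
enum class cpu_level { generic, sse2, avx2 };

// One functor, one instantiation per architecture tag. In a real build each
// instantiation would live in a file compiled with the matching flag.
struct sum
{
    template <class Arch>
    float operator()(Arch, const std::vector<float>& values, std::string& used) const
    {
        used = Arch::name(); // record which version ran
        float total = 0.0f;
        for (float v : values)
            total += v;
        return total;
    }
};

// Hand-rolled dispatcher: prepends the best supported tag to the call.
template <class F>
struct simple_dispatcher
{
    F functor;
    cpu_level level;

    template <class... Args>
    auto operator()(Args&&... args) const
        -> decltype(functor(generic_tag {}, std::forward<Args>(args)...))
    {
        if (level == cpu_level::avx2)
            return functor(avx2_tag {}, std::forward<Args>(args)...);
        if (level == cpu_level::sse2)
            return functor(sse2_tag {}, std::forward<Args>(args)...);
        return functor(generic_tag {}, std::forward<Args>(args)...);
    }
};
```

With this sketch, `simple_dispatcher<sum>{sum{}, cpu_level::avx2}(values, used)` plays the role that ``xsimd::dispatch(mean{})(a, b, res, tag)`` plays in the text above: the caller no longer passes the architecture explicitly.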
