README.md: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ This example outputs:

 ### Auto detection of the instruction set extension to be used

-The same computation operating on vectors and using the most performant instruction set available:
+The same computation operating on vectors and using the most performant instruction set available at compile time, based on the provided compiler flags (e.g. ``-mavx2`` for GCC and Clang to target AVX2):
docs/source/index.rst: 20 additions & 3 deletions
@@ -12,15 +12,27 @@ C++ wrappers for SIMD intrinsics.

 Introduction
 ------------

-SIMD (Single Instruction, Multiple Data) is a feature of microprocessors that has been available for many years. SIMD instructions perform a single operation
+`SIMD`_ (Single Instruction, Multiple Data) is a feature of microprocessors that has been available for many years. SIMD instructions perform a single operation
 on a batch of values at once, and thus provide a way to significantly accelerate code execution. However, these instructions differ between microprocessor
 vendors and compilers.

 `xsimd` provides a unified means for using these features for library authors. Namely, it enables manipulation of batches of scalar and complex numbers with the same arithmetic
 operators and common mathematical functions as for single values.

-`xsimd` makes it easy to write a single algorithm, generate one version of the algorithm per micro-architecture and pick the best one at runtime, based on the
-running processor capability.
+There are several ways to use `xsimd`:
+
+- one can write a generic, vectorized algorithm and compile it as part of their
+  application build, with the right architecture flag;
+
+- one can write a generic, vectorized algorithm and compile several versions of
+  it by just changing the architecture flags, then pick the best version at
+  runtime;
+
+- one can write a vectorized algorithm specialized for a given architecture and
+  still benefit from the high-level abstraction proposed by `xsimd`.
+
+Of course, nothing prevents combining several of those approaches, but
+more about this in section :ref:`Writing vectorized code`.

 You can find out more about this implementation of C++ wrappers for SIMD intrinsics at `The C++ Scientist`_. The mathematical functions are a
 lightweight implementation of the algorithms also used in `boost.SIMD`_.
@@ -52,6 +64,10 @@ The following SIMD instruction set extensions are supported:

+Then you just need to ``#include`` that file, force instantiation for a specific
+architecture and pass the appropriate flag to the compiler. For instance:
+
+.. literalinclude:: ../../test/doc/sum_sse2.cpp
+
 This can be useful to implement dispatching based on the instruction set detected at runtime. `xsimd` provides a generic machinery :cpp:func:`xsimd::dispatch()` to implement
 this pattern. Based on the above example, instead of calling ``mean{}(arch, a, b, res, tag)``, one can use ``xsimd::dispatch(mean{})(a, b, res, tag)``. More about this can be found in the :ref:`Arch Dispatching` section.