Commit f67f3c4

committed
Misc. update
1 parent 6768a47 commit f67f3c4

13 files changed

+1171
-132
lines changed

01-preface.rst

Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -100,15 +100,15 @@ If you want to contribute to this book, you can:
100100
Publishing
101101
++++++++++
102102

103-
If you're an editor interested in publishing this book, you can contact me if
104-
you agree to have this version and all subsequent versions open access
105-
(i.e. online), you know how to deal with `restructured text
106-
<http://docutils.sourceforge.net/rst.html>`_ (Word is not an option), you
107-
provide a real added-value as well as supporting services, and more
108-
importantly, you have a truly amazing latex book template (and be warned that
109-
I'm a bit picky about typography & design: E.Tufte is my hero).
110-
111-
Still here?
103+
If you're an editor interested in publishing this book, you can `contact me
104+
<mailto:[email protected]>`_ if you agree to have this version and all
105+
subsequent versions open access (i.e. online at `this address
106+
<http://www.labri.fr/perso/nrougier/from-python-to-numpy>`_), you know how to
107+
deal with `reStructuredText <http://docutils.sourceforge.net/rst.html>`_ (Word
108+
is not an option), you provide real added value as well as supporting
109+
services, and more importantly, you have a truly amazing LaTeX book template
110+
(and be warned that I'm a bit picky about typography & design: `Edward Tufte
111+
<https://www.edwardtufte.com/tufte/>`_ is my hero). Still here?
112112

113113

114114
License

02-introduction.rst

Lines changed: 65 additions & 2 deletions
@@ -1,5 +1,68 @@
11
Introduction
22
===============================================================================
33

4-
.. contents:: **Contents**
5-
:local:
4+
Numpy is all about vectorization.
5+
6+
If you are familiar with Python, this is the main difficulty you'll face
7+
because it requires you to change your way of thinking, and your new friends
8+
are named vectors, arrays, views and ufuncs.
9+
10+
Let's take a very simple example: a random walk. One possible object-oriented
11+
approach would be to define a `RandomWalker` class with a `walk` method that
12+
returns the current position after each (random) step. It's nice, but it is
13+
slow:
14+
15+
**Object oriented approach**
16+
17+
.. code:: python
18+
19+
   class RandomWalker:
20+
       def __init__(self):
21+
           self.steps = []
22+
           self.position = 0
23+
24+
       def walk(self, n):
25+
           yield self.position
26+
           for i in range(n):
27+
               step = 2*random.randint(0, 1) - 1
28+
               self.position += step
29+
               yield self.position
30+
31+
   walker = RandomWalker()
32+
   walk = []
33+
   for position in walker.walk(1000):
34+
       walk.append(position)
35+
36+
37+
38+
**Functional approach**
39+
40+
For such a simple problem, we can probably skip the class definition and
41+
concentrate only on the walk method, which computes the successive positions
42+
after each random step.
43+
44+
.. code:: python
45+
46+
   def random_walk(n):
47+
       position = 0
48+
       walk = [position]
49+
       for i in range(n):
50+
           step = 2*random.randint(0, 1)-1
51+
           position += step
52+
           walk.append(position)
53+
       return walk
54+
55+
   walk = random_walk(1000)
56+
57+
**Vectorized approach**
58+
59+
But we can simplify things further by considering a random walk to be
60+
composed of a number of steps, the corresponding positions being the
61+
cumulative sum of these steps.
62+
63+
.. code:: python
64+
65+
   steps = 2*np.random.randint(0, 2, size=n) - 1
66+
   walk = np.cumsum(steps)
67+
68+
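The snippets in this file omit their imports and leave `n` undefined, so here is a minimal, self-contained sketch that runs the functional and the vectorized versions side by side. The name `random_walk_vectorized` is ours, chosen for illustration only. Note one difference that is easy to miss: the pure Python version records the starting position 0, while the cumulative sum starts at the position reached after the first step.

```python
import random
import numpy as np

def random_walk(n):
    """Pure Python walk: n steps of +1 or -1, starting from position 0."""
    position = 0
    walk = [position]
    for _ in range(n):
        position += 2 * random.randint(0, 1) - 1
        walk.append(position)
    return walk

def random_walk_vectorized(n):
    """NumPy walk (hypothetical helper): draw all n steps at once, then accumulate."""
    steps = 2 * np.random.randint(0, 2, size=n) - 1  # each step is -1 or +1
    return np.cumsum(steps)

n = 1000
walk_py = random_walk(n)             # n + 1 positions (includes the start, 0)
walk_np = random_walk_vectorized(n)  # n positions (start not included)
print(len(walk_py), walk_np.shape)
```

If the two results need to be compared element by element, prepending a 0 to the NumPy result (for instance with `np.concatenate`) would align them.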

03-anatomy.rst

Lines changed: 84 additions & 81 deletions
@@ -3,115 +3,118 @@ Anatomy of an array
33

44
.. contents:: **Contents**
55
:local:
6+
7+
|WIP|
68

7-
Data type
8-
---------
9+
..
10+
Data type
11+
---------
12+
13+
Memory layout
14+
-------------
15+
16+
View and copy
17+
-------------
18+
19+
Let's consider two vectors `Z1` and `Z2`. We would like to know if `Z2` is a
20+
view of `Z1` and, if so, which view it is. Let's consider a simple example:
21+
22+
.. code-block::
923
10-
Memory layout
11-
-------------
24+
>>> Z1 = np.arange(10)
25+
>>> Z2 = Z1[1:-1:2]
1226
13-
View and copy
14-
-------------
27+
.. code-block::
28+
:class: output
1529
16-
Let's consider two vectors `Z1` and `Z2`. We would like to know if `Z2` is a
17-
view of `Z1` and if yes, what is this view ? Let's consider a simple example:
30+
╌╌╌┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬╌╌
31+
Z1 │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │
32+
╌╌╌┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴╌╌
33+
╌╌╌╌╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌╌╌╌╌╌╌╌
34+
Z2     │ 1 │   │ 3 │   │ 5 │   │ 7 │
35+
╌╌╌╌╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌╌╌╌╌╌╌╌
1836
19-
.. code-block::
37+
The first test is to check whether `Z1` is the base of `Z2`:
2038

21-
>>> Z1 = np.arange(10)
22-
>>> Z2 = Z1[1:-1:2]
39+
.. code-block::
2340
24-
.. code-block::
25-
:class: output
41+
>>> print(Z2.base is Z1)
42+
True
2643
27-
╌╌╌┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬╌╌
28-
Z1 │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │
29-
╌╌╌┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴╌╌
30-
╌╌╌╌╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌╌╌╌╌╌╌╌
31-
Z2     │ 1 │   │ 3 │   │ 5 │   │ 7 │
32-
╌╌╌╌╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌╌╌╌╌╌╌╌
44+
At this point, we know `Z2` is a view of `Z1`, meaning `Z2` can be expressed as
45+
`Z1[start:stop:step]`. The difficulty now is to find `start`, `stop` and
46+
`step`. For the `step`, we can use the `strides` property of any array that
47+
gives the number of bytes to go from one element to the other in each
48+
dimension. In our case, and because both arrays are one-dimensional, we can
49+
directly compare the first stride only:
3350

34-
First test is to check whether `Z1` is the base of `Z2`
51+
.. code-block::
3552
36-
.. code-block::
53+
>>> step = Z2.strides[0] // Z1.strides[0]
54+
>>> print(step)
55+
2
3756
38-
>>> print(Z2.base is Z1)
39-
True
57+
The next difficulty is to find the `start` and `stop` indices. To do this, we
58+
can take advantage of the `byte_bounds` function, which returns pointers to
59+
the end-points of an array.
4060

41-
At this point, we know `Z2` is a view of `Z1`, meaning `Z2` can be expressed as
42-
`Z1[start:stop:step]`. The difficulty now is to find `start`, `stop` and
43-
`step`. For the `step`, we can use the `strides` property of any array that
44-
gives the number of bytes to go from one element to the other in each
45-
dimension. In our case, and because both arrays are one-dimensional, we can
46-
directly compare the first stride only:
61+
.. code-block::
62+
:class: output
4763
48-
.. code-block::
64+
byte_bounds(Z1)[0]                byte_bounds(Z1)[-1]
65+
   ↓                                       ↓
66+
╌╌╌┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬╌╌
67+
Z1 │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │
68+
╌╌╌┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴╌╌
4969
50-
>>> step = Z2.strides[0] // Z1.strides[0]
51-
>>> print(step)
52-
2
70+
byte_bounds(Z2)[0]        byte_bounds(Z2)[-1]
71+
       ↓                           ↓
72+
╌╌╌╌╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌╌╌╌╌╌╌╌
73+
Z2     │ 1 │   │ 3 │   │ 5 │   │ 7 │
74+
╌╌╌╌╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌╌╌╌╌╌╌╌
5375
54-
Next difficulty is to find the `start` and the `stop` indices. To do this, we
55-
can take advantage of the `byte_bounds` method that returns a pointer to the
56-
end-points of an array.
5776
58-
.. code-block::
59-
:class: output
77+
.. code-block::
6078
61-
byte_bounds(Z1)[0]                byte_bounds(Z1)[-1]
62-
   ↓                                       ↓
63-
╌╌╌┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬╌╌
64-
Z1 │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │
65-
╌╌╌┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴╌╌
79+
>>> offset_start = np.byte_bounds(Z2)[0] - np.byte_bounds(Z1)[0]
80+
>>> print(offset_start) # bytes
81+
8
6682
67-
byte_bounds(Z2)[0]        byte_bounds(Z2)[-1]
68-
       ↓                           ↓
69-
╌╌╌╌╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌╌╌╌╌╌╌╌
70-
Z2     │ 1 │   │ 3 │   │ 5 │   │ 7 │
71-
╌╌╌╌╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌╌╌╌╌╌╌╌
83+
>>> offset_stop = np.byte_bounds(Z2)[-1] - np.byte_bounds(Z1)[-1]
84+
>>> print(offset_stop) # bytes
85+
-16
7286
87+
Converting these offsets into indices is straightforward using the `itemsize`,
88+
taking into account that `offset_stop` is negative (the end-bound of `Z2` is
89+
logically smaller than the end-bound of `Z1`). We thus need to add the number
90+
of items in `Z1` (`Z1.size`) to get the right end index.
7391

74-
.. code-block::
92+
.. code-block::
7593
76-
>>> offset_start = np.byte_bounds(Z2)[0] - np.byte_bounds(Z1)[0]
77-
>>> print(offset_start) # bytes
78-
8
79-
80-
>>> offset_stop = np.byte_bounds(Z2)[-1] - np.byte_bounds(Z1)[-1]
81-
>>> print(offset_stop) # bytes
82-
-16
94+
>>> start = offset_start // Z1.itemsize
95+
>>> stop = Z1.size + offset_stop // Z1.itemsize
96+
>>> print(start, stop, step)
97+
1 8 2
8398
84-
Converting these offsets into indices is straightforward using the `itemsize`
85-
and taking into account that the `offset_stop` is negative (end-bound of `Z2`
86-
is logically smaller than end-bound of `Z1` array). We thus need to add the
87-
items size of Z1 to get the right end index.
88-
89-
.. code-block::
99+
Finally, we test our results:
90100

91-
>>> start = offset_start // Z1.itemsize
92-
>>> stop = Z1.size + offset_stop // Z1.itemsize
93-
>>> print(start, stop, step)
94-
1, 8, 2
101+
.. code-block::
95102
96-
Last we test our results:
103+
>>> print(np.allclose(Z1[start:stop:step], Z2))
104+
True
97105
98-
.. code-block::
99106
100-
>>> print(np.allclose(Z1[start,stop,step], Z2))
101-
True
102-
107+
Exercise
108+
++++++++
103109

104-
Exercice
105-
++++++++
110+
As an exercise, you can improve this first and very simple implementation by
111+
taking into account:
106112

107-
As an exercise, you can improve this first and very simple implementation by
108-
taking into account:
113+
* Negative steps
114+
* Multi-dimensional arrays
109115

110-
* Negative steps
111-
* Multi-dimensional arrays
112116

113-
114-
Sources
115-
+++++++
117+
Sources
118+
+++++++
116119

117-
* `find_index <../code/find_index.py>`_ (solution to the exercise)
120+
* `find_index.py <code/find_index.py>`_ (solution to the exercise)
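The steps of the "View and copy" section can be gathered into one small function. What follows is only a sketch under stated assumptions, not the contents of `find_index.py`: the names `address` and `find_slice` are hypothetical, the arrays are assumed one-dimensional, the step positive and the view non-empty. To avoid depending on where `byte_bounds` lives in a given NumPy release, the sketch reads the address of the first element from `__array_interface__` instead, and derives `stop` as the index of the last viewed element plus one.

```python
import numpy as np

def address(Z):
    """Memory address of the first element of Z (hypothetical helper)."""
    return Z.__array_interface__["data"][0]

def find_slice(base, view):
    """Return (start, stop, step) such that base[start:stop:step] equals view.

    Assumes 1-D arrays, a positive step and a non-empty view.
    """
    if view.base is not base:
        raise ValueError("view is not a direct view of base")
    # Ratio of strides (in bytes) gives the step in items
    step = view.strides[0] // base.strides[0]
    # Byte offset between the first elements, converted to an index
    start = (address(view) - address(base)) // base.itemsize
    # Index of the last viewed element, plus one
    stop = start + step * (view.size - 1) + 1
    return start, stop, step

Z1 = np.arange(10)
Z2 = Z1[1:-1:2]
start, stop, step = find_slice(Z1, Z2)
print(start, stop, step)  # prints: 1 8 2
print(np.array_equal(Z1[start:stop:step], Z2))  # prints: True
```

Handling negative steps and multi-dimensional arrays, as proposed in the exercise, is deliberately left out of this sketch.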
