Commit e860981

Merge branch 'master' of gitlab.com:deepanshu2017/Hydra.Python
2 parents f06c080 + 84bcfaa commit e860981

15 files changed

+689
-24
lines changed
-201 Bytes
Binary file not shown.
4.4 KB
Binary file not shown.

docs/_build/doctrees/intro.doctree

-62 Bytes
Binary file not shown.
43.3 KB
Binary file not shown.

docs/_build/html/PhaseSpaceExample.html

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -37,8 +37,7 @@
3737
<div class="bodywrapper">
3838
<div class="body" role="main">
3939

40-
<img alt="_images/hydra_logo.png" src="_images/hydra_logo.png" />
41-
<div class="section" id="phase-space-example">
40+
<div class="section" id="phase-space-example">
4241
<h1>Phase Space Example<a class="headerlink" href="#phase-space-example" title="Permalink to this headline"></a></h1>
4342
<p>This page is basically to demonstrate, how the PhaseSpace class with N
4443
particles can be used to generate the Events.</p>

docs/_build/html/_sources/PhaseSpaceExample.txt

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
1-
.. image:: hydra_logo.png
2-
31
Phase Space Example
42
====================
53
This page is basically to demonstrate, how the PhaseSpace class with N

docs/_build/html/_sources/intro.txt

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,13 @@
11
About this project
22
==================
33
The **Hydra.Python** package provides the Python bindings for the header-only C++ `Hydra`_ library.
4-
This library is an abstraction over the C++ library, so that daily work can be code and run with the Python language,
4+
This library is an abstraction over the C++ library, so that daily work can be coded and run with the Python language,
55
concentrating on the logic and leaving all the complex memory management and optimisations to the C++ library.
66

7-
The bindings are produced with `pybind11`_. The project makes use of `CMAKE`_.
7+
The bindings are produced with `pybind11`_. The project uses `CMAKE`_ to build the Hydra.Python library.
88

99
The library is written with ``Linux`` systems in mind, but compatibility with other platforms may be achieved with "hacks".
10-
Python 2.7, and 3.x are supported.
10+
Python versions 2.7 and 3.x are supported.
1111

1212

1313
.. _Hydra: https://github.com/MultithreadCorner/Hydra
@@ -18,11 +18,10 @@ Python 2.7, and 3.x are supported.
1818
Core features
1919
*************
2020
The core functionality of Hydra has been exposed to Python.
21-
2221
The following core C++ features of Hydra can be mapped to Python:
2322

2423
- The continuous expansion of the original Hydra library.
25-
- Support for ``particles`` with ``Vector4R`` class.
24+
- Support for particles with ``Vector4R`` class.
2625
- Support for containers like ``Events`` or ``Decay``.
2726

2827

Lines changed: 198 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,198 @@
1+
###############
2+
Project Report:
3+
###############
4+
5+
***************************************************************
6+
Google Summer of Code 2017
7+
***************************************************************
8+
9+
===============================================================
10+
Umbrella Organization: CERN-HSF, CERN’s HEP software foundation
11+
===============================================================
12+
13+
================================================================================================================================
14+
Project: Efficient Python routines for analysis on massively multi-threaded platforms-Python bindings for the Hydra C++ library
15+
================================================================================================================================
16+
17+
Submitted by- Deepanshu Thakur
18+
******************************
19+
20+
I spent the last 3 months working on my `GSoC project`_. The project was
21+
about writing Python bindings for the Hydra C++ library. Hydra is a
22+
header-only C++ library designed and used to run on Linux platforms. It is a
23+
templated C++11 library designed to perform common High Energy Physics data
24+
analyses on massively parallel platforms. The idea of this GSoC project is to
25+
provide bindings for the Hydra library, so that Python support for
26+
Hydra can be added and Python can be used for prototyping or
27+
development.
28+
29+
30+
.. _GSoC project: https://summerofcode.withgoogle.com/projects/#6669304945704960
31+
32+
The deliverables in my original proposal and my final output look a little
33+
different, and there are some very good reasons for it. The change of
34+
deliverables will become evident in the discussion of the design challenges
35+
and choices later in the report. In the beginning the goal was to write the
36+
bindings for the ``Data Fitting``, ``Random Number Generation``,
37+
``Phase-Space Monte Carlo Simulation``, ``Functor Arithmetic`` and
38+
``Numerical integration``, but we ended up having the bindings for
39+
``Random Number Generation`` and ``Phase-Space Monte Carlo Simulation`` only.
40+
(The remaining classes can be bound with some extra effort, but there is
41+
no time left within the current scope of GSoC, so I have decided to
42+
continue with the project outside the scope of GSoC.)
43+
44+
45+
Choosing proper tools
46+
*********************
47+
48+
Let me take you through my 3-month journey. The first step was to find a tool or
49+
package to write the bindings. Several options were in principle available.
50+
In the beginning we tried to evaluate
51+
`SWIG`_.
52+
The problems with SWIG are that it is very complicated to use and that it
53+
does not properly support ``variadic templates``, while the
54+
`Thrust library`_ underlying Hydra depends heavily on variadic templates. After trying
55+
SWIG and realizing it could not fulfill our requirements, we turned our
56+
attention to `Boost.Python`_, which looks quite promising. However, it is part of
57+
a very large and complex suite that relies on many tweaks and
58+
hacks so that it can work on almost any compiler, at the price of added
59+
complexity and cost. Finally we turned our attention to `pybind11`_.
60+
To quote the pybind11 documentation:
61+
62+
Boost is an enormously large and complex suite of utility libraries
63+
that works with almost every C++ compiler in existence. This compatibility
64+
has its cost: arcane template tricks and workarounds are necessary to
65+
support the oldest and buggiest of compiler specimens. Now
66+
that C++11-compatible compilers are widely available, this heavy
67+
machinery has become an excessively large and unnecessary dependency.
68+
69+
After investigating a lot of things and trying `various programs`_ we decided
70+
to go ahead with pybind11. The next step was to `familiarize myself`_ with pybind11.
71+
72+
.. _SWIG: http://swig.org
73+
.. _Thrust library: https://github.com/andrewcorrigan/thrust-multi-permutation-iterator
74+
.. _Boost.Python: http://www.boost.org/doc/libs/1_65_0/libs/python/doc/html/index.html
75+
.. _pybind11: https://github.com/pybind/pybind11
76+
.. _various programs: https://github.com/Deepanshu2017/boost.python_practise
77+
.. _familiarize myself: https://github.com/Deepanshu2017/pybind11_practise
78+
79+
80+
The Basic design problem
81+
************************
82+
83+
Next we needed to solve the basic design problem, which is the `CRTP idiom`_.
84+
The Hydra library relies on the CRTP idiom to avoid runtime overhead. I
85+
investigated CRTP extensively, and it took a little while to finally come up
86+
with a solution that can work with any number N, meaning that our class can accept
87+
any number of particles in the final state (denoted by N). For those unfamiliar with
88+
CRTP, it is a form of static, or compile-time, polymorphism. The
89+
idea that I implemented was to take a parameter from Python and, based on that
90+
parameter, write the bindings to a new file, then compile and generate
91+
them at runtime with system calls. Unfortunately, generating bindings at
92+
runtime and compiling them takes a lot of time, so it is not
93+
feasible: the user would have to wait a few minutes each time before actually being
94+
able to use the generated package. We decided to go ahead with a fixed set
95+
of values, meaning that we generate bindings for a limited number of particles.
96+
Currently the Python bindings for the classes support up to 10 (N = 10)
97+
particles in the final state. We can make them work with any number we want:
98+
our binding code is written within a macro, so it is just a matter of
99+
adding one extra macro call for each additional value of N.
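
Seen from the Python side, the fixed-N choice amounts to preparing one specialised class per supported N ahead of time and picking the right one from the number of daughters. The sketch below is only an analogy in plain Python: the real bindings are generated by a C++ macro, and the names ``make_phase_space_class``, ``REGISTRY``, ``phase_space`` and ``PhaseSpace3`` are hypothetical and not part of Hydra.Python.

```python
MAX_PARTICLES = 10  # upper limit quoted in the report


def make_phase_space_class(n):
    """Create a stand-in class specialised for n final-state particles."""

    def __init__(self, mother_mass, daughter_masses):
        if len(daughter_masses) != n:
            raise ValueError("expected %d daughter masses" % n)
        self.mother_mass = float(mother_mass)
        self.daughter_masses = [float(m) for m in daughter_masses]

    return type("PhaseSpace%d" % n, (object,), {"__init__": __init__, "N": n})


# One class per supported N, built once up front. This mirrors "one macro
# call per value of N"; starting at N = 1 is an assumption of this sketch.
REGISTRY = {n: make_phase_space_class(n) for n in range(1, MAX_PARTICLES + 1)}


def phase_space(mother_mass, daughter_masses):
    """Pick the pre-built class matching the number of daughters."""
    n = len(daughter_masses)
    if n not in REGISTRY:
        raise ValueError("no binding was generated for N = %d" % n)
    return REGISTRY[n](mother_mass, daughter_masses)


# Example masses are arbitrary illustration values.
generator = phase_space(5.2795, [3.0969, 0.493677, 0.13957061])
print(type(generator).__name__, generator.N)
```

Supporting one more value of N then means adding one more entry to the registry, at the cost of building every specialisation in advance rather than on demand.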
100+
101+
.. _CRTP idiom: https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern
102+
103+
104+
The Hydra Binding
105+
*****************
106+
107+
Once the approach was decided, we jumped into the bindings of Hydra
108+
(finally, after so many complications, though unfortunately this was not the
109+
end of them). We decided to bind the most important classes first:
110+
``Random Number Generation`` and ``Phase-Space Monte Carlo Simulation``.
111+
My mentors decided that they would bind ``Random Number Generation``, while
112+
``Phase-Space Monte Carlo Simulation`` was my responsibility. The rest of the
113+
report explains more about Phase-Space Monte Carlo Simulation.
114+
115+
The “Phase-Space Monte Carlo Simulation”, or PhaseSpace, C++ Hydra class is used
116+
to generate phase-space Monte Carlo samples.
117+
118+
The events are generated in the center-of-mass frame, but the decay products
119+
are finally boosted using the betas of the original particle. The code is
120+
based on the Raubold and Lynch method as documented in
121+
[F. James, Monte Carlo Phase Space, CERN 68-15 (1968)]
122+
(https://cds.cern.ch/record/275743).
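
The two steps described above (decay in the center-of-mass frame, then boost with the betas of the mother) can be sketched in plain Python for the simplest case of a two-body decay. This is a textbook illustration with c = 1, not Hydra's code: the function names ``two_body_momentum`` and ``boost`` are hypothetical, the masses are arbitrary, and the decay axis is fixed along z, whereas a real generator would sample the direction randomly.

```python
import math


def two_body_momentum(mother_mass, m1, m2):
    """Momentum of either daughter in the mother's rest frame."""
    if mother_mass < m1 + m2:
        raise ValueError("decay is kinematically forbidden")
    a = mother_mass ** 2 - (m1 + m2) ** 2
    b = mother_mass ** 2 - (m1 - m2) ** 2
    return math.sqrt(a * b) / (2.0 * mother_mass)


def boost(four_vector, beta):
    """Lorentz-boost (E, px, py, pz) by the velocity vector beta."""
    e, px, py, pz = four_vector
    bx, by, bz = beta
    b2 = bx * bx + by * by + bz * bz
    if b2 >= 1.0:
        raise ValueError("|beta| must be smaller than 1")
    gamma = 1.0 / math.sqrt(1.0 - b2)
    bp = bx * px + by * py + bz * pz
    k = (gamma - 1.0) / b2 if b2 > 0.0 else 0.0
    return (
        gamma * (e + bp),
        px + k * bp * bx + gamma * bx * e,
        py + k * bp * by + gamma * by * e,
        pz + k * bp * bz + gamma * bz * e,
    )


# Mother of mass 10 moving along z with momentum 7.5.
mother_mass, mother_pz = 10.0, 7.5
mother_e = math.sqrt(mother_mass ** 2 + mother_pz ** 2)
beta = (0.0, 0.0, mother_pz / mother_e)

# Two daughters of mass 3, back to back along z in the rest frame.
p = two_body_momentum(mother_mass, 3.0, 3.0)
e_daughter = math.sqrt(p ** 2 + 3.0 ** 2)
d1 = boost((e_daughter, 0.0, 0.0, p), beta)
d2 = boost((e_daughter, 0.0, 0.0, -p), beta)

# The boosted daughters should add up to the mother's four-momentum.
total = tuple(x + y for x, y in zip(d1, d2))
print(total, (mother_e, 0.0, 0.0, mother_pz))
```

Checking that the boosted daughters sum to the mother's four-momentum is a convenient sanity test for any generator of this kind.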
123+
124+
Momentum and energy are expressed in GeV/c and GeV, respectively. The PhaseSpace Monte
125+
Carlo class depends on the ``Vector3R``, ``Vector4R`` and ``Events`` classes.
126+
Thus the PhaseSpace class cannot be bound until all of the above classes are bound.
127+
128+
The ``Vector3R`` and ``Vector4R`` classes were bound first. There were some problems,
129+
such as generating the ``__eq__`` and ``__ne__`` methods for the Python side, but I solved
130+
them by creating a ``lambda function`` that iterates over the values and checks
131+
whether they satisfy the condition. The ``Vector4R``, or four-vector, class
132+
represents a particle. The plan was to first bind the particle class
133+
(the four-vector class), then bind the ``Events`` class that
134+
holds the phase space generated by the ``PhaseSpace`` class, and then bind the
135+
actual ``PhaseSpace`` class. The ``Events`` class was not so easy to bind
136+
because it depends on ``hydra::multiarray``, and without bindings for the latter
137+
the ``Events`` class was impossible to bind. Thanks go to my mentor,
138+
who had already written these bindings for the ``Random`` class with some tweaks to
139+
pybind11’s bind_container itself. We even faced some design issues with the
140+
Events class in Hydra itself. Eventually, after solving these problems,
141+
I had the Events class working, and I then converted the binding code
142+
into a macro, so that we can use the Events class with up to 10 particles.
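
The element-wise comparison described above for ``__eq__`` and ``__ne__`` can be illustrated in plain Python. This is only a sketch of the intended semantics, assuming that two vectors are equal when all of their components match: the class ``FourVector`` is a hypothetical stand-in and not the bound ``Vector4R``, whose comparison is implemented on the C++ side.

```python
class FourVector(object):
    """Plain-Python stand-in for a four-vector; not the Hydra.Python class."""

    def __init__(self, e, px, py, pz):
        self._values = (float(e), float(px), float(py), float(pz))

    def __iter__(self):
        return iter(self._values)

    def __eq__(self, other):
        if not isinstance(other, FourVector):
            return NotImplemented
        # Equal only if every pair of components matches.
        return all(a == b for a, b in zip(self, other))

    def __ne__(self, other):
        # Defined explicitly because Python 2.7 does not derive it from __eq__.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


a = FourVector(5.0, 0.0, 0.0, 4.0)
b = FourVector(5.0, 0.0, 0.0, 4.0)
c = FourVector(5.0, 0.0, 0.0, -4.0)
print(a == b, a != b, a == c, a != c)
```

Exact floating-point comparison is used here for simplicity; a tolerance-based comparison may be more appropriate for values that come out of a numerical computation.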
143+
144+
Now came the actual bindings for the ``PhaseSpace`` class. The ``PhaseSpace``
145+
class has constructors and methods such as ``GetSeed``, ``SetSeed``, ``AverageOn``, ``Evaluate`` and ``Generate``.
146+
147+
148+
``GetSeed`` and ``SetSeed`` were easy to implement. The remaining 3 methods
149+
have two versions each: one that accepts a single mother particle and one that accepts
150+
a list of mother particles. I succeeded in binding the methods that accept
151+
a single mother particle, but was unable to bind the methods that accept
152+
a list of mother particles. I was trying to pass a list of Events objects
153+
along with the list of mother particles. I was able to pass the
154+
list of mother particles, but could not find any way to pass the list of Events
155+
without casting each Events object from a Python object in my binding code.
156+
(Later I realized that this is impossible to do.) My mentor wrote the bindings for the
157+
methods that accept a list of mother particles. After looking at the binding
158+
code, I realized, alas, that I had been making a very stupid mistake. I had to pass a
159+
``single Events object, not the list of Events object``, which I had already done
160+
but never showed to my mentor, thinking I was making a mistake. I learned a
161+
lesson from this: always show your mentor what you did, even if you
162+
believe you are wrong. It might save you some time. ;)
163+
164+
After completing the PhaseSpace code, I quickly converted the code into a macro
165+
to support up to 10 particles.
166+
167+
Now the PhaseSpace class was working perfectly! The next step was to create a
168+
series of test cases, the documentation and, of course, an example of the
169+
PhaseSpace class in action. The remaining algorithms that I named at the
170+
start of this report are still to be implemented.
171+
172+
173+
The happy learning
174+
******************
175+
176+
GSoC 2017 was a great learning experience for me. I learned a lot of
177+
things related not only to programming but also to high energy physics.
178+
I learned about *Monte Carlo Simulations*, and how they can be used to solve
179+
challenging real life problems. I read and studied a research paper
180+
( https://cds.cern.ch/record/275743/files/CERN-68-15.pdf ), learned about
181+
particle decays, gained insight into C++ variadic templates,
182+
wrote a blog about `CRTP`_, learned how to compile a
183+
python function and why simple python functions cannot be used in
184+
multithreaded environments. Most importantly I learned how to structure
185+
a project from scratch, and how important documentation and test cases are.
186+
187+
188+
.. _CRTP: https://medium.com/@deepanshu2017/a-curiously-recurring-python-d3a441a58174
189+
190+
191+
Special Thanks
192+
**************
193+
194+
Shoutout to my amazing mentors. I would like to thank
195+
Dr. Antonio Augusto Alves Jr. and Eduardo Rodrigues for being awesome
196+
mentors and for all the time they invested in me during GSoC. I also would
197+
like to thank the CERN-HSF community for their time and helping me whenever I
198+
had a problem. Thank you!

docs/_build/html/_static/PhaseSpaceExample.rst

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
1-
.. image:: hydra_logo.png
2-
31
Phase Space Example
42
====================
53
This page is basically to demonstrate, how the PhaseSpace class with N

docs/_build/html/_static/intro.rst

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,13 @@
11
About this project
22
==================
33
The **Hydra.Python** package provides the Python bindings for the header-only C++ `Hydra`_ library.
4-
This library is an abstraction over the C++ library, so that daily work can be code and run with the Python language,
4+
This library is an abstraction over the C++ library, so that daily work can be coded and run with the Python language,
55
concentrating on the logic and leaving all the complex memory management and optimisations to the C++ library.
66

7-
The bindings are produced with `pybind11`_. The project makes use of `CMAKE`_.
7+
The bindings are produced with `pybind11`_. The project uses `CMAKE`_ to build the Hydra.Python library.
88

99
The library is written with ``Linux`` systems in mind, but compatibility with other platforms may be achieved with "hacks".
10-
Python 2.7, and 3.x are supported.
10+
Python versions 2.7 and 3.x are supported.
1111

1212

1313
.. _Hydra: https://github.com/MultithreadCorner/Hydra
@@ -18,11 +18,10 @@ Python 2.7, and 3.x are supported.
1818
Core features
1919
*************
2020
The core functionality of Hydra has been exposed to Python.
21-
2221
The following core C++ features of Hydra can be mapped to Python:
2322

2423
- The continuous expansion of the original Hydra library.
25-
- Support for ``particles`` with ``Vector4R`` class.
24+
- Support for particles with ``Vector4R`` class.
2625
- Support for containers like ``Events`` or ``Decay``.
2726

2827
