Commit 169122c: Documentation improvements

1 parent 45a5401 commit 169122c

6 files changed (+185, -119 lines)

sphinx/source/install_usage/install.rst

Lines changed: 25 additions & 16 deletions
@@ -2,9 +2,9 @@
 Installation
 ************

-For x86 systems we provide pre-built docker images users can quickly start with their own TAU instrumented applications (See `Chimbuko docker <https://codarcode.github.io/Chimbuko/installation/docker.html>`_). Otherwise, we recommend that Chimbuko be installed via the `Spack package manager <https://spack.io/>`_. Below we provide instructions for installing Chimbuko on a typical Ubuntu desktop and also on the Summit computer. Some details on installing Chimbuko in absence of Spack can be found in the :ref:`Appendix <manual_installation_of_chimbuko>`.
+For x86 systems we provide pre-built Docker images with which users can quickly get started with their own TAU-instrumented applications (see `Chimbuko docker <https://codarcode.github.io/Chimbuko/installation/docker.html>`_). Otherwise, we recommend that Chimbuko be installed via the `Spack package manager <https://spack.io/>`_. Below we provide instructions for installing Chimbuko on a typical Ubuntu desktop and also on the Summit and Crusher computers. Some details on installing Chimbuko in the absence of Spack can be found in the :ref:`Appendix <manual_installation_of_chimbuko>`.

-In all cases, the first step is to download and install Spack following the instructions `here <https://github.com/spack/spack>`_ . Note that installing Spack requires Python.
+The first step is to download and install Spack following the instructions `here <https://github.com/spack/spack>`_. Note that installing Spack requires Python.

 We require Spack repositories for Chimbuko and for the Mochi stack:

@@ -24,9 +24,10 @@ A basic installation of Chimbuko can be achieved very easily:

 .. code:: bash

-   spack install chimbuko^py-setuptools-scm+toml
+   spack install chimbuko

-Note that the dependency on :code:`py-setuptools-scm+toml` resolves a dependency conflict likely resulting from a bug in Spack's current dependency resolution.
+..
+   ^py-setuptools-scm+toml Note that the dependency on :code:`py-setuptools-scm+toml` resolves a dependency conflict likely resulting from a bug in Spack's current dependency resolution.

 A Dockerfile (instructions for building a Docker image) that installs Chimbuko on top of a basic Ubuntu 18.04 image following the above steps can be found `here <https://github.com/CODARcode/PerformanceAnalysis/blob/master/docker/ubuntu18.04/openmpi4.0.4/Dockerfile.chimbuko.spack>`_.

@@ -71,7 +72,10 @@ Chimbuko can be built without MPI by disabling the **mpi** Spack variant as follows:

 .. code:: bash

-   spack install chimbuko~mpi ^py-setuptools-scm+toml
+   spack install chimbuko~mpi
+
+..
+   ^py-setuptools-scm+toml

 When used in this mode the user is responsible for manually assigning a "rank" index to each instance of the online AD module, and also for ensuring that an instance of this module is created alongside each instance or rank of the target application (e.g. using a wrapper script that is launched via mpirun). We discuss how this can be achieved :ref:`here <non_mpi_run>`.
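The wrapper approach described above can be sketched as follows. This is a hypothetical illustration, not the actual AD module interface: the environment variable names cover common launchers (OpenMPI, Slurm, PMI), and the commented launch lines are placeholders.

```shell
# Hypothetical sketch of a per-rank wrapper (not the actual AD interface).
# Derive a "rank" index from whichever launcher environment variable is set,
# falling back to 0 for a single-process run.
rank_of() {
  echo "${OMPI_COMM_WORLD_RANK:-${SLURM_PROCID:-${PMI_RANK:-0}}}"
}

# Inside a wrapper launched once per application rank
# (e.g. "mpirun -n N wrapper.sh ./app") one would start an AD instance
# tagged with this rank alongside the application, schematically:
#   <online AD module> <its options, including the rank index $(rank_of)> &
#   "$@"

OMPI_COMM_WORLD_RANK=2
echo "assigned rank: $(rank_of)"   # prints: assigned rank: 2
```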

@@ -117,10 +121,10 @@ Once installed, simply

 after loading the modules above.

-Spock
-~~~~~
+Crusher
+~~~~~~~

-In the PerformanceAnalysis source we also provide a Spack environment yaml for use on Spock, :code:`spack/environments/spock.yaml`. This environment is designed for the AMD compiler suite with Rocm 4.3.0. Installation instructions follow:
+In the PerformanceAnalysis source we also provide a Spack environment yaml for use on Crusher, :code:`spack/environments/crusher_rocm5.2_PrgEnv-amd.yaml`. This environment is designed for the AMD programming environment with Rocm 5.2.0. Installation instructions follow:

 First download the Chimbuko and Mochi repositories:

@@ -129,7 +133,7 @@ First download the Chimbuko and Mochi repositories:

    git clone https://github.com/mochi-hpc/mochi-spack-packages.git
    git clone https://github.com/CODARcode/PerformanceAnalysis.git

-Copy the file :code:`spack/environments/spock.yaml` from the PerformanceAnalysis git repository to a convenient location and edit the paths in the :code:`repos` section to point to the paths at which you downloaded the repositories:
+Copy the file :code:`spack/environments/crusher_rocm5.2_PrgEnv-amd.yaml` from the PerformanceAnalysis git repository to a convenient location and edit the paths in the :code:`repos` section to point to the paths at which you downloaded the repositories:

 .. code:: yaml
@@ -141,10 +145,18 @@ This environment uses the following modules, which must be loaded prior to installation:

 .. code:: bash

-   module reset
-   module load PrgEnv-amd/8.2.0
-   module load rocm/4.3.0
-   module load cray-python/3.9.4.1
+   module reset
+   module load PrgEnv-amd/8.3.3
+   module swap amd amd/5.2.0
+   module load cray-python/3.9.12.1
+   module load cray-mpich/8.1.17
+   module load gmp
+   module load craype-accel-amd-gfx90a
+   export LD_LIBRARY_PATH=/opt/gcc/mpfr/3.1.4/lib:$LD_LIBRARY_PATH
+
+   # Not set by the cray-mpich module for some reason, so set manually
+   export PATH=${CRAY_MPICH_PREFIX}/bin:${PATH}
+   export PATH=${ROCM_COMPILER_PATH}/bin:${PATH}

 To install the environment:

@@ -154,16 +166,13 @@ To install the environment:

    spack env activate my_chimbuko_env
    spack install

-Unfortunately at present there are a few issues with Spack on Spock that require workarounds when loading the environment:
+To load the environment:

 .. code:: bash

    # Spack does not appear to pick up the cray-xpmem pkg-config location; appended so it is used only as a last resort
    export PKG_CONFIG_PATH=${PKG_CONFIG_PATH}:/usr/lib64/pkgconfig

-   # Spack appears to miss an rpath for Chimbuko
-   export LD_LIBRARY_PATH=/opt/cray/pe/libsci/21.08.1.2/AMD/4.0/x86_64/lib:${LD_LIBRARY_PATH}
-
    spack env activate my_chimbuko_env
    spack load tau chimbuko-performance-analysis chimbuko-visualization2

sphinx/source/install_usage/install_usage.rst

Lines changed: 1 addition & 0 deletions
@@ -6,3 +6,4 @@ Installation and Usage
 .. include:: install.rst
 .. include:: instrumenting.rst
 .. include:: run_chimbuko.rst
+.. include:: post_analysis.rst
sphinx/source/install_usage/post_analysis.rst

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
********************************************
Analysing the results of a run with Chimbuko
********************************************

The output of Chimbuko is stored in the provenance database. The database is sharded over multiple files of the form **provdb.${SHARD}.unqlite** that are by default output into the :file:`chimbuko/provdb` directory in the run path. We provide several tools for analyzing the contents of the provenance database:

1. **provdb_query**, a command-line tool for filtering and querying the database

2. A **Python module** for connecting to the database, filtering and querying, for use in custom analysis tools

3. **provdb-python**, a Python-based command-line tool for analyzing the database

Using the post-analysis **Python module**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The module

.. code:: bash

   scripts/provdb_python/src/provdb_python/provdb_interact.py

in the PerformanceAnalysis source provides an interface for connecting to and querying the database. Further documentation is forthcoming.

Using **provdb-python**
~~~~~~~~~~~~~~~~~~~~~~~

This tool can also be installed via Spack:

.. code:: bash

   spack repo add /src/develop/PerformanceAnalysis/spack/repo/chimbuko
   spack install chimbuko-provdb-python

Or using **pip** from the Chimbuko source (note that the **py-mochi-sonata** Spack module is expected to be loaded):

.. code:: bash

   git clone -b ckelly_develop https://github.com/CODARcode/PerformanceAnalysis.git && \
   python3.6 -m pip install PerformanceAnalysis/scripts/provdb_python/

Once installed, the tool can be used as a regular command-line program, executed from the directory containing the provenance database UnQLite files:

.. code:: bash

   cd chimbuko/provdb
   provdb-python

Several components are available, with further documentation forthcoming.

Using **provdb_query**
~~~~~~~~~~~~~~~~~~~~~~

The provenance database is sharded over files of the form **provdb.${SHARD}.unqlite** in the job's run directory. From this directory the user can interact with the provenance database via the visualization module. A more general command-line interface to the database is also provided via the **provdb_query** tool, which allows the user to execute arbitrary jx9 queries on the database.

The **provdb_query** tool has three modes of operation: **filter**, **filter-global** and **execute**.

Filter mode
-----------

**filter** mode allows the user to provide a jx9 filter function that is applied to filter out entries in a particular collection. The result is displayed in JSON format and can be piped to disk. It can be used as follows:

.. code:: bash

   provdb_query filter ${COLLECTION} ${QUERY}

Where the variables are as follows:

- **COLLECTION** : one of the three collections in the database, **anomalies**, **normalexecs**, **metadata** (cf. :ref:`introduction/provdb:Provenance Database`)
- **QUERY** : the query, in the format described below

The **QUERY** argument should be a jx9 function returning a bool and enclosed in quotation marks. It should be of the format

.. code:: bash

   QUERY="function(\$entry){ return \$entry['some_field'] == ${SOME_VALUE}; }"

Alternatively the query can be set to "DUMP", which will output all entries.

The function is applied sequentially to each element of the collection. Inside the function the entry is described by the variable **$entry**. Note that the backslash-dollar (\\$) is necessary to prevent the shell from trying to expand the variable. Fields of **$entry** can be queried using the square-bracket notation with the field name inside. In the sketch above the field "some_field" is compared to a value **${SOME_VALUE}** (here representing a numerical value or a value expanded by the shell, *not* a jx9 variable!).
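The escaping rules just described can be checked directly in the shell. A small sketch, in which 'some_field' and the value 25 are placeholders rather than real schema fields:

```shell
# Demonstration of the quoting described above: backslash-dollar keeps the
# jx9 variable ($entry) literal, while ${SOME_VALUE} is expanded by the shell.
SOME_VALUE=25   # placeholder value expanded by the shell
QUERY="function(\$entry){ return \$entry['some_field'] == ${SOME_VALUE}; }"
echo "$QUERY"   # prints: function($entry){ return $entry['some_field'] == 25; }
```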
Some examples:

- Find every anomaly whose function name contains the substring "Kokkos":

.. code:: bash

   provdb_query filter anomalies "function(\$a){ return substr_count(\$a['func'],'Kokkos') > 0; }"

- Find all events that occurred on a GPU:

.. code:: bash

   provdb_query filter anomalies "function(\$a){ return \$a['is_gpu_event']; }"

Filter-global mode
------------------

If the pserver is connected to the provenance database, at the end of the run the aggregated function profile data and global averages of counters will be stored in a "global" database, "provdb.global.unqlite". This database can be queried using the **filter-global** mode, which, like the above, allows the user to provide a jx9 filter function that is applied to filter out entries in a particular collection. The result is displayed in JSON format and can be piped to disk. It can be used as follows:

.. code:: bash

   provdb_query filter-global ${COLLECTION} ${QUERY}

Where the variables are as follows:

- **COLLECTION** : one of the two collections in the database, **func_stats** and **counter_stats**
- **QUERY** : the query

The formatting of the **QUERY** argument is described above.

Execute mode
------------

**execute** mode allows running a complete jx9 script on the database as a whole, allowing for more complex queries that collect different outputs and span collections.

.. code:: bash

   provdb_query execute ${CODE} ${VARIABLES} ${OPTIONS}

Where the variables are as follows:

- **CODE** : the jx9 script
- **VARIABLES** : a comma-separated list (without spaces) of the variables assigned by the script

The **CODE** argument is a complete jx9 script. As above, backslashes ('\\') must be placed before internal '$' and '"' characters to prevent shell expansion.

If the option **-from_file** is specified the **${CODE}** variable above will be treated as a filename from which to obtain the script. Note that in this case the backslashes before the special characters are not necessary.
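A sketch of the **-from_file** workflow follows. The jx9 body is illustrative only (it assumes UnQLite's jx9 document-store API, e.g. db_fetch()), and the provdb_query invocation is shown as a comment since it requires an actual database:

```shell
# Write a jx9 script to a file; with -from_file no backslash escaping of
# '$' or '"' is needed because the shell never sees the script body.
cat > count_anomalies.jx9 <<'EOF'
$count = 0;
while (($a = db_fetch('anomalies')) != NULL) { $count++; }
EOF

# Hypothetical invocation, run from the provenance database directory:
#   provdb_query execute count_anomalies.jx9 count -from_file
echo "wrote count_anomalies.jx9"
```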

sphinx/source/install_usage/run_chimbuko.rst

Lines changed: 3 additions & 84 deletions
@@ -172,10 +172,10 @@ which can be used as follows:

    <LAUNCH N RANKS OF APP ON BODY NODES> = jsrun -U main.urs

-Running on Spock
-^^^^^^^^^^^^^^^^
+Running on Slurm-based systems
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-In this section we provide specifics on launching on the Spock machine.
+In this section we provide specifics on launching on the Spock machine, but the procedure also applies to other machines using the Slurm task scheduler.

 Spock uses the *slurm* job management system. To control the explicit placement of the ranks we will use the :code:`--nodelist` (:code:`-w`) slurm option to specify the nodes associated with a resource set, the :code:`--nodes` (:code:`-N`) option to specify the number of nodes and the :code:`--overlap` option to allow the AD and application resource sets to coexist on the same node. These options are documented `here <https://slurm.schedmd.com/srun.html>`_.
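As an illustrative sketch of combining these options: the node names, counts and executable names below are hypothetical, and the commands are echoed rather than executed.

```shell
# Hypothetical placement: application ranks and AD instances share two
# nodes, which requires --overlap so the two resource sets can coexist.
NODES=node01,node02
echo "srun -N 2 --nodelist=$NODES -n 8 ./app"                   # application ranks
echo "srun -N 2 --nodelist=$NODES -n 8 --overlap <AD module>"   # co-located AD instances
```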

@@ -350,84 +350,3 @@ To run the image the user must have access to a system with an installation of t

    nvidia-docker run -p 5002:5002 --cap-add=SYS_PTRACE --security-opt seccomp=unconfined chimbuko/run_mocu:latest

 And connect to this visualization server at **localhost:5002**.
-
-Interacting with the Provenance Database
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
