Commit a2887b3

Merge pull request #155 from tensorwerk/benchmark-gh-actions
Benchmark Suite and Automated PR Regression Testing.
2 parents 1d932c8 + 7975eac commit a2887b3

18 files changed (+750, -5 lines changed)

.github/workflows/asvbench.yml

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
name: ASV Benchmarking

on:
  pull_request:
    branches:
      - master

jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      max-parallel: 4
      fail-fast: false
      matrix:
        os: [ubuntu-18.04, macOS-10.14]
        python-version: [3.6, 3.7]
    steps:
    - uses: actions/checkout@v1
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v1
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install --upgrade setuptools virtualenv
        pip install --upgrade asv
    - name: Run Benchmarks
      run: |
        cd asv_bench/
        asv machine --yes
        asv continuous origin/master HEAD | tee asv.log
        ASV_COMPARE="$(asv compare origin/master HEAD)"
        if [[ $(cat asv.log | grep "failed") ]]; then
          echo "Benchmarks Run With Errors"
          exit 1
        elif [[ $(cat asv.log | grep "PERFORMANCE DECREASED") ]]; then
          echo "$ASV_COMPARE"
          echo "Benchmarks Decreased Performance"
          exit 1
        else
          echo "$ASV_COMPARE"
          echo "Benchmarks Run Without Errors"
        fi
      shell: bash
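
The pass/fail decision in the "Run Benchmarks" step can be read as a small classification over the captured log. The following is a minimal Python sketch of that same logic, not code from this commit; the function name `classify_asv_log` is hypothetical, and it assumes the log text is whatever `asv continuous` wrote to `asv.log`.

```python
from typing import Tuple


def classify_asv_log(log_text: str) -> Tuple[str, int]:
    """Return (message, exit_code), mirroring the workflow's if/elif/else."""
    # grep "failed" succeeds when any line contains the substring,
    # so a plain substring check over the whole log is equivalent.
    if "failed" in log_text:
        return "Benchmarks Run With Errors", 1
    # Checked second: a log with both markers is reported as an error run.
    if "PERFORMANCE DECREASED" in log_text:
        return "Benchmarks Decreased Performance", 1
    return "Benchmarks Run Without Errors", 0
```

Note the ordering: as in the shell version, a benchmark failure takes precedence over a detected regression, and the comparison table is only echoed in the latter two branches.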

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -35,6 +35,11 @@ coverage.xml
 htmlcov
 .hypothesis

+# Performance Testing
+asv_bench/html
+asv_bench/env
+asv_bench/results
+
 # Translations
 *.mo

CHANGELOG.rst

Lines changed: 3 additions & 0 deletions
@@ -14,6 +14,9 @@ New Features
   (`#131 <https://github.com/tensorwerk/hangar-py/pull/131>`__) `@rlizzo <https://github.com/rlizzo>`__
 * Ability to change the backend storage format and options applied to an ``arrayset`` after initialization.
   (`#133 <https://github.com/tensorwerk/hangar-py/pull/133>`__) `@rlizzo <https://github.com/rlizzo>`__
+* Added Benchmarking Suite to Test for Performance Regressions in PRs.
+  (`#155 <https://github.com/tensorwerk/hangar-py/pull/155>`__) `@rlizzo <https://github.com/rlizzo>`__
+


 Improvements
 ------------

asv_bench/README.rst

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
Hangar Performance Benchmarking Suite
=====================================

A set of benchmarking tools is included in order to track the performance of
common Hangar operations over time. The benchmark suite is run via the
phenomenal `Airspeed Velocity (ASV) <https://asv.readthedocs.io/>`_ project.

Benchmarks can be viewed at the following web link, or by examining the raw
data files in the separate benchmark results repo.

- `Benchmark Web View <https://tensorwerk.com/hangar-benchmarks>`_
- `Benchmark Results Repo <https://github.com/tensorwerk/hangar-benchmarks>`_

.. figure:: ../docs/img/asv-detailed.png
   :align: center

Purpose
*******

In addition to providing historical metrics and insight into application
performance over many releases of Hangar, *the benchmark suite is used as a
canary to identify potentially problematic pull requests.* All PRs to the
Hangar repository are automatically benchmarked by our CI system to compare the
performance of proposed changes to that of the current ``master`` branch.

*The results of this canary are explicitly NOT to be used as the
"be-all-end-all" decider of whether a PR is suitable to be merged or not.*

Instead, it is meant to serve the following purposes:

1. **Help contributors understand the consequences of some set of changes on the
   greater system early in the PR process.** Simple code is best; if there's no
   obvious performance degradation or significant improvement to be had, then
   there's no need (or really rationale) for using more complex algorithms or
   data structures. Added complexity means more work for the author and the
   project maintainers, and it hurts the long-term health of the codebase.

2. **Not everything can be caught by the capabilities of a traditional test
   suite.** Hangar is fairly flat/modular in structure, but there are certain
   hotspots in the codebase where a simple change could drastically degrade
   performance. It's not always obvious where these hotspots are, and even a
   change which is functionally identical (introducing no issues/bugs to the
   end user) can unknowingly cross a line and introduce some large regression
   completely unnoticed by the authors/reviewers.

3. Sometimes tradeoffs need to be made when introducing something new to a
   system. Whether this is due to fundamental CS problems (space vs. time) or
   simple matters of practicality vs. purity, it's always easier to act in
   environments where relevant information is available before a decision is
   made. **Identifying and quantifying tradeoffs/regressions/benefits during
   development is the only way we can make informed decisions.** A regression
   is only acceptable when we know about it in advance. It might be the right
   choice at the time, but if we don't measure, we will never know.


Important Notes on Using/Modifying the Benchmark Suite
******************************************************

1. **Do not commit any of the benchmark results, environment files, or generated
   visualizations to the repository**. We store benchmark results in a `separate
   repository <https://github.com/tensorwerk/hangar-benchmarks>`_ so as not to
   clutter the main repo with unnecessary data. The default directories these are
   generated in are excluded in our ``.gitignore`` config, so barring some unusual
   git usage patterns, this should not be a day-to-day concern.

2. Proposed changes to the benchmark suite should be made to the code in this
   repository first. The benchmark results repository mirror will be
   synchronized upon approval/merge of changes to the main Hangar repo.


Introduction to Running Benchmarks
**********************************

As ASV sets up and manages its own virtual environments and source
installations, benchmark execution is not run via ``tox``. While a brief
tutorial is included below, please refer to the `ASV Docs
<https://asv.readthedocs.io/>`_ for detailed information on how to run,
understand, and write ASV benchmarks.
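
For orientation, here is a minimal sketch of what an ASV benchmark can look like. It is a hypothetical example and not part of the Hangar suite: the class name, the parameter values, and the dictionary workload are placeholders. It relies only on ASV's documented conventions that methods prefixed with ``time_`` are timed, that ``setup`` runs before the timed method, and that ``params``/``param_names`` define the parameter grid. A file like this would presumably live under ``asv_bench/benchmarks/``.

```python
class TimeDictLookup:
    """Hypothetical timing benchmark; stands in for a real Hangar operation."""

    # ASV runs the benchmark once per parameter value.
    params = [1_000, 100_000]
    param_names = ["num_keys"]

    def setup(self, num_keys):
        # Build fixtures here so their cost is excluded from the timing.
        self.table = {i: str(i) for i in range(num_keys)}
        self.keys = list(range(0, num_keys, 7))

    def time_lookup(self, num_keys):
        # Only the body of a ``time_*`` method is measured.
        table = self.table
        for key in self.keys:
            table[key]
```

Keeping fixture construction in ``setup`` matters for regression detection: otherwise a slow fixture could mask or mimic a change in the operation actually under test.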

First Time Setup
----------------

1. Ensure that ``virtualenv``, ``setuptools``, and ``pip`` are updated to the
   latest version.

2. Install ASV: ``$ pip install asv``.

3. Open a terminal and navigate to the ``hangar-py/asv_bench`` directory.

4. Run ``$ asv machine`` to record details of your machine. It is OK to
   just use the defaults.


Running Benchmarks
------------------

Refer to the `using ASV
<https://asv.readthedocs.io/en/stable/using.html#running-benchmarks>`_ page for
a full tutorial, paying close attention to the `asv run
<https://asv.readthedocs.io/en/stable/commands.html#asv-run>`_ command.
Generally ``asv run`` requires a range of commits to benchmark across
(specified via branch names, tags, or commit digests).

To benchmark every commit between ``v0.3.0`` and the current master ``HEAD``,
you would execute::

    $ asv run v0.3.0..master

However, this may result in a larger workload than you are willing to wait
around for. To limit the number of commits, you can specify the ``--steps=N``
option to only benchmark ``N`` commits at most between ``v0.3.0`` and ``HEAD``.

The most useful tool during development is the `asv continuous
<https://asv.readthedocs.io/en/stable/commands.html#asv-continuous>`_ command.
Using the following syntax will benchmark any changes in a local development
branch against the base ``master`` commit::

    $ asv continuous origin/master HEAD

Running `asv compare
<https://asv.readthedocs.io/en/stable/commands.html#asv-compare>`_ will
generate a quick summary of any performance differences::

    $ asv compare origin/master HEAD
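
If you prefer to drive these commands from a script, the sketch below shows one possible way to assemble them in Python. The helper names (`asv_run_args`, `asv_pr_check_args`, `run_all`) are hypothetical, and the working directory is assumed to be ``asv_bench``; only the ``asv`` subcommands and the ``--steps`` option described above are used.

```python
import subprocess
from typing import List, Optional


def asv_run_args(commit_range: str, steps: Optional[int] = None) -> List[str]:
    """Build the argument list for ``asv run`` over a commit range."""
    args = ["asv", "run"]
    if steps is not None:
        # Benchmark at most ``steps`` commits from the range.
        args.append(f"--steps={steps}")
    args.append(commit_range)
    return args


def asv_pr_check_args(base: str = "origin/master",
                      head: str = "HEAD") -> List[List[str]]:
    """Build the ``asv continuous`` and ``asv compare`` invocations."""
    return [["asv", "continuous", base, head],
            ["asv", "compare", base, head]]


def run_all(commands: List[List[str]], cwd: str = "asv_bench") -> List[int]:
    """Run each command in ``cwd`` and collect the return codes."""
    return [subprocess.run(args, cwd=cwd).returncode for args in commands]
```

For example, ``run_all([asv_run_args("v0.3.0..master", steps=10)])`` would benchmark at most ten commits from that range, and ``run_all(asv_pr_check_args())`` would run the same pair of commands shown above.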

Visualizing Results
-------------------

After generating benchmark data for a number of commits through history, the
results can be reviewed in an automatically generated local web interface by
running the following commands::

    $ asv publish
    $ asv preview

Navigating to ``http://127.0.0.1:8080/`` will pull up an interactive webpage
where the full set of benchmark graphs and exploration utilities can be viewed.
This will look something like the image below.

.. figure:: ../docs/img/asv-main.png
   :align: center

asv_bench/asv.conf.json

Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
{
    // The version of the config file format.  Do not change, unless
    // you know what you are doing.
    "version": 1,

    // The name of the project being benchmarked
    "project": "hangar",

    // The project's homepage
    "project_url": "https://hangar-py.readthedocs.io",

    // The URL or local path of the source code repository for the
    // project being benchmarked
    "repo": "..",

    // The Python project's subdirectory in your repo.  If missing or
    // the empty string, the project is assumed to be located at the root
    // of the repository.
    // "repo_subdir": "",

    // Customizable commands for building, installing, and
    // uninstalling the project. See asv.conf.json documentation.
    //
    // "install_command": ["in-dir={env_dir} python -mpip install {wheel_file}"],
    // "uninstall_command": ["return-code=any python -mpip uninstall -y {project}"],
    // "build_command": [
    //     "python setup.py build",
    //     "PIP_NO_BUILD_ISOLATION=false python -mpip wheel --no-deps --no-index -w {build_cache_dir} {build_dir}"
    // ],

    // List of branches to benchmark. If not provided, defaults to "master"
    // (for git) or "default" (for mercurial).
    "branches": ["master"], // for git
    // "branches": ["default"],    // for mercurial

    // The DVCS being used.  If not set, it will be automatically
    // determined from "repo" by looking at the protocol in the URL
    // (if remote), or by looking for special directories, such as
    // ".git" (if local).
    "dvcs": "git",

    // The tool to use to create environments.  May be "conda",
    // "virtualenv" or other value depending on the plugins in use.
    // If missing or the empty string, the tool will be automatically
    // determined by looking for tools on the PATH environment
    // variable.
    "environment_type": "virtualenv",

    // timeout in seconds for installing any dependencies in environment
    // defaults to 10 min
    //"install_timeout": 600,

    // the base URL to show a commit for the project.
    "show_commit_url": "http://github.com/tensorwerk/hangar-py/commit/",

    // The Pythons you'd like to test against.  If not provided, defaults
    // to the current version of Python used to run `asv`.
    // "pythons": ["3.7"],

    // The list of conda channel names to be searched for benchmark
    // dependency packages in the specified order
    // "conda_channels": ["conda-forge", "defaults"],

    // The matrix of dependencies to test.  Each key is the name of a
    // package (in PyPI) and the values are version numbers.  An empty
    // list or empty string indicates to just test against the default
    // (latest) version. null indicates that the package is to not be
    // installed. If the package to be tested is only available from
    // PyPi, and the 'environment_type' is conda, then you can preface
    // the package name by 'pip+', and the package will be installed via
    // pip (with all the conda available packages installed first,
    // followed by the pip installed packages).
    //
    // "matrix": {
    //     "numpy": ["1.6", "1.7"],
    //     "six": ["", null],        // test with and without six installed
    //     "pip+emcee": [""],   // emcee is only available for install with pip.
    // },

    // Combinations of libraries/python versions can be excluded/included
    // from the set to test. Each entry is a dictionary containing additional
    // key-value pairs to include/exclude.
    //
    // An exclude entry excludes entries where all values match. The
    // values are regexps that should match the whole string.
    //
    // An include entry adds an environment. Only the packages listed
    // are installed. The 'python' key is required. The exclude rules
    // do not apply to includes.
    //
    // In addition to package names, the following keys are available:
    //
    // - python
    //     Python version, as in the *pythons* variable above.
    // - environment_type
    //     Environment type, as above.
    // - sys_platform
    //     Platform, as in sys.platform. Possible values for the common
    //     cases: 'linux2', 'win32', 'cygwin', 'darwin'.
    //
    // "exclude": [
    //     {"python": "3.2", "sys_platform": "win32"}, // skip py3.2 on windows
    //     {"environment_type": "conda", "six": null}, // don't run without six on conda
    // ],
    //
    // "include": [
    //     // additional env for python2.7
    //     {"python": "2.7", "numpy": "1.8"},
    //     // additional env if run on windows+conda
    //     {"platform": "win32", "environment_type": "conda", "python": "2.7", "libpython": ""},
    // ],

    // The directory (relative to the current directory) that benchmarks are
    // stored in.  If not provided, defaults to "benchmarks"
    "benchmark_dir": "benchmarks",

    // The directory (relative to the current directory) to cache the Python
    // environments in.  If not provided, defaults to "env"
    "env_dir": "env",

    // The directory (relative to the current directory) that raw benchmark
    // results are stored in.  If not provided, defaults to "results".
    "results_dir": "results",

    // The directory (relative to the current directory) that the html tree
    // should be written to.  If not provided, defaults to "html".
    "html_dir": "html",

    // The number of characters to retain in the commit hashes.
    "hash_length": 8,

    // `asv` will cache results of the recent builds in each
    // environment, making them faster to install next time.  This is
    // the number of builds to keep, per environment.
    "build_cache_size": 1

    // The commits after which the regression search in `asv publish`
    // should start looking for regressions. Dictionary whose keys are
    // regexps matching to benchmark names, and values corresponding to
    // the commit (exclusive) after which to start looking for
    // regressions.  The default is to start from the first commit
    // with results. If the commit is `null`, regression detection is
    // skipped for the matching benchmark.
    //
    // "regressions_first_commits": {
    //    "some_benchmark": "352cdf",  // Consider regressions only after this commit
    //    "another_benchmark": null,   // Skip regression detection altogether
    // },

    // The thresholds for relative change in results, after which `asv
    // publish` starts reporting regressions. Dictionary of the same
    // form as in ``regressions_first_commits``, with values
    // indicating the thresholds.  If multiple entries match, the
    // maximum is taken. If no entry matches, the default is 5%.
    //
    // "regressions_thresholds": {
    //    "some_benchmark": 0.01,     // Threshold of 1%
    //    "another_benchmark": 0.5,   // Threshold of 50%
    // },
}
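
Because this config uses ``//`` comments, Python's standard `json` module will reject it as-is, even though ASV reads the file itself without trouble. If you want to inspect the settings from your own tooling, one possible approach is sketched below. The helper names `strip_line_comments` and `load_asv_conf` are hypothetical, and the sketch assumes that only ``//`` line comments need handling (no block comments, and no trailing commas in the active keys, which holds for the file above).

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments, leaving // inside double-quoted strings intact."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False      # this character was escaped; consume it
            elif ch == "\\":
                escaped = True       # next character is escaped
            elif ch == '"':
                in_string = False    # closing quote
            i += 1
        elif ch == '"':
            in_string = True         # opening quote
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to end of line; the newline itself is kept by the
            # next loop iteration so line structure is preserved.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_asv_conf(text: str) -> dict:
    """Parse commented JSON text into a dictionary."""
    return json.loads(strip_line_comments(text))
```

Tracking string state is what keeps values such as ``"https://hangar-py.readthedocs.io"`` from being truncated at the ``//`` in the URL; a naive regex over each line would cut them off.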

asv_bench/benchmarks/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
