Commit 732dc05

Performance badges (#17)
This PR introduces an automated performance benchmark comparison between the master branch and the pull request branch, with the results automatically posted as a PR comment.

- **CI image** updated to `0.0.2` with Python support.
- **Performance job** in `test.yml`:
  - For PRs: downloads latest master benchmarks, runs current benchmarks, compares via `compare_benchmarks.py`, posts markdown report.
  - For master pushes: the same actions as for PRs + uploads updated master benchmark artifacts.
- **New tool**: `build/ci/compare_benchmarks.py` — parses and compares Go benchmark metrics, highlights improvements ⚡️ and regressions 💔, outputs markdown.
1 parent 55464e8 commit 732dc05

7 files changed, +299 -12 lines changed

.github/workflows/build.yml

Lines changed: 2 additions & 2 deletions

@@ -14,7 +14,7 @@ jobs:
   binary:
     runs-on: "ubuntu-latest"
     container:
-      image: "ghcr.io/tarantool/sdvg-ci:0.0.1"
+      image: "ghcr.io/tarantool/sdvg-ci:0.0.2"
     strategy:
       matrix:
         os_family: ["darwin", "linux"]

@@ -38,7 +38,7 @@ jobs:
   docker:
     runs-on: "ubuntu-latest"
     container:
-      image: "ghcr.io/tarantool/sdvg-ci:0.0.1"
+      image: "ghcr.io/tarantool/sdvg-ci:0.0.2"
     strategy:
       matrix:
         os_family: ["linux"]

.github/workflows/test.yml

Lines changed: 54 additions & 6 deletions

@@ -17,7 +17,7 @@ jobs:
   lint:
     runs-on: "ubuntu-latest"
     container:
-      image: "ghcr.io/tarantool/sdvg-ci:0.0.1"
+      image: "ghcr.io/tarantool/sdvg-ci:0.0.2"
     steps:
       - uses: "actions/checkout@v4"

@@ -34,7 +34,7 @@ jobs:
   unit:
     runs-on: "ubuntu-latest"
     container:
-      image: "ghcr.io/tarantool/sdvg-ci:0.0.1"
+      image: "ghcr.io/tarantool/sdvg-ci:0.0.2"
     steps:
       - uses: "actions/checkout@v4"

@@ -46,7 +46,7 @@ jobs:
   cover:
     runs-on: "ubuntu-latest"
     container:
-      image: "ghcr.io/tarantool/sdvg-ci:0.0.1"
+      image: "ghcr.io/tarantool/sdvg-ci:0.0.2"
     steps:
       - uses: "actions/checkout@v4"

@@ -64,10 +64,58 @@ jobs:

   performance:
     runs-on: "ubuntu-latest"
+    env:
+      BENCH_MASTER_ARTIFACT_KEY: "bench-master"
+      BENCH_MASTER_INFO_DIR: "bench-master-info"
+      BENCH_MASTER_FILE_PATH: "bench-master-info/benchmark-master.txt"
+      BENCH_MASTER_SHA_FILE_PATH: "bench-master-info/benchmark-master-sha.txt"
     container:
-      image: "ghcr.io/tarantool/sdvg-ci:0.0.1"
+      image: "ghcr.io/tarantool/sdvg-ci:0.0.2"
+
     steps:
       - uses: "actions/checkout@v4"

-      - name: "Run benchmarks"
-        run: "make test/performance | tee performance.out"
+      - name: "Download master benchmark artifact"
+        uses: "dawidd6/action-download-artifact@v11"
+        with:
+          github_token: "${{ secrets.GITHUB_TOKEN }}"
+          branch: "${{ github.event.repository.default_branch }}"
+          if_no_artifact_found: "warn"
+          allow_forks: false
+          name: "${{ env.BENCH_MASTER_ARTIFACT_KEY }}"
+          path: "${{ env.BENCH_MASTER_INFO_DIR }}"
+
+      - name: "Run benchmarks on current branch"
+        run: "make test/performance | tee benchmark.txt; exit ${PIPESTATUS[0]}"
+
+      - name: "Make comparison report"
+        run: |
+          python ./build/ci/compare_benchmarks.py \
+            --old-commit-sha-path "$BENCH_MASTER_SHA_FILE_PATH" \
+            "$BENCH_MASTER_FILE_PATH" \
+            benchmark.txt \
+            >> performance-report.md
+
+          cat performance-report.md >> $GITHUB_STEP_SUMMARY
+
+      - uses: "mshick/add-pr-comment@v2"
+        if: "${{ github.event_name == 'pull_request' }}"
+        with:
+          message-path: "performance-report.md"
+          message-id: "perf-report-pr-${{ github.event.pull_request.number }}"
+          refresh-message-position: true
+
+      - name: "Prepare master benchmark info for uploading as artifact"
+        if: "${{ github.ref_name == github.event.repository.default_branch }}"
+        run: |
+          mkdir -p ${{ env.BENCH_MASTER_INFO_DIR }}
+          mv benchmark.txt "${{ env.BENCH_MASTER_FILE_PATH }}"
+          echo "${GITHUB_SHA:0:7}" > ${{ env.BENCH_MASTER_SHA_FILE_PATH }}
+
+      - name: "Upload master benchmark artifact"
+        if: "${{ github.ref_name == github.event.repository.default_branch }}"
+        uses: "actions/upload-artifact@v4"
+        with:
+          name: "${{ env.BENCH_MASTER_ARTIFACT_KEY }}"
+          path: "${{ env.BENCH_MASTER_INFO_DIR }}"
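
The report step can also be reproduced outside CI with the script's own helpers. A minimal sketch, assuming the two benchmark files produced by the steps above are present and `build/ci` is importable with its requirements installed:

```python
# Local equivalent of the "Make comparison report" step (a sketch, not part
# of the workflow). File names mirror the paths used in the job above.
from compare_benchmarks import aggregate_results, compare_benchmarks_df, parse_metrics_file

old = aggregate_results(parse_metrics_file("bench-master-info/benchmark-master.txt"), "mean")
new = aggregate_results(parse_metrics_file("benchmark.txt"), "mean")

# Prints the markdown table that CI appends to performance-report.md.
print(compare_benchmarks_df(old, new, alert_threshold=7))
```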

Makefile

Lines changed: 1 addition & 1 deletion

@@ -20,7 +20,7 @@ test/cover:
 	go tool cover -html=coverage.out -o coverage.html

 test/performance:
-	go test -run=^$$ -bench=. -cpu 4 ./...
+	go test -run=^$$ -bench=. -count=2 -cpu 4 ./...

 include ./build/package/Makefile
 include ./build/ci/Makefile
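
The new `-count=2` runs every benchmark twice, so each benchmark name appears twice in the `go test` output; `compare_benchmarks.py` (added below) collapses the repeated runs with a mean or median before comparing. A minimal sketch of that aggregation, with illustrative numbers:

```python
import statistics

# Two runs of the same benchmark, as produced by `-count=2`.
# The numbers are illustrative, not real measurements.
runs = {
    "Partitioning (cpu=4)": {
        "MB/s": [218.73, 221.10],
        "values/s": [16_825_587, 16_900_000],
    },
}

aggregated = {
    name: {metric: statistics.mean(values) for metric, values in metrics.items()}
    for name, metrics in runs.items()
}
print(aggregated)
```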

build/ci/Dockerfile

Lines changed: 12 additions & 2 deletions

@@ -5,7 +5,15 @@ WORKDIR /tmp
 # Install dependencies

 RUN apk update \
-    && apk add --update --no-cache bash curl git make gcc musl-dev docker
+    && apk add --update --no-cache bash curl git make gcc musl-dev docker tar
+
+# Configure python
+
+RUN apk add --no-cache python3 py3-pip \
+    && python3 -m venv /venv \
+    && /venv/bin/pip install --upgrade pip setuptools wheel
+
+ENV PATH="/venv/bin:$PATH"

 # Configure Go

@@ -32,6 +40,8 @@ WORKDIR /sdvg

 COPY ./go.mod ./go.mod
 COPY ./go.sum ./go.sum
+COPY ./build/ci/requirements.txt ./requirements.txt

 RUN git config --global --add safe.directory /sdvg \
-    && go mod download
+    && go mod download \
+    && pip install -r ./requirements.txt
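
With the venv's `bin` directory prepended to `PATH`, the bare `python` and `pip` commands in the image resolve to the virtualenv where the report dependencies are installed. A quick sanity check one could run inside the container (a sketch, not part of CI):

```python
import shutil
import sys

import pandas
import tabulate

# ENV PATH="/venv/bin:$PATH" makes the bare "python" command resolve to the
# virtualenv interpreter; the pinned report dependencies import from there.
print(shutil.which("python"))  # expected: /venv/bin/python
print(sys.executable)
print(pandas.__version__, tabulate.__version__)
```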

build/ci/Makefile

Lines changed: 1 addition & 1 deletion

@@ -1,6 +1,6 @@
 # Arguments

-ci_image = ghcr.io/tarantool/sdvg-ci:0.0.1
+ci_image = ghcr.io/tarantool/sdvg-ci:0.0.2

 # Targets

build/ci/compare_benchmarks.py

Lines changed: 227 additions & 0 deletions (new file)

@@ -0,0 +1,227 @@

import argparse
import re
import statistics
import os
import textwrap
from collections import OrderedDict
from typing import Dict, Tuple, List, Literal, Optional

import pandas as pd

METRICS = {
    'MB/s': {'name': 'B/s', 'good_direction': 'up', 'scale': 2 ** 20},
    'values/s': {'good_direction': 'up'},
    # 'ns/op': {'name': 's/op', 'good_direction': 'down', 'scale': 1e-9},
    # 'rows/s': {'good_direction': 'up'},
}

EMOJIS = {
    'good': '⚡️',
    'bad': '💔'
}


def format_benchmark_name(name: str) -> str:
    name = name.replace("Benchmark", "")
    name = name.replace("/CI/", "/")

    parts = name.split("/")
    if len(parts) == 1:
        return parts[0]

    base_name = " ".join(parts[:-1])
    params_split = parts[-1].split("-")

    params = []
    for i in range(0, len(params_split) - 1, 2):
        params.append(f"{params_split[i]}={params_split[i + 1]}")

    if params:
        return f"{base_name} ({', '.join(params)})"
    else:
        return base_name


def parse_bench_line(line: str) -> Tuple[Optional[str], Optional[Dict[str, float]]]:
    """Parses `go test -bench` results output.
    Example:

    BenchmarkPartitioning/CI/cpu-4  2569041  475.5 ns/op  218.73 MB/s  8412793 rows/s  16825587 values/s

    result:
    ('Partitioning (cpu=4)', {'ns/op': 475.5, 'MB/s': 218.73, 'rows/s': 8412793, 'values/s': 16825587})
    """

    parts = re.split(r'\s+', line.strip())
    if len(parts) < 3 or not parts[0].startswith("Benchmark") or "/CI/" not in parts[0]:
        return None, None

    bench_name = format_benchmark_name(parts[0])

    metrics = {}
    for value, metric in zip(parts[2::2], parts[3::2]):
        if metric not in METRICS:
            continue
        try:
            metrics[metric] = float(value)
        except ValueError:
            raise ValueError(f"Failed to parse value '{value}' for '{metric}'")

    return bench_name, metrics


def parse_metrics_file(path: str) -> Dict[str, Dict[str, List[float]]]:
    results = {}

    with open(path) as f:
        for line in f:
            name_test, metrics = parse_bench_line(line)
            if name_test is None:
                continue

            if not metrics:
                continue

            if name_test not in results:
                results[name_test] = {m: [] for m in METRICS.keys()}

            for metric_name, value in metrics.items():
                results[name_test][metric_name].append(value)

    return results


def aggregate_results(
    parsed_metrics: Dict[str, Dict[str, List[float]]],
    method: Literal["mean", "median"]
) -> OrderedDict[str, Dict[str, float]]:
    aggregated: OrderedDict[str, Dict[str, float]] = OrderedDict()

    for bench_name, metrics in parsed_metrics.items():
        aggregated[bench_name] = {}

        for m, values in metrics.items():
            if method == "median":
                aggregated[bench_name][m] = statistics.median(values)
            elif method == "mean":
                aggregated[bench_name][m] = statistics.mean(values)

    return aggregated


def humanize_number(val: float, scale: float) -> str:
    if val is None:
        return "?"

    val = val * scale
    abs_val = abs(val)
    if abs_val >= 1_000_000:
        return f"{val / 1_000_000:.2f}M"
    elif abs_val >= 1_000:
        return f"{val / 1_000:.2f}K"
    else:
        return f"{val:.2f}"


def format_metric_changes(metric_name: str, old_val, new_val: Optional[float], alert_threshold: float) -> str:
    old_val_str = humanize_number(old_val, METRICS[metric_name].get('scale', 1))
    new_val_str = humanize_number(new_val, METRICS[metric_name].get('scale', 1))

    if old_val is None or new_val is None:
        suffix = " ⚠️"
    else:
        change_pct = (new_val / old_val - 1) * 100
        suffix = f" ({change_pct:+.2f}%)"

        if abs(change_pct) >= alert_threshold:
            is_better = METRICS[metric_name].get('good_direction') == 'up' and change_pct > 0
            suffix += f" {EMOJIS['good'] if is_better else EMOJIS['bad']}"

    return f"{old_val_str}{new_val_str}{suffix}"


def compare_benchmarks_df(old_metrics, new_metrics, alert_threshold=None):
    if old_metrics is None:
        old_metrics = {}

    if new_metrics is None:
        new_metrics = {}

    all_metrics = OrderedDict()
    all_metrics.update(old_metrics)
    all_metrics.update(new_metrics)

    df = pd.DataFrame(columns=["Benchmark"] + [v.get('name', k) for k, v in METRICS.items()])

    for bench_name in all_metrics.keys():
        row = {"Benchmark": bench_name}

        for metric_name, metric_params in METRICS.items():
            old_val = old_metrics.get(bench_name, {}).get(metric_name, None)
            new_val = new_metrics.get(bench_name, {}).get(metric_name, None)
            row[metric_params.get('name', metric_name)] = format_metric_changes(
                metric_name, old_val, new_val, alert_threshold
            )

        df.loc[len(df)] = row

    return df.to_markdown(index=False)


def build_report_header(old_file, sha_file: str) -> str:
    event_name = os.environ.get("GITHUB_EVENT_NAME", "")
    base_branch = os.environ.get("GITHUB_DEFAULT_BRANCH", "master")

    warning = ""
    if not os.path.exists(old_file):
        warning = textwrap.dedent("""
            > [!WARNING]
            > No test results found for master branch. Please run workflow on master first to compare results.
        """).strip()

    if event_name == "pull_request":
        pr_branch = os.environ.get("GITHUB_HEAD_REF", "")
        header_ending = f"`{pr_branch}`" if not os.path.exists(old_file) else f"`{base_branch}` VS `{pr_branch}`"
    else:
        if not os.path.exists(old_file):
            header_ending = f"`{base_branch}`"
        else:
            prev_master_sha = "(sha not found)"
            if sha_file and os.path.exists(sha_file):
                with open(sha_file) as f:
                    prev_master_sha = f.read().strip()

            commit_sha = os.environ.get("GITHUB_SHA", "")[:7]
            header_ending = f"`{base_branch} {prev_master_sha}` VS `{base_branch} {commit_sha}`"

    header = f"# Perf tests report: {header_ending}\n"
    return f"{warning}\n\n{header}" if warning else header


def main():
    parser = argparse.ArgumentParser(description="Compare go test -bench results in markdown format")
    parser.add_argument(
        "--alert-threshold", type=float, default=7,
        help="Percent change threshold for adding emoji alerts"
    )
    parser.add_argument(
        "--aggregation", choices=["mean", "median"], default="mean",
        help="Aggregation method for multiple runs of the same benchmark"
    )
    parser.add_argument("--old-commit-sha-path", help="Path to file with sha commit of the old benchmark")
    parser.add_argument("old_file", help="Path to old benchmark results file", nargs='?', default="")
    parser.add_argument("new_file", help="Path to new benchmark results file")
    args = parser.parse_args()

    old_metrics = None
    if args.old_file and os.path.exists(args.old_file):
        old_metrics = aggregate_results(parse_metrics_file(args.old_file), args.aggregation)

    new_metrics = aggregate_results(parse_metrics_file(args.new_file), args.aggregation)

    print(build_report_header(args.old_file, args.old_commit_sha_path))
    print(compare_benchmarks_df(old_metrics, new_metrics, alert_threshold=args.alert_threshold))


if __name__ == "__main__":
    main()
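
The parsing helpers can be exercised directly on the sample line from the docstring: only benchmarks whose name contains `/CI/` are picked up, and only the metrics enabled in `METRICS` survive. A quick illustration, run with `build/ci` on the path and its requirements installed; expected results are inferred from the code above:

```python
from compare_benchmarks import format_benchmark_name, parse_bench_line

line = ("BenchmarkPartitioning/CI/cpu-4  2569041  475.5 ns/op  "
        "218.73 MB/s  8412793 rows/s  16825587 values/s")

print(format_benchmark_name("BenchmarkPartitioning/CI/cpu-4"))
# Partitioning (cpu=4)

print(parse_bench_line(line))
# ('Partitioning (cpu=4)', {'MB/s': 218.73, 'values/s': 16825587.0})
# ns/op and rows/s are dropped because they are commented out in METRICS.
```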

build/ci/requirements.txt

Lines changed: 2 additions & 0 deletions (new file)

@@ -0,0 +1,2 @@

pandas==2.3.1
tabulate==0.9.0
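
`tabulate` is pinned alongside `pandas` because `pandas.DataFrame.to_markdown()`, which the comparison script uses to render the report table, delegates to it. A minimal check of the pinned pair (cell values are illustrative):

```python
import pandas as pd

# DataFrame.to_markdown() requires the tabulate package.
df = pd.DataFrame({"Benchmark": ["Partitioning (cpu=4)"], "values/s": ["16.83M (+0.44%)"]})
print(df.to_markdown(index=False))
```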

0 commit comments
