Commit c76a433

Merge pull request #68 from man-group/cleaner-index-page
Cleaner index page
2 parents 31f8eba + 8841889 commit c76a433

31 files changed: 795 additions and 663 deletions

README.md

Lines changed: 4 additions & 1 deletion
@@ -16,7 +16,10 @@ Productionise and schedule your Jupyter Notebooks, just as interactively as you
 ![Screenshot of some notebook results](https://raw.githubusercontent.com/man-group/notebooker/master/docs/images/nbkr_results.png)
 
 ## All results are accessible from the home page
-![Screenshot of Executing a notebook](https://raw.githubusercontent.com/man-group/notebooker/master/docs/images/nbkr_homepage.png)
+![Screenshot of the Notebooker homepage](https://raw.githubusercontent.com/man-group/notebooker/master/docs/images/nbkr_homepage.png)
+
+## Drill down into each template's results
+![Screenshot of result listings](https://raw.githubusercontent.com/man-group/notebooker/master/docs/images/nbkr_results_listing.png)
 
 
 ## Getting started

docs/images/nbkr_homepage.png

100755 → 100644
194 KB
281 KB
-31.3 KB
Binary file not shown.

docs/webapp/webapp.rst

Lines changed: 14 additions & 5 deletions
@@ -2,22 +2,29 @@ The Notebooker webapp
 =====================
 
 Notebooker's primary interface is a simple webapp written to allow users to view and
-run Notebooker reports. It displays all results in a handy grid, and allows for rerunning
+run Notebooker reports. It first displays all unique template names which have ever run, and a drill-down
+view lists all results for that notebook template in a handy grid, allowing for rerunning
 and parameter tweaking.
 The entrypoint used to run Notebooks via the webapp is the
-same as the external API, so as long as you are using the same environment (e.g. within
+same as the external API; as long as you are using the same environment (e.g. within
 a docker image) you will get consistent results.
 
 
 Report dashboard
 ----------------
 The home page of the Notebooker webapp displays an overview of all reports which have recently run.
+
+.. image:: /images/nbkr_homepage.png
+   :width: 400
+   :alt: Screenshot of Notebooker webapp homepage
+
+Clicking on one of these elements will bring up an overview of all reports which have recently run.
 It is possible to view each full report by clicking "Result". It's also possible to rerun, delete, and
 copy parameters of each report in the grid.
 
-.. image:: /images/notebooker_homepage.png
+.. image:: /images/nbkr_results_listing.png
    :width: 400
-   :alt: Screenshot of Notebooker webapp homepage
+   :alt: Screenshot of Notebooker results listing
 
 
 Running a report
@@ -40,7 +47,7 @@ Running a report
 .. warning::
     In order to prevent users having to write JSON, the Override parameters box actually takes raw python statements
     and converts them into JSON. Therefore, it is strongly recommended that you run Notebooker in an environment
-    where you either completely trust all of the user base, or within a docker container
+    where you either completely trust all of the user base, or within a docker container
     where executing variable assignments will not have any negative side-effects.
 
 Customisable elements:
@@ -49,6 +56,7 @@ Customisable elements:
 * Override parameters - the values which will override the parameters in the report (in python). Can be left blank.
 * Email to - upon completion of the report, who should it be emailed to? Can be left blank.
 * Generate PDF output - whether to generate PDFs or not. Requires xelatex to be installed - see :ref:`export to pdf`
+* Hide code from email and PDF output - whether to display the notebook code when producing output emails and PDFs.
 
 Viewing results
 ---------------
@@ -67,6 +75,7 @@ If the job fails, the stack trace will be presented to allow for easier debuggin
 
 
 | If the job succeeds, the .ipynb will have been converted into HTML for viewing on this page.
+| **Please note** for user convenience, all notebook code is hidden by default.
 | You can also get to this view by clicking the blue "Result" button on the homepage.
 | If you are using a framework such as seaborn or matplotlib, the images will be available and served by the webapp.
 | If you are using plotly, you can use offline mode to store the required javascript within the HTML render,
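
Reviewer sketch for the warning in `webapp.rst` about the Override parameters box taking raw Python statements: the snippet below is one possible way to turn simple assignments into JSON without executing them. It is not Notebooker's implementation (this diff does not show that code); `overrides_to_json` is a hypothetical helper, and restricting input to literal assignments is an assumption made for illustration.

```python
import ast
import json


def overrides_to_json(source: str) -> str:
    """Convert simple `name = literal` statements into a JSON object string.

    Hypothetical helper: only plain assignments of Python literals are
    accepted, so no user-supplied code is ever executed.
    """
    overrides = {}
    for node in ast.parse(source).body:
        # Accept only `name = <expression>` with a single, simple target.
        if not (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        ):
            raise ValueError("Only simple variable assignments are allowed")
        # literal_eval raises ValueError for anything that is not a literal.
        overrides[node.targets[0].id] = ast.literal_eval(node.value)
    return json.dumps(overrides)


print(overrides_to_json('n_points = 50\ntitle = "Weekly report"'))
```

Input such as `import os` or `x = open("f")` is rejected with a `ValueError` in this sketch, which is the kind of side-effect the warning is concerned with.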

notebooker/constants.py

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@
 
 SUBMISSION_TIMEOUT = 3
 RUNNING_TIMEOUT = 60
+DEFAULT_RESULT_LIMIT = 100
 CANCEL_MESSAGE = "The webapp shut down while this job was running. Please resubmit with the same parameters."
 TEMPLATE_DIR_SEPARATOR = "^"
 DEFAULT_SERIALIZER = "PyMongoResultSerializer"

notebooker/serialization/mongo.py

Lines changed: 31 additions & 4 deletions
@@ -1,5 +1,7 @@
 import datetime
 import json
+from collections import Counter, defaultdict
+
 from abc import ABC
 from logging import getLogger
 from typing import Any, AnyStr, Dict, List, Optional, Tuple, Union, Iterator
@@ -300,20 +302,45 @@ def get_check_result(
         result = self.library.find_one({"job_id": job_id}, {"_id": 0})
         return self._convert_result(result)
 
+    def _get_raw_results(self, base_filter, projection, limit):
+        if "status" in base_filter:
+            base_filter["status"].update({"$ne": JobStatus.DELETED.value})
+        else:
+            base_filter["status"] = {"$ne": JobStatus.DELETED.value}
+        return self.library.find(base_filter, projection).sort("update_time", -1).limit(limit)
+
+    def get_count_and_latest_time_per_report(self):
+        reports = list(
+            self._get_raw_results(
+                base_filter={},
+                projection={"report_name": 1, "job_start_time": 1, "scheduler_job_id": 1, "_id": 0},
+                limit=0,
+            )
+        )
+        jobs_by_name = defaultdict(list)
+        for r in reports:
+            jobs_by_name[r["report_name"]].append(r)
+        output = {}
+        for report, all_runs in jobs_by_name.items():
+            latest_start_time = max(r["job_start_time"] for r in all_runs)
+            scheduled_runs = len([x for x in all_runs if x.get("scheduler_job_id")])
+            output[report] = {"count": len(all_runs), "latest_run": latest_start_time, "scheduler_runs": scheduled_runs}
+        return output
+
     def get_all_results(
         self,
         since: Optional[datetime.datetime] = None,
         limit: Optional[int] = 100,
         mongo_filter: Optional[Dict] = None,
         load_payload: bool = True,
     ) -> Iterator[Union[NotebookResultComplete, NotebookResultError, NotebookResultPending]]:
-        base_filter = {"status": {"$ne": JobStatus.DELETED.value}}
+        base_filter = {}
         if mongo_filter:
             base_filter.update(mongo_filter)
         if since:
             base_filter.update({"update_time": {"$gt": since}})
         projection = REMOVE_ID_PROJECTION if load_payload else REMOVE_PAYLOAD_FIELDS_AND_ID_PROJECTION
-        results = self.library.find(base_filter, projection).sort("update_time", -1).limit(limit)
+        results = self._get_raw_results(base_filter, projection, limit)
         for res in results:
             if res:
                 converted_result = self._convert_result(res, load_payload=load_payload)
@@ -404,8 +431,8 @@ def get_latest_successful_job_ids_for_name_all_params(self, report_name: str) ->
 
         return [result["job_id"] for result in results]
 
-    def n_all_results(self):
-        return self.library.find({"status": {"$ne": JobStatus.DELETED.value}}).count()
+    def n_all_results_for_report_name(self, report_name: str) -> int:
+        return self._get_raw_results({"report_name": report_name}, {}, 0).count()
 
     def delete_result(self, job_id: AnyStr) -> None:
         self.update_check_status(job_id, JobStatus.DELETED)
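
Reviewer sketch for the `mongo.py` changes: the snippet below reproduces, on plain in-memory dictionaries and without MongoDB, the two ideas added here: merging a "not deleted" clause into a caller's filter, and summarising runs per report name. The names `exclude_deleted`, `summarise_runs` and the `DELETED` placeholder value are assumptions for illustration, not part of the commit.

```python
from collections import defaultdict
from datetime import datetime

DELETED = "deleted"  # assumed placeholder for JobStatus.DELETED.value


def exclude_deleted(base_filter: dict) -> dict:
    """Return a copy of the filter that also excludes deleted jobs."""
    merged = dict(base_filter)
    status = merged.get("status")
    if "status" not in merged:
        merged["status"] = {"$ne": DELETED}
    elif isinstance(status, dict):
        # Existing operator clause, e.g. {"$in": [...]}: add $ne alongside it.
        merged["status"] = {**status, "$ne": DELETED}
    # A plain (exact-match) status value is left untouched in this sketch.
    return merged


def summarise_runs(runs: list) -> dict:
    """Group run documents by report name and summarise each group."""
    jobs_by_name = defaultdict(list)
    for run in runs:
        jobs_by_name[run["report_name"]].append(run)
    return {
        name: {
            "count": len(all_runs),
            "latest_run": max(r["job_start_time"] for r in all_runs),
            "scheduler_runs": sum(1 for r in all_runs if r.get("scheduler_job_id")),
        }
        for name, all_runs in jobs_by_name.items()
    }


runs = [
    {"report_name": "sales", "job_start_time": datetime(2021, 1, 1)},
    {"report_name": "sales", "job_start_time": datetime(2021, 1, 2), "scheduler_job_id": "s1"},
    {"report_name": "risk", "job_start_time": datetime(2020, 5, 5)},
]
print(summarise_runs(runs)["sales"]["count"])
```

Unlike `_get_raw_results` in the diff, this sketch copies the filter rather than updating the caller's dictionary in place, and it does not assume that an existing `status` entry is a dictionary.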

notebooker/utils/results.py

Lines changed: 17 additions & 2 deletions
@@ -1,7 +1,11 @@
+import datetime
+from collections import defaultdict
 from datetime import datetime as dt
 from logging import getLogger
 from typing import Callable, Dict, Iterator, List, Mapping, Optional, Tuple
 
+import babel.dates
+import inflection
 from flask import url_for
 
 from notebooker import constants
@@ -106,9 +110,10 @@ def get_all_result_keys(
     return all_keys
 
 
-def get_all_available_results_json(serializer: MongoResultSerializer, limit: int) -> List[constants.NotebookResultBase]:
+def get_all_available_results_json(serializer: MongoResultSerializer, limit: int, report_name: str = None) -> List[constants.NotebookResultBase]:
     json_output = []
-    for result in serializer.get_all_results(limit=limit, load_payload=False):
+    mongo_filter = {"report_name": report_name} if report_name is not None else {}
+    for result in serializer.get_all_results(mongo_filter=mongo_filter, limit=limit, load_payload=False):
         output = result.saveable_output()
         output["result_url"] = url_for(
             "serve_results_bp.task_results", job_id=output["job_id"], report_name=output["report_name"]
@@ -126,6 +131,16 @@ def get_all_available_results_json(serializer: MongoResultSerializer, limit: int
     return json_output
 
 
+def get_count_and_latest_time_per_report(serializer: MongoResultSerializer):
+    reports = serializer.get_count_and_latest_time_per_report()
+    output = {}
+    for report_name, metadata in sorted(reports.items(), key=lambda x: x[1]["latest_run"], reverse=True):
+        metadata["report_name"] = report_name
+        metadata["time_diff"] = babel.dates.format_timedelta(datetime.datetime.utcnow() - metadata["latest_run"])
+        output[inflection.titleize(report_name)] = metadata
+    return output
+
+
 def get_latest_successful_job_results_all_params(
     report_name: str,
     serializer: MongoResultSerializer,
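
Reviewer sketch for `get_count_and_latest_time_per_report` in `results.py`: a standard-library-only approximation of the ordering step. The helper name `order_reports_by_latest_run` is hypothetical; the raw `timedelta` stands in for `babel.dates.format_timedelta`, and a simple underscore-to-space title-casing stands in for `inflection.titleize`, so the display strings may differ from what the real libraries produce.

```python
from datetime import datetime, timedelta


def order_reports_by_latest_run(reports: dict, now: datetime) -> dict:
    """Return {display name: metadata}, most recently run report first."""
    output = {}
    for report_name, metadata in sorted(
        reports.items(), key=lambda item: item[1]["latest_run"], reverse=True
    ):
        # Copy the metadata so the caller's dictionaries are not modified.
        entry = dict(metadata, report_name=report_name)
        # Raw timedelta here; the commit formats it into a human-readable string.
        entry["time_diff"] = now - metadata["latest_run"]
        # Stand-in for inflection.titleize: underscores to spaces, then title case.
        output[report_name.replace("_", " ").title()] = entry
    return output


reports = {
    "old_report": {"count": 3, "latest_run": datetime(2021, 1, 1)},
    "new_report": {"count": 1, "latest_run": datetime(2021, 1, 3)},
}
ordered = order_reports_by_latest_run(reports, now=datetime(2021, 1, 4))
print(list(ordered))
```

Because the output is keyed by display name, two report names that map to the same title would overwrite each other in this sketch; the original `report_name` is kept inside each entry for that reason.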

notebooker/web/app.py

Lines changed: 4 additions & 1 deletion
@@ -121,7 +121,10 @@ def setup_app(flask_app: Flask, web_config: WebappConfig):
     logging.basicConfig(level=logging.getLevelName(web_config.LOGGING_LEVEL))
     flask_app.config.from_object(web_config)
     flask_app.config.update(
-        TEMPLATES_AUTO_RELOAD=web_config.DEBUG, EXPLAIN_TEMPLATE_LOADING=True, DEBUG=web_config.DEBUG
+        TEMPLATES_AUTO_RELOAD=web_config.DEBUG,
+        EXPLAIN_TEMPLATE_LOADING=True,
+        DEBUG=web_config.DEBUG,
+        TESTING=web_config.DEBUG,
     )
     flask_app = setup_scheduler(flask_app, web_config)
     return flask_app

notebooker/web/report_hunter.py

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ def _report_hunter(webapp_config: WebappConfig, run_once: bool = False, timeout:
                 JobStatus.SUBMITTED: now - datetime.timedelta(minutes=SUBMISSION_TIMEOUT),
                 JobStatus.PENDING: now - datetime.timedelta(minutes=RUNNING_TIMEOUT),
             }
+            cutoff.update({k.value: v for (k, v) in cutoff.items()})  # Add value to dict for backwards compat
             for result in all_pending:
                 this_cutoff = cutoff.get(result.status)
                 if result.job_start_time <= this_cutoff:
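
Reviewer sketch for the backwards-compatibility line in `report_hunter.py`: the snippet below shows how keying the cutoff table by both the enum member and its raw value lets a lookup succeed whichever form a stored status takes. The `JobStatus` values and the `build_cutoffs` helper are assumed stand-ins; the real enum lives in Notebooker's constants.

```python
import datetime
from enum import Enum


class JobStatus(Enum):
    # Assumed stand-in values for illustration only.
    SUBMITTED = "Submitted"
    PENDING = "Pending"


SUBMISSION_TIMEOUT = 3  # minutes, as in notebooker/constants.py
RUNNING_TIMEOUT = 60  # minutes, as in notebooker/constants.py


def build_cutoffs(now: datetime.datetime) -> dict:
    """Map each status (enum member and raw value) to its timeout cutoff."""
    cutoff = {
        JobStatus.SUBMITTED: now - datetime.timedelta(minutes=SUBMISSION_TIMEOUT),
        JobStatus.PENDING: now - datetime.timedelta(minutes=RUNNING_TIMEOUT),
    }
    # The comprehension is fully built before update() runs, so the dict is
    # not modified while it is being iterated.
    cutoff.update({status.value: when for status, when in cutoff.items()})
    return cutoff


cutoffs = build_cutoffs(datetime.datetime(2021, 1, 1, 12, 0))
print(cutoffs[JobStatus.PENDING] == cutoffs["Pending"])
```

A job whose `job_start_time` is at or before the cutoff for its status would then be treated as timed out, matching the comparison at the end of the hunk.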
