96 commits
16a5c3f
Create branch for UPenn Lib rework of CDH code
emeryr-upenn Jul 11, 2025
48a2c31
Don't track tmp dir
emeryr-upenn Jul 11, 2025
829c424
Remove custom PU pagess
emeryr-upenn Jul 11, 2025
ba72b7d
Remove custom login page
emeryr-upenn Jul 11, 2025
7503f7b
Remove pucus dependency
emeryr-upenn Jul 11, 2025
85bca3f
Confirm htr2hpc tasks not used
emeryr-upenn Jul 11, 2025
4a937bb
Penn SAS GPC-specifc and version updates
emeryr-upenn Jul 30, 2025
6accdbc
Add Penn Library notes
emeryr-upenn Jul 31, 2025
ea6e769
Fix module for conda loading
emeryr-upenn Jul 31, 2025
cb63764
fix readme bullets
emeryr-upenn Jul 31, 2025
897d223
Rough updated draft
emeryr-upenn Aug 1, 2025
7eb3274
Specify slurm qos and partition
emeryr-upenn Aug 1, 2025
a830a0b
Proof and update Penn Readme
emeryr-upenn Aug 1, 2025
04fa953
Further Penn README edits
emeryr-upenn Aug 1, 2025
be595d8
Further README tweaks
emeryr-upenn Aug 1, 2025
6cd2b59
Update working dir and command for SAS GPC2
emeryr-upenn Aug 6, 2025
bfac0ac
Move raise for preventing remote execution
emeryr-upenn Aug 6, 2025
d634c2b
Update code changes README
emeryr-upenn Aug 6, 2025
b68bfde
Remove debugging excecptions
emeryr-upenn Aug 7, 2025
fe91005
Add docker deployment files.
emeryr-upenn Aug 7, 2025
40599e1
Update local_settings.py
emeryr-upenn Aug 7, 2025
73fbc9f
Don't ignore local_settings.py
emeryr-upenn Aug 7, 2025
88ceccb
Update docker readme
emeryr-upenn Aug 8, 2025
f87ea69
Have django use escriptorium.local_settings
emeryr-upenn Aug 8, 2025
0146602
Dev Dockerfile to build using local files
emeryr-upenn Aug 8, 2025
4d76eb2
Restoring required templates
emeryr-upenn Aug 8, 2025
d035d34
Remove htr2hpc user migration
emeryr-upenn Aug 12, 2025
a801e2b
Change requirements handling for portainer
emeryr-upenn Aug 13, 2025
c872b0c
Add portainer docker compose file
emeryr-upenn Aug 13, 2025
9a83ef8
Add docker-compose network; change uwsgi port
emeryr-upenn Aug 13, 2025
2ec4cad
Add our uwsgi.ini to image
emeryr-upenn Aug 13, 2025
c851c90
specify 0.0.0.0:8899 for uwsgi http option
emeryr-upenn Aug 13, 2025
0bee884
add network to docker compose
emeryr-upenn Aug 13, 2025
8e3c953
Use network name:; expose web port on portainer
emeryr-upenn Aug 14, 2025
305e0f4
Tweak uwsgi.ini to manage worker memory
emeryr-upenn Aug 14, 2025
50f4329
Increase uwsgi buffer-size to prevent invalid request
emeryr-upenn Aug 14, 2025
a6cf55d
Make image tags configurable
emeryr-upenn Aug 14, 2025
00f2290
Increment version
emeryr-upenn Aug 18, 2025
4f55216
Use slurm partion with gpus
emeryr-upenn Aug 19, 2025
94462c4
Bump version to 0.5.2
emeryr-upenn Aug 19, 2025
76dfec2
Add htr2hpc-train --log-level
emeryr-upenn Aug 19, 2025
f780e83
Bump version to 0.5.3
emeryr-upenn Aug 19, 2025
72d4075
Add usage, GPU partition notes
emeryr-upenn Aug 19, 2025
e69befa
Remove log_level opt not used by TrainingManager
emeryr-upenn Aug 19, 2025
8ad26f5
Change to ed2551 ssh key
emeryr-upenn Aug 19, 2025
9e4dedf
Remove OMP_NUM_THREADS; breaks celery-main
emeryr-upenn Aug 19, 2025
e5e009d
Bump version to 0.5.4
emeryr-upenn Aug 19, 2025
ec7b97f
Update htr2hpc-train install instructions
emeryr-upenn Aug 19, 2025
a4ea8da
Remove squeue flag not working on SAS gpc2
emeryr-upenn Aug 19, 2025
7c75040
Bump version to 0.5.5
emeryr-upenn Aug 19, 2025
8012214
Fix get_model_accuracy reference
emeryr-upenn Aug 19, 2025
9988837
Bump version to 0.5.6
emeryr-upenn Aug 19, 2025
d761afe
HPC ssh changes; add site-wide ssh user config
emeryr-upenn Aug 19, 2025
a3df923
Use sacct to get job statistics
emeryr-upenn Aug 21, 2025
943f306
Bump version to 0.5.7
emeryr-upenn Aug 21, 2025
7008bce
Consolidate dev docker config; update docs
emeryr-upenn Aug 22, 2025
5a41e18
Remover unused Nginx ssl files
emeryr-upenn Aug 22, 2025
d7c4850
Update docker readme for portainer
emeryr-upenn Aug 22, 2025
5cc3666
Update UPenn changes readme
emeryr-upenn Aug 22, 2025
9c71199
Claude AI supported revision of UPenn README.
emeryr-upenn Aug 25, 2025
cfcf7c1
Correct uwsgi.ini comment
emeryr-upenn Aug 25, 2025
8c10774
Increase wsgi work max size
emeryr-upenn Aug 25, 2025
26a1d67
Increment version to 0.5.8
emeryr-upenn Aug 25, 2025
d1ea460
Test Dockerfile with eScriptorium dev-0.14.2
emeryr-upenn Aug 26, 2025
c98f12e
Update for dev eScriptorium build
emeryr-upenn Aug 26, 2025
f5f32ef
Test develop branch Dockerfile for eScriptorium
emeryr-upenn Aug 27, 2025
3d191be
Change from http to socket mode
Aug 27, 2025
5e4c7ba
Clean up uwsgi.ini [formatting
emeryr-upenn Aug 27, 2025
ae2e74a
Integrate eScriptorium build from develop branch
emeryr-upenn Aug 27, 2025
0b230dc
Temporarily restore uwsgi http to set fd limit
emeryr-upenn Aug 27, 2025
97d0efe
Going back to uwsgi socket mode
emeryr-upenn Aug 27, 2025
29d3561
Remove custom PU CDH profile.html
emeryr-upenn Aug 29, 2025
79571ef
Clean up uwsgi handling
emeryr-upenn Sep 2, 2025
d6271e6
bump mem per CPU for training
emeryr-upenn Sep 17, 2025
9d40766
Restore original mem_per_cpu calculation
emeryr-upenn Sep 18, 2025
d1c5756
Add loglevel to htr2hpc-train for DEBUG
emeryr-upenn Sep 18, 2025
80760ed
Set django app loglevel to DEBUG
emeryr-upenn Sep 18, 2025
15090ea
Don't clean up htr jobs when debugging
emeryr-upenn Sep 18, 2025
dacaedd
User more flexible api url matching
emeryr-upenn Sep 18, 2025
ba9a667
Restore api URL handling; force Django HTTPS
emeryr-upenn Sep 19, 2025
e97c07c
Force API to return HTTPS URLs; 2nd attempt
emeryr-upenn Sep 19, 2025
17301f7
Restore previous SSL config
emeryr-upenn Sep 19, 2025
32434bb
Force API to return HTTPS URI; attempt 3
emeryr-upenn Sep 19, 2025
5a82f41
Force API to return HTTPS URLs; attempt 4
emeryr-upenn Sep 19, 2025
6cce2ce
Force API to return https URLs; attempt 5
emeryr-upenn Sep 19, 2025
db035a1
Don't update non-existent task report
emeryr-upenn Sep 19, 2025
be841e9
Update built image names in dev docker-compose.yml
emeryr-upenn Sep 20, 2025
38ff67d
Make CSRF_TRUSTED_ORIGINS, IIIF_IMPORT_QUALITY configurable
emeryr-upenn Sep 20, 2025
0b3269c
Update example and portainer files
emeryr-upenn Sep 20, 2025
9b6f5ff
Bump to version 0.5.9
emeryr-upenn Sep 20, 2025
1d9c15a
Make ketos workers configurable
emeryr-upenn Sep 21, 2025
f4c8a46
Bump version to 0.5.10
emeryr-upenn Sep 21, 2025
09fc51f
Bump segtrain memory allotment
emeryr-upenn Sep 22, 2025
6666d8e
Encapsulate Slurm job stats in SlurmJobStats
emeryr-upenn Sep 26, 2025
9722154
Fix broken job_stats method calls
emeryr-upenn Sep 26, 2025
44c73aa
Test full duration recog training with time limit
emeryr-upenn Oct 7, 2025
4 changes: 3 additions & 1 deletion .gitignore
@@ -57,7 +57,6 @@ cover/

# Django stuff:
*.log
-local_settings.py
db.sqlite3
db.sqlite3-journal

@@ -158,3 +157,6 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
+/tmp/
+variables.env
+/ssh
92 changes: 92 additions & 0 deletions Dockerfile
@@ -0,0 +1,92 @@
FROM docker.io/library/node:12-alpine AS frontend

RUN apk update && apk add git

ENV ESCRIPTORIUM_SRC=/escriptorium-src
RUN git clone https://gitlab.com/scripta/escriptorium.git ${ESCRIPTORIUM_SRC} && \
cd ${ESCRIPTORIUM_SRC} && \
git checkout develop

RUN cp -r ${ESCRIPTORIUM_SRC}/front /build
WORKDIR /build
RUN npm ci && npm run production

# Pull official base image
FROM registry.gitlab.com/scripta/escriptorium/base:kraken529 AS escriptorium

# try to autodetect number of cpus available
# ENV NGINX_WORKER_PROCESSES auto

ARG VERSION_DATE="passthistobuildcmd"
ENV VERSION_DATE=$VERSION_DATE
ENV FRONTEND_DIR=/usr/src/app/front
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8

ENV ESCRIPTORIUM_SRC=/escriptorium-src
COPY --from=frontend ${ESCRIPTORIUM_SRC} ${ESCRIPTORIUM_SRC}

# set work directory
WORKDIR /usr/src/app

RUN cp ${ESCRIPTORIUM_SRC}/app/entrypoint.sh /usr/src/app/entrypoint.sh && \
cp ${ESCRIPTORIUM_SRC}/app/manage.py /usr/src/app/manage.py && \
cp ${ESCRIPTORIUM_SRC}/app/requirements.txt /usr/src/app/requirements.txt && \
cp ${ESCRIPTORIUM_SRC}/app/uwsgi.ini /usr/src/app/uwsgi.ini && \
cp -r ${ESCRIPTORIUM_SRC}/app/apps /usr/src/app/apps && \
cp -r ${ESCRIPTORIUM_SRC}/app/escriptorium /usr/src/app/escriptorium && \
cp -r ${ESCRIPTORIUM_SRC}/app/locale /usr/src/app/locale && \
cp -r ${ESCRIPTORIUM_SRC}/app/homepage /usr/src/app/homepage && \
rm -rf ${ESCRIPTORIUM_SRC}
COPY --from=frontend /build/dist /usr/src/app/front

WORKDIR /usr/src/app

COPY ./escriptorium/local_settings.py /usr/src/app/escriptorium/local_settings.py
RUN chmod 644 /usr/src/app/escriptorium/local_settings.py

# We want to replicate PU CDH's Ansible tasks for eScriptorium:
#
# https://github.com/Princeton-CDH/cdh-ansible/blob/013fd75dfa9c857d025b97b02c95e2072166264a/roles/escriptorium_setup/tasks/main.yml
#
# They ensure eScriptorium will use the htr2hpc module for model and segmentation training. Specifically, they:
#
# 1. rename the train and segtrain functions in tasks.py to es_train and es_segtrain
# 2. import segtrain and train functions from htr2hpc.tasks

# rename the train and segtrain functions in tasks.py
ENV TASKS_FILE=/usr/src/app/apps/core/tasks.py
RUN sed -E -i 's/^( *)def segtrain/\1def es_segtrain/' ${TASKS_FILE}
RUN sed -E -i 's/^( *)def train/\1def es_train/' ${TASKS_FILE}

# Import the functions from the htr2hpc.tasks module just above "@shared_task...\ndef es_segtrain..."
RUN line_number=$(($(grep -n "^ *def es_segtrain" ${TASKS_FILE} | cut -d: -f1) - 1)) && \
echo "${line_number}" | grep -q "^[0-9][0-9]*$" && \
sed -i "${line_number}i from htr2hpc.tasks import segtrain, train" ${TASKS_FILE} && \
sed -i "${line_number}i # EDITED BY pennlib-escritorium Dockerfile" ${TASKS_FILE}

# - name: Expose read-write training accuracy model field in API
# see: the ansible task referenced above
ENV SERIALIZERS_PY=/usr/src/app/apps/api/serializers.py
RUN sed -E -i "s/'accuracy_percent', 'rights',/'accuracy_percent', 'training_accuracy', 'rights',/" ${SERIALIZERS_PY}

# Add htr2hpc to requirements.txt and run `pip install`
# for local development just add this project as ./
RUN mkdir /htr2hpc
COPY src/ /htr2hpc/src/
RUN ls /htr2hpc/
COPY pyproject.toml /htr2hpc/
COPY README.md /htr2hpc/
RUN echo >> requirements.txt
RUN echo '/htr2hpc/' >> requirements.txt
RUN pip --no-cache-dir install --root-user-action ignore -r requirements.txt

# Change the django port; configure processes
COPY ./escriptorium/uwsgi.ini /usr/src/app/
RUN chmod 644 /usr/src/app/uwsgi.ini

# update entry point to set the site based on ESCRIPTORIUM_HOST
COPY ./escriptorium/entrypoint.sh /usr/src/app/
RUN chmod 755 /usr/src/app/entrypoint.sh

ENTRYPOINT ["/usr/src/app/entrypoint.sh"]
88 changes: 88 additions & 0 deletions Dockerfile.portainer
@@ -0,0 +1,88 @@
FROM docker.io/library/node:12-alpine AS frontend

RUN apk update && apk add git

ENV ESCRIPTORIUM_SRC=/escriptorium-src
RUN git clone https://gitlab.com/scripta/escriptorium.git ${ESCRIPTORIUM_SRC} && \
cd ${ESCRIPTORIUM_SRC} && \
git checkout tags/dev-0.14.2

RUN cp -r ${ESCRIPTORIUM_SRC}/front /build
WORKDIR /build
RUN npm ci && npm run production

# Pull official base image
FROM registry.gitlab.com/scripta/escriptorium/base:kraken529 AS escriptorium

# try to autodetect number of cpus available
# ENV NGINX_WORKER_PROCESSES auto

ARG VERSION_DATE="passthistobuildcmd"
ENV VERSION_DATE=$VERSION_DATE
ENV FRONTEND_DIR=/usr/src/app/front
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8

ENV ESCRIPTORIUM_SRC=/escriptorium-src
COPY --from=frontend ${ESCRIPTORIUM_SRC} ${ESCRIPTORIUM_SRC}

# set work directory
WORKDIR /usr/src/app

RUN cp ${ESCRIPTORIUM_SRC}/app/entrypoint.sh /usr/src/app/entrypoint.sh && \
cp ${ESCRIPTORIUM_SRC}/app/manage.py /usr/src/app/manage.py && \
cp ${ESCRIPTORIUM_SRC}/app/requirements.txt /usr/src/app/requirements.txt && \
cp ${ESCRIPTORIUM_SRC}/app/uwsgi.ini /usr/src/app/uwsgi.ini && \
cp -r ${ESCRIPTORIUM_SRC}/app/apps /usr/src/app/apps && \
cp -r ${ESCRIPTORIUM_SRC}/app/escriptorium /usr/src/app/escriptorium && \
cp -r ${ESCRIPTORIUM_SRC}/app/locale /usr/src/app/locale && \
cp -r ${ESCRIPTORIUM_SRC}/app/homepage /usr/src/app/homepage && \
rm -rf ${ESCRIPTORIUM_SRC}
COPY --from=frontend /build/dist /usr/src/app/front

WORKDIR /usr/src/app

COPY local_settings.py /usr/src/app/escriptorium/local_settings.py
RUN chmod 644 /usr/src/app/escriptorium/local_settings.py

# We want to replicate PU CDH's Ansible tasks for eScriptorium:
#
# https://github.com/Princeton-CDH/cdh-ansible/blob/013fd75dfa9c857d025b97b02c95e2072166264a/roles/escriptorium_setup/tasks/main.yml
#
# They ensure eScriptorium will use the htr2hpc module for model and segmentation training. Specifically, they:
#
# 1. rename the train and segtrain functions in tasks.py to es_train and es_segtrain
# 2. import segtrain and train functions from htr2hpc.tasks

# rename the train and segtrain functions in tasks.py
ENV TASKS_FILE=/usr/src/app/apps/core/tasks.py
RUN sed -E -i 's/^( *)def segtrain/\1def es_segtrain/' ${TASKS_FILE}
RUN sed -E -i 's/^( *)def train/\1def es_train/' ${TASKS_FILE}

# Import the functions from the htr2hpc.tasks module just above "@shared_task...\ndef es_segtrain..."
RUN line_number=$(($(grep -n "^ *def es_segtrain" ${TASKS_FILE} | cut -d: -f1) - 1)) && \
echo "${line_number}" | grep -q "^[0-9][0-9]*$" && \
sed -i "${line_number}i from htr2hpc.tasks import segtrain, train" ${TASKS_FILE} && \
sed -i "${line_number}i # EDITED BY pennlib-escritorium Dockerfile" ${TASKS_FILE}

# - name: Expose read-write training accuracy model field in API
# see: the ansible task referenced above
ENV SERIALIZERS_PY=/usr/src/app/apps/api/serializers.py
RUN sed -E -i "s/'accuracy_percent', 'rights',/'accuracy_percent', 'training_accuracy', 'rights',/" ${SERIALIZERS_PY}

# Add htr2hpc to requirements.txt and run `pip install`
RUN cp requirements.txt requirements.txt.bak
COPY extra_requirements.txt /usr/src/app/extra_requirements.txt
RUN cat requirements.txt.bak extra_requirements.txt | sort | uniq > requirements.txt
RUN rm requirements.txt.bak extra_requirements.txt
RUN pip --no-cache-dir install --root-user-action ignore -r requirements.txt

# Change the django port; configure workers
COPY uwsgi.ini /usr/src/app/
RUN chmod 644 /usr/src/app/uwsgi.ini

# update entry point to set the site based on ESCRIPTORIUM_HOST
COPY entrypoint.sh /usr/src/app/
RUN chmod 755 /usr/src/app/entrypoint.sh

ENTRYPOINT ["/usr/src/app/entrypoint.sh"]
2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@ The project goal is integrating the [eScriptorium handwritten text recognition (
This package can be installed directly from GitHub using `pip`:

```console
-pip install git+https://github.com/Princeton-CDH/htr2hpc.git@main#egg=htr2hpc
+pip install git+https://github.com/upenn-libraries/htr2hpc.git@main#egg=htr2hpc
```

[`pucas`](https://github.com/Princeton-CDH/django-pucas) is a dependency of this package and will be included when you install this package.
177 changes: 177 additions & 0 deletions README_Docker.md
@@ -0,0 +1,177 @@
# htr2hpc docker

Penn Libraries' htr2hpc provides Dockerfile and docker-compose files for development and portainer deployments of eScriptorium with htr2hpc.

The Dockerfile and docker compose configurations automate the installation steps in the [README](README.md) and [Princeton CDH's Ansible playbook](https://github.com/Princeton-CDH/cdh-ansible/blob/main/playbooks/escriptorium.yml).

This file provides instructions for docker deployment in development and on portainer.

## GPC installation for development and portainer deployment

HTR2HPC is both a django application that integrates with eScriptorium and a command-line application `htr2hpc-train` that is run on the HPC cluster. Both pieces must be installed and operating. For instructions on installing `htr2hpc-train` on the GPC see the [training README](src/htr2hpc/train/README.md).

## Development deployment

Copy `variables.env_example` to `variables.env` and edit it to match your environment. See "Configuration variables" for more information.

```bash
cp variables.env_example variables.env
```

Create an SSH key.

```bash
$ mkdir ssh
$ ssh-keygen -N "" -f ./ssh/htr2hpc_ed25519 -t ed25519
$ ls ssh
htr2hpc_ed25519 htr2hpc_ed25519.pub
```

Add the `./ssh/htr2hpc_ed25519.pub` public key to the `${HOME}/.ssh/authorized_keys` file on the HPC cluster. See "SSH key authentication" below for more information.
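As a rough sketch of what "adding the public key" involves on the cluster side, the helper below appends a key to an `authorized_keys` file only if it is not already present and tightens the permissions that OpenSSH usually expects. The function name `add_authorized_key`, the temporary paths, and the fake key string are assumptions for illustration only; on the real cluster the target would be `${HOME}/.ssh/authorized_keys`.

```bash
# Hypothetical helper: append a public key to authorized_keys exactly once.
add_authorized_key() {
    pubkey_file="$1"
    auth_file="$2"
    auth_dir="$(dirname "$auth_file")"
    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"
    # -x matches whole lines, -F treats the key as a fixed string, so only an
    # exact copy of the key counts as "already installed".
    grep -qxF -- "$(cat "$pubkey_file")" "$auth_file" || cat "$pubkey_file" >> "$auth_file"
    chmod 600 "$auth_file"
}

# Demo on throwaway files; the key below is a made-up placeholder, not a real key.
demo_dir="$(mktemp -d)"
echo 'ssh-ed25519 AAAAexamplekeydata htr2hpc-demo' > "$demo_dir/htr2hpc_ed25519.pub"
add_authorized_key "$demo_dir/htr2hpc_ed25519.pub" "$demo_dir/.ssh/authorized_keys"
add_authorized_key "$demo_dir/htr2hpc_ed25519.pub" "$demo_dir/.ssh/authorized_keys"
cat "$demo_dir/.ssh/authorized_keys"
```

Where `ssh-copy-id` is available and password login to the cluster is allowed, `ssh-copy-id -i ./ssh/htr2hpc_ed25519.pub <user>@<hpc-host>` should achieve the same result in one step.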

Then build and run:

```bash
docker compose build --no-cache
docker compose up
# or, if you don't want to see the logs:
# docker compose up -d
```

eScriptorium should be available at http://localhost:8080. Use the admin username and password from `variables.env`.

### To clear everything out and start over:

```bash
docker compose down
docker volume rm $(docker volume ls -q -f name=htr2hpc)
docker compose build --no-cache
docker compose up
```

Or, if you prefer one line:

```bash
docker compose down && docker volume rm $(docker volume ls -q -f name=htr2hpc) && docker compose build --no-cache && docker compose up
```

Or, more thorough:

```bash
docker compose down -v --remove-orphans && docker container prune -f && { vols="$(docker volume ls -q -f name=htr2hpc)"; [[ -z "$vols" ]] || docker volume rm -f $vols; } && docker compose build --no-cache && docker compose up
```

### Configuration variables

For model training in development you'll need to edit the variables: `ESCRIPTORIUM_HOST`, `HPC_SSH_USER`, and `HPC_WORKING_DIR`. `ESCRIPTORIUM_HOST` should be set to the protocol, local machine public IP, and nginx port; e.g., http://1.2.3.4:8080.

NOTE: I haven't been able to get training to work from my laptop. (de 2025-08-22)

```shell
ESCRIPTORIUM_HOST=example.com # The host escriptorium is running on
HPC_HOSTNAME=hpc.host.edu # The hostname of the HPC cluster
HPC_SSH_USER=username # Username if one account is used for training
```
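Before starting the stack it can help to confirm that the variables needed for training are actually filled in. The following is a minimal sketch, not part of the repository: the `missing_vars` helper is hypothetical, and it only checks that each name appears as `NAME=value` at the start of a line in the given env file.

```bash
# Hypothetical helper: print each required variable that is absent or empty.
missing_vars() {
    env_file="$1"
    shift
    for name in "$@"; do
        # Require NAME= followed by at least one character.
        grep -Eq "^${name}=.+" "$env_file" || echo "$name"
    done
}

# Demo on a throwaway env file with illustrative values.
example_env="$(mktemp)"
cat > "$example_env" <<'EOF'
ESCRIPTORIUM_HOST=http://1.2.3.4:8080
HPC_HOSTNAME=hpc.host.edu
HPC_SSH_USER=
EOF

missing_vars "$example_env" ESCRIPTORIUM_HOST HPC_SSH_USER HPC_WORKING_DIR
```

In a checkout of this repository the call would be `missing_vars variables.env ESCRIPTORIUM_HOST HPC_SSH_USER HPC_WORKING_DIR`; any names it prints still need values.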

### SSH key authentication

htr2hpc relies on SSH key authentication to run Slurm jobs on the HPC cluster. By default, the key is expected at `./ssh/htr2hpc_ed25519`.

The development docker compose file maps local `./ssh` to `/usr/src/app/.ssh` in the docker container.

```
# docker-compose.yml
x-app:
  # ...
  volumes:
    - ./ssh:/usr/src/app/.ssh
```

_**IMPORTANT: Do not check the ssh key into version control!**_

The directory `./ssh` is listed in the `.gitignore` file and will therefore be ignored by git commands. If you put the SSH key in another directory in this project, make sure it is not checked into version control.

## What this repository does

This repo provides a docker deployment that builds a custom instance of eScriptorium with the Penn Libraries fork of htr2hpc and runs it in a docker compose environment.

The deployment is based on the htr2hpc installation instructions and the Princeton-CDH Ansible deployment scripts, https://github.com/Princeton-CDH/cdh-ansible/

Useful links:

- [eScriptorium playbook](https://github.com/Princeton-CDH/cdh-ansible/blob/main/playbooks/escriptorium.yml)
- [Staging variables](https://github.com/Princeton-CDH/cdh-ansible/blob/main/inventory/group_vars/htr_staging/vars.yml)
- [escriptorium_setup tasks](https://github.com/Princeton-CDH/cdh-ansible/blob/main/roles/escriptorium_setup/tasks/main.yml)

The `docker-compose.yml` file is adapted from the official eScriptorium repository (https://gitlab.com/scripta/escriptorium). It changes the service configurations for the `app` and `nginx` services to use locally built images, `pennlib-escriptorium` and `pennlib-escriptorium-nginx`, respectively.

The `pennlib-escriptorium` image is built from `./Dockerfile`. It starts from the eScriptorium base image, copies in the eScriptorium application source from the `develop` branch, then

- adds the `escriptorium/local_settings.py` file, which imports the htr2hpc module,
- modifies the `requirements.txt` file to include the htr2hpc module,
- adds the `escriptorium/uwsgi.ini` custom web server configuration, and
- runs `pip install` to install the htr2hpc module.
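The Dockerfile also patches eScriptorium's `apps/core/tasks.py` so that training is routed through htr2hpc: it renames the stock `train` and `segtrain` functions and imports the htr2hpc versions in their place. The sketch below replays those `sed` steps on a throwaway file to show the effect. It assumes GNU sed (the one-line `i` insert form is a GNU extension), and the sample file contents and the `patch_tasks_file` helper name are made up for illustration.

```bash
# Hypothetical helper mirroring the Dockerfile's edits to tasks.py.
patch_tasks_file() {
    tasks_file="$1"
    # Rename the stock functions so the htr2hpc versions can take their names.
    sed -E -i 's/^( *)def segtrain/\1def es_segtrain/' "$tasks_file"
    sed -E -i 's/^( *)def train/\1def es_train/' "$tasks_file"
    # Line just above "def es_segtrain", i.e. its decorator line.
    line_number=$(( $(grep -n '^ *def es_segtrain' "$tasks_file" | cut -d: -f1) - 1 ))
    # Insert the import before that line; the second insert lands above the first.
    sed -i "${line_number}i from htr2hpc.tasks import segtrain, train" "$tasks_file"
    sed -i "${line_number}i # EDITED BY Dockerfile" "$tasks_file"
}

# Demo input: a made-up stand-in for apps/core/tasks.py.
demo_file="$(mktemp)"
cat > "$demo_file" <<'EOF'
import logging

@shared_task(bind=True)
def segtrain(self, model_pk=None):
    pass

@shared_task(bind=True)
def train(self, model_pk=None):
    pass
EOF

patch_tasks_file "$demo_file"
cat "$demo_file"
```

After the patch, the names `train` and `segtrain` in the module refer to the imported htr2hpc functions, while the original implementations remain available as `es_train` and `es_segtrain`.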

The `pennlib-escriptorium-nginx` image is built from `./nginx/Dockerfile`. It replaces the original nginx image, `registry.gitlab.com/scripta/escriptorium/nginx:latest`, which is two years old and does not include the eScriptorium proxy configuration.

**TODO**: The docker-compose.yml file still refers to `registry.gitlab.com/scripta/escriptorium/mail`, which, like the nginx image, is two years old. It may need to be replaced as well.

## Portainer

Use `Dockerfile.portainer`, `nginx/Dockerfile`, `docker-compose.portainer.yml`, and `variables.env.portainer_example` for Portainer deployments.

The steps for deployment are these:

1. Build the htr2hpc image
2. Build the htr2hpc-nginx image
3. Create the stack using `docker-compose.portainer.yml` and edited `variables.portainer`

**(1) Build the HTR2HPC image**

Build the htr2hpc image on Portainer using `Dockerfile.portainer`.

On the Images > Build Image page:

- Name the image `htr2hpc`
- Paste the content of `Dockerfile.portainer` into the Web Editor
- Upload the files from `./escriptorium`: `entrypoint.sh`, `extra_requirements.txt`, `local_settings.py`, and `uwsgi.ini`
- Build the image
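`Dockerfile.portainer` folds `extra_requirements.txt` into eScriptorium's `requirements.txt` with `cat ... | sort | uniq`. Here is a minimal sketch of that merge on temporary files. The package names and the `merge_requirements` helper are placeholders, not the real contents of either file, and `LC_ALL=C` is added here only to make the sort order predictable.

```bash
# Hypothetical helper: concatenate two requirements files, sort, drop exact duplicates.
merge_requirements() {
    cat "$1" "$2" | LC_ALL=C sort | uniq
}

# Demo inputs with placeholder package pins.
work_dir="$(mktemp -d)"
printf 'shared==2.0\nalpha==1.0\n' > "$work_dir/requirements.txt"
printf 'beta==3.0\nshared==2.0\n' > "$work_dir/extra_requirements.txt"

merge_requirements "$work_dir/requirements.txt" "$work_dir/extra_requirements.txt" > "$work_dir/merged.txt"
cat "$work_dir/merged.txt"
```

Note that this only removes lines that are character-for-character identical. Two different pins of the same package would both survive the merge, so `extra_requirements.txt` should not re-pin anything already listed upstream.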

**(2) Build the HTR2HPC Nginx image**

On the Images > Build Image page:

- Name the image `htr2hpc-nginx`
- Paste the content of `nginx/Dockerfile` into the Web Editor
- Upload the file `nginx.conf` from `./nginx`
- Build the image

**(3) Create the stack**

On the 'Stacks > Add stack' page:

- Name the stack: `htr2hpc`
- Paste the content of `docker-compose.portainer.yml` in the web editor box
- Click on 'Advanced mode' under Environment variables and paste in the content of `variables.portainer`
- Edit the environment variables
- Click 'Deploy the stack'

If this is the initial setup, the private SSH key file must be added to the `ssh` volume. This can be done by opening a shell in the web container and creating the private key file in `/usr/src/app/.ssh`. In the default configuration this is an ed25519 key named `htr2hpc_ed25519`.

### Troubleshooting

#### Bad gateway

Bad gateway errors can arise for a couple of reasons.

1. The web (django) container has not completely started. Check the log for the web container and look for the notice that the uWSGI workers have been spawned:

   ```
   *** uWSGI is running in multiple interpreter mode ***
   spawned uWSGI master process (pid: 1)
   spawned uWSGI worker 1 (pid: 75, cores: 1)
   spawned uWSGI worker 2 (pid: 76, cores: 1)
   spawned uWSGI worker 3 (pid: 77, cores: 1)
   spawned uWSGI worker 4 (pid: 78, cores: 1)
   spawned uWSGI http 1 (pid: 79)
   ```
2. The web container has been redeployed and assigned an IP not known to the nginx server. Try restarting nginx.
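
Both checks can be scripted. The sketch below is an assumption-laden example rather than part of the repository: `count_uwsgi_workers` is a hypothetical helper that counts "spawned uWSGI worker" lines in whatever log text it receives on stdin, demonstrated here on a sample log like the one shown in this section.

```bash
# Hypothetical helper: count uWSGI worker start-up notices in log text on stdin.
count_uwsgi_workers() {
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true" keeps
    # the helper usable in scripts that stop on errors.
    grep -c 'spawned uWSGI worker' || true
}

# Sample log text for the demo.
sample_log='*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 1)
spawned uWSGI worker 1 (pid: 75, cores: 1)
spawned uWSGI worker 2 (pid: 76, cores: 1)
spawned uWSGI worker 3 (pid: 77, cores: 1)
spawned uWSGI worker 4 (pid: 78, cores: 1)
spawned uWSGI http 1 (pid: 79)'

printf '%s\n' "$sample_log" | count_uwsgi_workers
```

Against a running development stack, and assuming the compose services are named `web` and `nginx`, the equivalent would be `docker compose logs web | count_uwsgi_workers`, followed by `docker compose restart nginx` if the workers are up but the gateway error persists.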