liquifier

This repo holds code that can be used to generate fluid transcriptions.

Installation

locally

To run and test the liquifier locally, follow these steps:

Clone the repo and navigate into it:

```
git clone git@github.com:BeethovensWerkstatt/liquifier.git
cd liquifier
```

Create the Docker image:
```
docker build -t liquifier:latest .
```
Move to the BeethovensWerkstatt/data directory:
```
cd <path/to/BeethovensWerkstatt/data>
```
(This might be cd ../data if the liquifier repo is cloned next to data.)

Run the docker image:

```
docker run --rm -ti \
  -v $(pwd)/data:/usr/src/app/data \
  -v $(pwd)/cache:/usr/src/app/cache \
  -v $(pwd)/.git:/usr/src/app/.git:ro \
  -w /usr/src/app liquifier \
  node index.js --input-dir=/usr/src/app/data/sources --output-dir=/usr/src/app/cache
```

The -v flags mount the data and cache directories from the host machine into the Docker container, so that the liquifier can read input files and write output files. The .git directory is mounted to give the liquifier access to the version history of the files. It is mounted read-only to prevent accidental changes to the git repository.

Newly created files are not committed automatically. You can check which files were created or modified by running git status in the data directory.

on a server

The liquifier can also be run as a GitHub Action on a server. To set this up, follow these steps:

Fork the BeethovensWerkstatt/data repository.
In your fork, go to the "Actions" tab and enable GitHub Actions if it is not already enabled.
Create a new workflow file in the .github/workflows directory of your fork. You can name the file liquifier.yml.

Add the following content to the liquifier.yml file:

```yaml
name: Liquifier | Generate and cache diplomatic, annotated, and fluid transcriptions
on:
  push:
    branches:
      - main
    paths:
      - 'data/sources/**/annotatedTranscripts/**/*.xml'
      - 'data/sources/**/diplomaticTranscripts/**/*.xml'

jobs:
  build:
    name: Generate Cache files
    runs-on: ubuntu-latest
    permissions:
      contents: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@d632683dd7b4114ad314bca15554477dd762a938 # v4.2.0

      - name: ensure cache
        run: mkdir -p cache

      - name: run Docker image
        run: docker run --rm -v $(pwd)/data:/usr/src/app/data -v $(pwd)/cache:/usr/src/app/cache -v $(pwd)/.git:/usr/src/app/.git:ro ghcr.io/beethovenswerkstatt/liquifier:latest node index.js

      # check repo status before push to avoid overlapping commits

      - name: configure git
        run: |
          echo "Configuring git..."
          git config user.name "github-actions"
          git config user.email "[email protected]"

      - name: Commit
        run: |
          git add .
          git commit -m "generate and cache transcriptions for @${{ github.sha }}"
          git push
```

Commit and push the liquifier.yml file to your fork.
The workflow will run automatically whenever there is a push to the main branch that affects files in the data/sources/**/annotatedTranscripts/ or data/sources/**/diplomaticTranscripts/ directories.

Make sure to monitor the Actions tab in your fork to see the status of the workflow runs.

The GitHub action is responsible for committing the newly created or modified files back to the repository. It is configured to use a generic "github-actions" user for the commits. The liquifier node application will not make any commits itself!

Command line arguments

```
node index.js [-q] [--recreate] [--input-dir <path>] [--output-dir <path>] [fileNames*]
```

The liquifier script can be run with the following command line arguments:

- `-q`: quiet mode; suppresses non-essential output
- `-v`: verbose mode; logs additional information (overridden by `-q`)
- `--recreate`: forces the recreation of all output files, even if they are up to date
- `--hours <number>`: number of hours to look back for modified files (default 24)
- `--since <date>`: date from which to look for modified files (supersedes `--hours`)
- `--full`: generates full fluid transcriptions instead of only the changes (supersedes `--hours` and `--since`)
- `--types`: comma-separated list of transcription types (at,dt,ft | default all)
- `--media`: comma-separated list of file types to create (svg,midi,html | default all)
- `--input-dir` (or `-i`): base directory for input files (default ./)
- `--output-dir` (or `-o`): base directory for output files (default ./cache)
- `fileNames`: any number of file names to process, separated by spaces. If none are given, the most recently modified files are selected (work in progress).
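
The precedence between `--hours`, `--since`, and `--full` can be illustrated with a small sketch. This is not the liquifier's actual code: the function name `resolveCutoff` and the shape of the `options` object are assumptions made for illustration, and the sketch assumes that `--full` means "apply no date filter at all".

```javascript
// Hypothetical helper, not part of the liquifier code base.
// Takes already-parsed options and returns the look-back cutoff as a Date,
// or null when no date filter should apply.
function resolveCutoff (options, now = new Date()) {
  // --full supersedes --since and --hours (assumed here: no cutoff at all)
  if (options.full) return null

  // --since supersedes --hours
  if (options.since !== undefined) {
    const since = new Date(options.since)
    if (Number.isNaN(since.getTime())) {
      throw new Error(`Invalid --since date: ${options.since}`)
    }
    return since
  }

  // --hours falls back to the documented default of 24
  const hours = options.hours !== undefined ? Number(options.hours) : 24
  if (!Number.isFinite(hours) || hours < 0) {
    throw new Error(`Invalid --hours value: ${options.hours}`)
  }
  return new Date(now.getTime() - hours * 60 * 60 * 1000)
}
```

Passing `now` as a parameter keeps the function deterministic, which makes the precedence rules easy to check in a unit test.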

Directory Configuration

The --input-dir and --output-dir parameters allow you to configure where the liquifier reads input files from and writes output files to. This is particularly useful for different deployment scenarios:

Local development example:

```
# Read from data repository, write to data repository's cache
node index.js --input-dir=../data/data/sources --output-dir=../data/cache diplomaticTranscripts/filename.xml
```

Docker container example:

```
# When running in Docker with mounted volumes
docker run --rm -v $(pwd)/data:/usr/src/app/data -v $(pwd)/cache:/usr/src/app/cache liquifier \
  node index.js --input-dir=/usr/src/app/data/sources --output-dir=/usr/src/app/cache
```

When using these parameters, file paths in the fileNames argument should be relative to the --input-dir. Output files will maintain the same directory structure within the --output-dir.

Any filter options are ignored when a list of files is given.

Dates are compared using each file's last commit date. Files that have been modified but not yet committed are recreated even if their output is already up to date.
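
The selection rules above can be sketched roughly as follows. The names `needsProcessing` and `selectFiles`, and the fields `path`, `lastCommitDate`, and `hasUncommittedChanges`, are hypothetical and exist only for this illustration. How the liquifier actually obtains commit dates from git is not shown here.

```javascript
// Hypothetical sketch; function and field names are illustrative only.
function needsProcessing (file, cutoff) {
  // Modified but uncommitted files are always recreated
  if (file.hasUncommittedChanges) return true
  // No cutoff means no date filter
  if (cutoff === null) return true
  // Otherwise compare the last commit date against the cutoff
  return file.lastCommitDate.getTime() >= cutoff.getTime()
}

function selectFiles (candidates, { fileNames = [], cutoff = null } = {}) {
  // An explicit file list overrides every filter option
  if (fileNames.length > 0) {
    const wanted = new Set(fileNames)
    return candidates.filter(file => wanted.has(file.path))
  }
  return candidates.filter(file => needsProcessing(file, cutoff))
}
```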

Output Structure

The liquifier generates multiple output files organized in a page-based folder hierarchy. This organization ensures manageable folder sizes (typically ~20 files per page) and enables efficient API access patterns.

Folder Organization

All output files are organized into page-based folders using the pattern {type}/{page}/:

```
cache/sources/D-BNba_MH_60_Engelmann/
├── annotatedTranscripts/
│   └── p005/
│       ├── D-BNba_MH_60_Engelmann_p005_wz06_at.svg
│       ├── D-BNba_MH_60_Engelmann_p005_wz07_at.svg
│       └── ...
├── annotatedMidi/
│   └── p005/
│       ├── D-BNba_MH_60_Engelmann_p005_wz06_at.mid
│       └── ...
├── diplomaticTranscripts/
│   └── p005/
│       ├── D-BNba_MH_60_Engelmann_p005_wz06_dt.svg
│       ├── D-BNba_MH_60_Engelmann_p005_wz06_syss289fb17d-10e3-4b27-9b64-8d2d6a560c1d_dt.svg
│       ├── D-BNba_MH_60_Engelmann_p005_wz06_sys{anotherSystemId}_dt.svg
│       └── ...
├── fluidTranscripts/
│   └── p005/
│       └── D-BNba_MH_60_Engelmann_p005_wz06_ft.svg
└── fluidHTML/
    └── p005/
        └── D-BNba_MH_60_Engelmann_p005_wz06_ft.html
```

Page folder naming:

Extracted from input filename using pattern _p(\d{3})_
Always three-digit page numbers: p005, p042, p123, etc.
Applied to all output types: AT, DT, FT, MIDI, HTML
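
As a minimal sketch of the page folder rule, the following function applies the documented pattern `_p(\d{3})_` to a file name. The function name `pageFolder` and the `.xml` input name in the usage example are assumptions for illustration. Returning `null` for names without a page marker is also an assumption, since the source does not say how such files are handled.

```javascript
// Hypothetical helper: derives the page folder name from a file name
// using the documented pattern _p(\d{3})_.
function pageFolder (fileName) {
  const match = /_p(\d{3})_/.exec(fileName)
  // No page marker found: the behaviour for this case is an assumption
  return match ? `p${match[1]}` : null
}

// Example: build the {type}/{page}/ folder for an assumed input file name
const page = pageFolder('D-BNba_MH_60_Engelmann_p005_wz06_dt.xml')
const folder = `diplomaticTranscripts/${page}/`
```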

Diplomatic Transcript System Files

For each diplomatic transcript, the liquifier generates multiple output files:

Full diplomatic transcript: Contains all systems from the writing zone
- Filename: {source}_{page}_{wz}_dt.svg
- Example: D-BNba_MH_60_Engelmann_p005_wz06_dt.svg
- Renders the complete diplomatic transcript with all systems and their rastrums
Individual system files: One file per system in the diplomatic transcript
- Filename: {source}_{page}_{wz}_sys{systemId}_dt.svg
- Example: D-BNba_MH_60_Engelmann_p005_wz06_syss289fb17d-10e3-4b27-9b64-8d2d6a560c1d_dt.svg
- Each file contains only one system with its associated rastrums (staff lines)
- Optimized for file size by including only the rastrums used by that specific system
- Enables assembly into "virtual continuous staves" across multiple pages

System file characteristics:

System IDs are taken from the MEI document's <system xml:id="..."> or <bw:system xml:id="..."> elements
Each system file is standalone and can be loaded independently
Rastrums (staff lines) are filtered to include only those used by that specific system
All system files use the same coordinate system as the full DT for consistent positioning

Use cases:

Full DT files: Display complete diplomatic transcripts in context
Individual system files: Assemble continuous notation across page boundaries, create excerpt views, or display systems independently in a web application

Metadata and ordering:

No separate JSON metadata files are generated (all metadata remains in source MEI)
System ordering can be reconstructed from annotated transcripts
API implementations can retrieve file lists and construct ordering from MEI source data

File Naming Patterns

All output files follow consistent naming conventions:

| Type | Pattern | Example |
| --- | --- | --- |
| Annotated Transcript | `{source}_{page}_{wz}_at.svg` | `D-BNba_MH_60_Engelmann_p005_wz06_at.svg` |
| Annotated MIDI | `{source}_{page}_{wz}_at.mid` | `D-BNba_MH_60_Engelmann_p005_wz06_at.mid` |
| Diplomatic Transcript (Full) | `{source}_{page}_{wz}_dt.svg` | `D-BNba_MH_60_Engelmann_p005_wz06_dt.svg` |
| Diplomatic System | `{source}_{page}_{wz}_sys{systemId}_dt.svg` | `D-BNba_MH_60_Engelmann_p005_wz06_syss289fb17d-10e3-4b27-9b64-8d2d6a560c1d_dt.svg` |
| Fluid Transcript | `{source}_{page}_{wz}_ft.svg` | `D-BNba_MH_60_Engelmann_p005_wz06_ft.svg` |
| Fluid HTML | `{source}_{page}_{wz}_ft.html` | `D-BNba_MH_60_Engelmann_p005_wz06_ft.html` |

Where:

{source}: Source manuscript identifier (e.g., D-BNba_MH_60_Engelmann)
{page}: Three-digit page number (e.g., p005)
{wz}: Writing zone identifier (e.g., wz06)
{systemId}: MEI system element ID (e.g., s289fb17d-10e3-4b27-9b64-8d2d6a560c1d)
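
A consumer of the cache, such as an API layer, may want to split output file names back into their parts. The sketch below is one way to do that with a regular expression derived from the patterns above. The names `parseOutputName` and `buildOutputName` are hypothetical, and the sketch assumes that writing zone identifiers always have the form `wz` followed by digits, as in the examples.

```javascript
// Hypothetical helpers, derived from the documented naming patterns.
// Assumption: writing zone identifiers look like "wz" followed by digits.
const OUTPUT_NAME = /^(.+)_(p\d{3})_(wz\d+)(?:_sys(.+))?_(at|dt|ft)\.(svg|mid|html)$/

function parseOutputName (fileName) {
  const match = OUTPUT_NAME.exec(fileName)
  if (!match) return null
  const [, source, page, wz, systemId, type, ext] = match
  // systemId is only present for individual system files
  return { source, page, wz, systemId: systemId || null, type, ext }
}

function buildOutputName ({ source, page, wz, systemId, type, ext }) {
  const systemPart = systemId ? `_sys${systemId}` : ''
  return `${source}_${page}_${wz}${systemPart}_${type}.${ext}`
}
```

Because the system ID is optional in the pattern, the same function handles both full diplomatic transcripts and individual system files.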

Environment variables

If the environment variable fileNames is set, it will be used as a comma-separated list of file names to process. This can be useful when running the script in a Docker container or on a server where command line arguments may not be easily passed. The command line arguments take precedence over the environment variable.
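
The precedence described above could look roughly like this. The function name `resolveFileNames` is hypothetical, and trimming whitespace and dropping empty entries are assumptions not confirmed by the source.

```javascript
// Hypothetical helper: command line file names win over the
// comma-separated fileNames environment variable.
function resolveFileNames (cliFileNames, env = process.env) {
  if (cliFileNames.length > 0) return cliFileNames

  const raw = env.fileNames
  if (!raw) return []

  // Assumption: surrounding whitespace and empty entries are ignored
  return raw
    .split(',')
    .map(name => name.trim())
    .filter(name => name.length > 0)
}
```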

Code Structure

The liquifier application follows a modular architecture with clear separation of concerns. The codebase is organized into logical directories that reflect the application's execution flow:

Entry Point

index.js - Main orchestrator that coordinates the high-level application flow:

Parse CLI arguments
Initialize logger
Initialize tools (Verovio, Thulemeier)
Process files

Source Directory Structure

```
src/
├── config.mjs                         # Global configuration
│
├── core/                              # Application orchestration
│   ├── cli.js                         # CLI argument parsing
│   ├── logger.js                      # Logging utility
│   ├── init.js                        # Tool initialization
│   └── processor.js                   # File processing orchestration
│
├── rendering/                         # Rendering engines & orchestration
│   ├── renderers.js                   # Rendering decisions (what/when to render)
│   ├── verovioHandler.js              # Verovio rendering engine interface
│   └── thulemeierHandler.js           # Thulemeier rendering engine interface
│
├── preparation/                       # Data preparation for rendering
│   ├── annotatedTranscripts.js        # Annotated transcript preparation
│   ├── diplomaticTranscripts.js       # Diplomatic transcript preparation
│   ├── fluidTranscripts.js            # Fluid transcript preparation
│   └── mei.js                         # MEI XML manipulation utilities
│
├── filehandlers/                      # File I/O operations
│   └── filehandler.js                 # File reading, writing, triple management
│
└── utils/                             # Generic utility functions
    ├── geometry.js                    # Vector & geometric calculations
    ├── utils.js                       # General utility functions
    ├── trigonometry.js                # Trigonometric calculations
    ├── facsimileHelpers.js            # Facsimile-specific helpers
    └── uuid.js                        # UUID generation
```

Module Descriptions

Core Modules

cli.js: Parses command-line arguments using minimist and returns a normalized configuration object
logger.js: Provides a configurable logger with support for quiet and verbose modes
init.js: Initializes Verovio and Thulemeier rendering engines
processor.js: Orchestrates the processing of multiple files, handling data fetching and rendering coordination
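
To illustrate how quiet and verbose modes might interact in a logger like the one described, here is a minimal sketch. It is not the code in logger.js: the factory name `createLogger`, the method names `error`, `info`, and `debug`, and the injectable `sink` are all assumptions. The only behaviour taken from this README is that `-q` suppresses non-essential output and overrides `-v`.

```javascript
// Hypothetical logger sketch; names and method set are illustrative only.
function createLogger ({ quiet = false, verbose = false } = {}, sink = console) {
  const showInfo = !quiet
  // Quiet mode overrides verbose mode
  const showDebug = verbose && !quiet

  return {
    // Errors are treated as essential output and are always written
    error: (...args) => sink.error(...args),
    info: (...args) => { if (showInfo) sink.log(...args) },
    debug: (...args) => { if (showDebug) sink.log(...args) }
  }
}
```

Injecting the sink instead of calling `console` directly makes the logger easy to test without capturing standard output.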

Rendering Modules

renderers.js: Contains specialized rendering functions for each output type (AT SVG, AT MIDI, DT SVG, FT SVG, FT HTML) with date-based rendering decisions
verovioHandler.js: Wrapper for the Verovio rendering engine, handles MEI-to-SVG and MEI-to-MIDI conversion
thulemeierHandler.js: Integration layer for the Thulemeier rendering library

Preparation Modules

annotatedTranscripts.js: Prepares annotated transcript DOM structures for rendering
diplomaticTranscripts.js: Prepares diplomatic transcript data structures
fluidTranscripts.js: Generates fluid transcription animations and transitions
mei.js: Core MEI XML manipulation functions used across transcript types

File Handling

filehandler.js: Manages file I/O operations, creates file triples (input/output path mappings), fetches data from multiple sources, and writes rendered output

Utilities

geometry.js: Vector mathematics and geometric calculations (formerly index.js)
utils.js: General-purpose utility functions (DOM manipulation, bounding boxes, etc.)
trigonometry.js: Mathematical functions for rotations and transformations
facsimileHelpers.js: Helper functions for working with facsimile measurements and OpenSeadragon
uuid.js: UUID generation for MEI elements

Design Principles

Separation of Concerns: Each module has a single, well-defined responsibility
Modularity: Functions are organized by their role in the application flow
Testability: Each module can be tested independently with clear input/output contracts
Maintainability: Related functionality is grouped together, making it easy to locate and modify code
Extensibility: New rendering types or transcript formats can be added by following established patterns
