Skip to content

Conversation

bioball
Copy link
Member

@bioball bioball commented Aug 8, 2025

Implementation of apple/pkl-evolution#20

Currently, in order to update a pkl-doc documentation site, almost the entire existing site is read in order to update metadata like known versions, known subtypes, and more.

For example, adding a new version of a package requires that the existing runtime data of all existing versions be updated. Eventually, this causes the required storage size to balloon exponentially to the number of versions.

This addresses these limitations by:

  1. Updating the runtime data structure to move "known versions" metadata to the package level (the same JSON file is used for all versions).
  2. Eliminating known subtype and known usage information at a cross-package level.
  3. Generating the search index by consuming the previously generated search index.
  4. Generating the main page by consuming the search index.

Because this changes how runtime data is stored, an existing docsite needs to be migrated.

This also introduces a new migration command, pkl-doc --migrate, which transforms an older version of the website into a newer version.

NOTE: most of the additions are due to new input/output tests. I split those off into a separate commit. To review this PR, just take a look at the first commit.

The generated output changes now need to be served via an HTTP server (because of ES6 module imports).
You can review the generated site using a command like python3 -m http.server -d pkl-doc/src/test/files/DocGeneratorTest/output/run-1/).

@bioball bioball force-pushed the pkldoc-improvements branch from 5cafc7b to 0b43b89 Compare August 8, 2025 18:23
@bioball bioball force-pushed the pkldoc-improvements branch 3 times, most recently from 2f16ea9 to 2a89899 Compare August 11, 2025 22:15
@bioball bioball force-pushed the pkldoc-improvements branch 3 times, most recently from 4f9b476 to 827e6c7 Compare August 21, 2025 13:49
@bioball bioball force-pushed the pkldoc-improvements branch from 827e6c7 to b965728 Compare September 3, 2025 13:42
Currently, in order to update a pkl-doc documentation site,
almost the entire existing site is read in order to update metadata
like known versions, known subtypes, and more.

For example, adding a new version of a package requires that the
existing runtime data of all existing versions be updated.
Eventually, this causes the required storage size to balloon
exponentially to the number of versions.

This addresses these limitations by:

1. Updating the runtime data structure to move "known versions" metadata
   to the package level (the same JSON file is used for all versions).
2. Eliminating known subtype and known usage information at a
   cross-package level.
3. Generating the search index by consuming the previously generated
   search index.
4. Generating the main page by consuming the search index.

Because this changes how runtime data is stored, an existing docsite
needs to be migrated.

This also introduces a new migration command, `pkl-doc --migrate`,
which transforms an older version of the website into a newer version.
@bioball bioball force-pushed the pkldoc-improvements branch from b965728 to b00a035 Compare September 3, 2025 18:31

for (docPackage in docPackages) {
if (docPackage.isUnlisted) continue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlisted filtering logic is moved up to line 104

id = "search-icon"
classes = setOf("material-icons")
+"search"
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated: address a warning that an input is missing a label

name("classes").value(classes(item))
}
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JsonWriter and JsonReader APIs too low level; since we are also parsing runtime data, it's much nicer to use kotlinx json.

We don't want to use kotlinx json in pkl-core (which is partially why JsonWriter/JsonReader exist), but we don't have any issues using this library inside the pkl-doc tool.

stream.write(POSTFIX.toByteArray(StandardCharsets.UTF_8))
stream.flush()
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are converting the runtime data files from .js to .json. I chose to avoid doing that for the search index because it's not actually a problem right now, and takes more work (more migration needed).

Comment on lines +75 to +82
lang = "en-US"

head {
meta { charset = "UTF-8" }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated: address warnings emitted by linters

val writtenFiles = mutableSetOf<Path>()

private suspend fun writeData(packages: List<PackageData>) {
coroutineScope {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A coroutineScope will block until any jobs spawned by launch are completed. And each of these launch jobs are executed in parallel.

Copy link
Contributor

@HT154 HT154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. One question, but no blockers.

Comment on lines +194 to +195
isModuleClass -> "$module/index.json".pathEncoded
else -> "$module/$type.json".pathEncoded
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would pkldoc have a bad time with class index in a module? If so, it's probably worth catching that and throwing an error or rewriting it to something else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good catch, I think it'll conflict. I won't address that in this PR; this is an existing issue and I'd rather not block this PR on it.

`copyToRecursively` does not work correctly when copying from Windows
paths to jimfs.
@bioball bioball force-pushed the pkldoc-improvements branch from eedab7b to 2a7941b Compare September 26, 2025 13:42
@bioball bioball force-pushed the pkldoc-improvements branch from 9a80791 to 061d18f Compare September 29, 2025 18:17
@bioball bioball merged commit 5d90cf8 into apple:main Sep 29, 2025
4 checks passed
@bioball bioball deleted the pkldoc-improvements branch September 29, 2025 23:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants