Fetch readme and generate Markdown for CLI flags at build time #113

katrinafyi · 2025-11-01T10:41:43Z

It looks like this:

This implementation works but there are fun details to think about:

- Is the formatting good? Atm, it's a funny mix of rich text Markdown (for the headings) and monospace text (for the help). I think this is a good compromise between having nice formatting and enabling automatic generation.
- Also on formatting, Astro will combine two hyphens (--) into a Unicode em dash (—) and this affects the option flags in Markdown headings. I've prevented this using a zero-width space but it's a bit :/ Is this okay? Do we care about keeping the original hyphens?
- The generation is kind of hacky. It doesn't hook into any Astro APIs. Instead, it just runs JavaScript code in the content definition file. This means that, for instance, Astro needs to be manually restarted after changes to the generator or the _cli.md template file.
- I tried custom Astro content collections as a more "official" way to do this generation. However, it looked like headings would not be propagated if you render a fragment within another page (as we would want to embed the generated options within the template page). This negates most of the benefit.
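
For reference, the zero-width space workaround mentioned above could look roughly like this. It is only a sketch: `protectDoubleHyphens` is a hypothetical helper name, not the function used in this PR, and it assumes the typography pass only rewrites hyphens that are directly adjacent.

```typescript
// Hypothetical helper: break up runs of hyphens so a smart-typography pass
// no longer sees "--" and cannot turn it into an em dash.
const ZERO_WIDTH_SPACE = "\u200B";

export function protectDoubleHyphens(heading: string): string {
  // Insert U+200B after every hyphen that is directly followed by another hyphen.
  return heading.replace(/-(?=-)/g, `-${ZERO_WIDTH_SPACE}`);
}
```

One trade-off to keep in mind: the invisible character stays in the rendered heading, so anyone copying the flag from the page would copy it too.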

Todo:

assertion for body text

mre · 2025-11-01T17:58:13Z

The rich text headings + monospace blocks is actually quite good. It makes the docs scannable while preserving the exact CLI output format. I'd keep this.

About the em dash issue, a few alternatives come to mind:

Wrap option names in backticks: ### `--dump` (keeps hyphens, looks cleaner)
Use a custom Astro component for option headings
Configure Astro's typography settings to disable smart quotes/dashes

Have you tried to use Astro's integration system instead?
I'm not an Astro expert by any means. Just found it in their docs: https://docs.astro.build/en/guides/integrations-guide/

Create a proper Astro integration that runs during the build:

// astro.config.mjs
import { defineConfig } from 'astro/config';
import { generateCliDocs } from './src/integrations/cli-docs.ts';

export default defineConfig({
  integrations: [
    generateCliDocs(),
  ]
});

// src/integrations/cli-docs.ts
import type { AstroIntegration } from 'astro';
import { generateCliOptionsMarkdown } from '../fetchReadme';
import { writeFileSync, readFileSync, rmSync } from 'node:fs';

export function generateCliDocs(): AstroIntegration {
  return {
    name: 'generate-cli-docs',
    hooks: {
      'astro:config:setup': async () => {
        const docTemplateFile = "src/content/docs/guides/_cli.md";
        const docOutputFile = docTemplateFile.replace('_', '');
        
        rmSync(docOutputFile, { force: true });
        const usageText = await generateCliOptionsMarkdown();
        
        const docTemplateText = readFileSync(docTemplateFile, "utf-8");
        const docOutput = docTemplateText.replace(
          'README-OPTIONS-PLACEHOLDER', 
          usageText
        );
        writeFileSync(docOutputFile, docOutput);
      }
    }
  };
}

The main benefit would be that the dev server would regenerate the docs on restart, and you wouldn't need to manually run a script to update the docs.
But I haven't tested it. @jacobdalamb knows way more about Astro than me.
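
One detail of the placeholder substitution in the snippet above may be worth a note: when the second argument of `String.prototype.replace` is a string, sequences such as `$&` or `$'` inside it are treated as special replacement patterns. Help text is unlikely to contain them, but a replacer function sidesteps the issue entirely. The sketch below assumes a hypothetical `fillTemplate` helper; it is not part of the proposed integration.

```typescript
// Hypothetical helper: substitute the first occurrence of a placeholder
// without interpreting "$" sequences in the generated text.
export function fillTemplate(
  template: string,
  placeholder: string,
  generated: string,
): string {
  if (!template.includes(placeholder)) {
    throw new Error(`placeholder ${placeholder} not found in template`);
  }
  // The return value of a replacer function is inserted literally.
  return template.replace(placeholder, () => generated);
}
```

Failing loudly when the placeholder is missing also catches the case where the template was edited and the marker was accidentally removed.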

jacobdalamb · 2025-11-01T23:49:40Z

@mre good point. I hadn't considered using Astro's integration API to add this functionality. Because you mentioned that, I started going through community made integrations and came across this https://github.com/natemoo-re/astro-remote#README which enables "render remote HTML or Markdown content in Astro with full control over the output." what do you think?
@katrinafyi

katrinafyi · 2025-11-02T02:39:03Z

astro-remote looks promising! I tried it but it has the same problem as the content collections approach: it doesn't add headers from the remote content to the TOC.

But maybe we can cook up our own integration to get reloading.

katrinafyi · 2025-11-02T03:31:46Z

I've made changes following @mre's suggestions. We disable smart dashes across the entire site and we use a custom integration to generate the file. The integration code is basically the same as mre's suggestion, but we have to add addWatchFile to reload when the md file is changed.
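
A minimal sketch of the `addWatchFile` idea, for anyone following along. It is not the code from this PR: `SetupOptions` is a cut-down, locally declared stand-in for the options Astro passes to `astro:config:setup` (so the example is self-contained), and `cliDocsIntegration` and its `generate` callback are hypothetical names.

```typescript
// Cut-down stand-in for the options object of Astro's 'astro:config:setup'
// hook; only the member used here is declared.
interface SetupOptions {
  addWatchFile: (path: URL | string) => void;
}

interface MinimalIntegration {
  name: string;
  hooks: { "astro:config:setup": (options: SetupOptions) => Promise<void> };
}

// Hypothetical factory: `generate` stands in for whatever writes the output file.
export function cliDocsIntegration(
  templateFile: string,
  generate: (templateFile: string) => Promise<void>,
): MinimalIntegration {
  return {
    name: "generate-cli-docs",
    hooks: {
      "astro:config:setup": async ({ addWatchFile }) => {
        // Register the template so the dev server reloads when it changes,
        // which in turn runs this hook (and the generation) again.
        addWatchFile(templateFile);
        await generate(templateFile);
      },
    },
  };
}
```

Passing the generator in as a callback keeps the hook itself trivial and makes it possible to exercise it without touching the file system.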

mre · 2025-11-03T17:54:21Z

Neat! I can see how this is taking shape. 😊

One thing I wanted to mention was that I like the colorful examples in the current version (these little "terminal" boxes). Can we get them back somehow? I don't know how hard that would be, so that's completely optional, of course.

Before

After

thomas-zahner · 2025-11-07T16:54:11Z

@katrinafyi Cool, I really like that. Rebasing on top of master should fix the link check errors.

thomas-zahner · 2025-11-07T17:02:32Z

src/fetchReadme.ts

+import { readFileSync, realpathSync, rmSync, writeFileSync } from 'node:fs';
+import { basename, dirname, join } from 'node:path';
+
+const VERSION = "lychee-v0.21.0";


Updating this with each release manually is quite cumbersome. Instead, we can fetch the latest version via API.

# first element seems to be "nightly", second is latest stable release
curl "https://api.github.com/repos/lycheeverse/lychee/releases" | jq '.[1] | .name' -r
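
The same selection could be done in TypeScript rather than by relying on the position in the list. This is a sketch only: `pickLatestStable` is a hypothetical helper, the `Release` type lists just the fields it reads from a GitHub release object, and it assumes both that the nightly build is tagged `nightly` and that the API returns releases newest first.

```typescript
// Only the fields this sketch reads from a GitHub release object.
interface Release {
  tag_name: string;
  prerelease: boolean;
  draft: boolean;
}

// Hypothetical helper: return the tag of the first release that is neither
// the nightly build, a prerelease, nor a draft. Assumes newest-first order.
export function pickLatestStable(releases: Release[]): string | undefined {
  const stable = releases.find(
    (r) => r.tag_name !== "nightly" && !r.prerelease && !r.draft,
  );
  return stable?.tag_name;
}
```

Filtering by tag instead of taking index 1 would keep working if the nightly release ever stopped being listed first.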

I remember being told, effusively, by @jacobdalamb that there is a CI workflow to periodically check the release tags. That workflow doesn't exist anymore - I don't know what happened there.

My plan was that the version would be hard coded and the CI check would be a gentle reminder to update it.

I think it is good to tie the website's source code to a particular version, so that any commit can be checked out and the website can be built as it was at that point in time. This helps, e.g., when bisecting (tho this is more important for code rather than docs).

Updating the version manually could also be a reminder to update the docs with changes from that version. In the past, I have been sloppy and forgotten to write/fix the docs :)

I deleted it because it felt tacky but I can add it back.

I see. Let's see what people think about hard coding vs fetching the latest automatically, we might not have to bring it back.

Agree, it makes sense to pin a specific release version in the code. I would opt for this approach. But then it would be really great to have automation, similar to dependabot, which checks for the newest release version and creates a new PR to bump the version in the code automatically. This can be done in a later PR.

In any case it would make sense to extract this pinned version into a separate file. Possibly even a non-TS config file.

I've pulled the version into a separate TS file. The only reason it's typescript is so I can use javascript comments, otherwise i'd have to do some manual parsing. Let me know if that works :)
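
For readers skimming the thread, the extracted file might look roughly like the following. The comment text and the `readmeUrl` helper are illustrative assumptions; the constant name, the version string and the URL shape are taken from the diffs quoted in this PR.

```typescript
// lychee-version.ts (sketch)
//
// Pinned lychee release that the docs are generated from. Bump this when a
// new release comes out, and review the docs for changes in that release.
export const LYCHEE_VERSION = "lychee-v0.21.0";

// Hypothetical helper: build the raw README URL for a given release tag.
export function readmeUrl(version: string = LYCHEE_VERSION): string {
  return `https://raw.githubusercontent.com/lycheeverse/lychee/refs/tags/${version}/README.md`;
}
```

Because the tag is part of the URL, checking out an older commit of the site fetches the README as it was at that release, which is the reproducibility argument made in this thread.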

Sure, that works 👍

Let's see what people think about hard coding vs fetching the latest automatically, we might not have to bring it back.

I personally much prefer fetching the latest automatically. It's one less thing to remember and one less thing to continuously update manually. And I don't see much of a downside. On top of that, it's less code to fetch the version instead of maintaining the GitHub workflow. @jacobdalamb @thomas-zahner wdyt? 😅

Hmm, initially I thought the same. But fetching the latest version means fetching the latest version at build time. (correct me if I'm wrong) So the result purely depends on when the docs are built. If we pin a specific version in the code the docs become reproducible, meaning the same result for everyone for each revision.

Regardless of which approach we use we need to rebuild the docs, when we want to reflect changes in the README. So ideally we should write an additional workflow which handles this. (I can do so after we have merged this PR)

The main difference is that one approach keeps the result reproducible while the other doesn't. So I think pinning is just an additional "feature" we ideally should do but it doesn't affect the workflow situation. (on new release -> [optional: update pinned version in code] ->rebuild and publish docs)

I see your point. That's fine with me.

katrinafyi · 2025-11-08T14:33:07Z

I've added some colourful boxes. However, it's pretty naive and they're not quite valid examples - they still use placeholder variables and, in the case where a flag requires another, it does not show the flags being used together.

It looks like this

The more I work with this, the more I think that it would be nice if the docs site shows the CLI help as markdown formatted text. Initially, I was reluctant because it means the CLI help text would have to be written with a particular formatting which is only useful to the website. But maybe it's not so bad because it's already most of the way there and the hard wrapping of lines is already a requirement with the code block approach. I'll investigate.

The CI pipeline that builds astro is failing at the astro build step. Idk why because the package should exist as a transitive dependency, and it works on my machine with an npm ci. Also, I feel like the README for this repo used to show npm commands to use but those have disappeared - why is pnpm being used in CI? I will appreciate any guidance.

that this is generally nicer to read. However, this means that the website text is rendered from the same text as the CLI --help. The website understands Markdown, but the CLI does not (obviously) so we have to be careful.

I would suggest that when writing help, the focus should be on readability within the *console*. This means that while Markdown can be used, it should be limited to syntax which is unobtrusive in raw text. Raw text is still the main format which the help text will be rendered in.

Personally, these would be okay:

- single `*` or `_` for emphasis (with preference for `*`)
- single backticks for inline code
- four space indentation for code blocks
- bullet and numbered lists

Imo, these would *not* be okay, because they appear too jarring in plain text:

- code blocks with triple backtick fences. this includes any astro-specific asides and the like.
- link syntax with `[link](https://url)`
- bold when used for subheadings like `**Note**:`

I think this is a good compromise which lets the same text be usable for both CLI --help and the website's rich text HTML.
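
To make the convention above checkable, a small lint over the help text is one option. The following is a rough sketch, not something in this PR: `findJarringMarkdown` and its rule names are made up for illustration, and the patterns are deliberately simple approximations of the three "not okay" cases.

```typescript
// Hypothetical lint: report Markdown constructs that look jarring in raw
// console output, following the three "not okay" cases listed above.
interface Finding {
  line: number; // 1-based line number within the help text
  rule: "fenced-code" | "link" | "bold-subheading";
}

// Three backticks, built programmatically to keep this example easy to embed.
const FENCE = "`".repeat(3);

export function findJarringMarkdown(helpText: string): Finding[] {
  const findings: Finding[] = [];
  helpText.split("\n").forEach((text, index) => {
    const line = index + 1;
    // Fenced code blocks (opening or closing fence).
    if (text.trimStart().startsWith(FENCE)) {
      findings.push({ line, rule: "fenced-code" });
    }
    // Inline links of the form [text](url).
    if (/\[[^\]]+\]\([^)]+\)/.test(text)) {
      findings.push({ line, rule: "link" });
    }
    // Bold used as a subheading at the start of a line, e.g. "**Note**:".
    if (/^\s*\*\*[^*]+\*\*:/.test(text)) {
      findings.push({ line, rule: "bold-subheading" });
    }
  });
  return findings;
}
```

A check like this could run wherever the help text is maintained, so that jarring syntax is caught before it ever reaches the website build.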

katrinafyi · 2025-11-08T14:54:11Z

Update to my last comment, I've gone ahead and made it markdown text rather than code blocks. See commit for commentary de6c2a8

This generally works fine and looks nice, aside from --files-from:

Examples:
  lychee --files-from list.txt
  find . -name '*.md' | lychee --files-from -
  echo 'README.md' | lychee --files-from -

File Format:
  Each line should contain one input (file path, URL, or glob pattern).
  Lines starting with '#' are treated as comments and ignored.
  Empty lines are also ignored.

If this is something we want, I'll PR to tweak the files-from help text.

jacobdalamb · 2025-11-08T17:17:55Z

Check whether remark-smartypants exists within node_modules. If not, delete node_modules and pnpm-lock.yaml and reinstall the packages.

katrinafyi · 2025-11-09T00:38:23Z

It exists in my node modules after I use npm ci. Should I be using npm or pnpm? Is there a difference? Idk pnpm, so it would be good to have some commands in the readme.

jacobdalamb · 2025-11-09T03:18:13Z

Use PNPM. I've added back the commands in the readme.

mre · 2025-11-11T13:55:11Z

src/components/code.astro

+
+
+


Where does that whitespace come from?

I don't know! a past life (branch), perhaps.

mre · 2025-11-11T13:59:23Z

src/generate-cli-options.ts

+import { LYCHEE_VERSION } from "./lychee-version";
+
+// https://raw.githubusercontent.com/lycheeverse/lychee/master/README.md
+const url = `https://raw.githubusercontent.com/lycheeverse/lychee/refs/tags/${LYCHEE_VERSION}/README.md`;


Keeping this in sync with upstream is a bit cumbersome.

I was contemplating whether we could fetch it at build-time?

async function getLatestLycheeVersion(): Promise<string> {
  const response = await fetch('https://api.github.com/repos/lycheeverse/lychee/releases/latest');
  const data = await response.json();
  return data.tag_name;
}

We talked a bit about this in #113 (comment)

Thanks. My vote goes to fetching it. The "project" is the source of truth, not GitHub CI. Will add a comment there.

Either way, no hard blocker. Just wanted to raise this again. We can merge however we decide.

mre · 2025-11-11T14:04:38Z

I really like the new formatting!

blahhhhhhhhh

08061f2

katrinafyi mentioned this pull request Nov 1, 2025

Implement anchors for CLI flags, and automatically fetch --help #102

Closed

katrinafyi added 2 commits November 2, 2025 13:26

use custom integration

7311e8c

add message

d3d0850

katrinafyi added 2 commits November 4, 2025 11:57

Merge remote-tracking branch 'origin/master' into fetch-readme-goofy

3e2e7bf

fdjsamiofdsajiofdsaijo

b968bb4

thomas-zahner reviewed Nov 7, 2025

katrinafyi added 10 commits November 8, 2025 12:25

Merge remote-tracking branch 'origin/master' into fetch-readme-goofy

3c8dc20

rm cli.md

45a3710

trying single code block highlight again. why is it blue???

35353c7

steal starlight's colours to make it grey ;-;

2b85c81

Revert "steal starlight's colours to make it grey ;-;"

5bd52be

This reverts commit 2b85c81.

Revert "trying single code block highlight again. why is it blue???"

0975b29

This reverts commit 35353c7.

biome

e6b31e9

remove escapeMarkdown

0c90d66

remove default value formatting

0936b31

try add remark-smartypants to package.json

35bde02

katrinafyi added 2 commits November 9, 2025 00:35

Revert "remove default value formatting"

96550cf

This reverts commit 0936b31.

katrinafyi added 4 commits November 9, 2025 17:33

pnpm lock

fb71736

reorder sidebar based on (subjective) frequency of use. also

70d233c

tweak page titles to be in Title Case.

Merge remote-tracking branch 'origin/master' into fetch-readme-goofy

0c04834

extract lychee-version.ts and add note of current version to docs

2890f8c

thomas-zahner mentioned this pull request Nov 11, 2025

Keep example config file and documentation up to date lycheeverse/lychee#1768

Open

mre reviewed Nov 11, 2025

katrinafyi and others added 2 commits November 12, 2025 00:10

Apply suggestions from code review

a2eba55

Co-authored-by: Matthias Endler <[email protected]>

undo 3 blank lines

9b2d97b
