@katrinafyi
Contributor

katrinafyi commented Nov 1, 2025

It looks like this:

Screenshot_20251030_133817

This implementation works but there are fun details to think about:

  • Is the formatting good? atm, it's a funny mix of rich text markdown (for the headings) and monospace text (for the help). I think this is a good compromise between having nice formatting and enabling automatic generation
  • also on formatting, astro will combine two hyphens (--) into a unicode em dash (—) and this affects the option flags in markdown headings. I've prevented this using a zero width space but it's a bit :/ is this okay? do we care about keeping the original hyphens?
  • The generation is kind of hacky. It doesn't hook into any astro APIs. instead, it just runs JavaScript code in the content definition file. This means that, for instance, astro needs to be manually restarted after changes to the generator or the _cli.md template file.
  • I tried custom astro content collections as a more "official" way to do this generation. however, it looked like headings would not be propagated if you render a fragment within another page (as we would want to embed the generated options within the template page). this negates most of the benefit
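For reference, the zero-width-space workaround mentioned in the list above could look roughly like the sketch below. The function name `protectDoubleHyphens` is made up for illustration and is not taken from the PR; the only assumption is that the typographic pass merges two directly adjacent hyphens.

```typescript
// U+200B ZERO WIDTH SPACE: invisible when rendered, but it separates the
// two hyphens so they are no longer adjacent in the Markdown source.
const ZERO_WIDTH_SPACE = "\u200B";

/** Insert a zero-width space between every pair of adjacent hyphens. */
function protectDoubleHyphens(heading: string): string {
  // The lookahead leaves the second hyphen unconsumed, so a run such as
  // "---" is handled pairwise and becomes "-\u200B-\u200B-".
  return heading.replace(/-(?=-)/g, `-${ZERO_WIDTH_SPACE}`);
}

// Hypothetical usage when emitting an option heading.
const protectedHeading = protectDoubleHyphens("### --dump");
console.log(protectedHeading.length); // two characters longer than the input? No: one.
```

The trade-off is that anyone copying the heading text also copies the invisible character, so the copied flag is not the literal `--dump`.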

Todo:

  • assertion for body text

@mre
Member

mre commented Nov 1, 2025

The rich text headings + monospace blocks is actually quite good. It makes the docs scannable while preserving the exact CLI output format. I'd keep this.

About the em dash issue, a few alternatives come to mind:

  • Wrap option names in backticks: ### `--dump` (keeps hyphens, looks cleaner)
  • Use a custom Astro component for option headings
  • Configure Astro's typography settings to disable smart quotes/dashes
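As an illustration of the first alternative, a generator could wrap each flag in backticks when it emits the heading. This is only a sketch: `optionHeading` is a hypothetical helper, and it relies on the assumption that the smart-dash pass leaves inline code spans untouched.

```typescript
/**
 * Build a Markdown heading whose text is an inline code span,
 * e.g. a level-3 heading for the flag "--dump".
 */
function optionHeading(flag: string, level: number = 3): string {
  const hashes = "#".repeat(level);
  return `${hashes} \`${flag}\``;
}

console.log(optionHeading("--dump"));
console.log(optionHeading("--files-from", 4));
```

The hyphens stay exactly as typed in the source, at the cost of headings rendering in monospace.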

Have you tried to use Astro's integration system instead?
I'm not an Astro expert by any means. Just found it in their docs: https://docs.astro.build/en/guides/integrations-guide/

Create a proper Astro integration that runs during the build:

// astro.config.mjs
import { defineConfig } from 'astro/config';
import { generateCliDocs } from './src/integrations/cli-docs.ts';

export default defineConfig({
  integrations: [
    generateCliDocs(),
  ]
});
// src/integrations/cli-docs.ts
import type { AstroIntegration } from 'astro';
import { generateCliOptionsMarkdown } from '../fetchReadme';
import { writeFileSync, readFileSync, rmSync } from 'node:fs';

export function generateCliDocs(): AstroIntegration {
  return {
    name: 'generate-cli-docs',
    hooks: {
      'astro:config:setup': async () => {
        // The template is _cli.md; the generated page drops the underscore.
        const docTemplateFile = "src/content/docs/guides/_cli.md";
        const docOutputFile = docTemplateFile.replace('_', '');

        // Remove any stale output before regenerating it.
        rmSync(docOutputFile, { force: true });
        const usageText = await generateCliOptionsMarkdown();

        const docTemplateText = readFileSync(docTemplateFile, "utf-8");
        // A replacer function keeps any "$" sequences in usageText literal.
        const docOutput = docTemplateText.replace(
          'README-OPTIONS-PLACEHOLDER',
          () => usageText
        );
        writeFileSync(docOutputFile, docOutput);
      }
    }
  };
}

The main benefit would be that the dev server would regenerate the docs on restart, and you wouldn't need to manually run a script to update the docs.
But I haven't tested it. @jacobdalamb knows way more about Astro than me.

@jacobdalamb
Collaborator

@mre good point. I hadn't considered using Astro's integration API to add this functionality. Because you mentioned that, I started going through community made integrations and came across this https://github.com/natemoo-re/astro-remote#README which enables "render remote HTML or Markdown content in Astro with full control over the output." what do you think?
@katrinafyi

@katrinafyi
Contributor Author

astro-remote looks promising! I tried it, but it has the same problem as the content collections approach: it doesn't add headers from the remote content to the TOC.

image

But maybe we can cook up our own integration to get reloading.

@katrinafyi
Contributor Author

I've made changes following @mre's suggestions. We disable smart dashes across the entire site and we use a custom integration to generate the file. The integration code is basically the same as mre's suggestion, but we have to add addWatchFile to reload when the md file is changed.
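A rough sketch of what such an integration might look like with `addWatchFile` is below. It is not the code from this PR. To keep it runnable without Astro or a filesystem, the hook options and file access are local stand-ins (`SetupOptions` and `FileAccess` are hypothetical names); in a real integration the options come from Astro's `astro:config:setup` hook and the reads and writes would go through `node:fs`.

```typescript
// Local stand-ins so the sketch runs on its own. Astro's real hook options
// contain many more fields than addWatchFile.
type SetupOptions = { addWatchFile: (path: string) => void };
type FileAccess = {
  read: (path: string) => string;
  write: (path: string, text: string) => void;
};

const PLACEHOLDER = "README-OPTIONS-PLACEHOLDER";

function generateCliDocs(templateFile: string, usageText: string, files: FileAccess) {
  return {
    name: "generate-cli-docs",
    hooks: {
      "astro:config:setup": ({ addWatchFile }: SetupOptions) => {
        // Ask the dev server to reload when the template changes.
        addWatchFile(templateFile);

        // ".../_cli.md" -> ".../cli.md": drop the file name's leading underscore.
        const outputFile = templateFile.replace(/(^|\/)_([^/]*)$/, "$1$2");

        // A replacer function keeps "$" sequences in the help text literal.
        const output = files.read(templateFile).replace(PLACEHOLDER, () => usageText);
        files.write(outputFile, output);
      },
    },
  };
}
```

Registering the generator script itself with `addWatchFile` in the same way would presumably cover the "restart after changes to the generator" case from the PR description as well.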

@mre
Member

mre commented Nov 3, 2025

Neat! I can see how this is taking shape. 😊

One thing I wanted to mention was that I like the colorful examples in the current version (these little "terminal" boxes). Can we get them back somehow? I don't know how hard that would be, so that's completely optional, of course.

Before

image

After

image

@thomas-zahner
Member

@katrinafyi Cool, I really like that. Rebasing on top of master should fix the link check errors.

import { readFileSync, realpathSync, rmSync, writeFileSync } from 'node:fs';
import { basename, dirname, join } from 'node:path';

const VERSION = "lychee-v0.21.0";
Member

Updating this with each release manually is quite cumbersome. Instead, we can fetch the latest version via API.

# first element seems to be "nightly", second is latest stable release
curl "https://api.github.com/repos/lycheeverse/lychee/releases" | jq '.[1] | .name' -r
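Relying on the second array element is fragile if the ordering of the list ever changes. A slightly more defensive selection, sketched in TypeScript, is shown below. It assumes the list response exposes `tag_name`, `prerelease` and `draft` per release, and that the nightly build can be recognised by "nightly" in its tag name; the sample data is made up.

```typescript
// Only the fields used here; the real API response has many more.
type Release = { tag_name: string; prerelease: boolean; draft: boolean };

/** Return the tag of the first release that is not a draft, prerelease or nightly. */
function latestStableTag(releases: Release[]): string | undefined {
  const stable = releases.find(
    (r) => !r.draft && !r.prerelease && !r.tag_name.includes("nightly"),
  );
  return stable?.tag_name;
}

// Made-up sample shaped like the list response described above.
const sampleReleases: Release[] = [
  { tag_name: "nightly", prerelease: true, draft: false },
  { tag_name: "lychee-v0.21.0", prerelease: false, draft: false },
];
console.log(latestStableTag(sampleReleases)); // prints "lychee-v0.21.0"
```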

Contributor Author

@katrinafyi katrinafyi Nov 7, 2025

I remember being told, effusively, by @jacobdalamb that there is a CI workflow to periodically check the release tags. That workflow doesn't exist anymore - I don't know what happened there.

My plan was that the version would be hard coded and the CI check would be a gentle reminder to update it.

I think it is good to tie the website's source code to a particular version, so that any commit can be checked out and the website can be built as it was at that point in time. This helps, e.g., when bisecting (tho this is more important for code rather than docs).

Updating the version manually could also be a reminder to update the docs with changes from that version. In the past, I have been sloppy and forgotten to write/fix the docs :)

Collaborator

I deleted it because it felt tacky but I can add it back.

Contributor Author

I see. Let's see what people think about hard coding vs fetching the latest automatically, we might not have to bring it back.

Member

Agree, it makes sense to pin a specific release version in the code. I would opt for this approach. But then it would be really great to have automation, similar to dependabot, which checks for the newest release version and creates a new PR to bump the version in the code automatically. This can be done in a later PR.

In any case it would make sense to extract this pinned version into a separate file. Possibly even a non-TS config file.
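The core of such an automated check could be a comparison between the pinned tag and the newest release tag, sketched below. The helper names are hypothetical, and the sketch assumes every tag follows the `lychee-vMAJOR.MINOR.PATCH` shape seen in this PR; how the latest tag is obtained and how the PR gets opened is left out.

```typescript
type Version = [number, number, number];

/** Parse a tag such as "lychee-v0.21.0" into [0, 21, 0]. */
function parseVersion(tag: string): Version | undefined {
  const match = /^lychee-v(\d+)\.(\d+)\.(\d+)$/.exec(tag);
  if (!match) return undefined;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

/** True when latestTag is a strictly newer version than pinnedTag. */
function isNewer(latestTag: string, pinnedTag: string): boolean {
  const latest = parseVersion(latestTag);
  const pinned = parseVersion(pinnedTag);
  if (!latest || !pinned) {
    throw new Error(`Unexpected tag format: ${latestTag} / ${pinnedTag}`);
  }
  // Compare numerically, field by field, so 0.9.0 sorts below 0.21.0.
  if (latest[0] !== pinned[0]) return latest[0] > pinned[0];
  if (latest[1] !== pinned[1]) return latest[1] > pinned[1];
  return latest[2] > pinned[2];
}

console.log(isNewer("lychee-v0.22.0", "lychee-v0.21.0")); // prints true
```

A scheduled workflow could call a check like this and only open a bump PR when it returns true.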

Contributor Author

I've pulled the version into a separate TS file. The only reason it's typescript is so I can use javascript comments, otherwise i'd have to do some manual parsing. Let me know if that works :)

Member

Sure, that works 👍

Member

Let's see what people think about hard coding vs fetching the latest automatically, we might not have to bring it back.

I personally much prefer fetching the latest automatically. It's one less thing to remember and one less thing to continuously update manually. And I don't see much of a downside. On top of that, it's less code to fetch the version instead of maintaining the GitHub workflow. @jacobdalamb @thomas-zahner wdyt? 😅

Member

Hmm, initially I thought the same. But fetching the latest version means fetching the latest version at build time. (correct me if I'm wrong) So the result purely depends on when the docs are built. If we pin a specific version in the code the docs become reproducible, meaning the same result for everyone for each revision.

Regardless of which approach we use we need to rebuild the docs, when we want to reflect changes in the README. So ideally we should write an additional workflow which handles this. (I can do so after we have merged this PR)

The main difference is that one approach keeps the result reproducible while the other doesn't. So I think pinning is just an additional "feature" we ideally should do but it doesn't affect the workflow situation. (on new release -> [optional: update pinned version in code] ->rebuild and publish docs)

Member

I see your point. That's fine with me.

@katrinafyi
Contributor Author

katrinafyi commented Nov 8, 2025

I've added some colourful boxes. However, it's pretty naive and they're not quite valid examples - they still use placeholder variables and, in the case where a flag requires another, it does not show the flags being used together.

It looks like this
image

The more I work with this, the more I think that it would be nice if the docs site shows the CLI help as markdown formatted text. Initially, I was reluctant because it means the CLI help text would have to be written with a particular formatting which is only useful to the website. But maybe it's not so bad because it's already most of the way there and the hard wrapping of lines is already a requirement with the code block approach. I'll investigate.

The CI pipeline that builds astro is failing at the astro build step. Idk why because the package should exist as a transitive dependency, and it works on my machine with an npm ci. Also, I feel like the README for this repo used to show npm commands to use but those have disappeared - why is pnpm being used in CI? I will appreciate any guidance.

that this is generally nicer to read. However, this means that the
website text is rendered from the same text as the CLI --help. The
website understands Markdown, but the CLI does not (obviously) so we
have to be careful.

I would suggest that when writing help, the focus should be on
readability within the *console*. This means that while Markdown can be
used, it should be limited to syntax which is unobtrusive in raw text.
Raw text is still the main format which the help text will be rendered
in.

Personally, these would be okay:
- single `*` or `_` for emphasis (with preference for `*`)
- single backticks for inline code
- four space indentation for code blocks
- bullet and numbered lists

Imo, these would *not* be okay, because they appear too jarring
in plain text:
- code blocks with triple backtick fences. this includes any
  astro-specific asides and the like.
- link syntax with `[link](https://url)`
- bold when used for subheadings like `**Note**:`

I think this is a good compromise which lets the same text be usable for
both CLI --help and the website's rich text HTML.
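The "not okay" list above could be enforced mechanically. The sketch below is one possible check, not something this PR contains: the rule names and regular expressions are my own approximations of the three cases listed, and they would likely need tuning against the real help text.

```typescript
type Rule = { name: string; pattern: RegExp };

// Approximations of the discouraged syntax. None of the patterns use the
// "g" flag, so RegExp.test stays stateless between calls.
const DISALLOWED: Rule[] = [
  // A line that starts (after optional indentation) with three backticks.
  { name: "fenced code block", pattern: /^\s*`{3}/m },
  // Inline link syntax: [text](target)
  { name: "link syntax", pattern: /\[[^\]]+\]\([^)]+\)/ },
  // Bold text directly followed by a colon, used as a subheading.
  { name: "bold subheading", pattern: /\*\*[^*\n]+\*\*:/ },
];

/** Return the names of all discouraged constructs found in the help text. */
function disallowedMarkdown(helpText: string): string[] {
  return DISALLOWED.filter((rule) => rule.pattern.test(helpText)).map((rule) => rule.name);
}

console.log(disallowedMarkdown("**Note**: see [docs](https://example.com)"));
```

Emphasis, inline code, indented code blocks and lists pass through untouched, which matches the "okay" list.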
@katrinafyi
Contributor Author

katrinafyi commented Nov 8, 2025

Update to my last comment, I've gone ahead and made it markdown text rather than code blocks. See commit for commentary de6c2a8

image

This generally works fine and looks nice, aside from --files-from:

Examples:
  lychee --files-from list.txt
  find . -name '*.md' | lychee --files-from -
  echo 'README.md' | lychee --files-from -

File Format:
  Each line should contain one input (file path, URL, or glob pattern).
  Lines starting with '#' are treated as comments and ignored.
  Empty lines are also ignored.

If this is something we want, I'll PR to tweak the files-from help text.
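To make the quoted file format concrete, here is a small TypeScript illustration of the three rules (one input per line, `#` comments, empty lines ignored). It is not lychee's actual parser, and trimming surrounding whitespace before applying the rules is my assumption.

```typescript
/** Split a --files-from style list into its inputs, following the rules quoted above. */
function parseInputList(text: string): string[] {
  return text
    .split(/\r?\n/)                 // one input per line
    .map((line) => line.trim())     // assumption: surrounding whitespace is not significant
    .filter((line) => line !== "" && !line.startsWith("#")); // skip blanks and comments
}

const exampleList = "# docs to check\nREADME.md\n\nhttps://example.com\n*.md\n";
console.log(parseInputList(exampleList));
```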

@jacobdalamb
Collaborator

Check whether remark-smartypants exists within node_modules. If not, delete node_modules and pnpm-lock.yaml and reinstall the packages.

@katrinafyi
Contributor Author

katrinafyi commented Nov 9, 2025

It exists in my node modules after I use npm ci. Should I be using npm or pnpm? Is there a difference? Idk pnpm, so it would be good to have some commands in the readme.

@jacobdalamb
Collaborator

Use PNPM. I've added back the commands in the readme.

Comment on lines 2 to 4
Member

Where does that whitespace come from?

Contributor Author

I don't know! a past life (branch), perhaps.

Comment on lines +5 to +8
import { LYCHEE_VERSION } from "./lychee-version";

// https://raw.githubusercontent.com/lycheeverse/lychee/master/README.md
const url = `https://raw.githubusercontent.com/lycheeverse/lychee/refs/tags/${LYCHEE_VERSION}/README.md`;
Member

Keeping this in sync with upstream is a bit cumbersome.

I was contemplating whether we could fetch it at build-time?

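async function getLatestLycheeVersion(): Promise<string> {
  const response = await fetch('https://api.github.com/repos/lycheeverse/lychee/releases/latest');
  // Fail loudly on rate limits or outages instead of returning undefined.
  if (!response.ok) {
    throw new Error(`GitHub API request failed: ${response.status}`);
  }
  const data = await response.json();
  return data.tag_name;
}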

Contributor Author

We talked a bit about this in #113 (comment)

Member

Thanks. My vote goes to fetching it. The "project" is the state of truth, not GitHub CI. Will add a comment there.

Member

Either way, no hard blocker. Just wanted to raise this again. We can merge however we decide.

@mre
Member

mre commented Nov 11, 2025

I really like the new formatting!

4 participants