Skip to content

Conversation

@hkir-dev
Copy link
Contributor

@hkir-dev hkir-dev commented May 21, 2025

This PR adds support for generating changelogs for ontology releases using OAK's diff functionality, addressing #470.

A new release_changelog Makefile target added that produces:

  • reports/changelog.md – a markdown-formatted summary suitable for GitHub Releases
  • reports/changelog.csv – a CSV with changeling statistics
  • reports/changelog.yaml – KGCL Yaml file

How it works:

  1. The latest release -base.obo is downloaded (or an empty obo file is created if unavailable)
  2. Diffs are computed between the latest release and the current version using runoak diff
  3. Outputs are generated in all three formats

Copy link
Contributor

@gouttegd gouttegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two open questions about the feature:

  • Should the changelogs be generated (1) on-demand only (as is the case for now in the PR), (2) automatically unconditionally, or (3) automatically if explicitly enabled using a new option in the ODK configuration?
  • Should the diffs be computed (1) on -base, (2) on -full, (3) on whatever the user wants (through another option in the config file)?

Beyond that, the feature should not depend on OBO artefacts (which may not always be generated) and the changelogs should be gitignore’d by default.

{%- endif %}

.PHONY: release_changelog
release_changelog: $(REPORTDIR)/changelog.md $(REPORTDIR)/changelog.csv $(REPORTDIR)/changelog.yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open question: for now, the release changelogs are only generated “on demand”, when a user explicitly runs make release_changelog.

Do we want the changelog to be generated automatically? The original issue is uncommitted about that that.

(Personally I’d say no – i.e., leave things like that, generate on-demand only –, or at least, if it is generated automatically make it configurable.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd favour to be able to activate it via the config yaml. Running it on demand only would suffice, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My worry is if you do NOT fold it into the release process, then some quirk in the dependency chain might cause a complete regeneration of the "current release" locally, while its 99% certain that, if the rule is indeed run manually, the release has already been generated.

Also, if the temporary release files are copied up during prepare_release, it will definitely regenerate the release.

So I vote for

  1. making this configurable in odk.yaml
  2. including it in the normal release process
  3. While at (1), also add an option to configure the ontology variant used for the diff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My worry is if you do NOT fold it into the release process, then some quirk in the dependency chain might cause a complete regeneration of the "current release" locally

This will only happen if a source file has changed since the last time the release was built – and in that case, re-generating the release before making the diff is the expected behaviour.

If a dependency on $(ONT)-base.owl causes $(ONT)-base.owl to be rebuilt even though no source has changed, then it’s not a “quirk in the dependency chain”, it’s a full-on bug – one that should be fixed once it is reported.

Also, if the temporary release files are copied up during prepare_release, it will definitely regenerate the release.

I don’t see why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t see why?

See here:

prepare_release: all_odk
rsync -R $(RELEASE_ASSETS) $(RELEASEDIR) &&\{% if project.sssom_mappingset_group is defined %}{% if project.sssom_mappingset_group.released_products is defined %}
mkdir -p $(RELEASEDIR)/mappings && cp -rf $(RELEASED_MAPPING_FILES) $(RELEASEDIR)/mappings &&\{% endif %}{% endif %}{% if project.use_dosdps %}
mkdir -p $(RELEASEDIR)/patterns && cp -rf $(PATTERN_RELEASE_FILES) $(RELEASEDIR)/patterns &&\{% endif %}
rm -f $(CLEANFILES) &&\
echo "Release files are now in $(RELEASEDIR) - now you should commit, push and make a release \
on your git hosting site such as GitHub or GitLab"

People run: prepare_release; clean will clean up main release files so they are no longer in src/ontology. Running the diff will force the base to be rebuild.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I would consider a bug. prepare_release should not, in my opinion, clean files after a release.

What is even the point of doing that? It only ensures that the next time prepare_release is run, it would be from a clean state.

It would be much useful to ensure that intermediate files are cleaned up before running the release (that’s in fact what I always do: I never run make prepare_release, always make clean prepare_release).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⬆️ Anyway, this above is a different issue, and one that I am reluctant to fix now, so close to release – this “cleaning after release“ has been the behaviour of prepare_release for a while, I don’t think it is urgent to change it (I do think it should be changed, just not now).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making this configurable in odk.yaml
[…]
While at (1), also add an option to configure the ontology variant used for the diff.

For simplicity, I’d favour a single option, that decides both whether a changelog is generated and the type of artefact from which it is computed (instead of one option to enable the changelog and another one from the type of artefact).

Something like

changelog_from: (base|full|simple|none)

where none is understood as disabling the production of the changelog.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like

Fine by me!

Copy link
Contributor

@gouttegd gouttegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the unneeded grep -v ^owl-axioms steps.

Apart from that, looking good to me.

…from the ROBOT OWL-to-OBO conversion operations
Copy link
Contributor

@gouttegd gouttegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing that I didn’t see earlier.

Copy link
Contributor

@gouttegd gouttegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good to me.

@matentzn Feel free to make a case that (1) the release changelog should be generated automatically instead of on-demand and/or (2) it should be possible to generate the changelog from -full instead of -base.

Personally, I’m happy with the current behaviour.

Copy link
Contributor

@matentzn matentzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments.

release_changelog: $(REPORTDIR)/changelog.md $(REPORTDIR)/changelog.csv $(REPORTDIR)/changelog.yaml

$(TMPDIR)/latest_release.obo: $(TMPDIR)
wget $(ONTBASE)/$(ONT)-base.owl -O [email protected] || { \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the dependence on "base" here should be documented somewhere. Not everyone likes the idea of having a base, but everyone will want to generate release notes.. Can we make the "diff" ontology configurable, using base by default if and only if it is a configured release product, otherwise the main release?

{%- endif %}

.PHONY: release_changelog
release_changelog: $(REPORTDIR)/changelog.md $(REPORTDIR)/changelog.csv $(REPORTDIR)/changelog.yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My worry is if you do NOT fold it into the release process, then some quirk in the dependency chain might cause a complete regeneration of the "current release" locally, while its 99% certain that, if the rule is indeed run manually, the release has already been generated.

Also, if the temporary release files are copied up during prepare_release, it will definitely regenerate the release.

So I vote for

  1. making this configurable in odk.yaml
  2. including it in the normal release process
  3. While at (1), also add an option to configure the ontology variant used for the diff.

}
$(ROBOT) convert -i [email protected] --check false -f obo $(OBO_FORMAT_OPTIONS) -o $@ && rm [email protected]

$(TMPDIR)/current_release.obo: $(ONT)-base.owl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is good as it is not at all certain the ontology does indeed have an OBO release.

@matentzn matentzn marked this pull request as draft May 27, 2025 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants