@Keno
Member

@Keno Keno commented Nov 1, 2025

Motivation

There are several corner cases in the Julia syntax that are essentially bugs or mistakes that we'd like to possibly remove, but can't due to backwards compatibility concerns.

Similarly, when adding new syntax features, there are often cases that overlap with valid (but often nonsensical) existing syntax. In the past, we've mostly made judgment calls that these were "minor changes", but as the package ecosystem grows, so does the chance of someone accidentally relying on such syntax, and our "minor changes" have (subjectively) resulted in more breakage recently.

Fortunately, all the recent work on making the parser replaceable, combined with the fact that JuliaSyntax already supports parsing multiple revisions of Julia syntax, provides a solution here: just let packages declare what version of the Julia syntax they are using. That way, packages would not break if we make changes to the syntax, and they can be upgraded at their own pace the next time the author of that particular package upgrades to a new Julia version.

Core mechanism

The way this works is simple. Right now, the parser function is always looked up in Core._parse. With this PR, it is instead looked up as mod._internal_julia_parse (slightly longer name to avoid conflicting with existing bindings of the name in downstream packages), or Core._parse if no such binding exists. Similar for _lower.
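
The lookup rule above can be modelled in a few lines of ordinary Julia. This is only a sketch of the described behaviour, not the code in this PR: `lookup_parser`, `Opted`, and `Plain` are hypothetical names, and the real lookup happens inside the runtime rather than in a user-level function.

```julia
# Hypothetical helper modelling the described lookup; not the PR's implementation.
function lookup_parser(mod::Module)
    # Prefer a module-local parser binding when the module defines one.
    if isdefined(mod, :_internal_julia_parse)
        return getfield(mod, :_internal_julia_parse)
    end
    # Otherwise fall back to the global parser hook, as happens today.
    return Core._parse
end

# A module that has opted in by defining the binding (stand-in parser).
module Opted
    _internal_julia_parse(args...) = error("stand-in parser, never called here")
end

# A module without the binding.
module Plain end

lookup_parser(Opted) === Opted._internal_julia_parse  # module-local binding wins
lookup_parser(Plain) === Core._parse                  # falls back to Core._parse
```

Because the binding is resolved per module, two packages loaded into the same session can end up with different parsers without affecting each other.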

There is a macro @Base.Experimental.set_syntax_version v"1.xx" that will set the _internal_julia_parse binding (and, in the future, the _lower version) to one that propagates the version to the parser, so users are not expected to manipulate the binding directly.

Versioned package loading

The loading system is extended to look at a new syntax.julia_version key in Project.toml (and Manifest for explicit environments). If no such key exists, it defaults to the minimum allowed version of the Julia compat. If no compat is defined, it defaults to the current Julia version. This is technically slightly less backwards compatible than defaulting this to Julia 1.13, but I think it will be less surprising in the future for the default syntax to match what is in the REPL. Most Julia packages do already define a Julia compat.
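
The defaulting order (explicit key, then lowest compat version, then the running version) can be sketched as follows. This is an illustration under stated assumptions, not the loader code from this PR: `resolve_syntax_version` is a hypothetical helper, the key is assumed to live in a `syntax` table with a `julia_version` entry, and the compat handling is deliberately naive.

```julia
using TOML

# Hypothetical helper modelling the defaulting rules described above.
# Assumptions: the key is read as table `syntax`, entry `julia_version`; the
# compat handling just takes the smallest version number appearing in the
# string, which only matches real compat semantics for simple entries such
# as "1.10" or "1.6, 1.10".
function resolve_syntax_version(project::AbstractDict; current::VersionNumber = VERSION)
    syntax = get(project, "syntax", Dict{String,Any}())
    haskey(syntax, "julia_version") && return VersionNumber(syntax["julia_version"])

    compat = get(get(project, "compat", Dict{String,Any}()), "julia", nothing)
    if compat !== nothing
        found = [VersionNumber(String(m.match)) for m in eachmatch(r"\d+(?:\.\d+){0,2}", compat)]
        isempty(found) || return minimum(found)
    end
    return current
end

explicit = TOML.parse("""
name = "Example"

[compat]
julia = "1.10"

[syntax]
julia_version = "1.14"
""")

compat_only = TOML.parse("""
name = "Example"

[compat]
julia = "1.6, 1.10"
""")

resolve_syntax_version(explicit)             # explicit key wins
resolve_syntax_version(compat_only)          # lowest version named in compat
resolve_syntax_version(Dict{String,Any}())   # no key, no compat: running version
```

A real implementation would reuse the compat range parser that this PR moves into Base instead of the regular expression used here.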

Note that as a result of this, the code for parsing compat ranges moves from Pkg to Base.

Syntax changes

This introduces two parser changes:

  1. @VERSION (and similar macrocall forms of a macro named VERSION) are now special and trigger the parser to push its version information into the source location field of the macrocall. Note that because this is in the parser, this affects all macros with the name. However, there is also logic on the macrocall side that discards this again if the macro cannot accept it. This special mechanism is used by the Base.Experimental.@VERSION macro to let users detect the parse version.

  2. The module syntax form gains a syntax version argument that is automatically populated with the parser's current version. This is the mechanism to propagate syntax information from the parser to the core mechanism above.

Note that these are only active if a module has opted into 1.14 syntax, so macros that process :module exprs will not see these changes unless and until the calling module opts into 1.14 syntax via the above mentioned mechanisms (which is the primary advantage of this scheme).
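
For macro authors, it may help to look at the shapes these two parser changes touch. The sketch below only uses `Meta.parse` as it behaves on released Julia versions. Where exactly the new version argument is placed is not stated in this description, so the code does not assume a position for it.

```julia
# Parse a module definition without evaluating it.
mod_ex = Meta.parse("module A end")
mod_ex.head        # :module
mod_ex.args        # on releases without this change: a flag, the name :A, and the body block

# Parse a macro call. The second argument is the source location slot that
# the description above refers to.
call_ex = Meta.parse("@VERSION")
call_ex.head       # :macrocall
call_ex.args[1]    # Symbol("@VERSION")
call_ex.args[2]    # a LineNumberNode on released Julia versions
```

Macros that pattern-match on the exact number of arguments of a `:module` expression are the ones most likely to need attention once their callers opt into 1.14 syntax.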

Final words

I should emphasize that I'm not proposing using this for any big syntax revolutions or anything. I would just like to start cleaning up a few corners of the syntax that I think are universally agreed to be bad but that we've kept for backwards compatibility. This way, by the time we get around to making a breaking revision, our entire ecosystem will have already upgraded to the new syntax.

Remaining TODO

  • Standalone tests for the parser changes
  • JuliaLowering update
  • Pkg.jl side changes
  • NEWS entry

Open questions

  • What should be the version of the flisp parser?

Some doc and test edits by Claude but mostly manual.

@Keno Keno requested review from JeffBezanson and mlechu November 1, 2025 22:05
@KristofferC
Member

Can you compare this to https://doc.rust-lang.org/edition-guide/editions/ which, on the surface, looks fairly similar?

How would macros be handled? The Rust edition docs have a special section about that: https://doc.rust-lang.org/edition-guide/editions/advanced-migrations.html#migrating-macros

@KristofferC
Member

KristofferC commented Nov 2, 2025

It's also common to directly include package files in the REPL, e.g. for interactive debugging. To me that would mean the project file syntax version would be required (so you know how to parse things based on the current active project), and the @Base.Experimental.set_syntax_version v"1.14" approach is not really workable. Alternatively, all files need a marker that says how they are parsed.

@Keno
Member Author

Keno commented Nov 2, 2025

> https://doc.rust-lang.org/edition-guide/editions/

Yes, it's a substantially similar mechanism with the same goals.

> How would macros be handled?

Macros are expanded according to the lowering version of the calling module. This may of course mean that the macro sees syntax that is not part of the syntax revision that the defining module expects, but the user of the macro can decide how to deal with that at usage time - the resolution will not retroactively change.

> It's also common to directly include package files in the REPL for e.g. interactive debugging

If the REPL context module is switched to the package, the REPL will use the syntax version of the package. If the file is included in Main, then of course the environment may be different. However, I don't think this is all that different from e.g. loading a different version of a package because the project wasn't activated. That said, I think it would be reasonable and useful to have the REPL use the syntax revision of the activated project, even for the main module (and switch this when the project is switched).

@Keno
Member Author

Keno commented Nov 2, 2025

> Alternatively, all files need a marker that says how they are parsed.

I don't want to do this at file (or even module) granularity, at least without explicit opt-in - I think it would be very confusing if it was a common situation that different files within the same package did not have the same syntax revision.

@tecosaur
Member

tecosaur commented Nov 5, 2025

However, to make this truly smooth, I think this should happen automatically through a Project.toml opt-in specifying the expected syntax version.

I do wonder if in the future it would be reasonable to have certain minimum Julia versions imply a certain minimum syntax version?

@mlechu
Member

mlechu commented Nov 5, 2025

I read the discussion above, but I still think macros are going to be tricky. This shouldn't block starting work on the mechanism, but it might become a problem when we start trying to do the evolution part. Some questions and suggestions are below, hopefully more helpful than distracting.

Here are the guarantees I think we should be providing with this mechanism:

  1. Within a syntax version, the same AST has the same semantics (or lack thereof). Requiring a constant syntax version here is a weakening of what we currently have, but it should be made OK by (3) and (4) below.
  2. Within a syntax version, the same text parses to the same AST (or error). The "weakening" comment above applies here since we haven't realistically been able to change parsing without macro breakage, but it's a bit of a strengthening too: we don't promise this stability right now, and it would be nice to. It would eliminate issues like 59911.
  3. The compiler and runtime can handle code written using the current syntax version or any earlier one.
  4. (from rust) No splitting the ecosystem; modules can interact as they do now no matter what syntax version each uses

I'm the least sure about how to achieve the last one with macros. If we're designating the caller responsible for knowing what syntax version it's running in and what syntax version its callees take, I think we might as well make syntax evolution a parsing-only thing and not worry about breaking the AST. I fear either would just produce our current "change breaks macros" situation with extra steps. Some examples:

  • Assume we want to change parsing so that global a=1, b=2 becomes (global (= a 1) (= b 2)) instead of (global (= a (= (tuple 1 b) 2))). What syntax should the macro in DifferentVersion.@foo global a=1, b=2 see? Should a macro anticipate being called from other syntax versions?
  • On the output end, would @eval $(M.@produce_different_version_syntax) work?
  • 57368 is an example of a potential change of meaning to an existing AST. Only a few packages needed their macros updated, but ancient versions of those packages show up in dependency chains everywhere. Unless I'm missing something, "caller responsibility" would mean views.jl needs to provide different macro implementations per syntax version and have the caller choose between them.

My suggestion is to run an AST conversion on other-version macro inputs and outputs. I wrote down a few thoughts in the "attempt to define the AST" PR (c42f/JuliaLowering.jl#93), but I'll keep thinking about this

Another suggestion: if we're able to convert to the latest version at the AST level, could we define "syntax version" to end between macro expansion and desugaring rather than after lowering? Old syntax would be converted to new syntax before lowering. This way we can avoid tying up the lowering implementation (and the version of CodeInfo it produces) into the definition of a syntax version, so we can get new lowering changes in all versions without worrying about syntax other than the latest version.

@Keno
Member Author

Keno commented Nov 7, 2025

My preference would be to use #59995 if that is merged, but this is a separate feature (with similar motivations around API evolution of course) and there could be a different opt-in mechanism.

Since this won't be merged as is, the current thinking is to have a syntax.julia_version key in the Project.toml, but default it to the min-compat of julia if not set. The mechanism for propagating this from Project.toml to the module is somewhat TBD, but will need to be recorded in the manifest also.

@Keno
Member Author

Keno commented Nov 7, 2025

> Should a macro anticipate being called from other syntax versions?

Macros should anticipate being called from any julia syntax version within their julia compat range.

> Within a syntax version, the same text parses to the same AST (or error)

Yes, I think this is reasonable.

> The compiler and runtime can handle code written using the current syntax version or any earlier one.

Yes, required by semver

> (from rust) No splitting the ecosystem; modules can interact as they do now no matter what syntax version each uses

I think it's ok for the package to declare compat bounds and only support users with those syntax versions.

> My suggestion is to run an AST conversion on other-version macro inputs and outputs.

I think this would be too complicated. If at all, I don't think we should do this in the macro expander, but instead provide a SyntaxCompat.jl package that does appropriate rewrites.

> This way we can avoid tying up the lowering implementation (and the version of CodeInfo it produces) into the definition of a syntax version, so we can get new lowering changes in all versions without worrying about syntax other than the latest version.

I don't think we want to guarantee anything about the output from lowering. However, I do think we may want to enable (subtle) changes in the behavior of things. So e.g. using Foo may lower to usingv1 in one syntax version and usingv2 in another if we want to fix a semantic bug in how using works (e.g. whether or not it explicitly introduces the imported module name).

@mlechu
Member

mlechu commented Nov 13, 2025

> Unless I'm missing something, "caller responsibility" would mean views.jl needs to provide different macro implementations per syntax version and have the caller choose between them.

answer: the macro can figure out its caller's syntax version by inspecting __module__, so that at least provides some way out of the "same AST, different meanings" situation.
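
A rough sketch of what "inspecting `__module__`" could look like is below. How a module's syntax version is actually queried is not specified in this thread, so `caller_syntax_version` and the `SYNTAX_VERSION_STANDIN` constant are hypothetical stand-ins; only `__module__`, `GlobalRef`, and `macroexpand` are existing Julia features.

```julia
# Hypothetical stand-in for "ask a module for its syntax version". The real
# query is not specified in this thread; here a module may define a constant
# named SYNTAX_VERSION_STANDIN, and anything else counts as the running version.
function caller_syntax_version(mod::Module)
    isdefined(mod, :SYNTAX_VERSION_STANDIN) || return VERSION
    return getfield(mod, :SYNTAX_VERSION_STANDIN)
end

# A macro that picks a code path from the caller's syntax version.
macro rules_in_effect()
    v = caller_syntax_version(__module__)
    return v >= v"1.14" ? "1.14 rules" : "pre-1.14 rules"
end

module NewStyle
    const SYNTAX_VERSION_STANDIN = v"1.14"
end
module OldStyle
    const SYNTAX_VERSION_STANDIN = v"1.10"
end

# Expand the same call in two different calling modules.
call = Expr(:macrocall, GlobalRef(@__MODULE__, Symbol("@rules_in_effect")), LineNumberNode(1, :sketch))
macroexpand(NewStyle, call)   # expands to "1.14 rules"
macroexpand(OldStyle, call)   # expands to "pre-1.14 rules"
```

The macro body stays in one place and branches on the caller, which is the "way out" described above for an AST whose meaning differs between syntax versions.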

> I think this would be too complicated

I guess so; we would need round-tripping guarantees. I agree that not rewriting should be fine for now, since deciding we need it later wouldn't break anything (though maybe the inspection mentioned above should be through a macro just in case). It's also worth waiting until we know how users write provenance-preserving macros with JuliaLowering, as that interface is still unspecified, and would ideally fit in with the syntax versioning mechanism.

> I do think we may want to enable (subtle) change in the behavior of things. So e.g. using Foo may lower to usingv1 in one syntax version and usingv2 in another if we want to fix a semantic bug in how using works (e.g. whether or not it explicitly introduces the imported module name).

I still think it's better to rewrite old syntax here than swap in an old copy of lowering. This would be simpler than any rewrites in macro expansion, since we'd only need to convert forwards.

@Keno Keno force-pushed the kf/syntaxevolution branch from 0109cc1 to 25bd387 Compare November 17, 2025 08:00
@Keno Keno changed the title WIP: Provide mechanism for Julia syntax evolution Provide mechanism for Julia syntax evolution Nov 17, 2025
@Keno Keno force-pushed the kf/syntaxevolution branch from 25bd387 to 39b331a Compare November 17, 2025 08:01
@Keno Keno marked this pull request as ready for review November 17, 2025 08:01
@Keno
Member Author

Keno commented Nov 17, 2025

This is now fully implemented. I've replaced the PR description by a full description of the final mechanism.

@lgoettgens
Contributor

Some comments:

  1. I find it a bit confusing to have both the global variable VERSION of the currently running julia version and the macro Base.Experimental.@VERSION that of the parser version. Would it be possible to make the latter a bit more explicit in its name that it is about parsing?
  2. How should @Base.Experimental.set_syntax_version and syntax.parser_version in the Project.toml interact? Which one is a package author supposed to set? What happens if they differ? And why even two different places where this can be set?

@Keno
Member Author

Keno commented Nov 17, 2025

> I find it a bit confusing to have both the global variable VERSION of the currently running julia version and the macro Base.Experimental.@VERSION that of the parser version. Would it be possible to make the latter a bit more explicit in its name that it is about parsing?

I originally had Base.Experimental.@SYNTAX_VERSION, but it turns out the parser can only support 10-character keywords at the moment. That's a bit of an arbitrary restriction and could definitely be fixed, but then I thought @VERSION may be better anyway. I was thinking of maybe doing (@VERSION).syntax, with (@VERSION).runtime being the same as VERSION. Possibly there would also be (@VERSION).lowering - although for that to differ from .syntax, it'd have to be a bit of a weird situation where you parse something in one module and then macroexpand it in the context of another. Also note that it's not the parser version, but the version of the language that the parser parses (subtle distinction).

> How should @Base.Experimental.set_syntax_version and syntax.parser_version in the Project.toml interact?

(N.B. syntax.julia_version)

The former overrides the latter. But note that the former probably doesn't work the way you'd expect. It's a runtime change, not a parse-time change, so in module; (@set_syntax_version foo); bar; end the expression bar will be parsed with the old rules, because the entire module gets parsed before the module runs its first statement. For this reason the Project.toml version is necessary, and also for this reason the macro is not recommended for use in packages.
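
The parse-before-run ordering can be observed on any released Julia with `Meta.parse`. The sketch below is only an illustration: `switch_rules_here` and `parsed_with_which_rules` are placeholder names that are never evaluated, standing in for the macro call and for bar.

```julia
# A module whose body has two statements. Nothing here is evaluated.
src = """
module Demo
switch_rules_here()
parsed_with_which_rules()
end"""

ex = Meta.parse(src)

# The parser returns one expression for the whole module...
ex.head   # :module

# ...and its body block already contains both statements, so the second one
# was parsed before the first one ever had a chance to run.
body = only(a for a in ex.args if a isa Expr && a.head === :block)
statements = [a for a in body.args if !(a isa LineNumberNode)]
length(statements)   # 2
```

Any syntax switch performed by the first statement therefore comes too late to influence how the rest of the same module body was parsed.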

> Which one is a package author supposed to set?

Package authors are supposed to use the Project.toml version.

> What happens if they differ?

As above.

> And why even two different places where this can be set?

There are situations where there is no Project.toml (REPL, scripts, some testing scenarios), but you may want to switch the syntax version anyway for testing. I'm not expecting it to be super common though.

@tecosaur
Member

> I originally had Base.Experimental.@SYNTAX_VERSION, but it turns out the parser can only support 10-character keywords at the moment

How about SYNTAX_VER? That's exactly 10 characters.

@Keno
Member Author

Keno commented Nov 17, 2025

> How about SYNTAX_VER? That's exactly 10 characters.

Heh, I had that briefly also, but _ is disallowed in keywords also. Again, it's a bit of a dumb restriction, but I just wanted to get it to work for now.
