Skip to content

Conversation

@pieter-bos
Copy link

Hello! We'd like to contribute some work to get to a state where we can add refinement types to the prelude of Clash. I thought it would be good to start by improving the support for mentioning value-kind type variables in refinements; here phrased as a proposal so that we can discuss a plan.

It is still somewhat light on details, but broadly I'd suggest to continue adopting LHName over Symbol to represent names, in our case especially for local names. I'd also suggest adding an equivalent of GHC's Unique to LHName, to remove the need to make names unique prior to or along resolution.

Curious to hear what you think!

@ranjitjhala
Copy link
Member

Hi @pieter-bos — thanks! This sounds like an excellent idea to me. I also like the proposal: @facundominguez or @nikivazou do you see any issues?

A (much!) more “extreme” approach would be to have some version of fixpoints ExprV just for LH and make these different kinds of names different Var variants there. (This is what “flux” the LH for rust does which allows us to handle const generics there) but this proposal seems more pragmatic given the present state.

thanks!!!

@facundominguez
Copy link
Collaborator

facundominguez commented Sep 9, 2025

👋 Thanks for the proposal!

Introducing uniques looks plausible. I share some questions below.


The proposal mentions Constraint.Split as a place where things would improve, are there other places where you expect uniques to be necessary?


Considering alternatives, I could only think of the following. Maybe you already have a good argument to prefer uniques:
Instead of making names of local bindings unique, could the code in Constraint.Split be adapted to deal with bindings which use duplicated names?


It would be helpful to collect here the issues that would be helped with this proposal. Or you could create some, if they are not written yet. At the moment, I don't know what examples would be possible to verify after the proposal is implemented.

@pieter-bos
Copy link
Author

Thanks for the prompt feedback!

The proposal mentions Constraint.Split as a place where things would improve, are there other places where you expect uniques to be necessary?

On review I specifically meant consE from Constraint.Generate here, the case for application of a value-kinded type. My other thought was that it would probably be necessary to have type arguments like this as ANF values, which is required similarly by argType and argExpr - though I'm no longer sure it is strictly required to have unique names to transform them.

Instead of making names of local bindings unique, could the code in Constraint.Split be adapted to deal with bindings which use duplicated names?

Hmm, I see two obstacles (please correct me if I'm mistaken about anything):

  • I think it would involve adapting the use of F.subst1 at the site above to rename bindings from RAllT that shadow names mentioned in the type that is being spliced in.
  • I think that in SpecType the type variable is a GHC name in the end, which makes it hard to ensure it has a fresh text name. GHC names are already unique, but this does not help us, since refinements refer to it via a Symbol at the moment.

Actually I suppose then that collisions between tv and tv could be resolved by using LHName up to Constraint.Split without the need for a unique, since they would store a GHC Name, but I don't know how things would work out for collisions between e.g. type variables and dependent bindings.

It would be helpful to collect here the issues that would be helped with this proposal.

Will do!

@facundominguez
Copy link
Collaborator

facundominguez commented Sep 10, 2025

... rename bindings from RAllT that shadow names mentioned in the type that is being spliced in.
... but I don't know how things would work out for collisions between e.g. type variables and dependent bindings.

That looks like adopting the rapier approach. Section 2 of this paper [link] gives a short account of the difficulties. If it were only F.subst1 perhaps it may not be a lot of effort to get right.

In this place of the code, I had to determine the scope of the various bindings in RTypeV [link]. It may help define other functions that also need to understand the scope of bindings like F.subst1.

GHC names are already unique, but this does not help us, since refinements refer to it via a Symbol at the moment.

Maybe the transformation from GHC name to symbol could include the unique as a suffix. But I don't claim to know all the trouble that this will create.

Also, note that there is a place in the code that is hardcoding the UniqueId when creating GHC names.

symbolRTyCon :: F.Symbol -> RTyCon
symbolRTyCon n = RTyCon (stringTyCon 'x' 42 $ F.symbolString n) [] defaultTyConInfo

Might not affect bindings.

@pieter-bos
Copy link
Author

On applying the rapier: this looks like a nice and principled approach. I don't think it quite prevents the need to generate fresh GHC names (as long as they are also mentioned by symbols), as type variables can presumably also shadow dependent bindings. I think that's pretty unlikely given that those bindings are also renamed by refreshArgs I think (?), but it seems shaky to rely on that.

Maybe the transformation from GHC name to symbol could include the unique as a suffix. But I don't claim to know all the trouble that this will create.

Quite honestly an interesting idea but I don't have high hopes that I can get this right 😅


I think I still have the most confidence that I can do a Unique-based approach. Do you have objections if I start on that direction? Or do you think there is more to discuss :)

@facundominguez
Copy link
Collaborator

facundominguez commented Sep 30, 2025

Maybe the transformation from GHC name to symbol could include the unique as a suffix. But I don't claim to know all the trouble that this will create.

Quite honestly an interesting idea but I don't have high hopes that I can get this right

Can it be avoided though? The proposal is abstracting Symbol from the various data types. This looks to me like a good amount of work, but even if it can be completed, names still need to be disambiguated when putting names in string form for the SMT solver. Is it better to wait until this late to differentiate between names that only differ in their uniques?

@pieter-bos
Copy link
Author

pieter-bos commented Sep 30, 2025

That's true, but I hope that that is a final transformation after all the logic that needs names to be well-resolved is finished. I'm mostly nervous about having a stage where RType's mention type variables unqualified by uniques (forall t . {v:_|v == t}) and then later qualified by uniques (forall t . {v:_|v == t#12345), so you have to know what stage you're in to e.g. convert a wired-in name to a symbol - I would rather have the type system prevent me from making a mistake like that. There's no getting around generating unique ids (be they text or an extra record field), but I'd rather do it once and not think about that again.

Is it better to wait until this late to differentiate between names that only differ in their uniques?

To me: yes! It's nice to have a type that separates the pretty name and the identity of the name, where you can always glue the pretty name and identity (unique) together in the end.

@facundominguez
Copy link
Collaborator

having a stage where RType's mention type variables unqualified by uniques

When a type variable is represented with a symbol, I would consider always suffixing with the unique, so there are not stages where they are not suffixed. When initially parsed, the type variables are symbols (as BTyVar), so suffixing the uniques there would require fiddling with the parser.

One other feature to consider is the serialization and deserialization of lifted specs in serialiseLiquidLib. The uniques after deserializing need to agree with the uniques when deserializing other specs, and they must not clash with uniques produced for the current module.

I think I still have the most confidence that I can do a Unique-based approach. Do you have objections if I start on that direction? Or do you think there is more to discuss :)

I think it is ok to explore this path. I'm sure quite some details will surface that we are not anticipating now, so we might get back to discussing approaches later on.

@pieter-bos
Copy link
Author

When initially parsed, the type variables are symbols (as BTyVar), so suffixing the uniques there would require fiddling with the parser.

Right, I see. I had written off doing this immediately from the parser, as I suppose this would require some notion of name resolution to be embedded in the parser immediately.

One other feature to consider is the serialization and deserialization of lifted specs in serialiseLiquidLib.

Yes, good point. I think the most principled approach is to refuse to serialize types that have an embedded Unique, so that we have a strong guarantee after deserialization that we must generate or look up uniques from an appropriate source with respect to the already loaded modules. Then again, we re-create the problem in serialization of representing names with equal presentation but different identity. We could suffix the names with their associated unique in text to avoid suggesting we store the unique as a stable identifier, but it would probably make serialization non-idempotent and complicate looking up global names.

Given that I think it's probably more practical to just store the Unique after all, and carefully refresh them after deserialization.

I think it is ok to explore this path.

Okay! Practically I'll start then with binding generalization, so steps (1)-(3) from the proposal.

@facundominguez
Copy link
Collaborator

For reference, an issue discussing uniques and unique tags in GHC: https://gitlab.haskell.org/ghc/ghc/-/issues/26264

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants