Skip to content
Open
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
284 changes: 284 additions & 0 deletions text/3512-closure-move-bindings.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,284 @@
- Feature `closure_move_bindings`
- Start Date: 2023-10-09
- RFC PR: [rust-lang/rfcs#3512](https://github.com/rust-lang/rfcs/pull/3512)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Adds the syntax `move(bindings) |...| ...`
to explicitly specify how to capture bindings into a closure.

# Motivation
[motivation]: #motivation

Currently there are two ways to capture local bindings into a closure,
namely by reference (`|| foo`) and by moving (`move || foo`).
This mechanism has several ergonomic problems:

- It is not possible to move some bindings and reference the others.
To do so, one must define another binding that borrows the value
and move it into the closure:

```rs
{
let foo = &foo;
move || run(foo, bar)
}
```

- It is a very frequent scenario to clone a value into a closure
(especially common with `Rc`/`Arc`-based values),
but even the simplest scenario requires three lines of boilerplate:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing some implied boilerplate here, but is this just two lines?

Suggested change
but even the simplest scenario requires three lines of boilerplate:
but even the simplest scenario requires two lines of boilerplate:

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be just one line (and maybe even inlined in another expression). So that's 3.5 lines of boilerplate.


```rs
{
let foo = foo.clone();
move || foo.run()
}
```

This RFC proposes a more concise syntax to express these moving semantics.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

A closure may capture bindings in its defining scope.
Bindings are captured by reference by default:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not accurate.
Bindings are "captured by example", so if the closure needs a reference to the value it's captured by reference and if needs an owned value it's captured by move...except for Copy types which always capture by reference unless you move.

Issue with more details:
rust-lang/rust#100905

I think we should consider changing this behavior in a new edition and seeing if this helps without needing new syntax.
Personally for me every time I need the workarounds described in this RFC it was because of this.

Copy link
Author

@SOF3 SOF3 Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it sound reasonable and feasible to detect this as a breaking change and fix/warn in automated edition migration tools?


```rs
let mut foo = 1;
let mut closure = || { foo = 2; };
closure();
dbg!(foo); // foo is now 2
```

You can add a `move` keyword in front of the closure
to indicate that all captured bindings are moved into the closure
instead of referenced:

```rs
let mut foo = 1;
let mut closure = move || { foo = 2; };
closure();
dbg!(foo); // foo is still 1, but the copy of `foo` in `closure` is 2
```

Note that `foo` is _copied_ during move in this example
as `i32` implements `Copy`.

If a closure captures multiple bindings,
all the `move` keywoed makes them all captured by moving.
To move only specific bindings,
list them in parentheses after `move`:

```rs
let foo = 1;
let mut bar = 2;
let mut closure = move(mut foo) || {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that GitHub renders move() with the color of a function call instead of as a keyword. This adds to the previously discussed problem of ambiguity as a function call.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move is already a keyword and cannot be the identifier of a function, so move( unambiguously starts a captures list.

foo += 10;
bar += 10;
};
closure();
dbg!(foo, bar); // foo = 1, bar = 12
```

Note that the outer `foo` no longer requires `mut`;
it is relocated to the closure since it defines a new binding.

Moved bindings may also be renamed:

```rs
let mut foo = 1;
let mut closure = move(mut bar = foo) || {
foo = 2;
bar = 3;
};
closure();
dbg!(foo); // the outer `foo` is 2 as it was captured by reference
```

Bindings may be transformed when moved:

```rs
let foo = vec![1];
let mut closure = move(mut foo = foo.clone()) || {
foo.push(2);
};
closure();
dbg!(foo); // the outer `foo` is still [1] because only the cloned copy was mutated
```

The above may be simplified to `move(mut foo.clone())` as well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be left as a future extension?

  1. It is just syntactic sugar, having the same effect as move(mut foo = foo.clone()).
  2. It's a bit weird that only a subset of expressions are allowed -- namely, those where the first token is an identifier.
  3. It should possibly be extended so that move(mut self.foo.clone()) yields move(mut foo = self.foo.clone()).
  4. There's no precedent for introducing such bindings. In particular, arguably, it should work similarly in constructors: Self { foo.clone() } should be short-hand for Self { foo: foo.clone() } for consistency.

I think it would be easier to first focus on the idea of explicit capture -- and whether it's worth it -- and defer the special short-hand notation for later, and thus that it would make sense to move this to a possible future extension.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this should be left for future, if added at all.

By looking at move(mut foo.clone()), it is unclear what it should bind to. In this situation, foo is the only ident but in move(mut foo.bar.clone()), is it supposed to be bound to bar or foo? That sort of ambiguity isn't a good thing to add.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's why it is strictly limited to a single identify in the method call receiver.

This simplification is only allowed
when the transformation expression is a method call on the captured binding.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

A closure expression has the following syntax:

> **<sup>Syntax</sup>**\
> _ClosureExpression_ :\
> &nbsp;&nbsp; ( `move` _MoveBindings_<sup>?</sup> )<sup>?</sup>\
> &nbsp;&nbsp; ( `||` | `|` _ClosureParameters_<sup>?</sup> `|` )\
> &nbsp;&nbsp; (_Expression_ | `->` _TypeNoBounds_&nbsp;_BlockExpression_)>\
> _MoveBindings_ :\
> &nbsp;&nbsp; `(` ( _MoveBinding_ (`,` _MoveBinding_)<sup>\*</sup> `,`<sup>?</sup> )<sup>?</sup> `)`\
> _MoveBinding_ :\
> &nbsp;&nbsp; _NamedMoveBinding_ | _UnnamedMoveBinding_\
> _NamedMoveBinding_ :\
> &nbsp;&nbsp; _PatternNoTopAlt_ `=` _Expression_\
> _UnnamedMoveBinding_ :\
> &nbsp;&nbsp; `mut`<sup>?</sup> ( _IdentifierExpression_ | _MethodCallExpression_ )\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires infinite lookahead for parsing if I am not mistaken, a method call expression can begin with any expression which we cannot differentiate from a pattern without attempting to parse the whole construct first. This is I believe the reason as to why assignment destructuring (pat = expr) doesn't parse full patterns either, it accepts a subset of expressions on the lhs and interprets those are patterns.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As specified below, the MethodCallExpression always starts with an identifier, so this is not a parsing issue right now. But it is indeed problematic if we want to relieve this restriction in the future.

Copy link
Author

@SOF3 SOF3 Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mental model would be to parse it like this:

if peek("mut") { parse("mut") }
if peek(Identifier) && !peek("=") {
    UnnamedMoveBInding { parse(Expr)}
} else {
     NamedMoveBinding { parse(Pattern), parse("="), parse(Expr) }
}

Copy link

@danielhenrymantilla danielhenrymantilla Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be quite valuable to also have move(&x, &mut y) syntax, for the specific cases of explicit by-reference captures:

  • not only for my suggested implicit-by-move change, should it be accepted;
  • but even in the case of the current implict-by-ref-captures semantics of the RFC, when wanting to be explicit, it would be nicer to be using move(&x, &mut y) than move(x = &x, y = &mut y), especially when x and y are longer_variable_names.
  • there would otherwise be no way to mimic, explicitly, the "magic binding" semantics which implicit by-ref closures captures currently have (e.g., when having x: i32, an implicit by-ref capture of x makes it so the closure has captured &x, but the x name is still fully usable to refer to the *&x place).

If we didn't have this sugar, I could very well see myself having a .by_ref() and .by_mut() couple of blanket-implemented extension methods.

  • whilst these methods could be a bearable alternative, I don't think it's worth the hassle w.r.t. adding another 1-lookahead case for these parsing rules:
    $(mut)? $binding:ident $(. $method…)?
OR
    &$(mut)? $binding:ident
OR
    $binding:pat = $e:expr

> _ClosureParameters_ :\
> &nbsp;&nbsp; _ClosureParam_ (`,` _ClosureParam_)<sup>\*</sup> `,`<sup>?</sup>\
> _ClosureParam_ :\
> &nbsp;&nbsp; _OuterAttribute_<sup>\*</sup> _PatternNoTopAlt_&nbsp;( `:` _Type_ )<sup>?</sup>

Closure expressions are clsasified into two main types,
namely _ImplicitReference_ and _ImplicitMove_.
A closure expression is _ImplicitMove_ IF AND ONLY IF
it starts with a `move` token immediately followed by a `|` token,
without any parentheses in between.

## _ImplicitReference_ closures

When the parentheses for _MoveBindings_ is present, or when the `move` keyword is absent,
the closure expression is of the _ImplicitReference_ type, where
all local variables in the closure construction scope not shadowed by any _MoveBinding_
are implicitly captured into the closure by shared or mutable reference on demand,
preferring shared reference if possible.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to ask for a special sigil for this "and other implicit captures" behavior:

move(x, y, ..) || {}

Indeed, by requiring a trailing .. in order to allow implicit captures:

  • such implicit capture is now clearer: the x, y enumeration is now visibly non-exhaustive;
  • it gives the option not to provide , .., thereby requiring exhaustive enumeration of captures: this gives the power to the person defining the closure to restrict the kind of captures a closure may use.
    • Given that the motivation for this syntax is finer-grained control, it would be a pity not to be featuring such a capability, and have to wait for a later edition-requiring amendment to get into this behavior.
    • reasons for restricting external captures can be:
      • soundness of more advanced designs (e.g., macros and advanced lifetime shenanigans);
      • especially in FFI or other type-erased scenarios where certain lifetimes may have been lost;
      • (or when worried about things like {A,}Rc cycles in case of implicit by-move captures)
      • better control of a closure size;
      • helps teachability of how closures work: move(x) |…| { … } would always be an x-capturing closure, no matter the body of ..., thereby properly decoupling implementation details from API surface (of the closure type).
  • it mirror struct pattern destructuring, wherein a Foo { x, .. } pattern is not expected to be exhaustive over the fields it has, whereas Foo { x } is.

Copy link

@ranile ranile Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, by requiring a trailing .. in order to allow implicit captures:

Do you mean implicit captures by-value? or keeping the current default capture behavior (without move)? Foo { x, .. } means all fields except x are ignored so it would make sense if all fields other than x were ignored in move(x, ..) when doing move captures

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for this

If .. also allowed to specify a default capture mode it would be a killer feature for UI development and other similar cases where a lot of variables are captured by cloning.

Foo { x, .. } means all fields except x are ignored

IMO .. just means "the rest" when used as a pattern, as it always matches all the elements that weren't already in some other way. It just so happens that with structs/tuples it doesn't allow to create bindings for the matches elements, though with slices this is possible (e.g. [head, tail @ ..]).


Each _MoveBinding_ declares binding(s) in its left-side pattern,
assigned with the value of the right-side expression evaluated during closure construction,
thus referencing any relevant local variables if necessary.

If the left-side pattern is omitted (_UnnamedMoveBinding_),
the expression must be either a single-segment (identifier) `PathExpression`
or a _MethodCallExpression_,
the receiver expression of which must be a single identifier variable,
and the argument list must not reference any local variables.
The left-side pattern is then automatically inferred to be a simple _IdentifierPattern_
using the identifier/receiver as the new binding.

### Mutable bindings

If a captured binding mutated inside the closure is declared in a _NamedMoveBinding_,
the `IdentifierPattern` that declares the binding must have the `mut` keyword.

If it is declared in an _UnnamedMoveBinding_,
the `mut` keyword must be added in front of the expression;
since the declared binding is always the first token in the expression,
the `mut` token is always immediately followed by the mutable binding,
thus yielding consistent readability.

If it is implicitly captured from the parent scope
instead of declared in a _MoveBinding_,
the local variable declaration must be declared `mut` too.

## _ImplicitMove_ closures

When the `move` keyword is present but _MoveBindings_ is absent (with its parentheses absent as well),
the closure expression is of the _ImplicitMove_ type, where
all local variables in the closure construction scope
are implicitly moved or copied into the closure on demand.

Note that `move` with an empty pair of parentheses is allowed and follows the former rule;
in other words, `move() |...| {...}` and `|...| {...}` are semantically equivalent.
Copy link

@danielhenrymantilla danielhenrymantilla Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ughh, I have to admit to finding this rather counter-intuitive:
I'd have expected move() || to be closer to move || than to ||

This allows macros to emit repeating groups of `_MoveBinding_ ","` inside a pair of parentheses
and achieve correct semantics when there are zero repeating groups.

If a moved binding is mutated inside the closure,
its declaration in the parent scope must be declared `mut` too.

# Drawbacks
[drawbacks]: #drawbacks

Due to backwards compatibility, this RFC proposes a new syntax
that is an extension of capture-by-move
but actually looks more similar to capture-by-reference,
thus confusing new users.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

Capture-by-reference is the default behavior for implicit captures for two reasons:
Copy link

@danielhenrymantilla danielhenrymantilla Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to suggest the opposite, here, actually: for move(..) (or move() if the suggestion for .. is rejected) to mean the same as move.

  • big consistency gain: move(..) || being closer to move || than to ||;

  • make by-reference captures be the odd one that stands out. I agree this may be subjective, but bear with me:

    There are two kind of closures:

    • immediately-called closures, such as with Option::map() or panic::catch_unwind();
    • long-lived closures (: 'long), such as the spawn() family.

    The former, 99% of the time, accomodates to fully implicit || {} capture rules, and the latter, kind of accomodate to move || closures, except for when specific "right-before move adjustments" are needed, which is very often just a .clone() (and sometimes an Arc::downgrade()).

    This very RFC, by providing explicit captures, is thus mostly helping the latter case: the one with : 'long-lived requirements. In such a scenario, capturing by reference is rather non-desired, or something niche stemming from a having long-lived borrowable available (e.g., borrowing a variable but because it would happen to outlive the 'env of a scope::spawn())). It thus does seem rather legitimate to favor by-move captures for the ..-implicit ones, much like move || does.

  • lastly, although I agree this one may not be that important to others, I feel like it would be the most helpful syntax w.r.t. teachability: with the current semantics of implicit by-ref captures for move(..), it would mean having borrows by using the move keyword, which kind of denaturates/dilutes the meaning of that word. Whereas in the case of move(..) implicitly capturing by move, the explicit move(&x) would be justified insofar the word move would have been itself neutered/overriden by the explicit & sigil.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, rather than adding move(..) (which could just be written move) or move() (which could just be omitted), we should have a warn-by-default lint for an empty capture list (with a suggestion that removes it and a note that mentions using move if you want to capture everything), and add it to the list of lints suppressed in macros (since macros might want to generate a capture list).


1. It is more consistent to have `move(x)` imply `move(x=x)`,
which leaves us with implicit references for the unspecified.
Comment on lines +205 to +206
Copy link

@danielhenrymantilla danielhenrymantilla Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that writing move(&some_var) is not that much longer than move(some_var), so this argument may be a bit weak (to clarify, in a vacuum / ceteris paribus, I'm on board with this argument; but in this case, it may not pull its weight compared to the advantages of by-move implicit captures).

2. Move bindings actually define a new shadowing binding
that is completely independent of the original binding,
so it is more correct to have the new binding explicitly named.
Consider how unintuitive it is to require that
a moved variable be declared `mut` in the outer scope
even though it is only mutated inside the closure (as the new binding).
Comment on lines +207 to +212
Copy link

@danielhenrymantilla danielhenrymantilla Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this RFC lands, I could imagine there being a lint against implicitly by-move-captured mut bindings, be it through move ||, or through move(..) || 🙂 (warn-by-default for the latter, and the former requiring the edition dance)


The possible syntax for automatically-inferred _MoveBinding_ pattern
is strictly limited to allow maximum future compatibility.
Currently, many cases of captured bindings are in the form of
`foo = foo`, `foo = &foo` or `foo.clone()`.
This RFC intends to solve the ergonomic issues for these common scenarios first
and leave more room for future enhancement when other frequent patterns are identified.

Alternative approaches previously proposed
include explicitly adding support for the `clone` keyword.
This RFC does not favor such suggestions
as they make the fundamental closure expression syntax
unnecessarily dependent on the `clone` language item,
and does not offer possibilities for alternative transformers.

# Prior art
[prior-art]: #prior-art

## Other languages

Closure expressions (with the ability to capture) are known to many languages,
varying between explicit and implicit capturing.
Nevertheless, most such languages do not support capturing by reference.
Examples of languages that support capture-by-reference include
C++ lambdas (`[x=f(y)]`) and PHP (`use(&$x)`).
Of these, C++ uses a leading `&`/`=` in the capture list
to indicate the default behavior as move or reference,
and allows an initializer behind a variable:

```cpp
int foo = 1;
auto closure = [foo = foo+1]() mutable {
foo += 10; // does not mutate ::foo
return foo;
}
closure(); // 12
closure(); // 22
```

This RFC additionally proposes the ability to omit the capture identifier,
because use cases of `foo.clone()` are much more common in Rust,
compared to C++ where most values may be implicitly cloned.

## Rust libraries

Attempts to improve ergonomics for cloning into closures were seen in proc macros:

- [enclose](https://crates.io/crates/enclose)
- [clown](https://crates.io/crates/clown)
- [closet](https://crates.io/crates/closet)
- [capture](https://crates.io/crates/capture)

# Unresolved questions
[unresolved-questions]: #unresolved-questions

- This RFC actually solves two not necessarily related problems together,
namely clone-into-closures and selective capture-by-move.
It might be more appropriate to split the former to a separate RFC,
but they are currently put together such that
consideration for the new syntax includes possibility for both enhancements.

# Future possibilities
[future-possibilities]: #future-possibilities

- Should we consider deprecating the _ImplicitMove_ syntax
in favor of explicitly specifying what gets moved,
especially for mutable variables,
considering that moved variables actually create a new, shadowing binding?
- The set of allowed expressions may be extended in the future.