| 1 | +# Generics and substitutions |
| 2 | + |
| 3 | +Given a generic type `MyType<A, B, …>`, we may want to swap out the generics `A, B, …` for some |
| 4 | +other types (possibly other generics or concrete types). We do this a lot while doing type |
| 5 | +inference, type checking, and trait solving. Conceptually, during these routines, we may find out |
| 6 | +that one type is equal to another type and want to swap one out for the other and then swap that out |
| 7 | +for another type and so on until we eventually get some concrete types (or an error). |
| 8 | + |
| 9 | +In rustc this is done using the `SubstsRef` that we mentioned above (“substs” = “substitutions”). |
| 10 | +Conceptually, you can think of `SubstsRef` as a list of types that are to be substituted for the |
| 11 | +generic type parameters of the ADT. |
| 12 | + |
| 13 | +`SubstsRef` is a type alias of `List<GenericArg<'tcx>>` (see [`List` rustdocs][list]). |
| 14 | +[`GenericArg`] is essentially a space-efficient wrapper around [`GenericArgKind`], which is an enum |
| 15 | +indicating what kind of generic the type parameter is (type, lifetime, or const). Thus, `SubstsRef` |
| 16 | +is conceptually like a `&'tcx [GenericArgKind<'tcx>]` slice (but it is actually a `List`). |
| 17 | + |
| 18 | +[list]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/struct.List.html |
| 19 | +[`GenericArg`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/subst/struct.GenericArg.html |
| 20 | +[`GenericArgKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/subst/enum.GenericArgKind.html |
| 21 | + |
| 22 | +So why do we use this `List` type instead of making it really a slice? It has the length "inline", |
| 23 | +so `&List` is a thin pointer (one machine word, rather than a pointer plus a length). As a |
| 24 | +consequence, it cannot be "subsliced" (that only works if the length is out of line). |
| 25 | + |
| 26 | +This also implies that you can check two `List`s for equality via `==` as a pointer comparison |
| 27 | +(which would not be valid for ordinary slices). This is precisely because they never represent a "sub-list", only the |
| 28 | +complete `List`, which has been hashed and interned. |
| 29 | + |
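The link between interning and cheap equality can be sketched with a toy interner. This is only an illustration under stated assumptions: `Interner` and `same_list` are hypothetical names, the lists hold plain `u32`s, and ordinary `&'static [u32]` slices stand in for rustc's arena-allocated `List`, so the references here are still fat pointers.

```rust
use std::collections::HashSet;
use std::ptr;

/// Toy interner: hands out one shared reference per distinct list.
/// (Hypothetical; rustc's real interner is arena-based and stores `List<T>`.)
#[derive(Default)]
pub struct Interner {
    seen: HashSet<&'static [u32]>,
}

impl Interner {
    /// Return the canonical copy of `items`, allocating it on first sight.
    pub fn intern(&mut self, items: &[u32]) -> &'static [u32] {
        if let Some(&existing) = self.seen.get(items) {
            return existing;
        }
        // Leak the allocation so the reference stays valid for the whole program.
        let leaked: &'static [u32] = Box::leak(items.to_vec().into_boxed_slice());
        self.seen.insert(leaked);
        leaked
    }
}

/// Because each distinct list is stored exactly once, two interned lists are
/// equal exactly when they are the same allocation.
pub fn same_list(a: &'static [u32], b: &'static [u32]) -> bool {
    ptr::eq(a, b)
}
```

Interning `[1, 2, 3]` twice yields the same reference both times, so `same_list` never has to look at the elements. A sub-slice of an interned list would break this property, which is why a complete, interned list is the only thing ever handed out.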
| 30 | +So pulling it all together, let’s go back to our example above: |
| 31 | + |
| 32 | +```rust,ignore |
| 33 | +struct MyStruct<T> |
| 34 | +``` |
| 35 | + |
| 36 | +- There would be an `AdtDef` (and corresponding `DefId`) for `MyStruct`. |
| 37 | +- There would be a `TyKind::Param` (and corresponding `DefId`) for `T` (more later). |
| 38 | +- There would be a `SubstsRef` containing the list `[GenericArgKind::Type(Ty(T))]` |
| 39 | + - The `Ty(T)` here is my shorthand for an entire other `ty::Ty` that has `TyKind::Param`, which we |
| 40 | + mentioned in the previous point. |
| 41 | +- There would be one `TyKind::Adt` containing the `AdtDef` of `MyStruct` with the `SubstsRef` above. |
| 42 | + |
| 43 | +Finally, we will quickly mention the |
| 44 | +[`Generics`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/struct.Generics.html) type. It |
| 45 | +is used to give information about the type parameters of a type. |
| 46 | + |
| 47 | +### Unsubstituted Generics |
| 48 | + |
| 49 | +Recall that in our example above, the `MyStruct` struct had a generic type `T`. When we are (for |
| 50 | +example) type checking functions that use `MyStruct`, we will need to be able to refer to this type |
| 51 | +`T` without actually knowing what it is. In general, this is true inside all generic definitions: we |
| 52 | +need to be able to work with unknown types. This is done via `TyKind::Param` (which we mentioned in |
| 53 | +the example above). |
| 54 | + |
| 55 | +Each `TyKind::Param` contains two things: the name and the index. In general, the index fully |
| 56 | +defines the parameter and is used by most of the code. The name is included for debug print-outs. |
| 57 | +There are two reasons for this. First, the index is convenient: it allows you to index into the |
| 58 | +list of generic arguments when substituting. Second, the index is more robust. For example, you |
| 59 | +could in principle have two distinct type parameters that use the same name, e.g. `impl<A> Foo<A> { |
| 60 | +fn bar<A>() { .. } }`, although the rules against shadowing make this difficult (but those language |
| 61 | +rules could change in the future). |
| 62 | + |
| 63 | +The index of the type parameter is an integer indicating its order in the list of the type |
| 64 | +parameters. Moreover, we consider the list to include all of the type parameters from outer scopes. |
| 65 | +Consider the following example: |
| 66 | + |
| 67 | +```rust,ignore |
| 68 | +struct Foo<A, B> { |
| 69 | + // A would have index 0 |
| 70 | + // B would have index 1 |
| 71 | + |
| 72 | + .. // some fields |
| 73 | +} |
| 74 | +impl<X, Y> Foo<X, Y> { |
| 75 | + fn method<Z>() { |
| 76 | + // inside here, X, Y and Z are all in scope |
| 77 | + // X has index 0 |
| 78 | + // Y has index 1 |
| 79 | + // Z has index 2 |
| 80 | + } |
| 81 | +} |
| 82 | +``` |
| 83 | + |
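The numbering rule above, where parameters from outer scopes come first, can be sketched as follows. `ToyGenerics` is a hypothetical, heavily simplified stand-in for the information the `Generics` type carries; it tracks only type parameter names and ignores everything else the compiler records.

```rust
/// Hypothetical, simplified model of a generics scope.
pub struct ToyGenerics<'a> {
    /// The enclosing scope, if any (e.g. the `impl` around a method).
    pub parent: Option<&'a ToyGenerics<'a>>,
    /// Type parameters declared directly on this item.
    pub own_params: Vec<&'static str>,
}

impl<'a> ToyGenerics<'a> {
    /// Number of parameters contributed by all enclosing scopes.
    pub fn parent_count(&self) -> usize {
        self.parent
            .map_or(0, |p| p.parent_count() + p.own_params.len())
    }

    /// Index of `name`, counting parameters from outer scopes first.
    /// The innermost declaration wins if a name appears more than once.
    pub fn index_of(&self, name: &str) -> Option<usize> {
        if let Some(pos) = self.own_params.iter().position(|p| *p == name) {
            return Some(self.parent_count() + pos);
        }
        self.parent.and_then(|p| p.index_of(name))
    }
}
```

Modelling `impl<X, Y> Foo<X, Y> { fn method<Z>() }` as a method scope whose parent is the `impl` scope gives `X`, `Y` and `Z` the indices 0, 1 and 2, matching the comments in the example.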
| 84 | +When we are working inside the generic definition, we will use `TyKind::Param` just like any other |
| 85 | +`TyKind`; it is just a type after all. However, if we want to use the generic type somewhere, then |
| 86 | +we will need to do substitutions. |
| 87 | + |
| 88 | +For example, suppose that the `Foo<A, B>` type from the previous example has a field that is a |
| 89 | +`Vec<A>`. Observe that `Vec` is also a generic type. We want to tell the compiler that the type |
| 90 | +parameter of `Vec` should be replaced with the `A` type parameter of `Foo<A, B>`. We do that with |
| 91 | +substitutions: |
| 92 | + |
| 93 | +```rust,ignore |
| 94 | +struct Foo<A, B> { // Adt(Foo, &[Param(0), Param(1)]) |
| 95 | + x: Vec<A>, // Adt(Vec, &[Param(0)]) |
| 96 | + .. |
| 97 | +} |
| 98 | + |
| 99 | +fn bar(foo: Foo<u32, f32>) { // Adt(Foo, &[u32, f32]) |
| 100 | + let y = foo.x; // Vec<Param(0)> => Vec<u32> |
| 101 | +} |
| 102 | +``` |
| 103 | + |
| 104 | +This example has a few different substitutions: |
| 105 | + |
| 106 | +- In the definition of `Foo`, in the type of the field `x`, we replace `Vec`'s type parameter with |
| 107 | + `Param(0)`, the first parameter of `Foo<A, B>`, so that the type of `x` is `Vec<A>`. |
| 108 | +- In the function `bar`, we specify that we want a `Foo<u32, f32>`. This means that we will |
| 109 | + substitute `u32` and `f32` for `Param(0)` and `Param(1)`. |
| 110 | +- In the body of `bar`, we access `foo.x`, which has type `Vec<Param(0)>`, but `Param(0)` has been |
| 111 | + replaced with `u32`, so `foo.x` has type `Vec<u32>`. |
| 112 | + |
| 113 | +Let’s look a bit more closely at that last substitution to see why we use indexes. If we want to |
| 114 | +find the type of `foo.x`, we can get the generic type of `x`, which is `Vec<Param(0)>`. Now we can take |
| 115 | +the index `0` and use it to find the right type substitution: looking at `Foo`'s `SubstsRef`, we |
| 116 | +have the list `[u32, f32]`. Since we want to replace index `0`, we take the 0-th element of this |
| 117 | +list, which is `u32`. Voila! |
| 118 | + |
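The index lookup just described can be sketched as a small recursive function. The `Ty` enum and `subst` function below are a hypothetical toy model, far simpler than `ty::TyKind` and the compiler's real substitution machinery; they only show how an index in `Param` selects an entry of the substitution list.

```rust
/// Toy type representation (hypothetical; not rustc's `TyKind`).
#[derive(Clone, Debug, PartialEq)]
pub enum Ty {
    U32,
    F32,
    /// A generic parameter, identified only by its index.
    Param(usize),
    /// A generic ADT applied to a list of arguments.
    Adt(&'static str, Vec<Ty>),
}

/// Replace every `Param(i)` inside `ty` with `substs[i]`.
/// Panics on an out-of-range index, which would be a bug in the caller.
pub fn subst(ty: &Ty, substs: &[Ty]) -> Ty {
    match ty {
        Ty::Param(i) => substs[*i].clone(),
        Ty::Adt(name, args) => {
            // Substitute recursively inside the ADT's own arguments.
            Ty::Adt(*name, args.iter().map(|a| subst(a, substs)).collect())
        }
        Ty::U32 | Ty::F32 => ty.clone(),
    }
}
```

With the field type `Adt("Vec", [Param(0)])` and the substitution list `[U32, F32]` taken from `Foo<u32, f32>`, `subst` produces `Adt("Vec", [U32])`, which is the toy equivalent of `Vec<u32>`.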
| 119 | +You may have a couple of followup questions… |
| 120 | + |
| 121 | +**`type_of`** How do we get the “generic type of `x`”? You can get the type of pretty much anything |
| 122 | +with the `tcx.type_of(def_id)` query. In this case, we would pass the `DefId` of the field `x`. |
| 123 | +The `type_of` query always returns the definition with the generics that are in scope of the |
| 124 | +definition. For example, `tcx.type_of(def_id_of_foo)` would return the “self-view” of |
| 125 | +`Foo`: `Adt(Foo, &[Param(0), Param(1)])`. |
| 126 | + |
| 127 | +**`subst`** How do we actually do the substitutions? There is a function for that too! You use |
| 128 | +[`subst`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/subst/trait.Subst.html) to |
| 129 | +replace the generic parameters in a type with the entries of a `SubstsRef`. |
| 130 | + |
| 131 | +[Here is an example of actually using `subst` in the compiler][substex]. The exact details are not |
| 132 | +too important, but in this piece of code, we happen to be converting from the `rustc_hir::Ty` to |
| 133 | +a real `ty::Ty`. You can see that we first get some substitutions (`substs`). Then we call |
| 134 | +`type_of` to get a type and call `ty.subst(substs)` to get a new version of `ty` with |
| 135 | +the substitutions made. |
| 136 | + |
| 137 | +[substex]: https://github.com/rust-lang/rust/blob/597f432489f12a3f33419daa039ccef11a12c4fd/src/librustc_typeck/astconv.rs#L942-L953 |
| 138 | + |
| 139 | +**Note on indices:** It is possible for the indices in `Param` to not match with what we expect. For |
| 140 | +example, the index could be out of bounds or it could be the index of a lifetime when we were |
| 141 | +expecting a type. These sorts of errors would be caught earlier in the compiler when translating |
| 142 | +from a `rustc_hir::Ty` to a `ty::Ty`. If they occur later, that is a compiler bug. |
| 143 | + |
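The two failure modes in the note on indices can be made concrete with a toy lookup. Everything here is an assumption for illustration: `ToyArg`, `ParamError` and `type_for_param` are hypothetical names, and returning a `Result` is a choice made for the sketch, whereas the text above says the compiler treats such a mismatch as a bug.

```rust
/// Toy counterpart of the three kinds of generic argument.
#[derive(Debug, PartialEq)]
pub enum ToyArg {
    Lifetime(&'static str),
    Type(&'static str),
    Const(u64),
}

/// The two ways an index can fail to name a type.
#[derive(Debug, PartialEq)]
pub enum ParamError {
    OutOfBounds { index: usize, len: usize },
    NotAType { index: usize },
}

/// Look up the type that a parameter with the given index refers to.
pub fn type_for_param(substs: &[ToyArg], index: usize) -> Result<&'static str, ParamError> {
    match substs.get(index) {
        Some(ToyArg::Type(name)) => Ok(*name),
        // The slot exists but holds a lifetime or a const.
        Some(_) => Err(ParamError::NotAType { index }),
        None => Err(ParamError::OutOfBounds { index, len: substs.len() }),
    }
}
```

For a list holding a lifetime followed by a type, index 1 resolves to the type, index 0 reports `NotAType`, and index 2 reports `OutOfBounds`.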
| 144 | + |