Exploring general solutions for supporting arbitrary T in Vec<T>, CxxVector<T>, etc. #1538

@anforowicz

Context

cxx currently supports the following generic types: Vec<T>, Box<T>, CxxVector<T>, UniquePtr<T>, SharedPtr<T>, WeakPtr<T>. These types currently only support a T that can be described with a TypeIdent. There are already a number of individual issues that ask for support for other kinds of T, but I think it will be useful to have a separate issue for discussing a general solution (if one is possible). This will hopefully help with these individual issues:

(there are also a few issues that seem like duplicates to me - e.g. #1210 or #543)

Past work

Workarounds:

  • Manual monomorphization as proposed in Vec<Vec<T>> #671: instead of writing Vec<Vec<Foo>> we can define VecOfFoo in the “shared” section and write Vec<VecOfFoo>
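A minimal sketch of this workaround, written as plain Rust so that it compiles without the cxx crate. The names Foo, VecOfFoo, wrap_rows and unwrap_rows are illustrative and do not come from #671; in a real project the two structs would presumably be declared as shared structs inside the #[cxx::bridge] module.

```rust
// Assumption: in a real project these two structs would be declared as shared
// structs inside a `#[cxx::bridge] mod ffi { ... }` block. They are plain Rust
// here so that the sketch compiles on its own.

/// Illustrative element type.
#[derive(Debug, Clone, PartialEq)]
pub struct Foo {
    pub id: u32,
}

/// Hand-written "monomorphization" of `Vec<Foo>`: a named wrapper that can be
/// used as the element type of an outer `Vec`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct VecOfFoo {
    pub items: Vec<Foo>,
}

/// Converts the natural nested shape into the bridge-friendly shape.
pub fn wrap_rows(rows: Vec<Vec<Foo>>) -> Vec<VecOfFoo> {
    rows.into_iter().map(|items| VecOfFoo { items }).collect()
}

/// Converts the bridge-friendly shape back into nested vectors.
pub fn unwrap_rows(rows: Vec<VecOfFoo>) -> Vec<Vec<Foo>> {
    rows.into_iter().map(|row| row.items).collect()
}
```

The cost of the workaround is visible here: every nested combination needs its own hand-written wrapper and conversion code.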

T-specific solutions:

  • CxxVector<*mut Thing> #795 (and Support for CxxVector<*mut T> and CxxVector<*const T> #1375) propose solutions that may help with CxxVector<*mut T> and CxxVector<*const T>, but that may not generalize to the other issues listed above (generic types other than CxxVector and/or generic type arguments other than pointers). The proposed solutions:
    • introduce CxxVectorPayloadImplKey::Ptr and CxxVectorPayloadImplKey::Named (the latter preserving the old/current handling of named type arguments) in syntax/instantiate.rs
    • work around orphan-rule trouble by introducing trait ConstPtrVectorElement in src/cxx_vector.rs

My recent work

I have a set of commits (see the branch here, or a more permanent link to the commit range here) that implements support for Vec<Box<T>> in a way that I hope may generalize to other Ts and other generic types.

I think that this can be polished, divided into smaller PRs, and then landed (hopefully starting soon, but possibly finishing in September 2025 after the summer vacation):

Landing this work would lay the foundation for addressing some of the other issues listed above:

  • I am not aware of any immediate issues that would block supporting Box<T> in other generic types (e.g. CxxVector<Box<T>>)
  • In the section below, I try to lay out options for solving the “orphan rule problem” for other Ts.

Orphan rule problem

C++ generic bindings vs orphan rule

cxx currently uses VectorElement, UniquePtrTarget, and similar traits to dispatch the implementation of CxxVector (or UniquePtr etc.) methods to the right template instantiations in the generated code.
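A simplified sketch of what trait-based dispatch looks like. ElementThunks, OpaqueVector and describe are hypothetical names made up for this illustration; the real VectorElement trait in cxx has a different set of methods, and the real thunks call into generated C++ rather than into a Rust Vec.

```rust
/// Hypothetical stand-in for a trait like `VectorElement`: one impl per
/// element type, each forwarding to the thunks generated for that type.
pub trait ElementThunks: Sized {
    /// Name of the C++ element type this impl was generated for.
    fn cpp_type_name() -> &'static str;
    /// Stand-in for a generated thunk such as `std::vector<T>::size`.
    fn vector_len(v: &OpaqueVector<Self>) -> usize;
}

/// Stand-in for `CxxVector<T>`. Real C++ storage is replaced by a Rust `Vec`
/// so that the sketch is self-contained.
pub struct OpaqueVector<T> {
    items: Vec<T>,
}

impl<T> OpaqueVector<T> {
    pub fn from_items(items: Vec<T>) -> Self {
        Self { items }
    }
}

impl<T: ElementThunks> OpaqueVector<T> {
    /// The generic method picks the right thunk at compile time via the trait.
    pub fn len(&self) -> usize {
        T::vector_len(self)
    }
}

// In this sketch the trait is local, so these impls are always allowed. In cxx
// the trait lives in the `cxx` crate and the impls are generated in the user's
// crate, which is where the orphan rule comes into play.
impl ElementThunks for i32 {
    fn cpp_type_name() -> &'static str {
        "std::int32_t"
    }
    fn vector_len(v: &OpaqueVector<Self>) -> usize {
        v.items.len()
    }
}

impl ElementThunks for f64 {
    fn cpp_type_name() -> &'static str {
        "double"
    }
    fn vector_len(v: &OpaqueVector<Self>) -> usize {
        v.items.len()
    }
}

/// A function that is generic over every supported element type.
pub fn describe<T: ElementThunks>(v: &OpaqueVector<T>) -> String {
    format!("std::vector<{}> with {} element(s)", T::cpp_type_name(), v.len())
}
```

The trait bound is what lets a function like describe accept any supported instantiation while rejecting unsupported element types at compile time.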

This makes implementing CxxVector<*const Foo> problematic (as outlined in #795): Rust doesn’t allow impl VectorElement for *const Foo, because it treats *const Foo as a foreign type. For more details, see the description of the orphan rules in the Rust reference.

Orphan rule special-cases Box subjects

The orphan rule prevents implementing a foreign trait for a foreign type, but Box<LocalType> is treated as a local type - see https://doc.rust-lang.org/reference/items/implementations.html#r-items.impl.trait.fundamental. This is why Vec<Box<T>> seems to work okay in my commits above, letting us make some progress.
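The difference between the two cases can be reproduced in a single file. In this sketch std::str::FromStr stands in for a foreign trait such as VectorElement (an assumption made only so that the example needs no external crate), and Foo stands in for a type declared in a #[cxx::bridge].

```rust
use std::str::FromStr;

/// Type defined in the current crate, playing the role of a bridge-declared type.
#[derive(Debug, PartialEq)]
pub struct Foo(pub u32);

// Rejected by the orphan rule if uncommented: `FromStr` is a foreign trait and
// `*const Foo` does not count as a local type, even though `Foo` itself is local.
//
// impl FromStr for *const Foo {
//     type Err = std::num::ParseIntError;
//     fn from_str(_s: &str) -> Result<Self, Self::Err> {
//         unimplemented!()
//     }
// }

// Accepted: `Box` is a fundamental type constructor, so `Box<Foo>` is
// considered local whenever `Foo` is local.
impl FromStr for Box<Foo> {
    type Err = std::num::ParseIntError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(Box::new(Foo(s.trim().parse()?)))
    }
}
```

The same reasoning applies to &T, &mut T and Pin<T>, which the Rust reference also lists as fundamental; raw pointers are not on that list.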

Rust generics vs orphan rule

IIUC cxx doesn’t rely on Rust traits for dispatching the implementation of rust::Vec (and rust::Box) to the right monomorphizations in the generated code (I believe that it uses C++ template specializations for this instead). OTOH cxx still provides impls for the undocumented marker traits ImplBox and ImplVec (in a module named private). I am guessing that this is done as defense-in-depth to avoid ODR violations, but it seems to me that it may be sufficient to check that T bottoms out in a type defined locally in a #[cxx::bridge]. So, maybe we can delete the ImplBox and ImplVec traits?

If we do the above, then maybe there won’t be any other issues with supporting arbitrary T in Vec<T> and Box<T>. (There are some other potential complications, but they only apply to a subset of Ts - e.g. Vec<T> can only work if T values can be stored+moved on Rust side, so Vec<CxxVector<T>> would need to remain forbidden.)
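To make the "bottoms out in a locally defined type" check concrete, here is a rough sketch over a toy type model. Ty, bottoms_out_locally and is_valid_vec_element are hypothetical names; cxx's actual syntax tree and checks are structured differently, and the second function only looks at the outermost type constructor.

```rust
use std::collections::HashSet;

/// Hypothetical, heavily simplified model of the types that can appear in a bridge.
#[derive(Debug, Clone)]
pub enum Ty {
    /// A named type such as `Foo`.
    Named(String),
    /// `Box<T>`
    RustBox(Box<Ty>),
    /// `Vec<T>`
    RustVec(Box<Ty>),
    /// `CxxVector<T>`
    CxxVector(Box<Ty>),
}

/// Returns true if peeling off every generic wrapper ends at a named type that
/// is declared in the current bridge (`local`).
pub fn bottoms_out_locally(ty: &Ty, local: &HashSet<String>) -> bool {
    match ty {
        Ty::Named(name) => local.contains(name),
        Ty::RustBox(inner) | Ty::RustVec(inner) | Ty::CxxVector(inner) => {
            bottoms_out_locally(inner, local)
        }
    }
}

/// `Vec<T>` additionally needs a `T` that Rust can store and move by value,
/// so an element that is itself a `CxxVector<_>` stays forbidden.
pub fn is_valid_vec_element(ty: &Ty, local: &HashSet<String>) -> bool {
    !matches!(ty, Ty::CxxVector(_)) && bottoms_out_locally(ty, local)
}
```

With Foo declared locally, Vec<Box<Foo>> passes both checks, while a Vec whose element is CxxVector<Foo> is rejected by the second one.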

I think this direction may be worth exploring and prototyping (because I don't currently see any obvious downsides). I hope to be able to revisit this in September-October 2025 (fingers crossed :-P).

Avoiding trait-based dispatch

The orphan rule problem can potentially be avoided altogether if we can find an alternative mechanism for dispatching calls from Rust generics (e.g. from CxxVector<T>) to instantiation-specific thunks. I have one idea that I think may be worth discussing: dispatching based on the typeid of the generic type argument.

At a high level:

  • Let’s store a private global map in cxx: a map from type id to a virtual-method table.
  • Let’s make that private global map a LazyLock that, during initialization, goes over linkme-provided things.
  • The #[cxx::bridge] proc macro would expand into linkme declarations that provide a global compile-time slice with instantiation-specific virtual-method table “things” (“things” = functions that can be invoked by a OnceCell? “things” = direct ’static references to static virtual tables?)
  • Structs representing the virtual method tables would have to be notionally public, but can be hidden in a module named private
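The idea above can be sketched with the standard library alone. Everything named here (VectorVTable, REGISTERED, VTABLES, lookup_vtable, element_size_of) is hypothetical. A hand-written static slice stands in for the slice that linkme would assemble from the #[cxx::bridge] expansions, and std::mem::size_of stands in for a generated thunk. LazyLock requires Rust 1.80 or newer.

```rust
use std::any::TypeId;
use std::collections::HashMap;
use std::mem::size_of;
use std::sync::LazyLock;

/// Illustrative type that would be declared in a bridge.
pub struct Foo;

/// Hypothetical per-instantiation virtual-method table.
pub struct VectorVTable {
    /// Identifies the element type this table was generated for. Stored as a
    /// function pointer so the table can be built in a `static`.
    pub type_id: fn() -> TypeId,
    /// Stand-in for a generated thunk; a real table would hold several.
    pub element_size: fn() -> usize,
}

/// Assumption: in the proposal this slice would be collected by linkme from
/// every `#[cxx::bridge]` expansion. It is written by hand here.
static REGISTERED: &[VectorVTable] = &[
    VectorVTable {
        type_id: TypeId::of::<i32>,
        element_size: size_of::<i32>,
    },
    VectorVTable {
        type_id: TypeId::of::<*const Foo>,
        element_size: size_of::<*const Foo>,
    },
];

/// Private global map from type id to vtable, built on first use.
static VTABLES: LazyLock<HashMap<TypeId, &'static VectorVTable>> =
    LazyLock::new(|| REGISTERED.iter().map(|vt| ((vt.type_id)(), vt)).collect());

/// Runtime lookup that replaces the compile-time trait bound.
pub fn lookup_vtable<T: 'static>() -> Option<&'static VectorVTable> {
    VTABLES.get(&TypeId::of::<T>()).copied()
}

/// What a method of a generic `CxxVector<T>` might do: find the table for `T`
/// and call through it. `None` means no bridge registered this instantiation.
pub fn element_size_of<T: 'static>() -> Option<usize> {
    lookup_vtable::<T>().map(|vt| (vt.element_size)())
}
```

Note that no trait needs to be implemented for *const Foo, so the orphan rule never applies. The price is that an unregistered element type is only detected at runtime, when the lookup returns None.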

OTOH this approach has some drawbacks:

  • Runtime performance impact is unclear:
    • Trait-based-dispatch happens at compile time. Typeid dispatch would happen at runtime.
    • Maybe a “sufficiently smart” compiler can optimize the runtime map lookups away?
  • Removing the VectorElement and other traits would mean:
    • A breaking change
    • Inability to have a generic function which works with any generic CxxVector

As I said above, I think this may be worth discussing further. OTOH, I don’t feel strongly about it, and it hasn’t yet come up as a scenario that is important for my current project. So I am not sure if this would be worth exploring further and/or prototyping.
