Skip to content

Conversation

@elenaf9
Copy link
Contributor

@elenaf9 elenaf9 commented May 5, 2025

For libp2p/rust-libp2p#6009 it would be great if we could iterate through the streams in a futures_bounded::StreamMap.

Just implementing an iterator for that type unfortunately doesn't work because StreamMap boxes its inner streams and thus obscures the stream type.
This PR adds StreamMapIterable as a variant of StreamMap that doesn't do any boxing and implements iter and iter_mut.

Implementation wise, StreamMapIterable is just the old StreamMap but without boxing of streams, and StreamMap a wrapper around it.

Notes / Open Questions

If we want be consistent with the rest of rust-libp2p, we should technically call the StreamMapIterable just StreamMap, and the old StreamMap with the boxing StreamMapBoxed. But that would a) break the API and b) also require to change all other types for consistency. I'd be fine with that as well, but went with the minimal change for now. Wdyt @thomaseizinger @jxs?

elenaf9 added 2 commits May 5, 2025 17:41
Refactor existing `StreamMap` into a `StreamMapInterable` that doesn't
box the streams and allows iterating over the inner streams.
Re-add type `StreamMap` as a wrapper around `StreamMapIterable` that
does the old boxing and thus avoids breaking API.
@elenaf9 elenaf9 changed the title feat(stream_map): Add iterable variant of StreamMapIterable feat(stream_map): Add iterable variant of StreamMap May 5, 2025
@elenaf9 elenaf9 changed the title feat(stream_map): Add iterable variant of StreamMap feat: Add iterable variant of StreamMap May 5, 2025
@thomaseizinger
Copy link
Owner

Just implementing an iterator for that type unfortunately doesn't work because StreamMap boxes its inner streams and thus obscures the stream type.

You can still downcast them back to what you had right? Perhaps we can offer an iter function that also takes a type-parameter and iterates and downcasts for you? I feel like that would be more ergonomic than having to restate the stream type in the type signature.

@teor2345
Copy link
Contributor

teor2345 commented May 5, 2025

Just implementing an iterator for that type unfortunately doesn't work because StreamMap boxes its inner streams and thus obscures the stream type.

You can still downcast them back to what you had right? Perhaps we can offer an iter function that also takes a type-parameter and iterates and downcasts for you? I feel like that would be more ergonomic than having to restate the stream type in the type signature.

Sounds like either way you're restating the stream type in a type signature - either when you declare the type, or when you call the iter function?

I think the iter function would be more difficult to get correct. You can call it multiple times, and you'd have to get the type correct every time. And downcasting only does a runtime check, so if you get the type wrong, you might not know until that code gets called.

But declaring a type usually only happens once per struct or function, and the compiler will warn you if they don't match.

Another advantage of this PR is that the try_push and remove methods return the concrete type. I'm not sure we'd want to create downcasing versions of them as well. (And every downcast adds a 128-bit TypeId, which might be an issue for some performance or memory sensitive code, but it seems unlikely.)

@thomaseizinger
Copy link
Owner

Sounds like either way you're restating the stream type in a type signature - either when you declare the type, or when you call the iter function?

The difference is that in most cases - even in rust-libp2p for which I originally wrote this code - you don't pass a name-able Stream type. Rust doesn't have first-class generators yet but even Streams created with stream::unfold are most likely unnameable and therefore, you'll have to fallback to Box anyway in the type signature.

If we add an iter function with a type argument, then only the usecases where this iterator is needed have to pay for this complexity. In all other cases, StreamMap has the same ergonomic type signature.

I think the iter function would be more difficult to get correct. You can call it multiple times, and you'd have to get the type correct every time. And downcasting only does a runtime check, so if you get the type wrong, you might not know until that code gets called.

I'd expect that you only use the iter function if you have defined a type yourself that implements Stream and not if it you are passing in a combinator. Whilst I am huge fan of type-safe APIs, I don't think it is a very strong argument here especially if you balance it against the ergonomic hit described above.

@teor2345
Copy link
Contributor

teor2345 commented May 6, 2025

Thank you for explaining!

I'm not sure if we're talking about the same thing here?

If we add an iter function with a type argument, then only the usecases where this iterator is needed have to pay for this complexity. In all other cases, StreamMap has the same ergonomic type signature.

I think we agree about that. But in this PR, StreamMap keeps the same type signature as before?

Only StreamMapIterable requires the type to be specified in the type signature (and only when it can't be inferred by the compiler). And that's the type where you'd be using iteration on a concrete type anyway.

Let's try some examples from the libp2p PR:

With this PR:

let iterable_streams: StreamMapIterable<UniqueConnecId, InboundSubstreamState> = ...;
let found_stream = iterable_stream.iter().find(|(id, state)| ...);
...
let another_found_stream = iterable_stream.iter().find(|(id, state)| ...);

let boxed_streams: StreamMap<UniqueConnecId, ConnectionHandlerEvent<...>> = ...;

What I think you're suggesting:

let iterable_streams: StreamMap<UniqueConnecId, ConnectionHandlerEvent<...>> = ...;
let found_stream = iterable_streams.iter::<InboundSubstreamState>().find(|(id, maybe_downcast_state)| if let Some(state) = maybe_downcast_state { ... });
...
let another_found_stream = iterable_streams.iter::<InboundSubstreamState>().find(|(id, maybe_downcast_state)| if let Some(state) = maybe_downcast_state { ... });

let boxed_streams: StreamMap<UniqueConnecId, ConnectionHandlerEvent<...>> = ...;

The second example looks slightly more complex than the first to me: it has more types, and more checks. Which is why I'm not sure if we're talking about the same thing.

I think the iter function would be more difficult to get correct. You can call it multiple times, and you'd have to get the type correct every time. And downcasting only does a runtime check, so if you get the type wrong, you might not know until that code gets called.

I'd expect that you only use the iter function if you have defined a type yourself that implements Stream and not if it you are passing in a combinator. Whilst I am huge fan of type-safe APIs, I don't think it is a very strong argument here especially if you balance it against the ergonomic hit described above.

Your explanation makes sense, but I don't see an ergonomic hit with the code in this PR.

@thomaseizinger
Copy link
Owner

thomaseizinger commented May 6, 2025

Your explanation makes sense, but I don't see an ergonomic hit with the code in this PR.

But it duplicates the code! If we just add the iter function on the current StreamMap, then we don't have to duplicate code. The ergonomic hit comes from not wanting to duplicate this code in here and therefore, all users of StreamMap would have to move to specifying the type of the stream in the type signature.

What I think you're suggesting:

Not quite. I was thinking we do the downcasting for the user so if you call .iter(), you just get an iterator of type T (which may not yield any elements if we don't have streams of that type in the StreamMap).

@teor2345
Copy link
Contributor

teor2345 commented May 6, 2025

Your explanation makes sense, but I don't see an ergonomic hit with the code in this PR.

But it duplicates the code!
If we just add the iter function on the current StreamMap, then we don't have to duplicate code.

Yep, code duplication is a downside, and so is having to choose between types with similar APIs but different type parameters. Even if StreamMapIterable is just a thin wrapper.

The ergonomic hit comes from ... all users of StreamMap would have to move to specifying the type of the stream in the type signature.

Sorry, still not quite sure what you mean here. Maybe if I don't get it this time, you could provide an example?
Or I'm happy to just drop it and leave it to you to decide.

Do you mean users having to choose between this PR's StreamMapIterable::<ID, StreamType>::iter() vs StreamMap::<ID, ItemType>::iter::<StreamType>()?

Or something that's not in this PR, like StreamMap::<ID, StreamType>::iter() vs StreamMap::<ID, ItemType>::iter::<StreamType>()?

Not quite. I was thinking we do the downcasting for the user so if you call .iter(), you just get an iterator of type T (which may not yield any elements if we don't have streams of that type in the StreamMap).

Ah, so downcast and flatten. That makes sense, and would be more ergonomic in the happy case where the types were all correct.

@thomaseizinger
Copy link
Owner

thomaseizinger commented May 6, 2025

I should have probably spelled this out more clearly. I don't want to provide two versions of StreamMap if the only difference is that one has an iter function and the other doesn't.

So if we take that as a precondition and want to enable iter() without downcasting, then every user of StreamMap, regardless whether they are going to use .iter() has to pay the ergonomic hit that they now have to specify the type of stream in the field declaration where they store StreamMap.

Given that most users won't use iter, this seems like an unacceptable price to pay for enabling this feature when we can instead just localise this to the iter function. If we downcast and flatten, the way you use iter is the exact same in both designs. The difference is that only users of iter have to specify the type. For all other users, this is a non-breaking change and they can use StreamMap the same way they did before.

This can also be framed as an additional advantage if you want to store multiple different stream types in the same StreamMap.

@teor2345
Copy link
Contributor

teor2345 commented May 6, 2025

I should have probably spelled this out more clearly. I don't want to provide two versions of StreamMap if the only difference is that one has an iter function and the other doesn't.

Makes sense!

Edit: thanks for your patience here.

@elenaf9 just confirming that StreamMap is the type you want to make this change on, rather than StreamSet? (Or both?)

Currently in libp2p/rust-libp2p#6009 it would be easiest to use StreamSet<InboundSubstreamState>.

But I can understand wanting to change to StreamMap<UniqueConnecId, InboundSubstreamState> at some point in the future, and drop the UniqueConnecId fields from inside InboundSubstreamState.

@thomaseizinger
Copy link
Owner

@elenaf9 just confirming that StreamMap is the type you want to make this change on, rather than StreamSet? (Or both?)

I am happy for this change to land on both, or even all types of this crate given that they all follow the same pattern.

@elenaf9
Copy link
Contributor Author

elenaf9 commented May 6, 2025

TLDR: I would prefer having a type-safe interface rather than downcasting, but I am also fine with the latter. But I am not able to make that work, so some pointers would be great @thomaseizinger.


I initially tried to solve it with downcasting, but wasn't able to make it work.
Specifically, I'm not able to cast the &dyn Stream<Item = O> type that I get out of the BoxStream into a &dyn Any that I could call downcast_ref on. I think it's because the dyn Stream isn't type-bounded to implement Any?

I should have probably spelled this out more clearly. I don't want to provide two versions of StreamMap if the only difference is that one has an iter function and the other doesn't.

I understand that perspective, but I am not sure I agree.
I think it's quite common to have wrapper types that add or hide just one functionality. The existing StreamSet in this repo is also just a simple wrapper around StreamMap with the only difference that it hides the ID parameter.

Also, StreamMapIterator doesn't just add iter{_mut}, but, as @teor2345 mentioned, it returns the non-boxed type again in remove and try_push, which might be useful. So in the end, it's just a non-boxed variant of StreamMap.

The difference is that in most cases - even in rust-libp2p for which I originally wrote this code - you don't pass a name-able Stream type. Rust doesn't have first-class generators yet but even Streams created with stream::unfold are most likely unnameable and therefore, you'll have to fallback to Box anyway in the type signature.

I think for streams specifically (not really for futures) it is actually quite common to implement Stream directly on a type, rather than using helpers like stream::unfold.

@elenaf9 just confirming that StreamMap is the type you want to make this change on, rather than StreamSet? (Or both?)

StreamSet is just a wrapper around StreamMap, that's why I implemented it for StreamMap first. Ideally we'd also have it for StreamSet, but I'd be fine with paying the price of adding IDs in our rust-libp2p code if that gives us an iterator 😄.

@teor2345
Copy link
Contributor

teor2345 commented May 6, 2025

TLDR: I would prefer having a type-safe interface rather than downcasting, but I am also fine with the latter. But I am not able to make that work, so some pointers would be great @thomaseizinger.

I initially tried to solve it with downcasting, but wasn't able to make it work. Specifically, I'm not able to cast the &dyn Stream<Item = O> type that I get out of the BoxStream into a &dyn Any that I could call downcast_ref on. I think it's because the dyn Stream isn't type-bounded to implement Any?

Any requires the type to be 'static, so it should work with the current bounds on BoxStream. But you might have to add a BoxStream<'static, O>: Any or T: Any bound to the iter method to do a downcast:
https://doc.rust-lang.org/nightly/std/any/trait.Any.html#method.downcast

Box also implements Any itself, so it's worth checking you're not casting the Box to &dyn Any:
https://doc.rust-lang.org/nightly/core/any/index.html#smart-pointers-and-dyn-any

@thomaseizinger
Copy link
Owner

It took me a bit of playing around but I managed to make it work: #9

Feel free to build on top of that!

@elenaf9
Copy link
Contributor Author

elenaf9 commented May 7, 2025

It took me a bit of playing around but I managed to make it work: #9

Feel free to build on top of that!

Thanks! Opened #10 based on your draft.

@elenaf9 elenaf9 deleted the iterable-stream-map branch May 7, 2025 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants