Using libdispatch #138
Replies: 3 comments 2 replies
-
|
Thanks for your interest. v0.1 is coming as soon as I complete the documentation effort and maybe introduce a few additional async primitives, such as heterogeneous multi-await, which would enable timeouts. As for libdispatch, I bet that the macOS version (Grand Central Dispatch) is quite fast, as it has additional kernel support. I don't own a Mac, but I think I will have access to one in the next few months, so at that point I could benchmark against it. I'm also not sure about coroutine support in Apple Clang, so I don't know if all of the examples in this project's repo will compile there. I can still test using regular functors, of course. I do see that there is a Linux port of libdispatch, so I will try to find some time to test that against a few of my benchmarks. Sounds like a fun weekend project :)
-
|
Since the original discussion, many new features have been added, and v1.0.0 was released last month. This now leaves me time to pursue some more experimental approaches. I'd like to benchmark against libdispatch on an Apple M processor (now that I have one), but first I'm planning to complete an overhaul of the

The P vs E detection is the most important one, since Apple M processors have these hybrid cores, and libdispatch/Grand Central Dispatch is optimized to make use of them. It will be interesting to see if we can configure an executor to outperform libdispatch using this knowledge. There are 2 possible configurations that I want to explore:
Once I have that in place, I can do the comparative benchmarks. If it turns out that libdispatch is unbeatable, then I may also experiment with using libdispatch as a backend for TooManyCooks on the Apple platform.

Random thoughts:

An article: https://tclementdev.com/posts/what_went_wrong_with_the_libdispatch.html
Hopefully this isn't still an issue; apparently libdispatch was creating way too many threads.

I wonder what kind of tasks are actually most appropriate to run on the E cores. Background loading, like synchronous I/O? Or would it be OK to put network I/O on them if the program is normally compute heavy (to keep the P cores free)? Or should we be putting background compute on the E cores, if your application actually has such a thing?
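For anyone curious what P vs E detection can look like outside of hwloc, here is a minimal sketch that reads the per-performance-level core counts macOS exposes through `sysctlbyname`. This is not how TooManyCooks does it; `CoreCounts` and `detect_core_counts` are hypothetical names made up for this illustration, and the `hw.perflevel0` / `hw.perflevel1` keys are assumed to be present only on recent macOS versions running on Apple silicon. On any other platform the sketch falls back to treating every hardware thread as a P core.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical result type for this sketch (not a TooManyCooks type).
struct CoreCounts {
  unsigned performance;
  unsigned efficiency;
};

#if defined(__APPLE__)
// Reads a 32-bit sysctl value by name; returns false if the key is missing.
static bool read_sysctl_u32(const char* name, std::uint32_t& out) {
  std::size_t len = sizeof(out);
  return sysctlbyname(name, &out, &len, nullptr, 0) == 0;
}
#endif

// Hypothetical helper: reports how many P and E cores the machine has.
inline CoreCounts detect_core_counts() {
#if defined(__APPLE__)
  std::uint32_t p = 0;
  std::uint32_t e = 0;
  // Assumption: perflevel0 is the highest-performance level (P cores)
  // and perflevel1, when present, holds the E cores.
  if (read_sysctl_u32("hw.perflevel0.physicalcpu", p) && p > 0) {
    if (!read_sysctl_u32("hw.perflevel1.physicalcpu", e)) {
      e = 0;
    }
    return CoreCounts{p, e};
  }
#endif
  // Fallback: no hybrid information available, so count everything as P.
  unsigned n = std::thread::hardware_concurrency();
  return CoreCounts{n == 0 ? 1u : n, 0u};
}
```

Calling `detect_core_counts()` once at startup would give an executor the two thread counts it needs to size separate P and E groups. hwloc remains the more portable option, since it also covers hybrid x86 parts.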
-
|
I've implemented hybrid work steering between P and E cores on the main branch.

Updated example: https://github.com/tzcnt/tmc-examples/blob/main/examples/hwloc/hybrid_executor.cpp

Documentation:
On macOS we cannot pin threads directly, so I set the QoS class to hint to the OS where the threads should run. I'm currently using QOS_CLASS_USER_INTERACTIVE for P-cores and QOS_CLASS_USER_INITIATED for E-cores. You can override this by using the new overload of
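To make the QoS hint concrete, here is a minimal sketch of how a worker thread could tag itself with one of the two classes mentioned above. It is not the TooManyCooks implementation: `CoreHint`, `qos_hints_supported`, and `apply_core_hint` are hypothetical names made up for this illustration. The only real API used is `pthread_set_qos_class_self_np` from `<pthread/qos.h>`, which exists only on Apple platforms, so the sketch compiles to a no-op elsewhere.

```cpp
#if defined(__APPLE__)
#include <pthread/qos.h>
#endif

// Hypothetical tag for which kind of core a thread would prefer.
enum class CoreHint { Performance, Efficiency };

// True when the platform offers the QoS class API used below.
constexpr bool qos_hints_supported() {
#if defined(__APPLE__)
  return true;
#else
  return false;
#endif
}

// Applies a QoS class to the calling thread as a scheduling hint.
// Returns true if the hint was applied, false if it failed or is unsupported.
// The OS is free to ignore the hint; this does not pin the thread.
inline bool apply_core_hint(CoreHint hint) {
#if defined(__APPLE__)
  // Mapping taken from the comment above: USER_INTERACTIVE for P-cores,
  // USER_INITIATED for E-cores.
  qos_class_t qos = (hint == CoreHint::Performance)
                        ? QOS_CLASS_USER_INTERACTIVE
                        : QOS_CLASS_USER_INITIATED;
  // Second argument is the relative priority within the class (0 = default).
  return pthread_set_qos_class_self_np(qos, 0) == 0;
#else
  (void)hint;
  return false;
#endif
}
```

Each worker thread would call `apply_core_hint` once at the start of its run loop, before picking up any work. The call affects only the calling thread, which is why it has to run on the worker itself and not on the thread that spawns it.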
-
First off, I'm really looking forward to seeing this in action!
When do you plan a 0.1 release?
Have you benchmarked your executor against libdispatch? According to Sean Parent, it's really hard to beat it given it's also hardware-aware.
Thank you.