Skip to content

Conversation

@rocallahan
Copy link
Contributor

If your work is part of a larger effort, please discuss your general plans on Discourse first to align your vision with maintainers.

https://yosyshq.discourse.group/t/fixing-idstring-refcounting/

What are the reasons/motivation for this change?

IdString refcounting bloats code, has measurable overhead, and would be even higher overhead if we made it thread-safe. It would be nice to get rid of it.

Explain how this is achieved.

We may be better off instead garbage-collecting dead IdStrings at opportune times by tracing the live Designs.

This PR has the basics and passes Yosys' make test but is not yet complete. In particular I want to do something to optimize NEW_ID.

In this PR we only GC IdStrings during top level opt_cleans.

I considered instead of tracing Designs, making fields like NamedObject::name be OwningIdStrings. That change has ripple effects which would probably break third-party code, so at least for now, I'm sticking with tracing.

@rocallahan
Copy link
Contributor Author

NEW_ID shoves the file/line/function name into the created ID, plus autoidx for uniqueness. In theory it could collide with an existing ID if there was malicious input, but it ignores that, so fine.

We would like to be able to call NEW_ID and have it not leak space until the next GC, if the new ID is in fact not permanently added to the design.

One way to do this would be to allocate a space of ID values (e.g., the negative values) to represent IDs that don't use the normal string storage or refcount array. Group these special IDs into fixed-size blocks (e.g. 256 IDs), where each block has a stored file/line/function name/autoidx-base and the string for an ID can be computed from the ID and its block's metadata. The refcounts for these negative IDs would be stored in a hashtable so only NEW_IDs that end up in OwningIdStrings would create entries in the hashtable. This would reduce the space overhead of temporary NEW_IDs to negligible levels. We can optimize block assignment using a simliar closure-with-a-static trick that the ID() macro uses.

One complicating factor is that IdString::c_str() needs to return a pointer to string storage and we have nowhere to put a temporary std::string. For that we probably have to create a global hashtable to hold explicit strings for these special IDs on-demand.

@jix jix self-assigned this Oct 13, 2025
@rocallahan
Copy link
Contributor Author

I ended up doing something simpler for NEW_ID, instead of using blocks. A hashtable maps negative IDs to a pointer to a std::string; the ID's name is the concatenation of that string followed by the (negated) ID index (its associated autoidx). The downside: there is space overhead of one of those hashtable entries per allocated NEW_ID until it is garbage-collected. The upside is that this is simpler and more efficient than managing blocks.

To make the special treatment of NEW_ID actually beneficial, we have to avoid calling .c_str() on those IDs, so I've commits that convert usage of IdString::c_str() to use other approaches that avoid allocating a persistent string, in particular a character iterator that I've introduced. This does add some complexity but it seems to me that this complexity would be required by any approach that wants to avoid allocating string storage for NEW_IDs.

Currently tests are failing because apparently any nontrivial change to the way ID indices are ordered causes tests to fail 🙁.

@rocallahan rocallahan marked this pull request as ready for review October 13, 2025 23:07
@rocallahan
Copy link
Contributor Author

Marking this as "ready for review" because although I'm not sure this PR is a slam-dunk as is, I think it's time for some feedback on this approach.

@rocallahan
Copy link
Contributor Author

One nice thing we can do as part of this is get rid of StaticIdString --- now that IdString has a trivial destructor, we can create constexpr IdStrings. I've added a commit for this.

@ShinyKate ShinyKate requested review from widlarizer and removed request for widlarizer October 14, 2025 14:50
Copy link
Collaborator

@widlarizer widlarizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the actual code changes yet, but as a perf-first PR, I'm considering it from a perf-first perspective: With this PR, running synth on jpeg from RTLIL regresses from 3.20 seconds to 3.39. With detailed statistics (driver option -d) I can see that opt_clean goes from 273 to 373ms runtime, and that stat reliably goes from 40ms to 104ms runtime. Note: I do all perf testing with clang and LTO

Please investigate if this is incidental or caused by something at the core of the implemented idea

@rocallahan
Copy link
Contributor Author

Let's discuss it further in https://yosyshq.discourse.group/t/fixing-idstring-refcounting/94.

@rocallahan
Copy link
Contributor Author

BTW I'm pretty sure the test failures are because we're changing the order of ID string indices, and that changes test results. I did an experiment where I mixed a random seed into IdString::operator< to get different orderings, and various tests started failing.

Copy link
Collaborator

@widlarizer widlarizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining CI failures:

  • with C++20, constexpr check_format can't do YOSYS_ABORT
  • on Windows, there's no compatible from_chars declaration to the call you added, possibly due to standard library type conversions

@rocallahan
Copy link
Contributor Author

  • with C++20, constexpr check_format can't do YOSYS_ABORT

This is what happens now when the format string doesn't match the parameter types. Fixed.

on Windows, there's no compatible from_chars declaration to the call you added, possibly due to standard library type conversions

I needed to use std::string_view::data() instead of std::string_view::begin().

@rocallahan
Copy link
Contributor Author

@widlarizer By the way I realized I don't actually have the jpeg pre-synth RTLIL, only the post-synth RTLIL, so I'm not actually running the same experiment as you. If you let me know where I can find the pre-synth RTLIL I can run a more accurate experiment. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants