Add tile intersection checking #1293

b0nes164 · 2025-11-13T17:04:04Z

This PR adds generation of intersection data to make_tiles.

For our MSAA calculations, we need the points where a line intersects with a tile. Since the end goal is to perform the MSAA in parallel, we need these intersections to be watertight between different threads. By performing these calculations in make_tiles, we can create a source of ground truth, ensuring watertightness.

Although it is feasible to calculate the exact intersection points in make_tiles, we defer that work to the GPU. Instead we create an intersection bitmask, which unambiguously defines which edges the line intersects. We say a line intersects the edge of a tile if it touches that edge AND continues into another tile. An endpoint that exactly touches an edge does NOT count as intersecting, though it may still contribute to winding (in the case of a top edge touch).

The bitmask is 5 bits, and will be consumed downstream by the rasterization stage:
P(erfect) | R(ight) | L(eft) | B(ottom) | T(op)
The lower 4 bits correspond to the intersection edges. The "Perfect" bit is necessary to resolve ambiguity in cases where a line perfectly intersects with a corner of a tile.
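
As a rough illustration of the layout above, here is a minimal sketch of the mask as bit constants. The exact bit positions are an assumption (the description only gives the order P | R | L | B | T and says the lower 4 bits are the edges), and `describe` is a hypothetical debugging helper, not something from this PR.

```rust
// Assumed bit positions: T is the lowest bit, P the highest of the five.
const TOP: u8 = 1 << 0;
const BOTTOM: u8 = 1 << 1;
const LEFT: u8 = 1 << 2;
const RIGHT: u8 = 1 << 3;
const PERFECT: u8 = 1 << 4;

/// Hypothetical helper: renders a mask in the same "P | R | L | B | T"
/// order used in the description, listing only the bits that are set.
fn describe(mask: u8) -> String {
    let mut parts = Vec::new();
    for (bit, name) in [
        (PERFECT, "P"),
        (RIGHT, "R"),
        (LEFT, "L"),
        (BOTTOM, "B"),
        (TOP, "T"),
    ] {
        if mask & bit != 0 {
            parts.push(name);
        }
    }
    parts.join(" | ")
}
```

With these assumed constants, `describe(TOP | LEFT | BOTTOM)` yields the string `L | B | T`.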

Cases of 3 bit ambiguity can be resolved by always calculating intersections on the opposing edges of the tile.
Consider a tile with intersections T | L | B:

o--------+
|\       |
| \      |
|  \     | 
+---o----+

Calculate the intersections on T and B!

Cases of 2 bit ambiguity require the "Perfect" bit, which is set when there is exactly one unique edge intersection.
Consider a tile with intersections T | L:

This is valid:

o--------+
|\       |
| o      |
|        | 
+--------+

But so is this:

+--o-----+
| /      |
o        |
|        | 
+--------+

With the perfect bit, the first case would be P | T | L and the second case would be T | L
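
The disambiguation rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: the bit positions are assumed, `edges_to_solve` is a hypothetical name, and picking the lowest set edge bit in the "Perfect" case is an arbitrary choice made here, not something the PR specifies.

```rust
// Assumed bit positions, matching the P | R | L | B | T order.
const TOP: u8 = 1 << 0;
const BOTTOM: u8 = 1 << 1;
const LEFT: u8 = 1 << 2;
const RIGHT: u8 = 1 << 3;
const PERFECT: u8 = 1 << 4;
const EDGES: u8 = TOP | BOTTOM | LEFT | RIGHT;

/// Hypothetical helper: returns the subset of edge bits on which a consumer
/// would actually calculate intersection points.
fn edges_to_solve(mask: u8) -> u8 {
    let edges = mask & EDGES;
    // Any 3-bit or 4-bit mask contains an opposing pair; use that pair.
    if edges & (TOP | BOTTOM) == (TOP | BOTTOM) {
        return TOP | BOTTOM;
    }
    if edges & (LEFT | RIGHT) == (LEFT | RIGHT) {
        return LEFT | RIGHT;
    }
    // Two adjacent edges with the Perfect bit: the line passes exactly
    // through the shared corner, so there is one unique point to compute.
    // Keeping the lowest set edge bit is an arbitrary choice for this sketch.
    if mask & PERFECT != 0 && edges.count_ones() == 2 {
        return edges & edges.wrapping_neg();
    }
    edges
}
```

Under these assumptions, T | L | B resolves to T | B, P | T | L resolves to a single edge, and plain T | L keeps both edges because the two crossings are distinct points.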

LaurenzV · 2025-11-14T09:08:11Z

I ran this through my PDF test suite and there only seem to be small unnoticeable single-pixel differences, so no regressions. 👍 However, I won't be able to take a closer look at this before Monday.

But just one thing I noticed, to me, it does seem like this will add some processing time for a single tile. Perhaps we could add a const generic to the method indicating whether tile intersection data should be computed, and when false it just sets it to 0? This way, we can skip all of the calculations for vello_cpu, where it's not needed.

Also, this PR would make #1211 irrelevant, right?

b0nes164 · 2025-11-14T23:22:09Z

> However, I won't be able to take a closer look at this before Monday.

No worries! Take your time.

> But just one thing I noticed, to me, it does seem like this will add some processing time for a single tile.

Yes, there is a special case for a single tile.

> This way, we can skip all of the calculations for vello_cpu

Yes, I was thinking the same thing. I'm not super familiar with Rust, so I'm not sure what the rustiest way to do this is, but I think even two different make_tiles could work.

> Also, this PR would make #1211 irrelevant, right?

I think so. I was debating whether it would be worth it to case out more fast paths, but as mentioned above, I do have a "line completely enclosed in tile" case.

tomcur

Cool stuff. The packing is clever and looks good.

Your explanation of the packing is great; I think it would be worthwhile to add that as documentation to the packed_winding_line_idx.

> Cases of 3 bit ambiguity can be resolved by always calculating intersections on the opposing edges of the tile.

Perhaps if you add the documentation to the code, also mention the 4-bit case (where, if my understanding is correct, you just calculate for any two opposing sides).

To be able to properly review the core idea itself I probably need a bit more background on the needs of MSAA, would you have some pointers?

> > This way, we can skip all of the calculations for vello_cpu
>
> Yes, I was thinking the same thing.

Conditionally running this seems like a good idea. Here's an example of the const-generic pattern: https://github.com/linebender/parley/blob/4f887570bac98c6693e4a3c6937eebb3423dde72/parley/src/layout/alignment.rs#L84-L97, but two versions also make sense, especially if the two versions can be optimized differently.

LaurenzV · 2025-11-15T10:41:55Z

I think by default I would prefer a single version, and only switch to two separate methods if that really turns out to be faster. Otherwise, when changing stuff in the logic, we need to always remember to do it in both methods.

b0nes164 · 2025-11-17T04:45:46Z

Re: All
I will work in your comments; ETA tomorrow.

If I understand it correctly, a const generic is the equivalent of a C++ non-type template parameter? I.e., the const-generic value is visible at compile time, and if we use a bool as the parameter we can conditionally compile what we need?

This would work, but I expect it may be a little messy, as the function cannot be split cleanly. I will include this in the version tomorrow, so we can get a picture of what it would look like.
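
For reference, the const-generic approach discussed here might look roughly like the sketch below. Everything in it is hypothetical: `Tile`, `dummy_mask`, and `make_tile` are stand-ins for illustration and do not reflect the actual make_tiles code in this PR.

```rust
/// Hypothetical stand-in for a tile record; not the real vello type.
struct Tile {
    x: u16,
    y: u16,
    intersection_mask: u8,
}

/// Placeholder for the real edge-crossing logic, which is not reproduced
/// here. It just derives an arbitrary 5-bit value from the coordinates.
fn dummy_mask(x: u16, y: u16) -> u8 {
    ((x ^ y) & 0x1f) as u8
}

/// `GEN_INTERSECTIONS` is known at compile time, so each instantiation is
/// compiled separately and the untaken branch can be optimized away.
fn make_tile<const GEN_INTERSECTIONS: bool>(x: u16, y: u16) -> Tile {
    let intersection_mask = if GEN_INTERSECTIONS {
        dummy_mask(x, y)
    } else {
        0
    };
    Tile { x, y, intersection_mask }
}
```

A caller that does not need the data would use `make_tile::<false>(x, y)` and always get a zero mask, while a caller that does would use `make_tile::<true>(x, y)`. This keeps a single function body, which matches the preference for not maintaining two copies of the logic.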

Re:Tom

> Perhaps if you add the documentation to the code, also mention the 4-bit case (where, if my understanding is correct, you just calculate for any two opposing sides).

Yes, this is correct.

> To be able to properly review the core idea itself I probably need a bit more background on the needs of MSAA, would you have some pointers?

Yes, apologies!

Conceptually, the MSAA version of sparse strips is almost identical to the analytic version, but instead of calculating the coverage mask through the area of the trapezoid formed by the line, we iterate through N subpixel sampling locations and determine whether each one is "inside" or "outside" the line. You then take the bitmask of these inside/outside results, count the set bits, and turn that count into coverage.
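
The final counting step could be sketched like this. It is a minimal illustration, assuming the inside/outside results are packed one bit per sample into a `u32`; `coverage_from_mask` is a hypothetical name, not a function from this PR.

```rust
/// Hypothetical helper: converts a mask of "inside" samples into a coverage
/// value in [0, 1]. `sample_count` is the number of subpixel samples per
/// pixel (e.g. 8, 16, or 32); bits above that count are ignored.
fn coverage_from_mask(inside_mask: u32, sample_count: u32) -> f32 {
    assert!(sample_count >= 1 && sample_count <= 32);
    // Build a mask of the valid sample bits without overflowing the shift.
    let valid = if sample_count == 32 {
        u32::MAX
    } else {
        (1u32 << sample_count) - 1
    };
    (inside_mask & valid).count_ones() as f32 / sample_count as f32
}
```

For instance, with 8 samples and 4 of them inside, this gives a coverage of 0.5.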

For example (note there is a bug in the topmost-left tile): [image]

Naively, this can be done by maintaining a winding number per subpixel sample, but because we want to do this in parallel, communicating 8, 16, or 32 windings across threads becomes prohibitively costly.

Instead, we can minimize the communication required to a single winding number per tile (the same coarse winding used in the analytic version) by applying three rules. For simplicity, we'll assume the even/odd fill rule and a single non-overlapping tile:

For any given pixel in a tile, the winding is the XOR of the coarse winding number for the whole tile, with the left edge intersection, with the per-pixel calculations.
pixel[x][y] = coarse_winding ^ left_edge_intersection ^ per_pixel_calculations

The coarse winding we get from the tiling.
The left-edge-intersection rule is: if the line intersecting a tile intersects its left edge, then left_edge_intersection is true for all pixel rows below the pixel row the line intersected. If the line perfectly intersects the top-left corner, this requires tie-breaking logic, which is not implemented yet. (I realized today that the current logic is insufficient.)
The per-pixel-calculations involve determining which pixels the line intersects, then getting the subpixel sample mask, but I am keeping this deliberately vague, as it doesn't relate to the tiling.
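
Putting the three terms together, the even/odd rule above might be sketched as follows. This is only an illustration: `pixel_inside` is a hypothetical name, the per-pixel term is passed in as an opaque bool, "below" is assumed to mean a larger row index (y pointing down), and the top-left corner tie-break mentioned above is not handled.

```rust
/// Hypothetical helper: even/odd inside state for a pixel in row `y` of a
/// tile. `left_hit_row` is the pixel row where the line crosses the tile's
/// left edge, if it crosses it at all.
fn pixel_inside(
    coarse_winding: bool,
    left_hit_row: Option<usize>,
    y: usize,
    per_pixel: bool,
) -> bool {
    // The left-edge term is set only for rows strictly below the crossing
    // row (assuming rows are numbered top to bottom).
    let left_edge_intersection = matches!(left_hit_row, Some(row) if y > row);
    coarse_winding ^ left_edge_intersection ^ per_pixel
}
```

With a left-edge crossing in row 1 and no other contributions, rows 0 and 1 stay outside while rows 2 and below flip to inside, which is why a disagreement about the crossing row between tiles would affect an entire pixel row.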

Example, the left-edge-intersection in action: [image]

Example of the current bug with top left intersection: [image]

Because the left-edge-intersection seeds an entire row of pixels, any discrepancy in the left-edge pixel intersection between tiles (due to floating point errors) is potentially catastrophic. So the idea is to take advantage of the fact that we fully traverse the line during the tiling to create a source of ground-truth for the compute-shader to use.

Instead of going straight into the code, I think the best way to review this is to go into the test cases and see if the results match what you would expect. Or add your own test case and see what you get.

b0nes164 · 2025-11-17T22:50:43Z

Added changes as discussed, sans the additional comments. I deferred adding comments since the top-left intersection logic needs to be reworked to address the issue mentioned above. Previously, I had planned on doing all top-left corner tie-breaking logic downstream. However, I need to reconsider this decision because of the bug...

LaurenzV · 2025-11-24T18:28:51Z

Sorry for the delay, I hope that I will be able to take a look in the next few days.

b0nes164 · 2025-11-24T23:19:02Z

@tomcur

I'm currently working on an overhaul to the intersections which will change the intersection data mask. Laurenz is working on a patch to the blender2d benchmarking suite to better integrate it with vello.

We would like to get both done, and then benchmark the performance before moving forward, so for now there is no need to review.

Add tile intersection checking

68701fe

b0nes164 requested review from LaurenzV and tomcur November 13, 2025 17:04

tomcur reviewed Nov 15, 2025


Thomas Smith added 3 commits November 17, 2025 14:09

cleanup tests a little

d3c1c1e

Check completely vertically culled earlier.

d7b5513

Add const generic

0ab63ab
