Skip to content

Conversation

easyice
Copy link
Contributor

@easyice easyice commented Aug 11, 2025

This uses FixedBitSet#cardinality to speed up counting liveDocs in CheckIndex and some assert implementations, instead of checking bits one by one.

Copy link
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@github-actions github-actions bot added this to the 10.3.0 milestone Aug 12, 2025
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I have an in-progress PR that would break this optimization: #14996. Furthermore, in the typical case, live docs are not as instance of FixedBitSet but of FixedBits (the result of FixedBitSet#asReadOnlyBits) so I don't think it would help much?

@easyice
Copy link
Contributor Author

easyice commented Aug 12, 2025

Thanks for the review Adrien, sorry for not making it clear, this change also use instanceof FixedBits to enable the optimization, it will call fixedBits.bitSet to get the FixedBitSet instance, but just placed the new method cardinality(Bits, int, int) in class FixedBitSet, so it is also compatible with #14996 in my understand?

I had considered another approach: placing the new cardinality method in Bits, similar to applyMask, but it seems that only a util method is needed here, WDYT?

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. I'm not not too fond of the approach, it looks like you'd really want to add int Bits#cardinality(), but also don't want to add it to keep Bits lean (which I appreciate). But it looks a bit odd to me.

If we'd like to speed these things up, maybe we should allocate a FixedBitSet(1024), copy the content of the Bits into this FixedBitSet using applyMask and then call cardinality() on the FixedBitSet?

@easyice
Copy link
Contributor Author

easyice commented Aug 25, 2025

It s a nice idea! although it requires allocating an FixedBitSet(bits.length()), it is still much faster than checking bits one by one.

Here are some JMH numbers:

Benchmark                                     (density)  (size)   Mode  Cnt   Score   Error   Units
FixedBitSetBenchmark.countWithCardinality           0.5    1024  thrpt    5  53.244 ± 3.754  ops/us
FixedBitSetBenchmark.countWithFallbackGet           0.5    1024  thrpt    5   1.912 ± 0.090  ops/us
FixedBitSetBenchmark.countWithFixedBitSetGet        0.5    1024  thrpt    5   1.758 ± 0.047  ops/us

countWithCardinality this approach, use applyMask and cardinality
countWithFallbackGet uses a subclass of Bits to mock a Bits object that is not an instance of FixedBitSet
countWithFixedBitSetGet use FixedBitSet#get

Code
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3, time = 3)
@Measurement(iterations = 5, time = 5)
@Fork(1)
public class FixedBitSetBenchmark {

  @Param({"1024"})
  private int size;

  @Param({"0.5"})
  private float density; // the percentage of 1 in the bitset

  private Bits bits; // FixedBitSet#asReadOnlyBits 

  Bits fallbackBits; // will not use FixedBitSet

  @Setup(Level.Trial)
  public void setup() {
    FixedBitSet bitSet = new FixedBitSet(size);
    int numSet = (int) (size * density);

    if (numSet == size) {
      bitSet.set(0, size);
    } else if (numSet > 0) {
      Random random = new Random(0);
      for (int i = 0; i < numSet; i++) {
        bitSet.set(random.nextInt(size - 1));
      }
    }
    bits = bitSet.asReadOnlyBits();

    fallbackBits =
        new Bits() {
          @Override
          public boolean get(int index) {
            return index % 2 == 0;
          }

          @Override
          public int length() {
            return size;
          }
        };
  }

  @Benchmark
  public void countWithCardinality(Blackhole bh) {
    int count = 0;
    FixedBitSet bitSet = new FixedBitSet(size);
    bitSet.set(0, size);

    bits.applyMask(bitSet, 0);
    count = bitSet.cardinality();
    bh.consume(count);
  }

  @Benchmark
  public void countWithFixedBitSetGet(Blackhole bh) {
    int count = 0;
    for (int i = 0; i < bits.length(); i++) {
      if (bits.get(i)) {
        count++;
      }
    }
    bh.consume(count);
  }

  @Benchmark
  public void countWithFallbackGet(Blackhole bh) {
    int count = 0;
    for (int i = 0; i < fallbackBits.length(); i++) {
      if (fallbackBits.get(i)) {
        count++;
      }
    }
    bh.consume(count);
  }
}

@jpountz
Copy link
Contributor

jpountz commented Aug 25, 2025

We don't actually need to allocate a FixedBitSet of size maxDoc, we could copy slices of 1024 bits into a FixedBitSet(1024) to do the counting?

@easyice
Copy link
Contributor Author

easyice commented Aug 26, 2025

No problem. I will update it.

@easyice
Copy link
Contributor Author

easyice commented Aug 26, 2025

The new approach is similar to #14998, so I reused part of the code.

The changes touch PendingDeletes and CheckIndex, which would introduce some duplicate code (the new bitsCardinality method). Since there doesn’t seem to be a suitable util class for this method, and the modifications in PendingDeletes were only related to assertions and not essential, I reverted those changes.

The optimization is now applied only to CheckIndex to keep the code clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants