Skip to content

Conversation

jmikola
Copy link
Member

@jmikola jmikola commented Jul 24, 2025

Copy link
Member

@GromNaN GromNaN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the python implementation, the Binary class has a from_vector static method. You have certainly considered this solution, of having only a factory method instead of a new class. But the new methods toArray and getVectorType are specific to this binary type, so having a dedicated class makes sense. But if you create a new Binary($raw, Binary::TYPE_VECTOR) instance with the vector type from the raw binary data, you don't have this methods. The current state doesn't allow creating a BinaryVector from raw data. I think you would have to wrap in into a BSON Document and encode/decode it to get a BinaryVector instance. I don't find any other binary type that would require a specific class.

I see 3 possible implementations:

  1. Add Binary::fromVector(array $vector, VectorType $vectorType) and toArray() to the existing Binary class
  2. Create BinaryVector class that extends Binary (current state of this PR).
  3. Create BinaryVector class that implements BinaryInterface, \JsonSerializable, Type, \Stringable.

We need to compare pro/con of each solution.

@@ -7,7 +7,7 @@

namespace MongoDB\BSON;

final class Binary implements BinaryInterface, \JsonSerializable, Type, \Stringable
class Binary implements BinaryInterface, \JsonSerializable, Type, \Stringable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This classes does not define any method that is not in one of the interfaces. This means you don't need to make it extensible: the new BinaryVector class could implement all this interfaces (re-using the underlying C functions).

In the library, I used MongoDB\BSON\Binary type in the builder, but I would change it to use BinaryInterface. Why this interface doesn't extend MongoDB\BSON\Type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interfaces were never very useful for PHPC. One of the original use cases was have an embedded object masquerade as a UTCDateTime and incorporate a timezone, but the benefits are mainly for userland applications.

I think sharing a single Binary class is ideal for anything that would expect the BSON type, and avoiding a separate BinaryVector child class seems like it'd make things easier for users (and simplify what comes back from BSON-to-PHP conversion).


final static public function fromPackedBitArray(array $vector): BinaryVector {}

final public function getVectorType(): int {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to return VectorType here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, once this method is moved to the MongoDB\BSON\Binary class.


final static public function fromInt8Array(array $vector): BinaryVector {}

//final static public function fromInt8Array2(array $vector): BinaryVector {}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder to remove this since we benchmarked both implementations and this was worse.

// TODO: Implement and branch
public function __construct(array $vector, int $vectorType) {}

final static public function fromInt8Array(array $vector): BinaryVector {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think distinct factory names fromInt8Vector, fromFloat32Vector and fromPackedBitVector would make sense if we can provide array types list<int>, list<float>, list<bool|0|1>

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK there's no prior art for using Psalm type hints in stub files. I'm not familiar with how PHPStorm parses the stubs, but would these have a significant benefit for developers? For instance, would they be able to influence Psalm's analysis of an app?

Also, does this mean we prefer factory methods on Binary in lieu of factories on the vector type enum? Either way I expect we'll get a Binary::fromVector() method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants