friendlymatthew (Contributor) commented Oct 21, 2025

Which issue does this PR close?

Rationale for this change

This PR preserves the typed_value's Field metadata. This way, we can check for extension types.

@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Oct 21, 2025
Comment on lines +940 to +952
pub fn with_column(mut self, field_name: &str, array: ArrayRef, nullable: bool) -> Self {
    let field = Field::new(field_name, array.data_type().clone(), nullable);
    self.fields.push(Arc::new(field));
    self.arrays.push(array);
    self
}

pub fn with_field(mut self, field: FieldRef, array: ArrayRef) -> Self {
    self.fields.push(field);
    self.arrays.push(array);
    self
}

Contributor Author

I don't love the naming here

Contributor

This is a convenience method, right?

We already have the data type (from the array), so maybe we just need to pass optional metadata, making this similar to the Field constructor? Or, if that's too disruptive for existing callers, with_column_and_metadata?
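
A minimal sketch of the suggestion above, under stated assumptions: `ColumnBuilder`, `DataType`, `Field`, and `Array` are stand-in types so the example compiles without the arrow crates, and `with_column_and_metadata` is the hypothetical name floated in this comment, not an existing method. The data type is always taken from the array, so only the metadata is supplied by the caller.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-ins for arrow's DataType, Field, and ArrayRef (assumptions, not the real types).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    FixedSizeBinary(i32),
}

#[derive(Debug)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    metadata: HashMap<String, String>,
}

#[derive(Debug)]
struct Array {
    data_type: DataType,
}
type ArrayRef = Arc<Array>;

// Hypothetical builder mirroring the `fields` / `arrays` pair in the diff.
#[derive(Default)]
struct ColumnBuilder {
    fields: Vec<Arc<Field>>,
    arrays: Vec<ArrayRef>,
}

impl ColumnBuilder {
    /// Existing-style convenience method: no metadata.
    fn with_column(self, name: &str, array: ArrayRef, nullable: bool) -> Self {
        self.with_column_and_metadata(name, array, nullable, HashMap::new())
    }

    /// Hypothetical variant: the data type still comes from the array,
    /// so the field and the array cannot disagree.
    fn with_column_and_metadata(
        mut self,
        name: &str,
        array: ArrayRef,
        nullable: bool,
        metadata: HashMap<String, String>,
    ) -> Self {
        let field = Field {
            name: name.to_string(),
            data_type: array.data_type.clone(),
            nullable,
            metadata,
        };
        self.fields.push(Arc::new(field));
        self.arrays.push(array);
        self
    }
}

fn main() {
    let metadata = HashMap::from([(
        "ARROW:extension:name".to_string(),
        "arrow.uuid".to_string(),
    )]);
    let builder = ColumnBuilder::default()
        .with_column("value", Arc::new(Array { data_type: DataType::Int32 }), true)
        .with_column_and_metadata(
            "typed_value",
            Arc::new(Array { data_type: DataType::FixedSizeBinary(16) }),
            true,
            metadata,
        );
    assert_eq!(builder.fields.len(), 2);
    assert_eq!(builder.fields[1].data_type, DataType::FixedSizeBinary(16));
    println!("{:?}", builder.fields[1]);
}
```

Existing `with_column` callers are unaffected in this shape, since the old method just forwards with empty metadata.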

    index: usize,
) -> Variant<'a, 'a> {
    let data_type = typed_value.data_type();
    let (_typed_value_field, typed_value_column) = typed_value;

Contributor Author

Now that we have the field information, we can check for extension types like: e21dc1b
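
As a rough illustration of what the preserved field metadata makes possible (this is a sketch, not the contents of the referenced commit): Arrow records an extension type's name in field metadata under the `ARROW:extension:name` key, so a lookup on that key is enough to detect one. `Field` here is a stand-in struct reduced to its metadata map, and `extension_type_name` is a helper written for this example.

```rust
use std::collections::HashMap;

/// Metadata key under which Arrow records an extension type's name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// Stand-in for arrow's `Field`, reduced to the metadata map (an assumption).
struct Field {
    metadata: HashMap<String, String>,
}

/// Returns the extension type name recorded on the field, if any.
fn extension_type_name(field: &Field) -> Option<&str> {
    field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str)
}

fn main() {
    let uuid_field = Field {
        metadata: HashMap::from([(
            EXTENSION_NAME_KEY.to_string(),
            "arrow.uuid".to_string(),
        )]),
    };
    let plain_field = Field { metadata: HashMap::new() };

    assert_eq!(extension_type_name(&uuid_field), Some("arrow.uuid"));
    assert_eq!(extension_type_name(&plain_field), None);
}
```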

friendlymatthew (Contributor Author) commented Oct 21, 2025

cc @scovich @alamb @klion26

scovich (Contributor) left a comment

Thanks for taking a stab at this. I'm not immediately sure how to react, so I just left a couple high level comments while I stew on it more.

Comment on lines 401 to +403
.typed_value_field()
.unwrap()
.1

Contributor

Just glancing at this code in isolation, I would guess that .0 is the field and .1 is the array?
But that requires knowing context (i.e., of this PR).

If this usage will show up often, is it worth returning a newtype instead of a tuple?
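
One possible shape for that, sketched with stand-in `Field` and `Array` types so it compiles on its own. `TypedValue` and its accessors are hypothetical names, not something this PR defines.

```rust
use std::sync::Arc;

// Stand-ins for arrow's Field and Array types; only what the sketch needs.
#[derive(Debug)]
struct Field {
    name: String,
}

#[derive(Debug)]
struct Array {
    len: usize,
}

type FieldRef = Arc<Field>;
type ArrayRef = Arc<Array>;

/// Hypothetical replacement for the `(FieldRef, ArrayRef)` tuple.
#[derive(Debug, Clone)]
struct TypedValue {
    field: FieldRef,
    array: ArrayRef,
}

impl TypedValue {
    fn new(field: FieldRef, array: ArrayRef) -> Self {
        Self { field, array }
    }

    /// The field describing the typed_value column, including its metadata.
    fn field(&self) -> &FieldRef {
        &self.field
    }

    /// The typed_value column itself.
    fn array(&self) -> &ArrayRef {
        &self.array
    }
}

fn main() {
    let typed_value = Some(TypedValue::new(
        Arc::new(Field { name: "typed_value".to_string() }),
        Arc::new(Array { len: 3 }),
    ));

    // The call site names what it wants instead of relying on `.0` / `.1`.
    let array = typed_value.as_ref().unwrap().array();
    assert_eq!(array.len, 3);
    assert_eq!(typed_value.as_ref().unwrap().field().name, "typed_value");
}
```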

Comment on lines +297 to +308
let typed_value = if let Some(typed_value_array) = typed_value.clone() {
    let field_ref = Arc::new(Field::new(
        "typed_value",
        typed_value_array.data_type().clone(),
        true,
    ));
    builder = builder.with_field(field_ref.clone(), typed_value_array.clone());

    Some((field_ref, typed_value_array))
} else {
    None
};

Contributor

nit: this is just Option::map (no ? to complicate things)
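
A sketch of what the `Option::map` form could look like. The types are stand-ins for the arrow ones, and `Builder` and `attach_typed_value` are hypothetical names for this example. Because `with_field` takes `self` by value, the sketch keeps the builder update outside the closure and lets `map` only construct the pair.

```rust
use std::sync::Arc;

// Stand-ins for the arrow types used in the diff (assumptions).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
}

#[derive(Debug)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

#[derive(Debug)]
struct Array {
    data_type: DataType,
}

type FieldRef = Arc<Field>;
type ArrayRef = Arc<Array>;

#[derive(Default)]
struct Builder {
    fields: Vec<FieldRef>,
    arrays: Vec<ArrayRef>,
}

impl Builder {
    fn with_field(mut self, field: FieldRef, array: ArrayRef) -> Self {
        self.fields.push(field);
        self.arrays.push(array);
        self
    }
}

/// Pairs an optional typed_value array with a field derived from it,
/// then registers both on the builder.
fn attach_typed_value(
    builder: Builder,
    typed_value: Option<ArrayRef>,
) -> (Builder, Option<(FieldRef, ArrayRef)>) {
    // `map` replaces the `if let Some(..) { Some(..) } else { None }` shape.
    let typed_value = typed_value.map(|array| {
        let field = Arc::new(Field {
            name: "typed_value".to_string(),
            data_type: array.data_type.clone(),
            nullable: true,
        });
        (field, array)
    });

    let builder = match &typed_value {
        Some((field, array)) => builder.with_field(Arc::clone(field), Arc::clone(array)),
        None => builder,
    };
    (builder, typed_value)
}

fn main() {
    let array = Arc::new(Array { data_type: DataType::Int64 });
    let (builder, typed_value) = attach_typed_value(Builder::default(), Some(array));
    assert_eq!(builder.fields.len(), 1);
    println!("{:?}", typed_value);
}
```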

Comment on lines +660 to +668
let typed_value = if let Some(typed_value_array) = typed_value.clone() {
    let field_ref = Arc::new(Field::new(
        "typed_value",
        typed_value_array.data_type().clone(),
        true,
    ));
    builder = builder.with_field(field_ref.clone(), typed_value_array.clone());

    Some((field_ref, typed_value_array))

Contributor

Because of the dual typing (one in the field and one in the array), this code is technically no longer infallible -- the data types could disagree. I'm not sure of the best way to address that as long as we're passing a full-blown field.

Contributor

Also, this logic is duplicated with the variant array constructor above.
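
One way both points in this thread might be handled together is a single shared helper that either derives the field from the array or rejects a caller-supplied field whose data type disagrees. This is only a sketch: the types are stand-ins for the arrow ones, `pair_typed_value` is a hypothetical name, and the `String` error stands in for whatever error type the crate would actually use.

```rust
use std::sync::Arc;

// Stand-ins for the arrow types (assumptions).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

#[derive(Debug)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

#[derive(Debug)]
struct Array {
    data_type: DataType,
}

type FieldRef = Arc<Field>;
type ArrayRef = Arc<Array>;

/// Hypothetical shared helper for both constructors.
/// With no field, one is derived from the array, so the types always agree.
/// With a field, the pairing fails if the data types differ.
fn pair_typed_value(
    field: Option<FieldRef>,
    array: ArrayRef,
) -> Result<(FieldRef, ArrayRef), String> {
    let field = match field {
        Some(field) if field.data_type != array.data_type => {
            return Err(format!(
                "typed_value field is {:?} but array is {:?}",
                field.data_type, array.data_type
            ));
        }
        Some(field) => field,
        None => Arc::new(Field {
            name: "typed_value".to_string(),
            data_type: array.data_type.clone(),
            nullable: true,
        }),
    };
    Ok((field, array))
}

fn main() {
    let array = Arc::new(Array { data_type: DataType::Int64 });

    // Derived field: cannot disagree with the array.
    let (field, _) = pair_typed_value(None, Arc::clone(&array)).unwrap();
    assert_eq!(field.data_type, DataType::Int64);

    // Caller-supplied field with the wrong type is rejected.
    let wrong = Arc::new(Field {
        name: "typed_value".to_string(),
        data_type: DataType::Utf8,
        nullable: true,
    });
    assert!(pair_typed_value(Some(wrong), array).is_err());
}
```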

friendlymatthew (Contributor Author)

> Thanks for taking a stab at this. I'm not immediately sure how to react, so I just left a couple high level comments while I stew on it more.

I think one thing I'm debating is the coupling between the typed_value array and its field. If the field is only optional metadata, then maybe we can split the array and the field into separate optional fields.
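
A sketch of what that split might look like, purely as an assumption for discussion: `TypedValueState` is a hypothetical struct, and `DataType` and `Array` are stand-ins for the arrow types. Storing only the metadata next to the array leaves the array as the single source of the data type.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-ins for the arrow types (assumptions).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    FixedSizeBinary(i32),
}

#[derive(Debug)]
struct Array {
    data_type: DataType,
}

type ArrayRef = Arc<Array>;

/// Hypothetical layout: the array and its metadata are stored independently.
#[derive(Debug, Default)]
struct TypedValueState {
    array: Option<ArrayRef>,
    metadata: Option<HashMap<String, String>>,
}

impl TypedValueState {
    /// The data type has one source of truth: the array.
    fn data_type(&self) -> Option<&DataType> {
        self.array.as_ref().map(|array| &array.data_type)
    }

    /// Extension type name, if metadata is present and records one.
    fn extension_name(&self) -> Option<&str> {
        self.metadata
            .as_ref()?
            .get("ARROW:extension:name")
            .map(String::as_str)
    }
}

fn main() {
    let state = TypedValueState {
        array: Some(Arc::new(Array { data_type: DataType::FixedSizeBinary(16) })),
        metadata: Some(HashMap::from([(
            "ARROW:extension:name".to_string(),
            "arrow.uuid".to_string(),
        )])),
    };
    assert_eq!(state.data_type(), Some(&DataType::FixedSizeBinary(16)));
    assert_eq!(state.extension_name(), Some("arrow.uuid"));
}
```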
