-
Notifications
You must be signed in to change notification settings - Fork 174
added default values for field decoding (#788) #792
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added default values for field decoding (#788) #792
Conversation
class ExtraFieldSupported: | ||
a: int | ||
b: int | ||
c: list[int] = field(default_factory=list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is using an explicit default value - which is not an auto default case we start to support by this PR. I think we want to test with the following cases: a field with int | None
type, and a field with list[Base]
type (as LTable), and a field with dict[str, Base]
type (as KTable), and all shouldn't have explicit default value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dataclass
class Base:
a: int
@dataclass
class NullableField:
a: int
b: int | None
@dataclass
class LTableField:
a: int
b: list[Base]
@dataclass
class KTableField:
a: int
b: dict[str, Base]
@dataclass
class UnsupportedField:
a: int
b: int
would these dataclasses be fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These look good. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the KtableField I am getting an error
TypeError: 'dict' object cannot be converted to 'Sequence' when using validate_full_roundtrip(KTableField(1, {}), KTableField) Can you guys help me out with this. Attaching the dumps below.
=================================== FAILURES ===================================
____________ test_auto_default_for_supported_and_unsupported_types _____________
def test_auto_default_for_supported_and_unsupported_types() -> None:
@dataclass
class Base:
a: int
@dataclass
class NullableField:
a: int
b: int | None
@dataclass
class LTableField:
a: int
b: list[Base]
@dataclass
class KTableField:
a: int
b: dict[str, Base]
@dataclass
class UnsupportedField:
a: int
b: int
engine_val = [1]
validate_full_roundtrip(NullableField(1, None), NullableField)
validate_full_roundtrip(LTableField(1, []), LTableField)
decoder = build_engine_value_decoder(KTableField)
result = decoder(engine_val)
assert result == KTableField(1, {})
validate_full_roundtrip(KTableField(1, {}), KTableField)
python/cocoindex/tests/test_convert.py:1508:
python/cocoindex/tests/test_convert.py:122: in validate_full_roundtrip
validate_full_roundtrip_to(
value = test_auto_default_for_supported_and_unsupported_types..KTableField(a=1, b={})
value_type = <class 'cocoindex.tests.test_convert.test_auto_default_for_supported_and_unsupported_types..KTableField'>
decoded_values = ((test_auto_default_for_supported_and_unsupported_types..KTableField(a=1, b={}), <class 'cocoindex.tests.test_convert.test_auto_default_for_supported_and_unsupported_types..KTableField'>),)
_engine = <module 'cocoindex._engine' from '/home/kushal/Desktop/Open-Source/cocoindex/python/cocoindex/_engine.cpython-312-x86_64-linux-gnu.so'>
eq = <function validate_full_roundtrip_to..eq at 0x7d7a6d6d9940>
encoded_value = [1, {}]
def validate_full_roundtrip_to(
value: Any,
value_type: Any,
*decoded_values: tuple[Any, Any],
) -> None:
"""
Validate the given value becomes specific values after encoding, sending to engine (using output_type), receiving back and decoding (using input_type).
`decoded_values` is a tuple of (value, type) pairs.
"""
from cocoindex import _engine # type: ignore
def eq(a: Any, b: Any) -> bool:
if isinstance(a, np.ndarray) and isinstance(b, np.ndarray):
return np.array_equal(a, b)
return type(a) is type(b) and not not (a == b)
encoded_value = encode_engine_value(value)
value_type = value_type or type(value)
encoded_output_type = encode_enriched_type(value_type)["type"]
value_from_engine = _engine.testutil.seder_roundtrip(
encoded_value, encoded_output_type
)
E TypeError: 'dict' object cannot be converted to 'Sequence'
python/cocoindex/tests/test_convert.py:101: TypeError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. The test case is validate_full_roundtrip(KTableField(1, {}), KTableField)
. Here the empty dict is being passed. When encoding struct fields that contained empty dicts (KTables), the function was preserving them as {}
, but the expected engine format for KTables was []
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you both! Since fix for the issue (PR #807) may still need more time to be resolved, we can disable this specific test case (e.g. comment out the specific line) and merge the PR first. After the fix is in, we can re-enable it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks guys, I've commented out the line for now and have added a test case just for the decoder.
python/cocoindex/convert.py
Outdated
default_value = param.default | ||
if default_value is not inspect.Parameter.empty: | ||
return lambda _: default_value | ||
|
||
auto_default = _handle_missing_field_with_auto_default(param, name, field_path) | ||
return lambda _: auto_default | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these lines need to be indented under the with ChildFieldPath(field_path, f".{name}"):
, because it's for this specific field (the .{name}
need to be included in error messages)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have addressed this. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All look good! Thanks for the PR!
fixed #788
Overview
Implements robust auto-defaulting for missing fields in the CocoIndex engine value decoder, improving error handling and data conversion flexibility. This addresses #788.
Changes
LTable
/KTable
) are auto-defaulted.Key Features
Auto-Default Rules
Optional[T]
/T | None
None
LTable
[]
KTable
{}
Enhanced Error Handling
Implementation Details
_get_auto_default_for_type()
helper:auto_default_missing_fields
parameter.Files Modified
python/cocoindex/convert.py
: Core implementation and error/warning improvementspython/cocoindex/tests/test_convert.py
: New tests for auto-defaulting and error reporting