- 
                Notifications
    You must be signed in to change notification settings 
- Fork 247
fix: Throws an exception when struct type has duplicate keys #2459
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| Codecov Report❌ Patch coverage is  
 Additional details and impacted files@@             Coverage Diff              @@
##               main    #2459      +/-   ##
============================================
+ Coverage     56.12%   58.47%   +2.34%     
- Complexity      976     1440     +464     
============================================
  Files           119      146      +27     
  Lines         11743    13511    +1768     
  Branches       2251     2350      +99     
============================================
+ Hits           6591     7900    +1309     
- Misses         4012     4379     +367     
- Partials       1140     1232      +92     ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
| fieldType, | ||
| Seq(toArrowField("element", elementType, containsNull, timeZoneId)).asJava) | ||
| case StructType(fields) => | ||
| case st @ StructType(fields) => | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have a test case to avoid regression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have a test case to avoid regression?
Sorry for the late reply, I just got back from the Chinese National Day holiday.
I will try to add a test case later.
| This PR does not make comet plan fallback when there is struct type with duplicate keys, but instead throws an exception. As comment on #2457 (comment), we are capable of supporting struct with duplicate keys, so I will mark this PR as draft and look forward to more discussions. | 
Which issue does this PR close?
Closes #2457.
Rationale for this change
In the failed test case of #2444, I found that the struct type with duplicate keys will be deduplicated when converted to arrowType, which causes RowToColumanr to lose some columns.
What changes are included in this PR?
How are these changes tested?
This PR does not avoid the issue, but makes the error message more obvious. It was difficult to create a proper test case for this change until #2444 was merged.
Before this, the failing test case error in #2444 looked like:
after this: