feat(native): Support custom schemas in native sidecar function registry #26236
Open: Joe-Abraham wants to merge 1 commit into prestodb:master from Joe-Abraham:hiveInitcap
13 changes: 13 additions & 0 deletions
presto-native-execution/presto_cpp/main/connectors/hive/CMakeLists.txt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| add_subdirectory(functions) |
19 changes: 19 additions & 0 deletions
presto-native-execution/presto_cpp/main/connectors/hive/functions/CMakeLists.txt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| add_library(presto_hive_functions HiveFunctionRegistration.cpp) | ||
| target_link_libraries(presto_hive_functions presto_dynamic_function_registrar | ||
| velox_functions_string) | ||
|
|
||
| if(PRESTO_ENABLE_TESTING) | ||
| add_subdirectory(tests) | ||
| endif() | ||
36 changes: 36 additions & 0 deletions
...o-native-execution/presto_cpp/main/connectors/hive/functions/HiveFunctionRegistration.cpp
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| /* | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #include "presto_cpp/main/connectors/hive/functions/HiveFunctionRegistration.h" | ||
|
|
||
| #include "presto_cpp/main/connectors/hive/functions/InitcapFunction.h" | ||
| #include "presto_cpp/main/functions/dynamic_registry/DynamicFunctionRegistrar.h" | ||
|
|
||
| using namespace facebook::velox; | ||
| namespace facebook::presto::hive::functions { | ||
|
|
||
| namespace { | ||
| void registerHiveFunctions() { | ||
| // Register functions under the 'hive.default' namespace. | ||
| facebook::presto::registerPrestoFunction<InitCapFunction, Varchar, Varchar>( | ||
| "initcap", "hive.default"); | ||
| } | ||
| } // namespace | ||
|
|
||
| void registerHiveNativeFunctions() { | ||
| static std::once_flag once; | ||
| std::call_once(once, []() { registerHiveFunctions(); }); | ||
| } | ||
|
|
||
| } // namespace facebook::presto::hive::functions |
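The registration entry point above relies on `std::call_once` (declared in `<mutex>`) so that repeated calls register the functions only once. A minimal, self-contained sketch of that pattern follows. The names `registerFunctionsImpl`, `registerFunctionsOnce`, and `registrationCount` are hypothetical stand-ins for illustration, not part of this PR, and the counter replaces the real Velox registration call.

```cpp
#include <mutex>

namespace sketch {

// Counts how many times the registration body actually ran.
inline int& registrationCount() {
  static int count = 0;
  return count;
}

// Hypothetical stand-in for the real registration work.
inline void registerFunctionsImpl() {
  ++registrationCount();
}

// Safe to call repeatedly: std::call_once runs the callable a single time
// for a given once_flag, even when several threads call it concurrently.
inline void registerFunctionsOnce() {
  static std::once_flag once;
  std::call_once(once, [] { registerFunctionsImpl(); });
}

} // namespace sketch
```

Calling `sketch::registerFunctionsOnce()` twice leaves `sketch::registrationCount()` at 1, which is the property a test fixture's `SetUpTestCase` depends on when several suites call the same entry point.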
23 changes: 23 additions & 0 deletions
presto-native-execution/presto_cpp/main/connectors/hive/functions/HiveFunctionRegistration.h
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| /* | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
| #pragma once | ||
|
|
||
| namespace facebook::presto::hive::functions { | ||
|
|
||
| // Registers Hive-specific native functions into the 'hive.default' namespace. | ||
| // This method is safe to call multiple times; it performs one-time registration | ||
| // guarded by an internal call_once. | ||
| void registerHiveNativeFunctions(); | ||
|
|
||
| } // namespace facebook::presto::hive::functions |
52 changes: 52 additions & 0 deletions
presto-native-execution/presto_cpp/main/connectors/hive/functions/InitcapFunction.h
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| /* | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #pragma once | ||
|
|
||
| #include "velox/functions/Macros.h" | ||
| #include "velox/functions/lib/string/StringImpl.h" | ||
|
|
||
| namespace facebook::presto::hive::functions { | ||
|
|
||
| /// The InitCapFunction capitalizes the first character of each word in a | ||
| /// string, and lowercases the rest. | ||
| template <typename T> | ||
| struct InitCapFunction { | ||
| VELOX_DEFINE_FUNCTION_TYPES(T); | ||
|
|
||
| // ASCII input always produces ASCII result. | ||
| static constexpr bool is_default_ascii_behavior = true; | ||
Review comment (resolved): Please remove this unused variable.
||
|
|
||
| FOLLY_ALWAYS_INLINE void call( | ||
| out_type<velox::Varchar>& result, | ||
| const arg_type<velox::Varchar>& input) { | ||
| velox::functions::stringImpl::initcap< | ||
| /*strictSpace=*/false, | ||
| /*isAscii=*/false, | ||
| /*turkishCasing=*/true, | ||
| /*greekFinalSigma=*/true>(result, input); | ||
| } | ||
|
|
||
| FOLLY_ALWAYS_INLINE void callAscii( | ||
| out_type<velox::Varchar>& result, | ||
| const arg_type<velox::Varchar>& input) { | ||
| velox::functions::stringImpl::initcap< | ||
| /*strictSpace=*/false, | ||
| /*isAscii=*/true, | ||
| /*turkishCasing=*/true, | ||
| /*greekFinalSigma=*/true>(result, input); | ||
| } | ||
| }; | ||
|
|
||
| } // namespace facebook::presto::hive::functions | ||
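To make the documented behavior concrete (capitalize the first character of each word, lowercase the rest), here is a simplified ASCII-only sketch. It is not the Velox `stringImpl::initcap` implementation: it assumes words are delimited by any character for which `std::isspace` returns true, and it ignores the Unicode, Turkish-casing, and Greek final-sigma handling selected by the template arguments above. The name `initcapAscii` is hypothetical.

```cpp
#include <cctype>
#include <string>
#include <string_view>

namespace sketch {

// Simplified ASCII-only initcap: uppercase the first character of each
// whitespace-delimited word and lowercase every other character.
inline std::string initcapAscii(std::string_view input) {
  std::string result;
  result.reserve(input.size());
  bool atWordStart = true;
  for (unsigned char c : input) {
    if (std::isspace(c)) {
      // Whitespace is copied through and starts a new word.
      result.push_back(static_cast<char>(c));
      atWordStart = true;
    } else {
      result.push_back(static_cast<char>(
          atWordStart ? std::toupper(c) : std::tolower(c)));
      atWordStart = false;
    }
  }
  return result;
}

} // namespace sketch
```

Under these assumptions, a leading digit counts as the first character of its word, so `"1abc"` is returned unchanged, and tabs or newlines separate words the same way spaces do.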
22 changes: 22 additions & 0 deletions
presto-native-execution/presto_cpp/main/connectors/hive/functions/tests/CMakeLists.txt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| add_executable(presto_hive_functions_test InitcapTest.cpp) | ||
|
|
||
| add_test( | ||
| NAME presto_hive_functions_test | ||
| COMMAND presto_hive_functions_test | ||
| WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}) | ||
|
|
||
| target_link_libraries( | ||
| presto_hive_functions_test presto_hive_functions presto_common | ||
| velox_functions_test_lib GTest::gtest GTest::gtest_main) |
80 changes: 80 additions & 0 deletions
presto-native-execution/presto_cpp/main/connectors/hive/functions/tests/InitcapTest.cpp
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,80 @@ | ||
| /* | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
| #include <gtest/gtest.h> | ||
|
|
||
| #include "presto_cpp/main/connectors/hive/functions/HiveFunctionRegistration.h" | ||
| #include "velox/functions/prestosql/tests/utils/FunctionBaseTest.h" | ||
|
|
||
| namespace facebook::presto::functions::test { | ||
| class InitcapTest : public velox::functions::test::FunctionBaseTest { | ||
| protected: | ||
| static void SetUpTestCase() { | ||
| velox::functions::test::FunctionBaseTest::SetUpTestCase(); | ||
| facebook::presto::hive::functions::registerHiveNativeFunctions(); | ||
| } | ||
| }; | ||
|
|
||
| TEST_F(InitcapTest, initcap) { | ||
| const auto initcap = [&](const std::optional<std::string>& value) { | ||
| return evaluateOnce<std::string>("\"hive.default.initcap\"(c0)", value); | ||
| }; | ||
|
|
||
| // Unicode only. | ||
| EXPECT_EQ( | ||
| initcap("àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ"), | ||
| "Àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ"); | ||
| EXPECT_EQ(initcap("αβγδεζηθικλμνξοπρςστυφχψ"), "Αβγδεζηθικλμνξοπρςστυφχψ"); | ||
| // Mix of ascii and unicode. | ||
| EXPECT_EQ(initcap("αβγδεζ world"), "Αβγδεζ World"); | ||
| EXPECT_EQ(initcap("αfoo wβ"), "Αfoo Wβ"); | ||
| // Ascii only. | ||
| EXPECT_EQ(initcap("hello world"), "Hello World"); | ||
| EXPECT_EQ(initcap("HELLO WORLD"), "Hello World"); | ||
| EXPECT_EQ(initcap("1234"), "1234"); | ||
| EXPECT_EQ(initcap("a b c d"), "A B C D"); | ||
| EXPECT_EQ(initcap("abcd"), "Abcd"); | ||
| // Numbers. | ||
| EXPECT_EQ(initcap("123"), "123"); | ||
| EXPECT_EQ(initcap("1abc"), "1abc"); | ||
| // Edge cases. | ||
| EXPECT_EQ(initcap(""), ""); | ||
| EXPECT_EQ(initcap(std::nullopt), std::nullopt); | ||
|
|
||
| // Test with various whitespace characters | ||
| EXPECT_EQ(initcap("YQ\tY"), "Yq\tY"); | ||
| EXPECT_EQ(initcap("YQ\nY"), "Yq\nY"); | ||
| EXPECT_EQ(initcap("YQ\rY"), "Yq\rY"); | ||
| EXPECT_EQ(initcap("hello\tworld\ntest"), "Hello\tWorld\nTest"); | ||
| EXPECT_EQ(initcap("foo\r\nbar"), "Foo\r\nBar"); | ||
|
|
||
| // Test with multiple consecutive whitespaces | ||
| EXPECT_EQ(initcap("hello world"), "Hello World"); | ||
| EXPECT_EQ(initcap("a b c"), "A B C"); | ||
| EXPECT_EQ(initcap("test\t\tvalue"), "Test\t\tValue"); | ||
| EXPECT_EQ(initcap("line\n\n\nbreak"), "Line\n\n\nBreak"); | ||
|
|
||
| // Test with leading and trailing whitespaces | ||
| EXPECT_EQ(initcap(" hello"), " Hello"); | ||
| EXPECT_EQ(initcap("world "), "World "); | ||
| EXPECT_EQ(initcap(" spaces "), " Spaces "); | ||
| EXPECT_EQ(initcap("\thello"), "\tHello"); | ||
| EXPECT_EQ(initcap("\nworld"), "\nWorld"); | ||
| EXPECT_EQ(initcap("test\n"), "Test\n"); | ||
|
|
||
| // Test with mixed whitespace types | ||
| EXPECT_EQ(initcap("hello \t\nworld"), "Hello \t\nWorld"); | ||
| EXPECT_EQ(initcap("a\tb\nc\rd"), "A\tB\nC\rD"); | ||
| EXPECT_EQ(initcap(" \t\n "), " \t\n "); | ||
| } | ||
| } // namespace facebook::presto::functions::test |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -265,7 +265,7 @@ json buildWindowMetadata( | |
|
|
||
| } // namespace | ||
|
|
||
| json getFunctionsMetadata() { | ||
| json getFunctionsMetadata(const std::optional<std::string>& catalog) { | ||
| json j; | ||
|
|
||
| // Get metadata for all registered scalar functions in velox. | ||
|
|
@@ -285,6 +285,10 @@ json getFunctionsMetadata() { | |
| } | ||
|
|
||
| const auto parts = getFunctionNameParts(name); | ||
| // Skip if catalog filter is specified and doesn't match | ||
| if (catalog.has_value() && parts[0] != catalog.value()) { | ||
Review comment (resolved): Abstract a lambda for this check.
||
| continue; | ||
| } | ||
| const auto schema = parts[1]; | ||
| const auto function = parts[2]; | ||
|
||
| j[function] = buildScalarMetadata(name, schema, entry.second); | ||
|
|
@@ -295,6 +299,10 @@ json getFunctionsMetadata() { | |
| if (!aggregateFunctions.at(entry.first).metadata.companionFunction) { | ||
| const auto name = entry.first; | ||
| const auto parts = getFunctionNameParts(name); | ||
| // Skip if catalog filter is specified and doesn't match | ||
| if (catalog.has_value() && parts[0] != catalog.value()) { | ||
| continue; | ||
| } | ||
| const auto schema = parts[1]; | ||
| const auto function = parts[2]; | ||
| j[function] = | ||
|
|
@@ -309,6 +317,10 @@ json getFunctionsMetadata() { | |
| if (aggregateFunctions.count(entry.first) == 0) { | ||
| const auto name = entry.first; | ||
| const auto parts = getFunctionNameParts(entry.first); | ||
| // Skip if catalog filter is specified and doesn't match | ||
| if (catalog.has_value() && parts[0] != catalog.value()) { | ||
| continue; | ||
| } | ||
| const auto schema = parts[1]; | ||
| const auto function = parts[2]; | ||
| j[function] = buildWindowMetadata(name, schema, entry.second.signatures); | ||
|
|
||
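The same catalog check appears in the scalar, aggregate, and window loops of this diff, and a reviewer asked for it to be abstracted into a lambda. One way that could look is sketched below, outside of Velox and Presto. `splitNameParts` and `selectFunctions` are hypothetical helpers for illustration only; the sketch assumes fully qualified names of the form `catalog.schema.function` and does not reproduce the real `getFunctionNameParts` or the JSON metadata builders.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical helper: splits "catalog.schema.function" on '.'.
inline std::vector<std::string> splitNameParts(const std::string& name) {
  std::vector<std::string> parts;
  std::istringstream stream(name);
  std::string part;
  while (std::getline(stream, part, '.')) {
    parts.push_back(part);
  }
  return parts;
}

// Returns the unqualified names of functions that pass the catalog filter.
inline std::vector<std::string> selectFunctions(
    const std::vector<std::string>& qualifiedNames,
    const std::optional<std::string>& catalog) {
  // The filter is defined once and can be reused by every loop.
  const auto skipCatalog = [&catalog](const std::string& functionCatalog) {
    return catalog.has_value() && functionCatalog != catalog.value();
  };

  std::vector<std::string> selected;
  for (const auto& name : qualifiedNames) {
    const auto parts = splitNameParts(name);
    // Ignore names that are not three-part, and names in other catalogs.
    if (parts.size() != 3 || skipCatalog(parts[0])) {
      continue;
    }
    selected.push_back(parts[2]);
  }
  return selected;
}

} // namespace sketch
```

With no catalog supplied, every well-formed name is kept. With `"hive"` supplied, only functions registered under the `hive` catalog remain, which mirrors the `catalog.has_value() && parts[0] != catalog.value()` condition in the diff.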