-
Notifications
You must be signed in to change notification settings - Fork 3.9k
GH-45063: [C++] Upgrade vendored grpc, protobuf, abseil #47503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format? or See also: |
raulcd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great work here, there seems to be some CI issues for some specific builds, seems to be related to some missing targets
| set_property(TARGET absl::bad_optional_access | ||
| PROPERTY INTERFACE_LINK_LIBRARIES absl::config absl::raw_logging_internal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this removal be the reason for some of the CI failures?
CMake Error at cmake_modules/BuildUtils.cmake:142 (file):
Error evaluating generator expression:
$<TARGET_FILE:absl::bad_optional_access>
Target "absl::bad_optional_access" is not an executable or library.
Call Stack (most recent call first):
src/arrow/CMakeLists.txt:1030 (arrow_create_merged_static_lib)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bad_optional_access doesn't exist in this updated version of absl, but it looks like I missed removing the declaration of it even though I removed this link libraries. Not sure if the script didn't catch it or if I made a mistake, but I'll fully remove it, hopefully that fixes the issue
|
@raulcd I think I fixed the issue, can you run the CI again? |
|
@raulcd Looking at the failure for C GLib & Ruby / AMD64 Ubuntu 22.04 GLib & Ruby (pull_request), I see this:
It seems that the Ubuntu build uses a verison of nlohmann_json which is installed on the local machine (in |
|
The failure from AMD64 Ubuntu 22.04 C++ ASAN UBSAN looks like it's a failure of |
|
The failure from AMD64 Windows 2022 AVX2 says that it's failing to install the patch for ORC. I don't think anything in this change should affect that, and the patch works in other builds |
|
I'm not entirely sure how to make sense of the R failures because I have no familiarity with R. Does the R library link to the cpp library in some way? |
raulcd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I haven't had time to come back here. The ASAN, Windows AVX2 and CGLib & Ruby failures are fixed on main. A rebase would should fix them.
The R failures seem related, I can see these logs:
g++ -shared -s -static-libgcc -o arrow.dll tmp.def RTasks.o altrep.o array.o array_to_vector.o arraydata.o arrowExports.o bridge.o buffer.o chunkedarray.o compression.o compute-exec.o compute.o config.o csv.o dataset.o datatype.o expression.o extension-impl.o feather.o field.o filesystem.o io.o json.o memorypool.o message.o parquet.o r_to_arrow.o recordbatch.o recordbatchreader.o recordbatchwriter.o safe-call-into-r-impl.o scalar.o schema.o symbols.o table.o threadpool.o type_infer.o -L../windows/arrow-21.0.0.9000/lib-14.2.0/x64 -L../windows/arrow-21.0.0.9000/lib/x64-ucrt -larrow_dataset -larrow_acero -lparquet -larrow_compute -larrow -larrow_bundled_dependencies -lutf8proc -lsnappy -lz -lzstd -llz4 -lbz2 -lbrotlienc -lbrotlidec -lbrotlicommon -lole32 -lbcrypt -lpsapi -lcrypto -lcrypt32 -lre2 -luserenv -lversion -lws2_32 -lbcrypt -lwininet -lwinhttp -lsecur32 -lshlwapi -lncrypt -lcurl -lnormaliz -lssh2 -lgdi32 -lssl -lcrypto -lcrypt32 -lwldap32 -lz -lws2_32 -lnghttp2 -ldbghelp -LC:/rtools45/x86_64-w64-mingw32.static.posix/lib/x64 -LC:/rtools45/x86_64-w64-mingw32.static.posix/lib -LC:/R/bin/x64 -lR
C:\rtools45\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-21.0.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(crc32c.cc.obj):crc32c.cc:(.text+0x33e): undefined reference to `absl::lts_20250512::ConcatCrc32c(absl::lts_20250512::crc32c_t, absl::lts_20250512::crc32c_t, unsigned long long)'
C:\rtools45\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-21.0.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(crc32c.cc.obj):crc32c.cc:(.text+0x3e4): undefined reference to `absl::lts_20250512::ConcatCrc32c(absl::lts_20250512::crc32c_t, absl::lts_20250512::crc32c_t, unsigned long long)'
C:\rtools45\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-21.0.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(crc32c.cc.obj):crc32c.cc:(.text+0x419): undefined reference to `absl::lts_20250512::ConcatCrc32c(absl::lts_20250512::crc32c_t, absl::lts_20250512::crc32c_t, unsigned long long)'
C:\rtools45\x86_64-w64-mingw32.static.posix\bin/ld.exe: ../windows/arrow-21.0.0.9000/lib/x64-ucrt/libarrow_bundled_dependencies.a(crc32c.cc.obj):crc32c.cc:(.text+0x433): undefined reference to `absl::lts_20250512::ConcatCrc32c(absl::lts_20250512::crc32c_t, absl::lts_20250512::crc32c_t, unsigned long long)'
collect2.exe: error: ld returned 1 exit status
no DLL was created
ERROR: compilation failed for package 'arrow'
There seems to be some undefined references for absl crc32?
|
@raulcd ok, I synced my branch to main, so hopefully that will fix those failures For the R failure, I think it may have something to do with the fact that the GCS library calls absl::ConcatCrc32c but doesn't link absl::crc32c. But I'm confused about why that would only come up in the R build. I went ahead and linked it to GCS, hopefully that fixes it. If it doesn't, could you give me an idea of how I can recreate these R failures locally so I can try to fix them? I'm not sure what sort of build I'd need to run |
|
@raulcd could you run the presubmits again? |
|
@raulcd Looking at the Ubuntu C++ failures, and the Ubuntu R failure, there's a linkage error for When I look at the logs for these tests, I see this: And notably I don't see any log like I think what's going on is that the machines these tests are running on have an old version of absl installed and are using that instead of building it from source, which is causing the linkage to fail. If you look at the logs for successful tests, you can see that they are either finding a newer version of absl from Do you know if there's any way to configure these tests to always build absl from source? Or if there's something I need to add to |
|
Those jobs are using system abseil. The official abseil for Ubuntu 24.04 should be supported we could force upgrading abseil on Ubuntu 22.04 if necessary but I think we should revert 5e28bb2 as this didn't seem to fix the R job failures and is causing other issues. |
|
This bug is about upgrading vendored grpc, which needs an upgraded version of absl to work, which is why it also upgrades the vendored absl. Shouldn't the tests reflect that by using the vendored absl downloaded form source instead of the system absl? |
|
Hmm. Can we upgrade one by one? For example, PR1 upgrades Abesil -> PR2 upgrades Protocol Buffers -> PR3 upgrades gRPC. We want to migrate to FetchContent from ExternalProject. We don't want to maintain ExternalProject for Abseil... |
|
@kou I think splitting them into separate PRs would be a problem, because there are some ways that these dependencies only work together when they're updated together. For example, some update to absl over the years removes certain absl libraries, like |
|
Can we revisit this after #48075 and related PRs are merged? |
Rationale for this change
The currently bundled version of GRPC is several years out of date
What changes are included in this PR?
There's a few things going on here so let me explain:
-DCMAKE_CXX_EXTENSIONS=OFFto the common cmake args. I think this is actually intended to be the case anyway, based on this, but that wasn't propagating to the ThirdPartyToolchain build. This was causing all bundled libraries to default to building withstd=gnu++17, whereas the arrow libraries were being build withstd=c++17, which was causing runtime issues with the new version of abslbuild_opentelemetrymacro. It isn't needed anymore as of Support protobuf 3.22 or upper open-telemetry/opentelemetry-cpp#2163, and https://github.com/apache/arrow/pull/46509/files#diff-39a645630afbfb1702af73bcb3fcdb13d87be3d78fcf501f2dcf93000f4aa738 bumped it to the version with that fixI think that's everything. Github is weirdly showing a bunch of things that I didn't actually touch in the diff, I think it maybe has something to do with the fact that I moved some of the macros higher up in the file. If you open it in VScode it shows a much more sensible diff
Are these changes tested?
I ran the ORC tests, Flight tests, and Telemetry tests, since those seem to the parts of the repo that have dependencies on grpc, protobuf, and abseil. Everything passed. Let me know if I need to do anything else
Are there any user-facing changes?
No