Use ATTACH maps for array-sections/subscripts on pointers. #1
base: tgt-capture-mapped-ptrs-by-ref
The libomptarget code will disappear from this PR once llvm#149036 is merged.
    const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr,
    ArrayRef<OMPClauseMappableExprCommon::MappableExprComponentListRef>
        OverlappedElements = {},
    bool AreBothBasePtrAndPteeMapped = false) const {
`AreBothBasePtrAndPteeMapped` was used to decide whether to use `PTR_AND_OBJ` maps for something like `map(p, p[0])`. We don't do that now, since we map them independently and attach them separately.
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [actions/upload-artifact](https://redirect.github.com/actions/upload-artifact) | action | major | `v4.6.2` -> `v5.0.0` |
Reducing spurious diff in an upcoming change.
This moves a call inside an assert to avoid a warning about the result variable being unused in release builds.
This reverts commit e719e93.

Revert this since it caused a regression in our internal CI.

Deduction guides with host/device attrs have already been used in https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocrand/library/src/rng/utils/cpp_utils.hpp#L249

```
template<class V>
__host__ __device__ vec_wrapper(V) -> vec_wrapper<V>;
```
…0358) These two are both incredibly similar and simple, basically identical to 'seq'. This patch adds them both together.
Adding the following dependencies to PluginScriptedProcess:
- "//lldb:CoreHeaders",
- "//lldb:SymbolHeaders",
- "//llvm:Support",

For c50802c
This upstreams the handler for the BI__builtin_constant_p function.
…160525) Co-authored-by: Alexander Kornienko <[email protected]> Co-authored-by: Louis Dionne <[email protected]>
Commit b262785 introduced a separate `AnalysisFpExc` target to try to workaround the lack of a bazel equivalent of single source file properties. However, this introduces backref errors when `--warn-backrefs` is enabled. This change alternatively just adds the `-ftrapping-math` copt to the entire `Analysis` target. Fix suggested by @rocallahan.
…del (llvm#168270) The VPlan-based cost model assigns the forced cost once for a whole VPInterleaveRecipe. Update the legacy cost model to match this behavior. This fixes a cost-model divergence, and assigns the cost in a way that matches the generated code more accurately. PR: llvm#168270
This clause is pretty small/trivial and is a simple 'set a bool' value on the IR node, so its implementation is quite simple. We create the Operation with this as 'false', so the 'nohost' marks it as true always.
Remove a redundant duplicated computeCost call. NFC, just skipping an unneeded call.
Shared memory for TMA operations needs to be aligned to 16. Add the ability to set an alignment on the cuf.shared_memory operation.
Add more tests for follow-up to llvm#169576.
…lvm#170350) Updates `InitializeRequestArguments` to correctly follow the spec, see https://microsoft.github.io/debug-adapter-protocol/specification#Requests_Initialize. This should correct which fields are tracked as optional and simplifies some of the types to make sure they're meaningful (e.g. an `optional<bool>` isn't anymore helpful than a `bool` since undefined and false are basically equivalent and it requires us to handle interpreting undefined as the default value in all the places we use the `optional<bool>`).
This change fixes a couple of issues with static resources:
- Enables assignment to static resource or resource array variables (fixes llvm#166458)
- Initializes static resources and resource arrays with default constructor that sets the handle to poison
…m#170375) src and dst pointers need to have an address cast
llvm#170265)
* Add compatibility support for DP and REPORT macros
* Define a set of predefined Debug Type for libomptarget
* Start to update libomptarget files (OffloadRTL.cpp, device.cpp)
…lure (llvm#169918) Use standard GlobalISel error reporting with reportGISelFailure and pass returning false instead of llvm_unreachable. Also enables -global-isel-abort=0 or 2 for -global-isel -new-reg-bank-select. Note: new-reg-bank-select with abort 0 or 2 runs LCSSA, while "intended use" without abort or with abort 1 does not run LCSSA.
…FC (llvm#132364) Adding some new test cases (including FIXME:s) to highlight some bugs related to lowering of llvm.objectsize. One special case is when there are getelementptr instruction with index types that are larger than the index type size for the pointer being analysed. This will add a couple of tests to show what happens both when using a smaller and larger index type, and when having out-of-bounds indices (both too large and negative).
…8281) This PR is a follow up to llvm#167975 and replaces calls to trivial copy constructors with `cir::CopyOp`. --------- Co-authored-by: Andy Kaylor <[email protected]> Co-authored-by: Henrich Lauko <[email protected]>
Reverts llvm#154069. I pointed out a number of issues post-merge, most importantly examples of miscompiles: llvm#154069 (comment). While the motivation of the change is clear, I think the implementation approach is flawed. It seems like the goal is to allow elements like `load <2xi16>` and `load i32` to be vectorized together despite the current algorithm not grouping them into the same equivalence classes. I personally think that if we want to attempt this it should be a more wholistic approach, maybe even redefining the concept of an equivalence class. This current solution seems like it would be really hard to do bug-free, and even if the bugs were not present, it is only able to merge chains that happen to be adjacent to each other after `splitChainByContiguity`, which seems like it is leaving things up to chance whether this optimization kicks in. But we can discuss more in the re-land. Maybe the broader approach I'm proposing is too difficult, and a narrow optimization is worthwhile. Regardless, this should be reverted, it needs more iteration before it is correct.
This PR is part of llvm#167752. It upstreams the codegen and tests for the shuffle builtins implemented in the incubator, including:
- `vinsert` + `insert`
- `pblend` + `blend`
- `vpermilp`
- `pshuf` + `shufp`
- `palignr`

It does NOT upstream the `perm`, `vperm2`, `vpshuf`, `shuf_i` / `shuf_f` and `align` builtins, which are not yet implemented in the incubator.

This _is_ a large commit, but most of it is tests.

The `pshufd` / `vpermilp` builtins seem to have no test coverage in the incubator, what should I do?
We were not marking the `.cfi.jumptable` functions as `naked` on windows. The referenced bug (https://llvm.org/bugs/show_bug.cgi?id=28641#c3) appears to be fixed:

```bash
build/bin/opt -S -passes=lowertypetests -mtriple=i686-pc-win32 llvm/test/Transforms/LowerTypeTests/function.ll | build/bin/llc -O0
```

```
L_.cfi.jumptable:                       # @.cfi.jumptable
# %bb.0:                                # %entry
	#APP
	jmp	_f.cfi@PLT
	int3
	int3
	int3
	#NO_APP
	#APP
	jmp	_g.cfi@PLT
	int3
	int3
	int3
	#NO_APP
                                        # -- End function
	.section	.rdata,"dr"
	.p2align	4, 0x0                          # @0
```

Not seeing the spilled registers described in the bug anymore.
…accept `const T *` arguments when the key is `T *` (llvm#170377) Also use `is_contained` to implement `contains`, since this tries the `contains` member function of the set type first.
…68915) fixes llvm#168737 fixes llvm#168755 This change adds support for Matrix truncations via the ICK_HLSL_Matrix_Truncation enum. That ends up being most of the files changed. It also allows Matrix as an HLSL Elementwise cast as long as the cast does not perform a shape transformation, i.e. 3x2 to 2x3. Tests for the new elementwise and truncation behavior were added, as well as sema tests to make sure we error on the shape transformation cast. I am punting right now on the ConstExpr Matrix support. That will need to be addressed later. Will file a separate issue for that if reviewers agree it can wait.
…#170373) Fixes: SWDEV-563886
…on-using-attach-maptype
This is the initial clang change to support using `ATTACH` map-type for pointer-attachment.

This builds upon the following:
- `target` by reference. llvm/llvm-project#145454

For example, for the following:
The following maps are now emitted by clang:
Previously, the two possible maps emitted by clang were:
(B) does not perform any pointer attachment, while (C) also maps the
pointer p, both of which are incorrect.
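To make the distinction between mapping data and attaching a pointer concrete, here is a toy model. It is only a sketch and not libomptarget code: `ToyRuntime`, `mapData`, `attach`, and the integer `Addr` type are all hypothetical names, and it assumes the OpenMP behaviour that attachment only takes effect when the pointer itself already has a device copy. Mapping the pointee never touches the pointer's device copy, and attaching never maps the pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using Addr = std::uint64_t; // toy addresses, not real pointers

// Hypothetical toy runtime (not libomptarget): tracks host->device
// translations and the values held by device copies of pointer variables.
struct ToyRuntime {
  std::unordered_map<Addr, Addr> HostToDevice;   // mapped host addr -> device addr
  std::unordered_map<Addr, Addr> DevicePtrValue; // device addr of a pointer -> value it holds
  Addr NextDeviceAddr = 0x1000;

  // Data map: reserve device storage for a host range (contents omitted).
  Addr mapData(Addr HostBegin, Addr Size) {
    assert(Size > 0 && "cannot map an empty range in this toy model");
    Addr Dev = NextDeviceAddr;
    NextDeviceAddr += Size;
    HostToDevice[HostBegin] = Dev;
    return Dev;
  }

  // Attach: if both the pointer and the pointee are present on the device,
  // store the pointee's device address in the device copy of the pointer.
  // Does not map anything itself; returns false if either side is absent.
  bool attach(Addr HostPtrAddr, Addr HostPteeBegin) {
    auto Ptr = HostToDevice.find(HostPtrAddr);
    auto Ptee = HostToDevice.find(HostPteeBegin);
    if (Ptr == HostToDevice.end() || Ptee == HostToDevice.end())
      return false;
    DevicePtrValue[Ptr->second] = Ptee->second;
    return true;
  }
};
```

In this model, calling `mapData` for the pointee followed by `attach` leaves the pointer unmapped and `attach` returns false unless the pointer was mapped separately, which is the separation of concerns the description above is after.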
With this change, we are using ATTACH-style maps, like (A), for cases where the expression has a base-pointer. For example:

We also group mapping of clauses with the same base decl in the order of the increasing complexity of their base-pointers, e.g. for something like:

We first map `spp`, then `spp[0]`, then `spp[0][0]` and `spp[0][0].a`.

This allows us to also group "struct" allocation based on their attach pointers.
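The ordering described above can be sketched as a stable sort keyed on the complexity of each operand's base-pointer. This is an illustration only, not the clang implementation: `MapOperand`, `attachPtrDepth`, and `orderForMapping` are hypothetical names, the depth measure (number of subscripts in the base-pointer) is an assumption, and so is the choice of `spp[0]` as the attach pointer for both `spp[0][0]` and `spp[0][0].a`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one map operand; not a clang data structure.
struct MapOperand {
  std::string Expr;      // the mapped expression, e.g. "spp[0][0].a"
  std::string AttachPtr; // its base-pointer, e.g. "spp[0]"; empty if none
};

// Assumed complexity measure: 0 when there is no base-pointer, otherwise one
// plus the number of subscripts in the base-pointer expression.
unsigned attachPtrDepth(const MapOperand &Op) {
  if (Op.AttachPtr.empty())
    return 0;
  return 1 + static_cast<unsigned>(
                 std::count(Op.AttachPtr.begin(), Op.AttachPtr.end(), '['));
}

// Order operands by increasing base-pointer complexity. The sort is stable,
// so operands with equal depth keep their original relative order.
std::vector<MapOperand> orderForMapping(std::vector<MapOperand> Ops) {
  auto ByDepth = [](const MapOperand &A, const MapOperand &B) {
    return attachPtrDepth(A) < attachPtrDepth(B);
  };
  std::stable_sort(Ops.begin(), Ops.end(), ByDepth);
  assert(std::is_sorted(Ops.begin(), Ops.end(), ByDepth));
  return Ops;
}
```

Given the operands `spp[0][0]`, `spp[0][0].a`, `spp`, `spp[0]` in that order, this sketch yields `spp`, `spp[0]`, `spp[0][0]`, `spp[0][0].a`, with the two operands that share the attach pointer `spp[0]` ending up next to each other.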
Cases that need handling:
- When `p` is a base-pointer in a map from a member function within the same class, `p` is not privatized; instead, we still try to create an implicit map of `this[0:1]` and access `p` through that, which is incorrect.
- The `use_device_addr` clause does not work properly, because we don't have a proper component-list set up for it, just one component, so we cannot find the proper attach-ptr. For `use_device_addr`, we should match existing maps whose attach-ptr matches the attach-ptr of the `use_device_addr` operand.
- `use_device_ptr` handling has some issues too. Needs debugging.

Some tests still haven't been updated. These include: