clang-p2996

Author	SHA1	Message	Date
Kareem Ergawy	84c3b05e5e	[OpenMP][flang][MLIR] Decouple alloc, init, and copy regions for `omp.private\|declare_reduction` ops (#125699 ) This PR changes the emitted block structure of alloc, init, and copy regions for `omp.private` and `omp.declare_reduction` ops a little bit. In particular, this decouples init and copy regions from the alloca insertion-point. The main motivation is to fix "Instruction does not dominate all uses!" errors that happen especially when an init region uses a value from the OpenMP region it is being inlined into. The issue happens because, prior to this PR, we inlined the init region right after the latest alloc block (since we used the alloca IP), which in some cases (see example below) is too early and causes the use dominance issue. Example that would break without this PR (when delayed privatization is enabled for `omp.wsloop`s): ```fortran subroutine test2 (xyz) integer :: i integer :: xyz(:) !$omp target map(from:xyz) !$omp do private(xyz) do i = 1, 10 xyz(i) = i end do !$omp end target end subroutine ```	2025-02-06 11:45:40 +01:00
Abid Qadeer	5f7acf7259	[flang][OMPIRbuilder] Set debug loc on terminator created by splitBB. (#125897 ) Fixes #125088. When splitBB is called with createBranch=true, it creates a branch instruction in the old block. But no debug loc is set on that branch instruction. If that is used as InsertPoint in the restoreIP, it has the potential to set the current debug location to null and subsequent instruction will come out without a debug location. This caused the verification check to fail as shown in the bug report. This PR changes splitBB and spliceBB function to also take a debugLoc parameter which can be used to set the debug location of the branch instruction.	2025-02-05 22:35:43 +00:00
Abid Qadeer	e151b1d1f6	[MLIR][OpenMP] Use correct DebugLoc in target construct callbacks. (#125856 ) This is the same as PR #125106 which somehow is stuck in a "Processing Update" loop for many hours now. I am going to close that one and push this one instead. While working on https://github.com/llvm/llvm-project/issues/125088, I noticed a problem with the TargetBodyGenCallbackTy and TargetGenArgAccessorsCallbackTy. The OMPIRBuilder and MLIR side both maintain their own IRBuilder and when control goes from one to the other, we have to take care to not use a stale debug location. The code currently relies on restoreIP to set the insertion point and the debug location. But if the passed InsertPointTy has an empty block, then the debug location will not be updated (see SetInsertPoint). This can cause an invalid debug location to be attached to an instruction and the verifier will complain. Similarly when we exit the callback, the debug location of the Builder is not set to what it was before the callback. This again can cause verification failures. This PR resets the debug location at the start and also uses an InsertPointGuard to restore the debug location at exit. Both of these problems would have been caught by the unit tests but they were not setting the debug location of the builder before calling the createTarget so the problem was hidden. I have updated the tests accordingly.	2025-02-05 14:59:37 +00:00
Tom Eccles	9ad4ebd82b	[mlir][OpenMP][NFC] break out priv var init into helper (#125303 )	2025-02-03 09:10:44 +00:00
Tom Eccles	aeaafce464	[mlir][OpenMP][flang] make private variable allocation implicit in omp.private (#124019 ) The intention of this work is to give MLIR->LLVMIR conversion freedom to control how the private variable is allocated so that it can be allocated on the stack in ordinary cases or as part of a structure used to give closure context for tasks which might outlive the current stack frame. See RFC: https://discourse.llvm.org/t/rfc-openmp-supporting-delayed-task-execution-with-firstprivate-variables/83084 For example, a privatizer for an integer used to look like ```mlir omp.private {type = private} @x.privatizer : !fir.ref<i32> alloc { ^bb0(%arg0: !fir.ref<i32>): %0 = ... allocate proper memory for the private clone ... omp.yield(%0 : !fir.ref<i32>) } ``` After this change, allocation become implicit in the operation: ```mlir omp.private {type = private} @x.privatizer : i32 ``` For more complex types that require initialization after allocation, an init region can be used: ``` mlir omp.private {type = private} @x.privatizer : !some.type init { ^bb0(%arg0: !some.pointer<!some.type>, %arg1: !some.pointer<!some.type>): // initialize %arg1, using %arg0 as a mold for allocations omp.yield(%arg1 : !some.pointer<!some.type>) } dealloc { ^bb0(%arg0: !some.pointer<!some.type>): ... deallocate memory allocated by the init region ... omp.yield } ``` This patch lays the groundwork for delayed task execution but is not enough on its own. After this patch all gfortran tests which previously passed still pass. There are the following changes to the Fujitsu test suite: - 0380_0009 and 0435_0009 are fixed - 0688_0041 now fails at runtime. This patch is testing firstprivate variables with tasks. Previously we got lucky with the undefined behavior and won the race. After these changes we no longer get lucky. This patch lays the groundwork for a proper fix for this issue. In flang the lowering re-uses the existing lowering used for reduction init and dealloc regions. 
In flang, before this patch we hit a TODO with the same wording when generating the copy region for firstprivate polymorphic variables. After this patch the box-like fir.class is passed by reference into the copy region, leading to a different path that didn't hit that old TODO but the generated code still didn't work so I added a new TODO in DataSharingProcessor.	2025-01-31 09:35:26 +00:00
agozillon	2428b6ec40	[Flang][MLIR][OpenMP] Fix Target Data if (present(...)) causing LLVM-IR branching error (#123771 ) Currently if we generate code for the below target data map that uses an optional mapping: !$omp target data if(present(a)) map(alloc:a) do i = 1, 10 a(i) = i end do !$omp end target data We yield an LLVM-IR error as the branch for the else path is not generated. This occurs because we enter the NoDupPriv path of the call back function when generating the else branch, however, the emitBranch function needs to be set to a block for it to functionally generate and link in a follow up branch. The NoDupPriv path currently doesn't do this, while it's not supposed to generate anything (as far as I am aware) we still need to at least set the builders placement back so that it emits the appropriate follow up branch. This avoids the missing terminator LLVM-IR verification error by correctly generating the follow up branch.	2025-01-30 17:33:36 +01:00
Tom Eccles	2bde7a1b7c	[mlir][OpenMP][NFC] Remove dead uses of OpenMPVarMappingStackFrame (#125061 ) This is left over from the old way reductions were implemented. OpenMPVarMappingStackFrame doesn't actually do anything anymore so these uses can go away.	2025-01-30 14:35:10 +00:00
agozillon	e0054e984c	[MLIR][OpenMP] Emit nullary check for mapped pointer members and appropriate size select based on results (#124604 ) This PR aims to fix a mapping error when trying to map nullary elements of a record type (primary example is allocatables/pointer types in Fortran at the moment). This should be legal to map, just not write to without pointing to anything within the target region. A common Fortran OpenMP idiom/example where this is useful can be found in the added Fortran offload example. The runtime error arises when we try to map the pointer member utilising a prescribed constant size that we receive from the lowered type, resulting in mapping of data that will be non-existent when there is no allocated data. The fix in this case is to emit a runtime check to see if the data has been allocated, if it hasn't been we select a size of 0, if it has we emit the usual type size.	2025-01-29 17:51:33 +01:00
Jeremy Morse	749443a307	[NFC][DebugInfo] Mop up final instruction-insertion call sites (#124289 ) These are the final places in the monorepo that make use of instruction insertion for methods like insertBefore and moveBefore. As part of the RemoveDIs project, instead use iterators for insertion. (see: https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 ).	2025-01-27 16:07:27 +00:00
Anchu Rajendran S	afcbcae668	[mlir][OpenMP] inscan reduction modifier and scan op mlir support (#114737 ) The scan directive allows specifying scan reductions within a worksharing loop, worksharing loop simd or simd directive, which should have an `InScan` modifier associated with it. This change adds the MLIR support for the same. Related PR: [Parsing and Semantic Support for scan](https://github.com/llvm/llvm-project/pull/102792)	2025-01-22 09:53:54 -08:00
Kareem Ergawy	937cbce14c	Revert "[flang][OpenMP] Enable delayed privatization by default `omp.wsloop` (#122471 )" (#123324 ) This seems to have caused some regressions in Fujitsu's test-suite: https://linaro.atlassian.net/browse/LLVM-1521 This reverts commit `6f82408bb5`.	2025-01-22 10:16:40 +01:00
Thirumalai Shaktivel	c2aa11d148	[Flang] Add LLVM lowering support for UNTIED clause in Task (#121052 ) Implementation details: The UNTIED clause is recognized by setting the flag=0 for the default case or performing logical OR to flag if other clauses are specified, and this flag is passed as an argument to the `__kmpc_omp_task_alloc` runtime call. Resubmitting the PR with fix for the failure, as it was reverted here: `927a70daf3` and previously merged here: https://github.com/llvm/llvm-project/pull/115283	2025-01-21 09:10:25 +05:30
Kareem Ergawy	6b3ba6677d	[flang][OpenMP] Unconditionally create `after_alloca` block in `allocatePrivateVars` (#123168 ) While https://github.com/llvm/llvm-project/pull/122866 fixed some issues, it introduced a regression in worksharing loops. The new bug comes from the fact that we now conditionally create the `after_alloca` block based on the number of successors of the alloca insertion point. This is unnecessary, we can just always create the block. If we do this, we respect the post-conditions expected after calling `allocatePrivateVars` (i.e. that the `afterAlloca` block has a single predecessor).	2025-01-16 19:08:38 +01:00
Kareem Ergawy	6f82408bb5	[flang][OpenMP] Enable delayed privatization by default `omp.wsloop` (#122471 ) This enables delayed privatization by default for `omp.wsloop` ops, with one caveat! I had to work around the "impure" alloc region issue that is being resolved at the moment. The workaround detects whether the alloc region's argument is used in the region and at the same time defined in a block that does not dominate the chosen alloca insertion point. If so, we move the alloca insertion point below the defining instruction of the alloc region argument. This basically reverts to the non-delayed-privatization behavior.	2025-01-16 15:44:59 +01:00
Thirumalai Shaktivel	1d890b06ee	[Flang, OpenMP] Add LLVM lowering support for PRIORITY in TASK (#120710 ) Implementation details: The PRIORITY clause is recognized by setting the flags = 32 to the `__kmpc_omp_task_alloc` runtime call. Also, store the priority-value to the `kmp_task_t` struct member	2025-01-16 10:02:30 +05:30
Kareem Ergawy	a32c45631b	[flang][OpenMP] Generalize fixing `alloca` IP pre-condition for `private` ops (#122866 ) This PR generalizes a fix that we implemented previously for `omp.wsloop`s. The fix makes sure that the pre-condition that the `alloca` block has a single successor whenever we inline delayed privatizers is respected. I simply moved the fix to `allocatePrivateVars` so that it kicks in for any op, not just `omp.wsloop`. This handles a bug uncovered by [a test](https://github.com/OpenMP-Validation-and-Verification/OpenMP_VV/blob/master/tests/4.5/target_simd/test_target_simd_safelen.F90) in the OpenMP_VV test suite.	2025-01-15 14:52:10 +01:00
Sergio Afonso	9bc8828093	[OMPIRBuilder][MLIR] Add support for target 'if' clause (#122478 ) This patch implements support for handling the 'if' clause of OpenMP 'target' constructs in the OMPIRBuilder and updates MLIR to LLVM IR translation of the `omp.target` MLIR operation to make use of this new feature.	2025-01-15 10:16:19 +00:00
Sergio Afonso	d2d4c3bd59	[MLIR][OpenMP] LLVM IR translation of host_eval (#116052 ) This patch adds support for processing the `host_eval` clause of `omp.target` to populate default and runtime kernel launch attributes. Specifically, these related to the `num_teams`, `thread_limit` and `num_threads` clauses attached to operations nested inside of `omp.target`. As a result, the `thread_limit` clause of `omp.target` is also supported. The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's own processing of multiple constructs and clauses in order to define a default number of teams and threads to be used as kernel attributes and to populate global variables in the target device module. One side effect of this change is that it is no longer possible to translate to LLVM IR target device MLIR modules unless they have a supported target triple. This is because the local `getGridValue()` function in the `OpenMPIRBuilder` only works for certain architectures, and it is called whenever the maximum number of threads has not been explicitly defined. This limitation also matches clang. Evaluating the collapsed loop trip count of SPMD and Generic-SPMD kernels remains unsupported.	2025-01-14 13:07:38 +00:00
Sergio Afonso	fabc443e93	[OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (#116051 ) This patch introduces a `TargetKernelRuntimeAttrs` structure to hold host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values passed to the runtime kernel offloading call. Additionally, kernel type information is used to influence target device code generation and the `IsSPMD` flag is replaced by `ExecFlags`, which provides more granularity.	2025-01-14 12:34:37 +00:00
Sergio Afonso	27bc6bdaba	[OMPIRBuilder] Introduce struct to hold default kernel teams/threads (#116050 ) This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure used to simplify passing default and constant values for number of teams and threads, and possibly other target kernel-related information in the future. This is used to forward values passed to `createTarget` to `createTargetInit`, which previously used a default unrelated set of values.	2025-01-14 11:08:55 +00:00
Sergio Afonso	9d7d8d2c87	[MLIR][OpenMP] Add host_eval clause to omp.target (#116049 ) This patch adds the `host_eval` clause to the `omp.target` operation. Additionally, it updates its op verifier to make sure all uses of block arguments defined by this clause fall within one of the few cases where they are allowed. MLIR to LLVM IR translation fails on translation of this clause with a not-yet-implemented error.	2025-01-14 10:21:46 +00:00
Kareem Ergawy	42da12063f	[flang][OpenMP] Extend delayed privatization for `omp.simd` (#122156 ) Adds support for delayed privatization for `simd` directives. This PR includes PFT down to LLVM IR lowering.	2025-01-12 07:46:58 +01:00
Kareem Ergawy	6f9e688203	[flang][OpenMP] Fix reduction init region block management (#122079 ) Replaces https://github.com/llvm/llvm-project/pull/121886 Fixes https://github.com/llvm/llvm-project/issues/120254 (hopefully 🤞) ## Problem Consider the following example: ```fortran program test real :: x(1) integer :: i !$omp parallel do reduction(+:x) do i = 1,1 x = 1 end do !$omp end parallel do end program ``` The HLFIR+OMP IR for this example looks like this: ```mlir func.func @_QQmain() { ... omp.parallel { %5 = fir.embox %4#0(%3) : (!fir.ref<!fir.array<1xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<1xf32>> %6 = fir.alloca !fir.box<!fir.array<1xf32>> ... omp.wsloop private(@_QFEi_private_ref_i32 %1#0 -> %arg0 : !fir.ref<i32>) reduction(byref @add_reduction_byref_box_1xf32 %6 -> %arg1 : !fir.ref<!fir.box<!fir.array<1xf32>>>) { omp.loop_nest (%arg2) : i32 = (%c1_i32) to (%c1_i32_0) inclusive step (%c1_i32_1) { ... omp.yield } } omp.terminator } return } ``` The problem addressed by this PR is related to: the `alloca` in the `omp.parallel` region + the related `reduction` clause on the `omp.wsloop` op. When we try translate the reduction from MLIR to LLVM, we have to choose an `alloca` insertion point. This happens in `convertOmpWsloop` where at entry to that function, this is what the LLVM module looks like: ```llvm define void @_QQmain() { %tid.addr = alloca i32, align 4 ... entry: %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @1) br label %omp.par.entry omp.par.entry: %tid.addr.local = alloca i32, align 4 ... br label %omp.par.region omp.par.region: br label %omp.par.region1 omp.par.region1: ... %5 = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, align 8 ``` Now, when we choose an `alloca` insertion point for the reduction, this is the chosen block `omp.par.entry` (without the changes in this PR). The problem is that the allocation needed for the reduction needs to reference the `%5` SSA value. 
This results in inserting allocations in `omp.par.entry` that reference allocations in a later block `omp.par.region1` which causes the `Instruction does not dominate all uses!` error. ## Possible solution - take 2: This PR contains a more localized solution than https://github.com/llvm/llvm-project/pull/121886. It makes sure that on entry to `initReductionVars`, the IR builder is at a point where we can start inserting the initialization region; to make things cleaner, we still split the builder insertion point to a dedicated `omp.reduction.init`. This way we avoid splitting after the latest allocation block, which is what was causing the issue.	2025-01-09 16:11:18 +01:00
agozillon	fa56e8bb64	[OpenMP][MLIR] Fix threadprivate lowering when compiling for target when target operations are in use (#119310 ) Currently the compiler will ICE in programs like the following on the device lowering pass: ``` program main implicit none type i1_t integer :: val(1000) end type i1_t integer :: i type(i1_t), pointer :: newi1 type(i1_t), pointer :: tab=>null() integer, dimension(:), pointer :: tabval !$omp THREADPRIVATE(tab) allocate(newi1) tab=>newi1 tab%val(:)=1 tabval=>tab%val !$omp target teams distribute parallel do do i = 1, 1000 tabval(i) = i end do !$omp end target teams distribute parallel do end program main ``` This is due to the fact that THREADPRIVATE returns a result operation, and this operation can actually be used by other LLVM dialect (or other dialect) operations. However, we currently skip the lowering of threadprivate, so we effectively never generate and bind an LLVM-IR result to the threadprivate operation result. So when we later go on to lower dependent LLVM dialect operations, we are missing the required LLVM-IR result, try to access and use it and then ICE. The fix in this particular PR is to allow compilation of threadprivate for device as well as host, and simply treat the device compilation as a no-op, binding the LLVM-IR result of threadprivate with no alterations and binding it, which will allow the rest of the compilation to proceed, where we'll eventually discard the host segment in any case. The other possible solution to this I can think of, is doing something similar to Flang's passes that occur prior to CodeGen to the LLVM dialect, where they erase/no-op certain unrequired operations or transform them to lower level series of operations. And we would erase/no-op threadprivate on device as we'd never have these in target regions. 
The main issues I can see with this are that we currently do not specialise this stage based on whether we're compiling for device or host, so it's setting a precedent and adding another point of having to understand the separation between target and host compilation. I am also not sure we'd necessarily want to enforce this at a dialect level in case someone else wishes to add a different lowering flow or translation flow. Another possible issue is that a target operation we have/utilise would depend on the result of threadprivate, meaning we'd not be allowed to entirely erase/no-op it; I am not aware of any situations where this may be an issue currently though.	2025-01-03 18:01:01 +01:00
Kaviya Rajendiran	d3eb65f15d	[MLIR][OpenMP] Lowering aligned clause to LLVM IR for SIMD directive (#119536 ) This patch, - Added a translation support for aligned clause in SIMD directive by passing the alignment details to "llvm.assume" intrinsic. - Updated the insertion point for llvm.assume intrinsic call in "OMPIRBuilder.cpp". - Added a check in aligned clause MLIR lowering, to ensure that the alignment value must be a power of 2.	2025-01-03 16:22:38 +05:30
Thirumalai Shaktivel	cbe583b0bd	[Flang] Add translation support for MutexInOutSet and InOutSet (#120715 ) Implementation details: Mutexinoutset and Inoutset are recognized as flag=0x4 and 0x8 respectively; the flag is set in `kmp_depend_info` and passed as an argument to the `__kmpc_omp_task_with_deps` runtime call	2024-12-26 15:02:09 +05:30
Muhammad Omair Javaid	927a70daf3	Revert "[Flang OpenMP] Add LLVM translation support for UNTIED in Task (#115283 )" This reverts commit `919aead1db`. It breaks following LLVM bots: https://lab.llvm.org/buildbot/#/builders/199 https://lab.llvm.org/buildbot/#/builders/143 https://lab.llvm.org/buildbot/#/builders/17	2024-12-24 01:47:24 +05:00
Thirumalai Shaktivel	919aead1db	[Flang OpenMP] Add LLVM translation support for UNTIED in Task (#115283 ) Implementation details: The UNTIED clause is recognized by setting the flag=0 for the default case or performing logical OR to flag if other clauses are specified, and this flag is passed as an argument to the `__kmpc_omp_task_alloc` runtime call.	2024-12-20 16:36:51 +05:30
Ivan R. Ivanov	7c9404c279	[flang][OpenMP] Add frontend support for ompx_bare clause (#111106 )	2024-12-13 21:44:43 +09:00
Jie Fu	46ec271e03	[mlir] Fix -Wunused-variable in OpenMPToLLVMIRTranslation.cpp (NFC) /llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp:3921:12: error: unused variable 'varType' [-Werror,-Wunused-variable] Type varType = mapInfoOp.getVarType(); ^ 1 error generated.	2024-12-12 22:11:41 +08:00
Kareem Ergawy	f9734b9df1	[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in `omp.target` ops (#116576 ) This PR adds support to translate the `private` clause from MLIR to LLVMIR when used on allocatables in the context of an `omp.target` op. This replaces https://github.com/llvm/llvm-project/pull/113208. Parent PR: https://github.com/llvm/llvm-project/pull/116770. Only the latest commit is relevant to the PR.	2024-12-12 14:39:58 +01:00
Kareem Ergawy	0e70e0edd5	[reapply (#118463 )][OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#119170 ) This reapplies PR #118463 after introducing a fix for a bug uncovered by the test suite. The problem is that when the alloca block is terminated with a conditional branch, this violates a pre-condition of `allocatePrivateVars` (which assumes the alloca block has a single successor). This new PR includes a test that reproduces the issue. Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for delayed privatization. This also refactors a few bits of code to isolate the logic needed for `firstprivate` initialization in a shared util that can be used across constructs that need it. The same is done for `dealloc` regions.	2024-12-09 14:32:04 +01:00
NimishMishra	9eb4056144	[mlir][llvm] Translation support for task detach (#116601 ) This PR adds translation support for task detach. Essentially, if the `detach` clause is present on a task, emit a `__kmpc_task_allow_completion_event` on it, and store its return (of type `kmp_event_t*`) into the `event_handle`.	2024-12-08 06:09:52 -08:00
Kareem Ergawy	c54616ea48	Revert "[OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#118463 )" (#118848 )	2024-12-05 20:49:13 +01:00
Kareem Ergawy	0993335134	[OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#118463 ) Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for delayed privatization. This also refactors a few bits of code to isolate the logic needed for `firstprivate` initialization in a shared util that can be used across constructs that need it. The same is done for `dealloc` regions. Parent PR: https://github.com/llvm/llvm-project/pull/118447. Only the latest commit is relevant for this PR.	2024-12-05 05:59:52 +01:00
Kareem Ergawy	7f72d71de7	[OpenMP][OMPIRBuilder] Refactor reduction initialization logic into one util (#118447 ) This refactors the logic needed to emit init logic for reductions by moving some duplicated code into a shared util. The logic for doing so is quite involved and is needed for any construct that has reductions. Moreover, when a construct has both private and reduction clauses, both sets of clauses need to cooperate with each other when emitting the logic needed for allocation and initialization. Therefore, this PR clearly sets the boundaries for the logic needed to initialize reductions.	2024-12-05 05:23:49 +01:00
NimishMishra	b9e3a769b9	[flang][mlir][llvm][OpenMP] Add lowering and translation support for mergeable clause on task (#114662 ) Add FIR generation and LLVMIR translation support for mergeable clause on task construct. If mergeable clause is present on a task, the relevant flag in `ompt_task_flag_t` is set and passed to `__kmpc_omp_task_alloc`.	2024-11-26 02:40:26 -08:00
Tom Eccles	a6385a3fc8	[mlir][OpenMP][NFC] use llvm::zip_equal for firstprivate copy region translation (#116416 ) I think this is a bit easier to read.	2024-11-18 10:25:19 +00:00
agozillon	b5db75bfce	[OpenMP][MLIR] Descriptor explicit member map lowering changes (#113556 ) This is one of 3 PRs in a PR stack that aims to add support for explicit mapping of allocatable members in derived types. The primary changes in this PR are the OpenMPToLLVMIRTranslation.cpp changes, which are small and seek to alter the current member mapping to add an additional map insertion for pointers. Effectively, if the member is a pointer (currently indicated by having a varPtrPtr field) we add an additional map for the pointer and then alter the subsequent mapping of the member (the data) to utilise the member rather than the parent's base pointer. This appears to be necessary in certain cases when mapping pointer data within record types to avoid segfaulting on device (due to incorrect data mapping). In general this record type mapping may be simplifiable in the future. There are also additions of tests which should help to showcase the effect of the changes above.	2024-11-16 12:26:29 +01:00
agozillon	d84d0caf28	[Flang][OpenMP] Update MapInfoFinalization to use BlockArgs Interface and modify use_device_ptr/addr to be order independent (#113919 ) This patch primarily updates the MapInfoFinalization pass to utilise the BlockArgument interface. It also shuffles newly added arguments the MapInfoFinalization passes to the end of the BlockArg/Relevant MapInfo lists, instead of one prior to the owning descriptor type. During this it was noted that the use_device_ptr/addr handling of target data was a little bit too order dependent so I've attempted to make it less so, as we cannot depend on argument ordering to be the same as Fortran for any future frontends.	2024-11-14 15:47:37 +01:00
Tom Eccles	8269c400b4	[mlir][OpenMP][NFC] delayed privatisation cleanup (#115298 ) Upstreaming some code cleanups ahead of supporting delayed task execution. - Make allocatePrivateVars not need to be a template (it will need to operate separately on firstprivate and private variables for delayed task execution so it can't index into lists of all variables in the operation). - Use llvm::SmallVectorImpl for function arguments - collectPrivatizationDecls already reserves size for privateDecls so we don't need to do that in callers - Use llvm::zip_equal instead of C-style array indexing	2024-11-07 12:27:31 +00:00
Tom Eccles	28452acac0	[mlir][OpenMP] delayed privatisation for TASK (#114785 ) This uses essentially an identical implementation to that used for ParallelOp. The private variable allocation and deallocation use shared functions to avoid code duplication. FIRSTPRIVATE variable copying uses duplicated code for now because I anticipate the implementation diverging in the near future once I store data for firstprivate variables in the task description structure. After enabling delayed privatisation for TASK in flang, one more test in the fujitsu test suite passes (I haven't looked into why).	2024-11-06 13:19:12 +00:00
Sergio Afonso	d3e796c2d0	[MLIR][OpenMP] Update not-yet-implemented errors, NFC (#114966 ) This patch improves not-yet-implemented error diagnostics to more closely follow the format used by Flang lowering for the same kind of errors. This helps keep some level of uniformity from a user perspective.	2024-11-05 12:48:54 +00:00
Sergio Afonso	6c28530ed0	[Flang][OpenMP] Properly bind arguments of composite operations (#113682 ) When composite constructs are lowered, clauses for each leaf construct are lowered before creating the set of loop wrapper operations, using these outside values to populate their operand lists. Then, when the loop nest associated to that composite construct is lowered, the binding of Fortran symbols to the entry block arguments defined by these loop wrappers is performed, resulting in the creation of `hlfir.declare` operations in the entry block of the `omp.loop_nest`. This approach prevents `hlfir.declare` operations related to the binding and other operations resulting from the evaluation of the clauses from being inserted between loop wrapper operations, which would be an illegal MLIR representation. However, this introduces the problem of entry block arguments defined by a wrapper that then should be used by one of its nested wrappers, because the corresponding Fortran symbol would still be mapped to an outside value at the time of gathering the list of operands for the nested wrapper. This patch adds operand re-mapping logic to update wrappers without changing when clauses are evaluated or where the `hlfir.declare` creation is performed.	2024-10-31 16:39:53 +00:00
Sergio Afonso	bd6c21460f	[MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (#114037 ) This patch improves error reporting in the MLIR to LLVM IR translation pass for the 'omp' dialect by emitting descriptive errors when encountering clauses not yet supported by that pass. Additionally, not-yet-implemented errors previously missing for some clauses are added, to avoid silently ignoring them. Error messages related to inlining of `omp.private` and `omp.declare_reduction` regions have been updated to use the same format.	2024-10-31 11:59:51 +00:00
Sergio Afonso	21a6032eca	[MLIR][OpenMP] Simplify translation to LLVM IR error handling (#114036 ) This patch unifies the handling of errors passed through the OpenMPIRBuilder and removes some redundant error messages through the introduction of a custom `ErrorInfo` subclass. Additionally, the current list of operations and clauses unsupported by the MLIR to LLVM IR translation pass is added to a new Lit test to check they are being reported to the user.	2024-10-31 11:34:24 +00:00
Sergio Afonso	a1f2fb6078	[MLIR][OpenMP] Prevent composite omp.simd related crashes (#113680 ) This patch updates the translation of `omp.wsloop` with a nested `omp.simd` to prevent uses of block arguments defined by the latter from triggering null pointer dereferences. This happens because the inner `omp.simd` operation representing composite `do simd` constructs is currently skipped and not translated, but this results in block arguments defined by it not being mapped to an LLVM value. The proposed solution is to map these block arguments to the LLVM value associated to the corresponding operand, which is defined above.	2024-10-29 17:05:12 +00:00
Sergio Afonso	d87964de78	[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533 ) This patch implements an approach to communicate errors between the OMPIRBuilder and its users. It introduces `llvm::Error` and `llvm::Expected` objects to replace the values returned by callbacks passed to `OMPIRBuilder` codegen functions. These functions then check the result for errors when callbacks are called and forward them back to the caller, which has the flexibility to recover, exit cleanly or dump a stack trace. This prevents a failed callback to leave the IR in an invalid state and still continue the codegen process, triggering unrelated assertions or segmentation faults. In the case of MLIR to LLVM IR translation of the 'omp' dialect, this change results in the compiler emitting errors and exiting early instead of triggering a crash for not-yet-implemented errors. The behavior in Clang and openmp-opt stays unchanged, since callbacks will continue always returning 'success'.	2024-10-25 11:30:16 +01:00
Kareem Ergawy	ad70f3e095	[flang][OpenMP] Support `target enter\|update\|exit .. nowait` (#113305 ) Extends `nowait` support for other device directives. This PR refactors the task generation utils used for the `target` directive so that they are general enough to be reused for other device directives as well.	2024-10-23 10:48:54 +02:00
Tom Eccles	621fcf892b	[mlir][OpenMP] rewrite conversion of privatisation for omp.parallel (#111844 ) The existing conversion inlined private alloc regions and firstprivate copy regions in mlir, then undid the modification of the mlir module before completing the conversion. To make this work, LLVM IR had to be generated using the wrong mapping for privatised values and then later fixed inside of OpenMPIRBuilder. This approach violated an assumption in OpenMPIRBuilder that private variables would be values not constants. Flang sometimes generates code where private variables are promoted to globals, the address of which is treated as a constant in LLVM IR. This prevented the incorrect values for the private variable from being replaced by OpenMPIRBuilder, ultimately resulting in programs producing incorrect results. This patch rewrites delayed privatisation for omp.parallel to work more similarly to reductions: translating directly into LLVMIR with correct mappings for private variables. RFC: https://discourse.llvm.org/t/rfc-openmp-fix-issue-in-mlir-to-llvmir-translation-for-delayed-privatisation/81225 Tested against the gfortran testsuite and our internal test suite. Linaro's post-commit bots will check against the fujitsu test suite. I decided to add the new tests as flang integration tests rather than in mlir/test/Target/LLVMIR: - The regression test is for an issue filed against flang. I wanted to keep the reproducer similar to the code in the ticket. - I found the "worst case" CFG test difficult to reason about in the abstract; it helped me to think about what was going on in terms of a Fortran program. Fixes #106297	2024-10-16 14:43:57 +01:00
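
As a rough illustration of the flag handling described in the UNTIED (c2aa11d148) and PRIORITY (1d890b06ee) commits above, the following C++ sketch combines clause information into the `flags` word passed to `__kmpc_omp_task_alloc`. Only the values 0 (untied, no other clauses) and 32 (priority) come from the commit messages; the tied bit value of 1 and the helper name `computeTaskFlags` are assumptions made for illustration, not the actual OMPIRBuilder code.

```cpp
#include <cstdint>

// Flag bits for the `flags` argument of __kmpc_omp_task_alloc.
// 32 (priority) and 0 (untied default) are taken from the commit messages;
// the tied bit value of 1 is an ASSUMPTION made for this sketch.
constexpr std::int32_t kTiedFlag = 1;      // assumption, not from the log
constexpr std::int32_t kPriorityFlag = 32; // from commit 1d890b06ee

// Hypothetical helper: fold clause information into a single flags word.
// An untied task starts from 0; other clauses are OR-ed into the word.
constexpr std::int32_t computeTaskFlags(bool untied, bool hasPriority) {
  std::int32_t flags = untied ? 0 : kTiedFlag;
  if (hasPriority)
    flags |= kPriorityFlag;
  return flags;
}

// Compile-time checks of the two cases stated in the commit messages.
static_assert(computeTaskFlags(true, false) == 0, "untied default is 0");
static_assert(computeTaskFlags(true, true) == 32, "priority sets bit 32");
```

Under these assumptions, an untied task with a priority clause gets `flags == 32`, while a tied task with a priority clause would get `33`.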

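The size selection described in commit e0054e984c (map zero bytes when a pointer member has no allocated data, otherwise the usual type size) can be sketched in plain C++ as below. The helper name `selectMapSize` and its signature are hypothetical; the real change emits an equivalent runtime check in LLVM IR rather than calling a function like this.

```cpp
#include <cstddef>

// Hypothetical helper mirroring the runtime check from commit e0054e984c:
// if the pointer member has no allocated data, map a size of 0;
// otherwise map the usual size derived from the lowered type.
inline std::size_t selectMapSize(const void *data, std::size_t typeSize) {
  return data == nullptr ? 0 : typeSize;
}
```

For an unallocated member, `selectMapSize(nullptr, 8)` yields 0, so nothing non-existent is mapped to the device.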

261 Commits