This is part of an effort to migrate from llvm::Optional to
std::optional. This patch changes the way mlir-tblgen generates .inc
files, and modifies tests and documentation appropriately. It is a "no
compromises" patch, and doesn't leave the user with an unpleasant mix of
llvm::Optional and std::optional.
A non-trivial change has been made to ControlFlowInterfaces to split one
constructor into two, relating to a build failure on Windows.
See also: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>
Differential Revision: https://reviews.llvm.org/D138934
Currently generation of align assumptions for OpenMP simd construct is done
outside OMPIRBuilder for C code and it is not supported for Fortran.
According to OpenMP 5.0 standard (2.9.3) only pointers and arrays can be
aligned for C code.
If given aligned variable is pointer, then Clang generates the following set
of the LLVM IR isntructions to support simd align clause:
; memory allocation for pointer address:
%A.addr = alloca ptr, align 8
; some LLVM IR code
; Alignment instructions (alignment is equal to 32):
%0 = load ptr, ptr %A.addr, align 8
call void @llvm.assume(i1 true) [ "align"(ptr %0, i64 32) ]
If given aligned variable is array, then Clang generates the following set
of the LLVM IR isntructions to support simd align clause:
; memory allocation for array:
%B = alloca [10 x i32], align 16
; some LLVM IR code
; Alignment instructions (alignment is equal to 32):
%arraydecay = getelementptr inbounds [10 x i32], ptr %B, i64 0, i64 0
call void @llvm.assume(i1 true) [ "align"(ptr %arraydecay, i64 32) ]
OMPIRBuilder was modified to generate aligned assumptions. It generates only
llvm.assume calls. Frontend is responsible for generation of aligned pointer
and getting the default alignment value if user does not specify it in aligned
clause.
Unit and regression tests were added to check if aligned clause was handled correctly.
Differential Revision: https://reviews.llvm.org/D133578
Reviewed By: jdoerfert
If 'order(concurrent)' clause is specified, then the iterations of SIMD loop
can be executed concurrently.
This patch adds support for LLVM IR codegen via OMPIRBuilder for SIMD loop
with 'order(concurrent)' clause. The functionality added to OMPIRBuilder is
similar to the functionality implemented in 'CodeGenFunction::EmitOMPSimdInit'.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D134046
Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
This patch adds translation from OpenMP Dialect to LLVM IR for
omp.taskgroup. This patch also adds missing tests for the clauses in
omp.taskgroup operation.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D130157
This supports translation from MLIR to LLVM IR using OMPIRBuilder for
OpenMP safelen clause in SIMD construct.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D132245
This patch adds OMPIRBuilder support for the safelen clause for the
simd directive.
Reviewed By: shraiysh, Meinersbur
Differential Revision: https://reviews.llvm.org/D131526
Scope of changes:
1) Added new function to generate loop versioning
2) Added support for if clause to applySimd function
2) Added tests which confirm that lowering is successful
If ifCond is specified, then collapsed loop is duplicated and if branch
is added. Duplicated loop is executed if simd ifCond is evaluated to false.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D129368
Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
This supports lowering from parse-tree to MLIR and translation from
MLIR to LLVM IR using OMPIRBuilder for OpenMP simdlen clause in SIMD
construct.
Reviewed By: shraiysh, peixin, arnamoy10
Differential Revision: https://reviews.llvm.org/D130195
This patch adds OMPIRBuilder support for the simdlen clause for the
simd directive. It uses the simdlen support in OpenMPIRBuilder when
it is enabled in Clang. Simdlen is lowered by OpenMPIRBuilder by
generating the loop.vectorize.width metadata.
Reviewed By: jdoerfert, Meinersbur
Differential Revision: https://reviews.llvm.org/D129149
This patch adds translation for omp.task from OpenMPDialect to LLVM IR
Dialect and adds tests for the same.
Depends on D71989
Reviewed By: ftynse, kiranchandramohan, peixin, Meinersbur
Differential Revision: https://reviews.llvm.org/D123919
1. Remove the redundant collapse clause in MLIR OpenMP worksharing-loop
operation.
2. Fix several typos.
3. Refactor the chunk size type conversion since CreateSExtOrTrunc has
both type check and type conversion.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128338
The callback is expected to create a branch to the ContinuationBB (sometimes called FiniBB in some lambdas) argument when finishing. This creates problems:
1. The InsertPoint used for CodeGenIP does not need to be the end of a block. If it is not, a naive callback will insert a branch instruction into the middle of the block.
2. The BasicBlock the CodeGenIP is pointing to may or may not have a terminator. There is an conflict where to branch to if the block already has a terminator.
3. Some API functions work only with block having a terminator. Some workarounds have been used to insert a temporary terminator that is removed again.
4. Some callbacks are sensitive to whether the BasicBlock has a terminator or not. This creates a callback ordering problem where different callback may have different behaviour depending on whether a previous callback created a terminator or not. The problem also exists for FinalizeCallbackTy where some callbacks do create branch to another "continue" block, but unlike BodyGenCallbackTy does not receive the target as argument. This is not addressed in this patch.
With this patch, the callback receives an CodeGenIP into a BasicBlock where to insert instructions. If it has to insert control flow, it can split the block at that position as needed but otherwise no separate ContinuationBB is needed. In particular, a callback can be empty without breaking the emitted IR. If the caller needs the control flow to branch to a specific target, it can insert the branch instruction itself and pass an InsertPoint before the terminator to the callback.
Certain frontends such as Clang may expect the current IRBuilder position to be at the end of a basic block. In this case its callbacks must split the block at CodeGenIP before setting the IRBuilder position such that the instructions after CodeGenIP are moved to another basic block and before returning create a new branch instruction to the split block.
Some utility functions such as `splitBB` are supporting correct splitting of BasicBlocks, independent of whether they have a terminator or not, returning/setting the InsertPoint of an IRBuilder to the end of split predecessor block, and optionally omitting creating a branch to the split successor block to be added later.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D118409
This reverts commit af0285122f.
The test "libomp::loop_dispatch.c" on builder
openmp-gcc-x86_64-linux-debian fails from time-to-time.
See #54969. This patch is unrelated.
The OMPScheduleType enum stores the constants from libomp's internal sched_type in kmp.h and are used by several kmp API functions. The enum values have an internal structure, namely each scheduling algorithm (e.g.) exists in four variants: unordered, orderend, normerge unordered, and nomerge ordered.
This patch (basically a followup to D114940) splits the "ordered" and "nomerge" bits into separate flags, as was already done for the "monotonic" and "nonmonotonic", so we can apply bit flags operations on them. It also now contains all possible combinations according to kmp's sched_type. Deriving of the OMPScheduleType enum from clause parameters has been moved form MLIR's OpenMPToLLVMIRTranslation.cpp to OpenMPIRBuilder to make available for clang as well. Since the primary purpose of the flag is the binary interface to libomp, it has been made more private to LLVMFrontend. The primary interface for generating worksharing-loop using OpenMPIRBuilder code becomes `applyWorkshareLoop` which derives the OMPScheduleType automatically and calls the appropriate emitter function.
While this is mostly a NFC refactor, it still applies the following functional changes:
* The logic from OpenMPToLLVMIRTranslation to derive the OMPScheduleType also applies to clang. Most notably, it now applies the nonmonotonic flag for non-static schedules by default.
* In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was previously not applied if the simd modifier was used. I assume this was a bug, since the effect was due to `loop.schedule_modifier()` returning `mlir::omp::ScheduleModifier::none` instead of `llvm::Optional::None`.
* In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was set even if ordered was specified, in breach to what the comment before citing the OpenMP specification says. I assume this was an oversight.
The ordered flag with parameter was not considered in this patch. Changes will need to be made (e.g. adding/modifying function parameters) when support for it is added. The lengthy names of the enum values can be discussed, for the moment this is avoiding reusing previously existing enum value names such as `StaticChunked` to avoid confusion.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D123403
This supports the threadprivate directive in OpenMP dialect following
the OpenMP 5.1 [2.21.2] standard. Also lowering to LLVM IR using OpenMP
IRBduiler.
Reviewed By: kiranchandramohan, shraiysh, arnamoy10
Differential Revision: https://reviews.llvm.org/D123350
This commit restructures how TypeID is implemented to ideally avoid
the current problems related to shared libraries. This is done by changing
the "implicit" fallback path to use the name of the type, instead of using
a static template variable (which breaks shared libraries). The major downside to this
is that it adds some additional initialization costs for the implicit path. Given the
use of type names for uniqueness in the fallback, we also no longer allow types
defined in anonymous namespaces to have an implicit TypeID. To simplify defining
an ID for these classes, a new `MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID` macro
was added to allow for explicitly defining a TypeID directly on an internal class.
To help identify when types are using the fallback, `-debug-only=typeid` can be
used to log which types are using implicit ids.
This change generally only requires changes to the test passes, which are all defined
in anonymous namespaces, and thus can't use the fallback any longer.
Differential Revision: https://reviews.llvm.org/D122775
This patch supports ordered clause specified without parameter in
worksharing-loop directive in the OpenMPIRBuilder and lowering MLIR to
LLVM IR.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D114940
This patch adds the nowait parameter to `createSingle` in
OpenMPIRBuilder and handling for IR generation from OpenMP Dialect.
Also added tests for the same.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122371
This patch adds translation from omp.single to LLVM IR.
Depends on D122288
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D122297
This patch adds translation from `omp.atomic.capture` to LLVM IR. Also
added tests for the same.
Depends on D121546
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D121554
The current dialect registry allows for attaching delayed interfaces, that are added to attrs/dialects/ops/etc.
when the owning dialect gets loaded. This is clunky for quite a few reasons, e.g. each interface type has a
separate tracking structure, and is also quite limiting. This commit refactors this delayed mutation of
dialect constructs into a more general DialectExtension mechanism. This mechanism is essentially a registration
callback that is invoked when a set of dialects have been loaded. This allows for attaching interfaces directly
on the loaded constructs, and also allows for loading new dependent dialects. The latter of which is
extremely useful as it will now enable dependent dialects to only apply in the contexts in which they
are necessary. For example, a dialect dependency can now be conditional on if a user actually needs the
interface that relies on it.
Differential Revision: https://reviews.llvm.org/D120367
Patch adds a new operation for the SIMD construct. The op is designed to be very similar to the existing `wsloop` operation, so that the `CanonicalLoopInfo` of `OpenMPIRBuilder` can be used.
Reviewed By: shraiysh
Differential Revision: https://reviews.llvm.org/D118065
This patch adds lowering from omp.atomic.update to LLVM IR. Whenever a
special LLVM IR instruction is available for the operation, `atomicrmw`
instruction is emitted, otherwise a compare-exchange loop based update
is emitted.
Depends on D119522
Reviewed By: ftynse, peixin
Differential Revision: https://reviews.llvm.org/D119657
The loopInfos gets invalidated after collapsing nested loops. Use the
saved afterIP since the returned afterIP by applyDynamicWorkshareLoop
may be not valid.
Reviewed By: shraiysh
Differential Revision: https://reviews.llvm.org/D120294
This patch adds assemblyFormat for `omp.critical.declare`, `omp.atomic.read`,
`omp.atomic.write`, `omp.atomic.update` and `omp.atomic.capture`.
Also removing those clauses from `parseClauses` that aren't needed
anymore, thanks to the new assemblyFormats.
Reviewed By: NimishMishra, rriddle
Differential Revision: https://reviews.llvm.org/D120248
Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads.
This patch includes the following related changes:
* Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
* Remove the chunk parameter from applyStaticWorkshareLoop, it is ignored by the runtime. Change the value for the value passed to the init function to 0, as without OpenMPIRBuilder.
* Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both, applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
* Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.
Differential Revision: https://reviews.llvm.org/D114413