This test checks that MLIR code is lowered according to the schema presented
below:
```
func1() {
  call __kmpc_parallel_51(..., func2, ...)
}
func2() {
  call __kmpc_for_static_loop_4u(..., func3, ...)
}
func3() {
  // loop body
}
```
This test demonstrates how the PhysicalStorageBuffer extension can be
used end-to-end in a SPIR-V module.
This module has been verified to pass serialization, deserialization,
and validation with spirv-val.
Add an emitc.expression operation that models C expressions, and provide
transforms to form and fold expressions. The translator emits the body of
an emitc.expression op as a single C expression. By default, this
expression is emitted as the RHS of an EmitC SSA value; where possible,
however, expressions with a single use that is not another expression are
inlined instead. The inlining of specific expressions can be fine-tuned by
lowering passes and transforms.
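As a rough illustration (not taken from the patch; the variable names below are made up), this is the kind of C the translator can produce for a folded expression, both in its default assigned form and in its inlined form:
```c++
#include <cstdint>

// Illustrative only; names are hypothetical.
int32_t assigned(int32_t v1, int32_t v2, int32_t v3) {
  // Default: the folded body of an emitc.expression is emitted as the RHS of
  // a single assignment rather than one statement per operation.
  int32_t v4 = (v1 + v2) * v3;
  return v4;
}

int32_t inlined(int32_t v1, int32_t v2, int32_t v3) {
  // With a single use that is not another expression, the expression can be
  // inlined into its user directly.
  return (v1 + v2) * v3;
}
```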
Extend the `amendOperation` mechanism for translating dialect attributes
attached to operations from another dialect when translating MLIR to
LLVM IR. Previously, this mechanism would have no knowledge of the LLVM
IR instructions created for the given operation, making it impossible
for it to perform local modifications such as attaching operation-level
metadata. Collect instructions inserted by the LLVM IR builder and pass
them to `amendOperation`.
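As a hedged sketch (parameter order is approximate, and the dialect interface class and attribute name are hypothetical), an `amendOperation` override can now use the collected instructions to attach metadata:
```c++
#include "mlir/Target/LLVMIR/LLVMTranslationInterface.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"

using namespace mlir;

// Hypothetical dialect interface; the hook now receives the LLVM IR
// instructions that were created for `op`.
struct MyDialectLLVMIRTranslationInterface
    : public LLVMTranslationDialectInterface {
  using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface;

  LogicalResult
  amendOperation(Operation *op, ArrayRef<llvm::Instruction *> instructions,
                 NamedAttribute attribute,
                 LLVM::ModuleTranslation &moduleTranslation) const override {
    // Example: attach nontemporal metadata to every instruction emitted for
    // an op carrying a (hypothetical) "mydialect.nontemporal" attribute.
    if (attribute.getName().getValue() == "mydialect.nontemporal") {
      for (llvm::Instruction *inst : instructions)
        inst->setMetadata(llvm::LLVMContext::MD_nontemporal,
                          llvm::MDNode::get(inst->getContext(), {}));
    }
    return success();
  }
};
```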
Check that a workshare loop without a loop body is lowered correctly, i.e.:
1) A null pointer is passed to the OpenMP device RTL function as the
parameter that denotes the aggregated parameters of the loop-body function.
2) The outlined loop-body function has only one parameter: the loop counter.
This is an experimental address space for strided buffers. These buffers
can have structs as elements and a stride > 1.
These pointers allow indexed access in units of the stride, i.e., they
point at `buffer[index * stride]`.
Thus, we can use the `idxen` modifier for buffer loads.
We assign address space 9 to 192-bit buffer pointers, which contain a
128-bit descriptor, a 32-bit offset and a 32-bit index. Essentially,
they are fat buffer pointers with an additional 32-bit index.
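A rough C++ sketch of the 192-bit layout described above (field names are hypothetical; the actual representation lives in the backend's type system, not in a C struct):
```c++
#include <cstdint>

// Hypothetical illustration of an address-space-9 strided buffer pointer:
// a fat buffer pointer plus a 32-bit index, 192 bits in total.
struct StridedBufferPointer {
  uint64_t descriptor[2]; // 128-bit buffer resource descriptor
  uint32_t offset;        // 32-bit byte offset into the buffer
  uint32_t index;         // 32-bit index; accesses address buffer[index * stride]
};
static_assert(sizeof(StridedBufferPointer) == 24, "192 bits");
```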
This commit implements the LLVM IR invariant intrinsics in the LLVM
dialect. These intrinsics can be used to mark program regions in which the
contents of a specific memory object will not change.
The LLVM dialect implementation also implements the
PromotableOpInterface to ensure Mem2Reg & SROA are able to promote
pointers that are marked using the invariant intrinsics.
This commit ensures that we model DI information for global constants
correctly. These constructs can lack scopes, names, and linkage names,
so these parameters were made optional for the DIGlobalVariable
attribute.
This test checks that the proper OpenMP device RTL function is called to
handle a workshare loop for GPU.
The code generation for GPU worksharing loops is implemented by the
patch: https://github.com/llvm/llvm-project/pull/73360
There's an issue in the translator today where, for a CallsiteLoc, if
the callee does not have a DI scope (perhaps due to compile options or
optimizations), it may be assigned the DI scope of its callsite's parent
function, creating a non-existent DILocation that combines the line and
column number from one file with the filename from another.
The root problem is that we cannot propagate the parent scope when
translating the callee location, as it no longer applies to inlined
locations (see the code diff; hopefully this will make sense).
To facilitate this, the importer is also changed so that callee scopes
are fused with the callee FileLineCol loc, instead of being attached to
the Callsite loc itself. This comes with the benefit that we now have a
symmetric Callsite loc representation. If we required the callee scope to
always be annotated on the Callsite loc, it would be hard for generic
inlining passes to maintain that, since they would have to somehow
understand the semantics of the fused metadata and pull it out while
inlining.
GPU dialect lowering to the SYCL runtime is driven by the spirv.target_env
attribute attached to gpu.module. As a result, spirv.target_env remains an
input to LLVM IR translation.
A SPIRVToLLVMIRTranslation without any actual translation is added to
avoid an unregistered error in mlir-cpu-runner.
SelectObjectAttr.cpp is updated to:
1) Pass the binary size argument to getModuleLoadFn.
2) Pass the parameter count to getKernelLaunchFn.
This change does not impact CUDA and ROCm usage since both
mlir_cuda_runtime and mlir_rocm_runtime have already been updated to
accept and ignore the extra arguments.
Currently the parser and printer of `CallOp` do not match when both
varargs and the attr-dict are present (round-tripping is broken). This
fixes the parser so that it conforms to the asm format written in the
comments.
Continuation of https://github.com/llvm/llvm-project/pull/74247 to fix
https://github.com/llvm/llvm-project/issues/56962. Fixes verifier for
(Integer Attr):
```mlir
llvm.mlir.constant(1 : index) : f32
```
and (Dense Attr):
```mlir
llvm.mlir.constant(dense<100.0> : vector<1xf64>) : f32
```
## Integer Attr
The addition that this PR makes to `LLVM::ConstantOp::verify` is meant
to be exactly verifying the code in
`mlir::LLVM::detail::getLLVMConstant`:
9f78edbd20/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (L350-L353)
One failure mode is when the `type` (`llvm.mlir.constant(<value>) :
<type>`) is not an `Integer`, because then the `cast` in
`getIntegerBitWidth` will crash:
dca432cb7b/llvm/include/llvm/IR/DerivedTypes.h (L97-L99)
So that's now caught in the verifier.
Apart from that, I don't see anything we could check for. `sextOrTrunc`
means "Sign extend or truncate to width" and that one is quite
permissive. For example, the following doesn't have to be caught in the
verifier as it doesn't crash during `mlir-translate -mlir-to-llvmir`:
```mlir
llvm.func @main() -> f32 {
%cst = llvm.mlir.constant(100 : i64) : f32
llvm.return %cst : f32
}
```
## Dense Attr
The translation crashes if the type is neither an MLIR vector type nor one of these:
9f78edbd20/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (L375-L391)
This patch adds a target_features (TargetFeaturesAttr) to the LLVM
dialect to allow setting and querying the features in use on a function.
The motivation for this comes from the Arm SME dialect where we would
like a convenient way to check what variants of an operation are
available based on the CPU features.
Intended usage:
The target_features attribute is populated manually or by a pass:
```mlir
func.func @example() attributes {
target_features = #llvm.target_features<["+sme", "+sve", "+sme-f64f64"]>
} {
// ...
}
```
Then within a later rewrite the attribute can be checked, and used to
make lowering decisions.
```c++
// Finds the "target_features" attribute on the parent
// FunctionOpInterface.
auto targetFeatures = LLVM::TargetFeaturesAttr::featuresAt(op);
// Check a feature.
// Returns false if targetFeatures is null or the feature is not in
// the list.
if (!targetFeatures.contains("+sme-f64f64"))
return failure();
```
For now, this is rather simple and just checks whether the exact feature
is in the list, though it could be extended with implied features using
information from LLVM.
Add:
* an Op for 'cp.async.mbarrier.arrive', targeting the
nvvm_cp_async_mbarrier_arrive* family of intrinsics.
* The 'noinc' intrinsic property is modelled as a default-valued-attr of
type I1.
* Test cases are added to verify the Op as well as the intrinsic
lowering.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
This PR introduces DIGlobalVariableAttr and
DIGlobalVariableExpressionAttr so that ModuleTranslation can emit the
metadata needed for debug information about global variables.
The translator implementation for debug metadata needed to be refactored
in order to allow translation of nodes based on MDNode
(DIGlobalVariableExpressionAttr and DIExpression) in addition to
DINode-based nodes.
A DIGlobalVariableExpressionAttr can now be passed to the GlobalOp
operation directly and ModuleTranslation will create the respective
DIGlobalVariable and DIGlobalVariableExpression nodes. The compile unit
that DIGlobalVariable is expected to be configured with will be updated
with the created DIGlobalVariableExpression.
This reworks the ArmSME dialect to use attributes for tile allocation.
This has a number of advantages and corrects some issues with the
previous approach:
* Tile allocation can now be done ASAP (i.e. immediately after
`-convert-vector-to-arm-sme`)
* SSA form for control flow is now supported (e.g.`scf.for` loops that
yield tiles)
* ArmSME ops can be converted to intrinsics very late (i.e. after
lowering to control flow)
* Tests are simplified by removing constants and casts
* Avoids correctness issues with representing LLVM `immargs` as MLIR
values
- The tile ID on the SME intrinsics is an `immarg` (so is required to be
a compile-time constant), `immargs` should be mapped to MLIR attributes
(this is already the case for intrinsics in the LLVM dialect)
- Using MLIR values for `immargs` can lead to invalid LLVM IR being
generated (and passes such as -cse making incorrect optimizations)
As part of this patch we bid farewell to the following operations:
```mlir
arm_sme.get_tile_id : i32
arm_sme.cast_tile_to_vector : i32 to vector<[4]x[4]xi32>
arm_sme.cast_vector_to_tile : vector<[4]x[4]xi32> to i32
```
These are now replaced with:
```mlir
// Allocates a new tile with (indeterminate) state:
arm_sme.get_tile : vector<[4]x[4]xi32>
// A placeholder operation for lowering ArmSME ops to intrinsics:
arm_sme.materialize_ssa_tile : vector<[4]x[4]xi32>
```
The new tile allocation works by operations implementing the
`ArmSMETileOpInterface`. This interface says that an operation needs to
be assigned a tile ID, and may conditionally allocate a new SME tile.
Operations allocate a new tile by implementing...
```c++
std::optional<arm_sme::ArmSMETileType> getAllocatedTileType()
```
...and returning what type of tile the op allocates (ZAB, ZAH, etc).
Operations that don't allocate a tile return `std::nullopt` (which is
the default behaviour).
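For example, a minimal sketch of what an allocating op's implementation might look like (the `ZeroOp` class, `getVectorType` accessor, and the ZAS/ZAD enumerators are assumptions for illustration; only ZAB and ZAH are named above):
```c++
// Illustrative only: an op producing a complete tile reports the tile type
// matching its result element type; non-allocating ops keep the default
// std::nullopt.
std::optional<arm_sme::ArmSMETileType> ZeroOp::getAllocatedTileType() {
  switch (getVectorType().getElementTypeBitWidth()) {
  case 8:
    return arm_sme::ArmSMETileType::ZAB;
  case 16:
    return arm_sme::ArmSMETileType::ZAH;
  case 32:
    return arm_sme::ArmSMETileType::ZAS;
  case 64:
    return arm_sme::ArmSMETileType::ZAD;
  default:
    return std::nullopt;
  }
}
```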
Currently the following ops are defined as allocating:
```mlir
arm_sme.get_tile
arm_sme.zero
arm_sme.tile_load
arm_sme.outerproduct // (if no accumulator is specified)
```
Allocating operations become the roots for the tile allocation pass,
which currently just (naively) assigns all transitive uses of a root
operation the same tile ID. However, this is enough to handle current
use cases.
Once tile IDs have been allocated subsequent rewrites can forward the
tile IDs to any newly created operations.
The NVIDIA Hopper architecture introduced the Cooperative Group Array
(CGA). It is a new level of parallelism, allowing clustering of
Cooperative Thread Arrays (CTAs) to synchronize and communicate through
shared memory while running concurrently.
This PR enables support for CGA within the `gpu.launch_func` in the GPU
dialect. It extends `gpu.launch_func` to accommodate this functionality.
The GPU dialect remains architecture-agnostic, so we've added the CGA
functionality as optional parameters. We want to leverage the mechanisms
we already have in the GPU dialect, such as outlining and kernel
launching, making this a practical and convenient choice.
An example of this implementation can be seen below:
```
gpu.launch_func @kernel_module::@kernel
clusters in (%1, %0, %0) // <-- Optional
blocks in (%0, %0, %0)
threads in (%0, %0, %0)
```
The PR also introduces index and dimensions Ops specific to clusters,
binding them to NVVM Ops:
```
%cidX = gpu.cluster_id x
%cidY = gpu.cluster_id y
%cidZ = gpu.cluster_id z
%cdimX = gpu.cluster_dim x
%cdimY = gpu.cluster_dim y
%cdimZ = gpu.cluster_dim z
```
We will introduce cluster support in `gpu.launch` Op in an upcoming PR.
See [the
documentation](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-of-cooperative-thread-arrays)
provided by NVIDIA for details.
In the early days of MLIR-to-LLVM IR translation, the translator had to
forcefully inject declarations of the `malloc` and `free` functions,
since ops of the then-standard (now `memref`) dialect were unconditionally
lowered to libc calls. This is no longer the case. Even when they do lower
to libc calls, the signatures of those functions are injected during
lowering, since calls must target declared functions in valid IR. Don't
inject those declarations anymore.
This extends `LLVM_IntrOpBase` so that it can be passed a list of
`immArgPositions` and a list (of the same length) of `immArgAttrNames`.
`immArgPositions` contains the positions of `immargs` on the LLVM IR
intrinsic, and `immArgAttrNames` maps those to a corresponding MLIR
attribute.
This allows modeling LLVM `immargs` as MLIR attributes, which is the
closest match semantically (and had already been done manually for the
LLVM dialect intrinsics).
This has two upsides:
* It's slightly easier to implement intrinsics with immargs now
(especially if they make use of other features, such as overloads)
* It clearly defines that `immargs` should map to attributes; previously
there was no mention of `immargs` in LLVMOpBase.td, so implementing them
was unclear
This works with other features of the `LLVM_IntrOpBase`, so `immargs`
can be marked as overloaded too (which is used in some intrinsics).
As part of this patch (and to test correctness) existing intrinsics have
been updated to use these new parameters.
This also uncovered a few issues with the
`llvm.intr.vector.insert/extract` intrinsics. First, the argument order
for insert did not match the LLVM intrinsic, and secondly, both were
missing an mlirBuilder (so they failed to import from LLVM IR). This is
corrected with this patch (and a test case is added).
Data layout queries may be issued for types whose size exceeds the range
of a 32-bit integer, as well as for types that don't have a size known at
compile time, such as scalable vectors. Use best practices from LLVM IR
and adopt `llvm::TypeSize` for size-related queries and `uint64_t` for
alignment-related queries.
See #72678.
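A brief sketch of why `llvm::TypeSize` fits here (illustrative usage, not code from the patch):
```c++
#include "llvm/Support/TypeSize.h"
#include <cstdint>

// llvm::TypeSize carries a 64-bit minimum size plus a scalability flag, so a
// query on a scalable vector does not silently collapse to a fixed number,
// and large types no longer overflow a 32-bit size.
void typeSizeExamples() {
  llvm::TypeSize fixed = llvm::TypeSize::getFixed(128);       // exactly 128 bits
  llvm::TypeSize scalable = llvm::TypeSize::getScalable(128); // 128 x vscale bits
  uint64_t knownMin = scalable.getKnownMinValue();            // 128
  bool isScalable = scalable.isScalable();                    // true
  (void)fixed; (void)knownMin; (void)isScalable;
}
```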
The svldr_vnum and svstr_vnum builtins always modify the base register
and tile slice and provide immediate offsets of zero, even when the
offset provided to the builtin is an immediate. This patch optimises the
output of the builtins when the offset is an immediate, to pass it
directly to the instruction and to not need the base register and tile
slice updates.
This renames the `emitc.call` op to `emitc.call_opaque`, as the existing
call op does not refer to the callee by symbol. The rename allows
introducing a new call op alongside a future `emitc.func` op to model and
facilitate functions and function calls.
Align with the rest of the spirv dialect by using a functional type
syntax.
Regex for updating existing code:
`spirv\.VectorShuffle (\[.+\]) (%[^:]+): ([^,]+), (%[^:]+): ([^\s]+) ->(.+)`
==>
`spirv.VectorShuffle $1 $2, $4 : $3, $5 ->$6`
Add initial support for DIExpression in the LLVM dialect.
Similar to LLVM IR, a DIExpression is encoded as a list of uint64 values.
The difference is that LLVM IR has helpers for understanding the
expression (e.g. for verification and pretty printing), whereas the
support added by this PR treats the expression elements as opaque.
Currently there's an edge case where constant indexing in target regions
can lead to incorrect results, as we do not correctly replace uses of
mapped variables in generated target functions with the target arguments
(and accessor instructions) that replace them. This patch seeks to fix
that by extending the current logic in the OMPIRBuilder.
Things like GEPs can come in the form of Constants/ConstantExprs.
Constants and ConstantExprs have no knowledge of what they are contained
in, so we must dig a little to find an instruction that tells us whether
they are used inside the function we are outlining. Only then can we be
sure they are replaceable and that we are not accidentally replacing a
usage elsewhere in the module that is still necessary.
This patch handles these cases by replacing the original constant
expression with an equivalent new instruction. Using an instruction allows
easy modification in the following loop: we now know the constant (as an
instruction) is owned by our target function, and replaceUsesOfWith can be
invoked on it (this does not seem possible with constants). Creating a
brand new instruction also lets us be cautious, since the old expression
may be used inside the function while also existing and being used
externally (unlikely given the nature of a Constant, but still a positive
side effect).
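A hedged sketch of the core idea (the helper name and surrounding variables are hypothetical; the actual logic lives in the OMPIRBuilder):
```c++
#include "llvm/IR/Constants.h"
#include "llvm/IR/Instruction.h"

// Materialize a ConstantExpr as a real instruction owned by the outlined
// target function, so replaceUsesOfWith can rewrite the mapped value with the
// corresponding kernel argument without touching uses of the original
// constant elsewhere in the module.
static llvm::Instruction *
materializeForReplacement(llvm::ConstantExpr *constExpr,
                          llvm::Instruction *insertBefore,
                          llvm::Value *mappedValue, llvm::Value *kernelArg) {
  // Equivalent instruction, initially not linked into any basic block.
  llvm::Instruction *asInst = constExpr->getAsInstruction();
  asInst->insertBefore(insertBefore);
  // Now that it is an instruction inside the target function, the rewrite is
  // guaranteed to be local to that function.
  asInst->replaceUsesOfWith(mappedValue, kernelArg);
  // The caller would then update users of constExpr inside the function to
  // use asInst instead (omitted here).
  return asInst;
}
```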
Previously, we were inserting za.enable/disable intrinsics for functions
with the "arm_za" attribute (at the MLIR level), rather than using the
backend attributes. This was done to avoid a dependency on the SME ABI
functions from compiler-rt (which have only recently been implemented).
Doing things this way did have correctness issues, for example, calling
a streaming-mode function from another streaming-mode function (both
with ZA enabled) would lead to ZA being disabled after returning to the
caller (where it should still be enabled). Fixing issues like this would
require re-doing the ABI work already done in the backend within MLIR.
Instead, this patch switches to use the "arm_new_za" (backend) attribute
for enabling ZA for an MLIR function. For the integration tests, this
requires some way of linking the SME ABI functions. This is done via the
`%arm_sme_abi_shlib` lit substitution. By default, this expands to a
stub implementation of the SME ABI functions, but this can be overridden
by providing the `ARM_SME_ABI_ROUTINES_SHLIB` CMake cache variable
(pointing it at an alternative implementation). For now, the ArmSME
integration tests pass with just stubs, as we don't make use of nested
ZA-enabled calls.
A future patch may add an option to compiler-rt to build the SME
builtins into a standalone shared library to allow easily
building/testing with the actual implementation.
This patch adds a `llvm.linker.options` operation taking a list of
strings to pass to the linker when the resulting object file is linked.
This is particularly useful on Windows to specify the CRT version to use
for this object file.
The function __kmpc_push_num_threads should be called only if we specify
the number of threads for a host parallel region.
The number of threads specified by the user should be passed as one of
the arguments of the __kmpc_parallel_51 function.
Before, we tracked the size of the teams reduction buffer in order to
allocate it at runtime per kernel launch. This patch splits that number
into two parts: the size of the reduction data (= all reduction
variables) and the (maximal) length of the buffer. This will allow us to
allocate less if we need less, e.g., if we have fewer teams than the
maximal length. It also allows us to move code from Clang's codegen into
the runtime, as we now know how large the reduction data is.
Fix a corner case missed in #71296 when operands generated by literals
are mixed with the args attribute of a call op.
Additionally remove a range check that is already handled by the CallOp
verifier.
This patch adds the MLIR translation changes required to add the IsolatedFromAbove and OutlineableOpenMPOpInterface traits to omp.target. It links the newly added block arguments to their corresponding LLVM values.
Depends on #67164.
The KernelEnvironment is for compile-time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the runtime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More use cases will
follow, e.g., per-launch memory pools.
Fixes: https://github.com/llvm/llvm-project/issues/70249
This commit ensures that the debug info import skips `DICompositeType`s
that have an array type tag and whose base type failed to translate. This
is necessary because array `DICompositeType`s require a base type to be
valid.
Note that this is currently not verified in LLVM; instead, it leads to an
explosion in the `ASMPrinter`.
This patch seeks to add initial lowering of OpenMP array sections within
target region map clauses from MLIR to LLVM IR.
It initially supports fixed-size contiguous arrays (from my reading,
OpenMP does not appear to support anything other than contiguous
sections, but I could be wrong), before looking toward assumed-size and
assumed-shape arrays. The patch also does not currently handle strides;
that is left for future work.
That said, assumed-size arrays work in some fashion (as dummy arguments)
with some minor alterations to the OMPEarlyOutliner, so it is possible
that the changes made in the IsolatedFromAbove series will allow this to
work with no further patches.
It utilises the generated omp.bounds to calculate the size of the mapped
OpenMP array (both for sectioned and un-sectioned arrays) as well as the
offset to be passed to the kernel argument structure.
Alongside these changes some refactoring of how map data is handled is
attempted, using a new MapData structure to keep track of information
utilised in the lowering of mapped values.
A more complex createDeviceArgumentAccessor, which utilises capture kinds
similarly to (and loosely based on) Clang to generate different kernel
argument accesses, is also added.
A similar function, createAlteredByCaptureMap, for altering how the
kernel argument is passed to the kernel argument structure on the host is
also utilised; it allows modification of the pointer/basePointer based on
their capture kind (and bounds information).
Of note, ByRef is the default for explicit mappings and ByCopy will be
the default for implicit captures, so the former is tested in this patch
and the latter is not for the moment.