clang-p2996

Author	SHA1	Message	Date
Daniel Hernandez-Juarez	1c47fa9b62	[mlir][AMDGPU] Add support for AMD f16 math library calls (#108809 ) In this PR we add support for AMD f16 math library calls (`__ocml_*_f16`) CC: @krzysz00 @manupak	2024-09-23 12:52:00 -05:00
Krzysztof Drewniak	6292ea6879	[mlir][AMDGPU] Remove an old bf16 workaround (#108409 ) The AMDGPU backend now implements LLVM's `bfloat` type. Therefore, we no longer need to type convert MLIR's `bf16` to `i16` during lowerings to ROCDL. As a result of this change, we discovered that, whel the code for MFMA and WMMA intrinsics was mainly prepared for this change, we were failing to bitcast the bf16 results of WMMA operations out from the i16 they're natively represented as. This commit also fixes that issue. --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>	2024-09-12 17:45:39 -05:00
Nirvedh Meshram	a16164d0c2	[MLIR][ROCDL] Add dynamically legal ops to LowerGpuOpsToROCDLOpsPass (#108302 ) Similar to https://github.com/llvm/llvm-project/pull/108266 After https://github.com/llvm/llvm-project/pull/102971 It is legal to generate `LLVM::ExpOp` and `LLVM::LogOp` if the type is is a float16 or float32	2024-09-12 11:20:27 -05:00
Krzysztof Drewniak	90a0be9482	[mlir][LLVM] Refactor how range() annotations are handled for ROCDL intrinsics (#107658 ) This commit introduces a ConstantRange attribute to match the ConstantRange attribute type present in LLVM IR. It then refactors the LLVM_IntrOpBase so that the basic part of the intrinsic builder code can be re-used without needing to copy it or get rid of important context. This, along with adding code for handling an optional `range` attribute to that same base, allows us to make the support for range() annotations generic without adding another bit to IntrOpBase. This commit then updates the lowering of index intrinsic operations to use the new ConstantRange attribute and fixes a bug (where we'd be subtracting 1 from upper bounds instead of adding it on operations like gpu.block_dim) along the way. The point of these changes is to enable these range annotations to be used for the corresponding NVVM operations in a future commit.	2024-09-12 09:46:42 -05:00
Jan Leyonberg	3ebd79751f	[MLIR][ROCDL] Remove patterns for ops supported as intrinsics in the AMDGPU backend (#102971 ) This patch removes patterns for a few operations which allows mathToLLVM conversion to convert the operations into LLVM intrinsics instead since they are supported directly by the AMDGPU backend.	2024-09-04 11:29:10 -04:00
Krzysztof Drewniak	43fd4c49bd	[mlir][GPU] Improve handling of GPU bounds (#95166 ) This change reworks how range information for GPU dispatch IDs (block IDs, thread IDs, and so on) is handled. 1. `known_block_size` and `known_grid_size` become inherent attributes of GPU functions. This makes them less clunky to work with. As a consequence, the `gpu.func` lowering patterns now only look at the inherent attributes when setting target-specific attributes on the `llvm.func` that they lower to. 2. At the same time, `gpu.known_block_size` and `gpu.known_grid_size` are made official dialect-level discardable attributes which can be placed on arbitrary functions. This allows for progressive lowerings (without this, a lowering for `gpu.thread_id` couldn't know about the bounds if it had already been moved from a `gpu.func` to an `llvm.func`) and allows for range information to be provided even when `gpu._{id,dim}` are being used outside of a `gpu.func` context. 3. All of these index operations have gained an optional `upper_bound` attribute, allowing for an alternate mode of operation where the bounds are specified locally and not inherited from the operation's context. These also allow handling of cases where the precise launch sizes aren't known, but can be bounded more precisely than the maximum of what any platform's API allows. (I'd like to thank @benvanik for pointing out that this could be useful.) When inferring bounds (either for range inference or for setting `range` during lowering) these sources of information are consulted in order of specificity (`upper_bound` > inherent attribute > discardable attribute, except that dimension sizes check for `known__bounds` to see if they can be constant-folded before checking their `upper_bound`). This patch also updates the documentation about the bounds and inference behavior to clarify what these attributes do when set and the consequences of setting them up incorrectly. --------- Co-authored-by: Mehdi Amini <joker.eph@gmail.com>	2024-06-17 23:47:38 -05:00
stefankoncarevic	94be801879	[mlir][ROCDL] Update the LLVM data layout for ROCDL lowering. (#92127 ) This change updates the dataLayout string to ensure alignment with the latest LLVM TargetMachine configuration. The aim is to maintain consistency and prevent potential compilation issues related to memory address space handling.	2024-05-28 10:17:02 -05:00
Krzysztof Drewniak	4cba5957e6	[mlir][ROCDL] Set the LLVM data layout when lowering to ROCDL LLVM (#74501 ) In order to ensure operations lower correctly (especially memref.addrspacecast, which relies on the data layout benig set correctly then dealing with dynamic memrefs) and to prevent compilation issues later down the line, set the `llvm.data_layout` attribute on GPU modules when lowering their contents to a ROCDL / AMDGPU target. If there's a good way to test the embedded string to prevent it from going out of sync with the LLVM TargetMachine, I'd appreciate hearing about it. (Or, alternatively, if there's a place I could farctor the string out to).	2024-02-27 09:59:50 -06:00
Christian Ulmann	4279a642fb	[MLIR][GPUToROCDL] Remove typed pointer support (#70908 ) This commit removes the support for lowering GPU to ROCDL dialect with typed pointers. Typed pointers have been deprecated for a while now and it's planned to soon remove them from the LLVM dialect. Related PSA: https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502	2023-11-01 10:13:06 +01:00
Ivan R. Ivanov	26eb4285b5	[MLIR][LLVM] Add vararg support in LLVM::CallOp and InvokeOp (#67274 ) In order to support indirect vararg calls, we need to have information about the callee type - this patch adds a `callee_type` attribute that holds that. The attribute is required for vararg calls, else, it is optional and the callee type is inferred by the operands and results of the operation if not present. The syntax for non-vararg calls remains the same, whereas for vararg calls, it is changed to this: ``` llvm.call %p(%arg0, %arg0) vararg(!llvm.func<void (i32, ...)>) : !llvm.ptr, (i32, i32) -> () llvm.call @s(%arg0, %arg0) vararg(!llvm.func<void (i32, ...)>) : (i32, i32) -> () ```	2023-09-28 08:26:45 +02:00
Adrian Kuegel	baf2d13519	[mlir][GPUToROCDL] Lower arith.remf to GPU intrinsic. Differential Revision: https://reviews.llvm.org/D159423	2023-09-04 14:05:04 +02:00
Stanley Winata	1896096002	[mlir][ROCM] Add Wave/Warp shuffle lowering and op for ROCM. Reduction is heavily used for many DL workload especially with softmax/Attention layers. Wave/Warp shuffle and reduction is known to be a speedy/efficient way to do these reductions. In this patch we introduce AMD shuffle intrinsic Ops to ROCDL, along with it's corresponding lowering from gpu.shuffle. This should speed up a lot of DL workloads on ROCM backend. Currently, we have support for xor and idx, which are the more common ones. In the future, we plan on adding support for Down and Up, as well as using the ds_swizzle to further enhance it's performance when width and offsets are constant. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D158684	2023-08-24 17:35:34 -07:00
Krzysztof Drewniak	51b65d0895	[mlir][AMDGPU] Improve BF16 handling through AMDGPU compilation Many previous sets of AMDGPU dialect code have been incorrect in the presence of the bf16 type (when lowered to LLVM's bfloat) as they were developed in a setting that run a custom bf16-to-i16 pass before LLVM lowering. An overall effect of this patch is that you should run --arith-emulate-unsupported-floats="source-types=bf16 target-type=f32" on your GPU module before calling --convert-gpu-to-rocdl if your code performs bf16 arithmetic. While LLVM now supports software bfloat, initial experiments showed that using this support on AMDGPU inserted a large number of conversions around loads and stores which had substantial performance imparts. Furthermore, all of the native AMDGPU operations on bf16 types (like the WMMA operations) operate on 16-bit integers instead of the bfloat type. First, we make the following changes to preserve compatibility once the LLVM bfloat type is reenabled. 1. The matrix multiplication operations (MFMA and WMMA) will bitcast bfloat vectors to i16 vectors. 2. Buffer loads and stores will operate on the relevant integer datatype and then cast to bfloat if needed. Second, we add type conversions to convert bf16 and vectors of it to equivalent i16 types. Third, we add the bfloat <-> f32 expansion patterns to the set of operations run before the main LLVM conversion so that MLIR's implementation of these conversion routines is used. Finally, we extend the "floats treated as integers" support in the LLVM exporter to handle types other than fp8. We also fix a bug in the unsupported floats emulation where it tried to operate on `arith.bitcast` due to an oversight. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D156361	2023-08-17 18:31:28 +00:00
SJW	cdf7ca6db7	[MLIR][ROCDL] Add conversion for gpu.lane_id to ROCDL Creates rocdl.lane_id op with llvm conversion to: __device__ static unsigned int __lane_id() { return __builtin_amdgcn_mbcnt_hi( -1, __builtin_amdgcn_mbcnt_lo(-1, 0)); } Reviewed By: krzysz00 Differential Revision: https://reviews.llvm.org/D154666	2023-07-26 15:12:48 +00:00
Christopher Bate	14858cf05d	[mlir][Conversion/GPUCommon] Fix bug in conversion of `math` ops The common GPU operation transformation that lowers `math` operations to function calls in the `gpu-to-nvvm` and `gpu-to-rocdl` passes handles `vector` types by applying the function to each scalar and returning a new vector. However, there was a typo that results in incorrectly accumulating the result vector, and the rewrite returns an `llvm.mlir.undef` result instead of the correct vector. A patch is added and tests are strengthened. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D154269	2023-07-03 13:26:51 -06:00
Manupa Karunaratne	8c3a8d17c8	[MLIR][ROCDL] add gpu to rocdl erf support This commit adds lowering of lib func call to support erf in rocdl. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D150355	2023-05-15 14:42:52 +00:00
Markus Böck	0e5aeae6f5	[mlir][GPUToLLVM] Add support for emitting opaque pointers Part of https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179 This patch adds the new pass option `use-opaque-pointers` to the GPU to LLVM lowerings (including ROCD and NVVM) and adapts the code to support using opaque pointers in addition to typed pointers. The required changes mostly boil down to avoiding `getElementType` and specifying base types in GEP and Alloca. In the future opaque pointers will be the only supported model, hence tests have been ported to using opaque pointers by default. Additional regression tests for typed-pointers have been added to avoid breaking existing clients. Note: This does not yet port the `GpuToVulkan` passes. Differential Revision: https://reviews.llvm.org/D144448	2023-02-21 20:46:33 +01:00
Mehdi Amini	4f1e244eb5	Add missing dependent dialects to "convert-gpu-to-rocdl" Fixes #60198	2023-01-22 03:28:49 +01:00
Mehdi Amini	4e01532450	Check for FunctionOpInterface when looking up a parent function in GPU lowering This makes it more robust when expanding code in other function than func.func, like spv.func for example. Fixes #60072	2023-01-16 16:40:11 +00:00
Goran Flegar	b7fe0d346b	Lower math.tan to __nv_tan[f] / __ocml_tan_f{32\|64} At present math.tan fails to lower for NVVM and ROCDL. Differential Revision: https://reviews.llvm.org/D141505	2023-01-12 15:26:11 +01:00
Johannes Reifferscheid	059cf735a9	Lower math.cbrt to NVVM/ROCDL. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D141270	2023-01-09 13:17:35 +01:00
Krzysztof Drewniak	f6076bd81f	[mlir][ROCDL] Translate known block size attributes to ROCDL 1. When converting from the GPU dialect to the ROCDL dialect, if the function that contains a gpu.thread_id or gpu.block_id op is annotated with gpu.known_{block,grid}_size, use that size to set a "range" attribute on the corresponding rocdl intrinsic so that the LLVM frontend can optimize based on that range information. 1b. When translating from the rocdl dialect to LLVM IR, use the "range" attribute, if present, to set !range metadata on the relevant function call. 2. Deprecate the old rocdl.max_flat_work_group_size attribute, which was used in a tensorflow backend. Instead, use rocdl.flat_work_group_size going forward to allow kernel generators to specify the minimum and maximum work group sizes a kernel may be launched with in one attribute, thus more closely matching the backend. 3. When translating from gpu.func to llvm.func within gpu-to-rocdl, copy the known_block_size attribute as rocdl.reqd_work_group_size to enable further translations to set the corresponding metadata on the LLVM IR function. Also, set the rocdl.flat_work_group_size attribute to ensure that the reqd_work_group_size metadata and the amdgpu-flat-work-group-size metadata are consistent. 3b. Extend the ROCDL to LLVM IR translation to set the !reqd_work_group_size metadata on LLVM functions Also update tests and add functions to the ROCDL dialect to ensure attribute names are used consistently. Depends on D139865 Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D139866	2023-01-02 21:04:13 +00:00
Christian Sigg	b251b608b5	[mlir][gpu] Unroll ops on vectors which map to intrinsic calls Unroll ops that map to intrinsics when lowering to LLVM, because intrinsics don't support vector operands/results. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D136345	2022-10-28 10:33:38 +02:00
Jeff Niu	00f7096d31	[mlir][math] Rename math.abs -> math.absf To make room for introducing `math.absi`. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D131325	2022-08-08 11:04:58 -04:00
Krzysztof Drewniak	c2fc8d9b95	[mlir][GPU] Allow bare pointer memrefs when calling GPU kernels In the ROCm runtime (and probably CUDA as well), all kernel arguments are aligned. Therefore, enable using bare pointers for memref arguments to kernels when these memrefs have static shape and a trivial layout. This is a substantial optimization to launching kernels that use memrefs with known, static sizes, since it causes the kernel launch packet to no longer include information already known to the kernel, which can enable packing the kernel launch arguments into launch packets instead of having to allocate an entire separate structure to hold unneeded memref information. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D130716	2022-08-02 20:58:34 +00:00
Markus Böck	bd7eff1f2a	[mlir][flang] Make use of the new `GEPArg` builder of GEP Op to simplify code This is the follow up on https://reviews.llvm.org/D130730 which goes through upstream code and removes creating constant values in favour of using the constant indices in GEP directly. This leads to less and more readable code and more compact IR as well. Differential Revision: https://reviews.llvm.org/D130731	2022-08-01 17:22:55 +02:00
River Riddle	3655069234	[mlir] Move the Builtin FuncOp to the Func dialect This commit moves FuncOp out of the builtin dialect, and into the Func dialect. This move has been planned in some capacity from the moment we made FuncOp an operation (years ago). This commit handles the functional aspects of the move, but various aspects are left untouched to ease migration: func::FuncOp is re-exported into mlir to reduce the actual API churn, the assembly format still accepts the unqualified `func`. These temporary measures will remain for a little while to simplify migration before being removed. Differential Revision: https://reviews.llvm.org/D121266	2022-03-16 17:07:03 -07:00
River Riddle	23aa5a7446	[mlir] Rename the Standard dialect to the Func dialect The last remaining operations in the standard dialect all revolve around FuncOp/function related constructs. This patch simply handles the initial renaming (which by itself is already huge), but there are a large number of cleanups unlocked/necessary afterwards: * Removing a bunch of unnecessary dependencies on Func * Cleaning up the From/ToStandard conversion passes * Preparing for the move of FuncOp to the Func dialect See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061 Differential Revision: https://reviews.llvm.org/D120624	2022-03-01 12:10:04 -08:00
Mogball	aae5125550	[mlir] Replace StrEnumAttr -> EnumAttr in core dialects Removes uses of `StrEnumAttr` in core dialects Reviewed By: mehdi_amini, rriddle Differential Revision: https://reviews.llvm.org/D117514	2022-01-18 17:15:00 +00:00
Krzysztof Drewniak	e1da62910e	[MLIR][GPU] Define gpu.printf op and its lowerings - Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments - Define the lowering of gpu.pirntf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP. - Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle. And: [MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support. In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D110448	2021-12-09 15:54:31 +00:00
Mogball	a54f4eae0e	[MLIR] Replace std ops with arith dialect ops Precursor: https://reviews.llvm.org/D110200 Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of operations in the codebase and in tests. Reviewed By: rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D110797	2021-10-13 03:07:03 +00:00
Mehdi Amini	387f95541b	Add a new interface allowing to set a default dialect to be used for printing/parsing regions Currently the builtin dialect is the default namespace used for parsing and printing. As such module and func don't need to be prefixed. In the case of some dialects that defines new regions for their own purpose (like SpirV modules for example), it can be beneficial to change the default dialect in order to improve readability. Differential Revision: https://reviews.llvm.org/D107236	2021-08-31 17:52:40 +00:00
Adrian Kuegel	74b88807ae	[mlir][rocdl] Add math::Exp2Op lowering to ROCDL Differential Revision: https://reviews.llvm.org/D106057	2021-07-15 14:33:04 +02:00
Adrian Kuegel	07cc77187a	Lower math.expm1 to intrinsics in the GPUToNVVM and GPUToROCDL conversions. This adds the lowering for expm1 for GPU backends. Differential Revision: https://reviews.llvm.org/D96756	2021-02-16 10:23:42 +01:00
Alex Zinenko	4c4876c314	[mlir] Use target-specific GPU kernel attributes in lowering pipelines Until now, the GPU translation to NVVM or ROCDL intrinsics relied on the presence of the generic `gpu.kernel` attribute to attach additional LLVM IR metadata to the relevant functions. This would be problematic if each dialect were to handle the conversion of its own options, which is the intended direction for the translation infrastructure. Introduce `nvvm.kernel` and `rocdl.kernel` in addition to `gpu.kernel` and base translation on these new attributes instead. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D96591	2021-02-12 14:09:24 +01:00
Stephan Herhut	4348d8ab7f	[mlir][math] Split off the math dialect. This does not split transformations, yet. Those will be done as future clean ups. Differential Revision: https://reviews.llvm.org/D96272	2021-02-12 10:55:12 +01:00
Frederik Gossen	4ef38f9c12	Add log1p lowering from standard to ROCDL intrinsics Differential Revision: https://reviews.llvm.org/D95129	2021-01-21 14:02:48 +01:00
Alex Zinenko	dd5165a920	[mlir] replace LLVM dialect float types with built-ins Continue the convergence between LLVM dialect and built-in types by replacing the bfloat, half, float and double LLVM dialect types with their built-in counterparts. At the API level, this is a direct replacement. At the syntax level, we change the keywords to `bf16`, `f16`, `f32` and `f64`, respectively, to be compatible with the built-in type syntax. The old keywords can still be parsed but produce a deprecation warning and will be eventually removed. Depends On D94178 Reviewed By: mehdi_amini, silvas, antiagainst Differential Revision: https://reviews.llvm.org/D94179	2021-01-08 17:38:12 +01:00
Alex Zinenko	2230bf99c7	[mlir] replace LLVMIntegerType with built-in integer type The LLVM dialect type system has been closed until now, i.e. did not support types from other dialects inside containers. While this has had obvious benefits of deriving from a common base class, it has led to some simple types being almost identical with the built-in types, namely integer and floating point types. This in turn has led to a lot of larger-scale complexity: simple types must still be converted, numerous operations that correspond to LLVM IR intrinsics are replicated to produce versions operating on either LLVM dialect or built-in types leading to quasi-duplicate dialects, lowering to the LLVM dialect is essentially required to be one-shot because of type conversion, etc. In this light, it is reasonable to trade off some local complexity in the internal implementation of LLVM dialect types for removing larger-scale system complexity. Previous commits to the LLVM dialect type system have adapted the API to support types from other dialects. Replace LLVMIntegerType with the built-in IntegerType plus additional checks that such types are signless (these are isolated in a utility function that replaced `isa<LLVMType>` and in the parser). Temporarily keep the possibility to parse `!llvm.i32` as a synonym for `i32`, but add a deprecation notice. Reviewed By: mehdi_amini, silvas, antiagainst Differential Revision: https://reviews.llvm.org/D94178	2021-01-07 19:48:31 +01:00
Tres Popp	9adc64539f	[mlir] Add std.powf to ROCDL lowering. Differential Revision: https://reviews.llvm.org/D93313	2020-12-15 18:47:49 +01:00
Frederik Gossen	1c6bc2c0b5	[MLIR] Add lowerings for atan and atan2 to ROCDL intrinsics Differential Revision: https://reviews.llvm.org/D93123	2020-12-14 10:43:19 +01:00
Adrian Kuegel	ada4c7a351	Add rsqrt lowering from standard to ROCDL. Add a lowering for rsqrt from standard dialect to ROCDL. Differential Revision: https://reviews.llvm.org/D93011	2020-12-11 13:18:57 +01:00
Adrian Kuegel	09f717b929	Add sqrt lowering from standard to ROCDL Add a lowering for sqrt from standard dialect to ROCDL. Differential Revision: https://reviews.llvm.org/D92921	2020-12-10 09:47:37 +01:00
Rob Suderman	5556575230	Added std.floor operation to match std.ceil There should be an equivalent std.floor op to std.ceil. This includes matching lowerings for SPIRV, NVVM, ROCDL, and LLVM. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D85940	2020-08-18 10:25:32 -07:00
Tobias Gysi	10643c9ad8	[mlir] make the bitwidth of device side index computations configurable (reland) Summary: The patch makes the index type lowering of the GPU to NVVM/ROCDL conversion configurable. It introduces a pass option that controls the bitwidth used when lowering index computations and uses the LowerToLLVMOptions structure to control the Standard to LLVM lowering. This commit fixes a use-after-free bug introduced by the reverted commit `d10b1a3`. It implements the following changes: - Added a getDefaultOptions method to the LowerToLLVMOptions struct that returns a reference to statically allocated default options. - Use the getDefaultOptions method to provide default LowerToLLVMOptions (instead of an initializer list). - Added comments to clarify the required lifetime of the LowerToLLVMOptions Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D82475	2020-06-29 12:22:39 +02:00
Tobias Gysi	2ff6fad700	Revert "[mlir] make the bitwidth of device side index computations configurable" This reverts commit `d10b1a38a7`.	2020-06-23 19:21:36 +02:00
Tobias Gysi	d10b1a38a7	[mlir] make the bitwidth of device side index computations configurable The patch makes the index type lowering of the GPU to NVVM/ROCDL conversion configurable. It introduces a pass option that controls the bitwidth used when lowering index computations. Differential Revision: https://reviews.llvm.org/D80285	2020-06-22 11:43:37 +02:00
Mehdi Amini	d31c9e5a46	Change filecheck default to dump input on failure Having the input dumped on failure seems like a better default: I debugged FileCheck tests for a while without knowing about this option, which really helps to understand failures. Remove `-dump-input-on-failure` and the environment variable FILECHECK_DUMP_INPUT_ON_FAILURE which are now obsolete. Differential Revision: https://reviews.llvm.org/D81422	2020-06-09 18:57:46 +00:00
Wen-Heng (Jack) Chung	bc23c1d85e	[mlir][rocdl] add rocdl.barier op. - Add rocdl.barrier op. - Lower gpu.barier to rocdl.barrier in -convert-gpu-to-rocdl. Differential Revision: https://reviews.llvm.org/D79126	2020-05-04 10:35:01 +02:00
Wen-Heng (Jack) Chung	9ad5e57316	[mlir][nvvm][rocdl] refactor NVVM and ROCDL dialect. NFC. - Extract common logic between -convert-gpu-to-nvvm and -convert-gpu-to-rocdl. - Cope with the fact that alloca operates on different addrspaces between NVVM and ROCDL. - Modernize unit tests for ROCDL dialect. Differential Revision: https://reviews.llvm.org/D79021	2020-05-01 00:13:26 +02:00

1 2

59 Commits