clang-p2996

Author	SHA1	Message	Date
Jakub Kuderski	80d5400d92	[mlir][spirv] Account for type conversion failures in scf-to-spirv Fixes: https://github.com/llvm/llvm-project/issues/59136 Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D141292	2023-01-09 11:35:47 -05:00
Johannes Reifferscheid	059cf735a9	Lower math.cbrt to NVVM/ROCDL. Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D141270	2023-01-09 13:17:35 +01:00
Alexander Shaposhnikov	9e1a344155	[MLIR][TOSA] Switch Tosa to DenseArrayAttr This diff completes switching Tosa to DenseArrayAttr. Test plan: ninja check-mlir check-all Differential revision: https://reviews.llvm.org/D141111	2023-01-06 22:57:14 +00:00
Rob Suderman	7ce53e3102	[mlir][tosa] Add tosa.conv3d lowering to Linalg Conv3D has an existing linalg operation for floating point. Adding a quantized variant and corresponding lowering from TOSA. Numerical correctness was validated using the TOSA conformance tests. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D140919	2023-01-06 10:47:45 -08:00
Thomas Raoux	7efdc117b1	[mlir][nvvm] Add lowering of gpu.printf to nvvm When converting to nvvm lowering gpu.printf to vprintf allows us to support printing when running on cuda. Differential Revision: https://reviews.llvm.org/D141049	2023-01-06 17:29:30 +00:00
mariecwhite	76dc9a853a	[mlir][tosa] Remove clamping behavior in `tosa.cast` for integer truncation Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D141015	2023-01-04 15:10:06 -08:00
Alexander Shaposhnikov	11030c7d67	[MLIR][TOSA] Switch Tosa_IntArrayAttr[N], Tosa_IntArrayAttrUpto[N] to DenseI64ArrayAttr Switch Tosa_IntArrayAttr[N], Tosa_IntArrayAttrUpto[N] to DenseI64ArrayAttr. Test plan: ninja check-mlir check-all Differential revision: https://reviews.llvm.org/D140748 https://reviews.llvm.org/D140829, https://reviews.llvm.org/D140832, https://reviews.llvm.org/D140833, https://reviews.llvm.org/D140834	2023-01-04 21:58:20 +00:00
Robert Walker	ca21499526	[mlir][tosa] Fix floating point offset for tosa.resize Offset is a signed value, so use `arith.sitofp` See also https://github.com/llvm/llvm-project/issues/59585 Reviewed By: NatashaKnk, jpienaar Differential Revision: https://reviews.llvm.org/D140958	2023-01-04 12:53:54 -08:00
Rob Suderman	b5a1de9c98	[mlir][tosa] Add broadcasting case for tosa.resize to linalg implementation When lowering tosa.resize it is possible there is a unary input dimension. Lowering to a new tosa.resize and explicit broadcast simplifies the tosa.resize operation to avoid recomputing the identical broadcasted values. This change reworks the broadcast optimization to reuse the tosa.resize generic implementation. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D139963	2023-01-03 14:29:06 -08:00
Johannes Reifferscheid	998a3a3894	Add a math.cbrt instruction and lowering to libm. There's currently no way to get accurate cube roots in the math dialect. powf(x, 1/3.0) is too inaccurate in some cases. Reviewed By: akuegel Differential Revision: https://reviews.llvm.org/D140842	2023-01-03 08:44:12 +01:00
Krzysztof Drewniak	f6076bd81f	[mlir][ROCDL] Translate known block size attributes to ROCDL 1. When converting from the GPU dialect to the ROCDL dialect, if the function that contains a gpu.thread_id or gpu.block_id op is annotated with gpu.known_{block,grid}_size, use that size to set a "range" attribute on the corresponding rocdl intrinsic so that the LLVM frontend can optimize based on that range information. 1b. When translating from the rocdl dialect to LLVM IR, use the "range" attribute, if present, to set !range metadata on the relevant function call. 2. Deprecate the old rocdl.max_flat_work_group_size attribute, which was used in a tensorflow backend. Instead, use rocdl.flat_work_group_size going forward to allow kernel generators to specify the minimum and maximum work group sizes a kernel may be launched with in one attribute, thus more closely matching the backend. 3. When translating from gpu.func to llvm.func within gpu-to-rocdl, copy the known_block_size attribute as rocdl.reqd_work_group_size to enable further translations to set the corresponding metadata on the LLVM IR function. Also, set the rocdl.flat_work_group_size attribute to ensure that the reqd_work_group_size metadata and the amdgpu-flat-work-group-size metadata are consistent. 3b. Extend the ROCDL to LLVM IR translation to set the !reqd_work_group_size metadata on LLVM functions Also update tests and add functions to the ROCDL dialect to ensure attribute names are used consistently. Depends on D139865 Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D139866	2023-01-02 21:04:13 +00:00
Ivan Butygin	2e4aa3bd83	[mlir][gpu][spirv] Lower gpu reduction ops to spirv Supports only "add" and "mul" ops for now. More ops will be added later. Differential Revision: https://reviews.llvm.org/D140576	2022-12-30 17:44:08 +01:00
Lei Zhang	56c069887b	[mlir][spirv] Fail vector.bitcast conversion with different bitwidth Depending on the target environment, we may need to emulate certain types, which can cause issue with bitcast. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D140437	2022-12-29 15:43:55 -08:00
Benoit Jacob	eec575e548	Allow non-constant divisors in affine mod, floordiv, ceildiv. The requirement that divisor>0 is not enforced here outside of the constant case, but how to enforce it? If I understand correctly, it is UB and while it is nice to be able to deterministically intercept UB, that isn't always feasible. Hopefully, keeping the existing enforcement in the constant case is enough. Differential Revision: https://reviews.llvm.org/D140079	2022-12-17 02:24:02 +00:00
Rob Suderman	b37a0318cb	[mlir][tosa] Make tosa.resize to linalg avoid redundant loads for unit width When using a tosa resize for ?x1x1x? to ?x1x?x? we should avoid doing a 2D interpolation as only two unique values are loaded. As the extract operation performs numerical computation on its values the superfluous extracts may fail to be coalesced. Instead we only interpolate between the values if there are multiple values to interpolate between. For the integer case we also perform scaling by the scaling-factor to apply the same integer scaling behavior as interpolation. Reviewed By: jpienaar, NatashaKnk Differential Revision: https://reviews.llvm.org/D139979	2022-12-15 16:22:46 -08:00
Lei Zhang	f1db4aec30	[mlir][VectorToGPU] Support transposed+broadcasted 2D MMA load This is loading from 2-D memref, in addition to D139655 where we load from 1-D memref cases. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D140136	2022-12-15 19:34:32 +00:00
Lei Zhang	dbddd4f6a4	[mlir][VectorToGPU] Support transposed+broadcasted 1D MMA load This is now possible with transpose semantics on subgroup MMA load ops. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D139655	2022-12-15 19:22:35 +00:00
Quinn Dawkins	b05b8970d8	[mlir][gpu][spirv] Verify elementwise op type as mulf when converting to spirv.MatrixTimesScalar Conversion from gpu.subgroup_mma_constant_matrix to spirv.MatrixTimesScalar didn't check that the op type was a multiplication and thus would incorrectly convert other elementwise scalar operations. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D140081	2022-12-15 03:15:04 +00:00
Jakub Kuderski	4f47677dee	[mlir][arith][spirv] Account for possible type conversion failures Check results of all type conversions in `--convert-arith-to-spirv`. Fixes: https://github.com/llvm/llvm-project/issues/59496 Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D140033	2022-12-14 19:32:40 -05:00
Slava Zakharin	70174b8035	[mlir][math] Added math::FPowI conversion to LLVM dialect. The operations are converted into LLVM::PowIOp. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D129812	2022-12-14 10:15:05 -08:00
Ivan Butygin	247d8d4f7a	[mlir][gpu] Add `uniform` flag to gpu reduction ops Differential Revision: https://reviews.llvm.org/D138758	2022-12-14 13:15:58 +01:00
Slava Zakharin	22702cc76c	[mlir][math] Added math::FPowI conversion to calls of outlined implementations. Power functions are implemented as linkonce_odr scalar functions for FPowI operations met in a module. Vector form of FPowI is linearized into a sequence of calls of the scalar functions. Option {min-width-of-fpowi-exponent} controls which FPowI operations are converted by MathToFuncs: if the width of the exponent's integer type is less than the specified value, then the operation is not converted. Flang will specify {min-width-of-fpowi-exponent=33} to make sure that math::FPowI operations with exponent wider than 32 bits will be converted by MathToFuncs, and operations with more narrow exponent will be left for MathToLLVM to convert them to LLVM::PowIOp. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D139804	2022-12-13 12:15:35 -08:00
Rob Suderman	78503e1a2f	[mlir][tosa] Refactor tosa.resize Moved to using helper lambdas to avoid code repetition. IR needed to be reordered to accommodate which should be the only changes to the existing tests. This changes the quantized test to target `i48` types to guarantee types are extended correctly when necessary. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D136500	2022-12-12 14:38:38 -08:00
Jakub Kuderski	f39b47264e	[mlir][arith][tosa] Use extended mul in 32-bit `tosa.apply_scale` To not introduce 64-bit types that may be difficult to handle for some targets. Reviewed By: rsuderman, antiagainst Differential Revision: https://reviews.llvm.org/D139777	2022-12-12 14:39:58 -05:00
Benjamin Chetioui	a6c8f06f55	[mlir] Clean up typos in FileCheck directives in various tests. Reviewed By: tpopp Differential Revision: https://reviews.llvm.org/D139698	2022-12-12 09:29:14 +01:00
Jakub Kuderski	285d321a85	[mlir][arith] Define mulsi_extended op Extend D139688 with the signed version of the extended multiplication op. Add conversion to the SPIR-V and LLVM dialects. This was originally proposed in: https://discourse.llvm.org/t/rfc-arith-add-extended-multiplication-ops/66869. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D139743	2022-12-09 20:25:31 -05:00
Jakub Kuderski	b4bdcea214	[mlir][arith] Define mului_extended op Add conversion to the SPIR-V and LLVM dialects. This was originally proposed in: https://discourse.llvm.org/t/rfc-arith-add-extended-multiplication-ops/66869. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D139688	2022-12-09 17:37:06 -05:00
Guray Ozen	b2bba5b65c	[mlir][spirv] Support conversion of `CopySignOp` to spirv for 1D vector with 1 element Conversion of CopySignOp to SPIRV is supported for scalars and vectors but not 1D vectors with 1 element (aka vector<1xf32>). This revision adds support for this by treating them as scalars. An alternative solution would be to allow 0D vectors for SPIRV, but the spec [0] strictly defines the vector type as non-0D. "Vector: An ordered homogeneous collection of two or more scalars. Vector sizes are quite restrictive and dependent on the execution model." [0] https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_types Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D139518	2022-12-08 09:11:27 +01:00
Jakub Kuderski	28246b7e75	[mlir][arith] Rename addui_carry to addui_extended The goal is to make the naming of the future `_extended` ops more consistent. With unsigned addition, the carry value/flag and overflow bit are the same, but this is not true when it comes to signed addition. Also rename the second result from `carry` to `overflow`. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D139569	2022-12-07 17:15:56 -05:00
Rob Suderman	8e7630ece1	[mlir][tosa] Fix tosa.resize for i48 accumulator Implementation assumed an i32 accumulator. Fixed the implementation to work with an i48 accumulator. Reviewed By: NatashaKnk Differential Revision: https://reviews.llvm.org/D139365	2022-12-07 11:27:33 -08:00
Ramkumar Ramachandra	2a19625424	mlir/tosa: move tosa.pad from Linalg to Tensor conversion Since tosa.pad is lowered strictly to arith and tensor ops, move ConvertPad from TosaToLinalg to TosaToTensor, benefitting non-Linalg Tosa targets. TensorToLinalg exists, and is trivial, so nothing is lost. Signed-off-by: Ramkumar Ramachandra <r@artagnon.com> Differential Revision: https://reviews.llvm.org/D139091	2022-12-06 07:39:29 +01:00
Lei Zhang	2c7827da4f	[mlir][spirv] Add GPU subgroup MMA to spirv.MMAMatrixTimesScalar Along the way, make the default pattern fail instead of crashing when an elementwise op is not supported yet. Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D139280	2022-12-05 22:30:50 +00:00
Rob Suderman	58fa8426ff	[mlir][tosa] Handle tosa.resize nearest rounding correctly Rounding of tosa.resize did not handle rounding to the nearest pixel correctly. Rather than dividing the scale by 2 we should double the partial pixel to guarantee we include a check on the lowest bit. Reviewed By: NatashaKnk Differential Revision: https://reviews.llvm.org/D139162	2022-12-05 13:10:08 -08:00
Navdeep Katel	3d35546cd1	Support `transpose` mode for `gpu.subgroup` WMMA ops Add support for loading, computing, and storing `gpu.subgroup` WMMA ops in transpose mode as well. Update the GPU to NVVM lowerings to support `transpose` mode and update integration tests as well. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D139021	2022-12-05 22:37:02 +05:30
Ramkumar Ramachandra	1e33330e29	mlir/TosaToTensor: fix typos in test This patch fixes a misspelt CHECK-LABEL in tosa-to-tensor.mlir. Signed-off-by: Ramkumar Ramachandra <r@artagnon.com> Differential Revision: https://reviews.llvm.org/D139085	2022-12-03 09:57:10 +01:00
Quentin Colombet	786cbb09ed	Re-apply "[mlir][MemRefToLLVM] Remove the code for lowering subview" This reverts commit `d0650d1089`. Original commit message: Subviews are supposed to be expanded before we hit the lowering code. The expansion is done with the pass called expand-strided-metadata. Add a test that demonstrates how these passes can be linked up to achieve the desired lowering. This patch is NFC in spirit but not in practice because `subview` gets lowered into `reinterpret_cast(extract_strided_metadata, <some math>)` which lowers into two memref descriptors (one for `reinterpret_cast` and one for `extract_strided_metadata`), which creates some noise of the form: `extractvalue(unrealized_cast(extractvalue[0]))[0]` that is currently not simplified within MLIR but that is really just a noop in that case. Differential Revision: https://reviews.llvm.org/D136377	2022-12-02 15:26:58 +00:00
Quentin Colombet	d0650d1089	Revert "[mlir][MemRefToLLVM] Remove the code for lowering subview" This reverts commit `c8e15afa4c`. This breaks some integration tests, see https://lab.llvm.org/buildbot/#/builders/220/builds/10446 I have to update a bunch of RUN lines in the tests to use the new lowering scheme. Nothing complicated but let's keep the build clean while I'm fixing that.	2022-12-02 14:19:37 +00:00
Quentin Colombet	c8e15afa4c	[mlir][MemRefToLLVM] Remove the code for lowering subview Subviews are supposed to be expanded before we hit the lowering code. The expansion is done with the pass called expand-strided-metadata. Add a test that demonstrates how these passes can be linked up to achieve the desired lowering. This patch is NFC in spirit but not in practice because `subview` gets lowered into `reinterpret_cast(extract_strided_metadata, <some math>)` which lowers into two memref descriptors (one for `reinterpret_cast` and one for `extract_strided_metadata`), which creates some noise of the form: `extractvalue(unrealized_cast(extractvalue[0]))[0]` that is currently not simplified within MLIR but that is really just a noop in that case. Differential Revision: https://reviews.llvm.org/D136377	2022-12-02 10:17:06 +00:00
Manish Gupta	9774cd17e8	[mlir][nvgpu] Fix affine maps computing indices for LdMatrixOp srcMemref This patch fixes and simplifies the ldmatrix affine map arithmetic by abstracting the affine expressions in terms of pitch-linear layout (strided and contiguous dimensions). Then it applies the maps for strided and contiguous dimensions in row-major and col-major. LdMatrixOp collaboratively (32 threads in a warp) load tiles (8 row x 128b col) of data. It can load either x1, x2, x4 tiles. Additionally, it can transpose at 16-bit granularity when moving data from the Shared Memory to registers. This patch fixes affine map: (laneid -> coordinate index a thread points in a tile). - Loading x4 tiles needs all 32 lanes T0-31 point to a contiguous chunk of 128b. The issue was exposed when running this case. - Loading x2 tiles and x1 needs T0-15 threads and T0-7 threads points to contiguous chunk of 128b. The patch is NFC for these cases. Differential Revision: https://reviews.llvm.org/D138978	2022-12-01 18:26:33 -08:00
Nicolas Vasilache	3af6438372	Revert "[WIP] Add support for MMA conversion for 1-D vector.transfer followed by a broadcast to 2-D" This reverts commit `7db25f78db`. This was mistakenly stacked below (and committed) along with an NFC change.	2022-12-01 02:57:03 -08:00
Nicolas Vasilache	7db25f78db	[WIP] Add support for MMA conversion for 1-D vector.transfer followed by a broadcast to 2-D Differential Revision: https://reviews.llvm.org/D139040	2022-12-01 02:49:47 -08:00
Lei Zhang	ff81cc824f	[mlir][spirv] Improve vector extract/insert element conversion * Fix type conversions around positions--we need to use the converted value from the adaptor. * Convert constant position cases to composite extract/insert. Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D139057	2022-12-01 00:35:41 +00:00
Jakub Kuderski	9ad215bb3d	[mlir][spirv] Drop experimental LinalgToSPIRV pass This experimental pass is unused and obsolete. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D139056	2022-11-30 19:25:40 -05:00
Lei Zhang	52ca149931	[mlir][spirv] Allow controlling subgroup size This commit extends the `ResourceLimitsAttr` to support specifying a minimal and maximal subgroup size, and extends `EntryPointABIAttr` to support specifying the requested subgroup size. This is possible now in Vulkan with the VK_EXT_subgroup_size_control extension. For OpenCL it's possible to use the `SubgroupSize` execution mode directly. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D138962	2022-11-30 12:34:09 -05:00
Diego Caballero	eb7e2998d1	Reland "[mlir][Vector] Re-define masking semantics in vector.transfer ops"" This relands commit `847b5f82a4`. Differential Revision: https://reviews.llvm.org/D138079	2022-11-29 03:36:54 +00:00
Quinn Dawkins	c0321edc26	[mlir][gpu] Adding support for transposed mma_load_matrix Enables transposed gpu.subgroup_mma_load_matrix and updates the lowerings in Vector to GPU and GPU to SPIRV. Needed to enable B transpose matmuls lowering to wmma ops. Taken over from author: stanley-nod <stanley@nod-labs.com> Reviewed By: ThomasRaoux, antiagainst Differential Revision: https://reviews.llvm.org/D138770	2022-11-29 03:35:49 +00:00
Diego Caballero	f6d90055fd	[mlir][Vector] Remove 'lower-permutation-maps' option from VectorToSCF This patch is part of a larger simplification effort of vector transfer operations. It removes the flag `lower-permutation-maps` from VectorToSCF conversion and enables the lowering of permutation maps by default. This means that VectorToSCF will always lower permutation maps to independent broadcast/transpose operations before lowering vector operations to SCF. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D138742	2022-11-28 23:56:43 +00:00
Hanhan Wang	0a1569a400	[mlir][NFC] Remove trailing whitespaces from `.td` and `.mlir` files. This is generated by running ``` sed --in-place 's/[[:space:]]\+$//' mlir/*/.td sed --in-place 's/[[:space:]]\+$//' mlir/*/.mlir ``` Reviewed By: rriddle, dcaballe Differential Revision: https://reviews.llvm.org/D138866	2022-11-28 15:26:30 -08:00
Thomas Raoux	df47f3ea0d	[mlir][spirv] Add lowering for gpu shuffle idx Differential Revision: https://reviews.llvm.org/D138863	2022-11-28 22:17:19 +00:00
Luca Boasso	4f9c9295a6	[mlir][index] Add and, or, and xor ops This patch adds the and, or, and xor bitwise operations to the index dialects with folders and LLVM lowerings. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D138590	2022-11-23 13:26:02 -06:00

1163 Commits