Commit Graph

109 Commits

Author SHA1 Message Date
Krzysztof Drewniak
4cba5957e6 [mlir][ROCDL] Set the LLVM data layout when lowering to ROCDL LLVM (#74501)
In order to ensure operations lower correctly (especially
memref.addrspacecast, which relies on the data layout being set
correctly when dealing with dynamic memrefs) and to prevent compilation
issues later down the line, set the `llvm.data_layout` attribute on GPU
modules when lowering their contents to a ROCDL / AMDGPU target.

If there's a good way to test the embedded string to prevent it from
going out of sync with the LLVM TargetMachine, I'd appreciate hearing
about it. (Or, alternatively, if there's a place I could factor the
string out to).
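
For reference, a minimal sketch of attaching such a layout string through the generic attribute API (the helper name and the layout argument below are placeholders, not the pass's actual code):

```
// Hypothetical helper, not the pass implementation: attach a data layout
// string to a GPU module as the discardable `llvm.data_layout` attribute.
#include "llvm/ADT/StringRef.h"
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/Operation.h"

void setRocdlDataLayout(mlir::Operation *gpuModule,
                        llvm::StringRef layout /* e.g. the AMDGPU layout */) {
  mlir::MLIRContext *ctx = gpuModule->getContext();
  gpuModule->setAttr("llvm.data_layout", mlir::StringAttr::get(ctx, layout));
}
```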
2024-02-27 09:59:50 -06:00
Mehdi Amini
45c226d452 [MLIR] Add ODS support for generating helpers for dialect (discardable) attributes (#77024)
This is a new ODS feature that allows dialects to define a list of
key/value pairs representing an attribute type and a name.
This will generate helper classes on the dialect to be able to
manage discardable attributes on operations in a type-safe way.

For example, the `test` dialect can define:

```
  let discardableAttrs = (ins
     "mlir::IntegerAttr":$discardable_attr_key,
  );
```

And the following will be generated in the TestDialect class:

```
/// Helper to manage the discardable attribute `discardable_attr_key`.
class DiscardableAttrKeyAttrHelper {
  ::mlir::StringAttr name;
public:
  static constexpr ::llvm::StringLiteral getNameStr() {
    return "test.discardable_attr_key";
  }
  constexpr ::mlir::StringAttr getName() {
    return name;
  }

  DiscardableAttrKeyAttrHelper(::mlir::MLIRContext *ctx)
      : name(::mlir::StringAttr::get(ctx, getNameStr())) {}

  mlir::IntegerAttr getAttr(::mlir::Operation *op) {
    return op->getAttrOfType<mlir::IntegerAttr>(name);
  }
  void setAttr(::mlir::Operation *op, mlir::IntegerAttr val) {
    op->setAttr(name, val);
  }
  bool isAttrPresent(::mlir::Operation *op) {
    return op->hasAttrOfType<mlir::IntegerAttr>(name);
  }
  void removeAttr(::mlir::Operation *op) {
    assert(op->hasAttrOfType<mlir::IntegerAttr>(name));
    op->removeAttr(name);
  }
};
DiscardableAttrKeyAttrHelper getDiscardableAttrKeyAttrHelper() {
  return discardableAttrKeyAttrName;
}
```

User code having an instance of the TestDialect can then manipulate this
attribute on operations using:

```
  auto helper = testDialect.getDiscardableAttrKeyAttrHelper();

  helper.setAttr(op, value);
  helper.isAttrPresent(op);
  ...
```
2024-02-19 23:30:03 -08:00
Hugo Trachino
65066c0277 [mlir] Use create instead of createOrFold for ConstantOp as folding has no effect (NFC) (#80129)
This aims to clean up confusing uses of
builder.createOrFold<ConstantOp>, since folding a ConstantOp has no effect.
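
As an illustrative sketch (assuming the arith dialect; the helper below is hypothetical), the cleanup amounts to:

```
// Sketch of the cleanup: a constant op has nothing to fold, so plain
// create<> states the intent more clearly than createOrFold<>.
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/IR/Builders.h"

mlir::Value makeIndexZero(mlir::OpBuilder &builder, mlir::Location loc) {
  // Before: builder.createOrFold<mlir::arith::ConstantIndexOp>(loc, 0);
  return builder.create<mlir::arith::ConstantIndexOp>(loc, 0).getResult();
}
```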
2024-01-31 23:40:37 -08:00
Guray Ozen
391a7577e7 [mlir][gpu] Add lowering dynamic_shared_memory op for rocdl (#74473)
This PR adds lowering of `gpu.dynamic_shared_memory` to the ROCDL target.
2023-12-05 19:56:43 +01:00
Christian Ulmann
4279a642fb [MLIR][GPUToROCDL] Remove typed pointer support (#70908)
This commit removes support for lowering the GPU dialect to the ROCDL dialect with
typed pointers. Typed pointers have been deprecated for a while now, and
they are planned to be removed from the LLVM dialect soon.

Related PSA:
https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502
2023-11-01 10:13:06 +01:00
Adrian Kuegel
baf2d13519 [mlir][GPUToROCDL] Lower arith.remf to GPU intrinsic.
Differential Revision: https://reviews.llvm.org/D159423
2023-09-04 14:05:04 +02:00
Stanley Winata
1896096002 [mlir][ROCM] Add Wave/Warp shuffle lowering and op for ROCM.
Reductions are heavily used in many DL workloads, especially in
softmax/attention layers. Wave/warp shuffle-based reduction is known to be
a fast and efficient way to implement them.

In this patch we introduce AMD shuffle intrinsic ops to ROCDL, along with their corresponding lowering from gpu.shuffle. This should speed up many DL workloads on the ROCm backend. Currently, we support the xor and idx modes, which are the more common ones. In the future, we plan to add support for down and up, as well as to use ds_swizzle to further improve performance when the width and offsets are constant.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D158684
2023-08-24 17:35:34 -07:00
Krzysztof Drewniak
51b65d0895 [mlir][AMDGPU] Improve BF16 handling through AMDGPU compilation
Many previous sets of AMDGPU dialect code have been incorrect in the
presence of the bf16 type (when lowered to LLVM's bfloat) as they were
developed in a setting that ran a custom bf16-to-i16 pass before LLVM
lowering.

An overall effect of this patch is that you should run
--arith-emulate-unsupported-floats="source-types=bf16 target-type=f32"
on your GPU module before calling --convert-gpu-to-rocdl if your code
performs bf16 arithmetic.
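
A sketch of that recommended ordering as a textual pipeline (the pass names come from this message; the gpu.module nesting and the helper below are assumptions, not part of the patch):

```
// Sketch only: build the suggested bf16 pre-lowering pipeline from its
// textual form. The gpu.module anchoring is an assumption.
#include "mlir/Pass/PassManager.h"
#include "mlir/Pass/PassRegistry.h"

mlir::LogicalResult buildBf16RocdlPipeline(mlir::PassManager &pm) {
  return mlir::parsePassPipeline(
      "gpu.module(arith-emulate-unsupported-floats{source-types=bf16 "
      "target-type=f32},convert-gpu-to-rocdl)",
      pm);
}
```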

While LLVM now supports software bfloat, initial experiments showed
that using this support on AMDGPU inserted a large number of
conversions around loads and stores which had substantial performance
impacts. Furthermore, all of the native AMDGPU operations on bf16
types (like the WMMA operations) operate on 16-bit integers instead of
the bfloat type.

First, we make the following changes to preserve compatibility once
the LLVM bfloat type is reenabled.
1. The matrix multiplication operations (MFMA and WMMA) will bitcast
bfloat vectors to i16 vectors.
2. Buffer loads and stores will operate on the relevant integer
datatype and then cast to bfloat if needed.

Second, we add type conversions to convert bf16 and vectors of it to
equivalent i16 types.

Third, we add the bfloat <-> f32 expansion patterns to the set of
operations run before the main LLVM conversion so that MLIR's
implementation of these conversion routines is used.

Finally, we extend the "floats treated as integers" support in the
LLVM exporter to handle types other than fp8.

We also fix a bug in the unsupported floats emulation where it tried
to operate on `arith.bitcast` due to an oversight.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D156361
2023-08-17 18:31:28 +00:00
Nicolas Vasilache
888717e853 [mlir][transform] Enable gpu-to-nvvm via conversion patterns driven by TD
This revision untangles a few more conversion pieces and allows rewriting
the relatively intricate (and somewhat inconsistent) LowerGpuOpsToNVVMOpsPass
in a declarative fashion that provides much better understanding and control.

Differential Revision: https://reviews.llvm.org/D157617
2023-08-10 15:30:48 +00:00
SJW
cdf7ca6db7 [MLIR][ROCDL] Add conversion for gpu.lane_id to ROCDL
Creates a rocdl.lane_id op with an LLVM conversion to:

  __device__ static unsigned int __lane_id() {
      return  __builtin_amdgcn_mbcnt_hi(
                 -1, __builtin_amdgcn_mbcnt_lo(-1, 0));
  }

Reviewed By: krzysz00

Differential Revision: https://reviews.llvm.org/D154666
2023-07-26 15:12:48 +00:00
Matthias Springer
61223c49dd [mlir][GPU] Rename MLIRGPUOps CMake target to MLIRGPUDialect
This is for consistency with other dialects.

Differential Revision: https://reviews.llvm.org/D150659
2023-05-16 14:25:08 +02:00
Manupa Karunaratne
8c3a8d17c8 [MLIR][ROCDL] add gpu to rocdl erf support
This commit adds a lowering of a lib function
call to support erf in ROCDL.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D150355
2023-05-15 14:42:52 +00:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the methods to the
corresponding function calls, as the latter have been decided to be more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this change does not currently include work to
support a functional cast/isa call for them.
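
For illustration, a small sketch of the migration direction (the helper function is hypothetical):

```
// Sketch of the preferred style: free functions instead of the member
// cast/dyn_cast/isa on Type/Attribute/Value.
#include "mlir/IR/BuiltinTypes.h"

bool isWideInteger(mlir::Type type) {
  // Before: type.isa<mlir::IntegerType>() / type.dyn_cast<mlir::IntegerType>()
  auto intType = mlir::dyn_cast<mlir::IntegerType>(type);
  return intType && intType.getWidth() > 32;
}
```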

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Markus Böck
0e5aeae6f5 [mlir][GPUToLLVM] Add support for emitting opaque pointers
Part of https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179

This patch adds the new pass option `use-opaque-pointers` to the GPU to LLVM lowerings (including ROCDL and NVVM) and adapts the code to support using opaque pointers in addition to typed pointers.
The required changes mostly boil down to avoiding `getElementType` and specifying base types in GEP and Alloca.

In the future opaque pointers will be the only supported model, hence tests have been ported to using opaque pointers by default. Additional regression tests for typed-pointers have been added to avoid breaking existing clients.

Note: This does not yet port the `GpuToVulkan` passes.

Differential Revision: https://reviews.llvm.org/D144448
2023-02-21 20:46:33 +01:00
Krzysztof Drewniak
499abb243c Add generic type attribute mapping infrastructure, use it in GpuToX
Remapping memory spaces is a function often needed in type
conversions, most often when going to LLVM or to/from SPIR-V (a future
commit), and it is possible that such remappings may become more
common in the future as dialects take advantage of the more generic
memory space infrastructure.

Currently, memory space remappings are handled by running a
special-purpose conversion pass before the main conversion that
changes the address space attributes. In this commit, this approach is
replaced by adding a notion of type attribute conversions to
TypeConverter, which is then used to convert memory space attributes.

Then, we use this infrastructure throughout the *ToLLVM conversions.
This has the advantage of loosening the requirements on the inputs to
those passes from "all address spaces must be integers" to "all
memory spaces must be convertible to integer spaces", a looser
requirement that reduces the coupling between portions of MLIR.

On top of that, this change leads to the removal of most of the calls
to getMemorySpaceAsInt(), bringing us closer to removing it.

(A rework of the SPIR-V conversions to use this new system will be in
a followup commit.)

As a note, one long-term motivation for this change is that I would
eventually like to add an allocaMemorySpace key to MLIR data layouts
and then call getMemRefAddressSpace(allocaMemorySpace) in the
relevant *ToLLVM in order to ensure all alloca()s, whether incoming or
produced during the LLVM lowering, have the correct address space for
a given target.

I expect that the type attribute conversion system may be useful in
other contexts.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D142159
2023-02-09 18:00:46 +00:00
Quentin Colombet
cb4ccd38fa [mlir][Conversion] Rename the MemRefToLLVM pass
Since the recent MemRef refactoring that centralizes the lowering of
complex MemRef operations outside of the conversion framework, the
MemRefToLLVM pass doesn't directly convert these complex operations.

Instead, to fully convert the whole MemRef dialect space, MemRefToLLVM
needs to run after `expand-strided-metadata`.
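
For example, the intended ordering can be expressed as a textual pipeline (appending to a top-level module pass manager is an assumption; the helper is hypothetical):

```
// Sketch: the required ordering, built by appending the two passes from
// their textual names.
#include "mlir/Pass/PassManager.h"
#include "mlir/Pass/PassRegistry.h"

mlir::LogicalResult buildMemRefLoweringTail(mlir::PassManager &pm) {
  return mlir::parsePassPipeline(
      "expand-strided-metadata,finalize-memref-to-llvm", pm);
}
```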

Make this more obvious by changing the name of the pass and the option
associated with it from `convert-memref-to-llvm` to
`finalize-memref-to-llvm`.
The word "finalize" conveys that this pass needs to run after something
else and that something else is documented in its tablegen description.

This is a follow-up patch related to the conversation at:
https://discourse.llvm.org/t/psa-you-need-to-run-expand-strided-metadata-before-memref-to-llvm-now/66956/14

Differential Revision: https://reviews.llvm.org/D142463
2023-01-27 09:10:10 +00:00
Mehdi Amini
4f1e244eb5 Add missing dependent dialects to "convert-gpu-to-rocdl"
Fixes #60198
2023-01-22 03:28:49 +01:00
Christopher Bate
60dd937d56 [mlir][gpu] Fix build failure / silence windows build warnings
Fixes Windows build failure (C4715) caused by
6ca1a09f03.
2023-01-13 13:40:57 -07:00
Christopher Bate
6ca1a09f03 [mlir][gpu] Migrate hard-coded address space integers to an enum attribute (gpu::AddressSpaceAttr)
This is a purely mechanical change that introduces an enum attribute in the GPU
dialect to represent the various memref memory spaces as opposed to the
hard-coded integer attributes that are currently used.

The following steps were taken to make the transition across the codebase:

1. Introduce a pass "gpu-lower-memory-space-attributes":

The pass updates all memref types that have a memory space attribute that is a
`gpu::AddressSpaceAttr`. These attributes are changed to `IntegerAttr`s using a
mapping that is given by the caller. This pass is based on the
"map-memref-spirv-storage-class" pass and the common functions can probably
be refactored into a set of utilities under the MemRef dialect.

2. Update the verifiers of GPU/NVGPU dialect operations.

If a verifier currently checks the address space of an operand using
e.g.`getWorkspaceAddressSpace`, then it can continue to do so. However, the
checks are changed to only fail if the memory space is either missing or a wrong
value of type `gpu::AddressSpaceAttr`. Otherwise, it just assumes the address
space is correct because it was specifically lowered to something other than a
`gpu::AddressSpaceAttr`.

3. Update existing gpu-to-llvm conversion infrastructure.

In the existing gpu-to-X passes, we add a full conversion equivalent to
`gpu-lower-memory-space-attributes` just before doing the conversion to the
LLVMDialect. This is done because currently both the gpu-to-llvm passes
(rocdl,nvvm) run gpu-to-gpu rewrites within the pass, which introduce
`AddressSpaceAttr` memory space annotations. Therefore, I inserted the
memory space conversion between the gpu-to-gpu rewrites and the LLVM
conversion.

For more context see the below discourse discussion:
https://discourse.llvm.org/t/gpu-workgroup-shared-memory-address-space-is-hard-coded/
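
As a small sketch of what the enum attribute looks like in use (the memref shape, element type, and helper below are arbitrary illustrations, not code from this change):

```
// Sketch: tag a memref type with the GPU workgroup address space via the
// enum attribute instead of a hard-coded integer memory space.
#include "mlir/Dialect/GPU/IR/GPUDialect.h"
#include "mlir/IR/BuiltinTypes.h"

mlir::MemRefType makeWorkgroupMemRef(mlir::MLIRContext *ctx) {
  auto wgSpace = mlir::gpu::AddressSpaceAttr::get(
      ctx, mlir::gpu::AddressSpace::Workgroup);
  return mlir::MemRefType::get({16, 16}, mlir::Float32Type::get(ctx),
                               /*layout=*/mlir::MemRefLayoutAttrInterface(),
                               /*memorySpace=*/wgSpace);
}
```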

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D140644
2023-01-13 11:00:10 -07:00
Goran Flegar
b7fe0d346b Lower math.tan to __nv_tan[f] / __ocml_tan_f{32|64}
At present math.tan fails to lower for NVVM and ROCDL.

Differential Revision: https://reviews.llvm.org/D141505
2023-01-12 15:26:11 +01:00
Johannes Reifferscheid
059cf735a9 Lower math.cbrt to NVVM/ROCDL.
Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D141270
2023-01-09 13:17:35 +01:00
Krzysztof Drewniak
f6076bd81f [mlir][ROCDL] Translate known block size attributes to ROCDL
1. When converting from the GPU dialect to the ROCDL dialect, if the
function that contains a gpu.thread_id or gpu.block_id op is annotated
with gpu.known_{block,grid}_size, use that size to set a "range"
attribute on the corresponding rocdl intrinsic so that the LLVM
frontend can optimize based on that range information.
1b. When translating from the rocdl dialect to LLVM IR, use the
"range" attribute, if present, to set !range metadata on the relevant
function call.
2. Deprecate the old rocdl.max_flat_work_group_size attribute, which
was used in a tensorflow backend. Instead, use
rocdl.flat_work_group_size going forward to allow kernel generators to
specify the minimum and maximum work group sizes a kernel may be
launched with in one attribute, thus more closely matching the backend.
3. When translating from gpu.func to llvm.func within gpu-to-rocdl,
copy the known_block_size attribute as rocdl.reqd_work_group_size to
enable further translations to set the corresponding metadata on the
LLVM IR function. Also, set the rocdl.flat_work_group_size attribute
to ensure that the reqd_work_group_size metadata and the
amdgpu-flat-work-group-size metadata are consistent.
3b. Extend the ROCDL to LLVM IR translation to set the
!reqd_work_group_size metadata on LLVM functions

Also update tests and add functions to the ROCDL dialect to ensure
attribute names are used consistently.

Depends on D139865

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D139866
2023-01-02 21:04:13 +00:00
Christian Sigg
b251b608b5 [mlir][gpu] Unroll ops on vectors which map to intrinsic calls
Unroll ops that map to intrinsics when lowering to LLVM, because intrinsics don't support vector operands/results.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D136345
2022-10-28 10:33:38 +02:00
Jakub Kuderski
abc362a107 [mlir][arith] Change dialect name from Arithmetic to Arith
Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22.

Tested with:
`ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples`

and `bazel build --config=generic_clang @llvm-project//mlir:all`.

Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini

Differential Revision: https://reviews.llvm.org/D134762
2022-09-29 11:23:28 -04:00
Michele Scuttari
67d0d7ac0a [MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838
2022-08-31 12:28:45 +02:00
Michele Scuttari
039b969b32 Revert "[MLIR] Update pass declarations to new autogenerated files"
This reverts commit 2be8af8f0e.
2022-08-30 22:21:55 +02:00
Michele Scuttari
2be8af8f0e [MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838
2022-08-30 21:56:31 +02:00
Jeff Niu
00f7096d31 [mlir][math] Rename math.abs -> math.absf
To make room for introducing `math.absi`.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D131325
2022-08-08 11:04:58 -04:00
Krzysztof Drewniak
c2fc8d9b95 [mlir][GPU] Allow bare pointer memrefs when calling GPU kernels
In the ROCm runtime (and probably CUDA as well), all kernel arguments
are aligned. Therefore, enable using bare pointers for memref
arguments to kernels when these memrefs have static shape and a
trivial layout.

This is a substantial optimization to launching kernels that use
memrefs with known, static sizes, since it causes the kernel launch
packet to no longer include information already known to the kernel,
which can enable packing the kernel launch arguments into launch
packets instead of having to allocate an entire separate structure to
hold unneeded memref information.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D130716
2022-08-02 20:58:34 +00:00
Krzysztof Drewniak
d6ef3d20b4 [mlir] Remove VectorToROCDL
Given issues such as
https://github.com/llvm/llvm-project/issues/56323, the fact that this
lowering (unlike the code in amdgpu-to-rocdl) does not correctly set
up bounds checks (and thus will cause page faults on reads that might
need to be padded instead), and the fact that fixing these problems would
essentially involve replicating amdgpu-to-rocdl, remove
--vector-to-rocdl for being broken. In addition, the lowering does not
support many aspects of transfer_{read,write}, like supervectors, and
may not work correctly in their presence.

We (the MLIR-based convolution generator at AMD) do not use this
conversion pass, nor are we aware of any other clients.

Migration strategies:
- Use VectorToLLVM
- If buffer ops are particularly needed in your application, use
amdgpu.raw_buffer_{load,store}

A VectorToAMDGPU pass may be introduced in the future.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D129308
2022-07-12 15:21:22 +00:00
Krzysztof Drewniak
cab44c515c [mlir][AMDGPU] Add --chipset option to AMDGPUToROCDL
Because the buffer descriptor structure (the V#) has no backwards-compatibility
guarantees, and since said guarantees have been violated in practice
(see https://github.com/llvm/llvm-project/issues/56323 ), and since
the `targetIsRDNA` attribute isn't something that higher-level clients can set
in general, make the lowering of the amdgpu dialect to rocdl take a --chipset
option.

Note that this option is a string because adding a parser for the Chipset
struct to llvm::cl wasn't working out.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D129228
2022-07-07 14:58:13 +00:00
Alex Zinenko
610139d2d9 [mlir] replace 'emit_c_wrappers' func->llvm conversion option with a pass
The 'emit_c_wrappers' option in the FuncToLLVM conversion requests C interface
wrappers to be emitted for every builtin function in the module. While this has
been useful to bootstrap the interface, it is problematic in the longer term as
it may unintentionally affect the functions that should retain their existing
interface, e.g., libm functions obtained by lowering math operations (see
D126964 for an example). Since D77314, we have a finer-grain control over
interface generation via an attribute that avoids the problem entirely. Remove
the 'emit_c_wrappers' option. Introduce the '-llvm-request-c-wrappers' pass
that can be run in any pipeline that needs blanket emission of functions to
annotate all builtin functions with the attribute before performing the usual
lowering that accounts for the attribute.
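
A sketch of the replacement workflow as a textual pipeline (the func.func nesting and the helper below are assumptions about how the passes are anchored):

```
// Sketch only: annotate every function with the C-wrapper attribute, then
// run the usual FuncToLLVM lowering, which honors that attribute.
#include "mlir/Pass/PassManager.h"
#include "mlir/Pass/PassRegistry.h"

mlir::LogicalResult buildCWrapperPipeline(mlir::PassManager &pm) {
  return mlir::parsePassPipeline(
      "func.func(llvm-request-c-wrappers),convert-func-to-llvm", pm);
}
```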

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D127952
2022-06-17 11:10:31 +02:00
Mogball
e16d13322b [mlir] (NFC) Clean up bazel and CMake target names
All dialect targets in bazel have been named *Dialect and all dialect
targets in CMake have been named MLIR*Dialect.
2022-06-13 16:24:15 +00:00
Mogball
d7ef488bb6 [mlir][gpu] Move GPU headers into IR/ and Transforms/
Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352
2022-06-09 22:49:03 +00:00
Krzysztof Drewniak
814b605095 [mlir][AMDGPU] Add AMDGPU conversion patterns to ConvertGPUToROCDL
This ensures that attributes such as the index bitwidth propagate
correctly to the AMDGPUToROCDL patterns.

Differential Revision: https://reviews.llvm.org/D125320
2022-05-10 16:49:11 +00:00
Krzysztof Drewniak
f1f05a91ca [MLIR][AMDGPU] Add AMDGPU dialect, wrappers around raw buffer intrinsics
By analogy with the NVGPU dialect, introduce an AMDGPU dialect for
AMD-specific intrinsic wrappers.

The dialect initially includes wrappers around the raw buffer intrinsics.

On AMD GPUs, a memref can be converted to a "buffer descriptor" that
allows more precise control of memory access, such as by allowing for
out of bounds loads/stores to be replaced by 0/ignored without adding
additional conditional logic, which is important for performance.

The repository currently contains a limited conversion from
transfer_read/transfer_write to Mubuf intrinsics, which are older,
deprecated intrinsics for the same functionality.

The new amdgpu.raw_buffer_* ops allow these operations to be used
explicitly and for including metadata such as whether the target
chipset is an RDNA chip or not (which impacts the interpretation of
some bits in the buffer descriptor), while still maintaining an
MLIR-like interface.

(This change also exposes the floating-point atomic add intrinsic.)

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D122765
2022-05-10 14:59:58 +00:00
River Riddle
58ceae9561 [mlir:NFC] Remove the forward declaration of FuncOp in the mlir namespace
FuncOp was moved to the `func` namespace a little over a month ago; the
using directive can be dropped now.
2022-04-18 12:01:55 -07:00
River Riddle
3655069234 [mlir] Move the Builtin FuncOp to the Func dialect
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.

Differential Revision: https://reviews.llvm.org/D121266
2022-03-16 17:07:03 -07:00
River Riddle
1d7120c69a [mlir] Split out AttrDef/TypeDef and pattern constructs from OpBase.td
OpBase.td has formed into a huge monolith of all ODS constructs. This
commit starts to rectify that by splitting out some constructs to their
own .td files.

Differential Revision: https://reviews.llvm.org/D118636
2022-03-15 00:18:03 -07:00
River Riddle
5a7b919409 [mlir][NFC] Rename StandardToLLVM to FuncToLLVM
The current StandardToLLVM conversion patterns only really handle
the Func dialect. The pass itself adds patterns for Arithmetic/CFToLLVM, but
those should be/will be split out in a followup. This commit focuses solely
on being an NFC rename.

Aside from the directory change, the pattern and pass creation API have been renamed:
 * populateStdToLLVMFuncOpConversionPattern -> populateFuncToLLVMFuncOpConversionPattern
 * populateStdToLLVMConversionPatterns -> populateFuncToLLVMConversionPatterns
 * createLowerToLLVMPass -> createConvertFuncToLLVMPass

Differential Revision: https://reviews.llvm.org/D120778
2022-03-07 11:25:23 -08:00
Krzysztof Drewniak
24a1869d00 [MLIR][GPU] Update GPUToROCDL to account for ControlFlow dialect
The conversion to the new ControlFlow dialect didn't change the
GPUToROCDL pass - this commit fixes this issue.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D119188
2022-02-08 16:34:34 +00:00
Matthias Springer
99ef9eebad [mlir][vector][NFC] Split into IR, Transforms and Utils
This reduces the dependencies of the MLIRVector target and makes the dialect consistent with other dialects.

Differential Revision: https://reviews.llvm.org/D118533
2022-01-31 19:17:09 +09:00
Krzysztof Drewniak
e1da62910e [MLIR][GPU] Define gpu.printf op and its lowerings
- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments
- Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP.
- Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under, or that the runtime is unknown. This enum controls how gpu.printf is lowered.

This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.

And:
[MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels

This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.

In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D110448
2021-12-09 15:54:31 +00:00
Mehdi Amini
be0a7e9f27 Adjust "end namespace" comment in MLIR to match new agreed coding style
See D115115 and this mailing list discussion:
https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html

Differential Revision: https://reviews.llvm.org/D115309
2021-12-08 06:05:26 +00:00
River Riddle
195730a650 [mlir][NFC] Replace references to Identifier with StringAttr
This is part of the replacement of Identifier with StringAttr.

Differential Revision: https://reviews.llvm.org/D113953
2021-11-16 17:36:26 +00:00
Mogball
a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
Adrian Kuegel
74b88807ae [mlir][rocdl] Add math::Exp2Op lowering to ROCDL
Differential Revision: https://reviews.llvm.org/D106057
2021-07-15 14:33:04 +02:00
Alex Zinenko
75e5f0aac9 [mlir] factor memref-to-llvm lowering out of std-to-llvm
After the MemRef dialect was split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the MemRef
dialect operations to LLVM into a separate library and a separate
conversion pass.

Reviewed By: herhut, silvas

Differential Revision: https://reviews.llvm.org/D105625
2021-07-09 14:49:52 +02:00
Alex Zinenko
b5d847b1b9 [mlir] factor out common parts of the conversion to the LLVM dialect
"Standard-to-LLVM" conversion is one of the oldest passes in existence. It has
become quite large due to the size of the Standard dialect itself, which is
being split into multiple smaller dialects. Furthermore, several conversion
features are useful for any dialect that is being converted to the LLVM
dialect, which, without this refactoring, creates a dependency from those
conversions to the "standard-to-llvm" one.

Put several of the reusable utilities from this conversion to a separate
library, namely:
- type converter from builtin to LLVM dialect types;
- utility for building and accessing values of LLVM structure type;
- utility for building and accessing values that represent memref in the LLVM
  dialect;
- lowering options applicable everywhere.

Additionally, remove the type wrapping/unwrapping notion from the type
converter that is no longer relevant since LLVM types have been reimplemented as
first-class MLIR types.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D105534
2021-07-07 10:51:08 +02:00
Uday Bondhugula
4acf3807e3 [MLIR] Split out GPU ops library from Transforms
Split out GPU ops library from GPU transforms. This allows libraries to
depend on GPU Ops without needing/building its transforms.

Differential Revision: https://reviews.llvm.org/D105472
2021-07-07 11:26:49 +05:30