Commit Graph

75 Commits

Author SHA1 Message Date
Matthias Springer
9b5ef2bea8 [mlir][Interfaces] LoopLikeOpInterface: Support ops with multiple regions (#66754)
This commit implements `LoopLikeOpInterface` on `scf.while`. This
enables LICM (and potentially other transforms) on `scf.while`.

`LoopLikeOpInterface::getLoopBody()` is renamed to `getLoopRegions` and
can now return multiple regions.

Also fix a bug in the default implementation of
`LoopLikeOpInterface::isDefinedOutsideOfLoop()`, which returned "false"
for some values that are defined outside of the loop (in a nested op, in
such a way that the value does not dominate the loop). This interface is
currently only used for LICM and there is no way to trigger this bug, so
no test is added.
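
A minimal sketch (helper name and remark text are illustrative, not from the patch) of how a transform might use the renamed accessor on any loop-like op, including `scf.while` with its "before" and "after" regions:

```
#include "mlir/IR/Operation.h"
#include "mlir/Interfaces/LoopLikeInterface.h"

using namespace mlir;

// Walk every region reported by the loop-like op and flag operands that are
// defined outside the loop (i.e., candidates for hoisting).
void inspectLoop(LoopLikeOpInterface loopLike) {
  // getLoopRegions() replaces the old single-region getLoopBody().
  for (Region *region : loopLike.getLoopRegions()) {
    region->walk([&](Operation *op) {
      for (Value operand : op->getOperands())
        if (loopLike.isDefinedOutsideOfLoop(operand))
          op->emitRemark("operand defined outside the loop");
    });
  }
}
```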
2023-09-19 17:35:38 +02:00
Matthias Springer
5cf714bb2f [mlir][SCF] scf.for: Consistent API around initArgs (#66512)
* Always use the auto-generated `getInitArgs` function. Remove the
hand-written `getInitOperands` duplicate.
* Remove `hasIterOperands` and `getNumIterOperands`. The names were
inconsistent because the "arg" is called `initArgs` in TableGen. Use
`getInitArgs().size()` instead.
* Fix verification around ops with no results.
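
A small sketch of the migration at a call site, assuming an in-scope `scf::ForOp` (illustrative, not code from the patch):

```
#include "mlir/Dialect/SCF/IR/SCF.h"

using namespace mlir;

// Count and visit the loop's init (iter_args) operands using the
// auto-generated accessor.
size_t countInits(scf::ForOp forOp) {
  // Before: forOp.getNumIterOperands() and forOp.getInitOperands().
  for (Value init : forOp.getInitArgs())
    (void)init; // process each init operand here
  return forOp.getInitArgs().size();
}
```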
2023-09-18 09:13:43 +02:00
Christopher Bate
cafb6284d1 [mlir][VectorToGPU] Update memref stride preconditions on nvgpu.mma.sync path
This change removes the requirement that the row stride be statically known when
converting `vector.transfer_read` and `vector.transfer_write` to distributed
SIMT operations in the `nvgpu` lowering path. It also adds a check to verify
that the last dimension of the source memref is statically known to have stride
1 since this is assumed in the conversion logic. No other change should be
required since the generated `vector.load` operations are never created across
dimensions other than the last. The routines for checking preconditions on
`vector.transfer_read/write` are moved under the nvgpu utilities.

The change is NFC with respect to the GPU dialect lowering path.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D155753
2023-09-14 13:51:42 -06:00
Daniil Dudkin
8a6e54c9b3 [mlir][arith] Rename operations: maxf → maximumf, minf → minimumf (#65800)
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
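
At a C++ call site the rename is mechanical; a hedged before/after sketch:

```
#include "mlir/Dialect/Arith/IR/Arith.h"

using namespace mlir;

Value buildFloatMax(OpBuilder &b, Location loc, Value lhs, Value rhs) {
  // Before this rename: b.create<arith::MaxFOp>(loc, lhs, rhs);
  // After: the name matches the llvm.maximum intrinsic semantics
  // (NaN-propagating, with -0.0 ordered before +0.0).
  return b.create<arith::MaximumFOp>(loc, lhs, rhs);
}
```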
2023-09-11 22:02:19 -07:00
Lei Zhang
a01194377c [mlir][gpu] Support arith.extf in subgroup MMA elementwise ops
This commit adds arith.extf to the supported list of
elementwise ops for subgroup MMA ops, and enables lowering to
SPIR-V.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D156847
2023-08-01 21:12:37 -07:00
Matthias Springer
16b75cd2bb [mlir][vector] Use DenseI64ArrayAttr for ExtractOp/InsertOp positions
`DenseI64ArrayAttr` provides a better API than `I64ArrayAttr`. E.g., accessors returning `ArrayRef<int64_t>` (instead of `ArrayAttr`) are generated.
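
A sketch of the accessor difference at this revision (assuming `vector::ExtractOp` still names its attribute `position`; accessor name hedged accordingly):

```
#include "mlir/Dialect/Vector/IR/VectorOps.h"

using namespace mlir;

int64_t leadingPosition(vector::ExtractOp extractOp) {
  // New: the DenseI64ArrayAttr accessor hands back the raw data directly.
  ArrayRef<int64_t> position = extractOp.getPosition();
  // Old I64ArrayAttr style required unwrapping every element:
  //   cast<IntegerAttr>(arrayAttr[0]).getInt()
  return position.front();
}
```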

Differential Revision: https://reviews.llvm.org/D156684
2023-07-31 15:25:37 +02:00
Matthias Springer
efc290ce9c [mlir][affine] More efficient makeComposedFolded... helpers
The old code materialized constants as ops, immediately folded them into the resulting affine map, and then deleted the constant ops again. Instead, the attributes are now folded directly into the affine map. Furthermore, all helpers accept `OpFoldResult` instead of `Value` now. This makes the code at call sites more efficient, because it is no longer necessary to materialize a `Value` just to be able to use these helper functions.

Note: The API has changed (accepts OpFoldResult instead of Value), otherwise this change is NFC.
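
A hedged sketch of the new call-site shape, where operands arrive as `OpFoldResult` (a value or an attribute) and constants never get materialized as ops; the helper `addFolded` is illustrative:

```
#include "mlir/Dialect/Affine/IR/AffineOps.h"

using namespace mlir;

// Compose-and-fold an addition of two OpFoldResults; attribute operands are
// folded into the map directly instead of being turned into constant ops.
OpFoldResult addFolded(OpBuilder &b, Location loc, OpFoldResult lhs,
                       OpFoldResult rhs) {
  AffineMap addMap = AffineMap::get(
      /*dimCount=*/2, /*symbolCount=*/0,
      b.getAffineDimExpr(0) + b.getAffineDimExpr(1));
  return affine::makeComposedFoldedAffineApply(b, loc, addMap, {lhs, rhs});
}
```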

Differential Revision: https://reviews.llvm.org/D153324
2023-06-22 10:47:38 +02:00
Mahesh Ravishankar
641b12e94b [mlir][SliceAnalysis] Add an options object to forward and backward slice.
Add an options object to allow control of the slice computation (for
both forward and backward slice). This makes the ABI stable, and also
allows avoiding an assert that makes the slice analysis unusable for
operations with multiple blocks.
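
A hedged sketch of the options-based call (field names as of this patch; the filter predicate is illustrative):

```
#include "mlir/Analysis/SliceAnalysis.h"

using namespace mlir;

void computeSlice(Operation *root) {
  BackwardSliceOptions options;
  // Only follow def-use edges into ops in the same region as `root`
  // (any predicate over Operation* works here).
  options.filter = [&](Operation *op) {
    return op->getParentRegion() == root->getParentRegion();
  };
  llvm::SetVector<Operation *> slice;
  getBackwardSlice(root, &slice, options);
}
```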

Reviewed By: hanchung, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D151520
2023-06-08 18:40:20 +00:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
mechanism in addition to defining methods with the same name.
This change begins the migration of uses of the methods to the
corresponding function calls, which has been decided to be the more
consistent style.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
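
The shape of the migration, as a sketch:

```
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

int64_t rankOf(Type type) {
  // Before: type.dyn_cast<ShapedType>() / type.cast<MemRefType>().
  // After: the free-function variants.
  if (auto shaped = dyn_cast<ShapedType>(type))
    return shaped.hasRank() ? shaped.getRank() : -1;
  return -1;
}
```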

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so generated code is not modified.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is a clearer example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files define the classes that have the casting methods, so the
     code would still refer to the method instead of the function unless a
     prefix were added or the method declaration removed at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Matthias Springer
4c48f016ef [mlir][Affine][NFC] Wrap dialect in "affine" namespace
This cleanup aligns the affine dialect with all the other dialects.

Differential Revision: https://reviews.llvm.org/D148687
2023-04-20 11:19:21 +09:00
Manish Gupta
84eed7843e [Updated commit] Fix Transpose Check in MMA.SYNC Path.
I pushed a stale commit for the same review in my previous commit.
I am updating the main line with the latest commit, including the
review comments. Apologies for the redundant commit.

Differential Revision: https://reviews.llvm.org/D147749
2023-04-11 00:38:35 +00:00
Manish Gupta
6b82fc77c2 Fix Transpose Check in MMA.SYNC Path
Differential Revision: https://reviews.llvm.org/D147749
2023-04-11 00:02:20 +00:00
Diego Caballero
7b70baa9ef [mlir][Vector] Remove lhs and rhs masks from vector.contract
This patch removes the historical lhs and rhs masks in vector.contract,
now that vector.mask supports vector.contract and the lhs and rhs masks
are barely supported by the existing vector.contract lowerings and
transformations.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D144430
2023-03-29 19:53:29 +00:00
Jakub Kuderski
fb7ef637a8 [mlir][vector][nvgpu] Move MMA contraction preparation to VectorUtils
This pattern is not specific to nvgpu; I intend to use it in SPIR-V codegen. `VectorTransforms` seems like a more generally useful place.

In addition:
-  Fix a bug in the second condition (the dimensions were swapped for RHS).
-  Add tests.
-  Add support for externally provided filter functions, similar to other vector transforms.
-  Prefer to transpose before zero/sign-extending inputs.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D145638
2023-03-09 14:56:21 -05:00
Lei Zhang
a1aad28d29 [mlir][vector] NFC: Improve vector type accessor methods
Plain `getVectorType()` can be quite confusing and error-prone
given that, well, vector ops always work on vector types, and
an op can commonly involve both source and result vectors. So this
commit makes various such accessor methods explicit w.r.t.
source or result vectors.
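
For example, on `vector.shape_cast` (a hedged sketch of the explicit accessors):

```
#include "mlir/Dialect/Vector/IR/VectorOps.h"

using namespace mlir;

bool isRankReducing(vector::ShapeCastOp op) {
  // Before: a bare getVectorType() left it unclear which side was meant.
  VectorType srcType = op.getSourceVectorType();
  VectorType dstType = op.getResultVectorType();
  return dstType.getRank() < srcType.getRank();
}
```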

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D144159
2023-02-16 04:08:33 +00:00
Thomas Raoux
3cf7f22498 [mlir][vectorToGPU] Fix type used when folding transpose into read op
Pick the right result type when folding transpose op into a read

Differential Revision: https://reviews.llvm.org/D144113
2023-02-15 17:34:09 +00:00
Jie Fu
c0f504df48 [mlir] Fix two build warnings in VectorToGPU (NFC)
```
In file included from /data/llvm-project/mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp:13:
/data/llvm-project/mlir/include/mlir/Conversion/VectorToGPU/VectorToGPU.h:15:1: error: class 'LogicalResult' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
class LogicalResult;
^
/data/llvm-project/mlir/include/mlir/Support/LogicalResult.h:26:22: note: previous use is here
struct [[nodiscard]] LogicalResult {
                     ^
/data/llvm-project/mlir/include/mlir/Conversion/VectorToGPU/VectorToGPU.h:15:1: note: did you mean struct here?
class LogicalResult;
^~~~~
struct

/data/llvm-project/mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp:724:5: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
    rewriter.notifyMatchFailure(
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.
```
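
The fixes implied by the two diagnostics, sketched:

```
namespace mlir {
// Warning 1: the forward declaration must match the tag of the definition.
struct LogicalResult; // was: class LogicalResult;
} // namespace mlir

// Warning 2: notifyMatchFailure is [[nodiscard]]; in a rewrite pattern the
// result is typically propagated rather than dropped:
//   return rewriter.notifyMatchFailure(op, "reason");
```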
2023-02-15 09:35:49 +08:00
Nicolas Vasilache
5ef7ceae57 [mlir][Vector] Significantly improve VectorToGPU.cpp
This revision performs a bunch of cleanups and tracks free-flowing IR mutations.
APIs are systematized around RewriterBase and relevant debug messages are added.
Deliberate use of OpBuilder::InsertionGuard is added where needed.

Differential Revision: https://reviews.llvm.org/D143738
2023-02-14 16:49:36 -08:00
Quinn Dawkins
5205c7126b [mlir][gpu] Add support for unsigned integer extend in vector to gpu.subgroup_mma lowering
Unsigned integer types are supported in subgroup mma ops by matching
against arith.extui ops. This allows subgroup_mma_compute ops with
mixed signedness, which later conversions must handle. SPIR-V
cooperative matrix ops support this, while the lowering to WMMA does not.

Differential Revision: https://reviews.llvm.org/D143922
2023-02-14 13:09:46 -05:00
Krzysztof Drewniak
d0f19ce774 [mlir] Handle different pointer sizes in unranked memref descriptors
The code for unranked memref descriptors assumed that
sizeof(!llvm.ptr) == sizeof(!llvm.ptr<N>) for all address spaces N.
This is not always true (ex. the AMDGPU compiler backend has
sizeof(!llvm.ptr) = 64 bits but sizeof(!llvm.ptr<5>) = 32 bits, where
address space 5 is used for stack allocations). While this is merely
an overallocation in the case where a non-0 address space has pointers
smaller than the default, the existing code could cause OOB memory
accesses when sizeof(!llvm.ptr<N>) > sizeof(!llvm.ptr).

So, add an address spaces parameter to computeSizes in order to
partially resolve this class of bugs. Note that the LLVM data layout
in the conversion passes is currently set to "" and not constructed
from the MLIR data layout or some other source, but this could change
in the future.
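
Illustrative arithmetic only (not the actual computeSizes code): the ranked descriptor allocated through the unranked path contains two pointers, so its size must be computed with the pointer width of the memref's own address space.

```
#include <cstdint>

// Bytes needed for a ranked memref descriptor:
//   { T* allocated; T* aligned; index offset; index sizes[rank]; index strides[rank] }
// ptrBytes must come from the descriptor's address space, not address space 0.
uint64_t rankedDescriptorBytes(uint64_t rank, uint64_t ptrBytes,
                               uint64_t indexBytes) {
  return 2 * ptrBytes + (1 + 2 * rank) * indexBytes;
}
```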

Depends on D142159

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D141293
2023-02-09 19:14:58 +00:00
Krzysztof Drewniak
499abb243c Add generic type attribute mapping infrastructure, use it in GpuToX
Remapping memory spaces is often needed in type
conversions, most often when going to LLVM or to/from SPIR-V (a future
commit), and it is possible that such remappings may become more
common in the future as dialects take advantage of the more generic
memory space infrastructure.

Currently, memory space remappings are handled by running a
special-purpose conversion pass before the main conversion that
changes the address space attributes. In this commit, this approach is
replaced by adding a notion of type attribute conversions to
TypeConverter, which is then used to convert memory space attributes.

Then, we use this infrastructure throughout the *ToLLVM conversions.
This has the advantage of loosening the requirements on the inputs to
those passes from "all address spaces must be integers" to "all
memory spaces must be convertible to integer spaces", which reduces
the coupling between portions of MLIR.

On top of that, this change leads to the removal of most of the calls
to getMemorySpaceAsInt(), bringing us closer to removing it.

(A rework of the SPIR-V conversions to use this new system will be in
a followup commit.)

As a note, one long-term motivation for this change is that I would
eventually like to add an allocaMemorySpace key to MLIR data layouts
and then call getMemRefAddressSpace(allocaMemorySpace) in the
relevant *ToLLVM passes in order to ensure all alloca()s, whether
incoming or produced during the LLVM lowering, have the correct
address space for a given target.

I expect that the type attribute conversion system may be useful in
other contexts.
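
A hedged sketch of the new hook on `TypeConverter` (the callback body, including the numeric spaces, is illustrative):

```
#include "mlir/Dialect/GPU/IR/GPUDialect.h"
#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;

void addGpuMemorySpaceConversion(TypeConverter &converter) {
  // Convert gpu.address_space attributes on memref types into the integer
  // address spaces expected by an LLVM-style lowering.
  converter.addTypeAttributeConversion(
      [](BaseMemRefType type, gpu::AddressSpaceAttr memorySpace) -> Attribute {
        unsigned space =
            memorySpace.getValue() == gpu::AddressSpace::Workgroup ? 3 : 0;
        return IntegerAttr::get(
            IntegerType::get(memorySpace.getContext(), 64), space);
      });
}
```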

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D142159
2023-02-09 18:00:46 +00:00
Quinn Dawkins
985f7ff632 [mlir][gpu] Add support for integer types in gpu.subgroup_mma ops
The signedness is carried by `!gpu.mma_matrix` types to most closely
match the Cooperative Matrix specification, which determines signedness
with the type (and sometimes the operation).

See: https://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/NV/SPV_NV_cooperative_matrix.html

To handle the lowering from vector to gpu, ops such as arith.extsi are
pattern matched next to `vector.transfer_read` and `vector.contract` to
determine the signedness of the matrix type.

Enables s8 and u8 WMMA types in NVVM for the GPUToNVVM conversion.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D143223
2023-02-07 17:58:01 -05:00
Thomas Raoux
066b4fcb8d [mlir] Update VectorToGPU to new memory space
GPU memory spaces have changed to new attributes. Update the
VectorToGPU pass to use them.

Differential Revision: https://reviews.llvm.org/D142105
2023-01-19 20:12:37 +00:00
Lei Zhang
f1db4aec30 [mlir][VectorToGPU] Support transposed+broadcasted 2D MMA load
This handles loading from a 2-D memref, in addition to the 1-D memref
cases from D139655.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D140136
2022-12-15 19:34:32 +00:00
Lei Zhang
dbddd4f6a4 [mlir][VectorToGPU] Support transposed+broadcasted 1D MMA load
This is now possible with transpose semantics on subgroup MMA
load ops.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D139655
2022-12-15 19:22:35 +00:00
Lei Zhang
50882b4daf [mlir] List more elementwise ops in VectorToGPU MMA conversion
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D139244
2022-12-05 22:51:19 +00:00
Navdeep Katel
3d35546cd1 Support transpose mode for gpu.subgroup WMMA ops
Add support for loading, computing, and storing `gpu.subgroup` WMMA ops
in transpose mode as well. Update the GPU to NVVM lowerings to support
`transpose` mode and update integration tests as well.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D139021
2022-12-05 22:37:02 +05:30
Kazu Hirata
192d9dd731 [mlir] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 19:58:32 -08:00
Kazu Hirata
1a36588ec6 [mlir] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 18:50:27 -08:00
Nicolas Vasilache
3af6438372 Revert "[WIP] Add support for MMA conversion for 1-D vector.transfer followed by a broadcast to 2-D"
This reverts commit 7db25f78db.

This was mistakenly stacked below (and committed) along with an NFC change.
2022-12-01 02:57:03 -08:00
Nicolas Vasilache
7db25f78db [WIP] Add support for MMA conversion for 1-D vector.transfer followed by a broadcast to 2-D
Differential Revision: https://reviews.llvm.org/D139040
2022-12-01 02:49:47 -08:00
Quinn Dawkins
c0321edc26 [mlir][gpu] Adding support for transposed mma_load_matrix
Enables transposed gpu.subgroup_mma_load_matrix and updates the lowerings in Vector to GPU and GPU to SPIRV. This is needed to enable B-transposed matmuls lowering to wmma ops.

Taken over from author: stanley-nod <stanley@nod-labs.com>

Reviewed By: ThomasRaoux, antiagainst

Differential Revision: https://reviews.llvm.org/D138770
2022-11-29 03:35:49 +00:00
Ramkumar Ramachandra
d32ec5232c mlir/VectorToGPU: use std::optional (NFC)
This is part of an effort to migrate from llvm::Optional to std::optional:

See also: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

Signed-off-by: Ramkumar Ramachandra <r@artagnon.com>
2022-11-27 13:32:18 -08:00
Aliia Khasanova
399638f98c Merge kDynamicSize and kDynamicSentinel into one constant.

Differential Revision: https://reviews.llvm.org/D138282
2022-11-21 13:01:26 +00:00
Mehdi Amini
6a7a1188d3 Apply clang-tidy fixes for llvm-else-after-return in VectorToGPU.cpp (NFC) 2022-11-06 20:15:00 +00:00
Manish Gupta
114ba722c1 [mlir][NVGPU] Handle native mma.sync and ldmatrix(x4) sizes
This patch handles native `mma.sync` sizes and enables issuing `ldmatrix` on
the largest possible tiles for matrix B. It requires handling
`vector.extract_strided_slice` in the vector-to-nvgpu lowering.

Differential Revision: https://reviews.llvm.org/D135749
2022-10-19 17:10:21 -07:00
Christopher Bate
ea2ed80e6d [mlir][nvgpu] NFC - move NVGPU conversion helpers to NvGpu utils library
The ConvertVectorToGpu pass implementation contained a small private
support library for performing various calculations during conversion
between `vector` and `nvgpu.mma.sync` and `nvgpu.ldmatrix` operations.
The support library is moved under `Dialect/NVGPU/Utils` because the
functions have wider utility. Some documentation comments are added or
improved.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D135303
2022-10-05 20:21:27 -06:00
Jakub Kuderski
abc362a107 [mlir][arith] Change dialect name from Arithmetic to Arith
Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22.

Tested with:
`ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples`

and `bazel build --config=generic_clang @llvm-project//mlir:all`.

Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini

Differential Revision: https://reviews.llvm.org/D134762
2022-09-29 11:23:28 -04:00
Kazu Hirata
be650de57d [mlir] Use empty (NFC) 2022-09-18 17:46:53 -07:00
Oleg Shyshkov
4758e916e1 [mlir] Change IteratorType in ContractionOp in Vector dialect from string to enum.
This is the first step in replacing iterator_type from strings with enums in the Vector and Linalg dialects. This change adds IteratorTypeAttr and uses it in ContractionOp.

To avoid breaking all the tests, print/parse code has conversion between string and enum for now.

There is shared code in StructuredOpsUtils.h that expects iterator types to be strings. To break this dependency, this change forks the helper functions `isParallelIterator` and `isReductionIterator` into utils in both dialects and adds `getIteratorTypeNames()` to support backward compatibility with StructuredGenerator.

In the later changes, I plan to add a similar enum attribute to Linalg.
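
The forked helper in the Vector dialect now compares enums instead of strings; a sketch:

```
#include "mlir/Dialect/Vector/IR/VectorOps.h"

using namespace mlir;

// Before: attr.cast<StringAttr>().getValue() == "parallel".
bool isParallel(Attribute attr) {
  return cast<vector::IteratorTypeAttr>(attr).getValue() ==
         vector::IteratorType::parallel;
}
```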

Differential Revision: https://reviews.llvm.org/D133696
2022-09-12 16:59:34 +02:00
Michele Scuttari
67d0d7ac0a [MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838
2022-08-31 12:28:45 +02:00
Michele Scuttari
039b969b32 Revert "[MLIR] Update pass declarations to new autogenerated files"
This reverts commit 2be8af8f0e.
2022-08-30 22:21:55 +02:00
Michele Scuttari
2be8af8f0e [MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838
2022-08-30 21:56:31 +02:00
Manish Gupta
14d79afeae [mlir][NVGPU] nvgpu.mma.sync on F32 through TF32
Adds an optional attribute to support tensor cores on the F32 datatype by lowering to `mma.sync` with TF32 operands. Since TF32 is not a native datatype in LLVM, we add `tf32Enabled` as an attribute to make the IR aware of the `MmaSyncOp` datatype. Additionally, this patch adds placeholders for nvgpu-to-nvgpu transformations targeting the higher-precision tf32x3.

For mma.sync on f32 input using tensor cores there are two possibilities:
(a) tf32   (1 `mma.sync` per warp-level matrix-multiply-accumulate)
(b) tf32x3 (3 `mma.sync` per warp-level matrix-multiply-accumulate)

Typically, tf32 tensor core acceleration comes at a cost of accuracy from missing precision bits. While f32 has 23 precision bits, tf32 has only 10 precision bits. tf32x3 aims to recover the precision bits by splitting each operand into two tf32 values and issuing three `mma.sync` tensor core operations.
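
A hedged numeric illustration of the split (not code from the patch): each f32 operand becomes a tf32 "big" part plus a tf32 "small" residual, and the three `mma.sync` ops typically combine the big*big, big*small, and small*big partial products.

```
#include <cstdint>
#include <cstring>

// Truncate an f32 to tf32 precision by zeroing the 13 low mantissa bits
// (hardware rounding may differ; this is only for illustration).
float toTf32(float x) {
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  bits &= 0xFFFFE000u;
  std::memcpy(&x, &bits, sizeof(bits));
  return x;
}

// Split a into big + small so that both parts are tf32-representable.
void splitTf32x3(float a, float &big, float &small) {
  big = toTf32(a);
  small = toTf32(a - big); // residual recovers most of the dropped bits
}
```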

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D130294
2022-08-01 23:23:27 +00:00
Jeff Niu
e179532284 [mlir] Remove types from attributes
This patch removes the `type` field from `Attribute` along with the
`Attribute::getType` accessor.

Going forward, this means that attributes in MLIR will no longer have
types as a first-class concept. This patch lays the groundwork to
incrementally remove or refactor code that relies on generic attributes
being typed. The immediate impact will be on attributes that rely on
`Attribute` containing a type, such as `IntegerAttr`,
`DenseElementsAttr`, and `ml_program::ExternAttr`, which will now need
to define a type parameter on their storage classes. This will save
memory as all other attribute kinds will no longer contain a type.

Moreover, it will not be possible to generically query the type of an
attribute directly. This patch provides an attribute interface
`TypedAttr` that implements only one method, `getType`, which can be
used to generically query the types of attributes that implement the
interface. This interface can be used to retain the concept of a "typed
attribute". The ODS-generated accessor for a `type` parameter
automatically implements this method.
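
A sketch of the generic query after this change:

```
#include "mlir/IR/BuiltinAttributes.h"

using namespace mlir;

// Before: attr.getType() worked on any Attribute. Now only attributes
// implementing the TypedAttr interface expose a type.
Type typeOfAttr(Attribute attr) {
  if (auto typed = dyn_cast<TypedAttr>(attr))
    return typed.getType();
  return Type(); // this attribute kind carries no type
}
```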

Next steps will be to refactor the assembly formats of certain operations
that rely on `parseAttribute(type)` and `printAttributeWithoutType` to
remove special handling of type elision until `type` can be removed from
the dialect parsing hook entirely; and incrementally remove uses of
`TypedAttr`.

Reviewed By: lattner, rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D130092
2022-07-31 20:01:31 -04:00
Jacques Pienaar
d2c0572b2e [mlir] Flip LinAlg dialect to _Both
This one required more changes than ideal due to an overlapping generated name
with a different return type. Changed getIndexingMaps to getIndexingMapsArray to
move it out of the way and highlight that it returns (more expensively) a
SmallVector; the prefixed name is used for the Attribute accessor.
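
The distinction at a call site, sketched (assuming an in-scope `linalg::LinalgOp`):

```
#include "mlir/Dialect/Linalg/IR/Linalg.h"

using namespace mlir;

void useIndexingMaps(linalg::LinalgOp linalgOp) {
  // Prefixed attribute accessor: cheap, returns the raw ArrayAttr.
  ArrayAttr rawMaps = linalgOp.getIndexingMaps();
  // Renamed helper: materializes (more expensively) a SmallVector<AffineMap>.
  SmallVector<AffineMap> maps = linalgOp.getIndexingMapsArray();
  (void)rawMaps;
  (void)maps;
}
```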

Differential Revision: https://reviews.llvm.org/D129919
2022-07-19 14:42:58 -07:00
Christopher Bate
670eee08ce [mlir][VectorToGPU] Fix i4 and col-major operand support
For the conversion to nvgpu `mma.sync` and `ldmatrix` pathways, the code
was missing support for the `i4` data type. While fixing this, another
bug was discovered that caused the number of ldmatrix tiles calculated for
certain operand types and configurations to be incorrect. This change
fixes both issues and adds additional tests.

Differential Revision: https://reviews.llvm.org/D128074
2022-06-30 10:26:59 -06:00
Kazu Hirata
064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Alex Zinenko
8b68da2c7d [mlir] move SCF headers to SCF/{IR,Transforms} respectively
This aligns the SCF dialect file layout with the majority of the dialects.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D128049
2022-06-20 10:18:01 +02:00
Christopher Bate
51b925df94 [mlir][nvgpu] shared memory access optimization pass
This change adds a transformation and pass to the NvGPU dialect that
attempts to optimize reads/writes from a  memref representing GPU shared
memory in order to avoid bank conflicts. Given a value representing a
shared memory memref, it traverses all reads/writes within the parent op
and, subject to suitable conditions, rewrites all last dimension index
values such that element locations in the final (col) dimension are
given by
`newColIdx = col % vecSize + perm[row](col/vecSize,row)`
where `perm` is a permutation function indexed by `row` and `vecSize`
is the vector access size in elements (currently assumes 128-bit
vectorized accesses, but this can be made a parameter). This specific
transformation can help optimize typical distributed & vectorized accesses
common to loading matrix multiplication operands to/from shared memory.
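
A hedged sketch of one concrete reading of the formula above, with `perm` realized as an XOR of the vector index with the row (the pass's actual permutation may differ):

```
#include <cstdint>

// newColIdx = col % vecSize + perm[row](col / vecSize) * vecSize,
// where perm XORs the vector index with the row to spread accesses
// across shared-memory banks.
int64_t swizzledCol(int64_t row, int64_t col, int64_t vecSize,
                    int64_t vectorsPerRow) {
  int64_t vecIdx = col / vecSize;
  int64_t permuted = vecIdx ^ (row % vectorsPerRow); // perm[row](vecIdx)
  return col % vecSize + permuted * vecSize;
}
```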

Differential Revision: https://reviews.llvm.org/D127457
2022-06-17 09:31:05 -06:00