Commit Graph

19496 Commits

Author SHA1 Message Date
Jeff Niu
fb771fe315 [mlir] Slightly optimize bytecode op numbering (#88310)
If the bytecode encoding supports properties, then the dictionary
attribute is always the raw dictionary attribute of the operation,
regardless of what it contains. Otherwise, get the dictionary attribute
from the op: if the op does not have properties, then it returns the raw
dictionary, otherwise it returns the combined inherent and discardable
attributes.
2024-04-10 23:34:48 +02:00
Kojo Acquah
04bf1a4090 Update LowerContractionToSMMLAPattern to ignore matvec (#88288)
Patterns in `LowerContractionToSMMLAPattern` are designed to handle
vector-to-matrix multiplication but not matrix-to-vector. This leads to
the following error when processing `rhs` with rank < 2:

```
iree-compile: /usr/local/google/home/kooljblack/code/iree-build/llvm-project/tools/mlir/include/mlir/IR/BuiltinTypeInterfaces.h.inc:268: int64_t mlir::detail::ShapedTypeTrait<mlir::VectorType>::getDimSize(unsigned int) const [ConcreteType = mlir::VectorType]: Assertion `idx < getRank() && "invalid index for shaped type"' failed.
```

Updates the patterns to explicitly check the rhs rank and fail on cases
that cannot be processed.
2024-04-10 13:18:47 -04:00
Aart Bik
f388a3a446 [mlir][sparse] update doc and examples of the [dis]assemble operations (#88213)
The doc and examples of the [dis]assemble operations did not reflect all
the recent changes on order of the operands. Also clarified some of the
text.
2024-04-10 09:42:12 -07:00
Mehdi Amini
43b2b2ebce Revert "Fix complex log1p accuracy with large abs values." (#88290)
Reverts llvm/llvm-project#88260

The test fails on the GCC7 buildbot.
2024-04-10 18:25:16 +02:00
Johannes Reifferscheid
49ef12a08c Fix complex log1p accuracy with large abs values. (#88260)
This ports https://github.com/openxla/xla/pull/10503 by @pearu. The new
implementation matches mpmath's results for most inputs, see caveats in
the linked pull request. In addition to the filecheck test here, the
accuracy was tested with XLA's complex_unary_op_test and its MLIR
emitters.
2024-04-10 14:55:56 +02:00
Jeff Niu
f2ade91a9f [mlir] Optimize getting properties on concrete ops (#88259)
This makes retrieving properties on concrete operations faster by
removing a branch when it is known that the operation must have
properties.
2024-04-10 14:11:45 +02:00
Raghu Maddhipatla
eec41d2f8d Revert "[Flang] [OpenMP] [Semantics] [MLIR] [Lowering] Add lowering support for IS_DEVICE_PTR and HAS_DEVICE_ADDR clauses on OMP TARGET directive." (#88198)
Reverts llvm/llvm-project#74187
2024-04-09 16:18:56 -05:00
srcarroll
b79db39659 [mlir][linalg] Support ParamType in vector_sizes option of VectorizeOp transform (#87557)
2024-04-09 15:52:40 -05:00
Joseph Huber
470aefb240 [Offload][NFC] Remove omp_ prefix from offloading entries (#88071)
Summary:
These entries are generic for offloading with the new driver now. Having
the `omp` prefix was a historical artifact and is confusing when used
for CUDA. This patch just renames them for now; future patches will
rework the binary format to make it more common.
2024-04-09 15:50:15 -05:00
Raghu Maddhipatla
9d9560facb [Flang] [OpenMP] [Semantics] [MLIR] [Lowering] Add lowering support for IS_DEVICE_PTR and HAS_DEVICE_ADDR clauses on OMP TARGET directive. (#74187)
Added lowering support for IS_DEVICE_PTR and HAS_DEVICE_ADDR clauses for
OMP TARGET directive and added related tests for these changes.

IS_DEVICE_PTR and HAS_DEVICE_ADDR clauses apply to the OMP TARGET
directive. The OpenMP spec states:

`The **is_device_ptr** clause indicates that its list items are device
pointers.`

`The **has_device_addr** clause indicates that its list items already
have device addresses and therefore they may be directly accessed from a
target device.`

In contrast, USE_DEVICE_PTR and USE_DEVICE_ADDR clauses apply to the OMP
TARGET DATA directive, and the OpenMP spec states for them:

`Each list item in the **use_device_ptr** clause results in a new list
item that is a device pointer that refers to a device address`

`Each list item in a **use_device_addr** clause that is present in the
device data environment is treated as if it is implicitly mapped by a
map clause on the construct with a map-type of alloc`
2024-04-09 14:59:20 -05:00
Peiming Liu
a454d92c5a [mlir][sparse] rename files and unifies APIs (#88162)
2024-04-09 10:59:15 -07:00
Mehdi Amini
60c5c4ccad [MLIR] Don't check for key before inserting in map in GreedyPatternRewriteDriver worklist (NFC) (#88148)
This is a common anti-pattern (any volunteer for a clang-tidy check?).

This does not show significant real-world impact though.
2024-04-09 19:33:53 +02:00
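The anti-pattern named in 60c5c4ccad can be illustrated with a standard container. The following is a minimal sketch, assuming a `std::unordered_map` in place of the driver's actual worklist data structure (which is not shown in this log); `Worklist`, `pushChecked`, `pushDirect`, and `verify` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical worklist: a vector of items plus a map from item to its index.
struct Worklist {
  std::vector<std::string> items;
  std::unordered_map<std::string, std::size_t> index;

  // Anti-pattern: one hash lookup for the check, a second one for the insert.
  bool pushChecked(const std::string &item) {
    if (index.count(item))
      return false;
    index[item] = items.size();
    items.push_back(item);
    return true;
  }

  // Single lookup: try_emplace inserts only when the key is absent and
  // reports whether the insertion happened.
  bool pushDirect(const std::string &item) {
    auto [it, inserted] = index.try_emplace(item, items.size());
    if (inserted)
      items.push_back(item);
    return inserted;
  }

  // Both containers must always describe the same set of items.
  void verify() const { assert(items.size() == index.size()); }
};
```

Both methods leave the worklist in the same state; the second simply avoids the redundant lookup, which matches the "NFC" label on the commit.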
xiaoleis-nv
8d6469b0e0 [mlir][vector] Add lower-vector-multi-reduction pass (#87333)
This MR adds the `lower-vector-multi-reduction` pass to lower the
vector.multi_reduction operation.

While the Transform Dialect includes an operation,
`transform.apply_patterns.vector.lower_multi_reduction`, intended for a
similar purpose, its utility is limited to projects that have adopted
the Transform Dialect. Recognizing that not all projects are equipped to
integrate this dialect, the proposed pass serves as a vital standalone
alternative. It ensures that projects solely dependent on the
traditional pass infrastructure can also benefit from the optimized
lowering of `multi_reduction` operation.

---------

Co-authored-by: Xiaolei Shi <xiaoleis@nvidia.com>
2024-04-09 10:04:25 -07:00
Billy Zhu
6f6336858e [MLIR][LLVM] Add DebugNameTableKind to DICompileUnit (#87974)
Add the DebugNameTableKind field to DICompileUnit, along with its
importer & exporter.
2024-04-09 06:18:07 -07:00
Sergio Afonso
6528f10366 [MLIR][OpenMP] Group clause operands into structures and use them to define simplified op builders (#86797)
This patch introduces a set of composable structures grouping the MLIR
operands associated to each OpenMP clause. This makes it easier to keep
the MLIR representation for the same clause consistent throughout all
operations that accept it.

The relevant clause operand structures are grouped into per-operation
structures using a mixin pattern and used to define new operation
constructors. These constructors can be used to avoid having to get the
order of a possibly large list of operands right.

Missing clauses are documented as TODOs, as well as operands which are
part of the relevant operation's operand structure but cannot be
attached to the associated operation yet, due to missing op arguments to
its MLIR definition.

A follow-up patch will update Flang lowering to make use of these
structures, simplifying the passing of information from clause
processing to operation-generating functions and also simplifying the
creation of operations through the use of the new operation
constructors.
2024-04-09 13:40:18 +01:00
Kai Sasaki
51089e360e [mlir][complex] Support fast math flag for complex.tan op (#87919)
See
https://discourse.llvm.org/t/rfc-fastmath-flags-support-in-complex-dialect/71981
2024-04-09 15:22:43 +09:00
Uday Bondhugula
0e5a53cc01 [MLIR] Fix typo bug in AffineExprVisitor for WalkResult return case (#86138)
Fix typo bug in AffineExprVisitor for the WalkResult return case. This
didn't show up immediately because most walks in the tree didn't
use walk result.
2024-04-09 08:37:57 +05:30
Matthias Braun
4a812b5912 Verify threadlocal_address constraints (#87841)
Check invariants for `llvm.threadlocal.address` intrinsic in IR
Verifier.
2024-04-08 17:47:57 -07:00
Andrei Golubev
be006372f3 [mlir][OpPrintingFlags] Allow to disable ElementsAttr hex printing (#85766)
At present, large ElementsAttr is unconditionally printed with a hex
string. This means that in IR large constant values often look like:
dense<"0x000000000004000000080000000004000000080000000..."> :
tensor<10x10xi32>

Hoisting hex printing control to the user level for tooling means that
one can disable the feature and get human-readable values when
necessary:
dense<[16, 32, 48, 500...]> : tensor<10x10xi32>

Note: AsmPrinterOptions::printElementsAttrWithHexIfLarger cannot always
be used, as it requires exposing MLIR's command-line options in user
tooling (including an actual compiler).

Co-authored-by: Harald Rotuna <harald.razvan.rotuna@intel.com>
2024-04-09 02:08:32 +02:00
Corentin Ferry
50b937331f [mlir] Add missing libm member operations to MathToLibm (#87981)
This PR adds support for lowering the following Math operations to
`libm` calls:
* `math.absf` -> `fabsf, fabs`
* `math.exp` -> `expf, exp`
* `math.exp2` -> `exp2f, exp2`
* `math.fma` -> `fmaf, fma`
* `math.log` -> `logf, log`
* `math.log2` -> `log2f, log2`
* `math.log10` -> `log10f, log10`
* `math.powf` -> `powf, pow`
* `math.sqrt` -> `sqrtf, sqrt`

These operations are direct members of `libm`, and do not seem to
require any special manipulations on their operands.
2024-04-09 00:41:12 +02:00
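The mapping listed in 50b937331f can be written down as a lookup table. This is a minimal standalone sketch, not the conversion pass itself: it assumes that the first name in each pair is the `f32` variant and the second the `f64` variant (which is the usual `libm` naming), and `LibmNames`, `mathToLibmTable`, and `libmCallee` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

// Names of the libm functions for the f32 and f64 variants of one op.
struct LibmNames {
  std::string f32;
  std::string f64;
};

// Table transcribed from the commit message above.
const std::map<std::string, LibmNames> &mathToLibmTable() {
  static const std::map<std::string, LibmNames> table = {
      {"math.absf", {"fabsf", "fabs"}},   {"math.exp", {"expf", "exp"}},
      {"math.exp2", {"exp2f", "exp2"}},   {"math.fma", {"fmaf", "fma"}},
      {"math.log", {"logf", "log"}},      {"math.log2", {"log2f", "log2"}},
      {"math.log10", {"log10f", "log10"}}, {"math.powf", {"powf", "pow"}},
      {"math.sqrt", {"sqrtf", "sqrt"}},
  };
  return table;
}

// Returns the libm callee for `op` at the given float width, or an empty
// string when the op is not in the table.
std::string libmCallee(const std::string &op, unsigned bitWidth) {
  assert((bitWidth == 32 || bitWidth == 64) && "sketch covers f32/f64 only");
  auto it = mathToLibmTable().find(op);
  if (it == mathToLibmTable().end())
    return "";
  return bitWidth == 32 ? it->second.f32 : it->second.f64;
}
```

For instance, `libmCallee("math.sqrt", 32)` yields `sqrtf` under these assumptions.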
Andrzej Warzyński
e276dcec17 [mlir][arith] Refine the verifier for arith.constant (#87999)
Disallows initialization of scalable vectors with an attribute of
arbitrary values, e.g.:
```mlir
  %c = arith.constant dense<[0, 1]> : vector<[2] x i32>
```

Initialization using vector splats remains allowed (i.e. when all the
init values are identical):
```mlir
  %c = arith.constant dense<[1, 1]> : vector<[2] x i32>
```

Note: This is a re-upload of #86178
2024-04-08 21:22:00 +01:00
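The rule described in e276dcec17 amounts to a splat check on the initializer of a scalable vector. The sketch below models it on plain integers rather than MLIR attributes; `isSplat` and `isValidConstantInit` are hypothetical helpers, not the verifier's real code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// True when every initializer value equals the first one.
bool isSplat(const std::vector<int> &values) {
  assert(!values.empty() && "expects at least one initializer value");
  return std::all_of(values.begin(), values.end(),
                     [&](int v) { return v == values.front(); });
}

// Fixed-size vectors accept arbitrary values; scalable vectors accept only
// splats, mirroring the two `arith.constant` examples in the message above.
bool isValidConstantInit(bool isScalable, const std::vector<int> &values) {
  return !isScalable || isSplat(values);
}
```

With this model, `dense<[0, 1]>` is rejected for `vector<[2] x i32>` while `dense<[1, 1]>` is accepted.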
Andrzej Warzynski
40327a628a Revert "[mlir][arith] Refine the verifier for arith.constant (#86178)"
This reverts commit 662c62609e.

Broken bot:
  * https://lab.llvm.org/buildbot/#/builders/61/builds/56565
2024-04-08 14:39:20 +01:00
Adrian Kuegel
a4c84d6ac1 [mlir] Only inline if properties are used.
This is a followup to 0f52f4ddd9
It breaks dialects that don't use properties yet.
2024-04-08 13:13:57 +00:00
Andrzej Warzyński
662c62609e [mlir][arith] Refine the verifier for arith.constant (#86178)
Disallows initialization of scalable vectors with an attribute of
arbitrary values, e.g.:
```mlir
  %c = arith.constant dense<[0, 1]> : vector<[2] x i32>
```

Initialization using vector splats remains allowed (i.e. when all the
init values are identical):
```mlir
  %c = arith.constant dense<[1, 1]> : vector<[2] x i32>
```
2024-04-08 13:59:27 +01:00
Jie Fu
2abd71ec51 [mlir] Fix -Wunused-variable in DebugImporter.cpp (NFC)
llvm-project/mlir/lib/Target/LLVMIR/DebugImporter.cpp:377:10:
error: unused variable '[_, inserted]' [-Werror,-Wunused-variable]
    auto [_, inserted] = dependentCache.try_emplace(
         ^
1 error generated.
2024-04-08 18:22:06 +08:00
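The warning quoted in 2abd71ec51 typically appears when no name of a structured binding is read in the current build configuration. The sketch below assumes, without confirmation from the patch, that `inserted` was only read inside an `assert`, and shows one common remedy (`[[maybe_unused]]`); the actual fix in the commit may differ, and `cacheOnce` is a hypothetical function.

```cpp
#include <cassert>
#include <map>
#include <string>

// Inserts `value` under `key`, expecting the key to be new.
void cacheOnce(std::map<std::string, int> &cache, const std::string &key,
               int value) {
  // With NDEBUG defined the assert below expands to nothing, so neither
  // binding would be read; [[maybe_unused]] keeps -Wunused-variable quiet.
  [[maybe_unused]] auto [_, inserted] = cache.try_emplace(key, value);
  assert(inserted && "key was already cached");
}
```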
Billy Zhu
81a7b6454e [MLIR][LLVM] Recursion importer handle repeated self-references (#87295)
Followup to this discussion:
https://github.com/llvm/llvm-project/pull/80251#discussion_r1535599920.

The previous debug importer was correct but inefficient. For cases with
mutual recursion that contain more than one back-edge, each back-edge
would result in a new translated instance. This is because the previous
implementation never caches any translated result with unbounded
self-references. This means all translation inside a recursive context
is performed from scratch, which will incur repeated run-time cost as
well as repeated attribute sub-trees in the translated IR (differing
only in their `recId`s).

This PR refactors the importer to handle caching inside a recursive
context.
- In the presence of unbound self-refs, the translation result is cached
in a separate cache that keeps track of the set of dependent unbound
self-refs.
- A dependent cache entry is valid only when all the unbound self-refs
are in scope. Whenever a cached entry goes out of scope, it will be
removed the next time it is looked up.
2024-04-08 01:09:54 -07:00
Prashant Kumar
9ffecef1c6 [mlir][vector][NFC] Fix typo temp -> tmp. (#87878)
2024-04-08 08:34:36 +02:00
Fabian Mora
a2c4b7c8e2 [mlir] Add convertInstruction and getSupportedInstructions to LLVMImportInterface (#86799)
This patch adds the `convertInstruction` and `getSupportedInstructions`
to `LLVMImportInterface`, allowing any non-LLVM dialect to specify how
to import LLVM IR instructions and overriding the default import of LLVM instructions.
2024-04-07 08:46:21 +02:00
Aviad Cohen
ccc02563f4 [mlir][linalg]: Fixed possible memory leak in cloneToCollapsedOp (#87595)
* A direct call to the `clone` function leads to a memory leak. We should use the `RewriterBase` clone function instead.
2024-04-07 08:23:16 +03:00
Matthias Springer
c459a366d3 [mlir][Arith] ValueBoundsOpInterface: Support arith.select (#87870)
This commit adds a `ValueBoundsOpInterface` implementation for
`arith.select`. The implementation is almost identical to `scf.if`
(#85895), but there is one special case: if the condition is a shaped
value, the selection is applied element-wise and the result shape can be
inferred from either operand.

Note: This is a re-upload of #86383.
2024-04-07 09:36:28 +09:00
Kai Sasaki
a522dbbd62 [mlir][complex] Support fast math flag for complex.sign op (#87148)
We are going to support the fast math flag given in `complex.sign` op in
the conversion to standard dialect.

See:
https://discourse.llvm.org/t/rfc-fastmath-flags-support-in-complex-dialect/71981
2024-04-06 15:35:10 +09:00
Matthias Springer
0ba3e96be1 [mlir][SCF][NFC] ValueBoundsConstraintSet: Simplify scf.for implementation (#87862)
This commit simplifies the implementation of the
`ValueBoundsOpInterface` for `scf.for` based on the newly added
`ValueBoundsConstraintSet::compare` API and adds additional
documentation.

Previously, the interface implementation created a new constraint set
just to check if the yielded value and iter_arg are equal. This was
inefficient because constraints were added multiple times (to two
different constraint sets) for ops that are inside the loop.

Note: This is a re-upload of #86239.
2024-04-06 15:30:26 +09:00
Matthias Springer
76435f2dca [mlir][SCF] ValueBoundsConstraintSet: Support scf.if (branches) (#87860)
This commit adds support for `scf.if` to `ValueBoundsConstraintSet`.

Example:
```
%0 = scf.if ... -> index {
  scf.yield %a : index
} else {
  scf.yield %b : index
}
```

The following constraints hold for %0:
* %0 >= min(%a, %b)
* %0 <= max(%a, %b)

Such constraints cannot be added to the constraint set; min/max is not
supported by `IntegerRelation`. However, if we know which one of %a and
%b is larger, we can add constraints for %0. E.g., if %a <= %b:
* %0 >= %a
* %0 <= %b

This commit required a few minor changes to the
`ValueBoundsConstraintSet` infrastructure, so that values can be
compared while we are still in the process of traversing the IR/adding
constraints.

Note: This is a re-upload of #85895, which was reverted. The bug that
caused the failure was fixed in #87859.
2024-04-06 13:04:49 +09:00
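The reasoning in 76435f2dca (bounds for `%0` can only be added once the order of `%a` and `%b` is known) can be sketched with a toy model. The code below assumes each yielded value is described by a closed integer interval instead of an `IntegerRelation`; `Interval` and `boundsOfIfResult` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Closed interval [lb, ub] standing in for what is known about a value.
struct Interval {
  std::int64_t lb;
  std::int64_t ub;
};

// Bounds for the result of an if that yields `a` in one branch and `b` in
// the other. A result is produced only when one value is provably <= the
// other; otherwise min/max would be needed and the sketch gives up.
std::optional<Interval> boundsOfIfResult(Interval a, Interval b) {
  assert(a.lb <= a.ub && b.lb <= b.ub && "malformed interval");
  if (a.ub <= b.lb) // a <= b: result >= a and result <= b
    return Interval{a.lb, b.ub};
  if (b.ub <= a.lb) // b <= a: result >= b and result <= a
    return Interval{b.lb, a.ub};
  return std::nullopt; // order unknown, no constraint added
}
```

Returning `std::nullopt` corresponds to the case where the commit message says such constraints cannot be added to the constraint set.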
Matthias Springer
08200fa3f5 [mlir][Arith] Specify evaluation order of getExpr (#87859)
The C++ standard does not specify an evaluation order for addition/...
operands. E.g., in `a() + b()`, the compiler is free to evaluate `a` or
`b` first.

This led to different `mlir-opt` outputs in #85895. (FileCheck passed
when compiled with LLVM but failed when compiled with gcc.)
2024-04-06 12:43:26 +09:00
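The evaluation-order issue from 08200fa3f5 can be reproduced in isolation. In the sketch below, `a` and `b` are hypothetical functions that record when they run; the first sum leaves the call order to the compiler, while the second pins it by using separate statements.

```cpp
#include <cassert>
#include <string>

// Hypothetical operands that log the order in which they are evaluated.
static int a(std::string &log) {
  log += "a";
  return 1;
}
static int b(std::string &log) {
  log += "b";
  return 2;
}

// Unspecified order: a conforming compiler may call a() or b() first.
int sumUnsequenced(std::string &log) { return a(log) + b(log); }

// Fixed order: each full statement completes before the next one starts,
// so a() always runs before b().
int sumSequenced(std::string &log) {
  int lhs = a(log);
  int rhs = b(log);
  assert(log.size() >= 2 && "both operands have been evaluated");
  return lhs + rhs;
}
```

Both functions return the same sum; only the side effects (here the log, in the commit the order of generated IR) can differ between compilers.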
Jeff Niu
0f52f4ddd9 [mlir][ods] Emit "trivial" ODS getter/setters inline (#87741)
Emitting trivial getters that amount to `(*this)->getOperand(1)` or
`getProperties().foo` out-of-line is a pretty significant performance
hit on these basic MLIR APIs for manipulating ops (3-4x). Emit them
inline (without adding additional dependencies to header files).
2024-04-06 04:01:37 +02:00
Diego Caballero
42a6ad7bad [mlir][Vector] Fix n-D vector.extract/insert lowering to LLVM (#87591)
The lowering of n-D vector.extract/insert ops to LLVM is not supported
but if one of these accidentally reaches the vector-to-llvm conversion
patterns, we end up with a kind of puzzling crash. This PR fixes that
crash and gracefully bails out in those cases.
2024-04-05 15:01:20 -07:00
Christian Ulmann
541962306d [MLIR][LLVM] Remove bitcast pattern from type consistency pass (#87755)
This commit removes the no longer required bitcast inserting pattern in
LLVM dialect's type consistency pattern. This was previously required to
enable Mem2Reg and SROA to promote accesses that had different types.
Recent changes to both passes added direct support for this feature to
them, so the pattern has no further use.
2024-04-05 15:47:16 +02:00
Jan Leyonberg
9708d09003 [MLIR][OpenMP] Skip host omp ops when compiling for the target device (#85239)
This patch separates the lowering dispatch for host and target devices.
For the target device, if the current operation is not a top-level
operation (e.g. omp.target) or is inside a target device code region it
will be ignored, since it belongs to the host code.


This is an alternative approach to #84611, the new test in this PR was
taken from there.
2024-04-05 09:25:28 -04:00
Andrei Golubev
9b5155c936 [mlir][OpFormatGen][NFC] Change Raw{Operands,Types} arrays to objects (#85631)
Tablegen generates uninitialized arrays of size 1 for raw operands and
types. In the current state this causes static analysis warnings about
"uninitialized fixed-size arrays" as their init is separated from their
declaration. Since these are single-entry arrays, we can just use a plain
variable instead of an array here.

Co-authored-by: Orest Chura <orest.chura@intel.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-04-05 13:49:32 +02:00
Mehdi Amini
8487e05967 Revert "[mlir][SCF] ValueBoundsConstraintSet: Support scf.if (branches) (#85895)"
This reverts commit 6b30ffef28.

gcc7 bot is broken
2024-04-05 03:00:35 -07:00
Mehdi Amini
e5e1bc0ad8 Revert "[mlir][SCF][NFC] ValueBoundsConstraintSet: Simplify scf.for implementation (#86239)"
This reverts commit 24e4429980.

gcc7 bot is broken
2024-04-05 03:00:29 -07:00
Mehdi Amini
f2d8218efa Revert "[mlir][Arith] ValueBoundsOpInterface: Support arith.select (#86383)"
This reverts commit 62b58d3418.

gcc7 bot is broken.
2024-04-05 03:00:02 -07:00
Benjamin Maxwell
0b7362c257 [mlir][arith] Add result pretty printing for constant vscale values (#83565)
In scalable code it is very common to have constant multiples of vscale,
e.g. `4 * vscale`. This updates `arith.muli` to pretty print the result
name in cases like this, so `4 * vscale` would be `%c4_vscale`.

This makes reading IR dumps of scalable code a little nicer.
2024-04-05 10:48:16 +01:00
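The naming scheme from 0b7362c257 can be sketched as a small string helper. This only models the one case given in the message (`4 * vscale` becomes `%c4_vscale`); how other cases are named is not stated there, and `vscaleMultipleName` is a hypothetical function.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds the pretty-printed SSA name for `multiplier * vscale`.
std::string vscaleMultipleName(std::int64_t multiplier) {
  assert(multiplier >= 0 && "sketch only covers non-negative multipliers");
  return "%c" + std::to_string(multiplier) + "_vscale";
}
```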
Andrzej Warzynski
5ed60ffd79 [mlir][test] Extend CMake logic for e2e tests
Adds two new CMake functions to query the host system:

  * `check_hwcap`,
  * `check_emulator`.

Together, these functions are used to check whether a given set of MLIR
integration tests require an emulator. If yes, then the corresponding
CMake var that defines the required emulator executable is also checked.

`check_hwcap` relies on ELF_HWCAP for discovering CPU features from
userspace on Linux systems. This is the recommended approach for Arm
CPUs running on Linux as outlined in this blog post:

  * https://community.arm.com/arm-community-blogs/b/operating-systems-blog/posts/runtime-detection-of-cpu-features-on-an-armv8-a-cpu

Other operating systems (e.g. Android) and CPU architectures will
most likely require some other approach. Right now these new hooks are
only used for SVE and SME integration tests.

This relands #86489 with the following changes:
  * Replaced:
      `set(hwcap_test_file ${CMAKE_BINARY_DIR}/${CMAKE_FILES_DIRECTORY}/hwcap_check.c)`
    with:
      `set(hwcap_test_file ${CMAKE_BINARY_DIR}/temp/hwcap_check.c)`
    The former would trigger an infinite loop when running `ninja`
    (after the initial CMake configuration).
  * Fixed commit msg. Previous one was taken from the initial GH PR
    commit rather than the final re-worked solution (missed this when
    merging via GH UI).
  * A couple more NFCs/tweaks.
2024-04-05 08:43:37 +00:00
mlevesquedion
73fa6685c4 Fix a few broken links (#87098)
References to headings need to be preceded with a slash. Also,
references to headings on the same page do not need to contain the name
of the document (omitting the document name means if the name changes
the links will still be valid).

I double checked the links by building [the
website](https://github.com/llvm/mlir-www):

```shell
./mlir-www-helper.sh --install-docs ../llvm-project website
cd website && hugo serve
```
2024-04-05 09:52:53 +02:00
Christian Ulmann
ef8322f41d [MLIR][LLVM] Improve bit- and addrspacecast folders (#87745)
This commit extends the folders of chainable casts (bitcast and
addrspacecast) to ensure that they fold a chain of the same casts into a
single cast.

Additionally cleans up the canonicalization test file, as this used some
outdated constructs.
2024-04-05 09:14:13 +02:00
Christian Ulmann
974f1ee58d [MLIR][LLVM][Mem2Reg] Relax type equality requirement for load and store (#87637)
This commit relaxes Mem2Reg's type equality requirement for the LLVM
dialect's load and store operations. For now, we only allow loads to be
promoted if the reaching definition can be cast into a value of the
target type.

For stores, the same conversion casting check is applied and we ensure
that their result is properly cast to the type of the memory slot.
This is necessary to satisfy assumptions of the general mem2reg pass, as
it creates block arguments with the types of the memory slot.

This relands https://github.com/llvm/llvm-project/pull/87504
2024-04-05 08:25:36 +02:00
Matthias Springer
62b58d3418 [mlir][Arith] ValueBoundsOpInterface: Support arith.select (#86383)
This commit adds a `ValueBoundsOpInterface` implementation for
`arith.select`. The implementation is almost identical to `scf.if`
(#85895), but there is one special case: if the condition is a shaped
value, the selection is applied element-wise and the result shape can be
inferred from either operand.
2024-04-05 13:39:14 +09:00
Matthias Springer
24e4429980 [mlir][SCF][NFC] ValueBoundsConstraintSet: Simplify scf.for implementation (#86239)
This commit simplifies the implementation of the
`ValueBoundsOpInterface` for `scf.for` based on the newly added
`ValueBoundsConstraintSet::compare` API and adds additional
documentation.

Previously, the interface implementation created a new constraint set
just to check if the yielded value and iter_arg are equal. This was
inefficient because constraints were added multiple times (to two
different constraint sets) for ops that are inside the loop.
2024-04-05 13:27:50 +09:00
Matthias Springer
6b30ffef28 [mlir][SCF] ValueBoundsConstraintSet: Support scf.if (branches) (#85895)
This commit adds support for `scf.if` to `ValueBoundsConstraintSet`.

Example:
```
%0 = scf.if ... -> index {
  scf.yield %a : index
} else {
  scf.yield %b : index
}
```

The following constraints hold for %0:
* %0 >= min(%a, %b)
* %0 <= max(%a, %b)

Such constraints cannot be added to the constraint set; min/max is not
supported by `IntegerRelation`. However, if we know which one of %a and
%b is larger, we can add constraints for %0. E.g., if %a <= %b:
* %0 >= %a
* %0 <= %b

This commit required a few minor changes to the
`ValueBoundsConstraintSet` infrastructure, so that values can be
compared while we are still in the process of traversing the IR/adding
constraints.
2024-04-05 13:14:00 +09:00