Commit Graph

5910 Commits

Author SHA1 Message Date
Ivan Butygin
f54cdc5d6e [mlir] IntegerRangeAnalysis: add support for vector type (#112292)
Treat integer range for vector type as union of ranges of individual
elements. With this semantics, most arith ops on vectors will work out
of the box, the only special handling needed for constants and vector
elements manipulation ops.

The end goal of these changes is to be able to optimize vectorized index
calculations.
2024-11-01 23:58:16 +03:00
Manupa Karunaratne
a6e72f9392 [MLIR][Vector] Add Lowering for vector.step (#113655)
Currently, the lowering for vector.step lives
under a folder. This is not ideal if we want
to do transformation on it and defer the
 materizaliztion of the constants much later.

This commits adds a rewrite pattern that
could be used by using
`transform.structured.vectorize_children_and_apply_patterns`
transform dialect operation.

Moreover, the rewriter of vector.step is also
now used in -convert-vector-to-llvm pass where
it handles scalable and non-scalable types as
LLVM expects it.

As a consequence of removing the vector.step
lowering as its folder, linalg vectorization
will keep vector.step intact.
2024-11-01 16:38:36 +00:00
Luke Hutton
36878b5542 [TOSA] Remove i64 from valid element datatypes in validation (#113380)
Align the validation pass valid element datatypes check more closely to
the specification by removing i64 as a supported datatype. The spec does
not currently support it.

Signed-off-by: Luke Hutton <luke.hutton@arm.com>
2024-11-01 10:12:43 +00:00
Rolf Morel
5c1752e368 [MLIR][DLTI] Pretty parsing and printing for DLTI attrs (#113365)
Unifies parsing and printing for DLTI attributes. Introduces a format of
`#dlti.attr<key1 = val1, ..., keyN = valN>` syntax for all queryable
DLTI attributes similar to that of the DictionaryAttr, while retaining
support for specifying key-value pairs with `#dlti.dl_entry` (whether to
retain this is TBD).

As the new format does away with most of the boilerplate, it is much easier
to parse for humans. This makes an especially big difference for nested
attributes.

Updates the DLTI-using tests and includes fixes for misc error checking/
error messages.
2024-10-31 19:18:24 +00:00
Abid Qadeer
89f2d50cda [mlir][debug] Support DIGenericSubrange. (#113441)
`DIGenericSubrange` is used when the dimensions of the arrays are
unknown at build time (e.g. assumed-rank arrays in Fortran). It has same
`lowerBound`, `upperBound`, `count` and `stride` fields as in
`DISubrange` and its translation looks quite similar as a result.

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-10-31 10:09:26 +00:00
Ilya Enkovich
d2109640a3 [MLIR] [AMX] Fix strides used by AMX lowering for tile loads and stores. (#113476) 2024-10-30 20:41:28 +01:00
Matthias Springer
217700baf7 [mlir][bufferization] Support bufferization of external functions (#113999)
This commit adds support for bufferizing external functions that have no
body. Such functions were previously rejected by One-Shot Bufferize if
they returned a tensor value.

This commit is in preparation of removing the deprecated
`func-bufferize` pass. That pass can bufferize external functions.

Also update a few comments.
2024-10-30 21:49:10 +09:00
Matthias Springer
ea050ab1a9 [mlir][Transforms][NFC] Dialect conversion: Reformat materialization error message (#114176)
This commit changes the format of the materialization error message.

Previously: `failed to legalize unresolved materialization from ('f64')
to 'f32' that remained live after conversion`
Now: `failed to legalize unresolved materialization from ('f64') to
('f32') that remained live after conversion`

This commit is in preparation of merging the 1:1 and 1:N dialect
conversions. At that point, target materializations may create more than
one SSA value. I am sending this change as a separate PR to keep the
main PR smaller.
2024-10-30 21:36:39 +09:00
donald chen
df0d249b65 [mlir] [linalg] fix side effect of linalg op (#114045)
Linalg op need to take into account memory side effects happening inside
the region when determining their own side effects.

This patch fixed issue
https://github.com/llvm/llvm-project/issues/112881
2024-10-30 14:01:49 +08:00
lialan
2c313259c6 [MLIR] VectorEmulateNarrowType to support loading of unaligned vectors (#113411)
Previously, the pass only supported emulation of loading vector sizes
that are multiples of the emulated data type. This patch expands its
support for emulating sizes that are not multiples of byte sizes. In
such cases, the element values are packed back-to-back to preserve
memory space.

To give a concrete example: if an input has type `memref<3x3xi2>`, it is
actually occupying 3 bytes in memory, with the first 18 bits storing the
values and the last 6 bits as padding. The slice of `vector<3xi2>` at
index `[2, 0]` is stored in memory from bit 12 to bit 18. To properly
load the elements from bit 12 to bit 18 from memory, first load byte 2
and byte 3, and convert it to a vector of `i2` type; then extract bits 4
to 10 (element index 2-5) to form a `vector<3xi2>`.

A limitation of this patch is that the linearized index of the unaligned
vector has to be known at compile time. Extra code needs to be emitted
to handle it if the condition does not hold.

The following ops are updated:
* `vector::LoadOp`
* `vector::TransferReadOp`
* `vector::MaskedLoadOp`
2024-10-29 20:04:48 -07:00
Kunwar Grover
2c5eea0e88 [mlir][Vector] Fix vector.insert folder for scalar to 0-d inserts (#113828)
The current vector.insert folder tries to replace a scalar with a 0-rank
vector. This patch fixes this crash by not folding unless they types of
the result and replacement are same.
2024-10-29 22:47:44 +00:00
Andrzej Warzyński
39ad84e4d1 [mlir][linalg] Split GenericPadOpVectorizationPattern into two patterns (#111349)
At the moment, `GenericPadOpVectorizationPattern` implements two
orthogonal transformations:
  1. Rewrites `tensor::PadOp` into a sequence of `tensor::EmptyOp`,
    `linalg::FillOp` and `tensor::InsertSliceOp`.
  2. Vectorizes (where possible) `tensor::InsertSliceOp` (see
    `tryVectorizeCopy`).

This patch splits `GenericPadOpVectorizationPattern` into two separate
patterns:
  1. `GeneralizePadOpPattern` for the first transformation (note that
    currently `GenericPadOpVectorizationPattern` inherits from
    `GeneralizePadOpPattern`).
  2. `InsertSliceVectorizePattern` to vectorize `tensor::InsertSliceOp`.

With this change, we gain the following:
  * a clear separation between pre-processing and vectorization
    transformations/stages,
  * a path to support masked vectorisation for `tensor.insert_slice`
    (with a dedicated pattern for vectorization, it is much easier to
    specify the input vector sizes used in masking),
  * more opportunities to vectorize `tensor.insert_slice`.

Note for downstream users:
--------------------------

If you were using `populatePadOpVectorizationPatterns`, following this
change you will also have to add
`populateInsertSliceVectorizationPatterns`.

Finer implementation details:
-----------------------------

1.  The majority of changes in this patch are copy & paste + some edits.
  1.1. The only functional change is that the vectorization of
    `tensor.insert_slice` is now broadly available (as opposed to being
    constrained to the pad vectorization pattern:
    `GenericPadOpVectorizationPattern`).
  1.2. Following-on from the above, `@pad_and_insert_slice_dest` is
    updated. As expected, the input `tensor.insert_slice` Op is no
    longer "preserved" and instead gets vectorized successfully.

2. The `linalg.fill` case in `getConstantPadVal` works under the
   assumption that only _scalar_ source values can be used. That's
   consistent with the definition of the Op, but it's not tested at the
   moment. Hence a test case in Linalg/invalid.mlir is added.

3. The behaviour of the two TD vectorization Ops,
   `transform.structured.vectorize_children_and_apply_patterns` and
   `transform.structured.vectorize` is preserved.
2024-10-29 16:57:23 +00:00
goldsteinn
2e612f8d86 [MLIR][Arith] Improve accuracy of inferDivU (#113789)
1) We can always bound the maximum with the numerator.
    - https://alive2.llvm.org/ce/z/PqHvuT
2) Even if denominator min can be zero, we can still bound the minimum
   result with `lhs.umin u/ rhs.umax`.

This is similar to https://github.com/llvm/llvm-project/pull/110169
2024-10-29 09:41:59 -05:00
Matthias Springer
1549a0c183 [mlir][SCF] Remove scf-bufferize pass (#113840)
The dialect conversion-based bufferization passes have been migrated to
One-Shot Bufferize about two years ago. To clean up the code base, this
commit removes the `scf-bufferize` pass, one of the few remaining parts
of the old infrastructure. Most bufferization passes have already been
removed.

Note for LLVM integration: If you depend on this pass, migrate to
One-Shot Bufferize or copy the pass to your codebase.
2024-10-29 09:10:30 +09:00
Andrzej Warzyński
0cf7aaf300 [MLIR][Vector] Update Transfer{Read|Write}DropUnitDimsPattern patterns (#112394)
Updates `TransferWriteDropUnitDimsPattern` and
`TransferReadDropUnitDimsPattern` to inherit from
`MaskableOpRewritePattern` so that masked versions of
xfer_read/xfer_write Ops are also supported:

```mlir
    %v = vector.mask %mask {
      vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
        memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>, vector<3x2xi8>
    } : vector<3x2xi1> -> vector<3x2xi8>
```
2024-10-26 13:54:04 +01:00
Jacques Pienaar
bb00f5b1ed [mlir][vector] Remove unneeded mask restriction (#113742)
These were added when the only mapping was to LLVM.
2024-10-25 20:45:44 -07:00
donald chen
889b67c9d3 [mlir] [memref] add more checks to the memref.reinterpret_cast (#112669)
Operation memref.reinterpret_cast was accept input like:

%out = memref.reinterpret_cast %in to offset: [%offset], sizes: [10],
strides: [1]
         : memref<?xf32> to memref<10xf32>

A problem arises: while lowering, the true offset of %out is %offset,
but its data type indicates an offset of 0. Permitting this
inconsistency can result in incorrect outcomes, as certain pass might
erroneously extract the offset from the data type of %out.

This patch fixes this by enforcing that the return value's data type
aligns
with the input parameter.
2024-10-26 08:07:51 +08:00
Andrzej Warzyński
ac4bd74190 [mlir] Add apply_patterns.linalg.pad_vectorization TD Op (#112504)
This PR simply wraps `populatePadOpVectorizationPatterns` into a new
Transform Dialect Op: `apply_patterns.linalg.pad_vectorization`.

This change makes it possible to run (and test) the corresponding
patterns _without_:

  `transform.structured.vectorize_children_and_apply_patterns`.

Note that the Op above only supports non-masked vectorisation (i.e. when
the inputs are static), so, effectively, only fixed-width vectorisation
(as opposed to scalable vectorisation). As such, this change is required
to construct vectorization pipelines for tensor.pad targeting scalable
vectors.

To test the new Op and the corresponding patterns, I added
"vectorization-pad-patterns.mlir" - most tests have been extracted from
"vectorization-with-patterns.mlir".
2024-10-25 10:39:26 -07:00
Ian Wood
455f71d285 [mlir] Convert expand_shape to more static form (#112265)
Add pattern that converts a `tensor.expand_shape` op to a more static
form.

This matches the pattern: `tensor.cast` -> `tensor.expand_shape` if it
has a foldable `tensor.cast` and some constant foldable `output_shape`
operands for the `tensor.expand_shape`. This makes the
`tensor.expand_shape` more static, as well as allowing the static
information to be propagated further down in the program.
2024-10-24 17:04:02 -07:00
Georgios Pinitas
8ad8db973e Revert "[TOSA] bug fix infer shape for slice" (#113413)
Reverts llvm/llvm-project#108306
2024-10-23 04:37:21 +01:00
Tai Ly
3b9526b231 [TOSA] bug fix infer shape for slice (#108306)
This fixes the infer output shape of TOSA slice op for start/size values
that are out-of-bound or -1

added tests to check:
  - size = -1
  - size is out of bound
  - start is out of bound

Signed-off-by: Tai Ly <tai.ly@arm.com>
2024-10-23 04:25:41 +01:00
Andrzej Warzyński
2a25200828 [mlir][tensor] Restrict the verifier for tensor.pack/tensor.unpack (#113108)
Restricts the verifier for tensor.pack and tensor.unpack Ops so that the
following is no longer allowed:

```mlir
  %c8 = arith.constant 8 : index
  %0 = tensor.pack %input inner_dims_pos = [0, 1] inner_tiles = [8, %c8] into %output : tensor<?x?xf32> -> tensor<?x?x8x8xf32>
```

Specifically, in line with other Tensor Ops, require:
  * a dynamic dimensions for each (dynamic) SSA value,
  * a static dimension for each static size (attribute).

In the example above, a static dimension (8) is mixed with a dynamic
size (%c8).

Note that this is mostly deleting existing code - that's because this
change simplifies the logic in verifier.

For more context:
* https://discourse.llvm.org/t/tensor-ops-with-dynamic-sizes-which-behaviour-is-more-correct
2024-10-22 20:11:05 -07:00
Longsheng Mou
519eef3bdc [mlir][tosa] Add a verifier for tosa.mul (#113320)
This PR adds a verifier check for tosa.mul, requiring that the shift be
0 for float types.
Fixes #112716.
2024-10-22 22:34:04 +01:00
weiwei chen
7191ced3b6 [MLIR] Add folding constants canonicalization for mlir::index::AddOp. (#111084)
- [x] Add a simple canonicalization for `mlir::index::AddOp`.
2024-10-22 12:04:26 -07:00
Kunwar Grover
1004865f1c [mlir][Vector] Support 0-d vectors natively in TransferOpReduceRank (#112907)
Since
ddf2d62c7d
, 0-d vectors are supported in VectorType. This patch removes 0-d vector
handling with scalars for the TransferOpReduceRank pattern. This pattern
specifically introduces tensor.extract_slice during vectorization,
causing vectorization to not fold transfer_read/transfer_write slices
properly. The changes in vectorization test files reflect this.

There are other places where lowering patterns are still side-stepping
from handling 0-d vectors properly, by turning them into scalars, but
this patch only focuses on the vector.transfer_x patterns.
2024-10-22 15:50:16 +01:00
Andrzej Warzyński
91c11574e8 Revert "[MLIR] Make OneShotModuleBufferize use OpInterface (#110322)" (#113124)
This reverts commit 2026501cf1.

Failing bot:
  * https://lab.llvm.org/staging/#/builders/125/builds/389
2024-10-22 13:28:44 +01:00
Longsheng Mou
2ce655cf1b [mlir][func] Fix multiple bugs in DuplicateFunctionElimination (#109571)
This PR fixes multiple bugs in `DuplicateFunctionElimination`.
- Prevents elimination of function declarations.
- Updates all symbol uses to reference unique function representatives.

Fixes #93483.
2024-10-22 09:19:12 +08:00
Razvan Lupusoru
ac9ee61857 [acc] Improve LegalizeDataValues pass to handle data constructs (#112990)
Renames LegalizeData to LegalizeDataValues since this pass fixes up SSA
values. LegalizeData suggested that it fixed data mapping.

This change also adds support to fix up ssa values for data clause
operations. Effectively, compute regions within a data region use the
ssa values from data operations also. The ssa values within data regions
but not within compute regions are not updated.

This change is to support the requirement in the OpenACC spec which
notes that a visible data clause is not just one on the current compute
construct but on the lexically containing data construct or visible
declare directive.
2024-10-21 09:49:58 -07:00
Felix Schneider
02bf3b54c0 [mlir][linalg] Add quantized conv2d operator with FCHW,NCHW order (#107740)
This patch adds a quantized version of the `linalg.conv2d_nchw_fchw` Op.
This is the "channel-first" ordering typically used by PyTorch and
others.
2024-10-19 18:25:27 +02:00
Max191
2bff9d9ffe [mlir] Don't hoist transfers from potentially zero trip loops (#112752)
The hoistRedundantVectorTransfers function does not verification of loop
bounds when hoisting vector transfers. This is not safe in general,
since it is possible that the loop will have zero trip count. This PR
uses ValueBounds to verify that the lower bound is less than the upper
bound of the loop before hoisting. Trip count verification is currently
behind an option `verifyNonZeroTrip`, which is false by default.

Zero trip count loops can arise in GPU code generation, where a loop
bound can be dependent on a thread id. If not all threads execute the
loop body, then hoisting out of the loop can cause these threads to
execute the transfers when they are not supposed to.

---------

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
2024-10-18 16:11:21 -04:00
Max191
98e838a890 [mlir] Do not bufferize parallel_insert_slice dest to read for full slices (#112761)
In the insert_slice bufferization interface implementation, the
destination tensor is not considered read if the full tensor is
overwritten by the slice. This PR adds the same check for
tensor.parallel_insert_slice.

Adds two new StaticValueUtils:
- `isAllConstantIntValue` checks if an array of `OpFoldResult` are all
equal to a passed `int64_t` value.
- `areConstantIntValues` checks if an array of `OpFoldResult` are all
equal to a passed array of `int64_t` values.

fixes https://github.com/llvm/llvm-project/issues/112435

---------

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
2024-10-18 16:02:03 -04:00
Max191
1ae24460d2 [mlir] Add forall canonicalization to replace constant induction vars (#112764)
Adds a canonicalization pattern for scf.forall that replaces constant
induction variables with a constant index. There is a similar
canonicalization that completely removes constant induction variables
from the loop, but that pattern does not apply on foralls with mappings,
so this one is necessary for those cases.

---------

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
2024-10-18 15:21:01 -04:00
Andrzej Warzyński
0a3347dc63 [mlir][linalg] Fix idx comparison in the vectorizer (#112900)
Fixes loop comparison condition in the vectorizer.

As that logic is used specifically for vectorising `tensor.extract`, I
also added a test that violates the assumptions made inside
`getTrailingNonUnitLoopDimIdx`, namely that Linalg loops are non-empty.
Vectorizer pre-conditions will capture that much earlier making sure
that `getTrailingNonUnitLoopDimIdx` is only run when all the assumptions
are actually met.

Thank you for pointing this out, @pfusik !
2024-10-18 15:27:43 +01:00
Andrzej Warzyński
1a871b2122 [mlir][tensor] Add tests to invalid.mlir (nfc) (#112759)
Adds two test with invalid usage of `tensor.extract_slice` that were
missing. Also moves one other test for `tensor.extract_slice`, so that
all tests for this Op are clustered together.

Note, this PR merely documents the current behaviour. No new
functionality is added.
2024-10-18 12:20:17 +01:00
Vinayak Dev
2f15d7e43e [mlir][tensor] Fix off-by-one error in ReshapeOpsUtils (#112774)
This patch fixes an off-by-one error in
`mlir::getReassociationIndicesForCollapse()` that occurs when the last
two dims of the source tensor satisfy the while loop.

This would cause an assertion failure due to out-of-bounds-access, which
is now fixed.
2024-10-18 14:02:30 +05:30
Sergio Afonso
4091bc61e3 [MLIR][OpenMP] Split region-associated op verification (#112355)
This patch moves the part of operation verifiers dependent on the
contents of their regions to the corresponding `verifyRegions` method.
This ensures these are only triggered after the operations in the region
have themselved already been verified in advance, avoiding checks based
on invalid nested operations.

The `LoopWrapperInterface` is also updated so that its verifier runs
after operations in the region of ops with this interface have already
been verified.
2024-10-17 10:46:38 +01:00
Ivan Butygin
6902b39b6f [mlir] UnsignedWhenEquivalent: use greedy rewriter instead of dialect conversion (#112454)
`UnsignedWhenEquivalent` doesn't really need any dialect conversion
features and switching it normal patterns makes it more composable with
other patterns-based transformations (and probably faster).
2024-10-17 12:23:11 +03:00
Longsheng Mou
f5aee1f18b [mlir][memref] Fix type conversion in emulate-wide-int and emulate-narrow-type (#112214)
This PR follows with #112104, using `nullptr` to indicate that type
conversion failed and no fallback conversion should be attempted.
2024-10-17 09:08:24 +08:00
Sirui Mu
1dfb104eac [mlir][LLVMIR] Add operand bundle support for llvm.intr.assume (#112143)
This patch adds operand bundle support for `llvm.intr.assume`.

This patch actually contains two parts:

- `llvm.intr.assume` now accepts operand bundle related attributes and
operands. `llvm.intr.assume` does not take constraint on the operand
bundles, but obviously only a few set of operand bundles are meaningful.
I plan to add some of those (e.g. `aligned` and `separate_storage` are
what interest me but other people may be interested in other operand
bundles as well) in future patches.

- The definitions of `llvm.call`, `llvm.invoke`, and
`llvm.call_intrinsic` actually define `op_bundle_tags` as an operation
property. It turns out this approach would introduce some unnecessary
burden if applied equally to the intrinsic operations because properties
are not available through `Operation *` but we have to operate on
`Operation *` during the import/export of intrinsics, so this PR changes
it from a property to an array attribute.

This patch relands commit d8fadad07c.
2024-10-16 20:49:02 +08:00
Sirui Mu
484c02780b Revert "[mlir][LLVMIR] Add operand bundle support for llvm.intr.assume (#112143)"
This reverts commit d8fadad07c.

The commit breaks the following CI builds:
- ppc64le-mlir-rhel-clang: https://lab.llvm.org/buildbot/#/builders/129/builds/7685
- ppc64le-flang-rhel-clang: https://lab.llvm.org/buildbot/#/builders/157/builds/10338
2024-10-16 14:15:31 +08:00
Sirui Mu
d8fadad07c [mlir][LLVMIR] Add operand bundle support for llvm.intr.assume (#112143)
This patch adds operand bundle support for `llvm.intr.assume`.

This patch actually contains two parts:

- `llvm.intr.assume` now accepts operand bundle related attributes and
operands. `llvm.intr.assume` does not take constraint on the operand
bundles, but obviously only a few set of operand bundles are meaningful.
I plan to add some of those (e.g. `aligned` and `separate_storage` are
what interest me but other people may be interested in other operand
bundles as well) in future patches.

- The definitions of `llvm.call`, `llvm.invoke`, and
`llvm.call_intrinsic` actually define `op_bundle_tags` as an operation
property. It turns out this approach would introduce some unnecessary
burden if applied equally to the intrinsic operations because properties
are not available through `Operation *` but we have to operate on
`Operation *` during the import/export of intrinsics, so this PR changes
it from a property to an array attribute.
2024-10-16 12:51:50 +08:00
ravil-mobile
d741435d77 [MLIR][ROCDL] Added SchedGroupBarrier and IglpOpt ops (#112237)
This PR adds missing `sched.group.barrier` and `rocdl.iglp.opt` ops to
the ROCDL dialect (see
[here](ec78f0da0e/clang/include/clang/Basic/BuiltinsAMDGPU.def (L66-L68))).
The ops are converted to the corresponding intrinsic calls during the
translation from MLIR to LLVM IRs. This intrinsics are hints to the
instruction scheduler of the AMDGPU backend.
2024-10-15 22:54:39 +01:00
SJW
8da5aa16f6 [mlir][SCF] Fix dynamic loop pipeline peeling for num_stages > total_iters (#112418)
When pipelining an `scf.for` with dynamic loop bounds, the epilogue
ramp-down must align with the prologue when num_stages >
total_iterations.

For example:
```
scf.for (0..ub) {
  load(i)
  add(i)
  store(i)
}
```
When num_stages=3 the pipeline follows:
```
load(0)  -  add(0)      -  scf.for (0..ub-2)    -  store(ub-2)
            load(1)     -                       -  add(ub-1)     -  store(ub-1)

```
The trailing `store(ub-2)`, `i=ub-2`, must align with the ramp-up for
`i=0` when `ub < num_stages-1`, so the index `i` should be `max(0,
ub-2)` and each subsequent index is an increment. The predicate must
also handle this scenario, so it becomes `predicate[0] =
total_iterations > epilogue_stage`.
2024-10-15 13:13:49 -07:00
Andrzej Warzyński
a758bcdbd9 [mlir][td] Rename pack_paddings in structured.pad (#111036)
The pack_paddings attribute in the structure.pad TD Op is used to set
the `nofold` attribute in the generated tensor.pad Op. The current name
is confusing and suggests that there's a relation with the tensor.pack
Op. This patch renames it as `nofold_flags` to better match the actual
usage.
2024-10-15 19:24:43 +01:00
Sergio Afonso
0a17bdfc36 [MLIR][OpenMP] Remove terminators from loop wrappers (#112229)
This patch simplifies the representation of OpenMP loop wrapper
operations by introducing the `NoTerminator` trait and updating
accordingly the verifier for the `LoopWrapperInterface`.

Since loop wrappers are already limited to having exactly one region
containing exactly one block, and this block can only hold a single
`omp.loop_nest` or loop wrapper and an `omp.terminator` that does not
return any values, it makes sense to simplify the representation of loop
wrappers by removing the terminator.

There is an extensive list of Lit tests that needed updating to remove
the `omp.terminator`s adding some noise to this patch, but actual
changes are limited to the definition of the `omp.wsloop`, `omp.simd`,
`omp.distribute` and `omp.taskloop` loop wrapper ops, Flang lowering for
those, `LoopWrapperInterface::verifyImpl()`, SCF to OpenMP conversion
and OpenMP dialect documentation.
2024-10-15 11:28:39 +01:00
Sasha Lopoukhine
36a405519b [mlir][SCF] Multiply lower bound in loop range folding (#111875)
Fixes #83482
2024-10-14 20:15:12 +02:00
Longsheng Mou
4b31568e02 [mlir][linalg] Bugfix for InlineScalarOperands (#111534)
This PR fixes a bug where `scalarOperand` is a simple scalar and should
be used directly, rather than accessed via `tensor.extract`. Fixes
#111243.
2024-10-14 15:38:35 +08:00
Abid Qadeer
cd12ffb622 [mlir][debug] Allow multiple DIGlobalVariableExpression on globals. (#111981)
Currently, we allow only one DIGlobalVariableExpressionAttr per global.
It is especially evident in import where we pick the first from the list
and ignore the rest. In contrast, LLVM allows multiple
DIGlobalVariableExpression to be attached to the global. They are needed
for correct working of things like DICommonBlock. This PR removes this
restriction in mlir. Changes are mostly mechanical. One thing on which I
went a bit back and forth was the representation inside GlobalOp. I
would be happy to change if there are better ways to do this.

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-10-13 23:36:00 +01:00
Matthias Springer
77f8297c6f [mlir][sparse] Improve sparse tensor type constraints (#112133)
Sparse tensors are always ranked tensors. Encodings cannot be attached
to unranked tensors. Change the type constraint to `RankedTensorOf`, so
that we generate `TypedValue<RankedTensorType>` instead of
`TypedValue<TensorType>`. This removes the need for type casting in some
cases.

Also improve the verifiers (missing `return` statements) and switch a
few other `AnyTensor` to `AnyRankedTensor`.

This commit is in preparation of a dialect conversion commit that
required fixes in the sparse dialect.
2024-10-13 21:12:38 +02:00
Jakub Kuderski
935810c4de [mlir][arith] Fix type conversion in emulate-wide-int (#112104)
Use `nullptr` to indicate that type conversion failed and no fallback
conversion should be attempted.

Fixes: https://github.com/llvm/llvm-project/issues/108163
2024-10-12 15:06:45 -04:00