Commit Graph

21049 Commits

Author SHA1 Message Date
Kazu Hirata
6f66530fd1 [mlir] Fix a warning
This patch fixes:

  mlir/lib/Pass/PassRegistry.cpp:425:37: error: ISO C++ requires the
  name after '::~' to be found in the same scope as the name before
  '::~' [-Werror,-Wdtor-name]
2024-10-29 10:55:34 -07:00
Sergio Afonso
a1f2fb6078 [MLIR][OpenMP] Prevent composite omp.simd related crashes (#113680)
This patch updates the translation of `omp.wsloop` with a nested
`omp.simd` to prevent uses of block arguments defined by the latter from
triggering null pointer dereferences.

This happens because the inner `omp.simd` operation representing
composite `do simd` constructs is currently skipped and not translated,
but this results in block arguments defined by it not being mapped to an
LLVM value. The proposed solution is to map these block arguments to the
LLVM value associated to the corresponding operand, which is defined
above.
2024-10-29 17:05:12 +00:00
Andrzej Warzyński
39ad84e4d1 [mlir][linalg] Split GenericPadOpVectorizationPattern into two patterns (#111349)
At the moment, `GenericPadOpVectorizationPattern` implements two
orthogonal transformations:
  1. Rewrites `tensor::PadOp` into a sequence of `tensor::EmptyOp`,
    `linalg::FillOp` and `tensor::InsertSliceOp`.
  2. Vectorizes (where possible) `tensor::InsertSliceOp` (see
    `tryVectorizeCopy`).

This patch splits `GenericPadOpVectorizationPattern` into two separate
patterns:
  1. `GeneralizePadOpPattern` for the first transformation (note that
    currently `GenericPadOpVectorizationPattern` inherits from
    `GeneralizePadOpPattern`).
  2. `InsertSliceVectorizePattern` to vectorize `tensor::InsertSliceOp`.

With this change, we gain the following:
  * a clear separation between pre-processing and vectorization
    transformations/stages,
  * a path to support masked vectorisation for `tensor.insert_slice`
    (with a dedicated pattern for vectorization, it is much easier to
    specify the input vector sizes used in masking),
  * more opportunities to vectorize `tensor.insert_slice`.

Note for downstream users:
--------------------------

If you were using `populatePadOpVectorizationPatterns`, following this
change you will also have to add
`populateInsertSliceVectorizationPatterns`.

Finer implementation details:
-----------------------------

1.  The majority of changes in this patch are copy & paste + some edits.
  1.1. The only functional change is that the vectorization of
    `tensor.insert_slice` is now broadly available (as opposed to being
    constrained to the pad vectorization pattern:
    `GenericPadOpVectorizationPattern`).
  1.2. Following-on from the above, `@pad_and_insert_slice_dest` is
    updated. As expected, the input `tensor.insert_slice` Op is no
    longer "preserved" and instead gets vectorized successfully.

2. The `linalg.fill` case in `getConstantPadVal` works under the
   assumption that only _scalar_ source values can be used. That's
   consistent with the definition of the Op, but it's not tested at the
   moment. Hence a test case in Linalg/invalid.mlir is added.

3. The behaviour of the two TD vectorization Ops,
   `transform.structured.vectorize_children_and_apply_patterns` and
   `transform.structured.vectorize` is preserved.
2024-10-29 16:57:23 +00:00
Hugo Trachino
a9c417c28a [MLIR][SCF] Fix LoopPeelOp documentation (NFC) (#113179)
As an example, I added annotations to the peel_front unit test.

```
func.func @loop_peel_first_iter_op() {
  // CHECK: %[[C0:.+]] = arith.constant 0
  // CHECK: %[[C41:.+]] = arith.constant 41
  // CHECK: %[[C5:.+]] = arith.constant 5
  // CHECK: %[[C5_0:.+]] = arith.constant 5
  // CHECK: scf.for %{{.+}} = %[[C0]] to %[[C5_0]] step %[[C5]]
  // CHECK:   arith.addi
  // CHECK: scf.for %{{.+}} = %[[C5_0]] to %[[C41]] step %[[C5]]
  // CHECK:   arith.addi
  %0 = arith.constant 0 : index
  %1 = arith.constant 41 : index
  %2 = arith.constant 5 : index
  scf.for %i = %0 to %1 step %2 {
    arith.addi %i, %i : index
  }
  return
}

module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["arith.addi"]} in %arg1 : (!transform.any_op) -> !transform.any_op
    %1 = transform.get_parent_op %0 {op_name = "scf.for"} : (!transform.any_op) -> !transform.op<"scf.for">
    %main_loop, %remainder = transform.loop.peel %1 {peel_front = true} : (!transform.op<"scf.for">) -> (!transform.op<"scf.for">, !transform.op<"scf.for">)
    transform.annotate %main_loop "main_loop" : !transform.op<"scf.for">
    transform.annotate %remainder "remainder" : !transform.op<"scf.for">
    transform.yield
  }
}
```
Gives :
```
  func.func @loop_peel_first_iter_op() {
    %c0 = arith.constant 0 : index
    %c41 = arith.constant 41 : index
    %c5 = arith.constant 5 : index
    %c5_0 = arith.constant 5 : index
    scf.for %arg0 = %c0 to %c5_0 step %c5 {
      %0 = arith.addi %arg0, %arg0 : index
    } {remainder}  // The first iteration loop (second result) has been annotated remainder
    scf.for %arg0 = %c5_0 to %c41 step %c5 {
      %0 = arith.addi %arg0, %arg0 : index
    } {main_loop} // The main loop (first result) has been annotated main_loop
    return
  }
```

---------

Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
2024-10-29 15:47:13 +00:00
goldsteinn
2e612f8d86 [MLIR][Arith] Improve accuracy of inferDivU (#113789)
1) We can always bound the maximum with the numerator.
    - https://alive2.llvm.org/ce/z/PqHvuT
2) Even if denominator min can be zero, we can still bound the minimum
   result with `lhs.umin u/ rhs.umax`.

This is similar to https://github.com/llvm/llvm-project/pull/110169
2024-10-29 09:41:59 -05:00
Piotr Fusik
c370869cd6 [mlir][NFC] Avoid a warning (#114052)
gcc 14.1 warning: template-id not allowed for destructor in C++20
[-Wtemplate-id-cdtor]
2024-10-29 15:01:37 +01:00
Matthias Springer
c0cba25cdd [mlir][Transforms] Dialect conversion: Hardening replaceOp (#109540)
This commit adds extra checks/assertions to the
`ConversionPatternRewriterImpl::notifyOpReplaced` to improve its
robustness.

1. Replacing an `unrealized_conversion_cast` op that was created by the
driver is now forbidden and caught early during `replaceOp`. It may work
in some cases, but it is generally dangerous because the conversion
driver keeps track of these ops and performs some extra legalization
steps during the "finalize" phase. (Erasing is them is fine.)
2. `null` replacement values are no longer registered in the
`ConversionValueMapping`. This was an oversight in #106760. There is no
benefit in having `null` values in the `ConversionValueMapping`. (It may
even cause problems.)

This change is in preparation of merging the 1:1 and 1:N dialect
conversion drivers.
2024-10-29 21:13:54 +09:00
Matthias Springer
6588073724 [mlir][func] Fix incorrect API usage in FuncOpConversion (#113977)
This commit fixes a case of incorrect dialect conversion API usage
during `FuncOpConversion`. `replaceAllUsesExcept` (same as
`replaceAllUsesWith`) is currently not supported in a dialect
conversion. `replaceUsesOfBlockArgument` should be used instead. It
sometimes works anyway (like in this case), but that's just because of
the way we insert materializations.

This commit is in preparation of merging the 1:1 and 1:N dialect
conversion drivers. (At that point, the current use of
`replaceAllUsesExcept` will no longer work.)
2024-10-29 13:19:43 +09:00
Matthias Springer
1549a0c183 [mlir][SCF] Remove scf-bufferize pass (#113840)
The dialect conversion-based bufferization passes have been migrated to
One-Shot Bufferize about two years ago. To clean up the code base, this
commit removes the `scf-bufferize` pass, one of the few remaining parts
of the old infrastructure. Most bufferization passes have already been
removed.

Note for LLVM integration: If you depend on this pass, migrate to
One-Shot Bufferize or copy the pass to your codebase.
2024-10-29 09:10:30 +09:00
Thomas Preud'homme
7db4cacfd7 [MLIR] Add missing MLIRLLVMDialect dep to MLIRLinalgToStandard (#113561)
This fixes the following failure when doing a clean build (in particular
no .ninja* lying around) of lib/libMLIRLinalgToStandard.a only:
```
In file included from llvm/include/llvm/IR/Module.h:22,
                 from mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h:37,
                 from mlir/lib/Conversion/LinalgToStandard/LinalgToStandard.cpp:13:
llvm/include/llvm/IR/Attributes.h:90:14: fatal error: llvm/IR/Attributes.inc: No such file or directory
```
2024-10-28 22:53:39 +01:00
Thomas Preud'homme
82cb22e735 [MLIR] Add missing MLIRLLVMDialect dep to MLIRMathToLibm (#113563)
This fixes the following failure when doing a clean build (in particular
no .ninja* lying around) of lib/libMLIRMathToLibm.a only:
```
In file included from llvm/include/llvm/IR/Module.h:22,
                 from mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h:37,
                 from mlir/lib/Conversion/MathToLibm/MathToLibm.cpp:13
llvm/include/llvm/IR/Attributes.h:90:14: fatal error: llvm/IR/Attributes.inc: No such file or directory
```
2024-10-28 22:50:23 +01:00
Petr Kurapov
7a710110fc [MLIR][Vector] Remove unused and unimplemented Vector_WarpExecuteOnLa… (#112338)
…ne0Op builder

Removing the declaration instead of implementing the builder as
discussed in #110106
2024-10-28 17:12:12 +01:00
donald chen
39ac64c1c0 [mlir][Arith] ValueBoundsInterface: speedup arith.select (#113531)
When calculating value bounds in the arith.select op , the compare
function is invoked to compare trueValue and falseValue. This function
rebuilds constraints, resulting in repeated computations of value
bounds.

In large-scale programs, this redundancy significantly impacts
compilation time.
2024-10-28 10:14:44 +08:00
Longsheng Mou
7ad63c0e44 [mlir][MathToFuncs] MathToFuncs only support integer type (#113693)
This PR fixes a bug in `MathToFuncs` where it incorrectly converts index
type for `math.ctlz` and `math.ipowi`, leading to a crash. Fixes
#108150.
2024-10-28 09:54:51 +08:00
Durgadoss R
e33aec89ef [MLIR][NVVM] Update the elect.sync Op to use intrinsics (#113757)
Recently, we added an intrinsic for the elect.sync PTX instruction (PR
104780). This patch updates the corresponding Op in NVVM Dialect
to lower to the intrinsic instead of inline-ptx.

The existing test under Conversion/ is migrated to check for the new
pattern. A separate test is added to verify the lowered intrinsic under
the Target/ directory.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-10-27 22:24:31 +05:30
Kazu Hirata
5287a9b345 [mlir] Prefer StringRef::substr to slice (NFC) (#113788)
I'm planning to migrate StringRef to std::string_view
eventually.  Since std::string_view does not have slice, this patch
migrates:

  slice(0, N)                to  substr(0, N)
  slice(N, StringRef::npos)  to  substr(N)
2024-10-27 07:28:27 -07:00
Sirui Mu
93da6423af [mlir][LLVM] Add builders for llvm.intr.assume (#113317)
This patch adds several new builders for llvm.intr.assume that build the
operation with additional operand bundles.
2024-10-27 11:52:00 +08:00
Andrzej Warzyński
0cf7aaf300 [MLIR][Vector] Update Transfer{Read|Write}DropUnitDimsPattern patterns (#112394)
Updates `TransferWriteDropUnitDimsPattern` and
`TransferReadDropUnitDimsPattern` to inherit from
`MaskableOpRewritePattern` so that masked versions of
xfer_read/xfer_write Ops are also supported:

```mlir
    %v = vector.mask %mask {
      vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
        memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>, vector<3x2xi8>
    } : vector<3x2xi1> -> vector<3x2xi8>
```
2024-10-26 13:54:04 +01:00
Durgadoss R
13d6233e77 [MLIR][NVGPU] Fix nvgpu_arrive syntax in matmulBuilder.py (#113713)
This patch updates the syntax for nvgpu_arrive Op
in matmulBuilder.py. This fixes the compilation
error for this test.

For the warp-specialized matmul_kernel implementation,
removing the WaitGroupSyncOp (after the mma-main-loop)
fixes the hang observed.

With these two fixes, the test compiles and
executes successfully on an sm90a machine.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-10-26 11:15:50 +05:30
Jacques Pienaar
bb00f5b1ed [mlir][vector] Remove unneeded mask restriction (#113742)
These were added when the only mapping was to LLVM.
2024-10-25 20:45:44 -07:00
Longsheng Mou
5aa741d7ca [mlir][SPIRVToLLVM] Erase empty spirv.mlir.loop in LoopPattern (#113527)
This PR erases `spirv.mlir.loop` with an empty region in `LoopPattern`,
resolving a crash. Fixes #113404.
2024-10-26 11:22:57 +08:00
Longsheng Mou
8f9fc6ce47 [mlir][GPU] Add FunctionOpInterface check for OpToFuncCallLowering (#113449)
This PR adds a `FunctionOpInterface` check in `OpToFuncCallLowering` to
resolve a crash when ops not in function. Fixes #113334.
2024-10-26 11:22:08 +08:00
Matthias Springer
a8ef0b33a6 [mlir] Fix build (#113750)
```
mlir/lib/Transforms/Utils/DialectConversion.cpp:2851:28: error: call of overloaded ‘TypeRange(llvm::SmallVector<mlir::Value>&)’ is ambiguous
     assert(TypeRange(result) == resultTypes &&
```
2024-10-25 19:11:38 -07:00
donald chen
889b67c9d3 [mlir] [memref] add more checks to the memref.reinterpret_cast (#112669)
Operation memref.reinterpret_cast was accept input like:

%out = memref.reinterpret_cast %in to offset: [%offset], sizes: [10],
strides: [1]
         : memref<?xf32> to memref<10xf32>

A problem arises: while lowering, the true offset of %out is %offset,
but its data type indicates an offset of 0. Permitting this
inconsistency can result in incorrect outcomes, as certain pass might
erroneously extract the offset from the data type of %out.

This patch fixes this by enforcing that the return value's data type
aligns
with the input parameter.
2024-10-26 08:07:51 +08:00
Matthias Springer
8c4bc1e75d [mlir][Transforms] Merge 1:1 and 1:N type converters (#113032)
The 1:N type converter derived from the 1:1 type converter and extends
it with 1:N target materializations. This commit merges the two type
converters and stores 1:N target materializations in the 1:1 type
converter. This is in preparation of merging the 1:1 and 1:N dialect
conversion infrastructures.

1:1 target materializations (producing a single `Value`) will remain
valid. An additional API is added to the type converter to register 1:N
target materializations (producing a `SmallVector<Value>`). Internally,
all target materializations are stored as 1:N materializations.

The 1:N type converter is removed.

Note for LLVM integration: If you are using the `OneToNTypeConverter`,
simply switch all occurrences to `TypeConverter`.

---------

Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
2024-10-25 11:44:20 -07:00
Andrzej Warzyński
ac4bd74190 [mlir] Add apply_patterns.linalg.pad_vectorization TD Op (#112504)
This PR simply wraps `populatePadOpVectorizationPatterns` into a new
Transform Dialect Op: `apply_patterns.linalg.pad_vectorization`.

This change makes it possible to run (and test) the corresponding
patterns _without_:

  `transform.structured.vectorize_children_and_apply_patterns`.

Note that the Op above only supports non-masked vectorisation (i.e. when
the inputs are static), so, effectively, only fixed-width vectorisation
(as opposed to scalable vectorisation). As such, this change is required
to construct vectorization pipelines for tensor.pad targeting scalable
vectors.

To test the new Op and the corresponding patterns, I added
"vectorization-pad-patterns.mlir" - most tests have been extracted from
"vectorization-with-patterns.mlir".
2024-10-25 10:39:26 -07:00
Thomas Preud'homme
bbc0e631d2 [MLIR] Remove unneeded LLVMDialect.h include in ControlFlowToSCF.cpp (#113560)
This fixes the following failure when doing a clean build (in particular
no .ninja* lying around) of lib/libMLIRControlFlowToSCF.a only:
```
In file included from llvm/include/llvm/IR/Module.h:22,
                 from mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h:37,
                 from mlir/lib/Conversion/ControlFlowToSCF/ControlFlowToSCF.cpp:19
llvm/include/llvm/IR/Attributes.h:90:14: fatal error: llvm/IR/Attributes.inc: No such file or directory
```
2024-10-25 15:41:39 +01:00
Andrea Faulds
9f6c632ecd [mlir][mlir-spirv-cpu-runner] Move MLIR pass pipeline to mlir-opt (#113594)
Adds a new mlir-opt test-only pass, -test-spirv-cpu-runner-pipeline,
which runs the set of MLIR passes needed for the mlir-spirv-cpu-runner,
and removes them from the runner. The tests are changed to invoke
mlir-opt with this flag before running the runner. The eventual goal is
to move all host/device code generation steps out of the runner, like
with some of the other runners.

Recommit of 17e9752267. It was reverted
due to a build failure, but the build failure had in fact already been
fixed in e7302319b5.
2024-10-25 07:21:59 -07:00
Sergio Afonso
d87964de78 [OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533)
This patch implements an approach to communicate errors between the
OMPIRBuilder and its users. It introduces `llvm::Error` and
`llvm::Expected` objects to replace the values returned by callbacks
passed to `OMPIRBuilder` codegen functions. These functions then check
the result for errors when callbacks are called and forward them back to
the caller, which has the flexibility to recover, exit cleanly or dump a
stack trace.

This prevents a failed callback to leave the IR in an invalid state and
still continue the codegen process, triggering unrelated assertions or
segmentation faults. In the case of MLIR to LLVM IR translation of the
'omp' dialect, this change results in the compiler emitting errors and
exiting early instead of triggering a crash for not-yet-implemented
errors. The behavior in Clang and openmp-opt stays unchanged, since
callbacks will continue always returning 'success'.
2024-10-25 11:30:16 +01:00
Sayan Saha
b91ce7bc0e [Tosa] : Fix integer overflow for computing intmax+1 in tosa.cast to linalg. (#112455)
This PR fixes an issue related to integer overflow when computing
`(intmax+1)` for `i64` during `tosa-to-linalg` pass for `tosa.cast`.

Found this issue while debugging a numerical mismatch for `deeplabv3`
model from `torchvision` represented in `tosa` dialect using the
`TorchToTosa` pipeline in `torch-mlir` repository. `torch.aten.to.dtype`
is converted to `tosa.cast` that casts `f32` to `i64` type. Technically
by the specification, `tosa.cast` doesn't handle casting `f32` to `i64`.
So it's possible to add a verifier to error out for such tosa ops
instead of producing incorrect code. However, I chose to fix the
overflow issue to still be able to represent the `deeplabv3` model with
`tosa` ops in the above-mentioned pipeline. Open to suggestions if
adding the verifier is more appropriate instead.
2024-10-25 10:51:09 +01:00
Max191
f1595ecfdc [mlir] Fix bug in UnPackOp tiling implementation causing infinite loop (#113571)
This fixes a bug in the tiling implementation of tensor.unpack that was
causing an infinite loop when certain unpack ops get tiled and fused as
a producer. The tiled implementation of tensor.unpack sometimes needs to
create an additional tensor.extract_slice on the result of the tiled
unpack op, but this slice was getting added to the `generatedSlices` of
the tiling result. The `generatedSlices` are used to find the next
producers to fuse, so it caused an infinite loop of fusing the same
unpack op after it was already in the loop. This fixes the bug by adding
the slice of the source instead of the result.

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
2024-10-24 21:32:45 -04:00
Ian Wood
455f71d285 [mlir] Convert expand_shape to more static form (#112265)
Add pattern that converts a `tensor.expand_shape` op to a more static
form.

This matches the pattern: `tensor.cast` -> `tensor.expand_shape` if it
has a foldable `tensor.cast` and some constant foldable `output_shape`
operands for the `tensor.expand_shape`. This makes the
`tensor.expand_shape` more static, as well as allowing the static
information to be propagated further down in the program.
2024-10-24 17:04:02 -07:00
Adrian Kuegel
7e61d89373 [mlir] Apply ClangTidy performance finding
loop variable is copied but only used as const reference
2024-10-24 12:32:49 +00:00
Longsheng Mou
927559d27d [mlir][vector] Fix a crash in VectorToGPU (#113454)
This PR fixes a crash in `VectorToGPU` when the operand of `extOp` is a
function argument, which cannot be retrieved using `getDefiningOp`.
Fixes #107967.
2024-10-24 20:28:42 +08:00
Rahul Joshi
743f839a88 [NFC][LLVM][TableGen] Change RecordKeeper::getClass to return const pointer (#112261)
Change `RecordKeeper::getClass` to return const record pointer. This is
a part of effort to have better const correctness in TableGen backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-10-23 11:32:05 -07:00
Matthias Springer
f18c3e4e73 [mlir][Transforms] Dialect Conversion: Simplify materialization fn result type (#113031)
This commit simplifies the result type of materialization functions.

Previously: `std::optional<Value>`
Now: `Value`

The previous implementation allowed 3 possible return values:
- Non-null value: The materialization function produced a valid
materialization.
- `std::nullopt`: The materialization function failed, but another
materialization can be attempted.
- `Value()`: The materialization failed and so should the dialect
conversion. (Previously: Dialect conversion can roll back.)

This commit removes the last variant. It is not particularly useful
because the dialect conversion will fail anyway if all other
materialization functions produced `std::nullopt`.

Furthermore, in contrast to type conversions, at least one
materialization callback is expected to succeed. In case of a failing
type conversion, the current dialect conversion can roll back and try a
different pattern. This also used to be the case for materializations,
but that functionality was removed with #107109: failed materializations
can no longer trigger a rollback. (They can just make the entire dialect
conversion fail without rollback.) With this in mind, it is even less
useful to have an additional error state for materialization functions.

This commit is in preparation of merging the 1:1 and 1:N type
converters. Target materializations will have to return multiple values
instead of a single one. With this commit, we can keep the API simple:
`SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`.

Note for LLVM integration: All 1:1 materializations should return
`Value` instead of `std::optional<Value>`. Instead of `std::nullopt`
return `Value()`.
2024-10-23 07:29:17 -07:00
Kareem Ergawy
ad70f3e095 [flang][OpenMP] Support target enter|update|exit .. nowait (#113305)
Extends `nowait` support for other device directives. This PR refactors
the task generation utils used for the `target` directive so that they
are general enough to be reused for other device directives as well.
2024-10-23 10:48:54 +02:00
Dmitriy Smirnov
27158edaa4 [MLIR][SPIRV] Update cast from IntN to Bool (#113329)
This PR updates the cast to bool from IntN to treat any non-zero value
as TRUE. This makes the cast more resilient to non-generic (i.e. "non
1") TRUE values.

Signed-off-by: Dmitriy Smirnov <dmitriy.smirnov@arm.com>
2024-10-23 09:47:33 +01:00
Nikolay Panchenko
076d3e2326 [mlir][ods] Verify access to operands in inferReturnTypes (#112574)
The patch adds graceful handling of incorrectly constructed MLIR
operation with less operands than expected.
2024-10-23 00:20:53 -04:00
Georgios Pinitas
8ad8db973e Revert "[TOSA] bug fix infer shape for slice" (#113413)
Reverts llvm/llvm-project#108306
2024-10-23 04:37:21 +01:00
Tai Ly
3b9526b231 [TOSA] bug fix infer shape for slice (#108306)
This fixes the infer output shape of TOSA slice op for start/size values
that are out-of-bound or -1

added tests to check:
  - size = -1
  - size is out of bound
  - start is out of bound

Signed-off-by: Tai Ly <tai.ly@arm.com>
2024-10-23 04:25:41 +01:00
Andrzej Warzyński
2a25200828 [mlir][tensor] Restrict the verifier for tensor.pack/tensor.unpack (#113108)
Restricts the verifier for tensor.pack and tensor.unpack Ops so that the
following is no longer allowed:

```mlir
  %c8 = arith.constant 8 : index
  %0 = tensor.pack %input inner_dims_pos = [0, 1] inner_tiles = [8, %c8] into %output : tensor<?x?xf32> -> tensor<?x?x8x8xf32>
```

Specifically, in line with other Tensor Ops, require:
  * a dynamic dimensions for each (dynamic) SSA value,
  * a static dimension for each static size (attribute).

In the example above, a static dimension (8) is mixed with a dynamic
size (%c8).

Note that this is mostly deleting existing code - that's because this
change simplifies the logic in verifier.

For more context:
* https://discourse.llvm.org/t/tensor-ops-with-dynamic-sizes-which-behaviour-is-more-correct
2024-10-22 20:11:05 -07:00
NimishMishra
645e6f1114 [llvm][OpenMP] Handle complex types in atomic read (#111377)
This patch adds functionality for atomically reading `llvm.struct`
types.

Fixes: https://github.com/llvm/llvm-project/issues/93441
2024-10-22 18:34:22 -07:00
Longsheng Mou
519eef3bdc [mlir][tosa] Add a verifier for tosa.mul (#113320)
This PR adds a verifier check for tosa.mul, requiring that the shift be
0 for float types.
Fixes #112716.
2024-10-22 22:34:04 +01:00
weiwei chen
7191ced3b6 [MLIR] Add folding constants canonicalization for mlir::index::AddOp. (#111084)
- [x] Add a simple canonicalization for `mlir::index::AddOp`.
2024-10-22 12:04:26 -07:00
Kazu Hirata
c5ea7b8338 [mlir] Avoid repeated hash lookups (NFC) (#113249) 2024-10-22 08:00:11 -07:00
Kunwar Grover
1004865f1c [mlir][Vector] Support 0-d vectors natively in TransferOpReduceRank (#112907)
Since
ddf2d62c7d
, 0-d vectors are supported in VectorType. This patch removes 0-d vector
handling with scalars for the TransferOpReduceRank pattern. This pattern
specifically introduces tensor.extract_slice during vectorization,
causing vectorization to not fold transfer_read/transfer_write slices
properly. The changes in vectorization test files reflect this.

There are other places where lowering patterns are still side-stepping
from handling 0-d vectors properly, by turning them into scalars, but
this patch only focuses on the vector.transfer_x patterns.
2024-10-22 15:50:16 +01:00
Andrzej Warzyński
91c11574e8 Revert "[MLIR] Make OneShotModuleBufferize use OpInterface (#110322)" (#113124)
This reverts commit 2026501cf1.

Failing bot:
  * https://lab.llvm.org/staging/#/builders/125/builds/389
2024-10-22 13:28:44 +01:00
Longsheng Mou
2ce655cf1b [mlir][func] Fix multiple bugs in DuplicateFunctionElimination (#109571)
This PR fixes multiple bugs in `DuplicateFunctionElimination`.
- Prevents elimination of function declarations.
- Updates all symbol uses to reference unique function representatives.

Fixes #93483.
2024-10-22 09:19:12 +08:00
Mehdi Amini
3acc58c1bb Revert "Fix CMake dependencies on mlir-linalg-ods-yaml-gen" (#113229)
Reverts llvm/llvm-project#112224

Many bots are broken
2024-10-21 15:28:20 -07:00