clang-p2996

Author	SHA1	Message	Date
Ivan Butygin	f54cdc5d6e	[mlir] IntegerRangeAnalysis: add support for vector type (#112292 ) Treat integer range for vector type as union of ranges of individual elements. With these semantics, most arith ops on vectors will work out of the box; the only special handling needed is for constants and vector element manipulation ops. The end goal of these changes is to be able to optimize vectorized index calculations.	2024-11-01 23:58:16 +03:00
Razvan Lupusoru	c0a1597029	[mlir][acc] Consistency between acc.loop and acc compute ops (#114549 ) - GangPrivate and GangFirstPrivate renamed to just Private and Firstprivate respectively. This makes compute ops consistent with the loop op (and also with the acc spec wording for the clause). - Added getBody to all compute ops - Verifier for firstprivate ops / recipes is enabled	2024-11-01 10:53:51 -07:00
Manupa Karunaratne	a6e72f9392	[MLIR][Vector] Add Lowering for vector.step (#113655 ) Currently, the lowering for vector.step lives under a folder. This is not ideal if we want to do transformation on it and defer the materialization of the constants much later. This commit adds a rewrite pattern that could be used by using `transform.structured.vectorize_children_and_apply_patterns` transform dialect operation. Moreover, the rewriter of vector.step is also now used in -convert-vector-to-llvm pass where it handles scalable and non-scalable types as LLVM expects it. As a consequence of removing the vector.step lowering as its folder, linalg vectorization will keep vector.step intact.	2024-11-01 16:38:36 +00:00
Luke Hutton	36878b5542	[TOSA] Remove i64 from valid element datatypes in validation (#113380 ) Align the validation pass valid element datatypes check more closely to the specification by removing i64 as a supported datatype. The spec does not currently support it. Signed-off-by: Luke Hutton <luke.hutton@arm.com>	2024-11-01 10:12:43 +00:00
Rolf Morel	5c1752e368	[MLIR][DLTI] Pretty parsing and printing for DLTI attrs (#113365 ) Unifies parsing and printing for DLTI attributes. Introduces a format of `#dlti.attr<key1 = val1, ..., keyN = valN>` syntax for all queryable DLTI attributes similar to that of the DictionaryAttr, while retaining support for specifying key-value pairs with `#dlti.dl_entry` (whether to retain this is TBD). As the new format does away with most of the boilerplate, it is much easier to parse for humans. This makes an especially big difference for nested attributes. Updates the DLTI-using tests and includes fixes for misc error checking/ error messages.	2024-10-31 19:18:24 +00:00
Jakub Kuderski	0f8a6b7d03	[mlir] Add fast walk-based pattern rewrite driver (#113825 ) This is intended as a fast pattern rewrite driver for the cases when a simple walk gets the job done but we would still want to implement it in terms of rewrite patterns (that can be used with the greedy pattern rewrite driver downstream). The new driver is inspired by the discussion in https://github.com/llvm/llvm-project/pull/112454 and the LLVM Dev presentation from @matthias-springer earlier this week. This driver comes with some limitations: * It does not repeat until a fixpoint or revisit ops modified in place or newly created ops. In general, it only walks forward (in the post-order). * `matchAndRewrite` can only erase the matched op or its descendants. This is verified under expensive checks. * It does not perform folding / DCE. We could probably relax some of these in the future without sacrificing too much performance.	2024-10-31 11:10:09 -04:00
Sergio Afonso	21a6032eca	[MLIR][OpenMP] Simplify translation to LLVM IR error handling (#114036 ) This patch unifies the handling of errors passed through the OpenMPIRBuilder and removes some redundant error messages through the introduction of a custom `ErrorInfo` subclass. Additionally, the current list of operations and clauses unsupported by the MLIR to LLVM IR translation pass is added to a new Lit test to check they are being reported to the user.	2024-10-31 11:34:24 +00:00
Abid Qadeer	89f2d50cda	[mlir][debug] Support DIGenericSubrange. (#113441 ) `DIGenericSubrange` is used when the dimensions of the arrays are unknown at build time (e.g. assumed-rank arrays in Fortran). It has same `lowerBound`, `upperBound`, `count` and `stride` fields as in `DISubrange` and its translation looks quite similar as a result. --------- Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>	2024-10-31 10:09:26 +00:00
Matthias Springer	d043670d66	[mlir][func] Replace `ValueDecomposer` with target materialization (#114192 ) The `ValueDecomposer` in `DecomposeCallGraphTypes` was a workaround around missing 1:N support in the dialect conversion. Since #113032, the dialect conversion infrastructure supports 1:N type conversions and 1:N target materializations. The `ValueDecomposer` class is no longer needed. (However, target materializations must still be inserted manually, until we fully merge the 1:1 and 1:N drivers.) Note for LLVM integration: Register 1:N target materializations on the type converter instead of "decompose value conversions" on the `ValueDecomposer`.	2024-10-31 07:26:12 +09:00
Ilya Enkovich	d2109640a3	[MLIR] [AMX] Fix strides used by AMX lowering for tile loads and stores. (#113476 )	2024-10-30 20:41:28 +01:00
Matthias Springer	217700baf7	[mlir][bufferization] Support bufferization of external functions (#113999 ) This commit adds support for bufferizing external functions that have no body. Such functions were previously rejected by One-Shot Bufferize if they returned a tensor value. This commit is in preparation of removing the deprecated `func-bufferize` pass. That pass can bufferize external functions. Also update a few comments.	2024-10-30 21:49:10 +09:00
lialan	2c313259c6	[MLIR] VectorEmulateNarrowType to support loading of unaligned vectors (#113411 ) Previously, the pass only supported emulation of loading vector sizes that are multiples of the emulated data type. This patch expands its support for emulating sizes that are not multiples of byte sizes. In such cases, the element values are packed back-to-back to preserve memory space. To give a concrete example: if an input has type `memref<3x3xi2>`, it is actually occupying 3 bytes in memory, with the first 18 bits storing the values and the last 6 bits as padding. The slice of `vector<3xi2>` at index `[2, 0]` is stored in memory from bit 12 to bit 18. To properly load the elements from bit 12 to bit 18 from memory, first load byte 2 and byte 3, and convert it to a vector of `i2` type; then extract bits 4 to 10 (element index 2-5) to form a `vector<3xi2>`. A limitation of this patch is that the linearized index of the unaligned vector has to be known at compile time. Extra code needs to be emitted to handle it if the condition does not hold. The following ops are updated: * `vector::LoadOp` * `vector::TransferReadOp` * `vector::MaskedLoadOp`	2024-10-29 20:04:48 -07:00
Kunwar Grover	2c5eea0e88	[mlir][Vector] Fix vector.insert folder for scalar to 0-d inserts (#113828 ) The current vector.insert folder tries to replace a scalar with a 0-rank vector. This patch fixes this crash by not folding unless the types of the result and replacement are the same.	2024-10-29 22:47:44 +00:00
Andrzej Warzyński	39ad84e4d1	[mlir][linalg] Split GenericPadOpVectorizationPattern into two patterns (#111349 ) At the moment, `GenericPadOpVectorizationPattern` implements two orthogonal transformations: 1. Rewrites `tensor::PadOp` into a sequence of `tensor::EmptyOp`, `linalg::FillOp` and `tensor::InsertSliceOp`. 2. Vectorizes (where possible) `tensor::InsertSliceOp` (see `tryVectorizeCopy`). This patch splits `GenericPadOpVectorizationPattern` into two separate patterns: 1. `GeneralizePadOpPattern` for the first transformation (note that currently `GenericPadOpVectorizationPattern` inherits from `GeneralizePadOpPattern`). 2. `InsertSliceVectorizePattern` to vectorize `tensor::InsertSliceOp`. With this change, we gain the following: * a clear separation between pre-processing and vectorization transformations/stages, * a path to support masked vectorisation for `tensor.insert_slice` (with a dedicated pattern for vectorization, it is much easier to specify the input vector sizes used in masking), * more opportunities to vectorize `tensor.insert_slice`. Note for downstream users: -------------------------- If you were using `populatePadOpVectorizationPatterns`, following this change you will also have to add `populateInsertSliceVectorizationPatterns`. Finer implementation details: ----------------------------- 1. The majority of changes in this patch are copy & paste + some edits. 1.1. The only functional change is that the vectorization of `tensor.insert_slice` is now broadly available (as opposed to being constrained to the pad vectorization pattern: `GenericPadOpVectorizationPattern`). 1.2. Following-on from the above, `@pad_and_insert_slice_dest` is updated. As expected, the input `tensor.insert_slice` Op is no longer "preserved" and instead gets vectorized successfully. 2. The `linalg.fill` case in `getConstantPadVal` works under the assumption that only _scalar_ source values can be used. 
That's consistent with the definition of the Op, but it's not tested at the moment. Hence a test case in Linalg/invalid.mlir is added. 3. The behaviour of the two TD vectorization Ops, `transform.structured.vectorize_children_and_apply_patterns` and `transform.structured.vectorize` is preserved.	2024-10-29 16:57:23 +00:00
Hugo Trachino	a9c417c28a	[MLIR][SCF] Fix LoopPeelOp documentation (NFC) (#113179 ) As an example, I added annotations to the peel_front unit test. ``` func.func @loop_peel_first_iter_op() { // CHECK: %[[C0:.+]] = arith.constant 0 // CHECK: %[[C41:.+]] = arith.constant 41 // CHECK: %[[C5:.+]] = arith.constant 5 // CHECK: %[[C5_0:.+]] = arith.constant 5 // CHECK: scf.for %{{.+}} = %[[C0]] to %[[C5_0]] step %[[C5]] // CHECK: arith.addi // CHECK: scf.for %{{.+}} = %[[C5_0]] to %[[C41]] step %[[C5]] // CHECK: arith.addi %0 = arith.constant 0 : index %1 = arith.constant 41 : index %2 = arith.constant 5 : index scf.for %i = %0 to %1 step %2 { arith.addi %i, %i : index } return } module attributes {transform.with_named_sequence} { transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) { %0 = transform.structured.match ops{["arith.addi"]} in %arg1 : (!transform.any_op) -> !transform.any_op %1 = transform.get_parent_op %0 {op_name = "scf.for"} : (!transform.any_op) -> !transform.op<"scf.for"> %main_loop, %remainder = transform.loop.peel %1 {peel_front = true} : (!transform.op<"scf.for">) -> (!transform.op<"scf.for">, !transform.op<"scf.for">) transform.annotate %main_loop "main_loop" : !transform.op<"scf.for"> transform.annotate %remainder "remainder" : !transform.op<"scf.for"> transform.yield } } ``` Gives : ``` func.func @loop_peel_first_iter_op() { %c0 = arith.constant 0 : index %c41 = arith.constant 41 : index %c5 = arith.constant 5 : index %c5_0 = arith.constant 5 : index scf.for %arg0 = %c0 to %c5_0 step %c5 { %0 = arith.addi %arg0, %arg0 : index } {remainder} // The first iteration loop (second result) has been annotated remainder scf.for %arg0 = %c5_0 to %c41 step %c5 { %0 = arith.addi %arg0, %arg0 : index } {main_loop} // The main loop (first result) has been annotated main_loop return } ``` --------- Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>	2024-10-29 15:47:13 +00:00
Matthias Springer	1549a0c183	[mlir][SCF] Remove `scf-bufferize` pass (#113840 ) The dialect conversion-based bufferization passes have been migrated to One-Shot Bufferize about two years ago. To clean up the code base, this commit removes the `scf-bufferize` pass, one of the few remaining parts of the old infrastructure. Most bufferization passes have already been removed. Note for LLVM integration: If you depend on this pass, migrate to One-Shot Bufferize or copy the pass to your codebase.	2024-10-29 09:10:30 +09:00
donald chen	39ac64c1c0	[mlir][Arith] ValueBoundsInterface: speedup arith.select (#113531 ) When calculating value bounds in the arith.select op, the compare function is invoked to compare trueValue and falseValue. This function rebuilds constraints, resulting in repeated computations of value bounds. In large-scale programs, this redundancy significantly impacts compilation time.	2024-10-28 10:14:44 +08:00
Sirui Mu	93da6423af	[mlir][LLVM] Add builders for llvm.intr.assume (#113317 ) This patch adds several new builders for llvm.intr.assume that build the operation with additional operand bundles.	2024-10-27 11:52:00 +08:00
Andrzej Warzyński	0cf7aaf300	[MLIR][Vector] Update Transfer{Read|Write}DropUnitDimsPattern patterns (#112394 ) Updates `TransferWriteDropUnitDimsPattern` and `TransferReadDropUnitDimsPattern` to inherit from `MaskableOpRewritePattern` so that masked versions of xfer_read/xfer_write Ops are also supported: ```mlir %v = vector.mask %mask { vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst : memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>, vector<3x2xi8> } : vector<3x2xi1> -> vector<3x2xi8> ```	2024-10-26 13:54:04 +01:00
Jacques Pienaar	bb00f5b1ed	[mlir][vector] Remove unneeded mask restriction (#113742 ) These were added when the only mapping was to LLVM.	2024-10-25 20:45:44 -07:00
donald chen	889b67c9d3	[mlir] [memref] add more checks to the memref.reinterpret_cast (#112669 ) Operation memref.reinterpret_cast used to accept input like: %out = memref.reinterpret_cast %in to offset: [%offset], sizes: [10], strides: [1] : memref<?xf32> to memref<10xf32> A problem arises: while lowering, the true offset of %out is %offset, but its data type indicates an offset of 0. Permitting this inconsistency can result in incorrect outcomes, as certain passes might erroneously extract the offset from the data type of %out. This patch fixes this by enforcing that the return value's data type aligns with the input parameter.	2024-10-26 08:07:51 +08:00
Matthias Springer	8c4bc1e75d	[mlir][Transforms] Merge 1:1 and 1:N type converters (#113032 ) The 1:N type converter derived from the 1:1 type converter and extends it with 1:N target materializations. This commit merges the two type converters and stores 1:N target materializations in the 1:1 type converter. This is in preparation of merging the 1:1 and 1:N dialect conversion infrastructures. 1:1 target materializations (producing a single `Value`) will remain valid. An additional API is added to the type converter to register 1:N target materializations (producing a `SmallVector<Value>`). Internally, all target materializations are stored as 1:N materializations. The 1:N type converter is removed. Note for LLVM integration: If you are using the `OneToNTypeConverter`, simply switch all occurrences to `TypeConverter`. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com>	2024-10-25 11:44:20 -07:00
Andrzej Warzyński	ac4bd74190	[mlir] Add apply_patterns.linalg.pad_vectorization TD Op (#112504 ) This PR simply wraps `populatePadOpVectorizationPatterns` into a new Transform Dialect Op: `apply_patterns.linalg.pad_vectorization`. This change makes it possible to run (and test) the corresponding patterns _without_: `transform.structured.vectorize_children_and_apply_patterns`. Note that the Op above only supports non-masked vectorisation (i.e. when the inputs are static), so, effectively, only fixed-width vectorisation (as opposed to scalable vectorisation). As such, this change is required to construct vectorization pipelines for tensor.pad targeting scalable vectors. To test the new Op and the corresponding patterns, I added "vectorization-pad-patterns.mlir" - most tests have been extracted from "vectorization-with-patterns.mlir".	2024-10-25 10:39:26 -07:00
Max191	f1595ecfdc	[mlir] Fix bug in UnPackOp tiling implementation causing infinite loop (#113571 ) This fixes a bug in the tiling implementation of tensor.unpack that was causing an infinite loop when certain unpack ops get tiled and fused as a producer. The tiled implementation of tensor.unpack sometimes needs to create an additional tensor.extract_slice on the result of the tiled unpack op, but this slice was getting added to the `generatedSlices` of the tiling result. The `generatedSlices` are used to find the next producers to fuse, so it caused an infinite loop of fusing the same unpack op after it was already in the loop. This fixes the bug by adding the slice of the source instead of the result. Signed-off-by: Max Dawkins <max.dawkins@gmail.com>	2024-10-24 21:32:45 -04:00
Ian Wood	455f71d285	[mlir] Convert `expand_shape` to more static form (#112265 ) Add pattern that converts a `tensor.expand_shape` op to a more static form. This matches the pattern: `tensor.cast` -> `tensor.expand_shape` if it has a foldable `tensor.cast` and some constant foldable `output_shape` operands for the `tensor.expand_shape`. This makes the `tensor.expand_shape` more static, as well as allowing the static information to be propagated further down in the program.	2024-10-24 17:04:02 -07:00
Matthias Springer	f18c3e4e73	[mlir][Transforms] Dialect Conversion: Simplify materialization fn result type (#113031 ) This commit simplifies the result type of materialization functions. Previously: `std::optional<Value>` Now: `Value` The previous implementation allowed 3 possible return values: - Non-null value: The materialization function produced a valid materialization. - `std::nullopt`: The materialization function failed, but another materialization can be attempted. - `Value()`: The materialization failed and so should the dialect conversion. (Previously: Dialect conversion can roll back.) This commit removes the last variant. It is not particularly useful because the dialect conversion will fail anyway if all other materialization functions produced `std::nullopt`. Furthermore, in contrast to type conversions, at least one materialization callback is expected to succeed. In case of a failing type conversion, the current dialect conversion can roll back and try a different pattern. This also used to be the case for materializations, but that functionality was removed with #107109: failed materializations can no longer trigger a rollback. (They can just make the entire dialect conversion fail without rollback.) With this in mind, it is even less useful to have an additional error state for materialization functions. This commit is in preparation of merging the 1:1 and 1:N type converters. Target materializations will have to return multiple values instead of a single one. With this commit, we can keep the API simple: `SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`. Note for LLVM integration: All 1:1 materializations should return `Value` instead of `std::optional<Value>`. Instead of `std::nullopt` return `Value()`.	2024-10-23 07:29:17 -07:00
Georgios Pinitas	8ad8db973e	Revert "[TOSA] bug fix infer shape for slice" (#113413 ) Reverts llvm/llvm-project#108306	2024-10-23 04:37:21 +01:00
Tai Ly	3b9526b231	[TOSA] bug fix infer shape for slice (#108306 ) This fixes the infer output shape of TOSA slice op for start/size values that are out-of-bound or -1. Added tests to check: - size = -1 - size is out of bound - start is out of bound Signed-off-by: Tai Ly <tai.ly@arm.com>	2024-10-23 04:25:41 +01:00
Andrzej Warzyński	2a25200828	[mlir][tensor] Restrict the verifier for tensor.pack/tensor.unpack (#113108 ) Restricts the verifier for tensor.pack and tensor.unpack Ops so that the following is no longer allowed: ```mlir %c8 = arith.constant 8 : index %0 = tensor.pack %input inner_dims_pos = [0, 1] inner_tiles = [8, %c8] into %output : tensor<?x?xf32> -> tensor<?x?x8x8xf32> ``` Specifically, in line with other Tensor Ops, require: * a dynamic dimension for each (dynamic) SSA value, * a static dimension for each static size (attribute). In the example above, a static dimension (8) is mixed with a dynamic size (%c8). Note that this is mostly deleting existing code - that's because this change simplifies the logic in verifier. For more context: * https://discourse.llvm.org/t/tensor-ops-with-dynamic-sizes-which-behaviour-is-more-correct	2024-10-22 20:11:05 -07:00
Longsheng Mou	519eef3bdc	[mlir][tosa] Add a verifier for `tosa.mul` (#113320 ) This PR adds a verifier check for tosa.mul, requiring that the shift be 0 for float types. Fixes #112716.	2024-10-22 22:34:04 +01:00
weiwei chen	7191ced3b6	[MLIR] Add folding constants canonicalization for mlir::index::AddOp. (#111084 ) - [x] Add a simple canonicalization for `mlir::index::AddOp`.	2024-10-22 12:04:26 -07:00
Kunwar Grover	1004865f1c	[mlir][Vector] Support 0-d vectors natively in TransferOpReduceRank (#112907 ) Since `ddf2d62c7d` , 0-d vectors are supported in VectorType. This patch removes 0-d vector handling with scalars for the TransferOpReduceRank pattern. This pattern specifically introduces tensor.extract_slice during vectorization, causing vectorization to not fold transfer_read/transfer_write slices properly. The changes in vectorization test files reflect this. There are other places where lowering patterns are still side-stepping from handling 0-d vectors properly, by turning them into scalars, but this patch only focuses on the vector.transfer_x patterns.	2024-10-22 15:50:16 +01:00
Andrzej Warzyński	91c11574e8	Revert "[MLIR] Make `OneShotModuleBufferize` use `OpInterface` (#110322 )" (#113124 ) This reverts commit `2026501cf1`. Failing bot: * https://lab.llvm.org/staging/#/builders/125/builds/389	2024-10-22 13:28:44 +01:00
Longsheng Mou	2ce655cf1b	[mlir][func] Fix multiple bugs in `DuplicateFunctionElimination` (#109571 ) This PR fixes multiple bugs in `DuplicateFunctionElimination`. - Prevents elimination of function declarations. - Updates all symbol uses to reference unique function representatives. Fixes #93483.	2024-10-22 09:19:12 +08:00
Razvan Lupusoru	ac9ee61857	[acc] Improve LegalizeDataValues pass to handle data constructs (#112990 ) Renames LegalizeData to LegalizeDataValues since this pass fixes up SSA values. LegalizeData suggested that it fixed data mapping. This change also adds support to fix up ssa values for data clause operations. Effectively, compute regions within a data region use the ssa values from data operations also. The ssa values within data regions but not within compute regions are not updated. This change is to support the requirement in the OpenACC spec which notes that a visible data clause is not just one on the current compute construct but on the lexically containing data construct or visible declare directive.	2024-10-21 09:49:58 -07:00
Kazu Hirata	af6e1881e0	[mlir] Avoid repeated map lookups (NFC) (#113122 )	2024-10-21 06:52:24 -07:00
Kazu Hirata	2077fb80ff	[mlir] Avoid repeated map lookups (NFC) (#113074 )	2024-10-20 10:42:28 -07:00
Frank Schlimbach	d5746d73ce	eliminating g++ warnings (#105520 ) Eliminating g++ warnings. Mostly declaring "[[maybe_unused]]", adding return statements where missing and fixing casts. @rengolin --------- Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech> Co-authored-by: Renato Golin <rengolin@systemcall.eu>	2024-10-18 21:20:47 +01:00
Max191	2bff9d9ffe	[mlir] Don't hoist transfers from potentially zero trip loops (#112752 ) The hoistRedundantVectorTransfers function does no verification of loop bounds when hoisting vector transfers. This is not safe in general, since it is possible that the loop will have zero trip count. This PR uses ValueBounds to verify that the lower bound is less than the upper bound of the loop before hoisting. Trip count verification is currently behind an option `verifyNonZeroTrip`, which is false by default. Zero trip count loops can arise in GPU code generation, where a loop bound can be dependent on a thread id. If not all threads execute the loop body, then hoisting out of the loop can cause these threads to execute the transfers when they are not supposed to. --------- Signed-off-by: Max Dawkins <max.dawkins@gmail.com>	2024-10-18 16:11:21 -04:00
Max191	98e838a890	[mlir] Do not bufferize parallel_insert_slice dest to read for full slices (#112761 ) In the insert_slice bufferization interface implementation, the destination tensor is not considered read if the full tensor is overwritten by the slice. This PR adds the same check for tensor.parallel_insert_slice. Adds two new StaticValueUtils: - `isAllConstantIntValue` checks if an array of `OpFoldResult` are all equal to a passed `int64_t` value. - `areConstantIntValues` checks if an array of `OpFoldResult` are all equal to a passed array of `int64_t` values. fixes https://github.com/llvm/llvm-project/issues/112435 --------- Signed-off-by: Max Dawkins <max.dawkins@gmail.com>	2024-10-18 16:02:03 -04:00
Max191	1ae24460d2	[mlir] Add forall canonicalization to replace constant induction vars (#112764 ) Adds a canonicalization pattern for scf.forall that replaces constant induction variables with a constant index. There is a similar canonicalization that completely removes constant induction variables from the loop, but that pattern does not apply on foralls with mappings, so this one is necessary for those cases. --------- Signed-off-by: Max Dawkins <max.dawkins@gmail.com>	2024-10-18 15:21:01 -04:00
Andrzej Warzyński	0a3347dc63	[mlir][linalg] Fix idx comparison in the vectorizer (#112900 ) Fixes loop comparison condition in the vectorizer. As that logic is used specifically for vectorising `tensor.extract`, I also added a test that violates the assumptions made inside `getTrailingNonUnitLoopDimIdx`, namely that Linalg loops are non-empty. Vectorizer pre-conditions will capture that much earlier making sure that `getTrailingNonUnitLoopDimIdx` is only run when all the assumptions are actually met. Thank you for pointing this out, @pfusik !	2024-10-18 15:27:43 +01:00
Vinayak Dev	2f15d7e43e	[mlir][tensor] Fix off-by-one error in ReshapeOpsUtils (#112774 ) This patch fixes an off-by-one error in `mlir::getReassociationIndicesForCollapse()` that occurs when the last two dims of the source tensor satisfy the while loop. This would cause an assertion failure due to out-of-bounds-access, which is now fixed.	2024-10-18 14:02:30 +05:30
Andrzej Warzyński	f7f51f2afb	[mlir][vector] Clarify the semantics of masking maps (nfc) (#111383 ) We use the term "masking map" throughout the Linalg vectorization logic, but we don't really define what it is and how it differs from Linalg indexing maps. This PR clarifies the differences, makes sure that the new terminology is used consistently and improves code re-use.	2024-10-18 08:58:58 +01:00
Prashant Kumar	c1047ba836	[MLIR] Enable pattern only for scf.forall op (#110230 ) The init args shape might change in the loop body and hence the pattern doesn't hold true.	2024-10-17 18:32:03 +05:30
Sergio Afonso	4091bc61e3	[MLIR][OpenMP] Split region-associated op verification (#112355 ) This patch moves the part of operation verifiers dependent on the contents of their regions to the corresponding `verifyRegions` method. This ensures these are only triggered after the operations in the region have themselves already been verified in advance, avoiding checks based on invalid nested operations. The `LoopWrapperInterface` is also updated so that its verifier runs after operations in the region of ops with this interface have already been verified.	2024-10-17 10:46:38 +01:00
Ivan Butygin	6902b39b6f	[mlir] UnsignedWhenEquivalent: use greedy rewriter instead of dialect conversion (#112454 ) `UnsignedWhenEquivalent` doesn't really need any dialect conversion features and switching it to normal patterns makes it more composable with other patterns-based transformations (and probably faster).	2024-10-17 12:23:11 +03:00
Longsheng Mou	f5aee1f18b	[mlir][memref] Fix type conversion in emulate-wide-int and emulate-narrow-type (#112214 ) This PR follows with #112104, using `nullptr` to indicate that type conversion failed and no fallback conversion should be attempted.	2024-10-17 09:08:24 +08:00
Alexander Pivovarov	a24c468782	[MLIR] Fix assert expressions (#112474 ) I noticed that several assertions in the MLIR codebase have issues with operator precedence. The issue with operator precedence in these assertions is due to the way logical operators are evaluated. The `&&` operator has higher precedence than the `||` operator, which means the assertion is currently evaluating incorrectly, like this: ``` assert((resType.getNumDynamicDims() == dynOutDims.size()) || (dynOutDims.empty() && "Either none or all output dynamic dims must be specified!")); ``` We should add parentheses around the entire expression involving `dynOutDims.empty()` to ensure that the logical conditions are grouped correctly. Here’s the corrected version: ``` assert(((resType.getNumDynamicDims() == dynOutDims.size()) || dynOutDims.empty()) && "Either none or all output dynamic dims must be specified!"); ```	2024-10-16 15:22:29 -07:00
Kazu Hirata	0a20ab908c	[mlir] Avoid repeated hash lookups (NFC) (#112472 )	2024-10-16 06:40:48 -07:00

8820 Commits