clang-p2996

Author	SHA1	Message	Date
Ivan Butygin	f54cdc5d6e	[mlir] IntegerRangeAnalysis: add support for vector type (#112292 ) Treat the integer range of a vector type as the union of the ranges of its individual elements. With these semantics, most arith ops on vectors work out of the box; the only special handling needed is for constants and vector element manipulation ops. The end goal of these changes is to be able to optimize vectorized index calculations.	2024-11-01 23:58:16 +03:00
Jakub Kuderski	0f8a6b7d03	[mlir] Add fast walk-based pattern rewrite driver (#113825 ) This is intended as a fast pattern rewrite driver for the cases when a simple walk gets the job done but we would still want to implement it in terms of rewrite patterns (that can be used with the greedy pattern rewrite driver downstream). The new driver is inspired by the discussion in https://github.com/llvm/llvm-project/pull/112454 and the LLVM Dev presentation from @matthias-springer earlier this week. This driver comes with some limitations: * It does not repeat until a fixpoint or revisit ops modified in place or newly created ops. In general, it only walks forward (in the post-order). * `matchAndRewrite` can only erase the matched op or its descendants. This is verified under expensive checks. * It does not perform folding / DCE. We could probably relax some of these in the future without sacrificing too much performance.	2024-10-31 11:10:09 -04:00
Ivan Butygin	6902b39b6f	[mlir] UnsignedWhenEquivalent: use greedy rewriter instead of dialect conversion (#112454 ) `UnsignedWhenEquivalent` doesn't really need any dialect conversion features, and switching it to normal patterns makes it more composable with other pattern-based transformations (and probably faster).	2024-10-17 12:23:11 +03:00
Longsheng Mou	f5aee1f18b	[mlir][memref] Fix type conversion in emulate-wide-int and emulate-narrow-type (#112214 ) This PR follows up on #112104, using `nullptr` to indicate that type conversion failed and no fallback conversion should be attempted.	2024-10-17 09:08:24 +08:00
Jakub Kuderski	935810c4de	[mlir][arith] Fix type conversion in emulate-wide-int (#112104 ) Use `nullptr` to indicate that type conversion failed and no fallback conversion should be attempted. Fixes: https://github.com/llvm/llvm-project/issues/108163	2024-10-12 15:06:45 -04:00
donald chen	4b3f251bad	[mlir] [dataflow] unify semantics of program point (#110344 ) The concept of a 'program point' in the original data flow framework is ambiguous. It can refer to either an operation or a block itself. This representation has different interpretations in forward and backward data-flow analysis. In forward data-flow analysis, the program point of an operation represents the state after the operation, while in backward data-flow analysis, it represents the state before the operation. When using forward or backward data-flow analysis, it is crucial to carefully handle this distinction to ensure correctness. This patch refactors the definition of program point, unifying the interpretation of program points in both forward and backward data-flow analysis. How to integrate this patch? For dense forward data-flow analysis and other analyses (except dense backward data-flow analysis), the program point corresponding to the original operation can be obtained by `getProgramPointAfter(op)`, and the program point corresponding to the original block can be obtained by `getProgramPointBefore(block)`. For dense backward data-flow analysis, the program point corresponding to the original operation can be obtained by `getProgramPointBefore(op)`, and the program point corresponding to the original block can be obtained by `getProgramPointAfter(block)`. NOTE: If you need to get the lattice of other data-flow analyses in dense backward data-flow analysis, you should still use the dense forward data-flow approach. For example, to get the Executable state of a block in dense backward data-flow analysis and add the dependency of the current operation, you should write: ``getOrCreateFor<Executable>(getProgramPointBefore(op), getProgramPointBefore(block))`` In the case above, we use getProgramPointBefore(op) because the analysis we rely on is dense backward data-flow, and we use getProgramPointBefore(block) because the lattice we query is the result of a non-dense backward data-flow computation. Related discussion: https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8 Corresponding PSA: https://discourse.llvm.org/t/psa-program-point-semantics-change/81479	2024-10-11 21:59:05 +08:00
BARRET	1666d13078	[CMake]: Remove unnecessary dependencies on LLVM/MLIR (#111255 ) Previous https://github.com/llvm/llvm-project/pull/110362 (reverted) caused breakage. Here is the PR with fix. My build cmdline: ``` cmake ../llvm \ -G Ninja \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=install \ -DCMAKE_C_COMPILER=gcc-9 \ -DCMAKE_CXX_COMPILER=g++-9 \ -DCMAKE_CUDA_COMPILER=$(which nvcc) \ -DLLVM_ENABLE_LLD=OFF \ -DLLVM_ENABLE_ASSERTIONS=ON \ -DLLVM_BUILD_EXAMPLES=ON \ -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \ -DLLVM_CCACHE_BUILD=ON \ -DMLIR_ENABLE_BINDINGS_PYTHON=ON \ -DBUILD_SHARED_LIBS=ON \ -DLLVM_ENABLE_PROJECTS='llvm;mlir' ```	2024-10-07 15:52:43 +02:00
Matthias Springer	206fad0e21	[mlir][NFC] Mark type converter in `populate...` functions as `const` (#111250 ) This commit marks the type converter in `populate...` functions as `const`. This is useful for debugging. Patterns already take a `const` type converter. However, some `populate...` functions do not only add new patterns, but also add additional type conversion rules. That makes it difficult to find the place where a type conversion was added in the code base. With this change, all `populate...` functions that only populate patterns now have a `const` type converter. Programmers can then conclude from the function signature that these functions do not register any new type conversion rules. Also some minor cleanups around the 1:N dialect conversion infrastructure, which did not always pass the type converter as a `const` object internally.	2024-10-05 21:32:40 +02:00
Nikhil Kalra	fef3566a25	[mlir] Pass Options ownership modifications (#110582 ) This change makes two (related) changes: First, it updates the tablegen option for `ListOption` to emit a `SmallVector` instead of an `ArrayRef`. This brings `ListOption` more in line with the traditional `Option`, where values are typically provided using types that have storage. After this change, all options should be fully owned by a Pass' `Options` object after it has been fully constructed, unless the underlying type of the `Option` explicitly indicates otherwise. Second, it updates the generated constructors for Passes to consume options by value instead of reference, and prefers moving options into the pass itself. This should be more efficient for non-trivial options objects, where the previous interface forced a copy to be materialized. Now, in the worst case the API materializes a copy (no worse than before); in the best case, all options objects are moved into place. Ideally, we could update the Pass constructor to take an r-value reference to the Options object instead, but this approach will require numerous changes to existing passes and their factory functions. --------- Authored-by: Nikhil Kalra <nkalra@apple.com>	2024-10-01 09:48:51 -07:00
Mehdi Amini	8b47711e84	Revert "CMake: Remove unnecessary dependencies on LLVM/MLIR" (#110594 ) Reverts llvm/llvm-project#110362 Multiple bots are broken.	2024-10-01 00:44:21 +02:00
BARRET	4980f2177e	CMake: Remove unnecessary dependencies on LLVM/MLIR (#110362 ) There are some spurious libraries which can be removed. I'm trying to bundle MLIR/LLVM library dependencies for our own libraries. We're utilizing cmake function to recursively collect MLIR/LLVM related dependencies. However, we identified certain library dependencies as redundant and safe for removal.	2024-09-30 23:57:13 +02:00
Daniel Hernandez-Juarez	1fd1f65569	[mlir] Refactor LegalizeToF32 to specify extra supported float types and target type as arguments (#108815 ) Instead of hardcoding that all fp types smaller than 32 bits are unsupported, we provide a way to pass the supported floating point types as well as the target type. fp64 and fp32 are implicitly supported. CC: @krzysz00 @manupak	2024-09-27 10:02:16 -05:00
Sergey Kozub	2c58063435	[MLIR] Add f4E2M1FN type (#108877 ) This PR adds `f4E2M1FN` type to mlir. `f4E2M1FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 4-bit floating point number with bit layout S1E2M1. Unlike IEEE-754 types, there are no infinity or NaN values. ```c f4E2M1FN - Exponent bias: 1 - Maximum stored exponent value: 3 (binary 11) - Maximum unbiased exponent value: 3 - 1 = 2 - Minimum stored exponent value: 1 (binary 01) - Minimum unbiased exponent value: 1 − 1 = 0 - Has Positive and Negative zero - Doesn't have infinity - Doesn't have NaNs Additional details: - Zeros (+/-): S.00.0 - Max normal number: S.11.1 = ±2^(2) x (1 + 0.5) = ±6.0 - Min normal number: S.01.0 = ±2^(0) = ±1.0 - Min subnormal number: S.00.1 = ±2^(0) x 0.5 = ±0.5 ``` Related PRs: - [PR-95392](https://github.com/llvm/llvm-project/pull/95392) [APFloat] Add APFloat support for FP4 data type - [PR-105573](https://github.com/llvm/llvm-project/pull/105573) [MLIR] Add f6E3M2FN type - was used as a template for this PR - [PR-107999](https://github.com/llvm/llvm-project/pull/107999) [MLIR] Add f6E2M3FN type	2024-09-24 08:22:48 +02:00
Sergey Kozub	73d83f20c9	[MLIR] Add f6E2M3FN type (#107999 ) This PR adds `f6E2M3FN` type to mlir. `f6E2M3FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 6-bit floating point number with bit layout S1E2M3. Unlike IEEE-754 types, there are no infinity or NaN values. ```c f6E2M3FN - Exponent bias: 1 - Maximum stored exponent value: 3 (binary 11) - Maximum unbiased exponent value: 3 - 1 = 2 - Minimum stored exponent value: 1 (binary 01) - Minimum unbiased exponent value: 1 − 1 = 0 - Has Positive and Negative zero - Doesn't have infinity - Doesn't have NaNs Additional details: - Zeros (+/-): S.00.000 - Max normal number: S.11.111 = ±2^(2) x (1 + 0.875) = ±7.5 - Min normal number: S.01.000 = ±2^(0) = ±1.0 - Max subnormal number: S.00.111 = ±2^(0) x 0.875 = ±0.875 - Min subnormal number: S.00.001 = ±2^(0) x 0.125 = ±0.125 ``` Related PRs: - [PR-94735](https://github.com/llvm/llvm-project/pull/94735) [APFloat] Add APFloat support for FP6 data types - [PR-105573](https://github.com/llvm/llvm-project/pull/105573) [MLIR] Add f6E3M2FN type - was used as a template for this PR	2024-09-16 21:09:27 +02:00
Sergey Kozub	918222ba43	[MLIR] Add f6E3M2FN type (#105573 ) This PR adds `f6E3M2FN` type to mlir. `f6E3M2FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 6-bit floating point number with bit layout S1E3M2. Unlike IEEE-754 types, there are no infinity or NaN values. ```c f6E3M2FN - Exponent bias: 3 - Maximum stored exponent value: 7 (binary 111) - Maximum unbiased exponent value: 7 - 3 = 4 - Minimum stored exponent value: 1 (binary 001) - Minimum unbiased exponent value: 1 − 3 = −2 - Has Positive and Negative zero - Doesn't have infinity - Doesn't have NaNs Additional details: - Zeros (+/-): S.000.00 - Max normal number: S.111.11 = ±2^(4) x (1 + 0.75) = ±28 - Min normal number: S.001.00 = ±2^(-2) = ±0.25 - Max subnormal number: S.000.11 = ±2^(-2) x 0.75 = ±0.1875 - Min subnormal number: S.000.01 = ±2^(-2) x 0.25 = ±0.0625 ``` Related PRs: - [PR-94735](https://github.com/llvm/llvm-project/pull/94735) [APFloat] Add APFloat support for FP6 data types - [PR-97118](https://github.com/llvm/llvm-project/pull/97118) [MLIR] Add f8E4M3 type - was used as a template for this PR	2024-09-10 10:41:05 +02:00
Kazu Hirata	5262865aac	[mlir] Construct SmallVector with ArrayRef (NFC) (#101896 )	2024-08-04 11:43:05 -07:00
Alexander Pivovarov	eef1d7e377	[MLIR] Add f8E3M4 IEEE 754 type (#101230 ) This PR adds `f8E3M4` type to mlir. `f8E3M4` type follows IEEE 754 convention ```c f8E3M4 (IEEE 754) - Exponent bias: 3 - Maximum stored exponent value: 6 (binary 110) - Maximum unbiased exponent value: 6 - 3 = 3 - Minimum stored exponent value: 1 (binary 001) - Minimum unbiased exponent value: 1 − 3 = −2 - Precision specifies the total number of bits used for the significand (mantissa), including implicit leading integer bit = 4 + 1 = 5 - Follows IEEE 754 conventions for representation of special values - Has Positive and Negative zero - Has Positive and Negative infinity - Has NaNs Additional details: - Max exp (unbiased): 3 - Min exp (unbiased): -2 - Infinities (+/-): S.111.0000 - Zeros (+/-): S.000.0000 - NaNs: S.111.{0,1}⁴ except S.111.0000 - Max normal number: S.110.1111 = +/-2^(6-3) x (1 + 15/16) = +/-2^3 x 31 x 2^(-4) = +/-15.5 - Min normal number: S.001.0000 = +/-2^(1-3) x (1 + 0) = +/-2^(-2) - Max subnormal number: S.000.1111 = +/-2^(-2) x 15/16 = +/-2^(-2) x 15 x 2^(-4) = +/-15 x 2^(-6) - Min subnormal number: S.000.0001 = +/-2^(-2) x 1/16 = +/-2^(-2) x 2^(-4) = +/-2^(-6) ``` Related PRs: - [PR-99698](https://github.com/llvm/llvm-project/pull/99698) [APFloat] Add support for f8E3M4 IEEE 754 type - [PR-97118](https://github.com/llvm/llvm-project/pull/97118) [MLIR] Add f8E4M3 IEEE 754 type	2024-08-02 00:22:11 -07:00
Alexander Pivovarov	019136e30f	[MLIR] Add f8E4M3 IEEE 754 type (#97118 ) This PR adds `f8E4M3` type to mlir. `f8E4M3` type follows IEEE 754 convention ```c f8E4M3 (IEEE 754) - Exponent bias: 7 - Maximum stored exponent value: 14 (binary 1110) - Maximum unbiased exponent value: 14 - 7 = 7 - Minimum stored exponent value: 1 (binary 0001) - Minimum unbiased exponent value: 1 − 7 = −6 - Precision specifies the total number of bits used for the significand (mantissa), including implicit leading integer bit = 3 + 1 = 4 - Follows IEEE 754 conventions for representation of special values - Has Positive and Negative zero - Has Positive and Negative infinity - Has NaNs Additional details: - Max exp (unbiased): 7 - Min exp (unbiased): -6 - Infinities (+/-): S.1111.000 - Zeros (+/-): S.0000.000 - NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111} - Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240 - Min normal number: S.0001.000 = +/-2^(-6) - Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7 - Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9) ``` Related PRs: - [PR-97179](https://github.com/llvm/llvm-project/pull/97179) [APFloat] Add support for f8E4M3 IEEE 754 type	2024-07-22 23:20:28 -07:00
Ramkumar Ramachandra	db791b278a	mlir/LogicalResult: move into llvm (#97309 ) This patch is part of a project to move the Presburger library into LLVM.	2024-07-02 10:42:33 +01:00
Ivy Zhang	7042fcc638	[MLIR][Arith][Resubmit] add fastMathAttr on arith::extf and arith::truncf (#95346 ) Add a `fastMathAttr` on `arith::extf` and `arith::truncf`. If these two ops are inserted by some promotion passes (like legalize-to-f32 / emulate-unsupported-floats), they will be labeled as `FastMathFlags::contract`, denoting that they can then be `eliminated by canonicalizer`. The `elimination` can help improve performance, but may introduce some numerical differences.	2024-06-15 07:42:29 +08:00
Ivy Zhang	f941908d77	Revert "[MLIR][Arith] add fastMathAttr on arith::extf and arith::truncf" (#95344 ) Reverts llvm/llvm-project#93443	2024-06-13 11:23:20 +08:00
Ivy Zhang	6784bf7642	[MLIR][Arith] add fastMathAttr on arith::extf and arith::truncf (#93443 ) Add a `fastMathAttr` on `arith::extf` and `arith::truncf`. If these two ops are inserted by some promotion passes (like legalize-to-f32 / emulate-unsupported-floats), they will be labeled as `FastMathFlags::contract`, denoting that they can then be `eliminated by canonicalizer`. The `elimination` can help improve performance, but may introduce some numerical differences.	2024-06-13 09:27:44 +08:00
Krzysztof Drewniak	472291111d	[mlir][Arith] Generalize and improve -int-range-optimizations (#94712 ) When the integer range analysis was first developed, a pass that did integer range-based constant folding was developed and used as a test pass. There was an intent to add such a folding to SCCP, but that hasn't happened. Meanwhile, -int-range-optimizations was added to the arith dialect's transformations. The cmpi simplification in that pass is a strict subset of the constant folding that lived in -test-int-range-inference. This commit moves the former test pass into -int-range-optimizations, subsuming its previous contents. It also adds an optimization from rocMLIR where `rem{s,u}i` operations that are noops are replaced by their left operands.	2024-06-10 09:56:33 -05:00
Kunwar Grover	debdbeda15	[mlir] Remove dialect specific bufferization passes (Reland) (#93535 ) These passes have been deprecated for a long time and replaced by one-shot bufferization. These passes are also unsafe because they do not check for read-after-write conflicts. Relands https://github.com/llvm/llvm-project/pull/93488 which failed on buildbot. Fixes the failure by updating integration tests to use one-shot-bufferize instead.	2024-05-28 20:04:27 +01:00
Kunwar Grover	39848d0a98	Revert "[mlir] Remove dialect specific bufferization passes" (#93528 ) Reverts llvm/llvm-project#93488 Buildbot failure: https://lab.llvm.org/buildbot/#/builders/220/builds/39911	2024-05-28 11:21:34 +01:00
Kunwar Grover	2fc5106437	[mlir] Remove dialect specific bufferization passes (#93488 ) These passes have been deprecated for a long time and replaced by one-shot bufferization. These passes are also unsafe because they do not check for read-after-write conflicts.	2024-05-28 11:12:58 +01:00
Felix Schneider	78b3a00418	[mlir] `int-range-optimizations`: Fix referencing of deleted ops (#91807 ) The pass runs a `DataFlowSolver` and collects state information on the input IR. Then, the rewrite driver and folding are applied. During pattern application and folding it can happen that an Op from the input IR is deleted and a new Op is created at the same address. When the newly created Op is looked up in the `DataFlowSolver` state memory, the state of the original Op is returned. This patch adds a method to `DataFlowSolver` which removes all state related to a `ProgramPoint`. It also adds a listener to the Pass which clears the state information of deleted Ops from the `DataFlowSolver`. Fixes https://github.com/llvm/llvm-project/issues/81228	2024-05-12 18:11:42 +02:00
Matthias Gehre	30badf96bb	[MLIR][Arith] expand-ops: Support mini/maxi (#90575 ) Expand `arith.minsi`, `arith.minui`, `arith.maxsi`, `arith.maxui` into `arith.cmpi` and `arith.select`. --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>	2024-04-30 19:02:32 +02:00
Christian Sigg	a5757c5b65	Switch member calls to `isa/dyn_cast/cast/...` to free function calls. (#89356 ) This change cleans up call sites. Next step is to mark the member functions deprecated. See https://mlir.llvm.org/deprecation and https://discourse.llvm.org/t/preferred-casting-style-going-forward.	2024-04-19 15:58:27 +02:00
Matthias Springer	40dd3aa91d	[mlir][Interfaces] `Variable` abstraction for `ValueBoundsOpInterface` (#87980 ) This commit generalizes and cleans up the `ValueBoundsConstraintSet` API. The API used to provide function overloads for comparing/computing bounds of: - index-typed SSA value - dimension of shaped value - affine map + operands This commit removes all overloads. There is now a single entry point for each `compare` variant and each `computeBound` variant. These functions now take a `Variable`, which is internally represented as an affine map and map operands. This commit also adds support for computing bounds for an affine map + operands. There was previously no public API for that.	2024-04-16 10:59:02 +02:00
Matthias Springer	5e4a44380e	[mlir][Interfaces][NFC] `ValueBoundsConstraintSet`: Pass stop condition in the constructor (#86099 ) This commit changes the API of `ValueBoundsConstraintSet`: the stop condition is now passed to the constructor instead of `processWorklist`. That makes it easier to add items to the worklist multiple times and process them in a consistent manner. The current `ValueBoundsConstraintSet` is passed as a reference to the stop function, so that the stop function can be defined before the `ValueBoundsConstraintSet` is constructed. This change is in preparation of adding support for branches.	2024-04-04 17:05:47 +09:00
Fehr Mathieu	e03f16f9fd	[mlir] [arith] Remove buggy illegal operation in --arith-unsigned-when-equivalent (#87298 ) `CeilDivUIOp` seemed to have been added by mistake to the list of dynamically illegal operations in `arith-unsigned-when-equivalent`. The only illegal operations should be the signed operations that can be converted to their unsigned counterpart.	2024-04-02 14:47:27 +01:00
Victor Perez	8827ff92b9	[MLIR][Arith] Add rounding mode attribute to `truncf` (#86152 ) Add rounding mode attribute to `arith`. This attribute can be used in different FP `arith` operations to control rounding mode. Rounding modes correspond to IEEE 754-specified rounding modes. Use in `arith.truncf` folding. As this is not supported in dialects other than LLVM, conversion should fail for now in case this attribute is present. --------- Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-04-01 11:57:14 +02:00
Matthias Springer	a45e58af1b	[mlir][bufferization] Add `BufferViewFlowOpInterface` (#78718 ) This commit adds the `BufferViewFlowOpInterface` to the bufferization dialect. This interface can be implemented by ops that operate on buffers to indicate that a buffer op result and/or region entry block argument may be the same buffer as a buffer operand (or a view thereof). This interface is queried by the `BufferViewFlowAnalysis`. The new interface has two interface methods: * `populateDependencies`: Implementations use the provided callback to declare dependencies between operands and op results/region entry block arguments. E.g., for `%r = arith.select %c, %m1, %m2 : memref<5xf32>`, the interface implementation should declare two dependencies: %m1 -> %r and %m2 -> %r. * `mayBeTerminalBuffer`: An SSA value is a terminal buffer if the buffer view flow analysis stops at the specified value. E.g., because the value is a newly allocated buffer or because no further information is available about the origin of the buffer. Ops that implement the `RegionBranchOpInterface` or `BranchOpInterface` do not have to implement the `BufferViewFlowOpInterface`. The buffer dependencies can be inferred from those two interfaces. This commit makes the `BufferViewFlowAnalysis` more accurate. For unknown ops, it conservatively used to declare all combinations of operands and op results/region entry block arguments as dependencies (false positives). This is no longer the case. While the analysis is still a "maybe" analysis with false positives (e.g., when analyzing ops such as `arith.select` or `scf.if` where the taken branch is not known at compile time), results and region entry block arguments of unknown ops are now marked as terminal buffers. This commit addresses a TODO in `BufferViewFlowAnalysis.cpp`: ``` // TODO: We should have an op interface instead of a hard-coded list of // interfaces/ops. ``` It is no longer needed to hard-code ops.	2024-03-24 12:48:19 +09:00
long.chen	631e54aa1a	[mlir][arith] fix wrong floordivsi fold (#83248 ) Fixes https://github.com/llvm/llvm-project/issues/83079	2024-03-22 23:52:47 +08:00
Benoit Jacob	9c7cde64e6	Fix the lowering of `arith.truncf : f32 to bf16`. (#83180 ) This lowering was not correctly handling the case where saturation of the mantissa results in an increase of the exponent value. The new code borrows, with credit, the idea from `e1502c0cdb/c10/util/BFloat16.h (L60-L79)` and adds comments to explain the magic trick going on here and why it's correct. Hat tip to its original author, whom I believe to be @Maratyszcza. A testcase was also requiring a tie to be broken upwards in a case where "to nearest-even" required going downward. The fact that it used to pass suggests that there was another bug in the old code.	2024-02-28 13:56:18 -05:00
ian Bearman	067d2779fc	[MLIR] Setting MemorySpace During Bufferization (#78484 ) Collection of changes with the goal of being able to convert `encoding` to `memorySpace` during bufferization - new API for encoder to allow implementation to select destination memory space - update existing bufferization implementations to support the new interface	2024-02-08 16:59:37 +01:00
Krzysztof Drewniak	750e90e440	[mlir][ArithToAMDGPU] Add option for saturating truncation to fp8 (#74153 ) Many machine-learning applications (and most software written at AMD) expect the operation that truncates floats to 8-bit floats to be saturating. That is, they expect `truncf 256.0 : f32 to f8E4M3FNUZ` to yield `240.0`, not `NaN`, and similarly for negative numbers. However, the underlying hardware instruction that can be used for this truncation implements overflow-to-NaN semantics. To enable handling this use case, we add the saturate-fp8-truncf option to ArithToAMDGPU (off by default), which causes the requisite clamping code to be emitted. Said clamping code ensures that Inf and NaN are passed through exactly (and thus truncate to NaN). Per review feedback, this commit refactors createScalarOrSplatConstant() to the Arith dialect utilities and uses it in this code. It also fixes naming of existing patterns and switches from vector.extractelement/insertelement to vector.extract/insert.	2024-01-23 16:52:21 -06:00
Mehdi Amini	e5e08955af	Apply clang-tidy fixes for performance-move-const-arg in IntRangeOptimizations.cpp (NFC)	2024-01-15 20:59:12 -08:00
Han-Chung Wang	b33a131c82	[mlir][arith] Add support for expanding arith.maxnumf/minnumf ops. (#75989 ) The maxnum/minnum semantics can be found at https://llvm.org/docs/LangRef.html#llvm-minnum-intrinsic. The revision also updates function names in lit tests to match op name. Take arith.maxnumf as example: ``` func.func @maxnumf(%lhs: f32, %rhs: f32) -> f32 { %result = arith.maxnumf %lhs, %rhs : f32 return %result : f32 } ``` will be expanded to ``` func.func @maxnumf(%lhs: f32, %rhs: f32) -> f32 { %0 = arith.cmpf ugt, %lhs, %rhs : f32 %1 = arith.select %0, %lhs, %rhs : f32 %2 = arith.cmpf uno, %lhs, %lhs : f32 %3 = arith.select %2, %rhs, %1 : f32 return %3 : f32 } ``` Case 1: Both LHS and RHS are not NaN; LHS > RHS In this case, `%1` is LHS. `%3` and `%1` have the same value, so `%3` is LHS. Case 2: LHS is NaN and RHS is not NaN In this case, `%2` is true, so `%3` is always RHS. Case 3: LHS is not NaN and RHS is NaN In this case, `%0` is true and `%1` is LHS. `%2` is false, so `%3` and `%1` have the same value, which is LHS. Case 4: Both LHS and RHS are NaN: `%1` and RHS are all NaN, so the result is still NaN.	2023-12-20 10:35:12 -08:00
Matthias Springer	32c3decb77	[mlir][vector] Modernize `vector.transpose` op (#72594 ) * Declare arguments/results with `let` statements. * Rename `transp` to `permutation`. * Change type of `transp` from `I64ArrayAttr` to `DenseI64ArrayAttr` (provides direct access to `ArrayRef<int64_t>` instead of `ArrayAttr`).	2023-11-20 11:25:35 +01:00
long.chen	1609f1c2a5	[mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269 ) For details, see the document: https://mlir.llvm.org/deprecation/ Not all changes were made manually; most of them were made through a clang tool I wrote: https://github.com/lipracer/cpp-refactor.	2023-11-14 13:01:19 +08:00
Mehdi Amini	b97aaa72d9	Remove `let construct =` from ArithExpandOpsPass definition (NFC) Note that the `Pass` suffix is added in tablegen, and as a side effect the options are renamed from `ArithExpandOpsOptions` to `ArithExpandOpsPassOptions`.	2023-10-02 15:54:22 -07:00
Mehdi Amini	b1c10dfd72	Fixup on ArithBufferizePass: add the Pass suffix in TableGen to ensure consistency of the generated code	2023-10-02 15:50:41 -07:00
Mehdi Amini	c1c56ae49e	Remove `let constructor =` from ArithBufferizePass and rely on TableGen to generate the glue (NFC)	2023-10-02 15:41:16 -07:00
Martin Erhart	6a651c7f44	Revert "[mlir][bufferization] Don't clone on unknown ownership and verify function boundary ABI (#66626 )" This reverts commit `aa9eb47da2`. It introduced a double free in a test case. Reverting to have some time for fixing this and relanding later.	2023-09-28 09:14:46 +00:00
Martin Erhart	aa9eb47da2	[mlir][bufferization] Don't clone on unknown ownership and verify function boundary ABI (#66626 ) Inserting clones requires a lot of assumptions to hold on the input IR, e.g., all writes to a buffer need to dominate all reads. This is not guaranteed by one-shot bufferization and isn't easy to verify, thus it could quickly lead to incorrect results that are hard to debug. This commit changes the mechanism of how an ownership indicator is materialized when there is not already a unique ownership present. Additionally, we don't create copies of returned memrefs anymore when we don't have ownership. Instead, we insert assert operations to make sure we have ownership at runtime, or otherwise report to the user that correctness could not be guaranteed.	2023-09-28 10:45:35 +02:00
Diego Caballero	98f6289a34	[mlir][Vector] Add support for Value indices to vector.extract/insert `vector.extract/insert` ops only support constant indices. This PR is extending them so that arbitrary values can be used instead. This work is part of the RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops Differential Revision: https://reviews.llvm.org/D155034	2023-09-22 00:39:32 +00:00
Martin Erhart	942ce31985	[mlir][bufferization] BufferDeallocationOpInterface: support custom ownership update logic (#66350 ) Add a method to the BufferDeallocationOpInterface that allows operations to implement the interface and provide custom logic to compute the ownership indicators of values it defines. As a demonstrating example, this new method is implemented by the `arith.select` operation.	2023-09-14 14:34:04 +02:00
Martin Erhart	9782232ec7	Revert "[mlir][bufferization] BufferDeallocationOpInterface: support custom ownership update logic" This reverts commit `89117f1807`. This caused problems in downstream projects. We are reverting to give them more time for integration.	2023-09-13 13:53:47 +00:00
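
Note on the `f4E2M1FN`, `f6E2M3FN` and `f6E3M2FN` commits listed above: the bit layouts and bias values quoted in those messages are enough to decode any encoding by hand. The following is a minimal Python sketch of that arithmetic, assuming only what the commit messages state (sign bit first, then exponent, then mantissa; no Inf or NaN encodings). The function name `decode_fn_float` and its parameters are hypothetical and are not part of MLIR or APFloat.

```python
def decode_fn_float(bits: int, exp_bits: int, man_bits: int, bias: int) -> float:
    """Decode a finite-only mini-float with layout S1 E<exp_bits> M<man_bits>.

    Hypothetical helper for illustration; not an MLIR or APFloat API.
    """
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    man = bits & ((1 << man_bits) - 1)
    frac = man / (1 << man_bits)
    if exp == 0:
        # Subnormal (or zero): no implicit leading 1, exponent fixed at 1 - bias.
        return sign * 2.0 ** (1 - bias) * frac
    # Normal: every non-zero stored exponent is a finite value (no Inf/NaN).
    return sign * 2.0 ** (exp - bias) * (1.0 + frac)


# (exp_bits, man_bits, bias) as quoted in the commit messages.
FORMATS = {
    "f4E2M1FN": (2, 1, 1),
    "f6E2M3FN": (2, 3, 1),
    "f6E3M2FN": (3, 2, 3),
}

if __name__ == "__main__":
    for name, (e, m, bias) in FORMATS.items():
        max_normal = decode_fn_float((1 << (e + m)) - 1, e, m, bias)
        min_subnormal = decode_fn_float(1, e, m, bias)
        print(name, max_normal, min_subnormal)
```

Decoding the all-ones magnitude and the smallest non-zero magnitude reproduces the "Max normal number" and "Min subnormal number" entries given in each of the three commit messages.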


118 Commits