clang-p2996

Author	SHA1	Message	Date
Prashant Kumar	5b702be1e8	[mlir][math] Convert math.fpowi to math.powf in case of non constant (#87472 ) Convert math.fpowi to math.powf by converting dtype of power operand to floating point.	2024-04-03 22:19:26 +05:30
Prashant Kumar	10a57f3aff	[mlir][math] Expand powfI operation for constant power operand. (#87081 ) -- Convert `math.fpowi` to a series of `arith.mulf` operations. -- If the power is negative, we divide 1 by the result.	2024-04-01 13:18:27 +05:30
srcarroll	d39ac3a8e0	[mlir][math] Reland `58ef9bec07` (#85436 ) The previous implementation decomposes `tanh(x)` into `(exp(2x) - 1)/(exp(2x)+1), x < 0` `(1 - exp(-2x))/(1 + exp(-2x)), x >= 0` This is fine as it avoids overflow with the exponential, but the whole decomposition is computed for both cases unconditionally, then the result is chosen based off the sign of the input. This results in doing two expensive `exp` computations. The proposed change avoids doing the whole computation twice by exploiting the reflection symmetry `tanh(-x) = -tanh(x)`. We can "normalize" the input to be positive by setting `y = sign(x) * x`, where the sign of `x` is computed as `sign(x) = (float)(x < 0) * (-2) + 1`. Then compute `z = tanh(y)` with the decomposition above for `x >= 0` and "denormalize" the result `z * sign(x)` to retain the sign. The reason it is done this way is that it is very amenable to vectorization. This method trades the duplicate decomposition computations (which take 5 instructions including an extra expensive `exp` and `div`) for 4 cheap instructions to compute the sign value 1. `arith.cmpf` (which is a pre-existing instruction in the previous impl) 2. `arith.sitofp` 3. `arith.mulf` 4. `arith.addf` and 1 more instruction to get the right sign in the result 5. `arith.mulf`. Moreover, numerically, this implementation will yield the exact same results as the previous implementation. As part of the relanding, a casting issue from the original commit has been fixed, i.e. casting bool to float with `uitofp`. Additionally a correctness test with `mlir-cpu-runner` has been added.	2024-03-17 11:23:30 -05:00
Benjamin Maxwell	e74bcecd36	[mlir][math] Propagate scalability in polynomial approximation (#84949 ) This simply updates the rewrites to propagate the scalable flags (which as they do not alter the vector shape, is pretty simple). The added tests are simply scalable versions of the existing vector tests.	2024-03-15 20:08:52 +00:00
srcarroll	f75d164eea	Revert "[mlir][math] Implement alternative decomposition for tanh (#85025)" (#85429 ) This reverts commit `58ef9bec07`. There is a bool to float casting issue that needs to be sorted out to make sure this is target independent.	2024-03-15 14:39:57 -05:00
srcarroll	58ef9bec07	[mlir][math] Implement alternative decomposition for tanh (#85025 ) The previous implementation decomposes `tanh(x)` into `(exp(2x) - 1)/(exp(2x)+1), x < 0` `(1 - exp(-2x))/(1 + exp(-2x)), x >= 0` This is fine as it avoids overflow with the exponential, but the whole decomposition is computed for both cases unconditionally, then the result is chosen based off the sign of the input. This results in doing two expensive `exp` computations. The proposed change avoids doing the whole computation twice by exploiting the reflection symmetry `tanh(-x) = -tanh(x)`. We can "normalize" the input to be positive by setting `y = sign(x) * x`, where the sign of `x` is computed as `sign(x) = (float)(x < 0) * (-2) + 1`. Then compute `z = tanh(y)` with the decomposition above for `x >= 0` and "denormalize" the result `z * sign(x)` to retain the sign. The reason it is done this way is that it is very amenable to vectorization. This method trades the duplicate decomposition computations (which take 5 instructions including an extra expensive `exp` and `div`) for 4 cheap instructions to compute the sign value 1. `arith.cmpf` (which is a pre-existing instruction in the previous impl) 2. `arith.sitofp` 3. `arith.mulf` 4. `arith.addf` and 1 more instruction to get the right sign in the result 5. `arith.mulf`. Moreover, numerically, this implementation will yield the exact same results as the previous implementation.	2024-03-14 19:18:56 -05:00
Krzysztof Drewniak	05e85e4fc5	[mlir][Math] Add pass to legalize math functions to f32-or-higher (#78361 ) Since most of the operations in the `math` dialect don't have low-precision implementations, add the -math-legalize-to-f32 pass that goes through and brackets low-precision math functions (like `math.sin %0 : f16`) with `arith.extf` and `arith.truncf`. This preserves the original semantics of the math operation but allows lowering to proceed. Versions of this lowering are already implicitly present in some passes, like ConvertGPUToROCDL. However, because those are implicit rewrites, they hide the floating-point extension and truncation, preventing anyone from writing passes that operate on those implicit extf/truncf pairs. Exposing this legalization explicitly is needed to allow lowering 8-bit floats on AMD GPUs, as the implementation of extf and truncf on that platform requires the complex logic found in ArithToAMDGPU, which runs before the GPU to ROCDL lowering.	2024-01-18 09:37:43 -06:00
Matthias Springer	bb6d5c2200	[mlir][Transforms] `GreedyPatternRewriteDriver`: Do not CSE constants during iterations (#75897 ) The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply rewrite patterns to ops. It has special handling for constants: they are CSE'd and sometimes moved to parent regions to allow for additional CSE'ing. This happens in `OperationFolder`. To allow for efficient CSE'ing, `OperationFolder` maintains an internal lookup data structure to find the existing constant ops with the same value for each `IsolatedFromAbove` region: ```c++ /// A mapping between an insertion region and the constants that have been /// created within it. DenseMap<Region *, ConstantMap> foldScopes; ``` Rewrite patterns are allowed to modify operations. In particular, they may move operations (including constants) from one region to another one. Such an IR rewrite can make the above lookup data structure inconsistent. We encountered such a bug in a downstream project. This bug materialized in the form of an op that uses the result of a constant op from a different `IsolatedFromAbove` region (that is not accessible). This commit changes the behavior of the `GreedyPatternRewriteDriver` such that `OperationFolder` is used to CSE constants at the beginning of each iteration (as the worklist is populated), but no longer during an iteration. `OperationFolder` is no longer used after populating the worklist, so we do not have to care about inconsistent state in the `OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver` now performs the op folding by itself instead of calling `OperationFolder::tryToFold`. This change alters the order of constant ops in test cases, but not the region in which they appear. All broken test cases were fixed by turning `CHECK` into `CHECK-DAG`. Alternatives considered: The state of `OperationFolder` could be partially invalidated with every `notifyOperationModified` notification. That is more fragile than the solution in this commit because incorrect rewriter API usage can lead to missing notifications and hard-to-debug `IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in a downstream project, which could be due to incorrect rewriter API usage or due to another conceptual problem that I missed.) Moreover, ops are frequently getting modified during a greedy pattern rewrite, so we would likely keep invalidating large parts of the state of `OperationFolder` over and over. Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant ops are no longer folded during a greedy pattern rewrite. If you rely on folding (and rematerialization) of constant ops during a greedy pattern rewrite, turn the folder into a pattern.	2024-01-05 09:22:18 +01:00
Cullen Rhodes	9816edc9f3	[mlir][vector] add result type to vector.extract assembly format (#66499 ) The vector.extract assembly format currently only contains the source type, for example: %1 = vector.extract %0[1] : vector<3x7x8xf32> it's not immediately obvious if this is the source or result type. This patch improves the assembly format to make this clearer, so the above becomes: %1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>	2023-09-28 11:11:16 +01:00
Ivan Butygin	5dce74817b	[mlir][ub] Add poison support to CommonFolders.h Return poison from foldBinary/unary if argument(s) is poison. Add ub dialect as dependency to affected dialects (arith, math, spirv, shape). Add poison materialization to dialects. Add tests for some ops from each dialect. Not all affected ops are covered as it will involve a huge copypaste. Differential Revision: https://reviews.llvm.org/D159013	2023-09-07 12:30:29 +02:00
Balaji V. Iyer	f66e4bd67a	[mlir][math] Modify math.powf to handle negative bases. Powf expansion currently returns NaN when the base is negative. This is because taking the natural log of a negative number gives NaN. This patch will square the base and halve the exponent, thereby getting around the negative base problem. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D158797	2023-08-25 15:35:05 -07:00
Alexander Shaposhnikov	fe355a44e7	[MLIR][Math] Add support for f64 in the expansion of math.roundeven Add support for f64 in the expansion of math.roundeven. Associated GitHub issue: https://github.com/openxla/iree/issues/13522 This is based on the offline discussion and essentially recommits https://reviews.llvm.org/D158234. Test plan: ninja check-mlir check-all	2023-08-24 21:41:26 +00:00
Alexander Shaposhnikov	d22883e384	Revert "[MLIR][Math] Add support for f16 in the expansion of math.roundeven" This reverts commit `40bf36319e`. The build bot ppc64le-mlir-rhel-test got broken by these changes, see https://lab.llvm.org/buildbot#builders/88/builds/61048 .	2023-08-18 18:20:52 +00:00
Alexander Shaposhnikov	40bf36319e	[MLIR][Math] Add support for f16 in the expansion of math.roundeven Add support for f16 in the expansion of math.roundeven. Associated GitHub issue: https://github.com/openxla/iree/issues/13522 This version addresses the build issues on Windows reported on https://reviews.llvm.org/D157204 Test plan: ninja check-mlir check-all Differential revision: https://reviews.llvm.org/D158234	2023-08-18 17:48:34 +00:00
Alexander Shaposhnikov	f745c91f61	Revert "[MLIR][Math] Add support for f16 in the expansion of math.roundeven" This reverts commit `b96f6cf629` since it has broken some Windows build bots (see https://reviews.llvm.org/D157204). Will recommit a fixed version later.	2023-08-17 23:22:05 +00:00
Alexander Shaposhnikov	b96f6cf629	[MLIR][Math] Add support for f16 in the expansion of math.roundeven Add support for f16 in the expansion of math.roundeven. Associated GitHub issue: https://github.com/openxla/iree/issues/13522 Test plan: ninja check-mlir check-all Differential revision: https://reviews.llvm.org/D157204	2023-08-17 18:40:26 +00:00
Robert Suderman	0bedb667af	[mlir][math] Improved math.atan approximation Used the cephes numerical approximation for `math.atan`. This is a significant accuracy improvement over the previous taylor series approximation. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D153656	2023-06-23 17:25:34 -07:00
Robert Suderman	710dc7282a	[mlir][math] Modified the 'math.exp' lowering for higher precision The existing lowering has lower precision for certain use cases, e.g. tanh. Improved version should demonstrate an overall higher level of precision. Reviewed By: cota, jpienaar Differential Revision: https://reviews.llvm.org/D153592	2023-06-23 12:25:18 -07:00
Ivan Butygin	ee8b8d6b58	[mlir][math] Uplift from arith to math.fma Add pass to uplift from arith mulf + addf ops to math.fma if fastmath flags allow it. Differential Revision: https://reviews.llvm.org/D152633	2023-06-18 17:11:21 +02:00
Ramiro Leal-Cavazos	44baa65589	Revert "Revert "Fix handling of special and large vals in expand pattern for `round`" and "Add pattern that expands `math.roundeven` into `math.round` + arith"" This reverts commit `87cef78fa1`. The issue in the original revert is that a lit test expecting a `-nan` as an output was failing on M2. Since the IEEE 754-2008 standard does not require the sign to be printed when displaying a `nan`, this commit changes the `CHECK` for `-nan` to one that checks the result value bitcasted to an `i32` to ensure that input is being left unchanged. This check should now be independent of platform being used to run test. Reviewed By: jpienaar, mehdi_amini Differential Revision: https://reviews.llvm.org/D148941	2023-04-22 07:15:40 -07:00
Mehdi Amini	87cef78fa1	Revert "Fix handling of special and large vals in expand pattern for `round`" and "Add pattern that expands `math.roundeven` into `math.round` + arith" This reverts commit `8d2bae9abd` and commit `ab2fc9521e`. Tests are broken on Mac M2	2023-04-21 00:16:32 -06:00
Ramiro Leal-Cavazos	8d2bae9abd	Add pattern that expands `math.roundeven` into `math.round` + arith This commit adds a pattern that expands `math.roundeven` into `math.round` + some ops from `arith`. This is needed to be able to run `math.roundeven` in a vectorized manner. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D148285	2023-04-20 12:48:12 -07:00
Ramiro Leal-Cavazos	ab2fc9521e	Fix handling of special and large vals in expand pattern for `round` The current expand pattern for `math.round` does not handle the special values -0.0, +-inf, and +-nan correctly. It also does not properly handle values with magnitude \|x\| >= 2^23. Lastly, the pattern generates invalid IR when the input to `math.round` is a vector. This patch fixes these issues. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D148398	2023-04-20 18:08:19 +00:00
Balaji V. Iyer	2d4e856709	[mlir][math] Expand math.powf to exp, log and multiply Powf functions are pushed directly to libm. This is problematic for situations where libm is not available. This patch will decompose the powf function into the exponent multiplied by the log of the base, and then apply exp to that product. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D148164	2023-04-14 14:04:19 +00:00
Balaji V. Iyer	be9115788c	[mlir][math] Expand math.round to truncate, compare and increment. Round functions are pushed directly to libm. This is problematic for situations where libm is not available. This patch will decompose the roundf function by adding 0.5 to the input when it is positive (subtracting when it is negative), followed by a truncate. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D148026	2023-04-13 18:02:10 +00:00
Balaji V. Iyer	4da96515ea	[mlir][math] Expand math.exp2 to use math.exp. Exp2 functions are pushed directly to libm. This is problematic for situations where libm is not available. This patch will expand the exp2 function to use exp with the input multiplied by ln 2 (the natural log of 2). Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D148064	2023-04-13 16:06:04 +00:00
Balaji V. Iyer	2217888d2c	[mlir][math] Expand math.ceilf to truncate, compares and increments Ceilf functions are pushed directly to libm. This is problematic for situations where libm is not available. This patch will break down a ceilf function into a truncate followed by an increment if the truncated value is smaller than the input value. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D147974	2023-04-11 13:52:45 +00:00
Alex Zinenko	76c3908273	[mlir] rename Math/dependent-dialect.mlir to depends-on-arith.mlir	2023-04-11 12:35:46 +02:00
Alex Zinenko	56a275b999	[mlir] make Math dialect depend on Arith dialect Ops from the Math dialect use fastmath attributes defined in Arith. Therefore Math dialect must declare a dependency on Arith for proper construction and parsing. Reviewed By: tpopp Differential Revision: https://reviews.llvm.org/D147999	2023-04-11 12:34:51 +02:00
Balaji V. Iyer	af9eb1e384	[mlir][math] Expand math.floorf to truncate, compares and increments Floorf functions are pushed directly to libm. This is problematic for situations where libm is not available. This patch will break down a floorf function into a truncate followed by an increment for negative values, if necessary. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D147966	2023-04-10 21:04:27 +00:00
Balaji V. Iyer	a7c2102d98	[mlir][math]Expand Fused math.fmaf to a multiply-add Fused multiply and add are being pushed directly to the libm. This is problematic for situations where libm is not available. This patch will break down a fused multiply and add into a multiply followed by an add. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D147811	2023-04-07 22:14:56 +00:00
Robert Suderman	711c58938f	[mlir][math] Update math arith expansions for vectorization The math arithmetic expansions do not support vectorized types. Updated the lowerings so that they support vectorized types. This includes a different implementation for `math.ctlz` to be a binary search and not have variable termination time. Reviewed By: jpienaar, NatashaKnk Differential Revision: https://reviews.llvm.org/D147289	2023-04-06 18:42:01 +00:00
Robert Suderman	57e1943e8f	[mlir] Add support for non-f32 polynomial approximation Polynomial approximations assume F32 values. We can convert all non-f32 cases to operate on f32s with intermediate casts. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D146677	2023-03-27 21:57:57 +00:00
Robert Suderman	6b53881048	[mlir][math] Add math.cbrt polynomial approximation Cbrt can be approximated with some relatively simple polynomial operators. This includes a lit test validating the implementation and some run tests that validate numerical correctness. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D145019	2023-03-06 13:29:49 -08:00
Robert Suderman	740e2e908c	[mlir][math] Math expansion for math.tan We can implement a polynomial approximation of math.tan by decomposing to `math.sin` and `math.cos`. While it is not technically a polynomial approximation, it should be the most straightforward approximation. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D144980	2023-03-01 01:13:54 +00:00
Matthias Springer	ed9194be6d	[mlir] GreedyPatternRewriter: Add ancestors to worklist When adding an op to the worklist, also add its ancestors to the worklist. This allows for RewritePatterns to match an op `a` based on what is inside of the body of `a`. This change fixes a problem that became apparent with `vector.warp_execute_on_lane_0`, but could probably be triggered with similar patterns. The pattern extracts an op `b` with `eligible = true` from the body of an op `a`: ``` test.a { %0 = test.b() {eligible = true} yield %0 } ``` Afterwards: ``` %0 = test.b() {eligible = true} test.a { yield %0 } ``` The pattern is an `OpRewritePattern<OpA>`. For some reason, `test.a` is not on the GreedyPatternRewriter's worklist. E.g., because no pattern could be applied and it was removed. Now, another pattern updates `test.b`, so that `eligible` is changed from `true` to `false`. The `OpRewritePattern<OpA>` could now be applied, but (without this revision) `test.a` is still not on the worklist. Note: In the above example, an `OpRewritePattern<OpB>` could have been used instead of an `OpRewritePattern<OpA>`. With such a design, we can run into the same problem (when the `eligible` attr is on `test.a` and `test.b` is removed from the worklist because no patterns could be applied). Note: This change uncovered an unrelated bug in TestSCFUtils.cpp that was triggered due to a change in the order in which ops are processed. A TODO is added to the broken code and test cases are adapted so that the bug is no longer triggered. Differential Revision: https://reviews.llvm.org/D140304	2023-01-13 10:51:28 +01:00
Matthias Springer	e7790fbed3	[mlir] Add `test-convergence` option to Canonicalizer tests This new option is set to `false` by default. It should be set only in Canonicalizer tests to detect faulty canonicalization patterns, i.e., patterns that prevent the canonicalizer from converging. The canonicalizer should always converge on such small unit tests that we have in `canonicalize.mlir`. Two faulty canonicalization patterns were detected and fixed with this change. Differential Revision: https://reviews.llvm.org/D140873	2023-01-04 12:02:21 +01:00
Johannes Reifferscheid	998a3a3894	Add a math.cbrt instruction and lowering to libm. There's currently no way to get accurate cube roots in the math dialect. powf(x, 1/3.0) is too inaccurate in some cases. Reviewed By: akuegel Differential Revision: https://reviews.llvm.org/D140842	2023-01-03 08:44:12 +01:00
Slava Zakharin	07a4c4d601	[mlir][math] Added arith::FastMathAttr support for math::FPowI. Differential Revision: https://reviews.llvm.org/D139805	2022-12-13 20:47:20 -08:00
Slava Zakharin	095ce655ec	[mlir][math] Simplify pow(x, 0.75) into sqrt(sqrt(x)) * sqrt(x). Trivial simplification for CPU2017/503.bwaves resulting in 3.89% speed-up on icelake. Differential Revision: https://reviews.llvm.org/D137351	2022-11-04 10:48:19 -07:00
Slava Zakharin	589764a382	[mlir][math] Initial support for fastmath flag attributes for Math dialect. Added arith::FastMathAttr and ArithFastMathInterface support for Math dialect floating point operations. This change-set creates ArithCommon conversion utils that currently provide classes and methods to aid with arith::FastMathAttr conversion into LLVM::FastmathFlags. These utils are used in ArithToLLVM and MathToLLVM convertors, but may eventually be used by other converters that need to convert fast math attributes. Since Math dialect operations use arith::FastMathAttr, MathOps.td now has to include enum and attributes definitions from Arith dialect. To minimize the amount of TD code included from Arith dialect, I moved FastMathAttr definition into ArithBase.td. Differential Revision: https://reviews.llvm.org/D136312	2022-11-04 10:41:56 -07:00
jacquesguan	a4ace22c05	[mlir][Math] Change regex to match fp value on different target. Link: https://github.com/llvm/llvm-project/issues/58048 Reviewed By: ftynse, Mogball Differential Revision: https://reviews.llvm.org/D134850	2022-10-12 15:08:21 +08:00
jacquesguan	6eebdc46e4	[mlir][Math] Add constant folder for ErfOp. This patch adds constant folder for ErfOp by using erf/erff of libm. Reviewed By: ftynse, Mogball Differential Revision: https://reviews.llvm.org/D134017	2022-09-19 10:55:16 +08:00
jacquesguan	71e52a125c	[mlir][Math] Add constant folder for SinOp. This patch adds constant folder for SinOp by using sin/sinf of libm. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D133915	2022-09-16 14:30:05 +08:00
Jeff Niu	3108249dea	[MLIR][math] Use approximate matches for folded ops LibM implementations differ, so the folders can have different results on different platforms. For instance, the `cos` folder was failing on M1 mac. I chose to match the constant floats to 2(.5) significant digits. Reviewed By: jacquesguan Differential Revision: https://reviews.llvm.org/D133797	2022-09-14 08:39:41 -07:00
jacquesguan	9d0b90e933	[mlir][Math] Add TruncOp. This patch adds TruncOp for Math. It returns the operand rounded to the nearest integer not larger in magnitude than the operand. This patch also adds the corresponding LLVM intrinsic op. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D133342	2022-09-09 10:01:28 +08:00
Kai Sasaki	5bb621056b	[mlir][math] Canonicalization for math.floor op Support constant folding for math.floor op as well as math.ceil. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D133398	2022-09-09 10:21:48 +09:00
jacquesguan	b53bccb18d	[mlir][Math] Add constant folder for RoundOp. This patch uses round/roundf of libm to fold RoundOp of constant. Differential Revision: https://reviews.llvm.org/D133401	2022-09-08 14:51:17 +08:00
jacquesguan	ac66d87c4b	[mlir][Math] Add constant folder for RoundEvenOp. This patch uses roundeven/roundevenf of libm to fold RoundEvenOp of constant. Differential Revision: https://reviews.llvm.org/D133344	2022-09-07 11:13:00 +08:00
jacquesguan	e3434a8627	[mlir][Math] Add constant folder for CosOp. This patch adds constant folder for CosOp which only supports single and double precision floating-point. Differential Revision: https://reviews.llvm.org/D131233	2022-09-07 10:54:08 +08:00
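
The sign-normalized `tanh` decomposition described in commits `58ef9bec07` and `d39ac3a8e0` can be sketched numerically as follows. This is an illustrative Python model rather than the MLIR rewrite itself: the function name `tanh_sign_normalized` is hypothetical, and the sketch assumes the sign is derived from an `x < 0` comparison converted to `1.0` or `0.0` (an unsigned bool-to-float cast).

```python
import math


def tanh_sign_normalized(x: float) -> float:
    """Model of the tanh expansion that evaluates exp only once.

    Hypothetical helper for illustration; it mirrors the steps named in
    the commit message, not the actual pattern code.
    """
    # Comparison result as a float: 1.0 for negative inputs, else 0.0.
    # Assumption: this models an unsigned bool-to-float conversion.
    is_negative = float(x < 0.0)
    # sign is -1.0 for negative inputs and +1.0 otherwise.
    sign = is_negative * -2.0 + 1.0
    # Normalize the input to be non-negative: y = sign(x) * x.
    y = sign * x
    # Single exp call; the argument is <= 0, so the result lies in [0, 1]
    # and cannot overflow.
    e = math.exp(-2.0 * y)
    z = (1.0 - e) / (1.0 + e)
    # Restore the sign of the original input.
    return sign * z


if __name__ == "__main__":
    for v in (-1000.0, -2.5, -0.1, 0.0, 0.1, 2.5, 1000.0):
        print(v, tanh_sign_normalized(v), math.tanh(v))
```

Because the normalized input is never negative, the single `exp` argument is never positive, which is why only the `x >= 0` branch of the older two-branch decomposition is needed.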


107 Commits
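
The constant-power `math.fpowi` expansion from commit `10a57f3aff` (a chain of multiplications, with a reciprocal for negative powers) can be modelled with a short sketch. The function name `fpowi_expand` is hypothetical, and the linear multiply chain is an assumption: the log does not say whether the real pattern multiplies linearly or uses repeated squaring.

```python
def fpowi_expand(base: float, power: int) -> float:
    """Model of expanding base**power, for a known integer power, into multiplies.

    Hypothetical helper for illustration only.
    """
    result = 1.0
    # One multiply per unit of |power|, standing in for a series of
    # arith.mulf operations.
    for _ in range(abs(power)):
        result *= base
    # A negative power takes the reciprocal of the accumulated product.
    if power < 0:
        result = 1.0 / result
    return result


if __name__ == "__main__":
    for b, p in ((2.0, 3), (2.0, -2), (5.0, 0), (-3.0, 3)):
        print(b, p, fpowi_expand(b, p))
```

A power of zero produces no multiplies at all, so the sketch returns `1.0` for any base in that case.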