The specialisation will not be valid when ConstantInt gains native
support for vector types.
This is largely a mechanical change but with extra attention paid to constant
folding, InstCombineVectorOps.cpp, LoopFlatten.cpp and Verifier.cpp to
remove the need to call `getIntegerType()`.
Co-authored-by: Nikita Popov <github@npopov.com>
This patch passes `SimplifyQuery` to `computeKnownBits` directly in
`InstSimplify` and `InstCombine`.
As the `DomConditionCache` in #73662 is only used in `InstCombine`, it
is inconvenient to introduce a new argument `DC` to `computeKnownBits`.
When folding urem instructions we can end up not recognizing that
the output will always be 0 due to Value*s being different, despite
generating the same data (in this case, 2 different calls to vscale).
This patch recognizes the (x << N) & (add (x << M), -1) pattern that
instcombine replaces urem with after the two vscale calls have been
reduced to one via CSE, then replaces with 0 when x is a power of 2
and N >= M.
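The arithmetic behind this fold can be checked exhaustively on a small bit width. The following is only a sketch in plain C++, not the InstSimplify code: it models the pattern in wrapping 8-bit arithmetic, and the helper names `maskedShift` and `foldHoldsForAllPowersOfTwo` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Evaluates (x << n) & ((x << m) - 1) in wrapping 8-bit arithmetic.
// Shift amounts must stay below the bit width, as they must in LLVM IR.
inline uint8_t maskedShift(uint8_t x, unsigned n, unsigned m) {
  assert(n < 8 && m < 8 && "shift amount must be less than the bit width");
  uint8_t hi = static_cast<uint8_t>(x << n);
  uint8_t lo = static_cast<uint8_t>(static_cast<uint8_t>(x << m) - 1);
  return static_cast<uint8_t>(hi & lo);
}

// Returns true if the expression is 0 for every 8-bit power of two x and
// every pair of shift amounts with n >= m.
inline bool foldHoldsForAllPowersOfTwo() {
  for (unsigned k = 0; k < 8; ++k) {
    uint8_t x = static_cast<uint8_t>(1u << k);
    for (unsigned m = 0; m < 8; ++m)
      for (unsigned n = m; n < 8; ++n)
        if (maskedShift(x, n, m) != 0)
          return false;
  }
  return true;
}
```

Both preconditions matter: `maskedShift(3, 0, 0)` is nonzero because 3 is not a power of 2, and `maskedShift(1, 0, 1)` is nonzero because N < M.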
There are a number of and folds that are repeated for both
operand orders. Move these into a helper that is invoked with
both orders.
This is conceptually NFC, but may not be entirely so, as the order
of folds may change.
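The shape of such a helper can be sketched in standalone C++. This is a toy model, not the actual InstSimplify helper: `Operand`, `foldAndOneSide` and `foldAnd` are hypothetical names, and the two folds shown (`x & 0` and `x & -1`) are chosen only because they inspect a single operand.

```cpp
#include <cstdint>
#include <optional>

// Toy operand: either a known constant or an opaque value named by an id.
struct Operand {
  std::optional<uint32_t> constant; // set when the operand is a known constant
  int id = 0;                       // identifies opaque (non-constant) values
};

// One-sided folds that only inspect the right-hand operand:
//   x & 0  -> 0
//   x & -1 -> x
inline std::optional<Operand> foldAndOneSide(const Operand &lhs,
                                             const Operand &rhs) {
  if (rhs.constant && *rhs.constant == 0)
    return rhs;
  if (rhs.constant && *rhs.constant == UINT32_MAX)
    return lhs;
  return std::nullopt;
}

// Tries the one-sided helper with both operand orders.
inline std::optional<Operand> foldAnd(const Operand &op0, const Operand &op1) {
  if (auto folded = foldAndOneSide(op0, op1))
    return folded;
  return foldAndOneSide(op1, op0);
}
```

In this arrangement every fold is tried for the first order before any fold is tried for the swapped order, which is one way the relative order of folds can shift compared with hand-duplicated code.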
This code was incorrectly checking that CtxI has the required FMF, but
the context instruction need not always be the intrinsic call.
Check that the intrinsic call has the required FMF.
Fixes PR71548.
We need to check that the Ctx instruction is an FPMathOperator before
checking fast-math flags on it.
Ctx is not always an FPMathOperator, so explicitly check for it.
Fixes #71548.
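The "check the kind before reading the flag" pattern can be illustrated without LLVM headers. The types below are toy stand-ins, not LLVM classes: `Instr`, `FPInstr`, `requiredFlag` and `hasRequiredFlag` are hypothetical, and `dynamic_cast` plays the role that `isa`/`dyn_cast` would play in LLVM code.

```cpp
// Toy stand-in for a generic instruction.
struct Instr {
  virtual ~Instr() = default;
};

// Toy stand-in for an instruction that can carry fast-math flags.
struct FPInstr : Instr {
  explicit FPInstr(bool flag) : requiredFlag(flag) {}
  bool requiredFlag; // models a single fast-math flag
};

// Returns true only when ctx is a floating-point operation AND carries the
// flag. Instructions of any other kind never have it.
inline bool hasRequiredFlag(const Instr &ctx) {
  const auto *fp = dynamic_cast<const FPInstr *>(&ctx);
  return fp != nullptr && fp->requiredFlag;
}
```

Querying the flag through an unchecked cast would be invalid for a plain `Instr`; the guarded version simply reports that the flag is absent.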
Relative to the first attempt, this contains two changes:
First, we only handle the case where one side simplifies to true or
false, instead of calling simplification recursively. The previous
approach would return poison if one operand simplified to poison
(under the equality assumption), which is incorrect.
Second, we do not fold llvm.is.constant in simplifyWithOpReplaced().
We may be assuming that a value is constant, if the equality holds,
but it may not actually be constant. This is nominally just a QoI
issue, but the std::list implementation in libstdc++ relies on the
precise behavior in a way that causes miscompiles.
-----
and/or in logical (select) form benefit from generic simplifications via
simplifyWithOpReplaced(). However, the corresponding fold for plain
and/or currently does not exist.
Similar to selects, there are two general cases for this fold
(illustrated with `and`, but there are `or` conjugates).
The basic case is something like `(a == b) & c`, where the replacement
of a with b or b with a inside c allows it to fold to true or false.
Then the whole operation will fold to either false or `a == b`.
The second case is something like `(a != b) & c`, where the replacement
inside c allows it to fold to false. In that case, the operand can be
replaced with c, because in the case where a == b (and thus the icmp is
false), c itself will already be false.
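Both cases can be sanity-checked by brute force on concrete choices of c. The sketch below is plain C++ rather than InstSimplify code; the particular comparisons used for c and all function names are assumptions picked for illustration.

```cpp
// Case 1a: (a == b) & (a <= b). Substituting b for a in the second operand
// gives b <= b, which is true, so the expression folds to a == b.
inline bool case1aOriginal(int a, int b) { return (a == b) & (a <= b); }
inline bool case1aFolded(int a, int b) { return a == b; }

// Case 1b: (a == b) & (a < b). The substitution gives b < b, which is
// false, so the expression folds to false.
inline bool case1bOriginal(int a, int b) { return (a == b) & (a < b); }
inline bool case1bFolded(int, int) { return false; }

// Case 2: (a != b) & (a < b). The substitution gives b < b, which is false,
// so the expression can be replaced with the second operand alone.
inline bool case2Original(int a, int b) { return (a != b) & (a < b); }
inline bool case2Folded(int a, int b) { return a < b; }

// Compares each original expression with its folded form for every pair
// (a, b) with lo <= a, b <= hi.
inline bool foldsAgreeOnRange(int lo, int hi) {
  for (int a = lo; a <= hi; ++a)
    for (int b = lo; b <= hi; ++b) {
      if (case1aOriginal(a, b) != case1aFolded(a, b))
        return false;
      if (case1bOriginal(a, b) != case1bFolded(a, b))
        return false;
      if (case2Original(a, b) != case2Folded(a, b))
        return false;
    }
  return true;
}
```

A call such as `foldsAgreeOnRange(-8, 8)` exercises all three rewrites on a small grid of inputs.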
As the test diffs show, this catches quite a lot of patterns in existing
test coverage. This also obsoletes quite a few existing special-case
and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst),
but I haven't removed anything as part of this patch in the interest of
risk mitigation.
Fixes #69050.
Fixes #69091.
Mostly the same as `and`. We also have a check for a useless
`llvm.ptrmask` if the ptr is already known aligned.
Differential Revision: https://reviews.llvm.org/D156633
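The "useless `llvm.ptrmask`" condition can be modelled on plain integers. This is a sketch under the assumption that the pointer is treated as a 64-bit address and that "known aligned" means the address is a multiple of a power-of-two `align`; `ptrmask` and `ptrmaskIsNoop` are illustrative names, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Models llvm.ptrmask as a bitwise AND on the address.
inline uint64_t ptrmask(uint64_t addr, uint64_t mask) { return addr & mask; }

// Returns true when the mask can only clear bits that are already known to
// be zero, given that the address is a multiple of `align`.
inline bool ptrmaskIsNoop(uint64_t align, uint64_t mask) {
  assert(align != 0 && (align & (align - 1)) == 0 &&
         "align must be a power of two");
  uint64_t knownZero = align - 1;          // low bits known to be zero
  return (mask | knownZero) == UINT64_MAX; // every cleared bit is already 0
}
```

For example, a 16-byte-aligned address is unchanged by a mask that clears the low four bits, while an address that is only 8-byte aligned may be changed by it.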
and/or in logical (select) form benefit from generic simplifications via
simplifyWithOpReplaced(). However, the corresponding fold for plain
and/or currently does not exist.
Similar to selects, there are two general cases for this fold
(illustrated with `and`, but there are `or` conjugates).
The basic case is something like `(a == b) & c`, where the replacement
of a with b or b with a inside c allows it to fold to true or false.
Then the whole operation will fold to either false or `a == b`.
The second case is something like `(a != b) & c`, where the replacement
inside c allows it to fold to false. In that case, the operand can be
replaced with c, because in the case where a == b (and thus the icmp is
false), c itself will already be false.
As the test diffs show, this catches quite a lot of patterns in existing
test coverage. This also obsoletes quite a few existing special-case
and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst),
but I haven't removed anything as part of this patch in the interest of
risk mitigation.
Fixes #69050.
Fixes #69091.
Some folds using m_NUW, m_NSW style matchers were missed. Make
sure they respect UseInstrInfo.
This is part of #53218, but not a complete fix for the issue.
Instead of unsetting flags on the instruction, attempting the
fold, and then resetting the flags if it failed, add support to
simplifyWithOpReplaced() to ignore poison-generating flags/metadata
and collect all instructions where they may need to be dropped.
This allows us to perform the fold a) with poison-generating
metadata, which was previously not handled, and b) with poison-generating
flags/metadata that are not on the root instruction.
Proof for the ctpop case: https://alive2.llvm.org/ce/z/3H3HFs
Fixes https://github.com/llvm/llvm-project/issues/62450.
This patch simplifies the pattern `icmp X and/or C1, X and/or C2` when
one constant mask is the subset of the other.
If `C1 & C2 == C1`, `A = X and/or C1`, `B = X and/or C2`, we can do the
following folds:
`icmp ule A, B -> true`
`icmp ugt A, B -> false`
We can apply similar folds for signed predicates when `C1` and `C2` are
the same sign:
`icmp sle A, B -> true`
`icmp sgt A, B -> false`
Alive2: https://alive2.llvm.org/ce/z/Q4ekP5
Fixes #65833.
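Independently of the Alive2 proof, the folds can be checked exhaustively at 8 bits. The code below is a brute-force sketch in plain C++, not the InstSimplify implementation; the helper names are invented for this illustration, and the narrowing casts to `int8_t` assume the usual two's-complement wrapping.

```cpp
#include <cstdint>

// True when every bit set in c1 is also set in c2, i.e. C1 & C2 == C1.
inline bool isSubsetMask(unsigned c1, unsigned c2) { return (c1 & c2) == c1; }

// Reinterprets the low 8 bits of v as a signed 8-bit value.
inline int8_t asSigned8(unsigned v) {
  return static_cast<int8_t>(static_cast<uint8_t>(v));
}

// icmp sle (X & C1), (X & C2) and icmp sle (X | C1), (X | C2) at 8 bits.
inline bool sleAnd(unsigned x, unsigned c1, unsigned c2) {
  return asSigned8(x & c1) <= asSigned8(x & c2);
}
inline bool sleOr(unsigned x, unsigned c1, unsigned c2) {
  return asSigned8(x | c1) <= asSigned8(x | c2);
}

// Checks icmp ule A, B -> true for both the `and` and the `or` form.
inline bool unsignedFoldsHold() {
  for (unsigned c1 = 0; c1 < 256; ++c1)
    for (unsigned c2 = 0; c2 < 256; ++c2) {
      if (!isSubsetMask(c1, c2))
        continue;
      for (unsigned x = 0; x < 256; ++x)
        if ((x & c1) > (x & c2) || (x | c1) > (x | c2))
          return false;
    }
  return true;
}

// Checks icmp sle A, B -> true when C1 and C2 have the same sign bit.
inline bool signedFoldsHold() {
  for (unsigned c1 = 0; c1 < 256; ++c1)
    for (unsigned c2 = 0; c2 < 256; ++c2) {
      if (!isSubsetMask(c1, c2) || ((c1 ^ c2) & 0x80u))
        continue;
      for (unsigned x = 0; x < 256; ++x)
        if (!sleAnd(x, c1, c2) || !sleOr(x, c1, c2))
          return false;
    }
  return true;
}
```

The same-sign requirement is not redundant: with `C1 = 0x00`, `C2 = 0x80` and `X = 0x80`, the mask is a subset but `sleAnd` is false, since 0 is greater than -128.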
Revert "InstSimplify: Add baseline tests for reported regression"
Revert "InstSimplify: Start cleaning up simplifyFCmpInst"
This reverts commit 0637b00041.
This reverts commit 239fb206de.
This reverts commit ddb3f12c42.
These commits cause crashes when compiling Chromium code; a reduced IR is attached at: https://reviews.llvm.org/D151887#4634914
Also picks up a few improvements. (Some of the fcmp.ll
test names imply they aren't quite testing what was intended.
Checking the sign bit can't be performed with a compare to 0.)
Much of the logic in here is the same as the class detection
logic of fcmpToClassTest. We could unify more with a weaker
version of fcmpToClassTest which returns implied classes rather
than exact class-like compares. Also could unify more with detection
of possible classes in non-splat vectors.
One problem here is that folds that used to always work
now require a context instruction. This is
because fcmpToClassTest requires the parent function.
Either fcmpToClassTest could tolerate a missing context
function, or we could require passing in one to simplifyFCmpInst.
Without this it's possible to hit the !isNan assert (which feels like
an unnecessary assert). In any case, these cases don't appear in
any tests.
https://reviews.llvm.org/D151887
We check whether the loop trip count is known to be a power of 2 to determine
whether the tail loop can be eliminated in D146199.
However, the remainder loop of a masked scalable loop can also be removed
if we know the mask is always going to be true for every vector iteration.
Depends on the power-of-two vscale assumption from D155350.
Proofs: https://alive2.llvm.org/ce/z/bT62Wa
Fixes https://github.com/llvm/llvm-project/issues/63616.
Reviewed By: goldstein.w.n, nikic, david-arm, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D154953
This reverts commit 3e386b2278.
Next to the original fold, this also implements an unnecessary and
inappropriate simplifyICmpWithDominatingAssume() based fold.
We check whether the loop trip count is known to be a power of 2 to determine
whether the tail loop can be eliminated in D146199.
However, the remainder loop of a masked scalable loop can also be removed
if we know the mask is always going to be true for every vector iteration.
Depends on the power-of-two vscale assumption from D155350.
Proofs: https://alive2.llvm.org/ce/z/FkTMoy
Fixes https://github.com/llvm/llvm-project/issues/63616.
Reviewed By: goldstein.w.n, nikic, david-arm, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D154953
Use the maximum 64 for BitWidth of getVScaleRange to avoid returning an empty range.
The previous change introduced a buildbot failure because of the self-assignment `MinSVEVectorSize = MinSVEVectorSize`:
error: explicitly assigning value of variable of type 'unsigned int' to itself [-Werror,-Wself-assign]
Reviewed By: sdesmalen, nikic, dmgreen
Differential Revision: https://reviews.llvm.org/D155708
Use the maximum 64 for BitWidth of getVScaleRange to
avoid returning an empty range.
Reviewed By: sdesmalen, nikic, dmgreen
Differential Revision: https://reviews.llvm.org/D155708
A similar assumption as for the x^x case also existed for the absorber
case, which led to a stage2 miscompile. That assumption is now fixed.
-----
Support replacement of operands not only in the immediate
instruction, but also instructions it uses.
For the most part, this extension is straightforward, but there are
two bits worth highlighting:
First, we can now no longer assume that if the Op is a vector, the
instruction also returns a vector. If Op is a vector and the
instruction returns a scalar, we should consider it as a cross-lane
operation.
Second, for the x ^ x special case and the absorber special case, we
can no longer assume that one of the operands is RepOp, as we might
have a replacement higher up the instruction chain.
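A toy model makes the second point concrete. The sketch below is not simplifyWithOpReplaced(): it is a standalone C++ expression tree in which `Expr`, `simplifyWithVarReplaced` and the helper constructors are all hypothetical, and unlike the real function it returns the rebuilt expression instead of failing when nothing simplifies.

```cpp
#include <cstdint>
#include <memory>

// Toy expression tree with constants, variables, xor and and.
struct Expr {
  enum Kind { Const, Var, Xor, And };
  Kind kind = Const;
  uint32_t value = 0;             // payload for Const
  int var = 0;                    // identifier for Var
  std::shared_ptr<Expr> lhs, rhs; // operands for Xor / And
};
using ExprPtr = std::shared_ptr<Expr>;

inline ExprPtr constant(uint32_t v) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Const;
  e->value = v;
  return e;
}
inline ExprPtr variable(int id) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Var;
  e->var = id;
  return e;
}
inline ExprPtr binary(Expr::Kind k, const ExprPtr &l, const ExprPtr &r) {
  auto e = std::make_shared<Expr>();
  e->kind = k;
  e->lhs = l;
  e->rhs = r;
  return e;
}

inline bool isZero(const ExprPtr &e) {
  return e->kind == Expr::Const && e->value == 0;
}
inline bool sameLeaf(const ExprPtr &a, const ExprPtr &b) {
  if (a->kind == Expr::Var && b->kind == Expr::Var)
    return a->var == b->var;
  if (a->kind == Expr::Const && b->kind == Expr::Const)
    return a->value == b->value;
  return false;
}

// Replaces variable `from` with variable `to` throughout the tree, then
// applies x ^ x -> 0 and the absorber fold x & 0 -> 0 on the way back up.
inline ExprPtr simplifyWithVarReplaced(const ExprPtr &e, int from, int to) {
  if (e->kind == Expr::Const)
    return e;
  if (e->kind == Expr::Var)
    return e->var == from ? variable(to) : e;
  ExprPtr l = simplifyWithVarReplaced(e->lhs, from, to);
  ExprPtr r = simplifyWithVarReplaced(e->rhs, from, to);
  // Neither special case may assume an operand is the replacement itself:
  // the replacement can sit arbitrarily deep below l or r.
  if (e->kind == Expr::Xor && sameLeaf(l, r))
    return constant(0);
  if (e->kind == Expr::And && (isZero(l) || isZero(r)))
    return constant(0);
  return binary(e->kind, l, r);
}
```

With variables 0, 1 and 2 standing for a, b and c, replacing a with b in `(a ^ b) & c` first turns the inner xor into 0 and then lets the outer `and` fold to 0, even though neither operand of the `and` is the replaced value.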
There is one optimization regression, but it is in a fuzzer-generated
test case.
Fixes https://github.com/llvm/llvm-project/issues/63104.