Use the maximum 64 for BitWidth of getVScaleRange to avoid returning an empty range.
the previous changes bring in a Buildbot failure because MinSVEVectorSize = MinSVEVectorSize.
error: explicitly assigning value of variable of type 'unsigned int' to itself [-Werror,-Wself-assign]
Reviewed By: sdesmalen, nikic, dmgreen
Differential Revision: https://reviews.llvm.org/D155708
Use the maximum 64 for BitWidth of getVScaleRange to
avoid returning an empty range.
Reviewed By: sdesmalen, nikic, dmgreen
Differential Revision: https://reviews.llvm.org/D155708
A similar assumption as for the x^x case also existed for the absorber
case, which lead to a stage2 miscompile. That assumption is not fixed.
-----
Support replacement of operands not only in the immediate
instruction, but also instructions it uses.
To the most part, this extension is straightforward, but there are
two bits worth highlighting:
First, we can now no longer assume that if the Op is a vector, the
instruction also returns a vector. If Op is a vector and the
instruction returns a scalar, we should consider it as a cross-lane
operation.
Second, for the x ^ x special case and the absorber special case, we
can no longer assume that one of the operands is RepOp, as we might
have a replacement higher up the instruction chain.
There is one optimization regression, but it is in a fuzzer-generated
test case.
Fixes https://github.com/llvm/llvm-project/issues/63104.
Handle constant folding and idempotent folding. Not sure
this is an appropriate use of undef for the inf/nan case. The
C version says the second result is "unspecified". The AMDGPU
instruction returns 0.
This is very likely the cause of a stage 2 failure in
Transforms/LoopVectorize/check-prof-info.ll. Revert until I can
investigate this.
This reverts commit 3d199d086e.
Support replacement of operands not only in the immediate
instruction, but also instructions it uses.
To the most part, this extension is straightforward, but there are
two bits worth highlighting:
First, we can now no longer assume that if the Op is a vector, the
instruction also returns a vector. If Op is a vector and the
instruction returns a scalar, we should consider it as a cross-lane
operation.
Second, for the x ^ x special case, we can no longer assume that
the operand is RepOp, as we might have a replacement higher up the
instruction chain.
There is one optimization regression, but it is in a fuzzer-generated
test case.
Fixes https://github.com/llvm/llvm-project/issues/63104.
After the semantics change from https://reviews.llvm.org/D154051,
gep inbounds x, 0 can no longer produce poison. As such, we can
also perform this fold during non-refining operand replacement
and avoid unnecessary drops of the inbounds flag.
The online alive2 version has not been update to the new
semantics yet, but we can use the following proof locally:
define ptr @src(ptr %base, i64 %offset) {
%cmp = icmp eq i64 %offset, 0
%gep = getelementptr inbounds i8, ptr %base, i64 %offset
%sel = select i1 %cmp, ptr %base, ptr %gep
ret ptr %sel
}
define ptr @tgt(ptr %base, i64 %offset) {
%gep = getelementptr inbounds i8, ptr %base, i64 %offset
ret ptr %gep
}
With the semantics change from D154051, it is no longer valid to
fold gep inbounds undef to poison (unless we know the index is
non-zero). Fold it to undef instead.
Differential Revision: https://reviews.llvm.org/D154215
Strengthen the fold for icmps of non-overlapping storage, by
working on the difference of offsets, rather than considering
both offsets independently. In particular, this allows handling
comparisons of pointers to the end of equal-sized allocations.
Proofs: https://alive2.llvm.org/ce/z/Po2nL4
Differential Revision: https://reviews.llvm.org/D153752
D143505 fixed/simplified folding of operations with SNaN operands. In
doing so it introduced a crash when handling scalable vector types,
wherein the scalable-vector ConstantVector was cast to a ConstantFP.
Since we know by that point in the code that if we've found a NaN, we're
dealing with a scalable-vector splat (as there are no other kinds of
scalable-vector constant for which that holds), we can grab the splatted
value and re-use the existing code, which will automatically splat the
new NaN back to a scalable vector for us.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153566
The way this is currently implemented the accumulated offsets can
end up having a different size, which causes unnecessary
complication for further extension of the code.
Don't strip pointer casts at the start and rely on
stripAndAccumulate to do any necessary stripping. It gracefully
handles different index sizes and will always retain the width of
the original pointer index type.
This is not NFC, but unlikely to make any practical difference.
`select i1 non-const, i1 true, i1 false` has been optimized to
`non-const`. There is no reason that we can not optimize `select i1
ConstExpr, i1 true, i1 false` to `ConstExpr`.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D151631
Unlike every other analysis and transform, simplifyInstruction
permitted operating on instructions which are not inserted
into a function. This created an edge case no other code needs
to really worry about, and limited transforms in cases that
can make use of the context function. Only the inliner and a handful
of other utilities were making use of this, so just fix up these
edge cases. Results in some IR ordering differences since
cloned blocks are inserted eagerly now. Plus some additional
simplifications trigger (e.g. some add 0s now folded out that
previously didn't).
This reverts commit 0012b94a4e.
Reverting due to test failures introduced by 73925ef8b0
Updated floating-point-compare.ll to keep the assume declaration.
This is generally handled already in early CSE.
If a specialized pipeline is used, however, its possible for `i1`
operand with known-zero denominator to slip through. Generally the
known-zero denominator is caught and poison is returned, but if it is
indirect enough (known zero through a phi node) we can miss this case
in `InstructionSimplify` and then miss handling `i1`. This is because
`i1` is current handled with the following check:
`if(Known.countMinLeadingZeros() == Known.getBitWidth() - 1)`
which only works on the assumption we don't know the denominator to be
zero. If we know the denominator to be zero, this check fails:
https://github.com/llvm/llvm-project/issues/62607
This patch simply adds an explicit `if(Known.isZero) return poison;`
which fixes the issue.
Alive2 Link for tests:
https://alive2.llvm.org/ce/z/VTw54n
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D150142
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.
Differential Revision: https://reviews.llvm.org/D149256
Use simplifySelectWithICmpEq to handle the implied equalities from the icmp-and,
then both of ICMP_NE and ICMP_EQ will be addressed including vector type.
(X & Y) == -1 ? X : -1 --> -1 (commuted 2 ways)
(X & Y) != -1 ? -1 : X --> -1 (commuted 2 ways)
This is a supplement to the icmp-or scenario on D148986.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D149229
Use simplifySelectWithICmpEq to handle the implied equalities from the icmp-or,
then both of ICMP_NE and ICMP_EQ will be addressed
(X | Y) == 0 ? X : 0 --> 0 (commuted 2 ways)
(X | Y) != 0 ? 0 : X --> 0 (commuted 2 ways)
Fixes https://github.com/llvm/llvm-project/issues/62263
Reviewed By: nikic, RKSimon
Differential Revision: https://reviews.llvm.org/D148986