Commit Graph

766 Commits

Author SHA1 Message Date
Nikita Popov
efe4e7a026 [SCEV] Fix incorrect nsw inference for multiply of addrec (#66500)
SCEV currently preserves the nsw flag when performing an nsw multiply of
an nsw addrec. While this is legal for nuw, this is not generally the
case for nsw.

This is because nsw mul does not distribute over nsw add:
https://alive2.llvm.org/ce/z/mergCt

Instead, we need either both nuw and nsw to be set
(https://alive2.llvm.org/ce/z/7wpgGc) or explicitly prove that the
distributed multiplications are also nsw
(https://alive2.llvm.org/ce/z/wef9su).

Fixes https://github.com/llvm/llvm-project/issues/66066.
2023-09-18 08:23:10 +02:00
Nikita Popov
0e67a68478 [SCEV] Add tests for PR66066 (NFC) 2023-09-15 13:53:11 +02:00
Nikita Popov
d82f0b74de [IndVars] Don't assume backedge value is instruction (PR64891)
In degenerate cases, the backedge value can be folded to poison.

Fixes https://github.com/llvm/llvm-project/issues/64891.
2023-08-22 10:33:33 +02:00
Nikita Popov
1c6e6432ca [SCEVExpander] Fix incorrect reuse of more poisonous instructions (PR63763)
SCEVExpander tries to reuse existing instruction with the same
SCEV expression. However, doing this replacement blindly is not
safe, because the instruction might be more poisonous.

What we were already doing is to drop poison-generating flags on
the reused instruction. But this is not the only way that more
poison can be introduced. The poison-generating flag might not
be directly on the reused instruction, or the poison contribution
might come from something like 0 * %var, which folds to 0 but can
still introduce poison.

This patch fixes the issue in a principled way, by determining which
values can contribute poison to the SCEV expression, and then
checking whether any additional values can contribute poison to the
instruction being reused. Poison-generating flags are dropped if
doing that enables reuse.

This is a pretty big hammer and does cause some regressions in
tests, but less than I would have expected. I wasn't able to come
up with a less intrusive fix that still satisfies the correctness
requirements.

Fixes https://github.com/llvm/llvm-project/issues/63763.
Fixes https://github.com/llvm/llvm-project/issues/63926.
Fixes https://github.com/llvm/llvm-project/issues/64333.
Fixes https://github.com/llvm/llvm-project/issues/63727.

Differential Revision: https://reviews.llvm.org/D158181
2023-08-22 09:27:07 +02:00
Nikita Popov
ed72dc8c1f [IndVarSimplify] Add test for PR63763 (NFC) 2023-08-11 16:53:55 +02:00
Matt Arsenault
25bc999d1f Intrinsics: Add type overload to stacksave and stackstore
This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but this is enough to fix existing tests.

https://reviews.llvm.org/D156666
2023-08-09 18:33:11 -04:00
Eli Friedman
60712732ea [IndVars] Teach replaceCongruentIVs to avoid scrambling induction variables
replaceCongruentIVs analysis is based on ScalarEvolution; this makes
comparing different PHIs and performing the replacement straightforward.
However, it can have some side-effects: it isn't aware whether an
induction variable is in canonical form, so it can perform replacements
which obscure the meaning of the IR.

In test22 in widen-loop-comp.ll, the resulting loop can't be analyzed by
ScalarEvolution at all.

My attempted solution is to restrict the transform: don't try to replace
induction variables using PHI nodes that don't represent simple
induction variables.

I'm not sure if this is the best solution; suggestions welcome.

Differential Revision: https://reviews.llvm.org/D121950
2023-07-12 12:27:39 -07:00
Nikita Popov
edb2fc6dab [llvm] Remove explicit -opaque-pointers flag from tests (NFC)
Opaque pointers mode is enabled by default, no need to explicitly
enable it.
2023-07-12 14:35:55 +02:00
Nikita Popov
0e34b6a504 [LCSSA] Compute SCEV of LCSSA phi if original instruction had SCEV
The backstory is that the LCSSA invalidation we perform here is not
really necessary from a SCEV perspective. However, other code may
rely on the fact that invalidating only LCSSA phi nodes is sufficient
for transforms like loop peeling
(see https://reviews.llvm.org/D149331#4398582 for more details).

However, performing invalidation during LCSSA construction also
means that SCEV expansion (which may need to construct LCSSA) can
invalidate SCEV, which is somewhat unexpected and code may not be
prepared to deal with it (see the added test case, reported at
https://reviews.llvm.org/D149435#4428219).

Instead of invalidating SCEV, ensure that the LCSSA phi node also
has cached SCEV if the original instruction did. This means that
later invalidation of LCSSA phi nodes will work as expected. This
should avoid both the above issues and be more efficient.

Differential Revision: https://reviews.llvm.org/D153145
2023-06-26 14:43:31 +02:00
zhongyunde
34d380e1f6 [IndVars] Add check of loop invariant for indirect use
We usually only check direct use instruction of IV, while the
bitcast of 'ptrtoint ptr to i64' doesn't affect the result, so go
a step further.
Fix https://github.com/llvm/llvm-project/issues/59633.

Reviewed By: markoshorro
Differential Revision: https://reviews.llvm.org/D151877
2023-06-03 22:29:09 +08:00
Nikita Popov
dfb369399d [ValueTracking] Directly use KnownBits shift functions
Make ValueTracking directly call the KnownBits shift helpers, which
provides more precise results.

Unfortunately, ValueTracking has a special case where sometimes we
determine non-zero shift amounts using isKnownNonZero(). I have my
doubts about the usefulness of that special-case (it is only tested
in a single unit test), but I've reproduced the special-case via an
extra parameter to the KnownBits methods.

Differential Revision: https://reviews.llvm.org/D151816
2023-06-01 09:46:16 +02:00
Nikita Popov
dc81e69eb1 [IndVars] Check expansion safety in makeIVComparisonInvariant() (PR62992)
Make sure the invariant expressions are safe to expand. In
particular, we should not speculative a trapping division into
the preheader.

Fixes https://github.com/llvm/llvm-project/issues/62992.
2023-05-31 11:21:35 +02:00
Tobias Hieta
f84bac329b [NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4e
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Krzysztof Drewniak
53a4adc0de [AMDGPU] Fix crash with 160-bit p7's by manually defining getPointerTy
While pointers in address space 7 (128 bit rsrc + 32 bit offset)
should be rewritten out of the code before IR translation on AMDGPU,
higher-level analyses may still call MVT getPointerTy() and the like
on the target machine. Currently, since there is no MVT::i160, this
operation ends up causing crashes.

The changes to the data layout that caused such crashes were D149776.

This patch causes getPointerTy() to return the type MVT::v5i32
and getPointerMemTy() to be MVT::v8i32. These are accurate types,
but mean that we can't use vectors of address space 7 pointers during
codegen. This is mostly OK, since vectors of buffers aren't supported
in LPC anyway, but it's a noticable limitation.

Potential alternative solutions include adjusting getPointerTy() to return
an EVT or adding MVT::i160 and MVT::i256, both of which are rather
disruptive to the rest of the compiler.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D150002
2023-05-12 15:57:53 +00:00
Krzysztof Drewniak
f0415f2a45 Re-land "[AMDGPU] Define data layout entries for buffers""
Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0.

Differential Revision: https://reviews.llvm.org/D149776
2023-05-03 19:43:56 +00:00
Krzysztof Drewniak
3f2fbe92d0 Revert "[AMDGPU] Define data layout entries for buffers"
This reverts commit f9c1ede254.

Differential Revision: https://reviews.llvm.org/D149758
2023-05-03 16:11:00 +00:00
Krzysztof Drewniak
f9c1ede254 [AMDGPU] Define data layout entries for buffers
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. ptr
addrspace(8). These pointers are 128-bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, [addr_max]] value from applying to them.

Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D145441
2023-05-03 15:25:58 +00:00
Florian Hahn
b14be1e7c0 [SCEV] Use object size for globals to sharpen ranges.
The highest address the object can start is ObjSize bytes before the
end (unsigned max value). If this value is not a multiple of the
alignment, the last possible start value is the next lowest multiple
of the alignment. Note: The computations cannot overflow,
because if they would there's no possible start address for the
object.

At the moment, this is limited to GlobalVariables, because I could not
find a API similar to getObjectSize to also get the alignment of the
object. With such an API, this can be generalized to general addresses.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D149483
2023-04-29 21:33:30 +01:00
Florian Hahn
6ebe394915 [IndVars] Add test for simplifying a induction cmp against global. 2023-04-28 20:41:09 +01:00
Nikita Popov
079c525f20 [SCEV] Try simplifying phi before createNodeFromSelectLikePHI()
Sometimes a phi can both be trivial and match the
createNodeFromSelectLikePHI() fold. In that case it is generally
more profitable to look through the phi node.
2023-04-27 15:07:19 +02:00
Max Kazantsev
ab07cbe437 [SCEV] Support sub in and negative constants willNotOverflow
This lifts two TODOs from this function, allowing us to prove
no-overflow whether it happens through max int (up) or through
min int (down) for both and and sub.

Differential Revision: https://reviews.llvm.org/D148618
Reviewed By: dmakogon
2023-04-25 16:40:37 +07:00
Nikita Popov
d003c01c30 [LV][IndVars] Move test to correct directory and regenerate (NFC)
For some reason, an IndVarSimplify test was in the LoopVectorize
directory.
2023-04-21 18:03:41 +02:00
Max Kazantsev
7cdea872d4 [Test] Add test showing that we can infer nsw 2023-04-18 13:17:45 +07:00
Max Kazantsev
dc9833067d [Test] Regenerate checks in some tests using auto-update script 2023-04-12 13:42:31 +07:00
Max Kazantsev
915a45c0a2 [Test] Add some more IndVars canonicalization tests 2023-04-11 12:37:10 +07:00
Florian Hahn
484c622760 [SCEV] Do not strengthen nuw/nsw flags during get[Zero,Sign]ExtendedExpr.
Modifying AddRecs when constructing other expressions can lead to
surprising changes. It also seems like it is not really beneficial i
most cases.

At the moment, there's a single regression, but we still might be able
to improve the flags at AddRec construction.

Might help with the issue discussed in D143409.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D144051
2023-03-15 20:57:34 +00:00
Nikita Popov
660403940c [SCEV] Fix finite loop non-strict predicate simplification (PR60944)
There are a number of issues with the current code for converting
ule -> ult (etc) predicates for comparisons controlling finite loops:

* It sets nowrap flags, which may only hold for that particular
  comparison, not globally. (PR60944)
* It doesn't check that the RHS is invariant. (I'm not sure this
  can cause practical issues independently of the previous point.)
* It runs before simplifications that may be more profitable. (PR54191)

This patch moves the handling for this into computeExitLimitFromICmp(),
because it is somewhat tightly coupled with assumptions in that code,
and addresses the aforementioned issues.

Fixes https://github.com/llvm/llvm-project/issues/60944.
Fixes https://github.com/llvm/llvm-project/issues/54191.

Differential Revision: https://reviews.llvm.org/D145510
2023-03-14 10:55:02 +01:00
Nikita Popov
f893d35c0d [IndVars] Add test for PR60944 (NFC) 2023-03-07 15:33:35 +01:00
Max Kazantsev
90f5176ab2 [Test] Add tests where we can replace IV check with invariant check basing on predicated backedge cond 2023-02-10 13:57:39 +07:00
Max Kazantsev
f3e2f26378 [IndVars] Expand icmp in preheader rather than in loop
The motivation is that 'createInvariantCond' unconditionally
builds icmp in the loop block,  while it could always do it
in preheader. Build it in preheader instead.

Patch by Aleksandr Popov!

Differential Revision: https://reviews.llvm.org/D141994
Reviewed By: nikic
2023-01-25 14:41:29 +07:00
Max Kazantsev
932ae48c27 [IndVars] Improve handling of multi-exit loops with known symbolic counts
This patch does two things, both related to support of multi-exit loops with
many exits that have known symbolic max exit count. They can theoretically
go independently, but I don't know how to write a test showing separate
impact.

Part 1: `SkipLastIter` can be set to `true` not when a particular exit has exit
count same as the whole loop (and therefore it must exit on the last iteration),
but when the aggregate of first few exits has umin same as whole loop exit count.
It means that it's not known which of them will exit exactly, but one of them will.

Part 2: when `SkipLastIter` is set, and exit count is `umin(a, b, c)`, instead of
`umin(a, b, c) - 1` use `umin(a - 1, b - 1, c - 1)`. We don't care about overflows
here, but the further logic knows how to deal with umin by element, but the
`SCEVAddExpr` node will confuse it.

Differential Revision: https://reviews.llvm.org/D141361
Reviewed By: nikic
2023-01-24 12:46:48 +07:00
Max Kazantsev
602916d2de [IndVars] Apply more optimistic SkipLastIter for AND/OR conditions
When exit by condition `C1` dominates exit by condition `C2`, and
max symbolic exit count for `C1` matches those for loop, we will
apply more optimistic logic to `C2` by setting `SkipLastIter` for it,
meaning that it will do 1 iteration less because the dominating branch
must exit on the last loop iteration.

But when we have a single exit by condition `C1 & C2`, we cannot
apply the same logic, because there is no dominating condition.

However, if we can prove that the symbolic max exit count of `C1 & C2`
matches those of `C1`, it means that for `C2` we can assume that it
doesn't matter on the last iteration (because the whole thing is `false`
because `C1` must be `false`). Therefore, in this situation, we can handle
`C2` as if it had `SkipLastIter`.

Differential Revision: https://reviews.llvm.org/D139934
Reviewed By: nikic
2023-01-24 12:26:22 +07:00
Jannik Silvanus
986029c164 [Transforms] Fix i8 alignment in datalayout of lit test
A lit test used overaligned i8, apparently due to an old copy-paste
error, intending to specify i16 alignment.

Change the datalayout string to use naturally aligned i8.
2023-01-20 15:52:07 +01:00
Max Kazantsev
32aea4bb84 [Test] Add test showing one more missing case of turn-to-invariant with widening 2023-01-11 13:07:38 +07:00
Owen Anderson
90c1846629 Do not short circuit hoistIVInc when recomputation of poison flags is needed.
Fixes https://github.com/llvm/llvm-project/issues/59777

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D140836
2023-01-10 23:03:07 -07:00
Max Kazantsev
fae63a9a22 [Test] Give test variables more reasonable names 2023-01-11 12:29:26 +07:00
Max Kazantsev
9f37ecf8fa [IndVars] Support AND/OR in optimizeLoopExitWithUnknownExitCount
This patch allows optimizeLoopExitWithUnknownExitCount to deal with
branches by conditions that are not immediately ICmp's, but aggregates
of ICmp's joined by arithmetic or logical AND/OR. Each ICmp is optimized
independently.

Differential Revision: https://reviews.llvm.org/D139832
Reviewed By: nikic
2023-01-11 11:36:02 +07:00
Max Kazantsev
3602d852a5 [Test] One more test where check is not replaced to invariant
Irrelevant constant check makes things even more difficult, surprisingly.
2023-01-09 19:26:48 +07:00
Max Kazantsev
ba2bb63562 [Test] Add tests with logical AND/OR 2022-12-27 08:58:12 +07:00
Max Kazantsev
5f24f893ca [Test] Update inverse test for turn-to-invariant to what they meant to be
They were supposed to test inverted branches with OR condition, not AND.
Fixed this now.
2022-12-26 16:01:18 +07:00
Max Kazantsev
d02c3b1358 [Test] Add test showing potential conflict b/w AND elimination and IV widening 2022-12-26 14:37:49 +07:00
Max Kazantsev
9a7286b61f [SCEV] Help getLoopInvariantExitCondDuringFirstIterations deal with complex umin exit counts. PR59615
Recent improvements in symbolic exit count computation revealed some problems with
SCEV's ability to find invariant predicate during first iterations. Ultimately it is based on its
ability to prove some facts for value on the last iteration. This last value, when it includes
`umin` as part of exit count, isn't always simplified enough. The motivating example is following:

https://github.com/llvm/llvm-project/issues/59615

Could not prove:
```
        Pred = 36, LHS = (-1 + (-1 * (2147483645 umin (-1 + %var)<nsw>))<nsw> + %var), RHS = %var
        FoundPred = 36, FoundLHS = {1,+,1}<nuw><nsw><%bb3>, FoundRHS = %var
```
Can prove:
```
        Pred = 36, LHS = (-1 + (-1 * (-1 + %var)<nsw>)<nsw> + %var), RHS = %var
        FoundPred = 36, FoundLHS = {1,+,1}<nuw><nsw><%bb3>, FoundRHS = %var
```

Here ` (2147483645 umin (-1 + %var)<nsw>)` is exit count composed of two parts from
two different exits: `2147483645 ` and `(-1 + %var)<nsw>`. When it was only one (latter)
analyzeable exit, for it everything was easily provable. Unfortunately, in general case `umin`
in one of `add`'s operands doesn't guarantee that the whole sum reduces, especially in presence
of negative steps and lack of `nuw`. I don't think there is a generic legal way to somehow play
around this `umin`.

So the ad-hoc solution is following: if we failed to find an equivalent predicate that is invariant
during first `MaxIter` iterations, and `MaxIter = umin(a, b, c...)`, try to find solution for at least one
of `a`, `b`, `c`... Because they all are `uge` than `MaxIter`, whatever is true during `a (b, c)` iterations
is also true during `MaxIter` iterations.

Differential Revision: https://reviews.llvm.org/D140456
Reviewed By: nikic
2022-12-21 18:12:17 +07:00
Max Kazantsev
474c8fe9b7 [Test] Precommit test for PR59615 2022-12-21 11:39:05 +07:00
Roman Lebedev
af1f1b064a [NFC] Fixup checkline confusion 2022-12-14 17:53:06 +03:00
Roman Lebedev
da80639ee2 [NFC][IndVar] Autogenerate checklines in one test 2022-12-14 17:39:10 +03:00
Nikita Popov
3ce360f15b [IndVarSimplify] Convert more tests to opaque pointers (NFC) 2022-12-14 15:37:58 +01:00
Nikita Popov
8b7b5f9cfe [IndVarSimplify] Regenerate test checks (NFC) 2022-12-14 15:35:58 +01:00
Florian Hahn
6e86b544dd [SCEV] Cache folded SExt SCEV expressions.
Use FoldID to cache SignExtendExprs that get folded to a different
SCEV.

Depends on D137505.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D137849
2022-12-14 11:59:19 +00:00
Nikita Popov
bf4aeefed4 [IndVarSimplify] Convert last test to opaque pointers (NFC)
After addressing the SCEVExpander issue that caused me pause.

Also delete a redundant test that is no longer needed.
2022-12-13 15:20:35 +01:00
Nikita Popov
5810927dcb [SCEVExpander] Produce canonical constant GEP
Go through IRBuilder to enable DL-based folding, so that we produce
a canonical constant GEP. Noticed while converting tests to opaque
pointers.
2022-12-13 15:08:28 +01:00