583 Commits

Author SHA1 Message Date
Florian Hahn
3ba3ea3c06 [IVUsers] Check getExpr result in findAddRecForLoop.
This fixes a crash if the SCEV for the use isn't invertible and nullptr
is returned.

Fixes https://github.com/llvm/llvm-project/issues/63840
2023-07-20 14:56:19 +01:00
Nikita Popov
ddb46abd3c [LSR] Don't consider users of constant outside loop
In CollectLoopInvariantFixupsAndFormulae(), LSR looks at users
outside the loop. E.g. if we have an addrec based on %base, and
%base is also used outside the loop, then we have to keep it in a
register anyway, which may make it more profitable to use
%base + %idx style addressing.

This reasoning doesn't hold up when the base is a constant, because
the constant can be rematerialized. The lsr-memcpy.ll test regressed
when enabling opaque pointers, because inttoptr (i64 6442450944 to ptr)
now also has a use outside the loop (previously it didn't due to a
pointer type difference), and that extra "use" results in worse use
of addressing modes in the loop. However, the use outside the loop
actually gets rematerialized, so the alleged register saving does
not occur.

The same reasoning also applies to other types of constants, such
as global variable references.

Differential Revision: https://reviews.llvm.org/D155073
2023-07-13 12:22:38 +02:00
Nikita Popov
e8a5df7beb [LSR] Add test variant with global variables (NFC)
A variant of the test using globals instead of inttoptr expressions
for D155073.
2023-07-13 12:12:48 +02:00
Nikita Popov
edb2fc6dab [llvm] Remove explicit -opaque-pointers flag from tests (NFC)
Opaque pointers mode is enabled by default, no need to explicitly
enable it.
2023-07-12 14:35:55 +02:00
Nikita Popov
6c388e06f5 [LSR] Convert test to opaque pointers (NFC)
This regresses with opaque pointers. I'll submit a patch to recover
the regression.
2023-07-12 14:07:25 +02:00
Nikita Popov
4ec3ea8afa [LSR] Convert some tests to opaque pointers (NFC)
These no longer show codegen regressions.
2023-07-12 11:48:44 +02:00
Nikita Popov
bd0710c221 [LSR] Move test to target specific directory (NFC)
Uses an x86 triple.
2023-07-12 11:44:09 +02:00
Nikita Popov
d69033d245 [SCEVExpander] Fix GEP IV inc reuse logic for opaque pointers
Instead of checking the pointer type, check the element type of
the GEP.

Previously we ended up reusing GEP increments that were not in
expanded form, thus not respecting LSR's choice of representation.

The change in 2011-10-06-ReusePhi.ll recovers a regression that
appeared when converting that test to opaque pointers.

Changes in various Thumb tests now compute the step outside the
loop instead of using add.w inside the loop, which is LSR's
preferred representation for this target.
2023-07-12 11:32:13 +02:00
Nikita Popov
7a21efce72 [LSR] Move test to target-specific directory (NFC) 2023-07-12 10:10:49 +02:00
Nikita Popov
cfa9275888 [LSR] Convert some tests to opaque pointers (NFC) 2023-07-12 09:46:08 +02:00
Nikita Popov
7a78756118 [LSR] Regenerate test checks (NFC) 2023-07-12 09:40:10 +02:00
Florian Hahn
69ca5c9d62 [SCEV] Add flag to control invertible check for normalization.
When normalizing a SCEV expression during expansion, there should be
no need for it to be invertible, as it will only be used for code
generation. This fixes a crash after 7f5b15ad15.

Fixes https://github.com/llvm/llvm-project/issues/63678.
2023-07-05 18:11:44 +01:00
Florian Hahn
7f5b15ad15 [LSR] Move normalization check to normalizeForPostIncUse.
Move the logic added in 3a57152d85 to normalizeForPostIncUse to catch
additional non-invertible cases. This fixes another miscompile pointed
out by @peixin in D153004.
2023-07-04 11:56:51 +01:00
Florian Hahn
02591d26b9 [LSR] Add test for another normalization miscompile.
Based on @peixin test case shared in D153004.
2023-07-03 18:57:31 +01:00
Fangrui Song
d39b4ce3ce [test] Replace aarch64-*-eabi with aarch64
Using "eabi" for aarch64 targets is a common mistake that the Clang driver warns about.
We want to avoid it elsewhere as well. Just use the common "aarch64" without
other triple components.
2023-06-27 20:02:52 -07:00
Nikita Popov
b51153792b [LSR] Convert some tests to opaque pointers (NFC) 2023-06-23 17:13:57 +02:00
Nikita Popov
2c9aba9352 [LSR] Regenerate test checks (NFC) 2023-06-23 17:06:51 +02:00
Florian Hahn
3a57152d85 [LSR] Return nullptr from getExpr if the result isn't invertible.
getExpr is missing a check to make sure the result is invertible.
This can lead to incorrect results, so return nullptr in those cases
like in other places in IVUsers.

Fixes #62660.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D153202
2023-06-22 19:10:48 +01:00
Florian Hahn
93407f7675 [LSR] Adjust test to make sure it keeps testing for the original issue.
Make sure the test keeps testing for the original issue after D153202.
2023-06-22 15:36:32 +01:00
Matt Arsenault
92ee60b66f AMDGPU: Drop and upgrade llvm.amdgcn.atomic.inc/dec to atomicrmw 2023-06-21 21:20:26 -04:00
Florian Hahn
dae5cd73cb Recommit "[LSR] Consider post-inc form when creating extends/truncates."
This reverts the revert commit 1797ab36ef.

The recommitted version now checks the PostIncLoopSets for all fixups
and returns nullptr if the result doesn't match for all fixups.
2023-06-19 17:57:06 +01:00
Florian Hahn
798b6419bc [LSR] Add test for issue leading to revert of abfeda5af3.
Add unit test triggering an assertion with abfeda5af3.
2023-06-19 15:35:48 +01:00
NAKAMURA Takumi
7400bdc19f pr62660-normalization-failure.ll REQUIRES: asserts (#62660) 2023-06-18 15:24:53 +09:00
Florian Hahn
8225698212 [LSR] Enable SCEV verification for test from f3a0ad2d and mark as XFAIL
The test fails SCEV verification, which causes the expensive check bots
to fail. Always run verification and mark as XFAIL until fixed.
2023-06-17 21:06:49 +01:00
Florian Hahn
1797ab36ef Revert "[LSR] Consider post-inc form when creating extends/truncates."
This reverts commits abfeda5af3 and fe19036e12.

The added assertion triggers during clang bootstrap builds. Revert while
I investigate.
2023-06-17 17:58:41 +01:00
Florian Hahn
f3a0ad2d8b [LSR] Add test for #62660.
Add test for LSR miscompile.
2023-06-17 17:37:25 +01:00
Florian Hahn
abfeda5af3 [LSR] Consider post-inc form when creating extends/truncates.
GenerateTruncates at the moment creates extends/truncates for post-inc
uses of normalized expressions. For example, if an add rec of the form
{1,+,-1} is used outside the loop, the normalized form will use {1,+,-1}
instead of {0,+,-1}. When naively sign-extending the normalized
expression, it will get extended incorrectly to {1,+,-1} for the wider
type, if the backedge-taken count of the loop is 1.

To address this, the patch updates GenerateTruncates to check if the
LSRUse contains any fixups with PostIncLoops. If that's the case, first
de-normalize the expression, then perform the extend/truncate, then
normalize again.

There may be other places where similar checks are needed and the helper
can be generalized for those cases. I'd not be surprised if other subtle
mis-compiles are caused by this.

Fixes #38847.
Fixes #58039.
Fixes #62852.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D153004
2023-06-17 09:58:37 +01:00
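The hazard described in abfeda5af3 comes down to extension not commuting with wrapping arithmetic. The following is a minimal Python sketch, not LSR's code: the helper names, the 8-bit and 16-bit widths, and the modelling of de-normalization as "add the step" are all illustrative assumptions, and the inputs are chosen to show the effect rather than to reproduce the test cases from the linked issues. It compares extending a normalized value and then adding the step in the wide type against adding the step in the narrow type and then extending.

```python
def wrap(value: int, bits: int) -> int:
    """Reduce value to an unsigned bit pattern of the given width."""
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int) -> int:
    """Interpret a bit pattern as a two's-complement integer."""
    value = wrap(value, bits)
    return value - (1 << bits) if value >> (bits - 1) else value


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits pattern to a to_bits pattern."""
    return wrap(to_signed(value, from_bits), to_bits)


def extend_naive(normalized: int, step: int, narrow: int, wide: int) -> int:
    """Extend the normalized value first, then add the step in the wide type."""
    return wrap(sext(normalized, narrow, wide) + sext(step, narrow, wide), wide)


def extend_denormalized(normalized: int, step: int, narrow: int, wide: int) -> int:
    """Add the step in the narrow type first (de-normalize), then extend."""
    return sext(wrap(normalized + step, narrow), narrow, wide)


NARROW, WIDE = 8, 16  # assumed widths, for illustration only
step = wrap(-1, NARROW)

# The narrow addition does not overflow: both orders agree.
agree = (
    extend_naive(1, step, NARROW, WIDE),
    extend_denormalized(1, step, NARROW, WIDE),
)

# The narrow addition overflows (-128 + -1): the two orders disagree.
start = wrap(-128, NARROW)
differ = (
    extend_naive(start, step, NARROW, WIDE),
    extend_denormalized(start, step, NARROW, WIDE),
)

print(agree, differ)
```

In this toy model, the order the commit message describes (de-normalize, extend, normalize again) corresponds to `extend_denormalized`, which performs the step in the original type before widening.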
Florian Hahn
f63c038af4 [LSR] Add test case for #58039. 2023-06-17 09:57:00 +01:00
Florian Hahn
672b35d554 [LSR] Move new test to X86 subdir.
The test added in 1665cb0630 requires the X86 backend, so move it to
the X86 subdirectory.
2023-06-15 11:11:06 +01:00
Florian Hahn
1665cb0630 [LSR] Add test cases showing bad handling of extends of post-inc uses.
Tests from #38847, #62852.
2023-06-15 10:15:12 +01:00
Dmitry Makogon
0a3dc73e70 [Test] Move LoopStrengthReduce/pr62563.ll to X86 specific test folder (NFC)
The test case is X86 specific. Should unblock buildbots after 253e3e2.
2023-05-31 20:24:30 +07:00
Dmitry Makogon
253e3e2619 [Test] Add test showing miscompilation in LoopStrengthReduce on min/max expressions (NFC)
This is a test case from https://github.com/llvm/llvm-project/issues/62563.
2023-05-31 18:46:23 +07:00
sgokhale
c4a60c9d34 [CodeGen][ShrinkWrap] Enable PostShrinkWrap by default
This is an attempt to reland D42600 and enable this optimisation by default.

This also resolves the issue pointed out in the context of PGO build.

Differential Revision: https://reviews.llvm.org/D42600
2023-05-25 13:56:29 +05:30
Tobias Hieta
f84bac329b [NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4e
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Alan Zhao
f4999d3535 Revert "[CodeGen][ShrinkWrap] Split restore point"
This reverts commit 1ddfd1c818.

The original commit causes a Chrome build assertion failure with
ThinLTO: https://crbug.com/1443635
2023-05-08 16:27:59 -07:00
sgokhale
1ddfd1c818 [CodeGen][ShrinkWrap] Split restore point
Try to reland D42600

Differential Revision: https://reviews.llvm.org/D42600
2023-05-08 13:21:07 +05:30
Krzysztof Drewniak
f0415f2a45 Re-land "[AMDGPU] Define data layout entries for buffers"
Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0.

Differential Revision: https://reviews.llvm.org/D149776
2023-05-03 19:43:56 +00:00
Krzysztof Drewniak
3f2fbe92d0 Revert "[AMDGPU] Define data layout entries for buffers"
This reverts commit f9c1ede254.

Differential Revision: https://reviews.llvm.org/D149758
2023-05-03 16:11:00 +00:00
Krzysztof Drewniak
f9c1ede254 [AMDGPU] Define data layout entries for buffers
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them, as ptr addrspace(8), instead of <4 x i32> as resource
arguments. These pointers are 128 bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, addr_max] value from applying to them.

Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D145441
2023-05-03 15:25:58 +00:00
Momchil Velikov
6c9066fe2e Recommit "[AArch64] Fix incorrect isLegalAddressingMode"
This patch recommits 0827e2fa3f after
reverting it in ed7ada259f. Added a
workaround for `TargetLowering::AddrMode` no longer being an aggregate
in C++20.

`AArch64TargetLowering::isLegalAddressingMode` has a number of
defects, including accepting an addressing mode which consists of
only an immediate operand, or not checking the offset range for an
addressing mode in the form `1*ScaledReg + Offs`.

This patch fixes the above issues.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D143895

Change-Id: I41a520c13ce21da503ca45019979bfceb8b648fa
2023-04-21 16:21:01 +01:00
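The two defects named in 6c9066fe2e can be illustrated with a toy legality check. This is a hedged Python sketch, not the code from D143895: the `AddrMode` class only mirrors the field names of LLVM's `TargetLowering::AddrMode`, and `offset_in_range`, `is_legal`, and the immediate ranges used (signed 9-bit unscaled, unsigned 12-bit scaled by the access size) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AddrMode:
    """Toy mirror of BaseGV + BaseOffs + BaseReg + Scale*ScaledReg."""
    base_gv: bool = False       # a global value is part of the address
    base_offs: int = 0          # immediate offset
    has_base_reg: bool = False  # a base register is present
    scale: int = 0              # multiplier on the scaled register, 0 = none


def offset_in_range(offs: int, access_bytes: int) -> bool:
    """Assumed immediate forms: signed 9-bit unscaled, or unsigned
    12-bit scaled by the access size."""
    if -256 <= offs <= 255:
        return True
    return offs >= 0 and offs % access_bytes == 0 and offs // access_bytes <= 4095


def is_legal(am: AddrMode, access_bytes: int) -> bool:
    if am.base_gv:
        return False
    # Defect 1: a mode made of only an immediate has no register; reject it.
    if not am.has_base_reg and am.scale == 0:
        return False
    # Defect 2: 1*ScaledReg + Offs is really BaseReg + Offs, so canonicalize
    # it and apply the same offset range check.
    if not am.has_base_reg and am.scale == 1:
        am = AddrMode(base_offs=am.base_offs, has_base_reg=True, scale=0)
    if am.scale == 0:
        return offset_in_range(am.base_offs, access_bytes)
    # Register plus scaled register: no immediate, and the scale must be
    # 1 or the access size (an assumption of this sketch).
    return am.base_offs == 0 and am.scale in (1, access_bytes)


print(is_legal(AddrMode(base_offs=16), 8))
print(is_legal(AddrMode(base_offs=1 << 20, scale=1), 8))
```

Both calls print `False`: the first mode has no register at all, and the second has an offset that fits neither assumed immediate form.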
Momchil Velikov
ed7ada259f Revert "[AArch64] Fix incorrect isLegalAddressingMode"
This reverts commit 0827e2fa3f.

Failing buildbot, perhaps due to `-std=c++20`.
2023-04-20 16:10:45 +01:00
Momchil Velikov
0827e2fa3f [AArch64] Fix incorrect isLegalAddressingMode
`AArch64TargetLowering::isLegalAddressingMode` has a number of
defects, including accepting an addressing mode which consists of only
an immediate operand, or not checking the offset range for an
addressing mode in the form `1*ScaledReg + Offs`.

This patch fixes the above issues.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D143895

Change-Id: I756fa21941844ded44f082ac7eea4391219f9851
2023-04-20 15:43:11 +01:00
sgokhale
bb5befefc6 Revert "[CodeGen][ShrinkWrap] Split restore point"
This reverts commit 5f0bccc3d1.

An issue has been reported here: https://github.com/ClangBuiltLinux/linux/issues/1833
2023-04-13 10:52:28 +05:30
Nikita Popov
e7f4ad13ae [Transforms] Convert some tests to opaque pointers (NFC) 2023-04-11 16:49:12 +02:00
sgokhale
5f0bccc3d1 [CodeGen][ShrinkWrap] Split restore point
This patch splits a restore point to allow it to only post-dominate blocks reachable by use
or def of CSRs (Callee Saved Registers)/FI (Frame Index).

Benchmarking this on SPEC2017, this gives around 4% improvement on povray and no significant change
for others.

Co-authored-by: junbuml

Differential Revision: https://reviews.llvm.org/D42600
2023-04-11 11:58:50 +05:30
Dmitry Makogon
3d7242f05e Reapply "[LSR] Preserve LCSSA when rewriting instruction with PHI user"
This reverts commit efd34ba60f.

Reapplies 8ff4832679. Missed a failing test. Needed to just
update test checks.
2023-04-06 17:31:27 +07:00
Nico Weber
efd34ba60f Revert "[LSR] Preserve LCSSA when rewriting instruction with PHI user"
This reverts commit 8ff4832679.
Breaks tests, see https://reviews.llvm.org/D146811#4232839
2023-03-30 06:40:16 -04:00
Dmitry Makogon
8ff4832679 [LSR] Preserve LCSSA when rewriting instruction with PHI user
Fixes https://github.com/llvm/llvm-project/issues/61182.

LoopStrengthReduce may sometimes break LCSSA form when applying a rewrite
for an instruction used in a PHI.
It happens if:
 - The PHI is in a loop exit block,
 - The edge from the corresponding exiting block to that exit is critical,
 - The PHI has at least two inputs coming from loop blocks,
 - and the rewritten instruction is inserted in the loop.

In such a case we split the critical edge and then replace PHI inputs
with the rewritten instruction. However, ExitBlock is no longer
a loop exit, so LCSSA form is broken.

This patch fixes it by collecting all inserted instructions for PHIs
whose parent block is not a loop exit and then forming LCSSA for them.

Differential Revision: https://reviews.llvm.org/D146811
2023-03-30 14:46:28 +07:00
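The conditions listed in 8ff4832679 hinge on the exiting edge being critical, meaning its source has more than one successor and its destination has more than one predecessor. A minimal Python sketch of that definition follows. The block names and the `critical_edges` helper are hypothetical and unrelated to LLVM's own utilities; the CFG is a made-up loop whose exit block receives two edges from loop blocks.

```python
from collections import defaultdict


def critical_edges(edges):
    """Return edges whose source has several successors and whose
    destination has several predecessors."""
    succs, preds = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        succs[src].add(dst)
        preds[dst].add(src)
    return sorted(
        (src, dst)
        for src, dst in set(edges)
        if len(succs[src]) > 1 and len(preds[dst]) > 1
    )


# Hypothetical CFG: 'header' and 'latch' form the loop, 'exit' holds the PHI.
cfg = [
    ("entry", "header"),
    ("header", "latch"),
    ("header", "exit"),
    ("latch", "header"),
    ("latch", "exit"),
]

print(critical_edges(cfg))
```

Both edges into `exit` are reported as critical. Splitting one of them inserts a new block on that edge, and that new block rather than `exit` then becomes the loop exit along that path, which is the situation the commit message describes.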
Dmitry Makogon
8e85bede79 [Test] Regenerate test checks for some LSR tests (NFC) 2023-03-24 21:24:22 +07:00
Dmitry Makogon
2ac5bf2272 [Test] Add test to check that LCSSA is preserved by LSR (NFC)
Currently it fails as LSR doesn't preserve LCSSA in some cases.
2023-03-24 21:24:21 +07:00