926 Commits

Author SHA1 Message Date
Matt Arsenault
e1ac984a10 ValueTracking: Implement computeKnownFPClass for ldexp
https://reviews.llvm.org/D149590
2023-07-11 09:26:41 -04:00
Juan Manuel MARTINEZ CAAMAÑO
dd1df099ae [InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (2/2)
Before this patch, the compiler gave a bump to the inline-threshold
when the total size of the allocas passed as arguments to the
callee was below 256 bytes.
This heuristic ignores that some of these allocas could have been removed
by SROA if inlining was applied.

Ideally, this bonus would be attributed to the threshold once the
size of all the allocas that could not be handled by SROA is known:
at the end of the InlineCost analysis.
However, we may never reach this point if the inline-cost analysis exits
early when the inline cost goes over the threshold mid-analysis.

This patch proposes:
* Attribute the bonus in the inline-threshold when allocas are passed
  as arguments (regardless of their total size).
* Assign a cost to each alloca proportional to its size,
  such that the cost of all the allocas cancels the bonus.
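
The accounting proposed above can be sketched as a toy model. This is not the InlineCost implementation: `ALLOCA_BONUS`, its value, and both function names are hypothetical. The only property illustrated is that per-alloca costs proportional to size sum to the bonus, so the bonus survives only for allocas that SROA removes.

```python
# Toy model of the proposed accounting; not LLVM's InlineCost code.
# ALLOCA_BONUS and both function names are hypothetical.
from fractions import Fraction

ALLOCA_BONUS = 1000  # hypothetical threshold bonus, arbitrary units


def alloca_costs(sizes_in_bytes, bonus=ALLOCA_BONUS):
    """Give each alloca a cost proportional to its size; costs sum to `bonus`."""
    total = sum(sizes_in_bytes)
    if total == 0:
        return [Fraction(0) for _ in sizes_in_bytes]
    return [Fraction(size * bonus, total) for size in sizes_in_bytes]


def net_bonus(sizes_in_bytes, removed_by_sroa, bonus=ALLOCA_BONUS):
    """Bonus left after charging every alloca that SROA could not remove."""
    costs = alloca_costs(sizes_in_bytes, bonus)
    charged = sum(c for c, removed in zip(costs, removed_by_sroa) if not removed)
    return bonus - charged
```

With allocas of 64 and 192 bytes, `net_bonus([64, 192], [False, False])` is zero, while removing both allocas keeps the full bonus.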

Potential problems:
* This patch assumes that removing alloca instructions with SROA is
  always profitable. This may not be the case if the total size of the
  allocas is still too big to be promoted to registers/LDS.
* Redundant calls to getTotalAllocaSize
* Awkwardly, the threshold attributed contributes to the single-bb and
  vector bonus.

Reviewed By: scchan

Differential Revision: https://reviews.llvm.org/D149741
2023-06-29 09:49:16 +02:00
Arthur Eubanks
ff4fcbb5f4 [test] Add test for null_pointer_is_valid and Inliner instsimplify interaction
As requested in D151254

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153435
2023-06-21 14:00:53 -07:00
Nikita Popov
650041a7f1 [Inline] Convert tests to opaque pointers (NFC) 2023-06-21 11:32:45 +02:00
Nikita Popov
4c51f0dee5 [Inline] Regenerate test checks (NFC) 2023-06-21 11:32:45 +02:00
Arthur Eubanks
f4f826bcd4 Revert "Revert "ValueTracking: Fix nan result handling for fmul""
This reverts commit 464dcab8a6.

Going to fix forward the size regression instead, since reverting would require reverting more dependent patches.
2023-06-16 13:53:32 -07:00
Arthur Eubanks
3e39cfe5b4 Revert "Revert "InstSimplify: Require instruction be parented""
This reverts commit 0c03f48480.

Going to fix forward the size regression instead, since reverting would require reverting more dependent patches.
2023-06-16 13:53:31 -07:00
Arthur Eubanks
0c03f48480 Revert "InstSimplify: Require instruction be parented"
This reverts commit 1536e299e6.

Causes large binary size regressions, see comments on https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b.
2023-06-16 11:24:29 -07:00
Arthur Eubanks
464dcab8a6 Revert "ValueTracking: Fix nan result handling for fmul"
This reverts commit a632ca4b00.

Dependent commit to be reverted
2023-06-16 11:24:28 -07:00
Alan Zhao
d6b4f6786b Revert "Revert "InstSimplify: Require instruction be parented""
This reverts commit 00264eac4d.

Reason: caused a bunch of bots to break
2023-06-16 10:58:54 -07:00
Alan Zhao
00264eac4d Revert "InstSimplify: Require instruction be parented"
This reverts commit 1536e299e6.

Reason: causes a regression in the inliner (see https://crbug.com/1454531 and https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b#1217141)
2023-06-16 10:36:49 -07:00
Matt Arsenault
a632ca4b00 ValueTracking: Fix nan result handling for fmul
This was mishandling maybe 0 * inf.

Fixes issue #63316
2023-06-15 09:35:12 -04:00
Matt Arsenault
19293b82c1 Inline: Fix case of not inlining with denormal-fp-math-f32
This was failing to inline the opencl libraries with daz enabled. As a
modifier to the base mode, denormal-fp-math-f32 is weird and has no
meaning if it's missing.
2023-06-09 19:09:48 -04:00
Matt Arsenault
d0b9cb1f65 AMDGPU: Add inlining testcases for denormal-fp-math
Somehow missed this one and it's not working correctly
2023-06-09 19:09:48 -04:00
Kazu Hirata
d6f994acb3 [InlineCost] Check for conflicting target attributes early
When we inline a callee into a caller, the compiler needs to make sure
that the caller supports a superset of instruction sets that the
callee is allowed to use.  Normally, we check for the compatibility of
target features via functionsHaveCompatibleAttributes, but that
happens after we decide to honor call site attribute
Attribute::AlwaysInline.  If the caller contains a call marked with
Attribute::AlwaysInline, which can happen with
__attribute__((flatten)) placed on the caller, the caller could end up
with code that cannot be lowered to assembly code.

This patch fixes the problem by checking the target feature
compatibility before we honor Attribute::AlwaysInline.
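
The reordering described above can be sketched as a toy decision function. The function names and the plain set-based superset test are assumptions made for illustration; the real compatibility check goes through functionsHaveCompatibleAttributes and may be more involved.

```python
# Toy model of the ordering fix; not LLVM's actual API.
def features_compatible(caller_features, callee_features):
    """The caller must support a superset of the callee's instruction sets."""
    return set(callee_features) <= set(caller_features)


def should_inline(caller_features, callee_features,
                  call_is_always_inline, heuristic_says_inline):
    # Check target-feature compatibility first, before honoring always-inline.
    if not features_compatible(caller_features, callee_features):
        return False
    if call_is_always_inline:
        return True
    return heuristic_says_inline
```

In this model an always-inline call site is still rejected when the callee uses a feature the caller lacks, for example `should_inline({"sse2"}, {"sse2", "avx2"}, True, False)`.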

Fixes https://github.com/llvm/llvm-project/issues/62664

Differential Revision: https://reviews.llvm.org/D150396
2023-06-02 16:00:47 -07:00
Matt Arsenault
1536e299e6 InstSimplify: Require instruction be parented
Unlike every other analysis and transform, simplifyInstruction
permitted operating on instructions which are not inserted
into a function. This created an edge case no other code needs
to really worry about, and limited transforms in cases that
can make use of the context function. Only the inliner and a handful
of other utilities were making use of this, so just fix up these
edge cases. Results in some IR ordering differences since
cloned blocks are inserted eagerly now. Plus some additional
simplifications trigger (e.g. some add 0s now folded out that
previously didn't).
2023-06-02 18:14:28 -04:00
Arthur Eubanks
aceaea6784 [Inliner] Mark inlinings stopped with inlining history as noinline
The inline history makes sure that we don't keep inlining due to mutual devirtualization. But this gets forgotten between inliner invocations.

So mark the inlined calls as noinline so we respect previous inline history decisions.

This overlaps with D121084, but they're not redundant since we may not inline completely through a child SCC, but we still want a cost multiplier when that happens.

See discussions in D145516.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D150989
2023-05-25 09:55:53 -07:00
Denis Antrushin
291223409c [InlineCost] Consider branches with !make.implicit metadata as free.
!make.implicit metadata attached to branch means it will very likely
be eliminated (together with associated cmp instruction).

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D149747
2023-05-25 18:43:16 +03:00
Matt Arsenault
ca6aa47585 Inline: Convert test to generated checks 2023-05-24 15:40:56 +01:00
Matt Arsenault
abf1abbfbe Inline: Convert test to generated checks 2023-05-24 08:49:04 +01:00
Arthur Eubanks
94063cac47 [test] Make mut-rec-scc.ll a bit more robust
By adding noinline

Also make the SCC have 3 functions to prevent test changes with an upcoming change.
2023-05-19 12:25:44 -07:00
Matt Arsenault
4130ccc8be ValueTracking: Check context instruction is in a function 2023-05-18 14:40:13 +01:00
Matt Arsenault
f42136d4d6 ValueTracking: Check instruction is in a parent in computeKnownFPClass
For some reason the inliner calls simplifyInstruction with disembodied
instructions. I consider this to be an API defect. Either the instruction
should always be inserted prior to simplification, or we at least
should pass in the new function for the context.
2023-05-18 12:21:47 +01:00
Tobias Hieta
f84bac329b [NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4e
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Tobias Hieta
b71edfaa4e [NFC][Py Reformat] Reformat python files in llvm
This is the first commit in a series that will reformat
all the python files in the LLVM repository.

Reformatting is done with `black`.

See more information here:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: jhenderson, JDevlieghere, MatzeB

Differential Revision: https://reviews.llvm.org/D150545
2023-05-17 10:48:52 +02:00
Krzysztof Drewniak
f0415f2a45 Re-land "[AMDGPU] Define data layout entries for buffers"
Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0.

Differential Revision: https://reviews.llvm.org/D149776
2023-05-03 19:43:56 +00:00
Krzysztof Drewniak
3f2fbe92d0 Revert "[AMDGPU] Define data layout entries for buffers"
This reverts commit f9c1ede254.

Differential Revision: https://reviews.llvm.org/D149758
2023-05-03 16:11:00 +00:00
Krzysztof Drewniak
f9c1ede254 [AMDGPU] Define data layout entries for buffers
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. These ptr
addrspace(8) pointers are 128 bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, addr_max] value from applying to them.
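
The sizes described above map onto pointer entries in a data layout string, whose syntax per the LangRef is `p[n]:<size>:<abi>[:<pref>][:<idx>]`. The sketch below parses such entries. The fragment `p7:160:256:256:32-p8:128:128` is written out here from the description in this message and should be treated as an assumption; `PointerSpec` and `parse_pointer_specs` are hypothetical helper names.

```python
# Sketch: parse the pointer entries of an LLVM data layout string.
# Entry syntax per LangRef: p[n]:<size>:<abi>[:<pref>][:<idx>]
from typing import NamedTuple


class PointerSpec(NamedTuple):
    addrspace: int
    size_bits: int
    abi_align_bits: int
    pref_align_bits: int
    index_bits: int


def parse_pointer_specs(layout):
    specs = {}
    for entry in layout.split("-"):
        if not entry.startswith("p"):
            continue  # not a pointer entry
        head, *fields = entry.split(":")
        if len(fields) < 2:
            continue  # malformed entry, ignored in this sketch
        addrspace = int(head[1:]) if len(head) > 1 else 0
        size, abi = int(fields[0]), int(fields[1])
        pref = int(fields[2]) if len(fields) > 2 else abi  # defaults to ABI alignment
        idx = int(fields[3]) if len(fields) > 3 else size  # defaults to pointer size
        specs[addrspace] = PointerSpec(addrspace, size, abi, pref, idx)
    return specs


# Assumed fragment, derived from the sizes stated in the message above.
specs = parse_pointer_specs("p7:160:256:256:32-p8:128:128")
```

Here `specs[7]` reports a 160-bit pointer with 256-bit alignment and a 32-bit index, and `specs[8]` a 128-bit pointer with 128-bit alignment.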

Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D145441
2023-05-03 15:25:58 +00:00
Matt Arsenault
bc37be1855 LangRef: Add "dynamic" option to "denormal-fp-math"
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.

There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flush-to-zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.

Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.

Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.

I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.

If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
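
The mismatch described above can be reproduced with a toy model on Python floats. The helper names are hypothetical, and "preserve-sign" is modeled simply as flushing a denormal input to a signed zero before the compare; this is a sketch of the reasoning, not of LLVM's implementation.

```python
# Toy model of the two checks; the helper names are hypothetical.
import math
import sys

MIN_NORMAL = sys.float_info.min  # smallest positive normal double


def is_subnormal(x):
    return x != 0.0 and abs(x) < MIN_NORMAL


def is_fpclass_zero(x):
    """Model llvm.is.fpclass(x, fcZero): a class test, no flushing involved."""
    return x == 0.0


def fcmp_oeq_zero(x, denormal_mode):
    """Model `fcmp oeq x, 0` with inputs handled per `denormal_mode`."""
    if denormal_mode == "preserve-sign" and is_subnormal(x):
        x = math.copysign(0.0, x)  # denormal input flushed to signed zero
    return x == 0.0


denormal = 5e-324  # smallest positive subnormal double
```

For `denormal`, the class test and the "ieee" compare both give false, but the "preserve-sign" compare gives true, which is why the fold is only valid when the mode is known.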

This could also be of use for strictfp handling, where code may be
changing the denormal mode.

Alternative name could be "unknown".

Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
2023-04-29 08:44:59 -04:00
Nikita Popov
e7f4ad13ae [Transforms] Convert some tests to opaque pointers (NFC) 2023-04-11 16:49:12 +02:00
Mircea Trofin
ab2e7666c2 [mlgo][inl] Interactive mode: optionally tell the default decision
This helps training algorithms that may want to sometimes replicate the
default decision. The default decision is presented as an extra feature
called `inlining_default`. It's not normally exported to save
computation time.

This is only available in interactive mode.

Differential Revision: https://reviews.llvm.org/D147794
2023-04-10 12:20:09 -07:00
Dávid Bolvanský
e1f94336e9 Revert "[InlineCost] isKnownNonNullInCallee - handle also dereferenceable attribute"
This reverts commit 3b5ff3a67c.
2023-04-06 16:54:26 +02:00
Dávid Bolvanský
d5fe5604a6 Revert "xxx"
This reverts commit f60592438a.
2023-04-06 16:54:00 +02:00
Dávid Bolvanský
f60592438a xxx 2023-04-06 16:51:31 +02:00
Dávid Bolvanský
3b5ff3a67c [InlineCost] isKnownNonNullInCallee - handle also dereferenceable attribute 2023-04-06 16:51:28 +02:00
Dávid Bolvanský
06ddb7bfe2 [Inliner] Added test with nonnull callsite attribute 2023-04-06 11:06:49 +02:00
Dávid Bolvanský
ca42cd3e12 [Tests] More InlineCost tests with attributes only on callsites 2023-04-06 00:50:17 +02:00
Dávid Bolvanský
df8f12f38e [Tests] Added InlineCost test when arg is known as dereferenceable 2023-04-05 23:58:37 +02:00
Yuanfang Chen
e7a2da5298 [Inliner] Assign dummy debug location to the memcpy for byval argument
A similar fix to D133095.

Fixes https://github.com/llvm/llvm-project/issues/58770.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D145607
2023-03-15 10:30:28 -07:00
Nikita Popov
9ca2c309ab [InstSimplify] Fix poison safety in insertvalue fold
We can only fold insertvalue undef, (extractvalue x, n) to x
if x is not poison, otherwise we might be replacing undef with
poison (https://alive2.llvm.org/ce/z/fnw3c8). The insertvalue
poison case is always fine.
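
The argument above can be checked with a small refinement model. The `UNDEF` and `POISON` sentinels, the tuple representation of aggregates, and all helper names are hypothetical; the model only encodes that poison may be replaced by anything, while undef may be replaced by any non-poison value.

```python
# Toy refinement model; sentinels and helper names are hypothetical.
UNDEF = "undef"
POISON = "poison"


def extractvalue(agg, n):
    return agg[n]


def insertvalue(agg, val, n):
    return agg[:n] + (val,) + agg[n + 1:]


def element_refines(src, tgt):
    """May `tgt` replace `src`? Poison may become anything, undef may become
    any non-poison value, and a concrete value must stay the same."""
    if src == POISON:
        return True
    if src == UNDEF:
        return tgt != POISON
    return src == tgt


def refines(src_agg, tgt_agg):
    return all(element_refines(s, t) for s, t in zip(src_agg, tgt_agg))


def fold_is_safe(base, x, n):
    """Check replacing `insertvalue base, (extractvalue x, n), n` with `x`."""
    original = insertvalue(base, extractvalue(x, n), n)
    return refines(original, x)
```

With an undef base, `x = (1, POISON)` makes the fold unsafe because an undef field would become poison; with a poison base the same fold is fine.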

I didn't go to particularly large effort to preserve cases where
folding with undef is still legal (mainly when there is a chain of
multiple inserts that end up covering the whole aggregate),
because this shouldn't really occur in practice: We should always
be generating the insertvalue poison form when constructing
aggregates nowadays.

Differential Revision: https://reviews.llvm.org/D144106
2023-02-16 09:39:44 +01:00
Janek van Oirschot
e3515ba381 Reapply "[AMDGPU] Modify adjustInliningThreshold to also consider the cost of passing function arguments through the stack"
Reapplies 142c28ffa1 as part of D140242 which got reverted due to amdgpu openmp test failures.

This diff fixes said failures by eliding most of `adjustInliningThresholdUsingCallee` for indirect calls as the callee function is unavailable for indirect calls.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D143498
2023-02-13 12:17:43 +00:00
Mircea Trofin
b87e53ee2a Revert "[mlgo] Fix test after D143624"
This reverts commit dc4c3cfd78.

Reverting because D143624 has been reverted.
2023-02-10 07:46:47 -08:00
David Green
86bfeb906e Revert "Inlining: Run the legacy AlwaysInliner before the regular inliner."
This seems to cause large regressions in existing code, as much as 75% slower
(4x the time taken). Small always inline functions seem to be used a lot in the
cmsis-dsp library.

I would add a phase ordering test to show the problems, but one already exists!
The llvm/test/Transforms/PhaseOrdering/ARM/arm_mult_q15.ll was just changed by
removing alwaysinline to hide the problems that existed.

This reverts commit cae033dcf2.
This reverts commit 8e33c41e72.
2023-02-10 15:01:49 +00:00
Amara Emerson
8e33c41e72 Inliner: Address missed review comments for D143624 2023-02-09 21:56:40 -08:00
Mircea Trofin
dc4c3cfd78 [mlgo] Fix test after D143624 2023-02-09 21:14:52 -08:00
Amara Emerson
cae033dcf2 Inlining: Run the legacy AlwaysInliner before the regular inliner.
We have several situations where it's beneficial for code size to ensure that every
call to always-inline functions is inlined before normal inlining decisions are
made. While the normal inliner runs in a "MandatoryOnly" mode to try to do this,
it only does it on a per-SCC basis, rather than the whole module. Ensuring that
all mandatory inlinings are done before any heuristic based decisions are made
just makes sense.

Despite being referred to as the "legacy" AlwaysInliner pass, it's already necessary
for -O0 because the CGSCC inliner is too expensive in compile time to run at -O0.

This also fixes an exponential compile time blow up in
https://github.com/llvm/llvm-project/issues/59126

Differential Revision: https://reviews.llvm.org/D143624
2023-02-09 16:49:29 -08:00
Mircea Trofin
062380c86f [mlgo] Bump the unsupported versions for interactive tests to 3.8
e006c7dfa7 already covered the regalloc one.
2023-02-04 12:15:48 -08:00
Mircea Trofin
445ea1e777 [mlgo] only enable interactive mode tests on linux
`os.mkfifo` may not be supported everywhere (e.g. windows).
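
A minimal sketch of how such a platform gate could look in a test helper. The function name is hypothetical, and the actual change may key on the host OS rather than on the presence of the function.

```python
import os


def supports_interactive_tests():
    """Hypothetical gate: named pipes require os.mkfifo, absent on Windows."""
    return hasattr(os, "mkfifo")
```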
2023-02-03 19:57:26 -08:00
Mircea Trofin
79f7a5e02b [mlgo] Disable mlgo tests when python version is 3.6
Supporting 3.6 requires a bit too much of a change in the mlgo test python scripts.
2023-02-03 19:45:22 -08:00
Mircea Trofin
5fd51fcba6 Reland "[mlgo] Hook up the interactive runner to the mlgo-ed passes"
This reverts commit a772f0bb92.

The main problem was related to how we handled `dbgs()` from the hosted
compiler. Using explicit `subprocess.communicate`, and not relying on
dbgs() being flushed until the end appears to address the problem.
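
A minimal sketch of the `communicate` pattern mentioned above. The child process is a stand-in Python one-liner rather than the hosted compiler, and the variable names are assumptions; the point is only that `communicate()` drains both pipes until EOF and waits for exit, so the parent does not depend on when the child flushes its debug stream.

```python
import subprocess
import sys

# Stand-in child process; the real test would launch the hosted compiler.
child_code = "import sys; sys.stderr.write('debug line\\n'); print('result')"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)
# communicate() reads stdout and stderr to EOF and then waits for the child,
# which avoids a deadlock when either pipe fills up.
out, err = proc.communicate()
```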

Also some fixes due to some bots running older pythons, so we can't have
nice things like `int | float` and such.
2023-02-03 17:54:42 -08:00