Commit Graph

13163 Commits

Author SHA1 Message Date
Mingming Liu
dda73336ad [ThinLTO]Record import type in GlobalValueSummary::GVFlags (#87597)
The motivating use case is to support importing function declarations
across modules to construct call graph edges for indirect calls [1]
when importing the function definition costs too much compile time
(e.g., the function is too large and has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function
definition but its postlink combined summary contains the function
summary or a global alias summary with this function as aliasee, the
function definition will be imported from source module by IRMover. The
implementation is in FunctionImporter::importFunctions [2]
2. In order for FunctionImporter to import a declaration of a function,
both function summary and alias summary need to carry the def / decl
state. Specifically, all existing summary fields don't differ across
import modules, but the def / decl state is decided by
`<ImportModule, Function>`.

This change encodes the def/decl state in `GlobalValueSummary::GVFlags`.

In the subsequent changes
1. The indexing step `computeImportForModule` [3]
will compute the set of definitions and the set of declarations for each
module, and pass on the information to bitcode writer.
2. Bitcode writer will look up the def/decl state and set the state
when it writes out the flag value. This is demonstrated in
https://github.com/llvm/llvm-project/pull/87600
3. Function importer will read the def/decl state when reading the
combined summary to figure out two sets of global values, and IRMover
will be updated to import the declaration (aka linkGlobalValuePrototype [4])
into the destination module.

- The next change is https://github.com/llvm/llvm-project/pull/87600

[1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)
[3] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)
[4] 3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)
2024-04-10 19:46:01 -07:00
Noah Goldstein
81cdd35c0c [ValueTracking] Add support for xor/disjoint or in isKnownNonZero
Handles cases like `X ^ Y == X` / `X disjoint| Y == X`.

Both of these cases have identical logic to the existing `add` case,
so just converting the `add` code to a more general helper.

Proofs: https://alive2.llvm.org/ce/z/Htm7pe

Closes #87706
2024-04-10 13:13:43 -05:00
Noah Goldstein
0c57a2e4b4 [ValueTracking] Add support for xor/disjoint or in getInvertibleOperands
This strengthens our `isKnownNonEqual` logic with some fairly
trivial cases.

Proofs: https://alive2.llvm.org/ce/z/4pxRTj

Closes #87705
2024-04-10 13:13:43 -05:00
Noah Goldstein
9c545a14c0 [ValueTracking] Add support for insertelement in isKnownNonZero
Inserts don't modify the data, so if all elements that end up in the
destination are non-zero the result is non-zero.

Closes #87703
2024-04-10 13:13:43 -05:00
Noah Goldstein
87528bfefb [ValueTracking] Add support for shufflevector in isKnownNonZero
Shuffles don't modify the data, so if all elements that end up in the
destination are non-zero the result is non-zero.

Closes #87702
2024-04-10 13:13:42 -05:00
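
The reasoning in the shufflevector commit above (87528bfefb) can be sketched as follows. This is an illustration only, not LLVM's implementation: the function name and the list-of-flags representation are hypothetical, and treating an undefined mask lane as "not known non-zero" is an assumption made for this sketch.

```python
def shuffle_known_nonzero(mask, lhs_nonzero, rhs_nonzero):
    """Return True if every lane selected by `mask` is known non-zero.

    mask        -- lane indices into the concatenation lhs ++ rhs;
                   None models an undefined lane (handled conservatively).
    lhs_nonzero -- per-lane flags, True if that lhs lane is known non-zero.
    rhs_nonzero -- per-lane flags for rhs, same length as lhs_nonzero.
    """
    n = len(lhs_nonzero)
    for idx in mask:
        if idx is None:
            return False  # assumption of this sketch: give up on undefined lanes
        if idx < n:
            lane_ok = lhs_nonzero[idx]
        else:
            lane_ok = rhs_nonzero[idx - n]
        if not lane_ok:
            return False
    return True
```

Only the lanes that actually reach the destination matter: a source lane that may be zero is harmless as long as the mask never selects it.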
Noah Goldstein
f1ee458ddb [ValueTracking] improve isKnownNonZero precision for smax
Instead of relying on known-bits for strictly positive, use the
`isKnownPositive` API. This will use `isKnownNonZero` which is more
accurate.

Closes #88170
2024-04-10 10:40:49 -05:00
Noah Goldstein
37ca6fa1e2 [ValueTracking] Add support for overflow detection functions in isKnownNonZero
Adds support for: `{s,u}{add,sub,mul}.with.overflow`

The logic is identical to the non-overflow binops; we were just
missing the cases.

Closes #87701
2024-04-10 10:40:48 -05:00
Noah Goldstein
f0a487d7e2 [ValueTracking] Split isNonZero(mul) logic to a helper; NFC
2024-04-10 10:40:48 -05:00
Noah Goldstein
41c52217b0 [ValueTracking] Add support for vector_reduce_{s,u}{min,max} in computeKnownBits
Previously missing. We compute by just applying the reduce function on
the knownbits of each element.

Closes #88169
2024-04-10 10:40:48 -05:00
Noah Goldstein
77d668451a [ValueTracking] Add support for vector_reduce_{s,u}{min,max} in isKnownNonZero
Previously missing, proofs for all implementations:
https://alive2.llvm.org/ce/z/G8wpmG
2024-04-10 10:40:48 -05:00
annamthomas
54a9f0007c [SCEV] Fix BinomialCoefficient Iteration to fit in W bits (#88010)
BinomialCoefficient computes the value of a W-bit IV at iteration It of a loop. When W is 1, we can call multiplicative inverse on 0, which triggers an assert since 1b76120.

Since the arithmetic is supposed to wrap if It or K does not fit in W bits, do the truncation into W bits after we do the shift.

Fixes #87798
2024-04-10 09:02:23 -04:00
Florian Hahn
cac4c14ecf [LAA] Replace std::tuple with struct (NFCI).
As suggested in https://github.com/llvm/llvm-project/pull/88039, replace
the tuple with a struct, to make it easier to extend.
2024-04-10 10:28:43 +01:00
Björn Pettersson
5d9d740c39 Remove the unused IntervalPartition analysis pass (#88133)
This removes the old legacy PM "intervals" analysis pass (aka
IntervalPartition). It also removes the associated Interval and
IntervalIterator helper classes.

Reasons for removal:
1) The pass is not used by llvm-project (not even being tested by
   any regression tests).
2) Pass has not been ported to new pass manager, which at least
   indicates that it isn't used by the middle-end.
3) ASan reports heap-use-after-free on
      ++I;  // After the first one...
   even if false is passed to intervals_begin. Not sure if that is
   a false positive, but it makes the code a bit less trustworthy.
2024-04-09 20:12:26 +02:00
David Green
4ac2721e51 [AArch64] Add costs for ST3 and ST4 instructions, modelled as store(shuffle). (#87934)
This tries to add some costs for the shuffle in a ST3/ST4 instruction,
which are represented in LLVM IR as store(interleaving shuffle). In
order to detect the store, it needs to add a CxtI context instruction to
check the users of the shuffle. LD3 and LD4 are added, LD2 should be a
zip1 shuffle, which will be added in another patch.

It should help fix some of the regressions from #87510.
2024-04-09 16:36:08 +01:00
Noah Goldstein
964df099e1 [ValueTracking] Support non-constant idx for computeKnownBits of insertelement
It's the same logic as before; we just need to intersect what we know about
the new Elt and the entire pre-existing Vec.

Closes #87707
2024-04-09 01:01:41 -05:00
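
A minimal sketch of the "intersect" step described in the insertelement commit above (964df099e1). The `KnownBits` class and both function names here are hypothetical stand-ins written for this illustration; they model known bits as two masks and are not LLVM's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownBits:
    zero: int  # a set bit means "this bit is known to be 0"
    one: int   # a set bit means "this bit is known to be 1"


def intersect(a, b):
    """Keep only the facts that hold for both a and b."""
    return KnownBits(zero=a.zero & b.zero, one=a.one & b.one)


def known_bits_after_insert(vec_known, elt_known):
    """Known bits of a vector lane after an insert at an unknown index.

    Any given lane either still holds its old value or now holds the new
    element, so only facts shared by both survive.
    """
    return intersect(vec_known, elt_known)
```

For example, if every existing lane has bit 0 set and bit 3 clear, and the new element has bits 0 and 1 set and bits 2 and 3 clear, the result still has bit 0 set and bit 3 clear, while bits 1 and 2 become unknown.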
Noah Goldstein
b65ab0b726 [ValueTracking] Add comment clarifying missing usub.sat in isKnownNonZero; NFC
Closes #87700
2024-04-08 23:33:06 -05:00
Matt Arsenault
bdf428af98 ValueTracking: Consider demanded elts for vector constants in computeKnownFPClass
2024-04-08 09:32:14 -04:00
Matt Arsenault
2bc637b1ce ValueTracking: Handle ConstantAggregateZero in computeKnownFPClass
2024-04-08 09:26:12 -04:00
Matt Arsenault
95f984f37e ValueTracking: Don't use unnecessary null checked dyn_cast
2024-04-08 08:32:04 -04:00
Noah Goldstein
e4db938a4e [ValueTracking] Support non-constant idx for computeKnownFPClass of insertelement
It's the same logic as before; we just need to intersect what we know about
the new Elt and the entire pre-existing Vec.

Closes #87708
2024-04-06 17:51:15 -05:00
Noah Goldstein
678f32ab66 [ValueTracking] Add more conditions to isTruePredicate
There is one notable "regression". This patch replaces the bespoke `or
disjoint` logic with a direct match. This means we fail some
simplifications during `instsimplify`.
All the cases we fail in `instsimplify` we do handle in `instcombine`
as we add `disjoint` flags.

Other than that, just some basic cases.

See proofs: https://alive2.llvm.org/ce/z/_-g7C8

Closes #86083
2024-04-04 12:42:58 -05:00
Noah Goldstein
05cff99a29 [ValueTracking] Infer known bits from (icmp eq (and/or x,y), C)
In `(icmp eq (and x,y), C)` all 1s in `C` must also be set in both
`x`/`y`.

In `(icmp eq (or x,y), C)` all 0s in `C` must also be clear in both
`x`/`y`.

Closes #87143
2024-04-04 12:42:58 -05:00
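
The two bit-level facts in the commit above (05cff99a29) can be checked exhaustively for small widths. The sketch below is a brute-force sanity check, not the ValueTracking code; all function names are hypothetical.

```python
def known_ones_from_and_eq(c):
    """Bits that must be 1 in both x and y when (x & y) == c."""
    return c


def known_zeros_from_or_eq(c, width):
    """Bits that must be 0 in both x and y when (x | y) == c."""
    return ~c & ((1 << width) - 1)


def check_exhaustively(width):
    """Verify both inferences for every pair of `width`-bit values."""
    for x in range(1 << width):
        for y in range(1 << width):
            ones = known_ones_from_and_eq(x & y)
            if (x & ones) != ones or (y & ones) != ones:
                return False
            zeros = known_zeros_from_or_eq(x | y, width)
            if (x & zeros) != 0 or (y & zeros) != 0:
                return False
    return True
```

Calling `check_exhaustively(4)` walks all 256 pairs of 4-bit values and confirms that neither inference is ever violated.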
Jay Foad
1b761205f2 [APInt] Add a simpler overload of multiplicativeInverse (#87610)
The current APInt::multiplicativeInverse takes a modulus which can be
any value, but all in-tree callers use a power of two. Moreover, most
callers want to use two to the power of the width of an existing APInt,
which is awkward because 2^N is not representable as an N-bit APInt.

Add a new overload of multiplicativeInverse which implicitly uses
2^BitWidth as the modulus.
2024-04-04 16:11:06 +01:00
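
To illustrate what an inverse modulo 2^BitWidth means, as discussed in the APInt commit above (1b761205f2), here is a small sketch that uses Newton iteration. The function name is hypothetical and the choice of Newton iteration is an assumption made for this example; the commit message does not say how the overload is implemented.

```python
def multiplicative_inverse_pow2(a, bit_width):
    """Return x such that (a * x) % 2**bit_width == 1, for odd a."""
    if a % 2 == 0:
        raise ValueError("only odd values are invertible modulo a power of two")
    mask = (1 << bit_width) - 1
    # For odd a, a * a == 1 (mod 8), so a itself is an inverse correct to 3 bits.
    x = a & mask
    correct_bits = 3
    while correct_bits < bit_width:
        # Each Newton step doubles the number of correct low bits.
        x = (x * (2 - a * x)) & mask
        correct_bits *= 2
    return x
```

Note that the modulus itself never has to be stored in `bit_width` bits: `2**bit_width` truncated to `bit_width` bits is 0, which is the awkwardness the commit message describes for N-bit APInt values.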
Andreas Jonson
d4cd65ecf2 [LVI] Handle range attributes (#86413)
This adds handling of range attribute for return values of Call and
Invoke in getFromRangeMetadata and handling of argument with range
attribute in solveBlockValueNonLocal.
There is one additional check of the range metadata at line 1120 in
getValueFromSimpleICmpCondition that is not covered in this PR, as after
https://github.com/llvm/llvm-project/pull/75311 there is no test that
covers that check any more and I have not been able to create a test that
triggers that code.
2024-04-04 14:48:11 +08:00
Mingming Liu
1e15371dd8 [ThinLTO][TypeProf] Implement vtable def import (#79381)
Add annotated vtable GUID as referenced variables in per function
summary, and update bitcode writer to create value-ids for these
referenced vtables.

- This is part 3 of the type profiling work, and is described in the "Virtual Table Definition Import" [1] section of the RFC.

[1] https://github.com/llvm/llvm-project/pull/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
2024-04-01 15:14:49 -07:00
Vitaly Buka
37d6e5b7a5 [memoryssa] Exclude llvm.allow.{runtime,ubsan}.check() (#86066)
RFC:
https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641
2024-03-31 22:50:02 -07:00
Vitaly Buka
0bc3781649 [Analysis] Exclude llvm.allow.{runtime,ubsan}.check() from AliasSetTracker (#86065)
RFC:
https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641
2024-03-31 22:47:55 -07:00
Vitaly Buka
2bfb19e813 Revert "Make two texts static in ReplayInlineAdvisor" (#82071)
Reverts llvm/llvm-project#79489

We know that the issue was with asan/annotations. We can revert it.
2024-03-31 19:45:12 -07:00
Noah Goldstein
0e78655731 [LVI] Use m_AddLike instead of m_Add when matching simple condition
We have more complete logic for handling `Add`, so try to use that
logic for `or disjoint` (which can definitionally be treated as
`add`).

Closes #86058
2024-03-28 13:49:05 -05:00
Noah Goldstein
637421cb88 [ValueTracking] Tracking or disjoint conditions as add in Assumption/DomCondition Cache
We can definitionally treat `or disjoint` as `add` anywhere.

Closes #86302
2024-03-28 13:49:05 -05:00
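
The claim in the two commits above, that `or disjoint` can be treated as `add`, rests on a simple identity: when two values share no set bits, OR-ing them produces no carries, so the result equals their sum. A brute-force check, with a hypothetical function name:

```python
def or_disjoint_equals_add(width):
    """Check x | y == x + y (mod 2**width) whenever x and y share no set bits."""
    mask = (1 << width) - 1
    for x in range(1 << width):
        for y in range(1 << width):
            if (x & y) == 0 and (x | y) != ((x + y) & mask):
                return False
    return True
```

The `(x & y) == 0` test plays the role of the `disjoint` flag here; without it the identity fails (for instance `1 | 1` is 1 but `1 + 1` is 2).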
Xiangyang (Mark) Guo
1607e8212c [InlineCost] Disable cost-benefit when sample based PGO is used (#86626)
#66457 makes InlineCost use cost-benefit by default, which causes a
0.4-0.5% performance regression on multiple internal workloads. See the
discussion in https://github.com/llvm/llvm-project/pull/66457. This pull
request reverts it.

Co-authored-by: helloguo <helloguo@meta.com>
2024-03-28 10:11:57 -07:00
Alex MacLean
e318613418 [NFC][TLI] Move VecFuncs to statics to reduce stack usage (#86829)
`TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib` has a lot of
data in local stack arrays, which MSVC keeps on the stack even in
release builds. To reduce stack usage, the data arrays (which are
const), are moved outside the function as statics. This drops the method
stack usage to be negligible.
2024-03-27 16:49:59 -07:00
Xiangyang (Mark) Guo
d312788962 [InlineOrder] fix the calculation of Cost for CostBenefitPriority (#86630)
getCost() expects that isVariable() is true.
https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Analysis/InlineCost.h#L146

Co-authored-by: helloguo <helloguo@meta.com>
2024-03-26 15:59:31 -07:00
Yingwei Zheng
2f1f6b704d [LLVM] Use std::move for APInt. NFC. (#86257)
This patch adjusts argument passing for `APInt` to improve the
compile-time.
Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=ba3e326def3a6e5cd6d72ff5a49c74fba18de1df&stat=instructions:u
2024-03-23 14:58:25 +08:00
Andreas Jonson
e66cfebb04 [ValueTracking] Handle range attributes (#85143)
Handle the range attribute in ValueTracking.
2024-03-20 12:43:00 +01:00
Graham Hunter
36a3f8f647 [TTI][TLI][AArch64] Support scalable immediates with isLegalAddImmediate (#84173)
Adds a second parameter (default to 0) to isLegalAddImmediate, to
represent a scalable immediate.

Extends the AArch64 implementation to match immediates based on what addvl and inc[h|w|d] support.
2024-03-20 10:28:46 +00:00
Graham Hunter
cd768ec983 [AArch64] Support scalable offsets with isLegalAddressingMode (#83255)
Allows us to indicate that an addressing mode featuring a
vscale-relative immediate offset is supported.
2024-03-20 10:13:20 +00:00
Nikita Popov
eefef900c6 Revert "Enable exp10 libcall on linux (#68736)"
This reverts commit 9848fa4aa2.

Causes buildbot failures.
2024-03-20 11:08:52 +01:00
Nikita Popov
0f46e31cfb [IR] Change representation of getelementptr inrange (#84341)
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.

Currently, inrange is specified as follows:

```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```

The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:

```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```

This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:

```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```

The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.

This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
2024-03-20 10:59:45 +01:00
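
The offsets in the `inrange` commit above (0f46e31cfb) can be worked through numerically. This sketch assumes 8-byte pointers (consistent with the `i64 48` offset in the ptradd form) and reads `inrange(-16, 16)` as a half-open byte range relative to the GEP result; the helper names are hypothetical.

```python
PTR_SIZE = 8  # assumption: 64-bit pointers


def gep_offset(field_index, element_index, elems_per_field=4):
    """Byte offset into { [4 x ptr], [4 x ptr] } for indices (0, field, element)."""
    field_offset = field_index * elems_per_field * PTR_SIZE
    return field_offset + element_index * PTR_SIZE


def inrange_window(result_offset, lo, hi):
    """Accessible half-open byte range, expressed relative to the base pointer."""
    return (result_offset + lo, result_offset + hi)
```

With indices `i32 1, i64 2` the result points 48 bytes past `@vt`, and `inrange(-16, 16)` then covers bytes 32 to 64, which is exactly the second `[4 x ptr]` array. That matches the old form, where `inrange` was attached to the `i32 1` index.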
Yingwei Zheng
f0420c7bc6 [ValueTracking] Handle not in isImpliedCondition (#85397)
This patch handles `not` in `isImpliedCondition` to enable more folds in
some multi-use cases.
2024-03-20 16:16:42 +08:00
Krishna Narayanan
9848fa4aa2 Enable exp10 libcall on linux (#68736)
2024-03-20 13:03:49 +05:30
Jeremy Morse
b9d83eff25 [NFC][RemoveDIs] Use iterators for insertion at various call-sites (#84736)
These are the last remaining "trivial" changes to passes that use
Instruction pointers for insertion. All of this should be NFC, it's just
changing the spelling of how we identify a position.

In one or two locations, I'm also switching uses of getNextNode etc to
using std::next with iterators. This too should be NFC.

---------

Merged by: Stephen Tozer <stephen.tozer@sony.com>
2024-03-19 16:36:29 +00:00
Nikita Popov
2cc75aed09 [ValueTracking] Move MD_range handling to isKnownNonZeroFromOperator()
All the isKnownNonZero() handling for instructions should be inside
this function. This makes the structure more similar to
computeKnownBitsFromOperator() as well.

This may not be entirely NFC due to different depth handling.
2024-03-19 16:16:48 +01:00
Nikita Popov
d1e2305a6d [ValueTracking] Fix release build
Move the declaration of the Ty variable outside the NDEBUG guard
and make use of it in the remainder of the function.
2024-03-19 16:07:14 +01:00
Nikita Popov
6872a64652 [ValueTracking] Handle vector range metadata in isKnownNonZero()
Nowadays !range can be placed on instructions with vector of int
return value. Support this case in isKnownNonZero().
2024-03-19 15:50:13 +01:00
Matt Arsenault
1a6953a75d ValueTracking: Fix bug with fcmp false to nan constant
If we had a comparison to a literal nan with a false predicate,
we were incorrectly treating it as an unordered compare. This was
correct for fcmp true, but not fcmp false. I noticed this in the
review for e44d3b3e50 but misdiagnosed
the reason. Also change the test for the fcmp true case to be more
useful, but it wasn't wrong previously.
2024-03-19 14:52:45 +05:30
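
The distinction behind the fcmp fix above (1a6953a75d) can be shown with a tiny model of four predicates. This is an illustrative sketch of the predicate semantics only, with a hypothetical `fcmp` helper; it is not the ValueTracking code.

```python
import math


def fcmp(pred, a, b):
    """Evaluate a small subset of fcmp predicates on Python floats."""
    unordered = math.isnan(a) or math.isnan(b)
    if pred == "true":
        return True           # always true, regardless of operands
    if pred == "false":
        return False          # always false, regardless of operands
    if pred == "uno":
        return unordered      # true if either operand is NaN
    if pred == "ord":
        return not unordered  # true if neither operand is NaN
    raise ValueError(f"unsupported predicate: {pred}")
```

Against a literal NaN, `fcmp true` and `fcmp uno` always agree, which is why treating the former as an unordered compare was harmless. `fcmp false` gives the opposite answer, so the same treatment is wrong for it.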
Noah Goldstein
5265be11b1 [InstSimply] Simplify (fmul -x, +/-0) -> -/+0
We already handle the `+x` case, and noticed it was missing in the bug
affecting #82555

Proofs: https://alive2.llvm.org/ce/z/WUSvmV

Closes #85345
2024-03-18 15:11:55 -05:00
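
The sign rule behind the fmul fold above (5265be11b1) can be observed directly with IEEE-754 doubles. The sketch assumes `-x` in the commit title means a value known to be negative, and it restricts itself to finite inputs, since an infinity times zero is NaN; both restrictions are assumptions of this example, and the helper names are hypothetical.

```python
import math


def sign_of_zero(z):
    """Return "-0" or "+0" for a floating-point zero."""
    return "-0" if math.copysign(1.0, z) < 0 else "+0"


def fmul_negative_by_zero(x, zero):
    """Multiply a finite negative x by a signed zero."""
    assert x < 0 and math.isfinite(x) and zero == 0.0
    return x * zero
```

A negative value flips the sign of the zero: multiplying by `+0.0` yields `-0.0`, and multiplying by `-0.0` yields `+0.0`.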
Noah Goldstein
01d8e1ca01 [ValueTracking] Handle non-canonical operand order in isImpliedCondICmps
We don't always have canonical order here, so do it manually.

Closes #85575
2024-03-17 17:46:06 -05:00
Artem Tyurin
141145232f [IRBuilder] Fold binary intrinsics (#80743)
Fixes https://github.com/llvm/llvm-project/issues/61240.
2024-03-15 09:58:25 +01:00
Paschalis Mpeis
f795d1a8b1 [AArch64][LV][SLP] Vectorizers use call cost for vectorized frem (#82488)
getArithmeticInstrCost is used by both LoopVectorizer and SLPVectorizer
to compute the cost of frem, which becomes a call cost on AArch64 when
TLI has a vector library function.

Add tests that do SLP vectorization for code that contains 2x double and
4x float frem instructions.
2024-03-14 17:20:29 +00:00