Commit Graph

24394 Commits

Author SHA1 Message Date
spupyrev
61eb12e1f4 [BOLT] introducing profi params
We want to use profile inference (**profi**) in BOLT for stale profile matching.
To this end, I am making a few changes modifying the interface of the algorithm.
This is the first change for existing usages of profi (e.g., CSSPGO):
- introducing an object holding the algorithmic parameters;
- some renaming of existing options;
- dropped unused option, SampleProfileInferEntryCount, as we don't plan to change its default value;
- no changes in the output / tests.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D134756
2023-01-09 12:03:28 -08:00
Sanjay Patel
0eedc9e567 [InstCombine] bitrev (zext i1 X) --> select X, SMinC, 0
https://alive2.llvm.org/ce/z/ZXCtgi

This breaks the infinite combine loop for issue #59897,
but we may still need more changes to avoid those loops.
2023-01-09 12:27:37 -05:00
Sanjay Patel
b1e6947618 [InstCombine] add tests for bitreverse of i1; NFC 2023-01-09 12:27:37 -05:00
Nikita Popov
fd07583ca4 [ConstantRange] Fix single bit abs range (PR59887)
For a full range input, we would produce an empty range instead
of a full range. The change to the SMin.isNonNegative() branch is
an optimality fix, because we should account for the potentially
discarded SMin value in the IntMinIsPoison case.

Change TestUnaryOpExhaustive to test both 4 and 1 bits, to both
cover this specific case in unit tests, and make sure all other
unary operations deal with 1-bit inputs correctly.

Fixes https://github.com/llvm/llvm-project/issues/59887.
2023-01-09 16:34:09 +01:00
Sanjay Patel
2dcbd740ee [InstCombine] reduce smul.ov with i1 types to 'and'
https://alive2.llvm.org/ce/z/5tLkW6

There's still a miscompile bug as shown in issue #59876 / D141214 .
2023-01-09 10:27:15 -05:00
Sanjay Patel
fc9d54a4a1 [InstCombine] add tests for smul/umul with overflow with i1 types; NFC
More coverage for D141214 / issue #59876
2023-01-09 10:27:15 -05:00
Nikita Popov
b50961bded [CVP] Add test for PR59887 (NFC)
Also fix all the incorrect intrinsic name mangling while here.
2023-01-09 16:16:23 +01:00
Nikita Popov
59f91ddf90 [InstCombine] Preserve alignment in atomicrmw -> store fold
Preserve the alignment of the original atomicrmw, rather than using
the ABI alignment.

The same problem exists for loads, but that code is being removed
in D141277 anyway.
2023-01-09 15:37:24 +01:00
Nikita Popov
22cafc7381 [InstCombine] Test alignment in atomicrmw -> store transform (NFC)
And regenerate test checks. The current alignment is incorrect.
2023-01-09 15:26:01 +01:00
Jamie Hill-Daniel
6b9317f52a [InstCombine] Fold zero check followed by decrement to usub.sat
Fold (a == 0) : 0 ? a - 1 into usub.sat(a, 1).

Differential Revision: https://reviews.llvm.org/D140798
2023-01-09 14:22:25 +01:00
Jamie Hill-Daniel
8f4795ef13 [InstCombine] Add tests for saturating subtract by one (NFC)
Tests for D140798.
2023-01-09 14:10:28 +01:00
Max Kazantsev
3602d852a5 [Test] One more test where check is not replaced to invariant
Irrelevant constant check makes things even more difficult, surprisingly.
2023-01-09 19:26:48 +07:00
Noah Goldstein
6d839621da [InstCombine] Canonicalize (A & B_Pow2) eq/ne B_Pow2 patterns
1. A & B_Pow2 != B_Pow2 -> A & B_Pow2 == 0
   https://alive2.llvm.org/ce/z/KVUej4

2. A & B_Pow2 == B_Pow2 -> A & B_Pow2 != 0
   https://alive2.llvm.org/ce/z/PVv9FR

This allows the patterns to more easily be analyzed elsewhere.

Differential Revision: https://reviews.llvm.org/D141090
2023-01-09 12:48:28 +01:00
Ben Mudd
1f11d1bd12 [DebugInfo] Fix jump threading failing to update cloned dbg.values
This is a patch to fix duplicated dbg.values in the JumpThreading pass not
pointing towards their local value, and instead towards the variable in the
original block.
JumpThreadingPass::cloneInstructions is the changed function to target metadata
as well as normal cloned values.

Reviewed By: jmorse, StephenTozer

Differential Revision: https://reviews.llvm.org/D140006
2023-01-09 11:42:33 +00:00
Max Kazantsev
957952dbf2 [JumpThreading] Preserve profile metadata during select unfolding
Jump threading can replace select and unconditional branch with
conditional branch, but when doing so loses profile information.

This destructive transform can eventually lead to a performance
degradation due to folding of branches in
shouldFoldCondBranchesToCommonDestination as branch probabilities
are no longer known.

Patch by Roman Paukner!

Differential Revision: https://reviews.llvm.org/D138132
Reviewed By: mkazantsev
2023-01-09 16:14:58 +07:00
chenglin.bi
33794cffcf [InstCombine] Fold logic-and/logic-or by distributive laws part2
Follow up https://reviews.llvm.org/D139408, support `and/or+select` patterns
X && Z || Y && Z --> (X || Y) && Z
https://alive2.llvm.org/ce/z/EMCkBG
https://alive2.llvm.org/ce/z/Q-YRvr
https://alive2.llvm.org/ce/z/SFkVQc
https://alive2.llvm.org/ce/z/S9MCuJ
https://alive2.llvm.org/ce/z/KZ7zzz

(X || Z) && (Y || Z) --> (X && Y) || Z
https://alive2.llvm.org/ce/z/Ggpa8-
https://alive2.llvm.org/ce/z/nhQRLY
https://alive2.llvm.org/ce/z/zpmEnq
https://alive2.llvm.org/ce/z/7omsrf
https://alive2.llvm.org/ce/z/CWBzBp

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D139630
2023-01-09 10:21:17 +08:00
Shilei Tian
acd22b2751 [AAUnderlyingObjects] Introduce an AA for getting underlying objects of a pointer
This patch introduces a new AA `AAUnderlyingObjects`. It is basically like a wrapper
AA of the function `AA::getAssumedUnderlyingObjects`, but it can recursively do
query if the underlying object is an indirect access, such as a phi node or a select
instruction.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D141164
2023-01-08 16:45:50 -05:00
Sanjay Patel
21d3871b7c [InstCombine] fold not-shift of signbit to icmp+zext, part 2
Follow-up to:
6c39a3aae1

That converted a pattern with ashr directly to icmp+zext, and
this updates the pattern that we used to convert to.

This canonicalizes to icmp for better analysis in the minimum case
and shortens patterns where the source type is not the same as dest type:
https://alive2.llvm.org/ce/z/tpXJ64
https://alive2.llvm.org/ce/z/dQ405O

This requires an adjustment to an icmp transform to avoid infinite looping.
2023-01-08 12:04:09 -05:00
Florian Hahn
78914e8c32 [VPlan] Keep entries in worklist in sinkScalarOperands.
Not removing the entries ensures that duplicates are avoided,
reducing the number of iterations.
2023-01-08 15:52:57 +00:00
luxufan
eda8e999dd [InstCombine] Combine (zext a) mul (zext b) to llvm.umul.with.overflow only if mul has NUW flag
Fixes: https://github.com/llvm/llvm-project/issues/59836

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D141031
2023-01-08 14:41:59 +08:00
Alexey Bataev
7439e1b2de [SLP]Fix incorrect reordering of clustered scalars.
The new mask represents the order, not the mask itself. At first, need
to treat as the order, convert to mask and only after that reorder
gathered scalars to build correct clustered order.

Differential Revision: https://reviews.llvm.org/D141161
2023-01-06 16:04:09 -08:00
James Y Knight
1ae36b1387 Remove special cases for invoke of non-throwing inline-asm.
Non-throwing inline asm infers the nounwind attribute in
instcombine. Thus, it can be handled in the same manner as
non-throwing target functions are generally. Further special casing is
unnecessary complexity.
2023-01-06 13:53:10 -05:00
Alex Richardson
968f2c77a8 Re-gernerate a test in preparation for D141060 2023-01-06 17:38:55 +00:00
Alexey Bataev
9b5f62685a [SLP]Fix cost of the broadcast buildvector/gather.
Need to include the cost of the initial insertelement to the cost of the
broadcasts. Also, need to adjust the cost of the gather/buildvector if
the element is inserted into poison/undef vector.

Differential Revision: https://reviews.llvm.org/D140498
2023-01-06 09:25:05 -08:00
Nikita Popov
d18a2dc5c9 [GVN] Name instructions in test (NFC) 2023-01-06 17:28:18 +01:00
Nikita Popov
896ca595f9 [EntryExitInstrumenter] Convert test to opaque pointers (NFC) 2023-01-06 17:25:30 +01:00
Craig Topper
e5a71a41d8 [RISCV] Add support for the vscale_range attribute.
This is based on @frasercrmck's D107290. At least some of the clang
portion of D107290 has already been committed.

This uses vscale_range for min/max vector width unless the command
line overrides are used.

As a follow up, I plan to add a max or exact VLEN option to clang
to control the vscale_range. This will eliminate many of the reasons
for users to use the overrides through the -mllvm interface.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D139873
2023-01-06 08:20:37 -08:00
David Green
161bfa5f53 [LoopFlattening] Check for extra uses on Mul
Similar to D138404, we were not guarding against extra uses of the Mul.
In most cases other checks would catch the issue due to unsupported
instructions in the outer loop, but certain non-canonical loop forms
could still get through.

Fixes #59339

Differential Revision: https://reviews.llvm.org/D141114
2023-01-06 15:32:38 +00:00
David Green
8cc9530ebc [LoopFlatten][NFC] Run instnamer on pr59339.ll 2023-01-06 15:32:38 +00:00
Alex Richardson
1b440155c1 Make switch-to-lookup-large-types.ll more reliable
When larger integer types are natively supported simplifycfg will use an
inline constant instead of a global variable for this transform. I noticed
this while trying to automatically infer the datalayout from the target
triple in opt if it is not explicitly specified. Since the x86_64
datalayout includes "n8:16:32:64", this test started failing.

While touching this file also change i128 to i64 in the first test since
this was intended behaviour in the original commit.

Reviewed By: spatel, fhahn

Differential Revision: https://reviews.llvm.org/D141055
2023-01-06 13:35:43 +00:00
Nikita Popov
4abc820c66 [CallSiteSplitting] Convert test to opaque pointers (NFC)
Keeping the bitcasts here because this is in part testing the
(legal) bitcast after a musttail call, even though it's no longer
really relevant.
2023-01-06 14:35:31 +01:00
Nikita Popov
5867241eac [Transforms] Convert some tests to opaque pointers (NFC) 2023-01-06 12:14:45 +01:00
Nikita Popov
b3bad0ab31 [GlobalSplit] Convert test to opaque pointers (NFC) 2023-01-06 12:07:06 +01:00
Florian Hahn
68469a80cb [LV] Disable runtime unrolling for vectorized loops.
This patch adds metadata to disable runtime unrolling to the vectorized
loop. If runtime unrolling/interleaving is considered profitable, LV
will interleave the loop directly. There should be no need to perform
runtime unrolling at a later stage.

Note that we already add metadata to disable runtime unrolling to the
scalar loop after vectorization.

The additional unrolling unnecessarily increases code size and compile
time. In addition to that we have several bug reports of unncessary
runtime unrolling for vectorized loops, e.g. PR40961

Compile-time improvements:

  NewPM-O3: -1.04%
  NewPM-ReleaseThinLTO: -0.59%
  NewPM-ReleaseLTO-g: -0.97%

https://llvm-compile-time-tracker.com/compare.php?from=ce1be13a868d0f8afa367975558c1a6175cce33a&to=78bc2e67f22e9e10e61cdb6cdac4bb857d95eb1b&stat=instructions:u

Fixes #40306.

Reviewed By: lebedev.ri, nikic

Differential Revision: https://reviews.llvm.org/D115261
2023-01-06 10:56:17 +00:00
OCHyams
7ea47f9e41 [DebugInfo] Replace UndefValue with PoisonValue in setKillLocation
This helps towards the effort to remove UndefValue from LLVM.

Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value

Reviewed By: nlopes

Differential Revision: https://reviews.llvm.org/D140905
2023-01-06 10:51:02 +00:00
Nikita Popov
b7b0ce6e76 [LoopUnroll] Convert test to opaque pointers (NFC) 2023-01-06 11:48:03 +01:00
Nikita Popov
9c7afbacd8 [LoopUnroll] Name instructions in test (NFC) 2023-01-06 11:48:03 +01:00
Nikita Popov
7a752e8108 [LoopIdiom] Convert tests to opaque pointers (NFC)
The differences here are due to SCEVExpander producing GEPs with
explicit offset calculation, a known difference with opaque pointers.
2023-01-06 11:36:37 +01:00
Nikita Popov
89f1876b61 [LoopIdiom] Name instructions in test (NFC) 2023-01-06 11:07:57 +01:00
Keno Fischer
1436a9232b [LVI] Look through negations when evaluating conditions
This teaches LVI (and thus CVP) to extract range information
from branches whose condition is negated using (`xor %c, true`).
On the implementation side, we switch the cache to additionally
track whether we're looking for the inverted value or not and
otherwise using the existing support for computing inverted
conditions.

I think the biggest question here is why this negation shows up
here at all. After all, it should always be possible for some
other pass to fold such a negation into a branch, comparison or
some other logical operation. Indeed, instcombine does just that.
However, these negations can be otherwise fairly persistent, e.g.
instsimplify is not able to exchange branch conditions from
negations. In addition, jumpthreading, which sits at the same
point in default pass pipeline also handles this pattern, which
adds further evidence that we might expect these negations to
not have been canonicalized away yet at this point in the pass
pipeline.

In the particular case I was looking at there was a bit of a
circular dependency where flags computed by cvp were needed
by instcombine, and incstombine's folding of the negation was
needed for cvp. Adding a second instombine pass would have
worked of course, but instcombine can be somewhat expensive,
so it appeared desirable to not require it to have run
before cvp (as is the case in the default pass pipeline).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D140933
2023-01-05 23:03:46 +00:00
Peter Rong
1db51d8eb2 [Transform] Rewrite LowerSwitch using APInt
This rewrite fixes https://github.com/llvm/llvm-project/issues/59316.

Previously LowerSwitch uses int64_t, which will crash on case branches using integers with more than 64 bits.
Using APInt fixes this problem. This patch also includes a test

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D140747
2023-01-05 14:30:42 -08:00
Craig Topper
239a174d92 [RISCV] Prevent constant hoisting for or/and/xor that can use bseti/bclri/binvi.
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D140928
2023-01-05 11:18:31 -08:00
Valery N Dmitriev
6d677c0b3d [SLP] Unify GEP cost modeling for load, store and GEP nodes.
Make a separate routine for GEPs cost calculation and make
the approach uniform across load, store and GEP tree nodes.
Additional issue fixed is GEP cost savings were applied twice
for ScatterVectorize nodes (aka gather load) making them look
unrealistically profitable for vectorization.

Differential Revision: https://reviews.llvm.org/D140789
2023-01-05 10:11:36 -08:00
Sanjay Patel
ad14cef1d5 [InstCombine] add tests for cmp of pow2 mask; NFC 2023-01-05 12:28:08 -05:00
Nikita Popov
11be5cc001 [LoopSimplifyCFG] Convert test to opaque pointers (NFC) 2023-01-05 14:05:39 +01:00
Nikita Popov
3ed1c21ac5 [PredicateInfo] Enable test with broken REQUIRES condition (NFC)
Add some extra uses of the comparisons, so that these do get
visited.
2023-01-05 12:51:28 +01:00
Nikita Popov
055fb7795a [Transforms] Convert some tests to opaque pointers (NFC)
These are all tests where conversion worked automatically, and
required no manual fixup.
2023-01-05 12:43:45 +01:00
David Green
586fd86b0a [LoopVectorizer] Fix inloop reductions mask placement
The validation of vplans could fail if an inloop reduction was created
with a block-in mask that did not dominate the reduction. This makes
sure that the insert point is set when creating the mask, to ensure it
dominates the reduction.

Differential Revision: https://reviews.llvm.org/D141003
2023-01-05 11:37:37 +00:00
Nikita Popov
7ac6b2fdaf [CVP] Convert tests to opaque pointers (NFC) 2023-01-05 12:34:36 +01:00
Nikita Popov
b061159e79 [SLPVectorizer] Convert test to opaque pointers (NFC) 2023-01-05 12:32:44 +01:00