1648 Commits

Author SHA1 Message Date
Bjorn Pettersson
2e14900db9 [test][NewPM] Use -passes=loop-vectorize instead of -loop-vectorize
Update a bunch of loop-vectorize regression tests to use the new PM
syntax (opt -passes=loop-vectorize) instead of the deprecated legacy
PM syntax (opt -loop-vectorize).
2022-04-28 16:46:00 +02:00
Florian Hahn
bea69b232f [VPlan] Initial modeling of middle block in VPlan.
This patch extends the scope of VPlan to also include the exit (aka
middle) block.

For now, the exit block remains empty, but handling of exit values will
subsequently be moved to VPlan, by adding recipes to model exit values
in the exit block.

As a first step, this will allow fixing #51366.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D123457
2022-04-20 19:34:41 +01:00
Florian Hahn
a65f2730d2 [VPlan] Expand induction step in VPlan pre-header.
This patch moves SCEV expansion of steps used by
VPWidenIntOrFpInductionRecipes to the pre-header using
VPExpandSCEVRecipe. This ensures that those steps are expanded while the
CFG is in a valid state. Previously, SCEV expansion may happen during
vector body code-generation, during which the CFG may be invalid,
causing issues with SCEV expansion.

Depends on D122095.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D122096
2022-04-19 13:06:39 +02:00
Craig Topper
ac8c720d48 [IR] Allow constant folding (insertelement <vscale x 2 x i32> zeroinitializer, i32 0, i32 0).
Most of insertelement constant folding is blocked if the vector type
is scalable. I believe we can make an exception for inserting null
into an all zeros vector.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123413
2022-04-15 17:44:32 -07:00
Florian Hahn
73f5d7d0d6 [VPlan] Handle equal address and store ops in onlyFirstLaneDemanded.
With opaque pointers, the stored value and address can be the same.

Previously, the code in VPWidenMemoryInstructionRecipe::onlyFirstLaneDemanded
incorrectly considered stores with matching store and pointer operands as
only demanding the first lane, causing a crash.
2022-04-15 22:53:33 +02:00
Muhammad Omair Javaid
42ebfa8269 Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"
This reverts commit 64b6192e81.

This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage:

https://lab.llvm.org/buildbot/#/builders/176/builds/1515

llvm-tblgen crashes after applying this patch.
2022-04-13 04:53:07 +05:00
Simon Pilgrim
431e93f4f5 [InstCombine] Fold sub(add(x,y),min/max(x,y)) -> max/min(x,y) (PR38280)
As discussed on Issue #37628, we can flip a min/max node if we're subtracting from the sum of the node's operands

Alive2: https://alive2.llvm.org/ce/z/W_KXfy

Differential Revision: https://reviews.llvm.org/D123399
2022-04-11 11:32:56 +01:00
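The fold in 431e93f4f5 rests on the identity (x + y) - min(x, y) = max(x, y), and symmetrically (x + y) - max(x, y) = min(x, y). A minimal sketch that checks this exhaustively for signed 8-bit values with two's-complement wrapping; the helper names `wrap8` and `fold_holds` are hypothetical and not part of LLVM:

```python
def wrap8(v: int) -> int:
    """Reduce v to a signed 8-bit value using two's-complement wrapping."""
    v &= 0xFF
    return v - 0x100 if v >= 0x80 else v


def fold_holds(x: int, y: int) -> bool:
    """Check both directions of the fold for one pair of i8 values."""
    total = wrap8(x + y)  # the add may overflow and wrap
    return (wrap8(total - min(x, y)) == max(x, y)
            and wrap8(total - max(x, y)) == min(x, y))


# Exhaustive check over every pair of signed 8-bit values.
all_ok = all(fold_holds(x, y)
             for x in range(-128, 128)
             for y in range(-128, 128))
```

The identity survives overflow of the intermediate sum because the subtraction wraps by the same amount, which is why the check uses wrapping arithmetic rather than unbounded integers.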
Florian Hahn
5f1eb74850 [VPlan] Place VPExpandSCEVRecipe in pre-header.
After D121624 models the pre-header in VPlan, VPExpandSCEVRecipes can be
placed there. This ensures SCEV expansion happens before modifying the
CFG during VPlan execution, when CFG is incomplete.

Depends on D121624.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D122095
2022-04-10 10:26:20 +02:00
Florian Hahn
256c6b0ba1 [VPlan] Model pre-header explicitly.
This patch extends the scope of VPlan to also model the pre-header.
The pre-header can be used to place recipes that should be code-gen'd
outside the loop, like SCEV expansion.

Depends on D121623.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121624
2022-04-09 14:19:47 +02:00
Simon Pilgrim
450f0d76b4 [LoopVectorize] Regenerate first-order-recurrence.ll
2022-04-09 10:33:03 +01:00
Stanislav Mekhanoshin
fced87d457 [AMDGPU] Fix regression with vectorization limiting
D67148 removed TTI::getNumberOfRegisters(bool Vector) and
started to call TTI::getNumberOfRegisters(unsigned ClassID) from
LoopVectorize. This resulted in unrestricted vectorization
on AMDGPU, blowing up register pressure.

Differential Revision: https://reviews.llvm.org/D122850
2022-04-08 17:46:49 -07:00
Florian Hahn
467dbcd9f1 [LV] Set debug loc after setting insert point.
This fixes the code to actually use the location of the instruction, if
available. Previously, SetInsertPoint would overwrite the insert point
set from the instruction.
2022-04-08 20:34:40 +02:00
Florian Hahn
4c0d5db9c9 [LV] Add test case for wrong debug location with replicate recipe.
2022-04-08 20:34:16 +02:00
Florian Hahn
29fe998eaa [VPlan] Preserve debug location when creating branch.
Update createEmptyBasicBlock to preserve the debug location of the
previous terminator.
2022-04-08 17:22:53 +02:00
Florian Hahn
547567fe2b [LV] Add test for missing debug info on branch in vector loop.
Adds a test case where currently no debug location is added to branches
in the vector body.
2022-04-08 17:22:53 +02:00
Florian Hahn
631016a853 [LV] Add test case for PR54427.
Reduced test for #54427.
2022-04-07 23:21:21 +02:00
Jingu Kang
64b6192e81 [AArch64] Set maximum VF with shouldMaximizeVectorBandwidth
Set the maximum VF for AArch64 to 128 / the size of the smallest type in the loop.

Differential Revision: https://reviews.llvm.org/D118979
2022-04-05 13:16:52 +01:00
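The arithmetic described in 64b6192e81 can be illustrated with a small sketch. It assumes type sizes are given in bits and that the figure 128 from the commit message is a width in bits; the names `VECTOR_WIDTH_BITS` and `max_vf` are hypothetical, and this is not the code from the patch:

```python
VECTOR_WIDTH_BITS = 128  # the figure quoted in the commit message


def max_vf(type_widths_in_bits):
    """Return 128 divided by the smallest element width used in the loop."""
    smallest = min(type_widths_in_bits)
    return VECTOR_WIDTH_BITS // smallest


# A loop that mixes i32 and i8 values is bounded by the i8 element type.
mixed_loop_vf = max_vf([32, 8])
```

Under these assumptions a loop whose smallest type is i8 gets a maximum VF of 16, while a loop using only i64 gets a maximum VF of 2.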
Florian Hahn
1ff022e21b [LV] Add vector.body block to parent loop during skeleton creation.
When creating induction resume values, SCEV queries may rely on
LoopInfo. Make sure vector.body gets added to the loop of the pre-header
during skeleton construction.

%vector.body will be moved to the vector preheader during VPlan
execution.

Fixes #54745.
2022-04-05 11:54:17 +01:00
Florian Hahn
368d35a894 [LV] Add additional tests for pointer difference memory checks.
Additional tests for D119078.
2022-04-04 17:58:48 +01:00
Philip Reames
88de27e3fd [LV] Handle non-integral types when considering interleave widening legality
In general, anywhere we might need to insert a blind bitcast, we need to make sure the types are losslessly convertible.

This fixes pr54634.
2022-04-03 20:16:20 -07:00
Dávid Bolvanský
872f7000fc Revert "[NFCI] Regenerate SROA/LoopVectorize test checks"
This reverts commit 14e3450fb5.
2022-04-04 01:15:30 +02:00
Dávid Bolvanský
a113a582b1 [NFCI] Regenerate LoopVectorize test checks
2022-04-03 21:56:24 +02:00
Florian Hahn
95b2aa511e [VPlan] Set VPlan header block name to vector.body.
This brings the VPlan block naming in line with the naming of the
generated basic blocks.
2022-04-02 19:34:32 +01:00
Florian Hahn
a08c90a402 [LV] Re-use TripCount from EPI.TripCount.
During skeleton construction for the epilogue vector loop, generic
helpers use getOrCreateTripCount, which will re-expand the trip count
computation. Instead, re-use the TripCount created during main loop
vectorization.
2022-04-01 13:47:34 +01:00
David Green
b65267ca7b [LV] Invalidate widening decisions after maximizing vector bandwidth
When MaximizeVectorBandwidth is enabled, we can end up (via calls to
collectUniformsAndScalars/setCostBasedWideningDecision through
calculateRegisterUsage) making widening decisions before we have decided
whether to fold the tail by masking. These decisions will be wrong if we
later decide to fold the tail, for example when the trip count is very
low. It will use incorrect costs for loads that should get masked, using
standard memory operation costs instead.

This still at the moment uses the EmulatedMaskMemRefHack costs (a bit
unfortunately), but the old costs without this change were 1, leading to
too optimistic vectorization.

This slightly changes the way that the MaximizeVectorBandwidth option
works to make it easier to test, always honouring the option if it is
set.

Differential Revision: https://reviews.llvm.org/D120215
2022-03-31 09:19:31 +01:00
Florian Hahn
ecb4171dcb [LV] Handle zero cost loops in selectInterleaveCount.
In some cases, like in the added test case, we can reach
selectInterleaveCount with loops that actually have a cost of 0.

Unfortunately a loop cost of 0 is also used to communicate that the cost
has not been computed yet. To resolve the crash, bail out if the cost
remains zero after computing it.

This seems like the best option, as there are multiple code paths that
return a cost of 0 to force a computation in selectInterleaveCount.
Computing the cost at multiple places up front there would unnecessarily
complicate the logic.

Fixes #54413.
2022-03-29 22:52:43 +01:00
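The control flow described in ecb4171dcb can be sketched as follows. This is an illustration under stated assumptions, not the LLVM implementation: `compute_cost` and `pick_count` are hypothetical callables standing in for the cost model and the interleave heuristic, and "bail out" is assumed here to mean returning an interleave count of 1 (no interleaving):

```python
def select_interleave_count(loop_cost, compute_cost, pick_count):
    """Treat a cost of 0 as "not computed yet", then bail out if it stays 0."""
    if loop_cost == 0:
        # 0 doubles as the "not computed yet" marker, so compute it now.
        loop_cost = compute_cost()
        if loop_cost == 0:
            # The loop really is free. Assumption: bail out with count 1.
            return 1
    # Only reached with a non-zero cost, so later divisions by it are safe.
    return pick_count(loop_cost)
```

Checking for zero a second time, after the computation, is what separates "cost unknown" from "cost genuinely zero" without threading a separate flag through every caller.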
Florian Hahn
46432a0088 [VPlan] Add VPWidenPointerInductionRecipe.
This patch moves pointer induction handling from VPWidenPHIRecipe to its
own recipe. In the process, it adds all information required to generate
code for pointer inductions without relying on Legal to access the list
of induction phis.

Alternatively VPWidenPHIRecipe could also take an optional pointer to InductionDescriptor.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121615
2022-03-24 14:58:45 +00:00
Florian Hahn
890fc21742 [LV] Extend checks in debugloc.ll.
2022-03-23 20:21:58 +00:00
Florian Hahn
973183612e [VPlan] Add test for VPExpandSCEVRecipe printing.
2022-03-20 10:11:40 +00:00
Florian Hahn
d5fbcf76fd [VPlan] Improve pattern in vplan-printing.ll check line.
The existing pattern only matched a single value, which breaks if the
numbering slightly changes.
2022-03-19 16:03:25 +00:00
Andrew Wei
0af3e6a22d [InstCombine] Sink instructions with multiple users in a successor block.
This patch tries to sink instructions when they are only used in a successor block.

This is a further enhancement patch based on Anna's commit:
D109700, which allows sinking an instruction having multiple uses in a single user.

This patch adds support for sinking instructions with multiple users in a single successor block.
It could fix a known Rust issue:
  https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610

Reviewed By: nikic, reames

Differential Revision: https://reviews.llvm.org/D121585
2022-03-18 11:53:45 +08:00
Florian Hahn
151c144350 [LV] Use usesScalars in widenPHIInstruction.
This uses the existing VPlan helpers to check whether there are scalar
uses of a phi recipe. It removes one of the few remaining dependencies on
the cost model from VPlan code generation.

Depends on D121612.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121613
2022-03-17 13:16:32 +00:00
Malhar Jajoo
a36d269658 [VPlan] Avoid collecting scalars for SVE
This patch ensures scalars (except for uniforms) are no
longer collected (prior to LVP planning phase) for
scalable vectorization.

This is to avoid the chances of generating scalarized
instructions later (during LVP execute phase) as they
are not supported for scalable vectorization.

A relevant test has also been added.

Differential Revision: https://reviews.llvm.org/D121452
2022-03-16 16:33:34 +00:00
Florian Hahn
5c4d64eb0d [LV] Make reduction-order.ll test independent of instruction naming.
Also update test to not use branch on undef.
2022-03-15 11:13:18 +00:00
Florian Hahn
4a0481e981 [LV] Check for users of truncated IVs, add more detailed comment.
Add missing outside user check for truncated IVs. Also hoist the code in
the helper with additional explanations.

Fixes #54370.
2022-03-14 19:39:30 +00:00
Florian Hahn
1c0fc1f074 [VPlan] Ensure each iv user is only visited once in transform.
If a recipe has multiple uses of an IV, we crash. It causes a crash when
building llvm-test-suite.

Exposed by 95f76bff1c.
2022-03-13 21:42:17 +00:00
Florian Hahn
95f76bff1c [LV] Create & use VPScalarIVSteps for all scalar users.
This patch is a follow-up to D115953. It updates optimizeInductions
to also introduce new VPScalarIVStepsRecipes if an IV has both vector
and scalar uses.

It updates all uses that only need scalar values to use the newly
created recipe for the scalar steps.

This completes untangling of VPWidenIntOrFpInductionRecipe
code-generation. Now the recipe *only* creates the widened vector
values, as it says on the tin.

The code to generate IR has been moved directly to
VPWidenIntOrFpInductionRecipe::execute.

Note that the recipe has been updated to hold a reference to
ScalarEvolution, which is needed to expand the step, until we can place
the corresponding SCEV expansion in the pre-header.

Depends on D120827.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D120828
2022-03-13 17:15:24 +00:00
Sanjay Patel
b48fe158e0 [Analysis] remove bogus smin/smax pattern detection
This is a revert of cfcc42bdc. The analysis is wrong as shown by
the minimal tests for instcombine:
https://alive2.llvm.org/ce/z/y9Dp8A

There may be a way to salvage some of the other tests,
but that can be done as follow-ups. This avoids a miscompile
and fixes #54311.
2022-03-09 17:50:34 -05:00
Florian Hahn
a12403cfea [LV] Do not consider instrs dead if used by phi that's not in plan.
Single value phis won't be modeled in VPlan. If the phi only gets used
outside the loop, the current code misses the fact that the incoming
value is not dead. Update the code to also look through such phis to
check for outside users.

Fixes #54266
2022-03-09 16:04:44 +00:00
Florian Hahn
a2979c8399 [IVDescriptors] Bail out instead of asserting that order is expected.
When dealing with multiple phis that depend on each other, the order
might have been changed and may not match the expectation. If that
happens, bail out, rather than asserting.

Fixes https://github.com/llvm/llvm-project/issues/54218
Fixes https://github.com/llvm/llvm-project/issues/54233
Fixes https://github.com/llvm/llvm-project/issues/54254
2022-03-07 19:57:26 +00:00
Florian Hahn
f4368487aa [LV] Add test from PR54227.
Test from https://github.com/llvm/llvm-project/issues/54227.

The underlying issue has already been fixed in de8ac48 with a separate
test.
2022-03-07 17:01:22 +00:00
Roman Lebedev
2f80ea7f4f [NFC][LV] Use different braces in debug output
The analysis passes output function name encapsulated in `'` braces,
but LV uses `"`. Harmonizing this may help in creating an update script
for the LV costmodel test checks.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D121105
2022-03-07 19:32:37 +03:00
Florian Hahn
de8ac485e5 [IVDescriptor] Remove SinkCandidate from SinkAfter before re-sinking.
This ensures the right order in the sink-after map is maintained. If we
re-sink an instruction, it must be sunk after all earlier instructions
have been sunk.

Fixes https://github.com/llvm/llvm-project/issues/54223
2022-03-05 19:48:26 +00:00
Florian Hahn
5a60260efe [IVDescriptor] Use DT to check order of Previous, OtherPrev.
Previous and OtherPrev may not be in the same block. Use DT::dominates
instead of local comesBefore. DT::dominates is already used earlier to
check the order of Previous and SinkCandidate.

Fixes https://github.com/llvm/llvm-project/issues/54195
2022-03-04 11:07:42 +00:00
Florian Hahn
139215af8e [IVDescriptor] Find original 'Previous' for first-order recurrences.
This patch extends first-order recurrence handling to support cases
where we already sunk an instruction for a different recurrence, but
LastPrev comes before Previous.

To handle those cases correctly, we need to find the earliest entry for
the sink-after chain, because this references the Previous from the
original recurrence. This is needed to ensure we use the correct
instruction as sink point.

Depends on D118558.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D118642
2022-03-03 16:41:26 +00:00
Florian Hahn
8777cb66a8 [VPlan] Remove reliance on underlying instr for ScalarIVSteps (NFCI).
Instead of relying on underlying instructions, this patch updates
VPScalarIVStepsRecipe to only store the required type information.

This removes access to unrelated information, as well as avoiding issues
with the same underlying instruction being shared by multiple recipes.

This change should only change the debug output and not cause any
codegen changes, hence NFCI.
2022-03-02 16:23:19 +00:00
Florian Hahn
6dc456a375 [LV] Remove redundant check line from recurrence test.
The removed line matches the previous line, modulo the check prefix.
There is no way to disable sinking instructions as required due to
first-order recurrence and removing the line should be safe.
2022-03-02 13:48:46 +00:00
Florian Hahn
83fd2071f0 [LV] Modernize test matching hardcoded induction phi name.
2022-03-02 10:12:38 +00:00
Florian Hahn
470b5c7f0d [LV] Add test with multiple uses of a FOR chained together.
Additional test coverage for D118642.
2022-03-01 14:18:23 +00:00
Nikita Popov
26748bb15a [InstCombine] Slightly relax one-use check in abs canonicalization
Treat the icmp and sub symmetrically, and require that one of them
has one use, not the icmp in particular. This could be further
relaxed in the abs (but not nabs) case to not check one-use at
all.
2022-03-01 15:06:41 +01:00