Commit Graph

2093 Commits

Author SHA1 Message Date
Sanjay Patel
24238f09ed [SLP] fix formatting; NFC
Also move variable declarations closer to usage and add code comments.
2020-09-16 08:50:27 -04:00
Sanjay Patel
6a23668e78 [SLP] remove uses of 'auto' that obscure functionality; NFC 2020-09-16 08:26:21 -04:00
Sanjay Patel
0cee1bf5d1 [SLP] remove redundant size check; NFC
We bail out on small array size anyway.
2020-09-16 08:11:19 -04:00
Sanjay Patel
bbad998bab [SLP] move loop index variable declaration to its use; NFC 2020-09-16 07:59:31 -04:00
Sanjay Patel
158989184e [SLP] change poorly named variable; NFC
'V' shadows a function argument.
2020-09-16 07:59:31 -04:00
Wenlei He
2ea4c2c598 [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults
~~D65060 uncovered that trying to use BFI in loop passes can lead to non-deterministic behavior when blocks are re-used while retaining old BFI data.~~

~~To make sure BFI is preserved through loop passes a Value Handle (VH) callback is registered on blocks themselves. When a block is freed it now also wipes out the accompanying BFI entry such that stale BFI data can no longer persist resolving the determinism issue. ~~

~~An optimistic approach would be to incrementally update BFI information throughout the loop passes rather than only invalidating them on removed blocks. The issues with that are:~~
~~1. It is not clear how BFI information should be incrementally updated: If a block is duplicated does its BFI information come with? How about if it's split/modified/moved around? ~~
~~2. Assuming we can address these problems the implementation here will be a massive undertaking. ~~

~~There's a known need of BFI in LICM analysis which requires correct but not incrementally updated BFI data. A follow-up change can register BFI in all loop passes so this preserved but potentially lossy data is available to any loop pass that wants it.~~

See: D75341 for an identical implementation of preserving BFI via VH callbacks. The previous statements do still apply but this change no longer has to be in this diff because it's already upstream 😄 .

This diff also moves BFI to be a part of LoopStandardAnalysisResults since the previous method using getCachedResults now (correctly!) statically asserts (D72893) that this data isn't static through the loop passes.

Testing
Ninja check

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D86156
2020-09-15 16:16:24 -07:00
Huihui Zhang
3b7f5166bd [SLPVectorizer][SVE] Skip scalable-vector instructions before vectorizeSimpleInstructions.
For scalable type, the aggregated size is unknown at compile-time.
Skip instructions with scalable type to ensure the list of instructions
for vectorizeSimpleInstructions does not contains any scalable-vector instructions.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D87550
2020-09-15 13:10:15 -07:00
Fangrui Song
4452cc4086 [VectorCombine] Don't vectorize scalar load under asan/hwasan/memtag/tsan
Similar to the tsan suppression in
`Utils/VNCoercion.cpp:getLoadLoadClobberFullWidthSize` (rL175034; load widening used by GVN),
the D81766 optimization should be suppressed under tsan due to potential
spurious data race reports:

  struct A {
    int i;
    const short s; // the load cannot be vectorized because
    int modify;    // it overlaps with bytes being concurrently modified
    long pad1, pad2;
  };
  // __tsan_read16 does not know that some bytes are undef and accessing is safe

Similarly, under asan, users can mark memory regions with
`__asan_poison_memory_region`. A widened load can lead to a spurious
use-after-poison error. hwasan/memtag should be similarly suppressed.

`mustSuppressSpeculation` suppresses asan/hwasan/tsan but not memtag, so
we need to exclude memtag in `vectorizeLoadInsert`.

Note, memtag suppression can be relaxed if the load is aligned to the
its granule (usually 16), but that is out of scope of this patch.

Reviewed By: spatel, vitalybuka

Differential Revision: https://reviews.llvm.org/D87538
2020-09-15 09:47:21 -07:00
Simon Pilgrim
2b42d53e5e SLPVectorizer.h - remove unnecessary AliasAnalysis.h include. NFCI.
Forward declare AAResults instead of the (old) AliasAnalysis type.

Remove includes from SLPVectorizer.cpp that are already included in SLPVectorizer.h.
2020-09-15 16:24:05 +01:00
David Green
74760bb00f [LV][ARM] Add preferInloopReduction target hook.
This allows the backend to tell the vectorizer to produce inloop
reductions through a TTI hook.

For the moment on ARM under MVE this means allowing integer add
reductions of the correct size. In the future this can include integer
min/max too, under -Os.

Differential Revision: https://reviews.llvm.org/D75512
2020-09-12 17:47:04 +01:00
Sanjay Patel
40f12ef621 [SLP] further limit bailout for load combine candidate (PR47450)
The test example based on PR47450 shows that we can
match non-byte-sized shifts, but those won't ever be
bswap opportunities. This isn't a full fix (we'd still
match if the shifts were by 8-bits for example), but
this should be enough until there's evidence that we
need to do more (this is a borderline case for
vectorization in the first place).
2020-09-11 11:56:11 -04:00
Craig Topper
c195ae2f00 [SLPVectorizer][X86][AMDGPU] Remove fcmp+select to fmin/fmax reduction support.
Previously we could match fcmp+select to a reduction if the fcmp had
the nonans fast math flag. But if the select had the nonans fast
math flag, InstCombine would turn it into a fminnum/fmaxnum intrinsic
before SLP gets to it. Seems fairly likely that if one of the
fcmp+select pair have the fast math flag, they both would.

My plan is to start vectorizing the fmaxnum/fminnum version soon,
but I wanted to get this code out as it had some of the strangest
fast math flag behaviors.
2020-09-10 11:49:19 -07:00
Simon Pilgrim
5ea9e655ef VPlan.h - remove unnecessary forward declarations. NFCI.
Already defined in includes.
2020-09-07 18:35:06 +01:00
Huihui Zhang
b4f04d7135 [VectorCombine][SVE] Do not fold bitcast shuffle for scalable type.
First, shuffle cost for scalable type is not known for scalable type;
Second, we cannot reason if the narrowed shuffle mask for scalable type
is a splat or not.

E.g., Bitcast splat vector from type <vscale x 4 x i32> to <vscale x 8 x i16>
will involve narrowing shuffle mask <vscale x 4 x i32> zeroinitializer to
<vscale x 8 x i32> with element sequence of <0, 1, 0, 1, ...>, which cannot be
reasoned if it's a valid splat or not.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D86995
2020-09-02 15:02:16 -07:00
Sanjay Patel
8fb055932c [VectorCombine] allow vector loads with mismatched insert type
This is an enhancement to D81766 to allow loading the minimum target
vector type into an IR vector with a different number of elements.

In one of the motivating tests from PR16739, SLP creates <2 x float>
load ops mixed with <4 x float> insert ops, so we want to handle that
pattern in addition to potential oversized vectors created by the
vectorizers.

For now, we are assuming the insert/extract subvector with undef is
free because there is no exact corresponding TTI modeling for that.

Differential Revision: https://reviews.llvm.org/D86160
2020-09-02 08:11:36 -04:00
Aaron Liu
d7e16ca28f [LV] Interleave to expose ILP for small loops with scalar reductions.
Interleave for small loops that have reductions inside,
which breaks dependencies and expose.

This gives very significant performance improvements for some benchmarks.
Because small loops could be in very hot functions in real applications.

Differential Revision: https://reviews.llvm.org/D81416
2020-09-01 19:47:32 +00:00
Florian Hahn
eb35ebb3a2 [LV] Update CFG before adding runtime checks.
addRuntimeChecks uses SCEVExpander, which relies on the DT/LoopInfo to
be up-to-date. Changing the CFG afterwards may invalidate some inserted
instructions, especially LCSSA phis.

Reorder the code to first update the CFG and then create the runtime
checks. This should not have any impact on the generated code, as we
adjust the CFG and generate runtime checks together.

Fixes PR47343.
2020-08-30 18:21:44 +01:00
Sanjay Patel
af4581e8ab [SLP] make commutative check apply only to binops; NFC
As discussed in D86798, it's not clear if the caller code
works with a more liberal definition of "commutative" that
includes intrinsics like min/max. This makes the binop
restriction (current functionality is unchanged) explicit
until the code is audited/tested.
2020-08-30 10:55:44 -04:00
David Green
543c5425f1 [LV] Add some const to RecurrenceDescriptor. NFC 2020-08-30 12:27:51 +01:00
Florian Hahn
5067f4b626 [LV] Check opt-for-size before expanding runtime checks.
Move bail out when optimizing for size before runtime check generation.
In that case, we do not use the result of the expansion, the expanded
instruction will be dead and cleaned up later.

By doing the check before expanding the runtime-checks, we can save a
bit of unnecessary work.
2020-08-29 20:35:14 +01:00
David Sherwood
f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Christopher Tetreault
5e63083435 [SVE] Remove calls to VectorType::getNumElements from Transforms/Vectorize
Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D82056
2020-08-27 12:02:20 -07:00
Sjoerd Meijer
bda8fbe2d2 [LV] Fallback strategies if tail-folding fails
This implements 2 different vectorisation fallback strategies if tail-folding
fails: 1) don't vectorise at all, or 2) vectorise using a scalar epilogue. This
can be controlled with option -prefer-predicate-over-epilogue, that has been
changed to take a numeric value corresponding to the tail-folding preference
and preferred fallback.

Patch by: Pierre van Houtryve, Sjoerd Meijer.

Differential Revision: https://reviews.llvm.org/D79783
2020-08-26 16:55:25 +01:00
Sjoerd Meijer
ae366479e8 [LV] get.active.lane.mask consuming tripcount instead of backedge-taken count
This adapts LV to the new semantics of get.active.lane.mask as discussed in
D86147, which means that the LV now emits intrinsic get.active.lane.mask with
the loop tripcount instead of the backedge-taken count as its second argument.
The motivation for this is described in D86147.

Differential Revision: https://reviews.llvm.org/D86304
2020-08-25 13:49:19 +01:00
Francesco Petrogalli
5a34b3ab95 [llvm][LV] Replace unsigned VF with ElementCount VF [NFCI]
Changes:

* Change `ToVectorTy` to deal directly with `ElementCount` instances.
* `VF == 1` replaced with `VF.isScalar()`.
* `VF > 1` and `VF >=2` replaced with `VF.isVector()`.
* `VF <=1` is replaced with `VF.isZero() || VF.isScalar()`.
* Replaced the uses of `llvm::SmallSet<ElementCount, ...>` with
   `llvm::SmallSetVector<ElementCount, ...>`. This avoids the need of an
   ordering function for the `ElementCount` class.
* Bits and pieces around printing the `ElementCount` to string streams.

To guarantee that this change is a NFC, `VF.Min` and asserts are used
in the following places:

1. When it doesn't make sense to deal with the scalable property, for
example:
   a. When computing unrolling factors.
   b. When shuffle masks are built for fixed width vector types
In this cases, an
assert(!VF.Scalable && "<mgs>") has been added to make sure we don't
enter coepaths that don't make sense for scalable vectors.
2. When there is a conscious decision to use `FixedVectorType`. These
uses of `FixedVectorType` will likely be removed in favour of
`VectorType` once the vectorizer is generic enough to deal with both
fixed vector types and scalable vector types.
3. When dealing with building constants out of the value of VF, for
example when computing the vectorization `step`, or building vectors
of indices. These operation _make sense_ for scalable vectors too,
but changing the code in these places to be generic and make it work
for scalable vectors is to be submitted in a separate patch, as it is
a functional change.
4. When building the potential VFs in VPlan. Making the VPlan generic
enough to handle scalable vectorization factors is a functional change
that needs a separate patch. See for example `void
LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned
MaxVF)`.
5. The class `IntrinsicCostAttribute`: this class still uses `unsigned
VF` as updating the field to use `ElementCount` woudl require changes
that could result in changing the behavior of the compiler. Will be done
in a separate patch.
7. When dealing with user input for forcing the vectorization
factor. In this case, adding support for scalable vectorization is a
functional change that migh require changes at command line.

Note that in some places the idiom

```
unsigned VF = ...
auto VTy = FixedVectorType::get(ScalarTy, VF)
```

has been replaced with

```
ElementCount VF = ...
assert(!VF.Scalable && ...);
auto VTy = VectorType::get(ScalarTy, VF)
```

The assertion guarantees that the new code is (at least in debug mode)
functionally equivalent to the old version. Notice that this change had been
possible because none of the methods that are specific to `FixedVectorType`
were used after the instantiation of `VTy`.

Reviewed By: rengolin, ctetreau

Differential Revision: https://reviews.llvm.org/D85794
2020-08-24 13:54:03 +00:00
Francesco Petrogalli
bad7d6b373 Revert "[llvm][LV] Replace unsigned VF with ElementCount VF [NFCI]"
Reverting because the commit message doesn't reflect the one agreed on
phabricator at https://reviews.llvm.org/D85794.

This reverts commit c8d2b065b9.
2020-08-24 13:50:55 +00:00
Francesco Petrogalli
c8d2b065b9 [llvm][LV] Replace unsigned VF with ElementCount VF [NFCI]
Changes:

* Change `ToVectorTy` to deal directly with `ElementCount` instances.
* `VF == 1` replaced with `VF.isScalar()`.
* `VF > 1` and `VF >=2` replaced with `VF.isVector()`.
* `VF <=1` is replaced with `VF.isZero() || VF.isScalar()`.
* Add `<` operator to `ElementCount` to be able to use
`llvm::SmallSetVector<ElementCount, ...>`.
* Bits and pieces around printing the ElementCount to string streams.
* Added a static method to `ElementCount` to represent a scalar.

To guarantee that this change is a NFC, `VF.Min` and asserts are used
in the following places:

1. When it doesn't make sense to deal with the scalable property, for
example:
   a. When computing unrolling factors.
   b. When shuffle masks are built for fixed width vector types
In this cases, an
assert(!VF.Scalable && "<mgs>") has been added to make sure we don't
enter coepaths that don't make sense for scalable vectors.
2. When there is a conscious decision to use `FixedVectorType`. These
uses of `FixedVectorType` will likely be removed in favour of
`VectorType` once the vectorizer is generic enough to deal with both
fixed vector types and scalable vector types.
3. When dealing with building constants out of the value of VF, for
example when computing the vectorization `step`, or building vectors
of indices. These operation _make sense_ for scalable vectors too,
but changing the code in these places to be generic and make it work
for scalable vectors is to be submitted in a separate patch, as it is
a functional change.
4. When building the potential VFs in VPlan. Making the VPlan generic
enough to handle scalable vectorization factors is a functional change
that needs a separate patch. See for example `void
LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned
MaxVF)`.
5. The class `IntrinsicCostAttribute`: this class still uses `unsigned
VF` as updating the field to use `ElementCount` woudl require changes
that could result in changing the behavior of the compiler. Will be done
in a separate patch.
7. When dealing with user input for forcing the vectorization
factor. In this case, adding support for scalable vectorization is a
functional change that migh require changes at command line.

Differential Revision: https://reviews.llvm.org/D85794
2020-08-24 13:39:42 +00:00
David Green
2b69efded0 [ARM][LV] Add a preferPredicatedReductionSelect target hook
As part of D84741, this adds a target hook for the
preferPredicatedReductionSelect option and makes use
of it under MVE, allowing us to tail predicate most
reduction loops.

Differential Revision: https://reviews.llvm.org/D85980
2020-08-21 08:48:12 +01:00
David Green
816097e4e5 [LV] Allow tail folded reduction selects to remain in the loop
The normal scheme for tail folding reductions is to use:

loop:
  p = phi(0, a)
  mask = ...
  x = masked_load(..., mask)
  a = add(x, p)
s = select(mask, a, p)

This means we need to keep the register p and a alive out of the loop, plus
the mask. On a target with predicated operations we can instead generate
the phi as p = phi(0, s). This ensures the select in the loop and we can
fold select(m, add(a, b), c) to something like a vaddt c, a, b using the
m predicate. This in turn allows us to tail predicate the entire loop.

Differential Revision: https://reviews.llvm.org/D84741
2020-08-20 14:31:14 +01:00
Hiroshi Yamauchi
ab401a8c8a [PGO][PGSO][LV] Fix loop not vectorized issue under profile guided size opts.
D81345 appears to accidentally disables vectorization when explicitly
enabled. As PGSO isn't currently accessible from LoopAccessInfo, revert back to
the vectorization with versioning-for-unit-stride for PGSO.

Differential Revision: https://reviews.llvm.org/D85784
2020-08-19 12:13:34 -07:00
Mehdi Amini
a407ec9b6d Revert "Revert "[NFC][llvm] Make the contructors of ElementCount private.""
Was reverted because MLIR/Flang builds were broken, these APIs have been
fixed in the meantime.
2020-08-19 17:26:36 +00:00
Mehdi Amini
4fc56d70aa Revert "[NFC][llvm] Make the contructors of ElementCount private."
This reverts commit 264afb9e6a.
(and dependent 6b742cc48 and fc53bd610f)

MLIR/Flang are broken.
2020-08-19 17:21:37 +00:00
Francesco Petrogalli
264afb9e6a [NFC][llvm] Make the contructors of ElementCount private.
Differential Revision: https://reviews.llvm.org/D86120
2020-08-19 16:26:44 +00:00
Bjorn Pettersson
11446b02c7 [VectorCombine] Fix for non-zero addrspace when creating vector load from scalar load
This is a fixup to commit 43bdac2906, to make sure the
address space from the original load pointer is retained in the
vector pointer.

Resolves problem with
  Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
due to address space mismatch.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D85912
2020-08-13 18:25:32 +02:00
Sanjay Patel
cc892fd9f4 [VectorCombine] early exit if target has no vector registers
Based on post-commit discussion in:
D81766

Other vectorization passes (SLP and Loop) use this TTI API similarly.
2020-08-12 09:22:31 -04:00
Sanjay Patel
b0b95dab1c [VectorCombine] add safety check for 0-width register
Based on post-commit discussion in D81766, Hexagon sets this to "0".
I'll see if I can come up with a test, but making the obvious
code fix first to unblock that target.
2020-08-11 20:30:02 -04:00
Dinar Temirbulatov
b1600d8b89 [NFC] Guard the cost report block of debug outputs with NDEBUG and
switch to SmallString, this is part of D57779.
2020-08-11 16:34:47 +02:00
Florian Hahn
0b774acf11 [SLP] Make sure instructions are ordered when computing spill cost.
The entries in VectorizableTree are not necessarily ordered by their
position in basic blocks. Collect them and order them by dominance so
later instructions are guaranteed to be visited first. For instructions
in different basic blocks, we only scan to the beginning of the block,
so their order does not matter, as long as all instructions in a basic
block are grouped together. Using dominance ensures a deterministic order.

The modified test case contains an example where we compute a wrong
spill cost (2) without this patch, even though there is no call between
any instruction in the bundle.

This seems to have limited practical impact, .e.g on X86 with a recent
Intel Xeon CPU with -O3 -march=native -flto on MultiSource,SPEC2000,SPEC2006
there are no binary changes.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D82444
2020-08-11 11:18:12 +02:00
Sanjay Patel
43bdac2906 [VectorCombine] try to create vector loads from scalar loads
This patch was adjusted to match the most basic pattern that starts with an insertelement
(so there's no extract created here). Hopefully, that removes any concern about
interfering with other passes. Ie, the transform should almost always be profitable.

We could make an argument that this could be part of canonicalization, but we
conservatively try not to create vector ops from scalar ops in passes like instcombine.

If the transform is not profitable, the backend should be able to re-scalarize the load.

Differential Revision: https://reviews.llvm.org/D81766
2020-08-09 09:05:06 -04:00
Anton Afanasyev
a7478fab6c [SLP] Fix order of insertelement/insertvalue seed operands
Summary:
This patch takes the indices operands of `insertelement`/`insertvalue`
into account while generation of seed elements for `findBuildAggregate()`.
This function has kept the original order of `insert`s before.
Also this patch optimizes `findBuildAggregate()` preventing it from
redundant temporary vector allocations and its multiple reversing.

Fixes llvm.org/pr44067

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83779
2020-08-06 22:09:24 +03:00
David Green
745bf6cf44 [LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this
case) can take two 128bit vectors, sign extend the inputs to i32,
multiplying them together and sum the result into a 32bit general
purpose register. So taking 16 i8's as inputs, they can multiply and
accumulate the result into a single i32 without any rounding/truncating
along the way. There are also reduction instructions for plain integer
add and min/max, and operations that sum into a pair of 32bit registers
together treated as a 64bit integer (even though MVE does not have a
plain 64bit addition instruction). So giving the vectorizer the ability
to use these instructions both enables us to vectorize at higher
bitwidths, and to vectorize things we previously could not.

In order to do that we need a way to represent that the reduction
operation, specified with a llvm.experimental.vector.reduce when
vectorizing for Arm, occurs inside the loop not after it like most
reductions. This patch attempts to do that, teaching the vectorizer
about in-loop reductions. It does this through a vplan recipe
representing the reductions that the original chain of reduction
operations is replaced by. Cost modelling is currently just done through
a prefersInloopReduction TTI hook (which follows in a later patch).

Differential Revision: https://reviews.llvm.org/D75069
2020-08-06 10:10:50 +01:00
Jordan Rupprecht
3c39db0c44 Revert "[LoopVectorizer] Inloop vector reductions"
This reverts commit e9761688e4. It breaks the build:

```
~/src/llvm-project/llvm/lib/Analysis/IVDescriptors.cpp:868:10: error: no viable conversion from returned value of type 'SmallVector<[...], 8>' to function return type 'SmallVector<[...], 4>'
  return ReductionOperations;
```
2020-08-05 10:24:15 -07:00
David Green
e9761688e4 [LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this
case) can take two 128bit vectors, sign extend the inputs to i32,
multiplying them together and sum the result into a 32bit general
purpose register. So taking 16 i8's as inputs, they can multiply and
accumulate the result into a single i32 without any rounding/truncating
along the way. There are also reduction instructions for plain integer
add and min/max, and operations that sum into a pair of 32bit registers
together treated as a 64bit integer (even though MVE does not have a
plain 64bit addition instruction). So giving the vectorizer the ability
to use these instructions both enables us to vectorize at higher
bitwidths, and to vectorize things we previously could not.

In order to do that we need a way to represent that the reduction
operation, specified with a llvm.experimental.vector.reduce when
vectorizing for Arm, occurs inside the loop not after it like most
reductions. This patch attempts to do that, teaching the vectorizer
about in-loop reductions. It does this through a vplan recipe
representing the reductions that the original chain of reduction
operations is replaced by. Cost modelling is currently just done through
a prefersInloopReduction TTI hook (which follows in a later patch).

Differential Revision: https://reviews.llvm.org/D75069
2020-08-05 18:14:05 +01:00
Bardia Mahjour
3c0f347002 [NFC][LV] Vectorized Loop Skeleton Refactoring
This patch tries to improve readability and maintenance
of createVectorizedLoopSkeleton by reorganizing some lines,
updating some of the comments and breaking it up into
smaller logical units.

Reviewed By: pjeeva01

Differential Revision: https://reviews.llvm.org/D83824
2020-08-04 14:50:57 -04:00
Florian Hahn
98db27711d [LV] Do not check widening decision for instrs outside of loop.
No widening decisions will be computed for instructions outside the
loop. Do not try to get a widening decision. The load/store will be just
a scalar load, so treating at as normal should be fine I think.

Fixes PR46950.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D85087
2020-08-03 10:09:24 +01:00
Vitaly Buka
b0eb40ca39 [NFC] Remove unused GetUnderlyingObject paramenter
Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621
2020-07-31 02:10:03 -07:00
Vitaly Buka
89051ebace [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
David Green
1da0c47fa2 [LoopVectorizer] Don't create unused block masks for reductions. NFC
This removes some unneeded block masks when we don't have any
reductions. It should not have any effect on codegen as the values
created are dead anyway.

Differential Revision: https://reviews.llvm.org/D81415
2020-07-30 14:28:08 +01:00
Simon Pilgrim
cc529285fd VectorUtils.h - reduce unnecessary includes. NFC.
Replace TargetLibraryInfo.h include with forward declaration and fix implicit dependencies.

Reduce SmallSet.h include to SmallVector.h include.
2020-07-30 12:27:49 +01:00
David Sherwood
9ad7c980bb [SVE] Don't consider scalable vector types in SLPVectorizerPass::vectorizeChainsInBlock
In vectorizeChainsInBlock we try to collect chains of PHI nodes
that have the same element type, but the code is relying upon
the implicit conversion from TypeSize -> uint64_t. For now, I have
modified the code to ignore PHI nodes with scalable types.

Differential Revision: https://reviews.llvm.org/D83542
2020-07-29 16:29:19 +01:00