Commit Graph

10350 Commits

Author SHA1 Message Date
Craig Topper
18799f4c07 [InstCombine] Allow fptrunc (fpext X)) to be reduced to a single fpext/ftrunc
If we are only truncating bits from the extend we should be able to just use a smaller extend.

If we are truncating more than the extend we should be able to just use a fptrunc since the presense of the fpextend shouldn't affect rounding.

Differential Revision: https://reviews.llvm.org/D43970

llvm-svn: 326595
2018-03-02 18:16:51 +00:00
Yaxun Liu
3c42f1c3c9 LoopUnroll: respect pragma unroll when AllowRemainder is disabled
Currently when AllowRemainder is disabled, pragma unroll count is not
respected even though there is no remainder. This bug causes a loop
fully unrolled in many cases even though the user specifies a unroll
count. Especially it affects OpenCL/CUDA since in many cases a loop
contains convergent instructions and currently AllowRemainder is
disabled for such loops.

Differential Revision: https://reviews.llvm.org/D43826

llvm-svn: 326585
2018-03-02 16:22:32 +00:00
Clement Courbet
c9119b3b6a [MergeICmps] Revert accidentally submitted failing test case.
Reverts r326574.

llvm-svn: 326582
2018-03-02 14:53:33 +00:00
Clement Courbet
1de4a183d4 [MergeIcmps] Add the test case from PR36557.
Summary: See PR36557.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44009

llvm-svn: 326574
2018-03-02 14:34:39 +00:00
Craig Topper
8cb5fc15ee [InstCombine] Add more test case to fpextend.ll.
This includes the test cases from D43970 and additional tests for combining (fptrunc (binop (fpext), (fpext))) where the pre-extended types don't match the trunc and therefore can't be completely removed.

llvm-svn: 326528
2018-03-02 01:34:42 +00:00
Fedor Indutny
1571b1271e [ArgumentPromotion] don't break musttail invariant PR36543
Summary:
Do not break musttail invariant by promoting arguments of musttail
callee or caller.

Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv, fhahn, rnk

Reviewed By: rnk

Subscribers: rnk, llvm-commits

Differential Revision: https://reviews.llvm.org/D43926

llvm-svn: 326521
2018-03-02 00:59:27 +00:00
Craig Topper
113446ca37 [InstCombine] Simplify test cases by removing loads/stores that aren't required for what is being tested.
The loads and stores were getting the data and storing the results. There's no reason we can't just use function arguments and return.

llvm-svn: 326515
2018-03-02 00:27:44 +00:00
Sanjay Patel
d0cdb2f861 [InstCombine] allow fmul fold with less than 'fast'
This is a retry of r326502 with updates to the reassociate 
test file that I missed the first time.

@test15_reassoc in the supposed -reassociate test file 
(except that it tests 2 other passes too...) shows that
there's no clear responsiblity for reassociation transforms.

Instcombine now gets that case, but only because the
constant values are identical. Otherwise, it would still
miss that pattern. 

Reassociate doesn't get that case because it hasn't been 
updated to use less than 'fast' FMF.

llvm-svn: 326513
2018-03-02 00:14:51 +00:00
Sanjay Patel
f2664d0663 [Reassociate] regenerate checks; NFC
llvm-svn: 326511
2018-03-01 23:41:03 +00:00
Sanjay Patel
eb5d046890 revert r326502: [InstCombine] allow fmul fold with less than 'fast'
I forgot that I added tests for 'reassoc' to -reassociate, but
suprisingly that file calls -instcombine too, so it is affected.
I'll update that file and try again.

llvm-svn: 326510
2018-03-01 23:39:24 +00:00
Sanjay Patel
7373ae5c9a [InstCombine] allow fmul fold with less than 'fast'
llvm-svn: 326502
2018-03-01 22:53:47 +00:00
Craig Topper
f1a7c6755d [InstCombine] Auto-generate complete checks. NFC
llvm-svn: 326474
2018-03-01 20:05:07 +00:00
Sanjay Patel
d696e93cb6 [InstCombine] remove stale comments for tests; NFC
llvm-svn: 326448
2018-03-01 16:28:32 +00:00
Sanjay Patel
3fd43a843b [InstCombine] move/add tests for fmul reassociation; NFC
This transform may be out-of-scope for instcombine, 
but this is only documenting the current behavior.

llvm-svn: 326442
2018-03-01 15:30:44 +00:00
Sanjay Patel
237e52674f [InstCombine] auto-generate full checks; NFC
llvm-svn: 326440
2018-03-01 15:13:42 +00:00
Max Kazantsev
f8d2969abb [SCEV] Smart range calculation for SCEVUnknown Phis
The range of SCEVUnknown Phi which merges values `X1, X2, ..., XN`
can be evaluated as `U(Range(X1), Range(X2), ..., Range(XN))`.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D43810

llvm-svn: 326418
2018-03-01 06:56:48 +00:00
Reid Kleckner
3762a089d7 [IPSCCP] do not break musttail invariant (PR36485)
Do not replace results of `musttail` calls with a constant if the
call itself can't be removed.

Do not zap returns of `musttail` callees, if the call site can't be
removed and replaced with a constant.

Do not zap returns of `musttail`-calling blocks, this breaks
invariant too.

Patch by Fedor Indutny

Differential Revision: https://reviews.llvm.org/D43695

llvm-svn: 326404
2018-03-01 01:19:18 +00:00
Reid Kleckner
cb9611ca67 [DAE] don't remove args of musttail target/caller
`musttail` requires identical signatures of caller and callee. Removing
arguments breaks `musttail` semantics.

PR36441

Patch by Fedor Indutny

Differential Revision: https://reviews.llvm.org/D43708

llvm-svn: 326394
2018-03-01 00:09:35 +00:00
Sanjay Patel
eaf5a120ed [InstCombine] simplify code for X * -1.0 --> -X; NFC
I've added random FMF to one of the tests to show those are propagated.

llvm-svn: 326377
2018-02-28 22:30:04 +00:00
Jonas Devlieghere
9ca064552a [GlobalOpt] don't change CC of musttail calle(e|r)
When the function has musttail call - its cc is fixed to be equal to the
cc of the musttail callee. In such case (and in the case of the musttail
callee), GlobalOpt should not change the cc to fastcc as it will break
the invariant.

This fixes PR36546

Patch by: Fedor Indutny (indutny)

Differential revision: https://reviews.llvm.org/D43859

llvm-svn: 326376
2018-02-28 22:28:44 +00:00
Sanjay Patel
356e77f550 [InstCombine] auto-generate complete checks; NFC
llvm-svn: 326331
2018-02-28 16:53:45 +00:00
Xin Tong
8ba674e43b [MergeICmp] Fix a bug in MergeICmp that can lead to a block being processed more than once.
Summary:
Fix a bug in MergeICmp that can lead to a BCECmp block being processed more than once and eventually lead to a broken LLVM module.
The problem is that if the non-constant value is not produced by the last block, the producer will be processed once when the its parent block
is processed and second time when the last block is processed.

We end up having 2 same BCECmpBlock in the merge queue. And eventually lead to a broken LLVM module.

Reviewers: courbet, davide

Reviewed By: courbet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43825

llvm-svn: 326318
2018-02-28 12:08:00 +00:00
Mohammad Shahid
ddeee12f59 [SLP] Added new tests and updated existing for jumbled load, NFC.
llvm-svn: 326303
2018-02-28 04:19:34 +00:00
Sanjay Patel
bf28a8fc01 [InstSimplify] add tests for FP with undef operand; NFC
Are any of these correct?

llvm-svn: 326241
2018-02-27 20:17:18 +00:00
Craig Topper
301991080e [ValueTracking] Teach cannotBeOrderedLessThanZeroImpl to look through ExtractElement.
This is similar to what's done in computeKnownBits and computeSignBits. Don't do anything fancy just collect information valid for any element.

Differential Revision: https://reviews.llvm.org/D43789

llvm-svn: 326237
2018-02-27 19:53:45 +00:00
Sanjay Patel
8529dd5ee1 [ARM] add loop vectorizer test based on 482.sphinx3 from SPEC2006; NFC
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.

llvm-svn: 326221
2018-02-27 18:33:24 +00:00
Sanjay Patel
04d1d79ee5 [AArch64] add SLP test based on TSVC; NFC
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.

llvm-svn: 326217
2018-02-27 18:06:15 +00:00
Florian Hahn
1807c516c7 [NewGVN] Update phi-of-ops def block when updating existing ValuePHI.
In case we update a ValuePHI node created earlier, we could update it
based on a different OpPHI which could be in a different block.
We need to update the TempToBlock mapping reflecting the new block,
otherwise we would end up placing the new phi node in a wrong block.

This problem is exposed by the test case in
https://bugs.llvm.org/show_bug.cgi?id=36504.

This patch fixes a slightly simpler problem than in the bug report. In
the bug's re-producer, the additional problem is that we are re-using a
ValuePHI node with to few incoming values for the new OpPHI. If this
patch makes sense, I will follow it up with a patch that creates a new
PHI node if the existing PHI node has a different number of incoming
values.

Reviewers: davide, dberlin

Reviewed By: dberlin

Differential Revision: https://reviews.llvm.org/D43770

llvm-svn: 326181
2018-02-27 09:34:51 +00:00
Adam Nemet
b424cd5d61 Make test agnostic to cost model
This was causing bot failures on greendragon

llvm-svn: 326169
2018-02-27 05:41:16 +00:00
Evgeny Stupachenko
f1c058d99b Fix r326154 buildbots test fail
Summary:

Add specific mtriples to tests added in r326154.

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 326158
2018-02-27 01:33:11 +00:00
Evgeny Stupachenko
a732611ac8 Fix PR36032, PR35432
Summary:

The change fix an assert fail at ScalarEvolutionExpander.cpp:
  assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count");

Reviewers: sbaranga

Differential Revision: http://reviews.llvm.org/D42604

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 326154
2018-02-27 00:17:31 +00:00
Sanjay Patel
66911b16e6 [InstCombine, InstSimplify] add tests with undef elements in constant FP vectors; NFC
llvm-svn: 326148
2018-02-26 23:23:02 +00:00
Craig Topper
69c8972fd1 [ValueTracking] Teach cannotBeOrderedLessThanZeroImpl to handle vector constants.
Summary: This allows vector fabs to be removed in more cases.

Reviewers: spatel, arsenm, RKSimon

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43739

llvm-svn: 326138
2018-02-26 22:33:17 +00:00
Simon Pilgrim
9929f90740 [X86][SSE] Reduce FADD/FSUB/FMUL costs on later targets (PR36280)
Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.

Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.

Differential Revision: https://reviews.llvm.org/D43733

llvm-svn: 326133
2018-02-26 22:10:17 +00:00
Alexey Bataev
b44e2b75e8 [SLP] Added new test + fixed some checks, NFC.
llvm-svn: 326117
2018-02-26 20:01:24 +00:00
Craig Topper
43fb1cdef7 [InstCombine] Add test cases with vector constants to fpextend.ll
llvm-svn: 326115
2018-02-26 19:36:37 +00:00
Craig Topper
b284e8b9b4 [InstCombine] Switch to using FileCheck instead of grep. Auto-generate checks. NFC
llvm-svn: 326114
2018-02-26 19:36:36 +00:00
Sanjay Patel
31a90468e1 [InstCombine] allow fdiv folds with less than fully 'fast' ops
Note: gcc appears to allow this fold with -freciprocal-math alone, 
but clang/llvm require more than that with this patch. The wording
in the definitions seems fuzzy enough that it could go either way,
but we'll err on the conservative side of FMF interpretation.

This patch also changes the newly created fmul to have FMF propagated
by the last fdiv rather than intersecting the FMF of the fdivs. This
matches the behavior of other folds near here. The new fmul is only 
used to produce an intermediate op for the final fdiv result, so it
shouldn't be any stricter than that result. The previous behavior
could result in dropping FMF via other folds in instcombine or CSE.

Differential Revision: https://reviews.llvm.org/D43398

llvm-svn: 326098
2018-02-26 16:02:45 +00:00
Renato Golin
9d1b2acaaa [LV] Move isLegalMasked* functions from Legality to CostModel
All SIMD architectures can emulate masked load/store/gather/scatter
through element-wise condition check, scalar load/store, and
insert/extract. Therefore, bailing out of vectorization as legality
failure, when they return false, is incorrect. We should proceed to cost
model and determine profitability.

This patch is to address the vectorizer's architectural limitation
described above. As such, I tried to keep the cost model and
vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning
should be done separately.

Please see
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for
RFC and the discussions.

Closes D43208.

Patch by: Hideki Saito <hideki.saito@intel.com>

llvm-svn: 326079
2018-02-26 11:06:36 +00:00
Florian Hahn
ed45836253 [LoopInterchange] Add test case for D43236.
llvm-svn: 326078
2018-02-26 10:46:25 +00:00
Craig Topper
aee341ef28 [InstSimplify] Add test cases for removal of vector fabs on known positive.
llvm-svn: 326050
2018-02-25 06:51:52 +00:00
Craig Topper
2b8f051aaa [InstSimplify] Remove unused parameter from test cases.
llvm-svn: 326049
2018-02-25 06:51:51 +00:00
Adam Nemet
e4e1de60aa Revert "StructurizeCFG: Test for branch divergence correctly"
This reverts commit r325881.

Breaks many bots

llvm-svn: 326037
2018-02-24 17:29:09 +00:00
Sanjay Patel
db53d1847b [InstSimplify] sqrt(X) * sqrt(X) --> X
This was misplaced in InstCombine. We can loosen the FMF as a follow-up step.

llvm-svn: 325965
2018-02-23 22:20:13 +00:00
Sanjay Patel
d32104e1b2 [InstCombine] allow fmul-sqrt folds with less than full -ffast-math
Also, add a Builder method for intrinsics to reduce code duplication for clients.

llvm-svn: 325960
2018-02-23 21:16:12 +00:00
Matt Davis
708271849a [Test] Fix the test to output to /dev/null instead of redirecting.
The redirection was confusing the windows build machine.

llvm-svn: 325937
2018-02-23 19:03:04 +00:00
Matt Davis
523c656e25 [Debug] Add dbg.value intrinsics for PHIs created during LCSSA.
Summary:
This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA.
I noticed a case where a value carried across a loop was reported as <optimized out>.

Specifically this case:
```
int bar(int x, int y) {
  return x + y;
}

int foo(int size) {
  int val = 0;
  for (int i = 0; i < size; ++i) {
    val = bar(val, i);  // Both val and i are correct
  }
  return val; // <optimized out>
}
```

In the above case, after all of the interesting computation completes our value
is reported as "optimized out." This change will add a dbg.value to correct this.

This patch also moves the dbg.value insertion routine from LoopRotation.cpp 
into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA).

Reviewers: mzolotukhin, aprantl, vsk, davide

Reviewed By: aprantl, vsk

Subscribers: dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D42551

llvm-svn: 325926
2018-02-23 17:38:27 +00:00
Nicolai Haehnle
43c1115cd4 StructurizeCFG: Test for branch divergence correctly
Summary:
This fixes cases like the new test @nonuniform. In that test, %cc itself
is a uniform value; however, when reading it after the end of the loop in
basic block %if, its value is effectively non-uniform.

This problem was encountered in
https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change
in itself is not sufficient to fix that bug, as there is another issue
in the AMDGPU backend.

Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4

Reviewers: arsenm, rampitec, jlebar

Subscribers: wdng, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D40546

llvm-svn: 325881
2018-02-23 10:45:46 +00:00
Bjorn Steinbrink
983d6c3f18 Mark MergedLoadStoreMotion as not preserving MemDep results
Summary:
MemDep caches results that signify that a dependence is non-local, and
there is currently no way to invalidate such cache entries.
Unfortunately, when MLSM sinks a store that can result in a non-local
dependence becoming a local one, and then MemDep gives wrong answers.
The easiest way out here is to just say that MLSM does indeed not
preserve MemDep results.

Reviewers: davide, Gerolf

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43177

llvm-svn: 325880
2018-02-23 10:41:57 +00:00
Daniel Neilson
20c9207be3 [AlignmentFromAssumptions] Set source and dest alignments of memory intrinsiscs separately
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment API of
MemoryIntrinsic in favour of getting/setting source & dest specific alignments through
the new API. This allows us to simplify some of the code in this pass and also be more
aggressive about setting the source and destination alignments separately.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

Reviewers: hfinkel, bollu, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D43081

llvm-svn: 325816
2018-02-22 18:55:59 +00:00