Commit Graph

8317 Commits

Author SHA1 Message Date
Evgeny Stupachenko
fe6f548d2d Fix PR23384 (under "-lsr-insns-cost" option)
Summary:
The patch adds instructions number generated by a solution
 to LSR cost under "-lsr-insns-cost" option.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D28307

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 294821
2017-02-11 02:57:43 +00:00
Ahmed Bougacha
8425f453ef [ARM] Make f16 interleaved accesses expensive.
There are no vldN/vstN f16 variants, even with +fullfp16.
We could use the i16 variants, but, in practice, even with +fullfp16,
the f16 sequence leading to the i16 shuffle usually gets scalarized.
We'd need to improve our support for f16 codegen before getting there.

Teach the cost model to consider f16 interleaved operations as
expensive.  Otherwise, we are all but guaranteed to end up with
a large block of scalarized vector code.

llvm-svn: 294819
2017-02-11 01:53:04 +00:00
Ahmed Bougacha
fc979dc9dd [ARM] Don't lower f16 interleaved accesses.
There are no vldN/vstN f16 variants, even with +fullfp16.
We could use the i16 variants, but, in practice, even with +fullfp16,
the f16 sequence leading to the i16 shuffle usually gets scalarized.
We'd need to improve our support for f16 codegen before getting there.

Reject f16 interleaved accesses.  If we try to emit the f16 intrinsics,
we'll just end up with a selection failure.

llvm-svn: 294818
2017-02-11 01:53:00 +00:00
Ahmed Bougacha
f37fb89edc [ARM] Unique some redundant CHECK lines. NFC.
llvm-svn: 294817
2017-02-11 01:52:57 +00:00
Wei Mi
8f20e63a20 [LSR] Recommit: Allow formula containing Reg for SCEVAddRecExpr related with outerloop.
The recommit includes some changes of testcases. No functional change to the patch.

In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr,
and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser
and dropped.

Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only
handle inner loop now so only %for.body2 will be handled.

Using the logic above, formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped
no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related
with outerloop. Only formula like
reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept
because the SCEVAddRecExpr related with outerloop is folded into the initial value of the
SCEVAddRecExpr related with current loop.

But in some cases, we do need to share the basic induction variable
reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction
variables used by LSR, so we don't want to drop the formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally.

From the existing comment, it tries to avoid considering multiple level loops at the same time.
However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other
than current loop, it is an invariant and will be simple to handle, and the formula doesn't have
to be dropped.

Differential Revision: https://reviews.llvm.org/D26429

llvm-svn: 294814
2017-02-11 00:50:23 +00:00
Yaxun Liu
ba01ed00fe Fix invalid addrspacecast due to combining alloca with global var
For function-scope variables with large initialisation list, FE usually 
generates a global variable to hold the initializer, then generates 
memcpy intrinsic to initialize the alloca. InstCombiner::visitAllocaInst 
identifies such allocas which are accessed only by reading and replaces 
them with the global variable. This is done by casting the global variable 
to the type of the alloca and replacing all references.

However, when the global variable is in a different address space which 
is disjoint with addr space 0 (e.g. for IR generated from OpenCL, 
global variable cannot be in private addr space i.e. addr space 0), casting 
the global variable to addr space 0 results in invalid IR for certain 
targets (e.g. amdgpu).

To fix this issue, when the global variable is not in addr space 0, 
instead of casting it to addr space 0, this patch chases down the uses 
of alloca until reaching the load instructions, then replaces load from 
alloca with load from the global variable. If during the chasing 
bitcast and GEP are encountered, new bitcast and GEP based on the global 
variable are generated and used in the load instructions.

Differential Revision: https://reviews.llvm.org/D27283

llvm-svn: 294786
2017-02-10 21:46:07 +00:00
Dehao Chen
fb02f7140a Encode duplication factor from loop vectorization and loop unrolling to discriminator.
Summary:
This patch starts the implementation as discuss in the following RFC: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html

When optimization duplicates code that will scale down the execution count of a basic block, we will record the duplication factor as part of discriminator so that the offline process tool can find the duplication factor and collect the accurate execution frequency of the corresponding source code. Two important optimization that fall into this category is loop vectorization and loop unroll. This patch records the duplication factor for these 2 optimizations.

The recording will be guarded by a flag encode-duplication-in-discriminators, which is off by default.

Reviewers: probinson, aprantl, davidxl, hfinkel, echristo

Reviewed By: hfinkel

Subscribers: mehdi_amini, anemet, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26420

llvm-svn: 294782
2017-02-10 21:09:07 +00:00
Matthew Simpson
df124a7569 [LV] Remove type restriction for vector phi creation
We previously only created a vector phi node for an induction variable if its
type matched the type of the canonical induction variable.

Differential Revision: https://reviews.llvm.org/D29776

llvm-svn: 294755
2017-02-10 16:15:26 +00:00
Philip Reames
578dafbd8b [LoopUnswitch] Remove BFI usage (dead code)
Chandler mentioned at the last social that the need for BFI in the new pass manager was causing a slight hiccup for this pass.  Given this code has been checked in, but off for over a year, it makes sense to just remove it for now.

Note that there's nothing wrong with the general idea - it's actually a quite good one - and once we have the infrastructure in place to implement this without the full recompuation on every loop, we absolutely should.

llvm-svn: 294715
2017-02-10 06:12:06 +00:00
Michael J. Spencer
788b10ecbc [LoadCombine] Change test to not use instcombine.
llvm-svn: 294682
2017-02-10 00:44:08 +00:00
Davide Italiano
fc0d442cf1 [NewGVN] Fix test so that it doesn't rely on InstCombine anymore.
llvm-svn: 294668
2017-02-09 23:48:10 +00:00
Chandler Carruth
addcda483e [PM] Port ArgumentPromotion to the new pass manager.
Now that the call graph supports efficient replacement of a function and
spurious reference edges, we can port ArgumentPromotion to the new pass
manager very easily.

The old PM-specific bits are sunk into callbacks that the new PM simply
doesn't use. Unlike the old PM, the new PM simply does argument
promotion and afterward does the update to LCG reflecting the promoted
function.

Differential Revision: https://reviews.llvm.org/D29580

llvm-svn: 294667
2017-02-09 23:46:27 +00:00
Peter Collingbourne
17febdbb25 WholeProgramDevirt: Check that VCP candidate functions are defined before evaluating them.
This was crashing before.

llvm-svn: 294666
2017-02-09 23:46:26 +00:00
Sanjay Patel
f38bab73aa [InstCombine] allow (X * C2) << C1 --> X * (C2 << C1) for vectors
This fold already existed for vectors but only when 'C1' was a splat
constant (but 'C2' could be any constant). 

There were no tests for any vector constants, so I'm adding a test
that shows non-splat constants for both operands.  

llvm-svn: 294650
2017-02-09 23:13:04 +00:00
Michael J. Spencer
714d9d22ad [LoadCombine] Fix combining of loads which span an aliasing store.
Fixes PR31517

Differential Revision: https://reviews.llvm.org/D28922

llvm-svn: 294632
2017-02-09 21:46:49 +00:00
Sanjay Patel
ae3b43e488 [InstCombine] use m_APInt to allow demanded bits analysis on splat constants
llvm-svn: 294628
2017-02-09 21:43:06 +00:00
Sanjay Patel
5bcb2d97f0 [InstCombine] add test for demanded bits with splat vector constants; NFC
llvm-svn: 294625
2017-02-09 21:33:19 +00:00
Sanjoy Das
74bda4d591 [JumpThreading] Thread through guards
Summary:
This patch allows JumpThreading also thread through guards.
Virtually, guard(cond) is equivalent to the following construction:

  if (cond) { do something } else {deoptimize}

Yet it is not explicitly converted into IFs before lowering.
This patch enables early threading through guards in simple cases.
Currently it covers the following situation:

  if (cond1) {
    // code A
  } else {
    // code B
  }
  // code C
  guard(cond2)
  // code D

If there is implication cond1 => cond2 or !cond1 => cond2, we can transform
this construction into the following:

  if (cond1) {
    // code A
    // code C
  } else {
    // code B
    // code C
    guard(cond2)
  }
  // code D

Thus, removing the guard from one of execution branches.

Patch by Max Kazantsev!

Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29620

llvm-svn: 294617
2017-02-09 19:40:22 +00:00
Sanjay Patel
b36e1f0223 [InstCombine] add tests for icmp with add nsw; NFC
llvm-svn: 294601
2017-02-09 18:12:39 +00:00
Peter Collingbourne
58c90c0c80 LowerTypeTests: Change a few vtable globals in tests to constants.
It turns out that some of our negative tests were not in fact providing the
test coverage we expected: they were passing because the vtables were failing
an early check that they were constant. Fix this by changing the globals in
these tests to constants.

llvm-svn: 294550
2017-02-09 01:48:24 +00:00
Sanjay Patel
a62bc44f67 [InstCombine] add tests to show information-losing add nsw/nuw transforms; NFC
llvm-svn: 294524
2017-02-08 22:14:11 +00:00
Peter Collingbourne
28ffd3261f ThinLTOBitcodeWriter: Strip debug info from merged module.
This module will contain nothing but vtable definitions and (soon)
available_externally function definitions, so there is no point in keeping
debug info in the module.

Differential Revision: https://reviews.llvm.org/D28913

llvm-svn: 294511
2017-02-08 20:44:00 +00:00
Alexey Bataev
0674fe39e5 [SLP] Additional test to check correct work of horizontal reductions,
NFC.

llvm-svn: 294505
2017-02-08 19:52:46 +00:00
Elena Demikhovsky
5267edd3e3 [Loop Vectorizer] Cost-based decision for vectorization form of memory instruction.
Making the cost model selecting between Interleave, GatherScatter or Scalar vectorization form of memory instruction.
The right decision should be done for non-consecutive memory access instrcuctions that may have more than one vectorization solution.

This patch includes the following changes:
- Cost Model calculates the cost of Load/Store vector form and choose the better option between Widening, Interleave, GatherScactter and Scalarization. Cost Model keeps the widening decision.
- Arrays of Uniform and Scalar values are moved from Legality to Cost Model.
- Cost Model collects Uniforms and Scalars per VF. The collection is based on CM decision map of Loadis/Stores vectorization form.
- Vectorization of memory instruction is performed according to the CM decision.

Differential Revision: https://reviews.llvm.org/D27919

llvm-svn: 294503
2017-02-08 19:25:23 +00:00
Sanjay Patel
d11a03b263 [InstCombine] add test for missed vector icmp fold; NFC
Also, move the related existing scalar test to a renamed file 
where I'm planning to add more icmp-add tests.

llvm-svn: 294487
2017-02-08 17:37:17 +00:00
Igor Laevsky
900ffa34c8 [InstCombineCalls] Unfold element atomic memcpy instruction
Differential Revision: https://reviews.llvm.org/D28909

llvm-svn: 294453
2017-02-08 14:32:04 +00:00
Chandler Carruth
1497710f52 [ArgPromote] Delete a test that makes no sense (any more).
This test is under 'ArgumentPromotion' but there are no arguments that
get promoted in the test case, so there seems to be no point. Also,
there are no assertions about the output at all, so this seems like
something we should just delete given the low value.

llvm-svn: 294428
2017-02-08 08:54:08 +00:00
Chandler Carruth
2af523f20c [ArgPromote] Clean up a crash test case by rinsing it through opt,
renaming things to at least have somewhat spelled out names, and even
have meaningful names where I could guess at what they should be.

Also add FileCheck assertions that we're actually doing what we set out
to do for some of the tests, for example not promoting a type that would
result in infinite promotion.

llvm-svn: 294426
2017-02-08 08:47:35 +00:00
Chandler Carruth
102fa92b4e [ArgPromote] Actually add FileCheck to a test that I actually updated to
have nice CHECK patterns instead of relying on a coarse 'not grep'
check. Sorry that I missed this the first time through.

llvm-svn: 294422
2017-02-08 08:04:02 +00:00
Chandler Carruth
9e44e08953 [ArgPromote] Actually run FileCheck on this test. The CHECK lines are
already there, just waiting to, well, be checked. =]

llvm-svn: 294421
2017-02-08 08:01:14 +00:00
Matt Arsenault
cb3fa37c7e LSR: Check atomic instruction pointer operands
llvm-svn: 294410
2017-02-08 06:44:58 +00:00
Craig Topper
e0ac7f3beb [X86] Remove PCOMMIT instruction support since Intel has deprecated this instruction with no plans to release products with it.
Intel's documentation for the deprecation https://software.intel.com/en-us/blogs/2016/09/12/deprecate-pcommit-instruction

llvm-svn: 294405
2017-02-08 05:45:39 +00:00
Sanjoy Das
ec892139bd [IRCE] Add a missing invariant check
Currently IRCE relies on the loops it transforms to be (semantically) of
the form:

  for (i = START; i < END; i++)
    ...

or

  for (i = START; i > END; i--)
    ...

However, we were not verifying the presence of the START < END entry
check (i.e. check before the first iteration).  We were only verifying
that the backedge was guarded by (i + 1) < END.

Usually this would work "fine" since (especially in Java) most loops do
actually have the START < END check, but of course that is not
guaranteed.

llvm-svn: 294375
2017-02-07 23:59:07 +00:00
Daniel Berlin
439042b7ad Add PredicateInfo utility and printing pass
Summary:
This patch adds a utility to build extended SSA (see "ABCD: eliminating
array bounds checks on demand"), and an intrinsic to support it. This
is then used to get functionality equivalent to propagateEquality in
GVN, in NewGVN (without having to replace instructions as we go). It
would work similarly in SCCP or other passes. This has been talked
about a few times, so i built a real implementation and tried to
productionize it.

Copies are inserted for operands used in assumes and conditional
branches that are based on comparisons (see below for more)

Every use affected by the predicate is renamed to the appropriate
intrinsic result.

E.g.
%cmp = icmp eq i32 %x, 50
br i1 %cmp, label %true, label %false
true:
ret i32 %x
false:
ret i32 1

will become

%cmp = icmp eq i32, %x, 50
br i1 %cmp, label %true, label %false
true:
; Has predicate info
; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 }
%x.0 = call @llvm.ssa_copy.i32(i32 %x)
ret i32 %x.0
false:
ret i23 1

(you can use -print-predicateinfo to get an annotated-with-predicateinfo dump)

This enables us to easily determine what operations are affected by a
given predicate, and how operations affected by a chain of
predicates.

Reviewers: davide, sanjoy

Subscribers: mgorny, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D29519

Update for review comments

Fix a bug Nuno noticed where we are giving information about and/or on edges where the info is not useful and easy to use wrong

Update for review comments

llvm-svn: 294351
2017-02-07 21:10:46 +00:00
Matthew Simpson
3877f397cd [LV] Add new ARM/AArch64 interleaved access cost model tests (NFC)
llvm-svn: 294342
2017-02-07 19:34:24 +00:00
Matthew Simpson
1cd02f13a5 [LV] Simplify ARM/AArch64 interleaved access cost model tests (NFC)
This patch removes unneeded instructions from the existing ARM/AArch64
interleaved access cost model tests. I'll be adding a similar set of tests in a
follow-on patch to increase coverage.

llvm-svn: 294336
2017-02-07 19:17:44 +00:00
Reid Kleckner
828f3179c2 Fix my GVNHoist test case from r294317
llvm-svn: 294319
2017-02-07 17:35:53 +00:00
Reid Kleckner
79e37d517c Revert "[GVNHoist] Merge DebugLoc metadata on hoisted instructions"
This reverts commit r294250. It caused PR31891.

Add a test case that shows that inlinable calls retain location
information with an accurate scope.

llvm-svn: 294317
2017-02-07 17:31:13 +00:00
Dehao Chen
4a9dd70213 Fix the samplepgo indirect call promotion bug: we should not promote a direct call.
Summary: Checking CS.getCalledFunction() == nullptr does not necessary indicate indirect call. We also need to check if CS.getCalledValue() is not a constant.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29570

llvm-svn: 294260
2017-02-06 23:33:15 +00:00
Paul Robinson
383c5c228f Merge DebugLoc on combined stores; in this case, when combining stores
from the end of two blocks, merge instead of arbitrarily picking one.

Differential Revision: http://reviews.llvm.org/D29504

llvm-svn: 294251
2017-02-06 22:19:04 +00:00
Taewook Oh
44a856f7d5 [GVNHoist] Merge DebugLoc metadata on hoisted instructions
Summary:
When instructions are hoisted, current implementation keeps DebugLoc metadata of the instruction that chosen as Repl (and its GEP operand if Repl is a load or a store). However, DebugLoc metadata should be updated to the 'merged' location across all hoisted instructions. See the following example code:


```
  1:  typedef struct {
  2:    int a[10];
  3:  } S1;
  4: 
  5:  extern S1 *s1[10];
  6: 
  7:  void foo(int x, int y, int i) {
  8:    if (y)
  9:      s1[i]->a[i] = x + y;
 10:    else
 11:      s1[i]->a[i] = x;
 12:  }
```

Below is LLVM IR representation of the program before gvn-hoist:


```
%struct.S1 = type { [10 x i32] }
@s1 = external local_unnamed_addr global [10 x %struct.S1*], align 16

define void @foo(i32 %x, i32 %y, i32 %i) !dbg !4 {
entry:
  %tobool = icmp ne i32 %y, 0, !dbg !8
  br i1 %tobool, label %if.then, label %if.else, !dbg !10

if.then:                                          ; preds = %entry
  %add = add nsw i32 %x, %y, !dbg !11
  %idxprom = sext i32 %i to i64, !dbg !12
  %arrayidx = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom, !dbg !12
  %0 = load %struct.S1*, %struct.S1** %arrayidx, align 8, !dbg !12, !tbaa !13
  %a = getelementptr inbounds %struct.S1, %struct.S1* %0, i32 0, i32 0, !dbg !17
  br label %if.end, !dbg !12

if.else:                                          ; preds = %entry
  %idxprom3 = sext i32 %i to i64, !dbg !18
  %arrayidx4 = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom3, !dbg !18
  %1 = load %struct.S1*, %struct.S1** %arrayidx4, align 8, !dbg !18, !tbaa !13
  %a5 = getelementptr inbounds %struct.S1, %struct.S1* %1, i32 0, i32 0, !dbg !19
  br label %if.end

if.end:                                           ; preds = %if.else, %if.then
  %a5.sink = phi [10 x i32]* [ %a5, %if.else ], [ %a, %if.then ]
  %.sink = phi i32 [ %x, %if.else ], [ %add, %if.then ]
  %idxprom6 = sext i32 %i to i64
  %arrayidx7 = getelementptr inbounds [10 x i32], [10 x i32]* %a5.sink, i64 0, i64 %idxprom6
  store i32 %.sink, i32* %arrayidx7, align 4, !tbaa !20
  ret void, !dbg !22
}

```
where


```
!11 = !DILocation(line: 9, column: 18, scope: !9)
!12 = !DILocation(line: 9, column: 5, scope: !9)
!18 = !DILocation(line: 11, column: 5, scope: !9)
!19 = !DILocation(line: 11, column: 9, scope: !9)
```

. And below is after gvn-hoist:


```
define void @foo(i32 %x, i32 %y, i32 %i) !dbg !4 {
entry:
  %tobool = icmp ne i32 %y, 0, !dbg !8
  %idxprom = sext i32 %i to i64, !dbg !10
  %0 = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom, !dbg !10
  %1 = load %struct.S1*, %struct.S1** %0, align 8, !dbg !10, !tbaa !11
  br i1 %tobool, label %if.then, label %if.else, !dbg !15

if.then:                                          ; preds = %entry
  %add = add nsw i32 %x, %y, !dbg !16
  %arrayidx = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom, !dbg !10
  %a = getelementptr inbounds %struct.S1, %struct.S1* %1, i32 0, i32 0, !dbg !17
  br label %if.end, !dbg !10

if.else:                                          ; preds = %entry
  %arrayidx4 = getelementptr inbounds [10 x %struct.S1*], [10 x %struct.S1*]* @s1, i64 0, i64 %idxprom, !dbg !18
  %a5 = getelementptr inbounds %struct.S1, %struct.S1* %1, i32 0, i32 0, !dbg !19
  br label %if.end

if.end:                                           ; preds = %if.else, %if.then
  %a5.sink = phi [10 x i32]* [ %a5, %if.else ], [ %a, %if.then ]
  %.sink = phi i32 [ %x, %if.else ], [ %add, %if.then ]
  %arrayidx7 = getelementptr inbounds [10 x i32], [10 x i32]* %a5.sink, i64 0, i64 %idxprom
  store i32 %.sink, i32* %arrayidx7, align 4, !tbaa !20
  ret void, !dbg !22
}

```
As you see, loads and their GEPs have been hosited from if.then/if.else block to entry block. However, DebugLoc metadata of these new instructions are still same as the instructions in if.then block, as they are moved/cloned from if.then block. This may result incorrect stepping and imprecise sample profile result.

Reviewers: majnemer, pcc, sebpop

Reviewed By: sebpop

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29377

llvm-svn: 294250
2017-02-06 22:05:04 +00:00
Michael Kuperstein
7a86bb2589 [SLP] Revert "Allow using of extra values in horizontal reductions."
This breaks when one of the extra values is also a scalar that
participates in the same vectorization tree which we'll end up
reducing.

llvm-svn: 294245
2017-02-06 21:50:59 +00:00
Peter Collingbourne
e69e73c7b8 IR: Consider two DISubprograms to be odr-equal if they have the same template parameters.
In ValueMapper we create new operands for MDNodes and
rely on MDNode::replaceWithUniqued to create a new MDNode
with the specified operands. However this doesn't always
actually happen correctly for DISubprograms because when we
uniquify the new node, we only odr-compare it with existing nodes
(MDNodeSubsetEqualImpl<DISubprogram>::isDeclarationOfODRMember). Although
the TemplateParameters field can refer to a distinct DICompileUnit via
DITemplateTypeParameter::type -> DICompositeType::scope -> DISubprogram::unit,
it is not currently included in the odr comparison. As a result, we can end
up getting our original DISubprogram back, which means we will have a cloned
module referring to the DICompileUnit in the original module, which causes
a verification error.

The fix I implemented was to consider TemplateParameters to be one of the
odr-equal properties. But I'm a little uncomfortable with this. In general it
seems unsound to rely on distinct MDNodes never being reachable from nodes
which we only check odr-equality of. My only long term suggestion would be
to separate odr-uniquing from full uniquing.

Differential Revision: https://reviews.llvm.org/D29240

llvm-svn: 294240
2017-02-06 21:23:03 +00:00
Sanjay Patel
54656ca7db [ValueTracking] emit a remark when we detect a conflicting assumption (PR31809)
This is a follow-up to D29395 where we try to be good citizens and let the user know that
we've probably gone off the rails.

This should allow us to resolve:
https://llvm.org/bugs/show_bug.cgi?id=31809

Differential Revision: https://reviews.llvm.org/D29404

llvm-svn: 294208
2017-02-06 18:26:06 +00:00
Dehao Chen
c81483d79c Fix the bug of samplepgo indirect call promption when type casting of the return value is needed.
Summary: When type casting of the return value is needed, promoteIndirectCall will return the type casting instruction instead of the direct call. This patch changed to return the direct call instruction instead.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29569

llvm-svn: 294205
2017-02-06 18:10:36 +00:00
Chandler Carruth
044d1e0dcf [ArgPromote] Replace all the grep-based testing with precise FileCheck
tests.

This also removes the use of instcombine as we can max the patterns
produced by argument promotion directly with the more powerful tools in
FileCheck.

llvm-svn: 294174
2017-02-06 08:43:11 +00:00
Davide Italiano
ec49313b11 [IPCP] Don't propagate return value for naked functions.
This is pretty much the same change made in SCCP.

llvm-svn: 294098
2017-02-04 19:44:14 +00:00
Sanjay Patel
0fe32ac256 [InstCombine] treat i1 as a special type in shouldChangeType()
This patch is based on the llvm-dev discussion here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109631.html

Folding to i1 should always be desirable because that's better for value tracking 
and we have special folds for i1 types.

I checked for other users of shouldChangeType() where this might have an effect, 
but we already handle the i1 case differently than other types in all of those cases.

Side note: the default datalayout includes i1, so it seems we only find this gap in 
shouldChangeType + phi folding for the case when there is (1) an explicit datalayout 
without i1, (2) casting to i1 from a legal type, and (3) a phi with exactly 2 incoming
casted operands (as Björn mentioned).

Differential Revision: https://reviews.llvm.org/D29336

llvm-svn: 294066
2017-02-03 23:13:11 +00:00
Sanjay Patel
73fc8ddb06 [InstCombine] fix operand-complexity-based canonicalization (PR28296)
The code comments didn't match the code logic, and we didn't actually distinguish the fake unary (not/neg/fneg) 
operators from arguments. Adding another level to the weighting scheme provides more structure and can help 
simplify the pattern matching in InstCombine and other places.

I fixed regressions that would have shown up from this change in:
rL290067
rL290127

But that doesn't mean there are no pattern-matching logic holes left; some combines may just be missing regression tests.

Should fix:
https://llvm.org/bugs/show_bug.cgi?id=28296

Differential Revision: https://reviews.llvm.org/D27933

llvm-svn: 294049
2017-02-03 21:43:34 +00:00
Sanjay Patel
62f175f01e [InstCombine] auto-generate better test checks; NFC
llvm-svn: 294040
2017-02-03 20:56:38 +00:00