Commit Graph

6841 Commits

Author SHA1 Message Date
Sean Silva
e3bb457423 [PM] Port DeadArgumentElimination to the new PM
The approach taken here follows r267631.

deadarghaX0r should be easy to port when the time comes to add new-PM
support to bugpoint.

llvm-svn: 272507
2016-06-12 09:16:39 +00:00
Sean Silva
f5080194fd [PM] Port ReversePostOrderFunctionAttrs to the new PM
Below are my super rough notes when porting. They can probably serve as
a basic guide for porting other passes to the new PM. As I port more
passes I'll expand and generalize this and make a proper
docs/HowToPortToNewPassManager.rst document. There is also missing
documentation for general concepts and API's in the new PM which will
require some documentation.
Once there is proper documentation in place we can put up a list of
passes that have to be ported and game-ify/crowdsource the rest of the
porting (at least of the middle end; the backend is still unclear).

I will however be taking personal responsibility for ensuring that the
LLD/ELF LTO pipeline is ported in a timely fashion. The remaining passes
to be ported are (do something like
`git grep "<the string in the bullet point below>"` to find the pass):

General Scalar:
[ ] Simplify the CFG
[ ] Jump Threading
[ ] MemCpy Optimization
[ ] Promote Memory to Register
[ ] MergedLoadStoreMotion
[ ] Lazy Value Information Analysis

General IPO:
[ ] Dead Argument Elimination
[ ] Deduce function attributes in RPO

Loop stuff / vectorization stuff:
[ ] Alignment from assumptions
[ ] Canonicalize natural loops
[ ] Delete dead loops
[ ] Loop Access Analysis
[ ] Loop Invariant Code Motion
[ ] Loop Vectorization
[ ] SLP Vectorizer
[ ] Unroll loops

Devirtualization / CFI:
[ ] Cross-DSO CFI
[ ] Whole program devirtualization
[ ] Lower bitset metadata

CGSCC passes:
[ ] Function Integration/Inlining
[ ] Remove unused exception handling info
[ ] Promote 'by reference' arguments to scalars

Please let me know if you are interested in working on any of the passes
in the above list (e.g. reply to the post-commit thread for this patch).
I'll probably be tackling "General Scalar" and "General IPO" first FWIW.

Steps as I port "Deduce function attributes in RPO"
---------------------------------------------------

(note: if you are doing any work based on these notes, please leave a
note in the post-commit review thread for this commit with any
improvements / suggestions / incompleteness you ran into!)

Note: "Deduce function attributes in RPO" is a module pass.

1. Do preparatory refactoring.

Do preparatory factoring. In this case all I had to do was to pull out a static helper (r272503).
(TODO: give more advice here e.g. if pass holds state or something)

2. Rename the old pass class.

llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Rename class ReversePostOrderFunctionAttrs -> ReversePostOrderFunctionAttrsLegacyPass
in preparation for adding a class ReversePostOrderFunctionAttrs as the pass in the new PM.
(edit: actually wait what? The new class name will be
ReversePostOrderFunctionAttrsPass, so it doesn't conflict. So this step is
sort of useless churn).

llvm/include/llvm/InitializePasses.h
llvm/lib/LTO/LTOCodeGenerator.cpp
llvm/lib/Transforms/IPO/IPO.cpp
llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Rename initializeReversePostOrderFunctionAttrsPass -> initializeReversePostOrderFunctionAttrsLegacyPassPass
(note that the "PassPass" thing falls out of `s/ReversePostOrderFunctionAttrs/ReversePostOrderFunctionAttrsLegacyPass/`)
Note that the INITIALIZE_PASS macro is what creates this identifier name, so renaming the class requires this renaming too.

Note that createReversePostOrderFunctionAttrsPass does not need to be
renamed since its name is not generated from the class name.

3. Add the new PM pass class.

In the new PM all passes need to have their
declaration in a header somewhere, so you will often need to add a header.
In this case
llvm/include/llvm/Transforms/IPO/FunctionAttrs.h is already there because
PostOrderFunctionAttrsPass was already ported.
The file-level comment from the .cpp file can be used as the file-level
comment for the new header. You may want to tweak the wording slightly
from "this file implements" to "this file provides" or similar.

Add declaration for the new PM pass in this header:

    class ReversePostOrderFunctionAttrsPass
        : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> {
    public:
      PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM);
    };

Its name should end with `Pass` for consistency (note that this doesn't
collide with the names of most old PM passes). E.g. call it
`<name of the old PM pass>Pass`.

Also, move the doxygen comment from the old PM pass to the declaration of
this class in the header.
Also, include the declaration for the new PM class
`llvm/Transforms/IPO/FunctionAttrs.h` at the top of the file (in this case,
it was already done when the other pass in this file was ported).

Now define the `run` method for the new class.
The main things here are:
a) Use AM.getResult<...>(M) to get results instead of `getAnalysis<...>()`

b) If the old PM pass would have returned "false" (i.e. `Changed ==
false`), then you should return PreservedAnalyses::all();

c) In the old PM getAnalysisUsage method, observe the calls
   `AU.addPreserved<...>();`.

   In the case `Changed == true`, for each preserved analysis you should do
   call `PA.preserve<...>()` on a PreservedAnalyses object and return it.
   E.g.:

       PreservedAnalyses PA;
       PA.preserve<CallGraphAnalysis>();
       return PA;

Note that calls to skipModule/skipFunction are not supported in the new PM
currently, so optnone and optimization bisect support do not work. You can
just drop those calls for now.

4. Add the pass to the new PM pass registry to make it available in opt.

In llvm/lib/Passes/PassBuilder.cpp add a #include for your header.
`#include "llvm/Transforms/IPO/FunctionAttrs.h"`
In this case there is already an include (from when
PostOrderFunctionAttrsPass was ported).

Add your pass to llvm/lib/Passes/PassRegistry.def
In this case, I added
`MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())`
The string is from the `INITIALIZE_PASS*` macros used in the old pass
manager.

Then choose a test that uses the pass and use the new PM `-passes=...` to
run it.
E.g. in this case there is a test that does:
; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S | FileCheck %s
I have added the line:
; RUN: opt < %s -aa-pipeline=basic-aa -passes='require<targetlibinfo>,cgscc(function-attrs),rpo-functionattrs' -S | FileCheck %s
The `-aa-pipeline=basic-aa` and
`require<targetlibinfo>,cgscc(function-attrs)` are what is needed to run
functionattrs in the new PM (note that in the new PM "functionattrs"
becomes "function-attrs" for some reason). This is just pulled from
`readattrs.ll` which contains the change from when functionattrs was ported
to the new PM.
Adding rpo-functionattrs causes the pass that was just ported to run.

llvm-svn: 272505
2016-06-12 07:48:51 +00:00
Eli Friedman
9f8031c2da [MergedLoadStoreMotion] Use correct helper for load hoist safety.
It isn't legal to hoist a load past a call which might not return;
even if it doesn't throw, it could, for example, call exit().

Fixes http://llvm.org/PR27953.

llvm-svn: 272495
2016-06-12 02:11:20 +00:00
Eli Friedman
f1da33e4d3 [LICM] Make isGuaranteedToExecute more accurate.
Summary:
Make isGuaranteedToExecute use the
isGuaranteedToTransferExecutionToSuccessor helper, and make that helper
a bit more accurate.

There's a potential performance impact here from assuming that arbitrary
calls might not return. This probably has little impact on loads and
stores to a pointer because most things alias analysis can reason about
are dereferenceable anyway. The other impacts, like less aggressive
hoisting of sdiv by a variable and less aggressive hoisting around
volatile memory operations, are unlikely to matter for real code.

This also impacts SCEV, which uses the same helper.  It's a minor
improvement there because we can tell that, for example, memcpy always
returns normally. Strictly speaking, it's also introducing
a bug, but it's not any worse than everywhere else we assume readonly
functions terminate.

Fixes http://llvm.org/PR27857.

Reviewers: hfinkel, reames, chandlerc, sanjoy

Subscribers: broune, llvm-commits

Differential Revision: http://reviews.llvm.org/D21167

llvm-svn: 272489
2016-06-11 21:48:25 +00:00
Simon Pilgrim
3fc09f7be6 [CostModel][X86][SSE] Updated costs for vector BITREVERSE ops on SSSE3+ targets
To account for the fast PSHUFB implementation now available

llvm-svn: 272484
2016-06-11 19:23:02 +00:00
Matthew Simpson
12b9c5ba98 Reapply "[TTI] Refine default cost for interleaved load groups with gaps"
This reapplies commit r272385 with a fix. The build was failing when compiled
with gcc, but not with clang. With the fix, we now get the data layout from the
current TTI implementation, which will hopefully solve the issue.

llvm-svn: 272395
2016-06-10 14:33:30 +00:00
Matthew Simpson
65c7b74de4 Revert "[TTI] Refine default cost for interleaved load groups with gaps"
This reverts commit r272385. This commit broke the build. I'm temporarily
reverting to investigate.

llvm-svn: 272391
2016-06-10 12:41:33 +00:00
Matthew Simpson
b16907f17a [TTI] Refine default cost for interleaved load groups with gaps
This patch refines the default cost for interleaved load groups having gaps. If
a load group has gaps, the legalized instructions corresponding to the unused
elements will be dead. Thus, we don't need to account for them in the cost
model. Instead, we only need to account for the fraction of legalized loads
that will actually be used.

Differential Revision: http://reviews.llvm.org/D20873

llvm-svn: 272385
2016-06-10 11:27:51 +00:00
Easwaran Raman
71069cf67d Use ProfileSummaryInfo in inline cost analysis.
Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo.

Differential revision: http://reviews.llvm.org/D21045

llvm-svn: 272321
2016-06-09 22:23:21 +00:00
Easwaran Raman
e12c487b8c [PM] Port LCSSA to the new PM.
Differential Revision: http://reviews.llvm.org/D21090

llvm-svn: 272294
2016-06-09 19:44:46 +00:00
Michael Kuperstein
c5edcdeb0e [LV] Use vector phis for some secondary induction variables
Previously, we materialized secondary vector IVs from the primary scalar IV,
by offseting the primary to match the correct start value, and then broadcasting
it - inside the loop body. Instead, we can use a real vector IV, like we do for
the primary.

This enables using vector IVs for secondary integer IVs whose type matches the
type of the primary.

Differential Revision: http://reviews.llvm.org/D20932

llvm-svn: 272283
2016-06-09 18:03:15 +00:00
Sanjoy Das
c7f69b921f Be wary of abnormal exits from loop when exploiting UB
We can safely rely on a NoWrap add recurrence causing UB down the road
only if we know the loop does not have a exit expressed in a way that is
opaque to ScalarEvolution (e.g. by a function call that conditionally
calls exit(0)).

I believe with this change PR28012 is fixed.

Note: I had to change some llvm-lit tests in LoopReroll, since it looks
like they were depending on this incorrect behavior.

llvm-svn: 272237
2016-06-09 01:13:59 +00:00
Michael Zolotukhin
8e7e76729d [LoopSimplify] Preserve LCSSA when merging exit blocks.
Summary:
This fixes PR26682. Also add LCSSA as a preserved pass to LoopSimplify,
that looks correct to me and allows to write a test for the issue.

Reviewers: chandlerc, bogner, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21112

llvm-svn: 272224
2016-06-08 23:13:21 +00:00
Michael Zolotukhin
987ab631fa [SLPVectorizer] Handle GEP with differing constant index types
Summary:
This fixes PR27617.

Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64.

The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert.

Reviewers: nadav, mzolotukhin

Subscribers: JesperAntonsson, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20685

Patch by Jesper Antonsson!

llvm-svn: 272206
2016-06-08 21:55:16 +00:00
Evgeny Stupachenko
3e2f389a7e The patch set unroll disable pragma when unroll
with user specified count has been applied.

Summary:
Previously SetLoopAlreadyUnrolled() set the disable pragma only if
there was some loop metadata.
Now it set the pragma in all cases. This helps to prevent multiple
unroll when -unroll-count=N is given.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D20765

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 272195
2016-06-08 20:21:24 +00:00
Tim Shen
7aa0ad65ce [MemCpyOpt] Do not exchange llvm.lifetime.start and llvm.memcpy
Reviewers: iteratee

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21087

llvm-svn: 272192
2016-06-08 19:42:32 +00:00
Easwaran Raman
f894b5e89c Use FileCheck instead of grepping for patterns. NFC.
llvm-svn: 272065
2016-06-07 21:46:14 +00:00
Andrey Turetskiy
94c2179550 Quick fix for the test from rL272014 "[LAA] Improve non-wrapping pointer
detection by handling loop-invariant case" (s couple of buildbots failed).

Patch by Roman Shirokiy.

llvm-svn: 272019
2016-06-07 15:52:35 +00:00
Andrey Turetskiy
9f02c58670 [LAA] Improve non-wrapping pointer detection by handling loop-invariant case.
This fixes PR26314. This patch adds new helper “isNoWrap” with detection of
loop-invariant pointer case.

Patch by Roman Shirokiy.

Ref: https://llvm.org/bugs/show_bug.cgi?id=26314

Differential Revision: http://reviews.llvm.org/D17268

llvm-svn: 272014
2016-06-07 14:55:27 +00:00
Simon Pilgrim
db9893fb90 [InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).

If the shift amount is constant we can sometimes convert these instructions to native shifts:

1 - if all shift amounts are in range then the conversion is trivial.
2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion.
3 - logical shifts just return zero if all elements have out of range shift amounts.

In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.

Differential Revision: http://reviews.llvm.org/D19675

llvm-svn: 271996
2016-06-07 10:27:15 +00:00
Simon Pilgrim
91e3ac8293 [InstCombine][SSE] Add MOVMSK constant folding (PR27982)
This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions.

The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs.

Differential Revision: http://reviews.llvm.org/D20998

llvm-svn: 271990
2016-06-07 08:18:35 +00:00
Michael Kuperstein
a0c6ae02a5 [InstCombine] scalarizePHI should not assume the code it sees has been CSE'd
scalarizePHI only looked for phis that have exactly two uses - the "latch"
use, and an extract. Unfortunately, we can not assume all equivalent extracts
are CSE'd, since InstCombine itself may create an extract which is a duplicate
of an existing one. This extends it to handle several distinct extracts from
the same index.

This should fix at least some of the  performance regressions from PR27988.

Differential Revision: http://reviews.llvm.org/D20983

llvm-svn: 271961
2016-06-06 23:38:33 +00:00
Michael Zolotukhin
19edbadfc5 [LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.
In some cases, when simplifying with SCEV, we might consider pointer values as
just usual integer values.  Thus, we might get a different type from what we
had originally in the map of simplified values, and hence we need to check
types before operating on the values.

This fixes PR28015.

llvm-svn: 271931
2016-06-06 19:21:40 +00:00
Geoff Berry
43e5160d0e Reapply [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Originally reviewed in http://reviews.llvm.org/D18001

Reviewers: atrick

Subscribers: llvm-commits, mzolotukhin, mcrosier

Differential Revision: http://reviews.llvm.org/D18480

llvm-svn: 271929
2016-06-06 19:10:46 +00:00
Sanjay Patel
6a333c3ed9 [InstCombine] limit icmp transform to ConstantInt (PR28011)
In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check
above this to work for any Constant rather than ConstantInt. AFAICT, 
that part makes sense if we can determine that the shrunken/extended 
constant remained equal. But it doesn't make sense for this later 
transform where we assume that the constant DID change. 

This could assert for a ConstantExpr:
https://llvm.org/bugs/show_bug.cgi?id=28011

And it could be wrong for a vector as shown in the added regression test.

llvm-svn: 271908
2016-06-06 16:56:57 +00:00
Sanjay Patel
027c469158 regenerate checks
llvm-svn: 271904
2016-06-06 16:03:06 +00:00
Sanjay Patel
70aa568c4e regenerate checks
llvm-svn: 271903
2016-06-06 15:55:00 +00:00
Eli Friedman
ee89505799 LICM: Don't sink stores out of loops that may throw.
Summary:
This hasn't been caught before because it requires noalias or similarly
strong alias analysis to actually reproduce.

Fixes http://llvm.org/PR27952 .

Reviewers: hfinkel, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20944

llvm-svn: 271858
2016-06-05 22:13:52 +00:00
Sanjoy Das
b7e861a488 Add safety check to InstCombiner::commonIRemTransforms
Since FoldOpIntoPhi speculates the binary operation to potentially each
of the predecessors of the PHI node (pulling it out of arbitrary control
dependence in the process), we can FoldOpIntoPhi only if we know the
operation doesn't have UB.

This also brings up an interesting profitability question -- the way it
is written today, commonIRemTransforms will hoist out work from
dynamically dead code into code that will execute at runtime.  Perhaps
that isn't the best canonicalization?

Fixes PR27968.

llvm-svn: 271857
2016-06-05 21:17:04 +00:00
Sanjoy Das
0dcd1d859c Add test case for InstCombiner::commonIRemTransforms; NFC
The PHI case in commonIRemTransforms was untested; add a trivial test
case.

llvm-svn: 271856
2016-06-05 21:17:00 +00:00
Davide Italiano
a1cbc3f8cc [Internalize] Test that __stack_chk_{guard, fail} are not internalized.
r154645 introduced this feature without test. This should have better
coverage now.

llvm-svn: 271853
2016-06-05 19:08:54 +00:00
Sanjoy Das
4d4339d1e8 [PM] Port IndVarSimplify to the new pass manager
Summary:
There are some rough corners, since the new pass manager doesn't have
(as far as I can tell) LoopSimplify and LCSSA, so I've updated the
tests to run them separately in the old pass manager in the lit tests.
We also don't have an equivalent for AU.setPreservesCFG() in the new
pass manager, so I've left a FIXME.

Reviewers: bogner, chandlerc, davide

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20783

llvm-svn: 271846
2016-06-05 18:01:19 +00:00
Sanjoy Das
f90e28d6fd [IndVars] Remove -liv-reduce
It is an off-by-default option that no one seems to use[0], and given
that SCEV directly understands the overflow instrinsics there is no real
need for it anymore.

[0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html

llvm-svn: 271845
2016-06-05 18:01:12 +00:00
Sanjay Patel
0fab306eb5 fix checks
update_test_checks.py got confused matching the variable names. 

llvm-svn: 271844
2016-06-05 17:54:56 +00:00
Sanjay Patel
a6fbc82392 [InstCombine] allow vector icmp bool transforms
llvm-svn: 271843
2016-06-05 17:49:45 +00:00
Sanjay Patel
54d7010627 add tests to show missing vector transforms
llvm-svn: 271842
2016-06-05 17:32:58 +00:00
Sanjay Patel
51dc83c052 regenerate checks
llvm-svn: 271841
2016-06-05 17:29:45 +00:00
Sanjay Patel
009c3da65f update test to use FileCheck
llvm-svn: 271840
2016-06-05 17:13:09 +00:00
Sanjay Patel
f48b909f28 update test to use FileCheck
llvm-svn: 271838
2016-06-05 16:41:20 +00:00
Sanjay Patel
8a3b6d0d8b update test to FileCheck
llvm-svn: 271837
2016-06-05 16:29:15 +00:00
Xinliang David Li
64dbb295b6 [PM] Port GCOVProfiler pass to the new pass manager
llvm-svn: 271823
2016-06-05 05:12:23 +00:00
David Majnemer
2482e1c017 [SimplifyCFG] Don't kill empty cleanuppads with multiple uses
A basic block could contain:
  %cp = cleanuppad []
  cleanupret from %cp unwind to caller

This basic block is empty and is thus a candidate for removal.  However,
there can be other uses of %cp outside of this basic block.  This is
only possible in unreachable blocks.

Make our transform more correct by checking that the pad has a single
user before removing the BB.

This fixes PR28005.

llvm-svn: 271816
2016-06-04 23:50:03 +00:00
Sanjay Patel
ea8a211169 [InstCombine] allow vector constants for cast+icmp fold
This is step 1 of unknown towards fixing PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001

llvm-svn: 271810
2016-06-04 22:04:05 +00:00
Sanjay Patel
8e63999bee [InstCombine] add test for missing vector optimization
llvm-svn: 271808
2016-06-04 21:41:25 +00:00
Sanjay Patel
4c42211c6f [InstCombine] add test for missing vector optimization
llvm-svn: 271806
2016-06-04 21:20:03 +00:00
Sanjay Patel
58a92a327d [InstCombine] minimize test case and use FileCheck
llvm-svn: 271805
2016-06-04 21:04:59 +00:00
Simon Pilgrim
ba319ded5e [Analysis] Enabled BITREVERSE as a vectorizable intrinsic
Allows XOP to vectorize BITREVERSE - other targets will follow as their costmodels improve.

llvm-svn: 271803
2016-06-04 20:21:07 +00:00
Simon Pilgrim
fda22d66fc [InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX
Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614

Requires a minor tweak as llvm.x86.mmx.pmovmskb takes a x86_mmx argument - so we have to be explicit about the implied v8i8 vector type.

llvm-svn: 271789
2016-06-04 13:42:46 +00:00
Sanjay Patel
6cf18af1c5 [InstCombine] look through bitcasts to find selects
There was concern that creating bitcasts for the simpler potential select pattern:

define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) {
  %a2 = add <2 x i64> %a, %a
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %bc = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %a2, %bc
  ret <2 x i64> %and
}

might lead to worse code for some targets, so this patch is matching the larger
patterns seen in the test cases.

The motivating example for this patch is this IR produced via SSE intrinsics in C:

define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) {
  %t0 = bitcast <2 x i64> %a to <4 x i32>
  %t1 = bitcast <2 x i64> %b to <4 x i32>
  %cmp = icmp sgt <4 x i32> %t0, %t1
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %t2 = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %t2, %a
  %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1>
  %neg2 = bitcast <4 x i32> %neg to <2 x i64>
  %and2 = and <2 x i64> %neg2, %b
  %or = or <2 x i64> %and, %and2
  ret <2 x i64> %or
}

For an AVX target, this is currently:

vpcmpgtd  %xmm1, %xmm0, %xmm2
vpand     %xmm0, %xmm2, %xmm0
vpandn    %xmm1, %xmm2, %xmm1
vpor      %xmm1, %xmm0, %xmm0
retq

With this patch, it becomes:

vpmaxsd   %xmm1, %xmm0, %xmm0

Differential Revision: http://reviews.llvm.org/D20774

llvm-svn: 271676
2016-06-03 14:42:07 +00:00
Sanjay Patel
172bf6edd1 [InstCombine] change tests to show a more obvious transform possibility
The original tests were intended to show a missing transform that would
be solved by D20774:
http://reviews.llvm.org/D20774

But it's not clear that the transform for the simpler tests is a win for
all targets. Make the tests show a larger pattern that should be a win
regardless of the cost of bitcast instructions.

llvm-svn: 271603
2016-06-02 22:45:49 +00:00