Commit Graph

91424 Commits

Author SHA1 Message Date
Craig Topper
13cf7cac07 [AVX512] Remove maksed pshufd, pshuflw, and phufhw intrinsics and autoupgrade them to selects and shufflevector.
llvm-svn: 272527
2016-06-13 02:36:48 +00:00
Craig Topper
ea703ae30a [X86] Refactor some of the X86 autoupgrade code to split mask vector and select generation into routines that can be reused for future intrinsic upgrades. NFC
llvm-svn: 272526
2016-06-13 02:36:42 +00:00
Benjamin Kramer
ea76b6fde2 Use 'auto' to avoid implicit copies.
td_type is std::pair<std::string, std::string>, but the map returns
elements of std::pair<const std::string, std::string>. In well-designed
languages like C++ that yields an implicit copy perfectly hidden by
constref's lifetime extension. Just use auto, the typedef obscured the
real type anyways.

Found with a little help from clang-tidy's
performance-implicit-cast-in-loop.

llvm-svn: 272519
2016-06-12 19:02:34 +00:00
Benjamin Kramer
7ab4fe32d7 [Verifier] Simplify code. No functionality change intended.
llvm-svn: 272517
2016-06-12 17:46:23 +00:00
Benjamin Kramer
4ca41fd09e Run clang-tidy's performance-unnecessary-copy-initialization over LLVM.
No functionality change intended.

llvm-svn: 272516
2016-06-12 17:30:47 +00:00
Xinliang David Li
071d0f1807 [MBP] Code cleanup /NFC
This is second patch to clean up the code.

In this patch, the logic to determine block outlinining
is refactored and more comments are added.
 

llvm-svn: 272514
2016-06-12 16:54:03 +00:00
Benjamin Kramer
d3f4c05aea Move instances of std::function.
Or replace with llvm::function_ref if it's never stored. NFC intended.

llvm-svn: 272513
2016-06-12 16:13:55 +00:00
Benjamin Kramer
bdc4956bac Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Sanjay Patel
977530a8c9 [x86, SSE] change patterns for CMPP to float types to allow matching with SSE1 (PR28044)
This patch is intended to solve:
https://llvm.org/bugs/show_bug.cgi?id=28044

By changing the definition of X86ISD::CMPP to use float types, we allow it to be created 
and pass legalization for an SSE1-only target where v4i32 is not legal.

The motivational trail for this change includes:
https://llvm.org/bugs/show_bug.cgi?id=28001

and eventually makes this trigger:
http://reviews.llvm.org/D21190

Ie, after this step, we should be free to have Clang generate FP compare IR instead of x86
intrinsics for SSE C packed compare intrinsics. (We can auto-upgrade and remove the LLVM 
sse.cmp intrinsics as a follow-up step.) Once we're generating vector IR instead of x86
intrinsics, a big pile of generic optimizations can trigger.

Differential Revision: http://reviews.llvm.org/D21235

llvm-svn: 272511
2016-06-12 15:03:25 +00:00
Craig Topper
1067986c5b [X86] Remove sse2 pshufd/pshuflw/pshufhw intrinsics and upgrade them to shufflevector.
llvm-svn: 272510
2016-06-12 14:11:32 +00:00
Benjamin Kramer
bc2f4fb691 [RegUsageInfoCollector] Drop unneccesary const_cast. NFC.
llvm-svn: 272509
2016-06-12 13:32:23 +00:00
Sean Silva
e3bb457423 [PM] Port DeadArgumentElimination to the new PM
The approach taken here follows r267631.

deadarghaX0r should be easy to port when the time comes to add new-PM
support to bugpoint.

llvm-svn: 272507
2016-06-12 09:16:39 +00:00
Amaury Sechet
48b0665bf2 Change () to (void) in the C API.
llvm-svn: 272506
2016-06-12 07:56:21 +00:00
Sean Silva
f5080194fd [PM] Port ReversePostOrderFunctionAttrs to the new PM
Below are my super rough notes when porting. They can probably serve as
a basic guide for porting other passes to the new PM. As I port more
passes I'll expand and generalize this and make a proper
docs/HowToPortToNewPassManager.rst document. There is also missing
documentation for general concepts and API's in the new PM which will
require some documentation.
Once there is proper documentation in place we can put up a list of
passes that have to be ported and game-ify/crowdsource the rest of the
porting (at least of the middle end; the backend is still unclear).

I will however be taking personal responsibility for ensuring that the
LLD/ELF LTO pipeline is ported in a timely fashion. The remaining passes
to be ported are (do something like
`git grep "<the string in the bullet point below>"` to find the pass):

General Scalar:
[ ] Simplify the CFG
[ ] Jump Threading
[ ] MemCpy Optimization
[ ] Promote Memory to Register
[ ] MergedLoadStoreMotion
[ ] Lazy Value Information Analysis

General IPO:
[ ] Dead Argument Elimination
[ ] Deduce function attributes in RPO

Loop stuff / vectorization stuff:
[ ] Alignment from assumptions
[ ] Canonicalize natural loops
[ ] Delete dead loops
[ ] Loop Access Analysis
[ ] Loop Invariant Code Motion
[ ] Loop Vectorization
[ ] SLP Vectorizer
[ ] Unroll loops

Devirtualization / CFI:
[ ] Cross-DSO CFI
[ ] Whole program devirtualization
[ ] Lower bitset metadata

CGSCC passes:
[ ] Function Integration/Inlining
[ ] Remove unused exception handling info
[ ] Promote 'by reference' arguments to scalars

Please let me know if you are interested in working on any of the passes
in the above list (e.g. reply to the post-commit thread for this patch).
I'll probably be tackling "General Scalar" and "General IPO" first FWIW.

Steps as I port "Deduce function attributes in RPO"
---------------------------------------------------

(note: if you are doing any work based on these notes, please leave a
note in the post-commit review thread for this commit with any
improvements / suggestions / incompleteness you ran into!)

Note: "Deduce function attributes in RPO" is a module pass.

1. Do preparatory refactoring.

Do preparatory factoring. In this case all I had to do was to pull out a static helper (r272503).
(TODO: give more advice here e.g. if pass holds state or something)

2. Rename the old pass class.

llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Rename class ReversePostOrderFunctionAttrs -> ReversePostOrderFunctionAttrsLegacyPass
in preparation for adding a class ReversePostOrderFunctionAttrs as the pass in the new PM.
(edit: actually wait what? The new class name will be
ReversePostOrderFunctionAttrsPass, so it doesn't conflict. So this step is
sort of useless churn).

llvm/include/llvm/InitializePasses.h
llvm/lib/LTO/LTOCodeGenerator.cpp
llvm/lib/Transforms/IPO/IPO.cpp
llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Rename initializeReversePostOrderFunctionAttrsPass -> initializeReversePostOrderFunctionAttrsLegacyPassPass
(note that the "PassPass" thing falls out of `s/ReversePostOrderFunctionAttrs/ReversePostOrderFunctionAttrsLegacyPass/`)
Note that the INITIALIZE_PASS macro is what creates this identifier name, so renaming the class requires this renaming too.

Note that createReversePostOrderFunctionAttrsPass does not need to be
renamed since its name is not generated from the class name.

3. Add the new PM pass class.

In the new PM all passes need to have their
declaration in a header somewhere, so you will often need to add a header.
In this case
llvm/include/llvm/Transforms/IPO/FunctionAttrs.h is already there because
PostOrderFunctionAttrsPass was already ported.
The file-level comment from the .cpp file can be used as the file-level
comment for the new header. You may want to tweak the wording slightly
from "this file implements" to "this file provides" or similar.

Add declaration for the new PM pass in this header:

    class ReversePostOrderFunctionAttrsPass
        : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> {
    public:
      PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM);
    };

Its name should end with `Pass` for consistency (note that this doesn't
collide with the names of most old PM passes). E.g. call it
`<name of the old PM pass>Pass`.

Also, move the doxygen comment from the old PM pass to the declaration of
this class in the header.
Also, include the declaration for the new PM class
`llvm/Transforms/IPO/FunctionAttrs.h` at the top of the file (in this case,
it was already done when the other pass in this file was ported).

Now define the `run` method for the new class.
The main things here are:
a) Use AM.getResult<...>(M) to get results instead of `getAnalysis<...>()`

b) If the old PM pass would have returned "false" (i.e. `Changed ==
false`), then you should return PreservedAnalyses::all();

c) In the old PM getAnalysisUsage method, observe the calls
   `AU.addPreserved<...>();`.

   In the case `Changed == true`, for each preserved analysis you should do
   call `PA.preserve<...>()` on a PreservedAnalyses object and return it.
   E.g.:

       PreservedAnalyses PA;
       PA.preserve<CallGraphAnalysis>();
       return PA;

Note that calls to skipModule/skipFunction are not supported in the new PM
currently, so optnone and optimization bisect support do not work. You can
just drop those calls for now.

4. Add the pass to the new PM pass registry to make it available in opt.

In llvm/lib/Passes/PassBuilder.cpp add a #include for your header.
`#include "llvm/Transforms/IPO/FunctionAttrs.h"`
In this case there is already an include (from when
PostOrderFunctionAttrsPass was ported).

Add your pass to llvm/lib/Passes/PassRegistry.def
In this case, I added
`MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())`
The string is from the `INITIALIZE_PASS*` macros used in the old pass
manager.

Then choose a test that uses the pass and use the new PM `-passes=...` to
run it.
E.g. in this case there is a test that does:
; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S | FileCheck %s
I have added the line:
; RUN: opt < %s -aa-pipeline=basic-aa -passes='require<targetlibinfo>,cgscc(function-attrs),rpo-functionattrs' -S | FileCheck %s
The `-aa-pipeline=basic-aa` and
`require<targetlibinfo>,cgscc(function-attrs)` are what is needed to run
functionattrs in the new PM (note that in the new PM "functionattrs"
becomes "function-attrs" for some reason). This is just pulled from
`readattrs.ll` which contains the change from when functionattrs was ported
to the new PM.
Adding rpo-functionattrs causes the pass that was just ported to run.

llvm-svn: 272505
2016-06-12 07:48:51 +00:00
Amaury Sechet
5db224e1f0 Make sure we have a Add/Remove/Has function for various thing that can have attribute.
Summary: This also deprecated the get attribute function familly.

Reviewers: Wallbraker, whitequark, joker.eph, echristo, rafael, jyknight

Subscribers: axw, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19181

llvm-svn: 272504
2016-06-12 06:17:24 +00:00
Sean Silva
adc7939525 Factor out a helper. NFC
Prep for porting to new PM.

llvm-svn: 272503
2016-06-12 05:44:51 +00:00
Craig Topper
c0a5fa0a07 [X86] Pre-allocate some of the shuffle mask SmallVectors in the auto upgrade code instead of calling push_back in a loop. This removes the need to check if the vector needs to grow on each iteration.
llvm-svn: 272501
2016-06-12 04:48:00 +00:00
Craig Topper
251030babe [AVX512] Remove the masked palignr intrinsics that I forgot to remove when I added auto-upgrade code to turn them into shufflevectors and selects.
llvm-svn: 272497
2016-06-12 04:14:13 +00:00
Craig Topper
8a10505f23 [X86] Greatly simplify the llvm.x86.avx.vpermil.* auto-upgrade code. We can fully derive everything using types of the intrinsic arguments rather than writing separate loops for each intrinsic. NFC
llvm-svn: 272496
2016-06-12 03:10:47 +00:00
Eli Friedman
9f8031c2da [MergedLoadStoreMotion] Use correct helper for load hoist safety.
It isn't legal to hoist a load past a call which might not return;
even if it doesn't throw, it could, for example, call exit().

Fixes http://llvm.org/PR27953.

llvm-svn: 272495
2016-06-12 02:11:20 +00:00
Craig Topper
2f5618270b [X86,IR] Make use of the CreateShuffleVector form that takes an ArrayRef<uint32_t> to avoid the need to manually create a bunch of Constants and a ConstantVector. NFC
llvm-svn: 272493
2016-06-12 01:05:59 +00:00
Craig Topper
99d1eab327 [IR] Require ArrayRef of 'uint32_t' instead of 'int' for the mask argument for one of the signatures of CreateShuffleVector. This better emphasises that you can't use it for the -1 as undef behavior.
llvm-svn: 272491
2016-06-12 00:41:19 +00:00
Eli Friedman
f1da33e4d3 [LICM] Make isGuaranteedToExecute more accurate.
Summary:
Make isGuaranteedToExecute use the
isGuaranteedToTransferExecutionToSuccessor helper, and make that helper
a bit more accurate.

There's a potential performance impact here from assuming that arbitrary
calls might not return. This probably has little impact on loads and
stores to a pointer because most things alias analysis can reason about
are dereferenceable anyway. The other impacts, like less aggressive
hoisting of sdiv by a variable and less aggressive hoisting around
volatile memory operations, are unlikely to matter for real code.

This also impacts SCEV, which uses the same helper.  It's a minor
improvement there because we can tell that, for example, memcpy always
returns normally. Strictly speaking, it's also introducing
a bug, but it's not any worse than everywhere else we assume readonly
functions terminate.

Fixes http://llvm.org/PR27857.

Reviewers: hfinkel, reames, chandlerc, sanjoy

Subscribers: broune, llvm-commits

Differential Revision: http://reviews.llvm.org/D21167

llvm-svn: 272489
2016-06-11 21:48:25 +00:00
Simon Pilgrim
3fc09f7be6 [CostModel][X86][SSE] Updated costs for vector BITREVERSE ops on SSSE3+ targets
To account for the fast PSHUFB implementation now available

llvm-svn: 272484
2016-06-11 19:23:02 +00:00
Xinliang David Li
594ffa3d36 [MBP] Code cleanup /NFC
This is one of the patches to clean up the code so that
it is in a better form to make future enhancements easier.

In htis patch, the logic to collect viable successors are
extrated as a helper to unclutter the caller which gets very
large recenty. Also cleaned up BP adjustment code.
 

llvm-svn: 272482
2016-06-11 18:35:40 +00:00
Vikram TV
c702b8b3d7 Delay dominator updation while cloning loop.
Summary:
Dominator updation fails for a loop inserted with a new basicblock.

A block required by DT to set the IDom might not have been cloned yet. This is because there is no predefined ordering of loop blocks (except for the header block which should be the first block in the list).

The patch first creates DT nodes for the cloned blocks and then separately updates the DT in a follow-on loop.

Reviewers: anemet, dberlin

Subscribers: dberlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D20899

llvm-svn: 272479
2016-06-11 16:41:10 +00:00
Simon Pilgrim
5b9bade8dd [X86][SSSE3] Added PSHUFB LUT implementation of BITREVERSE
PSHUFB can speed up BITREVERSE of byte vectors by performing LUT on the low/high nibbles separately and ORing the results. Wider integer vector types are already BSWAP'd beforehand so also make use of this approach.

llvm-svn: 272477
2016-06-11 15:44:13 +00:00
Simon Pilgrim
b13961d25b Strip trailing whitespace. NFCI.
llvm-svn: 272476
2016-06-11 14:34:10 +00:00
Craig Topper
504fba5c8a [AVX512] Lower v8i64 and v16i32 to pshufd when possible.
llvm-svn: 272473
2016-06-11 13:43:21 +00:00
Simon Pilgrim
6800a45790 [X86][SSE] Added PSLLDQ/PSRLDQ as a target shuffle type
Ensure that PALIGNR/PSLLDQ/PSRLDQ are byte vectors so that they can be correctly decoded for target shuffle combining

llvm-svn: 272471
2016-06-11 13:38:28 +00:00
Simon Pilgrim
255fdd0666 [X86][SSE] Use vXi8 return type for PSLLDQ/PSRLDQ instructions
These are byte shift instructions and it will make shuffle combining a lot more straightforward if we can assume a vXi8 vector of bytes so decoded shuffle masks match the return type's number of elements

llvm-svn: 272468
2016-06-11 12:54:37 +00:00
Simon Pilgrim
d386941676 [X86][AVX512] Tidied up VSHUFF32x4/VSHUFF64x2/VSHUFI32x4/VSHUFI64x2 comment generation
Now matches other shuffles

llvm-svn: 272464
2016-06-11 11:18:38 +00:00
Chandler Carruth
4c0e94dce6 Try a bit harder to remove the signed and unsigned comparison warning.
Hopefully this time it actually works and stays away.

llvm-svn: 272463
2016-06-11 09:13:00 +00:00
Chandler Carruth
306e270b83 Compare to an unsigned literal to avoid a -Wsign-compare warning.
llvm-svn: 272459
2016-06-11 08:02:01 +00:00
Chandler Carruth
3ecbd9d287 Use const_cast to cast away constness. This silences a warning.
llvm-svn: 272458
2016-06-11 08:01:57 +00:00
Lang Hames
717eacfcf4 [MCJIT] Update MCJIT and get the fibonacci example working again.
MCJIT will now set the DataLayout on a module when it is added to the JIT,
rather than waiting until it is codegen'd, and the runFunction method will
finalize the module containing the function to be run before running it.

The fibonacci example has been updated to include and link against MCJIT.

llvm-svn: 272455
2016-06-11 05:47:04 +00:00
Craig Topper
40abd1cc61 [AVX512] Add support for lowering v32i16 shuffles with repeated lanes. This allows us to create 512-bit PSHUFLW/PSHUFHW.
llvm-svn: 272450
2016-06-11 03:27:42 +00:00
Craig Topper
b9b86fcfff [AVX512] No need to check for BWI being enabled before lowering v32i16 and v64i8 shuffles. If we get this far the types are already legal which means BWI must be enabled.
llvm-svn: 272449
2016-06-11 03:27:37 +00:00
Matthias Braun
959a8c974d LiveIntervalAnalysis: findLastUseBefore() must ignore undef uses.
undef uses are no real uses of a register and must be ignored by
findLastUseBefore() so that handleMove() does not produce invalid live
intervals in some cases.

This fixed http://llvm.org/PR28083

llvm-svn: 272446
2016-06-11 00:31:28 +00:00
Qin Zhao
bc8fbeacf3 [esan|cfrag] Handle complex GEP instr in the cfrag tool
Summary:
Iterates all (except the first and the last) operands within each GEP
instruction for instrumentation.

Adds test struct_field_gep.ll.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits

Differential Revision: http://reviews.llvm.org/D21242

llvm-svn: 272442
2016-06-10 22:28:55 +00:00
Michael Zolotukhin
b98294d006 Don't try to rotate a loop more than once - we never do this anyway.
Summary:
I can't find a case where we can rotate a loop more than once, and it looks
like we never do this. To rotate a loop following conditions should be met:
1) its header should be exiting
2) its latch shouldn't be exiting

But after the first rotation the header becomes the new latch, so this
condition can never be true any longer.

Tested on with an assert on LNT testsuite and make check.

Reviewers: hfinkel, sanjoy

Subscribers: sebpop, sanjoy, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20181

llvm-svn: 272439
2016-06-10 22:03:56 +00:00
Zachary Turner
97609bb2fd [pdb] Fix issues with pdb writing.
This fixes an alignment issue by forcing all cached allocations
to be 8 byte aligned, and also fixes an issue arising on big
endian systems by writing ulittle32_t's instead of uint32_t's
in the test.

llvm-svn: 272437
2016-06-10 21:47:26 +00:00
Sebastian Pop
e1f60b1fb3 MemorySSA: fix memory access local dominance function for live on entry
A memory access defined on function entry cannot be locally dominated by another memory access.
The patch was split from http://reviews.llvm.org/D19338 which exposes the problem.

Differential Revision: http://reviews.llvm.org/D21039

llvm-svn: 272436
2016-06-10 21:36:41 +00:00
Sanjoy Das
39c226fdba [STLExtras] Introduce and use llvm::count_if; NFC
(This is split out from was D21115)

llvm-svn: 272435
2016-06-10 21:18:39 +00:00
Quentin Colombet
f2a1909bb5 [IRTranslator] Support the translation of or.
Now or instructions get translated into G_OR.

llvm-svn: 272433
2016-06-10 20:50:35 +00:00
Quentin Colombet
13c55e07ed [IRTranslator] Refactor to expose a translateBinaryOp method.
This method will be used for every binary operation.

NFC.

llvm-svn: 272431
2016-06-10 20:50:18 +00:00
Chad Rosier
d1f6c840ee [AArch64] Move comments closer to relevant check. NFC.
llvm-svn: 272430
2016-06-10 20:49:18 +00:00
Chad Rosier
c5083c2ccf [AArch64] Refactor a check earlier. NFC.
llvm-svn: 272429
2016-06-10 20:47:14 +00:00
Sanjay Patel
b114fd65fc [x86] enable bitcasted fabs/fneg transforms
The vector cases don't change because we already have folds in X86ISelLowering
to look through and remove bitcasts.

llvm-svn: 272427
2016-06-10 20:33:50 +00:00
Etienne Bergeron
2e50fedb2c [CodeGen] Fix PrologEpilogInserter to avoid duplicate allocation of SEH structs
Summary:
When stack-protection is activated and WinEH exceptions is used, 
the EHRegNode (exception handling registration) is allocated twice on the stack.

This was not breaking anything except loosing space on the stack.

```
D:\src\llvm\examples>llc exc2.ll  -debug-only=pei
alloc FI(0) at SP[-24]
alloc FI(1) at SP[-48]   <<-- Allocated
alloc FI(1) at SP[-72]   <<-- Allocated twice!?
alloc FI(2) at SP[-76]
alloc FI(4) at SP[-80]
alloc FI(3) at SP[-84]
```

Reviewers: rnk, majnemer

Subscribers: chrisha, llvm-commits

Differential Revision: http://reviews.llvm.org/D21188

llvm-svn: 272426
2016-06-10 20:24:38 +00:00