Commit Graph

25896 Commits

Author SHA1 Message Date
Eugene Leviant
7f78d4712f [DebugInfo] Apply subprogram attributes on behalf of owner CU
When using full LTO it is possible that template function definition DIE
is bound to one compilation unit and it's declaration to another. We should
add function declaration attributes on behalf of its owner CU otherwise
we may end up with malformed file identifier in function declaration 
DW_AT_decl_file attribute.

Differential revision: https://reviews.llvm.org/D58538

llvm-svn: 354978
2019-02-27 14:46:59 +00:00
Petar Avramovic
bd39569913 [MIPS GlobalISel] Select G_UADDO
Lower G_UADDO.
Legalize G_UADDO for MIPS32

Differential Revision: https://reviews.llvm.org/D58671

llvm-svn: 354900
2019-02-26 17:22:42 +00:00
Nirav Dave
582d46328c [DAG] Fix constant store folding to handle non-byte sizes.
Avoid crashes from zero-byte values due to sub-byte store sizes.

Reviewers: uabelho, courbet, rnk

Reviewed By: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58626

llvm-svn: 354884
2019-02-26 15:02:32 +00:00
Simon Pilgrim
566177c3d5 [LegalizeDAG] Use APInt::getSplat helper to create bitreverse masks. NFCI.
llvm-svn: 354867
2019-02-26 11:44:23 +00:00
Simon Pilgrim
810fa04ac7 [LegalizeDAG] Expand SADDO/SSUBO using SADDSAT/SSUBSAT (PR37763)
If SADDSAT/SSUBSAT are legal, then we can expand SADDO/SSUBO by performing a ADD/SUB and a SADDO/SSUBO and then compare the results.

I looked at doing this for UADDO/USUBO as well but as we don't have to do as many range comparisons I didn't see any/much benefit.

Differential Revision: https://reviews.llvm.org/D58637

llvm-svn: 354866
2019-02-26 11:27:53 +00:00
Aaron Smith
1d5f8632d7 [CodeView] Emit HasConstructorOrDestructor class option for non-trivial constructors
Reviewers: zturner, rnk, llvm-commits, aleksandr.urakov

Reviewed By: zturner, rnk

Subscribers: jdoerfert, majnemer, asmith

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D44406

llvm-svn: 354841
2019-02-26 03:23:56 +00:00
Matt Arsenault
752579736e RegBankSelect: Handle slightly more complex value mappings
Try to use concat_vectors. Also remove unnecessary assert on
pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers
for AMDGPU.

llvm-svn: 354828
2019-02-25 22:24:13 +00:00
Matt Arsenault
0b14857415 RegisterScavenger: Allow fail without spill
AMDGPU wants to use this in some contexts where
the spilling is either impossible, or a worse alternative
to doing something else.

llvm-svn: 354816
2019-02-25 20:29:04 +00:00
Andrea Di Biagio
4a1e59a6e0 Fix a sign compare warning breaking the -Werror build.
The warning was introduced at r354793.

llvm-svn: 354810
2019-02-25 19:33:58 +00:00
Simon Pilgrim
80d0e9c563 [SelectionDAG] Add demanded elts variants to isConstOrConstSplat helpers. NFCI.
These helpers extend the existing isConstOrConstSplat helper checks to support DemandedElts masks as well.

We already had a local version of this in SelectionDAG that computeKnownBits/ComputeNumSignBits made use of, but this adds the functionality directly to the BuildVectorSDNode node and extends isConstOrConstSplat etc. to use that.

This will allow us to reuse the functionality in SimplifyDemandedVectorElts/SimplifyDemandedBits.

Differential Revision: https://reviews.llvm.org/D58503

llvm-svn: 354797
2019-02-25 16:31:58 +00:00
Simon Pilgrim
28441ac75f [DAGCombine] Add undef shuffle elt support to partitionShuffleOfConcats
Support undef shuffle mask indices in the shuffle(concat_vectors, concat_vectors) -> concat_vectors fold

Differential Revision: https://reviews.llvm.org/D58585

llvm-svn: 354793
2019-02-25 16:02:01 +00:00
Craig Topper
8c9724ea4f [SelectionDAG] Add a OPC_CheckChild2CondCode to SelectionDAGISel to remove a MoveChild and MoveParent pair.
OPC_CheckCondCode is always used as operand 2 of a setcc. And its always surrounded by a MoveChild2 and a MoveParent. By having a dedicated opcode for this case we can reduce the number of bytes needed for this pattern from 4 bytes to 2.

This saves ~3000 bytes in the X86 table.

llvm-svn: 354763
2019-02-25 03:11:44 +00:00
Craig Topper
be3348573e [LegalizeTypes][AArch64][X86] Make type legalization of vector (S/U)ADD/SUB/MULO follow getSetCCResultType for the overflow bits. Make UnrollVectorOverflowOp properly convert from scalar boolean contents to vector boolean contents
Summary:
When promoting the over flow vector for these ops we should use the target's desired setcc result type. This way a v8i32 result type will use a v8i32 overflow vector instead of a v8i16 overflow vector. A v8i16 overflow vector will cause LegalizeDAG/LegalizeVectorOps to have to use v8i32 and truncate to v8i16 in its expansion. By doing this in type legalization instead, we get the truncate into the DAG earlier and give DAG combine more of a chance to optimize it.

We also have to fix unrolling to use the scalar setcc result type for the scalarized operation, and convert it to the required vector element type after the scalar operation. We have to observe the vector boolean contents when doing this conversion. The previous code was just taking the scalar result and putting it in the vector. But for X86 and AArch64 that would have only put a the boolean value in bit 0 of the element and left all other bits in the element 0. We need to ensure all bits in the element are the same. I'm using a select with constants here because that's what setcc unrolling in LegalizeVectorOps used.

Reviewers: spatel, RKSimon, nikic

Reviewed By: nikic

Subscribers: javed.absar, kristof.beyls, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58567

llvm-svn: 354753
2019-02-24 19:23:36 +00:00
Sanjay Patel
cb04ba032f [CGP] add special-cases to form unsigned add with overflow (PR40486)
There's likely a missed IR canonicalization for at least 1 of these
patterns. Otherwise, we wouldn't have needed the pattern-matching
enhancement in D57516.

Note that -- unlike usubo added with D57789 -- the TLI hook for
this transform defaults to 'on'. So if there's any perf fallout
from this, targets should look at how they're lowering the uaddo
node in SDAG and/or override that hook.

The x86 diffs suggest that there's some missing pattern-matching
for forming inc/dec.

This should fix the remaining known problems in:
https://bugs.llvm.org/show_bug.cgi?id=40486
https://bugs.llvm.org/show_bug.cgi?id=31754

llvm-svn: 354746
2019-02-24 15:31:27 +00:00
Craig Topper
dc185522fb [TwoAddressInstructionPass] After commuting an instruction and before trying to look for more commutable operands, resample the number of operands.
The new instruciton might have less operands than the original instruction. If we don't resample, the next loop iteration might read an operand that doesn't exist.

X86 can commute blends to movss/movsd which reduces from 4 operands to 3. This happened in the test case that caused r354363 & company to be reverted. A reduced version of that has been committed here.

Really this whole checking for more commutable operands is a little fragile. It assumes that the new instructions operands are the same order and positions as the original except for the pair that was swapped. I don't know of anything that breaks this assumption today, but I've left a fixme. Fixing this will likely require an interface change.

llvm-svn: 354738
2019-02-23 21:41:44 +00:00
Craig Topper
ccc860cb81 Recommit r354647 and r354648 "[LegalizeTypes] When promoting the result of EXTRACT_SUBVECTOR, also check if the input needs to be promoted. Use that to determine the element type to extract"
r354648 was a follow up to fix a regression "[X86] Add a DAG combine for (aext_vector_inreg (aext_vector_inreg X)) -> (aext_vector_inreg X) to fix a regression from my previous commit."

These were reverted in r354713 as their context depended on other patches that were reverted for a bug.

llvm-svn: 354734
2019-02-23 19:51:32 +00:00
Jordan Rupprecht
6387fa2715 [NFC] Fix typos: preceeding -> preceding
llvm-svn: 354715
2019-02-23 01:28:32 +00:00
Reid Kleckner
e3876637cf Revert r354363 & co "[X86][SSE] Generalize X86ISD::BLENDI support to more value types"
r354363 caused https://crbug.com/934963#c1, which has a plain C reduced
test case.

I also had to revert some dependent changes:
- r354648
- r354647
- r354640
- r354511

llvm-svn: 354713
2019-02-23 01:19:42 +00:00
Craig Topper
62619d064d [LegalizeTypes] Use PromoteTargetBoolean in PromoteIntOp_ADDSUBCARRY instead of reimplementing it. NFCI
llvm-svn: 354710
2019-02-23 00:38:19 +00:00
Daniel Sanders
07cda257f8 Restore ability for C++ API users to Enable IPRA.
Summary:
Prior to r310876 one of our out-of-tree targets was enabling IPRA by modifying
the TargetOptions::EnableIPRA. This no longer works on current trunk since the
useIPRA() hook overrides any values that are set in advance. This patch adjusts
the behaviour of the hook so that API users and useIPRA() can both enable it
but useIPRA() cannot disable it if the API user already enabled it.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D38043

llvm-svn: 354692
2019-02-22 20:59:07 +00:00
Sanjay Patel
ffe1cf5e92 [CGP] move overflow intrinsic insertion to common location; NFCI
We need to enhance the uaddo matching to handle special-cases
as seen in PR40486 and PR31754. That means we won't necessarily
have a def-use pattern, so we'll need to check dominance to
determine where to place the intrinsic (as we already do for
usubo). This preliminary patch is just rearranging the code,
so the planned follow-up to improve uaddo will be more clear.

llvm-svn: 354689
2019-02-22 20:20:24 +00:00
Matt Arsenault
7b55066a34 MIR: Preserve incoming frame index numbers
Don't skip incrementing the frame index number
if the object is dead. Instructions can still be
referencing the old frame index number, and this
doesn't attempt to remap those. The resulting
MIR then fails to load because the use instructions
use a higher frame index number than recorded
list of stack objects.

I'm not sure it's possible to craft a testcase
with the existing set of passes. It requires
selectively marking some stack objects
dead in an essentially random order.
StackSlotColoring condenses towards
the low indexes. This avoids a regression in a
future AMDGPU commit when some frame indexes
are lowered separately from PEI.

llvm-svn: 354688
2019-02-22 19:30:38 +00:00
Matt Arsenault
6d05d6a7b6 CodeGen: Make RegAllocRegistry a template class
Will allow re-using the machinery for independent
sets of register allocators.

This will allow AMDGPU to use separate command line
options for the allocator to use for SGPRs separate
from VGPRs.

llvm-svn: 354687
2019-02-22 19:16:52 +00:00
Guozhi Wei
4c8e480358 [MBP] Factor out function hasViableTopFallthrough and enhancement
This patch factor out the function hasViableTopFallthrough from rotateLoop. It is also enhanced. Original code checks only if there is a block can be placed before current loop top. This patch also checks if the loop top is the most possible successor of its predecessor. The attached test case shows its effect.

Differential Revision: https://reviews.llvm.org/D58393

llvm-svn: 354682
2019-02-22 18:04:37 +00:00
Nirav Dave
46f939c118 Disable big-endian constant store merges from rL354676.
llvm-svn: 354677
2019-02-22 16:20:34 +00:00
Nirav Dave
44037d7a63 [DAGCombine] Fold overlapping constant stores
Fold a smaller constant store into larger constant stores immediately
preceeding it.

Reviewers: rnk, courbet

Subscribers: javed.absar, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58468

llvm-svn: 354676
2019-02-22 16:00:19 +00:00
Craig Topper
fa6187d230 [LegalizeVectorOps] Improve the placement of ANDs in the ExpandLoad path for non-byte-sized loads.
When we need to merge two adjacent loads the AND mask for the low piece was still sized for the full src element size. But we didn't have that many bits. The upper bits are already zero due to the SRL. So we can skip the AND if we're going to combine with the high bits.

We do need an AND to clear out any bits from the high part. We were anding the high part before combining with the low part, but it looks like ANDing after the OR gets better results.

So we can just emit the final AND after the optional concatentation is done. That will handling skipping before the OR and get rid of extra high bits after the OR.

llvm-svn: 354655
2019-02-22 07:03:25 +00:00
Craig Topper
069cf05e87 [LegalizeVectorOps] Simplify the non-byte sized load handling VectorLegalizer::ExpandLoad. NFCI
Remove an if that should always be true. Merge the body of another into the only block that could make the if true.

llvm-svn: 354654
2019-02-22 06:18:33 +00:00
Matt Arsenault
0280a5e143 DAG: Add helper for creating shifts with correct type
llvm-svn: 354649
2019-02-22 03:38:47 +00:00
Craig Topper
be22f329a9 [LegalizeTypes] When promoting the result of EXTRACT_SUBVECTOR, also check if the input needs to be promoted. Use that to determine the element type to extract.
Otherwise we end up creating extract_vector_elts that then each need to have their input promoted. This can lead to truncates needing to be emitted for each of those.

But we already emitted any_extends when we legalized the extract_subvector. So now we have pairs of any_extend+trunc that partially cancel. But depending on how DAGCombiner visits them we can get weird results.

By promoting the input at the same time we can create only a single any_extend or truncate.

There's one regression in the vector-narrow-binop.ll case, but that looks easy to fix with a follow up patch.

llvm-svn: 354647
2019-02-22 01:49:50 +00:00
Sanjay Patel
ba5ee817e9 [DAGCombiner] prevent infinite looping by truncating 'and' (PR40793)
This fold can occur during legalization, so it can fight with promotion
to the larger type. It apparently takes a special sequence and subtarget
to avoid more basic simplifications that would hide the problem.

But there's a bigger question raised here: why does distributeTruncateThroughAnd()
even exist? It duplicates functionality from a more minimal pattern that we
already have. But getting rid of this function requires some preliminary steps.

https://bugs.llvm.org/show_bug.cgi?id=40793

llvm-svn: 354594
2019-02-21 16:01:48 +00:00
Matt Arsenault
8df2f3dab2 RegBankSelect: Allow targets to introduce control flow for mapping
For AMDGPU, if an operand requires an SGPR but is only available as a
VGPR, a loop needs to be introduced to execute the instruction with
each unique combination of values across all lanes. The rest of the
instructions in the block will be moved to a new block following the
loop. Check if the next instruction's parent changed, and update the
iterators and insertion block if this happened.

Tests will be included in a future patch.

llvm-svn: 354591
2019-02-21 15:48:13 +00:00
Clement Courbet
a0321c23e8 Re-land part of r354244 "[DAGCombiner] Eliminate dead stores to stack."
This part introduces the lifetime node.

llvm-svn: 354578
2019-02-21 12:59:36 +00:00
Xin Tong
2d84c00dfa Add skipFunction to PostRA machine sinking pass.
Summary: Add skipFunction to PostRA machine sinking pass.

Reviewers: junbuml

Subscribers: arsenm, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57847

llvm-svn: 354541
2019-02-21 02:11:06 +00:00
Sanjay Patel
198cc305e9 [CGP] match a special-case of unsigned subtract overflow
This is the 'sub0' (negate) pattern from PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754

llvm-svn: 354519
2019-02-20 21:23:04 +00:00
Nirav Dave
48cf37b55c [DAGCombine] Generalize Dead Store to overlapping stores.
Summary:
Remove stores that are immediately overwritten by larger
stores.

Reviewers: courbet, rnk

Reviewed By: rnk

Subscribers: javed.absar, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58467

llvm-svn: 354518
2019-02-20 21:07:50 +00:00
Craig Topper
8d9c224a8c [SelectionDAG] Teach GetDemandedBits to look at the known zeros of the LHS when handling ISD::AND
If the LHS has known zeros, then the RHS immediate mask might have been simplified to remove those bits.

This patch adds a call to computeKnownBits to get the known zeroes to handle that possibility. I left an early out to skip the call if all of the demanded bits are set in the mask.

Differential Revision: https://reviews.llvm.org/D58464

llvm-svn: 354514
2019-02-20 20:52:26 +00:00
Nikita Popov
c3b496de7a [SDAG] Support vector UMULO/SMULO
Second part of https://bugs.llvm.org/show_bug.cgi?id=40442.

This adds an extra UnrollVectorOverflowOp() method to SDAG, because
the general UnrollOverflowOp() method can't deal with multiple results.

Additionally we need to expand UMULO/SMULO during vector op
legalization, as it may result in unrolling, which may need additional
type legalization.

Differential Revision: https://reviews.llvm.org/D57997

llvm-svn: 354513
2019-02-20 20:41:44 +00:00
Craig Topper
f4923db5a3 Revert r354498 "[X86] Add test case to show missed opportunity to remove an explicit AND on the bit position from BT when it has known zeros."
I accidentally committed more than just the test.

llvm-svn: 354499
2019-02-20 18:47:26 +00:00
Craig Topper
f8498a615b [X86] Add test case to show missed opportunity to remove an explicit AND on the bit position from BT when it has known zeros.
If the bit position has known zeros in it, then the AND immediate will likely be optimized to remove bits.

This can prevent GetDemandedBits from recognizing that the AND is unnecessary.

llvm-svn: 354498
2019-02-20 18:45:38 +00:00
Matt Arsenault
75e30c4d5d GlobalISel: Fix fewerElementsVector for ctlz with different result type
Also complete the set of related operations.

llvm-svn: 354480
2019-02-20 16:42:52 +00:00
Matt Arsenault
c4d07554e4 GlobalISel: Implement moreElementsVector for g_insert results
llvm-svn: 354477
2019-02-20 16:11:22 +00:00
Clement Courbet
62b3b91ab2 Re-land the refactoring part of r354244 "[DAGCombiner] Eliminate dead stores to stack."
This is an NFC.

llvm-svn: 354476
2019-02-20 15:45:58 +00:00
David Green
cb5a48b060 [Codegen] Remove dead flags on Physical Defs in machine cse
We may leave behind incorrect dead flags on instructions that are CSE'd. Make
sure we remove the dead flags on physical registers to prevent other incorrect
code motion.

Differential Revision: https://reviews.llvm.org/D58115

llvm-svn: 354443
2019-02-20 10:22:18 +00:00
Mikael Holmen
2d6bb13443 [RegAllocGreedy] Take last chance recoloring into account in split and assign
Summary:
This is a follow-up to r353988 where tryEvict was extended to take last
chance recoloring into account. Now we do the same thing for trySplit and
tryAssign.

Now we always pass a "FixedRegisters" argument to canEvictInterference and
tryEvict so it doesn't need to have a default value anymore.

The need for this was found long ago in an out-of-tree target.
Unfortunately I don't have a reproducer for an in-tree target.

Reviewers: qcolombet, rudkx

Reviewed By: qcolombet, rudkx

Subscribers: rudkx, MatzeB, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58376

llvm-svn: 354439
2019-02-20 07:14:39 +00:00
Chen Zheng
b934fce613 [NFC] add/modify wrapper function for findRegisterDefOperand().
llvm-svn: 354438
2019-02-20 07:01:04 +00:00
Thomas Lively
2e1504091e [WebAssembly] Update MC for bulk memory
Summary:
Rename MemoryIndex to InitFlags and implement logic for determining
data segment layout in ObjectYAML and MC. Also adds a "passive" flag
for the .section assembler directive although this cannot be assembled
yet because the assembler does not support data sections.

Reviewers: sbc100, aardappel, aheejin, dschuff

Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57938

llvm-svn: 354397
2019-02-19 22:56:19 +00:00
Nikita Popov
04e45e9311 [SDAG] Use shift amount type in MULO promotion; NFC
Directly use the correct shift amount type if it is possible, and
future-proof the code against vectors. The added test makes sure that
bitwidths that do not fit into the shift amount type do not assert.

Split out from D57997.

llvm-svn: 354359
2019-02-19 17:37:55 +00:00
Matt Arsenault
b4c95b338b GlobalISel: Implement moreElementsVector for select
llvm-svn: 354354
2019-02-19 17:03:09 +00:00
Matt Arsenault
4d88427a58 GlobalISel: Implement moreElementsVector for G_EXTRACT source
llvm-svn: 354348
2019-02-19 16:44:22 +00:00