clang-p2996

Author	SHA1	Message	Date
Chandler Carruth	67fc52f067	[PM] Port the always inliner to the new pass manager in a much more minimal and boring form than the old pass manager's version. This pass does the very minimal amount of work necessary to inline functions declared as always-inline. It doesn't support a wide array of things that the legacy pass manager did support, but is also ... about 20 lines of code. So it has that going for it. Notably things this doesn't support: - Array alloca merging - To support the above, bottom-up inlining with careful history tracking and call graph updates - DCE of the functions that become dead after this inlining. - Inlining through call instructions with the always_inline attribute. Instead, it focuses on inlining functions with that attribute. The first I've omitted because I'm hoping to just turn it off for the primary pass manager. If that doesn't pan out, I can add it here but it will be reasonably expensive to do so. The second should really be handled by running global-dce after the inliner. I don't want to re-implement the non-trivial logic necessary to do comdat-correct DCE of functions. This means the -O0 pipeline will have to be at least 'always-inline,global-dce', but that seems reasonable to me. If others are seriously worried about this I'd like to hear about it and understand why. Again, this is all solvable by factoring that logic into a utility and calling it here, but I'd like to wait to do that until there is a clear reason why the existing pass-based factoring won't work. The final point is a serious one. I can fairly easily add support for this, but it seems both costly and a confusing construct for the use case of the always inliner running at -O0. This attribute can of course still impact the normal inliner easily (although I find that a questionable re-use of the same attribute). I've started a discussion to sort out what semantics we want here and based on that can figure out if it makes sense to have this complexity at O0 or not. 
One other advantage of this design is that it should be quite a bit faster due to checking for whether the function is a viable candidate for inlining exactly once per function instead of doing it for each call site. Anyways, hopefully a reasonable starting point for this pass. Differential Revision: https://reviews.llvm.org/D23299 llvm-svn: 278896	2016-08-17 02:56:20 +00:00
Zijiao Ma	53d55f45a1	Some places that could use TargetParser in LLVM. NFC. llvm-svn: 278888	2016-08-17 02:08:28 +00:00
Duncan P. N. Exon Smith	ec083b59ed	ARM: Avoid dereferencing end() in ARMFrameLowering::emitPrologue llvm::tryFoldSPUpdateIntoPushPop assumes its arguments are valid MachineInstrs. Update ARMFrameLowering::emitPrologue to respect that; when LastPush==end(), it can't possibly be a push instruction anyway. llvm-svn: 278880	2016-08-17 00:53:04 +00:00
Duncan P. N. Exon Smith	e04fe1a394	Hexagon: Avoid dereferencing end() in HexagonInstrInfo::InsertBranch llvm-svn: 278878	2016-08-17 00:34:00 +00:00
Duncan P. N. Exon Smith	db53d99d02	AMDGPU: Avoid looking for the DebugLoc in end() The end() iterator isn't a safe thing to dereference. Pass the DebugLoc into EmitFetchClause and EmitALUClause to avoid it. llvm-svn: 278873	2016-08-17 00:06:43 +00:00
Konstantin Zhuravlyov	e0b87181cf	[AMDGPU] Remove duplicate initialization of SIDebuggerInsertNops pass Differential Revision: https://reviews.llvm.org/D23556 llvm-svn: 278863	2016-08-16 22:30:11 +00:00
Sanjay Patel	904cd39b05	[x86] Allow merging multiple instances of an immediate within a basic block for code size savings, for 64-bit constants. This patch handles 64-bit constants which can be encoded as 32-bit immediates. It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants. Patch by Sunita Marathe! Differential Revision: https://reviews.llvm.org/D23391 llvm-svn: 278857	2016-08-16 21:35:16 +00:00
Evandro Menezes	5a5b8dcd32	[AArch64] Adjust the scheduling model for Exynos M1. Refine the model for the FP division unit. llvm-svn: 278846	2016-08-16 20:35:01 +00:00
Evandro Menezes	d03aff2e11	[AArch64] Adjust the scheduling model for Exynos M1. Refine the model for the integer division unit. llvm-svn: 278845	2016-08-16 20:34:58 +00:00
Matt Arsenault	7f19298bfa	AMDGPU: Remove excessive padding from ImmOp and RegOp. The structs ImmOp and RegOp are in AArch64AsmParser.cpp (inside anonymous namespace). This diff changes the order of fields and removes the excessive padding (8 bytes). Patch by Alexander Shaposhnikov llvm-svn: 278844	2016-08-16 20:28:06 +00:00
Krzysztof Parzyszek	1d01a79304	[Hexagon] Standardize next batch of pseudo instructions ALIGNA PS_aligna ALLOCA PS_alloca TFR_FI PS_fi TFR_FIA PS_fia TFR_PdFalse PS_false TFR_PdTrue PS_true VMULW PS_vmulw VMULW_ACC PS_vmulw_acc llvm-svn: 278832	2016-08-16 18:08:40 +00:00
Simon Dardis	4893aff94e	[mips] Enforce compact branch restrictions Check both operands for use of the $zero register which cannot be used with a compact branch instruction. Reviewers: dsanders, vkalintris Differential Review: https://reviews.llvm.org/D23547 llvm-svn: 278824	2016-08-16 17:16:11 +00:00
Krzysztof Parzyszek	eabc0d0fd5	[Hexagon] Clean up some miscellaneous V60 intrinsics a bit llvm-svn: 278823	2016-08-16 17:14:44 +00:00
Krzysztof Parzyszek	17aa4136a2	[Hexagon] Standardize vector predicate load/store pseudo instructions - Remove unused instructions: LDriq_pred_vec_V6, STriq_pred_vec_V6, and the 128B counterparts. - Rename: LDriq_pred_V6 PS_vloadrq_ai LDriq_pred_V6_128B PS_vloadrq_ai_128B STriq_pred_V6 PS_vstorerq_ai STriq_pred_V6_128B PS_vstorerq_ai_128B llvm-svn: 278813	2016-08-16 15:43:54 +00:00
Ahmed Bougacha	e4c03abddd	[AArch64][GlobalISel] Select G_MUL. llvm-svn: 278810	2016-08-16 14:37:46 +00:00
Ahmed Bougacha	59e160a19c	[AArch64][GlobalISel] Factor out unsupported binop check. NFC. We're going to need it for G_MUL, and, if other targets end up using something similar, we can easily put it in the generic selector. llvm-svn: 278808	2016-08-16 14:37:40 +00:00
Ahmed Bougacha	2ac5bf94bc	[AArch64][GlobalISel] Select (variable) shifts. For now, no support for immediates. llvm-svn: 278804	2016-08-16 14:02:47 +00:00
Ahmed Bougacha	0306b5ef07	[AArch64][GlobalISel] Select p0 G_FRAME_INDEX. And mark it as legal. llvm-svn: 278802	2016-08-16 14:02:42 +00:00
Pierre Gousseau	051db7d838	[x86] Refactor a PowerPC specific ctlz/srl transformation (NFC). Following the discussion on D22038, this refactors a PowerPC specific setcc -> srl(ctlz) transformation so it can be used by other targets. Differential Revision: https://reviews.llvm.org/D23445 llvm-svn: 278799	2016-08-16 13:53:53 +00:00
Simon Pilgrim	cc316f013a	[X86][SSE] Add support for combining v2f64 target shuffles to VZEXT_MOVL byte rotations The combine was only matching v2i64 as it assumed lowering to MOVQ - but we have v2f64 patterns that match in a similar fashion llvm-svn: 278794	2016-08-16 12:52:06 +00:00
Prakhar Bahuguna	a27c4a0e66	Correct the upper bound for a CBZ/CBNZ branch target. Summary: Fix for the upper bound check that was causing a build failure. Reviewers: olista01, rengolin, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23501 llvm-svn: 278789	2016-08-16 10:41:56 +00:00
Prakhar Bahuguna	15ed7ec5aa	[Thumb] Validate branch target for CBZ/CBNZ instructions. Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D23312 llvm-svn: 278788	2016-08-16 10:41:52 +00:00
Simon Pilgrim	f16cd361d4	[X86][SSE] Add support for combining target shuffles to PALIGNR byte rotations llvm-svn: 278787	2016-08-16 10:03:23 +00:00
Job Noorman	6cd8c9a9d6	[AVR] Fix compile errors Differential Revision: https://reviews.llvm.org/D23450 llvm-svn: 278784	2016-08-16 08:41:35 +00:00
Guy Blank	722caebdae	[X86] Add xgetbv/xsetbv intrinsics to non-windows platforms Differential Revision: https://reviews.llvm.org/D21958 llvm-svn: 278782	2016-08-16 06:41:00 +00:00
Reid Kleckner	229d32abfc	[AMDGPU] Give enum an explicit 64-bit type to fix MSVC 2013 failures Recall that MSVC always gives enums the type 'int', nothing else. MSVC 2015 does not appear to have this problem anymore. Clang-cl -Wmicrosoft-enum-value flags this, FWIW, so now I have a true positive for my warning. :) llvm-svn: 278762	2016-08-15 23:54:44 +00:00
Jan Vesely	0486f739a4	AMDGPU/R600: Convert buffer id to VTX_READ input Use patterns instead of multiple instructions Add buffer id to asm string https://reviews.llvm.org/D22650 llvm-svn: 278749	2016-08-15 21:38:30 +00:00
Matthias Braun	b948c52416	Revert "[Thumb] Validate branch target for CBZ/CBNZ instructions." This currently breaks the greendragon clang-stage1-configure-RA/ and brotli. It is probably just uncovering a pre-existing problem. Reverting temporarily to get the buildbots green again. A reduced testcase will follow shortly. This reverts commit r278659. llvm-svn: 278711	2016-08-15 18:50:13 +00:00
Yaxun Liu	c7cbd72921	AMDGPU: Update AMDGPURuntimeMetadata.h for enums of address space qualifiers llvm-svn: 278682	2016-08-15 16:54:25 +00:00
Matt Arsenault	3661e90e71	AMDGPU: Don't fold subregister extracts into tied operands llvm-svn: 278676	2016-08-15 16:18:36 +00:00
Valery Pykhtin	c761675ef4	[AMDGPU] fix failure on printing of non-existing instruction operands. Differential revision: https://reviews.llvm.org/D23323 llvm-svn: 278665	2016-08-15 10:56:48 +00:00
Sjoerd Meijer	58156715b4	MachineLoop: add methods findLoopControlBlock and findLoopPreheader This adds two new utility functions findLoopControlBlock and findLoopPreheader to MachineLoop and MachineLoopInfo. These functions are refactored and taken from the Hexagon target as they are target independent; thus this is intended to be a non-functional change. Differential Revision: https://reviews.llvm.org/D22959 llvm-svn: 278661	2016-08-15 08:22:42 +00:00
Prakhar Bahuguna	a305a435a6	[Thumb] Validate branch target for CBZ/CBNZ instructions. Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D23312 llvm-svn: 278659	2016-08-15 07:57:44 +00:00
Craig Topper	f774de6d54	[X86] PADDUSB/W instructions should be commutable. llvm-svn: 278654	2016-08-15 06:31:57 +00:00
Craig Topper	80c8b80919	[X86] Mark some of the X86 SDNodes as commutative. llvm-svn: 278653	2016-08-15 04:47:30 +00:00
Craig Topper	dbc387cfc9	[X86] X86ISD::FANDN is not commutative or associative. llvm-svn: 278652	2016-08-15 04:47:28 +00:00
Craig Topper	37e8c5443c	[AVX-512] Mark VPMADDWD as commutable to match SSE/AVX version. llvm-svn: 278629	2016-08-14 17:57:22 +00:00
Craig Topper	c677e97dff	[AVX-512] Add masked commutable floating point max/min instructions to folding tables. llvm-svn: 278628	2016-08-14 17:57:19 +00:00
Craig Topper	29fbdc309a	[AVX-512] Add masked logical operations to memory folding tables. llvm-svn: 278627	2016-08-14 17:57:16 +00:00
Igor Breger	505f2cc468	[AVX512] Fix VFPCLASSSD/VFPCLASSSS intrinsic lowering. The i1 result should be zero extended according to SPEC. Differential Revision: http://reviews.llvm.org/D23489 llvm-svn: 278626	2016-08-14 13:58:57 +00:00
Igor Breger	8672408db0	[AVX512] Fix insertelement i1 lowering. 1. Use shuffle to insert element i1 into vector. The previous implementation was incorrect ( dest_bit OR src_bit , it doesn't clear the bit if src_bit=0 ) 2. Improve shuffle i1 vector, use CVT2MASK if supported instead of TRUNCATE. Differential Revision: http://reviews.llvm.org/D23347 llvm-svn: 278623	2016-08-14 05:25:07 +00:00
Ron Lieberman	822ee88ab8	Fix unsupported relocation type R_HEX_6_X' for symbol .rodata LowerTargetConstantPool is not properly setting the TargetFlag to indicate desired relocation. Coding error, the offset parameter was omitted, so the TargetFlag was used as the offset, and the TargetFlag defaulted to zero. This only affects -fpic compilation, and only those items created in a Constant Pool, for example a vector of constants. Halide ran into this issue. llvm-svn: 278614	2016-08-13 23:41:11 +00:00
Craig Topper	8c372a31b7	[X86] Add a check of isCommutable at the top of X86InstrInfo::findCommutedOpIndices. Most callers don't check if the instruction is commutable before calling. This saves us the trouble of ending up in the default of the switch and having to determine if this is an FMA or not. llvm-svn: 278597	2016-08-13 06:48:44 +00:00
Craig Topper	eafdbecc44	[AVX-512] Add isCommutable to scalar FMA3 instructions. llvm-svn: 278596	2016-08-13 06:48:41 +00:00
Craig Topper	5f2441d8f3	[AVX-512] Add commutable flags to 132 form FMA3 instructions. llvm-svn: 278595	2016-08-13 06:48:39 +00:00
Craig Topper	e5115aa4ca	[X86] Remove patterns for (vzmovl (insert_subvector undef, (scalar_to_vector))) as the (vzmovl VR256) pattern has higher priority. NFC llvm-svn: 278594	2016-08-13 06:02:19 +00:00
Craig Topper	3f8126e6fa	[AVX-512] Remove an AddedComplexity that was prioritizing basic vzmovl patterns over more complex ones that produce better code. llvm-svn: 278593	2016-08-13 05:43:20 +00:00
Craig Topper	600685d510	[AVX-512] Add patterns to support VZEXT_MOVL from 512-bit vectors with 64-bit and 32-bit elements. Fixes PR28961. llvm-svn: 278592	2016-08-13 05:33:12 +00:00
Matt Arsenault	c1ebd82ebe	AMDGPU: Fix not estimating MBB operand sizes correctly llvm-svn: 278590	2016-08-13 01:43:54 +00:00
Matt Arsenault	3cc1e0066d	AMDGPU: Fix missing test for addressing mode with odd offsets Add test if the constant offset looks unaligned. llvm-svn: 278589	2016-08-13 01:43:51 +00:00


38899 Commits