clang-p2996

Author	SHA1	Message	Date
Simon Pilgrim	89b89650f3	[SelectionDAG] Attempt to split BITREVERSE vector legalization into BSWAP and BITREVERSE stages For BITREVERSE, bit shifting/masking every bit in a vector element is a very lengthy procedure. If the input vector type is a whole multiple of bytes wide then we can split this into a BSWAP shuffle stage (to reverse at the byte level) and then a BITREVERSE stage applied to each byte. Most vector capable targets can efficiently BSWAP using shuffles resulting in a considerable reduction in instructions. With this patch targets would only need to implement a target specific vXi8 BITREVERSE implementation to efficiently reverse most legal vector types. Differential Revision: http://reviews.llvm.org/D19978 llvm-svn: 269290	2016-05-12 13:09:49 +00:00
Justin Bogner	b3534c494f	SDAG: Have SelectNodeTo replace uses if it CSE's instead of morphing a node It's awkward to force callers of SelectNodeTo to figure out whether the node was morphed or CSE'd. Update uses here instead of requiring callers to (sometimes) do it. llvm-svn: 269235	2016-05-11 21:00:33 +00:00
Sanjay Patel	87f6ed6f48	fix typos in comments; NFC llvm-svn: 269206	2016-05-11 17:00:07 +00:00
Justin Bogner	1df01f0e31	SDAG: Make SelectCodeCommon return void This means SelectCode unconditionally returns nullptr now. I'll follow up with a change to make that return void as well, but it seems best to keep that one very mechanical. This is part of the work to have Select return void instead of an SDNode *, which is in turn part of llvm.org/pr26808. llvm-svn: 269136	2016-05-10 22:58:26 +00:00
Marcin Koscielnicki	bbac890b53	[PR27599] [SystemZ] [SelectionDAG] Fix extension of atomic cmpxchg result. Currently, SelectionDAG assumes 8/16-bit cmpxchg returns either a sign extended result, or a zero extended result. SystemZ takes a third option by returning junk in the high bits (rotated contents of the other bytes in the memory word). In that case, don't use Assert*ext, and zero-extend the result ourselves if a comparison is needed. Differential Revision: http://reviews.llvm.org/D19800 llvm-svn: 269075	2016-05-10 16:49:04 +00:00
Sanjay Patel	91592568f9	[TargetLowering] make helper function for SetCC + and optimizations (NFC) After looking at D19087 again, it occurred to me that we can do better. If we consolidate the valueHasExactlyOneBitSet() transforms, we won't incur extra overhead from calling it a 2nd time, and we can shrink SimplifySetCC() a bit. No functional change intended. Differential Revision: http://reviews.llvm.org/D20050 llvm-svn: 268932	2016-05-09 16:42:50 +00:00
Simon Pilgrim	ed39d150f5	Fix unused variable warning. llvm-svn: 268867	2016-05-07 20:19:59 +00:00
Simon Pilgrim	b6f82c449a	[SelectionDAG] Added bitreverse(bitreverse(v)) --> v Added bitreverse creation testing llvm-svn: 268865	2016-05-07 20:12:36 +00:00
Sanjay Patel	c2751e7050	[x86, BMI] add TLI hook for 'andn' and use it to simplify comparisons For the sake of minimalism, this patch is x86 only, but I think that at least PPC, ARM, AArch64, and Sparc probably want to do this too. We might want to generalize the hook and pattern recognition for a target like PPC that has a full assortment of negated logic ops (orc, nand). Note that http://reviews.llvm.org/D18842 will cause this transform to trigger more often. For reference, this relates to: https://llvm.org/bugs/show_bug.cgi?id=27105 https://llvm.org/bugs/show_bug.cgi?id=27202 https://llvm.org/bugs/show_bug.cgi?id=27203 https://llvm.org/bugs/show_bug.cgi?id=27328 Differential Revision: http://reviews.llvm.org/D19087 llvm-svn: 268858	2016-05-07 15:03:40 +00:00
Justin Bogner	c45c960006	SDAG: Don't leave dangling dead nodes after SelectCodeCommon Relying on the caller to clean up after we've replaced all uses of a node won't work when we've migrated to the `void Select(...)` API. llvm-svn: 268774	2016-05-06 18:42:16 +00:00
Ahmed Bougacha	16547c4e31	[CodeGen] Round [SU]INT_TO_FP result when promoting from f16. If we don't, values that aren't precisely representable in f16 could be used as-is in a promoted f32 operation, which would produce incorrect results. AArch64 had the correct behavior; add a focused test. Fixes http://llvm.org/PR26871 llvm-svn: 268700	2016-05-06 00:58:00 +00:00
Justin Bogner	b012699741	SDAG: Rename Select->SelectImpl and repurpose Select as returning void This is a step towards removing the rampant undefined behaviour in SelectionDAG, which is a part of llvm.org/PR26808. We rename SelectionDAGISel::Select to SelectImpl and update targets to match, and then change Select to return void and consolidate the sketchy behaviour we're trying to get away from there. Next, we'll update backends to implement `void Select(...)` instead of SelectImpl and eventually drop the base Select implementation. llvm-svn: 268693	2016-05-05 23:19:08 +00:00
Justin Bogner	465886ece1	SDAG: Remove OPC_MarkGlueResults and associated logic. NFC This opcode never happens in practice, and yet the logic we have in place to handle it would be undefined behaviour if we ever executed it. Remove it rather than trying to refactor code that's never reached. llvm-svn: 268692	2016-05-05 22:37:45 +00:00
Sanjay Patel	c91351c2b7	clean up; NFCI llvm-svn: 268564	2016-05-04 22:39:36 +00:00
Simon Pilgrim	1f5ad702f8	[SelectionDAG] BITREVERSE vector legalization of bit operations (REAPPLIED) Some vector bit operations are promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use a new TLI helper isOperationLegalOrCustomOrPromote instead, allowing the SSE implementations to stay on the simd unit. Differential Revision: http://reviews.llvm.org/D19805 llvm-svn: 268561	2016-05-04 22:08:51 +00:00
Simon Pilgrim	1a14f0d25c	Revert r268504 llvm-svn: 268526	2016-05-04 17:49:14 +00:00
Simon Pilgrim	b97c06210b	[SelectionDAG] BITREVERSE vector legalization of bit operations Vector bit operations are typically promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use isOperationLegalOrPromote instead, allowing the SSE implementations to stay on the simd unit. Differential Revision: http://reviews.llvm.org/D19805 llvm-svn: 268504	2016-05-04 15:01:13 +00:00
Craig Topper	3fc0e668ff	[CodeGen] Add some space optimized forms of EmitNode and MorphNodeTo that implicitly indicate the number of result VTs. This shaves about 16K off the X86 matching table taking it down to about 470K. Overall this reduces the llc binary size with all in-tree targets by about 40K. llvm-svn: 268365	2016-05-03 05:54:13 +00:00
Wolfgang Pieb	56aa4b0629	DebugInfo: Avoid propagating incorrect debug locations in SelectionDAG via CSE. Summary: When SelectionDAG performs CSE it is possible that the context's source location is different from that of the selected node. This can lead to incorrect line number records. We update the debug location to the one that occurs earlier in the instruction sequence. This fixes PR21006. Reviewers: echristo, sdmitrouk Subscribers: jevinskie, asl, llvm-commits Differential Revision: http://reviews.llvm.org/D12094 llvm-svn: 268323	2016-05-02 22:50:51 +00:00
Eric Christopher	94a9ee65c6	Fix grammar and correct comment - the debug information wasn't incorrect, rather suboptimal. llvm-svn: 268211	2016-05-02 05:30:26 +00:00
Craig Topper	e3c1e225d7	[CodeGen] Add OPC_MoveChild0-OPC_MoveChild7 opcodes to isel matching tables to optimize table size. Shaves about 12K off the X86 matcher table. llvm-svn: 268209	2016-05-02 01:53:30 +00:00
Igor Breger	110af565c7	getelementptr instruction, support index vector of EVT. Differential Revision: http://reviews.llvm.org/D19775 llvm-svn: 268195	2016-05-01 13:29:12 +00:00
Matt Arsenault	ab2232cf73	DAGCombiner: Reduce truncated shl width llvm-svn: 268094	2016-04-29 19:53:16 +00:00
Simon Pilgrim	464f1f3bea	Use SelectionDAG::getTargetConstant* helper functions. NFC. Instead of SelectionDAG::getConstant directly to make it more obvious that we're creating target constants. llvm-svn: 268074	2016-04-29 17:42:45 +00:00
Filipe Cabecinhas	0da9937517	Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them. Summary: Historically, we had a switch in the Makefiles for turning on "expensive checks". This has never been ported to the cmake build, but the (dead-ish) code is still around. This will also make it easier to turn it on in buildbots. Reviewers: chandlerc Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D19723 llvm-svn: 268050	2016-04-29 15:22:48 +00:00
Gerolf Hoflehner	50426191d7	[DAGCombiner] Follow coding convention for function name (NFC) llvm-svn: 267745	2016-04-27 17:27:16 +00:00
Nico Weber	e69b9548b8	Revert r267649, it caused PR27539. llvm-svn: 267723	2016-04-27 15:16:54 +00:00
Cong Hou	6f879d9eb1	Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649	2016-04-27 01:29:18 +00:00
Ahmed Bougacha	128f8732a5	[CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI. Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606	2016-04-26 21:15:30 +00:00
Marcin Koscielnicki	1c1af6ef77	[PR27390] [CodeGen] Reject indexed loads in CombinerDAG. visitAND, when folding and (load) forgets to check which output of an indexed load is involved, happily folding the updated address output on the following testcase: target datalayout = "e-m:e-i64:64-n32:64" target triple = "powerpc64le-unknown-linux-gnu" %typ = type { i32, i32 } define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) { %b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1 %1 = load i32, i32* %b, align 4 %2 = ptrtoint i32* %b to i64 %3 = and i64 %2, -35184372088833 %4 = inttoptr i64 %3 to i32* %_msld = load i32, i32* %4, align 4 %zzz = add i32 %1, %_msld ret i32 %zzz } Fix this by checking ResNo. I've found a few more places that currently neglect to check for indexed load, and tightened them up as well, but I don't have test cases for them. In fact, they might not be triggerable at all, at least with current targets. Still, better safe than sorry. Differential Revision: http://reviews.llvm.org/D19202 llvm-svn: 267420	2016-04-25 15:43:44 +00:00
Gerolf Hoflehner	01b3a6184a	[MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098) The original patch caused crashes because it could derefence a null pointer for SelectionDAGTargetInfo for targets that do not define it. Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The targets decides whether the machine combiner should optimize for throughput or latency. - Supports for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267328	2016-04-24 05:14:01 +00:00
Craig Topper	36c133159a	[CodeGen] Teach DAG combine to fold select_cc seteq X, 0, sizeof(X), ctlz_zero_undef(X) -> ctlz(X). InstCombine already does this for IR and X86 pattern matches this during isel. A follow up commit will remove the X86 patterns to allow this to be tested. llvm-svn: 267325	2016-04-24 04:38:32 +00:00
Craig Topper	7e5fad66f3	[CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part. llvm-svn: 267280	2016-04-23 05:20:47 +00:00
Matt Arsenault	3b748d76f6	DAGCombiner: Relax alignment restriction when changing store type If the target allows the alignment, this should be OK. llvm-svn: 267217	2016-04-22 21:01:41 +00:00
Matt Arsenault	629d12de70	DAGCombiner: Relax alignment restriction when changing load type If the target allows the alignment, this should still be OK. llvm-svn: 267209	2016-04-22 20:21:36 +00:00
Daniel Sanders	591c379563	Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64 It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127	2016-04-22 09:37:26 +00:00
Gerolf Hoflehner	b32f11fc62	[MachineCombiner] Support for floating-point FMA on ARM64 Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The targets decides whether the machine combiner should optimize for throughput or latency. - Supports for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267098	2016-04-22 02:15:19 +00:00
Matt Arsenault	7846d885ed	LegalizeDAG: Move unaligned load/store expansion to TLI When custom lowered, this is not called if the store is custom lowered. Move it to be a utility function so targets can easily expand unaligned accesses when custom lowering. llvm-svn: 267029	2016-04-21 18:19:11 +00:00
Matt Arsenault	8d1052f55c	DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component If the extracted bits are restricted to the upper half or lower half, this can be truncated. llvm-svn: 267024	2016-04-21 18:03:06 +00:00
Craig Topper	52cb5ec36f	[SelectionDAG] Teach LegalizeVectorOps to directly Expand CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to CTTZ/CTLZ directly if those ops are Legal/Custom instead of deferring it to LegalizeOps. This is needed to support CTTZ/CTLZ Custom correctly since LegalizeOps would be too late to do the custom lowering. llvm-svn: 266951	2016-04-21 04:43:57 +00:00
Tim Shen	a1d8bc5597	[PPC, SSP] Support PowerPC Linux stack protection. llvm-svn: 266809	2016-04-19 20:14:52 +00:00
Tim Shen	e885d5e4d3	[SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD With this change, ideally IR pass can always generate llvm.stackguard call to get the stack guard; but for now there are still IR form stack guard customizations around (see getIRStackGuard()). Future SSP customization should go through LOAD_STACK_GUARD. There is a behavior change: stack guard values are not CSEed anymore, since we should never reuse the value in case that it has been spilled (and corrupted). See ssp-guard-spill.ll. This also cause the change of stack size and codegen in X86 and AArch64 test cases. Ideally we'd like to know if the guard created in llvm.stackprotector() gets spilled or not. If the value is spilled, discard the value and reload stack guard; otherwise reuse the value. This can be done by teaching register allocator to know how to rematerialize LOAD_STACK_GUARD and force a rematerialization (which seems hard), or check for spilling in expandPostRAPseudo. It only makes sense when the stack guard is a global variable, which requires more instructions to load. Anyway, this seems to go out of the scope of the current patch. llvm-svn: 266806	2016-04-19 19:40:37 +00:00
Mehdi Amini	b550cb1750	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' \| xargs grep -L 'IndexedMap[<]' \| xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Hans Wennborg	07f6d3a893	Switch lowering: don't add incoming PHI values from skipped bit test MBB's (PR27135) After r245976, LLVM will skip the last bit test case if knows it will always be true. However, we would still erroneously update PHI nodes with incoming values from the MBB that would perform the final bit test, causing -verify-machineinstrs to fail. llvm-svn: 266479	2016-04-15 21:45:30 +00:00
Hans Wennborg	c944c13dc1	SelectionDAGISel: rangeify a loop llvm-svn: 266478	2016-04-15 21:45:09 +00:00
Reid Kleckner	28865809fe	Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h MachineInstr.h and MachineInstrBuilder.h are very popular headers, widely included across all LLVM backends. It turns out that there only a handful of TUs that actually care about DI operands on MachineInstrs. After this change, touching DebugInfoMetadata.h and rebuilding llc only needs 112 actions instead of 542. llvm-svn: 266351	2016-04-14 18:29:59 +00:00
David Majnemer	0f26b0aeb4	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower to @llvm.{min,max}num to -NAN if they don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279	2016-04-14 07:13:24 +00:00
Matt Arsenault	9cd90712f0	AMDGPU: Implement canonicalize Also add generic DAG node for it. llvm-svn: 266272	2016-04-14 01:42:16 +00:00
Matthias Braun	46b0f03e12	TargetLowering: Factor out common code for tail call eligibility checking; NFC llvm-svn: 266270	2016-04-14 01:10:42 +00:00
Nirav Dave	2477491a92	Cleanup Store Merging in UseAA case This patch fixes a bug (PR26827) when using anti-aliasing in store merging. This sets the chain users of the component stores to point to the new store instead of the component stores chain parent. Reviewers: jyknight Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18909 llvm-svn: 266217	2016-04-13 17:27:26 +00:00

1 2 3 4 5 ...

7561 Commits