clang-p2996

Author	SHA1	Message	Date
Jonas Paulsson	8b8dc50e79	[RegAlloc] Avoid compile time regression with multiple copy hints. As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile time building opencollada"), this patch makes sure that no phys reg is hinted more than once from getRegAllocationHints(). This handles the case were many virtual registers are assigned to the same physreg. The previous compile time fix (r343686) in weightCalcHelper() only made sure that physical/virtual registers are passed no more than once to addRegAllocationHint(). Review: Dimitry Andric, Quentin Colombet https://reviews.llvm.org/D59201 llvm-svn: 355854	2019-03-11 19:00:37 +00:00
Simon Pilgrim	f3be93a2ff	[DAG] FoldSetCC - reuse valuetype + ensure its simple. llvm-svn: 355847	2019-03-11 17:56:18 +00:00
Brian Gesiak	4349dc76fa	[Utils] Extract EliminateUnreachableBlocks (NFC) Summary: Extract the functionality of eliminating unreachable basic blocks within a function, previously encapsulated within the -unreachableblockelim pass, and make it available as a function within BlockUtils.h. No functional change intended other than making the logic reusable. Exposing this logic makes it easier to implement https://reviews.llvm.org/D59068, which fixes coroutines bug https://bugs.llvm.org/show_bug.cgi?id=40979. Reviewers: mkazantsev, wmi, davidxl, silvas, davide Reviewed By: davide Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59069 llvm-svn: 355846	2019-03-11 17:51:57 +00:00
Simon Pilgrim	1bb5b56485	[DAG] Move SetCC NaN handling into FoldSetCC llvm-svn: 355845	2019-03-11 17:43:10 +00:00
Simon Pilgrim	53518b45a5	[DAG] TargetLowering::SimplifySetCC - call FoldSetCC early to handle constant/commute folds. Noticed while looking at PR40800 (and also D57921) llvm-svn: 355828	2019-03-11 15:01:31 +00:00
Sam Parker	52760bf435	[CGP] Limit distance between overflow math and cmp Inserting an overflowing arithmetic intrinsic can increase register pressure by producing two values at a point where only one is needed, while the second use maybe several blocks away. This increase in pressure is likely to be more detrimental on performance than rematerialising one of the original instructions. So, check that the arithmetic and compare instructions are no further apart than their immediate successor/predecessor. Differential Revision: https://reviews.llvm.org/D59024 llvm-svn: 355823	2019-03-11 13:19:46 +00:00
Benjamin Kramer	6ff32e143a	[MIPS GlobalISel] Silence uninitialized variable warning The control flow here cannot ever use the uninitialized value, but it's too hard for the compiler to figure that out. Clang warns: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: error: variable 'CarrySum' is used uninitialized whenever 'for' loop exits because its condition is false [-Werror,-Wsometimes-uninitialized] for (unsigned i = 2; i < Factors.size(); ++i) ^~~~~~~~~~~~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2604:26: note: uninitialized use occurs here CarrySumPrevDstIdx = CarrySum; ^~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: note: remove the condition if it is always true for (unsigned i = 2; i < Factors.size(); ++i) ^~~~~~~~~~~~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2583:22: note: initialize the variable 'CarrySum' to silence this warning unsigned CarrySum; ^ = 0 llvm-svn: 355818	2019-03-11 10:39:15 +00:00
Petar Avramovic	5229f47f9f	[MIPS GlobalISel] NarrowScalar G_UMULH NarrowScalar G_UMULH in LegalizerHelper using multiplyRegisters helper function. NarrowScalar G_UMULH for MIPS32. Differential Revision: https://reviews.llvm.org/D58825 llvm-svn: 355815	2019-03-11 10:08:44 +00:00
Petar Avramovic	0b17e59b5c	[MIPS GlobalISel] NarrowScalar G_MUL Narrow Scalar G_MUL for MIPS32. Revisit NarrowScalar implementation in LegalizerHelper. Introduce new helper function multiplyRegisters. It performs generic multiplication of values held in multiple registers. Generated instructions use only types NarrowTy and i1. Destination can be same or two times size of the source. Differential Revision: https://reviews.llvm.org/D58824 llvm-svn: 355814	2019-03-11 10:00:17 +00:00
Amaury Sechet	a135fd5562	Remove redundant extractBooleanFlip argument. NFC llvm-svn: 355794	2019-03-11 00:37:01 +00:00
Sanjay Patel	7d8260feb6	[CGP] fix comments; NFC llvm-svn: 355791	2019-03-10 18:42:30 +00:00
Craig Topper	1a872f2b15	Recommit r355224 "[TableGen][SelectionDAG][X86] Add specific isel matchers for immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary." Includes a fix to emit a CheckOpcode for build_vector when immAllZerosV/immAllOnesV is used as a pattern root. This means it can't be used to look through bitcasts when used as a root, but that's probably ok. This extra CheckOpcode will ensure that the first match in the isel table will be a SwitchOpcode which is needed by the caching optimization in the ISel Matcher. Original commit message: Previously we had build_vector PatFrags that called ISD::isBuildVectorAllZeros/Ones. Internally the ISD::isBuildVectorAllZeros/Ones look through bitcasts, but we aren't able to take advantage of that in isel. Instead of we have to canonicalize the types of the all zeros/ones build_vectors and insert bitcasts. Then we have to pattern match those exact bitcasts. By emitting specific matchers for these 2 nodes, we can make isel look through any bitcasts without needing to explicitly match them. We should also be able to remove the canonicalization to vXi32 from lowering, but I've left that for a follow up. This removes something like 40,000 bytes from the X86 isel table. Differential Revision: https://reviews.llvm.org/D58595 llvm-svn: 355784	2019-03-10 05:21:52 +00:00
Amaury Sechet	b62642a115	Refactor isBooleanFlip into extractBooleanFlip so that users do not depend on the patern matched. NFC llvm-svn: 355769	2019-03-09 02:51:52 +00:00
Craig Topper	69f8c1653d	[ScalarizeMaskedMemIntrin] Use IRBuilder functions that take uint32_t/uint64_t for getelementptr, extractelement, and insertelement. This saves needing to call getInt32 ourselves. Making the code a little shorter. The test changes are because insert/extract use getInt64 internally. Shouldn't be a functional issue. This cleanup because I plan to write similar code for expandload/compressstore. llvm-svn: 355767	2019-03-09 02:08:41 +00:00
Wei Mi	98214347c4	Rename a local variable counter to Counter. llvm-svn: 355759	2019-03-08 23:32:07 +00:00
Wei Mi	fb9693d1c9	[RegisterCoalescer][NFC] bind a DenseMap access to a reference to avoid repeated lookup operations llvm-svn: 355757	2019-03-08 23:29:46 +00:00
Craig Topper	d84f605910	[ScalarizeMaskedMemIntrin] Only set the ModifiedDT flag if new basic blocks were added. There are special cases in the scalarization for constant masks. If we hit one of the special cases we don't need to reset the iteration. Noticed while starting work on adding expandload/compressstore to this pass. llvm-svn: 355754	2019-03-08 23:03:43 +00:00
Rong Xu	ce3be45cac	[CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst r44412 fixed a huge compile time regression but it needed ModifiedDT flag to be maintained correctly in optimizations in optimizeBlock() and optimizeInst(). Function optimizeSelectInst() does not update the flag. This patch propagates the flag in optimizeSelectInst() back to optimizeBlock(). This patch also removes ModifiedDT in CodeGenPrepare class (which is not used). The property of ModifiedDT is now recorded in a ref parameter. Differential Revision: https://reviews.llvm.org/D59139 llvm-svn: 355751	2019-03-08 22:46:18 +00:00
Matt Arsenault	26e76ef0e2	DAG: Don't try to cluster loads with tied inputs This avoids breaking possible value dependencies when sorting loads by offset. AMDGPU has some load instructions that write into the high or low bits of the destination register, and have a tied input for the other input bits. These can easily have the same base pointer, but be a swizzle so the high address load needs to come first. This was inserting glue forcing the opposite ordering, producing a cycle the InstrEmitter would assert on. It may be potentially expensive to look for the dependency between the other loads, so just skip any where this could happen. Fixes bug 40936 by reverting r351379, which added a hacky attempt to fix this by adding chains in this case, which I think was just working around broken glue before the InstrEmitter. The core of the patch is re-implementing the fix for that problem. llvm-svn: 355728	2019-03-08 20:46:15 +00:00
Amaury Sechet	782ac933b5	[DAGCombiner] fold (add (add (xor a, -1), b), 1) -> (sub b, a) Summary: This pattern is sometime created after legalization. Reviewers: efriedma, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58874 llvm-svn: 355716	2019-03-08 19:39:32 +00:00
Wei Mi	72ec6801b5	[RegisterCoalescer] Limit the number of joins for large live interval with many valnos. Recently we found compile time out problem in several cases when SpeculativeLoadHardening was enabled. The significant compile time was spent in register coalescing pass, where register coalescer tried to join many other live intervals with some very large live intervals with many valnos. Specifically, every time JoinVals::mapValues is called, computeAssignment will be called by getNumValNums() times of the target live interval. If the large live interval has N valnos and has N copies associated with it, trying to coalescing those copies will at least cost N^2 complexity. The patch adds some limit to the effort trying to join those very large live intervals with others. By default, for live interval with > 100 valnos, and when it has been coalesced with other live interval by more than 100 times, we will stop coalescing for the live interval anymore. That put a compile time cap for the N^2 algorithm and effectively solves the compile time problem we saw. Differential revision: https://reviews.llvm.org/D59143 llvm-svn: 355714	2019-03-08 19:25:32 +00:00
Simon Pilgrim	04e8439f72	[DAGCombine] Merge visitSMULO+visitUMULO into visitMULO. NFCI. llvm-svn: 355690	2019-03-08 11:41:18 +00:00
Simon Pilgrim	c71d6d157f	[DAGCombine] Merge visitSADDO+visitUADDO into visitADDO. NFCI. llvm-svn: 355689	2019-03-08 11:30:33 +00:00
Simon Pilgrim	2c2e76a9e2	[DAGCombine] Merge visitSSUBO+visitUSUBO into visitSUBO. NFCI. llvm-svn: 355688	2019-03-08 11:16:55 +00:00
Brian Gesiak	4e467043fb	[CodeGen] Reuse BlockUtils for -unreachableblockelim pass (NFC) Summary: The logic in the -unreachableblockelim pass does the following: 1. It traverses the function it's given in depth-first order and creates a set of basic blocks that are unreachable from the function's entry node. 2. It iterates over each of those unreachable blocks and (1) removes any successors' references to the dead block, and (2) replaces any uses of instructions from the dead block with null. The logic in (2) above is identical to what the `llvm::DeleteDeadBlocks` function from `BasicBlockUtils.h` does. The only difference is that `llvm::DeleteDeadBlocks` replaces uses of instructions from dead blocks not with null, but with undef. Replace the duplicate logic in the -unreachableblockelim pass with a call to `llvm::DeleteDeadBlocks`. This results in less code but no functional change (NFC). Reviewers: mkazantsev, wmi, davidxl, silvas, davide Reviewed By: davide Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59064 llvm-svn: 355634	2019-03-07 20:40:55 +00:00
Paul Robinson	05efe0fdc4	[PS4] Emit a trap after a stack-protector fail call. llvm-svn: 355542	2019-03-06 19:57:43 +00:00
Philip Reames	9549f7560f	[AtomicExpand] Allow libcall expansion for non-zero address spaces (try 2) Restore a reverted commit, with the silly mistake fixed. Sorry for the previous breakage. Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355540	2019-03-06 19:27:13 +00:00
Simon Pilgrim	9d6347cfc1	[DAGCombine] Improve select (not Cond), N1, N2 -> select Cond, N2, N1 fold Move the x86 combine from D58974 into the DAGCombine VSELECT code and update the SELECT version to use the isBooleanFlip helper as well. Requested by @spatel on D59006 llvm-svn: 355533	2019-03-06 18:52:52 +00:00
Simon Pilgrim	cdf95f8f07	[DAGCombiner] Enable UADDO/USUBO vector combine support Differential Revision: https://reviews.llvm.org/D58965 llvm-svn: 355517	2019-03-06 16:11:03 +00:00
Sanjay Patel	1fefc30b08	[TargetLowering] simplify code for uaddsat/usubsat expansion; NFC We had 2 local variable names for the same type. llvm-svn: 355516	2019-03-06 16:06:27 +00:00
Alexander Kornienko	3d467a890e	Revert "[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default" This reverts commit `2a0f2c5ef3` (r355490). The commit causes an assertion failure when compiling LLVM code: $ cat repro.cpp class QQQ { public: bool x() const; bool y() const; unsigned getSizeInBits() const { if (y() \|\| x()) return getScalarSizeInBits(); return getScalarSizeInBits() * 2; } unsigned getScalarSizeInBits() const; }; int f(const QQQ &Ty) { switch (Ty.getSizeInBits()) { case 1: case 8: return 0; case 16: return 1; case 32: return 2; case 64: return 3; default: __builtin_unreachable(); } } $ clang -O2 -o repro.o repro.cpp assert.h assertion failed at llvm/include/llvm/ADT/ilist_iterator.h:139 in llvm::ilist_iterator::reference llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, true, false>::operator() const [OptionsT = llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, IsReverse = true, IsConst = false]: !NodePtr->isKnownSentinel() Check failure stack trace: * @ 0x558aab4afc10 __assert_fail @ 0x558aa885479b llvm::ilist_iterator<>::operator() @ 0x558aa8854715 llvm::MachineInstrBundleIterator<>::operator() @ 0x558aa92c33c3 llvm::X86InstrInfo::optimizeCompareInstr() @ 0x558aa9a9c251 (anonymous namespace)::PeepholeOptimizer::optimizeCmpInstr() @ 0x558aa9a9b371 (anonymous namespace)::PeepholeOptimizer::runOnMachineFunction() @ 0x558aa99a4fc8 llvm::MachineFunctionPass::runOnFunction() @ 0x558aab019fc4 llvm::FPPassManager::runOnFunction() @ 0x558aab01a3a5 llvm::FPPassManager::runOnModule() @ 0x558aab01aa9b (anonymous namespace)::MPPassManager::runOnModule() @ 0x558aab01a635 llvm::legacy::PassManagerImpl::run() @ 0x558aab01afe1 llvm::legacy::PassManager::run() @ 0x558aa5914769 (anonymous namespace)::EmitAssemblyHelper::EmitAssembly() @ 0x558aa5910f44 clang::EmitBackendOutput() @ 0x558aa5906135 clang::BackendConsumer::HandleTranslationUnit() @ 0x558aa6d165ad clang::ParseAST() @ 0x558aa6a94e22 clang::ASTFrontendAction::ExecuteAction() @ 0x558aa590255d clang::CodeGenAction::ExecuteAction() @ 0x558aa6a94840 clang::FrontendAction::Execute() @ 0x558aa6a38cca clang::CompilerInstance::ExecuteAction() @ 0x558aa4e2294b clang::ExecuteCompilerInvocation() @ 0x558aa4df6200 cc1_main() @ 0x558aa4e1b37f ExecuteCC1Tool() @ 0x558aa4e1a725 main @ 0x7ff20d56abbd __libc_start_main @ 0x558aa4df51c9 _start llvm-svn: 355515	2019-03-06 15:23:50 +00:00
Teresa Johnson	b1daf0aef6	[CGP] Avoid repeatedly building DominatorTree causing long compile-time (NFC) Summary: In r354298 a DominatorTree construction was added via new function combineToUSubWithOverflow, which was subsequently restructured into replaceMathCmpWithIntrinsic in r354689. We are hitting a very long compile time due to this repeated construction, once per math cmp in the function. We shouldn't need to build the DominatorTree more than once per function, except when a transformation invalidates it. There is already a boolean flag that is returned from these methods indicating whether the DT has been modified. We can simply build the DT once per Function walk in CodeGenPrepare::runOnFunction, since any time a change is made we break out of the Function walk and restart it. I modified the code so that both replaceMathCmpWithIntrinsic as well as mergeSExts (which was also building a DT) use the DT constructed by the run method. From -mllvm -time-passes: Before this patch: CodeGen Prepare user time is 328s With this patch: CodeGen Prepare user time is 21s Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58995 llvm-svn: 355512	2019-03-06 14:57:40 +00:00
Francis Visoiu Mistrih	6b622ebea0	Revert "[Remarks] Refactor remark diagnostic emission in a RemarkStreamer" This reverts commit 2e8c4997a2089f8228c843fd81b148d903472e02. Breaks bots. llvm-svn: 355511	2019-03-06 14:52:37 +00:00
Sanjay Patel	89e534746f	[TargetLowering] simplify code for uaddsat/usubsat expansion; NFC llvm-svn: 355508	2019-03-06 14:34:59 +00:00
Francis Visoiu Mistrih	9052f50cb4	[Remarks] Refactor remark diagnostic emission in a RemarkStreamer This allows us to store more info about where we're emitting the remarks without cluttering LLVMContext. This is needed for future support for the remark section. Differential Revision: https://reviews.llvm.org/D58996 llvm-svn: 355507	2019-03-06 14:32:08 +00:00
Simon Pilgrim	1bdc2d1874	[DAGCombiner] Add SADDO/SSUBO combine support Basic constant handling folds, for both scalars and vectors Differential Revision: https://reviews.llvm.org/D58967 llvm-svn: 355506	2019-03-06 14:22:21 +00:00
Simon Pilgrim	642f53d292	[DAGCombiner] Enable SMULO/UMULO vector combine support (PR40442) Differential Revision: https://reviews.llvm.org/D58968 llvm-svn: 355495	2019-03-06 11:04:21 +00:00
Ayonam Ray	2a0f2c5ef3	[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Differential Revision: https://reviews.llvm.org/D52002 Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 355490	2019-03-06 10:01:02 +00:00
Ayonam Ray	af92b7a3b8	Reversing the commit of revision 355483 since it is giving a regression on a newly added test. llvm-svn: 355487	2019-03-06 07:51:28 +00:00
Ayonam Ray	6025fa8e30	[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Differential Revision: https://reviews.llvm.org/D52002 Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 355483	2019-03-06 07:27:45 +00:00
Mitch Phillips	f0c21e2ff5	Revert "[AtomicExpand] Allow libcall expansion for non-zero address spaces" for buildbot failures. llvm-svn: 355461	2019-03-06 00:25:40 +00:00
Philip Reames	1e4c5d3611	[AtomicExpand] Allow libcall expansion for non-zero address spaces Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355453	2019-03-05 23:00:14 +00:00
Craig Topper	57fd733140	Revert r355224 "[TableGen][SelectionDAG][X86] Add specific isel matchers for immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary." This caused the first matcher in the isel table for many targets to Opc_Scope instead of Opc_SwitchOpcode. This leads to a significant increase in isel match failures. llvm-svn: 355433	2019-03-05 19:18:16 +00:00
Craig Topper	2982b846e9	[Subtarget] Merge ProcSched and ProcDesc arrays in MCSubtargetInfo into a single array. These arrays are both keyed by CPU name and go into the same tablegenerated file. Merge them so we only need to store keys once. This also removes a weird space saving quirk where we used the ProcDesc.size() to create to build an ArrayRef for ProcSched. Differential Revision: https://reviews.llvm.org/D58939 llvm-svn: 355431	2019-03-05 18:54:38 +00:00
Craig Topper	ca26808da9	[Subtarget] Create a separate SubtargetSubtargetKV struct for ProcDesc to remove fields from the stack tables that aren't needed for CPUs The description for CPUs was just the CPU name wrapped with "Select the " and " processor". We can just do that directly in the help printer instead of making a separate version in the binary for each CPU. Also remove the Value field that isn't needed and was always 0. Differential Revision: https://reviews.llvm.org/D58938 llvm-svn: 355429	2019-03-05 18:54:34 +00:00
Sanjay Patel	8b72080d4d	[SDAG] move FP constant folding to helper function; NFC llvm-svn: 355411	2019-03-05 16:42:33 +00:00
Sanjay Patel	3b2d0bc7c2	[CodeGenPrepare] avoid crashing on non-canonical/degenerate code The test is reduced from an example in the post-commit thread for: rL354746 http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190304/632396.html While we must avoid dying here, the real question should be: Why is non-canonical and/or degenerate code making it to CGP when using the new pass manager? llvm-svn: 355345	2019-03-04 22:47:13 +00:00
Craig Topper	509a8a3cf1	[DAGCombiner][X86][SystemZ][AArch64] Combine some cases of (bitcast (build_vector constants)) between legalize types and legalize dag. This patch enables combining integer bitcasts of integer build vectors when the new scalar type is legal. I've avoided floating point because the implementation bitcasts float to int along the way and we would need to check the intermediate types for legality Differential Revision: https://reviews.llvm.org/D58884 llvm-svn: 355324	2019-03-04 19:12:16 +00:00
Eugene Leviant	daea28ab64	[DebugInfo] Construct nested types on behalf of owner CU Differential revision: https://reviews.llvm.org/D58786 llvm-svn: 355303	2019-03-04 07:15:36 +00:00
Heejin Ahn	195a62e9ae	[WebAssembly] Delete ThrowUnwindDest map from WasmEHFuncInfo Summary: Before when we implemented the first EH proposal, 'catch <tag>' instruction may not catch an exception so there were multiple EH pads an exception can unwind to. That means a BB could have multiple EH pad successors. Now after we switched to the new proposal, every 'catch' instruction catches an exception, and there is only one catchpad per catchswitch, so we at most have one EH pad successor, making `ThrowUnwindDest` map in `WasmEHInfo` unnecessary. Keeping `ThrowUnwindDest` map in `WasmEHInfo` has its own problems, because other optimization passes can split a BB that contains possibly throwing calls (previously invokes), and we have to update the map every time that happens, which is not easy for common CodeGen passes. This also correctly updates successor info in LateEHPrepare when we add a rethrow instruction. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58486 llvm-svn: 355296	2019-03-03 22:35:56 +00:00

1 2 3 4 5 ...

25953 Commits