clang-p2996

Author	SHA1	Message	Date
Amara Emerson	93b0ff20c9	[GlobalOpt] Improve common case efficiency of static global initializer evaluation For very, very large global initializers which can be statically evaluated, the code would create vectors of temporary Constants, modifying them in place, before committing the resulting Constant aggregate to the global's initializer value. This had effectively O(n^2) complexity in the size of the global initializer and would cause memory and non-termination issues compiling some workloads. This change performs the static initializer evaluation and creation in batches, once for each global in the evaluated IR memory. The existing code is maintained as a last resort when the initializers are more complex than simple values in a large aggregate. This should theoretically by NFC, no test as the example case is massive. The existing test cases pass with this, as well as the llvm test suite. To give an example, consider the following C++ code adapted from the clang regression tests: struct S { int n = 10; int m = 2 * n; S(int a) : n(a) {} }; template<typename T> struct U { T r = &q; T q = 42; U p = this; }; U<S> e; The global static constructor for 'e' will need to initialize 'r' and 'p' of the outer struct, while also initializing the inner 'q' structs 'n' and 'm' members. This batch algorithm will simply use general CommitValueTo() method to handle the complex nested S struct initialization of 'q', before processing the outermost members in a single batch. Using CommitValueTo() to handle member in the outer struct is inefficient when the struct/array is very large as we end up creating and destroy constant arrays for each initialization. For the above case, we expect the following IR to be generated: %struct.U = type { %struct.S, %struct.S, %struct.U } %struct.S = type { i32, i32 } @e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e, i64 0, i32 1), %struct.S { i32 42, i32 84 }, %struct.U* @e } The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex constant expression, while the other two elements of @e are "simple". Differential Revision: https://reviews.llvm.org/D42612 llvm-svn: 323933	2018-01-31 23:56:07 +00:00
Matt Arsenault	06dfbb50d7	Utils: Fix DomTree update for entry block If SplitBlockPredecessors was used on a function entry block, it wouldn't update the dominator tree. llvm-svn: 323928	2018-01-31 22:54:37 +00:00
Amjad Aboud	b86b771c02	[AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf node correctly. This covers the case where TruncInst leaf node is a constant expression. See PR36121 for more details. Differential Revision: https://reviews.llvm.org/D42622 llvm-svn: 323926	2018-01-31 22:39:05 +00:00
Eli Friedman	79d297abe4	[GlobalOpt] Fix exponential compile-time with selects. If you have a long chain of select instructions created from something like `int* p = &g; if (foo()) p += 4; if (foo2()) p += 4;` etc., a naive recursive visitor will recursively visit each select twice, which is O(2^N) in the number of select instructions. Use the visited set to cut off recursion in this case. (No testcase because this doesn't actually change the behavior, just the time.) Differential Revision: https://reviews.llvm.org/D42451 llvm-svn: 323910	2018-01-31 20:42:25 +00:00
Marek Olsak	13e4741275	AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16} Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 llvm-svn: 323908	2018-01-31 20:18:04 +00:00
Marek Olsak	8e7d149a31	[SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPs Summary: !amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things happen. Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D42744 llvm-svn: 323907	2018-01-31 20:17:52 +00:00
Sanjay Patel	1b66deea16	[InstCombine] reduce code duplication for canEvaluate* functions; NFCI We'd have to make the change suggested in D42536 3x otherwise. llvm-svn: 323877	2018-01-31 14:55:53 +00:00
Amjad Aboud	d895bff5f2	[AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks. Because dead code may contain non-standard IR that causes infinite looping or crashes in underlying analysis. See PR36134 for more details. Differential Revision: https://reviews.llvm.org/D42683 llvm-svn: 323862	2018-01-31 10:41:31 +00:00
Peter Collingbourne	7873669be5	LTO: Drop comdats when converting definitions to declarations. Differential Revision: https://reviews.llvm.org/D42715 llvm-svn: 323844	2018-01-31 02:51:03 +00:00
Teresa Johnson	df763188c9	Teach ValueMapper to use ODR uniqued types when available Summary: This is exposed during ThinLTO compilation, when we import an alias by creating a clone of the aliasee. Without this fix the debug type is unnecessarily cloned and we get a duplicate, undoing the uniquing. Fixes PR36089. Reviewers: mehdi_amini, pcc Subscribers: eraman, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41669 llvm-svn: 323813	2018-01-30 20:16:32 +00:00
Zaara Syeda	1f59ae311b	Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778	2018-01-30 16:17:22 +00:00
Daniel Neilson	594f443b06	[RS4GC] Handle call/invoke instructions as base defining values of vectors Summary: There's an asymmetry in the definitions of findBaseDefiningValueOfVector() and findBaseDefiningValue() of RS4GC. The later handles call and invoke instructions, and the former does not. This appears to be simple oversight. This patch remedies the oversight by adding the call and invoke cases to findBaseDefiningValueOfVector(). Reviewers: DaniilSuchkov, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42653 llvm-svn: 323764	2018-01-30 14:43:41 +00:00
Sanjay Patel	1aef27f5cd	[DSE] make sure memory is not modified before partial store merging (PR36129) We missed a critical check in D30703. We must make sure that no intermediate store is sitting between the stores that we want to merge. This should fix: https://bugs.llvm.org/show_bug.cgi?id=36129 Differential Revision: https://reviews.llvm.org/D42663 llvm-svn: 323759	2018-01-30 13:53:59 +00:00
Brian M. Rzycki	994e889022	[JumpThreading][NFC] Rename LoadInst variables Summary: The JumpThreading pass has several locations where to the variable name LI refers to a LoadInst type. This is confusing and inhibits the ability to use LI for LoopInfo as a member of the JumpThreading class. Minor formatting and comments were also altered to reflect this change. Reviewers: dberlin, kuba, spop, sebpop Reviewed by: sebpop Subscribers: sebpop, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42601 llvm-svn: 323695	2018-01-29 21:29:44 +00:00
Alexey Bataev	9c5c103283	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323662	2018-01-29 16:08:52 +00:00
George Rimar	eaf5172ca6	[ThinLTO] - Stop internalizing and drop non-prevailing symbols. Implementation marks non-prevailing symbols as not live in the summary. Then them are dropped in backends. Fixes https://bugs.llvm.org/show_bug.cgi?id=35938 Differential revision: https://reviews.llvm.org/D42107 llvm-svn: 323633	2018-01-29 08:03:30 +00:00
Davide Italiano	8b797a0fd2	[CVP] Don't Replace incoming values from unreachable blocks with undef. This pretty much reverts r322006, except that we keep the test, because we work around the issue exposed in a different way (a recursion limit in value tracking). There's still probably some sequence that exposes this problem, and the proper way to fix that for somebody who has time is outlined in the code review. llvm-svn: 323630	2018-01-29 05:59:55 +00:00
Hiroshi Inoue	c8e9245816	[NFC] fix trivial typos in comments and documents "to to" -> "to" llvm-svn: 323628	2018-01-29 05:17:03 +00:00
Alexey Bataev	f86be12182	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." This reverts commit r323530 to fix possible problems in users code. llvm-svn: 323581	2018-01-27 02:42:21 +00:00
Alexey Bataev	dce1614d75	Revert "[SLP] Removed the warning about unused variable, NFC." This reverts commit r323533 to fix possible problems in users code. llvm-svn: 323580	2018-01-27 02:42:17 +00:00
Vedant Kumar	cff94627cf	[InstrProfiling] Don't exit early when an unused intrinsic is found This fixes a think-o in r323574. llvm-svn: 323576	2018-01-27 00:01:04 +00:00
Vedant Kumar	1ee511c19c	[InstrProfiling] Improve compile time when there is no work When there are no uses of profiling intrinsics in a module, and there's no coverage data to lower, InstrProfiling has no work to do. llvm-svn: 323574	2018-01-26 23:54:24 +00:00
Vedant Kumar	e48597a50e	[InstCombine] Preserve debug values for eliminable casts A cast from A to B is eliminable if its result is casted to C, and if the pair of casts could just be expressed as a single cast. E.g here, %c1 is eliminable: %c1 = zext i16 %A to i32 %c2 = sext i32 %c1 to i64 InstCombine optimizes away eliminable casts. This patch teaches it to insert a dbg.value intrinsic pointing to the final result, so that local variables pointing to the eliminable result are preserved. Differential Revision: https://reviews.llvm.org/D42566 llvm-svn: 323570	2018-01-26 22:02:52 +00:00
Alexey Bataev	041ef2dd15	[SLP] Removed the warning about unused variable, NFC. llvm-svn: 323533	2018-01-26 15:34:44 +00:00
Alexey Bataev	167003df28	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323530	2018-01-26 14:31:09 +00:00
Daniil Fukalov	6e1dc68117	[AMDGPU] fix LDS f32 intrinsics - using qualified pointer addrspace in intrinsics class to avoid .f32 mangling - changed too common atomic mangling to ds - added missing intrinsics to AMDGPUTTIImpl::getTgtMemIntrinsic Reviewed by: b-sumner Differential Revision: https://reviews.llvm.org/D42383 llvm-svn: 323516	2018-01-26 11:09:38 +00:00
Florian Hahn	212afb9fd9	[CallSiteSplitting] Fix infinite loop when recording conditions. Fix infinite loop when recording conditions by correctly marking basic blocks as visited. Fixes https://bugs.llvm.org/show_bug.cgi?id=36105 llvm-svn: 323515	2018-01-26 10:36:50 +00:00
Hiroshi Inoue	0909ca132f	[NFC] fix trivial typos in comments and documents "in in" -> "in", "on on" -> "on" etc. llvm-svn: 323508	2018-01-26 08:15:29 +00:00
Vedant Kumar	6394df9fc4	[Debug] LCSSA: Insert dbg.value at the first available insertion point Inserting a dbg.value instruction at the start of a basic block with a landingpad instruction triggers a verifier failure. We should be OK if we insert the instruction a bit later. Speculative fix for the bot failure described here: https://reviews.llvm.org/D42551 llvm-svn: 323482	2018-01-25 23:48:29 +00:00
Easwaran Raman	8410c37465	[SyntheticCounts] Rewrite the code using only graph traits. Summary: The intent of this is to allow the code to be used with ThinLTO. In Thinlink phase, a traditional Callgraph can not be computed even though all the necessary information (nodes and edges of a call graph) is available. This is due to the fact that CallGraph class is closely tied to the IR. This patch first extends GraphTraits to add a CallGraphTraits graph. This is then used to implement a version of counts propagation on a generic callgraph. Reviewers: davidxl Subscribers: mehdi_amini, tejohnson, llvm-commits Differential Revision: https://reviews.llvm.org/D42311 llvm-svn: 323475	2018-01-25 22:02:29 +00:00
Vedant Kumar	60f54084bf	[Debug] Add dbg.value intrinsics for PHIs created during LCSSA. This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>. Specifically this case: int bar(int x, int y) { return x + y; } int foo(int size) { int val = 0; for (int i = 0; i < size; ++i) { val = bar(val, i); // Both val and i are correct } return val; // <optimized out> } In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this. This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA). Patch by Matt Davis! Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 323472	2018-01-25 21:37:07 +00:00
Vedant Kumar	6bfc869cf7	[Debug] Add a utility to propagate dbg.value to new PHIs, NFC This simply moves an existing utility to Utils for reuse. Split out of: https://reviews.llvm.org/D42551 Patch by Matt Davis! llvm-svn: 323471	2018-01-25 21:37:05 +00:00
Evgeniy Stepanov	31475a039a	[asan] Fix kernel callback naming in instrumentation module. Right now clang uses "_n" suffix for some user space callbacks and "N" for the matching kernel ones. There's no need for this and it actually breaks kernel build with inline instrumentation. Use the same callback names for user space and the kernel (and also make them consistent with the names GCC uses). Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D42423 llvm-svn: 323470	2018-01-25 21:28:51 +00:00
Easwaran Raman	c73cec84c9	Re-land "[ThinLTO] Add call edges' relative block frequency to per-module summary." It was reverted after buildbot regressions. Original commit message: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. llvm-svn: 323460	2018-01-25 19:27:17 +00:00
Alexey Bataev	102d4b59f9	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." This reverts commit r323441 to fix buildbots. llvm-svn: 323447	2018-01-25 17:28:12 +00:00
Alexey Bataev	c8cfa14b6d	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323441	2018-01-25 16:45:18 +00:00
Sanjay Patel	1d68112c4b	[InstCombine] narrow masked zexted binops (PR35792) This is guarded by shouldChangeType(), so the tests show that we don't do the fold if the narrower type is not legal. Note that there is a proposal (D42424) that would change the results for the specific cases shown in these tests. That difference is also discussed in PR35792: https://bugs.llvm.org/show_bug.cgi?id=35792 Alive proofs for the cases handled here as well as the bitwise logic binops that we should already do better on: https://rise4fun.com/Alive/c97 https://rise4fun.com/Alive/Lc5E https://rise4fun.com/Alive/kdf llvm-svn: 323437	2018-01-25 16:34:36 +00:00
Alexey Bataev	a0b2c78efc	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." This reverts commit r323430 to fix buildbots. llvm-svn: 323432	2018-01-25 15:20:29 +00:00
Alexey Bataev	ad51fe3644	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323430	2018-01-25 15:01:36 +00:00
Amjad Aboud	f1f57a3137	Another try to commit 323321 (aggressive instruction combine). llvm-svn: 323416	2018-01-25 12:06:32 +00:00
Mikael Holmen	886edf8f8a	[GlobalOpt] Emit fragments using field offsets from struct layout Summary: When creating the debug fragments for a SRA'd struct, use the fields' offsets, taken from the struct layout, as the offsets for the resulting fragments. This fixes an issue where GlobalOpt would emit fragments with incorrect offsets for padded fields. This should solve PR36016. Patch by David Stenberg. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42489 llvm-svn: 323411	2018-01-25 10:09:26 +00:00
Alexey Bataev	0affccc8d7	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." This reverts commit r323348 because of the broken buildbots. llvm-svn: 323359	2018-01-24 18:36:51 +00:00
Nicolai Haehnle	4afb64e4c6	Revert r321751, "StructurizeCFG: Fix broken backedge detection" It causes regressions in various OpenGL test suites. Keep the test cases introduced by r321751 as XFAIL, and add a test case for the regression. Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015 llvm-svn: 323355	2018-01-24 18:02:05 +00:00
Alexey Bataev	4bd8e5332f	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323348	2018-01-24 17:50:53 +00:00
Amjad Aboud	d53504e379	Reverted 323321. llvm-svn: 323326	2018-01-24 14:48:49 +00:00
Amjad Aboud	e4453233d7	[InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine). Combine expression patterns to form expressions with fewer, simple instructions. This pass does not modify the CFG. For example, this pass reduce width of expressions post-dominated by TruncInst into smaller width when applicable. It differs from instcombine pass in that it contains pattern optimization that requires higher complexity than the O(1), thus, it should run fewer times than instcombine pass. Differential Revision: https://reviews.llvm.org/D38313 llvm-svn: 323321	2018-01-24 12:42:42 +00:00
Max Kazantsev	0f720e1296	[NFC] Remove overconfident assert from IRCE This patch removes assert that SCEV is able to prove that a value is non-negative. In fact, SCEV can sometimes be unable to do this because its cache does not update properly. This assert will be returned once this problem is resolved. llvm-svn: 323309	2018-01-24 07:51:41 +00:00
Volkan Keles	ebf34ea316	BlockExtractor: Remove unused variable. NFC. llvm-svn: 323271	2018-01-23 22:24:34 +00:00
Volkan Keles	dc40be75f8	[llvm-extract] Support extracting basic blocks Summary: Currently, there is no way to extract a basic block from a function easily. This patch extends llvm-extract to extract the specified basic block(s). Reviewers: loladiro, rafael, bogner Reviewed By: bogner Subscribers: hintonda, mgorny, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D41638 llvm-svn: 323266	2018-01-23 21:51:34 +00:00
Alexey Bataev	4f74a31c0e	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." This reverts commit r323246 because of the broken buildbots. llvm-svn: 323252	2018-01-23 20:11:27 +00:00

... 24 25 26 27 28 ...

20610 Commits