clang-p2996

Author	SHA1	Message	Date
Alexey Bataev	d5b1f7892f	[SLP] Fixed formatting, NFC. llvm-svn: 329091	2018-04-03 17:48:14 +00:00
Daniel Neilson	901acfab0c	[InstCombine] Fold compare of int constant against a splatted vector of ints Summary: Folding patterns like: %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %ext = extractelement <4 x i8> %insvec, i32 0 %cond = icmp eq i32 %ext, 0 Combined with existing rules, this allows us to fold patterns like: %insvec = insertelement <4 x i8> undef, i8 %val, i32 0 %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %cond = icmp eq i8 %val, 0 When we construct a splat vector via a shuffle, and bitcast the vector into an integer type for comparison against an integer constant. Then we can simplify the the comparison to compare the splatted value against the integer constant. Reviewers: spatel, anna, mkazantsev Reviewed By: spatel Subscribers: efriedma, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D44997 llvm-svn: 329087	2018-04-03 17:26:20 +00:00
Alexey Bataev	428e9d9d87	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 329085	2018-04-03 17:14:47 +00:00
Alexey Bataev	df989c54cf	Recommit "[SLP] Fix issues with debug output in the SLP vectorizer." The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used errs() rather than dbgs(). llvm-svn: 329082	2018-04-03 16:40:33 +00:00
Benjamin Kramer	2fc3b18922	Revert "[SLP] Fix PR36481: vectorize reassociated instructions." This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071	2018-04-03 14:40:33 +00:00
Alexander Potapenko	ac70668cff	MSan: introduce the conservative assembly handling mode. The default assembly handling mode may introduce false positives in the cases when MSan doesn't understand that the assembly call initializes the memory pointed to by one of its arguments. We introduce the conservative mode, which initializes the first \|sizeof(type)\| bytes for every \|type*\| pointer passed into the assembly statement. llvm-svn: 329054	2018-04-03 09:50:06 +00:00
Chandler Carruth	597bfd8448	[SLP] Fix issues with debug output in the SLP vectorizer. The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used `errs()` rather than `dbgs()`. llvm-svn: 329046	2018-04-03 05:27:28 +00:00
Ikhlas Ajbar	b7322e8ac7	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329042	2018-04-03 03:39:43 +00:00
Haicheng Wu	7f0daaeb86	[SLP] Distinguish "demanded and shrinkable" from "demanded and not shrinkable" values when determining the minimum bitwidth We use two approaches for determining the minimum bitwidth. * Demanded bits * Value tracking If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded. But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in %i = sext i32 %e1 to i64 %gep = getelementptr inbounds i64, i64* %p, i64 %i are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in %tmp15 = sext i32 %tmp14 to i64 %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0 it doesn't make sense to shrink %tmp15 and we can skip the value tracking. Ideas are from Matthew Simpson! Differential Revision: https://reviews.llvm.org/D44868 llvm-svn: 329035	2018-04-03 00:05:10 +00:00
Brian Gesiak	64521bed0d	[Coroutines] Avoid assert splitting hidden coros Summary: When attempting to split a coroutine with 'hidden' visibility (for example, a C++ coroutine that is inlined when compiled with the option '-fvisibility-inlines-hidden'), LLVM would hit an assertion in include/llvm/IR/GlobalValue.h:240: "local linkage requires default visibility". The issue is that the visibility is copied from the source of the function split in the `CloneFunctionInto` function, but the linkage is not. To fix, create the new function first with external linkage, then copy the linkage from the original function after `CloneFunctionInto` is called. Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the explicit call to `setDSOLocal` can be removed in CoroSplit.cpp. Test Plan: check-llvm Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk Reviewed By: rnk Subscribers: llvm-commits, eric_niebler Differential Revision: https://reviews.llvm.org/D44185 llvm-svn: 329033	2018-04-02 23:39:40 +00:00
Reid Kleckner	298ffc609b	[InstCombine] Don't strip function type casts from musttail calls Summary: The cast simplifications that instcombine does here do not make any attempt to obey the verifier rules for musttail calls. Therefore we have to disable them. Reviewers: efriedma, majnemer, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45186 llvm-svn: 329027	2018-04-02 22:49:44 +00:00
Reid Kleckner	a9e9918ee4	Treat inlining a notail call as a regular, non-tail call Otherwise, we end up inlining a musttail call into a non-tail position, which breaks verifier invariants. Fixes PR31014 llvm-svn: 329015	2018-04-02 21:23:16 +00:00
Sanjay Patel	cbb0450540	[InstCombine] add folds for icmp + sub (PR36969) (A - B) >u A --> A <u B C <u (C - D) --> C <u D https://rise4fun.com/Alive/e7j Name: ugt %sub = sub i8 %x, %y %cmp = icmp ugt i8 %sub, %x => %cmp = icmp ult i8 %x, %y Name: ult %sub = sub i8 %x, %y %cmp = icmp ult i8 %x, %sub => %cmp = icmp ult i8 %x, %y This should fix: https://bugs.llvm.org/show_bug.cgi?id=36969 llvm-svn: 329011	2018-04-02 20:37:40 +00:00
Rong Xu	5a8d4c3357	[DeadArgumentElim] Clone function level metadatas Some Function level metadatas, such as function entry count, are not cloned in DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim after internalization. This patch clones the metadatas in the original function to the new function. Differential Revision: https://reviews.llvm.org/D44127 llvm-svn: 328991	2018-04-02 17:27:38 +00:00
Gor Nishanov	b0316d96ae	[coroutines] Add support for llvm.coro.noop intrinsics Summary: A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined coroutine noop_coroutine that does nothing. To implement this feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that does nothing when resumed or destroyed. Reviewers: EricWF, modocache, rnk, lewissbaker Reviewed By: modocache Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45114 llvm-svn: 328986	2018-04-02 16:55:12 +00:00
Alexey Bataev	3decaf4275	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 328980	2018-04-02 14:51:37 +00:00
Teresa Johnson	974706ebf7	[ThinLTO] Add an import cutoff for debugging/triaging Summary: Adds -import-cutoff=N which will stop importing during the thin link after N imports. Default is -1 (no limit). Reviewers: wmi Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45127 llvm-svn: 328934	2018-04-01 15:54:40 +00:00
David Green	f80ebc8d21	[LoopRotate] Rotate loops with loop exiting latches If a loop has a loop exiting latch, it can be profitable to rotate the loop if it leads to the simplification of a phi node. Perform rotation in these cases even if loop rotate itself didnt simplify the loop to get there. Differential Revision: https://reviews.llvm.org/D44199 llvm-svn: 328933	2018-04-01 12:48:24 +00:00
Fangrui Song	956ee79795	Fix a bunch of typoes. NFC llvm-svn: 328907	2018-03-30 22:22:31 +00:00
Peter Collingbourne	d03bf12c1b	DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan. This change resolves bug 36314. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D44784 llvm-svn: 328890	2018-03-30 18:37:55 +00:00
Krzysztof Parzyszek	fce30c2ba3	Revert "peel loops with runtime small trip counts" This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875	2018-03-30 16:55:44 +00:00
Ikhlas Ajbar	66c8ba5a50	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 328854	2018-03-30 03:05:34 +00:00
David Blaikie	f423062aff	Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation llvm-svn: 328844	2018-03-29 22:42:08 +00:00
David Blaikie	7883340331	Remove unused header to fix layering. llvm-svn: 328842	2018-03-29 22:35:59 +00:00
David Blaikie	4778bb88ef	Remove unused headers to fix layering llvm-svn: 328840	2018-03-29 22:31:39 +00:00
David Blaikie	c90289b5d3	llvm-c: Split Utils out of Scalar.h To fix layering (so that Scalar.h, a libScalarOpts header, isn't included from Utils - which libScalarOpts depends on). llvm-svn: 328839	2018-03-29 22:31:38 +00:00
Evgeniy Stepanov	50635dab26	Add msan custom mapping options. Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan. As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html Patch by vit9696(at)avp.su. Differential Revision: https://reviews.llvm.org/D44926 llvm-svn: 328830	2018-03-29 21:18:17 +00:00
Philip Reames	5c14ed89f6	[NFC][LICM] Rearrange checks to have the cheap bail out first llvm-svn: 328822	2018-03-29 20:32:15 +00:00
Haicheng Wu	c7cc87922e	[JumpThreading] Don't select an edge that we know we can't thread In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high level optimizations): while (true) { bool cond = ...; if (!cond) { <body> } if (cond) break; } Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above mentioned change, prevented it from doing this because it would have to thread an edge into the loop header. Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them at arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not thraded which is acceptable). Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42260 llvm-svn: 328798	2018-03-29 16:01:26 +00:00
David Green	b0aa36f9c2	[LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface The existing LoopRotation.cpp is implemented as one of loop passes instead of being a utility. The user cannot easily perform the loop rotation selectively (or on demand) under different optimization level. For example, the loop rotation is needed as part of the logic to convert a loop into a loop with bottom test for a transformation. If the loop rotation is simply added as a loop pass before the transformation, the pass is skipped if it is compiled at –O0 or if it is explicitly disabled by the user, causing the compiler to generate incorrect code. Furthermore, as a loop pass it will rotate all loops instead of just the relevant loops. We provide a utility interface for the loop rotation so that the loop rotation can be called on demand. The changeset is as follows: - Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main implementation of class LoopRotate into this file. - Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the interface LoopRotation(...). - Original LoopRotation.cpp is changed to use the utility function LoopRotation in LoopRotationUtils.cpp. This is done in the same way community did for mem-to-reg implementation. Patch by Jin Lin! Differential Revision: https://reviews.llvm.org/D44595 llvm-svn: 328766	2018-03-29 08:48:15 +00:00
Benjamin Kramer	6b995a4a7e	[Transforms] Make sure to include the c binding header when defining c binding functions Otherwise the definitions can't see the extern C declarations and get name mangled, making it impossible for users to call them. This breaks the Go bindings. llvm-svn: 328765	2018-03-29 07:56:53 +00:00
David Blaikie	8ad9a97310	Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737	2018-03-28 22:28:50 +00:00
David Blaikie	eb8cc04ea2	Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back llvm-svn: 328720	2018-03-28 18:03:25 +00:00
David Blaikie	a373d18eb7	Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717	2018-03-28 17:44:36 +00:00
Alexander Potapenko	4e7ad0805e	[MSan] Introduce ActualFnStart. NFC This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to prepend a special basic block containing tool-specific calls to each function. Because we still want to instrument the original entry block, we'll need to store it in ActualFnStart. For MSan this will still be F.getEntryBlock(), whereas for KMSAN it'll contain the second BB. llvm-svn: 328697	2018-03-28 11:35:09 +00:00
Alexander Potapenko	e1d5877847	[MSan] Add an isStore argument to getShadowOriginPtr(). NFC This is a step towards the upcoming KMSAN implementation patch. The isStore argument is to be used by getShadowOriginPtrKernel(), it is ignored by getShadowOriginPtrUserspace(). Depending on whether a memory access is a load or a store, KMSAN instruments it with different functions, __msan_metadata_ptr_for_load_X() and __msan_metadata_ptr_for_store_X(). Those functions may return different values for a single address, which is necessary in the case the runtime library decides to ignore particular accesses. llvm-svn: 328692	2018-03-28 10:17:17 +00:00
Xin Tong	0272cb077f	80-line wrap. NFC llvm-svn: 328660	2018-03-27 19:43:02 +00:00
Rong Xu	662f38b16f	[PGO] Fix branch probability remarks assert Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option. Differential Revision: https://reviews.llvm.org/D44809 llvm-svn: 328653	2018-03-27 18:55:56 +00:00
Krzysztof Parzyszek	5d93fdfa89	[LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target The default implementation returns false and keeps the current behavior. Differential Revision: https://reviews.llvm.org/D44735 llvm-svn: 328632	2018-03-27 16:14:11 +00:00
Max Kazantsev	b1ad66ff12	[LoopUnroll][NFC] Remove redundant canPeel check We check `canPeel` twice: when evaluating the number of iterations to be peeled and within the method `peelLoop` that performs peeling. This method is only executed if the calculated peel count is positive. Thus, the check in `peelLoop` can never fail. This patch replaces this check with an assert. Differential Revision: https://reviews.llvm.org/D44919 Reviewed By: fhahn llvm-svn: 328615	2018-03-27 09:40:51 +00:00
Sam Parker	90b7f4f72c	[IRCE] Enable decreasing loops of non-const bound As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner: - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value. - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds. Differential Revision: https://reviews.llvm.org/D44776 llvm-svn: 328613	2018-03-27 08:24:53 +00:00
Sanjay Patel	0e3167cb30	[InstCombine] improve code comment; NFC llvm-svn: 328560	2018-03-26 17:52:02 +00:00
Sebastian Pop	d870aea03e	[InstCombine] reassociate loop invariant GEP chains to enable LICM This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib. do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1 In this example %idx.ext1 is a loop invariant. It will be moved above the use of loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sinked down in the GEP chain. This patch will produce the following output: do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1 %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext llvm-svn: 328539	2018-03-26 16:19:31 +00:00
Sanjay Patel	4fd4fd610c	[InstCombine] distribute fmul over fadd/fsub This replaces a large chunk of code that was looking for compound patterns that include these sub-patterns. Existing tests ensure that all of the previous examples are still folded as expected. We still need to loosen the FMF check. llvm-svn: 328502	2018-03-26 15:03:57 +00:00
Sanjay Patel	2455fef497	[InstCombine] check uses before creating instructions for fmul distribution As the tests show, we could create extra instructions without any obvious benefit. llvm-svn: 328498	2018-03-26 14:25:43 +00:00
Krzysztof Parzyszek	0b377e0ae9	[LSR] Allow giving priority to post-incrementing addressing modes Implement TTI interface for targets to indicate that the LSR should give priority to post-incrementing addressing modes. Combination of patches by Sebastian Pop and Brendon Cahoon. Differential Revision: https://reviews.llvm.org/D44758 llvm-svn: 328490	2018-03-26 13:10:09 +00:00
Max Kazantsev	a55749312b	[LoopUnroll] Fix dangling pointers in SCEV Current logic of loop SCEV invalidation in Loop Unroller implicitly relies on fact that exit count of outer loops cannot rely on exiting blocks of inner loops, which is true in current implementation of backedge taken count calculation but is wrong in general. As result, when we only forget the loop that we have just unrolled, we may still have cached data for its outer loops (in particular, exit counts) which keeps references on blocks of inner loop that could have been changed or even deleted. The attached test demonstrates a situaton when after unrolling of innermost loop the outermost loop contains a dangling pointer on non-existant block. The problem shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV smarter about exit count calculation. I am not sure if the bug exists without this patch, it appears that now it is accidentally correct just because in practice exact backedge taken count for outer loops with complex control flow inside is never calculated. But when SCEV learns to do so, this problem shows up. This patch replaces existing logic of SCEV loop invalidation with a correct one, which happens to be invalidation of outermost loop (which also leads to invalidation of all loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers on removed blocks, or just outdated information that has changed after unrolling. Differential Revision: https://reviews.llvm.org/D44818 Reviewed By: samparker llvm-svn: 328483	2018-03-26 11:31:46 +00:00
Benjamin Kramer	8840f644b4	[DeadArgElim] Strip allocsize attributes when deleting an argument. Since allocsize refers to the argument number it gets invalidated when an argument is removed and the numbers shift. llvm-svn: 328481	2018-03-26 09:44:24 +00:00
Sam Parker	53a423a417	[IRCE] Enable increasing loops of variable bounds CanBeMin is currently used which will report true for any unknown values, but often a check is performed outside the loop which covers this situation: for (int i = 0; i < N; ++i) ... if (N > 0) for (int i = 0; i < N; ++i) ... So I've add 'LoopGuardedAgainstMin' which reports whether N is greater than the minimum value which then allows loop with a variable loop count to be optimised. I've also moved the increasing bound checking into its own function and replaced SumCanReachMax is another isLoopEntryGuardedByCond function. llvm-svn: 328480	2018-03-26 09:29:42 +00:00
Sanjay Patel	93e64dd9a1	[PatternMatch] allow undef elements when matching vector FP +0.0 This continues the FP constant pattern matching improvements from: https://reviews.llvm.org/rL327627 https://reviews.llvm.org/rL327339 https://reviews.llvm.org/rL327307 Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer. llvm-svn: 328461	2018-03-25 21:16:33 +00:00

... 17 18 19 20 21 ...

20610 Commits