clang-p2996

Author	SHA1	Message	Date
Whitney Tsang	0d8f102809	[NFC][LoopUnroll] Add `-unroll-runtime-other-exit-predictable=false` in `runtime-multiexit-heuristic.ll` Added -unroll-runtime-other-exit-predictable=false in runtime-multiexit-heuristic.ll to make it more robust. runtime-multiexit-heuristic.ll intention is to test -unroll-runtime-multi-exit=false, so the default value of -unroll-runtime-other-exit-predictable should not impact the result. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98098	2021-03-07 23:51:09 +00:00
Whitney Tsang	40391cef61	[LoopUnrollRuntime] Add option to assume the non latch exit block to be predictable. (Add LIT) Reviewed By: Meinersbur, bmahjour Differential Revision: https://reviews.llvm.org/D97747	2021-03-07 23:48:00 +00:00
Roman Lebedev	b46c085d2b	[NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressions These intrinsics, not the icmp+select are the canonical form nowadays, so we might as well directly emit them. This should not cause any regressions, but if it does, then then they would needed to be fixed regardless. Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`, but that is a pessimization, not a correctness issue. Additionally, the non-intrinsic form has issues with undef, see https://reviews.llvm.org/D88287#2587863	2021-03-06 21:52:46 +03:00
Dávid Bolvanský	cd54c57919	Reland "[Libcalls, Attrs] Annotate libcalls with noundef" Fixed Clang tests.	2021-02-20 06:18:48 +01:00
Dávid Bolvanský	94d034fb86	Revert "[Libcalls, Attrs] Annotate libcalls with noundef" This reverts commit `33b0c63775`. Bots are failing. Some Clang tests need to be updated too.	2021-02-20 04:18:42 +01:00
Dávid Bolvanský	33b0c63775	[Libcalls, Attrs] Annotate libcalls with noundef I think we can use here same logic as for nonnull. strlen(X) - X must be noundef => valid pointer. for libcalls with size arg, we add noundef only if size is known and greater than 0 - so pointers must be noundef (valid ones) Reviewed By: jdoerfert, aqjune Differential Revision: https://reviews.llvm.org/D95122	2021-02-20 04:10:07 +01:00
Sanjay Patel	378941f611	[ValueTracking] add scan limit for assumes In the motivating example from https://llvm.org/PR49171 and reduced test here, we would unroll and clone assumes so much that compile-time effectively became infinite while analyzing all of those assumes.	2021-02-15 15:24:20 -05:00
Sam Parker	9d81ccc02f	[WebAssembly] Enable loop unrolling Enable partial and runtime unrolling with a threshold of 30, which was derived from a large number of kernels running on node and wasmtime for amd64 and aarch64. Unrolling is enabled by default at -O2 and -O3 and is disabled at -Oz and -Os. Compiling with -Os is recommended if the wasm binary size is the most important factor. Differential Revision: https://reviews.llvm.org/D95125	2021-02-10 08:25:46 +00:00
Gil Rapaport	d475030dc2	[SCEV] Apply loop guards to divisibility tests Extend applyLoopGuards() to take into account conditions/assumes proving some value %v to be divisible by D by rewriting %v to (%v / D) * D. This lets the loop unroller and the loop vectorizer identify more loops as not requiring remainder loops. Differential Revision: https://reviews.llvm.org/D95521	2021-02-02 08:09:39 +02:00
Jeroen Dobbelaere	80cdd30eb9	[LoopPeel] Use llvm.experimental.noalias.scope.decl for duplicating noalias metadata as needed. The reduction of a sanitizer build failure when enabling the dominance check (D95335) showed that loop peeling also needs to take care of scope duplication, just like loop unrolling (D92887). Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95544	2021-02-01 10:01:17 +01:00
Jeroen Dobbelaere	774629641b	[LoopUnroll] Use llvm.experimental.noalias.scope.decl for duplicating noalias metadata as needed This is a fix for https://bugs.llvm.org/show_bug.cgi?id=39282. Compared to D90104, this version is based on part of the full restrict patched (D68484) and uses the `@llvm.experimental.noalias.scope.decl` intrinsic to track the location where !noalias and !alias.scope scopes have been introduced. This allows us to only duplicate the scopes that are really needed. Notes: - it also includes changes and tests from D90104 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D92887	2021-01-24 13:48:20 +01:00
Roman Lebedev	1742203844	[SimplifyCFG] FoldBranchToCommonDest(): re-lift restrictions on liveout uses of bonus instructions I have previously tried doing that in `b33fbbaa34` / `d38205144f`, but eventually it was pointed out that the approach taken there was just broken wrt how the uses of bonus instructions are updated to account for the fact that they should now use either bonus instruction or the cloned bonus instruction. In particluar, all that manual handling of PHI nodes in successors was just wrong. But, the fix is actually much much simpler than my initial approach: just tell SSAUpdate about both instances of bonus instruction, and let it deal with all the PHI handling. Alive2 confirms that the reproducers from the original bugs (@pr48450*) are now handled correctly. This effectively reverts commit `59560e8589`, effectively relanding `b33fbbaa34`.	2021-01-23 01:29:05 +03:00
Joseph Tremoulet	40cd262c43	Loop peeling: check that latch is conditional branch Loop peeling assumes that the loop's latch is a conditional branch. Add a check to canPeel that explicitly checks for this, and testcases that otherwise fail an assertion when trying to peel a loop whose back-edge is a switch case or the non-unwind edge of an invoke. Reviewed By: skatkov, fhahn Differential Revision: https://reviews.llvm.org/D94995	2021-01-20 11:01:16 -05:00
Arthur Eubanks	f748e92295	[NewPM] Run non-trivial loop unswitching under -O2/3/s/z Fixes https://bugs.llvm.org/show_bug.cgi?id=48715. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D94448	2021-01-12 11:04:40 -08:00
Serguei Katkov	7f69860243	[LoopUnroll] Fix a crash Loop peeling as a last step triggers loop simplification and this can change the loop structure. As a result all cashed values like latch branch becomes invalid. Patch re-structure the code to take into account the possible changes caused by peeling. Reviewers: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour Reviewed By: Meinersbur, fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D93686	2021-01-11 10:19:26 +07:00
Hiroshi Yamauchi	cf5415c727	[PGO][PGSO] Let unroll hints take precedence over PGSO. Differential Revision: https://reviews.llvm.org/D94199	2021-01-07 10:10:31 -08:00
Juneyoung Lee	ae6e89327b	Precommit tests that have poison as shufflevector's placeholder This commit copies existing tests at llvm/Transforms containing 'shufflevector X, undef' and replaces them with 'shufflevector X, poison'. The new copied tests have -inseltpoison.ll suffix at its file name (as `db7a2f347f` did) See https://reviews.llvm.org/D93793 Test files listed using grep -R -E "^[^;]shufflevector <.> ., <.> undef" \| cut -d":" -f1 \| uniq Test files copied & updated using file_org=llvm/test/Transforms/$1 if [[ "$file_org" = -inseltpoison.ll ]]; then file=$file_org else file=${file_org%.ll}-inseltpoison.ll if [ ! -f $file ]; then cp $file_org $file fi fi sed -i -E 's/^([^;])shufflevector <(.)> (.), <(.)> undef/\1shufflevector <\2> \3, <\4> poison/g' $file head -1 $file \| grep "Assertions have been autogenerated by utils/update_test_checks.py" -q if [ "$?" == 1 ]; then echo "$file : should be manually updated" # The test is manually updated exit 1 fi python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file	2020-12-29 17:09:31 +09:00
Juneyoung Lee	db7a2f347f	Precommit transform tests that have poison as insertelement's placeholder This commit copies existing tests at llvm/Transforms and replaces 'insertelement undef' in those files with 'insertelement poison'. (see https://reviews.llvm.org/D93586) Tests listed using this script: grep -R -E '^[^;]insertelement <.> undef,' . \| cut -d":" -f1 \| uniq \| wc -l Tests updated: file_org=llvm/test/Transforms/$1 file=${file_org%.ll}-inseltpoison.ll cp $file_org $file sed -i -E 's/^([^;])insertelement <(.)> undef/\1insertelement <\2> poison/g' $file head -1 $file \| grep "Assertions have been autogenerated by utils/update_test_checks.py" -q if [ "$?" == 1 ]; then echo "$file : should be manually updated" # I manually updated the script exit 1 fi python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file	2020-12-24 11:46:17 +09:00
Roman Lebedev	5cce4aff18	[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock() already knows how to preserve DomTree ... so just ensure that we pass DomTreeUpdater it into it. Fixes DomTree preservation for a large number of tests, all of which are marked as such so that they do not regress.	2020-12-17 01:03:49 +03:00
Roman Lebedev	aa2009fe78	[NFCI][SimplifyCFG] Mark all the SimplifyCFG tests that already don't invalidate DomTree as such First step after `e113317958`, in these tests, DomTree is valid afterwards, so mark them as such, so that they don't regress. In further steps, SimplifyCFG transforms shall taught to preserve DomTree, in as small steps as possible.	2020-12-17 01:03:49 +03:00
Roman Lebedev	59560e8589	[SimplifyCFG] FoldBranchToCommonDest(): temporairly put back restrictions on liveout uses of bonus instructions (PR48450) Even though `d38205144f` was mostly a correct fix for the external non-PHI users, it's not a generally correct fix, because the 'placeholder' values in those trivial PHI's we create shouldn't be always 'undef', but the PHI itself for the backedges, else we end up with wrong value, as the `@pr48450_2` test shows. But we can't just do that, because we can't check that the PHI can be it's own incoming value when coming from certain predecessor, because we don't have a dominator tree. So until we can address this correctness problem properly, ensure that we don't perform the transformation if there are such problematic external uses. Making dominator tree available there is going to be involved, since `-simplifycfg` pass currently does not preserve/update domtree...	2020-12-14 20:14:31 +03:00
Arthur Eubanks	a820261bf3	[test] Fix store_cost.ll under NPM The NPM processes loops in forward program order, whereas the legacy PM processes them in reverse program order. No reason to test both PMs here, so just stick to the NPM.	2020-12-07 21:19:05 -08:00
Roman Lebedev	b33fbbaa34	Reland [SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions This was orginally committed in `2245fb8aaa`. but was immediately reverted in `f3abd54958` because of a PHI handling issue. Original commit message: 1. It doesn't make sense to enforce that the bonus instruction is only used once in it's basic block. What matters is whether those user instructions fit within our budget, sure, but that is another question. 2. It doesn't make sense to enforce that said bonus instructions are only used within their basic block. Perhaps the branch condition isn't using the value computed by said bonus instruction, and said bonus instruction is simply being calculated to be used in successors? So iff we can clone bonus instructions, to lift these restrictions, we just need to carefully update their external uses to use the new cloned instructions. Notably, this transform (even without this change) appears to be poison-unsafe as per alive2, but is otherwise (including the patch) legal. We don't introduce any new PHI nodes, but only "move" the instructions around, i'm not really seeing much potential for extra cost modelling for the transform, especially since now we allow at most one such bonus instruction by default. This causes the fold to fire +11.4% more (13216 -> 14725) as of vanilla llvm test-suite + RawSpeed. The motivational pattern is IEEE-754-2008 Binary16->Binary32 extension code: `ca57d77fb2/src/librawspeed/common/FloatingPoint.h (L115-L120)` ^ that should be a switch, but it is not now: https://godbolt.org/z/bvja5v That being said, even thought this seemed like this would fix it: https://godbolt.org/z/xGq3TM apparently that fold is happening somewhere else afterall, so something else also has a similar 'artificial' restriction.	2020-11-27 12:47:15 +03:00
Roman Lebedev	f3abd54958	Revert "[SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions" Many bots are unhappy, at the very least missed a few codegen tests, and possibly this has a logic hole inducing a miscompile (will be really awesome to have ready reproducer..) Need to investigate. This reverts commit `2245fb8aaa`.	2020-11-26 23:13:43 +03:00
Roman Lebedev	2245fb8aaa	[SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions 1. It doesn't make sense to enforce that the bonus instruction is only used once in it's basic block. What matters is whether those user instructions fit within our budget, sure, but that is another question. 2. It doesn't make sense to enforce that said bonus instructions are only used within their basic block. Perhaps the branch condition isn't using the value computed by said bonus instruction, and said bonus instruction is simply being calculated to be used in successors? So iff we can clone bonus instructions, to lift these restrictions, we just need to carefully update their external uses to use the new cloned instructions. Notably, this transform (even without this change) appears to be poison-unsafe as per alive2, but is otherwise (including the patch) legal. We don't introduce any new PHI nodes, but only "move" the instructions around, i'm not really seeing much potential for extra cost modelling for the transform, especially since now we allow at most one such bonus instruction by default. This causes the fold to fire +11.4% more (13216 -> 14725) as of vanilla llvm test-suite + RawSpeed. The motivational pattern is IEEE-754-2008 Binary16->Binary32 extension code: `ca57d77fb2/src/librawspeed/common/FloatingPoint.h (L115-L120)` ^ that should be a switch, but it is not now: https://godbolt.org/z/bvja5v That being said, even thought this seemed like this would fix it: https://godbolt.org/z/xGq3TM apparently that fold is happening somewhere else afterall, so something else also has a similar 'artificial' restriction.	2020-11-26 22:51:22 +03:00
Sanjay Patel	99cf39bfed	[LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC See discussion in D90554. This is a partial un-revert of `32dd5870ee`. I'm adding back the baseline tests first, so we don't have to back-track as much in case there are still problems.	2020-11-20 08:15:46 -05:00
Eric Christopher	32dd5870ee	Temporarily Revert "[CostModel] remove cost-kind predicate for intrinsics in basic TTI implementation" as it's causing crashes in the optimizer. A reduced testcase has been posted as a follow-up. This reverts commit `f7eac51b9b`. Temporarily Revert "[CostModel] make default size cost for libcalls small (again)" as it depends upon the primary revert. This reverts commit `8ec7ea3ddc`. Temporarily Revert "[CostModel] add tests for math library calls; NFC" as it depends upon the primary revert. This reverts commit `df09f82599`. Temporarily Revert "[LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC" as it depends upon the primary revert. This reverts commit `618d555e8d`.	2020-11-19 22:10:23 -08:00
Sanjay Patel	8ec7ea3ddc	[CostModel] make default size cost for libcalls small (again) This was changed recently with D90554 / `f7eac51b9b` ...because we had a regression testing blindspot for intrinsics that are expected to be lowered to libcalls. In general, we want the size cost for a scalar call to be cheap even if the other costs are expensive - we expect it to just be a branch with some optional stack manipulation. It is likely that we will want to carve out some exceptions/overrides to this rule as follow-up patches for calls that have some general and/or target-specific difference to the expected lowering. This was noticed as a regression in unrolling, so we have a test for that now along with a couple of direct cost model tests. If the assumed scalarization costs for the oversized vector calls are not realistic, that would be another follow-up refinement of the cost models.	2020-11-14 08:15:35 -05:00
Sanjay Patel	618d555e8d	[LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC See discussion in D90554.	2020-11-13 17:15:23 -05:00
David Green	c7e275388e	[ARM] Don't aggressively unroll vector remainder loops We already do not unroll loops with vector instructions under MVE, but that does not include the remainder loops that the vectorizer produces. These remainder loops will be rarely executed and are not worth unrolling, as the trip count is likely to be low if they get executed at all. Luckily they get llvm.loop.isvectorized to make recognizing them simpler. We have wanted to do this for a while but hit issues with low overhead loops being reverted due to difficult registry allocation. With recent changes that seems to be less of an issue now. Differential Revision: https://reviews.llvm.org/D90055	2020-11-10 17:01:31 +00:00
David Green	44c1a56869	[ARM] Add extra MVE tests for various patches. NFC	2020-11-01 16:24:23 +00:00
Arthur Eubanks	5c31b8b94f	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `10f2a0d662`. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Arthur Eubanks	10f2a0d662	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
Nico Weber	2a4e704c92	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `e5766f25c6`. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.	2020-10-27 09:26:21 -04:00
Arthur Eubanks	e5766f25c6	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-26 20:24:04 -07:00
Tim Corringham	3c1273d737	[AMDGPU] Add amdgpu specific loop threshold metadata Add new loop metadata amdgpu.loop.unroll.threshold to allow the initial AMDGPU specific unroll threshold value to be specified on a loop by loop basis. The intention is to be able to to allow more nuanced hints, e.g. specifying a low threshold value to indicate that a loop may be unrolled if cheap enough rather than using the all or nothing llvm.loop.unroll.disable metadata. Differential Revision: https://reviews.llvm.org/D84779	2020-10-22 17:21:32 +01:00
Arthur Eubanks	f2f0474c93	[test] Fix FullUnroll.ll I believe the intention of this test added in https://reviews.llvm.org/D71687 was to test LoopFullUnrollPass with clang's -fno-unroll-loops, not its interaction with optnone. Loop unrolling passes don't run under optnone/-O0. Also added back unintentionally removed -disable-loop-unrolling from https://reviews.llvm.org/D85578. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D86485	2020-09-17 15:56:13 -07:00
Roman Lebedev	95848ea101	[Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks As FIXME said, they really should be checking for a single user, not use, so let's do that. It is not that unusual to have the same value as incoming value in a PHI node, not unlike how a PHI may have the same incoming basic block more than once. There isn't a nice way to do that, Value::users() isn't uniqified, and Value only tracks it's uses, not Users, so the check is potentially costly since it does indeed potentially involes traversing the entire use list of a value.	2020-08-26 20:20:41 +03:00
dfukalov	33e2f69a24	[AMDGPU][LoopUnroll] Increase BB size to analyze for complete unroll. The `UnrollMaxBlockToAnalyze` parameter is used at the stage when we have no information about a loop body BB cost. In some cases, e.g. for simple loop ``` for(int i=0; i<32; ++i){ D = Arr2[i8 + C1]; Arr1[i64 + C2] += C3 * D; Arr1[i64 + C2 + 2048] += C4 D; } ``` current default parameter value is not enough to run deeper cost analyze so the loop is not completely unrolled. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D86248	2020-08-20 10:41:47 +03:00
Sam Parker	dad04e62f1	[NFC] run update test script On Transforms/LoopUnroll/runtime-small-upperbound.ll	2020-08-17 13:54:28 +01:00
Arthur Eubanks	72effd8d5b	[test][LoopUnroll] Cleanup FullUnroll.ll This is in preparation for enabling proper handling of optnone under the NPM. Most optimizations won't run on an optnone function. Previously the test would rely on lots of optimizations to optimize the IR into a simple infinite loop. This is an optnone function, so clearly that shouldn't be the case. This IR was found by printing the module before the LoopFullUnrollerPass ran. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D85578	2020-08-14 16:05:04 -07:00
Sam Parker	ea8448e361	[LoopUnroll] Adjust CostKind query When TTI was updated to use an explicit cost, TCK_CodeSize was used although the default implicit cost would have been the hand-wavey cost of size and latency. So, revert back to this behaviour. This is not expected to have (much) impact on targets since most (all?) of them return the same value for SizeAndLatency and CodeSize. When optimising for size, the logic has been changed to query CodeSize costs instead of SizeAndLatency. This patch also adds a testing option in the unroller so that OptSize thresholds can be specified. Differential Revision: https://reviews.llvm.org/D85723	2020-08-12 12:56:09 +01:00
Arthur Eubanks	b36c39260e	[NewPM] Don't print 'Invalidating all non-preserved analyses' If an analysis is actually invalidated, there's already a log statement for that: 'Invalidating analysis: FooAnalysis'. Otherwise the statement is not very useful. Reviewed By: asbirlea, ychen Differential Revision: https://reviews.llvm.org/D84981	2020-07-30 19:40:29 -07:00
Yuanfang Chen	555cf42f38	[NewPM][PassInstrument] Add PrintPass callback to StandardInstrumentations Problem: Right now, our "Running pass" is not accurate when passes are wrapped in adaptor because adaptor is never skipped and a pass could be skipped. The other problem is that "Running pass" for a adaptor is before any "Running pass" of passes/analyses it depends on. (for example, FunctionToLoopPassAdaptor). So the order of printing is not the actual order. Solution: Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers. Check PassBuilder) This patch move debug logging for pass as a PassInstrument callback. We could be sure that all running passes are logged and in the correct order. This could also be used to implement hierarchy pass logging in legacy PM. We could also move logging of pass manager to this if we want. The test fixes looks messy. It includes changes: - Remove PassInstrumentationAnalysis - Remove PassAdaptor - If a PassAdaptor is for a real pass, the pass is added - Pass reorder (to the correct order), related to PassAdaptor - Add missing passes (due to Debuglogging not passed down) Reviewed By: asbirlea, aeubanks Differential Revision: https://reviews.llvm.org/D84774	2020-07-30 10:07:57 -07:00
Jinsong Ji	d28f86723f	Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `bf544fa1c3`. Fixed the typo in PPCInstrInfo.cpp.	2020-07-28 14:00:11 +00:00
Jinsong Ji	bf544fa1c3	Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `adffce7153`. This is breaking test-suite, revert while investigation.	2020-07-27 21:07:00 +00:00
Jinsong Ji	adffce7153	[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html no one is making use of QPX/A2Q/BGQ/BGP CNK anymore. This patch remove the support of QPX/A2Q in llvm, BGQ/BGP in clang, CNK support in openmp/polly. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D83915	2020-07-27 19:24:39 +00:00
Hongtao Yu	f3731d34fa	[LoopUnroll] Update branch weight for remainder loop Unrolling a loop with compile-time unknown trip count results in a remainder loop. The remainder loop executes the remaining iterations of the original loop when the original trip count is not a multiple of the unroll factor. For better profile counts maintenance throughout the optimization pipeline, I'm assigning an artificial weight to the latch branch of the remainder loop. A remainder loop runs up to as many times as the unroll factor subtracted by 1. Therefore I'm assigning the maximum possible trip count as the back edge weight. This should be more accurate than the default non-profile weight, which assumes the back edge runs much more frequently than the exit edge. Differential Revision: https://reviews.llvm.org/D83187	2020-07-15 12:33:29 -07:00
Arthur Eubanks	481709e831	[NewPM][opt] Share -disable-loop-unrolling between pass managers There's no reason to introduce a new option for the NPM. The various PGO options are shared in this manner. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D83368	2020-07-08 08:50:56 -07:00
Roman Lebedev	c3b8bd1eea	[InstCombine] Always try to invert non-canonical predicate of an icmp Summary: The actual transform i was going after was: https://rise4fun.com/Alive/Tp9H ``` Name: zz Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0 %t0 = and i8 %x, C0 %r = icmp eq i8 %t0, C1 => %t = icmp eq i8 %t0, 0 %r = xor i1 %t, -1 Name: zz Pre: isPowerOf2(C0) %t0 = and i8 %x, C0 %r = icmp ne i8 %t0, 0 => %t = icmp eq i8 %t0, 0 %r = xor i1 %t, -1 ``` but as it can be seen from the current tests, we already canonicalize most of it, and we are only missing handling multi-use non-canonical icmp predicates. If we have both `!=0` and `==0`, even though we can CSE them, we end up being stuck with them. We should canonicalize to the `==0`. I believe this is one of the cleanup steps i'll need after `-scalarizer` if i end up proceeding with my WIP alloca promotion helper pass. Reviewers: spatel, jdoerfert, nikic Reviewed By: nikic Subscribers: zzheng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83139	2020-07-04 18:12:04 +03:00

1 2 3 4 5 ...

392 Commits