clang-p2996

Author	SHA1	Message	Date
Vir Narula	210c851327	[Matrix] Add dot product tests LLVM LIT tests for our upcoming dot product lowering change Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D126942	2022-06-03 20:02:42 +01:00
Alexey Bataev	cac60940b7	[SLP]Improve shuffles cost estimation where possible. Improved/fixed cost modeling for shuffles by providing masks, improved cost model for non-identity insertelements. Differential Revision: https://reviews.llvm.org/D115462	2022-06-03 08:06:22 -07:00
Arnold Schwaighofer	5c902af572	[coro async] Add code to support dynamic aligment of over-aligned types in async frames Async context frames are allocated with a maximum alignment. If a type requests an alignment bigger than that dynamically align the address in the frame. Differential Revision: https://reviews.llvm.org/D126715	2022-06-03 07:06:14 -07:00
Nikita Popov	fcdc6a466a	[SCCP] Regenerate test checks with function signature (NFC) The previous checks were manually modified to avoid the label clash. Use the --function-signature flag that exists for this purpose.	2022-06-03 14:37:41 +02:00
Nikita Popov	6baf44c8b1	[SCCP] Regenerate test checks (NFC)	2022-06-03 14:27:34 +02:00
Florian Hahn	a5bb4a3b4d	[VPlan] Replace CondBit with BranchOnCond VPInstruction. This patch removes CondBit and Predicate from VPBasicBlock. To do so, the patch introduces a new branch-on-cond VPInstruction opcode to model a branch on a condition explicitly. This addresses a long-standing TODO/FIXME that blocks shouldn't be users of VPValues. Those extra users can cause issues for VPValue-based analyses that don't expect blocks. Addressing this fixme should allow us to re-introduce `266ea446ab`. The generic branch opcode can also be used in follow-up patches. Depends on D123005. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D126618	2022-06-03 11:48:31 +01:00
Fangrui Song	df0f30dc36	Revert "[SLP]Improve shuffles cost estimation where possible." This reverts commit `9980c99718`. Caused assertion failures: https://reviews.llvm.org/D115462#3555350	2022-06-03 00:30:34 -07:00
Serguei Katkov	24e16e4af2	[SSAUpdaterImpl] Do not generate phi node with all the same incoming values If all available vals to basic block are the same - do not build new phi node and just use this value. Reviewed By: sameerds Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D126525	2022-06-03 12:24:33 +07:00
Daniil Suchkov	f1940a5895	Revert "[LoopInterchange] New cost model for loop interchange" Reverting the commit due to numerous buildbot failures. This reverts commit `006334470d`.	2022-06-03 00:52:08 +00:00
Congzhe Cao	006334470d	[LoopInterchange] New cost model for loop interchange This patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis. Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern. Updates made to test cases are mostly minor changes and some corrections. Test coverage for loop interchange is not reduced. Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model compeletely. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D124926	2022-06-02 19:07:14 -04:00
Florian Hahn	78c6b1488f	[CaptureTracking] Increase limit and use it for all visited uses. Currently the MaxUsesToExplore limit only applies to the number of users per value, not the total number of users to explore. The current limit of 20 pessimizes IR with opaque pointers in some cases. Without opaque pointers, we have deeper pointer def-use chains in general due to extra bitcasts and geps for structs with index 0. With opaque pointers the def-use chain is not as deep but wider, due to bitcasts & 0-geps missing. To improve the situation for opaque pointers, this patch does 2 things: 1. Apply the limit to the total number of uses visited. From the wording in the description of the option it seems like this may be the original intention. With the current implementation we could still end up walking a lot of uses. 2. Increase the limit to 100. This is quite arbitrary, but enables a good number of additional optimizations. Those adjustments have a noticeable compile-time impact though. In part that is likely due to additional transformations (and conversely the current baseline misses optimizations after switching to opaque pointers). This recovers some regressions that showed up after enabling opaque pointers. Limit=100: * NewPM-O3: +0.21% * NewPM-ReleaseThinLTO: +0.87% * NewPM-ReleaseLTO-g: +0.46% https://llvm-compile-time-tracker.com/compare.php?from=2e50ecb2ef4e1da1aeab05bcf66380068e680991&to=7e6fbe519d958d09f32f01d5d44a622f551e2031&stat=instructions Limit=60: * NewPM-O3: +0.14% * NewPM-ReleaseThinLTO: +0.41% * NewPM-ReleaseLTO-g: +0.21% https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=520563fdc146319aae90d06f88d87f2e9e1247b7&stat=instructions Limit=40: * NewPM-O3: +0.11% * NewPM-ReleaseThinLTO: +0.12% * NewPM-ReleaseLTO-g: +0.09% https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=c9182576e9fe3f1c84a71479665aef91a416318c&stat=instructions Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126236	2022-06-02 21:43:58 +01:00
Sanjay Patel	1882c25f12	[InstCombine] add tests for mul with low-bit mask operand; NFC	2022-06-02 16:01:23 -04:00
Florian Hahn	44c86e5cdc	[GVN] Add test for capture tracking use limit. Test for capture-tracking-max-uses-to-explore, adjusted in D126236.	2022-06-02 20:15:26 +01:00
Alexey Bataev	9980c99718	[SLP]Improve shuffles cost estimation where possible. Improved/fixed cost modeling for shuffles by providing masks, improved cost model for non-identity insertelements. Differential Revision: https://reviews.llvm.org/D115462	2022-06-02 11:18:14 -07:00
Nikita Popov	41d5033eb1	[IR] Enable opaque pointers by default This enabled opaque pointers by default in LLVM. The effect of this is twofold: * If IR that contains neither explicit ptr nor %T* types is passed to tools, we will now use opaque pointer mode, unless -opaque-pointers=0 has been explicitly passed. * Users of LLVM as a library will now default to opaque pointers. It is possible to opt-out by calling setOpaquePointers(false) on LLVMContext. A cmake option to toggle this default will not be provided. Frontends or other tools that want to (temporarily) keep using typed pointers should disable opaque pointers via LLVMContext. Differential Revision: https://reviews.llvm.org/D126689	2022-06-02 09:40:56 +02:00
Alexey Bataev	73020b4540	Revert "[SLP]Improve shuffles cost estimation where possible." This reverts commit `fd5a6ce9dc` to fix a crash detected by a buildbot https://lab.llvm.org/buildbot/#/builders/179/builds/3805/steps/11/logs/stdio.	2022-06-01 15:44:51 -07:00
Florian Hahn	72aca94b90	[LV] Add additional tests for pointer select support. Additional test cases for D114487.	2022-06-01 21:19:03 +01:00
Alexey Bataev	fd5a6ce9dc	[SLP]Improve shuffles cost estimation where possible. Improved/fixed cost modeling for shuffles by providing masks, improved cost model for non-identity insertelements. Differential Revision: https://reviews.llvm.org/D115462	2022-06-01 11:01:37 -07:00
Alexey Bataev	fe4949942d	[SLP]Fix PR55796: insert point for extractelements from different basic blocks. Extractelement instructions may come from different basic blocks, need to take it into account when looking for a last instruction in the bundle to prevent compiler crash. Differential Revision: https://reviews.llvm.org/D126777	2022-06-01 09:44:53 -07:00
Alexander Kornienko	aa98e7e1eb	Revert "[InstCombine] Combine instructions of type or/and where AND masks can be combined." This reverts commit `ec4adf1f6c`. The commit causes clang to hang on a certain input: ``` $ cat q.cc int f(int a, int b) { int c = ((unsigned char)(a >> 23) & 925); if (a) c = (a >> 23 & b) \| ((unsigned char)(a >> 23) & 925) \| (b >> 23 & 157); return c; } $ time ./clang-15-10515 --target=x86_64--linux-gnu -O1 -c q.cc ^C real 0m45.072s user 0m0.025s sys 0m0.099s ```	2022-06-01 14:20:00 +02:00
Florian Hahn	05776122b6	[VPlan] Use region for each loop in native path. This patch updates the VPlan native path to use VPRegionBlocks for all loops in a loop nest. Up to now, only the outermost loop used a region. This is a step towards unifying both paths and keep things consistent between them. It also prepares various code-gen parts for modeling the pre-header in the inner loop vectorizer (D121624). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D123005	2022-06-01 10:41:05 +01:00
Nikita Popov	03aceab08b	[ValueTracking] Enable -branch-on-poison-as-ub by default Now that SimpleLoopUnswitch and other transforms no longer introduce branch on poison, enable the -branch-on-poison-as-ub option by default. The practical impact of this is mostly better flag preservation in SCEV, and some freeze instructions no longer being necessary. Differential Revision: https://reviews.llvm.org/D125299	2022-06-01 10:46:06 +02:00
Chenbing Zheng	e183665874	[InstCombine] [NFC] remove useless TODO	2022-06-01 09:29:05 +08:00
Eli Friedman	abdf0da800	[LoopIdiom] Fix bailout for aliasing in memcpy transform. Commit `dd5991cc` modified the aliasing checks here to allow transforming a memcpy where the source and destination point into the same object. However, the change accidentally made the code skip the alias check for other operations in the loop. Instead of completely skipping the alias check, just skip the check for whether the memcpy aliases itself. Differential Revision: https://reviews.llvm.org/D126486	2022-05-31 17:24:23 -07:00
Sanjay Patel	2bf6123f22	[InstCombine] fold icmp of sext bool based on limited range X <=u (sext i1 Y) --> (X == 0) \| Y https://alive2.llvm.org/ce/z/W_tZzo This is the conjugate/sibling pattern suggested with D126171 for a sign-extended bool value.	2022-05-31 12:37:56 -04:00
Sanjay Patel	f3fe38448b	[InstCombine] add tests for icmp of sext i1; NFC These are adapted from the zext tests added for D126171.	2022-05-31 12:37:56 -04:00
Nikita Popov	1257315b20	[Scalarizer] Regenerate test checks (NFC)	2022-05-31 17:23:15 +02:00
Augie Fackler	73f664601c	BuildLibCalls: infer allockind attributes on relevant functions Differential Revision: https://reviews.llvm.org/D123089	2022-05-31 10:01:17 -04:00
Nikita Popov	36cbdaa163	[InstCombine] Fix inbounds preservation when swapping GEPs (PR44206) When reassociating GEPs, we can only keep inbounds if both original GEPs were inbounds, and their offsets have the same sign. For the sake of simplicity, I only handle the case where both offsets are non-negative here. It would probably be fine to just not preserve inbounds at all here, but as I don't see a compile-time impact for adding the isKnownNonNegative() calls I went with this more conservative approach. Fixes https://github.com/llvm/llvm-project/issues/44206. Differential Revision: https://reviews.llvm.org/D126687	2022-05-31 15:45:02 +02:00
Danila Malyutin	4fb3fd7d82	[InstCombine] Fix const folding of switches with default case In case phi was in the default block it could lead to multi-edge. Fixes #55721. Differential Revision: https://reviews.llvm.org/D126650	2022-05-31 15:13:58 +03:00
Nikita Popov	872d69e5d4	[InstCombine] Fix inbounds preservation when merging GEPs (PR55722) Even if the total offset is inbounds, we might represent it by first performing a large negative offset and then a small positive one. With inbounds semantics as currently specified, each offset must be inbounds individually, not just the overall offset of the GEP. Fix this by checking that the sign of all offsets is the same. Fixes https://github.com/llvm/llvm-project/issues/55722.	2022-05-31 11:54:01 +02:00
Sanjay Patel	a0c3c60728	[InstCombine] fold shift-right-by-constant with shift-right-of-constant operand (C2 >> X) >> C1 --> (C2 >> C1) >> X The shift-left form of this transform has existed since: `16f18ed7b5` ...but it applies to matching shift right opcodes too: https://alive2.llvm.org/ce/z/c5eQms	2022-05-30 15:30:01 -04:00
Sanjay Patel	c5d942a4fb	[InstCombine] remove unnecessary one-use check from (C2 << X) << C1 fold The restriction goes back to: `16f18ed7b5` ...but the fold only replaces a shift with a shift, so that's not necessary. Generalizing to other opcodes is planned as a follow-up.	2022-05-30 15:17:54 -04:00
Sanjay Patel	a004438959	[InstCombine] add/move tests for shift-of-constant-by-same-shift-by-constant; NFC	2022-05-30 15:17:54 -04:00
Nuno Lopes	80b3dcc045	[Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace There are a few places where we use report_fatal_error when the input is broken. Currently, this function always crashes LLVM with an abort signal, which then triggers the backtrace printing code. I think this is excessive, as wrong input shouldn't give a link to LLVM's github issue URL and tell users to file a bug report. We shouldn't print a stack trace either. This patch changes report_fatal_error so it uses exit() rather than abort() when its argument GenCrashDiag=false. Reviewed by: nikic, MaskRay, RKSimon Differential Revision: https://reviews.llvm.org/D126550	2022-05-30 19:19:23 +01:00
Florian Hahn	b7d2b160c3	[VPlan] Add test for printing VPlan for outer loop vectorization. Test coverage for D123005.	2022-05-30 18:19:52 +01:00
Nikita Popov	a770f534e6	[InstCombine] When swapping GEPs, only keep inbounds if both are If only one of the GEPs is inbounds, then after swapping, there is no guarantee that one of them will be inbounds as well (see e.g. https://alive2.llvm.org/ce/z/agaCnp). This is only a partial fix, because even if both are inbounds, the result is not necessarily inbounds (if the offsets have different signs).	2022-05-30 17:04:42 +02:00
Nikita Popov	2d7bab666f	[InstCombine] Always create new GEPs when swapping GEPs As the long explanatory comment attests, performing the modification in place is pretty tricky. Drop this unnecessary complexity and always create new instructions. This should be NFC-ish, but can probably cause difference due to worklist order.	2022-05-30 16:48:52 +02:00
Nikita Popov	590fd54ca1	[InstCombine] Add tests for inbounds handling in loop invariant GEP fold (NFC)	2022-05-30 16:40:18 +02:00
Nikita Popov	2e101cca69	[Local] Don't remove invoke of non-willreturn function The code was only checking for memory side-effects, but not for divergence side-effects. Replace this with a generic check.	2022-05-30 15:37:46 +02:00
Nikita Popov	1f1de06165	[SimplifyCFG] Add test for invoke of nounwind non-willreturn function (NFC) Test both the case with and without willreturn attribute.	2022-05-30 15:36:33 +02:00
zhongyunde	3e6ba89055	[InstCombine] Fold a mul with bool value into and Fixes https://github.com/llvm/llvm-project/issues/55599 X * Y --> X & Y, iff X, Y can be only {0, 1}. https://alive2.llvm.org/ce/z/_RsTKF Reviewed By: spatel, nikic Differential Revision: https://reviews.llvm.org/D126040	2022-05-30 21:05:00 +08:00
Nikita Popov	1721ff1dfd	[GVN] Enable enable-split-backedge-in-load-pre option by default This option was added in D89854. It prevents GVN from performing load PRE in a loop, if doing so would require critical edge splitting on the backedge. From the review: > I know that GVN Load PRE negatively impacts peeling, > loop predication, so the passes expecting that latch has > a conditional branch. In the PhaseOrdering test in this patch, splitting the backedge negatively affects vectorization: After critical edge splitting, the loop gets rotated, effectively peeling off the first loop iteration. The effect is that the first element is handled separately, then the bulk of the elements use a vectorized reduction (but using unaligned, off-by-one memory accesses) and then a tail of 15 elements is handled separately again. It's probably worth noting that the loop load PRE from D99926 is not affected by this change (as it does not need backedge splitting). This is about normal load PRE that happens to occur inside a loop. Differential Revision: https://reviews.llvm.org/D126382	2022-05-30 09:55:58 +02:00
Chenbing Zheng	ef256ed58e	[InstCombine] bitcast (extractelement <1 x elt>, dest) -> bitcast(<1 x elt>, dest) Only solve dest type is vector to avoid inverse transform in visitBitCast. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D125951	2022-05-30 10:16:32 +08:00
Florian Hahn	0776c48f9b	Recommit "[LICM] Only create load in ph when promoting load or store doesn't exec." This reverts the revert commit `ad95255b92`. The updated version also creates a load when the store may not execute. In those cases, we still need to introduce a load in a function where there may not have been one before, so this doesn't completely resolve issue #51248. Original message: When only a store is sunk, there is no need to create a load in the pre-header, as the result of the load will never get used. The dead load can can introduce UB, if the function is marked as writeonly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123473	2022-05-29 21:57:14 +01:00
chenglin.bi	e091721fdc	[InstCombine] Add baseline tests for shift+and+icmp transforms; NFC	2022-05-30 01:01:37 +08:00
chenglin.bi	9080e21906	[InstCombine] Add baseline tests for shift+and transforms; NFC	2022-05-30 00:30:56 +08:00
Sanjay Patel	b5b6aa4d53	[InstCombine] fold multiply by signbit-splat to cmp+select (ashr i32 X, 31) * C --> (X < 0) ? -C : 0 https://alive2.llvm.org/ce/z/G8u9SS With a constant operand, this is an improvement in IR and codegen (where it can be converted to a mask op). Without a constant operand, we would have to negate the operand, so that is probably better left to the backend. This is similar but not the same optimization that is requested in #55618.	2022-05-27 11:54:19 -04:00
Florian Hahn	786c687810	[AArch64] Add support for FMA intrinsics to shouldSinkOperands. If the fma operates on a legal vector type, the indexed variants can be used, if the second operand is a splat of a valid index. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D126234	2022-05-27 10:37:03 +01:00
Liqin.Weng	a84026821b	[RISCV] Add test for experimental.vector.reverse ``` void vector_reverse_i64(int A, int B, int n) { #pragma clang loop vectorize_width(4, scalable) for (int i = n-1; i >= 0; i--) A[i] = B[i] + 1; } ``` When option: scalable-vectorization is on (or set #pragma clang loop vectorize_width(elements, scalable)), Reverse Iterators can't loop vectorization as <vscale x elements x elementType> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125866	2022-05-27 06:30:07 +00:00

1 2 3 4 5 ...

22068 Commits