clang-p2996

Author	SHA1	Message	Date
Florian Hahn	0a246a0c72	[LV] Use VPValues when creating GEP with all invariant indices. Update VPWidenGEPRecipe::execute to use the VPValue operands of the recipe when creating the GEP instruction. Fixes #63340.	2023-06-16 16:14:01 +01:00
Florian Hahn	ea6ca9cb2b	[LV] Fix crash when stride isn't a constant. In some cases, the stride may not be a constant. Just skip those cases for now. This should only happen for cases where LV interleaves only; if it is vectorized, the stride needs to be versioned to a constant.	2023-06-14 16:53:34 +01:00
Simon Pilgrim	4cbedaeff5	[LoopVectorize][X86] Regenerate slm-no-vectorize.ll	2023-06-13 14:15:37 +01:00
Florian Hahn	d209084720	[VPlan] Replace versioned stride with constant during VPlan opts. After constructing the initial VPlan, replace VPValues for versioned strides with their constant counterparts. Differential Revision: https://reviews.llvm.org/D147783	2023-06-13 08:26:55 +01:00
Nikita Popov	2b7c347c7f	[LoopVectorize] Convert test to opaque pointers (NFC) I'm keeping the bitcast in the input here, because without it we end up introducing a stride 1 assumption and end up testing a different case.	2023-06-12 14:49:45 +02:00
Nikita Popov	9929f9533d	[LoopVectorize] Convert test to opaque pointers (NFC)	2023-06-12 14:31:54 +02:00
Nikita Popov	aa92ae5924	[LoopVectorize] Regenerate test checks (NFC)	2023-06-12 14:31:54 +02:00
Nikita Popov	9cf67f6ea0	[LoopVectorize] Convert most tests to opaque pointers (NFC) The unsized-pointee-crash.ll and zero-sized-pointee-crash.ll tests have been removed, because these issues are not relevant for opaque pointers.	2023-06-12 13:10:22 +02:00
Graham Hunter	95bfb1902d	[LV][AArch64] Allow (limited) interleaving for scalable vectors This patch uses the (de)interleaving intrinsics introduced in D141924 to handle vectorization of interleaving groups with a factor of 2 for scalable vectors. Reviewed By: fhahn, reames Differential Revision: https://reviews.llvm.org/D145163	2023-06-09 11:42:10 +01:00
Florian Hahn	c317a88767	[LV] Add tests for reasoning about SCEV predicates. Add extra tests with cases where SCEV predicates can be proven to always be false. The test in pointer-induction.ll has been adjusted so that the induction does not always wrap.	2023-06-08 21:13:06 +01:00
Florian Hahn	f5f6daf00f	[LV] Extend test coverage for loops with accesses with clamped indexes. Extend test coverage ahead of upcoming patches.	2023-06-08 12:10:04 +01:00
Florian Hahn	123f807e5b	[LV] Remove UB caused by undef from pr37248.ll (NFC). Also generate full check lines.	2023-06-08 11:58:58 +01:00
zhongyunde	df19d87227	[LV] Add option to tune the cost model, NFC For Neon, the default nonconst stride cost is conservative, and it is a local variable, which is not convenient for tuning the loop vectorizer. So I try to use an option, which is similar to SVEGatherOverhead brought in D115143. Fix https://github.com/llvm/llvm-project/issues/63082. Reviewed By: dmgreen, fhahn Differential Revision: https://reviews.llvm.org/D152253	2023-06-07 22:08:29 +08:00
Florian Hahn	8f781b96e2	Revert "[VPlan] Mark recurrence recipes as not having side-effects." This reverts commit `02369b75fd`. At the moment, live-outs used only for the resume values in the scalar loop are not modeled in VPlan yet. This means first-order recurrence recipes could be removed, when a scalar epilogue is required and the only use of a FOR is outside the loop. Keep treating recurrence recipes as having side-effects for now, to avoid them being removed. Fixes #62954.	2023-06-06 11:35:26 +02:00
Florian Hahn	f47084ecfb	[LV] Use force-vector-width for X86 recurrence test. This makes sure that all tests that can be vectorized in the file are vectorized.	2023-06-06 11:27:35 +02:00
Florian Hahn	4c51a45e80	[LV] Add test for #62954.	2023-06-06 11:20:22 +02:00
Florian Hahn	3b912e269a	[LV] Bail out on loop-variant steps when rewriting SCEV exprs. If the step is not loop-invariant, we cannot create a modified AddRec, as the start needs to be loop-invariant. Mark those cases as CannotAnalyze and bail out, to fix a crash.	2023-06-01 16:14:02 +01:00
Florian Hahn	572cfa3fde	[LV] Use SCEV for uniformity analysis across VF This patch uses SCEV to check if a value is uniform across a given VF. The basic idea is to construct SCEVs where the AddRecs of the loop are adjusted to reflect the version in the vectorized loop (Step multiplied by VF). We construct a SCEV for the value of the vector lane 0 (offset 0) and compare it to the expressions for lanes 1 to the last vector lane (VF - 1). If they are equal, consider the expression uniform. While re-writing expressions, we also need to catch expressions for which we cannot determine uniformity (e.g. SCEVUnknown). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D148841	2023-05-31 16:01:00 +01:00
Florian Hahn	8098f2577e	[LV] Use Legal::isUniform to detect uniform pointers. Update collectLoopUniforms to identify uniform pointers using Legal::isUniform. This is more powerful and brings pointer classification here in sync with setCostBasedWideningDecision which uses isUniformMemOp. The existing mismatch in reasoning can cause crashes due to D134460, which is fixed by this patch. Fixes https://github.com/llvm/llvm-project/issues/60831. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150991	2023-05-30 16:42:55 +01:00
Florian Hahn	fcc135a8d6	[LV] Remove dead CHECK lines after `280656eae9`. Those check lines were left over after adding new run lines in `280656eae9`.	2023-05-29 19:23:52 +01:00
Florian Hahn	280656eae9	[LV] Add check line with VF=4 to uniformity test. Extend test coverage for D148841.	2023-05-28 20:01:04 +01:00
Nikita Popov	d2502eb091	[KnownBits] Add support for nuw/nsw on shifts Implement precise nuw/nsw support in the KnownBits implementation, replacing the rather crude handling in ValueTracking. Differential Revision: https://reviews.llvm.org/D151208	2023-05-25 10:17:10 +02:00
Florian Hahn	299f0ff60e	[VPlan] Print IR flags for VPRecipeWithIRFlags. Now that IR flags are modeled as part of VPRecipeWithIRFlags, include the flags when printing recipes. Depends on D150027. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150029	2023-05-23 20:36:16 +01:00
Dinar Temirbulatov	1ff828c6c8	[AArch64][LV] Disable maximising bandwidth for streaming compatible sve We noticed some runtime performance improvements by disabling maximising bandwidth for streaming compatible sve. Differential Revision: https://reviews.llvm.org/D150336	2023-05-23 12:58:19 +00:00
David Sherwood	c7dbe326df	[AArch64][LoopVectorize] Enable tail-folding of simple loops on neoverse-v1 This patch enables the tail-folding of simple loops by default when targeting the neoverse-v1 CPU. Simple loops exclude those with recurrences or reductions or loops that are reversed. New tests have been added here: Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll In terms of SPEC2017 only one benchmark is really affected when building with "-Ofast -mcpu=neoverse-v1 -flto", which is (+ faster, - slower): 525.x264: +7.0% Differential Revision: https://reviews.llvm.org/D130618	2023-05-18 10:35:57 +00:00
Florian Hahn	01efcec6db	[LV] Add extra uniformity tests with UDIV and UREM. Extra tests for D148841.	2023-05-18 11:35:17 +01:00
Nikita Popov	745cfa3449	[InstCombine] Compute known bits for multi-use add/sub We were failing to set the known bits for add/sub in the multi-use case, resulting in odd behavioral differences depending on the number of uses. Noticed while adding a consistency assertion. The test changes are essentially a revert to the state before `d6498ab`. These changes are not really desirable, but if we don't want them, that needs to be handled as part of the heuristic for demanded constant shrinking, not by artificially suppressing the known bits in one specific case.	2023-05-17 17:50:00 +02:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to `b71edfaa4e` since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
David Sherwood	7beb2ca8fa	[AArch64][NFC] Refactor the tail-folding option This patch does simple refactoring of the tail-folding option in preparation for enabling tail-folding by default for neoverse-v1. It adds a default tail-folding option field to the AArch64Subtarget class that can be set on a per-CPU. Differential Revision: https://reviews.llvm.org/D149659	2023-05-17 08:39:40 +00:00
Florian Hahn	6c35d423c8	[VPlan] Add tests to print exact and flags on calls (NFC). Adds missing test coverage for D150029.	2023-05-16 21:18:31 +01:00
Douglas Yung	da3c06a482	Revert "[LV] Add test case for #51677." This reverts commit `77df976a12`. Test is failing on many build bots including: https://lab.llvm.org/buildbot/#/builders/247/builds/4488 https://lab.llvm.org/buildbot/#/builders/139/builds/40608 https://lab.llvm.org/buildbot/#/builders/216/builds/21169 https://lab.llvm.org/buildbot/#/builders/65/builds/9673 https://lab.llvm.org/buildbot/#/builders/119/builds/13302 https://lab.llvm.org/buildbot/#/builders/121/builds/30459 https://lab.llvm.org/buildbot/#/builders/230/builds/12967 https://lab.llvm.org/buildbot/#/builders/57/builds/26781 https://lab.llvm.org/buildbot/#/builders/214/builds/7458 https://lab.llvm.org/buildbot/#/builders/93/builds/14892 https://lab.llvm.org/buildbot/#/builders/231/builds/11764	2023-05-14 12:22:11 -07:00
Ricky Zhou	77df976a12	[LV] Add test case for #51677.	2023-05-14 16:53:08 +01:00
Florian Hahn	3d4eed0133	[LV] Reuse SCEV expansion results for epilogue vectorization. When generating code for the epilogue vector loop, we need to re-use the expansion results for induction steps generated for the main vector loop, as the pre-header of the epilogue vector loop may not dominate the vector preheader of the epilogue. This fixes a reported crash. Note that this is a workaround which should be removed soon once induction resume value creation is handled in VPlan directly.	2023-05-11 22:00:07 +01:00
Florian Hahn	236a0e82df	[LV] Use VPValue to get expanded value for SCEV step expressions. Update skeleton creation logic to use SCEV expansion results from expanding the pre-header. This avoids another set of SCEV expansions that may happen after the CFG has been modified. Fixes #58811. Depends on D147964. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147965	2023-05-11 16:49:19 +01:00
Hongtao Yu	9272d0f079	[PseudoProbe] Clean up dwarf discriminator and avoid duplicating factor. A pseudo probe is created with dwarf line information shared with its nearest instruction. If the instruction comes with a dwarf discriminator, it will be shared with the probe as well. This can confuse the later FS-AFDO discriminator assignment pass. To fix this, I'm cleaning up the discriminator fields for probes when they are inserted. I also notice another possibility to change the discriminator field of pseudo probes in the pipeline before the FS discriminator assignment pass. That is the loop unroller, which assigns duplication factor to instruction being vectorized. I'm disabling that for pseudo probe intrinsics specifically, also for callsites with probes. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D148569	2023-05-10 11:26:23 -07:00
Florian Hahn	faa8f582b9	[VPlan] Add printing test with fast-math flags. Add missing test coverage for D150029.	2023-05-09 22:43:03 +01:00
Noah Goldstein	7770b0abfd	[KnownBits] Improve `KnownBits::rem(X, Y)` in cases where we can deduce low-bits of output The first `cttz(Y)` bits in `X` are translated 1-1 in the output. Alive2 Links: https://alive2.llvm.org/ce/z/Qc47p7 https://alive2.llvm.org/ce/z/19ut5H Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D149421	2023-05-07 19:11:53 -05:00
Florian Hahn	e3afe0b89d	[VPlan] Add VPWidenCastRecipe, split off from VPWidenRecipe (NFCI). To generate cast instructions, the result type is needed. To allow creating widened casts without underlying instruction, introduce a new VPWidenCastRecipe that also holds the result type. This functionality will be used in a follow-up patch to implement truncateToMinimalBitwidths as VPlan-to-VPlan transform. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149081	2023-05-05 13:20:16 +01:00
Florian Hahn	b85a402dd8	[VPlan] Introduce new entry block to VPlan for early SCEV expansion. This patch adds a new preheader block to the VPlan to place SCEV expansions like the trip count. This preheader block is disconnected at the moment, as the bypass blocks of the skeleton are not yet modeled in VPlan. The preheader block is executed before skeleton creation, so the SCEV expansion results can be used during skeleton creation. At the moment, the trip count expression and induction steps are expanded in the new preheader. The remainder of SCEV expansions will be moved gradually in the future. D147965 will update skeleton creation to use the steps expanded in the pre-header to fix #58811. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147964	2023-05-04 14:00:13 +01:00
Florian Hahn	79692750d2	[LV] Use VPValue for SCEV expansion in fixupIVUsers. The step is already expanded in the VPlan. Use this expansion instead. This is a step towards modeling fixing up IV users in VPlan. It also fixes a crash caused by SCEV-expanding the Step expression in fixupIVUsers, where the IR is in an incomplete state. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147963	2023-05-04 09:25:59 +01:00
Philip Reames	cb3cb417a0	[LV] Refresh some auto-gen tests to reduce diff [nfc]	2023-05-01 13:55:11 -07:00
Philip Reames	30cdb2ac7e	[LAA] Add command line flag to disable unit stride speculation This is purely so that we can expose and work through downstream codegen issues. My intention is to see if we can get this disabled by default, but that requires fixing a bunch of downstream issues first.	2023-05-01 10:49:51 -07:00
Yingwei Zheng	6d667d4b26	[InstCombine] Combine const GEP chains This patch reverts rGae739aefd7473517d3f08b5c8d08a66c7f469198 to address performance regressions reported by our [CI](https://github.com/dtcxzyw/llvm-ci/issues/137) after rG2ec1d0f427c7822540352c0c14d057e7bfe4f77b. For example: ``` define ptr @const_gep_chain(ptr %p, i64 %a) { %p1 = getelementptr inbounds i8, ptr %p, i64 %a %p2 = getelementptr inbounds i8, ptr %p1, i64 1 %p3 = getelementptr inbounds i8, ptr %p2, i64 2 %p4 = getelementptr inbounds i8, ptr %p3, i64 3 ret ptr %p4 } ``` The last three GEPs will not be folded since rG2ec1d0f427c7822540352c0c14d057e7bfe4f77b. I think it is appropriate to remove this code because there is no compile-time regression reported in our benchmarks. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149240	2023-05-02 00:28:39 +08:00
Noah Goldstein	d840391401	[ValueTracking] Add logic for `isKnownNonZero(smin/smax X, Y)` For `smin` if either `X` or `Y` is negative, the result is non-zero. For `smax` if either `X` or `Y` is strictly positive, the result is non-zero. For both if `X != 0` and `Y != 0` the result is non-zero. Alive2 Link: https://alive2.llvm.org/ce/z/7yvbgN https://alive2.llvm.org/ce/z/zizbvq Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149417	2023-04-30 10:06:46 -05:00
Noah Goldstein	883daa7ac4	[ValueTracking] Add logic for `isKnownNonZero(umax X, Y)` `(umax X, Y) != 0` -> `X != 0 || Y != 0` Alive2 Link: https://alive2.llvm.org/ce/z/_Z9AUT Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149415	2023-04-30 10:06:46 -05:00
Mel Chen	de01dba7f2	[LV] Add tests for integer min max with index reduction pattern. (NFC) The test cases cover signed max with index, including strict and non-strict max. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D146718	2023-04-28 05:13:23 -07:00
Florian Hahn	07e5f57df4	[LV] Add tests for #60831. Also contains an extra test mentioned in D144434.	2023-04-28 10:42:01 +01:00
ManuelJBrito	8b56da5e9f	[IR] Change shufflevector undef mask to poison With this patch an undefined mask in a shufflevector will be printed as poison. This change is done to support the new shufflevector semantics for undefined mask elements. Differential Revision: https://reviews.llvm.org/D149210	2023-04-27 14:41:10 +01:00
Florian Hahn	883eb88cae	[LV] Add extra uniformity tests with LSHR and AND. Extra tests for D148841 based on the tests added in `95539186c8`.	2023-04-26 19:51:35 +01:00
Nikita Popov	1745341296	[LoopVectorize] Preserve SCEV As far as I can tell, LoopVectorize preserves SCEV, mainly by dint of forgetting the loop being vectorized. We should mark it as preserved in the pass manager. This is a very small compile-time improvement. Differential Revision: https://reviews.llvm.org/D149147	2023-04-26 09:43:54 +02:00


2091 Commits