clang-p2996

Author	SHA1	Message	Date
Alexandros Lamprineas	6c2ad8ac7b	[TLI][NFC] Autogenerate vectorized call tests for SLEEF/ArmPL. (#76146 ) This patch prepares the ground for #76060. * Unifies ArmPL and SLEEF tests for better coverage * Replaces deprecated float* and double* types with ptr * Adds noalias attribute to pointer arguments * Adds some cmd-line options to the RUN lines to simplify output * Removes datalayout since target triple is provided * Removes checks for return statements * Refactors the regex filter for autogenerated checks * Removes redundant test file suffix (already under the AArch64 dir)	2023-12-22 16:29:18 +00:00
Paschalis Mpeis	2349731992	[TLI] Add SLEEFGNUABI mappings for fmod/fmodf fixed-width. (#75803 ) Cleanup test sleef-calls-aarch64.ll: - make the util update script's regex more clear - eliminate scalar epilogues in tests	2023-12-20 09:08:17 +00:00
Nikita Popov	a5f3415533	[InstCombine] Replace non-demanded undef vector with poison If an operand (esp to shufflevector or insertelement) is not demanded, canonicalize it from undef to poison.	2023-12-18 16:12:37 +01:00
Nikita Popov	e93d324adb	[InstCombine] Preserve poison in evaluateInDifferentElementOrder() Don't unnecessarily replace poison with undef.	2023-12-18 15:36:22 +01:00
David Sherwood	49b0e6dcc2	[LoopVectorize] Enable hoisting of runtime checks by default (#71538 ) With commit https://reviews.llvm.org/D152366 I introduced functionality that permitted the hoisting of runtime memory checks from a vectorised inner loop to the preheader of the next outer-most loop. This is useful for benchmarks like SPEC2017's x264 where the inner loop is vectorised and only has a small trip count. In such cases the runtime memory checks become expensive and since the checks never fail in the case of x264 it makes sense to do this. However, this behaviour was controlled by the flag -hoist-runtime-checks which was off by default. This patch enables this flag by default for all targets, since I believe this is a generally beneficial thing to do. I have tested this with SPEC2017 and I see 2.3% and 2.6% improvements with x264 on neoverse-v1 and neoverse-n1, respectively. Similarly, I saw slight improvements in the overall geomean on both machines. The only other notable changes were a 1% drop in the roms benchmark, which was compensated for by a 1% improvement in fotonik3d.	2023-12-18 09:41:54 +00:00
Shih-Po Hung	3d422a9859	[VPlan] Implement mayHaveSideEffects/mayWriteToMemory for VPInterleaveRecipe (#71360) This helps VPlanTransforms::removeDeadRecipes to work on VPInterleaveRecipe.	2023-12-15 00:23:14 +08:00
Shih-Po Hung	b97c5a9554	[VPlan] Add a test for testing unused interleave recipes (#75026) - Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optimized away.	2023-12-14 21:16:11 +08:00
Simon Pilgrim	b7fc78255e	Revert rG2047ab00eaf0a17e71ce5e8a5b27a8c90f034c3d "[VPlan] Add a test for testing unused interleave recipes (#75026)" vplan-unused-interleave-group.ll is causing buildbot failures	2023-12-14 10:25:41 +00:00
Shih-Po Hung	2047ab00ea	[VPlan] Add a test for testing unused interleave recipes (#75026) - Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optimized away.	2023-12-14 17:36:58 +08:00
Florian Hahn	173032902c	Revert "[VPlan] Mark Select VPInstructions as not having sideeffects." This reverts commit `19918ac34d`. Fixes #75298. There is still a case where we miss the correct users outside the main vector loop for reductions, and that is tail-folded loops with reductions where the final value is stored after the loop. This should be handled explicitly in #70253	2023-12-13 21:05:24 +00:00
Florian Hahn	8d893f28f2	[LV] Add test case for #75298.	2023-12-13 20:59:28 +00:00
David Sherwood	ceb02379a9	[LoopVectorize] Improve algorithm for hoisting runtime checks (#73515) When attempting to hoist runtime checks out of a loop we currently avoid creating pointer diff checks and prefer to do expanded range checks instead. This gives us the opportunity to hoist runtime checks out of a loop, since these checks are loop invariant. However, in some cases the pointer diff checks would also be loop invariant and so will naturally get hoisted. Therefore, since diff checks are cheaper, we should prefer to use those instead.	2023-12-12 09:10:39 +00:00
Nilanjana Basu	41a3828838	[LV] Added pre-commit tests for changing loop interleaving count computation (#74689) Added more pre-commit tests for evaluating changes to loop interleaving count computation in (https://github.com/llvm/llvm-project/pull/73766). The new set of tests addresses the change in IC computation to minimize the remainder TC of the vectorized loop while maximizing the IC when the remainder TC is the same.	2023-12-12 11:09:25 +05:30
Florian Hahn	19918ac34d	[VPlan] Mark Select VPInstructions as not having sideeffects. Select VPInstructions don't have sideeffects, mark them accordingly.	2023-12-11 12:26:32 +00:00
Florian Hahn	a5891fa4d2	[VPlan] Initial modeling of VF * UF as VPValue. (#74761) This patch starts initial modeling of VF * UF in VPlan. Initially, introduce a dedicated VFxUF VPValue, which is then populated during VPlan::prepareToExecute. Initially, the VF * UF applies only to the main vector loop region. Once we extend the scope of VPlan in the future, we may want to associate different VFxUFs with different vector loop regions (e.g. the epilogue vector loop). This allows explicitly parameterizing recipes that rely on the VF * UF, like the canonical induction increment. At the moment, this mainly helps to avoid generating some duplicated calls to vscale with scalable vectors. It should also allow using EVL as induction increments explicitly in D99750. Referring to VF * UF is also needed in other places that we plan to migrate to VPlan, like the minimum trip count check during skeleton creation. The first version creates the value for VF * UF directly in prepareToExecute to limit the scope of the patch. A follow-on patch will model VF * UF computation explicitly in VPlan using recipes. Moved from Phabricator (https://reviews.llvm.org/D157322)	2023-12-08 18:30:30 +00:00
Florian Hahn	5ea6a3fc6d	[VPlan] Compute scalable VF in preheader for induction increment. (#74762 ) UF * VF is loop invariant and can be computed directly in the preheader. This prepares the code for #74761 and reduces the test changes.	2023-12-08 12:18:31 +00:00
Florian Hahn	633fe60149	[VPlan] Print flags for VPWidenCastRecipe. Update VPWidenCastRecipe to also print flags. Simplify nneg printing test and replace hard-coded value number references with patterns.	2023-12-08 10:48:54 +00:00
Graham Hunter	d0d5ef8133	[LV] Add support for linear arguments for vector function variants (#73941 ) If we have vectorized variants of a function which take linear parameters, we should be able to vectorize assuming the strides match.	2023-12-08 10:24:05 +00:00
Ramkumar Ramachandra	b0f560b8ea	LoopVectorize/test: fix opt invocations with -march (NFC) (#74462 ) opt accepts the -march command-line argument, but this argument only makes sense in conjunction with -mtriple. Fix a couple of tests under LoopVectorize that invoke opt with -march but without -mtriple, to avoid confusing users.	2023-12-08 09:56:55 +00:00
Nikita Popov	f2f077898f	[LoopVectorize] Regenerate test checks (NFC) This test contains an annoying mix of generated and hand-written check lines. Generate the whole test.	2023-12-07 14:37:10 +01:00
Nikita Popov	d77067d08a	[ValueTracking] Add dominating condition support in computeKnownBits() (#73662) This adds support for using dominating conditions in computeKnownBits() when called from InstCombine. The implementation uses a DomConditionCache, which stores which branches may provide information that is relevant for a given value. DomConditionCache is similar to AssumptionCache, but does not try to do any kind of automatic tracking. Relevant branches have to be explicitly registered and invalidated values explicitly removed. The necessary tracking is done inside InstCombine. The reason why this doesn't just do exactly the same thing as AssumptionCache is that a lot more transforms touch branches and branch conditions than assumptions. AssumptionCache is an immutable analysis and mostly gets away with this because only a handful of places have to register additional assumptions (mostly as a result of cloning). This is very much not the case for branches. This change regresses compile-time by about 0.2%. It also improves stage2-O0-g builds by about 0.2%, which indicates that this change results in additional optimizations inside clang itself. Fixes https://github.com/llvm/llvm-project/issues/74242.	2023-12-06 14:17:18 +01:00
Graham Hunter	f0f899932b	[LV] Linear argument tests for vectorization of function calls (#73936 ) Tests to exercise vectorization of function calls where a vector variant takes a linear parameter.	2023-12-06 11:55:03 +00:00
Florian Hahn	bbd1941a38	[VPlan] Add disjoint flag to VPRecipeWithIRFlags. (#74364 ) A new disjoint flag was added for OR instructions in #72583. Update VPRecipeWithIRFlags to also support the new flag. This allows printing and preserving the disjoint flag in vectorized code.	2023-12-05 15:21:59 +00:00
Alexey Bataev	056367bb19	[LV]Support dropping of nneg flag for zext widencast recipes. (#74112) The compiler crashes when the assertion that checks that an instruction cannot produce poison is triggered for a zext nneg instruction. Changed the base class for the widencast recipe to handle dropping the nneg flag and avoid the compiler crash.	2023-12-05 09:17:23 -05:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
Florian Hahn	d00c502ee5	[LV] Add tests for preserving and printing the new disjoint flag. Tests for support for the disjoint flag added in #72583.	2023-12-04 20:12:11 +00:00
Florian Hahn	cd4348349a	[VPlan] Sink cases where no truncate is needed in truncateMinimalBWs. MinBWs contains entries that specify the minimum required bitwidth. In some cases, the old and new bitwidths can be equal (see test case) and in those cases no truncations are needed, so skip those cases. Fixes #74307.	2023-12-04 15:35:54 +00:00
Florian Hahn	99aa5311ee	[VPlan] Add missing output of live-ins to VPlan dot printing. Split off live-in printing to VPlan::printLiveIns and use it to print Live-ins when printing in the DOT format.	2023-12-04 13:41:28 +00:00
Florian Hahn	efec4cc501	[LV] Remove unused CHECK lines, remove IR references from test. Clean up sve-tail-folding-option.ll by removing the unused CHECK-TF-NEOVERSE-V1 prefix (note the use of non-opaque pointers) and remove IR value references.	2023-12-04 13:06:30 +00:00
Florian Hahn	c890582912	[VPlan] Account for live-in entries in MinBW used by replicate recipes. In some cases MinBWs may contain entries for live-ins that are not used by VPWidenRecipe or VPWidenSelectRecipes. In those cases, the live-ins won't get processed, so make sure we include them in the count when used as operands in VPWidenCast and VPWidenSelectRecipe. Fixes https://github.com/llvm/llvm-project/issues/74231	2023-12-03 11:15:29 +00:00
Craig Topper	7ec4f6094e	[InstCombine] Infer disjoint flag on Or instructions. (#72912) The disjoint flag was recently added to IR in #72583. We already set it when we turn an add into an or. This patch sets it on Ors that weren't converted from an Add.	2023-12-02 14:11:12 -08:00
Florian Hahn	70535f5e60	[VPlan] Replace IR based truncateToMinimalBitwidths with VPlan version. This patch replaces the IR based truncateToMinimalBitwidths with a VPlan version. This has 3 benefits: 1) the VPlan-based version is simpler; we don't need to implement special codegen for each supported instruction type like the IR based one. 2) Removes a dependency on the cost-model after VPlan execution and 3) Removes a use of getVPValue that uses underlying values after VPlan execution (See removed FIXME). Depends on D149081. Depends on D149079. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149903	2023-12-02 16:12:38 +00:00
Florian Hahn	cbf7b52a65	[VPlan] Properly update reduction live-out after placing select. After inserting a select for the final value, update the VPlan def-use chains. At the moment, the incorrect live-out doesn't cause a mis-compile, as computing the final reduction value is not yet modeled in VPlan.	2023-12-02 15:22:09 +00:00
Florian Hahn	b2f42f5d29	[LV] Add test variant without sdiv by undef and uses. Add a variant of @PR34687 with a sdiv with non-undef operands and actual uses, to avoid the SDIV and SELECT being folded away trivially.	2023-11-29 20:27:52 +00:00
Jeremy Morse	d2d9dc8eb4	[DebugInfo][RemoveDIs] Make debugify pass convert to/from RemoveDIs mode (#73251 ) Debugify is extremely useful as a testing and debugging tool, and a good number of LLVM-IR transform tests use it. We need it to support "new" non-instruction debug-info to get test coverage, but it's not important enough to completely convert right now (and it'd be a large undertaking). Thus: convert to/from dbg.value/DPValue mode on entry and exit of the pass, which gives us the functionality without any further work. The cost is compile-time, but again this is only happening during tests. Tested by: the large set of debugify tests enabled here. Note the InstCombine test (cast-mul-select.ll) that hasn't been fully enabled: this is because there's a debug-info sinking piece of code there that hasn't been instrumented.	2023-11-29 13:19:50 +00:00
Paschalis Mpeis	1bfb84b477	[NFC][TLI] Improve tests for ArmPL and SLEEF Intrinsics. (#73352) Auto-generate test `armpl-intrinsics.ll` and simplify tests: - Eliminate scalar tail with no tail-folding flag. - Use active lane mask for shorter check lines (no long `shufflevectors`). - Eliminate scalar loops by providing `noalias` to relevant arguments and run `simplifycfg` to drop them. - The update script now uses `@llvm.compiler.used` instead of a longer regex.	2023-11-29 11:19:10 +00:00
Graham Hunter	104b7c624e	[LV] Add support for uniform parameters on vectorized function variants (#72891 ) Parameters marked as uniform take a scalar value, assuming the value is invariant in the scalar loop.	2023-11-28 15:01:32 +00:00
Nikita Popov	f0faff8b9b	[LoopVectorize] Regenerate test checks (NFC)	2023-11-28 15:50:27 +01:00
Craig Topper	03d4a9d94d	[InstCombine] Set disjoint flag when turning Add into Or. (#72702) The disjoint flag was recently added to IR in #72583.	2023-11-27 12:54:11 -08:00
pasmpe01	de6c9c84e2	[TLI][AArch64] Add TLI Mappings of @llvm.exp10 for ArmPL and SLEEF. Update regex to _explicitly_ show which exp versions are added. The previous regex used `exp[^e]` to avoid matching calls like: `@llvm.experimental.stepvector`. Note: ArmPL Mappings for scalable types are not yet utilized (eg, `llvm.exp10.nxv2f64`, `llvm.exp10.nxv4f32`), as `replace-with-veclib` pass needs improvements.	2023-11-24 12:24:33 +00:00
Graham Hunter	b1fba568f6	[SVE] Don't require lookup when demangling vector function mappings (#72260) We can determine the VF from a combination of the mangled name (which indicates the arguments that take vectors) and the element sizes of the arguments for the scalar function the mapping has been established for. The assert when demangling fails has been removed in favour of just not adding the mapping, which prevents the crash seen in https://github.com/llvm/llvm-project/issues/71892. This patch also stops using _LLVM_ as an ISA for scalable vector tests, since there aren't defined rules for the way vector arguments should be handled (e.g. packed vs. unpacked representation).	2023-11-23 17:15:48 +00:00
Florian Hahn	fd9a777e01	[LV] Remove TODO addressed in `32d1197a8f`.	2023-11-23 12:42:23 +00:00
Florian Hahn	19e6d54188	[LV] Re-use existing compare if possible for diff checks. SCEV simplifying the subtraction may result in redundant compares that are all OR'd together. Keep track of the generated operands in SeenCompares, with the key being the pair of operands for the compare. If we already generated the same compare previously, skip it.	2023-11-23 11:35:21 +00:00
Florian Hahn	32d1197a8f	[LV] Use SCEV for subtraction of src/sink for diff runtime checks. Instead of expanding the src/sink SCEV expressions and emitting an IR sub to compute the difference, the subtraction can be performed directly by ScalarEvolution. This allows the subtraction to be simplified by SCEV, which in turn can reduce the number of redundant runtime check instructions generated. It also allows generating checks that are invariant w.r.t. an outer loop, if the inner loop AddRecs have the same outer loop AddRec as start.	2023-11-22 12:48:04 +00:00
Florian Hahn	9b20af1651	[LV] Add test with a number of redundant runtime check instructions. Add a test case where many runtime check instructions can be simplified.	2023-11-22 12:12:19 +00:00
Florian Hahn	b3a9e8f7c0	[LV] Reduce memory-check-threshold for test to preserve original test. Future patches will remove some redundant instructions for runtime checks, which brings this test case slightly below the default limit of 128. Force a lower limit to preserve the original spirit of the test (checking that no interleaving happens if the number of checks is above the threshold).	2023-11-22 11:07:24 +00:00
Florian Hahn	6088e9cdba	[LV] Add test case for diff checks with nested AddRecs. Add a test case where the AddRec for the pointers in the inner loop have the AddRec of the outer loop as start value. It is sufficient to subtract the start values (%dst, %src) of the outer AddRecs. This simplification will be done in a follow-up commit.	2023-11-21 14:38:13 +00:00
Florian Hahn	ead35564c0	[LoopUtils] Freeze compare results for diff checks instead of pointers. The freezes are introduced to avoid branch on undef/poison, if any of the pointers may be poison. The same can be achieved by just freezing the compare, which reduces the number of freezes needed. See https://alive2.llvm.org/ce/z/NHa_ud Note that the individual compares need to be frozen and it is not sufficient to only freeze the resulting OR: Result OR frozen only (UNSOUND): https://alive2.llvm.org/ce/z/YzFHQY Individual conds frozen (SOUND): https://alive2.llvm.org/ce/z/5L6Z3f	2023-11-21 10:54:36 +00:00
Matthias Braun	2404477219	LoopVectorize: Add better heuristic for vectorized epilogue skip test (#72589 ) This is a follow-up to PR #72450 correcting the branch_weights used for the test whether the vectorized epilogue loop should be skipped.	2023-11-20 11:02:27 -08:00
Kolya Panchenko	f9c47e89c1	[LV] Stability fix for outerloop vectorization (#68118 ) HCFG builder doesn't correctly handle cases when non-outermost loop is requested to be vectorized [Original] Differential Revision: https://reviews.llvm.org/D150700	2023-11-20 10:01:25 -05:00

2299 Commits