clang-p2996

Author	SHA1	Message	Date
Graham Hunter	84ebe5b7e8	[LV] Precommit tests for uniform arguments for vector function variants See https://github.com/llvm/llvm-project/pull/68879	2023-11-20 13:30:25 +00:00
Florian Hahn	7fd021a092	[LV] Don't crash on vector masks during scalar VPReductionRecipe::exec. VPReductionRecipe may be executed for scalar VFs. Make sure to access part 0 of the condition, as it could be an active-lane-mask, which is a vector <1 x i1> Fixes https://github.com/llvm/llvm-project/issues/72720.	2023-11-18 21:52:22 +00:00
Nilanjana Basu	e2210cefb1	[LV] Pre-committing tests for changing loop interleaving count computation (#70272 ) Added tests for evaluating changes to loop interleaving count computation and for removing loop interleaving threshold in subsequent patches.	2023-11-17 17:38:04 -08:00
Florian Hahn	e5e71affb7	[LV] Reverse mask up front, not when creating vector pointer. (#72163 ) Reverse mask early on when populating BlockInMask. This will enable separating mask management and address computation from the memory recipes in the future and is also needed to enable explicit unrolling in VPlan.	2023-11-17 13:59:35 +00:00
Nikita Popov	de176d8c54	[SCEV][LV] Invalidate LCSSA exit phis more thoroughly (#69909 ) This is an alternative to #69886. The basic problem is that SCEV can look through trivial LCSSA phis. When the phi node later becomes non-trivial, we do invalidate it, but this doesn't catch uses that are not covered by the IR use-def walk, such as those in BECounts. Fix this by adding a special invalidation method for LCSSA phis, which will also invalidate all the SCEVUnknowns/SCEVAddRecExprs used by the LCSSA phi node and defined in the loop. We should probably also use this invalidation method in other places that add predecessors to exit blocks, such as loop unrolling and loop peeling. Fixes #69097. Fixes #66616. Fixes #63970.	2023-11-17 09:34:24 +01:00
Matthias Braun	a9cc6fc280	LoopVectorize: Set branch_weight for conditional branches (#72450 ) Consistently add `branch_weights` metadata in any conditional branch created by `LoopVectorize.cpp`: - Will only add metadata if the original loop-latch branch had metadata assigned. - Most checks should rarely trigger so I am using a 127:1 ratio. - For the middle block we assume an equal distribution of modulo results.	2023-11-16 11:33:46 -08:00
Florian Hahn	1b82cc1186	[LV] Regenerate check lines for scalable-trunc-min-bitwidth.ll. Re-generate check lines to reduce diff in follow-up change.	2023-11-16 12:34:40 +00:00
Florian Hahn	95eaaa7d71	[LV] Replace undef with constant and pointer argument in tests. This makes the tests more defined, prevents uses of the add being folded and remove UB when loading from undef.	2023-11-16 12:23:17 +00:00
Philip Reames	c05ab7b850	Regenerate a couple of auto-gen tests to reduce diffs in upcoming change [nfc]	2023-11-15 12:33:15 -08:00
Florian Hahn	097ba5366c	[VPlan] Use VPTypeInfo in simplifyRecipes. Replace getTypeForVPValue with the recently added, more general VPTypeAnalysis.	2023-11-15 15:28:51 +00:00
Yingwei Zheng	dc6d077396	[CVP] Infer nneg on existing zext (#72052 ) This patch infers `nneg` flags for existing zext instructions in CVP. After https://github.com/llvm/llvm-project/pull/71534 and this patch, we can drop `zext -> zext nneg` transform in `RISCVCodeGenPrepare`: `40671bbdef/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp (L74-L83)` This is an alternative to #72049.	2023-11-13 22:41:37 +08:00
Graham Hunter	b070629c10	[LV] Increase max VF if vectorized function variants exist (#66639 ) If there are function calls in the candidate loop and we have vectorized variants available, try some wider VFs in case the conservative initial maximum based on the widest types in the loop won't actually allow us to make use of those function variants.	2023-11-13 10:27:10 +00:00
Florian Hahn	34c2dcd5ac	[VPlan] Move initial skeleton construction to createInitialVPlan. (NFC) This patch moves creating the middle VPBBs and an initial empty vector loop region for the top-level loop to createInitialVPlan. This consolidates code to create the initial VPlan skeleton and enables adding other bits outside the main region during initial VPlan construction. In particular, D150398 will add the exit check & branch to the middle block. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158333	2023-11-12 13:00:44 +00:00
Florian Hahn	ed6f4994d8	[VPlan] Handle conditional ordered reductions with scalar VFs. VPReductionRecipe::execute was not handling predicates for ordered reduction with scalar VFs, which was causing a crash. This patch adds dedicated handling for scalar VFs when dealing with the condition. The other operands are already handled in a similar fashion below. Fixes #70988.	2023-11-11 12:55:40 +00:00
Nikita Popov	5918f62301	[InstCombine] Infer zext nneg flag (#71534 ) Use KnownBits to infer the nneg flag on zext instructions. Currently we only set nneg when converting sext -> zext, but don't set it when we have a zext in the first place. If we want to use it in optimizations, we should make sure the flag inference is consistent.	2023-11-08 09:34:40 +01:00
Philip Reames	23099ac239	Add known and demanded bits support for zext nneg (#70858 ) zext nneg was recently added to the IR in #67982. This patch teaches demanded bits and known bits about the semantics of the instruction, and adds a couple of test cases to illustrate basic functionality.	2023-11-06 18:47:56 -08:00
Ramkumar Ramachandra	2302e4c327	Reland "VectorUtils: mark xrint as trivially vectorizable" (#71416 ) With the recent change `98c90a13` (ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering), it is now possible for SLPVectorizer, LoopVectorize, and Scalarizer to operate on llvm.lrint and llvm.llrint, with vector codegen for the RISC-V target. Make a trivial change to VectorUtils, and update the corresponding tests. A couple of important fixes have been landed since the original patch was landed and reverted, and it is now safe to re-land the patch: `5e1d81a` (LegalizeIntegerTypes: implement PromoteIntRes for xrint) and `fd887a3` (LegalizeVectorTypes: fix bug in widening of vec result in xrint). See also #71399, which proves that lrint and llrint will indeed produce vector codegen on RISC-V. Fixes #55208.	2023-11-06 18:49:49 +00:00
Florian Hahn	fd82b5b287	[LV] Support recipes without underlying instr in collectPoisonGenRec. Support recipes without underlying instruction in collectPoisonGeneratingRecipes by directly trying to dyn_cast_or_null the underlying value. Fixes https://github.com/llvm/llvm-project/issues/70590.	2023-11-03 10:21:14 +00:00
David Sherwood	07f0e75b53	[LoopVectorize] Fix bug with code to hoist runtime checks (#70937 ) There was a silly mistake in the expandBounds function that was using the wrong type when calling expandCodeFor and always assuming the stride is 64 bits. I've added the following test to defend this fix: Transforms/LoopVectorize/ARM/mve-hoist-runtime-checks.ll	2023-11-02 10:02:50 +00:00
Ramkumar Ramachandra	ac7c816dc2	Revert "VectorUtils: mark lrint, llrint as trivially vectorizable (#69945 )" This reverts commit `5bfd89bda7`. It was causing build failures on ffmpeg on i686.	2023-11-01 09:57:22 +00:00
Ramkumar Ramachandra	5bfd89bda7	VectorUtils: mark lrint, llrint as trivially vectorizable (#69945 ) With the recent change `98c90a13` (ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering), it is now possible for SLPVectorizer, LoopVectorize, and Scalarizer to operate on llvm.lrint and llvm.llrint, with vector codegen for the RISC-V target. Make a trivial change to VectorUtils, and update the corresponding tests.	2023-10-31 21:29:15 +00:00
Philip Reames	f8742b8d6a	[SCEV] Teach SCEVExpander to use zext nneg when possible (#70815 ) zext nneg was recently added to the IR in #67982. Teaching SCEVExpander to emit nneg when possible is valuable since SCEV may have proved non-trivial facts about loop bounds which would otherwise be lost when materializing the value.	2023-10-31 09:33:07 -07:00
Philip Reames	6485978120	Refresh a couple of auto-gen tests [nfc] Reducing spurious diff in an upcoming review.	2023-10-31 07:46:01 -07:00
Ramkumar Ramachandra	562ce8bbd2	LoopVectorize: add negative test for lrint, llrint (#70211 ) With the recent change `98c90a1` (ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering), it is now possible to vectorize llvm.lrint and llvm.llrint with a trivial change to VectorUtils. In preparation for this change, and the corresponding test update, add a negative test for lrint and llrint.	2023-10-31 13:13:26 +00:00
Ramkumar Ramachandra	1d090b8241	LoopVectorize/test: add missing CHECK lines, cleanup intrinsic.ll (#70202 ) Clean up intrinsic.ll by removing extraneous attributes and target datalayout, fix a bug in the copysign_f64 test, and add missing CHECK lines.	2023-10-31 12:50:46 +00:00
Philip Reames	3f2ed812f0	[InstCombine] Infer nneg on zext when forming from non-negative sext (#70706 ) Builds on #67982 which recently introduced the nneg flag on a zext instruction. InstCombine is one of our largest canonicalizers of zext from non-negative sext instructions, so set the flag there.	2023-10-30 12:09:43 -07:00
Igor Kirillov	70904226e1	[LoopVectorize] Enhance Vectorization decisions for predicate tail-folded loops with low trip counts (#69588 ) * Avoid using `CM_ScalarEpilogueNotAllowedLowTripLoop` for loops known to be predicate tail-folded, delegating to `areRuntimeChecksProfitable` to decide on the profitability of vectorizing loops with runtime checks. * Update the `areRuntimeChecksProfitable` function to consider the `ScalarEpilogueLowering` setting when assessing vectorization of a loop. With this patch, we can make more informed decisions for loops with low trip counts, especially when leveraging Profile-Guided Optimization (PGO) data.	2023-10-30 13:43:26 +00:00
Allen	46cb7e4eea	[LoopDist] Update the pragma info of loop distribute, NFC (#69825 ) Based on D19403, the exact pragma of distribute is `#pragma clang loop distribute`	2023-10-28 17:47:46 +08:00
Florian Hahn	cdc5e00e73	[LV] Add test case to scalarize ptrtoint instructions. Extra test for https://github.com/llvm/llvm-project/pull/69013	2023-10-27 14:32:54 +01:00
Florian Hahn	cff6652129	[VPlan] Handle VPValues without underlying values in getTypeForVPValue. Fixes a crash after `0c8e5be6fa`. Full type inference will be added in https://github.com/llvm/llvm-project/pull/69013	2023-10-27 13:34:54 +01:00
Alex Richardson	e39f6c1844	[opt] Infer DataLayout from triple if not specified There are many tests that specify a target triple/CPU flags but no DataLayout which can lead to IR being generated that has unusual behaviour. This commit attempts to use the default DataLayout based on the relevant flags if there is no explicit override on the command line or in the IR file. One thing that is not currently possible is to differentiate an explicit `target datalayout = ""` in the IR file from a missing datalayout, since the current APIs don't allow detecting this case. If it is considered useful to support this case (instead of passing "-data-layout=" on the command line), I can change IR parsers to track whether they have seen such a directive and change the callback type. Differential Revision: https://reviews.llvm.org/D141060	2023-10-26 12:07:37 -07:00
Matthias Braun	e3cf80c5c1	BlockFrequencyInfoImpl: Avoid big numbers, increase precision for small spreads BlockFrequencyInfo calculates block frequencies as Scaled64 numbers but as a last step converts them to unsigned 64bit integers (`BlockFrequency`). This improves the factors picked for this conversion so that: * Avoid big numbers close to UINT64_MAX to avoid users overflowing/saturating when adding multiple frequencies together or when multiplying with integers. This leaves the topmost 10 bits unused to allow for some room. * Spread the difference between hottest/coldest block as much as possible to increase precision. * If the hot/cold spread cannot be represented lose precision at the lower end, but keep the frequencies at the upper end for hot blocks differentiable.	2023-10-24 20:27:39 -07:00
Florian Hahn	159614a52f	[LV] Use variable instead of value number in vplan-dot-printing.ll test.	2023-10-23 20:25:22 +01:00
Florian Hahn	0c8e5be6fa	[VPlan] Simplify redundant trunc (zext A) pairs to A. Add simplification for redundant trunc(zext A) pairs. Generally apply a transform from D149903. Depends on D159200. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D159202	2023-10-22 11:41:38 +01:00
Lou Knauer	852bac4439	[VPlan] Support scalable vectors in outer-loop vectorization This patch enables scalable vectors in the VPlan-native path. If a vectorization factor is specified via loop vectorization hints, that factor is used. If no vectorization factor is specified, but the target prefers scalable vectorization, a scalable vectorization factor is selected. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D157484	2023-10-20 23:17:35 +01:00
Florian Hahn	2ec7bba77b	Recommit "[VPlan] Insert Trunc/Exts for reductions directly in VPlan." This reverts commit `e4ea099748`. The recommit fixes a reported crash by adding a missing check to make sure the cast recipes are only introduced when vectorizing. Test coverage added in `3cac608fbd`. Original commit message: Update the code to create Trunc/Ext recipes directly in adjustRecipesForReductions instead of fixing it up later in fixReductions. This explicitly models the required conversions and also makes sure they are generated at the right place (instead of after the exit condition), hence the changes in a few tests.	2023-10-20 14:30:04 +01:00
Graham Hunter	1abc28fea0	[NFC][LV] Add test for vectorizing fmuladd with another call (#68601 ) As requested in (#66521) I confirmed a crash with "return" instead of "continue" in setVectorizedCallDecision's fmuladd reduction recognition.	2023-10-20 10:23:31 +01:00
Florian Hahn	3cac608fbd	[LV] Add interleave only test case with reduction requiring casts. This adds test coverage for a crash exposed by d311126349b8fe1684d62154a9fa5a7bbb0b713.	2023-10-19 20:52:21 +01:00
Igor Kirillov	b84977bcc1	Rename test to avoid overlapping with debug output	2023-10-19 12:21:31 +00:00
Fangrui Song	e4ea099748	Revert "[VPlan] Insert Trunc/Exts for reductions directly in VPlan." This reverts commit `fd31112634`. There are two different crash reports on `fd31112634`	2023-10-18 23:25:31 -07:00
Florian Hahn	fd31112634	[VPlan] Insert Trunc/Exts for reductions directly in VPlan. Update the code to create Trunc/Ext recipes directly in adjustRecipesForReductions instead of fixing it up later in fixReductions. This explicitly models the required conversions and also makes sure they are generated at the right place (instead of after the exit condition), hence the changes in a few tests.	2023-10-17 19:17:40 +01:00
Yingwei Zheng	4718b4011f	[LV] Invalidate disposition of SCEV values after loop vectorization (#69230 ) This PR fixes the assertion failure of `SE.verify()` after loop vectorization.	2023-10-17 03:49:39 +08:00
Florian Hahn	f7a8a78cb7	[VPlan] Also print operands of canonical IV (NFC). Also print the operands of VPCanonicalIVPHIRecipe. That was missed earlier.	2023-10-16 20:28:23 +01:00
Florian Hahn	38f8b7cbe4	[LV] Replace value numbers with patterns in tests (NFC). Replace some hardcoded value numbers in CHECK-LINES to use patterns, to make the tests more robust wrt renumbering.	2023-10-16 19:53:44 +01:00
JolantaJensen	afdb18df4d	[NFC][AArch64][LV] Reorganise LV tests using symbols from SLEEF (#68207 ) The tests introduced by https://reviews.llvm.org/D134719 and later modified in https://reviews.llvm.org/D146839 are not testing LV in isolation. This patch: 1. Ensures that all tests test LV in isolation. 2. Adds LV tests using llvm intrinsics that have libm mappings. llrint, llround and lrint are not included, as the IR verifier pass currently does not allow vector types to be used with them.	2023-10-13 12:10:21 +01:00
Ramkumar Ramachandra	8593c0bc02	LoopVectorize/test: clean up reduction.ll; generate using UTC (NFC) (#68890 ) The test reduction.ll was introduced before utils/update_test_checks.py, and hence contains hand-written CHECK lines. Revisit the test today, and modernize it by: - Removing extraneous attributes on functions and their arguments, as LoopVectorize doesn't even look at these attributes. - Removing the target datalayout, as it is not essential for LoopVectorize. Finally, regenerate the CHECK lines using update_test_checks.py, eliminating hand-written error-prone CHECK lines.	2023-10-12 15:45:15 +01:00
Nikita Popov	30faaaf626	[LoopVectorize] Regenerate test checks (NFC)	2023-10-12 14:35:23 +02:00
Rin	df8e0d057d	[AArch64][LoopVectorize] Use upper bound trip count instead of the constant TC when choosing max VF (#67697 ) This patch is based off of https://github.com/llvm/llvm-project/pull/67543. We are currently using the exact trip count to make decisions regarding the maximum VF. We can instead use the upper bound TC, which will be the same as the constant trip count when that is known.	2023-10-09 16:26:19 +01:00
Dmitriy Smirnov	e13bed4c5f	[PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP This patch tries to canonicalise add + gep to gep + gep. Co-authored-by: Paul Walker <paul.walker@arm.com> Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D155688	2023-10-06 12:29:06 +01:00
Alexey Bataev	e22818d5c9	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-05 06:17:07 -07:00

2299 Commits