clang-p2996

Author	SHA1	Message	Date
Alexey Bataev	755282ec1e	[SLP][NFC]Move getExtractIndex function for future changes, NFC.	2023-01-09 09:53:01 -08:00
Benjamin Kramer	b6942a2880	[NFC] Hide implementation details in anonymous namespaces	2023-01-08 17:37:02 +01:00
Florian Hahn	78914e8c32	[VPlan] Keep entries in worklist in sinkScalarOperands. Not removing the entries ensures that duplicates are avoided, reducing the number of iterations.	2023-01-08 15:52:57 +00:00
Alexey Bataev	996ad44b97	[SLP][NFC]Fix compile build by declaring ArrayRef, NFC. Fix compiler build reported in https://lab.llvm.org/buildbot#builders/243/builds/218	2023-01-06 17:01:48 -08:00
Alexey Bataev	cc17e93178	[SLP][NFC]Remove unused variables, NFC.	2023-01-06 16:55:54 -08:00
Alexey Bataev	7439e1b2de	[SLP]Fix incorrect reordering of clustered scalars. The new mask represents the order, not the mask itself. At first, need to treat as the order, convert to mask and only after that reorder gathered scalars to build correct clustered order. Differential Revision: https://reviews.llvm.org/D141161	2023-01-06 16:04:09 -08:00
Alexey Bataev	9b5f62685a	[SLP]Fix cost of the broadcast buildvector/gather. Need to include the cost of the initial insertelement to the cost of the broadcasts. Also, need to adjust the cost of the gather/buildvector if the element is inserted into poison/undef vector. Differential Revision: https://reviews.llvm.org/D140498	2023-01-06 09:25:05 -08:00
Florian Hahn	68469a80cb	[LV] Disable runtime unrolling for vectorized loops. This patch adds metadata to disable runtime unrolling to the vectorized loop. If runtime unrolling/interleaving is considered profitable, LV will interleave the loop directly. There should be no need to perform runtime unrolling at a later stage. Note that we already add metadata to disable runtime unrolling to the scalar loop after vectorization. The additional unrolling unnecessarily increases code size and compile time. In addition to that we have several bug reports of unnecessary runtime unrolling for vectorized loops, e.g. PR40961 Compile-time improvements: NewPM-O3: -1.04% NewPM-ReleaseThinLTO: -0.59% NewPM-ReleaseLTO-g: -0.97% https://llvm-compile-time-tracker.com/compare.php?from=ce1be13a868d0f8afa367975558c1a6175cce33a&to=78bc2e67f22e9e10e61cdb6cdac4bb857d95eb1b&stat=instructions:u Fixes #40306. Reviewed By: lebedev.ri, nikic Differential Revision: https://reviews.llvm.org/D115261	2023-01-06 10:56:17 +00:00
Valery N Dmitriev	6d677c0b3d	[SLP] Unify GEP cost modeling for load, store and GEP nodes. Make a separate routine for GEPs cost calculation and make the approach uniform across load, store and GEP tree nodes. Additional issue fixed is GEP cost savings were applied twice for ScatterVectorize nodes (aka gather load) making them look unrealistically profitable for vectorization. Differential Revision: https://reviews.llvm.org/D140789	2023-01-05 10:11:36 -08:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
David Green	586fd86b0a	[LoopVectorizer] Fix inloop reductions mask placement The validation of vplans could fail if an inloop reduction was created with a block-in mask that did not dominate the reduction. This makes sure that the insert point is set when creating the mask, to ensure it dominates the reduction. Differential Revision: https://reviews.llvm.org/D141003	2023-01-05 11:37:37 +00:00
Augie Fackler	0676156f81	Revert "[VPlan] Also consider operands of sink candidates in same block." This reverts commit `aa2414729e`. Previously-valid IR from a tensorflow test case (as shown on the Diffusion revision for `aa2414729e`) started hanging in the loop-vectorize pass. Reverting to keep everyone working.	2023-01-04 16:17:13 -05:00
Alexey Bataev	a1b18946f9	[SLP]Fix incorrect shuffle results because of missing shuffle mask analysis. Missed the analysis of the shuffle mask when trying to analyze the operands of the shuffle instruction during peeking through shuffle instructions.	2023-01-04 13:10:40 -08:00
Dinar Temirbulatov	55c600819f	[SLP][AArch64] Incorrectly estimated intrinsic as a function call. We incorrectly assume intrinsic as a function call and it prevents us from the opportunity to vectorize. On Aarch64 Cortex-A53 we think that llvm.fmuladd.f64 is a function call which is wrong. Differential Revision: https://reviews.llvm.org/D140392	2023-01-03 19:45:24 +00:00
Alexey Bataev	26fec4e845	[SLP]Fix crash on casting non-instruction extractelement. Need to check if the extractelement operation is an extraction before trying to move it around the buildblocks to avoid crash on cast.	2023-01-03 09:45:57 -08:00
Florian Hahn	ce1be13a86	[VPlan] Use VP_CLASSOF_IMPL for VPWidenCanonicalIVRecipe(NFC). Replace VPWidenCanonicalIVRecipe::classof implementation with general VP_CLASSOF_IMPL.	2023-01-02 17:52:13 +00:00
Florian Hahn	64f1d845b3	[VPlan] Use VP_CLASSOF_IMPL for VPWidenMemoryInstructionRecipe (NFC). Replace VPWidenMemoryInstructionRecipe ::classof implementation with general VP_CLASSOF_IMPL.	2023-01-02 17:32:31 +00:00
Florian Hahn	2d6d47f807	[VPlan] Use VP_CLASSOF_IMPL for VPPredInstPHI (NFC). Replace VPPredInstPHI::classof implementation with general VP_CLASSOF_IMPL.	2023-01-02 17:22:34 +00:00
Florian Hahn	89718815c6	[VPlan] Adjust mergeReplicateRegions to be in line with mergeBlock (NFC) Adjust mergeReplicateRegions to be in line with mergeBlocksIntoPredecessors added in `36d70a6aea` by collecting only the valid candidates first. Also rename to mergeReplicateRegionsIntoSuccessors and add missing doc-comment. This addresses post-commit suggestions by @Ayal.	2023-01-01 19:48:49 +00:00
Florian Hahn	cd16a3f04c	[VPlan] Move GraphTraits definitions to separate header (NFC). This reduces the size of VPlan.h and avoids future growth of the file when the graph traits are extended in future patches. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D140500	2022-12-31 15:14:57 +00:00
Florian Hahn	aa2414729e	[VPlan] Also consider operands of sink candidates in same block. Even if the sink candidate is already in the target block, its operands can be candidates for sinking. Queue them up as well. Also moves the queuing logic to a helper.	2022-12-30 18:24:35 +00:00
Alexey Bataev	5dccea5a68	[SLP]Do not emit many extractelements, reuse the single one emitted. We do not need to emit many extractelements for each particular use, we can reuse the only one, just need to adjust it to make it dominate on all uses. Differential Revision: https://reviews.llvm.org/D140580	2022-12-30 06:38:06 -08:00
Valery N Dmitriev	ad956ed568	[SLP] Fix debug print for cost in tryToVectorizeList - NFC. Actual VF was confused with local variable named "VF".	2022-12-29 11:30:10 -08:00
Valery N Dmitriev	8eb3698b94	[SLP] A couple of minor improvements for slp graph view - NFC. Show ScatterVectorize nodes in frames of blue color and print vectorize tree indices.	2022-12-29 11:02:36 -08:00
Alexey Bataev	ac01ae71f0	[SLP]Use ShuffleInstructionBuilder for vector shrinking. We can use ShuffleInstructionBuilder now for shrinking shuffle emission. It allows to remove extra shuffle from the emitted code and reuse original vector. Part of D110978 Differential Revision: https://reviews.llvm.org/D140499	2022-12-28 06:09:04 -08:00
Michael Maitland	396b0b2b13	[LV] Remove duplicate name set of vector header basic block. NFC The preheader was named explicitly in `256c6b0ba1` which makes setting the name in prior commit `95b2aa511e` unnecessary. Differential Revision: https://reviews.llvm.org/D140246	2022-12-27 17:19:08 -08:00
Florian Hahn	e91e62db14	[LV] Sink scalar operands and merge regions repeatedly. Merging regions can enable new sinking opportunities (e.g. if users of a scalar value are moved from different VPBBs into the same VPBB). Sinking in turn can also enable new merging opportunities (e.g. if a recipe between two merge-able regions is moved). To enable more sinking opportunities, repeat sinking & merging if regions could be merged. Also fix mergeReplicateRegions to return the correct Changed status. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D139788	2022-12-27 18:08:32 +00:00
Alexey Bataev	a9b052e2ef	[SLP]Fix PR59693: Do not crash trying to set insert point for buildvector of extractvalues. No need to get the last instruction only for vectorized extractvalues, for gathered(buildvector sequence) still need to get the insertion point.	2022-12-27 06:01:38 -08:00
Florian Hahn	36d70a6aea	[VPlan] Remove redundant blocks by merging them into predecessors. Add and run VPlan transform to fold blocks with a single predecessor into the predecessor. This removes redundant blocks and addresses a TODO to replace special handling for the vector latch VPBB. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D139927	2022-12-26 22:47:09 +00:00
Florian Hahn	435e220ba6	[VPlan] Use VPBB in sinkScalarOperands directly. (NFC) Suggested by @Ayal in D139790.	2022-12-25 21:34:59 +00:00
Florian Hahn	9758242046	[LV] Use SCEV to check if the trip count <= VF * UF. Just comparing constant trip counts causes LV to miss cases where the vector loop body only executes once. The motivation for this is to remove the need for unrolling to remove vector loop back-edges, if the body only executes once in more cases. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D133017	2022-12-24 18:34:54 +00:00
Florian Hahn	e1650c8d52	[LV] Move exit cond simplification to separate transform. This sets the stage for D133017 by moving out the code that performs VPlan based simplifications to a separate transform that takes the chosen VF & UF as arguments. The main advantage is that this transform runs before any changes to the CFG are being made. This allows using SCEV without worrying about making queries while the IR is in an incomplete state. Note that this patch switches the reasoning to use SCEV, but still only simplifies loops with constant trip counts. Using SCEV here is needed to access the backedge taken count, because the trip count IR value has not been created yet. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D135017	2022-12-23 12:51:21 +00:00
Florian Hahn	b7b1e5c96f	[LV] Assert that the executed plan contains selected VF & UF (NFC). Add assertion to ensure the executed plan is valid for the selected VF and UF.	2022-12-23 11:44:42 +00:00
Florian Hahn	5df34e971d	[VPlan] Add support for tracking UFs applicable to VPlan (NFC). Explicitly track the UFs supported in a VPlan. This is needed to allow transformations to restrict the UFs which are supported. Discussed as separate improvement in D135017.	2022-12-22 18:58:25 +00:00
Florian Hahn	96296922b6	[VPlan] Move VF and UF string generation to getName() (NFC). The VFs and UFs may be more constrained as the plans are transformed (e.g. see D135017 for an example). To make sure the VFs/UFs included in the VPlan dump are accurate, generate them when accessing a plan's name, rather than include them in the name string set after initial construction.	2022-12-22 13:15:01 +00:00
Mircea Trofin	946831ea2d	[NFC] Rename Function::isDebugInfoForProfiling to shouldEmit[...] The function name was misleading - the expectation set both by the name and by other members of Function (like isDeclaration or isIntrinsic) would be that the function somehow would "be" "debug info for profiling". But that's not the case - the property indicates (as the comment over the declaration also explains) whether debug info should be emitted (for profiling).	2022-12-21 18:36:59 -08:00
Florian Hahn	a84064bcda	[LV] Add createTripCountSCEV helper (NFC). Split off helper function in preparation for D135017.	2022-12-21 22:02:31 +00:00
Florian Hahn	7d8528dbf2	[LV] Move SCEV caching workaround to executePlan (NFC). As suggested by @Ayal in D92132. This avoids having to duplicate the workaround in multiple places.	2022-12-21 14:51:21 +00:00
Alexey Bataev	2e972ea056	[SLP]Integrate looking through shuffles logic into ShuffleInstructionBuilder. Added BaseShuffleAnalysis as a base class for ShuffleInstructionBuilder and integrated shuffle logic from shuffles for externally used scalars into this class. This class is used as the main container that implements smart shuffle instruction builder logic. ShuffleInstructionBuilder uses this logic. ShuffleInstructionBuilder is also used in building of the shuffle for the externally used scalars instead of lambdas, which are now part of BaseShuffleAnalysis class. Differential Revision: https://reviews.llvm.org/D140100	2022-12-21 06:12:53 -08:00
Florian Hahn	f69ac9a22d	[LV] Support widened induction variables in epilogue vectorization. Code generation now uses the start VPValue of induction recipes. This makes it possible to adjust the start value of the epilogue vector loop to use the 'resume' value of the main vector loop. Fixes #59459. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D92132	2022-12-21 13:58:50 +00:00
Kazu Hirata	c08fad8193	[llvm] Remove redundant initialization of std::optional (NFC)	2022-12-20 15:53:38 -08:00
Florian Hahn	41b45ce656	[LV] Remove unused AAResults argument (NFC). AAResults is passed to LoopVectorizationLegality but no longer used. Remove the dead code.	2022-12-19 20:37:47 +00:00
Fangrui Song	21c4dc7997	std::optional::value => operator*/operator-> value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). This fixes clang.	2022-12-17 00:42:05 +00:00
Florian Hahn	08f16a8217	[VPlan] Use macro to define recipe classof implementation (NFC). Add a VP_CLASSOF_IMPL macro to define common classof implementations for recipes. This reduces duplication and also adds missing implementations to existing recipes.	2022-12-16 17:52:15 +00:00
Kazu Hirata	6eb0b0a045	Don't include Optional.h These files no longer use llvm::Optional.	2022-12-14 21:16:22 -08:00
Florian Hahn	e898479f2b	[VPlan] Sink non-uniform recipes for scalar plans. In scalar plans, replicate recipes will only generate a single value per UF, independent of whether they are uniform or not. So don't consider uniformity for plans with scalar VFs only. This allows us to handle a few additional cases in VPlan sinking instead of non-VPlan sinkScalarOperands. Depends on D133762. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D134218	2022-12-14 17:55:31 +00:00
Fangrui Song	d4b6fcb32e	[Analysis] llvm::Optional => std::optional	2022-12-14 07:32:24 +00:00
Alexey Bataev	ecac8192db	[SLP][NFC]Initial redesign of ShuffleInstructionBuilder, NFC. The patch redesigns ShuffleInstructionBuilder so it could later be used for reshuffling of the buildvector sequences and vectorized parts of externally used scalars. Also will allow to generalize cost model for the gathers/buildvectors. Part of D110978. Differential Revision: https://reviews.llvm.org/D139718	2022-12-13 09:37:18 -08:00
Fangrui Song	1ec11d2d48	[Transforms/Vectorize] llvm::Optional => std::optional	2022-12-12 08:56:35 +00:00
Fangrui Song	c178ed33bd	Transforms/Utils: llvm::Optional => std::optional	2022-12-12 08:29:05 +00:00


3535 Commits