clang-p2996

Author	SHA1	Message	Date
Simon Pilgrim	ae60706da0	[DAG] SimplifyDemandedBits - call ComputeKnownBits for constant non-uniform ISD::SRL shift amounts We only attempted to determine KnownBits for uniform constant shift amounts, but ComputeKnownBits is able to handle some non-uniform cases as well that we can use as a fallback.	2023-07-21 14:52:57 +01:00
Simon Pilgrim	7567b72f4d	[DAG] ShrinkDemandedConstant - early-out for empty DemandedBits/Elts Leave this to constant folding in SimplifyDemandedBits Fixes #63975	2023-07-20 12:18:10 +01:00
Simon Pilgrim	d7eb9240c0	[DAG] SimplifyDemandedBits - attempt to use SimplifyMultipleUseDemandedBits for bitcasts from larger element types Attempt to avoid multi-use ops if the bitcast doesn't need anything from them.	2023-07-18 18:38:03 +01:00
Simon Pilgrim	e9caa37e9c	[DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits Inspired by some of the cases from D145468 Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free. A future patch will propose the equivalent shl narrowing combine. Differential Revision: https://reviews.llvm.org/D146121	2023-07-17 15:50:09 +01:00
Jon Roelofs	56e60bc5bb	TargetLowering: fix an infinite DAG combine in SimplifySETCC TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to canonicalize the constant to the RHS. The bug here was that it did so whether or not the RHS was already a constant, leading to an infinite loop. rdar://111847838 Differential revision: https://reviews.llvm.org/D155095 This reverts commit `cdc633e4bc`.	2023-07-12 16:13:27 -07:00
Jon Roelofs	cdc633e4bc	Revert "TargetLowering: fix an infinite DAG combine in SimplifySETCC" This reverts commit `b76c85b355`. It broke the RISCV-enabled bots. Oops.	2023-07-12 12:22:03 -07:00
Jon Roelofs	b76c85b355	TargetLowering: fix an infinite DAG combine in SimplifySETCC TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to canonicalize the constant to the RHS. The bug here was that it did so whether or not the RHS was already a constant, leading to an infinite loop. rdar://111847838 Differential revision: https://reviews.llvm.org/D155095	2023-07-12 11:44:15 -07:00
Matt Arsenault	b59022b42e	DAG: Handle lowering of unordered fcZero|fcSubnormal to fcmp	2023-07-11 18:30:15 -04:00
Matt Arsenault	310f839612	DAG: Lower is.fpclass fcInf to fcmp of fabs InstCombine should have taken care of this, but I think this is more useful in the future when the expansion tries to handle multiple cases at a time with fcmp. x87 looks worse to me but the only thing I know about it is that I aggressively do not care about it. https://reviews.llvm.org/D143198	2023-07-07 17:00:10 -04:00
Matt Arsenault	64df9573a7	DAG: Handle inversion of fcSubnormal | fcZero There are a number of more test combinations here that can be done together and reduce the number of instructions. https://reviews.llvm.org/D143191	2023-07-06 21:19:44 -04:00
Matt Arsenault	61820f8b5d	CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel currently doesn't attempt the inversion optimization. https://reviews.llvm.org/D143182	2023-07-06 13:03:57 -04:00
Matt Arsenault	1588e18b2d	DAG: Check isCondCodeLegal in is_fpclass expansion to fcmp eq 0 Results in some x86 codegen diffs. Some look better, some look worse. https://reviews.llvm.org/D152094	2023-07-06 13:00:52 -04:00
David Green	f55d96b9a2	[DAG][AArch64] Handle vector types when expanding sdiv/udiv into mulh The aarch64 backend will benefit from expanding 64vector sdiv/udiv into mulh using shift(mul(ext, ext)), as the larger type size is legal and the mul(ext, ext) can efficiently use smull/umull instructions. This extends the existing code in GetMULHS to handle vector types for it. Differential Revision: https://reviews.llvm.org/D154049	2023-07-02 15:02:52 +01:00
Dhruv Chawla	3f77724de7	[TargetLowering] Better code generation for ISD::SADDSAT/SSUBSAT when operand sign is known When the sign of either of the operands is known, it is possible to determine what the saturating value will be without having to compute it using the sign bits. Differential Revision: https://reviews.llvm.org/D153575	2023-06-23 13:20:36 +05:30
Matt Arsenault	18b93562cf	DAG: Expand legalization of is.fpclass to fcmp for DAZ Try to use a compare with 0 if DAZ is assumed. FPClassTest really needs to be marked as a bitmask enum, but the API for that is currently broken.	2023-06-22 06:18:02 -04:00
Noah Goldstein	5c8188c7bc	[DAGCombine] Use `IsKnownNeverZero` to see if we need zero-check in is_pow2 setcc pattern `ctpop(X) eq/ne 1` is checking if X is a non-zero power of 2. Power of 2 check including zero is `(X & (X-1)) eq/ne 0` and unfortunately there is no good pattern for checking a power of 2 while excluding zero. So, when lowering `ctpop(X) eq/ne 1`, explicitly check `IsKnownNeverZero(X)` to maybe be able to optimize out the extra zero check. We need this explicitly as DAGCombiner does not re-analyze provable setcc nodes, and the middle-end never finds it beneficial to broaden `ctpop(X) eq/ne 1` -> `ctpop(X) ule/ugt 1` (power of 2 including zero). Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152675	2023-06-12 13:52:43 -05:00
Yeting Kuo	2fe2a6d4b8	[DAGCombiner] Use generalized pattern matcher in visitFMA to support vp.fma. Note: Some patterns in visitFMA need to be refined to support splat of constant. Reviewed By: luke Differential Revision: https://reviews.llvm.org/D152260	2023-06-08 09:40:21 +08:00
Craig Topper	ee27e5df9e	[TargetLowering][ARM][AArch64] Remove usage of NoSignedWrap/NoUnsignedWrap from AVGFLOOR/CEIL transform. Use computeOverflowForUnsignedAdd and computeOverflowForSignedAdd instead. Unfortunately, this recomputes some known bits and sign bits we may have already computed, but was the easiest fix without a lot of restructuring. This recovers the regressions from D151472. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D151858	2023-06-01 14:18:08 -07:00
Dhruv Chawla	3b3912e9b8	Reapply [SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits() This exposed a miscompile due to incorrect flag preservation in integer type legalization, which has been fixed in D151472. ----- This patch is a continuation of D150110. It separates the cases for ADD and SUB into their own cases so that computeForAddSub can be directly called and the NSW flag passed. This allows better optimization when the NSW flag is enabled, and allows fixing up the TODO that was there previously in SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D150769	2023-05-31 12:25:41 +02:00
Nikita Popov	2ba14283cd	Revert "[SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits()" This reverts commit `b66551370f`. This has exposed a pre-existing miscompile, reported in https://reviews.llvm.org/D150769#4370467.	2023-05-25 11:13:51 +02:00
Dhruv Chawla	b66551370f	[SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits() This patch is a continuation of D150110. It separates the cases for ADD and SUB into their own cases so that computeForAddSub can be directly called and the NSW flag passed. This allows better optimization when the NSW flag is enabled, and allows fixing up the TODO that was there previously in SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D150769	2023-05-17 15:15:05 +02:00
Jay Foad	d8229e2f14	[KnownBits] Define and use intersectWith and unionWith Define intersectWith and unionWith as two complementary ways of combining KnownBits. The names are chosen for consistency with ConstantRange. Deprecate commonBits as a synonym for intersectWith. Differential Revision: https://reviews.llvm.org/D150443	2023-05-16 09:23:51 +01:00
Craig Topper	a983ef2c17	[DAGCombiner][AArch64][VE] Teach BuildUDIV/SDIV to use 2x mul when mulh/mul_lohi are not available. Correct the legality of i32 mul_lohi on AArch64. Previously, AArch64 incorrectly reported i32 mul_lohi as Legal. This allowed BuildUDIV/SDIV to use them. A later DAGCombiner would replace them with MULHS/MULHU because only the high half was used. This conversion does not check the legality of MULHS/MULHU under the assumption that LegalizeDAG can turn it back into MUL_LOHI later. After they are converted to MULHS/MULHU, DAGCombine ran and saw that these operations aren't supported but an i64 MUL is. So they get converted to that plus a shift. Without this, LegalizeDAG would convert back MUL_LOHI and isel would fail to find a pattern. This patch teaches BuildUDIV/SDIV to create the wide mul and shift so that we can report the correct operation legality on AArch64. It also enables div by constant folding for more cases on VE. I don't know if VE wants this div by constant optimization or not. If they don't want it, they can use the isIntDivCheap hook to disable it. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D150333	2023-05-12 09:06:17 -07:00
Dhruv Chawla	1d21d2eb7f	[TargetLowering] Fix unnecessary call to `computeKnownBits` (NFCI) In the SimplifyDemandedBits function, there is a fallthrough to the default case in the case of ISD::ADD, ISD::MUL and ISD::SUB. This leads to a call to computeKnownBits which is unnecessary as the calls to SimplifyDemandedBits in the cases themselves handle the calculation of the known bits. This information is discarded through the Known2 variables. By keeping this information around and calling KnownBits::mul or KnownBits::computeForAddSub directly, the unnecessary computation can be avoided. For now, the NSW bit is not passed through to KnownBits as this is something that computeKnownBits does not handle either. This requires updating computeForAddCarry to handle the flag as well. Differential Revision: https://reviews.llvm.org/D150110	2023-05-08 16:14:01 +02:00
Simon Pilgrim	051918c71e	[DAG] expandIntMINMAX - add umax(x,1) --> sub(x,cmpeq(x,0)) fold Move the fold from X86 to generic expansion (We also have several existing expansions that are missing freezes on repeated operands - I've added a TODO for now).	2023-05-05 19:27:52 +01:00
Simon Pilgrim	04e809ab90	[DAG] Add TargetLowering::expandABD and convert X86 lowering to use it Scalar widening cases are still custom lowered in the X86 backend - we still need to add promotion/legalization support to handle these	2023-05-05 15:13:23 +01:00
Evgenii Kudriashov	a82d27a9a6	[X86] Support llvm.{min,max}imum.f{16,32,64} Addresses https://github.com/llvm/llvm-project/issues/53353 Reviewed By: RKSimon, pengfei Differential Revision: https://reviews.llvm.org/D145634	2023-05-04 21:04:48 +08:00
Craig Topper	344368fb98	[TargetLowering] Stop passing an ISD::CondCode to isOperationLegalOrCustom. ISD::CondCode is a separate num space from opcodes. isOperationLegalOrCustom should take an opcode. Reviewed By: barannikov88 Differential Revision: https://reviews.llvm.org/D149528	2023-04-29 15:23:09 -07:00
Sergei Barannikov	e744e51b12	[SelectionDAG] Rename ADDCARRY/SUBCARRY to UADDO_CARRY/USUBO_CARRY (NFC) This will make them consistent with other overflow-aware nodes. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D148196	2023-04-29 21:59:58 +03:00
Craig Topper	df017ba9d3	[TargetLowering] Don't use ISD::SELECT_CC in expandFP_TO_INT_SAT. This function gets called for vectors and ISD::SELECT_CC was never intended to support vectors. Some updates were made to support it when this function started getting used for vectors. Overall, using separate ISD::SETCC and ISD::SELECT looks like an improvement even for scalar. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D149481	2023-04-29 10:23:08 -07:00
Matt Arsenault	bc37be1855	LangRef: Add "dynamic" option to "denormal-fp-math" This is stricter than the default "ieee", and should probably be the default. This patch leaves the default alone. I can change this in a future patch. There are non-reversible transforms I would like to perform which are legal under IEEE denormal handling, but illegal with flushing zero behavior. Namely, conversions between llvm.is.fpclass and fcmp with zeroes. Under "ieee" handling, it is legal to translate between llvm.is.fpclass(x, fcZero) and fcmp x, 0. Under "preserve-sign" handling, it is legal to translate between llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0. I would like to compile and distribute some math library functions in a mode where it's callable from code with and without denormals enabled, which requires not changing the compares with denormals or zeroes. If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0, it is no longer possible to call the function from code with denormals enabled, or write an optimization to move the function into a denormal flushing mode. For the original function, if x was a denormal, the class would evaluate to false. If the function compiled with denormal handling was converted to or called from a preserve-sign function, the fcmp now evaluates to true. This could also be of use for strictfp handling, where code may be changing the denormal mode. Alternative name could be "unknown". Replaces the old AMDGPU custom inlining logic with more conservative logic which tries to permit inlining for callees with dynamic handling and avoids inlining other mismatched modes.	2023-04-29 08:44:59 -04:00
Kazu Hirata	972983539b	[llvm] Apply fixes from readability-redundant-control-flow (NFC)	2023-04-16 00:13:46 -07:00
Kazu Hirata	63c4967352	Use APInt::getOneBitSet (NFC)	2023-04-10 18:19:17 -07:00
Craig Topper	b5f207e5b2	[SelectionDAG] Rename Flag->Glue. NFC	2023-04-02 19:46:51 -07:00
Simon Pilgrim	8153b92d9b	[DAG] Add SelectionDAG::SplitScalar helper Similar to the existing SelectionDAG::SplitVector helper, this helper creates the EXTRACT_ELEMENT nodes for the LO/HI halves of the scalar source. Differential Revision: https://reviews.llvm.org/D147264	2023-03-31 18:35:40 +01:00
Craig Topper	c9e4d9a8ea	[LegalizeTypes][TargetLowering][RISCV] Fix regressions from D146786. Add some special cases for UADDO to recover codegen after D146786. Reviewed By: reames, liaolucy Differential Revision: https://reviews.llvm.org/D146789	2023-03-27 09:58:51 -07:00
Kazu Hirata	7bb6d1b32e	[llvm] Skip getAPIntValue (NFC) ConstantSDNode provides some convenience functions like isZero, getZExtValue, and isMinSignedValue that are named identically to those provided by APInt, so we can "skip" getAPIntValue.	2023-03-22 22:10:25 -07:00
Jun Zhang	b3e12beb44	[TLI] Fold ~X >/< ~Y --> Y >/< X Fixes: https://github.com/llvm/llvm-project/issues/61120 Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D146512	2023-03-23 12:49:05 +08:00
Craig Topper	a37df84f99	[SelectionDAG][RISCV] Remove code for handling too small shift type from SimplifyDemandedBits. This code detected that the type returned from getShiftAmountTy was too small to hold the constant shift amount. But it used the full type size instead of scalar type size leading it to crash for scalable vectors. This code was necessary when getShiftAmountTy would always return the target preferred shift amount type for scalars even when the type was an illegal type larger than the target supported. For vectors, getShiftAmountTy has always returned the vector type. Fortunately, getShiftAmountTy was fixed a while ago to detect that the target's preferred size for scalars is not large enough for the type. So we can delete this code. Switched to use getShiftAmountConstant to further simplify the code. Fixes PR61561.	2023-03-21 11:08:19 -07:00
Matt Arsenault	9356ec1516	CodeGen: Reorder case handling for is.fpclass legalization Subnormal and zero checks can be combined into one, so move the code closer to reduce the diff in a future change.	2023-03-17 11:29:50 -04:00
Simon Pilgrim	6bc0e362d7	[DAG] TargetLowering::ShrinkDemandedOp - move SmallVTBits iterator inside for loop. NFC	2023-03-16 12:12:33 +00:00
Simon Pilgrim	7aa7393aab	[DAG] TargetLowering::ShrinkDemandedOp - pull out repeated getValueType calls. NFC	2023-03-16 12:12:33 +00:00
Simon Pilgrim	dc20ce7e54	[DAG] TargetLowering::ShrinkDemandedOp - rename Demanded arg to DemandedBits. NFC Make it clear this is referring to DemandedBits not DemandedElts.	2023-03-15 13:22:21 +00:00
Jay Foad	0265dd9925	Fix "compatiable" typos	2023-03-07 12:57:39 +00:00
Simon Pilgrim	73cdccad55	[DAG] expandIntMINMAX - attempt to match existing SETCC node As noticed on D144789, when we have pairs of min/max nodes we often end up with multiple comparisons which we could reuse with commuted select ops, so check to see if a suitable SETCC already exists. This also allowed us to remove a similar X86 peephole. There are other getSETCC cases where we could safely reuse other CondCodes as well - I've been trying to think of how we could reuse this logic in SelectionDAG but haven't found anything that always works well. An alternative would be to have a TLI callback that returns a preferred CondCode from a list of options, I've noticed this helped fpclamptosat tests on some other targets (MVE + WebAssembly), but other tests suffered. Differential Revision: https://reviews.llvm.org/D145065	2023-03-01 19:04:03 +00:00
David Green	06daa515b2	[AArch64] Don't remove free sext_inreg(vector_extract(x)) if it leads to multiple extracts If we have sext_inreg(vector_extract(x)) but the top bits are not used, DAG will try to remove the sext_inreg, using vector_extract(x) directly. This can lead to multiple uses of both sext_inreg(vector_extract(x)) and vector_extract(x), leading to the generation of both umov and smov extracts. This adds a target hook to prevent that under AArch64 where the sext_inreg can be considered free if there are multiple uses of the sext and no uses of the vector_extract. This helps fix a small regression from D144550. Differential Revision: https://reviews.llvm.org/D144850	2023-02-27 19:20:10 +00:00
Serge Pavlov	7f81dd4dd6	[NFC] Make FPClassTest a bitmask enumeration This is recommit of `2e416cdd52`, fixed to be acceptable by GCC. The original commit message is below. With this change bitwise operations are allowed for FPClassTest enumeration, it must simplify using this type. Also some functions changed to get argument of type FPClassTest instead of unsigned. Differential Revision: https://reviews.llvm.org/D144241	2023-02-24 15:12:16 +07:00
Serge Pavlov	08a09235b6	Revert "[NFC] Make FPClassTest a bitmask enumeration" This reverts commit `e7613c1d9b`. GCC issues an error: In file included from /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/unittests/ADT/BitmaskEnumTest.cpp:9: /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/include/llvm/ADT/BitmaskEnum.h:66:22: error: explicit specialization of template<class E, class Enable> struct llvm::is_bitmask_enum outside its namespace must use a nested-name-specifier [-fpermissive] 66 | template <> struct is_bitmask_enum<Enum> : std::true_type {}; \ | ^~~~~~~~~~~~~~~~~~~~~ /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/unittests/ADT/BitmaskEnumTest.cpp:30:1: note: in expansion of macro LLVM_DECLARE_ENUM_AS_BITMASK 30 | LLVM_DECLARE_ENUM_AS_BITMASK(Flags2, V4); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~	2023-02-23 12:55:58 +07:00
Serge Pavlov	e7613c1d9b	[NFC] Make FPClassTest a bitmask enumeration This is recommit of `2e416cdd52`, reverted in `8555ab2fcd`, because GCC complains on extra qualification. The macro LLVM_DECLARE_ENUM_AS_BITMASK does not specify llvm:: anymore, so the macro must occur in the namespace llvm. Documentation updated accordingly. The original commit message is below. With this change bitwise operations are allowed for FPClassTest enumeration, it must simplify using this type. Also some functions changed to get argument of type FPClassTest instead of unsigned. Differential Revision: https://reviews.llvm.org/D144241	2023-02-23 12:38:57 +07:00
Nikita Popov	8555ab2fcd	Revert "[NFC] Make FPClassTest a bitmask enumeration" This reverts commit `2e416cdd52`. Breaks the GCC build: In file included from /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/FloatingPointMode.h:18, from /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/APFloat.h:20, from /home/npopov/repos/llvm-project/llvm/lib/Support/APFloat.cpp:14: /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/BitmaskEnum.h:66:22: error: extra qualification not allowed [-fpermissive] 66 | template <> struct llvm::is_bitmask_enum<Enum> : std::true_type {}; \ | ^~~~ /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/FloatingPointMode.h:223:1: note: in expansion of macro ‘LLVM_DECLARE_ENUM_AS_BITMASK’ 223 | LLVM_DECLARE_ENUM_AS_BITMASK(FPClassTest, /* LargestValue */ fcPosInf); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/BitmaskEnum.h:67:22: error: extra qualification not allowed [-fpermissive] 67 | template <> struct llvm::largest_bitmask_enum_bit<Enum> { \ | ^~~~ /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/FloatingPointMode.h:223:1: note: in expansion of macro ‘LLVM_DECLARE_ENUM_AS_BITMASK’ 223 | LLVM_DECLARE_ENUM_AS_BITMASK(FPClassTest, /* LargestValue */ fcPosInf); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ [43/4396] Building CXX object lib/Supp...iles/LLVMSupport.dir/CommandLine.cpp.o	2023-02-22 08:56:19 +01:00
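
Several of the commit messages above rest on small bit-level identities. The following is a minimal standalone sketch of four of them (5c8188c7bc, b3e12beb44, 051918c71e, 61820f8b5d) on plain scalars. It is not the LLVM implementation: all function names are hypothetical, `float` is assumed to be IEEE-754 binary32, and the compare in the umax fold is modelled as an all-ones mask, which is an assumption of this sketch.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// 5c8188c7bc: ctpop(X) == 1 tests for a non-zero power of two.
inline bool isPow2ViaCtpop(uint32_t X) {
  return std::bitset<32>(X).count() == 1;
}

// (X & (X - 1)) == 0 also accepts zero, so a separate zero check is
// needed unless X is already known to be non-zero.
inline bool isPow2OrZero(uint32_t X) { return (X & (X - 1)) == 0; }

// b3e12beb44: ~X < ~Y gives the same answer as Y < X
// (shown here for unsigned values only).
inline bool notLess(uint32_t X, uint32_t Y) { return ~X < ~Y; }

// 051918c71e: umax(x, 1) --> sub(x, cmpeq(x, 0)).
// Assumption: the compare yields 0 or an all-ones mask (-1).
inline uint32_t umaxWithOne(uint32_t X) {
  uint32_t Mask = (X == 0) ? ~0u : 0u;
  return X - Mask; // x - (-1) == x + 1 when x == 0, otherwise x
}

// 61820f8b5d: fcZero|fcSubnormal holds when the exponent bits are 0.
// Assumption: float is IEEE-754 binary32 (8 exponent bits at 0x7f800000).
inline bool isZeroOrSubnormal(float F) {
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof Bits);
  return (Bits & 0x7f800000u) == 0;
}

// Checks the identities over all 8-bit inputs plus a few float cases.
inline void checkIdentities() {
  for (uint32_t X = 0; X < 256; ++X) {
    assert(isPow2ViaCtpop(X) == (X != 0 && isPow2OrZero(X)));
    assert(umaxWithOne(X) == (X > 1 ? X : 1u));
    for (uint32_t Y = 0; Y < 256; ++Y)
      assert(notLess(X, Y) == (Y < X));
  }
  assert(isZeroOrSubnormal(-0.0f));
  assert(isZeroOrSubnormal(std::numeric_limits<float>::denorm_min()));
  assert(!isZeroOrSubnormal(std::numeric_limits<float>::min()));
}
```

Calling `checkIdentities()` from a test driver exercises each identity. The zero case is the one to watch: `isPow2OrZero(0)` is true while `isPow2ViaCtpop(0)` is false, which is why the lowering described in 5c8188c7bc queries `IsKnownNeverZero`.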

1383 Commits