clang-p2996

Author	SHA1	Message	Date
Simon Pilgrim	4f95821f58	[DAG] SelectionDAG::getNode() - consistently use N1 for first operand. NFCI. This has been annoying me for years - rename Operand to N1 so it matches all the other getNode() calls, and simplifies my debug watch windows!	2023-07-17 17:17:40 +01:00
Simon Pilgrim	e9caa37e9c	[DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits Inspired by some of the cases from D145468 Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free. A future patch will propose the equivalent shl narrowing combine. Differential Revision: https://reviews.llvm.org/D146121	2023-07-17 15:50:09 +01:00
Amara Emerson	432338a673	Don't assert on a non-pointer value being used for a "p" inline asm constraint. GCC and existing codebases allow the use of integral values to be used with this constraint. A recent change D133914 in this area started causing asserts. Removing the assert is enough as the rest of the code works fine. rdar://109675485 Differential Revision: https://reviews.llvm.org/D155023	2023-07-13 10:45:56 -07:00
Jon Roelofs	56e60bc5bb	TargetLowering: fix an infinite DAG combine in SimplifySETCC TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to canonicalize the constant to the RHS. The bug here was that it did so whether or not the RHS was already a constant, leading to an infinite loop. rdar://111847838 Differential revision: https://reviews.llvm.org/D155095 This reverts commit `cdc633e4bc`.	2023-07-12 16:13:27 -07:00
Noah Goldstein	a4c461c063	[SelectionDAG] Fill in some more cases in `isKnownNeverZero` This mostly copies cases that already exist in ValueTracking, although it skips the more complex ones. Those can be filled in as needed. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D149199	2023-07-12 17:17:53 -05:00
Noah Goldstein	74f0ec5e24	[DAGCombiner] Make it so that `udiv` can be folded with `(select c, NonZero, 1)` This is done by allowing speculation of `udiv` if we can prove the denominator is non-zero. https://alive2.llvm.org/ce/z/VNCt_q Differential Revision: https://reviews.llvm.org/D149198	2023-07-12 17:17:53 -05:00
Jon Roelofs	cdc633e4bc	Revert "TargetLowering: fix an infinite DAG combine in SimplifySETCC" This reverts commit `b76c85b355`. It broke the RISCV-enabled bots. Oops.	2023-07-12 12:22:03 -07:00
Jon Roelofs	b76c85b355	TargetLowering: fix an infinite DAG combine in SimplifySETCC TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to canonicalize the constant to the RHS. The bug here was that it did so whether or not the RHS was already a constant, leading to an infinite loop. rdar://111847838 Differential revision: https://reviews.llvm.org/D155095	2023-07-12 11:44:15 -07:00
Craig Topper	45b172c838	[LegalizeDAG] Prevent LegalizeLoadOps from creating extloads that mix int and fp types. For RISC-V, getRegisterType for fp16 returns i16. i16->fp64 extload is considered legal because the LoadExtActions defaults to Legal for all entries. Only fp/fp and int/int entries are changed to Expand for RISC-V. This patch detects the FP-ness has changed and won't try to call isLoadExtLegal. Alternatively, we could add Expand for int/fp and fp/int, but that seemed a little silly. Fixes #63816 Reviewed By: asb, wangpc Differential Revision: https://reviews.llvm.org/D155040	2023-07-12 08:03:35 -07:00
Marco Elver	de79233b2e	[X86] Complete preservation of !pcsections in X86ISelLowering https://reviews.llvm.org/D130883 introduced MIMetadata to simplify metadata propagation (DebugLoc and PCSections). However, we're currently still permitting implicit conversion of DebugLoc to MIMetadata, to allow for a gradual transition and let the old code work as-is. This manifests in lost !pcsections metadata for X86-specific lowerings. For example, 128-bit atomics. Fix the situation for X86ISelLowering by converting all BuildMI() calls to use an explicitly constructed MIMetadata. Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D154986	2023-07-12 15:09:31 +02:00
Ivan Kosarev	15e7749e19	[Codegen] Generate fast fp64-to-fp16 conversions in unsafe mode. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154528	2023-07-12 11:55:19 +01:00
Jay Foad	f7684d8510	[DAG] Use legal shift amount type in DAGTypeLegalizer::JoinIntegers Documentation for TargetLowering::getShiftAmountTy says that LegalTypes should generally be true during type legalization, so this patch does that. On AMDGPU the effect is that we use i32 (a sane type) instead of i64 (pointer sized type) for more shift amounts, which in turn allows more formation of rotates and funnel shifts pre-legalization. Differential Revision: https://reviews.llvm.org/D154960	2023-07-12 08:12:09 +01:00
Matt Arsenault	b59022b42e	DAG: Handle lowering of unordered fcZero|fcSubnormal to fcmp	2023-07-11 18:30:15 -04:00
Matt Arsenault	1d92b68ead	DAG: Correct chain management for frexp libcalls We need to replace the other uses of the call chain with the new load chain. Fixes not preserving the return def with unused x86_fp80 results. Regression reported here: https://reviews.llvm.org/rGb15bf305ca3e9ce63aaef7247d32fb3a75174531#1224999	2023-07-10 21:39:15 -04:00
Matt Arsenault	310f839612	DAG: Lower is.fpclass fcInf to fcmp of fabs InstCombine should have taken care of this, but I think this is more useful in the future when the expansion tries to handle multiple cases at a time with fcmp. x87 looks worse to me but the only thing I know about it is that I aggressively do not care about it. https://reviews.llvm.org/D143198	2023-07-07 17:00:10 -04:00
Matt Arsenault	64df9573a7	DAG: Handle inversion of fcSubnormal | fcZero There are a number of more test combinations here that can be done together and reduce the number of instructions. https://reviews.llvm.org/D143191	2023-07-06 21:19:44 -04:00
Matt Arsenault	61820f8b5d	CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel currently doesn't attempt the inversion optimization. https://reviews.llvm.org/D143182	2023-07-06 13:03:57 -04:00
Matt Arsenault	1588e18b2d	DAG: Check isCondCodeLegal in is_fpclass expansion to fcmp eq 0 Results in some x86 codegen diffs. Some look better, some look worse. https://reviews.llvm.org/D152094	2023-07-06 13:00:52 -04:00
Matt Arsenault	e8ed6e35bd	DAG: Implement soften float for ffrexp Fixes #63661 https://reviews.llvm.org/D154555	2023-07-05 21:42:27 -04:00
Matt Arsenault	20964c901a	DAG: Fix dropping flags when widening unary vector ops	2023-07-05 17:25:24 -04:00
Amaury Séchet	ee2d10cd16	[NFC] Reorder functions in DAGCombiner so all UADDO_CARRY related functions are next to each other.	2023-07-04 14:55:11 +00:00
David Green	f55d96b9a2	[DAG][AArch64] Handle vector types when expanding sdiv/udiv into mulh The aarch64 backend will benefit from expanding 64vector sdiv/udiv into mulh using shift(mul(ext, ext)), as the larger type size is legal and the mul(ext, ext) can efficiently use smull/umull instructions. This extends the existing code in GetMULHS to handle vector types for it. Differential Revision: https://reviews.llvm.org/D154049	2023-07-02 15:02:52 +01:00
Simon Pilgrim	4742715eb7	[DAG] Fold (ext (_extend_vector_inreg x)) -> (*_extend_vector_inreg x)	2023-06-30 14:42:49 +01:00
Matt Arsenault	7d644dc598	DAG: Really fix patch split	2023-06-30 09:14:02 -04:00
Matt Arsenault	2b988801c9	DAG: Fix broken patch split	2023-06-30 09:07:23 -04:00
Matt Arsenault	160d7227e0	DAG: Fix libcall expansion for frexp on ARM The ExpandLibcallResult result was a bitcast and not the direct call result, so we couldn't find the chain. Use the new separate chain return value instead.	2023-06-30 09:03:45 -04:00
Matt Arsenault	b69b6b8399	DAG: Return the chain from ExpandLibCall If the libcall expansion requires use of the inserted call's result chain, it's unreliable to query it from the main result. The call lowering may have added additional casts or other obscuring operations we don't want to parse through.	2023-06-30 09:03:40 -04:00
David Green	14f54a594e	[DAG][AArch64] Fold shuffle_vector<4,5,6,7> to extract_subvector During legalization, we can end up with shuffles that are identity masks, so act like extract_subvector, but do not simplify to extract_subvector. This adjusts the profitability heuristic in foldExtractSubvectorFromShuffleVector to allow identity vectors that do not start at element 0. Undef masks elements are excluded as it can be more useful to keep the undef elements. Differential Revision: https://reviews.llvm.org/D153504	2023-06-30 11:13:39 +01:00
Luke Lau	742fb8b5c7	[DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x) If we have a store of a load with no other uses in between it, it's considered dead and is removed. So sometimes when legalizing a fixed length vector store of an insert, we end up producing better code through scalarization than without. An example is the follow below: %a = load <4 x i64>, ptr %x %b = insertelement <4 x i64> %a, i64 %y, i32 2 store <4 x i64> %b, ptr %x If this is scalarized, then DAGCombine successfully removes 3 of the 4 stores which are considered dead, and on RISC-V we get: sd a1, 16(a0) However if we make the vector type legal (-mattr=+v), then we lose the optimisation because we don't scalarize it. This patch attempts to recover the optimisation for vectors by identifying patterns where we store a load with a single insert inbetween, replacing it with a scalar store of the inserted element. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152276	2023-06-28 22:45:04 +01:00
Matt Arsenault	003b58f65b	IR: Add llvm.frexp intrinsic Add an intrinsic which returns the two pieces as multiple return values. Alternatively could introduce a pair of intrinsics to separately return the fractional and exponent parts. AMDGPU has native instructions to return the two halves, but could use some generic legalization and optimization handling. For example, we should be able to handle legalization of f16 on older targets, and for bf16. Additionally antique targets need a hardware workaround which would be better handled in the backend rather than in library code where it is now.	2023-06-28 14:50:16 -04:00
Craig Topper	e819f5cccf	[LegalizeTypes] Combine PromoteIntRes_VECTOR_DEINTERLEAVE and PromoteIntRes_VECTOR_INTERLEAVE. NFC The functions are identical except for the opcode of the node. We can have a single function and use N->getOpcode(). Reviewed By: luke, paulwalker-arm Differential Revision: https://reviews.llvm.org/D153929	2023-06-28 07:57:47 -07:00
FLZ101	32e4013dd4	[AArch64][SelectionDAG] fix infinite loop caused by legalizing & combining CONCAT_VECTORS Legalizing in `AArch64TargetLowering::LowerCONCAT_VECTORS()` and combining in `DAGCombiner::visitCONCAT_VECTORS()` could cause an infinite loop. This commit fixes that issue by conditionally skipping the combining. Fix https://github.com/llvm/llvm-project/issues/63322 Reviewed By: RKSimon, MaskRay Differential Revision: https://reviews.llvm.org/D153316	2023-06-27 13:57:41 -07:00
Simon Pilgrim	bc81791e07	Fix "this this" duplicate typo in comment. NFC.	2023-06-27 11:46:02 +01:00
Simon Pilgrim	64d01432d2	Fix "for for" duplicate typo in comment. NFC.	2023-06-27 11:43:09 +01:00
Alex MacLean	17aa37dd30	[SelectionDAG] Add memory size for CSEMap ID calculation In NVPTX `ReplaceVectorLoad()`, i1 and i8 types are promoted to i16, followed by a truncate operation. Thus, v2i8 (or v2i1) and v2i16 will have the same VTList, which causes a collision in CSEMap. To differentiate the original VTList, let's add the size in generating an ID. Otherwise the compiler crashes in refineAlignment: `MMO->getSize() == getSize() && "Size mismatch!"` Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D153712	2023-06-26 16:12:48 -07:00
Craig Topper	4afa2ab7a5	[RISCV][SelectionDAGBuilder] Fix an implicit scalable TypeSize to fixed size conversion in getUniformBase. If the index needs to be scaled by a scalable size, just give up. Fixes #63459 Reviewed By: frasercrmck, RKSimon Differential Revision: https://reviews.llvm.org/D153601	2023-06-26 11:56:17 -07:00
Eli Friedman	bc7f11ccb0	[SelectionDAG] Improve expansion of wide min/max The current implementation tries to handle the high and low halves separately, but that's less efficient in most cases; use a wide SETCC instead. Differential Revision: https://reviews.llvm.org/D151358	2023-06-26 10:45:41 -07:00
Youngsuk Kim	d22a236ae7	[llvm] Replace use of Type::getPointerTo() (NFC) Partial progress towards replacing in-tree uses of `Type::getPointerTo()`. If `getPointerTo()` is used solely to support an unnecessary bitcast, remove the bitcast. Reviewed By: barannikov88, nikic Differential Revision: https://reviews.llvm.org/D153307	2023-06-23 22:32:29 -04:00
Fangrui Song	f9fd0062b6	[XRay][AArch64] Support __xray_customevent/__xray_typedevent `__xray_customevent` and `__xray_typedevent` are built-in functions in Clang. With -fxray-instrument, they are lowered to intrinsics llvm.xray.customevent and llvm.xray.typedevent, respectively. These intrinsics are then lowered to TargetOpcode::{PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL}. The target is responsible for generating a code sequence that calls either `__xray_CustomEvent` (with 2 arguments) or `__xray_TypedEvent` (with 3 arguments). Before patching, the code sequence is prefixed by a branch instruction that skips the rest of the code sequence. After patching (compiler-rt/lib/xray/xray_AArch64.cpp), the branch instruction becomes a NOP and the function call will take effect. This patch implements the lowering process for {PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL} and implements the runtime. ``` // Lowering of PATCHABLE_EVENT_CALL .Lxray_sled_N: b #24 stp x0, x1, [sp, #-16]! x0 = reg of op0 x1 = reg of op1 bl __xray_CustomEvent ldrp x0, x1, [sp], #16 ``` As a result, two updated tests in compiler-rt/test/xray/TestCases/Posix/ now pass on AArch64. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D153320	2023-06-23 09:24:18 -07:00
Simon Pilgrim	1f006f5fb6	[DAG] mergeTruncStores - early out if we collect more than the maximum number of stores If we have an excessive number of stores in a single chain then the candidate WideVT may exceed the maximum width of an EVT integer type (and will assert) - but since mergeTruncStores doesn't support anything wider than an i64 store we should just early-out if we've collected more stores than that. Fixes #63306	2023-06-23 16:22:11 +01:00
David Green	589c940eb3	[DAG] Fix and expand fmin/fmax reassociation fold. This call to reassociateReduction is used by both fminnum/fmaxnum and fminimum/fmaximum. In adding support for fminimum/fmaximum we appear to be fixing the use of an incorrect reduction type, which should have only applied to minnum/maxnum. I also believe that it doesn't need nsz and reassoc to perform the reassociation. For float min/max it should always be valid. Differential Revision: https://reviews.llvm.org/D153247	2023-06-23 14:45:14 +01:00
Dhruv Chawla	3f77724de7	[TargetLowering] Better code generation for ISD::SADDSAT/SSUBSAT when operand sign is known When the sign of either of the operands is known, it is possible to determine what the saturating value will be without having to compute it using the sign bits. Differential Revision: https://reviews.llvm.org/D153575	2023-06-23 13:20:36 +05:30
Amaury Séchet	34d8c5b9ce	[DAG] Peek through trunc when combining select into shifts. This fixes a regression in D127115 Depends on D127115 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151916	2023-06-23 00:35:39 +00:00
Nikita Popov	81ec494c36	[SDAGBuilder] Handle multi-part arguments in argument copy elision (PR63430) When eliding an argument copy, we need to update the chain to ensure the argument reads are performed before later writes. However, the code doing this only handled this for the first part of the argument. If the argument had multiple parts, the chains of the later parts were dropped. Make sure we preserve all chains. Fixes https://github.com/llvm/llvm-project/issues/63430.	2023-06-22 17:04:56 +02:00
Matt Arsenault	18b93562cf	DAG: Expand legalization of is.fpclass to fcmp for DAZ Try to use a compare with 0 if DAZ is assumed. FPClassTest really needs to be marked as a bitmask enum, but the API for that is currently broken.	2023-06-22 06:18:02 -04:00
Simon Pilgrim	411deb97cf	[DAG] ScalarizeVectorResult - add ISD::MULHS/ISD::MULHU handling Fixes #63439	2023-06-22 11:09:55 +01:00
tianleli	1c27275813	[DAG] Unroll and expand illegal result of LDEXP and POWI instead of widen. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D153104	2023-06-21 14:27:39 +08:00
Simon Pilgrim	43ad2e9c8b	[DAG] Add getExtOrTrunc helper. NFC. Wrap the getSExtOrTrunc/getZExtOrTrunc calls behind an IsSigned argument.	2023-06-20 16:03:18 +01:00
Simon Pilgrim	ff23856c1c	[DAG] Fold (abds x, y) -> (abdu x, y) iff both args are known positive This is a generic DAG combine version of D151055 which recognizes when a signed ABDS can be safely replaced with an unsigned ABDU instruction if it is legal. Alive2: https://alive2.llvm.org/ce/z/pb5BjG Differential Revision: https://reviews.llvm.org/D153328	2023-06-20 15:31:22 +01:00
Jeffrey Byrnes	7972a6e126	[DAGCombiner][NFC] Factor out ByteProvider Differential Revision: https://reviews.llvm.org/D143018 Change-Id: I3dc03787a3382c0c3fe6b869f869c2946f450874	2023-06-19 08:54:34 -07:00

12956 Commits