clang-p2996

Author	SHA1	Message	Date
Craig Topper	8f04d81ede	[SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore. If the SRL for Hi constant folds, but we don't remove those bits from the Lo, we can end up with strange constant folding through DAGCombine later. I've only seen this with constants being lowered to constant pools during lowering on RISC-V.	2023-09-18 09:10:19 -07:00
Craig Topper	f71a9e8bb7	[SelectionDAG][RISCV][PowerPC][X86] Use TargetConstant for immediates for ISD::PREFETCH. (#66601) The intrinsic uses ImmArg so TargetConstant would be consistent with how other intrinsics are handled. This hides the constants from type legalization so we can remove the promotion support. isel patterns are updated accordingly.	2023-09-18 08:58:50 -07:00
Simon Pilgrim	b2ffc867ad	[DAG] getNode() - begin generalizing the (zext (trunc (assertzext x))) -> (assertzext x) fold. We'll need to generalize this fold to check for any zero upperbits to address some of the D155472 regressions, but this exposes a number of issues. For now, just use the general MaskedValueIsZero test instead of the assertzext.	2023-09-18 15:32:31 +01:00
Sergei Barannikov	caaf61eb6e	[SDag] Fold saddo[_carry] with bitwise-not argument to ssubo[_carry] (#66571) Fold `(saddo (not a), 1)` to `(ssubo 0, a)` and `(saddo_carry (not a), b, c)` to `(ssubo_carry b, a, !c)`. Proof: https://alive2.llvm.org/ce/z/Lj49YM This is the same as https://reviews.llvm.org/D46505 and https://reviews.llvm.org/D59208, but for signed opcodes.	2023-09-18 14:45:41 +03:00
Yingwei Zheng	e042ff7eef	[SDAG][RISCV] Avoid expanding is-power-of-2 pattern on riscv32/64 with zbb This patch adjusts the legality check for riscv to use `cpop/cpopw` since `isOperationLegal(ISD::CTPOP, MVT::i32)` returns false on rv64gc_zbb. Clang vs gcc: https://godbolt.org/z/rc3s4hjPh Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156390	2023-09-17 02:56:09 +08:00
Yingwei Zheng	b423e1f05d	[SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a constant rhs This patch avoids creating (sub x0, rhs) when lowering atomic_load_sub with a constant rhs. Comparison with GCC: https://godbolt.org/z/c5zPdP7j4 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158673	2023-09-16 17:09:41 +08:00
Philip Reames	09a5aac514	[TLI] Add extend as explicit parameter to shouldRemoveExtendFromGSIndex [nfc] Note: Reviewed as part of a stack of changes in PR #66405.	2023-09-15 14:48:02 -07:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Nick Desaulniers	86735a4353	reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66264) reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) This reverts commit `ee643b706b`. Fix up build failures in targets I missed in #66003 Kept as 3 commits for reviewers to see better what's changed. Will squash when merging. - reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) - fix all the targets I missed in #66003 - fix off by one found by llvm/test/CodeGen/SystemZ/inline-asm-addr.ll	2023-09-13 13:31:24 -07:00
Philip Reames	61757fbd04	[DAG] Remove pointless peephole from refineUniformBase [nfc] No need to special case add 0, N. SelectionDAG::getNode contains the canonicalization and simplification for this case, so no need to duplicate it here.	2023-09-13 10:16:11 -07:00
Reid Kleckner	ee643b706b	Revert "[InlineAsm] wrap ConstraintCode in enum class NFC (#66003)" This reverts commit `2ca4d13612`. Also revert the followup, "[InlineAsm] fix botched merge conflict resolution" This reverts commit `8b9bf3a9f7`. There were SystemZ and Mips build errors, too many to fix forward.	2023-09-13 09:58:02 -07:00
Philip Reames	2f005df066	[DAG][X86] Fold mgather/mscatter/etc with splat index (#65980) A splat index means the operation is reading from (writing to) the same memory location. Generally, zero is the cheapest value to splat. As such, we'd prefer to add the splatted value to the base, and use a constant zero as the index operand.	2023-09-13 09:26:30 -07:00
Nick Desaulniers	2ca4d13612	[InlineAsm] wrap ConstraintCode in enum class NFC (#66003) Similar to commit `2fad6e6985` ("[InlineAsm] wrap Kind in enum class NFC") Fix the TODOs added in commit `93bd428742` ("[InlineAsm] refactor InlineAsm class NFC (#65649)")	2023-09-13 08:48:09 -07:00
Simon Pilgrim	e6b85c3027	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case (REAPPLIED) Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Reapplied after reversion at `e1e3c75c7d` with a tweak to the pseudo-probe-peep.ll test Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 12:33:39 +01:00
Simon Pilgrim	e1e3c75c7d	Revert rG6c56cf71ee82ec3a28e0dfc2b751bd10c16929da "[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case" Need to address a missed test change	2023-09-13 11:27:47 +01:00
Simon Pilgrim	6c56cf71ee	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 11:01:58 +01:00
Yingwei Zheng	4793c2c3de	[DAGCombiner][RISCV] Prefer to sext i32 non-negative values (#65984) By default, `DAGCombiner` folds `sext x` to `zext x` when `x` is non-negative. It will generate redundant `zext` inst seq on riscv64 (typically `slli (srli x, 32), 32`). godbolt: https://godbolt.org/z/osf6adP1o This patch applies the transform iff `zext` is cheaper than `sext`.	2023-09-12 19:02:35 +08:00
Nick Desaulniers	93bd428742	[InlineAsm] refactor InlineAsm class NFC (#65649) I would like to steal one of these bits to denote whether a kind may be spilled by the register allocator or not, but I'm afraid to touch any of this code using bitwise operands. Make flags a first class type using bitfields, rather than launder data around via `unsigned`.	2023-09-11 09:27:37 -07:00
Mohamed Atef	741c127817	[SelectionDAG] Add computeOverflowForSignedMul / computeOverflowForUnsignedMul overflow handlers Support signed multiplication Support unsigned multiplication Differential Revision: https://reviews.llvm.org/D159406	2023-09-07 10:03:18 +01:00
Simon Pilgrim	84447c044f	[DAG] Add SelectionDAG::isADDLike helper. NFC. Make the DAGCombine helper global so we can more easily reuse it.	2023-09-06 16:54:25 +01:00
Simon Pilgrim	e4d0e12099	[DAG] Fold (shl (sext (add_nsw x, c1)), c2) -> (add (shl (sext x), c2), c1 << c2) (REAPPLIED) Assuming the ADD is nsw then it may be sign-extended to merge with a SHL op in a similar fold to the existing (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) fold. This is most useful for helping to expose address math for X86, but has also touched several aarch64 test cases as well. Alive2: https://alive2.llvm.org/ce/z/2UpSbJ Differential Revision: https://reviews.llvm.org/D159198	2023-09-06 13:19:42 +01:00
Dmitri Gribenko	97bf104d97	Revert "[DAG] Fold (shl (sext (add_nsw x, c1)), c2) -> (add (shl (sext x), c2), c1 << c2)" This reverts commit `b027ce0ab9`. This commit breaks Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll.	2023-09-06 11:28:55 +02:00
Simon Pilgrim	b027ce0ab9	[DAG] Fold (shl (sext (add_nsw x, c1)), c2) -> (add (shl (sext x), c2), c1 << c2) Assuming the ADD is nsw then it may be sign-extended to merge with a SHL op in a similar fold to the existing (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) fold. This is most useful for helping to expose address math for X86, but has also touched several aarch64 test cases as well. Alive2: https://alive2.llvm.org/ce/z/2UpSbJ Differential Revision: https://reviews.llvm.org/D159198	2023-09-06 10:06:21 +01:00
Ting Wang	71be020dda	[SelectionDAG][PowerPC] Memset reuse vector element for tail store On PPC there are instructions to store element from vector(e.g. stxsdx/stxsiwx), and these instructions can be leveraged to avoid tail constant in memset and constant splat array initialization. This patch tries to explore these opportunities. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D138883	2023-09-06 01:52:38 -04:00
Simon Pilgrim	5463503ae1	[DAG] Move scalar BITCAST constant folds from getNode to FoldConstantArithmetic	2023-09-05 13:11:20 +01:00
David Sherwood	50598f0ff4	[DAGCombiner][SVE] Add support for illegal extending masked loads In some cases where the same mask is used for multiple extending masked loads it can be more efficient to combine the zero- or sign-extend into the load even if it's not a legal or custom operation. This leads to splitting up the extending load into smaller parts, which also requires splitting the mask. For SVE at least this improves the performance of the SPEC benchmark x264 slightly on neoverse-v1 (~0.3%), and at least one other benchmark improves by around 30%. The uplift for SVE seems due to removing the dependencies (vector unpacks) introduced between the loads and the vector operations, since this should increase the level of parallelism. See tests: CodeGen/AArch64/sve-masked-ldst-sext.ll CodeGen/AArch64/sve-masked-ldst-zext.ll https://reviews.llvm.org/D159191	2023-09-05 10:41:21 +00:00
Kazu Hirata	5fb990ac51	[SelectionDAG] Use isNullConstant (NFC)	2023-09-02 09:32:43 -07:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
Matt Arsenault	b14e83d1a4	IR: Add llvm.exp10 intrinsic We currently have log, log2, log10, exp and exp2 intrinsics. Add exp10 to fix this asymmetry. AMDGPU already has most of the code for f32 exp10 expansion implemented alongside exp, so the current implementation is duplicating nearly identical effort between the compiler and library which is inconvenient. https://reviews.llvm.org/D157871	2023-09-01 19:45:03 -04:00
Philip Reames	685e1909e9	[LegalizeDAG] Use scalable aware idiom for checking for single element vector NFC for fixed vectors (all that reaches here currently), and future proofing for scalable vectors.	2023-09-01 11:56:03 -07:00
Simon Pilgrim	15b561ed38	[DAG] Move STEP_VECTOR constant fold from getNode to FoldConstantArithmetic	2023-09-01 15:47:37 +01:00
Simon Pilgrim	1d47d5d67c	[DAG] Move F16<->FP constant folds from getNode to FoldConstantArithmetic	2023-09-01 15:47:36 +01:00
Simon Pilgrim	4b9c2cf0a7	[DAG] Move INT<->FP constant folds from getNode to FoldConstantArithmetic	2023-09-01 14:02:02 +01:00
Simon Pilgrim	2a81396b1b	[DAG] SimplifyDemandedBits - add SMIN/SMAX KnownBits comparison analysis Followup to D158364 Also, final fix for Issue #59902 which noted that the snippet should just return 1	2023-09-01 12:42:30 +01:00
Simon Pilgrim	aca8b9d0d5	[DAG] SimplifyDemandedBits - if we're only demanding the signbits, a MIN/MAX node can be simplified to a OR or AND node Extension to the signbit case, if the signbits extend down through all the demanded bits then SMIN/SMAX/UMIN/UMAX nodes can be simplified to a OR/AND/AND/OR. Alive2: https://alive2.llvm.org/ce/z/mFVFAn (general case) Differential Revision: https://reviews.llvm.org/D158364	2023-09-01 10:56:32 +01:00
Matt Arsenault	ad9d13d535	SelectionDAG: Swap operands of atomic_store Irritatingly, atomic_store had operands in the opposite order from regular store. This made it difficult to share patterns between regular and atomic stores. There was a previous incomplete attempt to move atomic_store into the regular StoreSDNode which would be better. I think it was a mistake for all atomicrmw to swap the operand order, so maybe it's better to take this one step further. https://reviews.llvm.org/D123143	2023-08-31 17:30:10 -04:00
Daniel Paoliello	0c5c7b52f0	Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table. It contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what a jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: `0fe89a942f/cvdump/dumpsym7.cpp (L5518)` This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367	2023-08-31 12:06:50 -07:00
Konstantina Mitropoulou	17fc78e7a4	[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns with floating points. This reverts commit `48fa79a503`. Reviewed By: brooksmoses Differential Revision: https://reviews.llvm.org/D159240	2023-08-31 11:36:50 -07:00
Nick Desaulniers	2fad6e6985	[InlineAsm] wrap Kind in enum class NFC Should add some minor type safety to the use of this information, since there's quite a bit of metadata being laundered through an `unsigned`. I'm looking to potentially add more bitfields to that `unsigned`, but I find InlineAsm's big ol' bag of enum values and usage of `unsigned` confusing, type-unsafe, and un-ergonomic. These can probably be better abstracted. I think the lack of static_cast outside of InlineAsm indicates the prior code smell fixed here. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D159242	2023-08-31 08:54:51 -07:00
Luke Lau	3a4ad45a2c	[DAGCombiner] Combine trunc (splat_vector x) -> splat_vector (trunc x) From the discussion in https://reviews.llvm.org/D158853, moving the truncate into the splat helps more splatted scalar operands get selected on RISC-V, and also avoids the need for splat_vector_parts on RV32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D159147	2023-08-30 15:22:57 +01:00
Simon Pilgrim	376050db9f	[DAG] Move some unary constant folds from getNode() to FoldConstantArithmetic() We need to clean up some type handling before the remainder (int<->fp and bitcasts) can be moved over.	2023-08-30 13:59:28 +01:00
Simon Pilgrim	d037445f3a	[DAG] visitSHL - use FoldConstantArithmetic to fold constants in (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) fold Matches what we do in the (shl (mul x, c1), c2) -> (mul x, c1 << c2) fold as well as inside visitShiftByConstant	2023-08-29 18:52:24 +01:00
Craig Topper	299b1b4071	[SelectionDAG][RISCV] Teach getConstant to use SPLAT_VECTOR_PARTS if vXi64 SPLAT_VECTOR is legal but i64 scalars are not. That matches how such a SPLAT_VECTOR would have been type legalized so assume it is ok to use for creating constants after type legalization. Still need some improvements to SPLAT_VECTOR lowering. This overlaps with some of what D158742 was trying to fix. Reviewed By: luke Differential Revision: https://reviews.llvm.org/D158870	2023-08-29 09:22:17 -07:00
Luke Lau	8f1d1e2b61	[SDAG] Add computeKnownBits support for ISD::SPLAT_VECTOR_PARTS We can work out the known bits for a given lane by concatenating the known bits of each scalar operand. In the description of ISD::SPLAT_VECTOR_PARTS in ISDOpcodes.h it says that the total size of the scalar operands must cover the output element size, but I've added a stricter assertion here that the total width of the scalar operands must be exactly equal to the element size. It doesn't seem to trigger, and I'm not sure if there are any targets that use SPLAT_VECTOR_PARTS for anything other than v4i32 -> v2i64 splats. We also need to include it in isTargetCanonicalConstantNode, otherwise returning the known bits introduces an infinite combine loop. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158852	2023-08-28 10:35:58 +01:00
Luke Lau	6e4860f5d0	[SDAG] Add SimplifyDemandedBits support for ISD::SPLAT_VECTOR This improves some cases where a splat_vector uses a build_pair that can be simplified, e.g: (rotl x:i64, splat_vector (build_pair x1:i32, x2:i32)) rotl only demands the bottom 6 bits, so this patch allows it to simplify it to: (rotl x:i64, splat_vector (build_pair x1:i32, undef:i32)) Which in turn improves some cases where a splat_vector_parts is lowered on RV32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D158839	2023-08-28 10:35:56 +01:00
Arthur Eubanks	0a4fc4ac1c	Revert "Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables" This reverts commit `8d0c3db388`. Causes crashes, see comments in https://reviews.llvm.org/D149367. Some follow-up fixes are also reverted: This reverts commit `636269f4fc`. This reverts commit `5966079cf4`. This reverts commit `e7294dbc85`.	2023-08-25 18:34:15 -07:00
Danila Malyutin	e036ba50a7	[StatepointLowering] Fix possible nullptr access in debug output Differential Revision: https://reviews.llvm.org/D158866	2023-08-25 22:56:17 +03:00
Daniel Paoliello	8d0c3db388	Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table. It contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what a jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: `0fe89a942f/cvdump/dumpsym7.cpp (L5518)` This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367	2023-08-25 10:19:17 -07:00
LiaoChunyu	1b12427c01	[VP][RISCV] Add vp.is.fpclass and RISC-V support There is no vp.fpclass after FCLASS_VL(D151176), try to support vp.fpclass. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D152993	2023-08-25 15:40:55 +08:00
Konstantina Mitropoulou	48fa79a503	Revert "[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns with floating points." This reverts commit `5ec1353523`.	2023-08-24 20:39:04 -07:00
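
Several of the commits above describe algebraic folds that can be sanity-checked outside the compiler. As an illustration only, here is a minimal sketch that brute-forces the two folds from commit `caaf61eb6e`, `(saddo (not a), 1)` to `(ssubo 0, a)` and `(saddo_carry (not a), b, c)` to `(ssubo_carry b, a, !c)`, over all 8-bit inputs. The helper names (`OvResult`, `addWithCarry8`, `subWithBorrow8`, `countFoldMismatches`) are hypothetical and are not LLVM APIs; the 8-bit model assumes the usual semantics of a signed add/subtract with a carry-in/borrow-in and an overflow flag. The Alive2 link in the commit remains the actual proof.

```cpp
#include <cstdint>

// Hypothetical 8-bit model of an overflow-reporting node: the wrapped
// result plus a flag that is set when the exact result does not fit in i8.
struct OvResult {
    int8_t value;
    bool overflow;
    bool operator==(const OvResult &o) const {
        return value == o.value && overflow == o.overflow;
    }
};

// Wrap an exact (wide) result to 8 bits and record signed overflow.
static OvResult wrap8(int wide) {
    return {static_cast<int8_t>(static_cast<uint8_t>(wide)),
            wide < -128 || wide > 127};
}

// Models saddo_carry(a, b, carry); saddo(a, b) is the carry == false case.
static OvResult addWithCarry8(int8_t a, int8_t b, bool carry) {
    return wrap8(int(a) + int(b) + (carry ? 1 : 0));
}

// Models ssubo_carry(a, b, borrow); ssubo(a, b) is the borrow == false case.
static OvResult subWithBorrow8(int8_t a, int8_t b, bool borrow) {
    return wrap8(int(a) - int(b) - (borrow ? 1 : 0));
}

// Counts the 8-bit inputs on which either fold changes the value or the flag.
int countFoldMismatches() {
    int mismatches = 0;
    for (int i = -128; i <= 127; ++i) {
        int8_t a = static_cast<int8_t>(i);
        int8_t notA = static_cast<int8_t>(~i);

        // (saddo (not a), 1) -> (ssubo 0, a)
        if (!(addWithCarry8(notA, 1, false) == subWithBorrow8(0, a, false)))
            ++mismatches;

        // (saddo_carry (not a), b, c) -> (ssubo_carry b, a, !c)
        for (int j = -128; j <= 127; ++j) {
            int8_t b = static_cast<int8_t>(j);
            for (int c = 0; c <= 1; ++c) {
                if (!(addWithCarry8(notA, b, c != 0) ==
                      subWithBorrow8(b, a, c == 0)))
                    ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Calling `countFoldMismatches()` from a small `main` should report zero mismatches, because `~a` equals `-a - 1` in two's complement, so both sides compute the same exact value before wrapping. An exhaustive 8-bit check like this is only a quick plausibility test and says nothing about wider types on its own.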


13080 Commits