clang-p2996

Author	SHA1	Message	Date
Jianjian Guan	fd50151180	[RISCV] Only support SPLAT_VECTOR for Zvfhmin when also enable the scalar extension of half fp (#88275 )	2024-04-11 10:23:26 +08:00
Craig Topper	f27f369710	[RISCV] Remove interrupt handler special case from RISCVFrameLowering::determineCalleeSaves. (#88069 ) This code was trying to save temporary argument registers in interrupt handler functions that contain calls. With the exception that all FP registers are saved including the normally callee saved registers. If all of the callees use an FP ABI and the interrupt handler doesn't touch the normally callee saved FP registers, we don't need to save them. It doesn't appear that we need to special case functions with calls. The normal callee saved register handling will already check each of the calls and consider a register clobbered if the call doesn't explicitly say it is preserved. All of the test changes are from the removal of the FP callee saved registers. There are tests for interrupt handlers with F and D extension that use ilp32 or lp64 ABIs that are not affected by this change. They still save the FP callee saved registers as they should. gcc appears to have a bug where the D extension being enabled with the ilp32f or lp64f ABI does not save the FP callee saved regs. The callee would only save/restore the lower 32 bits and clobber the upper bits. LLVM saves the FP callee saved regs in this case and there is an unchanged test for it. The unnecessary save/restore was raised in this thread https://discourse.llvm.org/t/has-bugs-when-optimizing-save-restore-csrs-by-changing-csr-xlen-f32-interrupt/78200/1	2024-04-10 10:28:54 -07:00
Craig Topper	323d3ab257	[RISCV] Optimize undef Even vector in getWideningInterleave. (#88221 ) We recently optimized the code when the Odd vector was undef to fix a poison bug. There are additional optimizations we can do if the even vector is undef. With Zvbb, we can use a single vwsll. Without Zvbb, we can use a vzext.vf2 and a vsll.	2024-04-10 09:08:50 -07:00
Craig Topper	7f1b9adfc8	[RISCV] Add MachineCombiner to fold (sh3add Z, (add X, (slli Y, 6))) -> (sh3add (sh3add Y, Z), X). (#87884 ) This improves a pattern that occurs in 531.deepsjeng_r, reducing the dynamic instruction count by 0.5%. This may be possible to improve in SelectionDAG, but given the special cases around shXadd formation, it's not obvious it can be done in a robust way without adding multiple special cases. I've used a GEP with 2 indices because that most closely resembles the motivating case. Most of the test cases are the simplest GEP case. One test has a logical right shift on an index which is closer to the deepsjeng code. This requires special handling in isel to reverse a DAGCombiner canonicalization that turns a pair of shifts into (srl (and X, C1), C2).	2024-04-10 08:39:56 -07:00
Chia	469caa31e7	[RISCV] Use vwadd.vx for splat vector with extension (#87249 ) This patch allows `combineBinOp_VLToVWBinOp_VL` to handle patterns like `(splat_vector (sext op))` or `(splat_vector (zext op))`. Then we can use `vwadd.vx` and `vwadd.w` for such a case. ### Source code ``` define <vscale x 8 x i64> @vwadd_vx_splat_sext(<vscale x 8 x i32> %va, i32 %b) { %sb = sext i32 %b to i64 %head = insertelement <vscale x 8 x i64> poison, i64 %sb, i32 0 %splat = shufflevector <vscale x 8 x i64> %head, <vscale x 8 x i64> poison, <vscale x 8 x i32> zeroinitializer %vc = sext <vscale x 8 x i32> %va to <vscale x 8 x i64> %ve = add <vscale x 8 x i64> %vc, %splat ret <vscale x 8 x i64> %ve } ``` ### Before this patch [Compiler Explorer](https://godbolt.org/z/sq191PsT4) ``` vwadd_vx_splat_sext: sext.w a0, a0 vsetvli a1, zero, e64, m8, ta, ma vmv.v.x v16, a0 vsetvli zero, zero, e32, m4, ta, ma vwadd.wv v16, v16, v8 vmv8r.v v8, v16 ret ``` ### After this patch ``` vwadd_vx_splat_sext vsetvli a1, zero, e32, m4, ta, ma vwadd.vx v16, v8, a0 vmv8r.v v8, v16 ret ```	2024-04-10 15:26:17 +09:00
Philip Reames	e47fd09f8e	[RISCV] Use shNadd for scalable stack offsets (#88062 ) If we need to multiply VLENB by 2, 4, or 8 and add it to the stack pointer, we can do so with a shNadd instead of separate shift and add instructions.	2024-04-09 07:29:10 -07:00
Luke Lau	24e8c6a09b	[RISCV] Convert remaining constant splats in tests to use splat shorthand. NFC (#88099 ) This follows on from #87616, but includes the tests with codegen differences. These are presumably due to the fact that the splat is now a constant expression. They don't seem to affect anything that we were specifically testing for.	2024-04-09 17:15:15 +08:00
Luke Lau	9c660362c4	[RISCV] Support vwsll in combineBinOp_VLToVWBinOp_VL (#87620 ) If the subtarget has +zvbb then we can attempt folding shl and shl_vl to vwsll nodes. There are a few test cases where we still don't pick up the vwsll: - For fixed vector vwsll.vi on RV32, see the FIXME for VMV_V_X_VL in fillUpExtensionSupport for supporting implicit sign extension - For scalable vector vwsll.vi we need to support ISD::SPLAT_VECTOR, see #87249	2024-04-09 16:10:35 +08:00
Luke Lau	0f20b9b92f	[RISCV] Don't require mask or VL to be the same in combineBinOp_VLToVWBinOp_VL (#87997 ) In NodeExtensionHelper we keep track of the VL and mask of the operand being extended and check that they are the same as the root node's. However for the nodes that we support, none of them have a passthru operand with the exception of RISCV::VMV_V_X_VL, but we check that its passthru is undef anyway. So it's safe to just discard the extend node's VL and mask and just use the root's instead. (This is the same type of reasoning we use to treat any vmset_vl as an all ones mask) This allows us to match some more cases where we mix VP/non-VP/VL nodes, but these don't seem to appear in practice. The main benefit from this would be to simplify the code.	2024-04-09 16:04:10 +08:00
Luke Lau	d8d131dfa9	[RISCV] Convert more constant splats in tests to splat shorthand. NFC (#87616 ) A handy shorthand for specifying the shufflevector(insertelement(poison, foo, 0), poison, zeroinitializer) splat pattern was introduced in #74620. Some of the RISC-V tests were converted over to use this new form in `dbb65dd330`, this patch handles the rest which didn't have any codegen diffs. This not only converts some constant expressions to the new form, but also instruction sequences that weren't previously constant expressions to constant expressions as well. In some cases this affects codegen, but these have been omitted here and will be handled in a separate PR.	2024-04-09 15:46:38 +08:00
Craig Topper	4e98adf677	[RISCV] Add tests for F/D with non-FP ABI to interrupt-attr.ll. NFC Without a floating point aware ABI for callees, an interrupt handler needs to save all floating point registers even normally callee saved. We are currently unnecessarily saving callee saved FP registers when a floating point ABI is used by the callee. This is different than gcc as noted in this discourse post https://discourse.llvm.org/t/has-bugs-when-optimizing-save-restore-csrs-by-changing-csr-xlen-f32-interrupt/78200/1	2024-04-08 16:12:36 -07:00
Craig Topper	472ea6e015	[RISCV] Resolve CHECK prefix conflict in fixed-vectors-vitofp-constrained-sdnode.ll. NFC	2024-04-08 16:01:18 -07:00
Craig Topper	afc7cc7b12	[RISCV] Fix missing CHECK prefixes in vector lrint test files. NFC All of these test cases had iXLen in their name which got replaced by sed. This prevented FileCheck from finding the function. The other test cases in these files do not have that issue.	2024-04-08 16:01:18 -07:00
Craig Topper	89ebb56152	[RISCV] Resolve CHECK prefix conflict in fixed-vectors-vwsll.ll. NFC riscv32 and riscv64 generate different code for one test case so we need RV32 and RV64 CHECK lines.	2024-04-08 15:45:07 -07:00
Philip Reames	eb26edbbf8	[RISCV] Exploit sh3add/sh2add for stack offsets by shifted 12-bit constants (#87950 ) If we're falling back to generic constant formation in a register + add/sub, we can check if we have a constant which is 12-bits but left shifted by 2 or 3. If so, we can use a sh2add or sh3add to perform the shift and add in a single instruction. This is profitable when the unshifted constant would require two instructions (LUI/ADDI) to form, but is never harmful since we're going to need at least two instructions regardless of the constant value. Since stacks are aligned to 16 bytes by default, sh3add allows addressing (aligned) data out to 2^14 (i.e. 16kb) in at most two instructions w/zba.	2024-04-08 14:53:21 -07:00
Philip Reames	f5cf98c026	[RISCV] Improve test coverage for #87950 Noticed in review that we want both the LUI and LUI/ADDI cases with different behavior for each.	2024-04-08 14:39:37 -07:00
Pengcheng Wang	364028a1a5	[RISCV] Zimop/Zcmop are ratified Remove them from experimental. See also: https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc Reviewers: kito-cheng Reviewed By: kito-cheng Pull Request: https://github.com/llvm/llvm-project/pull/87966	2024-04-08 16:40:02 +08:00
David Green	ac321cbb03	[AArch64][GlobalISel] Legalize Insert vector element (#81453 ) This attempts to standardize and extend some of the insert vector element lowering. Most notably: - More types are handled by splitting illegal vectors. - The index type for G_INSERT_VECTOR_ELT is canonicalized to TLI.getVectorIdxTy(), similar to extract_vector_element. - Some of the existing patterns now have the index type specified to make sure they can apply to GISel too. - The C++ selection code has been removed, relying on tablegen patterns. - G_INSERT_VECTOR_ELT with small GPR input elements are pre-selected to use a i32 type, allowing the existing patterns to apply. - Variable index inserts are lowered in post-legalizer lowering, expanding into a stack store and reload.	2024-04-08 08:44:13 +01:00
Pengcheng Wang	f3b5597364	[RISCV] Use larger copies when register tuples are aligned When the encoding of register tuples are aligned, we can use a copy with larger LMUL to reduce copies. Reviewers: preames, topperc, lukel97 Reviewed By: topperc, lukel97 Pull Request: https://github.com/llvm/llvm-project/pull/84455	2024-04-08 13:24:57 +08:00
Philip Reames	da675b922c	[RISCV] Expand test coverage of stack offsets between 2^11 and 2^15 Adds two sets of tests. First, one for prolog/epilogue insertions where the second stack adjustment can be done with shNadd for zba. Second, a set of tests with offsets off SP in the same ranges, but also adding varying alignments.	2024-04-07 15:22:25 -07:00
Jianjian Guan	bc8726b16b	[RISCV] Support codegen of vfmv.v.f for bfloat vector with both Zvfbfmin and Zfbfmin (#87318 ) vfmv, vfmerge should support bfloat vector when we have both Zvfbfmin and Zfbfmin, this patch tries to support vfmv first.	2024-04-07 10:41:47 +08:00
Craig Topper	4abb722ffa	[RISCV] Add tests for opportunities to reassociate to form more shXadd instructions. NFC These tests consist of patterns like (sh3add Z, (add X, (slli Y, 6))) that can be reassociated to form (sh3add (sh3add Y, Z), X).	2024-04-05 12:50:48 -07:00
Craig Topper	0a6a40d62e	[RISCV] Add Zca predicate to BrccCompressOpt patterns used for MinSize. Previously we only checked for C.	2024-04-05 12:39:39 -07:00
Craig Topper	e7e78274a6	[RISCV] Remove uses of sed from compress-opt-branch.ll. NFC sed was being used to use the same test functions with eq/ne branch condition. This commit duplicates the test functions so that we have a version with each condition. This allows us to remove 2 RUN lines. I plan to add a Zca testing to this file which now requires 1 new RUN line instead of 2.	2024-04-05 12:35:46 -07:00
Craig Topper	3c37f926a1	[RISCV] Fix comment in compress-opt-branch.ll to match description. NFC Test description says constant does not fit in 12 bits, but the constant used was -2048 which does fit in 12 bits. Update to -2049. Also remove uses of -NOT in favor of positive checks. One of the -NOT should have been using RESBROPT instead of "c.beqz" so that it would check for the absence of the correct instruction based on the sed replacement on the RUN line.	2024-04-05 11:52:46 -07:00
Luke Lau	4e0b8eae4c	[RISCV] Add tests for vwsll for extends > .vf2. NFC These cannot be picked up by TableGen patterns alone and need to be handled by combineBinOp_VLToVWBinOp_VL	2024-04-04 18:43:15 +08:00
Luke Lau	3a7b5223a6	[DAGCombiner][RISCV] Handle truncating splats in isNeutralConstant (#87338 ) On RV64, we legalize zexts of i1s to (vselect m, (splat_vector i64 1), (splat_vector i64 0)), where the splat_vectors are implicitly truncating. When the vselect is used by a binop we want to pull the vselect out via foldSelectWithIdentityConstant. But because vectors with an element size < i64 will truncate, isNeutralConstant will return false. This patch handles truncating splats by getting the APInt value and truncating it. We almost don't need to do this since most of the neutral elements are either one/zero/all ones, but it will make a difference for smax and smin. I wasn't able to figure out a way to write the tests in terms of select, since we need the i1 zext legalization to create a truncating splat_vector. This supersedes #87236. Fixed vectors are unfortunately not handled by this patch (since they get legalized to _VL nodes), but they don't seem to appear in the wild.	2024-04-04 12:36:15 +08:00
Luke Lau	07d5f49186	[RISCV] Add patterns for fixed vector vwsll (#87316 ) Fixed vectors have their sext/zext operands legalized to _VL nodes, so we need to handle them in the patterns. This adds a riscv_ext_vl_oneuse pattern since we don't care about the type of extension used for the shift amount, and extends Low8BitsSplatPat to handle other _VL nodes. We don't actually need to check the mask or VL there since none of the _VL nodes have passthru operands. The remaining test cases that are widening from i8->i64 need to be handled by extending combineBinOp_VLToVWBinOp_VL. This also fixes Low8BitsSplatPat incorrectly checking the vector size instead of the element size to determine if the splat value might have been truncated below 8 bits.	2024-04-04 11:30:23 +08:00
Michael Maitland	63c925ca80	[RISCV][GISEL] Instruction selection for G_ZEXT, G_SEXT, and G_ANYEXT with scalable vector type	2024-04-03 15:56:08 -07:00
Michael Maitland	188ca374ee	[RISCV][GISEL] Regbankselect for G_ZEXT, G_SEXT, and G_ANYEXT with scalable vector type	2024-04-03 15:56:04 -07:00
Michael Maitland	35a9393a3f	[RISCV][GISEL] Instruction selection for G_ICMP	2024-04-03 15:47:34 -07:00
Michael Maitland	05f673bcef	[RISCV][GISEL] Regbank select for scalable vector G_ICMP	2024-04-03 15:47:34 -07:00
Michael Maitland	8aa3a77eaf	[RISCV][GISEL] Legalize G_ZEXT, G_SEXT, and G_ANYEXT, G_SPLAT_VECTOR, and G_ICMP for scalable vector types This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a legal mask type, then the instruction is legalized as the element-wise select, where the condition on the select is the mask typed source operand, and the true and false values are 1 or -1 (for zero/any-extension and sign extension) and zero. If the type is a legal integer or vector integer type, then the instruction is marked as legal. The legalization of the extends may introduce a G_SPLAT_VECTOR, which needs to be legalized in this patch for the extend test cases to pass. A G_SPLAT_VECTOR is legal if the vector type is a legal integer or floating point vector type and the source operand is sXLen type. This is because the SelectionDAG patterns only support sXLen typed ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A G_SPLAT_VECTOR is custom legalized if it has a legal s1 element vector type and s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL if the splat is all ones or all zeros respectively. In the case of a non-constant mask splat, we legalize by promoting the scalar value to s8. In order to get the s8 element vector back into s1 vector, we use a G_ICMP. In order for the splat vector and extend tests to pass, we also need to legalize G_ICMP in this patch. A G_ICMP is legal if the destination type is a legal bool vector and the LHS and RHS are legal integer vector types.	2024-04-03 15:27:15 -07:00
Michael Maitland	07d3f2a8de	[RISCV][GISEL] Run update_mir_test_checks on llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-xor.mir	2024-04-03 10:37:44 -07:00
AinsleySnow	52b18430ae	[VP][DAGCombine] Use `simplifySelect` when combining vp.select. (#87342 ) Hi all, This patch is a follow-up of #79101. It migrates logic from `visitVSELECT` to `visitVP_SELECT` to simplify `vp.select`. With this patch we can do the following combinations: ``` vp.select undef, T, F --> T (if T is a constant), F otherwise vp.select <condition>, undef, F --> F vp.select <condition>, T, undef --> T vp.select false, T, F --> F vp.select <condition>, T, T --> T ``` I'm a total newbie to llvm and I'm sure there's room for improvements in this patch. Please let me know if you have any advice. Thank you in advance!	2024-04-03 07:45:50 -04:00
Craig Topper	a9af66a90e	[RISCV] Lower (vector_interleave X, undef) to (vzext_vl X). (#87283 ) If the odd vector is undef or poison, the widening add and multiply trick doesn't work unless we freeze the odd vector. Unfortunately, freezing doesn't work when the operand is provably undef/poison. MIR doesn't have a representation for freeze so it just becomes a COPY from IMPLICIT_DEF which freely propagates undef to each operand independently. To work around this, check for undef explicitly and lower to a VZEXT_VL of the even vector. This produces better code than we'd get from a freeze anyway. I've left a FIXME for adding a freeze. I'll do that as a separate patch as it affects other tests and doesn't help with the new test.	2024-04-02 11:58:41 -07:00
Craig Topper	8c1dc5dd58	[RISCV] Add test for miscompile of vector.interleave when odd vector is literal poison. The interleave lowering relies on a math trick that requires passing the odd vector to two math instructions. In order to be correct these instructions must see the same value. If the odd vector is provably poison or undef, SelectionDAG will create a vwadd and vwmaccu where the operand is a copy from IMPLICIT_DEF. Later this will become just the undef flag on the operand. This gives the register allocator freedom to pick a different register for each instruction.	2024-04-02 11:49:08 -07:00
Michael Maitland	153b8431bb	[RISCV][GISEL] Legalize G_BITCAST for scalable vectors (#85970 ) SelectionDAG marks ISD::BITCAST as legal between scalable vector types and ISelDAGToDAG deletes them. We mark G_BITCAST between scalable vectors as legal in GISel. A future patch will handle what to do with them after the legalizer (likely either drop them in an isel-preprocess or convert them to COPYs). BITCAST is needed for legalization of G_INSERT and G_EXTRACT. This is a precommit for legalization of G_INSERT and G_EXTRACT.	2024-04-02 12:30:51 -04:00
Luke Lau	59dd10faf8	[RISCV] Add tests for fixed vector vwsll. NFC We are missing patterns for fixed vectors, where the sexts and zexts are legalized to _vl nodes.	2024-04-02 13:02:03 +08:00
Vitaly Buka	20f56e1f8e	[CodeGen] Add default lowering for llvm.allow.{runtime,ubsan}.check() (#86049 ) RFC: https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641	2024-03-31 22:19:33 -07:00
Brandon Wu	29e8bfc13c	[RISCV] RISCV vector calling convention (2/2) (#79096 ) This commit handles vector arguments/return for function definition/call, the new class RVVArgDispatcher is added for doing all vector register assignment including mask types, data types as well as tuple types. It precomputes the register number for each argument as per https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#standard-vector-calling-convention-variant and it's passed to calling convention function to handle all vector arguments. Depends on: #78550	2024-03-30 21:05:33 +08:00
Shilei Tian	3a106e5b2c	[GlobalISel] Fold G_ICMP if possible (#86357 ) This patch tries to fold `G_ICMP` if possible.	2024-03-29 15:59:50 -04:00
Luke Lau	3f69d90351	[RISCV] Add missing RISCVMaskedPseudo for TIED pseudos (#86787 ) This was preventing us from folding away the vmerge into its mask.	2024-03-29 22:21:22 +08:00
Luke Lau	76ba3c8e64	[RISCV] Add test case for vmerge fold for tied pseudos with rounding mode. NFC	2024-03-29 19:47:09 +08:00
Luke Lau	2a315d800b	[RISCV] Combine (or disjoint ext, ext) -> vwadd (#86929 ) DAGCombiner (or InstCombine) will convert an add to an or if the bits are disjoint, which can prevent what was originally an (add {s,z}ext, {s,z}ext) from being selected as a vwadd. This teaches combineBinOp_VLToVWBinOp_VL to recover it by treating it as an add.	2024-03-29 19:45:24 +08:00
Luke Lau	131be5de90	[RISCV] Add more disjoint or tests for vwadd[u].{w,v}v. NFC	2024-03-29 19:11:26 +08:00
Wang Pengcheng	610b9e23c5	[SDAG] Use shifts if ISD::MUL is illegal when lowering ISD::CTPOP (#86505 ) We can avoid libcalls. Fixes #86205	2024-03-29 15:38:39 +08:00
Sudharsan Veeravalli	e005a09df5	[RISCV][TypePromotion] Dont generate truncs if PromotedType is greater than Source Type (#86941 ) We currently check if the source and promoted types are not equal before generating truncate instructions. This does not work for RV64 where the promoted type is i64, and this led to a crash due to the generation of truncate instructions from i32 to i64. Fixes #86400	2024-03-28 21:22:05 -07:00
Philip Reames	9ea0396f16	[RISCV] Extend pattern matches involving shNadd to support disjoint or (#87001 ) I tried to add representative tests while not duplicating complete coverage. If there's other tests you'd like to see, let me know.	2024-03-28 16:34:04 -07:00
Luke Lau	a3c2d8c072	[RISCV] Combine ({s,u}{div,rem} (zext, zext)) -> (zext ({s,u}{div,rem} (zext, zext))) (#86779 ) This narrows unsigned and signed div and rem nodes via combineBinOpOfZExt. Unlike other binary ops, there are no widening div or rem instructions. So we will end up with an extra vzext.vf2. However I'm assuming that div/rem are expensive enough that by reducing their EMUL we will gain back the cost. Alive2 proof: https://alive2.llvm.org/ce/z/Et_L6y	2024-03-29 05:55:38 +08:00
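
The div/rem narrowing in a3c2d8c072 can be sanity-checked with a small brute-force model. The sketch below is not LLVM code: the helper names (`to_signed`, `apply_op`, `narrowing_is_sound`) and the chosen widths (8-bit sources, a 16-bit narrow type, a 32-bit wide type) are assumptions made for illustration. It compares each operation done at the wide width against the same operation done at the narrow width, and skips division by zero, which is undefined at the IR level.

```python
def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def trunc_div(a, b):
    """Integer division that rounds toward zero, as sdiv/udiv do."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def apply_op(name, a, b, bits):
    """Apply udiv/sdiv/urem/srem to two bit patterns of width `bits`."""
    if name.startswith("s"):
        a, b = to_signed(a, bits), to_signed(b, bits)
    q = trunc_div(a, b)
    result = q if name.endswith("div") else a - b * q
    return result & ((1 << bits) - 1)


def narrowing_is_sound(name, src_bits=8, narrow_bits=16, wide_bits=32):
    """Check op(zext a, zext b) at wide_bits == zext(op at narrow_bits)."""
    for a in range(1 << src_bits):
        for b in range(1, 1 << src_bits):  # b == 0 is undefined, so skip it
            wide = apply_op(name, a, b, wide_bits)
            # Zero-extending the narrow result leaves its bit pattern unchanged.
            narrow = apply_op(name, a, b, narrow_bits)
            if wide != narrow:
                return False
    return True


if __name__ == "__main__":
    for name in ("udiv", "sdiv", "urem", "srem"):
        print(name, narrowing_is_sound(name))
```

In this model all four operations agree when the narrow type has headroom above the source width. Calling `narrowing_is_sound("sdiv", narrow_bits=8)` returns False, because without a spare top bit the zero-extended operand is reinterpreted as negative. This suggests why the signed forms are only safe when the operands are still zero-extended in the narrower type.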


3647 Commits