clang-p2996

Author	SHA1	Message	Date
Craig Topper	cb161b3a88	[RISCV] Add support for matching .vf forms of fadd/fsub/fmul/fdiv/fma for fixed vectors. fma+neg will come in a different patch since I haven't done it for .vv yet either. Differential Revision: https://reviews.llvm.org/D96375	2021-02-10 10:16:27 -08:00
Craig Topper	0c254b4a69	[RISCV] Add support for selecting vrgather.vx/vi for fixed vector splat shuffles. The test cases extract a fixed element from a vector and splat it into a vector. This gets DAG combined into a splat shuffle. I've used some very wide vectors in the test to make sure we have at least a couple tests where the element doesn't fit into the uimm5 immediate of vrgather.vi so we fall back to vrgather.vx. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96186	2021-02-10 10:01:56 -08:00
Fraser Cormack	a3c74d6d53	[RISCV] Add support for selecting vid.v from build_vector This patch optimizes a build_vector "index sequence" and lowers it to the existing custom RISCVISD::VID node. This pattern is common in autovectorized code. The custom node was updated to allow it to be used by both scalable and fixed-length vectors, thus avoiding pattern duplication. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96332	2021-02-10 10:58:40 +00:00
Craig Topper	fd5adae02c	[RISCV] Remove SRO* and SLO* instructions from bitmanip. As of the current draft these are no longer being considered for the bitmanip spec. It wasn't clear what sub extension they belonged in in the 0.93 spec. So remove them. They can always be added back if something changes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96157	2021-02-09 09:35:05 -08:00
Hsiangkai Wang	a2d19bad07	[RISCV] Use whole register load/store for generic load/store. In vector v0.10, there are whole vector register load/store instructions. I suggest using the whole register load/store instructions for generic load/store of scalable vector types. This can save the vset{i}vl{i} for these loads/stores. For fractional LMUL, I keep using the vle{eew}.v/vse{eew}.v instructions to load/store partial vector registers. Differential Revision: https://reviews.llvm.org/D95853	2021-02-09 15:52:04 +08:00
Craig Topper	b49aaed8c7	[RISCV] Use _COMMUTABLE fma pseudos for fixed vectors. This matches what we do in the VLMAX SDNode patterns.	2021-02-08 11:27:23 -08:00
Craig Topper	8d8cafa32e	[RISCV] Add support for splat fixed length build_vectors using RVV. Building on the fixed vector support from D95705 I've added ISD nodes for vmv.v.x and vfmv.v.f and switched to lowering the intrinsics to it. This allows us to share the same isel patterns for both. This doesn't handle splats of i64 on RV32 yet. The build_vector gets converted to a vXi32 build_vector+bitcast during type legalization. Not sure the best way to handle this at the moment. Differential Revision: https://reviews.llvm.org/D96108	2021-02-08 11:12:56 -08:00
Craig Topper	b8d719fbe8	[RISCV] Add support for fixed vector FMA. Follow up to D95705. Does not include the commuting support from D95800. Differential Revision: https://reviews.llvm.org/D96103	2021-02-08 11:12:56 -08:00
Craig Topper	a719b667a9	[RISCV] Add initial support for converting fixed vectors to scalable vectors during lowering to use RVV instructions. This is an alternative to D95563. This is modeled after a similar feature for AArch64's SVE that uses predicated scalable vector instructions. Rather than use predication, this patch uses an explicit VL operand. I've limited it to always use LMUL=1 for now, but we can improve this in the future. This requires a bunch of new ISD opcodes to carry the VL operand. I think we can probably lower intrinsics to these ISD opcodes to cut down on the size of the isel table. Which is why I've added patterns for all integer/float types and not just LMUL=1. I'm only testing one vector width right now, but the width is programmable via the command line. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95705	2021-02-08 10:41:30 -08:00
Craig Topper	b7b4f4cbc3	[RISCV] Make scalable vector FMA commutable for register allocation. This adds support for commuting operands and converting between vfmadd and vfmacc to avoid register copies. To avoid messing up intrinsic behavior, I've added new pseudo instructions that have the isCommutable flag set. These pseudos also force a tail agnostic policy. The intrinsic versions still use the tail undisturbed policy. For best results it looks like we need to start with fmadd and only pick fmacc if it's beneficial. MachineCSE commutes without constraining the operands and then commutes back if it didn't help with CSE. So I've made sure that when the operand choice isn't constrained, we will keep fmadd for MachineCSE and when it does the second commute, we get back the original instruction. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95800	2021-02-08 10:05:33 -08:00
Fraser Cormack	b46aac125d	[RISCV] Support the scalable-vector fadd reduction intrinsic This patch adds support for the fadd reduction intrinsic, in both the ordered and unordered modes. The fmin and fmax intrinsics are not currently supported due to a discrepancy between the LLVM semantics and the RVV ISA behaviour with regards to signaling NaNs. This behaviour is likely fixed in version 2.3 of the RISC-V F/D/Q extension, but until then the intrinsics can be left unsupported. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95870	2021-02-08 09:52:27 +00:00
Fraser Cormack	e046c0c28b	[RISCV] Support scalable-vector integer reduction intrinsics This patch adds support for the integer reduction intrinsics supported by RVV. This excludes "mul" which has no corresponding instruction. The reduction instructions in RVV have slightly complicated type constraints given they always produce a single "M1" vector register. They are lowered to custom nodes including the second "scalar" reduction operand to simplify the patterns and in the hope that they can be useful for future DAG combines. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95620	2021-02-05 10:10:08 +00:00
Fraser Cormack	c3eb2da6c4	[RISCV] Optimize sign-extended EXTRACT_VECTOR_ELT nodes This patch custom-legalizes all integer EXTRACT_VECTOR_ELT nodes where SEW < XLEN to VMV_S_X nodes to help the compiler infer sign bits from the result. This allows us to eliminate redundant sign extensions. For parity, all integer EXTRACT_VECTOR_ELT nodes are legalized this way so that we don't need TableGen patterns for some and not others. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95741	2021-02-05 10:05:22 +00:00
Fraser Cormack	af48d2bfc2	[RISCV] Add patterns for scalable-vector fsqrt This patch adds support for lowering the sqrt intrinsic to the RVV vfsqrt instruction. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96012	2021-02-05 09:39:19 +00:00
Craig Topper	6b280ce34c	[RISCV] Use LLVMScalarOrSameVectorWidth to avoid needing to mention the index type for vrgatherei16 intrinsics. Add .vv to the intrinsic name to be consistent with D95979. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D95981	2021-02-04 20:26:45 -08:00
Craig Topper	25ff302a79	[RISCV] Split vrgather intrinsics into separate vrgather.vv and vrgather.vx intrinsics. The vrgather.vv instruction uses a vector of indices with the same SEW as operand 0. The vrgather.vx instructions use a scalar index operand of XLen bits. By splitting this into 2 intrinsics we are able to use LLVMMatchType in the definition to avoid specifying the type for the index operand when creating the IR for the intrinsic. For .vv it will match the operand 0 type. And for .vx it will match the type of the vl operand we already needed to specify a type for. I'm considering splitting more intrinsics. This was a somewhat odd one because the .vx doesn't use the element type, it always uses XLen. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D95979	2021-02-04 19:50:12 -08:00
Craig Topper	16fb1c7aae	[RISCV] Add i8/i16 test cases to div.ll and i8/i16/i64 to rem.ll. NFC This improves our coverage of these operations and shows that we use really large constants for division by constant on i8/i16 especially on RV64. The issue is that BuildSDIV/BuildUDIV are limited to legal types so we have to promote to i64 before it kicks in. At that point we've lost the range information for the original type.	2021-02-04 16:46:23 -08:00
Hsiangkai Wang	63baeec66e	[RISCV] Load/store vector mask types. Use vle1.v/vse1.v to load/store vector mask types. Differential Revision: https://reviews.llvm.org/D93364	2021-02-03 13:44:15 +08:00
Hsiangkai Wang	c7189ba785	[RISCV] Add new vector instructions in v0.10. * Add new vector instructions in v0.10. - load/store for mask value vle1.v vse1.v - vsetivli for 0-31 immediate vector length. * Rename vector instructions in v0.10. - vfrsqrte7 -> vfrsqrt7 - vfrece7 -> vfrec7 * Reserve memory width encodings for EEW>128b. Differential Revision: https://reviews.llvm.org/D95781	2021-02-03 13:28:58 +08:00
Fraser Cormack	b4106f9c7b	[RISCV] Fix incorrect RVV sdiv/udiv lowering Due to a clerical error, the sdiv operation was mapping to vdivu and udiv to vdiv, when the opposite mapping is the correct one. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95869	2021-02-02 18:35:53 +00:00
Craig Topper	72b31ad4b8	[RISCV] Add scalable vector support for floating point FMA instructions A follow up patch will add support for commuting operands or changing opcode to vfmacc and friends. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95662	2021-02-01 09:52:43 -08:00
Craig Topper	1097ee61bf	[RISCV] Optimize (srl (and X, 0xffff), C) -> (srli (slli X, 16), 16 + C). Rather than materializing the 0xffff immediate for the AND, use a shift left to remove the upper bits and then shift in zeros from the right. This pattern occurs when type legalizing an i16 right shift. I've implemented this with custom selection code for a number of reasons. I've limited this to the AND having a single use. We need to compensate for SimplifyDemandedBits altering the AND mask. I'm using *W opcodes on RV64. We may want to generalize this in the future. For all these reasons it seemed easiest to do it this way. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95774	2021-02-01 09:37:55 -08:00
Craig Topper	70289ea6f5	[RISCV][LegalizeTypes] Try to expand BSWAP before promoting if the promoted BSWAP would expand anyway. If we're going to end up expanding anyway, we should do it early so we don't create extra operations to handle the bytes added by promotion. This is helpful on RISCV where we might have to promote i16 all the way to i64. Differential Revision: https://reviews.llvm.org/D95756	2021-01-31 14:33:29 -08:00
Craig Topper	be997cead7	[RISCV] Add rv64 command line to bswap-ctlz-cttz-ctpop.ll.	2021-01-30 21:32:37 -08:00
Fraser Cormack	c87dd614fd	[RISCV] Update extractelt tests to sign-extend results (NFC) This demonstrates a missed optimization: the `vmv.x.s` instruction is used to extract the element from the vector, and this instruction already sign-extends the value to XLEN.	2021-01-30 15:50:07 +00:00
Craig Topper	ad5307aaca	[RISCV] Merge rv32 and rv64 vector fadd/fsub/fmul/fdiv sdnode tests into single tests files with 2 run lines. The IR and CHECK lines are identical so just keep one copy.	2021-01-29 17:32:08 -08:00
Hsiangkai Wang	282aca10ae	[RISCV] Update the version number to v0.10 for vector. v0.10 is tagged in V specification. Update the version to v0.10. Differential Revision: https://reviews.llvm.org/D95680	2021-01-30 07:20:05 +08:00
Craig Topper	c5d4b77b17	[RISCV] Remove isel patterns for Zbs *W instructions. These instructions have been removed from the 0.94 bitmanip spec. We should focus on optimizing the codegen without using them. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D95302	2021-01-28 09:33:56 -08:00
Craig Topper	ae82a8c863	[RISCV] Add support for scalable vector fneg using vfsgnjn.vv Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95568	2021-01-28 09:11:49 -08:00
Fraser Cormack	fc2f27ccf3	[RISCV] Add support for RVV int<->fp & fp<->fp conversions This patch adds support for the full range of vector int-to-float, float-to-int, and float-to-float conversions on legal types. Many conversions are supported natively in RVV so are lowered with patterns. These include conversions between (element) types of the same size, and those that are half/double the size of the input. When conversions take place between types that are less than half or more than double the size we must lower them using sequences of instructions which go via intermediate types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95447	2021-01-28 09:50:32 +00:00
Craig Topper	5d05cdf55c	[RISCV] Copy isUnneededShiftMask from X86. In `d2927f786e`, I added patterns to remove (and X, 31) from sllw/srlw/sraw shift amounts. There is code in SelectionDAGISel.cpp that knows to use computeKnownBits to fill in bits of the mask that were removed by SimplifyDemandedBits based on bits being known zero. The non-W shift patterns use immbottomxlenset which allows the mask to have more than log2(xlen) trailing ones, but doesn't have a call to computeKnownBits to fill in bits of the mask that may have been cleared by SimplifyDemandedBits. This patch copies code from X86 to handle more than log2(xlen) bottom bits set and uses computeKnownBits to fill in missing bits before counting. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95422	2021-01-27 20:46:10 -08:00
Fraser Cormack	9a75a808c2	[RISCV] Fix a codegen crash in getSetCCResultType This patch fixes some crashes coming from `RISCVISelLowering::getSetCCResultType`, which would occasionally return an EVT constructed from an invalid MVT, which has a null Type pointer. The attached test shows this happening currently for some fixed-length vectors, which hit this issue when the V extension was enabled, even though they're not legal types under the V extension. The fix was also pre-emptively extended to scalable vectors which can't be represented as an MVT, even though a test case couldn't be found for them. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95434	2021-01-27 10:22:54 +00:00
Craig Topper	f9d7f77267	[RISCV] Have customLegalizeToWOp truncate to the original type instead of i32 now that we use it for i8/i16 as well. `239cfbccb0` added support for legalizing i8/i16 UDIV/UREM/SDIV to use *W instructions. So we need to truncate to i8/i16 if we're legalizing one of those.	2021-01-26 10:50:03 -08:00
Craig Topper	bfc60acd98	[RISCV] Adjust RISCVInstrInfoVSDPatterns.td for different pseudo instructions for different FPR. Move the Suffix string into the VTypeInfo class so we don't need a helper class to get to it. Adjust pseudo naming scheme for FPRs to put F16/F32/F64 in place of F in the pseudo instruction name rather than as a suffix. This avoids special cases like VFMERGE from the original patch. Differential Revision: https://reviews.llvm.org/D95404	2021-01-26 01:00:50 -08:00
Hsiangkai Wang	e72b22a40b	[RISCV] Define different pseudo instructions for different FPR. When spilling, the spill size will depend on the size of the register class. For .vf vector instructions, it may spill the floating point scalar argument. In order to use the correct load/store instructions for spilling, we need to provide the correct floating point register class for the .vf vector pseudo instructions. In this commit, we define the .vf pseudo instructions as three different kinds of pseudo instructions for half/float/double. For example, PseudoVFADD_M1 will become PseudoVFADD_F16_M1, PseudoVFADD_F32_M1, and PseudoVFADD_F64_M1. Differential Revision: https://reviews.llvm.org/D95234	2021-01-26 15:48:35 +08:00
Hsiangkai Wang	f19849a07b	[RISCV] Update V extension to v1.0-draft 08a0b464. Differential Revision: https://reviews.llvm.org/D94583	2021-01-26 12:02:43 +08:00
Hsiangkai Wang	b69932b550	[RISCV] Implement vlsegff intrinsics. Differential Revision: https://reviews.llvm.org/D95303	2021-01-26 12:02:43 +08:00
Craig Topper	ea87cf2acd	[TargetLowering][RISCV] Don't transform (seteq/ne (sext_inreg X, VT), C1) -> (seteq/ne (zext_inreg X, VT), C1) if the sext_inreg is cheaper RISCV has to use 2 shifts for (i64 (zext_inreg X, i32)), but we can use addiw rd, rs1, x0 for sext_inreg. We already understood this when type legalizing i32 seteq/ne on rv64. But this transform in SimplifySetCC would sometimes undo it. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95289	2021-01-25 16:37:21 -08:00
Craig Topper	15f66cf749	[RISCV] Add isel patterns to optimize slli.uw patterns without Zba extension. This pattern can occur when an unsigned is used to index an array on RV64. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95290	2021-01-25 16:12:08 -08:00
Fraser Cormack	15141cd115	[RISCV] Add RVV insertelt/extractelt scalable-vector patterns Original patch by @rogfer01. This patch adds support for insertelt and extractelt operations on scalable vectors. Special care must be taken on RV32 when dealing with i64 vectors as there are no straightforward ways to insert a 64-bit element without a register of that size. To that end, both are custom-lowered to different sequences. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94615	2021-01-25 22:03:52 +00:00
Craig Topper	239cfbccb0	[RISCV] Custom type legalize i8/i16 UDIV/UREM/SDIV on RV64 so we can use divuw/remuw/divw. This makes our i8/i16 codegen more similar to the i32 codegen. I've also added computeKnownBits support for DIVUW/REMUW so that we can remove zero extending ANDs from the output. Without this we end up turning DIVUW/REMUW back into DIVU/REMU via some isel patterns. Reviewed By: frasercrmck, luismarques Differential Revision: https://reviews.llvm.org/D95322	2021-01-25 10:47:22 -08:00
Craig Topper	4eb4f8963f	[RISCV] Use sign extend for i32 arguments and returns in makeLibCall on RV64. As far as I know 32-bit arguments and returns on RV64 are always sign extended to i64. So I think we should be taking this into account around libcalls. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95285	2021-01-25 09:33:48 -08:00
Fraser Cormack	fde2466171	[SelectionDAG] Support scalable-vector splats in more cases This patch adds support for scalable-vector splats in DAGCombiner's `isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions, which enable the SelectionDAG div/rem-by-constant optimizations for scalable vector types. It also fixes up one case where the UDIV optimization was generating a SETCC without first consulting the target for its preferred SETCC result type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94501	2021-01-25 10:58:15 +00:00
Simon Cook	a7c1239f37	[RISCV] Add attribute support for all supported extensions This adds support for ".attribute arch" for all extensions that are currently supported by the compiler. Differential Revision: https://reviews.llvm.org/D94931	2021-01-25 08:58:53 +00:00
Craig Topper	12d0753aca	[RISCV] Use bitsLE instead of strict == MVT::i32 in assertsexti32 and assertzexti32. The patterns that use this really want to know if the operand has at least 32 sign/zero bits. This increases opportunities to use W instructions when the original source used i8/i16. Not sure how much this matters for performance, but it makes i8/i16 code more consistent with i32.	2021-01-24 13:58:14 -08:00
Craig Topper	f22aa8f879	[RISCV] Add test cases for missed opportunities to use *W instructions for div/rem when inputs are sign/zero extended from i8/16 instead of i32.	2021-01-24 13:56:38 -08:00
Craig Topper	60ebf6408e	[RISCV] Add test cases for missed opportunities to use fcvt.*.w(u) instructions on RV64 when input is known to be extended from i8/i16.	2021-01-24 13:48:29 -08:00
Craig Topper	998057ec06	[RISCV] Add isel patterns to remove masks on SLO/SRO shift amounts.	2021-01-23 15:57:41 -08:00
Craig Topper	5a73daf907	[RISCV] Add test cases for SRO/SLO with shift amounts masked to bitwidth-1. NFC The sro/slo instructions ignore extra bits in the shift amount, so we can ignore the mask just like we do for sll, srl, and sra.	2021-01-23 15:45:51 -08:00
Craig Topper	d2927f786e	[RISCV] Add isel patterns to remove (and X, 31) from sllw/srlw/sraw shift amounts. We try to do this during DAG combine with SimplifyDemandedBits, but it fails if there are multiple nodes using the AND. For example, multiple shifts using the same shift amount.	2021-01-23 15:08:18 -08:00
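
The reasoning in d2927f786e (the `(and X, 31)` on a sllw/srlw/sraw shift amount is redundant) can be checked with a small model. This is only a sketch, not code from the LLVM tree: the helper names `sext32`, `sllw`, `srlw`, `sraw`, and `mask_is_redundant` are hypothetical, and the instruction semantics are assumed from the RV64I definition, in which the *W shifts read only the low 5 bits of the shift amount and sign-extend the 32-bit result.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sext32(value):
    """Sign-extend the low 32 bits of value into a 64-bit register image."""
    value &= MASK32
    if value & 0x80000000:
        return (value | ~MASK32) & MASK64
    return value


# Assumed RV64I semantics: each *W shift uses only rs2[4:0] as the amount.
def sllw(rs1, rs2):
    return sext32((rs1 & MASK32) << (rs2 & 31))


def srlw(rs1, rs2):
    return sext32((rs1 & MASK32) >> (rs2 & 31))


def sraw(rs1, rs2):
    low = rs1 & MASK32
    signed = low - (1 << 32) if low & 0x80000000 else low
    return sext32(signed >> (rs2 & 31))


def mask_is_redundant(op, samples, amounts):
    """True if op(x, amt & 31) equals op(x, amt) for every tested pair."""
    return all(op(x, amt & 31) == op(x, amt) for x in samples for amt in amounts)


# Hypothetical sample inputs; any 64-bit values would do.
SAMPLES = [0, 1, 0x7FFFFFFF, 0x80000000, 0xDEADBEEF, MASK64]
AMOUNTS = range(0, 128)

for shift in (sllw, srlw, sraw):
    print(shift.__name__, mask_is_redundant(shift, SAMPLES, AMOUNTS))
```

Because the instruction already discards the upper bits of the amount, an isel pattern can drop the AND even when the AND has other users, which is the case the commit says SimplifyDemandedBits cannot handle.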

582 Commits