clang-p2996

Author	SHA1	Message	Date
Ben Shi	d3738a09fb	[RISCV][test] Add tests for mul optimization in the zba extension with SH*ADD These tests will show the following optimization by future patches. (mul x, 11) -> (SH1ADD (SH2ADD x, x), x) (mul x, 19) -> (SH1ADD (SH3ADD x, x), x) (mul x, 13) -> (SH2ADD (SH1ADD x, x), x) (mul x, 21) -> (SH2ADD (SH2ADD x, x), x) (mul x, 37) -> (SH2ADD (SH3ADD x, x), x) (mul x, 25) -> (SH3ADD (SH1ADD x, x), x) (mul x, 41) -> (SH3ADD (SH2ADD x, x), x) (mul x, 73) -> (SH3ADD (SH3ADD x, x), x) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106031	2021-07-21 10:16:56 +08:00
Craig Topper	81efb82570	[RISCV] Teach RISCVMatInt about cases where it can use LUI+SLLI to replace LUI+ADDI+SLLI for large constants. If we need to shift left anyway we might be able to take advantage of LUI implicitly shifting its immediate left by 12 to cover part of the shift. This allows us to use more bits of the LUI immediate to avoid an ADDI. isDesirableToCommuteWithShift now considers compressed instruction opportunities when deciding if commuting should be allowed. I believe this is the same or similar to one of the optimizations from D79492. Reviewed By: luismarques, arcbbb Differential Revision: https://reviews.llvm.org/D105417	2021-07-20 09:22:06 -07:00
Craig Topper	2ad2c5d457	[RISCV] Add -mattr=+c command lines to add-before-shl.ll to prepare for D105417. NFC	2021-07-20 09:22:06 -07:00
Craig Topper	98d4adc2d1	[RISCV] Add custom isel to select (and (srl X, C1), C2) and (and (shl X, C1), C2) Replace some existing isel patterns that are covered by the new code. SLLIUWPat has been removed in favor of folding its root case into the new code. The other uses in isel patterns for shXadd.uw have been switched to using hardcoded AND masks. This is based on the original version of D49585 from ARM. The final version of that was made a DAG combine, but I've chosen to keep it as custom isel. I'm not convinced DAG combine is as good with shift pairs as it is with and+shift. I saw some issues optimizing the shifts created by vscale lowering if an and isn't created from a shift pair. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D106230	2021-07-20 08:53:55 -07:00
Craig Topper	84877a098a	[RISCV] Use unordered indexed loads for MGATHER. I don't think the semantics of the llvm masked gather intrinsic care about the order the elements are loaded. For example, type legalization by splitting will chain them in parallel. This is different than scatter which we do chain in order. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D106025	2021-07-20 08:46:02 -07:00
Craig Topper	4f1270a61e	[RISCV] Add test cases to show an issue with our fcvt.wu isel patterns on RV64. The pattern we match is (sext_inreg (assertzexti32 (fp_to_uint)), i32). If the assertzexti32 has an additional user we'll end up emitting an fcvt.wu and an fcvt.lu. This can happen if the original fp_to_uint before type legalization has one user that causes a sext_inreg to be emitted and one that doesn't.	2021-07-19 22:58:42 -07:00
Craig Topper	50302feb1d	[SelectionDAG][RISCV] Use isSExtCheaperThanZExt to control whether sext or zext is used for constant folding any_extend. RISCV would prefer a sign extended constant since that works better with our constant materialization. We have an existing TLI hook we use to control sign extension of setcc operands in type legalization. That hook happens to do the right check we need here, but might be straying from its original purpose. With only RISCV defining this hook in tree, I wasn't sure if it was worth adding another hook with identical behavior. This is an alternative to D105785 where I tried to handle this in the RISCV backend by not creating ANY_EXTENDs in some places. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D105918	2021-07-19 09:25:28 -07:00
Craig Topper	00c1cc867f	[RISCV] Add more i32 srem/sdiv with power of 2 constant tests. NFC Add a small power 2 srem test to match existing sdiv test. Add larger power of 2 test to both. The larger constant test shows materialization of a constant for an AND in the RV64 code. We should be using W shift instructions to match the RV32 code.	2021-07-18 00:21:14 -07:00
Craig Topper	d0f8047d37	[RISCV] Teach computeKnownBitsForTargetNode that VLENB will never be more than 65536/8.	2021-07-17 11:24:20 -07:00
ShihPo Hung	be8159bfa5	[RISCV][RVV] Precommit a test case for D105684 Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D105685	2021-07-18 00:43:17 +08:00
Craig Topper	173332d175	[RISCV] Manually emit the best shift for VSCALE lowering to improve codegen. We assume VLENB is a multiple of 8 and previously relied on shift pairs being optimized to an AND+SHL/SHR and computeKnownBits removing the AND. This doesn't happen if (vlenb >> 3) gets CSEd to have multiple uses. This patch manually emits the best shift to work around this.	2021-07-17 00:52:07 -07:00
Craig Topper	2e65ec1010	[RISCV] Rename the fixed vector vwmacc tests to have the 'm' in their filenames. NFC	2021-07-16 10:43:17 -07:00
Craig Topper	8f0343cc9c	[RISCV] Use tail agnostic policy for fixed vector vwmacc(u). This adds new pseudoinstructions with ForceTailAgnostic set. This matches what we did for non-widening VMACC. We should move to a tail policy operand on the pseudos when we expand the intrinsic interface to include the tail policy.	2021-07-16 10:41:09 -07:00
Craig Topper	4dbb788068	[RISCV] Teach constant materialization that it can use zext.w at the end with Zba to reduce number of instructions. If the upper 32 bits are zero and bit 31 is set, we might be able to use zext.w to fill in the zeros after using an lui and/or addi. Most of this patch is plumbing the subtarget features into the constant materialization. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D105509	2021-07-16 09:35:56 -07:00
Fraser Cormack	e3fa2b1eab	Revert "[RISCV] Lower more BUILD_VECTOR sequences to RVV's VID" This reverts commit `a6ca88e908`. More caution is required to avoid overflow/underflow. Thanks to the sanitizers for catching this.	2021-07-16 15:00:20 +01:00
Fraser Cormack	a6ca88e908	[RISCV] Lower more BUILD_VECTOR sequences to RVV's VID This patch teaches the compiler to identify a wider variety of `BUILD_VECTOR`s which form integer arithmetic sequences, and to lower them to `vid.v` with modifications for non-unit steps and non-zero addends. The sequences handled by this optimization must either be monotonically increasing or decreasing. Consecutive elements holding the same value indicate a fractional step which, while simple mathematically, becomes more complex to handle both in the realm of lossy integer division and in the presence of `undef`s. For example, a common "interleaving" shuffle index will be lowered by LLVM to both `<0,u,1,u,2,...>` and `<u,0,u,1,u,...>` `BUILD_VECTOR` nodes. Either of these would ideally be lowered to `vid.v` shifted right by 1. Detection of this sequence in presence of general `undef` values is more complicated, however: `<0,u,u,1,>` could match either `<0,0,0,1,>` or `<0,0,1,1,>` depending on later values in the sequence. Both are possible, so backtracking or multiple passes is inevitable. Sticking to monotonic sequences keeps the logic simpler as it can be done in one pass. Fractional steps will likely be a separate optimization in a future patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104921	2021-07-16 10:35:13 +01:00
Fraser Cormack	03a4702c88	[RISCV] Fix the neutral element in vector 'fadd' reductions Using positive zero as the neutral element in 'fadd' reductions, while it generates better code, is incorrect. The correct neutral element is negative zero: 0.0 + -0.0 = 0.0, whereas -0.0 + -0.0 = -0.0. There are perhaps more optimal lowerings of negative zero avoiding constant-pool loads which could be left as future work. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D105902	2021-07-14 10:18:38 +01:00
Craig Topper	1e670dc7d7	[RISCV] Use DIVUW/REMUW/DIVW instructions for i8/i16/i32 udiv/urem/sdiv when LHS is constant. We don't really have optimizations for division with a constant LHS. If we don't use a W instruction we end up needing to sign or zero extend the RHS to use the 64-bit instruction. I had to sign_extend i32 constants on the LHS instead of using any_extend which becomes zero_extend. If we don't do this, constants that were originally negative become harder to materialize. I think this problem exists for more of our W instruction cases. For example (i32 (shl -1, X)), but we don't have lit tests. I'll work on that as a follow up. I also left a FIXME for enabling W instruction for RHS constants under -Oz. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D105769	2021-07-13 10:33:57 -07:00
Craig Topper	46e8970817	[RISCV] Prevent use of t0(aka x5) as rs1 for jalr instructions. Some microarchitectures treat rs1=x1/x5 on jalr as a hint to pop the return-address stack. We should avoid using x5 on jalr instructions since we aren't using x5 as an alternate link register. Differential Revision: https://reviews.llvm.org/D105875	2021-07-13 09:46:21 -07:00
Fangrui Song	3d89fb4d13	[RISCV] Support machine constraint "S" Similar to D46745, "S" represents an absolute symbolic operand, which can be used to specify the access models, e.g. extern int var; void *addr_via_asm() { void *ret; asm("lui %0, %%hi(%1)\naddi %0,%0,%%lo(%1)" : "=r"(ret) : "S"(&var)); return ret; } 'S' is documented in trunk GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101275 Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D105254	2021-07-13 09:30:09 -07:00
Fraser Cormack	d991b7212b	[RISCV] Pass undef VECTOR_SHUFFLE indices on to BUILD_VECTOR Often when lowering vector shuffles, we split the shuffle into two LHS/RHS shuffles which are then blended together. To do so we split the original indices into two, indexed into each respective vector. These two index vectors are then separately lowered as BUILD_VECTORs. This patch forwards on any undef indices to the BUILD_VECTOR, rather than having the VECTOR_SHUFFLE lowering decide on an optimal concrete index. The motivation for this change is so that we don't duplicate optimization logic between the two lowering methods and let BUILD_VECTOR do what it does best. Propagating undef in this way allows us, for example, to generate `vid.v` to produce the LHS indices of commonly-used interleave-type shuffles. I have designs on further optimizing interleave-type and other common shuffle patterns in the near future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104789	2021-07-13 10:41:54 +01:00
Eli Friedman	ec1cdee6aa	[SelectionDAG][RISCV] Support @llvm.vscale.i64() on 32-bit targets. Not really useful on its own, but D105673 depends on it. Differential Revision: https://reviews.llvm.org/D105840	2021-07-12 14:53:42 -07:00
Craig Topper	f0393deb33	[RISCV] Add tests for suboptimal handling of negative constants for i32 uaddo/usubo on RV64. NFC We end up zero extending constants when we promote to i64. We should sign extend instead to allow use of addiw or improve constant materialization.	2021-07-11 12:38:51 -07:00
Craig Topper	6644a61121	[RISCV] Add tests for suboptimal handling of negative constants on the LHS of i32 shifts/rotates/subtracts on RV64. NFC The constants end up getting zero extended to i64, but sign extend would be better for constant materialization. We're using W instructions so either behavior is correct since the upper bits aren't read.	2021-07-11 11:54:34 -07:00
Craig Topper	1410aab622	[RISCV] Remove stale FIXME from a test. NFC sext has been used for sltu/sltiu since `e0e62e97`.	2021-07-11 10:25:07 -07:00
Craig Topper	99b8c46828	[RISCV] Restore non-constant srem test I accidentally deleted. NFC	2021-07-10 18:02:13 -07:00
Craig Topper	86109fa9e8	[RISCV] Add test cases for div/rem with constant left hand side. NFC Some of these would produce better code if we used W instructions, but constant LHS currently prevents that.	2021-07-10 17:22:40 -07:00
Ben Shi	ed102ce20a	[RISCV][test] Add new tests for mul optimization in the zba extension with SH*ADD This patch will show the following optimization by future patches. (mul x imm) -> (SH1ADD x, (SLLI x, bits)) when imm = 2^n + 2. (mul x imm) -> (SH2ADD x, (SLLI x, bits)) when imm = 2^n + 4. (mul x imm) -> (SH3ADD x, (SLLI x, bits)) when imm = 2^n + 8. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D105614	2021-07-09 09:48:23 +08:00
Craig Topper	12d51f95fe	[RISCV] Implement lround/llround/lrint/llrint with fcvt instruction with -fno-math-errno These are fp->int conversions using either RMM or dynamic rounding modes. The lround and lrint opcodes have a return type of either i32 or i64 depending on sizeof(long) in the frontend which should follow xlen. llround/llrint should always return i64 so we'll need a libcall for those on rv32. The frontend will only emit the intrinsics if -fno-math-errno is in effect otherwise a libcall will be emitted which will not use these ISD opcodes. gcc also does this optimization. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D105206	2021-07-06 11:43:22 -07:00
Craig Topper	2b5e53111a	[RISCV] Add support for matching vwmul(u) and vwmacc(u) from fixed vectors. This adds a DAG combine to detect sext/zext inputs and emit a new ISD opcode. The extends will either be removed or replaced with narrower extends. Isel patterns are used to match add and widening mul to vwmacc similar to the recently added vmacc patterns. There's still some work to be done to match vmulsu. We should also rewrite splats that were extended as scalars and then splatted. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D104802	2021-07-06 10:24:31 -07:00
Matt Arsenault	fae05692a3	CodeGen: Print/parse LLTs in MachineMemOperands This will currently accept the old number of bytes syntax, and convert it to a scalar. This should be removed in the near future (I think I converted all of the tests already, but likely missed a few). Not sure what the exact syntax and policy should be. We can continue printing the number of bytes for non-generic instructions to avoid test churn and only allow non-scalar types for generic instructions. This will currently print the LLT in parentheses, but accept parsing the existing integers and implicitly converting to scalar. The parentheses are a bit ugly, but the parser logic seems unable to deal without either parentheses or some keyword to indicate the start of a type.	2021-06-30 16:54:13 -04:00
Craig Topper	3b6dfa381e	[RISCV] Protect the SHL/SRA/SRL handlers in LowerOperation against being called for an illegal i32 shift amount. It seems it is possible for DAG combine to create a shl with an i64 result type and an i32 shift amount. This is ok before type legalization since the types don't need to match in SelectionDAG. This results in type legalization calling LowerOperation to legalize just the amount. We weren't expecting this so we asserted for not finding a fixed vector shift. To fix this, I've added a check for the fixed vector case and returned SDValue() to get the default type legalizer. I've factored all shifts together and added a fixed vector specific handler to avoid repeating similar code for each in LowerOperation. The particular case I found was exposed by D104581, but the bad shift is created after that patch triggers.	2021-06-29 09:45:13 -07:00
Craig Topper	4c92e31dd0	[RISCV] Add tests for __builtin_parity idiom. We use (and (ctpop X), 1) to represent parity. The generated code for i32 parity on RV64 has more instructions than necessary which I hope to improve in a followup patch. Also add missing test for i64 ctpop.	2021-06-27 12:37:29 -07:00
Craig Topper	010f0f000f	Revert "[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions." I thought this might help with another optimization I was thinking about, but I don't think it will. So it just wastes compile time calling computeKnownBits for no benefit. This reverts commit `81b2f95971`.	2021-06-27 10:33:43 -07:00
Craig Topper	81b2f95971	[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions.	2021-06-26 11:57:26 -07:00
Craig Topper	d4f4a1ba62	[RISCV] Add DAG combine to detect opportunities to replace (i64 (any_extend (i32 X))) with sign_extend. If type legalization is going to insert a sign_extend for other users of X and we can fold the sign_extend into ADDW/MULW/SUBW, it is better to replace the ANY_EXTEND so we don't end up with a separate ADD/MUL/SUB instruction for the users of the ANY_EXTEND. I'm only handling setcc uses right now, but there are other instructions that force sign_extends like ashr. There are probably other *W instructions we could use in addition to ADDW/SUBW/MULW. My motivating case was a loop terminating compare and a phi use as seen in the new test file. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D104581	2021-06-25 23:16:37 -07:00
Fraser Cormack	ab1bd25593	[RISCV] Permit larger RVV stacks and stack offsets This patch teaches the compiler to generate code to handle larger RVV stack sizes and stack offsets which resolve an amount larger than 2047 vector registers in size. The previous behaviour was asserting on such large values as it was only able to materialize the constant by feeding it to the 12-bit immediate of an `ADDI` instruction. The compiler can now materialize this amount into a temporary register before continuing with the computation. A test case for this scenario is included which also checks that the temporary register used to materialize the amount doesn't require an additional spill slot over what we're already reserving for RVV code. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D104727	2021-06-25 07:17:33 +01:00
Fraser Cormack	a4729f7f88	[RISCV] Lower RVV vector SELECTs to VSELECTs This patch optimizes the code generation of vector-type SELECTs (LLVM select instructions with scalar conditions) by custom-lowering to VSELECTs (LLVM select instructions with vector conditions) by splatting the condition to a vector. This avoids the default expansion path which would either introduce control flow or fully scalarize. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104772	2021-06-24 10:12:51 +01:00
Craig Topper	91319534ba	[CGP][RISCV] Teach CodeGenPrepare::optimizeSwitchInst to honor isSExtCheaperThanZExt. This optimization pre-promotes the input and constants for a switch instruction to a legal type so that all the generated compares share the same extend. Since RISCV prefers sext for i32 to i64 extends, we should honor that to use sext.w instead of a pair of shifts. Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D104612	2021-06-23 15:38:11 -07:00
Joe Ellis	3c4dbf6ea9	[Verifier] Fail on overrunning and invalid indices for {insert,extract} vector intrinsics With regards to overrunning, the langref (llvm/docs/LangRef.rst) specifies: (llvm.experimental.vector.insert) Elements ``idx`` through (``idx`` + num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition cannot be determined statically but is false at runtime, then the result vector is undefined. (llvm.experimental.vector.extract) Elements ``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector indices. If this condition cannot be determined statically but is false at runtime, then the result vector is undefined. For the non-mixed cases (e.g. inserting/extracting a scalable into/from another scalable, or inserting/extracting a fixed into/from another fixed), it is possible to statically check whether or not the above conditions are met. This was previously missing from the verifier, and if the conditions were found to be false, the result of the insertion/extraction would be replaced with an undef. With regards to invalid indices, the langref (llvm/docs/LangRef.rst) specifies: (llvm.experimental.vector.insert) ``idx`` represents the starting element number at which ``subvec`` will be inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum vector length. (llvm.experimental.vector.extract) The ``idx`` specifies the starting element number within ``vec`` from which a subvector is extracted. ``idx`` must be a constant multiple of the known-minimum vector length of the result type. Similarly, these conditions were not previously enforced in the verifier. In some circumstances, invalid indices were permitted silently, and in other circumstances, an undef was spawned where a verifier error would have been preferred. This commit adds verifier checks to enforce the constraints above. Differential Revision: https://reviews.llvm.org/D104468	2021-06-23 10:33:22 +00:00
Craig Topper	9080659ac7	[RISCV] Add isel patterns to match vmacc/vmadd/vnmsub/vnmsac from add/sub and mul. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D104163	2021-06-21 11:27:44 -07:00
Craig Topper	b663f30fa4	[RISCV] Prevent formation of shXadd(.uw) and add.uw if it prevents the use of addi. If the outer add has an simm12 immediate operand we should prefer it instead of materializing it in a register. This would guarantee an extra instruction and temporary register. Since we don't check one use on the shl or zext we might generate more instructions if there is an additional user.	2021-06-19 12:10:42 -07:00
Ben Shi	d934b72809	[RISCV] Optimize add-mul in the zba extension with SH*ADD This patch does the following optimization. Rx + Ry * 18 => (SH1ADD (SH3ADD Rx, Rx), Ry) Rx + Ry * 20 => (SH2ADD (SH2ADD Rx, Rx), Ry) Rx + Ry * 24 => (SH3ADD (SH1ADD Rx, Rx), Ry) Rx + Ry * 36 => (SH2ADD (SH3ADD Rx, Rx), Ry) Rx + Ry * 40 => (SH3ADD (SH2ADD Rx, Rx), Ry) Rx + Ry * 72 => (SH3ADD (SH3ADD Rx, Rx), Ry) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104588	2021-06-19 14:33:27 +08:00
Ben Shi	31190738c0	[RISCV][test] Add new tests for add-mul optimization in the zba extension with SH*ADD These tests will show the following optimization by future patches. Rx + Ry * 18 => (SH1ADD (SH3ADD Rx, Rx), Ry) Rx + Ry * 20 => (SH2ADD (SH2ADD Rx, Rx), Ry) Rx + Ry * 24 => (SH3ADD (SH1ADD Rx, Rx), Ry) Rx + Ry * 36 => (SH2ADD (SH3ADD Rx, Rx), Ry) Rx + Ry * 40 => (SH3ADD (SH2ADD Rx, Rx), Ry) Rx + Ry * 72 => (SH3ADD (SH3ADD Rx, Rx), Ry) Rx * (3 << C) => (SLLI (SH1ADD Rx, Rx), C) Rx * (5 << C) => (SLLI (SH2ADD Rx, Rx), C) Rx * (9 << C) => (SLLI (SH3ADD Rx, Rx), C) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104507	2021-06-19 14:31:01 +08:00
Craig Topper	ac87133f1d	[RISCV] Teach vsetvli insertion to remember when predecessors have same AVL and SEW/LMUL ratio if their VTYPEs otherwise mismatch. Previously we went directly to unknown state on VTYPE mismatch. If we instead remember the partial match, we can use this to still use X0, X0 vsetvli in successors if AVL and needed SEW/LMUL ratio match. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D104069	2021-06-18 12:16:07 -07:00
Saleem Abdulrasool	e70d4994ea	test: clean up some of the RISCV tests (NFC) This addresses some post-commit comments from jrtc27 to make the tests easier to process.	2021-06-17 09:51:09 -07:00
Saleem Abdulrasool	bbea64250f	RISCV: adjust handling of relocation emission for RISCV This re-architects the RISCV relocation handling to bring the implementation closer in line with the implementation in binutils. We would previously aggressively resolve the relocation. With this restructuring, we always will emit a paired relocation for any symbolic difference of the type of S±T[±C] where S and T are labels and C is a constant. GAS has a special target hook controlled by `RELOC_EXPANSION_POSSIBLE` which indicates that a fixup may be expanded into multiple relocations. This is used by the RISCV backend to always emit a paired relocation - either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] + SUB[WIDTH] for a debug info relocation. Irrespective of whether linker relaxation support is enabled, symbolic difference is always emitted as a paired relocation. This change also sinks the target specific behaviour down into the target specific area rather than exposing it to the shared relocation handling. In the process, we also sink the "special" handling for debug information down into the RISCV target. Although this improves the path for the other targets, this is not necessarily entirely ideal either. The changes in the debug info emission could be done through another type of hook as this functionality would be required by any other target which wishes to do linker relaxation. However, as there are no other targets in LLVM which currently do this, this is a reasonable thing to do until such time as the code needs to be shared. Improve the handling of the relocation (and add a reduced test case from the Linux kernel) to ensure that we handle complex expressions for symbolic difference. This ensures that we correctly relocate symbols with the addends normalized and associated with the addition portion of the paired relocation. 
This change also addresses some review comments from Alex Bradbury about the relocations meant for use in the DWARF CFA being named incorrectly (using ADD6 instead of SET6) in the original change which introduced the relocation type. This resolves the issues with the symbolic difference emission sufficiently to enable building the Linux kernel with clang+IAS+lld (without linker relaxation). Resolves PR50153, PR50156! Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143 Reviewed By: nickdesaulniers, maskray Differential Revision: https://reviews.llvm.org/D103539	2021-06-17 08:20:02 -07:00
Fraser Cormack	fed1503e85	[RISCV][VP] Lower FP VP ISD nodes to RVV instructions With the exception of `frem`, this patch supports the current set of VP floating-point binary intrinsics by lowering them to RVV instructions. It does so by using the existing `RISCVISD *_VL` custom nodes as an intermediate layer. Both scalable and fixed-length vectors are supported by using this method. The `frem` node is unsupported due to a lack of available instructions. For fixed-length vectors we could scalarize but that option is not (currently) available for scalable-vector types. The support is intentionally left out so it is equivalent for both vector types. The matching of vector/scalar forms is currently lacking, as scalable vector types do not lower to the custom `VFMV_V_F_VL` node. We could either make floating-point scalable vector splats lower to this node, or support the matching of multiple kinds of splat via a `ComplexPattern`, much like we do for integer types. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D104237	2021-06-17 10:04:00 +01:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit `0ee439b705`, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seems to have been missing in that patch was to make sure that the rest of LLVM also handle that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions (typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Ben Shi	0799057181	[RISCV][test] Add new tests of SH*ADD in the zba extension These tests will show the following optimization by future patches. Rx + Ry * 6 => (SH1ADD (SH2ADD Rx, Ry), Ry) Rx + Ry * 10 => (SH1ADD (SH3ADD Rx, Ry), Ry) Rx + Ry * 12 => (SH2ADD (SH3ADD Rx, Ry), Ry) Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D104210	2021-06-17 07:02:33 +08:00
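
Several of the Zba commits in this list rewrite a multiply by a constant as nested shift-and-add instructions. The `(mul x, C)` decompositions listed in commit d3738a09fb can be checked with ordinary integer arithmetic. The following is only a minimal sketch: it assumes that `shNadd rd, rs1, rs2` computes `(rs1 << N) + rs2` on 64-bit registers, and the names `shadd`, `MUL_DECOMPOSITIONS`, `mul_via_shadd` and `check_all` are hypothetical helpers for illustration, not part of LLVM.

```python
# Hypothetical model of the Zba shNadd instructions on 64-bit registers.
MASK64 = (1 << 64) - 1


def shadd(n, rs1, rs2):
    """Model shNadd: (rs1 << n) + rs2, wrapped to 64 bits."""
    return ((rs1 << n) + rs2) & MASK64


# Multiplier -> (outer shift, inner shift), transcribed from the
# "(mul x, C) -> (SHnADD (SHmADD x, x), x)" list in commit d3738a09fb.
MUL_DECOMPOSITIONS = {
    11: (1, 2), 19: (1, 3), 13: (2, 1), 21: (2, 2),
    37: (2, 3), 25: (3, 1), 41: (3, 2), 73: (3, 3),
}


def mul_via_shadd(x, c):
    """Compute x * c with two nested shNadd operations."""
    outer, inner = MUL_DECOMPOSITIONS[c]
    return shadd(outer, shadd(inner, x, x), x)


def check_all(samples=(0, 1, 7, 12345, MASK64)):
    """Return True if every listed decomposition matches x * c mod 2**64."""
    return all(
        mul_via_shadd(x, c) == (x * c) & MASK64
        for c in MUL_DECOMPOSITIONS
        for x in samples
    )
```

Each entry works because the multiplier has the form `(2**inner + 1) * 2**outer + 1`, which is what two nested shift-and-add steps compute when every register operand is `x`.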


959 Commits