clang-p2996

Author	SHA1	Message	Date
Simon Pilgrim	0637dfe88b	[DAG] Legalize abs(x) -> smax(x,sub(0,x)) iff smax/sub are legal If smax() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method. This is also what PowerPC has been doing for its abs implementation, so it lets us get rid of a load of custom lowering code there (and which was never updated when they added smax lowering). Alive2: https://alive2.llvm.org/ce/z/xRk3cD Differential Revision: https://reviews.llvm.org/D92095	2020-11-25 15:03:03 +00:00
QingShan Zhang	9c588f53fc	[DAGCombine] Add hook to allow target specific test for sqrt input PowerPC has instruction ftsqrt/xstsqrtdp etc to do the input test for software square root. LLVM now tests it with smallest normalized value using abs + setcc. We should add hook to target that has test instructions. Reviewed By: Spatel, Chen Zheng, Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D80706	2020-11-25 05:37:15 +00:00
QingShan Zhang	fa42f08b26	[PowerPC][FP128] Fix the incorrect calling convention for IEEE long double on Power8 For now, we are using the GPR to pass the arguments/return value for fp128 on Power8, which is incorrect. It should be VSR. The reason why we do it this way is that, we are setting the fp128 as illegal which make LLVM try to emulate it with i128 on Power8. So, we need to correct it as legal. Reviewed By: Nemanjai Differential Revision: https://reviews.llvm.org/D91527	2020-11-25 01:43:48 +00:00
Zarko Todorovski	c92f29b05e	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Sean Fertile	4f5355ee73	[PowerPC] Don't reuse an illegal typed load for int_to_fp conversion. When the operand to an (s/u)int_to_fp node is an illegally typed load we cannot reuse the load address since we can not build a proper dependency chain. The legalized loads will use a different chain output than the illegal load. If we reuse the load address then we will build a conversion node that uses the chain of the illegal load and operations which modify the memory address in the other dependency chain can be scheduled before the floating point load which feeds the conversion. Differential Revision: https://reviews.llvm.org/D91265	2020-11-24 15:45:33 -05:00
Craig Topper	a7eae62a42	[SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version. The default version only works if the returned node has a single result. The X86 and PowerPC versions support multiple results and allow a single result to be returned from a node with multiple outputs. And allow a single result that is not result 0 of the node. Also replace the Mips version since the new version should work for it. The original version handled multiple results, but only if the new node and original node had the same number of results. Differential Revision: https://reviews.llvm.org/D91846	2020-11-20 10:06:53 -08:00
Simon Pilgrim	5f3a8074a4	[PPC] Fix dead store value clang static analyzer warning. NFCI. Simplify the SplatBits 2-byte -> 4-byte 'splat'.	2020-11-17 16:27:45 +00:00
Baptiste Saleil	3f78605a8c	[PowerPC] Add paired vector load and store builtins and intrinsics This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs. Differential Revision: https://reviews.llvm.org/D90799	2020-11-13 12:35:10 -06:00
Qiu Chaofan	3204ffeade	[PowerPC] [NFC] Rename VCMPo to VCMP_rec Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D90581	2020-11-03 11:10:59 +08:00
Nemanja Ivanovic	5459d08795	[PowerPC] Fix single-use check and update chain users for ld-splat When converting a BUILD_VECTOR or VECTOR_SHUFFLE to a splatting load as of `1461fb6e78`, we inaccurately check for a single user of the load and neglect to update the users of the output chain of the original load. As a result, we can emit a new load when the original load is kept and the new load can be reordered after a dependent store. This patch fixes those two issues. Fixes https://bugs.llvm.org/show_bug.cgi?id=47891	2020-10-27 16:49:38 -05:00
Victor Huang	2e1a737f46	[PowerPC][PCRelative] Turn on TLS support for PCRel by default Turn on TLS support for PCRel by default and update the test cases. Differential Revision: https://reviews.llvm.org/D88738 Reviewed by: stefanp, kamaub	2020-10-27 13:58:44 -05:00
Amy Kwan	6a946fd06f	[DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead. MULH is often expanded on targets. This patch removes the isMulhCheaperThanMulShift hook and uses isOperationLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D80485	2020-10-19 12:23:04 -05:00
Kai Luo	354d3106c6	[PowerPC] Skip combining (uint_to_fp x) if x is not simple type Current powerpc64le backend hits ``` Combining: t7: f64 = uint_to_fp t6 llc: llvm-project/llvm/include/llvm/CodeGen/ValueTypes.h:291: llvm::MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed. ``` This patch fixes it by skipping combination if `t6` is not simple type. Fixed https://bugs.llvm.org/show_bug.cgi?id=47660. Reviewed By: #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D88388	2020-10-19 05:23:46 +00:00
Albion Fung	d30155feaa	[PowerPC] Implementation of 128-bit Binary Vector Rotate builtins This patch implements 128-bit Binary Vector Rotate builtins for PowerPC10. Differential Revision: https://reviews.llvm.org/D86819	2020-10-16 18:03:22 -04:00
David Sherwood	47f2dc7e5f	[SVE][NFC] Replace some TypeSize comparisons in non-AArch64 Targets In most of lib/Target we know that we are not dealing with scalable types so it's perfectly fine to replace TypeSize comparison operators with their fixed width equivalents, making use of getFixedSize() and so on. Differential Revision: https://reviews.llvm.org/D89101	2020-10-15 09:01:21 +01:00
Ahsan Saghir	f3202b30b8	[PowerPC] Add assemble disassemble intrinsics for MMA This patch adds support for assemble disassemble intrinsics for MMA. Reviewed By: bsaleil, #powerpc Differential Revision: https://reviews.llvm.org/D88739	2020-10-13 13:21:58 -05:00
Simon Pilgrim	2c3e4a21f9	[PowerPC] ReplaceNodeResults - bail on funnel shifts and let generic legalizers deal with it Fixes regression raised on D88834 for 32-bit triple + 64-bit cpu cases (which apparently is a thing).	2020-10-10 19:13:16 +01:00
Fangrui Song	2bd4730850	[PowerPC] Fix signed overflow in decomposeMulByConstant after D88201 Caught by multipliers LONG_MAX (after +1) and LONG_MIN (after -1) in CodeGen/PowerPC/mul-const-i64.ll	2020-10-09 18:29:12 -07:00
Esme-Yi	e9fd8823ba	[DAGCombiner] Add decomposition patterns for Mul-by-Imm. Summary: This patch is derived from D87384. In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns: ``` mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M)) mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M)) ``` The conversion will be triggered if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`. Moreover, the conversion benefits from an ILP improvement since the instructions are independent. A case with a sequence like the following also benefits since a shift instruction is saved. ``` res1 = a * 0x8800; res2 = a * 0x8080; ``` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88201	2020-10-09 08:51:40 +00:00
Chen Zheng	0492dd91c4	[PowerPC] add more builtins for PPCTargetLowering::getTgtMemIntrinsic Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D88374	2020-10-06 23:48:33 -04:00
Craig Topper	1127662c6d	[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS. getNode handling for ISD::SETCC calls FoldSETCC which can canonicalize FP constants to the RHS. When this happens we should create the node with the FMF that was requested. By using FlagInserter we can ensure any calls to getNode/getSetcc during canonicalization will also get the flags. Differential Revision: https://reviews.llvm.org/D88063	2020-10-05 14:55:23 -07:00
Zarko Todorovski	052c5bf40a	[PPC] Do not emit extswsli in 32BIT mode when using -mcpu=pwr9 It looks like in some circumstances when compiling with `-mcpu=pwr9` we create an EXTSWSLI node, which causes llc to fail. No such error occurs in pwr8 or lower. This occurs in 32BIT AIX and BE Linux. The cause seems to be that the default return in combineSHL is to create an EXTSWSLI node. Adding a check for whether we are in PPC64 before that fixes the issue. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D87046	2020-09-30 11:06:20 -04:00
Sean Fertile	dfb717da1f	[PowerPC] Remove support for VRSAVE save/restore/update. After removal of Darwin as a PowerPC subtarget, the VRSAVE save/restore/spill/update code is no longer needed by any supported subtarget, so remove it while keeping support for vrsave and related instruction aliases for inline asm. I've pre-committed tests to document the existing vrsave handling in relation to @llvm.eh.unwind.init and inline asm usage, as well as a test which shows a behaviour change on AIX related to returning vector type as we were wrongly emitting VRSAVE_UPDATE on AIX.	2020-09-30 10:05:53 -04:00
Baptiste Saleil	0156914275	[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types This patch legalizes the v256i1 and v512i1 types that will be used for MMA. It implements loads and stores of these types. v256i1 is a pair of VSX registers, so for this type, we load/store the two underlying registers. v512i1 is used for MMA accumulators. So in addition to loading and storing the 4 associated VSX registers, we generate instructions to prime (copy the VSX registers to the accumulator) after loading and unprime (copy the accumulator back to the VSX registers) before storing. This patch also adds the UACC register class that is necessary to implement the loads and stores. This class represents accumulators in their unprimed form and allows the distinction between primed and unprimed accumulators to avoid invalid copies of the VSX registers associated with primed accumulators. Differential Revision: https://reviews.llvm.org/D84968	2020-09-28 14:39:37 -05:00
Amy Kwan	2e7117f847	[PowerPC] Implement the 128-bit vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins in Clang/LLVM This patch implements the vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins for vector signed/unsigned __int128. Differential Revision: https://reviews.llvm.org/D87910	2020-09-23 16:49:40 -04:00
Albion Fung	88cdbeab41	[PowerPC] Implement Vector signed/unsigned __int128 overloads for the comparison builtins This patch implements Vector signed/unsigned __int128 overloads for the comparison builtins. Differential Revision: https://reviews.llvm.org/D87804	2020-09-23 16:49:40 -04:00
Victor Huang	652a8f150d	[PowerPC][PCRelative] Thread Local Storage Support for Local Dynamic This patch is the initial support for the Local Dynamic Thread Local Storage model to produce code sequence and relocation correct to the ABI for the model when using PC relative memory operations. Differential Revision: https://reviews.llvm.org/D87721	2020-09-23 13:48:06 -05:00
Albion Fung	d7eb917a7c	[PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10. Differential: https://reviews.llvm.org/D87394#inline-815858	2020-09-23 01:18:14 -05:00
Qiu Chaofan	1d782c2987	[PowerPC] Pass nofpexcept flag to custom lowered constrained ops This is a follow-up of D86605. For strict DAG FP node, if its FP exception behavior metadata is ignore, it should have nofpexcept flag. But during custom lowering, this flag isn't passed down. This is also seen on X86 target. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87390	2020-09-21 10:44:25 +08:00
Qiu Chaofan	ebfbdebe96	[PowerPC] Fix store-fptoi combine of f128 on Power8 llc would crash for (store (fptosi-f128-i32)) when -mcpu=pwr8, we should not generate FP_TO_(S|U)INT_IN_VSR for f128 types at this time. This patch fixes it. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D86686	2020-09-17 10:21:35 +08:00
Albion Fung	05aa997d51	[PowerPC] Implement __int128 vector divide operations This patch implements __int128 vector divide operations for ISA3.1. Differential Revision: https://reviews.llvm.org/D85453	2020-09-15 15:19:35 -04:00
Kamau Bridgeman	c0f199e566	[PowerPC] Implement Thread Local Storage Support for Local Exec This patch is the initial support for the Local Exec Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Patch by: Kamau Bridgeman Differential Revision: https://reviews.llvm.org/D83404	2020-09-14 14:16:28 -05:00
Qiu Chaofan	6afb279100	[PowerPC] [FPEnv] Disable strict FP mutation by default `22a0edd0` introduced a config IsStrictFPEnabled, which controls the strict floating point mutation (transforming some strict-fp operations into non-strict in ISel). This patch disables the mutation by default since we've finished PowerPC strict-fp enablement in backend. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87222	2020-09-10 13:28:09 +08:00
Qiu Chaofan	88ff4d2ca1	[PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering In standard C library, both rint and nearbyint return the rounding result in the current rounding mode. But nearbyint never raises inexact exception. On PowerPC, x(v|s)r(d|s)pic may modify FPSCR XX, raising inexact exception. So we can't select constrained fnearbyint into xvrdpic. One exception here is xsrqpi, which will not raise inexact exception, so fnearbyint f128 is okay here. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87220	2020-09-09 22:40:58 +08:00
Brad Smith	88b368a1c4	[PowerPC] Set setMaxAtomicSizeInBitsSupported appropriately for 32-bit PowerPC in PPCTargetLowering Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D86165	2020-09-08 21:21:14 -04:00
Craig Topper	b1e68f885b	[SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact.: This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130. This required adding a SDNodeFlags to SelectionDAG::getSetCC. Now we manage to constant fold some stuff undefs during the initial getNode that we don't do in later DAG combines. Differential Revision: https://reviews.llvm.org/D87200	2020-09-08 15:27:21 -07:00
Qiu Chaofan	705271d9cd	[PowerPC] Expand constrained ppc_fp128 to i32 conversion Libcall __gcc_qtou is not available, which breaks some tests needing it. On PowerPC, we have code to manually expand the operation, this patch applies it to constrained conversion. To keep it strict-safe, it's using the algorithm similar to expandFP_TO_UINT. For constrained operations marking FP exception behavior as 'ignore', we should set the NoFPExcept flag. However, in some custom lowering the flag is missed. This should be fixed by future patches. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D86605	2020-09-05 13:16:20 +08:00
Nemanja Ivanovic	2771407584	[PowerPC] Do not legalize vector FDIV without VSX Quite a while ago, we legalized these nodes as we added custom handling for reciprocal estimates in the back end. We have since moved to target-independent combines but neglected to turn off legalization. As a result, we can now get selection failures on non-VSX subtargets as evidenced in the listed PR. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47373	2020-09-02 16:03:36 -05:00
Albion Fung	331dcc43ea	[PowerPC] Implemented Vector Load with Zero and Signed Extend Builtins This patch implements the builtins for Vector Load with Zero and Signed Extend Builtins (lxvr_x for b, h, w, d), and adds the appropriate test cases for these builtins. The builtins utilize the vector load instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D82502#inline-797941	2020-08-28 11:28:58 -05:00
Roland Froese	b6d7ed469f	[PowerPC] Extend custom lower of vector truncate to handle wider input Current custom lowering of truncate vector handles a source of up to 128 bits, but that only uses one of the two shuffle vector operands. Extend it to use both operands to handle 256 bit sources. Differential Revision: https://reviews.llvm.org/D68035	2020-08-24 15:33:43 -04:00
Qiu Chaofan	41ba9d7723	[PowerPC] Support constrained vector fp/int conversion This patch makes these operations legal, and add necessary codegen patterns. There's still some issue similar to D77033 for conversion from v1i128 type. But normal type tests synced in vector-constrained-fp-intrinsic are passed successfully. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D83654	2020-08-24 10:10:27 +08:00
Qiu Chaofan	a5b7b8cce0	[PowerPC] Support constrained scalar sitofp/uitofp This patch adds support for constrained scalar int to fp operations on PowerPC. Besides, this also fixes the FP exception bit of FCFID* instructions. Reviewed By: steven.zhang, uweigand Differential Revision: https://reviews.llvm.org/D81669	2020-08-22 02:10:29 +08:00
Kamau Bridgeman	365f861c45	[PowerPC][PCRelative] Thread Local Storage Support for Initial Exec This patch is the initial support for the Initial Exec Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D81947	2020-08-21 10:13:11 -05:00
Kamau Bridgeman	b74b80bb2d	[PowerPC][PCRelative] Thread Local Storage Support for General Dynamic This patch is the initial support for the General Dynamic Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Patch by: NeHuang Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D82315	2020-08-20 15:08:13 -05:00
Qiu Chaofan	131b3b9ed4	[PowerPC] Support constrained scalar fptosi/fptoui This patch adds support for constrained scalar fp to int operations on PowerPC. Besides, this fixes the FP exception bit of quad-precision convert & truncate instructions. Reviewed By: steven.zhang, uweigand Differential Revision: https://reviews.llvm.org/D81537	2020-08-20 13:29:43 +08:00
Albion Fung	3136cbe29e	[PowerPC] Implement Vector Shift Builtins This patch implements the builtins for the vector shifts (shl, srl, sra), and adds the appropriate test cases for these builtins. The builtins utilize the vector shift instructions introduced within ISA 3.1. Differential Revision: https://reviews.llvm.org/D83338	2020-08-12 18:26:58 -05:00
diggerlin	e9ac1495e2	[AIX][XCOFF] change the operand of branch instruction from symbol name to qualified symbol name for function declarations SUMMARY: 1. In the patch, remove setting the storage class in function .getXCOFFSection and in the constructor of class MCSectionXCOFF. There are XCOFF::StorageMappingClass MappingClass; XCOFF::SymbolType Type; XCOFF::StorageClass StorageClass; in the MCSectionXCOFF class. These attributes are only used in the XCOFFObjectWriter (the asm path does not need the StorageClass), and we need to get the value of StorageClass, Type, MappingClass every time before we invoke getXCOFFSection. Actually, we can get the StorageClass of the MCSectionXCOFF from its delegated symbol. 2. We also change the operand of the branch instruction from symbol name to qualified symbol name. For example, change bl .foo extern .foo to bl .foo[PR] extern .foo[PR] 3. If there is a reference or indirect call to a function bar, we also add extern .bar[PR] Reviewers: Jason liu, Xiangling Liao Differential Revision: https://reviews.llvm.org/D84765	2020-08-11 15:26:19 -04:00
Kerry McLaughlin	85c7e89f3b	[CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize Changes the Offset arguments to both functions from int64_t to TypeSize & updates all uses of the functions to create the offset using TypeSize::Fixed() Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85220	2020-08-11 12:17:10 +01:00
Qiu Chaofan	dbcfbffc7a	[PowerPC] Add intrinsic to read or set FPSCR register This patch introduces two intrinsics: llvm.ppc.setflm and llvm.ppc.readflm. They read from or write to FPSCR register (floating-point status & control) which contains rounding mode and exception status. To ensure correctness of program, we need to prevent FP operations from being moved across these intrinsics (mffs/mtfsf instruction), so here I set them as scheduling boundaries. We can relax such restriction if FPSCR is modeled well in the future. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84914	2020-08-10 18:27:45 +08:00
Kamau Bridgeman	d8c6d083c9	[PowerPC][PCRelative] Set TLS unsupported with PC relative memops Introduce a fatal error if any thread local storage code is compiled using pc relative memory operations as well as a hidden override option `-enable-ppc-pcrel-tls` so that this support can be incrementally added if possible. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D85448	2020-08-07 10:56:24 -05:00
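
The two mul-by-constant patterns listed in commit e9fd8823ba can be checked with plain integer arithmetic. The following is a minimal Python sketch of the arithmetic only, not the LLVM implementation: the function names `decompose` and `mul_by_constant` are hypothetical, and the real combine works on SelectionDAG nodes and is gated by the target's `decomposeMulByConstant` hook, which this sketch does not model.

```python
def decompose(c):
    """Classify a positive multiplier c (hypothetical helper).

    Returns ("add", n, m) if c == 2**n + 2**m, ("sub", n, m) if
    c == 2**n - 2**m (n > m in both cases), or None otherwise.
    """
    if c <= 0:
        return None
    low = c & -c                    # lowest set bit of c, i.e. 2**m
    m = low.bit_length() - 1
    if bin(c).count("1") == 2:      # exactly two set bits: 2**n + 2**m
        return ("add", c.bit_length() - 1, m)
    top = c + low                   # a contiguous run of ones carries up to 2**n
    if c != low and top & (top - 1) == 0:
        return ("sub", top.bit_length() - 1, m)
    return None                     # includes plain powers of two (one shift)


def mul_by_constant(x, c):
    """Multiply x by c with two shifts and one add/sub when c decomposes."""
    pattern = decompose(c)
    if pattern is None:
        return x * c                # fall back to an ordinary multiply
    op, n, m = pattern
    if op == "add":
        return (x << n) + (x << m)
    return (x << n) - (x << m)
```

For the two multipliers quoted in that commit message, `decompose(0x8800)` gives `("add", 15, 11)` and `decompose(0x8080)` gives `("add", 15, 7)`. Both share the shift by 15, which is consistent with the message's remark that a shift instruction is saved.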

1580 Commits