clang-p2996

Author	SHA1	Message	Date
Lian Wang	f14a1f26ad	Revert "[RISCV][SelectionDAG] Support VECREDUCE_ADD mask operation" This patch make CodeGen/test/AArch64/vecreduce-add-legalization.ll fail. This reverts commit `17a8a1bb71`.	2022-05-10 09:25:25 +00:00
Lian Wang	17a8a1bb71	[RISCV][SelectionDAG] Support VECREDUCE_ADD mask operation Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125206	2022-05-10 08:52:48 +00:00
Mircea Trofin	c35ad9ee4f	[mlgo] Support exposing more features than those supported by models This allows the compiler to support more features than those supported by a model. The only requirement (development mode only) is that the new features must be appended at the end of the list of features requested from the model. The support is transparent to compiler code: for unsupported features, we provide a valid buffer to copy their values; it's just that this buffer is disconnected from the model, so insofar as the model is concerned (AOT or development mode), these features don't exist. The buffers are allocated at setup - meaning, at steady state, there is no extra allocation (maintaining the current invariant). These buffers has 2 roles: one, keep the compiler code simple. Second, allow logging their values in development mode. The latter allows retraining a model supporting the larger feature set starting from traces produced with the old model. For release mode (AOT-ed models), this decouples compiler evolution from model evolution, which we want in scenarios where the toolchain is frequently rebuilt and redeployed: we can first deploy the new features, and continue working with the older model, until a new model is made available, which can then be picked up the next time the compiler is built. Differential Revision: https://reviews.llvm.org/D124565	2022-05-09 18:01:21 -07:00
David Green	2cfb243bcd	[DAG] Use isAnyConstantBuildVector. NFC As suggested from `02f8519502`, this uses the isAnyConstantBuildVector method in lieu of separate isBuildVectorOfConstantSDNodes calls. It should otherwise be an NFC.	2022-05-09 14:13:03 +01:00
David Green	02f8519502	[DAG] Prevent infinite loop combining bitcast shuffle This prevents an infinite loop from D123801, where code trying to reduce the total number of bitcasts, but also handling constants, could create the opposite transform. Prevent the transform in these case to let the bitcast of a constant transform naturally. Fixes #55345	2022-05-09 09:36:22 +01:00
Simon Pilgrim	800d36cf32	[DAG] Only perform the fold (A-B)+(C-D) --> (A+C)-(B+D) when both inner subs have one use Fixes #51381	2022-05-08 13:51:58 +01:00
Craig Topper	b81bf7bb2f	[LegalizeTypes] Make use of SelectionDAG::getShiftAmountConstant. NFC Instead of calling getShiftAmountTy and getConstant separately.	2022-05-07 12:16:53 -07:00
Craig Topper	00bfaba997	[LegalizeTypes] Don't assume fshl/fshr shift amount type matches the other operands. Like other shifts, the type isn't required to match. We shouldn't assume we can call ZExtPromotedInteger. I tested the PromoteIntOp_FunnelShift locally by removing the promotion of the shift amount from PromoteIntRes_FunnelShift. But with the final version of this patch it is never executed on any tests. Differential Revision: https://reviews.llvm.org/D125106	2022-05-07 11:44:07 -07:00
Amaury Séchet	06fad8bc05	[DAGCombine] Add node in the worklist in topological order in CombineTo This is part of an ongoing effort toward making DAGCombine process the nodes in topological order. This is able to discover a couple of new optimizations, but also causes a couple of regression. I nevertheless chose to submit this patch for review as to start the discussion with people working on the backend so we can find a good way forward. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D124743	2022-05-07 16:24:31 +00:00
Paul Walker	702c4ade22	[ISD::IndexType] Helper functions for common queries. Add helper functions to query the signed and scaled properties of ISD::IndexType along with functions to change them. Remove setIndexType from MaskedGatherSDNode because it only has one usage and typically should only be changed alongside its index operand. Minimise the direct use of the enum values to lay the groundwork for more refactoring. Differential Revision: https://reviews.llvm.org/D123347	2022-05-07 11:23:42 +01:00
David Green	5930691ee1	Revert "[DAGCombine] Make combineShuffleOfBitcast LittleEndian specific" This reverts commit `891c3cf99e` as it turns out that the error was not caused by this commit, the error caming from D124526 instead.	2022-05-06 21:03:22 +01:00
David Green	891c3cf99e	[DAGCombine] Make combineShuffleOfBitcast LittleEndian specific Something is going wrong with the BigEndian PowerPC bot. It is hard to tell what is wrong from here, but attempt to fix it by disabling the combineShuffleOfBitcast combine for bigendian.	2022-05-06 18:42:44 +01:00
Craig Topper	76f90a9d71	[SelectionDAG] Clear promoted bits before UREM on shift amount in PromoteIntRes_FunnelShift. Otherwise we have garbage in the upper bits that can affect the results of the UREM. Fixes PR55296. Differential Revision: https://reviews.llvm.org/D125076	2022-05-06 09:26:30 -07:00
Simon Pilgrim	c0bebc12f0	[DAG] visitREM - merge buildOptimizedSREM into if(). NFCI.	2022-05-06 15:39:17 +01:00
David Green	115c188807	[DAG][PowerPC] Combine shuffle(bitcast(X), Mask) to bitcast(shuffle(X, Mask')) If the mask is made up of elements that form a mask in the higher type we can convert shuffle(bitcast into the bitcast type, simplifying the instruction sequence. A v4i32 2,3,0,1 for example can be treated as a 1,0 v2i64 shuffle. This helps clean up some of the AArch64 concat load combines, along with helping simplify a number of other tests. The PowerPC combine for v16i8 splat vector loads needed some fixes to keep it working for v16i8 vectors. This improves the handling of v2i64 shuffles to match too, hopefully improving them in general. Differential Revision: https://reviews.llvm.org/D123801	2022-05-06 10:50:31 +01:00
Lian Wang	fb0d636f28	[RISCV][SelectionDAG] Support VP_REDUCE_ADD mask operation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124986	2022-05-06 01:49:21 +00:00
Craig Topper	5140e0d219	[SelectionDAGISel] Add back a comment to MergeInputChains handling. NFC This comment used to exist, but was lost in a refactor over 10 years ago, but still seems relevant and improves readability.	2022-05-05 12:59:21 -07:00
Craig Topper	084f967370	[SelectionDAG] Constant fold (sext_inreg undef, VT) to 0 instead of undef. The result of sign_extend_inreg needs to have as many sign bits as requested by the VT argument. The easiest way to guarantee this is to fold it to 0. SystemZ test was modified to avoid using undef. Fixes https://github.com/llvm/llvm-project/issues/55178 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D124696	2022-05-05 09:45:35 -07:00
Craig Topper	4e2d1a6c18	[DAGCombiner] Fold (sext/zext undef) -> 0 and aext(undef) -> undef. Differential Revision: https://reviews.llvm.org/D124988	2022-05-05 09:34:18 -07:00
Craig Topper	fd13192aa5	[DAGCombiner] Fold (max/min X, X) -> X. Differential Revision: https://reviews.llvm.org/D124951	2022-05-05 09:34:17 -07:00
Brian Tracy	87a55137e2	Fix "the the" typo in documentation and user facing strings There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users. Reviewed By: #libc, Mordante, martong, ldionne Differential Revision: https://reviews.llvm.org/D124708	2022-05-05 17:52:08 +02:00
Thomas Preud'homme	68dee83923	[MachinePipeliner] Fix unscheduled instruction Prior to ordering instructions to be scheduled, the machine pipeliner update recurrence node sets in groupRemainingNodes() by adding in a given node set any node on the dependency path from a node set with higher priority to the given node set. The function computePath() that determine what constitutes a path follows artificial dependencies. However, when ordering the nodes in the resulting node sets, computeNodeOrder() calls ignoreDependence when looking at dependencies which ignores artificial dependencies. This can cause a node not to be scheduled which then causes wrong code generation and in the case of a debug build will lead to an assert failure in generatePhis() in ModuloScheduler.cpp. This commit adds calls to ignoreDependence() in computePath() to not add any node in groupRemainingNodes() that would not be ordered by computeNodeOrder(). Reviewed By: sgundapa Differential Revision: https://reviews.llvm.org/D124267	2022-05-05 16:01:41 +01:00
Xing Xue	e5926906eb	[XCOFF][AIX] Use unique section names for LSDA and EH info sections with -ffunction-sections Summary: When -ffunction-sections is on, this patch makes the compiler to generate unique LSDA and EH info sections for functions on AIX by appending the function name to the section name as a suffix. This will allow the AIX linker to garbage-collect unused function. Reviewed by: MaskRay, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D124855	2022-05-05 09:01:36 -04:00
Jay Foad	9ebbe25034	RegAllocGreedy: Common up part of the priority calculation. NFC.	2022-05-05 10:35:33 +01:00
Nikita Popov	9678936f18	[DAGCombine] Fold (X & ~Y) \| Y with truncated not This extends the (X & ~Y) \| Y to X \| Y fold to also work if ~Y is a truncated not (when taking into account the mask X). This is done by exporting the infrastructure added in D124856 and reusing it here. I've retained the old value of AllowUndefs=false, though probably this can be switched to true with extra test coverage. Differential Revision: https://reviews.llvm.org/D124930	2022-05-05 11:10:11 +02:00
Craig Topper	572dfef1db	[SelectionDAG] Use llvm::any_of to simplify a loop. NFC	2022-05-04 19:09:06 -07:00
Nikita Popov	451bc723ae	[SDAG] Handle truncated not in haveNoCommonBitsSet() Demanded bits analysis may replace a full-width not with a any_extend (not (truncate X)) pattern. This patch looks through this kind of pattern in haveNoCommonBitsSet(). Of course, we can only do this if we only need negated bits in the non-extended part, as the other bits may now be arbitrary. For example, if we have haveNoCommonBitsSet(~X & Y, X) then ~X only needs to actually negate bits set in Y. This is only a partial solution to the problem in that it allows add -> or conversion, but the resulting or doesn't get folded yet. (I guess that will involve exposing getBitwiseNotOperand() as a more general helper and using that in the relevant transform.) Differential Revision: https://reviews.llvm.org/D124856	2022-05-04 15:30:44 +02:00
serge-sans-paille	7030654296	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `fa5a4e1b95` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D124847	2022-05-04 08:32:38 +02:00
Luo, Yuanke	764676b737	[fastregalloc] Fix bug when undef value is tied to def. If the tied use is undef value, fastregalloc should free the def register. There is no reload needed for the undef value. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D124834	2022-05-04 12:12:55 +08:00
Jon Roelofs	e1c808b36e	Fix zero-width bitfield extracts to emit 0 Fixes #55129	2022-05-03 14:46:42 -07:00
Simon Pilgrim	faa35fc873	[DAG] Fix issue with rot(rot(x,c1),c2) -> rot(x,c1+c2) fold with unnormalized rotation amounts Don't assume the rotation amounts have been correctly normalized - do it as part of the constant folding. Also, the normalization should be performed with UREM not SREM.	2022-05-03 17:16:26 +01:00
Nikita Popov	2171a896ed	[SDAG] Handle A and B&~A in haveNoCommonBitsSet() This is the DAG variant of D124763. The code already handles the general pattern, but not this degenerate case. This allows folding A + (B&~A) to A \| (B&~A) which further holds to A \| B. Handling on the SDAG level is needed because in the motivating case the add is actually a getelementptr, which only gets converted into an add on the SDAG level. However, this patch is not quite sufficient to handle the getelementptr case yet, because of an interfering demanded bits simplification. Differential Revision: https://reviews.llvm.org/D124772	2022-05-03 15:47:02 +02:00
Nikita Popov	e0892614b1	[SDAG] Extract commutative helper from haveNoCommonBitsSet() (NFC) To make it easier to add additional patterns, which will generally want to handle commuted top-level operands.	2022-05-03 12:28:35 +02:00
Jeremy Morse	1d712c3818	[DebugInfo][InstrRef] Don't generate redundant DBG_PHIs In SelectionDAG, DBG_PHI instructions are created to "read" physreg values and give them an instruction number, when they can't be traced back to a defining instruction. The most common scenario if arguments to a function. Unfortunately, if you have 100 inlined methods, each of which has the same "this" pointer, then the 100 dbg.value instructions become 100 DBG_INSTR_REFs plus 100 DBG_PHIs, where only one DBG_PHI would suffice. This patch adds a vreg cache for MachienFunction::salvageCopySSA, if we've already traced a value back to the start of a block and created a DBG_PHI then it allows us to re-use the DBG_PHI, as well as reducing work. Differential Revision: https://reviews.llvm.org/D124517	2022-05-03 09:56:12 +01:00
David Green	6f81903e89	[LV][SLP] Mark fptosi_sat as vectorizable This adds fptosi_sat and fptoui_sat to the list of trivially vectorizable functions, mainly so that the loop vectorizer can vectorize the instruction. Marking them as trivially vectorizable also allows them to be SLP vectorized, and Scalarized. The signature of a fptosi_sat requires two type overrides (@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only take a single. This patch alters hasVectorInstrinsicOverloadedScalarOpd to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first operand of the intrinsic as a overloaded (but not scalar) operand. Differential Revision: https://reviews.llvm.org/D124358	2022-05-03 09:32:34 +01:00
Hsiangkai Wang	eaaa31ff2c	[RISCV][TargetLowering] Special case overflow expansion for (uaddo X, C). Follow-up to D122933. Differential Revision: https://reviews.llvm.org/D124374	2022-05-03 03:51:36 +00:00
Craig Topper	5f057eaa0d	[DAGCombiner] reassociationCanBreakAddressingModePattern should check uses of the outer add. When looking for memory uses, reassociationCanBreakAddressingModePattern should check uses of the outer ADD rather than the inner ADD. We want to know if the two ops we're reassociating are used by a load/store. In practice, the existing check usually works because CodeGenPrepare will make one of the load/stores have an offset of 0 relative to split GEP. That will make the inner add have a memory use. To test this, I've manually split the GEPs so there is no 0 offset store. This issue was recently discussed in the original review D60294. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D124644	2022-05-02 16:38:53 -07:00
Sanjay Patel	747c6a0c73	[SDAG] fix miscompile when casting int->FP->int This is the codegen equivalent of D124692. As shown in https://github.com/llvm/llvm-project/issues/55150 - the existing fold may be wrong when converting to a signed value. This is a quick fix to avoid the miscompile. https://alive2.llvm.org/ce/z/KtaDmd Differential Revision: https://reviews.llvm.org/D124771	2022-05-02 14:57:27 -04:00
Simon Pilgrim	ae8b10e543	[DAG] (style) Break apart if-else chain as they all return	2022-05-01 17:56:59 +01:00
Paul Walker	f10a8f6752	[LegalizeDAG] Fix TypeSize conversion error when expanding SIGN_EXTEND_INREG SIGN_EXTEND_INREG expansion can trigger a TypeSize error because "VT.getSizeInBits() == 1" is used to detect for a boolean without first verifying VT is a scalar.	2022-04-30 19:21:48 +01:00
Craig Topper	6affe87bda	[DAGCombiner] When matching a disguised rotate by constant don't forget to apply LHSMask/RHSMask. We try to match as a disguised rotate by constant of these forms (shl (X \| Y), C1) \| (srl X, C2) --> (rotl X, C1) \| (shl Y, C1) (shl X, C1) \| (srl (X \| Y), C2) --> (rotl X, C1) \| (srl Y, C2) We may have also looked through an AND to find the shift. If we did, we need to apply a mask to the result. I'll add an AArch64 test and pre-commit it and the RISC-V test tomorrow. Fixes PR55201. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D124711	2022-04-30 11:02:30 -07:00
David Penry	dcb77643e3	Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM Fixed "private field is not used" warning when compiled with clang. original commit: `28d09bbbc3` reverted in: `fa49021c68` ------ This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In additional, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-29 10:54:39 -07:00
Paul Walker	23c509754d	[DAGCombiner] Stop invalid sign conversion in refineIndexType. When looking through extends of gather/scatter indices it's safe to convert a known positive signed index to unsigned, but unsigned indices must remain unsigned. Depends On D123318 Differential Revision: https://reviews.llvm.org/D123326	2022-04-29 14:20:13 +01:00
Nikita Popov	027c728f29	[SelectionDAGBuilder] Don't create MGATHER/MSCATTER with Scale != ElemSize This is an alternative to D124530. In getUniformBase() only create scales that match the gather/scatter element size. If targets also support other scales, then they can produce those scales in target DAG combines. This is what X86 already does (as long as the resulting scale would be 1, 2, 4 or 8). This essentially restores the pre-opaque-pointer state of things. Fixes https://github.com/llvm/llvm-project/issues/55021. Differential Revision: https://reviews.llvm.org/D124605	2022-04-29 14:57:53 +02:00
Paul Walker	7a0b897e86	[DAGCombiner][SVE] Ensure MGATHER/MSCATTER addressing mode combines preserve index scaling refineUniformBase and selectGatherScatterAddrMode both attempt the transformation: base(0) + index(A+splat(B)) => base(B) + index(A) However, this is only safe when index is not implicitly scaled. Differential Revision: https://reviews.llvm.org/D123222	2022-04-29 12:35:16 +01:00
Serge Pavlov	9fc58f1820	[PowerPC] Support of ppc_fp128 in lowering of llvm.is_fpclass PowerPC supports `ppc_fp128`, which is not an IEEE floating point type. The generic lowering of llvm.is_fpclass could not handle it properly. This change extends the generic lowering code to support `ppc_fp128`. The change was tested on emulator using runtime tests from https://reviews.llvm.org/D112933 and the patch for clang https://reviews.llvm.org/D112932. Differential Revision: https://reviews.llvm.org/D113908	2022-04-29 11:10:47 +07:00
Zequan Wu	4fe2ab5279	Revert "[DebugInfo][InstrRef] Describe value sizes when spilt to stack" This reverts commit `a15b66e76d`. This causes linker to crash at assertion: `Assertion failed: !Expr->isComplex(), file C:\b\s\w\ir\cache\builder\src\third_party\llvm\llvm\lib\CodeGen\LiveDebugValues\InstrRefBasedImpl.cpp, line 907`.	2022-04-28 16:18:16 -07:00
David Penry	fa49021c68	Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM" This reverts commit `28d09bbbc3` while I investigate a buildbot failure.	2022-04-28 13:29:27 -07:00
David Penry	28d09bbbc3	[CodeGen][ARM] Enable Swing Module Scheduling for ARM This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In additional, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-28 13:01:18 -07:00
Alexey Bataev	75e1cf4a6a	[COST]Improve cost model for shuffles in SLP. Introduced masks where they are not added and improved target dependent cost models to avoid returning of the incorrect cost results after adding masks. Differential Revision: https://reviews.llvm.org/D100486	2022-04-28 10:04:41 -07:00

1 2 3 4 5 ...

32271 Commits