clang-p2996

Author	SHA1	Message	Date
Josh Stone	87f57f459e	[RegAllocFast] Handle new debug values for spills These new debug values get inserted after the place where the spill happens, which means they won't be reached by the reverse traversal of basic block instructions. This would crash or fail assertions if they contained any virtual registers to be replaced. We can manually handle the new debug values right away to resolve this. Fixes https://github.com/llvm/llvm-project/issues/59172 Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D139590	2023-01-05 20:41:11 -08:00
Chen Zheng	85edf1fc70	[PowerPC] remove the ctr clobbers check related to TLS access Dynamic tls access model will be lowered to MI which clobbers CTR in the loop in ISEL(ADDItlsgdLADDR) and post-isel CTR loop pass will revert the loop to a normal compare + branch form. So no need to add this clobber check in hardware loop insertion pass now. Reviewed By: nemanjai Differential revision: https://reviews.llvm.org/D140367	2023-01-05 21:23:29 -05:00
Chen Zheng	dd0edc876c	[PowerPC][NFC] add an option to keep the test point Passes before hardware loop insertion change the loop to a form which is not a hardware loop candidate (return early before checking the ctr clobbers). And the PHI in the loop exit block is also optimized away. This breaks the previous test point when the case was committed. Fixing this by running this case just before hardware loop insertion pass. Reviewed By: nemanjai Differential revision: https://reviews.llvm.org/D140366	2023-01-05 21:18:53 -05:00
Luke Drummond	108766fc7e	Fix typos I found one typo of "implemnt", then some more. s/implemnt/implement/g	2023-01-05 18:49:23 +00:00
Nikita Popov	60442f0d44	[CodeGen] Convert some tests to opaque pointers (NFC) These are mostly MIR tests, which I did not handle during previous conversions.	2023-01-05 13:21:20 +01:00
Chen Zheng	6a930e8891	1: use class instead of MVT 2: minor fix for the comments	2023-01-05 07:53:59 +00:00
Chen Zheng	ac93a4e77d	[PowerPC][GISel]fcmp support This patch also includes: 1: CRRegBank support 2: Some workarounds in PPC table gen for anyext/setcc patterns selection. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D140878	2023-01-05 07:45:29 +00:00
Stefan Pintilie	c1d0118459	[PowerPC] Materialize floats in the range [-16.0, 15.0]. Previous to this patch we only materialized 0.0 and all other floating point values would be loaded from the TOC. This patch adds materialization for the floating point values that can be represented as integers in [-16.0, 15.0]. For example we will now materialize 3.0 and -5.0 but not 4.7. Reviewed By: nemanjai, lei, #powerpc Differential Revision: https://reviews.llvm.org/D138844	2023-01-04 12:52:30 -06:00
Matt Arsenault	bf4596bf58	CodeGen: Clean up some tests with broken "strictfp" attribute	2023-01-03 20:26:57 -05:00
Craig Topper	8abd70081f	[TargetLowering] Teach BuildUDIV to take advantage of leading zeros in the dividend. If the dividend has leading zeros, we can use them to reduce the size of the multiplier and avoid the fixup cases. This patch is for scalars only, but we might be able to do this for vectors in a follow up. Differential Revision: https://reviews.llvm.org/D140750	2022-12-29 13:58:46 -08:00
Qiu Chaofan	d00680876c	Fix failure of ldst-16-byte.mir	2022-12-28 14:23:32 +08:00
Qiu Chaofan	0ad57bf236	[PowerPC] Enable track-subreg-liveness by default This option helps some MMA related cases to reduce unnecessary copies. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D108902	2022-12-28 14:09:29 +08:00
Roman Lebedev	110c5442b8	[NFC][Codegen] Add tests with oversized shifts by non-byte-multiple	2022-12-24 19:26:41 +03:00
Roman Lebedev	a9fbf25a14	[NFC][Codegen] Rename tests for oversized shifts by byte multiple	2022-12-24 19:26:41 +03:00
Roman Lebedev	387c1573f8	[NFC][Codegen] Tests with wide scalar shifts, for new potential legalization strategy	2022-12-24 00:47:25 +03:00
Lei Huang	7a7e9109a2	[PowerPC] Implement P10 Byte Reverse Instructions Generate brh, brw and brd instructions for byte-swap operations on P10 and generate a single instruction for a 32-bit swap followed by a 16-bit right shift. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D140414	2022-12-21 09:15:57 -06:00
Roman Lebedev	3a8e009f97	Revert "Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"" One of these two changes is exposing (or causing) some more miscompiles. A reproducer is in progress, so reverting until resolved. This reverts commit `428f36401b`.	2022-12-20 18:36:42 +03:00
Qiu Chaofan	74cca964a6	Pre-commit more cases for PowerPC is_fpclass	2022-12-20 17:11:50 +08:00
Chen Zheng	f74324a1f8	[PowerPC] don't generate hardware loop. If the candidate loop already has hardware loop related intrinsics, don't generate hardware loop on PPC. PPC does not support nested hardware loops.	2022-12-19 20:32:29 -05:00
Chen Zheng	5184aaf6d3	[PowerPC][NFC] reuse a case for checking hardware loop intrinsic input	2022-12-19 20:15:10 -05:00
Nikita Popov	705029ace8	[PowerPC] Convert some tests to opaque pointers (NFC)	2022-12-19 12:59:04 +01:00
Qiu Chaofan	a40ef656d8	[Intrinsic] Rename flt.rounds intrinsic to get.rounding Address the inconsistency between FLT_ROUNDS_ and SET_ROUNDING SDAG node. Rename FLT_ROUNDS_ to GET_ROUNDING and add llvm.get.rounding intrinsic to replace flt.rounds. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D139507	2022-12-19 15:22:39 +08:00
Roman Lebedev	428f36401b	Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block" This reverts commit `37b8f09a4b`, and returns commit `1bd0b82e50`. The miscompile was in InstCombine, and it has been addressed. This tries to approach the problem noted by @arsenm: terrible codegen for `__builtin_fpclassify()`: https://godbolt.org/z/388zqdE37 Just because the PHI in the common successor happens to have different incoming values for these two blocks, doesn't mean we have to give up. It's quite easy to deal with this, we just need to produce a select: https://alive2.llvm.org/ce/z/000srb Now, the cost model for this transform is rather overly strict, so this will basically never fire. We tally all (over all preds) the selects needed to the NumBonusInsts Differential Revision: https://reviews.llvm.org/D139275	2022-12-17 05:18:54 +03:00
Alexander Kornienko	37b8f09a4b	Revert "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block" This reverts commit `1bd0b82e50`, since it leads to miscompiles. See https://reviews.llvm.org/D139275#3993229 and https://reviews.llvm.org/D139275#4001580.	2022-12-16 17:23:35 +01:00
Nemanja Ivanovic	cb3f415cd2	[PowerPC] Fix up memory ordering after combining BV to a load The combiner for BUILD_VECTOR that merges consecutive loads into a wide load had two issues: - It didn't check that the input loads all have the same input chain - It didn't update nodes that are chained to the original loads to be chained to the new load This caused issues with bootstrap when `3c4d2a0396` was committed. This patch fixes the issue so it can unblock this commit. Differential revision: https://reviews.llvm.org/D140046	2022-12-16 08:57:36 -06:00
Kai Nacke	110340c687	[PowerPC][GIsel] Materialize i64 constants. Adds support for i64 constant. It uses the same pattern-based approach as in SDAG (see PPCISelDAGToDAG::selectI64ImmDirect(), PPCISelDAGToDAG::selectI64Imm()). It does not support the prefixed instructions. Reviewed By: arsenm, tschuett Differential Revision: https://reviews.llvm.org/D140119	2022-12-15 21:22:58 +00:00
Ron Lieberman	38f1abef86	Revert "[SelectionDAG] Do not second-guess alignment for alloca" Breaks amdgpu buildbot https://lab.llvm.org/buildbot/#/builders/193 23491 This reverts commit `ffedf47d8b`.	2022-12-15 10:55:18 -06:00
Andrew Savonichev	ffedf47d8b	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2022-12-15 18:18:12 +03:00
esmeyi	2e8c7f6527	[XCOFF] adjust the Fixedvalue for R_RBR relocations. Summary: Currently we get a wrong fixed value for R_RBR relocations when -ffunction-sections enabled. This patch fixes this. Reviewed By: DiggerLin, shchenz Differential Revision: https://reviews.llvm.org/D138982	2022-12-15 01:56:53 -05:00
esmeyi	d4fd275896	[NFC][PowerPC] Add tests for 64-bit constants that require 5 instructions to materialize. Differential Revision: https://reviews.llvm.org/D139914	2022-12-13 02:44:49 -05:00
Ting Wang	e6d925bc4b	[PowerPC][NFC] Add test case for memset tail store Add test case to show something can be improved. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D138881	2022-12-12 20:07:23 -05:00
Roman Lebedev	1bd0b82e50	[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block This tries to approach the problem noted by @arsenm: terrible codegen for `__builtin_fpclassify()`: https://godbolt.org/z/388zqdE37 Just because the PHI in the common successor happens to have different incoming values for these two blocks, doesn't mean we have to give up. It's quite easy to deal with this, we just need to produce a select: https://alive2.llvm.org/ce/z/000srb Now, the cost model for this transform is rather overly strict, so this will basically never fire. We tally all (over all preds) the selects needed to the NumBonusInsts Differential Revision: https://reviews.llvm.org/D139275	2022-12-12 18:20:03 +03:00
Chen Zheng	d7ee19d163	[PowerPC][GISel] add the missing verify option - NFC	2022-12-12 12:59:27 +00:00
Chen Zheng	b41d22db18	[PowerPC][GISel] support 32 bit load/store Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D135535	2022-12-12 12:52:44 +00:00
Chen Zheng	503a935d89	[PowerPC][GISel] support 64 bit load/store Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134792	2022-12-12 12:20:54 +00:00
Roman Lebedev	d70f271726	[NFC] Port codegen PowerPC tests that invoke opt to `-passes=` syntax	2022-12-09 01:04:47 +03:00
Roman Lebedev	b1a9584818	[opt] Disincentivize new tests from using old pass syntax Over the past day or so, i've took a large swing at our tests, and reduced the number of tests that were still using the old syntax from ~1800 to just 200. Left to handle: (as it is seen in this patch) * Transforms/LSR * Transforms/CGP * Transforms/TypePromotion * Transforms/HardwareLoops * Analysis/* * some misc. I think this is the right point to start actively refusing to honor the old syntax, except for the old tests, to prevent the old syntax from creeping back in. Thus, let's add temporary default-off flag, and if it is not passed refuse to accept old syntax. The tests that still need porting are annotated with this flag. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D139647	2022-12-08 23:54:03 +03:00
Ting Wang	140a83e32f	[PowerPC][NFC] Test case update on ppc64-acc-regalloc-bugfix.ll Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D139492	2022-12-07 20:16:12 -05:00
Anton Sidorenko	f8ed709345	[MachineCombiner] Extend reassociation logic to handle inverse instructions Machine combiner supports generic reassociation only of associative and commutative instructions, for example (A + X) + Y => (X + Y) + A. However, we can extend this generic support to handle patterns like (X + A) - Y => (X - Y) + A, where `-` is the inverse of `+`. This patch adds interface functions to process reassociation patterns of associative/commutative instructions and their inverse variants with minimal changes in backends. Differential Revision: https://reviews.llvm.org/D136754	2022-12-07 13:50:28 +03:00
Qiu Chaofan	62f20f51ce	[PowerPC] Support test data class intrinsic of 128-bit float We've exploited test data class instructions introduced in ISA 3.0. This change unifies the scalar intrinsics into ppc_test_data_class and add support for 128-bit precision float values using xststdcqp. Vector versions of the intrinsic can't be unified because they return vector int instead of int. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D138105	2022-12-07 16:44:12 +08:00
Roman Lebedev	58b0485118	[NFC][PPC] Autogenerate checklines in ppc-ctr-dead-code.ll to simplify update	2022-12-06 03:47:46 +03:00
Jonas Paulsson	5ecd363295	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." This reverts commit `122efef8ee`. - Patch fixed to not reuse definitions from predecessors in EH landing pads. - Late review suggestions (by MaskRay) have been addressed. - M68k/pipeline.ll test updated. - Init captures added in processBlock() to avoid capturing structured bindings. - RISCV has this disabled for now. Original commit message: A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-05 12:53:50 -06:00
Roman Lebedev	2a05bd212e	[NFC] Fix test/CodeGen/PowerPC/O0-pipeline.ll	2022-12-05 17:21:39 +03:00
Dmitry Vyukov	dbe8c2c316	Use-after-return sanitizer binary metadata Currently per-function metadata consists of: (start-pc, size, features) This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size) Reviewed By: melver Differential Revision: https://reviews.llvm.org/D136078	2022-12-05 14:40:31 +01:00
Chen Zheng	0a9b1c59f0	[PowerPC][GISel]support for floating point and integer conversion Add support for fptosi, fptoui, sitofp, uitofp. For now only handle 64-bit integers so that it does not depend on any other patches. 32-bit integers need handling for G_SEXT/G_ZEXT. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D139174	2022-12-04 22:21:57 -05:00
Chen Zheng	b5e1fc19da	[PowerPC] don't check CTR clobber in hardware loop insertion pass We added a new post-isel CTRLoop pass in D122125. That pass will expand the hardware loop related intrinsic to CTR loop or normal loop based on the loop context. So we don't need to conservatively check the CTR clobber now on the IR level. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D135847	2022-12-04 20:53:49 -05:00
Jonas Paulsson	122efef8ee	Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" This reverts commit `17db0de330`. Some more bots got broken - need to investigate.	2022-12-05 00:52:00 +01:00
Jonas Paulsson	17db0de330	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.	2022-12-03 14:15:15 -06:00
Chen Zheng	b61ff0ca76	[PowerPC] move ctrloop pass before tail duplication Tail duplication may modify the loop to a "non-canonical" form that CTR Loop pass can not recognize. We fixed one issue in D135846. And we found in some other case, the loop is changed to irreducible form. It is hard to fix this case in CTR loop pass, instead we reorder the CTR loop pass before tail duplication pass and just after finalize-isel pass to avoid any unexpected change to the loop form. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D138265	2022-12-02 00:31:00 -05:00
Chen Zheng	dff8227189	Revert "[PowerPC] handle more than two predecessors loop header in ctrloop pass" This reverts commit `df9d60af1f`. The CTRLoops pass is reordered to front of tail duplication pass in D138265.	2022-12-02 00:30:56 -05:00

3514 Commits