clang-p2996

Author	SHA1	Message	Date
Nikita Popov	60442f0d44	[CodeGen] Convert some tests to opaque pointers (NFC) These are mostly MIR tests, which I did not handle during previous conversions.	2023-01-05 13:21:20 +01:00
Filipp Zhinkin	98265db84c	[ScheduleDAG] Support REG_SEQUENCE unscheduling REG_SEQUENCE node requires special treatment during the unscheduling because the node is untyped and neither its class nor its cost can be retrieved the same way as for typed nodes. Related issue: https://github.com/llvm/llvm-project/issues/58911 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D138837	2022-12-30 15:17:11 +04:00
Craig Topper	8abd70081f	[TargetLowering] Teach BuildUDIV to take advantage of leading zeros in the dividend. If the dividend has leading zeros, we can use them to reduce the size of the multiplier and avoid the fixup cases. This patch is for scalars only, but we might be able to do this for vectors in a follow up. Differential Revision: https://reviews.llvm.org/D140750	2022-12-29 13:58:46 -08:00
Nikita Popov	701890164d	[ARM] Convert some tests to opaque pointers (NFC)	2022-12-21 12:37:55 +01:00
Nikita Popov	87679b12c1	[ARM] Regenerate test checks (NFC)	2022-12-21 12:33:35 +01:00
David Green	752819e813	[AArch64][ARM] Remove load from dup and vmul tests. NFC These tests needn't use loads in their testing of dup and mul instructions, and as the load changes the test may no longer test what they are intending (as in D140069).	2022-12-20 15:23:38 +00:00
Simon Pilgrim	6161a8dd5c	DAG: Pull fneg out of select feeding fadd into fsub Enables folding fadd x, (select c, (fneg a), (fneg b)) -> fsub (select a, b), c Avoids some regressions in a future AMDGPU change.	2022-12-19 11:38:30 -05:00
Matt Arsenault	ddfc8bfe07	ARM: Add baseline tests for fadd with select combine	2022-12-19 10:28:07 -05:00
Nikita Popov	bed1c7f061	[ARM] Convert some tests to opaque pointers (NFC)	2022-12-19 12:45:35 +01:00
Qiu Chaofan	a40ef656d8	[Intrinsic] Rename flt.rounds intrinsic to get.rounding Address the inconsistency between FLT_ROUNDS_ and SET_ROUNDING SDAG node. Rename FLT_ROUNDS_ to GET_ROUNDING and add llvm.get.rounding intrinsic to replace flt.rounds. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D139507	2022-12-19 15:22:39 +08:00
Ron Lieberman	38f1abef86	Revert "[SelectionDAG] Do not second-guess alignment for alloca" Breaks amdgpu buildbot https://lab.llvm.org/buildbot/#/builders/193 23491 This reverts commit `ffedf47d8b`.	2022-12-15 10:55:18 -06:00
Andrew Savonichev	ffedf47d8b	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2022-12-15 18:18:12 +03:00
Roman Lebedev	d6bd732aeb	[NFC] Port codegen ARM tests that invoke opt to `-passes=` syntax	2022-12-09 01:04:46 +03:00
Roman Lebedev	b1a9584818	[opt] Disincentivize new tests from using old pass syntax Over the past day or so, I've taken a large swing at our tests, and reduced the number of tests that were still using the old syntax from ~1800 to just 200. Left to handle: (as it is seen in this patch) * Transforms/LSR * Transforms/CGP * Transforms/TypePromotion * Transforms/HardwareLoops * Analysis/* * some misc. I think this is the right point to start actively refusing to honor the old syntax, except for the old tests, to prevent the old syntax from creeping back in. Thus, let's add a temporary default-off flag and, if it is not passed, refuse to accept the old syntax. The tests that still need porting are annotated with this flag. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D139647	2022-12-08 23:54:03 +03:00
Peter Rong	ee31a4a702	[ARM] IselLowering unsigned overflow to crash using APInt in PerformSHLSimplify This diff fixes issue https://github.com/llvm/llvm-project/issues/59317 We should check if bitwidth is lower than the shift amount before we subtract them to avoid unsigned overflow. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D139238	2022-12-06 09:58:27 -08:00
Jonas Paulsson	5ecd363295	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." This reverts commit `122efef8ee`. - Patch fixed to not reuse definitions from predecessors in EH landing pads. - Late review suggestions (by MaskRay) have been addressed. - M68k/pipeline.ll test updated. - Init captures added in processBlock() to avoid capturing structured bindings. - RISCV has this disabled for now. Original commit message: A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-05 12:53:50 -06:00
Dmitry Vyukov	dbe8c2c316	Use-after-return sanitizer binary metadata Currently per-function metadata consists of: (start-pc, size, features) This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size) Reviewed By: melver Differential Revision: https://reviews.llvm.org/D136078	2022-12-05 14:40:31 +01:00
Jonas Paulsson	122efef8ee	Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" This reverts commit `17db0de330`. Some more bots got broken - need to investigate.	2022-12-05 00:52:00 +01:00
Jonas Paulsson	17db0de330	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.	2022-12-03 14:15:15 -06:00
Bjorn Pettersson	a11faeed44	[test] Switch to use -passes syntax in various test cases	2022-12-01 21:25:59 +01:00
Jonas Paulsson	8ef4632681	Revert "[CodeGen] Add new pass for late cleanup of redundant definitions." Temporarily revert and fix buildbot failure. This reverts commit `6d12599fd4`.	2022-12-01 13:29:24 -05:00
Jonas Paulsson	6d12599fd4	[CodeGen] Add new pass for late cleanup of redundant definitions. A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-01 13:21:35 -05:00
Freddy Ye	89f36dd8f3	[X86] Add ExpandLargeFpConvert Pass and enable for X86 As stated in https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528, this implementation is very similar to ExpandLargeDivRem, which expands ‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions with a bitwidth above a threshold into auto-generated functions. This is useful for targets like x86_64 that cannot lower fp conversions with more than 128 bits. The expanded nodes are derived from the IR generated by `compiler-rt/lib/builtins/floattidf.c`, `compiler-rt/lib/builtins/fixdfti.c`, etc. Corner cases: 1. For fp16: as there are no related builtins in compiler-rt, I mainly utilized the fp32 <-> fp16 lib calls to implement it. 2. For fp80: as this pass is soft fp emulation and no fp80 instructions can help with this problem, I recommend users to deprecate this usage. For now, the implementation uses fp128 as the temporary conversion type and inserts fptrunc/ext at top/end of the function. 3. For bf16: as clang FE currently doesn't support bf16 algorithm operations (convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for now. 4. For unsigned FPToI: since both default hardware behaviors and libgcc ignore the "returns 0 for negative input" spec, this pass follows this old way and ignores it for unsigned FPToI. See this example: https://gcc.godbolt.org/z/bnv3jqW1M The end-to-end tests are uploaded at https://reviews.llvm.org/D138261 Reviewed By: LuoYuanke, mgehre-amd Differential Revision: https://reviews.llvm.org/D137241	2022-12-01 13:47:43 +08:00
Marco Elver	b95646fe70	Revert "Use-after-return sanitizer binary metadata" This reverts commit `d3c851d3fc`. Some bots broke: - https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8796062278266465473/overview - https://lab.llvm.org/buildbot/#/builders/124/builds/5759/steps/7/logs/stdio	2022-11-30 23:35:50 +01:00
Dmitry Vyukov	d3c851d3fc	Use-after-return sanitizer binary metadata Currently per-function metadata consists of: (start-pc, size, features) This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size) Reviewed By: melver Differential Revision: https://reviews.llvm.org/D136078	2022-11-30 14:50:22 +01:00
Craig Topper	f387918dd8	[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion. We can reuse constants if we use SRL followed by AND and AND followed by SHL. Similar was done to bitreverse previously. Differential Revision: https://reviews.llvm.org/D138045	2022-11-15 14:36:01 -08:00
Nicholas Guy	d52e2839f3	[ARM][CodeGen] Add support for complex deinterleaving Adds the Complex Deinterleaving Pass implementing support for complex numbers in a target-independent manner, deferring to the TargetLowering for the given target to create a target-specific intrinsic. Differential Revision: https://reviews.llvm.org/D114174	2022-11-14 14:02:27 +00:00
Nikita Popov	01ec0ff2dc	[SimplifyCFG] Allow speculating block containing assume() SpeculativelyExecuteBB(), which converts a branch + phi structure into a select, currently bails out if the block contains an assume (because it is not speculatable). Adjust the fold to ignore ephemeral values (i.e. assumes and values only used in assumes) for cost modelling purposes, and drop them when performing the fold. Theoretically, we could try to preserve the assume information by generating an assume(br_cond || assume_cond) style assume, but this is very unlikely to be useful (because we don't do anything useful with assumes of this form) and it would make things substantially more complicated once we take operand bundle assumes into account (which don't really support a || operation). I'd prefer not to do that without good motivation. Differential Revision: https://reviews.llvm.org/D137339	2022-11-04 09:26:35 +01:00
David Green	f970b007e5	[ARM] Fix vector ule zero lowering The instruction icmp ule <4 x i32> %0, zeroinitializer will usually be simplified to icmp eq <4 x i32> %0, zeroinitializer. It is not guaranteed though, and the code for lowering vector compares could pick the wrong form of the instruction if this happened. I've tried to make the code more explicit about the supported conditions. This fixes NEON being unable to select VCMPZ with HS conditions, and fixes some incorrect MVE patterns. Fixes #58514. Differential Revision: https://reviews.llvm.org/D136447	2022-11-02 22:34:05 +00:00
John Brawn	88ac25b357	[MachineCSE] Allow PRE of instructions that read physical registers Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value being read is the same as what would be read in the place the instruction would be hoisted to. This is being done in preparation for adding FPCR handling to the AArch64 backend, in order to prevent it from worsening the generated code, but for targets that already have a similar register it should improve things. This patch affects code generation in several tests. The new code looks better except in Thumb2/LowOverheadLoops/memcall.ll, where we perform PRE but the LowOverheadLoops transformation then undoes it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse, but actually the function as a whole is better (as a MOV is PRE'd). Differential Revision: https://reviews.llvm.org/D136675	2022-11-02 13:53:12 +00:00
David Green	b5caa68fb2	[ARM] Tests for various NEON vector compares. NFC	2022-11-01 15:00:56 +00:00
Matt Arsenault	b60a9ccd02	AtomicExpand: Use InstSimplifyFolder Automatically cleanup operations if we know the atomic has higher alignment.	2022-10-31 23:31:42 -07:00
Daniel Thornburgh	75cdab6dc2	[llvm-objdump] Add --no-print-imm-hex to tests depending on it. This prepares for an upcoming change to make --print-imm-hex the default behavior of llvm-objdump. These tests were updated in a semi-automatic fashion. See D136972 for details.	2022-10-29 15:40:26 -07:00
Patrick Walton	f3d49dbcb1	[test] Remove readonly from some parameters that are written through in tests. In D136659 I found a few tests that write through readonly parameters: * Analysis/BasicAA/pr18573.ll: @foo1 writes through %arr.ptr, but declares it readonly. I removed the readonly annotation. * CodeGen/ARM/ParallelDSP/aliasing.ll: @restrict writes through the readonly %arg3, @store_alias_arg3_illegal_1 writes through the readonly %arg3, and @store_alias_arg3_illegal_2 writes through the readonly %arg3. I removed readonly from all three. Also, I added some CHECK-LABEL directives to make it harder for FileCheck output to be mixed up. * Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll: @gather_nxv4i32_ind64_stride2 writes through the readonly %a. I removed the readonly attribute. * Transforms/LoopVectorize/interleaved-accesses.ll: @load_gap_reverse writes through the readonly %P1 and %P2. Also, the corresponding C code in the comment didn't match the test. I removed the readonly attribute from both parameters and corrected the C code. Differential Revision: https://reviews.llvm.org/D136880	2022-10-29 15:05:20 -07:00
Matt Arsenault	a8762195d5	ARM: Fix stack warning test	2022-10-28 22:28:37 -07:00
Matt Arsenault	c62745e167	DiagnosticInfo: Report function location for resource limits We have some odd redundancy where clang specially handles the stack size case. If clang prints it, the source location is first followed by "warning". The backend diagnostic, as printed by other tools puts "warning" first.	2022-10-28 21:42:57 -07:00
Nikita Popov	9481e4ba62	[ARM] Use DefaultAttrsIntrinsics Use DefaultAttrsIntrinsics for most ARM intrinsics. This adds the WillReturn, NoSync, NoFree and NoCallback attributes and is needed to avoid regressions in the future. I've switched to DefaultAttrIntrinsics for everything doing arithmetic and load/store. I've left some TODOs in cases where all DefaultsAttrs are not correct (e.g. ldrex etc are clearly not nosync) or it wasn't entirely obvious to me (e.g. stuff interacting with a coprocessor). Differential Revision: https://reviews.llvm.org/D136758	2022-10-28 09:59:38 +02:00
Paul Kirth	2e1e2f52f3	[CodeGen] Improve large stack frame diagnostic Add statistics about how much memory is used, in variables, spills, and unsafestack. Issue #58168 describes some of the difficulty diagnosing stack size issues identified by -Wframe-larger-than. D135488 addresses some of those issues by giving developers a method to view the stack layout and thereby understand where and how stack memory is used. However, that solution requires an additional pass, when a short summary about how the compiler has allocated stack memory can inform developers about where they should investigate. When they need the complete context, D135488 can provide them with a more comprehensive set of diagnostics. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D136484	2022-10-27 00:51:45 +00:00
David Green	fb76d2ce6c	[ARM] Fix the type for v4f16 duplane This was previously using the 32bit variant of the instruction, instead of the 16bit as intended. Fixes #58512 Differential Revision: https://reviews.llvm.org/D136422	2022-10-21 10:10:35 +01:00
David Green	8c1a508616	[ARM] Regenerate armv8.2a-fp16-vector-intrinsics.ll test. NFC	2022-10-21 09:35:39 +01:00
chenglin.bi	b18293edc3	[MC][COFF] Add COFF section flag "Info" Currently we do not parse the section flag `Info` in asm files. When we emit a section with the Info flag to asm and then compile the asm to obj, we lose the Info flag for the section. The motivation for this change is ARM64EC's hybmp$x section. If we lose the Info flag, MSVC link will report a warning: `warning LNK4078: multiple '.hybmp' sections found with different attributes` Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D136125	2022-10-19 10:32:58 +08:00
Keith Walker	6102364b0d	[ARM] Add additional targets to divide tests. The main motivation for these additional targets is to cover the differences in the instructions available between Thumb2 and Thumb1. This shows up in these tests due to the lack of the following in Thumb1: - Multiply and Subtract instruction (mls) - used when calculating a remainder. - Unsigned Multiply Long instruction (umull) - used in certain cases when optimising division with a constant. Differential Revision: https://reviews.llvm.org/D135875	2022-10-17 16:51:17 +01:00
Archibald Elliott	7d15212b8c	[ARM] Support fp16/bf16 using w constraint fp16 and bf16 values can be used in GCC's inline assembly using the "w" constraint, which means "VFP floating-point registers d0-d31" - fp16 and bf16 values are stored in S registers (which alias the D registers). This change ensures that LLVM is compatible with GCC for programs that use fp16 and the 'w' constraint. Differential Revision: https://reviews.llvm.org/D135662	2022-10-13 10:32:06 +01:00
David Green	5e1a9d319d	[ARM] Add lowering for bf16 neon vtrn, vzip and vuzp. These go via Dag2Dag, which are better based on element sizes not the exact element types.	2022-10-02 15:34:37 +01:00
David Green	f2fde99461	[ARM] More bf16 shuffle handling, including perfect shuffles.	2022-10-02 14:31:51 +01:00
David Green	8193f0d1d2	[ARM] Add tablegen patterns for bf16 vrev	2022-10-02 13:42:14 +01:00
David Green	58369c8631	[ARM] Add tablegen patterns for bf16 vext This adds missing tablegen patterns for VEXT, identical to the fp16 patterns as they only use baseline Neon operations. Part of fixing #57770.	2022-10-02 12:45:58 +01:00
David Green	3651635eca	[ARM][DAG] BF16 constant handling. Much like f16 and f32, we shouldn't try to shrink bf16 to smaller fp constant. The code may not be optimal, but this allows us to legalize bf16 constants under Arm without errors.	2022-10-02 11:51:08 +01:00
Filipp Zhinkin	945a1468c9	[ARM] Support all versions of AND, ORR, EOR and BIC in optimizeCompareInstr Combine cmp with zero and all versions of AND, ORR, EOR and BIC instructions into S-suffixed versions. Related issue: https://github.com/llvm/llvm-project/issues/57122 Reviewed By: efriedma, samtebbs Differential Revision: https://reviews.llvm.org/D131786	2022-10-01 12:41:37 +03:00
Archibald Elliott	ff4027d152	[ARM] Support fp16/bf16 using t constraint fp16 and bf16 values can be used in GCC's inline assembly using the "t" constraint, which means "VFP floating-point registers s0-s31" - fp16 and bf16 values are stored in S registers too. This change ensures that LLVM is compatible with GCC for programs that use fp16 and the 't' constraint. Fixes #57753 Differential Revision: https://reviews.llvm.org/D134553	2022-09-28 14:48:21 +01:00
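
The BSWAP expansion change listed above (f387918dd8) says constants can be reused "if we use SRL followed by AND and AND followed by SHL". The Python sketch below is only an illustration of that idea for a 32-bit value, not the LLVM implementation; the function names are hypothetical and the two-mask variant is an assumed form of the earlier expansion.

```python
MASK32 = 0xFFFFFFFF  # keep intermediate results within 32 bits


def bswap32_two_masks(x: int) -> int:
    """Shift first, then mask: the two middle bytes need different constants."""
    x &= MASK32
    b3 = (x << 24) & MASK32        # lowest byte moves to the top
    b2 = (x << 8) & 0x00FF0000     # SHL followed by AND (mask 0xFF0000)
    b1 = (x >> 8) & 0x0000FF00     # SRL followed by AND (mask 0xFF00)
    b0 = x >> 24                   # highest byte moves to the bottom
    return b3 | b2 | b1 | b0


def bswap32_shared_mask(x: int) -> int:
    """Mask before the left shift so both middle bytes share the 0xFF00 constant."""
    x &= MASK32
    b3 = (x << 24) & MASK32
    b2 = (x & 0xFF00) << 8         # AND followed by SHL
    b1 = (x >> 8) & 0xFF00         # SRL followed by AND
    b0 = x >> 24
    return b3 | b2 | b1 | b0
```

Both functions compute the same byte swap; the second needs only one distinct AND mask for the middle bytes, which is the saving the commit message describes.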


4662 Commits