clang-p2996

Author	SHA1	Message	Date
Brad Smith	7973d51965	[Mips] Set setMaxAtomicSizeInBitsSupported Set setMaxAtomicSizeInBitsSupported for Mips. Set the value as appropriate for 64-bit MIPS vs 32-bit. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D141189	2023-07-15 17:29:25 -04:00
Yashwant Singh	b7836d8562	[CodeGen] Allow targets to use target specific COPY instructions for live range splitting Replacing D143754. Right now the LiveRangeSplitting during register allocation uses TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a problem as we have both vector and scalar copies. Vector copies perform a copy over a vector register but only on the lanes (threads) that are active. This is mostly sufficient; however, we do run into cases when we have to copy the entire vector register and not just active lane data. One major place where we need that is live range splitting. Allowing targets to use their own copy instructions (if defined) will provide a lot of flexibility and ease to lower these pseudo instructions to correct MIR. - Introduce getTargetCopyOpcode() virtual function and use it to generate copy in Live range splitting. - Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline. Reviewed By: arsenm, cdevadas, kparzysz Differential Revision: https://reviews.llvm.org/D150388	2023-07-07 22:29:50 +05:30
Luke Lau	742fb8b5c7	[DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x) If we have a store of a load with no other uses in between it, it's considered dead and is removed. So sometimes when legalizing a fixed length vector store of an insert, we end up producing better code through scalarization than without. An example follows: %a = load <4 x i64>, ptr %x %b = insertelement <4 x i64> %a, i64 %y, i32 2 store <4 x i64> %b, ptr %x If this is scalarized, then DAGCombine successfully removes 3 of the 4 stores which are considered dead, and on RISC-V we get: sd a1, 16(a0) However if we make the vector type legal (-mattr=+v), then we lose the optimisation because we don't scalarize it. This patch attempts to recover the optimisation for vectors by identifying patterns where we store a load with a single insert in between, replacing it with a scalar store of the inserted element. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152276	2023-06-28 22:45:04 +01:00
Matt Arsenault	80e2c26dfd	RegisterCoalescer: Fix name of pass I finally snapped and fixed this inconsistency.	2023-06-21 10:30:43 -04:00
Fangrui Song	49b61ead47	[XRay][test] Make tests less sensitive to .Ltmp/Ltmp label changes	2023-06-18 13:32:40 -07:00
Amaury Séchet	e879fded2a	[NFC] Autogenerate several Mips test.	2023-06-14 22:27:15 +00:00
Matt Arsenault	eece6ba283	IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics AMDGPU has native instructions and target intrinsics for this, but these really should be subject to legalization and generic optimizations. This will enable legalization of f16->f32 on targets without f16 support. Implement a somewhat horrible inline expansion for targets without libcall support. This could be better if we could introduce control flow (GlobalISel version not yet implemented). Support for strictfp legalization is less complete but works for the simple cases.	2023-06-06 17:07:18 -04:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to `b71edfaa4e` since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
AdityaK	805f51f9fe	Remove Android-mips related tests Split from: https://reviews.llvm.org/D146565, already reviewed there.	2023-03-23 14:06:50 -07:00
Chen Zheng	4f0ed16a46	Reland rGf35a09daebd0a90daa536432e62a2476f708150d and rG63854f91d3ee1056796a5ef27753648396cac6ec [DAGCombiner] handle more store value forwarding When lowering calls on target like PPC, some stack loads will be generated for by value parameters. Node CALLSEQ_START prevents such loads from being combined. Suggested by @RolandF, this patch removes the unnecessary loads for the byval parameter by extending ForwardStoreValueToDirectLoad Reviewed By: nemanjai, RolandF Differential Revision: https://reviews.llvm.org/D138899	2023-03-12 21:59:18 -04:00
Nikita Popov	ddccc5ba44	[CodeGen] Always expand division larger than i128 Default MaxDivRemBitWidthSupported to 128, so that divisions larger than 128 bits are always expanded, without requiring additional configuration from the target. Note that this may still emit calls to __udivti3 on 32-bit targets, which likely don't have an implementation of that builtin. However, I believe this is sufficient to fix https://github.com/llvm/llvm-project/issues/60531, because Zig must already be defining those builtins. Differential Revision: https://reviews.llvm.org/D144871	2023-03-01 15:33:45 +01:00
Arthur Eubanks	7c6b46e87e	Revert "[DAGCombiner] handle more store value forwarding" This reverts commit `f35a09daeb`. Causes miscompiles, see D138899	2023-02-13 19:07:28 -08:00
Andrew Savonichev	c65b4d64d4	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2023-02-09 18:45:20 +03:00
YunQiang Su	b4b95dee31	MIPS: fix build from IR files, nan2008 and FpAbi When we use llc or lld to compile IR files, the features +nan2008 and +fpxx/+fp64 are not used. Thus wrong format files are produced. In IR files, the attributes are only set per function, not for the whole compile unit. So we extract the attributes from the first function and use them for the whole unit. isFPXXDefault: for o32, FPXX should always be the default, regardless of vendor. Of course some distributions with FP64 default enabled should be listed explicitly. Let's add them in future if we know about one. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D140270	2023-02-06 20:36:11 -08:00
Chen Zheng	f35a09daeb	[DAGCombiner] handle more store value forwarding When lowering calls on target like PPC, some stack loads will be generated for by value parameters. Node CALLSEQ_START prevents such loads from being combined. Suggested by @RolandF, this patch removes the unnecessary loads for the byval parameter by extending ForwardStoreValueToDirectLoad Reviewed By: nemanjai, RolandF Differential Revision: https://reviews.llvm.org/D138899	2023-02-01 21:06:17 -05:00
Roman Lebedev	cc39c3b17f	[Codegen][LegalizeIntegerTypes] New legalization strategy for scalar shifts: shift through stack https://reviews.llvm.org/D140493 is going to teach SROA how to promote allocas that have variably-indexed loads. That does bring up questions of cost model, since that requires creating wide shifts. Indeed, our legalization for them is not optimal. We either split it into parts, or lower it into a libcall. But if the shift amount is a multiple of CHAR_BIT, we can also legalize it through the stack. The basic idea is very simple: 1. Get a stack slot 2x the width of the shift type 2. Store the value we are shifting into one half of the slot 3. Pad the other half of the slot: for logical shifts, with zero; for arithmetic shifts, with the sign bit 4. Index into the slot (starting from the base half into which we spilled, either upwards or downwards) 5. Load 6. Split the loaded integer This works for both little-endian and big-endian machines: https://alive2.llvm.org/ce/z/YNVwd5 And better yet, if the original shift amount was not a multiple of CHAR_BIT, we can just shift by that remainder afterwards: https://alive2.llvm.org/ce/z/pz5G-K I think, if we are going to perform shift->shift-by-parts expansion more than once, we should instead go through the stack, which is what this patch does. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D140638	2023-01-14 19:12:18 +03:00
Nikita Popov	8c45d1ad06	[Mips] Convert some tests to opaque pointers (NFC) I'm not sure why, but the absence of bitcasts / no-op GEPs causes the branch delay slot to be used. Differential Revision: https://reviews.llvm.org/D141593	2023-01-13 10:34:27 +01:00
Nikita Popov	840ecce6be	[Mips] Convert some tests to opaque pointers (NFC) Dropped bitcasts result in dropped COPYs in MIR.	2023-01-12 12:40:04 +01:00
Nikita Popov	4f40923103	[Mips] Regenerate test checks (NFC)	2023-01-12 12:24:20 +01:00
Nikita Popov	60442f0d44	[CodeGen] Convert some tests to opaque pointers (NFC) These are mostly MIR tests, which I did not handle during previous conversions.	2023-01-05 13:21:20 +01:00
Fangrui Song	c348abce68	Revert D138179 "MIPS: fix build from IR files, nan2008 and FpAbi" This reverts commit `9739bb81ae`. It causes `.module is not permitted after generating code` for Linux kernel's `ARCH=mips 32r1_defconfig` clang+GNU as build. It's confirmed as a defect, but the proper fix needs time to sort out.	2022-12-22 11:48:55 -08:00
Nikita Popov	8663926a54	[Mips] Convert some tests to opaque pointers (NFC)	2022-12-19 12:56:12 +01:00
Ron Lieberman	38f1abef86	Revert "[SelectionDAG] Do not second-guess alignment for alloca" Breaks amdgpu buildbot https://lab.llvm.org/buildbot/#/builders/193 23491 This reverts commit `ffedf47d8b`.	2022-12-15 10:55:18 -06:00
Andrew Savonichev	ffedf47d8b	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2022-12-15 18:18:12 +03:00
YunQiang Su	9739bb81ae	MIPS: fix build from IR files, nan2008 and FpAbi When we use llc or lld to compile IR files, the features +nan2008 and +fpxx/+fp64 are not used. Thus wrong format files are produced. In IR files, the attributes are only set per function, not for the whole compile unit. So we output `.nan 2008` and `.module fp=xx/64` before every function. `isFPXXDefault`: for o32, FPXX should always be the default, regardless of vendor. Of course some distributions with FP64 default enabled should be listed explicitly. Let's add them in future if we know about one. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D138179	2022-12-15 09:04:36 +00:00
Jonas Paulsson	5ecd363295	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." This reverts commit `122efef8ee`. - Patch fixed to not reuse definitions from predecessors in EH landing pads. - Late review suggestions (by MaskRay) have been addressed. - M68k/pipeline.ll test updated. - Init captures added in processBlock() to avoid capturing structured bindings. - RISCV has this disabled for now. Original commit message: A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameInde(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-05 12:53:50 -06:00
Jonas Paulsson	122efef8ee	Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" This reverts commit `17db0de330`. Some more bots got broken - need to investigate.	2022-12-05 00:52:00 +01:00
Jonas Paulsson	17db0de330	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.	2022-12-03 14:15:15 -06:00
Jonas Paulsson	8ef4632681	Revert "[CodeGen] Add new pass for late cleanup of redundant definitions." Temporarily revert and fix buildbot failure. This reverts commit `6d12599fd4`.	2022-12-01 13:29:24 -05:00
Jonas Paulsson	6d12599fd4	[CodeGen] Add new pass for late cleanup of redundant definitions. A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameInde(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-01 13:21:35 -05:00
Alex Richardson	88218d5c52	[SelectionDAG] Remove deprecated MemSDNode->getAlignment() I noticed an assertion error when building MIPS code that loaded from NULL. Loading from NULL ends up being a load with maximum alignment, and due to integer truncation the maximum value was interpreted as 0 and the assertion in MipsDAGToDAGISel::Select() failed. This previously happened to work, but the maximum alignment was increased in `df84c1fe78`, so it no longer fits into a 32 bit integer. Instead of just fixing the one MIPS case, this patch removes all uses of the deprecated getAlignment() call and replaces them with getAlign(). Differential Revision: https://reviews.llvm.org/D138420	2022-11-23 09:04:42 +00:00
Craig Topper	f387918dd8	[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion. We can reuse constants if we use SRL followed by AND and AND followed by SHL. Similar was done to bitreverse previously. Differential Revision: https://reviews.llvm.org/D138045	2022-11-15 14:36:01 -08:00
Daniel Thornburgh	75cdab6dc2	[llvm-objdump] Add --no-print-imm-hex to tests depending on it. This prepares for an upcoming change to make --print-imm-hex the default behavior of llvm-objdump. These tests were updated in a semi-automatic fashion. See D136972 for details.	2022-10-29 15:40:26 -07:00
Simon Pilgrim	78739fdb4d	[DAG] Enable combineShiftOfShiftedLogic folds after type legalization This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides. Fixes #57872 Differential Revision: https://reviews.llvm.org/D136042	2022-10-29 12:30:04 +01:00
Peter Rong	c2e7c9cb33	[CodeGen] Using ZExt for extractelement indices. In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`. This is because IRTranslator uses SExt for indices. In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt. This change includes both documentation, SelectionDAG and IRTranslator. We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86 This patch fixes issue #57452. Differential Revision: https://reviews.llvm.org/D132978	2022-10-15 15:45:35 -07:00
Simon Pilgrim	0b36d1ef1f	[Mips] Regenerate unalignedload.ll	2022-10-15 18:29:54 +01:00
Simon Pilgrim	1901bd0404	[Mips] Regenerate return-struct.ll	2022-10-15 18:21:55 +01:00
Simon Pilgrim	f2c4204d8a	[Mips] Regenerate load-store-left-right.ll	2022-10-15 18:21:54 +01:00
Alex Richardson	b84be9f2f1	Add all constant physical registers to callee preserved masks This allows MachineCopyPropagation to eliminate copies of constant registers such as zero registers. They were previously not being eliminated as the check for MO.clobbersPhysReg(AvailSrc) would return true for constant registers such as MIPS $zero. To avoid having to manually add the zero registers to all CalleeSavedRegs instantiations in tablegen, I instead added a new isConstant bit to the Register and set this for MIPS, RISC-V, and AArch64 zero registers. RegisterInfoEmitter.cpp looks at this flag and adds all constant registers to the preserved register mask. This may also benefit other passes but so far I have only seen differences in MachineCopyPropagation. In the future it might make sense to generate `isConstantPhysReg()` from this information. Original source: `8588d8b814` Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D131958	2022-09-21 12:50:12 +00:00
Alex Richardson	963287acbf	Add a baseline test for D131958 This test shows that the save of MIPS $zero to a callee-saved register is not elided by the machine-cp pass. Differential Revision: https://reviews.llvm.org/D131957	2022-09-21 12:50:12 +00:00
Roland Froese	207228c1d6	[DAGCombiner] More load-store forwarding for big-endian Get some load-store forwarding cases for big-endian where a larger store covers a smaller load, and the offset would be 0 and handled on little-endian but on big-endian the offset is adjusted to be non-zero. The idea is just to shift the data to make it look like the offset 0 case. Differential Revision: https://reviews.llvm.org/D130115	2022-09-14 15:36:35 -04:00
Simon Pilgrim	0f6b0461b0	[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits. This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115 Alive2: https://alive2.llvm.org/ce/z/fl7T7K Differential Revision: https://reviews.llvm.org/D129933	2022-07-19 10:59:07 +01:00
Matt Arsenault	56303223ac	llvm-reduce: Don't assert on functions which don't track liveness Use the query that doesn't assert if TracksLiveness isn't set, which needs to always be available. We also need to start printing liveins regardless of TracksLiveness.	2022-06-07 10:00:25 -04:00
Nikita Popov	41d5033eb1	[IR] Enable opaque pointers by default This enables opaque pointers by default in LLVM. The effect of this is twofold: * If IR that contains neither explicit ptr nor %T* types is passed to tools, we will now use opaque pointer mode, unless -opaque-pointers=0 has been explicitly passed. * Users of LLVM as a library will now default to opaque pointers. It is possible to opt-out by calling setOpaquePointers(false) on LLVMContext. A cmake option to toggle this default will not be provided. Frontends or other tools that want to (temporarily) keep using typed pointers should disable opaque pointers via LLVMContext. Differential Revision: https://reviews.llvm.org/D126689	2022-06-02 09:40:56 +02:00
Hendrik Greving	a92ed167f2	[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4. Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4. Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be removed once targets set this explicitly. Adjusts 11 lit tests to reflect slightly different behavior during DAG combine. Differential Revision: https://reviews.llvm.org/D125247	2022-06-02 00:49:11 +00:00
Hendrik Greving	e9d05cc7d8	Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4." This reverts commit `430ac5c302`. Due to failures in Clang tests. Differential Revision: https://reviews.llvm.org/D125247	2022-06-01 13:27:49 -07:00
Hendrik Greving	430ac5c302	[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4. Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4. Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be removed once targets set this explicitly. Adjusts 11 lit tests to reflect slightly different behavior during DAG combine. Differential Revision: https://reviews.llvm.org/D125247	2022-06-01 12:48:01 -07:00
Nuno Lopes	80b3dcc045	[Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace There are a few places where we use report_fatal_error when the input is broken. Currently, this function always crashes LLVM with an abort signal, which then triggers the backtrace printing code. I think this is excessive, as wrong input shouldn't give a link to LLVM's github issue URL and tell users to file a bug report. We shouldn't print a stack trace either. This patch changes report_fatal_error so it uses exit() rather than abort() when its argument GenCrashDiag=false. Reviewed by: nikic, MaskRay, RKSimon Differential Revision: https://reviews.llvm.org/D126550	2022-05-30 19:19:23 +01:00
Simon Pilgrim	1ecc3d86ae	[DAG] Enable ISD::SHL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits Pulled out of D77804 as it's going to be easier to address the regressions individually. This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts. The lost RISCV gorc2 fold shouldn't be a problem - instcombine would have already destroyed that pattern - see https://github.com/llvm/llvm-project/issues/50553 Differential Revision: https://reviews.llvm.org/D124839	2022-05-14 09:50:01 +01:00
Simon Dardis	e82e4fa7ef	[MIPS] Address ISel failures for 64 bit fpus in microMIPS Add the instructions and patterns for loads and stores in microMIPSr3 when a 64 bit FPU is present. Previously, this would lead to an instruction selection failure. This resolves PR/49200. Thanks to jdeguire for reporting the issue! Differential Revision: https://reviews.llvm.org/D124723	2022-05-12 23:25:09 +01:00
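
The "shift through stack" steps listed in the `cc39c3b17f` entry above can be illustrated with a small model. This is only a sketch in Python, not the LLVM implementation: the function name `shift_through_stack`, the `kind` strings, and the use of a little-endian byte string as the "stack slot" are all assumptions made for illustration, and the final "split loaded integer" step is omitted because Python integers need no splitting.

```python
def shift_through_stack(value, amount, width, kind):
    """Model a `width`-bit shift via a 2x-wide little-endian "stack slot".

    kind is one of "shl", "lshr", "ashr" (hypothetical labels).
    """
    assert width % 8 == 0 and 0 <= amount < width
    nbytes = width // 8
    mask = (1 << width) - 1
    value &= mask
    # Whole bytes are handled by indexing; leftover bits by a small shift.
    byte_off, bit_rem = divmod(amount, 8)
    negative = kind == "ashr" and (value >> (width - 1)) & 1 == 1

    if kind == "shl":
        # Value goes in the high half, zero padding in the low half;
        # index downwards from the base of the value.
        slot = bytes(nbytes) + value.to_bytes(nbytes, "little")
        start = nbytes - byte_off
    else:
        # Value goes in the low half; pad the high half with zeros
        # (logical) or copies of the sign bit (arithmetic).
        pad = b"\xff" if negative else b"\x00"
        slot = value.to_bytes(nbytes, "little") + pad * nbytes
        start = byte_off

    # "Load" a width-sized integer from the chosen byte offset.
    loaded = int.from_bytes(slot[start:start + nbytes], "little")

    # Shift by the sub-byte remainder afterwards.
    if kind == "shl":
        return (loaded << bit_rem) & mask
    if negative:
        return ((loaded - (1 << width)) >> bit_rem) & mask
    return loaded >> bit_rem
```

For example, `shift_through_stack(0x12345678, 12, 32, "lshr")` indexes one byte into the slot and then shifts the loaded value right by the remaining 4 bits.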

1707 Commits