Commit Graph

1707 Commits

Author SHA1 Message Date
Brad Smith
7973d51965 [Mips] Set setMaxAtomicSizeInBitsSupported
Set setMaxAtomicSizeInBitsSupported for Mips. Set the value as appropriate for 64-bit MIPS vs 32-bit.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D141189
2023-07-15 17:29:25 -04:00
Yashwant Singh
b7836d8562 [CodeGen]Allow targets to use target specific COPY instructions for live range splitting
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses
TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a
problem as we have both vector and scalar copies. Vector copies perform a copy over
a vector register but only on the lanes(threads) that are active. This is mostly sufficient
however we do run into cases when we have to copy the entire vector register and
not just active lane data. One major place where we need that is live range splitting.

Allowing targets to use their own copy instructions(if defined) will provide a lot of
flexibility and ease to lower these pseudo instructions to correct MIR.

- Introduce getTargetCopyOpcode() virtual function and use if to generate copy in Live range
 splitting.
- Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.

Reviewed By: arsenm, cdevadas, kparzysz

Differential Revision: https://reviews.llvm.org/D150388
2023-07-07 22:29:50 +05:30
Luke Lau
742fb8b5c7 [DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x)
If we have a store of a load with no other uses in between it, it's
considered dead and is removed. So sometimes when legalizing a fixed
length vector store of an insert, we end up producing better code
through scalarization than without.
An example is the follow below:

  %a = load <4 x i64>, ptr %x
  %b = insertelement <4 x i64> %a, i64 %y, i32 2
  store <4 x i64> %b, ptr %x

If this is scalarized, then DAGCombine successfully removes 3 of the 4
stores which are considered dead, and on RISC-V we get:

  sd a1, 16(a0)

However if we make the vector type legal (-mattr=+v), then we lose the
optimisation because we don't scalarize it.

This patch attempts to recover the optimisation for vectors by
identifying patterns where we store a load with a single insert
inbetween, replacing it with a scalar store of the inserted element.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D152276
2023-06-28 22:45:04 +01:00
Matt Arsenault
80e2c26dfd RegisterCoalescer: Fix name of pass
I finally snapped and fixed this inconsistency.
2023-06-21 10:30:43 -04:00
Fangrui Song
49b61ead47 [XRay][test] Make tests less sensitive to .Ltmp/Ltmp label changes 2023-06-18 13:32:40 -07:00
Amaury Séchet
e879fded2a [NFC] Autogenerate several Mips test. 2023-06-14 22:27:15 +00:00
Matt Arsenault
eece6ba283 IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics
AMDGPU has native instructions and target intrinsics for this, but
these really should be subject to legalization and generic
optimizations. This will enable legalization of f16->f32 on targets
without f16 support.

Implement a somewhat horrible inline expansion for targets without
libcall support. This could be better if we could introduce control
flow (GlobalISel version not yet implemented). Support for strictfp
legalization is less complete but works for the simple cases.
2023-06-06 17:07:18 -04:00
Tobias Hieta
f84bac329b [NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4e
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
AdityaK
805f51f9fe Remove Android-mips related tests
Split from: https://reviews.llvm.org/D146565, already reviewed there.
2023-03-23 14:06:50 -07:00
Chen Zheng
4f0ed16a46 Reland rGf35a09daebd0a90daa536432e62a2476f708150d and rG63854f91d3ee1056796a5ef27753648396cac6ec
[DAGCombiner] handle more store value forwarding

When lowering calls on target like PPC, some stack loads
will be generated for by value parameters. Node CALLSEQ_START
prevents such loads from being combined.

Suggested by @RolandF, this patch removes the unnecessary
loads for the byval parameter by extending ForwardStoreValueToDirectLoad

Reviewed By: nemanjai, RolandF

Differential Revision: https://reviews.llvm.org/D138899
2023-03-12 21:59:18 -04:00
Nikita Popov
ddccc5ba44 [CodeGen] Always expand division larger than i128
Default MaxDivRemBitWidthSupported to 128, so that divisions larger
than 128 bits are always expanded, without requiring additional
configuration from the target.

Note that this may still emit calls to __udivti3 on 32-bit targets,
which likely don't have an implementation of that builtin. However,
I believe this is sufficient to fix
https://github.com/llvm/llvm-project/issues/60531, because Zig must
already be defining those builtins.

Differential Revision: https://reviews.llvm.org/D144871
2023-03-01 15:33:45 +01:00
Arthur Eubanks
7c6b46e87e Revert "[DAGCombiner] handle more store value forwarding"
This reverts commit f35a09daeb.

Causes miscompiles, see D138899
2023-02-13 19:07:28 -08:00
Andrew Savonichev
c65b4d64d4 [SelectionDAG] Do not second-guess alignment for alloca
Alignment of an alloca in IR can be lower than the preferred alignment
on purpose, but this override essentially treats the preferred
alignment as the minimum alignment.

The patch changes this behavior to always use the specified
alignment. If alignment is not set explicitly in LLVM IR, it is set to
DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign.

Tests are changed as well: explicit alignment is increased to match
the preferred alignment if it changes output, or omitted when it is
hard to determine the right value (e.g. for pointers, some structs, or
weird types).

Differential Revision: https://reviews.llvm.org/D135462
2023-02-09 18:45:20 +03:00
YunQiang Su
b4b95dee31 MIPS: fix build from IR files, nan2008 and FpAbi
When we use llc or lld to compiler IR files, the features +nan2008 and +fpxx/+fp64 are not used.
Thus wrong format files are produced.

In IR files, the attributes are only set for function while not the whole compile units.
So we extract the attributes from the first function and use it for the whole unit.

isFPXXDefault: for o32, the FPXX should always be the default, no matter about the vendors.
Of course some distributions with FP64 default enabled should be listed explicit.
Let's add them in future if we know about one.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D140270
2023-02-06 20:36:11 -08:00
Chen Zheng
f35a09daeb [DAGCombiner] handle more store value forwarding
When lowering calls on target like PPC, some stack loads
will be generated for by value parameters. Node CALLSEQ_START
prevents such loads from being combined.

Suggested by @RolandF, this patch removes the unnecessary
loads for the byval parameter by extending ForwardStoreValueToDirectLoad

Reviewed By: nemanjai, RolandF

Differential Revision: https://reviews.llvm.org/D138899
2023-02-01 21:06:17 -05:00
Roman Lebedev
cc39c3b17f [Codegen][LegalizeIntegerTypes] New legalization strategy for scalar shifts: shift through stack
https://reviews.llvm.org/D140493 is going to teach SROA how to promote allocas
that have variably-indexed loads. That does bring up questions of cost model,
since that requires creating wide shifts.

Indeed, our legalization for them is not optimal.
We either split it into parts, or lower it into a libcall.
But if the shift amount is by a multiple of CHAR_BIT,
we can also legalize it throught stack.

The basic idea is very simple:
1. Get a stack slot 2x the width of the shift type
2. store the value we are shifting into one half of the slot
3. pad the other half of the slot. for logical shifts, with zero, for arithmetic shift with signbit
4. index into the slot (starting from the base half into which we spilled, either upwards or downwards)
5. load
6. split loaded integer

This works for both little-endian and big-endian machines:
https://alive2.llvm.org/ce/z/YNVwd5

And better yet, if the original shift amount was not a multiple of CHAR_BIT,
we can just shift by that remainder afterwards: https://alive2.llvm.org/ce/z/pz5G-K

I think, if we are going perform shift->shift-by-parts expansion more than once,
we should instead go through stack, which is what this patch does.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D140638
2023-01-14 19:12:18 +03:00
Nikita Popov
8c45d1ad06 [Mips] Convert some tests to opaque pointers (NFC)
I'm not sure why, but the absence of bitcasts / no-op GEPs causes
the branch delay slot to be used.

Differential Revision: https://reviews.llvm.org/D141593
2023-01-13 10:34:27 +01:00
Nikita Popov
840ecce6be [Mips] Convert some tests to opaque pointers (NFC)
Dropped bitcasts result in dropped COPYs in MIR.
2023-01-12 12:40:04 +01:00
Nikita Popov
4f40923103 [Mips] Regenerate test checks (NFC) 2023-01-12 12:24:20 +01:00
Nikita Popov
60442f0d44 [CodeGen] Convert some tests to opaque pointers (NFC)
These are mostly MIR tests, which I did not handle during previous
conversions.
2023-01-05 13:21:20 +01:00
Fangrui Song
c348abce68 Revert D138179 "MIPS: fix build from IR files, nan2008 and FpAbi"
This reverts commit 9739bb81ae.
It causes `.module is not permitted after generating code`
for Linux kernel's `ARCH=mips 32r1_defconfig` clang+GNU as build.
It's confirmed as a defect, but the proper fix needs time to sort out.
2022-12-22 11:48:55 -08:00
Nikita Popov
8663926a54 [Mips] Convert some tests to opaque pointers (NFC) 2022-12-19 12:56:12 +01:00
Ron Lieberman
38f1abef86 Revert "[SelectionDAG] Do not second-guess alignment for alloca"
Breaks amdgpu buildbot https://lab.llvm.org/buildbot/#/builders/193
 23491

This reverts commit ffedf47d8b.
2022-12-15 10:55:18 -06:00
Andrew Savonichev
ffedf47d8b [SelectionDAG] Do not second-guess alignment for alloca
Alignment of an alloca in IR can be lower than the preferred alignment
on purpose, but this override essentially treats the preferred
alignment as the minimum alignment.

The patch changes this behavior to always use the specified
alignment. If alignment is not set explicitly in LLVM IR, it is set to
DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign.

Tests are changed as well: explicit alignment is increased to match
the preferred alignment if it changes output, or omitted when it is
hard to determine the right value (e.g. for pointers, some structs, or
weird types).

Differential Revision: https://reviews.llvm.org/D135462
2022-12-15 18:18:12 +03:00
YunQiang Su
9739bb81ae MIPS: fix build from IR files, nan2008 and FpAbi
When we use llc or lld to compiler IR files, the features +nan2008 and +fpxx/+fp64 are not used.
Thus wrong format files are produced.

In IR files, the attributes are only set for function while not the whole compile units.
So we output `.nan 2008` and `.module fp=xx/64` before every function.

`isFPXXDefault`: for o32, the FPXX should always be the default, no matter about the vendors.
Of course some distributions with FP64 default enabled should be listed explicit.
Let's add them in future if we know about one.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D138179
2022-12-15 09:04:36 +00:00
Jonas Paulsson
5ecd363295 Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."
This reverts commit 122efef8ee.

- Patch fixed to not reuse definitions from predecessors in EH landing pads.
- Late review suggestions (by MaskRay) have been addressed.
- M68k/pipeline.ll test updated.
- Init captures added in processBlock() to avoid capturing structured bindings.
- RISCV has this disabled for now.

Original commit message:

A new pass MachineLateInstrsCleanup is added to be run after PEI.

This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().

This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.

This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.

Differential Revision: https://reviews.llvm.org/D123394

Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
2022-12-05 12:53:50 -06:00
Jonas Paulsson
122efef8ee Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions.""
This reverts commit 17db0de330.

Some more bots got broken - need to investigate.
2022-12-05 00:52:00 +01:00
Jonas Paulsson
17db0de330 Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."
Init captures added in processBlock() to avoid capturing structured bindings,
which caused the build problems (with clang).

RISCV has this disabled for now until problems relating to post RA pseudo
expansions are resolved.
2022-12-03 14:15:15 -06:00
Jonas Paulsson
8ef4632681 Revert "[CodeGen] Add new pass for late cleanup of redundant definitions."
Temporarily revert and fix buildbot failure.

This reverts commit 6d12599fd4.
2022-12-01 13:29:24 -05:00
Jonas Paulsson
6d12599fd4 [CodeGen] Add new pass for late cleanup of redundant definitions.
A new pass MachineLateInstrsCleanup is added to be run after PEI.

This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().

This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.

This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.

Differential Revision: https://reviews.llvm.org/D123394

Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
2022-12-01 13:21:35 -05:00
Alex Richardson
88218d5c52 [SelectionDAG] Remove deprecated MemSDNode->getAlignment()
I noticed a an assertion error when building MIPS code that loaded from
NULL. Loading from NULL ends up being a load with maximum alignment, and
due to integer truncation the value maximum was interpreted as 0 and the
assertion in MipsDAGToDAGISel::Select() failed. This previously happened
to work, but the maximum alignment was increased in
df84c1fe78, so it no longer fits into a 32
bit integer.
Instead of just fixing the one MIPS case, this patch removes all uses of
the deprecated getAlignment() call and replaces them with getAlign().

Differential Revision: https://reviews.llvm.org/D138420
2022-11-23 09:04:42 +00:00
Craig Topper
f387918dd8 [TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion.
We can reuse constants if we use SRL followed by AND and AND followed by SHL.
Similar was done to bitreverse previously.

Differential Revision: https://reviews.llvm.org/D138045
2022-11-15 14:36:01 -08:00
Daniel Thornburgh
75cdab6dc2 [llvm-objdump] Add --no-print-imm-hex to tests depending on it.
This prepares for an upcoming change to make --print-imm-hex the default
behavior of llvm-objdump. These tests were updated in a semi-automatic
fashion.

See D136972 for details.
2022-10-29 15:40:26 -07:00
Simon Pilgrim
78739fdb4d [DAG] Enable combineShiftOfShiftedLogic folds after type legalization
This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides.

Fixes #57872

Differential Revision: https://reviews.llvm.org/D136042
2022-10-29 12:30:04 +01:00
Peter Rong
c2e7c9cb33 [CodeGen] Using ZExt for extractelement indices.
In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`.
This is because IRTranslator uses SExt for indices.

In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt.
This change includes both documentation, SelectionDAG and IRTranslator.
We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86

This patch fixes issue #57452.

Differential Revision: https://reviews.llvm.org/D132978
2022-10-15 15:45:35 -07:00
Simon Pilgrim
0b36d1ef1f [Mips] Regenerate unalignedload.ll 2022-10-15 18:29:54 +01:00
Simon Pilgrim
1901bd0404 [Mips] Regenerate return-struct.ll 2022-10-15 18:21:55 +01:00
Simon Pilgrim
f2c4204d8a [Mips] Regenerate load-store-left-right.ll 2022-10-15 18:21:54 +01:00
Alex Richardson
b84be9f2f1 Add all constant physical registers to callee preserved masks
This allows MachineCopyPropagation to eliminate copies of constant registers
such as zero registers. They were previously not being eliminated as the
check for MO.clobbersPhysReg(AvailSrc) would return true for constant
registers such as MIPS $zero.

To avoid having to manually add the zero registers to all CalleeSavedRegs
instantiations in tablegen, I instead added a new isConstant bit to the
Register and set this for MIPS, RISC-V, and AArch64 zero registers.
RegisterInfoEmitter.cpp looks at this flag and adds all constant registers
to the preserved register mask.

This may also benefit other passes but so far I have only seen differences
in MachineCopyPropagation. In the future it might make sense to generate
`isConstantPhysReg()` from this information.

Original source: 8588d8b814

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D131958
2022-09-21 12:50:12 +00:00
Alex Richardson
963287acbf Add a baseline test for D131958
This test shows that the save of MIPS $zero to a callee-saved register
is not elided by the machine-cp pass.

Differential Revision: https://reviews.llvm.org/D131957
2022-09-21 12:50:12 +00:00
Roland Froese
207228c1d6 [DAGCombiner] More load-store forwarding for big-endian
Get some load-store forwarding cases for big-endian where a larger store covers
a smaller load, and the offset would be 0 and handled on little-endian but on
big-endian the offset is adjusted to be non-zero. The idea is just to shift the
data to make it look like the offset 0 case.

Differential Revision: https://reviews.llvm.org/D130115
2022-09-14 15:36:35 -04:00
Simon Pilgrim
0f6b0461b0 [DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits
The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits.

This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115

Alive2: https://alive2.llvm.org/ce/z/fl7T7K

Differential Revision: https://reviews.llvm.org/D129933
2022-07-19 10:59:07 +01:00
Matt Arsenault
56303223ac llvm-reduce: Don't assert on functions which don't track liveness
Use the query that doesn't assert if TracksLiveness isn't set, which
needs to always be available. We also need to start printing liveins
regardless of TracksLiveness.
2022-06-07 10:00:25 -04:00
Nikita Popov
41d5033eb1 [IR] Enable opaque pointers by default
This enabled opaque pointers by default in LLVM. The effect of this
is twofold:

* If IR that contains *neither* explicit ptr nor %T* types is passed
  to tools, we will now use opaque pointer mode, unless
  -opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
  It is possible to opt-out by calling setOpaquePointers(false) on
  LLVMContext.

A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.

Differential Revision: https://reviews.llvm.org/D126689
2022-06-02 09:40:56 +02:00
Hendrik Greving
a92ed167f2 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-02 00:49:11 +00:00
Hendrik Greving
e9d05cc7d8 Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4."
This reverts commit 430ac5c302.

Due to failures in Clang tests.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 13:27:49 -07:00
Hendrik Greving
430ac5c302 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 12:48:01 -07:00
Nuno Lopes
80b3dcc045 [Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace
There are a few places where we use report_fatal_error when the input is broken.
Currently, this function always crashes LLVM with an abort signal, which
then triggers the backtrace printing code.
I think this is excessive, as wrong input shouldn't give a link to
LLVM's github issue URL and tell users to file a bug report.
We shouldn't print a stack trace either.

This patch changes report_fatal_error so it uses exit() rather than
abort() when its argument GenCrashDiag=false.

Reviewed by: nikic, MaskRay, RKSimon

Differential Revision: https://reviews.llvm.org/D126550
2022-05-30 19:19:23 +01:00
Simon Pilgrim
1ecc3d86ae [DAG] Enable ISD::SHL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
Pulled out of D77804 as its going to be easier to address the regressions individually.

This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.

The lost RISCV gorc2 fold shouldn't be a problem - instcombine would have already destroyed that pattern - see https://github.com/llvm/llvm-project/issues/50553

Differential Revision: https://reviews.llvm.org/D124839
2022-05-14 09:50:01 +01:00
Simon Dardis
e82e4fa7ef [MIPS} Address ISel failures for 64 bit fpus in microMIPS
Add the instructions and patterns for loads and stores in microMIPSr3
when a 64 bit FPU is present. Previously, this would lead to an
instruction selection failure.

This resolves PR/49200.

Thanks to jdeguire for reporting the issue!

Differential Revision: https://reviews.llvm.org/D124723
2022-05-12 23:25:09 +01:00